
Engineering White Paper

Best Practices for EMC Symmetrix 8000 with IBM DB2 Universal Database

Karen Sullivan
IBM Canada Ltd
IBM Toronto Lab
John Macdonald
EMC Corporation
June 2003

EMC Corporation and IBM Corporation

Copyright 2003 EMC Corporation and IBM Corporation. All rights reserved.
EMC and IBM believe the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." NEITHER EMC
CORPORATION NOR IBM CORPORATION MAKE ANY REPRESENTATIONS OR WARRANTIES
OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND BOTH
SPECIFICALLY DISCLAIM IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
A PARTICULAR PURPOSE.
The furnishing of this document does not imply giving license to any IBM or EMC patents.
References in this document to IBM products, Programs, or Services do not imply that IBM intends to
make these available in all countries in which IBM operates.
Use, copying, and distribution of any EMC or IBM software described in this publication requires an
applicable software license.

IBM, AIX, DB2, DB2 Universal Database, and RS/6000 are trademarks or registered trademarks of
International Business Machines Corporation in the United States, other countries, or both.

Best Practices for EMC Symmetrix with IBM DB2 Universal Database


Table of Contents
Introduction
Symmetrix Concepts and Definitions
  Hypervolumes
  Hypervolume Size
  Metavolumes
  Metavolume Size
  Meta Head and Meta Tail
  Types of Metavolumes
  Concatenated Metavolume
  Striped Metavolume
  Mirrored Metavolumes
  Channel Directors

DB2 UDB Concepts and Definitions
  Instances
  Databases
  Database Partitions
  Nodegroups
  Buffer Pools
  Tables
  Table Spaces
  System-Managed versus Database-Managed Table Spaces
  Containers
  Pages
  Extents
  Prefetch Size
  Prefetching
  Page Cleaners

Configuring a Symmetrix System
  Creating Metavolumes
  Metavolume Size
  Hypervolumes versus Physical Disks
  Striped versus Concatenated Metavolumes
  Stripe Size
  Channel Directors

Configuring the Operating System
  Multipathing with PowerPath
  Operating System Logical Volume Striping

Configuring DB2 UDB
  Table Space Container Configurations
  Shared Nothing
  Shared Everything
  JBOD
  Table Space Configuration
  Extent Size
  Prefetch Size
  Overhead and Transfer Rate
  Other Tuning Parameters
  I/O Servers
  DB2_PARALLEL_IO
  Multipage File Allocation

Understanding Existing Systems


Introduction
"For every complex problem, there is a solution that is simple, neat, and wrong."
H. L. Mencken
H. L. Mencken was a journalist whose observation that simple solutions to complicated problems are not
always the right ones reflects the apprehension people sometimes feel when taking on new, complicated
problems. Avoiding the solution that is neat and wrong is never simple; rather, it takes patience, planning,
and experience. This paper gives a general overview of IBM DB2 Universal Database (DB2 UDB) with
database partitioning and the EMC Symmetrix 8000 series (Symmetrix). It also supplies practical
recommendations for implementing DB2 UDB for data warehouse applications running on EMC
Symmetrix 8000 series storage servers. The information presented in this paper was compiled using
DB2 UDB V7.2. However, unless otherwise noted, concepts and methodologies remain the same for DB2
UDB V8.1.
The paper does not provide complete descriptions for DB2 UDB or the Symmetrix 8000 series products;
refer to www.emc.com or www.ibm.com/db2 for additional product information.


Symmetrix Concepts and Definitions


The following is a brief summary of Symmetrix terminology and a discussion of limits required to
understand the contents of this white paper. Note that the configuration and capacity limits are
microcode-level dependent and subject to change. For a complete description, review your EMC Symmetrix Product
Guide. The limits discussed in this paper are all based on microcode level 5068.
The physical disks within a Symmetrix system can be subdivided into various-sized logical volumes that
can be logically joined together again. These logically linked volumes are then presented to a server as an
addressable device. Within a Symmetrix system there are two types of logical volumes: hypervolumes and
metavolumes. The maximum number of logical volumes is a microcode-dependent value currently set to
8000 volumes.

Hypervolumes
A hypervolume, also referred to as a hyper, is a range of contiguous space on a single physical disk that is
defined to be an individually addressable Symmetrix logical volume. Each physical disk can be divided
into a maximum of 128 hypervolumes. People familiar with the process of creating hypervolumes will
often refer to the process as slicing up the physical disks or creating splits. For clarity, the terms slices and
splits will not be used to describe hypervolumes. While hypervolumes can be presented to the server as a
directly addressable device, they are also the foundation for creating metavolumes. The major attribute that
defines a hypervolume is its size.
Figure 1. A 36 GB Physical Disk Divided into Four Hypervolumes of 9 GB Each

Hypervolume Size
The amount of physical disk space associated with one hypervolume is called the hypervolume size.
Hypervolume size is microcode-dependent and currently limited to 15 GB. In Figure 1, a 36 GB physical
disk is subdivided into four hypervolumes, each with a hypervolume size of 9 GB.


Metavolumes
Once a physical disk has been divided into hypervolumes, a group of hypervolumes of the same size can
then be logically joined across various physical disks to create a metavolume. This newly created logical
volume can then be presented to a server as an addressable device. The hypervolumes that make up the
newly created metavolume can no longer be presented to the server as separate devices.
Figure 2 provides an example of how four metavolumes can logically reside on four 36 GB physical disks.
Note that, for simplicity's sake, only one metavolume is labeled. In the diagram, the four physical disks are
subdivided into four 9 GB hypervolumes. Each hypervolume is coloured red, yellow, blue, or green. The
metavolumes are made up of four like-coloured 9 GB hypervolumes. Therefore, there are four
metavolumes of 36 GB each in the diagram.

Figure 2. Example Layout of Four Metavolumes across Four Physical Disks

Metavolume Size
A metavolume consists of 2 to 255 hypervolumes. Each time a metavolume is created, the number of
hypervolumes it contains must be determined. This value is referred to as the number of hypers per meta
for one metavolume. The metavolume size is simply the product of the hypervolume size and the number
of hypers per meta.

metavolume size = ( hypers per meta ) * ( hypervolume size )


Formula 1. Metavolume Size
Given the fact that the maximum hypervolume size is 15 GB and the maximum number of hypers per meta
is 255, the largest metavolume that can be created is 3825 GB or 3.74 TB (based on the formula given).
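
As an illustration, Formula 1 can be checked with a few lines of Python. This is only a sketch: the function name and constants are introduced here for the example, and the limits are the microcode level 5068 values quoted above.

```python
# Limits quoted in the text for microcode level 5068 (subject to change).
MAX_HYPER_SIZE_GB = 15
MIN_HYPERS_PER_META = 2
MAX_HYPERS_PER_META = 255


def metavolume_size_gb(hypers_per_meta, hypervolume_size_gb):
    """Formula 1: metavolume size = hypers per meta * hypervolume size."""
    if not MIN_HYPERS_PER_META <= hypers_per_meta <= MAX_HYPERS_PER_META:
        raise ValueError("a metavolume consists of 2 to 255 hypervolumes")
    if not 0 < hypervolume_size_gb <= MAX_HYPER_SIZE_GB:
        raise ValueError("hypervolume size is limited to 15 GB")
    return hypers_per_meta * hypervolume_size_gb


if __name__ == "__main__":
    # Layout from Figure 2: four 9 GB hypers per metavolume.
    print(metavolume_size_gb(4, 9), "GB")
    # Largest possible metavolume, in GB and in TB (1 TB = 1024 GB).
    largest = metavolume_size_gb(MAX_HYPERS_PER_META, MAX_HYPER_SIZE_GB)
    print(largest, "GB =", round(largest / 1024, 2), "TB")
```

The range checks simply encode the limits stated in this section, so a configuration outside those limits is rejected instead of silently producing a size that cannot be built.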


Meta Head and Meta Tail


As the hypervolumes are assigned to a metavolume, they are given a sequence number. The first
hypervolume in the sequence is known as the meta head, and the last hypervolume in the sequence is
considered the meta tail. All remaining hypervolumes are considered members of the metavolume. Figure
3 shows which hypervolumes are considered the meta head, meta members, and the meta tail.
Figure 3. Relative Positions of a Meta Head, Member, and Tail for an Example Metavolume
Data is always placed on the metavolume across the hypervolumes from meta head to meta tail. Therefore,
when data is first written to a metavolume, the first write always takes place on the meta head.

Types of Metavolumes
There are two different types of metavolumes: concatenated metavolumes and striped metavolumes. For
both types, the metavolume size is defined in the same way. For example, if you have the same number of
hypers per meta (e.g., four) and the same hypervolume size (e.g., 9 GB), then both a concatenated and a
striped metavolume will produce a device of the same size (e.g., 36 GB). The difference between
concatenated and striped metavolumes is in the method in which the logical data is placed on the
underlying hypervolumes. DB2 UDB database data allocation will be discussed in more detail later in the
paper.

Concatenated Metavolume


A concatenated metavolume writes data to a hypervolume until the hypervolume size is reached before
placing data onto the next hypervolume. Therefore, when first allocating data to the metavolume, the meta
head would receive all the data until the hypervolume size is reached. Only after the meta head is full will
data be placed onto the next hypervolume.
Figure 4. Logical Data Placement on a Concatenated Metavolume


Striped Metavolume
In the case of a striped metavolume, data will be placed on the underlying hypervolumes in multiples of
Symmetrix cylinders. When an application writes data to the metavolume, the first write will take place on
the meta head, and the subsequent write will reside on the next member of the metavolume. Allocation will
continue in this fashion until the meta tail is reached. The process is then repeated starting at the meta head
once again.

Head

Tail

Figure 5. Logical Data Placement on a Striped Metavolume


Stripe Size
The amount of data written to a single hypervolume is known as the stripe size. The size is based on units
of disk cylinders with the default and minimum value being two cylinders. Since a cylinder is 480 KB of
data, the minimum stripe size is 960 KB.

Stripe Size = 2 cylinders

Figure 6. Close Up of a Stripe on a Single Hypervolume


Stripe Width
The stripe width is the stripe size times the number of hypers per meta. So, if we have the default stripe
size of 960 KB and four hypers per meta, the stripe width would be 3840 KB.

stripe width = ( stripe size ) * ( hypers per meta )

Formula 2. Stripe Width (with diagram)
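
The difference between concatenated and striped placement, together with Formula 2, can be sketched in Python. This is a simplified model for illustration only: the function names are assumptions introduced here, hypervolumes are numbered from 0 (the meta head), and real Symmetrix placement is handled by the microcode.

```python
CYLINDER_KB = 480              # cylinder size quoted in the text
MIN_STRIPE_CYLINDERS = 2       # default and minimum stripe size


def stripe_size_kb(cylinders=MIN_STRIPE_CYLINDERS):
    """Stripe size in KB for a given number of cylinders."""
    if cylinders < MIN_STRIPE_CYLINDERS:
        raise ValueError("the minimum stripe size is two cylinders")
    return cylinders * CYLINDER_KB


def stripe_width_kb(stripe_kb, hypers_per_meta):
    """Formula 2: stripe width = stripe size * hypers per meta."""
    return stripe_kb * hypers_per_meta


def striped_member(offset_kb, stripe_kb, hypers_per_meta):
    """Hypervolume index holding a logical offset on a striped metavolume.

    Stripes are assigned round-robin from the meta head (index 0) to the
    meta tail, then wrap back to the head.
    """
    return (offset_kb // stripe_kb) % hypers_per_meta


def concatenated_member(offset_kb, hypervolume_size_kb):
    """Hypervolume index holding a logical offset on a concatenated
    metavolume: each hypervolume is filled before the next one is used."""
    return offset_kb // hypervolume_size_kb


if __name__ == "__main__":
    stripe = stripe_size_kb()                       # two cylinders
    print("stripe width:", stripe_width_kb(stripe, 4), "KB")
    nine_gb_in_kb = 9 * 1024 * 1024
    for offset in (0, 960, 1920, 2880, 3840):
        print(offset,
              striped_member(offset, stripe, 4),
              concatenated_member(offset, nine_gb_in_kb))
```

In this model, the first few megabytes written to a striped metavolume already touch every hypervolume, while the same offsets on a concatenated metavolume all land on the meta head.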


Mirrored Metavolumes
Any metavolume, whether striped or concatenated, can also be mirrored. This means that another
complete copy of the metavolume is created and stored on different physical disks using like
hypervolumes (the same hypervolume size and the same number of hypers per meta, but different physical
disks). Even though a mirrored metavolume requires twice as much physical disk space, the metavolume size
does not change. The device presented to the server is still the same size as a nonmirrored metavolume. When
a read request takes place against a mirrored metavolume, either hypervolume where the data resides may
service the request. When a write request takes place, however, the write must occur on both
hypervolumes. To safeguard redundancy, a Symmetrix system ensures that mirrored copies are not created on
the same physical disk.
Figure 7. Logical Data Placement on a Mirrored Striped Metavolume

Channel Directors
Host adapters on the Symmetrix system are known as channel directors. This is where the server
physically attaches to the storage server via cables. Each card contains a number of fiber, SCSI, or serial
ESCON ports.


DB2 UDB Concepts and Definitions


The following is a brief summary of some DB2 UDB V7.2 concepts and terminology required to
understand the contents of this white paper. Although some of the terminology has changed for DB2 UDB
V8.1, the overall concepts remain the same. For a more in-depth discussion of these and other DB2 UDB
terms, refer to the DB2 UDB manuals available online.
For DB2 UDB V7.2:
http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v7pubs.d2w/en_main
For DB2 UDB V8.1:
http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8infocenter.d2w/report?target=mainFrame&fn=c0008880.htm

Instances
An instance, in DB2 UDB, is a logical database manager environment where you can create and/or catalog
databases and set various instance-wide configuration parameters. A database manager instance can also
be defined as being similar to an image of the actual database manager environment. Furthermore, you can
have several instances of the database manager product on the same database server. You can use these
instances to separate the development environment from the production environment, tune the database
manager to a particular environment, and protect sensitive information from a particular group of people.
For a partitioned database environment, all database partitions will reside within a single instance and will
share at the instance level a common set of configuration parameters.

Databases
A database is created within an instance. It presents logical data as a collection of database objects (e.g.,
tables and indexes). Each database includes a set of system catalog tables that describe the logical and
physical structure of the data, configuration files containing the parameter values allocated for the database,
and recovery log(s).
DB2 UDB allows multiple databases to be defined within a single database instance. Configuration
parameters can also be set at the database level to tune various characteristics, such as memory usage and
logging.

Database Partitions
DB2 UDB allows the user to divide a single database into multiple logical database partitions. Each of
these database partitions can look and behave as an independent database. Therefore, multiple database
partitions can reside on the same server, and/or database partitions can reside on many servers. They are all
part of the same database that is joined through the catalog database partition where the database is actually
created. This database partition stores the overall database configuration information. Each database
partition also has access to its own set of database-level configuration parameters.
Another term for a database partition is a node. A unique node number identifies each node.

Nodegroups
A nodegroup is a set of one or more database partitions. For nonpartitioned database implementations,
there is only one nonconfigurable nodegroup, which is always made up of a single database partition.
Figure 8 shows how five database partitions can be divided into three different nodegroups. As you can
see, a database partition can reside within multiple nodegroups. In this example, nodegroup 1 is made up
of database partitions 1, 2, 3, and 4. Nodegroup 2 contains only a single database partition, database
partition 1, and finally nodegroup 3 comprises database partitions 4 and 5.

Figure 8. One DB2 UDB Database Comprising Five Database Partitions Grouped into Three Nodegroups

Buffer Pools
A buffer pool is the main memory allocated in the host processor to cache table and index data pages as
they are being read from disk, or being modified. The purpose of the buffer pool is to improve system
performance. Data can be accessed much faster from memory than from disk; therefore, the fewer times
the database manager needs to read from or write to disk (I/O) the better the performance. Buffer pools are
created by database partitions and each partition can have multiple buffer pools.

Tables
The primary database object is the table. A table is defined as a named data object consisting of a specific
number of columns and a varying number of rows. Tables are uniquely identified units of storage maintained
within a DB2 table space. They consist of a series of logically linked blocks of storage that have been
given the same name. They also have a unique structure for storing information that permits that
information to be related to information in other tables.
When creating a table, you can choose to have certain objects, such as indexes, stored separately from the
rest of the table data. In order to do this, the table must be defined in a DMS (database-managed space) table
space.

Table Spaces
A database is logically organized into table spaces. A table space is a place to store tables. The table space
is where the database is defined to use the disk storage subsystem. One method to spread a table space over
one or more physical storage devices is to simply specify multiple containers.


There are three main types of user table spaces: regular, temporary, and long. In addition to these
user-defined table spaces, DB2 also defines separate system and catalog table spaces. For partitioned database
environments, the catalog table space resides on the catalog database partition.

System-Managed versus Database-Managed Table Spaces


For partitioned databases, table spaces reside in nodegroups. During the create table space
command, the containers themselves are assigned to a specific database partition in the nodegroup, thus
maintaining the "shared nothing" character of DB2 UDB. Table spaces can be either system-managed
space (SMS) or database-managed space (DMS). For an SMS table space, each container is a directory in the
file system, and the operating system's file manager controls the storage space. For a DMS table space,
each container is either a fixed-size pre-allocated file or a physical volume, and the database manager
controls the storage space itself.

Containers
A container is an allocation of physical storage. It is a way to define the device that will be made available
for storing database objects. Containers may be assigned to file systems by specifying a directory. Such
containers are identified as PATH containers and are used with SMS table spaces. Containers may also
reference files that reside within a directory. These are identified as FILE containers, and a specific size
must be identified. FILE containers are only used with DMS file table spaces. Containers may also
reference raw character devices. These containers are used by DMS raw table spaces and are identified as
DEVICE containers. Note that the device must already exist on the system before the container can be
used. In all cases, containers must be unique and can belong to only one table space.

Pages
Data is transferred to and from devices in discrete blocks called pages, which are buffered in memory. DB2
UDB supports page sizes of 4 KB, 8 KB, 16 KB, and 32 KB. When an application accesses
data randomly, the page size determines the amount of data transferred. In other words, it corresponds
to the size of the data transfer request sent to the disk array. Page size also determines the maximum length
of a row and the maximum size of a table space. These limits are shown in Table 1. In all cases, DB2
UDB limits the number of data rows on a single page to 255.
Table 1. Page Size Limits

Page Size    Max Table Space Size    Max Row Length
4 KB         64 GB                   4005 B
8 KB         128 GB                  8101 B
16 KB        256 GB                  16293 B
32 KB        512 GB                  32677 B
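
The limits in Table 1 can be captured as data and used to pick a page size. The sketch below is illustrative only; the names are assumptions introduced here, and the values are copied from Table 1. It also shows that every row of the table implies the same maximum number of pages per table space.

```python
# Table 1 as data: page size (KB) -> (max table space size in GB,
#                                      max row length in bytes)
PAGE_LIMITS = {
    4: (64, 4005),
    8: (128, 8101),
    16: (256, 16293),
    32: (512, 32677),
}


def max_pages(page_kb):
    """Number of pages implied by the table space size limit in Table 1."""
    max_gb, _ = PAGE_LIMITS[page_kb]
    return max_gb * 1024 * 1024 // page_kb


def smallest_page_size_kb(row_length_bytes, table_space_gb):
    """Smallest page size whose Table 1 limits cover both requirements."""
    for page_kb in sorted(PAGE_LIMITS):
        max_gb, max_row = PAGE_LIMITS[page_kb]
        if row_length_bytes <= max_row and table_space_gb <= max_gb:
            return page_kb
    raise ValueError("no page size in Table 1 satisfies these requirements")


if __name__ == "__main__":
    for page_kb in sorted(PAGE_LIMITS):
        print(page_kb, "KB pages ->", max_pages(page_kb), "pages")
    # A 3000-byte row fits on a 4 KB page, but a 100 GB table space does not
    # fit under the 64 GB limit, so the next page size up is selected.
    print(smallest_page_size_kb(3000, 100), "KB")
```

Choosing the smallest page size that satisfies both limits is only one possible policy; workload characteristics such as random versus sequential access also influence the choice.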

Extents
An extent is the unit at which space is allocated within a container of a table space for a single table space
object. This allocation consists of multiple pages. The size of the extent is specified when the table space is
created. Note that when data is written to a table space with multiple containers, the data is striped across
all containers in extent-sized blocks.
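
The extent-sized striping across containers can be pictured with a short Python sketch. It is a simplified model, assuming equally sized containers and plain round-robin placement; it ignores any space DB2 UDB reserves inside a container, and the function names are introduced here for illustration.

```python
def container_for_page(page_number, extent_pages, num_containers):
    """Container index holding a given table space page.

    Pages are grouped into extents, and extents are assigned to the
    containers in round-robin order.
    """
    extent_number = page_number // extent_pages
    return extent_number % num_containers


def extent_layout(num_extents, num_containers):
    """List, per container, of the extent numbers placed on it."""
    placement = [[] for _ in range(num_containers)]
    for extent in range(num_extents):
        placement[extent % num_containers].append(extent)
    return placement


if __name__ == "__main__":
    # Eight extents striped over four containers.
    for index, extents in enumerate(extent_layout(8, 4)):
        print("container", index, "holds extents", extents)
    # With 16-page extents, page 40 belongs to extent 2.
    print(container_for_page(40, 16, 4))
```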

Prefetch Size
The number of pages that the database manager will prefetch can be defined for each table space using the
PREFETCHSIZE clause with either the CREATE TABLESPACE or ALTER TABLESPACE statements.
The value specified is maintained in the PREFETCHSIZE column of the SYSCAT.TABLESPACES
system catalog table.
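
As a sketch of how these clauses fit together, the following Python helper assembles a CREATE TABLESPACE statement for a DMS table space on raw devices. The helper itself, the table space name, the device paths, the container size, and the buffer pool name are all hypothetical values chosen for illustration. When no prefetch size is given, the helper assumes a value equal to the number of containers times the extent size; check the generated statement against the SQL Reference for your DB2 UDB version before using it.

```python
def create_tablespace_sql(name, page_kb, devices, pages_per_container,
                          extent_pages, bufferpool, prefetch_pages=None):
    """Build a CREATE TABLESPACE statement for a DMS raw-device table space.

    prefetch_pages defaults to (number of containers * extent size), an
    assumption made for this sketch.
    """
    if page_kb not in (4, 8, 16, 32):
        raise ValueError("supported page sizes are 4, 8, 16, and 32 KB")
    if not devices:
        raise ValueError("at least one container is required")
    if prefetch_pages is None:
        prefetch_pages = extent_pages * len(devices)
    containers = ", ".join(
        f"DEVICE '{device}' {pages_per_container}" for device in devices
    )
    # The buffer pool must already exist with the same page size.
    return (
        f"CREATE TABLESPACE {name} PAGESIZE {page_kb} K "
        f"MANAGED BY DATABASE USING ({containers}) "
        f"EXTENTSIZE {extent_pages} PREFETCHSIZE {prefetch_pages} "
        f"BUFFERPOOL {bufferpool}"
    )


if __name__ == "__main__":
    # Hypothetical raw devices, one per metavolume.
    devices = ["/dev/rmeta1", "/dev/rmeta2", "/dev/rmeta3", "/dev/rmeta4"]
    print(create_tablespace_sql("TS_DATA", 16, devices, 100000, 16, "BP16K"))
```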


Prefetching
Prefetching is a technique for anticipating data needs and reading ahead from storage in large blocks. By
transferring data in larger blocks, fewer system resources are expended and less total time is required.
Sequential prefetches read consecutive pages into the buffer pool before they are needed by DB2. List
prefetches are more complex. In this case, the DB2 optimizer optimizes the retrieval of randomly located
data.
The amount of data being prefetched is part of what determines the amount of parallel I/O activity.
Ordinarily, the database administrator should define a prefetch value large enough to allow parallel use of
all of the available containers, and therefore all of the array's physical disks.
Consider the following example:

A table space is defined with a page size of 16 KB using raw DMS.

The table space is defined across four containers, and each container resides on a separate logical disk,
and each logical disk resides on a separate RAID array.

The extent size is defined as 16 pages (or 256 KB).

The prefetch value is specified as 64 pages (number of containers x extent size).

Suppose a user issued a query that results in a table space scan, which then results in DB2 performing a
prefetch operation. The following would happen:

DB2 UDB would recognize that this prefetch request for 64 pages (a megabyte) evenly spans four
containers, and would issue four parallel I/O requests, one against each of those containers. The
request size to each container would be 16 pages, or 256 KB.

The AIX Logical Volume Manager would divide the 256 KB request to each AIX logical volume into
smaller units (128 KB is the largest), and pass them on to the array as back-to-back requests against
each logical disk.

An array receives a request for 128 KB; if the data is not in cache, four arrays would operate in parallel
to retrieve the data.

After receiving several of these requests, the array would recognize that these DB2 UDB prefetch
requests are arriving as sequential accesses, causing the array sequential prefetch to take effect.
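
The arithmetic in this example can be reproduced with a small Python sketch. It models only the sizes discussed above, assuming the prefetch size is a whole number of extents and that each extent is read with one request; the function name and the returned keys are introduced here for illustration.

```python
from math import ceil


def prefetch_breakdown(page_kb, containers, extent_pages, prefetch_pages,
                       lvm_max_kb=128):
    """Break one prefetch request into per-container and LVM-level requests.

    lvm_max_kb is the largest unit the AIX Logical Volume Manager passes
    on, taken from the example (128 KB).
    """
    if prefetch_pages % extent_pages:
        raise ValueError("this sketch assumes a whole number of extents")
    extents = prefetch_pages // extent_pages
    request_kb = extent_pages * page_kb
    return {
        "prefetch_kb": prefetch_pages * page_kb,
        # No more parallel requests than there are containers.
        "parallel_requests": min(extents, containers),
        "request_kb": request_kb,
        "lvm_requests_per_container": ceil(request_kb / lvm_max_kb),
    }


if __name__ == "__main__":
    # 16 KB pages, four containers, 16-page extents, 64-page prefetch.
    for key, value in prefetch_breakdown(16, 4, 16, 64).items():
        print(key, "=", value)
```

Changing the prefetch size to 32 pages in this model halves the number of parallel requests, which is why the example sets the prefetch size to the number of containers times the extent size.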

Page Cleaners
Page cleaners write dirty pages from the buffer pool to disk, reducing the chance that agents looking for
victim buffer pool slots in memory will have to incur the cost of writing dirty pages to disk. For example,
if you have updated a large amount of data in a table, many data pages in the buffer pool may be updated
but not written into disk storage (these pages are called dirty pages). Since agents cannot place fetched data
pages into the dirty pages in the buffer pool, these dirty pages must be flushed to disk storage before their
buffer pool memory can be used for other data pages.


Configuring a Symmetrix System


Several factors affect database performance and should be considered when configuring a Symmetrix
system. Some examples are:

The size of device required by the database

The number of physical disk spindles that will service DB2 UDB to provide a device of the size
required by the database

The effect (if any) on the positioning schema of the maximum number of hypervolumes allowed within a
Symmetrix system

Creating Metavolumes
When creating metavolumes for a database server, several factors must be considered in order to ensure
reasonable performance.

Metavolume Size
A metavolume's size depends on two factors: hypervolume size and the number of hypers per meta. The
value for either of these factors can affect the overall metavolume performance.
Hypervolume Size
If you use a hyper size that is too large, you may not reach the six to ten desired spindles per CPU typically
recommended by DB2 UDB for your server. For example, if the hypervolume size is 15 GB and the
sought-after metavolume size is 30 GB, then only two physical disks (one per hypervolume) are required.
Even if DB2 UDB uses multiple containers created out of these devices, only two physical disks will be
servicing the requests. However, if a hypervolume size of 5 GB is used, and all six hypervolumes are
placed on different physical disks, then there will be six physical disks servicing the requests.
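
The spindle counts in this example follow directly from the sizes involved. The Python sketch below is illustrative only: the function names are introduced here, and it assumes, as the example does, that every hypervolume of the metavolume sits on a different physical disk.

```python
from math import ceil


def spindles_for_metavolume(metavolume_gb, hypervolume_gb):
    """Physical disks servicing a metavolume when each of its hypervolumes
    is placed on a different physical disk."""
    return ceil(metavolume_gb / hypervolume_gb)


def meets_spindle_target(spindles, cpus, minimum_per_cpu=6):
    """Check the six-to-ten spindles per CPU guideline (lower bound only)."""
    return spindles >= minimum_per_cpu * cpus


if __name__ == "__main__":
    for hyper_gb in (15, 5):
        count = spindles_for_metavolume(30, hyper_gb)
        print(hyper_gb, "GB hypers ->", count, "spindles,",
              "meets target for 1 CPU:", meets_spindle_target(count, 1))
```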
However, if your hypervolume size is too small, it is possible to reach the maximum number of logical
volumes allowed within a Symmetrix system. The equation in Formula 3 describes how to calculate the
maximum number of physical disks that can be partitioned before reaching this limit. Formula 3 assumes
each metavolume will be created using the same hypervolume size and number of hypers per meta.

maximum # of disks = ( maximum # of volumes - floor( maximum # of volumes / hypers per meta ) ) / ( physical disk size / hyper volume size )

Formula 3. Maximum Number of Disks


The rule of thumb for hypervolume size is 9 GB. This value also divides easily into common
Symmetrix disk sizes.
Hypers per Meta
Although this does not explicitly affect performance, too many hypers per meta may increase the chances
of wasting disk space. This can only occur when physical disks are dedicated to the database server. In
this case, it is possible to meet the database disk space requirements without fully allocating the underlying
physical disks.
If you use too few hypers per meta, you may not exploit the full performance potential of your Symmetrix
system, since the underlying disks may not be able to fully parallelize your transactions. The suggested
starting point is to create a metavolume using four hypers per meta.
Finally, four hypers per meta combined with a hyper size of 9 GB will produce a 36 GB metavolume. A 36
GB device is typically large enough without becoming unmanageable.


Hypervolumes versus Physical Disks


The number of hypervolumes servicing a database system is not necessarily equivalent to the number of
physical disks servicing that same system. Although each hypervolume within a metavolume is typically
created upon a separate physical disk, those same physical disks can contain other hypervolumes. These
hypervolumes can belong to other metavolumes servicing the same database system as well.

Striped versus Concatenated Metavolumes


A metavolume can be created in one of two ways: concatenated or striped. To achieve the best
performance, it is recommended that striped metavolumes be used. In the concatenated case, some disk
spindles can be left idle, either because they hold no data or because the requested data is always found on
the same disk. Therefore, your system will not benefit from using all drive heads. Striped metavolumes
increase the average number of drive heads servicing a request, since it is more likely that the data being
retrieved will be found on different underlying hypervolumes.

[Diagram labels: Head, Tail]

Figure 9. Logical Data Placement on a Striped Metavolume

Stripe Size
Another consideration when defining metavolumes is stripe size. The minimum, and currently the default,
stripe size is 960 KB. This minimum value is based on the size of two disk cylinders. It is possible to set
this value higher, but it is recommended that the stripe size be left at the default for most systems. The
default gives the highest likelihood that requested data is spread across more than one underlying
hypervolume, thus minimizing the chance of a bottleneck on a single resource.
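To illustrate why a smaller stripe size spreads requests across more hypervolumes, the following sketch maps a byte range of a striped metavolume onto its member hypervolumes. It assumes a simple round-robin placement of stripes across members, which is a simplification for illustration and not a statement of the Symmetrix internals; the function names are hypothetical.

```python
STRIPE_KB = 960  # default stripe size quoted in the text


def member_for_offset(offset_kb: int, hypers_per_meta: int = 4,
                      stripe_kb: int = STRIPE_KB) -> int:
    """0-based index of the hypervolume holding a given metavolume offset.

    Assumption: stripes are assigned to members in round-robin order.
    """
    return (offset_kb // stripe_kb) % hypers_per_meta


def members_touched(start_kb: int, length_kb: int, hypers_per_meta: int = 4,
                    stripe_kb: int = STRIPE_KB) -> list:
    """Sorted list of hypervolume indexes that a contiguous read would touch."""
    first_stripe = start_kb // stripe_kb
    last_stripe = (start_kb + length_kb - 1) // stripe_kb
    return sorted({s % hypers_per_meta for s in range(first_stripe, last_stripe + 1)})


print(members_touched(0, 256))     # a 256 KB read inside one stripe
print(members_touched(900, 256))   # a 256 KB read that crosses a stripe boundary
print(members_touched(0, 3840))    # a read of one full stripe width (4 x 960 KB)
```

Under this assumption, raising the stripe size makes it less likely that a single request crosses a stripe boundary, so fewer members share the work.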

Channel Directors
Another physical performance consideration occurs when connecting a Symmetrix system to the database
server. The I/O cables should be spread across as many channel directors as possible. Each channel
director has a tangible throughput limit. Therefore, spreading the cables across all available channel
directors will decrease the likelihood of the channel directors becoming a bottleneck. Figures 10a and 10b
demonstrate the difference between the recommended and the not recommended method for attaching the
cables.
Multiple Fibre Channel (FC) connections per physical server provide both performance and redundancy to
the overall configuration. With current-generation 2 Gb FC ports, DB2 UDB can be configured with two
to four FC ports per physical server per attached Symmetrix system. (It is possible to configure more, but
they would not normally provide additional value.)


Figure 10a. Two I/O Cables Connect to Two Channel Directors (Recommended)

Figure 10b. Two I/O Cables Connect to One Channel Director (Not Recommended)


Configuring the Operating System


Multipathing with PowerPath
EMC produces a software product, PowerPath, which enables multipathing for Symmetrix arrays and other
storage systems. Multipathing can increase the overall throughput between storage and server by increasing
the number of I/O channels available to the server to address a specific device. For more detailed
instructions on multipathing with PowerPath, refer to the PowerPath product guide.
There are several load balancing policy settings for PowerPath. Changing the policy can have a large impact
on DB2 UDB performance. However, the default policy, Symmetrix Optimization, is generally best. DB2
UDB users on AIX must use PowerPath version 2.1.0 or higher in order to avoid a known performance
defect in PowerPath.

Operating System Logical Volume Striping


For decision support systems, logical volumes at the operating system level should not be striped. The
striping at the DB2 UDB container level and on the Symmetrix system is enough to exploit parallelism
without compromising overall sequential detection. Adding additional layers of striping may cause data to
be placed in a random order on the underlying physical disks, which could affect when sequential detection
occurs. This is less of an issue on systems where the workload is generally random I/O.


Configuring DB2 UDB


Table Space Container Configurations
There are many different ways the devices presented to a server by a Symmetrix system can be allocated for
use by DB2 UDB. However, most schemas can be categorized into two major philosophies: shared nothing
and shared everything. This discussion assumes each metavolume corresponds to a single file system on the
database server.

Shared Nothing
The basic concept behind shared nothing is that resources are isolated for use by specific applications. For
DB2 UDB, this usually means isolating physical disks for use by a particular database partition or table
space. Therefore, all metavolumes residing on a set of physical disks are used by a single database partition
or table space. This must be done carefully, as more than one metavolume can reside on a physical disk.
When successful, each database partition or table space will have its own dedicated set of physical disks.
Figure 11 shows an example of isolating physical disks at the database partition level. In this example, there
are 16 physical disks. Each set of four physical disks has been divided into four metavolumes, as in Figure
2. Therefore, a total of 16 separate metavolumes can be addressed by a server.
For this example, we want to create two SMS table spaces for an imaginary database that has two database
partitions. As Figure 11 shows, the file systems that are mounted on the top eight metavolumes will be
assigned to database partition 1, while the file systems on the bottom eight metavolumes will be assigned to
database partition 2. Thus, the underlying disks are isolated to be used exclusively by a specific database
partition. This particular layout corresponds to the CREATE TABLESPACE statement presented in Figure
12.
Although not highlighted in the example, shared nothing is typically easier to configure and manage since
the creation of numerous additional devices is not usually required. However, in some cases, performance
under this configuration may not be optimal. When a system has a limited number of physical disks, sharing
all the physical disks among all the DB2 UDB database partitions can improve performance.



    Metavolume       Mounted On
    Metavolume 1     /node1/meta_volume1
    Metavolume 2     /node1/meta_volume2
    Metavolume 3     /node1/meta_volume3
    Metavolume 4     /node1/meta_volume4
    Metavolume 5     /node1/meta_volume5
    Metavolume 6     /node1/meta_volume6
    Metavolume 7     /node1/meta_volume7
    Metavolume 8     /node1/meta_volume8
    Metavolume 9     /node2/meta_volume9
    Metavolume 10    /node2/meta_volume10
    Metavolume 11    /node2/meta_volume11
    Metavolume 12    /node2/meta_volume12
    Metavolume 13    /node2/meta_volume13
    Metavolume 14    /node2/meta_volume14
    Metavolume 15    /node2/meta_volume15
    Metavolume 16    /node2/meta_volume16

Figure 11. 16 Physical Disks Arranged in a Shared Nothing Configuration for DB2 UDB

CREATE TABLESPACE My_Tablespace PAGESIZE 16K
MANAGED BY SYSTEM
USING ('/node1/meta_volume1/My_Tablespace',
       '/node1/meta_volume2/My_Tablespace',
       '/node1/meta_volume3/My_Tablespace',
       '/node1/meta_volume4/My_Tablespace',
       '/node1/meta_volume5/My_Tablespace',
       '/node1/meta_volume6/My_Tablespace',
       '/node1/meta_volume7/My_Tablespace',
       '/node1/meta_volume8/My_Tablespace') ON NODE (1)
USING ('/node2/meta_volume9/My_Tablespace',
       '/node2/meta_volume10/My_Tablespace',
       '/node2/meta_volume11/My_Tablespace',
       '/node2/meta_volume12/My_Tablespace',
       '/node2/meta_volume13/My_Tablespace',
       '/node2/meta_volume14/My_Tablespace',
       '/node2/meta_volume15/My_Tablespace',
       '/node2/meta_volume16/My_Tablespace') ON NODE (2)
EXTENTSIZE 16
PREFETCHSIZE 128;


Figure 12. Example CREATE TABLESPACE Statement that Corresponds with Figure 11

Shared Everything
With shared everything, resources are not isolated for use. All applications should have access to all the
resources. For DB2 UDB, this typically means all database partitions and table spaces will reside on all
physical disks. Therefore, each physical disk will need to be addressable by each database partition. This
can only be accomplished by creating at least one hypervolume per database partition on every physical
disk. In addition, if you are planning on using DMS raw table space containers, a separate metavolume must
be created for each table space in each database partition on every physical disk. You should notice how this
design can quickly increase the number of devices that must be managed by your system administrator.
Therefore, the chance of a possible performance gain should be weighed against the extra administrative
costs.
Figures 13 and 14 provide an example of creating a shared everything table space on a database with two
database partitions. Note that the Symmetrix disk configuration for this example has exactly the same
layout as in the previous example for shared nothing (Figure 11). As before, 16 physical disks have been
divided into 16 separate metavolumes. The difference is in how the metavolumes are addressed by the
database server(s). Look closely at the ordering of the file system names used as containers for the two
database partitions in Figure 14. Notice how each database partition in the CREATE TABLESPACE
statement has access to each physical disk.
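The difference between the two layouts comes down to how metavolume numbers are assigned to database partitions. The sketch below reproduces both assignments for the 16-metavolume example, assuming (as in the example) that each set of four physical disks carries four consecutively numbered metavolumes. The function names and the path template are illustrative only.

```python
def shared_nothing(num_metas: int = 16, partitions: int = 2) -> dict:
    """Give each partition one contiguous block of metavolumes, and
    therefore its own dedicated sets of physical disks."""
    per_part = num_metas // partitions
    return {p + 1: list(range(p * per_part + 1, (p + 1) * per_part + 1))
            for p in range(partitions)}


def shared_everything(num_metas: int = 16, partitions: int = 2,
                      metas_per_disk_set: int = 4) -> dict:
    """Split the metavolumes of every disk set among all partitions, so
    each partition has containers on every set of physical disks."""
    share = metas_per_disk_set // partitions
    layout = {p: [] for p in range(1, partitions + 1)}
    for meta in range(1, num_metas + 1):
        position_in_set = (meta - 1) % metas_per_disk_set
        layout[position_in_set // share + 1].append(meta)
    return layout


def container_paths(layout: dict, tablespace: str = "My_Tablespace") -> dict:
    """Render container paths in the style used by the example statements."""
    return {p: [f"/node{p}/meta_volume{m}/{tablespace}" for m in metas]
            for p, metas in layout.items()}


print(shared_nothing())
print(shared_everything())
print(container_paths(shared_everything())[2][0])
```

For two partitions, the shared nothing assignment yields metavolumes 1 to 8 and 9 to 16, while the shared everything assignment interleaves them in pairs (1, 2, 5, 6, ... for partition 1 and 3, 4, 7, 8, ... for partition 2).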



    Metavolume       Mounted On
    Metavolume 1     /node1/meta_volume1
    Metavolume 2     /node1/meta_volume2
    Metavolume 3     /node2/meta_volume3
    Metavolume 4     /node2/meta_volume4
    Metavolume 5     /node1/meta_volume5
    Metavolume 6     /node1/meta_volume6
    Metavolume 7     /node2/meta_volume7
    Metavolume 8     /node2/meta_volume8
    Metavolume 9     /node1/meta_volume9
    Metavolume 10    /node1/meta_volume10
    Metavolume 11    /node2/meta_volume11
    Metavolume 12    /node2/meta_volume12
    Metavolume 13    /node1/meta_volume13
    Metavolume 14    /node1/meta_volume14
    Metavolume 15    /node2/meta_volume15
    Metavolume 16    /node2/meta_volume16

Figure 13. 16 Physical Disks Arranged in a Shared Everything Configuration for DB2 UDB

CREATE TABLESPACE My_Tablespace PAGESIZE 16K
MANAGED BY SYSTEM
USING ('/node1/meta_volume1/My_Tablespace',
       '/node1/meta_volume2/My_Tablespace',
       '/node1/meta_volume5/My_Tablespace',
       '/node1/meta_volume6/My_Tablespace',
       '/node1/meta_volume9/My_Tablespace',
       '/node1/meta_volume10/My_Tablespace',
       '/node1/meta_volume13/My_Tablespace',
       '/node1/meta_volume14/My_Tablespace') ON NODE (1)
USING ('/node2/meta_volume3/My_Tablespace',
       '/node2/meta_volume4/My_Tablespace',
       '/node2/meta_volume7/My_Tablespace',
       '/node2/meta_volume8/My_Tablespace',
       '/node2/meta_volume11/My_Tablespace',
       '/node2/meta_volume12/My_Tablespace',
       '/node2/meta_volume15/My_Tablespace',
       '/node2/meta_volume16/My_Tablespace') ON NODE (2)
EXTENTSIZE 16
PREFETCHSIZE 128;


Figure 14. Example CREATE TABLESPACE Statement that Corresponds with Figure 13

JBOD
The final way to lay out Symmetrix physical disks is called JBOD (just a bunch of disks). In essence, it is
another example of shared nothing, where physical disks are isolated for use. A JBOD schema can be
created only without the use of metavolumes. In this case, each hypervolume is presented to the server as
a separate addressable device. These devices are then used as containers by various DB2 UDB table spaces.
Basically, this is the same as the previous shared nothing schema, without the metavolume layer. When
databases are less than 1 TB in size, removing the metavolume layer does provide an additional performance
gain. However, if your database is larger, or has a chance of growing past 1 TB, do not use a JBOD
schema.

Table Space Configuration


Extent Size
Extent size is usually configured to be the same as the stripe width of the devices on which the table space
resides. However, a typical stripe width for the Symmetrix system is 3840 KB (960 KB stripe size * 4
hypers per meta), which is significantly larger than on comparable systems. Setting the extent size to the
stripe width can actually impede performance; instead, the extent size should be configured at around 256 KB.
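Because EXTENTSIZE is given in pages in the example CREATE TABLESPACE statements, the 256 KB suggestion has to be converted for each page size. The following sketch does that conversion and also reproduces the stripe width calculation; the constant and function names are illustrative.

```python
STRIPE_SIZE_KB = 960     # default Symmetrix stripe size (from the text)
HYPERS_PER_META = 4      # suggested starting point (from the text)
TARGET_EXTENT_KB = 256   # suggested extent size (from the text)


def stripe_width_kb(stripe_kb: int = STRIPE_SIZE_KB,
                    hypers: int = HYPERS_PER_META) -> int:
    """Stripe width of a striped metavolume: stripe size times member count."""
    return stripe_kb * hypers


def extent_size_pages(page_kb: int, extent_kb: int = TARGET_EXTENT_KB) -> int:
    """Extent size expressed in pages for a given table space page size."""
    if extent_kb % page_kb != 0:
        raise ValueError("extent size must be a whole number of pages")
    return extent_kb // page_kb


print("stripe width (KB):", stripe_width_kb())
for page_kb in (4, 8, 16, 32):
    print(f"PAGESIZE {page_kb}K: EXTENTSIZE {extent_size_pages(page_kb)}")
```

For a 16 KB page size this gives an extent size of 16 pages, which is the value used in the example statements.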

Prefetch Size
Prefetch size specifies how much data should be read into the buffer pool on a prefetch data request.
Prefetching data can help queries avoid unnecessary page faults. The most efficient prefetch size for a
table space is closely linked to its workload, and must be tuned on a per-system basis.
However, a good starting point for a Symmetrix-based system is to multiply the number of containers in the
table space by its extent size in KB, and then double the result (Formula 4). This is twice the usual rule of
thumb for prefetch size and is linked to the ability of the Symmetrix mirrored metavolumes to fulfill a read
request from two separate physical disks.
prefetchsize (KB) = extentsize (KB) * # of containers * 2
Formula 4. Prefetch Size
Note that prefetchsize is tunable after table space creation. This is not true for extent size and page size.
These values are set at table space creation time and cannot be altered without re-defining the table space
and re-loading its data.
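As a worked instance of Formula 4, the sketch below computes the starting prefetch size for a table space with eight containers and a 256 KB extent, both in KB and in 16 KB pages. The function names are illustrative, and the container count is taken from the example statements rather than from any measured system.

```python
def prefetch_size_kb(extent_kb: int, containers: int) -> int:
    """Formula 4: prefetchsize (KB) = extentsize (KB) * # of containers * 2."""
    return extent_kb * containers * 2


def prefetch_size_pages(extent_pages: int, containers: int) -> int:
    """The same starting point, expressed in pages instead of KB."""
    return extent_pages * containers * 2


extent_kb, page_kb, containers = 256, 16, 8
kb = prefetch_size_kb(extent_kb, containers)
pages = prefetch_size_pages(extent_kb // page_kb, containers)
print(f"prefetch size: {kb} KB = {pages} pages of {page_kb} KB")
```

This yields 4096 KB, or 256 pages. The usual rule of thumb without the doubling would give 128 pages for the same table space.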

Overhead and Transfer Rate


Two other parameters that relate to I/O preference can be configured for a table space: overhead and transfer
rate. These parameters are used when making optimization decisions, and help determine the relative cost of
random versus sequential accesses.
Overhead provides an estimate (in milliseconds) of the time required by the container before any data is read
into memory. This overhead activity includes the container's I/O-controller overhead, as well as the disk
latency time, which includes the disk seek time.
Transfer rate provides an estimate (in milliseconds) of the time required to read one page of data into
memory.


Table 2. Suggested Overhead and Transfer Rate Values

                                        Transfer Rate (ms) by Page Size
    Disk Capacity       Overhead (ms)   4 KB    8 KB    16 KB   32 KB
    36 GB 10K RPM       8.7             0.1     0.1     0.3     0.6
    50 GB 7200 RPM      11.6            0.1     0.1     0.3     0.6
    73 GB 10K RPM       8.6             0.1     0.1     0.2     0.5
    181 GB 7200 RPM     11.7            0.1     0.1     0.2     0.5
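The values in Table 2 can be kept in a small lookup so that the OVERHEAD and TRANSFERRATE clauses for a table space are generated consistently. This is a minimal sketch: the dictionary simply restates Table 2, and the helper name is hypothetical.

```python
# Table 2 restated: disk type -> (overhead in ms, {page size in KB: transfer rate in ms})
SUGGESTED_IO_VALUES = {
    "36 GB 10K RPM":   (8.7,  {4: 0.1, 8: 0.1, 16: 0.3, 32: 0.6}),
    "50 GB 7200 RPM":  (11.6, {4: 0.1, 8: 0.1, 16: 0.3, 32: 0.6}),
    "73 GB 10K RPM":   (8.6,  {4: 0.1, 8: 0.1, 16: 0.2, 32: 0.5}),
    "181 GB 7200 RPM": (11.7, {4: 0.1, 8: 0.1, 16: 0.2, 32: 0.5}),
}


def io_clause(disk: str, page_kb: int) -> str:
    """Build the OVERHEAD/TRANSFERRATE clause text for a disk type and page size."""
    overhead, rates = SUGGESTED_IO_VALUES[disk]
    return f"OVERHEAD {overhead} TRANSFERRATE {rates[page_kb]}"


print(io_clause("36 GB 10K RPM", 16))
```

The resulting text is intended to be appended to a CREATE TABLESPACE statement such as the earlier examples.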

Other Tuning Parameters


I/O Servers
The number of I/O servers configured for a database can also have a significant impact on performance. I/O
servers are used on behalf of the database agents to perform I/O prefetches and asynchronous I/O for utilities
such as backup and restore. This value, like prefetch size, depends on overall system workload. However, a
good starting point for configuring I/O servers is to count the number of containers in the table space with
the most containers, and multiply that number by two.
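The starting point for I/O servers can be computed from a per-table-space container count, as in the sketch below. The table space names and counts are hypothetical inputs, the database name my_db is the placeholder used elsewhere in this paper, and the printed command assumes the value is applied through the NUM_IOSERVERS database configuration parameter.

```python
def suggested_num_ioservers(containers_per_tablespace: dict) -> int:
    """Twice the container count of the table space with the most containers."""
    if not containers_per_tablespace:
        raise ValueError("at least one table space is required")
    return 2 * max(containers_per_tablespace.values())


# Hypothetical container counts per table space on one database partition.
counts = {"TABLESPACE1": 5, "TABLESPACE2": 8}
n = suggested_num_ioservers(counts)
print(f"db2 update db cfg for my_db using NUM_IOSERVERS {n}")
```

Like prefetch size, the result is only a starting value and should be tuned against the actual workload.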

DB2_PARALLEL_IO
It is recommended that DB2_PARALLEL_IO be set to ON for all table spaces using containers created on
RAID devices; Symmetrix striped metavolumes fall into this category. DB2_PARALLEL_IO allows
multiple reads and writes to occur on a single container, thus increasing throughput.

Multipage File Allocation


In an SMS table space, a file is extended one page at a time as the object grows. If you need improved insert
performance, you can consider enabling multipage file allocation. This allows the system to allocate or
extend the file by more than one page at a time. You must run db2empfa to enable multipage file
allocation. In a partitioned database environment, run this utility on each database partition. Once multipage
file allocation is enabled, it cannot be disabled.


Understanding Existing Systems


If your database system already exists, it is still possible to understand how the database system relates to
the underlying disks and vice versa. This information can be vital when monitoring a system for
performance. In addition, it can also help in isolating bottlenecks. Figure 15 gives a general view of the
relationship between a DB2 UDB database and the Symmetrix system used for its storage, for an imaginary
system. The following procedure walks through an example set of commands that can be used to
determine the nature of the layout for a system. The example output corresponds to Figure 15 and the steps
follow the diagram from right to left.
Procedure (commands are examples for AIX; example output corresponds with Figure 15):

1. Determine the number of database partitions.

   Command:
       db2 connect to <database_name>        (e.g., my_db)
       db2 list nodes
       db2 connect reset

   Example output:
       NODE NUMBER
       -----------
       0
       1
         2 record(s) selected.

2. Determine which table spaces reside within a particular database partition and their
   corresponding table space ID value.

   Command:
       export DB2NODE=<database_partition_number>    (e.g., 1)
       db2 connect to <database_name>                (e.g., my_db)
       db2 list tablespaces

   Note: Only table spaces with IDs 3 and 4 are shown in the diagram.

   Example output:
       Tablespaces for Current Database

       Tablespace ID            = 0
       Name                     = SYSCATSPACE
       Type                     = System managed space
       Contents                 = Any data
       State                    = 0x0000
         Detailed explanation:
           Normal

       Tablespace ID            = 1
       Name                     = TEMPSPACE1
       Type                     = System managed space
       Contents                 = System Temporary data
       State                    = 0x0000
         Detailed explanation:
           Normal

       Tablespace ID            = 2
       Name                     = USERSPACE1
       Type                     = System managed space
       Contents                 = Any data
       State                    = 0x0000
         Detailed explanation:
           Normal

       Tablespace ID            = 3
       Name                     = TABLESPACE1
       Type                     = System managed space
       Contents                 = Any data
       State                    = 0x0000
         Detailed explanation:
           Normal

       Tablespace ID            = 4
       Name                     = TABLESPACE2
       Type                     = System managed space
       Contents                 = Any data
       State                    = 0x0000
         Detailed explanation:
           Normal

3. Determine which containers belong to a table space specified using its table space ID.

   Command:
       db2 list tablespace containers for <tablespace_id>    (e.g., 3)
       db2 connect reset
       export DB2NODE=

   Example output:
       Tablespace Containers for Tablespace 3

       Container ID             = 0
       Name                     = /my_fs0/tbspace
       Type                     = Path

       Container ID             = 1
       Name                     = /my_fs1/tbspace
       Type                     = Path

       Container ID             = 2
       Name                     = /my_fs2/tbspace
       Type                     = Path

       Container ID             = 3
       Name                     = /my_fs3/tbspace
       Type                     = Path

       Container ID             = 4
       Name                     = /my_fs4/tbspace
       Type                     = Path


4. Determine the logical volume in which the container name resides.

   Command:
       df <container_name>    (e.g., /my_fs2/tbspace)

   Example output:
       Filesystem      512-blocks      Free  %Used  Iused  %Iused  Mounted on
       /dev/lv_myfs2     75497472  45462376    40%  17134      1%  /my_fs2

5. Determine the PowerPath physical volumes that make up the logical volume.

   Command:
       lslv -l <logical_volume_name>    (e.g., lv_myfs2)

   Example output:
       lv_myfs2:/my_fs2
       PV             COPIES         IN BAND    DISTRIBUTION
       hdiskpower0    288:000:000    20%        058:058:058:058:056
       hdiskpower1    288:000:000    20%        058:058:058:058:056
       hdiskpower2    288:000:000    20%        058:058:058:058:056
       hdiskpower3    288:000:000    20%        058:058:058:058:056

6. Determine the Symmetrix volume ID for the meta head.

   Command:
       powermt display dev=<hdiskpower>    (e.g., hdiskpower0)

   Example output:
       Pseudo name=hdiskpower0
       Symmetrix frame ID=000276901285; volume ID=0174
       state=alive; policy=SymmOpt; priority=0; queued-IOs=0
       ======================================================================
       ------- Host Devices -------   - Symm -   --- Path ----   -- Stats ---
       ###  HW-path   device          director   mode    state   q-IOs errors
       ======================================================================
         2  fscsi1    hdisk11         FA  3aA    active  open        0      0
         3  fscsi2    hdisk18         FA  4aA    active  open        0      0
         4  fscsi3    hdisk25         FA 13aA    active  open        0      0
         0  fscsi4    hdisk32         FA 14aA    active  open        0      0
         1  fscsi0    hdisk44         FA 13bA    active  open        0      0

7. Determine the Symmetrix physical disks that make up the metavolume on which the
   hdiskpower resides.

   Command:
       symdev -DA ALL list | head -9
       symdev -DA ALL list | grep "<hdiskpower>"    (e.g., hdiskpower0)

   Example output:
       Symmetrix ID: 000276901285

              Device Name             Directors                Device
       ------------------------- ------------------ -------------------------
                                                                          Cap
       Sym  Physical             SA :P  DA :IT      Hyper Config         (MB)
       ------------------------- ------------------ -------------------------
       0177 /dev/rhdiskpower0    ???:?  01A:D5          1 2-Way Mir (m)     -
       0176 /dev/rhdiskpower0    ???:?  02A:D5          1 2-Way Mir (m)     -
       0175 /dev/rhdiskpower0    ???:?  15A:D5          1 2-Way Mir (m)     -
       0174 /dev/rhdiskpower0    13B:0  16A:D5          1 2-Way Mir (M) 37125
       0174 /dev/rhdiskpower0    13B:0  01B:C3          1 2-Way Mir (M) 37125
       0175 /dev/rhdiskpower0    ???:?  02B:C3          1 2-Way Mir (m)     -
       0176 /dev/rhdiskpower0    ???:?  15B:C3          1 2-Way Mir (m)     -
       0177 /dev/rhdiskpower0    ???:?  16B:C3          1 2-Way Mir (m)     -

8. Determine which other physical volumes reside on a particular Symmetrix physical disk.

   Command:
       symdev -DA ALL list | head -9
       symdev -DA ALL list | grep "<DA :IT> " | \
           sort -k 5                                (e.g., 01A:D5 )

   Example output:
       Symmetrix ID: 000276901285

              Device Name             Directors                Device
       ------------------------- ------------------ -------------------------
                                                                          Cap
       Sym  Physical             SA :P  DA :IT      Hyper Config         (MB)
       ------------------------- ------------------ -------------------------
       0177 /dev/rhdiskpower0    ???:?  01A:D5          1 2-Way Mir (m)     -
       0168 /dev/rhdiskpower22   13B:0  01A:D5          2 2-Way Mir (M) 24750
       01A7 /dev/rhdiskpower43   13B:0  01A:D5          3 2-Way Mir (m)  6188
       0198 /dev/rhdiskpower34   13B:0  01A:D5          4 2-Way Mir (M) 20625
       0117 /dev/rhdiskpower1    ???:?  01A:D5          5 2-Way Mir (m)     -


Figure 15. Overall View of an Example Relationship between DB2 UDB and Symmetrix
