Bertrand Dufrasne, Giacomo Chiapparini, Attila Grosz, Mark Kremkus, Lisa Martinez, Markus Oscheka, Guenter Rebmann, Christopher Sansone
ibm.com/redbooks
International Technical Support Organization

IBM XIV Storage System: Concepts, Architecture, and Usage

January 2009
SG24-7659-00
Note: Before using this information and the product it supports, read the information in "Notices" on page ix.
First Edition (January 2009)

This edition applies to the XIV Storage System (2810-A14) with Version 10.0.0 of the XIV Storage System software (5639-XXA) and the XIV Storage System GUI and Extended Command Line Interface (XCLI) Version 2.2.43.
© Copyright International Business Machines Corporation 2009. All rights reserved.

Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Notices . . . ix
Trademarks . . . x

Preface . . . xi
The team that wrote this book . . . xi
Become a published author . . . xiv
Comments welcome . . . xiv

Chapter 1. IBM XIV Storage System overview . . . 1
1.1 Overview . . . 2
  1.1.1 System components . . . 2
  1.1.2 Key design points . . . 3
  1.1.3 The XIV Storage System Software . . . 4
Chapter 2. XIV logical architecture and concepts . . . 7
2.1 Architectural overview . . . 8
2.2 Massive parallelism . . . 10
  2.2.1 Grid architecture over monolithic architecture . . . 10
  2.2.2 Logical parallelism . . . 14
2.3 Full storage virtualization . . . 14
  2.3.1 Logical system concepts . . . 16
  2.3.2 System usable capacity . . . 20
  2.3.3 Storage Pool concepts . . . 22
  2.3.4 Capacity allocation and thin provisioning . . . 24
2.4 Reliability, availability, and serviceability . . . 33
  2.4.1 Resilient architecture . . . 33
  2.4.2 Rebuild and redistribution . . . 37
  2.4.3 Minimized exposure . . . 42

Chapter 3. XIV physical architecture and components . . . 45
3.1 IBM XIV Storage System Model A14 . . . 46
  3.1.1 Hardware characteristics . . . 46
3.2 IBM XIV hardware components . . . 47
  3.2.1 The rack and the UPS modules . . . 48
  3.2.2 Data and Interface Modules . . . 50
  3.2.3 SATA disk drives . . . 57
  3.2.4 The patch panel . . . 59
  3.2.5 Interconnection and switches . . . 60
  3.2.6 Hardware needed by support and IBM SSR . . . 61
3.3 Redundant hardware . . . 61
  3.3.1 Power redundancy . . . 62
  3.3.2 Switch/interconnect redundancy . . . 62
3.4 Hardware parallelism . . . 62

Chapter 4. Physical planning and installation . . . 63
4.1 Overview . . . 64
4.2 Ordering IBM XIV hardware . . . 64
  4.2.1 Feature codes and hardware configuration . . . 64
  4.2.2 2810-A14 Capacity on Demand ordering options . . . 66
4.3 Physical planning . . . 67
  4.3.1 Site requirements . . . 67
4.4 Basic configuration planning . . . 69
4.5 Network connection considerations . . . 70
  4.5.1 Fibre Channel connections . . . 70
  4.5.2 iSCSI connections . . . 73
  4.5.3 Mixed iSCSI and Fibre Channel host access . . . 74
  4.5.4 Management connectivity . . . 74
  4.5.5 Mobile computer ports . . . 75
  4.5.6 Remote access . . . 75
4.6 Remote Copy connectivity . . . 76
  4.6.1 Remote Copy links . . . 76
  4.6.2 Remote Target Connectivity . . . 76
4.7 Planning for growth . . . 77
  4.7.1 Future requirements . . . 77
4.8 IBM XIV installation . . . 77
  4.8.1 Physical installation . . . 77
  4.8.2 Basic configuration . . . 78
  4.8.3 Complete the physical installation . . . 78
Chapter 5. Configuration . . . 79
5.1 IBM XIV Storage Management software . . . 80
  5.1.1 XIV Storage Management software installation . . . 81
5.2 Managing the XIV Storage System . . . 84
  5.2.1 Launching the Management Software GUI . . . 86
  5.2.2 Log on to the system with XCLI . . . 90
5.3 Storage Pools . . . 93
  5.3.1 Managing Storage Pools with XIV GUI . . . 94
  5.3.2 Manage Storage Pools with XCLI . . . 101
5.4 Volumes . . . 103
  5.4.1 Managing volumes with the XIV GUI . . . 104
  5.4.2 Managing volumes with XCLI . . . 112
5.5 Host definition and mappings . . . 113
  5.5.1 Managing hosts and mappings with XIV GUI . . . 113
  5.5.2 Managing hosts and mappings with XCLI . . . 121
5.6 Scripts . . . 122

Chapter 6. Security . . . 125
6.1 Physical access security . . . 126
6.2 User access security . . . 126
  6.2.1 Role Based Access Control . . . 126
  6.2.2 Manage user rights with the GUI . . . 128
  6.2.3 Managing users with XCLI . . . 134
  6.2.4 LDAP and Active Directory . . . 138
6.3 Password management . . . 139
6.4 Managing multiple systems . . . 140
6.5 Event logging . . . 141
  6.5.1 Viewing events in the XIV GUI . . . 142
  6.5.2 Viewing events in the XCLI . . . 143
  6.5.3 Define notification rules . . . 145
Chapter 7. Host connectivity . . . 147
7.1 Connectivity overview . . . 148
  7.1.1 Module, patch panel, and host connectivity . . . 149
  7.1.2 FC and iSCSI simplified access . . . 150
  7.1.3 Remote Mirroring connectivity . . . 151
7.2 Fibre Channel (FC) connectivity . . . 152
  7.2.1 Preparation steps . . . 152
  7.2.2 FC configurations . . . 153
  7.2.3 Zoning and VSAN . . . 155
  7.2.4 Identification of FC ports (initiator/target) . . . 156
  7.2.5 IBM XIV logical FC maximum values . . . 159
7.3 iSCSI connectivity . . . 159
  7.3.1 iSCSI configurations . . . 160
  7.3.2 Link aggregation . . . 162
  7.3.3 IBM XIV Storage System iSCSI setup . . . 162
  7.3.4 Identifying iSCSI ports . . . 164
  7.3.5 Using iSCSI hardware or software initiator (recommendation) . . . 167
  7.3.6 IBM XIV logical iSCSI maximum values . . . 168
  7.3.7 Boot from iSCSI target . . . 169
7.4 Logical configuration for host connectivity . . . 170
  7.4.1 Required generic information and preparation . . . 170
  7.4.2 Prepare for a new host: XIV GUI . . . 173
  7.4.3 Prepare for a new host: XCLI . . . 177

Chapter 8. OS-specific considerations for host connectivity . . . 179
8.1 Attaching Microsoft Windows host to XIV . . . 180
  8.1.1 Windows host FC configuration . . . 181
  8.1.2 Windows host iSCSI configuration . . . 185
  8.1.3 Management volume LUN 0 . . . 192
8.2 Attaching AIX hosts to XIV . . . 193
  8.2.1 AIX host FC configuration . . . 194
  8.2.2 AIX host iSCSI configuration . . . 199
  8.2.3 Management volume LUN 0 . . . 203
8.3 Linux . . . 204
  8.3.1 Support issues that distinguish Linux from other operating systems . . . 204
  8.3.2 FC and multi-pathing for Linux using PROCFS . . . 204
  8.3.3 FC and multi-pathing for Linux using SYSFS . . . 210
  8.3.4 Linux iSCSI configuration . . . 216
8.4 Sun Solaris . . . 218
  8.4.1 FC and multi-pathing configuration for Solaris . . . 218
  8.4.2 iSCSI configuration for Solaris . . . 221
8.5 VMware . . . 223
  8.5.1 FC and multi-pathing for VMware ESX . . . 223
  8.5.2 ESX Server iSCSI configuration . . . 224

Chapter 9. Performance characteristics . . . 235
9.1 Performance concepts . . . 236
  9.1.1 Full disk resource utilization . . . 236
  9.1.2 Caching considerations . . . 236
  9.1.3 Data mirroring . . . 237
  9.1.4 SATA drives compared to FC drives . . . 238
  9.1.5 Snapshot performance . . . 238
  9.1.6 Remote Mirroring performance . . . 238
9.2 Best practices . . . 239
  9.2.1 Distribution of connectivity . . . 239
  9.2.2 Host configuration considerations . . . 239
  9.2.3 XIV sizing validation . . . 240
9.3 Performance statistics gathering with XIV . . . 240
  9.3.1 Using the GUI . . . 240
  9.3.2 Using the XCLI . . . 246

Chapter 10. Monitoring . . . 249
10.1 System monitoring . . . 250
  10.1.1 Monitoring with the GUI . . . 250
  10.1.2 Monitoring with XCLI . . . 255
  10.1.3 SNMP-based monitoring . . . 260
  10.1.4 XIV SNMP setup . . . 262
  10.1.5 Using IBM Director . . . 264
10.2 Call Home and remote support . . . 273
  10.2.1 Setting up Call Home . . . 273
  10.2.2 Remote support . . . 281
  10.2.3 Repair flow . . . 282

Chapter 11. Copy functions . . . 285
11.1 Snapshots . . . 286
  11.1.1 Architecture of snapshots . . . 286
  11.1.2 Volume snapshots . . . 288
  11.1.3 Consistency Groups . . . 300
  11.1.4 Snapshot with Remote Mirror . . . 310
  11.1.5 Windows Server 2003 Volume Shadow Copy Service . . . 311
  11.1.6 MySQL database backup . . . 312
11.2 Volume Copy . . . 317
  11.2.1 Architecture . . . 318
  11.2.2 Performing a Volume Copy . . . 318
  11.2.3 Creating an OS image with Volume Copy . . . 319

Chapter 12. Remote Mirror . . . 323
12.1 Remote Mirror . . . 324
  12.1.1 Remote Mirror overview . . . 324
  12.1.2 Boundaries . . . 325
  12.1.3 Initial setup . . . 325
  12.1.4 Coupling . . . 331
  12.1.5 Synchronization . . . 332
  12.1.6 Disaster Recovery . . . 333
  12.1.7 Reads and writes in a Remote Mirror . . . 333
  12.1.8 Role switchover . . . 333
  12.1.9 Remote Mirror step-by-step illustration . . . 334
  12.1.10 Recovering from a failure . . . 351

Chapter 13. Data migration . . . 353
13.1 Overview . . . 354
13.2 Handling I/O requests . . . 355
13.3 Data migration stages . . . 356
  13.3.1 Initial configuration . . . 357
  13.3.2 Testing the configuration . . . 359
  13.3.3 Activate . . . 359
  13.3.4 Migration process . . . 359
  13.3.5 Synchronization complete . . . 360
  13.3.6 Delete the data migration . . . 360
Related publications
IBM Redbooks publications
Other publications
Online resources
How to get IBM Redbooks publications
Help from IBM
Index . . . 363
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AIX 5L, AIX, Alerts, BladeCenter, DB2 Universal Database, DB2, DS4000, DS6000, DS8000, i5/OS, IBM, NetView, POWER, Redbooks, Redbooks (logo), System Storage, System x, System z, Tivoli, TotalStorage, XIV
The following terms are trademarks of other companies:

Disk Magic and the IntelliMagic logo are trademarks of IntelliMagic BV in the United States, other countries, or both.

Snapshot and the NetApp logo are trademarks or registered trademarks of NetApp, Inc. in the U.S. and other countries.

SUSE, the Novell logo, and the N logo are registered trademarks of Novell, Inc. in the United States and other countries.

Oracle, JD Edwards, PeopleSoft, Siebel, and TopLink are registered trademarks of Oracle Corporation and/or its affiliates.

QLogic and the QLogic logo are registered trademarks of QLogic Corporation. SANblade is a registered trademark in the United States.

VMware and the VMware "boxes" logo and design are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions.

Java, MySQL, RSM, Solaris, Sun, Sun StorEdge, and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Active Directory, ESP, Microsoft, MS, SQL Server, Windows Server, Windows Vista, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Intel, Intel Xeon, the Intel logo, the Intel Inside logo, and the Intel Centrino logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Mozilla and Firefox, as well as the Firefox logo, are owned exclusively by the Mozilla Foundation; all rights in the names, trademarks, and logos of the Mozilla Foundation are reserved.

Other company, product, or service names may be trademarks or service marks of others.
Preface
This IBM Redbooks publication describes the concepts, architecture, and implementation of the IBM XIV Storage System (2810-A14). The XIV Storage System is designed as a scalable enterprise storage system based on a grid array of hardware components. It can attach to both Fibre Channel Protocol (FCP) and IP network Small Computer System Interface (iSCSI) capable hosts. The system is a good fit for clients who want to grow capacity without managing multiple tiers of storage in order to maximize performance and minimize cost.

The XIV Storage System is well suited for mixed or random access workloads, such as transaction processing, video clips, images, and e-mail; for industries such as telecommunications, media and entertainment, finance, and pharmaceuticals; and for new and emerging workload areas, such as Web 2.0.

The first chapters of this book provide details about several of the unique and powerful concepts that form the basis of the XIV Storage System logical and physical architecture. We explain how the system was designed to eliminate direct dependencies between the hardware elements and the software that governs the system.

In subsequent chapters, we explain the planning and preparation tasks required to deploy the system in your environment, followed by a step-by-step procedure describing how to configure and administer the system. We illustrate how to perform those tasks by using the intuitive, yet powerful XIV Storage Manager GUI or the Extended Command Line Interface (XCLI). We also explore and illustrate the use of snapshots and Remote Copy functions, and we outline the requirements and summarize the procedures for attaching the system to various host platforms.

This publication is intended for readers who want an understanding of the XIV Storage System, as well as those who need detailed advice about how to configure and use the system.
Attila Grosz is a Field Technical Sales Specialist at the IBM Systems and Technology Group in Budapest, Hungary. He is a member of the CEMAAS STG Systems Architect team. He is responsible for System Storage presales technical support within STG. He has 10 years of experience with storage in Open Systems environments, including AIX, Linux, and Windows. Attila has worked at IBM since 1999, in various divisions. He holds a Communication-Technical Engineering degree from the University of Godollo, Hungary. Mark Kremkus is a Senior Accredited I/T Specialist based in Austin, Texas. He has seven years of experience providing consultative sales support for the full spectrum of IBM Storage products. His current area of focus involves creating and presenting Disk Magic studies for the full family of DS4000, DS6000, DS8000, and SAN Volume Controller (SVC) products across a broad range of open and mainframe environments. He holds a Bachelor of Science degree in Electrical Engineering from Texas A&M University, and graduated with honors as an Undergraduate Research Fellow in MRI technology. Lisa Martinez is a Senior Software Engineer working in the DS8000 System Test Architecture in Tucson, Arizona. She has nine years of experience in Enterprise Disk Test. She holds a Bachelor of Science degree in Electrical Engineering from the University of New Mexico and a Computer Science degree from New Mexico Highlands University. Her areas of expertise include Open Systems and IBM System Storage DS8000 including Copy Services, with recent experience in System z. Markus Oscheka is an IT Specialist for Proof of Concepts and Benchmarks in the Enterprise Disk High End Solution Europe team in Mainz, Germany. His areas of expertise include setup and demonstration of IBM System Storage and TotalStorage solutions in various environments, such as AIX, Linux, Windows, Hewlett-Packard UNIX (HP-UX), and Solaris. He has worked at IBM for seven years. 
He has performed many Proofs of Concept with Copy Services on DS6000/DS8000, as well as performance benchmarks with DS4000/DS6000/DS8000. He has written extensively in various IBM Redbooks publications and has acted as co-project lead for IBM Redbooks publications, including DS6000/DS8000 Architecture and Implementation and DS6000/DS8000 Copy Services. He holds a degree in Electrical Engineering from the Technical University in Darmstadt. Guenter Rebmann is an IBM Certified Specialist for High End Disk Solutions, working for the EMEA DASD Hardware Support Center in Mainz, Germany. Guenter has more than 20 years of experience in large system environments and storage hardware. Currently, he provides support for the EMEA Regional FrontEnd Support Centers with High End Disk Subsystems, such as the ESS, DS8000, and previous High End Disk products. Since 2004, he has been a member of the Virtual EMEA Team (VET) Support Team. Christopher Sansone is a performance analyst located in IBM Tucson. He currently works with DS8000, DS6000, and XIV storage products to generate marketing material and assist with performance issues related to these products. Prior to working in performance, he worked in several development organizations writing C code for Fibre Channel storage devices, including DS8000, ESS 800, TS7740, and Virtual Tape Server (VTS). Christopher holds a Masters degree in Electrical Engineering from NTU and a Bachelors degree in Computer Engineering from Virginia Tech.
Figure 1 The team: Lisa, Attila (back), Christopher, Bert, Markus, Guenter, Mark, and Giacomo
Special thanks to:
John Bynum, Worldwide Technical Support Management, IBM US, San Jose

For their technical advice and support, many thanks to:
Rami Elron
Aviad Offer

Thanks to the following people for their contributions to this project:
Barbara Reed, Darlene Ross, Helen Burton, Juan Yanes, John Cherbini, Richard Heffel, Jim Sedgwick, Brian Sherman, Dan Braden, Rosemary McCutchen, Kip Wagner, Maxim Kooser, Izhar Sharon, Melvin Farris, Dietmar Dausner
Comments welcome
Your comments are important to us. We want our books to be as helpful as possible. Send us your comments about this book or other IBM Redbooks publications in one of the following ways:

Use the online Contact us review IBM Redbooks publications form found at:
ibm.com/redbooks

Send your comments in an e-mail to:
redbooks@us.ibm.com

Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
Chapter 1. IBM XIV Storage System overview
1.1 Overview
The XIV Storage System architecture is designed to deliver performance, scalability, and ease of management while harnessing the high capacity and cost benefits of Serial Advanced Technology Attachment (SATA) drives. The system employs off-the-shelf components rather than the more expensive, proprietary designs found in traditional offerings.
Figure 1-1 IBM XIV Storage System 2810-A14 components: Front and rear view
All of the modules in the system are linked through an internal redundant Gigabit Ethernet network, which enables maximum bandwidth utilization and is resilient to at least any single component failure. The system and all of its components come pre-assembled and wired in a lockable rack.
Massive parallelism
The system architecture ensures full exploitation of all system components. Any I/O activity involving a specific logical volume in the system is always inherently handled by all spindles. The system harnesses all storage capacity and all internal bandwidth, and it takes advantage of all available processing power, which is as true for host-initiated I/O activity as it is for system-initiated activity, such as rebuild processes and snapshot generation. All disks, CPUs, switches, and other components of the system contribute at all times.
Workload balancing
The workload is evenly distributed over all hardware components at all times. All disks and modules are utilized equally, regardless of access patterns. Although applications might access certain volumes, or certain parts of a volume, more frequently than others, the load on the disks and modules remains balanced. Pseudo-random distribution ensures consistent load balancing even after adding, deleting, or resizing volumes, as well as after adding or removing hardware. This balancing of all data across all system components eliminates the possibility of a hot spot being created.
Self-healing
Protection against double disk failure is provided by an efficient rebuild process that brings the system back to full redundancy in minutes. In addition, the XIV Storage System extends the self-healing concept, resuming redundancy even after failures in components other than disks.
True virtualization
Unlike other system architectures, storage virtualization is inherent to the basic principles of the XIV Storage System design. Physical drives and their locations are completely hidden from the user, which dramatically simplifies storage configuration, letting the system lay out the user's volume in the optimal way. The automatic layout maximizes the system's performance by leveraging system resources for each volume, regardless of the user's access patterns.
Thin provisioning
The system enables thin provisioning, which is the capability to allocate storage to applications on a just-in-time and as needed basis, allowing significant cost savings compared to traditional provisioning techniques. The savings are achieved by defining a logical capacity that is larger than the physical capacity. This capability allows users to improve storage utilization rates, thereby significantly reducing capital and operational expenses by allocating capacity based on total space consumed, rather than total space allocated.
We discuss these key design points and underlying architectural concepts in detail in Chapter 2, XIV logical architecture and concepts on page 7.
Chapter 2. XIV logical architecture and concepts
Hardware elements
In order to convey the conceptual principles that comprise the XIV Storage System architecture, it is useful first to provide a glimpse of the physical infrastructure. The primary components of the XIV Storage System are known as modules. Modules provide processing, cache, and host interfaces and are based on standard Intel and Linux systems. They are redundantly connected to one another through an internal switched Ethernet fabric. All of the modules work together concurrently as elements of a grid architecture, and therefore, the system harnesses the powerful parallelism inherent to a distributed computing environment, as shown in Figure 2-1. We discuss the grid architecture in 2.2, Massive parallelism on page 10.
Although externally similar in appearance, Data and Interface/Data Modules differ in functions, interfaces, and in how they are interconnected.
Figure 2-1 IBM XIV Storage System major hardware elements
Data Modules
At a conceptual level, the Data Modules function as the elementary building blocks of the system, providing physical capacity, processing power, and caching, in addition to advanced system-managed services that comprise the system's internal operating environment. The equivalence of hardware across Data Modules and the modules' ability to share and manage system software and services are key elements of the physical architecture, as depicted in Figure 2-2 on page 9.
Interface Modules
Fundamentally, Interface Modules are equivalent to Data Modules in all aspects, with the following exceptions:

In addition to disk, cache, and processing resources, Interface Modules are designed to include both Fibre Channel and iSCSI interfaces for host system connectivity as well as Remote Mirroring. Figure 2-2 conceptually illustrates the placement of Interface Modules within the topology of the IBM XIV Storage System architecture.

The system services and software functionality associated with managing external I/O reside exclusively on the Interface Modules.
Ethernet switches
The XIV Storage System contains a redundant switched Ethernet fabric that conducts both data and metadata traffic between the modules. Traffic can flow in the following ways:

Between two Interface Modules
Between an Interface Module and a Data Module
Between two Data Modules

Note: It is important to realize that Data Modules and Interface Modules are not connected to the Ethernet switches in the same way. For further details about the hardware components, refer to Chapter 3, XIV physical architecture and components on page 45.
Interface and Data Modules are connected to each other through an internal IP switched network. Figure 2-2 Architectural overview
Note: Figure 2-2 depicts the conceptual architecture of the system only; do not interpret the number of connections shown as a precise hardware layout.
Monolithic subsystems
Conventional storage subsystems utilize proprietary, custom-designed hardware components (rather than generally available hardware components) and interconnects that are specifically engineered to be integrated together to achieve target design performance and reliability objectives. The complex, high-performance architecture of redundant monolithic systems generally leads to one or more of the following limitations:

Openness: Components that need to be replaced due to a failure or a hardware upgrade are generally manufacturer-specific due to the custom design inherent to the system. The system cannot easily leverage newer hardware designs or components introduced to the market.

Performance: Even in an N+1 clustered system, the loss of a clustered component not only might have a significant impact on the way that the system functions, but might also impact the performance experienced by hosts and host applications.

Upgradability and scalability: Though the system might remain operational while resources are scaled up, the process of upgrading system resources has the potential to impact performance and availability for the duration of the upgrade procedure.
Upgrades generally require careful and potentially time-consuming planning and administration, and might even require a degree of outage under certain circumstances. Although a specific layer of the vertically integrated monolithic storage subsystem hierarchy can be enhanced during an upgrade, it is possible that:

The upgrade will result in an imbalance by skewing the ratio of resources, such as cache, processors, disks, and buses, thus precluding the full benefit of the upgrade by allowing certain resources, or portions thereof, to go unused.

Architectural limitations of the monolithic system might prevent a necessary complementary resource from scaling. For example, a disk subsystem might accommodate an upgrade to the number of drives, but not the processors, resulting in a limitation of the performance potential of the overall system.
Generally, monolithic systems cannot be scaled out by adding computing resources. The major disadvantage of monolithic architectures is their proprietary nature, which impedes the adoption of new technologies, even partially. Monolithic architectures are harder to extend through external products or technologies, even though they typically contain all of the necessary ingredients for functioning. At a certain point, it is necessary to simply migrate data to a newer subsystem, because the upgradeability of the current system has been exhausted, resulting in:

The need for a large initial acquisition or hardware refresh.

The necessity of potentially time-consuming data migration planning and administration.
[Figure: monolithic subsystem building blocks: interfaces, controllers, disks, cache, and interconnects]
Design principles
The XIV Storage System grid architecture, by virtue of its distributed topology and standard Intel and Linux building-block components, ensures that the following design principles are realized:

Performance: The relative effect of the loss of a given computing resource, or module, is minimized. All modules are able to participate equally in handling the total workload. This design principle holds regardless of access patterns. The system architecture enables excellent load balancing, even if certain applications access certain volumes, or certain parts within a volume, more frequently.

Openness: Modules consist of standard, ready-to-use components. Because components are not specifically engineered for the subsystem, the resources and time required to adopt newer hardware technologies are minimized. This benefit, coupled with the efficient integration of computing resources into the grid architecture, enables the subsystem to realize the rapid adoption of the newest hardware technologies available without the need to deploy a whole new subsystem.

Upgradability and scalability: Computing resources can be dynamically changed: scaled out by adding new modules to accommodate both new capacity and new performance demands, or by tying together groups of modules; or scaled up by upgrading modules.

Important: While the grid architecture of the XIV Storage System enables the potential for great flexibility, the current supported hardware configuration contains a fixed number of modules.
Figure 2-4 on page 13 depicts a conceptual view of the XIV Storage System grid architecture and its design principles.
Design principles: massive parallelism; granular distribution; coupled disk, RAM, and CPU; off-the-shelf components; user simplicity.
Figure 2-4 IBM XIV Storage System scalable conceptual grid architecture
Important: Figure 2-4 is a conceptual depiction of the XIV Storage System grid architecture, and therefore, is not intended to accurately represent numbers of modules, module hardware, switches, and so on.
Proportional scalability
Within the XIV Storage System, each module is a discrete computing (and capacity) resource containing all of the pertinent hardware elements that are necessary for a grid topology (processing, caching, and storage). All modules are connected through a scalable network. This aspect of the grid infrastructure enables the relative proportions of cache, processors, disk, and interconnect bandwidth to remain optimal even when modules are added or removed:

Linear cache growth: The total system cache size and cache bandwidth increase linearly with disk capacity, because every module is a self-contained computing resource that houses its own cache. Note that the cache bandwidth scales linearly in terms of both host-to-cache and cache-to-disk throughput, and the close proximity of cache, processor, and disk is maintained.

Proportional interface growth: Interface Modules house iSCSI and Fibre Channel host interfaces and are able to access not only the local resources within the module, but the entire system. Adding modules to the system proportionally scales both the number of host interfaces and the bandwidth to the internal resources.

Constant switching capacity: The internal switching capacity is designed to scale proportionally as the system grows, preventing bottlenecks regardless of the number of modules. This capability ensures that internal throughput scales proportionally to capacity.

Embedded processing power: Because each module incorporates its own processing power in conjunction with cache and disk components, the ability of the system to perform processor-intensive tasks, such as aggressive prefetch caching, sophisticated cache updates, snapshot management, and data distribution, is always maintained regardless of the system capacity.
Pseudo-random algorithm
The spreading of data occurs in a pseudo-random fashion. While a discussion of random algorithms is beyond the scope of this book, the term pseudo-random is intended to describe the uniform but random spreading of data across all available disk hardware resources while maintaining redundancy. Figure 2-5 on page 17 provides a conceptual representation of the pseudo-random distribution of data within the XIV Storage System. For more details about the topic of data distribution and storage virtualization, refer to 2.3.1, Logical system concepts on page 16. Note: The XIV Storage System exploits mass parallelism at both the hardware and software levels.
Consistent performance and scalability: Hardware resources are always utilized equitably, because all logical volumes always span all physical resources and are therefore able to reap the performance potential of the full subsystem. Virtualization algorithms automatically redistribute the logical volume's data and workload when new hardware is added, thereby maintaining the system balance while preserving transparency to the attached hosts. Conversely, equilibrium and transparency are maintained during the phase-out of old or defective hardware resources.
There are no pockets of capacity, orphaned spaces, or resources that are inaccessible due to array mapping constraints or data placement.

Maximized availability and data integrity: The full virtualization scheme enables the IBM XIV Storage System to manage and maintain data redundancy as hardware changes:

In the event of a hardware failure or when hardware is phased out, data is automatically, efficiently, and rapidly rebuilt across all the drives and modules in the system, thereby preserving host transparency, equilibrium, and data redundancy at all times while virtually eliminating any performance penalty associated with conventional RAID rebuild activities.

When new hardware is added to the system, data is transparently redistributed across all resources to restore equilibrium to the system.

Flexible snapshots: Full storage virtualization incorporates snapshots that are differential in nature; only updated data consumes physical capacity. Many concurrent snapshots are possible (up to 16 000 volumes and snapshots can be defined), because a snapshot uses physical space only after a change has occurred on the source.
Multiple snapshots of a single master volume can exist independently of each other. Snapshots can be cascaded, in effect, creating snapshots of snapshots.
Snapshot creation and deletion do not require data to be copied and hence occur immediately. While updates occur to master volumes, the system's virtualized logical structure enables it to elegantly and efficiently preserve the original point-in-time data associated with any and all dependent snapshots by simply redirecting the update to a new physical location on disk. This process, which is referred to as redirect on write, occurs transparently from the host perspective by virtue of the virtualized remapping of the updated data and minimizes any performance impact associated with preserving snapshots, regardless of the number of snapshots defined for a given master volume.

Note: The XIV snapshot process uses redirect on write, which is more efficient than the copy on write that is used by other storage subsystems. Because the process uses redirect on write and does not necessitate data movement, the size of a snapshot is independent of the source volume size.

Data migration efficiency: XIV supports thin provisioning. When migrating from a system that only supports regular (or thick) provisioning, XIV allows thick-to-thin provisioning of capacity. Thin-provisioned capacity is discussed in 2.3.4, Capacity allocation and thin provisioning on page 24. Due to the XIV pseudo-random and uniform distribution of data, the performance impact of data migration on production activity is minimized, because the load is spread evenly over all resources.
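The redirect-on-write behavior described above can be modeled with a short, purely illustrative sketch (the class and names below are our own, not XIV code): a volume is a mapping from partition numbers to physical blocks, a snapshot copies only the mapping, and a later write to the master is redirected to a fresh block, so no data is moved at snapshot time.

```python
class Volume:
    """Toy redirect-on-write model (illustrative only, not XIV internals)."""

    def __init__(self, n_partitions):
        # Initially, partition i lives in physical block i.
        self.map = {i: ("blk", i) for i in range(n_partitions)}
        self.next_free = n_partitions

    def snapshot(self):
        # A snapshot is just a copy of the metadata mapping:
        # O(metadata) work, no data movement, hence "immediate".
        return dict(self.map)

    def write(self, partition_no):
        # Redirect on write: the update goes to a newly allocated block;
        # any snapshot keeps pointing at the original, untouched block.
        self.map[partition_no] = ("blk", self.next_free)
        self.next_free += 1

vol = Volume(4)
snap = vol.snapshot()
vol.write(2)
assert snap[2] == ("blk", 2)      # snapshot preserves the point-in-time data
assert vol.map[2] == ("blk", 4)   # master now points at the redirected block
```

Because only the mapping is duplicated, the cost of a snapshot is independent of the source volume size, which mirrors the note above.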
Logical constructs
The XIV Storage System logical architecture incorporates constructs that underlie the storage virtualization and distribution of data, which are integral to its design. The logical structure of the subsystem ensures that there is optimum granularity in the mapping of logical elements to both modules and individual physical disks, thereby guaranteeing an ideal distribution of data across all physical resources.
Partitions
The fundamental building block of logical volumes is known as a partition. Partitions have the following characteristics:

All partitions are 1 MB (1024 KB) in size.

A partition contains either a primary copy or a secondary copy of data.

Each partition is mapped to a single physical disk. This mapping is dynamically managed by the system through a proprietary pseudo-random distribution algorithm in order to preserve data redundancy and equilibrium. For more information about the topic of data distribution, refer to Logical volume layout on physical disks on page 19.
The storage administrator has no control or knowledge of the specific mapping of partitions to drives.
Secondary partitions are always placed onto a physical disk that does not contain the primary partition. In addition, secondary partitions are also in a module that does not contain its corresponding primary partition. Important: In the context of the XIV Storage System logical architecture, a partition consists of 1 MB (1024 KB) of data. Do not confuse this definition with other definitions of the term partition.
The diagram illustrates that data is uniformly and randomly distributed over all disks. Each 1 MB of data is duplicated in a primary and a secondary partition. For the same data, the system ensures that the primary partition and its corresponding secondary partition are not located on the same disk and are also not within the same module.
Figure 2-5 Pseudo-random data distribution
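The placement rules just illustrated (uniform spread across all 180 disks, with the secondary copy never in the same module as the primary) can be sketched with a toy hash-based distributor. This is an assumption-laden stand-in for XIV's proprietary algorithm: the constants, the hash choice, and the `place_partition` function are all illustrative.

```python
import hashlib
from collections import Counter

MODULES = 15           # assumption: a full rack of 15 modules
DISKS_PER_MODULE = 12  # 15 x 12 = 180 disks in total

def place_partition(volume_id, partition_no):
    """Toy pseudo-random placement: returns (primary, secondary) as
    (module, disk) pairs, with the secondary never in the primary's module."""
    key = f"{volume_id}:{partition_no}".encode()
    h = int(hashlib.sha256(key).hexdigest(), 16)

    p_module = h % MODULES
    p_disk = (h // MODULES) % DISKS_PER_MODULE

    # Choose the secondary module from the other MODULES - 1 modules only.
    s_module = (p_module + 1 + (h // 1000) % (MODULES - 1)) % MODULES
    s_disk = (h // 7) % DISKS_PER_MODULE
    return (p_module, p_disk), (s_module, s_disk)

# One 17 GB volume is 17 000 partitions; the hash spreads its primary and
# secondary copies roughly evenly over all 180 disks.
load = Counter()
for part in range(17_000):
    (pm, pd), (sm, sd) = place_partition(42, part)
    assert pm != sm               # copies never share a module
    load[(pm, pd)] += 1
    load[(sm, sd)] += 1
```

Because the offset added to the primary module is always between 1 and MODULES - 1, the secondary module can never equal the primary module, which is the redundancy constraint the figure describes.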
Logical volumes
The XIV Storage System presents logical volumes to hosts in the same manner as conventional subsystems; however, both the granularity of logical volumes and the mapping of logical volumes to physical disks fundamentally differ:

As discussed previously, every logical volume comprises 1 MB (1024 KB) pieces of data known as partitions.

The physical capacity associated with a logical volume is always a multiple of 17 GB (decimal).
Therefore, while it is possible to present a block-designated (refer to Creating volumes on page 107) logical volume to a host that is not a multiple of 17 GB, the maximum physical space that is allocated for the volume will always be the sum of the minimum number of 17 GB increments needed to meet the block-designated capacity. Note that the initial physical capacity actually allocated by the system upon volume creation can be less than this amount, as discussed in Hard and soft volume sizes on page 25.

The maximum number of volumes that can be concurrently defined on the system is limited by:

The logical address space limit: The logical address range of the system permits up to 16 377 volumes, although this constraint is purely logical and is not normally a practical consideration. Note that the same address space is used for both volumes and snapshots.
The limit imposed by the logical and physical topology of the system for the minimum volume size: The physical capacity of the system, based on 180 drives with 1 TB of capacity per drive and assuming the minimum volume size of 17 GB, limits the maximum volume count to 4 605 volumes. Again, a system with active snapshots can have more than 4 605 addresses assigned collectively to both volumes and snapshots, because volumes and snapshots share the same address space.

Important: The logical address limit is ordinarily not a practical consideration during planning, because under most conditions, this limit will not be reached; it is intended to exceed the number of volumes needed under all conceivable circumstances.

Logical volumes are administratively managed within the context of Storage Pools, discussed in 2.3.3, Storage Pool concepts on page 22. Storage Pools are not part of the logical hierarchy inherent to the system's operational environment, because the concept of Storage Pools is administrative in nature.
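The 17 GB rounding rule described above is easy to express numerically. The helper below is illustrative (it is not an XIV API); it simply computes the smallest multiple of 17 GB that covers a requested block-designated size.

```python
import math

INCREMENT_GB = 17  # capacity is always reserved in 17 GB (decimal) increments

def allocated_capacity_gb(requested_gb):
    """Physical capacity reserved for a volume: the smallest multiple of
    17 GB that covers the requested size. Illustrative helper only."""
    return INCREMENT_GB * math.ceil(requested_gb / INCREMENT_GB)

print(allocated_capacity_gb(50))   # 51  (3 increments cover 50 GB)
print(allocated_capacity_gb(17))   # 17  (exactly 1 increment)
print(allocated_capacity_gb(100))  # 102 (6 increments cover 100 GB)
```

So a host-visible 50 GB volume reserves 51 GB of physical space, although, as noted above, the capacity actually allocated at creation time can be less with thin provisioning.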
Storage Pools

Storage Pools are purely logical entities that enable storage administrators to manage relationships between volumes and snapshots and to define separate capacity provisioning and snapshot requirements for separate applications and departments. Storage Pools are not tied in any way to specific physical resources, nor are they part of the data distribution scheme. We discuss Storage Pools and their associated concepts in 2.3.3, Storage Pool concepts on page 22.
Snapshots

A snapshot represents a point-in-time copy of a volume. Snapshots are governed by almost all of the principles that apply to volumes. Unlike volumes, snapshots incorporate dependent relationships with their source volumes, which can be either logical volumes or other snapshots. Because they are not independent entities, a given snapshot does not necessarily wholly consist of partitions that are unique to that snapshot. Conversely, a snapshot image will not share all of its partitions with its source volume if updates to the source occur after the snapshot was created. Chapter 11, Copy functions on page 285 examines snapshot concepts and practical considerations, including locking behavior and implementation.
Partition table
Mapping between a logical partition number and the physical location on disk is maintained in a partition table. The partition table maintains the relationship between the partitions that comprise a logical volume and its physical location on disk. Note: Both the distribution table and the partition table are redundantly maintained among the modules.
Volume layout
At a high level, the data distribution scheme is an amalgam of mirroring and striping. While it is tempting to think of this scheme in the context of RAID 1+0 (10) or 0+1, the low-level virtualization implementation precludes the usage of traditional RAID algorithms in the architecture. Conventional RAID implementations cannot incorporate dynamic, intelligent, and automatic management of data placement based on knowledge of the volume layout, nor is it feasible for a traditional RAID system to span all drives in a subsystem, due to the vastly unacceptable rebuild times that can result.

As discussed previously, the XIV Storage System architecture divides logical volumes into 1 MB partitions. This granularity and the mapping strategy are integral elements of the logical design that enable the system to realize the following features and benefits:

Partitions are distributed on all disks using what is defined as a pseudo-random distribution function, which was introduced in 2.2.2, Logical parallelism on page 14. The distribution algorithms seek to preserve the statistical equality of access among all physical disks under all conceivable real-world aggregate workload conditions and associated volume access patterns. Essentially, while not truly random in nature, the distribution algorithms in combination with the system architecture preclude the occurrence of the phenomenon traditionally known as hot spots.

The XIV Storage System contains 180 disks, and each volume is allocated at least 17 GB (decimal) of capacity that is distributed evenly across all disks.

Each logically adjacent partition on a volume is distributed across a different disk; partitions are not combined into groups before they are spread across the disks. The pseudo-random distribution ensures that logically adjacent partitions are never striped sequentially across physically adjacent disks. Refer to 2.2.2, Logical parallelism on page 14 for a further overview of the partition mapping topology.
Each disk has its data mirrored across all other disks, excluding the disks in the same module.
Each disk holds approximately one percent of the data of any other disk in other modules.

Disks have an equal probability of being accessed from a statistical standpoint, regardless of aggregate workload access patterns.

Note: When the number of disks or modules changes, the system defines a new data layout that preserves redundancy and equilibrium. This target data distribution is called the goal distribution and is discussed in Goal distribution on page 37.

As discussed previously in IBM XIV Storage System virtualization benefits on page 15:

The storage system administrator does not plan the layout of volumes on the modules. Provided there is space available, volumes can always be added or resized instantly with negligible impact on performance. There are no unusable pockets of capacity known as orphaned spaces.

When the system is scaled out through the addition of modules, a new goal distribution is created whereby only a minimal number of partitions are moved to the newly allocated capacity to arrive at the new distribution table. The new capacity is fully utilized within several hours, with no need for any administrative intervention. Thus, the system automatically returns to a state of equilibrium among all resources.

Upon the failure or phase-out of a drive or a module, a new goal distribution is created whereby data in non-redundant partitions is copied and redistributed across the remaining modules and drives. The system rapidly returns to a state in which all partitions are again redundant, because all disks and modules participate in the enforcement of the new goal distribution.
Net usable capacity

The calculation of the net usable capacity of the system consists of the total disk count, less the disk space reserved for sparing (the equivalent of one module plus three more disks), multiplied by the fraction of capacity on each disk that is dedicated to data (96%), and finally reduced by a factor of 50% to account for data mirroring.

Note: The calculation of the usable space is:

Usable capacity = [drive capacity x (fraction utilized for data) x (total drives - hot spare reserve)] / 2
Usable capacity = [1000 GB x 0.96 x (180 - (12 + 3))] / 2 = 79 113 GB (decimal)
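The calculation above can be checked with a few lines of Python. Note that with the rounded inputs (exactly 96%, exactly 1000 GB per drive), the formula yields 79 200 GB; the 79 113 GB figure quoted in the text presumably reflects a data fraction slightly below 96% in the actual implementation, so the code below is a sketch of the formula rather than a reproduction of the exact published number.

```python
def usable_capacity_gb(n_drives=180, drive_gb=1000, data_fraction=0.96,
                       spare_drives=12 + 3):
    """Net usable capacity: subtract the sparing reserve (one module of
    12 disks plus 3 disks), keep the fraction of each disk dedicated to
    data, and halve the result to account for mirroring."""
    return drive_gb * data_fraction * (n_drives - spare_drives) / 2

# 1000 GB x 0.96 x (180 - 15) / 2
print(usable_capacity_gb())  # 79200.0, close to the quoted 79 113 GB
```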
Consistency Groups
A Consistency Group is a group of volumes of which a snapshot can be made at the same point in time, thus ensuring a consistent image of all volumes within the group at that time. The concept of a Consistency Group is ubiquitous among storage subsystems, because there are many circumstances in which it is necessary to perform operations collectively across a set of volumes so that the result preserves the consistency among them. For example, effective storage management activities for applications that span multiple volumes, or the creation of point-in-time backups, are not possible without first employing Consistency Groups.

A notable practical scenario necessitating Consistency Groups arises when a consistent, instantaneous image of a database application (spanning both the database and the transaction log) is required. Taking snapshots of the volumes serially will result in an incongruent relationship among the volumes if the application issues writes to any of the application volumes while the snapshots are occurring. Consistency between the volumes in the group is paramount to maintaining data integrity from the application perspective.

By first grouping the application volumes into a Consistency Group, it is possible to later capture a consistent state of all volumes within that group at a given point in time using a special snapshot command for Consistency Groups. Issuing this type of command results in the following process:
1. Complete and destage writes across the constituent volumes.
2. Instantaneously suspend I/O activity simultaneously across all volumes in the Consistency Group.
3. Create the snapshots.
4. Finally, resume normal I/O activity.

The XIV Storage System manages these suspend and resume activities for all volumes within the Consistency Group. Note that additional mechanisms or techniques, such as those provided by the Microsoft Volume Shadow Copy Service (VSS) framework, might still be required to maintain full application consistency.
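The four-step sequence can be modeled abstractly as follows. This is purely illustrative; the class and method names are invented for the sketch and are not XIV or VSS APIs:

```python
import time

class ConsistencyGroup:
    """Toy model of the Consistency Group snapshot sequence described above.
    All names are illustrative, not actual XIV commands."""
    def __init__(self, volumes):
        self.volumes = volumes          # volume name -> data dict
        self.io_suspended = False

    def snapshot_all(self):
        # 1. Complete and destage pending writes (modeled as a no-op here).
        # 2. Suspend I/O simultaneously across all volumes in the group.
        self.io_suspended = True
        # 3. Create all snapshots at a single point in time.
        stamp = time.time()
        snaps = {name: (stamp, dict(data))
                 for name, data in self.volumes.items()}
        # 4. Resume normal I/O activity.
        self.io_suspended = False
        return snaps

cg = ConsistencyGroup({"db": {"row": 1}, "log": {"seq": 10}})
snaps = cg.snapshot_all()
# Every snapshot in the group carries the same timestamp: one point in time.
assert len({ts for ts, _ in snaps.values()}) == 1
```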
Snapshot reserve capacity is defined within each regular Storage Pool (not thinly provisioned Storage Pools) and is effectively maintained separately from logical, or master, volume capacity. The same principles apply for thinly provisioned Storage Pools, which are discussed in Storage Pool-level thin provisioning on page 26, with the exception that space is not guaranteed to be available for snapshots due to the potential for hard space depletion, which is discussed in Depletion of hard capacity on page 32.

Snapshots are structured in the same manner as logical volumes (also known as master volumes); however, a Storage Pool's snapshot reserve capacity is granular at the partition level (1 MB). In effect, snapshots collectively can be thought of as being thinly provisioned within each increment of 17 GB of capacity defined in the snapshot reserve space.

Note: The snapshot reserve needs to be a minimum of 34 GB.

The system preemptively deletes snapshots if the snapshots fully consume the available allocated space. As discussed in the previous example, snapshots are automatically deleted only when there is inadequate physical capacity available, within the context of each Storage Pool independently. This process is managed by a snapshot deletion priority scheme, which is discussed in 11.1, Snapshots on page 286. Therefore, when a Storage Pool's snapshot space is exhausted, only the snapshots that reside in the affected Storage Pool are deleted.
Chapter 2. XIV logical architecture and concepts
The space allocated for a Storage Pool can be dynamically changed by the storage administrator:
- The Storage Pool can always be increased in size. It is limited only by the unallocated space on the system.
- The Storage Pool can always be decreased in size. It is limited only by the space that is consumed by the volumes and snapshots that are defined within that Storage Pool.
- The designation of a Storage Pool as a regular pool or a thinly provisioned pool can be dynamically changed, even for existing Storage Pools. Thin provisioning is discussed in depth in 2.3.4, Capacity allocation and thin provisioning on page 24.

The storage administrator can relocate logical volumes between Storage Pools without any limitations, provided there is sufficient free space in the target Storage Pool:
- If necessary, the target Storage Pool capacity can be dynamically increased prior to volume relocation, assuming there is sufficient unallocated capacity available in the system.
- When a logical volume is relocated to a target Storage Pool, sufficient space must be available for all of its snapshots to reside in the target Storage Pool as well.

Note: When moving a volume into a Storage Pool, the size of the Storage Pool is not automatically increased by the size of the volume. Likewise, when removing a volume from a Storage Pool, the size of the Storage Pool does not decrease by the size of the volume.
Note: The system defines capacity using decimal metrics. Do not confuse decimal and binary units: 1 GB (decimal) is 1 000 000 000 bytes, which is only approximately 0.93 GB (binary), whereas 1 GB (binary) is 1 073 741 824 bytes.
Capacity acquisition and deployment can be more effectively deferred until actual application and business needs demand additional space, in effect facilitating an on-demand infrastructure.
Soft volume size
The soft volume size is the size of the logical volume that is observed by the host, as defined upon volume creation or as a result of a resizing command. The storage administrator specifies the soft volume size in the same manner regardless of whether the Storage Pool itself will be thinly provisioned. The soft volume size is specified in one of two ways, depending on units:
- In terms of GB: The system allocates the soft volume size as the minimum number of discrete 17 GB increments needed to meet the requested volume size.
- In terms of blocks: The capacity is indicated as a discrete number of 512-byte blocks. The system still allocates the soft volume size consumed within the Storage Pool as the minimum number of discrete 17 GB increments needed to meet the requested size (specified in 512-byte blocks); however, the size that is reported to hosts is equivalent to the precise number of blocks defined.

Incidentally, the snapshot reserve capacity associated with each Storage Pool is a soft capacity limit, and it is specified by the storage administrator, though it effectively limits the hard capacity consumed collectively by snapshots as well.

Note: Defining logical volumes in terms of blocks is useful when you must precisely match the size of an existing logical volume residing on another system.
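The 17 GB rounding can be sketched as follows. This is illustrative only; the helper names are invented, and decimal units are assumed throughout:

```python
import math

ALLOCATION_GB = 17     # volume capacity is allocated in 17 GB increments
BLOCK_BYTES = 512      # block-defined volumes use 512-byte blocks
GB = 10**9             # the system uses decimal units

def allocated_soft_gb(requested_gb):
    """Soft size accounted in the pool: round the request up to 17 GB steps."""
    return ALLOCATION_GB * math.ceil(requested_gb / ALLOCATION_GB)

def allocated_soft_gb_from_blocks(blocks):
    """Block-defined volume: hosts see exactly `blocks` blocks, but the pool
    still accounts for the next 17 GB boundary."""
    return allocated_soft_gb(blocks * BLOCK_BYTES / GB)

print(allocated_soft_gb(50))                     # -> 51 (three increments)
print(allocated_soft_gb_from_blocks(2 * 10**8))  # 102.4 GB of blocks -> 119
```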
Hard volume size
The volume allocated hard space reflects the physical space allocated to the volume following host writes to the volume and is discretely and dynamically provisioned by the system (not the storage administrator). The upper limit of this provisioning is determined by the soft size assigned to the volume.

The volume consumed hard space is not necessarily equal to the allocated hard capacity, because hard space allocation occurs in increments of 17 GB, while actual space is consumed at the granularity of the 1 MB partitions. Therefore, the actual physical space consumed by a volume within a Storage Pool is transient, because a volume's consumed hard space reflects the total amount of data that has been previously written by host applications. Hard capacity is allocated to volumes by the system in increments of 17 GB due to the underlying logical and physical architecture; there is no greater degree of granularity than 17 GB, even if only a few partitions are initially written beyond each 17 GB boundary. For more details, refer to 2.3.1, Logical system concepts on page 16.
Application write access patterns determine the rate at which the allocated hard volume capacity is consumed and subsequently the rate at which the system allocates additional increments of 17 GB up to the limit defined by the soft size for the volume. As a result, the storage administrator has no direct control over the hard capacity allocated to the volume by the system at any given point in time. During volume creation, or when a volume has been formatted, there is zero physical capacity assigned to the volume. As application writes accumulate to new areas of the volume, the physical capacity allocated to the volume will grow in increments of 17 GB and can ultimately reach the full soft volume size. Increasing the soft volume size does not affect the hard volume size.
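The distinction between consumed and allocated hard space can be illustrated with a toy model. All names here are invented, and partition tracking is simplified to a set of 1 MB partition indices:

```python
import math

PARTITION_MB = 1          # space is consumed at 1 MB partition granularity
INCREMENT_MB = 17_000     # hard space is allocated in 17 GB (decimal) steps

class Volume:
    """Toy model: hard space is consumed per 1 MB partition written,
    but allocated to the volume in whole 17 GB increments."""
    def __init__(self, soft_gb):
        self.soft_mb = soft_gb * 1000
        self.written = set()          # indices of 1 MB partitions written

    @property
    def consumed_mb(self):
        return len(self.written) * PARTITION_MB

    @property
    def allocated_mb(self):
        # Hard allocation grows in 17 GB steps, capped by the soft size.
        if not self.written:
            return 0
        return min(self.soft_mb,
                   INCREMENT_MB * math.ceil(self.consumed_mb / INCREMENT_MB))

vol = Volume(soft_gb=51)
vol.written.update(range(10))   # host writes 10 MB of new data
print(vol.consumed_mb)          # -> 10
print(vol.allocated_mb)         # -> 17000: one full 17 GB increment
```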
Storage Pool must be defined independently of the physical, or hard, space allocated within the system for that pool. Thus, the Storage Pool hard size that is defined by the storage administrator limits the physical capacity that is available collectively to volumes and snapshots within a thinly provisioned Storage Pool, whereas the aggregate space that is assignable to host operating systems is specified by the Storage Pool's soft size, which is described in Soft pool size on page 27.

Important: Do not confuse the hard space associated with volumes with that associated with Storage Pools. The hard space associated with volumes derives from the physical space written by hosts, whereas the hard space associated with Storage Pools represents the physical space allocated for the Storage Pool within the system, which is independent of host writes.

Whereas regular Storage Pools effectively segregate the hard space reserved for volumes from the hard space consumed by snapshots by limiting the soft space allocated to volumes, thinly provisioned Storage Pools permit the totality of the hard space to be consumed by volumes, with no guarantee of preserving any hard space for snapshots. Logical volumes take precedence over snapshots and might be allowed to overwrite snapshots if necessary as hard space is consumed. The hard space allocated to the Storage Pool that is unused (in other words, the incremental difference between the aggregate soft and hard volume sizes) can, however, be used by snapshots in the same Storage Pool. Careful management is critical to prevent hard space for both logical volumes and snapshots from being exhausted. Ideally, hard capacity utilization must be maintained under a certain threshold by increasing the pool hard size as needed in advance.

Note: As discussed in Storage Pool relationships on page 23, Storage Pools control when and which snapshots are deleted when there is insufficient space assigned within the pool for snapshots.
Note: The soft snapshot reserve capacity and the hard space allocated to the Storage Pool are consumed only as changes occur to the master volumes or the snapshots themselves, not as snapshots are created.
The designation of a Storage Pool as a regular pool or a thinly provisioned pool can be dynamically changed by the storage administrator:
- When a regular pool needs to be converted to a thinly provisioned pool, the soft pool size parameter needs to be explicitly set in addition to the hard pool size, which remains unchanged unless updated.
- When a thinly provisioned pool needs to be converted to a regular pool, the soft pool size is automatically reduced to match the current hard pool size. If the combined allocation of soft capacity for existing volumes in the pool exceeds the pool hard size, the Storage Pool cannot be converted. This situation can be resolved if individual volumes are selectively resized or deleted to reduce the soft space consumed.
Note: Unlike volumes, a thinly provisioned Storage Pools hard size and soft size are fully configured by the storage administrator.
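These conversion rules can be sketched with a toy model. The class, fields, and methods below are invented for illustration and do not correspond to the XCLI interface:

```python
class StoragePool:
    """Toy model of the regular/thin pool designation rules described above."""
    def __init__(self, hard_gb, soft_gb=None):
        self.hard_gb = hard_gb
        # A regular pool's soft size equals its hard size.
        self.soft_gb = soft_gb if soft_gb is not None else hard_gb
        self.volume_soft_gb = 0   # combined soft size of volumes in the pool

    def to_thin(self, new_soft_gb):
        # Regular -> thin: the soft size must be set explicitly;
        # the hard size remains unchanged unless updated.
        self.soft_gb = new_soft_gb

    def to_regular(self):
        # Thin -> regular: the soft size is reduced to match the hard size,
        # but only if existing volumes fit within the hard size.
        if self.volume_soft_gb > self.hard_gb:
            raise ValueError("resize or delete volumes first")
        self.soft_gb = self.hard_gb

pool = StoragePool(hard_gb=170)
pool.to_thin(new_soft_gb=340)
pool.volume_soft_gb = 255         # volumes oversubscribe the hard size
try:
    pool.to_regular()             # fails: 255 GB of volumes > 170 GB hard
except ValueError:
    pass
```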
Note: If the Storage Pools within the system are thinly provisioned, but the soft system size does not exceed the hard system size, the total system hard capacity cannot be filled unless all Storage Pools are regularly provisioned. Therefore, we recommend that you define all Storage Pools in a non-thinly provisioned system as regular Storage Pools.

The soft system size is a purely logical limit; however, you must exercise care when the soft system size is set to a value greater than the maximum potential hard system size. It must remain possible to upgrade the system's hard size to equal the soft size, so defining an unreasonably high system soft size can result in full capacity depletion. It is for this reason that defining the soft system size is not within the scope of the storage administrator role. There are conditions that might temporarily reduce the system's soft limit. For further details, refer to 2.4.2, Rebuild and redistribution on page 37.
Figure: Thin provisioning, showing system hard and soft size with Storage Pools in logical and physical views. The figure's callouts make the following points:
- The system allocates the amount of space requested by the administrator in increments of 17 GB.
- For a Regular Storage Pool, the system allocates an amount of hard space that is equivalent to the size defined for the pool by the administrator.
- For a Thin Storage Pool, the system allocates the amount of soft space requested by the administrator independently from the hard space, and allocates only the amount of hard space requested. This hard space is consumed as hosts issue writes to new areas of the constituent volumes, and may require dynamic expansion to reach the soft space allocated to one or more of the volumes.
The final reserved space within the regular Storage Pool shown in Figure 2-7 is dedicated to snapshot usage. The diagram illustrates that the specified snapshot reserve capacity of 34 GB is effectively deducted from both the hard and soft space defined for the regular Storage Pool, thus guaranteeing that this space will be available for consumption collectively by the snapshots associated with the pool. Although snapshots consume space granularly at the partition level, as discussed in Storage Pool relationships on page 23, the snapshot reserve capacity is still defined in increments of 17 GB. The remaining 17 GB within the regular Storage Pool has not been allocated to either volumes or snapshots. Note that all soft capacity remaining in the pool is backed by hard capacity; the remaining unused soft capacity will always be less than or equal to the remaining unused hard capacity.
The figure's callouts make the following points about this regular provisioning example:
- For a Regular Storage Pool, the soft size and hard size are equal.
- A block definition allows hosts to see a precise number of blocks; even for block-defined volumes, the system allocates logical capacity in increments of 17 GB.
- The consumed hard space grows as host writes accumulate to new areas of a volume.
- In a Regular Storage Pool, the maximum hard space available to be consumed by a volume is guaranteed to be equal to the soft size that was allocated.
Figure 2-7 Volumes and snapshot reserve space within a regular Storage Pool
require up to an additional 17 GB of hard capacity to become fully provisioned, and therefore, at least 34 GB of additional hard capacity must be allocated to this pool in anticipation of this requirement. Finally, consider the 34 GB of snapshot reserve space depicted in Figure 2-8. If a new volume is defined in the unused 17 GB of soft space in the pool, or if either Volume 3 or Volume 4 requires additional capacity, the system will sacrifice the snapshot reserve space in order to give priority to the volume requirements. Normally, this scenario does not occur, because additional hard space must be allocated to the Storage Pool as the hard capacity utilization crosses certain thresholds.
The figure's callouts make the following points about this thin provisioning example:
- For a Thin Storage Pool, the pool soft size (136 GB in this example) is greater than the pool hard size.
- The snapshot reserve limits the maximum hard space that can be consumed by snapshots, but for a Thin Storage Pool it does not guarantee that hard space will be available. Because snapshots are differential at the partition level, multiple snapshots can potentially exist within a single 17 GB increment of capacity.
- The consumed hard space grows as host writes accumulate to new areas of a volume, and the system must allocate new 17 GB increments to the volume as space is consumed.
- In a Thin Storage Pool, the maximum hard space consumed by a volume is not guaranteed to be equal to the size that was allocated, because it is possible for the volumes in the pool to collectively exhaust all hard space allocated to the pool. This causes the pool to be locked.
Figure 2-8 Volumes and snapshot reserve space within a thinly provisioned Storage Pool
Snapshot deletion
As mentioned previously, snapshots in regular Storage Pools can be automatically deleted by the system in order to provide space for newer snapshots or, in the case of thinly provisioned pools, to free physical space for volumes. For example, suppose you created a Storage Pool with a soft size of 350 GB, a hard size of 250 GB, and a snapshot reserve of 200 GB. When the volume hard size exceeds 250 GB - 200 GB = 50 GB, space for the volumes is taken from the snapshot reserve. If that space is already consumed by snapshots, one or more snapshots are deleted. The snapshot deletion order is based on the deletion priority and creation time, as explained in 11.1, Snapshots on page 286.
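The arithmetic in this example can be expressed directly:

```python
def volume_hard_space_before_reserve(pool_hard_gb, snapshot_reserve_gb):
    """Hard space volumes can consume before eating into the snapshot reserve."""
    return pool_hard_gb - snapshot_reserve_gb

# Example from the text: 250 GB pool hard size, 200 GB snapshot reserve.
threshold = volume_hard_space_before_reserve(250, 200)
print(threshold)   # -> 50: beyond this, volume growth comes out of the
                   # snapshot reserve, and snapshots may be deleted
```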
Volume locking
If more hard capacity is still required after all the snapshots in a thinly provisioned Storage Pool have been deleted, all the volumes in the Storage Pool are locked (you can specify one of two possible behaviors for a locked volume: either no I/O at all, or read only), thereby preventing any additional consumption of hard capacity.
Important: Volume locking prevents writes to all volumes in the Storage Pool.
It is very important to note that the thin provisioning implementation in the XIV Storage System manages space allocation within each Storage Pool, so that hard capacity depletion in one Storage Pool will never affect the hard capacity available to another Storage Pool. There are both advantages and disadvantages:
- Because Storage Pools are independent, thin provisioning volume locking in one Storage Pool never cascades into another Storage Pool.
- Hard capacity cannot be reused across Storage Pools, even if a certain Storage Pool has free hard capacity available. This can lead to a situation where volumes are locked due to the depletion of hard capacity in one Storage Pool while there is available capacity in another Storage Pool. Of course, it is still possible for the storage administrator to intervene in order to redistribute hard capacity.
software elements, empower the XIV Storage System to realize unprecedented resiliency. The resiliency of the architecture encompasses not only high availability, but also excellent maintainability, serviceability, and performance under non-ideal conditions resulting from planned or unplanned changes to the internal hardware infrastructure, such as the loss of a module.
Availability
The XIV Storage System maximizes operational availability and minimizes the degradation of performance associated with nondisruptive planned and unplanned events, while providing for the capability to preserve the data to the fullest extent possible in the event of a disaster.
High reliability
The XIV Storage System not only withstands individual component failures by quickly and efficiently reinstating full data redundancy, but also automatically monitors and phases out individual components before data redundancy is compromised. We discuss this topic in detail in Proactive phase-out and self-healing mechanisms on page 43. The collective high reliability provisions incorporated within the system constitute multiple layers of protection from unplanned outages and minimize the possibility of related service actions.
Maintenance freedom
While the potential for unplanned outages and associated corrective service actions is mitigated by the reliability attributes inherent to the system design, the XIV Storage System's autonomic features also minimize the need for storage administrators to conduct non-preventative maintenance activities that are purely reactive in nature, by adapting to potential issues before they are manifested as a component failure. The continually restored redundancy, in conjunction with the self-healing attributes of the system, effectively enables maintenance activities to be decoupled from the instigating event (such as a component failure or malfunction) and safely carried out according to a predefined schedule. In addition to the system's diagnostic monitoring and autonomic maintenance, the proactive and systematic, rather than purely reactive, approach to maintenance is augmented because the entirety of the logical topology is continually preserved, optimized, and balanced according to the physical state of the system. The modular system design also expedites the installation of any replacement or upgraded components, while the automatic, transparent data redistribution across all resources eliminates the downtime associated with these critical activities, even in the context of individual volumes.
High availability
The rapid restoration of redundant data across all available drives and modules in the system during hardware failures, and the equilibrium resulting from the automatic redistribution of data across all newly installed hardware, are fundamental characteristics of the XIV Storage System architecture that minimize exposure to cascading failures and the associated loss of access to data.
Consistent performance
The XIV Storage System is capable of adapting to the loss of an individual drive or module efficiently and with relatively minor impact compared to monolithic architectures. While traditional monolithic systems employ an N+1 hardware redundancy scheme, the XIV Storage System harnesses the resiliency of the grid topology, not only in terms of the ability to sustain a component failure, but also by maximizing consistency and transparency from the perspective of attached hosts. The potential impact of a component failure is vastly reduced, because each module in the system is responsible for a relatively small percentage of the system's operation. Simply put, a controller failure in a typical N+1 system likely results
in a dramatic (up to 50%) reduction of available cache, processing power, and internal bandwidth, whereas the loss of a module in the XIV Storage System translates to only 1/15th of the system resources and does not compromise performance nearly as much as the same failure in a typical architecture. Additionally, the XIV Storage System incorporates innovative provisions to mitigate isolated disk-level performance anomalies through redundancy-supported reaction, which is discussed in Redundancy-supported reaction on page 44, and flexible handling of dirty data, which is discussed in Flexible handling of dirty data on page 44.
Disaster recovery
Enterprise class environments must account for the possibility of the loss of both the system and all of the data as a result of a disaster. The XIV Storage System includes the provision for Remote Mirror functionality as a fundamental component of the overall disaster recovery strategy. Refer to Chapter 12, Remote Mirror on page 323.
both primary and secondary copies of the same data.

Write cache protection
Each module in the XIV Storage System contains a local, independent space reserved for caching operations within its system memory. Each module contains 8 GB of high-speed volatile memory (a total of 120 GB across the system), of which 5.5 GB (82.5 GB overall) is dedicated to caching data.
Note: The system does not contain non-volatile memory space that is reserved for write operations. However, the close proximity of the cache and the drives, in conjunction with the enforcement of an upper limit for dirty, or non-destaged, data on a per-drive basis, ensures that the full destage will occur while operating under battery power.
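The cache figures quoted above follow directly from the module count:

```python
MODULES = 15                 # modules in a fully configured rack
MEMORY_GB_PER_MODULE = 8     # volatile memory per module
CACHE_GB_PER_MODULE = 5.5    # portion of each module's memory used for caching

total_memory_gb = MODULES * MEMORY_GB_PER_MODULE   # -> 120
total_cache_gb = MODULES * CACHE_GB_PER_MODULE     # -> 82.5
print(total_memory_gb, total_cache_gb)
```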
Power on sequence
Upon startup, the system will verify that the battery charge levels in all universal power supplies exceed the threshold necessary to guarantee that a graceful shutdown can occur. If the charge level is inadequate, the system will not begin servicing host I/O until the charge level has exceeded the minimum required threshold.
Goal distribution
The process of achieving a new goal distribution while simultaneously restoring data redundancy due to the loss of a disk or module is known as a rebuild. Because a rebuild occurs as a result of a component failure that compromises full data redundancy, there is a period during which the non-redundant data is both restored to full redundancy and homogeneously redistributed over the remaining disks.
The process of achieving a new goal distribution (which occurs only when full redundancy exists) is known as a redistribution, during which all data in the system (including both primary and secondary copies) is redistributed. A redistribution results from:
- The replacement of a failed disk or module following a rebuild, also known as a phase-in.
- The addition of one or more modules to the system, known as a scale-out upgrade. While the XIV Storage System does not currently support the addition of new racks, the system's inherent virtualization capabilities naturally apply in this context as well.

Following any of these occurrences, the XIV Storage System immediately initiates the following sequence of events:
1. The XIV Storage System distribution algorithms calculate which partitions must be relocated and copied, based on the pseudo-random distribution that is described in 2.2.2, Logical parallelism on page 14. The resultant distribution table is known as the goal distribution.
2. The Data Modules and Interface Modules begin concurrently redistributing and copying (in the case of a rebuild) the partitions according to the goal distribution:
- This process occurs in a parallel, any-to-any fashion concurrently among all modules and drives in the background, with complete host transparency.
- The priority associated with achieving the new goal distribution is internally determined by the system and cannot be adjusted by the storage administrator:
  - Rebuilds have the highest priority; however, the transactional load is homogeneously distributed over all the remaining disks in the system, resulting in a very low density of system-generated transactions.
  - Phase-outs (caused by the XIV technician removing and replacing a failed module) have lower priority than rebuilds, because at least two copies of all data exist at all times during the phase-out.
  - Redistributions have the lowest priority, because there is neither a lack of data redundancy nor has the system detected the potential for an impending failure.
3. The system resumes steady-state operation after the goal distribution has been met. Following the completion of a goal distribution resulting from a rebuild or phase-out, a subsequent redistribution must occur when the system hardware is fully restored through a phase-in.

Note: The goal distribution is transparent to storage administrators and cannot be changed. In addition, the goal distribution has many determinants depending on the precise state of the system.
Important: Never perform a phase-in to replace a failed disk or module until after the rebuild process has completed. In any case, these operations must be performed by the IBM XIV technician.
XIV Storage System. The proactive phase-out of non-optimal hardware through autonomic monitoring, and the modules' cognizance of the virtualization between the logical volumes and physical disks, yield unprecedented efficiency, transparency, and reliability of data preservation actions, encompassing both rebuilds and phase-outs:
- The rebuild of data is many times faster than conventional RAID array rebuilds and can complete in a short period of time for a fully provisioned system, because the redistribution workload spans all drives in the system, resulting in very low transactional density:
  - Statistically, the chance of exposure to data loss or a cascading hardware failure (which occurs when corrective actions in response to the original failure result in a subsequent failure) is minimized due to both the brevity of the rebuild action and the low density of access on any given disk. Rebuilding conventional RAID arrays can take many hours to complete, depending on the type of the array, the number of drives, and the ongoing host-generated transactions to the array.
  - The rebuild process can complete 25% to 50% more quickly for systems that are not fully provisioned, which equates to a rebuild completion in as little as 15 minutes.
- The system relocates only real data, as opposed to rebuilding the entire array, which consists of complete disk images that often include unused space, vastly reducing the potential number of transactions that must occur. Conventional RAID array rebuilds can place many times the normal transactional load on the disks and substantially reduce effective host performance.
- The number of drives participating in the rebuild is about 20 times greater than in most average-sized conventional RAID arrays, and by comparison, the array rebuild workload is greatly dissipated, greatly reducing the relative impact on host performance.
- Whereas the standard dedicated spare disks utilized during a conventional RAID array rebuild might not be globally accessible to all arrays in the system, the XIV Storage System maintains universally accessible reserve space on all disks in the system, as discussed in Global spare capacity on page 21.
- Because the system maintains access density equilibrium, hot spots are statistically eliminated, which reduces the chance of isolated workload-induced failures. The system-wide goal distribution alleviates localized drive stress and associated heat soak, which can significantly increase the probability of a double drive failure during the rebuild of a RAID array in conventional subsystems.
- Modules intelligently send information to each other directly. There is no need for a centralized supervising controller to read information from one disk module and write to another disk module.
- All disks are monitored for errors, poor performance, or other signs that might indicate that a full or partial failure is impending. Dedicated spare disks in conventional RAID arrays are inactive, and therefore unproven and unmonitored, increasing the potential for a second failure during an array rebuild.
Rebuild examples
When the full redundancy of data is compromised due to a module failure, as depicted in Figure 2-10 on page 40, the system immediately identifies the non-redundant partitions and begins the rebuild process. Because none of the disks within a given module contain the secondary copies of data residing on any of the disks in the module, the secondary copies are read from the remaining modules in the system. Therefore, during a rebuild resulting from a module failure, there will be concurrently 168 disks (180 disks in the system minus 12 disks in a module) reading, and 168 disks writing, as is conceptually illustrated in Figure 2-10 on page 40.
Figure 2-11 on page 41 depicts a denser population of redundant partitions for both volumes A and B, thus representing the completion of a new goal distribution, as compared to Figure 2-10, which contains the same number of redundant partitions for both volumes distributed less densely over the original number of modules and drives.

Finally, consider the case of a single disk failure occurring in an otherwise healthy system (no existing phased-out or failed hardware). During the subsequent rebuild, there will be only 168 disks reading, because there is no non-redundant data residing on the other disks within the same module as the failed disk. Concurrently, there will be 179 disks writing in order to preserve full data distribution.

Note: Figure 2-10 and Figure 2-11 conceptually illustrate the rebuild process resulting from a failed module. The diagrams are not intended to depict in any way the specific placement of partitions within a real system, nor do they literally depict the number of modules in a real system.
assigned to any existing volumes or consumed by snapshots, as measured before the failure, is unallocated hard capacity. For details about Storage Pool sizes, refer to Storage Pool-level thin provisioning on page 26. Do not confuse this unallocated Storage Pool hard capacity with unconsumed capacity, which is unwritten hard space allocated to volumes.

3. Reserve spare capacity: As discussed previously, the system reserves enough capacity to sustain the consecutive, non-concurrent failure of three drives and an entire module before replacement hardware must be phased in, ensuring that data redundancy can be restored during subsequent hardware failures. When sufficient unallocated hard capacity is available, the system refrains from allocating the reserve spare space to complete the rebuild or phase-out process, preserving that reserve as additional protection. As a result, it is possible for the system to report a maximum soft size that is temporarily less than the allocated soft capacity. The soft and hard system sizes will not revert to the original values until a replacement disk or module is phased in and the resulting redistribution completes.
Important: While it is possible to resize or create volumes, snapshots, or Storage Pools while a rebuild is underway, we strongly discourage these activities until the system has completed the rebuild process and restored full data redundancy.
Redistribution
The XIV Storage System homogeneously redistributes all data across all disks whenever new disks or modules are introduced or phased in to the system. This redistribution process is not equivalent to the striping of volumes across all disks employed in traditional systems: both conventional RAID striping and the XIV data distribution fully incorporate all spindles while the hardware configuration remains static; however, when new capacity is added and new volumes are allocated, ordinary RAID striping algorithms do not intelligently redistribute data to preserve equilibrium for all volumes through the pseudo-random distribution of data, which is described in 2.2.2, Logical parallelism on page 14. Thus, the XIV Storage System employs dynamic volume-level virtualization, obviating the need for ongoing manual volume layout planning.

The redistribution process is triggered by the phase-in of a new drive or module and differs from a rebuild or phase-out in two ways:

The system does not need to create secondary copies of data to reinstate or preserve full data redundancy.

The distribution density, or the concentration of data on each physical disk, decreases instead of increasing.

The redistribution of data also performs differently, because the concentration of write activity on the new hardware resource is the bottleneck:

When a replacement module is phased in, there will be concurrently 168 disks reading and 12 disks writing; thus, the time to completion is limited by the throughput of the replacement module. Also, the read access density on the existing disks will be extremely low, guaranteeing very low impact on host performance during the process.

When a replacement disk is phased in, there will be concurrently 179 disks reading and only one disk writing. In this case, the replacement drive limits the achievable throughput of the redistribution. Again, the impact on host transactions is extremely small.
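A simple model illustrates why the phased-in hardware is the bottleneck. The per-disk write rate and data volume below are invented assumptions for illustration, not XIV specifications; the interesting property is that the completion time is the same whether 1 or 12 disks are being filled, because write throughput scales with the number of writers.

```python
# Rough redistribution-time model: the phase-in target limits throughput.
WRITE_MBPS_PER_DISK = 60.0   # assumed sustained write rate per disk
DATA_PER_DISK_GB = 500.0     # assumed data to place on each new disk

def redistribution_hours(new_disks):
    """Hours to fill the phased-in hardware, limited by its write side."""
    total_gb = DATA_PER_DISK_GB * new_disks
    rate_gb_per_s = WRITE_MBPS_PER_DISK * new_disks / 1024.0
    return total_gb / rate_gb_per_s / 3600.0

# A whole module (12 writers) and a single disk (1 writer) finish in
# the same time under this model: aggregate rate grows with writers.
module_h = redistribution_hours(12)
disk_h = redistribution_hours(1)
print(round(module_h, 2), round(disk_h, 2))
```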
Disaster recovery
All high availability SAN implementations must account for the contingency of data recovery and business continuance following a disaster, as defined by the organization's recovery point and recovery time objectives. The provision within the XIV Storage System to efficiently and flexibly create nearly unlimited snapshots, coupled with the ability to define Consistency Groups of logical volumes, constitutes an integral element of the data preservation strategy. In addition, the XIV Storage System's synchronous data mirroring functionality facilitates excellent potential recovery point and recovery time objectives as a central element of the full disaster recovery plan. Refer to 11.1, Snapshots on page 286 and 12.1, Remote Mirror on page 324.
Disk scrubbing
The XIV Storage System maintains a series of scrubbing algorithms that run as background processes, concurrently and independently scanning multiple media locations within the system in order to maintain the integrity of the redundantly stored data. This continuous checking enables the early detection of possible data corruption, alerting the system to take corrective action to restore data integrity before errors can manifest themselves from the host perspective. Thus, redundancy is not only implemented as part of the basic architecture of the system, but it is also continually monitored and restored as required. In summary, the data scrubbing process has the following attributes:

Verifies the integrity and redundancy of stored data

Enables early detection of errors and early recovery of redundancy

Runs as a set of background processes on all disks in parallel

Checks whether data can be read from partitions and verifies data integrity by employing checksums

Examines a single disk partition every two seconds (note that here the term partition does not refer to a logical partition)
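The checksum-verification step can be sketched in a few lines. This is a minimal illustration of the principle only — the data layout and the choice of CRC-32 as the checksum are assumptions, not details of the XIV implementation.

```python
# Minimal scrub sketch: walk partitions, recompute a checksum, and
# compare it with the stored one; a mismatch would trigger repair
# from the redundant secondary copy.
import zlib

def scrub(partitions):
    """partitions: list of (data_bytes, stored_checksum).
    Returns the indices whose contents no longer match their checksum."""
    bad = []
    for i, (data, stored) in enumerate(partitions):
        if zlib.crc32(data) != stored:
            bad.append(i)  # corrupt: restore from the secondary copy
    return bad

good = (b"payload", zlib.crc32(b"payload"))
corrupt = (b"payl0ad", zlib.crc32(b"payload"))  # silent bit damage
print(scrub([good, corrupt]))  # [1]
```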
However, as implemented in the XIV Storage System, the SMART diagnostic tools, coupled with intelligent analysis and low tolerance thresholds, provide an even greater level of refinement of disk behavior diagnostics and of the performance- and reliability-driven reaction. For instance, the XIV Storage System measures the specific values of parameters including, but not limited to:
Reallocated sector count: If the disk encounters a read or write verification error, it designates the affected sector as reallocated and relocates the data to a reserved area of spare space on the disk. Note that this spare space is a parameter of the drive itself and is not related in any way to the system reserve spare capacity that is described in Global spare capacity on page 21. The XIV Storage System initiates phase-out at a much lower count than the manufacturer recommends.

Disk temperature: The disk temperature is a critical factor that contributes to premature drive failure and is constantly monitored by the system.

Raw read error count: The raw read error count provides an indication of the condition of the magnetic surface of the disk platters and is carefully monitored by the system to ensure the integrity of the magnetic media itself.

Spin-up time: The spin-up time is a measure of the average time required for a spindle to accelerate from zero to 7,200 rpm. The XIV Storage System recognizes abnormal spin-up time as a potential indicator of an impending mechanical failure.

Likewise, for additional early warning signs, the XIV Storage System continually monitors other aspects of disk-initiated behavior, such as spontaneous resets or unusually long latencies. The system intelligently analyzes this information in order to reach crucial decisions concerning disk deactivation and phase-out. The parameters involved in these decisions allow for a very sensitive analysis of disk health and performance.
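The "low tolerance threshold" idea can be sketched as a simple rule check. The attribute names mirror the parameters listed above, but every numeric threshold here is an invented assumption for illustration — not an XIV or drive-vendor figure.

```python
# Illustrative SMART-style health check: phase a disk out well before
# the manufacturer's own limits are reached. All thresholds assumed.
CONSERVATIVE_LIMITS = {
    "reallocated_sectors": 10,   # vendor limits are typically far higher
    "temperature_c": 55,
    "raw_read_errors": 100,
    "spin_up_time_ms": 9000,
}

def should_phase_out(smart):
    """Return the attributes that exceed the conservative limits."""
    return [k for k, limit in CONSERVATIVE_LIMITS.items()
            if smart.get(k, 0) > limit]

healthy = {"reallocated_sectors": 2, "temperature_c": 40,
           "raw_read_errors": 5, "spin_up_time_ms": 7000}
suspect = dict(healthy, reallocated_sectors=25)
print(should_phase_out(healthy))  # []
print(should_phase_out(suspect))  # ['reallocated_sectors']
```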
Redundancy-supported reaction
The XIV Storage System incorporates redundancy-supported reaction: the provision to exploit the distributed redundant data scheme by intelligently redirecting reads to the secondary copies of data, thereby extending the system's tolerance of above-average disk service time when accessing primary data locations. The system reinstates reads from the primary data copy when the transient degradation of the disk service time has subsided. Of course, redundancy-supported reaction itself might be triggered by an underlying potential disk error that will ultimately be managed autonomically by the system, according to the severity of the exposure as determined by ongoing disk monitoring.
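The decision logic amounts to a latency-based copy selector. This is a bare sketch of the concept; the latency threshold is an assumption and the real system's criteria are certainly more sophisticated.

```python
# Redundancy-supported read sketch: if the primary copy's service time
# is transiently degraded, redirect the read to the secondary copy.
SLOW_MS = 50.0  # assumed service-time threshold

def choose_copy(primary_latency_ms, secondary_latency_ms):
    """Pick which copy to read from, preferring the primary copy."""
    if primary_latency_ms > SLOW_MS:
        return "secondary"  # tolerate the slow primary via redundancy
    return "primary"

print(choose_copy(5.0, 6.0))    # 'primary'
print(choose_copy(120.0, 6.0))  # 'secondary'
```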
Chapter 3.
Figure 3-1 IBM XIV Storage System front and rear views
Rack layout, top to bottom:
Module 15 (Data)
Module 14 (Data)
Module 13 (Data)
Module 12 (Data)
Module 11 (Data)
Module 10 (Data)
Module 9 (Data + Interface)
Module 8 (Data + Interface)
Module 7 (Data + Interface)
Ethernet Switch, Maintenance Module
Module 6 (Data + Interface)
Module 5 (Data + Interface)
Module 4 (Data + Interface)
Module 3 (Data)
Module 2 (Data)
Module 1 (Data)
UPS 3
UPS 2
UPS 1
Raw capacity: 180 TB
Usable capacity: approximately 79 TB
System memory: 120 GB per rack (8 GB per module)
1 Maintenance Module (1U)
Redundant power supplies
2 Ethernet switches (48-port, 1 Gbps)
3 UPS systems
Two 48-port 1 Gbps Ethernet switches form the basis of an internal redundant Gigabit Ethernet network that links all the modules in the system. The switches are installed in the middle of the rack between the Interface Modules. The connections between the modules and switches, as well as all internal power connections in the rack, are realized by a redundant set of cables. For power connections, standard power cables and plugs are used; standard Ethernet cables are used for the interconnections between the modules and switches. All 15 modules (six Interface Modules and nine Data Modules) have redundant connections through the two 48-port 1 Gbps Ethernet switches. This grid network ensures communication between all modules even if one of the switches or a cable connection fails. Furthermore, this grid network provides the capabilities for parallelism and the execution of the data distribution algorithm that contribute to the excellent performance of the XIV Storage System.
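The dual-switch grid's fault tolerance follows directly from every module having a link to both switches. A tiny reachability check demonstrates this; the module and switch counts come from the text, and the code itself is only an illustration.

```python
# Connectivity sketch for the two-switch grid: each of the 15 modules
# links to both switches, so losing either switch leaves every module
# reachable over the surviving switch.
MODULES = [f"module{i}" for i in range(1, 16)]
SWITCHES = ["sw1", "sw2"]

def reachable_modules(failed_switch=None):
    """Modules that still have at least one live switch link."""
    live = [s for s in SWITCHES if s != failed_switch]
    # every module connects to every switch, so any live switch suffices
    return [m for m in MODULES if live]

print(len(reachable_modules()))       # 15
print(len(reachable_modules("sw1")))  # 15, still fully connected
```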
Note: For the same reason that the system is not dependent on specially developed parts, there might be differences in the hardware components that are used in your particular system compared with those components described next.
The rack
The IBM XIV hardware components are installed in a 482.6 mm (19 inch) NetShelter SX 42U rack (APC AR3100) from APC. The rack is 1070 mm (42 inches) deep to accommodate deeper modules and to provide more space for cables and connectors. Adequate space is provided to house all components and to properly route all cables. The rack door and side panels are locked with a key to prevent unauthorized access to the installed components. For detailed dimensions and the weight of the rack and its components, refer to 4.3, Physical planning on page 67.
The Uninterruptible Power Supply (UPS) module complex consists of three UPS units. Each unit maintains an internal power supply in the event of a temporary failure of the external power supply. In case of an extended external power failure or outage, the UPS module complex maintains battery power long enough to allow a safe and ordered shutdown of the XIV Storage System. The complex can sustain the failure of one UPS unit while still protecting against external power disturbances. Figure 3-4 on page 49 shows the UPS.
The three UPS modules are located at the bottom of the rack. Each of the modules has an output of 6 kVA to supply power to all other components in the rack and is 3U in height. The design allows proactive detection of temporary power problems and can correct them before the system goes down. In the case of a complete power outage, integrated batteries continue to supply power to the entire system. Depending on the load of the IBM XIV, the batteries are designed to continue system operation from 3.3 minutes to 11.9 minutes, which gives you enough time to gracefully power off the system.
In case of power problems or a failing UPS, the ATS reorganizes the power load balance between the power components. The operational components take over the load from the failing power source or power supply. This rearrangement of the internal power load is performed by the ATS in a seamless way, and system operation continues without any application impact. Note that if you do not have the two 60 amp power feeds normally required and instead use four 30 amp power feeds (feature code (FC) 9899), two of the lines go to the ATS, which is then connected only to UPS unit 2. One of the other two lines goes to UPS unit 1, and the other goes to UPS unit 3, as seen in Figure 3-5 on page 49.
[Figure: 30 amp power feed cabling — four 30 A-rated services with pigtail connections; two services feed the ATS, which supplies UPS unit 2 (3U), and the other two services feed UPS units 1 and 3 (3U each)]
Data Module
The fully populated rack hosts nine Data Modules (Modules 1-3 and Modules 10-15). There is no difference in the hardware between Data Modules and Interface Modules (refer to Interface Module on page 54) except for the additional host adapters and GigE adapters in the Interface Modules. In addition to the 12 disk drives, the main components of the module shown in Figure 3-7 are:

System planar
Processor
Memory/cache
Enclosure Management Card
Cooling devices (fans)
Memory Flash Card
Redundant power supplies

In addition, each Data Module contains four redundant Gigabit Ethernet ports. These ports, together with the two switches, form the internal network, which is the communication path for data and metadata between all modules. One dual GigE adapter is integrated in the system planar (ports 1 and 2). The remaining two ports (3 and 4) are on an additional dual GigE adapter installed in a PCIe slot, as seen in Figure 3-8 on page 52.
[Figure 3-8: Data Module rear view — two on-board GigE ports, serial port, dual-port GigE adapter, four USB ports, and connections to switches N1 and N2]
System planar
The system planar used in the Data Modules and the Interface Modules is a standard ATX board from Intel. This high-performance server board with a built-in serial-attached SCSI (SAS) adapter supports:

A 64-bit quad-core Intel Xeon processor to improve performance and headroom and to provide scalability and system redundancy with multiple virtual applications

Eight fully buffered 533/667 MHz dual inline memory modules (DIMMs) to increase capacity and performance

Dual Gb Ethernet with Intel I/O Acceleration Technology to improve application and network responsiveness by moving data to and from applications faster

Four PCI Express slots to provide the I/O bandwidth needed by servers

A SAS adapter
Processor
The processor is a quad-core Intel Xeon. This 64-bit processor has the following characteristics:

2.33 GHz clock
12 MB cache
1.33 GHz front-side bus
Memory/Cache
Every module has 8 GB of memory installed (8 x 1 GB FBDIMM). Fully Buffered DIMM memory technology increases reliability, speed, and density of memory for use with Xeon Quad Core Processor platforms. This processor memory configuration can provide three times higher memory throughput, enable increased capacity and speed to balance
capabilities of quad core processors, perform reads and writes simultaneously, and eliminate the previous read to write blocking latency. Part of the memory is used as module system memory, while the rest is used as cache memory for caching data previously read, pre-fetching of data from disk, and for delayed destaging of previously written data. For a description of the cache algorithm, refer to Write cache protection on page 36.
Cooling devices
To provide enough cooling for the disks, processor, and board, the system includes 10 fans located between the disk drives and the board. Cool air is drawn in from the front of the module through the disk drives. An air duct leads the air around the processor before it leaves the module through the back. The air flow and the alignment of the fans assure proper cooling of the entire module, even if a fan fails.
Compact Flash Card

This card is the boot device of the module and contains the software and module configuration files.

Important: Due to the configuration files, the Compact Flash Card is not interchangeable between modules.
Power supplies
Figure 3-10 on page 54 shows the redundant power supplies.
The modules are powered by an Astec redundant Power Supply Unit (PSU) cage with a dual 850W PSU assembly as seen in Figure 3-10. These power supplies are redundant and can be individually replaced. Consequently, a power supply failure will not cause an outage, and also, there is no need to stop the system to replace it. The power supply is a field-replaceable unit (FRU).
Interface Module
Figure 3-11 shows an Interface Module with iSCSI ports.
The Interface Module is similar to the Data Module. The only differences are:

Each Interface Module contains iSCSI and Fibre Channel ports, through which hosts can attach to the XIV Storage System. These ports can also be used to establish Remote Mirror links with another remote XIV Storage System.

Two 4-port GigE PCIe adapters are installed for additional internal network connections and for the iSCSI ports. Refer to Figure 3-11 on page 54 and Figure 3-12.
[Figure 3-12: Interface Module without iSCSI ports — quad-port GigE adapter, two on-board GigE ports, serial port, four USB ports, and connections to switches N1 and N2]
All Fibre Channel ports, iSCSI ports, and Ethernet ports used for external connections are internally connected to a patch panel where the external cables are actually hooked up. Refer to 3.2.4, The patch panel on page 59. There are six Interface Modules (modules 4-9) available in the rack.
This Fibre Channel host bus adapter (HBA) is LSI's FC949E controller and features full-duplex capable FC ports that automatically detect connection speed and can independently operate at 1, 2, or 4 Gbps. The ability to operate at slower speeds ensures that these adapters remain fully compatible with existing equipment. This adapter also supports end-to-end error detection through a cyclic redundancy check (CRC) for improved data integrity during reads and writes.
iSCSI connectivity
There are six iSCSI service ports (two ports per Interface Module) available for iSCSI over IP/Ethernet services. These ports are available in Interface Modules 7, 8, and 9, supporting 1 Gbps Ethernet host connections (refer to Figure 3-12 on page 55). These ports connect through the patch panel to the user's IP network and provide connectivity to the iSCSI hosts. You can operate iSCSI connections for various functions:

As an iSCSI target that server hosts access through the iSCSI protocol
As an iSCSI initiator for Remote Mirroring when connected to another iSCSI port
As an iSCSI initiator for data migration when connected to a third-party iSCSI storage system
For XCLI and GUI access over the iSCSI ports

iSCSI ports can be defined for various uses:

Each iSCSI port can be defined as an IP interface.

Groups of Ethernet iSCSI ports on the same module can be defined as a single link aggregation group (IEEE standard 802.3ad). Ports defined as a link aggregation group must be connected to the same Ethernet switch, and a parallel link aggregation group must be defined on that Ethernet switch. Although a single port is defined as a link aggregation group of one, IBM XIV support can override this configuration if this setup cannot operate with the client's Ethernet switches.

For each iSCSI IP interface, you can define these configuration options:

IP address (mandatory)
Network mask (mandatory)
Default gateway (optional)
MTU (optional; default: 1536, maximum: 8192)
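The per-interface configuration rules above (mandatory address and mask, optional gateway, MTU defaulting to 1536 with a ceiling of 8192) can be expressed as a small validation helper. The function and field names here are hypothetical illustrations, not part of any XIV API.

```python
# Hypothetical helper that validates an iSCSI IP-interface definition
# against the rules stated in the text.
DEFAULT_MTU = 1536
MAX_MTU = 8192

def define_iscsi_interface(ip, netmask, gateway=None, mtu=None):
    if not ip or not netmask:
        raise ValueError("IP address and network mask are mandatory")
    mtu = DEFAULT_MTU if mtu is None else mtu
    if mtu > MAX_MTU:
        raise ValueError(f"MTU {mtu} exceeds maximum {MAX_MTU}")
    return {"ip": ip, "netmask": netmask, "gateway": gateway, "mtu": mtu}

iface = define_iscsi_interface("10.0.0.5", "255.255.255.0")
print(iface["mtu"])  # 1536, the default
```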
The IBM XIV was engineered with substantial protection against data corruption and data loss, thus not just relying on the sophisticated distribution and reconstruction methods that were described in Chapter 2, XIV logical architecture and concepts on page 7. Several features and functions implemented in the disk drive also increase reliability and performance. We describe the highlights next.
Fibre Channel connections to the six Interface Modules. Each Interface Module has two Fibre Channel adapters with two ports. Thus, four FC ports per Interface Module are available at the patch panel.
iSCSI connections to Interface Modules 7, 8, and 9. There are two iSCSI connections for each module.
Connections to the client network for system management with the GUI or Extended Command Line Interface (XCLI)

Ports for VPN connections, connected to the client network

Service ports for the IBM service support representative (SSR), for connection to the maintenance console

Reserved ports
The Dell PowerConnect 6248 is a Gigabit Ethernet Layer 3 switch with 48 copper ports and four combo ports (small form-factor pluggable (SFP) or 10/100/1000), robust stacking, and 10 Gigabit Ethernet uplink capability. The switches are powered by Dell RPS-600 redundant power supplies to eliminate the switch power supply as a single point of failure.
Modem
The modem installed in the rack is used for remote support. It enables the IBM XIV Support Center specialists and, if necessary, a higher level of support to connect to the XIV Storage System. Problem analysis and repair actions without a remote connection are complicated and time-consuming.
Maintenance module
A 1U remote support server is also required for the full functionality and supportability of the IBM XIV. This device has fairly generic requirements, because it is only used to gain remote access to the device through VPN or a modem for the support personnel. The current choice for this device is a SuperMicro 1U server with an average commodity level configuration.
logical architecture that is described in Chapter 2, XIV logical architecture and concepts on page 7 makes the XIV Storage System extremely resilient to outages.
Chapter 4. Physical planning and installation
4.1 Overview
For a smooth and efficient installation of the XIV Storage System Model A14, planning and preparation tasks must take place well before the system is scheduled for delivery and installation in the data center. There are four major areas involved in installation and installation planning:

Ordering the IBM XIV hardware: selecting the required features

Physical site planning: space requirements, dimensions, and weight; raised floor requirements; power requirements, cooling, cabling, and additional equipment

Configuration planning: basic configurations, network connections, management connections, and Remote Mirroring configuration

Installation: physical installation and basic configuration
Table 4-1 Feature code overview

Feature code  Description
9000          Remote Support Server
9101          Modem
9800          Single phase power
9801          US/Canada/LA/AP line cord
9802          US Chicago line cord
9803          EMEA line cord
9804          Israel line cord
9820          Top Exit Cables
9821          Bottom Exit Cables
9899          30 amp line cord

Notes: Line cords are 250V/60A-rated. Each cord has two poles and three wires. The conductor size for non-EMEA and Chicago line cords is 6 AWG. The conductor size for EMEA and Israel line cords is 10 mm (0.3937 inches). The two line cord plugs are IEC 309 compliant. The Chicago line cord extends 1.8 m (6 ft) when exiting the frame from the bottom and 1.6 m (5 ft 4 inches) when exiting from the top. All other cords extend 4.3 m (14 ft) when exiting the frame from the bottom and 4.1 m (13 ft 4 inches) when exiting from the top. All installations require wall circuit breakers rated 50A to 60A. Do not exceed the facility wiring rating.
Feature codes
Specific details about these feature codes include: FC 9000 and FC 9101 These features are required to enable IBM remote support personnel to connect to your XIV Storage System for monitoring and repair. For more details about the system maintenance, refer to Chapter 10, Monitoring on page 249. FC 98nn Depending on where the system will be geographically located, be sure to select the correct feature code to get the appropriate power cables and suitable connectors for your region. FC 9820 and FC 9821: These feature codes specify whether the exit for all external cables is on the top or on the bottom of the IBM XIV rack (APC 482.6 mm (19 inches) NetShelter SX 42U rack). FC 9820 is best suited for data centers without a raised floor and a cable routing above the machines in a cable duct. FC 9821 is for installing the machine in data centers with raised floors, which are the most common types of data centers.
FC 9899 30 amp power line cord. This feature is required if you cannot provide 60 amp power feeds at your site (refer to Figure 3-6 on page 50).
The system will be delivered with the same hardware configuration as a non-CoD system. Feature code 1119 represents the amount of usable storage per IBM XIV Data Module. The initial order must include a minimum quantity of four FC 1119. After the initial order, additional FC 1119 can be ordered in any increment up to the maximum of 15. There will not be any restriction on the amount of usable storage. The system will be shipped with 79 TB of potentially usable storage, and the IBM XIV will use the Call Home feature (e-mail) to provide reports to IBM about the actual allocated storage. If the licensed capacity is exceeded, IBM will notify you either to reduce the amount of space used or to buy additional FC 1119. This Call Home-based process is the reason why FC9000 (RSM) and FC9101 (modem) are required for CoD configurations.
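The Capacity on Demand arithmetic can be illustrated as follows. The per-feature capacity is simply derived here as 79 TB divided by 15 features; that derivation and the exact figure are assumptions for illustration, not official licensing numbers.

```python
# CoD arithmetic sketch: FC 1119 licenses a share of the usable
# capacity; the machine ships with all ~79 TB physically present.
TOTAL_USABLE_TB = 79.0
MAX_FEATURES = 15
MIN_FEATURES = 4   # minimum quantity on the initial order

def licensed_tb(features):
    if not MIN_FEATURES <= features <= MAX_FEATURES:
        raise ValueError("FC 1119 quantity must be between 4 and 15")
    return TOTAL_USABLE_TB * features / MAX_FEATURES

print(round(licensed_tb(MIN_FEATURES), 1))  # minimum initial order
print(round(licensed_tb(MAX_FEATURES), 1))  # fully licensed: 79.0
```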
Rack dimensions: 100 cm, 120 cm, and 60 cm

Sides: not closer than 45 cm to a wall, but adjacent racks are allowed.
For detailed information and further requirements, refer to the IBM XIV Installation Planning Guide.
Power requirements:
Two 220 V, 60 amp power feeds (or four 30 amp power feeds, which require FC 9899)
7.5 kW (9 kW peak)
Correct power connector (FC 980x) according to your local requirements

Cooling requirements:
More than 24,000 BTU/hr
Adequate airflow at the front and back of the box

Delivery requirements:
A clear and level path to bring the box into the computer room
Clearance for an upright rack from the truck to the building, through doors and elevators
Fill in all of this information to prevent further inquiries and delays during the installation (refer to 4.8, IBM XIV installation on page 77):

Interface Modules: Three Interface Modules each need an IP address, netmask, and gateway. These addresses are needed to manage and monitor the IBM XIV with either the GUI or the Extended Command Line Interface (XCLI). Each Interface Module needs a separate IP address so that management access remains available if a module fails.

DNS server: If Domain Name System (DNS) is used in your environment, the IBM XIV needs the IP address, netmask, and gateway of the primary DNS server and, if available, also of the secondary server.

SMTP gateway: The Simple Mail Transfer Protocol (SMTP) gateway is needed for event notification through e-mail. IBM XIV can initiate an e-mail notification, which is sent out through the configured SMTP gateway (IP address or server name, netmask, and gateway).
NTP (time server): IBM XIV can use a Network Time Protocol (NTP) time server to synchronize the system time with other systems. To use a time server, its IP address or server name, together with the netmask and gateway, must be configured.

Time zone: Usually, the time zone depends on the location where the system is installed, but exceptions can occur for remote locations where the time zone matches that of the host system location.

E-mail sender address: This is the e-mail address shown as the sender in e-mail notifications.

Remote access/virtual private network (VPN): The modem number or an external IP address must be configured for remote support. The IBM XIV support center needs to connect to the machine in case of problems. Refer to 10.2, Call Home and remote support on page 273.

This basic configuration data will be entered into the system by the IBM SSR following the physical installation. Refer to 4.8.2, Basic configuration on page 78. Other configuration tasks, such as defining Storage Pools, volumes, and hosts, are the responsibility of the user and are described in Chapter 5, Configuration on page 79.
High-availability configuration
To configure the Fibre Channel connections (SAN) for high availability, refer to the configuration illustrated in Figure 4-4. This configuration is highly recommended for all production systems to maintain system access and operations following a single hardware element or SAN component failure. Note that the connections depicted show only an example, and all Interface Modules can be used in this configuration.
[Figure 4-4: High-availability SAN configuration — Interface Modules (for example, IM 7 through IM 9) each connected to both Switch 1 and Switch 2]
For a high-availability Fibre Channel configuration, use the following guidelines:

Each XIV Storage System Interface Module is connected to two Fibre Channel switches, using two ports of the module.

Each host is connected to two switches using two host bus adapters or a host bus adapter (HBA) with two ports.

This configuration assures full connectivity and no single point of failure:

Switch failure: Each host remains connected to all modules through the second switch.
Module failure: Each host remains connected to the other modules.
Cable failure: Each host remains connected through the second physical link.
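A quick way to reason about these guidelines is to count surviving host-to-storage paths under each failure. The sketch below assumes one module port per fabric visible to the host and a dual-port HBA; the actual path count depends on site-specific zoning, so treat the numbers as illustrative.

```python
# Path-count sketch for the high-availability layout: two fabrics,
# six Interface Modules, one module port per fabric, two HBA ports.
def host_paths(interface_modules=6, module_ports_per_fabric=1,
               hba_ports=2):
    # one HBA port per fabric; each fabric sees every module once here
    return hba_ports * interface_modules * module_ports_per_fabric

def paths_after_failure(kind):
    if kind == "switch":   # one fabric lost, the other keeps serving
        return host_paths(hba_ports=1)
    if kind == "module":   # one module lost on both fabrics
        return host_paths(interface_modules=5)
    return host_paths()    # healthy system

print(paths_after_failure(None))      # 12
print(paths_after_failure("switch"))  # 6
print(paths_after_failure("module"))  # 10
```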
[Figure: single-switch configuration — Interface Modules (for example, IM 7 through IM 9) all connected to one switch]
This configuration is resilient to failures of a single Interface Module, host bus adapter, or cable. However, in this configuration, the switch represents a single point of failure; if the switch goes down due to a hardware failure or simply because of a software update, the connected hosts lose access.
IP configuration
The configuration of the XIV Storage System iSCSI connection is highly dependent on your network. In the high-availability configuration, the two client-provided Ethernet switches used for redundancy can be configured as either two IP subnets or as part of the same subnet. The XIV Storage System iSCSI configuration must match the client's network. You must provide the following configuration information for each Ethernet port:

Whether to configure the two ports of each module as one logical Ethernet port (link aggregation) or as two independent Ethernet ports. This decision affects both the system's configuration and the switches' configuration.

IP address

Net mask

MTU (optional): Maximum Transmission Unit (MTU) configuration is required if your network supports an MTU that is larger than the standard one. The largest possible MTU must be specified
(we advise you to use up to 9,000 bytes, if supported by the switches and routers). If the iSCSI hosts reside on a different subnet than the XIV Storage System, a default IP gateway per port must be specified.

Default gateway (optional): Because the XIV Storage System always acts as a TCP server for iSCSI connections, packets are always routed through the Ethernet port from which the iSCSI connection was initiated. Default gateways are required only if the hosts do not reside on the same layer-2 subnet as the XIV Storage System.

The IP network configuration must be ready to ensure connectivity between the XIV Storage System and the host prior to the physical system installation:

If required, Ethernet switches that connect to two ports on the same module must have their ports configured as a link aggregation group (with a parallel configuration on the XIV Storage System).

Ethernet virtual local area networks (VLANs), if required, must be configured correctly to enable access between hosts and the XIV Storage System.

IP routers (if present) must be configured correctly to enable access between hosts and the XIV Storage System.
Management IP configurations
For each of the three management ports, you must provide the following configuration information to the IBM SSR upon installation (refer also to 4.4, Basic configuration planning on page 69): IP address of the port Net mask Default IP gateway
The following system-level IP information should be provided (it is not port-specific):

IP address of the primary and secondary DNS servers
IP address or DNS name of the SNMP manager, if required
IP address or DNS names of the Simple Mail Transfer Protocol (SMTP) servers
Protocols
The XIV Storage System is managed through dedicated management ports running TCP/IP over Ethernet. Management is carried out through the following protocols (consider this design when configuring firewalls and other security measures):

Proprietary IBM XIV protocols are used to manage the XIV Storage System from the GUI and the XCLI. This management communication is performed over TCP port 7778, where the GUI/XCLI, as the client, always initiates the connection and the XIV Storage System acts as the server.

The IBM XIV Storage System responds to SNMP management packets.

The IBM XIV Storage System initiates SNMP packets when sending traps to SNMP managers.

The IBM XIV Storage System initiates SMTP traffic when sending e-mails (for either event notification through e-mail or for e-mail-to-SMS gateways).
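For firewall planning, it helps to tabulate which side initiates each flow. Only TCP port 7778 is stated in the text as XIV-specific; the SNMP (161/162) and SMTP (25) port numbers below are the standard well-known ports, assumed here for illustration.

```python
# Firewall-planning sketch derived from the protocol list above:
# (description, protocol, port, direction relative to the XIV system).
MGMT_FLOWS = [
    ("GUI/XCLI -> XIV", "tcp", 7778, "inbound"),
    ("SNMP manager -> XIV", "udp", 161, "inbound"),
    ("XIV -> SNMP manager (traps)", "udp", 162, "outbound"),
    ("XIV -> SMTP server", "tcp", 25, "outbound"),
]

def inbound_ports():
    """Ports the firewall must allow toward the system."""
    return sorted(p for _, _, p, d in MGMT_FLOWS if d == "inbound")

print(inbound_ports())  # [161, 7778]
```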
Remote target connectivity defines the communication topology between a local storage system and a remote storage system in order to enable Remote Mirroring and data migration capabilities.

Important: When defining mirroring of a volume on a remote system, the local system must be defined as a remote target on the remote system. This definition is required because mirroring roles can be switched and, therefore, all definitions must be symmetric.
Remote Target:
Determine the protocol that will be used for remote connectivity, either iSCSI or Fibre Channel (FC). Each remote target is available through only one of these protocols. To change the protocol, the target definition must be deleted and then redefined. If the remote target is an IBM XIV system, both remote mirroring and data migration are supported; otherwise, only data migration can be used with the remote target.
Connectivity:
Connectivity between the source and target storage systems is defined between specific physical modules on the source storage system and a set of ports on the target system. The system automatically uses the required port of the local module on the local storage system.
To get more detailed information about Remote Target Connectivity, refer to Chapter 12, Remote Mirror on page 323.
Chapter 5.
Configuration
This chapter discusses the tasks performed by the storage administrator to configure the XIV Storage System using the XIV Management Software. We provide step-by-step instructions covering the following topics, in this order:
- Install and customize the XIV Management Software
- Connect to and manage XIV using the graphical and command line interfaces
- Organize system capacity by Storage Pools
- Create and manage volumes in the system
- Create and maintain hosts and clusters
- Allocate logical unit numbers (LUNs) to hosts or clusters
- Create integrated scripts
At the time of writing, XIV Storage Manager Version 2.2.42 was available; later GUI releases might differ slightly in appearance. Perform the following steps to install the XIV Storage Management software:
1. Locate the XIV Storage Manager installation file (either on the installation CD or a copy that you downloaded from the Internet). Running the installation file first shows the welcome window displayed in Figure 5-1. Click Next.
2. A Setup dialog window is displayed (Figure 5-2) where you can specify the installation directory. Keep the default installation folder or change it according to your needs. When done, click Next.
3. The next installation dialog is displayed. You can choose between a FULL installation and a command line interface-only installation. We recommend that you choose the FULL installation as shown in Figure 5-3; in this case, both the Graphical User Interface and the Command Line Interface are installed. Click Next.
4. The next step is to specify the Start Menu Folder as shown in Figure 5-4. When done, click Next.
5. The dialog shown in Figure 5-5 is displayed. Select the desktop icon placement and click Next.
6. The dialog window shown in Figure 5-6 is displayed. The XIV Storage Manager requires the Java Runtime Environment Version 6, which will be installed during the setup if needed. Click Finish.
If the computer on which the XIV GUI is installed is connected to the Internet, a window might appear to inform you that a new software upgrade is available. Click OK to download and install the new upgrade, which normally only requires a few minutes and will not interfere with your current settings or events.
(Figure: structure of the XIV Storage Management software — basic configuration through the GUI (XGUI); advanced configuration, data migration, security, and XCLI scripting through the XCLI.)
After the installation and customization of the XIV Management Software on a Windows, Linux, or Mac OS management workstation, a physical Ethernet connection must be established to the XIV Storage System itself. The management workstation is used to:
- Execute commands through the XCLI interface
- Control the XIV Storage System through the GUI
- Send e-mail notification messages and Simple Network Management Protocol (SNMP) traps upon occurrence of specific events or alerts

To ensure management redundancy in case of an Interface Module failure, the XIV Storage System management functionality is accessible from three IP addresses, each linked to a different (hardware) Interface Module. The various IP addresses are transparent to the user, and management functions can be performed through any of them. These addresses can also be used simultaneously by multiple management clients. Users only need to configure the GUI or XCLI with the set of IP addresses that are defined for the specific system.

Note: All management IP interfaces must be connected to the same subnet and use the same:
- Network mask
- Gateway
- Maximum Transmission Unit (MTU)
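The redundancy scheme described above — three management IP addresses, any one of which is sufficient — means a management client can simply try each address in turn until one answers. The following Python sketch illustrates that failover pattern; it is not XIV code, and the connect_fn callback and IP addresses are hypothetical stand-ins for whatever transport the management client uses.

```python
# Sketch: try each management IP sequentially until one connection succeeds.
# connect_fn is a hypothetical callback that raises ConnectionError on failure.

def connect_with_failover(ip_addresses, connect_fn):
    """Return the first successful connection result, trying IPs in order."""
    last_error = None
    for ip in ip_addresses:
        try:
            return connect_fn(ip)
        except ConnectionError as err:
            last_error = err  # remember the failure and try the next address
    raise ConnectionError(f"all management IPs failed: {last_error}")

# Usage with a stub transport: the first two Interface Modules are down.
reachable = {"9.0.0.3"}

def stub_connect(ip):
    if ip not in reachable:
        raise ConnectionError(f"{ip} unreachable")
    return f"connected to {ip}"

print(connect_with_failover(["9.0.0.1", "9.0.0.2", "9.0.0.3"], stub_connect))
```

Because any of the three addresses is equivalent, the order in which they are tried does not matter to the user.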
The XIV Storage System can be managed from both the XCLI and GUI interfaces, and both can be configured to manage the system through any of its management IP interfaces. Both XCLI and GUI management run over TCP port 7778, with all traffic encrypted through the Secure Sockets Layer (SSLv3) or Transport Layer Security (TLS 1.0) protocol.
To connect to an XIV Storage System, you must initially add the system to make it visible in the GUI by specifying its IP addresses. To add the system:
1. Make sure that the management workstation is set up to have access to the LAN subnet where the XIV Storage System resides. Verify the connection by pinging the IP address of the XIV Storage System. If this is the first time that you start the GUI on this management workstation and no XIV Storage System has been previously defined to the GUI, the Add System Management dialog window is automatically displayed: If the default IP address of the XIV Storage System was not changed, check Use Predefined IP, which populates the IP/DNS Address1 field with the default IP address. Click Add to effectively add the system to the GUI. Refer to Figure 5-9 on page 87.
If the default IP address had already been changed to a client-specified IP address (or set of IP addresses, for redundancy), you must enter those addresses in the IP/DNS Address fields. Click Add to effectively add the system to the GUI. Refer to Figure 5-10.
2. You are now returned to the main XIV Management window. Wait until the system is displayed and shows as enabled. Under normal circumstances, the system will show a status of Full Redundancy displayed in a green label box. 3. Move the mouse cursor over the image of the XIV Storage System and click to open the XIV Storage System Management main window as shown in Figure 5-11 on page 88.
The XIV Storage Management GUI is mostly self-explanatory with a well-organized structure and simple navigation.
(Figure 5-11: the XIV Storage System Management main window, with the menu bar and toolbar at the top, the user indicator, the function icons on the left, and the main display in the center.)
The main window is divided into the following areas:
- Function icons: Located on the left side of the main window is a set of vertically stacked icons that are used to navigate between the functions of the GUI, according to the icon selected. Moving the mouse cursor over an icon brings up a corresponding option menu. The various menu options available from the function icons are presented in Figure 5-12 on page 89.
- Main display: It occupies the major part of the window and provides a graphical representation of the XIV Storage System. Moving the mouse cursor over the graphical representation of a specific hardware component (module, disk, or Uninterruptible Power Supply (UPS) unit) brings up a status callout. When a specific function is selected, the main display shows a tabular representation of that function.
- Menu bar: It is used for configuring the system and as an alternative to the function icons for accessing the various functions of the XIV Storage System.
- Toolbar: It is used to access a range of specific actions linked to the individual functions of the system.
- Status bar indicators: Located at the bottom of the window, this area indicates the overall operational levels of the XIV Storage System:
  - The first indicator on the left shows the amount of soft or hard storage capacity currently allocated to Storage Pools and provides alerts when certain capacity thresholds are reached. As the physical, or hard, capacity consumed by volumes within a Storage Pool passes certain thresholds, the color of this meter indicates that additional hard capacity might need to be added to one or more Storage Pools.
  - The second indicator (in the middle) displays the number of I/O operations per second (IOPS).
  - The third indicator on the far right shows the general system status and will, for example, indicate when a redistribution is underway. Additionally, an Uncleared Event indicator is visible when events occur for which a repetitive notification was defined that has not yet been cleared in the GUI (these notifications are called Alerting Events).
The function icon menus include:
- Monitor: define general system connectivity and monitor overall system activity
- Pools: create and manage Storage Pools
- Hosts: configure the features provided by the XIV Storage System for hosts and their connectivity
- Remote: define the communication topology between a local and a remote storage system
- Access: an access control system that uses defined user roles to control access

Figure 5-12 Menu items in XIV Storage Management software
Tip: The configuration information regarding the connected systems and the GUI itself is stored in various files under the user's home directory. As a useful and convenient feature, all the commands issued from the GUI are saved in a log in XCLI syntax. The default location is in the Documents and Settings folder of the current Microsoft Windows user, for example: C:\Documents and Settings\<YOUR LOGGED IN USER>\Application Data\XIV\GUI10\logs\nextraCommands_*.log
The XCLI utility supports several types of invocation:
- Invoking the XCLI to define configurations: In these invocations, the XCLI utility is used to define configurations. A configuration is a mapping between a user-defined name and a list of three IP addresses. This configuration can be referenced later in order to execute a command without having to specify the system IP addresses (refer to the next method in this list). These configurations are stored on the local host running the XCLI utility and must be defined again for each host.
- Invoking the XCLI to execute a command: This method is the most basic and important type of invocation. Whenever invoking an XCLI command, you must also provide either the system's IP addresses or a configuration name.
- Invoking the XCLI for general purpose functions: These invocations can be used to get the XCLI's software version or to print the XCLI's help text.

The command to execute is generally specified along with parameters and their values. A script can be defined to specify the name and path to the commands file (lists of commands will be executed in User Mode only). For complete and detailed documentation of the IBM XIV Storage Manager XCLI, refer to the XCLI Reference Guide, GC27-2213-00.
2. Create a safe working directory for the XCLI (for example, create an xcli directory under Documents and Settings).
3. Customize the xcli_admin icon:
   a. Right-click the icon and select Properties.
   b. On the Shortcut tab, set:
      Target: %SystemRoot%\system32\cmd.exe /k cd C:\Documents and Settings\Administrator\My Documents\xcli && setup
      Start in: c:\Program Files\XIV\GUI10
   c. On the Options tab, check QuickEdit mode.
   d. On the Layout tab, set:
      Screen buffer size Width: 160
      Window size Width: 120, Height: 40
   e. Click Apply and click OK.

Note that setup (highlighted in bold in step 3b) represents a batch program. This batch program is described later in this section and is used to store relevant environment variables. Refer to Example 5-2 on page 92.

As part of XIV's high-availability features, each system is assigned three IP addresses. When executing a command, the XCLI utility is provided with these three IP addresses and tries each of them sequentially until communication with one of the IP addresses is successful. You must pass the IP addresses (IP1, IP2, and IP3) with each command. To avoid excessive typing and having to remember IP addresses, you can instead use a predefined configuration name.

Note: When executing a command, you must specify either a configuration or IP addresses, but not both.

To issue a command against a specific XIV Storage System, you also need to supply the user name and the password for it. The default user is admin and the default password is adminadmin, which can be used with the following parameters:
- -u user or -user sets the user name that will be used to execute the command.
- -p password or -password is the XCLI password that must be specified in order to execute a command in the system.
- -m IP1 [-m IP2 [-m IP3]] defines the IP addresses of the system.

Example 5-1 illustrates a common command execution syntax on a given XIV Storage System.
Example 5-1 Simple XCLI command
xcli -u admin -p adminadmin -m 149.168.100.101 user_list

Managing the XIV Storage System by using the XCLI always requires that you specify these same parameters. To avoid repetitive typing, you can instead define and use specific environment variables. We recommend that you create a batch file in which you set the values for those environment variables, as shown in Example 5-2 on page 92.
Example 5-2 Batch file setting the XCLI environment variables

@echo off
set XCLI_CONFIG_FILE=C:\Documents and Settings\hu02230\My Documents\xiv\xcliconfigs.xml
set XIV_XCLIUSER=admin
set XIV_XCLIPASSWORD=adminadmin
xcli -L
The XCLI utility requires user and password options. If the user and password are not specified, the default environment variables XIV_XCLIUSER and XIV_XCLIPASSWORD are used. If neither command options nor environment variables are specified, commands are run with the user defined by config_set default_user=XXXX. This allows a smooth migration to the IBM XIV Software System for clients that do not have defined users. The configurations are stored in a file under the user's home directory. A different file can be specified with the -f or --file switch (applicable to configuration creation, configuration deletion, listing configurations, and command execution). Alternatively, the environment variable XCLI_CONFIG_FILE, if defined, determines the file's name and path. We recommend that you create a configuration file as shown in Example 5-3.
Example 5-3 Create a configuration file
Create an empty file:
C:\Documents and Settings\Administrator\My Documents\xcli\xcliconfigs.xml

Create the XCLI configuration:
xcli -f "C:\Documents and Settings\Administrator\My Documents\xcli\xcliconfigs.xml" -a Redbook -m <IP1> [-m <IP2> [-m <IP3>]]

After this specification, the shortened command syntax works as shown in Example 5-4.
Example 5-4 Short command syntax
xcli -c Redbook user_list

The default IP address for the XIV Storage System is 14.10.202.250.
xcli
xcli -c Redbook help
xcli -c Redbook help command=help format=full

The first command prints out the usage of xcli. The second one prints all the commands that can be used by the user in that particular system. The third one shows the usage of the help command itself with all the parameters. As mentioned in the output of a simple XCLI command, there are different parameters to get the result of a command in a predefined format. The default is the user-readable format.
Specify the -s parameter to get the output in a comma-separated format, or specify the -x parameter to obtain XML format.

Note: The XML format contains all the fields of a particular command. The user-readable and the comma-separated formats provide just the default fields as a result. To specify the required fields of a command, use the -t parameter as shown here:

xcli -c Redbook -t name,fields help command=user_list
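The -s (comma-separated) format is the natural choice for scripting against the XCLI. The following Python sketch parses the kind of CSV output that a command such as user_list might produce when run with -s; the sample rows and column names here are invented for illustration, not actual XCLI output.

```python
import csv
import io

# Hypothetical comma-separated output, as the -s switch might produce.
sample_output = """Name,Category,Group
admin,storageadmin,
technician,technician,
"""

def parse_xcli_csv(text):
    """Parse comma-separated XCLI output into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

rows = parse_xcli_csv(sample_output)
for row in rows:
    print(row["Name"], row["Category"])
```

In a real script, the text would come from capturing the command's standard output rather than a literal string.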
Improved regulation of storage space: Automatic snapshot deletion occurs when the storage capacity limit is reached for each Storage Pool independently. Therefore, when a Storage Pool's size is exhausted, only the snapshots that reside in the affected Storage Pool are deleted.
The size of Storage Pools and the associations between volumes and Storage Pools are constrained by the following rules:
- The size of a Storage Pool can range from as small as possible (17.1 GB) to as large as possible (the entire system) without any limitation.
- The size of a Storage Pool can always be increased, limited only by the free space on the system.
- The size of a Storage Pool can always be decreased, limited only by the space already consumed by the volumes and snapshots in that Storage Pool.
- Volumes can be moved between Storage Pools without any limitations, as long as there is enough free space in the target Storage Pool.

Important: All of these operations are handled by the system at the metadata level, and they do not cause any data movement (copying) from one disk drive to another. Hence, they are completed almost instantly and can be done at any time without impacting the applications.
Thin provisioned pools

Thin provisioning is the practice of allocating storage on a just-in-time, as-needed basis by defining a logical, or soft, capacity that is larger than the physical, or hard, capacity. Thin provisioning enables XIV Storage System administrators to manage capacity based on the total space actually consumed rather than just the space allocated. Thin provisioning can be specified at the Storage Pool level. Each thinly provisioned pool has its own hard capacity (which limits the actual disk space that can be effectively consumed) and soft capacity (which limits the total logical size of the volumes defined).
Hard pool size: The hard pool size represents the physical storage capacity allocated to volumes and snapshots in the Storage Pool. The hard size of the Storage Pool limits the total of the hard volume sizes of all volumes in the Storage Pool plus the total of all storage consumed by snapshots.

Soft pool size: This size is the limit on the total soft sizes of all the volumes in the Storage Pool. The soft pool size has no effect on snapshots.

For more detailed information about the concept of XIV thin provisioning and a detailed discussion of hard and soft sizes for Storage Pools and volumes, refer to 2.3.4, Capacity allocation and thin provisioning on page 24. When using the GUI, you specify the desired type of pool (Regular Pool or Thin Provisioned Pool) when creating the pool. Refer to Creating Storage Pools on page 96. When using the XCLI, you create a thinly provisioned pool by setting the soft size to a greater value than its hard size. If requirements change, the pool's type can be changed (non-disruptively) later.

Tip: Thin provisioning is managed individually for each Storage Pool; running out of space in one pool does not impact other pools.
To view overall information about the Storage Pools, select Pools from the Pools menu shown in Figure 5-13 to display the Storage Pool window seen in Figure 5-14 on page 95.
The Storage Pools GUI window displays a table of all the pools in the system combined with a series of gauges for each pool. This view gives the administrator a quick grasp and general overview of essential information about the system pools. The capacity consumption by volumes and snapshots within a given Storage Pool is indicated by different colors:
- Green indicates consumed capacity below 80%.
- Yellow represents capacity consumption above 80%.
- Orange indicates capacity consumption of over 90%.
- Red indicates a Storage Pool with depleted hard capacity.

The name, the size, and the separate segments are labeled accordingly. Figure 5-15 shows the meaning of the various numbers.
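The gauge coloring follows fixed utilization thresholds. A small Python sketch of that mapping (the function name is ours, and the treatment of the exact 80% boundary is an assumption; the document only states "below 80%" and "above 80%"):

```python
def pool_gauge_color(consumed_gb, hard_size_gb):
    """Map a pool's hard-capacity utilization to the gauge color."""
    if consumed_gb >= hard_size_gb:
        return "red"       # hard capacity depleted
    utilization = 100.0 * consumed_gb / hard_size_gb
    if utilization > 90:
        return "orange"
    if utilization >= 80:  # boundary choice is ours; the text says "above 80%"
        return "yellow"
    return "green"

print(pool_gauge_color(500, 1000))   # well below 80%: green
print(pool_gauge_color(950, 1000))   # over 90%: orange
```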
2. In the Select Type drop-down list box, choose Regular or Thin Provisioned according to your needs. For thinly provisioned pools, two new fields appear:
   - Soft Size: Specify the upper limit of soft capacity.
   - Lock Behavior: Specify the behavior in case of depleted capacity. This value specifies whether the Storage Pool is locked for write or whether it is disabled for both read and write when running out of storage space. The default value is read only.
3. In the Pool Size field, specify the required size of the Storage Pool.
4. In the Snapshots Size field, enter the required size of the reserved snapshot area.

Note: Although it is possible to create a pool with identical snapshot and pool sizes, you cannot create a new volume in this type of pool afterward without resizing it first.

5. In the Pool Name field, enter the desired name (it must be unique across the Storage System) for the Storage Pool.
6. Click Add to add this Storage Pool.
The resize operation can also be used to change the type of a Storage Pool from thin provisioned to regular or from regular to thin provisioned. Just change the type of the pool in the Resize Pool window Select Type list box. Refer to Figure 5-18 on page 99:
- When a regular pool is converted to a thin provisioned pool, you have to specify an additional soft size parameter besides the existing hard size. Obviously, the soft size must be greater than the hard pool size.
- When a thin provisioned pool is changed to a regular pool, the soft pool size parameter disappears from the window; in fact, its value becomes equal to the hard pool size. If the space consumed by existing volumes exceeds the pool's actual hard size, the pool cannot be changed to a regular pool. In this case, you have to specify a minimum pool hard size equal to the total capacity consumed by all the volumes within this pool.
The remaining soft capacity is displayed in red characters and is calculated by the system in the following manner:

Remaining Soft Capacity = [Current Storage Pool Soft Size + Remaining System Soft Size] - Current Storage Pool Hard Size
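Applied literally, the calculation above is a one-line function. The numbers below are invented purely to exercise the formula.

```python
def remaining_soft_capacity(pool_soft, system_soft_remaining, pool_hard):
    """Remaining soft capacity, per the formula above (all sizes in GB)."""
    return (pool_soft + system_soft_remaining) - pool_hard

# Example: a pool with a 2000 GB soft size and a 1000 GB hard size,
# with 500 GB of soft capacity remaining at the system level.
print(remaining_soft_capacity(2000, 500, 1000))  # 1500
```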
A volume that belongs to a Consistency Group cannot be moved without moving the entire Consistency Group. As shown in Figure 5-19, in the Volumes by Pools report, simply right-click the appropriate volume and initiate a Move to Pool operation to change the location of the volume.
In the pop-up window, select the appropriate Storage Pool as shown in Figure 5-20 and click OK to move the volume into it.
Name                Description
cg_move             Moves a Consistency Group, all its volumes, and all their snapshots and snapshot sets from one Storage Pool to another.
pool_change_config  Changes the Storage Pool snapshot limitation policy.
pool_create         Creates a Storage Pool.
pool_delete         Deletes a Storage Pool.
pool_list           Lists all Storage Pools or the specified one.
pool_rename         Renames a specified Storage Pool.
pool_resize         Resizes a Storage Pool.
vol_move            Moves a volume and all its snapshots from one Storage Pool to another.

To list the existing Storage Pools in a system, use the following command:

xcli -c Redbook pool_list

A sample result of this command is illustrated in Figure 5-21.
(Figure 5-21: sample pool_list output with columns Name, Hard Size (GB), Empty Space (GB), Used by Volumes (GB), Used by Snapshots (GB), and Locked.)
For the purpose of new pool creation, enter the following command:

xcli -c Redbook pool_create pool=DBPool size=1000 snapshot_size=0

The size of the Storage Pool is specified as an integer multiple of 10^9 bytes, but the actual size of the created Storage Pool is rounded up to the nearest integer multiple of 16 x 2^30 bytes. The snapshot_size parameter specifies the size of the snapshot area within the pool. It is a mandatory parameter, and you must specify a positive integer value for it.

The following command shows how to resize one of the existing pools:

xcli -c Redbook pool_resize pool=DBPool size=1300

With this command, you can increase or decrease the pool size. The pool_create and pool_resize commands are also used to manage the size of the snapshot area within a Storage Pool.

To rename an existing pool, issue this command:

xcli -c Redbook pool_rename pool=DBPool new_name=DataPool

To delete a pool, type:

xcli -c Redbook pool_delete pool=DBPool

Use the following command to move the volume named log_vol to the Storage Pool DBPool:

xcli -c Redbook vol_move vol=log_vol pool=DBPool

The command only succeeds if the destination Storage Pool has enough free storage capacity to accommodate the volume and its snapshots. The vol_move command moves a particular volume and its snapshots from one Storage Pool to another, but if the volume is part of a Consistency Group, the entire group must be moved. In this case, the cg_move command is the correct solution:

xcli -c Redbook cg_move cg=DBGroup pool=DBPool

All volumes in the Consistency Group are moved, all snapshot groups of this Consistency Group are moved, and all snapshots of the volumes are moved.
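The pool_create rounding rule — input in multiples of 10^9 bytes, actual size rounded up to a multiple of 16 x 2^30 bytes — can be checked with a short Python sketch (the function name is ours):

```python
import math

GB_DECIMAL = 10**9          # unit of the size= parameter
CHUNK = 16 * 2**30          # allocation granularity: 16 x 2^30 bytes (16 GiB)

def actual_pool_size_bytes(size_gb):
    """Round the requested pool size up to the nearest 16 GiB multiple."""
    requested = size_gb * GB_DECIMAL
    return math.ceil(requested / CHUNK) * CHUNK

# A size=1000 request (1000 x 10^9 bytes) lands on the next 16 GiB boundary,
# so the pool ends up slightly larger than the 1000 GB requested.
print(actual_pool_size_bytes(1000) / GB_DECIMAL)
```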
To specify the behavior in case of depleted capacity reserves in a thin provisioned pool, use the following command:

xcli -c Redbook pool_change_config pool=DBPool lock_behavior=read_only

This command specifies whether the Storage Pool is locked for write or whether it is disabled for both read and write when running out of storage space.

Note: The lock_behavior parameter can be specified for non-thin provisioned pools, but it has no effect.
5.4 Volumes
After defining Storage Pools, the next milestone in the XIV Storage System configuration is volume management. The XIV Storage System offers logical volumes as the basic data storage element for allocating usable storage space to attached hosts. This logical unit concept is well known and widely used by other storage subsystems and vendors. However, neither the volume segmentation nor its distribution over the physical disks is conventional in the XIV Storage System.

Traditionally, logical volumes are defined within various RAID arrays, where their segmentation and distribution are manually specified. The result is often a suboptimal distribution within and across modules (expansion units) and is significantly dependent upon the administrator's knowledge and expertise.

As explained in 2.3, Full storage virtualization on page 14, the XIV Storage System uses true virtualization as one of the basic principles of its unique design. With XIV, each volume is divided into tiny 1 MB partitions, and these partitions are distributed randomly and evenly, and duplicated for protection. The result is optimal distribution in and across all modules, which means that for any volume, the physical drive location and data placement are invisible to the user. This method dramatically simplifies storage provisioning, letting the system lay out the user's volume in an optimal way. It offers complete virtualization, without requiring preliminary volume layout planning or detailed and accurate stripe or block size pre-calculation by the administrator. All disks are equally used to maximize the I/O performance and exploit all the processing power and all the bandwidth available in the storage system.

XIV Storage System virtualization incorporates an advanced snapshot mechanism with unique capabilities, which enables creating a virtually unlimited number of point-in-time copies of any volume, without incurring any performance penalties.
The concept of snapshots is discussed in detail in 11.1, Snapshots on page 286. Volumes can also be grouped into larger sets called Consistency Groups and Storage Pools. Refer to 5.3, Storage Pools on page 93 and 11.1.3, Consistency Groups on page 300.

Important: The basic hierarchy is (refer to Figure 5-22 on page 104):
- A volume can have multiple snapshots.
- A volume can be part of one and only one Consistency Group.
- A volume is always a part of one and only one Storage Pool.
- All volumes of a Consistency Group must belong to the same Storage Pool.
(Figure 5-22: hierarchy of a Storage Pool containing a Consistency Group with volumes such as DbVol2, a standalone volume TestVol, snapshots from the Consistency Group, and the snapshot reserve.)
The Volumes & Snapshots menu item is used to list all the volumes and snapshots that have been defined in this particular XIV Storage System. An example of the resulting window can be seen in Figure 5-24 on page 105.
Volumes are listed in a tabular format. If a volume has snapshots, a + or - icon appears on the left. Snapshots are listed under their master volumes, and the list can be expanded or collapsed at the volume level by clicking the + or - icon, respectively. Snapshots are listed as a sub-branch of the volume of which they are a replica, and their row is indented and highlighted in off-white. The Master column of a snapshot shows the name of the volume of which it is a replica. If this column is empty, the volume is the master.

Tip: To customize the columns in the lists, just click one of the column headings and make the required selection of attributes. The default column set does not contain the Master column.

Table 5-1 on page 106 shows the columns of this view with their descriptions.
Table 5-1 Columns in the Volumes and Snapshots view

Column              Description                                                      Default
Qty.                Indicates the number of snapshots belonging to a volume          N
Name                Name of a volume or snapshot                                     Y
Size (GB)           Volume or snapshot size (value is zero if specified in blocks)   Y
Used (GB)           Used capacity in a volume                                        Y
Size (Blocks)       Volume size in blocks                                            N
Size (Consumed)     Consumed capacity                                                N
Master              Snapshot master's name                                           N
Consistency Group   Consistency Group name                                           Y
Pool                Storage Pool name                                                Y
(lock icon)         Indicates the locking status of a volume or snapshot             Y
(modified icon)     Shows whether the snapshot was unlocked or modified              Y
Deletion Priority   Indicates the deletion priority (a number) for snapshots         N
Created             Shows the creation time of a snapshot                            Y
Creator             Volume or snapshot creator name                                  N
Serial Number       Volume or snapshot serial number                                 N
Sync Type           Shows the mirroring type status                                  N
Most of the volume-related and snapshot-related actions can be selected by right-clicking any row in the table to display a drop-down menu of options. The options in the menu differ slightly for volumes and snapshots.
The available actions include:
- Removing from a Consistency Group
- Moving volumes between Storage Pools; refer to Moving volumes between Storage Pools on page 99
- Creating a snapshot
- Creating a snapshot (Advanced)
- Overwriting a snapshot
- Copying a volume or snapshot
- Locking/unlocking a volume or snapshot
- Mappings
- Displaying properties of a volume or snapshot
- Changing a snapshot's deletion priority
- Duplicating a snapshot or duplicating a snapshot (Advanced)
- Restoring from a snapshot
Creating volumes
When you create a volume in a traditional or regular Storage Pool, the entire volume storage capacity is reserved (static allocation). In other words, you cannot define more space for volumes in a regular Storage Pool than the actual hard capacity of the pool, which guarantees the functionality and integrity of the volume.

If you create volumes in a Thin Provisioned Pool, the capacity of the volume is not reserved immediately; instead, a basic 17.1 GB piece, taken out of the Storage Pool hard capacity, is allocated at the first I/O operation. In a Thin Provisioned Pool, you can define more space for volumes than the actual hard capacity of the pool, up to the soft size of the pool.

The volume size is the actual net storage space, as seen by the host applications, not including any mirroring or other data protection overhead. The free space consumed by the volume will be the smallest multiple of 17 GB that is greater than the specified size. For example, if we request an 18 GB volume to be created, the system rounds this volume size up to 34 GB. For a 16 GB volume size request, it is rounded up to 17 GB.

Figure 5-25 on page 108 gives several basic examples of volume definition and planning in a thinly provisioned pool. It depicts volumes with the minimum amount of capacity, but the principle applies to larger volumes as well. As shown in this figure, we recommend that you carefully plan the number of volumes or the hard size of the thinly provisioned pool because of the minimum hard capacity that is consumed by one volume. If you create more volumes in a thinly provisioned pool than the hard capacity can cover, I/O operations against the volumes will fail at the first I/O attempt.

Note: We recommend that you plan the volumes in a Thin Provisioned Pool in accordance with this formula: Pool Hard Size >= 17 GB x (number of volumes in the pool)
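The rounding and planning rules above can be sketched in Python. We use the nominal 17 GB granularity from the examples (the document also quotes it more precisely as 17.1 GB); the function names are ours.

```python
import math

VOLUME_CHUNK_GB = 17  # nominal allocation unit used in the examples above

def rounded_volume_size(requested_gb):
    """Free space consumed: smallest multiple of 17 GB >= requested size."""
    return math.ceil(requested_gb / VOLUME_CHUNK_GB) * VOLUME_CHUNK_GB

def pool_hard_size_sufficient(pool_hard_gb, num_volumes):
    """Check the planning rule: Pool Hard Size >= 17 GB x number of volumes."""
    return pool_hard_gb >= VOLUME_CHUNK_GB * num_volumes

print(rounded_volume_size(18))             # 34, as in the example above
print(rounded_volume_size(16))             # 17
print(pool_hard_size_sufficient(51, 3))    # True: 51 >= 17 x 3
```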
Chapter 5. Configuration
Figure 5-25 (volume layouts in a thinly provisioned pool): a pool holding Volume I (17 GB) and Volume II (17 GB); a pool holding Volumes I, II, and III (17 GB each); and a pool holding Volume I (17 GB) and Volume II (34 GB).
The size of a volume can be specified either in gigabytes (GB) or in blocks (where each block is 512 bytes). If the size is specified in blocks, the volume is created in the exact size specified, and the size is not rounded up. The volume presents that exact block size and capacity to the hosts but nevertheless consumes 17 GB inside the XIV Storage System. This capability is relevant and useful in migration scenarios.

If the size is specified in gigabytes, the actual volume size is rounded up to the nearest 17.1 GB multiple (making the actual size identical to the free space consumed by the volume, as just described). This rounding up prevents a situation where storage space is not fully utilized because of a gap between the free space used and the space available to the application.

The volume is logically formatted at creation time, which means that any read operation returns all zeros as a response.

To create volumes with the XIV Storage Management GUI:
1. Click the Add Volumes icon in the Volume and Snapshots view (Figure 5-24 on page 105) or right-click in the body of the window (not on a volume or snapshot) and select Add Volumes. The window shown in Figure 5-26 on page 109 is displayed.
2. From the Select Pool field, select the Pool in which this volume is stored. Refer to 5.3, "Storage Pools" on page 93 for a description of how to define Storage Pools. The storage size and allocation of the selected Storage Pool is shown textually and graphically in a color-coded bar:
 Green indicates the space already allocated in this Storage Pool.
 Yellow indicates the space that will be allocated to this volume (or volumes) after it is created.
 Gray indicates the space that remains free after this volume (or volumes) is allocated.
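The difference between block-based and GB-based sizing can be sketched as follows (our illustration, assuming the 512-byte block and 17 GB allocation unit stated in the text; the helper name is ours):

```python
BLOCK_BYTES = 512        # XCLI block unit, per the text
ALLOC_UNIT_GB = 17       # internal allocation unit, per the text

def block_volume(blocks: int):
    """Hypothetical helper: a volume defined in 512-byte blocks presents
    exactly blocks * 512 bytes to the host, but still consumes whole
    17 GB units inside the XIV Storage System."""
    host_bytes = blocks * BLOCK_BYTES
    consumed_gb = max(1, -(-host_bytes // (ALLOC_UNIT_GB * 10**9))) * ALLOC_UNIT_GB
    return host_bytes, consumed_gb

# A migration-sized volume of exactly 20,000,000,000 bytes (39,062,500 blocks):
host_bytes, consumed = block_volume(39_062_500)
print(host_bytes, consumed)  # exact byte count to the host; 34 GB consumed internally
```

This is why block sizing matters for migrations: the source LUN's exact geometry is preserved for the host even though the XIV internally allocates in 17 GB units.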
3. In the Number of Volumes field, specify the required number of volumes.
4. In the Volume Size field, specify the size of each volume to define. The size can also be modified by dragging the yellow part of the size indicator.
Note: When multiple volumes are created, they all have the same size as specified in the Volume Size field.
5. In the Volume Name field, specify the name of the volume to define. The name of the volume must be unique in the system. If you specified that more than one volume be defined, they are successively named by appending an incrementing number to the end of the specified name.
6. Click Create to create the volumes and add them to the Storage Pool.

After a volume is successfully added, its state is unlocked, meaning that write, format, and resize operations are permitted. The creation time of the volume is set to the current time and is never changed.
Resizing volumes
Resizing a volume is an operation very similar to creating one. Only an unlocked volume can be resized. When you resize a volume, its size is specified as an integer multiple of 10^9 bytes, but the actual new size of the volume is rounded up to the nearest valid size, which is an integer multiple of 17 GB.

Note: The size of a volume can be decreased. However, to avoid possible data loss, you must contact your IBM XIV support personnel if you need to decrease a volume size. (A mapped volume's size cannot be decreased.)
The volume address space is extended (at the end of the existing volume) to reflect the increased size, and the additional capacity is logically formatted (that is, zeroes are returned for all read commands). When resizing a regular volume (not a writable snapshot), all storage space that is required to support the additional volume capacity is reserved (static allocation), which guarantees the functionality and integrity of the volume, regardless of the resource levels of the Storage Pool containing that volume. Resizing a master volume does not change the size of its associated snapshots. These snapshots can still be used to restore their individual master volumes at their initial sizes.
To resize volumes with the XIV Storage Management GUI:
1. Right-click the row of the volume to be resized and select Resize. The total amount of storage is presented both textually and graphically. The amount that is already allocated by the other existing volumes is shown in green. The amount that is free is shown in gray. The current size of the volume is displayed in yellow, to the left of a red vertical bar. This red bar provides a constant indication of the original size of the volume as you resize it. Place the mouse cursor over the red bar to display the volume's initial size.
2. In the New Size field, use the arrows to set the new size or type the new value.
3. Click Update to resize the volume.
Deleting volumes
With the GUI, the deletion of a volume is as easy as creating one. Important: After you delete a volume or a snapshot, all data stored on the volume is lost and cannot be restored.
All the storage space that was allocated (or reserved) for the volume or snapshot is freed and returned to its Storage Pool. The volume or snapshot is then removed from all the logical unit number (LUN) Maps that contain a mapping of this volume.

Deleting a volume deletes all the snapshots associated with it, even snapshots that are part of Snapshot Groups; the latter can only happen when the volume was in a Consistency Group and was removed from it. You can delete a volume regardless of the volume's lock state, but you cannot delete a volume that is part of a Consistency Group.

To delete a volume or a snapshot:
1. Right-click the row of the volume to be deleted and select Delete.
2. Confirm the deletion when prompted; the volume is deleted.
Maintaining volumes
There are several other operations that can be issued on a volume. Refer to "Menu option actions" on page 106. The usage of these operations is obvious, and you can initiate an operation with a right-mouse click. These operations are:
 Format a volume: A formatted volume returns zeros as a response to any read command. The formatting of the volume is done logically, and no data is actually written to the physical storage space allocated for the volume. Consequently, the formatting action is performed instantly.
 Rename a volume: A volume can be renamed to a unique name in the system. A locked volume can also be renamed.
 Lock/Unlock a volume: You can lock a volume so that hosts cannot write to it. A volume that is locked is write-protected, so that hosts can read the data stored on it, but they cannot change it. The volume then appears with a lock icon. In addition, a locked volume cannot be formatted or resized. In general, locking a volume prevents any operation (other than deletion) that changes the volume's image.
Note: Master volumes are set to unlocked when they are created. Snapshots are set to locked when they are created.
 Consistency Groups: The XIV Storage System enables a higher level of volume management by grouping volumes and snapshots into sets called Consistency Groups. This kind of grouping is especially useful for cluster-specific volumes. Refer to 11.1.3, "Consistency Groups" on page 300 for a detailed description.
 Copy a volume: You can copy a source volume onto a target volume. Obviously, all the data that was previously stored on the target volume is lost and cannot be restored. Refer to 11.2, "Volume Copy" on page 317 for a detailed description.
 Snapshot functions: The XIV Storage System's advanced snapshot feature has unique capabilities that enable the creation of a virtually unlimited number of copies of any volume, with no performance penalties. Refer to 11.1, "Snapshots" on page 286.
Map a volume: While the storage system sees volumes and snapshots at the time of their creation, the volumes and snapshots are visible to the hosts only after the mapping procedure. To get more information about mapping, refer to 5.5, Host definition and mappings on page 113.
Category   Name             Description
volume     vol_by_id        Prints the volume name according to its specified SCSI serial number.
volume     vol_clear_keys   Clears all SCSI reservations and registrations.
volume     vol_copy         Copies a source volume onto a target volume.
volume     vol_create       Creates a new volume.
volume     vol_delete       Deletes a volume.
volume     vol_format       Formats a volume.
volume     vol_list         Lists all volumes or a specific one.
volume     vol_lock         Locks a volume so that it is read-only.
volume     vol_rename       Renames a volume.
volume     vol_resize       Resizes a volume.
volume     vol_unlock       Unlocks a volume, so that it is no longer read-only and can be written to.
To list the existing volumes in a system, use the following command: xcli -c Redbook vol_list The result of this command is similar to the illustration given in Figure 5-28.
Figure 5-28 (vol_list output): columns include Name, Size (GB), Master Name, Consistency Group, and Used Capacity (GB); the example shows volumes of 34 GB and 51 GB, each with 0 GB used capacity.
To find and list a specific volume by its SCSI ID, issue the following command:
xcli -c Redbook vol_by_id=12
To create a new volume, enter the following command:
xcli -c Redbook vol_create vol=DBVolume size=2000 pool=DBPool
The size can be specified either in gigabytes or in blocks (where each block is 512 bytes). If the size is specified in blocks, volumes are created in the exact size specified. If the size is specified in gigabytes, the actual volume size is rounded up to the nearest 17 GB multiple (making the actual size identical to the free space consumed by the volume, as described above). This rounding up prevents a situation where storage space is not fully utilized because of a gap between the free space used and the space available to the application.

Note: If pools are already created in the system, the specification of the Storage Pool name is mandatory.

The volume is logically formatted at creation time, which means that any read operation returns all zeros as a response. To format a volume, use the following command:
xcli -c Redbook vol_format vol=DBVolume
Note that all data stored on the volume will be lost and unrecoverable. If you want to bypass the warning message, put -y right after the XCLI command.

The following example shows how to resize one of the existing volumes:
xcli -c Redbook vol_resize vol=DBVolume size=2100
With this command, you can increase or decrease the volume size. The size of a volume can be decreased; however, to avoid data loss, contact XIV Storage System support personnel if you need to decrease the size of a volume.

To rename an existing volume, issue this command:
xcli -c Redbook vol_rename vol=DBVolume new_name=DataVol
To delete an existing volume, enter:
xcli -c Redbook vol_delete vol=DataVol
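When driving this volume lifecycle from a script, it helps to assemble the XCLI invocations programmatically. A minimal Python sketch (the helper function is our illustration, not part of the product tooling; only the xcli flags and commands shown in the text are used):

```python
def xcli(config: str, command: str, auto_confirm: bool = False, **params) -> list:
    """Assemble an XCLI argument vector, for example:
    xcli -c Redbook -y vol_delete vol=DataVol"""
    argv = ["xcli", "-c", config]
    if auto_confirm:
        argv.append("-y")                      # -y bypasses the warning prompt
    argv.append(command)
    argv += [f"{k}={v}" for k, v in params.items()]
    return argv

# The volume lifecycle from the text, as argument vectors:
print(xcli("Redbook", "vol_create", vol="DBVolume", size=2000, pool="DBPool"))
print(xcli("Redbook", "vol_resize", vol="DBVolume", size=2100))
print(xcli("Redbook", "vol_rename", vol="DBVolume", new_name="DataVol"))
print(xcli("Redbook", "vol_delete", auto_confirm=True, vol="DataVol"))
# On a workstation with the XCLI installed, each vector could be passed
# to subprocess.run(argv).
```

Building argument vectors rather than concatenating strings avoids quoting problems when volume or pool names contain unusual characters.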
Clicking the icon brings up the Hosts and LUNs menu. Select Hosts from this menu to get the Hosts main view displayed in Figure 5-30. The Hosts view enables you to perform a range of activities for managing hosts, including defining, editing, deleting, renaming, and linking the host servers used in the XIV Storage System. The main Hosts window lists all the hosts that have been defined in the XIV Storage System.
Table 5-2 describes the columns displayed in the Hosts window. The hidden columns can be revealed by right-clicking the heading line of the view.
Table 5-2  Columns in the Hosts view

Column    Description                                                 Default
Name      Host name                                                   Y
Type      FC: Fibre Channel; iSCSI: iSCSI (Internet Small             Y
          Computer System Interface)
...       The creator of the host                                     N
...       Cluster name to which this host belongs                     Y
...       LUN Map to which this host is linked                        N
...       LUN Map identification number                               N
Access    ...                                                         Y
By expanding the hosts, you can see the port worldwide names (WWPNs) for Fibre Channel (FC) hosts and the Internet Small Computer System Interface (iSCSI) initiator names for iSCSI hosts.
Creating a host
When trying to define a new host (Figure 5-31), remember that the name of the host must be unique in the system.
To create hosts with the XIV Storage Management GUI:
1. Click the Add Host icon on the toolbar in the Hosts view or right-click in the body of the window (not on a host or a port) and select Add Host. A Define Host panel is displayed as shown in Figure 5-31.
2. Enter the desired name for the host.
3. Select the cluster name if the host is going to be part of a cluster; otherwise, select None.
4. Click Create to define the new host.
The host just created is reflected in the Hosts view, but without any ports yet. The name of the host can later be modified by selecting the Rename option in the menu.
To add a port to a host:
1. Right-click the predefined host and select Add Port. The Add Port panel appears as shown in Figure 5-32.
2. Select the Port Type according to the host connection type (FC or iSCSI).
3. Specify the Port Name. The FC port address or iSCSI initiator (port) name assigned to the host must be unique in the XIV Storage System:
a. For the FC port specification, you can choose one of the existing port names, which are already seen by the XIV Storage System, from a list box, or just type a new port name. The FC port name must be exactly 16 characters long, in hexadecimal form. Only the following alphanumeric characters are valid: 0-9, A-F, and a-f. In addition to the 16 characters, colons (:) can be used as separators in the 16 character port name. The port naming convention for XIV Storage ports is:
WWPN: 5000001738XXXXYZ
001738 = Registered identifier for XIV
XXXX = Serial number in hex
Y = (hex) Interface Module number
Z = (hex) FC port number within the Interface Module
b. For the iSCSI port selection, the iSCSI port (initiator) name must not exceed 253 characters and must not contain any blank spaces.
4. Click Add to add the new port to the host.

The names of ports cannot be modified later, only deleted. If you need to change a name, a new port must be allocated to the host.
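The validity rules and the naming convention above can be decoded programmatically; the sketch below is our illustration (the helper name and the example WWPN are made up; the field layout and character rules come from the text):

```python
def decode_xiv_wwpn(wwpn: str) -> dict:
    """Decode an XIV port WWPN of the form 5000001738XXXXYZ
    (16 hex characters, optional ':' separators)."""
    w = wwpn.replace(":", "").lower()
    if len(w) != 16 or any(c not in "0123456789abcdef" for c in w):
        raise ValueError("an FC port name is exactly 16 hex characters")
    if not w.startswith("5000001738"):
        raise ValueError("registered identifier 001738 for XIV not found")
    return {
        "serial": int(w[10:14], 16),  # XXXX: serial number in hex
        "module": int(w[14], 16),     # Y: Interface Module number
        "port":   int(w[15], 16),     # Z: FC port number within the module
    }

# A made-up example WWPN with serial 0x002A (42), module 6, port 1:
print(decode_xiv_wwpn("50:00:00:17:38:00:2A:61"))
```

Such a decoder is handy when matching switch-side WWPNs against XIV modules during fabric zoning.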
Volumes can be mapped to the logical unit numbers (assigned to the right side of the window) in one of these ways:
 Mapping volumes automatically, by selecting volumes and clicking Map
 Mapping volumes semi-automatically, by selecting volumes, selecting a free LUN row, and clicking Map
 Mapping volumes manually, by dragging and dropping

To place volumes automatically:
1. Click the volumes required from the Volumes table while holding down Ctrl if more than one volume is required. The volumes are automatically allocated places in the LUNs table in sequential order, according to the free locations. The LUN rows are temporarily highlighted in light yellow, and the volume names and sizes are displayed, as shown in Figure 5-33.
2. Click Map to accept this mapping configuration.
To place volumes semi-automatically:
1. Click the required volumes from the Volumes table while holding down Ctrl if more than one volume is required. The volumes are automatically allocated places in the LUNs table in sequential order, according to the free locations. The LUN rows are temporarily highlighted in light yellow, and the volume names and sizes are displayed showing the initial, automatic placement.
2. Click a starting point on the LUNs table where you want the volumes to be copied. The volumes automatically cascade downward to the next available free locations, with the first volume marked in a darker yellow, as shown in Figure 5-34.
3. Click Map to accept this mapping configuration.

You can manually map volumes to LUNs by selecting the same number of rows in the LUNs table as in the Volumes table, which helps you select LUNs that are not in sequential order. Another option is to change mapped LUNs by selecting them on the right and dragging and dropping them to a different free location in the right table.

Volumes mapped to LUNs (assigned on the right side of the window) can also be unassigned or unmapped. Unmapping LUNs from a host is simple: just right-click the LUN to be unmapped and select Unmap.
Cluster
In many cases, you might need to define identical mappings for a set of hosts. To implement this configuration, it is necessary to define a cluster as an entity that groups several hosts together and assigns the same mapping to all of the hosts. The mapping of volumes to LUN identifiers is defined per cluster and applies concurrently to all the hosts in the cluster. There is no way to define different mappings for different hosts belonging to the same cluster.
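The per-cluster mapping semantics can be modeled in a few lines (a hypothetical Python illustration of the behavior described above, not product code): the LUN map is stored once per cluster, and every member host resolves LUNs through that single shared map.

```python
class Cluster:
    """Model of XIV cluster mapping: one LUN map shared by all member hosts."""
    def __init__(self, name: str):
        self.name = name
        self.hosts = set()
        self.lun_map = {}          # LUN id -> volume name

    def map_vol(self, vol: str, lun: int):
        self.lun_map[lun] = vol    # applies concurrently to every member host

    def mapping_for(self, host: str) -> dict:
        if host not in self.hosts:
            raise KeyError(f"{host} is not a member of {self.name}")
        return self.lun_map        # identical for each host; no per-host override

c = Cluster("Mscs_Cluster1")
c.hosts.update({"Server1", "Server2"})
c.map_vol("DBvol1", lun=1)
print(c.mapping_for("Server1") == c.mapping_for("Server2"))  # True: same mapping
```

The absence of any per-host structure in the model mirrors the rule that different hosts in the same cluster cannot be given different mappings.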
Creating a cluster
The cluster creation in XIV Storage Management software is similar to the host creation (Figure 5-35).
To create a cluster with the XIV Storage Management GUI:
1. Click the Add Cluster icon on the toolbar in the Hosts view or right-click in the body of the window (not on a host or a port) and select Add Cluster. A Create Cluster panel appears as shown in Figure 5-35.
2. Enter the name, which must be unique in the system, of the new cluster.
3. Click OK to define the new cluster.
The mapping definitions do not revert to the host's original mapping before it was added to the cluster. After removing the host from the cluster, the administrator can change the host's mapping. A set of cluster-specific volumes can all be used simultaneously as a group, and a synchronized snapshot of them can be created, by using a Consistency Group. The volumes in a Consistency Group are grouped into a single Volume Set. To get more information about Consistency Groups, refer to 11.1.3, "Consistency Groups" on page 300.
Category   Name                   Description
host       cluster_add_host       Adds a host to a cluster.
host       cluster_create         Creates a new cluster.
host       cluster_delete         Deletes a cluster.
host       cluster_list           Lists a specific cluster or all of them.
host       cluster_remove_host    Removes a host from a cluster.
host       cluster_rename         Renames a cluster.
host       host_add_port          Adds a port address to a host.
host       host_define            Defines a new host to connect to the XIV Storage System.
host       host_delete            Deletes a host.
host       host_list              Lists a specific host or all hosts.
host       host_remove_port       Removes a port from a host.
host       host_rename            Renames a host.
host       map_vol                Maps a volume to a host or a cluster.
host       mapping_list           Lists the mapping of volumes to a specified host or cluster.
host       special_type_set       Sets the special type of a host or a cluster.
host       unmap_vol              Unmaps a volume from a host or a cluster.
host       vol_mapping_list       Lists all the hosts and clusters to which a volume is mapped.
This section shows the most common ways to manage hosts and clusters by using the XCLI tool. To create a host or a cluster with the XCLI, use these commands:
xcli -c Redbook host_define host=Windows_Server1
xcli -c Redbook cluster_create cluster=Mscs_Cluster1
These commands create a new host or cluster. The newly created cluster does not yet contain any hosts, and neither the host nor the cluster has a mapping at the time of definition. Adding members to a cluster can be done in one of two ways: at the creation of a host, you can specify the additional parameter cluster=, or you can use the cluster_add_host command:
xcli -c Redbook cluster_add_host cluster=Mscs_Cluster1 host=Server1 map=cluster
To list the existing hosts or clusters in the XIV Storage System, use the following two commands:
xcli -c Redbook host_list
xcli -c Redbook cluster_list

You need to allocate ports to the newly defined hosts. There are two port types in the XIV Storage System: FC (Fibre Channel) and iSCSI (Internet Small Computer System Interface) ports. The FC port address or iSCSI initiator (port) name assigned to the host must be unique in the XIV Storage System. The FC port name must be exactly 16 characters long and in hexadecimal form. Only the following alphanumeric characters are valid: 0-9, A-F, and a-f. In addition to the 16 characters, colons (:) can be used as separators in the 16 character port name. The iSCSI initiator name must not exceed 253 characters and must not contain any blank spaces.

The port naming convention for XIV Storage System ports is:
WWPN: 5000001738XXXXYZ
001738 = Registered identifier for XIV
XXXX = Serial number in hex
Y = (hex) Interface Module number
Z = (hex) FC port number within the Interface Module

The following example shows a port definition:
xcli -c Redbook host_add_port host=Server1 fcaddress=10000000C92CFD36

You can get more information about the FC and iSCSI connectivity by using these commands:
xcli -c Redbook fc_connectivity_list
xcli -c Redbook fc_port_list
xcli -c Redbook host_connectivity_list
Using these commands, you can discover the network or list the existing connections.

After the port definition, you can map existing volumes to one of the hosts or clusters:
xcli -c Redbook map_vol host=Server1 vol=DBvol1 lun=1 override=yes
The command fails if another volume is mapped to the same LUN for this cluster/host, unless override is specified. If the override option is specified, the host's existing mapping is replaced by the newly specified mapping, which enables the host (or all the hosts in the cluster) to see a continuous mapping of volumes to this LUN, although the volume's content, and possibly its size, might change.
5.6 Scripts
IBM XIV Storage Manager software XCLI commands can be used in scripts or batch programs for repetitive or complex operations. The XCLI can be used in a shell environment to interactively configure the system, or as part of a script to perform specific tasks. Example 5-9 on page 123 shows a Windows XP batch program.
rem ---------------------------------------------------------------------------
rem This batch program erases all volumes and snapshots in a specified
rem Storage Pool within an IBM XIV Storage System
rem Prerequisite: existing configuration with user, password, and IP address
rem Limitation: it will not delete volumes with a mirror relationship
rem Operating system: Windows XP
rem Tested xcli version: 2.2.43
rem ---------------------------------------------------------------------------
@echo off
cls
rem Set the variables
rem ---------------------------------------------------------------------------
set POOLNAME=
set ANSWER=N
set SYSNAME=Redbook
rem ---------------------------------------------------------------------------
:POOLIST
rem Lists the pools and sorts them
rem ---------------------------------------------------------------------------
echo ***These are the storage pools in - %SYSNAME% - Storage System***
xcli -t name -c %SYSNAME% pool_list

:POOLINPUT
rem Prompt for user input of the Storage Pool name and check it
echo ----------------------------------------------------------------------------
@set /p POOLNAME=Type the Storage Pool name where you want to delete ALL the volumes: 
xcli -s -t name -c %SYSNAME% pool_list | findstr \<%POOLNAME%\>
@if errorlevel 1 (
  echo There is no such Storage Pool in the system called: %POOLNAME%
  goto quit
)

:VOLIST
rem Lists the volumes in the requested pool
echo ----------------------------------------------------------------------------
echo ***Listing volumes in - %POOLNAME% - Storage Pool***
for /F "usebackq skip=1 delims=, tokens=1,3,5" %%i in (`xcli -s -c %SYSNAME% vol_list`) do (
  if %%~k EQU %POOLNAME% (
    if %%j EQU "" (
      echo Volume: %%~i
    ) else (
      echo Snapshot: %%~i - Master: %%~j
    )
  )
)

:VOLLEY
rem Ask for confirmation to delete and, in case of Yes, delete the volumes
echo ----------------------------------------------------------------------------
echo Please type Y for Yes or N for No - default is No
set /p ANSWER=Do you really want to delete all the volumes and snapshots in %POOLNAME% now ? Y/N: 
if /i %ANSWER:~,1% EQU N (
  echo - Action is cancelled
  goto quit
)
if /i %ANSWER:~,1% EQU Y (
  for /F "usebackq skip=1 delims=, tokens=1,3,5" %%i in (`xcli -s -c %SYSNAME% vol_list`) do (
    if %%~k EQU %POOLNAME% (
      if %%j EQU "" (
        rem ---------------------------------------------------------------------
        rem Uncomment the next commands if you want to remove the volumes from a CG
        rem ---------------------------------------------------------------------
        rem echo Removing %%~i from CG ...
        rem xcli -y -c %SYSNAME% cg_remove_vol vol=%%~i
        rem ---------------------------------------------------------------------
        rem Uncomment the next commands if you want to unmap mapped volumes
        rem ---------------------------------------------------------------------
        rem echo Unmapping %%~i ...
        rem for /F "usebackq skip=1 delims=, tokens=1" %%h in (`xcli -s -c %SYSNAME% vol_mapping_list vol^=%%~i`) do (
        rem   xcli -y -c %SYSNAME% unmap_vol vol=%%~i host=%%~h
        rem )
        rem ---------------------------------------------------------------------
        rem This command deletes the volume
        rem ---------------------------------------------------------------------
        echo Erasing %%~i ...
        xcli -y -c %SYSNAME% vol_delete vol=%%~i
      )
    )
  )
  goto quit
)
goto VOLLEY

:quit
echo ----------------------------------------------------------------------------
echo End of Program
Chapter 6.
Security
This chapter discusses the XIV Storage System security features from different perspectives. More specifically, it covers the following topics: System physical access security User access and authorizations Password management Managing multiple machines Enhanced access security
Important: Protect your XIV Storage System by locking the rack doors and monitoring physical access to the rack.
Table 6-1  Default users and their categories

Predefined user    Default password    Category
admin              adminadmin          storageadmin
technician         technician          technician
N/A                N/A                 applicationadmin
N/A                N/A                 readonly
xiv_development    N/A                 xiv_development
xiv_maintenance    N/A                 xiv_maintenance
Both GUI and XCLI use the same user and role definitions.
User groups
A user group is a group of application administrators who share the same set of snapshot creation limitations. The limitations are enforced by associating the user groups with hosts or clusters and, therefore, with the snapshots of volumes that are mapped to these hosts or clusters. After a user (application administrator) belongs to a user group that is associated with a host, the user can manage snapshots of the volumes mapped to that host. The concept of user groups allows a simple update, through a single command, of the limitations for all the users in the user group.

User groups have these rules:
 Only users who are defined as application administrators can be assigned to a group.
 A user can belong to only a single user group.
 A user group can contain up to eight users.

Important: A user group only applies to users with the application administrator role. Storage Administrators create the user groups and control the various application administrator permissions.

Rules:
 A maximum of 32 users can be created.
 A maximum of eight user groups can be created.
 A maximum of eight users can be attached to a user group.
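The membership rules above can be expressed as a small validation sketch (our Python illustration of the stated constraints, not product code; the function and data-structure names are ours):

```python
MAX_USERS, MAX_GROUPS, MAX_GROUP_SIZE = 32, 8, 8  # limits stated in the rules

def can_add_to_group(user_roles: dict, group_members: dict, user: str, group: str) -> bool:
    """Check the user-group rules before adding a user.
    user_roles: {user: role}; group_members: {group: set of users}."""
    if user_roles.get(user) != "applicationadmin":
        return False                      # only application administrators
    if any(user in members for members in group_members.values()):
        return False                      # a user belongs to a single group
    if len(group_members.get(group, set())) >= MAX_GROUP_SIZE:
        return False                      # at most eight users per group
    return True

roles = {"adm_mike02": "applicationadmin", "admin": "storageadmin"}
groups = {"EXCHANGE CLUSTER 01": set()}
print(can_add_to_group(roles, groups, "adm_mike02", "EXCHANGE CLUSTER 01"))  # True
print(can_add_to_group(roles, groups, "admin", "EXCHANGE CLUSTER 01"))       # False
```

The second call fails because admin holds the storageadmin role, which manages groups but cannot belong to one.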
2. Users are defined per system. If you manage multiple systems and they have been added to the GUI, select the particular system with which you want to work.
3. In the main Storage Manager GUI window, move the mouse pointer over the padlock icon to display the Access menu. All user access operations can be performed from the Access menu (refer to Figure 6-2). There are three choices:
 Users: Define or change single users.
 Users Groups: Define or change user groups, and assign application administrator users to groups.
 Access Control: Define or change user groups, and assign application administrator users or user groups to hosts.
4. Move the mouse over the Users menu item (it is now highlighted in yellow) and click (Figure 6-2).
5. The Users window is displayed. If the storage system is accessed for the first time, the window displays the predefined users (refer to Figure 6-3 on page 130 for an example). The default columns are Name, Role, Group, E-mail, and Phone. An additional column called Full Access can also be displayed (this indication only applies to users with a role of application administrator). To add the Full Access column, right-click the blue heading bar to display the Customize Columns dialog.
c. We recommend that you change the default passwords for the predefined users, which can be accomplished by right-clicking the user name and selecting Change Password from the context menu, as illustrated in Figure 6-4. Repeat the operation for each of the four predefined users.
6. To add a new user, you can either click the Add icon in the menu bar or right-click the empty space to get the context menu. Both options are visible in Figure 6-5. Click Add User.
7. The Define User dialog is displayed. A user is defined by a unique name and a password (refer to Figure 6-6). The default role (denoted as Category in the dialog panel) is storageadmin and must be changed. Optionally, enter the e-mail address and phone number for the user. Click Define to define the user and return to the Users window.
8. If you need to test the user that you just defined, click the current user name shown in the upper right corner of the IBM XIV Storage Manager window (Figure 6-7), which allows you to log in as a new user.
2. The Users Groups window displays. To add a new user group, either click the Add User Group icon (shown in Figure 6-9) in the menu bar, or right-click in an empty area of the Users Groups table and select Add User Group from the context menu as shown in Figure 6-9.
3. The Create User Group dialog displays. Enter a meaningful group name and click OK (refer to Figure 6-10).
4. At this stage, the user group EXCHANGE CLUSTER 01 is still empty. Next, we add a host to the user group. Select Access Control from the Access menu as shown in Figure 6-11. This Access Control window appears.
5. Right-click the name of the user group that you have created to bring up a context menu and select Updating Access Control as shown in Figure 6-12 on page 133.
6. The Access Control Definitions dialog that is shown in Figure 6-13 is displayed. The panel contains the names of all the hosts or clusters defined to the XIV Storage System. The left pane displays the list of Unauthorized Hosts/Clusters for this particular user group and the right pane shows the list of hosts that have already been associated to the user group. You can add or remove hosts from either list by selecting a host and clicking the appropriate arrow. Finally, click Update to save the changes.
7. After a host (or multiple hosts) have been associated to a user group, you can add users to the user group (remember that a user must have the application administrator role to be added to a user group). Go to the Users window and right-click the user name to display the context menu. From the context menu (refer to Figure 6-14), select Add to Group to add this user to a group.
8. The Select User Group dialog is displayed. Select the desired group from the pull-down list and click OK (refer to Figure 6-15).
9. The user adm_mike02 has been assigned to the user group EXCHANGE CLUSTER 01 in this example. You can verify this assignment in the Users panel as shown in Figure 6-16.
10.The user adm_mike02 is an applicationadmin with the Full Access right set to no. This user can now perform snapshots of the EXCHANGE CLUSTER 01 volumes. Because the exchange cluster is the only host in the group, adm_mike02 is only allowed to map those snapshots to the EXCHANGE CLUSTER 01. However, you can add another host, such as a test or backup host, to allow adm_mike02 to map a snapshot volume to a test server.
Table 6-2  XCLI access control commands

Command                  Description                                    Role required to use command
access_define            Defines an association between a user         storageadmin
                         group and a host.
access_delete            Deletes an access control definition.         storageadmin
access_list              Lists access control definitions.             storageadmin, readonly, and applicationadmin
user_define              Defines a new user.                           storageadmin
user_delete              Deletes a user.                               storageadmin
user_group_add_user      Adds a user to a user group.                  storageadmin
user_group_create        Creates a user group.                         storageadmin
user_group_delete        Deletes a user group.                         storageadmin
user_group_list          Lists all user groups or a specific one.      storageadmin, readonly, and applicationadmin
user_group_remove_user   Removes a user from a user group.             storageadmin
user_group_rename        Renames a user group.                         storageadmin
user_list                Lists all users or a specific user.           storageadmin, readonly, and applicationadmin
user_rename              Renames a user.                               storageadmin
user_update              Updates a user. You can update the            technician, storageadmin, and applicationadmin
                         password, Access_all or Full_access,
                         e-mail, area code, or phone number.
2. In Example 6-2, we check the current state of the particular system with which we want to work (note that if the system name contains blanks, it must be enclosed in quotation marks). The default user admin is used.
Example 6-2 XCLI state_list

C:\>xcli -c 1300203 -u admin -p adminadmin state_list
Command completed successfully
system_state   off_type=off   safe_mode=no   shutdown_reason=No Shutdown   system_state=on   target_state=on

3. XCLI commands are grouped into categories. The help command can be used to get a list of all commands related to the category accesscontrol. Refer to Example 6-3.
Example 6-3 XCLI help
C:\>xcli -c 1300203 -u admin -p adminadmin help category=accesscontrol
Category       Name                    Description
accesscontrol  access_define           Associate a user group and a host.
accesscontrol  access_delete           Deletes an access control definition.
accesscontrol  access_list             Lists access control definitions.
accesscontrol  user_define             Defines a new user.
accesscontrol  user_delete             Deletes a user.
accesscontrol  user_group_add_user     Adds a user to a user group.
accesscontrol  user_group_create       Creates a user group.
accesscontrol  user_group_delete       Deletes a user group.
accesscontrol  user_group_list         Lists all user groups or a specific one.
accesscontrol  user_group_remove_user  Removes a user from a user group.
accesscontrol  user_group_rename       Renames a user group.
accesscontrol  user_list               Lists all users or a specific user.
accesscontrol  user_rename             Renames a user.
accesscontrol  user_update             Updates a user.

4. Use the user_list command to obtain the list of predefined users and roles (categories) as shown in Example 6-4. This example assumes that no users, other than the default users, have been added to the system.
Example 6-4 XCLI user_list
C:\>xcli -c 1300203 -u admin -p adminadmin user_list
Name             Category         Group/EmailAddress/AreaCode/Phone/AccessAll
xiv_development  xiv_development
xiv_maintenance  xiv_maintenance
admin            storageadmin
technician       technician

5. If this is a new system, you must change the default passwords for obvious security reasons. Use the user_update command as shown in Example 6-5 for the user technician.
Example 6-5 XCLI user_update
C:\>xcli -c 1300203 -u admin -p adminadmin user_update user=technician password=d0ItNOW password_verify=d0ItNOW
Command completed successfully

6. Adding a new user is straightforward, as shown in Example 6-6. A user is defined by a unique name, password, and role (designated here as category).
Example 6-6 XCLI user_define

C:\>xcli -c 1300203 -u admin -p adminadmin user_define user=adm_itso02 password=wr1teFASTER password_verify=wr1teFASTER category=storageadmin
Command completed successfully

7. Example 6-7 shows a quick test to verify that the new user can log on.
Example 6-7 XCLI user_list
C:\>xcli -c 1300203 -u adm_itso02 -p wr1teFASTER user_list
Name             Category
xiv_development  xiv_development
xiv_maintenance  xiv_maintenance
admin            storageadmin
technician       technician
adm_itso02       storageadmin
Example 6-8 XCLI user_group_create

C:\>xcli -c 1300203 -u adm_itso02 -p wr1teFASTER user_group_create user_group=EXCHANGE_CLUSTER_01
Command completed successfully

Note: Avoid spaces in user group names. If spaces are required, the group name must be enclosed in quotation marks, such as 'name with spaces'.

2. The user group EXCHANGE_CLUSTER_01 is empty and has no associated host. The next step is to associate a host or cluster. In Example 6-9, user group EXCHANGE_CLUSTER_01 is associated with EXCHANGE_CLUSTER_MAINZ.
Example 6-9 XCLI access_define
C:\>xcli -c 1300203 -u adm_itso02 -p wr1teFASTER access_define user_group="EXCHANGE_CLUSTER_01" cluster="EXCHANGE_CLUSTER_MAINZ"
Command completed successfully

3. A host has been assigned to the user group, but the user group still does not include any users. In Example 6-10 on page 137, we add the first user.
Example 6-10 XCLI user_group_add_user
C:\>xcli -c 1300203 -u adm_itso02 -p wr1teFASTER user_group_add_user user_group="EXCHANGE_CLUSTER_01" user="adm_mike02"
Command completed successfully

4. The user adm_mike02 has been assigned to the user group EXCHANGE_CLUSTER_01. You can verify the assignment by using the XCLI user_list command as shown in Example 6-11.
C:\>xcli -c 1300203 -u adm_itso02 -p wr1teFASTER user_list
Name             Category          Group                Access All
xiv_development  xiv_development
xiv_maintenance  xiv_maintenance
admin            storageadmin
technician       technician
adm_itso02       storageadmin
adm_mike02       applicationadmin  EXCHANGE_CLUSTER_01  no

The user adm_mike02 is an applicationadmin with the Full Access right set to no. This user can now perform snapshots of the EXCHANGE_CLUSTER_01 volumes. Because EXCHANGE_CLUSTER_01 is the only cluster (or host) in the group, adm_mike02 can only map those snapshots back to EXCHANGE_CLUSTER_01 itself. This is rarely useful in practice and is not supported in most cases: most servers (operating systems) cannot handle having two disks with the same metadata mapped to the system. To prevent issues on the server, map the snapshot to a host other than the one to which the master volume is mapped. Therefore, to make things practical, a user group is typically associated with more than one host.
Figure 6-17 shows that you can change a password by right-clicking the selected user in the Users window. Then, select Change Password from the context menu.
The Change Password dialog shown in Figure 6-18 is displayed. Enter the New Password and then retype it for verification in the appropriate field (remember that only alphanumeric characters are allowed). Click Update.
Example 6-12 on page 139 shows the same password change procedure using the XCLI. Remember that a user with the storageadmin role is required to change the password on behalf of another user.
Example 6-12 XCLI change user password
C:\>xcli -c 1300203 -u admin -p adminadmin user_update user=adm_mike02 password=workLESS password_verify=workLESS
Command completed successfully
Figure 6-19 illustrates the GUI view of multiple systems when different IDs (or passwords) are used. In this example, the system named ESP has an ID named tester with storage administrator rights. Because the tester ID is not configured on the other XIV Storage Systems, only the ESP system is currently shown as accessible. The user can see the other systems but cannot access them with the tester ID; the unauthorized systems appear in black and white and indicate that the user is unknown. Even if another system has the user ID tester defined, but with a different password, that system is still displayed in the same inaccessible state.
To allow the system manager easy access across systems, it is best to define a universal ID. Figure 6-20 illustrates a universal ID used with five systems. The storage administrator can switch between these systems without having to log on each time with a different user ID. Each system authorized for that ID and password combination is displayed in color with an indication of its status.
The window is split into two sections: the top part contains the management tools, such as wizards in the menu bar and a series of input fields and drop-down menus that act as selection filters. The bottom part is a table displaying the events according to the selection criteria. Use the table title bar or column headings to enable or change the sort direction.
At the time of writing, the XIV GUI is limited to displaying a maximum of 300 events, and the name field (which allows you to filter on the name of the user) is disabled. Consequently, the only way to get events related to a specific user is through the XCLI.

Note: The XIV GUI does not display the user who performed a transaction. A transaction audit with the user name can only be performed with the XCLI.
Event attributes
This section gives an overview of all available event types, event codes, and their severity levels.
Severity levels
Select one of six possible severity levels as the minimal level to be displayed:
- none: Includes all severity levels
- informational: Changes, such as volume deletion, size changes, or host multi-pathing
- warning: Volume usage limits reach 80%, failing message sent
- minor: Power supply power input loss, volume usage over 90%, component TEST failed
- major: Component failed (disk), user system shutdown, volume and pool usage at 100%, UPS on battery, or Simple Mail Transfer Protocol (SMTP) gateway unreachable
- critical: Module failed or UPS failed
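Because each level includes everything more severe than it, selecting a minimal severity is an ordering comparison. The following sketch illustrates that filtering logic (the event codes are made up for illustration; the ordering follows the list above):

```python
# Severity levels ordered from least to most severe; "none" shows everything.
LEVELS = ["none", "informational", "warning", "minor", "major", "critical"]

def visible(events, minimal):
    """Return the events whose severity is at or above the minimal level."""
    threshold = LEVELS.index(minimal)
    return [e for e in events if LEVELS.index(e["severity"]) >= threshold]

events = [
    {"code": "VOLUME_DELETED", "severity": "informational"},
    {"code": "UPS_ON_BATTERY", "severity": "major"},
    {"code": "MODULE_FAILED", "severity": "critical"},
]
print([e["code"] for e in visible(events, "major")])
# -> ['UPS_ON_BATTERY', 'MODULE_FAILED']
```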
Event codes
Refer to the XCLI Reference Guide, GC27-2213-00, for a list of event codes.
Event types
The following event types can be used as filters (specified with the parameter object_type in the XCLI command):
- cons_group: consistency group
- destgroup: event notification group of single destinations (mixture of SMTP/SMS)
- dest: event notification address
- dm: data migration
- host: host
- map: volume mapping
- mirror: mirroring
- pool: pool
- rule: rule
- smsgw: SMS gateway
- smtpgw: SMTP gateway
- target: FC/iSCSI connection
- volume: volume
Command            Description
smsgw_define       Defines a Short Message Service (SMS) gateway.
smsgw_delete       Deletes an SMS gateway.
smsgw_list         Lists SMS gateways.
smsgw_prioritize   Sets the priorities of the SMS gateways for sending SMS messages.
smsgw_rename       Renames an SMS gateway.
smsgw_update       Updates an SMS gateway.
smtpgw_define      Defines an SMTP gateway.
smtpgw_delete      Deletes a specified SMTP gateway.
smtpgw_list        Lists SMTP gateways.
smtpgw_prioritize  Sets the priority of which SMTP gateway to use to send e-mails.
smtpgw_rename      Renames an SMTP gateway.
smtpgw_update      Updates the configuration of an SMTP gateway.
rule_activate      Activates an event notification rule.
rule_create        Creates an event notification rule.
rule_deactivate    Deactivates an event notification rule.
rule_delete        Deletes an event notification rule.
rule_list          Lists event notification rules.
rule_rename        Renames an event notification rule.
rule_update        Updates an event notification rule.
XCLI examples
To illustrate how the commands operate, the event_list command displays the events currently in the system. Example 6-13 shows the first few events logged in our system.
Example 6-13 XCLI event_list

C:\XIV>xcli -c "XIV ESP 1 V10.0 1300203" event_list
Index  Code          Severity       Timestamp            Alerting  Cleared  User
8545   UNMAP_VOLUME  Informational  2008-08-20 08:26:49  no        yes      admin
8546   UNMAP_VOLUME  Informational  2008-08-20 08:26:50  no        yes      admin
8547   UNMAP_VOLUME  Informational  2008-08-20 08:26:51  no        yes      admin
8548   UNMAP_VOLUME  Informational  2008-08-20 08:26:53  no        yes      admin
8549   UNMAP_VOLUME  Informational  2008-08-20 08:26:54  no        yes      admin
.......
Example 6-14 illustrates the command for listing all instances when a user was updated. The USER_UPDATED event is generated when a user's password, e-mail, or phone number is modified. In this example, the -t option is used to display specific fields, such as index, code, description of the event, time stamp, and user name. The description field provides the ID that was modified, and the user field is the ID of the user performing the action.
Example 6-14 View USER_UPDATED event with the XCLI
C:\XIV>xcli -c "XIV ESP 1 V10.0 1300203" -t index,code,description,timestamp,user_name event_list code=USER_UPDATED
Index  Code          Description                               Timestamp   User
425    USER_UPDATED  User with name 'xiv_pfe' was updated.     2008-08-07  admin
426    USER_UPDATED  User with name 'xiv_pfe' was updated.     2008-08-07  admin
441    USER_UPDATED  User with name 'adm_mike02' was updated.  2008-08-07  admin
1279   USER_UPDATED  User with name 'chris' was updated.       2008-08-25  admin
C:\XIV>xcli -c "XIV ESP 1 V10.0 1300203" rule_create rule=test codes=ACCESS_OF_USER_GROUP_TO_CLUSTER_REMOVED,ACCESS_OF_USER_GROUP_TO_HOST_REMOVED,ACCESS_TO_CLUSTER_GRANTED_TO_USER_GROUP,ACCESS_TO_HOST_GRANTED_TO_USER_GROUP dests=relay
Command executed successfully.

A simpler example is setting up a rule notification for when a user account is modified. Example 6-16 creates a rule on the XIV Storage System called ESP that sends a notification whenever a user ID is modified on the system. The notification is transmitted through the relay destination.
Example 6-16 Create a rule for notification with the XCLI
C:\XIV>xcli -c "XIV ESP 1 V10.0 1300203" rule_create rule=user_update codes=USER_UPDATED dests=relay
Command executed successfully.

The same rule can be created in the GUI. From the events menu, select Rules and enter the details in the panel. Refer to Chapter 10, Monitoring, on page 249 for more details about configuring the system to provide notifications and setting up rules.
Chapter 7.
Host connectivity
This chapter discusses the host attachment capabilities of the XIV Storage System. It addresses key aspects of host attachment and reviews concepts and requirements for both the Fibre Channel and Internet Small Computer System Interface (iSCSI) protocols. The information in this chapter applies to the various host operating systems that are compatible with XIV. Operating system-specific information is provided in subsequent chapters of the book.

You can configure the XIV Storage System for the following adapter types and protocols:
- Fibre Channel adapters for support of Fibre Channel Protocol (FCP)
- Ethernet adapters or iSCSI host bus adapters (HBAs) for support of iSCSI over IP Ethernet networks

As explained in Interface Module on page 54, the XIV Storage System has six host Interface Modules. Each host Interface Module contains four Fibre Channel ports, and three of the host Interface Modules also contain two iSCSI ports each. Use these ports to attach hosts and a remote XIV Storage System to the XIV Storage System. To simplify host cabling, the XIV Storage System has an integrated patch panel.
Any host traffic is served through the six Interface Modules (numbers 4-9). Although the XIV Storage System distributes the traffic between I/O modules and Data Modules, it is the storage administrator's responsibility to ensure that host I/Os are equitably distributed among the various Interface Modules. This workload balance must be monitored and reviewed over time as host traffic patterns change.
Important: Host I/Os are not automatically balanced by the system. It is the storage administrators responsibility to ensure that host connections are made to avoid a single point of failure (such as a Module or HBA) and that the host workload is adequately spread across the connections and Interface Modules.
Figure 7-2 Host connectivity end-to-end view: Internal cables compared to external cables
Figure 7-3 on page 150 provides additional details, such as the iSCSI-qualified name and Fibre Channel worldwide port names (WWPNs).
An even more detailed view of host connectivity and variations is given for FCP in 7.2, Fibre Channel (FC) connectivity on page 152 and for iSCSI in 7.3, iSCSI connectivity on page 159.
You can also have a mix (or coexistence) of FC and iSCSI connections to attach various hosts, but do not use both FC and iSCSI connections for the same host. Figure 7-4 shows that it is nevertheless technically possible to use FC and iSCSI connections concurrently, even for attaching the same LUN. This configuration is not supported by most operating systems. For details, refer to the IBM System Storage Interoperability Center (SSIC) at:
http://www.ibm.com/systems/support/storage/config/ssic/displayesssearchwithoutjs.wss?start_over=yes
We mention this capability only because it can be useful, and is solely recommended, when migrating from a former storage system that supports just one of the protocols.

Note: We do not recommend that you use both FCP and iSCSI for shared access to the same LUN from a single host.
5. Install the adapter driver that came with your HBA or download and install an updated adapter driver.
7.2.2 FC configurations
Hosts can attach to the Fibre Channel ports either directly or through an FC fabric. Several configurations are technically possible, and they vary in terms of their cost and the degree of flexibility, performance, and reliability that they provide. A desirable goal is a highly available, high-performance solution: avoid any single point of failure in the connectivity and use as many connections as possible. However, to keep the cost of the solution in line with business requirements, less expensive and less resilient solutions can be justified. Next, we review the three most common FC topologies that are supported.
In this configuration:
- Each host is equipped with dual HBAs. Each HBA (or HBA port) is connected to one or two FC switches.
- Each of the FC switches has a connection to a separate FC port of each of the six Interface Modules.

This configuration has no single point of failure:
- If a module fails, each host remains connected to the other five modules.
- If an FC switch fails, each host remains connected to all modules through the second FC switch.
- If an HBA port fails, the host can still connect over the other HBA port.
Single switch
This configuration is used when only a single switch is available, as shown in Figure 7-6. A better variation of this solution is to use two HBAs for each host, which keeps the attachment resilient to a single HBA failure. Still, the single SAN switch remains a potential single point of failure.
Figure 7-6 FC configurations: Single switch
Figure 7-7 FC configurations: Direct attach
However, in large implementations, this approach makes the zoning management effort grow drastically. A common approach is therefore to create single-initiator, multiple-target zones, as shown in Figure 7-9 on page 156. For more in-depth information about SAN zoning, refer to section 4.7 of the IBM Redbooks publication Introduction to Storage Area Networks, SG24-5470. You can download it from:
http://www.redbooks.ibm.com/redbooks/pdfs/sg245470.pdf
Follow these best practice recommendations:
- For general configurations, zone each host HBA to a single port from each of three Interface Modules, which provides six paths to dual-HBA hosts.
- For high workload applications, consider zoning each HBA to one port from each of the six Interface Modules.
- Do not configure more than 24 logical paths per host. There is no advantage to configuring more than 24 logical paths to a single host, and doing so can actually compromise overall stability.
- Use separate HBAs if you need to attach tape devices to a host that is also connected to the XIV Storage System. Disk and tape traffic are different in nature, and a specific HBA can only be set to be either disk-optimized or tape-optimized. Otherwise, disks run into timeouts and disconnects while tape drives rewind, and backup or restore performance drops significantly.

Note: Use a single-initiator zoning scheme. Do not share a host HBA for disk and tape access.

Zone members are identified either by switch port (hard zoning) or by their HBA worldwide port name (WWPN) (soft zoning). While it is simple to get the switch port number, several specific commands are required to get the WWPNs of the target ports. The next section explains how to get them.
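The path arithmetic behind these recommendations is simple: the number of logical paths a host sees equals HBAs × zoned Interface Modules × ports per module. A quick illustrative sketch that checks a proposed zoning against the 24-path limit:

```python
def logical_paths(hbas, modules, ports_per_module):
    """Number of logical paths a host sees for a given zoning scheme."""
    return hbas * modules * ports_per_module

# General recommendation: 2 HBAs x 3 Interface Modules x 1 port = 6 paths.
general = logical_paths(2, 3, 1)
# High-workload variant: 2 HBAs x 6 Interface Modules x 1 port = 12 paths.
heavy = logical_paths(2, 6, 1)

for name, paths in (("general", general), ("high-workload", heavy)):
    ok = paths <= 24   # more than 24 logical paths can compromise stability
    print(name, paths, "ok" if ok else "too many")
```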
Example 7-1 XCLI: How to get WWPN of IBM XIV Storage System
C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin fc_port_list
Component ID   Status  Currently Functioning  WWPN              Port ID   Role
1:FC_Port:4:4  OK      yes                    5001738000320143  00FFFFFF  Initiator
1:FC_Port:4:3  OK      yes                    5001738000320142  00020C00  Target
1:FC_Port:4:2  OK      yes                    5001738000320141  00010F00  Target
1:FC_Port:4:1  OK      yes                    5001738000320140  00FFFFFF  Target
1:FC_Port:5:4  OK      yes                    5001738000320153  00FFFFFF  Initiator
1:FC_Port:5:3  OK      yes                    5001738000320152  00010E00  Target
1:FC_Port:5:2  OK      yes                    5001738000320151  00010400  Target
1:FC_Port:5:1  OK      yes                    5001738000320150  00020400  Target
1:FC_Port:6:4  OK      yes                    5001738000320163  000000EF  Initiator
1:FC_Port:6:3  OK      yes                    5001738000320162  000000EF  Target
1:FC_Port:6:2  OK      yes                    5001738000320161  00010D00  Target
1:FC_Port:6:1  OK      yes                    5001738000320160  00020500  Target
1:FC_Port:7:4  OK      yes                    5001738000320173  00FFFFFF  Initiator
1:FC_Port:7:3  OK      yes                    5001738000320172  00640900  Target
1:FC_Port:7:2  OK      yes                    5001738000320171  001B0F00  Target
1:FC_Port:7:1  OK      yes                    5001738000320170  00130B00  Target
1:FC_Port:9:4  OK      yes                    5001738000320193  00120C00  Initiator
1:FC_Port:9:3  OK      yes                    5001738000320192  00640800  Target
1:FC_Port:9:2  OK      yes                    5001738000320191  000A1000  Target
1:FC_Port:9:1  OK      yes                    5001738000320190  00120B00  Target
1:FC_Port:8:4  OK      yes                    5001738000320183  00130C00  Initiator
1:FC_Port:8:3  OK      yes                    5001738000320182  00FFFFFF  Target
1:FC_Port:8:2  OK      yes                    5001738000320181  001B0E00  Target
1:FC_Port:8:1  OK      yes                    5001738000320180  00111000  Target
To get the same information from the XIV GUI, select the main view of an XIV Storage System, use the arrow at the bottom (circled in red) to reveal the patch panel, and move the mouse cursor over a particular port to reveal the port details, including the WWPN (refer to Figure 7-10 on page 158).
Figure 7-10 GUI: How to get WWPNs of IBM XIV Storage System
Note: The WWPNs of an XIV Storage System are static. The last two digits of the WWPN indicate from which module and port the WWPN came. As shown in Figure 7-10, the WWPN is 5001738000320151, which means that the WWPN is from module 5 port 2. The ports in the WWPN are numbered from 0 to 3 (instead of 1 to 4). The values that comprise the WWPN are shown in Example 7-2.
Example 7-2 WWPN illustration
If WWPN is 50:01:73:8N:NN:NN:RR:MP
5        NAA (Network Address Authority)
001738   IEEE Company ID
NNNNN    IBM XIV Serial Number in hex
RR       Rack ID (01-ff, 0 for WWNN)
M        Module ID (1-f, 0 for WWNN)
P        Port ID (0-7, 0 for WWNN)
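Given the layout in Example 7-2, the module and port can be read directly from the last two hex digits of a WWPN. The following Python sketch is a hypothetical helper (not part of the XIV tooling) that decodes the fields:

```python
def decode_xiv_wwpn(wwpn: str):
    """Decode an XIV WWPN of the form 50:01:73:8N:NN:NN:RR:MP.

    The last two hex digits carry the module ID and the zero-based port
    index; ports are numbered 0-3 on the wire, so the human-readable
    port number is the index plus one.
    """
    w = wwpn.replace(":", "").lower()
    if len(w) != 16 or not w.startswith("5001738"):
        raise ValueError("not an XIV WWPN: %s" % wwpn)
    return {
        "serial_hex": w[7:12],       # system serial number in hex
        "rack": int(w[12:14], 16),
        "module": int(w[14], 16),
        "port": int(w[15], 16) + 1,  # 0-3 on the wire -> ports 1-4
    }

# The WWPN from Figure 7-10 decodes to module 5, port 2.
print(decode_xiv_wwpn("5001738000320151"))
```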
Migration and Remote Mirroring require an initiator port. Note that in the default IBM XIV Storage System configuration, port 4 (component ID: 1:FC_Port:Module ID:4) of each Interface Module is configured as an initiator.
An iSCSI storage system can use Challenge Handshake Authentication Protocol (CHAP) to authenticate initiators, and initiators can likewise authenticate targets, such as the storage system. CHAP is a method of authenticating iSCSI users. The IBM XIV Storage System does not currently support CHAP. We therefore recommend that you segregate the iSCSI network on a private network.
High availability
Figure 7-11 illustrates the best practice for high availability (HA) iSCSI connectivity. This solution makes the best use of the available iSCSI connectors in the XIV Storage System. Each Interface Module is connected through two ports to two separate Ethernet switches, and each host is connected to both switches. This solution provides a network architecture resilient to the failure of any individual network switch or Interface Module. Use IP(1) to IP(7) and IP(4) to IP(8) to spread traffic from the hosts. No additional management is required for the physical connections on the storage side. In the case of a network failure, hosts can still utilize all Interface Modules and caches.
Note: This High Availability configuration is the best practice for iSCSI connectivity. For the best performance, use a dedicated iSCSI network infrastructure. Aggregation of ports is not possible in this solution.
Single switch
Single switch connectivity must only be used when cost is a major concern or a second Ethernet switch is not available. As shown in Figure 7-12, the host has dual connections, and there are multiple connections to the different iSCSI-equipped Interface Modules for module resiliency. However, the Ethernet switch remains a single point of failure. To achieve hardware high availability, a compromise is to use a resilient Ethernet switch, such as the Cisco Catalyst 6500. With this configuration, you can also bond the two iSCSI connections of a module to get 2 Gbps of bandwidth. In this case, you only require one IP address per link aggregate (also refer to 7.3.2, Link aggregation on page 162).
A possible application for this configuration might be for a small test environment, while the production hosts are connected through FC.
As indicated before, an IQN is also required to uniquely identify the systems for iSCSI communication in the IP network and have it play the role of an iSCSI initiator. The rest of this section explains step-by-step how to use the GUI or XCLI to set up iSCSI communications.
2. The iSCSI connections window opens. Click the Define icon at the top of the window (refer to Figure 7-15) to open the Define IP interface dialog.
3. Enter the name, address, netmask, and gateway in the appropriate fields. The default MTU is 4500. All devices in a network must use the same MTU. If in doubt, set MTU to 1500, because 1500 is the default value for Gigabit Ethernet. Performance might be impacted if the MTU is set incorrectly. In this example, we also set up network aggregation by selecting both available network ports of one module, in this case, module 7 (refer to Figure 7-16).
C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin ipinterface_create ipinterface=iSCSI_module7_bonding address=192.168.1.1 netmask=255.255.255.0 module=1:Module:7 ports=P1,P2
Command executed successfully.
Use the IP addresses shown in this view to connect a host to the corresponding iSCSI port of the XIV Storage System. To check that the connection was successfully established, move the mouse over the Hosts and LUNs icon in the main GUI window and select Host Connectivity from the Hosts and LUNs menu (refer to Figure 7-14 on page 163). A working connection with a specific iSCSI module/port is indicated by a green check mark, as seen in Figure 7-18 on page 165. Note the identifier at the beginning of the line (iqn.), because it is the only way to differentiate FCP from iSCSI connections here. The host (x342_alex) that we used for our illustration had an iSCSI HBA connected to three iSCSI ports on the XIV Storage System, but only one of them is shown as working.
If you need to analyze why a connection is not working, use the XCLI, because it provides additional capabilities for that purpose.
Note the column named Type that displays the role of each port (Management, VPN, or iSCSI). Again, port 2 of module 7 is not shown, because it is not configured yet. Also, note the MTU indicated on an individual port basis. To see a complete list of IP interfaces, including iSCSI, use the command ipinterface_list_ports. A reworked output of this command is shown in Example 7-5. To make it more readable and focus on iSCSI, the output was reworked to include only iSCSI Role ports.
Example 7-5 XCLI to list iSCSI ports with ipinterface_list_ports command
C:\>xcli.exe -c "XIV V10.0 MN00050" -u admin -p adminadmin ipinterface_list_ports
Index  Role   IP Interface  Link Up?  Speed (MB/s)  Full Duplex?  Module
1      iSCSI  M7_P1         yes       1000          yes           1:Module:7
2      iSCSI                no        0             no            1:Module:7
1      iSCSI  M9_P1         yes       1000          yes           1:Module:9
2      iSCSI  iSCSI_M9_P2   yes       1000          yes           1:Module:9
1      iSCSI  M8_P1         yes       1000          yes           1:Module:8
2      iSCSI  iSCSI_M8_P2   yes       1000          yes           1:Module:8
From the XCLI, you can use specific network commands to help in IP problem determination. Next, we illustrate the ipinterface_run_traceroute command and the ipinterface_run_arp command.
Example 7-6 shows the ipinterface_run_traceroute command. In this particular example, we look at the IP connectivity between two XIV Storage Systems. The result confirms that both systems are connected to the same Ethernet switch and that the iSCSI interface (IP 9.155.56.100) on the first system can communicate with iSCSI 9.155.56.58 on the second system. This output indicates that the two systems are in a remote copy relationship and are able to communicate.
Example 7-6 XCLI iSCSI diagnostics with traceroute
C:\>xcli.exe -c "XIV V10.0 MN00050" -u admin -p adminadmin ipinterface_run_traceroute localipaddress=9.155.56.100 remote=9.155.56.58
Command executed successfully.
data=traceroute to 9.155.56.58 (9.155.56.58), 5 hops max, 40 byte packets
data= 1  9.155.56.58 (9.155.56.58)  1.478 ms  0.267 ms  0.091 ms

Example 7-7 illustrates the use of the ipinterface_run_arp command with the following scenario: One host with one iSCSI HBA (IP: 192.168.1.4) is accessing all three modules of the XIV Storage System, and all three paths work. Another host with one iSCSI HBA (IP: 192.168.1.7) has the same configuration, but only two of the paths work. Given that the first host can see a path to each of the modules, we can conclude that the problem is not with an Interface Module, and problem determination must now focus on either the network connection to Interface Module 9 or the second host.
Example 7-7 XCLI iSCSI diagnostics with arp
C:\>xcli.exe -c "XIV V10.0 MN00050" -u admin -p adminadmin ipinterface_run_arp localipaddress=192.168.1.1
Command executed successfully.
data=Address      HWtype  HWaddress          Flags Mask  Iface
data=192.168.1.4  ether   00:C0:DD:07:78:31  C           M7_P1
data=192.168.1.7  ether   00:C0:DD:04:15:BB  C           M7_P1
C:\>xcli.exe -c "XIV V10.0 MN00050" -u admin -p adminadmin ipinterface_run_arp localipaddress=192.168.1.2
Command executed successfully.
data=Address      HWtype  HWaddress          Flags Mask  Iface
data=192.168.1.7  ether   00:C0:DD:04:15:BB  C           M8_P1
data=192.168.1.4  ether   00:C0:DD:07:78:31  C           M8_P1
C:\>xcli.exe -c "XIV V10.0 MN00050" -u admin -p adminadmin ipinterface_run_arp localipaddress=192.168.1.3
Command executed successfully.
data=Address      HWtype  HWaddress          Flags Mask  Iface
data=192.168.1.4  ether   00:C0:DD:07:78:31  C           M9_P1
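The reasoning applied to Example 7-7 can be expressed as set arithmetic: collect, per interface, the host addresses that answered the ARP probe, and report where each host is missing. A hypothetical helper illustrating that logic:

```python
# Hosts seen per interface, as reported by ipinterface_run_arp on each module.
arp_by_iface = {
    "M7_P1": {"192.168.1.4", "192.168.1.7"},
    "M8_P1": {"192.168.1.4", "192.168.1.7"},
    "M9_P1": {"192.168.1.4"},
}

def missing_paths(arp_by_iface):
    """Map each host IP to the interfaces on which it did NOT answer."""
    all_hosts = set().union(*arp_by_iface.values())
    return {h: sorted(i for i, seen in arp_by_iface.items() if h not in seen)
            for h in all_hosts}

gaps = missing_paths(arp_by_iface)
print(gaps["192.168.1.7"])  # the second host is missing only on module 9
# -> ['M9_P1']
```

Because the first host answers on every interface, the gap points at the network path between Interface Module 9 and the second host rather than at the module itself.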
Important: Do not attempt to change the IQN. If a change is required, you must engage IBM support. The IQN is visible as part of the XIV Storage System configuration properties. From the XIV GUI, click Configure System from the system main window menu bar to display the Configure System dialog that is shown in Figure 7-19. For the corresponding XCLI config_get command, refer to Example 7-8.
Figure 7-19 iSCSI: Use XIV GUI to get iSCSI name (IQN)

Example 7-8 iSCSI: Use XCLI to get iSCSI name (IQN)
C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin config_get
Command executed successfully.
default_user=
dns_primary=
dns_secondary=
email_reply_to_address=
email_sender_address=PFE_XIV@de.ibm.com
email_subject_format={severity}: {description}
iscsi_name=iqn.2005-10.com.xivstorage:000050
machine_model=A14
machine_serial_number=MN00050
machine_type=2810
ntp_server=
snmp_community=XIV
snmp_contact=Unknown
snmp_location=Unknown
system_id=50
system_name=XIV V10.0 MN00050
timezone=-7200
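The config_get output in Example 7-8 is plain key=value text, so a script can parse it into a dictionary and pull out fields such as the iSCSI IQN. The sample below reuses a subset of the example output; the parser itself is a generic illustration, not an XIV tool.

```python
# Sketch: parse the key=value lines printed by config_get (Example 7-8).

def parse_config_get(output):
    """Build a dict from config_get-style key=value lines."""
    cfg = {}
    for line in output.splitlines():
        line = line.strip()
        if "=" in line:
            key, _, value = line.partition("=")  # split at the first '='
            cfg[key] = value
    return cfg

sample = """\
iscsi_name=iqn.2005-10.com.xivstorage:000050
machine_model=A14
machine_serial_number=MN00050
machine_type=2810
system_name=XIV V10.0 MN00050
"""

cfg = parse_config_get(sample)
print(cfg["iscsi_name"])  # → iqn.2005-10.com.xivstorage:000050
```

Using str.partition keeps values intact even when they contain further characters after the first equals sign.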
Before we highlight the advantages of one solution over the other, here are several considerations that affect performance in both cases:
- To optimize performance, separate data traffic from storage traffic; in other words, use a separate Ethernet switch for storage-related traffic.
- An iSCSI boot is possible with both types of initiators. However, software initiators are slightly more restrictive (check the vendor documentation for limitations).
- iSCSI is not recommended for latency-sensitive applications. In that case, use FCP.

The key benefits of using hardware initiators (iSCSI HBAs) are:
- Their performance is noticeably faster.
- The traffic that passes through the HBAs does not load the server's CPU to the same extent as traffic that goes through the standard IP stack (as is the case with software initiators).

The key benefits of using software initiators (Ethernet network interface cards (NICs)) are:
- There are no daughter cards, so the cost of that hardware is avoided.
- There is no need for PCI slots (saves two slots).
- Fewer IP addresses are needed, because the two IP addresses used for storage traffic can also be used for data traffic.
- It is the least expensive storage networking solution, covering both data and storage networking and even avoiding the cost of additional switches; however, this method might impact performance.
- It is possible to access other network storage devices, such as network-attached storage (NAS) servers or other file servers, using the same network interfaces as those used for iSCSI.

Note: The core idea of the XIV Storage System is to use commodity hardware and implement functionality in software. Based on this philosophy, the system uses commodity Ethernet adapters for iSCSI traffic.
Table 7-2 iSCSI connectivity parameter maximum values

Parameter                                                                          Maximum value
Maximum number of Interface Modules with iSCSI ports                               3
Maximum number of 1 GB iSCSI ports per Interface Module                            2
Maximum queue depth per iSCSI host port                                            1400
Maximum queue depth per mapped volume per (host port, target port, volume) tuple   256
Maximum iSCSI ports for any connection (host or XDRP)                              6
Maximum number of hosts (defined WWPNs and IQNs)                                   4000
Maximum number of mirroring couplings (number of mirrors)                          128
Maximum number of mirrors on the remote machine                                    128
Maximum number of remote targets                                                   4
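The maxima in Table 7-2 can also be checked mechanically during planning. The sketch below (our own illustration, not an XIV utility) encodes a few of the limits and flags a planned configuration that exceeds them:

```python
# Sketch: check a planned iSCSI configuration against the Table 7-2 maxima.
# The limit values come straight from the table; the checker is illustrative.

ISCSI_LIMITS = {
    "interface_modules_with_iscsi": 3,
    "iscsi_ports_per_module": 2,
    "queue_depth_per_host_port": 1400,
    "hosts": 4000,
    "remote_targets": 4,
}

def violations(planned):
    """Return the parameters in `planned` that exceed the documented maxima."""
    return [k for k, v in planned.items() if v > ISCSI_LIMITS[k]]

plan = {"interface_modules_with_iscsi": 3, "hosts": 4100}
print(violations(plan))  # → ['hosts']
```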
Figure 7-20 Example configuration: an FC host attached through SAN Fabric 1 and SAN Fabric 2, and an iSCSI host (IQN iqn.2000-04.com.qlogic:host.ibm.com, IP(7) and IP(8)) attached through the Ethernet network, both connecting via the patch panel to the XIV Interface Modules 4 through 9
To prepare the hardware:
1. In our scenario, we assume that the systems are already in place and physically cabled (with redundancy) as shown in Figure 7-20. Write down the component names and IDs, as illustrated in Table 7-3 on page 172 for our particular example.
Table 7-3 Example: Required component information

Component                           FC environment                                    iSCSI environment
IBM XIV FC HBAs                     WWPN: 5001738000320nnn                            N/A
                                    nnn for Fabric1: 140, 150, 160, 170, 180, and 190
                                    nnn for Fabric2: 142, 152, 162, 172, 182, and 192
Host HBAs                           HBA1 WWPN: 210100E08BAFA29E                       N/A
                                    HBA2 WWPN: 210000E08B8FA29E
IBM XIV iSCSI IPs                   N/A                                               IP(1): 192.168.1.1, IP(2): 192.168.1.2,
                                                                                      IP(3): 192.168.1.3, IP(4): 192.168.1.4,
                                                                                      IP(5): 192.168.1.5, IP(6): 192.168.1.6
IBM XIV iSCSI IQN (do not change)   N/A                                               iqn.2005-10.com.xivstorage:000050
Host IPs                            N/A                                               IP(7): 192.168.1.7, IP(8): 192.168.1.8
Host iSCSI IQN                      N/A                                               iqn.2000-04.com.qlogic:host.ibm.com
The OS type is also required information. With the current XIV Storage System, it is only relevant for Hewlett-Packard UNIX (HP-UX); all other operating systems use the same host type.
2. If the new server connects through FCP, it is preferable to first configure the network (SAN Fabric 1 and 2) and power on the host server, which populates the XIV Storage System list of WWPN hosts and allows a simple selection of the host adapter WWPN. The configuration steps for the FCP network include zoning:
   a. Log on to the Fabric 1 SAN switch and create a host zone (single initiator):
      Zone name: FChost_HBA1
      Members: 5001738000320140, 5001738000320150, 5001738000320160, 5001738000320170, 5001738000320180, 5001738000320190, 210100E08BAFA29E
   b. Log on to the Fabric 2 SAN switch and create a host zone (single initiator):
      Zone name: FChost_HBA2
      Members: 5001738000320142, 5001738000320152, 5001738000320162, 5001738000320172, 5001738000320182, 5001738000320192, 210000E08B8FA29E
   c. Add the new zones to the current zone set, then save and apply the new configuration.
For iSCSI connectivity, there is no zoning step, because the required IP addresses are simply entered manually.
The subsequent steps can be performed via the XIV GUI or the XCLI. For the XIV GUI, continue with 7.4.2, "Prepare for a new host: XIV GUI" on page 173. For the XCLI, go to 7.4.3, "Prepare for a new host: XCLI" on page 177.
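The single-initiator zones in step 2 follow a simple pattern: each zone contains one host HBA WWPN plus the six XIV target WWPNs for that fabric. The sketch below builds those member lists programmatically using the WWPNs from Table 7-3; it is illustrative only — the actual zoning is performed on the SAN switches, as described above.

```python
# Sketch: build the single-initiator zone member lists from step 2.
# The XIV WWPN pattern and the nnn values are taken from Table 7-3.

XIV_WWPN_TEMPLATE = "5001738000320{nnn}"

def build_zone(name, host_wwpn, fabric_nnn):
    """Return a zone dict: all XIV ports on one fabric plus one host HBA."""
    members = [XIV_WWPN_TEMPLATE.format(nnn=n) for n in fabric_nnn]
    return {"zone": name, "members": members + [host_wwpn]}

fabric1_nnn = ["140", "150", "160", "170", "180", "190"]
zone = build_zone("FChost_HBA1", "210100E08BAFA29E", fabric1_nnn)

print(len(zone["members"]))  # → 7 (six XIV ports plus one host HBA)
```

Generating the member lists this way avoids the transcription errors that are common when twelve 16-digit WWPNs are typed by hand into two switch configurations.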
4. The Hosts window is displayed showing a list of hosts that are already defined, if any. To add a new host or cluster, click either the Add Host or Add Cluster choice in the menu bar (refer to Figure 7-22). In our example, we select Add host.
5. The Define Host dialog is displayed as shown in Figure 7-23. Enter a name for the host. If a cluster definition was created in the previous step, it is available in the cluster drop-down list box. To add a server to a cluster, select a cluster name. Because we do not create a cluster in our example, we select None.
6. Repeat steps 4 and 5 to create a second host definition for the iSCSI-attached host (refer to Figure 7-24 on page 174).
7. Host access to LUNs is granted depending on the host adapter ID. For an FC connection, the host adapter ID is the FC HBA WWPN. For an iSCSI connection, the host adapter ID is the host or HBA IQN. To add a WWPN or IQN to a host definition, right-click the host and select Add Port from the context menu (refer to Figure 7-25).
8. The Add Port dialog is displayed as shown in Figure 7-26. Select port type FC or iSCSI. In this example, the FC host is defined first. Add the WWPN for HBA1 as listed in Table 7-3 on page 172. If the host is correctly connected and has done a port login at least one time, the WWPN is shown in the drop-down list box. Otherwise, you can manually enter the WWPN.
Now, proceed in the same manner to add the second HBA (HBA2) WWPN, as shown in Figure 7-27 on page 175. The XIV Storage System does not care which FC port name is added first.
IBM XIV Storage System: Concepts, Architecture, and Usage
In our example, there is also an iSCSI host. An iSCSI HBA is installed in this host. In the Add port dialog, specify the port type as iSCSI and enter the IQN of the HBA as the iSCSI Name or port name (refer to Figure 7-28).
9. The final configuration step is to map the volume to the host. While still in the Hosts configuration pane, right-click the host to which the volume is to be mapped and select Map Volumes to this Host from the context menu (refer to Figure 7-29).
10.The Volume to Host Mapping window opens. The process of adding a volume to a host definition is straightforward in this panel (refer to Figure 7-30 on page 176): Select an available volume from the left pane. The GUI will suggest a LUN ID to which to map the volume.
Chapter 7. Host connectivity
Click Map and the volume is assigned immediately. Note that the GUI enforces that a volume can only be mapped to one host. (For a cluster, create a cluster definition and map volumes to the cluster definition.)
There is no difference in mapping a volume to an FC or iSCSI host in the XIV GUI volume mapping view. For completeness, add a volume to the iSCSI host defined in this example (refer to Figure 7-31).
11.To complete this example, power up the host server and check connectivity. The XIV Storage System has a real-time connectivity status overview. Select Hosts Connectivity from the Hosts and LUNs menu to access the connectivity status (refer to Figure 7-32).
The host connectivity window is displayed (from the XIV Storage System point of view). In our example, the ExampleFChost was expected to have dual path connectivity to every module. However, only two modules (5 and 6) show as connected (refer to Figure 7-33 on page 177), and the iSCSI host has no connection to module 9.
12. At this stage, the setup of the new FC and iSCSI hosts on the XIV Storage System is complete. Additional OS-dependent configuration steps must still be performed; they are described in the respective OS chapters.
13. It is a best practice to document all changes performed on a production system.
C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin host_define host=ExampleFChost
Command executed successfully.

C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin host_define host=ExampleiSCSIhost
Command executed successfully.

4. Host access to LUNs is granted depending on the host adapter ID. For an FC connection, the host adapter ID is the FC HBA WWPN. For an iSCSI connection, the host adapter ID is the IQN of the host network interface card (NIC) with the software initiator or of the iSCSI HBA. In Example 7-10, the WWPNs of the FC host's HBA1 and HBA2 are added with the host_add_port command by specifying an fcaddress.
Example 7-10 Create FC port and add to host definition
C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin host_add_port host=ExampleFChost fcaddress=210100E08BAFA29E
Command executed successfully.

C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin host_add_port host=ExampleFChost fcaddress=210000E08B8FA29E
Command executed successfully.
In Example 7-11, the IQN of the iSCSI host is added. Note that this is the same host_add_port command, but with the iscsi_name parameter instead of fcaddress.
Example 7-11 Create iSCSI port and add to the host definition
C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin host_add_port host=ExampleiSCSIhost iscsi_name=iqn.2000-04.com.qlogic:host.ibm.com
Command executed successfully.

5. The final configuration step is to map volumes to the host definition. Note that for a cluster, the volumes are mapped to the cluster host definition. Again, there is no difference between FC and iSCSI mapping to a host. Both commands are shown in Example 7-12.
Example 7-12 XCLI example: Map volumes to hosts
C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin map_vol host=ExampleFChost vol=ExampleFChost lun=1
Command executed successfully.

C:\>xcli -c "XIV V10.0 MN00050" -u admin -p adminadmin map_vol host=ExampleiSCSIhost vol=ExampleiSCSIhost lun=1
Command executed successfully.

6. To complete the example, power up the server and check the host connectivity status from the XIV Storage System point of view. Example 7-13 shows the output for both hosts.
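The XCLI steps above (host_define, host_add_port, map_vol) lend themselves to scripting. The sketch below only assembles the command lines shown in the examples so that the whole sequence can be reviewed before it is run; actually executing them would require the xcli binary, valid credentials, and a reachable system.

```python
# Sketch: assemble the XCLI host-definition sequence as argument lists.
# Credentials and system name are the ones used throughout this example.

SYSTEM = ["xcli", "-c", "XIV V10.0 MN00050", "-u", "admin", "-p", "adminadmin"]

def xcli_cmd(*args):
    """Prefix an XCLI subcommand with the connection arguments."""
    return SYSTEM + list(args)

sequence = [
    xcli_cmd("host_define", "host=ExampleFChost"),
    xcli_cmd("host_add_port", "host=ExampleFChost", "fcaddress=210100E08BAFA29E"),
    xcli_cmd("host_add_port", "host=ExampleFChost", "fcaddress=210000E08B8FA29E"),
    xcli_cmd("map_vol", "host=ExampleFChost", "vol=ExampleFChost", "lun=1"),
]

for cmd in sequence:
    print(" ".join(cmd))
```

From here, each argument list could be handed to subprocess.run() on a management workstation where the XCLI is installed.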
Example 7-13 XCLI example: Check host connectivity
In Example 7-13, there is only one path per host FC HBA instead of the expected six paths per host FC HBA, which was intentional in our setup for illustration purposes. Problem determination for a situation like this one starts at the FC fabric zoning and the FC cabling of the XIV Storage System. Similarly, the iSCSI host in our example has one connection missing, to module 9. To investigate, use diagnostic commands on the XIV Storage System to trace the route, check the network cabling to module 9, and check the network switch configuration for a possible firewall or virtual local area network (VLAN) misconfiguration.
7. Setup of the new FC and iSCSI hosts on the XIV Storage System is now complete. The remaining steps are OS dependent and are described in the respective OS chapters.
8. It is a best practice to document changes performed on a production system.
Chapter 8.
Multi-path support
Microsoft provides a multi-path framework and development kit called Microsoft Multi-path I/O (MPIO). The driver development kit allows storage vendors to create Device Specific Modules (DSMs) for MPIO and to build interoperable multi-path solutions that integrate tightly with the Microsoft Windows family of products. MPIO has been extended by IBM to support the XIV Storage System. The current version of the MPIO framework (with the XIV extension) is 1.21, and it also requires the XIV Device Specific Module (DSM) 11.21.

MPIO ensures high availability of data by utilizing multiple paths between the server on which the application executes and the storage where the data is physically stored. Microsoft MPIO support allows the initiator to establish multiple sessions with the same target and aggregate the duplicate devices into a single device exposed to Windows.
In other words, the MPIO framework provides an active/active policy for a Windows host system to connect to the XIV Storage System, and it can handle I/O on any path at any time.
Upgrading an existing MPIO driver requires a complete removal of the old version, followed by a regular installation of the new one.
During the upgrade process, we recommend that you remove all volumes already mapped to the system. As usual, we also recommend that you back up your system configuration prior to the upgrade.

Important: Make sure to use the latest supported MPIO version. At the time of writing this book, it is Version 1.21.

These are the steps of the upgrade process:
1. Download the latest MPIO installer package for the XIV Storage System.
2. Extract the compressed package to a working directory. In our case, it is C:\Program Files\XIV\mpio\1.21\
3. Open a command prompt and navigate to this directory. Make sure that the directory contains the latest MPIO installer files.
4. Issue the following command to remove any previous version of the MPIO framework:

   install -u C:\Program Files\XIV\mpio\1.21\dsmxiv.inf Root\DSMXIV
5. Reboot the server. 6. Verify in Device Manager that Multi-Path Support has been removed from the SCSI and RAID controllers section. If for some reason it is still listed, make sure that the Windows server is not attached to any external storage regardless of the type of connection (FC, iSCSI, SAS, or others). Right-click the device driver and remove it manually. Finally, reboot the server again. 7. Proceed with the normal MPIO installation procedure as explained in Installing MPIO multi-path driver on page 181.
For 2 Gbps FC and high-throughput workloads, each host HBA can be zoned to all six Interface Modules for a total path count of 12. After the connections are in place and the zoning is established, you will see the host worldwide names (WWNs) in the XIV Storage System. To check a specific WWN, use a command as shown in the following example:

xcli -c Redbook fc_connectivity_list | find "210000E08B0B941D"
1:FC_Port:7:2   1   210000E08B0B941D   yes
1:FC_Port:9:2   1   210000E08B0B941D   yes
In the XIV GUI, these ports will be selectable from the drop-down list in the Host Port Definition window, as shown in Figure 5-32 on page 116. For detailed descriptions of host definition and volume mapping, refer to 5.5, "Host definition and mappings" on page 113. After the host definition and volume mapping have been done in the XIV Storage System, issue a Rescan Disk command in the Windows Computer Management window (right-click Disk Management). You get a list of the attached XIV volumes, as shown in Figure 8-4.
The number of IBM 2810XIV SCSI Disk Devices depends on the number of paths from the host to the XIV Storage System. The mapped volume can be seen as illustrated in Figure 8-5 on page 185.
Figure 8-5 Mapped volume appears as a new disk in Windows Server 2003
2. The Installation Options dialog is displayed. Select Initiator Service and software initiator. Refer to Figure 8-7. Important: Do not select MPIO at this time. The MPIO framework is packaged together with a Device Specific Module for XIV Storage System.
3. Read the license agreement, select I Agree and click Next. Refer to Figure 8-8 on page 187.
4. The Microsoft iSCSI software initiator is now being installed. When the process is complete, click Finish as shown in Figure 8-9.
5. To verify the new device installation, check the status in the Device Manager window, under SCSI and RAID controllers. Refer to Figure 8-10 on page 188.
For the detailed description of host definition and volume mapping in XIV, refer to 5.5, "Host definition and mappings" on page 113.
2. In the iSCSI Initiator Properties window, select the Discovery tab and click Add in the Target Portals pane. Use one of your system's iSCSI IP addresses, as defined during the system's installation. To view IP addresses for the iSCSI ports in the XIV GUI, move the mouse cursor over the Hosts and LUNs icons in the main XIV window and select iSCSI Connectivity from the Host and LUNs menu, as shown in Figure 8-12.
Alternatively, you can issue the Extended Command Line Interface (XCLI) command as shown in Example 8-2.
Example 8-2 List iSCSI interfaces

c:\>xcli -c Redbook ipinterface_list | find "iSCSI"
iSCSI_M8_P1   iSCSI   9.155.56.80   255.255.255.0   9.155.56.1   4500   1:Module:8   1
iSCSI_M7_P1   iSCSI   9.155.56.81   255.255.255.0   9.155.56.1   4500   1:Module:7   1
You can see that the iSCSI addresses used in our test environment are 9.155.56.80 and 9.155.56.81. 3. If your host is equipped with more than one Ethernet adapter (Network Interface Card (NIC)), click Advanced (in the Discovery tab of the iSCSI Initiator Properties window) and select iSCSI Software Initiator and your preferred IP address. To add the Target Portal, click OK as shown in Figure 8-13.
4. The XIV Storage System is now being discovered by the initiator. Change to the Targets tab in the iSCSI Initiator Properties window to see the discovered XIV Storage System. Refer to Figure 8-14 on page 190. The storage most likely shows as inactive status.
To activate the connection, click Log On. 5. In the Log On to Target pop-up window, select Enable multi-path as shown in Figure 8-15. You can select the first check box too if you want to automatically restore this connection at the system boot time. Click Advanced.
6. The Advanced Settings window is displayed. Select the Microsoft iSCSI initiator from the Local adapter drop-down. In the Source IP drop-down, click the IP address that is connected to the first iSCSI LAN, and in the Target Portal, select the first available IP address of the XIV Storage System as illustrated in Figure 8-16 on page 191. Click OK. You are returned to the parent window. Click OK again.
7. The iSCSI Target connection status now shows as active and connected. Make sure that the target is in the Connected status. The redundant paths are not yet configured. To do so, repeat this process for all IP addresses in your system. In other words, establish connection sessions to all of the desired XIV iSCSI interfaces from all of your desired source IP addresses. After the iSCSI sessions are created to each target portal, you can see details of the sessions. Select the iSCSI target in the target list (Figure 8-14 on page 190), and click Details to verify the sessions of the connection. Refer to Figure 8-17.
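The "repeat for all IP addresses" rule in step 7 amounts to creating one iSCSI session per (source IP, target portal) pair. The sketch below simply enumerates those pairs for the addresses used in this chapter; it is an illustration of the session count, not a configuration tool.

```python
# Sketch: enumerate the iSCSI sessions to configure — one per
# (host source IP, XIV target portal) combination.

from itertools import product

source_ips = ["192.168.1.7", "192.168.1.8"]        # host NIC/HBA addresses
target_portals = ["9.155.56.80", "9.155.56.81"]    # XIV iSCSI interfaces (Example 8-2)

sessions = list(product(source_ips, target_portals))
for src, tgt in sessions:
    print(f"log on from {src} to {tgt}")

print(len(sessions))  # → 4
```

With two host NICs and two XIV iSCSI interfaces, four sessions give full path redundancy; adding a third XIV interface raises the count to six.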
Depending on your environment, numerous sessions can appear, according to what you have configured. If you have already mapped volumes to the host system, you will see them under the Devices tab. If no volumes are mapped to this host yet, you can allocate them now. Another way to verify your allocated disks is to open the Windows Device Manager as shown in Figure 8-18.
Figure 8-18 Windows Device Manager with XIV disks connected through iSCSI
Interoperability
The IBM XIV Storage System supports different versions of the AIX operating system, through either Fibre Channel (FC) or iSCSI connectivity. Various FC Host Bus Adapters (HBAs) are supported. Supported IBM-branded Emulex HBAs and IBM HBA firmware versions can be found at:
http://www.software.ibm.com/webapp/set2/firmware/gjs
For the supported versions of AIX and their hardware environments, refer to the latest XIV interoperability information at the following Web site:
http://www.ibm.com/systems/support/storage/config/ssic/displayesssearchwithoutjs.wss?start_over=yes
Prerequisites
If the current AIX operating system level installed on your system is not a level that is compatible with XIV, you must upgrade the system prior to attaching the XIV storage. To determine the maintenance package or technology level currently installed on your system, use the oslevel command as shown in Example 8-3.
Example 8-3 Determine current AIX version and maintenance level
# oslevel -g
Fileset                     Actual Level        Maintenance Level
-----------------------------------------------------------------------------
bos.rte                     5.3.8.0             5.3.0.0

At the time of writing this book, a binary AIX patch is required prior to the installation, depending on the current oslevel. These patches are shown in Table 8-1.
Table 8-1 Details of AIX patches

AIX level             Patch
Before AIX 5.3 TL7    Not supported
AIX 5.3 TL7           IZ28969
AIX 5.3 TL8           IZ28970
AIX 5.3 TL9           IZ28047
After AIX 5.3 TL9     iFIX/APAR is not required
AIX 6.1 TL0           IZ28002
AIX 6.1 TL1           IZ28004
AIX 6.1 TL2           IZ28079
After AIX 6.1 TL2     iFIX/APAR is not required
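Table 8-1 can be encoded as a small lookup so that a preparation script can report the required iFIX/APAR for a given AIX level. The entries below mirror the table; the helper function and the level-string format are our own illustration.

```python
# Sketch: Table 8-1 as a lookup. Levels not in the table fall back to
# "Not supported" (matching the "Before AIX 5.3 TL7" row); the two "After"
# ranges need no patch.

AIX_PATCHES = {
    "5.3 TL7": "IZ28969",
    "5.3 TL8": "IZ28970",
    "5.3 TL9": "IZ28047",
    "6.1 TL0": "IZ28002",
    "6.1 TL1": "IZ28004",
    "6.1 TL2": "IZ28079",
}

def required_patch(level):
    """Return the APAR name, None if no patch is needed, or 'Not supported'."""
    if level in ("after 5.3 TL9", "after 6.1 TL2"):
        return None  # iFIX/APAR is not required
    return AIX_PATCHES.get(level, "Not supported")

print(required_patch("5.3 TL8"))  # → IZ28970
```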
The following best practices documents describe system planning and support procedures for the AIX operating system: http://www.software.ibm.com/webapp/set2/sas/f/best/aix_service_strategy_v3.pdf Download AIX upgrades from the IBM Fix Central Web site: http://www.ibm.com/eserver/support/fixes/fixcentral/main/pseries/aix Before further configuring your host system or the XIV Storage System, make sure that the physical connectivity between the XIV and the POWER system is properly established. In addition to proper cabling, if using FC switched connections, you must ensure a correct zoning (using the WWPN numbers of the AIX host). Refer to 7.2, Fibre Channel (FC) connectivity on page 152 for the recommended cabling and zoning setup.
# lsdev -Cc adapter | grep fcs
fcs0 Available 02-00 4Gb FC PCI Express Adapter (df1000fe)
fcs1 Available 00-00 4Gb FC PCI Express Adapter (df1000fe)
fcs2 Available 00-01 4Gb FC PCI Express Adapter (df1000fe)
This example shows that, in our case, we have three FC ports. Another useful command that is shown in Example 8-5 returns not just the ports, but also where the Fibre Channel adapters reside in the system (in which PCI slot). This command can be used to physically identify in what slot a specific adapter is placed.
Example 8-5 Locating FC adapters
# lsslot -c pci | grep fcs U789D.001.DQD73N0-P1-C2 PCI-E capable, Rev 1 slot with 8x lanes U789D.001.DQD73N0-P1-C6 PCI-E capable, Rev 1 slot with 8x lanes
To obtain the Worldwide Port Name (WWPN) of each of the POWER system FC adapters, you can use the lscfg command, as shown in Example 8-6 on page 195.
Example 8-6 Finding Fibre Channel adapter WWN

# lscfg -vl fcs0
  fcs0             U1.13-P1-I1/Q1    FC Adapter
Part Number.................00P4494
EC Level....................A
Serial Number...............1A31005059
Manufacturer................001A
Feature Code/Marketing ID...2765
FRU Number..................00P4495
Network Address.............10000000C93318D6
ROS Level and ID............02C03951
Device Specific.(Z0)........2002606D
Device Specific.(Z1)........00000000
Device Specific.(Z2)........00000000
Device Specific.(Z3)........03000909
Device Specific.(Z4)........FF401210
Device Specific.(Z5)........02C03951
Device Specific.(Z6)........06433951
Device Specific.(Z7)........07433951
Device Specific.(Z8)........20000000C93318D6
Device Specific.(Z9)........CS3.91A1
Device Specific.(ZA)........C1D3.91A1
Device Specific.(ZB)........C2D3.91A1
Device Specific.(YL)........U1.13-P1-I1/Q1
You can also print the WWPN of an HBA directly by issuing this command:

lscfg -vl <fcs#> | grep Network

The # stands for the instance of the FC HBA that you want to query. After you have identified the FC adapter in the system, use the lsattr command to list its attributes. Refer to Example 8-7.
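The same extraction that the "| grep Network" pipe performs can be done in a script: pull the hexadecimal WWPN out of the "Network Address" line of the lscfg output. The sample text below is taken from Example 8-6; the regular expression is our own illustration.

```python
# Sketch: extract the WWPN from "lscfg -vl fcsN" output.

import re

lscfg_output = """\
Part Number.................00P4494
Network Address.............10000000C93318D6
ROS Level and ID............02C03951
"""

def extract_wwpn(text):
    """Return the hex WWPN after the dotted 'Network Address' label, or None."""
    m = re.search(r"Network Address\.+([0-9A-F]+)", text)
    return m.group(1) if m else None

print(extract_wwpn(lscfg_output))  # → 10000000C93318D6
```

The extracted value is exactly what is entered (or selected) as the port name in the XIV Host Port Definition window.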
Example 8-7 Listing FC adapter attributes in AIX operating system
# lsattr -El fcs0
bus_intr_lvl                  Bus interrupt level                                  False
bus_io_addr    0xffc00        Bus I/O address                                      False
bus_mem_addr   0xffebf000     Bus memory address                                   False
init_link      al             INIT Link flags                                      True
intr_msi_1     66085          Bus interrupt level                                  False
intr_priority  3              Interrupt priority                                   False
lg_term_dma    0x800000       Long term DMA                                        True
max_xfer_size  0x100000       Maximum Transfer Size                                True
num_cmd_elems  200            Maximum number of COMMANDS to queue to the adapter   True
pref_alpa      0x1            Preferred AL_PA                                      True
sw_fc_class    2              FC Class for Fabric                                  True
At this point, you can define the AIX host system on the XIV Storage and assign FC ports for the WWPNs. If the FC connection was correctly done, the zoning enabled, and the FC adapters are in an available state on the host, these ports will be selectable from the drop-down list in the Host Port Definition window of the XIV Graphical User Interface. Refer to Figure 5-32 on page 116.
For the detailed description of host definition and volume mapping, refer to 5.5, Host definition and mappings on page 113.
It might happen that disk drives with an Other FC SCSI Disk Drive description appear in the system if FC discovery (cfgmgr) was run before the previously mentioned package installation was completed. In that case, remove these drives and run the discovery procedure again. The removal and running the cfgmgr procedure are illustrated in Example 8-9.
Example 8-9 Cleanup and reconfiguration
# lsdev -Cc disk
hdisk0 Available Virtual SCSI Disk Drive
hdisk2 Available 20-58-02 Other FC SCSI Disk Drive
hdisk3 Available 20-58-02 Other FC SCSI Disk Drive
# rmdev -dl hdisk2
hdisk2 deleted
# rmdev -dl hdisk3
hdisk3 deleted
# cfgmgr -l fcs0
# cfgmgr -l fcs1

Now, when we list the disks, we see the correct number of disks from the storage, labeled as XIV disks, as shown in Example 8-10.
Example 8-10 XIV labeled FC disks
# lsdev -Cc disk
hdisk0 Available Virtual SCSI Disk Drive
hdisk1 Available 00-00-02 IBM 2810XIV Fibre Channel Disk
hdisk2 Available 00-00-02 IBM 2810XIV Fibre Channel Disk
The management of MPIO devices is described in the online guide System Management Guide: Operating System and Devices for AIX 5L from the AIX documentation Web site at:
http://publib16.boulder.ibm.com/pseries/en_US/aixbman/baseadmn/manage_mpio.htm
Consequently, we recommend that you switch the algorithm to fail_over and assign a greater queue_depth of 32 (a typical value).

Tip: Use the fail_over algorithm with a queue depth of 32. Then, load balance the I/Os across the FC adapters and paths by setting the path priority attribute for each LUN so that 1/nth of the LUNs are assigned to each of the n FC paths.

A fail-over algorithm can be used in such a way that it is superior to a load balancing algorithm (such as round_robin). First, consider that any load balancing algorithm must consume CPU and memory resources to determine the best path to use. Second, it is possible to set up fail-over LUNs so that the load is balanced across the available FC adapters.

Consider an example with two FC adapters. Assume that we correctly lay out our data so that the I/Os are balanced across the LUNs, which is usually a best practice. Then, if we assign half the LUNs to FC adapter A and half to FC adapter B, the I/Os are evenly balanced across the adapters. There might be times when I/Os to one LUN on adapter A are higher than to a LUN on adapter B. The question to ask then is: will the additional load on the adapter have a significant impact on I/O latency? In most cases, because the FC adapters are capable of handling more than 30,000 IOPS, we are unlikely to bottleneck at the adapter and add significant latency to the I/O.

There is also a priority attribute for paths, which can be used to specify a preference for the path used for I/Os (as of this writing, the lspath man page incorrectly refers to this as the weight attribute). The effect of the priority attribute depends on whether the hdisk algorithm attribute is set to fail_over or round_robin:
- For algorithm=fail_over, the path with the higher priority handles all the I/Os; if that path fails, the other path is used. After a path failure and recovery, if you have APAR IY79741 installed, I/Os are redirected down the path with the highest priority; otherwise, if you want the I/Os to go down the primary path, you must use chpath to disable the secondary path and then re-enable it. If the priority attribute is the same for all paths, the first path listed by lspath -Hl <hdisk> is the primary path. So, you can set the primary path by setting its priority value to 1, the next path's priority (in case of path failure) to 2, and so on.
- For algorithm=round_robin, if the priority attributes are the same, I/Os go down each path equally. If you set pathA's priority to 1 and pathB's to 255, then for every I/O going down pathA, 255 I/Os are sent down pathB.

To change the path priority of an MPIO device, use the chpath command. Refer to Example 8-14 on page 199 for an illustration.
The lspath command can also be used to read the attributes of a given path to an MPIO-capable device, as shown in Example 8-13. It is also good to know that the <connection> information is either "<SCSI ID>,<LUN ID>" for SCSI devices (for example, 5,0) or "<WWN>,<LUN ID>" for FC devices.
Example 8-13 The lspath command reads attributes of the 0 path for hdisk2
# lspath -AHE -l hdisk2 -p fscsi0 -w "5001738000cb0181,0"
attribute  value               description   user_settable
scsi_id    0x120d00            N/A           False
node_name  0x5001738000cb0000  FC Node Name  False
priority   1                   Priority      True

The chpath command is used to perform change operations on a specific path. It can change either the operational status or the tunable attributes associated with a path; it cannot perform both types of operations in a single invocation. Example 8-14 illustrates the use of the chpath command with an XIV Storage System. It sets the primary path to fscsi1 using the first path listed (there are two paths from the switch to the storage for this adapter). Then, for the next disk, we set the priorities to 4, 1, 2, and 3, respectively. Assuming that the I/Os are relatively balanced across the hdisks, this balances the I/Os evenly across the paths.
Example 8-14 The chpath command
# lspath -l hdisk34 -F"status parent connection"
Enabled fscsi1 5001738000230190,21000000000000
Enabled fscsi1 5001738000230180,21000000000000
Enabled fscsi3 5001738000230190,21000000000000
Enabled fscsi3 5001738000230180,21000000000000
# chpath -l hdisk34 -p fscsi1 -w 5001738000230180,21000000000000 -a priority=2
path Changed
# chpath -l hdisk34 -p fscsi3 -w 5001738000230190,21000000000000 -a priority=3
path Changed
# chpath -l hdisk34 -p fscsi3 -w 5001738000230180,21000000000000 -a priority=4
path Changed

The rmpath command unconfigures or undefines (or both) one or more paths to a target device. It is not possible to unconfigure (undefine) the last path to a target device using the rmpath command; the only way to do that is to unconfigure the device itself (for example, with the rmdev command). Refer to the man pages of the MPIO commands for more information.
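The rotation applied in Example 8-14 generalizes: for hdisk i across n paths, give path (i mod n) priority 1, the next path priority 2, and so on, so that each hdisk has a different primary path. The sketch below computes those assignments; the path names are illustrative, and on AIX the resulting priorities would be applied with chpath as shown above.

```python
# Sketch: rotate fail_over path priorities across hdisks so that primary
# paths (priority 1) are spread evenly over all available paths.

def priorities_for_hdisk(index, paths):
    """Map each path to its priority for the hdisk at position `index`."""
    n = len(paths)
    return {paths[(index + k) % n]: k + 1 for k in range(n)}

# Illustrative names for the four (adapter, target-port) paths of Example 8-14.
paths = ["fscsi1-a", "fscsi1-b", "fscsi3-a", "fscsi3-b"]

print(priorities_for_hdisk(0, paths))  # first hdisk: priorities 1, 2, 3, 4
print(priorities_for_hdisk(1, paths))  # next hdisk: priorities 4, 1, 2, 3
```

For hdisk index 1, the first path receives priority 4 and the others 1, 2, 3 — the same "4, 1, 2, 3" rotation the example describes.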
To make sure that your system is equipped with the required filesets, run the lslpp command as shown in Example 8-15. We used the AIX Version 5.3 operating system with Technology level 9 in our illustrations.
Example 8-15 Verifying installed iSCSI filesets in AIX
# lslpp -la "*.iscsi*"
  Fileset                      Level  State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  devices.common.IBM.iscsi.rte
                            5.3.0.60  COMMITTED  Common iSCSI Files
                             5.3.8.0  COMMITTED  Common iSCSI Files
  devices.iscsi.disk.rte    5.3.0.60  COMMITTED  iSCSI Disk Software
                             5.3.7.0  COMMITTED  iSCSI Disk Software
  devices.iscsi.tape.rte    5.3.0.30  COMMITTED  iSCSI Tape Software
  devices.iscsi_sw.rte      5.3.0.60  COMMITTED  iSCSI Software Device Driver
                             5.3.8.0  COMMITTED  iSCSI Software Device Driver
Path: /etc/objrepos
  devices.common.IBM.iscsi.rte
                            5.3.0.60  COMMITTED  Common iSCSI Files
                             5.3.8.0  COMMITTED  Common iSCSI Files
  devices.iscsi_sw.rte      5.3.0.60  COMMITTED  iSCSI Software Device Driver
                             5.3.8.0  COMMITTED  iSCSI Software Device Driver

At the time of writing, only the AIX iSCSI software initiator is supported for connecting to the XIV Storage System.
Volume Groups
To avoid configuration problems and error log entries when you create Volume Groups using iSCSI devices, follow these guidelines:
- Configure Volume Groups that are created using iSCSI devices to be in an inactive state after reboot. After the iSCSI devices are configured, manually activate the iSCSI-backed Volume Groups. Then, mount any associated file systems. Volume Groups are activated during a different boot phase than the iSCSI software driver; for this reason, it is not possible to activate iSCSI Volume Groups during the boot process.
- Do not span Volume Groups across non-iSCSI devices.
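The manual activation sequence (varyonvg, then mount) can be wrapped in a small script run after the iSCSI initiator is up. The sketch below is ours: the volume group name iscsivg and the mount point are hypothetical, varyonvg and mount are the AIX commands, and both are made overridable so the sequence can be exercised as a dry run on a non-AIX host:

```shell
# Activate an iSCSI-backed volume group and mount its file system
# after boot. VARYONVG/MOUNT default to the real AIX commands but
# can be overridden (for example with echo) for a dry run.
activate_iscsi_vg() {
    vg=$1 mnt=$2
    ${VARYONVG:-varyonvg} "$vg" || return 1   # bring the VG online first
    ${MOUNT:-mount} "$mnt"                    # then mount its file system
}

# Dry run: print the commands that would be executed
VARYONVG="echo varyonvg" MOUNT="echo mount" activate_iscsi_vg iscsivg /iscsi_fs
```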
I/O failures
To avoid I/O failures:
- If connectivity to iSCSI target devices is lost, I/O failures occur. To prevent I/O failures and file system corruption, stop all I/O activity and unmount iSCSI-backed file systems before doing anything that will cause long-term loss of connectivity to the active iSCSI targets.
- If a loss of connectivity to iSCSI targets occurs while applications are attempting I/O activities with iSCSI devices, I/O errors will eventually occur. It might not be possible to unmount iSCSI-backed file systems, because the underlying iSCSI device stays busy.
- File system maintenance must be performed if I/O failures occur due to loss of connectivity to active iSCSI targets. To do file system maintenance, run the fsck command.
# lsattr -El iscsi0 | grep initiator_name
initiator_name iqn.com.ibm.de.mainz.p6-570-lab-2v17.hostid.099b325a iSCSI Initiator Name True

6. The Maximum Targets Allowed field corresponds to the maximum number of iSCSI targets that can be configured. If you reduce this number, you also reduce the amount of network memory pre-allocated for the iSCSI protocol driver during configuration.

After the software initiator is configured, define the iSCSI targets that will be accessed by the iSCSI software initiator. To specify those targets:
1. First, determine one of the iSCSI IP addresses of the XIV Storage System. To get that information, select iSCSI Connectivity from the Host and LUNs menu as shown in Figure 8-20 on page 202, or issue the XCLI command shown in Example 8-17.
Example 8-17 List iSCSI interfaces
c:\>xcli -c Redbook ipinterface_list | find "iSCSI"
iSCSI_M8_P1  iSCSI  9.155.56.80  255.255.255.0  9.155.56.1
iSCSI_M7_P1  iSCSI  9.155.56.81  255.255.255.0  9.155.56.1
You can see that our current iSCSI addresses are 9.155.56.80 and 9.155.56.81.
2. The next step is to find the iSCSI qualified name (IQN) of the XIV Storage System. To get this information, navigate to the basic system view in the XIV GUI, right-click the XIV Storage box itself, and select Properties. The System Properties window appears as shown in Figure 8-21.
If you are using XCLI, issue the config_get command. Refer to Example 8-18.
Example 8-18 The config_get command in XCLI
C:\>xcli -c ESP config_get | find "iscsi"
iscsi_name=iqn.2005-10.com.xivstorage:000203

3. Go back to the AIX system and edit the /etc/iscsi/targets file to include the iSCSI targets needed during device configuration.

Note: The iSCSI targets file defines the name and location of the iSCSI targets that the iSCSI software initiator attempts to access. This file is read any time that the iSCSI software initiator driver is loaded. Each uncommented line in the file represents an iSCSI target. iSCSI device configuration requires that the iSCSI targets can be reached through a properly configured network interface. Although the iSCSI software initiator can work using a 10/100 Ethernet LAN, it is designed for use with a gigabit Ethernet network that is separate from other network traffic.
Include your specific connection information in the targets file as shown in Example 8-19. Insert a HostName, PortNumber, and iSCSIName similar to what is shown in this example.
Example 8-19 Inserting connection information into /etc/iscsi/targets file in AIX operating system
9.155.56.80 3260 iqn.2005-10.com.xivstorage:000203

4. After editing the /etc/iscsi/targets file, enter the following command at the AIX prompt:
cfgmgr -l iscsi0
This command reconfigures the software initiator driver, causes the driver to attempt to communicate with the targets listed in the /etc/iscsi/targets file, and defines a new hdisk for each LUN found on the targets.

Note: If the appropriate disks are not defined, review the configuration of the initiator, the target, and any iSCSI gateways to ensure correctness. Then, rerun the cfgmgr command.

If you want to further configure parameters for iSCSI software initiator devices, use SMIT.
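Each targets entry is just "<HostName> <PortNumber> <iSCSIName>" on one line. A small sketch of ours that assembles and appends such a line without duplicating existing entries; the file path is parameterized so the sketch does not touch a live /etc/iscsi/targets:

```shell
# Append an iSCSI target entry ("HostName PortNumber iSCSIName")
# to a targets file, skipping exact duplicates.
add_iscsi_target() {
    file=$1 host=$2 port=$3 iqn=$4
    line="$host $port $iqn"
    # -x: whole-line match, -F: fixed string (no regex)
    grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

tgt=$(mktemp)
add_iscsi_target "$tgt" 9.155.56.80 3260 iqn.2005-10.com.xivstorage:000203
add_iscsi_target "$tgt" 9.155.56.80 3260 iqn.2005-10.com.xivstorage:000203  # duplicate, ignored
cat "$tgt"    # prints the single entry once
rm -f "$tgt"
```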
to the first place in the mapping view and it will replace the management LUN to your volume and assign the zero value to it. To see the mapping method, refer to 5.5.1, Managing hosts and mappings with XIV GUI on page 113.
8.3 Linux
There are several organizations (distributors) that bundle the Linux kernel, tools, and applications to form a distribution, a package that can be downloaded or purchased and installed on a computer. Several of these distributions are commercial; others are not. The Linux kernel, along with the tools and software needed to run an operating system, is maintained by a loosely organized community of thousands of (mostly) volunteer programmers.
8.3.1 Support issues that distinguish Linux from other operating systems
Linux differs from proprietary operating systems in many ways:
- No single person or organization can be held responsible or called for support.
- Depending on the target group, the distributions differ widely in the kind of support that is available.
- Linux is available for almost all computer architectures.
- Linux is rapidly changing.
All of these factors make it difficult to promise and provide generic support for Linux. As a consequence, IBM has decided on a support strategy that limits the uncertainty and the amount of testing. IBM only supports the major Linux distributions that are targeted at enterprise clients:
- Red Hat Enterprise Linux
- SUSE Linux Enterprise Server
These distributions have release cycles of about one year, are maintained for five years, and require you to sign a support contract with the distributor. They also have a schedule for regular updates. These factors mitigate the issues listed previously. The limited number of supported distributions also allows IBM to work closely with the vendors to ensure interoperability and support. Details about the supported Linux distributions and supported SAN boot environments can be found in the System Storage Interoperation Center (SSIC):
http://www-03.ibm.com/systems/support/storage/config/ssic/displayesssearchwithoutjs.wss?start_over=yes
1. Install the HBAs on the Linux server and configure the options according to the HBA manufacturer's instructions. Check the Fibre Channel physical connection from your host to the XIV Storage System.
2. Make configuration changes and install the additional packages required on the Linux host to support the XIV Storage System.
3. Configure Device-Mapper multipathing.
4. Configure the host and volumes, and define host mappings in the XIV Storage System.
5. Reboot the Linux server to discover the volumes created on the XIV Storage System.

The environment used to prepare the examples in the remainder of this section consisted of an IBM System x x345 server with QLogic QLA2340 HBAs, running Red Hat Enterprise Linux 5.2.
[root@x345-tic-30 ~]# tar -xvzf qla2xxx-v8.02.14_01-dist.tgz
qlogic/
qlogic/drvrsetup
qlogic/libinstall
qlogic/libremove
qlogic/qla2xxx-src-v8.02.14_01.tar.gz
qlogic/qlapi-v4.00build12-rel.tgz
qlogic/README.qla2xxx
[root@x345-tic-30 ~]# cd qlogic/
[root@x345-tic-30 qlogic]# ./drvrsetup
Extracting QLogic driver source...
Done.
[root@x345-tic-30 qlogic]# cd qla2xxx-8.02.14/
[root@x345-tic-30 qla2xxx-8.02.14]# ./extras/build.sh install
QLA2XXX -- Building the qla2xxx driver, please wait...
Installing intermodule.ko in /lib/modules/2.6.18-92.el5/kernel/kernel/
QLA2XXX -- Build done.
QLA2XXX -- Installing the qla2xxx modules to /lib/modules/2.6.18-92.el5/kernel/drivers/scsi/qla2xxx/...

Set the queue depth to 127, disable the failover mode for the driver, and set the timeout for a PORT-DOWN status before returning I/O back to the OS to 1 in /etc/modprobe.conf. Refer to Example 8-21 for details.
Example 8-21 Modification of /etc/modprobe.conf for the XIV
[root@x345-tic-30 qla2xxx-8.02.14]# cat >> /etc/modprobe.conf << EOF
> options qla2xxx ql2xmaxqdepth=127
> options qla2xxx qlport_down_retry=1
> options qla2xxx ql2xfailover=0
> EOF
[root@x345-tic-30 qla2xxx-8.02.14]# cat /etc/modprobe.conf
alias eth0 e1000
alias eth1 e1000
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptspi
alias scsi_hostadapter2 qla2xxx
install qla2xxx /sbin/modprobe qla2xxx_conf; /sbin/modprobe --ignore-install qla2xxx
remove qla2xxx /sbin/modprobe -r --first-time --ignore-remove qla2xxx && { /sbin/modprobe -r --ignore-remove qla2xxx_conf; }
alias qla2100 qla2xxx
alias qla2200 qla2xxx
alias qla2300 qla2xxx
alias qla2322 qla2xxx
alias qla2400 qla2xxx
options qla2xxx ql2xmaxqdepth=127
options qla2xxx qlport_down_retry=1
options qla2xxx ql2xfailover=0

We now have to build a new RAM disk image, so that the driver is loaded by the operating system loader after a boot. Next, we reboot the Linux host as shown in Example 8-22.
Example 8-22 Build a new ram disk image
[root@x345-tic-30 qla2xxx-8.02.14]# cd /boot/
[root@x345-tic-30 boot]# cp -f initrd-2.6.18-92.el5.img initrd-2.6.18-92.el5.img.bak
[root@x345-tic-30 boot]# mkinitrd -f initrd-2.6.18-92.el5.img 2.6.18-92.el5
[root@x345-tic-30 boot]# reboot

Broadcast message from root (pts/1) (Tue Aug  5 13:57:28 2008):

The system is going down for reboot NOW!
[root@x345-tic-30 ~]# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#       enforcing - SELinux security policy is enforced.
#       permissive - SELinux prints warnings instead of enforcing.
#       disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
#       targeted - Only targeted network daemons are protected.
#       strict - Full SELinux protection.
SELINUXTYPE=targeted

Download the udev package from the following Web site:
IBM XIV Storage System: Concepts, Architecture, and Usage
https://launchpad.net/udev/main/

Compile the udev package and install the scsi_id_t10 package as illustrated in Example 8-24.
Example 8-24 Compilation of udev and installation of scsi_id_t10
[root@x345-tic-30 ~]# tar jxvf udev-095.tar.bz2
.
.
[root@x345-tic-30 ~]# cd udev-095
[root@x345-tic-30 udev-095]# cd extras/scsi_id
[root@x345-tic-30 scsi_id]# patch -l -p 1 scsi_serial.c << EOF
> 40a41
> > { SCSI_ID_T10_VENDOR, SCSI_ID_NAA_DONT_CARE, SCSI_ID_ASCII },
> 59d59
> < { SCSI_ID_T10_VENDOR, SCSI_ID_NAA_DONT_CARE, SCSI_ID_ASCII },
> EOF
patching file scsi_serial.c
Hunk #2 FAILED at 60.
1 out of 2 hunks FAILED -- saving rejects to file scsi_serial.c.rej
[root@x345-tic-30 scsi_id]# patch -l -p 1 scsi_serial.c << EOF
> 40a41
> > { SCSI_ID_T10_VENDOR, SCSI_ID_NAA_DONT_CARE, SCSI_ID_ASCII },
> 59d59
> < { SCSI_ID_T10_VENDOR, SCSI_ID_NAA_DONT_CARE, SCSI_ID_ASCII },
> EOF
patching file scsi_serial.c
Hunk #2 succeeded at 42 (offset -18 lines).
[root@x345-tic-30 scsi_id]# make -C ../.. EXTRAS=extras/scsi_id USE_STATIC=true 2>&1 | grep -v warning
make: Entering directory `/root/udev-095'
GENHDR  udev_version.h
CC      udev_device.o
CC      udev_config.o
CC      udev_node.o
CC      udev_db.o
CC      udev_sysfs.o
CC      udev_rules.o
CC      udev_rules_parse.o
CC      udev_utils.o
CC      udev_utils_string.o
CC      udev_utils_file.o
CC      udev_utils_run.o
CC      udev_libc_wrapper.o
AR      libudev.a
RANLIB  libudev.a
CC      udev.o
LD      udev
CC      udevd.o
LD      udevd
CC      udevtrigger.o
LD      udevtrigger
CC      udevsettle.o
LD      udevsettle
CC      udevcontrol.o
LD      udevcontrol
207
CC      udevmonitor.o
LD      udevmonitor
CC      udevinfo.o
LD      udevinfo
CC      udevtest.o
LD      udevtest
CC      udevstart.o
LD      udevstart
make[1]: Entering directory `/root/udev-095/extras/scsi_id'
GENHDR  scsi_id_version.h
CC      scsi_id.o
CC      scsi_serial.o
LD      scsi_id
make[1]: Leaving directory `/root/udev-095/extras/scsi_id'
make: Leaving directory `/root/udev-095'
[root@x345-tic-30 scsi_id]# /bin/cp -f scsi_id /lib/udev/scsi_id_t10
[root@x345-tic-30 scsi_id]# ln -s -f /lib/udev/scsi_id_t10 /sbin
[root@x345-tic-30 scsi_id]# cd ../../..
[root@x345-tic-30 ~]# /bin/rm -rf udev-095
[root@x345-tic-30 ~]# chkconfig --add multipathd
[root@x345-tic-30 ~]# chkconfig --level 2345 multipathd on
[root@x345-tic-30 ~]# /bin/cp -p /etc/multipath.conf /etc/multipath.conf.`date +%d-%m-%Y.%H:%M:%S`
[root@x345-tic-30 ~]# cat > /etc/multipath.conf << EOF
> blacklist {
>     device {
>         vendor "IBM-ESXS"
>     }
>     device {
>         vendor "LSILOGIC"
>     }
>     device {
>         vendor "ATA"
>     }
> }
> devices {
>     device {
>         vendor "IBM"
>         product "2810XIV"
>         selector "round-robin 0"
>         path_grouping_policy multibus
>         rr_min_io 1000
>         getuid_callout "/sbin/scsi_id_t10 -g -u -s /block/%n"
>         path_checker tur
>         failback immediate
>         no_path_retry queue
>     }
> }
> EOF
[root@x345-tic-30 ~]# cat /proc/scsi/qla2xxx/2 | grep scsi-qla0-adapter-port
scsi-qla0-adapter-port=210000e08b08e7c4;
[root@x345-tic-30 ~]# cat /proc/scsi/qla2xxx/3 | grep scsi-qla1-adapter-port
scsi-qla1-adapter-port=210000e08b0b973c;

Create and map two volumes to the Linux host, as described in 5.5, Host definition and mappings on page 113.
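The WWPNs needed for the host definition can also be pulled out of these /proc lines programmatically. A small sketch of ours (the helper name is hypothetical), run here against the captured line rather than a live /proc entry:

```shell
# Extract the WWPN from a qla2xxx /proc line of the form
# "scsi-qla0-adapter-port=210000e08b08e7c4;"
wwpn_from_proc_line() {
    # strip everything up to '=' and the trailing ';'
    printf '%s\n' "$1" | sed -e 's/.*=//' -e 's/;$//'
}

wwpn_from_proc_line "scsi-qla0-adapter-port=210000e08b08e7c4;"   # 210000e08b08e7c4
```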
[root@x345-tic-30 ~]# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: LSILOGIC Model: 1030 IM          Rev: 1000
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 02 Lun: 00
  Vendor: IBM      Model: IC35L018UCD210-0 Rev: S5BS
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 08 Lun: 00
  Vendor: IBM      Model: 32P0032a S320  1 Rev: 1
  Type:   Processor                        ANSI SCSI revision: 02
Host: scsi0 Channel: 01 Id: 00 Lun: 00
  Vendor: IBM      Model: IC35L018UCD210-0 Rev: S5BS
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: 2810XIV-LUN-0    Rev: 10.0
  Type:   RAID                             ANSI SCSI revision: 05
Host: scsi2 Channel: 00 Id: 02 Lun: 00
  Vendor: IBM      Model: 2810XIV          Rev: 10.0
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi2 Channel: 00 Id: 02 Lun: 01
  Vendor: IBM      Model: 2810XIV          Rev: 10.0
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 02 Lun: 00
  Vendor: IBM      Model: 2810XIV          Rev: 10.0
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 02 Lun: 01
  Vendor: IBM      Model: 2810XIV          Rev: 10.0
  Type:   Direct-Access                    ANSI SCSI revision: 05
[root@x345-tic-30 ~]# cat /proc/partitions
major minor  #blocks  name

   8     0   17921024 sda
   8     1     104391 sda1
   8     2   17816085 sda2
   8    16   17921835 sdb
   8    17   17920476 sdb1
 253     0   33652736 dm-0
 253     1    2031616 dm-1
   8    32   16777216 sdc
   8    48   16777216 sdd
   8    64   16777216 sde
   8    80   16777216 sdf
 253     2   16777216 dm-2
 253     3   16777216 dm-3
[root@x345-tic-30 ~]# multipathd -k"show topo"
1IBM_2810XIV_MN000320016dm-2 IBM,2810XIV
[size=16G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 2:0:2:0 sdc 8:32 [active][ready]
 \_ 3:0:2:0 sde 8:64 [active][ready]
1IBM_2810XIV_MN000320017dm-3 IBM,2810XIV
[size=16G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 2:0:2:1 sdd 8:48 [active][ready]
 \_ 3:0:2:1 sdf 8:80 [active][ready]
[root@x345-tic-30 ~]# multipathd -k"list paths"
hcil    dev dev_t pri dm_st    chk_st  next_check
0:0:2:0 sdb 8:16  1   [undef]  [ready] [orphan]
2:0:2:0 sdc 8:32  1   [active][ready] XXXXXXX... 15/20
2:0:2:1 sdd 8:48  1   [active][ready] XXXXXXX... 15/20
3:0:2:0 sde 8:64  1   [active][ready] XXXXXXX... 15/20
3:0:2:1 sdf 8:80  1   [active][ready] XXXXXXX... 15/20
[root@x345-tic-30 ~]# multipathd -k"list maps status"
name                     failback  queueing paths dm-st
1IBM_2810XIV_MN000320016 immediate on       2     active
1IBM_2810XIV_MN000320017 immediate on       2     active
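When scripting health checks, the output of multipathd -k"show topo" can be summarized, for example by counting the [active][ready] paths per map. A sketch of ours using awk against a captured sample (the sample is condensed from the output above; the helper name is hypothetical):

```shell
# Count [active][ready] paths per multipath map in
# 'multipathd -k"show topo"' style output read from stdin.
# Prints one "map_name count" line per map.
ready_paths_per_map() {
    awk '
        /^1IBM/               { map = $1 }       # map header line
        /\[active\]\[ready\]/ { count[map]++ }   # one ready path for that map
        END { for (m in count) print m, count[m] }
    ' | sort
}

sample='1IBM_2810XIV_MN000320016dm-2 IBM,2810XIV
\_ 2:0:2:0 sdc 8:32 [active][ready]
\_ 3:0:2:0 sde 8:64 [active][ready]
1IBM_2810XIV_MN000320017dm-3 IBM,2810XIV
\_ 2:0:2:1 sdd 8:48 [active][ready]
\_ 3:0:2:1 sdf 8:80 [active][ready]'

printf '%s\n' "$sample" | ready_paths_per_map
```

In a live check you would pipe the real multipathd output in and alert when a map's count drops below the expected number of paths.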
[root@HS20-tic-15 ~]# cat >> /etc/modprobe.conf << EOF
> options qla2xxx qlport_down_retry=1
> EOF

Now, we have to build a new RAM disk image, so that the driver is loaded by the operating system loader after a boot. Next, reboot the Linux host as shown in Example 8-29.
Example 8-29 Build a new ram disk image
[root@HS20-tic-15 ~]# cd /boot
[root@HS20-tic-15 boot]# cp -f initrd-2.6.18-92.el5.img initrd-2.6.18-92.el5.img.bak
[root@HS20-tic-15 boot]# mkinitrd -f initrd-2.6.18-92.el5.img 2.6.18-92.el5
[root@HS20-tic-15 boot]# reboot

Broadcast message from root (pts/1) (Fri Aug 22 08:47:56 2008):

The system is going down for reboot NOW!
[root@HS20-tic-15 boot]# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#       enforcing - SELinux security policy is enforced.
#       permissive - SELinux prints warnings instead of enforcing.
#       disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
#       targeted - Only targeted network daemons are protected.
#       strict - Full SELinux protection.
SELINUXTYPE=targeted
Add udev rules for the XIV Storage System to set the device loss timeout to 1 second and the queue depth for the devices to 127, as shown in Example 8-31.
Example 8-31 Add udev rules for the XIV Storage System
[root@HS20-tic-15 ~]# cat > /etc/udev/rules.d/45-xiv-devs.rules << EOF
> SUBSYSTEM=="block", ACTION=="add", KERNEL=="sd*[!0-9]", SYSFS{model}=="2810XIV", RUN+="/bin/sh -c 'echo 127 > /sys\$devpath/device/queue_depth'"
> SUBSYSTEM=="fc_remote_ports", ACTION=="add", SYSFS{port_name}=="0x5001738000*", RUN+="/bin/sh -c 'echo 1 > /sys\$devpath/dev_loss_tmo'"
> EOF

Download the udev package from the following Web site:
https://launchpad.net/udev/main/
Compile the udev package and install the scsi_id_t10 package as illustrated in Example 8-32.
Example 8-32 Compilation of udev and installation of scsi_id_t10
[root@HS20-tic-15 ~]# cd udev-095
[root@HS20-tic-15 udev-095]# cd extras/scsi_id/
[root@HS20-tic-15 scsi_id]# patch -l -p 1 scsi_serial.c << EOF
> 40a41
> > { SCSI_ID_T10_VENDOR, SCSI_ID_NAA_DONT_CARE, SCSI_ID_ASCII },
> 59d59
> < { SCSI_ID_T10_VENDOR, SCSI_ID_NAA_DONT_CARE, SCSI_ID_ASCII },
> EOF
patching file scsi_serial.c
Hunk #2 FAILED at 60.
1 out of 2 hunks FAILED -- saving rejects to file scsi_serial.c.rej
[root@HS20-tic-15 scsi_id]# patch -l -p 1 scsi_serial.c << EOF
> 40a41
> > { SCSI_ID_T10_VENDOR, SCSI_ID_NAA_DONT_CARE, SCSI_ID_ASCII },
> 59d59
> < { SCSI_ID_T10_VENDOR, SCSI_ID_NAA_DONT_CARE, SCSI_ID_ASCII },
> EOF
patching file scsi_serial.c
Hunk #2 succeeded at 42 (offset -18 lines).
[root@HS20-tic-15 scsi_id]# make -C ../.. EXTRAS=extras/scsi_id USE_STATIC=true 2>&1 | grep -v warning
make: Entering directory `/root/udev-095'
GENHDR  udev_version.h
CC      udev_device.o
CC      udev_config.o
CC      udev_node.o
CC      udev_db.o
CC      udev_sysfs.o
CC      udev_rules.o
CC      udev_rules_parse.o
CC      udev_utils.o
CC      udev_utils_string.o
CC      udev_utils_file.o
CC      udev_utils_run.o
CC      udev_libc_wrapper.o
AR      libudev.a
RANLIB  libudev.a
CC      udev.o
LD      udev
CC      udevd.o
LD      udevd
CC      udevtrigger.o
LD      udevtrigger
CC      udevsettle.o
LD      udevsettle
CC      udevcontrol.o
LD      udevcontrol
CC      udevmonitor.o
LD      udevmonitor
CC      udevinfo.o
LD      udevinfo
CC      udevtest.o
LD      udevtest
CC      udevstart.o
LD      udevstart
make[1]: Entering directory `/root/udev-095/extras/scsi_id'
GENHDR  scsi_id_version.h
CC      scsi_id.o
CC      scsi_serial.o
LD      scsi_id
make[1]: Leaving directory `/root/udev-095/extras/scsi_id'
make: Leaving directory `/root/udev-095'
[root@HS20-tic-15 scsi_id]# /bin/cp -f scsi_id /lib/udev/scsi_id_t10
[root@HS20-tic-15 scsi_id]# ln -s -f /lib/udev/scsi_id_t10 /sbin
[root@HS20-tic-15 scsi_id]# cd ../../..
[root@HS20-tic-15 ~]# /bin/rm -rf udev-095
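Once the rules from Example 8-31 are active, their effect is simply that a value has been written into the matching sysfs attribute files. The sketch below, which is ours, reads queue_depth for every sd* device so you can confirm the value 127 took hold after a reboot; SYSROOT is a hypothetical override so the check can be exercised against a scratch tree instead of a live /sys:

```shell
# Print "attribute-file value" for every sd* queue_depth attribute
# under the sysfs root (default /sys, overridable via SYSROOT).
check_queue_depth() {
    root=${SYSROOT:-/sys}
    for qd in "$root"/block/sd*/device/queue_depth; do
        [ -e "$qd" ] && printf '%s %s\n' "$qd" "$(cat "$qd")"
    done
}

# Exercise against a scratch tree standing in for /sys
tree=$(mktemp -d)
mkdir -p "$tree/block/sdb/device"
echo 127 > "$tree/block/sdb/device/queue_depth"   # as the udev rule would
SYSROOT=$tree check_queue_depth
rm -rf "$tree"
```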
[root@HS20-tic-15 ~]# chkconfig --add multipathd
[root@HS20-tic-15 ~]# chkconfig --level 2345 multipathd on
[root@HS20-tic-15 ~]# /bin/cp -p /etc/multipath.conf /etc/multipath.conf.`date +%d-%m-%Y.%H:%M:%S`
[root@HS20-tic-15 ~]# cat > /etc/multipath.conf << EOF
> blacklist {
>     device {
>         vendor "IBM-ESXS"
>     }
>     device {
>         vendor "LSILOGIC"
>     }
>     device {
>         vendor "ATA"
>     }
> }
Chapter 8. OS-specific considerations for host connectivity
> devices {
>     device {
>         vendor "IBM"
>         product "2810XIV"
>         selector "round-robin 0"
>         path_grouping_policy multibus
>         rr_min_io 1000
>         getuid_callout "/sbin/scsi_id_t10 -g -u -s /block/%n"
>         path_checker tur
>         failback immediate
>         no_path_retry queue
>     }
> }
> EOF
[root@HS20-tic-15 ~]# cat /sys/class/fc_host/host1/port_name
0x210000e08b853458
[root@HS20-tic-15 ~]# cat /sys/class/fc_host/host2/port_name
0x210100e08ba53458
Create and map two volumes to the Linux host, as described in 5.5, Host definition and mappings on page 113.
[root@HS20-tic-15 ~]# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM-ESXS Model: ST973401LC    FN Rev: B41D
  Type:   Direct-Access                    ANSI SCSI revision: 04
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: 2810XIV-LUN-0    Rev: 10.0
  Type:   RAID                             ANSI SCSI revision: 05
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: 2810XIV-LUN-0    Rev: 10.0
  Type:   RAID                             ANSI SCSI revision: 05
Host: scsi1 Channel: 00 Id: 02 Lun: 00
  Vendor: IBM      Model: 2810XIV          Rev: 10.0
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi1 Channel: 00 Id: 02 Lun: 01
  Vendor: IBM      Model: 2810XIV          Rev: 10.0
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: 2810XIV          Rev: 10.0
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi2 Channel: 00 Id: 00 Lun: 01
  Vendor: IBM      Model: 2810XIV          Rev: 10.0
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi2 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: 2810XIV-LUN-0    Rev: 10.0
  Type:   RAID                             ANSI SCSI revision: 05
Host: scsi2 Channel: 00 Id: 02 Lun: 00
  Vendor: IBM      Model: 2810XIV-LUN-0    Rev: 10.0
  Type:   RAID                             ANSI SCSI revision: 05
[root@HS20-tic-15 ~]# cat /proc/partitions
major minor  #blocks  name

   8     0   71687000 sda
   8     1     104391 sda1
   8     2   71577607 sda2
   8    16   16777216 sdb
   8    32   16777216 sdc
   8    48   16777216 sdd
   8    64   16777216 sde
 253     0   69533696 dm-0
 253     1    2031616 dm-1
 253     2   16777216 dm-2
 253     3   16777216 dm-3
[root@HS20-tic-15 ~]# multipathd -k"show topo"
1IBM_2810XIV_MN0003201AAdm-2 IBM,2810XIV
[size=16G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 1:0:2:0 sdb 8:16 [active][ready]
 \_ 2:0:0:0 sdd 8:48 [active][ready]
1IBM_2810XIV_MN0003201ABdm-3 IBM,2810XIV
[size=16G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 1:0:2:1 sdc 8:32 [active][ready]
 \_ 2:0:0:1 sde 8:64 [active][ready]
[root@HS20-tic-15 ~]# multipathd -k"list paths"
hcil    dev dev_t pri dm_st    chk_st  next_check
1:0:2:0 sdb 8:16  1   [active][ready] X......... 2/20
1:0:2:1 sdc 8:32  1   [active][ready] X......... 2/20
2:0:0:0 sdd 8:48  1   [active][ready] X......... 2/20
2:0:0:1 sde 8:64  1   [active][ready] X......... 2/20
[root@HS20-tic-15 ~]# multipathd -k"list maps status"
name                     failback  queueing paths dm-st
1IBM_2810XIV_MN0003201AA immediate on       2     active
1IBM_2810XIV_MN0003201AB immediate on       2     active
[root@x345-tic-30 ~]# rpm -ivh iscsi-initiator-utils-6.2.0.868-0.7.el5.i386.rpm
warning: iscsi-initiator-utils-6.2.0.868-0.7.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:iscsi-initiator-utils  ########################################### [100%]
[root@x345-tic-30 ~]# chkconfig --add iscsi
[root@x345-tic-30 ~]# chkconfig --level 2345 iscsi on
[root@x345-tic-30 ~]# iscsiadm -m discovery -t sendtargets -p 9.155.50.27
9.155.50.27:3260,2 iqn.2005-10.com.xivstorage:000050
9.155.50.34:3260,4 iqn.2005-10.com.xivstorage:000050
192.168.1.1:3260,1 iqn.2005-10.com.xivstorage:000050
192.168.1.2:3260,3 iqn.2005-10.com.xivstorage:000050
192.168.1.3:3260,5 iqn.2005-10.com.xivstorage:000050
[root@x345-tic-30 ~]# iscsiadm -m discovery -t sendtargets -p 9.155.50.34
9.155.50.27:3260,2 iqn.2005-10.com.xivstorage:000050
9.155.50.34:3260,4 iqn.2005-10.com.xivstorage:000050
192.168.1.1:3260,1 iqn.2005-10.com.xivstorage:000050
192.168.1.2:3260,3 iqn.2005-10.com.xivstorage:000050
192.168.1.3:3260,5 iqn.2005-10.com.xivstorage:000050
[root@x345-tic-30 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1994-05.com.redhat:92d581f8b30

Create and map two volumes to the Linux host, as described in 5.5, Host definition and mappings on page 113. Now, reboot the host and you will see disks dm-4 and dm-5 as illustrated in Example 8-38.
Example 8-38 iSCSI multi-pathing output
[root@x345-tic-30 ~]# multipathd -k"show topo"
1IBM_2810XIV_MN000320016dm-2 IBM,2810XIV
[size=16G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 2:0:2:0 sdc 8:32 [active][ready]
 \_ 3:0:2:0 sde 8:64 [active][ready]
1IBM_2810XIV_MN000320017dm-3 IBM,2810XIV
[size=16G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 2:0:2:1 sdd 8:48 [active][ready]
 \_ 3:0:2:1 sdf 8:80 [active][ready]
1IBM_2810XIV_MN000320052dm-4 IBM,2810XIV
[size=16G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][enabled]
 \_ 5:0:0:0 sdh 8:112 [active][ready]
 \_ 4:0:0:0 sdg 8:96 [active][ready]
1IBM_2810XIV_MN000320053dm-5 IBM,2810XIV
[size=16G][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][enabled]
 \_ 5:0:0:1 sdj 8:144 [active][ready]
 \_ 4:0:0:1 sdi 8:128 [active][ready]
[root@x345-tic-30 ~]# multipathd -k"list paths"
hcil    dev dev_t pri dm_st    chk_st  next_check
0:0:2:0 sdb 8:16  1   [undef]  [ready] [orphan]
2:0:2:0 sdc 8:32  1   [active][ready] XXX....... 7/20
2:0:2:1 sdd 8:48  1   [active][ready] XXX....... 7/20
3:0:2:0 sde 8:64  1   [active][ready] XXX....... 7/20
3:0:2:1 sdf 8:80  1   [active][ready] XXX....... 7/20
4:0:0:0 sdg 8:96  1   [active][ready] XXX....... 7/20
5:0:0:0 sdh 8:112 1   [active][ready] XXX....... 7/20
4:0:0:1 sdi 8:128 1   [active][ready] XXX....... 7/20
5:0:0:1 sdj 8:144 1   [active][ready] XXX....... 7/20
[root@x345-tic-30 ~]# multipathd -k"list maps status"
name                     failback  queueing paths dm-st
1IBM_2810XIV_MN000320016 immediate on       2     active
1IBM_2810XIV_MN000320017 immediate on       2     active
1IBM_2810XIV_MN000320052 immediate on       2     active
1IBM_2810XIV_MN000320053 immediate on       2     active
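Scripts often need only the portal/IQN pairs from the sendtargets discovery output shown earlier, for example to keep just the routable 9.155.x portals. A sketch of ours (the helper name is hypothetical), run against a captured sample rather than a live iscsiadm session:

```shell
# Print "portal iqn" pairs from 'iscsiadm -m discovery' output,
# keeping only portals whose address starts with a given prefix.
# The trailing ",<tag>" after the port is stripped.
portals_for_prefix() {
    prefix=$1
    awk -v p="$prefix" 'index($1, p) == 1 { sub(/,.*/, "", $1); print $1, $2 }'
}

sample='9.155.50.27:3260,2 iqn.2005-10.com.xivstorage:000050
192.168.1.1:3260,1 iqn.2005-10.com.xivstorage:000050
9.155.50.34:3260,4 iqn.2005-10.com.xivstorage:000050'

printf '%s\n' "$sample" | portals_for_prefix 9.155.
```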
Customizing /kernel/drv/scsi_vhci.conf
Add the following lines to /kernel/drv/scsi_vhci.conf and enable STMS by entering stmsboot -e, as shown in Example 8-39. For detailed information about STMS, refer to:
http://dlc.sun.com/pdf/819-5604-17/819-5604-17.pdf
Example 8-39 Multi-pathing configuration and enabling STMS
bash-3.00# cat >> /kernel/drv/scsi_vhci.conf << EOF
> device-type-scsi-options-list =
> "IBM     2810XIV", "symmetric-option";
> symmetric-option = 0x1000000;
> EOF
bash-3.00# stmsboot -e
WARNING: stmsboot operates on each supported multipath-capable controller
         detected in a host. In your system, these controllers are
/devices/pci@8,600000/SUNW,qlc@1/fp@0,0
/devices/pci@8,600000/SUNW,qlc@2/fp@0,0
/devices/pci@9,600000/SUNW,qlc@2/fp@0,0

If you do NOT wish to operate on these controllers, please quit stmsboot
and re-invoke with -D { fp | mpt } to specify which controllers you wish
to modify your multipathing configuration for.

Do you wish to continue? [y/n] (default: y) y
Checking mpxio status for driver fp
Checking mpxio status for driver mpt
WARNING: This operation will require a reboot.
Do you want to continue ? [y/n] (default: y) y
The changes will come into effect after rebooting the system.
Reboot the system now ? [y/n] (default: y) y
The first two HBAs in our example are the SG-XPCI1FC-QL2 HBAs, and the third HBA is a controller for the local disks. Create and map two volumes to the Solaris host, as described in 5.5, Host definition and mappings on page 113. You now see the two volumes and two paths for WWPN 5001738000320190 on controller c2 and 5001738000320170 on controller c3. Refer to Example 8-41. Before the disks can be used, they must be labeled using the format command. If the disks are visible via the cfgadm command but do not show up when using the format command, enter the devfsadm -vCc disks command to clean up and repopulate the /dev namespace.
Example 8-41 Discovery of the volumes
bash-3.00# cfgadm -lao show_FCP_dev
Ap_Id                    Type         Receptacle   Occupant       Condition
c1                       fc-private   connected    configured     unknown
c1::500000e010183a51,0   disk         connected    configured     unknown
c1::500000e0102e9dd1,0   disk         connected    configured     unknown
c2                       fc-fabric    connected    configured     unknown
c2::5001738000320190,0   disk         connected    configured     unknown
c2::5001738000320190,1   disk         connected    configured     unknown
c2::5001738000cb0161     unavailable  connected    unconfigured   failed
c2::5001738000cb0181,0   array-ctrl   connected    configured     unknown
c3                       fc-fabric    connected    configured     unknown
c3::5001738000320170,0   disk         connected    configured     unknown
c4t001738000032014Ad0: configured with capacity of 15.98GB
c4t001738000032014Bd0: configured with capacity of 15.98GB
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e0102e9dd1,0
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e010183a51,0
       2. c4t001738000032014Ad0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g001738000032014a
       3. c4t001738000032014Bd0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g001738000032014b
Specify disk (enter its number): 2
selecting c4t001738000032014Ad0
[disk formatted]
Disk not labeled. Label it now? y
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> disk
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e0102e9dd1,0
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e010183a51,0
       2. c4t001738000032014Ad0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g001738000032014a
       3. c4t001738000032014Bd0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g001738000032014b
Specify disk (enter its number)[2]: 3
selecting c4t001738000032014Bd0
[disk formatted]
Disk not labeled. Label it now? y
format> quit
bash-3.00# iscsiadm add discovery-address 9.155.50.27:3260
bash-3.00# iscsiadm list discovery-address -v 9.155.50.27:3260
Discovery Address: 9.155.50.27:3260
        Target name: iqn.2005-10.com.xivstorage:000050
                Target address: 192.168.1.1:3260, 1
        Target name: iqn.2005-10.com.xivstorage:000050
                Target address: 9.155.50.27:3260, 2
        Target name: iqn.2005-10.com.xivstorage:000050
                Target address: 192.168.1.2:3260, 3
        Target name: iqn.2005-10.com.xivstorage:000050
                Target address: 9.155.50.34:3260, 4
        Target name: iqn.2005-10.com.xivstorage:000050
                Target address: 192.168.1.3:3260, 5
bash-3.00# iscsiadm add static-config iqn.2005-10.com.xivstorage:000050,9.155.50.27:3260
bash-3.00# iscsiadm add static-config iqn.2005-10.com.xivstorage:000050,9.155.50.34:3260
bash-3.00# iscsiadm list static-config
Static Configuration Target: iqn.2005-10.com.xivstorage:000050,9.155.50.27:3260
Static Configuration Target: iqn.2005-10.com.xivstorage:000050,9.155.50.34:3260
bash-3.00# iscsiadm modify discovery --static enable
bash-3.00# iscsiadm list initiator-node
Initiator node name: iqn.1986-03.com.sun:01:0003ba4dbd8a.489acacf
Initiator node alias: v480-1
        Login Parameters (Default/Configured):
                Header Digest: NONE/-
                Data Digest: NONE/-
        Authentication Type: NONE
        RADIUS Server: NONE
        RADIUS access: unknown
        Configured Sessions: 1
bash-3.00# devfsadm -i iscsi
bash-3.00# format
Searching for disks...done

c4t0017380000320056d0: configured with capacity of 15.98GB
c4t0017380000320057d0: configured with capacity of 15.98GB
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e0102e9dd1,0
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e010183a51,0
       2. c4t001738000032014Ad0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g001738000032014a
       3. c4t001738000032014Bd0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g001738000032014b
       4. c4t0017380000320056d0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g0017380000320056
       5. c4t0017380000320057d0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g0017380000320057
Specify disk (enter its number): 4
selecting c4t0017380000320056d0
[disk formatted]
Disk not labeled. Label it now? y
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e0102e9dd1,0
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e010183a51,0
       2. c4t001738000032014Ad0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g001738000032014a
       3. c4t001738000032014Bd0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g001738000032014b
       4. c4t0017380000320056d0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g0017380000320056
       5. c4t0017380000320057d0 <IBM-2810XIV-10.0 cyl 2046 alt 2 hd 128 sec 128>
          /scsi_vhci/ssd@g0017380000320057
Specify disk (enter its number)[4]: 5
selecting c4t0017380000320057d0
[disk formatted]
Disk not labeled. Label it now? y
format> quit
8.5 VMware
The XIV Storage System currently supports the VMware high-end virtualization solution, Virtual Infrastructure 3, and the included VMware ESX Server 3.5. At the time of writing this book, SAN boot was supported. Details about the supported VMware ESX Server versions, supported SAN boot environments, and remote boot via iSCSI can be found at the System Storage Interoperation Center (SSIC) Web site:
http://www.ibm.com/systems/support/storage/config/ssic/displayesssearchwithoutjs.wss?start_over=yes
The environment used to prepare the examples in the remainder of this section consisted of an IBM BladeCenter HS20 equipped with QMC 2462 HBAs and running VMware ESX Server 3.5 Update 2.
For detailed information about how to use these LUNs with virtual machines, refer to the VMware guides, available at the following Web sites:
http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_admin_guide.pdf
http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_3_server_config.pdf
3. Add Network for VMkernel.
4. Enable the iSCSI Software Adapter.
5. Configure the host, volumes, and host mapping in the IBM XIV.
6. Discover the iSCSI targets.
Firewall settings
Before you start configuring the iSCSI storage, make sure that the firewall settings on the VMware host allow the software iSCSI client to connect to the iSCSI target. The firewall settings can be found in the Configuration tab under Security Profile. Verify that the Software iSCSI client is enabled as shown in Figure 8-23.
2. Click Add Networking to initialize the Add Network Wizard. The Add Network Wizard is shown in Figure 8-25.
3. Select VMkernel as the Connection Type and click Next. The next Add Network Wizard window is displayed as shown in Figure 8-26 on page 227.
4. Select an unused network adapter to create a new virtual switch as illustrated in Figure 8-26, and click Next. 5. Fill in the Connection settings, such as the network address and the subnet mask for the VMkernel as shown in Figure 8-27 on page 228, and click Next.
6. The next window shows a summary of your settings. If correct, click Finish (refer to Figure 8-28).
The Networking panel now shows an additional virtual switch (vSwitch) for the VMkernel as illustrated in Figure 8-29.
Click Properties (Figure 8-30) to display the iSCSI Initiator (iSCSI Software Adapter) Properties panel as shown in Figure 8-31 on page 230.
2. In the iSCSI initiator pane under the General tab (Figure 8-31), click Configure to enable the iSCSI Software Adapter as illustrated in Figure 8-32.
Figure 8-32 iSCSI Software Adapter general properties: Enable iSCSI initiator
3. Check Enabled and click OK. A message will appear as shown in Figure 8-33.
Figure 8-33 iSCSI Software Adapter general properties: Enable iSCSI initiator message
4. With VMware ESX Server 3.5, it is no longer necessary to define a service console network port on the same vSwitch as the VMkernel, so click No (Figure 8-33 on page 230). The General tab now displays the iSCSI initiator properties (iSCSI name and alias) as shown in Figure 8-34.
5. Close the Properties panel, and the iSCSI Software Adapter is now visible as vmhba32 as illustrated in Figure 8-35.
2. Click Add and define the IP addresses for the iSCSI ports on the XIV as depicted in Figure 8-37.
Figure 8-37 iSCSI Software Adapter Dynamic Discovery: Add Send Targets Server
After adding all iSCSI ports available on the XIV Storage System, they will be visible in the Dynamic Discovery tab as displayed in Figure 8-38 on page 233.
3. Create and map two volumes to the VMware ESX host, as described in 5.5, Host definition and mappings on page 113. After a rescan on the ESX host, you will see the two volumes as LUN 1 and LUN 2 over two paths. This information is visible in the Configuration tab of the Storage Adapter panel for vmhba32 as illustrated in Figure 8-39 on page 234. In accordance with the SCSI standard, the XIV Storage System maps itself to LUN 0 in every mapping. This LUN serves as the well-known LUN for that mapping, and the host can issue SCSI commands to it that are not related to any specific volume.
Chapter 9.
Performance characteristics
The XIV Storage System is a high-performance disk storage subsystem. Chapter 2, XIV logical architecture and concepts on page 7 described the XIV Storage System's massive parallelism, disk utilization, and unique caching algorithms. These characteristics, inherent to the system design, guarantee optimized, consistent performance: as increased stress is applied to the system, the XIV Storage System maintains a consistent performance level. This chapter further explores the concepts behind this high performance, provides best practice recommendations for connecting to an XIV Storage System, and explains how to use the statistics monitor that is provided by the XIV Storage System.
Having a large pipe permits the XIV Storage System to use small cache pages. A large pipe between the disks and the cache allows the system to perform many small requests in parallel, which improves performance. If the pipe were small, the system would have to serialize requests or group small requests together to maximize the data throughput between the disks and the cache. A large pipe therefore enables the XIV system to manage many small tasks that can be accommodated by small cache pages.

A Least Recently Used (LRU) algorithm is the basis for the cache management algorithm. Combined with the small cache pages, the LRU algorithm becomes very efficient, allowing the system to achieve a high hit ratio for frequently used data. In other words, the efficiency of the cache for small transfers is very high when the host is accessing the same data set.

Because of the efficiency of the cache, the prefetching algorithm is very aggressive. The algorithm starts with a small number of pages and gradually increases the number of pages prefetched until an entire partition, 1 MB, is read into cache. Specifically, the algorithm starts with two pages (8 KB). If the access results in a cache hit, the algorithm doubles the amount of data prefetched. In this example, the next prefetch requests four pages (16 KB) of data. The algorithm continues to double the prefetch size until a cache miss occurs or the maximum prefetch size of 1 MB is reached. Because the modules are managed independently, if a prefetch crosses a module boundary, the logically adjacent module (for that volume) is notified so that it can begin pre-staging the data into its local cache.
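The doubling progression just described can be sketched in a few lines. This is a minimal illustration of the prefetch-size growth only, not the actual XIV implementation; the function name and the 4 KB page constant (two pages = 8 KB, per the description above) are assumptions for the sketch:

```python
PAGE_SIZE_KB = 4            # assumed cache page size (two pages = 8 KB)
MAX_PREFETCH_KB = 1024      # one full partition: 1 MB

def prefetch_sizes():
    """Yield the successive prefetch sizes (in KB): the size doubles on
    every cache hit until an entire 1 MB partition would be staged."""
    size = 2 * PAGE_SIZE_KB                 # start with two pages (8 KB)
    while size <= MAX_PREFETCH_KB:
        yield size
        size *= 2                           # double after each cache hit

# A cache miss at any step would stop the doubling instead.
print(list(prefetch_sizes()))  # → [8, 16, 32, 64, 128, 256, 512, 1024]
```

The eight steps from 8 KB to 1 MB show why a hot sequential stream is staged into cache very quickly.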
disk drive. With the XIV Storage System, the rebuild is not focused on one disk; instead, the work is spread across all the disks in the system. Because each disk performs only a small percentage of the work, the impact to the host is minimal. After the disk is repaired, the system enters a redistribution phase: in the background, the system slowly moves data back onto the new disk, which causes the new disk to be heavily utilized as the data is written to it. Because this work occurs in the background, the host encounters only a small impact to performance.
Resynchronization is the process of re-establishing the connection to the remote system after a link failure. In this situation, only the modified data is transferred to the remote XIV Storage System in order to speed up the recovery process. Recovery time therefore depends on the amount of data that changed between the time of the link failure and the time that the recovery process completes.
Similarly, if you connect multiple hosts and have multiple connections, make sure to spread all of the connections evenly across the Interface Modules.
the system experiences better performance. If the transfer is smaller than the maximum host transfer size, the host only transfers the amount of data that it has to send. Refer to Chapter 8, OS-specific considerations for host connectivity on page 179 or to the vendor hardware manuals for queue depth recommendations.

Due to the distributed nature of the XIV Storage System, high performance is achieved through parallelism: the system maintains a high level of performance as the number of parallel transactions to the volumes increases. Ideally, the host workload is tailored to use multiple threads or to spread the work across multiple hosts.
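The recommendation above, to drive the system with many parallel requests rather than one serial stream, can be illustrated with a small sketch. The do_io function is a hypothetical stand-in for a real I/O request against an XIV volume, not an XIV API:

```python
from concurrent.futures import ThreadPoolExecutor

def do_io(block: int) -> int:
    """Stand-in for one I/O request; a real workload would read or
    write `block` on an XIV volume here."""
    return block * 2  # placeholder work

# Issuing many requests in flight at once keeps the XIV grid busy;
# a single serial stream would leave most modules idle.
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(do_io, range(64)))

print(len(results))  # → 64
```

The same effect can be obtained with asynchronous I/O or by spreading the workload across multiple hosts, as the text suggests.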
Select Statistics from the Monitor menu as shown in Figure 9-3 to display the default Monitor view, which is shown in Figure 9-4 on page 242 and displays the system IOPS for the past 24 hours. The X-axis of the graph represents time and can vary from minutes to months. The Y-axis of the graph is the selected measurement; the default measurement is IOPS. The statistics monitor also illustrates latency and bandwidth.
The other options in the statistics monitor act as filters for separating data. These filters are separated by the type of transaction (reads or writes), cache properties (hits compared to misses), or the transfer size of I/O as seen by the XIV Storage System. Refer to Figure 9-5 for a better view of the filter pane.
The filter pane allows you to select multiple items within a specific filter, for example, if you want to see reads and writes separated on the graph. By holding down Ctrl on the keyboard and selecting the read option and then the write option, both items are displayed on the graph.
As shown in Figure 9-6, one of the lines represents the reads and the other line represents the writes. On the GUI, these lines are drawn in separate colors to differentiate the metrics. This selection process can be performed on the other filter items as well.
In certain cases, the user needs to see multiple graphs at one time. On the right side of the filter pane, there is a selection to add graphs (refer to Figure 9-5 on page 242). Up to four graphs are managed by the GUI. Each graph is independent and can have separate filters. Figure 9-7 on page 244 illustrates this concept. The top graph is the IOPS for the day with the reads and writes separated. The second graph displays the bandwidth for several minutes with reads and writes separated, which provides quick and easy access to multiple views of the performance metrics.
There are several additional filters available, such as filtering by host, volumes, or targets. These items are defined on the left side of the filter pane. When you click one of these filters, a dialog window appears. Highlight the item to filter on and then click Click to select; the highlighted item moves to the lower half of the dialog box. To generate the graph, click the green check mark located on the lower right side of the dialog box. The new graph is generated with the name of the filter at the top of the graph. Refer to Figure 9-8 on page 245 for an example of this filter.
On the left side of the chart in the blue bar, there are several tools to assist you in managing the data. The top two tools (magnifying glasses) zoom in and out for the chart, and the second set of two tools adjusts the X-axis and the Y-axis for the chart. Finally, the bottom two tools allow you to export the data to a comma-separated file or print the chart to a printer. Figure 9-9 shows the chart toolbar in more detail.
C:\XIV>xcli -c MN00033 time_list
Time       Date         Time Zone    Daylight Saving Time
00:48:15   2008-07-29   US/Arizona   no

After the system time is obtained, the statistics_get command can be formatted and issued. The statistics_get command requires several parameters to operate: a starting or ending time point, a count for the number of intervals to collect, the size of the interval, and the units related to that size. The TimeStamp is derived from the previous time_list command. Example 9-2 provides a description of the command.
Example 9-2 The statistics_get command format
statistics_get < start=TimeStamp | end=TimeStamp > count=N interval=IntervalSize resolution_unit=< minute|hour|day|week|month >

To further explain this command, assume that you want to collect 10 intervals, each covering one minute, and that the point of interest occurred on 28 July 2008, starting at 25 minutes after 00 hours. Note that the statistics_get command allows you to gather performance data from any time period. The time stamp is formatted as YYYY-MM-DD.hh:mm:ss, where YYYY represents a four-digit year, MM is the two-digit month, and DD is the two-digit day. After the date portion of the time stamp, you specify the time, where hh is the hour, mm is the minute, and ss represents the seconds. In order to save the data, redirect the output to a file for post-processing. Example 9-3 shows an example of this command, and Figure 9-10 on page 247 shows an example of the output of the statistics. The output displayed is a small portion of the data provided.
Example 9-3 The statistics_get command example
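For scripting, the time stamp can also be built programmatically. The following sketch assembles a command line of the form discussed above; the system name MN00033 is taken from the surrounding examples, and the assembled string follows that pattern as an assumption, not captured output:

```python
from datetime import datetime

def xcli_timestamp(dt: datetime) -> str:
    """Format a point in time the way statistics_get expects:
    YYYY-MM-DD.hh:mm:ss, as used in the examples in this chapter."""
    return dt.strftime("%Y-%m-%d.%H:%M:%S")

# Ten one-minute intervals starting at 28 July 2008, 00:25:00.
start = xcli_timestamp(datetime(2008, 7, 28, 0, 25, 0))
cmd = (f"xcli -c MN00033 statistics_get start={start} "
       f"count=10 interval=1 resolution_unit=minute")
print(cmd)
```

Building the time stamp this way avoids the formatting mistakes that the command otherwise rejects.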
Extending this example, assume that you want to filter out a specific host defined in the XIV Storage System. By using the host filter in the command, you can specify for which host you want to see performance metrics, which allows you to refine the data that you are analyzing. Refer to Example 9-4 for an example of how to perform this operation and Figure 9-11 for a sample of the output for the command.
Example 9-4 The statistics_get command using the host filter
C:\XIV>xcli -c MN00033 statistics_get start=2008-07-28.00:25:00 host=23a5372 count=10 interval=1 resolution_unit=minute > data.out
Figure 9-11 Output from the statistics_get command using the host filter
In addition to the filter just shown, the statistics_get command can filter by iSCSI name, host worldwide port name (WWPN), volume name, module, and many other fields. As an additional example, assume that you want to see the workload on the system for a specific module. The module filter breaks out the performance on the specified module. Example 9-5 pulls the performance statistics for module 6 during the same time period as the previous examples.
Example 9-5 The statistics_get command using the module filter
C:\XIV>xcli -c MN00033 statistics_get start=2008-07-28.00:25:00 module=6 count=10 interval=1 resolution_unit=minute > data.out
Figure 9-12 Output from statistics_get command using the module filter
Chapter 10.
Monitoring
This chapter describes the various methods and functions that are available to monitor the XIV Storage System. It also shows how you can gather information from the system in real time, in addition to the self-monitoring, self-healing, and automatic alerting function implemented within the XIV software. Furthermore, this chapter also discusses the Call Home function and remote support and repair.
Status bar indicators located at the bottom of the window indicate the overall operational levels of the XIV Storage System:

The first indicator, on the left, shows the amount of soft or hard storage capacity currently allocated to Storage Pools and provides alerts when certain capacity thresholds are reached. As the physical, or hard, capacity consumed by volumes within a Storage Pool passes certain thresholds, the color of this meter indicates that additional hard capacity might need to be added to one or more Storage Pools. Clicking the icon on the right side of the indicator bar that represents up and down arrows toggles the view between hard and soft capacity. Our example indicates that the system has a usable hard capacity of 79113 GB, of which 84%, or 66748 GB, is actually used. You can also get more detailed information and perform more accurate capacity monitoring by looking at Storage Pools (refer to 5.3.1, Managing Storage Pools with XIV GUI on page 94).

The second indicator, in the middle, displays the number of I/O operations per second (IOPS).

The third indicator, on the far right, shows the general system status and, for example, indicates when a redistribution is underway. In our example, the general system status indicator shows that the system is undergoing a Rebuilding phase, which was triggered by a failing disk (Disk 7 in Module 7) as shown in Figure 10-2.
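The percentage shown in the example can be checked with a quick calculation using the values from the status bar:

```python
# Values read from the example capacity indicator.
used_gb, total_gb = 66748, 79113

print(round(100 * used_gb / total_gb))  # → 84
```

The same arithmetic applies whether the indicator is toggled to hard or soft capacity.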
Monitoring events
To get to the Events window, select Events from the Monitor menu as shown in Figure 10-3. Extensive information and many events are logged by the XIV Storage System. The system captures entries for problems with various levels of severity, including warnings and other informational messages. These informational messages include detailed information about logins, configuration changes, and the status of attached hosts and paths. All of the collected data can be reviewed in the Events window that is shown in Figure 10-3.
Because many events are logged, the number of entries is typically huge. To get a more useful and workable view, there is an option to filter the events logged. Without filtering the events, it is extremely difficult to find the entries for a specific incident or information. Figure 10-4 shows the possible filter options for the events.
If you double-click a specific event in the list, you get more detailed information about that particular event, along with a recommendation about what action to take. Figure 10-5 on page 253 shows details for a critical event where a module failed. For this type of event, you must immediately contact IBM XIV support.
Event severity
The events are classified into a level of severity depending on their impact on the system. Figure 10-6 gives an overview of the criteria and meaning of the various severity levels.
The five severity levels are:

Informational: For information only; no impact on or danger to system operation.
Warning: Informs the user that something in the system has changed, but with no impact on the system.
Minor: A part has failed, but the system is still fully redundant and there is no operational impact.
Major: A part has failed and redundancy is temporarily affected (for example, a failing disk).
Critical: One or more parts have failed, and redundancy and machine operation can be affected.
Event configuration
The events monitor window offers a Configuration option in the Toolbar (refer to Figure 10-7) that lets you configure Call Home notifications and rules for specific events. Clicking the Configuration icon starts the Configuration Wizard, which guides you through the settings to define rules.
For further information about event notification rules, refer to 10.2, Call Home and remote support on page 273.
Monitoring statistics
The Statistics monitor, which is shown in Figure 10-8, provides information about the performance and workload of the IBM XIV.
There is flexibility in how you can visualize the statistics. Options are selectable from a control pane located at the bottom of the window, which is shown in Figure 10-9.
For detailed information about performance monitoring, refer to 9.3, Performance statistics gathering with XIV on page 240.
System monitoring
Several XCLI commands are available for system monitoring. We illustrate several of these commands next. For complete information about these commands, refer to the XCLI Users Guide, which is available at:
http://publib.boulder.ibm.com/infocenter/ibmxiv/r2/index.jsp
The state_list command, shown in Example 10-1, gives an overview of the general status of the system. In the example, the system is operational, and no shutdown is pending.
Example 10-1 The state_list command
xcli> XCLI -c clr26 state_list
Command completed successfully
system_state   off_type=off   safe_mode=no   shutdown_reason=No Shutdown   system_state=on   target_state=on

In Example 10-2, the system_capacity_list command shows an overview of used and free capacity system-wide. In the example, the usable capacity is 79113 GB, with 11355 GB of free capacity. The difference of these two values gives the capacity used.
Example 10-2 The system_capacity_list command
xcli> XCLI -c clr26 system_capacity_list
Soft    Hard    Free Hard   Free Soft   Spare Modules   Spare Disks   Target Spare Modules
79113   79113   11355       11355       1               3             1

In Example 10-3, the version_get command displays the current version of the XIV code installed on the system. Knowing the current version of your system assists you in determining when upgrades are required.
Example 10-3 The version_get command
C:\XIV>xcli -c clr26 version_get
Version
XIV_SYS-10.0-P0803

In Example 10-4 on page 256, the time_list command is used to retrieve the current time from the XIV Storage System. This time is normally set at the time of installation. Knowing the current system time is required when reading statistics or events. In certain cases, the system time might differ from the current time at the user's location, and therefore, knowing when something occurred according to the system time assists with debugging issues. In the example provided, the current system time and current user time are displayed.
C:\XIV>xcli -c clr26 time_list & date /T & time /T
Time       Date         Time Zone   Daylight Saving Time
08:57:15   2008-08-19   GMT         no
Tue 08/19/2008
10:58 AM
xcli> XCLI -c clr26 component_list filter=NOTOK
Component ID   Status   Currently Functioning
1:Disk:4:9     Failed   no

As shown in Example 10-6, the disk_list command provides more in-depth information for any individual disk in the XIV Storage System, which might be helpful in determining the root cause of a disk failure. If the command is issued without the disk parameter, all the disks in the system are displayed.
Example 10-6 The disk_list command
C:\XIV>xcli -c clr26 disk_list disk=1:Disk:13:11
Component ID   Status   Currently Functioning   Capacity (GB)   Target Status
1:Disk:13:11   Failed   yes                     1TB

C:\XIV>xcli -c clr26 disk_list disk=1:Disk:13:10
Component ID   Status   Currently Functioning   Capacity (GB)   Target Status
1:Disk:13:10   OK       yes                     1TB
In Example 10-7, the module_list command displays details about the modules themselves. If the module parameter is not provided, all the modules are displayed. In addition to the status of the module, the output describes the number of disks, number of FC ports, and number of iSCSI ports.
Example 10-7 The module_list command
C:\XIV>xcli -c clr26 module_list module=1:Module:4
Component ID   Status   Currently Functioning   Type              Data Disks   FC Ports   iSCSI Ports
1:Module:4     OK       yes                     p10hw_auxiliary   12           4          0
In Example 10-8 on page 257, the ups_list command describes the current status of the Uninterruptible Power Supply (UPS) components. It provides details about when the last test was performed and its results. Equally important is the current battery charge level: a battery that is not fully charged can cause problems in case of a power failure.
Example 10-9 shows the switch_list command that is used to display the current status of the switches.
Example 10-9 The switch_list command
The psu_list command that is shown in Example 10-10 lists all the power supplies in each of the modules. There is no option to display an individual Power Supply Unit (PSU).
Example 10-10 The psu_list command
C:\XIV>xcli -c clr26 psu_list
Component ID   Status   Currently Functioning   Hardware Status
1:PSU:1:1      OK       yes                     OK
1:PSU:1:2      OK       yes                     OK
1:PSU:2:1      OK       yes                     OK
1:PSU:2:2      OK       yes                     OK
1:PSU:3:1      OK       yes                     OK
1:PSU:3:2      OK       yes                     OK
1:PSU:4:1      OK       yes                     OK
1:PSU:4:2      OK       yes                     OK
1:PSU:6:1      OK       yes                     OK
1:PSU:6:2      OK       yes                     OK
1:PSU:7:1      OK       yes                     OK
1:PSU:7:2      OK       yes                     OK
1:PSU:9:1      OK       yes                     OK
1:PSU:9:2      OK       yes                     OK
1:PSU:10:1     OK       yes                     OK
1:PSU:10:2     OK       yes                     OK
1:PSU:11:1     OK       yes                     OK
1:PSU:11:2     OK       yes                     OK
1:PSU:12:1     OK       yes                     OK
1:PSU:12:2     OK       yes                     OK
1:PSU:13:1     OK       yes                     OK
1:PSU:13:2     OK       yes                     OK
1:PSU:14:1     OK       yes                     OK
1:PSU:14:2     OK       yes                     OK
1:PSU:15:1     OK       yes                     OK
1:PSU:15:2     OK       yes                     OK
1:PSU:8:1      OK       yes                     OK
1:PSU:8:2      OK       yes                     OK
1:PSU:5:1      OK       yes                     OK
1:PSU:5:2      OK       yes                     OK
Events
Events can also be handled with the XCLI. Several commands are available to list, filter, close, and send notifications for events. Many commands and parameters are available; you can obtain detailed and complete information in the IBM XIV XCLI User Manual. Here, we only illustrate several options of the event_list command.
Example 10-11 The event_list command
xcli> XCLI -c clr26 event_list max_events=100
Index   Code                Severity        Timestamp    Alert
38454   HOST_MULTIPATH_OK   Informational   2008-08-11   no
38473   UNMAP_VOLUME        Informational   2008-08-11   no
38549   HOST_DISCONNECTED   Informational   2008-08-11   no
38542   SES_POWER_SUP_FAI   Major           2008-07-23   no
39539   MODULE_FAILED       Critical        2008-07-29   no
Several parameters can be used to sort and filter the output of the event_list command. Refer to Table 10-1 for a list of the most commonly used parameters.
Table 10-1   The event_list command parameters

Name           Description                                                    Syntax and example
max_events     Lists a specific number of events                              <event_list max_events=100>
after          Lists events after the specified date and time                 <event_list after=2008-08-11.04:04:27>
before         Lists events before the specified date and time                <event_list before=2008-08-11.14:43:47>
min_severity   Lists events with the specified and higher severities          <event_list min_severity=major>
alerting       Lists events for which an alert was sent or for which          <event_list alerting=yes>
               no alert was sent                                              <event_list alerting=no>
cleared        Lists events for which an alert was cleared or for which       <event_list cleared=yes>
               the alert was not cleared                                      <event_list cleared=no>
These parameters can be combined for better filtering. In Example 10-12, two filters were combined to limit the amount of information displayed. The first parameter max_events only allows three events to be displayed. The second parameter is the date and time that the events must not exceed. In this case, the next closest event to the date provided is event 573. The event occurred approximately 12 minutes before the cutoff time.
Example 10-12 The event_list command with max_events and date filter
C:\XIV>xcli -c clr26 event_list max_events=3 before=2008-08-11.14:56:14
Index   Code                    Severity        Timestamp             Alerting   Cleared
573     DISK_FINISHED_PHASEIN   Informational   2008-08-11 14:44:11   no         yes
574     DISK_FINISHED_PHASEIN   Informational   2008-08-11 14:44:11   no         yes
575     DISK_FINISHED_PHASEIN   Informational   2008-08-11 14:44:11   no         yes
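The gap of approximately 12 minutes between the returned events (14:44:11) and the cutoff time (14:56:14) can be checked directly:

```python
from datetime import datetime

cutoff = datetime(2008, 8, 11, 14, 56, 14)   # before= value
event = datetime(2008, 8, 11, 14, 44, 11)    # timestamp of event 573

# Minutes between the newest matching event and the cutoff.
print((cutoff - event).seconds // 60)  # → 12
```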
The event list can also be filtered on severity. Example 10-13 displays all the events in the system that contain a severity level of Major and all higher levels, such as Critical.
Example 10-13 The event_list command filtered on severity
Certain events generate an alert message that does not stop until the event has been cleared. These events are called alerting events and can be viewed in the GUI or with a separate XCLI command. After an alerting event is cleared, it is removed from this list, but it remains visible with the event_list command.
Example 10-14 The event_list_uncleared command
Monitoring statistics
The statistics gathering mechanism is a powerful tool. The XIV Storage System continually gathers performance metrics and stores them internally. Using the XCLI, data can be retrieved and filtered by using many metrics. Example 10-15 provides an example of gathering the statistics for 10 days, with each interval covering an entire day. The system is given a time stamp as the ending point for the data. Due to the magnitude of the data being provided, it is best to redirect the output to a file for further post-processing. Refer to Chapter 9, Performance characteristics on page 235 for a more in-depth view of performance.
Example 10-15 The statistics_get command
C:\XIV>xcli -c clr26 statistics_get count=10 interval=1 resolution_unit=day end=2008-08-11.14:56:14 > perf.out

The usage_get command is a powerful tool that provides details about the current utilization of pools and volumes. The system saves the usage every hour for later retrieval. This command works in the same way as the statistics_get command: you specify the time stamp to begin or end the collection and the number of entries to collect. In addition, you need to specify the pool name or the volume name.
Example 10-16 The usage_get command by pool
C:\XIV>xcli -c clr26 usage_get pool=WindowsPool max=10 start=2008-08-20.00:00:00
Time                  Volume Usage (MB)   Snapshot Usage (MB)
2008-08-20 00:00:00   32768               0
2008-08-20 01:00:00   32768               0
2008-08-20 02:00:00   32768               0
2008-08-20 03:00:00   32768               0
2008-08-20 04:00:00   32768               0
Note that the usage is displayed in MB. Example 10-17 shows that the volume Red_vol_1 is using 78 MB of space. The time when the data was written to the device is also recorded; in this case, the host wrote data to the volume for the first time on 14 August 2008.
Example 10-17 The usage_get command by volume
C:\XIV>xcli -c clr26 usage_get vol=Red_vol_1 max=10 start=2008-08-14.05:00:00
Time                  Volume Usage (MB)   Snapshot Usage (MB)
2008-08-14 05:00:00   0                   0
2008-08-14 06:00:00   0                   0
2008-08-14 07:00:00   0                   0
2008-08-14 08:00:00   69                  0
2008-08-14 09:00:00   69                  0
2008-08-14 10:00:00   78                  0
2008-08-14 11:00:00   78                  0
2008-08-14 12:00:00   78                  0
2008-08-14 13:00:00   78                  0
2008-08-14 14:00:00   78                  0
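After redirecting usage_get output to a file, post-processing is straightforward. The following sketch parses data rows in the layout shown in these examples; the exact column layout may differ between XCLI versions, so treat the parsing rule as an assumption:

```python
def parse_usage_line(line):
    """Parse one data row of usage_get output into
    (timestamp, volume_usage_mb, snapshot_usage_mb); return None
    for header or blank lines."""
    parts = line.split()
    if len(parts) != 4 or parts[0].count("-") != 2:
        return None
    date, time, vol_mb, snap_mb = parts
    return f"{date} {time}", int(vol_mb), int(snap_mb)

sample = "2008-08-14 10:00:00   78   0"
print(parse_usage_line(sample))  # → ('2008-08-14 10:00:00', 78, 0)
```

Feeding every line of the redirected file through this function yields a time series that can be plotted or aggregated.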
SNMP agent
An SNMP agent is a daemon process that provides access to the MIB objects on IP hosts on which the agent is running. An SNMP agent, or daemon, is implemented in the IBM XIV software and provides access to the MIB objects defined in the system. The SNMP daemon can send SNMP trap requests to SNMP managers to indicate that a particular condition exists on the agent system, such as the occurrence of an error.
SNMP manager
An SNMP manager can be implemented in two ways. It can be a simple command tool that collects information from SNMP agents, or it can be composed of multiple daemon processes and database applications. This type of complex SNMP manager provides monitoring functions using SNMP and typically has a graphical user interface for operators. The SNMP manager gathers information from SNMP agents and accepts trap requests sent by SNMP agents. In addition, the SNMP manager generates traps when it detects status changes or other unusual conditions while polling network objects. IBM Director is an example of an SNMP manager with a GUI interface.
SNMP trap

A trap is a message sent from an SNMP agent to an SNMP manager without a specific request from the SNMP manager. SNMP defines six generic types of traps and allows you to define enterprise-specific traps. The trap structure conveys the following information to the SNMP manager:
- Agent's object that was affected
- IP address of the agent that sent the trap
- Event description (either a generic trap or an enterprise-specific trap, including the trap number)
- Time stamp
- Optional enterprise-specific trap identification
- List of variables describing the trap
SNMP communication
The SNMP manager sends SNMP get, get-next, or set requests to SNMP agents, which listen on UDP port 161, and the agents send back a reply to the manager. The SNMP agent can be implemented on any kind of IP host, such as UNIX workstations, routers, and network appliances, and also on the XIV Storage System. You can gather various kinds of information about specific IP hosts by sending SNMP get and get-next requests, and you can update the configuration of IP hosts by sending the SNMP set request. The SNMP agent can send SNMP trap requests to SNMP managers, which listen on UDP port 162. The trap requests sent from SNMP agents can be used to send warning, alert, or error notification messages to SNMP managers. Figure 10-10 on page 262 illustrates the characteristics of SNMP architecture and communication.
[Figure 10-10: SNMP architecture and communication. An SNMP manager such as IBM Director listens for, and replies to, trap traffic on UDP port 162 and communicates over the IP network with SNMP agents, whose objects are described by a Management Information Base (MIB).]
You can configure an SNMP agent to send SNMP trap requests to multiple SNMP managers.
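The transport side of this exchange can be sketched in a few lines. The toy below (a Python illustration of the UDP transport only; real SNMP traps are ASN.1 BER-encoded PDUs, and the well-known ports are 161 for agent requests and 162 for traps) binds an ephemeral port so it runs without root privileges:

```python
import socket

# A "manager" socket waits for a datagram and an "agent" socket sends one to
# it, mirroring the unsolicited nature of a trap. The payload is a placeholder,
# not a real SNMP message; port 0 stands in for the privileged UDP port 162.
manager = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
manager.bind(("127.0.0.1", 0))
manager.settimeout(5)
trap_port = manager.getsockname()[1]

agent = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
agent.sendto(b"cold-start trap (placeholder payload)", ("127.0.0.1", trap_port))

# The manager receives the message without ever having sent a request.
payload, sender = manager.recvfrom(4096)
agent.close()
manager.close()
```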
3. The Define Destination dialog is now open. Enter a Destination Name (a unique name of your choice) and the IP address or Domain Name System (DNS) name of the server where the SNMP management software is installed. Refer to Figure 10-13 on page 264.
4. Click Define to add the SNMP manager as a destination for SNMP traps. Your XIV Storage System is now set up to send SNMP traps to the defined SNMP manager, which will process the received traps according to the MIB file.
A typical IBM Director environment includes:
- Management servers: One or more servers on which IBM Director Server is installed
- Managed systems: Servers, workstations, and any computer managed by IBM Director
- Management consoles: Servers and workstations from which you communicate with one or more IBM Director Servers
- SNMP devices: Network devices, disk systems, or computers that have SNMP agents installed or embedded (such as the XIV Storage System)

Figure 10-14 on page 265 depicts a typical IBM Director management environment.
[Figure 10-14: A typical IBM Director management environment. A management console (IBM Director Console installed) communicates over TCP/IP with a management server (IBM Director Server installed, which also installs IBM Director Agent and IBM Director Console); the management server reaches SNMP devices over TCP/IP and various other protocols.]
2. In the MIB Management window, click File → Select MIB to Compile.
3. In the Select MIB to Compile window that is shown in Figure 10-16, specify the directory and file name of the MIB file that you want to compile, and click OK. A status window indicates the progress.
When you compile a new MIB file, it is also automatically loaded into the Loaded MIBs directory and is ready for use. To load an already compiled MIB file:
1. In the MIB Management window, click File → Select MIB to Load.
2. Select the MIB that you want to load in the Available MIBs window, then click Add, Apply, and OK.
This action loads the selected MIB file, and IBM Director is ready to be configured for monitoring the IBM XIV.
In the Discovery Preferences window that is shown in Figure 10-18 on page 269, follow these steps to discover XIV Storage Systems:
a. Click the Level 0: Agentless System tab.
b. Click Add to bring up a window to specify whether you want to add a single address or an address range. Select Unicast Range.
Note: Because each XIV system is presented through three IP addresses, select Unicast Range when configuring the auto-discovery preferences.
c. Enter the address range for the XIV systems in your environment. You can also set the Auto-discover period and the Presence Check period.
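The idea behind a unicast range entry can be illustrated briefly. The sketch below (a Python illustration; the function name and the addresses, drawn from the 192.0.2.0/24 documentation range, are hypothetical) expands an inclusive start/end pair into the individual addresses a discovery sweep would probe, which is convenient because each XIV system is reached through three consecutive management IPs:

```python
import ipaddress

def unicast_range(start, end):
    """Expand an inclusive IPv4 address range into individual addresses."""
    lo = int(ipaddress.IPv4Address(start))
    hi = int(ipaddress.IPv4Address(end))
    return [str(ipaddress.IPv4Address(i)) for i in range(lo, hi + 1)]

# A three-address range covering one system's management interfaces:
addrs = unicast_range("192.0.2.10", "192.0.2.12")
```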
After you have set up the Discovery Preferences, the IBM Director will discover the XIV Storage Systems and add them to the IBM Director Console as seen in Figure 10-19.
At this point, the IBM Director and IBM Director Console are ready to receive SNMP traps from the discovered XIV Storage Systems. With the IBM Director, you can display general information about your IBM XIVs, monitor the Event Log, and browse more information.
Event Log
To open the Event Log, right-click the entry corresponding to your XIV Storage System and select Event Log from the pop-up menu that is shown in Figure 10-21.
The Event Log window can be configured to show the events for a defined time frame or to limit the number of entries to display. Selecting a specific event will display the Event Details in a pane on the right side of the window as shown in Figure 10-22.
Event actions
Based on the SNMP traps and the events, you can define different Event Actions with the Event Actions Builder, as illustrated in Figure 10-23. Here, you can define several actions for IBM Director to perform in response to specific traps and events. IBM Director offers a wizard to help you define an Event Action Plan. Start the wizard by selecting Tasks → Event Action Plans → Event Action Plan Wizard in the IBM Director Console window. The wizard guides you through the setup. The window in Figure 10-23 shows IBM Director configured to send an e-mail for all events (to a predefined e-mail address or to a group of e-mail addresses).
From the toolbar, click Configuration to invoke the Configuration wizard. The wizard will guide you through the configuration of Gateways, Destinations, and Rules. The wizard Welcome window is shown in Figure 10-25 on page 274.
Gateways
The wizard will first take you through the configuration of the gateways. Click Next to display the Events Configuration - Gateway dialog. The steps are: 1. Click Define Gateway. The Gateway Create Welcome panel shown in Figure 10-26 appears. Click Next.
2. The Gateway Create - Select gateway type panel displays as shown in Figure 10-27 on page 275.
3. Type: The wizard asks for the type of the gateway: SMTP for e-mail notification or SMS for notification by SMS message. Check either SMTP or SMS.
4. The next steps differ for SMTP and SMS. Our illustration from now on is for SMTP; the steps for SMS are similarly self-explanatory. Enter:
- Name: The gateway name of the SMTP or SMS gateway, depending on the previously selected Type.
- Address: The IP address or DNS name of the SMTP gateway.
- Sender: In case of e-mail problems, such as a wrong e-mail address, a response e-mail is sent to this address. You can either enter an address for this server or use the system-wide global address:
  - Use default sender
  - Use new sender address
Destinations
Next, the Configuration wizard will guide you through the setup of the destinations where you configure e-mail addresses or SMS receivers. The Welcome panel is displayed. Click Next to proceed. The Select Destination type panel, shown in Figure 10-28, is displayed.
Here you configure:
- Type: The event notification destination type, which can be a destination group (containing other destinations), an SNMP manager for sending SNMP traps, an e-mail address for e-mail notification, or a mobile phone number for SMS notification:
  - SNMP
  - EMAIL
  - SMS
  - Group of Destinations

Depending on the selected type, the remaining configuration information differs but is self-explanatory. Our illustration from now on is for SNMP. Enter:
- Destination Name: The name of the new destination. Use a meaningful name that you can remember; usually, there will be many destinations in a system.
- SNMP: The DNS name or the IP address of the SNMP manager.
Rules
At this point, the rules for the event notification can be defined. The rules are either based on the severity, an event code, or a combination of both the severity and the event code. The Welcome panel is displayed. Click Next. The Rule Create - Rule name panel shown in Figure 10-29 on page 277 is displayed.
To define a rule, configure:
- Rule Name: Enter a name for the new rule. Names are case-sensitive and can contain letters, digits, or the underscore character (_). You cannot use the name of an already defined rule.
- Rule condition setting: Select severity if you want the rule to be triggered by severity, event code if you want the rule to be triggered by a specific event, or both severity and event code for events that can have multiple severities depending on a threshold of certain parameters:
  - Severity only
  - Event Code only
  - Both severity and event code
- Select the severity trigger: Select the minimum severity to trigger the rule's activation. Events of this severity or higher trigger the rule.
- Select the event code trigger: Select the event code to trigger the rule's activation.
- Rule destinations: Select destinations and destination groups to be notified when the event condition occurs. You can select one or more existing destinations or define a new destination (refer to Figure 10-30).
- Rule snooze: Defines whether the system repeatedly alerts the defined destinations until the event is cleared. If so, a snooze time must be selected:
  - Check Use snooze timer
  - Snooze time in minutes
- Rule escalation: Defines whether the system sends alerts via other rules if the event is not cleared within a certain time. If so, an escalation rule and time must be specified:
  - Check Use escalation rule
  - Escalation Rule
  - Escalation time in minutes
  - Create Escalation Rule
The SMTP gateway is defined with the smtpgw_define command:

C:\XIV>xcli -c clr26 smtpgw_define smtpgw=test address=test.ibm.com from_address=xiv@us.ibm.com
Command executed successfully.
C:\XIV>xcli -c clr26 smtpgw_list
Name      Address        Priority
relay_de  9.149.165.228  1
test      test.ibm.com   2

The SMS gateway is defined in a similar way. The difference is that its fields can use tokens to create variable text instead of static text. When specifying the address to which to send the SMS message, tokens can be used instead of hard-coded values; the message body also uses a token so that the error message is sent instead of hard-coded text. Example 10-19 provides an example of defining an SMS gateway. The tokens available for the SMS gateway definition are:
- {areacode}: This escape sequence is replaced by the destination's mobile or cellular phone number area code.
- {number}: This escape sequence is replaced by the destination's local cellular number.
- {message}: This escape sequence is replaced by the text to be shown to the user.
- \{, \}, \\: These symbols are replaced by {, }, or \, respectively.
Example 10-19 The smsgw_define command
C:\XIV>xcli -c clr26 smsgw_define smsgw=test email_address={areacode}{number}@smstest.ibm.com subject_line="XIV System Event Notification" email_body={message} Command executed successfully.
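The token mechanism above can be modeled to see how an address such as the one in Example 10-19 is produced. The sketch below (a Python illustration; `expand_sms_tokens` is a hypothetical helper, not part of the XCLI) substitutes the three tokens and honors the \{, \}, and \\ escapes:

```python
def expand_sms_tokens(template, area_code, number, message):
    r"""Expand the SMS-gateway escape sequences in a template string.

    Follows the token list above: {areacode}, {number}, {message}, plus the
    escapes \{, \}, and \\. A hand-rolled scanner is used so that escaped
    braces are not mistaken for token delimiters. Illustrative sketch only.
    """
    tokens = {"areacode": area_code, "number": number, "message": message}
    out, i = [], 0
    while i < len(template):
        ch = template[i]
        if ch == "\\" and i + 1 < len(template) and template[i + 1] in "{}\\":
            out.append(template[i + 1])   # escaped literal brace or backslash
            i += 2
        elif ch == "{":
            end = template.index("}", i)  # token name between the braces
            out.append(tokens[template[i + 1:end]])
            i = end + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# The address template from Example 10-19, expanded for one destination:
addr = expand_sms_tokens("{areacode}{number}@smstest.ibm.com", "555", "5555555", "")
```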
When the gateways are defined, the destination settings can be defined. There are three types of destinations:
- SMTP (e-mail)
- SMS
- SNMP
Example 10-20 provides an example of creating a destination for each of the three types of notifications. For the e-mail notification, the destination receives a test message every Monday at 12:00. Each destination can be set to receive notifications on multiple days of the week at multiple times.
Example 10-20 Destination definitions
C:\XIV>xcli -c clr26 dest_define dest=emailtest type=EMAIL email_address=test@ibm.com smtpgws=ALL heartbeat_test_hour=12:00 heartbeat_test_days=Mon
Command executed successfully.
C:\XIV>xcli -c clr26 dest_define dest=smstest type=SMS area_code=555 number=5555555 smsgws=ALL
Command executed successfully.
C:\XIV>xcli -c clr26 dest_define dest=snmptest type=SNMP snmp_manager=9.9.9.9
Command executed successfully.
C:\XIV>xcli -c clr26 dest_list
Name          Type   Email Address        User  Area Code  Phone Number  SNMP Manager
BladeC4_W2K3  SNMP
relay         EMAIL  moscheka@de.ibm.com
emailtest     EMAIL  test@ibm.com
smstest       SMS                               555        5555555
snmptest      SNMP                                                       9.9.9.9
Finally, the rules can be set for which messages can be sent. Example 10-21 provides two examples of setting up rules. The first rule is for SNMP and e-mail messages and all messages, even informational messages, are sent to the processing servers. The second example creates a rule for SMS messages. Only critical messages are sent to the SMS server, and they are sent every 15 minutes until the error condition is cleared.
Example 10-21 Rule definitions
C:\XIV>xcli -c clr26 rule_create rule=emailtest min_severity=informational dests=emailtest,snmptest
Command executed successfully.
C:\XIV>xcli -c clr26 rule_create rule=smstest min_severity=critical dests=smstest snooze_time=15
Command executed successfully.
C:\XIV>xcli -c clr26 rule_list
Name         Minimum Severity  Event Codes  Except Codes  Destinations  Active  Escalation Only
markus_rule  none              all                                              no
emailtest    Informational                                                      no
smstest      Critical                                                           no
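The min_severity filtering that these rules perform can be sketched compactly. In the Python model below, the severity ladder (informational < warning < minor < major < critical) is an assumption for illustration; consult the XCLI reference for the exact set of severities the system defines:

```python
# Assumed severity ordering, lowest to highest.
SEVERITIES = ["informational", "warning", "minor", "major", "critical"]

def destinations_for(event_severity, rules):
    """Return the destinations of every rule whose min_severity is met."""
    level = SEVERITIES.index(event_severity)
    hit = []
    for rule in rules:
        if level >= SEVERITIES.index(rule["min_severity"]):
            hit.extend(rule["dests"])
    return hit

# The two rules created in Example 10-21, modeled as dictionaries:
rules = [
    {"name": "emailtest", "min_severity": "informational",
     "dests": ["emailtest", "snmptest"]},
    {"name": "smstest", "min_severity": "critical", "dests": ["smstest"]},
]
critical_dests = destinations_for("critical", rules)  # both rules fire
warning_dests = destinations_for("warning", rules)    # only the e-mail rule fires
```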
Example 10-22 shows illustrations of deleting rules, destinations, and gateways. It is not possible to delete a destination if a rule is using that destination, and it is not possible to delete a gateway if a destination is pointing to that gateway.
Example 10-22 Deletion of notification setup
C:\XIV>xcli -c clr26 -y rule_delete rule=smstest
Command executed successfully.
C:\XIV>xcli -c clr26 -y rule_delete rule=emailtest
Command executed successfully.
C:\XIV>xcli -c clr26 -y dest_delete dest=smstest
Command executed successfully.
C:\XIV>xcli -c clr26 -y dest_delete dest=emailtest
Command executed successfully.
C:\XIV>xcli -c clr26 -y dest_delete dest=snmptest
Command executed successfully.
C:\XIV>xcli -c clr26 -y smsgw_delete smsgw=test
Command executed successfully.
C:\XIV>xcli -c clr26 -y smtpgw_delete smtpgw=test
Command executed successfully.
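The deletion-order constraint described above (rules before destinations, destinations before gateways) can be expressed as two precondition checks. This is a Python sketch of the dependency logic only; the function names and data layout are illustrative, not part of the XCLI:

```python
def can_delete_destination(dest, rules):
    """A destination can be deleted only if no rule still references it."""
    return all(dest not in rule["dests"] for rule in rules)

def can_delete_gateway(gw, destinations):
    """A gateway can be deleted only if no destination still points to it."""
    return all(gw not in d.get("gateways", []) for d in destinations)

# Modeled after Example 10-22: a rule using "smstest" and a destination
# pointing at gateway "test".
rules = [{"name": "smstest", "dests": ["smstest"]}]
destinations = [{"name": "smstest", "gateways": ["test"]}]

blocked = can_delete_destination("smstest", rules)      # a rule still uses it
allowed = can_delete_destination("unused_dest", rules)  # nothing references it
gw_blocked = can_delete_gateway("test", destinations)   # a destination uses it
```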
Remote connection
The Remote Support Center has two ways to connect to the system. Depending on the client's choice, the support specialist can connect either by using a modem dial-up connection or, if provided and agreed to by the client, a secure, high-speed VPN connection. These possibilities are depicted in Figure 10-31. In case of problems, the remote specialist can analyze problems and also assist an IBM SSR dispatched on-site in repairing the system or in replacing field-replaceable units (FRUs). Remote access is protected by different passwords for different access levels to prevent unauthorized remote access. For details, you can also refer to Chapter 6, Security on page 125.
[Figure 10-31: Remote support connection options. The IBM XIV Remote Support Center can reach the IBM XIV Storage System either over a secure, high-speed VPN connection through the Internet and the firewalls of the client and IBM, or over a modem-to-modem dial-up connection through a telephone line.]
To enable remote support, you must allow an external connection, either:
- A telephone line
- An Internet connection through your firewall that allows IBM to use a VPN connection to your XIV Storage System
[Figure: Remote support problem flow. The IBM XIV sends a problem notification, or the user reports the problem to the IBM XIV Support Center; the Support Center initiates parts for on-site dispatch and informs the CMC (Call Management Center); the problem is solved.]
Either a call from the user or an e-mail notification generates an IBM internal problem record and alerts the IBM XIV Support Center. A Support Center specialist remotely connects to the system and evaluates the situation to decide what further actions to take to solve the reported issue:
- Remote repair: Depending on the nature of the problem, a specialist fixes the problem while connected.
- Data collection: The specialist collects data from the system for analysis in order to develop an action plan to solve the problem.
- On-site repair: The specialist provides an action plan, including needed parts, to the Call Management Center (CMC) to initiate an IBM SSR repairing the system on-site.
- IBM SSR assistance: The specialist supports the IBM SSR during on-site repair via the remote connection.

The architecture of the IBM XIV is self-healing. Failing units are logically removed from the system automatically, which greatly reduces the potential impact of the event and results in service actions being performed in a fully redundant state. For example, if a disk drive fails, it is automatically removed from the system. The process has minimal impact on performance, because only a small part of the available resources has been removed. The rebuild time is fast, because most of the remaining drives participate in redistributing the data. Due to this self-healing mechanism, with most failures there is no need for urgent action, and service can be performed at a convenient time. The IBM XIV will be in a fully redundant state, which mitigates issues that might otherwise arise if a failure occurs during a service action.
Chapter 11. Copy functions
The XIV Storage System has a rich set of copy functions suited to various data protection scenarios, which enable clients to set up business continuance, data migration, and online backup solutions. This chapter provides an overview of the snapshot and Volume Copy functions of the XIV product and describes the requirements, application range, and implementation of these two copy functions in an enterprise environment. The Remote Mirror function is covered in Chapter 12, Remote Mirror on page 323.
11.1 Snapshots
A snapshot is a point-in-time copy of a volume's data. The XIV snapshot is based on several innovative technologies to minimize degradation of, or impact on, system performance.
The alternative to redirect on write is the copy on write function. Most systems do not move the location of the volume's data. Instead, when the disk subsystem receives a change, it copies the volume's data to a new location for the point-in-time copy. When the copy is complete, the disk system commits the newly modified data. Therefore, each individual modification takes longer, because the entire block must be copied before the change can be made.

As the storage assigned to snapshots becomes fully utilized, the XIV Storage System invokes a deletion mechanism to protect itself from overutilizing the assigned pool space. It is important to note that the snapshot pool does not need to be full before a deletion occurs. If the space assigned for snapshots reaches its limit on a single physical disk, the entire snapshot is deleted on all disks (refer to Figure 11-2). If you know in advance that an automatic deletion is possible, the pool can be expanded to accommodate additional snapshots. This expansion requires that there is available space on the system for the Storage Pool. Refer to Resizing Storage Pools on page 97 for details about adding more space to an existing Storage Pool.
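The redirect-on-write behavior can be illustrated with a toy pointer model. This Python sketch is purely illustrative (the real system works on fixed-size partitions striped across many disks): a snapshot copies only the pointer table, and a later write lands in a fresh partition, so the snapshot keeps seeing the original data.

```python
class RedirectOnWriteVolume:
    """Toy model of redirect-on-write snapshots (illustrative only)."""

    def __init__(self, data):
        self.store = dict(enumerate(data))  # partition id -> data
        self.next_id = len(self.store)
        self.ptrs = list(self.store)        # the volume's partition pointers
        self.snapshots = {}

    def snapshot(self, name):
        # Instant: only the pointer table is copied; no data moves.
        self.snapshots[name] = list(self.ptrs)

    def write(self, index, data):
        # Redirect on write: new data lands in a fresh partition and only the
        # volume's pointer changes; snapshot pointers keep the old partition.
        self.store[self.next_id] = data
        self.ptrs[index] = self.next_id
        self.next_id += 1

    def read(self, ptrs=None):
        return [self.store[p] for p in (ptrs if ptrs is not None else self.ptrs)]

vol = RedirectOnWriteVolume(["A", "B"])
vol.snapshot("snap1")
vol.write(0, "A2")
current = vol.read()                        # the volume sees the new data
frozen = vol.read(vol.snapshots["snap1"])   # the snapshot still sees the original
```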
[Figure 11-2: Snapshot space on a single disk. Snapshot 3 allocates a partition and Snapshot 1 is deleted, because there must always be at least one free partition available for any subsequent snapshot.]
Each snapshot has a deletion priority property that is set by the user. There are four priorities, with 1 being the highest priority and 4 being the lowest. The system uses this priority to determine which snapshot to delete first: the lowest-priority snapshot becomes the first candidate for deletion. If multiple snapshots have the same deletion priority, the XIV system deletes the snapshot that was created first. Refer to Deletion priority on page 291 for an example of working with deletion priorities.

A snapshot also has a unique ability to be unlocked. By default, a snapshot is locked on creation and is read-only. Unlocking a snapshot allows the user to modify the data in the snapshot for post-processing. When unlocked, the snapshot takes on the properties of a volume and can be resized or modified. As soon as the snapshot has been unlocked, the modified property is set. The modified property cannot be reset, even if the snapshot is relocked without modification.
In some cases, it might be important to duplicate a snapshot. When duplicating a snapshot, the duplicate snapshot points to the original data and has the same creation date as the original snapshot, if the first snapshot has not been unlocked. This feature can be beneficial when the user wants to have one copy for a backup and another copy for testing purposes. If the first snapshot is unlocked and the duplicate snapshot already exists, the creation time for the duplicate snapshot does not change, and the duplicate snapshot points to the original snapshot. If a duplicate snapshot is created from the unlocked snapshot, the creation date is the time of duplication, and the duplicate snapshot points at the original snapshot.

An application can utilize many volumes on the XIV Storage System; for example, a database application can span several volumes for journaling and user data. In this case, the snapshot for the volumes must occur at the same moment in time so that the journal and data are synchronized. The Consistency Group allows the user to perform the snapshot on all the volumes assigned to the group at the same moment in time, therefore enforcing data consistency.

The XIV Storage System creates a special snapshot related to the Remote Mirroring functionality. During the recovery process of a lost link, the system creates a snapshot of all the volumes in the system. This snapshot is used if the synchronization process fails, so that the data can be restored to a point of known consistency. A special value of the deletion priority is used to prevent this snapshot from being automatically deleted. Refer to 11.1.4, Snapshot with Remote Mirror on page 310 for an example of this snapshot.
Creating a snapshot
Snapshot creation is a simple and easy task to accomplish. Using the Volumes and snapshots view, right-click the volume and select Create Snapshot. Figure 11-3 depicts how to make a snapshot of volume redbook_markus_01.
The new snapshot is displayed in Figure 11-4. The XIV Storage System uses a specific naming convention: the name of the volume, followed by the word snapshot, and then a number or count of snapshots for the volume. The snapshot is the same size as the master volume; however, the view does not display how much space the snapshot has used. The view shown in Figure 11-4 provides three other details:
- The locked property of the snapshot. By default, a snapshot is locked at the time of creation.
- The modified property, displayed to the right of the locked property. In this example, the snapshot has not been modified.
- The creation date. For this example, the snapshot was created on 6 August 2008 at 19:17.
After making a snapshot, the next option is to create a duplicate snapshot for backup purposes. The duplicate has the same creation date as the first snapshot, and it also has a similar creation process. From the Volumes and snapshots view, right-click the snapshot to duplicate. Select Duplicate from the menu to create a new duplicate snapshot. Figure 11-5 provides an example of duplicating the snapshot redbook_markus_01.snapshot_00001.
After selecting Duplicate from the menu, the duplicate snapshot is displayed directly under the original snapshot. Note, the creation date of the duplicate snapshot in Figure 11-6 is the same creation date as the original snapshot. Even though it is not shown, the duplicate snapshot points to the master volume, not the original snapshot.
An example of creating a snapshot and a duplicate snapshot with the Extended Command Line Interface (XCLI) is provided in Example 11-1.
Example 11-1 Creating a snapshot and a duplicate with the XCLI
xcli -c MZ_PFE_1 snapshot_create vol=redbook_markus_03
xcli -c MZ_PFE_1 snapshot_duplicate snapshot=redbook_markus_03.snapshot_00003

After the snapshot is created, it must be mapped to a host in order to access the data. This action is performed in the same way as mapping a normal volume; refer to 5.5, Host definition and mappings on page 113 for the process. It is important to note that a snapshot is an exact replica of the original volume. Certain hosts do not properly handle having two volumes with the exact same metadata describing them; in these cases, you must map the snapshot to a different host to prevent failures. A snapshot can only be created in the volume's Storage Pool; it cannot be created in a different Storage Pool than the one that owns the volume. If a volume is moved to another Storage Pool, its snapshots move with it to the new Storage Pool (provided that there is enough space).
The GUI displays all the volumes in a list. Scroll down to the snapshot of interest and select it by clicking its name. Details of the snapshot are displayed in the upper right panel. The volume redbook_markus_01 contains a snapshot 00001 and a duplicate snapshot 00002. The snapshot and the duplicate snapshot have the same creation date of 2008-07-30 13:29:35, as shown in Figure 11-8. In addition, the snapshot is locked, it has not been modified, and it has a deletion priority of 1 (the highest priority, meaning it is deleted last). Along with these properties, the tree view shows a hierarchical structure of the snapshots. This structure provides details about restoring and overwriting snapshots: any snapshot can be overwritten by any parent snapshot, and any child snapshot can restore a parent snapshot or a volume in the tree structure. In Figure 11-8, the duplicate snapshot is a child of the original snapshot; in other words, the original snapshot is the parent of the duplicate snapshot. This structure has nothing to do with how the XIV Storage System manages the snapshot pointers but is intended to provide an organizational flow for snapshots.
Example 11-2 is an example of viewing the snapshot data in the XCLI. Due to space limitations, only a small portion of the data is displayed from the output.
Example 11-2 Viewing the snapshots on the XCLI
xcli -c MZ_PFE_1 snapshot_list vol=redbook_markus_03
Name                              Size (GB)  Master Name
redbook_markus_03.snapshot_00003  17         redbook_markus_03
redbook_markus_03.snapshot_00004  17         redbook_markus_03
Deletion priority
Deletion priority enables the user to rank the importance of the snapshots within a pool. In the current example, the duplicate snapshot redbook_markus_01.snapshot_00002 is not as important as the original snapshot redbook_markus_01.snapshot_00001; therefore, its deletion priority is reduced. If the snapshot space becomes full, the duplicate snapshot is deleted first, even though the original snapshot is older. To modify the deletion priority, right-click the snapshot in the Volumes and snapshots view and select Change Deletion Priority, as shown in Figure 11-9 on page 292.
After clicking Change Deletion Priority, select the desired deletion priority from the dialog window and accept the change by clicking OK. Figure 11-10 shows the four options that are available for setting the deletion priority. The lowest priority setting is 4, which causes the snapshot to be deleted first. The highest priority setting is 1, and these snapshots are deleted last. All snapshots have a default deletion priority of 1, if not specified on creation.
Figure 11-11 confirms that the duplicate snapshot has had its deletion priority lowered to 4. As shown in the upper right panel, the Delete Priority is reporting a 4 for snapshot redbook_markus_01.snapshot_00002.
To change the deletion priority for the XCLI, just specify the snapshot and new deletion priority, as illustrated in Example 11-3 on page 293.
Snapshot restoration
The XIV Storage System provides the ability to restore the data from a snapshot back to the master volume, which can be helpful when data was modified incorrectly and you want to return to the earlier version. From the Volumes and snapshots view, right-click the snapshot and click Restore. This action causes a dialog box to appear; click OK to perform the restoration. Figure 11-12 illustrates selecting the Restore action on the snapshot redbook_marcus_01.snapshot_00001.

After you perform the restore action, you return to the Volumes and snapshots panel. The process is instantaneous, and none of the properties (creation date, deletion priority, modified property, or locked property) of the snapshot or the volume have changed. Specifically, the process modifies the pointers of the master volume so that they are equivalent to the snapshot's pointers. This change only occurs for partitions that have been modified. On modification, the XIV Storage System stored the data in a new partition and pointed the master volume at the new partition; the snapshot's pointer did not change and remained pointing at the original data. The restoration process resets the volume's pointer back to the original data and frees the modified partition space.
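Restoration as a pointer operation can be shown in a few lines. This is a toy Python model (partition ids and data values are invented): the volume's pointer table is simply reset to the snapshot's, which is why the restore is instantaneous and why the since-modified partition becomes free.

```python
# Partition store: partition id -> data. Partition 2 holds data written after
# the snapshot was taken; the snapshot still references the original partition 1.
store = {0: "orig", 1: "orig", 2: "new"}
volume_ptrs = [0, 2]      # the volume was redirected to partition 2 on write
snapshot_ptrs = [0, 1]    # the snapshot kept the original pointers

# Restore: copy the snapshot's pointer table back to the volume.
volume_ptrs = list(snapshot_ptrs)

# The partition modified since the snapshot is no longer referenced.
freed = set(store) - set(volume_ptrs)
restored = [store[p] for p in volume_ptrs]
```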
The XCLI provides more options for restoration than the GUI. With the XCLI, you can restore a snapshot to a parent snapshot (Example 11-4). The GUI only allows a snapshot to be restored to the master volume. If the target snapshot is not specified, the data is restored to the master volume. In addition, you must specify the -y option when issuing the command, which tells the XCLI to respond affirmatively when prompted for validation to run the command. Important: The XCLI provides more functionality with the snapshots than the GUI. In this case, the XCLI allows the snapshot to be restored to another snapshot and not just the master volume.
Example 11-4 Restoring a snapshot to another snapshot
Overwriting snapshots
Certain situations require the snapshot to be refreshed or updated with the latest changes to the data. For instance, a backup application requires the latest copy of the data to perform its backup operation. The overwrite operation resets the snapshot's pointers to the master volume's current data; therefore, all pointers to the original point-in-time data are lost, and the snapshot appears new. From the Volumes and Snapshots view, right-click the snapshot to overwrite. Select Overwrite from the menu, and a dialog box appears. Click OK to validate the overwriting of the snapshot. Figure 11-13 illustrates overwriting the snapshot named redbook_marcus_01.snapshot_00001.
It is important to note that the overwrite process modifies the snapshot properties and pointers when involving duplicates. Figure 11-14 shows two changes to the properties. The snapshot named redbook_marcus_01.snapshot_00001 has a new creation date. The duplicate snapshot still has the original creation date. However, it no longer points to the original snapshot; instead, it points to the master volume according to the snapshot tree, which prevents a restoration of the duplicate to the original snapshot. If the overwrite occurs on the duplicate snapshot, the duplicate creation date is changed, and the duplicate is now pointing to the master volume.
Figure 11-14 Snapshot tree after the overwrite process has occurred
The XCLI performs the overwrite operation through the snapshot_create command. There is an optional parameter in the command to specify which snapshot to overwrite. If the optional parameter is not used, the master volume is overwritten.
Example 11-5 Overwriting a snapshot
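The body of this example is not reproduced here. A sketch of the overwrite invocation through snapshot_create, assuming the optional parameter described above is named overwrite (an assumption; verify the parameter name against your XCLI reference):

```
xcli -c MZ_PFE_1 -y snapshot_create vol=redbook_marcus_01 overwrite=redbook_marcus_01.snapshot_00001
```

If the optional parameter is omitted, a new snapshot is created rather than an existing one being overwritten.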
Unlocking a snapshot
At certain times, it might be beneficial to modify the data in a snapshot. This feature is useful for performing tests on a set of data or for other types of data mining activities. There are two scenarios to investigate when unlocking snapshots. The first scenario is to unlock the duplicate. By unlocking the duplicate, none of the snapshot properties are modified, and the structure remains the same. This method is straightforward and provides a backup of the master volume along with a working copy for modification. To unlock the snapshot, simply right-click the snapshot and select Unlock, as shown in Figure 11-15.
The results in the Snapshots Tree window show that the Locked property is off and the Modified property is on for redbook_markus_01.snapshot_00002. Even if the volume is relocked or overwritten with the original master volume, the modified property remains on. Also, note that in Figure 11-16, the structure is unchanged; the parent of the duplicate is still redbook_markus_01.snapshot_00001. If an error occurs in the modified duplicate snapshot, the duplicate snapshot can be deleted, and the original snapshot duplicated a second time to restore the information.
For the second scenario, the original snapshot is unlocked and not the duplicate. Figure 11-17 on page 296 shows the new property settings for redbook_markus_01.snapshot.00001. At this point, the duplicate snapshot mirrors the unlocked snapshot, because both snapshots still point to the original data. While the unlocked snapshot is modified, the duplicate snapshot references the original data. If the unlocked snapshot is deleted, the duplicate snapshot remains, and its parent becomes the master volume. Because the hierarchical snapshot structure was unmodified, the duplicate snapshot can be overwritten by the original snapshot. The duplicate snapshot can be restored to the master volume. Based on the results, this process is no different from the first scenario. There is still a backup and a working copy of the data.
Unlocking a snapshot works the same way as unlocking a volume. Again, the -y parameter needs to be specified in order to provide an affirmative response to the validation request.
Example 11-6 Unlocking a snapshot
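The body of this example is not reproduced here. A sketch of the unlock invocation, assuming the command is vol_unlock, the counterpart of the vol_lock command covered in the next section (parameter names are assumptions; verify against your XCLI reference):

```
xcli -c MZ_PFE_1 -y vol_unlock vol=redbook_markus_01.snapshot_00002
```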
Locking a snapshot
If the changes made to a snapshot need to be preserved, you can lock an unlocked snapshot. Figure 11-18 shows locking the snapshot named redbook_markus_01.snapshot.00001. From the Volumes and snapshots panel, right-click the snapshot to lock and select Lock. The snapshot is locked immediately.
The locking process completes immediately, preventing further modification to the snapshot. In Figure 11-19, the snapshot redbook_markus_01.00001 shows that both the locked and the modified properties are on. Even though there has not been a change to the snapshot since it was relocked, the system does not remove the modified property.
The XCLI lock command (vol_lock), which is shown in Example 11-7, is almost a mirror operation of the unlock command. Only the actual command changes, but the same operating parameters are used when issuing the command.
Example 11-7 Locking a snapshot
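The body of this example is not reproduced here. A sketch of the vol_lock invocation, using the same operating parameters as the unlock command (the vol parameter name is an assumption; verify against your XCLI reference):

```
xcli -c MZ_PFE_1 -y vol_lock vol=redbook_markus_01.snapshot_00001
```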
Deleting a snapshot
When a snapshot is no longer needed, you can delete it. Figure 11-20 illustrates how to delete a snapshot. In this case, the modified snapshot redbook_markus_01.snapshot.00001 is no longer needed. To delete the snapshot, right-click it and select Delete from the menu. A dialog box appears requesting that you validate the operation.
Figure 11-21 no longer displays the snapshot redbook_markus_01.snapshot.00001. Note that the volume and the duplicate snapshot are unaffected by the removal of this snapshot. In fact, the duplicate becomes the child of the master volume. The XIV Storage System provides the ability to restore the duplicate snapshot to the master volume or to overwrite the duplicate snapshot from the master volume even after deleting the original snapshot.
The snapshot delete command (snapshot_delete) operates much like the snapshot creation command. Refer to Example 11-8. The -y parameter needs to be specified so that the validation prompt is answered affirmatively.
Example 11-8 Deleting a snapshot
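The body of this example is not reproduced here. A sketch of the deletion, assuming snapshot_delete takes the snapshot name through a snapshot parameter (an assumption; verify against your XCLI reference):

```
xcli -c MZ_PFE_1 -y snapshot_delete snapshot=redbook_markus_01.snapshot_00001
```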
With this scenario, a duplicate does not cause automatic deletion to occur. Because a duplicate is a mirror copy of the original snapshot, it does not create additional allocations in the Storage Pool.
Approximately one minute later, the oldest snapshot (XIV_ORIG_VOL.snapshot_00006) is removed from the display. The Storage Pool is 51 GB in size, with a snapshot size of 34 GB, which is enough for one snapshot (refer to Storage Pool relationships on page 23). If the master volume is unmodified, many snapshots can exist within the pool, and automatic deletion does not occur. If there were two snapshots and two volumes, it might take longer to trigger the deletion, because the volumes utilize different portions of the disks, and the snapshots might not immediately overlap.

To examine the details of the scenario: at the point where the second snapshot is taken, a partition is in the process of being modified. The first snapshot caused a redirect on write, and a partition was allocated from the snapshot area in the Storage Pool. Because the second snapshot occurs at a different time, it generates a second partition allocation in the Storage Pool; there is no space available for this second allocation, so the oldest snapshot is deleted. Figure 11-23 on page 299 shows that the master volume XIV_ORIG_VOL and the newest snapshot XIV_ORIG_VOL.snapshot.00007 are present. The oldest snapshot, XIV_ORIG_VOL.snapshot.00006, was removed.
To determine the cause of removal, you must go to the Events panel under the System menu. Refer to Chapter 10, Monitoring on page 249 for more details about managing events. As shown on Figure 11-24 on page 300, the event SNAPSHOT_DELETED_DUE_TO_POOL_EXHAUSTION is logged. The snapshot name XIV_ORIG_VOL.snapshot.00006 and time 2008-07-31 15:17:31 are also logged for future reference.
After selecting the Create option from the menu, a dialog window appears. Enter the name of the Consistency Group. Because the volumes are added during creation, it is not possible to change the pool name. Figure 11-26 shows the process of creating a Consistency Group. After the name is entered, click Create.
Viewing the volumes displays the owning Consistency Group. As in Figure 11-27, the two volumes contained in the xiv_volume_copy pool are now owned by the xiv_db_cg Consistency Group. The volumes are displayed in alphabetical order and do not reflect a preference or internal ordering.
In order to obtain details about the Consistency Group, the GUI provides a panel to view the information. Under the Volumes menu, select Consistency Groups. Figure 11-28 on page 302 illustrates how to access this panel.
This selection sorts the information by Consistency Group. The panel allows you to expand the Consistency Group and see all the volumes owned by that Consistency Group. In Figure 11-29, there are two volumes owned or contained by the xiv_db_cg Consistency Group. In this example, a snapshot of the volumes has not been created.
From the Consistency Group view, you can create a Consistency Group without adding volumes. On the menu bar at the top of the window, there is an icon to add a new Consistency Group. By clicking the Add Consistency Group icon shown in Figure 11-30, a creation dialog box appears as shown in Figure 11-26 on page 301. You then provide a name and the Storage Pool for the Consistency Group.
When created, the Consistency Group appears in the Consistency Groups view of the GUI (Figure 11-31). The new group does not have any volumes associated to it. A new Consistency Group named xiv_db_cg is created. The Consistency Group cannot be expanded yet, because there are no volumes contained in the Consistency Group xiv_db_cg.
Using the Volumes view in the GUI, select the volumes to add to the Consistency Group. You can select multiple volumes by holding Ctrl down and clicking the desired volumes. After selecting the desired volumes, right-click the volumes and select Add to Consistency Group. Figure 11-32 shows two volumes being added to a Consistency Group: xiv_vmware_1 and xiv_vmware_2.
After selecting the volumes to add, a dialog box appears asking for the Consistency Group to which to add the volumes. Figure 11-33 on page 304 adds the volumes to the xiv_db_cg group. Clicking OK completes the operation.
Using the XCLI, the process takes two steps: first, create the Consistency Group; then, add the volumes. Example 11-9 shows setting up a Consistency Group and adding volumes with the XCLI.
Example 11-9 Creating Consistency Groups and adding volumes with the XCLI
xcli -c MZ_PFE_1 cg_create cg=xiv_new_cg pool=redbook_markus xcli -c MZ_PFE_1 cg_add_vol cg=xiv_new_cg vol=redbook_markus_03 xcli -c MZ_PFE_1 cg_add_vol cg=xiv_new_cg vol=redbook_markus_04
The new snapshots are created and displayed beneath the volumes in the Consistency Group view (Figure 11-35 on page 305). These snapshots have the same creation date and time. Each snapshot is locked on creation and has the same defaults as a regular snapshot. The snapshots are contained in a group structure (called a snapshot group) that allows all the snapshots to be managed by a single operation.
Adding volumes to a Consistency Group does not prevent you from creating a single volume snapshot. If a single volume snapshot is created, it is not displayed in the Consistency Group view. The single volume snapshot is also not consistent across multiple volumes. However, the single volume snapshot works according to all the rules defined in 11.1.2, Volume snapshots on page 288. With the XCLI, when the Consistency Group is set up, it is simple to create the snapshot: one command creates all the snapshots within the group at the same moment in time.
Example 11-10 Creating a snapshot group
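The body of this example is not reproduced here. A sketch of the single command that snapshots the whole group, using the cg_snapshots_create command that the MySQL backup script later in this chapter also invokes (the group name is taken from the preceding figures; verify option casing against your XCLI release):

```
xcli -c MZ_PFE_1 cg_snapshots_create cg="xiv_db_cg"
```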
To obtain details about a Consistency Group, you can select Snapshots Group Tree from the Volumes menu. Figure 11-37 on page 307 shows where to find the group view.
From the Snapshots Group Tree view, you can see many details. Select the group to view on the left panel by clicking the group snapshot. The right panes provide more in-depth information about the creation time, the associated pool, and the size of the snapshots. In addition, the Consistency Group view points out the individual snapshots present in the group. Refer to Figure 11-38 on page 308 for an example of the data that is contained in a Consistency Group.
To display all the Consistency Groups in the system, issue the cg_list command.
Example 11-11 Listing the Consistency Groups
xcli -c MZ_PFE_1 cg_list
Name                  Pool Name
Group1                GCHI_THIN_01
EXCH_CLU_CONSGROUP    GCHI_THIN_01
snapshot_test         snapshot_test
Tie                   xiv_pool
mirror_cg             redbook_mirror
xiv_db_cg             xiv_volume_copy
MySQL Group           redbook_markus
xiv_new_cg            redbook_markus
More details are available by viewing all the Consistency Groups within the system that have snapshots. The groups can be unlocked or locked, restored, or overwritten. All the operations discussed in the snapshot section are available with the snap_group operations. Example 11-12 on page 309 illustrates the snap_group_list command.
xcli -c MZ_PFE_1 snap_group_list
Name                            CG            Snapshot Time        Deletion Priority
xiv_db_cg.snap_group_00001      xiv_db_cg     2008-08-07 18:59:06  1
MySQL Group.snap_group_00001    MySQL Group   2008-08-08 18:16:53  1
xiv_new_cg.snap_group_00001     xiv_new_cg    2008-08-08 20:39:57  1
In order to delete a Consistency Group with the XCLI, you must first remove all the volumes one at a time. As in Example 11-13, each volume in the Consistency Group is removed first. Then, the Consistency Group is available for deletion. As with the GUI, the snapshots do not have to be deleted in order to delete the Consistency Group. Deletion of the Consistency Group does not delete the individual snapshots.
Example 11-13 Deleting a Consistency Group
xcli -c MZ_PFE_1 cg_remove_vol vol=redbook_markus_03
xcli -c MZ_PFE_1 cg_remove_vol vol=redbook_markus_04
xcli -c MZ_PFE_1 cg_delete cg=xiv_new_cg

It is also possible to automate the process of removing and deleting the Consistency Group using the XCLI. Working with the Linux version of the XCLI, the data is extracted from the output of the volume list command, and then a removal process is performed. In Example 11-14, you need to specify the Consistency Group to delete. The Consistency Group name must be distinct from the volume names so that the script finds the accurate data.
Example 11-14 Automated Consistency Group deletion script
cg_name=CHRIS1_CG
volume=$(xcli -c xiv_esp vol_list | grep $cg_name | awk '{ print $1 }')
for v in $volume; do xcli -c xiv_esp cg_remove_vol vol=$v; done
xcli -c xiv_esp cg_delete cg=$cg_name
For more details about Remote Mirror, refer to Chapter 12, Remote Mirror on page 323. Important: The special snapshot is created regardless of the amount of pool space on the target pool. If the snapshot causes the pool to be overutilized, the mirror remains inactive. The pool must be expanded to accommodate the snapshot, and then, the mirror can be reestablished.
<systems>
  <system>
    <name value="XIV V10.0 MN00050"/>
    <management id="1">
      <ip value="9.155.56.100"/>
      <port value="7778"/>
    </management>
    <management id="2">
      <ip value="9.155.56.101"/>
      <port value="7778"/>
    </management>
    <management id="3">
      <ip value="9.155.56.102"/>
      <port value="7778"/>
    </management>
    <active value="2"/>
    <serial value="MN00050"/>
    <last>
      <noRacks value="1"/>
      <utilizationP value="84"/>
    </last>
  </system>
</systems>

During the installation of the XIV Storage System API for VSS, the process requests the XML file containing the configuration details. Use the full path of the XML file defined in step one. The Windows server is now ready to perform snapshots on the XIV Storage System. Refer to your application documentation for completing the VSS setup.
On the Linux host, the two volumes are mapped onto separate file systems. The first volume, xiv_pfe_1, maps to volume redbook_markus_09, and the second volume, xiv_pfe_2, maps to redbook_markus_10. These volumes belong to the Consistency Group MySQL Group, so that when the snapshot is taken, snapshots of both volumes are taken at the same moment. Two items must be configured to perform the backup. First, the XIV XCLI must be installed on the server, so that the backup script can invoke the snapshot instead of relying on human intervention. Second, the database needs to have incremental backups enabled. To enable the incremental backup feature, MySQL must be started with the --log-bin option (Example 11-16). This option enables binary logging and allows database restorations.
Example 11-16 Starting MySQL
./bin/mysqld_safe --no-defaults --log-bin=backup

The database is installed on /xiv_pfe_1. However, a pointer (symbolic link) in /usr/local is made, which allows all the default settings to coexist while the database is stored on the XIV volume. To create the pointer, use the command in Example 11-17. Note that the source directory needs to be changed for your particular installation. You can also install the MySQL application on a local disk and change the default data directory to be on the XIV volume.
Example 11-17 MySQL setup
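The body of this example is not reproduced here. The idea is a single symbolic link from MySQL's default location to the copy on the XIV volume; the real paths (/xiv_pfe_1/mysql and /usr/local/mysql) are assumptions to adjust for your installation, so the sketch below demonstrates the link against temporary directories:

```shell
# On the real host, the command would be simply:
#   ln -s /xiv_pfe_1/mysql /usr/local/mysql
# Demonstrated here with temporary stand-in directories:
src=$(mktemp -d)/mysql      # stands in for the MySQL tree on /xiv_pfe_1
mkdir -p "$src"
link_parent=$(mktemp -d)    # stands in for /usr/local
ln -s "$src" "$link_parent/mysql"
ls -ld "$link_parent/mysql" # shows the link pointing at the XIV-volume copy
```

With the link in place, MySQL's defaults resolve to the XIV volume without any configuration changes.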
The backup script is simple; depending on the implementation of your database, it might be too simple. However, the script in Example 11-18 does force an incremental backup and copies the data to the second XIV volume. The script then locks the tables so that no more data can be modified. While the tables are locked, the script initiates a snapshot, which saves everything for later use. Finally, the tables are unlocked.
Example 11-18 Script to perform backup
# Report the time of backing up
date
# First, flush the tables. This can be done while running and
# creates an incremental backup of the DB at a set point in time.
/usr/local/mysql/bin/mysql -h localhost -u root -ppassword < ~/SQL_BACKUP
# Because the mysql daemon was run specifying the binary log name of
# "backup", the files can be copied to the backup directory on another disk
cp /usr/local/mysql/data/backup* /xiv_pfe_2
# Second, lock the tables so that a snapshot can be performed.
/usr/local/mysql/bin/mysql -h localhost -u root -ppassword < ~/SQL_LOCK
# XCLI command to perform the backup
# ****** NOTE: User ID and password are set in the user profile ******
/root/XIVGUI/xcli -c xiv_pfe cg_Snapshots_create cg="MySQL Group"
# Unlock the tables so that the database can continue in operation.
/usr/local/mysql/bin/mysql -h localhost -u root -ppassword < ~/SQL_UNLOCK

When issuing commands to the MySQL database, the root password can be stored in an environment variable rather than in the script (it appears in the script in Example 11-18 for simplicity). Storing the password outside the script allows it to perform the action without requiring user intervention. For the script to invoke the MySQL database, the SQL statements are stored in separate files and piped into the MySQL application. Example 11-19 provides the three SQL statements that are issued to perform the backup operation.
Example 11-19 SQL commands to perform backup operation
SQL_BACKUP:  FLUSH TABLES
SQL_LOCK:    FLUSH TABLES WITH READ LOCK
SQL_UNLOCK:  UNLOCK TABLES

Before running the backup script, a test database, called redbook, is created. The database has one table, called chapter, which contains the chapter name, author, and pages. The table has two rows of data that define information about the chapters in the redbook. Figure 11-42 on page 314 shows the information in the table before the backup is performed.
Now that the database is ready, the backup script is run. Example 11-20 is the output from the script. Then, the snapshots are displayed to show that the system now contains a backup of the data.
Example 11-20 Output from the backup process
[root@x345-tic-30 ~]# ./mysql_backup
Mon Aug 11 09:12:21 CEST 2008
Command executed successfully.
[root@x345-tic-30 ~]# /root/XIVGUI/xcli -c xiv_pfe snap_group_list cg="MySQL Group"
Name                          CG           Snapshot Time        Deletion Priority
MySQL Group.snap_group_00006  MySQL Group  2008-08-11 15:14:24  1
[root@x345-tic-30 ~]# /root/XIVGUI/xcli -c xiv_pfe time_list
Time      Date        Time Zone      Daylight Saving Time
15:17:04  2008-08-11  Europe/Berlin  yes
[root@x345-tic-30 ~]#

To show that the restore operation is working, the database is dropped (Figure 11-43 on page 315) and all the data is lost. After the drop operation is complete, the database is permanently removed from MySQL. It is possible to perform a restore action from the incremental backup. For this example, the snapshot function is used to restore the entire database.
The restore script, which is shown in Example 11-21, stops the MySQL daemon and unmounts the Linux file systems. Then, the script restores the snapshot and finally remounts and starts MySQL.
Example 11-21 Restore script
[root@x345-tic-30 ~]# cat mysql_restore
# This restoration overwrites everything in the database and puts the
# data back to when the snapshot was taken. It is also possible to do
# a restore based on the incremental data; this script does not handle
# that condition.
# Report the time of the restore
date
# First, shut down mysql
mysqladmin -u root -ppassword shutdown
# Unmount the file systems
umount /xiv_pfe_1
umount /xiv_pfe_2
# List all the snap groups
/root/XIVGUI/xcli -c xiv_pfe snap_group_list cg="MySQL Group"
# Prompt for the group to restore
echo "Enter Snapshot group to restore: "
read -e snap_group
# XCLI command to perform the restore
# ****** NOTE: User ID and password are set in the user profile ******
/root/XIVGUI/xcli -c xiv_pfe snap_group_restore snap_group="$snap_group"
# Mount the file systems
mount /dev/dm-2 /xiv_pfe_1
mount /dev/dm-3 /xiv_pfe_2
# Start the MySQL server
cd /usr/local/mysql
./configure

The output from the restore action is shown in Example 11-22.
Example 11-22 Output from the restore script
[root@x345-tic-30 ~]# ./mysql_restore
Mon Aug 11 09:27:31 CEST 2008
STOPPING server from pid file /usr/local/mysql/data/x345-tic-30.mainz.de.ibm.com.pid
080811 09:27:33 mysqld ended
Name                          CG           Snapshot Time        Deletion Priority
MySQL Group.snap_group_00006  MySQL Group  2008-08-11 15:14:24  1
Enter Snapshot group to restore:
MySQL Group.snap_group_00006
Command executed successfully.
NOTE: This is a MySQL binary distribution. It's ready to run, you don't
need to configure it!
To help you a bit, I am now going to create the needed MySQL databases
and start the MySQL server for you. If you run into any trouble, please
consult the MySQL manual, that you can find in the Docs directory.
Installing MySQL system tables... OK
Filling help tables... OK
To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system
PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
./bin/mysqladmin -u root password 'new-password'
./bin/mysqladmin -u root -h x345-tic-30.mainz.de.ibm.com password 'new-password'
Alternatively you can run:
./bin/mysql_secure_installation
which also gives the option of removing the test databases and anonymous
user created by default. This is strongly recommended for production
servers. See the manual for more instructions.
You can start the MySQL daemon with:
cd . ; ./bin/mysqld_safe &
You can test the MySQL daemon with mysql-test-run.pl
cd mysql-test ; perl mysql-test-run.pl
Please report any problems with the ./bin/mysqlbug script!
The latest information about MySQL is available on the Web at http://www.mysql.com
Support MySQL by buying support/licenses at http://shop.mysql.com
Starting the mysqld server. You can test that it is up and running with the command:
./bin/mysqladmin version
[root@x345-tic-30 ~]# Starting mysqld daemon with databases from /usr/local/mysql/data

When complete, the data is restored and the redbook database is available, as shown in Figure 11-44.
11.2.1 Architecture
The Volume Copy feature provides an instantaneous copy of data from one volume to another volume. Utilizing the same functionality as the snapshot, the system modifies the target volume to point at the source volume's data. After the pointers are modified, the host has full access to the data on the volume. After the XIV Storage System completes the setup of the pointers to the source data, a background copy of the data is performed. The data is copied from the source volume to a new area on the disk, and the pointers of the target volume are then updated to use this new space. The copy operation is done in a way that minimizes the impact to the system. If the host performs an update before the background copy is complete, a redirect on write occurs, which allows the volume to be readable and writable before the Volume Copy completes.
From the dialog box, select redbook_chris_01 and click OK. The system then asks that you validate the copy action. The XIV Storage System instantly performs the update process and displays a completion message. When the copy process is complete, the volume is available for use. Figure 11-46 provides an example of the volume selection.
To create a Volume Copy with the XCLI, the source and target volumes must be specified in the command. In addition, the -y parameter must be specified to provide an affirmative response to the validation questions.
Example 11-23 Performing a Volume Copy
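The body of this example is not reproduced here. A sketch of the invocation, assuming the command is vol_copy with vol_src and vol_trg parameters (verify the names against your XCLI reference); the volumes are those used in the VMware scenario that follows:

```
xcli -c MZ_PFE_1 -y vol_copy vol_src=xiv_vmware_1 vol_trg=xiv_vmware_2
```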
To perform the Volume Copy, use the following sequence:
1. Validate the configuration for your host. With VMware, you need to ensure that the hard disk assigned to the virtual machine is a Mapped Raw LUN. For a disk directly attached to a server, SAN boot needs to be enabled, and the target server needs the XIV volume discovered.
2. Shut down the source server or OS. If the source remains active, there might be data in memory that is not synchronized to disk. If this step is skipped, unexpected results can occur.
3. Perform the Volume Copy from the source volume to the target volume.
4. Power on the new system.

A demonstration of the process is simple using VMware. Starting with the VMware resource window, power off the virtual machines for both the source and the target. The summary described in Figure 11-48 shows that both XIV Source VM (1), the source, and XIV Source VM (2), the target, are powered off.
Looking at the XIV Storage System before the copy (Figure 11-49), xiv_vmware_1 is mapped to XIV Source VM (1) in VMware and has utilized 1 GB of space. This information shows that the OS is installed and operational. The second volume, xiv_vmware_2, is the target volume for the copy; it is mapped to XIV Source VM (2) and shows 0 GB used. At this point, the OS has not been installed on that virtual machine, and thus the OS is not usable.
Because the virtual machines are powered off, simply initiate the copy process as just described. Selecting xiv_vmware_1 as the source, copy the volume to the target xiv_vmware_2. The copy completes immediately and is available for usage. To verify that the copy is complete, the used area of the volumes must match as shown in Figure 11-50.
After the copy is complete, simply power up the new virtual machine to use the new operating system. Both servers usually boot up normally with only minor modifications to the host. In this example, the server name needed to be changed, because there were two servers on the network with the same name. Refer to Figure 11-51.
Figure 11-52 on page 322 shows the second virtual machine console with the Windows operating system powered on.
Chapter 12. Remote Mirror
This chapter describes the basic characteristics, the options, and the available interfaces for Remote Mirroring. It also includes step-by-step procedures for setting it up and removing the mirror.
As mentioned earlier, synchronous copy ensures that a host write operation is written to both the primary and secondary systems. Synchronous copy issues an acknowledgement of the write to the host (application) only after both copies are written, to maintain consistent data.
The storage system can serve dual functions as both primary and secondary machines at the same time. For example, if System A is primary for Application 1 and System B is the remote copy for Application 1, System B can also be the primary for Application 2 with System A as the remote site for Application 2, providing a bidirectional capability for Remote Mirroring. However, this type of setup is not appropriate for protecting against a complete disaster at the primary site.

There are multiple features of Remote Mirroring on the XIV Storage System, including disaster recovery, backups, recovery from single volume media errors, role switchovers, synchronization, and coupling, which are discussed in detail later in this chapter.

The target volumes on the secondary system must be created before any mirroring can be configured. This task is usually performed by the storage administrator, who verifies that the target volumes are equal in size to the source volumes on the primary system. After a Remote Mirror has been created, it must first complete initialization before it is activated. The initialization process copies all the data from the primary volume to the secondary volume. This initialization is performed only once, during the initial setup of the Remote Mirror; once complete, the volumes are considered synchronized. At this time, Remote Mirroring is considered active and continues to keep the copies synchronized by writing all data to the primary copy, followed by writing the data to the secondary copy, and then acknowledging to the host that the write operation has completed.
12.1.2 Boundaries
Currently, the XIV Storage System has the following boundaries or limits:
- Maximum remote systems: The maximum number of remote systems that can be attached to a single primary is four.
- Number of Remote Mirrors: The maximum number of Remote Mirrors allowed on a single primary at any one time is 128.
- Distance: Distance is only limited by the response time of the medium used.
- Consistency Groups: There is no support for Consistency Groups within Remote Mirroring.
- Snapshots: Snapshots are allowed with either the primary or secondary volumes without stopping the mirror.
located near each other. Bandwidth considerations must be taken into account when planning the infrastructure to support the Remote Mirroring implementation. Knowing when the peak write rate occurs for systems attached to the storage helps with planning the number of paths needed to support the Remote Mirroring function and any future growth. There must always be a minimum of two paths configured for redundancy within Remote Mirroring, and these paths must be dedicated to Remote Mirror.

When the protocol has been selected, determine which ports on the XIV Storage System will be used. The port settings are easily displayed using the Extended Command Line Interface (XCLI) command fc_port_list for Fibre Channel or ipinterface_list for iSCSI. This leads us into the second item in the list.

Fibre Channel paths for Remote Mirroring have slightly more setup requirements, and we look at this interface first. As seen in Example 12-1, in the column titled Role, each Fibre Channel port is identified as either a target or an initiator. Simply put, a target in a Remote Mirror configuration is the port that receives data from the primary system (that is, a port on the secondary system), while an initiator is the port that sends the data (on the primary system). In this example, there are three initiators configured. Initiators, by default, are configured on FC:X:4 (X is the module number). In this highlighted example, port 4 in module 6 is configured as the initiator.
Example 12-1 The fc_port_list output command
WWPN              Port ID   Role
5001738000210143  0001000D  Target
5001738000210142  000000EF  Target
5001738000210141  00000000  Target
5001738000210140  006D0B13  Target
5001738000210163  00FFFFFF  Initiator
5001738000210162  000000EF  Target
5001738000210161  000000EF  Target
5001738000210160  00681A13  Target
5001738000210173  00FFFFFF  Target
5001738000210172  00FFFFFF  Target
5001738000210171  000000EF  Target
5001738000210170  00060213  Target
5001738000210183  000000EF  Initiator
5001738000210182  0075000B  Target
5001738000210181  00611913  Target
5001738000210180  00760000  Target
5001738000210193  000000EF  Initiator
5001738000210192  00613913  Target
5001738000210191  0075000A  Target
5001738000210190  00060214  Target
5001738000210153  0001000E  Target
5001738000210152  000000EF  Target
5001738000210151  000000EF  Target
5001738000210150  00711000  Target
The iSCSI connections are shown in Example 12-2 on page 327 using the command ipinterface_list. The output has been truncated to show just the iSCSI connections in which we are interested here. The command will also display all Ethernet connections and settings. In this example, we have two connections displayed for iSCSI, one connection in module 7 and one connection in module 8.
C:\Documents and Settings\Administrator\My Documents\xcli>xcli -c "XIV MN00033" ipinterface_list
Name        Type   IP Address     Network Mask   Default Gateway  MTU   Module      Ports
nextrabam4  iSCSI  19.11.237.208  255.255.254.0  0.0.0.0          9000  1:Module:7  1,2
nextrabam5  iSCSI  19.11.237.209  255.255.254.0  0.0.0.0          9000  1:Module:8  1,2

Alternatively, a single port can be queried by selecting a system in the GUI, followed by selecting Targets Connectivity. Right-click a specific port and select Properties, the output of which is shown in Figure 12-2. This particular port is configured as a target suitable for the secondary port.
Another way to query the port configuration is to select the desired system, click the curved arrow (at the bottom right of the window) to display the ports on the back of the system, and mouse over a port as shown in Figure 12-3 on page 328. This view displays all the information that is shown in Figure 12-2.
Similar information can be displayed for the iSCSI connections by using the GUI, as shown in Figure 12-4. This view can be reached either by right-clicking the Ethernet port (similar to the Fibre Channel port shown in Figure 12-3) or by selecting the system and then selecting Hosts and LUNs → iSCSI Connectivity. This sequence displays the same two iSCSI definitions that are shown with the XCLI command.
By default, Fibre Channel ports 3 and 4 (target and initiator, respectively) in every module are intended to be used for Remote Mirroring. For example, port 4 in module 5 (initiator) on the local machine is connected to port 3 in module 4 (target) on the remote machine. When setting up a new system, it is best to reserve these ports for Remote Mirroring. If a port role does need to be changed (the third item in the list), you can change it with either the XCLI or the GUI. Use the XCLI command fc_port_config to change a port as shown in Example 12-3 on page 329. Using the output from fc_port_list, we obtain the fc_port name to be used in the command and change the port role to either initiator or target as needed.
Example 12-3 Changing the port role with fc_port_config

C:\Documents and Settings\Administrator\My Documents\xcli>xcli fc_port_config fc_port=1:FC_Port:4:4 role=initiator
Command completed successfully

C:\Documents and Settings\Administrator\My Documents\xcli>xcli -c MN00033 fc_port_list
Component ID   Status  Currently Functioning  WWPN
1:FC_Port:4:4  OK      yes                    5001738000210143
To perform the same function with the GUI, select the primary system and choose Remote → Targets Connectivity as shown in Figure 12-5. Alternatively, this same view can be accessed through Hosts and LUNs → Targets Connectivity.
With the Targets Connectivity view displayed, select the desired port to change, right-click the port, and select Configure from the pop-up menu, which is shown in Figure 12-6 on page 330.
This option opens a configuration window as shown in Figure 12-7, which allows the port to be enabled (or disabled), its role defined as Target, Initiator, or Dual, and finally, the speed for the port configured (Auto, 1Gbps, 2Gbps, or 10Gbps).
Planning is important for determining how many Remote Mirroring copy pairs will exist, which is the fourth item for discussion in this section. The current limit on simultaneous copies is 128. In addition, a single primary system is limited to a maximum of four secondary systems. In this configuration, the maximum number of Remote Mirrors allowed is still 128 for the primary; these mirrors can be spread across multiple secondaries. Each secondary is also limited to 128 Remote Mirrors (so it can, for example, be a secondary to more than one primary, or a secondary to one system and a primary to another). If the XIV Data Migration feature is also used, the combined number of Remote Mirrors and volumes in a data migration is limited to 128 on a single system, which is covered in Chapter 13, Data migration on page 353.
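The limits above are easy to run past when mirrors and data migrations share one system. As a small planning sketch (the counts are illustrative; in practice they come from mirror_list and the migration view):

```shell
# Sketch: sanity-check the combined Remote Mirror and Data Migration count
# against the 128-pair limit before creating new mirrors. All counts below
# are hypothetical examples, not output from a real system.
MAX_PAIRS=128
current_mirrors=100      # e.g. pairs reported by mirror_list
current_migrations=20    # volumes in active data migrations
new_mirrors=10           # pairs you plan to add

total=$((current_mirrors + current_migrations + new_mirrors))
if [ "$total" -le "$MAX_PAIRS" ]; then
    echo "OK: $total of $MAX_PAIRS pairs used"
else
    echo "LIMIT EXCEEDED: $total exceeds $MAX_PAIRS"
fi
# prints: LIMIT EXCEEDED: 130 exceeds 128
```

In this hypothetical case, ten of the planned mirrors would have to wait until migrations complete and free up pairs.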
12.1.4 Coupling
There are two ways to configure coupling, which define the action that Remote Mirroring will take upon a failure:
- Best effort coupling: Writes to the primary volume continue when there is a failure of the links between the primary and secondary. Remote Mirroring is set to an unsynchronized state. All updates to the primary volume are recorded so that only these updates are written to the secondary volume after the problem has been resolved.
- Mandatory coupling: When a failure in the communication between the primary and secondary system is detected, all writes to the primary volume are prohibited. This extreme action is taken only when the two systems must be synchronized at all times, and it is only rarely used.

Coupling settings can be changed at any time, even after a failure. However, a mandatory coupling configuration cannot be deactivated without first changing it to the best effort coupling type. Coupling always starts in standby mode, which means that the mirror has not yet been activated, so no data has been written to the secondary. The primary system keeps track of all changes to the volumes so that, when the mirror is activated, all updates made on the primary since the mirror was created are written to the secondary system. This mode can also be used during system maintenance at the secondary so that the primary system does not generate alerts. Coupling can only be removed while in standby mode. Transitions between standby and active modes are performed on the primary volume by using the XCLI or the GUI. An example of configuring mirroring and coupling is shown in Figure 12-8 and Figure 12-9.
12.1.5 Synchronization
The synchronization process runs when the Remote Mirror is first set up, as well as any time that there has been a failure that has been recovered and the mirroring operations are able to continue. Synchronization ensures that the secondary volume will receive all the changes that were done to the primary volume while the coupling was not available. Synchronization consists of the following states:
- Initialization: Data from the primary is being copied to the secondary.
- Synchronized: Both copies are consistent.
- Timestamp: Taken to keep track of when the coupling became non-operational.
- Unsynchronized: Remote Mirroring is non-operational.
Figure 12-10 illustrates the Initialization state of a newly created mirror as well as the Synchronized state. Figure 12-10 also displays the role of each volume, the volume name on the remote system, and the name of the remote system.
Note: The terms Primary and Secondary denote the two possible designated mirroring roles of a peer. In contrast, the terms Master and Subordinate denote, respectively, the active role of the peer that accepts write requests from hosts and the active role of the peer that is being synchronized. The purpose of this scheme is to record the original peer roles while being able to change the active role as required.
When a link failure occurs, the primary system must start tracking changes to the Remote Mirror source volumes so that these changes can be copied to the secondary after recovery. These changes are known as the uncommitted data. When recovering from a link failure, best effort coupling performs the following steps to synchronize the data:
- If the secondary is still in a synchronized state, a snapshot of the last consistent volume is created and renamed using the time stamp created at the time of the link failure (we discuss this last consistent snapshot later). This snapshot allows the use of the secondary volume in the event of a failure at the primary site during the synchronization process. Although the secondary might not have all the updates from the primary, it will at least have a consistent copy to work with.
- The primary system synchronizes the secondary volume, copying all the uncommitted data.
- After synchronization is complete between the two systems for all volumes, the primary deletes the snapshot.
An example of this process is shown in 12.1.10, Recovering from a failure on page 351 at the end of this chapter.
If the link was broken (due to a failure in the physical paths or a loss of the primary system), a snapshot of the last consistent state will exist, as explained in 12.1.5, Synchronization on page 332. The storage administrator has the option of using the most recent version, which might not be consistent, or reverting to the last consistent snapshot, which we illustrate later in this chapter. After the old primary system has been recovered, its role needs to be switched so that updates to the new primary can be copied to the old primary. In addition, any data that was not committed from the old primary to the old secondary (the new primary) is kept in a data list that the old primary sends to the new primary. This list consists of any writes that occurred on the old primary before the role switchover. It is up to the new primary to merge this list with the list of updates that it has been tracking in order to bring the two systems into a synchronized state. To illustrate this process, we use an example of a link failure between the two sites:
1. A link failure occurs between the two sites (A is the primary and B is the secondary).
2. Production writes continue on A, and A starts to generate a list of the changes that are being made.
3. Role switchover occurs on B. B now becomes the new primary, and production writes are switched to this site.
4. B now starts to generate a list of changes that need to be copied to the secondary (A).
5. Role switchover occurs on A (A becomes secondary), and the links are recovered.
6. A sends its list of uncommitted updates to B.
7. B merges the two lists, and updates are applied to the primary (B) as needed, as well as sent to the secondary (A).
8. The mirror becomes synchronized.
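The merge in step 7 is a set union of the two change lists. This is a conceptual illustration only; the XIV system tracks changed regions internally and at the block level, so the plain-text lists of changed regions below simply stand in for that bookkeeping:

```shell
# Conceptual sketch only: merging the uncommitted-change lists from the
# old primary (A) and new primary (B). Region names are hypothetical.
cat > /tmp/changes_A.txt <<'EOF'
vol1:block100
vol1:block200
EOF
cat > /tmp/changes_B.txt <<'EOF'
vol1:block200
vol1:block300
EOF

# The union of the two lists is the set of regions the new primary must
# resynchronize to the secondary:
sort -u /tmp/changes_A.txt /tmp/changes_B.txt
```

The region changed on both sides (block200 here) appears once in the merged list; the final content for such a region comes from the current primary, B.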
GUI example
At this point, we continue after the setup reviewed in 12.1.3, Initial setup on page 325, which assumes that the Fibre Channel ports have been properly defined as sources and targets, the Ethernet switch has been updated for jumbo frames, and all the physical paths are in place. Follow these steps:
1. We start by selecting a primary system and then choosing Remote → Targets Connectivity as shown in Figure 12-11 on page 335.
2. After this option is selected, a new window is displayed as shown in Figure 12-12 on page 336. In this example, we see the available Fibre Channel ports on the primary system. At the top of the window, there are two options: Add Target (+ symbol) and Select Target. Selecting Add Target opens the dialog box that allows the target system to be defined, as shown in Figure 12-13 on page 336.
3. In Figure 12-13, the Target Type is selected from the drop-down list. The two options in this list are Mirroring and Data Migration. Mirroring is the first type in the list and will be the default. Select Mirroring to set up Remote Mirror. The next item required is the Target Name. The drop-down list displays all the systems configured in the GUI. If the target system has not yet been added to the GUI, it will not be displayed in the list.
Choose the desired target after it is displayed in the list. The final required item is the Target Protocol. There are two protocols contained in the drop-down list: FC (Fibre Channel) and iSCSI. In this example, we have chosen FC as the protocol. After the selection has been made, click Define. After the target system has been defined, it will be displayed in the window as shown in Figure 12-14 on page 337.
4. The next step is to create the paths (Figure 12-15 on page 338). At this point, we assume that all the physical paths and zoning have been completed.
5. To set up the Remote Mirror paths using iSCSI, you must first get the iSCSI name of the secondary system to be used in the setup on the primary. To get the name, select the secondary system in the GUI view and at the top of the window, click Configure System. This option will open a window as shown in Figure 12-16. Copy the entry in the box called iSCSI Name to use later.
6. The next step is to configure the IP address of the port or ports that will be used for Remote Mirroring. After obtaining the network information from the network administrator (Name, IP, Gateway, and Netmask), select Hosts and LUNs → iSCSI Connectivity. The results are displayed in Figure 12-17 on page 339. This particular example already has two iSCSI connections defined, named nextrabam4 and nextrabam5.
7. At the top of the window in Figure 12-17 is an option called + Define. Click Define to open the window where you define a new iSCSI connection (Figure 12-18).
8. In Figure 12-18, we see the new interface with all the options defined. There are five required fields in this window: Name, Address, Netmask, Module, and Port Number. The name, address, and netmask were obtained from the network administrator. The module and port numbers are defined based on where the Remote Mirror connections will physically be plugged in. One other important field that needs to be set for Remote Mirroring is the Maximum Transmission Unit (MTU). The default listed when opening this window is 4500. Change this value to a higher number for jumbo frames; in this example, we entered 9000.
9. After selecting Remote → Targets Connectivity, we get the IP panel shown in Figure 12-19. Once again, select Add Target as we did with Fibre Channel.
10. This process is the same as creating a Fibre Channel target; however, this time we select iSCSI as the protocol, which prompts us to enter the iSCSI Initiator Name that we found in Figure 12-16 on page 338. Enter this name as shown in Figure 12-20.
11. After clicking Define, the target system is now displayed in the window as shown in Figure 12-21 on page 341.
12. We are now ready to create the iSCSI path for Remote Mirroring. The ports available for Remote Mirror paths are those shown in white, which have previously been configured with IP information. Click and hold the IP Interface in the desired module on the primary system, drag to the IP Interface in the target module, and release the mouse button. This action defines a path, as shown in Figure 12-22 on page 342. By right-clicking the path, you can delete, activate, or deactivate it. Now that the paths are set up, the remaining steps are the same for both Fibre Channel and iSCSI.
We are now ready to create the mirrors.
13. Verify the target pool information on the secondary system by selecting the secondary system in the GUI and then selecting Pools → Storage Pools. This step is shown in Figure 12-23 on page 343.
14. After this information is determined, select the primary system in the GUI and then choose Remote → Remote Mirroring, which opens the window that is shown in Figure 12-24 on page 344.
15. At the top of the window in Figure 12-24, choose Create Mirror.
16. This action opens the Create Mirror configuration window shown in Figure 12-25 on page 345. On the Master Volume line, select from the drop-down list the name of the primary (source) volume that will be mirrored to the secondary. Then, select the Target System from its drop-down list; this list contains any secondary systems configured in one of the previous steps. Next, enter the Slave Pool name, which is the target pool on the secondary system. The Slave Volume name is filled in automatically with the same name as the Master Volume; it can be left as the default or changed. Notice beneath these lines that there is a check box to select if the target volume on the secondary system needs to be created. If this box is selected, the Slave Volume name can be any name that we choose, such as the default, the default with RM appended to indicate that it is a Remote Mirror, or any other name meaningful to the storage administrator. If the target volume already exists on the secondary system, its name must be entered on the Slave Volume line, and the Create slave box must not be checked. An example of this window is shown in Figure 12-25 on page 345.
17. After the mirror has been created, it is displayed in the Remote Mirroring window as seen in Figure 12-26. By default, the mirror is inactive after it is created. This list includes the name of the source volume on the primary system, its size in gigabytes (GB), the volume's current role (M for master or S for slave), a link indicator that shows the status of the paths (link up or link down), the state of the mirror (active or deactivated), the status of the mirror (unsynchronized, initializing, or synchronized), the remote volume name that was just defined, and the name of the secondary system. The GUI window can be resized to better display any of the information in this table if needed.
18. To activate the mirror, right-click anywhere on its line, which shows a context menu as seen in Figure 12-26.
19. There are many options in the context menu that is shown in Figure 12-26 on page 345:
- Configure Mirror (available after the mirror has reached the synchronized state)
- Delete Mirror (available only if the mirror is not active)
- Activate (available if the mirror is not active)
- Deactivate
- Switch Roles (only available if the mirror is synchronized)
- Change Roles (the mirror must be deactivated to change roles)
- Show Target Connectivity (displays the window that is shown in Figure 12-22 on page 342 for iSCSI and Figure 12-15 on page 338 for Fibre Channel)
- Properties (displays all the properties of the mirror)
For our purposes, we select Activate to start the initialization of the mirror. Figure 12-27 displays mirrors in different states: the source volume cirrus12__02 has not been activated yet, volume cirrus12__03 is in the Initialization state, and the remaining volumes are in the Synchronized state.
20. Checking the state of the mirrors on the secondary system shows a status of Consistent for the mirrors that are displayed as Synchronized on the primary system; the other mirrors are Inactive and in the Initialization state (the same as on the primary system). This information is shown in Figure 12-28.
XCLI example
Here, we describe the steps that are required to set up a single Remote Mirror copy by using the XCLI:
1. We first need to check the initiator and target configuration of the ports to be used on the primary and secondary systems. Using the XCLI command fc_port_list, we can see that the system in Example 12-4 on page 347, named XIV1, is configured correctly to be used as the primary system (port 4 = initiator).
2. In Example 12-5, we have the output from the same command for the secondary system, which is also ready to be used (port 3 = target).
Example 12-5 Port listing information on the secondary storage

Component ID   Status  Currently Functioning  WWPN              Port ID   Role
1:FC_Port:6:4  OK      yes                    5001738000CB0163  000000E8  Initiator
1:FC_Port:6:3  OK      yes                    5001738000CB0162  000000E8  Target
1:FC_Port:6:2  OK      yes                    5001738000CB0161  00120E00  Target
1:FC_Port:6:1  OK      yes                    5001738000CB0160  00011100  Target
1:FC_Port:5:4  OK      yes                    5001738000CB0153  000000E8  Initiator
1:FC_Port:5:3  OK      yes                    5001738000CB0152  00FFFFFF  Target
1:FC_Port:5:2  OK      yes                    5001738000CB0151  00310000  Target
1:FC_Port:5:1  OK      yes                    5001738000CB0150  00FFFFFF  Target
1:FC_Port:8:4  OK      yes                    5001738000CB0183  00FFFFFF  Initiator
1:FC_Port:8:3  OK      yes                    5001738000CB0182  00FFFFFF  Target
1:FC_Port:8:2  OK      yes                    5001738000CB0181  00120D00  Target
1:FC_Port:8:1  OK      yes                    5001738000CB0180  00300000  Target
1:FC_Port:4:4  OK      yes                    5001738000CB0143  00FFFFFF  Initiator
1:FC_Port:4:3  OK      yes                    5001738000CB0142  00FFFFFF  Target
1:FC_Port:4:2  OK      yes                    5001738000CB0141  00300F00  Target
1:FC_Port:4:1  OK      yes                    5001738000CB0140  00FFFFFF  Target
1:FC_Port:9:4  OK      yes                    5001738000CB0193  00130F00  Initiator
1:FC_Port:9:3  OK      yes                    5001738000CB0192  000000E8  Target
1:FC_Port:9:2  OK      yes                    5001738000CB0191  00130D00  Target
1:FC_Port:9:1  OK      yes                    5001738000CB0190  00310F00  Target
1:FC_Port:7:4  OK      yes                    5001738000CB0173  00120F00  Initiator
1:FC_Port:7:3  OK      yes                    5001738000CB0172  000000E8  Target
1:FC_Port:7:2  OK      yes                    5001738000CB0171  00130E00  Target
1:FC_Port:7:1  OK      yes                    5001738000CB0170  00011100  Target
3. Next, we check the primary system to determine whether any targets have been configured by using the XCLI command target_list. Example 12-6 illustrates the expected output when no targets have yet been configured on the system.
Example 12-6 The target_list command
C:\Documents and Settings\Administrator\My Documents\xcli>xcli -c XIV1 target_list
2008-07-25 11:58:22,265 DEBUG (XIVProperties.java:buildSystemsFromFile.292) Going to build systems from file: C:\Documents and Settings\Administrator\My Documents\xcli\xivconfigs.xml
No Remote Targets are defined

4. Before setting up the secondary machine as an available target with the XCLI command target_mirroring_allow, we use the target_list command against the XIV1 system to check for any existing definitions, as shown in Example 12-7.
Example 12-7 The target_list command
xcli>xcli -c XIV1 target_list
No Remote Targets are defined

5. In Example 12-8, we see the output of the same command on a system that has two targets defined. In this case, each target is defined as a different SCSI type (iSCSI and FC).
Example 12-8 The target_list command with defined targets
xcli>xcli -c "XIV MN00033" target_list
Name               SCSI Type  Connected
XIV V10.0 MN00035  iSCSI      yes
prime              FC         no

6. Next, we set up the secondary machine as an available target by using the XCLI command target_mirroring_allow. In Example 12-9, we see the command executed on the primary system XIV1 using xiv_lab_02 as the secondary system.
Example 12-9 Command to set up secondary machine as target
xcli>xcli -c XIV1 target_mirroring_allow target=xiv_lab_02
Command completed successfully

7. As with the GUI, we must get the iSCSI name of the target system before setting up the Remote Mirror paths. We get this name by executing the XCLI command config_get, as shown in Example 12-10. The output that we need is indicated in bold in this example. The output from iscsi_name will be used to define the secondary.
Example 12-10 Listing the iSCSI name

xcli>xcli -c "XIV V10.0 MN00035" config_get
Command executed successfully.
default_user=
dns_primary=
dns_secondary=
email_reply_to_address=
email_sender_address=
email_subject_format={severity}: {description}
iscsi_name=iqn.2005-10.com.xivstorage:000035
machine_model=A14
machine_serial_number=6000035
machine_type=2810
ntp_server=
snmp_community=XIV
snmp_contact=Unknown
snmp_location=Unknown
system_id=35
system_name=XIV V10.0 MN00035
timezone=0

8. The iSCSI port can then be configured by using the command ipinterface_create, which requires the IP address, netmask, module, and port. After this command has been executed, we can list the configured iSCSI target ports on the secondary by using the target_port_list command, as shown in Example 12-11.
Example 12-11 iSCSI target listed in XCLI
xcli>xcli -c "XIV MN00033" target_port_list target="XIV V10.0 MN00035"
Target Name        Port Type  Active  WWPN              iSCSI Address  Port
XIV V10.0 MN00035  iSCSI      yes     0000000000000000  9.11.237.157
XIV V10.0 MN00035  iSCSI      yes     0000000000000000  9.11.237.158
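Rather than copying the iscsi_name value out of the config_get output in Example 12-10 by hand, it can be extracted with a one-line filter. A sketch, assuming the output has been saved to a file in the format shown there:

```shell
# Sketch: pull just the iSCSI qualified name out of saved config_get
# output (abridged sample below matches the format of Example 12-10).
cat > /tmp/config_get.txt <<'EOF'
machine_model=A14
iscsi_name=iqn.2005-10.com.xivstorage:000035
machine_type=2810
EOF

iscsi_name=$(sed -n 's/^iscsi_name=//p' /tmp/config_get.txt)
echo "$iscsi_name"
# prints: iqn.2005-10.com.xivstorage:000035
```

The resulting value is what you paste into the target definition for the secondary system.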
9. We next need to make sure the MTU settings are correct for both iSCSI interfaces on the primary and secondary systems. We verify the settings by using the command ipinterface_list, which will display all IP and iSCSI interfaces configured. For brevity, we have just shown the output for iSCSI in Example 12-12. The settings for this primary system are correct for each iSCSI interface configured (MTU = 9000). As previously stated, the secondary and Ethernet switch will also require this change to run Remote Mirroring with iSCSI paths.
Example 12-12 Check iSCSI MTU settings
xcli>xcli -c "XIV MN00033" ipinterface_list
Type   IP Address    Network Mask   MTU   Ports
iSCSI  9.11.237.208  255.255.254.0  9000  1,2
iSCSI  9.11.237.209  255.255.254.0  9000  1,2
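The MTU check in step 9 can be automated against saved ipinterface_list output. A sketch, assuming the column layout of the fuller listing shown in Example 12-2 (Name, Type, IP Address, Network Mask, Default Gateway, MTU, Module, Ports); the interface names and the deliberately wrong MTU below are illustrative:

```shell
# Sketch: flag any iSCSI interface whose MTU is not 9000 in saved
# ipinterface_list output. Field 2 is the type and field 6 the MTU,
# per the listing format used in this chapter.
cat > /tmp/ipif.txt <<'EOF'
nextrabam4 iSCSI 9.11.237.208 255.255.254.0 0.0.0.0 9000 1:Module:7 1,2
nextrabam5 iSCSI 9.11.237.209 255.255.254.0 0.0.0.0 4500 1:Module:8 1,2
EOF

bad=$(awk '$2 == "iSCSI" && $6 != 9000 { print $1 }' /tmp/ipif.txt)
if [ -z "$bad" ]; then
    echo "all iSCSI interfaces at MTU 9000"
else
    echo "fix MTU on: $bad"
fi
# prints: fix MTU on: nextrabam5
```

Remember that the same check applies on both systems, and that the Ethernet switch in between must also be configured for jumbo frames.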
10. We are now ready to define the secondary system from the primary, which is done by using the target_define command as shown in Example 12-13. In this example, we have specified the protocol as iSCSI. This command can also be executed with the protocol set to FC, which is shown in Example 12-14. After the target (secondary) has been defined with either protocol, the mirrors can be created; no unique commands for iSCSI or FC are required after this point.
Example 12-13 Defining the iSCSI target
xcli>xcli -c "XIV MN00033" target_define target="XIV V10.0 MN00035" protocol=iSCSI Command executed successfully.
Example 12-14 Defining the FC target
xcli>xcli -c "XIV MN00033" target_define target="XIV V10.0 MN00035" protocol=FC Command executed successfully.
11. Next, we can define the mirror by using the mirror_create command. In this case, unlike with the GUI, the volume must already exist on the secondary system (the GUI offers the option to create the volume on the secondary system); however, later we show how to create the volume on the secondary system with the XCLI. As seen in Example 12-15, we have specified the volume on the primary system (cirrus12__06), the secondary system, and the volume on the secondary (cirrus12__06RM).
Example 12-15 Create a mirror
xcli>xcli -c "XIV MN00033" mirror_create local_volume=cirrus12__06 target="XIV V10.0 MN00035" slave=cirrus12__06RM
Command executed successfully.

12. We can list the mirrors at this point by using the command mirror_list, which is shown in Example 12-16. In this example, we have several mirrors that are unsynchronized, and the newly created mirror shows its status as Initializing. We now have to activate the new mirror to start the copy by using the mirror_activate command, as shown in Example 12-17.
Example 12-17 Activate the mirror

xcli>xcli -c "XIV MN00033" mirror_activate vol=cirrus12__06
Command executed successfully.

13. All of the Remote Mirror commands executed are logged in the event list and can be viewed with the XCLI command event_list. An example of this event list is shown in Example 12-18. This command is useful for reviewing which commands were issued and when.
Example 12-18 The event_list in XCLI from the primary system while setting up iSCSI Remote Mirror
2008-08-19 14:54:36  TARGET_PORT_ADD                    yes  admin
2008-08-19 14:54:36  TARGET_ISCSI_CONNECTIVITY_CREATE   yes  admin
2008-08-19 14:54:37  TARGET_CONNECTION_ESTABLISHED      yes
2008-08-19 14:58:03  HOST_NO_MULTIPATH_ONLY_ONE_MODULE  yes
2008-08-19 14:58:14  HOST_NO_MULTIPATH_ONLY_ONE_MODULE  yes
2008-08-19 15:00:42  HOST_MULTIPATH_OK                  yes
2008-08-19 15:05:36  HOST_MULTIPATH_OK                  yes
Alternatively, to create the volume on the secondary by using the XCLI, we need to include two additional options with the mirror_create command. The one piece of information required, as with the GUI, is the name of the pool on the secondary system, which can always be found by issuing the XCLI command pool_list. In Example 12-19, we have created another mirror and indicated that the volume on the secondary must be created (create_slave=yes) in the pool called target_test.
Example 12-19 Create Remote Mirror
$ xcli -c "XIV MN00033" mirror_create local_volume=cirrus12__01 target="XIV V10.0 MN00035" slave=cirrus12__01 create_slave=yes remote_pool=target_test
Command executed successfully.

Using XCLI commands to set up the mirrors can be extremely convenient when a large number of mirrors must be created, because all the commands can be placed into a single script and executed at one time.
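Such a script can be as simple as a loop over volume names that emits one mirror_create command per volume, following the syntax of Example 12-19. This sketch only prints the commands for review; run them (or pipe the output to a shell) once verified. The system, pool, and volume names are illustrative:

```shell
# Sketch (not from the book): generate mirror_create commands for a batch
# of volumes. Names follow the examples in this chapter but are illustrative.
PRIMARY='XIV MN00033'
TARGET='XIV V10.0 MN00035'
POOL='target_test'

gen_mirror_cmds() {
    # One mirror_create command per volume, with the slave volume created
    # on the secondary (create_slave=yes) in the chosen pool.
    for vol in "$@"; do
        echo "xcli -c \"$PRIMARY\" mirror_create local_volume=$vol" \
             "target=\"$TARGET\" slave=$vol create_slave=yes remote_pool=$POOL"
    done
}

gen_mirror_cmds cirrus12__01 cirrus12__02 cirrus12__03
```

Generating the commands first, instead of executing immediately, makes it easy to review the full batch against the 128-pair limit before anything is created.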
Chapter 13. Data migration
This chapter introduces the XIV Storage System embedded data migration function.
13.1 Overview
As with any data center change, whatever the reason for your data migration, it is preferable to avoid disrupting or disabling active applications where possible. While there are many options available for migrating data from one storage system to another, the XIV Storage System includes a Data Migration feature that enables the easy movement of data, at times large amounts of data, from an existing storage system to the XIV Storage System. This feature enables the production environment to continue functioning during the transfer of data, with very little downtime for the business applications. Figure 13-1 illustrates what the data migration environment might look like.
The IBM XIV Data Migration solution offers a smooth data transfer, because it:
- Enables the immediate connection of a host server to the XIV storage, providing the user with direct access to all the data even before it has been copied to the XIV Storage System
- Synchronizes the two storage systems by transparently copying the data to the XIV Storage System as a background process with minimal performance impact
- Supports data migration from any storage vendor
- Can be set up with either Fibre Channel or iSCSI interfaces

During the entire process, the host server is connected to the XIV Storage System. The XIV Storage System handles all read and write requests from the host server, even if the data is not yet resident on the XIV Storage System. In other words, during the data migration, the data transfer is transparent to the host, and the data is available for immediate access. The XIV Storage System manages the data migration by simulating host behavior. When connected to the storage device containing the source data, it looks and behaves like an initiator, or host. After the connection is established, the storage device containing the source data believes that it is receiving read requests from a host, when in fact the XIV Storage System is reading the data and storing it internally. If the XIV Storage System detects a media error on the non-IBM storage system, this error is reflected on the XIV Storage System at the same block, even though the block has not actually failed.
It is important that the connections between the two storage systems remain intact during the entire migration process. If at any time during the migration the communication between the storage systems fails, the process fails, and all writes from the host also fail. 13.2, Handling I/O requests on page 355 discusses this process in detail. The process of migrating data is performed at a volume level as a background process. As with Remote Mirroring, the Data Migration facility has the following boundaries:
- A maximum of four targets can be configured for the system, including any combination of Remote Mirroring and Data Migration
- The maximum number of logical unit numbers (LUNs) that can be configured at any one time between Remote Mirroring and Data Migration is 128

Throughout this chapter, the XIV Storage System is considered the target, while the other storage system is known as the source for data migration. This terminology is also used in Remote Mirroring, and both functions share the same terminology for setting up paths for transferring data. To maintain consistency with the way that the commands are used, the source system in a Data Migration scenario is referred to as a target when setting up paths between the XIV Storage System and the storage from which data will be migrated.
Source updating
This method for handling write requests ensures that both storage systems are updated with the writes. The source system remains updated during the migration process, and the two storage systems are identical throughout the background copy process. Similar to synchronous Remote Mirroring, the write commands are only acknowledged from the XIV Storage System to the host after writing to itself, writing to the source array, and receiving an acknowledgement from the source array. An important aspect of selecting this option is that if there is a communication failure between the target and source storage systems or any other error that causes a write to fail to the source system, the XIV Storage System also fails the write operation to the host. By failing the update, the systems are guaranteed to remain consistent. However, if there is a failure between the source and target systems, the host is no longer capable of updating the data on the XIV Storage System. Important: This option might cause an interruption to the production systems in the case of a failure on the source device.
No source updating
This method for handling write requests is much more tolerant of communication failures between the two storage systems. In this scenario, the source array is not updated with any write requests from the host system. The source and target arrays are not synchronized at any time during the data migration. An example of selecting which method to use is shown in Figure 13-2. The box must be checked to select source updating, shown here as Keep Source Updated. Without this box checked, write requests are only written to the target system.
Host definitions
First, define the host server that is involved in the data migration, using the same methods that you use to define any server attached to the XIV Storage System. Chapter 7, Host connectivity on page 147 describes how to define the host server. In addition, the XIV Storage System must be defined as a host on the source storage system. The volumes created on the XIV Storage System for the data migration must contain exactly the same number of blocks as the volumes on the system being migrated, which is verified upon activation.
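The block-count rule above can be illustrated with a small helper. This is a hedged sketch, assuming the conventional 512-byte SCSI logical block; the function names are hypothetical:

```python
SCSI_BLOCK_SIZE = 512  # bytes per logical block (assumed, standard SCSI)

def required_xiv_blocks(source_lun_bytes):
    """Return the block count an XIV migration volume must be created with."""
    if source_lun_bytes % SCSI_BLOCK_SIZE:
        raise ValueError("source LUN size is not a whole number of blocks")
    return source_lun_bytes // SCSI_BLOCK_SIZE

def sizes_match(source_blocks, xiv_blocks):
    """Activation-time check: block counts must be equal, not merely close."""
    return source_blocks == xiv_blocks
```

For example, a 10 GiB source LUN requires an XIV volume of exactly 20 971 520 blocks; a volume that is even one block larger or smaller fails the verification at activation.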
Connectivity
The next step is to set up connectivity between the two storage systems. If the data migration is done over iSCSI (the target storage must also support iSCSI), both storage devices must have their ports defined and configured. It is also important to enable jumbo frames on the Ethernet switch to which the systems are connected, as well as on the systems themselves when the ports are configured. If the data migration is carried out over Fibre Channel, zones must be created between the two storage systems, as well as between the host server and the XIV Storage System. If you use more than one path (recommended for Fibre Channel), create two zones, each including one port from each storage system.
After defining the target, you must add ports to the target and then create the paths. Using the GUI, right-click the target system (also referred to as the source storage system) and select Add Port as shown in Figure 13-5 on page 358. After the port is defined, the worldwide port name (WWPN) or iSCSI definition must be included. The example shown in Figure 13-6 on page 358 shows the window where you define the WWPN for a Fibre Channel port definition.
Note: The host and data migration paths can be created with either iSCSI or Fibre Channel, and the two protocols can be mixed across paths, as long as only one protocol is used for each path.
After creating the definition, we are ready to test the data migration.
After the test completes successfully, we are ready to activate the migration.
13.3.3 Activate
It is now time to connect the host server to the XIV Storage System and activate the data migration. This option is shown in Figure 13-8 for the GUI; the same step can be performed with the XCLI command dm_activate. From this point on, with the data migration in progress, all of the host server's reads and writes are sent to the XIV Storage System, which can retrieve data from, and optionally write data to, the source storage system.
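The full test-and-activate flow can also be driven from the XCLI. The command names below (dm_define, dm_test, dm_activate, dm_list, dm_deactivate, dm_delete) are the ones this chapter refers to, but the parameter names shown are illustrative only; consult the XCLI Reference Guide, GC27-2213, for the exact syntax:

```
# Define the migration relationship between the XIV volume and the
# LUN on the source system (parameter names are illustrative)
dm_define vol=xiv_migration_vol target=legacy_array lun=5 source_updating=yes

# Verify that the XIV Storage System can read the source LUN
dm_test vol=xiv_migration_vol

# Start the background copy; host I/O now goes through the XIV system
dm_activate vol=xiv_migration_vol

# Monitor progress, then end the relationship when the copy completes
dm_list
dm_deactivate vol=xiv_migration_vol
dm_delete vol=xiv_migration_vol
```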
Related publications
The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this book.
Other publications
These publications are also relevant as further information sources:

- IBM XIV Storage System Installation and Service Manual, GA32-0590
- IBM XIV Storage System XCLI Manual, GC27-2213
- IBM XIV Storage System Introduction and Theory of Operations, GC27-2214
- IBM XIV Storage System Host System, GC27-2215
- IBM XIV Storage System Model 2810 Installation Planning Guide, GC52-1327-01
- IBM XIV Storage System Pre-Installation Network Planning Guide for Customer Configuration, GC52-1328-01
- XCLI Reference Guide, GC27-2213-00
- Host System Attachment Guide for Windows - Installation Guide:
  http://publib.boulder.ibm.com/infocenter/ibmxiv/r2/index.jsp
- The iSCSI User Guide:
  http://download.microsoft.com/download/a/e/9/ae91dea1-66d9-417c-ade4-92d824b871af/uguide.doc
- AIX 5L System Management Concepts: Operating System and Devices:
  http://publib16.boulder.ibm.com/pseries/en_US/aixbman/admnconc/hotplug_mgmt.htm#mpioconcepts
- System Management Guide: Operating System and Devices for AIX 5L:
  http://publib16.boulder.ibm.com/pseries/en_US/aixbman/baseadmn/manage_mpio.htm
- Host System Attachment Guide for Linux, which can be found at the XIV Storage System Information Center:
  http://publib.boulder.ibm.com/infocenter/ibmxiv/r2/index.jsp
- Sun StorEdge Traffic Manager Software 4.4 Release Notes:
  http://dlc.sun.com/pdf/819-5604-17/819-5604-17.pdf
- Fibre Channel SAN Configuration Guide:
  http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_san_cfg.pdf
- Basic System Administration (VMware Guide):
  http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_admin_guide.pdf
- Configuration of iSCSI initiators with VMware ESX 3.5 Update 2:
  http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_iscsi_san_cfg.pdf
- ESX Server 3 Configuration Guide:
  http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_3_server_config.pdf
Online resources
These Web sites are also relevant as further information sources:

- IBM XIV Storage Web site:
  http://www.ibm.com/systems/storage/disk/xiv/index.html
- System Storage Interoperability Center (SSIC):
  http://www.ibm.com/systems/support/storage/config/ssic/index.jsp
- SNIA (Storage Networking Industry Association) Web site:
  http://www.snia.org/
- IBM Director Software Download Matrix page:
  http://www.ibm.com/systems/management/director/downloads.html
- IBM Director documentation:
  http://www.ibm.com/systems/management/director/
Index
A
Active Directory integration 126 address space 18 admin 86, 91, 127128, 131, 157, 164165 Agent See IBM Director, Agent AIX 179, 193194 fileset 196 supported version 193 alerting event 259 application administrator 127129 user groups 128 applicationadmin 127128, 134 automatic deletion 287288, 298 autonomic features 34 availability 8, 10, 15 connectivity 148150 connectivity adapters 46 Consistency Group 22, 42, 100102, 143, 288, 300301 create 300301 delete 309 expand 302 new snapshots 304 snapshot groups 102 Storage Pool 300, 302 xiv_windows_1 volume 305 context menu 132133, 164, 174175, 345 select events 142 cooling 53 copy functions 285 copy on write 16, 287 copy pairs 330 Core Services 266 coupling 325, 332 created Storage Pool actual size 96
B
background copy 318, 355 backup 285, 288289 backup script 312314 bandwidth 13, 15, 46, 52, 60 Basic configuration 64 battery 48, 256 best-effort coupling 331 block 16, 18, 103, 108, 112 block-designated capacity 18 boot device 53 buffer 57
D
Data Collection 283 data distribution algorithm 47 data integrity 15, 22, 43 data migration 4, 11, 16, 43, 56, 143, 285, 330, 353355 host definitions 357 Data Module 9, 35, 5051, 55, 66, 236 data redundancy 1516, 33 data stripe 237 daemon 260 default IP address 86, 92 gateway 74 definable 56 delete snapshot 287, 297 deletion priority 23, 33, 287288, 290292 depleted capacity 97, 103 depletion 23 destage 22, 36 destination 262, 264, 276 destination type 276 direct connection 154, 162 Director Console 266267, 269 dirty data 3536, 44 Disaster recovery 35, 42 disaster recovery 35, 42, 325, 333 disk drive 57, 93, 236, 238, 283 reliability standards 43 disk scrubbing 43 disk_list 256 distribution 3739 dm_activate 359
C
cache 89, 11, 46, 5253 buffer 57 page 237 caching management 236 call home 249, 253 capacity 79, 81, 84 depleted 95 unallocated 24, 26, 29 category 126127, 136 CE/SSR 70, 7778, 283 cg_delete 309 cg_list 308 cg_move 101102 client 265266 Command Line Interface 80 Compact Flash Card 53 component_list 256 computing resource 10, 12 configuration flow 84
dm_deactivate 359 dm_define 358 dm_list 359 dm_test 359 DM-MPIO 208, 213 DNS 263, 275276 duplicate 288290 duplicate snapshot 288289, 291 creation date 288289, 291 creation time 288
H
hard capacity 79, 81, 84 depletion 33 hard pool size 2728 hard size 94, 98, 107 hard space 23, 25 hard storage capacity 89, 251 hard system size 2829, 32 hard_size 102 hardware 4547 high availability 34, 42 host transfer size 239 host connectivity 55, 149, 152 Host definition 116 host HBA 156, 239 host server 152, 169170, 354355, 357 example power 176 I/O requests 355
E
E-mail notification 85 enclosure management card 53 encryption 282 Ethernet connection 162 Ethernet fabric 89 Ethernet switch 56, 148, 161162, 334, 357 event 252253, 258259 alerting 259 severity 250, 252253, 258259 event code 276 event_list 258259
I
IBM development and support 127 IBM Director 260, 262, 264 Agent 265266 components 264265 Core Services 266 enhanced functionality 265 main component 265 MIB file 266267 Server 264266 IBM Redbooks publication Introduction 155 IBM SSR 6970, 74 IBM System Storage Interoperability Center 151152 IBM XIV 46, 4849, 6465, 251252, 254 data migration solution 354 Data Module 66 FC HBAs 172 final checks 78 hardware 61, 66 hardware component 61 installation 77 internal network 62 iSCSI IPs 172 iSCSI IQN 172 personnel 75 power components 62 rack 77 remote support 70 remote support center 281 repair 61 SATA disks 58 Serial Number 158 software 4, 260, 273 Software System 92
F
failure 324, 331332 fan 53 fc_port_config 328329 fc_port_list 326, 328, 346 Feature Code overview 65 Fibre Channel host I/O 70 Fibre Channel connectivity 55 Fibre Channel port definition 357 Fibre Channel ports 55, 328, 334335 filter 242244 free capacity 97, 99 full installation 82 Function icons 88
G
Gateway 85, 274275, 278 gateway 56 GHz clock 52 GigE adapter 51 given disk drive transient, anomalous service time 44 given XIV Storage System common command execution syntax 91 Goal Distribution 20 goal distribution 3739 priority 38 graceful shutdown 3637 graceful shutdown sequence 36 grid architecture 8, 10, 12 grid topology 13, 34, 36 GUI 8081, 84
storage 66, 355, 360 Storage Manager 4 Storage System 70 support 56, 61, 70, 109, 273 Support Center 61, 273 Support Structure 61 technician 38 XCLI User Manual 258 IBM XIV Storage Manager 4, 80, 90, 122 Manager GUI 5 Manager window 131 Subsystem 15 System 7274 System patch panel 170 XCLI 5 IBM XIV Storage System grid overview 12 initialization 325, 346 initiator 325326, 328, 354 Intel Xeon 52 interface 1213 Interface Module 9, 35, 5051, 55, 66, 69, 71, 85, 117, 122, 147149 2-port 4Gb FC PCI Express Adapters 55 FC port number 117 inutoc 196 IOPS 240241, 243 IP address 56, 69, 7374, 86, 160161, 189190, 201, 261, 275276, 349 space 75 ipinterface_list 326, 349 iSCSI 5456, 147149, 354, 357358 connectivity 56 initiator 56 ports 5556 target 56 iSCSI connection 7374, 159, 174, 177, 339 iSCSI connectivity 122, 160161, 163 iSCSI HBA 147, 164, 166 iSCSI host 165, 169, 174 iSCSI name 166167, 338, 348 iscsid.conf 216
local site 325, 332333 lock Snapshot 313 lock_behavior 103 locking 19, 33 logical structure 16, 20, 22 logical volume 34, 15, 18 LUN Map 115, 117 identification number 115 LUNs table 117119
M
MacOS 80, 85 Main display 88 Maintenance Module 49, 61 managed system 265266 See also IBM Director, Agent management console 80, 265 See also IBM Director, Console Management Information Base (MIB) 262 management server 265 See also IBM Director, Server Management workstation 85 management workstation 80, 8586 Mandatory Coupling 333 mandatory coupling 331 mapping 1516, 84, 90, 111 master volume 16, 20, 23, 99, 138, 238, 286, 289, 293 duplicate snapshot 289 maximum volume count 18 memory 5253, 62 Menu bar 88 meta data 51 metadata 9, 286, 290 metrics 240, 243, 247 MIB 262 MIB extensions 262 MIB file 266267 migration 92, 108 migration paths 357358 mirror_create local_volume 350351 Modem 61 modem 273, 281282 module_list 256 monitor statistics 255 monitoring 249251 monolithic architecture 10 monolithic system 11 MTU 56, 85, 327, 339, 349 default 159, 163 maximum 159, 163 multipathing 194, 196, 204 MySQL 308309, 312
J
just-in-time 93
K
KB 16, 237, 286
L
latency 238, 241 left pane 175 link aggregation 56 link aggregation group 56 Linux 179, 204205 queue depth 181, 197198 load balancing 12, 14
N
naming convention 289 Network mask 85
O
OK button 120, 292293 On-site Repair 283 original snapshot 288289, 291 duplicate snapshot points 288 mirror copy 298
P
parallelism 8, 10, 14, 47, 60, 62, 235, 238, 240 partition 1617 patch panel 5556, 59, 70, 78, 147149, 250 PCI Express 52, 55 PCI-e 236 performance 235237 metrics 240, 243, 247 phase-out 15, 38, 41 phases-out 34 phone line 61 physical capacity 34, 8, 15, 18, 102 physical disk 1617 pointer 286, 293, 312 pool size 2628, 9698 hard 98 soft 98 pool soft size 26 pool_change_config 101, 103 pool_delete 101102 pool_rename 101102 pool_resize 101102 port configuration 327 port role 328 ports 328, 334335 Power on sequence 37 power outage 49 power supply 48, 50, 54 power-off 49 predefined user admin 129 role 126, 129 prefetch 13 pre-fetching 237 primary site 325, 332333 primary system 325326, 329 source volume 345 source volumes 332 primary volume 20, 325, 331332 problem record 283 pseudo random distribution MB partitions 236 pseudo-random distribution algorithm 16
RAID striping 42 RAM disk 206, 211 raw capacity 46 raw read error count 44 RBAC 126 read command 111 readonly 127128, 135 rebuild 15, 19, 37 redirect on write 16, 20, 286, 298, 318 redirect-on-write 4 redistribution 21, 34, 38, 238 redundancy 7, 1416, 33 redundancy-supported reaction 35, 44 Registered State Change Notification 155 regular pool 24, 26, 28, 98 regular Storage Pool 26, 2930, 107 final reserved space 31 snapshot reserve space 31 remote connection 61 Remote Mirror 35, 151152, 285, 310, 324, 326, 355 initiator 336, 341 target 336, 341 Remote Mirroring 151152 function 326 implementation 326 window 343, 345 remote mirroring 9, 56, 76, 151, 158, 237238, 324325, 328, 360 consistency groups 325 multiple features 325 Remote Repair 283 remote site 324, 333 Remote Support 250, 281 remote support 249, 273, 281 reserve capacity 23, 25, 27 resiliency 7, 14, 34 resize Volume 109 resource utilization 14 resume 23, 37 Right click 90, 132, 327 role 127128 Role Based Access Control 126 role switchover 333334 roles 126128 RSCN 155 rule 277279
S
same LUN 116117, 122, 151 SAN boot 319 SAS adapter 52 SATA disk 46, 5758 scalability 10, 13, 15 script 80, 90, 122 scrubbing 21, 43 secondary site 324, 333 secondary system 325326, 331 same command 347 target pool 342 secondary volume 325, 331332
Q
queue depth 181, 197198
R
rack 4547 rack door 48 RAID 15, 19, 21, 103
sector count 44 Secure Sockets Layer 86 security 125126, 137 self-healing 3, 3334, 43 Serial-ATA specification supporting key features 57 Server See IBM Director, Server serviceability 33, 43 severity 250, 252253 shell 80, 101, 112 shutdown sequence 36 sizing 240 SMS 250, 273, 275 message tokens 278 SMS Gateway 275 SMTP 275, 278279 SMTP gateway 144, 275, 278 snap_group 308, 315 Snapshot 16, 23, 33 automatic deletion 287288, 298 creation 288, 307 deletion priority 287, 291292 details 307 duplicate 288290 lock 313 performance 238 restore 315316 size 307 snapshot 13, 16, 19, 285287 delete 287, 297 duplicate 288290 locked 287, 289 naming convention 289 reserve capacity 23, 25, 27 snapshot group 304305, 307 snapshot_delete 297 SNMP 260262 destination 262, 264, 276 SNMP agent 260261 SNMP communication 261 SNMP manager 75, 262, 264 SNMP trap 260261 SNMP traps 262, 264, 269 soft capacity 79, 81, 84 soft pool size 2728 soft size 27, 94, 9798 soft system size 2829, 32 soft_size 102 software services 12 software upgrade 84 Solaris 179, 218219 iSCSI 218, 221 source updating 355356, 360 space depletion 23 space limit 18 spare capacity 2021, 41 spare disk 237 SSL 86 standby mode 331, 333
state 109, 111, 117 state_list 255 static allocation 107, 110 statistics 235, 240241 monitor 255 statistics_get 246248 statistics_get command 246247 Status bar 89, 251 STMS 218 storage administrator 1415, 17, 79, 86, 129, 131, 138, 273, 325, 333, 344 Storage Management software 8081, 89 installation 81 Storage Networking Industry Association 126 Storage Pool 2223, 89, 93, 170, 173, 177, 251, 287, 290, 297 additional allocations 298 and hard capacity limits 29 available capacity 33 capacity 2425 delete 99 future activity 96 logical volumes 23 overall information 94 required size 97 resize 98 resource levels 110 snapshot area 102 snapshot capacity 97 Snapshot Sets 101 space allocation 25 system capacity 9697 unused hard capacity 31 XCLI 101, 121 Storage Pools 14, 18 create 96 storage space 22, 93, 97, 103 storage system 84, 103, 108, 126, 129, 133, 149, 151152, 325, 333, 354355, 357 Storage System software 80 storage virtualization 3, 1415 innovative implementation 14 storageadmin 127128, 131 striping 239 suspend 22 switch_list 257 switching 13 switchover 333334 synchronization 325, 332, 351 synchronous 324 SYSFS 210, 212, 214 System level thin provisioning 2829 System Planar 5153 system quiesce 36 system services 9 system size 2829, 32 hard 2829 soft 2829 system time 255
system_capacity_list 255
T
target 325327 Target Protocol 336 target volume 111112, 318, 320, 344 source volume 111 target_list 348 target_mirroring_allow 348 TCO 43 TCP port 86 technician 127128, 135 telephone line 281282 thick-to-thin provisioning 16 Thin Provisioning 94, 108 thin provisioning 34, 16, 24, 26, 94, 102 system level 28 time 255 time_list 246, 255256 TimeStamp 246 timestamp 332 TLS 86 token 278 Toolbar 88 toolbar 115, 117, 120, 263, 273 total cost of ownership 43 transfer size 239 Transient system 41 Transport Layer Security 86 trap 261
vol_move 101102 Volume 99100, 104 resize 109 soft size 27 state 109, 111, 117 Volume Copy 285, 317, 319 OS image 319 volume count 18 Volume Shadow Copy Service (VSS) 311 volume size 25, 107109 VPN 61, 273, 281282 VSS 23
W
Welcome panel 274, 276
X
XCLI 5, 26, 80, 85, 90, 126, 134135, 156, 163164, 240, 246, 255256, 258, 290292, 326, 328, 331 XCLI command dm_delete 359360 event_list 141 XCLI utility 9092 XIV GUI 84, 86, 94, 128129, 140, 157, 163164, 260, 263, 273, 311 Viewing events 142 XIV Storage hardware 80, 85, 88 Management 75, 78 Management GUI 88, 108, 110 Management software 8081, 90 Management software compatibility 80, 180 Manager 80 Manager installation file 81 port 86, 116 System 13, 810, 4547, 6364, 80, 125127, 147148, 151, 235237, 250, 252, 255, 285287, 324326, 353355 System API 311 System grid architecture 1213 System installation 70 system reliability 42 System time 246 Systems 103, 140, 151, 164, 166 XIV storage administrator 150 XIV Storage Manager 8081, 84 XIV Storage System 7980, 84, 249251 administrator 86 API 311 architecture 8, 10, 12 configuration 103, 116 design 3 grid architecture 12 hardware 64, 74 installation 64 internal operating environment 14, 33 iSCSI configuration 73 logical architecture 16
U
UDP 261 unallocated capacity 24, 26, 29 uncommitted data 332 unlocked 106, 109, 111 UNMAP_VOLUME 145, 258 unsynchronized 331, 333, 345 UPS module 46, 48 UPS module complex 48 ups_list 256257 usable capacity 21, 46 usable space 21 USB to Serial 60 user group 127129 Access Control 132133 Unauthorized Hosts/Clusters 133 users 126128 users location 255
V
version_get 255 Virtual Private Network 61 virtualization 1415, 103 VMware 319320 VMware ESX 223224, 231, 362 vol_copy 319 vol_lock 297
logical hierarchy 17 main GUI window 164, 173 management functionality 85 Management main window 87 Overview 1 point 176, 178 rack 6061 reserves capacity 21 serial number 166 snapshot functionality 311 software 4, 80 stripes data 236 support 65 time 237238, 246 use 91, 116, 122 virtualization 1415 WWPN 157158, 170 XIV subsystem 239, 286287 XIV V10.0 MN00050 145, 157, 164165, 311 xiv_development 127128 xiv_maintenance 127128, 136 XML configuration file 311
Z
zoning 155156, 172