1. Explain and recognize basic Storage Networking Technology Components and Concepts (9%)
1.1 Compare and contrast how the disk technologies of Fibre Channel, ATA, SATA, SCSI, and SAS operate
ATA (IDE):
- Also known as parallel ATA (PATA)
- 8- or 16-bit interface
- Maximum theoretical speed 100 MB/s (ATA-6)

Fibre Channel:
A 24-bit address (FCID) consists of the following 3 parts (in order): Domain (1-239), Area (0-255) and Node Address (the AL_PA), that is, an 8-bit Domain ID, an 8-bit Area ID and an 8-bit Port ID.
- Domain: the domain is a unique number assigned to each switch in a logical fabric. A domain ID assigned to a switch can range from 1 to 239. This number comprises the first 8 bits of the FCID.
- Area: the 8-bit area field is assigned by the switch as well. It can range from 0 to 255. In some third-party switches this number is assigned by using the physical port number (that is, port 3 out of 16 ports), limiting availability on some operating systems. The Cisco MDS assigns these sequentially regardless of the physical port number.
- Port: the port field is also 8 bits, ranging from 0 to 255. This field is unique in that it is also used to assign the arbitrated loop physical address (AL_PA) for devices that use loop. For a device that is not using arbitrated loop, it is common to see the field set to 0, although this is not required.
http://www.cisco.com/en/US/prod/collateral/ps4159/ps6409/ps4358/prod_white_paper0900aecd80285738_ns512_Networking_Solutions_White_Paper.html

SAS (Serial Attached SCSI):
- Max 128 devices (first generation), max 256 devices (second generation)
- Max 3 Gb/s, with 6 Gb/s expected in the near future
- Hot-pluggable
- SAS devices can communicate with both SATA and SCSI devices (the backplanes of SAS devices are identical to SATA devices).
- A specific difference between SCSI and SAS devices is the addition in SAS devices of two data ports, each of which resides in a different SAS domain. This enables redundancy (failover): if one path fails, there is still communication along a separate and independent path.
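The 24-bit Fibre Channel address layout described above (8-bit domain, 8-bit area, 8-bit port) can be illustrated with a minimal Python sketch. The names `FCID`, `parse_fcid` and `build_fcid` are hypothetical helpers chosen for this example, and the sample value `0x0A1B00` is made up; only the field widths and value ranges come from the text.

```python
from typing import NamedTuple


class FCID(NamedTuple):
    """The three 8-bit fields of a 24-bit Fibre Channel address."""
    domain: int  # identifies the switch, 1-239
    area: int    # assigned by the switch, 0-255
    port: int    # port / AL_PA field, 0-255


def parse_fcid(fcid: int) -> FCID:
    """Split a 24-bit FCID into domain, area and port."""
    if not 0 <= fcid <= 0xFFFFFF:
        raise ValueError("an FCID must fit in 24 bits")
    return FCID(domain=(fcid >> 16) & 0xFF,
                area=(fcid >> 8) & 0xFF,
                port=fcid & 0xFF)


def build_fcid(domain: int, area: int, port: int) -> int:
    """Combine the three fields into a 24-bit FCID, checking the ranges."""
    if not 1 <= domain <= 239:
        raise ValueError("domain ID must be in the range 1-239")
    if not 0 <= area <= 255:
        raise ValueError("area must be in the range 0-255")
    if not 0 <= port <= 255:
        raise ValueError("port must be in the range 0-255")
    return (domain << 16) | (area << 8) | port


if __name__ == "__main__":
    # Illustrative value only: domain 10, area 27, port 0 (non-loop device).
    print(parse_fcid(0x0A1B00))
    print(hex(build_fcid(10, 27, 0)))
```

A port field of 0, as in the sample value, matches the common case mentioned above of a device that is not on an arbitrated loop.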
14/01/2012 18:37
Study Guide / Book for SNIA Certified Storage Engineer (SCSE, S10-201) http://www.rootkit.nl/files/book_snia_certified_storage_engineer_s10-...
SATA (Serial ATA):
- Serial link
- Current standard: maximum speed of 6 Gbit/s
- Most disks currently cannot saturate even the 1.5 Gbit/s link
- Uses native command queuing to deal with incoming actions
- 7-pin connector for data, 15-pin connector for power
- When converting SAS to SATA, use an adapter or cable. Example: http://www.cs-electronics.com/sas-products.htm

SCSI:
- Parallel
- Up to 320 MB/s (Ultra-320 SCSI) or even 640 MB/s (Ultra-640 SCSI)
Describe virtualization implementation techniques and management strategies (e.g., in-band and out-of-band)
host-based:
storage-based: main reasons are segmentation and security. Segmentation/virtualization helps in performing upgrades, migrating data etc.
Switch-based virtualization (in-band / out-of-band):
- in-band: control and data travel the same path. Advantages are easier installation (no specific software required) and the possibility of offloading and performance optimizations in the data path.
- out-of-band: control and data each have their own path.
2.2 Explain HBA configuration parameters; justify the reasons for each parameter setting
QueueDepth
If the number of outstanding I/Os per device is expected to be above 32, QueueDepth needs to be increased. The vendor of the storage and/or the HBAs usually has documents describing how to adjust the value and how to measure which value gives the best performance. A common starting point is to divide the storage array's total queue length by the number of HBAs. If QueueDepth is undersized, performance can degrade due to Storport throttling of its device queue.
I/O coalesce
I/O coalesce controls the number of CPU interrupts, for more efficient CPU utilization. Turn on the I/O coalesce parameter in high-performance environments. However, when adjusting the related parameters it is important to find the most suitable values: reducing the number of interrupts can also cause poor performance. It depends mainly on the workload. CoalesceMsCnt is the count in milliseconds, CoalesceRspCnt is the count of pending responses.

Parameter               Alias  Values
ConnectionOption        CO     0-3
DataRate                DR     0-3
FrameSize               FR     512, 1024, 2048
HardLoopID              HD     0-125
ResetDelay              RD     0-255
EnableBIOS              EB     0, 1
EnableHardLoopID        HL     0, 1
EnableFCPErrRecovery    EF     0, 1
ExecutionThrottle       ET     1-65535
EnableExtendedLogging   EL     0, 1
LoginReTryCount         LR     0-255
EnableLipReset          LP     0, 1
PortDownRetryCount      PD     0-255
EnableLIPFullLogin      FL     0, 1
LinkDownTimeOut         LT     0-240
EnableTargetReset       TR     0, 1
MaximumLUNsPerTarget    ML     0, 8, 16, 32, 64, 128, 256
LinkDownError           LD     0, 1
FastErrorReporting      FE     0, 1

Parameter                       Qlogic default setting                          EMC-approved setting
Data Rate                       0 (1 Gb/s)                                      2 (AutoSelect)
Execution Throttle              16                                              256
Connection options (topology)   2 (Loop preferred, otherwise point-to-point)    2 (Loop preferred, otherwise point-to-point)
Loop Reset Delay                5                                               5
Enable LIP Full Login           Yes                                             Yes
Enable Target Reset             No                                              Yes
Port Down Retry Count           8                                               45
Link Down Timeout               30                                              45
LUNs Per Target                 8                                               256
Adapter Hard Loop ID            Enabled                                         Disabled
Hard Loop ID                    0                                               0
Descending Search LoopID        0                                               1
Operation Mode                  0                                               0
Interrupt Delay Times           0                                               0
Enable Interrupt (24xx HBAs)    No                                              No

Execution Throttle: Specifies the maximum number of I/O commands allowed to execute on an HBA port. When a port's execution throttle is reached, no new commands are executed until the current command finishes. Default: 256. Range: 1-256. OS: Windows.
Frame Size: Specifies the size of a Fibre Channel frame per I/O. Default: 2048. Range: 512-2048. OS: All.
Fibre Channel Data Rate: Specifies the HBA adapter data rate. When set to Auto, the adapter auto-negotiates the data rate with the connecting SAN device. Default: Auto. Range: 1 (Auto), 2 (1Gb), 3 (2Gb), 4 (4Gb). OS: All.
Maximum Queue Depth: Specifies the maximum number of I/O commands allowed to execute/queue on an HBA port. Default: 32. Range: 1-65535. OS: VMware ESX.
Maximum Scatter Gather List Size: Specifies the size of the list of DMA items that are reported to the SCSI mid-level per I/O request. Default: 32. Range: 1-255. OS: VMware ESX.
Maximum Sectors: Specifies the maximum number of disk sectors that are reported to the SCSI mid-level per I/O request. Default: 512. Range: 512, 1024, 2048. OS: VMware ESX.
Adding and removing ISLs is the result of connecting or disconnecting E-ports (Expansion ports).
Reasons:
- Load sharing
- Failover
- Connecting fabrics, increasing throughput
- Adding links to an existing ISL trunk
BB Credit
Configures the number of buffers that are available to attached devices for frame receipt. The default is 16; values range from 1 to 16.
R_A_TOV
Resource allocation time out value. This works with the E_D_TOV to determine switch actions when presented with an error condition.
E_D_TOV
Error detect time out value. This timer is used to flag a potential error condition when an expected response is not received within the set time.
Domain reconfiguration nondisruptive: This event is limited to the affected VSAN. Only Cisco MDS 9000 Family switches have this capability: only the domain manager process for the affected VSAN is restarted and not the entire switch.
Name server: Verify that all vendors have the correct values in their respective name server database.
IVR: IVR-enabled VSANs can be configured in any interop mode.
Brocade's msplmgmtdeactivate command must explicitly be run prior to connecting from a Brocade switch to either Cisco MDS 9000 Family switches or to McData switches. This command uses Brocade proprietary frames to exchange platform information, which Cisco MDS 9000 Family switches and McData switches do not understand. Rejecting these frames causes the common E ports to become isolated.
Effective configuration: the active set, loaded in memory; a configuration becomes effective with cfgEnable.
Defined configuration: the complete set of defined zone objects; it is saved to flash with cfgSave.
2.8 Identify best practices for storage allocation in Fibre Channel SAN
Adding storage to a new host
EMC:
- Create raid pool
- Bind LUN
- Create storage pool
- Register host
- Present LUN to host
Upgrading
EMC: Extend LUN
NetApp: Extend volume or iSCSI LUN
Identify business context for NAS (e.g., email repository, content archiving)
NAS is often used for sharing documents, file stores, content archiving, email repositories, backups
Identify business context for SAN (e.g., database repository, data replication)
Storage with low latency demands like databases and OLTP. Also mass storage demands including data replication.
Identify steps needed to bring environment back to a controlled situation (e.g., host is swapped out or a device is changed)
xxx
Implementing decommission of hardware (e.g., classify information to understand proper disposal methods, erasure of passwords, configs and zone sets, disk, tape, and data)
Cisco devices: clear zone database (clears zone information of VSAN)
Passwords: clear passwords
Configs: clear configuration before reusing or throwing hardware away.
Zone sets: xxx
Disk: xxx
Tape: Remove from catalog (remove or 'expire' the tape media) and use the company's disposal method.
3.5 Apply steps to add a configured switch to an existing fabric (e.g., verify that domain ID is unique, ensure zone names are unique, backup existing zone before changes, validate existing admin account has unique username/password on new switch)
3.6 Using scenarios, illustrate reasons to add or remove ISLs (Inter Switch Links)
Increasing throughput, connecting more fabrics together.
Determine impact of adding an ISL (e.g., more options for SAN expansion, allows configuration to take full advantage of ports)
More ISLs means a better usage of the ports (and less oversubscription needed). Also expansion of the SAN is possible.
3.7 Identify processes that occur on a switch during a fabric merge (e.g., name services, protocol sequence, and principal switch selection)
While merging, the following processes happen:
- Zoneset passing
- Name server distribution
- Negotiation of (shortest) paths
- Principal switch selection/negotiation (lowest WWN usually wins)
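As a rough illustration of the "lowest WWN usually wins" rule for principal switch selection, here is a minimal Python sketch. The `Switch` type, the `priority` field with its default of 128, and the sample WWNs are assumptions made for this example; the text above only mentions the WWN comparison, and a priority value compared before the WWN is added here as one reason the lowest WWN does not always win.

```python
from typing import NamedTuple, Sequence


class Switch(NamedTuple):
    name: str
    wwn: str             # colon-separated hex, e.g. "10:00:00:05:1e:00:00:01"
    priority: int = 128  # hypothetical default; lower value is preferred


def wwn_to_int(wwn: str) -> int:
    """Convert a colon-separated WWN to an integer for comparison."""
    return int(wwn.replace(":", ""), 16)


def elect_principal(switches: Sequence[Switch]) -> Switch:
    """Pick the switch with the lowest (priority, WWN) pair."""
    if not switches:
        raise ValueError("at least one switch is required")
    return min(switches, key=lambda s: (s.priority, wwn_to_int(s.wwn)))


if __name__ == "__main__":
    fabric = [
        Switch("sw-a", "10:00:00:05:1e:00:00:02"),
        Switch("sw-b", "10:00:00:05:1e:00:00:01"),
    ]
    # With equal priorities the lowest WWN wins.
    print(elect_principal(fabric).name)
```

Giving one switch a lower priority value in this sketch makes it the principal regardless of its WWN.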
Awareness of fabric behavior upon merge (e.g., takes 5-10 minutes to stabilize because of background processes)
Tips: - Use one ISL at a time
Activation of new production zone sets once the merge is complete (e.g., two switches on Fabric A, and one HBA going to each fabric)
3.9 Using scenarios, determine appropriate methodologies and tools for troubleshooting zone sets
- Validation of host and LUNs
- Validation of HBA logged into fabric
- Validation of storage subsystem being logged into the switch
3.10 Predict the symptoms when the distance limitations between long-wave and shortwave fiber have been exceeded
Explain why there are excessive SCSI re-transmit errors (e.g., intermittent loss of signal)
- Signal loss
- Oversubscription
3.12 Using scenarios, illustrate additional conflicts that could cause fabric segmentation
(see initial reasons in 2.7)
If an Extended Fabrics port is to be installed on a SilkWorm 2000 Series switch, the fabric-wide configuration parameter fabric.ops.mode.longDistance must be set to 1 on all switches operating within the fabric. Additionally, each long distance port must be set using the portCfgLongDistance command. Each of the two ports within a long distance ISL must be configured identically, otherwise fabric segmentation will occur.

            frames    enc    crc    too    too    bad    enc   disc   link   loss   loss   frjt   fbsy
         tx     rx     in    err   shrt   long    eof    out     c3   fail   sync    sig
======================================================================================================
  4:   617m   2.8g      0      2      0      0      0   268k      0      0      2      9      0      0
  4:   2.8g   617m      0     29      0      0      0      1    333      0      1      5      0      0
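To make counter rows like the ones above easier to scan, the following Python sketch parses one row and reports the non-zero link-quality counters. The column names are taken from the header shown above; the helper names (`parse_count`, `parse_row`, `suspicious`), the choice of which counters to watch, and the reading of the `k`/`m`/`g` suffixes as decimal thousands, millions and billions are assumptions for this example.

```python
SUFFIX = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}  # assumed decimal units

COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "too_shrt", "too_long",
    "bad_eof", "enc_out", "disc_c3", "link_fail", "loss_sync", "loss_sig",
    "frjt", "fbsy",
]


def parse_count(text: str) -> int:
    """Convert a counter such as '617m', '2.8g' or '29' to an integer."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIX:
        return int(round(float(text[:-1]) * SUFFIX[text[-1]]))
    return int(text)


def parse_row(line: str):
    """Split a row like '4: 617m 2.8g 0 ...' into (port, counters dict)."""
    port, rest = line.split(":", 1)
    values = [parse_count(v) for v in rest.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} counters, got {len(values)}")
    return int(port), dict(zip(COLUMNS, values))


def suspicious(counters, watch=("crc_err", "enc_out", "link_fail",
                                "loss_sync", "loss_sig")):
    """Return the watched counters that are non-zero."""
    return {name: counters[name] for name in watch if counters[name] > 0}


if __name__ == "__main__":
    port, counters = parse_row("4: 617m 2.8g 0 2 0 0 0 268k 0 0 2 9 0 0")
    print(port, suspicious(counters))
```

Counters are cumulative, so comparing two samples taken some time apart says more about an ongoing problem than a single snapshot does.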
Describe the technical advantages and disadvantages of each configuration (i.e., performance)
xxx
Identify external requirements that are uniquely satisfied by serverless backup or third-party copy
xxx
4.2 Analyze potential backup problems (e.g., open file, out of space, virus scanner)
xxx
Using scenarios, analyze the trade-offs with disk-to-tape, back-up window, media, silo (e.g., low cost, portable, but slow)
xxx
Using scenarios, explain advantages of disk-to-disk method (e.g., physical space, space on media, security and access to data)
xxx
Using scenarios, explain the advantages of off-host (e.g., dedicated back-up server, speed vs. cost)
xxx
Using scenarios, explain advantage of LAN-free (e.g., tapes and disks on a dedicated fabric)
- Low overhead on servers
- High speed
- Tape devices and backup disks could be zoned or placed in a dedicated fabric.
Compare the difference between hard and soft zoning regarding security
Hard zoning: members of a zone are physical ports; also known as port zoning.
Soft zoning: WWNs or pWWNs are the members of a zone; this happens within a fabric switch. Software zoning lets you create symbolic names for the zones and zone members.
Explain the process to configure secure management access to Fibre Channel switches
Use protocols with encryption like SSH (instead of telnet) and HTTPS (instead of HTTP).
5.2 Compare the RAID levels and implementation (e.g., hardware, software, host-based)
Raid 0:
Raid 1:
Raid 2:
Raid 3:
Raid 4:
Raid 5:
Raid 6:
Raid 0+1:
Raid 1+0:
Hardware vs. software: hardware has better performance and doesn't let the CPU do all the work.
5.3 Implementing Switch Technology
Differentiate among Core/Edge, Cascaded and Mesh designs
- Cascaded: inexpensive, easy to extend. However, low reliability and low scalability.
- Ring: same as Cascaded topology, but with better reliability.
- Core/Edge: best flexibility and reliability. Multi-layer design. Examples: tiered, hybrid.
- Mesh: can be fully or partially meshed. Good for any-to-any traffic. The downside is that ISLs use valuable ports.
Identify the slot to place the HBA for maximum performance and reliability
When using SSD:
- ALWAYS use a single port per PCI-E HBA card. Do not attempt to use multiple ports on your HBA cards, as the SSD bandwidth will be limited by the PCI bus.
- Avoid putting more HBAs on a server than the bus throughput can support.
Explain the reasons for virtualizing servers (e.g., ability to failover, load balance, fully utilize physical assets)
Better utilization of hardware, less power, more central management possible, load balancing, clustering and failover possibilities by placing VMs on different hosts.
List NFS/CIFS common parameters (e.g., which OS, journaling level, stateful/stateless)
NFS: UDP or TCP, port 2049, versions 2, 3, 4, usually Linux/Solaris. NFS versions 2 and 3 are stateless, even when carried over TCP (version 4 introduced state), so no intervention is needed when failing over: failure is transparent for client and server. Recovering doesn't need actions like rebooting the system to free up resources or states.
CIFS: TCP, port 445, usually Windows, stateful; intervention is required at failover, due to state recovery. With CIFS, the client maintains the connection and open file names, directories and various other aspects of the files and directories. CIFS is a "stateful" protocol, which is also a problem when the underlying connection is lost: the client does not know when to recreate the connection. File content is cached via a cooperative process between client and server code, and this is where problems can occur. The state survives only as long as the session between the server and the client survives, and this session survives only as long as the underlying network connection (generally TCP/IP) survives.
See http://www.snia.org/images/tutorial_docs/Networking/JimPinkertonSMB2_Big_Improvements_Remote_FS_Protocol-v3.pdf
Explain when no block level access is significant or insignificant (e.g., FSCK-CHKDSK, forensics)
When using file-level protocols, the NAS itself has to maintain the integrity of its local file system. However, when performing forensics or file system checks while data is being served via block-based access (SAN/iSCSI), the host system has to perform these operations.
Compare NDMP with standard NAS file level back-up (e.g., scalability, block vs. file, offloading of work to NAS unit)
xxx
6.1 Use tools to assess the performance of a network storage environment for analysis
Switch performance:
Brocade example:
switch1:admin> portPerfShow 5
     0     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15  Total
-------------------------------------------------------------------------------------------------------
     0     0   21m   28m   31m     0  8.4m     0   28m   21m   31m     0  8.4m     0     0     0   178m
     0     0   20m   29m   31m     0   10m     0   29m   20m   31m     0   10m     0     0     0   182m
     0     0   18m   36m   31m     0   14m     0   36m   18m   31m     0   14m     0     0     0   201m
     0     0   17m   34m   30m     0  7.0m     0   34m   17m   31m     0  7.0m     0     0     0   179m
Use a time server across environments for log correlation, security, discovery process and troubleshooting
Time synchronization is important for troubleshooting, when trying to debug issues and compare log events with error messages. It is also useful for security breaches and/or events, to trace back all steps in an investigation.
Protocol: NTP
Port: 123
Brocade switches: configure time on the principal switch. Other switches will use the principal switch to synchronize time.
Another use for having the correct time is the discovery process happening with RSCN. When a new disk array is attached to the fabric (ONLY the switch with the connected array), the HBAs registered in the switch's notification list will be notified and can start discovering new devices/LUNs.

Discovery process
SCSI discovery process: in the modern SCSI transport protocols, there is an automated process of "discovery" of the IDs. SSA initiators "walk the loop" to determine what devices are there and then assign each one a 7-bit "hop-count" value.
Serial Storage Architecture (SSA) is an IBM-developed serial interface. SSA is a serial technology which basically runs the SCSI-2 software protocol. The good news about SSA compared to SCSI is:
- it is far easier to configure and cable: no termination needed!
- it is built with HA features. The SSA loop architecture (as opposed to a SCSI bus) has no SPOF. If part of a loop fails, the device driver will automatically and transparently reconfigure itself to make sure all SSA devices can be accessed without any noticeable interruption.
- it uses no SCSI ID addressing, which means no hassle with setting up the adapters.
- the SSA loop can transport 4 times 20 MByte/s: two independent reads and two independent writes across each loop direction. Current actual adapter implementations allow for 35 MByte/s per adapter.
- SSA uses no bus arbitration, as opposed to SCSI. Instead, a network-like scheme is used. Data is sent and received in 128-byte packets, and all devices on the loop can request time slots independently. SCSI in turn needs bus arbitration, which can lead to performance deadlocks if an initiator doesn't release the bus in time.
- SSA allows for 25 meters between each two devices. In addition, there is a fiber-optic extender which allows for data transfers across 50 micrometer optical cables over distances up to 2.4 km. This makes it even suitable for site disaster recovery if configured properly.
- Most SSA adapters support two independent loops, which makes it possible to attach mirrored disks to different loops for higher availability.
- The SSA loops are symmetrical, twisted-pair, potential free. No TERMPWR potential shift problem.
FC-AL initiators use the LIP (Loop Initialization Protocol) to interrogate each device port for its WWN (World Wide Name). For iSCSI, because of the unlimited scope of the (IP) network, the process is quite complicated. These discovery processes occur at power-on/initialization time and also if the bus topology changes later, for example if an extra device is added.
Analyze performance implications on the fabric involving RAID, caching and connectivity configurations (i.e., identifying potential bottlenecks among these indicators)
xxx
Cache
Optimizing the cache usage can give a great performance gain on the storage. More data can be served quickly from the cache, instead of from the much slower disks. While having cache memory is usually a good thing, it should be disabled if only small random reads are being used.
NetApp: sysstat -x 5
EMC Navisphere (CLI): navicli -h XXX getcache
Example:
# navicli -h 192.168.29.133 getcache -pdp -high -low
Prct Dirty Cache Pages = 51
High Watermark: 80
Low Watermark: 60
If 80% of the cache is dirty, the cache is flushed down to 60%; currently it is at 51%.
RAID level
Using the RAID level best suited to the required safety and read and/or write speed is important. By creating several different RAID levels within the storage tiers, much of the data processing can be improved.
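The cache watermark behaviour described in the Cache part above (start flushing at the high watermark, flush down to the low watermark) can be sketched in a few lines of Python. The function name `flush_target` is a hypothetical helper for this example, and the logic is a simplified reading of the text, not the array's actual flushing algorithm.

```python
from typing import Optional


def flush_target(pct_dirty: int, high: int, low: int) -> Optional[int]:
    """Return the dirty-page percentage to flush down to, or None.

    Simplified model: a watermark flush starts once the dirty percentage
    reaches the high watermark and continues down to the low watermark.
    """
    if not 0 <= low < high <= 100:
        raise ValueError("watermarks must satisfy 0 <= low < high <= 100")
    if not 0 <= pct_dirty <= 100:
        raise ValueError("pct_dirty must be a percentage")
    return low if pct_dirty >= high else None


if __name__ == "__main__":
    # Values from the getcache example: 51% dirty, watermarks 80/60.
    print(flush_target(51, high=80, low=60))
    print(flush_target(80, high=80, low=60))
```

With the values from the example output, no watermark flush is triggered yet because 51% is below the high watermark of 80%.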
Monitor, collect, and analyze trending information to avoid bottlenecks or resource constraints on the system architecture
Monitoring logs is probably the most basic form of tracking the health of any system. Checking trends with tools like RRD and SNMP can also give valuable information about the health and growth rate of the affected systems. Monitoring tools like Nagios, Zabbix etc. are useful to respond to problems in time. Brocade switches provide the commands portperfshow and porterrshow.
Analyze Resolve problem; document problem tracking, root cause analysis, problem resolution, problem prevention timeline
Root cause analysis (RCA): a document describing the events that happened around a big issue/problem. It often contains the problem description, a timeline of events, the problem resolution/solution and follow-up actions.
6.3 Assess methods to reduce performance impacts when adding long distance connections
- Use a proper amount of buffer-to-buffer credits.
- Use asynchronous replication instead of synchronous, to prevent huge (application) delays, if the RPO can be higher than zero.
- Set the speed on both sides of the link to a fixed value (instead of auto-negotiation).
Buffer-to-buffer credits: compare with the window size in TCP connections. The value can be increased when the link is stable (or shorter).
Brocade formula: Buffer Credits = ((Distance in km) * (Data Rate) * 1000) / 2112
Brocade switches can also use LD mode (dynamic long distance mode) to automatically adjust the buffer-to-buffer credit value.
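The Brocade formula above can be worked through with a small Python sketch. Two things here are assumptions rather than statements from the text: the data rate is taken to be the link rate in Gbit/s (for example 1.0625, 2.125 or 4.25), and the result is rounded up to a whole credit. The function name `buffer_credits` is hypothetical.

```python
import math

FRAME_DIVISOR = 2112  # divisor from the formula above


def buffer_credits(distance_km: float, data_rate: float) -> int:
    """Buffer Credits = (distance in km * data rate * 1000) / 2112.

    data_rate is assumed to be the link rate in Gbit/s; the result is
    rounded up because a fraction of a credit cannot be configured.
    """
    if distance_km < 0 or data_rate <= 0:
        raise ValueError("distance must be >= 0 and data rate must be > 0")
    return math.ceil(distance_km * data_rate * 1000 / FRAME_DIVISOR)


if __name__ == "__main__":
    # 100 km at an assumed 2.125 Gbit/s link rate:
    # 100 * 2.125 * 1000 / 2112 = 100.6..., rounded up to 101 credits.
    print(buffer_credits(100, 2.125))
```

Under these assumptions the result is roughly one credit per kilometre at 2.125 Gbit/s, and it scales linearly with both distance and data rate.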
Use LSANs or VSANs to isolate traffic such that only required traffic is transferred
VSAN: virtual SAN or virtual fabric, used to achieve isolation without the need to set up a physically separate fabric. If a switch does not support VSANs, create a SAN as small as possible, but with room for growth.
LSAN: sharing (zone) information across fabrics (zones are usually prefixed with "lsan_").
7.2 Identify protocols and technologies best used for implementing business recovery solutions
DWDM or IP extenders (in combination with FCIP or iFCP).
7.3 Identify techniques and processes to be used as part of a business continuance solution
Host-based replication:
LAN-based replication:
SAN-based replication:
CDP (Continuous Data Protection)
The pWWN and device alias entries for a virtual device are identical (in terms of primary and secondary). There are no virtual device name conflicts across VSANs in fabrics.

Zoning conflict parameters
When merging two fabrics, zoning information from the two previously separated fabrics is merged as much as possible into the new fabric. Sometimes zoning inconsistencies occur and the zoning information cannot be merged. Segmentation due to zoning will usually be flagged by an error message that says "Fabric segmented, zone conflict" appearing in the error logs. One of the solutions is to make sure zoning information on both switches is consistent before bringing up the ISL.

Upgrading firmware on Brocade switches:
The internal process will be as follows:
1. The firmwareDownload -s command is entered, and you respond to prompts.
2. Firmware is downloaded to the Secondary partition.
3. Primary and Secondary boot pointers are swapped.
4. CP boots from firmware in the new Primary partition.
Say no to autocommit and yes to reboot after download. After a few days of stable operation, run the firmwareCommit command; the new firmware is then copied to the secondary partition as well.
http://www.cisco.com/en/US/products/ps5989/prod_troubleshooting_guide_chapter09186a008067a309.html

Sources used:
http://www.scsita.org/aboutscsi/sas/tutorials/SAS_General_overview_public.pdf
http://www.directron.com/ncqvstcq.html