Student Guide
Text Part Number: 67-2461-01
DISCLAIMER WARRANTY: THIS CONTENT IS BEING PROVIDED AS IS. CISCO MAKES AND YOU RECEIVE NO WARRANTIES IN CONNECTION WITH THE CONTENT PROVIDED HEREUNDER, EXPRESS, IMPLIED, STATUTORY OR IN ANY OTHER PROVISION OF THIS CONTENT OR COMMUNICATION BETWEEN CISCO AND YOU. CISCO SPECIFICALLY DISCLAIMS ALL IMPLIED WARRANTIES, INCLUDING WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT AND FITNESS FOR A PARTICULAR PURPOSE, OR ARISING FROM A COURSE OF DEALING, USAGE OR TRADE PRACTICE. This learning product may contain early release content, and while Cisco believes it to be accurate, it falls subject to the disclaimer above.
Table of Contents
Volume 3 Appendix A: The Fibre Channel Protocol
  Overview
  Module Objectives
FC Protocol Concepts
  Overview
  Objectives
  Fibre Channel Overview
  Fibre Channel: The Best of Both Worlds
  Advantages of Serial Architecture
  Fibre Channel Performance
  Fibre Channel Topologies
  What is the Point-to-Point Topology?
  What is the Arbitrated Loop Topology?
  What is the Switched Fabric Topology?
  Fibre Channel Ports
  Fibre Channel HBAs
  Fibre Channel Classes of Service
  Summary
FC Layers
  Overview
  Objectives
  Fibre Channel Layers
  FC-0: Physical Interface
  FC-1: Encoding
  FC-2: Framing and Flow Control
  FC-3: Common Services
  FC-4: Upper-Layer Protocol Interfaces
  Fibre Channel Data Constructs
  Fibre Channel Frames
  Frame Headers
  SCSI-FCP Operations
  Link Services
  Types of Link Services
  Basic Link Services
  Extended Link Services
  Summary
FC Flow Control
  Overview
  Objectives
  Fibre Channel Flow Control
  Credit-Based Flow Control
  Types of Flow Control
  Buffer-to-Buffer Flow Control and End-to-End Flow Control
  Credit Management Methods
  The Base Credit Management Method
  Allocating Buffer Credits
  Example
  Example (Cont.)
  Example (Cont.)
  Example (Cont.)
  Fibre Channel Addressing
  The Switched Fabric Address Space
  The FC-AL Address Space
  World-Wide Names
  Summary
FC Login
  Overview
  Objectives
  Fabric Login
  Port Login
  Port and Address Discovery
  Process Login
  Loop Initialization and Arbitration
  The Loop Initialization Protocol
  The Loop Arbitration Protocol
  The Loop Port State Machine
  Summary
FC Error Recovery
  Overview
  Objectives
  FC-1 Errors
  R_T_TOV
  FC-2 Errors
  E_D_TOV
  Sequence Recovery
  R_A_TOV
  SCSI-FCP Error Recovery
  Summary
FC Switched Fabric
  Overview
  Objectives
  Fabric Configuration Overview
  FSPF
  FSPF Protocol Operations
  Stage 1: The Hello Protocol
  Stage 2: Initial Database Synchronization
  Stage 3: Database Maintenance
  Stage 4: Path Discovery
  Stage 5: Path Computation
  Limitations of FSPF
  The RSCN Process
  Fabric State Changes
  The RSCN Process
  Standard Fabric Services
  The Domain Manager
  The Name Server
  Name Server Operations
  The Management Server
  Well-Known Addresses
  Summary
Implementing Cisco Storage Networking Solutions (ICSNS) v3.0
2007 Cisco Systems, Inc.
Appendix A
Module Objectives
Upon completing this module, you will be able to describe the SCSI and Fibre Channel protocols. This includes being able to meet these objectives:
- Describe the basic characteristics of the SCSI protocol
- Explain the role of Fibre Channel in a storage environment
- Describe the Fibre Channel layered model, data constructs, SCSI-FCP read and write operations, and Link Services
- Explain Fibre Channel flow control and addressing
- Describe the Fibre Channel device login process
- Explain how Fibre Channel recovers from errors
- Describe the Fibre Channel Switched Fabric protocol
Lesson 1
Objectives
Upon completing this lesson, you will be able to describe the basic characteristics of the SCSI protocol. This includes being able to meet these objectives:
- Explain the function of the SCSI protocol in a storage environment
- Describe the SCSI architecture model
- Explain the operations and limitations of SCSI parallel technology
- Describe the SCSI operational phases
- Identify the most common SCSI commands and status messages
- Explain the role of SCSI messages in error handling
[Figure: SCSI client/server model. Labels: Application Client, Device Server, Requests, Responses, Tasks, LUNs]
The Small Computer System Interface (SCSI) performs the heavy lifting of passing commands, status, and block data between platforms and storage devices.

One function of operating systems is to hide the complexity of the computing environment from the end user. Management of system resources, including memory, peripheral devices, display, context switching between concurrent applications, and so on, is generally concealed behind the user interface. The internal operations of the OS must be robust, closely monitor changes of state, ensure that transactions are completed within the allowable time frames, and automatically initiate recovery or retries in the event of incomplete or failed procedures. For I/O operations to peripheral devices such as disk, tape, optical storage, printers, and scanners, these functions are provided by the SCSI protocol, typically embedded in a device driver or in logic onboard a host adapter.

Because the SCSI protocol layer sits between the operating system and the peripheral resources, it has different functional components. Applications typically access data as files or records. Although these may ultimately be stored on disk or tape media in the form of data blocks, retrieval of the file requires a hierarchy of functions to assemble raw data blocks into a coherent file that can be manipulated by an application.

SCSI architecture defines the relationship between initiators (hosts) and targets (for example, disks) as a client/server exchange. The SCSI-3 application client resides in the host and represents the upper layer application, file system, and operating system I/O requests. The SCSI-3 device server sits in the target device, responding to requests.
[Figure: File-to-block abstraction hierarchy. Labels: Files or Records, Logical Drives, SCSI Mapping, Device Driver, SCSI Command or Data, Interconnect]
The file system presents an abstraction of data to the user application. Physical storage devices are presented as an abstraction to the file system.
When a user application opens a file, a series of processes are launched that rely on lower-level SCSI commands and controls to transport the appropriate data blocks from storage safely into memory. A translation between file representation and block I/O thus occurs in the file system layer.

Just as the file system presents an abstraction of data to the user application, the physical storage devices are presented as an abstraction to the file system. An E: drive in Windows or a /dev/dsk2 in UNIX may be a single disk, a partition on a larger disk, or a striped array of multiple disks. The file system depends on a volume management function to present sometimes diverse storage devices as coherent and easily addressable resources. Device virtualization turns physical storage into logical storage, and assumes the intricate tasks necessary for placement of data blocks on disks. This file/block translation and mapping function can be as sophisticated as a separate volume management application or as straightforward as an adapter card device driver interface to an operating system's disk utility.

This hierarchy of logical abstractions descends to the physical world of actual SCSI devices and their connectivity to the host system. Common access methods at the OS level allow uniform treatment of SCSI devices regardless of their physical attachment. In saving a file, the file system does not need to be concerned with whether the logical drive identifier fronts a direct SCSI-attached unit, a Fibre Channel array, or an IP storage device somewhere on the Gigabit Ethernet network. Regardless of the underlying plumbing, the operating system's view of the physical storage is defined by the bus/target/LUN triad inherited from parallel SCSI technology. The mapping between the bus/target/LUN designation and the logical drive identifier provides the portal between physical devices and the upper layer file system.

Because Fibre Channel and IP storage are serial transports and have no bus component, the bus identifier is fabricated for compatibility with the operating system's SCSI nomenclature. Two IP storage NICs in a single server, for example, may have different bus designations to mimic SCSI adapter configuration.
The bus/target/LUN identifier may be further mapped to the addressing requirements of a specific transport. FCP, for example, maps bus/target/LUN to a device identification (ID)/LUN pair. Consequently, the representation of physical storage has two components:
1. One is directed to the operating system, to establish a familiar, addressable entity based on the SCSI triad.
2. The other is directed at the specific transport, to accommodate the addressing requirements of that topology.
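The two-sided representation described above can be sketched as a small lookup. This is only an illustration under stated assumptions, not an actual driver interface: the `ScsiAddress` and `FcpAddress` types, the `BINDINGS` table, and the example device IDs are all hypothetical.

```python
from typing import NamedTuple


class ScsiAddress(NamedTuple):
    """OS-facing address: the bus/target/LUN triad."""
    bus: int
    target: int
    lun: int


class FcpAddress(NamedTuple):
    """Transport-facing address: a device ID/LUN pair (hypothetical type)."""
    device_id: int
    lun: int


# Hypothetical binding table: (bus, target) -> FC device ID.
# The bus number is fabricated, because a serial transport has no bus.
BINDINGS = {
    (0, 0): 0x010200,
    (0, 1): 0x010300,
}


def to_fcp(addr: ScsiAddress) -> FcpAddress:
    """Translate the OS view into the transport view; the LUN passes through."""
    return FcpAddress(BINDINGS[(addr.bus, addr.target)], addr.lun)
```

In this sketch the operating system keeps addressing `ScsiAddress(0, 1, 5)`, while the transport sees the corresponding device ID with the same LUN.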
[Figure: SCSI Architecture Model. Labels: SSC, SES, SMC, SPC, SBP, SSP, FCP, IEEE 1394, SSA, FC-PH]
The SCSI Architecture Model (SAM) consists of four layers of functionality:
1. The physical interconnect layer specifies the characteristics of the physical SCSI link:
   - FC-PH is the physical interconnect specification for Fibre Channel.
   - Serial Storage Architecture (SSA) is a storage bus aimed primarily at the server market.
   - IEEE 1394 is the FireWire specification.
   - SCSI-3 Parallel Interface (SPI) is the specification used for parallel SCSI buses.
2. The transport protocol layer defines the protocols used for session management:
   - FCP is the transport protocol specification for Fibre Channel.
   - Serial Storage Protocol (SSP) is the transport protocol used by SSA devices.
   - Serial Bus Protocol (SBP) is the transport protocol used by IEEE 1394 devices.
3. The shared command set layer consists of command sets for accessing storage resources:
   - SCSI Primary Commands (SPC) are common to all SCSI devices.
   - SCSI Block Commands (SBC) are used with block-oriented devices, such as disks.
   - SCSI Stream Commands (SSC) are used with stream-oriented devices, such as tapes.
   - SCSI Media Changer Commands (SMC) are used to implement media changers, such as robotic tape libraries and CD-ROM carousels.
   - SCSI Enclosure Services (SES) defines commands used to monitor and manage SCSI device enclosures, such as RAID arrays.
4. The SCSI Common Access Method (CAM) defines the SCSI device driver application programming interface (API).
[Figure: SCSI-3 command layer over interchangeable transports. Labels: SCSI Card, FC Card, NIC, FireWire Interface]
*SCSI-3: Separation of physical interface, transport protocols, and SCSI Command Set
The SCSI-3 family of standards introduced several new variations of SCSI commands and protocols, including serial SCSI-3 and special command sets for the streaming and media handling required for tape. As shown in the diagram, the command layer is independent of the protocol layer, which is required to carry SCSI-3 commands between devices. This allows more flexibility in substituting different transports beneath the SCSI-3 command interface to the operating system.
The bus/target/LUN triad derives from parallel SCSI technology. The bus represents one of several potential SCSI interfaces installed in the host, each supporting a separate string of disks. The target represents a single disk controller on the string. The LUN designation allows for additional disks governed by a controller, for example, a RAID device.

The following are characteristics of parallel SCSI technology:
- SCSI uses a parallel architecture in which data is sent simultaneously over multiple wires.
- SCSI is half-duplex: data travels in one direction at a time.
- On a SCSI bus, a device must assume exclusive control over the bus in order to communicate. (SCSI is sometimes referred to as a simplex channel because only one device can transmit at a time.)
[Figure: Parallel SCSI bus. Labels: Data/Address Bus, Control Signals, Terminator, Interface, device IDs with ID 7 at highest priority, LUN 0, LUN 1]
SCSI was designed to support a few devices at most, so its device addressing scheme is fairly simple and not very flexible. SCSI devices use hard addressing:
- Each device has a series of jumpers that determine the device's physical address, or SCSI ID. The ID is software-configurable on some devices.
- Each device must have a unique ID. Before adding a device to the cable, the administrator must know the ID of every other device connected to the cable and choose a unique ID for the new device.
- The ID of each device determines its priority on the bus. For example, the SCSI target with ID 7 always has a higher priority than the SCSI initiator with ID 6. Because each device must have exclusive use of the bus while it is transmitting, ID 6 must wait until ID 7 has finished transmitting.
- Fixed priority makes it more difficult for administrators to control performance and quality of service.
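The fixed-priority rule above can be illustrated with a short sketch. It assumes a narrow bus with IDs 0 through 7, where a higher ID always wins; the function name `arbitrate` is hypothetical and the sketch ignores bus timing entirely.

```python
def arbitrate(contending_ids):
    """Return the SCSI ID that wins the bus among devices contending at once.

    Assumes a narrow (8-ID) bus on which priority simply follows the ID,
    with ID 7 highest. Every ID on the bus must be unique.
    """
    ids = list(contending_ids)
    if len(set(ids)) != len(ids):
        raise ValueError("each device on the bus must have a unique ID")
    if not all(0 <= i <= 7 for i in ids):
        raise ValueError("narrow-bus SCSI IDs range from 0 to 7")
    return max(ids)
```

Because the winner depends only on the jumpered ID, a low-ID device can be starved whenever higher-ID devices keep contending, which is the quality-of-service limitation noted above.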
SCSI Operation
This topic provides an overview of SCSI protocol operations.
SCSI includes three phases of operation:
- Command: Send the required command and parameters via a Command Descriptor Block (CDB)
- Data: Transfer data in accordance with the command
- Status: Receive confirmation of command execution
[Figure: Initiator (HBA) and Target]
Every communication on the SCSI bus is formed by sequences of events called bus phases. Each phase has a purpose and is linked to other phases to execute SCSI commands and transfer data and messages back and forth. The majority of the SCSI protocol is controlled by the target. The initiator only initiates a SCSI task by selecting a target. Once the target is selected, it (the target) controls the bus. It does this by picking up the command from the initiator, executing it and delivering a status back to the initiator.
- SCSI Command: Send the required command and parameters via a Command Descriptor Block (CDB)
- SCSI Data (optional): Transfer data in accordance with the command
- SCSI Status or Response: Receive confirmation of command execution
- SCSI Message: Send Command Complete message
- Release the bus
[Figure: Phase sequence between initiator and target. Labels: Command Phase, Data In Phase (DATA), Status Phase (RSP), Disconnect]
A simple SCSI task can be described by using the following example.
1. The host adapter is the initiator. A host adapter wants to read a logical block of data from a disk drive.
2. The host adapter waits until the bus is free.
3. When the bus is free, the host adapter uses the arbitration phase to acquire initial control over the bus.
4. The disk drive that will be the target is selected. The disk accepts the selection by taking over control of the bus.
5. The host sends a SCSI READ command to the disk.
6. The disk picks up the command from the host adapter. The disk reads its data from the media and enters the data phase to send it across the bus to the host adapter.
7. After the data transfer, the disk enters a status phase and sends the status code GOOD.
8. The SCSI task is finished, so the disk sends a COMMAND COMPLETE message to the host adapter and releases the bus to the BUS FREE phase.
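The numbered steps above can be written down as an ordered phase table. This is a descriptive sketch only: the `READ_TASK` list and the helper `phases_controlled_by` are hypothetical names, and MESSAGE IN is assumed as the bus phase that carries the COMMAND COMPLETE message.

```python
# (bus phase, who drives it, what happens) for the single-block read example.
READ_TASK = [
    ("BUS FREE",    "none",      "host adapter waits until the bus is idle"),
    ("ARBITRATION", "initiator", "host adapter acquires control of the bus"),
    ("SELECTION",   "initiator", "host adapter selects the disk as target"),
    ("COMMAND",     "target",    "disk picks up the READ command (CDB)"),
    ("DATA IN",     "target",    "disk sends the requested block to the host"),
    ("STATUS",      "target",    "disk returns status GOOD"),
    ("MESSAGE IN",  "target",    "disk sends COMMAND COMPLETE"),
    ("BUS FREE",    "none",      "disk releases the bus"),
]


def phases_controlled_by(role):
    """List the phases driven by 'initiator' or 'target', in order."""
    return [phase for phase, driver, _ in READ_TASK if driver == role]


if __name__ == "__main__":
    for phase, driver, action in READ_TASK:
        print(f"{phase:<12} {driver:<10} {action}")
```

The table makes the earlier point visible: after selection, every remaining phase of the task is driven by the target.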
[Figure: 10-byte Command Descriptor Block layout]
Byte 0 (first byte): Operation Code (Group Code and Command Code)
Byte 1: Service Action / Reserved
Byte 2: Logical Block Address (MSB)
Byte 3: Logical Block Address
Byte 4: Logical Block Address
Byte 5: Logical Block Address (LSB)
Byte 6: Reserved
Byte 7: Transfer Length (MSB), the number of SCSI blocks to be transferred
Byte 8: Transfer Length (LSB)
Byte 9 (last byte): Control byte
SCSI commands are built from a common structure:
- Operation Code byte
- N bytes of parameters
- Control byte
The Operation Code consists of a Group Code and a Command Code:
- The Group Code establishes the total command length.
- The Command Code establishes the command function.
The number of bytes of parameters (N) can be determined from the Operation Code byte, which is located in byte 0 of the Command Descriptor Block (CDB). The Control byte, which is located in the last byte of the CDB, contains control bits that define the behavior of the command.
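As an illustration of the structure just described, the following sketch packs a 10-byte Read(10) CDB: one Operation Code byte, eight parameter bytes, and one Control byte. The function name `build_read10` is hypothetical, and byte 1 and byte 6 are simply left at zero here.

```python
import struct

READ_10 = 0x28  # Operation Code for Read(10)


def build_read10(lba: int, blocks: int, control: int = 0) -> bytes:
    """Pack a 10-byte CDB in big-endian (MSB-first) byte order.

    Layout: opcode (1), byte 1 left zero (1), logical block address (4),
    reserved (1), transfer length in blocks (2), control (1).
    """
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, blocks, control)


if __name__ == "__main__":
    cdb = build_read10(lba=0x1234, blocks=8)
    print(cdb.hex(" "))
```

The `>` prefix in the format string selects big-endian packing without padding, so the logical block address and transfer length land MSB first, matching the byte layout of the CDB.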
Standard SCSI commands are used on all devices. After a SCSI command is sent to a target, the initiator expects a status.
SCSI defines commands for all devices as well as commands for specific devices. For example:
- The OpCode for the General Command Write Buffer is 3Bh.
- The OpCode for the General Command Read Buffer is 3Ch.
- The OpCode for the Disk Command Write(6) is 0Ah; for Write(10) it is 2Ah.
- The OpCode for the Disk Command Read(6) is 08h; for Read(10) it is 28h.
The numbers in parentheses, (6) and (10), refer to the type of CDB used, that is, its length in bytes.
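The relationship between these OpCodes and the CDB length can be checked with a short sketch. It assumes the conventional split of the Operation Code byte into a 3-bit Group Code (upper bits) and a 5-bit Command Code (lower bits), and the usual group-to-length mapping; the names `split_opcode` and `GROUP_CDB_LENGTH` are hypothetical.

```python
# Assumed mapping of Group Code to total CDB length in bytes.
# Other groups (reserved or vendor-specific) are deliberately left out.
GROUP_CDB_LENGTH = {0: 6, 1: 10, 2: 10, 4: 16, 5: 12}

OPCODES = {
    "Write Buffer": 0x3B,
    "Read Buffer": 0x3C,
    "Write(6)": 0x0A,
    "Write(10)": 0x2A,
    "Read(6)": 0x08,
    "Read(10)": 0x28,
}


def split_opcode(opcode: int):
    """Return (group_code, command_code) from an Operation Code byte."""
    return opcode >> 5, opcode & 0x1F


if __name__ == "__main__":
    for name, opcode in OPCODES.items():
        group, command = split_opcode(opcode)
        print(f"{name:<13} {opcode:02X}h group={group} "
              f"command={command:02X}h cdb={GROUP_CDB_LENGTH[group]} bytes")
```

Under this assumption, Read(6) and Read(10) share Command Code 08h and differ only in Group Code, which is what selects the 6-byte or 10-byte CDB.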
SCSI Messages
This topic introduces the functions of SCSI messages.
SCSI messages are an additional way in which the initiator and the target communicate with each other. Some SCSI transmission parameters are not tied to a specific command, but to the relationship between a specific initiator and target:
- Transfer speed
- Data width
- Other asynchronous events, such as:
  - Aborting a SCSI command that is currently being executed by a target
  - RESTORE POINTERS
Error Handling
Parallel SCSI is not as efficient at detecting transmission errors as LAN protocols or SAN protocols. SCSI uses a parity bit. The receiving device calculates the parity and compares it with the parity bit. If they do not match, a parity error has occurred. The device that detected the parity error then sends a RESTORE POINTERS message, which resets the data transfer counter to its value at the last disconnect so that the transfer of data is repeated from that point on.
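A minimal sketch of the parity check follows, assuming odd parity over each data byte (the parity bit is chosen so that the nine bits together contain an odd number of ones). The function names are hypothetical.

```python
def odd_parity_bit(byte: int) -> int:
    """Return the parity bit that makes the total count of 1 bits odd."""
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 == 1 else 1


def parity_ok(received_byte: int, received_parity: int) -> bool:
    """Receiver-side check: recompute parity and compare with the bit sent."""
    return odd_parity_bit(received_byte) == received_parity
```

A single flipped bit changes the recomputed parity and is detected, but two flipped bits in the same byte cancel out and pass the check. This is one reason a single parity bit is weaker than the checksums used by LAN and SAN protocols.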
Summary
This topic summarizes the key points that were discussed in this lesson.
- The SCSI protocol was originally based on parallel technology and modeled after a bus topology.
- The SCSI Architecture Model is a reference for the SCSI functional layers and the SCSI Transport Interfaces.
- To communicate, the SCSI protocol operates in phases.
- The SCSI protocol has a set of command codes and status codes.
- SCSI messages are used for error handling.
Lesson 2
FC Protocol Concepts
Overview
Fibre Channel (FC) has characteristics of both I/O channels and data networks, and this unique blend of features is what makes FC ideal for storage area networks (SANs). This lesson takes a close look at the features and capabilities of FC, and compares these features and capabilities with those of traditional I/O channels such as SCSI, and data networks such as Ethernet and ATM.
Objectives
Upon completing this lesson, you will be able to explain the role of Fibre Channel in a storage environment. This includes being able to meet these objectives:
- Describe the basic characteristics of Fibre Channel
- Describe Fibre Channel performance characteristics
- Identify the three basic Fibre Channel topologies
- Define a Fibre Channel port
- Explain the functions of a Fibre Channel HBA
- Explain the Fibre Channel Classes of Service
[Figure: An IP network alongside an FC SAN of servers (with HBAs) and storage devices. Labels: FC, HBA, IP Network]
FC is a technology for transporting data between devices. It is the network interconnect technology that is most commonly used for SANs today.

Traditional storage technologies, such as SCSI, are designed for controlled, local environments. They support few devices and only short distances, but they deliver data quickly and reliably. Traditional data network technologies, such as Ethernet, are designed for chaotic, distributed environments. They support many devices and long distances, but delivery of data can be delayed. FC combines the best of both worlds. It supports many devices and longer distances, and it provides reliable data delivery.

In the diagram, the network on the right is an FC SAN: servers and storage devices connected by an FC network.
[Figure: Fibre Channel compared with I/O channels and networks. An "x" marked a drawback in the original slide.]
Fibre Channel: Many devices; Dynamic; Low latency; Long distances; Hardware-based delivery management
Network: Many devices; Dynamic; High latency (x); Long distances; Software-based delivery management (x)
I/O Channel: Short distances (x); Hardware-based delivery management
The performance characteristics of FC are as follows:
- Bandwidth: 100, 200, 400, and 1000 MBps (sustained, each direction)
- Mode: Full duplex, serial
- Maximum number of nodes: 126 for arbitrated loop, more than 16 million for switched fabric
- Link distance: Up to 30 m per link over copper, up to 10 km per link over optical fiber
- Bit Error Ratio (BER): < 10^-12
Note that 100 MBps, 200 MBps, 400 MBps, and 1000 MBps are the half-duplex (single-direction) rates for Fibre Channel, but Fibre Channel is actually a full-duplex technology. In other words, Fibre Channel supports up to 1000 MBps between two ports in both directions simultaneously.
The Bit Error Ratio (BER) is calculated by dividing the number of erroneous bits by the total number of bits transmitted, received, or processed over some stipulated period. For example, 2.5 erroneous bits out of 100,000 bits transmitted would be 2.5 divided by 100,000, or 2.5 × 10^-5.

The minimum and maximum values of the average received power determine the input power range required to maintain a BER of less than 10^-12. This value takes into account worst-case signal characteristics. A BER of 10^-12 corresponds to about one error every 8 minutes at 2 Gbps. This might seem like a very low error rate, but because some applications are more stringent, the industry is working on achieving a BER of 10^-15, which results in about one error every 5.5 days at 2 Gbps.
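The arithmetic behind these figures can be reproduced with a short sketch. It assumes that "2 Gbps" refers to the 2.125 Gbaud line rate of 2-Gbps Fibre Channel, which is the assumption under which the 8-minute and 5.5-day figures come out approximately; the function names are hypothetical.

```python
LINE_RATE_2G = 2.125e9  # assumed line rate in bits per second


def bit_error_ratio(erroneous_bits: float, total_bits: float) -> float:
    """BER = erroneous bits divided by total bits observed."""
    return erroneous_bits / total_bits


def seconds_between_errors(ber: float, bit_rate: float) -> float:
    """Average time between bit errors on a fully utilized link."""
    return 1.0 / (ber * bit_rate)


if __name__ == "__main__":
    print(bit_error_ratio(2.5, 100_000))                      # 2.5e-05
    print(seconds_between_errors(1e-12, LINE_RATE_2G) / 60)     # minutes
    print(seconds_between_errors(1e-15, LINE_RATE_2G) / 86400)  # days
```

With these inputs the interval is a little under 8 minutes at 10^-12 and a little under 5.5 days at 10^-15. Using an even 2.0 × 10^9 bits per second instead gives about 8.3 minutes and 5.8 days.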
SCSI characteristics:
Bandwidth: 20-320 MBps (burst)
Mode: Half duplex, parallel, shared bus
Maximum number of nodes: 16
Link distance: 1.5-25 m
Protocol model: Monolithic (SCSI)
The table compares the characteristics of FC to those of SCSI. Significant differences between FC and SCSI include:
- Bandwidth: FC is capable of delivering the published data rates in a sustained manner. The maximum SCSI bit rate is the peak rate, and cannot be sustained for long periods of time.
- Mode: SCSI uses a parallel bus with half-duplex capability (transmission in one direction at a time), while the FC serial connection has full-duplex capability.
- Maximum number of nodes: 16 for SCSI, up to 16 million for FC.
- Link distance: The SCSI cable length limitations result in a maximum link distance of 25 meters, while FC, using optical cable, has a maximum link distance of 10 kilometers.
- Software protocols: FC supports multiple protocols simultaneously. A version of the SCSI command set is often used on FC SANs, but the same SAN can simultaneously carry IP traffic and other protocols.

Note that the storage market typically measures data rates in megabytes per second (MBps), whereas the network market typically measures data rates in megabits per second (Mbps) or gigabits per second (Gbps). The Fibre Channel market measures data rates in both MBps and Gbps, so you must be able to quickly translate between the two units of measure. In Fibre Channel, 100 MBps equals 1 Gbps. This conversion assumes that each byte equals 10 bits, which is actually true: Fibre Channel uses a bit encoding scheme in which each 8-bit byte is encoded as 10 bits for transmission.
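The unit conversion described above is simple enough to sketch in a few lines. The function names are hypothetical, and the sketch uses the nominal rule from the text (10 transmitted bits per data byte) rather than exact line rates.

```python
BITS_PER_ENCODED_BYTE = 10  # each 8-bit byte is sent as 10 bits on the wire


def mbps_to_gbps(megabytes_per_second: float) -> float:
    """Convert a Fibre Channel data rate in MBps to a nominal rate in Gbps."""
    return megabytes_per_second * BITS_PER_ENCODED_BYTE / 1000


def gbps_to_mbps(gigabits_per_second: float) -> float:
    """Convert a nominal Fibre Channel rate in Gbps to a data rate in MBps."""
    return gigabits_per_second * 1000 / BITS_PER_ENCODED_BYTE


if __name__ == "__main__":
    for rate in (100, 200, 400, 1000):
        print(f"{rate} MBps is about {mbps_to_gbps(rate):g} Gbps")
```

The same rule explains why the conversion differs from the usual factor of 8: two of every ten transmitted bits are encoding overhead rather than data.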
Gigabit Ethernet characteristics:
Bandwidth: 1000 Mbps (burst)
Mode: Full duplex, serial, packet-based
Average Continuous Data Flow: ~40%
Link distance: 100 m copper, 5 km optical
Protocol model: Layered
The table compares the characteristics of FC to those of Gigabit Ethernet. One notable difference is in the Average Continuous Data Flow of each network. This figure represents how well each technology utilizes its link bandwidth, and is stated as a percentage of the maximum link bandwidth. Ethernet has significant system overhead in processing the data from high-speed links, so its Average Continuous Data Flow is typically far less than the maximum bandwidth. FC, however, maximizes efficiency by implementing many functions in hardware instead of in its software drivers, and is able to achieve an Average Continuous Data Flow of up to 95 percent of maximum bandwidth.

Note that all link distances stated here are according to the specifications. Many vendors support longer distances. For example, Finisar sells long-wave GBICs that support up to 30 km on single-mode optical fiber.

FC and Gigabit Ethernet support similar link distances. However, IP is the most common protocol used on Ethernet networks, and there is a global WAN infrastructure that supports IP, so people tend to think of Ethernet as spanning longer distances than FC. FC's distance barrier is not its physical specification but its incompatibility with the global IP infrastructure. Today, however, emerging technologies like FCIP allow IP networks to carry FC data, breaking down that distance barrier.
[Figure: Fibre Channel topologies. Panels: Point-to-Point, Arbitrated Loop, Switched Fabric]
Fibre Channel Protocol includes three basic SAN topologies:
- Point-to-point
- Arbitrated loop
- Switched fabric
Scalability: 127 addressable ports
- 126 available for nodes
- 1 reserved for a fabric-attach port
Reliability: If one device fails, the entire loop can fail
[Figure: Arbitrated loop of nodes with HBAs connected through a hub]
The Fibre Channel Switched Fabric (FC-SW) protocol differs from the arbitrated loop topology in several important ways:
- Switches can support multiple simultaneous conversations. Each conversation between two devices can use the full link bandwidth.
- The FC-SW device addressing scheme allows over 16,000,000 ports. Existing implementations can support hundreds and even thousands of nodes using large director-class switches.
- The FC-SW protocol defines several management services that increase the scalability, manageability, and security of the SAN.
Due to the limitations of the Arbitrated Loop topology, the majority of modern organizations choose to implement a Switched Fabric topology because it offers greater scalability, performance, reliability, and manageability.
[Figure: Fibre Channel ports in a SAN. Labels: Server, I/O Adapter (HBA), Switch, Array controller, Storage, Tape device]
In data networking terminology, ports are often thought of as just physical interfaces where you plug in the cable. In FC, however, ports are intelligent interfaces, responsible for actively performing critical network functions. The preceding graphic contains several ports. There are ports in the host I/O adapter (host bus adapter [HBA]), ports in the switch, and ports in the storage devices. FC terminology differentiates between several different types of ports, each of which performs a specific role on the SAN. You will encounter these terms often as you continue to learn about FC, so it is important that you learn to recognize the different port types. In addition to the common ports defined for FC, Cisco has developed some proprietary port types.
[Figure: Standard Ports. Labels: Host, HBA, N_Port, F_Port, FL_Port, E_Port, B_Port, Storage Array, WAN Bridge]
An N_Port (Node Port) is a port on a node that connects to a fabric:
- I/O adapters and array controllers contain one or more N_Ports.
- N_Ports can also directly connect two nodes in a point-to-point topology.

An F_Port (Fabric Port) is a port on a switch that connects to an N_Port.

An E_Port (Expansion Port) is a port on a switch that connects to an E_Port on another switch.

An FL_Port (Fabric Loop Port) is a port on a switch that connects to an arbitrated loop. Logically, an FL_Port is considered part of both the fabric and the loop. FL_Ports are always physically located on the switch. Note that FC hubs, although they obviously have physical interfaces, do not contain FC ports. Hubs are basically just passive signal splitters and amplifiers. They do not actively participate in the operation of the network. On an arbitrated loop, the node ports manage all FC operations. Not all switches support FL_Port operation. For example, some McDATA switches do not support FL_Port operation.

An NL_Port (Node Loop Port) is a port on a node that connects to another port in an arbitrated loop topology. There are two types of NL_Ports:
- Private NL_Ports can communicate only with other loop ports.
- Public NL_Ports can communicate with other loop ports and with N_Ports on an attached fabric.

Note that the term L_Port (Loop Port) is sometimes used to refer to any port on an arbitrated loop topology. L_Port can mean either FL_Port or NL_Port. In reality, there is no such thing as an L_Port.
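The pairings described above can be summarized in a small lookup. This sketch covers only the standard port types defined in this section (it leaves out B_Ports and Cisco proprietary port types), and the names `VALID_LINKS` and `can_connect` are hypothetical.

```python
# Each entry is an unordered pair of port types that may share a link.
# A one-element frozenset means two ports of the same type.
VALID_LINKS = {
    frozenset(["N_Port", "F_Port"]),    # node attached to a fabric switch
    frozenset(["N_Port"]),              # two nodes, point-to-point
    frozenset(["E_Port"]),              # switch-to-switch link
    frozenset(["NL_Port", "FL_Port"]),  # loop node attached to a switch
    frozenset(["NL_Port"]),             # two nodes on an arbitrated loop
}


def can_connect(port_a: str, port_b: str) -> bool:
    """Return True if the two port types form one of the pairings above."""
    return frozenset([port_a, port_b]) in VALID_LINKS
```

For example, `can_connect("N_Port", "F_Port")` is true, while `can_connect("N_Port", "E_Port")` is false because an E_Port connects only to an E_Port on another switch.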
[Figure: NIC and HBA protocol stacks compared. Labels: OS, I/O Subsystem, TCP Driver, FC Driver, HBA]
HBAs are I/O adapters that are designed to maximize performance by performing protocol processing functions in silicon. HBAs are roughly analogous to network interface cards (NICs), but HBAs are optimized for storage networks and provide features that are specific to storage.

The figure contrasts HBAs with NICs, illustrating that HBAs offload protocol processing functions into silicon. With NICs, protocol processing functions such as flow control, sequencing, segmentation and reassembly, and error correction are performed by software drivers. HBAs offload these protocol processing functions onto the HBA hardware itself, usually some combination of an application-specific integrated circuit (ASIC) and firmware. Offloading these functions is necessary to provide the performance required by storage networks.

NICs can utilize over 80 percent of a server's CPU capacity (measured with a 1 GHz Intel Pentium CPU) to deliver 50-80 MBps on a Gigabit Ethernet link. I/O processing adds considerable real cost to what may appear to be an inexpensive NIC. HBAs manage I/O transactions with little or no involvement of the server CPU. FC HBAs can provide throughput at nearly 95 percent of link speed with less than 10 percent server CPU utilization.
[Table: Fibre Channel Classes of Service and their use]
Class 1. Use: Specialized applications; not widely supported
Class 2. Use: Generally supported but not widely used
Class 3. Use: Most commonly used
Class 4. Description: Fractional bandwidth virtual circuit; confirmed delivery. Use: Specialized applications; not supported in SAN products
Class 6. Description: Connection-oriented multicast; confirmed delivery. Use: Specialized applications; not supported in SAN products
Class F. Description: Packet-switched; confirmed delivery. Use: Used for inter-switch communication
The table displays uses of the FC Classes of Service:
- Few commercially available FC SAN products currently support Class 1.
- Many FC products support Class 2, but it is not widely used.
- Class 3 is by far the most commonly used Class of Service on fabrics, and it is often the only class supported on arbitrated loops. All FC SAN products support Class 3.
- No commercially available FC SAN products currently support Class 4.
- No commercially available FC SAN products currently support Class 6.
- Class F is always used for inter-switch communication.
Note that Class 5 is not yet defined. Class 5 was intended to enable isochronous transactions by multiple ports, but has not been completed. An isochronous connection is one in which bandwidth and data delivery rate are guaranteed. Class 5 would be appropriate for video delivery services.
[Table: features of the Fibre Channel Classes of Service]
The preceding table summarizes the features of the Classes of Service. Although Classes 2 and 3 are the only options currently available in Fibre Channel products, customers might have specialized applications that call for the features of other classes, and might be willing to investigate specialized products that support those applications.
Summary
This topic summarizes the key points that were discussed in this lesson.
- Fibre Channel supports many devices, dynamic network reconfiguration, low latency, long distances, and hardware-based delivery management.
- Fibre Channel currently supports 100, 200, 400, and 1000 MBps.
- The Fibre Channel Protocol supports three topologies: Point-to-Point, Arbitrated Loop, and Switched Fabric.
- Ports are intelligent interface points on the Fibre Channel network.
- Fibre Channel HBAs offload flow control, sequencing, segmentation, and error correction into the HBA hardware, increasing performance.
- Fibre Channel defines classes of service similar to the Class of Service models in LAN networks; however, the Fibre Channel implementation is different.
Lesson 3
FC Layers
Overview
Like nearly all modern networks, Fibre Channel (FC) is designed with a modular, layered architecture. This architecture is designed to carry other protocols, as well as new native protocols. A layered architecture provides benefits for both vendors and users because it enhances the clarity and flexibility of the architecture. This lesson describes the five layers of the FC layered model, and the upper layer protocols that FC supports.
Objectives
Upon completing this lesson, you will be able to describe the Fibre Channel layered model, data constructs, SCSI-FCP read and write operations, and Link Services. This includes being able to meet these objectives:
- Describe the FC layered model
- Describe the Fibre Channel data constructs
- Describe SCSI-FCP protocol operations
- Describe Link Services
[Figure: the OSI layers (Application, Presentation, Session, Transport, Network) alongside example protocols (HTTP, FTP, SNMP; TCP, SPX; IP, IPX; Ethernet)]
The OSI model defines seven layers of functionality for network protocols. While FC does not map directly to the OSI model, it does use a layered model. FC's lower layers relate closely to the lower layers of OSI:
- FC defines (approximately) the lower three layers of the OSI model: Physical, Data Link, and Network.
- Other protocols, such as SCSI, are responsible for the upper layers.
If you are familiar with data networking, you probably understand the difference between physical-layer protocols, such as Ethernet, and logical-layer protocols, such as TCP and IP:
- Ethernet defines how data is physically transmitted.
- Protocols like TCP and IP define aspects of the network such as flow control and addressing.
The preceding graphic shows that FC defines both the physical layer and part of the logical layer, and then interfaces with upper-layer protocols (ULPs) that perform the functions of the upper layers of the OSI model.
[Figure: Fibre Channel layers. Upper-layer protocols sit above FC-4, FC-3 (Common Services), and FC-2 (Framing and flow control); FC-FS is marked alongside the stack.]
The five layers of FC are:
- FC-0: Physical interface, transmission and signaling
- FC-1: 8b/10b encode/decode, link control, ordered set specifications
- FC-2: Framing, flow control, exchange/sequence management
- FC-3: Application-specific layer for fabric services
- FC-4: Upper-layer protocol mapping specification
The lower three layers (FC-0, FC-1, and FC-2) are collectively known as the FC Physical Layer (FC-PH), even though they also implement logical functions such as framing and flow control. The FC-3 layer provides a framework for implementing new SAN-wide services, while the FC-4 layer interfaces with the ULPs and maps them to FC. The FC-PH specification was the original document that defined layers FC-0, FC-1, and FC-2. The final version of the FC-PH specification was FC-PH-3. However, FC-PH was then superseded by two documents:
- Fibre Channel Physical Interface (FC-PI) defines FC-0.
- Fibre Channel Framing and Signaling (FC-FS) defines FC-1 and FC-2.
[Table: maximum FC-0 link distances by cable type and data rate, ranging from tens of meters up to 10 km; individual cell values are not recoverable from the extracted text]
Electrical Cables
Optical Cables
Optical media types:
- Multimode fiber uses a 780 nm short-wave laser.
- Single-mode fiber uses a 1300 nm long-wave laser.
- Maximum link distances vary by data rate.
The current specification states a minimum distance of 2 m for optical fiber. This allows for a build-up of photons that occurs in the first 2 m of cable after the laser fires and, in multimode cables, eliminates problems associated with some modes of light which, due to their steep angle of reflection, do not travel very far down the cable.
FC-1: Encoding
FC-1 defines the bit encoding scheme:
Encoding and decoding of serial signals Bit-level error detection Clock synchronization Link initialization and recovery
The FC-1 layer specifies how data is encoded at the bit and byte levels for transmission across the link. FC-1 is responsible for:
- Taking data from the transmitter's I/O bus and encoding it into a serial signal for transmission
- Taking a serialized signal and decoding it into a signal that can be sent to the receiver's I/O bus
- Bit-level error detection
- Clock synchronization between the transmitter and receiver
- Link initialization and recovery
[Figure: FC-1 encoding in the transmitter. Each Tx byte passes through the 8b/10b encoder, is converted from parallel input to serial output, and is sent to the media output.]
The preceding diagram shows the FC-1 encoding process:
- The data from the transmitting I/O bus is encoded using the 8b/10b encoding scheme.
- The parallel data is converted to serial format.
- The serial data is transmitted across a link.
- The receiving end decodes the serial data and forwards it to the receiver's I/O bus.
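[Figure: 8b/10b encoding of the byte 0xDF (binary 11011111). The lowest five bits (11111) have the value 31 and the highest three bits (110) have the value 6, so the character is D31.6. It is transmitted as 1010110110 (six ones) or 0101000110 (four ones), depending on the running disparity.]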
Every transmission character has one of three disparities:
- Positive disparity: 6 ones and 4 zeros
- Negative disparity: 4 ones and 6 zeros
- Neutral disparity: 5 ones and 5 zeros
The 8b/10b scheme defines multiple transmission characters for each 8-bit data byte. Because the encoder can choose between multiple 10-bit representations for each 8-bit byte, it can balance the number of ones and zeros in the data stream. The imbalance between the number of ones and zeros, known as the running disparity, is continually reevaluated. To keep the stream balanced, every transmitted byte is encoded into one of two possible 10-bit representations, depending on the current running disparity. In the FC-1 specification, every 10-bit character is represented using a special notation:
- Dxx.y: Used for data characters that map to 8-bit bytes; xx is the decimal value of the lowest 5 bits and y is the decimal value of the highest 3 bits.
- Kxx.y: Used for special control characters; xx and y are defined as for data characters.
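The Dxx.y notation can be derived mechanically from a byte value, and the disparity of a 10-bit character is simply its count of ones minus its count of zeros. The following Python sketch is illustrative only: the function names `char_name` and `disparity` are made up for this example and do not come from any FC specification or library. It names characters and measures disparity; it does not implement the 8b/10b encoding tables themselves.

```python
def char_name(byte: int, control: bool = False) -> str:
    """Return the Dxx.y (or Kxx.y) name for an 8-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    xx = byte & 0x1F          # decimal value of the lowest 5 bits
    y = (byte >> 5) & 0x07    # decimal value of the highest 3 bits
    prefix = "K" if control else "D"
    return f"{prefix}{xx}.{y}"


def disparity(bits: str) -> int:
    """Ones minus zeros in a 10-bit transmission character given as a string."""
    if len(bits) != 10 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of ten 0/1 characters")
    return bits.count("1") - bits.count("0")


print(char_name(0xDF))                # D31.6
print(char_name(0xBC, control=True))  # K28.5
print(disparity("1010110110"))        # 2  (6 ones, 4 zeros)
print(disparity("0101000110"))        # -2 (4 ones, 6 zeros)
```

The two bit strings are the two transmission characters for D31.6; the encoder picks whichever one moves the running disparity back toward balance.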
[Figure: FC-3 Common Services. Possible future services: compression, encryption, link multiplexing.]
[Figure: FC-4 upper-layer protocol mappings (SCSI-FCP, FC-SB-2, FC-LE, FC-IP, FC-FP, FC-BB-2) above FC-3 Common Services, FC-2 Framing and Flow Control, FC-1 Encoding, and FC-0 Physical interface.]
2007 Cisco Systems, Inc. All rights reserved. ICSNS v3.07-14
The Small Computer System Interface (SCSI) command set is widely used among storage devices. Even though SCSI bus technology is not suitable for SANs, the SCSI command set is well-suited for many types of storage applications. The use of the SCSI command set enables the use of inexpensive SCSI disks and SCSI tape drives in FC SAN storage devices. SCSI-FCP also enables compatibility with existing operating systems and legacy storage applications. In fact, most operating systems and applications are not aware of the FC SAN; FC devices appear to the host and its applications as SCSI devices. The mapping of the SCSI protocol to FC is called SCSI-Fibre Channel Protocol (SCSI-FCP), or sometimes simply FCP. SCSI-FCP is the ULP command set used on most FC SANs. SCSI-FCP provides the command set for reading and writing data to and from storage devices. Because FC supports a wide range of protocols, it can meet the needs of diverse applications and integrate with heterogeneous platforms. FC supports the following existing ULPs:
- The Enterprise Systems Connection (ESCON) protocol is a storage interconnect used in IBM mainframe computing environments. The Fibre Connection (FICON) protocol allows ESCON assets to be used within an FC SAN infrastructure. The FC-SB-2 standard maps FICON to FC-2.
- The IEEE 802.2 standard defines the generic logical link control (LLC) layer in the OSI Reference Model. The FC-LE standard helps map IEEE 802.2-based protocols to FC.
- IP is the protocol that drives the Internet. FC-IP allows FC to carry IP. Servers can use IP to communicate with each other over the SAN.
- High Performance Parallel Interface (HiPPI) connects devices at short distances and high speeds. HiPPI is used primarily to connect supercomputers and to provide high-speed backbones for LANs. The FC-FP standard maps HiPPI to FC.
The FC-BB-2 standard enables FC to exchange data with ATM and Synchronous Optical Network (SONET) networks for long-haul transport of FC data.
AA-52 Implementing Cisco Storage Networking Solutions (ICSNS) v3.0 2007 Cisco Systems, Inc.
[Figure: a transaction between an initiator (host with FC HBA) and a target]
The preceding graphic shows a transaction between a host (initiator) and a storage device (target):
- The smallest unit of data is a word. Words consist of 32 bits (4 bytes) of data that are encoded into a 40-bit form by the 8b/10b encoding process.
- Words are packaged into frames. An FC frame is equivalent to an IP packet.
- A sequence is a series of frames sent from one node to another node. Sequences are unidirectional; in other words, a sequence is a set of frames that are issued by one node.
- An exchange is a series of sequences sent between two nodes. The exchange is the mechanism used by two ports to identify and manage a discrete transaction. The exchange defines an entire transaction, such as a SCSI read or write request. An exchange is opened whenever a transaction is started between two ports and is closed when the transaction ends. An FC exchange is equivalent to a TCP session.
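[Figure: FC frame format. SOF (1 word, 4 bytes); Header (6 words, 24 bytes); Payload (0-528 words, 0-2112 bytes), made up of optional headers, data or commands (0-512 words, 0-2048 bytes), and fill bytes; CRC (1 word, 4 bytes); EOF (1 word, 4 bytes). Maximum frame size: 537 words = 2148 bytes.]
[Figure: FC protocol analyzer trace of a single frame]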
The screen image displays an FC protocol trace. A single FC frame, a Fabric Login (FLOGI), is displayed in the right-hand window. Each word in the frame is depicted on a separate line, beginning with the SOF Frame Delimiter (SOFi3) and ending with the EOF Frame Delimiter (EOFt). The display shows the 6 words in the frame header, 29 words in the payload, and the 32-bit CRC.
Frame Headers
These are the header fields of an FC frame:
- R_CTL (Routing Control, 8 bits): Frame type and function; used by the switch to route frames
- CS_CTL (Class Specific Control, 8 bits): Class-specific control information for Classes 1, 4, and 6
- D_ID (Destination ID, 24 bits): 24-bit address of the destination port
- S_ID (Source ID, 24 bits): 24-bit address of the source port
- TYPE (Data Structure Type, 8 bits): Type of Information Unit and ULP carried by this frame
- F_CTL (Frame Control, 24 bits): Specifies the number of fill bytes and sequence control information
- SEQ_ID (Sequence ID, 8 bits): Unique ID for each sequence
- SEQ_CNT (Sequence Count, 16 bits): Frame count identifying each frame in the sequence
- DF_CTL (Data Field Control, 8 bits): Information about optional headers
- OX_ID (Originator Exchange ID, 16 bits): Unique ID set by the exchange originator
- RX_ID (Responder Exchange ID, 16 bits): Unique ID set by the exchange responder
- Parameter (Parameter or Offset, 32 bits): Used for multi-purpose parameters, such as buffer offset
Together these fields occupy the 6 words (24 bytes) of the frame header.
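To make the field widths concrete, the following Python sketch unpacks a 24-byte header into the fields listed above. It is a minimal illustration, not a full decoder: the word-by-word layout used here (R_CTL/D_ID, CS_CTL/S_ID, TYPE/F_CTL, SEQ_ID/DF_CTL/SEQ_CNT, OX_ID/RX_ID, Parameter) is the layout commonly documented for FC-FS and should be checked against the standard, and the names `FCHeader` and `parse_fc_header` and the sample bytes are hypothetical.

```python
import struct
from typing import NamedTuple


class FCHeader(NamedTuple):
    r_ctl: int
    d_id: int
    cs_ctl: int
    s_id: int
    fc_type: int
    f_ctl: int
    seq_id: int
    df_ctl: int
    seq_cnt: int
    ox_id: int
    rx_id: int
    parameter: int


def parse_fc_header(raw: bytes) -> FCHeader:
    """Split a 24-byte frame header into its fields (big-endian words)."""
    if len(raw) != 24:
        raise ValueError("an FC frame header is 6 words (24 bytes)")
    w0, w1, w2, w3, w4, w5 = struct.unpack(">6I", raw)
    return FCHeader(
        r_ctl=w0 >> 24, d_id=w0 & 0xFFFFFF,          # word 0
        cs_ctl=w1 >> 24, s_id=w1 & 0xFFFFFF,         # word 1
        fc_type=w2 >> 24, f_ctl=w2 & 0xFFFFFF,       # word 2
        seq_id=w3 >> 24, df_ctl=(w3 >> 16) & 0xFF,   # word 3
        seq_cnt=w3 & 0xFFFF,
        ox_id=w4 >> 16, rx_id=w4 & 0xFFFF,           # word 4
        parameter=w5,                                # word 5
    )


# Illustrative header bytes (made up for this example).
sample = bytes.fromhex(
    "22FFFFFE" "00000000" "01290000" "00000000" "1234FFFF" "00000000"
)
header = parse_fc_header(sample)
print(f"D_ID={header.d_id:06X} S_ID={header.s_id:06X} OX_ID={header.ox_id:04X}")
```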
Ordered Sets
[Figure: types of transmission words]
Transmission Word
- Data Word: Dxx.y, Dxx.y, Dxx.y, Dxx.y
- Ordered Set: K28.5, Dxx.y, Dxx.y, Dxx.y
  - Frame Delimiter: Start-of-Frame, End-of-Frame
  - Primitive Signal
    - Fill Word: Idle, Arbitrate
    - Control Signal: Receiver Ready, Virtual Circuit Ready, Close, Open, Dynamic Half-Duplex, Mark, Synchronize
  - Primitive Sequence: Non-Operational State, Offline State, Link Reset, Link Reset Response, Loop Initialization, Loop Port Bypass, Loop Port Enable
Ordered Sets are FC transmission words (4 bytes) that are used for link-level functions. They are used because they are fast and light, and because commands sometimes need to be exchanged before devices have been assigned FC addresses. The first byte of an Ordered Set is always the K28.5 character, which defines the word as an Ordered Set. The second byte identifies the Ordered Set type, and the last two bytes can be used to transmit other parameters. There are three types of Ordered Sets:
- Frame Delimiters are used to mark the beginning and end of frames.
- Primitive Signals are used to initiate, synchronize, and terminate communication sessions, and to maintain synchronization when no other information is being transmitted on the link. The two types of Primitive Signals are fill words and control signals.
- Primitive Sequences are similar to Primitive Signals, but are transmitted repeatedly until a response is received. They are used for link and loop initialization.
SCSI-FCP Operations
This section provides a brief overview of SCSI-FCP protocol operations.
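[Figure: SCSI-FCP read operation across the fabric. One exchange with the target: Sequence 1 carries IU 1 as a single frame, Sequence 2 carries IU 2 (FCP_DATA) as one or more frames, and Sequence 3 carries IU 3 (FCP_RSP) as a single frame.]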
The preceding diagram illustrates a SCSI-FCP read operation:
1. The initiator node generates a SCSI read request (FCP_CMD), which is packaged as IU 1.
2. The initiator FC-2 layer converts IU 1 to a single command chunk and sends it across the fabric as a single frame. This constitutes Sequence 1.
3. The target node processes IU 1, retrieves the requested data (FCP_DATA) from storage, and packages the data as IU 2.
4. The target FC-2 layer converts IU 2 to one or more data chunks and sends them across the fabric. This constitutes Sequence 2.
5. The target node then generates a status command (FCP_RSP) that informs the initiator that the requested data transmission is complete. The status command is packaged as IU 3.
6. The target FC-2 layer converts IU 3 to a single command chunk and sends it across the fabric. This constitutes Sequence 3.
At this point, the I/O operation is complete. The collection of three sequences constitutes a single exchange.
[Figure: SCSI-FCP write operation across the fabric. One exchange with the target: Sequence 1 carries IU 1 as a single frame, Sequence 2 carries IU 2 (FCP_XFR_RDY) as a single frame, Sequence 3 carries IU 3 (FCP_DATA) as one or more frames, and Sequence 4 carries IU 4 (FCP_RSP) as a single frame.]
The preceding diagram illustrates a SCSI-FCP write operation:
1. The initiator node generates a SCSI write request (FCP_CMD), which is packaged by the FC-4 layer as IU 1.
2. The initiator FC-2 layer converts IU 1 to a single command chunk and sends it across the fabric as a single frame. This constitutes Sequence 1.
3. The target node responds with a SCSI write request response (FCP_XFR_RDY), which is packaged as IU 2. The write request response is required for synchronization between the initiator and target.
4. The target FC-2 layer converts IU 2 to a single command chunk and sends it across the fabric. This constitutes Sequence 2.
5. The initiator node retrieves the data (FCP_DATA) from its ULP buffers and packages it as IU 3.
6. The initiator FC-2 layer converts IU 3 to one or more data chunks and sends them across the fabric. This constitutes Sequence 3.
7. The target then generates a status command (FCP_RSP) to confirm the end of the exchange. The command is packaged as IU 4.
8. The target FC-2 layer converts IU 4 to a single command chunk and sends it across the fabric. This constitutes Sequence 4.
The collection of four sequences constitutes a single exchange.
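The read and write operations can be summarized as data: each exchange is an ordered list of sequences, each sequence carries one IU from one side, and only the FCP_DATA sequence may span several frames. The Python sketch below is a toy model under stated assumptions, not an FCP implementation. The names `READ_IUS`, `WRITE_IUS`, and `build_exchange` are hypothetical, and the 2048-byte limit on data carried per frame is taken from the maximum payload quoted elsewhere in this guide.

```python
import math

MAX_DATA_BYTES = 2048  # assumed maximum data bytes carried per frame

# (sender, information unit) for each sequence, in order.
READ_IUS = [
    ("initiator", "FCP_CMD"),
    ("target", "FCP_DATA"),
    ("target", "FCP_RSP"),
]
WRITE_IUS = [
    ("initiator", "FCP_CMD"),
    ("target", "FCP_XFR_RDY"),
    ("initiator", "FCP_DATA"),
    ("target", "FCP_RSP"),
]


def build_exchange(ius, data_bytes):
    """Return one dict per sequence, with the number of frames it needs."""
    sequences = []
    for number, (sender, iu) in enumerate(ius, start=1):
        if iu == "FCP_DATA":
            frames = max(1, math.ceil(data_bytes / MAX_DATA_BYTES))
        else:
            frames = 1  # command, ready, and status IUs fit in one frame
        sequences.append(
            {"sequence": number, "sender": sender, "iu": iu, "frames": frames}
        )
    return sequences


for seq in build_exchange(WRITE_IUS, data_bytes=8192):
    print(seq)
```

With 8192 bytes of data, the write exchange has four sequences and the data sequence needs four frames; the read exchange has three sequences.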
Link Services
This section describes what a Link Services command is, the role of Link Services, and how Link Services differ from Ordered Sets.
[Figure: Link Services exchange between two N_Ports, in which a request is answered with BA_ACC]
Link Services are upper-layer protocol (ULP) independent FC commands. Link Services are used to implement control functions used in session management, such as fabric and port login, address resolution, and error recovery. Link Services are defined within the Fibre Channel Common Transport (FC-CT) framework. Link Services are transparent to ULPs. In other words, Link Services frames are generated by the initiator N_Port, not by the ULP driver. Upon receiving a Link Services command, the target N_Port processes and discards all Link Services frames. The preceding diagram shows an example of a Link Services exchange: The N_Port on the left has sent an ABTS Link Services command to attempt to terminate the current FC sequence. The N_Port on the right receives the ABTS request and responds with the BA_ACC Link Services command to indicate that the N_Port has successfully processed the ABTS request.
Note: Ordered Sets are not Link Services. Ordered Sets are short, one-word (four-byte) commands that can carry, at most, two bytes of parameters, whereas Link Services commands consist of one or more FC frames. Ordered Sets are typically used at the physical layer to perform basic link management functions. Link Services comprise a higher-level command set that is essential to performing session management and error recovery.
Summary
This topic summarizes the key points that were discussed in this lesson.
- Fibre Channel has a layered model that is similar in some respects to the OSI model.
- Fibre Channel constructs include words, frames, sequences, and exchanges.
- The Fibre Channel protocol is broken down into control and data units. The control units in Fibre Channel are called Ordered Sets.
- Link Services are additional communications from port to port used to relay logins, state changes, error situations, and other administration messages.
Lesson 4
FC Flow Control
Overview
Like any network protocol, Fibre Channel (FC) must define how the flow of data is managed. FC defines two flow control processes that are used either individually or together. FC uses a unique receiver-based flow control strategy that ensures that data is delivered efficiently and with a minimum of delivery errors.
Objectives
Upon completing this lesson, you will be able to explain Fibre Channel flow control and addressing. This includes being able to meet these objectives:
- Explain the Fibre Channel flow control process
- Calculate the number of buffer credits needed for an FC link
- Explain the Fibre Channel addressing schemes
- Explain the function of World Wide Names
[Figure: flow control without credits. Tx sends data frames to Rx until Rx returns a PAUSE.]
Flow control is a mechanism for ensuring that frames are sent only when there is somewhere for them to go. Just as traffic lights are used to control the flow of traffic in cities, flow control manages the data flow in an FC fabric. Some data networks, such as Ethernet, use a flow-control strategy that can result in degraded performance:
- A transmitting port (Tx) can begin sending data packets at any time.
- When the receiving port's (Rx) buffers are completely filled and cannot accept any more packets, Rx tells Tx to stop or slow the flow of data.
- After Rx has processed some data and has some buffers available to accept more packets, it tells Tx to resume sending data.
This strategy results in lost packets when the receiving port is overloaded, because the receiving port tells the transmitting port to stop sending data only after its buffers have already overflowed. Lost packets must be retransmitted, which degrades performance. Performance degradation can become severe under heavy traffic loads.
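[Figure: credit-based flow control. Tx sends DATA and Rx returns READY; a counter at the transmitter toggles between 1 and 0.]
[Figure: types of flow control along a path of N_Port, F_Port, E_Port, E_Port, F_Port, N_Port]
[Figure: buffer-to-buffer and end-to-end flow control. Data flows from N_Port A through the F_Ports to N_Port B; R_RDY is returned on each link, and an ACK is returned to N_Port A.]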
When end-to-end flow control is used, the transmitting port is responsible for ensuring that all frames are delivered. Only when the transmitting N_Port receives the last ACK frame in response to a sequence of frames sent does it know that all frames have been delivered correctly, and only then will it empty its ULP data buffers. If a returning ACK indicates that the receiving port has detected an error, the transmitting N_Port has access to the ULP data buffers and can resend all of the frames in the sequence.
The number of buffer-to-buffer credits required for a link depends on the physical length of that link. The number of credits required is calculated based on frame size, propagation delay (the speed of light in fiber), and the end-to-end latency of the link; of these factors, latency is the only variable. On a pure FC link, latency is deterministic and depends primarily on the length of the link and the number of hops. On an FC WAN link (such as FCIP), latency depends on the characteristics of the WAN. The default credit allocation on most vendors' switches is generally sufficient for intra-datacenter links. However, the credit allocation often must be increased for long-haul links. FC WAN links, including FCIP, typically require additional buffer credits due to the increased latency of the IP network. Cisco 16-port switch modules support up to 255 credits per port, which provides ample credits for most applications.
You can calculate the number of credits required on a link to maintain optimal performance using the following formula:
Credits = (Round_Trip_Time + Processing_Time) / Serialization_Time
Example
This diagram and the following two diagrams illustrate how the required number of BB_Credits is calculated for a 10 km, 1 Gb/s FC link:
- At a link rate of 1.0625 Gb/s, the time required to serialize (transmit) each byte is 9.41 ns. (Note that each byte is 10 bits on the wire due to 8b/10b encoding.)
- The maximum Fibre Channel frame payload is 2048 bytes. The frame size used in an actual customer environment would depend on the I/O characteristics of the customer's applications.
- You also need to account for the frame overhead (frame delimiters, header, and CRC), which is 36 bytes, and the IDLEs between frames, usually 6 IDLEs, or 24 bytes. This gives a total of 2108 bytes.
- The total serialization time for a 2108-byte frame (including IDLEs) is 19.84 µs, or approximately 20 µs.
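The serialization arithmetic above can be checked with a few lines of Python. This is a sketch of the calculation only; the function name `serialization_time_us` and its parameter names are chosen for this example, and the 36-byte overhead and 24 bytes of IDLEs are the figures quoted in the text.

```python
def serialization_time_us(
    payload_bytes: int,
    line_rate_gbps: float = 1.0625,
    overhead_bytes: int = 36,   # delimiters, header, and CRC
    idle_bytes: int = 24,       # 6 IDLEs between frames
) -> float:
    """Time in microseconds to put one frame (plus IDLEs) on the wire."""
    total_bytes = payload_bytes + overhead_bytes + idle_bytes
    bits_on_wire = total_bytes * 10          # 8b/10b: 10 bits per byte
    bits_per_us = line_rate_gbps * 1000.0    # 1 Gb/s = 1000 bits per µs
    return bits_on_wire / bits_per_us


print(round(10 / 1.0625, 2))                  # ns per byte: 9.41
print(round(serialization_time_us(2048), 2))  # µs per full frame: 19.84
```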
[Figure: 10 km link between the initiator N_Port and the target N_Port. Frame serialization takes 20 µs and propagation across the link takes 50 µs; processing time is assumed to equal the deserialization time, 20 µs.]
The propagation delay of light in a fiber optic cable is approximately 5 µs per kilometer, so each frame takes about 50 µs to travel across the 10 km link. The receiving port must then process the frame, free a buffer, and generate an R_RDY. This processing time can vary; for example, if the receiver ULP driver is busy, the frame might not be processed immediately. In this case, we assume that the receiving port processes the frame immediately, so the processing time is equal to the time it takes to deserialize the frame. The deserialization time is equal to the serialization time: 20 µs.
The receiving port then transmits a credit (R_RDY) back across the link. This response takes another 50 µs to reach the transmitter. The total latency is the round-trip time across the link (2 x 50 µs) plus the 20 µs processing time, or about 120 µs.
Given a frame serialization time of 20 µs and a total round-trip latency of 120 µs, there could be up to 6 frames on the link at one time. In other words, six buffer-to-buffer credits are required to make full use of the bandwidth of the link.
The formula for calculating required credits is Credits = (Round_Trip_Time + Processing_Time) / Serialization_Time. Serialization time is proportional to frame size, so the number of credits required varies with frame size. For example, with a 10 km link at 2 Gb/s, only 11 credits are required if the average frame carries the maximum payload (2048 bytes). However, if the average payload is 32 bytes, 232 credits are required.
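The formula can be evaluated for different frame sizes and link speeds with the Python sketch below. It rests on several assumptions that are labeled in the code: a propagation delay of 5 µs per kilometer, processing time equal to serialization time, 36 bytes of frame overhead plus 24 bytes of IDLEs, and line rates of 1.0625 and 2.125 Gb/s for 1 Gb/s and 2 Gb/s FC. The function names are hypothetical. The result is returned unrounded; whether to round to the nearest credit or round up for headroom is a design choice.

```python
def serialization_time_us(payload_bytes: int, line_rate_gbps: float) -> float:
    """Microseconds to serialize one frame plus inter-frame IDLEs."""
    total_bytes = payload_bytes + 36 + 24      # assumed overhead + IDLEs
    return total_bytes * 10 / (line_rate_gbps * 1000.0)


def required_credits(
    distance_km: float,
    payload_bytes: int,
    line_rate_gbps: float,
    delay_us_per_km: float = 5.0,              # assumed propagation delay
) -> float:
    """Credits = (Round_Trip_Time + Processing_Time) / Serialization_Time."""
    ser = serialization_time_us(payload_bytes, line_rate_gbps)
    round_trip = 2 * distance_km * delay_us_per_km
    processing = ser                           # assumed equal to serialization
    return (round_trip + processing) / ser


for rate, payload in [(1.0625, 2048), (2.125, 2048), (2.125, 32)]:
    credits = required_credits(10, payload, rate)
    print(f"{rate} Gb/s, {payload}-byte payload: {credits:.2f} credits")
```

Rounded to the nearest whole credit, the three cases give 6, 11, and 232 credits, which agrees with the figures quoted in this lesson.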
[Figure: FC addressing. A fabric address consists of Domain, Area, and Port fields; the diagram shows nodes with FC HBAs attached to a hub and to a switch.]
The FC point-to-point topology uses a 1-bit addressing scheme. One port assigns itself an address of 000000 and then assigns the other port an address of 000001. The FC Arbitrated Loop topology uses an 8-bit addressing scheme: The Arbitrated Loop Physical Address (AL_PA) is an 8-bit address, which provides 256 potential addresses. However, only a subset of 127 addresses are available due to 8b/10b encoding requirements. One address is reserved for an FL_Port, so there are 126 addresses available for nodes. Addresses are cooperatively chosen during loop initialization.
Each switch must have a unique Domain ID, so there can be no more than 239 switches in a fabric. The largest director-class switch available today has 256 ports, so the practical limit on the number of ports in a fabric is 61184 (239 domains x 256 ports). With 16-port switches, the total port count is reduced to 3824 (239 domains x 16 ports). Note that these calculations do not take into account ports consumed by inter-switch links (ISLs), which reduces the number of ports available to nodes, or the fact that on an arbitrated loop multiple L_Ports can be attached to a single FL_Port, which increases the potential number of ports.
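A 24-bit fabric address can be split into its domain, area, and port bytes with simple shifts and masks, and the port-count limits above are plain multiplication. The Python sketch below illustrates both; the function name `split_fcid` and the sample address are made up for this example.

```python
MAX_DOMAINS = 239  # maximum number of switches (Domain IDs) in a fabric


def split_fcid(fcid: int) -> tuple:
    """Split a 24-bit FC address into (domain, area, port) bytes."""
    if not 0 <= fcid <= 0xFFFFFF:
        raise ValueError("an FC address is 24 bits")
    domain = (fcid >> 16) & 0xFF
    area = (fcid >> 8) & 0xFF
    port = fcid & 0xFF
    return domain, area, port


print(split_fcid(0x0A1B2C))   # (10, 27, 44)
print(MAX_DOMAINS * 256)      # 61184 ports with 256-port directors
print(MAX_DOMAINS * 16)       # 3824 ports with 16-port switches
```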
World-Wide Names
This section introduces WWNs, a second addressing scheme used on FC SANs.
Every Fibre Channel port and node has a hard-coded address called a World Wide Name (WWN):
- Allocated to the manufacturer by the IEEE
- Coded into each device when manufactured
- 64 or 128 bits (64 bits most common today)
WWNs are unique identifiers that are hard-coded into FC devices. Every FC port has at least one WWN. Vendors buy blocks of WWNs from the IEEE and allocate them to devices in the factory. WWNs are important for enabling fabric services because they are:
- Guaranteed to be globally unique
- Permanently associated with devices
These characteristics ensure that the fabric can reliably identify and locate devices, which is an important consideration for fabric services. When a management service or application needs to quickly locate a specific device:
1. The service or application queries the switch Name Server service with the WWN of the target device.
2. The Name Server looks up and returns the current port address that is associated with the target WWN.
3. The service or application communicates with the target device using the port address.
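The three lookup steps amount to a table keyed by WWN. The Python sketch below models that idea with a dictionary. It is a toy, not the Name Server protocol: the functions `register`, `lookup`, and `format_wwn`, the sample WWN, and the sample port address are all hypothetical, and the colon-separated hexadecimal rendering is simply a common display convention for 64-bit WWNs.

```python
name_server = {}  # maps a 64-bit WWN to the port's current 24-bit address


def format_wwn(value: int) -> str:
    """Render a 64-bit WWN as colon-separated hex bytes."""
    return ":".join(f"{b:02x}" for b in value.to_bytes(8, "big"))


def register(wwn: int, port_address: int) -> None:
    """Record the address currently associated with a WWN."""
    name_server[wwn] = port_address


def lookup(wwn: int):
    """Return the current port address for a WWN, or None if unknown."""
    return name_server.get(wwn)


target_wwn = 0x2100001B32A1B2C3        # made-up WWN for illustration
register(target_wwn, 0x0A1B2C)         # made-up port address

address = lookup(target_wwn)
print(format_wwn(target_wwn), "->", f"{address:06X}")
```

Because the WWN stays fixed while the port address can change, the caller always looks up the address by WWN before communicating.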
There are two types of WWNs:
- World Wide Node Names (WWNNs) uniquely identify devices. Every host bus adapter (HBA), array controller, switch, gateway, and FC disk drive has a single unique WWNN.
- World Wide Port Names (WWPNs) uniquely identify each port in a device. A dual-ported HBA has three WWNs: one WWNN and a WWPN for each port.
WWNNs and WWPNs are both needed because devices can have multiple ports. On single-ported devices, the WWNN and WWPN are usually the same. On multi-ported devices, however, the WWPN is used to uniquely identify each port. Ports must be uniquely identifiable because each port participates in a unique data path. WWNNs are required because the node itself must sometimes be uniquely identified. For example, path failover and multiplexing software can detect redundant paths to a device by observing that the same WWNN is associated with multiple WWPNs. Cisco MDS switches use the following acronyms:
- PWWN (Port WWN)
- NWWN (Node WWN)
Summary
This topic summarizes the key points that were discussed in this lesson.
- Fibre Channel uses a credit-based flow control strategy.
- There are two types of flow control: buffer-to-buffer (port-to-port) and end-to-end (source-to-destination).
- Credit requirements depend on frame size, round-trip time (RTT), serialization time, and processing time.
- An FC address is a 24-bit number whose 3 bytes represent [ domain ] [ area ] [ port ].
- Every Fibre Channel port and node has a hard-coded address called a World Wide Name (WWN).
Lesson 5
FC Login
Overview
The Fabric Login, Port Login, and Process Login protocols define how fabric ports behave when they are brought online and when they want to establish a communication session. This lesson provides a detailed examination of each of the login protocols. It explains the role that each login protocol serves, and identifies the commands that are exchanged during each phase of each protocol.
Objectives
Upon completing this lesson, you will be able to describe the Fibre Channel device login process. This includes being able to meet these objectives:
- Identify the phases of the Fabric Login protocol
- Identify the phases of the Port Login protocol
- Identify the phases of the Process Login protocol
- Identify the phases of the Loop Initialization and Arbitration protocols
Fabric Login
This section provides an overview of the session establishment protocols that are performed by N_Ports and F_Ports in a fabric topology.
[Figure: login sequence between two nodes across the fabric. Each N_Port performs FLOGI with its F_Port, N_Port A performs PLOGI with N_Port B, and process A performs PRLI with process B.]
Before an N_Port can begin exchanging data with other N_Ports, three processes must occur: The N_Port must log in to its attached F_Port. This process is known as Fabric Login (FLOGI). The N_Port must log in to its target N_Port. This process is known as Port Login (PLOGI). The N_Port must exchange information about ULP support with its target N_Port to ensure that the initiator and target process can communicate. This process is known as Process Login (PRLI).
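The ordering constraint among the three logins can be expressed as a small state machine. The Python sketch below is illustrative only: the class `NPortSession` and its methods are hypothetical names, and it models nothing beyond the rule that FLOGI, PLOGI, and PRLI must complete in that order before data can be exchanged.

```python
class NPortSession:
    """Tracks which login steps an N_Port has completed, in order."""

    ORDER = ("FLOGI", "PLOGI", "PRLI")

    def __init__(self):
        self.completed = []

    def login(self, step: str) -> None:
        """Complete the next login step, rejecting out-of-order requests."""
        if self.ready:
            raise RuntimeError("all login steps are already complete")
        expected = self.ORDER[len(self.completed)]
        if step != expected:
            raise RuntimeError(f"{step} attempted before {expected}")
        self.completed.append(step)

    @property
    def ready(self) -> bool:
        """True once FLOGI, PLOGI, and PRLI have all completed."""
        return len(self.completed) == len(self.ORDER)


session = NPortSession()
for step in ("FLOGI", "PLOGI", "PRLI"):
    session.login(step)
print(session.ready)  # True
```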
[Figure: Fabric Login, steps 1 and 2. NOS and OLS primitive sequences are exchanged between the switch F_Port and the node N_Port.]
FLOGI is the initial bootstrap process that occurs when an N_Port is connected to an F_Port. FLOGI is mandatory for N_Ports and optional for NL_Ports. The N_Port uses Fabric Login to discover whether a fabric is present. Communication with other N_Ports may not be attempted until the Fabric Login process is complete. The FLOGI protocol follows this process:
1. The F_Port sends a primitive sequence of NOS (Not Operational) to the N_Port.
2. When the N_Port receives the NOS, it responds with a primitive sequence of OLS (Offline State) to begin link initialization.
[Figure: Fabric Login, steps 3 and 4. LR and LRR are exchanged between the switch F_Port and the node N_Port.]
3. After the N_Port begins the initialization process by sending OLS, the F_Port tries to reset the port by sending an LR (Link Reset) command.
4. The N_Port responds with an LRR (Link Reset Response) command.
[Figure: Fabric Login, step 5. IDLE fill words flow in both directions between the switch F_Port and the node N_Port.]
5. From this point on, the link is active and IDLE fill words flow in both directions on the link. 6. Following link initialization, a new N_Port uses an S_ID of 000000 or 0000[AL_PA] to indicate that the port is unidentified during FLOGI. An existing N_Port uses its existing port address as its S_ID.
[Figure: Fabric Login. The N_Port sends FLOGI to the switch Login Server, which replies with LS_ACC. Callout: "Great, I have established a link with the switch! Now I need to request a port address."]
7. After the N_Port has established a link to its F_Port, the N_Port obtains a port address by sending a FLOGI Link Services command to the switch Login Server (at Well-Known Address 0xFFFFFE).
8. The Login Server sends an ACC reply that contains the N_Port address in the D_ID field.
If an N_Port that is performing FLOGI receives an ACC frame indicating that the ACC came from another N_Port, the N_Port that is logging in assumes that it is in a point-to-point configuration. In this case, the N_Port immediately initiates PLOGI with the other N_Port after completing FLOGI.
Figure: The N_Port sends PLOGI to the Name Server in the switch and receives LS_ACC in reply. Caption: "Now that I have a port address I will log in to the Name Server and tell it about me."
9. After receiving a port address, the N_Port logs in to the Fabric Name Server at address 0xFFFFFC and transmits its service parameters, such as the number of buffer credits it supports, its maximum payload size, and supported Classes of Service.
10. The Name Server responds with an LS_ACC frame.
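The addressing rules in the steps above can be summarized in a short sketch. The function names and arguments are hypothetical; the two Well-Known Addresses and the S_ID rules come from the text.

```python
LOGIN_SERVER = 0xFFFFFE   # Well-Known Address of the fabric Login Server
NAME_SERVER = 0xFFFFFC    # Well-Known Address of the Name Server


def flogi_header_ids(current_fcid=None, al_pa=None):
    """Return (S_ID, D_ID) for a FLOGI request.

    current_fcid: address of a port that has already been assigned one.
    al_pa: arbitrated loop physical address of a new loop port, if any.
    """
    if current_fcid is not None:
        s_id = current_fcid            # existing port reuses its address
    elif al_pa is not None:
        s_id = al_pa & 0xFF            # 0000[AL_PA]
    else:
        s_id = 0x000000                # new, unidentified port
    return s_id, LOGIN_SERVER


def name_server_plogi_ids(assigned_fcid):
    """Return (S_ID, D_ID) for the PLOGI sent to the Name Server."""
    return assigned_fcid, NAME_SERVER


s_id, d_id = flogi_header_ids()
print(f"FLOGI S_ID={s_id:06X} D_ID={d_id:06X}")
```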
FLOGI payload layout (each word is shown as Bits 31-24, Bits 23-16, Bits 15-8, Bits 7-0):

Field                          Size
Command (04 00 00 00)          4 bytes
Common Service Parameters      16 bytes
N_Port Name                    8 bytes
Node Name                      8 bytes
Class 1 Service Parameters     16 bytes
Class 2 Service Parameters     16 bytes
Class 3 Service Parameters     16 bytes
Class 4 Service Parameters     16 bytes
Vendor Version Level           16 bytes
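As an illustration of the payload layout above, the following sketch slices a raw FLOGI payload into its fields. The field names and the `split_flogi_payload` function are hypothetical labels chosen for this example; only the field order and sizes come from the layout.

```python
# (name, size in bytes) in the order given by the payload layout.
FLOGI_FIELDS = [
    ("command", 4),
    ("common_service_params", 16),
    ("n_port_name", 8),
    ("node_name", 8),
    ("class1_params", 16),
    ("class2_params", 16),
    ("class3_params", 16),
    ("class4_params", 16),
    ("vendor_version_level", 16),
]
FLOGI_PAYLOAD_LEN = sum(size for _, size in FLOGI_FIELDS)


def split_flogi_payload(payload: bytes) -> dict:
    """Slice a FLOGI payload into named fields without interpreting them."""
    if len(payload) != FLOGI_PAYLOAD_LEN:
        raise ValueError(f"expected {FLOGI_PAYLOAD_LEN} bytes, got {len(payload)}")
    fields, offset = {}, 0
    for name, size in FLOGI_FIELDS:
        fields[name] = payload[offset:offset + size]
        offset += size
    return fields


# A dummy payload: the command word followed by zero-filled parameters.
example = bytes([0x04, 0x00, 0x00, 0x00]) + bytes(FLOGI_PAYLOAD_LEN - 4)
print(split_flogi_payload(example)["command"].hex())
```

Interpreting the individual service parameter words is outside the scope of this sketch.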
The preceding image shows an analyzer trace that displays part of a fabric login sequence. The top of the trace shows the OLS-LR-LRR sequence that occurred while the link was being initialized. The right-hand panel shows the contents of the FLOGI frame from the N_Port to the F_Port (FFFFFE). Useful information can be obtained by studying these analyzer traces:
- At this time, the N_Port does not yet have an address.
- The World Wide Port Name is the same as the World Wide Node Name. This is common in single-ported nodes.
- The N_Port does not support Class 1, but it does support Classes 2 and 3.
- The N_Port supports the Alternate Buffer Credit Management Method and can guarantee 2 BB_Credits at its receiver port.
- This is a single-frame Class 3 sequence because the Start of Frame is SOFi3 and the End of Frame is EOFt, meaning that this first frame is also the last one in the sequence.
Port Login
This section provides a description of the PLOGI protocol. Each command used during PLOGI is identified; however, the parameters exchanged during each command are not described in detail.
Figure: One N_Port sends PLOGI through the fabric to another N_Port. Caption: "I want to exchange data with another N_Port. I will tell them I am here and find out what their capabilities are."
After completing the FLOGI process, the N_Port can log in to another N_Port using the PLOGI protocol. PLOGI must be completed before the nodes can perform any ULP operations. The PLOGI protocol follows this process:
1. The initiator N_Port sends a PLOGI frame that contains the N_Port's operating parameters encapsulated in the payload.
Figure: The target N_Port returns LS_ACC through the fabric to the initiator N_Port. Caption: "I see that this port has some limitations, so I will operate in Class 3 with small frame sizes."
2. The target N_Port responds to the initiator N_Port by sending an ACC frame that specifies the target N_Port's operating parameters. The operating system driver that manages the initiator N_Port stores this information in a parameter block.
An N_Port can be logged in to multiple N_Ports simultaneously. N_Ports typically perform Port Logout (LOGO) only when one of the nodes goes offline.
The image shows the N_Port logging in to another N_Port using a PLOGI command. Note that the N_Port has provided the same data that it provided when it logged in to the Name Server.
Process Login
This section provides a description of the PRLI protocol. Each command used during PRLI is identified; however, the parameters exchanged during each command are not described in detail.
Figure: The initiator N_Port sends PRLI through the fabric to the target N_Port. Captions: "I am going to be using the SCSI protocol. I wonder if the target can support the same functions as I can." and "I must tell the target what SCSI functionality I can support and find out what the target can support."
2007 Cisco Systems, Inc. All rights reserved.
After completing the PLOGI protocol, both N_Ports know about each other's Fibre Channel (FC) operating parameters and capabilities. At this point, the driver for the initiator port can open a channel with the driver associated with the target port using the PRLI protocol. The PRLI protocol is used to establish a session between two FC-4 level logical processes. The PRLI protocol follows this process:
1. The initiator sends a PRLI frame that contains information about its ULP support.
Figure: The target N_Port returns LS_ACC through the fabric to the initiator N_Port. Caption: "Now we have information about each other, we will only talk in protocols that we can both understand."
2. The target port responds with an ACC frame that contains details about its ULP support. At this point, a channel has been successfully opened and communication can take place. The relationship between the initiator process and the target process is known as an image pair.
3. When the initiator has finished exchanging data with the target, the initiator sends a Process Logout (PRLO) frame.
4. The target responds with an ACC frame, and the image pair is then terminated. At this point, the image pair must be established again before further communication can take place.
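The ordering constraints between the three logins (FLOGI before PLOGI, PLOGI before PRLI, and a new PRLI after PRLO) can be sketched as a small state tracker. The `NPortSession` class and its method names are hypothetical and model only the ordering described in this lesson for a fabric-attached port, not the frames themselves.

```python
class NPortSession:
    """Tracks which logins one initiator N_Port has completed."""

    def __init__(self):
        self.fabric_login_done = False
        self.port_logins = set()   # peers with a completed PLOGI
        self.image_pairs = set()   # peers with a completed PRLI

    def flogi(self):
        self.fabric_login_done = True

    def plogi(self, peer):
        if not self.fabric_login_done:
            raise RuntimeError("FLOGI must complete before PLOGI")
        self.port_logins.add(peer)

    def prli(self, peer):
        if peer not in self.port_logins:
            raise RuntimeError("PLOGI must complete before PRLI")
        self.image_pairs.add(peer)

    def prlo(self, peer):
        # Terminates the image pair; the port login itself is kept.
        self.image_pairs.discard(peer)

    def can_exchange_data(self, peer):
        return peer in self.image_pairs


session = NPortSession()
session.flogi()
session.plogi("target")
session.prli("target")
print(session.can_exchange_data("target"))
```

After `prlo`, `can_exchange_data` returns `False` until `prli` is called again, matching the requirement that the image pair be re-established.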
The image shows the N_Port performing process login (PRLI) with its target N_Port. The payload data in a PRLI is relevant to the ULP, which in this case is SCSI-FCP. For example:
- This N_Port can function as an initiator.
- The ULP driver does not use the SCSI-3 XFER_RDY command during SCSI Read operations.
The image shows the target N_Port responding to the PRLI command shown on the previous page.
Figure: Loop initialization steps, including Arbitration (arbitrate for ownership), Position Map Reporting, and Position Map Distribution.
Figure: LPSM shown against the Fibre Channel layers: FC-0 Physical interface, FC-1 Encoding, FC-2 Framing & flow control, FC-3 Common Services, FC-4 Upper-layer protocols (labels FC-PI and FC-FS).
Summary
This topic summarizes the key points that were discussed in this lesson.
- Fabric Login (FLOGI) is performed by all connecting devices to acquire an FCID
- Port Login (PLOGI) is performed to pass identification and capability to the name server
- Process Login (PRLI) is performed between end devices to exchange operating parameters and capabilities and to establish a session
- Loop initialization is performed by all devices connecting to a loop hub or device to acquire an AL_PA (arbitrated loop physical address)
Lesson 6
FC Error Recovery
Overview
Each Fibre Channel (FC) layer plays a role in error management. In this lesson, you will learn how each layer detects and recovers from errors. You will also learn about configuration parameters that affect the way an FC SAN responds to error conditions.
Objectives
Upon completing this lesson, you will be able to explain how Fibre Channel recovers from errors. This includes being able to meet these objectives:
- Explain how FC handles frame-level errors detected by the FC-1 layer
- Explain how FC handles sequence-level errors detected by the FC-2 layer
- Explain how the SCSI-FCP protocol handles error conditions
FC-1 Errors
This section describes the error recognition protocol in the FC-1 layer of Fibre Channel.
Figure: Loss-of-synchronization state diagram. States 1 through 4 appear under the label "Synchronization Acquired", followed by a "Loss of Sync" state. Transitions are labeled "Invalid word detected", "Valid word detected", and "Sync (re)gained".
Four consecutive invalid transmission words must occur to trigger an FC-0 loss-of-synchronization error. This requirement prevents transient errors from causing loss of synchronization. The preceding graphic shows the trigger conditions required to cause a loss-of-synchronization error:
- The system starts in state 1.
- When an invalid word is detected, the system moves to state 2.
- If the next word is valid, the system moves back to state 1.
- After three consecutive invalid words, the system is in state 4.
- The next consecutive invalid word will trigger a loss-of-synchronization error.
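The trigger conditions above can be expressed as a small state machine. This is a sketch under one stated assumption: the text describes a valid word returning the system to state 1 only from state 2, and the code applies the same rule to states 3 and 4. The function name is hypothetical.

```python
def first_loss_of_sync(words):
    """Return the index of the word that triggers loss of sync, or None.

    words is an iterable of booleans: True for a valid transmission
    word, False for an invalid one.
    """
    state = 1
    for index, valid in enumerate(words):
        if valid:
            state = 1              # assumption: any valid word resets to state 1
        elif state == 4:
            return index           # fourth consecutive invalid word
        else:
            state += 1             # 1 -> 2 -> 3 -> 4
    return None


print(first_loss_of_sync([False, False, False, False]))        # 3
print(first_loss_of_sync([False, False, False, True, False]))  # None
```

A single valid word in the middle of a burst is enough to avoid the error in this model, which is how transient errors are tolerated.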
- Exceeding R_T_TOV during these events results in link failure
- Default value is 100 ms
- Cannot be changed on MDS switches
Link failure occurs in the following situations:
- Loss of synchronization occurs and synchronization cannot be reestablished within a specified timeout period.
- An expected Primitive Sequence is not received within a specified timeout period during the link initialization, reset, and failure protocols.
R_T_TOV
The timeout period that governs both of these cases is the Receiver-Transmitter Timeout Value (R_T_TOV). The default value of R_T_TOV is 100ms. R_T_TOV cannot be changed on MDS switches. R_T_TOV is an FC-1 layer timer. This timer is used to detect loss of synchronization between the transmitter and receiver, and is also used to time link reset events. If R_T_TOV is too low, the transmitter and receiver will experience repeated loss of synchronization and link reset events. If R_T_TOV is too low for the link reset process to complete, the link will not come up.
R_T_TOV (Cont.)
- All ports in the fabric must have the same R_T_TOV value
- Fabric will segment if R_T_TOV is not consistent
- Shorter R_T_TOV (100 µs) has been proposed to provide faster error detection required for real-time systems, such as avionics environments
R_T_TOV is a fabric-wide timeout value. All ports in the fabric must have the same value. If R_T_TOV is not the same on two connected switches, the fabric will segment. The default value of 100 ms is acceptable in most situations. However, R_T_TOV might need to be adjusted in some environments. Real-time environments like FC-AE require very fast responses, fast error recovery, and low latency. For applications with these requirements, 100 ms is a long time to wait; 5000 2KB frames could be sent in that time. (Each 2KB frame takes approximately 20 µs to serialize, so 5000 x 20 µs = 100 ms.) Some FC developers have proposed reducing the default R_T_TOV to 100 µs (1000 times shorter) for certain environments.
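The serialization arithmetic above can be checked with a short calculation. The sketch assumes a 1Gb/s link running at 1.0625 Gbaud with 10 line bits per data byte (8b/10b encoding) and counts only the 2048 payload bytes; these parameters are assumptions for illustration, since the text does not state the link speed it used.

```python
def serialization_time_us(frame_bytes, line_rate_baud=1.0625e9, line_bits_per_byte=10):
    """Time to clock one frame onto the link, in microseconds."""
    return frame_bytes * line_bits_per_byte / line_rate_baud * 1e6


per_frame_us = serialization_time_us(2048)
frames_in_r_t_tov = 100_000 / per_frame_us   # R_T_TOV default of 100 ms

print(f"{per_frame_us:.1f} us per 2KB frame")
print(f"about {frames_in_r_t_tov:.0f} frames in 100 ms")
```

The result is slightly under 20 µs per frame and slightly over 5000 frames per 100 ms, which agrees with the rounded figures in the text.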
FC-2 Errors
This section describes the error recognition protocol in the FC-2 layer of Fibre Channel.
Frame Errors:
- Invalid D_ID, S_ID, OX_ID, RX_ID, SEQ_ID, or SEQ_CNT
- Invalid R_CTL, F_CTL, or DF_CTL
- Unsupported ULP or invalid TYPE
- Invalid Offset
Resource Errors:
- Too many Sequences
- Cannot establish Exchange
Delimiter Errors:
- Invalid SOF/EOF
- Unsupported Class of Service
Delivery Errors:
- Missing frame or ACK
- Out-of-order SEQ_CNT
- Undeliverable frames
- Sequence or link timeout
The FC-2 layer detects four general types of errors:
- Frame Errors occur when any of the frame header fields are invalid, such as a frame with an invalid D_ID or unsupported ULP.
- Resource Errors occur when the sequence count exceeds the maximum number of sequences within an exchange (256) or when a valid exchange cannot be established.
- Delimiter Errors occur when either SOF or EOF is invalid or if a frame is received with an unsupported Class of Service.
- Delivery Errors occur when frames arrive out of sequence, are missing, or fail to arrive within a specified time period.
In Classes 1, 2, 4, and F, which all provide acknowledged delivery, an RJT or BSY response will be sent to the transmitting port when a frame is invalid or cannot be delivered:
- The fabric will reply with F_BSY if the destination switch port had no free buffers.
- The fabric will reply with F_RJT if the frame had an invalid D_ID or S_ID, or if the port is unavailable.
- The receiver will reply with P_BSY if the receiver port had no free buffers.
- The receiver will reply with P_RJT if the requested Class of Service or ULP is not supported.
In Class 3, frames will be discarded without notification if the receiver port has no buffers, is unavailable, or does not support the requested Class of Service or ULP.
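The response rules above amount to a lookup keyed on the failure condition, with Class 3 as the silent exception. The condition names and the `failure_response` function below are hypothetical labels for this sketch; the mapping itself follows the list above.

```python
ACKNOWLEDGED_CLASSES = {"1", "2", "4", "F"}

RESPONSES = {
    "switch_port_no_buffers": "F_BSY",
    "invalid_d_id": "F_RJT",
    "invalid_s_id": "F_RJT",
    "port_unavailable": "F_RJT",
    "receiver_no_buffers": "P_BSY",
    "unsupported_class_or_ulp": "P_RJT",
}


def failure_response(service_class, condition):
    """Return the response sent to the transmitter, or None if none is sent."""
    if condition not in RESPONSES:
        raise ValueError(f"unknown condition: {condition}")
    if service_class == "3":
        return None                      # Class 3: discarded without notification
    if service_class not in ACKNOWLEDGED_CLASSES:
        raise ValueError(f"unknown class of service: {service_class}")
    return RESPONSES[condition]


print(failure_response("2", "receiver_no_buffers"))
print(failure_response("3", "receiver_no_buffers"))
```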
E_D_TOV
E_D_TOV is an FC-2 layer timer:
- Determines how long a receiver waits for a response before declaring an error condition
- Default value of 2 seconds
- On FCIP links you might need to increase E_D_TOV if RTT exceeds 2 seconds, for example because of unusually high latency in the IP WAN, dropped packets that need to be retransmitted, or congestion at the FCIP gateway (low-bandwidth IP)
E_D_TOV is an FC-2 layer timer. E_D_TOV determines how long a receiver waits for an expected response before declaring an error condition. For example, if a frame arrives out of sequence, the receiver waits E_D_TOV before it declares an error and aborts the sequence. The default value of E_D_TOV is usually 2 seconds on FC switches. This value is always sufficient for DWDM and almost always sufficient for SONET/SDH. On FCIP links, however, you might need to increase E_D_TOV due to two factors:
- There might be unusually high latency in the IP WAN, or there might be dropped packets that need to be retransmitted. In either case, it is possible (although unlikely in a well-designed IP network) for the total round-trip latency to exceed 2 seconds.
- The bandwidth of the IP link might be less than the bandwidth of the FC fabric, so frames could pile up in the fabric if the IP link becomes congested.
Sequence Recovery
Figure: Sequence recovery flowchart with these boxes: Sequence Error Detected; Send ABTS; Reply BA_ACC; BA_ACC received? (YES: Discard all frames in Sequence; NO: Retry ABTS or Implicit logout of other port).
This diagram illustrates the first part of the sequence recovery process:
- When a sequence error occurs, the N_Port that detected the error sends the Abort Sequence (ABTS) Basic Link Services command to abort the sequence. ABTS can be transmitted as part of the current sequence or as a new sequence.
- The other N_Port responds with the Basic Accept (BA_ACC) command.
- Both ports discard all frames in the Sequence.
- If the N_Port that sent ABTS does not receive BA_ACC, it assumes that the other port is no longer available and performs an implicit port logout.
In some cases, the entire exchange is aborted with the Abort Exchange (ABTX) command, and the entire exchange must be reestablished.
The wait time is determined by the Resource Allocation Timeout Value (R_A_TOV):
- Default value is 10 seconds in a fabric
- Default value is 2 * E_D_TOV (4 sec) for point-to-point
When ABTS is issued to abort a sequence, the fabric must be purged of all frames in the sequence before the sequence can be retransmitted; otherwise, old frames could arrive out of sequence. The receiver might not be able to differentiate between the old frames and the retransmitted frames, and data errors could result at the ULP level. Therefore, the initiator waits for a specified period of time before retransmitting the sequence. This time period is determined by the Resource Allocation Timeout Value (R_A_TOV):
- In a fabric, the default value of R_A_TOV is 10 seconds.
- In a point-to-point topology, the default value of R_A_TOV is twice the value of E_D_TOV, or 4 seconds.
Figure: Continuation of the recovery flowchart. After BA_ACC is received and all frames in the Exchange are discarded, two boxes are added: Wait R_A_TOV, then Send RRQ.
This diagram continues the description of the sequence recovery process:
- After BA_ACC is received, the originating N_Port waits for a time equal to R_A_TOV (10 seconds in a fabric).
- After R_A_TOV has expired, the originating N_Port sends the Reinstate Recovery Qualifier (RRQ) command.
- After RRQ is sent, the ports can begin retransmission of the failed sequence or exchange.
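The two halves of the recovery process can be combined into one short sketch. The function names and step strings are hypothetical, and the "Retry ABTS" branch of the flowchart is deliberately left out because the text does not say how many retries are attempted.

```python
def default_r_a_tov(topology, e_d_tov=2.0):
    """Default R_A_TOV in seconds for the given topology."""
    if topology == "fabric":
        return 10.0
    if topology == "point-to-point":
        return 2 * e_d_tov
    raise ValueError(f"unknown topology: {topology}")


def recovery_steps(ba_acc_received, topology="fabric"):
    """List the actions taken by the N_Port that detected the error."""
    steps = ["send ABTS"]
    if not ba_acc_received:
        steps.append("implicit logout of other port")
        return steps
    steps.append("discard all frames in the sequence")
    steps.append(f"wait R_A_TOV ({default_r_a_tov(topology):g} s)")
    steps.append("send RRQ")
    steps.append("retransmit the failed sequence or exchange")
    return steps


for step in recovery_steps(ba_acc_received=True):
    print(step)
```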
R_A_TOV
R_A_TOV is an FC-2 layer timer:
- Specifies how long a frame can be in transit
- Used to determine how long a sender must wait before it can begin resending an aborted sequence
R_A_TOV is an FC-2 layer timer. R_A_TOV specifies how long an FC frame can be in transit. This value is used to determine how long a sender must wait before it can begin resending a sequence that was aborted after an error occurred. The sender must wait for R_A_TOV because if a sender begins to resend a sequence before the frames from the old aborted sequence have been received, discarded, or expired, frames from the old and new sequences might arrive intermixed. Because the FC protocol provides no way for the receiver to guarantee that the new sequence ID will be different from the old sequence ID, the sender waits until there is no chance that frames could still be in transit. The default value of R_A_TOV is usually 10 seconds on FC switches. This value is sufficient for any type of WAN link except some IP links.
By default, the SCSI-FCP protocol uses the Abort, Discard Multiple Sequences exchange error policy, in which all sequences in the exchange are retransmitted. However, discarding the entire exchange is often not the most desirable solution. The initiator must wait for the R_A_TOV timeout period (10 seconds by default) to expire before retrying the aborted exchange. In addition to reducing overall performance, this long wait time can have greater impact in some situations. For example, if the failed operation is a backup application streaming frames to a tape drive, then the tape buffer will empty and the drive will stop. When the buffer begins to fill again, the tape will rewind, run up to speed, and continue streaming from the last file mark.
The preceding diagram shows an example of basic SCSI-FCP error recovery:
- An FCP command (FCP_CMND) is issued by the initiator.
- Something goes wrong during the exchange and the FCP status sequence (FCP_RSP) does not arrive.
- The initiator waits for E_D_TOV (2 seconds) for the missing FCP_RSP to arrive.
- The initiator sends ABTS to abort the exchange.
- The target responds with BA_ACC.
- The initiator then waits for R_A_TOV (10 seconds) for all frames to be purged from the fabric before retrying the FCP command.
A total of about 12 seconds elapses until the FCP command is resent.
Some ports are capable of using an enhanced recovery technique that allows nodes to recover from sequence errors without having to abort the entire exchange. This enhanced recovery technique is defined by FC-4 Link Services commands. FC-4 Link Services are similar to Extended Link Services, but FC-4 Link Services are defined by the ULP, whereas Extended Link Services are defined by FC-2.
- The Read Exchange Concise (REC) Extended Link Service command allows the initiator to ask the target to report the status of the exchange.
- The Sequence Retransmission Request (SRR) Extended Link Service command requests retransmission of the exchange beginning at a specific sequence.
- The Read Exchange Concise Timeout Value (REC_TOV) determines how long the initiator waits before sending the REC command. The default value of REC_TOV is equal to the value of E_D_TOV (2 seconds) plus 1 second.
The REC and SRR commands are not defined in the FC-PH specification; rather, they are defined in the FCP-2 ULP specification. Many vendors take advantage of this technique.
The preceding diagram shows an example of enhanced SCSI-FCP error recovery:
- An FCP_CMND is issued by the initiator, but something goes wrong during the exchange. The FCP_RSP does not arrive.
- The initiator waits REC_TOV (3 seconds) for the missing frame to arrive.
- The initiator sends REC to request information about the status of the exchange.
- The target acknowledges REC by sending FC-4_ACC.
- When the initiator receives FC-4_ACC, the initiator knows where in the exchange the failure occurred. The initiator then sends SRR to request retransmission of the sequence.
- The target resends the missing FCP_RSP sequence.
A total of about 3 seconds (the default value of REC_TOV) elapses until the initiator sends REC. This is about one quarter of the time elapsed during basic error recovery (12 seconds) in a similar situation.
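The timing comparison between basic and enhanced recovery follows directly from the default timer values. The function names below are hypothetical; the defaults (E_D_TOV of 2 seconds, R_A_TOV of 10 seconds, REC_TOV of E_D_TOV plus 1 second) are the ones given in this lesson, and frame transit times are ignored.

```python
def basic_recovery_delay(e_d_tov=2.0, r_a_tov=10.0):
    """Seconds before FCP_CMND is retried: wait E_D_TOV, abort, wait R_A_TOV."""
    return e_d_tov + r_a_tov


def rec_tov(e_d_tov=2.0):
    """Seconds the initiator waits before sending REC."""
    return e_d_tov + 1.0


basic = basic_recovery_delay()
enhanced = rec_tov()
print(f"basic: {basic:g} s, enhanced: {enhanced:g} s, ratio: {enhanced / basic:g}")
```

Changing E_D_TOV (for example, on an FCIP link) shifts both figures, which is worth keeping in mind when tuning timers.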
Summary
This topic summarizes the key points that were discussed in this lesson.
- Four consecutive invalid transmission words trigger an FC-0 loss-of-synchronization error
- At FC-1, all ports in the fabric must have the same R_T_TOV value
- FC-2 layer detects four general types of errors:
  1. Frame Errors
  2. Resource Errors
  3. Delimiter Errors
  4. Delivery Errors
- Initiator must wait a minimum of R_A_TOV (10 seconds) before retrying the aborted exchange
Summary (Cont.)
Receiver Transmitter Time-Out Value (R_T_TOV):
Short timer used to detect link-level failures
Lesson 7
FC Switched Fabric
Overview
This lesson explains three important protocols in a Fibre Channel switched fabric: the fabric configuration protocol, the FSPF protocol, and the RSCN protocol. You will also learn about fabric services and how they are addressed.
Objectives
Upon completing this lesson, you will be able to describe the Fibre Channel Switched Fabric protocol. This includes being able to meet these objectives:
- Describe the high-level phases of the fabric configuration protocol
- Explain the FSPF protocol
- Explain the RSCN protocol
- Identify the standard fabric services and their well-known addresses
The diagram describes the steps taken when a fabric is first initialized, a new switch is added to an existing fabric, or a link becomes active:
- A switch port detects a valid signal on its attached link and achieves word synchronization.
- The switch port begins link initialization. If the port is capable of operating at more than one speed, it may perform speed negotiation.
- The switch port determines the proper operating mode: FL_Port, F_Port, or E_Port.
- Exchange Link Parameters (ELP). When two E_Ports are connected and the link is initialized, the ports exchange link parameters. This is accomplished by using a switch internal link service (SW_ILS) called Exchange Link Parameters (ELP). The ELP is sent from the Fabric Controller (0xFFFFFD) in one switch to the Fabric Controller in the neighbor switch using Class F service.
- Exchange Switch Capabilities (ESC). Next, ESC is sent between neighboring Fabric Controllers to agree upon a common routing protocol.
- Exchange Fabric Parameters (EFP). The principal switch is selected using the Exchange Fabric Parameters (EFP) SW_ILS. The EFP is sent between Fabric Controllers in neighbor switches.
- Domain Identifier Assigned (DIA). After a principal switch has been selected, Domain_IDs are assigned to the switches. The principal switch assigns itself a Domain_ID, then floods the fabric with this information.
- Request Domain Identifier (RDI). After a switch receives a Domain Identifier Assigned (DIA) switch internal link service, it can request a Domain_ID from the principal switch by sending a Request Domain Identifier (RDI) to the principal switch.
- Fabric Shortest Path First (FSPF). After the Domain_ID assignment phase is complete, routing tables are built. The switch may use the standardized FSPF protocol or a vendor-unique routing protocol.
- Build Routing Tables. Finally, each switch computes the paths it will use to deliver frames to other switches.
FSPF
This section provides an overview of the FSPF protocol.
Fabric Shortest Path First (FSPF):
- Computes the least-cost path through the fabric, based on link speed and number of hops
- Avoids looping of frames
- All frames follow the same path
- Ensures in-order delivery in a stable SAN
The FSPF protocol is the routing protocol used on FC SAN fabrics. The preceding diagram shows that FSPF selects a single path for a given I/O transaction, avoiding looping and ensuring in-order delivery. The FSPF algorithm is a cost-based routing algorithm that computes the most efficient path between two connected nodes. The cost of a given path is based on two factors:
- The speed of each of the ISLs along the path
- The number of hops on the path
Routing using a single fixed path prevents looping of frames and, in a stable SAN, ensures in-order delivery. In other words, if routes are stable, frames always follow the same path. However, if the least-cost route changes while a session is in progress, frames sent after the route change might take the new route.
FSPF (Cont.)
Three protocols used in FSPF:
- The Hello protocol is used to establish communication between two connected switches
- Initial Link State Record (LSR) database synchronization
- LSR database maintenance
These protocols use Switch Internal Link Services (SW_ILS) with Class F frames
The FSPF protocol maintains a Topology Database which is distributed to every switch in the fabric. If a switch detects a lost connection, either to a Node or to another switch (ISL), it will update the Topology Database and send a Link State Update frame to all other switches directly connected to it. Each of these switches will update their Topology Database and pass the LSR frame onto other switches. In this way the fabric is flooded with updates to the Topology Database. Any LSR frames already received are discarded to stop duplicate LSRs from being distributed throughout the fabric.
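The flooding behavior described above, including the rule that LSRs already received are discarded, can be sketched as a breadth-first walk over the fabric. This is a minimal illustration rather than the FSPF wire protocol: the `flood_lsr` function and the adjacency dictionary are hypothetical, and acknowledgements and retransmission are not modeled.

```python
from collections import deque


def flood_lsr(adjacency, origin):
    """Return the switches in the order they first accept the LSR.

    adjacency maps each switch name to the set of switches reachable
    over its ISLs (a hypothetical in-memory model of the fabric).
    """
    accepted = {origin}              # switches that already hold this LSR
    order = []
    queue = deque([origin])
    while queue:
        switch = queue.popleft()
        for neighbor in sorted(adjacency[switch]):
            if neighbor in accepted:
                continue             # duplicate LSR: discard, do not re-flood
            accepted.add(neighbor)
            order.append(neighbor)
            queue.append(neighbor)
    return order


fabric = {"A": {"B", "C"}, "B": {"A", "C", "D"}, "C": {"A", "B"}, "D": {"B"}}
print(flood_lsr(fabric, "A"))   # ['B', 'C', 'D']
```

Every switch accepts the update exactly once even though the A-B-C triangle offers two routes to each neighbor.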
After a switch acquires a Domain_ID, it begins the process of building a routing table:
1. The switch does not know whether its neighbor switch has acquired a Domain_ID.
2. The switch begins transmitting Hello messages to its neighbors on all initialized ISLs.
3. The switch exchanges Domain_IDs with all neighbors.
After two switches have exchanged Domain_IDs, the ISL is active and FSPF topology database synchronization can begin.
Note that the default values of these intervals mean that FSPF can take up to 100s to become aware of a link failure. You can lower these values to promote faster recovery when a link fails, but you should also keep in mind that Hello messages are flooded, so smaller Hello Interval values increase congestion. The Hello Dead Interval should generally be set to 4 times the Hello Interval to avoid triggering unnecessary FSPF route computation if Hello messages are lost due to congestion.
- Link State Update (LSU) used to exchange entire LSD
- Recipients respond with Link State Acknowledgement (LSA)
- After database in sync, LSUs issued only upon topology changes, which are flooded throughout the entire fabric
- LSUs retransmitted by a mechanism called Reliable Flooding
A new ISL completes link initialization (stage 1) and initial database synchronization (stage 2). One or more LSRs are transmitted to notify other switches to add the new information to their databases. This process by which LSRs are propagated through the fabric is known as reliable flooding. When a switch receives an LSR, it retransmits the LSR on other links. After the LSR is acknowledged, the switch stops transmitting that LSR on that link. The switch continues to send the LSR on other links until acknowledgement is received on those links.
Examples:
- Default link cost for 1Gb/s link: 1 * (1.0625e12 / 1.0625e9) = 1000
- Default link cost for 2Gb/s link: 1 * (1.0625e12 / 2.1250e9) = 500
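The two examples above follow the pattern factor * (1.0625e12 / line rate). The sketch below generalizes that pattern; the function name and the `admin_factor` parameter name are hypothetical, and the general formula is inferred from the two worked examples rather than quoted from a specification.

```python
def fspf_link_cost(line_rate_baud, admin_factor=1):
    """Link cost following the pattern used in the examples above."""
    return admin_factor * (1.0625e12 / line_rate_baud)


print(fspf_link_cost(1.0625e9))   # 1Gb/s link
print(fspf_link_cost(2.1250e9))   # 2Gb/s link
```

Raising `admin_factor` on a link makes it proportionally less attractive to the path selection algorithm.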
The calculation is performed on a link-by-link basis, so each link in a data path can be advertised with a different cost. These costs are used by the path selection algorithm to determine the most efficient paths. When a path contains multiple links, the costs of the links are added up to determine the total cost of the path. In the case of two or more paths of equal cost, the decision of which path to use is not specified and is determined by the switch vendor. Note that FSPF only considers the ISLs along the data path; it does not consider the node-to-switch link at either end of the path. FSPF routes frames between domains only.
Limitations of FSPF
- FSPF algorithm does not account for traffic load
- All frames in an exchange follow the same path
- Path changes only in response to changes in the fabric topology
Figure: Switches A, B, and C are interconnected by three 2Gb/s ISLs, each with Cost=500. Host 1, Host 2, Storage 1, and Storage 2 attach to the fabric.
The FSPF protocol supports load sharing, but it does not support load balancing. Load sharing is significantly different from load balancing, and the distinction can have significant effects on fabric design, especially when tuning performance:
- Load sharing simply means that multiple paths can be used
- Load balancing means that traffic load is balanced across multiple paths
FSPF does not account for actual path utilization. In other words, an unused path with a cost of 1000 will be disregarded in favor of an overutilized path with a cost of 500. All frames in an exchange must follow the same path, and paths are recomputed only when the physical ISL configuration changes. The preceding diagram shows a simple SAN with two data paths:
- Path A-C has a total cost of 500
- Path A-B-C has a total cost of 1000
FSPF will never use path A-B-C, even if path A-C is congested.
The FSPF least-cost path algorithm does not necessarily select the best path. For example, in the preceding diagram, links A-B and B-C are 2Gb/s links, with a default cost of 500 per link. Link A-C is a 1Gb/s link with a default cost of 1000. (Note that this diagram differs from the previous diagram only in that link A-C is a 1Gb/s link in this diagram.) There are two paths available from Switch A to Switch C:
- Path A-B-C has a total cost of 1000 and supports 2Gb/s along the entire path
- Path A-C also has a total cost of 1000 but supports only 1Gb/s
FSPF will weight both paths identically. When a single pair of devices (Host 1 and Storage 1) is attached to the SAN, FSPF might select path A-C even though that path supports only half the bandwidth of path A-B-C. (Path A-B-C does have greater latency than path A-C, but latency is a far less significant performance factor than bandwidth.) When a second pair of devices (Host 2 and Storage 2) is attached to the same switches, the switch will use the second equal-cost data path in an attempt to distribute the load evenly. In other words, Host 1 to Storage 1 will be assigned one path, and Host 2 to Storage 2 will be assigned the other path. Both data paths will be used. In both situations, the administrator can force path selection by adjusting the administrative weighting factor.
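Both example fabrics can be checked with a generic least-cost search. The sketch below uses Dijkstra's algorithm as a stand-in for the path selection step; the function names and the dictionary representation of ISL costs are hypothetical, and vendor tie-breaking between equal-cost paths is not modeled.

```python
import heapq


def least_cost(isl_costs, src, dst):
    """Lowest total cost from src to dst; isl_costs maps (a, b) to link cost."""
    neighbors = {}
    for (a, b), cost in isl_costs.items():
        neighbors.setdefault(a, []).append((b, cost))
        neighbors.setdefault(b, []).append((a, cost))
    best = {src: 0}
    heap = [(0, src)]
    while heap:
        cost, switch = heapq.heappop(heap)
        if switch == dst:
            return cost
        if cost > best[switch]:
            continue                      # stale heap entry
        for nxt, link_cost in neighbors.get(switch, []):
            new_cost = cost + link_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None


def path_cost(isl_costs, path):
    """Sum the link costs along an explicit list of switches."""
    total = 0
    for a, b in zip(path, path[1:]):
        total += isl_costs[(a, b)] if (a, b) in isl_costs else isl_costs[(b, a)]
    return total


all_2g = {("A", "B"): 500, ("B", "C"): 500, ("A", "C"): 500}
mixed = {("A", "B"): 500, ("B", "C"): 500, ("A", "C"): 1000}

print(least_cost(all_2g, "A", "C"))                                  # 500
print(path_cost(mixed, ["A", "B", "C"]), path_cost(mixed, ["A", "C"]))  # 1000 1000
```

In the second fabric the two paths tie at 1000, which is the situation in which cost alone cannot distinguish the 2Gb/s path from the 1Gb/s path.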
RSCN
Registered State Change Notification
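Figure: The host and the storage node each send SCR (State Change Registration) to the Fabric Controller in the switch. After a path failure, the Fabric Controller sends RSCN to the host and the storage node, and each replies with LS_ACC.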
Nodes respond to the RSCN with an LS_ACC frame. The RSCN message identifies the ports that were affected by the state change event, and it identifies the general nature of the event. After receiving an RSCN, the node can then use additional Link Services commands to obtain more information about the event. For example, if the RSCN specifies that the status of Port Y has changed, the nodes that receive the RSCN can attempt to verify the current (new) state of Port Y by querying the Name Server.
The Fabric Controller will generate RSCNs in the following circumstances:
- A fabric login (FLOGI) from an Nx_Port.
- The path between two Nx_Ports has changed (for example, a change to the fabric routing tables that affects the ability of the fabric to deliver frames in order, or an E_Port initialization or failure).
- An implicit fabric logout of an Nx_Port, including implicit logout resulting from loss of signal, link failure, or when the fabric receives a FLOGI from a port that had already completed FLOGI.
- Any other fabric-detected state change of an Nx_Port.
- Loop initialization of an L_Port, and the L_bit was set in the LISA Sequence.
An Nx_Port can also issue a request to the Fabric Controller to generate an RSCN. For example, if one port in a multi-ported node fails, another port in that node can send an RSCN to notify the fabric about the failure.
Port_ID page fields: Reserved, Event Qualifier, Domain_ID of affected Nx_Port, Area_ID of affected Nx_Port, Port_ID of affected Nx_Port
Event Qualifier codes:
0001 Changed Name Server object
0010 Changed port attribute
0011 Changed fabric service object
0100 Changed switch configuration
An RSCN frame payload contains one or more Port_ID Pages. Each Port_ID page is a 4-byte page that describes a single state change that has occurred with respect to a single Nx_Port. Each Port_ID page contains the following fields:
The Domain_ID, Area_ID, and Port_ID of the affected Nx_Port (bytes 1-3)
The Event Qualifier (bits 2-5 of byte 0)
The Event Qualifier is a 4-bit code that specifies the general nature of the event:
0001 A Name Server object has changed; for example, a port came online or went offline.
0010 A port attribute has changed; for example, the number of buffer credits assigned to that port was changed.
0011 A fabric service object has changed; for example, an Alias_ID was added. In this case, the Port_ID page will refer to the Well-Known Address of the affected fabric service.
0100 The switch configuration has changed; for example, a time-out value was changed.
Note that the Event Qualifiers do not communicate much information. For example, Event Qualifier code 0001 indicates a change to a Name Server object. This could signify that a port came online, went offline, or changed zones. The ports that receive the RSCN must then query the Name Server to determine the specific change that occurred.
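The Port_ID page layout described above can be illustrated with a minimal Python sketch. The function and dictionary names are hypothetical, and the sketch assumes that bit 0 is the least-significant bit of byte 0, so that "bits 2-5" is extracted by shifting right by two and masking four bits.

```python
from typing import Dict, Union

# Event Qualifier codes as listed in the text.
EVENT_QUALIFIERS: Dict[int, str] = {
    0b0001: "Changed Name Server object",
    0b0010: "Changed port attribute",
    0b0011: "Changed fabric service object",
    0b0100: "Changed switch configuration",
}


def parse_port_id_page(page: bytes) -> Dict[str, Union[int, str]]:
    """Decode one 4-byte RSCN Port_ID page into its fields."""
    if len(page) != 4:
        raise ValueError("a Port_ID page is exactly 4 bytes")
    # Assumption: bit 0 is the least-significant bit, so bits 2-5 are (byte0 >> 2) & 0x0F.
    qualifier = (page[0] >> 2) & 0x0F
    return {
        "event_qualifier": qualifier,
        "event": EVENT_QUALIFIERS.get(qualifier, "Unknown"),
        "domain_id": page[1],
        "area_id": page[2],
        "port_id": page[3],
    }


# Example page: qualifier 0001, affected port address 0x0A0100 (made-up value).
print(parse_port_id_page(bytes([0b00000100, 0x0A, 0x01, 0x00])))
```

As the text notes, the decoded qualifier only narrows down the kind of change; a receiving node would still query the Name Server for details.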
[Figure: Fabric services (Domain Manager, Name Server, Alias Server, Key Server, Time Server) shown against the Fibre Channel layers: FC-4 ULP Mapping, FC-3 Generic Services, FC-2 Framing and flow control, FC-1 Encoding, FC-0 Physical interface]
2007 Cisco Systems, Inc. All rights reserved. ICSNS v3.07-19
The FC-SW-2 specification defines several services that are required for fabric management. These services include:
Name Server
Login Server
Address Manager
Alias Server
Fabric Controller
Management Server
Key Distribution Server
Time Server
The FC-SW-2 specification does not require that switches implement all of these services; some services can be implemented as an external server function. However, the services discussed in this lesson are typically implemented in the switch, as in Cisco MDS 9000 Family Switches.
[Figure: Domain Manager and related components; labels: VSAN Manager, WWN Manager, Domain Manager, Fabric Configuration, Principal Switch Selection, Domain ID Allocation, FC_ID Allocation, Port Manager, Login Server]
Supports soft zoning
Provides information only about nodes in the requestor's zone
Distributed Name Server (dNS) resides in each switch
Responsible for entries associated with that switch's domain
Maintains local data copies and updates via RSCNs
Sends RSCNs to the fabric when a local change occurs
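The soft-zoning behavior listed above (a query returns only nodes that share a zone with the requestor) can be sketched in a few lines of Python. The zone names, node names, FC_ID values, and the function name are all hypothetical; a real Name Server keys its entries by port identifiers and WWNs rather than by friendly names.

```python
from typing import Dict, List, Set

# Hypothetical zoning configuration: zone name -> member node names.
ZONES: Dict[str, Set[str]] = {
    "zone_a": {"host1", "storage1"},
    "zone_b": {"host2", "storage2"},
}

# Hypothetical Name Server registrations: node name -> FC_ID.
REGISTERED: Dict[str, int] = {
    "host1": 0x0A0100,
    "storage1": 0x0B0200,
    "host2": 0x0A0300,
    "storage2": 0x0B0400,
}


def visible_nodes(requestor: str) -> List[str]:
    """Return registered nodes that share at least one zone with the requestor."""
    visible: Set[str] = set()
    for members in ZONES.values():
        if requestor in members:
            visible |= members
    visible.discard(requestor)
    return sorted(node for node in visible if node in REGISTERED)


print(visible_nodes("host1"))
```

Here host1 learns only about storage1; the nodes in zone_b are never reported to it.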
Well-Known Addresses
Well-known addresses are the highest 16 addresses in the 24-bit fabric address space
Service                        Address          Requirement
Broadcast Alias                FFFFFF           Mandatory
Fabric Login Server            FFFFFE           Mandatory
Fabric Controller              FFFFFD           Mandatory
Name Server                    FFFFFC           Optional
Time Server                    FFFFFB           Optional
Management Server              FFFFFA           Optional
QoS Facilitator                FFFFF9           Optional
Alias Server                   FFFFF8           Optional
Key Distribution Server        FFFFF7           Optional
Clock Synchronization Server   FFFFF6           Optional
Multicast Server               FFFFF5           Optional
Reserved                       FFFFF4-FFFFF0
Well-known addresses allow devices to reliably access switch services. All services are addressed in the same way as an N_Port is addressed. Nodes communicate with services by sending and receiving Extended Link Services commands (frames) to and from well-known addresses.
Well-known addresses are the highest 16 addresses in the 24-bit fabric address space:
FFFFFF - Broadcast Alias
FFFFFE - Fabric Login Server
FFFFFD - Fabric Controller
FFFFFC - Name Server
FFFFFB - Time Server
FFFFFA - Management Server
FFFFF9 - Quality of Service Facilitator
FFFFF8 - Alias Server
FFFFF7 - Key Distribution Server
FFFFF6 - Clock Synchronization Server
FFFFF5 - Multicast Server
FFFFF4-FFFFF0 - Reserved
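The address list above maps naturally onto a lookup table. The following Python sketch is illustrative only; the function name and the returned strings are assumptions, while the address values are the ones listed in the text.

```python
from typing import Dict

# Well-known addresses from the list above.
WELL_KNOWN: Dict[int, str] = {
    0xFFFFFF: "Broadcast Alias",
    0xFFFFFE: "Fabric Login Server",
    0xFFFFFD: "Fabric Controller",
    0xFFFFFC: "Name Server",
    0xFFFFFB: "Time Server",
    0xFFFFFA: "Management Server",
    0xFFFFF9: "Quality of Service Facilitator",
    0xFFFFF8: "Alias Server",
    0xFFFFF7: "Key Distribution Server",
    0xFFFFF6: "Clock Synchronization Server",
    0xFFFFF5: "Multicast Server",
}


def describe_address(fc_id: int) -> str:
    """Classify a 24-bit fabric address against the well-known range."""
    if not 0 <= fc_id <= 0xFFFFFF:
        raise ValueError("fabric addresses are 24 bits wide")
    if fc_id in WELL_KNOWN:
        return WELL_KNOWN[fc_id]
    if 0xFFFFF0 <= fc_id <= 0xFFFFF4:
        return "Reserved"
    return "Not a well-known address"


print(f"{0xFFFFFC:06X}", "->", describe_address(0xFFFFFC))
```

The eleven named services plus the five reserved values account for all 16 addresses from FFFFF0 through FFFFFF.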
Summary
This topic summarizes the key points that were discussed in this lesson.
The FSPF protocol is the routing protocol used on FC SAN fabrics
Five stages of the FSPF protocol: Hello protocol; Initial topology database synchronization; Topology database maintenance; Path discovery; Path computation
FC Name Server is a database implemented by the switch that stores information about each node
The Fabric Controller service provides a mechanism for state change notification through the Registered State Change Notification (RSCN) process
Well-known addresses are the highest 16 addresses in the 24-bit fabric address space
Appendix B
Module Objectives
Upon completing this module, you will be able to describe installation and configuration guidelines. This includes being able to meet these objectives:
Explain the process used to install and power up the switch
Lesson 1
Objectives
Upon completing this lesson, you will be able to explain the process used to install and power up the switch. This includes being able to meet these objectives:
Describe the installation guidelines for the MDS 9000 platform
Describe the types of rack and cabinet installations that are compatible with the MDS 9000 platform
Describe the power supply configuration options for the MDS 9000 platform, and state the power requirements of individual modules
Describe the characteristics and installation requirements of fan modules for the MDS 9000 Series
Describe the functions, interfaces, and installation requirements of MDS 9000 supervisor modules
Installation Guidelines
This topic describes the installation guidelines for the MDS 9000 platform.
Prepare the site:
Space evaluation
Weight distribution and floor loading
Environmental evaluation
Power evaluation
Grounding evaluation
Cable and interface equipment evaluation
EMI evaluation
Gather network-related information
Follow these guidelines when installing the Cisco MDS 9500 Series:
Plan your site configuration and prepare the site before installing the chassis. It is recommended that you use the site planning tasks listed in the Cisco MDS Series Hardware Installation Guide.
Ensure there is adequate space around the switch to allow for servicing the switch and for adequate airflow.
Ensure the air conditioning meets the heat dissipation requirements listed in the Cisco MDS Series Hardware Installation Guide.
Ensure the cabinet, or rack, meets the requirements listed in the Cisco MDS Series Hardware Installation Guide.
Note Jumper power cords are available for use in a cabinet.
Ensure the chassis is adequately grounded. Grounding the chassis is recommended in all cases, and it is mandatory for Cisco MDS 9506 Directors that have a DC power supply installed. If the switch is not mounted in a grounded rack or cabinet, it is recommended that you connect both the system ground on the chassis and the power supply ground to an earth ground, regardless of whether the power supplies are AC or DC.
Ensure the site power meets the power requirements listed in the Cisco MDS Series Hardware Installation Guide. If available, you can use an uninterruptible power supply (UPS) to protect against power failures.
Note Avoid UPS types that use ferroresonant technology. These UPS types can become unstable with systems such as the Cisco MDS 9000 Family, which can have substantial current draw fluctuations because of fluctuating data traffic patterns.
Ensure circuits are sized according to local and national codes. If you are using 200/240 VAC power sources in North America, the circuits must be protected by two-pole circuit breakers.
Note To prevent loss of input power, ensure that the total maximum loads on the circuits supplying power are within the current ratings of the wiring and breakers.
Record your installation and configuration information as you work. See Site Planning and Maintenance Records in the Cisco MDS Series Hardware Installation Guide.
Screw Torques
Use the following screw torques when installing the switch:
Captive screws: 4 in-lb
M3 screws: 4 in-lb
M4 screws: 12 in-lb
10-32 screws: 20 in-lb
12-24 screws: 30 in-lb
Required Equipment
Gather the following items before beginning the installation:
Number 1 and number 2 Phillips screwdrivers with torque capability
3/16-inch flat-blade screwdriver
Tape measure and level
ESD wrist strap or other grounding device
Antistatic mat or antistatic foam
In addition to the grounding items provided in the accessory kit, you need the following items:
Grounding cable (6 AWG recommended), sized according to local and national installation requirements; the required length depends on the proximity of the Cisco MDS 9500 to proper grounding facilities.
Crimping tool large enough to accommodate the girth of the lug
Wire-stripping tool
For DC power supplies in a Cisco MDS 9506 Director, you need two 10-32 ring lugs for each DC power supply.
Installations with three 9513s per rack should include a floor-loading assessment to support 296.75 lbs as pictured.
[Figure: Cisco MDS 9513 front view and rear view with component weights; callouts include Clock Modules and Power Supplies]
System front-fan tray: 18.0 lbs
Fabric card: 5.75 lbs
Four-port 10-Gbps line card: 8.5 lbs
12-port line card: 7.5 lbs
48-port line card: 11.0 lbs
24-port line card: 7.75 lbs
Supervisor-2 line card: 7.25 lbs
Line card blank panels: 0.50 lbs
Installation of the Cisco MDS 9513 Director in a rack requires a mechanical lift to place the chassis in the rack. Make sure you have access to the lift during the installation process. A fully loaded 9513 can weigh about 300 pounds. Installations with three 9513s per rack should include a floor-loading assessment to evaluate the static-load rating of the flooring as part of the site evaluation process. For additional information about floor loading requirements, consult the Telcordia document GR-63-CORE, Network Equipment-Building System (NEBS) Requirements: Physical Protection.
Module Weights
The following components are listed with their weights:
Crossbar switching module 6 lbs (2.7 kg)
48-port 4-Gbps switching module 11.0 lbs (4.99 kg)
24-port 4-Gbps switching module 7.75 lbs (3.52 kg)
12-port 4-Gbps switching module 7.5 lbs (3.40 kg)
4-port 10-Gbps switching module 8.5 lbs (3.86 kg)
32-port FC switching module 9 lbs (4.1 kg)
16-port FC switching module 9 lbs (4.1 kg)
Storage Services Module (SSM) 11 lbs (5 kg)
Advanced Services Module (ASM) 11 lbs (5 kg)
Caching Services Module (CSM) 11.5 lbs (5.2 kg)
IPS-8 10 lbs (4.5 kg)
IPS-4 9 lbs (4.1 kg)
MPS-14/2 10 lbs (4.5 kg)
Supervisor-2 for MDS 9500 Series 7.25 lbs
Supervisor-1 for MDS 9500 Series 9 lbs (4.1 kg)
Supervisor for MDS 9200 Series 9 lbs (4.1 kg)
Crossbar module fan tray 2.25 lbs (1.13 kg)
Module blank panels 0.50 lbs (0.25 kg)
Implementing Cisco Storage Networking Solutions (ICSNS) v3.0 2007 Cisco Systems, Inc.
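As a rough aid to the floor-loading assessment, the module weights listed above can be totaled for a planned configuration. The Python sketch below is illustrative: the configuration is made up, the function name is hypothetical, and only a subset of the listed modules is included. It totals installed modules only, because this section does not give an empty-chassis weight.

```python
from typing import Dict

# Subset of the module weights listed above, in pounds.
MODULE_WEIGHT_LBS: Dict[str, float] = {
    "48-port 4-Gbps switching module": 11.0,
    "24-port 4-Gbps switching module": 7.75,
    "12-port 4-Gbps switching module": 7.5,
    "4-port 10-Gbps switching module": 8.5,
    "Supervisor-2": 7.25,
    "Module blank panel": 0.50,
}


def module_payload_lbs(config: Dict[str, int]) -> float:
    """Total weight of the installed modules (chassis, fans, and power supplies excluded)."""
    return sum(MODULE_WEIGHT_LBS[name] * count for name, count in config.items())


# Hypothetical loadout: two supervisors, eight 48-port cards, three blank panels.
example_config = {
    "Supervisor-2": 2,
    "48-port 4-Gbps switching module": 8,
    "Module blank panel": 3,
}
print(module_payload_lbs(example_config), "lbs of modules")
```

The result still has to be added to the chassis, fan tray, and power supply weights from the hardware installation guide before it is compared with the floor rating.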
Installation Options
Standard telco rack (no side panels):
Not intended for use with the Cisco MDS 9513
Minimum of 6 inches (15.2 cm) of clearance between chassis is recommended
Minimum of 2.5 inches (6.4 cm) of distance between the chassis air vents and any walls is required
The Cisco MDS 9506 and MDS 9509 directors can be installed using the following methods:
In an open EIA rack, using:
  The rack mount kit shipped with the switch.
  The Telco and EIA Shelf Bracket Kit, optional and purchased separately, in addition to the rack mount kit shipped with the switch.
In a perforated or solid-walled EIA cabinet, using:
  The rack mount kit shipped with the switch.
  The Telco and EIA Shelf Bracket Kit, optional and purchased separately, in addition to the rack mount kit shipped with the switch.
In a two-post telco rack, using:
  The rack mount kit shipped with the switch.
  The Telco and EIA Shelf Bracket Kit, optional and purchased separately, in addition to the front brackets shipped with the switch.
The Cisco MDS 9509 Director can also be installed in a four-post nonthreaded cabinet or rack, using the optional 9500 Shelf Bracket Kit.
The Cisco MDS 9513 Director can be installed in solid or perforated walled cabinets, but not in two-post telco racks.
Note The Telco and EIA Shelf Bracket Kit is optional and is not provided with the switch. To order the kit, contact your switch provider.
Note The Telco and EIA Shelf Bracket Kit is not intended for use with a Cisco MDS 9509 Director in a two-post telco rack. The MDS 9513 exceeds telco rack load ratings.
Requirements and recommendations for perforated cabinets are:
The front and rear doors must have at least a 60-percent open-area perforation pattern with at least 15 square inches of open area per rack unit of door height.
The roof should be perforated with at least a 20-percent open-area perforation pattern.
An open or perforated cabinet floor is recommended to enhance cooling.
Requirements and recommendations for solid-walled cabinets are:
A roof-mounted fan tray with bottom-to-top airflow that has a minimum of 500 cfm of airflow exiting the cabinet roof through the fan tray.
Non-perforated (solid and sealed) front and back doors and side panels so that air travels predictably from bottom to top.
A cabinet depth of 36 to 42 inches (91.4 to 106.7 cm) is recommended to allow the doors to close and to provide adequate airflow.
A minimum of 150 square inches (968 sq. cm) of open area must be at the floor air intake of the cabinet.
The lowest piece of equipment should be installed a minimum of 1.75 inches (4.4 cm) above the floor openings to prevent blocking the floor intake.
Requirements and recommendations for telco racks are:
Minimum of 6 inches (15.2 cm) of clearance between chassis is recommended.
Minimum of 2.5 inches (6.4 cm) of distance between the chassis air vents and any walls is required.
Not intended for use with the Cisco MDS 9513 director.
The cabinet or rack must conform to the following:
Standard 19-inch four-post EIA cabinet, or rack, with mounting rails that conform to English universal hole spacing per section 1 of ANSI/EIA-310-D-1992.
Standard two-post telco racks are not intended for use with the 9513.
If mounting the chassis in an open rack (no side panels or doors), ensure the rack meets two requirements:
The minimum width between two front mounting rails must be 17.75 inches (45.1 cm).
The vertical rack space per chassis must be at least:
  For the Cisco MDS 9513 chassis: 24.5 inches (62.2 cm), or 14 RU.
  For the Cisco MDS 9509 chassis: 24.5 inches (62.2 cm), or 14 RU.
  For the Cisco MDS 9506 chassis: 12.25 inches (31.1 cm), or 7 RU.
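The two open-rack requirements above lend themselves to a quick planning check. The following Python sketch is a simplified illustration: the function names and the 42 RU rack in the example are assumptions, one rack unit is taken as 1.75 inches (consistent with 14 RU = 24.5 inches in the list), and any extra height needed for mounting brackets is ignored.

```python
from typing import Dict, List

RU_INCHES = 1.75                 # one rack unit; 14 RU * 1.75 = 24.5 inches
MIN_RAIL_WIDTH_INCHES = 17.75    # minimum width between the front mounting rails

# Minimum vertical space per chassis, in rack units, from the list above.
CHASSIS_RU: Dict[str, int] = {"MDS 9513": 14, "MDS 9509": 14, "MDS 9506": 7}


def required_rack_units(chassis: List[str]) -> int:
    """Total rack units needed for the listed chassis (bracket height not modeled)."""
    return sum(CHASSIS_RU[model] for model in chassis)


def fits_open_rack(rack_ru: int, rail_width_inches: float, chassis: List[str]) -> bool:
    """Check both open-rack requirements: rail width and vertical space."""
    return (rail_width_inches >= MIN_RAIL_WIDTH_INCHES
            and required_rack_units(chassis) <= rack_ru)


plan = ["MDS 9509", "MDS 9506", "MDS 9506"]
print(required_rack_units(plan), "RU =", required_rack_units(plan) * RU_INCHES, "inches")
print(fits_open_rack(42, 17.75, plan))
```

A real plan would also reserve space for clearance between chassis and for cable management, which this check leaves out.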
Note The rack-mount support brackets provided with the Cisco MDS 9513 Director require an additional height of 0.75 inches (1.9 cm). They are required for the installation of the Cisco MDS 9513 Director and cannot be removed.
Note The side rail mount brackets provided with the Cisco MDS 9509 Director require an additional height of 0.75 inches (1.9 cm). They are required only for the installation of the Cisco MDS 9509 Director and can be removed, or left installed, after the front rack mount brackets are securely fastened to the rack-mounting rails.
To install an AC power supply in the Cisco MDS 9513 Director, follow these steps:
Step 1 Ensure that the system (earth) ground connection has been made.
Step 2 If a filler panel is installed, remove the filler panel from the power supply bay by loosening the captive screw.
Step 3 Ensure that the power switch is in the off (0) position on the power supply that is being installed.
Step 4 Grasp the power supply handles, one with each hand. Orient the power supply and align it with the bay.
Note There is a handle at the top rear of the power supply you can also use to tilt the power supply into the bay.
Step 5 Slide the power supply into the power supply bay. Ensure that the power supply is fully seated in the bay.
Step 6 Secure all four 6-32 panel fasteners and tighten to 8 in-lbs.
Plug the power cable into the power supply. Tighten the screw on the cable retention device to ensure the cable cannot be pulled out.
Connect the other end of the power cable to an AC power source.
Turn the power switch to the on (1) position on the power supply.
Verify power supply operation by checking that the power supply LEDs are in the following states:
Input OK: LEDs are green.
Fans OK: LED is green.
Output Fail: LED is off.
Use both hands to install and remove power supplies. Each power supply weighs 34.2 lbs (15.5 kg).
Step 1 Turn the power switch on the power supply to the off (0) position. There is an internal-lock mechanism that prevents you from removing the power supply if it is not set to the off position.
Step 2 Disconnect the power cables from the power source.
Step 3 Loosen the screw on the cable retention device, and disconnect the power cable from the power supply.
Step 4 Loosen all four panel fasteners at the corners of the power supply.
Step 5 Grasp the power supply handles and slide the power supply partially out of the chassis, about 4 to 5 inches.
Step 6 If the power supply is at your waist or chest level, place your other hand underneath the power supply and slide the power supply completely out of the chassis.
Note To avoid damage to the panel fasteners, do not place the power supply down on the perforated ends.
Step 7 Install a filler panel over the opening. Tighten the captive screws if the power supply bay is to remain empty.
Remove the blank power-supply filler plate from the chassis power-supply bay opening by loosening the captive installation screw, if necessary.
Turn the power switch to the off (0) position on the power supply you are installing.
Grasp the power supply handle with one hand. Place your other hand underneath the power supply, as shown in the figure.
Slide the power supply into the power supply bay. Make sure that the power supply is completely seated in the bay.
Tighten the power supply captive installation screw.
Plug in the power cord to the power supply and tighten the screw on the cable retention device.
Turn the power switch to the on (1) position on the power supply you are installing.
Check LED status indicators for proper operation.

Remove the blank power-supply filler plate from the chassis power-supply bay opening by loosening the captive installation screw, if necessary.
Turn the power switch to the off (0) position on the power supply that you are installing.
Grasp the power supply handle with one hand. Place your other hand underneath the power supply, as shown in the figure.
Slide the power supply into the power supply bay. Make sure that the power supply is fully seated in the bay.
Tighten the power supply captive installation screw.
Remove the two screws securing the terminal block cover. Slide the cover off the terminal block.
Attach appropriate lugs to the DC-input wires. The maximum width of a lug is 0.300 inch (7.6 mm). The wire should be sized according to local and national installation requirements.
Use only copper wire.
Connect the DC-input wires to the terminal block in the following order: (1) ground, (2) negative (-), (3) positive (+).
Turn the power switch to the off (0) position on the power supply that is being installed.
Check LED status indicators for proper operation.
For redundant or combined power requirements, the number and type of line cards and supervisor modules determine the amount of power needed by the chassis. If each power supply in the chassis is capable of supplying the total chassis power, then the power supplies can be redundant. If each power supply is unable to supply the total chassis power, then the power supplies are shared, and the loss of one supply results in some of the cards being inoperable.
The MDS 9500 series supports redundant hot swappable power supplies that support AC or DC input voltages. Each power supply is capable of supplying sufficient power to the entire chassis should one fail. The power supplies monitor their output voltage and provide status to the supervisor module. To prevent the unexpected shutdown of an optional module, the power management software only allows a module to power up if adequate power is available. The power supplies can be configured to be redundant or combined. By default, they are configured as redundant so that if one fails, the remaining power supply can still power the entire system. Condition LEDs give visual indications of the installed modules and their operation.
As with the MDS 9509 power supplies, the MDS 9506 Director supports redundant AC hot-swappable power supplies, each of which is capable of supplying sufficient power to the entire chassis should one power supply fail. The power supplies monitor their output voltage and provide status to the supervisor module. Also, they can be configured to be redundant or combined. By default, they are configured as redundant. Condition LEDs are also available on the power supply modules. The 1900-watt supply provides full output capabilities when powered by 220 VAC; however, that output is reduced to 1050 watts when powered by a 110 VAC input. It has a current rating of 15 amps, but a maximum draw of 12 amps under normal conditions.
Ensure that all power to the DC circuit is off.
Ensure that the system (earth) ground connection is made.
Loosen the captive screws on the DC PEM. Pull the PEM part way out of the chassis to provide access to the PEM terminal block screws.
The process for connecting the positive and negative DC cables to the DC PEM with a 10-32 ring lug for each cable is as follows:
1. Identify the positive and negative DC cables and ensure that both are copper and sized according to local and national installation requirements.
2. Strip the cable ends to allow for metal-to-metal contact. Insert each cable into a separate ring lug. Crimp the lugs around the cables.
3. Insert each cable and lug into the appropriate hole in the front of the PEM. Fasten the lugs to the appropriate terminal block screws in the following order: (1) negative (-), (2) positive (+).
4. Secure the cables in place by tightening the terminal block screws.
The Cisco MDS 9216 switch supports dual hot-swappable 845-watt AC power supplies. Each supply is autoranging on the input voltage and can provide sufficient power to the entire chassis should one of them fail. They also monitor their own output voltage and provide status to the system's supervisor module. MDS 9216 power supplies can be configured to be redundant or combined. By default, they are configured as redundant, so that if one fails, the remaining power supply can still power the entire system. The MDS 9216 power supplies are field-replaceable units (FRUs) that can be installed and removed easily from the rear of the chassis using pull handles. They also provide condition LEDs for operational status. The MDS 9216 supports AC voltage inputs only, not DC, ranging from 100 to 240 VAC. The power supplies have a current rating of 15 amps for circuit breakers but draw a maximum of 12 amps at 110 VAC and only 5 amps at 220 VAC.
Ensure that the system (earth) ground connection has been made.
If the power-supply bay has a filler panel, loosen the screws holding it on and remove the panel.
Verify that the power switch is in the off (0) position on the power supply you are installing.
Orient the power supply as shown in the figure. Hold it by the handle and slide the power supply into the chassis power supply bay. Ensure that the power supply is completely seated in the bay.
Tighten the power supply captive screws.
Plug the power cable into the power supply. Tighten the screw on the power cable retainer to ensure that the cable cannot be pulled out.
Connect the other end of the power cable to an AC power source.
Turn the power switch to the on (1) position on the power supply.
Verify power supply operation by checking that the power supply LEDs are in the following states:
Input OK: LED is green.
Fan OK: LED is green.
Output Fail: LED is off.
Note In a system with dual power supplies, connect each power supply to a separate power source. In case of a power source failure, the second source will most likely still be available.
Dual 300-W AC Input Power Supplies Installation in MDS 9100 Fabric Switches
To install a dual 300-W AC input power supply in an MDS 9100 fabric switch, follow these steps:
Step 1 Ensure that the system (earth) ground connection has been made.
Step 2 Make sure the power cord is disconnected before installing the power supply.
Step 3 Verify that the power switch is in the off (0) position on the power supply you are installing.
Step 4 Slide the power supply into the power supply bay. Make sure that the power supply is completely seated in the bay.
Step 5 Tighten the power supply captive screw.
Step 6 Plug in the power cord to the power supply.
Step 7 Connect the other end of the power cord to an AC input power source.
Note Depending on the outlet receptacle on your power distribution unit, you might need the optional jumper power cord to connect the Cisco MDS 9216 switch to your outlet receptacle.
Step 8 Turn the power switch to the on (1) position on the power supply.
Step 9 Verify the power supply's operation by checking that the power supply (P/S) LED in the front panel is green.
Combined mode:
Is nonredundant
Twice the power capacity of the lower-capacity supply
Sufficient power might not be available in case of a power supply failure
System reset if power requirements exceed capacity
Only modules with sufficient power are powered up
If no reset, no modules down but no new modules up
Should not be used for director-class switches
Power reserved for the supervisor and fan assemblies
Power failure triggers Syslog, Call Home, and SNMP trap
Power supplies are configured in redundant mode by default, but they can also be configured in a combined, or nonredundant, mode:
Redundant mode: The chassis uses the power capacity of the lower-capacity power supply so that sufficient power is available in case of a single power supply failure.
Combined mode: The chassis uses twice the power capacity of the lower-capacity power supply. Sufficient power might not be available in case of a power supply failure in this mode. If there is a power supply failure and the real power requirements for the chassis exceed the power capacity of the remaining power supply, the entire system is reset automatically to prevent permanent damage to the power supply.
In both modes, power is reserved for the supervisor and fan assemblies. Each supervisor module has roughly 220 watts in reserve, even if there is only one installed; and the fan module has 210 watts in reserve. In the case of insufficient power, after supervisors and fans are powered, line card modules are given power from the top of the chassis down. After the reboot, only those modules that have sufficient power are powered up. If the real power requirements do not trigger an automatic reset, no module is powered down. Instead, no new module is powered up.
In all cases of power supply failure or removal, the following occur:
A Syslog message is printed
A Call Home message is sent if configured
A Simple Network Management Protocol (SNMP) trap is sent
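The capacity rules above can be expressed as a small Python model. This is a sketch under stated assumptions, not the power management software: the function names and the wattages in the example are hypothetical, power is assumed to be reserved for two supervisors plus the fan module, the lowest slot number is assumed to be at the top of the chassis, and a line card that does not fit in the remaining budget is assumed to be skipped while later cards are still considered.

```python
from typing import List, Tuple

SUPERVISOR_RESERVE_W = 220   # reserved per supervisor slot (from the text)
FAN_RESERVE_W = 210          # reserved for the fan module (from the text)


def available_capacity_w(supply_a_w: int, supply_b_w: int, mode: str) -> int:
    """Usable capacity: the lower-rated supply (redundant) or twice that (combined)."""
    lower = min(supply_a_w, supply_b_w)
    if mode == "redundant":
        return lower
    if mode == "combined":
        return 2 * lower
    raise ValueError("mode must be 'redundant' or 'combined'")


def powered_slots(capacity_w: int, line_cards: List[Tuple[int, int]]) -> List[int]:
    """Return the slots that receive power; line_cards holds (slot, watts) pairs."""
    # Assumption: both supervisor slots are reserved even if only one is installed.
    budget = capacity_w - (2 * SUPERVISOR_RESERVE_W + FAN_RESERVE_W)
    powered: List[int] = []
    for slot, watts in sorted(line_cards):   # assumption: slot 1 is the top slot
        if watts <= budget:
            budget -= watts
            powered.append(slot)
    return powered


# Hypothetical chassis: 3000 W and 2500 W supplies, four 600 W line cards.
cards = [(1, 600), (2, 600), (3, 600), (4, 600)]
for mode in ("redundant", "combined"):
    capacity = available_capacity_w(3000, 2500, mode)
    print(mode, capacity, "W ->", powered_slots(capacity, cards))
```

In this made-up case, redundant mode leaves the card in slot 4 unpowered, while combined mode powers all four cards but gives up protection against a single supply failure.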
Note Combined mode should not be used for director-class switches.
Air Flow
The MDS 9500 Series supports hot-swappable fan modules that are easily installed or removed from the front of the chassis. They provide 85 cfm of airflow per slot with 410 watts of power dissipation per slot. The MDS 9506 has a fan module with 6 fans, the Cisco MDS 9509 has a fan module with 9 fans, and the Cisco MDS 9513 has a fan module with 15 fans.
Sensors on the supervisor module monitor the internal air temperature. If the air temperature exceeds a preset threshold, the environmental monitor displays warning messages.
If one or more fans within the module fail, the Fan Status LED turns red, and the module must be replaced. When all fans are operating properly, the LED is green. If the fan LED is red, the fan assembly might not be seated properly in the chassis. If this happens, remove the fan assembly and reinstall it. If the LED is still red after reinstalling, a failure on the fan assembly has occurred. Fan LED status indication is provided on a per-module basis. If one fan fails, then the module is considered failed.
The switch can continue to run for a maximum of 5 minutes when the fan module is removed, provided the temperature thresholds are not exceeded. In this way, you can swap out a fan module without having to bring the system down. The fan module is designed to be removed and replaced while the system is operating without presenting an electrical hazard or damage to the system, provided the replacement is performed promptly.
Install the fan module in the front chassis cavity with the status LED at the top. Push the fan module to ensure that the power supply connector mates with the chassis. Tighten the captive installation screws. If the switch is powered on, listen for the fans. You should hear them operating immediately.
No automated shutdown sequence is associated with the removal of the crossbar module fan tray. Shutdown is initiated when temperature thresholds for the crossbar modules are exceeded. Replacement of the crossbar module fan tray should be performed promptly.
MDS 9513
Step 1 Hold the fan module so that the Fan Status LED is at the top.
Step 2 Place the fan module in the front chassis cavity so that it rests on the chassis. Lift the fan module up slightly to align the top and bottom chassis guides.
Step 3 Push in the fan module to the chassis until it seats in the backplane and the captive screws make contact with the chassis. The fan module snaps in.
Step 4 If the switch is powered on, listen for the fans. You should hear them operating immediately.
Note If you do not hear the fans, ensure that the fan module is inserted completely in the chassis and the outside surface is flush with the outside surface of the chassis.
Step 5 Verify that the Fan Status LED is green. If the LED is not green, one or more fans are faulty.
Push the button on the top fan-module latch to release the fan module from the midplane. Repeat this on the bottom fan-module latch.
Grasp the fan module with both hands and pull it outward. Rock it gently, if necessary, to unseat the power connector from the backplane.
Pull the fan module clear of the chassis.
Orient the crossbar module fan tray in the chassis by positioning the module in the slot, and then sliding the module carefully into the slot until the fan tray is completely inserted in the chassis.
Tighten the two captive screws on the crossbar module fan tray to 8 in-lb.
Loosen the two captive screws on the fan tray.
Hold the two captive screws and pull the fan tray out of the chassis with both hands. Take one hand and hold the face of the fan tray while supporting it with the other hand.
Pull the fan module clear of the chassis.
In a Cisco MDS 9513 Director, slots 7 and 8 are reserved for the Supervisor-2 modules. In the Cisco MDS 9506 and 9509 Directors, slots 5 and 6 are reserved for the supervisor modules. A supervisor module should be installed before installing any switching modules.
Before installing any modules in the chassis, it is recommended that you install the chassis in the rack. Verify that there is enough clearance to accommodate any cables or interface equipment that you want to connect to the module. Verify that the captive screws are tightened to 8 in-lb on all modules already installed in the chassis. This ensures that the EMI gaskets are fully compressed and maximizes the opening space for the module being installed. If a filler panel is installed, remove the two Phillips pan-head screws from the filler panel and remove the panel. Open completely both ejector levers on the new or replacement module.
Step 4 Step 5
Slide the module carefully into the slot until the EMI gasket along the top edge of the module contacts the module in the slot above it and both ejector levers close to approximately 45 degrees with respect to the front of the module. Grasp the two ejector levers with the thumb and forefinger of each hand, and then press down to create a small 0.040-inch (1-mm) gap between the module's EMI gasket and the module above it. While pressing down, simultaneously close the left and right ejector levers to completely seat the supervisor module or switching module in the backplane connector. The ejector levers are completely closed when they are flush with the front of the module. Tighten the two captive screws on the supervisor module or switching module to 8 in-lb.
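If your tools are calibrated in metric units, the torque and gap values above can be converted with standard factors. The following Python sketch is illustrative only; the function names are hypothetical, and the conversion constants (25.4 mm per inch, approximately 0.112985 N·m per in-lb) are general unit definitions rather than values taken from this guide.

```python
# Standard unit conversion factors (not specific to the MDS 9000).
MM_PER_INCH = 25.4
NM_PER_IN_LB = 0.112984829


def inch_to_mm(inches):
    """Convert a length in inches to millimeters."""
    return inches * MM_PER_INCH


def in_lb_to_nm(in_lb):
    """Convert a torque in inch-pounds to newton-meters."""
    return in_lb * NM_PER_IN_LB


if __name__ == "__main__":
    print(f"0.040 in gap = {inch_to_mm(0.040):.3f} mm")
    print(f"8 in-lb torque = {in_lb_to_nm(8):.2f} N*m")
```

The computed gap of about 1.016 mm agrees with the 1-mm figure given in the step above.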
Step 2
Step 3
Step 4
Connect the management port to the LAN using a Category 5 unshielded twisted-pair (UTP) cable. Connect the supplied RJ-45 to DB-9 female adapter to the computer serial port.
It is recommended that you use the adapter and cable provided with the switch.
Then connect the console cable (a rollover RJ-45 to RJ-45 cable) to the console port and to the RJ-45 to DB-9 adapter at the computer serial port. Configure the terminal emulator program to match the following default port characteristics:
If necessary, copy the contents of the SSM NVRAM to the standby Supervisor-2 module. On the active Supervisor-1 module, run the system switchover command; the switchover powers down the Supervisor-1 module and makes the standby Supervisor-2 module the active supervisor module. Install the other Supervisor-2 module in the chassis. Run the install all command to update the image versions and boot variables.
3. Console port
4. 10/100 Ethernet management port
5. COM1 serial port
6. CompactFlash LED
7. CompactFlash eject button
9. CF1 slot
This supervisor module is installed in the MDS 9500 Series chassis and has the following interfaces:
Status LEDs: Status, System, Active/Standby, and Power Management
Module reset button: Used for a warm start.
Console port: RS-232 (RJ-45) for local command line interface (CLI) management.
10/100 Ethernet interface: Out-of-band (OOB) management access with integrated link and activity LEDs.
COM1 serial port: DB-9 interface is an RS-232 port that you can use to connect to an external serial communication device such as a modem.
CompactFlash LED: This LED is lit when a CompactFlash (CF) card is installed into slot 0.
CompactFlash eject button: Push to eject a CompactFlash card.
CF1: Slot you can use for a CompactFlash card.
The Status LED states are:
Green: OK
Orange: Initializing or over temperature
The System LED states are:
Green: System OK
Orange: Environmental error, incompatible power supply, or redundant clock failure
Red: Major temperature threshold has been exceeded
The Active LED states are:
Green: Active
Orange: Standby
2007 Cisco Systems, Inc. Appendix B: Installation and Configuration Reference AB-35
The Power/Management LED states are:
Green: Good power, that is, sufficient power for all modules
Orange: Not enough power, that is, insufficient power for all modules
Connect the modem to the COM1 serial port with the adapters and cables provided with the accessory kit as follows:
Step 1 Connect the DB-9 serial adapter to the COM1 port.
Step 2 Connect the RJ-45 to DB-25 modem adapter to the modem.
Step 3 Connect the adapters using the RJ-45 to RJ-45 rollover cable (or equivalent crossover cable).
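For checklists or monitoring notes, the supervisor module LED states listed earlier in this topic can be captured in a lookup table. The following is a minimal Python sketch that encodes only the states given in this guide; the names LED_STATES and describe_led are hypothetical, and the fallback message is an assumption for colors that this guide does not list.

```python
# LED meanings transcribed from the supervisor module description in this guide.
# Names are illustrative and not part of any Cisco software.
LED_STATES = {
    "Status": {
        "green": "OK",
        "orange": "Initializing or over temperature",
    },
    "System": {
        "green": "System OK",
        "orange": "Environmental error, incompatible power supply, "
                  "or redundant clock failure",
        "red": "Major temperature threshold has been exceeded",
    },
    "Active": {
        "green": "Active",
        "orange": "Standby",
    },
    "Power/Management": {
        "green": "Good power (sufficient power for all modules)",
        "orange": "Not enough power (insufficient power for all modules)",
    },
}


def describe_led(led, color):
    """Return the documented meaning of an LED color, if this guide lists it."""
    states = LED_STATES[led]
    return states.get(color.lower(), "state not listed in this guide")


if __name__ == "__main__":
    print(describe_led("System", "Red"))
    print(describe_led("Active", "orange"))
```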
The MDS 9216 has two slots:
Slot 1: This slot is reserved for the supervisor module with its integrated 16-port switching module.
Slot 2: This slot can contain an optional 16- or 32-port switching module or a services module such as an 8-port IP Storage Services (IPS) module.
Module Shutdown
Use the poweroff module command to power down a module
Verify status with the show module command
Remove module safely without shutting down entire switch

# conf t
(config)# poweroff module 1
(config)# do show module
Mod  Ports  Module-Type
---  -----  --------------------------
1    16     1/2 Gbps FC Module
2    4      IP Storage Services Module
5    0      Supervisor/Fabric-1
6    0      Supervisor/Fabric-1
Caution
Even though you can hot-swap MDS 9000 modules, it is recommended that you shut down a module before removal.
To shut down any module, use the poweroff module command in config mode:
(config)# poweroff module 2
To verify the status of a module at any time, use the show module command in EXEC mode. To view information on one module only, you can specify a module slot number.
Example
The command show module 1 returns the status information of only the module installed in slot 1.
# show module
Mod  Ports  Module-Type                 Model            Status
---  -----  --------------------------  ---------------  ----------
1    16     1/2 Gbps FC Module          DS-X9016         ok
2    4      IP Storage Services Module                   powered-dn
5    0      Supervisor/Fabric-1         DS-X9530-SF1-K9  active *
6    0      Supervisor/Fabric-1         DS-X9530-SF1-K9  ha-standby
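If you capture show module output to a file, a short script can confirm which modules are powered down before you remove them. The following Python sketch is an illustration, not a Cisco-provided tool. It assumes that columns are separated by two or more spaces, as in the example above, and that a row with only four fields (such as a powered-down module) has a blank Model column. The function names and the sample text are hypothetical, and output from a real switch may be laid out differently.

```python
import re

# Column names taken from the example output in this topic.
FIELDS = ["Mod", "Ports", "Module-Type", "Model", "Status"]

# Sample text modeled on the example above (not captured from a live switch).
SAMPLE = """\
Mod  Ports  Module-Type                 Model            Status
---  -----  --------------------------  ---------------  ----------
1    16     1/2 Gbps FC Module          DS-X9016         ok
2    4      IP Storage Services Module                   powered-dn
5    0      Supervisor/Fabric-1         DS-X9530-SF1-K9  active *
6    0      Supervisor/Fabric-1         DS-X9530-SF1-K9  ha-standby
"""


def parse_show_module(text):
    """Parse module rows into dictionaries keyed by column name."""
    modules = []
    for line in text.splitlines():
        line = line.strip()
        # Data rows start with the slot number; skip prompt, header, and ruler.
        if not line or not line[0].isdigit():
            continue
        parts = re.split(r"\s{2,}", line)
        if len(parts) == 4:
            # Assumption: the missing field is the Model column.
            parts.insert(3, "")
        if len(parts) != len(FIELDS):
            continue  # row does not match the assumed layout
        modules.append(dict(zip(FIELDS, parts)))
    return modules


def powered_down(modules):
    """Return the slot numbers of modules whose status is powered-dn."""
    return [int(m["Mod"]) for m in modules if m["Status"] == "powered-dn"]


if __name__ == "__main__":
    print("Powered-down slots:", powered_down(parse_show_module(SAMPLE)))
```

Splitting on runs of two or more spaces keeps multiword values such as "1/2 Gbps FC Module" and "active *" intact, at the cost of relying on the column spacing shown in the example.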
Summary
This topic summarizes the key points that were discussed in this lesson.
Installation guidelines include recommendations for evaluating site preparedness, rack hardware, and power requirements.
MDS 9000 switches can be installed in a standard telco rack or cabinets with solid panels. The MDS 9513 cannot be used with a telco rack.
Power supply installation and configuration should be carefully considered to ensure high availability.
MDS 9000 fan modules are hot swappable and provide for easy installation and replacement.
Install at least one supervisor module before installing any line card modules, and use the poweroff module command to shut down individual line cards prior to removal.