This FREE Web-based Fibre Channel Theory Fundamentals course was designed to act as a level set for new SAN adopters and to serve as a prerequisite for Brocade courseware. It is presented as a FREE service offering for all.
Course Objectives
After completing this course, attendees should be able to:
Describe key reasons, benefits and components related to Fibre Channel (FC) Storage Area Networks (SANs)
Identify FC protocol layers and key related tasks & components
List key FC topologies, terminology, and addresses
Describe FC services and expected behaviors
Summarize and state relevance of FC theory fundamentals learned and know where to find additional FC information
Class Agenda
Introduction
Why Fibre Channel?
Fibre Channel layers and related components
Fibre Channel topologies, terminology and addresses
Expected Fibre Channel behaviors
Summary
Objectives summary:
Introduction objectives include setting expectations, sharing best practices and providing links to additional resources.
Why Fibre Channel? objectives include reasons for the existence of the FC protocol, FC markets, and components related to FC SANs.
FC layers and related component objectives include discussions of the FC layers and the components and knowledge related to each layer. For example, FC-0 covers feeds and speeds, so distance, cable, and GBIC/SFP information is incorporated. FC-1 incorporates ordered sets and link control information, and FC-2 incorporates packaging (exchanges, sequences and frames), COS and flow control.
FC topologies, terminology and address objectives include a discussion of FC topologies, terminology, and FC addressing: WWNs, Fibre Channel addresses called port identifiers (PIDs), and Well-Known addresses.
Expected FC behavior objectives include FC communication methodology, services and expected interactive behaviors.
Summary objectives include a review of FC theory information presented during this course.
Best Practices
See Course Navigation Instructions to learn how to move around presentations; please also notice and implement the best sound recommendations
Download and print *.pdfs of PowerPoint presentations and use these to take notes as you listen
Download and print course resources such as the FC Recommended Reading List and SAN Glossary files
Optionally view, download and/or print additional presentation resources as time and interest permit
Set your own start and stop times, but be consistent until course material is completed; please schedule regular breaks while taking this online course
Most lectures are recorded in small time blocks; you should be able to complete each in less than one hour
2003 Brocade Communications Systems, Incorporated. Revision0.1_FC101_2003
Brocade Certifications
Brocade Certified Fabric Professional (BCFP)
This level of certification indicates you have mastered all of the basics of the Brocade SilkWorm switch and are knowledgeable in Fibre Channel theory.
Brocade Certified SAN Designer (BCSD) This level of certification indicates that you have mastered the concepts and intricacies of building a SAN from the basic components through the integration of industry applications and state-of-the-art storage components.
Brocade Certified SAN Manager (BCSM) This level of certification indicates that you have a detailed understanding of administering Brocade SilkWorm switches and managing aspects of a SAN.
The BCSD exam can be passed with SAN experience and extensive knowledge of Brocade sponsored books and white papers. Go to www.brocade.com and follow education links to certification for more information.
Benefits
A 10% discount off the listed price on all future Brocade training classes delivered at a Brocade facility (excludes non-Brocade facilities). The 10% discount applies to the listed price only and may not be used in conjunction with other discounts at non-Brocade facilities
Industry recognition as a Brocade Certified Fabric Professional or a Brocade Certified SAN Manager
A skill set that could translate into a better position in your organization
An additional differentiator that can be used by potential employers
Access to the certification logo and a set of usage guidelines
A certificate of completion of the program and achievement award
Employer Benefits
A direct return on the investment in your training
A watermark for determining future training needs
Improved customer support and satisfaction
A greater credibility with the customer base
Choose the BCFP path if you are part of Post-Sales Support, Repair, or Maintenance Services OR
Choose the BCSM path if you are part of Post-Sales Support, Maintenance, or a SAN Manager (BCFP is considered a prerequisite) OR
SAN320
AFS160 (Web)
AFS212
AFS156 (Web)
CSM260
SFO200
SFO101 (Web)
In the slide above, Baseline Knowledge refers to previous experience with SANs and/or equivalent work experience associated with SCSI storage and Local Area Networks (LAN). It also includes this free WBT FC Theory Fundamentals course. Additional information, including white papers helpful to pass the BCSD exam, can be found at www.brocade.com under the Brocade Connect tab. Brocade Connect registration requires a Brocade switch serial number.
Brief course descriptions:
AFS 200 provides the student with a thorough understanding of the Brocade SilkWorm family of Fibre Channel switches.
AFS 300 focuses on techniques needed by the second level and higher support engineer and lead Storage Area Network (SAN) administrator to troubleshoot Brocade SANs.
CFP 260 is an accelerated class that combines both AFS 200 and AFS 300 certification specific information.
SAN 320 provides students with knowledge and experience about various ways to manage Brocade SANs including Brocade Fabric Manager, Brocade Fabric Watch and SNMP.
AFS 212 provides students with knowledge and experience with the Brocade SilkWorm 12000.
SFO 200 provides students with the technical skills required to plan and implement security in Brocade 1 Gb and/or 2 Gb Fabrics.
AFS 160 is an ONLINE course designed to enable students to identify and understand the following Brocade SilkWorm 12000 switch information: the key hardware and software components; basic deployment tasks; and the Brocade Advanced Fabric Services software that is part of the Fabric OS that runs on the SilkWorm 12000.
AFS 156 is an ONLINE course that enables familiarity with Brocade Advanced Performance Monitor.
SFO 101 is an ONLINE web course that provides students with an overview of Brocade Secure Fabric OS.
CSM 260 is an accelerated class designed to help students pass the BCSM exam. It combines the SAN 320 and SFO 200 information needed to pass the BCSM certification test.
Additional course information can be found at www.brocade.com education.
Virtual University Enterprises (VUE) is our chosen test vendor. They operate over 2800 testing centers worldwide. To register for an exam or locate a testing center nearest to you:
Visit http://www.vue.com/brocade
Call 866-361-5817 toll free in North America
Visit http://www.vue.com/contact/brocade_numbers.html for other contact numbers worldwide (some locations may not have toll free numbers)
The exam cost is $150 US for each attempt. No student may take the exam more than 2 times in a two-week period. VUE accepts many of the major world currencies. All examinees are required to accept a non-disclosure agreement. This agreement means the examinee will not discuss or disclose any of the questions or exam contents. Failure to comply with this agreement may result in forfeiture of certification status and benefits.
Additional Information
General course resources for all presentations can be found under syllabus in the Fibre Channel Theory Fundamentals Resource & Reference Material section. General course resources include:
Fibre Channel Recommended Reading & Resource List
SAN Glossary
Most presentations in this course will have an Additional Information slide at the end with a list of related resources
Additional www.brocade.com resources for this introductory module include:
Course Catalog: follow education link to Course Catalog for course descriptions and schedules
Certification: follow education link to Certification for additional certification and test information
Brocade Connect: additional information, including white papers helpful to pass the BCSD exam, can be found from the Brocade Connect tab; Brocade Connect registration requires a Brocade switch serial number
Brocade Connect will ask you to provide a Brocade switch serial number when registering.
In this presentation we will investigate and exemplify Fibre Channel's purpose and role in today's IT infrastructure.
Objectives
After completing this module, attendees should be able to:
Identify reasons the FC protocol exists
Identify benefits of FC SANs
Identify some components related to FC SANs
Topics
Identify reasons and markets for FC protocol
Storage access history and SAN introduction What is Fibre Channel? Why FC SANs? What is a FC SAN?
Identify some components related to FC SANs Embedded throughout this presentation we will discuss FC standards and reasons for the success of FC SANs
Storage Access
History
Attaching storage to non-mainframe servers during the 1970s and 1980s was straightforward:
Storage was directly attached to the server
Networks were avoided, to ensure best performance and reliability
To enhance performance, a parallel interface with a limited number of devices was used
Result: A high-speed channel from server to storage
The primary external interface in the early days of external storage was the Small Computer System Interface (SCSI), a bus architecture with dedicated parallel cabling between servers and storage devices. It is an open standard that has been enhanced over the years to support increases in device speed and functionality. By providing a dedicated physical channel, high levels of reliability could be ensured during data transfers between servers and storage. Storage-server connections must have high levels of reliability: if there are any glitches in a server-storage transfer, valuable data is compromised permanently. For this reason, server-storage connections traditionally avoided networks, which were seen as providing insufficient levels of confidence.
Storage Access
Storage Area Networks (SANs)
Over time, storage-server connection requirements have changed:
Network-level flexibility with channel-like performance and reliability
Leverage SCSI command set over emerging serial interfaces
Extend over a much larger geographic area
These are the origins of the Storage Area Network
We can now see that several factors contributed to the rise of Storage Area Networks:
For a variety of reasons (business mergers, introducing new technologies, explosive data growth), the number of servers and storage devices that intercommunicate has risen rapidly.
The flexibility required for server-storage access has reached network-like levels, but with a need for channel-like reliability and performance.
The various SCSI committees tried to keep up with the exploding storage market, and had success in maintaining a rich set of device commands.
The SCSI driver is usually implemented to interact more efficiently with an operating system than the IP stack does, which makes it better suited to handling block data transfers.
Newer serial-based technologies (Ethernet, OCR, etc.) have seen more rapid improvements in performance than SCSI-style parallel buses.
SAN Technology
Fibre Channel
The dominant technology used in modern Storage Area Networks is Fibre Channel
An open standard (INCITS T11 committee, accredited by ANSI)
Designed to emulate a channel's large-block transfer behavior while extending distance and allowing many-to-many connectivity
Connectivity: Thousands of devices per fabric (network)
Performance: Current speeds are 1 and 2 Gbit/sec (100 and 200 MBytes/sec), with 10 Gbit/sec (1 GByte/sec) coming; our focus is 2 Gbit/sec
Initiator arbitrates for access before transmitting (ensures channel-like access to target)
All SCSI commands and user data are sent in Fibre Channel frames carrying payloads of up to 2112 bytes
The above is a brief overview of the Fibre Channel protocol. More details about Fibre Channel are available in Chapter 2 of Building SANs with Brocade Fabric Switches.
Fibre Channel Highlights:
A standard: An ANSI standard providing flexible serial data transport at long distances for Storage Area and System Area Networks; ratified as an ANSI standard in 1994 and now also an ISO/IEC standard
High performance and speed: Hardware-based transport mechanism for high performance; 1, 2, 4, 10 Gb/s speeds
Low latency: Less than 2 microseconds of latency from input port to output port of an FC switch
Long distance: Up to 10 km (longer with extenders); can be extended non-natively over ATM up to 3000 km
Robust data integrity: Uses IBM's 8B/10B encoding scheme for robust integrity, and FC specifies a bit error rate (BER) of 10^-12. A BER of 10^-12 means that, on average, one bit is in error out of every 1,000,000,000,000 bits transmitted
Large connectivity:
- Per the standard, Fibre Channel allows a theoretical 16M devices to be connected to one Fabric
- Support for multiple physical media types: Copper, Optical Fibre (multimode and single mode) and mixed media
- Support for multiple protocols: SCSI, IP, VIA, Ficon, etc. and mixed protocols
- Support for multiple topologies: Point-to-Point, Switched, Loop and mixed topologies
- Heterogeneous interconnect scheme for computing and peripheral devices
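The speed and BER figures above can be checked with a little arithmetic. The following is a minimal sketch, assuming the 1.0625 gigabaud line rate that this course quotes for 1 Gbit/sec Fibre Channel and a BER of 10^-12; the function names are illustrative only and the result ignores frame overhead.

```python
# Assumed inputs: 1.0625 Gbaud line rate and a BER of 1e-12 (both quoted in this course).
LINE_RATE_BAUD = 1.0625e9
BER = 1e-12


def payload_rate_mbytes(line_rate_baud: float) -> float:
    """8B/10B sends 10 line bits for every 8 data bits, so 80% of line bits carry data."""
    data_bits_per_sec = line_rate_baud * 8 / 10
    return data_bits_per_sec / 8 / 1e6  # bits -> bytes -> megabytes


def seconds_between_bit_errors(line_rate_baud: float, ber: float) -> float:
    """Average time between single-bit errors on a fully loaded link."""
    return 1 / (line_rate_baud * ber)


if __name__ == "__main__":
    print(f"Data rate: {payload_rate_mbytes(LINE_RATE_BAUD):.2f} MBytes/sec")
    print(f"Mean time between bit errors: "
          f"{seconds_between_bit_errors(LINE_RATE_BAUD, BER) / 60:.1f} minutes")
```

Under these assumptions the data rate works out slightly above the nominal 100 MBytes/sec, and a saturated link would see roughly one bit error every quarter of an hour.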
Fibre Channel
- Hybrid Transport System -
Fibre Channel combines the best of both worlds: It is a channel transport that shares many of the characteristics of an I/O bus (e.g. SCSI). This means that hosts and applications see the disk devices as locally attached storage. It also incorporates the best of the networking world, as Fibre Channel allows multiple protocol support, such as SCSI, IP, Ficon, BB, and others. The SAN can be managed by typical network management applications, e.g. HP OpenView. Fibre Channel also allows a heterogeneous set of devices to participate.
LAN Servers
Fabric
Fabric is a well-designed NETWORK of highly intelligent Fibre Channel switches which provides enterprise-class scalability, performance, manageability and availability.
Storage Subsystems
Why give storage its own network (Fabric)? Answer: A good LAN does not make a good SAN!
LANs use different protocols, different tools
LANs are physically insecure at the desktop and potentially vulnerable at the server
LANs seldom have spare capacity for storage networking
LANs are tuned to favor short, bursty user transmissions versus large, continuous data transfers
While Local Area Networks (LANs) may do a good job of supporting user access to servers, they are less than ideal for providing servers with access to storage systems. For one thing, user workstations and storage systems use different network protocols. LAN hardware and operating systems are geared toward user traffic; they are designed for a fast user response to messaging requests. By definition, user networks have to go to where the users are, and often this means that the servers may also be located all over the enterprise. With a SAN, the storage units can be secured separately from the servers and totally apart from the user network. Most enterprises are in an ongoing struggle to maintain adequate LAN performance in the face of the rapid increase in user utilization rates. For them, it would be asking too much to also provide ongoing access to storage systems. It is better to move all the storage traffic to the SAN and give the LAN room to breathe. Finally, it should be noted that user networks frequently employ broadcasts to coordinate access activities. If storage devices are attached to the main network, they are needlessly included in such broadcasts. The intermittent flurries of user broadcasts can be disruptive to bulk data transfers.
(Note: WAN = Wide Area Network, long distance large network)
Here is a formal definition for SAN from the Storage Networking Industry Association (SNIA): "A network whose primary purpose is the transfer of data between computer systems and storage elements and among storage elements. Abbreviated SAN. A SAN consists of a communication infrastructure, which provides physical connections, and a management layer, which organizes the connections, storage elements, and computer systems so that data transfer is secure and robust." --SNIA Technical Dictionary, copyright Storage Networking
The high-speed, low-delay connections offered by Fibre Channel make it ideal for a variety of data-intensive applications. Please note that Fibre Channel is not a SAN-only technology. Fibre Channel has been used for networking in movie and TV companies, where post-production work moves video images between servers and editing stations. Highlights of some of the Fibre Channel features:
High Speed: Currently at 2 Gb/sec, moving to 10 Gb/sec. Some Fibre Channel vendors will skip the 4 Gbps speed generation and will go directly to 10 Gbps. Note that the Ethernet 10 Gb group and the Fibre Channel 10 Gb group have a joint working group and both technologies will be released simultaneously. Networking technologies are synergistic, not competing against each other. Furthermore, ATM (Asynchronous Transfer Mode) is also moving towards 10 Gbps speeds.
Long Distance: Fibre Channel reaches 10 kilometers by the standard specification. Today, some Fibre Channel vendors have found solutions to implement long distance SANs (up to 3,000 kilometers using ATM as a WAN transport) without breaking the Fibre Channel standard.
Support for up to 256 Upper Layer Protocols (ULPs): In this book, we will focus on SCSI and TCP/IP protocol support only. But be aware that Fibre Channel has the capability to support many other storage, network, video and clustering protocols as well. This makes Fibre Channel easier for IT professionals to understand and support, as they do not need to learn a new storage or network command set. The last thing an IT professional wants to hear is "sorry, you need to throw away what you know and learn a new protocol/language again!"
What is a FC SAN?
Open Systems Model for Networked Storage
Enhanced Storage Management: Flexibility to add or reconfigure storage as needed without downtime
Independent Scaling of CPU and Storage capacity: De-couples servers and storage so that either can be scaled separately
Easy Migration: Current applications run without software changes; incremental deployment allows flexible adoption
A Storage Area Network (SAN) is an enabling infrastructure that provides a network class of benefits for IT data centers:
Data can become a unified, virtual resource
Legacy systems can be seamlessly integrated
SANs provide the flexibility for deploying various enterprise IT applications using a single infrastructure:
SAN Backup
Storage Consolidation
Remote Data Replication
High Availability
Fast Server Failover
Node
Node Port (N_Port) has no knowledge of the path. This relieves the N_Port of having local routing tables. This translates to easy
E_Port Switch
Fibre Channel provides an interesting network scenario where network clients have very little idea about what is going on inside the network. They do not know or care how connections are routed; they just know what they are connected to across the Fabric. This implementation means we can now offload many requirements from the N_Port CPUs, thus making things simpler. FC devices use FC protocols to connect and communicate. These protocols often represent Fabric services and, as the name implies, they reside in the Fabric. There is an initialization process, for example, that occurs when a device connects to a Fabric port. The port that the device attaches to will become the type of communication port that the attached device needs. In this picture we see connecting E_Ports, or Expansion ports. When switches connect to each other they exchange link parameters (ELP), letting them know what is attached at the other end. FC has a switch-to-switch protocol called Internal Link Services (ILS) with a rich set of commands that allow switches to exchange information. We also see an N_Port (Node Port) attached to an F_Port (Fabric Port). Node ports need Fabric ports to communicate. From the Fabric perspective, an F_Port implies an N_Port is at the other end of the cable. Fabric access (called Fabric logins or FLOGIs) and query methodologies represent Fabric services that are essentially hidden from end ports: the node ports do not need to keep track of all Fabric service information. The Node ports absorb only the Fabric service information they need to build device lists and communicate across the Fabric. In this course, we'll examine the internal operations of a switched Fabric, both in terms of its interaction with the various elements of the Fabric and in terms of the fairly rich set of network capabilities and services it offers to the attached nodes.
SAN Fabric
Fabric is a term used to describe a generic switching environment. It can consist of one or more interconnected switches (domains)
One Fibre Channel Switch = One Fabric Domain
Maximum of 239 domains in a single Fabric
Fabric communication is based on 24-bit address space partitioning
Special Agent: Principal Switch
Special Delivery: Class F Service
24-Bit Address Space: Domain ID (8 bits) | Area ID (8 bits) | ALPA ID (8 bits)
Fabric
In Fibre Channel, the Fabric is normally an entity that distributes address identifiers to the N_Ports. In general, the N_Ports do not need to be aware of how the Fabric manages address identifier allocation. A Domain is the highest logical construct in the hierarchy of Port Identifiers. Areas are the intermediate level logical construct and ALPAs are the lowest level logical construct in the hierarchy. To facilitate Fabric communication address management, a partitioning scheme has been developed. The 24-bit address is divided into three 8-bit fields:
The upper 8 bits are the Domain
The middle 8 bits are the Area
The lower 8 bits are the ALPA (Arbitrated Loop Physical Address)
The domain is used to identify a Fibre Channel Switch. When a frame is received, it is routed to the correct domain (Switch). Once the frame reaches the correct domain, it is routed to the correct area and finally, the frame is routed to the correct port. Fibre Channel has the concept of a Principal Switch. The function of the Principal Switch is to simplify the problem of determining precedence between Fibre Channel Switches without adding a separate external Fabric management software component. The Principal Switch facilitates the bring-up of the Fabric, acts as controller of domains (ensures each Switch coming into a Fabric has a unique domain) and handles time services when available. Class F is the communication class of service used between switches. It is a special internal communication service within a multi-switch Fabric. The primary purpose of Class F is Fabric management and operation.
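The 24-bit address partitioning described above can be sketched in a few lines. This is an illustrative example only, not switch firmware: the names PortId, split_pid and join_pid are hypothetical, and the sample address is made up.

```python
from typing import NamedTuple


class PortId(NamedTuple):
    """The three 8-bit fields of a 24-bit Fibre Channel port identifier (PID)."""
    domain: int  # upper 8 bits: identifies the switch
    area: int    # middle 8 bits
    alpa: int    # lower 8 bits: Arbitrated Loop Physical Address


def split_pid(pid: int) -> PortId:
    """Split a 24-bit PID into its Domain, Area and ALPA fields."""
    if not 0 <= pid <= 0xFFFFFF:
        raise ValueError("a PID must fit in 24 bits")
    return PortId((pid >> 16) & 0xFF, (pid >> 8) & 0xFF, pid & 0xFF)


def join_pid(domain: int, area: int, alpa: int) -> int:
    """Combine three 8-bit fields back into a 24-bit PID."""
    for field in (domain, area, alpa):
        if not 0 <= field <= 0xFF:
            raise ValueError("each field must fit in 8 bits")
    return (domain << 16) | (area << 8) | alpa


if __name__ == "__main__":
    pid = 0x010F00  # hypothetical address: domain 1, area 15, ALPA 0
    print(split_pid(pid))
    print(hex(join_pid(1, 15, 0)))
```

A frame addressed to this sample PID would first be routed to the switch with domain 1, then to area 15 on that switch, and finally to the port or loop device selected by the ALPA field.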
FC SAN Components
Host Bus Adapter in Server
Note: Cables & GBIC (SFP) will be discussed in next section
FC SANs can also include other interconnecting devices like HUBs and Bridges
Cable GBIC/SFP
A SAN is a mass storage infrastructure that frees up the LAN or WAN, while it operates faster and does
FC SAN Fabrics are high-performance networks based on Fibre Channel, and dedicated to storage. They provide any-to-any connectivity for the resources in the SAN. Any server can potentially talk to any storage device, and the SAN Fabric also enables communication between storage and SAN devices (switches, hubs, routers, bridges). SANs employ fiber optic and copper connections to create dedicated networks for servers and their storage systems. More on these later.
Servers/HBAs (Host Bus Adapters) - HBAs are similar to the Network Interface Card (NIC) that is used to connect devices to a LAN
UNIX
Windows
Linux
Storage
Disks (RAID/JBOD) - RAID stands for Redundant Array of Independent Disks. These arrays look like a single disk volume to the server and they are fault-tolerant through either mirroring or parity-checking. They also typically have their own management software just for that RAID array. JBOD stands for Just a Bunch of Disks; these usually plug into an enclosure that has the connection to the SAN. These disks have no protection against failure.
Tape
Interconnecting Devices
Hubs/Switches
Bridges/FC Extenders
Software - SAN Management Applications
Telnet
Front Panels/Serial Connections - depends on the device being managed
Web Browser
Fabric Manager (FM)
SNMP - e.g. HP OpenView, CMNS, CA Unicenter, Adventnet
Application Programming Interface (API)
3rd party applications that use SNMP and/or API
Fibre Channel Switches (named Switch Elements in Fibre Channel terminology) are intelligent devices able to interconnect individual nodes, devices, and even other switch elements. At the physical layer, switch intelligence means that a switch is very much plug-and-play: it can detect whatever type of device is plugged in, provided the proper GBIC/SFP is installed. The reason for the small-switch versus bigger-switch approach is to enable a pay-as-you-grow implementation model. Most IT organizations can start by deploying a small SAN island using two 8-port switches; once they feel comfortable with the new technology, they can buy bigger switches (for example, 16-port). The 64-port and 128-port core switches are mainly used for connecting many SAN islands across an organization.
Fabric port types:
F_Port: For direct connection
E_Port: For switch connection
FL_Port: For loop/hub connection
FC header/protocol support:
Hardware-based cut-through frame routing to keep latency small
Link level flow control to prevent loss of frames
Link level error detection/recovery for high application performance
Small to large frame sizes to meet different application throughputs/latencies
Embedded services: Name Service; Alias Service; Management Service and more
Login: Establishment of operating characteristics
Automatic address assignment (24 bits wide)
Integrated SNMP and MIB-compliant management for remote management (as well as Telnet) Configuration management tools and utilization monitoring (web-based graphical user interface)
Automated port isolation and device fail-over for fault-tolerance, along with N+1 hot-swappable components
NL_Port
FL_Port
N_Port
F_Port
1 node to 1 port
N_Port
Switch Ports:
E_Port -- expansion port, connects two switches to make a fabric
F_Port -- a Fabric port to which an N_Port attaches
FL_Port -- a Fabric Loop port to which a Loop attaches
Device Ports:
N_Port -- port designator for direct fabric attached devices
NL_Port -- device that is attached to the loop (e.g., host, storage)
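The port pairings listed above can be expressed as a small lookup. This is a minimal sketch covering only the pairings named in this section (E_Port to E_Port, F_Port to N_Port, FL_Port to NL_Port); the names VALID_LINKS and can_link are hypothetical, and other topologies mentioned in this course, such as point-to-point, are deliberately left out.

```python
# Each frozenset holds the two port types found at the ends of one valid link.
# An E_Port to E_Port link collapses to a one-element set.
VALID_LINKS = {
    frozenset({"E_Port"}),             # switch to switch
    frozenset({"F_Port", "N_Port"}),   # switch to directly attached device
    frozenset({"FL_Port", "NL_Port"}), # switch to loop-attached device
}


def can_link(port_a: str, port_b: str) -> bool:
    """Return True if the two port types form one of the pairings listed above."""
    return frozenset({port_a, port_b}) in VALID_LINKS


if __name__ == "__main__":
    for pair in [("N_Port", "F_Port"), ("E_Port", "E_Port"), ("N_Port", "E_Port")]:
        print(pair, can_link(*pair))
```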
Interconnect Devices
- Hubs Hub
Hubs utilize arbitrated loop topology. The hub ties circuits through each port, joining the last port's Tx circuit to the first port's Rx circuit, thus sharing the backplane's throughput. Ports must be able to recover valid clocked information at the rate of 1.0625 gigabaud. There is no addressing scheme between the Hub and the connected device. Hubs possess the ability to automatically bypass ports to allow ease of connectivity. Hubs follow the FC-AL and FC-AL-2 Fibre Channel standards. As long as they do not supersede the Fibre Channel standards, vendors build their Hubs with differing management functionality, port density, signal processing, and/or port types. The Fibre Channel Hub is what connects devices together to create an Arbitrated Loop. Loops support a maximum of 126 devices and have a maximum bandwidth of 100 MB per second on the whole loop. Every device on the loop must share that bandwidth and access. Only two devices can be communicating on the loop at any point in time. With this in mind, Hubs are a way for devices that support FC Loop ONLY to become part of the Fabric. As members of the Fabric/SAN, loops are multiple independent networks with limited connectivity.
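To make the shared-bandwidth point concrete, here is a rough arithmetic sketch. It assumes every active device gets an equal share of the loop over time, which is a simplification: only two devices talk at any instant, so this is a long-run average and not a guarantee. The function name is hypothetical.

```python
MAX_LOOP_DEVICES = 126        # loop device limit quoted above
LOOP_BANDWIDTH_MB_S = 100.0   # whole-loop bandwidth quoted above


def average_share_mb_s(active_devices: int) -> float:
    """Long-run average bandwidth per device, assuming equal access to the loop."""
    if not 1 <= active_devices <= MAX_LOOP_DEVICES:
        raise ValueError(f"a loop supports 1 to {MAX_LOOP_DEVICES} devices")
    return LOOP_BANDWIDTH_MB_S / active_devices


if __name__ == "__main__":
    for n in (2, 10, 126):
        print(f"{n:3d} devices: {average_share_mb_s(n):6.2f} MB/sec each on average")
```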
Unmanaged Hubs - These hubs are usually used in small environments since they are simple, low cost and possess an entry-level interconnection scheme. They will generally have bypassing technology as long as the signaling thresholds are met. They will usually provide simplistic LED functionality.
Managed Hubs - These hubs introduce another level of intelligence for manageability. Basic functionality can now be managed via TCP/IP, e.g., Web, Telnet, SNMP. There are two levels to this functionality: first, hardware additions to the hub; second, the software to run the new hardware. Managed Hubs also provide the ability to recognize ordered sets, CRC error detection, link errors, invalid transmission words, most active AL_PAs, loop status, topology mappings, and event tracking.
Interconnect Devices
- Bridges
FC-SCSI Router/Bridge
Maps SCSI devices to units of a single Arbitrated Loop Physical Address (AL_PA)
Configurable mapping table
SNMP management
SCSI Only
Fibre Channel-to-SCSI Bridges (also known as routers) allow the connection of non-Fibre Channel devices to the SAN. Typically used to connect SCSI tape devices to the SAN, a bridge can also be used to connect servers or workstations to the SAN in initiator mode. Bridges provide an interface between Fibre Channel and SCSI, connecting Fibre Channel links to devices without Fibre Channel ports. FC-based tape libraries are not yet prevalent, so bridges are used to bring SCSI-based tape libraries into Fibre Channel SANs. Removing the tape device from an application server and attaching it to the bridge allows for faster, more accurate backups. It also allows the tape device to be shared by other servers across the SAN rather than being a dedicated resource. The need for these systems is declining with the development of SAN-capable devices.
Heterogeneous Attachment
Storage
RAID: Redundant Array of Independent Disks
JBOD: Just a Bunch of Disks
Tape: Primary use for Backup and Recovery
Two terms often heard in discussions of SAN storage subsystem are RAID and JBOD. RAID, or Redundant Array of Inexpensive Disks, is a disk clustering technology that has been available on larger systems for many years. Depending on how we configure the array, we can have the data mirrored (duplicate copies on separate drives), striped (interleaved across several drives), or parity protected (extra data written to identify errors). These can be used in combination to deliver the balance of performance and reliability that the user requires. Because of the high capacity (and cost) of RAID storage systems, they are good candidates for sharing across a SAN. Although we can certainly have a SAN without RAID, these two technologies are often used hand in hand. JBOD stands for Just a Bunch of Disks and is a counterpart of RAID. It is a collection of disks that share a common connection to the server, but dont include the mirroring, striping, or parity facilities that RAID systems do, but these capabilities are available with host-based software. JBOD represents the simplest and least expensive "raw storage" option. The individual disks are arranged in a simple cabinet and are available to its servers as a group of independently accessible disks. They have little or no buffering (or cache memory) or an intelligent controller that enables advanced features. JBOD has a limited growth capacity and it usually scales to less than 1 terabyte (TB) per cabinet. Since they have no inherent intelligence, parity checking or data striping, there is no protection in the event of a drive failure. Tape helps you to store data sequentially on a magnetic tape cartridge. Generally used to store large amounts of data for backup purposes. Tape storage would be a common method of saving information in the event of a drive failure.
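The idea behind parity protection can be illustrated with a toy example. This sketch assumes simple XOR parity across equal-sized blocks, as used in single-parity RAID layouts; the course text does not name a specific RAID level, and the function name and sample data are made up for illustration.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Three hypothetical data blocks, one per drive.
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = xor_blocks(data)  # written to a fourth drive

    # Simulate losing the second drive: XOR the survivors with the parity block.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block reproduces the missing block, which is why a single-parity array can tolerate the loss of one drive.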
Storage Devices
RAID controllers act as the interface between the actual disks and the fabric. They handle all tasks necessary to present the disks in the RAID array to the fabric, including LUN masking, cache, port initialization and communication management.
Handles I/O and control requests
Copper/optical media support (may be dual-port cards)
Every device connected to a SAN requires a Fibre Channel interface or adapter board. Fibre Channel Host Bus Adapters (HBAs) are available for a variety of bus types including PCI and SBus. Some leading adapters on the market today provide the following features:
Plug-and-play flexibility and copper/optical connector support
SNMP (Simple Network Management Protocol) and MIB (Management Information Base) support
Support for both Arbitrated Loop and Switched Fabric topologies
Connectors = Ports
Note: By comparison, a normal Ethernet network card handles only the first three operations; the rest of the functionality requires server CPU handling.
HBAs
Yesterday
Application Layer
SCSI Driver
SCSI Adapter Driver
INTERNAL I/O BUS
This slide shows the layers involved in servicing a SCSI command, from the application layer down to the SCSI adapter card. The view illustrates a parallel SCSI system. How does this work?
The application layer depicts Netscape, payroll, and inventory applications.
The kernel OS system call interface and file subsystem layers process application layer communication.
The disk, tape or CD-ROM device drivers are responsible for handling specific requests (i.e., opens, reads, writes, closes) from one or more applications. These drivers put the necessary device-specific handlers on the communication going down the chain.
For SCSI communication in this picture, the kernel OS SCSI driver is responsible for accepting generic I/O requests from upper layers in the operating system's kernel and converting them to the appropriate device-specific SCSI Command Descriptor Blocks. Note: A SCSI Command Descriptor Block (CDB) is a specific unit of work to be acted upon by a SCSI initiator or target.
The SCSI adapter driver, which you load or let the server operating system load for you, is the interface between the physical card and the entry point into the kernel OS SCSI drivers.
A SCSI adapter takes SCSI commands and packages them into the parallel format needed on the SCSI data bus where the disk and tape devices sit.
HBAs
Today
Application Layer
Here is a view of the layers involved in servicing a SCSI command today. The difference from the previous slide is that the device drivers use serial SCSI-3 commands, which enhance error recovery and device sharing, and the block-level device that the kernel OS communicates with does not have to be on the same physical parallel bus for the host to access it. The FC HBA communication process today is otherwise very similar:
The application layer still handles Netscape, payroll, and inventory applications. This level also includes backup, multipathing, clustering and volume management applications.
The kernel OS system call interface and file subsystem layers still process application layer communication.
The disk, tape or CD-ROM device drivers are responsible for handling specific requests (i.e., opens, reads, writes, closes) from one or more applications. These drivers put the necessary device-specific handlers on the communication going down the chain.
For SCSI communication (the most prevalent FC protocol), the kernel OS SCSI driver is still responsible for accepting generic I/O requests from upper layers in the operating system's kernel and converting them to the appropriate device-specific SCSI Command Descriptor Blocks. Note: A SCSI Command Descriptor Block (CDB) is a specific unit of work to be acted upon by a SCSI initiator or target. These new and improved SCSI drivers also understand how to communicate with the FC adapter drivers, called HBA drivers.
The FC adapter (HBA) driver, which you load or let the server operating system load for you, is the interface between the physical card and the entry point into the kernel OS SCSI drivers.
These adapters take the SCSI commands, package them into sequences of frames, and append FC addressing information and flow control parameters. They handle what we call FC Layer 2 (FC-2) processes. HBA components also assign counters, register statistical data, and serialize, encode and decode FC frames: all the steps necessary to send these frames out the link, through the Fabric, to their destination ports.
Summary
FC combines the best of Channel and Network protocols
FC SANs are a new multi-protocol network that provide high-speed, congestion-free communication between heterogeneous devices with legacy support
SAN components include interconnect devices (switches, hubs, bridges), storage (storage controllers) and hosts (HBAs)
Additional Information
Use the resources page Internet Links, find the SAN ED 101 link
SAN Fabric Foundation, go to chapter 1 (Introduction to SAN Concepts and Benefits) for additional information related to this presentation
Course Objectives
After completing this module, attendees should be able to:
! Identify ! Link ! Link
FC protocol layers
Topics
Ethernet protocol layers
FC protocol layers
FC-0 Speeds and Feeds
FC-1 8b/10b Encoding
FC-2 Data Delivery
FC-3 Common Services
FC-4 Upper Level Protocols (ULPs)
[Diagram: TCP/IP over Ethernet]
A look at network protocol layers
Ethernet is the most widely used LAN technology today. Originally conceived and developed by Xerox Corporation, it is specified in the IEEE 802.3 standard. Since its inception, initial speeds of 10 Mbit/sec have evolved into Fast Ethernet (100BASE-T), providing transmission speeds up to 100 Mbit/sec. Gigabit Ethernet provides an even higher level of backbone support at 1000 Mbit/sec, primarily on fiber optic cable. The diagram illustrates how Ethernet uses the OSI model. Although it looks as if Ethernet provides solutions for layers one through four of the OSI model, Ethernet by itself covers only the two bottom layers. Ethernet is, in general, accessed through the TCP/IP (or UDP/IP) protocol stack, whereas the SAN is accessed through a simple SCSI protocol stack with less overhead on the server processor. Most Ethernet adapters work at the packet level. All higher-level segmentation and reassembly into IP datagrams or TCP-level sockets is software driven and requires server CPU intervention. Be aware that TCP/IP networks often drop packets when the network becomes congested. When this happens, the packet must be retransmitted, which uses more bandwidth.
Footnote 1: FC-2 has elements of OSI layer four, so there is not a perfect one-to-one correlation between the OSI and the Fibre Channel layers. Overall, Fibre Channel is a modular architecture of five layers (FC-0 to FC-4), and the levels do not define physical or programming interfaces between them.
Multiple Protocols
On Common Fibre Channel Transport
The FC-0 and FC-1 layers specify the physical and data link functions needed to physically send data from one port to another.
FC-0 specifications include information about media connections and cables; it is sometimes referred to as the feeds and speeds layer. A 10 Gbit/sec speed, although not depicted, is also in development.
The FC-1 layer contains specifications for 8b/10b encoding, ordered sets and link control communication functions.
FC-2 specifies the content and structure of information, along with how to control and manage information delivery. This layer contains the basic rules needed for sending data across the network. These include: (1) how to divide the data into smaller frames, (2) how much data should be sent at one time before sending more (flow control), and (3) where the frame should go. It also includes Classes of Service, which define different implementations that can be selected depending on the application.
FC-3 defines advanced features such as striping (transmitting one data unit across multiple links), multicast (delivering a single transmission to multiple destinations) and hunt groups (mapping multiple ports to a single node). So, while the FC-2 level concerns itself with the definition of functions within a single port, the FC-3 level deals with functions that span multiple ports. There are parallel link service functions that many think reside at this layer because they logically fit. These link service functions use all FC-2 and lower services as a ULP does, but they are neither ULPs nor resident at FC-3. I think of them as performing FC-4-like functions in a category parallel to FC-3.
FC-4 provides mapping of Fibre Channel capabilities to pre-existing protocols such as IP, SCSI, or ATM.
Serial interface (separate one-bit transmit and receive lines)
Media types:
  Cables
    Fiber optic cables: single mode, multi mode
  Connectors
Fiber and electrical cable specifications
FC-0: Physical Interface
The lowest level, FC-0, specifies the physical link. A major purpose of Fibre Channel is to have the selected protocol operate over various physical media and data rates. This approach ensures maximum flexibility, allowing existing cable plants and a number of different technologies to be used to meet a wide variety of system requirements. Common cables are either copper or optical. Single-mode optical fiber is used between buildings or sites because of its long-distance data transmission capabilities. Multimode fiber (MMF) is used within a building, between nodes and switches or between floors. Optical connectors are used for interconnection between devices such as nodes, Fibre Channel switches and hubs. The associated interface modules include SFP (Small Form-factor Pluggable), GBIC (Gigabit Interface Converter) and MIA (Media Interface Adapter). Serial data received from FC-1 is converted to a signal type associated with the transmission media and sent out the transmission port. Transmission continues as long as the link remains in the OFC Active state, also called the Open-Fiber state.
Fibre Channel links are driven optically or electrically. Optical and electrical links can be combined in a single system when a Fabric or other media-type converter is available. The active part of the optical cable consists of the core (the optical conduit), surrounded by cladding (to keep the light in the core), with a fiber coating wrapped around the cladding. The optical fiber is very thin and can easily be damaged. The fiber core can be 9 micron, 50 micron or 62.5 micron. For comparison, a human hair is about 75 microns thick.
A ferrule
An optical FC link consists of two fibers: one for transmitting information, the other for receiving information. Each side's TX connects to the other side's RX, forming a single point-to-point connection. In a switched fabric, a node's TX is attached to the Brocade port's RX and the Brocade port's TX is connected to the node's RX.
Multimode: Core (50/62.5 µm), Cladding (125 µm)
Single-mode: Core (9 µm), Cladding (125 µm)
Fiber is available in two types: single-mode and multimode. The smaller the glass core, the greater the distance. Single-mode is used for high-speed, long-distance links, while multimode is used for lower-cost, intermediate links.
Multimode fiber (MMF) is used for short wavelength. It comes in 50 and 62.5 micron cores, with a cladding of 125 microns. The larger core (50/62.5 µm) allows multiple modes, or paths, for the light to follow. Because light on different paths arrives at different times, this propagation method suffers from modal dispersion, and the achievable distance is significantly reduced.
Single-mode fiber (SMF) is used for long wavelength. It comes with a 9 micron core and a cladding of 125 microns, and covers distances from 2 m to 10 km. The light travels along a single path because the core diameter is reduced (9 µm) to such a degree that it constrains the light.
In single-mode fiber, the idea is to reduce the core size until the possibility of modal dispersion is reduced to such a degree that only one mode of photon flight is exhibited. By also eliminating chromatic dispersion (by using monochromatic laser light sources), great distances and bandwidths become possible in single-mode fiber. Multimode propagation refers to the fact that pulses of light can travel down the optical waveguide taking different reflective paths.
The prefixes and measurements for small numbers are:
Decimeter (dm) = 10^-1 of a meter
Centimeter (cm) = 10^-2 of a meter
Millimeter (mm) = 10^-3 (one thousandth) of a meter
Micrometer (µm) = 10^-6 (one millionth) of a meter (µ is the Greek letter mu)
Nanometer (nm) = 10^-9 (one billionth) of a meter
Picometer (pm) = 10^-12 (one trillionth) of a meter
Femtometer (fm) = 10^-15 of a meter
Attometer (am) = 10^-18 of a meter
Fibre Channel does not necessarily mean just fiber cable
Optical and copper cables are the most common
Media Type                                       Speed      Distance
9 micron Single-Mode Fibre (Long Distance)       100 MB/s   up to 10 km
                                                 200 MB/s   up to 10 km
50 micron Multi-Mode Fibre (Short Distance)      100 MB/s   0.5 m to 500 m
                                                 200 MB/s   0.5 m to 300 m
62.5 micron Multi-Mode Fibre (Short Distance)    100 MB/s   0.5 m to 175 m
                                                 200 MB/s   0.5 m to 90 m
Electrical (Copper)                              100 MB/s   0 m to 33 m
Fibre Channel can be implemented using either fiber optic or copper cabling. Each has its own advantages and disadvantages. Fiber optic cables are more expensive, but they offer reliability, distance (up to 10 kilometers), and ease of connectivity. They come in two kinds: single-mode (yielding greater distance) and multimode. Multimode is by far the most common today, typically providing 500 meters of distance at 1 Gbit/sec. Each optical path in a fiber is called a mode. Fiber pairs come with easy-to-use push-pull SC (Subscriber Connector) or LC (Lucent Connector) connectors at the cable ends.
1 Gbit/sec switches typically use SC connectors, and 2 Gbit/sec switches typically use LC connectors. Both cables have white tips, called ferrules, that plug into the GBIC or SFP.
Copper cables are less expensive but suffer from more reliability problems and limited distance. They are available as Coaxial cable and Twisted Pair with connectors in DB-9 or High Speed Serial Data Connectors (HSSDCs). The HSSDC is a new connector designed for Fibre Channel, providing for a screw-less, easy to plug/unplug connection.
LC
SC (Subscriber Connector) connectors plug into GBIC (Gigabit Interface Converter) receptacles
SFP
SC
GBIC
Typically - 2Gbit/sec cables use LC connectors to SFPs while 1Gbit/sec fiber cables use SC connectors to GBICs
2003 Brocade Communications Systems, Incorporated. Revision0.1_FC101_2003
The SC connector features a molded body and a push-pull locking system and was designed as a low-cost alternative to the ST connector. It is used in both multimode and single-mode high-bandwidth applications. The LC connector, a small form factor connector, features a ceramic ferrule and looks like a mini SC connector.
SWL Fiber Optic GBIC/SFP Modules
The SWL fiber optic GBIC/SFP modules, with an SC or LC connector (respectively), are based on short-wavelength 850 nm lasers supporting 1 or 2 Gbit/sec link speeds. These GBIC/SFP modules support 50 and 62.5 micron multimode fiber optic cables. 50/125 micron cables reach 500 meters at 1 Gbit/sec and 300 meters at 2 Gbit/sec. 62.5/125 micron cables reach 175 meters at 1 Gbit/sec and 90 meters at 2 Gbit/sec.
LWL Fiber Optic GBIC/SFP Modules
The LWL fiber optic GBIC/SFP modules, with an SC or LC connector (respectively), are based on long-wavelength 1300 nm lasers supporting 1 or 2 Gbit/sec link speeds. These modules support 9 micron single-mode fiber optic cables up to 10 kilometers in length with a maximum of five splices.
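The distance limits quoted above lend themselves to a simple lookup when checking a planned cable run. The following is only a sketch: the dictionary keys and the function name link_is_within_spec are made up for illustration, and the figures are simply the ones stated in these notes.

```python
# Maximum optical cable lengths in meters, keyed by (fiber type, link speed in Gbit/sec).
# Values are taken from the module descriptions above; the key strings are illustrative.
MAX_DISTANCE_M = {
    ("50 micron multimode", 1): 500,
    ("50 micron multimode", 2): 300,
    ("62.5 micron multimode", 1): 175,
    ("62.5 micron multimode", 2): 90,
    ("9 micron single-mode", 1): 10_000,
    ("9 micron single-mode", 2): 10_000,
}


def link_is_within_spec(fiber: str, speed_gbit: int, length_m: float) -> bool:
    """Return True if a cable run of length_m fits the quoted limit for this fiber and speed."""
    return length_m <= MAX_DISTANCE_M[(fiber, speed_gbit)]
```

For example, a 200 meter run of 62.5 micron multimode fiber would fail this check at either speed, while the same run on 50 micron fiber would pass at both.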
Cable connection:
LC Connector
SC Connector
Passive Copper (Cu): up to 13 m (1 Gbit/sec)
Active Copper (Cu): up to 33 m (1 Gbit/sec)
Short wavelength (SWL) GBIC / SFP
Long wavelength (LWL) GBIC / SFP
Extended Fabrics: up to 100 km
Check OEM compatibility matrices for a list of supported GBICs and SFPs. Brocade-supported GBICs from current vendors include Agilent, Finisar and IBM.
Passive/Active Copper GBIC Module
The copper (Cu) GBIC module is based on the High-Speed Serial Data Connection (HSSDC) interface standards. The GBIC provides a female HSSDC connector. Passive copper cables up to 13 meters and active copper cables up to 33 meters have currently been qualified, thereby supporting ANSI X3.230 FC-PH intracabinet requirements. Standard cables with HSSDC-to-DB9 male connectors are also available.
Types of optical cable:
  LC to LC (SFP to SFP)
  LC to SC (SFP to GBIC)
  SC to SC (GBIC to GBIC)
Ethernet port: RJ45
Serial port: straight 9-pin female-female D-sub cable
  Pins 2, 3, 5 are required
MIA
Media Interface Adapters (MIAs) are used to convert electrical signals to optical signals (DB9 to SC)
SC Coupler
LC Coupler
FC-1 Layer:
FC-1: Byte Encoding
This layer describes the 8-bit/10-bit transmission code that is used to balance the transmitted bit stream. In addition, the coding provides a mechanism for detecting transmission and reception errors. The 8-bit/10-bit encoding scheme was selected for its superior transmission characteristics. This well-balanced code allows for low-cost component design and provides good transition density for easier clock recovery. Note: The 8-bit/10-bit scheme is used in IBM ESCON as well. The FC-1 level defines the method used to encode data prior to transmission. As data is passed down in octet form, the FC-1 layer encodes each octet into a ten-bit code group (20% overhead). Each octet is given a code group name according to its bit arrangement. This guarantees certain characteristics about the information that is sent across the serial link.
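The 20% overhead figure can be checked with a little arithmetic. The sketch below assumes nominal line rates of exactly 1 and 2 Gbit/sec, which is a simplification, and the function names are illustrative only. It shows why a 1 Gbit/sec link is quoted as 100 MB/s elsewhere in these notes: every byte costs 10 bits on the wire.

```python
DATA_BITS = 8    # bits in each octet handed down to FC-1
CODE_BITS = 10   # bits in each encoded code group sent on the link


def encoded_bits(payload_bytes: int) -> int:
    """Bits placed on the wire for a given number of data bytes."""
    return payload_bytes * CODE_BITS


def overhead_fraction() -> float:
    """Share of the transmitted bits that carry no data (2 of every 10)."""
    return (CODE_BITS - DATA_BITS) / CODE_BITS


def payload_rate_mb_per_s(line_rate_gbit: float) -> float:
    """Approximate data rate in MB/s for a nominal serial line rate in Gbit/sec."""
    bits_per_second = line_rate_gbit * 1_000_000_000
    bytes_per_second = bits_per_second / CODE_BITS
    return bytes_per_second / 1_000_000
```

With these assumptions, payload_rate_mb_per_s(1) gives 100.0 and payload_rate_mb_per_s(2) gives 200.0, matching the 100 MB/s and 200 MB/s speeds used in the media table.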
Syncing up summary: Serial data comes into a receiver port with 7-bit comma patterns at the front of each ordered set. These ordered sets act as frame delimiters, primitive signals and primitive sequences. Receiver ports first achieve bit sync (using these commas), then count to 10 to achieve character sync, and count to 40 to achieve word sync. Now frames can flow. If sync is lost, a Loss-of-Sync procedure takes place (after 5 invalid transmission words, the receiver attempts to re-sync).
Primitive Signals
  Fill Words: IDLE, ARB
  Non-Fill Words: R_RDY, VC_RDY, CLS, OPN
Primitive Sequences
  NOS - Not Operational
  OLS - Offline
  LR - Link Reset
  LRR - Link Reset Response
  LIP - Loop Initialization
  LPB - Loop Port Bypass
  LPE - Loop Port Enable
Transmission words can be classified into two categories:
1. Words that begin with a special character are called ordered sets. Any character whose name starts with a K (for example, K28.5) is a special character, which indicates its use in an ordered set. Ordered sets occur outside of the FC frame content and include frame delimiters, primitive signals and primitive sequences.
2. Words that begin with an encoded data character are called data words. Data words occur within the FC frame content.
Each transmission character in both ordered sets and data words is 10 bits, so transmission words are 40-bit words.
Ordered sets (K28.5, Dxx.y, Dxx.y, Dxx.y) include:
Frame delimiters (SOF, EOF), which identify the start and end of frames.
Primitive signals, which indicate events or actions at the sending port and are normally transmitted once.
  Fill words (IDLE, ARB(x), ARB(F0), ARB(FF)) are transmitted on a link whenever a port is operational and has no other specific information to send.
  Non-fill words (R_RDY, VC_RDY, CLS, OPN, DHD, MRK, SYNx,y,z) signal events.
  Undefined but valid ordered sets are treated as fill words (no explicit action is necessary).
Primitive sequences (NOS, OLS, LR, LRR, LIP), plus LPB and LPE to control an optional port bypass circuit (for loop). Primitive sequences indicate states or conditions and are normally transmitted continuously until something causes the current state to change. At least 3 primitive sequences must be received before the appropriate response can be generated.
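The two-way classification above can be sketched in a few lines. This is an illustration only: characters are represented by their names as strings (such as "K28.5" or "D21.4"), the function names are hypothetical, and the data character names used in examples are arbitrary placeholders rather than any defined ordered set.

```python
from typing import Sequence

CHARS_PER_WORD = 4   # a transmission word is four transmission characters
BITS_PER_CHAR = 10   # each transmission character is a 10-bit code group


def classify_transmission_word(chars: Sequence[str]) -> str:
    """Return 'ordered set' if the word starts with a special (K) character, else 'data word'."""
    if len(chars) != CHARS_PER_WORD:
        raise ValueError("a transmission word has exactly four characters")
    if chars[0].startswith("K"):
        return "ordered set"
    return "data word"


def word_length_bits(chars: Sequence[str]) -> int:
    """Length of the word on the wire, in bits."""
    return len(chars) * BITS_PER_CHAR
```

A word such as ["K28.5", "D21.4", "D21.5", "D21.5"] is classified as an ordered set because of its leading K character, and any four-character word is 40 bits long.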
Port State Machine (PSM) provides control signal specifications for Fabric connections
Loop Port State Machine (LPSM) provides control signal specifications for arbitrated loop connections
[Diagram: Nx_Port A and Nx_Port B, with each port's Tx connected to the other port's Rx]
There are four states of the Port State Machine (PSM) used for Fabric (G_Port) connections:
1. Active State (AC)
2. Link Reset State (LR)
3. Offline State (OLS)
4. Link Failure State (LF)
Active state (AC): In the Active state, the port is able to transmit and receive frames and primitive signals. Certain conditions encountered within the port may cause the port to exit the Active state and perform one of the following primitive sequence protocols: Link Reset, Offline, or Link Failure.
Link Reset state consists of:
  LR1 (Link Reset Transmit)
  LR2 (Link Reset Receive)
  LR3 (Link Reset Response)
Offline state consists of:
  OL1 (Offline Transmit)
  OL2 (Offline Receive)
  OL3 (Wait for Offline)
Link Failure state consists of:
  LF1 (No Operational Receive)
  LF2 (No Operational Transmit)
Exchange and Sequence Management
Frame Structure
Class of Service
Flow Control
  Buffer to Buffer
  End to End
FC-2 defines the structure and organization of the information being delivered and how that delivery is controlled and managed.
Exchange management is the mechanism that two Fibre Channel ports use to identify and assign an exchange ID number for a set of related information units. When the entire stream of data fits in a single frame (2112 bytes), a single exchange ID is created and a sequence number is assigned. When a stream of data does not fit into a single frame (2112 bytes), the data is put into sequences of frames. Within the exchange, sequence management numbers the segments of the data stream. The recipient uses the sequence numbers associated with the exchange to re-order the segments and reassemble them as a contiguous stream of data. In other protocols, this is commonly known as fragmentation and reassembly.
Frame structure: a frame starts with a start-of-frame delimiter ordered set and ends with an end-of-frame delimiter ordered set.
Flow control regulates the delivery of frames. When a frame is ready for transmission, it is sent through the encoder (8b/10b) to the serializer (SFP/GBIC) and transmitted to the receiver port, where it is deserialized, decoded and stored in a receive buffer. For each frame sent, the transmitting port decrements its available credit, which starts at the value established during the login session (buffer-to-buffer credit). When the receiving port moves the frame out of its buffer toward the next port, it signals the transmitting port that it may send another frame, and the credit is restored. Buffer credits regulate the flow of frames into and out of the fabric.
When an N_Port and a destination N_Port communicate, an end-to-end credit is established. End-to-end credit is used to manage the flow of frames between a specific pair of N_Ports and allows the receiving port to control which source N_Ports are allowed to send frames to it.
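The fragmentation and reassembly idea can be sketched as follows. This is a simplified model, not an implementation of FC-2: the Frame class carries only an exchange ID, a sequence ID, a frame counter and a payload, and the names Frame, segment and reassemble are made up for illustration. The only figure taken from these notes is the 2112-byte maximum payload.

```python
from dataclasses import dataclass
from typing import List

MAX_PAYLOAD = 2112  # maximum frame payload in bytes, as stated in these notes


@dataclass
class Frame:
    ox_id: int      # exchange the frame belongs to
    seq_id: int     # sequence within that exchange
    seq_cnt: int    # position of this frame within the sequence
    payload: bytes


def segment(data: bytes, ox_id: int, seq_id: int) -> List[Frame]:
    """Split a data stream into numbered frames of at most MAX_PAYLOAD bytes each."""
    frames = []
    for count, offset in enumerate(range(0, len(data), MAX_PAYLOAD)):
        chunk = data[offset:offset + MAX_PAYLOAD]
        frames.append(Frame(ox_id, seq_id, count, chunk))
    return frames


def reassemble(frames: List[Frame]) -> bytes:
    """Re-order frames by their counter and join the payloads into one stream."""
    ordered = sorted(frames, key=lambda frame: frame.seq_cnt)
    return b"".join(frame.payload for frame in ordered)
```

Because each frame carries its own counter, the recipient can rebuild the stream even if frames arrive out of order, which matters for the classes of service where each frame is routed separately.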
[Diagram: ULP commands (Command 2, Command 3) are mapped to information units (CMD, DATA IN, STATUS); each information unit is carried as a sequence, and the sequences pass down through FC-0 and FC-1 to the port]
With Fibre Channel, there are almost no limits on the size of transfers between applications, whereas with Ethernet the software is sensitive to the maximum packet size that can be transmitted (1518 bytes). With Fibre Channel, frame sizes are transparent to the software because of a logical construct called a "sequence." The unit of transfer is the sequence, not the frame, and a sequence always maps to an Upper Layer Protocol (ULP) command instruction. Let's see how it works:
First, all the commands coming from a ULP are mapped into logical constructs called "information units." An individual information unit is generally mapped to a sequence. Related information units, such as those required in an I/O operation, are mapped as a single exchange. The sequence and exchange structures are general and contain tunable options concerning flow control and error recovery policy.
Sequence: A sequence is a set of one or more related data frames transmitted for a single operation, flowing in the same direction on the link (unidirectional from one N_Port to another N_Port). The N_Port that transmits a sequence is referred to as the "sequence initiator," and the N_Port that receives the sequence is referred to as the "sequence recipient." A sequence is also the recovery boundary in Fibre Channel. When an error is detected, Fibre Channel identifies the sequence in error and allows that sequence to be retransmitted.
Exchange: An exchange is composed of one or more non-concurrent, related sequences for a single higher-level operation. For example, an operation may consist of several phases: a command to read some data, followed by the data, followed by the completion status of the operation. Each phase (command, data, and status) is a separate sequence, but together they can form a single exchange. Within a single exchange, only one sequence direction can be active at a time, although sequences for different exchanges may be concurrently active (Fibre Channel multiplexing support). Sequence initiative is handed over before sequences flow in the other direction. Sequences going in one direction can be streamed, which means that a second sequence can be sent in that direction without waiting for delivery confirmation of the first.
Note: The four exchange error policies are: (1) Abort, discard multiple sequences; (2) Abort, discard a single sequence; (3) Process with infinite buffering (this policy is a special design for specific transfers, such as video data); (4) Discard multiple sequences with immediate retransmission.
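The read example above (command, then data, then status) can be modeled as one exchange holding three sequences. The sketch below is illustrative only: the class and function names are invented, and it assumes the command flows from the host while the data and status flow back from the target, which is why sequence initiative changes hands once.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sequence:
    information_unit: str  # for example "CMD", "DATA IN" or "STATUS"
    initiator: str         # port that transmits this sequence


@dataclass
class Exchange:
    originator: str
    responder: str
    sequences: List[Sequence] = field(default_factory=list)

    def add(self, information_unit: str, initiator: str) -> None:
        """Append a sequence sent by one of the two ports in this exchange."""
        if initiator not in (self.originator, self.responder):
            raise ValueError("sequence initiator must be a member of the exchange")
        self.sequences.append(Sequence(information_unit, initiator))

    def initiative_transfers(self) -> int:
        """Count how often the transmitting side changes between consecutive sequences."""
        pairs = zip(self.sequences, self.sequences[1:])
        return sum(1 for first, second in pairs if first.initiator != second.initiator)


def read_operation(host: str, target: str) -> Exchange:
    """Build the three-phase read described above as a single exchange."""
    exchange = Exchange(originator=host, responder=target)
    exchange.add("CMD", host)        # host asks for the data
    exchange.add("DATA IN", target)  # target returns the data
    exchange.add("STATUS", target)   # target reports completion
    return exchange
```

In this model the two target-to-host sequences follow one another in the same direction, which is the case where streaming applies.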
FRAME
SOF (4 bytes) | HEADER (24 bytes) | PAYLOAD | CRC (4 bytes) | EOF (4 bytes)
A frame has a header and may have a payload. The header contains control and addressing information associated with the frame. The payload contains the information being transported by the frame on behalf of the higher-level service or FC-4 upper level protocol. There are many different payload formats, based on the protocol. The TYPE field (Word 2, bits 31-24) tells which format to use. The routing control INFO bits (bits 27-24) determine how to interpret the payload.
Field Definitions
Routing Control (R_CTL): the first 8 bits of the header. They define the type of frame and its content or function. The first 4 bits (bits 31-28) identify the frame type. The second 4 bits, the INFO bits (bits 27-24), define the contents of the frame or identify the function of the frame.
Destination_ID (D_ID): Port Identifier (PID), or 24-bit address, of the recipient. It could also be a well-known address like the Name Server (FFFFFC); more on this later.
Class Specific Control (CS_CTL): the control necessary for different classes of service. This field is always zero for Classes 2 and 3 per the standards. Classes 1 and 4 use it. Brocade switches currently use only Classes 2, 3, and F. If CS_CTL is something other than zero in a Brocade port log (a running log extracted from portions of the FC frame, displayed with the portLogDump command), then it is a Brocade internal code called IU_Status Values.
Source_ID (S_ID): Port Identifier (PID), or 24-bit address, of the source. It could be a well-known address like the Name Server (FFFFFC).
Type (TYPE): identifies the protocol of the frame content for data frames (e.g., FC_CT, FCP, IPFC).
Frame Control (F_CTL): contains miscellaneous control information regarding the frame, such as who owns initiative, first frame of the exchange, last frame of the exchange, etc.
Sequence ID (SEQ_ID): used to identify and track all of the frames within a sequence between a source and destination port pair.
Data Field Control (DF_CTL): indicates whether any optional headers are present at the beginning of the data field of the frame. Optional headers are used for information that may be required by some applications or protocol mappings.
Sequence Count (SEQ_CNT): indicates the sequential order of frame transmission within a sequence, or across multiple consecutive sequences within the same exchange. It is a counter that increments as the frames of a sequence are transmitted.
Originator Exchange ID (OX_ID): exchange ID assigned by the originator port.
Responder Exchange ID (RX_ID): exchange ID optionally assigned by the responder to the exchange.
Data Field/Payload: the standards limit the size. The maximum size is 2112 bytes.
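As a rough sketch of how these fields sit inside the 24-byte header, the code below unpacks the header as six big-endian 32-bit words. The word layout follows the field order listed above with TYPE in Word 2; the placement of the remaining fields, the big-endian byte order, and the final 32-bit word (labeled PARAMETER here, a field these notes do not discuss) are assumptions drawn from the FC framing standards as I understand them. The function name parse_fc_header is made up for illustration.

```python
import struct

HEADER_LENGTH = 24  # bytes


def parse_fc_header(header: bytes) -> dict:
    """Split a 24-byte frame header into its named fields (assumed layout, see lead-in)."""
    if len(header) != HEADER_LENGTH:
        raise ValueError("an FC frame header is 24 bytes long")
    word0, word1, word2, word3, word4, word5 = struct.unpack(">6I", header)
    return {
        "R_CTL": word0 >> 24,            # Word 0, bits 31-24
        "D_ID": word0 & 0xFFFFFF,        # Word 0, bits 23-0
        "CS_CTL": word1 >> 24,           # Word 1, bits 31-24
        "S_ID": word1 & 0xFFFFFF,        # Word 1, bits 23-0
        "TYPE": word2 >> 24,             # Word 2, bits 31-24
        "F_CTL": word2 & 0xFFFFFF,       # Word 2, bits 23-0
        "SEQ_ID": word3 >> 24,           # Word 3, bits 31-24
        "DF_CTL": (word3 >> 16) & 0xFF,  # Word 3, bits 23-16
        "SEQ_CNT": word3 & 0xFFFF,       # Word 3, bits 15-0
        "OX_ID": word4 >> 16,            # Word 4, bits 31-16
        "RX_ID": word4 & 0xFFFF,         # Word 4, bits 15-0
        "PARAMETER": word5,              # Word 5 (not covered in these notes)
    }
```

Given a header whose first word is 0x22FFFFFC, for instance, the function reports R_CTL as 0x22 and D_ID as 0xFFFFFC, the Name Server well-known address mentioned above.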
Class 6 behaves like Class 1 with one-to-many capabilities
Class F* connectionless: switch-to-switch, with ACK and BB_Credit ready
The ability to differentiate Class of Service characteristics will be needed when discussing link and flow control of frames.
The classes of service marked with an asterisk (*) are the ones that Brocade switches currently support.
Note 1: There are other link control signals associated with Classes 1, 2, 4 and 6, such as ACK, F_BSY (fabric busy), P_BSY (port busy), F_RJT (fabric reject), LCR (link credit reset), NTY (destination port engaged in a Class 1 connection), and other control signals.
Note 2: Classes of service 2, 3 and F have no bandwidth, deterministic latency, or in-order delivery (IOD) guarantees. Other Fabric services can provide this capability.
Additional notes:
Class 1: Circuit switching means that when the Fabric receives a Start of Frame connect Class 1 (SOFc1) signal, it starts establishing a connection to the destination port. There are variations of COS 1: Exclusive connection, Intermix (allows Class 2 and 3 frame transmission when Class 1 is not in use), and Intermix Bandwidth Recovery (allows the Fabric to hold a Class 1 frame for one frame cycle and make up that frame cycle in idle cycles).
Class F: Based on a simplified Class 2 model. It is used to communicate Fabric-related traffic on inter-switch links (ISLs).
Class 2 Service
  Multiplexes frames at the frame boundary
  Adaptive routing
    Each frame routed separately by Fabric
    If multiple routes supported, frames might be delivered out of order
  Confirmation of delivered frames
  Login with both N_Port and Fabric required
  Connectionless service with no turnaround delay to establish connection
  N_Port to N_Port flow control
  End-to-End (EE) delivery confirmation with ACK
  Buffer-to-Buffer (BB) link level flow control
  Notification of frame delivery failure
Class 3 Service
  Multiplexes frames at the frame boundary
  Adaptive routing
    Each frame routed separately by Fabric
    If multiple routes supported, frames might be delivered out of order
  Unconfirmed delivery of frames
    Datagram service
  Login with both N_Port and Fabric required
  Connectionless service with no turnaround delay to establish connection
  Buffer-to-Buffer (BB) link level flow control
  ULP recovery from frame delivery failures
Brocade supports Class 2 and 3.
Flow Control
ACK: Acknowledgement
R_RDY: Receiver_Ready
COS 3 uses R_RDY (buffer-to-buffer, or BB_Credit) flow control; each R_RDY received increments the BB_Credit value
COS 2 and F use R_RDY and ACK (end-to-end, or EE_Credit) flow control; each ACK received increments the EE_Credit value
Brocade switches use cut-through routing: they read the S_ID / D_ID pair and then send the frame out the wire
Flow Control also includes mechanisms to communication busyness: F_BSY Fabric Busy F_RJT Fabric_Reject P_BSY N_PORT_BUSY P_RJT N_PORT_REJECT
Flow control definitions:
Acknowledgement (ACK) - Used to confirm successful frame delivery. It is used by Class 2 and Class F, but not by Class 3. ACK times are established with the R_A_TOV value. The default time used by Brocade is 10 seconds.
Receiver Ready (R_RDY) - Used by the receiving port to signal that it is ready to receive a new frame.
Fabric Busy (F_BSY) - Occurs when the fabric is unable to deliver a frame. Fabric busy conditions are an unusual event. The time for frame delivery is determined by the Error Detection Time Out Value (E_D_TOV). The default time used by Brocade is 2 seconds.
Fabric Reject (F_RJT) - Used to signal that a frame could not be delivered. A typical reason is an invalid S_ID.
N_Port Busy (P_BSY) - When an N_Port is unable to accept a valid frame due to a busy condition, the receiving N_Port sends a P_BSY to the sender of the frame.
N_Port Reject (P_RJT) - Used to signal that a frame could not be delivered. Typical reason is for an ACK
Frames are moved from one buffer to another using Receiver Ready (R_RDY) primitive signals.
Frame flow is always from the source buffer to the destination buffer.
Multiple intermediate buffers may be involved.
[Figure: data frames flow from N_Port A through the Fabric to N_Port B; each port has separate Tx and Rx sides]
Frame flow is controlled by the receiver as a back-pressure mechanism.
Flow control is dependent on class of service (COS), but most classes use BB_Credits:
- BB_Credit and length are established during Fabric login.
- BB_Credits are also exchanged during port login, but are ignored unless in a point-to-point topology (two devices, no switches).
End-to-end (EE_Credit) flow control is an originator device port to destination device port flow control mechanism:
- EE_Credits are established during node port login.
- ACKs increment the EE_Credit value.
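The credit behavior described above can be sketched in a few lines. This is a minimal illustration, not a model of any real switch or HBA: the class name `CreditedLink` and its methods are hypothetical, and the credit value is assumed to have been granted at login. The sender may transmit only while it holds credit, and each R_RDY it receives restores one credit.

```python
class CreditedLink:
    """Hypothetical sender-side view of one buffer-to-buffer credited link."""

    def __init__(self, bb_credit: int) -> None:
        # Credit granted by the receiver at login (assumed fixed afterwards).
        self.available = bb_credit
        self.sent = 0

    def try_send_frame(self) -> bool:
        """Send one frame if a credit is available; otherwise hold the frame."""
        if self.available == 0:
            return False  # back-pressure: the sender must wait for an R_RDY
        self.available -= 1
        self.sent += 1
        return True

    def receive_r_rdy(self) -> None:
        """Each R_RDY received gives one credit back to the sender."""
        self.available += 1


link = CreditedLink(bb_credit=2)
results = [link.try_send_frame() for _ in range(3)]
print(results)                 # third frame is held: no credit left
link.receive_r_rdy()           # receiver freed a buffer
resumed = link.try_send_frame()
print(resumed, link.sent)
```

End-to-end credit follows the same pattern, with ACK frames taking the place of R_RDY.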
Flow control is a mechanism to establish the maximum amount of data that can be sent at any one time. It is a throttle-back mechanism to handle congestion without the need to discard frames.
Flow control considerations include:
- Bandwidth Considerations - At high speeds, more data buffers are required to achieve the same link utilization for a given distance.
- Internal Host Buses - Internal host buses (like PCI) can become congested and thus throttle back receiving data from the Fabric.
- Extended Distance - Over extended distances, more data buffers are required to maintain high throughput.
[Figure: a device N_Port (Point 1) attached to an F_Port (Point 2) on Switch 1, which is connected to Switch 2 by an ISL; Points 3 and 4 are the E_Ports]
BB_Credits are exchanged between points 1 and 2 during Fabric login and are used as throttle-back mechanisms.
BB_Credits are also exchanged across E_Ports (points 3 and 4) and are used as both throttle-back and bandwidth utilization (performance) mechanisms.
2003 Brocade Communications Systems, Incorporated. Revision0.1_FC101_2003
Increase E_Port buffer credits to optimize performance at the required distance.
The latency of light travelling through a fiber is 5 nsec/meter (light wave propagation), or 5 µsec/km.
A 10 km link has a round-trip latency of 100 µsec.
A 2KB frame takes 20 µsec to transmit at 100 MB/sec, so to keep a 10 km pipe full, the switch needs at least 5 E_Port buffers (credits). Proportionally, a 2KB frame at 50 km needs 25 buffer credits.
At high speeds, more data buffers are required to achieve the same link utilization for a given distance.
A 2KB frame takes 10 µsec to transmit at 200 MB/sec, so a 10 km link needs at least 10 E_Port buffers (credits). Proportionally, a 2KB frame at 50 km needs 50 buffer credits.
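The arithmetic above can be reproduced with a short calculation. This is a sketch under stated assumptions: the function name `credits_needed` is made up for illustration, "2KB" is taken as 2000 bytes, and 1 MB is taken as 10^6 bytes, matching the rounded figures in the text. The number of credits is the round-trip time divided by the time needed to transmit one frame.

```python
import math

PROPAGATION_US_PER_KM = 5.0  # from the text: about 5 microseconds per km of fiber


def credits_needed(distance_km: float, rate_mb_per_s: float,
                   frame_bytes: int = 2000) -> int:
    """Frames in flight during one round trip, i.e. credits to keep the link full."""
    round_trip_us = 2 * distance_km * PROPAGATION_US_PER_KM
    # bytes / (MB per second) yields microseconds when 1 MB = 10**6 bytes
    frame_time_us = frame_bytes / rate_mb_per_s
    return math.ceil(round_trip_us / frame_time_us)


for km, rate in [(10, 100), (50, 100), (10, 200), (50, 200)]:
    print(f"{km} km at {rate} MB/sec: {credits_needed(km, rate)} credits")
```

Smaller frames take less time to transmit, so a link carrying frames shorter than the assumed 2000 bytes would need proportionally more credits for the same distance.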
Class 2/F: BB and EE (ACK) credit | Yes, by Fabric or destination | 0 to 255 | Final ACK received w/ EOFt | All data frames received and all ACKs sent
Class 3: BB credit | No, frame discarded | 0 to 255 | Acc to ABTS; or ULP | All data frames received and EOFt in last frame
Note: Due to limited space and the non-use of COS 1 and 4 in Brocade SANs at this time, they will not be discussed in relation to flow control.
FC-3: Common Services
- Largely unused: mostly theoretic
- Functions span multiple ports
- Common service advanced features: striping, hunt groups
This layer defines advanced features such as striping (to transmit one data unit across multiple links), multicast (to transmit a single transmission to multiple destinations), and hunt groups (mapping multiple ports to a single node). While the FC-2 level concerns itself with the definition of functions within a single port, the FC-3 level deals with functions that span multiple ports.
Fibre Channel provides internal protocols called Basic Link Services and Extended Link Services. These services would logically reside at FC-3 but are NOT part of the FC-3 standards.
Basic link services include low-level functions that are transported as a single data frame within a sequence. Examples include Abort Sequence (ABTS), used to abort a sequence or a frame; BA_ACC and BA_RJT are the ABTS responses, meaning basic link service accept and reject respectively. Another basic link frame is No Operation (NOP), which is used to initiate or terminate connections and sequences. There is also a Remove Connection (RMC) class of service (COS) 1 frame, used to request the removal of a COS 1 connection.
Extended link services (ELS) perform upper-level-protocol-like functions between two fabric ports; one of them is often a well-known address such as the fabric F_Port, the fabric controller, or the name/directory server. Common ELS commands include FLOGI (Fabric login), PLOGI (port login), SCR (State Change Registration), RSCN (Registered State Change Notification), PRLI (Process login) and more.
SCSI (SCSI-FCP or FCSI-201): Transport of SCSI commands and data over the Fibre Channel protocol hierarchy is such a major part of Fibre Channel usage that a particular acronym, FCP, is used to denote SCSI over Fibre Channel. It is effectively the native protocol for Fibre Channel, hence the name Fibre Channel Protocol.
IP (IETF Draft FC_IP or FCSI 202): Maps the IP protocol over Fibre Channel.
WAN Tunneling (FC-BB): Defines mapping for ATM or LAN over FC.
Audio Visual (FC-AV): Mapping for digital TV standards and MPEG (for example) over FC.
VIA (FC-VI): Virtual Interface Architecture mapping (i.e. clustering).
FICON (FC-SB-2): Mapping FICON over FC. The standard is called FC-SB-2, for Fibre Channel Single Byte Command 2.
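[Figure: Fibre Channel layer stack. Upper layer protocols (SCSI-3, IP, ATM) map through FC-4 (including FC Link Encapsulation, FC-LE, and FC-ATM) onto Common Services and the Fibre Channel Physical & Signaling Interface (FC-PH, FC-PH2, FC-PH3), alongside FC-AL and FC-AL-2]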
A large part of the work done in developing the Fibre Channel architecture has been concentrated on assuring that the architecture can efficiently and naturally operate as a transport or delivery mechanism for a wide variety of well-established Upper Layer Protocols, or ULPs. Since much of the investment in current operating systems is at the device driver level, the incremental cost of moving systems over to a Fibre Channel data communication level decreases if the interfaces can be made as similar to previously existing interfaces as possible. This allows new capabilities to be added with minimal changes to currently available interfaces.
The following protocols are currently supported:
Storage protocols
- Small Computer System Interface (SCSI) command set
- Single Byte command code set (SBCCS)
- High Performance Parallel Interface (HIPPI)
- Intelligent Peripheral Interface (IPI)
Network protocols
- IP (Internet Protocol)
- ATM Adaptation Layer for Computer Data (AAL5)
- IEEE 802.2
- Link Encapsulation (FC-LE)
Cluster Server protocols
- VI (Virtual Interface)
FC-4, the ULP mappings (some of which are listed above), isolates the ULPs from the underlying Fibre Channel fabric and defines interoperability of processes within a ULP. This layer maps the ULP onto the Fibre Channel transport layers. Fibre Channel is equally adept at transporting both network and channel information, and allows both protocol types to be transported concurrently over the same physical interface.
Summary
The FC-0 and FC-1 layers specify the physical and data link functions needed to physically send data from one port to another:
- FC-0 defines media connections and cables (speeds and feeds).
- FC-1 defines 8b/10b encoding, link control and ordered sets.
FC-2 specifies the content and structure of information, along with how to control and manage information delivery:
- FC-2 defines exchange and sequence management, frame structure, COS and flow control mechanisms.
While the FC-2 level concerns itself with the definition of functions within a single port, the FC-3 level deals with functions that span multiple ports.
FC-4 provides mapping of Fibre Channel capabilities to pre-existing protocols, such as IP, SCSI, or ATM.
Additional Information
Internet Links to access SAN ED 101 SAN Fabric Foundation chapter 2 (Fibre Channel Essentials)
Also use the Fibre Channel Theory Fundamentals Resource and Reference Material selection under course syllabus SAN Glossary and FC Recommended Reading and Resource List
Course Objectives
After completing this module, attendees should be able to:
! Identify ! Discuss ! Discuss
Topics
- World Wide Names (WWNs)
- Port Identifiers (PIDs), also called 24-bit, S_ID, or D_ID addresses
- Well-known Addresses
- Point-to-Point (Pt to Pt) - Allows two devices to talk.
- Arbitrated Loop - Allows 126 devices to talk; Arbitrated Loop Physical Address (AL_PA) 00 is reserved for the Fabric Loop Port (FL_Port).
- Switched Fabric - Allows a theoretical 16 million devices to talk.
Point-to-Point is limited to two devices, but they can talk at greater distances than SCSI allows.
Arbitrated Loop is limited to 126 devices in a blocking architecture (plus one for the FL_Port). Without a switch, only two of these devices can talk at a time; all others are blocked until those two are done. An arbitrated loop attached to a switch allows queuing into and out of the port where the loop is attached. The embedded port will take one AL_PA, so on a Brocade switch port there are 125 available AL_PAs.
Switched fabric can theoretically allow 16 million nodes to talk (16^6 = 16,777,216: a Port Identifier (PID) has 6 hex digits, with 16 choices per digit). The standards committee reserves some of these addresses for well-known addresses and testing purposes.
Point-to-point is a simple topology that allows bi-directional communication between two nodes, in this case a storage system and a server. This topology is very similar to SCSI direct attach, except that it is faster and supports longer distances. Point-to-point, like all SAN topologies, benefits from the longer reach of fiber optic connections. A point-to-point topology clearly has its limitations, yet it has proven to be a fast and powerful method for connecting storage devices/arrays directly to servers.
The arbitrated loop is a ring topology where each node passes data to its adjacent node. Like an IBM Token Ring network, the SAN hub arbitrates requests for data to make optimum use of the available bandwidth. In an Arbitrated Loop configuration, the transmitter of each node is connected to the receiver of the next node. To send data from one node to another, devices must arbitrate for access to the loop. The initiating device arbitrates for control of the loop. Once it wins arbitration, it opens a communication session with the target and sends the data; in effect, the initiating node engages in a point-to-point connection with the recipient node. Only one connection can be established at a time. When the data transfer is completed, the initiator closes the session and releases control of the loop, allowing other devices to arbitrate for the loop. Currently, the maximum bandwidth is 100 MB/sec.
Fibre Channel Arbitrated Loop characteristics:
- Reduced-cost path into FC
- SCSI replacement
- Requires FC hub technology
- Easy for vendors to develop
- Difficult for customers to deploy
- Limited possible nodes: 126, plus the Loop Master (FL_Port)
- Lower overall throughput: 100 MB/sec maximum bandwidth
- Limited any-to-any connectivity: nodes on the loop have to arbitrate for control of the loop in order to communicate with a target device on the loop. While this communication is happening, all other devices wait for their turn.
[Figure: a switch with two F_Ports and one FL_Port; the FL_Port connects to a hub, and NL_Port devices attach to the switch and to the hub]
This diagram shows an example of an FC-AL loop attached to a switch. Communication can take place between:
1. Devices on the loop.
2. A device on the loop and a device attached to the switch. (A host on the loop could access data from the Fabric-attached storage.)
3. A device attached to the switch and a device on the loop. (The Fabric-attached host could write data to the storage on the loop.)
Devices on the loop can be either public (capable of doing a Fabric Login, called a FLOGI) or private (not capable of doing a FLOGI). If the devices are private, the switch will probe them and get them into the Fabric Name Server if possible (private host devices do not accept probes). The FL_Port that the private loop device is attached to also provides translation between Fabric 24-bit addresses and FC-AL 8-bit addresses.
An Arbitrated Loop Physical Address (AL_PA) is needed to communicate:
- An 8-bit address assigned to each device on the loop
- Maximum of 126 AL_PAs attached to an FL_Port
- AL_PA 00 is reserved for the FL_Port
- These 127 AL_PAs are a unique set out of the possible 256 bit patterns
- The lower the AL_PA, the higher the priority
Arbitrated Loop uses 8 bits to identify each of the devices on a loop. This is the physical address for the device and is known as the AL_PA. The protocol allows for 127 ports on a loop: 126 unique AL_PAs exist for the NL nodes, and AL_PA 00 is reserved for the switch FL_Port. Using certain bit combinations can create disparity errors, so the 126 AL_PAs available for the NL_Ports are a fixed set. The next slide shows the valid AL_PA table. Not all AL_PAs are created equal. In arbitrating for control of the loop, the device with the highest priority succeeds. The lower the AL_PA assigned, the higher the priority of the device in the loop. Arbitrated Loop devices receive an AL_PA during the loop initialization process, which is described in later slides.
Valid AL_PAs
Hex values, from highest priority (00) to lowest priority (EF):
00 01 02 04 08 0F
10 17 18 1B 1D 1E 1F
23 25 26 27 29 2A 2B 2C 2D 2E
31 32 33 34 35 36 39 3A 3C
43 45 46 47 49 4A 4B 4C 4D 4E
51 52 53 54 55 56 59 5A 5C
63 65 66 67 69 6A 6B 6C 6D 6E
71 72 73 74 75 76 79 7A 7C
80 81 82 84 88 8F
90 97 98 9B 9D 9E 9F
A3 A5 A6 A7 A9 AA AB AC AD AE
B1 B2 B3 B4 B5 B6 B9 BA BC
C3 C5 C6 C7 C9 CA CB CC CD CE
D1 D2 D3 D4 D5 D6 D9 DA DC
E0 E1 E2 E4 E8 EF
The 8-bit addresses not shown in this table will never be used as an AL_PA for a device on the loop (03, 05, 06, etc.). The AL_PA for the FL_Port on a public loop will always be 00.
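The priority rule can be illustrated with a small lookup. This is a minimal sketch: the set below is transcribed from the valid AL_PA table, and the helper name `arbitration_winner` is hypothetical. It only shows that the lowest valid AL_PA among the arbitrating ports has the highest priority; it does not model the arbitration primitives or fairness.

```python
# Valid AL_PA values (hex), transcribed from the table, one row per high nibble.
VALID_AL_PAS = frozenset(
    int(token, 16)
    for token in """
    00 01 02 04 08 0F
    10 17 18 1B 1D 1E 1F
    23 25 26 27 29 2A 2B 2C 2D 2E
    31 32 33 34 35 36 39 3A 3C
    43 45 46 47 49 4A 4B 4C 4D 4E
    51 52 53 54 55 56 59 5A 5C
    63 65 66 67 69 6A 6B 6C 6D 6E
    71 72 73 74 75 76 79 7A 7C
    80 81 82 84 88 8F
    90 97 98 9B 9D 9E 9F
    A3 A5 A6 A7 A9 AA AB AC AD AE
    B1 B2 B3 B4 B5 B6 B9 BA BC
    C3 C5 C6 C7 C9 CA CB CC CD CE
    D1 D2 D3 D4 D5 D6 D9 DA DC
    E0 E1 E2 E4 E8 EF
    """.split()
)


def arbitration_winner(arbitrating_al_pas):
    """Return the AL_PA with the highest priority (the lowest value)."""
    for al_pa in arbitrating_al_pas:
        if al_pa not in VALID_AL_PAS:
            raise ValueError(f"{al_pa:02X} is not a valid AL_PA")
    return min(arbitrating_al_pas)


print(len(VALID_AL_PAS))                                   # 127, including 00
print(f"{arbitration_winner([0xE8, 0x23, 0x01]):02X}")     # 01 wins
```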
Loop Initialization
What can cause it?
- Power On / Power On Reset
- Entering/Leaving a participating mode
- Loop failure
- Arbitration Wait timeout
- Selective Reset LIP
Power On / Power On Reset - Loop initialization occurs when a port is powered on or is given an equivalent reset.
Enter/Leave Participating Mode - A port in nonparticipating mode may, after a port-dependent timeout, attempt to become a participating port. If the port is successful in obtaining an AL_PA, it can participate in loop operations after initialization completes. If the port is unsuccessful, it remains in nonparticipating mode. A port already in participating mode can change to nonparticipating mode. It relinquishes the AL_PA it was assigned and makes it available for other ports to acquire.
Loop Failure - This may occur because a port on the loop fails or is powered off, or because a physical connection in the loop is broken.
Arbitration Wait Timeout - Excessive unfairness or a hung port may prevent a port from winning arbitration. The port may use loop initialization to clear this condition.
Selective Reset LIP - Causes the ports on the loop to do a vendor-unique reset. Usually, this is equivalent to a power-on reset.
Loop Initialization
What happens?
- Loop initialization begins by a port transmitting LIPs
- All loop activity is suspended
- All ports enter the Open-Initializing state
- One port is selected as Master
- Ports are assigned an AL_PA
- Positional AL_PA map of loop is built (if supported)
- All ports return to Monitoring state
- Normal loop operations resume
The Loop Initialization Primitive (LIP) is used to begin the process and suspend any activities if the loop is currently active. Receiving ports recognize the loop initialization process when at least three consecutive LIPs are received. The port enters the Open-Init state and continues to retransmit the LIPs to the next port on the loop. Once all the ports are in the Open-Init state, a series of frames are passed around the loop to determine a loop master, assign an AL_PA to each device on the loop and report the position of each device (optional). The loop returns to the monitoring state and normal loop operations can resume.
Loop Initialization
Sequence of Events
LIPs (Loop Initialization Primitive Sequence)
FC Frames:
- LISM (Select Master)
- LIFA (Fabric Assigned)
- LIPA (Previous Assigned)
- LIHA (Hardware Assigned)
- LISA (Software Assigned)
- LIRP (Report Position, if supported)
- LILP (Loop Position, if supported)
CLS (Close Primitive Signal)
Initialization is complete: public loop devices can log into the Fabric.
LIP and CLS are Ordered Sets used to indicate states or events. Ordered Sets are special four-character combinations that have special meaning in Fibre Channel. A LIP is a Primitive Sequence Ordered Set. Primitive Sequences are used to indicate states or conditions and are normally transmitted continuously until something causes the current state to change. CLS is a Primitive Signal Ordered Set. Primitive Signals are used to indicate events or actions and are normally transmitted once.
LISM is a frame that each device places on the loop. It determines which device becomes the loop master. That port controls the rest of the loop initialization process.
LIFA, LIPA, LIHA and LISA are frames passed around the loop so that devices can have their AL_PAs assigned.
LIRP and LILP are also frames passed around the loop, but they are used to report the position of each device on the loop. This is an optional step in the loop initialization process. It allows any device to learn not only the AL_PAs of all the devices but also the order in which they occur on the loop.
There are different types of LIP sequences:
- LIP(F7,F7) - loop port in the initializing state does not have an AL_PA
- LIP(F7,AL_PS) - loop port identified by AL_PS requests loop initialization
- LIP(F8,F7) - loop port without a valid AL_PA (thus the F7), in the initializing state, requests loop initialization due to loop failure
- LIP(F8,AL_PS) - loop port identified by AL_PS detects loop failure
- LIP(AL_PD,AL_PS) - used to perform a vendor-specific reset at loop port AL_PD; the AL_PS port originated the request
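The five LIP types listed above differ only in their two argument bytes, so they can be told apart with a small decision function. This is an illustrative sketch: the function name `classify_lip`, the parameter names, and the returned strings are assumptions made for the example, not part of any standard API.

```python
def classify_lip(first: int, second: int) -> str:
    """Classify LIP(first, second) according to the list of LIP types."""
    if first == 0xF7 and second == 0xF7:
        return "initializing port has no AL_PA"
    if first == 0xF7:
        return f"port {second:02X} requests loop initialization"
    if first == 0xF8 and second == 0xF7:
        return "port without AL_PA reports loop failure"
    if first == 0xF8:
        return f"port {second:02X} detects loop failure"
    # Any other first byte is treated as AL_PD: a selective (vendor-specific) reset.
    return f"port {second:02X} requests reset of port {first:02X}"


for args in [(0xF7, 0xF7), (0xF7, 0xE8), (0xF8, 0xF7), (0xF8, 0x01), (0xE8, 0x01)]:
    print(f"LIP({args[0]:02X},{args[1]:02X}): {classify_lip(*args)}")
```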
Switched fabric - An extensive storage network in which large numbers of servers and storage systems are connected using Fibre Channel switches. Switches can be cascaded and combined with loops to create highly interwoven networks known as fabrics. Fortunately, these complex solutions can be kept under control by software that takes advantage of SAN management capabilities built directly into the fabric.
Switched SAN Fabrics:
- Fullest FC network topology
- Require FC switch technology
- Difficult for vendors to develop
- Easy for customers to deploy
- Maximum possible nodes (16 million, or 2^24, theoretical)
- Higher overall aggregate throughput: each connection to the switch is 100 or 200 MB per second
- Enterprise any-to-any connectivity: any device on the switch/Fabric can communicate with any other device on the Fabric
- Scaling is easy, as switches can be connected together in various topologies. The result is a Fibre Channel Fabric.
Footnote 1: fabric.ops parameters are configurable parameters that need to be the same on all Fabric switches. Examples include:
fabric.ops.dataFieldSize: 2112
<Output truncated>
fabric.ops.mode.pidFormat: 1
<Output truncated>
The term Fabric can also refer to the physical switches, or to a set of global software components such as the routing tables, zoning configuration, and name server.
Nodes - Transmit and receive information via one or more ports, which provide the physical connection(s) for the nodes.
Ports - Have separate transmit (Tx) and receive (Rx) functions:
- Tx encodes and transforms data to serial format.
- Rx recovers the clock from the serial data received, then decodes and deserializes the data.
[Figure: Node I (Nx_Port A) and Node II (Nx_Port B), each port with a Tx and an Rx side]
Each Node has a unique 64-bit address called the Node World Wide Name. The format of this 64-bit identifier, along with the format of the 64-bit identifier for each port on the node, is specified by IEEE. Each Port also has a unique 64-bit address called the Port World Wide Name. N_Ports are node ports that can attach either to other N_Ports or to Fabric Ports (F_Ports). Nx_Ports can be either N_Ports (x not used) or NL_Ports (Node Loop Ports), which are used in the Arbitrated Loop topology. Each Nx_Port also has a 24-bit address, also referred to as a port identifier (PID), a Source Identifier (S_ID) when it is used as a source address in FC communications, and a Destination Identifier (D_ID) when it is used as a destination address in FC communications. The PID is assigned to the port when it logs into the fabric (FLOGI).
Deciphering FC Addresses
More FC Terms
Node and port names - Fixed 64-bit addresses used to uniquely identify Fabric devices; also referred to as node and/or port World Wide Names (WWNs).
Fabric Address - Required address, needed for devices and services to communicate; also referred to as:
- Port identifiers (PIDs)
- 24-bit addresses
- Source IDs (S_ID) or destination IDs (D_ID)
Well-Known Addresses - The Fabric addresses used for accessing Fabric services.
Fabric Services - Intelligent services provided by a Fabric, necessary for Fabric operation (more on these in the next module).
FC Addresses - FC layers
Compare to OSI layers:
10:00:00:60:69:50:60:02
First hex digit: Name Assignment Authority (NAA) digit (Brocade uses a 1)
Next three hex digits: FC Standard reserved
Based on the IEEE Standard format, a typical SilkWorm Node WWN is 10:00:00:60:69:xx:xx:xx, where:
- The first 2 bytes are always 10:00 (format 1 addressing).
- The next 3 bytes are vendor specific. Brocade was assigned 00:60:69.
- The last 3 bytes are derived from the Brocade SilkWorm main board.
The 3-byte company ID found in the 64-bit IEEE Standard format WWN can be searched at: http://standards.ieee.org/regauth/oui/index.html
The first 4 bits of an FC 64-bit address identify the authority responsible for administration of that address, the Name Assignment Authority (NAA). A subset of the NAA address authority denotes the naming convention used. Table 41 of the FC-PH Rev 4.3 Fibre Channel standard defines the NAA identifiers. Brocade uses a hex 1 in the first 4 bits. This translates to binary 0001 and tells you that the Brocade node address is an IEEE format 1 name, which is based directly on the 48-bit IEEE (MAC) address; the company ID occupies the middle 3 bytes of the 64-bit address (Brocade's = 00:60:69). See the notes on the next slide for a list of common NAA identifiers.
20:00:00:60:69:50:60:02
First hex digit: Name Assignment Authority (NAA) digit (Brocade uses a 2)
Next three hex digits: usually set by the vendor to uniquely identify a port on a device or switch
Fabric Port Name: 2p:pp:00:60:69:xx:xx:xx
The next 3 nibbles (p:pp) are used by Brocade to show the switch port number. Example: 20:04:00:60:69:1f:25:e6, where 0:04 means this is port 4 on the switch.
Common NAA identifiers include:
HEX 1 (binary 0001): Address based on the IEEE 48-bit address (company ID in the middle 3 bytes of the 8-byte (64-bit) WWN), referred to as address format 1. Brocade's = 00:60:69.
HEX 2 (binary 0010): A format 2 address based on the same IEEE address described for NAA hex 1, but used to define ports associated with a node that uses IEEE address format 1.
HEX 5 (binary 0101): Format 5 IEEE registered addressing was added in the FC-PH3 standard to extend the number of vendor addresses beyond NAA = 0x1. Format 5 allows vendors to use the whole address space (all bits) as a Vendor-Specific IDentifier (VSID).
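The WWN layout described above can be picked apart with a few bit operations. The following is a minimal sketch, assuming only the format 1 and format 2 layouts given in the text (NAA nibble, 12 further bits, 3-byte company ID, 3 vendor bytes). The function name `parse_wwn` and the dictionary keys are hypothetical, and other NAA formats are deliberately left undecoded.

```python
def parse_wwn(wwn: str) -> dict:
    """Split a colon-separated 64-bit WWN into the fields described in the text."""
    raw = bytes.fromhex(wwn.replace(":", ""))
    if len(raw) != 8:
        raise ValueError("a WWN is 8 bytes (64 bits) long")
    naa = raw[0] >> 4                      # first 4 bits: Name Assignment Authority
    fields = {"naa": naa}
    if naa in (1, 2):
        # Next 12 bits: reserved (format 1) or vendor-set port bits (format 2).
        fields["next_12_bits"] = ((raw[0] & 0x0F) << 8) | raw[1]
        # Bytes 3 to 5: IEEE company ID.
        fields["company_id"] = ":".join(f"{b:02x}" for b in raw[2:5])
    return fields


node = parse_wwn("10:00:00:60:69:50:60:02")
port = parse_wwn("20:04:00:60:69:1f:25:e6")
print(node)   # NAA 1, company ID 00:60:69
print(port)   # NAA 2; on a Brocade switch the 12 bits give the port number (4)
```

As the text notes, vendors use the 12 bits after the NAA digit differently, so reading them as a port number is only valid for the Brocade convention shown here.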
A node WWN is often referred to as a node name.
A port WWN is often referred to as a port name.
Not all Fabric devices assign port WWNs the same way. The fictitious HBA vendor pictured above used the 2nd nibble of the first byte to designate port numbers, while Brocade switch port WWNs use the 2nd nibble of the 2nd byte to designate ports in WWN addressing.
FC Addresses Analogy
Telephone Numbers for FC Devices
Telephone Service:
- Telephone number to call
- Your telephone number
- Telephone service (accessed with your telephone number)
Fabric Service:
- Destination ID (the Fabric address of whom you want to communicate with)
- Source ID (your Fabric address)
- Fabric service
FC Addresses
Fabric addresses are 24 bits (3 bytes) long.
A device's Fabric address indicates:
- The switch and port number to which the device is connected
- The FC type of device (Fabric or loop)
Fabric addresses are represented in hexadecimal format, often with a 0x prefix before the address.
Fabric addresses come in two modes: Native and Core PID address modes.
Native Addressing Mode: The PID format on switches running Fabric OS v2.x and v3.x could originally only support a maximum of 16 ports in one switch. The 24-bit port address format consists of three bytes defining the Domain identifier, Area address and AL_PA fields respectively. Each field can provide 00-FF addressing. The Domain ID field byte provides domain addressing 1-239. The three byte fields of the old PID format were defined as XX1YZZ, where Y was a hexadecimal number that specified a particular port on a switch and 1 was constant. When Brocade developed the ASIC for the SilkWorm 2000 series, the largest switch had 16 ports, so only half of the second byte (the Area field of the PID) was required to specify ports.
Core PID Addressing Mode: To support the increased port count on the higher port count products based upon Brocade Fabric OS v4.x, the new format XXYYZZ has been adopted, where YY represents a port area designator. Using the entire middle byte for the port area designator allows Brocade switches to scale up to the Fibre Channel standard maximum of 256 ports per switch. Core PID addressing mode is the default address mode on all Brocade switches with more than 16 ports. To ensure interoperability between Fabric OS v4.x based products and Fabric OS v2.x and v3.x based products, while maintaining compatibility with older firmware versions, a setting was created that allows the PID format to be set to either the new format or the old format. This is commonly known as the Core Switch PID format setting.
FC Addresses Cont.
Each switch (Domain) is responsible for assigning unique 24-bit Fabric addresses (also referred to as PID, S_ID or D_ID). Addresses are three bytes long.
FC Addresses Cont.
Address Assignment Dependency
3 Address Classifications:
Fabric Address: NN NN 00
- NN NN 00 is the generic address of any Fabric device that has logged into the fabric (FLOGI).
- The FLOGI response assigns the device its 24-bit Fabric address.
- Native mode has a 1 in the 1st nibble of the 2nd byte.
- Core PID mode uses the entire 2nd byte as the AREA.
Public Loop Address: LL LL PP
- LL LL is assigned by the Fabric at login (FLOGI), and PP is the local loop address (AL_PA).
- These devices first LIP to get an 8-bit AL_PA, then FLOGI and are assigned the other 16 bits (LL LL).
- PP is always a non-zero value.
Private Loop Address: 00 00 PP
- The device LIPs to obtain PP, the local loop address (AL_PA).
- Private addresses only use the last byte (8 bits) of the 24-bit Fabric address.
- PP is always a non-zero value.
Fabric-attached devices use an address format of NN NN 00, where NN NN 00 is the address of any Fabric-attached device that has logged into the fabric. The first byte of this Fabric-assigned address represents the domain of the switch, and the last byte (2 nibbles) is 00, indicating a Fabric device. In native mode on a 2000 series switch, the 3rd nibble (first half of the 2nd byte) is 1 and the 4th nibble (second half of the 2nd byte) is the port, giving 16 possibilities (0-F). Larger port counts required a change in addressing modes, so Core PID addressing was developed and the 1 offset (2nd byte, 3rd nibble) was done away with. Core PID address mode uses an AREA designation to indicate port numbers 0-255.
Public Loop attached devices use an address format of LL LL PP, where LL LL is assigned by the Fabric at login and PP is the local loop address (AL_PA). This type of address is simply a Fabric-assigned address (24 bits) for a device attached to an FL_Port. The value of LL LL is the same for all Public Loop devices attached to the same FL_Port and has the same meaning as NN NN in Fabric addressing.
Private devices use an address format of 00 00 PP, where PP is the local loop address. A Private Loop device has a 1-byte (8-bit) address, called the arbitrated loop physical address (AL_PA). This type of address is all that a Private device is capable of receiving or sending (8 bits). Therefore, a Private device may only communicate with the devices it can see on the local loop.
Native address mode can be used when a 3900 or 12000 is not present in the fabric. Format: XX 1Y ZZ
Example: 02 14 00
Domain (Switch) ID = 2
Port Number = 4
Another sample Fabric address: 021500 (format XX1YZZ)
XX = 02: Domain_ID of the switch
1: Native mode
Y = 5: Port #
ZZ = 00: If 00, then it is an F_Port. If non-zero, then it is the AL_PA of the device on the FL_Port.
- Core PID address mode enables attachment to higher port count switches that use Fabric OS v4.x
- Configurable option under the configure command: Core Switch PID Format: (0..1) [1]
- This format allows interoperability with switches that have more than 16 ports
- Core PID address mode is the default, non-configurable address mode on switches with more than 16 ports
Example: 0a 21 00
Domain (Switch) ID = 0a = 10
AREA Number = 21 hex, so this is port 33
AL_PA = 00 (non-loop Fabric device)
!
The PID format on switches running Fabric OS v2.x and v3.x could originally only support a maximum of 16 ports in one switch. The 24-bit port address format consists of three bytes defining the Domain identifier, Area address and AL_PA fields respectively. Each field can provide 00-FF addressing. The Domain ID field byte provides domain addressing 1-239. The three byte fields of the old PID format were defined as XX1YZZ, where Y was a hexadecimal number that specified a particular port on a switch and 1 was constant. When Brocade developed the ASIC for the SilkWorm 2000 series, the largest switch has 16 ports, so only half of the second byte in the Area field of the PID was required to specify ports. To support the increased port count on the higher port count products based upon Brocade Fabric OS v4.x, the new format XXYYZZ has been adopted, where YY represents a port area designator. Using the entire middle byte for the port area designator allows Brocade switches to scale up to the Fibre Channel standard maximum of 256 ports per switch. To ensure inter-operability between Fabric OS v4.x based products and Fabric OS v2.x and v3.x based products, while maintaining compatibility with older firmware versions, a setting was created to enable the PID format to be set to use either the new format or the old format. This is commonly known as the Core Switch PID format setting.
Example PID: 0a 10 e8
Domain (Switch) ID = 0a = 10
Area Number = 10 hex = port 16
AL_PA = e8
This loop device is connected to switch 10 on port 16 and has a loop address of e8
To determine the address mode on a switch with 16 or fewer ports, check the core PID address mode using the configshow or configure commands
2003 Brocade Communications Systems, Incorporated. Revision0.1_FC101_2003
Well-Known Addresses
Service             Address
Fabric Login        FFFFFE
Directory Server    FFFFFC
Fabric Controller   FFFFFD
Time Server         FFFFFB
Mgmt Server         FFFFFA
Alias Server        FFFFF8
Broadcast Server    FFFFFF
Every switch has reserved three-byte addresses known as Well-Known Addresses. The services residing at these addresses provide a service to either nodes or management applications in the fabric.
Fabric Login: Before a fabric node can communicate with services on the switch or other nodes in the fabric, an address is assigned by the fabric login server. Fabric addresses assigned to nodes are three bytes long and are a combination of the domain ID plus the port area number of the port the node is attached to.
Directory Server: The directory server/name server is where fabric/public nodes register themselves and query to discover other devices in the fabric.
Fabric Controller: The fabric controller provides state change notifications to registered nodes when a change in the fabric topology occurs.
Time Server: The time server sends the time on either the principal switch or the Primary FCS switch to the member switches in the fabric.
Management Server: The Management server provides a single point for managing the fabric.
Alias Server: The Alias server keeps a group of nodes registered under one name to handle multicast groups.
Broadcast Server: This service is optional. Frames transmitted to this address are broadcast to all operational N and NL ports.
When registration and query frames are sent to a Well-Known Address, a different protocol service, Fibre Channel Common Transport (FC-CT), is used. This protocol provides a simple, consistent format and behavior when a service provider is accessed for registration and query purposes.
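As a small illustration, the address table above can be held in a lookup table so that a destination ID seen in a trace can be labeled. The names `WELL_KNOWN_ADDRESSES` and `service_for` are hypothetical helpers chosen for this sketch.

```python
# Well-Known Addresses exactly as listed in the table above.
WELL_KNOWN_ADDRESSES = {
    0xFFFFFE: "Fabric Login",
    0xFFFFFC: "Directory Server",
    0xFFFFFD: "Fabric Controller",
    0xFFFFFB: "Time Server",
    0xFFFFFA: "Management Server",
    0xFFFFF8: "Alias Server",
    0xFFFFFF: "Broadcast Server",
}


def service_for(d_id: int) -> str:
    """Name the fabric service behind a destination ID, if it is well-known."""
    if d_id in WELL_KNOWN_ADDRESSES:
        return WELL_KNOWN_ADDRESSES[d_id]
    # Anything else is treated here as an ordinary device port identifier.
    return "device port identifier %06X" % d_id
```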
- The Exchange manages the transaction; it contains a set of related Sequences
- Sequences within the Exchange hold sets of related Fibre Channel frames
- A Frame contains a header and payload and is up to 2148 bytes
An example Small Computer System Interface (SCSI) Read Command:
[Figure: the Initiator sends a CMD (Sequence) to the Target; the whole transaction is one Exchange made up of Sequences of Frames]
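[Figure: frame flow between N_Port A and N_Port B (server/storage/workstation). Each frame (SOF, Header, Data, CRC, EOF) travels buffer to buffer as part of a Sequence.]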
This picture shows the flow of data in the Fibre Channel environment for a point-to-point connection or a Fibre Channel connection through a Fabric. One or more frames will be sent, and those frames can reside in one or more sequences. The sequences reside in an Exchange. From the other point of view: Exchanges consist of sequences of frames. The flow control (throttling of data from one port to another) depends upon the class of service (COS) being used, as specified during PLOGI when common service parameters were exchanged.
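The containment described above (an Exchange holds Sequences, a Sequence holds Frames) can be modeled with three small data classes. This is a simplified sketch: the class names and `make_sequence` are hypothetical, the 2112-byte data field limit is taken from the frame description later in this module, and numbering frames from 0 within each sequence is an assumption made for readability.

```python
from dataclasses import dataclass, field
from typing import List

MAX_PAYLOAD = 2112  # assumed maximum data field size per frame, in bytes


@dataclass
class Frame:
    seq_cnt: int      # position of the frame within its sequence
    payload: bytes


@dataclass
class Sequence:
    seq_id: int
    frames: List[Frame]


@dataclass
class Exchange:
    ox_id: int        # exchange ID chosen by the originator
    sequences: List[Sequence] = field(default_factory=list)


def make_sequence(seq_id: int, data: bytes, max_payload: int = MAX_PAYLOAD) -> Sequence:
    """Split data into frames no larger than max_payload bytes."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)] or [b""]
    return Sequence(seq_id, [Frame(cnt, chunk) for cnt, chunk in enumerate(chunks)])
```

A read operation could then be represented as one `Exchange` holding a short command sequence followed by a data sequence that spans several frames.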
Exchange
- Composed of 1-n non-concurrent sequences
- Uni- or bi-directional flow of sequences for an operation
- An Exchange normally uses the same ULP
- An Exchange may be identified by each end: Originator / Responder Exchange IDs
- Exchange controls found in the F_CTL frame header field are: Seq_Init (initiating sequence); First_Seq (indicates first sequence of exchange); Last_Seq (indicates last sequence of exchange); Seq_ID (sequence identifier); Info_CAT - Unsolicited Command (information category); Exc_Contxt (indicates whether originator or responder in Exchange)
Sequence
- Composed of 1-n Frames
- Unidirectional set of frames for an operation
- Each Sequence is identified by the initiator: Sequence Identifier (SEQ_ID)
- Each frame within a Sequence is numbered: Sequential Count (SEQ_CNT)
- Other sequence controls: SOFix (start of frame for class x, used to indicate the class of service this sequence is using; SOFnx is used for subsequent frames); R_CTL (routing control to indicate data or ACK, for COS 1, 2, 4, 6 and F); End_Seq (set to 1 for the last frame of the sequence); SEQ_CNT (incremented by 1 for each data frame sent); EOF (the last frame of the sequence is indicated by an EOFt)
Frame
- The Frame is the smallest unit of transfer and is discussed in more detail in the next slide.
The frame header carries:
- Control information (routing, class, sequence count)
- Addressing (Source and Destination)

Word  Byte 0    Byte 1    Byte 2    Byte 3
0     R_CTL     Destination ID (D_ID)
1     CS_CTL    Source ID (S_ID)
2     Type      Frame CTL (F_CTL)
3     SEQ_ID    DF_CTL    SEQ_CNT
4     OX_ID               RX_ID
5     Parameter
      Payload
Important bytes:
- R_CTL = Routing Control
- Destination Fabric Address
- Source Fabric Address
- Protocol Type: SCSI, IP
- Payload word that defines what is being said, called the command code
- Payloads often contain node and port names
R_CTL - Routing Control bits communicate the type of frame we are looking at: Extended Link Service Frame, Data Frame, and Acknowledge Frames are common.
D_ID - Destination ID (Native port address or well-known address)
CS_CTL - Class Specific Control field. This field is always zero for Classes 2 and 3 per the standards but may change in the future
S_ID - Source ID (Native port address or well-known address)
Type - Data Structure Type that describes what the data is, i.e.:
  01 = Extended Link Services
  05 = ISO/IEC 8802-2 LLC/SNAP (IPFC)
  08 = SCSI FCP
  20 = Fibre Channel Services
F_CTL - Frame Control. This field contains information related to the frame contents. Example: First/last sequence, passing initiative
SEQ_ID - Sequence ID
DF_CTL - Data Field Control. This field indicates if there are any optional headers
SEQ_CNT - Sequence Count. Indicates the sequential order of the frame in the sequence
OX_ID - Originator Exchange ID. The exchange ID assigned by the originator
RX_ID - Responder Exchange ID. The exchange ID assigned by the responder to the Exchange
Data Field/Payload - This is the payload of the frame and can be from 0 to 2112 bytes in length
The R_CTL byte is one of the first items to check.
Note: S_IDs and D_IDs are also referred to as PIDs (port identifiers) or 24-bit addresses.
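To make the header layout concrete, the sketch below unpacks a 24-byte frame header into the fields just described. It assumes the header has already been captured as raw big-endian bytes without the SOF delimiter; `parse_fc_header` and `TYPE_NAMES` are hypothetical names used only for this example.

```python
import struct

# Type codes listed in the field descriptions above.
TYPE_NAMES = {
    0x01: "Extended Link Services",
    0x05: "ISO/IEC 8802-2 LLC/SNAP (IPFC)",
    0x08: "SCSI FCP",
    0x20: "Fibre Channel Services",
}


def parse_fc_header(header: bytes) -> dict:
    """Decode the six 32-bit words of a Fibre Channel frame header."""
    if len(header) != 24:
        raise ValueError("the frame header is exactly 24 bytes")
    # Words 0-2 each pack a 1-byte field with a 3-byte field; word 3 is
    # SEQ_ID, DF_CTL and a 2-byte SEQ_CNT; word 4 is OX_ID and RX_ID.
    w0, w1, w2, seq_id, df_ctl, seq_cnt, ox_id, rx_id, parameter = struct.unpack(
        ">IIIBBHHHI", header
    )
    type_code = w2 >> 24
    return {
        "r_ctl": w0 >> 24,
        "d_id": w0 & 0xFFFFFF,
        "cs_ctl": w1 >> 24,
        "s_id": w1 & 0xFFFFFF,
        "type": type_code,
        "type_name": TYPE_NAMES.get(type_code, "unknown"),
        "f_ctl": w2 & 0xFFFFFF,
        "seq_id": seq_id,
        "df_ctl": df_ctl,
        "seq_cnt": seq_cnt,
        "ox_id": ox_id,
        "rx_id": rx_id,
        "parameter": parameter,
    }
```

Checking `d_id` against FFFFFE, FFFFFD or FFFFFC is then a quick way to see which fabric service a captured frame is addressed to.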
Summary
- FC Topologies include Point-to-Point, Arbitrated Loop and Switched Fabric
- FC Terminology discussed includes Fabric, Nodes, Ports, and a review of Exchanges, Sequences and Frames
- FC devices have node and port WWNs
- FC device address classifications: Private, Fabric and Public Loop
- FC addresses are called Port Identifiers (PIDs), 24-bit addresses, and S_ID or D_ID addresses
- Well-Known Addresses are used to communicate with Fabric services
- The FC Frame is up to 2148 bytes and includes a SOF, Header, Payload, CRC and EOF
Additional Information
- Use the resources page Internet Link section: find the SAN ED 101 link SAN Fabric Foundation and look at chapter 2 (Fibre Channel Essentials) for additional information about material presented in this module
- Use the resources page Reference section: see the FC_AL Initialization presentation for detailed FC_AL initialization information
Course Objectives
After completing this module, attendees should be able to:
- Identify FC Communication Methods
- Introduce
- Identify
Topics
- FC Communication Methods
  - Switch Initialization
  - FC Communication Terminology
  - Connection Naming
  - Device Initialization
  - Link Communication
- Well-Known address fabric communication services
- What happens when a Fabric device connects to a Fabric?
- What happens when a public loop device connects to a Fabric?
- How do private devices communicate in a Fabric?
[Figure: a Switch with an F_Port (Fabric Port) and a Hub-attached loop containing public NL_Ports and two Private NL_Ports. Note: All NL_Ports are public unless otherwise noted.]
Nodes that attach to the fabric can be either N (Node) or NL (Node Loop). NL nodes have two classifications: private or public. Private NL nodes can only communicate with other nodes that are attached to the same hub or FL_Port; hence the word private. Public NL nodes can communicate with any member of the same hub or FL_Port and have the ability to send a frame to the fabric. Fabric Nodes (N_Ports) can communicate with any Fabric Node and can communicate with private or public NL nodes on a loop. This diagram shows an example of a loop that contains public and private devices. Communication can take place between:
1. Devices on the loop.
2. A device attached to the switch and a device on the loop, because Brocade FL_Ports allow this communication to occur. (The fabric-attached host could write data to the storage on the loop.)
Switch Initialization
- Verify CPU DRAM memory
- Initialize base Fabric Operating System (FOS)
The initialized FOS does the following:
- Execute Power-On-Self-Test (POST) tests
- Initialize ASICs and front panel
- Initialize link for all ports (put online)
- Explore the fabric and determine the principal switch
- Assign addresses to ports
- Build unicast routing tables
- Enable N_Port operations
Port Initialization Error Conditions
- NO_SYNC and NO_SEGMENT errors indicate that the port has a problem initializing
  - Usually media or loopback device: cable or plug (self loop)
- ERRSTAT and ERRSTATS generally indicate that the port is good enough to initialize, but not good enough to sustain traffic
  - Usually signal integrity
- PORTDIED and TIMEOUT errors indicate that frame data issues caused the low level driver or hardware to discard a frame or take a port offline
[Figure: U_Port mode detection flowchart with yes/no decisions. States: FL_Port (State 2); G_Port (State 3), "I'm waiting for someone to talk to me"; E_Port (State 4); F_Port (State 5). Transition 3 asks: "Are you a switch or a Fabric point-to-point device?"]
1. A switch port is a Universal Port (U_Port) that operates in either E/F_Port (G_Port) mode or FL_Port mode (State 1).
2. Is something connected to the port? If yes, continue (Transition 1).
3. The U_Port starts the mode detection process by transmitting at least 12 LIP(F7) Primitive Sequences (Transition 2).
   a. If at least 3 consecutive LIP(F7) Primitive Sequences are received, the port enters the OPEN_INIT state and attempts FC-AL loop initialization (State 2).
   b. If LIP Primitive Sequences are not received, the U_Port attempts OLD_PORT initialization by taking the link down and then transmitting NOS primitives (Transition 2). If the Link Initialization Protocol fails after 1 retry, or a LIP is received after 1 second, go to Transition 2 (FC-AL) initialization.
   c. When operating in FL_Port mode, a U_Port will try the loop initialization procedure three times. If all these tries fail, the port will be marked as faulty. To ensure N_Port, reinitialize the port; the switch port will cut the laser, forcing a loss of signal state for at least 20 s, then bring back the laser and issue NOSs (Transition 2).
4. The U_Port will attempt OLD_PORT initialization (Link Initialization Protocol for point-to-point) by taking the link down and then transmitting the NOS Primitive Sequence if LIP times out, if any of the loop initialization phases time out, if only one non-zero AL_PA is claimed in the Loop Initialization Sequences (LOOP_EMPTY=false), or if no non-zero AL_PA is claimed (LOOP_EMPTY=true).
5. If the ACTIVE state is reached, the port will operate in G_Port mode (State 3). The normal E_Port or F_Port mode detection procedure follows (Transition 2).
   a. If ELP succeeds, the U_Port operates in E_Port mode (State 4).
   b. If a valid FLOGI is received, the U_Port becomes an F_Port (State 5).
   c. If self loopback is detected after ELP exchanges and LOOP_EMPTY = false, the port exits G_Port and reinitializes as an FL_Port (State 2).
Note: The firmware will automatically attempt to reinitialize a faulty port every two seconds.
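The outcome of the steps above can be condensed into a small decision function. This is a deliberately simplified sketch that ignores retries, timeouts and the self-loopback case; the function name and its boolean inputs are hypothetical and stand in for events the switch firmware would observe.

```python
def detect_port_mode(
    cable_connected: bool,
    consecutive_lips: int,
    link_active: bool,
    elp_succeeded: bool,
    flogi_received: bool,
) -> str:
    """Return the mode a U_Port would settle into for the observed events."""
    if not cable_connected:
        return "U_Port"      # State 1: nothing attached yet
    if consecutive_lips >= 3:
        return "FL_Port"     # State 2: loop initialization was attempted
    if not link_active:
        return "U_Port"      # link initialization has not reached ACTIVE
    if elp_succeeded:
        return "E_Port"      # State 4: the neighbor is another switch
    if flogi_received:
        return "F_Port"      # State 5: the neighbor is a fabric device
    return "G_Port"          # State 3: waiting for someone to talk
```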
Link Initialization
Device Initialization into Fabric from Device Perspective
- If speed negotiation is available, the device and switch will negotiate to the highest common speed
  - The speed negotiation process is on the next slide
- Ordered set characters act as frame delimiters, primitive signals and primitive sequences
[Figure: a device attached to a switch F_Port]
Speed Negotiation
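[Figure: speed negotiation flowchart. Speed List: 2Gbps, 1Gbps. From START, the inner loop (Set RX Speed, 16ms) tests whether RX = incoming speed; if yes and RX = TX, the speed is set; otherwise the outer loop (Set TX Speed) selects the next TX speed.]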
Switch Port, Auto-Negotiation process: In general, the primary objective of the Auto-Negotiation algorithm is to start at the highest speed possible and step down until both devices agree on a speed. Lower speeds are tried only if higher speeds fail.
When signal (Power ON, Loss of Signal, Loss of Sync) is detected, the Transmitter (TX) and Receiver (RX) will be set to the highest possible speed for that port in preparation for the Auto-Negotiation process. Once set, the port will wait for light and then proceed with the Speed Negotiation process.
Speed negotiation can be viewed as a two-phase process. The inner box is used to set the RX speed, whereas the outer box is used to set the TX speed. The Speed List is used by the TX and RX to try eligible speed settings. At this time, the Bloom ASIC will have 2 speeds in the speed list (1Gbps and 2Gbps). Once the lowest speed has been tested by either TX or RX, that specific pointer will be reset back to the top of the list and the highest speed will be tested again. RX and TX have separate Speed List pointers.
There are three primary timers associated with setting the TX/RX speeds:
- TX Sync Timer (154ms): This timer actually controls the amount of time a Receiver can spend synchronizing on a specific speed from the Speed List. When the Receiver selects a speed, it has 16ms to sync up before going to the next lower speed. Since the TX Sync Timer is set to 154ms, this will allow multiple speeds to be tested before exiting to the Set TX Speed process.
- RX Sync Timer (16ms): This timer controls the amount of time allowed for a receiver to sync up on a specific speed.
- Auto-Neg Timer (1600ms): This timer controls the amount of time allowed for Speed Auto-Negotiation. If this timer is exceeded, the process starts all over again.
Setting the RX Speed:
1. The port will allow the Receiver 16ms to sync. If a successful sync is accomplished, the RX speed is compared to the TX speed to ensure consistency. If equivalent, the Auto-Negotiation process has completed.
2. If a successful sync was not possible within the TX Sync Timer (154ms) window, then the RX Speed process is exited and the TX Speed process is performed (see Setting the TX Speed below).
3. If the TX Sync Timer has not been exceeded, the next speed in the Speed List is selected for RX and the RX Sync process is attempted again.
Setting the TX Speed:
1. This process is entered when the TX Sync Timer (154ms) has been exceeded.
2. If the Auto-Negotiation timer has been exceeded, the process stops and control is passed back to the Wait for Light condition.
3. If the Auto-Negotiation timer has not been exceeded, the TX is set to the next speed in the Speed List and control is passed to the Set RX Speed process described above.
Note: The Speed List exists inside the switch and contains the acceptable values to which the port can Auto-Negotiate.
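The two-phase loop can be approximated with a short simulation. The sketch below is a simplification under stated assumptions, not the ASIC algorithm: the link partner is modeled as transmitting at one fixed speed, each RX attempt costs the full 16ms, and a mismatch between the synced RX speed and the TX speed is assumed to move TX to the next entry in the Speed List. The function name `negotiate` is hypothetical.

```python
RX_SYNC_MS = 16      # time allowed for one RX sync attempt
TX_SYNC_MS = 154     # window for the whole Set RX Speed phase
AUTO_NEG_MS = 1600   # overall Auto-Negotiation budget


def negotiate(speed_list, partner_speed):
    """Return (agreed_speed, elapsed_ms); agreed_speed is None on timeout."""
    elapsed = 0
    tx_index = 0                         # TX starts at the highest speed
    while elapsed < AUTO_NEG_MS:
        phase = 0
        rx_index = 0                     # RX pointer restarts at the top
        synced = None
        while phase < TX_SYNC_MS:        # inner box: Set RX Speed
            phase += RX_SYNC_MS
            elapsed += RX_SYNC_MS
            if speed_list[rx_index] == partner_speed:
                synced = speed_list[rx_index]
                break
            rx_index = (rx_index + 1) % len(speed_list)
        if synced is not None and synced == speed_list[tx_index]:
            return synced, elapsed       # RX = TX: the speed is set
        # outer box: Set TX Speed, wrapping back to the top of the list
        tx_index = (tx_index + 1) % len(speed_list)
    return None, elapsed                 # back to Wait for Light
```

With a Speed List of `[2, 1]` (Gbps), a 2Gbps partner is matched on the first attempt, while a 1Gbps partner is matched only after TX has stepped down once.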
Port states: Active State (AC), Link Reset State (LR), Offline State (OLS), Link Failure State (LF)
1. Active state (AC): in the Active state, the port is able to transmit and receive frames and primitive signals. Certain conditions encountered within the port may cause the port to exit the active state and perform one of the following primitive sequence protocols: Link Reset, Offline state, or Link Failure.
2. Link Reset state consists of: LR1 (Link Reset Transmit), LR2 (Link Reset Receive), LR3 (Link Reset Response)
3. Offline state consists of: OL1 (Offline Transmit), OL2 (Offline Receive), OL3 (Wait for Offline)
4. Link Failure state consists of: LF1 (No Operational Receive), LF2 (No Operational Transmit)
FC Communication Overview
After Speed Negotiation and PSM ACTIVE
- The Extended Link Service (ELS) protocol is used for the following Fabric services: FLOGI, PLOGI, SCR, RSCN, and LOGO
- The Fibre Channel Common Transport (FC_CT) protocol is used for registration and query services to the name server (FFFFFC)
[Figure: N_Port A (server/storage/workstation) and a switch F_Port exchange frames buffer to buffer. IDLE primitive signals fill the link between frames, an R_RDY appears in the return direction, and frames are delimited by SOFi3/SOFn3 and EOFn/EOFt.]
Exchange Sequences contain two types of transmission words:
- Ordered Sets occur outside the FC Frame content
  - Ordered set characters act as frame delimiters (see block arrows in the figure), primitive signals and primitive sequences
  - There are at least two idles (primitive signal) between primitive signals at the source
- Data Words occur within the FC Frame content
  - There are normally at least six idles (primitive signal) between any two FC Frames at the source
Frame delimiters (SOF, EOF): in a Brocade Fabric (classes of service 2, 3 and F), frame delimiters identify the start and end of frames.
The SOF delimiter is used to indicate: a start-of-frame; the class of service associated with the frame (designated by the x in SOFix and SOFnx); and whether the frame is the first frame of a sequence (SOFix) or a normal/subsequent frame (SOFnx).
- SOFi3 indicates a start-of-frame for the first frame of a class 3 sequence
- SOFn3 indicates a normal, non-initiating frame within a class 3 sequence
The EOF delimiter is used to indicate: whether the frame terminates the sequence (EOFt) or is a normal frame within the sequence (EOFn); whether the sender of the frame aborted transmission of the frame (EOFa); and whether the Fabric between communicating devices detected an error (EOFni).
- EOFn indicates the end of a normal, non-terminating frame within a sequence
- EOFt indicates the end of the last frame within a sequence
- EOFa indicates the sender of this frame aborted it
- EOFni indicates the Fabric between sender and receiver detected an error
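The delimiter rules above lend themselves to a simple consistency check. The sketch below handles only the numbered classes (for example SOFi3 and SOFn3) and only the EOFn and EOFt endings; the function names are hypothetical and the check is an illustration, not a protocol analyzer.

```python
import re

EOF_MEANING = {
    "EOFn": "normal, non-terminating frame",
    "EOFt": "last frame of the sequence",
    "EOFa": "sender aborted the frame",
    "EOFni": "fabric detected an error",
}


def describe_sof(name: str) -> dict:
    """Decode a SOFix / SOFnx delimiter name for a numbered class of service."""
    match = re.fullmatch(r"SOF([in])(\d)", name)
    if match is None:
        raise ValueError("unsupported SOF delimiter: %s" % name)
    kind, cos = match.groups()
    return {"class_of_service": int(cos), "first_frame_of_sequence": kind == "i"}


def is_well_formed_sequence(delimiters) -> bool:
    """Check a list of (SOF, EOF) pairs: SOFi first, EOFt last, normal between."""
    if not delimiters:
        return False
    last = len(delimiters) - 1
    for index, (sof, eof) in enumerate(delimiters):
        if describe_sof(sof)["first_frame_of_sequence"] != (index == 0):
            return False
        if eof != ("EOFt" if index == last else "EOFn"):
            return False
    return True
```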
Communication Protocols
Established Fabric Communication Processes Overview
- Fabric Login (FLOGI)
- Accept
- N_Port Login (PLOGI)
- Process Login (PRLI)
- Registered State Change Notification (RSCN)
- State Change Registration (SCR)
- Logout (LOGO)
Extended Link Services provide a set of command instructions that are used to perform a unique task.
Fabric device: FLOGI -> PLOGI to Name Server -> SCR to Fabric Controller -> Register & Query [using the Fibre Channel Common Transport (FC_CT) protocol] -> LOGO
Private NL: LIP (FFFCxx PLOGI and PRLI will enable private storage devices that accept PRLI and thus appear Fabric capable)
Public NL: LIP -> FLOGI -> PLOGI -> SCR -> Register & Query -> LOGO, and then PLOGI -> PRLI and communicate with other end nodes in the Fabric
The LIP process includes: LIP, LISM, LIFA, LIPA, LIHA, LISA and in some cases LIRP and LILP
Loop Devices
- The Embedded Port (Domain Controller) is responsible for communication to/from all Well-Known addresses
- The Fabric embedded port (domain controller) needs a Fabric address
  - Brocade uses FFFCxx, where xx represents one of 239 possible domains
  - FFFCxx is used to PLOGI, PRLI and probe attached devices to retrieve information to put into the name server data base; this is how private storage devices appear like Fabric devices in Brocade Fabrics
- Embedded port probing (Fabric probing) is enabled by default, thus allowing private targets that accept PRLI into the Fabric
  - This Fabric probing can be disabled using the Brocade configure command
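The FFFCxx form is simply the constant FFFC00 combined with the domain number in the low byte. A minimal sketch, with a hypothetical function name:

```python
def embedded_port_address(domain_id: int) -> int:
    """Build the FFFCxx address of the embedded port for a given domain."""
    if not 1 <= domain_id <= 239:
        raise ValueError("domain IDs range from 1 to 239")
    return 0xFFFC00 | domain_id
```

For domain 10 this gives FFFC0A, which is the source address an attached device would see when the switch probes it.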
The Loop Initialization Process that Arbitrated Loop devices would have already gone through includes:
LIPs (Loop Initialization Primitive Sequence)
1. LISM (Select Master)
2. LIFA (Fabric Assigned)
3. LIPA (Previous Assigned)
4. LIHA (Hardware Assigned)
5. LISA (Software Assigned)
6. LIRP (Report Position (if supported))
7. LILP (Loop Position (if supported))
CLS (Close)
Initialization is complete. Now devices on the loop can log into the fabric!
- The Fabric Name Server provides a data base of end nodes (Fabric devices and private targets that accept Switch PRLI)
- Each Well-Known address has associated Fabric service(s)
The three most common Well-Known Addresses and associated Fabric services are:
- FFFFFE is the address for the Fabric F_Port Service, often called Fabric login
- FFFFFD is the address for the Fabric Controller Service
- FFFFFC is the address for the Name Server Service
Login Services
Three different levels of login:
The following information is implicitly captured and put into the Name Server during this process: type; COS; PID; port name (port WWN) ; and node name (node WWN)
- N_Port Login (PLOGI) is used by one Nx_Port to establish service parameters with another N_Port or NL_Port
- Process Login (PRLI) is used by an upper-level process in one port to establish image pairs and service parameters with the corresponding upper-level process in the other port
  - For example, it can be used to establish the environment between related SCSI processes on an originating Nx_Port and a responding Nx_Port
The Login services are used to exchange Service Parameters between N_Ports and N_Ports and between N_Ports and F_Ports. A PRLI is used to establish ULP "image pairs" and service parameters for a pair or all processes that are to occur between the PRLI initiator and PRLI responder during this PRLI session. A PRLI requests "image pair" establishment (EIP) and when bit 13 of ExLnk 0002 is set (as reported in capture depicted) both image pair establishment and service parameters between the image pairs are exchanged. The image pair service parameters are exchanged in ExLnk 0005 depicted below.
- You need the Fabric F_Port to establish Fabric communication capability
  - FC devices must Fabric Login (FLOGI) to the Fabric F_Port to be part of the Fabric
  - This FLOGI is sent to Well-Known Address FFFFFE
- When do devices normally communicate with the Fabric F_Port?
  - Devices will FLOGI upon initial connection to the Fabric
  - Upon re-connection (possibly due to a link loss)
[Figure: the device sends FLOGI to the switch and the switch answers with Accept]
- When devices first connect, their address is 000000 (unless they are loop devices, in which case their address is 0000pp, where pp = AL_PA)
- FLOGI is required before any frame can be sent through the fabric
- FLOGI is sent to well-known address FFFFFE (Fabric F_Port)
Fabric Login is used by a Nx_PORT to establish a session with the fabric and is necessary before any frames can be sent thru the fabric.
[Figure: the server sends PLOGI to FFFFFC and receives Accept; three SCSI devices are attached to the fabric]
The server is asking permission to have a conversation with the Name Server (NS) in the switch
After the FLOGI has completed the HBA will send a session request to FFFFFC asking for permission to login and send a storage query.
State Change Notification (SCN): SCNs are used for internal state change notifications, not external
- This is the switch logging that the port is online or is an Fx_Port
- This is not sent from the switch to the Nx_Ports!
FC devices that choose to receive RSCNs must register for this service
- Devices send a State Change Registration (SCR) to FFFFFD
- Registration indicates that the device wants to be notified of changes
- Devices normally register immediately after a PLOGI to the Name Server (NS) but could do so at any time after the NS PLOGI
Registered State Change Notification (RSCN): issued by the Fabric or an Nx_Port to devices that registered (issued an SCR requesting this notification)
- Not all devices handle RSCNs gracefully
- Brocade provides methods to help devices deal with RSCNs
Brocade provides a configurable feature that allows users to determine the RSCN behavior of capable attached devices:
Switch:admin> configure
Configure...
  Fabric parameters (yes, y, no, n): [no]
  Virtual Channel parameters (yes, y, no, n): [no]
  Zoning Operation parameters (yes, y, no, n): [no]
  RSCN Transmission Mode (yes, y, no, n): [no] y
    End-device RSCN Transmission Mode
      (0 = RSCN with single PID, 1 = RSCN with multiple PIDs, 2 = Fabric RSCN): (0..2) [0]
Brocade zoning also keeps RSCNs within the zone in most FOS versions (see notes on a later page in the module; look for an RSCN table).
- Devices register to receive change notifications using an SCR
- The Fabric Controller provides a unique notification service to its registered nodes called a Registered State Change Notification
RSCN is a mechanism defined in Fibre Channel to allow a Fabric to transmit an unsolicited notification about a state change to an N_Port that has registered (SCR) to receive it. Whenever a change occurs in the Fabric (e.g. new device, new switch, device removed, link failure, etc.), an RSCN will be sent out to those devices that have registered. This is a feature provided by the Fabric Controller to make the network more proactive and easy to manage.
Who uses this notification service? Devices that use this service are generally servers that want to keep track of a number of storage targets. A device registers for state change notification by transmitting a State Change Registration (SCR) frame to the well-known address of the Fabric Controller (FF FFFD). When there is a change in fabric topology, the Fabric Controller transmits a Registered State Change Notification (RSCN) frame to the device. Sample list of State Change Notification events:
Fabric detected:
- An N_Port logs in or re-logs in with the Fabric
- An N_Port implicitly logs out (goes offline)
- A Fabric reconfiguration has occurred
N_Port detected:
- An RSCN is received from an N_Port
The RSCN frame is simply a notification to the device that there has been a change in the network. It is up to the device to query the Name Server (FF FFFC) to assess the state of the Fabric.
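The register-then-notify pattern can be sketched as a tiny in-memory model. The `FabricController` class below is a hypothetical illustration and is unrelated to any switch firmware interface; it assumes, for simplicity, that the device that caused the change is not notified about itself.

```python
class FabricController:
    """Toy model of the SCR / RSCN service at well-known address FFFFFD."""

    ADDRESS = 0xFFFFFD

    def __init__(self):
        self.registered = set()

    def state_change_registration(self, pid: int) -> str:
        """Handle an SCR: remember that this port wants notifications."""
        self.registered.add(pid)
        return "Accept"

    def fabric_event(self, affected_pid: int) -> list:
        """Return the (destination, RSCN) pairs the event would produce."""
        return [
            (pid, ("RSCN", affected_pid))
            for pid in sorted(self.registered)
            if pid != affected_pid
        ]
```

Only registered ports receive anything, and the RSCN carries just the affected address, so the receiver still has to query the Name Server to learn what actually changed.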
[Figure: the server sends SCR to FFFFFD and receives Accept]
The server is registering to receive notification when something in the Fabric changes
- A Fabric shall contain one or more Fabric Controllers
- Responsible for managing the operation of the fabric:
  - Fabric Initialization
  - Fabric Configuration
  - Generate link responses
  - Start & stop connections
  - Frame Routing Management
The Fabric Controller (FFFFFD) service is a required logical entity within a Fabric that controls the general operation of the Fabric. It is the Fabric owner as well as the traffic controller. Functions include Fabric initialization, frame routing management, generation of link responses, and setup and tear down of dedicated connections. Since the Fabric Controller is such an important service, Fibre Channel deploys a fully distributed environment for this service. The Fabric Controller exists in every single Switch in a Fabric; therefore, there is no single point of failure.
Major Fabric management responsibilities:
- Execution of the Fabric initialization procedure
- Advertise RSCN (Registered State Change Notification)
Major traffic management responsibilities:
- F_Ports are interconnected by a routing function, which is managed by the Fabric Controller and allows frames to flow from one F_Port to another; N_Ports connect to an F_Port in the Fabric
- Setup and tear down of Dedicated Connections
- Perform general frame routing
- Parse and route frames directed to well-known addresses
- Generation of F_BSY (Fabric Busy) and F_RJT (Fabric Rejected) link responses
The F_BSY indicates that the Frame cannot be delivered because either the Fabric or the destination N_Port is temporarily busy. On receipt of an F_BSY in response to a transmitted Frame, the source N_Port is expected to attempt Frame retransmission, up to some number of retries. Recovery after retries are exhausted is dependent on the FC-4 ULP and the Exchange Error Policy.
The F_RJT response to a Frame indicates that delivery of that Frame is being rejected. Rejection indicates that the Frame contents are intact (i.e. no transmission errors) but the Frame cannot be received for some protocol-related reason, such as non-support of a service or inconsistent Frame header fields.
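The different handling of the two link responses can be shown with a small retry loop. This is a sketch under assumptions: `transmit` is a hypothetical callable that sends one frame and returns a response string, and the retry limit of 3 is arbitrary because the text only says "some number of retries".

```python
def send_frame(transmit, max_retries: int = 3):
    """Retry on F_BSY up to max_retries times; return (response, attempts).

    F_RJT and every other response are returned at once, because a
    rejection is a protocol problem that retransmission will not fix.
    """
    attempts = 0
    while True:
        attempts += 1
        response = transmit()
        if response != "F_BSY" or attempts > max_retries:
            return response, attempts
```

If the final response is still F_BSY, the caller would hand the failure to the upper-level protocol, as the text above describes.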
A device registered to receive RSCNs using an SCR is notified when:
- A new device has been added
- An existing device has been removed
- A zone has been changed
- A switch name or IP address changed
- The Fabric reconfigured
Registration is optional
The Fabric Controller is responsible for routing changes, topology changes and the SCR/SCN/RSCN processes. Footnote 1 - RSCNs are FOS dependent:
- Name Service is part of the FC_GS specification
- Responsible for directory information about Fabric-connected devices
- Every device must register with the Name Server when it is in the network
- The Name Server uses the FC-CT (Common Transport) protocol to distribute information throughout the Fabric (dynamic scalability)
- A Fibre Channel device can query any Name Server for network resource information (simple request and response model)
The Fibre Channel Name Server is responsible for directory information about Fabric-connected devices. The Name Server, simple yet effective, maintains name and address information about Fabric-connected devices in a complex, diverse environment.
Fibre Channel Name Server characteristics:
- No single point of failure
- Each Distributed Name Server maintains and owns local information and retrieves remote information from other Distributed Name Servers using a server-to-server protocol (based on FC-CT)
- Server-to-server communication is transparent to the external Name Service client
- A distributed Name Server may cache remote information for a period of time (900 seconds)
- Fast, efficient device discovery (rather than a server device manually polling for other devices sequentially, it discovers other devices with a single Name Server query)
- Up-to-date device information. Upon registration, the Name Server has essential device information that is available immediately. All associated Name Service information is deleted from the directory automatically when a switch or device goes down (deregisters).
Quick note: All the built-in network server services are part of the Fibre Channel Generic Services (FC-GS) specification. Try not to confuse these server services with the services offered by the FC-3 Common Service Layer. The FC-3 layer focuses on services that span multiple N_Ports, like Multicasting, Hunt Groups, etc.
Fibre Channel Name Server: Yes (Well-Known)

Feature                                                            DNS (TCP/IP)
Automatic Server location targeting                                Yes (via DHCP)
Dynamic member registration                                        Yes (Dynamic DNS)
Support detail member characteristics description                  No (IP and Name only)
Support full or restricted queries                                 No (partial query only)
Automatic replication database                                     No (manual zone transfer)
Multi-mastering (full read and write update) in the entire network No (Primary & Secondary)
Automatic unify resources view                                     No (manual work)
What is a Domain Name Server (DNS) in TCP/IP? When we are surfing the Internet, we want to go to a destination website such as http://www.brocade.com. It is the job of a DNS to translate this English name to a native IP address, which is the real destination address of that website. The DNS helps us locate the actual target address.
Like a DNS server, one of the key functions of the Fibre Channel Name Server is to help a device find the actual address of the target device. Since every device registers itself with the Name Server during the login procedures, other devices can use the Name Server as a central query point. The Name Server will translate the target device's World Wide Name (64 bits) and return its actual dynamic native Fibre Channel address (24 bits).
Maintain critical member information. Devices are identified by names, addresses, and other attributes. The Name Server maintains this information, acting like a telephone directory. The Name Server has no responsibility for the process of routing data among devices.
Provide automatic registration of essential device information on behalf of the devices, as well as deregistration.
Each Fibre Channel switch contains a Distributed Name Server that maintains local information. The Name Server provides local devices with access to the Name Service. When a query requires remote information, the Distributed Name Server will communicate with other Name Servers within the Fabric on behalf of its local device. This behind-the-scenes communication is transparent to the local device.
The Name Server is distributed throughout the Fabric. There can be multiple switches, and whichever switch is local to the host will serve as the Name Server of the fabric for that host (both read and write). This means that if a server sends a query to the Name Server, the local Name Server will query the rest of the switches and then respond to the host with all of the fabric's information that corresponds to the query.
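The local-first, then-remote lookup described above can be sketched as follows. The class `DistributedNameServer` is a hypothetical in-memory model; the WWN strings in the usage note are made up, and the sketch leaves out caching and the FC-CT exchanges between switches.

```python
class DistributedNameServer:
    """Toy per-switch name server that resolves a port WWN to a 24-bit PID."""

    def __init__(self, domain: int):
        self.domain = domain
        self.local = {}     # port WWN -> PID, for devices on this switch
        self.peers = []     # name servers on the other switches

    def register(self, port_wwn: str, pid: int) -> None:
        self.local[port_wwn] = pid

    def deregister(self, port_wwn: str) -> None:
        self.local.pop(port_wwn, None)

    def query(self, port_wwn: str):
        """Answer from local data first, then ask peers on the client's behalf."""
        if port_wwn in self.local:
            return self.local[port_wwn]
        for peer in self.peers:
            pid = peer.local.get(port_wwn)
            if pid is not None:
                return pid
        return None
```

A host attached to one switch can therefore resolve a target such as `"21:00:00:20:37:00:00:02"` that registered on another switch, without knowing that a second Name Server was consulted.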
Devices always log in to the Fabric: FLOGI to FFFFFE
Devices then log in (PLOGI) to FFFFFC
After PLOGI, devices typically register and send queries to the Name Server (NS) at FFFFFC; devices can also deregister
When:
Upon initial connection, after FLOGI
After receiving an RSCN from the Fabric Controller
SCSI initiators typically communicate with the Name Server to register information about themselves and to determine which devices they can communicate with in the Fabric. Query information returned from the Name Service is used to build a device table database. Some legacy devices may have a list of devices (PIDs or WWNs) they communicate with. Devices that attempt to communicate with Fabric devices that are not in the queried Name Server database are called bad citizens.
The Name Server keeps a list of attached devices and their attributes. Attributes include:
Type (N or NL)
COS (Class of Service the device supports)
PID (port identifier, also called the 24-bit address)
Port WWN, also called port name
Node WWN, also called node name
FC-4 types this device is capable of communicating [08 = SCSI (FCP) and 05 = IP over FC]
Symbolic device manufacturer information that the device registers
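As a rough illustration of the attribute list above, a Name Server entry can be modeled as a small record and filtered by FC-4 type, which is roughly what happens when an initiator asks for SCSI devices. The class NameServerEntry, its field names, and all example values are hypothetical; only the FC-4 type codes 08 and 05 come from the text.

```python
from dataclasses import dataclass

# FC-4 type codes quoted in the text above
FC4_SCSI_FCP = 0x08
FC4_IP = 0x05


@dataclass(frozen=True)
class NameServerEntry:
    """Hypothetical record holding the attributes listed above."""
    port_type: str          # "N" or "NL"
    cos: tuple              # classes of service the device supports
    pid: int                # 24-bit port identifier
    port_wwn: str
    node_wwn: str
    fc4_types: frozenset    # FC-4 type codes the device registered
    symbolic_name: str = ""


def scsi_devices(entries):
    """Return the entries that registered FC-4 type 08, SCSI (FCP)."""
    return [e for e in entries if FC4_SCSI_FCP in e.fc4_types]


# Example values are invented for illustration.
entries = [
    NameServerEntry("N", (3,), 0x010400, "10:00:00:00:c9:00:00:01",
                    "20:00:00:00:c9:00:00:01", frozenset({FC4_SCSI_FCP, FC4_IP})),
    NameServerEntry("NL", (3,), 0x0105E8, "21:00:00:20:37:00:00:02",
                    "20:00:00:20:37:00:00:02", frozenset({FC4_SCSI_FCP}), "example disk"),
    NameServerEntry("N", (2, 3), 0x020100, "10:00:00:00:c9:00:00:03",
                    "20:00:00:00:c9:00:00:03", frozenset({FC4_IP})),
]

for entry in scsi_devices(entries):
    print(f"{entry.pid:06X} {entry.port_type} {entry.port_wwn}")
```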
[Diagram: Register, answered by Accept; Query *, answered by Accept]
* Accept includes list of SCSI devices
A PLOGI to the Name Server is required before devices can register or query
Typically, both SCSI initiators and targets register
Additionally, SCSI initiators usually query for a list of allowed communication devices so they can build or rebuild device tables (see Query * above)
Recall that registration and query processes use the Fibre Channel Common Transport (FC_CT) protocol
2003 Brocade Communications Systems, Incorporated. Revision0.1_FC101_2003
Fabric Login is used by an Nx_Port to establish a session with the fabric and is necessary before any frames can be sent through the fabric.
FLOGI to FFFFFE and obtain 24-bit Fabric address
PLOGI to FFFFFC
SCR to FFFFFD
Register and query FFFFFC for database of Storage/Tape targets
FFFFFC matches HBA to zones (Fabric subsets containing devices you allow to communicate) and returns 24-bit addresses of targets that HBA requested within authorized zones
PLOGI and then send SCSI probes to the Fabric destination addresses of targets returned from the Name Server
Assign target IDs (with server OS) to each node probed to build a device table that applications can use to store and retrieve data
With Storage Area Networks, pure SCSI initiators were replaced with an HBA. With networked storage, the HBA must connect to the fabric before storage can be probed. The HBA must perform a FLOGI to the well-known address FFFFFE. Once its address is acquired, the HBA starts a session with the well-known address FFFFFC, followed by a query to FFFFFC. The Name Server at FFFFFC matches the host WWN to the storage targets currently in the fabric and replies with the network addresses of the matching targets. For each destination address it received, the HBA creates a new frame carrying a PLOGI, followed by a SCSI probe/inquiry to the network address of the target. When the SCSI probe/inquiry reply is received, the HBA assigns the network address a target/LUN number and presents this to the operating system.
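The HBA bring-up order can be written down as a simple ordered list of (operation, destination) pairs. This is only a sketch: the function hba_bringup and the example target PID are hypothetical, the SCR and PRLI steps follow the slides rather than the notes above, and no real frames are built or sent.

```python
# Well-known addresses named in this module
WELL_KNOWN = {
    "fabric_login_server": 0xFFFFFE,  # FLOGI destination
    "fabric_controller": 0xFFFFFD,    # SCR destination
    "name_server": 0xFFFFFC,          # PLOGI, register and query destination
}


def hba_bringup(target_pids):
    """Return the ordered (operation, destination) steps an HBA performs."""
    steps = [
        ("FLOGI", WELL_KNOWN["fabric_login_server"]),
        ("PLOGI", WELL_KNOWN["name_server"]),
        ("SCR", WELL_KNOWN["fabric_controller"]),
        ("REGISTER", WELL_KNOWN["name_server"]),
        ("QUERY", WELL_KNOWN["name_server"]),
    ]
    # One login and probe per target address returned by the Name Server.
    for pid in target_pids:
        steps += [("PLOGI", pid), ("PRLI", pid), ("SCSI INQUIRY", pid)]
    return steps


# 0x020400 stands in for a target address returned by the Name Server.
for op, dest in hba_bringup([0x020400]):
    print(f"{op:<13}-> {dest:06X}")
```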
[Diagram: two PLOGI requests, each answered with an Accept]
The host is asking to have a conversation (PLOGI) with one of the SCSI target devices in preparation to probe for information
The HBA will log out of FFFFFC and send a PLOGI to each network address returned from FFFFFC.
[Diagram: two PRLI requests]
Once a host has established a conversation (PLOGI), it can now PRLI and then probe for information
(An example of a host probing for information would be asking a target SCSI device if there are any LUNs to report)
During the Process Login (PRLI) process, the HBA will send the SCSI probe/inquiry to learn information about the LUN.
SAN Targets are typically storage devices: Fabric RAID controllers, Loop JBODs (just a bunch of disks) and Tape
Non-loop Fabric capable Targets typically:
FLOGI to FFFFFE and obtain full 24-bit Fabric address
PLOGI to FFFFFC
Register information with FFFFFC
Targets typically register symbolic node information that allows easy identification using Name Server commands (nsshow)
Loop Fabric capable Targets typically:
Go through LIP initialization to obtain AL_PA and then FLOGI to obtain the other 16 bits of Fabric address
PLOGI and register with FFFFFC
The Fabric embedded port (domain controller) needs a Fabric address
Brocade uses FFFCxx, where xx represents one of 239 possible domains
FFFCxx is used to PLOGI, PRLI and probe attached devices to retrieve information to put into the Name Server database; this is how private storage devices appear like Fabric devices in Brocade Fabrics
Embedded port probing (Fabric probing) is enabled by default, thus allowing private targets that accept PRLI into the Fabric
This Fabric probing can be disabled using the Brocade configure command
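The FFFCxx address can be computed directly from the domain ID. The helper below is a minimal sketch; the function name embedded_port_address is hypothetical, and the 1-239 range check assumes the domain range given elsewhere in this course.

```python
def embedded_port_address(domain_id: int) -> int:
    """Return the FFFCxx address of the embedded port for a domain ID."""
    if not 1 <= domain_id <= 239:
        raise ValueError("domain ID must be in the range 1-239")
    # The low byte (xx) carries the domain ID; the upper two bytes are FFFC.
    return 0xFFFC00 | domain_id


print(f"{embedded_port_address(1):06X}")    # FFFC01
print(f"{embedded_port_address(239):06X}")  # FFFCEF
```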
[Diagram: the embedded port FFFCxx sends a PLOGI to each attached device; each PLOGI is answered with an Accept]
The embedded port is asking permission to have a conversation with each device in the Fabric
After the FLOGI has completed, the HBA will send a session request to FFFFFC asking for permission to log in and send a storage query.
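[Diagram: the embedded port FFFCxx sends PRLI requests; one is answered with Accept and one with Reject; the accepted exchange continues with an NS query answered by NS Info]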
SCSI initiators commonly reject PRLI
SCSI targets typically accept PRLI
If PRLI accepted, Embedded Port goes on to query for NS information
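The accept/reject behavior on this slide amounts to a simple filter. The sketch below is not Brocade's probing code: the function fabric_probe, the dictionary keys, and the device values are hypothetical, and PLOGI is assumed to be accepted by every attached device.

```python
def fabric_probe(devices):
    """Return {pid: wwn} for devices the embedded port adds to the Name Server.

    Each device is a dict with the hypothetical keys 'pid', 'wwn' and
    'accepts_prli'. PLOGI is assumed to succeed for every device.
    """
    name_server = {}
    for dev in devices:
        if not dev["accepts_prli"]:
            # PRLI rejected (common for SCSI initiators): stop probing this device.
            continue
        # PRLI accepted (typical for SCSI targets): query the device for its
        # Name Server information and record it.
        name_server[dev["pid"]] = dev["wwn"]
    return name_server


# Example devices are invented for illustration.
attached = [
    {"pid": 0x0104E8, "wwn": "21:00:00:20:37:00:00:02", "accepts_prli": True},   # private target
    {"pid": 0x010500, "wwn": "10:00:00:00:c9:00:00:01", "accepts_prli": False},  # initiator
]
print({f"{pid:06X}": wwn for pid, wwn in fabric_probe(attached).items()})
```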
The HBA will send a name server query to FFFFFC inquiring for the network addresses of targets it can communicate with.
[Diagram: R_RDY and Data exchanges]
Each R_RDY allows ULP data frames to flow
Even private storage devices that accept embedded port PRLI are allowed in!
The Brocade embedded port probing enables automatic FL_Port translation of private (8-bit) communication to/from Fabric (24-bit) communication (see Footnote 1)
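The first point, that each R_RDY allows data frames to flow, is the basis of credit-based flow control: a sender may transmit only while it holds credit, and each R_RDY it receives returns one credit. The class below is a simplified sketch under that assumption; the name CreditedLink and the starting credit of 2 are hypothetical.

```python
class CreditedLink:
    """Toy sender side of a link that uses R_RDY to replenish credit."""

    def __init__(self, bb_credit: int):
        self.credit = bb_credit   # frames that may be sent before waiting
        self.sent = 0

    def send_frame(self) -> bool:
        """Send one data frame if credit is available; return False if blocked."""
        if self.credit == 0:
            return False
        self.credit -= 1
        self.sent += 1
        return True

    def receive_r_rdy(self) -> None:
        """Each R_RDY from the receiver frees one more frame to flow."""
        self.credit += 1


link = CreditedLink(bb_credit=2)
print(link.send_frame(), link.send_frame(), link.send_frame())  # True True False
link.receive_r_rdy()
print(link.send_frame())                                        # True
```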
Footnote 1: This automatic translation capability is configurable and is called translative mode. It is enabled by default but can be disabled.
Summary
FC communication includes switch, device and FC protocol initialization and communication processes
Common Fabric services include Login services, Fabric F_Port service to FFFFFE, Fabric controller services to FFFFFD and Fabric Name Services to FFFFFC
When Fabric capable loop devices (public loop) connect to a Fabric they will LIP and get an AL_PA and then FLOGI to FFFFFE, PLOGI to FFFFFC, SCR to FFFFFD, and register with FFFFFC
Private storage devices that accept PRLIs are put into the Fabric Name Server using Brocade embedded port PLOGI, PRLI and probe processes
Additional Information
Use resources page Internet Link section: find the SAN ED 101 link, SAN Fabric Foundation, and look at chapter 2 (Fibre Channel Essentials) for additional information about material presented in this module
Use resources page Internet Link section: find the Solution Technology link and look for Fibre Channel courses
Use resources page Reference section: FC Glossary and additional reading / Web site locations list
Solution Technology provides detailed Fibre Channel training related to materials presented in this module. The FC course that they teach spends about a day on FC-2 layer communication processes alone. It is a comprehensive FC offering.
Describe key reasons, benefits and components related to Fibre Channel (FC)
Identify FC protocol layers and key related tasks & components
List key FC topologies, terminology, and addresses
Describe FC services and expected behaviors
Summarize and state relevance of FC theory fundamentals learned
High speed
Low latency
Long distance
Robust data integrity
Large connectivity
FC value adds:
Server & storage consolidation
Centralized management
Increased utilization
Unmatched availability
Performance and scalability
FC SAN components
FC Protocol Layers
Key related tasks & components
FC-4 ULP mapping
FC-3 Advanced features
FC-2 Framing and flow control
FC-1 Encoding & link control
FC-0 Speeds and feeds (SFPs & cable specs)
The FC-0 and FC-1 layers specify the physical and data link functions needed to physically send data from one port to another. FC-0 specifications include information about feeds and speeds. A 10 Gbaud speed, although not depicted, is also in development. The FC-1 layer contains specifications for 8b/10b encoding, ordered sets and link control communication functions.
FC-2 specifies the content and structure of information, along with how to control and manage information delivery. This layer contains the basic rules needed for sending data across the network. These include: (1) how to divide the data into smaller frames, (2) how much data should be sent at one time before sending more (flow control), and (3) where the frame should go. It also includes Classes of Service, which define different implementations that can be selected depending on the application.
FC-3 defines advanced features such as striping (to transmit one data unit across multiple links), multicast (to transmit a single transmission to multiple destinations) and hunt groups (mapping multiple ports to a single node). So, while the FC-2 level concerns itself with the definition of functions within a single port, the FC-3 level deals with functions that span multiple ports.
FC-4 provides mapping of Fibre Channel capabilities to pre-existing protocols such as IP, SCSI or ATM.
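The first FC-2 task, dividing data into smaller frames, can be illustrated with a short sketch. The 2112-byte limit used here is the maximum Fibre Channel frame data field size; it is not stated in these notes and is brought in as an assumption, and the function name segment is hypothetical. Frame headers, CRC and sequence bookkeeping are left out.

```python
MAX_PAYLOAD = 2112  # bytes; maximum FC frame data field (assumption, not from the notes)


def segment(data: bytes, max_payload: int = MAX_PAYLOAD):
    """Split one block of upper-layer data into frame-sized payloads."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


frames = segment(bytes(5000))
print([len(f) for f in frames])  # [2112, 2112, 776]
```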
FC terms include Fabric, Nodes, Ports, node names (Node WWN), port names (Port WWN), Well-Known addresses and FC services
Fabric addresses (PIDs) are 24-bit domain, area and AL_PA designators that enable cut-through routing
Point-to-Point is limited to two devices, but they can talk at greater distances than SCSI allows. Arbitrated Loop is limited to 126 devices in a blocking architecture (plus one for the FL_Port). Without a switch, only two of these devices can talk at a time; all others are blocked until those two are done. An arbitrated loop attached to a switch allows queuing into and out of the port where the loop is attached. The embedded port will take one AL_PA, so on a Brocade switch port there are 125 available AL_PAs. Switched fabric can theoretically allow about 16 million nodes to talk (16^6: there are 6 Port Identifier (PID) slots with 16 hex choices per slot). The committee reserves about a million of these addresses for well-known addresses and testing purposes.
Fabric Addressing
The PID format on switches running Fabric OS v2.x and v3.x could originally support a maximum of only 16 ports in one switch. The 24-bit port address format consists of three bytes defining the Domain identifier, Area address and AL_PA fields respectively. Each field can provide 00-FF addressing. The Domain ID field byte provides domain addressing 1-239. The three byte fields of the old PID format were defined as XX1YZZ, where Y was a hexadecimal digit that specified a particular port on a switch and 1 was constant. When Brocade developed the ASIC for the SilkWorm 2000 series, the largest switch had 16 ports, so only half of the second byte (the Area field) of the PID was required to specify ports. To support the increased port count of the higher port count products based on Brocade Fabric OS v4.x, the new format XXYYZZ has been adopted, where YY represents a port area designator. Using the entire middle byte for the port area designator allows Brocade switches to scale up to the Fibre Channel standard maximum of 256 ports per switch.
To ensure interoperability between Fabric OS v4.x based products and Fabric OS v2.x and v3.x based products, while maintaining compatibility with older firmware versions, a setting was created that allows the PID format to be set to either the new format or the old format. This is commonly known as the Core Switch PID format setting.
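The two PID layouts can be compared with a short decoder. This is a sketch under stated assumptions: the function names split_pid and port_designator and the sample PID are hypothetical, and the middle byte of the new format is returned as-is as the port area designator without claiming how a given switch maps it to a physical port.

```python
def split_pid(pid: int):
    """Split a 24-bit PID into its (domain, area, al_pa) bytes."""
    if not 0 <= pid <= 0xFFFFFF:
        raise ValueError("PID must fit in 24 bits")
    return (pid >> 16) & 0xFF, (pid >> 8) & 0xFF, pid & 0xFF


def port_designator(pid: int, core_pid_format: bool) -> int:
    """Return the port designator carried in the Area byte.

    Old format XX1YZZ: the high nibble is the constant 1, the low nibble Y is the port.
    New format XXYYZZ: the whole middle byte YY is the port area designator.
    """
    _, area, _ = split_pid(pid)
    if core_pid_format:
        return area
    if area >> 4 != 0x1:
        raise ValueError("old-format PID must have 1 in the high nibble of the Area byte")
    return area & 0x0F


print(split_pid(0x011F00))               # (1, 31, 0)
print(port_designator(0x011F00, False))  # 15
print(port_designator(0x011F00, True))   # 31
print(16 ** 6 == 2 ** 24, 16 ** 6)       # True 16777216
```

The same 24-bit value therefore names a different port depending on which format the switch is set to use, which is why the format has to match across the fabric.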
Switch Initialization
FC Communication Terminology
Connection Naming (Port type, WWN and Fabric Addresses)
Device Initialization and Link Communication
Well-Known address Fabric communication services
Expected Fibre Channel behaviors:
What happens when a Fabric device connects to a Fabric?
What happens when a public loop device connects to a Fabric?
How do private devices communicate in a Fabric?
In this course we have gone through a series of key Fibre Channel technologies and demonstrated many features that make Fibre Channel the perfect solution for today's SAN implementations.