
Storage Technology Basics


Contents
Articles
Computer Basics
Central processing unit
Random-access memory
Computer data storage
Disk storage
Hard disk drive
Disk-drive performance characteristics

RAID
RAID

Operating System
Operating system
Unix-like

File System
File system
Network File System
Server Message Block

Protocols
SCSI
iSCSI
Fibre Channel
Internet Fibre Channel Protocol
Fibre Channel over Ethernet

NetApp
NetApp filer
Write Anywhere File Layout
Zero-copy
Direct memory access
Memory management unit
Log-structured file system
Locality of reference

References
Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses
License

Computer Basics
Central processing unit

An Intel 80486DX2 CPU from above

An Intel 80486DX2 from below

The central processing unit (CPU, occasionally central processor unit[1]) is the hardware within a computer system which carries out the instructions of a computer program by performing the basic arithmetical, logical, and input/output operations of the system. The term has been in use in the computer industry at least since the early 1960s.[2] The form, design, and implementation of CPUs have changed over the course of their history, but their fundamental operation remains much the same.

On large machines, CPUs require one or more printed circuit boards. On personal computers and small workstations, the CPU is housed in a single silicon chip called a microprocessor. Since the 1970s the microprocessor class of CPUs has almost completely overtaken all other CPU implementations. Modern CPUs are large-scale integrated circuits in packages typically less than four centimeters square, with hundreds of connecting pins.

Two typical components of a CPU are the arithmetic logic unit (ALU), which performs arithmetic and logical operations, and the control unit (CU), which extracts instructions from memory and decodes and executes them, calling on the ALU when necessary.

Not all computational systems rely on a central processing unit. An array processor or vector processor has multiple parallel computing elements, with no one unit considered the "center". In the distributed computing model, problems are solved by a distributed interconnected set of processors.


History
Computers such as the ENIAC had to be physically rewired to perform different tasks, which caused these machines to be called "fixed-program computers." Since the term "CPU" is generally defined as a device for software (computer program) execution, the earliest devices that could rightly be called CPUs came with the advent of the stored-program computer.

The idea of a stored-program computer was already present in the design of J. Presper Eckert and John William Mauchly's ENIAC, but was initially omitted so that it could be finished sooner. On June 30, 1945, before ENIAC was made, mathematician John von Neumann distributed the paper entitled First Draft of a Report on the EDVAC. It was the outline of a stored-program computer that would eventually be completed in August 1949.[3] EDVAC was designed to perform a certain number of instructions (or operations) of various types. These instructions could be combined to create useful programs for the EDVAC, one of the first stored-program computers, to run. Significantly, the programs written for EDVAC were stored in high-speed computer memory rather than specified by the physical wiring of the computer. This overcame a severe limitation of ENIAC, which was the considerable time and effort required to reconfigure the computer to perform a new task. With von Neumann's design, the program, or software, that EDVAC ran could be changed simply by changing the contents of the memory.

Early CPUs were custom-designed as a part of a larger, sometimes one-of-a-kind, computer. However, this method of designing custom CPUs for a particular application has largely given way to the development of mass-produced processors that are made for many purposes. This standardization began in the era of discrete transistor mainframes and minicomputers and has rapidly accelerated with the popularization of the integrated circuit (IC). The IC has allowed increasingly complex CPUs to be designed and manufactured to tolerances on the order of nanometers. Both the miniaturization and standardization of CPUs have increased the presence of digital devices in modern life far beyond the limited application of dedicated computing machines. Modern microprocessors appear in everything from automobiles to cell phones and children's toys.

While von Neumann is most often credited with the design of the stored-program computer because of his design of EDVAC, others before him, such as Konrad Zuse, had suggested and implemented similar ideas. The so-called Harvard architecture of the Harvard Mark I, which was completed before EDVAC, also utilized a stored-program design using punched paper tape rather than electronic memory. The key difference between the von Neumann and Harvard architectures is that the latter separates the storage and treatment of CPU instructions and data, while the former uses the same memory space for both. Most modern CPUs are primarily von Neumann in design, but elements of the Harvard architecture are commonly seen as well.

Relays and vacuum tubes (thermionic valves) were commonly used as switching elements; a useful computer requires thousands or tens of thousands of switching devices. The overall speed of a system is dependent on the speed of the switches.
Tube computers like EDVAC tended to average eight hours between failures, whereas relay computers like the (slower, but earlier) Harvard Mark I failed very rarely.[2] In the end, tube-based CPUs became dominant because the significant speed advantages they afforded generally outweighed the reliability problems. Most of these early synchronous CPUs ran at low clock rates compared to modern microelectronic designs (see below for a discussion of clock rate). Clock signal frequencies ranging from 100 kHz to 4 MHz were very common at this time, limited largely by the speed of the switching devices they were built with.


Transistor and integrated circuit CPUs


CPU, core memory, and external bus interface of a DEC PDP-8/I, made of medium-scale integrated circuits

The design complexity of CPUs increased as various technologies facilitated building smaller and more reliable electronic devices. The first such improvement came with the advent of the transistor. Transistorized CPUs during the 1950s and 1960s no longer had to be built out of bulky, unreliable, and fragile switching elements like vacuum tubes and electrical relays. With this improvement, more complex and reliable CPUs were built onto one or several printed circuit boards containing discrete (individual) components.

During this period, a method of manufacturing many interconnected transistors in a compact space was developed. The integrated circuit (IC) allowed a large number of transistors to be manufactured on a single semiconductor-based die, or "chip." At first only very basic non-specialized digital circuits such as NOR gates were miniaturized into ICs. CPUs based upon these "building block" ICs are generally referred to as "small-scale integration" (SSI) devices. SSI ICs, such as the ones used in the Apollo guidance computer, usually contained up to a few score transistors. To build an entire CPU out of SSI ICs required thousands of individual chips, but still consumed much less space and power than earlier discrete transistor designs. As microelectronic technology advanced, an increasing number of transistors were placed on ICs, thus decreasing the quantity of individual ICs needed for a complete CPU. MSI and LSI (medium- and large-scale integration) ICs increased transistor counts to hundreds, and then thousands.

In 1964 IBM introduced its System/360 computer architecture, which was used in a series of computers that could run the same programs with different speed and performance. This was significant at a time when most electronic computers were incompatible with one another, even those made by the same manufacturer. To facilitate this improvement, IBM utilized the concept of a microprogram (often called "microcode"), which still sees widespread usage in modern CPUs.[4] The System/360 architecture was so popular that it dominated the mainframe computer market for decades and left a legacy that is still continued by similar modern computers like the IBM zSeries. In the same year (1964), Digital Equipment Corporation (DEC) introduced another influential computer aimed at the scientific and research markets, the PDP-8. DEC would later introduce the extremely popular PDP-11 line that originally was built with SSI ICs but was eventually implemented with LSI components once these became practical. In stark contrast with its SSI and MSI predecessors, the first LSI implementation of the PDP-11 contained a CPU composed of only four LSI integrated circuits.[5]

Transistor-based computers had several distinct advantages over their predecessors. Aside from facilitating increased reliability and lower power consumption, transistors also allowed CPUs to operate at much higher speeds because of the short switching time of a transistor in comparison to a tube or relay. Thanks to both the increased reliability and the dramatically increased speed of the switching elements (which were almost exclusively transistors by this time), CPU clock rates in the tens of megahertz were obtained during this period. Additionally, while discrete transistor and IC CPUs were in heavy usage, new high-performance designs like SIMD (single instruction, multiple data) vector processors began to appear.
These early experimental designs later gave rise to the era of specialized supercomputers like those made by Cray Inc.


Microprocessors

Die of an Intel 80486DX2 microprocessor (actual size: 12 × 6.75 mm) in its packaging

Intel Core i5 CPU on a Vaio E series laptop motherboard (on the right, beneath the heat pipe).

In the 1970s the fundamental inventions by Federico Faggin (silicon-gate MOS ICs with self-aligned gates, along with his new random logic design methodology) changed the design and implementation of CPUs forever. Since the introduction of the first commercially available microprocessor (the Intel 4004) in 1971, and the first widely used microprocessor (the Intel 8080) in 1974, this class of CPUs has almost completely overtaken all other central processing unit implementation methods. Mainframe and minicomputer manufacturers of the time launched proprietary IC development programs to upgrade their older computer architectures, and eventually produced instruction set compatible microprocessors that were backward-compatible with their older hardware and software. Combined with the advent and eventual success of the ubiquitous personal computer, the term CPU is now applied almost exclusively to microprocessors. Several CPUs can be combined in a single processing chip.

Previous generations of CPUs were implemented as discrete components and numerous small integrated circuits (ICs) on one or more circuit boards. Microprocessors, on the other hand, are CPUs manufactured on a very small number of ICs; usually just one. The overall smaller CPU size as a result of being implemented on a single die means faster switching time because of physical factors like decreased gate parasitic capacitance. This has allowed synchronous microprocessors to have clock rates ranging from tens of megahertz to several gigahertz. Additionally, as the ability to construct exceedingly small transistors on an IC has increased, the complexity and number of transistors in a single CPU has increased manyfold. This widely observed trend is described by Moore's law, which has proven to be a fairly accurate predictor of the growth of CPU (and other IC) complexity.[6]

While the complexity, size, construction, and general form of CPUs have changed enormously since 1950, it is notable that the basic design and function has not changed much at all. Almost all common CPUs today can be very accurately described as von Neumann stored-program machines. As the aforementioned Moore's law continues to hold true, concerns have arisen about the limits of integrated circuit transistor technology. Extreme miniaturization of electronic gates is causing the effects of phenomena like electromigration and subthreshold leakage to become much more significant. These newer concerns are among the many factors causing researchers to investigate new methods of computing such as the quantum computer, as well as to expand the usage of parallelism and other methods that extend the usefulness of the classical von Neumann model.


Operation
The fundamental operation of most CPUs, regardless of the physical form they take, is to execute a sequence of stored instructions called a program. The program is represented by a series of numbers that are kept in some kind of computer memory. There are four steps that nearly all CPUs use in their operation: fetch, decode, execute, and writeback.

The first step, fetch, involves retrieving an instruction (which is represented by a number or sequence of numbers) from program memory. The location in program memory is determined by a program counter (PC), which stores a number that identifies the current position in the program. After an instruction is fetched, the PC is incremented by the length of the instruction word in terms of memory units.[7] Often, the instruction to be fetched must be retrieved from relatively slow memory, causing the CPU to stall while waiting for the instruction to be returned. This issue is largely addressed in modern processors by caches and pipeline architectures (see below).

The instruction that the CPU fetches from memory is used to determine what the CPU is to do. In the decode step, the instruction is broken up into parts that have significance to other portions of the CPU. The way in which the numerical instruction value is interpreted is defined by the CPU's instruction set architecture (ISA).[8] Often, one group of numbers in the instruction, called the opcode, indicates which operation to perform. The remaining parts of the number usually provide information required for that instruction, such as operands for an addition operation. Such operands may be given as a constant value (called an immediate value), or as a place to locate a value: a register or a memory address, as determined by some addressing mode. In older designs the portions of the CPU responsible for instruction decoding were unchangeable hardware devices. However, in more abstract and complicated CPUs and ISAs, a microprogram is often used to assist in translating instructions into various configuration signals for the CPU. This microprogram is sometimes rewritable so that it can be modified to change the way the CPU decodes instructions even after it has been manufactured.

After the fetch and decode steps, the execute step is performed. During this step, various portions of the CPU are connected so they can perform the desired operation. If, for instance, an addition operation was requested, the arithmetic logic unit (ALU) will be connected to a set of inputs and a set of outputs. The inputs provide the numbers to be added, and the outputs will contain the final sum. The ALU contains the circuitry to perform simple arithmetic and logical operations on the inputs (like addition and bitwise operations). If the addition operation produces a result too large for the CPU to handle, an arithmetic overflow flag in a flags register may also be set.

The final step, writeback, simply "writes back" the results of the execute step to some form of memory. Very often the results are written to some internal CPU register for quick access by subsequent instructions. In other cases results may be written to slower, but cheaper and larger, main memory. Some types of instructions manipulate the program counter rather than directly produce result data. These are generally called "jumps" and facilitate behavior like loops, conditional program execution (through the use of a conditional jump), and functions in programs.[9] Many instructions will also change the state of digits in a "flags" register.
These flags can be used to influence how a program behaves, since they often indicate the outcome of various operations. For example, one type of "compare" instruction considers two values and sets a number in the flags register according to which one is greater. This flag could then be used by a later jump instruction to determine program flow. After the execution of the instruction and writeback of the resulting data, the entire process repeats, with the next instruction cycle normally fetching the next-in-sequence instruction because of the incremented value in the program counter. If the completed instruction was a jump, the program counter will be modified to contain the address of the instruction that was jumped to, and program execution continues normally.

In more complex CPUs than the one described here, multiple instructions can be fetched, decoded, and executed simultaneously. This section describes what is generally referred to as the "classic RISC pipeline", which in fact is quite common among the simple CPUs used in many electronic devices (often called microcontrollers). It largely ignores the important role of CPU cache, and therefore the access stage of the pipeline.
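
As a rough illustration of the four-step cycle just described, the following Python sketch walks a tiny program through fetch, decode, execute, and writeback. The instruction format, opcode names, and four-register file are invented for this example and do not correspond to any real ISA:

# Minimal sketch of the fetch / decode / execute / writeback cycle described
# above. The 3-field instruction format, opcodes, and register file are
# hypothetical, chosen only to keep the example short.

memory = [
    ("LOADI", 0, 5),        # r0 <- 5 (immediate load)
    ("LOADI", 1, 7),        # r1 <- 7
    ("ADD",   2, (0, 1)),   # r2 <- r0 + r1
    ("HALT",  None, None),
]
registers = [0] * 4
pc = 0                      # program counter

while True:
    instruction = memory[pc]              # fetch: read the word the PC points at
    pc += 1                               # advance the PC by one instruction word
    opcode, dest, operands = instruction  # decode: split the word into fields
    if opcode == "HALT":
        break
    elif opcode == "LOADI":
        result = operands                 # execute: immediate value passes through
    elif opcode == "ADD":
        a, b = operands
        result = registers[a] + registers[b]  # execute: ALU addition
    registers[dest] = result              # writeback: store result in a register

print(registers)   # [5, 7, 12, 0]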


Design and implementation


The basic concept of a CPU is as follows: Hardwired into a CPU's design is a list of basic operations it can perform, called an instruction set. Such operations may include adding or subtracting two numbers, comparing numbers, or jumping to a different part of a program. Each of these basic operations is represented by a particular sequence of bits; this sequence is called the opcode for that particular operation. Sending a particular opcode to a CPU will cause it to perform the operation represented by that opcode. To execute an instruction in a computer program, the CPU uses the opcode for that instruction as well as its arguments (for instance the two numbers to be added, in the case of an addition operation). A computer program is therefore a sequence of instructions, with each instruction including an opcode and that operation's arguments.

The actual mathematical operation for each instruction is performed by a subunit of the CPU known as the arithmetic logic unit or ALU. In addition to using its ALU to perform operations, a CPU is also responsible for reading the next instruction from memory, reading data specified in arguments from memory, and writing results to memory. In many CPU designs, an instruction set will clearly differentiate between operations that load data from memory and those that perform math. In this case the data loaded from memory is stored in registers, and a mathematical operation takes no arguments but simply performs the math on the data in the registers and writes it to a new register, whose value a separate operation may then write to memory.
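
To make the opcode/argument split concrete, here is a small Python sketch of a hypothetical 16-bit instruction word with a 4-bit opcode and two 4-bit register fields. The field widths and opcode values are assumptions chosen only for illustration; real instruction sets define their own encodings:

# Hypothetical 16-bit instruction word: 4-bit opcode, two 4-bit register
# operands, 4 unused bits. The layout is invented for illustration.

OPCODES = {"ADD": 0x1, "SUB": 0x2, "LOAD": 0x3, "STORE": 0x4}

def encode(op, reg_dst, reg_src):
    """Pack an opcode and two register numbers into one instruction word."""
    return (OPCODES[op] << 12) | (reg_dst << 8) | (reg_src << 4)

def decode(word):
    """Split an instruction word back into its fields, as a decoder would."""
    opcode  = (word >> 12) & 0xF
    reg_dst = (word >> 8)  & 0xF
    reg_src = (word >> 4)  & 0xF
    return opcode, reg_dst, reg_src

word = encode("ADD", 2, 5)
print(hex(word), decode(word))   # 0x1250 (1, 2, 5)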

Control unit
The control unit of the CPU contains circuitry that uses electrical signals to direct the entire computer system to carry out stored program instructions. The control unit does not execute program instructions; rather, it directs other parts of the system to do so. The control unit must communicate with both the arithmetic/logic unit and memory.

Integer range
MOS 6502 microprocessor in a dual in-line package, an extremely popular 8-bit design

The way a CPU represents numbers is a design choice that affects the most basic ways in which the device functions. Some early digital computers used an electrical model of the common decimal (base ten) numeral system to represent numbers internally. A few other computers have used more exotic numeral systems like ternary (base three). Nearly all modern CPUs represent numbers in binary form, with each digit being represented by some two-valued physical quantity such as a "high" or "low" voltage.[10]

Related to number representation is the size and precision of numbers that a CPU can represent. In the case of a binary CPU, a bit refers to one significant place in the numbers a CPU deals with. The number of bits (or numeral places) a CPU uses to represent numbers is often called "word size", "bit width", "data path width", or "integer precision" when dealing with strictly integer numbers (as opposed to floating point). This number differs between architectures, and often within different parts of the very same CPU. For example, an 8-bit CPU deals with a range of numbers that can be represented by eight binary digits (each digit having two possible values), that is, 2^8 or 256 discrete numbers. In effect, integer size sets a hardware limit on the range of integers the software run by the CPU can utilize.[11]

Integer range can also affect the number of locations in memory the CPU can address (locate). For example, if a binary CPU uses 32 bits to represent a memory address, and each memory address represents one octet (8 bits), the maximum quantity of memory that CPU can address is 2^32 octets, or 4 GiB. This is a very simple view of CPU address space, and many designs use more complex addressing methods like paging to locate more memory than their integer range would allow with a flat address space.
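
The ranges mentioned above can be checked with a few lines of Python; the numbers follow directly from the powers of two discussed in the text:

# Worked numbers for the ranges discussed above: an 8-bit word has 2**8
# distinct values, and a 32-bit byte address can name 2**32 bytes (4 GiB).

bits = 8
print(2 ** bits)                                  # 256 distinct values
print(0, 2 ** bits - 1)                           # unsigned range: 0 .. 255
print(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)    # two's-complement range: -128 .. 127

address_bits = 32
addressable_bytes = 2 ** address_bits
print(addressable_bytes / 2 ** 30, "GiB")         # 4.0 GiB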

Higher levels of integer range require more structures to deal with the additional digits, and therefore more complexity, size, power usage, and general expense. It is not at all uncommon, therefore, to see 4- or 8-bit microcontrollers used in modern applications, even though CPUs with much higher range (such as 16, 32, 64, even 128-bit) are available. The simpler microcontrollers are usually cheaper, use less power, and therefore generate less heat, all of which can be major design considerations for electronic devices. However, in higher-end applications, the benefits afforded by the extra range (most often the additional address space) are more significant and often affect design choices. To gain some of the advantages afforded by both lower and higher bit lengths, many CPUs are designed with different bit widths for different portions of the device. For example, the IBM System/370 used a CPU that was primarily 32-bit, but it used 128-bit precision inside its floating point units to facilitate greater accuracy and range in floating point numbers.[4] Many later CPU designs use similar mixed bit width, especially when the processor is meant for general-purpose usage where a reasonable balance of integer and floating point capability is required.

Clock rate
The clock rate is the speed at which a microprocessor executes instructions. Every computer contains an internal clock that regulates the rate at which instructions are executed and synchronizes all the various computer components. The CPU requires a fixed number of clock ticks (or clock cycles) to execute each instruction. The faster the clock, the more instructions the CPU can execute per second.

Most CPUs, and indeed most sequential logic devices, are synchronous in nature.[12] That is, they are designed and operate on assumptions about a synchronization signal. This signal, known as a clock signal, usually takes the form of a periodic square wave. By calculating the maximum time that electrical signals can move in various branches of a CPU's many circuits, the designers can select an appropriate period for the clock signal (a worked example appears at the end of this section). This period must be longer than the amount of time it takes for a signal to move, or propagate, in the worst-case scenario. In setting the clock period to a value well above the worst-case propagation delay, it is possible to design the entire CPU and the way it moves data around the "edges" of the rising and falling clock signal. This has the advantage of simplifying the CPU significantly, both from a design perspective and a component-count perspective. However, it also carries the disadvantage that the entire CPU must wait on its slowest elements, even though some portions of it are much faster. This limitation has largely been compensated for by various methods of increasing CPU parallelism (see below).

However, architectural improvements alone do not solve all of the drawbacks of globally synchronous CPUs. For example, a clock signal is subject to the delays of any other electrical signal. Higher clock rates in increasingly complex CPUs make it more difficult to keep the clock signal in phase (synchronized) throughout the entire unit. This has led many modern CPUs to require multiple identical clock signals to be provided to avoid delaying a single signal significantly enough to cause the CPU to malfunction. Another major issue as clock rates increase dramatically is the amount of heat that is dissipated by the CPU. The constantly changing clock causes many components to switch regardless of whether they are being used at that time. In general, a component that is switching uses more energy than an element in a static state. Therefore, as clock rate increases, so does heat dissipation, causing the CPU to require more effective cooling solutions.

One method of dealing with the switching of unneeded components is called clock gating, which involves turning off the clock signal to unneeded components (effectively disabling them). However, this is often regarded as difficult to implement and therefore does not see common usage outside of very low-power designs. One notable recent CPU design that uses clock gating is that of the IBM PowerPC-based Xbox 360. It utilizes extensive clock gating to reduce the power requirements of the aforementioned videogame console in which it is used.[13] Another method of addressing some of the problems with a global clock signal is the removal of the clock signal altogether. While removing the global clock signal makes the design process considerably more complex in many ways, asynchronous (or clockless) designs carry marked advantages in power consumption and heat dissipation in comparison with similar synchronous designs.

While somewhat uncommon, entire asynchronous CPUs have been built without utilizing a global clock signal. Two notable examples of this are the ARM-compliant AMULET and the MIPS R3000-compatible MiniMIPS. Rather than totally removing the clock signal, some CPU designs allow certain portions of the device to be asynchronous, such as using asynchronous ALUs in conjunction with superscalar pipelining to achieve some arithmetic performance gains. While it is not altogether clear whether totally asynchronous designs can perform at a comparable or better level than their synchronous counterparts, it is evident that they do at least excel in simpler math operations. This, combined with their excellent power consumption and heat dissipation properties, makes them very suitable for embedded computers.[14]
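
Returning to the clock-period rule described earlier in this section, a back-of-the-envelope Python calculation shows how a worst-case propagation delay bounds the clock rate. The delay and margin figures are placeholders, not measurements of any real design:

# The clock period must exceed the worst-case propagation delay through the
# slowest path, as described above. The figures here are hypothetical.

worst_case_delay_ns = 0.8        # assumed slowest combinational path
safety_margin_ns = 0.2           # assumed setup/hold margin

period_ns = worst_case_delay_ns + safety_margin_ns
frequency_ghz = 1.0 / period_ns  # 1 / (period in ns) gives GHz

print(period_ns, "ns period ->", frequency_ghz, "GHz maximum clock")   # 1.0 ns -> 1.0 GHz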

Parallelism
The description of the basic operation of a CPU offered in the previous section describes the simplest form that a CPU can take. This type of CPU, usually referred to as subscalar, operates on and executes one instruction on one or two pieces of data at a time.

Model of a subscalar CPU. Notice that it takes fifteen cycles to complete three instructions.

This process gives rise to an inherent inefficiency in subscalar CPUs. Since only one instruction is executed at a time, the entire CPU must wait for that instruction to complete before proceeding to the next instruction. As a result, the subscalar CPU gets "hung up" on instructions which take more than one clock cycle to complete execution. Even adding a second execution unit (see below) does not improve performance much; rather than one pathway being hung up, now two pathways are hung up and the number of unused transistors is increased. This design, wherein the CPU's execution resources can operate on only one instruction at a time, can only possibly reach scalar performance (one instruction per clock). However, the performance is nearly always subscalar (less than one instruction per cycle).

Attempts to achieve scalar and better performance have resulted in a variety of design methodologies that cause the CPU to behave less linearly and more in parallel. When referring to parallelism in CPUs, two terms are generally used to classify these design techniques. Instruction level parallelism (ILP) seeks to increase the rate at which instructions are executed within a CPU (that is, to increase the utilization of on-die execution resources), and thread level parallelism (TLP) aims to increase the number of threads (effectively individual programs) that a CPU can execute simultaneously. Each methodology differs both in the ways in which they are implemented, as well as the relative effectiveness they afford in increasing the CPU's performance for an application.[15]

Instruction level parallelism

Basic five-stage pipeline. In the best-case scenario, this pipeline can sustain a completion rate of one instruction per cycle.

One of the simplest methods used to accomplish increased parallelism is to begin the first steps of instruction fetching and decoding before the prior instruction finishes executing. This is the simplest form of a technique known as instruction pipelining, and is utilized in almost all modern general-purpose CPUs. Pipelining allows more than one instruction to be executed at any given time by breaking down the execution pathway into discrete stages. This separation can be compared to an assembly line, in which an instruction is made more complete at each stage until it exits the execution pipeline and is retired.

Pipelining does, however, introduce the possibility for a situation where the result of the previous operation is needed to complete the next operation; a condition often termed data dependency conflict. To cope with this, additional care must be taken to check for these sorts of conditions and delay a portion of the instruction pipeline if this occurs. Naturally, accomplishing this requires additional circuitry, so pipelined processors are more complex than subscalar ones (though not very significantly so). A pipelined processor can become very nearly scalar, inhibited only by pipeline stalls (an instruction spending more than one clock cycle in a stage).

Further improvement upon the idea of instruction pipelining led to the development of a method that decreases the idle time of CPU components even further. Designs that are said to be superscalar include a long instruction pipeline and multiple identical execution units.[16] In a superscalar pipeline, multiple instructions are read and passed to a dispatcher, which decides whether or not the instructions can be executed in parallel (simultaneously). If so, they are dispatched to available execution units, resulting in the ability for several instructions to be executed simultaneously. In general, the more instructions a superscalar CPU is able to dispatch simultaneously to waiting execution units, the more instructions will be completed in a given cycle.

Simple superscalar pipeline. By fetching and dispatching two instructions at a time, a maximum of two instructions per cycle can be completed.

Most of the difficulty in the design of a superscalar CPU architecture lies in creating an effective dispatcher. The dispatcher needs to be able to quickly and correctly determine whether instructions can be executed in parallel, as well as dispatch them in such a way as to keep as many execution units busy as possible. This requires that the instruction pipeline is filled as often as possible and gives rise to the need in superscalar architectures for significant amounts of CPU cache. It also makes hazard-avoiding techniques like branch prediction, speculative execution, and out-of-order execution crucial to maintaining high levels of performance. By attempting to predict which branch (or path) a conditional instruction will take, the CPU can minimize the number of times that the entire pipeline must wait until a conditional instruction is completed. Speculative execution often provides modest performance increases by executing portions of code that may not be needed after a conditional operation completes. Out-of-order execution somewhat rearranges the order in which instructions are executed to reduce delays due to data dependencies. Also, in the case of single instruction, multiple data (SIMD), where a large amount of data of the same type has to be processed, modern processors can disable parts of the pipeline so that when a single instruction is executed many times, the CPU skips the fetch and decode phases and thus greatly increases performance on certain occasions, especially in highly monotonous program engines such as video creation software and photo processing.

In the case where a portion of the CPU is superscalar and part is not, the part which is not suffers a performance penalty due to scheduling stalls. The Intel P5 Pentium had two superscalar ALUs which could accept one instruction per clock each, but its FPU could not accept one instruction per clock. Thus the P5 was integer superscalar but not floating point superscalar. Intel's successor to the P5 architecture, P6, added superscalar capabilities to its floating point features, and therefore afforded a significant increase in floating point instruction performance.

Both simple pipelining and superscalar design increase a CPU's ILP by allowing a single processor to complete execution of instructions at rates surpassing one instruction per cycle (IPC).[17] Most modern CPU designs are at least somewhat superscalar, and nearly all general-purpose CPUs designed in the last decade are superscalar. In later years some of the emphasis in designing high-ILP computers has been moved out of the CPU's hardware and into its software interface, or ISA. The strategy of the very long instruction word (VLIW) causes some ILP to become implied directly by the software, reducing the amount of work the CPU must perform to boost ILP and thereby reducing the design's complexity.
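
The ideal effect of pipelining and superscalar dispatch on throughput can be sketched with simple cycle counts. The following Python estimate ignores stalls, hazards, and cache misses, so it only shows the trend from subscalar toward greater-than-one IPC:

# Rough cycle counts for the three designs discussed above, ignoring stalls,
# hazards, and cache misses. The numbers only illustrate the ideal trend.

instructions = 100
stages = 5          # classic five-stage pipeline
width = 2           # 2-wide superscalar dispatch

unpipelined = instructions * stages                 # one instruction occupies all stages
pipelined   = stages + (instructions - 1)           # one new completion per cycle once full
superscalar = stages + (instructions / width - 1)   # up to `width` completions per cycle

for name, cycles in [("unpipelined", unpipelined),
                     ("pipelined", pipelined),
                     ("2-wide superscalar", superscalar)]:
    print(f"{name}: {cycles:.0f} cycles, IPC = {instructions / cycles:.2f}")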

Thread-level parallelism

Another strategy of achieving performance is to execute multiple programs or threads in parallel. This area of research is known as parallel computing. In Flynn's taxonomy, this strategy is known as Multiple Instructions-Multiple Data or MIMD.

One technology used for this purpose was multiprocessing (MP). The initial flavor of this technology is known as symmetric multiprocessing (SMP), where a small number of CPUs share a coherent view of their memory system. In this scheme, each CPU has additional hardware to maintain a constantly up-to-date view of memory. By avoiding stale views of memory, the CPUs can cooperate on the same program and programs can migrate from one CPU to another. To increase the number of cooperating CPUs beyond a handful, schemes such as non-uniform memory access (NUMA) and directory-based coherence protocols were introduced in the 1990s. SMP systems are limited to a small number of CPUs while NUMA systems have been built with thousands of processors. Initially, multiprocessing was built using multiple discrete CPUs and boards to implement the interconnect between the processors. When the processors and their interconnect are all implemented on a single silicon chip, the technology is known as a multi-core microprocessor.

It was later recognized that finer-grain parallelism existed within a single program. A single program might have several threads (or functions) that could be executed separately or in parallel. Some of the earliest examples of this technology implemented input/output processing such as direct memory access as a separate thread from the computation thread. A more general approach to this technology was introduced in the 1970s when systems were designed to run multiple computation threads in parallel. This technology is known as multi-threading (MT). This approach is considered more cost-effective than multiprocessing, as only a small number of components within a CPU is replicated to support MT as opposed to the entire CPU in the case of MP. In MT, the execution units and the memory system including the caches are shared among multiple threads. The downside of MT is that the hardware support for multithreading is more visible to software than that of MP, and thus supervisor software like operating systems have to undergo larger changes to support MT. One type of MT that was implemented is known as block multithreading, where one thread is executed until it is stalled waiting for data to return from external memory. In this scheme, the CPU would then quickly switch to another thread which is ready to run, the switch often done in one CPU clock cycle, as in the UltraSPARC technology. Another type of MT is known as simultaneous multithreading, where instructions of multiple threads are executed in parallel within one CPU clock cycle.

For several decades from the 1970s to early 2000s, the focus in designing high-performance general-purpose CPUs was largely on achieving high ILP through technologies such as pipelining, caches, superscalar execution, out-of-order execution, etc. This trend culminated in large, power-hungry CPUs such as the Intel Pentium 4. By the early 2000s, CPU designers were thwarted from achieving higher performance from ILP techniques due to the growing disparity between CPU operating frequencies and main memory operating frequencies as well as escalating CPU power dissipation owing to more esoteric ILP techniques.
CPU designers then borrowed ideas from commercial computing markets such as transaction processing, where the aggregate performance of multiple programs, also known as throughput computing, was more important than the performance of a single thread or program. This reversal of emphasis is evidenced by the proliferation of dual and multiple core CMP (chip-level multiprocessing) designs and notably, Intel's newer designs resembling its less superscalar P6 architecture. Late designs in several processor families exhibit CMP, including the x86-64 Opteron and Athlon 64 X2, the SPARC UltraSPARC T1, IBM POWER4 and POWER5, as well as several video game console CPUs like the Xbox 360's triple-core PowerPC design, and the PS3's 7-core Cell microprocessor.


Data parallelism

A less common but increasingly important paradigm of CPUs (and indeed, computing in general) deals with data parallelism. The processors discussed earlier are all referred to as some type of scalar device.[18] As the name implies, vector processors deal with multiple pieces of data in the context of one instruction. This contrasts with scalar processors, which deal with one piece of data for every instruction. Using Flynn's taxonomy, these two schemes of dealing with data are generally referred to as SIMD (single instruction, multiple data) and SISD (single instruction, single data), respectively.

The great utility in creating CPUs that deal with vectors of data lies in optimizing tasks that tend to require the same operation (for example, a sum or a dot product) to be performed on a large set of data. Some classic examples of these types of tasks are multimedia applications (images, video, and sound), as well as many types of scientific and engineering tasks. Whereas a scalar CPU must complete the entire process of fetching, decoding, and executing each instruction and value in a set of data, a vector CPU can perform a single operation on a comparatively large set of data with one instruction. Of course, this is only possible when the application tends to require many steps which apply one operation to a large set of data.

Most early vector CPUs, such as the Cray-1, were associated almost exclusively with scientific research and cryptography applications. However, as multimedia has largely shifted to digital media, the need for some form of SIMD in general-purpose CPUs has become significant. Shortly after inclusion of floating point execution units started to become commonplace in general-purpose processors, specifications for and implementations of SIMD execution units also began to appear for general-purpose CPUs. Some of these early SIMD specifications, like HP's Multimedia Acceleration eXtensions (MAX) and Intel's MMX, were integer-only. This proved to be a significant impediment for some software developers, since many of the applications that benefit from SIMD primarily deal with floating point numbers. Progressively, these early designs were refined and remade into some of the common, modern SIMD specifications, which are usually associated with one ISA. Some notable modern examples are Intel's SSE and the PowerPC-related AltiVec (also known as VMX).[19]
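
The scalar-versus-vector distinction can be illustrated in Python: the explicit loop expresses one addition per data element, while the NumPy call expresses a single array-wide operation whose compiled kernels typically use the CPU's SIMD units. NumPy is used here only as a convenient stand-in for vector hardware, not as part of the original text:

import numpy as np

# Scalar view: one add per element, one "instruction" per piece of data.
a = list(range(1_000_000))
b = list(range(1_000_000))
scalar_sum = [x + y for x, y in zip(a, b)]     # element-by-element loop

# Vector view: one operation expressed over the whole array at once. NumPy's
# compiled kernels typically use SIMD instructions (SSE/AVX, AltiVec, ...)
# for exactly this kind of same-operation-on-many-values work.
av = np.arange(1_000_000)
bv = np.arange(1_000_000)
vector_sum = av + bv                           # single array-wide operation

print(scalar_sum[:3], vector_sum[:3])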


Performance
The performance or speed of a processor depends on the clock rate (generally given in multiples of hertz) and the instructions per clock (IPC), which together are the factors for the instructions per second (IPS) that the CPU can perform.[20] Many reported IPS values have represented "peak" execution rates on artificial instruction sequences with few branches, whereas realistic workloads consist of a mix of instructions and applications, some of which take longer to execute than others. The performance of the memory hierarchy also greatly affects processor performance, an issue barely considered in MIPS calculations. Because of these problems, various standardized tests, often called "benchmarks" for this purpose, such as SPECint, have been developed to attempt to measure the real effective performance in commonly used applications.

Processing performance of computers is increased by using multi-core processors, which essentially is plugging two or more individual processors (called cores in this sense) into one integrated circuit.[21] Ideally, a dual-core processor would be nearly twice as powerful as a single-core processor. In practice, however, the performance gain is far smaller, only about 50%,[21] due to imperfect software algorithms and implementation. Increasing the number of cores in a processor (i.e. dual-core, quad-core, etc.) increases the workload that a computer can handle. This means that the processor can now handle numerous asynchronous events, interrupts, and so on, which would otherwise take a toll on the CPU when it is overwhelmed. These cores can be thought of as different floors in a processing plant, with each floor handling a different task. Sometimes these cores will handle the same tasks as cores adjacent to them if a single core is not enough to handle the information and prevent a crash.
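
As a worked example of the relationship above, the following Python estimate combines clock rate and IPC into instructions per second, and applies the roughly 50% dual-core gain mentioned in the text. All figures are illustrative, not measurements of any particular CPU:

# Back-of-the-envelope use of the relationship described above:
# instructions per second ~ clock rate x instructions per clock (per core).
# The figures are illustrative only.

clock_hz = 3.0e9        # assumed 3 GHz clock
ipc = 2.0               # assumed average instructions retired per cycle

single_core_ips = clock_hz * ipc
dual_core_ideal = single_core_ips * 2          # perfect scaling
dual_core_realistic = single_core_ips * 1.5    # ~50% gain in practice, per the text

print(f"{single_core_ips:.2e} IPS single core")
print(f"{dual_core_ideal:.2e} IPS dual core (ideal), {dual_core_realistic:.2e} IPS (realistic)")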


Integrated heat spreader


The integrated heat spreader (IHS) is usually made of copper covered with a nickel plating.

References
[1] Envos Corporation (September 1988). "1. INTRODUCTION" (http://www.bitsavers.org/pdf/xerox/interlisp/400004_1108UsersGuide_Sep88.pdf) (PDF). 1108 USER'S GUIDE (Manual). Envos Corporation. p. 1. Retrieved May 24, 2012.
[2] Weik, Martin H. (1961). A Third Survey of Domestic Electronic Digital Computing Systems (http://ed-thelen.org/comp-hist/BRL61.html). Ballistic Research Laboratories.
[3] First Draft of a Report on the EDVAC (http://www.virtualtravelog.net/entries/2003-08-TheFirstDraft.pdf). Moore School of Electrical Engineering, University of Pennsylvania. 1945.
[4] Amdahl, G. M., Blaauw, G. A., & Brooks, F. P. Jr. (1964). Architecture of the IBM System/360 (http://www.research.ibm.com/journal/rd/441/amdahl.pdf). IBM Research.
[5] Digital Equipment Corporation (November 1975). "LSI-11 Module Descriptions" (http://www.classiccmp.org/bitsavers/pdf/dec/pdp11/1103/EK-LSI11-TM-002.pdf). LSI-11, PDP-11/03 user's manual (2nd ed.). Maynard, Massachusetts: Digital Equipment Corporation. pp. 43.
[6] Excerpts from A Conversation with Gordon Moore: Moore's Law (PDF) (ftp://download.intel.com/museum/Moores_Law/Video-Transcripts/Excepts_A_Conversation_with_Gordon_Moore.pdf). Intel. 2005. Retrieved 2012-07-25.
[7] Since the program counter counts memory addresses and not instructions, it is incremented by the number of memory units that the instruction word contains. In the case of simple fixed-length instruction word ISAs, this is always the same number. For example, a fixed-length 32-bit instruction word ISA that uses 8-bit memory words would always increment the PC by 4 (except in the case of jumps). ISAs that use variable-length instruction words increment the PC by the number of memory words corresponding to the last instruction's length.
[8] Because the instruction set architecture of a CPU is fundamental to its interface and usage, it is often used as a classification of the "type" of CPU. For example, a "PowerPC CPU" uses some variant of the PowerPC ISA. A system can execute a different ISA by running an emulator.
[9] Some early computers like the Harvard Mark I did not support any kind of "jump" instruction, effectively limiting the complexity of the programs they could run. It is largely for this reason that these computers are often not considered to contain a CPU proper, despite their close similarity as stored-program computers.
[10] The physical concept of voltage is an analog one by its nature, practically having an infinite range of possible values. For the purpose of physical representation of binary numbers, set ranges of voltages are defined as one or zero. These ranges are usually influenced by the circuit designs and operational parameters of the switching elements used to create the CPU, such as a transistor's threshold level.
[11] While a CPU's integer size sets a limit on integer ranges, this can (and often is) overcome using a combination of software and hardware techniques. By using additional memory, software can represent integers many magnitudes larger than the CPU can. Sometimes the CPU's ISA will even facilitate operations on integers larger than it can natively represent by providing instructions to make large integer arithmetic relatively quick. While this method of dealing with large integers is somewhat slower than utilizing a CPU with higher integer size, it is a reasonable trade-off in cases where natively supporting the full integer range needed would be cost-prohibitive. See Arbitrary-precision arithmetic for more details on purely software-supported arbitrary-sized integers.
[12] In fact, all synchronous CPUs use a combination of sequential logic and combinational logic. (See Boolean logic.)
[13] Brown, Jeffery (2005). "Application-customized CPU design" (http://www-128.ibm.com/developerworks/power/library/pa-fpfxbox/?ca=dgr-lnxw07XBoxDesign). IBM developerWorks. Retrieved 2005-12-17.
[14] Garside, J. D., Furber, S. B., & Chung, S-H (1999). AMULET3 Revealed (http://www.cs.manchester.ac.uk/apt/publications/papers/async99_A3.php). University of Manchester Computer Science Department.
[15] Neither ILP nor TLP is inherently superior over the other; they are simply different means by which to increase CPU parallelism. As such, they both have advantages and disadvantages, which are often determined by the type of software that the processor is intended to run. High-TLP CPUs are often used in applications that lend themselves well to being split up into numerous smaller applications, so-called "embarrassingly parallel problems". Frequently, a computational problem that can be solved quickly with high-TLP design strategies like SMP takes significantly more time on high-ILP devices like superscalar CPUs, and vice versa.
[16] Huynh, Jack (2003). "The AMD Athlon XP Processor with 512KB L2 Cache" (http://courses.ece.uiuc.edu/ece512/Papers/Athlon.pdf). University of Illinois Urbana-Champaign. pp. 6–11. Retrieved 2007-10-06.
[17] Best-case scenario (or peak) IPC rates in very superscalar architectures are difficult to maintain since it is impossible to keep the instruction pipeline filled all the time. Therefore, in highly superscalar CPUs, average sustained IPC is often discussed rather than peak IPC.
[18] Earlier the term scalar was used to compare the IPC (instructions per cycle) count afforded by various ILP methods. Here the term is used in the strictly mathematical sense to contrast with vectors. See scalar (mathematics) and vector (geometric).
[19] Although SSE/SSE2/SSE3 have superseded MMX in Intel's general-purpose CPUs, later IA-32 designs still support MMX. This is usually accomplished by providing most of the MMX functionality with the same hardware that supports the much more expansive SSE instruction sets.
[20] "CPU Frequency" (http://www.cpu-world.com/Glossary/C/CPU_Frequency.html). CPU World Glossary. CPU World. 25 March 2008. Retrieved 1 January 2010.



[21] "What is (a) multi-core processor?" (http:/ / searchdatacenter. techtarget. com/ sDefinition/ 0,,sid80_gci1015740,00. html). Data Center Definitions. SearchDataCenter.com. 27 March 2007. . Retrieved 1 January 2010.


External links
How Microprocessors Work (http://www.howstuffworks.com/microprocessor.htm) at HowStuffWorks
25 Microchips that shook the world (http://spectrum.ieee.org/25chips), an article by the Institute of Electrical and Electronics Engineers

Random-access memory
Random-access memory (RAM) is a form of computer data storage. A random-access device allows stored data to be accessed in very nearly the same amount of time for any storage location, so data can be accessed quickly in any random order. In contrast, other data storage media such as hard disks, CDs, DVDs and magnetic tape, as well as early primary memory types such as drum memory, read and write data only in a predetermined order, consecutively, because of mechanical design limitations. Therefore, the time to access a given data location varies significantly depending on its physical location.
Example of writable volatile random-access memory: Synchronous Dynamic RAM modules, primarily used as main memory in personal computers, workstations, and servers

Today, random-access memory takes the form of integrated circuits. Strictly speaking, modern types of DRAM are not random access, as data is read in bursts, although the name DRAM/RAM has stuck. However, many types of SRAM, ROM, OTP, and NOR flash are still random access even in a strict sense. RAM is often associated with volatile types of memory (such as DRAM memory modules), where its stored information is lost if the power is removed. Many other types of non-volatile memory are RAM as well, including most types of ROM and a type of flash memory called NOR-Flash. The first RAM modules to come into the market were created in 1951 and were sold until the late 1960s and early 1970s.

History
Early computers used relays or delay lines for "main" memory functions. Ultrasonic delay lines could only reproduce data in the order it was written. Drum memory could be expanded at low cost, but retrieval of non-sequential memory items required knowledge of the physical layout of the drum to optimize speed. Latches built out of vacuum tube triodes, and later out of discrete transistors, were used for smaller and faster memories such as random-access register banks and registers. Such registers were relatively large, power-hungry and too costly to use for large amounts of data; generally only a few hundred or a few thousand bits of such memory could be provided.

1 Megabit chip - one of the last models developed by VEB Carl Zeiss Jena in 1989

The first practical form of random-access memory was the Williams tube starting in 1947. It stored data as electrically charged spots on the face of a cathode ray tube. Since the electron beam of the CRT could read and write the spots on the tube in any order, memory was random access. The capacity of the Williams tube was a few hundred to around a thousand bits, but it was much smaller, faster, and more power-efficient than using individual vacuum tube latches.

Magnetic-core memory, invented in 1947 and developed up until the mid-1970s, became a widespread form of random-access memory. It relied on an array of magnetized rings; by changing the sense of magnetization, data could be stored, with each bit represented physically by one ring. Since every ring had a combination of address wires to select and read or write it, access to any memory location in any sequence was possible. Magnetic core memory was the standard form of memory system until displaced by solid-state memory in integrated circuits, starting in the early 1970s.

Robert H. Dennard invented dynamic random-access memory (DRAM) in 1968; this allowed replacement of a 4- or 6-transistor latch circuit by a single transistor for each memory bit, greatly increasing memory density at the cost of volatility. Data was stored in the tiny capacitance of each transistor, and had to be periodically refreshed every few milliseconds before the charge could leak away.

Prior to the development of integrated read-only memory (ROM) circuits, permanent (or read-only) random-access memory was often constructed using diode matrices driven by address decoders, or specially wound core rope memory planes.


Types of RAM
Top L-R: DDR2 with heat-spreader, DDR2 without heat-spreader, Laptop DDR2, DDR, Laptop DDR

The two main forms of modern RAM are static RAM (SRAM) and dynamic RAM (DRAM). In static RAM, a bit of data is stored using the state of a flip-flop. This form of RAM is more expensive to produce, but is generally faster and requires less power than DRAM and, in modern computers, is often used as cache memory for the CPU. DRAM stores a bit of data using a transistor and capacitor pair, which together comprise a memory cell. The capacitor holds a high or low charge (1 or 0, respectively), and the transistor acts as a switch that lets the control circuitry on the chip read the capacitor's state of charge or change it. As this form of memory is less expensive to produce than static RAM, it is the predominant form of computer memory used in modern computers.

Both static and dynamic RAM are considered volatile, as their state is lost or reset when power is removed from the system. By contrast, read-only memory (ROM) stores data by permanently enabling or disabling selected transistors, such that the memory cannot be altered. Writeable variants of ROM (such as EEPROM and flash memory) share properties of both ROM and RAM, enabling data to persist without power and to be updated without requiring special equipment. These persistent forms of semiconductor ROM include USB flash drives, memory cards for cameras and portable devices, etc. As of 2007, NAND flash has begun to replace older forms of persistent storage, such as magnetic disks and tapes, while NOR flash is being used in place of ROM in netbooks and rugged computers, since it is capable of true random access, allowing direct code execution.

ECC memory (which can be either SRAM or DRAM) includes special circuitry to detect and/or correct random faults (memory errors) in the stored data, using parity bits or error correction codes. In general, the term RAM refers solely to solid-state memory devices (either DRAM or SRAM), and more specifically the main memory in most computers. In optical storage, the term DVD-RAM is somewhat of a misnomer since, unlike CD-RW or DVD-RW, it does not need to be erased before reuse. Nevertheless, a DVD-RAM behaves much like a hard disk drive, if somewhat slower.


Memory hierarchy
One can read and over-write data in RAM. Many computer systems have a memory hierarchy consisting of CPU registers, on-die SRAM caches, external caches, DRAM, paging systems, and virtual memory or swap space on a hard drive. This entire pool of memory may be referred to as "RAM" by many developers, even though the various subsystems can have very different access times, violating the original concept behind the random access term in RAM. Even within a hierarchy level such as DRAM, the specific row, column, bank, rank, channel, or interleave organization of the components makes the access time variable, although not to the extent that access times to rotating storage media or a tape are variable. The overall goal of using a memory hierarchy is to obtain the highest possible average access performance while minimizing the total cost of the entire memory system (generally, the memory hierarchy follows the access time, with the fast CPU registers at the top and the slow hard drive at the bottom).

In many modern personal computers, the RAM comes in an easily upgraded form of modules called memory modules or DRAM modules, about the size of a few sticks of chewing gum. These can quickly be replaced should they become damaged or when changing needs demand more storage capacity. As suggested above, smaller amounts of RAM (mostly SRAM) are also integrated in the CPU and other ICs on the motherboard, as well as in hard drives, CD-ROMs, and several other parts of the computer system.

Other uses of RAM


In addition to serving as temporary storage and working space for the operating system and applications, RAM is used in numerous other ways.

Virtual memory
Most modern operating systems employ a method of extending RAM capacity, known as "virtual memory". A portion of the computer's hard drive is set aside for a paging file or a scratch partition, and the combination of physical RAM and the paging file form the system's total memory. (For example, if a computer has 2 GB of RAM and a 1 GB page file, the operating system has 3 GB total memory available to it.) When the system runs low on physical memory, it can "swap" portions of RAM to the paging file to make room for new data, as well as to read previously swapped information back into RAM. Excessive use of this mechanism results in thrashing and generally hampers overall system performance, mainly because hard drives are far slower than RAM.
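The arithmetic in the example above (physical RAM plus page file equals the memory available to the operating system) can be read off a running system with the third-party psutil package. This is a minimal sketch assuming psutil is installed; field names beyond total are not relied upon.

# Minimal sketch: report physical RAM, swap/page file, and their sum, which
# is roughly the total memory the OS can hand out (assumes psutil is installed).
import psutil

GIB = 1024 ** 3
ram = psutil.virtual_memory().total
swap = psutil.swap_memory().total
print(f"physical RAM : {ram / GIB:.2f} GiB")
print(f"swap/pagefile: {swap / GIB:.2f} GiB")
print(f"combined     : {(ram + swap) / GIB:.2f} GiB")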

RAM disk
Software can "partition" a portion of a computer's RAM, allowing it to act as a much faster hard drive that is called a RAM disk. A RAM disk loses the stored data when the computer is shut down, unless memory is arranged to have a standby battery source.
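On many Linux systems, /dev/shm is a tmpfs mount backed by RAM, so a file written there behaves like a small RAM disk. The sketch below times the same write to /dev/shm and to the current directory; the paths and payload size are assumptions, and the comparison is only indicative, since the operating system's page cache can mask much of the difference on the disk side.

# Indicative sketch: time the same write to a RAM-backed tmpfs (/dev/shm on
# many Linux systems) and to a directory on disk. Paths and sizes are assumptions.
import os
import time

def timed_write(path, data):
    t0 = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # force the data out of OS buffers
    elapsed = time.perf_counter() - t0
    os.remove(path)
    return elapsed

payload = os.urandom(64 * 1024 * 1024)   # 64 MiB of random data
print("tmpfs (RAM):", timed_write("/dev/shm/ramdisk_test.bin", payload))
print("disk       :", timed_write("./ramdisk_test.bin", payload))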

Shadow RAM
Sometimes, the contents of a relatively slow ROM chip are copied to read/write memory to allow for shorter access times. The ROM chip is then disabled while the initialized memory locations are switched in on the same block of addresses (often write-protected). This process, sometimes called shadowing, is fairly common in both computers and embedded systems. As a common example, the BIOS in typical personal computers often has an option called "use shadow BIOS" or similar. When enabled, functions relying on data from the BIOS's ROM will instead use DRAM locations (most can also toggle shadowing of video card ROM or other ROM sections). Depending on the system, this may not result in increased performance, and may cause incompatibilities. For example, some hardware may be inaccessible to the operating system if shadow RAM is used. On some systems the benefit may be hypothetical because the BIOS is not used after booting in favor of direct hardware access. Free memory is reduced by the size of the shadowed ROMs.[1]


Recent developments
Several new types of non-volatile RAM, which will preserve data while powered down, are under development. The technologies used include carbon nanotubes and approaches utilizing the magnetic tunnel effect. Amongst the 1st generation MRAM, a 128 KiB (128 × 2^10 bytes) magnetic RAM (MRAM) chip was manufactured with 0.18 µm technology in the summer of 2003. In June 2004, Infineon Technologies unveiled a 16 MiB (16 × 2^20 bytes) prototype again based on 0.18 µm technology. There are two 2nd generation techniques currently in development: Thermal Assisted Switching (TAS),[2] which is being developed by Crocus Technology, and Spin Torque Transfer (STT), on which Crocus, Hynix, IBM, and several other companies are working.[3] Nantero built a functioning carbon nanotube memory prototype 10 GiB (10 × 2^30 bytes) array in 2004. Whether some of these technologies will be able to eventually take a significant market share from either DRAM, SRAM, or flash-memory technology, however, remains to be seen. Since 2006, "solid-state drives" (based on flash memory) with capacities exceeding 256 gigabytes and performance far exceeding traditional disks have become available. This development has started to blur the definition between traditional random-access memory and "disks", dramatically reducing the difference in performance. Some kinds of random-access memory, such as "EcoRAM", are specifically designed for server farms, where low power consumption is more important than speed.[4]

Memory wall
The "memory wall" is the growing disparity of speed between CPU and memory outside the CPU chip. An important reason for this disparity is the limited communication bandwidth beyond chip boundaries. From 1986 to 2000, CPU speed improved at an annual rate of 55% while memory speed only improved at 10%. Given these trends, it was expected that memory latency would become an overwhelming bottleneck in computer performance.[5] Currently, CPU speed improvements have slowed significantly partly due to major physical barriers and partly because current CPU designs have already hit the memory wall in some sense. Intel summarized these causes in their Platform 2015 documentation (PDF) [6] First of all, as chip geometries shrink and clock frequencies rise, the transistor leakage current increases, leading to excess power consumption and heat... Secondly, the advantages of higher clock speeds are in part negated by memory latency, since memory access times have not been able to keep pace with increasing clock frequencies. Third, for certain applications, traditional serial architectures are becoming less efficient as processors get faster (due to the so-called Von Neumann bottleneck), further undercutting any gains that frequency increases might otherwise buy. In addition, partly due to limitations in the means of producing inductance within solid state devices, resistance-capacitance (RC) delays in signal transmission are growing as feature sizes shrink, imposing an additional bottleneck that frequency increases don't address. The RC delays in signal transmission were also noted in Clock Rate versus IPC: The End of the Road for Conventional Microarchitectures [7] which projects a maximum of 12.5% average annual CPU performance improvement between 2000 and 2014. The data on Intel Processors [8] clearly shows a slowdown in performance improvements in recent processors. However, Intel's Core 2 Duo processors (codenamed Conroe) showed a significant improvement over previous Pentium 4 processors; due to a more efficient architecture, performance increased while clock rate actually decreased.


Notes and references


[1] "Shadow Ram" (http://hardwarehell.com/articles/shadowram.htm). Retrieved 2007-07-24.
[2] The Emergence of Practical MRAM (http://www.crocus-technology.com/pdf/BH%20GSA%20Article.pdf).
[3] http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=218000269
[4] "EcoRAM held up as less power-hungry option than DRAM for server farms" (http://blogs.zdnet.com/green/?p=1165) by Heather Clancy, 2008.
[5] The term was coined in http://www.eecs.ucf.edu/~lboloni/Teaching/EEL5708_2006/slides/wulf94.pdf
[6] http://epic.hpi.uni-potsdam.de/pub/Home/TrendsAndConceptsII2010/HW_Trends_borkar_2015.pdf
[7] http://www.cs.utexas.edu/users/cart/trips/publications/isca00.pdf
[8] http://www.intel.com/pressroom/kits/quickreffam.htm

External links
Memory Prices (1957-2010) (http://www.jcmit.com/memoryprice.htm)

Computer data storage


Computer data storage, often called storage or memory, is a technology consisting of computer components and recording media used to retain digital data. It is a core function and fundamental component of computers. In contemporary usage, memory is usually semiconductor storage, typically read-write random-access memory such as DRAM (dynamic RAM) or other forms of fast but temporary storage. Storage consists of storage devices and their media that are not directly accessible by the CPU (secondary or tertiary storage), typically hard disk drives, optical disc drives, and other devices slower than RAM but non-volatile (retaining contents when powered down).[1] Historically, memory has been called core, main memory, real storage or internal memory, while storage devices have been referred to as secondary storage, external memory or auxiliary/peripheral storage. The distinctions are fundamental to the architecture of computers. The distinctions also reflect an important and significant technical difference between memory and mass storage devices, which has been blurred by the historical usage of the term storage. Nevertheless, this article uses the traditional nomenclature. Many different forms of storage, based on various natural phenomena, have been invented. So far, no practical universal storage medium exists, and all forms of storage have some drawbacks. Therefore a computer system usually contains several kinds of storage, each with an individual purpose. A modern digital computer represents data using the binary numeral system. Text, numbers, pictures, audio, and nearly any other form of information can be converted into a string of bits, or binary digits, each of which has a value of 1 or 0.
40 GB PATA hard disk drive (HDD); when connected to a computer it serves as secondary storage.

1 GB of SDRAM mounted in a personal computer. An example of primary storage.


The most common unit of storage is the byte, equal to 8 bits. A piece of information can be handled by any computer or device whose storage space is large enough to accommodate the binary representation of the piece of information, or simply data. For example, the complete works of Shakespeare, about 1250 pages in print, can be stored in about five megabytes (forty million bits) with one byte per character. The defining component of a computer is the central processing unit (CPU, or simply processor), because it operates on data, performs computations, and controls other components. In the most commonly used computer architecture, the CPU consists of two main parts: the control unit and the arithmetic logic unit (ALU). The former controls the flow of data between the CPU and memory; the latter performs arithmetic and logical operations on data.
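The Shakespeare estimate above can be checked with a few lines of arithmetic. The characters-per-page figure below is not stated in the text; it is the assumption implied by dividing five megabytes across about 1250 pages at one byte per character.

# Check the back-of-envelope figures: ~1250 pages, one byte per character,
# about five megabytes total. Characters per page is the implied assumption.
pages = 1250
bytes_total = 5_000_000               # "about five megabytes"
chars_per_page = bytes_total / pages  # implied average page size

print(f"characters per page ~ {chars_per_page:.0f}")     # ~4000
print(f"bits total          ~ {bytes_total * 8:,}")       # 40,000,000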

160 GB SDLT tape cartridge, an example of off-line storage. When used within a robotic tape library, it is classified as tertiary storage instead.

Without a significant amount of memory, a computer would merely be able to perform fixed operations and immediately output the result. It would have to be reconfigured to change its behavior. This is acceptable for devices such as desk calculators, digital signal processors, and other specialised devices. Von Neumann machines differ in having a memory in which they store their operating instructions and data. Such computers are more versatile in that they do not need to have their hardware reconfigured for each new program, but can simply be reprogrammed with new in-memory instructions; they also tend to be simpler to design, in that a relatively simple processor may keep state between successive computations to build up complex procedural results. Most modern computers are von Neumann machines. In practice, almost all computers use a variety of memory types, organized in a storage hierarchy around the CPU, as a trade-off between performance and cost. Generally, the lower a storage is in the hierarchy, the lower its bandwidth and the greater its access latency from the CPU. This traditional division of storage into primary, secondary, tertiary and off-line storage is also guided by cost per bit.


Hierarchy of storage
Primary storage
Primary storage (or main memory or internal memory), often referred to simply as memory, is the only one directly accessible to the CPU. The CPU continuously reads instructions stored there and executes them as required. Any data actively operated on is also stored there in a uniform manner. Historically, early computers used delay lines, Williams tubes, or rotating magnetic drums as primary storage. By 1954, those unreliable methods were mostly replaced by magnetic core memory. Core memory remained dominant until the 1970s, when advances in integrated circuit technology allowed semiconductor memory to become economically competitive. This led to modern random-access memory (RAM). It is small-sized, light, but quite expensive at the same time. (The particular types of RAM used for primary storage are also volatile, i.e. they lose the information when not powered). As shown in the diagram, traditionally there are two more sub-layers of the primary storage, besides main large-capacity RAM:

Various forms of storage, divided according to their distance from the central processing unit. The fundamental components of a general-purpose computer are arithmetic and logic unit, control circuitry, storage space, and input/output devices. Technology and capacity as in common home computers around 2005.

Processor registers are located inside the processor. Each register typically holds a word of data (often 32 or 64 bits). CPU instructions instruct the arithmetic and logic unit to perform various calculations or other operations on this data (or with the help of it). Registers are the fastest of all forms of computer data storage. Processor cache is an intermediate stage between ultra-fast registers and much slower main memory. It is introduced solely to increase the performance of the computer. The most actively used information in the main memory is simply duplicated in the cache memory, which is faster but of much lesser capacity. On the other hand, main memory is much slower, but has a much greater storage capacity than processor registers. A multi-level hierarchical cache setup is also commonly used: the primary cache is the smallest, fastest, and located inside the processor, while the secondary cache is somewhat larger and slower. Main memory is directly or indirectly connected to the central processing unit via a memory bus. It is actually two buses (not on the diagram): an address bus and a data bus. The CPU first sends a number through the address bus, a number called the memory address, that indicates the desired location of data. Then it reads or writes the data itself using the data bus. Additionally, a memory management unit (MMU) is a small device between the CPU and RAM that recalculates the actual memory address, for example to provide an abstraction of virtual memory or other tasks. As the RAM types used for primary storage are volatile (cleared at start up), a computer containing only such storage would not have a source to read instructions from in order to start the computer. Hence, non-volatile primary storage

containing a small startup program (BIOS) is used to bootstrap the computer, that is, to read a larger program from non-volatile secondary storage to RAM and start to execute it. A non-volatile technology used for this purpose is called ROM, for read-only memory (the terminology may be somewhat confusing, as most ROM types are also capable of random access). Many types of "ROM" are not literally read only, as updates are possible; however, writing is slow and memory must be erased in large portions before it can be re-written. Some embedded systems run programs directly from ROM (or similar), because such programs are rarely changed. Standard computers do not store non-rudimentary programs in ROM; rather, they use large capacities of secondary storage, which is non-volatile as well, and not as costly. Recently, primary storage and secondary storage in some uses refer to what was historically called, respectively, secondary storage and tertiary storage.[2]
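The address recalculation performed by the memory management unit mentioned above can be sketched as a page-table lookup: the virtual address is split into a page number and an offset, the page number is mapped to a physical frame, and the offset is reattached. The page size and the mappings in this sketch are illustrative assumptions, not those of any particular processor or operating system.

# Illustrative sketch of MMU-style address translation with a single-level
# page table. Page size and the mappings themselves are arbitrary assumptions.
PAGE_SIZE = 4096          # 4 KiB pages (a common, but here assumed, size)

# virtual page number -> physical frame number (assumed mappings)
page_table = {0: 7, 1: 3, 2: 9}

def translate(virtual_addr):
    vpn = virtual_addr // PAGE_SIZE       # virtual page number
    offset = virtual_addr % PAGE_SIZE     # offset within the page
    if vpn not in page_table:
        raise LookupError("page fault: page not resident")  # OS would page it in
    return page_table[vpn] * PAGE_SIZE + offset

print(hex(translate(0x1234)))   # page 1, offset 0x234 -> frame 3 -> 0x3234

A missing entry corresponds to a page fault, the event that triggers the paging mechanism described under virtual memory.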

Secondary storage
Secondary storage (also known as external memory or auxiliary storage), differs from primary storage in that it is not directly accessible by the CPU. The computer usually uses its input/output channels to access secondary storage and transfers the desired data using intermediate area in primary storage. Secondary storage does not lose the data when the device is powered downit is non-volatile. Per unit, it is typically also two orders of magnitude less expensive than primary storage. Consequently, modern computer systems typically have two orders of magnitude more secondary storage than primary storage and data are kept for a longer time there.

A hard disk drive with protective cover removed.

In modern computers, hard disk drives are usually used as secondary storage. The time taken to access a given byte of information stored on a hard disk is typically a few thousandths of a second, or milliseconds. By contrast, the time taken to access a given byte of information stored in random-access memory is measured in billionths of a second, or nanoseconds. This illustrates the significant access-time difference which distinguishes solid-state memory from rotating magnetic storage devices: hard disks are typically about a million times slower than memory. Rotating optical storage devices, such as CD and DVD drives, have even longer access times. With disk drives, once the disk read/write head reaches the proper placement and the data of interest rotates under it, subsequent data on the track are very fast to access. To reduce the seek time and rotational latency, data are transferred to and from disks in large contiguous blocks. When data reside on disk, block access to hide latency offers a ray of hope in designing efficient external memory algorithms. Sequential or block access on disks is orders of magnitude faster than random access, and many sophisticated paradigms have been developed to design efficient algorithms based upon sequential and block access. Another way to reduce the I/O bottleneck is to use multiple disks in parallel in order to increase the bandwidth between primary and secondary memory.[3] Some other examples of secondary storage technologies are: flash memory (e.g. USB flash drives or keys), floppy disks, magnetic tape, paper tape, punched cards, standalone RAM disks, and Iomega Zip drives. The secondary storage is often formatted according to a file system format, which provides the abstraction necessary to organize data into files and directories, providing also additional information (called metadata) describing the owner of a certain file, the access time, the access permissions, and other information. Most computer operating systems use the concept of virtual memory, allowing utilization of more primary storage capacity than is physically available in the system. As the primary memory fills up, the system moves the least-used chunks (pages) to secondary storage devices (to a swap file or page file), retrieving them later when they are needed. As more of these retrievals from slower secondary storage are necessary, the more the overall system performance is

degraded.

Tertiary storage
Tertiary storage, or tertiary memory,[4] provides a third level of storage. Typically it involves a robotic mechanism which will mount (insert) and dismount removable mass storage media into a storage device according to the system's demands; these data are often copied to secondary storage before use. It is primarily used for archiving rarely accessed information since it is much slower than secondary storage (e.g. 5–60 seconds vs. 1–10 milliseconds). This is primarily useful for extraordinarily large data stores, accessed without human operators. Typical examples include tape libraries and optical jukeboxes. When a computer needs to read information from the tertiary storage, it will first consult a catalog database to determine which tape or disc contains the information. Next, the computer will instruct a robotic arm to fetch the medium and place it in a drive. When the computer has finished reading the information, the robotic arm will return the medium to its place in the library.

Off-line storage

Large tape library. Tape cartridges placed on shelves in the front, robotic arm moving in the back. Visible height of the library is about 180 cm.

Off-line storage is computer data storage on a medium or a device that is not under the control of a processing unit.[5] The medium is recorded, usually in a secondary or tertiary storage device, and then physically removed or disconnected. It must be inserted or connected by a human operator before a computer can access it again. Unlike tertiary storage, it cannot be accessed without human interaction. Off-line storage is used to transfer information, since the detached medium can be easily physically transported. Additionally, in case a disaster, for example a fire, destroys the original data, a medium in a remote location will probably be unaffected, enabling disaster recovery. Off-line storage increases general information security, since it is physically inaccessible from a computer, and data confidentiality or integrity cannot be affected by computer-based attack techniques. Also, if the information stored for archival purposes is rarely accessed, off-line storage is less expensive than tertiary storage. In modern personal computers, most secondary and tertiary storage media are also used for off-line storage. Optical discs and flash memory devices are most popular, and, to a much lesser extent, removable hard disk drives. In enterprise uses, magnetic tape is predominant. Older examples are floppy disks, Zip disks, or punched cards.


Characteristics of storage
Storage technologies at all levels of the storage hierarchy can be differentiated by evaluating certain core characteristics as well as measuring characteristics specific to a particular implementation. These core characteristics are volatility, mutability, accessibility, and addressability. For any particular implementation of any storage technology, the characteristics worth measuring are capacity and performance.

Volatility
Non-volatile memory
Will retain the stored information even if it is not constantly supplied with electric power. It is suitable for long-term storage of information.
Volatile memory
Requires constant power to maintain the stored information. The fastest memory technologies of today are volatile ones (not a universal rule). Since primary storage is required to be very fast, it predominantly uses volatile memory.
Dynamic random-access memory
A form of volatile memory which also requires the stored information to be periodically re-read and re-written, or refreshed, otherwise it would vanish.
Static random-access memory
A form of volatile memory similar to DRAM with the exception that it never needs to be refreshed as long as power is applied. (It loses its content if power is removed.)

A 1GB DDR RAM module (detail)

Mutability
Read/write storage or mutable storage
Allows information to be overwritten at any time. A computer without some amount of read/write storage for primary storage purposes would be useless for many tasks. Modern computers typically use read/write storage also for secondary storage.
Read only storage
Retains the information stored at the time of manufacture, and write once storage (Write Once Read Many) allows the information to be written only once at some point after manufacture. These are called immutable storage. Immutable storage is used for tertiary and off-line storage. Examples include CD-ROM and CD-R.
Slow write, fast read storage
Read/write storage which allows information to be overwritten multiple times, but with the write operation being much slower than the read operation. Examples include CD-RW and flash memory.


Accessibility
Random access
Any location in storage can be accessed at any moment in approximately the same amount of time. Such a characteristic is well suited for primary and secondary storage. Most semiconductor memories and disk drives provide random access.
Sequential access
The accessing of pieces of information will be in a serial order, one after the other; therefore the time to access a particular piece of information depends upon which piece of information was last accessed. Such a characteristic is typical of off-line storage.

Addressability
Location-addressable
Each individually accessible unit of information in storage is selected with its numerical memory address. In modern computers, location-addressable storage is usually limited to primary storage, accessed internally by computer programs, since location-addressability is very efficient, but burdensome for humans.
File addressable
Information is divided into files of variable length, and a particular file is selected with human-readable directory and file names. The underlying device is still location-addressable, but the operating system of a computer provides the file system abstraction to make the operation more understandable. In modern computers, secondary, tertiary and off-line storage use file systems.
Content-addressable
Each individually accessible unit of information is selected on the basis of (part of) the contents stored there. Content-addressable storage can be implemented using software (a computer program) or hardware (a computer device), with hardware being the faster but more expensive option. Hardware content-addressable memory is often used in a computer's CPU cache. Content-addressable storage (CAS) addresses the question of how we are to find and access the information that we currently have or will gather in the future.
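Content addressing can be approximated in software by using a cryptographic hash of the data as its address, so identical content always maps to the same key. The following is a minimal sketch of that idea in Python; it illustrates the addressing scheme only and is not the hardware content-addressable memory used in CPU caches.

# Minimal software sketch of content addressing: the SHA-256 digest of a blob
# serves as its address, so identical content is stored (and found) only once.
import hashlib

store = {}

def put(blob: bytes) -> str:
    key = hashlib.sha256(blob).hexdigest()
    store[key] = blob
    return key

def get(key: str) -> bytes:
    return store[key]

addr = put(b"hello, storage")
print(addr[:16], get(addr))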

Capacity
Raw capacity
The total amount of stored information that a storage device or medium can hold. It is expressed as a quantity of bits or bytes (e.g. 10.4 megabytes).
Memory storage density
The compactness of stored information. It is the storage capacity of a medium divided by a unit of length, area or volume (e.g. 1.2 megabytes per square inch).

Performance
Latency
The time it takes to access a particular location in storage. The relevant unit of measurement is typically nanosecond for primary storage, millisecond for secondary storage, and second for tertiary storage. It may make sense to separate read latency and write latency, and in case of sequential access storage, minimum, maximum and average latency.
Throughput

The rate at which information can be read from or written to the storage. In computer data storage, throughput is usually expressed in terms of megabytes per second or MB/s, though bit rate may also be used. As with latency, read rate and write rate may need to be differentiated. Also, accessing media sequentially, as opposed to randomly, typically yields maximum throughput.
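The remark that sequential access typically yields the highest throughput can be checked with a crude measurement: reading a file front to back versus reading the same amount of data at random offsets. The file name, file size, and block size below are assumptions for illustration, and on a modern operating system the page cache will inflate both numbers after the first run.

# Crude sketch: sequential versus random reads of the same file. The file
# name, file size, and block size are assumptions for illustration only.
import os
import random
import time

PATH, SIZE, BLOCK = "throughput_test.bin", 256 * 1024 * 1024, 4096

with open(PATH, "wb") as f:                 # create a 256 MiB test file
    f.write(os.urandom(SIZE))

def measure(offsets):
    t0 = time.perf_counter()
    with open(PATH, "rb") as f:
        for off in offsets:
            f.seek(off)
            f.read(BLOCK)
    secs = time.perf_counter() - t0
    return (len(offsets) * BLOCK / secs) / 1e6   # MB/s

seq = measure(range(0, SIZE, BLOCK))
rnd = measure([random.randrange(0, SIZE - BLOCK) for _ in range(SIZE // BLOCK)])
print(f"sequential: {seq:.1f} MB/s, random: {rnd:.1f} MB/s")
os.remove(PATH)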

Energy use
Storage devices that reduce fan usage, automatically shut down during inactivity, and low-power hard drives can reduce energy consumption by 90 percent.[6] 2.5-inch hard disk drives often consume less power than larger ones.[7][8] Low-capacity solid-state drives have no moving parts and consume less power than hard disks.[9][10][11] Also, memory may use more power than hard disks.[11]

Fundamental storage technologies


As of 2011, the most commonly used data storage technologies are semiconductor, magnetic, and optical, while paper still sees some limited usage. Media is a common name for what actually holds the data in the storage device. Some other fundamental storage technologies have also been used in the past or are proposed for development.

Semiconductor
Semiconductor memory uses semiconductor-based integrated circuits to store information. A semiconductor memory chip may contain millions of tiny transistors or capacitors. Both volatile and non-volatile forms of semiconductor memory exist. In modern computers, primary storage almost exclusively consists of dynamic volatile semiconductor memory or dynamic random access memory. Since the turn of the century, a type of non-volatile semiconductor memory known as flash memory has steadily gained share as off-line storage for home computers. Non-volatile semiconductor memory is also used for secondary storage in various advanced electronic devices and specialized computers. As early as 2006, notebook and desktop computer manufacturers started using flash-based solid-state drives (SSDs) as default configuration options for the secondary storage either in addition to or instead of the more traditional HDD.[12][13][14][15][16]

Magnetic
Magnetic storage uses different patterns of magnetization on a magnetically coated surface to store information. Magnetic storage is non-volatile. The information is accessed using one or more read/write heads which may contain one or more recording transducers. A read/write head only covers a part of the surface, so that the head or medium or both must be moved relative to one another in order to access data. In modern computers, magnetic storage takes these forms:
Magnetic disk
Floppy disk, used for off-line storage
Hard disk drive, used for secondary storage
Magnetic tape, used for tertiary and off-line storage
In early computers, magnetic storage was also used as:
Primary storage in a form of magnetic memory, or core memory, core rope memory, thin-film memory and/or twistor memory.
Tertiary (e.g. NCR CRAM) or off-line storage in the form of magnetic cards.
Magnetic tape was then often used for secondary storage.


Optical
Optical storage, the typical optical disc, stores information in deformities on the surface of a circular disc and reads this information by illuminating the surface with a laser diode and observing the reflection. Optical disc storage is non-volatile. The deformities may be permanent (read only media), formed once (write once media) or reversible (recordable or read/write media). The following forms are currently in common use:[17]
CD, CD-ROM, DVD, BD-ROM: Read only storage, used for mass distribution of digital information (music, video, computer programs)
CD-R, DVD-R, DVD+R, BD-R: Write once storage, used for tertiary and off-line storage
CD-RW, DVD-RW, DVD+RW, DVD-RAM, BD-RE: Slow write, fast read storage, used for tertiary and off-line storage
Ultra Density Optical or UDO is similar in capacity to BD-R or BD-RE and is slow write, fast read storage used for tertiary and off-line storage.
Magneto-optical disc storage is optical disc storage where the magnetic state on a ferromagnetic surface stores information. The information is read optically and written by combining magnetic and optical methods. Magneto-optical disc storage is non-volatile, sequential access, slow write, fast read storage used for tertiary and off-line storage.
3D optical data storage has also been proposed.

Paper
Paper data storage, typically in the form of paper tape or punched cards, has long been used to store information for automatic processing, particularly before general-purpose computers existed. Information was recorded by punching holes into the paper or cardboard medium and was read mechanically (or later optically) to determine whether a particular location on the medium was solid or contained a hole. A few technologies allow people to make marks on paper that are easily read by machine; these are widely used for tabulating votes and grading standardized tests. Barcodes made it possible for any object that was to be sold or transported to have some computer readable information securely attached to it.

Uncommon
Vacuum tube memory: A Williams tube used a cathode ray tube, and a Selectron tube used a large vacuum tube to store information. These primary storage devices were short-lived in the market, since the Williams tube was unreliable and the Selectron tube was expensive.
Electro-acoustic memory: Delay line memory used sound waves in a substance such as mercury to store information. Delay line memory was dynamic volatile, cycle sequential read/write storage, and was used for primary storage.
Optical tape is a medium for optical storage generally consisting of a long and narrow strip of plastic onto which patterns can be written and from which the patterns can be read back. It shares some technologies with cinema film stock and optical discs, but is compatible with neither. The motivation behind developing this technology was the possibility of far greater storage capacities than either magnetic tape or optical discs.
Phase-change memory uses different mechanical phases of phase-change material to store information in an X-Y addressable matrix, and reads the information by observing the varying electrical resistance of the material. Phase-change memory would be non-volatile, random-access read/write storage, and might be used for primary, secondary and

off-line storage. Most rewritable and many write once optical disks already use phase change material to store information.
Holographic data storage stores information optically inside crystals or photopolymers. Holographic storage can utilize the whole volume of the storage medium, unlike optical disc storage which is limited to a small number of surface layers. Holographic storage would be non-volatile, sequential access, and either write once or read/write storage. It might be used for secondary and off-line storage. See Holographic Versatile Disc (HVD).
Molecular memory stores information in polymer that can store electric charge. Molecular memory might be especially suited for primary storage. The theoretical storage capacity of molecular memory is 10 terabits per square inch.[18]

Related technologies
Network connectivity
Secondary or tertiary storage may connect to a computer using computer networks. This concept does not pertain to primary storage, which is shared between multiple processors to a much lesser degree. Direct-attached storage (DAS) is traditional mass storage that does not use any network; this is still the most popular approach. The retronym was coined recently, together with NAS and SAN. Network-attached storage (NAS) is mass storage attached to a computer which another computer can access at file level over a local area network, a private wide area network, or, in the case of online file storage, over the Internet. NAS is commonly associated with the NFS and CIFS/SMB protocols. A storage area network (SAN) is a specialized network that provides other computers with storage capacity. The crucial difference between NAS and SAN is that the former presents and manages file systems to client computers, whilst the latter provides access at block-addressing (raw) level, leaving it to the attaching systems to manage data or file systems within the provided capacity. SAN is commonly associated with Fibre Channel networks.

Robotic storage
Large quantities of individual magnetic tapes, and optical or magneto-optical discs, may be stored in robotic tertiary storage devices. In the tape storage field they are known as tape libraries, and in the optical storage field as optical jukeboxes or, by analogy, optical disc libraries. The smallest forms of either technology, containing just one drive device, are referred to as autoloaders or autochangers. Robotic-access storage devices may have a number of slots, each holding individual media, and usually one or more picking robots that traverse the slots and load media into built-in drives. The arrangement of the slots and picking devices affects performance. Important characteristics of such storage are the possible expansion options: adding slots, modules, drives, or robots. Tape libraries may have from 10 to more than 100,000 slots, and provide terabytes or petabytes of near-line information. Optical jukeboxes are somewhat smaller solutions, up to 1,000 slots. Robotic storage is used for backups and for high-capacity archives in the imaging, medical, and video industries. Hierarchical storage management is the best-known archiving strategy, automatically migrating long-unused files from fast hard disk storage to libraries or jukeboxes; if the files are needed, they are retrieved back to disk.


References
This article incorporates public domain material from the General Services Administration document "Federal Standard 1037C".[19]
[1] Storage as defined in Microsoft Computing Dictionary, 4th Ed. (c)1999 or in The Authoritative Dictionary of IEEE Standard Terms, 7th Ed., (c) 2000.
[2] "Primary Storage or Storage Hardware" (shows usage of term "primary storage" meaning "hard disk storage") (http://searchstorage.techtarget.com/topics/0,295493,sid5_tax298620,00.html). Searchstorage.techtarget.com (2011-06-13). Retrieved on 2011-06-18.
[3] J. S. Vitter, Algorithms and Data Structures for External Memory (http://faculty.cse.tamu.edu/jsv/Papers/Vit.IO_book.pdf), Series on Foundations and Trends in Theoretical Computer Science, now Publishers, Hanover, MA, 2008, ISBN 978-1-60198-106-6.
[4] A thesis on Tertiary storage (http://www.eecs.berkeley.edu/Pubs/TechRpts/1994/CSD-94-847.pdf). (PDF). Retrieved on 2011-06-18.
[5] National Communications System (1996). Federal Standard 1037C Telecommunications: Glossary of Telecommunication Terms (http://www.its.bldrdoc.gov/fs-1037/fs-1037c.htm). General Services Administration. FS-1037C. Retrieved 2007-10-08. See also article Federal Standard 1037C.
[6] Energy Savings Calculator (http://www.springlightcfl.com/consumer/energy_savings_calculator.aspx) and Fabric website (http://www.simpletech.com/content/eco-friendly-redrive)
[7] Mike Chin (8 March 2004). "IS the Silent PC Future 2.5-inches wide?" (http://www.silentpcreview.com/article145-page1.html). Retrieved 2008-08-02.
[8] Mike Chin (2002-09-18). "Recommended Hard Drives" (http://www.silentpcreview.com/article29-page2.html). Retrieved 2008-08-02.
[9] Super Talent's 2.5" IDE Flash hard drive, The Tech Report, Page 13 (http://techreport.com/articles.x/10334/13). The Tech Report. Retrieved on 2011-06-18.
[10] Power Consumption, Tom's Hardware: Conventional Hard Drive Obsoletism? Samsung's 32 GB Flash Drive Previewed (http://www.tomshardware.com/reviews/conventional-hard-drive-obsoletism,1324-5.html). Tomshardware.com (2006-09-20). Retrieved on 2011-06-18.
[11] Aleksey Meyev (2008-04-23). "SSD, i-RAM and Traditional Hard Disk Drives" (http://www.xbitlabs.com/articles/storage/display/ssd-iram.html).
[12] New Samsung Notebook Replaces Hard Drive With Flash (http://www.extremetech.com/article2/0,1558,1966644,00.asp). ExtremeTech (2006-05-23). Retrieved on 2011-06-18.
[13] Welcome to TechNewsWorld (http://www.technewsworld.com/rsstory/60700.html?wlc=1308338527). Technewsworld.com. Retrieved on 2011-06-18.
[14] Mac Pro, Storage and RAID options for your Mac Pro (http://www.apple.com/macpro/features/storage.html). Apple (2006-07-27). Retrieved on 2011-06-18.
[15] MacBook Air, The best of iPad meets the best of Mac (http://www.apple.com/macbookair/design.html). Apple. Retrieved on 2011-06-18.
[16] MacBook Air Replaces the Standard Notebook Hard Disk for Solid State Flash Storage (http://news.inventhelp.com/Articles/Computer/Inventions/apple-macbook-air-12512.aspx). News.inventhelp.com (2010-11-15). Retrieved on 2011-06-18.
[17] The DVD FAQ (http://www.dvddemystified.com/dvdfaq.html) is a comprehensive reference of DVD technologies.
[18] New Method Of Self-assembling Nanoscale Elements Could Transform Data Storage Industry (http://www.sciencedaily.com/releases/2009/02/090219141438.htm). Sciencedaily.com (2009-03-01). Retrieved on 2011-06-18.
[19] http://www.its.bldrdoc.gov/fs-1037/fs-1037c.htm


Disk storage
Disk storage or disc storage is a general category of storage mechanisms in which data are digitally recorded by various electronic, magnetic, optical, or mechanical methods on a surface layer deposited on one or more planar, round and rotating disks (or discs) (also referred to as the media). A disk drive is a device implementing such a storage mechanism with fixed or removable media; with removable media the device is usually distinguished from the media, as in compact disc drive and the compact disc. Notable types are the hard disk drive (HDD) containing a non-removable disk, the floppy disk drive (FDD) and its removable floppy disk, and various optical disc drives and associated optical disc media.

Background
Musical and audio information was originally recorded by analog methods (see Sound recording and reproduction). Similarly the first video disc used an analog recording method. In the music industry, analog recording has been mostly replaced by digital optical technology where the data is recorded in a digital format with optical information. The first commercial digital disk storage device was the IBM RAMAC 350 shipped in 1956 as a part of the IBM 305 RAMAC computing system. The random-access, low-density storage of disks was developed to complement the already used sequential-access, high-density storage provided by tape drives using magnetic tape. Vigorous innovation in disk storage technology, coupled with less vigorous innovation in tape storage, has reduced the density and cost per bit gap between disk and tape, reducing the importance of tape as a complement to disk. Disk storage is now used in both computer storage and consumer electronic storage, e.g., audio CDs and video discs (standard DVD and Blu-ray).
Six hard disk drives

Three floppy disk drives

Access methods
A CD-ROM (optical) disc drive

Digital disk drives are block storage devices. Each disk is divided into logical blocks (collections of sectors). Blocks are addressed using their logical block addresses (LBA). Reading from or writing to disk happens at the granularity of blocks. Originally the disk capacity was quite low and has been improved in one of several ways. Improvements in mechanical design and manufacture allowed smaller and more accurate heads, meaning that more tracks could be stored on each of the disks. Advancements in data compression methods permitted more information to be stored in each of the individual sectors. The drive stores data onto cylinders, heads, and sectors. The sector is the smallest unit of data stored in a hard disk drive, and each file will have many sectors assigned to it. The smallest entity on a CD is called a frame, which consists of 33 bytes and contains six complete 16-bit stereo samples (two bytes × two channels × six samples = 24 bytes). The other nine bytes consist of eight CIRC error-correction bytes and one subcode byte used for

control and display. The information is sent from the computer processor to the BIOS into a chip controlling the data transfer. This is then sent out to the hard drive via a multi-wire connector. Once the data is received onto the circuit board of the drive, it is translated and compressed into a format that the individual drive can use to store onto the disk itself. The data is then passed to a chip on the circuit board that controls the access to the drive. The drive is divided into sectors of data stored onto one of the sides of one of the internal disks. An HDD with two disks internally will typically store data on all four surfaces. The hardware on the drive tells the actuator arm where it is to go for the relevant track, and the compressed information is then sent down to the head, which changes the physical properties, optically or magnetically for example, of each byte on the drive, thus storing the information. A file is not stored in a linear manner; rather, it is held in the best way for quickest retrieval.
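Logical block addresses relate to the older cylinder/head/sector scheme through the standard conversion LBA = (C × heads_per_cylinder + H) × sectors_per_track + (S − 1). The sketch below applies that formula with an assumed drive geometry and also rechecks the CD frame byte counts quoted above; the geometry values are illustrative only.

# CHS -> LBA using the standard conversion; the geometry values are assumed
# for illustration. Also recheck the CD frame byte counts quoted in the text.
HEADS_PER_CYL = 16      # assumed drive geometry
SECTORS_PER_TRACK = 63  # assumed drive geometry

def chs_to_lba(c, h, s):
    # Sector numbers are 1-based in the CHS scheme, hence the (s - 1).
    return (c * HEADS_PER_CYL + h) * SECTORS_PER_TRACK + (s - 1)

print(chs_to_lba(0, 0, 1))   # first sector of the disk -> LBA 0
print(chs_to_lba(2, 3, 4))   # (2*16 + 3)*63 + 3 = 2208

# CD frame: 6 stereo 16-bit samples = 24 data bytes, plus 8 CIRC
# error-correction bytes and 1 subcode byte = 33 bytes per frame.
data_bytes = 2 * 2 * 6
print(data_bytes, data_bytes + 8 + 1)   # 24 33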

Rotation speed and track layout


Mechanically there are two different motions occurring inside the drive. One is the rotation of the disks inside the device. The other is the side-to-side motion of the heads across the disk as it moves between tracks. There are two types of disk rotation methods: constant linear velocity (used mainly in optical storage) varies the rotational speed of the optical disc depending upon the position of the head, and constant angular velocity (used in HDDs, standard FDDs, a few optical disc systems, and vinyl audio records) spins the media at one constant speed regardless of where the head is positioned. Track positioning also follows two different methods across disk storage devices. Storage devices focused on holding computer data, e.g., HDDs, FDDs, Iomega zip drives, use concentric tracks to store data. During a sequential read or write operation, after the drive accesses all the sectors in a track it repositions the head(s) to the next track. This will cause a momentary delay in the flow of data between the device and the computer. In contrast, optical audio and video discs use a single spiral track that starts at the innermost point on the disc and flows continuously to the outer edge. When reading or writing data there is no need to stop the flow of data to switch tracks. This is similar to vinyl records, except vinyl records started at the outer edge and spiraled in toward the center.

Comparison of several forms of disk storage showing tracks (not to scale); green denotes start and red denotes end. Some CD-R(W) and DVD-R(W)/DVD+R(W) recorders operate in ZCLV, CAA or CAV modes.
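Under constant linear velocity the spindle speed must change with head position, since rpm = 60·v / (2πr), whereas under constant angular velocity it does not. The sketch below evaluates that relation at an assumed inner and outer track radius and an assumed 1.3 m/s linear speed, figures roughly in line with a 12 cm audio CD but chosen here only for illustration.

# For constant linear velocity (CLV) the required spindle speed depends on the
# head's radius: rpm = 60 * v / (2 * pi * r). The radii and the 1.3 m/s linear
# speed are assumptions roughly matching a 12 cm audio CD.
import math

def clv_rpm(linear_speed_m_s, radius_m):
    return 60 * linear_speed_m_s / (2 * math.pi * radius_m)

v = 1.3                     # assumed linear velocity in m/s
for r_mm in (25, 58):       # assumed innermost and outermost track radii
    print(f"r = {r_mm} mm -> {clv_rpm(v, r_mm / 1000):.0f} rpm")

The drive therefore spins roughly twice as fast when the head is near the hub as when it is near the rim, which is the slow-down the CLV description above refers to.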


Interfaces
The disk drive interface is the mechanism/protocol of communication between the rest of the system and the disk drive itself. Storage devices intended for desktop and mobile computers typically use ATA (PATA) and SATA interfaces. Enterprise systems and high-end storage devices will typically use SCSI, SAS, and FC interfaces in addition to some use of SATA.

Basic terminology
Platter
An individual recording disk. In a hard disk drive we tend to find a set of platters, and developments in optical technology have led to multiple recording layers on a single DVD.
Rotation
Platters rotate; two techniques are common:
Constant angular velocity (CAV) keeps the disk spinning at a fixed rate, measured in revolutions per minute (RPM). This means the heads cover more distance per unit of time on the outer tracks than on the inner tracks. This method is typical with computer hard drives.
Constant linear velocity (CLV) keeps the distance covered by the heads per unit time fixed. Thus the disk has to slow down as the arm moves to the outer tracks. This method is typical for CD drives.
Track
The circle of recorded data on a single recording surface of a platter.
Sector
A segment of a track.
Low-level formatting
Establishing the tracks and sectors.
Head
The device that reads and writes the information (magnetic or optical) on the disk surface.
Arm
The mechanical assembly that supports the head as it moves in and out.
Seek time
Time needed to move the head to a new position (specific track).
Rotational latency
Average time, once the arm is on the right track, before a head is over a desired sector.
Data transfer rate
The rate at which user data bits are transferred from or to the medium; technically, this would more accurately be called the "gross" data transfer rate.
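The terms above combine into the usual back-of-envelope estimate: average access time is roughly the average seek time plus the average rotational latency (half a revolution) plus the transfer time. The numbers in the sketch below are assumptions typical of a 7,200 rpm desktop drive, not measurements of any specific model.

# Back-of-envelope disk access time: seek + rotational latency + transfer.
# The seek time, spindle speed, transfer rate, and request size are assumptions.
rpm = 7200
avg_seek_ms = 9.0                       # assumed average seek time
transfer_mb_s = 150.0                   # assumed sustained media transfer rate
request_kb = 4                          # one 4 KiB block

rotational_latency_ms = 0.5 * (60_000 / rpm)          # half a revolution
transfer_ms = request_kb / 1024 / transfer_mb_s * 1000

total = avg_seek_ms + rotational_latency_ms + transfer_ms
print(f"rotational latency: {rotational_latency_ms:.2f} ms")
print(f"total access time : {total:.2f} ms")

At 7,200 rpm a half revolution takes about 4.2 ms, so for small requests the seek and rotational components dominate and the transfer time is almost negligible.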

References


Hard disk drive


Hard disk drive
Video of modern hard disk drive operation (cover removed)
Date invented: 24 December 1954[1]
Invented by: IBM team led by Rey Johnson

A hard disk drive (HDD; also hard drive, hard disk, or disk drive)[2] is a device for storing and retrieving digital information, primarily computer data. It consists of one or more rigid (hence "hard") rapidly rotating discs (platters) coated with magnetic material, and with magnetic heads arranged to write data to the surfaces and read it from them. Hard drives are classified as non-volatile, random-access, digital, magnetic, data storage devices. Introduced by IBM in 1956, hard disk drives have been the dominant device for secondary storage of data in general purpose computers since the early 1960s.[3] They have maintained this position because of advances which have resulted in increased recording capacity, reliability, and speed, as well as decreased cost, allowing them to keep pace with ever more demanding requirements for secondary storage.[3]

History
Hard disk drives were introduced in 1956 as data storage for an IBM real-time transaction processing computer[4] and were developed for use with general purpose mainframe and minicomputers. The first IBM drive, the 350 RAMAC, was approximately the size of two refrigerators and stored 5 million 6-bit characters (the equivalent of 3.75 million 8-bit bytes) on a stack of 50 discs. In 1961 IBM introduced the model 1311 disk drive, which was about the size of a washing machine and stored two million characters on a removable disk "pack." Users could buy additional packs and interchange them as needed, much like reels of magnetic tape. Later models of removable pack drives, from IBM and others, became the norm in most computer installations and reached capacities of 300 megabytes by the early 1980s. In 1973, IBM introduced a new type of hard drive codenamed "Winchester." Its primary distinguishing feature was that the disk heads were not withdrawn completely from the stack of disk platters when the drive was powered down. Instead, the heads were allowed to "land" on a special area of the disk surface upon spin-down, "taking off" again when the disk was later powered on. This greatly reduced the cost of the head actuator mechanism, but precluded removing just the disks from the drive as was done with the disk packs of the day. Instead, the first models of "Winchester technology" drives featured a removable disk module, which included both the disk pack and the head assembly, leaving the actuator motor in the drive upon removal. Later "Winchester" drives abandoned the removable media concept and returned to non-removable platters. Like the first removable pack drive, the first "Winchester" drives used platters 14inches in diameter. A few years later, designers were exploring the possibility that physically smaller platters might offer advantages. Drives with non-removable eight-inch platters appeared, and then drives that fit in a "five and a quarter inch" form factor (a mounting width equivalent to that used by a five and a quarter inch floppy disk drive). The latter were primarily intended for the then-fledgling personal computer market. As the 1980s began, hard disk drives were a rare and very expensive additional feature on personal computers (PCs); however by the late '80s, their cost had been reduced to the point where they were standard on all but the cheapest PC. Most hard disk drives in the early 1980s were sold to PC end users as an external, add-on subsystem. The subsystem was not sold under the drive manufacturer's name but under the subsystem manufacturer's name such as Corvus

Systems and Tallgrass Technologies, or under the PC system manufacturer's name such as the Apple ProFile. The IBM PC/XT in 1983 included an internal 10 MB hard disk drive, and soon thereafter internal hard disk drives proliferated on personal computers. External hard disk drives remained popular for much longer on the Apple Macintosh. Every Mac made between 1986 and 1998 has a SCSI port on the back, making external expansion easy; also, "toaster" Compact Macs did not have easily accessible hard drive bays (or, in the case of the Mac Plus, any hard drive bay at all), so on those models, external SCSI disks were the only reasonable option. Driven by areal density doubling every two to four years since their invention, hard disk drives have changed in many ways. A few highlights include:
Capacity per HDD increasing from 3.75 megabytes[4] to 4 terabytes or more, more than a million times larger.
Physical volume of HDD decreasing from 68 ft³[4] or about 2,000 litres (comparable to a large side-by-side refrigerator), to less than 20 ml[5] (1.2 in³), a 100,000-to-1 decrease.
Weight decreasing from 2,000 lbs[4] (~900 kg) to 48 grams[5] (~0.1 lb), a 20,000-to-1 decrease.
Price decreasing from about US$15,000 per megabyte[6] to less than $0.0001 per megabyte ($100 per terabyte), a greater than 150-million-to-1 decrease.[7]
Average access time decreasing from over 100 milliseconds to a few milliseconds, a greater than 40-to-1 improvement.
Market application expanding from mainframe computers of the late 1950s to most mass storage applications, including computers and consumer applications such as storage of entertainment content.
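The price figures in the list above imply the quoted ratio directly: $15,000 per megabyte down to about $0.0001 per megabyte ($100 per terabyte) is a factor of 150 million. A quick check:

# Verify the price-per-megabyte ratio quoted in the highlights list.
early_price_per_mb = 15_000.0     # ~US$15,000 per megabyte (late 1950s)
recent_price_per_mb = 0.0001      # ~$100 per terabyte = $0.0001 per megabyte

print(f"{early_price_per_mb / recent_price_per_mb:,.0f}x decrease")  # 150,000,000x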

Technology
Magnetic recording
A hard disk drive records data by magnetizing a thin film of ferromagnetic material on a disk. Sequential changes in the direction of magnetization represent binary data bits. The data is read from the disk by detecting the transitions in magnetization. User data is encoded using an encoding scheme, such as run-length limited encoding,[8] which determines how the data is represented by the magnetic transitions.

Diagram labeling the major components of a computer hard disk drive

A typical HDD design consists of a spindle[9] that holds flat circular disks, also called platters, which hold the recorded data. The platters are made from a non-magnetic material, usually aluminium alloy, glass, or ceramic, and are coated with a shallow layer of magnetic material typically 10–20 nm
in depth, with an outer layer of carbon for protection.[10][11][12] For reference, a standard piece of copy paper is 0.07–0.18 millimetre (70,000–180,000 nm).[13]

Overview of how a hard disk drive functions.

The platters in contemporary HDDs are spun at speeds varying from 4,200 rpm in energy-efficient portable devices, to 15,000 rpm for high performance servers.[15] The first hard drives spun at 1,200 rpm[16] and, for many years, 3,600 rpm was the norm.[17] Today, most consumer hard drives operate at a speed of 7,200 rpm. Information is written to and read from a platter as it rotates past devices called read-and-write heads that operate very close (often tens of nanometers) over the magnetic surface. The read-and-write head is used to detect and modify the magnetization of the material immediately under it. In modern drives there is one head for each magnetic platter surface on the spindle, mounted on a common arm. An actuator arm (or access arm) moves the heads on an arc (roughly radially) across the platters as they spin, allowing each head to access almost the entire surface of the platter as it spins. The arm is moved using a voice coil actuator or in some older designs a stepper motor. The magnetic surface of each platter is conceptually divided into many small

Magnetic cross section & frequency modulation encoded binary data

Recording of single magnetisations of bits on an hdd-platter (recording made visible [14] using CMOS-MagView).


sub-micrometer-sized magnetic regions, referred to as magnetic domains (although these are not magnetic domains in a rigorous physical sense), each of which has a mostly uniform magnetization. Due to the polycrystalline nature of the magnetic material, each of these magnetic regions is composed of a few hundred magnetic grains. Magnetic grains are typically 10 nm in size and each forms a single true magnetic domain. Each magnetic region in total forms a magnetic dipole which generates a magnetic field. In older disk designs the regions were oriented horizontally and parallel to the disk surface, but beginning about 2005, the orientation was changed to perpendicular to allow for closer magnetic domain spacing. For reliable storage of data, the recording material needs to resist self-demagnetization, which occurs when the magnetic domains repel each other. Magnetic domains written too densely together to a weakly magnetizable material will degrade over time due to rotation of the magnetic moment of one or more domains to cancel out these forces. The domains rotate sideways to a halfway position that weakens the readability of the domain and relieves the magnetic stresses. Older hard disks used iron(III) oxide as the magnetic material, but current disks use a cobalt-based alloy.[18] A write head magnetizes a region by generating a strong local magnetic field, and a read head detects the magnetization of the regions. Early HDDs used an electromagnet both to magnetize the region and to then read its magnetic field by using electromagnetic induction. Later versions of inductive heads included metal in Gap (MIG) heads and thin film heads. As data density increased, read heads using magnetoresistance (MR) came into use; the electrical resistance of the head changed according to the strength of the magnetism from the platter. Later development made use of spintronics; in read heads, the magnetoresistive effect was much greater than in earlier types, and was dubbed "giant" magnetoresistance (GMR). In today's heads, the read and write elements are separate, but in close proximity, on the head portion of an actuator arm. The read element is typically magneto-resistive while the write element is typically thin-film inductive.[19] The heads are kept from contacting the platter surface by the air that is extremely close to the platter; that air moves at or near the platter speed. The record and playback head are mounted on a block called a slider, and the surface next to the platter is shaped to keep it just barely out of contact. This forms a type of air bearing. In modern drives, the small size of the magnetic regions creates the danger that their magnetic state might be lost because of thermal effects. To counter this, the platters are coated with two parallel magnetic layers, separated by a 3-atom layer of the non-magnetic element ruthenium, and the two layers are magnetized in opposite orientation, thus reinforcing each other.[20] Another technology used to overcome thermal effects to allow greater recording densities is perpendicular recording, first shipped in 2005,[21] and as of 2007 the technology was used in many HDDs.[22][23][24]

Longitudinal recording (standard) & perpendicular recording diagram


Components
A typical hard disk drive has two electric motors: a disk motor that spins the disks and an actuator (motor) that positions the read/write head assembly across the spinning disks. The disk motor has an external rotor attached to the disks; the stator windings are fixed in place. Opposite the actuator at the end of the head support arm is the read-write head (near center in photo); thin printed-circuit cables connect the read-write heads to amplifier electronics mounted at the pivot of the actuator. A flexible, somewhat U-shaped, ribbon cable, seen edge-on below and to the left of the actuator arm, continues the connection to the controller board on the opposite side.

The head support arm is very light, but also stiff; in modern drives, acceleration at the head reaches 550 g.

The silver-colored structure at the upper left of the first image is the top plate of the actuator, a permanent-magnet and moving-coil motor that swings the heads to the desired position (it is shown removed in the second image). The plate supports a squat neodymium-iron-boron (NIB) high-flux magnet. Beneath this plate is the moving coil, often referred to as the voice coil by analogy to the coil in loudspeakers, which is attached to the actuator hub, and beneath that is a second NIB magnet, mounted on the bottom plate of the motor (some drives only have one magnet).

The voice coil itself is shaped rather like an arrowhead, and made of doubly coated copper magnet wire. The inner layer is insulation, and the outer is thermoplastic, which bonds the coil together after it is wound on a form, making it self-supporting. The portions of the coil along the two sides of the arrowhead (which point to the actuator bearing center) interact with the magnetic field, developing a tangential force that rotates the actuator. Current flowing radially outward along one side of the arrowhead and radially inward on the other produces the tangential force. If the magnetic field were uniform, each side would generate opposing forces that would cancel each other out. Therefore the surface of the magnet is half N pole, half S pole, with the radial dividing line in the middle, causing the two sides of the coil to see opposite magnetic fields and produce forces that add instead of canceling. Currents along the top and bottom of the coil produce radial forces that do not rotate the head.

HDD with disks and motor hub removed, exposing copper-colored stator coils surrounding a bearing in the center of the spindle motor. The orange stripe along the side of the arm is thin printed-circuit cable, the spindle bearing is in the center and the actuator is in the upper left.

A disassembled and labeled 1997 hard drive. All major components were placed on a mirror, which created the symmetrical reflections.

Actuation of moving arm

The hard drive's electronics control the movement of the actuator and the rotation of the disk, and perform reads and writes on demand from the disk controller. Feedback of the drive electronics is accomplished by means of special segments of the disk dedicated to servo feedback. These are either complete concentric circles (in the case of dedicated servo technology), or segments interspersed with real data (in the case of embedded servo technology). The servo feedback optimizes the signal-to-noise ratio of the GMR sensors by adjusting the voice coil of the actuated arm. The spinning of the disk also uses a servo motor. Modern disk firmware is capable of scheduling reads and writes efficiently on the platter surfaces and remapping sectors of the media which have failed.


Head stack with an actuator coil on the left and read/write heads on the right

Error handling
Modern drives make extensive use of error correction codes (ECCs), particularly Reed–Solomon error correction. These techniques store extra bits, determined by mathematical formulas, for each block of data; the extra bits allow many errors to be corrected invisibly. The extra bits themselves take up space on the hard drive, but allow higher recording densities to be employed without causing uncorrectable errors, resulting in much larger storage capacity.[25] In the newest drives of 2009, low-density parity-check codes (LDPC) were supplanting Reed–Solomon; LDPC codes enable performance close to the Shannon limit and thus provide the highest storage density available.[26] Typical hard drives attempt to "remap" the data in a physical sector that is failing to a spare physical sector, hopefully while the errors in the bad sector are still few enough that the ECC can recover the data without loss. The S.M.A.R.T. system counts the total number of errors in the entire hard drive fixed by ECC and the total number of remappings, as the occurrence of many such errors may predict hard drive failure.
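To give a flavour of how redundant bits allow errors to be corrected transparently, the Python sketch below implements a textbook Hamming(7,4) code. This is purely illustrative: real drives use far more powerful Reed–Solomon or LDPC codes applied to whole sectors, and nothing about this bit layout is specific to drive firmware.

def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]   # positions 1..7

def hamming74_correct(c):
    """Correct at most one flipped bit and return the 4 recovered data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3       # 0 = no error, otherwise the 1-based error position
    if syndrome:
        c[syndrome - 1] ^= 1              # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
codeword = hamming74_encode(data)
codeword[5] ^= 1                          # simulate a single-bit read error
assert hamming74_correct(codeword) == data  # the error is corrected invisibly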

Future development
Due to bit-flipping errors and other issues, perpendicular recording densities may eventually be supplanted by other magnetic recording technologies. Toshiba is promoting bit-patterned recording (BPR),[27] while Xyratex is developing heat-assisted magnetic recording (HAMR).[28] In October 2011, TDK announced a special laser that heats a hard disk's surface with a precision of a few dozen nanometers; TDK also used a new material in the magnetic head and redesigned its structure to increase the recording density. This technology reportedly makes it possible to store one terabyte on one platter, and TDK plans to include two platters in the initial hard drive.[29]


Capacity
The capacity of an HDD may appear to the end user to be different from the amount stated by the drive or system manufacturer due to, among other things, different units for measuring capacity, capacity consumed in formatting the drive for use by an operating system, and/or redundancy.

Units of storage capacity


Advertised capacity by manufacturer (decimal multiples) | Expected capacity by consumers in class action (binary multiples) | Reported capacity

With prefix | Bytes | Bytes | Diff. | Windows (binary multiples) | Mac OS X 10.6+ (decimal multiples)
100 MB | 100,000,000 | 104,857,600 | 4.86% | 95.4 MB | 100 MB
100 GB | 100,000,000,000 | 107,374,182,400 | 7.37% | 93.1 GB, 95,367 MB | 100 GB
1 TB | 1,000,000,000,000 | 1,099,511,627,776 | 9.95% | 931 GB, 953,674 MB | 1,000 GB, 1,000,000 MB

The capacity of hard disk drives is given by manufacturers in megabytes (1 MB = 1,000,000 bytes), gigabytes (1 GB = 1,000,000,000 bytes) or terabytes (1 TB = 1,000,000,000,000 bytes).[30][31] This numbering convention, where prefixes like mega- and giga- denote powers of 1,000, is also used for data transmission rates and DVD capacities. However, the convention is different from that used by manufacturers of memory (RAM, ROM) and CDs, where prefixes like kilo- and mega- mean powers of 1,024. When unit prefixes like kilo- denote powers of 1,024 in the measure of memory capacities, the 1,024^n progression (for n = 1, 2, ...) is as follows:[30] kilo = 2^10 = 1,024^1 = 1,024; mega = 2^20 = 1,024^2 = 1,048,576; giga = 2^30 = 1,024^3 = 1,073,741,824. The practice of using prefixes assigned to powers of 1,000 within the hard drive and computer industries dates back to the early days of computing.[32] By the 1970s, "million", "mega" and "M" were consistently being used in the powers-of-1,000 sense to describe HDD capacity.[33][34][35] As HDD sizes grew, the industry adopted the prefixes G for giga and T for tera, denoting 1,000,000,000 and 1,000,000,000,000 bytes of HDD capacity respectively. Likewise, the practice of using prefixes assigned to powers of 1,024 within the computer industry also traces its roots to the early days of computing.[36] By the early 1970s, using the prefix K in a powers-of-1,024 sense to describe memory was common within the industry.[37][38] As memory sizes grew, the industry adopted the prefixes M for mega and G for giga, denoting 1,048,576 and 1,073,741,824 bytes of memory respectively. Computers do not internally represent HDD or memory capacity in powers of 1,024; reporting it in this manner is just a convention.[39] Creating confusion, operating systems report HDD capacity in different ways. Most operating systems, including the Microsoft Windows operating systems, use the powers-of-1,024 convention when reporting HDD capacity, thus an HDD offered by its manufacturer as a 1 TB drive is reported by these OSes as a 931 GB HDD. Apple's current OSes, beginning with Mac OS X 10.6 (Snow Leopard), use powers of 1,000 when reporting HDD capacity, thereby avoiding any discrepancy between what they report and what the manufacturer advertises. In the case of mega-, there is a nearly 5% difference between the powers-of-1,000 definition and the powers-of-1,024 definition, and the difference is compounded by a further 2.4% with each incrementally larger prefix (gigabyte, terabyte, etc.). The discrepancy between the two conventions for measuring capacity was the subject of several class action suits against HDD manufacturers. The plaintiffs argued that the use of decimal measurements effectively misled consumers,[40][41] while the defendants denied any wrongdoing or liability, asserting that their marketing and advertising complied in all respects with the law and that no class member sustained any damages or injuries.[42] In December 1998, an international standards organization attempted to address these dual definitions of the conventional prefixes by proposing unique binary prefixes and prefix symbols to denote multiples of 1,024, such as mebibyte (MiB), which exclusively denotes 2^20 or 1,048,576 bytes.[43] In the more than 13 years that have since elapsed, the proposal has seen little adoption by the computer industry, and the conventionally prefixed forms of byte continue to denote slightly different values depending on context.[44][45]
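A minimal Python sketch of the arithmetic behind this discrepancy; the advertised size is the 1 TB example used above, and the variable names are illustrative.

ADVERTISED_BYTES = 1_000_000_000_000      # a drive sold as "1 TB" (decimal, 10^12 bytes)

GB = 1000 ** 3                            # decimal gigabyte used by drive manufacturers and Mac OS X 10.6+
GIB = 1024 ** 3                           # binary "GB" (properly a gibibyte) used by Windows when reporting capacity

print(ADVERTISED_BYTES / GB)              # 1000.0  -> reported as 1,000 GB under the decimal convention
print(ADVERTISED_BYTES / GIB)             # ~931.32 -> reported as roughly 931 GB under the binary convention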


HDD formatting
The presentation of an HDD to its host is determined by its controller. This may differ substantially from the drive's native interface, particularly in mainframes or servers. Modern HDDs, such as SAS[46] and SATA[47] drives, appear at their interfaces as a contiguous set of logical blocks, typically 512 bytes long, though the industry is in the process of changing to 4,096-byte logical blocks; see Advanced Format.[48] The process of initializing these logical blocks on the physical disk platters is called low-level formatting, which is usually performed at the factory and is not normally changed in the field.[49] High-level formatting then writes the file system structures into selected logical blocks to make the remaining logical blocks available to the host OS and its applications.[50] The operating system's file system uses some of the disk space to organize files on the disk, recording their file names and the sequence of disk areas that represent the file. Examples of data structures stored on disk to retrieve files include the MS-DOS file allocation table (FAT) and UNIX inodes, as well as other operating system data structures. As a consequence, not all the space on a hard drive is available for user files. This file system overhead is usually less than 1% on drives larger than 100 MB.
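As a rough illustration of the logical-block view described above, the short Python sketch below counts how many 512-byte and 4,096-byte (Advanced Format) logical blocks a nominal 1 TB drive exposes; the capacity figure is an example, not a specification of any particular drive.

capacity_bytes = 1_000_000_000_000        # a nominal "1 TB" drive (decimal bytes)

for block_size in (512, 4096):            # legacy and Advanced Format logical block sizes
    blocks = capacity_bytes // block_size
    print(f"{block_size}-byte blocks: {blocks:,}")
# 512-byte blocks: 1,953,125,000
# 4096-byte blocks: 244,140,625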

Redundancy
In modern HDDs, spare capacity for defect management is not included in the published capacity; however, in many early HDDs a certain number of sectors were reserved as spares, thereby reducing the capacity available to end users. In some systems, there may be hidden partitions used for system recovery that reduce the capacity available to the end user. For RAID subsystems, data integrity and fault-tolerance requirements also reduce the realized capacity. For example, a RAID 1 subsystem will have about half the total capacity as a result of data mirroring, while a RAID 5 subsystem with x drives loses 1/x of its capacity to parity. RAID subsystems are multiple drives that appear to the user as one or more drives but provide a great deal of fault tolerance. Most RAID vendors use some form of checksum to improve data integrity at the block level. For many vendors, this involves using HDDs with 520-byte sectors to contain 512 bytes of user data and 8 checksum bytes, or using separate 512-byte sectors for the checksum data.[51]
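A small Python sketch of the usable-capacity arithmetic described above; the drive count and per-drive size are illustrative values chosen for the example, not figures from the text.

def usable_capacity_tb(drives, drive_tb, level):
    if level == "RAID1":                  # mirroring: roughly half of the raw capacity
        return drives * drive_tb / 2
    if level == "RAID5":                  # one drive's worth of capacity goes to parity
        return (drives - 1) * drive_tb
    raise ValueError(f"unsupported level: {level}")

print(usable_capacity_tb(8, 2, "RAID1"))  # 8.0 TB usable out of 16 TB raw
print(usable_capacity_tb(8, 2, "RAID5"))  # 14 TB usable: 1/8 of the raw capacity lost to parity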

HDD parameters to calculate capacity


Because modern disk drives appear to their interface as a contiguous set of logical blocks, their gross capacity can be calculated by multiplying the number of blocks by the size of the block. This information is available from the manufacturer's specification and from the drive itself through the use of special utilities invoking low-level commands.[46][47] The gross capacity of older HDDs can be calculated by multiplying, for each zone of the drive, the number of cylinders by the number of heads by the number of sectors per track by the number of bytes per sector (most commonly 512), and then summing the totals for all zones. Some modern ATA drives will also report cylinder, head, sector (C/H/S) values to the CPU, but these are no longer actual physical parameters since the reported numbers are constrained by historic operating-system interfaces.

The old C/H/S scheme has been replaced by logical block addressing. In some cases, to try to "force-fit" the C/H/S scheme to large-capacity drives, the number of heads was given as 64, although no modern drive has anywhere near 32 platters.
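The Python sketch below works through both calculations; the LBA count and the zone geometry are hypothetical illustrations rather than figures for a real drive.

# Modern drive: gross capacity = number of logical blocks x logical block size.
logical_blocks = 1_953_525_168            # hypothetical LBA count reported by the drive
block_size = 512
print(logical_blocks * block_size)        # 1,000,204,886,016 bytes, i.e. about 1 TB

# Older zoned drive: sum over zones of cylinders x heads x sectors/track x bytes/sector.
zones = [
    # (cylinders, heads, sectors per track) - made-up geometry for illustration
    (500, 4, 60),
    (500, 4, 40),
]
print(sum(cyl * heads * spt * 512 for cyl, heads, spt in zones))   # 102,400,000 bytes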


Form factors
Mainframe and minicomputer hard disks were of widely varying dimensions, typically in free-standing cabinets the size of washing machines or designed to fit a 19-inch rack. In 1962, IBM introduced its model 1311 disk, which used 14-inch (nominal size) platters. This became a standard size for mainframe and minicomputer drives for many years,[52] but such large platters were never used with microprocessor-based systems. With increasing sales of microcomputers having built-in floppy-disk drives (FDDs), HDDs that would fit the FDD mountings became desirable. Thus hard disk drive form factors initially followed those of 8-inch, 5.25-inch, and 3.5-inch floppy disk drives. Because there were no smaller floppy disk drives, smaller hard disk drive form factors developed from product offerings or industry standards.

8-inch: 9.5 in × 4.624 in × 14.25 in (241.3 mm × 117.5 mm × 362 mm). In 1979, Shugart Associates' SA1000 was the first form-factor-compatible HDD, having the same dimensions and a compatible interface to the 8-inch FDD.

5.25-inch: 5.75 in × 3.25 in × 8 in (146.1 mm × 82.55 mm × 203 mm). This smaller form factor, first used in an HDD by Seagate in 1980,[53] was the same size as a full-height 5.25-inch-diameter (130 mm) FDD, 3.25 inches high. This is twice as high as "half height"; i.e., 1.63 in (41.4 mm). Most desktop models of drives for optical 120 mm disks (DVD, CD) use the half-height 5.25-inch dimension, but it fell out of fashion for HDDs. The Quantum Bigfoot HDD was the last to use it in the late 1990s, with "low-profile" (25 mm) and "ultra-low-profile" (20 mm) high versions.

5.25-inch full-height 110 MB HDD; 2.5-inch (8.5 mm) 6,495 MB HDD

2.5" SATA HDD from a Sony VAIO laptop

3.5-inch: 4 in × 1 in × 5.75 in (101.6 mm × 25.4 mm × 146 mm) = 376.77 cm³. This smaller form factor is similar to that used in an HDD by Rodime in 1983,[54] which was the same size as the "half height" 3.5-inch FDD, i.e., 1.63 inches high. Today, the 1-inch high ("slimline" or "low-profile") version of this form factor is the most popular form used in most desktops.

2.5-inch: 2.75 in × 0.275–0.59 in × 3.945 in (69.85 mm × 7–15 mm × 100 mm) = 48.895–104.775 cm³

This smaller form factor was introduced by PrairieTek in 1988;[55] there is no corresponding FDD. It came to be widely used for hard disk drives in mobile devices (laptops, music players, etc.) and for solid-state drives, by 2008 replacing some 3.5-inch enterprise-class drives.[56] It is also used in the PlayStation 3[57] and Xbox 360 video game consoles. Drives 9.5 mm high became an unofficial standard for all except the largest-capacity laptop drives (usually having two platters inside); 12.5 mm-high drives, typically with three platters, are used for maximum capacity, but will not fit most laptop computers. Enterprise-class drives can have a height up to 15 mm.[58] Seagate released a 7 mm drive aimed at entry-level laptops and high-end netbooks in December 2009.[59]


Six hard drives with 8-inch, 5.25-inch, 3.5-inch, 2.5-inch, 1.8-inch, and 1-inch disks, shown with a ruler to indicate the size of the platters and read-write heads.

1.8-inch: 54 mm × 8 mm × 71 mm = 30.672 cm³. This form factor, originally introduced by Integral Peripherals in 1993, has evolved into the ATA-7 LIF with dimensions as stated. For a time it was increasingly used in digital audio players and subnotebooks, but its popularity decreased. There is a variant for 25 GB sized HDDs that fit directly into a PC Card expansion slot. These became popular for use in iPods and other HDD-based MP3 players.

1-inch: 42.8 mm × 5 mm × 36.4 mm. This form factor was introduced in 1999 as IBM's Microdrive to fit inside a CF Type II slot. Samsung calls the same form factor a "1.3-inch" drive in its product literature.[60]

0.85-inch: 24 mm × 5 mm × 32 mm. Toshiba announced this form factor in January 2004[61] for use in mobile phones and similar applications, including SD/MMC-slot-compatible HDDs optimized for video storage on 4G handsets. Toshiba manufactured a 4 GB (MK4001MTD) and an 8 GB (MK8003MTD) version[62] and holds the Guinness World Record for the smallest hard disk drive.[63]

3.5-inch and 2.5-inch hard disks were the most popular sizes as of 2012. By 2009 all manufacturers had discontinued the development of new products for the 1.3-inch, 1-inch and 0.85-inch form factors due to falling prices of flash memory,[64][65] which has no moving parts. While these sizes are customarily described by an approximately correct figure in inches, actual sizes have long been specified in millimeters.

Current hard disk form factors


Form factor | Width (mm) | Height (mm) | Largest capacity | Platters (max) | Capacity per platter
3.5-inch | 102 | 19 or 25.4 | 4 TB[66][67][68][69] (2011) | 5 | 1,000 GB
2.5-inch | 69.9 | 5,[70] 7, 9.5,[71] 12.5, or 15 | 2 TB[72][73] (2012) | 4 | 500 GB
1.8-inch | 54 | 5 or 8 | 320 GB[74] (2009) | 2 | 160 GB


Obsolete hard disk form factors


Form factor | Width (mm) | Largest capacity | Platters (max) | Capacity per platter
5.25-inch FH | 146 | 47 GB[75] (1998) | 14 | 3.36 GB
5.25-inch HH | 146 | 19.3 GB[76] (1998) | 4 | 4.83 GB
1.3-inch | 43 | 40 GB[77] (2007) | 1 | 40 GB
1-inch (CFII/ZIF/IDE-Flex) | 42 | 20 GB[78] (2006) | 1 | 20 GB
0.85-inch | 24 | 8 GB[79][80] (2004) | 1 | 8 GB

Performance characteristics
See also: Solid-state_drive#Comparison_of_SSD_with_hard_disk_drives

Access time
The factors that limit the time to access the data on a hard disk drive (access time) are mostly related to the mechanical nature of the rotating disks and moving heads. Seek time is a measure of how long it takes the head assembly to travel to the track of the disk that contains data. Rotational latency is incurred because the desired disk sector may not be directly under the head when data transfer is requested. These two delays are on the order of milliseconds each. The bit rate or data transfer rate (once the head is in the right position) creates delay which is a function of the number of blocks transferred; it is typically relatively small, but can be quite long with the transfer of large contiguous files. Delay may also occur if the drive's disks are stopped to save energy; see Power management.

An HDD's average access time is its average seek time, which technically is the time to do all possible seeks divided by the number of all possible seeks, but in practice it is determined by statistical methods or simply approximated as the time of a seek over one-third of the number of tracks.[81]

Defragmentation is a procedure used to minimize delay in retrieving data by moving related items to physically proximate areas on the disk.[82] Some computer operating systems perform defragmentation automatically. Although automatic defragmentation is intended to reduce access delays, performance will be temporarily reduced while the procedure is in progress.[83]

Access time can be improved by increasing rotational speed (thus reducing latency) and/or by reducing the time spent seeking. Increasing areal density increases throughput by increasing the data rate and by increasing the amount of data under a set of heads, thereby potentially reducing seek activity for a given amount of data. Based on historic trends, analysts predict a future growth in HDD areal density (and therefore capacity) of about 40% per year.[84] Access times have not kept up with throughput increases, which themselves have not kept up with growth in storage capacity.

Interleave

Sector interleave is a mostly obsolete device characteristic related to access time, dating back to when computers were too slow to be able to read large continuous streams of data. Interleaving introduced gaps between data sectors to allow time for slow equipment to get ready to read the next block of data. Without interleaving, the next logical sector would arrive at the read/write head before the equipment was ready, requiring the system to wait for another complete disk revolution before reading could be performed. However, because interleaving introduces intentional physical delays into the drive mechanism, setting the interleave to a ratio higher than required causes unnecessary delays for equipment that has the performance needed to read sectors more quickly. The interleaving ratio was therefore usually chosen by the end-user to suit their particular computer system's performance capabilities when the drive was first installed in their system. Modern technology is capable of reading data as fast as it can be obtained from the spinning platters, so hard drives usually have a fixed sector interleave ratio of 1:1, which means effectively no interleaving is used.

Seek time

Average seek time ranges from 3 ms[85] for high-end server drives to 15 ms for mobile drives, with the most common mobile drives at about 12 ms[86] and the most common desktop type typically being around 9 ms. The first HDD[87] had an average seek time of about 600 ms, and by the middle 1970s HDDs were available with seek times of about 25 ms.[88] Some early PC drives used a stepper motor to move the heads, and as a result had seek times as slow as 80–120 ms, but this was quickly improved by voice coil type actuation in the 1980s, reducing seek times to around 20 ms. Seek time has continued to improve slowly over time.

Some desktop and laptop computer systems allow the user to make a tradeoff between seek performance and drive noise. Faster seek rates typically require more energy usage to quickly move the heads across the platter, causing loud noises from the pivot bearing and greater device vibrations as the heads are rapidly accelerated during the start of the seek motion and decelerated at the end of the seek motion. Quiet operation reduces movement speed and acceleration rates, but at a cost of reduced seek performance.

Rotational latency
Rotational speed [rpm] | Average latency [ms]
15,000 | 2
10,000 | 3
7,200 | 4.16
5,400 | 5.55
4,800 | 6.25

Latency is the delay for the rotation of the disk to bring the required disk sector under the read-write mechanism. It depends on the rotational speed of the disk, measured in revolutions per minute (rpm). Average rotational latency is shown in the table above, based on the statistical relation that the average latency in milliseconds for such a drive is one-half the rotational period.

Data transfer rate

As of 2010, a typical 7,200 rpm desktop hard drive has a sustained "disk-to-buffer" data transfer rate of up to 1,030 Mbit/s.[89] This rate depends on the track location; the rate is higher for data on the outer tracks (where there are more data sectors) and lower toward the inner tracks (where there are fewer data sectors), and is generally somewhat higher for 10,000 rpm drives. A current widely used standard for the "buffer-to-computer" interface is 3.0 Gbit/s SATA, which can send about 300 megabytes/s (10-bit encoding) from the buffer to the computer, and thus is still comfortably ahead of today's disk-to-buffer transfer rates. Data transfer rate (read/write) can be measured by writing a large file to disk using special file generator tools, then reading back the file. Transfer rate can be influenced by file system fragmentation and the layout of the files.[82]

HDD data transfer rate depends upon the rotational speed of the platters and the data recording density. Because heat and vibration limit rotational speed, advancing density becomes the main method to improve sequential transfer rates.[90] While areal density advances by increasing both the number of tracks across the disk and the number of sectors per track, only the latter increases the data transfer rate for a given rpm. Since data transfer rate performance only tracks one of the two components of areal density, its performance improves at a lower rate.
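The Python sketch below reproduces the latency figures in the table and then estimates a rough average access time for a single small read; the 9 ms seek value comes from the desktop figure quoted above, while the 150 MB/s sustained media rate is an assumed, illustrative number rather than a value from the text.

def avg_rotational_latency_ms(rpm):
    return 0.5 * 60_000 / rpm              # half of one rotational period, in milliseconds

for rpm in (15_000, 10_000, 7_200, 5_400, 4_800):
    print(rpm, round(avg_rotational_latency_ms(rpm), 2))   # 2.0, 3.0, 4.17, 5.56, 6.25

# Rough average access time for one 4 KiB read on a 7,200 rpm desktop drive:
seek_ms = 9.0                              # typical desktop average seek time (from the text)
latency_ms = avg_rotational_latency_ms(7_200)
transfer_ms = 4096 / 150e6 * 1000          # 4 KiB at an assumed 150 MB/s sustained media rate
print(seek_ms + latency_ms + transfer_ms)  # about 13.2 ms, dominated by the mechanical delays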


Power consumption
Power consumption has become increasingly important, not only in mobile devices such as laptops but also in the server and desktop markets. Increasing data center machine density has led to problems delivering sufficient power to devices (especially for spin-up), and getting rid of the waste heat subsequently produced, as well as environmental and electrical cost concerns (see green computing). Heat dissipation is tied directly to power consumption, and as drives age, disk failure rates increase at higher drive temperatures.[91] Similar issues exist for large companies with thousands of desktop PCs. Smaller form factor drives often use less power than larger drives. One interesting development in this area is actively controlling the seek speed so that the head arrives at its destination only just in time to read the sector, rather than arriving as quickly as possible and then having to wait for the sector to come around (i.e. the rotational latency).[92] Many of the hard drive companies are now producing Green Drives that require much less power and cooling. Many of these Green Drives spin slower (<5,400 rpm compared to 7,200, 10,000 or 15,000 rpm), thereby generating less heat. Power consumption can also be reduced by parking the drive heads when the disk is not in use (reducing friction), adjusting spin speeds,[93] and disabling internal components when not in use.[94]

Drives use more power, briefly, when starting up (spin-up). Although this has little direct effect on total energy consumption, the maximum power demanded from the power supply, and hence its required rating, can be reduced in systems with several drives by controlling when they spin up. On SCSI hard disk drives, the SCSI controller can directly control spin-up and spin-down of the drives. Some Parallel ATA (PATA) and Serial ATA (SATA) hard disk drives support power-up in standby (PUIS): each drive does not spin up until the controller or system BIOS issues a specific command to do so. This allows the system to be set up to stagger disk start-up and limit maximum power demand at switch-on. Some SATA II and later hard disk drives support staggered spin-up, allowing the computer to spin up the drives in sequence to reduce load on the power supply when booting.[95]

Power management

Most hard disk drives today support some form of power management which uses a number of specific power modes that save energy by reducing performance. When implemented, an HDD will change between a full-power mode and one or more power-saving modes as a function of drive usage. Recovery from the deepest mode, typically called Sleep, may take as long as several seconds.[96]

Audible noise
Measured in dBA, audible noise is significant for certain applications, such as DVRs, digital audio recording and quiet computers. Low-noise disks typically use fluid bearings, slower rotational speeds (usually 5,400rpm) and reduce the seek speed under load (AAM) to reduce audible clicks and crunching sounds. Drives in smaller form factors (e.g. 2.5inch) are often quieter than larger drives.


Shock resistance
Shock resistance is especially important for mobile devices. Some laptops now include active hard drive protection that parks the disk heads if the machine is dropped, hopefully before impact, to offer the greatest possible chance of survival in such an event. Maximum shock tolerance to date is 350 g for operating and 1,000 g for non-operating.[97]

Access and interfaces


Hard disk drives are accessed over one of a number of bus types, including as of 2011 parallel ATA (PATA, also called IDE or EIDE; described before the introduction of SATA as ATA), Serial ATA (SATA), SCSI, Serial Attached SCSI (SAS), and Fibre Channel. Bridge circuitry is sometimes used to connect hard disk drives to buses with which they cannot communicate natively, such as IEEE 1394, USB and SCSI.

For the now obsolete ST-506 interface, the data encoding scheme as written to the disk surface was also important. The first ST-506 disks used Modified Frequency Modulation (MFM) encoding, and transferred data at a rate of 5 megabits per second. Later controllers using 2,7 RLL (or just "RLL") encoding caused 50% more data to appear under the heads compared to one rotation of an MFM drive, increasing data storage and data transfer rate by 50%, to 7.5 megabits per second. Many ST-506 interface disk drives were only specified by the manufacturer to run at the 1/3 lower MFM data transfer rate compared to RLL, while other drive models (usually more expensive versions of the same drive) were specified to run at the higher RLL data transfer rate. In some cases, a drive in practice had sufficient margin to allow the MFM-specified model to run at the faster RLL data transfer rate, although not officially supporting this mode. Also, any RLL-certified drive could run on any MFM controller, but with 1/3 less data capacity and as much as 1/3 less data transfer rate compared to its RLL specifications.

Enhanced Small Disk Interface (ESDI) also supported multiple data rates (ESDI disks always used 2,7 RLL, but at 10, 15 or 20 megabits per second), but this was usually negotiated automatically by the disk drive and controller; most of the time, however, 15 or 20 megabit ESDI disk drives were not downward compatible (i.e. a 15 or 20 megabit disk drive would not run on a 10 megabit controller). ESDI disk drives typically also had jumpers to set the number of sectors per track and (in some cases) sector size.

Modern hard drives present a consistent interface to the rest of the computer, no matter what data encoding scheme is used internally. Typically a DSP in the electronics inside the hard drive takes the raw analog voltages from the read head and uses PRML and Reed–Solomon error correction[98] to decode the sector boundaries and sector data, then sends that data out the standard interface. That DSP also watches the error rate detected by error detection and correction, and performs bad sector remapping, data collection for Self-Monitoring, Analysis, and Reporting Technology, and other internal tasks.

SCSI originally had just one signaling frequency of 5 MHz for a maximum data rate of 5 megabytes/second over 8 parallel conductors, but later this was increased dramatically. The SCSI bus speed had no bearing on the disk's internal speed because of buffering between the SCSI bus and the disk drive's internal data bus; however, many early disk drives had very small buffers, and thus had to be reformatted to a different interleave (just like ST-506 disks) when used on slow computers, such as early Commodore Amiga, IBM PC compatibles and Apple Macintoshes.

Parallel ATA interfaces were designed to support two drives on each channel, connected as master and slave on a single cable. Disks typically had no problems with interleave or data rate, due to their controller design, but many early models were incompatible with each other and could not run with two devices on the same physical cable.
This was mostly remedied by the mid-1990s, when ATA's specification was standardized and the details began to be cleaned up, but still causes problems occasionally, especially with CD-ROM and DVD-ROM disks, and when mixing Ultra DMA and non-UDMA devices. Serial ATA supports one drive per channel and per cable, with its own set of I/O ports, avoiding master/slave problems.

FireWire/IEEE 1394 and USB (1.0/2.0/3.0) hard drives consist of enclosures generally containing ATA or Serial ATA disks with built-in adapters to these external buses.


Disk interface families used in personal computers


Historical bit serial interfaces connect a hard disk drive (HDD) to a hard disk controller (HDC) with two cables, one for control and one for data. (Each drive also has an additional cable for power, usually connecting it directly to the power supply unit). The HDC provided significant functions such as serial/parallel conversion, data separation, and track formatting, and required matching to the drive (after formatting) in order to assure reliability. Each control cable could serve two or more drives, while a dedicated (and smaller) data cable served each drive. ST506 used MFM (Modified Frequency Modulation) for the data encoding method.
Several Parallel ATA hard disk drives

ST412 was available in either MFM or RLL (Run Length Limited) encoding variants.

Enhanced Small Disk Interface (ESDI) was an industry-standard interface similar to ST412, supporting higher data rates between the processor and the disk drive.

Modern bit serial interfaces connect a hard disk drive to a host bus interface adapter (today typically integrated into the "south bridge") with one data/control cable. (As for the historical bit serial interfaces above, each drive also has an additional power cable, usually direct to the power supply unit.)

Fibre Channel (FC) is a successor to the parallel SCSI interface on the enterprise market. It is a serial protocol. In disk drives, the Fibre Channel Arbitrated Loop (FC-AL) connection topology is usually used. FC has much broader usage than mere disk interfaces, and it is the cornerstone of storage area networks (SANs). Recently other protocols for this field, like iSCSI and ATA over Ethernet, have been developed as well. Confusingly, drives usually use copper twisted-pair cables for Fibre Channel, not fibre optics. The latter are traditionally reserved for larger devices, such as servers or disk array controllers.

Serial ATA (SATA). The SATA data cable has one data pair for differential transmission of data to the device, and one pair for differential receiving from the device, just like EIA-422. That requires that data be transmitted serially. A similar differential signaling system is used in RS485, LocalTalk, USB, FireWire, and differential SCSI.

Serial Attached SCSI (SAS). SAS is a new-generation serial communication protocol for devices designed to allow for much higher speed data transfers and is compatible with SATA. SAS uses a mechanically identical data and power connector to standard 3.5-inch SATA1/SATA2 HDDs, and many server-oriented SAS RAID controllers are also capable of addressing SATA hard drives. SAS uses serial communication instead of the parallel method found in traditional SCSI devices but still uses SCSI commands.


Word serial interfaces connect a hard disk drive to a host bus adapter (today typically integrated into the "south bridge") with one cable for combined data/control. (As for all bit serial interfaces above, each drive also has an additional power cable, usually direct to the power supply unit.) The earliest versions of these interfaces typically had an 8-bit parallel data transfer to/from the drive, but 16-bit versions became much more common, and there are 32-bit versions. Modern variants have serial data transfer. The word nature of data transfer makes the design of a host bus adapter significantly simpler than that of the precursor HDD controller.

Inner view of a 1998 Seagate hard disk drive which used Parallel ATA interface

Integrated Drive Electronics (IDE), later standardized under the name AT Attachment, with the alias P-ATA or PATA (Parallel ATA) retroactively added upon introduction of the new variant Serial ATA. The original name reflected the integration of the controller with the hard drive itself. (That integration was not new with IDE, having been done a few years earlier with SCSI drives.) Moving the HDD controller from the interface card to the disk drive helped to standardize the host/controller interface, reduce the programming complexity in the host device driver, and reduce system cost and complexity. The 40-pin IDE/ATA connection transfers 16 bits of data at a time on the data cable. The data cable was originally 40-conductor, but later higher speed requirements for data transfer to and from the hard drive led to an "ultra DMA" mode, known as UDMA. Progressively swifter versions of this standard ultimately added the requirement for an 80-conductor variant of the same cable, where half of the conductors provide grounding necessary for enhanced high-speed signal quality by reducing cross talk. The interface for the 80-conductor cable has only 39 pins, the missing pin acting as a key to prevent incorrect insertion of the connector into an incompatible socket, a common cause of disk and controller damage.

EIDE was an unofficial update (by Western Digital) to the original IDE standard, with the key improvement being the use of direct memory access (DMA) to transfer data between the disk and the computer without the involvement of the CPU, an improvement later adopted by the official ATA standards. By directly transferring data between memory and disk, DMA eliminates the need for the CPU to copy the data byte by byte, therefore allowing it to process other tasks while the data transfer occurs.

Small Computer System Interface (SCSI), originally named SASI for Shugart Associates System Interface, was an early competitor of ESDI. SCSI disks were standard on servers, workstations, Commodore Amiga, and Apple Macintosh computers through the mid-1990s, by which time most models had been transitioned to IDE (and later, SATA) family disks. Only in 2005 did the capacity of SCSI disks fall behind IDE disk technology, though the highest-performance disks are still available in SCSI, SAS and Fibre Channel only. The range limitations of the data cable allow for external SCSI devices. Originally SCSI data cables used single-ended (common mode) data transmission, but server-class SCSI could use differential transmission, either low voltage differential (LVD) or high voltage differential (HVD). ("Low" and "high" voltages for differential SCSI are relative to SCSI standards and do not meet the meaning of low voltage and high voltage as used in general electrical engineering contexts, as apply e.g. to statutory electrical codes; both LVD and HVD use low voltage signals (3.3 V and 5 V respectively) in general terminology.)


Acronym or abbreviation | Meaning | Description
SASI | Shugart Associates System Interface | Historical predecessor to SCSI.
SCSI | Small Computer System Interface | Bus oriented; handles concurrent operations.
SAS | Serial Attached SCSI | Improvement of SCSI, uses serial communication instead of parallel.
ST-506 | Seagate Technology | Historical Seagate interface.
ST-412 | Seagate Technology | Historical Seagate interface (minor improvement over ST-506).
ESDI | Enhanced Small Disk Interface | Historical; backwards compatible with ST-412/506, but faster and more integrated.
ATA (PATA, Parallel Advanced Technology Attachment) | Advanced Technology Attachment | Successor to ST-412/506/ESDI by integrating the disk controller completely onto the device. Incapable of concurrent operations.
SATA | Serial ATA | Modification of ATA, uses serial communication instead of parallel.

Integrity
Due to the extremely close spacing between the heads and the disk surface, hard disk drives are vulnerable to being damaged by a head crash, a failure of the disk in which the head scrapes across the platter surface, often grinding away the thin magnetic film and causing data loss. Head crashes can be caused by electronic failure, a sudden power failure, physical shock, contamination of the drive's internal enclosure, wear and tear, corrosion, or poorly manufactured platters and heads.

The HDD's spindle system relies on air pressure inside the disk enclosure to support the heads at their proper flying height while the disk rotates. Hard disk drives require a certain range of air pressures in order to operate properly. The connection to the external environment and pressure occurs through a small hole in the enclosure (about 0.5 mm in breadth), usually with a filter on the inside (the breather filter).[99] If the air pressure is too low, then there is not enough lift for the flying head, so the head gets too close to the disk, and there is a risk of head crashes and data loss. Specially manufactured sealed and pressurized disks are needed for reliable high-altitude operation, above about 3,000 m (9,800 ft).[100] Modern disks include temperature sensors and adjust their operation to the operating environment. Breather holes can be seen on all disk drives; they usually have a sticker next to them, warning the user not to cover the holes. The air inside the operating drive is constantly moving too, being swept in motion by friction with the spinning platters. This air passes through an internal recirculation (or "recirc") filter to remove any leftover contaminants from manufacture, any particles or chemicals that may have somehow entered the enclosure, and any particles or outgassing generated internally in normal operation. Very high humidity for extended periods can corrode the heads and platters.

For giant magnetoresistive (GMR) heads in particular, a minor head crash from contamination (that does not remove the magnetic surface of the disk) still results in the head temporarily overheating, due to friction with the disk surface, and can render the data unreadable for a short period until the head temperature stabilizes (so-called "thermal asperity", a problem which can partially be dealt with by proper electronic filtering of the read signal).

Close-up of an HDD head resting on a disk platter; its mirror reflection is visible on the platter surface.


Modes of failure
Hard drives may fail in a number of ways. Failure may be immediate and total, progressive, or limited. Data may be totally destroyed, or partially or totally recoverable. Earlier drives tended to develop bad sectors with use and wear, which could be "mapped out" so that they did not affect operation; this was considered normal unless many bad sectors developed in a short period. Later drives map out bad sectors automatically and invisibly to the user; S.M.A.R.T. information logs these problems. A drive with bad sectors may usually continue to be used. Other failures, which may be either progressive or limited, are usually considered to be a reason to replace a drive; the value of data potentially at risk usually far outweighs the cost saved by continuing to use a drive which may be failing. Repeated but recoverable read or write errors, unusual noises, excessive and unusual heating, and other abnormalities are warning signs.

Head crash: a head may contact the rotating platter due to mechanical shock or other reason. At best this will cause irreversible damage and data loss where contact was made. In the worst case the debris scraped off the damaged area may contaminate all heads and platters, and destroy all data on all platters. If damage is initially only partial, continued rotation of the drive may extend the damage until it is total.[10]

Bad sectors: some magnetic sectors may become faulty without rendering the whole drive unusable. This may be a limited occurrence or a sign of imminent failure.

Stiction: after a time the head may not "take off" when started up as it tends to stick to the platter, a phenomenon known as stiction. This is usually due to unsuitable lubrication properties of the platter surface, a design or manufacturing defect rather than wear. This occasionally happened with some designs until the early 1990s.

Circuit failure: components of the electronic circuitry may fail, making the drive inoperable.

Bearing and motor failure: electric motors may fail or burn out, and bearings may wear enough to prevent proper operation.

Miscellaneous mechanical failures: parts, particularly moving parts, of any mechanism can break or fail, preventing normal operation, with possible further damage caused by fragments.

Recovery of data from failed drive


Data from a failed drive can sometimes be partially or totally recovered if the platters' magnetic coating is not totally destroyed. Specialised companies carry out data recovery, at significant cost, by opening the drives in a clean room and using appropriate equipment to read data from the platters directly. If the electronics have failed, it is sometimes possible to replace the electronics board, though drives of nominally exactly the same model manufactured at different times often have different, incompatible circuit boards. Sometimes operation can be restored for long enough to recover data. Risky techniques are justifiable if the drive is otherwise dead. A drive that starts up once may continue to run for a shorter or longer time but never start again, so as much data as possible is recovered as soon as the drive starts. A 1990s drive that does not start due to stiction can sometimes be started by tapping it or rotating the body of the drive rapidly by hand. Another technique that is sometimes known to work is to cool the drive, in a waterproof wrapping, in a domestic freezer. There is much useful information about this in blogs and forums,[101] but professionals also resort to this method with some success.[102]


Landing zones and load/unload technology


During normal operation, heads in HDDs fly above the data recorded on the disks. Modern HDDs prevent power interruptions or other malfunctions from landing their heads in the data zone by either physically moving (parking) the heads to a special landing zone on the platters that is not used for data storage, or by physically locking the heads in a suspended (unloaded) position raised off the platters. Some early PC HDDs did not park the heads automatically when power was prematurely disconnected and the heads would land on data. In some other early units the user would run a program to manually park the heads.

Landing zones

A landing zone is an area of the platter, usually near its inner diameter (ID), where no data is stored. This area is called the Contact Start/Stop (CSS) zone. Disks are designed such that either a spring or, more recently, rotational inertia in the platters is used to park the heads in the case of unexpected power loss. In this case, the spindle motor temporarily acts as a generator, providing power to the actuator. Spring tension from the head mounting constantly pushes the heads towards the platter. While the disk is spinning, the heads are supported by an air bearing and experience no physical contact or wear. In CSS drives the sliders carrying the head sensors (often also just called heads) are designed to survive a number of landings and takeoffs from the media surface, though wear and tear on these microscopic components eventually takes its toll. Most manufacturers design the sliders to survive 50,000 contact cycles before the chance of damage on startup rises above 50%. However, the decay rate is not linear: when a disk is younger and has had fewer start-stop cycles, it has a better chance of surviving the next startup than an older, higher-mileage disk (as the head literally drags along the disk's surface until the air bearing is established). For example, the Seagate Barracuda 7200.10 series of desktop hard disks are rated to 50,000 start-stop cycles; in other words, no failures attributed to the head-platter interface were seen before at least 50,000 start-stop cycles during testing.[103]

Around 1995 IBM pioneered a technology where a landing zone on the disk is made by a precision laser process (Laser Zone Texture = LZT) producing an array of smooth nanometer-scale "bumps" in a landing zone,[104] thus vastly improving stiction and wear performance. This technology is still largely in use today, predominantly in desktop and enterprise (3.5-inch) drives. In general, CSS technology can be prone to increased stiction (the tendency for the heads to stick to the platter surface), e.g. as a consequence of increased humidity. Excessive stiction can cause physical damage to the platter and slider or spindle motor.

Microphotograph of an older-generation hard disk drive head and slider (1990s).

Read/write head from circa-1998 Fujitsu 3.5" hard disk (approx. 2.0 mm x 3.0 mm)

Unloading

Load/unload technology relies on the heads being lifted off the platters into a safe location, thus eliminating the risks of wear and stiction altogether. The first HDD, the RAMAC, and most early disk drives used complex mechanisms to load and unload the heads. Modern HDDs use ramp loading, first introduced by Memorex in 1967,[105] to load/unload onto plastic "ramps" near the outer disk edge.

Addressing shock robustness, IBM also created a technology for their ThinkPad line of laptop computers called the Active Protection System. When a sudden, sharp movement is detected by the built-in accelerometer in the ThinkPad, internal hard disk heads automatically unload themselves to reduce the risk of any potential data loss or scratch defects. Apple later also utilized this technology in their PowerBook, iBook, MacBook Pro, and MacBook lines, known as the Sudden Motion Sensor. Sony,[106] HP with their HP 3D DriveGuard,[107] and Toshiba[108] have released similar technology in their notebook computers.


Metrics of failures
Most major hard disk and motherboard vendors now support S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology), which measures drive characteristics such as operating temperature, spin-up time, data error rates, etc. Certain trends and sudden changes in these parameters are thought to be associated with increased likelihood of drive failure and data loss. However, S.M.A.R.T. parameters alone may not be useful for predicting individual drive failures.[109] Unpredictable breakdown may occur at any time in normal use, with potential loss of all data. Recovery of some or even all data from a damaged drive is sometimes, but not always, possible, and is normally costly.

A 2007 study published by Google suggested very little correlation between failure rates and either high temperature or activity level; however, the correlation between manufacturer/model and failure rate was relatively strong. Statistics in this matter are kept highly secret by most entities. Google did not relate manufacturers' names with failure rates,[109] though they have since revealed that they use Hitachi Deskstar drives in some of their servers.[110] While several S.M.A.R.T. parameters have an impact on failure probability, a large fraction of failed drives do not produce predictive S.M.A.R.T. parameters.[109] The Google study indicated that "lower temperatures are associated with higher failure rates": hard drives with S.M.A.R.T.-reported average temperatures below 27 °C (81 °F) had higher failure rates than hard drives with the highest reported average temperature of 50 °C (122 °F), with failure rates at least twice as high as the optimum S.M.A.R.T.-reported temperature range of 36 °C (97 °F) to 47 °C (117 °F).[109]

SCSI, SAS, and FC drives are more expensive than consumer-grade PATA and SATA drives, and are usually used in servers and disk arrays, whereas PATA and SATA drives were sold to the home computer and desktop market and were perceived to be less reliable. This distinction is now becoming blurred. The mean time between failures (MTBF) of SATA drives is usually about 600,000 hours (some drives, such as the Western Digital Raptor, are rated at 1.4 million hours MTBF),[111] while SCSI drives are rated for upwards of 1.5 million hours. However, independent research indicates that MTBF is not a reliable estimate of a drive's longevity.[112] MTBF is conducted in laboratory environments in test chambers and is an important metric to determine the quality of a disk drive before it enters high-volume production. Once the drive product is in production, the more valid metric is annualized failure rate (AFR). AFR is the percentage of real-world drive failures after shipping.

Differences in reliability between drives with different interfaces are due to marketing and issues in the drive itself; the interface in itself is not a significant factor. However, expensive server-grade drives, where reliability (determined by construction) and performance (determined by interface) are more important than purchase price, are designed both for higher reliability and a faster interface. Consequently SCSI and SAS drives are designed for higher MTBF and reliability than consumer PATA and SATA drives. However, there are SATA drives designed and produced for enterprise markets, designed for reliability comparable to other enterprise-class drives.[113][114]

Typically, as of 2007 enterprise drives experienced annual failure rates of between 0.70% and 0.78% of the total installed drives.
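A Python sketch of the textbook relationship between MTBF and an annualized failure rate, assuming a constant failure rate (an exponential failure model). This relationship is a standard approximation rather than something stated in the study cited above; the MTBF figures are the ones quoted in the text.

import math

HOURS_PER_YEAR = 24 * 365

def annualized_failure_rate(mtbf_hours):
    # Under a constant failure rate, AFR = 1 - exp(-hours per year / MTBF).
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

for mtbf in (600_000, 1_400_000, 1_500_000):
    print(f"MTBF {mtbf:,} h -> AFR {annualized_failure_rate(mtbf) * 100:.2f}%")
# 600,000 h -> 1.45%;  1,400,000 h -> 0.62%;  1,500,000 h -> 0.58%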


External removable drives

Toshiba 1 TB 2.5" external USB 2.0 hard disk drive

3.0 TB 3.5" Seagate FreeAgent GoFlex plug and play external USB 3.0-compatible drive (left), 750 GB 3.5" Seagate Technology push-button external USB 2.0 drive (right), and a 500 GB 2.5" generic brand plug and play external USB 2.0 drive (front).

External removable hard disk drives[115] typically connect via USB. Plug-and-play drive functionality offers broad system compatibility, large storage options and a portable design. External hard disk drives are available in 2.5-inch and 3.5-inch sizes, and as of March 2012 their capacities generally range from 160 GB to 2 TB. Common sizes are 160 GB, 250 GB, 320 GB, 500 GB, 640 GB, 1 TB, and 2 TB.[116][117] External hard disk drives are available as preassembled integrated products, or may be assembled by combining an external enclosure (with a USB or other interface) with a separately purchased drive. Features such as biometric security or multiple interfaces are available at a higher cost.[118]


Market segments
Desktop HDDs typically store between 60 GB and 4 TB, rotate at 5,400 to 10,000 rpm, and have a media transfer rate of 0.5 Gbit/s or higher (1 GB = 10^9 bytes; 1 Gbit/s = 10^9 bit/s). As of September 2011, the highest-capacity consumer HDDs stored 4 TB.[67]

Mobile HDDs or laptop HDDs, smaller than their desktop and enterprise counterparts, tend to be slower and have lower capacity. Mobile HDDs spin at 4,200 rpm, 5,200 rpm, 5,400 rpm, or 7,200 rpm, with 5,400 rpm being typical. 7,200 rpm drives tend to be more expensive and have smaller capacities, while 4,200 rpm models usually have very high storage capacities. Because of their smaller platter(s), mobile HDDs generally have lower capacity than their desktop counterparts.

Enterprise HDDs are typically used with multiple-user computers running enterprise software. Examples are: transaction processing databases, internet infrastructure (email, webserver, e-commerce), scientific computing software, and nearline storage management software. Enterprise drives commonly operate continuously ("24/7") in demanding environments while delivering the highest possible performance without sacrificing reliability. Maximum capacity is not the primary goal, and as a result the drives are often offered in capacities that are relatively low in relation to their cost.[119] The fastest enterprise HDDs spin at 10,000 or 15,000 rpm, and can achieve sequential media transfer speeds above 1.6 Gbit/s[120] and a sustained transfer rate up to 1 Gbit/s.[120] Drives running at 10,000 or 15,000 rpm use smaller platters to mitigate increased power requirements (as they have less air drag) and therefore generally have lower capacity than the highest-capacity desktop drives. Enterprise HDDs are commonly connected through Serial Attached SCSI (SAS) or Fibre Channel (FC). Some support multiple ports, so they can be connected to a redundant host bus adapter. They can be reformatted with sector sizes larger than 512 bytes (often 520, 524, 528 or 536 bytes). The additional storage can be used by hardware RAID cards or to store a Data Integrity Field.

Consumer electronics HDDs include drives embedded into digital video recorders and automotive vehicles. The former are configured to provide a guaranteed streaming capacity, even in the face of read and write errors, while the latter are built to resist larger amounts of shock. The exponential increases in disk space and data access speeds of HDDs have enabled consumer products that require large storage capacities, such as digital video recorders and digital audio players.[121] In addition, the availability of vast amounts of cheap storage has made viable a variety of web-based services with extraordinary capacity requirements, such as free-of-charge web search, web archiving, and video sharing (Google, Internet Archive, YouTube, etc.).


Manufacturers and sales


More than 200 companies have manufactured hard disk drives over time, but recent consolidation has concentrated production among just three main manufacturers: Western Digital, Seagate, and Toshiba. Worldwide revenue for HDD shipments is expected to reach $38 billion in 2012, up about 19% from $32 billion in 2011. This corresponds to a 2012 unit shipment forecast of 673 million units, compared to 624 million units in 2011 and 654 million units in 2010 (the drop in 2011 was due to the impact of Thailand flooding on HDD production capacity in late 2011). The estimated 2012 market shares are about 40% each for Seagate and Western Digital and 15-20% for Toshiba.[122]

Icons

HDDs are traditionally symbolized as a stylized stack of platters or as a cylinder, and these symbols appear in diagrams or on indicator lights that signal hard drive access. In most modern operating systems, hard drives are represented by an illustration or photograph of the drive enclosure, as shown in the examples below.

Diagram of hard disk drive manufacturer consolidation

HDDs are commonly symbolized with a drive icon

RAID diagram icon symbolizing the array of disks

1970s vintage disk pack with the cover removed

Notes and references


[1] This is the original filing date of the application which led to US Patent 3,503,060, generally accepted as the definitive disk drive patent; see, Kean, David W., "IBM San Jose, A quarter century of innovation", 1977. [2] Further terms used to describe hard disk drives include disk file, direct access storage device (DASD), fixed disk, CKD disk, and Winchester Disk Drive (after the IBM 3340). The term DASD includes other devices besides disks. [3] Magnetic Storage Handbook 2nd Ed., Section 2.1.1, Disk File Technology, Mee and Daniel, (c)1990, [4] "IBM350 disk storage unit" (http:/ / www-03. ibm. com/ ibm/ history/ exhibits/ storage/ storage_350. html). IBM. . Retrieved 13 December 2011. [5] "Toshiba MK4009GAL: 1.8 40GB drive specifications" (http:/ / storage. toshiba. com/ main. aspx?Path=StorageSolutions/ 1. 8-inchHardDiskDrives/ MK4009GAL/ MK4009GALSpecifications). Storage.toshiba.com. . Retrieved 26 April 2012. [6] Phister, Jr., Montgomery (1979). Data Processing Technology and Economics, 2nd Ed.. Santa Monica Publishing Company. p.369. [7] Scotia, Nova. "Cost of Hard Drive storage Space" (http:/ / ns1758. ca/ winch/ winchest. html). . Retrieved 12 March 2011. [8] Historically a variety of run-length limited codes have been used in magnetic recording including for example, codes named FM, MFM and GCR which are no longer used in modern HDDs. [9] In storage engineering, the term spindle also refers to a single drive that can only handle one or a limited number of I/O operations making it a point of focus in scheduling operations to an array of drives.

[10] "Hard Drives" (http:/ / www. escotal. com/ harddrive. html). escotal.com. . Retrieved 16 July 2011. [11] "What is a "head-crash" & how can it result in permanent loss of my hard drive data?" (http:/ / www. data-master. com/ HeadCrash-explain-hard-disk-drive-fail_Q18. html). data-master.com. . Retrieved 16 July 2011. [12] "Hard Drive Help" (http:/ / www. hard-drive-help. com/ technology. html). hardrivehelp.com. . Retrieved 16 July 2011. [13] Elert, Glenn. "Thickness of a Piece of Paper" (http:/ / hypertextbook. com/ facts/ 2001/ JuliaSherlis. shtml). HyperTextbook.com. . Retrieved 9 July 2011. [14] CMOS-MagView (http:/ / www. matesy. de/ index. php?option=com_content& view=article& id=84& Itemid=80& lang=en) is an instrument that visualizes magnetic field structures and strengths. [15] Blount, Walker C. (November 2007). "Why 7,200 RPM Mobile Hard Disk Drives?" (http:/ / www. hitachigst. com/ tech/ techlib. nsf/ techdocs/ 33c6a76df9338b0a86256d34007054c0/ $file/ why7200mobilehdds. pdf). . Retrieved 17 July 2011. The source states 3,600 RPM drives were available in the past, but as of November 2007, 4,200 was the lowest available from Hitachi. [16] See History of IBM magnetic disk drives [17] www.pcguide.com/ref/hdd/op/spin_Speed.htm [18] Kanellos, Michael (24 August 2006). "A divide over the future of hard drives" (http:/ / www. zdnetasia. com/ a-divide-over-the-future-of-hard-drives-39393818. htm). CNETNews.com. . Retrieved 24 June 2010. [19] "IBM OEM MR Head | Technology | The era of giant magnetoresistive heads" (https:/ / www1. hitachigst. com/ hdd/ technolo/ gmr/ gmr. htm). Hitachigst.com. 27 August 2001. . Retrieved 4 September 2010. [20] Brian Hayes, Terabyte Territory (http:/ / www. americanscientist. org/ template/ AssetDetail/ assetid/ 14750), American Scientist, Vol 90 No 3 (MayJune 2002) p. 212 [21] "Press Releases December14, 2004" (http:/ / www. toshiba. co. jp/ about/ press/ 2004_12/ pr1401. htm). Toshiba. . Retrieved 13 March 2009. [22] "Seagate Momentus 2" HDDs per webpage January 2008" (http:/ / www. seagate. com/ www/ en-us/ products/ laptops/ momentus/ ). Seagate.com. 24 October 2008. . Retrieved 13 March 2009. [23] "Seagate Barracuda 3" HDDs per webpage January 2008" (http:/ / www. seagate. com/ www/ en-us/ products/ desktops/ barracuda_hard_drives/ ). Seagate.com. . Retrieved 13 March 2009. [24] "Western Digital Scorpio 2" and Greenpower 3" HDDs per quarterly conference, July 2007" (http:/ / www. wdc. com/ en/ company/ investor/ q108remarks. asp). Wdc.com. . Retrieved 13 March 2009. [25] Error Correcting Code (http:/ / pcguide. com/ ref/ hdd/ geom/ errorECC-c. html), The PC Guide [26] "Iterative Detection Read Channel Technology in Hard Disk Drives" (http:/ / www. hitachigst. com/ tech/ techlib. nsf/ techdocs/ FB376A33027F5A5F86257509001463AE/ $file/ IDRC_WP_final. pdf), Hitachi [27] "Will Toshiba's Bit-Patterned Drives Change the HDD Landscape?" (http:/ / www. pcmag. com/ article2/ 0,2817,2368023,00. asp). PC Magazine. 19 August 2010. . Retrieved 21 August 2010. [28] "Xyratex no-go for bit-patterned media" (http:/ / www. theregister. co. uk/ 2010/ 05/ 24/ xyratex_hamr/ ). The Register. 24 April 2010. . Retrieved 21 August 2010. [29] "Report: TDK Technology "More Than Doubles" Capacity Of HDDs" (http:/ / techcrunch. com/ 2011/ 10/ 04/ report-tdk-technology-more-than-doubles-capacity-of-hdds/ ?utm_source=feedburner& utm_medium=feed& utm_campaign=Feed:+ Techcrunch+ (TechCrunch)). . Retrieved 4 October 2011. 
[30] "Drive displays a smaller capacity than the indicated size on the drive label" (http:/ / wdc. custhelp. com/ app/ answers/ detail/ a_id/ 615/ session/ L2F2LzIvc25vLzEvdGltZS8xMzAyMjgxMDM1L3NpZC9NRjNaWFpxaw==). Wdc.custhelp.com. 29 March 2012. . Retrieved 26 April 2012. [31] i.e. see HGST (http:/ / www. hitachigst. com/ tech/ techlib. nsf/ techdocs/ B259B4A73296DA628625751600058A80/ $file/ ProductBrochureMarch2009. pdf), Samsung (http:/ / www. samsung. com/ global/ business/ hdd/ faqView. do?b2b_bbs_msg_id=167), Seagate (http:/ / www. seagate. com/ docs/ pdf/ whitepaper/ storage_solutions_guide. pdf), Toshiba (http:/ / sdd. toshiba. com/ techdocs/ MKxx33GSG_MK1235GSL_r1. pdf) and Western Digital (http:/ / www. wdc. com/ en/ library/ 2579-001028. pdf) websites [32] "650 RAMAC announcement" (http:/ / www-03. ibm. com/ ibm/ history/ exhibits/ 650/ 650_pr2. html). .IBM Press Release, 13 September 1956, announcing the IBM 305 RAMAC. The first disk drive invented by IBM in 1956 was described by IBM using the descriptive noun million, as in "...a built-in 5-million-digit disk memory" [33] Mulvany, R.B., "Engineering Design of a Disk Storage Facility with Data Modules." IBM JRD, November 1974 [34] Introduction to IBM Direct Access Storage Devices, M. Bohl, IBM publication SR20-4738 1981. [35] CDC Product Line Card (http:/ / www. bitsavers. org/ pdf/ cdc/ discs/ brochures/ ProductLine_Oct74. pdf), October 1974 [36] G. M. Amdahl et. al. (1964). "Architecture of the IBM System/360". IBM JRD 8 (2). "Capacity 8-bit Bytes 1K = 1,024" [37] IBM System/360 Operating System: Storage Estimates, 12th ed. (http:/ / www. bitsavers. org/ pdf/ ibm/ 360/ os/ R20. 0_Jan71/ GC28-6551-12_OS_StorageEstimates_R20_Jan71. pdf), IBM Corp., January 1971, p.21, GC28-6551-12, , "Model 40 with 64K bytes of storage and storage protection" This document contains approximately 468 usages of K meaning 1,024 [38] PDP11/05/10/30/40 Processor Handbook (http:/ / www. bitsavers. org/ pdf/ dec/ pdp11/ handbooks/ PDP1105-40_Handbook_1973. pdf), Digital Equipment Corp., 1973, pp.11, , "... direct addressing of 32K 16-bit words or 64K 8-bit bytes (K = 1,024)" This document contains approximately 80 instances of K meaning 1,024. [39] Topher Kessler. "Snow Leopard changes how file and drive sizes are calculated" (http:/ / reviews. cnet. com/ 8301-13727_7-10330509-263. html). cnet Reviews. . "Altering this convention to agree with English could have been done at any time; however, for some reason it just

stuck this way for a lot of the computing industry". [40] "Western Digital Settles Hard-Drive Capacity Lawsuit, Associated Press 28 June 2006 retrieved 25 November 2010" (http:/ / www. foxnews. com/ story/ 0,2933,201269,00. html). Fox News. 22 March 2001. . Retrieved 26 April 2012. [41] Published on 26 October 2007 by Phil Cogar (26 October 2007). "Seagate lawsuit concludes, settlement announced" (http:/ / www. bit-tech. net/ news/ bits/ 2007/ 10/ 26/ seagate_lawsuit_concludes_settlement_announced/ 1). Bit-tech.net. . Retrieved 26 April 2012. [42] "Western Digital Notice of Class Action Settlement email" (http:/ / www. xtremesystems. org/ forums/ showthread. php?t=93512). Xtremesystems.org. . Retrieved 26 April 2012. [43] National Institute of Standards and Technology. "Prefixes for binary multiples" (http:/ / physics. nist. gov/ cuu/ Units/ binary. html). . "In December 1998 the International Electrotechnical Commission (IEC) [...] approved as an IEC International Standard names and symbols for prefixes for binary multiples for use in the fields of data processing and data transmission." [44] Upgrading and Repairing PCs, Scott Mueller, Pg. 596, ISBN 0-7897-2974-1 [45] The silicon web: physics for the Internet age, Michael G. Raymer, Pg. 40, ISBN 978-1-4398-0311-0

[46] The LBAs on a logical unit shall begin with zero and shall be contiguous up to the last logical block on the logical unit. Information technology Serial Attached SCSI 2 (SAS-2), INCITS 457 Draft 2, 8 May 2009, chapter 4.1 Direct-access block device type model overview
[47] ISO/IEC 791D:1994, AT Attachment Interface for Disk Drives (ATA-1), section 7.1.2 [48] "Western Digital's Advanced Format: The 4K Sector Transition Begins" (http:/ / www. anandtech. com/ show/ 2888). Anandtech.com. . Retrieved 26 April 2012. [49] See: Low-Level Formatting (http:/ / www. pcguide. com/ ref/ hdd/ geom/ formatLow-c. html). However, some enterprise SAS drives have other block sizes such as 520, 524 and 528 bytes which can be changed in the field. [50] "High-Level Formatting" (http:/ / www. pcguide. com/ ref/ hdd/ geom/ formatHigh-c. html). Pcguide.com. 17 April 2001. . Retrieved 26 April 2012. [51] "How to Measure Storage Efficiency Part II Taxes" (http:/ / blogs. netapp. com/ efficiency/ 2009/ 08/ measuring-storage-efficiency-part-ii-taxes. html). Blogs.netapp.com. 14 August 2009. . Retrieved 26 April 2012. [52] Emerson W. Pugh, Lyle R. Johnson, John H. Palmer IBM's 360 and early 370 systems MIT Press, 1991 ISBN 0-262-16123-0, page 266 [53] Christensen, Clayton M. (1997). The Innovator's Dilemma. New York, New York: HarperBusiness. p.252. ISBN0-06-662069-4. [54] "Winchester has 3.5-inch diameter". Electronics: 184. March 1983. [55] Chandler, Doug (26 September 1988). "Startup Ships 2.5-Inch Hard Disk Aimed for Portables, Laptops". PC Week: 6. [56] Schmid, Patrick and Achim Roos (8 May 2010). "3.5" Vs. 2.5" SAS HDDs: In Storage, Size Matters" (http:/ / www. tomshardware. co. uk/ enterprise-storage-sas-hdd,review-31891. html). Tomshardware.com. . Retrieved 25 June 2010. [57] "Playstation 3 Slim Teardown" (http:/ / www. ifixit. com/ Teardown/ PlayStation-3-Slim/ 1121/ 1). 25 August 2009. . Retrieved 15 November 2010. [58] Schmid, Patrick and Achim Roos (22 May 2010). "9.5 Versus 12.5 mm: Which Notebook HDD Is Right For You?" (http:/ / www. tomshardware. com/ reviews/ 2. 5-inch-12. 5-mm-9. 5-mm,2623. html). Tomshardware.com. . Retrieved 22 June 2010. [59] "Seagate Unveils World's Thinnest 2.5-Inch Hard Drive For Slim Laptop Computers" (http:/ / www. physorg. com/ news180118264. html). physorg.com. 15 December 2009. . Retrieved 15 December 2009. [60] 1.3 HDD Product Specification (http:/ / www. samsung. com/ global/ system/ business/ hdd/ prdmodel/ 2008/ 1/ 25/ 2469101. 3_Inch_Spec_PATA_rev. 2. 3. pdf), Samsung, 2008 [61] Toshiba's 0.85-inch HDD is set to bring multi-gigabyte capacities to small, powerful digital products (http:/ / www. toshiba. co. jp/ about/ press/ 2004_01/ pr0801. htm), Toshiba press release, 8 January 2004 [62] (http:/ / www3. toshiba. co. jp/ storage/ english/ spec/ hdd/ mk4001. htm) [63] Toshiba enters Guinness World Records Book with the world's smallest hard disk drive (http:/ / www. toshiba. co. jp/ about/ press/ 2004_03/ pr1601. htm), Toshiba press release, 16 March 2004 [64] Flash price fall shakes HDD market (http:/ / www. eetasia. com/ ART_8800474064_499486_NT_3335be30. HTM), EETimes Asia, 1 August 2007. [65] In 2008 Samsung (http:/ / www. samsung. com/ global/ business/ hdd/ newsView. do?b2b_bbs_msg_id=143) introduced the 1.3-inch SpinPoint A1 HDD but by March 2009 the family was listed as End Of Life Products (http:/ / www. samsung. com/ global/ business/ hdd/ products/ Product_EOLProducts. html) and new 1.3-inch models were not available in this size. [66] Branded external only. [67] "Seagates 4TB GoFlex Breaks the Hard Drive Capacity Ceiling" (http:/ / techjost. com/ 2011/ 09/ 07/ seagates-4tb-goflex-breaks-the-hard-drive-capacity-ceiling/ ). 7 September 2011. . Retrieved 10 December 2011. 
[68] "Seagate FreeAgent GoFlex 4TB Desk External Drive Review" (http:/ / www. legitreviews. com/ article/ 1704/ ). Legitreviews.com. 12 September 2011. . Retrieved 26 April 2012. [69] 750 GB for IDE-based barebone, 3 TB for SATA-based barebone. [70] (http:/ / www. engadget. com/ 2012/ 09/ 10/ western-digital-builds-5mm-thick-hybrid-hard-drive/ ) [71] Most common. [72] Hard disk drive for laptop (SATA only, 2 TB, only Western Digital, Green series). (http:/ / www. wdc. com/ en/ products/ products. aspx?id=830)

[73] 320 GB for IDE-based barebone. [74] "Toshiba Storage Solutions MK3233GSG" (http:/ / www. toshiba. co. jp/ about/ press/ 2009_11/ pr0501. htm). . [75] Seagate Elite 47, shipped 12/97 per 1998 Disk/Trend Report Rigid Disk Drives [76] Quantum Bigfoot TS, shipped 10/98 per 1999 Disk/Trend Report Rigid Disk Drives [77] The Quantum Bigfoot TS used a maximum of 3 platters, other earlier and lower capacity product used up to 4 platters in a 5.25 HH form factor, e.g. Microscience HH1090 circa 1989. [78] "SDK Starts Shipments of 1.3-Inch PMR-Technology-Based HD Media" (http:/ / www. sdk. co. jp/ aa/ english/ news/ 2008/ aanw_08_0812. html). Sdk.co.jp. 10 January 2008. . Retrieved 13 March 2009. [79] "Proving that 8 GB, 0.85 inch hard disk drive exists" (http:/ / digitaljournal. com/ article/ 117340). Digitaljournal.com. 17 February 2007. . Retrieved 26 April 2012. [80] "Toshiba Enters Guinness World Records Book with the World's Smallest Hard Disk Drive" (http:/ / www. toshiba. co. jp/ about/ press/ 2004_03/ pr1601. htm). Toshiba Corp.. 16 March 2004. . Retrieved 11 September 2012. [81] "Western Digital definition of ''average access time''" (http:/ / www. wdc. com/ en/ company/ glossaryofterms/ wdglossarycontent. asp). Wdc.com. 1 July 2006. . Retrieved 26 April 2012. [82] Kearns, Dave (18 April 2001). "How to defrag" (http:/ / www. itworld. com/ NWW01041100636262). ITWorld. . [83] Broida, Rick (10 April 2009). "Turning Off Disk Defragmenter May Solve a Sluggish PC" (http:/ / www. pcworld. com/ article/ 162955/ turning_off_disk_defragmenter_may_solve_a_sluggish_pc. html). PCWorld. . [84] "Seagate Outlines the Future of Storage:: Articles:: www.hardwarezone.com" (http:/ / www. hardwarezone. com. ph/ articles/ view. php?cid=1& id=1805& pg=2). hardwarezone.com. 27 January 2006. . Retrieved 13 March 2009. [85] "WD VelicoRaptor: Drive Specifications" (http:/ / www. wdc. com/ en/ products/ products. aspx?id=20). Western Digital. June 2010. . Retrieved 15 January 2011. [86] "WD Scorpio Blue Mobile: Drive Specifications" (http:/ / www. wdc. com/ en/ products/ products. aspx?id=140). Western Digital. June 2010. . Retrieved 15 January 2011. [87] http:/ / www-03. ibm. com/ ibm/ history/ exhibits/ storage/ storage_350. html [88] http:/ / www-03. ibm. com/ ibm/ history/ exhibits/ storage/ storage_3350. html [89] "Speed Considerations" (http:/ / www. seagate. com/ www/ en-us/ support/ before_you_buy/ speed_considerations), Seagate website. Retrieved 22 January 2011. [90] Higher speeds require more power absorbed by the electric engine, which hence warms up more; high speeds also amplify vibrations due to baricenter of disk not being exactly in the center of the disk itself. [91] Artamonov, Oleg (6 December 2007). "Hard Disk Drive Power Consumption Measurements: X-bits Methodology" (http:/ / www. xbitlabs. com/ articles/ storage/ display/ hdd-power-cons. html). Xbit Laboratories. . [92] e.g. Western Digital's Intelliseek (http:/ / www. wdc. com/ en/ flash/ index. asp?family=intelliseek) [93] "Hitachi Unveils Energy-Efficient Hard Drive with Variable Spindle Speed" (http:/ / www. xbitlabs. com/ news/ storage/ display/ 20071022123416. html). Xbitlabs.com. 22 October 2007. . Retrieved 26 April 2012. [94] Webber, Lawrence; Wallace, Michael (2009). Green tech: how to plan and implement sustainable IT solutions (http:/ / books. google. com/ books?id=BKTALNq5ceAC& lpg=PA62& dq=green disk drive& pg=PA62#v=onepage& q=green disk drive& f=false). p.62. ISBN0-8144-1446-X. . [95] Trusted Reviews (31 August 2005). 
"Hitachi Deskstar 7K500 500GB HDD: As fast as it's big?" (http:/ / www. theregister. co. uk/ 2005/ 08/ 31/ review_hitachi_7k500/ ). . [96] "Adaptive Power Management for Mobile Hard Drives" (http:/ / www. almaden. ibm. com/ almaden/ mobile_hard_drives. html#2). Almaden.ibm.com. . Retrieved 26 April 2012. [97] Momentus 5400.5 SATA 3Gb/s 320-GB Hard Drive (http:/ / www. seagate. com/ ww/ v/ index. jsp?vgnextoid=5fb658a3fd20a110VgnVCM100000f5ee0a0aRCRD) [98] "Reed Solomon Codes Introduction" (http:/ / gaussianwaves. blogspot. com/ 2008/ 06/ reed-solomon-codes-introduction. html) [99] "Micro House PC Hardware Library Volume I: Hard Drives, Scott Mueler, Macmillan Computer Publishing" (http:/ / www. alasir. com/ books/ hards/ 022-024. html). Alasir.com. . Retrieved 26 April 2012. [100] Waea.org (http:/ / www. waea. org/ committees/ technology/ other_tech_documents/ harddisk. pdf), Ruggedized Disk Drives for Commercial Airborne Computer Systems [101] "Detailed description of drive that worked for 20 minutes after freezing" (http:/ / geeksaresexy. blogspot. com/ 2006/ 01/ freeze-your-hard-drive-to-recover-data. html). Geeksaresexy.blogspot.com. 19 January 2006. . Retrieved 26 April 2012. [102] "Failing Hard Drives and the Freezer Technique Revisited" (http:/ / www. dtidata. com/ resourcecenter/ 2011/ 03/ 18/ failing-hard-drives-and-the-freezer-technique-revisited/ ). DtiData. 18 March 2011. . Retrieved 26 April 2012. [103] "Barracuda 7200.10 Serial ATA Product Manual" (http:/ / www. seagate. com/ support/ disc/ manuals/ sata/ 100402371a. pdf) (PDF). . Retrieved 26 April 2012. [104] IEEE.org (http:/ / ieeexplore. ieee. org/ xpls/ abs_all. jsp?arnumber=490199), Baumgart, P.; Krajnovich, D.J.; Nguyen, T.A.; Tam, A.G.; IEEE Trans. Magn. [105] Pugh et al.; "IBM's 360 and Early 370 Systems"; MIT Press, 1991, pp.270 [106] "Sony | For Business | VAIO SMB" (http:/ / b2b. sony. com/ Solutions/ lpage. do?page=/ vaio_smb/ index. html& name=VAIO SMB). B2b.sony.com. . Retrieved 13 March 2009.

[107] "HP.com" (http:/ / www. hp. com/ sbso/ solutions/ pc_expertise/ professional_innovations/ hp-3d-drive-guard. pdf) (PDF). . Retrieved 26 April 2012. [108] "Toshiba HDD Protection measures." (http:/ / eu. computers. toshiba-europe. com/ Contents/ Toshiba_teg/ EU/ WORKSHOP/ files/ EXP-2005-04-HDD-Protection-EN. pdf) (PDF). . Retrieved 26 April 2012. [109] Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz Andr Barroso (February 2007). [research.google.com/archive/disk_failures.pdf "Failure Trends in a Large Disk Drive Population"]. USENIX Conference on File and Storage Technologies. 5th USENIX Conference on File and Storage Technologies (FAST 2007) (http:/ / www. usenix. org/ event/ fast07/ ). research.google.com/archive/disk_failures.pdf. Retrieved 15 September 2008. [110] Shankland, Stephen (1 April 2009). "CNet.com" (http:/ / news. cnet. com/ 8301-1001_3-10209580-92. html). News.cnet.com. . Retrieved 26 April 2012. [111] "WD VelociRaptor Drive Specification Sheet (PDF)" (http:/ / www. wdc. com/ wdproducts/ library/ SpecSheet/ ENG/ 2879-701284. pdf) (PDF). . Retrieved 26 April 2012. [112] "Everything You Know About Disks Is Wrong" (http:/ / storagemojo. com/ ?p=383). StorageMojo. 20 February 2007. . Retrieved 29 August 2007. [113] "Differences between an Enterprise-Class HDD and a Desktop-Class HDD" (http:/ / www. synology. com/ wiki/ index. php/ Differences_between_an_Enterprise-Class_HDD_and_a_Desktop-Class_HDD). Synology.com. 4 September 2008. . Retrieved 13 March 2009. [114] "Intel Whitepaper on Enterprise-class versus Desktop-class Hard Drives" (http:/ / download. intel. com/ support/ motherboards/ server/ sb/ enterprise_class_versus_desktop_class_hard_drives_. pdf) (PDF). . Retrieved 26 April 2012. [115] These differ from removable disk media, e.g., disk packs, data modules, in that they include, e.g., actuators, drive elctronics, motors. [116] Graham, Darien (10 December 2010). "Pocket drive battle: 10 high speed external hard disks rated PC & Tech Authority" (http:/ / www. pcauthority. com. au/ GroupTests/ 241328,pocket-drive-battle-10-high-speed-external-hard-disks-rated. aspx). Pcauthority.com.au. . Retrieved 26 April 2012. [117] "External Hard Drives | Buying guide for external hard drives, including portable and desktop devices " (http:/ / www. helpwithpcs. com/ buy-guides/ external-hard-drives. html). Helpwithpcs.com. . Retrieved 26 April 2012. [118] "Back Up Your Important Data to External Hard disk drive | Biometric Safe | Info and Products Reviews about Biometric Security Device " (http:/ / biometricsecurityproducts. org/ biometric-safe/ back-up-your-important-data-to-external-hard-disk-drive. html). Biometricsecurityproducts.org. 26 July 2011. . Retrieved 26 April 2012. [119] "Enterprise Hard Drives" (http:/ / www. articlesbase. com/ data-recovery-articles/ enterprise-hard-drives-825563. html). Articlesbase.com. . Retrieved 26 April 2012. [120] Seagate Cheetah 15K.5 (http:/ / www. seagate. com/ docs/ pdf/ datasheet/ disc/ ds_cheetah_15k_5. pdf) [121] Walter, Chip (25 July 2005). "Kryder's Law" (http:/ / www. sciam. com/ article. cfm?articleID=000B0C22-0805-12D8-BDFD83414B7F0000& ref=sciam& chanID=sa006). Scientific American (Verlagsgruppe Georg von Holtzbrinck GmbH). . Retrieved 29 October 2006. [122] Hard Disk Drive Capital Equipment and Technology Report, 2012, Coughlin Associates (http:/ / www. tomcoughlin. com/ Techpapers/ 2012 Capital Equipment Report Brochure 021112. pdf)


Further reading
Mueller, Scott (2011). Upgrading and Repairing PCs (20th ed.). Que. ISBN 0-7897-4710-3.
Messmer, Hans-Peter (2001). The Indispensable PC Hardware Book (4th ed.). Addison-Wesley. ISBN 0-201-59616-4.

External links
Computer History Museum's HDD Working Group Website (http://chmhdd.wetpaint.com/)
HDD Tracks and Zones (http://hddscan.com/doc/HDD_Tracks_and_Zones.html)
HDD from inside (http://hddscan.com/doc/HDD_from_inside.html)
Hard Disk Drives Encyclopedia (http://www.smarthdd.com/en/help.htm)
Hard Disk Drive Technology and Utility Tutorials (http://www.hardstoragetech.com)
Video showing an opened HD working (http://www.engineerguy.com/videos/video-harddrive.htm)
Average seek time of a computer disk (http://faculty.plattsburgh.edu/jan.plaza/teaching/papers/seektime.html)


Disk-drive performance characteristics


Disk-drive performance characteristics are the attributes which control the time it takes to transfer (read or write) data between a computer and a data storage device (most typically disk storage), starting with the initial command from the computer or host and ending when the storage device completes the command. Devices with faster performance characteristics deliver higher overall performance.[1][2] These devices include those with rotating media, hereafter called rotating drives, i.e., hard-disk drives (HDD), floppy disk drives (FDD), and optical discs (DVD-RW / CD-RW), as well as devices without moving parts such as solid-state drives (SSD). For SSDs, most of the attributes related to the movement of mechanical components do not apply, but each attribute is still affected by electrically based elements that cause a measurable delay when it is isolated and measured.[3] These performance characteristics can be grouped into two categories: access time and data transfer time (or rate).[4]

Access time
The access time or response time of a rotating drive is a measure of the time it takes before the drive can actually transfer data. The factors that control this time on a rotating drive are mostly related to the mechanical nature of the rotating disks and moving heads. It is composed of a few independently measurable elements that are added together to get a single value when evaluating the performance of a storage device. The access time can vary significantly, so it is typically provided by manufacturers or measured in benchmarks as an average.[4][5] For SSDs this time is not dependent on moving parts but rather on electrical connections to solid-state memory, so the access time is very quick and consistent.[6] Most testing and benchmark applications do not draw a distinction between rotating drives and SSDs, so both go through the same measurement process.

A hard disk head on an access arm resting on a hard disk platter.

The key components that are typically added together to obtain the access time are:[2][7]
Seek time
Rotational latency
Other (command processing time and settle time)
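As a rough, illustrative sketch (the millisecond figures below are hypothetical but typical of a 7,200 rpm desktop drive, not taken from the cited sources), the access time is simply the sum of these components:

# Hypothetical, illustrative figures for a 7,200 rpm desktop drive.
seek_time_ms        = 9.0    # average seek time
rotational_lat_ms   = 4.17   # average rotational latency at 7,200 rpm (see below)
command_overhead_ms = 0.003  # command processing time
settle_time_ms      = 0.1    # usually already included in the seek specification

access_time_ms = seek_time_ms + rotational_lat_ms + command_overhead_ms + settle_time_ms
print(round(access_time_ms, 2))   # -> 13.27 ms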

Seek time
With rotating drives, the seek time measures the time it takes the head assembly on the actuator arm to travel to the track of the disk where the data will be read or written.[7] The data on the media is stored in sectors which are arranged in parallel circular tracks (concentric or spiral depending upon the device type) and there is an actuator with an arm that suspends a head that can transfer data with that media. When the drive needs to read or write a certain sector it determines in which track the sector is located. It then uses the actuator to move the head to that particular track. If the initial location of the head was the desired track then the seek time would be zero. If the initial track was the outermost edge of the media and the desired track was at the innermost edge then the seek time would be the maximum for that drive.[8][9] Seek times are not linear compared with the seek distance traveled because of factors of acceleration and deceleration of the actuator arm.[10] A rotating drive's average seek time is the average of all possible seek times which technically is the time to do all possible seeks divided by the number of all possible seeks, but in practice it is determined by statistical methods or

simply approximated as the time of a seek over one-third of the number of tracks.[8][7][11] Average seek time ranges from 3 ms[12] for high-end server drives, to 15 ms for mobile drives, with the most common mobile drives at about 12 ms[13] and the most common desktop drives typically being around 9 ms. The first HDD[14] had an average seek time of about 600 ms, and by the middle 1970s, HDDs were available with seek times of about 25 ms.[15] Some early PC drives used a stepper motor to move the heads, and as a result had seek times as slow as 80-120 ms, but this was quickly improved by voice coil type actuation in the 1980s, reducing seek times to around 20 ms. Seek time has continued to improve slowly over time.

The other two less commonly referenced seek measurements are track-to-track and full stroke. The track-to-track measurement is the time required to move from one track to an adjacent track.[7] This is the shortest (fastest) possible seek time. In HDDs this is typically between 0.2 and 0.8 ms.[6] The full stroke measurement is the time required to move from the outermost track to the innermost track. This is the longest (slowest) possible seek time.[8] With SSDs there are no moving parts, so a measurement of the seek time is only testing electronic circuits preparing a particular location on the memory in the storage device. Typical SSDs will have a seek time between 0.08 and 0.16 ms.[6]

Short stroking
Short stroking is a term used in enterprise storage environments to describe an HDD that is purposely restricted in total capacity so that the actuator only has to move the heads across a smaller number of total tracks. This limits the maximum distance the heads can be from any point on the drive, thereby reducing its average seek time, but also restricts the total capacity of the drive. The reduced seek time enables the HDD to increase the number of IOPS available from the drive. The cost and power per usable byte of storage rise as the maximum track range is reduced, but the increase in IOPS per dollar is better.[16]

Effect of audible noise and vibration control
Measured in dBA, audible noise is significant for certain applications, such as DVRs, digital audio recording and quiet computers. Low-noise disks typically use fluid bearings, slower rotational speeds (usually 5,400 rpm) and reduce the seek speed under load (AAM) to reduce audible clicks and crunching sounds. Drives in smaller form factors (e.g. 2.5-inch) are often quieter than larger drives.[17] Some desktop- and laptop-class disk drives allow the user to make a trade-off between seek performance and drive noise. For example, Seagate offers a set of features in some drives called Sound Barrier Technology that include some user- or system-controlled noise and vibration reduction capability. Faster seek times typically require more energy usage to quickly move the heads across the platter, causing loud noises from the pivot bearing and greater device vibrations as the heads are rapidly accelerated during the start of the seek motion and decelerated at the end of the seek motion. Quiet operation reduces movement speed and acceleration rates, but at a cost of reduced seek performance.[18]
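Relating to the seek-time and short-stroking discussion above, here is a minimal sketch, assuming a uniformly random workload and hypothetical millisecond figures, of why the average seek distance works out to about one-third of the full stroke and of how short stroking raises random IOPS:

import random

# Average seek distance between two uniformly random tracks tends toward n/3.
n_tracks = 100_000
samples = [abs(random.randrange(n_tracks) - random.randrange(n_tracks)) for _ in range(100_000)]
print(sum(samples) / len(samples) / n_tracks)   # ~0.333 of the full stroke

# Short stroking: rough random-I/O IOPS estimate (hypothetical figures).
def iops(avg_seek_ms, rotational_latency_ms):
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)

full_capacity = iops(avg_seek_ms=9.0, rotational_latency_ms=4.17)  # whole platter in use
short_stroked = iops(avg_seek_ms=4.0, rotational_latency_ms=4.17)  # heads confined to outer tracks
print(round(full_capacity), round(short_stroked))   # ~76 vs ~122 IOPS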


Rotational latency


Typical HDD figures


HDD spindle speed [rpm] | Average rotational latency [ms]
4,200                   | 7.14
5,400                   | 5.56
7,200                   | 4.17
10,000                  | 3.00
15,000                  | 2.00

Rotational latency (sometimes called rotational delay or just latency) is the delay waiting for the rotation of the disk to bring the required disk sector under the read-write head.[19] It depends on the rotational speed of a disk (or spindle motor), measured in revolutions per minute (RPM).[7][20] For most magnetic media-based drives, the average rotational latency is typically based on the empirical relation that the average latency in milliseconds for such a drive is one-half the rotational period. Maximum rotational latency is the time it takes to do a full rotation excluding any spin-up time (as the relevant part of the disk may have just passed the head when the request arrived).[21] Therefore the rotational latency and resulting access time can be improved (decreased) by increasing the rotational speed of the disks.[7] This also has the benefit of improving (increasing) the throughput (discussed later in this article).
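The figures in the table above follow directly from the one-half-rotation rule described here; a short sketch reproducing them:

def avg_rotational_latency_ms(rpm):
    # Average latency is half of one full rotation; one rotation takes 60,000/rpm ms.
    return 0.5 * 60_000 / rpm

for rpm in (4_200, 5_400, 7_200, 10_000, 15_000):
    print(rpm, round(avg_rotational_latency_ms(rpm), 2))
# 4200 7.14, 5400 5.56, 7200 4.17, 10000 3.0, 15000 2.0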

Comparison of several forms of disk storage showing tracks (not-to-scale); green denotes start and red denotes end. * Some CD-R(W) and DVD-R(W)/DVD+R(W) recorders operate in ZCLV, CAA or CAV modes.

For more details on track layout, see Disk storage.

The spindle motor speed can use one of two types of disk rotation methods: 1) constant linear velocity (CLV), used mainly in optical storage, varies the rotational speed of the optical disc depending upon the position of the head, and 2) constant angular velocity (CAV), used in HDDs, standard FDDs, a few optical disc systems, and vinyl audio records, spins the media at one constant speed regardless of where the head is positioned.

Another wrinkle occurs depending on whether surface bit densities are constant. Usually, with a CAV spin rate, the densities are not constant, so that the long outside tracks have the same number of bits as the shorter inside tracks. When the bit density is constant, outside tracks have more bits than inside tracks, and this is generally combined with a CLV spin rate. In both these schemes contiguous bit transfer rates are constant. This is not the case with other schemes such as using constant bit density with a CAV spin rate.

Effect of reduced power consumption
Power consumption has become increasingly important, not only in mobile devices such as laptops but also in server and desktop markets. Increasing data center machine density has led to problems delivering sufficient power to devices (especially for spin-up), and getting rid of the waste heat subsequently produced, as well as environmental and electrical cost concerns (see green computing). Most hard disk drives today support some form of power management which uses a number of specific power modes that save energy by reducing performance. When implemented, an HDD will change from a full power mode to one or more power saving modes as a function of drive usage. Recovery from the deepest mode, typically called Sleep, where the drive is stopped or spun down, may take as long as several seconds to be fully operational, thereby increasing the resulting latency.[22] Drive manufacturers are also now producing green drives that include some additional features that do reduce power, but that can adversely affect latency, including slower spindle speeds and parking heads off the media to reduce friction.[23]


Other
The command processing time or command overhead is the time it takes for the drive electronics to set up the necessary communication between the various components in the device so it can read or write the data. This is in the range of 0.003 ms. With a value this low, most benchmarks tend to ignore it.[2][24] The settle time measures the time it takes the heads to settle on the target track and stop vibrating so that they do not read or write off track. This amount is usually very small (typically less than 0.1 ms) or already included in the seek time specifications from the drive manufacturer.[25] In a benchmark test the settle time would be included in the seek time.

Data transfer rate


The data transfer rate of a drive (also called throughput) covers both the internal rate (moving data between the disk surface and the controller on the drive) and the external rate (moving data between the controller on the drive and the host system). The measurable data transfer rate will be the lower (slower) of the two rates. The sustained data transfer rate or sustained throughput of a drive will be the slower of the sustained internal and sustained external rates. The sustained rate is less than or equal to the maximum or burst rate because it does not have the benefit of any cache or buffer memory in the drive. The internal rate is further broken down into media rate, head switch time, and cylinder switch time. These are not applicable to SSDs.[7][26]

Media rate: the speed at which the drive can read bits from the surface of the media.
Head switch time: the time required to electrically switch from one head to another; it only applies to multi-head drives and is about 1 to 2 ms.[27]
Cylinder switch time: the time required to move to an adjacent track; the name cylinder is used because typically all the tracks of a drive with more than one head or data surface are read before moving the actuator. This time is typically about twice the track-to-track seek time, or about 2 to 3 ms.[28]

Data transfer rate (read/write) can be measured by writing a large file to disk using special file generator tools, then reading back the file. As of 2010, a typical 7,200 rpm desktop HDD has a sustained "disk-to-buffer" data transfer rate up to 1030 Mbit/s.[29] This rate depends on the track location, so it will be higher for data on the outer tracks (where there are more data sectors) and lower toward the inner tracks (where there are fewer data sectors); and is generally somewhat higher for 10,000 rpm drives. A current widely used standard for the "buffer-to-computer" interface is 3.0 Gbit/s SATA, which can send about 300 megabytes/s (10-bit encoding) from the buffer to the computer, and thus is still comfortably ahead of today's disk-to-buffer transfer rates. SSDs do not have the same internal limits as HDDs, so their internal and external transfer rates are often maximizing the capabilities of the drive-to-host interface.
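A small sketch of the slower-of-the-two-rates rule using the representative figures above (a 1030 Mbit/s disk-to-buffer rate and a 3.0 Gbit/s SATA link with 10-bit encoding); the drive figure is only illustrative:

# Internal (disk-to-buffer) rate of a typical 7,200 rpm desktop drive, per the text.
internal_mbit_s = 1030

# External (buffer-to-computer) rate of a 3.0 Gbit/s SATA link: 8b/10b encoding
# means 10 transmitted bits carry 8 data bits, giving about 300 MB/s of payload.
sata_gbit_s = 3.0
external_mbyte_s = sata_gbit_s * 1e9 * 8 / 10 / 8 / 1e6   # -> 300.0 MB/s of payload
external_mbit_s = external_mbyte_s * 8                    # -> 2400 Mbit/s

# The measurable (sustained) transfer rate is the slower of the two.
sustained_mbit_s = min(internal_mbit_s, external_mbit_s)
print(external_mbyte_s, sustained_mbit_s)   # 300.0 MB/s link, 1030 Mbit/s sustained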


Effect of file fragmentation


Transfer rate can be influenced by file system fragmentation and the layout of the files. Defragmentation is a procedure used to minimize delay in retrieving data by moving related items to physically proximate areas on the disk.[30] Some computer operating systems perform defragmentation automatically. Although automatic defragmentation is intended to reduce access delays, the procedure can slow response when performed while the computer is in use.[31] In stark contrast to HDDs, flash memory-based SSDs do not need defragmentation. Because recording information on flash memory wears it out over time, unnecessary writes to an SSD should be avoided. Since the data is accessed differently (solid-state electronics rather than physical sectors on a disk), defragmentation is neither necessary nor desirable.[32]

Effect of areal density


HDD data transfer rate depends upon the rotational speed of the disks and the data recording density. Because heat and vibration limit rotational speed, advancing density has become the main method of improving sequential transfer rates.[33] Areal density advances by increasing both the number of tracks across the disk and the number of sectors per track; only the latter increases the data transfer rate for a given rpm. Data transfer rate performance therefore correlates with areal density only through a track's linear surface bit density (sectors per track). Simply increasing the number of tracks on a disk can affect seek times but not gross transfer rates. Based on historic trends, analysts predict a future growth in HDD areal density (and therefore capacity) of about 40% per year.[34] Seek times have not kept up with throughput increases, which themselves have not kept up with growth in storage capacity.
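A rough sketch, using hypothetical sector counts, of why linear density (sectors per track) rather than track count determines the sequential media rate:

def media_rate_mbyte_s(sectors_per_track, bytes_per_sector, rpm):
    # On one revolution the head passes every sector on the track once.
    revolutions_per_second = rpm / 60.0
    return sectors_per_track * bytes_per_sector * revolutions_per_second / 1e6

# Same RPM, same sector size: doubling sectors per track doubles the media rate,
# while adding more tracks would only add capacity, not sequential speed.
print(media_rate_mbyte_s(1000, 512, 7200))   # ~61.4 MB/s
print(media_rate_mbyte_s(2000, 512, 7200))   # ~122.9 MB/s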

References
[1] "Hard Disk (Hard Drive) Performance transfer rates, latency and seek times" (http:/ / www. pctechguide. com/ hard-disk-hard-drive-performance-transfer-rates-latency-and-seek-times). pctechguide.com. . Retrieved 2011-07-01. [2] "Red Hat Documentation: Hard Drive Performance Characteristics" (http:/ / docs. redhat. com/ docs/ en-US/ Red_Hat_Enterprise_Linux/ 4/ html/ Introduction_To_System_Administration/ s1-storage-perf. html). redhat.com. . Retrieved 2011-07-01. [3] Lee, Yu Hsuan (2008-12). "To Defrag or Not to DefragThat Is the Question for SSD" (http:/ / rtcmagazine. com/ articles/ view/ 101053). rtcmagazine.com. . Retrieved 2011-07-01. [4] Kozierok, Charles (2001-04-17). "Access Time" (http:/ / pcguide. com/ ref/ hdd/ perf/ perf/ spec/ posAccess-c. html). pcguide.com. . Retrieved 2012-04-04. [5] "Getting the hang of IOPS" (http:/ / www. symantec. com/ connect/ articles/ getting-hang-iops). 2011-04-25. . Retrieved 2011-07-03. [6] "Understanding Solid State Drives (part two performance)" (http:/ / h41112. www4. hp. com/ promo/ blades-community/ eur/ en/ library/ weekly_comment/ 081027_HP-SSD-part2-v2_clean. pdf). HP. 2008-10-27. . Retrieved 2011-07-06. [7] "Hard Drive Data Recovery Glossary" (http:/ / www. newyorkdatarecovery. com/ hard-drive-glossary. html). New York Data Recovery. . Retrieved 2011-07-14. [8] Kozierok, Charles (2001-04-17). "Seek Time" (http:/ / pcguide. com/ ref/ hdd/ perf/ perf/ spec/ posSeek-c. html). pcguide.com. . Retrieved 2012-04-04. [9] Kozierok, Charles (2001-04-17). "Hard Disk Tracks, Cylinders and Sectors" (http:/ / pcguide. com/ ref/ hdd/ geom/ tracks. htm). pcguide.com. . Retrieved 2012-04-04. [10] John Wilkes (1994-03). "An introduction to disk drive modeling" (http:/ / www. cs. uh. edu/ ~paris/ 7360/ PAPERS03/ IEEEComputer. DiskModel. pdf). Hewlett-Packard Laboratories. . Retrieved 2011-08-02. [11] "Definition of Average Seek time" (http:/ / www. lintech. org/ comp-per/ 10HDDISK. pdf). . Retrieved 2011-07-06. [12] "WD VelicoRaptor: Drive Specifications" (http:/ / www. wdc. com/ en/ products/ products. aspx?id=20). Western Digital. 2010-06. . Retrieved 2011-01-15. [13] "WD Scorpio Blue Mobile: Drive Specifications" (http:/ / www. wdc. com/ en/ products/ products. aspx?id=140). Western Digital. 2010-06. . Retrieved 2011-01-15. [14] "IBM Archives IBM 350 disk storage unit" (http:/ / www-03. ibm. com/ ibm/ history/ exhibits/ storage/ storage_350. html). IBM. . Retrieved 2011-07-04. [15] "IBM Archives IBM 3350 direct access storage" (http:/ / www-03. ibm. com/ ibm/ history/ exhibits/ storage/ storage_3350. html). IBM. . Retrieved 2011-07-04.


[16] Schmid, Patrick; Roos, Achim (2009-03-05). "Accelerate Your Hard Drive By Short Stroking" (http:/ / www. tomshardware. com/ reviews/ short-stroking-hdd,2157. html). tomshardware.com. . Retrieved 2011-07-05. [17] Kozierok, Charles (2001-04-17). "Noise and Vibration" (http:/ / pcguide. com/ ref/ hdd/ perf/ qual/ issuesNoise-c. html). pcguide.com. . Retrieved 2012-04-04. [18] "Seagate's Sound Barrier Technology" (http:/ / www. seagate. com/ docs/ pdf/ whitepaper/ sound_barrier. pdf). 2000-11. . Retrieved 2011-07-06. [19] In the 1950s and 1960s magnetic data storage devices used a drum instead of flat discs. [20] In some early PCs the internal bus was slower than the drive data rate so sectors would be missed resulting the loss of an entire revolution. To prevent this sectors were interleaved to slow the effective data rate preventing missed sectors. This is no longer a problem for current PCs and storage devices. [21] Lowe, Scott (2010-02-12). "Calculate IOPS in a storage array" (http:/ / www. techrepublic. com/ blog/ datacenter/ calculate-iops-in-a-storage-array/ 2182). techrepublic.com. . Retrieved 2011-07-03. [22] "Adaptive Power Management for Mobile Hard Drives" (http:/ / www. almaden. ibm. com/ almaden/ mobile_hard_drives. html#2). IBM. . Retrieved 2011-07-06. [23] "Momentus 5400.5 SATA 3Gb/s 320-GB Hard Drive" (http:/ / www. seagate. com/ ww/ v/ index. jsp?vgnextoid=5fb658a3fd20a110VgnVCM100000f5ee0a0aRCRD). . Retrieved 2011-07-06. [24] Kozierok, Charles (2001-04-17). "Command Overhead Time" (http:/ / pcguide. com/ ref/ hdd/ perf/ perf/ spec/ posOverhead-c. html). pcguide.com. . Retrieved 2012-04-04. [25] Kozierok, Charles (2001-04-17). "Settle Time" (http:/ / pcguide. com/ ref/ hdd/ perf/ perf/ spec/ posSettle-c. html). pcguide.com. . Retrieved 2012-04-04. [26] Kozierok, Charles (2001-04-17). "Transfer Performance Specifications" (http:/ / pcguide. com/ ref/ hdd/ perf/ perf/ spec/ trans. htm). pcguide.com. . Retrieved 2012-04-04. [27] Kozierok, Charles (2001-04-17). "Head switch Time" (http:/ / pcguide. com/ ref/ hdd/ perf/ perf/ spec/ transHeadSwitch-c. html). pcguide.com. . Retrieved 2012-04-04. [28] Kozierok, Charles (2001-04-17). "Cylinder switch Time" (http:/ / pcguide. com/ ref/ hdd/ perf/ perf/ spec/ transCylinderSwitch-c. html). pcguide.com. . Retrieved 2012-04-04. [29] "Speed Considerations" (http:/ / www. seagate. com/ www/ en-us/ support/ before_you_buy/ speed_considerations). Seagate. . Retrieved 2011-01-22. [30] Kearns, Dave (2001-04-18). "How to defrag" (http:/ / www. itworld. com/ NWW01041100636262). ITWorld. . Retrieved 2011-07-03. [31] Broida, Rick (2009-04-10). "Turning Off Disk Defragmenter May Solve a Sluggish PC" (http:/ / www. pcworld. com/ article/ 162955/ turning_off_disk_defragmenter_may_solve_a_sluggish_pc. html). PCWorld. . Retrieved 2011-07-03. [32] "Sustaining SSD Performance" (http:/ / www. supertalent. com/ datasheets/ TRIM White Paper. pdf). 2010. . Retrieved 2011-07-06. [33] Kozierok, Charles (2001-04-17). "Areal Density" (http:/ / pcguide. com/ ref/ hdd/ perf/ perf/ spec/ postransAreal-c. html). pcguide.com. . Retrieved 2012-04-04. [34] "Seagate Outlines the Future of Storage :: Articles :: www.hardwarezone.com" (http:/ / www. hardwarezone. com. ph/ articles/ view. php?cid=1& id=1805& pg=2). www.hardwarezone.com. 2006-01-27. . Retrieved 2009-03-13.


RAID
RAID
RAID (redundant array of independent disks, originally redundant array of inexpensive disks[1][2]) is a storage technology that combines multiple disk drive components into a logical unit. Data is distributed across the drives in one of several ways called "RAID levels", depending on what level of redundancy and performance (via parallel communication) is required.

In October 1986, the IBM S/38 announced "checksum", a software-only implementation of RAID-5 in the operating system that carried a minimum of 10% overhead. The S/38 "scatter loaded" all data for performance; the downside was that the loss of any single disk required a total system restore for all disks. Under checksum, when a disk failed, the system halted and was then shut down. Under maintenance, the bad disk was replaced and a parity-bit disk recovery was run. The system was restarted using a recovery procedure similar to the one run after a power failure. While still difficult, recovery from a drive failure was much shorter and easier than without checksum.

RAID is an example of storage virtualization and was first defined by David Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987.[3] Marketers representing industry RAID manufacturers later attempted to reinvent the term to describe a redundant array of independent disks as a means of disassociating a low-cost expectation from RAID technology.[4]

RAID is now used as an umbrella term for computer data storage schemes that can divide and replicate data among multiple physical drives. The physical drives are said to be "in a RAID", though the more common, incorrect parlance is to say that they are "in a RAID array".[5] The array can then be accessed by the operating system as one single drive. The different schemes or architectures are named by the word RAID followed by a number (e.g., RAID 0, RAID 1). Each scheme provides a different balance between three key goals: resiliency, performance, and capacity.

Standard levels
A number of standard schemes have evolved which are referred to as levels. There were five RAID levels originally conceived, but many more variations have evolved, notably several nested levels and many non-standard levels (mostly proprietary). RAID levels and their associated data formats are standardised by the Storage Networking Industry Association (SNIA) in the Common RAID Disk Drive Format (DDF) standard.[6] Following is a brief textual summary of the most commonly used RAID levels.[7]

RAID 0 (block-level striping without parity or mirroring) has no (or zero) redundancy. It provides improved performance and additional storage but no fault tolerance. Hence simple stripe sets are normally referred to as RAID 0. Any drive failure destroys the array, and the likelihood of failure increases with more drives in the array (at a minimum, catastrophic data loss is almost twice as likely compared to single drives without RAID). A single drive failure destroys the entire array because when data is written to a RAID 0 volume, the data is broken into fragments called blocks. The number of blocks is dictated by the stripe size, which is a configuration parameter of the array. The blocks are written to their respective drives simultaneously on the same sector. This allows smaller sections of the entire chunk of data to be read off each drive in parallel, increasing bandwidth. RAID 0 does not implement error checking, so any error is uncorrectable. More drives in the array means higher bandwidth, but greater risk of data loss.

In RAID 1 (mirroring without parity or striping), data is written identically to two drives, thereby producing a "mirrored set"; the read request is serviced by either of the two drives containing the requested data, whichever

one involves least seek time plus rotational latency. Similarly, a write request updates the strips of both drives. The write performance depends on the slower of the two writes (i.e., the one that involves larger seek time and rotational latency); at least two drives are required to constitute such an array. While more constituent drives may be employed, many implementations deal with a maximum of only two; of course, it might be possible to use such a limited level 1 RAID itself as a constituent of a level 1 RAID, effectively masking the limitation. The array continues to operate as long as at least one drive is functioning. With appropriate operating system support, there can be increased read performance, and only a minimal write performance reduction; implementing RAID 1 with a separate controller for each drive in order to perform simultaneous reads (and writes) is sometimes called "multiplexing" (or "duplexing" when there are only two drives).

In RAID 10 (mirroring and striping), data is written in stripes across the primary disks and then mirrored to the secondary disks. A typical RAID 10 configuration consists of four drives: two for striping and two for mirroring. A RAID 10 configuration takes the best concepts of RAID 0 and RAID 1 and combines them to provide better performance along with the reliability of parity without actually having parity as with RAID 5 and RAID 6. RAID 10 is often referred to as RAID 1+0 (mirrored+striped).

In RAID 2 (bit-level striping with dedicated Hamming-code parity), all disk spindle rotation is synchronized, and data is striped such that each sequential bit is on a different drive. Hamming-code parity is calculated across corresponding bits and stored on at least one parity drive.

In RAID 3 (byte-level striping with dedicated parity), all disk spindle rotation is synchronized, and data is striped so each sequential byte is on a different drive. Parity is calculated across corresponding bytes and stored on a dedicated parity drive.

RAID 4 (block-level striping with dedicated parity) is identical to RAID 5 (see below), but confines all parity data to a single drive. In this setup, files may be distributed between multiple drives. Each drive operates independently, allowing I/O requests to be performed in parallel. However, the use of a dedicated parity drive could create a performance bottleneck; because the parity data must be written to a single, dedicated parity drive for each block of non-parity data, the overall write performance may depend a great deal on the performance of this parity drive.

RAID 5 (block-level striping with distributed parity) distributes parity along with the data and requires all drives but one to be present to operate; the array is not destroyed by a single drive failure. Upon drive failure, any subsequent reads can be calculated from the distributed parity such that the drive failure is masked from the end user. However, a single drive failure results in reduced performance of the entire array until the failed drive has been replaced and the associated data rebuilt. Additionally, there is the potentially disastrous RAID 5 write hole. RAID 5 requires at least three disks.

RAID 6 (block-level striping with double distributed parity) provides fault tolerance of two drive failures; the array continues to operate with up to two failed drives. This makes larger RAID groups more practical, especially for high-availability systems. This becomes increasingly important as large-capacity drives lengthen the time needed to recover from the failure of a single drive. Single-parity RAID levels are as vulnerable to data loss as a RAID 0 array until the failed drive is replaced and its data rebuilt; the larger the drive, the longer the rebuild takes. Double parity gives additional time to rebuild the array without the data being at risk if a single additional drive fails before the rebuild is complete. Like RAID 5, a single drive failure results in reduced performance of the entire array until the failed drive has been replaced and the associated data rebuilt.

The following table provides an overview of the most important parameters of standard RAID levels. In each case: Array space efficiency is given as an expression in terms of the number of drives, n; this expression designates a value between 0 and 1, representing the fraction of the sum of the drives' capacities that is available for use. For example, if three drives are arranged in RAID 3, this gives an array space efficiency of 1 - 1/3 = 2/3 (approximately 66%); thus, if each drive in this example has a capacity of 250 GB, then the array has a total capacity of 750 GB but the capacity that is usable for data storage is only


500 GB. Array failure rate is given as an expression in terms of the number of drives, n, and the drive failure rate, r (which is assumed to be identical and independent for each drive). For example, if each of three drives has a failure rate of 5% over the next 3 years, and these drives are arranged in RAID 3, then this gives an array failure rate of n(n-1)r^2 = 3 x 2 x 0.05^2 = 1.5% over the next 3 years (this example is worked through in the sketch following the table).
Level  | Description                                             | Minimum # of drives** | Space efficiency      | Fault tolerance | Array failure rate*** | Read benefit | Write benefit
RAID 0 | Block-level striping without parity or mirroring.      | 2 | 1                     | 0 (none)        | 1 - (1 - r)^n       | nX        | nX
RAID 1 | Mirroring without parity or striping.                   | 2 | 1/n                   | n - 1 drives    | r^n                 | nX        | 1X
RAID 2 | Bit-level striping with dedicated Hamming-code parity.  | 3 | 1 - (1/n) log2(n - 1) | RAID 2 can recover from 1 drive failure or repair corrupt data or parity when a corrupted bit's corresponding data and parity are good. | variable | variable | variable
RAID 3 | Byte-level striping with dedicated parity.               | 3 | 1 - 1/n               | 1 drive         | n(n - 1)r^2         | (n - 1)X  | (n - 1)X*
RAID 4 | Block-level striping with dedicated parity.              | 3 | 1 - 1/n               | 1 drive         | n(n - 1)r^2         | (n - 1)X  | (n - 1)X*
RAID 5 | Block-level striping with distributed parity.            | 3 | 1 - 1/n               | 1 drive         | n(n - 1)r^2         | (n - 1)X* | (n - 1)X*
RAID 6 | Block-level striping with double distributed parity.     | 4 | 1 - 2/n               | 2 drives        | n(n - 1)(n - 2)r^3  | (n - 2)X* | (n - 2)X*

* Assumes hardware is fast enough to support; ** Assumes a nondegenerate minimum number of drives; *** Assumes independent, identical rate of failure amongst drives
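A small sketch that evaluates the table's space-efficiency and array-failure-rate expressions for the three-drive RAID 3 example discussed above (n = 3 drives, r = 5% per-drive failure rate, 250 GB drives):

n = 3        # number of drives
r = 0.05     # assumed failure rate of each drive over the period considered
drive_gb = 250

# RAID 3 space efficiency: 1 - 1/n
efficiency = 1 - 1 / n
usable_gb = efficiency * n * drive_gb
print(round(efficiency, 2), round(usable_gb))     # 0.67, 500 GB usable of 750 GB raw

# RAID 3 array failure rate per the table: n(n-1)r^2
failure_rate = n * (n - 1) * r ** 2
print(round(failure_rate, 4))                     # 0.015, i.e. 1.5%

# RAID 0 and RAID 1 for comparison: 1-(1-r)^n and r^n
print(round(1 - (1 - r) ** n, 4), round(r ** n, 6))   # 0.1426 (RAID 0), 0.000125 (RAID 1)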

Nested (hybrid) RAID


In what was originally termed hybrid RAID,[8] many storage controllers allow RAID levels to be nested. The elements of a RAID may be either individual drives or RAIDs themselves. However, if a RAID is itself an element of a larger RAID, it is unusual for its elements to be themselves RAIDs. As there is no basic RAID level numbered larger than 9, nested RAIDs are usually clearly described by attaching the numbers indicating the RAID levels, sometimes with a "+" in between. The order of the digits in a nested RAID designation is the order in which the nested array is built: for a RAID 1+0, drives are first combined into multiple level 1 RAIDs that are themselves treated as single drives to be combined into a single RAID 0; the reverse structure is also possible (RAID 0+1). The final RAID is known as the top array. When the top array is a RAID 0 (such as in RAID 1+0 and RAID 5+0), most vendors omit the "+" (yielding RAID 10 and RAID 50, respectively).

RAID 0+1: striped sets in a mirrored set (minimum four drives; even number of drives) provides fault tolerance and improved performance but increases complexity. The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. The array continues to operate with one or more drives failed in the same mirror set, but if drives fail on

both sides of the mirror, the data on the RAID system is lost.

RAID 1+0 (a.k.a. RAID 10): mirrored sets in a striped set (minimum four drives; even number of drives) provides fault tolerance and improved performance but increases complexity. The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives. The array can sustain multiple drive losses so long as no mirror loses all its drives[9] (the sketch after this list enumerates the difference for a four-drive array).

RAID 5+3: mirrored striped set with distributed parity (some manufacturers label this as RAID 53).

Whether an array runs as RAID 0+1 or RAID 1+0 in practice is often determined by the evolution of the storage system. A RAID controller might support upgrading a RAID 1 array to a RAID 1+0 array on the fly, but require a lengthy off-line rebuild to upgrade from RAID 1 to RAID 0+1. With nested arrays, sometimes the path of least disruption prevails over achieving the preferred configuration.
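A minimal enumeration sketch for a four-drive array, assuming the pairs {0, 1} and {2, 3} form the mirrors (for RAID 1+0) or the stripe sets (for RAID 0+1), showing why RAID 1+0 tolerates more two-drive failure combinations than RAID 0+1:

from itertools import combinations

drives = {0, 1, 2, 3}
pair_a, pair_b = {0, 1}, {2, 3}

def survives_raid10(failed):
    # Mirrors {0,1} and {2,3} are striped: each mirror needs at least one survivor.
    return pair_a - failed and pair_b - failed

def survives_raid01(failed):
    # Stripe sets {0,1} and {2,3} are mirrored: at least one stripe set must stay intact.
    return not (pair_a & failed) or not (pair_b & failed)

for failed in map(set, combinations(drives, 2)):
    print(sorted(failed), bool(survives_raid10(failed)), bool(survives_raid01(failed)))
# RAID 1+0 survives 4 of the 6 possible two-drive failures; RAID 0+1 survives only 2.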


RAID parity
Many RAID levels employ an error protection scheme called "parity", a widely used method in information technology to provide fault tolerance in a given set of data. Most use the simple XOR parity described in this section, but RAID 6 uses two separate parities based respectively on addition and multiplication in a particular Galois Field[10] or Reed-Solomon error correction. In Boolean logic, there is an operation called exclusive or (XOR), meaning "one or the other, but not both," that is:

0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0

The XOR operator is central to how parity data is created and used within an array. It is used both for the protection of data, as well as for the recovery of missing data. As an example, consider a simple RAID made up of 6 drives (4 for data, 1 for parity, and 1 for use as a hot spare), where each drive has only a single byte worth of storage (a '-' represents a bit, the value of which doesn't matter at this point in the discussion):

Drive #1: -------- (Data)
Drive #2: -------- (Data)
Drive #3: -------- (Data)
Drive #4: -------- (Data)
Drive #5: -------- (Hot Spare)
Drive #6: -------- (Parity)

Suppose the following data is written to the drives:

Drive #1: 00101010 (Data)
Drive #2: 10001110 (Data)
Drive #3: 11110111 (Data)
Drive #4: 10110101 (Data)
Drive #5: -------- (Hot Spare)
Drive #6: -------- (Parity)

Every time data is written to the data drives, a parity value must be calculated in order for the array to be able to recover in the event of a failure. To calculate the parity for this RAID, a bitwise XOR of each drive's data is calculated as follows, the result of which is the parity data:

00101010 XOR 10001110 XOR 11110111 XOR 10110101 = 11100110

The parity data 11100110 is then written to the dedicated parity drive:

Drive #1: 00101010 (Data)
Drive #2: 10001110 (Data)
Drive #3: 11110111 (Data)
Drive #4: 10110101 (Data)
Drive #5: -------- (Hot Spare)
Drive #6: 11100110 (Parity)


Suppose Drive #3 fails. In order to restore the contents of Drive #3, the same XOR calculation is performed against the data of all the remaining data drives and data on the parity drive (11100110) which was stored in Drive #6:

00101010 XOR 10001110 XOR 11100110 XOR 10110101 = 11110111

The XOR operation will yield the missing data. With the complete contents of Drive #3 recovered, the data is written to the hot spare, which then acts as a member of the array and allows the group as a whole to continue operating.

Drive #1: 00101010 (Data)
Drive #2: 10001110 (Data)
Drive #3: --Dead-- (Data)
Drive #4: 10110101 (Data)
Drive #5: 11110111 (Hot Spare)
Drive #6: 11100110 (Parity)

At this point the failed drive has to be replaced with a working one of the same size. Depending on the implementation, the new drive becomes a new hot spare, and the old hot spare drive continues to act as a data drive of the array, or (as illustrated below) the original hot spare's contents are automatically copied to the new drive by the array controller, allowing the original hot spare to return to its original purpose. The resulting array is identical to its pre-failure state:

Drive #1: 00101010 (Data)
Drive #2: 10001110 (Data)
Drive #3: 11110111 (Data)
Drive #4: 10110101 (Data)
Drive #5: -------- (Hot Spare)
Drive #6: 11100110 (Parity)

This same basic XOR principle applies to parity within RAID groups regardless of capacity or number of drives. As long as there are enough drives present to allow for an XOR calculation to take place, parity can be used to recover data from any single drive failure. (A minimum of three drives must be present in order for parity to be used for fault tolerance, because the XOR operator requires two operands, and a place to store the result).
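The worked example above can be reproduced in a few lines of Python. The byte values are the ones used in the illustration, and the recovery loop mirrors the rebuild of Drive #3 described in the text; this is a sketch of the XOR arithmetic only, not of any particular controller's implementation.

# Data drives from the example; keys are drive numbers, values are one byte each.
data = {1: 0b00101010, 2: 0b10001110, 3: 0b11110111, 4: 0b10110101}

# Parity is the bitwise XOR of all data drives (written to the parity drive).
parity = 0
for value in data.values():
    parity ^= value
print(f"parity          = {parity:08b}")   # 11100110

# Simulate the failure of Drive #3 and rebuild it from the survivors plus parity.
failed = 3
rebuilt = parity
for drive, value in data.items():
    if drive != failed:
        rebuilt ^= value
print(f"rebuilt drive 3 = {rebuilt:08b}")  # 11110111
assert rebuilt == data[failed]

The same loop works for any number of data drives, which is why the principle applies regardless of capacity or drive count: XOR-ing the surviving members with the parity always reproduces the single missing member.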


RAID 6 replacing RAID 5 in enterprise environments


Modern large drive capacities and the large RAID arrays used in modern servers create two problems (discussed below in Problems with RAID). First, in almost all arrays the drives are fitted at the time of manufacture and will therefore wear at similar rates and times. Therefore, the times of failure for individual drives correlate more closely than they should for a truly random event. Second, it takes time to replace the faulty drive and to rebuild the array. Rebuilding a RAID 5 array after a failure will add additional stress to all of the working drives because every area on every disk marked as being "in use" must be read to rebuild the redundancy that has been lost. If drives are close to failure, the stress of rebuilding the array can be enough to cause another drive to fail before the rebuild has been finished, and even more so if the server is still accessing the drives to provide data to clients, users, applications, etc.[11] It is during this rebuild of the "missing" drive that the entire RAID array is at risk of a catastrophic failure.

The rebuild of an array on a busy and large system can take hours and sometimes days,[11] and therefore it is not surprising that when systems need to be highly available and highly reliable or fault tolerant, RAID 6 is chosen.[11] With a RAID 6 array using drives from multiple sources and manufacturers it is possible to mitigate most of the problems associated with RAID 5. The larger the drive capacities and the larger the array size, the more important it becomes to choose RAID 6 instead of RAID 5.[11]

A disadvantage of RAID 6 is extra cost because two redundant drives are required. In small arrays, this can add significantly to the production cost and also to the ongoing cost because of the additional power consumption and additional physical space required. RAID 6 is a relatively new technology compared to RAID 5, and therefore the hardware is more expensive to purchase and drivers will be limited to a smaller range of operating systems. In software implementations of RAID 6, the algorithms require more CPU time when compared to RAID 5, because the algorithms are more complex and there is more data to be processed. Therefore, RAID 6 in software implementations may require more powerful CPUs than RAID 5.

RAID 6 also suffers a greater write performance penalty than RAID 5. For small (non-full stripe in size) write operations, which are the dominant size in transaction processing systems, the spindle operation overhead is 50% greater and latency will be slightly higher than with RAID 5. Providing the same write performance as a RAID 5 array requires that a RAID 6 array be built of approximately 50% more spindles, and this impacts the cost of performance.
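The 50% spindle-overhead figure for small writes follows from the textbook read-modify-write accounting: updating one block in place on RAID 5 costs four drive operations (read old data, read old parity, write new data, write new parity), and RAID 6 adds a read and a write for its second parity, for six. A minimal sketch of that arithmetic, assuming this classic update sequence:

# Drive operations for one small (sub-stripe) write using read-modify-write:
# read the old data and each old parity, then write the new data and each new parity.
def small_write_ops(parity_drives):
    return 2 + 2 * parity_drives

raid5_ops = small_write_ops(1)  # 4 operations
raid6_ops = small_write_ops(2)  # 6 operations
print(raid5_ops, raid6_ops, f"overhead: {raid6_ops / raid5_ops - 1:.0%}")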

RAID 10 versus RAID 5 in relational databases


A common opinion (and one which serves to illustrate the dynamics of proper RAID deployment) is that RAID 10 is inherently better for relational databases than RAID 5, because RAID 5 requires the recalculation and redistribution of parity data on a per-write basis.[12] While this may have been a hurdle in past RAID 5 implementations, the task of parity recalculation and redistribution within modern storage area network (SAN) appliances is performed as a back-end process transparent to the host, not as an in-line process which competes with existing I/O. (i.e. the RAID controller handles this as a housekeeping task to be performed during a particular spindle's idle timeslices, so as not to disrupt any pending I/O from the host.) The "write penalty" inherent to RAID 5 has been effectively masked since the late 1990s by a combination of improved controller design, larger amounts of cache, and faster drives. The effect of a write penalty when using RAID 5 is mostly a concern when the workload cannot be de-staged efficiently from the SAN controller's write cache. SAN appliances generally service multiple hosts that compete both for controller cache, potential SSD cache, and spindle time. In enterprise-level SAN hardware, any writes which are generated by the host are simply stored in a small, mirrored NVRAM cache, acknowledged immediately, and later physically written when the controller sees fit to do so from an efficiency standpoint. From the host's perspective, an individual write to a RAID 10 volume is no faster than an individual write to a RAID 5 volume, both are acknowledged immediately, and serviced on the

back-end. The choice between RAID 10 and RAID 5 for the purpose of housing a relational database depends upon a number of factors (spindle availability, cost, business risk, etc.) but, from a performance standpoint, it depends mostly on the type of I/O expected for a particular database application. For databases that are expected to be exclusively or strongly read-biased, RAID 10 is often chosen because it offers a slight speed improvement over RAID 5 on sustained reads and sustained randomized writes. If a database is expected to be strongly write-biased, RAID 5 becomes the more attractive option, since RAID 5 does not suffer from the same write handicap inherent in RAID 10; all spindles in a RAID 5 can be utilized to write simultaneously, whereas only half the members of a RAID 10 can be used. However, for reasons similar to what has eliminated the "read penalty" in RAID 5, the "write penalty" of RAID 10 has been largely masked by improvements in controller cache efficiency and drive throughput. What causes RAID 5 to be slightly slower than RAID 10 on sustained reads is the fact that RAID 5 has parity data interleaved within normal data. For every read pass in RAID 5, there is a probability that a read head may need to traverse a region of parity data. The cumulative effect of this is a slight performance drop compared to RAID 10, which does not use parity, and therefore never encounters a circumstance where data underneath a head is of no use. For the vast majority of situations, however, most relational databases housed on RAID 10 perform equally well in RAID 5. The strengths and weaknesses of each type only become an issue in atypical deployments, or deployments on overcommitted hardware. Often, any measurable differences between the two formats are masked by structural deficiencies at the host layer, such as poor database maintenance, or sub-optimal I/O configuration settings.[13]

There are, however, other considerations which must be taken into account other than simply those regarding performance. RAID 5 and other non-mirror-based arrays offer a lower degree of resiliency than RAID 10 by virtue of RAID 10's mirroring strategy. In a RAID 10, I/O can continue even in spite of multiple drive failures. By comparison, in a RAID 5 array, any failure involving more than one drive renders the array itself unusable by virtue of parity recalculation being impossible to perform. Thus, RAID 10 is frequently favored because it provides the lowest level of risk.[14] Additionally, the time required to rebuild data on a hot spare in a RAID 10 is significantly less than in a RAID 5, because all the remaining spindles in a RAID 5 rebuild must participate in the process, whereas only the hot spare and one surviving member of the broken mirror are required in a RAID 10. Thus, in comparison to a RAID 5, a RAID 10 has a smaller window of opportunity during which a second drive failure could cause array failure. Modern SAN design largely masks any performance hit while a RAID is in a degraded state, by virtue of being able to perform rebuild operations both in-band or out-of-band with respect to existing I/O traffic.
Given the rare nature of drive failures in general, and the exceedingly low probability of multiple concurrent drive failures occurring within the same RAID, the choice of RAID 5 over RAID 10 often comes down to the preference of the storage administrator, particularly when weighed against other factors such as cost, throughput requirements, and physical spindle availability.[14] In short, the choice between RAID 5 and RAID 10 involves a complicated mixture of factors. There is no one-size-fits-all solution, as the choice of one over the other must be dictated by everything from the I/O characteristics of the database, to business risk, to worst case degraded-state throughput, to the number and type of drives present in the array itself. Over the course of the life of a database, one may even see situations where RAID 5 is initially favored, but RAID 10 slowly becomes the better choice, and vice versa.


New RAID classification


In 1996, the RAID Advisory Board introduced an improved classification of RAID systems. It divides RAID into three types:
Failure-resistant (systems that protect against loss of data due to drive failure).
Failure-tolerant (systems that protect against loss of data access due to failure of any single component).
Disaster-tolerant (systems that consist of two or more independent zones, either of which provides access to stored data).

The original "Berkeley" RAID classifications are still kept as an important historical reference point and also to recognize that RAID levels 0-6 successfully define all known data mapping and protection schemes for disk-based storage systems. Unfortunately, the original classification caused some confusion due to the assumption that higher RAID levels imply higher redundancy and performance; this confusion has been exploited by RAID system manufacturers, and it has given birth to products with such names as RAID-7, RAID-10, RAID-30, RAID-S, etc. Consequently, the new classification describes the data availability characteristics of a RAID system, leaving the details of its implementation to system manufacturers.

Failure-resistant disk systems (FRDS) (meet a minimum of criteria 1-6):
1. Protection against data loss and loss of access to data due to drive failure
2. Reconstruction of failed drive content to a replacement drive
3. Protection against data loss due to a "write hole"
4. Protection against data loss due to host and host I/O bus failure
5. Protection against data loss due to replaceable unit failure
6. Replaceable unit monitoring and failure indication

Failure-tolerant disk systems (FTDS) (meet a minimum of criteria 1-15):
1. Disk automatic swap and hot swap
2. Protection against data loss due to cache failure
3. Protection against data loss due to external power failure
4. Protection against data loss due to a temperature out of operating range
5. Replaceable unit and environmental failure warning
6. Protection against loss of access to data due to device channel failure
7. Protection against loss of access to data due to controller module failure
8. Protection against loss of access to data due to cache failure
9. Protection against loss of access to data due to power supply failure

Disaster-tolerant disk systems (DTDS) (meet a minimum of criteria 1-21):
1. Protection against loss of access to data due to host and host I/O bus failure
2. Protection against loss of access to data due to external power failure
3. Protection against loss of access to data due to component replacement
4. Protection against loss of data and loss of access to data due to multiple drive failures
5. Protection against loss of access to data due to zone failure
6. Long-distance protection against loss of data due to zone failure


Non-standard levels
Many configurations other than the basic numbered RAID levels are possible, and many companies, organizations, and groups have created their own non-standard configurations, in many cases designed to meet the specialised needs of a small niche group. Most of these non-standard RAID levels are proprietary.
Himperia[15] uses RAID 50EE in its ZStore 3212L product. This is a RAID 0 of two pools with RAID 5EE (7+1+1). It tolerates up to 2 simultaneous disk failures, and up to 4 disk failures in degraded mode. Reconstruction time is kept to a minimum thanks to RAID 5EE, and performance is increased thanks to RAID 0.
Storage Computer Corporation used to call a cached version of RAID 3 and 4 "RAID 7". Storage Computer Corporation is now defunct.
EMC Corporation used to offer RAID S as an alternative to RAID 5 on their Symmetrix systems. Their latest generations of Symmetrix, the DMX and the V-Max series, do not support RAID S (instead they support RAID 1, RAID 5 and RAID 6).
The ZFS filesystem, available in Solaris, OpenSolaris and FreeBSD, offers RAID-Z, which solves RAID 5's write hole problem.
Hewlett-Packard's Advanced Data Guarding (ADG) is a form of RAID 6.
NetApp's Data ONTAP uses RAID-DP (also referred to as "double", "dual", or "diagonal" parity), a form of RAID 6 that, unlike many RAID 6 implementations, does not use distributed parity as in RAID 5. Instead, two unique parity drives with separate parity calculations are used. This is a modification of RAID 4 with an extra parity drive.
Accusys Triple Parity (RAID TP) implements three independent parities by extending RAID 6 algorithms on its FC-SATA and SCSI-SATA RAID controllers to tolerate a failure of 3 drives.
Linux MD RAID10 (RAID 10) implements a general RAID driver that defaults to a standard RAID 1 with 2 drives, and a standard RAID 1+0 with four drives, but can have any number of drives, including odd numbers. MD RAID 10 can run striped and mirrored, even with only two drives with the f2 layout (mirroring with striped reads, giving the read performance of RAID 0; normal Linux software RAID 1 does not stripe reads, but can read in parallel).[9][16][17]
Hewlett-Packard's EVA series arrays implement vRAID - vRAID-0, vRAID-1, vRAID-5, and vRAID-6; vRAID levels are closely aligned to nested RAID levels: vRAID-1 is actually a RAID 1+0 (or RAID 10), vRAID-5 is actually a RAID 5+0 (or RAID 50), etc.
IBM (among others) has implemented RAID 1E (Level 1 Enhanced). It requires a minimum of 3 drives. It is similar to a RAID 1+0 array, but it can also be implemented with either an even or odd number of drives. The total available RAID storage is n/2.
Hadoop has a RAID system that generates a parity file by XOR-ing a stripe of blocks in a single HDFS file.[18]

Data backup
A RAID system used as secondary storage is not an alternative to backing up data. In parity configurations, a RAID protects from catastrophic data loss caused by physical damage or errors on a single drive within the array (or two drives in, say, RAID 6). However, a true backup system has other important features such as the ability to restore an earlier version of data, which is needed both to protect against software errors that write unwanted data to secondary storage, and also to recover from user error and malicious data deletion. A RAID can be overwhelmed by catastrophic failure that exceeds its recovery capacity and, of course, the entire array is at risk of physical damage by fire, natural disaster, and human forces, while backups can be stored off-site. A RAID is also vulnerable to controller failure because it is not always possible to migrate a RAID to a new, different controller without data loss.[19]


Implementations
The distribution of data across multiple drives can be managed either by dedicated computer hardware or by software. A software solution may be part of the operating system, or it may be part of the firmware and drivers supplied with a hardware RAID controller.

Software-based RAID
Software RAID implementations are now provided by many operating systems. Software RAID can be implemented as:
a layer that abstracts multiple devices, thereby providing a single virtual device (e.g. Linux's md).
a more generic logical volume manager (provided with most server-class operating systems, e.g. Veritas or LVM).
a component of the file system (e.g. ZFS or Btrfs).

Volume manager support
Server class operating systems typically provide logical volume management, which allows a system to use logical volumes which can be resized or moved. Often, features like RAID or snapshots are also supported.
Vinum is a logical volume manager supporting RAID-0, RAID-1, and RAID-5. Vinum is part of the base distribution of the FreeBSD operating system, and versions exist for NetBSD, OpenBSD, and DragonFly BSD.
Solaris SVM supports RAID 1 for the boot filesystem, and adds RAID 0 and RAID 5 support (and various nested combinations) for data drives.
Linux LVM supports RAID 0 and RAID 1.
HP's OpenVMS provides a form of RAID 1 called "Volume shadowing", giving the possibility to mirror data locally and at remote cluster systems.

File system support
Some advanced file systems are designed to organize data across multiple storage devices directly (without needing the help of a third-party logical volume manager).
ZFS supports equivalents of RAID 0, RAID 1, RAID 5 (RAID Z), RAID 6 (RAID Z2), and a triple parity version RAID Z3, and any nested combination of those like 1+0. ZFS is the native file system on Solaris, and also available on FreeBSD.
Btrfs supports RAID 0, RAID 1, and RAID 10 (RAID 5 and 6 are under development).

Other support
Many operating systems provide basic RAID functionality independently of volume management.
Apple's Mac OS X Server[20] and Mac OS X[21] support RAID 0, RAID 1, and RAID 1+0.
FreeBSD supports RAID 0, RAID 1, RAID 3, and RAID 5, and all nestings via GEOM modules[22][23] and ccd.[24]
Linux's md supports RAID 0, RAID 1, RAID 4, RAID 5, RAID 6, and all nestings.[25][26] Certain reshaping/resizing/expanding operations are also supported.[27]
Microsoft's server operating systems support RAID 0, RAID 1, and RAID 5. Some of the Microsoft desktop operating systems support RAID, such as Windows XP Professional, which supports RAID level 0 in addition to spanning multiple drives, but only if using dynamic disks and volumes. Windows XP can be modified to support RAID 0, 1, and 5.[28]
NetBSD supports RAID 0, RAID 1, RAID 4, and RAID 5, and all nestings via its software implementation, named RAIDframe.
OpenBSD aims to support RAID 0, RAID 1, RAID 4, and RAID 5 via its software implementation softraid.

FlexRAID (for Linux and Windows) is a snapshot RAID implementation.

Software RAID has advantages and disadvantages compared to hardware RAID. The software must run on a host server attached to storage, and the server's processor must dedicate processing time to run the RAID software; the additional processing capacity required for RAID 0 and RAID 1 is low, but parity-based arrays require more complex data processing during write or integrity-checking operations. As the rate of data processing increases with the number of drives in the array, so does the processing requirement. Furthermore, all the buses between the processor and the drive controller must carry the extra data required by RAID, which may cause congestion. Fortunately, over time, the increase in commodity CPU speed has been consistently greater than the increase in drive throughput;[29] the percentage of host CPU time required to saturate a given number of drives has decreased. For instance, under 100% usage of a single core on a 2.1 GHz Intel "Core2" CPU, the Linux software RAID subsystem (md) as of version 2.6.26 is capable of calculating parity information at 6 GB/s; however, a three-drive RAID 5 array using drives capable of sustaining a write operation at 100 MB/s only requires parity to be calculated at the rate of 200 MB/s, which requires the resources of just over 3% of a single CPU core. Furthermore, software RAID implementations may employ more sophisticated algorithms than hardware RAID implementations (e.g. drive scheduling and command queueing), and thus may be capable of better performance.

Another concern with software implementations is the process of booting the associated operating system. For instance, consider a computer being booted from a RAID 1 (mirrored drives); if the first drive in the RAID 1 fails, then a first-stage boot loader might not be sophisticated enough to attempt loading the second-stage boot loader from the second drive as a fallback. In contrast, a RAID 1 hardware controller typically has explicit programming to decide that a drive has malfunctioned and that the next drive should be used. At least the following second-stage boot loaders are capable of loading a kernel from a RAID 1:
LILO (for Linux).
Some configurations of GRUB.
The boot loader for FreeBSD.[30]
The boot loader for NetBSD.


For data safety, the write-back cache of an operating system or individual drive might need to be turned off in order to ensure that as much data as possible is actually written to secondary storage before some failure (such as a loss of power); unfortunately, turning off the write-back cache has a performance penalty that can be significant depending on the workload and command queuing support. In contrast, a hardware RAID controller may carry a dedicated battery-powered write-back cache of its own, thereby allowing for efficient operation that is also relatively safe. Fortunately, it is possible to avoid such problems with a software controller by constructing a RAID with safer components; for instance, each drive could have its own battery or capacitor on its own write-back cache, and the drive could implement atomicity in various ways, and the entire RAID or computing system could be powered by a UPS, etc. Finally, a software RAID controller that is built into an operating system usually uses proprietary data formats and RAID levels, so an associated RAID usually cannot be shared between operating systems as part of a multi boot setup. However, such a RAID may be moved between computers that share the same operating system; in contrast, such mobility is more difficult when using a hardware RAID controller because both computers must provide compatible hardware controllers. Also, if the hardware controller fails, data could become unrecoverable unless a hardware controller of the same type is obtained. Most software implementations allow a RAID to be created from partitions rather than entire physical drives. For instance, an administrator could divide each drive of an odd number of drives into two partitions, and then mirror partitions across drives and stripe a volume across the mirrored partitions to emulate IBM's RAID 1E configuration. Using partitions in this way also allows for constructing multiple RAIDs in various RAID levels from the same set of drives. For example, one could have a very robust RAID 1 for important files, and a less robust RAID 5 or RAID 0 for less important data, all using the same set of underlying drives. (Some BIOS-based controllers offer similar

features, e.g. Intel Matrix RAID.)

Using two partitions from the same drive in the same RAID puts data at risk if the drive fails; for instance:
A RAID 1 across partitions from the same drive makes all the data inaccessible if the single drive fails.
Consider a RAID 5 composed of 4 drives, 3 of which are 250 GB and one of which is 500 GB; the 500 GB drive is split into 2 partitions, each of which is 250 GB. Then, a failure of the 500 GB drive would remove 2 underlying 'drives' from the array, causing a failure of the entire array.


Hardware-based RAID
Hardware RAID controllers use proprietary data layouts, so it is not usually possible to span controllers from different manufacturers. They do not require processor resources, the BIOS can boot from them, and tighter integration with the device driver may offer better error handling. On a desktop system, a hardware RAID controller may be an expansion card connected to a bus (e.g., PCI or PCIe) or a component integrated into the motherboard; there are controllers for supporting most types of drive technology, such as IDE/ATA, SATA, SCSI, SSA, Fibre Channel, and sometimes even a combination. The controller and drives may be in a stand-alone enclosure, rather than inside a computer, and the enclosure may be directly attached to a computer, or connected via a SAN. Most hardware implementations provide a read/write cache, which, depending on the I/O workload, improves performance. In most systems, the write cache is non-volatile (i.e. battery-protected), so pending writes are not lost in the event of a power failure. Hardware implementations provide guaranteed performance, add no computational overhead to the host computer, and can support many operating systems; the controller simply presents the RAID as another logical drive.

Firmware/driver-based RAID
A RAID implemented at the level of an operating system is not always compatible with the system's boot process, and it is generally impractical for desktop versions of Windows (as described above). However, hardware RAID controllers are expensive and proprietary. To fill this gap, cheap "RAID controllers" were introduced that do not contain a dedicated RAID controller chip, but simply a standard drive controller chip with special firmware and drivers; during early stage bootup, the RAID is implemented by the firmware, and once the operating system has been more completely loaded, then the drivers take over control. Consequently, such controllers may not work when driver support is not available for the host operating system.[31] Initially, the term "RAID controller" implied that the controller does the processing. However, while a controller without a dedicated RAID chip is often described by a manufacturer as a "RAID controller", it is rarely made clear that the burden of RAID processing is borne by a host computer's central processing unit rather than the RAID controller itself. Thus, this new type is sometimes called "fake" RAID; Adaptec calls it a "HostRAID". Moreover, a firmware controller can often only support certain types of hard drive to form the RAID that it manages (e.g. SATA for an Intel Matrix RAID, as there is neither SCSI nor PATA support in modern Intel ICH southbridges; however, motherboard makers implement RAID controllers outside of the southbridge on some motherboards).

Hot spares
Both hardware and software RAIDs with redundancy may support the use of a hot spare drive; this is a drive physically installed in the array which is inactive until an active drive fails, when the system automatically replaces the failed drive with the spare, rebuilding the array with the spare drive included. This reduces the mean time to recovery (MTTR), but does not completely eliminate it. As with non-hot-spare systems, subsequent additional failure(s) in the same RAID redundancy group before the array is fully rebuilt can cause data loss. Rebuilding can take several hours, especially on busy systems.

RAID 6 without a spare uses the same number of drives as RAID 5 with a hot spare and protects data against failure of up to two drives, but requires a more advanced RAID controller and may not perform as well. Further, a hot spare can be shared by multiple RAID sets.


Data scrubbing / Patrol read


Data scrubbing is periodic reading and checking by the RAID controller of all the blocks in a RAID, including those not otherwise accessed. This allows bad blocks to be detected before they are used.[32] An alternate name for this is patrol read. This is defined as a check for bad blocks on each storage device in an array, but which also uses the redundancy of the array to recover bad blocks on a single drive and reassign the recovered data to spare blocks elsewhere on the drive.[33]

Reliability terms
Failure rate
Two different kinds of failure rates are applicable to RAID systems. Logical failure is defined as the loss of a single drive and its rate is equal to the sum of individual drives' failure rates. System failure is defined as loss of data and its rate will depend on the type of RAID. For RAID 0 this is equal to the logical failure rate, as there is no redundancy. For other types of RAID, it will be less than the logical failure rate, potentially very small, and its exact value will depend on the type of RAID, the number of drives employed, the vigilance and alacrity of its human administrators, and chance (improbable events do occur, though infrequently).

Mean time to data loss (MTTDL)
In this context, the average time before a loss of data in a given array.[34] Mean time to data loss of a given RAID may be higher or lower than that of its constituent hard drives, depending upon what type of RAID is employed. The referenced report assumes times to data loss are exponentially distributed, so that 63.2% of all data loss will occur between time 0 and the MTTDL.

Mean time to recovery (MTTR)
In arrays that include redundancy for reliability, this is the time following a failure to restore an array to its normal failure-tolerant mode of operation. This includes time to replace a failed drive mechanism and time to re-build the array (to replicate data for redundancy).

Unrecoverable bit error rate (UBE)
This is the rate at which a drive will be unable to recover data after application of cyclic redundancy check (CRC) codes and multiple retries.

Write cache reliability
Some RAID systems use RAM write cache to increase performance. A power failure can result in data loss unless this sort of drive buffer has a supplementary battery to ensure that the buffer has time to write from RAM to secondary storage before the drive powers down.

Atomic write failure
Also known by various terms such as torn writes, torn pages, incomplete writes, interrupted writes, non-transactional, etc.
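The 63.2% figure quoted for MTTDL is simply a property of the exponential distribution assumed in the referenced report: the probability of having lost data by time t is 1 - exp(-t / MTTDL), which equals 1 - 1/e ≈ 63.2% at t = MTTDL. A short sketch, using a hypothetical MTTDL value:

import math

def p_loss_by(hours, mttdl_hours):
    # Probability of data loss by a given time, assuming exponentially
    # distributed time to data loss.
    return 1 - math.exp(-hours / mttdl_hours)

mttdl = 1_000_000                            # hypothetical MTTDL, in hours
print(f"{p_loss_by(mttdl, mttdl):.1%}")      # 63.2% of losses occur by t = MTTDL
print(f"{p_loss_by(5 * 8760, mttdl):.1%}")   # chance of loss within five years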


Problems with RAID


Correlated failures
The theory behind the error correction in RAID assumes that failures of drives are independent. Given these assumptions, it is possible to calculate how often they can fail and to arrange the array to make data loss arbitrarily improbable. There is also an assumption that motherboard failures won't damage the hard drive and that hard drive failures occur more often than motherboard failures.

In practice, the drives are often the same age (with similar wear) and subject to the same environment. Since many drive failures are due to mechanical issues (which are more likely on older drives), this violates those assumptions; failures are in fact statistically correlated. In practice, the chances of a second failure occurring before the first has been recovered (causing data loss) are considerably higher than independent, random failures would suggest. In a study including about 100,000 drives, the probability of two drives in the same cluster failing within one hour was observed to be four times larger than was predicted by the exponential statistical distribution which characterizes processes in which events occur continuously and independently at a constant average rate. The probability of two failures within the same 10-hour period was twice as large as that which was predicted by an exponential distribution.[35]

A common assumption is that "server-grade" drives fail less frequently than consumer-grade drives. Two independent studies (one by Carnegie Mellon University and the other by Google) have shown that the "grade" of a drive does not relate to the drive's failure rate.[36][37] In addition, there is no protection circuitry between the motherboard and hard drive electronics, so a catastrophic failure of the motherboard can cause the hard drive electronics to fail. Therefore, taking elaborate precautions via RAID setups ignores the equal risk of electronics failures elsewhere which can cascade to a hard drive failure. For a robust critical data system, no risk can outweigh another as the consequence of any data loss is unacceptable.
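Under the independence assumption that this section calls into question, the risk of losing a second drive while the first is being rebuilt can be estimated as follows; the drive count, annual failure rate, and rebuild window are hypothetical values, and the studies cited above suggest that correlated failures make the real risk higher than this calculation indicates.

# Probability that at least one surviving drive fails during the rebuild
# window, assuming independent failures at a constant annual rate.
def p_second_failure(surviving_drives, annual_failure_rate, rebuild_hours):
    p_one = annual_failure_rate * rebuild_hours / 8760  # per-drive probability
    return 1 - (1 - p_one) ** surviving_drives

# Hypothetical: 7 surviving drives, 3% annual failure rate, 24-hour rebuild.
print(f"{p_second_failure(7, 0.03, 24):.3%}")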

Atomicity
This is a little understood and rarely mentioned failure mode for redundant storage systems that do not utilize transactional features. Database researcher Jim Gray wrote "Update in Place is a Poison Apple"[38] during the early days of relational database commercialization. However, this warning largely went unheeded and fell by the wayside upon the advent of RAID, which many software engineers mistook as solving all data storage integrity and reliability problems. Many software programs update a storage object "in-place"; that is, they write a new version of the object on to the same secondary storage addresses as the old version of the object. While the software may also log some delta information elsewhere, it expects the storage to present "atomic write semantics," meaning that the write of the data either occurred in its entirety or did not occur at all. However, very few storage systems provide support for atomic writes, and even fewer specify their rate of failure in providing this semantic. Note that during the act of writing an object, a RAID storage device will usually be writing all redundant copies of the object in parallel, although overlapped or staggered writes are more common when a single RAID processor is responsible for multiple drives. Hence an error that occurs during the process of writing may leave the redundant copies in different states, and furthermore may leave the copies in neither the old nor the new state. The little known failure mode is that delta logging relies on the original data being either in the old or the new state so as to enable backing out the logical change, yet few storage systems provide an atomic write semantic for a RAID. While the battery-backed write cache may partially solve the problem, it is applicable only to a power failure scenario. Since transactional support is not universally present in hardware RAID, many operating systems include transactional support to protect against data loss during an interrupted write. Novell NetWare, starting with version 3.x, included a transaction tracking system. Microsoft introduced transaction tracking via the journaling feature in

NTFS. ext4 has journaling with checksums; ext3 has journaling without checksums but an "append-only" option, or ext3cow (Copy on Write). If the journal itself in a filesystem is corrupted though, this can be problematic. The journaling in the NetApp WAFL file system gives atomicity by never updating the data in place, as does ZFS. An alternative method to journaling is soft updates, which are used in some BSD-derived systems' implementation of UFS.

Unrecoverable data
An unrecoverable read error can present as a sector read failure. Some RAID implementations protect against this failure mode by remapping the bad sector, using the redundant data to retrieve a good copy of the data, and rewriting that good data to the newly mapped replacement sector. The UBE (Unrecoverable Bit Error) rate is typically specified at 1 bit in 10^15 for enterprise class drives (SCSI, FC, SAS), and 1 bit in 10^14 for desktop class drives (IDE/ATA/PATA, SATA). Increasing drive capacities and large RAID 5 redundancy groups have led to an increasing inability to successfully rebuild a RAID group after a drive failure because an unrecoverable sector is found on the remaining drives. Double protection schemes such as RAID 6 are attempting to address this issue, but suffer from a very high write penalty.
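The quoted UBE rates translate into a rough estimate of how likely a rebuild is to encounter an unrecoverable sector, assuming the simplest possible model of independent bit errors at the specified rate; the drive count and capacity below are hypothetical examples, not figures taken from the text.

# Rough chance that a RAID 5 rebuild hits an unrecoverable read error: every
# bit on every surviving drive must be read, and each bit is assumed to fail
# independently with probability equal to the UBE rate.
def p_ure_during_rebuild(drives, capacity_tb, ube):
    bits_read = (drives - 1) * capacity_tb * 1e12 * 8
    return 1 - (1 - ube) ** bits_read

# Hypothetical example: six 2 TB desktop-class drives (UBE of 1 bit in 10^14)
print(f"{p_ure_during_rebuild(6, 2.0, 1e-14):.1%}")
# The same array built from enterprise-class drives (UBE of 1 bit in 10^15)
print(f"{p_ure_during_rebuild(6, 2.0, 1e-15):.1%}")

Even under this crude model, a rebuild that must read roughly ten terabytes has a substantial chance of hitting an unreadable sector on desktop-class drives, which is the motivation given here for double-parity schemes such as RAID 6.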


Write cache reliability


The drive system can acknowledge the write operation as soon as the data is in the cache, not waiting for the data to be physically written. This typically occurs in old, non-journaled systems such as FAT32, or if the Linux/Unix "writeback" option is chosen without any protections like the "soft updates" option (to promote I/O speed whilst trading away data reliability). A power outage or system hang such as a BSOD can mean a significant loss of any data queued in such a cache.

Often the write cache is protected by a battery, which mostly solves the problem. If a write fails because of a power failure, the controller may complete the pending writes as soon as it is restarted. This solution still has potential failure cases: the battery may have worn out, the power may be off for too long, the drives could be moved to another controller, and the controller itself could fail. Some systems provide the capability of testing the battery periodically; however, this leaves the system without a fully charged battery for several hours.

An additional concern about write cache reliability exists, specifically regarding devices equipped with a write-back cache, a caching system which reports the data as written as soon as it is written to cache, as opposed to the non-volatile medium.[39] The safer cache technique is write-through, which reports transactions as written when they are written to the non-volatile medium.

Equipment compatibility
The methods used to store data by various RAID controllers are not necessarily compatible, so that it may not be possible to read a RAID on different hardware, with the exception of RAID 1, which is typically represented as plain identical copies of the original data on each drive. Consequently a non-drive hardware failure may require the use of identical hardware to recover the data, and furthermore an identical configuration has to be reassembled without triggering a rebuild and overwriting the data. Software RAID however, such as implemented in the Linux kernel, alleviates this concern, as the setup is not hardware dependent, but runs on ordinary drive controllers, and allows the reassembly of an array. Additionally, individual drives of a RAID 1 (software and most hardware implementations) can be read like normal drives when removed from the array, so no RAID system is required to retrieve the data. Inexperienced data recovery firms typically have a difficult time recovering data from RAID drives, with the exception of RAID1 drives with conventional data structure.


Data recovery in the event of a failed array


With larger drive capacities the odds of a drive failure during rebuild are not negligible. In that event, the difficulty of extracting data from a failed array must be considered. Only a RAID 1 (mirror) stores all data on each drive in the array. Although it may depend on the controller, some individual drives in a RAID 1 can be read as a single conventional drive; this means a damaged RAID 1 can often be easily recovered if at least one component drive is in working condition. If the damage is more severe, some or all data can often be recovered by professional data recovery specialists. However, other RAID levels (like RAID level 5) present much more formidable obstacles to data recovery.

Drive error recovery algorithms


Many modern drives have internal error recovery algorithms that can take upwards of a minute to recover and re-map data that the drive fails to read easily. Frequently, a RAID controller is configured to drop a component drive (that is, to assume a component drive has failed) if the drive has been unresponsive for 8 seconds or so; this might cause the array controller to drop a good drive because that drive has not been given enough time to complete its internal error recovery procedure. Consequently, desktop drives can be quite risky when used in a RAID, and so-called enterprise class drives limit this error recovery time in order to obviate the problem. A fix specific to Western Digital's desktop drives used to be known: A utility called WDTLER.exe could limit a drive's error recovery time; the utility enabled TLER (time limited error recovery), which limits the error recovery time to 7 seconds. Around September 2009, Western Digital disabled this feature in their desktop drives (e.g., the Caviar Black line), making such drives unsuitable for use in a RAID.[40] However, Western Digital enterprise class drives are shipped from the factory with TLER enabled. Similar technologies are used by Seagate, Samsung, and Hitachi. Of course, for non-RAID usage, an enterprise class drive with a short error recovery timeout that cannot be changed is therefore less suitable than a desktop drive.[40] In late 2010, the Smartmontools program began supporting the configuration of ATA Error Recovery Control, allowing the tool to configure many desktop class hard drives for use in a RAID.[40]

Recovery time is increasing


Drive capacity has grown at a much faster rate than transfer speed, and error rates have only fallen a little in comparison. Therefore, larger capacity drives may take hours, if not days, to rebuild. Rebuild speed is also limited if the entire array is still in operation at reduced capacity.[41] Given a RAID with only one drive of redundancy (RAIDs 3, 4, and 5), a second failure would cause complete failure of the array. Even though individual drives' mean time between failures (MTBF) has increased over time, this increase has not kept pace with the increased storage capacity of the drives. The time to rebuild the array after a single drive failure, as well as the chance of a second failure during a rebuild, have increased over time.[42]
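A back-of-the-envelope calculation shows why rebuild times have grown with capacity: at an absolute minimum, a rebuild must stream the full capacity of the replacement drive at its sustained transfer rate, and competing production I/O only stretches this out. The capacities and transfer rates below are hypothetical examples.

# Minimum rebuild time: write the whole replacement drive at the sustained rate.
def rebuild_hours(capacity_tb, mb_per_s):
    return capacity_tb * 1e12 / (mb_per_s * 1e6) / 3600

print(f"{rebuild_hours(0.5, 70):.1f} hours")   # older 500 GB drive at 70 MB/s
print(f"{rebuild_hours(4.0, 150):.1f} hours")  # larger 4 TB drive at 150 MB/s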

Operator skills, correct operation


In order to provide the desired protection against physical drive failure, a RAID must be properly set up and maintained by an operator with sufficient knowledge of the chosen RAID configuration, array controller (hardware or software), failure detection and recovery. Unskilled handling of the array at any stage may exacerbate the consequences of a failure, and result in downtime and full or partial loss of data that might otherwise be recoverable. Particularly, the array must be monitored, and any failures detected and dealt with promptly. Failure to do so will result in the array continuing to run in a degraded state, vulnerable to further failures. Ultimately more failures may occur, until the entire array becomes inoperable, resulting in data loss and downtime. In this case, any protection the array may provide merely delays this.

The operator must know how to detect failures or verify the healthy state of the array, identify which drive has failed, have replacement drives available, and know how to replace a drive and initiate a rebuild of the array. In order to protect against such issues and reduce the need for direct onsite monitoring, some server hardware includes remote management and monitoring capabilities referred to as Baseboard Management, using the Intelligent Platform Management Interface. A server at a remote site which is not monitored by an onsite technician can instead be remotely managed and monitored, using a separate standalone communications channel that does not require the managed device to be operating. The Baseboard Management Controller in the server functions independently of the installed operating system, and may include the ability to manage and monitor a server even when it is in its "powered off / standby" state.

Hardware labeling issues
The hardware itself can contribute to RAID array management challenges, depending on how the array drives are arranged and identified. If there is no clear indication of which drive has failed, an operator not familiar with the hardware might remove a non-failed drive in a running server, and destroy an already degraded array. A controller may refer to drives by an internal numbering scheme such as 0, 1, 2... while an external drive mounting frame may be labeled 1, 2, 3...; in this situation drive #2 as identified by the controller is actually in mounting frame position #3. For large arrays spanning several external drive frames, each separate frame may restart the numbering at 1, 2, 3... but if the drive frames are cabled together, then the second row of a 12-drive frame may actually be drive 13, 14, 15... SCSI IDs can be assigned directly on the drive rather than through the interface connector. For direct-cabled drives, it is possible for the drive IDs to be arranged in any order on the SCSI cable, and for cabled drives to swap position keeping their individually assigned ID, even if the server's external chassis labeling indicates otherwise. Someone unfamiliar with a server's management challenges could swap drives around while the power is off without causing immediate damage to the RAID array, but this can mislead other technicians at a later time who assume that failed drives are in their original locations.


Other problems
While RAID may protect against physical drive failure, the data is still exposed to operator, software, hardware and virus destruction. Many studies[43] cite operator fault as the most common source of malfunction, such as a server operator replacing the incorrect drive in a faulty RAID, and disabling the system (even temporarily) in the process.[44] Most well-designed systems include separate backup systems that hold copies of the data, but do not allow much interaction with it. Most copy the data and remove the copy from the computer for safe storage. Hardware RAID controllers are really just small computers running specialized software. Although RAID controllers tend to be very thoroughly tested for reliability, the controller software may still contain bugs that cause damage to data in certain unforeseen situations. The controller software may also have time-dependent bugs that don't manifest until a system has been operating continuously, beyond what is a feasible time-frame for testing, before the controller product goes to market.


History
Norman Ken Ouchi at IBM was awarded a 1978 U.S. patent 4,092,732[45] titled "System for recovering data stored in failed memory unit." The claims for this patent describe what would later be termed RAID 5 with full stripe writes. This 1978 patent also mentions that drive mirroring or duplexing (what would later be termed RAID 1) and protection with dedicated parity (that would later be termed RAID 4) were prior art at that time. In October 1986, the IBM S/38 announced "checksum" - an operating system software level implementation of what became RAID-5. The S/38 "scatter-loaded" data over all disks for better performance and ease of use. As a result, a single disk failure forced the restore of the entire system. With S/38 checksum, when a disk failed, the system stopped and was powered off. Under maintenance, the bad disk was replaced and the new disk was fully recovered using RAID parity bits. While checksum had 10%-30% overhead and was not concurrent recovery, non-concurrent recovery was still a far better solution than a reload of the entire system. With 30% overhead and the then high expense of extra disk, few customers implemented checksum. The term RAID was first defined by David A. Patterson, Garth A. Gibson and Randy Katz at the University of California, Berkeley, in 1987. They studied the possibility of using two or more drives to appear as a single device to the host system and published a paper: "A Case for Redundant Arrays of Inexpensive Disks (RAID)" in June 1988 at the SIGMOD conference.[3] This specification suggested a number of prototype RAID levels, or combinations of drives. Each had theoretical advantages and disadvantages. Over the years, different implementations of the RAID concept have appeared. Most differ substantially from the original idealized RAID levels, but the numbered names have remained. This can be confusing, since one implementation of RAID 5, for example, can differ substantially from another. RAID 3 and RAID 4 are often confused and even used interchangeably. One of the early uses of RAID 0 and 1 was the Crosfield Electronics Studio 9500 page layout system based on the Python workstation. The Python workstation was a Crosfield managed international development using PERQ 3B electronics, benchMark Technology's Viper display system and Crosfield's own RAID and fibre-optic network controllers. RAID 0 was particularly important to these workstations as it dramatically sped up image manipulation for the pre-press markets. Volume production started in Peterborough, England in early 1987.

Non-RAID drive architectures


Non-RAID drive architectures also exist, and are often referred to, similarly to RAID, by standard acronyms, several tongue-in-cheek. A single drive is referred to as a SLED (Single Large Expensive Disk/Drive), by contrast with RAID, while an array of drives without any additional control (accessed simply as independent drives) is referred to, even in a formal context such as equipment specification, as a JBOD (Just a Bunch Of Disks). Simple concatenation is referred to as a "span".

References
[1] Donald, L. (2003). MCSA/MCSE 2006 JumpStart Computer and Network Basics (2nd ed.). Glasgow: SYBEX. [2] Howe, Denis, ed. Redundant Arrays of Independent Disks from [[Free On-line Dictionary of Computing|FOLDOC (http:/ / foldoc. org/ RAID)]]. Imperial College Department of Computing (http:/ / www. doc. ic. ac. uk/ ). . Retrieved 2011-11-10. [3] David A. Patterson, Garth Gibson, and Randy H. Katz: A Case for Redundant Arrays of Inexpensive Disks (RAID) (http:/ / www-2. cs. cmu. edu/ ~garth/ RAIDpaper/ Patterson88. pdf). University of California Berkeley. 1988. [4] "Originally referred to as Redundant Array of Inexpensive Disks, the concept of RAID was first developed in the late 1980s by Patterson, Gibson, and Katz of the University of California at Berkeley. (The RAID Advisory Board has since substituted the term Inexpensive with Independent.)" Storagecc Area Network Fundamentals; Meeta Gupta; Cisco Press; ISBN 978-1-58705-065-7; Appendix A. [5] See RAS syndrome. [6] "Common RAID Disk Drive Format (DDF) standard" (http:/ / www. snia. org/ tech_activities/ standards/ curr_standards/ ddf/ ). Snia.org. . Retrieved 2012-08-26. [7] "SNIA Dictionary" (http:/ / www. snia. org/ education/ dictionary). Snia.org. . Retrieved 2010-08-24.

External links
RAID (http://www.dmoz.org/Computers/Hardware/Storage/Subsystems/RAID/) at the Open Directory Project


Operating System
Operating system
An operating system (OS) is a collection of software that manages computer hardware resources and provides common services for computer programs. The operating system is a vital component of the system software in a computer system. Application programs require an operating system to function. Time-sharing operating systems schedule tasks for efficient use of the system and may also include accounting for cost allocation of processor time, mass storage, printing, and other resources. For hardware functions such as input and output and memory allocation, the operating system acts as an intermediary between programs and the computer hardware,[1][2] although the application code is usually executed directly by the hardware and will frequently make a system call to an OS function or be interrupted by it. Operating systems can be found on almost any device that contains a computer, from cellular phones and video game consoles to supercomputers and web servers. Examples of popular modern operating systems include Android, BSD, iOS, Linux, Mac OS X, Microsoft Windows,[3] Windows Phone, and IBM z/OS. All these, except Windows and z/OS, share roots in UNIX.

Types
Real-time
A real-time operating system is a multitasking operating system that aims at executing real-time applications. Real-time operating systems often use specialized scheduling algorithms so that they can achieve a deterministic nature of behavior. The main objective of real-time operating systems is their quick and predictable response to events. They have an event-driven or time-sharing design and often aspects of both. An event-driven system switches between tasks based on their priorities or external events, while time-sharing operating systems switch tasks based on clock interrupts.

Multi-user
A multi-user operating system allows multiple users to access a computer system concurrently. Time-sharing systems and Internet servers can be classified as multi-user systems, as they enable multiple-user access to a computer through the sharing of time. Single-user operating systems, as opposed to multi-user operating systems, are usable by a single user at a time. Being able to use multiple accounts on a Windows operating system does not make it a multi-user system. Rather, only the network administrator is the real user. But for a UNIX-like operating system, it is possible for two users to log in at a time, and this capability of the OS makes it a multi-user operating system.

Multi-tasking vs. single-tasking
When only a single program is allowed to run at a time, the system is grouped as a single-tasking system. However, when the operating system allows the execution of multiple tasks at one time, it is classified as a multi-tasking operating system. Multi-tasking can be of two types: pre-emptive or co-operative. In pre-emptive multitasking, the operating system slices the CPU time and dedicates one slot to each of the programs. Unix-like operating systems such as Solaris and Linux support pre-emptive multitasking, as does AmigaOS. Cooperative multitasking is achieved by relying on each process to give time to the other processes in a defined manner. 16-bit versions of Microsoft Windows used cooperative multi-tasking. 32-bit versions, both Windows NT and Win9x, used pre-emptive multi-tasking. Mac OS prior to OS X used to support cooperative multitasking.

Distributed
Further information: Distributed system
A distributed operating system manages a group of independent computers and makes them appear to be a single computer. The development of networked computers that could be linked and communicate with each other gave rise to distributed computing. Distributed computations are carried out on more than one machine. When computers in a group work in cooperation, they make a distributed system.

Embedded
Embedded operating systems are designed to be used in embedded computer systems. They are designed to operate on small machines like PDAs with less autonomy. They are able to operate with a limited number of resources. They are very compact and extremely efficient by design. Windows CE and Minix 3 are some examples of embedded operating systems.


History
Early computers were built to perform a series of single tasks, like a calculator. Operating systems did not exist in their modern and more complex forms until the early 1960s.[4] Basic operating system features were developed in the 1950s, such as resident monitor functions that could automatically run different programs in succession to speed up processing. Hardware features were added that enabled use of runtime libraries, interrupts, and parallel processing. When personal computers became popular in the 1980s, operating systems were made for them similar in concept to those used on larger computers.

In the 1940s, the earliest electronic digital systems had no operating systems. Electronic systems of this time were programmed on rows of mechanical switches or by jumper wires on plug boards. These were special-purpose systems that, for example, generated ballistics tables for the military or controlled the printing of payroll checks from data on punched paper cards. After programmable general purpose computers were invented, machine languages (consisting of strings of the binary digits 0 and 1 on punched paper tape) were introduced that sped up the programming process (Stern, 1981).

In the early 1950s, a computer could execute only one program at a time. Each user had sole use of the computer for a limited period of time and would arrive at a scheduled time with program and data on punched paper cards and/or punched tape. The program would be loaded into the machine, and the machine would be set to work until the program completed or crashed. Programs could generally be debugged via a front panel using toggle switches and panel lights. It is said that Alan Turing was a master of this on the early Manchester Mark 1 machine, and he was already deriving the primitive conception of an operating system from the principles of the Universal Turing machine.[4]

Later machines came with libraries of programs, which would be linked to a user's program to assist in operations such as input and output and generating computer code from human-readable symbolic code. This was the genesis of the modern-day computer system. However, machines still ran a single job at a time. At Cambridge University in England the job queue was at one time a washing line from which tapes were hung with different colored clothes-pegs to indicate job priority.

OS/360 was used on most IBM mainframe computers beginning in 1966, including the computers that helped NASA put a man on the moon.


Mainframes
Through the 1950s, many major features were pioneered in the field of operating systems, including batch processing, input/output interrupt, buffering, multitasking, spooling, runtime libraries, link-loading, and programs for sorting records in files. These features were included or not included in application software at the option of application programmers, rather than in a separate operating system used by all applications. In 1959 the SHARE Operating System was released as an integrated utility for the IBM 704, and later in the 709 and 7090 mainframes, although it was quickly supplanted by IBSYS/IBJOB on the 709, 7090 and 7094.

During the 1960s, IBM's OS/360 introduced the concept of a single OS spanning an entire product line, which was crucial for the success of the System/360 machines. IBM's current mainframe operating systems are distant descendants of this original system, and applications written for OS/360 can still be run on modern machines. OS/360 also pioneered the concept that the operating system keeps track of all of the system resources that are used, including program and data space allocation in main memory and file space in secondary storage, and file locking during update. When the process is terminated for any reason, all of these resources are re-claimed by the operating system. The alternative CP-67 system for the S/360-67 started a whole line of IBM operating systems focused on the concept of virtual machines. Other operating systems used on IBM S/360 series mainframes included systems developed by IBM: COS/360 (Compatibility Operating System), DOS/360 (Disk Operating System), TSS/360 (Time Sharing System), TOS/360 (Tape Operating System), BOS/360 (Basic Operating System), and ACP (Airline Control Program), as well as a few non-IBM systems: MTS (Michigan Terminal System), MUSIC (Multi-User System for Interactive Computing), and ORVYL (Stanford Timesharing System).

Control Data Corporation developed the SCOPE operating system in the 1960s, for batch processing. In cooperation with the University of Minnesota, the Kronos and later the NOS operating systems were developed during the 1970s, which supported simultaneous batch and timesharing use. Like many commercial timesharing systems, its interface was an extension of the Dartmouth BASIC operating systems, one of the pioneering efforts in timesharing and programming languages. In the late 1970s, Control Data and the University of Illinois developed the PLATO operating system, which used plasma panel displays and long-distance time sharing networks. Plato was remarkably innovative for its time, featuring real-time chat and multi-user graphical games.

Burroughs Corporation introduced the B5000 in 1961 with the MCP (Master Control Program) operating system. The B5000 was a stack machine designed to exclusively support high-level languages, with no machine language or assembler, and indeed the MCP was the first OS to be written exclusively in a high-level language, ESPOL, a dialect of ALGOL. MCP also introduced many other ground-breaking innovations, such as being the first commercial implementation of virtual memory. During development of the AS/400, IBM made an approach to Burroughs to licence MCP to run on the AS/400 hardware. This proposal was declined by Burroughs management to protect its existing hardware production. MCP is still in use today in the Unisys ClearPath/MCP line of computers.

UNIVAC, the first commercial computer manufacturer, produced a series of EXEC operating systems.
Like all early main-frame systems, this was a batch-oriented system that managed magnetic drums, disks, card readers and line printers. In the 1970s, UNIVAC produced the Real-Time Basic (RTB) system to support large-scale time sharing, also patterned after the Dartmouth BC system. General Electric and MIT developed General Electric Comprehensive Operating Supervisor (GECOS), which introduced the concept of ringed security privilege levels. After acquisition by Honeywell it was renamed to General Comprehensive Operating System (GCOS). Digital Equipment Corporation developed many operating systems for its various computer lines, including TOPS-10 and TOPS-20 time sharing systems for the 36-bit PDP-10 class systems. Prior to the widespread use of UNIX, TOPS-10 was a particularly popular system in universities, and in the early ARPANET community.

In the late 1960s through the late 1970s, several hardware capabilities evolved that allowed similar or ported software to run on more than one system. Early systems had utilized microprogramming to implement features on their systems in order to permit different underlying computer architectures to appear to be the same as others in a series. In fact, most 360s after the 360/40 (except the 360/165 and 360/168) were microprogrammed implementations. But soon other means of achieving application compatibility proved to be more significant. The enormous investment in software for these systems, made since the 1960s, caused most of the original computer manufacturers to continue to develop compatible operating systems along with the hardware. The notable supported mainframe operating systems include:
Burroughs MCP: B5000, 1961 to Unisys Clearpath/MCP, present.
IBM OS/360: IBM System/360, 1966 to IBM z/OS, present.
IBM CP-67: IBM System/360, 1967 to IBM z/VM, present.
UNIVAC EXEC 8: UNIVAC 1108, 1967, to OS 2200 Unisys Clearpath Dorado, present.


Microcomputers
The first microcomputers did not have the capacity or need for the elaborate operating systems that had been developed for mainframes and minis; minimalistic operating systems were developed, often loaded from ROM and known as monitors. One notable early disk operating system was CP/M, which was supported on many early microcomputers and was closely imitated by Microsoft's MS-DOS, which became wildly popular as the operating system chosen for the IBM PC (IBM's version of it was called IBM DOS or PC DOS).
PC-DOS was an early personal computer OS that featured a command line interface.
In the '80s, Apple Computer Inc. (now Apple Inc.) abandoned its popular Apple II series of microcomputers to introduce the Apple Macintosh computer with an innovative Graphical User Interface (GUI) to the Mac OS. The introduction of the Intel 80386 CPU chip, with 32-bit architecture and paging capabilities, provided personal computers with the ability to run multitasking operating systems like those of earlier minicomputers and mainframes. Microsoft responded to this progress by hiring Dave Cutler, who had developed the VMS operating system for Digital Equipment Corporation. He would lead the development of the Windows NT operating system, which continues to serve as the basis for Microsoft's operating systems line. Steve Jobs, a co-founder of Apple Inc., started NeXT Computer Inc., which developed the NEXTSTEP operating system. NEXTSTEP would later be acquired by Apple Inc. and used, along with code from FreeBSD, as the core of Mac OS X. The GNU Project was started by activist and programmer Richard Stallman with the goal of creating a complete free software replacement to the proprietary UNIX operating system. While the project was highly successful in duplicating the functionality of various parts of UNIX, development of the GNU Hurd kernel proved to be unproductive. In 1991, Finnish computer science student Linus Torvalds, with cooperation from volunteers collaborating over the Internet, released the first version of the Linux kernel. It was soon merged with the GNU user space components and system software to form a complete operating system. Since then, the combination of the two major components has usually been referred to as simply "Linux" by the software industry, a naming convention that Stallman and the Free Software Foundation remain opposed to, preferring the name GNU/Linux. The Berkeley Software Distribution, known as BSD, is the UNIX derivative distributed by the University of California, Berkeley, starting in the 1970s. Freely distributed and ported to many minicomputers, it eventually also gained a following for use on PCs, mainly as FreeBSD, NetBSD and OpenBSD.


Examples of operating systems


UNIX and UNIX-like operating systems
Ken Thompson wrote B, mainly based on BCPL, which he used to write Unix, based on his experience in the MULTICS project. B was replaced by C, and Unix developed into a large, complex family of inter-related operating systems which have been influential in every modern operating system (see History). The UNIX-like family is a diverse group of operating systems, with several major sub-categories including System V, BSD, and Linux. The name "UNIX" is a trademark of The Open Group, which licenses it for use with any operating system that has been shown to conform to their definitions. "UNIX-like" is commonly used to refer to the large set of operating systems which resemble the original UNIX.
Evolution of Unix systems
Unix-like systems run on a wide variety of computer architectures. They are used heavily for servers in business, as well as workstations in academic and engineering environments. Free UNIX variants, such as Linux and BSD, are popular in these areas. Four operating systems are certified by The Open Group (holder of the Unix trademark) as Unix. HP's HP-UX and IBM's AIX are both descendants of the original System V Unix and are designed to run only on their respective vendor's hardware. In contrast, Sun Microsystems's Solaris Operating System can run on multiple types of hardware, including x86 and Sparc servers, and PCs. Apple's Mac OS X, a replacement for Apple's earlier (non-Unix) Mac OS, is a hybrid kernel-based BSD variant derived from NeXTSTEP, Mach, and FreeBSD. Unix interoperability was sought by establishing the POSIX standard. The POSIX standard can be applied to any operating system, although it was originally created for various Unix variants.

BSD and its descendants
A subgroup of the Unix family is the Berkeley Software Distribution family, which includes FreeBSD, NetBSD, OpenBSD, and PC-BSD. These operating systems are most commonly found on webservers, although they can also function as a personal computer OS. The Internet owes much of its existence to BSD, as many of the protocols now commonly used by computers to connect, send and receive data over a network were widely implemented and refined in BSD. The World Wide Web was also first demonstrated on a number of computers running an OS based on BSD called NextStep. BSD has its roots in Unix. In 1974, University of California, Berkeley installed its first Unix system.
The first server for the World Wide Web ran on NeXTSTEP, based on BSD.

Over time, students and staff in the computer science department there began adding new programs to make things easier, such as text editors. When Berkeley received new VAX computers in 1978 with Unix installed, the school's undergraduates modified Unix even more in order to take advantage of the computer's hardware possibilities. The Defense Advanced Research Projects Agency of the US Department of Defense took interest, and decided to fund the project. Many schools, corporations, and government organizations took notice and started to use Berkeley's version of Unix instead of the official one distributed by AT&T. Steve Jobs, upon leaving Apple Inc. in 1985, formed NeXT Inc., a company that manufactured high-end computers running on a variation of BSD called NeXTSTEP. One of these computers was used by Tim Berners-Lee as the first webserver to create the World Wide Web. Developers like Keith Bostic encouraged the project to replace any non-free code that originated with Bell Labs. Once this was done, however, AT&T sued. Eventually, after two years of legal disputes, the BSD project came out ahead and spawned a number of free derivatives, such as FreeBSD and NetBSD.

OS X
Main article: OS X
Mac OS X is a line of open core graphical operating systems developed, marketed, and sold by Apple Inc., the latest of which is pre-loaded on all currently shipping Macintosh computers. Mac OS X is the successor to the original Mac OS, which had been Apple's primary operating system since 1984. Unlike its predecessor, Mac OS X is a UNIX operating system built on technology that had been developed at NeXT through the second half of the 1980s and up until Apple purchased the company in early 1997. The operating system was first released in 1999 as Mac OS X Server 1.0, with a desktop-oriented version (Mac OS X v10.0 "Cheetah") following in March 2001. Since then, six more distinct "client" and "server" editions of Mac OS X have been released, the most recent being OS X 10.8 "Mountain Lion", which was first made available on February 16, 2012 for developers, and was then released to the public on July 25, 2012. Releases of Mac OS X are named after big cats. The server edition, Mac OS X Server, is architecturally identical to its desktop counterpart but usually runs on Apple's line of Macintosh server hardware. Mac OS X Server includes work group management and administration software tools that provide simplified access to key network services, including a mail transfer agent, a Samba server, an LDAP server, a domain name server, and others. In Mac OS X v10.7 Lion, all server aspects of Mac OS X Server have been integrated into the client version.[5]

Linux and GNU
Linux (or GNU/Linux) is a Unix-like operating system that was developed without any actual Unix code, unlike BSD and its variants. Linux can be used on a wide range of devices from supercomputers to wristwatches. The Linux kernel is released under an open source license, so anyone can read and modify its code. It has been modified to run on a large variety of electronics. Although estimates suggest that Linux is used on 1.82% of all personal computers,[6][7] it has been widely adopted for use in servers[8] and embedded systems[9] (such as cell phones). Linux has superseded Unix in most places, and is used on the 10 most powerful supercomputers in the world.[10]


Ubuntu, desktop Linux distribution

The Linux kernel is used in some popular distributions, such as Red Hat, Debian, Ubuntu, Linux Mint and Google's Android. The GNU project is a mass collaboration of programmers who seek to create a completely free and open operating system that was similar to Unix but with completely original code. It was started in 1983 by Richard Stallman, and is responsible for many of the parts of most Linux variants. Thousands of pieces of software for virtually every operating system are licensed under the GNU General Public License. Meanwhile, the Linux kernel began as a side project of Linus Torvalds, a university student from Finland. In 1991, Torvalds began work on it, and posted information about his project on a newsgroup for computer students and programmers. He received a wave of support and volunteers who ended up creating a full-fledged kernel. Programmers from GNU took notice, and members of both projects worked to integrate the finished GNU parts with the Linux kernel in order to create a full-fledged operating system.

Google Chrome OS
Chrome is an operating system based on the Linux kernel and designed by Google. Since Chrome OS targets computer users who spend most of their time on the Internet, it is mainly a web browser with no ability to run applications. It relies on Internet applications (or Web apps) used in the web browser to accomplish tasks such as word processing and media viewing, as well as online storage for storing most files.
Android, a popular mobile operating system using the Linux kernel


Microsoft Windows
Microsoft Windows is a family of proprietary operating systems designed by Microsoft Corporation and primarily targeted to Intel architecture-based computers, with an estimated 88.9 percent total usage share on Web connected computers.[7][11][12][13] The newest version is Windows 7 for workstations and Windows Server 2008 R2 for servers. Windows 7 recently overtook Windows XP as the most used OS.[14][15][16]

Bootable Windows To Go USB flash drive

Microsoft Windows originated in 1985 as an operating environment running on top of MS-DOS, which was the standard operating system shipped on most Intel architecture personal computers at the time. In 1995, Windows 95 was released, which used MS-DOS only as a bootstrap. For backwards compatibility, Win9x could run real-mode MS-DOS[17][18] and 16-bit Windows 3.x[19] drivers. Windows Me, released in 2000, was the last version in the Win9x family. Later versions have all been based on the Windows NT kernel. Current versions of Windows run on IA-32 and x86-64 microprocessors, although Windows 8 will support ARM architecture. In the past, Windows NT supported non-Intel architectures. Server editions of Windows are widely used. In recent years, Microsoft has expended significant capital in an effort to promote the use of Windows as a server operating system. However, Windows' usage on servers is not as widespread as on personal computers, as Windows competes against Linux and BSD for server market share.[20][21]


Other
There have been many operating systems that were significant in their day but are no longer so, such as AmigaOS; OS/2 from IBM and Microsoft; Mac OS, the non-Unix precursor to Apple's Mac OS X; BeOS; XTS-300; RISC OS; MorphOS and FreeMint. Some are still used in niche markets and continue to be developed as minority platforms for enthusiast communities and specialist applications. OpenVMS, formerly from DEC, is still under active development by Hewlett-Packard. Yet other operating systems are used almost exclusively in academia, for operating systems education or to do research on operating system concepts. A typical example of a system that fulfills both roles is MINIX, while for example Singularity is used purely for research. Other operating systems have failed to win significant market share, but have introduced innovations that have influenced mainstream operating systems, not least Bell Labs' Plan 9.

Components
The components of an operating system all exist in order to make the different parts of a computer work together. All user software needs to go through the operating system in order to use any of the hardware, whether it be as simple as a mouse or keyboard or as complex as an Internet connection.

Kernel
With the aid of the firmware and device drivers, the kernel provides the most basic level of control over all of the computer's hardware devices. It manages memory access for programs in the RAM, it determines which programs get access to which hardware resources, it sets up or resets the CPU's operating states for optimal operation at all times, and it organizes the data for long-term non-volatile storage with file systems on such media as disks, tapes, flash memory, etc.

Program execution
A kernel connects the application software to the hardware of a computer.
The operating system provides an interface between an application program and the computer hardware, so that an application program can interact with the hardware only by obeying rules and procedures programmed into the operating system. The operating system is also a set of services which simplify development and execution of application programs. Executing an application program involves the creation of a process by the operating system kernel, which assigns memory space and other resources, establishes a priority for the process in multi-tasking systems, loads program binary code into memory, and initiates execution of the application program, which then interacts with the user and with hardware devices.
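
As a concrete illustration of the sequence just described, the following minimal POSIX C sketch asks the kernel to create a process, loads a program image into it, and waits for the result; the choice of "ls -l" as the program to run is purely illustrative and not taken from any particular system.

    /* Minimal sketch of program execution: the parent asks the kernel to
       create a process (fork), the child's image is replaced by a new
       program (execlp), and the parent waits for it to terminate. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();                 /* kernel creates a new process  */
        if (pid < 0) {
            perror("fork");
            return EXIT_FAILURE;
        }
        if (pid == 0) {                     /* child: load and run a program */
            execlp("ls", "ls", "-l", (char *)NULL);
            perror("execlp");               /* only reached if exec fails    */
            _exit(127);
        }
        int status;
        waitpid(pid, &status, 0);           /* parent: wait for termination  */
        printf("child exited with status %d\n", WEXITSTATUS(status));
        return EXIT_SUCCESS;
    }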

Interrupts
Interrupts are central to operating systems, as they provide an efficient way for the operating system to interact with and react to its environment. The alternative, having the operating system "watch" the various sources of input for events (polling) that require action, can be found in older systems with very small stacks (50 or 60 bytes) but is unusual in modern systems with large stacks. Interrupt-based programming is directly supported by most modern CPUs. Interrupts provide a computer with a way of automatically saving local register contexts, and running specific code in response to events. Even very basic computers support hardware interrupts, and allow the programmer to specify code which may be run when that event takes place. When an interrupt is received, the computer's hardware automatically suspends whatever program is currently running, saves its status, and runs computer code previously associated with the interrupt; this is analogous to placing a bookmark in a book in response to a phone call.

In modern operating systems, interrupts are handled by the operating system's kernel. Interrupts may come from either the computer's hardware or from the running program. When a hardware device triggers an interrupt, the operating system's kernel decides how to deal with this event, generally by running some processing code. The amount of code being run depends on the priority of the interrupt (for example: a person usually responds to a smoke detector alarm before answering the phone). The processing of hardware interrupts is a task that is usually delegated to software called a device driver, which may be either part of the operating system's kernel, part of another program, or both. Device drivers may then relay information to a running program by various means. A program may also trigger an interrupt to the operating system. If a program wishes to access hardware, for example, it may interrupt the operating system's kernel, which causes control to be passed back to the kernel. The kernel will then process the request. If a program wishes additional resources (or wishes to shed resources) such as memory, it will trigger an interrupt to get the kernel's attention.

Modes
Privilege rings for the x86, available in protected mode. Operating systems determine which processes run in each mode.
Modern CPUs support multiple modes of operation. CPUs with this capability use at least two modes: protected mode and supervisor mode. The supervisor mode is used by the operating system's kernel for low level tasks that need unrestricted access to hardware, such as controlling how memory is written and erased, and communication with devices like graphics cards. Protected mode, in contrast, is used for almost everything else. Applications operate within protected mode, and can only use hardware by communicating with the kernel, which controls everything in supervisor mode. CPUs might have other modes similar to protected mode as well, such as the virtual modes used to emulate older processor types, such as 16-bit processors on a 32-bit one, or 32-bit processors on a 64-bit one.

When a computer first starts up, it is automatically running in supervisor mode. The first few programs to run on the computer, being the BIOS or EFI, the bootloader, and the operating system, have unlimited access to hardware, and this is required because, by definition, initializing a protected environment can only be done outside of one. However, when the operating system passes control to another program, it can place the CPU into protected mode. In protected mode, programs may have access to a more limited set of the CPU's instructions. A user program may leave protected mode only by triggering an interrupt, causing control to be passed back to the kernel. In this way the operating system can maintain exclusive control over things like access to hardware and memory. The term "protected mode resource" generally refers to one or more CPU registers, which contain information that the running program isn't allowed to alter. Attempts to alter these resources generally cause a switch to supervisor mode, where the operating system can deal with the illegal operation the program was attempting (for example, by killing the program).
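
In practice the trap from protected mode into supervisor mode is hidden behind a system-call wrapper. A hedged POSIX sketch: the write() call below issues the trap on the program's behalf, the kernel performs the output in supervisor mode, and control returns to the program in protected mode; the wording of the comments reflects the description above rather than the internals of any particular kernel.

    /* Sketch: a user-mode program cannot touch output hardware directly;
       it asks the kernel via a system call.  write() traps into the
       kernel, which performs the I/O and then returns control. */
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "hello from user mode\n";
        /* File descriptor 1 is standard output; the kernel decides what
           device, file, or pipe that actually refers to. */
        ssize_t n = write(1, msg, strlen(msg));
        return (n == (ssize_t)strlen(msg)) ? 0 : 1;
    }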


Memory management
Among other things, a multiprogramming operating system kernel must be responsible for managing all system memory which is currently in use by programs. This ensures that a program does not interfere with memory already in use by another program. Since programs time share, each program must have independent access to memory.

Cooperative memory management, used by many early operating systems, assumes that all programs make voluntary use of the kernel's memory manager, and do not exceed their allocated memory. This system of memory management is almost never seen any more, since programs often contain bugs which can cause them to exceed their allocated memory. If a program fails, it may cause memory used by one or more other programs to be affected or overwritten. Malicious programs or viruses may purposefully alter another program's memory, or may affect the operation of the operating system itself. With cooperative memory management, it takes only one misbehaved program to crash the system.

Memory protection enables the kernel to limit a process' access to the computer's memory. Various methods of memory protection exist, including memory segmentation and paging. All methods require some level of hardware support (such as the 80286 MMU), which doesn't exist in all computers. In both segmentation and paging, certain protected mode registers specify to the CPU what memory address it should allow a running program to access. Attempts to access other addresses will trigger an interrupt which will cause the CPU to re-enter supervisor mode, placing the kernel in charge. This is called a segmentation violation or Seg-V for short, and since it is both difficult to assign a meaningful result to such an operation, and because it is usually a sign of a misbehaving program, the kernel will generally resort to terminating the offending program, and will report the error. Windows 3.1-Me had some level of memory protection, but programs could easily circumvent the need to use it. A general protection fault would be produced, indicating a segmentation violation had occurred; however, the system would often crash anyway.
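
The protection mechanism described above can be observed from user space on most Unix-like systems. The sketch below is a hedged example, assuming the widely available (but not strictly POSIX) MAP_ANONYMOUS flag: it asks the kernel to mark a page read-only, then deliberately writes to it, producing exactly the segmentation violation discussed in the text.

    /* Sketch of memory protection: mprotect marks a page read-only, so the
       following store triggers a hardware protection fault (SIGSEGV).
       The handler reports it and exits instead of letting the kernel
       terminate the process silently. */
    #include <signal.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void on_segv(int sig)
    {
        (void)sig;
        const char msg[] = "caught SIGSEGV: illegal write to protected page\n";
        write(2, msg, sizeof msg - 1);       /* async-signal-safe output */
        _exit(1);
    }

    int main(void)
    {
        signal(SIGSEGV, on_segv);

        long pagesz = sysconf(_SC_PAGESIZE);
        char *page = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED) { perror("mmap"); return 1; }

        page[0] = 'x';                                /* allowed: writable page */
        if (mprotect(page, (size_t)pagesz, PROT_READ) != 0) {
            perror("mprotect");
            return 1;
        }
        page[0] = 'y';                                /* faults: handler runs   */

        puts("never reached");
        return 0;
    }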


Virtual memory
Further information: Page fault
The use of virtual memory addressing (such as paging or segmentation) means that the kernel can choose what memory each program may use at any given time, allowing the operating system to use the same memory locations for multiple tasks. If a program tries to access memory that isn't in its current range of accessible memory, but nonetheless has been allocated to it, the kernel will be interrupted in the same way as it would if the program were to exceed its allocated memory. (See section on memory management.) Under UNIX this kind of interrupt is referred to as a page fault. When the kernel detects a page fault it will generally adjust the virtual memory range of the program which triggered it, granting it access to the memory requested. This gives the kernel discretionary power over where a particular application's memory is stored, or even whether or not it has actually been allocated yet. In modern operating systems, memory which is accessed less frequently can be temporarily stored on disk or other media to make that space available for use by other programs. This is called swapping, as an area of memory can be used by multiple programs, and what that memory area contains can be swapped or exchanged on demand. "Virtual memory" provides the programmer or the user with the perception that there is a much larger amount of RAM in the computer than is really there.[22]

Multitasking
Further information: Context switch, Preemptive multitasking, and Cooperative multitasking
Multitasking refers to the running of multiple independent computer programs on the same computer, giving the appearance that it is performing the tasks at the same time. Since most computers can do at most one or two things at one time, this is generally done via time-sharing, which means that each program uses a share of the computer's time to execute. An operating system kernel contains a piece of software called a scheduler which determines how much time each program will spend executing, and in which order execution control should be passed to programs. Control is passed to a process by the kernel, which allows the program access to the CPU and memory. Later, control is returned to the kernel through some mechanism, so that another program may be allowed to use the CPU. This so-called passing of control between the kernel and applications is called a context switch. An early model which governed the allocation of time to programs was called cooperative multitasking. In this model, when control is passed to a program by the kernel, it may execute for as long as it wants before explicitly returning control to the kernel. This means that a malicious or malfunctioning program may not only prevent any other programs from using the CPU, but it can hang the entire system if it enters an infinite loop.
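
To make the cooperative model concrete, here is a minimal sketch using the POSIX ucontext interface (still common on Linux and the BSDs, though deprecated in newer standards): two tasks share one CPU, and each must explicitly yield before the other can run, so a task that never yields would stall this toy system exactly as described above. The task names and stack sizes are invented for illustration.

    /* Sketch of cooperative multitasking: each task voluntarily hands
       control to the other with swapcontext(); nothing preempts them. */
    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, task_a_ctx, task_b_ctx;

    static void task_a(void)
    {
        for (int i = 0; i < 3; i++) {
            printf("task A, step %d\n", i);
            swapcontext(&task_a_ctx, &task_b_ctx);   /* voluntarily yield to B */
        }
    }

    static void task_b(void)
    {
        for (int i = 0; i < 3; i++) {
            printf("task B, step %d\n", i);
            swapcontext(&task_b_ctx, &task_a_ctx);   /* voluntarily yield to A */
        }
    }

    int main(void)
    {
        static char stack_a[64 * 1024], stack_b[64 * 1024];

        getcontext(&task_a_ctx);
        task_a_ctx.uc_stack.ss_sp   = stack_a;
        task_a_ctx.uc_stack.ss_size = sizeof stack_a;
        task_a_ctx.uc_link          = &main_ctx;     /* return here when A ends */
        makecontext(&task_a_ctx, task_a, 0);

        getcontext(&task_b_ctx);
        task_b_ctx.uc_stack.ss_sp   = stack_b;
        task_b_ctx.uc_stack.ss_size = sizeof stack_b;
        task_b_ctx.uc_link          = &main_ctx;
        makecontext(&task_b_ctx, task_b, 0);

        swapcontext(&main_ctx, &task_a_ctx);         /* start the first task */
        return 0;
    }

Running the sketch interleaves the two tasks (A0, B0, A1, B1, ...); deleting either swapcontext call makes one task monopolize the processor, which is precisely the weakness that preemptive multitasking, discussed next, was designed to remove.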


Many operating systems can "trick" programs into using memory scattered around the hard disk and RAM as if it is one continuous chunk of memory, called virtual memory.

Modern operating systems extend the concepts of application preemption to device drivers and kernel code, so that the operating system has preemptive control over internal run-times as well. The philosophy governing preemptive multitasking is that of ensuring that all programs are given regular time on the CPU. This implies that all programs must be limited in how much time they are allowed to spend on the CPU without being interrupted. To accomplish this, modern operating system kernels make use of a timed interrupt. A protected mode timer is set by the kernel which triggers a return to supervisor mode after the specified time has elapsed. (See above sections on Interrupts and Dual Mode Operation.) On many single user operating systems cooperative multitasking is perfectly adequate, as home computers generally run a small number of well tested programs. The AmigaOS is an exception, having pre-emptive multitasking from its very first version. Windows NT was the first version of Microsoft Windows which enforced preemptive multitasking, but it didn't reach the home user market until Windows XP (since Windows NT was targeted at professionals).

Disk access and file systems
Access to data stored on disks is a central feature of all operating systems. Computers store data on disks using files, which are structured in specific ways in order to allow for faster access, higher reliability, and to make better use out of the drive's available space. The specific way in which files are stored on a disk is called a file system, and enables files to have names and attributes. It also allows them to be stored in a hierarchy of directories or folders arranged in a directory tree.
Filesystems allow users and programs to organize and sort files on a computer, often through the use of directories (or "folders").
Early operating systems generally supported a single type of disk drive and only one kind of file system. Early file systems were limited in their capacity, speed, and in the kinds of file names and directory structures they could use. These limitations often reflected limitations in the operating systems they were designed for, making it very difficult for an operating system to support more than one file system. While many simpler operating systems support a limited range of options for accessing storage systems, operating systems like UNIX and Linux support a technology known as a virtual file system or VFS. An operating system such as UNIX supports a wide array of storage devices, regardless of their design or file systems, allowing them to be accessed through a common application programming interface (API). This makes it unnecessary for programs to have any knowledge about the device they are accessing. A VFS allows the operating system to provide programs with access to an unlimited number of devices with an infinite variety of file systems installed on them, through the use of specific device drivers and file system drivers. A connected storage device, such as a hard drive, is accessed through a device driver. The device driver understands the specific language of the drive and is able to translate that language into a standard language used by the operating system to access all disk drives. On UNIX, this is the language of block devices. When the kernel has an appropriate device driver in place, it can then access the contents of the disk drive in raw format, which may contain one or more file systems.
A file system driver is used to translate the commands used to access each specific file system into a standard set of commands that the operating system can use to talk to all file systems. Programs can then deal with these file systems on the basis of filenames, and directories/folders, contained within a hierarchical structure. They can create, delete, open, and close files, as well as gather various information about them, including access permissions, size, free space, and creation and modification dates.
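
The practical effect of the VFS is that one set of calls works for every mounted file system. A minimal POSIX sketch follows; the path "/etc/hostname" is chosen only as an example and the same calls would work unchanged on a local disk, a USB stick, or a network mount.

    /* Sketch: through the virtual file system, the same open/stat/read
       calls work regardless of which file system driver backs the file. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/etc/hostname";          /* illustrative path */
        struct stat st;
        if (stat(path, &st) == 0)                    /* metadata via the VFS */
            printf("%s: %lld bytes, mode %o\n",
                   path, (long long)st.st_size,
                   (unsigned)(st.st_mode & 07777));

        int fd = open(path, O_RDONLY);               /* open via the VFS */
        if (fd >= 0) {
            char buf[256];
            ssize_t n = read(fd, buf, sizeof buf);   /* read, filesystem-agnostic */
            if (n > 0)
                fwrite(buf, 1, (size_t)n, stdout);
            close(fd);
        }
        return 0;
    }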


Various differences between file systems make supporting all file systems difficult. Allowed characters in file names, case sensitivity, and the presence of various kinds of file attributes make the implementation of a single interface for every file system a daunting task. Operating systems tend to recommend using (and so support natively) file systems specifically designed for them; for example, NTFS in Windows and ext3 and ReiserFS in Linux. However, in practice, third-party drivers are usually available to give support for the most widely used file systems in most general-purpose operating systems (for example, NTFS is available in Linux through NTFS-3g, and ext2/3 and ReiserFS are available in Windows through third-party software). Support for file systems is highly varied among modern operating systems, although there are several common file systems which almost all operating systems include support and drivers for. Operating systems vary on file system support and on the disk formats they may be installed on. Under Windows, each file system is usually limited in application to certain media; for example, CDs must use ISO 9660 or UDF, and as of Windows Vista, NTFS is the only file system which the operating system can be installed on. It is possible to install Linux onto many types of file systems. Unlike other operating systems, Linux and UNIX allow any file system to be used regardless of the media it is stored in, whether it is a hard drive, a disc (CD, DVD, ...), a USB flash drive, or even contained within a file located on another file system.

Device drivers
A device driver is a specific type of computer software developed to allow interaction with hardware devices. Typically this constitutes an interface for communicating with the device, through the specific computer bus or communications subsystem that the hardware is connected to, providing commands to and/or receiving data from the device, and on the other end, the requisite interfaces to the operating system and software applications. It is a specialized hardware-dependent computer program, which is also operating system specific, that enables another program, typically an operating system or applications software package or computer program running under the operating system kernel, to interact transparently with a hardware device, and it usually provides the requisite interrupt handling necessary for any asynchronous time-dependent hardware interfacing needs. The key design goal of device drivers is abstraction. Every model of hardware (even within the same class of device) is different. Newer models also are released by manufacturers that provide more reliable or better performance, and these newer models are often controlled differently. Computers and their operating systems cannot be expected to know how to control every device, both now and in the future. To solve this problem, operating systems essentially dictate how every type of device should be controlled. The function of the device driver is then to translate these operating-system-mandated function calls into device-specific calls. In theory a new device, which is controlled in a new manner, should function correctly if a suitable driver is available. This new driver will ensure that the device appears to operate as usual from the operating system's point of view.
Under versions of Windows before Vista and versions of Linux before 2.6, all driver execution was co-operative, meaning that if a driver entered an infinite loop it would freeze the system. More recent revisions of these operating systems incorporate kernel preemption, where the kernel interrupts the driver to give it tasks, and then separates itself from the process until it receives a response from the device driver, or gives it more tasks to do.
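
A common way operating systems realize this abstraction is a table of function pointers that each driver fills in. The sketch below is deliberately simplified and invented for illustration (it is not the interface of any real kernel), but it shows the idea: the OS calls a fixed set of operations, and the driver maps them onto device-specific behaviour.

    /* Sketch of a driver abstraction: the OS dictates the operations,
       each driver supplies a table that implements them.  The "null"
       device here simply discards writes and returns end-of-file. */
    #include <stddef.h>
    #include <stdio.h>

    struct device_ops {                       /* interface dictated by the OS */
        int    (*open) (void);
        size_t (*read) (char *buf, size_t len);
        size_t (*write)(const char *buf, size_t len);
        void   (*close)(void);
    };

    static int    null_open(void)                         { return 0; }
    static size_t null_read(char *buf, size_t len)        { (void)buf; (void)len; return 0; }
    static size_t null_write(const char *buf, size_t len) { (void)buf; return len; }
    static void   null_close(void)                        { }

    static const struct device_ops null_driver = {
        null_open, null_read, null_write, null_close
    };

    /* The OS core calls through the table and never needs device details. */
    int main(void)
    {
        const struct device_ops *dev = &null_driver;
        dev->open();
        size_t n = dev->write("discard me", 10);
        dev->close();
        printf("driver accepted %zu bytes\n", n);
        return 0;
    }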


Networking
Currently most operating systems support a variety of networking protocols, hardware, and applications for using them. This means that computers running dissimilar operating systems can participate in a common network for sharing resources such as computing, files, printers, and scanners using either wired or wireless connections. Networks can essentially allow a computer's operating system to access the resources of a remote computer to support the same functions as it could if those resources were connected directly to the local computer. This includes everything from simple communication, to using networked file systems or even sharing another computer's graphics or sound hardware. Some network services allow the resources of a computer to be accessed transparently, such as SSH, which allows networked users direct access to a computer's command line interface.

Client/server networking allows a program on a computer, called a client, to connect via a network to another computer, called a server. Servers offer (or host) various services to other network computers and users. These services are usually provided through ports or numbered access points beyond the server's network address. Each port number is usually associated with a maximum of one running program, which is responsible for handling requests to that port. A daemon, being a user program, can in turn access the local hardware resources of that computer by passing requests to the operating system kernel. Many operating systems support one or more vendor-specific or open networking protocols as well, for example, SNA on IBM systems, DECnet on systems from Digital Equipment Corporation, and Microsoft-specific protocols (SMB) on Windows. Specific protocols for specific tasks may also be supported, such as NFS for file access. Protocols like ESound, or esd, can be easily extended over the network to provide sound from local applications on a remote system's sound hardware.
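
A minimal sketch of the server side of this model, using the POSIX sockets API; port 9000 is an arbitrary example, and a real daemon would loop over accept() and handle errors and concurrency more carefully.

    /* Sketch of client/server networking: a server binds a numbered port,
       waits for one client connection, and answers it by asking the
       kernel to send data over the network. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int srv = socket(AF_INET, SOCK_STREAM, 0);      /* TCP socket          */
        if (srv < 0) { perror("socket"); return 1; }

        struct sockaddr_in addr = {0};
        addr.sin_family      = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);       /* any local interface */
        addr.sin_port        = htons(9000);             /* example port number */

        if (bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("bind");
            return 1;
        }
        listen(srv, 1);                                 /* announce the service */

        int client = accept(srv, NULL, NULL);           /* wait for one client  */
        if (client >= 0) {
            const char reply[] = "hello from the server\n";
            write(client, reply, sizeof reply - 1);
            close(client);
        }
        close(srv);
        return 0;
    }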


Security
A computer being secure depends on a number of technologies working properly. A modern operating system provides access to a number of resources, which are available to software running on the system, and to external devices like networks via the kernel. The operating system must be capable of distinguishing between requests which should be allowed to be processed, and others which should not be processed. While some systems may simply distinguish between "privileged" and "non-privileged", systems commonly have a form of requester identity, such as a user name. To establish identity there may be a process of authentication. Often a username must be quoted, and each username may have a password. Other methods of authentication, such as magnetic cards or biometric data, might be used instead. In some cases, especially connections from the network, resources may be accessed with no authentication at all (such as reading files over a network share). Also covered by the concept of requester identity is authorization; the particular services and resources accessible by the requester once logged into a system are tied to either the requester's user account or to the variously configured groups of users to which the requester belongs.

In addition to the allow/disallow model of security, a system with a high level of security will also offer auditing options. These would allow tracking of requests for access to resources (such as, "who has been reading this file?"). Internal security, or security from an already running program, is only possible if all possibly harmful requests must be carried out through interrupts to the operating system kernel. If programs can directly access hardware and resources, they cannot be secured. External security involves a request from outside the computer, such as a login at a connected console or some kind of network connection. External requests are often passed through device drivers to the operating system's kernel, where they can be passed onto applications, or carried out directly.

Security of operating systems has long been a concern because of highly sensitive data held on computers, both of a commercial and military nature. The United States Government Department of Defense (DoD) created the Trusted Computer System Evaluation Criteria (TCSEC), which is a standard that sets basic requirements for assessing the effectiveness of security. This became of vital importance to operating system makers, because the TCSEC was used to evaluate, classify and select trusted operating systems being considered for the processing, storage and retrieval of sensitive or classified information.

Network services include offerings such as file sharing, print services, email, web sites, and file transfer protocols (FTP), most of which can have compromised security. At the front line of security are hardware devices known as firewalls or intrusion detection/prevention systems. At the operating system level, there are a number of software firewalls available, as well as intrusion detection/prevention systems. Most modern operating systems include a software firewall, which is enabled by default. A software firewall can be configured to allow or deny network traffic to or from a service or application running on the operating system. Therefore, one can install and be running an insecure service, such as Telnet or FTP, and not have to be threatened by a security breach because the firewall would deny all traffic trying to connect to the service on that port.

An alternative strategy, and the only sandbox strategy available in systems that do not meet the Popek and Goldberg virtualization requirements, is for the operating system not to run user programs as native code, but instead to either emulate a processor or provide a host for a p-code based system such as Java. Internal security is especially relevant for multi-user systems; it allows each user of the system to have private files that the other users cannot tamper with or read. Internal security is also vital if auditing is to be of any use, since a program can potentially bypass the operating system, inclusive of bypassing auditing.
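
One internal-security practice that follows from this discussion is for a service to give up elevated rights before touching untrusted input, so that a later compromise is confined. A hedged POSIX sketch; the uid/gid value 65534 (the "nobody" account on many systems) is only an example, and a production service would typically look up a dedicated account instead.

    /* Sketch of privilege dropping: do the privileged setup first, then
       permanently switch to an unprivileged user before serving requests. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        /* ... privileged setup would happen here (e.g. binding port 80) ... */

        if (getuid() == 0) {                     /* running as the superuser? */
            if (setgid(65534) != 0 || setuid(65534) != 0) {
                perror("failed to drop privileges");
                return 1;                        /* refuse to continue as root */
            }
        }
        printf("now running as uid %d, gid %d\n", (int)getuid(), (int)getgid());

        /* ... handle untrusted requests here, with reduced privileges ... */
        return 0;
    }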


User interface
Every computer that is to be operated by an individual requires a user interface. The user interface is not actually a part of the operating system; it generally runs in a separate program usually referred to as a shell, but is essential if human interaction is to be supported. The user interface requests services from the operating system that will acquire data from input hardware devices, such as a keyboard, mouse or credit card reader, and requests operating system services to display prompts, status messages and such on output hardware devices, such as a video monitor or printer. The two most common forms of a user interface have historically been the command-line interface, where computer commands are typed out line-by-line, and the graphical user interface, where a visual environment (most commonly a WIMP) is present.
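
A command-line shell is itself just an ordinary user program that calls into the operating system. The sketch below is a deliberately minimal shell loop in POSIX C, shown only to make that division of labour concrete: it handles single-word commands with no arguments or parsing, which a real shell would of course add.

    /* Sketch of a shell: read a command, ask the kernel to run it as a
       new process, wait for it to finish, then print the next prompt. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        char line[256];
        for (;;) {
            fputs("$ ", stdout);                       /* the prompt            */
            if (!fgets(line, sizeof line, stdin))
                break;                                 /* end of input: exit    */
            line[strcspn(line, "\n")] = '\0';
            if (line[0] == '\0')
                continue;

            pid_t pid = fork();                        /* ask kernel for a process */
            if (pid == 0) {
                execlp(line, line, (char *)NULL);      /* run the named program */
                perror(line);
                _exit(127);
            }
            if (pid > 0)
                waitpid(pid, NULL, 0);                 /* wait, then prompt again */
        }
        return 0;
    }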

A screenshot of the Bourne Again Shell command line. Each command is typed out after the 'prompt', and then its output appears below, working its way down the screen. The current command prompt is at the bottom.

Graphical user interfaces
Most of the modern computer systems support graphical user interfaces (GUI), and often include them. In some computer systems, such as the original implementation of Mac OS, the GUI is integrated into the kernel. While technically a graphical user interface is not an operating system service, incorporating support for one into the operating system kernel can allow the GUI to be more responsive by reducing the number of context switches required for the GUI to perform its output functions. Other operating systems are modular, separating the graphics subsystem from the kernel and the operating system. In the 1980s UNIX, VMS and many others had operating systems that

A screenshot of the KDE Plasma Desktop graphical user interface. Programs take the form of images on the screen, and the files, folders (directories), and applications take the form of icons and symbols. A mouse is used to navigate the computer.

were built this way. Linux and Mac OS X are also built this way. Modern releases of Microsoft Windows such as Windows Vista implement a graphics subsystem that is mostly in user-space; however the graphics drawing routines

of versions between Windows NT 4.0 and Windows Server 2003 exist mostly in kernel space. Windows 9x had very little distinction between the interface and the kernel.

Many computer operating systems allow the user to install or create any user interface they desire. The X Window System in conjunction with GNOME or KDE Plasma Desktop is a commonly found setup on most Unix and Unix-like (BSD, Linux, Solaris) systems. A number of Windows shell replacements have been released for Microsoft Windows, which offer alternatives to the included Windows shell, but the shell itself cannot be separated from Windows. Numerous Unix-based GUIs have existed over time, most derived from X11. Competition among the various vendors of Unix (HP, IBM, Sun) led to much fragmentation, though an effort to standardize in the 1990s to COSE and CDE failed for various reasons, and was eventually eclipsed by the widespread adoption of GNOME and K Desktop Environment. Prior to free software-based toolkits and desktop environments, Motif was the prevalent toolkit/desktop combination (and was the basis upon which CDE was developed).

Graphical user interfaces evolve over time. For example, Windows has modified its user interface almost every time a new major version of Windows is released, and the Mac OS GUI changed dramatically with the introduction of Mac OS X in 1999.[23]


Real-time operating systems


A real-time operating system (RTOS) is a multitasking operating system intended for applications with fixed deadlines (real-time computing). Such applications include some small embedded systems, automobile engine controllers, industrial robots, spacecraft, industrial control, and some large-scale computing systems. An early example of a large-scale real-time operating system was Transaction Processing Facility developed by American Airlines and IBM for the Sabre Airline Reservations System. Embedded systems that have fixed deadlines use a real-time operating system such as VxWorks, PikeOS, eCos, QNX, MontaVista Linux and RTLinux. Windows CE is a real-time operating system that shares similar APIs to desktop Windows but shares none of desktop Windows' codebase. Symbian OS also has an RTOS kernel (EKA2) starting with version 8.0b. Some embedded systems use operating systems such as Palm OS, BSD, and Linux, although such operating systems do not support real-time computing.
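
A sketch of the periodic, deadline-driven structure such systems rely on, using the POSIX clock_nanosleep call with an absolute wake-up time so that timing error does not accumulate. This is only an illustration of the pattern (the 10 ms period is an arbitrary example, and the call is available on Linux and on RTOSes with a POSIX layer); a true real-time operating system additionally guarantees an upper bound on how late each wake-up can be.

    /* Sketch of a periodic real-time task: wake at absolute deadlines so
       that jitter in one cycle does not shift all the following cycles. */
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <time.h>

    #define PERIOD_NS 10000000L            /* 10 ms control period (example) */

    int main(void)
    {
        struct timespec next;
        clock_gettime(CLOCK_MONOTONIC, &next);

        for (int cycle = 0; cycle < 100; cycle++) {
            /* ... read sensors, compute, drive actuators here ... */

            next.tv_nsec += PERIOD_NS;     /* schedule the next absolute deadline */
            while (next.tv_nsec >= 1000000000L) {
                next.tv_nsec -= 1000000000L;
                next.tv_sec  += 1;
            }
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        }
        puts("ran 100 periodic cycles");
        return 0;
    }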

Operating system development as a hobby


Operating system development is one of the most complicated activities in which a computing hobbyist may engage. A hobby operating system may be classified as one whose code has not been directly derived from an existing operating system, and has few users and active developers. [24] In some cases, hobby development is in support of a "homebrew" computing device, for example, a simple single-board computer powered by a 6502 microprocessor. Or, development may be for an architecture already in widespread use. Operating system development may come from entirely new concepts, or may commence by modeling an existing operating system. In either case, the hobbyist is his/her own developer, or may interact with a small and sometimes unstructured group of individuals who have like interests. Examples of a hobby operating system include ReactOS and Syllable.


Diversity of operating systems and portability


Application software is generally written for use on a specific operating system, and sometimes even for specific hardware. When porting the application to run on another OS, the functionality required by that application may be implemented differently by that OS (the names of functions, the meaning of arguments, etc.), requiring the application to be adapted, changed, or otherwise maintained. The cost of supporting operating system diversity can be avoided by instead writing applications against software platforms like Java or Qt. These abstractions have already borne the cost of adaptation to specific operating systems and their system libraries. Another approach is for operating system vendors to adopt standards. For example, POSIX and OS abstraction layers provide commonalities that reduce porting costs.
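
A tiny sketch of such an abstraction layer in C; the function name portable_sleep_ms is invented for illustration, and only this one spot needs to know whether the underlying platform is Windows or a POSIX system, so the rest of the application stays unchanged when it is ported.

    /* Sketch of an OS abstraction layer: the application always calls
       portable_sleep_ms(); the platform difference is confined here. */
    #ifdef _WIN32
    #include <windows.h>
    static void portable_sleep_ms(unsigned ms) { Sleep(ms); }
    #else
    #include <time.h>
    static void portable_sleep_ms(unsigned ms)
    {
        struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
    }
    #endif

    #include <stdio.h>

    int main(void)
    {
        portable_sleep_ms(100);            /* same call on every platform */
        puts("slept 100 ms portably");
        return 0;
    }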

References
[1] Stallings (2005). Operating Systems, Internals and Design Principles. Pearson: Prentice Hall. p. 6.
[2] Dhotre, I.A. (2009). Operating Systems. Technical Publications. p. 1.
[3] "Operating System Market Share" (http://marketshare.hitslink.com/operating-system-market-share.aspx?qprid=10). Net Applications.
[4] Hansen, Per Brinch, ed. (2001). Classic Operating Systems (http://books.google.com/?id=-PDPBvIPYBkC&lpg=PP1&pg=PP1#v=onepage&q). Springer. pp. 47. ISBN 0-387-95113-X.

[5] "OS X Mountain Lion - Move your Mac even further ahead" (http:/ / www. apple. com/ macosx/ lion/ ). Apple. . Retrieved 2012-08-07. [6] Usage share of operating systems [7] "Top 5 Operating Systems from January to April 2011" (http:/ / gs. statcounter. com/ #os-ww-monthly-201101-201104-bar). StatCounter. October 2009. . Retrieved November 5, 2009. [8] "IDC report into Server market share" (http:/ / www. idc. com/ about/ viewpressrelease. jsp?containerId=prUS22360110& sectionId=null& elementId=null& pageType=SYNOPSIS). Idc.com. . Retrieved 2012-08-07. [9] Linux still top embedded OS (http:/ / www. linuxdevices. com/ news/ NS4920597981. html) [10] Tom Jermoluk (2012-08-03). "TOP500 List November 2010 (1100) | TOP500 Supercomputing Sites" (http:/ / www. top500. org/ list/ 2010/ 11/ 100). Top500.org. . Retrieved 2012-08-07. [11] "Global Web Stats" (http:/ / marketshare. hitslink. com/ operating-system-market-share. aspx?qprid=8). Net Market Share, Net Applications. May 2011. . Retrieved 2011-05-07. [12] "Global Web Stats" (http:/ / www. w3counter. com/ globalstats. php). W3Counter, Awio Web Services. September 2009. . Retrieved 2009-10-24. [13] "Operating System Market Share" (http:/ / marketshare. hitslink. com/ operating-system-market-share. aspx?qprid=8). Net Applications. October 2009. . Retrieved November 5, 2009. [14] "w3schools.com OS Platform Statistics" (http:/ / www. w3schools. com/ browsers/ browsers_os. asp). . Retrieved October 30, 2011. [15] "Stats Count Global Stats Top Five Operating Systems" (http:/ / gs. statcounter. com/ #os-ww-monthly-201010-201110). . Retrieved October 30, 2011. [16] "Global statistics at w3counter.com" (http:/ / www. w3counter. com/ globalstats. php). . Retrieved 23 January 2012. [17] "Troubleshooting MS-DOS Compatibility Mode on Hard Disks" (http:/ / support. microsoft. com/ kb/ 130179/ EN-US). Support.microsoft.com. . Retrieved 2012-08-07. [18] "Using NDIS 2 PCMCIA Network Card Drivers in Windows 95" (http:/ / support. microsoft. com/ kb/ 134748/ en). Support.microsoft.com. . Retrieved 2012-08-07. [19] "INFO: Windows 95 Multimedia Wave Device Drivers Must be 16 bit" (http:/ / support. microsoft. com/ kb/ 163354/ en). Support.microsoft.com. . Retrieved 2012-08-07. [20] "Operating System Share by Groups for Sites in All Locations January 2009" (http:/ / news. netcraft. com/ SSL-Survey/ CMatch/ osdv_all). . [21] "Behind the IDC data: Windows still No. 1 in server operating systems" (http:/ / blogs. zdnet. com/ microsoft/ ?p=5408). ZDNet. 2010-02-26. . [22] Stallings, William (2008). Computer Organization & Architecture. New Delhi: Prentice-Hall of India Private Limited. p.267. ISBN978-81-203-2962-1. [23] Poisson, Ken. "Chronology of Personal Computer Software" (http:/ / www. islandnet. com/ ~kpolsson/ compsoft/ soft1998. htm). Retrieved on 2008-05-07. Last checked on 2009-03-30. [24] "My OS is less hobby than yours" (http:/ / www. osnews. com/ story/ 22638/ My_OS_Is_Less_Hobby_than_Yours). Osnews. December 21, 2009. . Retrieved December 21, 2009.

Windows to surpass Android by 2015 (http://www.greatphone.co.uk/594/windows-to-surpass-android-by-2015/)


Further reading
Auslander, Marc A.; Larkin, David C.; Scherr, Allan L. (1981). The evolution of the MVS Operating System (http://www.research.ibm.com/journal/rd/255/auslander.pdf). IBM J. Research & Development.
Deitel, Harvey M.; Deitel, Paul; Choffnes, David. Operating Systems. Pearson/Prentice Hall. ISBN 978-0-13-092641-8.
Bic, Lubomur F.; Shaw, Alan C. (2003). Operating Systems. Pearson: Prentice Hall.
Silberschatz, Avi; Galvin, Peter; Gagne, Greg (2008). Operating Systems Concepts. John Wiley & Sons. ISBN 0-470-12872-0.

External links
Operating Systems (http://www.dmoz.org/Computers/Software/Operating_Systems/) at the Open Directory Project
Multics History (http://www.cbi.umn.edu/iterations/haigh.html) and the history of operating systems
How Stuff Works - Operating Systems (http://computer.howstuffworks.com/operating-system.htm)
Help finding your Operating System type and version (http://whatsmyos.com)

Unix-like
A Unix-like (sometimes referred to as UN*X or *nix) operating system is one that behaves in a manner similar to a Unix system, while not necessarily conforming to or being certified to any version of the Single UNIX Specification. There is no standard for defining the term, and some difference of opinion is possible as to the degree to which a given operating system is "Unix-like". The term can include free and open source operating systems inspired by Bell Labs' Unix or designed to emulate its features, commercial and proprietary work-alikes, and even versions based on the licensed UNIX source code (which may be sufficiently "Unix-like" to pass certification and bear the "UNIX" trademark).
Diagram of the relationships between the major Unix-like systems

Definition
The Open Group owns the UNIX trademark and administers the Single UNIX Specification, with the "UNIX" name being used as a certification mark. They do not approve of the construction "Unix-like", and consider it a misuse of their trademark. Their guidelines require "UNIX" to be presented in uppercase or otherwise distinguished from the surrounding text, strongly encourage using it as a branding adjective for a generic word such as "system", and discourage its use in hyphenated phrases.[1] Other parties frequently treat "Unix" as a genericized trademark. Some add a wildcard character to the name to make an abbreviation like "Un*x"[2] or "*nix", since Unix-like systems often have Unix-like names such as AIX, A/UX, HP-UX, IRIX, Linux, Minix, Ultrix, and Xenix. These patterns do not literally match many system names, but are still generally recognized to refer to any UNIX descendant or work-alike system, even those with completely dissimilar names such as Solaris or FreeBSD. In 2007, Wayne R. Gray sued to dispute the status of UNIX as a trademark, but lost his case, and lost again on appeal.

Also in 2007, the Open Group reached a binding legal agreement to prevent the German University of Kassel from using "UNIK" as its short form name.[3]


History
"Unix-like" systems started to appear in the late 1970s and early 1980s. Many proprietary versions, such as Idris (1978), UNOS (1982), Coherent (1983), and UniFlex (1985), aimed to provide businesses with the functionality available to academic users of UNIX. When AT&T later allowed commercial licensing of UNIX in the 1980s, a variety of proprietary systems were developed based on it, including AIX, HP-UX, IRIX, SunOS, Tru64, Ultrix, and Xenix. These largely displaced the proprietary clones. Growing incompatibility between these systems led to the creation of interoperability standards, including POSIX and the Single UNIX Specification. Meanwhile, the GNU Project was launched in 1983 with the goal of making GNU, an operating system which all computer users could freely use, study, modify, and redistribute. Various "Unix-like" operating systems developed alongside GNU, frequently sharing substantial components with it (leading to some disagreement about whether they should be called "GNU" or not). These primarily served as low-cost and unrestricted substitutes for UNIX, and include 4.4BSD, Linux, and Minix. Some of these have in turn been the basis for commercial "Unix-like" systems, such as BSD/OS and Mac OS X. Notably, Mac OS X 10.5 and Mac OS X 10.6 running on Intel Macs are certified under the Single UNIX Specification.[4] The various BSD variants are notable in that they are in fact descendants of UNIX, developed by the University of California at Berkeley with UNIX source code from Bell Labs. However, the BSD code base has evolved since then, replacing all of the AT&T code. Since the BSD variants are not certified as compliant with the Single UNIX Specification (except for Mac OS X 10.5 Leopard and Mac OS X 10.6 Snow Leopard), they are referred to as "UNIX-like".

Categories
Dennis Ritchie, one of the original creators of Unix, expressed his opinion that Unix-like systems such as Linux are de facto Unix systems.[5] Eric S. Raymond and Rob Langley have suggested[6] that there are three kinds of Unix-like systems:

Genetic UNIX
Those systems with a historical connection to the AT&T codebase. Most (but not all) commercial UNIX systems fall into this category. So do the BSD systems, which are descendants of work done at the University of California, Berkeley in the late 1970s and early 1980s. Some of these systems have no original AT&T code but can still trace their ancestry to AT&T designs.

Trademark or Branded UNIX
These systems, largely commercial in nature, have been determined by the Open Group to meet the Single UNIX Specification and are allowed to carry the UNIX name. Most such systems are commercial derivatives of the System V code base in one form or another, although Apple Mac OS X 10.5 and later is a BSD variant that has been certified, and a few certified systems (such as IBM z/OS) earned the trademark through a POSIX compatibility layer and are not otherwise inherently Unix systems. Many ancient UNIX systems no longer meet this definition.

Functional UNIX
Broadly, any Unix-like system that behaves in a manner roughly consistent with the UNIX specification; more specifically, this can refer to systems such as Linux or Minix that behave similarly to a UNIX system but have no genetic or trademark connection to the AT&T code base. Most free/open-source implementations of the UNIX design, whether genetic UNIX or not, fall into the restricted definition of this third category due to the expense of obtaining Open Group certification, which costs thousands of dollars for commercial closed-source systems.

Around 2001, Linux was given the opportunity to obtain certification, including free help from the POSIX chair Andrew Josey, for the symbolic price of one dollar. There have been some activities to make Linux POSIX-compliant, with Josey having prepared a list of differences between the POSIX standard and the Linux Standard Base specification,[7] but in August 2005 this project was shut down for lack of interest on the Linux/FSF side.


Compatibility layers
Some non-Unix-like operating systems provide a Unix-like compatibility layer, with variable degrees of Unix-like functionality. IBM z/OS's UNIX System Services is sufficiently complete to be certified as trademark UNIX. Cygwin and MSYS both provide a reasonably complete GNU environment, sufficient for most common open source software to be compiled and run, with some emulation of Linux, on top of the Microsoft Windows user API. Interix provides Unix-like functionality as a Windows NT subsystem.

References
[1] Trademark Guidelines (http://www.opengroup.org/tm-guidelines.htm) The Open Group.
[2] Eric S. Raymond; Guy L. Steele Jr. "UN*X" (http://catb.org/jargon/html/U/UN-asterisk-X.html). The Jargon File. Retrieved 2009-01-22.
[3] "Das Killer-K" (17 April 2007) Publik Kasseler Hochschulzeitung, No. 3 (http://www.uni-kassel.de/presse/publik/07_03/s1.pdf)
[4] Register of Open Branded Products (http://www.opengroup.org/openbrand/register/) The Open Group
[5] Interview with Dennis M. Ritchie (http://www.linuxfocus.org/English/July1999/article79.html) Manuel Benet, LinuxFocus, July 1999
[6] The meaning of 'Unix' (http://catb.org/~esr/hackerlore/sco-vs-ibm.html#id305450) Eric Raymond and Rob Langley, OSI Position Paper on the SCO-vs.-IBM Complaint
[7] Andrew Josey (20 August 2005). "Conflicts between ISO/IEC 9945 (POSIX) and the Linux Standard Base" (http://www.opengroup.org/personal/ajosey/tr20-08-2005.txt). The Open Group. Retrieved 23 July 2012.

External links
Unix-like Definition (http://www.linfo.org/unix-like.html) by The Linux Information Project (LINFO)
UNIX history (http://www.levenez.com/unix/), a history time line graph of most UNIX and Unix-like systems, by Éric Lévénez
Grokline's UNIX Ownership History Project (http://grokline.net/), a project to map out the technical history of UNIX and Unix-like systems


File System
File system
A file system (or filesystem) is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data as well as manage the available space on the device(s) which contain it. A file system organizes data in an efficient manner and is tuned to the specific characteristics of the device. A tight coupling usually exists between the operating system and the file system. Some file systems provide mechanisms to control access to the data and metadata. Ensuring reliability is a major responsibility of a file system. Some file systems allow multiple programs to update the same file at nearly the same time. File systems are used on data storage devices, such as hard disk drives, floppy disks, optical discs, or flash memory storage devices, to maintain the physical locations of the computer files. They may provide access to data on a file server by acting as clients for a network protocol (e.g. NFS, SMB, or 9P clients), or they may be virtual and exist only as an access method for virtual data (e.g. procfs). This is distinguished from a directory service and registry.

Aspects of file systems


Space management
File systems allocate space in a granular manner, usually in allocation units comprising multiple physical units on the device. The file system is responsible for organizing files and directories, and for keeping track of which areas of the media belong to which file and which are not being used. For example, Apple DOS of the early 1980s tracked 256-byte sectors on a 140-kilobyte floppy disk using a track/sector map.

This results in unused space when a file is not an exact multiple of the allocation unit, sometimes referred to as slack space. For a 512-byte allocation unit, the average unused space is 255 bytes; for 64 KB clusters, the average unused space is 32 KB. The size of the allocation unit is chosen when the file system is created. Choosing the allocation size based on the average size of the files expected to be in the file system can minimize the amount of unusable space. Frequently the default allocation provides reasonable usage. Choosing an allocation size that is too small results in excessive overhead if the file system will contain mostly very large files.

File system fragmentation occurs when unused space or single files are not contiguous. As a file system is used, files are created, modified and deleted. When a file is created, the file system allocates space for the data. Some file systems permit or require specifying an initial space allocation and subsequent incremental allocations as the file grows. As files are deleted, the space they were allocated is eventually considered available for use by other files. This creates alternating used and unused areas of various sizes; this is free space fragmentation. When a file is created and there is no area of contiguous space available for its initial allocation, the space must be assigned in fragments. When a file is modified such that it becomes larger, it may exceed the space initially allocated to it; another allocation must be assigned elsewhere, and the file becomes fragmented.

A file system may not make use of a storage device at all but can be used to organize and represent access to any data, whether it is stored or dynamically generated (e.g. procfs).

Example of slack space, demonstrated with 4,096-byte NTFS clusters: 100,000 files, each 5 bytes per file, equals 500,000 bytes of actual data, but requires 409,600,000 bytes of disk space to store
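The arithmetic behind slack space is easy to reproduce. The Python sketch below, using the same figures as the example above (4,096-byte clusters, 100,000 files of 5 bytes each), computes the allocated space and the resulting slack.

import math

def allocated_bytes(file_size: int, cluster_size: int) -> int:
    """Space actually consumed on disk: whole clusters, at least one per file."""
    return max(1, math.ceil(file_size / cluster_size)) * cluster_size

# Figures from the example above: 100,000 files of 5 bytes each on a file
# system with 4,096-byte clusters.
files, size, cluster = 100_000, 5, 4_096
data = files * size
on_disk = files * allocated_bytes(size, cluster)
print(f"actual data: {data:,} bytes")      # 500,000
print(f"disk space:  {on_disk:,} bytes")   # 409,600,000
print(f"slack:       {on_disk - data:,} bytes")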


File names
A file name (or filename) is used to reference the storage location in the file system. Most file systems have restrictions on the length of filenames. In some file systems, filenames are case-insensitive; in others, they are case-sensitive. Most file system interface utilities reserve special characters that cannot normally be used in a filename (the file system may use these special characters to indicate a device, device type, directory prefix or file type). However, it may be possible to use such special characters by, for example, enclosing the file name in double quotes ("). For simplicity, it is often best to avoid file names containing special characters. Some file system utilities, editors and compilers treat prefixes and suffixes of filenames in a special way. These are usually merely conventions and not implemented within the file system.
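As a minimal, hypothetical illustration, the Python sketch below checks a candidate name against a set of reserved characters; the particular character set and length limit are assumptions loosely based on Windows/NTFS rules, and real restrictions vary from one file system to another.

# Characters reserved by many file system interfaces; this particular set
# (an assumption) roughly matches the rules for Windows/NTFS. Unix file
# systems typically forbid only '/' and the NUL byte.
RESERVED = set('<>:"/\\|?*')

def is_portable_name(name: str, max_len: int = 255) -> bool:
    """Return True if the name avoids reserved characters and length limits."""
    return (
        0 < len(name) <= max_len
        and not any(ch in RESERVED or ord(ch) < 32 for ch in name)
    )

print(is_portable_name("report_2012.txt"))  # True
print(is_portable_name('a:b?.txt'))         # False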

Directories
File systems typically have directories (sometimes called folders) which allow the user to group files. This may be implemented by connecting the file name to an index in a table of contents or an inode in a Unix-like file system. Directory structures may be flat (i.e. linear), or allow hierarchies where directories may contain subdirectories. The first file system to support arbitrary hierarchies of directories was the file system in the Multics operating system.[1] The native file systems of Unix-like systems also support arbitrary directory hierarchies, as do, for example, Apple's Hierarchical File System and its successor HFS+ in classic Mac OS (HFS+ is still used in Mac OS X), the FAT file system in MS-DOS 2.0 and later and Microsoft Windows, the NTFS file system in the Windows NT family of operating systems, and the ODS-2 and higher levels of the Files-11 file system in OpenVMS.

Metadata
Other bookkeeping information is typically associated with each file within a file system. The length of the data contained in a file may be stored as the number of blocks allocated for the file or as a byte count. The time that the file was last modified may be stored as the file's timestamp. File systems might store the file creation time, the time it was last accessed, the time the file's metadata was changed, or the time the file was last backed up. Other information can include the file's device type (e.g. block, character, socket, subdirectory, etc.), its owner user ID and group ID, and its access permission settings (e.g. whether the file is read-only, executable, etc.). Additional attributes can be associated with files on file systems such as NTFS, XFS, ext2/ext3, some versions of UFS, and HFS+, using extended file attributes. Some file systems provide for user-defined attributes such as the author of the document, the character encoding of a document or the size of an image. Some file systems allow different data collections to be associated with one file name. These separate collections may be referred to as streams or forks. Apple has long used a forked file system on the Macintosh, and Microsoft supports streams in NTFS. Some file systems maintain multiple past revisions of a file under a single file name; the filename by itself retrieves the most recent version, while prior saved versions can be accessed using a special naming convention such as "filename;4" or "filename(-4)" to access the version four saves ago.
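A minimal Python sketch of reading such bookkeeping metadata through the operating system's stat interface is shown below; which fields are meaningful depends on the operating system and file system, and extended attributes or forks require platform-specific APIs not shown here.

import os
import stat
import time

def describe(path: str) -> dict:
    """Collect common bookkeeping metadata kept by the file system."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,                 # byte count
        "modified": time.ctime(st.st_mtime),      # last data change
        "accessed": time.ctime(st.st_atime),
        "meta_changed": time.ctime(st.st_ctime),  # inode change time on Unix
        "owner_uid": getattr(st, "st_uid", None), # not meaningful on Windows
        "group_gid": getattr(st, "st_gid", None),
        "mode": stat.filemode(st.st_mode),        # e.g. '-rw-r--r--'
    }

if __name__ == "__main__":
    print(describe(__file__))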

Utilities
File systems include utilities to initialize, alter parameters of and remove an instance of the file system. Some include the ability to extend or truncate the space allocated to the file system. Directory utilities create, rename and delete directory entries and alter metadata associated with a directory. They may include a means to create additional links to a directory (hard links in Unix), rename parent links (".." in Unix-like OS), and create bidirectional links to files. File utilities create, list, copy, move and delete files, and alter metadata. They may be able to truncate data, truncate or extend space allocation, append to, move, and modify files in-place. Depending on the underlying structure of the

file system, they may provide a mechanism to prepend to, or truncate from, the beginning of a file, insert entries into the middle of a file or delete entries from a file. Also in this category are utilities to free space for deleted files if the file system provides an undelete function. Some file systems defer reorganization of free space, secure erasing of free space and rebuilding of hierarchical structures. They provide utilities to perform these functions at times of minimal activity. Included in this category is the infamous defragmentation utility. Some of the most important features of file system utilities involve supervisory activities which may involve bypassing ownership or direct access to the underlying device. These include high-performance backup and recovery, data replication and reorganization of various data structures and allocation tables within the file system.


Restricting and permitting access


There are several mechanisms used by file systems to control access to data. Usually the intent is to prevent reading or modifying files by a user or group of users. Another reason is to ensure data is modified in a controlled way so access may be restricted to a specific program. Examples include passwords stored in the metadata of the file or elsewhere and file permissions in the form of permission bits, access control lists, or capabilities. The need for file system utilities to be able to access the data at the media level to reorganize the structures and provide efficient backup usually means that these are only effective for polite users but are not effective against intruders. See also password cracking. Methods for encrypting file data are sometimes included in the file system. This is very effective since there is no need for file system utilities to know the encryption seed to effectively manage the data. The risks of relying on encryption include the fact that an attacker can copy the data and use brute force to decrypt the data. Losing the seed means losing the data. See also filesystem-level encryption, Encrypting File System.
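A small Python sketch of the permission-bit mechanism on a POSIX-style system is shown below; it is only an illustration of one of the mechanisms mentioned above, and access control lists or capabilities require different APIs.

import os
import stat
import tempfile

# Create a scratch file, then restrict it so only the owner may read or write.
# Permission bits are only one of the access-control mechanisms described
# above; ACLs and capabilities require platform-specific APIs.
fd, path = tempfile.mkstemp()
os.close(fd)

os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # 0o600: owner read/write only
mode = os.stat(path).st_mode
print(stat.filemode(mode))                   # '-rw-------' on POSIX systems

os.remove(path)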

Maintaining integrity
One significant responsibility of a file system is to ensure that, regardless of the actions by programs accessing the data, the structure remains consistent. This includes actions taken if a program modifying data terminates abnormally or neglects to inform the file system that it has completed its activities. This may include updating the metadata, the directory entry and handling any data that was buffered but not yet updated on the physical storage media. Other failures which the file system must deal with include media failures or loss of connection to remote systems. In the event of an operating system failure or "soft" power failure, special routines in the file system must be invoked, similar to when an individual program fails. The file system must also be able to correct damaged structures. These may occur as a result of an operating system failure for which the OS was unable to notify the file system, a power failure, or a reset. The file system must also record events to allow analysis of systemic issues as well as problems with specific files or directories.


User data
The most important purpose of a file system is to manage user data. This includes storing, retrieving and updating data. Some file systems accept data for storage as a stream of bytes which are collected and stored in a manner efficient for the media. When a program retrieves the data, it specifies the size of a memory buffer and the file system transfers data from the media to the buffer. Sometimes a runtime library routine may allow the user program to define a record based on a library call specifying a length. When the user program reads the data, the library retrieves data via the file system and returns a record. Some file systems allow the specification of a fixed record length which is used for all writes and reads. This facilitates updating records. An identification for each record, also known as a key, makes for a more sophisticated file system. The user program can read, write and update records without regard to their location. This requires complicated management of blocks of media, usually separating key blocks and data blocks. Very efficient algorithms can be developed with pyramid structures for locating records.
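The Python sketch below illustrates, in a simplified and hypothetical way, how fixed-length records can be layered on top of a byte-stream file so that individual records can be read, written and updated in place; the record size and data are arbitrary choices for the example.

import os

RECORD_SIZE = 32  # fixed record length chosen for this illustration

def write_record(f, index: int, payload: bytes) -> None:
    """Store a record at a fixed offset; pad or truncate to RECORD_SIZE."""
    f.seek(index * RECORD_SIZE)
    f.write(payload[:RECORD_SIZE].ljust(RECORD_SIZE, b"\x00"))

def read_record(f, index: int) -> bytes:
    f.seek(index * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b"\x00")

with open("records.dat", "w+b") as f:
    write_record(f, 0, b"alice")
    write_record(f, 3, b"bob")            # records need not be written in order
    print(read_record(f, 3))              # b'bob'
    write_record(f, 0, b"alice-updated")  # update a record in place
    print(read_record(f, 0))

os.remove("records.dat")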

Using a file system


Utilities, language specific run-time libraries and user programs use file system APIs to make requests of the file system. These include data transfer, positioning, updating metadata, managing directories, managing access specifications and removal.

Multiple file systems within a single system


Frequently, retail systems are configured with a single file system occupying the entire hard disk. Another approach is to partition the disk so that several file systems with different attributes can be used. One file system, for use as browser cache, might be configured with a small allocation size. This has the additional advantage of keeping the frantic activity of creating and deleting files typical of browser activity in a narrow area of the disk and not interfering with allocations of other files. A similar partition might be created for email. Another partition and file system might be created for the storage of audio or video files with a relatively large allocation size. One of the file systems may normally be set read-only and only periodically be set writable.

A third approach, which is mostly used in cloud systems, is to use "disk images" to house additional file systems, with the same attributes or not, within another (host) file system as a file. A common example is virtualization: one user can run an experimental Linux distribution (using the ext4 file system) in a virtual machine under his/her production Windows environment (using NTFS). The ext4 file system resides in a disk image, which is treated as a file (or multiple files, depending on the hypervisor and settings) in the NTFS host file system.

Having multiple file systems on a single system has the additional benefit that, in the event of corruption of a single partition, the remaining file systems will frequently still be intact. This includes virus destruction of the system partition or even a system that will not boot. File system utilities which require dedicated access can effectively be completed piecemeal. In addition, defragmentation may be more effective. Several system maintenance utilities, such as virus scans and backups, can also be processed in segments. For example, it is not necessary to back up the file system containing videos along with all the other files if none have been added since the last backup. As for the image files, one can easily "spin off" differential images which contain only "new" data written to the master (original) image. Differential images can be used for both safety concerns (as a "disposable" system that can be quickly restored if destroyed or contaminated by a virus, since the old image can be removed and a new image created in a matter of seconds, even without automated procedures) and quick virtual machine deployment (since the differential images can be quickly spawned using a script in batches).


Design limitations
All file systems have some functional limit that defines the maximum storable data capacity within that system. These functional limits are a best-guess effort by the designer based on how large storage systems are at the time and how large they are likely to become in the future. Disk storage has continued to increase at near-exponential rates (see Moore's law), so after a few years file systems have kept reaching design limitations that require computer users to repeatedly move to newer systems with ever-greater capacity. File system complexity typically varies proportionally with the available storage capacity. The file systems of early 1980s home computers with 50 KB to 512 KB of storage would not be a reasonable choice for modern storage systems with hundreds of gigabytes of capacity. Likewise, modern file systems would not be a reasonable choice for these early systems, since the complexity of modern file system structures would consume most or all of the very limited capacity of the early storage systems.

Types of file systems


File system types can be classified into disk/tape file systems, network file systems and special-purpose file systems.

Disk file systems


A disk file system takes advantage of the ability of disk storage media to randomly address data in a short amount of time. Additional considerations include the speed of accessing data following that initially requested and the anticipation that the following data may also be requested. This permits multiple users (or processes) access to various data on the disk without regard to the sequential location of the data. Examples include FAT (FAT12, FAT16, FAT32), exFAT, NTFS, HFS and HFS+, HPFS, UFS, ext2, ext3, ext4, btrfs, ISO 9660, Files-11, Veritas File System, VMFS, ZFS, ReiserFS and UDF. Some disk file systems are journaling file systems or versioning file systems.

Optical discs
ISO 9660 and Universal Disk Format (UDF) are two common formats that target Compact Discs, DVDs and Blu-ray discs. Mount Rainier is an extension to UDF supported by the Linux 2.6 series and Windows Vista that facilitates rewriting to DVDs.

Flash file systems


A flash file system considers the special abilities, performance and restrictions of flash memory devices. Frequently a disk file system can use a flash memory device as the underlying storage media but it is much better to use a file system specifically designed for a flash device.

Tape file systems


A tape file system is a file system and tape format designed to store files on tape in a self-describing form. Magnetic tapes are sequential storage media with significantly longer random data access times than disks, posing challenges to the creation and efficient management of a general-purpose file system. In a disk file system there is typically a master file directory, and a map of used and free data regions. Any file additions, changes, or removals require updating the directory and the used/free maps. Random access to data regions is measured in milliseconds so this system works well for disks. Tape requires linear motion to wind and unwind potentially very long reels of media. This tape motion may take several seconds to several minutes to move the read/write head from one end of the tape to the other. Consequently, a master file directory and usage map can be extremely slow and inefficient with tape. Writing typically involves reading the block usage map to find free blocks for writing, updating the usage map and directory

to add the data, and then advancing the tape to write the data in the correct spot. Each additional file write requires updating the map and directory and writing the data, which may take several seconds to occur for each file. Tape file systems instead typically allow for the file directory to be spread across the tape intermixed with the data, referred to as streaming, so that time-consuming and repeated tape motions are not required to write new data. However, a side effect of this design is that reading the file directory of a tape usually requires scanning the entire tape to read all the scattered directory entries. Most data archiving software that works with tape storage will store a local copy of the tape catalog on a disk file system, so that adding files to a tape can be done quickly without having to rescan the tape media. The local tape catalog copy is usually discarded if not used for a specified period of time, at which point the tape must be re-scanned if it is to be used in the future. IBM has developed a file system for tape called the Linear Tape File System. The IBM implementation of this file system has been released as the open-source IBM Linear Tape File System Single Drive Edition (LTFSSDE) product. The Linear Tape File System uses a separate partition on the tape to record the index meta-data, thereby avoiding the problems associated with scattering directory entries across the entire tape.

Tape formatting
Writing data to a tape is often a significantly time-consuming process that may take several hours. Similarly, completely erasing or formatting a tape can also take several hours. With many data tape technologies it is not necessary to format the tape before over-writing new data to the tape. This is due to the inherently destructive nature of overwriting data on sequential media. Because of the time it can take to format a tape, typically tapes are pre-formatted so that the tape user does not need to spend time preparing each new tape for use. All that is usually necessary is to write an identifying media label to the tape before use, and even this can be automatically written by software when a new tape is used for the first time.


Database file systems


Another concept for file management is the idea of a database-based file system. Instead of, or in addition to, hierarchical structured management, files are identified by their characteristics, such as type of file, topic, author, or similar rich metadata.[2]

IBM DB2 for i[3] (formerly known as DB2/400 and DB2 for i5/OS) is a database file system that is part of the object-based IBM i[4] operating system (formerly known as OS/400 and i5/OS), incorporating a single-level store and running on IBM Power Systems (formerly known as AS/400 and iSeries); it was designed by Frank G. Soltis, IBM's former chief scientist for IBM i. Around 1978 to 1988, Frank G. Soltis and his team at IBM Rochester successfully designed and applied technologies like the database file system where others, such as Microsoft, later failed.[5] These technologies are informally known as 'Fortress Rochester' and were in a few basic aspects extended from early mainframe technologies, but in many ways more advanced from a technology perspective.

Some other projects that aren't "pure" database file systems but that use some aspects of a database file system:
Many web content management systems use a relational DBMS to store and retrieve files. For example, XHTML files are stored as XML or text fields, while image files are stored as blob fields; SQL SELECT (with optional XPath) statements retrieve the files, and allow the use of sophisticated logic and richer information associations than "usual file systems".
Very large file systems, embodied by applications like Apache Hadoop and Google File System, use some database file system concepts.
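A toy illustration of the database idea, not any particular product's design, is sketched below in Python using SQLite: file contents are stored in a BLOB column and retrieved by metadata such as author and topic rather than by a path; the schema and names are invented for the example.

import sqlite3

# A toy "database file system": file contents live in a BLOB column and are
# looked up by metadata (author, topic) instead of by a path. The schema and
# column names are illustrative only.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE files (
                 name TEXT, author TEXT, topic TEXT, data BLOB)""")
db.execute("INSERT INTO files VALUES (?, ?, ?, ?)",
           ("notes.txt", "alice", "storage", b"file contents here"))
db.commit()

for name, data in db.execute(
        "SELECT name, data FROM files WHERE author = ? AND topic = ?",
        ("alice", "storage")):
    print(name, data)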


Transactional file systems


Some programs need to update multiple files "all at once". For example, a software installation may write program binaries, libraries, and configuration files. If the software installation fails, the program may be unusable. If the installation is upgrading a key system utility, such as the command shell, the entire system may be left in an unusable state. Transaction processing introduces the isolation guarantee, which states that operations within a transaction are hidden from other threads on the system until the transaction commits, and that interfering operations on the system will be properly serialized with the transaction. Transactions also provide the atomicity guarantee, that operations inside of a transaction are either all committed, or the transaction can be aborted and the system discards all of its partial results. This means that if there is a crash or power failure, after recovery, the stored state will be consistent. Either the software will be completely installed or the failed installation will be completely rolled back, but an unusable partial install will not be left on the system. Windows, beginning with Vista, added transaction support to NTFS, abbreviated TxF. There are a number of research prototypes of transactional file systems for UNIX systems, including the Valor file system,[6] Amino,[7] LFS,[8] and a transactional ext3 file system on the TxOS kernel,[9] as well as transactional file systems targeting embedded systems, such as TFFS.[10] Ensuring consistency across multiple file system operations is difficult, if not impossible, without file system transactions. File locking can be used as a concurrency control mechanism for individual files, but it typically does not protect the directory structure or file metadata. For instance, file locking cannot prevent TOCTTOU race conditions on symbolic links. File locking also cannot automatically roll back a failed operation, such as a software upgrade; this requires atomicity. Journaling file systems are one technique used to introduce transaction-level consistency to file system structures. Journal transactions are not exposed to programs as part of the OS API; they are only used internally to ensure consistency at the granularity of a single system call. Data backup systems typically do not provide support for direct backup of data stored in a transactional manner, which makes recovery of reliable and consistent data sets difficult. Most backup software simply notes what files have changed since a certain time, regardless of the transactional state shared across multiple files in the overall dataset. As a workaround, some database systems simply produce an archived state file containing all data up to that point, and the backup software only backs that up and does not interact directly with the active transactional databases at all. Recovery requires separate recreation of the database from the state file, after the file has been restored by the backup software.
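Applications often approximate the atomicity guarantee for a single file with a write-temporary-then-rename pattern, sketched below in Python; this is an application-level workaround, not a transactional file system, and it does not protect multi-file updates.

import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` so readers see either the old or the new contents.

    This imitates the atomicity guarantee for a single file only; it is an
    application-level workaround, not a transactional file system."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # force the data to stable storage
        os.replace(tmp, path)      # rename is atomic on POSIX file systems
    except BaseException:
        os.unlink(tmp)
        raise

atomic_write("config.txt", b"new settings\n")
print(open("config.txt", "rb").read())
os.remove("config.txt")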

Network file systems


A network file system is a file system that acts as a client for a remote file access protocol, providing access to files on a server. Examples of network file systems include clients for the NFS, AFS, SMB protocols, and file-system-like clients for FTP and WebDAV.

Shared disk file systems


A shared disk file system is one in which a number of machines (usually servers) all have access to the same external disk subsystem (usually a SAN). The file system arbitrates access to that subsystem, preventing write collisions. Examples include GFS2 from Red Hat, GPFS from IBM, and SFS from DataPlow.


Special file systems


A special file system presents non-file elements of an operating system as files so they can be acted on using file system APIs. This is most commonly done in Unix-like operating systems, but devices are given file names in some non-Unix-like operating systems as well.

Device file systems
A device file system represents I/O devices and pseudo-devices as files, called device files. Examples in Unix-like systems include devfs and, in Linux 2.6 systems, udev. In non-Unix-like systems, such as TOPS-10 and other operating systems influenced by it, where the full filename or pathname of a file can include a device prefix, devices other than those containing file systems are referred to by a device prefix specifying the device, without anything following it.

Others
In the Linux kernel, configfs and sysfs provide files that can be used to query the kernel for information and configure entities in the kernel. procfs maps processes and, on Linux, other operating system structures into a filespace.

Minimal file system / Audio-cassette storage


In the late 1970s hobbyists saw the development of the microcomputer. Disk and digital tape devices were too expensive for hobbyists. An inexpensive basic data storage system was devised that used common audio cassette tape. When the system needed to write data, the user was notified to press "RECORD" on the cassette recorder, then press "RETURN" on the keyboard to notify the system that the cassette recorder was recording. The system wrote a sound to provide time synchronization, then modulated sounds that encoded a prefix, the data, a checksum and a suffix. When the system needed to read data, the user was instructed to press "PLAY" on the cassette recorder. The system would listen to the sounds on the tape, waiting until a burst of sound could be recognized as the synchronization. The system would then interpret subsequent sounds as data. When the data read was complete, the system would notify the user to press "STOP" on the cassette recorder. It was primitive, but it worked (a lot of the time). Data was stored sequentially in an unnamed format. Multiple sets of data could be written and located by fast-forwarding the tape and watching the tape counter to find the approximate start of the next data region on the tape. The user might have to listen to the sounds to find the right spot to begin playing the next data region. Some implementations even included audible sounds interspersed with the data.

Flat file systems


In a flat file system, there are no subdirectories. When floppy disk media was first available, this type of file system was adequate due to the relatively small amount of data space available. CP/M machines featured a flat file system, where files could be assigned to one of 16 user areas, and generic file operations could be narrowed to work on one user area instead of defaulting to work on all of them. These user areas were no more than special attributes associated with the files; that is, it was not necessary to define a specific quota for each of these areas, and files could be added to groups for as long as there was still free storage space on the disk. The Apple Macintosh also featured a flat file system, the Macintosh File System (MFS). It was unusual in that the file management program (Macintosh Finder) created the illusion of a partially hierarchical filing system on top of MFS. This structure required every file to have a unique name, even if it appeared to be in a separate folder. While simple, flat file systems become awkward as the number of files grows and make it difficult to organize data into related groups of files.

A recent addition to the flat file system family is Amazon's S3, a remote storage service, which is intentionally simplistic to allow users the ability to customize how their data is stored. The only constructs are buckets (imagine a disk drive of unlimited size) and objects (similar, but not identical to the standard concept of a file). Advanced file management is allowed by being able to use nearly any character (including '/') in the object's name, and the ability to select subsets of the bucket's content based on identical prefixes.
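A brief sketch of prefix-based selection using the AWS SDK for Python (boto3) is shown below; the bucket and key names are invented for the example, and working credentials and an existing bucket are assumed.

import boto3  # AWS SDK for Python; requires configured credentials

# S3 has no real directories: objects are selected by shared key prefixes.
# The bucket and key names below are purely illustrative.
s3 = boto3.client("s3")

s3.put_object(Bucket="example-bucket", Key="photos/2012/cat.jpg", Body=b"...")
s3.put_object(Bucket="example-bucket", Key="photos/2012/dog.jpg", Body=b"...")

# "List the photos/2012/ folder" is really "list keys with this prefix".
response = s3.list_objects_v2(Bucket="example-bucket", Prefix="photos/2012/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])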


File systems and operating systems


Many operating systems include support for more than one file system. Sometimes the OS and the file system are so tightly interwoven it is difficult to separate out file system functions. There needs to be an interface provided by the operating system software between the user and the file system. This interface can be textual (such as provided by a command line interface, such as the Unix shell, or OpenVMS DCL) or graphical (such as provided by a graphical user interface, such as file browsers). If graphical, the metaphor of the folder, containing documents, other files, and nested folders is often used (see also: directory and folder).

Unix-like operating systems


Unix-like operating systems create a virtual file system, which makes all the files on all the devices appear to exist in a single hierarchy. This means, in those systems, there is one root directory, and every file existing on the system is located under it somewhere. Unix-like systems can use a RAM disk or network shared resource as their root directory.

Unix-like systems assign a device name to each device, but this is not how the files on that device are accessed. Instead, to gain access to files on another device, the operating system must first be informed where in the directory tree those files should appear. This process is called mounting a file system. For example, to access the files on a CD-ROM, one must tell the operating system "Take the file system from this CD-ROM and make it appear under such-and-such directory". The directory given to the operating system is called the mount point; it might, for example, be /media. The /media directory exists on many Unix systems (as specified in the Filesystem Hierarchy Standard) and is intended specifically for use as a mount point for removable media such as CDs, DVDs, USB drives or floppy disks. It may be empty, or it may contain subdirectories for mounting individual devices. Generally, only the administrator (i.e. root user) may authorize the mounting of file systems.

Unix-like operating systems often include software and tools that assist in the mounting process and provide it new functionality. Some of these strategies have been coined "auto-mounting" as a reflection of their purpose.
1. In many situations, file systems other than the root need to be available as soon as the operating system has booted. All Unix-like systems therefore provide a facility for mounting file systems at boot time. System administrators define these file systems in the configuration file fstab (vfstab in Solaris), which also indicates options and mount points.
2. In some situations, there is no need to mount certain file systems at boot time, although their use may be desired thereafter. There are some utilities for Unix-like systems that allow the mounting of predefined file systems upon demand.
3. Removable media have become very common with microcomputer platforms. They allow programs and data to be transferred between machines without a physical connection. Common examples include USB flash drives, CD-ROMs, and DVDs. Utilities have therefore been developed to detect the presence and availability of a medium and then mount that medium without any user intervention.
4. Progressive Unix-like systems have also introduced a concept called supermounting; see, for example, the Linux supermount-ng project.[11] For example, a floppy disk that has been supermounted can be physically removed from the system. Under normal circumstances, the disk should have been synchronized and then unmounted before its removal. Provided synchronization has occurred, a different disk can be inserted into the drive. The system automatically notices that the disk has changed and updates the mount point contents to reflect the new medium. Similar functionality is found on Windows machines.
5. An automounter will automatically mount a file system when a reference is made to the directory atop which it should be mounted. This is usually used for file systems on network servers, rather than relying on events such as the insertion of media, as would be appropriate for removable media.

Linux
Linux supports many different file systems, but common choices for the system disk on a block device include the ext* family (such as ext2, ext3 and ext4), XFS, JFS, ReiserFS and btrfs. For raw flash without a flash translation layer (FTL) or Memory Technology Device (MTD), there is UBIFS, JFFS2, and YAFFS, among others. SquashFS is a common compressed read-only file system.

Solaris
The Sun Microsystems Solaris operating system in earlier releases defaulted to (non-journaled or non-logging) UFS for bootable and supplementary file systems. Solaris defaulted to, supported, and extended UFS. Support for other file systems and significant enhancements were added over time, including Veritas Software Corp. (journaling) VxFS, Sun Microsystems (clustering) QFS, Sun Microsystems (journaling) UFS, and Sun Microsystems (open source, poolable, 128-bit, compressible, and error-correcting) ZFS. Kernel extensions were added to Solaris to allow for bootable Veritas VxFS operation. Logging or journaling was added to UFS in Sun's Solaris 7. Releases of Solaris 10, Solaris Express, OpenSolaris, and other open source variants of the Solaris operating system later supported bootable ZFS. Logical volume management allows for spanning a file system across multiple devices for the purpose of adding redundancy, capacity, and/or throughput. Legacy environments in Solaris may use Solaris Volume Manager (formerly known as Solstice DiskSuite). Multiple operating systems (including Solaris) may use Veritas Volume Manager. Modern Solaris-based operating systems eclipse the need for volume management through leveraging virtual storage pools in ZFS.

OS X
OS X uses a file system that it inherited from classic Mac OS called HFS Plus, sometimes called Mac OS Extended. HFS Plus is a metadata-rich and case-preserving file system. Due to the Unix roots of OS X, Unix permissions were added to HFS Plus. Later versions of HFS Plus added journaling to prevent corruption of the file system structure and introduced a number of optimizations to the allocation algorithms in an attempt to defragment files automatically without requiring an external defragmenter. Filenames can be up to 255 characters. HFS Plus uses Unicode to store filenames. On OS X, the filetype can come from the type code, stored in the file's metadata, or the filename extension. HFS Plus has three kinds of links: Unix-style hard links, Unix-style symbolic links and aliases. Aliases are designed to maintain a link to their original file even if they are moved or renamed; they are not interpreted by the file system itself, but by the File Manager code in userland. OS X also supports the UFS file system, derived from the BSD Unix Fast File System via NeXTSTEP. However, as of Mac OS X 10.5 (Leopard), OS X can no longer be installed on a UFS volume, nor can a pre-Leopard system installed on a UFS volume be upgraded to Leopard.[12] Newer versions of OS X are capable of reading and writing to the legacy FAT file systems (FAT16 and FAT32) common on Windows. They are also capable of reading the newer NTFS file systems for Windows.
In order to write to NTFS file systems on OS X versions prior to 10.6 (Snow Leopard) third party software is necessary. Mac OS X 10.6 (Snow Leopard) and later allows writing to NTFS file systems, but only after a non-trivial system setting change (third party software exists that automates this).


Plan 9
Plan 9 from Bell Labs treats everything as a file, and everything is accessed as a file would be (i.e., no ioctl or mmap): networking, graphics, debugging, authentication, capabilities, encryption, and other services are accessed via I/O operations on file descriptors. The 9P protocol removes the difference between local and remote files. These file systems are organized with the help of private, per-process namespaces, allowing each process to have a different view of the many file systems that provide resources in a distributed system. The Inferno operating system shares these concepts with Plan 9.

Microsoft Windows
Windows makes use of the FAT, NTFS, exFAT and ReFS file systems (the latter is only supported and usable in Windows Server 8; Windows cannot boot from it). Windows uses a drive letter abstraction at the user level to distinguish one disk or partition from another. For example, the path C:\WINDOWS represents a directory WINDOWS on the partition represented by the letter C. Drive C: is most commonly used for the primary hard disk partition, on which Windows is usually installed and from which it boots. This "tradition" has become so firmly ingrained that bugs came about in older applications which assumed that the drive the operating system was installed on was C. The use of drive letters, and the tradition of using "C" as the drive letter for the primary hard disk partition, can be traced to MS-DOS, where the letters A and B were reserved for up to two floppy disk drives. This in turn derived from CP/M in the 1970s, and ultimately from IBM's CP/CMS of 1967.
Directory listing in a Windows command shell

FAT
The family of FAT file systems is supported by almost all operating systems for personal computers, including all versions of Windows and MS-DOS/PC DOS and DR-DOS. (PC DOS is an OEM version of MS-DOS; MS-DOS was originally based on SCP's 86-DOS. DR-DOS was based on Digital Research's Concurrent DOS, a successor of CP/M-86.) The FAT file systems are therefore well-suited as a universal exchange format between computers and devices of most any type and age. The FAT file system traces its roots back to an (incompatible) 8-bit FAT precursor in Stand-alone Disk BASIC and the short-lived MDOS/MIDAS project. Over the years, the file system has been expanded from FAT12 to FAT16 and FAT32. Various features have been added to the file system including subdirectories, codepage support, extended attributes, and long filenames. Third parties such as Digital Research have incorporated optional support for deletion tracking, and volume/directory/file-based multi-user security schemes to support file and directory passwords and permissions such as read/write/execute/delete access rights. Most of these extensions are not supported by Windows. The FAT12 and FAT16 file systems had a limit on the number of entries in the root directory of the file system and had restrictions on the maximum size of FAT-formatted disks or partitions. FAT32 addresses the limitations in FAT12 and FAT16, except for the file size limit of close to 4 GB, but it remains limited compared to NTFS.

FAT12, FAT16 and FAT32 also have a limit of eight characters for the file name, and three characters for the extension (such as .exe). This is commonly referred to as the 8.3 filename limit. VFAT, an optional extension to FAT12, FAT16 and FAT32, introduced in Windows 95 and Windows NT 3.5, allowed long file names (LFN) to be stored in the FAT file system in a backwards compatible fashion.

NTFS
NTFS, introduced with the Windows NT operating system, allowed ACL-based permission control. Other features also supported by NTFS include hard links, multiple file streams, attribute indexing, quota tracking, sparse files, encryption, compression, and reparse points (directories working as mount-points for other file systems, symlinks, junctions, remote storage links), though not all these features are well-documented.

exFAT
exFAT is a proprietary and patent-protected file system with certain advantages over NTFS with regards to file system overhead. exFAT is not backwards compatible with FAT file systems such as FAT12, FAT16 or FAT32. The file system is supported with newer Windows systems, such as Windows 2003, Windows Vista, Windows 2008, Windows 7 and more recently, support has been added for Windows XP.[13] Support in other operating systems is sparse since Microsoft has not published the specifications of the file system and implementing support for exFAT requires a license.


Other file systems


The Prospero File System is a file system based on the Virtual System Model.[14] The system was created by Dr. B. Clifford Neuman of the Information Sciences Institute at the University of Southern California.[15]
RSRE FLEX file system - written in ALGOL 68
The file system of the Michigan Terminal System (MTS) is interesting because: (i) it provides "line files" where record lengths and line numbers are associated as metadata with each record in the file, lines can be added, replaced, updated with the same or different length records, and deleted anywhere in the file without the need to read and rewrite the entire file; (ii) using program keys, files may be shared or permitted to commands and programs in addition to users and groups; and (iii) there is a comprehensive file locking mechanism that protects both the file's data and its metadata.[16][17]

Limitations
Converting the type of a file system
It may be advantageous or necessary to have files in a different file system than the one in which they currently exist. Reasons include the need for an increase in the space requirements beyond the limits of the current file system. The depth of path may need to be increased beyond the restrictions of the file system. There may be performance or reliability considerations. Providing access to another operating system which does not support the existing file system is another reason.

In-place conversion
In some cases conversion can be done in-place, although migrating the file system is more conservative, as it involves creating a copy of the data, and is recommended.[18] On Windows, FAT and FAT32 file systems can be converted to NTFS via the convert.exe utility, but not the reverse.[18] On Linux, ext2 can be converted to ext3 (and converted back), and ext3 can be converted to ext4 (but not back),[19] and both ext3 and ext4 can be converted to btrfs, and converted back until the undo information is deleted.[20] These conversions are possible due to using the same format for the file data itself, and relocating the metadata into empty space, in some cases using sparse file support.[20]

Migrating to a different file system
Migration has the disadvantage of requiring additional space, although it may be faster. The best case is if there is unused space on media which will contain the final file system. For example, to migrate a FAT32 file system to an ext2 file system, first create a new ext2 file system, then copy the data to it, and then delete the FAT32 file system. An alternative, when there is not sufficient space to retain the original file system until the new one is created, is to use a work area (such as a removable medium). This takes longer, but a backup of the data is a nice side effect.


Long file paths and long file names


In hierarchical file systems, files are accessed by means of a path that is a branching list of directories containing the file. Different file systems have different limits on the depth of the path. File systems also have a limit on the length of an individual filename. Copying files with long names or located in paths of significant depth from one file system to another may cause undesirable results. This depends on how the utility doing the copying handles the discrepancy. See also pathmunge.[21]
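Before such a copy, a pre-flight check can list the entries that would not fit on the destination. The sketch below uses illustrative limits (255-byte names and 4096-byte paths, common on Unix file systems); substitute the real limits of the target file system.

```python
import os

def find_overlong(root: str, max_name: int = 255, max_path: int = 4096):
    """Yield paths under `root` whose name or full path exceeds the given limits."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(name.encode()) > max_name or len(full.encode()) > max_path:
                yield full

# for path in find_overlong("/data/to_copy"):
#     print("would not fit on the destination:", path)
```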

References
Cited references
[1] R. C. Daley; P. G. Neumann (1965). "A General-Purpose File System For Secondary Storage" (http:/ / www. multicians. org/ fjcc4. html). Fall Joint Computer Conference. AFIPS. pp.213-229. doi:10.1145/1463891.1463915. . Retrieved 2011-07-30. [2] http:/ / www. theregister. co. uk/ 2002/ 03/ 29/ windows_on_a_database_sliced/ [3] http:/ / www-03. ibm. com/ systems/ i/ software/ db2/ index. html [4] http:/ / www. ibm. com/ developerworks/ ibmi/ newto/ [5] http:/ / www. theregister. co. uk/ 2002/ 01/ 28/ xp_successor_longhorn_goes_sql/ [6] Spillane, Richard; Gaikwad, Sachin; Chinni, Manjunath; Zadok, Erez and Wright, Charles P.; 2009; "Enabling transactional file access via lightweight kernel extensions" (http:/ / www. fsl. cs. sunysb. edu/ docs/ valor/ valor_fast2009. pdf); Seventh USENIX Conference on File and Storage Technologies (FAST 2009) [7] Wright, Charles P.; Spillane, Richard; Sivathanu, Gopalan; Zadok, Erez; 2007; "Extending ACID Semantics to the File System (http:/ / www. fsl. cs. sunysb. edu/ docs/ amino-tos06/ amino. pdf); ACM Transactions on Storage [8] Selzter, Margo I.; 1993; "Transaction Support in a Log-Structured File System" (http:/ / www. eecs. harvard. edu/ ~margo/ papers/ icde93/ paper. pdf); Proceedings of the Ninth International Conference on Data Engineering [9] Porter, Donald E.; Hofmann, Owen S.; Rossbach, Christopher J.; Benn, Alexander and Witchel, Emmett; 2009; "Operating System Transactions" (http:/ / www. sigops. org/ sosp/ sosp09/ papers/ porter-sosp09. pdf); In the Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP '09), Big Sky, MT, October 2009. [10] Gal, Eran; Toledo, Sivan; "A Transactional Flash File System for Microcontrollers" (http:/ / www. usenix. org/ event/ usenix05/ tech/ general/ full_papers/ gal/ gal. pdf) [11] http:/ / sourceforge. net/ projects/ supermount-ng [12] Mac OS X 10.5 Leopard: Installing on a UFS-formatted volume (http:/ / docs. info. apple. com/ article. html?artnum=306516) [13] Microsoft WinXP exFat patch (http:/ / www. microsoft. com/ downloads/ details. aspx?FamilyID=1cbe3906-ddd1-4ca2-b727-c2dff5e30f61& displaylang=en) [14] The Prospero File System: A Global File System Based on the Virtual System Model (http:/ / citeseer. ist. psu. edu/ viewdoc/ summary?doi=10. 1. 1. 132. 7982) [15] cs.ucsb.edu (http:/ / www. cs. ucsb. edu/ ~ravenben/ papers/ fsml/ prospero-gfsvsm. ps. gz) [16] "A file system for a general-purpose time-sharing environment" (http:/ / ieeexplore. ieee. org/ xpl/ freeabs_all. jsp?arnumber=1451786), G. C. Pirkola, Proceedings of the IEEE, June 1975, volume 63 no. 6, pp.918924, ISSN 0018-9219 [17] "The Protection of Information in a General Purpose Time-Sharing Environment" (https:/ / docs. google. com/ viewer?a=v& pid=sites& srcid=ZGVmYXVsdGRvbWFpbnxtaWNoaWdhbnRlcm1pbmFsc3lzdGVtfGd4Ojc5MTAxNzg1NTVmMjg5Mzk), Gary C. Pirkola and John Sanguinetti, Proceedings of the IEEE Symposium on Trends and Applications 1977: Computer Security and Integrity, vol. 10 no. 4, , pp.

106-114 [18] How to Convert FAT Disks to NTFS (http:/ / technet. microsoft. com/ en-us/ library/ bb456984. aspx), Microsoft, October 25, 2001 [19] Converting an ext3 filesystem to ext4 (https:/ / ext4. wiki. kernel. org/ index. php/ Ext4_Howto#Converting_an_ext3_filesystem_to_ext4) [20] Conversion from Ext3 (https:/ / btrfs. wiki. kernel. org/ index. php/ Conversion_from_Ext3), Btrfs wiki [21] http:/ / www. cyberciti. biz/ faq/ redhat-linux-pathmunge-command-in-shell-script/


General references
Jonathan de Boyne Pollard (1996). "Disc and volume size limits" (http://homepage.ntlworld.com./jonathan. deboynepollard/FGA/os2-disc-and-volume-size-limits.html). Frequently Given Answers. Retrieved February 9, 2005. IBM. "OS/2 corrective service fix JR09427" (ftp://service.boulder.ibm.com/ps/products/os2/fixes/v4warp/ english-us/jr09427/JR09427.TXT). Retrieved February 9, 2005. "Attribute - $EA_INFORMATION (0xD0)" (http://linux-ntfs.sourceforge.net/ntfs/attributes/ea_information. html). NTFS Information, Linux-NTFS Project. Retrieved February 9, 2005. "Attribute - $EA (0xE0)" (http://linux-ntfs.sourceforge.net/ntfs/attributes/ea.html). NTFS Information, Linux-NTFS Project. Retrieved February 9, 2005. "Attribute - $STANDARD_INFORMATION (0x10)" (http://linux-ntfs.sourceforge.net/ntfs/attributes/ standard_information.html). NTFS Information, Linux-NTFS Project. Retrieved February 21, 2005. Apple Computer Inc. "Technical Note TN1150: HFS Plus Volume Format" (http://developer.apple.com/ technotes/tn/tn1150.html). Detailed HFS Plus and HFSX description. Retrieved May 2, 2006. File System Forensic Analysis (http://www.digital-evidence.org/fsfa/), Brian Carrier, Addison Wesley, 2005.

Further reading
Books
Carrier, Brian (2005). File System Forensic Analysis (http://www.digital-evidence.org/fsfa/). Addison-Wesley. ISBN0-321-26817-2. Custer, Helen (1994). Inside the Windows NT File System. Microsoft Press. ISBN1-55615-660-X. Giampaolo, Dominic (1999) (PDF). Practical File System Design with the Be File System (http://www.nobius. org/~dbg/practical-file-system-design.pdf). Morgan Kaufmann Publishers. ISBN1-55860-497-9. Retrieved 2010-01-22. McCoy, Kirby (1990). VMS File System Internals. VAX - VMS Series. Digital Press. ISBN1-55558-056-4. Mitchell, Stan (1997). Inside the Windows 95 File System (http://oreilly.com/catalog/156592200X). O'Reilly. ISBN1-56592-200-X. Nagar, Rajeev (1997). Windows NT File System Internals : A Developer's Guide (http://oreilly.com/catalog/ 9781565922495). O'Reilly. ISBN978-1-56592-249-5. Pate, Steve D. (2003). UNIX Filesystems: Evolution, Design, and Implementation (http://eu.wiley.com/ WileyCDA/WileyTitle/productCd-0471164836.html). Wiley. ISBN0-471-16483-6. Rosenblum, Mendel (1994). The Design and Implementation of a Log-Structured File System. The Springer International Series in Engineering and Computer Science. Springer. ISBN0-7923-9541-7. Russinovich, Mark; Solomon, David A.; Ionescu, Alex (2009). "File Systems". Windows Internals (5th ed.). Microsoft Press. ISBN0-7356-2530-1. Prabhakaran, Vijayan (2006). IRON File Systems (http://www.cs.wisc.edu/~vijayan/vijayan-thesis.pdf). PhD disseration, University of Wisconsin-Madison. Silberschatz, Abraham; Galvin, Peter Baer; Gagne, Greg (2004). "Storage Management". Operating System Concepts (7th ed.). Wiley. ISBN0-471-69466-5. Tanenbaum, Andrew S. (2007). Modern operating Systems (http://www.pearsonhighered.com/ product?ISBN=0136006639) (3rd ed.). Prentice Hall. ISBN0-13-600663-9.

Tanenbaum, Andrew S.; Woodhull, Albert S. (2006). Operating Systems: Design and Implementation (http://www.pearsonhighered.com/pearsonhigheredus/educator/product/products_detail.page?isbn=0-13-142938-8) (3rd ed.). Prentice Hall. ISBN 0-13-142938-8.


Online
Benchmarking Filesystems (outdated) (http://linuxgazette.net/102/piszcz.html) by Justin Piszcz, Linux Gazette 102, May 2004 Benchmarking Filesystems Part II (http://linuxgazette.net/122/piszcz.html) using kernel 2.6, by Justin Piszcz, Linux Gazette 122, January 2006 Filesystems (ext3, ReiserFS, XFS, JFS) comparison on Debian Etch (http://www.debian-administration.org/ articles/388) 2006 Interview With the People Behind JFS, ReiserFS & XFS (http://www.osnews.com/story.php?news_id=69) Journal File System Performance (outdated) (http://www.open-mag.com/features/Vol_18/filesystems/ filesystems.htm): ReiserFS, JFS, and Ext3FS show their merits on a fast RAID appliance Journaled Filesystem Benchmarks (outdated) (http://staff.osuosl.org/~kveton/fs/): A comparison of ReiserFS, XFS, JFS, ext3 & ext2 Large List of File System Summaries (most recent update 2006-11-19) (http://www.osdata.com/system/ logical/logical.htm) Linux File System Benchmarks (http://fsbench.netnation.com/) v2.6 kernel with a stress on CPU usage Linux Filesystem Benchmarks (http://www.techyblog.com/linux-news/linux-26-filesystem-benchmarks-older. html) Linux large file support (outdated) (http://www.suse.de/~aj/linux_lfs.html) Local Filesystems for Windows (http://www.microsoft.com/whdc/device/storage/LocFileSys.mspx) Overview of some filesystems (outdated) (http://osdev.berlios.de/osd-fs.html) Sparse files support (outdated) (http://www.lrdev.com/lr/unix/sparsefile.html) Jeremy Reimer (March 16, 2008). "From BFS to ZFS: past, present, and future of file systems" (http:// arstechnica.com/articles/paedia/past-present-future-file-systems.ars). arstechnica.com. Retrieved 2008-03-18.

External links
Filesystem Specifications - Links & Whitepapers (http://www.forensics.nl/filesystems) Interesting File System Projects (http://filesystems.org/all-projects.html)



Network File System


Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems in 1984,[1] allowing a user on a client computer to access files over a network in a manner similar to how local storage is accessed. NFS, like many other protocols, builds on the Open Network Computing Remote Procedure Call (ONC RPC) system. The Network File System is an open standard defined in RFCs, allowing anyone to implement the protocol.

Versions and variations


Original NFS version
The implementation details are defined in RFC 1094. Sun used version 1 only for in-house experimental purposes. When the development team added substantial changes to NFS version 1 and released it outside of Sun, they decided to release the new version as v2, so that version interoperation and RPC version fallback could be tested.[2]

NFSv2
Version 2 of the protocol (defined in RFC 1094, March 1989) originally operated entirely over UDP. Its designers meant to keep the protocol stateless, with locking (for example) implemented outside of the core protocol. People involved in the creation of NFS version 2 include Rusty Sandberg, Bob Lyon, Bill Joy, and Steve Kleiman. NFSv2 only allowed the first 2 GB of a file to be read.

NFSv3
Version 3 (RFC 1813, June 1995) added: support for 64-bit file sizes and offsets, to handle files larger than 2 gigabytes (GB); support for asynchronous writes on the server, to improve write performance; additional file attributes in many replies, to avoid the need to re-fetch them; a READDIRPLUS operation, to get file handles and attributes along with file names when scanning a directory; assorted other improvements.

At the time of introduction of Version 3, vendor support for TCP as a transport-layer protocol began increasing. While several vendors had already added support for NFS Version 2 with TCP as a transport, Sun Microsystems added support for TCP as a transport for NFS at the same time it added support for Version 3. Using TCP as a transport made using NFS over a WAN more feasible.

NFSv4
Version 4 (RFC 3010, December 2000; revised in RFC 3530, April 2003), influenced by AFS and CIFS, includes performance improvements, mandates strong security, and introduces a stateful protocol.[3] Version 4 became the first version developed with the Internet Engineering Task Force (IETF) after Sun Microsystems handed over the development of the NFS protocols. NFS version 4.1 (RFC 5661, January 2010) aims to provide protocol support to take advantage of clustered server deployments including the ability to provide scalable parallel access to files distributed among multiple servers (pNFS extension).



Other extensions
WebNFS, an extension to Version 2 and Version 3, allows NFS to integrate more easily into Web browsers and to enable operation through firewalls. In 2007, Sun Microsystems open-sourced their client-side WebNFS implementation.[4]

Various side-band protocols have become associated with NFS, including:
The byte-range advisory Network Lock Manager (NLM) protocol (added to support UNIX System V file locking APIs).
The remote quota reporting (RQUOTAD) protocol (to allow NFS users to view their data-storage quotas on NFS servers).
NFS over RDMA, an adaptation of NFS that uses RDMA as a transport.[5][6]

Platforms
NFS is often used with Unix operating systems (such as Solaris, AIX and HP-UX) and Unix-like operating systems (such as Linux and FreeBSD). It is also available to operating systems such as the classic Mac OS, OpenVMS, Microsoft Windows, Novell NetWare, and IBM AS/400. Alternative remote file access protocols include the Server Message Block (SMB, also known as CIFS), Apple Filing Protocol (AFP), NetWare Core Protocol (NCP), and OS/400 File Server file system (QFileSvr.400). SMB and NetWare Core Protocol (NCP) occur more commonly than NFS on systems running Microsoft Windows; AFP occurs more commonly than NFS in Macintosh systems; and QFileSvr.400 occurs more commonly in AS/400 systems.

Typical implementation
Assuming a Unix-style scenario in which one machine (the client) requires access to data stored on another machine (the NFS server):
1. The server implements NFS daemon processes (running by default as nfsd) in order to make its data generically available to clients.
2. The server administrator determines what to make available, exporting the names and parameters of directories (typically using the /etc/exports configuration file and the exportfs command).
3. The server security administration ensures that it can recognize and approve validated clients.
4. The server network configuration ensures that appropriate clients can negotiate with it through any firewall system.
5. The client machine requests access to exported data, typically by issuing a mount command. (The client asks the server (rpcbind) which port the NFS server is using, the client connects to the NFS server (nfsd), and nfsd passes the request to mountd.)
6. If all goes well, users on the client machine can then view and interact with mounted file systems on the server within the parameters permitted.
Automation of the NFS mounting process may take place, typically using /etc/fstab and/or automounting facilities. A concrete sketch of steps 2 and 5 follows this list.
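As a concrete illustration of steps 2 and 5, a hypothetical export entry and the corresponding client-side mount might look like the sketch below. The host name, export path, network and options are placeholders; the exact option syntax varies between NFS implementations.

```python
import subprocess

# Step 2 (server side): a line the administrator might place in /etc/exports
# before running `exportfs -a` to publish it.
EXPORTS_LINE = "/srv/share  192.168.1.0/24(ro,sync)"

# Step 5 (client side): request access to the exported data with a mount command.
subprocess.run(
    ["mount", "-t", "nfs", "-o", "vers=3", "server:/srv/share", "/mnt/share"],
    check=True,
)
```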



Protocol development versus competing protocols


1980s
NFS and ONC figured prominently in the network-computing war between Sun Microsystems and Apollo Computer, and later the UNIX wars (ca. 1987-1996) between AT&T and Sun on one side, and Digital Equipment, HP, and IBM on the other.

During the development of the ONC protocol (called SunRPC at the time), only Apollo's Network Computing System (NCS) offered comparable functionality. Two competing groups developed over fundamental differences in the two remote procedure call systems. Arguments focused on the method for data encoding: ONC's External Data Representation (XDR) always rendered integers in big-endian order, even if both peers of the connection had little-endian machine architectures, whereas NCS's method attempted to avoid byte-swapping whenever two peers shared a common endianness in their machine architectures. An industry group called the Network Computing Forum formed (March 1987) in an (ultimately unsuccessful) attempt to reconcile the two network-computing environments.

Later, Sun and AT&T announced that the two firms would jointly develop AT&T's next version of UNIX: System V Release 4. This caused many of AT&T's other licensees of UNIX System V to become concerned that this would put Sun in an advantaged position, and it ultimately led to Digital Equipment, HP, IBM, and others forming the Open Software Foundation (OSF) in 1988. Ironically, Sun and AT&T had previously competed over Sun's NFS versus AT&T's Remote File System (RFS), and the quick adoption of NFS over RFS by Digital Equipment, HP, IBM, and many other computer vendors tipped the majority of users in favor of NFS.

OSF solicited proposals for various technologies, including the remote procedure call (RPC) system and the remote file access protocol. In the end, the proposals for these two requirements, called respectively the Distributed Computing Environment (DCE) and the Distributed File System (DFS), won over Sun's proposed ONC and NFS. DCE derived from a suite of technologies, including NCS and Kerberos. DFS used DCE as the RPC and derived from the Andrew File System (AFS).

1990s
Sun Microsystems and the Internet Society (ISOC) reached an agreement to cede "change control" of ONC RPC so that ISOC's engineering-standards body, the Internet Engineering Task Force (IETF), could publish standards documents (RFCs) documenting the ONC RPC protocols and could extend ONC RPC. OSF attempted to make DCE RPC an IETF standard, but ultimately proved unwilling to give up change control. Later, the IETF chose to extend ONC RPC by adding a new authentication flavor based on GSSAPI, RPCSEC GSS, in order to meet IETF's requirements that protocol standards have adequate security. Later, Sun and ISOC reached a similar agreement to give ISOC change control over NFS, although writing the contract carefully to exclude NFS version 2 and version 3. Instead, ISOC gained the right to add new versions to the NFS protocol, which resulted in IETF specifying NFS version 4 in 2003.



2000s
By the 21st century, neither DFS nor AFS had achieved any major commercial success as compared to CIFS or NFS. IBM, which had previously acquired the primary commercial vendor of DFS and AFS, Transarc, donated most of the AFS source code to the free software community in 2000. The OpenAFS project lives on. In early 2005, IBM announced end of sales for AFS and DFS.

Present
NFSv4.1 adds the Parallel NFS pNFS [7] capability, which enables data access parallelism. The NFSv4.1 protocol defines a method of separating the filesystem meta-data from the location of the file data; it goes beyond the simple name/data separation by striping the data amongst a set of data servers. This is different from the traditional NFS server which holds the names of files and their data under the single umbrella of the server. There exist products which are multi-node NFS servers, but the participation of the client in separation of meta-data and data is limited. The NFSv4.1 client can be enabled to be a direct participant in the exact location of file data and avoid solitary interaction with the single NFS server when moving data. The NFSv4.1 pNFS server is a collection of server resources or components; these are assumed to be controlled by the meta-data server. The pNFS client still accesses a single meta-data server for traversal or interaction with the namespace; when the client moves data to and from the server it may be directly interacting with the set of data servers belonging to the pNFS server collection. In addition to pNFS, NFSv4.1 provides Sessions, Directory Delegation and Notifications, Multi-server Namespace, ACL/SACL/DACL, Retention Attributions, and SECINFO_NO_NAME.

References
[1] "Design and Implementation of the Sun Network Filesystem" (http:/ / citeseerx. ist. psu. edu/ viewdoc/ summary?doi=10. 1. 1. 14. 473). USENIX. 1985. . [2] * NFS Illustrated (2000) by Brent Callaghan - ISBN 0-201-32570-5 [3] "NFS Version 4" (http:/ / www. usenix. org/ events/ usenix05/ tech/ italks. html#nFSv4). USENIX. 2005-04-14. . [4] yanfs.dev.java.net (https:/ / yanfs. dev. java. net/ ) [5] Tom Talpey (February 28, 2006). "NFS/RDMA Implementation(s) Update" (http:/ / www. connectathon. org/ talks06/ talpey-cthon06-nfs-rdma. pdf). Network Appliance, Inc.. . [6] Brent Callaghan (January 28, 2002). "NFS over RDMA" (http:/ / www. usenix. org/ events/ fast02/ wips/ callaghan. pdf). Sun Microsystems. . [7] http:/ / www. pnfs. com

External links
RFCs:
RFC 5661 - Network File System (NFS) Version 4 Minor Version 1 Protocol
RFC 3530 - NFS Version 4 Protocol Specification
RFC 2054 - WebNFS Specification
RFC 2339 - Sun/ISOC NFS Change Control Agreement
RFC 2203 - RPCSEC_GSS Specification
RFC 1813 - NFS Version 3 Protocol Specification
RFC 1790 - Sun/ISOC ONC RPC Change Control Agreement
RFC 1094 - NFS Version 2 Protocol Specification

IETF: Network File System Version 4 (nfsv4) Charter (http://www.ietf.org/html.charters/nfsv4-charter.html)
Linux NFS Overview, FAQ and HOWTO Documents (http://nfs.sourceforge.net/)
Christopher Smith (2006-05-02). "Linux NFS-HOWTO" (http://nfs.sourceforge.net/nfs-howto/index.html). Retrieved 2010-12-16.
IBM: NFSv4 delivers seamless network access (http://www-128.ibm.com/developerworks/linux/library/l-nfsv4.html?ca=dgr-lnxw06NFSv4SeamlessNetAccess)
NFS operation explained with sequence diagrams (http://www.eventhelix.com/RealtimeMantra/Networking/NFS_Protocol_Sequence_Diagram.pdf)


Server Message Block


In computer networking, Server Message Block (SMB), also known as Common Internet File System (CIFS, /ˈsɪfs/), operates as an application-layer network protocol[1] mainly used for providing shared access to files, printers, serial ports, and miscellaneous communications between nodes on a network. It also provides an authenticated inter-process communication mechanism. Most usage of SMB involves computers running Microsoft Windows, where it was known as "Microsoft Windows Network" before the subsequent introduction of Active Directory. Corresponding Windows services are the "Server Service" (for the server component) and the "Workstation Service" (for the client component).

The Server Message Block protocol can run atop the Session (and lower) network layers in several ways:
directly over TCP, port 445;[2]
via the NetBIOS API, which in turn can run on several transports:[3] on UDP ports 137 and 138 and TCP ports 137 and 139 (see NetBIOS over TCP/IP), or on several legacy protocols such as NBF (incorrectly referred to as NetBEUI).

History
Barry Feigenbaum originally designed SMB at IBM with the aim of turning DOS "Interrupt 33" (21h) local file access into a networked file system.[4] Microsoft has made considerable modifications to the most commonly used version. Microsoft merged the SMB protocol with the LAN Manager product which it had started developing for OS/2 with 3Com c. 1990, and continued to add features to the protocol in Windows for Workgroups (c. 1992) and in later versions of Windows.

SMB was originally designed to run on top of the NetBIOS/NetBEUI API (typically implemented with NBF, NetBIOS over IPX/SPX, or NBT). Since Windows 2000, SMB runs, by default, with a thin layer, similar to the Session Message packet of NBT's Session Service, on top of TCP, using TCP port 445 rather than TCP port 139, a feature known as "direct host SMB".[2]

At around the time when Sun Microsystems announced WebNFS,[5] Microsoft launched an initiative in 1996 to rename SMB to Common Internet File System (CIFS), and added more features, including support for symbolic links, hard links, larger file sizes, and an initial attempt at supporting direct connections over TCP port 445 without requiring NetBIOS as a transport (a largely experimental effort that required further refinement). Microsoft submitted some partial specifications as Internet-Drafts to the IETF,[6] though these submissions have expired.

The Samba project originated with the aim of reverse engineering the SMB protocol and implementing an SMB server to allow MS-DOS clients to use SMB to access files on Sun Microsystems machines.[7] Because of the importance of the SMB protocol in interacting with the widespread Microsoft Windows platform, Samba became a popular free implementation of a compatible SMB client and server for interoperating with non-Microsoft operating systems.

Microsoft introduced SMB2 with Windows Vista in 2006, and later improved on it in Windows 7.



Implementation
Client-server approach
SMB works through a client-server approach, where a client makes specific requests and the server responds accordingly. One section of the SMB protocol specifically deals with access to filesystems, such that clients may make requests to a file server; but some other sections of the SMB protocol specialize in inter-process communication (IPC). The Inter-Process Communication (IPC) share, or ipc$, is a network share on computers running Microsoft Windows. This virtual share is used to facilitate communication between processes and computers over SMB, often to exchange data between computers that have been authenticated.

Developers have optimized the SMB protocol for local subnet usage, but users have also put SMB to work to access different subnets across the Internet; exploits involving file-sharing or print-sharing in MS Windows environments usually focus on such usage.

SMB servers make their file systems and other resources available to clients on the network. Client computers may want access to the shared file systems and printers on the server, and it is in this primary functionality that SMB has become best known and most heavily used. However, the SMB file-server aspect would count for little without the NT domains suite of protocols, which provide NT-style domain-based authentication at the very least. Almost all implementations of SMB servers use NT Domain authentication to validate user access to resources.

Samba
Samba is a free software re-implementation of the SMB/CIFS networking protocol, originally developed by Andrew Tridgell. As of version 3, Samba provides file and print services for Microsoft Windows clients and can integrate with a Windows NT 4.0 server domain, either as a Primary Domain Controller (PDC) or as a domain member. Samba4 installations can act as an Active Directory domain controller or member server, at Windows 2008 domain and forest functional levels.[8]

Performance issues
NetBIOS
The use of the SMB protocol has often correlated with a significant increase in broadcast traffic on a network. However, SMB itself does not use broadcasts; the broadcast problems commonly associated with SMB actually originate with the NetBIOS service location protocol. By default, a Microsoft Windows NT 4.0 server used NetBIOS to advertise and locate services. NetBIOS functions by broadcasting services available on a particular host at regular intervals. While this usually makes for an acceptable default in a network with a smaller number of hosts, increased broadcast traffic can cause problems as the size of the network increases.

The implementation of name resolution infrastructure in the form of Windows Internet Naming Service (WINS) or Domain Name System (DNS) resolves this problem. WINS was a proprietary implementation used with Windows NT 4.0 networks, but brought about its own issues and complexities in the design and maintenance of a Microsoft network. Since the release of Windows 2000, the use of WINS for name resolution has been deprecated by Microsoft, with hierarchical Dynamic DNS now configured as the default name resolution protocol for all Windows operating systems. Resolution of (short) NETBIOS names by DNS requires that a DNS client expand short names, usually by appending a connection-specific DNS suffix to its DNS lookup queries. WINS can still be configured on clients as a secondary name resolution protocol for interoperability with legacy Windows environments and applications. Further, Microsoft DNS servers can forward name resolution requests to legacy WINS servers in order to support name resolution integration with legacy (pre-Windows 2000) environments that do not support DNS.

WAN performance issues
Network designers have found that latency has a significant impact on the performance of the SMB 1.0 protocol, such that it performs more poorly than other protocols like FTP. Monitoring reveals a high degree of "chattiness" and a disregard of network latency between hosts.[9] For example, a VPN connection over the Internet will often introduce network latency. Microsoft has explained that the performance issues come about primarily because SMB 1.0 is a block-level rather than a streaming protocol that was originally designed for small LANs; it has a block size limited to 64K, SMB signing creates additional overhead, and the TCP window size is not optimized for WAN links.[10] Solutions to this problem include the updated SMB 2.0 protocol, Offline Files, TCP window scaling, and WAN acceleration devices from various network vendors that cache and optimize SMB 1.0[11] and 2.0.[12]


Microsoft's modifications
Microsoft added several extensions to its own SMB implementation. For example, it added NTLM, then NTLMv2 authentication protocols in order to address security weakness in the original LanMan authentication. LanMan authentication derived from the original legacy SMB specification's requirement to use IBM "LanManager" passwords, but implemented DES in a flawed manner that allowed passwords to be cracked.[13] Later, Kerberos authentication was also added. The NT 4.0 Domain logon protocols initially used 40-bit encryption outside of the United States of America, because of export restrictions on stronger 128-bit encryption[14] (subsequently lifted in 1996 when President Bill Clinton signed Executive Order 13026[15]). Opportunistic locking support has changed with each server release.

Opportunistic locking
In the SMB protocol, opportunistic locking is a file locking mechanism designed to improve performance by controlling caching of network files by the client. Contrary to traditional locks, OpLocks are not used in order to provide mutual exclusion. The main goal of OpLocks is to provide synchronization for caching. There are three types of opportunistic locks:

Batch Locks
Batch OpLocks were created originally to support a particular behavior of MS-DOS batch file execution, in which the file is opened and closed many times in a short period, which is a performance problem. To solve this, a client may ask for an OpLock of type "batch". In this case, the client delays sending the close request and, if a subsequent open request is given, the two requests cancel each other.

Exclusive Locks
When an application opens in "shared mode" a file hosted on an SMB server which is not opened by any other process (or other clients), the client receives an exclusive OpLock from the server. This means that the client may now assume that it is the only process with access to this particular file, and the client may now cache all changes to the file before committing it to the server. This is a performance improvement, since fewer round-trips are required in order to read and write to the file. If another client/process tries to open the same file, the server sends a message to the client (called a break or revocation) which invalidates the exclusive lock previously given to the client. The client then flushes all changes to the file.

Level 2 OpLocks
If an exclusive OpLock is held by a client and a locked file is opened by a third party, the client has to relinquish its exclusive OpLock to allow the other client's write/read access. A client may then receive a "Level 2 OpLock" from the server. A Level 2 OpLock allows the caching of read requests, but excludes write caching.

Breaks
In contrast with the SMB protocol's "standard" behavior, a break request may be sent from server to client. It informs the client that an OpLock is no longer valid. This happens, for example, when another client wishes to open a file in a way that invalidates the OpLock. The first client is then sent an OpLock break and required to send all its local changes (in case of batch or exclusive OpLocks), if any, and acknowledge the OpLock break. Upon this acknowledgment the server can reply to the second client in a consistent manner.
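The caching behaviour described above can be summarised in a small model. The Python class below is purely illustrative (it is not part of any SMB client library): it write-caches under an exclusive OpLock, downgrades on a break, flushes its cached changes and acknowledges, as the protocol requires.

```python
class OplockedFile:
    """Toy model of client-side caching under SMB opportunistic locks."""

    def __init__(self):
        self.oplock = "exclusive"   # assume an exclusive OpLock was granted on open
        self.dirty_writes = []      # changes cached locally, not yet sent

    def write(self, data: bytes):
        if self.oplock == "exclusive":
            self.dirty_writes.append(data)   # safe to cache: no other opener exists
        else:
            self.send_to_server(data)        # Level 2 permits read caching only

    def on_break(self, new_level: str):
        # Another client opened the file: flush local changes, then acknowledge.
        for data in self.dirty_writes:
            self.send_to_server(data)
        self.dirty_writes.clear()
        self.oplock = new_level              # e.g. "level2" or "none"
        self.acknowledge_break()

    def send_to_server(self, data: bytes):
        pass  # placeholder for real SMB write traffic

    def acknowledge_break(self):
        pass  # placeholder for the OpLock break acknowledgment
```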


SMB2
Microsoft introduced a new version of the Server Message Block (SMB) protocol (SMB 2.0 or SMB2) with Windows Vista in 2006.[16] Although the protocol is proprietary, its specification has been published to allow other systems to interoperate with Microsoft operating systems that use the new protocol.[17]

SMB2 reduces the 'chattiness' of the SMB 1.0 protocol by reducing the number of commands and subcommands from over a hundred to just nineteen. It has mechanisms for pipelining, that is, sending additional requests before the response to a previous request arrives, thereby improving performance over high latency links. It adds the ability to compound multiple actions into a single request, which significantly reduces the number of round-trips the client needs to make to the server, improving performance as a result.[9] SMB1 also has a compounding mechanism known as AndX to compound multiple actions, but Microsoft clients rarely use AndX. It also introduces the notion of "durable file handles": these allow a connection to an SMB server to survive brief network outages, as are typical in a wireless network, without having to incur the overhead of re-negotiating a new session.

SMB2 includes support for symbolic links. Other improvements include caching of file properties, improved message signing with the HMAC SHA-256 hashing algorithm and better scalability by increasing the number of users, shares and open files per server, among others.[9] The SMB1 protocol uses 16-bit data sizes, which amongst other things limits the maximum block size to 64K. SMB2 uses 32- or 64-bit wide storage fields, and 128 bits in the case of file handles, thereby removing previous constraints on block sizes, which improves performance with large file transfers over fast networks.[9]

Windows Vista and later operating systems use SMB2 when communicating with other machines running Windows Vista or later. SMB1 continues in use for connections with older versions of Windows, as well as systems like Samba and various vendors' NAS solutions. Samba 3.5 also includes experimental support for SMB2.[18] Samba 3.6 fully supports SMB2, except for the modification of user quotas using the Windows quota management tools.[19]

SMB2 brings a number of benefits to third-party implementers of SMB protocols. SMB1, originally designed by IBM, was reverse engineered, and later became part of a wide variety of non-Windows operating systems such as Xenix, OS/2 and VMS (Pathworks). X/Open standardised it partially; it also had draft IETF standards which lapsed. (See http://ubiqx.org/cifs/Intro.html for historical detail.) SMB2 is also a relatively clean break with the past. Microsoft's SMB1 code has to work with a large variety of SMB clients and servers. SMB1 features many versions of information for commands (selecting what structure to return for a particular request) because features such as Unicode support were retro-fitted at a later date. SMB2 involves significantly reduced compatibility-testing for implementers of the protocol. SMB2 code has considerably less complexity since far less variability exists (for example, non-Unicode code paths become redundant as SMB2 requires Unicode support).



SMB 2.1
SMB 2.1, introduced with Windows 7 and Server 2008 R2, introduced minor performance enhancements with a new opportunistic locking mechanism.[20]

SMB 3.0
SMB 3.0 (previously named SMB 2.2)[21] will be introduced with Windows 8[21] and Windows Server 2012.[21] It will bring several significant changes that are aimed to add functionality and improve SMB2 performance, notably in virtualized data centers like SMB2 RDMA Transport Protocol and multichannel.[22]

Features
The SMB "Inter-Process Communication" (IPC) system provides named pipes and was one of the first inter-process mechanisms commonly available to programmers that provides a means for services to inherit the authentication carried out when a client first connected to an SMB server. Some services that operate over named pipes, such as those which use Microsoft's own implementation of DCE/RPC over SMB, known as MSRPC over SMB, also allow MSRPC client programs to perform authentication, which over-rides the authorization provided by the SMB server, but only in the context of the MSRPC client program that successfully makes the additional authentication. Since Windows domain controllers use SMB to transmit policies at login, they have packet-signing enabled by default to prevent man-in-the-middle attacks; the feature can also be turned on for any server running Windows NT 4.0 Service Pack 3 or later.[23] The design of Server Message Block version 2 (SMB2) aims to mitigate this performance-limitation by coalescing SMB signals into single packets. SMB supports opportunistic locking a special type of locking-mechanism on files in order to improve performance. SMB serves as the basis for Microsoft's Distributed File System implementation.

Security
Over the years, there have been many security vulnerabilities in Microsoft's implementation of the protocol or components that it directly relies on,[24][25][26] with the most recent vulnerability (at time of writing) involving the SMB2 implementation.[27]

Specifications for SMB and SMB2 Protocols


The specifications for the SMB are proprietary and were originally closed, thereby forcing other vendors and projects to reverse-engineer the protocol in order to interoperate with it. The SMB 1.0 protocol was eventually published some time after it was reverse engineered, whereas the SMB 2.0 protocol was made available from Microsoft's MSDN Open Specifications Developer Center from the outset.[28] There are a number of specifications that are relevant to the SMB protocol:
MS-CIFS:[29] a recent replacement (2007) for draft-leach-cifs-v1-spec-02.txt, a document widely used to implement SMB clients, but also known to have errors of omission and commission.
MS-SMB:[30] specification for Microsoft extensions to MS-CIFS.
MS-SMB2:[31] specification for the SMB 2 protocol.
MS-FSSO:[32] describes the intended functionality of the Windows File Access Services System, how it interacts with systems and applications that need file services, and how it interacts with administrative clients to configure and manage the system.



References
[1] "Microsoft SMB Protocol and CIFS Protocol Overview" (http:/ / msdn. microsoft. com/ en-us/ library/ aa365233(VS. 85). aspx). Microsoft. 2009-10-22. . Retrieved 2009-11-01. [2] "Direct hosting of SMB over TCP/IP" (http:/ / support. microsoft. com/ kb/ 204279). Microsoft. 2007-10-11. . Retrieved 2009-11-01. [3] Richard Sharpe (8 October 2002). "Just what is SMB?" (http:/ / samba. anu. edu. au/ cifs/ docs/ what-is-smb. html). . Retrieved 18 July 2011. [4] WhichNAS "NAS Network Protocols Knowledge Base - NAS Basics" (http:/ / www. whichnas. net/ nasbasics/ nas-network-protocols) [5] YANFS yet another nfs (http:/ / www. sun. com/ software/ webnfs/ overview. xml) [6] * Common Internet File System Protocol (CIFS/1.0) (http:/ / www. tools. ietf. org/ html/ draft-heizer-cifs-v1-spec) CIFS Logon and Pass Through Authentication (http:/ / www. tools. ietf. org/ html/ draft-leach-cifs-logon-spec) CIFS/E Browser Protocol (http:/ / www. tools. ietf. org/ html/ draft-leach-cifs-browser-spec) CIFS Printing Specification (http:/ / www. tools. ietf. org/ html/ draft-leach-cifs-print-spec) CIFS Remote Administration Protocol (http:/ / www. tools. ietf. org/ html/ draft-leach-cifs-rap-spec) A Common Internet File System (CIFS/1.0) Protocol (http:/ / www. tools. ietf. org/ html/ draft-leach-cifs-v1-spec) [7] Tridgell, Andrew (June 27, 1997). "A bit of history and a bit of fun" (http:/ / www. rxn. com/ services/ faq/ smb/ samba. history. txt). . Retrieved 2011-07-26. [8] http:/ / samba. 2283325. n4. nabble. com/ Samba-4-functional-levels-td3322760. html [9] Jose Barreto (2008-12-09). "SMB2, a Complete Redesign of the Main Remote File Protocol for Windows" (http:/ / blogs. technet. com/ josebda/ archive/ 2008/ 12/ 05/ smb2-a-complete-redesign-of-the-main-remote-file-protocol-for-windows. aspx). Microsoft. . Retrieved 2009-11-01. [10] Neil Carpenter (2004-10-26). "SMB/CIFS Performance Over WAN Links" (http:/ / blogs. technet. com/ neilcar/ pages/ 247903. aspx). Microsoft. . Retrieved 2009-11-01. [11] Mark Rabinovich, Igor Gokhman. "CIFS Acceleration Techniques" (http:/ / www. snia. org/ sites/ default/ files2/ sdc_archives/ 2009_presentations/ monday/ MarkRabinovich-IgorGokhman-CIFS_Acceleration_Techniques. pdf). Storage Developer Conference, SNIA, Santa Clara 2009. . [12] Mark Rabinovich. "Accelerating SMB2" (http:/ / snia. org/ sites/ default/ files2/ SDC2011/ presentations/ wednesday/ MarkRabinovichAccelerating_SMB2. pdf). Storage Developer Conference, SNIA, Santa Clara 2011. . [13] Christopher Hertel (1999). "SMB: The Server Message Block Protocol" (http:/ / ubiqx. org/ cifs/ SMB. html). . Retrieved 2009-11-01. [14] "Description of Microsoft Windows Encryption Pack 1" (http:/ / support. microsoft. com/ kb/ 159709). Microsoft. 2006-11-01. . Retrieved 2009-11-01. [15] "US Executive Order 13026" (http:/ / www. gpo. gov/ fdsys/ pkg/ WCPD-1996-11-18/ pdf/ WCPD-1996-11-18-Pg2399. pdf). United States Government. 1996. . Retrieved 2009-11-01. [16] Navjot Virk and Prashanth Prahalad (March 10, 2006). "What's new in SMB in Windows Vista" (http:/ / blogs. msdn. com/ chkdsk/ archive/ 2006/ 03/ 10/ 548787. aspx). Chk Your Dsks. MSDN. . Retrieved 2006-05-01. [17] "(MS-SMB2): Server Message Block (SMB) Version 2 Protocol Specification" (http:/ / msdn. microsoft. com/ en-us/ library/ cc246482(PROT. 13). aspx). Microsoft. 2009-09-25. . Retrieved 2009-11-01. [18] Samba 3.5 - Release Notes Archive (http:/ / www. samba. org/ samba/ history/ samba-3. 5. 0. html) [19] Samba 3.6 - Release Notes Archive (http:/ / samba. 
org/ samba/ history/ samba-3. 6. 0. html) [20] "Implementing an End-User Data Centralization Solution" (http:/ / www. microsoft. com/ downloads/ details. aspx?displaylang=en& FamilyID=d8541618-5c63-4c4d-a0fd-d942cd3d2ec6). Microsoft. 2009-10-21. pp.1011. . Retrieved 2009-11-02. [21] Jeffrey Snover (19 April 2012). "Windows Server Blog: SMB 2.2 is now SMB 3.0" (http:/ / blogs. technet. com/ b/ windowsserver/ archive/ 2012/ 04/ 19/ smb-2-2-is-now-smb-3-0. aspx). Microsoft. . Retrieved 14 June 2012. [22] "" (http:/ / www. snia. org/ sites/ default/ files2/ SDC2011/ presentations/ keynote/ ThomasPfenning_The_Future_of_File_Protocols-final. pdf). . [23] "Overview of Server Message Block signing" (http:/ / support. microsoft. com/ kb/ 887429). Microsoft. 2007-11-30. . Retrieved 2009-11-01. [24] "MS02-070: Flaw in SMB Signing May Permit Group Policy to Be Modified" (http:/ / support. microsoft. com/ kb/ 329170). Microsoft. 2007-12-01. . Retrieved 2009-11-01. [25] "MS09-001: Vulnerabilities in SMB could allow remote code execution" (http:/ / support. microsoft. com/ kb/ 958687). Microsoft. 2009-01-13. . Retrieved 2009-11-01. [26] "MS08-068: Vulnerability in SMB could allow remote code execution" (http:/ / support. microsoft. com/ kb/ 957097). Microsoft. 2009-02-26. . Retrieved 2009-11-01. [27] "MS09-050: Vulnerabilities in SMB could allow remote code execution" (http:/ / support. microsoft. com/ kb/ 975517). Microsoft. 2009-10-13. . Retrieved 2009-02-26. [28] Windows Protocols (http:/ / msdn. microsoft. com/ en-us/ library/ cc216517(PROT. 10). aspx) [29] http:/ / msdn. microsoft. com/ en-us/ library/ ee442092%28PROT. 10%29. aspx [30] http:/ / msdn. microsoft. com/ en-us/ library/ cc246231%28PROT. 10%29. aspx [31] http:/ / msdn. microsoft. com/ en-us/ library/ cc246482%28PROT. 10%29. aspx [32] http:/ / msdn. microsoft. com/ en-us/ library/ ee392367%28PROT. 10%29. aspx



External links
Hertel, Christopher (2003). Implementing CIFS The Common Internet FileSystem (http://www.ubiqx.org/ cifs/Book.html). Prentice Hall. ISBN 0-13-047116-X. (Text licensed under the Open Publication License, v1.0 or later, available from the link above.) Technical details about SMB/CIFS (http://ubiqx.org/cifs/) Common Internet File System (CIFS) File Access Protocol (http://www.microsoft.com/downloads/details. aspx?FamilyID=c4adb584-7ff0-4acf-bd91-5f7708adb23c&displaylang=en) - Technical details from Microsoft Corporation the NT LM 0.12 dialect of SMB (http://www.samba.org/samba/ftp/specs/smb-nt01.doc). In Microsoft Word format Samba development information (http://devel.samba.org/) Introduction to the Common Internet File System (CIFS): Leverage the Power of this Popular Network File Sharing Protocol (http://www.embeddedcomponents.com/marketplace/makers/visualitynq/intro/) Online introduction to CIFS: Lecture/blog by Ron Fredericks Zechner, Anton (2007). "Source-code of a free SMB server for small embedded systems" (http://members. inode.at/anton.zechner/az/AzSmb.en.htm) Description of the update that implements Extended Protection for Authentication in the Server service (http:// support.microsoft.com/kb/2345886) Post about SMB3 in the Windows Server Blog (http://blogs.technet.com/b/windowsserver/archive/2012/04/ 19/smb-2-2-is-now-smb-3-0.aspx)


Protocols
SCSI
Small Computer System Interface (SCSI, /ˈskʌzi/ SKUZ-ee)[1] is a set of standards for physically connecting and transferring data between computers and peripheral devices. The SCSI standards define commands, protocols, and electrical and optical interfaces. SCSI is most commonly used for hard disks and tape drives, but it can connect a wide range of other devices, including scanners and CD drives, although not all controllers can handle all devices. The SCSI standard defines command sets for specific peripheral device types; the presence of "unknown" as one of these types means that in theory it can be used as an interface to almost any device, but the standard is highly pragmatic and addressed toward commercial requirements.

The icon/logo used for SCSI.

SCSI is an intelligent, peripheral, buffered, peer-to-peer interface. It hides the complexity of physical format. Every device attaches to the SCSI bus in a similar manner. Up to 8 or 16 devices can be attached to a single bus. There can be any number of hosts and peripheral devices, but there should be at least one host. SCSI uses handshake signals between devices; SCSI-1 and SCSI-2 have the option of parity error checking. Starting with SCSI-U160 (part of SCSI-3) all commands and data are error checked by a CRC32 checksum. The SCSI protocol defines communication from host to host, host to a peripheral device, and peripheral device to a peripheral device. However, most peripheral devices are exclusively SCSI targets, incapable of acting as SCSI initiators, that is, unable to initiate SCSI transactions themselves. Therefore peripheral-to-peripheral communications are uncommon, but possible in most SCSI applications. The Symbios Logic 53C810 chip is an example of a PCI host interface that can act as a SCSI target.

History
SCSI was derived from "SASI", the "Shugart Associates System Interface", developed c. 1978 and publicly disclosed in 1981.[2] A SASI controller provided a bridge between a hard disk drive's low-level interface and a host computer, which needed to read blocks of data. SASI controller boards were typically the size of a hard disk drive and were usually physically mounted to the drive's chassis. SASI, which was used in mini- and early microcomputers, defined the interface as using a 50-pin flat ribbon connector which was adopted as the SCSI-1 connector. SASI is a fully compliant subset of SCSI-1 so that many, if not all, of the then-existing SASI controllers were SCSI-1 compatible.[3] Larry Boucher is considered to be the "father" of SASI and SCSI due to his pioneering work first at Shugart Associates and then at Adaptec.[4] Until at least February 1982, ANSI developed the specification as "SASI" and "Shugart Associates System Interface;"[5] however, the committee documenting the standard would not allow it to be named after a company. Almost a full day was devoted to agreeing to name the standard "Small Computer System Interface," which Boucher intended to be pronounced "sexy", but ENDL's[6] Dal Allan pronounced the new acronym as "scuzzy" and that stuck.[4]

A number of companies such as NCR Corporation, Adaptec and Optimem were early supporters of the SCSI standard.[5] The NCR facility in Wichita, Kansas is widely thought to have developed the industry's first SCSI chip; it worked the first time.[7] The "small" part in SCSI is historical; since the mid-1990s, SCSI has been available on even the largest of computer systems.

Since its standardization in 1986, SCSI has been commonly used in the Amiga, Apple Macintosh and Sun Microsystems computer lines and PC server systems. Apple started using Parallel ATA (also known as IDE) for its low-end machines with the Macintosh Quadra 630 in 1994, and added it to its high-end desktops starting with the Power Macintosh G3 in 1997. Apple dropped on-board SCSI completely (in favor of IDE and FireWire) with the (Blue & White) Power Mac G3 in 1999. Sun has switched its lower-end range to Serial ATA (SATA). SCSI has never been popular in the low-priced IBM PC world, owing to the lower cost and adequate performance of the ATA hard disk standard. However, SCSI drives and even SCSI RAIDs became common in PC workstations for video or audio production.

Recent versions of SCSI, namely Serial Storage Architecture (SSA), SCSI-over-Fibre Channel Protocol (FCP), Serial Attached SCSI (SAS), Automation/Drive Interface Transport Protocol (ADT), and USB Attached SCSI (UAS), break from the traditional parallel SCSI standards and perform data transfer via serial communications. Although much of the documentation of SCSI talks about the parallel interface, most contemporary development effort is on serial SCSI. Serial SCSI has a number of advantages over parallel SCSI: faster data rates, hot swapping (some but not all parallel SCSI interfaces support it), and improved fault isolation. The primary reason for the shift to serial interfaces is the clock skew issue of high-speed parallel interfaces, which makes the faster variants of parallel SCSI susceptible to problems caused by cabling and termination. iSCSI preserves the basic SCSI paradigm, especially the command set, almost unchanged, through embedding of SCSI-3 over TCP/IP.

SCSI is popular on high-performance workstations and servers. RAIDs on servers have almost always used SCSI hard disks, though a number of manufacturers now offer SATA-based RAID systems as a cheaper option. Instead of SCSI, desktop computers and notebooks more typically use ATA interfaces for internal hard disk drives, and USB, eSATA, and FireWire connections for external devices. As of 2012, SCSI interfaces had become impossible to find for laptop computers. Adaptec had years before produced PCMCIA SCSI interfaces, but when PCMCIA was superseded by ExpressCard, it discontinued its PCMCIA line without supporting ExpressCard. Ratoc produced USB and FireWire to SCSI adaptors, but ceased production when the integrated circuits required were discontinued. Drivers for existing PCMCIA interfaces were not produced for newer operating systems.


Interfaces
SCSI is available in a variety of interfaces. The first, still very common, was parallel SCSI (now also called SPI), which uses a parallel electrical bus design. As of 2008, SPI is being replaced by Serial Attached SCSI (SAS), which uses a serial design but retains other aspects of the technology. Many other interfaces which do not rely on complete SCSI standards still implement the SCSI command protocol; others (such as iSCSI) drop physical implementation entirely while retaining the SCSI architectural model. iSCSI, for example, uses TCP/IP as a transport mechanism.
Two SCSI connectors.

SCSI interfaces have often been included on computers from various manufacturers for use under Microsoft Windows, Mac OS, Unix, Commodore Amiga and Linux operating systems, either implemented on the motherboard or by the means of plug-in adaptors. With the advent of SAS and SATA drives, provision for SCSI on motherboards is being discontinued. A few companies still market SCSI interfaces for motherboards supporting PCIe and PCI-X.


Parallel SCSI
The main parallel SCSI variants compare as follows. Bandwidth is the peak bus throughput; the permitted cable length and device count depend on the signalling used: single-ended (SE), low-voltage differential (LVD) or high-voltage differential (HVD).

SCSI-1 (Narrow SCSI): SCSI-1 standard (1986); connector IDC50 or Centronics C50; 8-bit bus; 5 MHz clock; 5 MB/s (40 Mbit/s); maximum length 6 m (SE) or 25 m (HVD); up to 8 devices.

Fast SCSI: SCSI-2 (1994); connector IDC50 or Centronics C50; 8-bit bus; 10 MHz; 10 MB/s (80 Mbit/s); 3 m (SE) or 25 m (HVD); up to 8 devices.

Fast-Wide SCSI: SCSI-2 and SCSI-3 SPI (1996); connector 2 x 50-pin (SCSI-2) or 1 x 68-pin (SCSI-3); 16-bit bus; 10 MHz; 20 MB/s (160 Mbit/s); 3 m (SE) or 25 m (HVD); up to 16 devices.

Ultra SCSI (Fast-20): SCSI-3 SPI; connector IDC50; 8-bit bus; 20 MHz; 20 MB/s (160 Mbit/s); 1.5 m (SE, up to 8 devices), 3 m (SE, up to 4 devices) or 25 m (HVD, up to 8 devices).

Ultra Wide SCSI: SCSI-3 SPI; connector 68-pin; 16-bit bus; 20 MHz; 40 MB/s (320 Mbit/s); 1.5 m (SE, up to 8 devices), 3 m (SE, up to 4 devices) or 25 m (HVD, up to 16 devices).

Ultra2 SCSI (Fast-40): SCSI-3 SPI-2 (1997); connector 50-pin; 8-bit bus; 40 MHz; 40 MB/s (320 Mbit/s); 12 m (LVD) or 25 m (HVD); up to 8 devices.

Ultra2 Wide SCSI: SCSI-3 SPI-2; connector 68-pin or 80-pin (SCA/SCA-2); 16-bit bus; 40 MHz; 80 MB/s (640 Mbit/s); 12 m (LVD) or 25 m (HVD); up to 16 devices.

Ultra3 SCSI (Ultra-160, Fast-80 wide): SCSI-3 SPI-3 (1999); connector 68-pin or 80-pin (SCA/SCA-2); 16-bit bus; 40 MHz DDR; 160 MB/s (1280 Mbit/s); 12 m (LVD); up to 16 devices.

Ultra-320 SCSI (Ultra-4, Fast-160): SCSI-3 SPI-4 (2002); connector 68-pin or 80-pin (SCA/SCA-2); 16-bit bus; 80 MHz DDR; 320 MB/s (2560 Mbit/s); 12 m (LVD); up to 16 devices.

Ultra-640 SCSI (Ultra-5): SCSI-3 SPI-5 (2003); connector 68-pin or 80-pin; 16-bit bus; 160 MHz DDR; 640 MB/s (5120 Mbit/s); up to 16 devices.


Other SCSI interfaces


These serial and networked SCSI transports compare as follows (all use a serial, 1-bit-wide physical interface):

SSA: 200 MHz; 40 MB/s (320 Mbit/s); maximum length 25 m; up to 96 devices.

SSA 40: 400 MHz; 80 MB/s (640 Mbit/s); 25 m; up to 96 devices.

FC-AL 1Gb: 1 GHz; 100 MB/s (800 Mbit/s); 500 m / 3 km; up to 127 devices.

FC-AL 2Gb: 2 GHz; 200 MB/s (1600 Mbit/s); 500 m / 3 km; up to 127 devices.

FC-AL 4Gb: 4 GHz; 400 MB/s (3200 Mbit/s); 500 m / 3 km; up to 127 devices.

SAS 1.1: 3 GHz; 300 MB/s (2400 Mbit/s); 6 m; up to 16,256 devices.

SAS 2.0: 6 GHz; 600 MB/s (4800 Mbit/s); 6 m; up to 16,256 devices.

iSCSI: throughput, cable length and device count are implementation- and network-dependent.

Cabling
SCSI Parallel Interface
Internal parallel SCSI cables are usually ribbons, with two or more 50-, 68-, or 80-pin connectors attached. External cables are typically shielded (but may not be), with 50- or 68-pin connectors at each end, depending upon the specific SCSI bus width supported.[22] The 80-pin Single Connector Attachment (SCA) is typically used for hot-pluggable devices.
Bus terminator with top cover removed.

Serial attached SCSI


Serial attached SCSI uses a modified Serial ATA data and power cable.

iSCSI
iSCSI (Internet Small Computer System Interface) usually uses Ethernet connectors and cables as its physical transport, but can run over any physical transport capable of transporting IP.

USB Attached SCSI


USB Attached SCSI allows SCSI devices to use the Universal Serial Bus.

Automation/Drive Interface

The Automation/Drive Interface Transport Protocol (ADT) is used to connect removable media devices, such as tape drives, with the controllers of the libraries (automation devices) in which they are installed. The ADI standard specifies the use of RS-422 for the physical connections. The second-generation ADT-2 standard defines iADT, use of the ADT protocol over IP (Internet Protocol) connections, such as over Ethernet. The Automation/Drive Interface Commands standards (ADC, ADC-2, and ADC-3) define SCSI commands for these installations.


SCSI command protocol


In addition to many different hardware implementations, the SCSI standards also include an extensive set of command definitions. The SCSI command architecture was originally defined for parallel SCSI buses but has been carried forward with minimal change for use with iSCSI and serial SCSI. Other technologies which use the SCSI command set include the ATA Packet Interface, USB Mass Storage class and FireWire SBP-2.

In SCSI terminology, communication takes place between an initiator and a target. The initiator sends a command to the target, which then responds. SCSI commands are sent in a Command Descriptor Block (CDB). The CDB consists of a one byte operation code followed by five or more bytes containing command-specific parameters. At the end of the command sequence, the target returns a status code byte, such as 00h for success, 02h for an error (called a Check Condition), or 08h for busy. When the target returns a Check Condition in response to a command, the initiator usually then issues a SCSI Request Sense command in order to obtain a key code qualifier (KCQ) from the target. The Check Condition and Request Sense sequence involves a special SCSI protocol called a Contingent Allegiance Condition.

There are 4 categories of SCSI commands: N (non-data), W (writing data from initiator to target), R (reading data), and B (bidirectional). There are about 60 different SCSI commands in total, with the most commonly used being:
Test unit ready: Queries device to see if it is ready for data transfers (disk spun up, media loaded, etc.).
Inquiry: Returns basic device information.
Request sense: Returns any error codes from the previous command that returned an error status.
Send diagnostic and Receive diagnostic results: runs a simple self-test, or a specialised test defined in a diagnostic page.
Start/Stop unit: Spins disks up and down, or loads/unloads media (CD, tape, etc.).
Read capacity: Returns storage capacity.
Format unit: Prepares a storage medium for use. In a disk, a low level format will occur. Some tape drives will erase the tape in response to this command.
Read format capacities: Retrieve the data capacity of the device.
Read (four variants): Reads data from a device.
Write (four variants): Writes data to a device.
Log sense: Returns current information from log pages.
Mode sense: Returns current device parameters from mode pages.
Mode select: Sets device parameters in a mode page.
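The request/status exchange above can be made concrete with a few of the well-known values just listed. The sketch below is illustrative only; the issue() transport function is a placeholder, not a real API.

```python
# Well-known operation codes and status values (hexadecimal) from the text above.
TEST_UNIT_READY = 0x00
REQUEST_SENSE   = 0x03
INQUIRY         = 0x12

STATUS_GOOD            = 0x00
STATUS_CHECK_CONDITION = 0x02
STATUS_BUSY            = 0x08

def issue(cdb: bytes) -> int:
    """Placeholder for a real transport (parallel SCSI, SAS, iSCSI, ...)."""
    raise NotImplementedError

def run_command(cdb: bytes) -> int:
    status = issue(cdb)
    if status == STATUS_CHECK_CONDITION:
        # The initiator asks why the command failed; the returned sense data
        # carries the key code qualifier (KCQ) mentioned above.
        issue(bytes([REQUEST_SENSE, 0x00, 0x00, 0x00, 252, 0x00]))  # 6-byte CDB
    return status

# Example: a TEST UNIT READY command is a 6-byte CDB of mostly zero bytes.
# run_command(bytes([TEST_UNIT_READY, 0, 0, 0, 0, 0]))
```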

Each device on the SCSI bus is assigned a unique SCSI identification number or ID. Devices may encompass multiple logical units, which are addressed by logical unit number (LUN). Simple devices have just one LUN; more complex devices may have multiple LUNs. A "direct access" (i.e. disk type) storage device consists of a number of logical blocks, addressed by Logical Block Address (LBA). A typical LBA equates to 512 bytes of storage. The usage of LBAs has evolved over time and so four different command variants are provided for reading and writing data. The Read(6) and Write(6) commands contain a 21-bit LBA address. The Read(10), Read(12), Read Long, Write(10), Write(12), and Write Long commands all contain a 32-bit LBA address plus various other parameter options.
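A minimal sketch, assuming only the LBA widths stated above, of how an initiator might choose between the 6-byte and 10-byte read CDBs. The byte layouts for READ(6) and READ(10) follow the standard opcodes 0x08 and 0x28, but the helper is an illustration rather than a production encoder.

```python
import struct

def read_cdb(lba: int, blocks: int) -> bytes:
    """Pick READ(6) when the LBA fits in 21 bits, otherwise fall back to READ(10)."""
    if lba < (1 << 21) and 0 < blocks <= 255:
        # READ(6): opcode 0x08, 21-bit LBA in 3 bytes, 1-byte transfer length, control.
        return struct.pack(">B3sBB", 0x08, lba.to_bytes(3, "big"), blocks, 0x00)
    # READ(10): opcode 0x28, flags, 32-bit LBA, group, 16-bit transfer length, control.
    return struct.pack(">BBIBHB", 0x28, 0x00, lba, 0x00, blocks, 0x00)

print(read_cdb(0x001234, 8).hex())    # 6-byte CDB: 080012340800
print(read_cdb(0x12345678, 8).hex())  # 10-byte CDB: 28001234567800000800
```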

The capacity of a "sequential access" (i.e. tape-type) device is not specified because it depends, amongst other things, on the length of the tape, which is not identified in a machine-readable way. Read and write operations on a sequential access device begin at the current tape position, not at a specific LBA. The block size on sequential access devices can either be fixed or variable, depending on the specific device. Tape devices such as half-inch 9-track tape, DDS (4mm tapes physically similar to DAT), Exabyte, etc., support variable block sizes.


Device identification
In modern SCSI transport protocols, there is an automated process for "discovery" of the IDs. SSA initiators "walk the loop" to determine what devices are connected and then assign each one a 7-bit "hop-count" value. Fibre Channel Arbitrated Loop (FC-AL) initiators use the LIP (Loop Initialization Protocol) to interrogate each device port for its WWN (World Wide Name). For iSCSI, because of the unlimited scope of the (IP) network, the process is quite complicated. These discovery processes occur at power-on/initialization time and also if the bus topology changes later, for example if an extra device is added.

On a parallel SCSI bus, a device (e.g. host adapter, disk drive) is identified by a "SCSI ID", which is a number in the range 0–7 on a narrow bus and in the range 0–15 on a wide bus. On earlier models a physical jumper or switch controls the SCSI ID of the initiator (host adapter). On modern host adapters (since about 1997), doing I/O to the adapter sets the SCSI ID; for example, the adapter often contains a BIOS program that runs when the computer boots up and that program has menus that let the operator choose the SCSI ID of the host adapter. Alternatively, the host adapter may come with software that must be installed on the host computer to configure the SCSI ID. The traditional SCSI ID for a host adapter is 7, as that ID has the highest priority during bus arbitration (even on a 16-bit bus).

The SCSI ID of a device in a drive enclosure that has a backplane is set either by jumpers or by the slot in the enclosure the device is installed into, depending on the model of the enclosure. In the latter case, each slot on the enclosure's backplane delivers control signals to the drive to select a unique SCSI ID. A SCSI enclosure without a backplane often has a switch for each drive to choose the drive's SCSI ID. The enclosure is packaged with connectors that must be plugged into the drive where the jumpers are typically located; the switch emulates the necessary jumpers. While there is no standard that makes this work, drive designers typically set up their jumper headers in a consistent format that matches the way these switches are implemented.

Note that a SCSI target device (which can be called a "physical unit") is often divided into smaller "logical units." For example, a high-end disk subsystem may be a single SCSI device but contain dozens of individual disk drives, each of which is a logical unit (more commonly, it is not that simple: virtual disk devices are generated by the subsystem based on the storage in those physical drives, and each virtual disk device is a logical unit). The SCSI ID, WWN, etc. in this case identifies the whole subsystem, and a second number, the logical unit number (LUN), identifies a disk device within the subsystem. It is quite common, though incorrect, to refer to the logical unit itself as a "LUN."[23] Accordingly, the actual LUN may be called a "LUN number" or "LUN id".[24]

Setting the bootable (or first) hard disk to SCSI ID 0 is an accepted IT community recommendation. SCSI ID 2 is usually set aside for the floppy disk drive while SCSI ID 3 is typically for a CD-ROM drive.[25]
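As a toy illustration of the arbitration-priority rule mentioned above (ID 7 highest, then 6 down to 0, then 15 down to 8 on a wide bus), the following Python sketch picks a winner among contending IDs; the list and helper names are assumptions chosen for demonstration only.

```python
# Bus-arbitration priority on a wide (16-bit) parallel SCSI bus: IDs 7 down to 0
# outrank IDs 15 down to 8, which is why host adapters traditionally sit at ID 7.
WIDE_BUS_PRIORITY = list(range(7, -1, -1)) + list(range(15, 7, -1))

def arbitration_winner(contending_ids):
    """Return the contending SCSI ID with the highest arbitration priority."""
    return min(contending_ids, key=WIDE_BUS_PRIORITY.index)

print(arbitration_winner([3, 7, 12]))  # 7 always wins against any other ID
print(arbitration_winner([10, 12]))    # 12 outranks 10 within the 15..8 group
```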

Device Type
While all SCSI controllers can work with read/write storage devices, i.e. disk and tape, some will not work with some other device types; older controllers are likely to be more limited,[26] sometimes by their driver software, and more Device Types were added as SCSI evolved. Even CD-ROMs are not handled by all controllers. Device Type is a 5-bit field reported by a SCSI Inquiry Command; defined SCSI Peripheral Device Types include, in addition to many varieties of storage device, printer, scanner, communications device, and a catch-all "processor" type for devices not otherwise listed.


SCSI enclosure services


In larger SCSI servers, the disk-drive devices are housed in an intelligent enclosure that supports SCSI Enclosure Services (SES). The initiator can communicate with the enclosure using a specialized set of SCSI commands to access power, cooling, and other non-data characteristics.

References
[1] Field. The Book of SCSI. pp.1.
[2] ANSI Draft SASI Standard, Rev D, February 17, 1982, pg. ii states, "9/15/81 first presentation to ANSI committee X3T9-3 (2 weeks following announcement in Electronic Design)."
[3] ANSI SCSI Standard, X3.131-1986, June 23, 1986, 2nd, foreword.
[4] "How Computer Storage Became a Modern Business," Computer History Museum, March 9, 2005 (http://www.youtube.com/watch?v=OiLUIJ3ke-o)
[5] Working document for ANSI meeting on March 3, 1982, "SASI SHUGART ASSOCIATES SYSTEM INTERFACE, Revision D, February 17, 1982"
[6] ENDL Inc. Home Page (http://www.endl.com/)
[7] NCR Collection (LSI Logic) at Smithsonian Museum (http://smithsonianchips.si.edu/ncr/scsi-1.htm)
[8] Specifications are maintained by the T10 subcommittee of the International Committee for Information Technology Standards.
[9] Clock rate in MHz for SPI, or bitrate (per second) for serial interfaces
[10] In megabytes per second, not megabits per second
[11] In megabits per second, not megabytes per second
[12] For daisy-chain designs, length of bus, from end to end; for point-to-point, length of a single link
[13] LVD cabling may be up to 25m when only a single device is attached to the host adapter
[14] Including any host adapters (i.e., computers count as a device)
[15] The SCSI-1 specification has been withdrawn and is superseded by SCSI-2. The SCSI-3 SPI specification has been withdrawn and is superseded by SPI-2. The SCSI-3 SPI-3 and SPI-4 specifications have been withdrawn and are superseded by SPI-5. "T10 Withdrawn Standards and Technical Reports" (http://www.t10.org/drafts.htm#OBSOLETE). Retrieved March 18, 2010.
[16] "Random Problems Encountered When Mixing SE and LVD SCSI Standards" (http://support.microsoft.com/kb/285013). Retrieved May 7, 2008.
[17] spatial reuse
[18] full duplex
[19] per direction
[20] 500 meters for multi-mode, 3 kilometers for single-mode
[21] 128 per expander
[22] SCSI Standards & Cables for the "normal"* person (http://www.ramelectronics.net/scsi_cables__.ep)
[23] "na_lun(1) Manual page for "lun" on NetApp DataONTAP". NetApp. July 7, 2009. "The lun command is used to create and manage luns[...]"
[24] "na_lun(1) Manual page for "lun" on NetApp DataONTAP". NetApp. July 7, 2009. "If a LUN ID is not specified, the smallest number [...] is automatically picked."
[25] Groth, David; Dan Newland (January 2001). A+ Complete Study Guide (2nd Edition) (http://www.bookfinder4u.com/IsbnSearch.aspx?isbn=0782128025&mode=direct). Alameda, CA, USA: Sybex. pp.183. ISBN 0-7821-4244-3.
[26] An example of an old SCSI interface which supported only named mass storage devices (http://h30097.www3.hp.com/docs/base_doc/DOCUMENTATION/V40F_HTML/MAN/MAN7/0003____.HTM)


Bibliography
Pickett, Joseph P., et al. (ed.) (2000). The American Heritage Dictionary of the English Language (AHD) (http://www.bartleby.com/61/) (4th ed.). Houghton Mifflin Company. ISBN 0-395-82517-2.
Field, Gary; Peter Ridge, John Lohmeyer, Gerhard Islinger, Stefan Groll (2000). The Book of SCSI (2nd ed.). No Starch Press. ISBN 1-886411-10-7.

External links
SCSI Tutorial (http://www.pacificcable.com/SCSI-Tutorial.html)
SCSI Details, Wiring, Compaq/HP (http://www.delec.com/guide/scsi/)
All About SCSI (http://www.datapro.net/techinfo/scsi_doc.html)
T10 Technical Committee (http://www.t10.org/) (SCSI standards)
SCSITA terminology (http://www.scsita.org/terms-and-terminology.html)
"Storage Cornucopia" SCSI links, maintained by a consultant (http://www.bswd.com/cornucop.htm)
SCSI/iSCSI/RAID/SAS Information Sheet (http://www.woodsmall.com/SCSI.htm)
SCSI basics (http://www.pcnineoneone.com/howto/scsi1.html)
SCSI and ATA pinouts (http://pinouts.ru/pin_HD.shtml)

Anatomy of the Linux SCSI subsystem (http://www.ibm.com/developerworks/linux/library/l-scsi-subsystem/?ca=dgr-lnxw57LinuxSCSIsub&S_TACT=105AGX59&S_CMP=GR)
List of Adapters by SCSI connector type (http://www.scsi4me.com/scsi-connectors.htm)
SCSI Library (http://www.scsilibrary.com/)
SCSI connector photos (http://www-304.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=MIGR-4AQSCA)

iSCSI
In computing, iSCSI (/aɪˈskʌzi/ eye-SKUZ-ee) is an abbreviation of Internet Small Computer System Interface, an Internet Protocol (IP)-based storage networking standard for linking data storage facilities. By carrying SCSI commands over IP networks, iSCSI is used to facilitate data transfers over intranets and to manage storage over long distances. iSCSI can be used to transmit data over local area networks (LANs), wide area networks (WANs), or the Internet and can enable location-independent data storage and retrieval. The protocol allows clients (called initiators) to send SCSI commands (CDBs) to SCSI storage devices (targets) on remote servers. It is a storage area network (SAN) protocol, allowing organizations to consolidate storage into data center storage arrays while providing hosts (such as database and web servers) with the illusion of locally attached disks. Unlike traditional Fibre Channel, which requires special-purpose cabling, iSCSI can be run over long distances using existing network infrastructure.

Functionality
iSCSI uses TCP (typically TCP ports 860 and 3260). In essence, iSCSI simply allows two hosts to negotiate and then exchange SCSI commands using IP networks. By doing this iSCSI takes a popular high-performance local storage bus and emulates it over wide-area networks, creating a storage area network (SAN). Unlike some SAN protocols, iSCSI requires no dedicated cabling; it can be run over existing IP infrastructure. As a result, iSCSI is often seen as a low-cost alternative to Fibre Channel, which requires dedicated infrastructure except in its FCoE (Fibre Channel over Ethernet) form. However, the performance of an iSCSI SAN deployment can be severely degraded if not operated on a dedicated network or subnet (LAN or VLAN).
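A minimal sketch of the point made above that iSCSI rides on an ordinary TCP connection, conventionally to port 3260 on the target. The function only tests TCP reachability of that port and does not perform the actual iSCSI login negotiation; the host address is a placeholder from the documentation range.

```python
import socket

def iscsi_port_open(host: str, port: int = 3260, timeout: float = 2.0) -> bool:
    """Check whether a host accepts TCP connections on the conventional iSCSI port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(iscsi_port_open("192.0.2.10"))  # 192.0.2.10 is a documentation-range address
```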

Although iSCSI can communicate with arbitrary types of SCSI devices, system administrators almost always use it to allow server computers (such as database servers) to access disk volumes on storage arrays. iSCSI SANs often have one of two objectives:

Storage consolidation: Organizations move disparate storage resources from servers around their network to central locations, often in data centers; this allows for more efficiency in the allocation of storage. In a SAN environment, a server can be allocated a new disk volume without any change to hardware or cabling.

Disaster recovery: Organizations mirror storage resources from one data center to a remote data center, which can serve as a hot standby in the event of a prolonged outage. In particular, iSCSI SANs allow entire disk arrays to be migrated across a WAN with minimal configuration changes, in effect making storage "routable" in the same manner as network traffic.


Network booting
For general data storage on an already-booted computer, any type of generic network interface may be used to access iSCSI devices. However, a generic consumer-grade network interface is not able to boot a diskless computer from a remote iSCSI data source. Instead it is commonplace for a server to load its initial operating system from a TFTP server or local boot device, and then use iSCSI for data storage once booting from the local device has finished. A separate DHCP server may be configured to assist interfaces equipped with network boot capability to be able to boot over iSCSI. In this case the network interface looks for a DHCP server offering a PXE or bootp boot image. This is used to kick off the iSCSI remote boot process, using the booting network interface's MAC address to direct the computer to the correct iSCSI boot target. Most Intel Ethernet controllers for servers support iSCSI boot.[1]

Concepts
Initiator
Further information: SCSI initiator

An initiator functions as an iSCSI client. An initiator typically serves the same purpose to a computer as a SCSI bus adapter would, except that instead of physically cabling SCSI devices (like hard drives and tape changers), an iSCSI initiator sends SCSI commands over an IP network. An initiator falls into two broad types:

Software initiator: A software initiator uses code to implement iSCSI. Typically, this happens in a kernel-resident device driver that uses the existing network card (NIC) and network stack to emulate SCSI devices for a computer by speaking the iSCSI protocol. Software initiators are available for most popular operating systems and are the most common method of deploying iSCSI.

Hardware initiator: A hardware initiator uses dedicated hardware, typically in combination with software (firmware) running on that hardware, to implement iSCSI. A hardware initiator mitigates the overhead of iSCSI and TCP processing and Ethernet interrupts, and therefore may improve the performance of servers that use iSCSI.

Host bus adapter: An iSCSI host bus adapter (more commonly, HBA) implements a hardware initiator. A typical HBA is packaged as a combination of a Gigabit (or 10 Gigabit) Ethernet NIC, some kind of TCP/IP offload engine (TOE) technology and a SCSI bus adapter, which is how it appears to the operating system. An iSCSI HBA can include PCI option ROM to allow booting from an iSCSI SAN.

TCP offload engine: A TCP Offload Engine, or "TOE Card", offers an alternative to a full iSCSI HBA. A TOE "offloads" the TCP/IP operations for this particular network interface from the host processor, freeing up CPU cycles for the main host applications. When a TOE is used rather than an HBA, the host processor still has to perform the processing of the iSCSI protocol layer itself, but the CPU overhead for that task is low. iSCSI HBAs or TOEs are used when the additional performance enhancement justifies the additional expense of using an HBA for iSCSI, rather than using a software-based iSCSI client (initiator).


Target
The iSCSI specification refers to a storage resource located on an iSCSI server (more generally, one of potentially many instances of iSCSI storage nodes running on that server) as a target. "iSCSI target" should not be confused with the term "iSCSI" as the latter is a protocol and not a storage server instance. An iSCSI target is often a dedicated network-connected hard disk storage device, but may also be a general-purpose computer, since as with initiators, software to provide an iSCSI target is available for most mainstream operating systems. Common deployment scenarios for an iSCSI target include:

Storage array: In a data center or enterprise environment, an iSCSI target often resides in a large storage array, such as an EqualLogic, Nimble Storage, Isilon, NetApp filer, EMC NS-series, CX4, VNX, VNXe, VMAX or a HDS HNAS computer appliance. A storage array usually provides distinct iSCSI targets for numerous clients.[2]

Software target: Nearly all modern mainstream server operating systems (such as BSD, Linux, Solaris or Windows Server) can provide iSCSI target functionality, either as a built-in feature or with supplemental software. Some specific-purpose operating systems (such as FreeNAS, NAS4Free, Openfiler or OpenMediaVault) implement iSCSI target support.

Logical unit number: In SCSI terminology, LUN stands for logical unit number. A LUN represents an individually addressable (logical) SCSI device that is part of a physical SCSI device (target). In an iSCSI environment, LUNs are essentially numbered disk drives. An initiator negotiates with a target to establish connectivity to a LUN; the result is an iSCSI connection that emulates a connection to a SCSI hard disk. Initiators treat iSCSI LUNs the same way as they would a raw SCSI or IDE hard drive; for instance, rather than mounting remote directories as would be done in NFS or CIFS environments, iSCSI systems format and directly manage filesystems on iSCSI LUNs.

In enterprise deployments, LUNs usually represent slices of large RAID disk arrays, often allocated one per client. iSCSI imposes no rules or restrictions on multiple computers sharing individual LUNs; it leaves shared access to a single underlying filesystem as a task for the operating system.


Addressing
Special names refer to both iSCSI initiators and targets. iSCSI provides three name formats:

iSCSI Qualified Name (IQN) format: documented in RFC 3720, with further examples of names in RFC 3721. Briefly, the fields are:
- the literal "iqn"
- the date (yyyy-mm) that the naming authority took ownership of the domain
- the reversed domain name of the authority (org.alpinelinux, com.example, to.yp.cr)
- an optional ":" prefixing a storage target name specified by the naming authority.
Example names from the RFC:
iqn.1992-01.com.example:storage:diskarrays-sn-a8675309
iqn.1992-01.com.example
iqn.1992-01.com.example:storage.tape1.sys1.xyz
iqn.1992-01.com.example:storage.disk2.sys1.xyz[3]

Extended Unique Identifier (EUI) format: eui.{EUI-64 bit address} (e.g. eui.02004567A425678D)

T11 Network Address Authority (NAA) format: naa.{NAA 64 or 128 bit identifier} (e.g. naa.52004567BA64678D)

IQN format addresses occur most commonly. They are qualified by a date (yyyy-mm) because domain names can expire or be acquired by another entity. The IEEE Registration Authority provides EUIs in accordance with the EUI-64 standard. NAA identifiers incorporate an OUI, which is also provided by the IEEE Registration Authority. NAA name formats were added to iSCSI in RFC 3980 to provide compatibility with naming conventions used in Fibre Channel and Serial Attached SCSI (SAS) storage technologies.

Usually an iSCSI participant can be defined by three or four fields:
1. Hostname or IP Address (e.g., "iscsi.example.com")
2. Port Number (e.g., 3260)
3. iSCSI Name (e.g., the IQN "iqn.2003-01.com.ibm:00.fcd0ab21.shark128")
4. An optional CHAP Secret (e.g., "secretsarefun")
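As an illustration of the IQN structure just described, this minimal Python sketch splits an IQN into its date, reversed-domain and optional target-name fields; the function name is an assumption chosen for demonstration and is not part of any iSCSI library.

```python
def parse_iqn(name: str):
    """Split an iSCSI Qualified Name into (date, reversed domain, optional target part)."""
    prefix, _, rest = name.partition(".")
    if prefix != "iqn":
        raise ValueError("not an IQN address")
    # Everything after the first ':' (if any) is the authority-defined target string.
    authority_part, _, target = rest.partition(":")
    # The first field is the yyyy-mm date on which the authority owned the domain.
    date, _, reversed_domain = authority_part.partition(".")
    return date, reversed_domain, target or None

# Example name from RFC 3720:
print(parse_iqn("iqn.1992-01.com.example:storage:diskarrays-sn-a8675309"))
# -> ('1992-01', 'com.example', 'storage:diskarrays-sn-a8675309')
```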


iSNS
iSCSI initiators can locate appropriate storage resources using the Internet Storage Name Service (iSNS) protocol. In theory, iSNS provides iSCSI SANs with the same management model as dedicated Fibre Channel SANs. In practice, administrators can satisfy many deployment goals for iSCSI without using iSNS.

Security
Authentication
iSCSI initiators and targets prove their identity to each other using the CHAP protocol, which includes a mechanism to prevent cleartext passwords from appearing on the wire. By itself, the CHAP protocol is vulnerable to dictionary attacks, spoofing, or reflection attacks. If followed carefully, the rules for using CHAP within iSCSI prevent most of these attacks.[4] Additionally, as with all IP-based protocols, IPsec can operate at the network layer. The iSCSI negotiation protocol is designed to accommodate other authentication schemes, though interoperability issues limit their deployment.
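A minimal sketch of the CHAP response computation (an MD5 digest over the identifier byte, the shared secret and the challenge, as defined in RFC 1994) on which iSCSI's CHAP authentication is built; the identifier and secret values below are invented for illustration.

```python
import hashlib
import os

def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """CHAP response per RFC 1994: MD5 over the one-byte identifier, the shared
    secret, and the challenge issued by the authenticating peer."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

# Illustrative exchange; the identifier and secret are made-up values.
challenge = os.urandom(16)                                    # sent by the target
response = chap_response(0x27, b"example-secret", challenge)  # computed by the initiator
print(response.hex())
```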

Logical network isolation


To ensure that only valid initiators connect to storage arrays, administrators most commonly run iSCSI only over logically isolated backchannel networks. In this deployment architecture, only the management ports of storage arrays are exposed to the general-purpose internal network, and the iSCSI protocol itself is run over dedicated network segments or virtual LANs (VLAN). This mitigates authentication concerns; unauthorized users aren't physically provisioned for iSCSI, and thus can't talk to storage arrays. However, it also creates a transitive trust problem, in that a single compromised host with an iSCSI disk can be used to attack storage resources for other hosts.

Physical network isolation


While iSCSI can be logically isolated from the general network using VLANs only, it is still no different from any other network equipment and may use any cable or port as long as there is a completed signal path between source and target. Just a single cabling mistake by an inexperienced network technician can compromise the barrier of logical separation, and an accidental bridging may not be immediately detected because it does not cause network errors. In order to further differentiate iSCSI from the regular network and prevent cabling mistakes when changing connections, administrators may implement self-defined color coding and labeling standards, such as only using yellow-colored cables for the iSCSI connections and only blue cables for the regular network, and clearly labeling ports and switches used only for iSCSI. While iSCSI could be implemented as just a VLAN cluster of ports on a large multi-port switch that is also used for general network usage, the administrator may instead choose to use physically separate switches dedicated to iSCSI VLANs only, to further prevent the possibility of an incorrectly connected cable plugged into the wrong port bridging the logical barrier.

Authorization
Because iSCSI aims to consolidate storage for many servers into a single storage array, iSCSI deployments require strategies to prevent unrelated initiators from accessing storage resources. As a pathological example, a single enterprise storage array could hold data for servers variously regulated by the Sarbanes–Oxley Act for corporate accounting, HIPAA for health benefits information, and PCI DSS for credit card processing. During an audit, storage systems must demonstrate controls to ensure that a server under one regime cannot access the storage assets of a server under another. Typically, iSCSI storage arrays explicitly map initiators to specific target LUNs; an initiator authenticates not to the storage array, but to the specific storage asset it intends to use. However, because the target LUNs for SCSI commands are expressed both in the iSCSI negotiation protocol and in the underlying SCSI protocol, care must be taken to ensure that access control is provided consistently.
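A toy sketch of the initiator-to-LUN mapping idea described above: each initiator name is allowed to reach only the LUNs explicitly mapped to it. The IQNs and LUN numbers are invented for illustration; real arrays enforce this in their own configuration layers.

```python
# Hypothetical LUN-masking table: initiator IQN -> set of visible LUNs.
LUN_MAP = {
    "iqn.1992-01.com.example:host-accounting": {0, 1},
    "iqn.1992-01.com.example:host-webfarm":    {2},
}

def allowed(initiator_iqn: str, lun: int) -> bool:
    """Return True only if this initiator has been mapped to the requested LUN."""
    return lun in LUN_MAP.get(initiator_iqn, set())

print(allowed("iqn.1992-01.com.example:host-webfarm", 2))  # True
print(allowed("iqn.1992-01.com.example:host-webfarm", 0))  # False
```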


Confidentiality and integrity


For the most part, iSCSI operates as a cleartext protocol that provides no cryptographic protection for data in motion during SCSI transactions. As a result, an attacker who can listen in on iSCSI Ethernet traffic can:
- reconstruct and copy the files and filesystems being transferred on the wire
- alter the contents of files by injecting fake iSCSI frames
- corrupt filesystems being accessed by initiators, exposing servers to software flaws in poorly tested filesystem code.
These problems do not occur only with iSCSI, but rather apply to any SAN protocol without cryptographic security. IP-based security protocols, such as IPsec, can provide standards-based cryptographic protection to this traffic, generally at a severe performance penalty.

Industry implementation
Operating-system capability
The dates that appear in the following table might be misleading. It is known, for example, that IBM delivered an iSCSI storage device (NAS200i) in 2001 for use with Windows NT, Windows 2000[5] and Linux.[6]
OS | First release date | Version | Features
i5/OS | 2006-10 | i5/OS V5R4M0 | Target, Multipath
VMware ESX | 2006-06 | ESX 3.0, ESX 4.0, ESXi 5.0 | Initiator, Multipath
AIX | 2002-10 | AIX 5.3 TL10, AIX 6.1 TL3 | Initiator, Target
Windows | 2003-06 | 2000, XP Pro, 2003, Vista, 2008, 2008 R2, Windows 7, Windows 8, Windows Server 2012 | Initiator, Target, Multipath
NetWare | 2003-08 | NetWare 5.1, 6.5, & OES | Initiator, Target
HP-UX | 2003-10 | HP 11i v1, HP 11i v2, HP 11i v3 | Initiator
Solaris | 2002-05 | Solaris 10, OpenSolaris | Initiator, Target, Multipath, iSER
Linux | 2005-06 | 2.6.12, 3.1 | Initiator (2.6.12), Target (3.1), Multipath, iSER
OpenBSD | 2009-10 | 4.9 | Initiator
NetBSD | 2002-06 | 4.0, 5.0 | Initiator (5.0), Target (4.0)
FreeBSD | 2002-08 | 7.0 | Initiator, Target from NetBSD
OpenVMS | 2002-08 | 8.3-1H1 | Initiator, Multipath
Mac OS X | 2008-07 | 10.4 - 10.7 | N/A

Notes on the Windows target: Target available only as part of Windows Unified Data Storage Server. Target available in Storage Server 2008 (except Basic edition).[7] Target available for Windows Server 2008 R2 as a separate download. Windows Server 2012 has a built-in iSCSI target, version 3.3 (at least in preview versions).

Mac OS X has neither an initiator nor a target available from the OS vendor directly. A few Mac OS X initiators and targets are available, but only from third-party vendors.


Targets
Most iSCSI targets involve disk, though iSCSI tape and medium-changer targets are popular as well. So far, physical devices have not featured native iSCSI interfaces on a component level. Instead, devices with Parallel SCSI or Fibre Channel interfaces are bridged by using iSCSI target software, external bridges, or controllers internal to the device enclosure. Alternatively, it is possible to virtualize disk and tape targets. Rather than representing an actual physical device, an emulated virtual device is presented. The underlying implementation can deviate drastically from the presented target as is done with virtual tape library (VTL) products. VTLs use disk storage for storing data written to virtual tapes. As with actual physical devices, virtual targets are presented by using iSCSI target software, external bridges, or controllers internal to the device enclosure. In the security products industry, some manufacturers use an iSCSI RAID as a target, with the initiator being either an IP-enabled encoder or camera.

Converters and bridges


Multiple systems exist that allow Fibre Channel, SCSI and SAS devices to be attached to an IP network for use via iSCSI. They can be used to allow migration from older storage technologies, access to SANs from remote servers and the linking of SANs over IP networks. An iSCSI gateway bridges IP servers to Fibre Channel SANs. The TCP connection is terminated at the gateway, which is implemented on a Fibre Channel switch or as a standalone appliance.

References
[1] "Intel Ethernet Controllers" (http:/ / www. intel. com/ content/ www/ us/ en/ ethernet-controllers/ ethernet-controllers. html). Intel.com. . Retrieved 2012-09-18. [2] Architecture and Dependability of Large-Scale Internet Services (http:/ / roc. cs. berkeley. edu/ papers/ inet-computing. pdf) David Oppenheimer and David A. Patterson, Berkeley, IEEE Internet Computing, SeptemberOctober 2002. [3] "RFC 3720 - Internet Small Computer Systems Interface (iSCSI), (Section 3.2.6.3.1. Type "iqn." (iSCSI Qualified Name))" (http:/ / tools. ietf. org/ html/ rfc3720#section-3. 2. 6. 3). 2004-04. p.32. . Retrieved 2010-07-16. [4] Satran, Julian; Kalman, Meth; Sapuntzakis, Costa; Zeidner, Efri; Chadalapaka, Mallikarjun (2004-04-02). "RFC 3720" (http:/ / tools. ietf. org/ html/ rfc3720#section-8. 2. 1). . [5] (http:/ / www-900. ibm. com/ cn/ support/ library/ storage/ download/ 200i iSCSI client for NT& 2000 Installation& User Guide. pdf) [6] (http:/ / www-900. ibm. com/ cn/ support/ library/ storage/ download/ 200i iSCSI client for Linux Installation& User Guide. pdf) [7] "Windows Storage Server | NAS | File Management" (http:/ / www. microsoft. com/ windowsserver2008/ en/ us/ WSS08/ iSCSI. aspx). Microsoft. . Retrieved 2012-09-18.

RFCs
RFC 3720 - Internet Small Computer Systems Interface (iSCSI)
RFC 3721 - Internet Small Computer Systems Interface (iSCSI) Naming and Discovery
RFC 3722 - String Profile for Internet Small Computer Systems Interface (iSCSI) Names
RFC 3723 - Securing Block Storage Protocols over IP (Scope: The use of IPsec and IKE to secure iSCSI, iFCP, FCIP, iSNS and SLPv2.)
RFC 3347 - Small Computer Systems Interface protocol over the Internet (iSCSI) Requirements and Design Considerations
RFC 3783 - Small Computer Systems Interface (SCSI) Command Ordering Considerations with iSCSI
RFC 3980 - T11 Network Address Authority (NAA) Naming Format for iSCSI Node Names

RFC 4018 - Finding Internet Small Computer Systems Interface (iSCSI) Targets and Name Servers by Using Service Location Protocol version 2 (SLPv2)
RFC 4173 - Bootstrapping Clients using the Internet Small Computer System Interface (iSCSI) Protocol
RFC 4544 - Definitions of Managed Objects for Internet Small Computer System Interface (iSCSI)
RFC 4850 - Declarative Public Extension Key for Internet Small Computer Systems Interface (iSCSI) Node Architecture
RFC 4939 - Definitions of Managed Objects for iSNS (Internet Storage Name Service)
RFC 5048 - Internet Small Computer System Interface (iSCSI) Corrections and Clarifications
RFC 5047 - DA: Datamover Architecture for the Internet Small Computer System Interface (iSCSI)
RFC 5046 - Internet Small Computer System Interface (iSCSI) Extensions for Remote Direct Memory Access (RDMA)


External links
SCST: A Generic SCSI Target for Linux (includes iSCSI, FC, FCoE, IB) (http://scst.sourceforge.net/)

Fibre Channel


Fibre Channel, or FC, is a gigabit-speed network technology primarily used for storage networking.[1][2] Fibre Channel is standardized in the T11 Technical Committee of the InterNational Committee for Information Technology Standards (INCITS), an American National Standards Institute (ANSI)-accredited standards committee. Fibre Channel was primarily used in the supercomputer field, but has now become the standard connection type for storage area networks (SAN) in enterprise storage. Despite its name, Fibre Channel signaling can run on twisted-pair copper wire in addition to fiber-optic cables.[1][2] Fibre Channel Protocol (FCP) is a transport protocol (similar to TCP used in IP networks) that predominantly transports SCSI commands over Fibre Channel networks.[1][2]

History
Fibre Channel started in 1988, with ANSI standard approval in 1994, as a way to simplify the HIPPI system then in use for similar roles. HIPPI used a massive 50-pair cable with bulky connectors, and had limited cable lengths. When Fibre Channel started to compete for the mass storage market its primary competitor was IBM's proprietary Serial Storage Architecture (SSA) interface. Eventually the market chose Fibre Channel over SSA, depriving IBM of control over the next generation of mid- to high-end storage technology. Fibre Channel was primarily concerned with simplifying the connections and increasing distances, as opposed to increasing speeds. Later, designers added the goals of connecting SCSI disk storage, providing higher speeds and far greater numbers of connected devices. It also added support for any number of "upper layer" protocols, including ATM, IP and FICON, with SCSI being the predominant usage. The following table shows Fibre Channel speed variants:[3]


Fibre Channel Variants


NAME | Line rate (GBaud) | Throughput (full duplex) (MBps)* | Availability
1GFC | 1.0625 | 200 | 1997
2GFC | 2.125 | 400 | 2001
4GFC | 4.25 | 800 | 2004
8GFC | 8.5 | 1600 | 2005
10GFC Serial | 10.52 | 2550 | 2008
10GFC Parallel | 12.75 | |
16GFC | 14.025 | 3200 | 2011
20GFC | 21.04 | 5100 | 20??

* Throughput for duplex connections

Fibre Channel topologies


There are three major Fibre Channel topologies, describing how a number of ports are connected together. A port in Fibre Channel terminology is any entity that actively communicates over the network, not necessarily a hardware port. This port is usually implemented in a device such as disk storage, an HBA on a server or a Fibre Channel switch.[1]

Point-to-point (FC-P2P). Two devices are connected directly to each other. This is the simplest topology, with limited connectivity.[1]

Arbitrated loop (FC-AL). In this design, all devices are in a loop or ring, similar to token ring networking. Adding or removing a device from the loop causes all activity on the loop to be interrupted. The failure of one device causes a break in the ring. Fibre Channel hubs exist to connect multiple devices together and may bypass failed ports. A loop may also be made by cabling each port to the next in a ring. A minimal loop containing only two ports, while appearing to be similar to FC-P2P, differs considerably in terms of the protocol. Only one pair of ports can communicate concurrently on a loop. Maximum speed of 8GFC.

Switched fabric (FC-SW). All devices or loops of devices are connected to Fibre Channel switches, similar conceptually to modern Ethernet implementations. Advantages of this topology over FC-P2P or FC-AL include:
- The switches manage the state of the fabric, providing optimized interconnections.
- The traffic between two ports flows through the switches only; it is not transmitted to any other port.
- Failure of a port is isolated and should not affect operation of other ports.
- Multiple pairs of ports may communicate simultaneously in a fabric.


Attribute | Point-to-Point | Arbitrated loop | Switched fabric
Max ports | 2 | 127 | ~16777216 (2^24)
Address size | N/A | 8-bit ALPA | 24-bit port ID
Side effect of port failure | Link fails | Loop fails (until port bypassed) | N/A
Mixing different link rates | No | No | Yes
Frame delivery | In order | In order | Not guaranteed
Access to medium | Dedicated | Arbitrated | Dedicated

Layers
Fibre Channel does not follow the OSI model layering, but is split similarly into five layers:
- FC4 - Protocol-mapping layer, in which application protocols, such as SCSI or IP, are encapsulated into a PDU for delivery to FC2.
- FC3 - Common services layer, a thin layer that could eventually implement functions like encryption or RAID redundancy algorithms.
- FC2 - Network layer, defined by the FC-PI-2 standard, consists of the core of Fibre Channel, and defines the main protocols.
- FC1 - Data link layer, which implements line coding of signals.
- FC0 - PHY, includes cabling, connectors etc.
Layers FC0 through FC2 are also known as FC-PH, the physical layers of Fibre Channel. Fibre Channel routers operate up to FC4 level (i.e. they may operate as SCSI routers), switches up to FC2, and hubs on FC0 only.

Fibre Channel products are available at 1, 2, 4, 8, 10, 16 and 20 Gbit/s; these protocol flavors are called accordingly 1GFC, 2GFC, 4GFC, 8GFC, 10GFC, 16GFC, or 20GFC. The 16GFC standard was approved by the INCITS T11 committee in 2010, and those products are expected to become available in 2011. Products based on the 1GFC, 2GFC, 4GFC, 8GFC and 16GFC standards should be interoperable and backward compatible. The 1GFC, 2GFC, 4GFC, 8GFC designs all use 8b/10b encoding, while the 16GFC standard uses 64b/66b encoding. Unlike the 10GFC and 20GFC standards, 16GFC provides backward compatibility with 4GFC and 8GFC. The 10 Gbit/s standard and its 20 Gbit/s derivative, however, are not backward-compatible with any of the slower-speed devices, as they differ considerably on FC1 level in using 64b/66b encoding instead of 8b/10b encoding, and are primarily used as inter-switch links.

Ports
The following types of ports are defined by Fibre Channel:

Node ports:
- N_port is a port on the node (e.g. host or storage device) used with both FC-P2P or FC-SW topologies. Also known as node port.
- NL_port is a port on the node used with an FC-AL topology. Also known as Node Loop port.
FC topologies and port types

Switch and router ports:
- F_port is a port on the switch that connects to a node point-to-point (i.e. connects to an N_port). Also known as fabric port. An F_port is not loop capable.

- FL_port is a port on the switch that connects to a FC-AL loop (i.e. to NL_ports). Also known as fabric loop port.
- E_port is the connection between two fibre channel switches. Also known as an Expansion port. When E_ports between two switches form a link, that link is referred to as an inter-switch link (ISL).
- D_port is a diagnostic port, used solely for the purpose of running link-level diagnostics between two switches and to isolate link level fault on the port, in the SFP, or in the cable.
- EX_port is the connection between a fibre channel router and a fibre channel switch. On the side of the switch it looks like a normal E_port, but on the side of the router it is an EX_port.
- TE_port is a Cisco addition to Fibre Channel, now adopted as a standard. It is an extended ISL or EISL. The TE_port provides not only standard E_port functions but allows for routing of multiple VSANs (Virtual SANs). This is accomplished by modifying the standard Fibre Channel frame (vsan tagging) upon ingress/egress of the VSAN environment. Also known as Trunking E_port.
- VE_port is an INCITS T11 addition, an FCIP interconnected E-Port/ISL, i.e. fabrics will merge.
- VEX_port is an INCITS T11 addition, an FCIP interconnected EX-Port; routing is needed via lsan zoning to connect an initiator to a target.

General (catch-all) types:
- Auto or auto-sensing port, found in Cisco switches, can automatically become an E_, TE_, F_, or FL_port as needed.
- Fx_port is a generic port that can become a F_port (when connected to a N_port) or a FL_port (when connected to a NL_port). Found only on Cisco devices where oversubscription is a factor.
- GL_port on a switch can operate as an E_port, FL_port, or F_port. Found on QLogic switches.
- G_port or generic port on a switch can operate as an E_port or F_port. Found on Brocade, McData, and QLogic switches.
- L_port is the loose term used for any arbitrated loop port, NL_port or FL_port. Also known as Loop port.
- U_port is the loose term used for any arbitrated port. Also known as Universal port. Found only on Brocade switches.

(Note: The term "trunking" is not a standard Fibre Channel term and is used by vendors interchangeably. For example, a trunk (an aggregation of ISLs) in a Brocade device is referred to as a Port Channel by Cisco, whereas Cisco refers to trunking as an EISL.)


Optical carrier medium variants

Typical Fibre Channel connectors: modern LC on the left and older SC (typical for 1 Gbit/s speeds) on the right


Single-mode fiber:
Speed (MByte/s) | Transmitter | Medium variant | Distance
1600 | 1310 nm longwave light | 1600-SM-LC-L | 0.5 m - 10 km
1600 | 1490 nm longwave light | 1600-SM-LZ-I | 0.5 m - 2 km
800 | 1310 nm longwave light | 800-SM-LC-L | 2 m - 10 km
800 | 1310 nm longwave light | 800-SM-LC-I | 2 m - 1.4 km
400 | 1310 nm longwave light | 400-SM-LC-L | 2 m - 10 km
400 | 1310 nm longwave light | 400-SM-LC-M | 2 m - 4 km
400 | 1310 nm longwave light | 400-SM-LL-I | 2 m - 2 km
200 | 1550 nm longwave light | 200-SM-LL-V | 2 m - 50 km
200 | 1310 nm longwave light | 200-SM-LC-L | 2 m - 10 km
200 | 1310 nm longwave light | 200-SM-LL-I | 2 m - 2 km
100 | 1550 nm longwave light | 100-SM-LL-V | 2 m - 50 km
100 | 1310 nm longwave light | 100-SM-LL-L | 2 m - 10 km
100 | 1310 nm longwave light | 100-SM-LC-L | 2 m - 10 km
100 | 1310 nm longwave light | 100-SM-LL-I | 2 m - 2 km

Multimode fiber:
Speed (MByte/s) | Transmitter | Medium variant | Distance
1600 | 850 nm shortwave light | 1600-M5F-SN-I | 0.5 m - 125 m
1600 | 850 nm shortwave light | 1600-M5E-SN-I | 0.5 m - 100 m
1600 | 850 nm shortwave light | 1600-M5-SN-S | 0.5 m - 35 m
1600 | 850 nm shortwave light | 1600-M6-SN-S | 0.5 m - 15 m
800 | 850 nm shortwave light | 800-M5F-SN-I | 0.5 m - 190 m
800 | 850 nm shortwave light | 800-M5E-SN-I | 0.5 m - 150 m
800 | 850 nm shortwave light | 800-M5-SN-S | 0.5 m - 50 m
800 | 850 nm shortwave light | 800-M6-SN-S | 0.5 m - 21 m
400 | 850 nm shortwave light | 400-M5F-SN-I | 0.5 m - 400 m
400 | 850 nm shortwave light | 400-M5E-SN-I | 0.5 m - 380 m
400 | 850 nm shortwave light | 400-M5-SN-I | 0.5 m - 150 m
400 | 850 nm shortwave light | 400-M6-SN-I | 0.5 m - 70 m
200 | 850 nm shortwave light | 200-M5E-SN-I | 0.5 m - 500 m
200 | 850 nm shortwave light | 200-M5-SN-I | 0.5 m - 300 m
200 | 850 nm shortwave light | 200-M6-SN-I | 0.5 m - 150 m
100 | 850 nm shortwave light | 100-M5E-SN-I | 0.5 m - 860 m
100 | 850 nm shortwave light | 100-M5-SN-I | 0.5 m - 500 m
100 | 850 nm shortwave light | 100-M6-SN-I | 0.5 m - 300 m
100 | 850 nm shortwave light | 100-M5-SL-I | 2 m - 500 m
100 | 850 nm shortwave light | 100-M6-SL-I | 2 m - 175 m

Multimode fiber designations:
Fiber | Diameter | FC media designation
OM1 | 62.5 µm | M6
OM2 | 50 µm | M5
OM3 | 50 µm | M5E
OM4 | 50 µm | M5F

Modern Fibre Channel devices support the SFP transceiver, mainly with the LC fiber connector. Older 1GFC devices used the GBIC transceiver, mainly with the SC fiber connector.


Fibre Channel infrastructure


Fibre Channel switches can be divided into two classes. These classes are not part of the standard, and the classification of every switch is a marketing decision of the manufacturer:
- Directors offer a high port-count in a modular (slot-based) chassis with no single point of failure (high availability).
- Switches are typically smaller, fixed-configuration (sometimes semi-modular), less redundant devices.

A fabric consisting entirely of one vendor is considered to be homogeneous. This is often referred to as operating in its "native mode" and allows the vendor to add proprietary features which may not be compliant with the Fibre Channel standard.

SAN switch with optical FC connectors installed.

If multiple switch vendors are used within the same fabric it is heterogeneous, the switches may only achieve adjacency if all switches are placed into their interoperability modes. This is called the "open fabric" mode as each vendor's switch may have to disable its proprietary features to comply with the Fibre Channel standard. Some switch manufacturers offer a variety of interoperability modes above and beyond the "native" and "open fabric" states. These "native interoperability" modes allow switches to operate in the native mode of another vendor and still maintain some of the proprietary behaviors of both. However, running in native interoperability mode may still disable some proprietary features and can produce fabrics of questionable stability.

Fibre Channel host bus adapters


Fibre Channel HBAs are available for all major open systems, computer architectures, and buses, including PCI and SBus. Some are OS dependent. Each HBA has a unique World Wide Name (WWN), which is similar to an Ethernet MAC address in that it uses an Organizationally Unique Identifier (OUI) assigned by the IEEE. However, WWNs are longer (8 bytes). There are two types of WWNs on a HBA; a node WWN (WWNN), which can be shared by some or all ports of a device, and a port WWN (WWPN), which is necessarily unique to each port.
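A small sketch of how an 8-byte WWN, as described above, is conventionally rendered as colon-separated hex; the example WWPN value is made up for illustration.

```python
def format_wwn(wwn: bytes) -> str:
    """Render an 8-byte World Wide Name in the familiar colon-separated hex form."""
    if len(wwn) != 8:
        raise ValueError("a WWN is 8 bytes long")
    return ":".join(f"{b:02x}" for b in wwn)

wwpn = bytes.fromhex("500a098187f93622")   # made-up port WWN for illustration
print(format_wwn(wwpn))                    # 50:0a:09:81:87:f9:36:22
```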

Development tools
When developing and/or troubleshooting the Fibre Channel bus, examination of hardware signals can be very important to find problems. Logic analyzers and bus analyzers are tools which collect, analyze, decode, and store signals so people can view the high-speed waveforms at their leisure.

References
[1] Preston, W. Curtis (2002). "Fibre Channel Architecture". Using SANs and NAS. Sebastopol, CA: O'Reilly Media. pp.19-39. ISBN 978-0-596-00153-7. OCLC 472853124.
[2] Riabov, Vladmir V. (2004). "Storage Area Networks (SANs)". In Bidgoli, Hossein. The Internet Encyclopedia. Volume 3, P-Z. Hoboken, NJ: John Wiley & Sons. pp.329-338. ISBN 978-0-471-68997-3. OCLC 55610291.
[3] "Roadmaps" (http://www.fibrechannel.org/roadmaps). Fibre Channel Industry Association. Retrieved 2009-11-26.
[4] Transmitter values listed are the currently specified values for the variant listed. Some older versions of the FC standards listed slightly different values (however, the values listed here fall within the +/- variance allowed). Individual variations for each specification are listed in the references associated with those entries in this table. FC-PH = X3T11 Project 755D; FC-PH-2 = X3T11 Project 901D; FC-PI-4 = INCITS Project 1647-D; FC-PI-5 = INCITS Project 2118D. Copies are available from INCITS (http://www.incits.org/).
[5] FC-PI-5 Clause 6.3
[6] FC-PI-5 Clause 8.1
[7] FC-PI-4 Clause 6.3
[8] FC-PI-4 Clause 8.1
[9] FC-PH-2 lists 1300nm (see clause 6.1 and 8.1)

[10] FC-PI clause 8.1
[11] FC-PH-2 clause 8.1
[12] FC-PI-4 Clause 11
[13] FC-PH lists 1300nm (see clause 6.1 and 8.1)
[14] FC-PH Clause 8.1
[15] FC-PI-5 Clause 6.4
[16] FC-PI-4 Clause 6.4
[17] The older FC-PH and FC-PH-2 list 850nm (for 62.5m cables) and 780nm (for 50m cables) (see clause 6.2, 8.2, and 8.3)
[18] FC-PI-5 Clause 8.2
[19] FC-PI-5 Annex A
[20] FC-PI-4 Clause 8.2
[21] FC-PI Clause 8.2
[22] PC-PI-4 Clause 8.2
[23] PC-PI Clause 8.2
[24] PC-PI Clause 8.2
[25] FC-PH Annex C and Annex E


INCITS Fibre Channel standards

Sources
Clark, T. Designing Storage Area Networks, Addison-Wesley, 1999. ISBN 0-201-61584-3

Further reading
RFC 2625 IP and ARP over Fibre Channel
RFC 2837 Definitions of Managed Objects for the Fabric Element in Fibre Channel Standard
RFC 3723 Securing Block Storage Protocols over IP
RFC 4044 Fibre Channel Management MIB
RFC 4625 Fibre Channel Routing Information MIB
RFC 4626 MIB for Fibre Channel's Fabric Shortest Path First (FSPF) Protocol

External links
Fibre Channel Industry Association (http://www.fibrechannel.org/) (FCIA)
INCITS technical committee responsible for FC standards (T11) (http://www.t11.org/index.html)
IBM SAN Survival Guide (http://www.redbooks.ibm.com/redbooks.nsf/0/f98c75e7c6b5a4ca88256d0c0060bcc0?OpenDocument)
Introduction to Storage Area Networks (http://www.redbooks.ibm.com/abstracts/sg245470.html?Open)
Fibre Channel overview (http://hsi.web.cern.ch/HSI/fcs/spec/overview.htm)
Fibre Channel tutorial (http://www.iol.unh.edu/training/fc/tutorials/fc_tutorial.php) (UNH-IOL)
Storage Networking Industry Association (http://www.snia.org) (SNIA)

Internet Fibre Channel Protocol



Internet Fibre Channel Protocol (iFCP) is a gateway-to-gateway network protocol standard, officially ratified by the Internet Engineering Task Force, which provides Fibre Channel fabric functionality to Fibre Channel devices over an IP network. It is most commonly deployed in 1 Gbit/s, 2 Gbit/s, 4 Gbit/s, 8 Gbit/s, and 10 Gbit/s variants.

Technical overview
The iFCP protocol enables the implementation of Fibre Channel functionality over an IP network, within which the Fibre Channel switching and routing infrastructure is replaced by IP components and technology. Congestion control, error detection and recovery are provided through the use of TCP (Transmission Control Protocol). The primary objective of iFCP is to allow existing Fibre Channel devices to be networked and interconnected over an IP-based network at wire speeds. The address translation method defined by the protocol permits Fibre Channel storage devices and host adapters to be attached to an IP-based fabric using transparent gateways. The main function of the iFCP protocol layer is to transport Fibre Channel frame images between Fibre Channel ports attached both locally and remotely. When transporting frames to a remote Fibre Channel port, iFCP encapsulates and routes the Fibre Channel frames that make up each Fibre Channel information unit via a predetermined TCP connection for transport across the IP network.

External links
RFCs
RFC 4172 - A Protocol for Internet Fibre Channel Storage Networking (iFCP)

Other Links
iFCP Information Page [1] at the SNIA IP Storage Forum.
iFCP Subgroup [2] at the SNIA IP Storage Forum.
Protocol Summary [3] by javvin.com.

References
[1] http://www.snia.org/forums/ipsf/programs/about/ifcp/
[2] http://www.snia.org/tech_activities/ip_storage/ifcp/
[3] http://www.javvin.com/protocoliFCP.html

Fibre Channel over Ethernet



Fibre Channel over Ethernet (FCoE) is an encapsulation of Fibre Channel frames over Ethernet networks. This allows Fibre Channel to use 10 Gigabit Ethernet networks (or higher speeds) while preserving the Fibre Channel protocol. The specification, supported by a large number of network and storage vendors, is part of the International Committee for Information Technology Standards T11 FC-BB-5 standard.[1]

Functionality
FCoE maps Fibre Channel directly over Ethernet while being independent of the Ethernet forwarding scheme. The FCoE protocol specification replaces the FC0 and FC1 layers of the Fibre Channel stack with Ethernet. By retaining the native Fibre Channel constructs, FCoE was meant to integrate with existing Fibre Channel networks and management software.

Many data centers use Ethernet for TCP/IP networks and Fibre Channel for storage area networks (SANs). With FCoE, Fibre Channel becomes another network protocol running on Ethernet, alongside traditional Internet Protocol (IP) traffic. FCoE operates directly above Ethernet in the network protocol stack, in contrast to iSCSI which runs on top of TCP and IP. As a consequence, FCoE is not routable at the IP layer, and will not work across routed IP networks.

Combined storage and local area network

Since classical Ethernet had no priority-based flow control, unlike Fibre Channel, FCoE requires enhancements to the Ethernet standard to support a priority-based flow control mechanism (this prevents frame loss). The IEEE standards body is working on this in the Data Center Bridging Task Group.

Fibre Channel required three primary extensions to deliver the capabilities of Fibre Channel over Ethernet networks:
- Encapsulation of native Fibre Channel frames into Ethernet frames.
- Extensions to the Ethernet protocol itself to enable an Ethernet fabric in which frames are not routinely lost during periods of congestion.
- Mapping between Fibre Channel N_port IDs (aka FCIDs) and Ethernet MAC addresses.

Computers connect to FCoE with Converged Network Adapters (CNAs), which contain both Fibre Channel Host Bus Adapter (HBA) and Ethernet Network Interface Card (NIC) functionality on the same adapter card. CNAs have one or more physical Ethernet ports. FCoE encapsulation can be done in software with a conventional Ethernet network interface card, however FCoE CNAs offload (from the CPU) the low-level frame processing and SCSI protocol functions traditionally performed by Fibre Channel host bus adapters.

"Converged" network adapter


Application
The main application of FCoE is in data center storage area networks (SANs). FCoE has particular application in data centers due to the cabling reduction it makes possible, as well as in server virtualization applications, which often require many physical I/O connections per server.

With FCoE, network (IP) and storage (SAN) data traffic can be consolidated using a single network. This consolidation can:
- reduce the number of network interface cards required to connect to disparate storage and IP networks
- reduce the number of cables and switches
- reduce power and cooling costs

Frame Format
FCoE is encapsulated over Ethernet with the use of a dedicated Ethertype, 0x8906. A single 4-bit field (version) satisfies the IEEE sub-type requirements. The SOF (start of frame) and EOF (end of frame) are encoded as specified in RFC 3643. Reserved bits are present to guarantee that the FCoE frame meets the minimum length requirement of Ethernet. Inside the encapsulated Fibre Channel frame, the frame header is retained so as to allow connecting to a storage network by passing on the Fibre Channel frame directly after de-encapsulation. The FIP (FCoE Initialization Protocol) is an integral part of FCoE. Its main goal is to discover and initialize FCoE capable entities connected to an Ethernet cloud. FIP uses a dedicated Ethertype of 0x8914.
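A minimal sketch distinguishing FCoE from FIP traffic by EtherType, using the 0x8906 and 0x8914 values given above; the frame construction assumes an untagged Ethernet frame and is for illustration only.

```python
FCOE_ETHERTYPE = 0x8906   # FCoE data frames
FIP_ETHERTYPE = 0x8914    # FCoE Initialization Protocol frames

def classify(frame: bytes) -> str:
    """Classify a raw (untagged) Ethernet frame by its EtherType field."""
    ethertype = int.from_bytes(frame[12:14], "big")
    if ethertype == FCOE_ETHERTYPE:
        return "FCoE"
    if ethertype == FIP_ETHERTYPE:
        return "FIP"
    return "other"

# A made-up frame: 6-byte destination MAC, 6-byte source MAC, then the EtherType.
frame = bytes(12) + (0x8906).to_bytes(2, "big") + bytes(46)
print(classify(frame))  # FCoE
```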

FCoE Frame Format

Timeline
Azule Technology developed an early implementation and applied for a patent in October 2003.[2] The FCoE standardization activity started in April 2007. The FCoE technology was defined as part of the INCITS T11 FC-BB-5 standard that was forwarded to ANSI for publication in June 2009.[1] The FC-BB-5 standard was published in May 2010 as ANSI/INCITS 462-2010.[3] An early implementor was Nuova Systems, a subsidiary of Cisco Systems, which announced a switch in April 2008.[4][5] Brocade Communications Systems also announced support in 2008.[6] After the late-2000s financial crisis, however, any new technology had a hard time getting established.[7][8]


References
[1] "Fibre Channel: Backbone - 5 revision 2.00" (http:/ / www. t11. org/ ftp/ t11/ pub/ fc/ bb-5/ 09-056v5. pdf) (PDF). American National Standard for Information Technology International Committee for Information Technology Standards Technical Group T11. June 4, 2009. . Retrieved May 5, 2011. [2] "FCoE PatentApplication" (http:/ / appft. uspto. gov/ netacgi/ nph-Parser?Sect1=PTO1& Sect2=HITOFF& d=PG01& p=1& u=/ netahtml/ PTO/ srchnum. html& r=1& f=G& l=50& s1="20080028096". PGNR. & OS=DN/ 20080028096& RS=DN/ 20080028096). . [3] "Information technology - Fibre Channel - Backbone - 5 (FC-BB-5)" (http:/ / www. techstreet. com/ standards/ INCITS/ 462_2010?product_id=1724386). ANSI/INCITS 462-2010. InterNational Committee for Information Technology Standards (formerly NCITS). May 13, 2010. . Official standard. [4] Paul Shread (April 10, 2008). "Cisco Buys Nuova as FCoE Heats Up" (http:/ / www. enterprisestorageforum. com/ ipstorage/ news/ article. php/ 3739991/ Cisco-Buys-Nuova-as-FCoE-Heats-Up. htm). Enterprise Storage Forum. . Retrieved May 5, 2011. [5] "Cisco Announces Intent to Acquire Remaining Interest in Nuova Systems" (http:/ / newsroom. cisco. com/ dlls/ 2008/ prod_040808b. html). Press release (Cisco Systems). April 8, 2008. . Retrieved May 5, 2011. [6] Dave Rowell (March 19, 2008). "Cisco, Brocade See One Big Happy Fabric" (http:/ / www. enterprisestorageforum. com/ sans/ features/ article. php/ 3735351). Enterprise Storage Forum. . Retrieved May 5, 2011. [7] Drew Robb (March 29, 2011). "FCoE Struggles to Gain Traction" (http:/ / www. enterprisestorageforum. com/ article. php/ 3929431). Enterprise Storage Forum. . Retrieved May 5, 2011. [8] Henry Newman (April 25, 2011). "FCoE Gets Lost in Vendor Stupidity" (http:/ / www. enterprisestorageforum. com/ features/ article. php/ 3931681/ FCoE-Gets-Lost-in-Vendor-Stupidity. htm). Enterprise Storage Forum. . Retrieved May 5, 2011.

External links
SCST: A Generic SCSI Target for Linux (includes iSCSI, FC, FCoE, IB) (http://scst.sourceforge.net/)
Implementation for the Linux operating system (http://www.open-fcoe.org/)


NetApp
NetApp filer
In computer storage, a NetApp filer, also known as NetApp Fabric-Attached Storage (FAS) or a NetApp network attached storage (NAS) device, is NetApp's offering in the area of storage systems. A FAS functions in an enterprise-class storage area network (SAN) as well as a networked storage appliance. It can serve storage over a network using file-based protocols such as NFS, CIFS, FTP, TFTP, and HTTP. Filers can also serve data over block-based protocols such as Fibre Channel (FC), Fibre Channel over Ethernet (FCoE) and iSCSI.[1] NetApp filers implement their physical storage in large disk arrays.

Most other large storage vendors' filers tend to use commodity computers with an operating system such as Microsoft Windows Storage Server or tuned Linux. NetApp filers use highly customized hardware and the proprietary Data ONTAP operating system, both originally designed by founders David Hitz and James Lau specifically for storage-serving purposes. Data ONTAP is NetApp's internal operating system, specially optimised for storage functions at high and low level. It boots from FreeBSD as a stand-alone kernel-space module and uses some functions of FreeBSD (the command interpreter and driver stack, for example).

All filers have battery-backed NVRAM, which allows them to commit writes to stable storage quickly, without waiting on disks. Early filers connected to external disk enclosures via SCSI, while modern models (as of 2009) use the FC and SAS protocols. The disk enclosures (shelves) support FC hard disk drives, as well as parallel ATA, serial ATA and Serial Attached SCSI. Implementers often organize two filers in a high-availability cluster with a private high-speed link, either Fibre Channel, InfiniBand, or 10G Ethernet. One can additionally group such clusters together under a single namespace when running in the "cluster mode" of the Data ONTAP 8 operating system.


Internal architecture
Most NetApp filers consist of customized computers with Intel or AMD processors using PCI. Each filer has a proprietary NVRAM adapter to log all writes for performance and to play the data log forward in the event of an unplanned shutdown. One can link two filers together as a cluster, which NetApp (as of 2009) refers to using the less ambiguous term "Active/Active".

The Data ONTAP operating system implements a single proprietary file system called WAFL. When used for file storage, Data ONTAP acts as an NFS server and/or a CIFS server, serving files to both Unix-like clients and to Microsoft Windows clients from the same file systems. This makes it possible for Unix and Windows to share files by the use of three security styles: mixed, ntfs, and unix. Data ONTAP supports user, group, and tree-based quotas (referred to as qtrees) and allows for data segregation and management within volumes. Qtrees with the UNIX security style will preserve the standard Unix permission bits, the NTFS security style will preserve NT ACLs found in the Windows environment, and the mixed security style allows the use of both interchangeably (with minor loss of fidelity). Since 2002, all NetApp FAS systems can also work as SAN storage over "block-based" protocols such as FC, iSCSI and FCoE (since 2007).

NetApp FAS3240-R5

Each filer model comes with a set configuration of processor, RAM and NVRAM, which users cannot expand after purchase. With the exception of some of the entry point storage controllers, the NetApp filers have at least one PCIe-based slot available for additional network, tape and/or disk connections. In June 2008 NetApp announced the Performance Acceleration Module (or PAM) to optimize the performance of workloads which carry out intensive random reads. This optional card goes into a PCIe slot and provides additional memory (or cache) between the disk and the filer RAM/NVRAM, thus improving performance.

NetApp supports either SATA, Fibre Channel, or SAS disk drives, which it groups into RAID (Redundant Array of Inexpensive Disks or Redundant Array of Independent Disks) groups of up to 28 (26 data disks plus 2 parity disks). Multiple RAID groups form an "aggregate", and within aggregates the Data ONTAP operating system sets up "flexible volumes" to actually store data that users can access. An alternative is "traditional volumes", where one or more RAID groups form a single static volume. Flexible volumes offer the advantage that many of them can be created on a single aggregate and resized at any time. Smaller volumes can then share all of the spindles available to the underlying aggregate. Traditional volumes and aggregates can only be expanded, never contracted. However, traditional volumes can (theoretically) handle slightly higher I/O throughput than flexible volumes (with the same number of spindles), as they do not have to go through an additional virtualisation layer to talk to the underlying disk.

WAFL, as a robust versioning filesystem, provides snapshots, which allow end-users to see earlier versions of files in the file system. Snapshots appear in a hidden directory: ~snapshot for Windows (CIFS) or .snapshot for Unix (NFS). Up to 255 snapshots can be made of any traditional or flexible volume. Snapshots are read-only, although Data ONTAP 7 provides an additional ability to make writable "virtual clones", based on the "WAFL snapshots" technique, as "FlexClones". Data ONTAP implements snapshots by tracking changes to disk blocks between snapshot operations.
Data ONTAP can set up a snapshot in seconds because it only needs to take a copy of the root inode in the filesystem. This differs from the snapshots provided by some other storage vendors, in which every block of storage has to be copied, which can take many hours.

Snapshots form the basis for NetApp's disk-replication technology SnapMirror, which effectively replicates snapshots between two NetApp filers. Later versions of Data ONTAP introduced cascading replication, where one volume can replicate to another and then to another, and so on. NetApp also offers a backup product based around replicating and storing snapshots, called SnapVault. Open Systems SnapVault allows Windows and UNIX hosts to back up data to a NetApp filer and store any filesystem changes in snapshots.

Data ONTAP also implements an option called "SyncMirror", where all the RAID groups within an aggregate or traditional volume can be duplicated to another set of hard disks, typically at another site via a Fibre Channel link. NetApp provides a "MetroCluster" option that uses SyncMirror to provide a geo-cluster or active/active cluster between two sites up to 100 km apart.

Other product options include "SnapLock", which implements "Write Once Read Many" (WORM) functionality on magnetic disks instead of on optical media, so that data cannot be deleted until its retention period has been reached. SnapLock exists in two modes: compliance and enterprise. The compliance mode was designed to assist organizations in implementing a comprehensive archival solution that meets strict regulatory retention requirements such as those dictated by the SEC and several healthcare governing bodies. Records and files committed to WORM storage on a SnapLock Compliance volume cannot be altered or deleted before the expiration of their retention period, and a SnapLock Compliance volume cannot be destroyed until all data have reached the end of their retention period. SnapLock Enterprise is geared toward assisting organizations that are more self-regulated and want greater flexibility in protecting digital assets with WORM-type data storage. Data stored as WORM on a SnapLock Enterprise volume are protected from alteration or modification, with one main difference from SnapLock Compliance: as the files being stored are not for strict regulatory compliance, a SnapLock Enterprise volume can be destroyed by an administrator with root privileges on the FAS system containing the volume, even if the designed retention period has not yet passed. In both modes the retention period can be extended, but not shortened, as that would be incongruous with the concept of immutability. In addition, NetApp SnapLock data volumes are equipped with a tamper-proof compliance clock that is used as a time reference to block forbidden operations on files, even if the system time is tampered with.

NetApp filers also offer the PAM (Performance Acceleration Module), now known as Flash Cache, which further boosts read-intensive workloads without adding any further disks to the underlying RAID. Flash Cache increases the performance of a filer by offloading read I/O from the disks.

NetApp also offers products for taking application-consistent snapshots by coordinating the application and the NetApp storage array. These products support Microsoft Exchange, Microsoft SQL Server, Microsoft SharePoint, Oracle, SAP and VMware ESX Server data. These products form part of the SnapManager suite.
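The root-inode copy described above can be illustrated with a toy copy-on-write tree. The sketch below is only a simplified model under assumed structures, not WAFL's actual on-disk format: the "filesystem" is a small tree of blocks reached from a root pointer, a snapshot is just a saved copy of that root, and a later write copies the affected block and its parent instead of overwriting them in place, so the snapshot continues to see the old data.

```c
/* A toy copy-on-write model of snapshot-by-root-copy (illustrative only,
 * not WAFL's real structures): blocks form a tree reachable from a root
 * pointer; taking a snapshot copies only the root reference; modifying
 * data copies the changed block and its parent, so the snapshot root
 * keeps seeing the old blocks untouched. Freeing is omitted for brevity. */
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

#define FANOUT 4

struct block {
    char data[32];                 /* payload for leaf blocks          */
    struct block *child[FANOUT];   /* pointers for interior blocks     */
};

static struct block *block_clone(const struct block *src)
{
    struct block *b = malloc(sizeof *b);
    memcpy(b, src, sizeof *b);
    return b;
}

/* Copy-on-write update of leaf `slot` below the root: returns a new root,
 * leaving everything reachable from the old root unchanged. */
static struct block *cow_write(struct block *root, int slot, const char *text)
{
    struct block *new_root = block_clone(root);
    struct block *new_leaf = block_clone(root->child[slot]);
    snprintf(new_leaf->data, sizeof new_leaf->data, "%s", text);
    new_root->child[slot] = new_leaf;
    return new_root;
}

int main(void)
{
    struct block *leaf = calloc(1, sizeof *leaf);
    snprintf(leaf->data, sizeof leaf->data, "version 1");

    struct block *root = calloc(1, sizeof *root);
    root->child[0] = leaf;

    struct block *snapshot = root;             /* "snapshot": keep the old root */
    root = cow_write(root, 0, "version 2");    /* live filesystem moves on      */

    printf("live:     %s\n", root->child[0]->data);       /* version 2 */
    printf("snapshot: %s\n", snapshot->child[0]->data);   /* version 1 */
    return 0;
}
```

Because only the root reference is duplicated at snapshot time, creating the snapshot costs a constant amount of work regardless of how much data the volume holds, which is why it completes in seconds.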


Previous limitations
Prior to the release of ONTAP 8, individual aggregate sizes were limited to a maximum of 2 TB for FAS250 models and 16 TB for all other models. The limitation on aggregate size, coupled with the increasing density of disk drives, served to limit the performance of the overall system. NetApp, like most storage vendors, increases overall system performance by parallelizing disk writes across many different spindles (disk drives). Large-capacity drives therefore limit the number of spindles that can be added to a single aggregate, and so limit the aggregate's performance; for example, with 1 TB disk drives, a 16 TB aggregate can contain at most 16 spindles.

Each aggregate also incurs a storage capacity overhead of approximately 7-11%, depending on the disk type, so on systems with many aggregates this can result in lost storage capacity. The overhead comes about due to additional block checksumming at the disk level as well as the usual file system overhead, similar to the overhead in file systems such as NTFS or ext3. Block checksumming helps to ensure that data errors at the disk drive level do not result in data loss.

Data ONTAP 8.0 supports a new 64-bit aggregate format, which increases the size limit to approximately 100 TB (depending on storage platform), thus restoring the ability to configure large spindle counts to increase performance and storage efficiency. ([2])


Model history
This list may omit some models. Information is taken from spec.org [3], netapp.com [4] and storageperformance.org [5].

The model table covers the FASServer line (FASServer 400, 450, 1300, 1400 and FASServer, released between January 1993 and January 1995, all since discontinued), the F-series (F330, F220, F540, F210, F230, F520, F630, F720, F740, F760, F85, F87, F810, F820, F825, F840 and F880), the FAS250 and FAS270, the FAS900 series (FAS920, FAS940, FAS960, FAS980), the FAS2000 series (FAS2020, FAS2040, FAS2050, FAS2220, FAS2240), the FAS3000, FAS3100 and FAS3200 series (FAS3020, FAS3040, FAS3050, FAS3070, FAS3140, FAS3160, FAS3170, FAS3210, FAS3240, FAS3270) and the FAS6000 and FAS6200 series (FAS6030, FAS6040, FAS6070, FAS6080, FAS6210, FAS6240, FAS6280). For each model the table records its status, release date, CPU, main memory, NVRAM size, raw capacity and SPECsfs benchmark result. Raw capacity grew from 14 GB on the i486-based FASServer 400 (4 MB NVRAM, released January 1993) to 2,880 TB on the FAS6280 (192 GB main memory, 8 GB NVRAM, released November 2010).

EOA = End of Availability. SPECsfs results marked with "*" are clustered results. The SPECsfs benchmarks performed include SPECsfs93, SPECsfs97, SPECsfs97_R1 and SPECsfs2008; results of different benchmark versions are not comparable.


References
[1] Nabrzyski, Jarek; Schopf, Jennifer M.; Węglarz, Jan (2004). Grid Resource Management: State of the Art and Future Trends (http://books.google.com/books?id=o5J8kyfreXAC&pg=PA535). Springer. p. 342. ISBN 978-1-4020-7575-9. Retrieved 11 June 2012.
[2] http://media.netapp.com/documents/tr-3786.pdf
[3] http://www.spec.org
[4] http://www.netapp.com/us/products/storage-systems
[5] http://www.storageperformance.org

External links
Storage Filer (definitions) (http://searchstorage.techtarget.com/sDefinition/0,,sid5_gci1016340,00.html)
SnapLock Technical Report (http://www.netapp.com/us/library/technical-reports/tr-3618.html)
NetApp training videos (http://www.happysysadm.com/2010/11/netapp-videos.html)
NETWORK-APPLIANCE (MIB file) (http://www.oidview.com/mibs/789/NETWORK-APPLIANCE-MIB.html)


Write Anywhere File Layout


WAFL
Developer: NetApp
Full name: Write Anywhere File Layout
Limits
Max file size: up to 100 TB (limited by containing aggregate size; variable maximum depending on platform)
Max volume size: up to 100 TB (limited by containing aggregate size; variable maximum depending on platform; limited to 16 TB when using deduplication)
Allowed characters in filenames: selectable (UTF-8 default)
Features
Dates recorded: atime, ctime, mtime
File system permissions: UNIX permissions and ACLs
Transparent compression: Yes (ONTAP 8.0 onwards)
Transparent encryption: No (possible with 3rd-party appliances such as Decru DataFort)
Data deduplication: Yes (FAS Dedup: periodic online scans, block-based; VTL Dedup: online, byte-range based)

The Write Anywhere File Layout (WAFL) is a file layout that supports large, high-performance RAID arrays, quick restarts without lengthy consistency checks in the event of a crash or power failure (though sometimes a WAFL check may be required, which can take days), and quick growth of the filesystem's size. It was designed by NetApp for use in its storage appliances. Its author claims that WAFL is not a file system;[1] rather, WAFL provides mechanisms that enable a variety of file systems and technologies that want to access disk blocks.

Features
One of WAFL's most salient features is the snapshot, a read-only copy of the file system. Zero-copy snapshots allow users to recover files that have been accidentally deleted, and they provide an online backup that can be accessed quickly. Snapshots are implemented similarly to those of a log-structured file system. A special kind of snapshot that the filer uses internally, called a consistency point, allows WAFL to restart quickly in the event of an improper shutdown. NetApp's Data ONTAP Release 7G operating system supports a read-write snapshot called FlexClone.

An important feature of WAFL is its support for both a Unix-style file and directory model for NFS clients and a Microsoft Windows-style file and directory model for CIFS clients. WAFL also supports both security models, including a mode where different files on the same volume can have different security attributes attached to them. Unix can use either[2] access control lists (ACLs) or a simple bitmask, whereas the more recent Windows model is based on access control lists. These two features make it possible to write a file to a CIFS type of networked filesystem and access it later via NFS from a Unix workstation.

As the name suggests, the Write Anywhere File Layout automatically fragments data, using temporal locality to write metadata alongside user data. This fragmentation does not adversely affect files that are sequentially written to or randomly read from, but it does affect sequential reads after random writes. Data ONTAP has provided the reallocate command since 7G to perform scheduled and manual defragmentation. Prior to 7G, the wafl scan reallocate command would need to be invoked from an advanced privilege level and could not be scheduled.


Notes
[1] "Is WAFL a File System?" (http:/ / blogs. netapp. com/ dave/ 2008/ 12/ is-wafl-a-files. html). . [2] "POSIX Access Control Lists on Linux" (http:/ / www. suse. de/ ~agruen/ acl/ linux-acls/ online/ ). .

External links
Network Appliance: File System Design for an NFS File Server Appliance (http://www.netapp.com/library/tr/3002.pdf)
U.S. Patent 5819292 (http://www.google.com/patents?vid=5819292) - Method for maintaining consistent states of a file system and for creating user-accessible read-only copies of a file system - October 6, 1998

Zero-copy
"Zero-copy" describes computer operations in which the CPU does not perform the task of copying data from one memory area to another. This is most often used to save on processing power and memory use when sending files over a network.[1]

Principle
Zero-copy versions of operating system elements such as device drivers, file systems, and network protocol stacks greatly increase the performance of certain application programs and make more efficient use of system resources. Performance is enhanced by allowing the CPU to move on to other tasks while data copies proceed in parallel in another part of the machine. Zero-copy operations also reduce the number of time-consuming mode switches between user space and kernel space. System resources are utilized more efficiently because using a sophisticated CPU to perform extensive copy operations, which is a relatively simple task, is wasteful if simpler system components can do the copying.

As an example, reading a file and then sending it over a network the traditional way requires four data copies and four CPU context switches, if the file is small enough to fit in the file cache. Two of those data copies use the CPU. Sending the same file via zero copy reduces the context switches to two and eliminates either half or all of the CPU data copies.[1]

Zero-copy protocols are especially important for high-speed networks in which the capacity of a network link approaches or exceeds the CPU's processing capacity. In such a case the CPU spends nearly all of its time copying transferred data, and thus becomes a bottleneck which limits the communication rate to below the link's capacity. A rule of thumb used in the industry is that roughly one CPU clock cycle is needed to process one bit of incoming data.


Implementation
Techniques for creating zero-copy software include the use of DMA-based copying and memory-mapping through an MMU. These features require specific hardware support and usually involve particular memory alignment requirements.

Programmatic access
Several operating systems support zero-copying of files through specific APIs. Linux supports zero copy through system calls such as sendfile (declared in sys/sendfile.h), sendfile64, and splice. Windows supports zero copy through the TransmitFile API. Java input streams can support zero copy through java.nio.channels.FileChannel's transferTo() method if the underlying operating system also supports zero copy.[1] RDMA (Remote Direct Memory Access) protocols rely deeply on zero-copy techniques.
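As a concrete illustration of the Linux sendfile path mentioned above, the sketch below sends a file over an already-connected TCP socket without the data ever passing through a user-space buffer. It is a minimal example rather than a complete server; sock_fd is assumed to be a connected socket created elsewhere, and error handling is abbreviated.

```c
/* Serve a file over a connected socket with Linux sendfile(2): the kernel
 * moves data from the page cache to the socket buffers directly, avoiding
 * the read()/write() round trip through a user-space buffer. */
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int send_file_zero_copy(int sock_fd, const char *path)
{
    int file_fd = open(path, O_RDONLY);
    if (file_fd < 0) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(file_fd, &st) < 0) {
        perror("fstat");
        close(file_fd);
        return -1;
    }

    off_t offset = 0;
    while (offset < st.st_size) {
        /* The kernel copies directly from the file's cached pages to the
         * socket; no user-space buffer is involved. */
        ssize_t sent = sendfile(sock_fd, file_fd, &offset, st.st_size - offset);
        if (sent < 0) {
            perror("sendfile");
            close(file_fd);
            return -1;
        }
    }

    close(file_fd);
    return 0;
}
```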

References
[1] Efficient data transfer through zero copy, by Sathish K. Palaniappan and Pramod B. Nagaraja, September 2008 (http://www.ibm.com/developerworks/library/j-zerocopy/index.html)

Direct memory access


Direct memory access (DMA) is a feature of modern computers that allows certain hardware subsystems within the computer to access system memory independently of the central processing unit (CPU). Without DMA, when the CPU is using programmed input/output, it is typically fully occupied for the entire duration of the read or write operation, and is thus unavailable to perform other work. With DMA, the CPU initiates the transfer, does other operations while the transfer is in progress, and receives an interrupt from the DMA controller when the operation is done. This feature is useful any time the CPU cannot keep up with the rate of data transfer, or where the CPU needs to perform useful work while waiting for a relatively slow I/O data transfer. Many hardware systems use DMA, including disk drive controllers, graphics cards, network cards and sound cards. DMA is also used for intra-chip data transfer in multi-core processors. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without a DMA channel. Similarly, a processing element inside a multi-core processor can transfer data to and from its local memory without occupying its processor time, allowing computation and data transfer to proceed in parallel. DMA can also be used for "memory to memory" copying or moving of data within memory. DMA can offload expensive memory operations, such as large copies or scatter-gather operations, from the CPU to a dedicated DMA engine. An implementation example is the I/O Acceleration Technology.

Principle
A DMA controller can generate memory addresses and initiate memory read or write cycles. It contains several registers that can be written and read by the CPU. These include a memory address register, a byte count register, and one or more control registers. The control registers specify the I/O port to use, the direction of the transfer (reading from the I/O device or writing to the I/O device), the transfer unit (byte at a time or word at a time), and the number of bytes to transfer in one burst.[1]

To carry out an input, output or memory-to-memory operation, the host processor initializes the DMA controller with a count of the number of words to transfer and the memory address to use. The CPU then sends commands to the peripheral device to initiate the transfer of data. The DMA controller then provides addresses and read/write control lines to the system memory. Each time a word of data is ready to be transferred between the peripheral device and memory, the DMA controller increments its internal address register until the full block of data is transferred.

DMA transfers can either occur one word at a time or all at once in burst mode. If they occur a word at a time, the CPU can access memory on alternate bus cycles; this is called cycle stealing, since the DMA controller and the CPU contend for memory access. In burst mode DMA, the CPU can be put on hold while the DMA transfer occurs, and a full block of possibly hundreds or thousands of words can be moved.[2] When memory cycles are much faster than processor cycles, an interleaved DMA cycle is possible, where the DMA controller uses memory while the CPU cannot.

In a bus mastering system, both the CPU and peripherals can be granted control of the memory bus. Where a peripheral can become bus master, it can directly write to system memory without involvement of the CPU, providing memory address and control signals as required. Some measure must be provided to put the processor into a hold condition so that bus contention does not occur.
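The register model described above can be made concrete with a small driver-style sketch. The register layout, bit assignments and the dma_regs structure below are invented for illustration (a real controller's datasheet defines its own); the point is the sequence: load the address and count registers, then set the control bits to start the transfer and let the CPU continue with other work.

```c
/* Programming a hypothetical memory-mapped DMA controller that follows the
 * register model in the text (memory address register, byte count register,
 * control register). All names and bit positions are assumptions made for
 * this sketch, not any real device's interface. */
#include <stdint.h>

struct dma_regs {
    volatile uint32_t mem_addr;   /* physical address of the buffer       */
    volatile uint32_t byte_count; /* number of bytes to transfer          */
    volatile uint32_t control;    /* direction, transfer unit, start bit  */
    volatile uint32_t status;     /* set by the controller when finished  */
};

#define DMA_CTRL_DIR_TO_MEM  (1u << 0)   /* device -> memory (a read)     */
#define DMA_CTRL_UNIT_WORD   (1u << 1)   /* transfer a word at a time     */
#define DMA_CTRL_START       (1u << 31)  /* kick off the transfer         */
#define DMA_STATUS_DONE      (1u << 0)

static void dma_start_read(struct dma_regs *dma, uint32_t phys_buf, uint32_t len)
{
    dma->mem_addr   = phys_buf;          /* where to put the data         */
    dma->byte_count = len;               /* how much to move              */
    dma->control    = DMA_CTRL_DIR_TO_MEM | DMA_CTRL_UNIT_WORD | DMA_CTRL_START;
    /* The CPU now returns to other work; completion is normally signalled
     * by an interrupt rather than by polling the status register. */
}
```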


Modes of operation
Burst mode
An entire block of data is transferred in one contiguous sequence. Once the DMA controller is granted access to the system bus by the CPU, it transfers all bytes of data in the data block before releasing control of the system buses back to the CPU. This mode is useful for loading program or data files into memory, but renders the CPU inactive for relatively long periods of time. The mode is also called "Block Transfer Mode".

Cycle stealing mode


The cycle stealing mode is used in systems in which the CPU should not be disabled for the length of time needed for burst transfer modes. In cycle stealing mode, the DMA controller obtains access to the system bus in the same way as in burst mode, using the BR (Bus Request) and BG (Bus Grant) signals, which are the two signals controlling the interface between the CPU and the DMA controller. However, in cycle stealing mode, after one byte of data is transferred, control of the system bus is returned to the CPU via BG. Control is then continually requested again via BR, transferring one byte of data per request, until the entire block of data has been transferred. By continually obtaining and releasing control of the system bus, the DMA controller essentially interleaves instruction and data transfers: the CPU processes an instruction, then the DMA controller transfers one data value, and so on. The data block is not transferred as quickly in cycle stealing mode as in burst mode, but on the other hand the CPU is not idled for as long as in burst mode. Cycle stealing mode is useful for controllers that monitor data in real time.

Transparent mode
The transparent mode takes the most time to transfer a block of data, yet it is also the most efficient mode in terms of overall system performance. The DMA controller only transfers data when the CPU is performing operations that do not use the system buses. It is the primary advantage of the transparent mode that the CPU never stops executing its programs and the DMA transfer is free in terms of time. The disadvantage of the transparent mode is that the hardware needs to determine when the CPU is not using the system buses, which can be complex and relatively expensive.


Cache coherency
DMA can lead to cache coherency problems. Imagine a CPU equipped with a cache and an external memory that can be accessed directly by devices using DMA. When the CPU accesses location X in the memory, the current value will be stored in the cache. Subsequent operations on X will update the cached copy of X, but not the external memory version of X, assuming a write-back cache. If the cache is not flushed to the memory before the next time a device tries to access X, the device will receive a stale value of X. Similarly, if the cached copy of X is not invalidated when a device writes a new value to the memory, then the CPU will operate on a stale value of X.

This issue can be addressed in one of two ways in system design. Cache-coherent systems implement a method in hardware whereby external writes are signaled to the cache controller, which then performs a cache invalidation for DMA writes or a cache flush for DMA reads. Non-coherent systems leave this to software, where the OS must ensure that the cache lines are flushed before an outgoing DMA transfer is started and invalidated before a memory range affected by an incoming DMA transfer is accessed; the OS must also make sure that the memory range is not accessed by any running threads in the meantime. The latter approach introduces some overhead to the DMA operation, as most hardware requires a loop to invalidate each cache line individually.

Hybrids also exist, where the secondary L2 cache is coherent while the L1 cache (typically on-CPU) is managed by software.
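For the non-coherent (software-managed) case described above, what matters is the ordering of cache maintenance around each transfer. The sketch below shows that ordering; the cache_* and dma_* helpers are hypothetical placeholders for whatever a particular platform provides and are stubbed out here only so the example compiles.

```c
/* Software-managed coherency around DMA: flush dirty lines before an
 * outgoing transfer, invalidate stale lines before reading a buffer that
 * incoming DMA has filled. The helper functions are assumptions for this
 * sketch (stubs), standing in for platform-specific primitives. */
#include <stddef.h>

static void cache_flush_range(void *buf, size_t len)       { (void)buf; (void)len; /* platform-specific */ }
static void cache_invalidate_range(void *buf, size_t len)  { (void)buf; (void)len; /* platform-specific */ }
static void dma_start_to_device(const void *buf, size_t len)   { (void)buf; (void)len; }
static void dma_start_from_device(void *buf, size_t len)       { (void)buf; (void)len; }
static void dma_wait_done(void) { }

void dma_send_buffer(void *buf, size_t len)
{
    cache_flush_range(buf, len);        /* device must see the CPU's latest data */
    dma_start_to_device(buf, len);
    dma_wait_done();
}

void dma_receive_buffer(void *buf, size_t len)
{
    dma_start_from_device(buf, len);
    dma_wait_done();
    cache_invalidate_range(buf, len);   /* CPU must not read pre-DMA cached data */
}
```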

Examples
ISA
In the original IBM PC, there was only one Intel 8237 DMA controller capable of providing four DMA channels (numbered 0-3), as part of the so-called Industry Standard Architecture, or ISA. These DMA channels performed 8-bit transfers and could only address the first megabyte of RAM. With the IBM PC/AT, a second 8237 DMA controller was added (channels 5-7; channel 4 is unusable), and the page register was rewired to address the full 16 MB memory address space of the 80286 CPU. This second controller performed 16-bit transfers. Due to their lagging performance (2.5 Mbit/s[3]), these devices have been largely obsolete since the advent of the 80386 processor in 1985 and its capacity for 32-bit transfers. They are still supported to the extent they are required to support built-in legacy PC hardware on modern machines. The only pieces of legacy hardware that use ISA DMA and are still fairly common are the built-in Floppy disk controllers of many PC mainboards and those IEEE 1284 parallel ports that support the fast ECP mode. Each DMA channel has a 16-bit address register and a 16-bit count register associated with it. To initiate a data transfer the device driver sets up the DMA channel's address and count registers together with the direction of the data transfer, read or write. It then instructs the DMA hardware to begin the transfer. When the transfer is complete, the device interrupts the CPU. Scatter-gather or Vectored I/O DMA allows the transfer of data to and from multiple memory areas in a single DMA transaction. It is equivalent to the chaining together of multiple simple DMA requests. The motivation is to off-load multiple input/output interrupt and data copy tasks from the CPU. DRQ stands for Data request; DACK for Data acknowledge. These symbols, seen on hardware schematics of computer systems with DMA functionality, represent electronic signaling lines between the CPU and DMA controller. Each DMA channel has one Request and one Acknowledge line. A device that uses DMA must be configured to use both lines of the assigned DMA channel.

Standard ISA DMA assignments:
1. DRAM Refresh (obsolete),
2. User hardware,
3. Floppy disk controller,
4. Hard disk (obsoleted by PIO modes, and replaced by UDMA modes),
5. Cascade from XT DMA controller,
6. Hard Disk (PS/2 only), user hardware for all others,
7. User hardware.


PCI
A PCI architecture has no central DMA controller, unlike ISA. Instead, any PCI component can request control of the bus ("become the bus master") and request to read from and write to system memory. More precisely, a PCI component requests bus ownership from the PCI bus controller (usually the southbridge in a modern PC design), which will arbitrate if several devices request bus ownership simultaneously, since there can only be one bus master at one time. When the component is granted ownership, it will issue normal read and write commands on the PCI bus, which will be claimed by the bus controller and forwarded to the memory controller using a scheme which is specific to every chipset. As an example, on a modern AMD Socket AM2-based PC, the southbridge will forward the transactions to the northbridge (which is integrated on the CPU die) using HyperTransport, which will in turn convert them to DDR2 operations and send them out on the DDR2 memory bus. As can be seen, there are quite a number of steps involved in a PCI DMA transfer; however, that poses little problem, since the PCI device or the PCI bus itself is an order of magnitude slower than the rest of the components (see list of device bandwidths).

A modern x86 CPU may use more than 4 GB of memory, utilizing PAE, a 36-bit addressing mode, or the native 64-bit mode of x86-64 CPUs. In such a case, a device using DMA with a 32-bit address bus is unable to address memory above the 4 GB line. The Double Address Cycle (DAC) mechanism, if implemented on both the PCI bus and the device itself,[4] enables 64-bit DMA addressing. Otherwise, the operating system would need to work around the problem by either using costly double buffers (Windows nomenclature), also known as bounce buffers (Linux), or it could use an IOMMU to provide address translation services if one is present.

I/OAT
As an example of DMA engine incorporated in a general-purpose CPU, newer Intel Xeon chipsets include a DMA engine technology called I/O Acceleration Technology (I/OAT), meant to improve network performance on high-throughput network interfaces, in particular gigabit Ethernet and faster.[5] However, various benchmarks with this approach by Intel's Linux kernel developer Andrew Grover indicate no more than 10% improvement in CPU utilization with receiving workloads, and no improvement when transmitting data.[6]

AHB
In systems-on-a-chip and embedded systems, typical system bus infrastructure is a complex on-chip bus such as the AMBA High-performance Bus (AHB). AMBA defines two kinds of AHB components: master and slave. A slave interface is similar to programmed I/O, through which the software (running on an embedded CPU, e.g. an ARM core) can write/read I/O registers or (less commonly) local memory blocks inside the device. A master interface can be used by the device to perform DMA transactions to/from system memory without heavily loading the CPU.

Therefore, high-bandwidth devices such as network controllers that need to transfer huge amounts of data to/from system memory will have two interface adapters to the AHB: a master and a slave interface. This is because on-chip buses like AHB do not support tri-stating the bus or alternating the direction of any line on the bus. Like PCI, no central DMA controller is required, since the DMA is bus-mastering, but an arbiter is required in case of multiple masters present on the system. Internally, a multichannel DMA engine is usually present in the device to perform multiple concurrent scatter-gather operations as programmed by the software.


Cell
As an example usage of DMA in a multiprocessor-system-on-chip, IBM/Sony/Toshiba's Cell processor incorporates a DMA engine for each of its 9 processing elements including one Power processor element (PPE) and eight synergistic processor elements (SPEs). Since the SPE's load/store instructions can read/write only its own local memory, an SPE entirely depends on DMAs to transfer data to and from the main memory and local memories of other SPEs. Thus the DMA acts as a primary means of data transfer among cores inside this CPU (in contrast to cache-coherent CMP architectures such as Intel's coming general-purpose GPU, Larrabee). DMA in Cell is fully cache coherent (note however local stores of SPEs operated upon by DMA do not act as globally coherent cache in the standard sense). In both read ("get") and write ("put"), a DMA command can transfer either a single block area of size up to 16KB, or a list of 2 to 2048 such blocks. The DMA command is issued by specifying a pair of a local address and a remote address: for example when a SPE program issues a put DMA command, it specifies an address of its own local memory as the source and a virtual memory address (pointing to either the main memory or the local memory of another SPE) as the target, together with a block size. According to a recent experiment, an effective peak performance of DMA in Cell (3GHz, under uniform traffic) reaches 200GB per second.[7]

Notes
[1] Osborne, Adam (1980). An Introduction to Microcomputers: Volume 1: Basic Concepts (2nd ed.). Osborne McGraw Hill. pp. 564-593. ISBN 0931988349.
[2] Horowitz, Paul; Hill, Winfield (1989). The Art of Electronics (Second ed.). Cambridge University Press. p. 702. ISBN 0521370957.
[3] Intel publication 03040, Aug 1989
[4] "Physical Address Extension - PAE Memory and Windows" (http://www.microsoft.com/whdc/system/platform/server/PAE/PAEdrv.mspx#E2D). Microsoft Windows Hardware Development Central. 2005. Retrieved 2008-04-07.
[5] Corbet, Jonathan (2005-12-06). "Memory copies in hardware" (http://lwn.net/Articles/162966/). LWN.net (December 8, 2005). Retrieved 2006-11-12.
[6] Grover, Andrew (2006-06-01). "I/OAT on LinuxNet wiki" (http://linux-net.osdl.org/index.php/I/OAT). Overview of I/OAT on Linux, with links to several benchmarks. Retrieved 2006-12-12.
[7] Kistler, Michael (May 2006). "Cell Multiprocessor Communication Network" (http://portal.acm.org/citation.cfm?id=1158825.1159067). Extensive benchmarks of DMA performance in Cell Broadband Engine.

References
DMA Fundamentals on Various PC Platforms (http://cires.colorado.edu/jimenez-group/QAMSResources/Docs/DMAFundamentals.pdf), from A. F. Harvey and Data Acquisition Division Staff, National Instruments
mmap() and DMA (http://www.xml.com/ldd/chapter/book/ch13.html), from Linux Device Drivers, 2nd Edition, Alessandro Rubini & Jonathan Corbet
Memory Mapping and DMA (http://www.oreilly.com/catalog/linuxdrive3/book/ch15.pdf), from Linux Device Drivers, 3rd Edition, Jonathan Corbet, Alessandro Rubini, Greg Kroah-Hartman
DMA and Interrupt Handling (http://www.eventhelix.com/RealtimeMantra/FaultHandling/dma_interrupt_handling.htm)
DMA Modes & Bus Mastering (http://www.pcguide.com/ref/hdd/if/ide/modesDMA-c.html)


Memory management unit


A memory management unit (MMU), sometimes called paged memory management unit (PMMU), is a computer hardware component responsible for handling accesses to memory requested by the CPU. Its functions include translation of virtual addresses to physical addresses (i.e., virtual memory management), memory protection, cache control, bus arbitration and in simpler computer architectures (especially 8-bit systems) bank switching.

This 68451 MMU could be used with the Motorola 68010

How it works
Modern MMUs typically divide the virtual address space (the range of addresses used by the processor) into pages, each having a size which is a power of 2, usually a few kilobytes, though pages may be much larger. The bottom n bits of the address (the offset within a page) are left unchanged. The upper address bits are the (virtual) page number. The MMU normally translates virtual page numbers to physical page numbers via an associative cache called a translation lookaside buffer (TLB). When the TLB lacks a translation, a slower mechanism involving hardware-specific data structures or software assistance is used. The data found in such data structures are typically called page table entries (PTEs), and the data structure itself is typically called a page table. The physical page number is combined with the page offset to give the complete physical address.

A PTE or TLB entry may also include information about whether the page has been written to (the dirty bit), when it was last used (the accessed bit, for a least recently used page replacement algorithm), what kind of processes (user mode, supervisor mode) may read and write it, and whether it should be cached.

Sometimes, a TLB entry or PTE prohibits access to a virtual page, perhaps because no physical random access memory has been allocated to that virtual page. In this case the MMU signals a page fault to the CPU. The operating system (OS) then handles the situation, perhaps by trying to find a spare frame of RAM and setting up a new PTE to map it to the requested virtual address. If no RAM is free, it may be necessary to choose an existing page (known as a victim), using some replacement algorithm, and save it to disk (this is called "paging"). With some MMUs, there can also be a shortage of PTEs or TLB entries, in which case the OS will have to free one for the new mapping. In some cases a "page fault" may indicate a software bug.

A key benefit of an MMU is memory protection: an OS can use it to protect against errant programs by disallowing access to memory that a particular program should not have access to. Typically, an OS assigns each program its own virtual address space. An MMU also reduces the problem of fragmentation of memory. After blocks of memory have been allocated and freed, the free memory may become fragmented (discontinuous), so that the largest contiguous block of free memory may be much smaller than the total amount. With virtual memory, a contiguous range of virtual addresses can be mapped to several non-contiguous blocks of physical memory.

In some early microprocessor designs, memory management was performed by a separate integrated circuit such as the VLSI VI475 (the "Apple HMMU" from the Macintosh II), the Motorola 68851 used with the Motorola 68020 CPU in the Macintosh II, or the Z8015 used with the Zilog Z80 family of processors. Later microprocessors such as the Motorola 68030 and the Zilog Z280 placed the MMU together with the CPU on the same integrated circuit, as did the Intel 80286 and later x86 microprocessors.

While this article concentrates on modern MMUs, commonly based on pages, early systems used a similar concept for base-limit addressing that further developed into segmentation. Those are occasionally also present on modern architectures. The x86 architecture provided segmentation rather than paging in the 80286, and provides both paging and segmentation in the 80386 and later processors (although the use of segmentation is not available in 64-bit operation).
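The page-number/offset split and page-table lookup described above can be shown with a tiny software model. The sketch below is not tied to any real architecture: it assumes 4 KB pages, a single-level page table and a 16-page address space purely for illustration, and it reports a page fault where a hardware MMU would raise one to the CPU.

```c
/* A toy model of MMU address translation: split the virtual address into
 * page number and offset, look the page number up in a one-level page
 * table, and rebuild the physical address. Real MMUs do this in hardware,
 * with a TLB in front of (usually multi-level) page tables. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                        /* 4 KB pages                */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  16                        /* tiny address space        */

struct pte {
    uint32_t frame;   /* physical page frame number */
    int present;      /* is a frame mapped?         */
};

static struct pte page_table[NUM_PAGES];

/* Returns 0 and fills *phys on success, -1 on a (simulated) page fault. */
static int translate(uint32_t virt, uint32_t *phys)
{
    uint32_t vpn    = virt >> PAGE_SHIFT;       /* virtual page number     */
    uint32_t offset = virt & (PAGE_SIZE - 1);   /* low bits pass through   */

    if (vpn >= NUM_PAGES || !page_table[vpn].present)
        return -1;                              /* MMU would raise a page fault */

    *phys = (page_table[vpn].frame << PAGE_SHIFT) | offset;
    return 0;
}

int main(void)
{
    page_table[3] = (struct pte){ .frame = 7, .present = 1 };

    uint32_t phys;
    if (translate(0x3ABC, &phys) == 0)          /* page 3, offset 0xABC    */
        printf("virtual 0x3ABC -> physical 0x%X\n", phys);   /* 0x7ABC    */
    if (translate(0x5000, &phys) != 0)
        printf("virtual 0x5000 -> page fault\n");
    return 0;
}
```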


Examples
Most modern systems divide memory into pages that are 4 KB to 64 KB in size, often with the possibility to use huge pages from 2 MB to 512 MB in size. Page translations are cached in a TLB. Some systems, mainly older RISC designs, trap into the OS when a page translation is not found in the TLB. Most systems use a hardware-based tree walker. Most systems allow the MMU to be disabled; some disable the MMU when trapping into OS code.

VAX
VAX pages are 512 bytes, which is very small. An OS may treat multiple pages as if they were a single larger page; for example, Linux on VAX groups 8 pages together, so the system is viewed as having 4 KB pages.

The VAX divides memory into four fixed-purpose regions, each 1 GB in size:
P0 space, which is used for general-purpose per-process memory such as heaps;
P1 space, or control space, which is also per-process and is typically used for supervisor, executive, kernel, and user stacks and other per-process control structures managed by the operating system;
S0 space, or system space, which is global to all processes and stores operating system code and data, whether paged or not, including pagetables;
S1 space, which is unused and "Reserved to Digital".

Page tables are big linear arrays. Normally this would be very wasteful when addresses are used at both ends of the possible range, but the page table for applications is itself stored in the kernel's paged memory. Thus there is effectively a 2-level tree, allowing applications to have a sparse memory layout without wasting lots of space on unused page table entries. The VAX MMU is notable for lacking an accessed bit. OSes which implement paging must find some way to emulate the accessed bit if they are to operate efficiently. Typically, the OS will periodically unmap pages so that page-not-present faults can be used to let the OS set an accessed bit.


ARM
ARM architecture-based application processors implement an MMU defined by ARM's Virtual Memory System Architecture. The current architecture defines PTEs for describing 4 KB and 64 KB pages, 1 MB sections and 16 MB super-sections; legacy versions also defined a 1 KB tiny page. The ARM uses a two-level pagetable when using 4 KB and 64 KB pages, or just a one-level pagetable for 1 MB sections and 16 MB super-sections. TLB updates are performed automatically by page-table walking hardware. PTEs include read/write access permission based on privilege, cacheability information, an NX bit, and a non-secure bit.[1]

IBM System/370 and successors


The IBM System/370 has had an MMU since the early 1970s; it was initially known as a DAT (Dynamic Address Translation) box. It has the unusual feature of storing accessed and dirty bits outside of the page table. They refer to physical memory rather than virtual memory. They are accessed by special-purpose instructions. This reduces overhead for the OS, which would otherwise need to propagate accessed and dirty bits from the page tables to a more physically oriented data structure. This makes OS-level virtualization easier. These features have been inherited by succeeding mainframe architectures, up to the current z/Architecture.

DEC Alpha
The DEC Alpha processor divides memory into 8 KB pages. After a TLB miss, low-level firmware machine code (here called PALcode) walks a 3-level tree-structured page table. Addresses are broken down as follows: 21 bits unused, 10 bits to index the root level of the tree, 10 bits to index the middle level of the tree, 10 bits to index the leaf level of the tree, and 13 bits that pass through to the physical address without modification. Full read/write/execute permission bits are supported.

MIPS
The MIPS architecture supports 1 to 64 entries in the TLB. The number of TLB entries is configurable at CPU configuration before synthesis. TLB entries are dual: each TLB entry maps a virtual page number (VPN2) to either one of two page frame numbers (PFN0 or PFN1), depending on the least significant bit of the virtual address that is not part of the page mask. This bit and the page mask bits are not stored in the VPN2. Each TLB entry has its own page size, which can be any value from 1 KB to 256 MB in multiples of 4. Each PFN in a TLB entry has a caching attribute, a dirty bit and a valid status bit. A VPN2 has a global status bit and an OS-assigned ID, which participates in the virtual address TLB entry match if the global status bit is set to 0. A PFN stores the physical address without the page mask bits.

A TLB Refill exception is generated when there are no entries in the TLB that match the mapped virtual address. A TLB Invalid exception is generated when there is a match but the entry is marked invalid. A TLB Modified exception is generated when there is a match but the dirty status is not set. If a TLB exception occurs while processing a TLB exception (a double fault TLB exception), it is dispatched to its own exception handler. MIPS32 and MIPS32r2 support 32 bits of virtual address space and up to 36 bits of physical address space. MIPS64 supports up to 64 bits of virtual address space and up to 59 bits of physical address space.


Sun 1
The original Sun 1 was a single-board computer built around the Motorola 68000 microprocessor and introduced in 1982. It included the original Sun 1 Memory Management Unit, that provided address translation, memory protection, memory sharing and memory allocation for multiple processes running on the CPU. All access of the CPU to private on-board RAM, external Multibus memory, on-board I/O and the Multibus I/O ran through the MMU where they were translated and protected in uniform fashion. The MMU was implemented in hardware on the CPU board. The MMU consisted of a context register, a segment map and a page map. Virtual addresses from the CPU were translated into intermediate addresses by the segment map, which in turn were translated into physical addresses by the page map. The page size was 2 KB and the segment size was 32 KB which gave 16 pages per segment. Up to 16 contexts could be mapped concurrently. The maximum logical address space for a context was 1024 pages or 2 MB. The maximum physical address that could be mapped simultaneously was also 2 MB. The context register was important in a multitasking operating system because it allowed the CPU to switch between processes without reloading all the translation state information. The 4-bit context register could switch between 16 sections of the segment map under supervisor control which allowed 16 contexts to be mapped concurrently. Each context had its own virtual address space. Sharing of virtual address space and inter-context communications could be provided by writing the same values in to the segment or page maps of different contexts. Additional contexts could be handled by treating the segment map as a context cache and replacing out-of-date contexts on a least-recently-used basis. The context register made no distinction between user and supervisor states; interrupts and traps did not switch contexts which required that all valid interrupt vectors always be mapped in page 0 of context, as well as the valid Supervisor Stack.[2]

PowerPC
In PowerPC G1, G2, G3, and G4, pages are normally 4 KB. After a TLB miss, the standard PowerPC MMU begins two simultaneous lookups. One lookup attempts to match the address with one of 4 or 8 Data Block Address Translation (DBAT) registers, or 4 or 8 Instruction Block Address Translation (IBAT) registers, as appropriate. The BAT registers can map linear chunks of memory as large as 256 MB, and are normally used by an OS to map large portions of the address space for the OS kernel's own use. If the BAT lookup succeeds, the other lookup is halted and ignored.

The other lookup, not directly supported by all processors in this family, is via a so-called "inverted page table" which acts as a hashed off-chip extension of the TLB. First, the top 4 bits of the address are used to select one of 16 segment registers. 24 bits from the segment register replace those 4 bits, producing a 52-bit address. The use of segment registers allows multiple processes to share the same hash table. The 52-bit address is hashed, then used as an index into the off-chip table. There, a group of 8 page table entries is scanned for one that matches. If none match due to excessive hash collisions, the processor tries again with a slightly different hash function. If this too fails, the CPU traps into the OS (with MMU disabled) so that the problem may be resolved. The OS needs to discard an entry from the hash table to make space for a new entry. The OS may generate the new entry from a more-normal tree-like page table or from per-mapping data structures which are likely to be slower and more space-efficient. Support for no-execute control is in the segment registers, leading to 256-MB granularity.

A major problem with this design is poor cache locality caused by the hash function. Tree-based designs avoid this by placing the page table entries for adjacent pages in adjacent locations. An operating system running on the PowerPC may minimize the size of the hash table to reduce this problem. It is also somewhat slow to remove the page table entries of a process; the OS may avoid reusing segment values to delay facing this, or it may elect to suffer the waste of memory associated with per-process hash tables. G1 chips do not search for page table entries, but they do generate the hash with the expectation that an OS will search the standard hash table via software. The OS can write to the TLB. G2, G3, and early G4 chips use hardware to search the hash table. The latest chips allow the OS to choose either method. On chips that make this optional or do not support it at all, the OS may choose to use a tree-based page table exclusively.


IA-32 / x86
The x86 architecture has evolved over a long time while maintaining full software compatibility, even for OS code. Thus the MMU is extremely complex, with many different possible operating modes. Normal operation of the traditional 80386 CPU and its successors (IA-32) is described here.

The CPU primarily divides memory into 4 KB pages. Segment registers, fundamental to the older 8088 and 80286 MMU designs, are not used in modern OSes, with one major exception: access to thread-specific data for applications or CPU-specific data for OS kernels, which is done with explicit use of the FS and GS segment registers. All memory access involves a segment register, chosen according to the code being executed. The segment register acts as an index into a table, which provides an offset to be added to the virtual address. Except when using FS or GS as described above, the OS ensures that the offset will be zero. After the offset is added, the address is masked to be no larger than 32 bits. The result may be looked up via a tree-structured page table, with the bits of the address being split as follows: 10 bits for the root of the tree, 10 bits for the leaves of the tree, and the 12 lowest bits being directly copied to the result. Some operating systems, such as OpenBSD with its W^X feature, and Linux with the Exec Shield or PaX patches, may also limit the length of the code segment, as specified by the CS register, to disallow execution of code in modifiable regions of the address space.

Minor revisions of the MMU introduced with the Pentium allowed very large 4 MB pages by skipping the bottom level of the tree. Minor revisions of the MMU introduced with the Pentium Pro added the Physical Address Extension (PAE) feature, enabling 36-bit physical addresses via three-level page tables (with 9+9+2 bits for the three levels, and the 12 lowest bits being directly copied to the result; large pages become only 2 MB in size). In addition, the Page Attribute Table allowed specification of cacheability by looking up a few high bits in a small on-CPU table.

No-execute support was originally only provided on a per-segment basis, making it very awkward to use. More recent x86 chips provide a per-page no-execute bit in the PAE mode. The W^X, Exec Shield, and PaX mechanisms described above emulate per-page non-execute support on x86 processors lacking the NX bit by setting the length of the code segment, with a performance loss and a reduction in the available address space.

x86-64
x86-64 is a 64-bit extension of x86 that almost entirely removes segmentation in favor of the flat memory model used by almost all operating systems for the 386 or newer processors. In long mode, all segment offsets are ignored, except for the FS and GS segments. When used with 4 KB pages, the page table tree has four levels instead of three. The virtual addresses are divided up as follows: 16 bits unused, 9 bits each for 4 tree levels (36 bits in total), and the 12 lowest bits directly copied to the result. With 2 MB pages there are only three levels of page table, for a total of 27 bits used in paging and 21 bits of offset. Some newer CPUs also support a 1 GB page with two levels of paging and 30 bits of offset.[3] CPUID can be used to determine whether 1 GB pages are supported. In all three cases, the 16 highest bits are required to be equal to the 48th bit, or in other words, the low 48 bits are sign-extended to the higher bits. This is done to allow a future expansion of the addressable range without compromising backwards compatibility. In all levels of the page table, the page table entry includes a no-execute bit.
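A small program can make the x86-64 address split described above concrete. The sketch below checks the sign-extension (canonical-address) rule and extracts the four 9-bit table indexes and the 12-bit page offset; the field names are conventional labels chosen for illustration, and only the bit arithmetic reflects the text above.

```c
/* Splitting an x86-64 virtual address for 4 KB pages: 9 bits per table
 * level, 12-bit offset, and bits 63..48 required to equal bit 47. */
#include <stdint.h>
#include <stdio.h>

struct va_fields {
    unsigned pml4, pdpt, pd, pt;  /* 9-bit indexes into the four table levels */
    unsigned offset;              /* 12-bit offset within the 4 KB page       */
};

static int is_canonical(uint64_t va)
{
    /* Bits 63..47 must be all zeros or all ones (sign extension of bit 47). */
    uint64_t upper = va >> 47;
    return upper == 0 || upper == 0x1FFFF;
}

static struct va_fields split_va(uint64_t va)
{
    struct va_fields f;
    f.offset = (unsigned)(va & 0xFFF);
    f.pt     = (unsigned)((va >> 12) & 0x1FF);
    f.pd     = (unsigned)((va >> 21) & 0x1FF);
    f.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    f.pml4   = (unsigned)((va >> 39) & 0x1FF);
    return f;
}

int main(void)
{
    uint64_t va = 0x00007F12345678ABULL;
    struct va_fields f = split_va(va);
    printf("canonical=%d pml4=%u pdpt=%u pd=%u pt=%u offset=0x%X\n",
           is_canonical(va), f.pml4, f.pdpt, f.pd, f.pt, f.offset);
    return 0;
}
```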


Unisys MCP Systems (Burroughs B5000)


Tanenbaum et al. recently stated[4] that the B5000 (and descendant systems) have no MMU. To understand the functionality provided by an MMU, it is instructive to study a counter-example: a system that achieves this functionality by other means. The B5000 was the first commercial system to support virtual memory after the Atlas. It provides the two functions of an MMU in different ways.

Firstly, the mapping of virtual memory addresses. Instead of needing an MMU, the MCP systems are descriptor-based. Each allocated memory block is given a master descriptor with the properties of the block, i.e., the size, the address, and whether it is present in memory. When a request is made to access the block for reading or writing, the hardware checks its presence via the presence bit (pbit) in the descriptor. A pbit of 1 indicates the presence of the block; in this case the block can be accessed via the physical address in the descriptor. If the pbit is zero, an interrupt is generated for the MCP (operating system) to make the block present. If the address field is zero, this is the first access to this block and it is allocated (an init pbit). If the address field is non-zero, it is a disk address of the block, which has previously been rolled out, so the block is fetched from disk, the pbit is set to 1 and the physical memory address is updated to point to the block in memory (another pbit). This makes descriptors equivalent to a page-table entry in an MMU system. System performance can be monitored through the number of pbits: init pbits indicate initial allocations, but a high level of other pbits indicates that the system may be thrashing.

Note that all memory allocation is therefore completely automatic (one of the features of modern systems[5]) and there is no way to allocate blocks other than through this mechanism. There are no such calls as malloc or dealloc, since memory blocks are also automatically discarded. The scheme is also lazy, since a block will not be allocated until it is actually referenced. When memory is nearly full, the MCP examines the working set, trying compaction (since the system is segmented, not paged), deallocating read-only segments (such as code segments, which can be restored from their original copy), and, as a last resort, rolling dirty data segments out to disk.

Secondly, protection. Since all accesses are via the descriptor, the hardware can check that all accesses are within bounds and, in the case of a write, that the process has write permission. The MCP system is inherently secure and thus has no need of an MMU to provide this level of memory protection. Descriptors are read-only to user processes and may only be updated by the system (hardware or MCP). (Descriptors have a tag of 5 and odd-tagged words are read-only; code words have a tag of 3.) Blocks can be shared between processes via copy descriptors in the process stack, so some processes may have write permission whereas others do not. A code segment is read-only, thus reentrant, and is shared between processes. Copy descriptors contain a 20-bit address field giving the index of the master descriptor in the master descriptor array. This also implements a very efficient and secure IPC mechanism. Blocks can easily be relocated, since only the master descriptor needs updating when a block's status changes.

The only other aspect is performance: do MMU-based or non-MMU-based systems provide better performance? MCP systems may be implemented on top of standard hardware that does have an MMU (e.g., a standard PC). Even if the system implementation uses the MMU in some way, this will not be at all visible at the MCP level.


References
[1] http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344i/DDI0344I_cortex_a8_r3p1_trm.pdf
[2] Sun 68000 Board User's Manual, Sun Microsystems, Inc., February 1983, Revision B
[3] "AMD64 Architecture Programmer's Manual Volume 2: System Programming" (http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf). September 2007. Retrieved 2009-04-14.
[4] Can We Make Operating Systems Reliable and Secure? (http://www.computer.org/portal/site/computer/menuitem.5d61c1d591162e4b0ef1bd108bcd45f3/index.jsp?&pName=computer_level1_article&TheCat=1005&path=computer/homepage/0506&file=cover1.xml&xsl=article.xsl&;jsessionid=JBpRc0RM1CyxTQGRLQKDDNv8jGMwY9Bp2hs3QR1CdSQ9Jh6rQ5d2!-267722489)
[5] Design Principles Behind Smalltalk (Storage Management) (http://users.ipa.net/~dwighth/smalltalk/byte_aug81/design_principles_behind_smalltalk.html)

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.

Log-structured file system


A log-structured filesystem is a file system design first proposed in 1988 by John K. Ousterhout and Fred Douglis. Designed for high write throughput, it writes all updates to data and metadata sequentially to a continuous stream, called a log. The design was first implemented by Ousterhout and Mendel Rosenblum.

Rationale
Conventional file systems tend to lay out files with great care for spatial locality and make in-place changes to their data structures in order to perform well on optical and magnetic disks, which tend to seek relatively slowly.

The design of log-structured file systems is based on the hypothesis that this will no longer be effective, because ever-increasing memory sizes on modern computers would lead to I/O becoming write-heavy, since reads would almost always be satisfied from the memory cache. A log-structured file system thus treats its storage as a circular log and writes sequentially to the head of the log. This has several important side effects:
Write throughput on optical and magnetic disks is improved because writes can be batched into large sequential runs and costly seeks are kept to a minimum.
Writes create multiple, chronologically-advancing versions of both file data and metadata. Some implementations make these old file versions nameable and accessible, a feature sometimes called time-travel or snapshotting. This is very similar to a versioning file system.
Recovery from crashes is simpler. Upon its next mount, the file system does not need to walk all its data structures to fix any inconsistencies, but can reconstruct its state from the last consistent point in the log.

Log-structured file systems, however, must reclaim free space from the tail of the log to prevent the file system from becoming full when the head of the log wraps around to meet it. The tail can release space and move forward by skipping over data for which newer versions exist farther ahead in the log. If there are no newer versions, the data is moved and appended to the head. To reduce the overhead incurred by this garbage collection, most implementations avoid purely circular logs and divide their storage into segments. The head of the log simply advances into non-adjacent segments which are already free. If space is needed, the least-full segments are reclaimed first. This decreases the I/O load of the garbage collector, but becomes increasingly ineffective as the file system fills up and nears capacity.
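The append-at-the-head behaviour described above can be modelled in a few lines. The sketch below is illustrative only (no real on-disk format, no crash recovery, and the segment cleaner is omitted): every write of a logical block is appended at the log head and an in-memory map is redirected to the new copy, so older copies remain in the log as garbage for a cleaner to reclaim later.

```c
/* A toy log-structured store: writes never happen in place; each write is
 * appended at the head of the log and a map records where the newest copy
 * of each logical block lives. */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define BLOCK_SIZE      4096
#define LOG_BLOCKS      1024
#define MAX_FILE_BLOCKS 256

static uint8_t  log_space[LOG_BLOCKS][BLOCK_SIZE]; /* the on-"disk" log    */
static uint32_t head = 0;                          /* next free log block  */
static int32_t  block_map[MAX_FILE_BLOCKS];        /* logical -> log block */

static void lfs_init(void)
{
    for (int i = 0; i < MAX_FILE_BLOCKS; i++)
        block_map[i] = -1;                         /* nothing written yet  */
}

/* Overwriting a logical block appends a new copy and redirects the map. */
static int lfs_write(unsigned logical_block, const void *data)
{
    if (logical_block >= MAX_FILE_BLOCKS || head >= LOG_BLOCKS)
        return -1;                                 /* log full: cleaner needed */
    memcpy(log_space[head], data, BLOCK_SIZE);
    block_map[logical_block] = (int32_t)head++;
    return 0;
}

static const void *lfs_read(unsigned logical_block)
{
    if (logical_block >= MAX_FILE_BLOCKS || block_map[logical_block] < 0)
        return NULL;
    return log_space[block_map[logical_block]];
}

int main(void)
{
    uint8_t buf[BLOCK_SIZE];
    lfs_init();
    memset(buf, 'A', sizeof buf);
    lfs_write(0, buf);      /* first version of block 0 lands in log slot 0 */
    memset(buf, 'B', sizeof buf);
    lfs_write(0, buf);      /* rewrite appends at slot 1; slot 0 is now garbage */
    printf("block 0 now holds '%c'\n", ((const char *)lfs_read(0))[0]);
    return 0;
}
```

Rewriting the same logical block leaves the earlier copy stranded in the log; that stranded space is exactly what the segment cleaner reclaims.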


Implementations
John K. Ousterhout and Mendel Rosenblum implemented the first log-structured file system for the Sprite operating system in 1992.[1][2]
- BSD-LFS, an implementation by Margo Seltzer, was added to 4.4BSD and was later ported to 386BSD. It lacked support for snapshots. It was removed from FreeBSD and OpenBSD, but still lives on in NetBSD.
- Plan 9's Fossil file system is also log-structured and supports snapshots.
- NILFS is a log-structured file system implementation for Linux by NTT/Verio which supports snapshots.
- LinLogFS (formerly dtfs) and LFS (http://logfs.sourceforge.net/)[3] are log-structured file system implementations for Linux. The latter was part of Google Summer of Code 2005. Both projects have been abandoned.
- LFS[4] is another log-structured file system for Linux, developed by Charles University, Prague. It was to include support for snapshots and indexed directories, but development has since ceased.
- ULFS is a User-Level Log-structured File System (http://ulfs.sf.net) using FUSE (http://fuse.sf.net).
- CASL is a proprietary log-structured filesystem that uses solid-state devices to cache traditional hard drives (http://www.nimblestorage.com/products/architecture/).
- SISL is a log-structured filesystem with deduplication, designed by DataDomain (http://www.datadomain.com/products/SISL.html).

Some kinds of storage media, such as flash memory and CD-RW, slowly degrade as they are written to and have a limited number of erase/write cycles at any one location. Log-structured file systems are sometimes used on these media because they make fewer in-place writes and thus prolong the life of the device by wear leveling. The more common such file systems include:
- UDF, a file system commonly used on optical discs.
- JFFS and its successor JFFS2, simple Linux file systems intended for raw flash-based devices.
- UBIFS, a filesystem for raw NAND flash media, also intended to replace JFFS2.
- LogFS, a scalable flash filesystem for Linux that works on both raw flash media and block devices, intended to replace JFFS2.
- YAFFS, a raw NAND flash-specific file system for many operating systems (including Linux).

Disadvantages
The design rationale for log-structured file systems assumes that most reads will be optimized away by ever-enlarging memory caches. This assumption does not always hold:
- On magnetic media, where seeks are relatively expensive, the log structure may actually make reads much slower, since it fragments files that conventional file systems normally keep contiguous with in-place writes.
- On flash memory, where seek times are usually negligible, the log structure may not confer a worthwhile performance gain, because write fragmentation has much less impact on write throughput. However, many flash-based devices cannot rewrite part of a block: they must first perform a (slow) erase cycle of a block before it can be rewritten. Putting all the writes into one block can therefore help performance, compared with writes scattered across many blocks, each of which must be copied into a buffer, erased, and written back.
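To make the last point concrete, the toy model below counts erase cycles under a deliberately simplified assumption: an erase block holds a few pages, a rewrite of any page forces an erase of its whole block, and consecutive writes that land in the same block can share one erase. The geometry and the write patterns are illustrative assumptions only; real flash translation layers behave very differently.

    #include <stdio.h>

    #define PAGES_PER_BLOCK 4                 /* illustrative erase-block geometry */

    /* Count erase cycles for a sequence of page writes, assuming consecutive
     * writes into the same erase block can be combined into a single erase. */
    static int erase_cycles(const int *pages, int n)
    {
        int erases = 0, i;
        for (i = 0; i < n; i++) {
            int blk = pages[i] / PAGES_PER_BLOCK;
            if (i == 0 || blk != pages[i - 1] / PAGES_PER_BLOCK)
                erases++;                     /* a new block must be erased        */
        }
        return erases;
    }

    int main(void)
    {
        /* Four updates scattered across four different erase blocks ...          */
        int scattered[] = { 0, 5, 9, 14 };
        /* ... versus the same four updates redirected, log-style, into one block. */
        int batched[]   = { 0, 1, 2, 3 };

        printf("scattered in-place writes: %d erase cycles\n", erase_cycles(scattered, 4));
        printf("log-style batched writes:  %d erase cycles\n", erase_cycles(batched, 4));
        return 0;
    }

Under this model the scattered pattern costs four erase cycles while the batched pattern costs one, which is the intuition behind using log-structured layouts on erase-block media.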


References
[1] Rosenblum, Mendel and Ousterhout, John K. (June 1990). "The LFS Storage Manager" (http://citeseer.ist.psu.edu/rosenblum90lfs.html). Proceedings of the 1990 Summer Usenix, pp. 315-324.
[2] Rosenblum, Mendel and Ousterhout, John K. (February 1992). "The Design and Implementation of a Log-Structured File System" (http://citeseer.ist.psu.edu/rosenblum91design.html). ACM Transactions on Computer Systems, Vol. 10, Issue 1, pp. 26-52.
[3] http://logfs.sourceforge.net/
[4] http://aiya.ms.mff.cuni.cz/lfs

Locality of reference
In computer science, locality of reference, also known as the principle of locality, is the phenomenon of the same value, or related storage locations, being frequently accessed. There are two basic types of reference locality. Temporal locality refers to the reuse of specific data and/or resources within relatively small time durations. Spatial locality refers to the use of data elements within relatively close storage locations. Sequential locality, a special case of spatial locality, occurs when data elements are arranged and accessed linearly, e.g., traversing the elements in a one-dimensional array. Locality is merely one type of predictable behavior that occurs in computer systems. Systems which exhibit strong locality of reference are good candidates for performance optimization through techniques such as caching and instruction prefetching for memory, and advanced branch prediction in the processor pipeline.

Locality of reference
The locality of reference, also known as the locality principle, is the phenomenon that the collection of data locations referenced by a running computer in a short period of time often consists of relatively well-predictable clusters. Important special cases of locality are temporal, spatial, equidistant and branch locality.
- Temporal locality: if at one point in time a particular memory location is referenced, it is likely that the same location will be referenced again in the near future. There is temporal proximity between adjacent references to the same memory location. In this case it is common to make efforts to store a copy of the referenced data in special memory storage that can be accessed faster. Temporal locality is a very special case of spatial locality, namely when the prospective location is identical to the present location.
- Spatial locality: if a particular memory location is referenced at a particular time, it is likely that nearby memory locations will be referenced in the near future. In this case it is common to attempt to guess the size and shape of the area around the current reference for which it is worthwhile to prepare faster access. (A short code sketch appears below this list.)
- Branch locality: there are only a few possible alternatives for the prospective part of the path through the spatial-temporal coordinate space. This is the case when an instruction loop has a simple structure, or when the possible outcome of a small system of conditional branching instructions is restricted to a small set of possibilities. Branch locality is typically not a spatial locality, since the few possibilities can be located far away from each other.
- Equidistant locality: halfway between spatial locality and branch locality. Consider a loop accessing locations in an equidistant pattern, i.e. the path in the spatial-temporal coordinate space is a dotted line. In this case, a simple linear function can predict which location will be accessed in the near future.

To benefit from the very frequently occurring temporal and spatial kinds of locality, most information storage systems are hierarchical; see below. Equidistant locality is usually supported by the diverse non-trivial increment instructions of processors. For branch locality, contemporary processors have sophisticated branch predictors, and on the basis of this prediction the memory manager of the processor tries to collect and preprocess the data of the plausible alternatives.
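The first two cases can be made concrete with a short, hypothetical C fragment: the loop counter and the running sum are reused on every iteration (temporal locality), while the array is read front to back at consecutive addresses (spatial, and in particular sequential, locality).

    #include <stdio.h>

    #define N 1024

    int main(void)
    {
        static int a[N];               /* laid out contiguously in memory */
        long sum = 0;
        int i;

        for (i = 0; i < N; i++)
            a[i] = i;

        for (i = 0; i < N; i++)
            sum += a[i];               /* a[i]: spatial locality (consecutive addresses)  */
                                       /* sum, i: temporal locality (reused every pass)   */
        printf("%ld\n", sum);
        return 0;
    }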


Reasons for locality


There are several reasons for locality. These reasons are either goals to achieve or circumstances to accept, depending on the aspect. The reasons below are not disjoint; in fact, the list goes from the most general case to special cases.
- Predictability: locality is merely one type of predictable behavior in computer systems. Fortunately, many practical problems are decidable, and hence the corresponding program can behave predictably, if it is well written.
- Structure of the program: locality often occurs because of the way in which computer programs are created for handling decidable problems. Generally, related data is stored in nearby locations in storage. One common pattern in computing involves processing several items, one at a time. This means that if a lot of processing is done, a single item will be accessed more than once, leading to temporal locality of reference. Furthermore, moving to the next item implies that the next item will be read, hence spatial locality of reference, since memory locations are typically read in batches.
- Linear data structures: locality often occurs because code contains loops that tend to reference arrays or other data structures by indices. Sequential locality, a special case of spatial locality, occurs when relevant data elements are arranged and accessed linearly. For example, the simple traversal of the elements in a one-dimensional array, from the base address to the highest element, would exploit the sequential locality of the array in memory.[1] The more general equidistant locality occurs when the linear traversal is over a longer area of adjacent data structures with identical structure and size, and only the mutually corresponding elements of each structure are accessed, rather than the whole structures. This is the case when a matrix is represented as a sequential array of rows and the requirement is to access a single column of the matrix (see the sketch after this list).
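The matrix-column case mentioned in the last item can be sketched as follows, assuming a hypothetical row-major array in C: a row traversal touches consecutive addresses (sequential locality), while a column traversal touches addresses a fixed stride of COLS elements apart (equidistant locality).

    #include <stdio.h>

    #define ROWS 4
    #define COLS 4

    int main(void)
    {
        static double m[ROWS][COLS];   /* row-major: m[r][c] lives at offset r*COLS + c */
        double row_sum = 0.0, col_sum = 0.0;
        int r, c;

        /* Row traversal: consecutive memory addresses (sequential locality). */
        for (c = 0; c < COLS; c++)
            row_sum += m[0][c];

        /* Column traversal: addresses COLS elements apart (equidistant locality). */
        for (r = 0; r < ROWS; r++)
            col_sum += m[r][0];

        printf("%f %f\n", row_sum, col_sum);
        return 0;
    }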

Use of locality in general


If, most of the time, a substantial portion of the references aggregate into clusters, and if the shape of this system of clusters can be well predicted, then this can be used for performance optimization. There are several ways to benefit from locality. The common optimization techniques are:
- to increase the locality of references, which is usually achieved on the software side;
- to exploit the locality of references, which is usually achieved on the hardware side.

Temporal and spatial locality can be exploited by hierarchical storage hardware. Equidistant locality can be used by appropriately specialized processor instructions; this possibility is the responsibility not only of the hardware but also of the software, whose structure must be suitable for compiling a binary program that calls the specialized instructions in question. Branch locality is a more elaborate possibility, and hence more development effort is needed, but there is a much larger reserve for future exploration in this kind of locality than in all the remaining ones.

Use of spatial and temporal locality: hierarchical memory


Hierarchical memory is a hardware optimization that takes the benefits of spatial and temporal locality and can be used on several levels of the memory hierarchy. Paging obviously benefits from temporal and spatial locality. A cache is a simple example of exploiting temporal locality, because it is a specially designed, faster but smaller memory area, generally used to keep recently referenced data and data near recently referenced data, which can lead to potential performance increases. Data in a cache does not necessarily correspond to data that is spatially close in main memory; however, data elements are brought into the cache one cache line at a time. This means that spatial locality is again important: if one element is referenced, a few neighboring elements will also be brought into the cache. Finally, temporal locality plays a role at the lowest level, since results that are referenced very closely together can be kept in the machine registers. Programming languages such as C allow the programmer to suggest that certain variables are kept in registers.

Data locality is a typical memory-reference feature of regular programs (though many irregular memory access patterns exist). It makes the hierarchical memory layout profitable. In computers, memory is divided into a hierarchy in order to speed up data accesses. The lower levels of the memory hierarchy tend to be slower, but larger. Thus, a program will achieve greater performance if it uses memory while it is cached in the upper levels of the memory hierarchy and avoids bringing other data into the upper levels of the hierarchy that will displace data that will be used shortly in the future. This is an ideal, and sometimes cannot be achieved.

Typical memory hierarchy (access times and cache sizes are approximations of typical values used as of 2006 for the purpose of discussion; actual values and actual numbers of levels in the hierarchy vary):
- CPU registers (8-128 registers): immediate access
- L1 CPU caches (32 KiB to 512 KiB): fast access
- L2 CPU caches (128 KiB to 24 MiB): slightly slower access
- Main physical memory (RAM) (256 MiB to 64 GiB): slow access
- Disk (file system) (100 GiB to 10 TiB): very slow
- Remote memory (such as other computers or the Internet) (practically unlimited): speed varies


Modern machines tend to read blocks of lower memory into the next level of the memory hierarchy. If doing so displaces memory that is in use, the operating system tries to predict which data will be accessed least often (or furthest in the future) and move it down the memory hierarchy. Prediction algorithms tend to be simple in order to reduce hardware complexity, though they are becoming somewhat more sophisticated.
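The text above does not name a particular prediction algorithm; one widely used and simple approximation is least-recently-used (LRU) replacement, in which the entry that has gone longest without being referenced is assumed to be needed furthest in the future and is moved down the hierarchy first. The following is only an illustrative sketch of that policy over a tiny, hypothetical cache of four slots.

    #include <stdio.h>

    #define SLOTS 4

    static int  slot_block[SLOTS];     /* block held by each slot, -1 if empty      */
    static long slot_used[SLOTS];      /* "time" of the most recent reference       */
    static long now;

    /* Reference a block: hit if cached, otherwise evict the least recently used. */
    static void reference(int block)
    {
        int i, victim = 0;
        for (i = 0; i < SLOTS; i++) {
            if (slot_block[i] == block) {          /* hit: refresh its timestamp    */
                slot_used[i] = ++now;
                return;
            }
            if (slot_used[i] < slot_used[victim])
                victim = i;                        /* remember the oldest slot      */
        }
        if (slot_block[victim] >= 0)
            printf("miss on block %d, evicting block %d\n", block, slot_block[victim]);
        slot_block[victim] = block;                /* fill the victim slot          */
        slot_used[victim]  = ++now;
    }

    int main(void)
    {
        int i;
        for (i = 0; i < SLOTS; i++)
            slot_block[i] = -1;
        /* Blocks 0-3 fill the cache; block 4 then evicts block 0, the LRU entry. */
        for (i = 0; i <= 4; i++)
            reference(i);
        return 0;
    }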

Spatial and temporal locality example: matrix multiplication


A common example is matrix multiplication:

    for i in 0..n
        for j in 0..m
            for k in 0..p
                C[i][j] = C[i][j] + A[i][k] * B[k][j];

When dealing with large matrices, this algorithm tends to shuffle data around too much. Since memory is pulled up the hierarchy in consecutive address blocks, in the C programming language it is advantageous to refer to several memory addresses that share the same row (spatial locality). By keeping the row number fixed, the second index changes more rapidly; in C and C++ this means the memory addresses are used more consecutively. One can see that since j affects the column reference of both matrices C and B, it should be iterated in the innermost loop (this will fix the row iterators, i and k, while j moves across each column in the row). This will not change the mathematical result, but it improves efficiency. By switching the looping order for j and k, the speedup in large matrix multiplications becomes dramatic. (In this case, 'large' means, approximately, more than 100,000 elements in each matrix, or enough addressable memory such that the matrices will not fit in the L1 and L2 caches.)

Temporal locality can also be improved in the above example by using a technique called blocking. The larger matrix can be divided into evenly sized sub-matrices, so that the smaller blocks can be referenced (multiplied) several times while in memory:

    for (ii = 0; ii < SIZE; ii += BLOCK_SIZE)
        for (kk = 0; kk < SIZE; kk += BLOCK_SIZE)
            for (jj = 0; jj < SIZE; jj += BLOCK_SIZE)
                for (i = ii; i < ii + BLOCK_SIZE && i < SIZE; i++)
                    for (k = kk; k < kk + BLOCK_SIZE && k < SIZE; k++)
                        for (j = jj; j < jj + BLOCK_SIZE && j < SIZE; j++)
                            C[i][j] = C[i][j] + A[i][k] * B[k][j];

The temporal locality of the above solution is provided because a block can be used several times before moving on, so that it is moved in and out of memory less often. Spatial locality is improved because elements with consecutive memory addresses tend to be pulled up the memory hierarchy together.


Bibliography
P.J. Denning, The Locality Principle, Communications of the ACM, Volume 48, Issue 7 (2005), pages 19-24.
P.J. Denning, S.C. Schwartz, Properties of the working-set model, Communications of the ACM, Volume 15, Issue 3 (March 1972), pages 191-198.

References
[1] Aho, Lam, Sethi, and Ullman. "Compilers: Principles, Techniques & Tools" 2nd ed. Pearson Education, Inc. 2007

Article Sources and Contributors


Central processing unit Source: http://en.wikipedia.org/w/index.php?oldid=513561248 Contributors: .:Ajvol:., 04satvinderbi, 11james22, 132qwerty, 16@r, 4I7.4I7, 4twenty42o, 7265, ABF, Acdx, Aceofskies05, Acroterion, Adam.Amory.97, Adam1213, Adamwisky, AdjustShift, Ahoerstemeier, Ahpook, Aiken drum, Aitias, Akaka, Akjar13, Akuyume, Alansohn, Alex, AlphaPyro, Alphachimp, AlyM, Ambuj.Saxena, Ameliorate!, Amp71, Anakata, AnakngAraw, Ancheta Wis, Anchit singla, Andre Engels, Andreas.Persson, AndrewWTaylor, Andy Dingley, Andypandy.UK, Andypandyjay, Angela, Angelic Wraith, Anna Lincoln, Ano onymis, Anonymous editor, Antandrus, Aphaia, Arakunem, Arcadie, Arch dude, ArchStanton69, Archer3, ArchonMagnus, Arman Cagle, Art LaPella, Arthena, Arunib, Ashawley, Ashish Gaikwad, Atlant, AuburnPilot, Avono, Avraham, Axl, AzaToth, BACbKA, BRUTE, Banes, Beastathon, Beland, Ben ben ben ben ben jerry, Ben-Zin, Benjaminjkim, Benscripps, Berkut, Bichito, Big Bird, Bighead01753, Bigtimepeace, BillyPreset, Bionik276, Bitolado, Blainster, Blanchardb, Blargarg, Bloodshedder, Bob121, Bobanater, Bobblewik, Bobet, Bobo192, Bonadea, Bongwarrior, Bookofjude, Booyabazooka, Bowmanjj, Brainmachine, Brheed Zonabp84, Brion VIBBER, Bsadowski1, Buletproofbrit, Bwrs, CJLL Wright, CKsquid, CTF83!, CZmarlin, Cactus.man, Calmer Waters, Camomen3000, Camoxide, Can't sleep, clown will eat me, CanadianLinuxUser, CanisRufus, Capricorn42, Captain-tucker, CaptainVindaloo, Catfish Jim and the soapdish, Cedars, Cessator, Chasingsol, Chip123456, Chowbok, Chuck Smith, Chuq, Chuunen Baka, Cimon Avaro, Ck lostsword, Coekon, Cogiati, Cohesion, Cometstyles, CommonsDelinker, Computerwoman417, Cpusweden, Crusadeonilliteracy, Cuchullain, Cureden, Curps, Cybercobra, Cyrius, D0762, DARTH SIDIOUS 2, Damicatz, Dan Granahan, Dan100, Darklightning1, DarthShrine, Dauntless28, Dave314159, David Gerard, DavidCary, Davo123, Deelkar, Deerstop, Defender of torch, Deicool, Dekisugi, Demus Wiesbaden, DerHexer, Dhiraj1984, Diannaa, Diego UFCG, Dinshoupang, Discospinster, Dlohcierekim's sock, Dmsar, Dmytheus, Doczilla, Dogah, Dogan900, Don4of4, Donarreiskoffer, Dr.alf, DragonHawk, Druiloor, Duk, Dulciana, Dureo, DwightKingsbury, Dyl, Ebricca, Edderso, Edonovan, Edward, EdwardJK, Edwy, Ekilfeather, Ekonomka, Elano, Eleassar777, Electron20, Elfguy, Elipongo, Eliz81, Emperorbma, Emre D., Enviroboy, Epbr123, Epic matt, Erin2003, Espin2, Espoo, Ettrig, Everyking, Evil saltine, Excirial, FMAlchemist36, Fabartus, Fallout boy, Fanf, Farosdaughter, Fastily, Fatal-, Feezo, Feinoha, Fieldday-sunday, Finell, Fir0002, Fiskars007, FleetCommand, Flewis, Flipper344, Fnagaton, Foxj, Frak, Frap, Freakofnurture, Frecklefoot, Fred Gandt, Fredrik, Frymaster, Fvasconcellos, Fylbecatulous, F, GB fan, Gail, Galoubet, Garion96, Gdrg22, Genius1789, George The Dragon, GeorgeBills, Georgeb92, GermanX, Giftlite, Gilliam, Gimmetrow, Giraffedata, Glane23, Glenn, Gmb1994, Gogo Dodo, Gohst, Goodone121, GraemeL, Graham Jones, Graham87, Greg.Kerr01, GregorB, Grungerz, Gseandiyh79, Guanaco, Gurch, Guy Harris, Gwandoya, Gwernol, Hadal, Hakufu Sonsaku, Hallo990, HandyAndy, Hans Dunkelberg, Harpastum, Haseo9999, Hashar, Hazal018, Headbangerbuggy, Heron, Hmains, Hmdz105, Hobartimus, Hotcrocodile, Ht1848, Hydrogen Iodide, Iain.mcclatchie, Icelight, Ida Shaw, Ihateblazing, Ikiroid, Ilion2, Imdaking, Imperial Monarch, Info lover, Instigate cjsc (Narine), Instinct, Ipsign, IraChesterfield, Iridescent, Ironholds, Iwan rahabok, Ixfd64, J.delanoy, J128, JForget, JHMM13, Jab843, Jackfork, Jacoplane, Jacopone, 
James086, Jamesooders, Jan1nad, Jarred.rop1, Jasper Deng, Java13690, Jaxl, Jazzee8845, Jbray3179, Jcoy, Jd027, JeLuF, Jebus989, Jeff G., Jeffreyarcand, Jennavecia, Jeremy Visser, Jesse V., JesseW, JiFish, JinJian, Jiy, Jjtennisman, Jmabel, JoanneB, Joelr31, John Quincy Adding Machine, JohnFromPinckney, Johnteslade, JonHarder, Jonas weepel, Jondel, Jorgenev, Joyous!, Jpk, Jpkoester1, Jrstern29, Jsharpminor, Juhuang, Justinc, K1Bond007, Kabu, Karderio, Kbdank71, Keilana, Kelpherder, Kesac, Killdevil, King of Hearts, Kipala, Klaser, Knepflerle, Knockwood, Knowitall44, KnowledgeOfSelf, Koishii1521, Kozuch, Krawi, Krich, Ks0stm, Kubanczyk, Kudz75, KuraAbyss, Kuru, LHvU, LOL, LaMenta3, Landon1980, Lankiveil, Lanky217, LaughingMan42, Lauri.pirttiaho, LeaveSleaves, Lectonar, LeeDanielCrocker, Leon7, LevelCheck, Levineps, Lg1223, Liao, Liftarn, Lightmouse, Ligulem, LikeLakers2, Lilac Soul, Little Mountain 5, LittleOldMe, Littleog, Lmno, Longhair, LordJeff, Loren.wilton, Lou.weird, Luciandrei, Luna Santin, Lupin, MC MasterChef, MECU, MONGO, Mac, MadGuy7023, Mahjongg, Malcolm, Malmis, Mandarax, Mandetory, Mani1, MansonP, Manta7, MantridFreeman, Manuel Trujillo Berges, Mapolat, MarSch, Marek69, Martin451, Materialscientist, Matey, Matt 118118, Matt Britt, Matthewirwin28693, Matticus78, Mav, Mayooranathan, Mbarbier, Mcdennis13, Mcicogni, Meiskam, Melsaran, Mentifisto, Merovingian, Michael Hardy, Microcell, Mikael Hggstrm, Mike Dill, Mike92591, Mikepabell, Milf&cookies, Millermk, Minesweeper, Minimac, Miranda, Misza13, Mitsuhirato, Mjbt, Mjpieters, Mmccalpin, Mmxx, Modulatum, Momma69, Monkeynoze, MooresLaw, Morkork, Mortense, Morwen, Mpgenius, MrOllie, Mudlock, Muriel Gottrop, Murray Langton, Mxn, Mygerardromance, Myhellhereorunder, Mynameiswa, MysticMetal, NPrice, Nagy, Nanshu, Nashhinton, NawlinWiki, Nayvik, Neelix, Neilc, NellieBly, Netanel h, Netoholic, NewEnglandYankee, Newton2, NickBush24, Nihiltres, Nikai, Nixdorf, Nixeagle, Noypi380, Nsaa, Nuggetboy, Nv8200p, Nxavar, Ohnoitsjamie, Oleg Alexandrov, Olie93, Omicronpersei8, Omiks3, Optakeover, Orange Suede Sofa, Orphan Wiki, Oxfordwang, Oxymoron83, P.B. Pilhet, PZFUN, Paolo.dL, Paul August, Pearcej, Pearle, Persian Poet Gal, Personline, Peruvianllama, Peti1212, Pgk, Phaldo, Phantomsteve, Pharaoh of the Wizards, Phatstakks, Pheeror, Phgao, Phil Boswell, Philip Trueman, Phuzion, Piano non troppo, Pilotguy, Pinkadelica, Pixel8, Pointillist, Pol098, Pololei, Pooosihole, Popup, Positron, Pranayrocks23, Priver312, Prodego, Pyfan, Qmwne235, Quadell, Quale, Quantyz, Qwyrxian, Qxz, R twell27, R'n'B, R. S. 
Shaw, RMartin-2, RTC, Radiopathy, Raghavkvp, Rajayush78, Ravenperch, RazorICE, Rdsmith4, Reach Out to the Truth, Reisio, Renamed user 1752, Res2216firestar, RexNL, Rhobite, Richi, Richwales, Rick Sidwell, Ridge Runner, RightSideNov, Rilak, RipFire12901, Rjclaudio, Robert K S, RobertG, Robost, Rome109, Ronhjones, Rowan Moore, Rprpr, Rror, Rsrikanth05, Ruud Koot, Rwwww, RyanParis, Rzsor, S raghu20, SCRECROW, SEG88, SGBailey, ST47, SWAdair, Sahrin, Sam42, Sango123, Sasquatch, Savemeto2, Sax Russell, Sceptre, Scientus, Scriptfan, Sdfsakjdhfuioaheif283, Sean.hoyland, Seanm924, Seaphoto, Secretlondon, Sensiblemayank, Shadowjams, Shanes, Shawnc, Shell Kinney, Shizhao, Shoessss, Shrikanthv, Sicklight, SimonP, Sir Hat, Slavy13, Smurrayinchester, Snigbrook, Sniper-ass, Snowmanmelting, Snowolf, Soir, Solipsist, Solitude, Solphusion, Someguy1221, Sonicology, Sp, SpaceFlight89, Spacepotato, Specs112, Spike Wilbury, Spliffy, Srce, Starcraft.nut, Stas3717, Steel, Stephenb, SteveBaker, Stevertigo, Suruena, Sverdrup, Sychen, Synchrite, TakuyaMurata, Tarquin, Tawker, Taxman, Taylorclem, Tbhotch, Techman224, Tempshill, Terence, TexasAndroid, The Anome, The High Fin Sperm Whale, The Ice Inside, The Thing That Should Not Be, The dragon123, The sock that should not be, TheJosh, TheKMan, Thesalus, Think outside the box, Thisisafakeaccountfordeletingpages, Tide rolls, Tigga en, Tim1980tim, Timir Saxa, Timl2k4, Timmy2, Tinton5, Tiptoety, ToastyMallows, Tom harrison, TomBridle, Tomeasy, Tommy Kronkvist, Tone, Tony1, TonySt, ToobMug, Tophtucker, Torahjerus14, Toussaint, Tpbradbury, Traroth, Tree Biting Conspiracy, Triggerhappy412, Trollilols, Trusilver, Tslocum, Ttwaring, Tuoreco, Turnerj, Tyler, Uncle Dick, Unixguy, Upholder, Useight, Uzume, VI, VampWillow, Velella, Versus22, Vikingstad, Vrenator, Vssun, Vulcanstar6, WJetChao, Wapcaplet, WardMuylaert, Wasted Sapience, Wavelength, Wayne Hardman, Wayne Slam, Wayward, Wbm1058, Wdfowty, Whispering, WhiteNebula, Who, Widefox, Wiki alf, WikiDegausser, WikiLaurent, Wikiloop, WikipedianMarlith, Willtron, Wilson44691, Wimt, Windchaser, Winstonliang6758, Wiz126, Wizardist, Wjbeaty, Wouterstomp, Wtshymanski, Wx4sno, X360Silent, X5UPR3ME STEV3x, XJamRastafire, XTerminator2000, Xaosflux, Xenobog, Xoneca, YDZ, Yabeeno, Yaronf, Yidisheryid, YourUserHere, Yucel114, Z.E.R.O., ZaferXYZ, ZeWrestler, Zelphar, ZeroOne, Zidonuke, Zippedmartin, Zodon, Zzuuzz, fw-us-hou-8.bmc.com, , , 2004 anonymous edits Random-access memory Source: http://en.wikipedia.org/w/index.php?oldid=514012208 Contributors: 0612, 128mem, 1exec1, 28421u2232nfenfcenc, 28bytes, 2TerabyteBox, =Josh.Harris, ALEF7, AThing, Aapo Laitinen, Abhikandoi2000, Abjm, Accurizer, Addera, Adlen, AdultSwim, Aitias, Akira625, Akuyume, Akyoyo94, Alan1000, Alansohn, Ale jrb, AlefZet, AlexanderDS, Alfio, Ali@gwc.org.uk, Alucard sigma, Amaurea, AnMaster, Anagy22, Anaxial, Andonic, Andrejj, Andres, Andrewy, Andy M. 
Wang, Andyjsmith, Andypandy.UK, Angelic Wraith, Anikingos, Ansible, Antandrus, AnthonyA7, Antman, Anuragiscool12, Aogus, Arbraini, Arkrishna, ArnoldReinhold, Arvindn, Asanchez1572, Asapilu, Aschmitz, Aslihanbilgekurt, Atulsnischal, Austinmurphy, Ayecee, AzaToth, B0at, BR01097, BW52, Bachrach44, Balderdash707, Barticus88, Basak327, Bawolff, Bdamokos, Bdskene, Beland, Ben D., Ben-Zin, Bencherlite, Benjeeeboy, Benjiboi, Berkut, BernardH, Beyond silence, Bjf, Blackdogman, Blanchardb, Blinklad, Bloigen, Bmonro, Bmunden, Bob360bob360, Bobagoncheese, Bobanater, Bobblewik, Bobo192, Bonadea, Bongwarrior, Boothy443, BorgHunter, Boulaur, Bped1985, Brianga, Brianski, Brion VIBBER, Brookie, Brother Dysk, Bruce1ee, Bsdjkqvfkj, Burntsauce, Bwhack, CASIO F-91W, CAkira, Cahk, Calamari, Caltas, Camboxer, Camerong, Can't sleep, clown will eat me, CanisRufus, Capricorn42, Cast, Catgut, Cbuckley, Cctoombs, Cessator, Chaheel Riens, Charleca, Charlieleake, Cheesechords, Chetvorno, ChipChamp, Christian List, CiaPan, Ciphers, CityOfSilver, Ckape, Clueless newbie, CoJaBo, CodeMaster123, Coercorash, ComputerWizerd, Conversion script, Cool110110, Coremayo, Corixidae, Corpx, Craggyisland, Crakkpot, Crispmuncher, Critikal, Crusadeonilliteracy, Crzysdrs, Cst17, Ctech72, Cyan Gardevoir, Czarkoff, DARTH SIDIOUS 2, DBigXray, DMacks, DVD R W, DVdm, Damian Yerrick, Dancter, Daranz, Darth Panda, DarylNickerson, DavidCary, DavidH, Davodd, Dchidest, Deathanatos, Deking305, DerHexer, DexDor, Dgw, Dharmuone, Diannaa, DieSwartzPunkt, Dimo414, DirkvdM, Discospinster, Djbdan, Doanjackson, DocWatson42, DocendoDiscimus, Dosman, Douglas Whitaker, Download, Drhex, Drrngrvy, Dustimagic, Dzubint, Dugosz, ELCleanup, ERcheck, Edderso, Editor911, Edward321, Eehite, Eggman183, Egmontaz, ElfWarrior, Elvir.doc, Emx, Enchanter, Enviroboy, Epbr123, Eric-Wester, EvanSeeds, Excirial, Explicit, Falcon8765, Favonian, Fervidfrogger, Fillepa, Fireaxe888, Firetrap9254, Flewis, Flowerpotman, Fnagaton, Frap, Frappucino, Frecklefoot, Frieda, Frosty3219, Funkyj, Furrykef, Fyrael, Fyyer, GCFreak2, Gadfium, Gaius Cornelius, Gamer007, Gapaddict, Gareth Aus, Geljamin, Geniac, Giftlite, Gjeremy, Glass Sword, Glenn, Gogo Dodo, Gold Standard, Gonzo fan2007, GoodDamon, Goodnightmush, Gopherbob1921, Gordeonbleu, Gracefool, Graham87, Greg847, Gregbard, Grendwx, Grinters, Grstain, Guardianangelz, Gurch, Guthrie, Guy Harris, Hadal, Halcionne, HalfShadow, Halmstad, Hariraja, Harp, Hazel77, Hbent, Heaviestcat, HeirloomGardener, HenkeB, Heron, Hikmet483, Hirohisat, Hllomen, Hobartimus, Holizz, Hoo man, Hounder4, Hsn6161, Hughey, Hughtcool, Hulleye, Husond, Huw Powell, Hwan051, Hydrogen Iodide, ICE77, II MusLiM HyBRiD II, IRP, Iced-T, Icelight, Igoldste, Ikebowen, Imran, Inaneframe, Infernowolf36, Insanity Incarnate, Irbisgreif, IronGargoyle, Ironicart, Irwangatot, Island, Isnow, Itsacon, Iulianu, Ixfd64, J.delanoy, J00tel, J04n, JForget, JaGa, Jaakobou, Jab843, James Anthony Knight, James084, Jascii, Jasper Deng, Jasz, JayC, Jbolden1517, JeLuF, Jeff G., Jeffrey O. Gustafson, Jennavecia, Jerome Charles Potts, Jerry1964, Jesse V., Jesse Viviano, Jfmantis, Jgreenberg, Jheald, Jic, Jim1138, Jimfile, Jimothytrotter, Jni, JoanneB, Joeblakesley, John Millikin, John.jml739, John254, JonHarder, Jonhjelm, Jonomacdrones, JoshuaZ, Jsharpminor, Juckum, Judy74, Julz888, K. 
Annoyomous, KB Alpha, KD5TVI, Kagemaru16, KaiKemmann, Kain Nihil, Kannmaeh, Karmic, Keilana, Keithonearth, Kelly Martin, Kelvingeorge, KennethJ, Kesac, Kevin chen2003, Khanbm, Kickkmyfaceinn, KieferSkunk, King Arthur6687, Kingpin13, Kjkolb, Kman543210, KnowledgeOfSelf, Kozuch, Krich, Kubanczyk, L Kensington, LOL, LSD, LX, Lakers297065, Lando Calrissian, LarsHolmberg, Law, LeaveSleaves, LedgendGamer, LeoNomis, Lerdsuwa, Leslie Mateus, Liamgaygay, Licensedlunacy, Lilac Soul, Lipatden, LizardJr8, Llakais, Logicwiki, Looper5920, Lotu, Lova Falk, Lovok, Lpetrazickis, LuK3, Luk, Luna Santin, Lupacchione, M0rphzone, MER-C, Mac, Madhero88, Majorly, Malafaya, Mani1, Manlyspidersporkadmin, Marsel92 22, Mart22n, Martin451, Masamage, Maschelos, Masterofabcs, MateoCorazon, Materialscientist, Matma Rex, Matt Britt, Mattimeeleo, Maury Markowitz, Mav, Maxxdxx, Mayumashu, Mbessey, Mboverload, Mdkoch84, Mechanical digger, Med, Megaman en m, Melsaran, Mentifisto, Mhnin0, Mike Dill, Mike Rosoft, Mild Bill Hiccup, Mindloss, MindyTan, Mindymoo22, Miquonranger03, Mirage-project, Mistercow, Monkey Bounce, Moogle001, Moreati, MorganaFiolett, MorrisRob, Mouse Nightshirt, Moxfyre, MrFish, MrOllie, Mulad, Mulder416, Mulligan's Wake, Muugokszhiion, Myanw, Mygerardromance, Myke2020, NYKevin, Nakon, NawlinWiki, Ned Scott, Nepenthes, Newport Beach, NicAgent, Nintnt, Nixeagle, Njardarlogar, Nmacu, Nmagedman, Noah Salzman, Norm, Notting Hill in London, Nsaa, Nyvhek, Obradovic Goran, Ochib, Odie5533, Oicumayberight, Oiketohn, Oliverdl, OlofE, Omicronpersei8, Ommel, Onur074, Optimist on the run, Orannis, OrbitOne, Oroso, Orphic, Owain, Oxydo, Oxymoron83, Papercutbiology, Parnell88, Pboyd, Pcb21, Pcb95, Pcj, Pearle, Pedro, Persian Poet Gal, Peruvianllama, Peter.C, Peterl, Peyre, Pgan002, Pgk, PhilKnight, Philip Trueman, PhilipO, Pinethicket, Pnnguyen, Possum, Prakashkumaraman, Profgadd, Proofreader77, Puchiko, Purpledramallama, Qatter, Qwyrxian, Qxz, R'n'B, R. S. Shaw, RA0808, RJaguar3, Radon210, RainbowOfLight, Ramu50, Raven4x4x, Ravenperch, RazielZero, Razorflame, Real NC, Reconsider the static, Reddysan345, Res2216firestar, RexNL, Rhobite, Rich Farmbrough, Richard D. LeCour, Ricky81682, Ridge Runner, Riferimento, Rilak, Ripepette, Rj, Roadrunner, Robert K S, Robert Merkel, Robertvan1, Roguecomgeek, Rollie, Rossumcapek, RoyBoy, Rrburke, Ruud Koot, Ryan8bit, SDJ, SJP, SWAdair, Sagaciousuk, Sagark86, Samuelim24, Sarenne, Sat84,



Savannah Kaylee, Scarian, Scarypeep, SciberDoc, Scott14, Scottyj200, Seahorseruler, Seesh cabeesh, Seksek, SeoMac, Sfoskett, Shadow demon, Shadowlynk, Shandris, Shanes, Shawnhath, Shaz929, SheeEttin, Shinji008, Shoeofdeath, Shreshth91, Shubham18, Sigma 7, Sillydragon, Sinohayja, Sj, Skeddles, SkyWalker, Slakr, Slazenger, Slowking Man, Snydale, Sol Blue, Solidsnake204, Some jerk on the Internet, Sourav255, Souvik100, Sparkiegeek, Spike Wilbury, SpuriousQ, Ssd, Steevm, Stephen, Stephen Gilbert, Stevertigo, Stevethepanda, Stickee, Stryn, Suffusion of Yellow, Super IT guy, Synchrite, THEN WHO WAS PHONE?, THF, TVX109, TaintedZebra, Tangotango, Tawker, Tealwisp, Techdawg667, Tehh bakery, The Anome, The Cunctator, The Interior, The Rambling Man, The Thing That Should Not Be, The undertow, TheDoober, TheJosh, TheLurkerMan, Theamazingswamp, Theqwert, Theramcerebral, Thingg, Think outside the box, ThinkBlue, Thisisborin9, Thorpe, Thumperward, Tide rolls, Timmahlicious, Timmy mctimmy, Tintii, Toddst1, TomPhil, Tony Sidaway, Traroth, Trevor MacInnis, Tridian, Triona, Tubby23, Tvdinnerkid, Twilight1188, TwoTwoHello, Typhlosion, Uncle Dick, Uncle G, UncleBubba, Urhixidur, Usb10, Useight, VampWillow, Vanished895703, Vcolin, Velella, Versus22, Vilerage, ViriiK, Vishnava, Vivio Testarossa, Vsmith, Wafulz, Wavelength, Weregerbil, Wernher, Wesley, West.andrew.g, Whkoh, Wiki13, WikiSlasher, Wikiloop, Wikipelli, Will Sandberg, William Avery, Wimt, Wizardist, Wizzy, WookieInHeat, Work permit, Worthlessboy1420, Wowkiddymage, Wtshymanski, Xionbox, Xpclient, YUL89YYZ, Yansa, Yekrats, Yidisheryid, Yogiz, Yunshui, Yurik, Yyy, ZS, ZX81, Zachlipton, ZaferXYZ, Zara1709, Zero sharp, ZeroOne, Zhile, Zhou Yu, Zim Lex, Zita127, Zondor, , 1921 anonymous edits Computer data storage Source: http://en.wikipedia.org/w/index.php?oldid=513772897 Contributors: AThing, Aapo Laitinen, Aaron Schulz, Abtract, Abune, Achraf52, Acroterion, AdjustShift, Ahoerstemeier, Aiken drum, Ajustis, Alansohn, Alaphent, Aldie, Algotime, Alpha Quadrant (alt), Altenmann, Ancheta Wis, Andrew Hampe, Andrewpmk, Anfearard, Anghammarad, Anigma10363, Archer7, Arthena, Arthuralee, Ary29, Atif.t2, Austinmurphy, Aveilleux, B jonas, Bact, Bansipatel, Beetstra, Ben.c.roberts, BenFrantzDale, Beyond silence, BigCow, BillGatos, Bitbut, BlastOButter42, Bluerasberry, Bobblewik, Bobo192, Borgx, Brad101, Brookie, Bsadowski1, CBDunkerson, Callmejosh, Caltas, Calvin 1998, Canadian-Bacon, CanisRufus, Celebere, Ceros, Ch Th Jo, Channabankapur, Chetvorno, Christopher140691, Chuq, Ciphers, ClaireEvans, Clitton01, Cometstyles, Conversion script, Cooltude, Courcelles, CoyneT, Cpritchett42, Cuckooman4, Cyanoa Crylate, Cybercobra, DARTH SIDIOUS 2, DPdH, DVD R W, Damieng, Darth Panda, David Biddulph, David44357, Dcljr, Denniss, Dicklyon, Diego Moya, Discospinster, Dmooney, Dontdoit, DoubleBlue, Doulos Christos, Dycedarg, Eagleal, Earthlyreason, Eastlaw, Ecemaml, Edward, Eeekster, Elsendero, Emperorbma, Enviroboy, Epolk, Eurobas, Evice, Exagridsystems, Excirial, FT2, Fabartus, Face, Fir0002, Fominf, Fox2030, Frap, Freiberg, Fryed-peach, Funandtrvl, Fuzheado, GB fan, GDonato, GEBStgo, Gadfium, Gaff, Gaius Cornelius, Galoubet, Gamsbart, Giftlite, Gingekerr, Giraffedata, Glen, Glenn, Gokulhraj, Graham87, Guanxi, Gwernol, Haggis, Hagrinas, HappyInGeneral, Helix84, HenkeB, Heron, Hovev, Hussainnadeem, Hydrargyrum, Imran, IronGargoyle, Ixfd64, JCLately, JE, Jacek Kendysz, Jacroe, Jeffrey O. 
Gustafson, JesseW, Jhfireboy, Jim1138, Jimothytrotter, Johnuniq, Jon.bruce, JonHarder, Josh Parris, Joshua, Joshua Gyamfi, Joshuaneil, Jpbowen, JuJube, Jusdafax, K.Nevelsteen, KJS77, Kaiyuan, Karl-Henner, Katalaveno, Katari88, Kbdank71, Kelly Martin, Kenny sh, Ketsuekigata, Khendon, Kittsville, Kjanos, Kjkolb, Kmg90, Kneepole, KnowledgeOfSelf, Kozuch, Kpacquer, Krashlandon, Kubanczyk, Kungfuadam, Kvng, LX, Leadwind, Lenehey, Lizarddoctor, Lone boatman, Loren.wilton, Lowercase Sigma, M1ss1ontomars2k4, MER-C, Mac, Macademe, MadGuy7023, Mailer diablo, Majorly, Marcan, Mark Foskey, Materialscientist, Matt Britt, Matusz, Maury Markowitz, Meerkate1990, MegaSloth, Memorysuppliers, Mentifisto, Mgoida, Mh234, Michael Hardy, MichaelWattam, Mikeo, Mild Bill Hiccup, Mindmatrix, Minesweeper, Mjscud, Mmxx, Morken, Motor, Mpgenius, Mrmcompserv, Msp786, Muppet, Music Sorter, Mute Linguist, My76Strat, Myanw, Narutolovehinata5, Nate Silva, NawlinWiki, Neelix, NevilleRaymond, NewEnglandYankee, Nialsh, Nightsailor13, Nixdorf, Nobletripe, Noommos, Noone, Nuno Tavares, Ogat, Oicumayberight, Onorem, OrgasGirl, Orphan Wiki, Ortisa, Oscarthecat, Oxymoron83, Patrick, Pearle, Peruvianllama, Peter Flass, PhilKnight, Philthecow, Piano non troppo, Plugwash, Pne, Pnm, Pokeman, Pol098, Prari, Prashanthns, PrestonH, Professor Magneto, Public Menace, Qwertyas, R. S. Shaw, RTC, Raheel52000, Rama's Arrow, Reaper Eternal, Rebroad, Redmercury82, Reedy, RekishiEJ, Requestion, Rhobite, Rich Farmbrough, RichardF, Rick Sidwell, Ridge Runner, Rigadoun, Rilak, Ronz, RoseParks, RossPatterson, Rror, Rsrikanth05, Rtyq2, RyanCross, Rynsaha, Sam Korn, Santryl, SaturdayNightSpecial, Sav avril, Sbluen, SchreyP, Sciurin, Sfoskett, Shanes, Shawnc, Sheogorath, Sifaka, Signalhead, Slakr, Slightsmile, Smalljim, SoleraTec, Sotdh, SpaceFlight89, Specs112, StaticGull, Stevertigo, Storageman, Surachit, T-bonham, THEN WHO WAS PHONE?, Tannin, Technopilgrim, Thatguyflint, Thaurisil, The Anonymous One, The Thing That Should Not Be, TheBendster, TheSeer, Tiddly Tom, Tide rolls, Tim1357, Timm123, Tizio, Tobias Bergemann, Tom94022, Tomdo08, Tommy2010, Tubular, Twistednightmare, TzaB, Ultramandk, Uncle Scrooge, Urhixidur, VampWillow, Vectro, Versus22, Vespristiano, Victor, VictorianMutant, ViperSnake151, Vipinhari, Waihorace, Wapcaplet, Warut, Wavelength, Wayne Slam, Welsh, Wernher, Wiki alf, WikiBone, Wikisteve316, Wimt, Wknight94, WojPob, Wombatcat, Woohookitty, WorkingBeaver, Wtmitchell, Wtshymanski, Xqt, Yelyos, Yoosq, Zelikazi, ZeroUm, Zhen Lin, ZimZalaBim, Zoltar0, ZooFari, , , , 748 anonymous edits Disk storage Source: http://en.wikipedia.org/w/index.php?oldid=501416680 Contributors: Aapo Laitinen, Ade1982, Administration, Ahoerstemeier, Al Lemos, Alan012, Alfredo ougaowen, Andre Engels, ArnoldReinhold, Asfarer, Austinmurphy, AxelBoldt, BMF81, Becksguy, Biezl, Bobblewik, Burschik, Carbenium, Chaosdruid, ClementSeveillac, Cmglee, Compie, Compilation finished successfully, Conversion script, DBigXray, DPdH, Damieng, Daniel C, Dannykean, David R. 
Ingham, Deineka, Dmsar, Dr.queso, Drj, Drumex, Duncharris, DynamoDegsy, Edgepedia, EnTheMohammad, Evil saltine, Francs2000, Frap, Fudoreaper, Giftlite, GoldWingBob, Guy Harris, HaeB, Heron, Hughcharlesparker, Hvn0413, Ibmstorage, Infrogmation, Joeinwap, Julesd, KC109, Kakofonous, Kbdank71, Kbrose, Klemen Kocjancic, Koavf, Kozuch, Kubanczyk, LCP, Lenehey, Loggie, Mamyles, Mark91, Mennis, Mirror Vax, Moreschi, Mr Stephen, Muro de Aguas, Music Sorter, ND, Nations114, Npramod, Omegatron, Optakeover, Oxymoron83, Paul Mackay, Pearce jj, Playclever, Pnm, Public Menace, Punk4lifes, Pyrrhus16, Quintote, Reedy, Rich Farmbrough, Rwwww, SEIBasaurus, SaturnCat, Sean hemingway process, Sfoskett, Shanawazqureshi, Skagedal, Snagari, SoleraTec, Stepho-wrs, Svick, The son of anubis, Thruston, Tim Starling, Tom94022, Tomdinan, Tsca, VampWillow, Vanished user 5zariu3jisj0j4irj, Voidxor, Vsmith, WaitingForConnection, Wbm1058, Wernher, William Pietri, Williamv1138, Zhangzhe0101, 105 anonymous edits Hard disk drive Source: http://en.wikipedia.org/w/index.php?oldid=513758497 Contributors: -Majestic-, .:Ajvol:., 12dstring, 1836311903, 1exec1, 21655, 2620:0:1040:204:DD6:CD03:C970:C8FE, 2BRN2B, 711groove, 7segment, A D Monroe III, A. di M., A876, ABF, ADFX-03, AVRS, AaronPaige, Abdul raja, Abune, Acuares, Adam1213, Addihockey10, Addshore, Aeon1006, Aeons, Ahoerstemeier, Airplaneman, Aitias, Aiyizo, Ajh16, Akihabara, Akira625, Aksi great, Alansohn, Alasdair, Aldie, AlefZet, AlexSedov, AlexTiefling, Alexdi, Alexduric, Alfio, Alhead, Ali seed, Ali@gwc.org.uk, Alinor, AlistairMcMillan, Alpha Beta Epsilon, Alphax, Aluchko, Amniarix, AnAj, Anastrophe, Anaxial, Andrew sh, Andrewpmk, Andyscott12, Anir1uph, Anna Lincoln, Anomalocaris, AnonMoos, Antandrus, Aottley, Aqua008, Arichnad, Ariel., Ark, Ark25, Armistej, ArnoldReinhold, Arthena, Arthur Rubin, Artofcolors, AshtonBRSC, Asim18, Aslihanbilgekurt, Astralblue, Asuastrophysics, Asymmetric, Atcony, Atlant, Aude, Austinmurphy, Avinash rajput5, Avoided, AxelBoldt, Axonxorz, BBnet3000, BUF4Life, Baboo, Banej, Barefootguru, Barkeep, BarretB, Bart80, Becksguy, Beetstra, Belovedeagle, Ben pcc, BenFrantzDale, BenStorer, Bender235, Benstown, Bentogoa, Bevo, Beyondblue, Bgwwlm, Bhny, Bibliomaniac15, Bidabadi, Big Brother 1984, Bigjimr, Bility, Billastro, BirgitteSB, Bishonen, Bjf, Bjquinn, Bkell, Blackrose10122, Blainster, Blanchardb, BlankAxolotl, Blankfaze, Blastmeister, BlazingSpeed, Blazorthon, Blobglob, Bloodshedder, Blorg, Bob A, Bob lansdorp, Bob5972, Bobblewik, Bobet, Bobo192, Bonadea, Bonejaw, Bongwarrior, Bookofjude, Boone23howard, Bradenlockard, Brandon1717, BrianWiese, BrokenSegue, Brouhaha, BruceSchardt, Bubba hotep, Bubba73, Buddha24, Bunnyhop11, Bz2, CRobey, Calcprogrammer1, Caltrop, Calum Macisdean, Can't sleep, clown will eat me, CapitalR, Captrob1, CardinalDan, Cartque, Casmith 789, Caspertheghost, Catapult2, Cattrap, Cave Story Fan, Cbf1989, Centracated, Centrx, Ceros, CesarB, Ceyockey, ChangChienFu, ChaosWrath, Charles Kozierok, Chaser, Chatul, Chazwazza, Chealer, Ched, Cherkash, Chiefcoolbreeze, Chikikabra09, Chowbok, Chris the speller, Chris01720, ChrisGualtieri, ChrisfromHouston, Chrisjohnrowe, Chrislk02, Christian List, Chriswilkins2, Cjmnyc, Clay Juicer, Cleared as filed, Clements, Cmglee, CoJaBo, Coastergeekperson04, Coaxial, CobraBK, Coccun, Combuchan, Cometstyles, Compotatoj, Computerwoman417, Coneslayer, Confession0791, Conversion script, CoolFox, Corpx, Cory.gephart, Cprompt, Crazedcougar, Creative210, Crispmuncher, Cverrieruk, Cy21, Cybercobra, 
Cylonhunter, Cyrius, DARTH SIDIOUS 2, DBigXray, DHR, DMahalko, DNicastro, DPdH, DVD R W, DVdm, Dadu, Daedalus969, Daivox, Dakilangbobo, DanBealeCocks, Danaman5, Danceswithzerglings, Dancraggs, Danim, Danisshocked, Danorux, Dans1120, Darkmystic3021, Darkone, Darth Panda, Darxus, Datpcstore, Dave6, Daveydweeb, David Gerard, David H Braun (1964), David Schaich, David.Monniaux, DavidCary, Davidfstr, Davron, Dcljr, Dcoetzee, DeadEyeArrow, DeadlyAssassin, Deang619, Deele, Deglr6328, Deicool, Delldot, Demus Wiesbaden, DenisePro, Denniss, Deor, DerHexer, Desplow, Dgw, Dhatfield, Diaa abdelmoneim, Diceman, Dicklyon, Diego Moya, Dingbats, Djalako, Dmarquard, Dmccreary, Dmsar, DokReggar, Doktor Who, Don4of4, Donreed, DopefishJustin, Doramjan, Download, DragonKingMasterxxx, DragonLord, Draries4lyfe88, Drbreznjev, Drevicko, Drhex, Drkblz, Droll, DropDeadGorgias, Dtgriscom, Dual Freq, Duckbill, Dudecakes, Dwlegg, Dugosz, E23, EAi, ENeville, ESkog, EagleOne, Echoray, Ed g2s, EdX20, Edgar181, Edgr, Editor182, Edmundgreen, Edward, Egghead06, Egli, Ehov, EivindJ, El Pantera, El3ctr0nika, ElKevbo, Electron9, Elizium23, Elockard, Elockid, Elsendero, Elvey, Ember of Light, Emspin, Endymi0n, EnriqueVillar, Epbr123, Epeefleche, Eptin, Erik9, Escape Orbit, Everyking, Evil saltine, Excirial, Eyreland, FF2010, FT2, Fabartus, Falcon866, Faradayplank, Fatespeaks, Favonian, Feydey, Figureskatingfan, Fireiscool, Firetrap9254, Firsfron, Fishhead1, Flyguy649, Fnagaton, Foobar, Formative, FrYGuY, Fraggle81, Frankie, Frap, Freakmighty, Freewol, Frencheigh, Fsiler, Fubar Obfusco, Funke, Funnyfarmofdoom, Fuscob, Fuzheado, GB fan, GCFreak2, GFHandel, GKantaris, Gaius Cornelius, Gajuambi, Galoubet, Gangster99145, Gardar Rurak, Gatemansgc, Gazpacho, Geertjanvdk, GeneralKickass, GeorgeStepanek, Gerrit, Ghost det, Giftlite, Gilliam, Gimmetrow, Gioto, Gizmoguy100, Gjd001, Glane23, Glen, Glenn Brack, Glider87, Gmw, Gogo Dodo, GoingBatty, Googleaseerch, Gracefool, GraemeL, Graham87, Grahamdubya, Granum, Graue, Gravitan, Greatcaffeine, Greco8523, Greg L, GregorB, Grimey, Groyolo, Gsarwa, Gscshoyru, Gugustiuci, GuillaumeBeaudoin, Guitarmikel33, Gurch, Gurchzilla, Gurt Posh, Gus, Guy Harris, HJ Mitchell, Hadal, Hairchrm, HalJor, HamburgerRadio, Hamedruidd, Hammydaman, HandyAndy, Harddrives, Harryboyles, Hayabusa future, Hbent, Hdt83, HeikoEvermann, HenkeB, Hephaestos, Herakleitoszefesu, Herenthere, HeroicJay, Heron, Hertzsprung, Hervegirod, HexaChord, Hfastedge, Hiberniantears, Hillshum, Hinata, Hirzel, Hm2k, Hobartimus, Howcomeme, Hripsime Ashikyan, Htmlland, Hu, Husond, Hut 8.5, HybridBoy, I.M.S., IamWIDICOME, Icairns, IceBlade710, Igni, Ilya, Insanefreddy926, Intgr, Iohannes Animosus, Irdepesca572, Irene7999, Iridescent, Irishguy, Irodori, Island, Islander, Issaaccbb, Itu, Ixfd64, J.delanoy, J04n, JCRules, JFreeman, JHMM13, JLaTondre, JPX7, JaGa, Jack Merridew, Jackfork, Jacob Poon, Jamessully, Janke, Jarble, Jason Stormchild, JasonJack, JasonTWL, Javawizard, Jazzee8845, Jbo5112, Jcbx, Jcheckler, Jcsquardo, JeLuF, Jebba, JediMasterOracle, Jeff G., Jeffq, Jeh, Jengelh, Jerryseinfeld, Jesse Viviano, Jesster79, Jidanni, Jim.henderson, JinJian, Jmundo, Jndrline, Jnivekk, Johan1298, Johannordholm, John, John Nevard, JohnCD, Johncass, Johnnynolan2, Johnuniq, Jolomo, Jon.bruce, Jon513, JonHarder, Jonspavin, JordoCo, Josephwoh, Josh1995, JoshHolloway, Jouster, Joy, Jpk, Jrockley, Jrvsales, Js8uk2002, Julesd, Julian Morrison, Julz888, Jusdafax, Jushi, Justin Elliott, K3rb, KD5TVI, KN8BLADE28, Ka-Ping Yee, Kablammo, Kaczor, Kaini, Kangy, 
Kansan, Karam.Anthony.K, Kbdank71, Ken6en, Kenny sh, Kenyon, Kevin, Khalid Mahmood, Khoikhoi, Kidharb, Kikumbob, Kingtherion, Kipala, Kirk j12002, Klilidiplomus, Kmccoy, Kmweber, Knowledge Seeker, Koavf, Korora, Kostmo, Kozuch, Krawi, Krich, Krischan111, Kubanczyk, Kukini, Kuru, La goutte de pluie, Labemann, Landroo, Last Avenue, Lazychris2000, Lbs6380, LeaW, LeaveSleaves, Lee Carre, Lee Cremeans, Legalize, Legend, Legend78, LeoNomis, Leszek Jaczuk, Letdorf, Leuliett, LifeHax000, Lifung, Lightmouse, LilHelpa, Linas, Linkspamremover, Linuxmatt, LionKimbro, Little Mountain 5, Lklundin, Llewelyn MT, LocalH, Logan, Lord Snoeckx, Lordmark1, Loser 523, Lostinlodos, Lotje, LovesMacs, Lovetinkle, Lucas Brown 42, Lucid, Luder, Lugia2453, Luigiacruz, Lukaszpal, Lukeway99, Lumos3, Luna Santin, Lupo, MCBastos, MPerel, Mac, Macaddct1984, Magician718, Makelelecba, Makotti, Maksim L., Malke 2010, Malo, Mamyles, Manman008, Manway, Marc Mongenet, MarceloGarza, Marik7772003, Mark Rosenthal, Markus Kuhn, Martarius, Mashi121994, MasterFerretfu2, Materialscientist, Matesy, Matt Britt, Matt21811, Mattmill30, MauriceJFox3, Maury Markowitz, MaxSem, Maximr, Mayur, Mboverload, McMozart, McSly, Meegs, MegamanX64, Meidnjsksh fhu, Mentifisto, Mephistophelian, Mfield, Mfwitten, Mgrappone, Mhnin0, Michael Hardy, Michael R Bax, MightyJordan, Mike Dill, Mikebar, Mindgames11, Mindmatrix, Minipie8, Mirror Vax, MisterSheik, Mittosi, Mmxx, Mofomojo, Mooie, Moreati, Mormegil, Mortense, Moxfyre, Mpgenius, Mpradeep, Mr pand, MrOllie,




Mrmcgibby, Msgohan, Mulad, Muro de Aguas, Music Sorter, Mycatcharlie, Mysid, N5iln, Namibnat, NawlinWiki, Nbarth, Ncrfgs, Ned Scott, NeoVampTrunks, Netjeff, NewEnglandYankee, Nick in syd, NickBush24, Nicklink483, Nikevich, Nintendude, Nixdorf, Nk, Nneonneo, Noctibus, Noisy, Nopetro, Norm, NorthCoastReader, Nsaa, Ntsimp, Nuujinn, Nv8200p, O^O, Oak Aged, Oberonfitch, Oda Mari, Of, Ohconfucius, Oli Filth, Omegatron, Omgitsasecret, Omicronpersei8, Orange Suede Sofa, Orina, Oscarthecat, Ospalh, Otsego.wupiwupi, Outsid3r, Oxymoron83, PJ, PL290, Paine, Palica, Pandacomics, Para, Paragon12321, Pashunka, PatMurphy, PatrickDunfordNZ, PatrickNiedzielski, Paul Breeuwsma, Paul Mackay, Paul R. Potts, Paul.w.bennett, Paul1337, Paulhurst, Pauli133, PcGnome, Pcb21, Pdslds1, Pearce jj, Pedro, Pegasus1138, Pelago, Peruvianllama, Peter Karlsen, Peterblaise, Pewwer42, Pgan002, Phantomsteve, PhilKnight, Phip, Piano non troppo, Pinethicket, Pingveno, Pipatron, Planetary Chaos Redux, Pluma, Pne, Poccil, Poiuyt Man, Pol098, Pomegranate, Poon4hire, Possum, Poweroid, Prcr, Predator106, Prolog, PseudoSudo, Public Menace, Qirex, Quadramble, Quietust, Quiggers1P, Quilker, QuillOmega0, Quintote, Qxz, R'n'B, Rainbow will, Ramgay, Ramu50, Ran, Ran4, Random contributor, Ranieri66, RaptorHunter, Ravedave, Ravenperch, Raywil, Raza514, Rbebout313, Rchandra, Rd232, Reconsider the static, Redgrittybrick, Reedy, Rees11, Rentar, Res2216firestar, Revan103, RexNL, Reyk, Rezd, Rfanshaw, Rgoodermote, Rich Farmbrough, Richard Hallas, Rickyp, Ridge Runner, Riflemann, Rileyrust, Ringbang, Rjstrock, Rjwilmsi, Rkarlsba, Rl, Rmallins, Robertvan1, Rofthorax, Roleplayer, Romioman2006, Ronk01, Rory096, Rrader, Rrburke, Rror, Rtcpenguin, Rub3X, Ruud Koot, Ruwolf, Ryanrs, Ryansccs, S in mtl, SBareSSomErMig, SF007, SJP, SMA11784, SWAdair, Saisdur, Sam Lowry, Sannse, Sanspeur, SarahEmm, SchmuckyTheCat, Schzmo, Screamingman14, Sean hemingway process, SeanMack, Seans Potato Business, Seaphoto, Seeker83, Seishirou Sakurazuka, Selkem, Semen, Seraphex, Seroyen, Servolock, Seven of Nine, Sfoskett, Shadowjams, Shantavira, Sharkface217, ShaunOfTheLive, Shawn in Montreal, Shawnc, Shayne0511, SheffieldSteel, Shell Kinney, Shervinafshar, Shirulashem, Shoemortgage, Shoeofdeath, Shoessss, Shreevatsa, Shyland, Silent cheeseburger, Silver hr, Simon Lieschke, Simpsnut14, Simpsons contributor, Sinni800, Sjakkalle, Skarebo, Sketchmoose, Skizzik, SkyWalker, Slakr, Slant6guy, Slayte1, Slp1, Smalljim, Smallman12q, Smelly9999, Smileyface11945, SoWhy, SolKarma, SpNeo, SpaceFlight89, SparhawkWiki, Speddie2, Speedevil, Spitfire19, Spommo, Stangbat, StealthFox, Steel, Stephan Leeds, Stephenb, Sting-fr, Stormie, Strait, StuartBrady, Suicidalhamster, Suit, Supertask, Surachit, Suruena, Sven Manguard, Swpb, Swtpc6800, Syndicate, Syu066, T0ny, T23c, THEN WHO WAS PHONE?, TS133, Tablizer, Tabor, TacoTako, Takanoha, Tallblondie, Tallkmac, Tandral, Tannin, Tariqabjotu, Tattylicious, Tavilis, Taw, TaylorBelden, Tbhotch, TechPurism, Techdawg667, Techjost, TechnoFaye, Technolust, Tedzdog, Tehdang, Teratornis, Ternto333, TexasAndroid, Thadius856, Thatguyflint, Thatotherdude, The Anome, The Goat, The Prometheist, The Rambling Man, The Thing That Should Not Be, TheArmadillo, TheBeck, TheDJ, TheHYPO, TheJosh, TheMadBaron, TheStarman, TheThomas, TheWickerMan, Thebdj, Theo Pardilla, Theo10011, Theproc64, Therealdp, Theshibboleth, Theymos, Thingg, Thorne, Thue, Thumperward, Thunderbird2, Thunderboltz, Tide rolls, Timanderso, Timbatron, Timeastor, Tins128, Tirerim, Titoxd, Tobias Bergemann, 
Tom94022, Tombomp, Tony1, TonyTheTiger, Tooki, Totebo, Towel401, TransAccu, Tree Kittens, TreeSmiler, Trevor MacInnis, Trevor Marron, Trevorloflin, Trojancowboy, Truong Thanh Chung, Trusilver, Truthanado, Tslocum, Tubbablub, Ubcule, Ugureren, Unspoken feeling, Uriyan, Useight, Utcursch, Uzume, VMS Mosaic, VampWillow, Van helsing, Vanderdecken, Vaubin, Veinor, Veritaserum, Versus22, Vincentsarego, Vipinhari, Viridae, Vivekbhusal, Vksvns, Volk282, Vsmith, Walk&check, Wapcaplet, Wavelength, Wendell, Wernher, Weyes, WhisperToMe, Whitepaw, Wichtounet, Wik, WikHead, Wiki alf, Wikiborg, Wikieditor06, Wikipediarules2221, Wikipelli, Wikispaceman, Willking1979, Willsmith, Winhunter, Wisq, Wknight94, Wolfkeeper, Woodstone, Wosch21149, Wtmitchell, Wtshymanski, X Wad, X!, Xenure, Xeolyte, Xinpig, Xooon, Xpclient, Xx521xx, Xy, Y2kcrazyjoker4, Yankeefan238, Yorkhung, YuriSanCa, Yurik, Yyy, Zayani, ZeN, Zedlik, Zeerak88, ZephyrAnycon, Zephyris, ZeroOne, Zerodamage, Zetawoof, Zodon, Zoicon5, Zorbatron, Zowie, Zven, Zzptichka, Zzuuzz, le flottante, , , , , 2426 anonymous edits Disk-drive performance characteristics Source: http://en.wikipedia.org/w/index.php?oldid=505458920 Contributors: BD2412, Charles Kozierok, Chris the speller, Gaius Cornelius, Indulis.b, Intgr, Jitendranp, Music Sorter, Sibsen, Tony1, Voidxor, Wilhelmina Will, 5 anonymous edits RAID Source: http://en.wikipedia.org/w/index.php?oldid=512509489 Contributors: -Majestic-, 00110001, 10e6mph, 24fan24, 2600:3C03:0:0:F03C:91FF:FE93:2A8A, 3R1C, 777sms, A.Kurtz, AMK152, AThing, Aapo Laitinen, Abune, Acsteitz, Addshore, Adot, Adrian Sampson, AgentSmith15, Aglyad, Ahoerstemeier, Ahvetm, Aiken drum, Akjar13, Al tally, Alansohn, Alanyoder, AlbertBickford, Alerante, Algimantas Kazlauskas, Ali@gwc.org.uk, Alkarex, Alphachimp, Alphaman, Aluvus, Amicon, Anastrophe, Anaxamandra, Andika, Andrew.in.snow, Andrewhayes, Andykoo1990, Ankit Maity, Anon lynx, Ansible, Antilived, Antimatt, Apokrif, Argav, Armando, Arny, Artagnon, Asa Zernik, Asasjdgavjhg, Ashley Pomeroy, Atlant, Atmchicago, Audriusa, Avenged Eightfold, Avernar, AySz88, Az1568, BBCWatcher, Barek, Basil.bourque, Baylink, Baz whyte, Beetstra, Benjsc, Beno1000, Beowulf7120, Beve, Bewildebeast, Big-guy21, Bilbo1507, Bkell, Blahma, BlankAxolotl, Blappen, Blaxthos, Blethering Scot, Blorg, BlueNovember, Bobbis, Bomazi, Bonehed, Bongwarrior, Borgx, Borowon, Bovineone, Brian0918, Brianski, Brinkost, Bubba73, Bubeck, Bucephalus, Bulbous, Bull Winkus, Burke Libbey, C.ciprian, CRGreathouse, CWenger, Cacycle, Calabraxthis, Caltas, Cambrasa, CambridgeBayWeather, Cameron168, Cameronhmatthews, Cander0000, Capricorn42, Carpetsmoker, Cate0012, Cburnett, Cedstyle, Celardore, CesarB, Cfaerber, ChangChienFu, Charles Gaudette, CharlotteWebb, Chipuni, Chowbok, Chris Murphy, ChrisCork, ChrisJMoor, Chrisbolt, Chrishota, Christina Silverman, Ciphers, CityOfSilver, Clarkm5, Claunia, Clayhalliwell, Cleared as filed, Clippard, Closedmouth, Cmerepaul, Coalpatch, Compellingelegance, Concentric2, Conversion script, Cool nirav, Copyeditor42, CozoH, CrazyTerabyte, Crisco 1492, Cunchem, Curtlee2002, Cyclonius, Cynix, D6, DARTH SIDIOUS 2, DMahalko, DZNRKkCV, DaBler, DabMachine, Damian Yerrick, Dan100, DanD, DanDanRevolution, Dandv, Danielgrad, Dante Alighieri, Dar-Ape, Darkonc, Darksidex, Darlock, Darth Panda, DasRakel, Davecason, David Latapie, DavidInCambridge, Davidmaxwaterman, Davidprior, Dcoetzee, DeathByDood, Deineka, Delirium, DerHexer, DevastatorIIC, Dfarrell07, Dgw, Diaconic, Dirkbb, Dismas, Dispenser, Dittaeva, Dkevanko, 
DocWatson42, Dogcow, Donpayette, Doodle77, DoubleBlue, Dr. Zed, DrFenerbache, Drake Wilson, Drmies, Duplicity, Durval, Dugosz, E090, E23, E8, Edcolins, Edderso, Edward, Edward Bebop, Edward Z. Yang, EdwinOlson, Eeekster, Einemnet, Elizium23, Elsendero, EncMstr, Eneville, Enviroboy, Epbr123, Eperotao, Eptalon, Eriberto mota, Eric Kvaalen, ErickaAtDell, Eros villas, Espoo, Evice, EvilDildo, Excirial, Extropian314, Fadookie, Favonian, FeRD NYC, Feline Hymnic, Ferengi, FerryGlider, Fieldday-sunday, Figsyrup, Film8ker, Flaming, Flewis, Flexer, Floydpink, FluffleS, Foobaz, Fookadookadoo, Foolip, Fraggle81, FrancoGG, Frankie0607, Franl, Frap, Fredrik, Furrykef, Gadfium, Gaius Cornelius, Gal Buki, Gamma, Gardar Rurak, Gardrek, Garrettw87, Gary King, George Makepeace, Gh5046, Ghen, Gil Dawson, Gilliam, Ginsengbomb, Gligoran, Gmaxwell, Gnomeza, Gogo Dodo, GoingBatty, Gojomo, GoldWingBob, Gr33ndata, GrAfFiT, Gracefool, Gracehoper, Grafen, Greco8523, Grendelkhan, Grim4593, Groove66, Guerby, GunnarWolf, Gwkronenberger, H.M.S Me, Haakon, Hagrinas, Hairy Dude, Halbuquerque, Hamedmrz, Hamidshahram, Hammersbach, Hankwang, HappyGod, Happyface 0, Harald Hansen, Harryboyles, Hathawayc, HelixWolf1, Henryhartley, Hitman012, Hlangeveld, Hoenny, Hooger158, Hpa, Hu12, Huku-chan, Husky, Hutcher, Huwr, Hydrogen Iodide, Hydrox, Hypert, I already forgot, ITyGuy32, IanMSpencer, IanManka, Icewolf34, Idleloop, Ilikeliljon, Iluvcapra, Imnotminkus, Imrehg, Imroy, Indie Film, IndyLawSteve, Infofarmer, Inim repus, Init, J.delanoy, JCRansom, JWHPryor, Jaja714, Jamesday, Jauerback, Jaykd78, Jcgoble3, Jclauzet, Jclemens, Jddphd, JeLuF, Jeffq, Jengelh, Jeremy68, Jerry, Jesse Viviano, Jessemigdal, Jetblack101, Jhfrontz, Jmmarton, Jobin RV, JoeCool59, Joeinwap, Joel D. Reid, JogyB, John, JonHarder, JonRosenSystems, Jonabbey, Jonathan de Boyne Pollard, Jonathanbrickman0000, Joshuapaquin, Joy, Jpo, Jsnx, Jtbirdsell, Jtle515, JzG, KD5TVI, Kace7, Kakaopor, Karateelf, Kareeser, Kasw, Kbdank71, Kbrose, Kcp100, Kendall-K1, Kennett, Kenyon, Kernel.package, KerryVeenstra, Ketil, Kevinz000, Khaufle, Khazar, Kickin' Da Speaker, Kimiko, KirbyMeister, KirbyRandolf, Kirkgaard, Kitdaddio, Klaus100, Klosterdev, Kocio, Krilnon, Krisjohn, Kshitij, Ktheory, Kubanczyk, Kumar303, Kuru, Kuzetsa, Kyng, LEdgeley, LFaraone, Lambyuk, Last Avenue, Lbs6380, Leandrod, Lee Pavelich, Legend78, Legios, LeonardoGregianin, Leszek Jaczuk, LilHelpa, LindsayH, Lionel Elie Mamane, Luckas Blade, Luigi Panache, Luk, Lukeway99, Lumaga, Lyonadmiral, MC10, MER-C, MGA73, MLRoach, MacMan4891, Madhero88, Majkl82, Majora4, Majorly, MaratL, Marco.difresco, MarienZ, Marokwitz, MartinCracauer, Marudubshinki, Mashahood, MasterOfThePuppets, Mator, Matrix61312, Matt B., Matt Crypto, Matt0401, Max Duchess, MaxEnt, Maywither, Mboverload, Mcewball13, Mckaysalisbury, Mclean007, Mdf, Menegazzo, MercuryFree, Metuchen, Mfwitten, Mhking, MiG, Michael Frind, Michael Hardy, Michealt, Mike Payne, Mikeg, Mild Bill Hiccup, Mindspillage, Mkcmkc, Monaarora84, Moneya, MonkeyKingBar, MorganHaggis, Morn, Mortein, Mortense, MrOllie, MrPrada, MrRedwood, Mrbill, Msinkm, Msnicki, Msrkiran, Mufka, Muhandes, Murphb1220, Mustang sh, Mwanning, Naddy, Naive cynic, Nasukaren, Nbarth, Neo-Vortex, Netlad, Nexus501, Nicblais, Nicd, Nick Wallis, Nil Einne, Ninjagecko, Nivix, Nizaros, Nmoas, Nsaa, Nuno Tavares, Nuwewsco, Ohnjaynb, Ohnoitsjamie, Oicumayberight, Oldag07, Omegatron, OptiSonic, Optigan13, Orangwiki, Orayzio, Orphan Wiki, Ospalh, Ottomachin, Oxda, Oxymoron83, Pablo Mayrgundter, Paradoxbomb, Parseljc, 
Patrick Lucas, Paul A, Paul.raymond.brenner, Paulbayer, Paulchu, Pauli133, Paulmmn, Pavel Vozenilek, Pcslc, Pdelong, Pdinhofer, Peace keeper, Perle, Peter Grey, Peter Hitchmough, Peterl, Peyre, Phatom87, PhennPhawcks, PhilHibbs, PhilipMW, Piano non troppo, PigTail, Pinethicket, Pipatron, Piper8, Playclever, Plugwash, Pnm, Pol098, Poor Yorick, Poromenos, Porqin, Poweroid, PseudoSudo, PuercoPop, Purgatory Fubar, Qbert203, Quantumobserver, R!SC, R'n'B, RC Master, RS Ren, RYNORT, Rama, Ramesh, Ramnath R Iyer, RandallDThomas, Random name, Randomtime, Raryel, RattusMaximus, Raymondwinn, Rbulling, Rchandra, Reach Out to the Truth, ReallyNiceGuy, Reaper Eternal, RedTomato, Reedy, Requestion, Rich Farmbrough, Richmeister, RickManion, Ricky81682, Ricom1, Rjstott, Rjwilmsi, Rkarlsba, Roadstaa, Robert Brockway, RobertG, Roeme, Rogerborg, Roguelazer, Ronambiar, Ronark, Ronz, Roregan, Rorx, Rowan Moore, Royallywasted, Rror, Ruben Tavernier, Rune.welsh, Russvdw, Ruud Koot, Ryan8374, Ryanrs, SCRECROW, Salvador Barley, Sam Hocevar, Samw, Sandeep4tech, Sandstein, SandyFace, Sanoj24, Sc4074100, Schneelocke, Schwabac, Scopecreep, Scott Paeth, Seans Potato Business, Seanwong, Sesse, SeventyThree, Sfbaldbear, Sfoskett, Shadowjams, Shaggyjacobs, ShaneMacaulay, Shankara.3000, Shashwat.goel, Shawnc, Shishire Maiga, Shivamp, Sietse Snel, SilkTork, Sillygates, SimonEast, SimonP, Skranga, Slaniel, Slightsmile, Slmhy, Slowking Man, Small business, Smurf-IV, Snigbrook, Snodnipper, Snowolf, Sommerfeld, Soumyasch, SpaceFlight89, Spainhower, Sparrow99, Specs112, SpiderJon, Spiff, Spl, Spleeman, SpuriousQ, Sspecter, Stephan Leeds, Stevestrange, Stickee, Stifle, Stolsvik, Stormie, Straif, Stratos42, Stuz, Sudheerdpain, SunDragon34, Sundae, Suniljoseph, Superm401, Surturz, Suruena, Svamja, Svick, Sysy, THEN WHO WAS PHONE?, TParis, Ta bu shi da yu, Tabletop, Tallasse, Tamlyn, Tangotango, Tbhotch, TechPurism, Tedp, Teeks99, Tegel, Teiresias, Telanis, Tellyaddict, Tempel, TenthPres, Tgabor, ThE cRaCkEr, That Guy, From That Show!, The Epopt, The Thing That Should Not Be, The7thone1188, TheDeathCard, TheDirge, TheDoober, Theangrycrab, Thief12, Thingg, Thomas Bjrkan, ThreeCB768, Thumperward, Tide rolls, Tim Holahan, TimSmall, Timrollpickering, Timwi, Tmassey, Tmopkisn, Tnxman307, Tobias Bergemann, Todd Vierling, Tom Edwards, Tommy2010, Tooki, Torpedo8, Tpbradbury, Treisijs, Trevie, Trevor MacInnis, Tskandier, Tsman, Tumthe3, Turian, Tverbeek, Twight, Twimoki, Twinxor, Two25, Tychol, Uldoon, UncleBubba, Unyoyega, Upholder, Urul, Utcursch, Uzume, VAcharon, Vandemark, Velella, Versus, Vicarage, VictorianMutant, Vilox, Viper5030, Vnv lain, VodkaJazz, VoidLurker, Vwestin, W.F.Galway, Wafulz, Walter, Warfreak, Warll, WatchAndObserve, Wavelength, Wavemaster447, Wayne Slam, Wbm1058, Wernher, Wfischer, WikHead, Wikiborg, Wikidrone, Wikipelli, Wikipollock, Wile E. 
Heresiarch, Will Beback Auto, Winded13, Wisq, Wk muriithi, Wknight94, Wolfkeeper, Woogee, Wrightwood906, XLerate, Xeoe, Xzilla, Yagood, Yamamoto Ichiro, Yaris678, Yellowstone6, Ylee, Yonkie, Zakhmad, Zhou Yu, Zippy, ZooFari, Zvika, 2405 anonymous edits Operating system Source: http://en.wikipedia.org/w/index.php?oldid=514016634 Contributors: 10metreh, 12.245.75.xxx, 1297, 130.64.31.xxx, 149AFK, 151.30.199.xxx, 1exec1, 1yesfan, 216.150.138.xxx, 28421u2232nfenfcenc, 2D, 2nth0nyj, 62.253.64.xxx, 789455dot38, 9258fahsflkh917fas, 9marksparks9, =Josh.Harris, A876, APH, AR bd, AVRS, AVand, Aaron north, Aaronstj, Abhik0904, Ablewisuk, Ablonus, Acerperi, Achowat, Adair2324, Adams kevin, Addshore, Adityachodya, AdjustShift, Adriaan, Ae-a, Afed, After Midnight, Agentlame, Ahoerstemeier, Ahunt, Ahy1, Aim Here, Aitias, Aladdin Sane, Alanbrowne, Alansohn, Alasdair, Ale jrb, AlefZet, Alegoo92, Alenaross07, AlexDitto, Alexei-ALXM, Alexf, Alexius08, Alexswilliams, Alextyhy, Alexwcovington, Alisha.4m, AlistairMcMillan, Alksentrs, Alll, Alsandro, Altay437, Alten, Althepal, Am088, Amicon, Amillar, Amphlett7, Anaxial, Andre Engels, Andrew Maiman, Andrewpmk, Android Mouse, Andy pyro, Andy16666, Andyzweb, Ang3lboy2001, Anna Lincoln, AnnaFrance, AnonMoos, Anouymous, Ansumang, Antandrus, Antonielly, Antonio Lopez, Applechair, Arakunem, Aranea Mortem, Arch dude, Archanamiya, Archer3, Ark, Arman Cagle, Aruton, Ashikpa, Ashish Gaikwad, Ashleypurdy, Astatine-210, Astral, Atlant, Atomician,

Attitude2000, Avenged Eightfold, Awaterl, Ayla, BMF81, Bachinchi, Bact, Badhaker, Badriram, Baron1984, Baronnet, Bastique, Bbbl67, Bbuss, Beland, Ben Webber, BenAveling, Bencherlite, Benneman, Beno1000, Betacommand, Bevo, Bhu z Crecelu, BiT, Bidgee, Big Brother 1984, BigDunc, Bigdumbdinosaur, Bijesh nair, BjrnBergman, Blainster, Bleh999, Blu Aardvark III, Blue520, Bluemask, Bobo192, Boing! said Zebedee, Bonadea, Bongwarrior, Bookinvestor, Bornslippy, Branddobbe, Brianga, Brion VIBBER, Brolin Empey, Brownga, Bsadowski1, Btate, Bubba hotep, Buonoj, Burkeaj, Bwildasi, Cactus.man, Caknuck, Calabe1992, Callmejosh, Calltech, CalumH93, Camilo Sanchez, Caminoix, Can You Prove That You're Human, Can't sleep, clown will eat me, CanadianLinuxUser, Canageek, CanisRufus, Canterbury Tail, Capricorn42, Captain Goggles, Captain-n00dle, CarbonUnit, CarbonX, CardinalDan, Carlosvigopaz, Cartread, Casull, Cdills, Celebere, CesarB, Cfallin, Chairman S., Chaitanya.lala, Chamal N, ChaoticHeavens, Charles Nguyen, Charles dye, CharlotteWebb, Chase@osdev.org, Chatul, Chikoosahu, Chris1219, Chrisch, Chrislk02, Christian List, Christian75, Ck lostsword, Cleduc, Clindhartsen, Cllnk, Closedmouth, Clsin, Cncxbox, Cobi, Coffee, CommonsDelinker, Comperr, Conan, Conti, Conversion script, Cookdn, Coolcaesar, CoolingGibbon, Corpx, Courcelles, Cp111, Cpiral, Cps274203, Cpuwhiz11, Crazycomputers, Create g77, Creativename, Credema, Creidieki, Cul22dude, Cuvtixo, Cybercobra, Cybiko123, CyborgTosser, D, D6, DARTH SIDIOUS 2, DBishop1984, DJ Craig, DStoykov, DVdm, Daesotho, Dainomite, Damian Yerrick, Dan100, DanDoughty, Daniel C, Danieltobey, Dantheman88, Darkwind, Darth Panda, Dasnov, Daverocks, David Santos, DavidCary, DavidHalko, Davidam, Davidm617617, Dawnseeker2000, DeDroa, DeadEyeArrow, Deagle AP, Debloper, Deconstructhis, Defender of torch, Dekard, Dekisugi, Delinka, Demiurge, Demonkoryu, Denisarona, Deon, DerHexer, Desolator12, DestroyerPC, Dexter Nextnumber, Dhardik007, DiaNoCHe, DigitallyBorn, Dirkbb, DirkvdM, Discospinster, Dispenser, DivineAlpha, Djonesuk, Djsasso, Dmerrill, Dmlandfair, Doh5678, Donhoraldo, Dori, Dosman, Download, Doyley, DrDnar, DreamFieldArts, Drmies, Drummondjacob, Dsda, Dudboi, Duke56, DuncanHill, Dvn805, Dyl, Dynaflow, Dysprosia, Dzubint, E Wing, E.mammadli, ERcheck, ESkog, Eab28, Easwarno1, Echo95, EconoPhysicist, EdEColbert, Edivorce, Edward, Edwy, Eeekster, Ehheh, El C, Eleete, Elkman, Ellmist, Elockid, Elsendero, Elvenmuse, Ems2, Emurphy42, Emwave, Emx, Endothermic, Enigmar007, Enna59, Enno, Ente75, Enviroboy, Epbr123, Erickanner, Erkan Yilmaz, ErkinBatu, Escape Orbit, Ethan.hardman, Ethanhardman3, EurekaLott, Eurleif, Evercat, EwokiWiki, Excirial, Eyreland, Face, Falcon Kirtaran, Falcon8765, Favonian, Feedintm, Felyza, Ferrenrock, Fish and karate, Flewis, Flonase, Florian Blaschke, Flubbit, Fobenavi, Foot, ForrestVoight, Fram, Francis2795, Frankdushantha, Frap, Fred Gandt, FredStrauss, Fredrik, Freyr, Friecode, Fronx, Fsiler, Fubar Obfusco, Furrykef, Fvasconcellos, Fyver528, GB fan, GRAHAMUK, Gail, Gaius Cornelius, Gardar Rurak, Garlovel, Gaurav1146, Gauravdce07, Gazpacho, Gbeeker, Geekman314, GeneralChrisV, Georgia guy, Geph, Gepotto, Gerard Czadowski, Ghakko, Ghettoblaster, Ghyll, Giftlite, Glacialfox, Glen, Gogo Dodo, Gogodidi, Golfington, Golftheman, GoneAwayNowAndRetired, Goodnightmush, GorillaWarfare, Gorrister, Gortu, Grafen, Graham87, Grandscribe, GrayFullbuster, Greensburger, GrooveDog, Grosscha, Ground Zero, Grover cleveland, Grunt, Gschizas, Gscshoyru, Gtgray1948, Guess Who, Gumbos, 
Gurch, Guy Harris, HDrake, Hammersoft, Hanii Puppy, Hannes Hirzel, Hansfn, Harris james, Harry, Harryboyles, Hashar, Hawaiian717, Hazard-SJ, Hdante, HeikoEvermann, Henriquevicente, Heron, HexaChord, Hillel, Hirzel, Hmains, Holden15, Hqb, Hrundi Bakshi, Htaccess, Huszone, Hut 8.5, Hydrogen Iodide, II MusLiM HyBRiD II, IMSoP, Iamunknown, Ian Dunster, Ian Pitchford, Ian.thomson, Icefirearceus, Ida Shaw, Ideogram, Idleguy, Ilmari Karonen, Incnis Mrsi, Indon, Inferno, Lord of Penguins, Ino5hiro, Insanity Incarnate, Integralexplora, Intgr, Ioeth, Iridescent, IronGargoyle, Ishanjand, Iswariya.r, It Is Me Here, ItsMeowAnywhere, Ixfd64, J Milburn, J.delanoy, J00tel, JForget, JHunterJ, JLaTondre, JSpudeman, Jaan513, Jackfork, Jackmiles2006, James pic, JamesAM, JamesBWatson, Janitor Starr, Jarble, Jasper Deng, Jatkins, Javierito92, Jaxl, Jaysweet, Jbarta, Jclemens, Jdm64, Jdrowlands, Jebus989, Jedikaiti, Jeff G., Jeffwang, Jeltz, JeremyA, Jerome Charles Potts, Jeronimo, Jerry, Jerryobject, Jerrysmp, Jesse V., JetBlast, Jfg284, Jfmantis, Jh51681, Jhonsrid, Jijojohnpj, Jim1138, JimPlamondon, Jimmi Hugh, Jjk, Jjupiter100, Jkl4201, JoanneB, Jobrad, JoeSmack, Joecoolatjunkmaildotcom, Joemaza, Joffeloff, John Nevard, Johnnaylor, Johnny039, Johnuniq, JonHarder, Jonathan Hall, Jordi Burguet Castell, Jorge.guillen, JorgePeixoto, Josef.94, Josepsbd, Josh the Nerd, Joshlk, Joshua Gyamfi, Joy, Jpeeling, Jschnur, Jstirling, Jsysinc, Julepalme, Jumbuck, Jusdafax, K7jeb, KAMiKAZOW, KAtremer, KDesk, KGasso, Ka Faraq Gatri, Kagredon, Kajasudhakarababu, Kamanleodickson, Karabulutis252, Karimarie, Karnesky, Karol Langner, Karolinski, Kashmiri, Katalaveno, Kathleen.wright5, Katieh5584, Kaustubh.singh, Kaypoh, Kbdank71, Kbrose, Kcordina, Ke5crz, KenBest, Kenny sh, KenshinWithNoise, Kenyon, Kerowhack, Kev19, Kevin586, Kgoetz, Khoikhoi, Kidde, Kim Bruning, Kimdino, Kjaleshire, Kjetil r, Kjkolb, Kku, Klungel, Knokej, Knownot, Kokamomi, Kotiwalo, KrakatoaKatie, Krauss, Kubanczyk, Kuru, Kushalbiswas777, Kusma, Kusunose, Kwiki, Kyle1278, Kyng, Kyuuseishu, L Kensington, LFaraone, La Pianista, Lambiam, Landroo, Latka, Law, Leaflord, LeaveSleaves, Lejarrag, LeoNomis, Letdorf, Leuko, Lifemaestro, Lightedbulb, Lindert, Linkspamremover, Linnell, Littlegeisha, Livajo, Lkatkinsmith, Lmmaaaoooo, Loadmaster, Logan, Logixoul, Lordmarlineo, Lost.goblin, Love manjeet kumar singh, Lovelac7, Lowellian, Lradrama, Lt monu, Ltomuta, Lucid, Lucy-seline, Lucyin, Luk, Lumos3, Luna Santin, Lvken7, Lysander89, Lyt701, M.r santosh kumar., M2Ys4U, M4gnum0n, MBisanz, MC MasterChef, MER-C, MONGO, Mabdul, Macintosh User, Macintosh123, Magnus Manske, Maitchy, Makeemlighter, Manassehkatz, Mandarax, Manickam001, Manmohan Brahma, Manojbp07, Manticore, March23.1999, Marek69, MarioRadev, MarkSG, Markaci, Marko75, MarmotteNZ, Martarius, Martin smith 637, Martinwguy, Masonkinyon, Materialscientist, MattGiuca, Mattbr, Matthardingu, Matthuxtable, MattieTK, Mav, Max Naylor, Maxim, Maximus Rex, Maziotis, Mbalamuruga, Mblumber, Mc hammerutime, McDutchie, McLovin34, Mcloud91, Mdd, Mdikici, Meaghan, Medovina, Meegs, Melab-1, Melsaran, Memset, Mendalus, Meneth, Meowmeow8956, Merlion444, MetaEntropy, Miaers, Michael B. 
Trausch, MichaelR., Michaelas10, Mickyfitz13, Mike33, Mike92591, MikeLynch, Mikeblas, Milan Kerlger, Mild Bill Hiccup, Minesweeper, Minghong, Miquonranger03, Mirror Vax, Miss Madeline, MisterCharlie, Mistman123, MithrandirAgain, Mmxx, Mnemoc, Mononomic, Monz, Moondyne, Mortus Est, MovGP0, Mppl3z, Mptb3, Mr.Z-man, MrOllie, MrPaul84, Mrankur, Mthomp1998, Mualif02, Muehlburger, Mufka, Mujz1, Muralihbh, Murderbike, Musiphil, Mwanner, Mwheatland, Mwtoews, Mxn, Mslimix, N sharma000, N419BH, N5iln, NNLauron, NPrice, NULL, Nakon, Nanshu, Naohiro19 revertvandal, NapoliRoma, Nasnema, NawlinWiki, Nayak143, Nayvik, Ndavidow, NellieBly, Nergaal, Neversay.misher, Ngch89, Ngien, Ngyikp, Nick, Nikai, Ninuxpdb, Nixeagle, Njuuton, Nk, Nlu, No Guru, Nobody Ent, Noldoaran, Nono64, Norm, Northamerica1000, NotAnonymous0, Nothingisoftensomething, Notinasnaid, Nrabinowitz, Nsaa, Numlockfishy, Nvt, Nwusr123log, O.Koslowski, OKtosiTe, Ocolon, Oda Mari, Odell421, Odie5533, Ohnoitsjamie, Olathe, Oliverdl, Olivier, OllieWilliamson, OlurotimiO, Omicronpersei8, Omniplex, Ondertitel, Onorem, Oosoom, Openmikenite, Optimisticrizwan, OrgasGirl, Orrs, Oxymoron83, P.Marlow, Papadopa, Parasti, Patato, Patrick, Paul E T, Paul1337, Pcbsder, Pepper, PeterStJohn, Petrb, PhJ, PhantomS, Phgao, Phil websurfer@yahoo.com, Philip Howard, Philip Trueman, Photonik UK, Piano non troppo, Pierre Monteux, Pinethicket, Pinkadelica, Pithree, Plasticup, PlutosGeek, Pmlineditor, Polluks, Polyamorph, Pontiacsunfire08, Posix memalign, Prashanthomesh, PrestonH, Programming geek, Prolog, Prophile, Pruefer, Public Menace, Puffin, Qaanol, Quarkuar, QuiteUnusual, Qwerty0, Qwyrxian, R'n'B, R. S. Shaw, RA0808, RB972, RTC, Raanoo, Rabi Javed, Raffaele Megabyte, RainbowOfLight, Rainsak, Rami R, Ramif 47, Random Hippopotamus, RandomAct, RaseaC, Ratnadeepm, RattusMaximus, RavenXtra, Rayngwf, Raysonho, Raywil, RazorICE, Rbakels, Rbanzai, Rdsmith4, Reach Out to the Truth, RedWolf, Reedy, Rektide, Remixsoft10, Rettetast, RexNL, Rfc1394, Rhyswynne, Riana, Rich Farmbrough, Rilak, Rjgarr, Rjwilmsi, Rlinfinity, Rmere, Rmhermen, Robert K S, Robert Merkel, RobertG, Robertwharvey, RockMaster, Rockstone35, Rodri316, RogierBrussee, Rokfaith, Rolandg, Romanm, Ronark, Ronhjones, RossPatterson, Rotem Dan, RoyBoy, Rrelf, Rror, Rubena, Rubicon, Ruud Koot, Rzelnik, S.borchers, S10462, SF007, SNIyer12, SPQRobin, Safinaskar, Sainath468, Sakariyerirash, Sam Vimes, SampigeVenkatesh, Sander123, Sanfranman59, Sango123, Sardanaphalus, Sarikaanand, Scherr, SchmuckyTheCat, SchreyP, SchuminWeb, Schwallex, Schzmo, Sdfisher, Sean William, Seba5618, Sedmic, Senator Palpatine, Sewing, Shadowjams, Sharanbngr, Sharkert, SheikYerBooty, Shizhao, Shreevatsa, Shreshth91, Shriram, Sidious1741, Sigma 7, Signalhead, Silas S. 
Brown, Simon the Dragon, SimonP, Simxp, Sir Nicholas de Mimsy-Porpington, SirGrant, SirGre, SivaKumar, Skarebo, Skomes, Slgrandson, Slogan621, Slon02, SmackEater, Smadge1, Snowmanradio, Snowolf, Socalaaron, Socrates2008, SocratesJedi, SolKarma, SolarisBigot, Sommers, Sophus Bie, South Bay, Sp, Spanglegluppet, Sparkle24, SpooK, SpuriousQ, Squash, Sridip, Staffwaterboy, Stealthmartin, Stephen Gilbert, Stephen Turner, Stephenb, Stephenchou0722, SteveSims, Stevenj, Stewartadcock, Stickee, Stormie, SudoGhost, Sun Creator, SunCountryGuy01, Sunay419, Super Mac Gamer, SuperLuigi31, Superswade, SusanLesch, Susheel verma, Sven Manguard, Sven nestle, Sweet blueberry pie, Synchronism, Syzygy, THEN WHO WAS PHONE?, THeReDragOn, Ta bu shi da yu, Tannin, TarkusAB, Tarmo Tanilsoo, Tarquin, Tasting boob, Tatrgel, Tdrtdr, Tdscanuck, TempestSA, Teply, TexasAndroid, Texture, Tgeairn, Tgnome, The Anome, The Random Editor, The Thing That Should Not Be, The undertow, The1DB, TheAMmollusc, TheNewPhobia, TheWorld, Thecomputist, Theda, Thedjatclubrock, TheguX, Themoose8, Theshibboleth, Thingg, Thorpe, Thumperward, Tide rolls, TigerShark, Timir Saxa, Titoxd, Tnxman307, Tobias Bergemann, Toddst1, Tokai, Tom Hek, Tom harrison, Tomcool, Tommy2010, TommyB7973, Tompsci, Tony1, Tothwolf, Touch Of Light, Tpbradbury, Tpk5010, Traroth, Travelbird, Trevj, Trevjs, Trimaine, Triona, Trisweb, Triwbe, TurboForce, Twas Now, Twistedkevin, Twitty666, Twsx, Tyler, Tyomitch, Typhoon, Tyrel, Ultimus, Umofomia, Unbreakable MJ, Uncle Dick, Unixguy, Unknown-xyz, Uogl, Upthegro, Ursu17, Urvashi.iyogi, Useight, User A1, Utahraptor ostrommaysi, Utilitytrack, VampWillow, Vanessaezekowitz, Vanished user 39948282, Vbigdeli, VegaDark, Verrai, Vicenarian, Vikrant manore, Vincenzo.romano, Viriditas, Vorosgy, Vox Humana 8', Vrenator, W163, WJetChao, Wapcaplet, Wareh, Warren, Warut, Wasted Sapience, Waterjuice, Wavelength, Wayward, Wbm1058, Wdfarmer, Wdflake, Weedwhacker128, WellHowdyDoo, White Shadows, Who.was.phone, Widefox, WikHead, Wiki Wikardo, Wiki alf, WikiDan61, WikiPuppies, WikiTome, Wikievil666, Wikiloop, Wikipelli, WilyD, Winchelsea, Wingnutamj, Winhunter, Winston365, Wisconsinsurfer, Wk muriithi, Wknight94, Wluka, Woohookitty, World-os.com, WorldBrains, Wormsgoat, Wtmitchell, Wtshymanski, Wwagner, X42bn6, Xdenizen, Yaronf, Yellowdesk, Yes-minister, Yidisheryid, Yoink23, Youwillnevergetthis, Yunshui, Yworo, Zephyrus67, Zfr, Zidonuke, Zigger, Ziiike, Zlemming, Zondor, Zotel, Zx-man, Zzuuzz, , var Arnfjr Bjarmason, , , , , 3220 anonymous edits Unix-like Source: http://en.wikipedia.org/w/index.php?oldid=511360273 Contributors: 16@r, Acerperi, Adamantios, AdrianTM, Agarvin, Ahunt, Aladdin Sane, Aldie, AlistairMcMillan, Allens, Andrew1718, Andy16666, Angela, Athantor, BarkingFish, Bdesham, Beno1000, Betterworld, BiT, Borgx, Camje lemon, Chealer, Cheyinka, Chowbok, Chris Q, ChrisBrown, Cleared as filed, Clemwang, ColdShine, Conversion script, Crispmuncher, CyberSkull, Cybercobra, Cyclonius, Damian Yerrick, Darrien, David Gerard, Demi, Dereckson, Derek Ross, Druiloor, Dwheeler, Dylan Lake, Eequor, ElBenevolente, Ems2, EqualRights, Eraserhead1, Escape Orbit, Evice, Evil Monkey, Fibonacci, Frigoris, Funkysapien, Furrykef, GandalfDaGraay, Gareth Owen, Geronimooo, Ghettoblaster, Gogo Dodo, Grandscribe, Greenrd, Gronky, Gubbubu, Guy Harris, Haikupoet, Hans Dunkelberg, Henry W. 
Schmitt, Hroulf, IJohnMac, IMSoP, Imroy, JLaTondre, JYOuyang, James Foster, Janizary, Jerryobject, Joeinwap, Joffeloff, Jonny6910, Jordandanford, JorgePeixoto, Jpp, Julesd, Juliancolton, Karnesky, Ken Arromdee, Koffieyahoo, Kulokk, Kwertii, Kychot, La Pianista, Letdorf, Limewolf, Lost.goblin, LukeyBoy, MarXidad, Marathonmike, Markpeak, Mebden, Merphant, Michael B. Trausch, Mike hayes, Mindmatrix, Mms, Monedula, Mysekurity, N5iln, Naddy, NapoliRoma, NawlinWiki, Netha Hussain, Ni.cero, NicM, Nickj, Ossguy, Owain, Pelago, Phobos11, Pmlineditor, Polluks, Powwradota, Prolinesurfer, Psychonaut, Public Menace, Quasipalm, R'n'B, RJaguar3, Rchandra, Reisio, Revolus, Robertmh, Rvalles, SF007, Schily, Scott Ritchie, Scott Wilson, Seitz, Seth Nimbosa, Siroxo, Sligocki, Smazzin, SteinbDJ, SteveSims, Stevertigo, Taak, Tarquin, Tassedethe, Techsmith, Tedickey, Teemu Leisti, Thumperward, Timir2, Tobias Bergemann, Topbanana, Trialsanderrors, Tverbeek, Tyomitch, Utcursch, Vicki Rosenzweig, Wareh, WarthogDemon, Wereon, Wernher, Whayworth, XTaran, Yes-minister, Yworo, Zvn, ncel Acar, 220 anonymous edits File system Source: http://en.wikipedia.org/w/index.php?oldid=514049707 Contributors: (, 100110100, 121a0012, 2mcm, 90 Auto, Adamantios, Adrian, Ae-a, Ahoerstemeier, Ahy1, Aillema, Aj00200, Alansohn, Alba, Aldie, Aleksandar030, Aliekens, AlistairMcMillan, Alkrow, AmRadioHed, Ameen.crew, Anandbabu, Ancheta Wis, Andre Engels, Andy16666, AnonMoos, Anthony Borla, Arjayay, Ark, Arnon007, Aron1, Arrenlex, Aschrage, Asd.988, Assarbad, AtheWeatherman, AxelBoldt, Badgernet, Baryonic Being, Becksguy, Beland, Benash, Bender235, BiT, Bitwise, Bletch, Bob007, Boborok, Boing! said Zebedee, Bornhj, Brickmack, Brycen, Burschik, Byteemoz, COstop, Can't sleep, clown will eat me, Cander0000, Capricorn42, Carlosguitar, Catalina22, Cbayly, Ceyockey, Cgy flames, Chealer, Chipuni, Chris Chittleborough, Chris the speller, ChrisHodgesUK, Christian Storm, Claunia, Cmdrjameson, Cohesion, Colin Hill, Conversion script, Coolfrood, Corby, Cpiral, Crashmatrix, Creidieki, Csabo, Cspurrier, Ctachme, CyberSkull, DGerman, DMG413, DMahalko, DRAGON BOOSTER, DStoykov, DVD R W, Damian Yerrick, Damieng, DanDevorkin, Darklilac, Darrien, David Gerard, David H Braun (1964), DavidHalko, Davitf, Decoy, Dekisugi, Del Merritt, Delirium, DerHexer, Dexter Nextnumber, Dillee1, Dirkbb, DmitryKo, Donhalcon, Download, Druiloor, Dsant, Dsav, Dysprosia, EddEdmondson, Edward, ElBenevolente, Electron9, Eltouristo, Emperorbma, Emre D., Eob, Everyking, Ewlyahoocom,

Eyreland, Falsifian, FatalError, Favonian, Ferrenrock, Firthy2002, Fogelmatrix, Foxxygirltamara, FrYGuY, Fraggle81, Frap, Froggy454, Gaius Cornelius, Galoubet, Gazpacho, Ghakko, Ghettoblaster, Gpvos, GraemeL, Grafikm fr, Graham87, Greg Lindahl, GregorB, Groogle, Guroadrunner, Guy Harris, Hadal, Hagedis, Hairy Dude, Hazel77, Helix84, Hif, Howdyboby, Ian Pitchford, Ianiruddha, Imroy, InShaneee, Ineuw, Info@segger-us.com, Intgr, Isarra, J0m1eisler, JLaTondre, Jason Quinn, Jasper Deng, Jcorgan, Jec, Jeff G., Jeffpc, Jengelh, Jerryobject, Jim.henderson, Joeblakesley, Joeinwap, Jonathan de Boyne Pollard, JordoCo, Jotel, Joy, Jna runn, Kairos, Karada, Karmastan, Kate, Kbdank71, Kbolino, Kc2idf, Kendrick7, Kenyon, Kim Bruning, Kiralexis, Kozaki, Krauss, Kvedulv, Kwharris, Kwi, LHOON, Lament, Lehoo, Leon Hunt, Letdorf, Lightdarkness, LilHelpa, Lion.guo, Loadmaster, LocoBurger, Lofote, Logixoul, Lost.goblin, Lotje, Lupo, MARQUIS111, MZMcBride, Mac, Mange01, Mannafredo, ManuSporny, Marcika, MarekMahut, Marudubshinki, Marysunshine, Mat-C, Materialscientist, MatthewWilcox, Matthiaspaul, Mattisgoo, Maxal, Maximaximax, Mbakhoff, Mdd, Med, Miblo, MicahDCochran, MikeRS, Mild Bill Hiccup, Mindmatrix, Minghong, Mjk64, Mk*, Mlessard, Mmairs, Modster, Monz, Morte, Mrichmon, Mschlindwein, Mulad, Mushroom, Mwtoews, Nahum Reduta, Nanshu, NapoliRoma, Nbarth, NeaNita, NevilleDNZ, Nikitadanilov, Nixdorf, OccamzRazor, Oda Mari, Omicronpersei8, OrangeDog, Orzetto, Oscarthecat, Ovpjuggalo, PGSONIC, Palconit, Patrick, Pattepa, Paul.raymond.brenner, Peterlin, Peyre, Phil Bordelon, PhilHibbs, PhotoBox, PichuUmbreon, Poccil, Pol098, Poppafuze, Porterde, Psychonaut, Public Menace, Pythagoras1, Qaywsxedc, Quale, Questulent, Quiddity, R. S. Shaw, RJaguar3, Radagast83, Raffaele Megabyte, Ravenmewtwo, Reconsider the static, RedWolf, Reisio, Retron, Reyk, Rfc1394, Rhobite, Rich257, Riotnrrd, Rob Kennedy, Rockstone35, Rogitor, Royce, Rror, Runner5k, Ruud Koot, Rvalles, Rynsaha, Ryulong, SEWilco, SMC, Sam Hocevar, SamCPP, Samfw, Saucepan, Scarlet Lioness, Scientus, ScottJ, Sdfisher, SeanMack, Semifinalist, Sheehan, Sherwood Cat, Showeropera, Slogan621, Smappy, Snaxe920, SolKarma, SolarisBigot, Sommerfeld, SpeedyGonsales, Splash, Squash, Ssd, Stephen Gilbert, Stephenb, Stuart Morrow, StuartBrady, Suffusion of Yellow, Supertin, Suruena, Swift, Swpb, Tablizer, Taka, Tannin, Tarquin, Tawker, Tellarite, Tempel, The ansible, TheAMmollusc, TheGeekHead, The_ansible, Theone256, Thompsa, Thumperward, Thunderpenguin, Tim Ivorson, Tobias Bergemann, Traut, Tylerni7, Typhoon, Uli, Uncle G, Unixguy, Val42, Vasi, Velella, Voidxor, W163, W1tgf, Wai Wai, Walabio, Warpflyght, Wayiran, Wesley, Wikid77, Wikievil666, Winston Chuen-Shih Yang, Wknight94, Wli, Woohookitty, Ww, X7q, Xcvista, Yamla, Yapchinhoong, Yudiweb, Zemyla, Zetawoof, Zhaofeng Li, Zodon, Zoicon5, Zunaid, var Arnfjr Bjarmason, , 796 , anonymous edits Network File System Source: http://en.wikipedia.org/w/index.php?oldid=510474946 Contributors: Alexbatko, Andareed, Atomota, Audriusa, Beland, Brianski, Bsroiaadn, Burschik, Byrial, CanisRufus, Chealer, Cokoli, Cplatz, Craphound, Daniel.Cardenas, David Levy, DavidHalko, Dgw, Dittaeva, Dogcow, Dontraub, Edward, Eike Welk, Emperorbma, Etienne.navarro, Fatespeaks, Fnielsen, Fuzheado, GarethEvans, Geek66, Ghakko, Ghettoblaster, Glenn, Gpvos, Gronky, Guy Harris, Haakon, Hairy Dude, Happywaffle, Harryboyles, Hashar, HedgeHog, Helix84, Hella, Hellisp, Herakleitoszefesu, Hhielscher, Hippietrail, Hossein.ir, Ilikeplums, Ilyathemuromets, Intgr, Ivanl, 
JesseHogan, Jimj wpg, Jordan Brown, Joy, JzG, KJRehberg, Karl-Henner, Khendon, Kirankumar9949, Liam987, Libkeiser, M4gnum0n, Magnus.de, Mange01, Mark Foskey, Mattpickman, Mcld, Mennis, Merlin83b, Mindmatrix, Minesweeper, Mipadi, Mre5765, Mwtoews, NTK, Nfs user, Nixdorf, Nono64, Nubiatech, Nuno Tavares, Ohiostandard, Oli Filth, Omicronpersei8, Orderud, OrgasGirl, Pedant17, Peter S., Peterklevy, PlaysWithLife, Quarsaw, Qwe, Qwertyus, Raanoo, Raysonho, Razimantv, Rees11, Richard W.M. Jones, Robneild, Romx, RuM, S.rvarr.S, Seaphoto, Secfan, Sfoskett, Shooke, SingingSenator, Smyth, SteinbDJ, Stormie, Stuart Morrow, Tanner-Christopher, Template namespace initialisation script, TenPoundHammer, Thu, Thue, Thumperward, Tony1958, Toussaint, Tyir, UkPaolo, Vedranf, Vegetator, Viajero, Whitelabrat, Wknight94, WojPob, Wrs1864, var Arnfjr Bjarmason, 129 anonymous edits Server Message Block Source: http://en.wikipedia.org/w/index.php?oldid=508903967 Contributors: 4crickj, A bit iffy, Aaron Brenneman, Abune, Aitrus56, Akhristov, Albe, Alex French, AlexPlank, Alexius08, AlistairMcMillan, Allter, Amux, Analoguedragon, Android Mouse, Angusmca, Barek, Bellfoundry, BenBreen2003, Bishopolis, Blaxthos, Bletch, Borgx, Bovineone, Brad.budokan, Bumm13, Car031, Cburnett, Cobone, CosineKitty, Courcelles, Crackrjackr, Ctrlsoft, Cwolfsheep, Darrien, David Gerard, Deflective, Dgranja, Dinjiin, Dkronst, Dogcow, Drange net, ENeville, Eaefremov, Emperorbma, Emurphy42, Etz Haim, Exos2222, FireyFly, Franciosi, Frap, G-force, Gaius Cornelius, Ghakko, Gistca, Gogo Dodo, Grawity, Groink, Guanxi, Guy Harris, Gz33, Haakon, Hairy Dude, Haseo9999, Heron, Hippietrail, Hmains, Hotdogger, Hqduong, Huiguo01, Iambk, Ian Pitchford, Imeson, Incnis Mrsi, Jamelan, Jannex, Jbattersby, Jd2157, JesseW, Jimmi Hugh, Julesd, Kadin2048, Karada, Kbolino, Kbrose, Kr2006, Kvng, Kwamikagami, Laithjazi, Leandrod, Lkcl, Loftenter, Mabdul, Mange01, Mark.taneo, Marokwitz, Mboverload, Memodude, Mentifisto, MetsFan76, Michael Hardy, Mike hayes, Miraceti, Mitch feaster, Mre5765, Mwtoews, Myersj, Nakon, Neilbmartin, Nicolas Melay, Nicolas1981, Nixdorf, NotinREALITY, Novasource, Nubiatech, Numa, Oznt, Paul Weaver, Pauly04, Pearle, Pedant17, Peyre, Piperh, Plutor, Pnm, Project2501a, Public Menace, Qwertyus, Raanoo, Raysonho, Rchandra, Rebrane, Reedy, Reinderien, ReverendDarkness, Richard W.M. 
Jones, Rjwilmsi, Robmv, Rockear, Romanc19s, RonFredericks, Rprpr, SeL, Sfoskett, Shell Kinney, Sietse Snel, Simon the Likable, Slashme, SmilesALot, Socrates2008, Spellmaster, Sprmw7, Stephan Leeds, SteveLoughran, Stevenrasnick, Stewartadcock, Strait, Superm401, Swiftbrand, Ta bu shi da yu, Thayts, The Anome, Thursbysoftware, Trailrunner13, Trasz, Travis B., UncleBubba, Visualitynq, Warren, WegianWarrior, Winspool, Wk muriithi, Wknight94, Woohookitty, Wrs1864, Xpclient, , 266 anonymous edits SCSI Source: http://en.wikipedia.org/w/index.php?oldid=506796496 Contributors: 16@r, 193.203.83.xxx, 2fort5r, 2strokewool, 777sms, 8.253, A876, Aarktica, Adamantios, Addw, AdeMiami, Adrian, Agentbla, Ahoerstemeier, Akasnakeyes, Al Lemos, Alansohn, Aldenrw, Aldie, Alecv, Alex, AlistairMcMillan, Andrew sh, Angelic Wraith, Antimatt, Arjun01, Arunajay, Atama, Austinmurphy, Avapoet, BFunk, Badgernet, Balabiot, Benefros, Berdidaine, Bergert, Bigdumbdinosaur, Bletch, Blugill, Bmicomp, Bobanny, Bobblewik, Bookinvestor, Boooooooooob, Borgx, Brandon, Bumm13, Butros, CALR, CB1226, Cab88, CarbonLifeForm, Ceros, CesarB's unpriviledged account, ChadCloman, Charlieleake, CharlotteWebb, Chmod007, ChristTrekker, Christian75, Ckmac97, Cohesion, Colin Keigher, ComCat, Commander Keane, Conversion script, Corti, CostyaV, Crazycomputers, Crdillon, Crystina7, Cy jvb, Cybercobra, Damieng, Dan aka jack, DanielRigal, DarthShrine, Dasparwani, DavidDouthitt, DeadEyeArrow, Deflective, Deineka, Demonkoryu, Dhp1080, Diannaa, Disavian, DmitryKo, DocWatson42, Douglas goodall, Dpbsmith, DragonHawk, Droll, Dtcdthingy, Duducoutinho, Dysprosia, E2eamon, Eagles2016, Edderso, Edokter, Electron9, Eleven81, Eloquence, Engineerism, Epbr123, Everyking, Ewlyahoocom, Exert, Extraordinary, FAchi, Flameeyes, Flockmeal, Florian Nanu, Flying fish, Francs2000, Fred J, FrozenPurpleCube, Gaurav, Gdo01, Gekedo, Gene Nygaard, Ghaberek, Ghakko, Ghettoblaster, Giftlite, Giraffedata, Gjs238, Guy Harris, Hac13, Hadal, Haham hanuka, HarisM, Hbent, Hellboy4311, Hephaestos, Heron, Hopp, Ht1848, ILike2BeAnonymous, Ian Pitchford, Ian.yy.huang, Ilcastro, JTN, James Foster, Jeh, Jesse Viviano, Jim10701, Joezamboni, John Vandenberg, JohnAlbertRigali, JohnI, Jonverve, KSiimson, Kbdank71, Kewp, Klausness, Koman90, Kozuch, Krawi, Kubanczyk, Kwamikagami, LFaraone, Latka, LeaveSleaves, Lee Cremeans, Lee Daniel Crocker, Lenehey, Lightmouse, Luckyherb, Luggerhead, Luk, M8R-lc06bv, MER-C, MSGJ, Mailer diablo, Marcok, Martarius, Matt Chase, Matt Crypto, MatthewWilcox, Maurreen, Maury Markowitz, Mausy5043, Mav, Mclayto, MichaelJanich, MikeStone2012, Mirror Vax, Mishrsud, Mitch Ames, Modster, Mscritsm, Mulad, Music Sorter, Nakon, Napalm Llama, Ndenison, Neilm, NellieBly, Nicholsr, Nicolas Melay, Nightscream, Nikai, Nixdorf, Numa, Ohconfucius, Otac0n, Pagingmrherman, Patcat88, Paul Suhler, Pelago, Philip Trueman, Piano non troppo, Pidgeot, Pjoef, Pmc, Pol098, Poweroid, Profoss, Public Menace, Pweltz, Qxz, RDBrown, RHaden, RJFJR, Rammer, Raptor007, Rdsmith4, RedWolf, RedWordSmith, Reddi, Redgrittybrick, Reedy, Requestion, Rich Farmbrough, Rilak, RivRubicon, Rjgodoy, Rjwilmsi, Rolfvandekrol, Rrohbeck, Rwwww, SNx, Sabre centaur, Scootey, Scorpion7, Scsidrafter, Seb az86556, Sergei, Serpent's Choice, Sfoskett, Shlbrt, Sietse Snel, Simetrical, Sin-man, Smiteri, Soap, SolKarma, Solitude, Some jerk on the Internet, SpaceFlight89, Speters33w, Sridhar.gattu, Srikeit, Ste.Ri, Steveprutz, StorageMania, Sunku here, Surv1v4l1st, Tannin, TechControl, TechPurism, Teknic, Tempshill, TeslaMaster, 
The Anome, TheJosh, Tide rolls, Tigertiger, Tnsnick, Tnxman307, Tolien, Tom94022, Toreau, Tubby, UncleBubba, Uriyan, VMS Mosaic, VampWillow, Versageek, Vidarlo, Whosyourjudas, WideArc, Wilko12, William Avery, Williamv1138, Winthrowe, Wisden17, Woohookitty, Wtshymanski, Zac67, Zoicon5, Zondor, , 533 anonymous edits iSCSI Source: http://en.wikipedia.org/w/index.php?oldid=513899947 Contributors: A876, Aarondailey, Abisys, Adcomcorp, Alansohn, Albanaco, Aldie, Allenc28, Amruthanaik, AnatolyVilchinsky, Andysontheroad, Aneczkab, Anna Lincoln, Anthony Fok, AviN1, Banderer, Barek, Bart.vanassche, Bdelisle, Benley, Bggoldie, Biot, BleepingCompute, Blowdart, Boleon, Bookinvestor, Bousquf, Bovineone, Brianicus, Brucela, Btilm, Butlerm, CRGreathouse, Cain Mosni, Canaima, Carey Evans, Catfish777, Celestra, Ceros, CesarB, Charles Gaudette, Charleswj, Chowbok, Chris the speller, Chrisbolt, Coffee, Col.panik, Corti, CostyaV, Cpeel, Csabo, Custardninja, Cy jvb, DMahalko, DNFCorp, DStoykov, DavidDouthitt, Deflective, Demitsu, DenGer, Djj, Dmdwiggi, Dmeranda, Dominic.ashton, Drmies, Edcolins, Eddyq, EdgeOfEpsilon, Epbr123, Ericyu, Fcami, Fostermarkd, Franke3c, Fredtorrey, Funandtrvl, Georgewilliamherbert, Giraffedata, Glenn, Grstain, Gudeldar, Gvvua, Haakon, Haidut, Hellkeeper, Henriok, Hpcanswers, Hschlarb, I already forgot, Ingvarr2000, Iolar, Jackfork, Jacob1044, Jadecristal, Jakllsch, Jc monk, JeLuF, Jeroenr, Jerome Charles Potts, Jesse Viviano, Jfcotton, Jliv, Joey Novak, Jonverve, Kaze0010, Kbrose, Ke4qqq, Kenleezle, Ketiltrout, Kevintolly, Khatru2, Kiddington, Klaus100, KnightRider, Kubanczyk, Kwamikagami, Lmatt, Lyod, Maderiaboy, Malis-cs, Mallboro, MarcNozell, Marcfl, Marek69, Mark Arsten, Matt Crypto, Melnakeeb, MeltBanana, Michael Hardy, Mor Griv, Music Sorter, NJM, Naddy, NapoliRoma, Nishkid64, Nono64, Olmsfam, Orokusaki, P shadoh, PHaze, Pagingmrherman, Paisa, Paul Koning, Pedant17, Perfecto, Peytonbland, Phatom87, Philipcurrito, Pmc, Pooua, Project2501a, Rasmus Faber, Raysonho, Rchandra, Reedy, Riaanvn, Rich Farmbrough, Rocketshiporion, Ronz, Rose Garden, Ruleke, Rwwww, S36e175, SANguru, SHayter, Savisko, Seitz, Sfoskett, Shamasis, Sietse Snel, Simonefrassanito, Singerw1, Skolor, Skouperd, Snowolf, Starbane, SteinbDJ, Stereo, Stereoroid, StorageSys, Storagegeek007, Sunnyvegas, Suruena, Swooshiain, Tedder, TekMason, Terence626, The Anome, The Thing That Should Not Be, Thtse, Thumperward, Titaniumlegs, Tqbf, Trasz, Tulcod, Uhw, UncleBubba, Unyoyega, VadZ, Verdatum, Vickaul, Vitor Mazuco, Vrenator, Vy0123, W Nowicki, Wavelength, Whosyourjudas, Xunillator, Zac67, Zink Dawg, 631 anonymous edits Fibre Channel Source: http://en.wikipedia.org/w/index.php?oldid=513659272 Contributors: ALEF7, Aarondailey, Abamir, Aeons, Aerotheque, Afiler, Aldie, Alecv, Amillar, Amrithraj, AnwerAbbas, Average Earthman, Barek, BenFrantzDale, Benjwong, Berkut, Bobblewik, Bovineone, Bpothier, Camster444, Canley, Ceros, Chrylis, Cjuans, Conan, Corti, Cpetty-wiki, DMG413, DabMachine, Damaru, DanielRigal, Davemck, DennisLMartin97, Derarthur, Dewritech, Draziw, DreamGuy, Dwandelt, Edcolins, Elano, Engineerism, Feline Hymnic, Ferdinand Pienaar, FreplySpang, Giftlite, Giraffedata, Guy Harris, Hairy Dude, HedgeHog, Hoplon, Hopp, I already forgot, Innv, Intgr, Jedonnelley, Jharcourt, Jim.henderson, John of Reading, Jonverve, Karlward, Karnamshiv, Katiker, Kbdank71, Kcordina, Kevintolly, Killian441, Kjj31337, KnightRider, Kocio, Kofman, Kubanczyk, Kvedulv, LCP, Levin, Lightmouse, LionelQ, Lloydic, Llywrch, Lotje, Lusitana, Makampec, 
Marcfl, Mark Arsten, MarkS, Matt Crypto, MatthewWilcox, Maury Markowitz, Mennis, Michael Hardy, Mindmatrix, Mrand, Mschnell, Music Sorter, Mvdhout, Nabla, Nageh, Nasa-verve, Neiko, Neilm, Nicholsr, PPrakash, Palani.j, PartTimeGnome, Pavel Vozenilek, Peaceray, Peter bertok, Phatom87, Poccil, Reedy, Rich Farmbrough, Rilak, Rkasper, Rrburke, Rwessel, Rwwww, Sanspeur, Sbisolo, Sbmeirow, Sengork, Sfoskett, Shaddack, Silencefools, SimonP, Sleske, Slowking Man, Smallpond, Stephan Leeds, Stephenb, Straif, TechPurism, Thunderbird2, Tonkie67, Tony1, Tothwolf, Unixxx, VampWillow, Virtugon, Vvavisnotwav, Wernher, WulfTheSaxon, Xrxjxpx, 296 anonymous edits Internet Fibre Channel Protocol Source: http://en.wikipedia.org/w/index.php?oldid=435311631 Contributors: Acertain, Beland, Chaitanya.lala, Deville, Giftlite, Gurch, Henrik, Kelly Martin, Liveste, Michaelrienstra, Pnm, Rmky87, S3raph1m, SHayter, Startstop123, The Banner Turbo, Tvaughn05, Vazster, 3 anonymous edits Fibre Channel over Ethernet Source: http://en.wikipedia.org/w/index.php?oldid=511249201 Contributors: A.e.henderson, Abisys, Actswartz, Bart.vanassche, Bub's, Celestra, Corti, Crenton136, Demartek, DennisLMartin97, Eleanor716, Gps1539, Hu12, Innv, JJohnston2, Jinxynix, Jorgath, Laura126, Leandrod, Llywrch, Mandarax, Marcfl, Meh130, Music Sorter, Naveenpf, Ppelleti, Rabarberski, Rfellows, Rjwilmsi, Rrburke, Sabariganesh, Suruena, Technodesi, The Anome, The Thing That Should Not Be, Tonkie, Tonkie67, Tu13es, Vigliano, W Nowicki, 55
anonymous edits NetApp filer Source: http://en.wikipedia.org/w/index.php?oldid=506533913 Contributors: Aaronrryan, Alansohn, Anna Frodesiak, BlacKats, Cbdorsett, Cflm001, Daferdi, Dethme0w, Esteadle, Falcor84, Gryllida, HTeutsch, Infojunkie23, JFinigan, JeremyinNC, Johan Burati, Johannbell, John of Reading, John.keating123, Krishnarajm3, Kubanczyk, Mahlon, Markhoney, Mblumber, Methcub, Michael Dring, N8, Nicolaiplum, Noformation, Peachey88, Pnm, Pugglewuggle, QEDquid, R'n'B, Ramu50, Raysonho, Reedy, Rehanseo, Resker, Rilak, Romx, Sandeep346, Signalhead, Smyoung, Staffwaterboy, Stevestrange, The Obfuscator, Thumperward, Thunderbird2, Trasz, Vandemark, Wikiwebgeek, Woohookitty, Zesta777, 95 anonymous edits Write Anywhere File Layout Source: http://en.wikipedia.org/w/index.php?oldid=497791251 Contributors: Adamward1985, Alextangent, AlistairMcMillan, Analoguedragon, Arijit123, Brycen, Electron9, Frap, Ghakko, GregorB, Guy Harris, Gwern, Hamitr, JustAGal, Kinema, Kubanczyk, Kvedulv, Markhoney, Matithyahu, Mb1000, Meestaplu, Michael Dring, Mwtoews, Nono64, Pgan002, Pinecar, Public Menace, Raven Morris, Riaanvn, Rjwilmsi, Romx, Sandeep346, TheParanoidOne, Thumperward, Wknight94, 42 anonymous edits Zero-copy Source: http://en.wikipedia.org/w/index.php?oldid=508757055 Contributors: A5b, Algotr, Bomazi, Cwolfsheep, Dougher, Elronxenu, Gonnet, History2007, Jerome Charles Potts, Kesla, PhilipO, Raanoo, RainbowCrane, RainierHa, RevRagnarok, Thumperward, 15 anonymous edits Direct memory access Source: http://en.wikipedia.org/w/index.php?oldid=511075295 Contributors: A5b, Abdull, Ae-a, Aksi great, Alleborgo, Arnero, Arodrig6, Astrale01, Augast15, Bender235, Bomazi, Brianski, Bryan Derksen, CanisRufus, Carmichael, Catskul, Chris.franson, Chronodm, Damianesteves, David.Monniaux, Davidmaxwaterman, Denisarona, Dolda2000, Dudboi, Electron9, Ergaurav.ce, Eric-Wester, Erpel, Excirial, Fedallah, Ferengi, Ferritecore, Firebug, Forbidmario, Furrykef, Gajalkar.ashish, Galileo seven, Gioto, Goodone121, Hmmmmike, ITMaverix, Intgr, Isnow, JamesR, Jesse Viviano, Jfmantis, Jimw338, Jonverve, Jwoodger, Keenan Pepper, Keli666, Kenny sh, Kenyon, Kesla, Kipholbeck, KnowledgeOfSelf, Kostmo, Kubanczyk, Kvng, Law, Lightmouse, Liotier, LjL, Llywrch, LordK, Lubos, MDuo13, MER-C, Magog the Ogre, Mangojuice, Matthiaspaul, Mentifisto, Michael Hardy, Michi.bo, Mk*, Music Sorter, NightMonkey, Nikai, Ninjamask, Notheruser, OrgasGirl, Pathoschild, Pgan002, PhilipO, Pilaf, Pnorcks, PutzfetzenORG, R. S. 
Shaw, RainierHa, Richarddonkin, Richie, Rjpryan, Rmosler2100, Rocastelo, Roux, Seaphoto, Shd, Sjjung, Smalljim, Snubcube, Socrates2008, SpeedyGonsales, Surendhar Murugan, Suruena, Tarquin, The Blade of the Northern Lights, Thoreaulylazy, Thumperward, Timharwoodx, Uzume, VampWillow, Vary, W Hukriede, WegianWarrior, Wimt, Wtmitchell, Wtshymanski, Xonicx, Z-4195, , 296 anonymous edits Memory management unit Source: http://en.wikipedia.org/w/index.php?oldid=513711571 Contributors: Abdull, AlbertCahalan, Alfio, Andy16666, Arndbergmann, Aseretseravat, Asymmetric, Atomaton, Autopilot, Bluefoxicy, Bpringlemeir, Bryan Derksen, Callmejosh, Camw, CanisRufus, Chovain, Chris the speller, Curly Turkey, Daemorris, David.Monniaux, Echtner, Edward, Faisal.akeel, Fraggle81, Fvw, Gerardohc, Glider87, Gona.eu, Guy Harris, Iain.mcclatchie, Ian Joyner, Inquam, Intgr, Jcm, Jesin, Jimw338, Ketiltrout, KnightRider, Kubanczyk, Kusala.s, Levin, Lightmouse, Lightsp33d, LjL, MFNickster, Magnus.de, Malinqing, Milan Kerlger, Mirror Vax, Moskvax, Neilc, Nono64, Oxymoron83, Pgan002, Pnm, R. S. Shaw, Rilak, Robert Brockway, Robert.Harker, Rwalker, SciCorrector, Shjacks45, SimonP, SkyLined, Slurslee, Stepho-wrs, Thumperward, TutterMouse, UncleDouggie, Urul, Wdfarmer, Wernher, Yonaa, Zbytovsky, 79 anonymous edits Log-structured file system Source: http://en.wikipedia.org/w/index.php?oldid=457329311 Contributors: Ahl0003, Ariel., Binarysaint, Cybercobra, Dcoetzee, Dherb, Doc aberdeen, Ed g2s, EdgeOfEpsilon, Ghakko, Innes5, Intgr, Island Monkey, Jhawson, Karada, Katokop1, Maury Markowitz, Mdd, Meghanac, Melancholie, MementoVivere, Michael Dring, MikeRS, Music Sorter, Noloader, Pnm, R'n'B, Rat144, SLi, Shlomi Hillel, Tazpa, Tomdo08, Tristan Schmelcher, Wmahan, Xhienne, Zhangpengcas, 50 anonymous edits Locality of reference Source: http://en.wikipedia.org/w/index.php?oldid=513846580 Contributors: 16@r, 18.94, A5b, Adrianwn, Andreas Kaufmann, BMF81, Ben Wraith, BenFrantzDale, Brianhe, Charles Matthews, Chasrmartin, Cic, Costaluiz, Cpiral, Cyberjoac, DTM, Derek farn, Dfarrell07, Dinno, Ecb29, Ee00224, Einstein9073, Felix Andrews, Firsfron, Fredrik, Headbomb, Helwr, Intgr, Ixfd64, JPG-GR, John, JonHarder, Jruderman, KatelynJohann, Kbdank71, Kenyon, Kurt Jansson, Maverick1715, Mboverload, Mirror Vax, Naroza, NocNokNeo, Not-just-yeti, OlEnglish, Phils, Piet Delport, Pnm, Prohlep, R'n'B, Radagast83, Randomalious, ShakataGaNai, Stephen Morley, TakuyaMurata, The Anome, Themusicgod1, Uncle G, Uttar, Wernher, Zawersh, 51 anonymous edits

Image Sources, Licenses and Contributors


Image:Intel 80486DX2 top.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Intel_80486DX2_top.jpg License: Creative Commons Attribution-Share Alike 2.0 Generic Contributors: Denniss, Solipsist, 1 anonymous edits Image:Intel 80486DX2 bottom.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Intel_80486DX2_bottom.jpg License: Creative Commons Attribution-Share Alike 2.0 Generic Contributors: Denniss, Solipsist Image:Edvac.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Edvac.jpg License: Public Domain Contributors: ArnoldReinhold, Infrogmation, Medium69, Tothwolf Image:PDP-8i cpu.jpg Source: http://en.wikipedia.org/w/index.php?title=File:PDP-8i_cpu.jpg License: Public Domain Contributors: Robert Krten Image:80486dx2-large.jpg Source: http://en.wikipedia.org/w/index.php?title=File:80486dx2-large.jpg License: GNU Free Documentation License Contributors: A23cd-s, Adambro, Admrboltz, Artnnerisa, CarolSpears, Denniss, Greudin, Julia W, Kozuch, Martin Kozk, Mattbuck, Rjd0060, Rocket000, 11 anonymous edits Image:EBIntel Corei5.JPG Source: http://en.wikipedia.org/w/index.php?title=File:EBIntel_Corei5.JPG License: Creative Commons Attribution-Sharealike 3.0 Contributors: highwycombe (talk) Image:MOS 6502AD 4585 top.jpg Source: http://en.wikipedia.org/w/index.php?title=File:MOS_6502AD_4585_top.jpg License: GNU Free Documentation License Contributors: EugeneZelenko, German, Idrougge, Morkork, Wdwd Image:Nopipeline.png Source: http://en.wikipedia.org/w/index.php?title=File:Nopipeline.png License: GNU Free Documentation License Contributors: User:Poil Image:Fivestagespipeline.png Source: http://en.wikipedia.org/w/index.php?title=File:Fivestagespipeline.png License: GNU Free Documentation License Contributors: User:Poil Image:Superscalarpipeline.svg Source: http://en.wikipedia.org/w/index.php?title=File:Superscalarpipeline.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Amit6, original version (File:Superscalarpipeline.png) by User:Poil Image:Memory module DDRAM 20-03-2006.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Memory_module_DDRAM_20-03-2006.jpg License: Public Domain Contributors: A.Savin, Afrank99, Americophile, Cyberdex, H005, Qurren, Tothwolf, 9 anonymous edits File:Bundesarchiv Bild 183-1989-0406-022, VEB Carl Zeiss Jena, 1-Megabit-Chip.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Bundesarchiv_Bild_183-1989-0406-022,_VEB_Carl_Zeiss_Jena,_1-Megabit-Chip.jpg License: Creative Commons Attribution-Sharealike 3.0 Germany Contributors: Afrank99, Aktron, ChristosV, Karsten11, Martin H., Yarnalgo, 6 anonymous edits File:RamTypes.JPG Source: http://en.wikipedia.org/w/index.php?title=File:RamTypes.JPG License: Creative Commons Attribution 3.0 Contributors: KB Alpha File:DDR2 ram mounted.jpg Source: http://en.wikipedia.org/w/index.php?title=File:DDR2_ram_mounted.jpg License: unknown Contributors: Afrank99, Beyond silence, Fir0002, H005, Jpk File:Seagate Hard Disk.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Seagate_Hard_Disk.jpg License: GNU Free Documentation License Contributors: Richard Wheeler File:Super DLTtape I.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Super_DLTtape_I.jpg License: Creative Commons Attribution-Sharealike 2.0 Contributors: Original uploader was Austinmurphy at en.wikipedia. Later version(s) were uploaded by Silverxxx at en.wikipedia. 
File:Computer storage types.svg Source: http://en.wikipedia.org/w/index.php?title=File:Computer_storage_types.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: Surachit File:Hard disk platter reflection.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Hard_disk_platter_reflection.jpg License: GNU Free Documentation License Contributors: Dave Indech File:StorageTek Powderhorn tape library.jpg Source: http://en.wikipedia.org/w/index.php?title=File:StorageTek_Powderhorn_tape_library.jpg License: Creative Commons Attribution-Sharealike 2.0 Contributors: Austin Mills from Austin, TX, USA File:DDR RAM-2.jpg Source: http://en.wikipedia.org/w/index.php?title=File:DDR_RAM-2.jpg License: Public Domain Contributors: Lszl Szalai (Beyond silence) Image:PD-icon.svg Source: http://en.wikipedia.org/w/index.php?title=File:PD-icon.svg License: Public Domain Contributors: Alex.muller, Anomie, Anonymous Dissident, CBM, MBisanz, PBS, Quadell, Rocket000, Strangerer, Timotheus Canens, 1 anonymous edits File:SixHardDriveFormFactors.jpg Source: http://en.wikipedia.org/w/index.php?title=File:SixHardDriveFormFactors.jpg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Paul R. Potts File:Floppy Disk Drives 8 5 3.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Floppy_Disk_Drives_8_5_3.jpg License: Public Domain Contributors: Swtpc6800 en:User:Swtpc6800 Michael Holley File:Asus CD-ROM drive.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Asus_CD-ROM_drive.jpg License: Creative Commons Attribution 3.0 Contributors: Asim18 File:Comparison disk storage.svg Source: http://en.wikipedia.org/w/index.php?title=File:Comparison_disk_storage.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Cmglee Image:Hard drive-en.svg Source: http://en.wikipedia.org/w/index.php?title=File:Hard_drive-en.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: Surachit File:Harddrive-engineerguy.ogv Source: http://en.wikipedia.org/w/index.php?title=File:Harddrive-engineerguy.ogv License: Creative Commons Attribution-Sharealike 3.0 Contributors: Bill Hammack Image:magneticMedia.svg Source: http://en.wikipedia.org/w/index.php?title=File:MagneticMedia.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: MagneticMedia_en.png: derivative work: Zerodamage Image:Aufnahme einzelner Magnetisierungen gespeicherter Bits auf einem Festplatten-Platter..jpg Source: http://en.wikipedia.org/w/index.php?title=File:Aufnahme_einzelner_Magnetisierungen_gespeicherter_Bits_auf_einem_Festplatten-Platter..jpg License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Matesy File:Perpendicular Recording Diagram.svg Source: http://en.wikipedia.org/w/index.php?title=File:Perpendicular_Recording_Diagram.svg License: Public Domain Contributors: TyIzaeL Image:Hard disk dismantled.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Hard_disk_dismantled.jpg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: User:Ed g2s Image:HardDiskAnatomy.jpg Source: http://en.wikipedia.org/w/index.php?title=File:HardDiskAnatomy.jpg License: Public Domain Contributors: Original uploader was Ben pcc at en.wikipedia File:Kopftraeger WD2500JS-00MHB0.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Kopftraeger_WD2500JS-00MHB0.jpg License: Creative Commons Attribution 3.0 Contributors: Suit Image:5.25 inch MFM hard disk drive.JPG Source: 
http://en.wikipedia.org/w/index.php?title=File:5.25_inch_MFM_hard_disk_drive.JPG License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: User Redgrittybrick on en.wikipedia Image:EBSamsung hard disk.JPG Source: http://en.wikipedia.org/w/index.php?title=File:EBSamsung_hard_disk.JPG License: Creative Commons Attribution-Sharealike 3.0 Contributors: Ubcule Image:SixHardDriveFormFactors.jpg Source: http://en.wikipedia.org/w/index.php?title=File:SixHardDriveFormFactors.jpg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Paul R. Potts Image:Pata hdds.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Pata_hdds.jpg License: Creative Commons Zero Contributors: X Wad File:Seagate ST33232A hard disk inner view.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Seagate_ST33232A_hard_disk_inner_view.jpg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Eric Gaba (Sting - fr:Sting) Image:Hard disk head.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Hard_disk_head.jpg License: Creative Commons Attribution 2.0 Contributors: Ahellwig, FlickreviewR, Riflemann, Str4nd, Tothwolf, 2 anonymous edits Image:Rwheadmacro.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Rwheadmacro.jpg License: Creative Commons Attribution 3.0 Contributors: Alexdi Image:Rwheadmicro.JPG Source: http://en.wikipedia.org/w/index.php?title=File:Rwheadmicro.JPG License: GNU Free Documentation License Contributors: Original uploader was Janke at en.wikipedia Image:Toshiba 1 TB External USB Hard Drive.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Toshiba_1_TB_External_USB_Hard_Drive.jpg License: Public Domain Contributors: Editor182, Mindmatrix, Tothwolf Image:External hard drives.jpg Source: http://en.wikipedia.org/w/index.php?title=File:External_hard_drives.jpg License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:TonyTheTiger

File:Diagram of Hard Disk Drive Manufacturer Consolidation.svg Source: http://en.wikipedia.org/w/index.php?title=File:Diagram_of_Hard_Disk_Drive_Manufacturer_Consolidation.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Juventas File:Hdd icon.svg Source: http://en.wikipedia.org/w/index.php?title=File:Hdd_icon.svg License: unknown Contributors: Abu badali, Augiasstallputzer, Derbeth, Ysangkok, 1 anonymous edits File:RAID 0.svg Source: http://en.wikipedia.org/w/index.php?title=File:RAID_0.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: en:User:Cburnett File:DysanRemovableDiskPack.agr.jpg Source: http://en.wikipedia.org/w/index.php?title=File:DysanRemovableDiskPack.agr.jpg License: Creative Commons Attribution-Sharealike 2.5 Contributors: Arnold Reinhold File:Hard Disk head top.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Hard_Disk_head_top.jpg License: Creative Commons Attribution-Sharealike 2.0 Contributors: Jeff Kubina from Columbia, Maryland Image:RAID 0.svg Source: http://en.wikipedia.org/w/index.php?title=File:RAID_0.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: en:User:Cburnett Image:RAID 1.svg Source: http://en.wikipedia.org/w/index.php?title=File:RAID_1.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: en:User:C burnett Image:RAID2 arch.svg Source: http://en.wikipedia.org/w/index.php?title=File:RAID2_arch.svg License: GNU Free Documentation License Contributors: User:knakts. Original uploader was Knakts at en.wikipedia Image:RAID 3.svg Source: http://en.wikipedia.org/w/index.php?title=File:RAID_3.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: en:User:Cburnett Image:RAID 4.svg Source: http://en.wikipedia.org/w/index.php?title=File:RAID_4.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: en:User:Cburnett Image:RAID 5.svg Source: http://en.wikipedia.org/w/index.php?title=File:RAID_5.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: en:User:Cburnett Image:RAID 6.svg Source: http://en.wikipedia.org/w/index.php?title=File:RAID_6.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: en:User:Cburnett Image:IBM360-65-1.corestore.jpg Source: http://en.wikipedia.org/w/index.php?title=File:IBM360-65-1.corestore.jpg License: GNU Free Documentation License Contributors: Original uploader was ArnoldReinhold at en.wikipedia Image:PC-DOS 1.10 screenshot.png Source: http://en.wikipedia.org/w/index.php?title=File:PC-DOS_1.10_screenshot.png License: Public Domain Contributors: Remember the dot at en.wikipedia (PNG) File:Unix history-simple.png Source: http://en.wikipedia.org/w/index.php?title=File:Unix_history-simple.png License: Creative Commons Attribution-Sharealike 3.0 Contributors: Eraserhead1 Image:First Web Server.jpg Source: http://en.wikipedia.org/w/index.php?title=File:First_Web_Server.jpg License: GNU Free Documentation License Contributors: User:Coolcaesar at en.wikipedia File:Ubuntu 12.04 Final Live CD Screenshot.png Source: http://en.wikipedia.org/w/index.php?title=File:Ubuntu_12.04_Final_Live_CD_Screenshot.png License: GNU General Public License Contributors: Ahunt, 1 anonymous edits File:Android 4.0.png Source: http://en.wikipedia.org/w/index.php?title=File:Android_4.0.png License: unknown Contributors: Android Open Source project File:Windows To Go USB Drive.png Source: 
http://en.wikipedia.org/w/index.php?title=File:Windows_To_Go_USB_Drive.png License: Creative Commons Zero Contributors: Adrignola, SF007, 2 anonymous edits Image:Kernel Layout.svg Source: http://en.wikipedia.org/w/index.php?title=File:Kernel_Layout.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Bobbo Image:Priv rings.svg Source: http://en.wikipedia.org/w/index.php?title=File:Priv_rings.svg License: Creative Commons Attribution-Sharealike 2.5 Contributors: Daemorris, Magog the Ogre, Opraco File:Virtual memory.svg Source: http://en.wikipedia.org/w/index.php?title=File:Virtual_memory.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Ehamberg File:Dolphin FileManager.png Source: http://en.wikipedia.org/w/index.php?title=File:Dolphin_FileManager.png License: unknown Contributors: KDE File:Command line.png Source: http://en.wikipedia.org/w/index.php?title=File:Command_line.png License: GNU General Public License Contributors: The GNU Dev team, and the Arch Linux Dev team (for the Pacman command in the example) File:KDE 4.png Source: http://en.wikipedia.org/w/index.php?title=File:KDE_4.png License: GNU General Public License Contributors: KDE Image:Unix history-simple.svg Source: http://en.wikipedia.org/w/index.php?title=File:Unix_history-simple.svg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Eraserhead1, Infinity0, Sav_vas File:100 000-files 5-bytes each -- 400 megs of slack space.png Source: http://en.wikipedia.org/w/index.php?title=File:100_000-files_5-bytes_each_--_400_megs_of_slack_space.png License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:DMahalko File:DirectoryListing1.png Source: http://en.wikipedia.org/w/index.php?title=File:DirectoryListing1.png License: Public Domain Contributors: Loadmaster (David R. Tribble) Image:Scsi logo.svg Source: http://en.wikipedia.org/w/index.php?title=File:Scsi_logo.svg License: Public Domain Contributors: Stassats at ru.wikipedia Previously uploaded to en.wikipedia by Vidarlo in April 2006 ( file log) File:Loudspeaker.svg Source: http://en.wikipedia.org/w/index.php?title=File:Loudspeaker.svg License: Public Domain Contributors: Bayo, Gmaxwell, Gnosygnu, Husky, Iamunknown, Mirithing, Myself488, Nethac DIU, Omegatron, Rocket000, Shanmugamp7, The Evil IP address, Wouterhagens, 22 anonymous edits Image:Scsi-1 gehaeuse.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Scsi-1_gehaeuse.jpg License: Creative Commons Attribution-Sharealike 2.0 Contributors: User Smial on de.wikipedia File:SCSI-terminator-exposed-hdr-0a.jpg Source: http://en.wikipedia.org/w/index.php?title=File:SCSI-terminator-exposed-hdr-0a.jpg License: Creative Commons Attribution-Sharealike 3.0 Contributors: Adamantios Image:Speakerlink.svg Source: http://en.wikipedia.org/w/index.php?title=File:Speakerlink.svg License: Creative Commons Attribution 3.0 Contributors: Woodstone. 
Original uploader was Woodstone at en.wikipedia Image:FC-Topologies.jpg Source: http://en.wikipedia.org/w/index.php?title=File:FC-Topologies.jpg License: Public Domain Contributors: ncarvalho Image:Lc-sc-fiber-connectors.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Lc-sc-fiber-connectors.jpg License: GNU Free Documentation License Contributors: User:Poil Image:ML-QLOGICNFCCONN.JPG Source: http://en.wikipedia.org/w/index.php?title=File:ML-QLOGICNFCCONN.JPG License: Creative Commons Attribution-Sharealike 3.0 Contributors: Melee File:Storage FCoE.png Source: http://en.wikipedia.org/w/index.php?title=File:Storage_FCoE.png License: Creative Commons Attribution 3.0 Contributors: Abisys File:Converged Network Adapter.png Source: http://en.wikipedia.org/w/index.php?title=File:Converged_Network_Adapter.png License: Creative Commons Attribution 3.0 Contributors: Abisys Image:Frame FCoE.png Source: http://en.wikipedia.org/w/index.php?title=File:Frame_FCoE.png License: Creative Commons Attribution 3.0 Contributors: Abisys File:Wikimedia Foundation Servers-8055 01.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Wikimedia_Foundation_Servers-8055_01.jpg License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Victorgrigas File:Cache incoherence write.svg Source: http://en.wikipedia.org/w/index.php?title=File:Cache_incoherence_write.svg License: Public Domain Contributors: Snubcube Image:MC68451 p1160081.jpg Source: http://en.wikipedia.org/w/index.php?title=File:MC68451_p1160081.jpg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: User:David.Monniaux File:MMU principle.png Source: http://en.wikipedia.org/w/index.php?title=File:MMU_principle.png License: Public Domain Contributors: Andre Schieleit Image:VLSI VI475 HMMU chip from an Apple Macintosh II - front.jpg Source: http://en.wikipedia.org/w/index.php?title=File:VLSI_VI475_HMMU_chip_from_an_Apple_Macintosh_II_-_front.jpg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: Gona.eu

License
Creative Commons Attribution-Share Alike 3.0 Unported http://creativecommons.org/licenses/by-sa/3.0/
