
The Evolution of File Systems

Thomas Rivera, Hitachi Data Systems

Craig Harmer, April 2011

SNIA Legal Notice


The material contained in this tutorial is copyrighted by the SNIA.
Member companies and individuals may use this material in presentations and
literature under the following conditions:

Any slide or slides used must be reproduced without modification


The SNIA must be acknowledged as source of any material used in the body of any
document containing material from these presentations.

This presentation is a project of the SNIA Education Committee.


Neither the Author nor the Presenter is an attorney and nothing in this presentation
is intended to be nor should be construed as legal advice or opinion. If you need
legal advice or legal opinion please contact an attorney.
The information presented herein represents the Author's personal opinion and
current understanding of the issues involved. The Author, the Presenter, and the
SNIA do not assume any responsibility or liability for damages arising out of any
reliance on or use of this information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.
The Evolution of File Systems
© 2012 Storage Networking Industry Association. All Rights Reserved.

Abstract
The File Systems Evolution
Over time, additional file systems appeared that focus on specialized requirements
such as data sharing, remote file access, distributed file access, parallel file access,
HPC, archiving, security, etc.

Due to the dramatic growth of unstructured data, files as the basic units for data
containers are morphing into file objects, providing more semantics and feature-rich
capabilities for content processing.
This presentation will:
Categorize and explain the basic principles of currently available file system
architectures (e.g. Local, Shared, SAN, Clustered, Network, Distributed, Parallel, etc.)
Explain technologies like Scale-Out NAS, NAS Aggregation, NAS Virtualization, NAS
Clustering, Global Namespace, Parallel NFS
Review new file system architectures being developed


Related Tutorials

Check out SNIA Tutorial:
Using File Server Protocols for Block-based Storage Workloads

Check out SNIA Tutorial:
Understanding Enterprise NAS

Check out SNIA Tutorial:
pNFS and NFS V4.2

Why File Systems Have Evolved


Scale
Megabytes to Petabytes

Requirements
High availability
Data sharing
Remote access
Performance
Archiving
others

[Figure: file system types arranged along a time axis: Local File System, Shared File System, SAN File System, Cluster File System, Network File System, Distributed File System, Object File System, Parallel File System, ...]
(Not a strict timeline; new capabilities are generally incremental)

Where File Systems Live

[Figure: layered I/O stack, top to bottom]
User Application and Libraries (ls, mv, rm, cp, ...)
System Calls (open(), close(), read(), write(), ioctl(), mmap(), ...)
(User space above this boundary, kernel space below)
VFS
File System
Data Cache*, Segmap Cache (mmap())
Volume Manager
Device Drivers (DMA, Buffers)
Machine dependent code
Hardware

Other kernel components shown beside the file system: Process Management, Memory Mgmt, Scheduler, IPC

* The data cache can be bypassed by using direct I/O
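The system-call layer named on this slide can be exercised directly. The following is a minimal sketch, assuming a POSIX-like platform and using Python's `os` module, whose `os.open`, `os.write`, `os.read`, `os.fsync` and `os.close` functions are thin wrappers over the corresponding system calls. The helper name `roundtrip` and the file name `demo.dat` are illustrative only.

```python
import os
import tempfile


def roundtrip(payload: bytes) -> bytes:
    """Write payload through the system-call layer, then read it back."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "demo.dat")  # hypothetical file name

        # open() + write(): the kernel routes these through the VFS
        # to whichever file system backs the temporary directory.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            written = 0
            while written < len(payload):
                # write() may accept fewer bytes than requested, so loop.
                written += os.write(fd, payload[written:])
            os.fsync(fd)  # ask the kernel to flush cached data to the device
        finally:
            os.close(fd)

        # open() + read(): read back until read() reports end of file.
        fd = os.open(path, os.O_RDONLY)
        try:
            chunks = []
            while True:
                chunk = os.read(fd, 4096)
                if not chunk:
                    break
                chunks.append(chunk)
        finally:
            os.close(fd)
        return b"".join(chunks)
```

Everything below the `os.*` calls (VFS, file system, cache, volume manager, drivers) stays invisible to the application, which is the point of the layering.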

What File Systems Do


(UNIX example)
File locators: inodes
Data locators: pointers
Data: blocks

Inode contents:
File attributes: File Type, Permissions, File Owner, Size, # of links, Last Access, ...
Data locators: direct 0 through direct 9, single indirect, double indirect, triple indirect

[Figure: an inode whose ten direct pointers reference data blocks (labeled 10 through 19), while the single, double and triple indirect pointers reach further data blocks through one, two and three levels of pointer blocks]
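The inode layout in this UNIX example (ten direct pointers, direct 0 to direct 9, followed by single, double and triple indirect pointers) determines how a logical block number is resolved. The sketch below shows one way to compute which pointer chain serves a given block. The 4096-byte block size and 4-byte pointer size are assumptions chosen for illustration, not values from the slide, and `locate_block` is a hypothetical helper name.

```python
def locate_block(logical_block, n_direct=10, block_size=4096, ptr_size=4):
    """Return (pointer kind, index path) for a logical block number.

    n_direct=10 follows the slide; block_size and ptr_size are assumptions.
    The index path lists the slot used at each level, outermost first.
    """
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")

    ptrs_per_block = block_size // ptr_size  # pointers held by one pointer block

    if logical_block < n_direct:
        return ("direct", [logical_block])

    remaining = logical_block - n_direct
    span = ptrs_per_block  # number of data blocks reachable at this depth
    kinds = ("single indirect", "double indirect", "triple indirect")
    for depth, kind in enumerate(kinds, start=1):
        if remaining < span:
            # Decompose the offset into one slot index per indirection level.
            path = []
            for _ in range(depth):
                path.append(remaining % ptrs_per_block)
                remaining //= ptrs_per_block
            return (kind, path[::-1])
        remaining -= span
        span *= ptrs_per_block

    raise ValueError("block lies beyond the triple indirect range")
```

Under these assumptions, blocks 0 to 9 cost no extra lookups, the next 1024 blocks cost one pointer-block read, and so on, which is why small files are cheap to access and very large files need up to three extra reads per block.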

A File System Taxonomy

[Figure: taxonomy tree]
File Systems
    Local File System
    Shared File System
        SAN File System
        Cluster File System
    Network File System
        Distributed File System
            Distributed Parallel File System

Local File System

[Figure: a single server containing an application and its file system]

File system is co-located in the server with application

Local File System


Separate islands of data
Limitation: no data sharing

[Figure: four servers, each running an application on its own file system]

One Way to Share Data: Scale-Up

Vertical scaling

Another Way to Share Data: Scale-Out

Horizontal scaling

[Figure: multiple clients attached through a storage network to shared data]

Shared Device:
A multi-LUN device shared among clients
Each client has exclusive access to a dedicated LUN

Shared Data:
A physical device shared among clients
Clients access LUNs concurrently

Data Access with Shared/Global File System

Separate logical and physical placement
Metadata server (MDS)
File access is a three-step transaction:
Step 1: Request access (client asks the metadata server)
Step 2: Metadata delivery (metadata server answers the client)
Step 3: Data access (client accesses the data)
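The three-step transaction described here (request access, metadata delivery, data access) can be sketched as a small in-memory simulation. All class and function names below (`Extent`, `MetadataServer`, `SharedStorage`, `read_file`) are hypothetical and exist only for illustration; real shared file systems add locking, leases and failure handling that this sketch omits.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """Physical placement of one piece of a file (illustrative)."""
    device: str
    offset: int
    length: int


class MetadataServer:
    """Keeps the logical-to-physical mapping, but never touches file data."""

    def __init__(self):
        self._layouts = {}
        self.grants = set()

    def create(self, path, extents):
        self._layouts[path] = list(extents)

    def request_access(self, client, path):
        # Step 1 arrives here; the return value is Step 2 (metadata delivery).
        if path not in self._layouts:
            raise FileNotFoundError(path)
        self.grants.add((client, path))
        return list(self._layouts[path])


class SharedStorage:
    """Devices that every client can reach over the storage network."""

    def __init__(self, devices):
        self._devices = devices  # device name -> bytes-like contents

    def read(self, extent):
        start = extent.offset
        return bytes(self._devices[extent.device][start:start + extent.length])


def read_file(client, path, mds, storage):
    extents = mds.request_access(client, path)          # Steps 1 and 2
    return b"".join(storage.read(e) for e in extents)   # Step 3: direct data access
```

The design point the sketch tries to capture is that file data never flows through the metadata server: once the client holds the layout, it reads the shared devices directly.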

Shared/Global File System


Asymmetric (SAN File System)

[Figure: five application server nodes on a client network, each running an application (e.g. Web Server); one node hosts the active metadata server, one the passive metadata server, and three act as data servers; all nodes attach through a storage network to shared data]

One active metadata server
Typically homogeneous (scaling limited by metadata server capacity)
Inter-node distance limited by storage network capability

Shared/Global File System


Symmetric (Cluster File System)

[Figure: five application server nodes on a client network, each running an application (e.g. Web Server) together with an active metadata server and a data server; all nodes attach through a storage network to shared data]

Metadata server in each node
Typically homogeneous (scaling limited by internal communication, e.g., distributed locking)
Inter-node distance limited by storage network capability

Network File Systems (aka Proxy File Systems)

Local File System
[Figure: an application using a file system on the same server]

Network File System
[Figure: four applications, each with a file system client, connected through a network protocol* to one file system server]

* e.g. NFS, CIFS, AFP, WebDAV, FTP, HTTP, ...

Enables sharing of files located on a file server among one or more client
computers using a network protocol

Network File System Stack

(Example: Sun's NFS)

[Figure: client and server protocol stacks connected by a LAN]
Client side: Application, NFS Client, RPC/XDR, TCP/IP, Ethernet NIC
Server side: Ethernet NIC, TCP/IP, RPC/XDR, NFS Server, File System, Volume Mgr, SCSI Driver, SCSI HBA, SAN, SCSI Port, Data

Wide Area Network File Systems


Consolidation eases
Management
Administration
Cost
Compliance
Global file sharing and collaboration
Location consolidation and optimization

[Figure: NFS client stack (Application, NFS Client, RPC/XDR, TCP/IP, Ethernet NIC) connected over a WAN to the NFS server stack (Ethernet NIC, TCP/IP, RPC/XDR, NFS Server, File System, Volume Mgr, SCSI Driver, SCSI HBA, SAN, SCSI Port, Data)]

But: WAN performance is low compared to LAN/SAN performance

Improving Wide Area File System Performance

Application-specific optimizations: email, document management, SQL, ...
Protocol-specific optimizations: HTTP, NFS, CIFS, WebDAV, FTP, TCP/IP, ...
Transport acceleration: TCP accelerators
Intelligent caching: read-ahead, deferred write, coherency, ...
Data compression: algorithms, file-aware differencing, data aggregation,
I/O clustering, chunk based de-duplication, cross-protocol data reduction, ...

[Figure: several NFS/CIFS clients (Application, NFS/CIFS Client, RPC/XDR, TCP/IP, Ethernet NIC) on a LAN; a compression engine on each side of the WAN link; the NFS server stack (RPC/XDR, NFS Server, File System, Volume Mgr, SCSI Driver, SCSI HBA, SAN, SCSI Port, Data) on the remote LAN]
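Of the data reduction techniques listed on this slide, chunk based de-duplication is easy to illustrate. The sketch below is a simplified example, not a description of any particular WAN optimization product: it assumes fixed-size 4096-byte chunks and SHA-256 fingerprints, both of which are illustrative choices, and the function names are hypothetical.

```python
import hashlib


def dedup_chunks(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep each distinct chunk once.

    Returns (store, recipe): store maps fingerprint -> chunk bytes, and
    recipe is the ordered list of fingerprints needed to rebuild the data.
    """
    store = {}
    recipe = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # a repeated chunk is stored only once
        recipe.append(fingerprint)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original byte stream from the chunk store."""
    return b"".join(store[fingerprint] for fingerprint in recipe)
```

Across a WAN, only the recipe and the chunks the far side has not yet seen would need to travel, which is where the bandwidth saving comes from.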

Distributed File System (DFS)


[Figure: an application with a file system client sees a single file system whose root / contains /a, /b and /c; over a network protocol, three file system servers each hold one of /a, /b and /c]

A network file system with files distributed among multiple file servers
Not a parallel file system
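The client view on this slide (one tree containing /a, /b and /c, each held by a different file system server) amounts to a lookup from path prefix to server. A minimal sketch of that lookup follows; the `Namespace` class and the server names used with it are hypothetical, and real distributed file systems resolve referrals through their own protocols.

```python
class Namespace:
    """Maps directory prefixes of a single logical tree to file servers."""

    def __init__(self, mounts):
        self._mounts = dict(mounts)  # prefix -> server name

    def resolve(self, path):
        """Return (server, path) for the longest prefix that contains path."""
        best = None
        for prefix in self._mounts:
            base = prefix.rstrip("/")
            # Match whole path components only, so "/ab" does not match "/a".
            if path == prefix or path.startswith(base + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            raise LookupError(f"no file server exports {path}")
        return self._mounts[best], path
```

Each whole file lives on exactly one server in this model, which matches the slide's remark that this is not a parallel file system.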

Distributed Parallel File System


Segments of files distributed across storage nodes
Enables parallel I/O to individual files (aka file striping)
[Figure: four clients access one file over a network protocol; the segments of the file are spread across five file servers]

Aggregation of Storage Servers
RAIN + RAID (aka Network RAID)
Global Namespace
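File striping, as named on this slide, can be sketched as simple arithmetic that maps a byte offset in the file to a server and an offset on that server. The round-robin layout, the parameter names and the helper names below are assumptions for illustration; actual parallel file systems use their own layout policies.

```python
def stripe_location(offset, stripe_size, n_servers):
    """Map a file byte offset to (server index, byte offset on that server).

    Assumes stripes are placed round-robin across the servers.
    """
    stripe_index = offset // stripe_size
    server = stripe_index % n_servers
    local_stripe = stripe_index // n_servers  # stripes this server holds before this one
    return server, local_stripe * stripe_size + offset % stripe_size


def split_request(offset, length, stripe_size, n_servers):
    """Break one read or write into per-server pieces that can run in parallel.

    Returns a list of (server index, server offset, byte count) tuples.
    """
    pieces = []
    end = offset + length
    while offset < end:
        server, local_offset = stripe_location(offset, stripe_size, n_servers)
        # A piece never crosses a stripe boundary.
        count = min(stripe_size - offset % stripe_size, end - offset)
        pieces.append((server, local_offset, count))
        offset += count
    return pieces
```

A large sequential request therefore fans out over several servers at once, which is the source of the parallel I/O benefit for individual files.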

NAS Aggregation

In-Band Solution
Sometimes called NAS Router

[Figure: an IP network leads to a NAS router, which presents a global namespace over three file servers, each with its own data on a SAN]

NAS Virtualization - Out-of-Band


[Figure: four clients and a metadata server (MDS) on an IP network in front of four file servers under a global namespace; the file servers hold distributed files (File_A through File_H), striped files (File_K_1 through File_K_4) and replicated files (additional copies of File_A, File_B and File_C)]

Individual files / file segments pinned to file servers
Files can be distributed and/or replicated for parallel access
Files can be striped for intra-file parallel access
Clients must locate the right file server
e.g. NFSv4.1 (pNFS), Microsoft's DFS

NAS Virtualization - NFSv4.1 pNFS

In-Band NAS:
[Figure: application servers running NFSv4 clients reach a NAS appliance over IP; the appliance accesses the data on a SAN]

Out-of-Band NAS:
[Figure: application servers running NFSv4.1 clients with pNFS reach a NAS appliance with NFSv4.1 pNFS extensions over IP, and reach the data on the SAN through storage protocols]

Storage Protocols:
Block: FCP, iSCSI, SRP, SAS
File: NFSv4.1
Object: OSD

Data path decoupled from control and metadata path

Toward Storage Grids via NAS

Classic Filer
[Figure: a client reaches a single filer over IP; the filer offers NFS and CIFS data services on top of a local file system]

Two variants:
[Figure: clients connect to a VIP address in front of clustered data services (NFS, CIFS, HTTP, FTP, WebDAV) running on a cluster (parallel) file system]
All nodes serve all files...
[Figure: clients connect to a VIP address in front of clustered data services (NFS, CIFS, HTTP, FTP, WebDAV)]
Each file pinned to a single server...

Cloud: The New Grid

NAS Cluster is effectively a storage cloud

[Figure: groups of clients surrounding a storage cloud made up of file servers]

Data Segmentation

[Figure: two-by-two matrix of data structure (Unstructured, Structured) against data mutability (Dynamic Data, Fixed Data)]
Unstructured, Dynamic Data: Media production, eCAD, mCAD, Office docs
Unstructured, Fixed Data: Media-archive, DAM, Broadcast, Medical imaging, MediaInternet
Structured, Dynamic Data: Transactional systems, ERP, CRM
Structured, Fixed Data: BI, Data warehousing, Scientific, Transaction archive

The New Reality of Data Segmentation

[Figure: the data segmentation matrix with a Semi-Structured* region added between Unstructured and Structured]
Unstructured, Dynamic Data: Media production, eCAD, mCAD, Office docs
Unstructured, Fixed Data: Media-archive, DAM, Broadcast, medical imaging, MediaInternet
Semi-Structured*
Structured, Dynamic Data: Transactional systems, ERP, CRM
Structured, Fixed Data: BI, data warehousing, scientific, transaction archive

*Semi-Structured Data contains dynamic metadata defined by users and/or applications

Traditional Files

Metadata

Owner, permissions, type, last modification, ...

Data


Semi-Structured Data

Object ID
Methods: e.g., Encryption
Policies: e.g., Replication
Attributes: User/application defined
Metadata: Owner, permissions, type, last modification, ...
Data

The File Object Model


Store: Data goes in, OID comes back
Retrieve: OID goes in, Data comes back

Object contents:
Object ID
Methods: e.g., Encryption
Policies: e.g., Replication
Attributes: User/application defined
Metadata: Owner, permissions, type, last modification, ...
Data

[Figure: the Metadata and Data parts of an object correspond to the inode and data blocks of a traditional file; a table maps each Name to an OID, and each OID identifies an Object]
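The store and retrieve operations of the file object model can be sketched with a small in-memory class. This is an illustrative assumption, not the interface of any real object storage product: `FileObject`, `ObjectStore`, the use of random UUIDs as object IDs and the `size` metadata key are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FileObject:
    """One object: an ID plus data and the descriptive parts named on the slide."""
    oid: str
    data: bytes
    metadata: dict = field(default_factory=dict)    # system-maintained, e.g. size
    attributes: dict = field(default_factory=dict)  # user/application defined
    policies: dict = field(default_factory=dict)    # e.g. replication settings


class ObjectStore:
    def __init__(self):
        self._objects = {}  # OID -> FileObject
        self._names = {}    # Name -> OID

    def store(self, data, name=None, **attributes):
        """Store data and return the new object's OID."""
        oid = uuid.uuid4().hex  # assumption: IDs are random UUIDs
        self._objects[oid] = FileObject(
            oid, data, metadata={"size": len(data)}, attributes=attributes
        )
        if name is not None:
            self._names[name] = oid
        return oid

    def retrieve(self, oid):
        """Return the data stored under an OID."""
        return self._objects[oid].data

    def lookup(self, name):
        """Translate a human-readable name into an OID."""
        return self._names[name]

    def describe(self, oid):
        """Return the full object, including attributes and policies."""
        return self._objects[oid]
```

Unlike a path in a directory tree, the OID says nothing about where the object lives, so the store is free to place, move or replicate it according to its policies.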

Managing File Objects


File objects can be managed like records in a relational database with user
data as Binary Large Objects (BLOBs)

[Figure: many file objects, each made up of Object ID, Methods, Policies, Attributes, Metadata and Data, mapped into a database schema]

Managing File Objects (Cont.)

[Figure: file objects, each made up of Object ID, Methods, Policies, Attributes, Metadata and Data]

Indexes
Constraints/relationships
Object search
Full text search
Join operations
Virtual views
SQL-like requests
Cursors
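Several of the capabilities listed here (indexes, relationships, object search, join operations, virtual views, SQL-like requests, cursors) can be illustrated with file objects held as rows in a relational database, with user data stored as BLOBs. The sketch below uses Python's standard `sqlite3` module. The schema, the table and column names, and the `modality` attribute are hypothetical choices made for this example.

```python
import sqlite3


def build_catalog():
    """Create an in-memory catalog with objects, attributes, an index and a view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE objects (
            oid   TEXT PRIMARY KEY,
            owner TEXT NOT NULL,
            data  BLOB NOT NULL                      -- user data as a BLOB
        );
        CREATE TABLE attributes (
            oid        TEXT NOT NULL REFERENCES objects(oid),  -- relationship
            attr_name  TEXT NOT NULL,
            attr_value TEXT NOT NULL
        );
        CREATE INDEX idx_attributes ON attributes(attr_name, attr_value);
        CREATE VIEW ct_scans AS                      -- virtual view
            SELECT o.oid, o.owner
            FROM objects o JOIN attributes a ON a.oid = o.oid
            WHERE a.attr_name = 'modality' AND a.attr_value = 'CT';
    """)
    return conn


def store(conn, oid, owner, data, **attrs):
    """Insert one object and its user-defined attributes."""
    conn.execute("INSERT INTO objects VALUES (?, ?, ?)", (oid, owner, data))
    conn.executemany(
        "INSERT INTO attributes VALUES (?, ?, ?)",
        [(oid, name, value) for name, value in attrs.items()],
    )


def find(conn, name, value):
    """Object search by attribute, using a join and iterating over a cursor."""
    cursor = conn.execute(
        "SELECT o.oid FROM objects o JOIN attributes a ON a.oid = o.oid "
        "WHERE a.attr_name = ? AND a.attr_value = ? ORDER BY o.oid",
        (name, value),
    )
    return [row[0] for row in cursor]
```

Full text search is left out of the sketch because it depends on optional database extensions.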

Data Serving Hierarchy


3 Levels of Abstraction

Application may interface with the storage subsystem in any of three layers:
Block: highest performance and very little metadata
File: high performance and some metadata
Object: medium performance and rich metadata

[Figure: Data Server Platform with the Application above the Object, File and Block layers; Object maps many to one onto File, and File maps many to one onto Block]

Attribution & Feedback


The SNIA Education Committee would like to thank the
following individuals for their contributions to this Tutorial.
Authorship History
Original Author: Christian Bandulet
Updates:
Thomas Rivera, September 2012
Paul Massiglia, Spring 2012
Craig Harmer, April 2011

Additional Contributors
Craig Harmer
Paul Massiglia
Joseph White
Thomas Rivera
Christian Bandulet

Please send any questions or comments regarding this SNIA Tutorial to
tracktutorials@snia.org

Appendix

Reference Material


www.wikipedia.org
ADFS - Acorn's Advanced Disc Filing System, successor to DFS
BFS - the Be File System used on BeOS
EFS - Encrypted filesystem, an extension of NTFS
EFS (IRIX) - an older block filing system under IRIX
Ext - Extended filesystem, designed for Linux systems
Ext2 - Second extended filesystem, designed for Linux systems
Ext3 - Name for the journalled form of ext2
FAT - Used on DOS and Microsoft Windows, 12, 16 and 32 bit table depths
FFS (Amiga) - Fast File System, used on Amiga systems. This FS has evolved over time. Now counts FFS1, FFS Intl, FFS DCache, FFS2
FFS - Fast File System, used on *BSD systems
Fossil - Plan 9 from Bell Labs snapshot archival file system
Files-11 - OpenVMS filesystem
GCR - Group Code Recording, a floppy disk data encoding format used by the Apple II and Commodore Business Machines in the 5.25" disk drives for their 8-bit computers
HFS - Hierarchical File System, used on older Mac OS systems

www.wikipedia.org (cont'd)
HFS Plus - Updated version of HFS used on newer Mac OS systems
HPFS - High Performance Filesystem, used on OS/2
ISO 9660 - Used on CD-ROM and DVD-ROM discs (Rock Ridge and Joliet are extensions to this)
JFS - IBM Journaling Filesystem, provided in Linux, OS/2, and AIX
LFS - 4.4BSD implementation of a log-structured file system
MFS - Macintosh File System, used on early Mac OS systems
Minix file system - Used on Minix systems
NTFS - Used on Windows NT, Windows 2000, Windows XP and Windows Server 2003 systems
NSS - Novell Storage Services. This is a new 64-bit journaling filesystem using a balanced tree algorithm. Used in NetWare versions 5.0-up and recently ported to Linux.
OFS - Old File System, on Amiga. Nice for floppies, but fairly useless on hard drives
PFS (and PFS2, PFS3, etc.) - Technically interesting filesystem available for the Amiga, performs very well under a lot of circumstances. Very simple and elegant

www.wikipedia.org (cont'd)
ReiserFS - Filesystem that uses journaling
Reiser4 - Filesystem that uses journaling, newest version of ReiserFS
SFS - Smart File System, journaled file system available for the Amiga platforms
UDF - Packet based filesystem for WORM/RW media such as CD-RW and DVD
UFS - Unix Filesystem, used on older BSD systems
UFS2 - Unix Filesystem, used on newer BSD systems
UMSDOS - FAT filesystem extended to store permissions and metadata, used for Linux
VxFS - Veritas file system, first commercial journaling file system; HP-UX, Solaris, Linux, AIX
VSAM
WAFL - Used on Network Appliance systems
XFS - Used on SGI IRIX and Linux systems
ZFS - Used on Solaris
SAM QFS (Oracle)

www.wikipedia.org (cont'd)
9P - The Plan 9 and Inferno distributed file system
AFS (Andrew File System)
AppleShare
Arla (file system)
Coda
CXFS (Clustered XFS) - a distributed networked file system designed by Silicon Graphics (SGI) specifically to be used in a SAN
Distributed File System (DCE)
Distributed File System (Microsoft)
Freenet
Global File System (GFS)
Google File System (GFS)
IBRIX Fusion
InterMezzo
Isilon OneFS
Lustre (Oracle)

www.wikipedia.org (cont'd)
NFS
OpenAFS
Server message block (SMB) (aka Common Internet File System (CIFS) or Samba file system)
Xsan (a storage area network (SAN) filesystem from Apple Computer, Inc.)
archfs (archive)
cdfs (reading and writing of CDs)
cfs (caching)
Davfs2 (WebDAV)
Devfs
ftpfs (ftp access)
fuse (filesystem in userspace, like lufs but better maintained)
GPFS - an IBM cluster file system
JFFS/JFFS2 (filesystems designed specifically for flash devices)
LUFS (replace ftpfs, ftp ssh ... access)
nntpfs (netnews)
OCFS (Oracle Cluster File System)