presented by
Sarp Oral, Ph.D.
NCCS
CCS Scaling Workshop
August 1st, 2007, Oak Ridge National Laboratory
U.S. Department of Energy
Outline
• What is Lustre
• Lustre Architecture
• Lustre is
− POSIX compliant
− A parallel file system
• Lustre provides
− High scalability
− High performance
− A single global name space
• MDS
− Manages the name space, directory and file operations
− Stores file system metadata
− Extended attributes point to objects on OSTs
• OSS
− Manages the OSTs
• OST
− Manages underlying block devices
− Stores file data stripes
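The division of labor above can be sketched in a few lines of Python: an MDS-like class holds only metadata (which object on which OST belongs to a file), while OST-like objects hold the actual data stripes. All class, field, and path names here are illustrative, not Lustre's real data structures.

```python
# Toy model of the MDS/OSS/OST split: metadata on the MDS, data on the OSTs.

class OST:
    """Stores file data stripes as objects, keyed by object ID."""
    def __init__(self, name):
        self.name = name
        self.objects = {}          # object_id -> bytes

class MDS:
    """Stores only metadata: for each path, which objects live on which OSTs."""
    def __init__(self, osts):
        self.osts = osts
        self.namespace = {}        # path -> [(ost_index, object_id), ...]

    def create(self, path, stripe_count):
        # Extended-attribute analogue: the layout points at objects on OSTs.
        layout = [(i % len(self.osts), f"{path}:{i}") for i in range(stripe_count)]
        self.namespace[path] = layout

def write(mds, path, data, stripe_size):
    """Client-side write: look up the layout once, then send stripes to OSTs."""
    layout = mds.namespace[path]
    for n, start in enumerate(range(0, len(data), stripe_size)):
        ost_idx, obj_id = layout[n % len(layout)]
        store = mds.osts[ost_idx].objects
        store[obj_id] = store.get(obj_id, b"") + data[start:start + stripe_size]

osts = [OST(f"ost{i}") for i in range(4)]
mds = MDS(osts)
mds.create("/scratch/a.dat", stripe_count=4)
write(mds, "/scratch/a.dat", b"x" * 10, stripe_size=4)
```

Note that the MDS never touches file data: after the layout lookup, the client talks to the OSTs directly, which is what lets metadata and data traffic scale independently.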
[Figure: Lustre architecture — clients send metadata operations to the MDS, and block I/O and file locking requests to the OSSes; each OSS serves multiple OSTs, each sitting on top of a block device.]
• Failover
− Active-passive pairs for MDS and OSS
− Works fine on all *NIX-based systems except Catamount
• Failover is not supported with the current UNICOS
• Failover will be supported with CNL
• Simple tips
− Over-striping might be bad
• Chunks written to each OST become too small
− Under-utilizes the OSTs and the network
− Under-striping might be bad
• Too much stress on each OST
− Contention
− Command line
• “lfs setstripe” to set the stripe pattern
• “lfs getstripe” to query the stripe pattern
[Figure: example stripe layouts — files A, B, and C broken into numbered chunks and striped round-robin across the OSTs, with a different stripe pattern shown on each side.]
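The striping trade-off above can be made concrete with a toy calculation: given a stripe size and a stripe count, which OST in a file's layout holds a given byte offset. This is the round-robin placement described above; the function name is illustrative.

```python
# Which OST (by index within the file's layout) holds a given byte offset,
# under round-robin striping.

def ost_for_offset(offset, stripe_size, stripe_count):
    stripe_number = offset // stripe_size      # which chunk the byte falls in
    return stripe_number % stripe_count        # chunks rotate across the OSTs

# With a 1 MB stripe size over 4 OSTs: the first 1 MB lands on OST 0,
# the next 1 MB on OST 1, and byte 4 MB wraps back around to OST 0.
MB = 1 << 20
print(ost_for_offset(0, MB, 4))        # 0
print(ost_for_offset(1 * MB, MB, 4))   # 1
print(ost_for_offset(4 * MB, MB, 4))   # 0
```

A larger stripe count spreads the same file over more OSTs (more bandwidth, smaller chunks each); a smaller count concentrates the load on fewer OSTs, which is the over- vs. under-striping tension in the tips above.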
− Exceptions
• Flock/lockf is still not supported
• Security
− Comparable to NFS today
− Kerberos capabilities on the way
− Encrypted Lustre file system is under development
− DDN 9550s
• 18 racks/couplets
• Write-back cache is 1MB on each controller
• 36 TB per couplet w/ Fibre Channel drives
• Each LUN has a 2 TB capacity and a 4 KB block size
− Exact configuration details are to be determined
− Open issues
• A mechanism to transfer files between the Catamount side and the CNL side
− Can be done by NCCS
− Can be done by users
− Phase 0
• Will be in production soon over Jaguar
• 20 OSS, 80 OSTs, 4 OST/OSS, 10GE & 4xSDR IB
• 10 couplets of DDN 8500s, FC 2 Gb direct links w/ failover configured
• Spider provides
+ Ease of data transfer between clusters
+ Ability to analyze data offline
+ On-the-fly data analysis/visualization capability
+ Ease of diagnostics/decoupling
+ Lower acquisition/expansion cost
[Figure: Spider connectivity — systems reach Spider over an InfiniBand network and a TCP network (for legacy systems), with backend disks behind Spider.]