
File System Implementation

Chapter 11

File-System Structure
A file system poses two distinct design problems
Defining how the file system should look to the user
Creating algorithms and data structures to map the logical file system onto the physical device

File system resides on secondary storage (disks)
File system is organized into layers
Each level uses the lower level to create new features for use by higher levels

File control block (FCB): storage structure consisting of information about a file

A Typical File Control Block

File-System Implementation
Several on-disk and in-memory structures are used to implement a file system
On-disk structures
A boot control block
A partition control block (superblock)
A directory structure
File control blocks

In-memory structures
An in-memory partition table
An in-memory directory structure
The system-wide open-file table
The per-process open-file table

Creating a File
To create a new file the application program calls the logical file system (which knows the format of the directory structures)
Allocates a new FCB
Reads the appropriate directory into memory
Updates the directory with the new file name and FCB
Writes the directory back to disk
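The creation steps above can be sketched in a few lines of Python. This is a toy model, not a real file-system API: the `FCB` and `Disk` classes, their fields, and `create_file` are hypothetical names introduced only for illustration, and a dictionary copy stands in for reading a directory from disk.

```python
from dataclasses import dataclass, field


@dataclass
class FCB:
    """Hypothetical file control block holding only an id and a size."""
    fcb_id: int
    size: int = 0


@dataclass
class Disk:
    """Toy disk: each directory maps file names to FCB ids."""
    directories: dict = field(default_factory=dict)
    fcbs: dict = field(default_factory=dict)
    next_fcb_id: int = 0


def create_file(disk, dir_path, name):
    # Read the appropriate directory into memory (a copy stands in for the read).
    directory = dict(disk.directories[dir_path])
    # Check for a name clash before allocating, so a failed create leaks no FCB.
    if name in directory:
        raise FileExistsError(name)
    # Allocate a new FCB.
    fcb = FCB(fcb_id=disk.next_fcb_id)
    disk.next_fcb_id += 1
    disk.fcbs[fcb.fcb_id] = fcb
    # Update the directory with the new file name and FCB.
    directory[name] = fcb.fcb_id
    # Write the directory back to disk.
    disk.directories[dir_path] = directory
    return fcb


disk = Disk(directories={"/": {}})
fcb = create_file(disk, "/", "notes.txt")
```

The sketch reads the directory before allocating the FCB, which differs slightly from the order listed above; the reordering only serves the duplicate-name check.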

Some operating systems (UNIX) treat a directory exactly as a file; others (Windows) implement separate system calls for files and directories and treat directories as entities separate from files.

Opening a File
Before a file can be used for I/O operations it must first be opened
Open call passes the file name to the file system
The directory structure (usually cached) is searched for the given file name
Once the file is found, the FCB is copied into the system-wide open-file table in memory
An entry is made in the per-process open-file table, with a pointer to the system-wide open-file table
The open call returns a pointer to the appropriate entry in the per-process open-file table; all file operations are performed via this pointer (file descriptor in UNIX, file handle in Windows)

In-Memory File System Structures

(a) refers to opening a file. (b) refers to reading a file.

Closing a File
After all I/O operations are complete a file should be closed
The per-process table entry is removed and the system-wide entry's open count is decremented
When all users that have opened the file close it, the updated file information is copied back to the disk-based directory structure and the system-wide open-file table entry is removed
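The interaction between the two open-file tables during open and close can be illustrated with a small model. The class `OpenFileTables`, its dictionary layout, and the lowest-free-number descriptor policy are assumptions made for this sketch; real kernels use different structures.

```python
class OpenFileTables:
    """Toy model of the system-wide and per-process open-file tables."""

    def __init__(self, directory):
        self.directory = directory   # name -> FCB dict; stands in for the on-disk directory
        self.system_wide = {}        # name -> {"fcb": copy of FCB, "open_count": int}
        self.per_process = {}        # pid -> {fd: name}

    def open(self, pid, name):
        fcb = self.directory[name]   # search the directory for the file name
        entry = self.system_wide.get(name)
        if entry is None:
            # First open: copy the FCB into the system-wide table.
            entry = {"fcb": dict(fcb), "open_count": 0}
            self.system_wide[name] = entry
        entry["open_count"] += 1
        # Add a per-process entry that refers to the system-wide entry.
        table = self.per_process.setdefault(pid, {})
        fd = 0
        while fd in table:           # pick the lowest unused descriptor
            fd += 1
        table[fd] = name
        return fd

    def close(self, pid, fd):
        # Remove the per-process entry and decrement the shared open count.
        name = self.per_process[pid].pop(fd)
        entry = self.system_wide[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last close: copy updated information back and drop the entry.
            self.directory[name] = entry["fcb"]
            del self.system_wide[name]


tables = OpenFileTables({"a.txt": {"size": 10}})
fd1 = tables.open(1, "a.txt")
fd2 = tables.open(2, "a.txt")
count_while_shared = tables.system_wide["a.txt"]["open_count"]
tables.close(1, fd1)
still_open = "a.txt" in tables.system_wide
tables.close(2, fd2)
```

Two processes opening the same file share one system-wide entry, which disappears only after the second close.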

Some systems use a caching scheme: all information about an open file, except for its actual data blocks, is kept in memory

Disk Partition and Mounting


The layout of a disk can have many variations, depending on the operating system
A disk can be divided into multiple partitions, or a partition can span multiple disks
Raw: containing no file system
Cooked: containing a file system
Boot information can be stored in a separate partition

The root partition, which contains the operating-system kernel, is mounted at boot time (other partitions can be mounted later)
The operating system notes in its mount table that a file system is mounted, along with the type of file system
Windows: each partition is mounted under a separate drive letter
UNIX: file systems can be mounted at any directory

Directory Implementation
Linear List
Uses a linear list of file names with pointers to the data blocks
Requires a linear search to find a particular entry
Simple to program but time-consuming to execute

Hash Table
Uses a linear list to store directory entries but uses hashing to find the entry
Hashing can greatly decrease the directory search time
Must handle collisions: situations where two file names hash to the same location
Major difficulties with a hash table are its fixed size and the dependence on the hash function
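The difference between the two directory implementations can be sketched as follows. Both classes are hypothetical; the hashed version resolves collisions by chaining entries in a bucket, which is only one possible collision strategy, and the byte-sum hash is deliberately weak so that a collision is easy to show.

```python
class LinearDirectory:
    """Directory as a linear list: lookup scans every entry."""

    def __init__(self):
        self.entries = []                      # list of (name, first_block)

    def add(self, name, block):
        self.entries.append((name, block))

    def lookup(self, name):
        for entry_name, block in self.entries:  # O(n) linear search
            if entry_name == name:
                return block
        return None


class HashedDirectory:
    """Directory with a fixed-size hash table; collisions are chained."""

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]  # fixed size, as noted above

    def _index(self, name):
        # Toy hash function (assumption): sum of the name's bytes.
        return sum(name.encode()) % len(self.buckets)

    def add(self, name, block):
        self.buckets[self._index(name)].append((name, block))

    def lookup(self, name):
        # Only the entries in one bucket are scanned.
        for entry_name, block in self.buckets[self._index(name)]:
            if entry_name == name:
                return block
        return None


hashed = HashedDirectory()
hashed.add("ab", 3)
hashed.add("ba", 7)   # same byte sum as "ab", so both land in one bucket
```

Because "ab" and "ba" hash to the same bucket, `lookup` still has to compare names inside the bucket, which is the collision handling the slide refers to.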

Allocation Methods
An allocation method refers to how disk blocks are allocated for files
Three major methods of allocating disk space are in wide use
Contiguous allocation
Linked allocation
Indexed allocation

Each method has its advantages and disadvantages
Some systems support all three, but more commonly a system will use one particular method

Contiguous-Allocation
Requires each file to occupy a set of contiguous blocks on the disk
Disk addresses define a linear ordering on the disk
Simple: only the starting location (block #) and length (number of blocks) are required
If a file is n blocks long and starts at location b, it occupies blocks b, b+1, b+2, ..., b+n-1
The directory entry for each file indicates the address of the starting block and the length of the area allocated for this file

Both sequential and direct access are supported
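Direct access under contiguous allocation reduces to one addition, as the following sketch shows. The function name and the example values (start block 14, length 3) are illustrative assumptions.

```python
def contiguous_block(start, length, logical_block):
    """Map a logical block number to a disk block for a contiguous file."""
    if not 0 <= logical_block < length:
        raise IndexError("logical block outside the file")
    return start + logical_block   # block b + i for the i-th logical block


# A file starting at block 14 that is 3 blocks long.
blocks = [contiguous_block(14, 3, i) for i in range(3)]
```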

Contiguous Allocation of Disk Space

Contiguous Allocation (Cont.)


Contiguous allocation has some problems
Dynamic storage-allocation
How to satisfy a request of size n from a list of free blocks

External fragmentation
Free space is broken into chunks and the largest chunk is insufficient for a request

Determining how much space is needed for a file


Allocate too little and the file cannot be extended
Allocate too much and space is wasted

File cannot grow
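The dynamic storage-allocation and external-fragmentation problems can be made concrete with a first-fit allocator over a list of free holes. First fit is one common strategy chosen here for illustration; the slides do not name a specific one, and `first_fit` and the hole list are hypothetical.

```python
def first_fit(free_holes, n):
    """Allocate n contiguous blocks from the first hole that is large enough.

    free_holes is a list of (start, length) tuples and is updated in place.
    Returns the starting block, or None if no single hole can hold n blocks.
    """
    for i, (start, length) in enumerate(free_holes):
        if length >= n:
            if length == n:
                del free_holes[i]                        # hole used up entirely
            else:
                free_holes[i] = (start + n, length - n)  # shrink the hole
            return start
    return None


holes = [(0, 3), (10, 4), (20, 2)]
first = first_fit(holes, 4)    # fits exactly in the hole at block 10
second = first_fit(holes, 5)   # fails: 5 blocks are free, but not contiguously
free_total = sum(length for _, length in holes)
```

The second request fails even though the total free space equals the request size, which is exactly the external-fragmentation case.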

Extent Based Systems


To minimize the drawbacks of contiguous file allocation, some file systems (e.g., Veritas File System) use a modified scheme
A contiguous chunk of space is allocated initially and when the amount is not large enough, another chunk of contiguous space (extent) is added to the initial allocation

Extent-based file systems allocate disk blocks in extents
Internal fragmentation can still be a problem if the extents are too large
External fragmentation can be a problem as extents of various sizes are allocated and de-allocated
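With extents, a file is described by a short list of contiguous runs rather than a single one. The sketch below assumes each extent is a hypothetical (start, length) pair and walks the list to find a logical block.

```python
def extent_block(extents, logical_block):
    """Map a logical block to a disk block through a list of (start, length) extents."""
    if logical_block < 0:
        raise IndexError("negative logical block")
    for start, length in extents:
        if logical_block < length:
            return start + logical_block   # block lies inside this extent
        logical_block -= length            # skip past this extent
    raise IndexError("logical block outside the file")


# Initial allocation of 4 blocks at 100, later extended by 2 blocks at 250.
file_extents = [(100, 4), (250, 2)]
mapped = [extent_block(file_extents, i) for i in range(6)]
```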

Linked Allocation
Solves all the problems of contiguous allocation
Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk

(Figure: a disk block and its pointer)

The directory contains a pointer to the first and last blocks of a file

Linked Allocation

Linked Allocation (Cont.)


No external fragmentation
Any free block on the free-space list can be used to satisfy a request
A file can grow as long as free blocks are available; there is never a need to compact disk space
Linked allocation does have disadvantages
Only effective for sequential-access files
Space is required for the list pointers; clusters can be used to improve disk usage and access time
Reliability: a lost or damaged pointer breaks the chain

The File Allocation Table (FAT) is a variation of the linked allocation method that is used to support direct access
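A FAT keeps the chain of next-block pointers in one table instead of inside the data blocks, so following a chain never touches the data blocks themselves. The sketch below models the table as a dictionary; the `EOF` marker value, the function names, and the block numbers are assumptions for illustration.

```python
EOF = -1   # hypothetical end-of-chain marker


def fat_chain(fat, first_block):
    """Return every block of a file by following FAT entries from first_block."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]     # the table entry holds the next block number
    return blocks


def fat_logical_to_physical(fat, first_block, logical_block):
    """Find the i-th block of a file by walking the table, not the data blocks."""
    block = first_block
    for _ in range(logical_block):
        block = fat[block]
        if block == EOF:
            raise IndexError("logical block outside the file")
    return block


# The directory entry would hold the first block (217 in this example).
fat = {217: 618, 618: 339, 339: EOF}
chain = fat_chain(fat, 217)
third = fat_logical_to_physical(fat, 217, 2)
```

If the table is cached in memory, the walk involves no disk seeks, which is why the FAT improves direct access over plain linked allocation.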

File-Allocation Table

Indexed Allocation
Solves the external-fragmentation and size-declaration problems of contiguous allocation
Supports direct access by bringing all the pointers together into the index block
Each file has its own index block, which is an array of disk-block addresses

(Figure: index table)

Example of Indexed Allocation

Indexed Allocation (Cont)


Indexed allocation does suffer from wasted space
Every file must have an index block, so the index block should be as small as possible
A large file may require more than one index block, because a single index block holds only a limited number of pointers
Linked scheme
Multilevel scheme
Combined scheme

Linked Index Scheme


An index block is normally one disk block
Can be read and written directly by itself
To allow for large files, link together several index blocks (no limit on size)

Multilevel Index
Use index of index blocks
Use a first-level index block to point to a set of second-level index blocks, which in turn point to the file blocks
With 4 KB blocks and 4-byte index entries, what is the maximum file size using a 2-level index?

Could be extended to a third or fourth level, depending on the maximum file size
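The question about the two-level index can be worked through numerically. The sketch assumes that every index block is completely filled with 4-byte pointers and that data blocks are 4 KB, as stated in the question; the function names are illustrative.

```python
BLOCK_SIZE = 4 * 1024                              # 4 KB blocks
POINTER_SIZE = 4                                   # 4-byte index entries
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE    # 1024 pointers per index block


def max_file_size(levels):
    """Maximum file size in bytes for an index with the given number of levels."""
    return POINTERS_PER_BLOCK ** levels * BLOCK_SIZE


def two_level_indices(logical_block):
    """Split a logical block number into (first-level slot, second-level slot)."""
    outer, inner = divmod(logical_block, POINTERS_PER_BLOCK)
    if logical_block < 0 or outer >= POINTERS_PER_BLOCK:
        raise IndexError("logical block outside a two-level index")
    return outer, inner


two_level_bytes = max_file_size(2)      # 1024 * 1024 * 4 KB = 4 GB
slots = two_level_indices(1025)         # second index block, second entry
```

A first-level block addresses 1024 second-level blocks, each of which addresses 1024 data blocks, giving 1024 x 1024 x 4 KB = 4 GB; a third level would multiply this by another 1024.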

Multi-level Index mapping

(Figure labels: outer index, index table, file)

Combined Scheme: UNIX (4K bytes per block)


Keep the first n pointers of the index block in the file's inode
Indexed allocation suffers from some of the same performance problems as linked allocation
The index blocks can be cached in memory, but the data blocks may be spread all over a volume
The Unix inode
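The combined scheme can be sketched by asking which kind of inode pointer covers a given logical block. The slide does not give the pointer counts, so the values below are assumptions: 12 direct pointers followed by one single, one double, and one triple indirect pointer, with 4-byte block addresses. Only the 4 KB block size comes from the slide.

```python
BLOCK_SIZE = 4096                         # 4K bytes per block, from the slide
POINTER_SIZE = 4                          # assumption: 4-byte block addresses
N_DIRECT = 12                             # assumption: number of direct pointers
PER_BLOCK = BLOCK_SIZE // POINTER_SIZE    # pointers held by one index block


def pointer_level(logical_block):
    """Return which kind of inode pointer reaches the given logical block."""
    if logical_block < 0:
        raise IndexError("negative logical block")
    if logical_block < N_DIRECT:
        return "direct"
    logical_block -= N_DIRECT
    names = ("single indirect", "double indirect", "triple indirect")
    for level, name in enumerate(names, start=1):
        span = PER_BLOCK ** level         # data blocks reachable at this level
        if logical_block < span:
            return name
        logical_block -= span
    raise IndexError("logical block beyond the triple indirect range")


levels = [pointer_level(b) for b in (0, 11, 12, 12 + 1024, 12 + 1024 + 1024 ** 2)]
```

Small files are reached entirely through the direct pointers, so they pay no index-block overhead, while large files fall through to the indirect levels.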

Free-Space Management
Need to reuse the space from deleted files for new files
To keep track of free disk space, the system maintains a free-space list
Stores all free blocks: those not allocated to a file or directory

To create a file, the free-space list is searched and the space found is allocated to the new file; this space is then removed from the list
When a file is deleted, its disk space is added to the free-space list

Bit Vector
Frequently, the free-space list is implemented as a bit-map or bit vector
Each block is represented by 1 bit
If the block is free, the bit is 1; if the block is allocated, the bit is 0
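Blocks are numbered 0, 1, 2, ..., n-1

$$\text{bit}[i] = \begin{cases} 1 & \text{block}[i]\ \text{free} \\ 0 & \text{block}[i]\ \text{occupied} \end{cases}$$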
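The bit-vector convention above (1 means free, 0 means occupied) can be exercised with a few helper functions. A Python integer stands in for the bit vector here, with bit i describing block i; the function names are illustrative.

```python
def first_free(bit_vector, n_blocks):
    """Return the number of the first free block, or None if all are occupied."""
    for i in range(n_blocks):
        if (bit_vector >> i) & 1:      # bit i set means block i is free
            return i
    return None


def allocate(bit_vector, i):
    """Mark block i as occupied by clearing its bit."""
    return bit_vector & ~(1 << i)


def release(bit_vector, i):
    """Mark block i as free by setting its bit."""
    return bit_vector | (1 << i)


vector = 0b00111100                    # blocks 2, 3, 4, 5 free out of 8
block = first_free(vector, 8)
vector_after = allocate(vector, block)
next_block = first_free(vector_after, 8)
```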

Linked List
Link together all the free disk blocks
The first free block contains a pointer to the next free disk block, and so on

Grouping
Stores the addresses of n free blocks in the first free block
Large numbers of free blocks can be found quickly

Counting
Stores the address of the first free block and the number n of free contiguous blocks that follow it
The overall list will be shorter
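The counting approach can be illustrated by collapsing a plain list of free block numbers into (first block, count) pairs. The function name and the sample block numbers are assumptions, and the input is assumed to contain no duplicates.

```python
def to_counts(free_blocks):
    """Collapse free block numbers into (first_block, count) runs."""
    runs = []
    for block in sorted(free_blocks):
        if runs and runs[-1][0] + runs[-1][1] == block:
            # Block directly follows the current run: extend the count.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            # Gap before this block: start a new run.
            runs.append((block, 1))
    return runs


free_blocks = [2, 3, 4, 5, 8, 9, 17]
runs = to_counts(free_blocks)
```

Seven list entries shrink to three runs here; the saving grows with the length of the contiguous free regions.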

Linked Free Space List on Disk

Efficiency and Performance


Efficiency dependent on:
Disk allocation and directory algorithms, e.g. pointer size
Types of data kept in a file's directory entry

Performance
Disk buffer cache: a separate section of main memory for frequently used blocks
Free-behind and read-ahead: techniques to optimize sequential access
Improve PC performance by dedicating a section of memory as a virtual disk, or RAM disk (memory-mapped I/O)

Page Cache
A page cache caches pages rather than disk blocks, using virtual memory techniques
Memory-mapped I/O uses a page cache
Routine I/O through the file system uses the buffer (disk) cache

Unified Buffer Cache


A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O

Recovery
Care must be taken to ensure that system failure does not result in loss of data or in data inconsistency
Consistency checking
Compares data in the directory structure with data blocks on disk, and tries to fix inconsistencies
The allocation and free-space-management algorithms dictate what types of problems the checker can find

Backup and Restore


Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape)
Recover a lost file or disk by restoring data from backup

Log Structured File Systems


Log-structured (or journaling) file systems record each update to the file system as a transaction. All transactions are written to a log. A transaction is considered committed once it is written to the log, even though the file system itself may not yet be updated. The transactions in the log are asynchronously written to the file system. Once the file system has been modified, the transaction is removed from the log. If the system crashes, all transactions remaining in the log must still be performed.
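The commit, apply, and recover cycle described above can be sketched with an in-memory model. The class `JournaledStore` and its methods are hypothetical; a dictionary stands in for the file-system structures, and the log is assumed to survive a crash.

```python
class JournaledStore:
    """Toy journal: updates are committed to a log, then applied later."""

    def __init__(self):
        self.log = []     # committed transactions not yet applied
        self.data = {}    # stands in for the on-disk file-system structures

    def commit(self, txn):
        """A transaction (dict of updates) is committed once it is in the log."""
        self.log.append(dict(txn))

    def checkpoint(self):
        """Apply logged transactions in order, removing each one once applied."""
        while self.log:
            self.data.update(self.log[0])   # idempotent, so safe to repeat
            self.log.pop(0)

    def recover(self):
        """After a crash, perform every transaction still in the log."""
        self.checkpoint()


store = JournaledStore()
store.commit({"a.txt": "v1"})
store.commit({"a.txt": "v2", "b.txt": "v1"})
data_before = dict(store.data)   # committed, but the file system is not yet updated
store.recover()                  # simulate restart after a crash
```

Each logged update is written so that applying it twice has the same effect as applying it once, which matters if a crash happens after an update is applied but before it is removed from the log.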
