
Unit - 6

File-System Interface


File System Implementation

File system structure

File system implementation

File Concept

Access Methods

Directory Structure

Directory implementation
File-System Mounting
Allocation methods
File Sharing
Free-space management
Protection
Efficiency and performance

File Concept
File: a named collection of related information that is recorded on secondary storage.

Contiguous logical address space

Types:

Data

numeric
character
binary

Program

Files

Data collections created by users

The file system is one of the most important parts of the OS to a user

Desirable properties of files:

File Attributes

Name: only information kept in human-readable form

Identifier: unique tag (number) that identifies the file within the file system

Type: needed for systems that support different types

Location: pointer to file location on device

Size: current file size

Protection: controls who can do reading, writing, executing

Time, date, and user identification: data for protection, security, and usage monitoring

Information about files is kept in the directory structure, which is maintained on the disk

File Operations

A file is an abstract data type with these operations:

Create: find space in the file system, add a directory entry
Write: system call (name, information), write pointer
Read: system call, read pointer
Reposition within file (file seek): moves the current-file-position pointer
Delete
Truncate

Open(Fi): search the directory structure on disk for entry Fi, and move the content of the entry to memory (open-file table)

Close(Fi): move the content of entry Fi in memory to the directory structure on disk

File Operations (Cont)

The OS uses two levels of internal tables:

1. Per-process table:
Keeps track of all files that a process has open
Each entry in the per-process table points to a system-wide table

2. System-wide table:
Contains process-independent information, e.g., file size, access dates

Open Files

Several pieces of data are needed to manage open files:

File pointer: pointer to the last read/write location, kept per process that has the file open
File-open count: counts how many times a file has been opened and not yet closed, so that its entry can be removed from the open-file table when the last process closes it
Disk location of the file: cached information needed to access the data
Access rights: the per-process table stores this access-mode information
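
The two table levels and the file-open count can be sketched as follows. This is a minimal illustration, not how any particular OS lays out its tables; the class names, fields, and the file name `notes.txt` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SystemEntry:
    """Process-independent data for one open file (hypothetical fields)."""
    name: str
    size: int
    open_count: int = 0


@dataclass
class ProcessEntry:
    """Per-process data: access mode, private file pointer, link to system entry."""
    system: SystemEntry
    mode: str
    position: int = 0


class OpenFileTables:
    def __init__(self):
        self.system_wide = {}   # file name -> SystemEntry
        self.per_process = {}   # (pid, fd) -> ProcessEntry
        self._next_fd = {}      # pid -> next unused descriptor number

    def open(self, pid, name, size, mode="r"):
        entry = self.system_wide.setdefault(name, SystemEntry(name, size))
        entry.open_count += 1
        fd = self._next_fd.get(pid, 0)
        self._next_fd[pid] = fd + 1
        self.per_process[(pid, fd)] = ProcessEntry(entry, mode)
        return fd

    def close(self, pid, fd):
        entry = self.per_process.pop((pid, fd)).system
        entry.open_count -= 1
        if entry.open_count == 0:
            # Last close: the system-wide entry can be removed.
            del self.system_wide[entry.name]


tables = OpenFileTables()
fd_a = tables.open(pid=1, name="notes.txt", size=120)
fd_b = tables.open(pid=2, name="notes.txt", size=120)
count_while_shared = tables.system_wide["notes.txt"].open_count
tables.close(1, fd_a)
tables.close(2, fd_b)
```

Both processes share one system-wide entry while each keeps its own file pointer; the entry disappears only after the second close.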

File Locking

Provided by some operating systems and file systems

File locks allow one process to lock a file and prevent other processes from gaining access to it

1. Shared lock: several processes can acquire the lock concurrently (reader lock)

2. Exclusive lock: only one process can acquire such a lock at a time (writer lock)

File locking mechanisms:

Mandatory: access is denied depending on locks held and requested
Advisory: processes can find the status of locks and decide what to do
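
As an illustration of shared versus exclusive advisory locks, here is a small sketch using Python's `fcntl.flock`, which is available on POSIX systems only. The helper `try_lock` and the temporary file are assumptions made for the example; the three file objects stand in for three separate processes.

```python
import fcntl
import os
import tempfile


def try_lock(fileobj, exclusive):
    """Try to take an advisory lock without blocking; return True on success."""
    kind = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(fileobj, kind | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


fd, path = tempfile.mkstemp()
os.close(fd)
with open(path) as reader_a, open(path) as reader_b, open(path, "a") as writer:
    shared_a = try_lock(reader_a, exclusive=False)   # reader lock
    shared_b = try_lock(reader_b, exclusive=False)   # second reader lock
    writer_blocked = not try_lock(writer, exclusive=True)
    fcntl.flock(reader_a, fcntl.LOCK_UN)
    fcntl.flock(reader_b, fcntl.LOCK_UN)
    writer_now_ok = try_lock(writer, exclusive=True)
os.remove(path)
```

Because the locks are advisory, a program that never calls `flock` can still read or write the file; cooperation is up to the processes.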

File Types: Name, Extension

File Structure

Files can be structured as a collection of records or as a sequence of bytes

UNIX, Linux, Windows, and Mac OS treat files as a sequence of bytes

Other OSs, notably many IBM mainframes, adopt the collection-of-records approach; useful for databases

COBOL supports the collection-of-records file and can implement it even on systems that don't provide such files natively.

Structure Terms

Field
basic element of data
contains a single value
fixed or variable length

Record
collection of related fields that can be treated as a unit by some application program
one field is the key, a unique identifier

File
collection of similar records
treated as a single entity
may be referenced by name
access control restrictions usually apply at the file level

Database
collection of related data
relationships among elements of data are explicit
designed for use by a number of different applications
consists of one or more files

File Management
System Objectives

Meet the data management needs of the user

Guarantee that the data in the file are valid

Optimize performance

Provide I/O support for a variety of storage device types

Minimize the potential for lost or destroyed data

Provide a standardized set of I/O interface routines to user processes

Provide I/O support for multiple users in the case of multiple-user systems

Minimal User
Requirements
Each user:

File Structure

The OS may support multiple file structures; disadvantage: the OS needs code to support each of them

Packing a number of logical records into physical blocks is done by the user's application program or by the OS

All files suffer from internal fragmentation

The larger the block size, the greater the internal fragmentation

Access Methods

1. Sequential Access
read next
write next
reset
no read after last write (rewrite)
Ex: editors, compilers

2. Direct Access (file operations include block number n as a parameter)
read n
write n
position to n
read next
write next
rewrite n

Sequential-access File

Direct access (or relative access)

The file is viewed as a numbered sequence of blocks or records

Ex: read block 14, then read block 53, and then write block 7

No restrictions on the ordering of reading or writing

Of great use for immediate access to large amounts of information

Simulation of Sequential Access on Direct-access File
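
One way to picture this simulation: keep a current-position variable `cp` and turn each sequential operation into a direct operation at block `cp`. The sketch below assumes an in-memory list of blocks; the class names `DirectFile` and `SequentialView` are hypothetical.

```python
class DirectFile:
    """A file viewed as a numbered sequence of blocks (direct access)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def read(self, n):
        return self.blocks[n]

    def write(self, n, data):
        self.blocks[n] = data


class SequentialView:
    """Sequential operations built on top of direct access."""

    def __init__(self, direct_file):
        self.file = direct_file
        self.cp = 0  # current position (block number)

    def reset(self):
        self.cp = 0

    def read_next(self):
        data = self.file.read(self.cp)   # read cp
        self.cp += 1                     # cp = cp + 1
        return data

    def write_next(self, data):
        self.file.write(self.cp, data)   # write cp
        self.cp += 1                     # cp = cp + 1


disk_file = DirectFile(["a", "b", "c"])
view = SequentialView(disk_file)
first_two = [view.read_next(), view.read_next()]
view.write_next("C")
view.reset()
after_reset = view.read_next()
```

The reverse direction, simulating direct access on a purely sequential file, is far less efficient because reaching block n means reading past the blocks before it.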

Example of Index and Relative Files

Index: contains pointers to the various blocks
Large files require a large index
Solution: create an index for the index file

Directory Structure

A collection of nodes containing information about all files

[Figure: a directory whose entries point to files F1, F2, F3, F4, ..., Fn]

Both the directory structure and the files reside on disk

Backups of these two structures are kept on tapes

Disk Structure

Disk can be subdivided into partitions

Disks or partitions can be RAID protected against failure

Disk or partition can be used raw (without a file system) or formatted with a file system

Partitions are also known as minidisks or slices

The entity containing a file system is known as a volume

Each volume containing a file system also tracks that file system's info in a device directory or volume table of contents

As well as general-purpose file systems, there are many special-purpose file systems, frequently all within the same operating system or computer

A Typical File-system Organization

Operations Performed on a Directory

To understand the requirements for a file structure, it is helpful to consider the types of operations that may be performed on the directory:

Organize the Directory (Logically) to Obtain

Efficiency: locating a file quickly

Naming: convenient to users
Two users can have the same name for different files
The same file can have several different names

Grouping: logical grouping of files by properties (e.g., all Java programs, all games, ...)

Single-Level Directory

A single directory for all users

Naming problem
Grouping problem

Two-Level Scheme

Figure 12.4 Tree-Structured Directory

Master directory with user directories underneath it

Each user directory may have subdirectories and files as entries

Two-Level Directory

Separate directory for each user

Path name
Can have the same file name for different users
Efficient searching

Tree-Structured Directories

Tree-Structured Directories (Cont)

Directory entry: one bit marks the entry as a file (0) or a subdirectory (1)

Efficient searching

Current directory (working directory)

cd /spell/mail/prog

type list

Absolute or relative path name

Absolute: begins at the root and follows a path down to the specified file

Relative: defines a path from the current directory

Tree-Structured Directories (Cont)

Absolute or relative path name

Creating a new file is done in the current directory

Delete a file (directory empty?)

rm <file-name>

Creating a new subdirectory is done in the current directory

mkdir <dir-name>
Example: if in current directory /mail
mkdir count

Deleting "mail" deletes the entire subtree rooted at "mail"

Acyclic-Graph Directories

Have shared subdirectories and files

Acyclic-Graph Directories (Cont.)

Two different names (aliasing)

If dict deletes list, the result is a dangling pointer

Solutions:

Backpointers, so we can delete all pointers
Variable-size records are a problem

Backpointers using a daisy-chain organization

Entry-hold-count solution

New directory entry type

Link: another name (pointer) to an existing file

Resolve the link: follow the pointer to locate the file

General Graph Directory

General Graph Directory (Cont.)

How do we guarantee no cycles?

Allow only links to files, not subdirectories

Garbage collection

Every time a new link is added, use a cycle detection algorithm to determine whether it is OK
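
The cycle check on link creation can be sketched as a reachability test: a new link parent -> target closes a cycle exactly when parent is already reachable from target. The directory graph below is a plain dictionary and every name in it is hypothetical; a real file system would walk on-disk directory entries instead.

```python
def creates_cycle(children, parent, target):
    """Return True if adding the link parent -> target would close a cycle."""
    stack, seen = [target], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True          # parent is reachable from target
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


# Directory graph: each directory maps to the entries it links to.
tree = {"/": ["home", "bin"], "home": ["alice"], "alice": ["docs"]}

safe = creates_cycle(tree, "bin", "docs")     # bin -> docs: only shares docs
unsafe = creates_cycle(tree, "docs", "home")  # docs -> home: docs is below home
```

The search visits each directory at most once, but on a large directory graph even that is costly, which is why restricting links to files is the cheaper option.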

File System Mounting

A file system must be mounted before it can be accessed

An unmounted file system (i.e., (b)) is mounted at a mount point

Mount point: location within the file structure where the file system is to be attached.

(a) Existing. (b) Unmounted Partition

Mount Point

File Sharing

Access Rights

None: the user would not be allowed to read the user directory that includes the file

Knowledge: the user can determine that the file exists and who its owner is and can then petition the owner for additional access rights

Execution: the user can load and execute a program but cannot copy it

Reading: the user can read the file for any purpose, including copying and execution

Appending: the user can add data to the file but cannot modify or delete any of the file's contents

Updating: the user can modify, delete, and add to the file's data

Changing protection: the user can change the access rights granted to other users

Deletion: the user can delete the file from the file system

User Access Rights

File Sharing: Multiple Users

User IDs identify users, allowing permissions and protections to be per-user

Group IDs allow users to be in groups, permitting group access rights

File Sharing: Remote File Systems

Uses networking to allow file system access between systems

Manually via programs like FTP

Automatically, seamlessly using distributed file systems

Semi-automatically via the World Wide Web

Client-server model allows clients to mount remote file systems from servers
A server can serve multiple clients
Client and user-on-client identification is insecure or complicated

Distributed information systems (distributed naming services) such as LDAP, DNS, NIS, and Active Directory implement unified access to information needed for remote computing

File Sharing: Failure Modes

Remote file systems add new failure modes, due to network failure and server failure

Recovery from failure can involve state information about the status of each remote request

Stateless protocols such as NFS v3 include all information in each request, allowing easy recovery but less security

Protection

File owner/creator should be able to control:


what can be done
by whom

Types of access
Read
Write
Execute
Append
Delete
List

Access Lists and Groups

Mode of access: read, write, execute

Three classes of users

                        RWX
a) owner access    7 => 111
b) group access    6 => 110
c) public access   1 => 001

Ask the manager to create a group (unique name), say G, and add some users to the group.

For a particular file (say game) or subdirectory, define an appropriate access:

chmod 761 game      (owner = 7, group = 6, public = 1)

Attach a group to a file:

chgrp G game
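
To see how the octal mode 761 breaks into the three RWX triples, the following sketch uses Python's standard `stat.filemode`. The helper name `describe` is an assumption made for the example.

```python
import stat


def describe(mode):
    """Split a 9-bit permission mode into owner, group, and public triples."""
    # stat.filemode returns e.g. '-rwxrw---x'; drop the leading file-type character.
    text = stat.filemode(stat.S_IFREG | mode)[1:]
    return {"owner": text[0:3], "group": text[3:6], "public": text[6:9]}


perms = describe(0o761)
bits = format(0o761, "09b")  # the same mode as nine bits
```

Here `perms` maps owner to `rwx`, group to `rw-`, and public to `--x`, matching the bit patterns 111, 110, and 001.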

Access Matrix

The basic elements are:

subject: an entity capable of accessing objects
object: anything to which access is controlled
access right: the way in which an object is accessed by a subject

Access Control Lists

A matrix may be decomposed by columns, yielding access control lists

The access control list lists users and their permitted access rights

Capability Lists

Decomposition by rows yields capability tickets

A capability ticket specifies authorized objects and operations for a user
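
The two decompositions can be sketched on a small access matrix. The subjects, objects, and rights below are made up for illustration; the matrix is stored sparsely as a dictionary keyed by (subject, object).

```python
# Sparse access matrix: (subject, object) -> set of access rights.
matrix = {
    ("alice", "game"): {"read", "write", "execute"},
    ("bob", "game"): {"execute"},
    ("alice", "notes"): {"read"},
}


def access_control_lists(matrix):
    """Decompose by columns: one list per object, naming subjects and rights."""
    acls = {}
    for (subject, obj), rights in matrix.items():
        acls.setdefault(obj, {})[subject] = rights
    return acls


def capability_lists(matrix):
    """Decompose by rows: one list per subject, naming objects and rights."""
    caps = {}
    for (subject, obj), rights in matrix.items():
        caps.setdefault(subject, {})[obj] = rights
    return caps


acls = access_control_lists(matrix)
caps = capability_lists(matrix)
```

Both structures hold the same information; the access control list answers "who may use this object?" quickly, while the capability list answers "what may this user touch?".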

Windows XP Access-control List Management

File System Implementation

File-System Structure

File structure
Logical storage unit
Collection of related information

File system resides on secondary storage (disks)

File system organized into layers

File control block: storage structure consisting of information about a file

Typical Software Organization

File System Architecture

Notice that the top layer consists of a number of different file formats: pile, sequential, indexed sequential, ...

These file formats are consistent with the collection-of-records approach to files and determine how file data is accessed

Even in a byte-stream oriented file system it's possible to build files with record-based structures, but it's up to the application to design the files and build in access methods, indexes, etc.

Operating systems that include a variety of file formats provide access methods and other support automatically.

Layered File System Architecture

File formats and access methods provide the interface to users

Logical I/O

Basic I/O supervisor

Basic file system

Device drivers

Device Drivers

Lowest level

Communicates directly with peripheral devices

Responsible for starting I/O operations on a device

Processes the completion of an I/O request

Considered to be part of the operating system

Basic File System

Also referred to as the physical I/O level

Primary interface with the environment outside the computer system

Deals with blocks of data that are exchanged with disk or other mass storage devices:

placement of blocks on the secondary storage device

buffering blocks in main memory

Considered part of the operating system

Basic I/O Supervisor

Responsible for all file I/O initiation and termination

Control structures that deal with device I/O, scheduling, and file status are maintained

Selects the device on which I/O is to be performed

Concerned with scheduling disk and tape accesses to optimize performance

I/O buffers are assigned and secondary memory is allocated at this level

Part of the operating system

Logical I/O

This level is the interface between the logical commands issued by a program and the physical details required by the disk.
Logical units of data versus physical blocks of data to match disk requirements.

Access Method

Level of the file system closest to the user

Provides a standard interface between applications and the file systems and devices that hold the data

Different access methods reflect different file structures and different ways of accessing and processing the data

Elements of File Management

File Organization and Access

File organization is the logical structuring of the records as determined by the way in which they are accessed

In choosing a file organization, several criteria are important:

short access time

ease of update

economy of storage

simple maintenance

reliability

Priority of criteria depends on the application that will use the file

File Organization Types

The Pile

Least complicated form of file organization

Data are collected in the order they arrive

Each record consists of one burst of data

Purpose is simply to accumulate the mass of data and save it

Record access is by exhaustive search

The Sequential File

Most common form of file structure

A fixed format is used for records

Key field uniquely identifies the record and determines storage order

Typically used in batch applications

Only organization that is easily stored on tape as well as disk

Indexed Sequential File

Adds an index to the file to support random access

Adds an overflow file

Greatly reduces the time required to access a single record

Multiple levels of indexing can be used to provide greater efficiency in access

Indexed File

Records are accessed only through their indexes

Variable-length records can be employed

Exhaustive index contains one entry for every record in the main file

Partial index contains entries to records where the field of interest exists

Used mostly in applications where timeliness of information is critical

Examples would be airline reservation systems and inventory control systems

Direct or Hashed File

Access directly any block of a known address

Makes use of hashing on the key value

Often used where:

very rapid access is required

fixed-length records are used

records are always accessed one at a time

File system implementation

On disk:

1. Boot control block: contains information needed by the system to boot the OS
UFS: boot block; NTFS: partition boot sector

2. Volume control block: contains volume details (number of blocks, size of blocks, free-block count, etc.)
UFS: superblock; NTFS: master file table

3. Directory structure: used to organize files

File system implementation

In-memory:

1. In-memory mount table: information about each mounted volume

2. In-memory directory structure cache: information about recently accessed directories

3. System-wide open-file table: copy of the FCB of each open file

4. Per-process open-file table: pointer to the appropriate entry in the system-wide open-file table

In-Memory File System Structures

The figure illustrates the necessary file system structures provided by the OS

Figure (a) refers to opening a file.

Figure (b) refers to reading a file.

Partitions and Mounting

A disk can be sliced into multiple partitions

Raw disk: contains no file system

Boot information: sequential series of blocks, loaded as an image into memory

Systems can be dual-booted

Root partition: contains the OS kernel and other system files; it is mounted at boot time

Virtual File Systems (VFS)

VFS provides an object-oriented way of implementing file systems.

VFS allows the same system call interface (the API) to be used for different types of file systems.

The API is to the VFS interface, rather than to any specific type of file system.

VFS Architecture in Linux

4 main object types defined by Linux VFS:
1. inode object: represents an individual file
2. file object: represents an open file
3. superblock object: represents an entire file system
4. dentry object: represents an individual directory entry

Directory Implementation

1. Linear list of file names with pointers to the data blocks
simple to program
time-consuming to execute
finding a file requires a linear search

2. Hash table: linear list with a hash data structure
takes a value computed from the file name and returns a pointer to the file name in the linear list
decreases directory search time
collisions: situations where two file names hash to the same location
fixed size
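
The two directory implementations can be compared in a short sketch. The class `Directory` and its method names are hypothetical. Note one simplification: Python's built-in `dict` resolves collisions and resizes itself internally, so the fixed-size and collision concerns listed above do not show up here.

```python
class Directory:
    def __init__(self):
        self.entries = []   # linear list of (file name, first data block)
        self.index = {}     # hash table: file name -> position in the linear list

    def add(self, name, block):
        self.index[name] = len(self.entries)
        self.entries.append((name, block))

    def linear_lookup(self, name):
        """Scan the list entry by entry; cost grows with directory size."""
        for entry_name, block in self.entries:
            if entry_name == name:
                return block
        return None

    def hashed_lookup(self, name):
        """Hash the name to jump straight to its position in the list."""
        pos = self.index.get(name)
        return None if pos is None else self.entries[pos][1]


directory = Directory()
for i, name in enumerate(["a.txt", "b.txt", "c.txt"]):
    directory.add(name, block=100 + i)

same_answer = directory.linear_lookup("c.txt") == directory.hashed_lookup("c.txt")
```

Both lookups return the same block; only the search cost differs.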

Record Blocking

Blocks are the unit of I/O with secondary storage

For I/O to be performed, records must be organized as blocks

Given the size of a block, three methods of blocking can be used:

1) Fixed-Length Blocking: fixed-length records are used, and an integral number of records (or bytes) are stored in a block
Internal fragmentation: unused space at the end of each block for records, but not for bytes

2) Variable-Length Spanned Blocking: variable-length records are packed into blocks with no unused space

3) Variable-Length Unspanned Blocking: variable-length records are used, but spanning is not done
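
A rough calculation shows how the three methods differ in space use. The sketch below ignores the per-record length fields and continuation pointers that real variable-length formats need; the function names and the example sizes are assumptions.

```python
def fixed_blocking(block_size, record_size):
    """Return (records per block, unused bytes at the end of each block)."""
    per_block = block_size // record_size
    return per_block, block_size - per_block * record_size


def unspanned_blocks(block_size, record_sizes):
    """Blocks needed when a record may not cross a block boundary."""
    blocks, free = 0, 0
    for size in record_sizes:
        if size > block_size:
            raise ValueError("record larger than a block cannot be stored unspanned")
        if size > free:          # record does not fit: start a new block
            blocks += 1
            free = block_size
        free -= size
    return blocks


def spanned_blocks(block_size, record_sizes):
    """Blocks needed when records are packed with no unused space."""
    return -(-sum(record_sizes) // block_size)  # ceiling division


fixed = fixed_blocking(512, 100)
unspanned = unspanned_blocks(512, [300, 300, 300])
spanned = spanned_blocks(512, [300, 300, 300])
```

With 512-byte blocks, 100-byte fixed records leave 12 bytes unused per block, and three 300-byte records need three blocks unspanned but only two when spanning is allowed.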

File Allocation

Disks are divided into physical blocks (sectors on a track)
Files are divided into logical blocks (subdivisions of the file)
Logical block size = some multiple of the physical block size
The operating system or file management system is responsible for allocating blocks to files

Space is allocated to a file as one or more portions (contiguous sets of allocated disk blocks). A portion is the logical block size

File allocation table (FAT): data structure used to keep track of the portions assigned to a file

Preallocation vs Dynamic Allocation

A preallocation policy requires that the maximum size of a file be declared at the time of the file creation request

For many applications it is difficult to estimate reliably the maximum potential size of the file

Preallocation tends to be wasteful because users and application programmers tend to overestimate size

Dynamic allocation allocates space to a file in portions as needed

Portion Size

In choosing a portion size there is a trade-off between efficiency from the point of view of a single file versus overall system efficiency

Items to be considered:
1) contiguity of space increases performance, especially for
Retrieve_Next operations, and greatly for transactions
running in a transaction-oriented operating system
2) having a large number of small portions increases the size
of tables needed to manage the allocation information
3) having fixed-size portions simplifies the reallocation of
space
4) having variable-size or small fixed-size portions minimizes
waste of unused storage due to overallocation

Summarizing the Alternatives

Two major alternatives:

Table 12.3 File Allocation Methods

Contiguous File Allocation

A single contiguous set of blocks is allocated to a file at the time of file creation

Preallocation strategy using variable-size portions

Is the best from the point of view of the individual sequential file

[Figure 12.9]

Figure 12.10 Contiguous File Allocation (After Compaction)

Chained Allocation

Allocation is on an individual block basis

Each block contains a pointer to the next block in the chain

The file allocation table needs just a single entry for each file

No external fragmentation to worry about

Better for sequential files

[Figure 12.11]

Chained Allocation After Consolidation

[Figure 12.12]

Linked Allocation (Cont.)

Simple: need only the starting address

Free-space management system: no waste of space

No random access

Mapping

Disadvantages:

Can be used effectively only for sequential-access files

Space required for the pointers (solution: clusters of multiple blocks)

Reliability (what if a pointer were lost? solution: doubly linked list)
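
The mapping from a logical block number to a physical block under linked allocation can be sketched as a walk along the chain. The chain contents and the function name `physical_block` are made up for the example; the next-pointers are held in a dictionary rather than inside disk blocks.

```python
def physical_block(next_pointer, start, logical_index):
    """Follow the chain from `start` to find the block holding logical block i."""
    block = start
    for _ in range(logical_index):
        block = next_pointer[block]
        if block is None:
            raise IndexError("logical block is past the end of the file")
    return block


# Hypothetical file occupying blocks 9 -> 16 -> 1 -> 10 -> 25 (None ends the chain).
chain = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}

first = physical_block(chain, start=9, logical_index=0)
fourth = physical_block(chain, start=9, logical_index=3)
```

Reaching logical block i costs i pointer reads, which is why the slide lists "no random access" for this scheme.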

Indexed Allocation with Block Portions

[Figure 12.13]

Indexed Allocation with Variable-Length Portions

[Figure 12.14]

Indexed Allocation (Cont.)

Need an index table
Random access
Dynamic access without external fragmentation, but with the overhead of an index block

Indexed File - Linked Scheme

[Figure: index blocks joined by links, pointing to file blocks]

Indexed Allocation - Multilevel Index

[Figure: 2nd-level index, index blocks, links]

Free Space Management

Just as allocated space must be managed, so must the unallocated space

To perform file allocation, it is necessary to know which blocks are available

A disk allocation table is needed in addition to a file allocation table

Free-Space Management

Free-space list: keeps track of free disk space

1. Bit vector

2. Linked list

3. Grouping

4. Counting

Free-Space Management

1. Bit vector or bit map (n blocks)

Bits are numbered 0, 1, ..., n-1

bit[i] = 1 if block[i] is free
bit[i] = 0 if block[i] is occupied

Block number calculation:

(number of bits per word) * (number of 0-value words) + offset of first 1 bit

Bit map requires extra space
Easy to get contiguous files
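
The block number calculation can be tried out on a small bit map. This sketch assumes 8-bit words, with bit 0 of the map stored in the most significant position of word 0; the function name and the word size are illustrative choices, not fixed by the slide.

```python
def first_free_block(words, bits_per_word=8):
    """Return the number of the first free block, or None if none is free.

    Bit i == 1 means block i is free. Words are scanned in order and
    all-zero words (fully occupied) are skipped.
    """
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most significant end.
            offset = bits_per_word - word.bit_length()
            return bits_per_word * zero_words + offset
    return None


# Two fully occupied words, then a word whose bit at offset 3 is set.
bitmap = [0b00000000, 0b00000000, 0b00010000]
block = first_free_block(bitmap)
```

Here the result is 8 * 2 + 3 = 19: two 0-value words are skipped and the first 1 bit sits at offset 3 in the third word.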

Free-Space Management (Cont.)

2. Linked list (free list)

Link all free disk blocks; keep a pointer to the first free block and cache it in memory
Cannot get contiguous space easily
No waste of space

Chained Free Portions

The free portions may be chained together by using a pointer and length value in each free portion

Negligible space overhead because there is no need for a disk allocation table

Suited to all file allocation methods

Linked Free Space List on Disk

Free-Space Management (Cont.)

3. Grouping:

Stores the addresses of n free blocks in the first free block

The last of these blocks holds the addresses of another n free blocks

4. Counting:

Keeps the address of the first free block and the number n of free contiguous blocks that follow the first block

Each entry: disk address and count
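
The counting scheme can be sketched by compressing a list of free block numbers into (address, count) entries. In this sketch the count includes the first block of each run, which is an assumption; the slide's wording could also be read as counting only the blocks that follow. The function name and the block numbers are hypothetical.

```python
def to_runs(free_blocks):
    """Compress free block numbers into (first address, count) entries."""
    runs = []
    for block in sorted(set(free_blocks)):
        if runs and block == runs[-1][0] + runs[-1][1]:
            # Block directly follows the current run: extend its count.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((block, 1))
    return runs


free = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17]
runs = to_runs(free)
```

Eleven free blocks shrink to three entries here; the saving grows when free space comes in long contiguous runs.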

Efficiency and Performance

Efficiency depends on:

disk allocation and directory algorithms
types of data kept in a file's directory entry
e.g., last write date or last access date

Efficiency and Performance: Performance

Disk cache: separate section of main memory for frequently used blocks

Buffer cache: separate section of main memory for blocks that will be used again shortly

Page cache: caches file data as pages

Unified virtual memory: caches both process pages and file data

Unified buffer cache: uses the same page cache for both memory-mapped pages and ordinary file I/O

I/O Without a Unified Buffer Cache

Efficiency and Performance

Block replacement mechanisms:
LRU
Free-behind: removes a block from the buffer as soon as the next block is requested
Read-ahead: the requested block and several subsequent blocks are read and cached
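
LRU replacement combined with read-ahead can be sketched with an ordered dictionary. The class `BlockCache`, the `read_block` callback, and the fake disk are assumptions for illustration; free-behind is not modelled, and a real cache would also stop read-ahead at the end of the file.

```python
from collections import OrderedDict


class BlockCache:
    def __init__(self, capacity, read_block, read_ahead=0):
        self.capacity = capacity
        self.read_block = read_block   # function: block number -> data
        self.read_ahead = read_ahead   # how many following blocks to prefetch
        self.blocks = OrderedDict()    # least recently used block comes first

    def _load(self, n):
        if n in self.blocks:
            self.blocks.move_to_end(n)           # mark as most recently used
            return
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)      # evict the LRU block
        self.blocks[n] = self.read_block(n)

    def get(self, n):
        # Prefetch the following blocks first, so that block n is loaded
        # last and ends up as the most recently used entry.
        for ahead in range(n + self.read_ahead, n, -1):
            self._load(ahead)
        self._load(n)
        return self.blocks[n]


disk_reads = []


def fake_disk(n):
    disk_reads.append(n)
    return f"data{n}"


cache = BlockCache(capacity=3, read_block=fake_disk, read_ahead=1)
cache.get(0)   # reads blocks 1 and 0 from disk
cache.get(1)   # block 1 is already cached; only block 2 is read
cache.get(5)   # cache is full: blocks 0 and 2 are evicted for 6 and 5
```

Sequential scans benefit from the read-ahead, since the second request finds its block already in memory.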

End of Unit - 6
