The world has changed a great deal since the first edition of this book
appeared in 1992. Computer networks and distributed systems of all kinds have
become very common. Small children now roam the Internet, where previously
only computer professionals went. As a consequence, this book has changed a
great deal, too.
The most obvious change is that the first edition was about half on
single-processor operating systems and half on distributed systems. I chose that
format in 1991 because few universities then had courses on distributed systems
and whatever students learned about distributed systems had to be put into the
operating systems course, for which this book was intended. Now most
universities have a separate course on distributed systems, so it is not
necessary to try to combine the two subjects into one course and one book. This
book is intended for a first course on operating systems, and as such focuses
mostly on traditional single-processor systems.
I have coauthored two other books on operating systems. This leads to two
possible course sequences.
Practically-oriented sequence:
1. Operating Systems Design and Implementation by Tanenbaum and Woodhull
2. Distributed Systems by Tanenbaum and Van Steen
Traditional sequence:
1. Modern Operating Systems by Tanenbaum
2. Distributed Systems by Tanenbaum and Van Steen
PREFACE
The former sequence uses MINIX and the students are expected to experiment
with MINIX in an accompanying laboratory supplementing the first course. The
latter sequence does not use MINIX. Instead, some small simulators are available
that can be used for student exercises during a first course using this book. These
can be found starting on the author's Web page: www.cs.vu.nl/~ast/ by
clicking on Software and supplementary material for my books.
In addition to the major change of switching the emphasis to single-processor
operating systems in this book, other major changes include the addition of entire
chapters on computer security, multimedia operating systems, and Windows 2000,
all important and timely topics. In addition, a new and unique chapter on
operating system design has been added.
Another new feature is that many chapters now have a section on research
about the topic of the chapter. This is intended to introduce the reader to modern
work in processes, memory management, and so on. These sections have
numerous references to the current research literature for the interested reader. In
addition, Chapter 13 has many introductory and tutorial references.
Finally, numerous topics have been added to this book or heavily revised.
These topics include: graphical user interfaces, multiprocessor operating systems,
power management for laptops, trusted systems, viruses, network terminals,
CD-ROM file systems, mutexes, RAID, soft timers, stable storage, fair-share
scheduling, and new paging algorithms. Many new problems have been added and
old ones updated. The total number of problems now exceeds 450. A solutions
manual is available to professors using this book in a course. They can obtain a
copy from their local Prentice Hall representative. In addition, over 250 new
references to the current literature have been added to bring the book up to date.
Despite the removal of more than 400 pages of old material, the book has
increased in size due to the large amount of new material added. While the book
is still suitable for a one-semester or two-quarter course, it is probably too long for
a one-quarter or one-trimester course at most universities. For this reason, the
book has been designed in a modular way. Any course on operating systems
should cover chapters 1 through 6. This is basic material that every student should
know.
If additional time is available, additional chapters can be covered. Each of
them assumes the reader has finished chapters 1 through 6, but Chaps. 7 through
12 are each self contained, so any desired subset can be used and in any order,
depending on the interests of the instructor. In the author's opinion, Chaps. 7
through 12 are much more interesting than the earlier ones. Instructors should tell
their students that they have to eat their broccoli before they can have the double
chocolate fudge cake dessert.
I would like to thank the following people for their help in reviewing parts of
the manuscript: Rida Bazzi, Riccardo Bettati, Felipe Cabrera, Richard Chapman,
John Connely, John Dickinson, John Elliott, Deborah Frincke, Chandana Gamage,
Robbert Geist, David Golds, Jim Griffioen, Gary Harkin, Frans Kaashoek,
Mukkai Krishnamoorthy, Monica Lam, Jussi Leiwo, Herb Mayer, Kirk McKusick, Evi
Nemeth, Bill Potvin, Prasant Shenoy, Thomas Skinner, Xian-He Sun, William
Terry, Robbert Van Renesse, and Maarten van Steen. Jamie Hanrahan, Mark
Russinovich, and Dave Solomon were enormously knowledgeable about
Windows 2000 and very helpful. Special thanks go to Al Woodhull for valuable
reviews and thinking of many new end-of-chapter problems.
My students were also helpful with comments and feedback, especially Staas
de Jong, Jan de Vos, Niels Drost, David Fokkema, Auke Folkerts, Peter
Groenewegen, Wilco Ibes, Stefan Jansen, Jeroen Ketema, Joeri Mulder, Irwin
Oppenheim, Stef Post, Umar Rehman, Daniel Rijkhof, Maarten Sunder, Maurits
van der Schee, Rik van der Stoel, Mark van Driel, Dennis van Veen, and Thomas
Zeeman.
Barbara and Marvin are still wonderful, as usual, each in a unique way.
Finally, last but not least, I would like to thank Suzanne for her love and patience,
not to mention all the druiven and kersen, which have replaced the sinaasappelsap
in recent times.
Andrew S. Tanenbaum
1.1. WHAT IS AN OPERATING SYSTEM?
1.1.1. The Operating System as an Extended Machine
1.1.2. The Operating System as a Resource Manager
1.9. OUTLINE OF THE REST OF THIS BOOK
1.10. METRIC UNITS
1.11. SUMMARY
CONTENTS
2 PROCESSES AND THREADS
2.1. PROCESSES
2.1.1. The Process Model
2.1.2. Process Creation
2.1.3. Process Termination
2.1.4. Process Hierarchies
2.1.5. Process States
2.1.6. Implementation of Processes
2.2. THREADS
2.2.1. The Thread Model
2.2.2. Thread Usage
2.2.3. Implementing Threads in User Space
2.2.4. Implementing Threads in the Kernel
2.2.5. Hybrid Implementations
2.2.6. Scheduler Activations
2.2.7. Pop-Up Threads
2.2.8. Making Single-Threaded Code Multithreaded
2.3. INTERPROCESS COMMUNICATION
2.3.1. Race Conditions
2.3.2. Critical Regions
2.3.3. Mutual Exclusion with Busy Waiting
2.3.4. Sleep and Wakeup
2.3.5. Semaphores
2.3.6. Mutexes
2.3.7. Monitors
2.3.8. Message Passing
2.3.9. Barriers
2.4. CLASSICAL IPC PROBLEMS
2.4.1. The Dining Philosophers Problem
2.4.2. The Readers and Writers Problem
2.4.3. The Sleeping Barber Problem
2.5. SCHEDULING
2.5.1. Introduction to Scheduling
2.5.2. Scheduling in Batch Systems
2.5.3. Scheduling in Interactive Systems
2.5.4. Scheduling in Real-Time Systems
2.5.5. Policy versus Mechanism
2.5.6. Thread Scheduling
2.6. RESEARCH ON PROCESSES AND THREADS
3 DEADLOCKS
4 MEMORY MANAGEMENT
6 FILE SYSTEMS
6 . SUMMARY
7.6. FILE PLACEMENT
7.6.1. Placing a File on a Single Disk
7.6.2. Two Alternative File Organization Strategies
7.6.3. Placing Files for Near Video on Demand
7.6.4. Placing Multiple Files on a Single Disk
7.6.5. Placing Files on Multiple Disks
7.7. CACHING
7.7.1. Block Caching
7.7.2. File Caching
SUMMARY
8 MULTIPLE PROCESSOR SYSTEMS
8.1. MULTIPROCESSORS
8.1.1. Multiprocessor Hardware
8.1.2. Multiprocessor Operating System Types
8.1.3. Multiprocessor Synchronization
8.1.4. Multiprocessor Scheduling
5 SUMMARY
9 SECURITY
12.4.5. Hints
12.4.6. Exploiting Locality
12.4.7. Optimize the Common Case
Figure 1-1. A computer system consists of hardware, system programs, and
application programs. (Layers legible in the figure: operating system, machine
language, microarchitecture, physical devices.)
The purpose of the data path is to execute some set of instructions. Some of
these can be carried out in one data path cycle; others may require multiple data
path cycles. These instructions may use registers or other hardware facilities.
Together, the hardware and instructions visible to an assembly language
programmer form the ISA (Instruction Set Architecture) level. This level is often
called machine language.
The machine language typically has between 50 and 300 instructions, mostly
for moving data around the machine, doing arithmetic, and comparing values. In
this level, the input/output devices are controlled by loading values into special
device registers. For example, a disk can be commanded to read by loading the
values of the disk address, main memory address, byte count, and direction (read
or write) into its registers. In practice, many more parameters are needed, and the
status returned by the drive after an operation is highly complex. Furthermore, for
many I/O (Input/Output) devices, timing plays an important role in the
programming.
To hide this complexity, an operating system is provided. It consists of a
layer of software that (partially) hides the hardware and gives the programmer a
more convenient set of instructions to work with. For example, read block from
file is conceptually simpler than having to worry about the details of moving disk
heads, waiting for them to settle down, and so on.
On top of the operating system is the rest of the system software. Here we
find the command interpreter (shell), window systems, compilers, editors, and
similar application-independent programs. It is important to realize that these
programs are definitely not part of the operating system, even though they are
typically supplied by the computer manufacturer. This is a crucial, but subtle, point.
The operating system is (usually) that portion of the software that runs in kernel
mode or supervisor mode. It is protected from user tampering by the hardware
(ignoring for the moment some older or low-end microprocessors that do not have
hardware protection at all). Compilers and editors run in user mode. If a user
does not like a particular compiler, he is free to write his own if he so chooses;
he is not free to write his own clock interrupt handler, which is part of the
operating system and is normally protected by hardware against attempts by users to
modify it.
This distinction, however, is sometimes blurred in embedded systems (which
may not have kernel mode) or interpreted systems (such as Java-based operating
systems that use interpretation, not hardware, to separate the components). Still,
for traditional computers, the operating system is what runs in kernel mode.
That said, in many systems there are programs that run in user mode but
which help the operating system or perform privileged functions. For example,
there is often a program that allows users to change their passwords. This
program is not part of the operating system and does not run in kernel mode, but it
clearly carries out a sensitive function and has to be protected in a special way.
In some systems, this idea is carried to an extreme form, and pieces of what is
traditionally considered to be the operating system (such as the file system) run in
user space. In such systems, it is difficult to draw a clear boundary. Everything
running in kernel mode is clearly part of the operating system, but some programs
running outside it are arguably also part of it, or at least closely associated with it.
Finally, above the system programs come the application programs. These
programs are purchased or written by the users to solve their particular problems,
such as word processing, spreadsheets, engineering calculations, or storing
information in a database.
Most computer users have had some experience with an operating system, but
it is difficult to pin down precisely what an operating system is. Part of the
problem is that operating systems perform two basically unrelated functions, extending
the machine and managing resources, and depending on who is doing the talking,
you hear mostly about one function or the other. Let us now look at both.
PD765 compatible controller chips used on most Intel-based personal computers.
(Throughout this book we will use the terms "floppy disk" and "diskette"
interchangeably.) The PD765 has 16 commands, each specified by loading between 1
and 9 bytes into a device register. These commands are for reading and writing
data, moving the disk arm, and formatting tracks, as well as initializing, sensing,
resetting, and recalibrating the controller and the drives.
The most basic commands are read and write, each of which requires 13
parameters, packed into 9 bytes. These parameters specify such items as the
address of the disk block to be read, the number of sectors per track, the recording
mode used on the physical medium, the intersector gap spacing, and what to do
with a deleted-data-address-mark. If you do not understand this mumbo jumbo,
do not worry; that is precisely the point: it is rather esoteric. When the operation
is completed, the controller chip returns 23 status and error fields packed into 7
bytes. As if this were not enough, the floppy disk programmer must also be
constantly aware of whether the motor is on or off. If the motor is off, it must be
turned on (with a long startup delay) before data can be read or written. The
motor cannot be left on too long, however, or the floppy disk will wear out. The
programmer is thus forced to deal with the trade-off between long startup delays
versus wearing out floppy disks (and losing the data on them).
Without going into the real details, it should be clear that the average
programmer probably does not want to get too intimately involved with the
programming of floppy disks (or hard disks, which are just as complex and quite
different). Instead, what the programmer wants is a simple, high-level abstraction to
deal with. In the case of disks, a typical abstraction would be that the disk
contains a collection of named files. Each file can be opened for reading or writing,
then read or written, and finally closed. Details such as whether or not recording
should use modified frequency modulation and what the current state of the motor
is should not appear in the abstraction presented to the user.
The program that hides the truth about the hardware from the programmer and
presents a nice, simple view of named files that can be read and written is, of
course, the operating system. Just as the operating system shields the programmer
from the disk hardware and presents a simple file-oriented interface, it also
conceals a lot of unpleasant business concerning interrupts, timers, memory
management, and other low-level features. In each case, the abstraction offered by the
operating system is simpler and easier to use than that offered by the underlying
hardware.
In this view, the function of the operating system is to present the user with
the equivalent of an extended machine or virtual machine that is easier to
program than the underlying hardware. How the operating system achieves this goal
is a long story, which we will study in detail throughout this book. To summarize
it in a nutshell, the operating system provides a variety of services that programs
can obtain using special instructions called system calls. We will examine some
of the more common system calls later in this chapter.
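As a rough illustration of the file abstraction described above, the following
Python sketch opens a named file, writes to it, reads it back, and closes it,
without ever mentioning motors, sectors, or recording modes. It assumes a
POSIX-like system, where the functions in Python's os module are thin wrappers
around the corresponding system calls; the function name and the file name are
made up for this example.

```python
import os
import tempfile


def copy_through_system_calls(data: bytes) -> bytes:
    """Write data to a named file and read it back using only the
    open / write / read / close abstraction offered by the OS."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "example.dat")  # hypothetical file name

    # Open for writing; the OS returns a small integer file descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)

    # Open the same named file again, this time for reading.
    fd = os.open(path, os.O_RDONLY)
    try:
        result = os.read(fd, len(data))
    finally:
        os.close(fd)

    # Clean up the temporary file and directory.
    os.remove(path)
    os.rmdir(directory)
    return result
```

Everything hardware-specific happens below these calls, which is the point of
the extended-machine view.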
especially if it only needs a small fraction of the total. Of course, this raises
issues of fairness, protection, and so on, and it is up to the operating system to
solve them. Another resource that is space multiplexed is the (hard) disk. In
many systems a single disk can hold files from many users at the same time.
Allocating disk space and keeping track of who is using which disk blocks is a
typical operating system resource management task.
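The bookkeeping involved in space multiplexing a disk can be sketched in a few
lines of Python. This is only a toy model under stated assumptions: the class
name, the first-free allocation policy, and the per-block owner table are all
invented for illustration and are not taken from any particular operating
system.

```python
class DiskSpaceManager:
    """Toy model: track which user owns each block of a shared disk."""

    def __init__(self, total_blocks):
        # owner[i] is the user holding block i, or None if the block is free.
        self.owner = [None] * total_blocks

    def allocate(self, user, count):
        """Give `user` the first `count` free blocks (assumed policy)."""
        free = [i for i, who in enumerate(self.owner) if who is None]
        if len(free) < count:
            raise RuntimeError("not enough free disk blocks")
        granted = free[:count]
        for block in granted:
            self.owner[block] = user
        return granted

    def release(self, user, blocks):
        """Return blocks to the free pool, but only if `user` owns them all."""
        for block in blocks:
            if self.owner[block] != user:
                raise PermissionError(f"block {block} is not owned by {user}")
        for block in blocks:
            self.owner[block] = None

    def usage(self, user):
        """Number of blocks currently charged to `user`."""
        return sum(1 for who in self.owner if who == user)
```

The ownership check in release is a small instance of the protection issue
mentioned above: one user must not be able to free another user's blocks.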
sections we will briefly look at a few of the highlights. Since operating systems
have historically been closely tied to the architecture of the computers on which
they run, we will look at successive generations of computers to see what their
operating systems were like. This mapping of operating system generations to
computer generations is crude, but it does provide some structure where there
would otherwise be none.
The first true digital computer was designed by the English mathematician
Charles Babbage (1792-1871). Although Babbage spent most of his life and
fortune trying to build his "analytical engine," he never got it working properly
because it was purely mechanical, and the technology of his day could not
produce the required wheels, gears, and cogs to the high precision that he needed.
Needless to say, the analytical engine did not have an operating system.
As an interesting historical aside, Babbage realized that he would need
software for his analytical engine, so he hired a young woman named Ada
Lovelace, who was the daughter of the famed British poet Lord Byron, as the
world's first programmer. The programming language Ada is named after her.
The introduction of the transistor in the mid-1950s changed the picture
radically. Computers became reliable enough that they could be manufactured and
sold to paying customers with the expectation that they would continue to
function long enough to get some useful work done. For the first time, there was a
clear separation between designers, builders, operators, programmers, and
maintenance personnel.
These machines, now called mainframes, were locked away in specially air
conditioned computer rooms, with staffs of professional operators to run them.
Only big corporations or major government agencies or universities could afford
the multimillion dollar price tag. To run a job (i.e., a program or set of
programs), a programmer would first write the program on paper (in FORTRAN or
assembler), then punch it on cards. He would then bring the card deck down to
the input room and hand it to one of the operators and go drink coffee until the
output was ready.
When the computer finished whatever job it was currently running, an
operator would go over to the printer and tear off the output and carry it over to the
output room, so that the programmer could collect it later. Then he would take one
of the card decks that had been brought from the input room and read it in. If the
FORTRAN compiler was needed, the operator would have to get it from a file
cabinet and read it in. Much computer time was wasted while operators were
walking around the machine room.
Given the high cost of the equipment, it is not surprising that people quickly
looked for ways to reduce the wasted time. The solution generally adopted was
the batch system. The idea behind it was to collect a tray full of jobs in the input
room and then read them onto a magnetic tape using a small (relatively)
inexpensive computer, such as the IBM 1401, which was very good at reading cards,
copying tapes, and printing output, but not at all good at numerical calculations.
Other, much more expensive machines, such as the IBM 7094, were used for the
real computing. This situation is shown in Fig. 1-2.
The operator then loaded a special program (the ancestor of today's operating
system), which read the first job from tape and ran it. The output was written onto
a second tape, instead of being printed. After each job finished, the operating
system automatically read the next job from the tape and began running it. When the
whole batch was done, the operator removed the input and output tapes, replaced
the input tape with the next batch, and brought the output tape to a 1401 for
printing off line (i.e., not connected to the main computer).
The structure of a typical input job is shown in Fig. 1-3. It started out with a
$JOB card, specifying the maximum run time in minutes, the account number to
be charged, and the programmer's name. Then came a $FORTRAN card, telling
the operating system to load the FORTRAN compiler from the system tape. It
was followed by the program to be compiled, and then a $LOAD card, directing
the operating system to load the object program just compiled. (Compiled
programs were often written on scratch tapes and had to be loaded explicitly.) Next
came the $RUN card, telling the operating system to run the program with the
data following it. Finally, the $END card marked the end of the job. These
primitive control cards were the forerunners of modern job control languages and
command interpreters.
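To make the role of the control cards concrete, here is a small Python sketch of
how a monitor program might step through one job's deck. It is a toy
interpretation only: the card names come from the text, but the function name,
the log messages, and the sample deck contents are assumptions made for this
example, and nothing is actually compiled or run.

```python
def run_batch_job(cards):
    """Walk through one job's card deck and return a log of the actions
    a batch monitor would take. Cards are plain strings."""
    log = []
    source, data = [], []
    target = None  # where ordinary (non-control) cards are collected

    for card in cards:
        if card.startswith("$JOB"):
            # Run time limit, account number, programmer's name.
            log.append("start job: " + card[len("$JOB"):].strip())
        elif card.startswith("$FORTRAN"):
            log.append("load FORTRAN compiler")
            target = source  # following cards are the program text
        elif card.startswith("$LOAD"):
            log.append(f"compile {len(source)} source card(s)")
            log.append("load object program")
            target = None
        elif card.startswith("$RUN"):
            target = data  # following cards are the program's input data
        elif card.startswith("$END"):
            # In this sketch the run is logged once all data cards are seen.
            log.append(f"run program on {len(data)} data card(s)")
            log.append("end job")
            break
        elif target is not None:
            target.append(card)
    return log
```

A deck such as `["$JOB 10,429754,A. PROGRAMMER", "$FORTRAN", ..., "$LOAD",
"$RUN", ..., "$END"]` (hypothetical values) produces one log entry per step
described in the paragraph above.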
Large second-generation computers were used mostly for scientific and
engineering calculations, such as solving the partial differential equations that
often occur in physics and engineering. They were largely programmed in
FORTRAN and assembly language. Typical operating systems were FMS (the Fortran
Monitor System) and IBSYS, IBM's operating system for the 7094.
HISTORY OF OPERATING SYSTEMS
World Wide Web sites that must process thousands of requests per second.
The greatest strength of the "one family" idea was simultaneously its greatest
weakness. The intention was that all software, including the operating system,
OS/360, had to work on all models. It had to run on small systems, which often
just replaced 1401s for copying cards to tape, and on very large systems, which
often replaced 7094s for doing weather forecasting and other heavy computing. It
had to be good on systems with few peripherals and on systems with many
peripherals. It had to work in commercial environments and in scientific
environments. Above all, it had to be efficient for all of these different uses.
There was no way that IBM (or anybody else) could write a piece of software
to meet all those conflicting requirements. The result was an enormous and
extraordinarily complex operating system, probably two to three orders of
magnitude larger than FMS. It consisted of millions of lines of assembly language
written by thousands of programmers, and contained thousands upon thousands of
bugs, which necessitated a continuous stream of new releases in an attempt to
correct them. Each new release fixed some bugs and introduced new ones, so the
number of bugs probably remained constant in time.
One of the designers of OS/360, Fred Brooks, subsequently wrote a witty and
incisive book (Brooks, 1996) describing his experiences with OS/360. While it
would be impossible to summarize the book here, suffice it to say that the cover
shows a herd of prehistoric beasts stuck in a tar pit. The cover of Silberschatz et
al. (2000) makes a similar point about operating systems being dinosaurs.
Despite its enormous size and problems, OS/360 and the similar
third-generation operating systems produced by other computer manufacturers actually
satisfied most of their customers reasonably well. They also popularized several
key techniques absent in second-generation operating systems. Probably the most
important of these was multiprogramming. On the 7094, when the current job
paused to wait for a tape or other I/O operation to complete, the CPU simply sat
idle until the I/O finished. With heavily CPU-bound scientific calculations, I/O is
infrequent, so this wasted time is not significant. With commercial data
processing, the I/O wait time can often be 80 or 90 percent of the total time, so something
had to be done to avoid having the (expensive) CPU be idle so much.
The solution that evolved was to partition memory into several pieces, with a
different job in each partition, as shown in Fig. 1-4. While one job was waiting
for I/O to complete, another job could be using the CPU. If enough jobs could be
held in main memory at once, the CPU could be kept busy nearly 100 percent of
the time. Having multiple jobs safely in memory at once requires special
hardware to protect each job against snooping and mischief by the other ones, but
the 360 and other third-generation systems were equipped with this hardware.
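A back-of-the-envelope model shows why holding more jobs in memory helps. The
sketch below assumes, as a simplification not stated in the passage above, that
each job waits for I/O a fixed fraction p of the time and that the jobs wait
independently of one another. The CPU is then idle only when all n jobs are
waiting at once, which happens with probability p to the power n. The function
name is made up for this example.

```python
def cpu_utilization(io_wait_fraction, jobs_in_memory):
    """Estimated fraction of time the CPU is busy, assuming each job
    independently waits for I/O `io_wait_fraction` of the time."""
    if not 0.0 <= io_wait_fraction <= 1.0:
        raise ValueError("io_wait_fraction must be between 0 and 1")
    if jobs_in_memory < 0:
        raise ValueError("jobs_in_memory must not be negative")
    # The CPU idles only when every job in memory is waiting for I/O.
    return 1.0 - io_wait_fraction ** jobs_in_memory
```

Under these assumptions, with an 80 percent I/O wait a single job keeps the CPU
busy only 20 percent of the time, while ten jobs in memory raise the estimate to
roughly 89 percent. Real jobs are not independent, so the figures are indicative
at best.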
(i.e., recapitulates) the evolution of the species (phylogeny). In other words, after
fertilization, a human egg goes through stages of being a fish, a pig, and so on
before turning into a human baby. Modern biologists regard this as a gross
simplification, but it still has a kernel of truth in it.
Something analogous has happened in the computer industry. Each new
species (mainframe, minicomputer, personal computer, embedded computer,
smart card, etc.) seems to go through the development that its ancestors did. The
first mainframes were programmed entirely in assembly language. Even complex
programs, like compilers and operating systems, were written in assembler. By
the time minicomputers appeared on the scene, FORTRAN, COBOL, and other
high-level languages were common on mainframes, but the new minicomputers
were nevertheless programmed in assembler (for lack of memory). When
microcomputers (early personal computers) were invented, they, too, were programmed
in assembler, even though by then minicomputers were also programmed in
high-level languages. Palmtop computers also started with assembly code but quickly
moved on to high-level languages (mostly because the development work was
done on bigger machines). The same is true for smart cards.
Now let us look at operating systems. The first mainframes initially had no
protection hardware and no support for multiprogramming, so they ran simple
operating systems that handled one manually-loaded program at a time. Later
they acquired the hardware and operating system support to handle multiple
programs at once, and then full timesharing capabilities.
When minicomputers first appeared, they also had no protection hardware and
ran one manually-loaded program at a time, even though multiprogramming was
well established in the mainframe world by then. Gradually, they acquired
protection hardware and the ability to run two or more programs at once. The first
microcomputers were also capable of running only one program at a time, but
later acquired the ability to multiprogram. Palmtops and smart cards went the
same route.
Disks first appeared on large mainframes, then on minicomputers,
microcomputers, and so on down the line. Even now, smart cards do not have hard disks,
but with the advent of flash ROM, they will soon have the equivalent of it. When
disks first appeared, primitive file systems sprung up. On the CDC 6600, easily
the most powerful mainframe in the world during much of the 1960s, the file
system consisted of users having the ability to create a file and then declare it to be
permanent, meaning it stayed on the disk even after the creating program exited.
To access such a file later, a program had to attach it with a special command and
give its password (supplied when the file was made permanent). In effect, there
was a single directory shared by all users. It was up to the users to avoid file
name conflicts. Early minicomputer file systems had a single directory shared by
all users and so did early microcomputer file systems.
Virtual memory (the ability to run programs larger than the physical memory)
had a similar development. It first appeared in mainframes, minicomputers,
microcomputers and gradually worked its way down to smaller and smaller
systems. Networking had a similar history.
In all cases, the software development was dictated by the technology. The
first microcomputers, for example, had something like 4 KB of memory and no
protection hardware. High-level languages and multiprogramming were simply
too much for such a tiny system to handle. As the microcomputers evolved into
modern personal computers, they acquired the necessary hardware and then the
necessary software to handle more advanced features. It is likely that this
development will continue for years to come. Other fields may also have this
wheel of reincarnation, but in the computer industry it seems to spin faster.
All of this history and development has left us with a wide variety of
operating systems, not all of which are widely known. In this section we will briefly
touch upon seven of them. We will come back to some of these different kinds of
systems later in the book.
At the high end are the operating systems for the mainframes, those
room-sized computers still found in major corporate data centers. These computers
distinguish themselves from personal computers in terms of their I/O capacity. A
mainframe with 1000 disks and thousands of gigabytes of data is not unusual; a
personal computer with these specifications would be odd indeed. Mainframes
are also making something of a comeback as high-end Web servers, servers for
large numbers of small requests, for example, check processing at a bank or
airline reservations. Each unit of work is small, but the system must handle
hundreds or thousands per second. Timesharing systems allow multiple remote users
to run jobs on the computer at once, such as querying a big database. These
functions are closely related: mainframe operating systems often perform all of them.
The next category is the personal computer operating system. Their job is to
provide a good interface to a single user. They are widely used for word
processing, spreadsheets, and Internet access. Common examples are Windows 98,
Windows 2000, the Macintosh operating system, and Linux. Personal computer
operating systems are so widely known that probably little introduction is needed.
In fact, many people are not even aware that other kinds exist.
The smallest operating systems run on smart cards, which are credit
card-sized devices containing a CPU chip. They have very severe processing power
and memory constraints. Some of them can handle only a single function, such as
electronic payments, but others can handle multiple functions on the same smart
card. Often these are proprietary systems.
Some smart cards are Java oriented. What this means is that the ROM on the
smart card holds an interpreter for the Java Virtual Machine (JVM). Java applets
(small programs) are downloaded to the card and are interpreted by the JVM
interpreter. Some of these cards can handle multiple Java applets at the same
time, leading to multiprogramming and the need to schedule them. Resource
management and protection also become an issue when two or more applets are
present at the same time. These issues must be handled by the (usually extremely
primitive) operating system present on the card.
[Figure: components of a personal computer on a bus. Legible labels: monitor,
hard disk drive, CPU, memory, video controller, keyboard controller, floppy disk
controller, hard disk controller, bus.]
[Figure: CPU units. Legible labels: fetch unit, decode unit.]
PSW controls the mode. When running in kernel mode, the CPU can execute
every instruction in its instruction set and use every feature of the hardware. The
operating system runs in kernel mode, giving it access to the complete hardware.
In contrast, user programs run in user mode, which permits only a subset of
the instructions to be executed and a subset of the features to be accessed.
Generally, all instructions involving I/O and memory protection are disallowed in user
mode. Setting the PSW mode bit to kernel mode is also forbidden, of course.
To obtain services from the operating system, a user program must make a
system call, which traps into the kernel and invokes the operating system. The
TRAP instruction switches from user mode to kernel mode and starts the operating
system. When the work has been completed, control is returned to the user
program at the instruction following the system call. We will explain the details of
the system call process later in this chapter. As a note on typography, we will use
the lower case Helvetica font to indicate system calls in running text, like this:
read.
It is worth noting that computers have traps other than the instruction for
executing a system call. Most of the other traps are caused by the hardware to warn
of an exceptional situation such as an attempt to divide by 0 or a floating-point
underflow. In all cases the operating system gets control and must decide what to
do. Sometimes the program must be terminated with an error. Other times the
error can be ignored (an underflowed number can be set to 0). Finally, when the
program has announced in advance that it wants to handle certain kinds of
conditions, control can be passed back to the program to let it deal with the problem.
[Figure: a typical memory hierarchy. Registers: typical access time about 1 nsec.
Cache. Main memory: about 10 nsec, 64-512 MB. Magnetic disk: about 10 msec,
5-50 GB.]
hit. ~ h rcquest
c is satisfied from h e cache and no rneintlry request is sen1 w e r the
bus lo the iniiiii mernrq. Cache hits normally take about ~ w t , cluck cycles
Cache misses have ti) go to mernory, with a substantial timc penalty. C;lche
rnerrtory i s limited in size due to ils high cost. Sornu machines have two .rjr even
~ h r c elevels of cache, each one slr~werand bigger than the m e before il.
Main memory comes next. This is the workhorse of the memory system.
Main memory is often called RAM (Random Access Memory). Old timers
sometimes call it core memory, because computers in the 1950s and 1960s used
tiny magnetizable ferrite cores for main memory. Currently, memories are tens to
hundreds of megabytes and growing rapidly. All CPU requests that cannot be
satisfied out of the cache go to main memory.
Next in the hierarchy is magnetic disk (hard disk). Disk storage is two orders
of magnitude cheaper than RAM per bit and often two orders of magnitude larger
as well. The only problem is that the time to randomly access data on it is close
to three orders of magnitude slower. This low speed is due to the fact that a disk
is a mechanical device, as shown in Fig. 1-8.
A disk consists of one or more metal platters that rotate at 5400, 7200, or
10,800 rpm. A mechanical arm pivots over the platters from the corner, similar to
the pickup arm on an old 33 rpm phonograph for playing vinyl records.
Information is written onto the disk in a series of concentric circles. At any given arm
position, each of the heads can read an annular region called a track. Together,
all the tracks for a given arm position form a cylinder.
Each track is divided into some number of sectors, typically 512 bytes per
sector. On modern disks, the outer cylinders contain more sectors than the inner
ones. Moving the arm from one cylinder to the next one takes about 1 msec.
Moving it to a random cylinder typically takes 5 msec to 10 msec, depending on
the drive. Once the arm is on the correct track, the drive must wait for the needed
sector to rotate under the head, an additional delay of 5 msec to 10 msec,
depending on the drive's rpm. Once the sector is under the head, reading or writing
occurs at a rate of 5 MB/sec on low-end disks to 160 MB/sec on faster ones.
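The timing figures above can be combined into a rough estimate of how long it
takes to read one randomly chosen sector. The following Python sketch is only
illustrative: the function name random_read_time_ms and its parameters are our
own, and it simply adds seek time, rotational delay, and transfer time, taking
1 MB to mean 10**6 bytes.

```python
def random_read_time_ms(seek_ms, rotation_ms, transfer_mb_per_sec, nbytes=512):
    """Estimate the time in msec to read nbytes from a random place on disk.

    seek_ms: time to move the arm to the right cylinder
    rotation_ms: time waiting for the sector to rotate under the head
    transfer_mb_per_sec: transfer rate (1 MB is taken as 10**6 bytes,
        an assumption of this sketch)
    """
    transfer_ms = nbytes / (transfer_mb_per_sec * 1_000_000) * 1000
    return seek_ms + rotation_ms + transfer_ms


# Best and worst cases using the ranges quoted in the text.
best = random_read_time_ms(5, 5, 160)
worst = random_read_time_ms(10, 10, 5)
print(f"best case:  {best:.4f} msec")
print(f"worst case: {worst:.4f} msec")
```

In both cases the mechanical delays dominate: transferring a single 512-byte
sector takes only a small fraction of a millisecond.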
SEC. 1.4 COMPUTER HARDWARE REVIEW
[Figure: structure of a disk drive, showing surfaces 0 through 6 and the
direction of arm motion.]
The final layer in the memory hierarchy is magnetic tape. This medium is
often used as a backup for disk storage and for holding very large data sets. To
access a tape, it must first be put into a tape reader, either by a person or a robot
(automated tape handling is common at installations with huge databases). Then
the tape may have to be spooled forward to get to the requested block. All in
all, this could take minutes. The big plus of tape is that it is exceedingly cheap
per bit and removable, which is important for backup tapes that must be stored
off-site in order to survive fires, floods, earthquakes, etc.
The memory hierarchy we have discussed is typical, but some installations do
not have all the layers or have a few different ones (such as optical disk). Still, in
all of them, as one goes down the hierarchy, the random access time increases
dramatically, the capacity increases equally dramatically, and the cost per bit
drops enormously. Consequently, it is likely that memory hierarchies will be
around for years to come.
In addition to the kinds of memory discussed above, many computers have a
small amount of nonvolatile random access memory. Unlike RAM, nonvolatile
memory does not lose its contents when the power is switched off. ROM (Read
Only Memory) is programmed at the factory and cannot be changed afterward. It
is fast and inexpensive. On some computers, the bootstrap loader used to start the
computer is contained in ROM. Also, some I/O cards come with ROM for
handling low-level device control.
EEPROM (Electrically Erasable ROM) and flash RAM are also nonvolatile,
but in contrast to ROM can be erased and rewritten. However, writing them
takes orders of magnitude more time than writing RAM, so they are used in the
same way ROM is, only with the additional feature that it is now possible to
correct bugs in programs they hold by rewriting them in the field.
Yet another kind of memory is CMOS, which is volatile. Many computers
use CMOS memory to hold the current time and date. The CMOS memory and
the clock circuit that increments the time in it are powered by a small battery, so
the time is correctly updated, even when the computer is unplugged. The CMOS
memory can also hold the configuration parameters, such as which disk to boot
from. CMOS is used because it draws so little power that the original
factory-installed battery often lasts for several years. However, when it begins to fail, the
computer can appear to have Alzheimer's disease, forgetting things that it has
known for years, like which hard disk to boot from.
Let us now focus on main memory for a little while. It is often desirable to
hold multiple programs in memory at once. If one program is blocked waiting for
a disk read to complete, another program can use the CPU, giving a better CPU
utilization. However, with two or more programs in main memory at once, two
[Figure: memory layouts in which user programs and data are located by base and
limit registers (labels include Base-1, Limit-1, and Base-2).]
The check and mapping result in converting an address generated by the
program, called a virtual address, into an address used by the memory, called a
physical address. The device that performs the check and mapping is called the
MMU (Memory Management Unit). It is located on the CPU chip or close to it,
but is logically between the CPU and the memory.
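A minimal sketch of the check and mapping, assuming the scheme with a single
base and limit register pair: a virtual address is accepted only if it is below
the limit, and the physical address is the virtual address plus the base. The
class name SimpleMMU, the ProtectionFault exception, and the register values
are hypothetical and chosen only for illustration.

```python
class ProtectionFault(Exception):
    """Raised when a virtual address falls outside the program's limit."""


class SimpleMMU:
    """Hypothetical MMU with a single base and limit register pair."""

    def __init__(self, base, limit):
        self.base = base    # physical address where the program starts
        self.limit = limit  # size of the program's address space in bytes

    def translate(self, virtual_address):
        # The check: the address must lie inside [0, limit).
        if not 0 <= virtual_address < self.limit:
            raise ProtectionFault(f"address {virtual_address} out of range")
        # The mapping: relocate by adding the base register.
        return self.base + virtual_address


mmu = SimpleMMU(base=0x4000, limit=0x1000)
print(hex(mmu.translate(0x0123)))   # within the limit, relocated by the base
try:
    mmu.translate(0x2000)           # beyond the limit
except ProtectionFault as err:
    print("fault:", err)
```

Switching to another program then amounts to loading a different base and limit
into the two registers.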
A more sophisticated MMU is illustrated in Fig. 1-9(b). Here we have an
MMU with two pairs of base and limit registers, one for the program text and one
for the data. The program counter and all other references to the program text use
pair 1 and data references use pair 2. As a consequence, it is now possible to have
multiple users share the same program with only one copy of it in memory,
something not possible with the first scheme. When program 1 is running, the four
registers are set as indicated by the arrows to the left of Fig. 1-9(b). When
program 2 is running, they are set as indicated by the arrows to the right of the figure.
Much more sophisticated MMUs exist. We will study some of them later in this
book.
1.4.3 I/O Devices
Memory is not the only resource that the operating system must manage. I/O
devices also interact heavily with the operating system. As we saw in Fig. 1-5,
I/O devices generally consist of two parts: a controller and the device itself. The
controller is a chip or a set of chips on a plug-in board that physically controls the
device. It accepts commands from the operating system, for example, to read data
from the device, and carries them out.
In many cases, the actual control of the device is very complicated and
detailed, so it is the job of the controller to present a simpler interface to the
operating system. For example, a disk controller might accept a command to read
sector 11,206 from disk 2. The controller then has to convert this linear sector
number to a cylinder, sector, and head. This conversion may be complicated by
the fact that outer cylinders have more sectors than inner ones and that some bad
sectors have been remapped onto other ones. Then the controller has to determine
which cylinder the disk arm is on and give it a sequence of pulses to move in or
out the requisite number of cylinders. It has to wait until the proper sector has
rotated under the head and then start reading and storing the bits as they come off
the drive, removing the preamble and computing the checksum. Finally, it has to
assemble the incoming bits into words and store them in memory. To do all this
work, controllers often contain small embedded computers that are programmed
to do their work.
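The conversion from a linear sector number to a cylinder, head, and sector can
be sketched as follows, under the simplifying assumption that every track holds
the same number of sectors and that no sectors have been remapped (exactly the
complications a real controller has to deal with). The function name
linear_to_chs and the geometry values are hypothetical.

```python
def linear_to_chs(sector_number, heads, sectors_per_track):
    """Convert a linear sector number to (cylinder, head, sector).

    Assumes a uniform geometry: every track has sectors_per_track sectors
    and sectors are numbered from 0 within a track. Real disks have more
    sectors on outer cylinders and may remap bad sectors.
    """
    sectors_per_cylinder = heads * sectors_per_track
    cylinder, remainder = divmod(sector_number, sectors_per_cylinder)
    head, sector = divmod(remainder, sectors_per_track)
    return cylinder, head, sector


# Hypothetical geometry: 4 heads, 32 sectors per track.
print(linear_to_chs(11206, heads=4, sectors_per_track=32))
```

The operating system only ever hands over the linear number; everything in the
function body is work the controller does on its behalf.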
The other piece is the actual device itself. Devices have fairly simple
interfaces, both because they cannot do much and to make them standard. The latter is
needed so that any IDE disk controller can handle any IDE disk, for example.
IDE stands for Integrated Drive Electronics and is the standard type of disk on
Pentiums and some other computers. Since the actual device interface is hidden
behind the controller, all that the operating system sees is the interface to the
controller, which may be different from the interface to the device.
Because each type of controller is different, different software is needed to
control each one. The software that talks to a controller, giving it commands and
accepting responses, is called a device driver. Each controller manufacturer has
to supply a driver for each operating system it supports. Thus a scanner may
come with drivers for Windows 98, Windows 2000, and UNIX, for example.
To be used, the driver has to be put into the operating system so it can run in
kernel mode. Theoretically, drivers can run outside the kernel, but few current
systems support this possibility because it requires the ability to allow a
user-space driver to be able to access the device in a controlled way, a feature rarely
supported. There are three ways the driver can be put into the kernel. The first
way is to relink the kernel with the new driver and then reboot the system. Many
UNIX systems work like this. The second way is to make an entry in an operating
system file telling it that it needs the driver and then reboot the system. At boot
time, the operating system goes and finds the drivers it needs and loads them.
Windows works this way. The third way is for the operating system to be able to
accept new drivers while running and install them on-the-fly without the need to
reboot. This way used to be rare but is becoming much more common now. Hot
pluggable devices, such as USB and IEEE 1394 devices (discussed below) always
need dynamically loaded drivers.
Every controller has a small number of registers that are used to communicate
with it. For example, a minimal disk controller might have registers for
specifying the disk address, memory address, sector count, and direction (read or write).
To activate the controller, the driver gets a command from the operating system,
then translates it into the appropriate values to write into the device registers.
On some computers, the device registers are mapped into the operating
system's address space, so they can be read and written like ordinary memory
words. On such computers, no special I/O instructions are needed and user
programs can be kept away from the hardware by not putting these memory
addresses within their reach (e.g., by using base and limit registers). On other
computers, the device registers are put in a special I/O port space, with each register
having a port address. On these machines, special IN and OUT instructions are
available in kernel mode to allow drivers to read and write the registers. The
former scheme eliminates the need for special I/O instructions but uses up some
of the address space. The latter uses no address space but requires special
instructions. Both systems are widely used.
Input and output can be done in three different ways. In the simplest method,
a user program issues a system call, which the kernel then translates into a
procedure call to the appropriate driver. The driver then starts the I/O and sits in a
tight loop continuously polling the device to see if it is done (usually there is some
bit that indicates that the device is still busy). When the I/O has completed, the
driver puts the data where they are needed (if any), and returns. The operating
system then returns control to the caller. This method is called busy waiting and
has the disadvantage of tying up the CPU polling the device until it is finished.
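Busy waiting can be illustrated with a small simulation. The FakeDevice class
below is purely hypothetical: it stands in for a device whose busy bit clears
after a fixed number of status checks, so that the cost of the polling loop
becomes visible as a count of wasted iterations.

```python
class FakeDevice:
    """Hypothetical device whose busy bit clears after a fixed number of polls."""

    def __init__(self, polls_until_done, data):
        self._remaining = polls_until_done
        self._data = data

    def start(self):
        pass  # A real driver would write the device registers here.

    def busy(self):
        # Each status check brings the simulated operation closer to done.
        if self._remaining > 0:
            self._remaining -= 1
            return True
        return False

    def read_data(self):
        return self._data


def busy_wait_read(device):
    """Start the I/O, spin until the busy bit clears, then fetch the data."""
    device.start()
    polls = 0
    while device.busy():   # tight polling loop: the CPU does nothing useful
        polls += 1
    return device.read_data(), polls


data, wasted_polls = busy_wait_read(FakeDevice(polls_until_done=1000, data=b"sector"))
print(data, wasted_polls)
```

Every iteration counted in wasted_polls is CPU time that another program could
have used, which is the disadvantage named above.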
The second method is for the driver to start the device and ask it to give an
interrupt when it is finished. At that point the driver returns. The operating
system then blocks the caller if need be and looks for other work to do. When the
controller detects the end of the transfer, it generates an interrupt to signal
completion.
Interrupts are very important in operating systems, so let us examine the idea
more closely. In Fig. 1-10(a) we see a four-step process for I/O. In step 1, the
driver tells the controller what to do by writing into its device registers. The
controller then starts the device and, when it has finished, signals the interrupt
controller chip using certain bus lines in step 2. If the interrupt controller is
prepared to accept the interrupt (which it may not be if it is busy with a higher
priority one), it asserts a pin on the CPU chip informing it, in step 3. In step 4, the
interrupt controller puts the number of the device on the bus so the CPU can read
it and know which device has just finished (many devices may be running at the
same time).
[Figure: a disk drive and disk controller raising an interrupt, and the dispatch
to the interrupt handler.]
Once the CPU has decided to take the interrupt, the program counter and
PSW are typically then pushed onto the current stack and the CPU switched into
kernel mode. The device number may be used as an index into part of memory to
find the address of the interrupt handler for this device. This part of memory is
called the interrupt vector. Once the interrupt handler (part of the driver for the
interrupting device) has started, it removes the stacked program counter and PSW
and saves them, then queries the device to learn its status. When the handler is all
finished, it returns to the previously-running user program to the first instruction
that was not yet executed. These steps are shown in Fig. 1-10(b).
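The dispatch through the interrupt vector can be sketched as a table lookup.
The model below is a deliberate simplification with hypothetical names (CPU,
register_handler, disk_handler): the PSW is reduced to a mode string, and the
saved program counter and PSW are restored by the simulated hardware rather
than by the handler itself.

```python
class CPU:
    """Hypothetical model of interrupt dispatch through an interrupt vector."""

    def __init__(self):
        self.pc = 0                 # program counter
        self.psw = "user"           # only the mode is modeled here
        self.stack = []
        self.interrupt_vector = {}  # device number -> handler

    def register_handler(self, device_number, handler):
        self.interrupt_vector[device_number] = handler

    def interrupt(self, device_number):
        # Hardware action: push PC and PSW, switch to kernel mode.
        self.stack.append((self.pc, self.psw))
        self.psw = "kernel"
        # Use the device number as an index into the interrupt vector.
        handler = self.interrupt_vector[device_number]
        result = handler(device_number)
        # Return from interrupt: restore the saved PC and PSW.
        self.pc, self.psw = self.stack.pop()
        return result


def disk_handler(device_number):
    return f"disk handler ran for device {device_number}"


cpu = CPU()
cpu.pc = 1234
cpu.register_handler(3, disk_handler)
print(cpu.interrupt(3))
print(cpu.pc, cpu.psw)
```

After the handler returns, the program counter and mode are exactly what they
were before the interrupt, so the interrupted program continues unaware of it.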
The third method for doing I/O makes use of a special DMA (Direct
Memory Access) chip that can control the flow of bits between memory and
some controller without constant CPU intervention. The CPU sets up the DMA
chip, telling it how many bytes to transfer, the device and memory addresses
involved, and the direction, and lets it go. When the DMA chip is done, it causes
an interrupt, which is handled as described above. DMA and I/O hardware in
general will be discussed in more detail in Chap. 5.
Interrupts can often happen at highly inconvenient moments, for example,
while another interrupt handler is running. For this reason, the CPU has a way to
disable interrupts and then reenable them later. While interrupts are disabled, any
devices that finish continue to assert their interrupt signals, but the CPU is not
interrupted until interrupts are enabled again. If multiple devices finish while
interrupts are disabled, the interrupt controller decides which one to let through
first, usually based on static priorities assigned to each device. The highest
priority device wins.
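The interplay of disabled interrupts and static priorities can be sketched with
a small model. The InterruptController class and its priority table are
hypothetical; a higher number is assumed to mean a higher priority.

```python
class InterruptController:
    """Hypothetical interrupt controller with static per-device priorities."""

    def __init__(self, priorities):
        self.priorities = priorities  # device number -> priority (higher wins)
        self.pending = set()          # devices asserting their interrupt line
        self.enabled = True

    def raise_interrupt(self, device):
        self.pending.add(device)      # the signal stays asserted until served

    def next_interrupt(self):
        """Return the device to service next, or None if nothing is delivered."""
        if not self.enabled or not self.pending:
            return None
        device = max(self.pending, key=lambda d: self.priorities[d])
        self.pending.remove(device)
        return device


controller = InterruptController({0: 1, 1: 5, 2: 3})
controller.enabled = False            # interrupts disabled
for device in (0, 2, 1):
    controller.raise_interrupt(device)
print(controller.next_interrupt())    # nothing is delivered while disabled
controller.enabled = True             # interrupts enabled again
print(controller.next_interrupt())    # the highest priority device goes first
```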
1.4.4 Buses
The organization of Fig. 1-5 was used on minicomputers for years and also on
the original IBM PC. However, as processors and memories got faster, the ability
of a single bus (and certainly the IBM PC bus) to handle all the traffic was
strained to the breaking point. Something had to give. As a result, additional
buses were added, both for faster I/O devices and for CPU to memory traffic. As
a consequence of this evolution, a large Pentium system currently looks
something like Fig. 1-11.
This system has eight buses (cache, local, memory, PCI, SCSI, USB, IDE,
and ISA), each with a different transfer rate and function. The operating system
must be aware of all of them for configuration and management. The two main
buses are the original IBM PC ISA (Industry Standard Architecture) bus and
its successor, the PCI (Peripheral Component Interconnect) bus. The ISA bus,
which was originally the IBM PC/AT bus, runs at 8.33 MHz and can transfer 2
bytes at once, for a maximum speed of 16.67 MB/sec. It is included for backward
compatibility with old and slow I/O cards. The PCI bus was invented by Intel as a
successor to the ISA bus. It can run at 66 MHz and transfer 8 bytes at a time, for
a data rate of 528 MB/sec. Most high-speed I/O devices use the PCI bus now.
Even some non-Intel computers use the PCI bus due to the large number of I/O
cards available for it.
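The two data rates quoted above follow from multiplying the clock rate by the
number of bytes moved per transfer. The short Python sketch below redoes that
arithmetic; the function name is our own, and it assumes one transfer per clock
cycle.

```python
def peak_bandwidth_mb_per_sec(clock_mhz, bytes_per_transfer):
    """Peak rate in MB/sec, assuming one transfer per clock cycle."""
    return clock_mhz * bytes_per_transfer


print(peak_bandwidth_mb_per_sec(8.33, 2))   # ISA bus
print(peak_bandwidth_mb_per_sec(66, 8))     # PCI bus
```

With the clock rounded to 8.33 MHz the ISA product comes out at about 16.66
rather than 16.67; the figure in the text corresponds to a clock of 8 1/3 MHz.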
In this configuration, the CPU talks to the PCI bridge chip over the local bus,
and the PCI bridge chip talks to the memory over a dedicated memory bus, often
running at 100 MHz. Pentium systems have a level-1 cache on chip and a much
larger level-2 cache off chip, connected to the CPU by the cache bus.
In addition, this system contains three specialized buses: IDE, USB, and
SCSI. The IDE bus is for attaching peripheral devices such as disks and CD-
32 INTRODUCTION
[Figure: the structure of a large Pentium system, showing the CPU, level 2 cache,
PCI bridge, PCI bus, ISA bridge, IDE disk, available PCI slot, ISA bus, printer,
and available ISA slot.]
1.5.1 Processes
A key concept in all operating systems is the process. A process is basically
a program in execution. Associated with each process is its address space, a list
of memory locations from some minimum (usually 0) to some maximum, which
the process can read and write. The address space contains the executable
program, the program's data, and its stack. Also associated with each process is
some set of registers, including the program counter, stack pointer, and other
hardware registers, and all the other information needed to run the program.
We will come back to the process concept in much more detail in Chap. 2, but
for the time being, the easiest way to get a good intuitive feel for a process is to
think about timesharing systems. Periodically, the operating system decides to
stop running one process and start running another, for example, because the first
one has had more than its share of CPU time in the past second.
SEC. 1.5 OPERATING SYSTEM CONCEPTS
When a process is suspended temporarily like this, it must later be restarted in
exactly the same state it had when it was stopped. This means that all information
about the process must be explicitly saved somewhere during the suspension. For
example, the process may have several files open for reading at once. Associated
with each of these files is a pointer giving the current position (i.e., the number of
the byte or record to be read next). When a process is temporarily suspended, all
these pointers must be saved so that a read call executed after the process is
restarted will read the proper data. In many operating systems, all the information
about each process, other than the contents of its own address space, is stored in
an operating system table called the process table, which is an array (or linked
list) of structures, one for each process currently in existence.
Thus, a (suspended) process consists of its address space, usually called the
core image (in honor of the magnetic core memories used in days of yore), and its
process table entry, which contains its registers, among other things.
The key process management system calls are those dealing with the creation
and termination of processes. Consider a typical example. A process called the
command interpreter or shell reads commands from a terminal. The user has
just typed a command requesting that a program be compiled. The shell must
now create a new process that will run the compiler. When that process has
finished the compilation, it executes a system call to terminate itself.
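On a UNIX-like system, the creation and termination just described can be
sketched with Python's os module, which exposes the underlying fork, _exit, and
waitpid calls. This is only a sketch: the function name run_child is our own,
the child terminates immediately instead of actually running a compiler, and
the code will not work on systems without fork.

```python
import os


def run_child(exit_status):
    """Create a child process, wait for it to terminate, return its status."""
    pid = os.fork()                 # create a new process
    if pid == 0:
        # Child: a real shell would now load the compiler into this process.
        os._exit(exit_status)       # system call to terminate itself
    # Parent (playing the role of the shell): wait for the child to finish.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print("child exited with status", run_child(0))
```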
If a process can create one or more other processes (referred to as child
processes) and these processes in turn can create child processes, we quickly
arrive at the process tree structure of Fig. 1-12. Related processes that are
cooperating to get some job done often need to communicate with one another
and synchronize their activities. This communication is called interprocess
communication, and will be addressed in detail in Chap. 2.
Figure 1-12. A process tree. Process A created two child processes, B and C.
Process B created three child processes, D, E, and F.
Other process system calls are available to request more memory (or release
unused memory), wait for a child process to terminate, and overlay its program
with a different one.
Occasionally, there is a need to convey information to a running process that
is not sitting around waiting for this information. For example, a process that is
communicating with another process on a different computer does so by sending
messages to the remote process over a computer network. To guard against the
possibility that a message or its reply is lost, the sender may request that its own
operating system notify it after a specified number of seconds, so that it can
retransmit the message if no acknowledgement has been received yet. After
setting this timer, the program may continue doing other work.
When the specified number of seconds has elapsed, the operating system
sends an alarm signal to the process. The signal causes the process to
temporarily suspend whatever it was doing, save its registers on the stack, and start
running a special signal handling procedure, for example, to retransmit a
presumably lost message. When the signal handler is done, the running process is
restarted in the state it was in just before the signal. Signals are the software analog
of hardware interrupts and can be generated by a variety of causes in addition to
timers expiring. Many traps detected by hardware, such as executing an illegal
instruction or using an invalid address, are also converted into signals to the guilty
process.
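The timer-and-signal pattern can be tried out on a UNIX-like system with
Python's signal module. The sketch below makes several assumptions of its own:
the function name wait_for_alarm is hypothetical, a fractional-second interval
timer is used instead of whole seconds so that the example finishes quickly,
the handler merely records that it ran instead of retransmitting anything, and
the code must run in the main thread.

```python
import signal
import time


def wait_for_alarm(delay_seconds=0.05):
    """Set a timer, keep working, and report whether the alarm handler ran."""
    fired = []

    def handler(signum, frame):
        # Runs when the alarm signal arrives, interrupting the loop below.
        fired.append(signum)

    old_handler = signal.signal(signal.SIGALRM, handler)
    signal.setitimer(signal.ITIMER_REAL, delay_seconds)  # ask the OS for an alarm
    try:
        deadline = time.monotonic() + 2.0   # safety net for this sketch
        while not fired and time.monotonic() < deadline:
            time.sleep(0.01)                # stand-in for "other work"
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)      # cancel any pending timer
        signal.signal(signal.SIGALRM, old_handler)   # restore the old handler
    return fired == [signal.SIGALRM]


if __name__ == "__main__":
    print("alarm delivered:", wait_for_alarm())
```

The loop never checks the clock for the alarm itself; it finds out only because
the handler was run on its behalf, which is the point of the mechanism.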
Each person authorized to use a system is assigned a UID (User
IDentification) by the system administrator. Every process started has the UID of the
person who started it. A child process has the same UID as its parent. Users can be
members of groups, each of which has a GID (Group IDentification).
One UID, called the superuser (in UNIX), has special power and may violate
many of the protection rules. In large installations, only the system administrator
knows the password needed to become superuser, but many of the ordinary users
(especially students) devote considerable effort to trying to find flaws in the
system that allow them to become superuser without the password.
We will study processes, interprocess communication, and related issues in
Chap. 2.
1.5.2 Deadlocks
When two or more processes are interacting, they can sometimes get
themselves into a stalemate situation they cannot get out of. Such a situation is called
a deadlock.
Deadlocks can best be introduced with a real-world example everyone is
familiar with, deadlock in traffic. Consider the situation of Fig. 1-13(a). Here four
buses are approaching an intersection. Behind each one are more buses (not
shown). With a little bit of bad luck, the first four could all arrive at the
intersection simultaneously, leading to the situation of Fig. 1-13(b), in which they are
deadlocked because none of them can go forward. Each one is blocking one of
the others. They cannot go backward due to other buses behind them. There is no
easy way out.
Processes in a computer can experience an analogous situation in which they
cannot make any progress. For example, imagine a computer with a tape drive
and CD-recorder. Now imagine that two processes each need to produce a
CD-ROM from data on a tape. Process 1 requests and is granted the tape drive. Next
process 2 requests and is granted the CD-recorder. Then process 1 requests the
CD-recorder and is suspended until process 2 returns it. Finally, process 2
requests the tape drive and is also suspended because process 1 already has it. Here
we have a deadlock.
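The tape drive and CD-recorder situation can be written down as two small
tables, one recording which process holds each resource and one recording which
resource each blocked process is waiting for. The sketch below, with the
hypothetical function name find_deadlock, follows the chain of "waits for the
holder of" links and reports a deadlock when the chain returns to where it
started. It assumes each process waits for at most one resource.

```python
def find_deadlock(holds, wants):
    """Return the processes in a wait-for cycle, or [] if there is none.

    holds: resource -> process currently holding it
    wants: process -> resource it is blocked waiting for
    Assumes each process waits for at most one resource.
    """
    for start in wants:
        seen = []
        process = start
        while process in wants and process not in seen:
            seen.append(process)
            process = holds.get(wants[process])  # who holds what we want?
        if process == start and seen:
            return seen
    return []


holds = {"tape drive": "process 1", "CD-recorder": "process 2"}
wants = {"process 1": "CD-recorder", "process 2": "tape drive"}
print(find_deadlock(holds, wants))
```

If only process 1 were waiting, the chain would end at process 2, which is not
blocked, and no deadlock would be reported.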
Every computer has some main memory that it uses to hold executing
programs. In a very simple operating system, only one program at a time is in
memory. To run a second program, the first one has to be removed and the
second one placed in memory.
More sophisticated operating systems allow multiple programs to be in
memory at the same time. To keep them from interfering with one another (and
with the operating system), some kind of protection mechanism is needed. While
this mechanism has to be in the hardware, it is controlled by the operating system.
The above viewpoint is concerned with managing and protecting the
computer's main memory. A different, but equally important memory-related
issue, is managing the address space of the processes. Normally, each process has
some set of addresses it can use, typically running from 0 up to some maximum.
In the simplest case, the maximum amount of address space a process has is less
than the main memory. In this way, a process can fill up its address space and
there will be enough room in main memory to hold it all.
However, on many computers addresses are 32 or 64 bits, giving an address
space of 2^32 or 2^64 bytes, respectively. What happens if a process has more
address space than the computer has main memory and the process wants to use it
all? In the first computers, such a process was just out of luck. Nowadays, a
technique called virtual memory exists, in which the operating system keeps part
of the address space in main memory and part on disk and shuttles pieces back
and forth between them as needed. This important operating system function, and
other memory management-related functions will be covered in Chap. 4.
All computers have physical devices for acquiring input and producing output.
After all, what good would a computer be if the users could not tell it what to do
and could not get the results after it did the work requested? Many kinds of input
and output devices exist, including keyboards, monitors, printers, and so on. It is
up to the operating system to manage these devices.
Consequently, every operating system has an I/O subsystem for managing its
I/O devices. Some of the I/O software is device independent, that is, applies to
many or all devices equally well. Other parts of it, such as device drivers, are
specific to particular I/O devices. In Chap. 5 we will have a look at I/O software.
1.5.5 Files
Another key concept supported by virtually all operating systems is the file
system. As noted before, a major function of the operating system is to hide the
peculiarities of the disks and other I/O devices and present the programmer with a
nice, clean abstract model of device-independent files. System calls are obviously
needed to create files, remove files, read files, and write files. Before a file can
be read, it must be located on the disk and opened, and after it has been read it
should be closed, so calls are provided to do these things.
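The sequence of calls just described (create, write, open, read, close, remove)
can be sketched with the thin wrappers in Python's os module. The function name
write_then_read, the file name, and the permission bits are our own choices for
illustration.

```python
import os
import tempfile


def write_then_read(path, payload):
    """Create a file, write it, then open, read, and close it again."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)  # create
    os.write(fd, payload)                                             # write
    os.close(fd)

    fd = os.open(path, os.O_RDONLY)       # locate and open before reading
    data = os.read(fd, len(payload))      # read
    os.close(fd)                          # close when done
    os.remove(path)                       # remove the file
    return data


with tempfile.TemporaryDirectory() as directory:
    print(write_then_read(os.path.join(directory, "example.txt"), b"hello"))
```

The integer returned by os.open identifies the open file in the later read and
close calls.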
To provide a place to keep files, most operating systems have the concept of a
directory as a way of grouping files together. A student, for example, might have
one directory for each course he is taking (for the programs needed for that
course), another directory for his electronic mail, and still another directory for his
World Wide Web home page. System calls are then needed to create and remove
directories. Calls are also provided to put an existing file in a directory, and to
remove a file from a directory. Directory entries may be either files or other
directories. This model also gives rise to a hierarchy, the file system, as shown
in Fig. 1-14.
The process and file hierarchies both are organized as trees, but the similarity
stops there. Process hierarchies usually are not very deep (more than three levels
is unusual), whereas file hierarchies are commonly four, five, or even more levels
deep. Process hierarchies are typically short-lived, generally a few minutes at
most, whereas the directory hierarchy may exist for years. Ownership and
protection also differ for processes and files. Typically, only a parent process may
control or even access a child process, but mechanisms nearly always exist to allow
files and directories to be read by a wider group than just the owner.
Every file within the directory hierarchy can be specified by giving its path
name from the top of the directory hierarchy, the root directory. Such absolute
path names consist of the list of directories that must be traversed from the root
directory to get to the file, with slashes separating the components. In Fig. 1-14,
[Figure: a file system tree starting at the root directory.]
However, the file system on the floppy cannot be used, because there is no
way to specify path names on it. UNIX does not allow path names to be prefixed
by a drive name or number; that would be precisely the kind of device dependence
that operating systems ought to eliminate. Instead, the mount system call allows
the file system on the floppy to be attached to the root file system wherever the
program wants it to be. In Fig. 1-15(b) the file system on the floppy has been
mounted on directory b, thus allowing access to files /b/x and /b/y. If directory b
had contained any files they would not be accessible while the floppy was
mounted, since /b would refer to the root directory of the floppy. (Not being able
to access these files is not as serious as it at first seems: file systems are nearly
always mounted on empty directories.) If a system contains multiple hard disks,
they can all be mounted into a single tree as well.
Another important concept in UNIX is the special file. Special files are
provided in order to make I/O devices look like files. That way, they can be read and
written using the same system calls as are used for reading and writing files. Two
kinds of special files exist: block special files and character special files. Block
special files are used to model devices that consist of a collection of randomly
addressable blocks, such as disks. By opening a block special file and reading,
say, block 4, a program can directly access the fourth block on the device, without
regard to the structure of the file system contained on it. Similarly, character
special files are used to model printers, modems, and other devices that accept or
output a character stream. By convention, the special files are kept in the /dev
directory. For example, /dev/lp might be the line printer.
The last feature we will discuss in this overview is one that relates to both
processes and files: pipes. A pipe is a sort of pseudofile that can be used to
connect two processes, as shown in Fig. 1-16. If processes A and B wish to talk using
a pipe, they must set it up in advance. When process A wants to send data to
process B, it writes on the pipe as though it were an output file. Process B can
read the data by reading from the pipe as though it were an input file. Thus,
communication between processes in UNIX looks very much like ordinary file reads
and writes. Stronger yet, the only way a process can discover that the output file
it is writing on is not really a file, but a pipe, is by making a special system call.
File systems are very important. We will have much more to say about them in
Chap. 6 and also in Chaps. 10 and 11.
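The pipe mechanism described above can be sketched in C. This is a minimal illustration, assuming a POSIX system; the function name pipe_roundtrip is hypothetical, and the child plays the role of process A (the writer) while the parent plays process B (the reader).

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send msg from a child process to the parent through a pipe.
 * Copies at most cap-1 bytes into out and NUL-terminates it.
 * Returns the number of bytes received, or -1 on error.
 * (pipe_roundtrip is a hypothetical name used for illustration.) */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t cap)
{
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    size_t total = 0;
    ssize_t n;

    assert(msg != NULL && out != NULL && cap > 0);
    if (pipe(fd) < 0)               /* the pipe is set up in advance */
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: process A, the writer */
        close(fd[0]);
        ssize_t w = write(fd[1], msg, strlen(msg));  /* an ordinary write */
        close(fd[1]);
        _exit(w < 0 ? 1 : 0);
    }

    /* parent: process B, the reader */
    close(fd[1]);                   /* so read sees end-of-file when A is done */
    while (total < cap - 1 &&
           (n = read(fd[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;         /* an ordinary read */
    out[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Note that both sides use plain read and write calls; only the setup step (pipe) reveals that no ordinary file is involved.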
1.5.6 Security
Computers contain large amounts of information that users often want to keep
confidential. This information may include electronic mail, business plans, tax
returns, and much more. It is up to the operating system to manage the system
security so that files, for example, are only accessible to authorized users.
As a simple example, just to get an idea of how security can work, consider
UNIX. Files in UNIX are protected by assigning each one a 9-bit binary protection
code. The protection code consists of three 3-bit fields, one for the owner, one for
other members of the owner's group (users are divided into groups by the system
administrator), and one for everyone else. Each field has a bit for read access, a
bit for write access, and a bit for execute access. These 3 bits are known as the
rwx bits. For example, the protection code rwxr-x--x means that the owner can
read, write, or execute the file, other group members can read or execute (but not
write) the file, and everyone else can execute (but not read or write) the file. For
a directory, x indicates search permission. A dash means that the corresponding
permission is absent.
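To make the encoding concrete, the following sketch converts between the 9-bit code and its rwx notation. The function names mode_to_string and string_to_mode are hypothetical helpers, not part of any standard library.

```c
#include <assert.h>
#include <string.h>

/* Render the low 9 bits of mode as a string such as "rwxr-x--x".
 * out must have room for 10 bytes (9 characters plus the NUL). */
void mode_to_string(unsigned mode, char out[10])
{
    static const char letters[] = "rwx";

    assert(out != NULL);
    for (int i = 0; i < 9; i++) {
        unsigned bit = 1u << (8 - i);      /* leftmost field is the owner's */
        out[i] = (mode & bit) ? letters[i % 3] : '-';
    }
    out[9] = '\0';
}

/* Parse a 9-character string such as "rwxr-x--x" into its 9-bit code.
 * Returns -1 if the string is malformed. */
int string_to_mode(const char *s)
{
    static const char letters[] = "rwx";
    int mode = 0;

    if (s == NULL || strlen(s) != 9)
        return -1;
    for (int i = 0; i < 9; i++) {
        mode <<= 1;
        if (s[i] == letters[i % 3])
            mode |= 1;                     /* permission present */
        else if (s[i] != '-')
            return -1;                     /* neither the letter nor a dash */
    }
    return mode;
}
```

With these helpers, rwxr-x--x corresponds to the binary code 111 101 001, which is 751 in octal.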
In addition to file protection, there are many other security issues. Protecting
the system from unwanted intruders, both human and nonhuman (e.g., viruses) is
one of them. We will look at various security issues in Chap. 9.
The operating system is the code that carries out the system calls. Editors,
compilers, assemblers, linkers, and command interpreters definitely are not part of
the operating system, even though they are important and useful. At the risk of
confusing things somewhat, in this section we will look briefly at the UNIX
command interpreter, called the shell. Although it is not part of the operating system,
it makes heavy use of many operating system features and thus serves as a good
example of how the system calls can be used. It is also the primary interface
between a user sitting at his terminal and the operating system, unless the user is
using a graphical user interface. Many shells exist, including sh, csh, ksh, and
bash. All of them support the functionality described below, which derives from
the original shell (sh).
When any user logs in, a shell is started up. The shell has the terminal as
standard input and standard output. It starts out by typing the prompt, a character
such as a dollar sign, which tells the user that the shell is waiting to accept a
command. If the user now types
date
for example, the shell creates a child process and runs the date program as the
child. While the child process is running, the shell waits for it to terminate.
When the child finishes, the shell types the prompt again and tries to read the next
input line.
The user can specify that standard output be redirected to a file, for example,
date >file
Similarly, standard input can be redirected, as in
sort <file1 >file2
which invokes the sort program with input taken from file1 and output sent to
file2.
The output of one program can be used as the input for another program by
connecting them with a pipe. Thus
cat file1 file2 file3 | sort >/dev/lp
invokes the cat program to concatenate three files and send the output to sort to
arrange all the lines in alphabetical order. The output of sort is redirected to the
file /dev/lp, typically the printer.
If a user puts an ampersand after a command, the shell does not wait for it to
complete. Instead it just gives a prompt immediately. Consequently,
cat file1 file2 file3 | sort >/dev/lp &
starts up the sort as a background job, allowing the user to continue working
normally while the sort is going on. The shell has a number of other interesting
features, which we do not have space to discuss here. Most books on UNIX
discuss the shell at some length (e.g., Kernighan and Pike, 1984; Kochan and Wood,
1990; Medinets, 1999; Newham and Rosenblatt, 1998; and Robbins, 1999).
Computer science, like many fields, is largely technology driven. The reason
the ancient Romans lacked cars is not that they liked walking so much. It is
because they did not know how to build cars. Personal computers exist not
because millions of people had some long pent-up desire to own a computer, but
because it is now possible to manufacture them cheaply. We often forget how
much technology affects our view of systems and it is worth reflecting on this
point from time to time.
In particular, it frequently happens that a change in technology renders some
idea obsolete and it quickly vanishes. However, another change in technology
could revive it again. This is especially true when the change has to do with the
relative performance of different parts of the system. For example, when CPUs
became much faster than memories, caches became important to speed up the
"slow" memory. If new memory technology some day makes memories much
faster than CPUs, caches will vanish. And if a new CPU technology makes them
faster than memories again, caches will reappear. In biology, extinction is
forever, but in computer science, it is sometimes only for a few years.
As a consequence of this impermanence, in this book we will from time to
time look at "obsolete" concepts, that is, ideas that are not optimal with current
technology. However, changes in the technology may bring back some of the
so-called "obsolete concepts." For this reason, it is important to understand why a
concept is obsolete and what changes in the environment might bring it back
again.
To make this point clearer, let us consider a few examples. Early computers
had hardwired instruction sets. The instructions were executed directly by
hardware and could not be changed. Then came microprogramming, in which an
underlying interpreter carried out the instructions in software. Hardwired
execution became obsolete. Then RISC computers were invented, and
microprogramming (i.e., interpreted execution) became obsolete because direct execution was
faster. Now we are seeing the resurgence of interpretation in the form of Java
applets that are sent over the Internet and interpreted upon arrival. Execution
speed is not always crucial because network delays are so great that they tend to
dominate. But that could change, too, some day.
Early operating systems allocated files on the disk by just placing them in
contiguous sectors, one after another. Although this scheme was easy to
implement, it was not flexible because when a file grew, there was not enough room to
store it any more. Thus the concept of contiguously allocated files was discarded
as obsolete. Until CD-ROMs came around. There the problem of growing files
did not exist. All of a sudden, the simplicity of contiguous file allocation was
seen as a great idea and CD-ROM file systems are now based on it.
As our final idea, consider dynamic linking. The MULTICS system was
designed to run day and night without ever stopping. To fix bugs in software, it
was necessary to have a way to replace library procedures while they were being
used. The concept of dynamic linking was invented for this purpose. After
MULTICS died, the concept was forgotten for a while. However, it was rediscovered
when modern operating systems needed a way to allow many programs to share
the same library procedures without having their own private copies (because
graphics libraries had grown so large). Most systems now support some form of
dynamic linking once again. The list goes on, but these examples should make
the point: an idea that is obsolete today may be the star of the party tomorrow.
Technology is not the only factor that drives systems and software.
Economics plays a big role too. In the 1960s and 1970s, most terminals were mechanical
printing terminals or 25 x 80 character-oriented CRTs rather than bitmap graphics
terminals. This choice was not a question of technology. Bit-map graphics
terminals were in use before 1960. It is just that they cost many tens of thousands of
dollars each. Only when the price came down enormously could people (other
than the military) think of dedicating one terminal to an individual user.
cute a trap or system call instruction to transfer control to the operating system.
The operating system then figures out what the calling process wants by
inspecting the parameters. Then it carries out the system call and returns control to the
Figure 1-17. The 11 steps in making the system call read(fd, buffer, nbytes).
the parameters pushed before the call to read. The program is now free to do
whatever it wants to do next.
In step 9 above, we said "may be returned to the user-space library procedure
..." for good reason. The system call may block the caller, preventing it from
continuing. For example, if it is trying to read from the keyboard and nothing has
been typed yet, the caller has to be blocked. In this case, the operating system
will look around to see if some other process can be run next. Later, when the
desired input is available, this process will get the attention of the system and
steps 9-11 will occur.
In the following sections, we will examine some of the most heavily used
POSIX system calls, or more specifically, the library procedures that make those
system calls. POSIX has about 100 procedure calls. Some of the most important
ones are listed in Fig. 1-18, grouped for convenience in four categories. In the
text we will briefly examine each call to see what it does. To a large extent, the
services offered by these calls determine most of what the operating system has to
do, since the resource management on personal computers is minimal (at least
compared to big machines with multiple users). The services include things like
creating and terminating processes, creating, deleting, reading, and writing files,
managing directories, and performing input and output.
SEC. 1.6 SYSTEM CALLS
Process management
  Call                                    Description
  pid = fork()                            Create a child process identical to the parent
  pid = waitpid(pid, &statloc, options)   Wait for a child to terminate
  s = execve(name, argv, environp)        Replace a process' core image
  exit(status)                            Terminate process execution
File management
  Call                                    Description
  s = close(fd)                           Close an open file
  n = read(fd, buffer, nbytes)            Read data from a file into a buffer
  n = write(fd, buffer, nbytes)           Write data from a buffer into a file
  position = lseek(fd, offset, whence)    Move the file pointer
  s = stat(name, &buf)                    Get a file's status information
Directory and file system management
  (entries not recoverable)
Miscellaneous
  Call                                    Description
  s = chdir(dirname)                      Change the working directory
  s = chmod(name, mode)                   Change a file's protection bits
  s = kill(pid, signal)                   Send a signal to a process
  seconds = time(&seconds)                Get the elapsed time since Jan. 1, 1970
Figure 1-18. Some of the major POSIX system calls. The return code s is -1 if
an error has occurred. The return codes are as follows: pid is a process id, fd is a
file descriptor, n is a byte count, position is an offset within the file, and seconds
is the elapsed time. The parameters are explained in the text.
whether they are system calls, library calls, or something else. If a procedure can
be carried out without invoking a system call (i.e., without trapping to the kernel),
it will usually be done in user space for reasons of performance. However, most
of the POSIX procedures do invoke system calls, usually with one procedure
mapping directly onto one system call. In a few cases, especially where several
required procedures are only minor variations of one another, one system call
handles more than one library call.
The first group of calls in Fig. 1-18 deals with process management. Fork is a
good place to start the discussion. Fork is the only way to create a new process in
UNIX. It creates an exact duplicate of the original process, including all the file
descriptors, registers--everything. After the fork, the original process and the
copy (the parent and child) go their separate ways. All the variables have
identical values at the time of the fork, but since the parent's data are copied to create
the child, subsequent changes in one of them do not affect the other one. (The
program text, which is unchangeable, is shared between parent and child.) The
fork call returns a value, which is zero in the child and equal to the child's process
identifier or PID in the parent. Using the returned PID, the two processes can see
which one is the parent process and which one is the child process.
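The copy semantics of fork can be observed directly. The sketch below, with the hypothetical function name fork_copies_data, lets the child change its copy of a variable and hand the result back through its exit status, while the parent's copy stays untouched.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, let the child modify its copy of a variable, and report both views.
 * On success returns 0, stores the parent's value in *parent_view and the
 * value the child ended up with in *child_view. Returns -1 on error.
 * (fork_copies_data is a hypothetical name used for illustration.) */
int fork_copies_data(int *parent_view, int *child_view)
{
    int x = 1;
    int status;

    assert(parent_view != NULL && child_view != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {             /* fork returned 0: this is the child */
        x = x + 41;             /* changes only the child's copy */
        _exit(x);               /* pass the value back as the exit status */
    }

    /* fork returned the child's PID: this is the parent */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    *parent_view = x;           /* still 1: the child's change is invisible */
    *child_view = WEXITSTATUS(status);
    return 0;
}
```

The exit status is used here only as a convenient one-byte channel from child to parent; it can carry values from 0 to 255.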
In most cases, after a fork, the child will need to execute different code from
the parent. Consider the case of the shell. It reads a command from the terminal,
forks off a child process, waits for the child to execute the command, and then
reads the next command when the child terminates. To wait for the child to
finish, the parent executes a waitpid system call, which just waits until the child
terminates (any child if more than one exists). Waitpid can wait for a specific child,
or for any old child by setting the first parameter to -1. When waitpid completes,
the address pointed to by the second parameter, statloc, will be set to the child's
exit status (normal or abnormal termination and exit value). Various options are
also provided, specified by the third parameter.
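As an illustration of how the status word filled in by waitpid distinguishes normal from abnormal termination, consider the following sketch. The struct child_result and the function spawn_and_wait are hypothetical; the macros WIFEXITED, WEXITSTATUS, WIFSIGNALED, and WTERMSIG are the standard POSIX ones.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* How a child terminated (hypothetical type used for illustration). */
struct child_result {
    int normal;   /* 1 if the child exited, 0 if it was killed by a signal */
    int value;    /* the exit value, or the number of the fatal signal */
};

/* Fork a child that exits with code (when sig == 0) or terminates itself
 * with signal sig, then collect its status with waitpid.
 * Returns 0 on success, -1 on error. */
int spawn_and_wait(int code, int sig, struct child_result *res)
{
    int statloc;

    assert(res != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);                 /* abnormal termination */
        _exit(code);                    /* normal termination */
    }

    /* Passing -1 instead of pid would wait for any child. */
    if (waitpid(pid, &statloc, 0) != pid)
        return -1;
    if (WIFEXITED(statloc)) {
        res->normal = 1;
        res->value = WEXITSTATUS(statloc);
    } else if (WIFSIGNALED(statloc)) {
        res->normal = 0;
        res->value = WTERMSIG(statloc);
    } else {
        return -1;
    }
    return 0;
}
```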
Now consider how fork is used by the shell. When a command is typed, the
shell forks off a new process. This child process must execute the user command.
It does this by using the execve system call, which causes its entire core image to
be replaced by the file named in its first parameter. (Actually, the system call
itself is exec, but several different library procedures call it with different
parameters and slightly different names. We will treat these as system calls here.) A
highly simplified shell illustrating the use of fork, waitpid, and execve is shown in
Fig. 1-19.
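A rough sketch of such a loop follows. It is not the book's figure: the name tiny_shell and the limits MAX_LINE and MAX_ARGS are assumptions, and it uses the library routine execvp (which searches the PATH for the command) rather than calling execve directly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_LINE 256   /* longest command line accepted (assumption) */
#define MAX_ARGS 32    /* most words per command (assumption) */

/* Read commands from in, one per line, and run each in a child process.
 * Returns the exit status of the last command run, or -1 if none ran. */
int tiny_shell(FILE *in)
{
    char line[MAX_LINE];
    int last = -1;

    assert(in != NULL);
    while (fgets(line, sizeof line, in) != NULL) {    /* read a command */
        char *argv[MAX_ARGS];
        int argc = 0;
        int status;

        /* Split the line into words; no quoting or redirection here. */
        for (char *w = strtok(line, " \t\n");
             w != NULL && argc < MAX_ARGS - 1;
             w = strtok(NULL, " \t\n"))
            argv[argc++] = w;
        argv[argc] = NULL;
        if (argc == 0)
            continue;                                 /* blank line */

        pid_t pid = fork();                           /* fork off a child */
        if (pid < 0)
            return -1;
        if (pid == 0) {
            execvp(argv[0], argv);                    /* child: run command */
            _exit(127);                               /* reached only if exec failed */
        }
        if (waitpid(pid, &status, 0) < 0)             /* parent: wait for child */
            return -1;
        last = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    return last;
}
```

Calling tiny_shell(stdin) would give an interactive loop without a prompt; a real shell adds prompting, redirection, pipes, and background jobs on top of this skeleton.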
In the most general case, execve has three parameters: the name of the file to
be executed, a pointer to the argument array, and a pointer to the environment
array. These will be described shortly. Various library routines, including execl,
execv, execle, and execve, are provided to allow the parameters to be omitted or
specified in various ways. Throughout this book we will use the name exec to
represent the system call invoked by all of these.
Let us consider the case of a command such as
cp file1 file2
used to copy file1 to file2. After the shell has forked, the child process locates and
executes the file cp and passes to it the names of the source and target files.
The main program of cp (and main program of most other C programs)
contains the declaration
main(argc, argv, envp)
where argc is a count of the number of items on the command line, including the
program name. For the example above, argc is 3.
The second parameter, argv, is a pointer to an array. Element i of that array is
a pointer to the i-th string on the command line. In our example, argv[0] would
point to the string "cp", argv[1] would point to the string "file1" and argv[2]
would point to the string "file2".
The third parameter of main, envp, is a pointer to the environment, an array of
strings containing assignments of the form name = value used to pass information
such as the terminal type and home directory name to a program. In Fig. 1-19, no
environment is passed to the child, so the third parameter of execve is a zero.
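The layout of these two arrays can be illustrated with a pair of small helpers. The names count_args and env_lookup are hypothetical; both rely only on the convention that argv and envp are arrays of string pointers terminated by a null pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the entries of a NULL-terminated argument array (this is argc). */
int count_args(char *const argv[])
{
    int n = 0;

    assert(argv != NULL);
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Look up name in an environment array of "name=value" strings.
 * Returns a pointer to the value, or NULL if name is not present. */
const char *env_lookup(char *const envp[], const char *name)
{
    size_t len;

    assert(envp != NULL && name != NULL);
    len = strlen(name);
    for (int i = 0; envp[i] != NULL; i++)
        if (strncmp(envp[i], name, len) == 0 && envp[i][len] == '=')
            return envp[i] + len + 1;   /* skip past "name=" */
    return NULL;
}
```

For the command cp file1 file2, count_args applied to the argument array gives 3, matching the value of argc described above.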
If exec seems complicated, do not despair; it is (semantically) the most
complex of all the POSIX system calls. All the other ones are much simpler. As an
example of a simple one, consider exit, which processes should use when they are
finished executing. It has one parameter, the exit status (0 to 255), which is
returned to the parent via statloc in the waitpid system call.
Processes in UNIX have their memory divided up into three segments: the text
segment (i.e., the program code), the data segment (i.e., the variables), and the
stack segment. The data segment grows upward and the stack grows downward,
as shown in Fig. 1-20. Between them is a gap of unused address space. The stack
grows into the gap automatically, as needed, but expansion of the data segment is
done explicitly by using a system call, brk, which specifies the new address where
the data segment is to end. This call, however, is not defined by the POSIX
standard, since programmers are encouraged to use the malloc library procedure for
dynamically allocating storage, and the underlying implementation of malloc was
not thought to be a suitable subject for standardization since few programmers use
Many system calls relate to the file system. In this section we will look at
calls that operate on individual files; in the next one we will examine those that
involve directories or the file system as a whole.
To read or write a file, the file must first be opened using open. This call
specifies the file name to be opened, either as an absolute path name or relative to
the working directory, and a code of O_RDONLY, O_WRONLY, or O_RDWR,
meaning open for reading, writing, or both. To create a new file, O_CREAT is
used. The file descriptor returned can then be used for reading or writing.
Afterward, the file can be closed by close, which makes the file descriptor available for
reuse on a subsequent open.
The most heavily used calls are undoubtedly read and write. We saw read
earlier. Write has the same parameters.
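The four calls fit together as in the following sketch of a file copy. The name copy_file, the 4096-byte buffer, and the 0644 creation mode are assumptions chosen for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the file src to dst using open, read, write, and close.
 * Returns the number of bytes copied, or -1 on error. */
long copy_file(const char *src, const char *dst)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    assert(src != NULL && dst != NULL);
    int in = open(src, O_RDONLY);                      /* open for reading */
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);  /* create */
    if (out < 0) {
        close(in);
        return -1;
    }

    while ((n = read(in, buf, sizeof buf)) > 0) {      /* n = bytes read */
        ssize_t off = 0;
        while (off < n) {                              /* write may be partial */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(in);
                close(out);
                return -1;
            }
            off += w;
        }
        total += n;
    }
    close(in);                                         /* descriptors are now */
    close(out);                                        /* free for reuse */
    return n < 0 ? -1 : total;
}
```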
Although most programs read and write files sequentially, for some
applications programs need to be able to access any part of a file at random. Associated
with each file is a pointer that indicates the current position in the file. When
reading (writing) sequentially, it normally points to the next byte to be read
(written). The lseek call changes the value of the position pointer, so that subsequent
calls to read or write can begin anywhere in the file.
Lseek has three parameters: the first is the file descriptor for the file, the
second is a file position, and the third tells whether the file position is relative to
the beginning of the file, the current position, or the end of the file. The value
returned by lseek is the absolute position in the file after changing the pointer.
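The following sketch shows random access with lseek. The function name byte_at is hypothetical; SEEK_SET, SEEK_CUR, and SEEK_END are the standard values for the third parameter.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open path, move the file pointer with lseek, and read the single byte
 * found there. whence is SEEK_SET, SEEK_CUR, or SEEK_END. If pos is not
 * NULL, the absolute position returned by lseek is stored in *pos.
 * Returns the byte (0..255), or -1 on error or end of file. */
int byte_at(const char *path, off_t offset, int whence, off_t *pos)
{
    unsigned char c;
    int result = -1;

    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    off_t where = lseek(fd, offset, whence);   /* absolute position, or -1 */
    if (where != (off_t)-1 && read(fd, &c, 1) == 1) {
        if (pos != NULL)
            *pos = where;
        result = c;
    }
    close(fd);
    return result;
}
```

For a file containing the ten characters 0123456789, an offset of 3 from the beginning yields '3', and an offset of -1 from the end yields '9' at absolute position 9.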
For each file, UNIX keeps track of the file mode (regular file, special file,
directory, and so on), size, time of last modification, and other information.
Programs can ask to see this information via the stat system call. The first parameter
specifies the file to be inspected; the second one is a pointer to a structure where
the information is to be put.
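A short sketch of how a program might use stat to tell the kinds of file apart is given below. The function file_kind and its one-letter return codes are hypothetical; struct stat and the S_IS... macros are standard POSIX.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Classify path using the mode returned by stat.
 * Returns 'd' for a directory, 'c' for a character special file,
 * 'b' for a block special file, 'f' for a regular file, 'o' for
 * anything else, and 0 if stat fails (e.g., no such file). */
char file_kind(const char *path)
{
    struct stat sb;                 /* stat fills in this structure */

    assert(path != NULL);
    if (stat(path, &sb) < 0)
        return 0;
    if (S_ISDIR(sb.st_mode))
        return 'd';
    if (S_ISCHR(sb.st_mode))
        return 'c';
    if (S_ISBLK(sb.st_mode))
        return 'b';
    if (S_ISREG(sb.st_mode))
        return 'f';
    return 'o';
}
```

The same structure also carries the size (st_size) and the time of last modification (st_mtime).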
the file memo in jim's directory is now entered into ast's directory under the name
note. Thereafter, /usr/jim/memo and /usr/ast/note refer to the same file. As an
aside, whether user directories are kept in /usr, /user, /home, or somewhere else is
simply a decision made by the local system administrator.
Understanding how link works will probably make it clearer what it does.
Every file in UNIX has a unique number, its i-number, that identifies it. This
i-number is an index into a table of i-nodes, one per file, telling who owns the file,
where its disk blocks are, and so on. A directory is simply a file containing a set
of (i-number, ASCII name) pairs. In the first versions of UNIX, each directory
entry was 16 bytes--2 bytes for the i-number and 14 bytes for the name. Now a
more complicated structure is needed to support long file names, but conceptually
a directory is still a set of (i-number, ASCII name) pairs. In Fig. 1-21, mail has
i-number 16, and so on. What link does is simply create a new directory entry with
a (possibly new) name, using the i-number of an existing file. In Fig. 1-21(b), two
entries have the same i-number (70) and thus refer to the same file. If either one
is later removed, using the unlink system call, the other one remains. If both are
removed, UNIX sees that no entries to the file exist (a field in the i-node keeps
track of the number of directory entries pointing to the file), so the file is removed
from the disk.
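The i-number and the link count are both visible through stat, which makes the behavior of link and unlink easy to check. In this sketch the helper names create_file, link_count, and same_file are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty file; returns 0 on success, -1 on error. */
int create_file(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    return fd < 0 ? -1 : close(fd);
}

/* Number of directory entries pointing at the file's i-node, or -1. */
long link_count(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    return stat(path, &sb) < 0 ? -1 : (long)sb.st_nlink;
}

/* 1 if both names refer to the same file (same device and i-number),
 * 0 if they do not, -1 if either name cannot be examined. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    assert(a != NULL && b != NULL);
    if (stat(a, &sa) < 0 || stat(b, &sb) < 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

After link(memo, note) succeeds, same_file reports 1 and link_count reports 2 for either name; after one name is unlinked, the count for the other drops back to 1.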
As we have mentioned earlier, the mount system call allows two file systems
to be merged into one. A common situation is to have the root file system
containing the binary (executable) versions of the common commands and other
heavily used files, on a hard disk. The user can then insert a floppy disk with files
to be read into the floppy disk drive.
By executing the mount system call, the floppy disk file system can be
attached to the root file system, as shown in Fig. 1-22. A typical statement in C to
perform the mount is
where the first parameter is the name of a block special file for drive 0, the second
parameter is the place in the tree where it is to be mounted, and the third
parameter tells whether the file system is to be mounted read-write or read-only.
Figure 1-22. (a) File system before the mount. (b) File system after the mount.
After the mount call, a file on drive 0 can be accessed by just using its path
from the root directory or the working directory, without regard to which drive it
is on. In fact, second, third, and fourth drives can also be mounted anywhere in
the tree. The mount call makes it possible to integrate removable media into a
single integrated file hierarchy, without having to worry about which device a file
is on. Although this example involves floppy disks, hard disks or portions of hard
disks (often called partitions or minor devices) can also be mounted this way.
When a file system is no longer needed, it can be unmounted with the umount
system call.
A variety of other system calls exist as well. We will look at just four of them
here. The chdir call changes the current working directory. After the call
chdir("/usr/ast/test");
an open on the file xyz will open /usr/ast/test/xyz. The concept of a working
directory eliminates the need for typing (long) absolute path names all the time.
In UNIX every file has a mode used for protection. The mode includes the
read-write-execute bits for the owner, group, and others. The chmod system call
makes it possible to change the mode of a file. For example, to make a file
read-only by everyone except the owner, one could execute
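One way to do this is sketched below. The function name create_read_only_for_others and the choice of mode 0644 (rw-r--r--: read and write for the owner, read-only for the group and everyone else) are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create path if it does not exist, then set its protection bits so that
 * only the owner may write it. Returns the resulting 9-bit mode as read
 * back with stat, or -1 on error. */
int create_read_only_for_others(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT, 0666);
    if (fd < 0)
        return -1;
    close(fd);
    if (chmod(path, 0644) < 0)      /* octal 644 = rw-r--r-- */
        return -1;
    if (stat(path, &sb) < 0)
        return -1;
    return (int)(sb.st_mode & 0777);
}
```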
The kill system call is the way users and user processes send signals. If a
process is prepared to catch a particular signal, then when it arrives, a signal
handler is run. If the process is not prepared to handle a signal, then its arrival
kills the process (hence the name of the call).
POSIX defines several procedures for dealing with time. For example, time
just returns the current time in seconds, with 0 corresponding to Jan. 1, 1970 at
midnight (just as the day was starting, not ending). On computers with 32-bit
words, the maximum value time can return is 2^32 - 1 seconds (assuming an
unsigned integer is used). This value corresponds to a little over 136 years. Thus
in the year 2106, 32-bit UNIX systems will go berserk, imitating the famous Y2K
problem. If you currently have a 32-bit UNIX system, you are advised to trade it
in for a 64-bit one sometime before the year 2106.
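The arithmetic behind the figure of 136 years can be checked with a few lines of C. The helper names are hypothetical, and the conversion uses an average year of 365.25 days, so the computed year is an approximation that is adequate for this purpose.

```c
#include <assert.h>
#include <stdint.h>

#define SECONDS_PER_YEAR (365.25 * 24 * 60 * 60)   /* average year (assumption) */

/* Largest value an unsigned counter of the given width can hold. */
uint64_t max_unsigned(unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    return ((uint64_t)1 << bits) - 1;
}

/* Convert a number of seconds to (fractional) years. */
double seconds_to_years(uint64_t seconds)
{
    return (double)seconds / SECONDS_PER_YEAR;
}

/* Approximate calendar year in which a counter of seconds since
 * Jan. 1, 1970 with the given number of value bits overflows. */
int rollover_year(unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    return 1970 + (int)seconds_to_years(max_unsigned(bits));
}
```

With 32 value bits the counter lasts 4,294,967,295 seconds, about 136.1 years, giving 2106. If the counter is a signed 32-bit integer, only 31 bits hold the value and the same calculation gives 2038.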
[Figure 1-23: table of UNIX calls and their Win32 counterparts; only the rows below are recoverable.]
  UNIX     Win32              Description
  fork     CreateProcess
  exit     ExitProcess        Terminate execution
  open     CreateFile
  close    CloseHandle        Close a file
  read     ReadFile
  mkdir    CreateDirectory    Create a new directory
Let us now briefly go through the list of Fig. 1-23. CreateProcess creates a
new process. It does the combined work of fork and execve in UNIX. It has many
parameters specifying the properties of the newly created process. Windows does
not have a process hierarchy as UNIX does so there is no concept of a parent
process and a child process. After a process is created, the creator and createe are
equals. WaitForSingleObject is used to wait for an event. Many possible events
can be waited for. If the parameter specifies a process, then the caller waits for
the specified process to exit, which is done using ExitProcess.
The next six calls operate on files and are functionally similar to their UNIX
counterparts although they differ in the parameters and details. Still, files can be
opened, closed, read, and written pretty much as in UNIX. The SetFilePointer and
GetFileAttributesEx calls set the file position and get some of the file attributes.
Windows has directories and they are created and removed with CreateDirectory
and RemoveDirectory, respectively. There is also a notion of a current directory,
set by SetCurrentDirectory. The current time is acquired using GetLocalTime.
The Win32 interface does not have links to files, mounted file systems,
security, or signals, so the calls corresponding to the UNIX ones do not exist. Of
course, Win32 has a huge number of other calls that UNIX does not have,
especially for managing the GUI. And Windows 2000 has an elaborate security
system and also supports file links.
One last note about Win32 is perhaps worth making. Win32 is not a terribly
uniform or consistent interface. The main culprit here was the need to be
backward compatible with the previous 16-bit interface used in Windows 3.x.
then executing a trap instruction. This instruction switches the machine from user
mode to kernel mode and transfers control to the operating system, shown as step
6 in Fig. 1-17. The operating system then fetches the parameters and determines
which system call is to be carried out. After that, it indexes into a table that
contains in slot k a pointer to the procedure that carries out system call k (step 7 in
Fig. 1-17).
This organization suggests a basic structure for the operating system:
1. A main program that invokes the requested service procedure.
2. A set of service procedures that carry out the system calls.
3. A set of utility procedures that help the service procedures.
In this model, for each system call there is one service procedure that takes care
of it. The utility procedures do things that are needed by several service
procedures, such as fetching data from user programs. This division of the
procedures into three layers is shown in Fig. 1-24.
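The three layers, together with the table indexed by system call number, can be modeled in a few lines of ordinary C. Everything here is a toy: the two "system calls" sys_add and sys_diff, the utility clamp_nonnegative, and the function dispatch are hypothetical stand-ins, and no real trap takes place.

```c
#include <assert.h>
#include <stddef.h>

/* Utility procedure: shared by several service procedures. */
static long clamp_nonnegative(long v)
{
    return v < 0 ? 0 : v;
}

/* Service procedures: one per (toy) system call. */
static long sys_add(long a, long b)
{
    return clamp_nonnegative(a) + clamp_nonnegative(b);
}

static long sys_diff(long a, long b)
{
    return clamp_nonnegative(a - b);
}

/* The table: slot k holds a pointer to the procedure for system call k. */
typedef long (*service_proc)(long, long);
static const service_proc call_table[] = { sys_add, sys_diff };
#define NCALLS (sizeof call_table / sizeof call_table[0])

/* Main program: what would run after the trap. It checks the call number,
 * indexes into the table, and invokes the service procedure.
 * Returns -1 for an unknown call number. */
long dispatch(unsigned k, long a, long b)
{
    if (k >= NCALLS)
        return -1;
    assert(call_table[k] != NULL);
    return call_table[k](a, b);
}
```

Adding a new call in this model means writing one more service procedure and appending it to the table; the main program does not change.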
[Figure 1-24: three layers, from top to bottom: main procedure, service procedures, utility procedures.]
which could be programmed without having to worry about the fact that multiple
processes were running on a single processor. In other words, layer 0 provided
the basic multiprogramming of the CPU.
write a program to test and grade student programs and run this program in ring n,
with the student programs running in ring n + 1 so that they could not change their
grades.
With VM/370, each user process gets an exact copy of the actual computer.
With virtual 8086 mode on the Pentium, each user process gets an exact copy of a
different computer. Going one step further, researchers at M.I.T. have built a
system that gives each user a clone of the actual computer, but with a subset of the
resources (Engler et al., 1995). Thus one virtual machine might get disk blocks 0
to 1023, the next one might get blocks 1024 to 2047, and so on.
At the bottom layer, running in kernel mode, is a program called the
exokernel. Its job is to allocate resources to virtual machines and then check attempts to
use them to make sure no machine is trying to use somebody else's resources.
Each user-level virtual machine can run its own operating system, as on VM/370
and the Pentium virtual 8086s, except that each one is restricted to using only the
resources it has asked for and been allocated.
The advantage of the exokernel scheme is that it saves a layer of mapping. In
the other designs, each virtual machine thinks it has its own disk, with blocks
running from 0 to some maximum, so the virtual machine monitor must maintain
tables to remap disk addresses (and all other resources). With the exokernel, this
remapping is not needed. The exokernel need only keep track of which virtual
machine has been assigned which resource. This method still has the advantage
of separating the multiprogramming (in the exokernel) from the user operating
system code (in user space), but with less overhead, since all the exokernel has to
do is keep the virtual machines out of each other's hair.
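The bookkeeping this requires is small, as the following toy model suggests. The struct grant and the function exo_may_access are hypothetical; the point is only that the check works on real block numbers, with no translation table.

```c
#include <assert.h>
#include <stddef.h>

/* Inclusive range of real disk blocks assigned to one virtual machine. */
struct grant {
    unsigned long first;
    unsigned long last;
};

/* Return 1 if virtual machine vm may use disk block `block`, 0 otherwise.
 * The block number is the real one: nothing is remapped, the request is
 * merely checked against the range the machine was allocated. */
int exo_may_access(const struct grant *grants, size_t nvms,
                   size_t vm, unsigned long block)
{
    assert(grants != NULL);
    if (vm >= nvms)
        return 0;
    return block >= grants[vm].first && block <= grants[vm].last;
}
```

With grants of blocks 0 to 1023 for the first machine and 1024 to 2047 for the second, a request by the first machine for block 1024 is refused, while the same request by the second is allowed.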
number of virtual 370s in their entirety is not that simple (especially if you want
to do it reasonably efficiently).
A trend in modern operating systems is to take the idea of moving code up
into higher layers even further and remove as much as possible from kernel mode,
leaving a minimal microkernel. The usual approach is to implement most of the
operating system in user processes. To request a service, such as reading a block
of a file, a user process (now known as the client process) sends the request to a
server process, which then does the work and sends back the answer.
[Figure: the client-server model. Client processes, a process server, a terminal server, and a file server run in user mode above a microkernel running in kernel mode; a client obtains service by sending messages to server processes.]
[Figure: the client-server model across machines. Each machine runs a kernel; a message travels over the network from client to server.]
Virtually all operating systems researchers realize that current operating
systems are massive, inflexible, unreliable, insecure, and loaded with bugs, certain
ones more than others (names withheld here to protect the guilty). Consequently,
there is a lot of research on how to build flexible and dependable systems. Much
of the research concerns microkernel systems. These systems have a minimal
kernel, so there is a reasonable chance they can be made reliable and be
debugged. They are also flexible because much of the real operating system runs
as user-mode processes, and can thus be replaced or adapted easily, possibly even
during execution. Typically, all the microkernel does is handle low-level resource
management and message passing between the user processes.
The first generation microkernels, such as Amoeba (Tanenbaum et al., 1990),
Chorus (Rozier et al., 1988), Mach (Accetta et al., 1986), and V (Cheriton, 1988),
proved that these systems could be built and made to work. The second
generation is trying to prove that they can not only work, but with high performance as
well (Ford et al., 1996; Hartig et al., 1997; Liedtke 1995, 1996; Rawson 1997; and
Zuberi et al., 1999). Based on published measurements, it appears that this goal
has been achieved.
Much kernel research is focused nowadays on building extensible operating
systems. These are typically microkernel systems with the ability to extend or
customize them in some direction. Some examples are Fluke (Ford et al., 1997),
Paramecium (Van Doorn et al., 1995), SPIN (Bershad et al., 1995b), and Vino
(Seltzer et al., 1996). Some researchers are also looking at how to extend existing
systems (Ghormley et al., 1998). Many of these systems allow users to add their
own code to the kernel, which brings up the obvious problem of how to allow user
extensions in a secure way. Techniques include interpreting the extensions, res-
tricting them to code sandboxes, using type-safe languages, and code signing
(Grimm and Bershad, 1997; and Small and Seltzer, 1998). Druschel et al. (1997)
present a dissenting view, saying that too much effort is going into security for
user-extendable systems. In their view, researchers should figure out which
extensions are useful and then just make those a normal part of the kernel, without
the ability to have users extend the kernel on the fly.
Although one approach to eliminating bloated, buggy, unreliable operating
systems is to make them smaller, a more radical one is to eliminate the operating
system altogether. This approach is being taken by the group of Kaashoek at
M.I.T. in their Exokernel research. Here the idea is to have a thin layer of
software running on the bare metal, whose only job is to securely allocate the
hardware resources among the users. For example, it must decide who gets to use
which part of the disk and where incoming network packets should be delivered.
Everything else is up to user-level processes, making it possible to build both
general-purpose and highly-specialized operating systems (Engler and Kaashoek,
1995; Engler et al., 1995; and Kaashoek et al., 1997).
It is also worth pointing out that for measuring memory sizes, in common
industry practice, the units have slightly different meanings. There Kilo means
2^10 (1024) rather than 10^3 (1000) because memories are always a power of two.
Thus a 1-KB memory contains 1024 bytes, not 1000 bytes. Similarly, a 1-MB
memory contains 2^20 (1,048,576) bytes and a 1-GB memory contains 2^30
(1,073,741,824) bytes. However, a 1-Kbps communication line transmits 1000
bits per second and a 10-Mbps LAN runs at 10,000,000 bits/sec because these
speeds are not powers of two. Unfortunately, many people tend to mix up these
two systems, especially for disk sizes. To avoid ambiguity, in this book, we will
use the symbols KB, MB, and GB for 2^10, 2^20, and 2^30 bytes respectively, and the
symbols Kbps, Mbps, and Gbps for 10^3, 10^6, and 10^9 bits/sec, respectively.
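The two conventions are easy to mix up in arithmetic, so a small worked
example may help. The sketch below is illustrative only: the constant names and
the helper transfer_seconds are assumptions made for this example, not notation
used elsewhere in the book.

```python
# Binary units for memory sizes, decimal units for line speeds,
# following the convention stated above.
KB, MB, GB = 2**10, 2**20, 2**30          # bytes
KBPS, MBPS, GBPS = 10**3, 10**6, 10**9    # bits per second


def transfer_seconds(size_bytes: int, rate_bps: int) -> float:
    """Time to send size_bytes over a line running at rate_bps bits/sec."""
    return size_bytes * 8 / rate_bps


if __name__ == "__main__":
    print(KB, MB, GB)
    # Sending a 1-MB memory image over a 10-Mbps LAN takes a little
    # under one second because 1 MB is more than 1,000,000 bytes.
    print(transfer_seconds(1 * MB, 10 * MBPS))
```

Note that the byte count uses the power-of-two meaning while the line speed
uses the power-of-ten meaning, which is exactly the mix the text warns about.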
1.11 SUMMARY
Operating systems can be viewed from two viewpoints: resource managers
and extended machines. In the resource manager view, the operating system's job
is to manage the different parts of the system efficiently. In the extended machine
view, the job of the system is to provide the users with a virtual machine that is
more convenient to use than the actual machine.
Operating systems have a long history, starting from the days when they
replaced the operator, to modern multiprogramming systems. Highlights include
early batch systems, multiprogramming systems, and personal computer systems.
Since operating systems interact closely with the hardware, some knowledge
of computer hardware is useful to understanding them. Computers are built up of
processors, memories, and I/O devices. These parts are connected by buses.
The basic concepts on which all operating systems are built are processes,
memory management, I/O management, the file system, and security. Each of
these will be treated in a subsequent chapter.
The heart of any operating system is the set of system calls that it can handle.
These tell what the operating system really does. For UNIX, we have looked at
four groups of system calls. The first group of system calls relates to process
creation and termination. The second group is for reading and writing files. The
third group is for directory management. The fourth group contains miscellane-
ous calls.
Operating systems can be structured in several ways. The most common ones
are as a monolithic system, a hierarchy of layers, a virtual machine system, an
exokernel, or using the client-server model.
PROBLEMS
4. On early computers, every byte of data read or written was directly handled by the
CPU (i.e., there was no DMA). What implications does this organization have for
multiprogramming?
20. A file whose file descriptor is fd contains the following sequence of bytes: 3, 1, 4, 1, 5,
9, 2, 6, 5, 3, 5. The following system calls are made:
lseek(fd, 3, SEEK_SET);
read(fd, &buffer, 4);
single-computer system?
24. To a programmer, a system call looks like any other call to a library procedure. Is it
important that a programmer know which library procedures result in system calls?
Under what circumstances and why?
unlimited number of child processes and observe what happens. Before running the
experiment, type sync to the shell to flush the file system buffers to disk to avoid ruin-
ing the file system. Note: Do not try this on a shared system without first getting per-
mission from the system administrator. The consequences will be instantly obvious so
you are likely to be caught and sanctions may follow.
29. Examine and try to interpret the contents of a UNIX-like or Windows directory with a
tool like the UNIX od program or the MS-DOS DEBUG program. Hint: How you do
this will depend upon what the OS allows. One trick that may work is to create a
directory on a floppy disk with one operating system and then read the raw disk data
using a different operating system that allows such access.
PROCESSES AND THREADS
We are now about to embark on a detailed study of how operating systems are
designed and constructed. The most central concept in any operating system is
the process: an abstraction of a running program. Everything else hinges on this
concept, and it is important that the operating system designer (and student) have
a thorough understanding of what a process is as early as possible.
2.1 PROCESSES
All modern computers can do several things at the same time. While running
a user program, a computer can also be reading from a disk and outputting text to
a screen or printer. In a multiprogramming system, the CPU also switches from
program to program, running each for tens or hundreds of milliseconds. While,
strictly speaking, at any instant of time, the CPU is running only one program, in
the course of 1 second, it may work on several programs, thus giving the users the
illusion of parallelism. Sometimes people speak of pseudoparallelism in this
context, to contrast it with the true hardware parallelism of multiprocessor sys-
tems (which have two or more CPUs sharing the same physical memory). Keep-
ing track of multiple, parallel activities is hard for people to do. Therefore, oper-
ating system designers over the years have evolved a conceptual model (sequen-
tial processes) that makes parallelism easier to deal with. That model, its uses,
and some of its consequences form the subject of this chapter.
In this model, all the runnable software on the computer, sometimes including
the operating system, is organized into a number of sequential processes, or just
processes for short. A process is just an executing program, including the current
values of the program counter, registers, and variables. Conceptually, each proc-
ess has its own virtual CPU. In reality, of course, the real CPU switches back and
forth from process to process, but to understand the system, it is much easier to
think about a collection of processes running in (pseudo) parallel, than to try to
keep track of how the CPU switches from program to program. This rapid
switching back and forth is called multiprogramming, as we saw in Chap. 1.
In Fig. 2-1(a) we see a computer multiprogramming four programs in
memory. In Fig. 2-1(b) we see four processes, each with its own flow of control
(i.e., its own logical program counter), and each one running independently of the
other ones. Of course, there is only one physical program counter, so when each
process runs, its logical program counter is loaded into the real program counter.
When it is finished for the time being, the physical program counter is saved in
the process' logical program counter in memory. In Fig. 2-1(c) we see that
viewed over a long enough time interval, all the processes have made progress,
but at any given instant only one process is actually running.
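The loading and saving of logical program counters can be made concrete
with a toy simulation. This is a sketch under stated assumptions, not how any
real kernel is written: the class Process, the function run_round_robin, and the
fixed quantum (assumed to be positive) are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    logical_pc: int = 0      # saved program counter, kept in memory


def run_round_robin(processes, quantum, total_steps):
    """Simulate one CPU with a single physical program counter."""
    trace = []
    steps = 0
    while steps < total_steps:
        for proc in processes:
            physical_pc = proc.logical_pc      # load logical PC into the real one
            for _ in range(quantum):
                if steps == total_steps:
                    break
                physical_pc += 1               # "execute" one instruction
                steps += 1
                trace.append(proc.name)
            proc.logical_pc = physical_pc      # save it back when descheduled
    return trace


if __name__ == "__main__":
    procs = [Process(n) for n in "ABCD"]
    print("".join(run_round_robin(procs, quantum=2, total_steps=10)))
    print([(p.name, p.logical_pc) for p in procs])
```

The trace shows only one process running at any instant, while the saved
counters show that every process has made progress over the whole interval.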
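[Figure 2-1: (a) four programs multiprogrammed in memory with one program
counter; (b) four processes, each with its own logical program counter; (c)
progress of the processes over time.]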
With the CPU switching back and forth among the processes, the rate at
which a process performs its computation will not be uniform and probably not
even reproducible if the same processes are run again. Thus, processes must not
be programmed with built-in assumptions about timing. Consider, for example,
an I/O process that starts a streamer tape to restore backed up files, executes an
idle loop 10,000 times to let it get up to speed, and then issues a command to read
the first record. If the CPU decides to switch to another process during the idle
loop, the tape process might not run again until after the first record was already
past the read head. When a process has critical real-time requirements like this,
that is, particular events must occur within a specified number of milliseconds,
special measures must be taken to ensure that they do occur. Normally, however,
most processes are not affected by the underlying multiprogramming of the CPU
or the relative speeds of different processes.
The difference between a process and a program is subtle, but crucial. An
analogy may help here. Consider a culinary-minded computer scientist who is
baking a birthday cake for his daughter. He has a birthday cake recipe and a
kitchen well stocked with all the input: flour, eggs, sugar, extract of vanilla, and
so on. In this analogy, the recipe is the program (i.e., an algorithm expressed in
some suitable notation), the computer scientist is the processor (CPU), and the
cake ingredients are the input data. The process is the activity consisting of our
baker reading the recipe, fetching the ingredients, and baking the cake.
Now imagine that the computer scientist's son comes running in crying, say-
ing that he has been stung by a bee. The computer scientist records where he was
in the recipe (the state of the current process is saved), gets out a first aid book,
and begins following the directions in it. Here we see the processor being
switched from one process (baking) to a higher-priority process (administering
medical care), each having a different program (recipe versus first aid book).
When the bee sting has been taken care of, the computer scientist goes back to his
cake, continuing at the point where he left off.
The key idea here is that a process is an activity of some kind. It has a pro-
gram, input, output, and a state. A single processor may be shared among several
processes, with some scheduling algorithm being used to determine when to stop
work on one process and service a different one.
(human) users and perform work for them. Others are background processes,
which are not associated with particular users, but instead have some specific
function. For example, one background process may be designed to accept
incoming email, sleeping most of the day but suddenly springing to life when
email arrives. Another background process may be designed to accept incoming
requests for Web pages hosted on that machine, waking up when a request arrives
to service the request. Processes that stay in the background to handle some
activity such as email, Web pages, news, printing, and so on are called daemons.
Large systems commonly have dozens of them. In UNIX, the ps program can be
used to list the running processes. In Windows 95/98/Me, typing CTRL-ALT-
DEL once shows what's running. In Windows 2000, the task manager is used.
In addition to the processes created at boot time, new processes can be created
afterward as well. Often a running process will issue system calls to create one or
more new processes to help it do its job. Creating new processes is particularly
useful when the work to be done can easily be formulated in terms of several
related, but otherwise independent interacting processes. For example, if a large
amount of data is being fetched over a network for subsequent processing, it may
Technically, in all these cases, a new process is created by having an existing
to compile the program foo.c and no such file exists, the compiler simply exits.
Screen-oriented interactive processes generally do not exit when given bad
parameters. Instead they pop up a dialog box and ask the user to try again.
The third reason for termination is an error caused by the process, often due to
a program bug. Examples include executing an illegal instruction, referencing
nonexistent memory, or dividing by zero. In some systems (e.g., UNIX), a process
can tell the operating system that it wishes to handle certain errors itself, in which
case the process is signaled (interrupted) instead of terminated when one of the
errors occurs.
The fourth reason a process might terminate is that a process executes a sys-
tem call telling the operating system to kill some other process. In UNIX this call
is kill. The corresponding Win32 function is TerminateProcess. In both cases, the
killer must have the necessary authorization to do in the killee. In some systems,
when a process terminates, either voluntarily or otherwise, all processes it created
are immediately killed as well. Neither UNIX nor Windows works this way, how-
ever.
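As an illustration of one process killing another, the following sketch starts
a child and terminates it immediately. It assumes Python's standard subprocess
module, whose Popen.kill method sends SIGKILL on POSIX systems and calls
TerminateProcess on Windows; the function name start_and_kill and the
60-second sleep are choices made only for this example.

```python
import subprocess
import sys


def start_and_kill():
    """Start a child that would sleep for a minute, then kill it at once."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.kill()                      # SIGKILL on POSIX, TerminateProcess on Windows
    return child.wait(timeout=10)     # exit status as reported to the parent


if __name__ == "__main__":
    print("child exit status:", start_and_kill())
```

The parent is allowed to do this because it created the child and therefore
has the necessary authorization over it.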
In some systems, when a process creates another process, the parent process
and child process continue to be associated in certain ways. The child process can
itself create more processes, forming a process hierarchy. Note that unlike plants
and animals that use sexual reproduction, a process has only one parent (but zero,
one, two, or more children).
In UNIX, a process and all of its children and further descendants together
form a process group. When a user sends a signal from the keyboard, the signal is
delivered to all members of the process group currently associated with the key-
board (usually all active processes that were created in the current window). Indi-
vidually, each process can catch the signal, ignore the signal, or take the default
action, which is to be killed by the signal.
As another example of where the process hierarchy plays a role, let us look at
how UNIX initializes itself when it is started. A special process, called init, is
present in the boot image. When it starts running, it reads a file telling how many
terminals there are. Then it forks off one new process per terminal. These proc-
esses wait for someone to log in. If a login is successful, the login process exe-
cutes a shell to accept commands. These commands may start up more processes,
and so forth. Thus, all the processes in the whole system belong to a single tree,
with init at the root.
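The way one process forks off several children, each with exactly one
parent, can be sketched as follows. This is a minimal illustration that assumes
a UNIX-like system, since os.fork is not available on Windows; the function
name fork_children and the exit-status convention are invented for the example.

```python
import os


def fork_children(count):
    """Fork `count` children; return {pid: exit_status} once all have exited."""
    parent_pid = os.getpid()
    pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: report whether its parent is the process that forked it.
            os._exit(0 if os.getppid() == parent_pid else 1)
        pids.append(pid)          # parent: remember the new child
    statuses = {}
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        statuses[pid] = os.WEXITSTATUS(status)
    return statuses


if __name__ == "__main__":
    print(fork_children(3))
```

Each child exits with status 0 only if its parent is the forking process, so a
result of all zeros reflects the one-parent, many-children shape of the tree.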
In contrast, Windows does not have any concept of a process hierarchy. All
processes are equal. The only place where there is something like a process
hierarchy is that when a process is created, the parent is given a special token
(called a handle) that it can use to control the child. However, it is free to pass
this token to some other process, thus invalidating the hierarchy. Processes in
UNIX cannot disinherit their children.
Although each process is an independent entity, with its own program counter
and internal state, processes often need to interact with other processes. One
process may generate some output that another process uses as input. In the shell
command
cat chapter1 chapter2 chapter3 | grep tree
the first process, running cat, concatenates and outputs three files. The second
process, running grep, selects all lines containing the word "tree." Depending on
the relative speeds of the two processes (which depends on both the relative com-
plexity of the programs and how much CPU time each one has had), it may hap-
pen that grep is ready to run, but there is no input waiting for it. It must then
block until some input is available.
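A pipeline of two cooperating processes can be reproduced in a few lines.
The sketch below is an assumption-laden stand-in for the shell command above:
instead of cat and grep it runs two small Python child processes (the strings
PRODUCER and CONSUMER are invented for the example) so that it does not
depend on any particular files or utilities being present.

```python
import subprocess
import sys

# Stand-in for `cat`: writes three lines to standard output.
PRODUCER = "print('oak tree'); print('river'); print('tree house')"
# Stand-in for `grep tree`: blocks on standard input until lines arrive.
CONSUMER = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'tree' in line:\n"
    "        sys.stdout.write(line)\n"
)


def run_pipeline():
    """Connect two processes with a pipe, like `producer | consumer`."""
    producer = subprocess.Popen([sys.executable, "-c", PRODUCER],
                                stdout=subprocess.PIPE)
    consumer = subprocess.Popen([sys.executable, "-c", CONSUMER],
                                stdin=producer.stdout,
                                stdout=subprocess.PIPE, text=True)
    producer.stdout.close()       # the consumer now holds the read end
    output, _ = consumer.communicate()
    producer.wait()
    return output.splitlines()


if __name__ == "__main__":
    print(run_pipeline())
```

Whichever process happens to run first, the consumer simply blocks in its read
loop until the producer has written something into the pipe.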
When a process blocks, it does so because logically it cannot continue, typi-
cally because it is waiting for input that is not yet available. It is also possible for
a process that is conceptually ready and able to run to be stopped because the
operating system has decided to allocate the CPU to another process for a while.
These two conditions are completely different. In the first case, the suspension is
inherent in the problem (you cannot process the user's command line until it has
been typed). In the second case, it is a technicality of the system (not enough
CPUs to give each process its own private processor). In Fig. 2-2 we see a state
diagram showing the three states a process may be in:
1. Running (actually using the CPU at that instant).
2. Ready (runnable; temporarily stopped to let another process run).
3. Blocked (unable to run until some external event happens).
Logically, the first two states are similar. In both cases the process is willing to
run, only in the second one, there is temporarily no CPU available for it. The
third state is different from the first two in that the process cannot run, even if the
CPU has nothing else to do.
Four transitions are possible among these three states, as shown. Transition 1
occurs when a process discovers that it cannot continue. In some systems the
[Figure 2-2: state diagram of the three process states, with the following
transitions.]
1. Process blocks for input
2. Scheduler picks another process
3. Scheduler picks this process
4. Input becomes available
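The three states and four transitions form a small state machine, which can
be written down directly. The sketch below is illustrative: the names State,
TRANSITIONS, and next_state are assumptions for this example, and the
transition numbers follow the list above.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    READY = "ready"
    BLOCKED = "blocked"


# The four legal transitions, numbered as in the list above.
TRANSITIONS = {
    1: (State.RUNNING, State.BLOCKED),   # process blocks for input
    2: (State.RUNNING, State.READY),     # scheduler picks another process
    3: (State.READY, State.RUNNING),     # scheduler picks this process
    4: (State.BLOCKED, State.READY),     # input becomes available
}


def next_state(state, transition):
    """Return the state reached by `transition`, or raise if it is illegal."""
    source, target = TRANSITIONS[transition]
    if state is not source:
        raise ValueError(
            f"transition {transition} is not legal from {state.value}"
        )
    return target
```

Note that there is no entry from blocked directly to running: a process whose
input has arrived must first become ready and then wait for the scheduler.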
To implement the process model, the operating system maintains a table (an
array of structures), called the process table, with one entry per process. (Some
authors call these entries process control blocks.) This entry contains informa-
tion about the process' state, its program counter, stack pointer, memory alloca-
tion, the status of its open files, its accounting and scheduling information, and
everything else about the process that must be saved when the process is switched
from running to ready or blocked state so that it can be restarted later as if it had
never been stopped.
Figure 2-4 shows some of the more important fields in a typical system. The
fields in the first column relate to process management. The other two columns
relate to memory management and file management, respectively. It should be
noted that precisely which fields the process table has is highly system dependent,
but this figure gives a general idea of the kinds of information needed.
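A process table as "an array of structures" might be sketched like this. The
field selection is a small, hypothetical subset chosen for illustration, and the
names ProcessTableEntry, MAX_PROCESSES, and create_process are assumptions
of the example rather than those of any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessTableEntry:
    pid: int
    state: str = "ready"                  # running, ready, or blocked
    program_counter: int = 0
    stack_pointer: int = 0
    registers: dict = field(default_factory=dict)
    open_files: list = field(default_factory=list)
    cpu_time_used: float = 0.0


MAX_PROCESSES = 8                          # hypothetical table size
process_table = [None] * MAX_PROCESSES     # one slot per process


def create_process(pid):
    """Fill the first free slot with a fresh entry and return it."""
    slot = process_table.index(None)       # raises ValueError if the table is full
    process_table[slot] = ProcessTableEntry(pid=pid)
    return process_table[slot]
```

Everything needed to restart a stopped process later lives in its entry, which
is why the entry must be updated each time the process leaves the running state.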
Now that we have looked at the process table, it is possible to explain a little
more about how the illusion of multiple sequential processes is maintained on a
machine with one CPU and many I/O devices. Associated with each I/O device
class (e.g., floppy disks, hard disks, timers, terminals) is a location (often near the
bottom of memory) called the interrupt vector. It contains the address of the
interrupt service procedure. Suppose that user process 3 is running when a disk
interrupt occurs. User process 3's program counter, program status word, and
possibly one or more registers are pushed onto the (current) stack by the interrupt
hardware. The computer then jumps to the address specified in the disk interrupt
vector. That is all the hardware does. From here on, it is up to the software, in
particular, the interrupt service procedure.
All interrupts start by saving the registers, often in the process table entry for
the current process. Then the information pushed onto the stack by the interrupt is
removed and the stack pointer is set to point to a temporary stack used by the
Process management          Memory management          File management
Registers                   Pointer to text segment    Root directory
Program counter             Pointer to data segment    Working directory
Program status word         Pointer to stack segment   File descriptors
Stack pointer                                          User ID
Process state                                          Group ID
Priority
Scheduling parameters
Process ID
Parent process
Process group
Signals
Time when process started
CPU time used
Children's CPU time
Time of next alarm
Figure 2-4. Some of the fields of a typical process table entry.
process handler. Actions such as saving the registers and setting the stack pointer
cannot even be expressed in high-level languages such as C, so they are per-
formed by a small assembly language routine, usually the same one for all inter-
rupts since the work of saving the registers is identical, no matter what the cause
of the interrupt is.
When this routine is finished, it calls a C procedure to do the rest of the work
for this specific interrupt type. (We assume the operating system is written in C,
the usual choice for all real operating systems.) When it has done its job, possibly
making some process now ready, the scheduler is called to see who to run next.
After that, control is passed back to the assembly language code to load up the
registers and memory map for the now-current process and start it running. Inter-
rupt handling and scheduling are summarized in Fig. 2-5. It is worth noting that
the details vary somewhat from system to system.
1. Hardware stacks program counter, etc.
2. Hardware loads new program counter from interrupt vector.
3. Assembly language procedure saves registers.
4. Assembly language procedure sets up new stack.
5. C interrupt service runs (typically reads and buffers input).
6. Scheduler decides which process is to run next.
7. C procedure returns to the assembly code.
8. Assembly language procedure starts up new current process.
Figure 2-5. Summary of interrupt handling and scheduling.
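The eight steps can be mimicked at a very high level to show how control
flows from the interrupted process to the next one. This is only a toy model
under loud assumptions: real systems do steps 1 to 4 and 8 in hardware and
assembly language, whereas here a dictionary stands in for the CPU registers,
another dictionary for the interrupt vector, and all names are invented.

```python
def disk_handler(table):
    """Device-specific service routine: the waiting process gets its data."""
    for entry in table:
        if entry["state"] == "blocked":
            entry["state"] = "ready"


interrupt_vector = {"disk": disk_handler}   # device class -> service routine


def schedule(table):
    """Trivial scheduler: pick the first ready process."""
    for entry in table:
        if entry["state"] == "ready":
            return entry
    return None


def handle_interrupt(device, cpu, current, table):
    current["saved"] = dict(cpu)            # steps 1-4: save the interrupted context
    current["state"] = "ready"
    interrupt_vector[device](table)         # step 5: device-specific service routine
    chosen = schedule(table)                # step 6: scheduler picks who runs next
    chosen["state"] = "running"
    cpu.clear()                             # steps 7-8: reload context and resume
    cpu.update(chosen["saved"])
    return chosen


if __name__ == "__main__":
    table = [
        {"name": "p1", "state": "blocked", "saved": {"pc": 7}},
        {"name": "p3", "state": "running", "saved": {}},
    ]
    cpu = {"pc": 100}
    print(handle_interrupt("disk", cpu, table[1], table)["name"], cpu)
```

In this run the disk interrupt makes p1 ready, the scheduler picks it, and p3
is left in the ready state with its program counter safely saved.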
2.2 THREADS
In traditional operating systems, each process has an address space and a sin-
gle thread of control. In fact, that is almost the definition of a process.
[Figure 2-6: threads running in user space above the kernel in kernel space.]
By switching back and forth among multiple processes, the system gives the illu-
sion of separate sequential processes running in parallel. Multithreading works
the same way. The CPU switches rapidly back and forth among the threads pro-
viding the illusion that the threads are running in parallel, albeit on a slower CPU
than the real one. With three compute-bound threads in a process, the threads
would appear to be running in parallel, each one on a CPU with one-third the
speed of the real CPU.
Different threads in a process are not quite as independent as different
processes. All threads have exactly the same address space, which means that
they also share the same global variables. Since every thread can access every
memory address within the process' address space, one thread can read, write, or
even completely wipe out another thread's stack. There is no protection between
threads because (1) it is impossible, and (2) it should not be necessary. Unlike
different processes, which may be from different users and which may be hostile
to one another, a process is always owned by a single user, who has presumably
created multiple threads so that they can cooperate, not fight. In addition to shar-
ing an address space, all the threads share the same set of open files, child
processes, alarms, and signals, etc. as shown in Fig. 2-7. Thus the organization of
Fig. 2-6(a) would be used when the three processes are essentially unrelated,
whereas Fig. 2-6(b) would be appropriate when the three threads are actually part
of the same job and are actively and closely cooperating with each other.
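That all threads of a process see the same global variables is easy to
demonstrate. The following sketch assumes Python's standard threading module;
the names shared, worker, and run are invented for the example, and the lock
is there because the threads are expected to cooperate, as nothing else keeps
them from interfering with each other's updates.

```python
import threading

shared = {"count": 0}        # lives in the address space shared by all threads
lock = threading.Lock()      # cooperation is the threads' own responsibility


def worker(times):
    for _ in range(times):
        with lock:
            shared["count"] += 1


def run(threads=3, times=1000):
    shared["count"] = 0
    workers = [threading.Thread(target=worker, args=(times,))
               for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return shared["count"]


if __name__ == "__main__":
    print(run())
```

Three separate processes running the same loop would each end up with their
own private count; here all three threads update a single variable.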
The items in the first column are process properties, not thread properties.
For example, if one thread opens a file, that file is visible to the other threads in
the process and they can read and write it. This is logical since the process is the
unit of resource management, not the thread. If each thread had its own address
space, open files, pending alarms, and so on, it would be a separate process. What
we are trying to achieve with the thread concept is the ability for multiple threads
to share a set of resources so that they can work together closely to perform some
task.
Like a traditional process (i.e., a process with only one thread), a thread can
be in any one of several states: running, blocked, ready, or terminated. A running
thread currently has the CPU and is active. A blocked thread is waiting for some
event to unblock it. For example, when a thread performs a system call to read
from the keyboard, it is blocked until input is typed. A thread can block waiting
for some external event to happen or for some other thread to unblock it. A ready
thread is scheduled to run and will as soon as its turn comes up. The transitions
between thread states are the same as the transitions between process states and
are illustrated in Fig. 2-2.
It is important to realize that each thread has its own stack, as shown in
Fig. 2-8. Each thread's stack contains one frame for each procedure called but not
yet returned from. This frame contains the procedure's local variables and the
return address to use when the procedure call has finished. For example, if pro-
cedure X calls procedure Y and this one calls procedure Z, while Z is executing the
frames for X, Y, and Z will all be on the stack. Each thread will generally call dif-
ferent procedures and thus have a different execution history. This is why each
thread needs its own stack.
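The X, Y, Z example can be observed directly by asking each thread which
procedures are currently on its stack. The sketch below assumes Python's
standard threading and traceback modules; the procedure names and the
dictionary stacks are invented for the illustration.

```python
import threading
import traceback

stacks = {}                  # thread name -> procedure names on its stack


def record():
    names = [frame.name for frame in traceback.extract_stack()]
    stacks[threading.current_thread().name] = names[:-1]   # drop record() itself


def z():
    record()


def y():
    z()


def x():
    y()


def q():
    record()


def p():
    q()


def run():
    a = threading.Thread(target=x, name="A")
    b = threading.Thread(target=p, name="B")
    for t in (a, b):
        t.start()
    for t in (a, b):
        t.join()
    # The innermost frames of each thread, oldest call first.
    return stacks["A"][-3:], stacks["B"][-2:]


if __name__ == "__main__":
    print(run())
```

Thread A has frames for x, y, and z on its stack at the moment of recording,
while thread B has only p and q, showing that the two execution histories are
kept apart.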
When multithreading is present, processes normally start with a single thread
present. This thread has the ability to create new threads by calling a library pro-
cedure, for example, thread_create. A parameter to thread_create typically
specifies the name of a procedure for the new thread to run. It is not necessary (or
even possible) to specify anything about the new thread's address space since it
automatically runs in the address space of the creating thread. Sometimes threads
are hierarchical, with a parent-child relationship, but often no such relationship
exists, with all threads being equal. With or without a hierarchical relationship,
the creating thread is usually returned a thread identifier that names the new
thread.
[Figure 2-8: Each thread has its own stack.]
When a thread has finished its work, it can exit by calling a library procedure,
say, thread_exit. It then vanishes and is no longer schedulable. In some thread
systems, one thread can wait for a (specific) thread to exit by calling a procedure,
for example, thread_wait. This procedure blocks the calling thread until a
(specific) thread has exited. In this regard, thread creation and termination is very
much like process creation and termination, with approximately the same options
as well.
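The create, exit, and wait calls have close counterparts in most thread
libraries. As a hedged illustration, the sketch below uses Python's standard
threading module, where creating and starting a Thread object plays the role
of thread_create, returning from the thread's procedure plays the role of
thread_exit, and join plays the role of thread_wait. The names task and
create_and_wait are invented for the example.

```python
import threading

results = []


def task(label):
    results.append(label)     # the thread exits when this procedure returns


def create_and_wait():
    t = threading.Thread(target=task, args=("done",))   # like thread_create
    t.start()
    ident = t.ident           # identifier that names the new thread
    t.join()                  # like thread_wait: block until it has exited
    return ident, t.is_alive(), results[-1]


if __name__ == "__main__":
    print(create_and_wait())
```

Nothing is said about an address space when the thread is created; the new
thread simply appends to a list that belongs to the creating thread's process.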
Another common thread call is thread_yield, which allows a thread to volun-
tarily give up the CPU to let another thread run. Such a call is important because
there is no clock interrupt to actually enforce timesharing as there is with
processes. Thus it is important for threads to be polite and voluntarily surrender
the CPU from time to time to give other threads a chance to run. Other calls
allow one thread to wait for another thread to finish some work, for a thread to
announce that it has finished some work, and so on.
While threads are often useful, they also introduce a number of complications
into the programming model. To start with, consider the effects of the UNIX fork
system call. If the parent process has multiple threads, should the child also have
them? If not, the process may not function properly, since all of them may be
essential.
However, if the child process gets as many threads as the parent, what hap-
pens if a thread in the parent was blocked on a read call, say, from the keyboard?
Are two threads now blocked on the keyboard, one in the parent and one in the
child? When a line is typed, do both threads get a copy of it? Only the parent?
Only the child? The same problem exists with open network connections.
Another class of problems is related to the fact that threads share many data
structures. What happens if one thread closes a file while another one is still read-
ing from it? Suppose that one thread notices that there is too little memory and
starts allocating more memory. Part way through, a thread switch occurs, and the
new thread also notices that there is too little memory and also starts allocating
Having described what threads are, it is now time to explain why anyone
wants them. The main reason for having threads is that in many applications,
multiple activities are going on at once. Some of these may block from time to
time. By decomposing such an application into multiple sequential threads that
run in quasi-parallel, the programming model becomes simpler.
We have seen this argument before. It is precisely the argument for having
processes. Instead of thinking about interrupts, timers, and context switches, we
can think about parallel processes. Only now with threads we add a new element:
the ability for the parallel entities to share an address space and all of its data
among themselves. This ability is essential for certain applications, which is why
having multiple processes (with their separate address spaces) will not work.
A second argument for having threads is that since they do not have any
resources attached to them, they are easier to create and destroy than processes.
In many systems, creating a thread goes 100 times faster than creating a process.
When the number of threads needed changes dynamically and rapidly, this pro-
perty is useful.
A third reason for having threads is also a performance argument. Threads
yield no performance gain when all of them are CPU bound, but when there is
substantial computing and also substantial I/O, having threads allows these activi-
ties to overlap, thus speeding up the application.
Finally, threads are useful on systems with multiple CPUs, where real paral-
lelism is possible. We will come back to this issue in Chap. 8.
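The gain from overlapping blocking operations can be measured with a
small experiment. In the sketch below, time.sleep is an assumed stand-in for a
blocking I/O request, and the helper names fake_io, timed, sequential, and
threaded are invented for the illustration; actual timings will vary from
machine to machine.

```python
import threading
import time


def fake_io(seconds):
    time.sleep(seconds)           # stand-in for a blocking I/O request


def timed(run):
    """Return the wall-clock time taken by calling run()."""
    start = time.perf_counter()
    run()
    return time.perf_counter() - start


def sequential(n, seconds):
    for _ in range(n):
        fake_io(seconds)          # each request waits for the previous one


def threaded(n, seconds):
    threads = [threading.Thread(target=fake_io, args=(seconds,))
               for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                  # all requests wait at the same time


if __name__ == "__main__":
    print("sequential:", timed(lambda: sequential(3, 0.1)))
    print("threaded:  ", timed(lambda: threaded(3, 0.1)))
```

The threaded version finishes in roughly the time of one request because the
waits overlap; with purely CPU-bound work no such gain would appear.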
It is probably easiest to see why threads are useful by giving some concrete
examples. As a first example, consider a word processor. Most word processors
display the document being created on the screen formatted exactly as it will
appear on the printed page. In particular, all the line breaks and page breaks are
in their correct and final position so the user can inspect them and change the
document if need be (e.g., to eliminate widows and orphans, that is, incomplete
top and bottom lines on a page, which are considered esthetically unpleasing).
Suppose that the user is writing a book. From the author's point of view, it is
easiest to keep the entire book as a single file to make it easier to search for
topics, perform global substitutions, and so on. Alternatively, each chapter might
be a separate file. However, having every section and subsection as a separate
file is a real nuisance when global changes have to be made to the entire book
since then hundreds of files have to be individually edited. For example, if pro-
posed standard xxxx is approved just before the book goes to press, all
occurrences of "Draft Standard xxxx" have to be changed to "Standard xxxx" at
the last minute. If the entire book is one file, typically a single command can do
all the substitutions. In contrast, if the book is spread over 300 files, each one
must be edited separately.
Now consider what happens when the user suddenly deletes one sentence
from page 1 of an 800-page document. After checking the changed page to make
sure it is correct, the user now wants to make another change on page 600 and
types in a command telling the word processor to go to that page (possibly by
searching for a phrase occurring only there). The word processor is now forced to
reformat the entire book up to page 600 on the spot because it does not know what
the first line of page 600 will be until it has processed all the previous pages.
There may be a substantial delay before page 600 can be displayed, leading to an
unhappy user.
Threads can help here. Suppose that the word processor is written as a
two-threaded program. One thread interacts with the user and the other handles
reformatting in the background. As soon as the sentence is deleted from page 1, the
interactive thread tells the reformatting thread to reformat the whole book.
Meanwhile, the interactive thread continues to listen to the keyboard and mouse
and responds to simple commands like scrolling page 1 while the other thread is
computing madly in the background. With a little luck, the reformatting will be
completed before the user asks to see page 600, so it can be displayed instantly.
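To make the two-threaded design concrete, here is a minimal sketch using POSIX
threads. It is an illustration under stated assumptions, not the code of any real
word processor: the names edit_document, go_to_page, start_reformatter, and
stop_reformatter are hypothetical, and the layout work itself is reduced to a
counter of formatted pages.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;

static int total_pages;        /* length of the document, in pages */
static int formatted_pages;    /* pages 1..formatted_pages are laid out */
static bool reformat_needed;   /* set by the interactive thread after an edit */
static bool quitting;          /* set once to make the background thread exit */
static pthread_t reformat_tid;

/* Background thread: lays out one page at a time until it has caught up. */
static void *reformatter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!quitting) {
        if (reformat_needed) {               /* an edit invalidated the layout */
            reformat_needed = false;
            formatted_pages = 0;
        } else if (formatted_pages < total_pages) {
            pthread_mutex_unlock(&lock);
            /* The real layout work for one page would go here, unlocked. */
            pthread_mutex_lock(&lock);
            if (!reformat_needed) {          /* discard the page if a new edit arrived */
                formatted_pages++;
                pthread_cond_broadcast(&changed);
            }
        } else {
            pthread_cond_wait(&changed, &lock);   /* nothing to do: sleep */
        }
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Interactive thread: record an edit and return to the user at once. */
void edit_document(int pages_now)
{
    pthread_mutex_lock(&lock);
    total_pages = pages_now;
    reformat_needed = true;
    pthread_cond_broadcast(&changed);
    pthread_mutex_unlock(&lock);
}

/* Interactive thread: block only until the requested page is laid out. */
int go_to_page(int page)
{
    int ready;

    pthread_mutex_lock(&lock);
    assert(page >= 1 && page <= total_pages);
    while (reformat_needed || formatted_pages < page)
        pthread_cond_wait(&changed, &lock);
    ready = formatted_pages;
    pthread_mutex_unlock(&lock);
    return ready;
}

/* Returns 0 on success, like pthread_create. */
int start_reformatter(void)
{
    return pthread_create(&reformat_tid, NULL, reformatter, NULL);
}

void stop_reformatter(void)
{
    pthread_mutex_lock(&lock);
    quitting = true;
    pthread_cond_broadcast(&changed);
    pthread_mutex_unlock(&lock);
    pthread_join(reformat_tid, NULL);
}
```

In this sketch the interactive thread would call start_reformatter() once, then
edit_document(800) after the deletion on page 1. A later go_to_page(600) returns
as soon as the first 600 pages are laid out, without waiting for the remaining 200.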
While we are at it, why not add a third thread? Many word processors have a
feature of automatically saving the entire file to disk every few minutes to protect
the user against losing a day's work in the event of a program crash, system crash,
or power failure. The third thread can handle the disk backups without interfering
with the other two. The situation with three threads is shown in Fig. 2-9.
[Figure: a word processor with three threads. Labels: Keyboard, Disk, Kernel.]
[Figure: a Web server process. Labels: Dispatcher thread, Web page cache, User space, Kernel space.]
request and handing it off to a worker. Each worker's code consists of an infinite
loop consisting of accepting a request from the dispatcher and checking the Web
cache to see if the page is present. If so, it is returned to the client and the worker
blocks waiting for a new request. If not, it gets the page from the disk, returns it
to the client, and blocks waiting for a new request.
A rough outline of the code is given in Fig. 2-11. Here, as in the rest of this
book, TRUE is assumed to be the constant 1. Also, buf and page are structures
appropriate for holding a work request and a Web page, respectively.
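The dispatcher and worker loops described above can also be sketched with POSIX
threads. This is an illustrative sketch rather than the code of the figure: the
names handoff_work, wait_for_work, and run_server are hypothetical, a small
bounded queue is assumed as the hand-off mechanism, requests are reduced to
integer page numbers, and the disk read and the reply to the client are
represented only by counters.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define QUEUE_SIZE  8     /* hypothetical capacity of the hand-off queue */
#define NUM_PAGES   16    /* hypothetical number of distinct pages */
#define MAX_WORKERS 16    /* hypothetical upper bound on worker threads */
#define SHUTDOWN    (-1)  /* sentinel request telling one worker to exit */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;

static int queue[QUEUE_SIZE], head, tail, count;
static bool cached[NUM_PAGES];        /* stands in for the Web page cache */
static int served, hits, misses;

/* Dispatcher side: hand one request to whichever worker picks it up. */
static void handoff_work(int page_id)
{
    pthread_mutex_lock(&lock);
    while (count == QUEUE_SIZE)
        pthread_cond_wait(&not_full, &lock);
    queue[tail] = page_id;
    tail = (tail + 1) % QUEUE_SIZE;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

/* Worker side: block until the dispatcher has handed off a request. */
static int wait_for_work(void)
{
    int page_id;

    pthread_mutex_lock(&lock);
    while (count == 0)
        pthread_cond_wait(&not_empty, &lock);
    page_id = queue[head];
    head = (head + 1) % QUEUE_SIZE;
    count--;
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&lock);
    return page_id;
}

/* Worker loop: check the cache, go to "disk" on a miss, answer the client. */
static void *worker(void *arg)
{
    (void)arg;
    while (true) {
        bool in_cache;
        int page_id = wait_for_work();

        if (page_id == SHUTDOWN)
            break;
        pthread_mutex_lock(&lock);
        in_cache = cached[page_id];
        pthread_mutex_unlock(&lock);

        /* On a miss, a real worker would block here in a disk read. */

        pthread_mutex_lock(&lock);
        if (in_cache) {
            hits++;
        } else {
            misses++;
            cached[page_id] = true;
        }
        served++;                 /* stands in for returning the page */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Dispatcher: start the workers, hand off every request, then shut down. */
int run_server(const int *requests, int n, int nworkers)
{
    pthread_t tid[MAX_WORKERS];
    int i, rc;

    assert(nworkers >= 1 && nworkers <= MAX_WORKERS);
    head = tail = count = served = hits = misses = 0;
    memset(cached, 0, sizeof cached);
    for (i = 0; i < nworkers; i++) {
        rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < n; i++) {
        assert(requests[i] >= 0 && requests[i] < NUM_PAGES);
        handoff_work(requests[i]);
    }
    for (i = 0; i < nworkers; i++)
        handoff_work(SHUTDOWN);   /* one sentinel per worker */
    for (i = 0; i < nworkers; i++)
        pthread_join(tid[i], NULL);
    return served;
}
```

Each worker is written as an ordinary sequential loop that blocks when it has
nothing to do, which is the point of the design. While one worker is blocked,
the others and the dispatcher keep running.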
Consider how the Web server could be written in the absence of threads. One
possibility is to have it operate as a single thread. The main loop of the Web
server gets a request, examines it, and carries it out to completion before getting
the next one. While waiting for the disk, the server is idle and does not process
any other incoming requests. If the Web server is running on a dedicated
machine, as is commonly the case, the CPU is simply idle while the Web server is
waiting for the disk. The net result is that many fewer requests/sec can be
processed. Thus threads gain considerable performance, but each thread is
programmed sequentially, in the usual way.
So far we have seen two possible designs: a multithreaded Web server and a
single-threaded Web server. Suppose that threads are not available but the system
designers find the performance loss due to single threading unacceptable. If a
nonblocking version of the read system call is available, a third approach is
possible. When a request comes in, the one and only thread examines it. If it can be
satisfied from the cache, fine, but if not, a nonblocking disk operation is started.
The server records the state of the current request in a table and then goes and
gets the next event. The next event may either be a request for new work or a
reply from the disk about a previous operation. If it is new work, that work is
started. If it is a reply from the disk, the relevant information is fetched from the
table and the reply processed. With nonblocking disk I/O, a reply probably will
have to take the form of a signal or interrupt.
In this design, the "sequential process" model that we had in the first two
cases is lost. The state of the computation must be explicitly saved and restored
in the table every time the server switches from working on one request to
another. In effect, we are simulating the threads and their stacks the hard way. A
design like this in which each computation has a saved state and there exists some
set of events that can occur to change the state is called a finite-state machine.
This concept is widely used throughout computer science.
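The bookkeeping that the finite-state machine design demands can be seen in a
small sketch. It is only an illustration under stated assumptions: the names
handle_event, struct server, and struct pending are hypothetical, requests are
reduced to integer page numbers, and starting a nonblocking disk read and
replying to the client are represented by counters.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 4    /* hypothetical size of the saved-state table */
#define NUM_PAGES   16   /* hypothetical number of distinct pages */

enum event_type { EV_NEW_REQUEST, EV_DISK_REPLY };
enum req_state  { ST_FREE = 0, ST_WAITING_FOR_DISK };

/* Saved state of one request that is waiting for the disk. */
struct pending {
    enum req_state state;
    int page_id;
};

struct server {
    struct pending table[MAX_PENDING];
    bool cached[NUM_PAGES];      /* stands in for the Web page cache */
    int disk_ops_started;
    int replies_sent;
};

/* Find a free slot, or the slot of a request waiting for page_id. */
static int find_slot(const struct server *s, enum req_state state, int page_id)
{
    int i;

    for (i = 0; i < MAX_PENDING; i++) {
        if (s->table[i].state != state)
            continue;
        if (state == ST_FREE || s->table[i].page_id == page_id)
            return i;
    }
    return -1;
}

/*
 * Handle one event without ever blocking.
 * Returns 1 if a reply was sent to a client, 0 if the request's state was
 * saved while the disk works, and -1 if the event could not be handled.
 */
int handle_event(struct server *s, enum event_type type, int page_id)
{
    int slot;

    assert(page_id >= 0 && page_id < NUM_PAGES);
    switch (type) {
    case EV_NEW_REQUEST:
        if (s->cached[page_id]) {            /* satisfied from the cache */
            s->replies_sent++;
            return 1;
        }
        slot = find_slot(s, ST_FREE, page_id);
        if (slot < 0)
            return -1;                       /* table full */
        s->table[slot].state = ST_WAITING_FOR_DISK;   /* save the state */
        s->table[slot].page_id = page_id;
        s->disk_ops_started++;               /* stands in for a nonblocking read */
        return 0;
    case EV_DISK_REPLY:
        slot = find_slot(s, ST_WAITING_FOR_DISK, page_id);
        if (slot < 0)
            return -1;                       /* no request was waiting for this */
        s->cached[page_id] = true;           /* restore the state and finish */
        s->table[slot].state = ST_FREE;
        s->replies_sent++;
        return 1;
    }
    return -1;
}
```

Nothing here looks like a sequential program for a single request. What a thread
would keep implicitly on its stack lives in the table, and every event has to
look it up again before work can continue.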
It should now be clear what threads have to offer. They make it possible to
retain the idea of sequential processes that make blocking system calls (e.g., for
disk I/O) and still achieve parallelism. Blocking system calls make programming
easier and parallelism improves performance. The single-threaded server retains
the ease of blocking system calls but gives up performance. The third approach
achieves high performance through parallelism but uses nonblocking calls and
interrupts and is thus hard to program. These models are summarized in
Fig. 2-12.
Model                     Characteristics
Threads                   Parallelism, blocking system calls
Single-threaded process   No parallelism, blocking system calls
Finite-state machine      Parallelism, nonblocking system calls, interrupts

Fig. 2-12. Three ways to construct a server.
A third example where threads are useful is in applications that must process
very large amounts of data. The normal approach is to read in a block of data,
process it, and then write it out again. The problem here is that if only blocking
system calls are available, the process blocks while data are coming in and data