
For More material See www.computertech-dovari.blogspot.com
Analysis, Design and Algorithms (ADA)
Operating System
Lexical Analysis
Database Management System
Analysis, Design and Algorithm Study Material
Concept of algorithm
Components of algorithms
Numerical algorithm
Review of searching algorithm
Review of sorting algorithm
Recursion v/s iteration
Introduction to graph theory
Matrix representation
Trees
Divide & Conquer: Binary search
Max-Min Search & Merge sort
Integer Multiplication
Cassette filling
Knapsack problem
Job scheduling
Backtracking
Branch & Bound
Shortest path
Minimal spanning trees
Techniques for Graphs
Concept of Algorithm:
A common man's belief is that a computer can do anything and everything that he imagines. It is very difficult to make people realize that it is not really the computer but the man behind the computer who does everything.
In the modern internet world, man feels that just by entering what he wants to search for into the computer he can get the information he desires. He believes that this is done by the computer. A common man seldom understands that a man-made procedure called search has done the entire job, and that the only support provided by the computer is execution speed and organized storage of information.
In the above instance, a designer of the information system should know what one frequently searches for. He should make a structured organization of all those details to store in the memory of the computer. Based on the requirement, the right information is brought out. This is accomplished through a set of instructions created by the designer of the information system to search for the right information matching the requirement of the user. This set of instructions is termed a program. It should be evident by now that it is not the computer which automatically generates the program; it is the designer of the information system who has created it.
Thus, the program is the one which, through the medium of the computer, executes to perform all the activities desired by a user. This implies that programming a computer is more important than the computer itself while solving a problem using a computer, and this part of programming has to be done by the man behind the computer. Even at this stage, one should not quickly jump to the conclusion that coding is programming. Coding is perhaps the last stage in the process of programming. Programming involves various activities, from the stage of conceiving the problem up to the stage of creating a model to solve the problem. The formal representation of this model as a sequence of instructions is called an algorithm, and a coded algorithm in a specific computer language is called a program.
One can now see that the focus has shifted from the computer to computer programming and then to creating an algorithm. This is algorithm design, the heart of problem solving.
Characteristics of an Algorithm:
Let us try to present the scenario of a man brushing his own teeth (natural denture) as an algorithm as follows.
Step 1. Take the brush
Step 2. Apply the paste
Step 3. Start brushing
Step 4. Rinse
Step 5. Wash
Step 6. Stop
If one goes through these 6 steps without being aware of the statement of the problem, he could possibly feel that this is the algorithm for cleaning a toilet. This is because of several ambiguities in comprehending every step. Step 1 may imply a tooth brush, paint brush, toilet brush etc. Such an ambiguity prevents an instruction from being an algorithmic step. Thus every step should be made unambiguous. An unambiguous step is called a definite instruction. Even if step 2 is rewritten as "apply the tooth paste" to eliminate ambiguities, conflicts such as where to apply the tooth paste and where the source of the tooth paste is still need to be resolved. Hence, the act of applying the toothpaste is not fully specified. Although unambiguous, such unrealizable steps can't be included as algorithmic instructions as they are not effective.
The definiteness and effectiveness of an instruction imply the successful termination of that instruction. However, these two may not be sufficient to guarantee the termination of the algorithm. Therefore, while designing an algorithm, care should be taken to provide a proper termination for the algorithm.
Thus, every algorithm should have the following five characteristic features:
Input
Output
Definiteness
Effectiveness
Termination
Therefore, an algorithm can be defined as a sequence of definite and effective instructions, which terminates with the production of correct output from the given input. In other words, viewed a little more formally, an algorithm is a step-by-step formalization of a mapping function that maps an input set onto an output set. The problem of writing down the correct algorithm for the above problem of brushing the teeth is left to the reader. For the purpose of clarity in understanding, let us consider the following examples.
Example 1:
Problem : finding the largest value among n >= 1 numbers.
Input : the value of n and n numbers
Output : the largest value
Steps :
1. Let the value of the first number be the largest value, denoted by BIG.
2. Let R denote the number of remaining numbers. R = n-1
3. If R != 0 then the list is still not exhausted. Therefore look at the next number, called NEW.
4. Now R becomes R-1.
5. If NEW is greater than BIG then replace BIG by the value of NEW.
6. Repeat steps 3 to 5 until R becomes zero.
7. Print BIG
8. Stop
End of algorithm
Example 2: quadratic equation
Example 3: listing all prime numbers between two limits n1 and n2.
1.2.1 Algorithmic Notations
In this section we present the pseudocode that we use throughout the book to describe algorithms. The pseudocode used resembles PASCAL and C language control structures. Hence, it is expected that the reader is aware of PASCAL/C. Even otherwise, the reader should now know at least one of them, preferably C, to practically test the algorithms in this course work. However, for the sake of completeness we present the commonly employed control constructs present in the algorithms.
A conditional statement has the following form
If <condition> then
   Block 1
Else
   Block 2
If end.
This pseudocode executes block 1 if the condition is true; otherwise block 2 is executed.
The two types of loop structures are counter based and condition based, and they are as follows
For variable = value1 to value2 do
   Block
For end
Here the block is executed for all the values of the variable from value1 to value2.
There are two types of conditional looping, while type and repeat type.
While (condition) do
   Block
While end.
Here the block gets executed as long as the condition is true.
Repeat
   Block
Until <condition>
Here the block is executed as long as the condition is false. It may be observed that the block is executed at least once in the repeat type.
Exercise 1:
Devise algorithms for the following and verify whether they satisfy all the features.
1. An algorithm that inputs three numbers and outputs them in ascending order.
2. To test whether three numbers represent the sides of a right angle triangle.
3. To test whether a given point p(x,y) lies on the x-axis or y-axis or in the I/II/III/IV quadrant.
4. To compute the area of a circle of a given circumference.
5. To locate a specific word in a dictionary.
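Example 1 above (finding the largest of n numbers) can be tried out directly. The following is a minimal Python sketch of those eight steps; the function name find_largest and the variable position are illustrative assumptions and do not come from the text, which only names BIG, R and NEW.

```python
def find_largest(numbers):
    """Return the largest of n >= 1 numbers, following the steps of Example 1."""
    if not numbers:
        raise ValueError("Example 1 assumes n >= 1")
    big = numbers[0]                # step 1: BIG is the first value
    remaining = len(numbers) - 1    # step 2: R = n - 1
    position = 1                    # index of the next unread number (assumed helper)
    while remaining != 0:           # step 3: the list is not yet exhausted
        new = numbers[position]     # look at the next number, NEW
        remaining -= 1              # step 4: R becomes R - 1
        if new > big:               # step 5: keep the larger value in BIG
            big = new
        position += 1               # step 6: repeat steps 3 to 5
    return big                      # step 7: the caller prints BIG


print(find_largest([20, 35, 18, 8, 14, 41, 3, 39]))  # prints 41
```

The loop terminates because R decreases by one on every pass, which is the termination feature required of every algorithm.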
Numerical Algorithm:
If there is more than one possible way of solving a problem, then one may think of more than one algorithm for the same problem. Hence, it is necessary to know in what domains these algorithms are applicable. The data domain is an important aspect to be known in the field of algorithms. Once we have more than one algorithm for a given problem, how do we choose the best among them? The solution is to devise some data sets and determine a performance profile for each of the algorithms.
A best case data set can be obtained by having all distinct data in the set. But it is always complex to determine a data set which exhibits some average behavior. The following sections give a brief idea of the well-known accepted algorithms.
2.1 Numerical Algorithms
Numerical analysis is the theory of constructive methods in mathematical analysis. A constructive method is a procedure used to obtain the solution of a mathematical problem in a finite number of steps and to some desired accuracy.
2.1.1 Numerical Iterative Algorithm
An iterative process can be illustrated with the flow chart given in fig 2.1. There are four main blocks in the process, viz., initialization, decision, computation, and update. The functions of these four blocks are as follows:
1. Initialization: all parameters are set to their initial values.
2. Decision: the decision parameter is used to determine when to exit from the loop.
3. Computation: the required computation is performed.
4. Update: the decision parameter is updated and is transformed for the next iteration.
Many problems in engineering or science need the solution of simultaneous linear algebraic equations. In principle every iterative algorithm is an infinite-step algorithm; in practice it is stopped once the desired accuracy is reached. One of the iterative algorithms to solve a system of simultaneous equations is Gauss Seidel. This iteration method generally requires only a few iterations. Iterative techniques have less round-off error. For a large system of equations, the number of iterations required may be quite large. Convergence is guaranteed when the coefficient matrix is strictly diagonally dominant, as it is in the example that follows.
For example: consider the following set of equations,
10x_1 + 2x_2 + x_3 = 9
2x_1 + 20x_2 - 2x_3 = -44
-2x_1 + 3x_2 + 10x_3 = 22.
To solve the above set of equations using the Gauss Seidel iteration scheme, start with (x_1^{(1)}, x_2^{(1)}, x_3^{(1)}) = (0, 0, 0) as initial values and compute the values of x_1, x_2, x_3 using the equations given below

x_1^{(k+1)} = (b_1 - a_{12} x_2^{(k)} - a_{13} x_3^{(k)}) / a_{11}
x_2^{(k+1)} = (b_2 - a_{21} x_1^{(k+1)} - a_{23} x_3^{(k)}) / a_{22}
x_3^{(k+1)} = (b_3 - a_{31} x_1^{(k+1)} - a_{32} x_2^{(k+1)}) / a_{33}
for k = 1, 2, 3, …
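The update scheme above can be sketched in a few lines of Python. This is only an illustration: the function name gauss_seidel, the tolerance tol and the iteration cap max_iter are assumptions not given in the text, and the sketch relies on the diagonal entries being non-zero.

```python
def gauss_seidel(a, b, tol=1e-10, max_iter=100):
    """Solve a x = b iteratively, always using the newest available values."""
    n = len(b)
    x = [0.0] * n                               # initial values (0, 0, ..., 0)
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(n):
            # Sum over the other unknowns; entries with j < i are already updated.
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            new_value = (b[i] - s) / a[i][i]
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value
        if largest_change < tol:                # desired accuracy reached
            break
    return x


A = [[10, 2, 1], [2, 20, -2], [-2, 3, 10]]
B = [9, -44, 22]
solution = gauss_seidel(A, B)
print([round(v, 6) for v in solution])
```

For the system in the text the iterates approach x_1 = 1, x_2 = -2, x_3 = 3, which can be checked by substituting back into the three equations.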
This process is continued up to some desired accuracy. Numerical iterative methods are also applicable for obtaining the roots of an equation of the form f(x) = 0. The various iterative methods used for this purpose are,
1. Bisection method: x_{i+2} = (x_i + x_{i+1}) / 2
2. Regula-Falsi method: x_2 = (x_0 f(x_1) - x_1 f(x_0)) / (f(x_1) - f(x_0))
3. Newton Raphson method: x_2 = x_1 - f(x_1) / f'(x_1)
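As a rough illustration of the three root-finding methods, the sketch below applies each of them to f(x) = x^2 - 2, whose positive root is the square root of 2. The function names, tolerances, iteration limits and the test function are all assumptions chosen for the example; the regula falsi update is the standard false-position formula with a minus sign in the numerator.

```python
def bisection(f, lo, hi, tol=1e-10):
    """Halve the bracketing interval [lo, hi] until it is shorter than tol."""
    if f(lo) * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid                  # the root lies in the left half
        else:
            lo = mid                  # the root lies in the right half
    return (lo + hi) / 2


def regula_falsi(f, x0, x1, tol=1e-10, max_iter=200):
    """False position: the chord through the bracket ends gives the next estimate."""
    x2 = x0
    for _ in range(max_iter):
        x2 = (x0 * f(x1) - x1 * f(x0)) / (f(x1) - f(x0))
        if abs(f(x2)) < tol:
            break
        if f(x0) * f(x2) < 0:
            x1 = x2                   # keep the root bracketed by x0 and x2
        else:
            x0 = x2
    return x2


def newton_raphson(f, df, x, tol=1e-12, max_iter=50):
    """Follow the tangent: x <- x - f(x) / f'(x)."""
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def f(x):
    return x * x - 2


def df(x):
    return 2 * x


root_b = bisection(f, 1.0, 2.0)
root_r = regula_falsi(f, 1.0, 2.0)
root_n = newton_raphson(f, df, 1.0)
print(root_b, root_r, root_n)
```

Bisection and regula falsi need a starting interval on which f changes sign, while Newton Raphson needs the derivative and a reasonable starting guess.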
Review of Searching Algorithm:
Let us assume that we have a sequential file and we wish to retrieve an element matching the key 'k'. Then we have to search the entire file from the beginning till the end to check whether the element matching k is present in the file or not.
There are a number of complex searching algorithms to serve the purpose of searching. The linear search and binary search methods are relatively straightforward methods of searching.
2.2.1 Sequential search
In this method, we start to search from the beginning of the list and examine each element till the end of the list. If the desired element is found we stop the search and return the index of that element. If the item is not found and the list is exhausted the search returns a zero value.
In the worst case the item is not found or the search item is the last (nth) element. In both situations we must examine all n elements of the array, so the order of magnitude or complexity of the sequential search is n, i.e., O(n). The execution time for this algorithm is proportional to n, that is, the algorithm executes in linear time.
The algorithm for sequential search is as follows,
Algorithm : sequential search
Input : A, vector of n elements
        k, search element
Output : i, index of k
Method : i=1
While(i<=n)
{
   if(A[i]=k)
   {
      write("search successful")
      write(k is at location i)
      exit();
   }
   else
      i++
   if end
}
while end
write (search unsuccessful);
algorithm ends.
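The sequential search just described might look as follows in Python. This is a sketch, not the text's own program: the function name is an assumption, and it keeps the text's convention of reporting a 1-based location and returning zero when the key is absent.

```python
def sequential_search(a, key):
    """Return the 1-based location of key in a, or 0 if it is not present."""
    for location, item in enumerate(a, start=1):
        if item == key:
            return location          # search successful
    return 0                         # list exhausted: search unsuccessful


data = [20, 35, 18, 8, 14, 41, 3, 39]
print(sequential_search(data, 41))   # prints 6
print(sequential_search(data, 99))   # prints 0
```

In the worst case (key absent or in the last position) the loop body runs n times, which is the O(n) behaviour noted above.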

2.2.2 Binary search
The binary search method is also a relatively simple method. For this method it is necessary to have the vector in alphabetical or numerically increasing order. A search for a particular item with key X resembles the search for a word in the dictionary. The approximate mid entry is located and its key value is examined. If the mid value is greater than X, then the list is chopped off at the (mid-1)th location. Now the list gets reduced to half the original list. The middle entry of the left-reduced list is examined in a similar manner. This procedure is repeated until the item is found or the list has no more elements. On the other hand, if the mid value is less than X, then the list is chopped off at the (mid+1)th location. The middle entry of the right-reduced list is examined and the procedure is continued until the desired key is found or the search interval is exhausted.
The algorithm for binary search is as follows,
Algorithm : binary search
Input : A, vector of n elements
        k, search element
Output : low, index of k
Method : low=1, high=n+1
While(low<high-1)
{
   mid=(low+high)/2
   if(k<A[mid])
      high=mid
   else
      low=mid
   if end
}
while end
if(k=A[low])
{
   write("search successful")
   write(k is at location low)
   exit();
}
else
   write (search unsuccessful);
if end;
algorithm ends.
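A minimal Python sketch of binary search is given below. It follows the prose description (discarding everything from mid onwards, or everything up to mid, after each comparison) rather than the exact loop of the pseudocode; the function name and the convention of returning a 1-based location, with zero for an unsuccessful search, are assumptions made to match the sequential search section.

```python
def binary_search(a, key):
    """Return the 1-based location of key in the sorted list a, or 0 if absent."""
    low, high = 0, len(a) - 1
    while low <= high:                # the search interval is not yet exhausted
        mid = (low + high) // 2
        if key < a[mid]:
            high = mid - 1            # chop the list off at mid - 1
        elif key > a[mid]:
            low = mid + 1             # chop the list off at mid + 1
        else:
            return mid + 1            # found: convert to a 1-based location
    return 0


sorted_data = [3, 8, 14, 18, 20, 35, 39, 41]
print(binary_search(sorted_data, 35))  # prints 6
print(binary_search(sorted_data, 5))   # prints 0
```

Each pass halves the interval, so at most about log2 n + 1 elements are examined, compared with n for the sequential search.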
Sorting:
One of the major applications in computer science is the sorting of information in a table. Sorting algorithms arrange items in a set according to a predefined ordering relation. The most common types of data are string information and numerical information. The ordering relation for numeric data simply involves arranging items in sequence from smallest to largest or from largest to smallest, which is called ascending and descending order respectively.
The items in a set arranged in non-decreasing order are {7,11,13,16,16,19,23}. The items in a set arranged in descending order are of the form {23,19,16,16,13,11,7}.
Similarly for string information, {a, abacus, above, be, become, beyond} is in ascending order and {beyond, become, be, above, abacus, a} is in descending order.
There are numerous methods available for sorting information. But no single one of them is best for all applications. The performance of the methods depends on parameters like the size of the data set, the degree of relative order already present in the data, etc.

2.3.1 Selection sort
The idea in selection sort is to find the smallest value and place it in order, then find the next smallest and place it in the right order. This process is continued till the entire table is sorted.
Consider the unsorted array,
a[1] a[2] ... a[8]
20 35 18 8 14 41 3 39
The resulting array should be
a[1] a[2] ... a[8]
3 8 14 18 20 35 39 41
One way to sort the unsorted array would be to perform the following steps:
Find the smallest element in the unsorted array
Place the smallest element in the position of a[1]
i.e., the smallest element in the unsorted array is 3, so exchange the values of a[1] and a[7]. The array now becomes,
a[1] a[2] ... a[8]
3 35 18 8 14 41 20 39
Now find the smallest from a[2] to a[8], i.e., 8, so exchange the values of a[2] and a[4], which results in the array shown below,
a[1] a[2] ... a[8]
3 8 18 35 14 41 20 39

Repeat this process until the entire array is sorted. The changes undergone by the array are shown in fig 2.2. The number of moves with this technique is always of the order O(n), although the number of comparisons is of the order O(n^2).
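The selection sort steps above can be sketched as follows. The function name and the returned exchange counter are assumptions added for illustration; the counter is there only to show that at most n-1 exchanges are ever made.

```python
def selection_sort(values):
    """Return a sorted copy of values and the number of exchanges performed."""
    a = list(values)                  # work on a copy
    exchanges = 0
    for i in range(len(a) - 1):
        # Find the position of the smallest element in a[i:].
        smallest = i
        for j in range(i + 1, len(a)):
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:             # place it at position i
            a[i], a[smallest] = a[smallest], a[i]
            exchanges += 1
    return a, exchanges


result, exchanges = selection_sort([20, 35, 18, 8, 14, 41, 3, 39])
print(result)      # prints [3, 8, 14, 18, 20, 35, 39, 41]
```

The first two passes reproduce the intermediate arrays shown in the text: 3 is exchanged with 20, then 8 with 35.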

2.3.2 Insertion sort
Insertion sort is a straightforward method that is useful for small collections of data. The idea here is to obtain the complete solution by inserting an element from the unordered part into the partially ordered solution, extending it by one element. Selecting an element from the unordered list could be simple if the first element of that list is selected.
a[1] a[2] ... a[8]
20 35 18 8 14 41 3 39
Initially the whole array is unordered. So select the minimum and put it in place of a[1] to act as a sentinel. Now the array is of the form,
a[1] a[2] ... a[8]
3 35 18 8 14 41 20 39
Now we have one element in the sorted list and the remaining elements are in the unordered set. Select the next element to be inserted. If the selected element is less than the preceding element, move the preceding element by one position and insert the smaller element.
In the above array the next element to be inserted is x=35, but the preceding element is 3, which is less than x. Hence, take the next element for insertion, i.e., 18. 18 is less than 35, so move 35 one position ahead and place 18 at that place. The resulting array will be,
a[1] a[2] ... a[8]
3 18 35 8 14 41 20 39
Now the element to be inserted is 8. 8 is less than 35 and 8 is also less than 18, so move 35 and 18 one position right and place 8 at a[2]. This process is carried on till the sorted array is obtained.
The changes undergone are shown in fig 2.3.
One of the disadvantages of the insertion sort method is the amount of movement of data. In the worst case, the number of moves is of the order O(n^2). For lengthy records it is quite time consuming.
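A short Python sketch of insertion sort follows. It is the common variant that checks the array bound instead of first moving the minimum to a[1] as a sentinel, so it differs slightly from the walk-through above; the function name is an assumption.

```python
def insertion_sort(values):
    """Return a sorted copy of values using straight insertion."""
    a = list(values)
    for i in range(1, len(a)):
        x = a[i]                      # next element from the unordered part
        j = i - 1
        while j >= 0 and a[j] > x:    # shift larger elements one place right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x                  # insert x into the ordered part
    return a


print(insertion_sort([20, 35, 18, 8, 14, 41, 3, 39]))
# prints [3, 8, 14, 18, 20, 35, 39, 41]
```

When the input is in descending order every element is shifted past all of its predecessors, which gives the O(n^2) moves mentioned above.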
2.3.3 Merge sort
Merge sort begins by interpreting the input as n sorted files, each of length one. These are merged pairwise to obtain n/2 files of size two. If n is odd, one file is of size one. These n/2 files are then merged pairwise, and so on until we are left with only one file. The example in fig 2.4 illustrates the process of merge sort.
As illustrated in the example, merge sort consists of several passes over the records being sorted. In the first pass files of size one are merged. In the second pass the size of the files being merged is two. In the ith pass the files being merged will be of size 2^(i-1). A total of ⌈log2 n⌉ passes are made over the data. Since two files can be merged in linear time, each pass of merge sort takes O(n) time. As there are ⌈log2 n⌉ passes, the total time complexity is O(n log2 n).
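The pass-by-pass (bottom-up) merge sort described above can be sketched as below. The names merge, merge_sort and runs are assumptions for illustration; the function also returns the number of passes so that the log2 n claim can be observed.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])           # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Return (sorted list, number of passes) using pairwise merging."""
    runs = [[x] for x in values]      # n sorted files of length one
    passes = 0
    while len(runs) > 1:
        next_runs = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                next_runs.append(merge(runs[k], runs[k + 1]))
            else:
                next_runs.append(runs[k])   # odd file out is carried over
        runs = next_runs
        passes += 1
    return (runs[0] if runs else []), passes


result, passes = merge_sort([20, 35, 18, 8, 14, 41, 3, 39])
print(result, passes)   # prints [3, 8, 14, 18, 20, 35, 39, 41] 3
```

With n = 8 records the files grow from size 1 to 2 to 4 to 8, so three passes are needed.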
Recursion:
Recursion may have the following definitions:
- The nested repetition of an identical algorithm is recursion.
- It is a technique of defining an object/process by itself.
- Recursion is a process by which a function calls itself repeatedly until some specified condition has been satisfied.
2.4.1 When to use recursion
Recursion can be used for repetitive computations in which each action is stated in terms of a previous result. There are two conditions that must be satisfied by any recursive procedure.
1. Each time a function calls itself it should get nearer to the solution.
2. There must be a decision criterion for stopping the process.
In making the decision about whether to write an algorithm in recursive or non-recursive form, it is always advisable to consider a tree structure for the problem. If the structure is simple then use the non-recursive form. If the tree appears quite bushy, with little duplication of tasks, then recursion is suitable.
The recursive algorithm for finding the factorial of a number is given below,
Algorithm : factorial-recursion
Input : n, the number whose factorial is to be found.
Output : f, the factorial of n
Method : if(n=0)
            f=1
         else
            f=factorial(n-1) * n
         if end
algorithm ends.
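The factorial algorithm translates almost line for line into Python. The guard against negative input is an addition not present in the pseudocode; without it the recursion would never reach the stopping condition n = 0.

```python
def factorial(n):
    """Return n! for a non-negative integer n, computed recursively."""
    if n < 0:
        raise ValueError("factorial is defined here only for n >= 0")
    if n == 0:                        # decision criterion for stopping
        return 1
    return factorial(n - 1) * n       # each call moves nearer to n = 0


print(factorial(5))   # prints 120
```

Both conditions for a recursive procedure are visible here: the argument shrinks by one on every call, and n = 0 stops the process.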
The general procedure for any recursive algorithm is as follows,
1. Save the parameters, local variables and return addresses.
2. If the termination criterion is reached, perform the final computation and go to step 3; otherwise perform the partial computation and go to step 1.
3. Restore the most recently saved parameters, local variables and return address, and go to the latest return address.
2.4.2 Iteration v/s Recursion
Demerits of recursive algorithms
1. Some programming languages (early FORTRAN, for example) do not support recursion, hence a recursive mathematical function is implemented using iterative methods.
2. Even though mathematical functions can be easily implemented using recursion, it is always at the cost of execution time and memory space. For example, the recursion tree for generating 6 numbers in a fibonacci series is given in fig 2.5. A fibonacci series is of the form 0,1,1,2,3,5,8,13,… etc., where the third number is the sum of the preceding two numbers and so on. It can be noticed from fig 2.5 that f(n-2) is computed twice, f(n-3) is computed thrice, and f(n-4) is computed 5 times.
3. A recursive procedure can be called from within or outside itself, and to ensure its proper functioning it has to save the return addresses in some order so that a return to the proper location will result when the return to a calling statement is made.
4. Recursive programs need considerably more storage and will take more time.
Demerits of iterative methods
1. Mathematical functions such as factorial and fibonacci series generation can be implemented more easily using recursion than iteration.
2. In iterative techniques, looping of statements is very much necessary.
Recursion is a top-down approach to problem solving. It divides the problem into pieces or selects out one key step, postponing the rest.
Iteration is more of a bottom-up approach. It begins with what is known and from this constructs the solution step by step. For the fibonacci example, the iterative function uses time that is O(n), whereas the straightforward recursive function has an exponential time complexity.
It is always true that recursion can be replaced by iteration and stacks. It is also true that a stack can be replaced by a recursive program with no stack.
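The repeated work in the recursive fibonacci computation can be made visible by counting calls. The sketch below is illustrative only: the names fib_recursive, fib_iterative and calls are assumptions, and the series is indexed so that f(0) = 0 and f(1) = 1.

```python
from collections import Counter

calls = Counter()                     # calls[k] = how often f(k) was evaluated


def fib_recursive(n):
    """Straightforward recursion: recomputes the same values many times."""
    calls[n] += 1
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n):
    """Bottom-up iteration: n additions, each value computed once."""
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, previous + current
    return previous


n = 6
print(fib_recursive(n), fib_iterative(n))          # prints 8 8
print(calls[n - 2], calls[n - 3], calls[n - 4])    # prints 2 3 5
```

The call counts themselves follow the fibonacci pattern, which is why the recursive version takes exponential time while the loop takes O(n).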
2.5 Hashing
Hashing is a practical technique for maintaining a symbol table. A symbol table is a data structure which allows one to easily determine whether an arbitrary element is present or not.
Consider the sequential memory shown in fig 2.6. In the hashing technique the address X of a variable x is obtained by computing an arithmetic function (hashing function) f(x). Thus f(x) points to the address where x should be placed in the table. This address is known as the hash address.
The memory used to store the variables using the hashing technique is assumed to be sequential. The memory is known as the hash table. The hash table is partitioned into several storing spaces called buckets, and each bucket is divided into slots (fig 2.6).
If there are b buckets in the table, each bucket is capable of holding s variables, where each variable occupies one slot. The function f(x) maps the possible variables onto the integers 0 through b-1. The size of the space from which the variables are drawn is called the identifier space. Let T be the identifier space and n be the number of variables/identifiers in the hash table. Then the ratio n/T is called the identifier density, and α = n/(sb) is the loading density or loading factor.
If f(x1) = f(x2), where x1 and x2 are any two variables, then x1 and x2 are called synonyms. Synonyms are mapped onto the same bucket. If a new identifier is hashed into an already full bucket, a collision occurs.
A hashing table with a single slot per bucket is as given below. Let there be 26 buckets with a single slot each. The identifiers to be stored are GA, D, A, G, L, A2, A1, A3, A4, Z, ZA, E. Let f(x) be the function which maps on to an address equal to the position of the first character of the identifier in the English alphabet. The hashing table generated is as shown in fig 2.7.
Time taken to retrieve the identifiers is as follows,
Search element (x)   Search time (t)
GA                   1
D                    1
A                    1
G                    2
L                    1
A2                   2
A1                   3
A3                   5
A4                   6
Z                    1
ZA                   10
E                    6
Σt = 39
Average retrieval time = (Σt)/n.
The average retrieval time entirely depends on the hashing function.
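The search times in the table can be reproduced with a small simulation. The text does not state how collisions are resolved, so the sketch assumes linear probing (try the next bucket, wrapping around after the last one), which is consistent with every value in the table. Buckets are numbered 0 to 25 here, and the names bucket_of and insert are illustrative.

```python
def bucket_of(identifier):
    """Hash on the first character: A -> 0, B -> 1, ..., Z -> 25."""
    return ord(identifier[0].upper()) - ord("A")


def insert(table, identifier):
    """Store identifier using linear probing; return the buckets examined."""
    b = len(table)
    home = bucket_of(identifier)
    for probes in range(1, b + 1):
        slot = (home + probes - 1) % b     # wrap around after the last bucket
        if table[slot] is None:
            table[slot] = identifier
            return probes                  # a later search examines the same buckets
    raise RuntimeError("hash table is full")


table = [None] * 26                        # 26 buckets, one slot each
identifiers = ["GA", "D", "A", "G", "L", "A2", "A1", "A3", "A4", "Z", "ZA", "E"]
probe_counts = {name: insert(table, name) for name in identifiers}
total = sum(probe_counts.values())
print(total, total / len(identifiers))     # prints 39 3.25
```

The long search for ZA (10 probes) comes from the cluster that the synonyms of A build up at the start of the table, which illustrates how strongly the average retrieval time depends on the hashing function.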
Exercise 2:
1. What are the serious shortcomings of the binary search method and the sequential search method?
2. Learn about more searching techniques involving hashing functions.
3. Implement the algorithms for searching and calculate the complexities.
4. Write an algorithm for the above method of selection sort and implement the same.
5. Write the algorithm for the merge sort method.
6. Take 5 data sets of length 10 and hand simulate each method given above.
7. Try to learn about more sorting techniques and make a comparative study of them.
8. Write an iterative algorithm to find the factorial of a number.
9. Write a recursive and an iterative program for reversing a number.
10. Write a recursive and an iterative program to find the maximum and minimum in a list of numbers.
11. Write an algorithm to implement the hashing technique and implement the same.
12. Hand simulate all algorithms for 5 data sets.
Introduction to Graph Theory:
3.1
3.1.1 What is a graph?
A graph G = (V, E) consists of a set of objects V = {v1, v2, …} called vertices, and another set E = {e1, e2, …} whose elements are called edges. Each edge ek in E is identified with an unordered pair (vi, vj) of vertices. The vertices vi, vj associated with edge ek are called the end vertices of ek.
The most common representation of a graph is by means of a diagram, in which the vertices are represented as points and each edge as a line segment joining its end vertices. Often this diagram itself is referred to as a graph.
Fig 3-1.
In Fig. 3-1, edge e1, having the same vertex as both its end vertices, is called a self-loop. There may be more than one edge associated with a given pair of vertices, for example e4 and e5 in Fig. 3-1. Such edges are referred to as parallel edges.
A graph that has neither self-loops nor parallel edges is called a simple graph; otherwise it is called a general graph. It should also be noted that, in drawing a graph, it is immaterial whether the lines are drawn straight or curved, long or short: what is important is the incidence between the edges and vertices.
A graph is also called a linear complex, a 1-complex, or a one-dimensional complex. A vertex is also referred to as a node, a junction, a point, a 0-cell, or a 0-simplex. Other terms used for an edge are a branch, a line, an element, a 1-cell, an arc, and a 1-simplex.
Because of its inherent simplicity, graph theory has a very wide range of applications in engineering, physical, social, and biological sciences, linguistics, and in numerous other areas. A graph can be used to represent almost any physical situation involving discrete objects and a relationship among them.

3.1.2 Finite and Infinite Graphs
Although in the definition of a graph neither the vertex set V nor the edge set E need be finite, in most of the theory and almost all applications these sets are finite. A graph with a finite number of vertices as well as a finite number of edges is called a finite graph; otherwise, it is an infinite graph.
3.1.3 Incidence and Degree
When a vertex vi is an end vertex of some edge ej, vi and ej are said to be incident with (on or to) each other. In Fig. 3-1, for example, edges e2, e6, and e7 are incident with vertex v4. Two nonparallel edges are said to be adjacent if they are incident on a common vertex. For example, e2 and e7 in Fig. 3-1 are adjacent. Similarly, two vertices are said to be adjacent if they are the end vertices of the same edge. In Fig. 3-1, v4 and v5 are adjacent, but v1 and v4 are not.
The number of edges incident on a vertex vi, with self-loops counted twice, is called the degree, d(vi), of vertex vi. In Fig. 3-1, for example, d(v1) = d(v3) = d(v4) = 3, d(v2) = 4, and d(v5) = 1. The degree of a vertex is sometimes also referred to as its valency. Since each edge contributes two degrees, the sum of the degrees of all vertices in G is twice the number of edges in G.
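The degree rule can be checked mechanically on an edge list. Fig. 3-1 itself is not reproduced in these notes, so the edge list below is a hypothetical one, chosen only to agree with the facts quoted in the text (e1 is a self-loop, e4 and e5 are parallel, e2, e6 and e7 meet at v4, and the degrees are 3, 4, 3, 3, 1).

```python
from collections import Counter

# Hypothetical edge list consistent with the description of Fig. 3-1.
edges = {
    "e1": ("v2", "v2"),   # self-loop
    "e2": ("v2", "v4"),
    "e3": ("v1", "v2"),
    "e4": ("v1", "v3"),   # e4 and e5 are parallel edges
    "e5": ("v1", "v3"),
    "e6": ("v3", "v4"),
    "e7": ("v4", "v5"),
}

degree = Counter()
for u, v in edges.values():
    degree[u] += 1
    degree[v] += 1        # for a self-loop u == v, so it is counted twice

print(dict(sorted(degree.items())))
print(sum(degree.values()), 2 * len(edges))   # prints 14 14
```

Every edge adds exactly two to the total, one for each end vertex, which is why the sum of the degrees always equals twice the number of edges.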
3.1.4 Isolated vertex, Pendant vertex, and Null graph
A vertex having no incident edge is called an isolated vertex. In other words, isolated vertices are vertices with zero degree. Vertices v4 and v7 in Fig. 3-2, for example, are isolated vertices. A vertex of degree one is called a pendant vertex or an end vertex. Vertex v3 in Fig. 3-2 is a pendant vertex. Two adjacent edges are said to be in series if their common vertex is of degree two. In Fig. 3-2, the two edges incident on v1 are in series.

Fig. 3-2 Graph containing isolated vertices, series edges and a pendant vertex.
In the definition of a graph G = (V, E), it is possible for the edge set E to be empty. Such a graph, without any edges, is called a null graph. In other words, every vertex in a null graph is an isolated vertex. A null graph of six vertices is shown in Fig. 3-3. Although the edge set E may be empty, the vertex set V must not be empty; otherwise, there is no graph. In other words, by definition, a graph must have at least one vertex.

Matrix Representation of Graph:
Although a pictorial representation of a graph is very convenient for a visual study, other representations are better for computer processing. A matrix is a convenient and useful way of representing a graph to a computer.
Matrices lend themselves easily to mechanical manipulations. Besides, many known results of matrix algebra can be readily applied to study the structural properties of graphs from an algebraic point of view. In many applications of graph theory, such as in electrical network analysis and operations research, matrices also turn out to be the natural way of expressing the problem.
3.2.1 Incidence Matrix
Let G be a graph with n vertices, e edges, and no self-loops. Define an n by e matrix A = [aij], whose n rows correspond to the n vertices and the e columns correspond to the e edges, as follows:
The matrix element
aij = 1, if the jth edge ej is incident on the ith vertex vi, and
    = 0, otherwise.
     a  b  c  d  e  f  g  h
v1   0  0  0  1  0  1  0  0
v2   0  0  0  0  1  1  1  1
v3   0  0  0  0  0  0  0  1
v4   1  1  1  0  1  0  0  0
v5   0  0  1  1  0  0  1  0
v6   1  1  0  0  0  0  0  0
Fig. 3-4 Graph and its incidence matrix.
Such a matrix A is called the vertex-edge incidence matrix, or simply the incidence matrix. Matrix A for a graph G is sometimes also written as A(G). A graph and its incidence matrix are shown in Fig. 3-4. The incidence matrix contains only two elements, 0 and 1. Such a matrix is called a binary matrix or a (0, 1)-matrix.
The following observations about the incidence matrix A can readily be made:
1. Since every edge is incident on exactly two vertices, each column of A has exactly two 1's.
2. The number of 1's in each row equals the degree of the corresponding vertex.
3. A row with all 0's, therefore, represents an isolated vertex.
4. Parallel edges in a graph produce identical columns in its incidence matrix, for example, columns 1 and 2 in Fig. 3-4.
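The four observations can be verified directly on the matrix of Fig. 3-4. The sketch below stores the matrix as a list of rows; the variable names are assumptions made for the example.

```python
vertices = ["v1", "v2", "v3", "v4", "v5", "v6"]
edge_names = list("abcdefgh")
A = [
    [0, 0, 0, 1, 0, 1, 0, 0],   # v1
    [0, 0, 0, 0, 1, 1, 1, 1],   # v2
    [0, 0, 0, 0, 0, 0, 0, 1],   # v3
    [1, 1, 1, 0, 1, 0, 0, 0],   # v4
    [0, 0, 1, 1, 0, 0, 1, 0],   # v5
    [1, 1, 0, 0, 0, 0, 0, 0],   # v6
]

# Observation 1: every column holds exactly two 1's.
columns = [tuple(row[j] for row in A) for j in range(len(edge_names))]
column_sums = [sum(col) for col in columns]

# Observation 2: the row sums are the vertex degrees.
degrees = {v: sum(row) for v, row in zip(vertices, A)}

# Observation 3: an all-zero row would mark an isolated vertex.
isolated = [v for v, d in degrees.items() if d == 0]

# Observation 4: identical columns reveal parallel edges.
parallel = [
    (edge_names[i], edge_names[j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if columns[i] == columns[j]
]

print(column_sums)   # prints [2, 2, 2, 2, 2, 2, 2, 2]
print(degrees)
print(isolated)      # prints []
print(parallel)      # prints [('a', 'b')]
```

The degrees add up to 16, twice the number of edges, in agreement with the degree rule of section 3.1.3.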
Concept of Trees:
The concept of a tree is probably the most important in graph theory, especially for those interested in applications of graphs.
A tree is a connected graph without any circuits. The graph in Fig. 3-5, for instance, is a tree. It follows immediately from the definition that a tree has to be a simple graph, that is, having neither a self-loop nor parallel edges (because they both form circuits).
Fig. 3-5. Tree
Trees appear in numerous instances. The genealogy of a family is often represented by means of a tree. A river with its tributaries and sub-tributaries can also be represented by a tree. The sorting of mail according to zip code and the sorting of punched cards are done according to a tree (called a decision tree or sorting tree).

3.3.1 Some properties of Trees
1. There is one and only one path between every pair of vertices in a tree, T.
2. A tree with n vertices has n-1 edges.
3. Any connected graph with n vertices and n-1 edges is a tree.
4. A graph is a tree if and only if it is minimally connected.
Therefore a graph G with n vertices is called a tree if
1. G is connected and is circuitless, or
2. G is connected and has n-1 edges, or
3. G is circuitless and has n-1 edges, or
4. There is exactly one path between every pair of vertices in G, or
5. G is a minimally connected graph.
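Characterization 2 (connected and n-1 edges) gives a simple computational test for a tree. The following sketch assumes the vertices are numbered 0 to n-1 and the graph is given as a list of edges; the function name is_tree is illustrative.

```python
def is_tree(n, edges):
    """Return True if the graph on vertices 0..n-1 is connected with n-1 edges."""
    if n == 0 or len(edges) != n - 1:
        return False
    adjacency = {v: [] for v in range(n)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen = {0}                        # depth-first search from vertex 0
    stack = [0]
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n             # connected if every vertex was reached


print(is_tree(4, [(0, 1), (1, 2), (1, 3)]))   # prints True
print(is_tree(4, [(0, 1), (1, 2), (2, 0)]))   # prints False
```

The second graph has the right number of edges but contains a circuit and leaves vertex 3 unreachable, so it fails the connectivity check.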
>ig1 9+= &ree of a monotonically increasing se(ences in :,7,79,S,B,8,T,77,9
/./.2 8endent :ertices in a 9ree
%t is obser"ed that a tree sho#n in the >ig1 9+; has se"eral pendant "ertices1 A pendant "ertex #as defined
as a "ertex of degree one)1 &he reason is that in a tree of n "ertices #e ha"e n+7 edges, and hence 8(n+7)
degrees to be di"ided among n "ertices1 Since no "ertex can be of 2ero degree, #e mst ha"e at least t#o
"ertices of degree one in a tree1 &his ma-es sense only if n 81
An Application: The following problem is used in teaching computer programming. Given a sequence of
integers, no two of which are the same, find the longest monotonically increasing subsequence in it. Suppose
that the sequence given to us is 4, 1, 13, 7, 0, 2, 8, 11, 3; it can be represented by a tree in which the vertices
(except the start vertex) represent individual numbers in the sequence, and the path from the start vertex
to a particular vertex v describes the monotonically increasing subsequence terminating in v.
As shown in Fig. 3-6, this sequence contains four longest monotonically increasing subsequences, that is,
(4,7,8,11), (1,7,8,11), (1,2,8,11) and (0,2,8,11). Each is of length four. Computer programmers refer to
such a tree used in representing data as a data tree.
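The data tree can also be explored in code. The sketch below does not build the tree explicitly; it assumes a simple table of "longest length ending here" values and then walks backwards to list every longest increasing subsequence. The function name and approach are illustrative choices, not the method of these notes.

```python
def longest_increasing_subsequences(seq):
    """List all longest strictly increasing subsequences of seq."""
    n = len(seq)
    # best[i] = length of the longest increasing subsequence ending at seq[i]
    best = [1] * n
    for i in range(n):
        for j in range(i):
            if seq[j] < seq[i]:
                best[i] = max(best[i], best[j] + 1)
    longest = max(best, default=0)

    def paths_ending_at(i):
        # Rebuild every subsequence of length best[i] that ends at index i.
        if best[i] == 1:
            return [[seq[i]]]
        result = []
        for j in range(i):
            if seq[j] < seq[i] and best[j] == best[i] - 1:
                result.extend(p + [seq[i]] for p in paths_ending_at(j))
        return result

    found = []
    for i in range(n):
        if best[i] == longest:
            found.extend(paths_ending_at(i))
    return found
```

Applied to the sequence 4, 1, 13, 7, 0, 2, 8, 11, 3 it yields the four subsequences of length four listed above.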
3.3.3 Rooted and Binary Tree
A tree in which one vertex (called the root) is distinguished from all the others is called a rooted tree. For
instance, in Fig. 3-6 the vertex named start is distinguished from the rest of the vertices. Hence
vertex start can be considered the root of the tree, and so the tree is rooted. Generally, the term tree means
trees without any root. However, for emphasis they are sometimes called free trees (or non-rooted trees) to
differentiate them from the rooted kind.

Fig. 3-6. Tree.
Binary Trees: A special class of rooted trees, called binary rooted trees, is of particular interest, since they
are extensively used in the study of computer search methods, binary identification problems, and variable-
length binary codes. A binary tree is defined as a tree in which there is exactly one vertex of degree two,
and each of the remaining vertices is of degree one or three. Since the vertex of degree two is distinct from all
other vertices, this vertex serves as a root. Thus every binary tree is a rooted tree.
3.3.4 Spanning Trees
So far we have discussed the tree when it occurs as a graph by itself. Now we shall study the tree as a
subgraph of another graph. A given graph has numerous subgraphs; from e edges, 2^e distinct combinations
are possible. Obviously, some of these subgraphs will be trees. Out of these trees we are particularly
interested in certain types of trees, called spanning trees.
A tree T is said to be a spanning tree of a connected graph G if T is a subgraph of G and T contains all
vertices of G. Since the vertices of G are barely hanging together in a spanning tree, it is a sort of skeleton
of the original graph G. This is why a spanning tree is sometimes referred to as a skeleton or scaffolding of
G. Since spanning trees are the largest trees among all trees in G, it is also quite appropriate to call a
spanning tree a maximal tree subgraph or maximal tree of G.
Finding a spanning tree of a connected graph G is simple. If G has no circuit, it is its own spanning tree. If G
has a circuit, delete an edge from the circuit. This will still leave the graph connected. If there are more
circuits, repeat the operation till an edge from the last circuit is deleted, leaving a connected, circuit-free
graph that contains all the vertices of G.
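The edge-deletion procedure can be sketched as follows. An edge lies on a circuit exactly when removing it leaves the graph connected, so the sketch simply tries each edge in turn. It assumes a connected input graph with vertices numbered 0 to n-1; the helper names connected and spanning_tree are hypothetical.

```python
def connected(n, edges):
    """Return True if the undirected graph on vertices 0..n-1 is connected."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n

def spanning_tree(n, edges):
    """Delete circuit edges one at a time until only a spanning tree is left."""
    kept = list(edges)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        if connected(n, trial):
            kept = trial   # the edge was on a circuit, so it can be dropped
        else:
            i += 1         # removing it would disconnect the graph; keep it
    return kept
```

This is deliberately naive (one connectivity check per edge); a depth-first search would produce a spanning tree faster, but the version above mirrors the description in the text.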
3.3.5 Hamiltonian Paths and Circuits
A Hamiltonian circuit in a connected graph is defined as a closed walk that traverses every vertex of G exactly
once, except of course the starting vertex, at which the walk also terminates. A circuit in a connected graph
G is said to be Hamiltonian if it includes every vertex of G. Hence a Hamiltonian circuit in a graph of n
vertices consists of exactly n edges.
Hamiltonian path: If we remove any one edge from a Hamiltonian circuit, we are left with a path. This path
is called a Hamiltonian path. Clearly, a Hamiltonian path in a graph G traverses every vertex of G. Since a
Hamiltonian path is a subgraph of a Hamiltonian circuit (which in turn is a subgraph of another graph), every
graph that has a Hamiltonian circuit also has a Hamiltonian path. There are, however, many graphs with
Hamiltonian paths that have no Hamiltonian circuits. The length of a Hamiltonian path in a connected graph
of n vertices is n-1.
3.3.6 Traveling-Salesman Problem
A problem closely related to the question of Hamiltonian circuits is the Traveling-salesman problem, stated
as follows: A salesman is required to visit a number of cities during a trip. Given the distances between the
cities, in what order should he travel so as to visit every city precisely once and return home, with the
minimum mileage traveled?
Representing the cities by vertices and the roads between them by edges, we get a graph. In this graph,
with every edge ei there is associated a real number (the distance in miles, say), w(ei). Such a graph is
called a weighted graph, w(ei) being the weight of edge ei.
In our problem, if each of the cities has a road to every other city, we have a complete weighted graph. This
graph has numerous Hamiltonian circuits, and we are to pick the one that has the smallest sum of distances
(or weights).
The total number of different (not edge-disjoint, of course) Hamiltonian circuits in a complete graph of n
vertices can be shown to be (n-1)!/2. This follows from the fact that starting from any vertex we have n-1
edges to choose from at the first vertex, n-2 from the second, n-3 from the third, and so on. These choices
being independent, the result is (n-1)! choices. This number is, however, divided by 2, because each Hamiltonian
circuit has been counted twice.
Theoretically, the problem of the traveling salesman can always be solved by enumerating all (n-1)!/2
Hamiltonian circuits, calculating the distance traveled in each, and then picking the shortest one. However,
for a large value of n, the labor involved is too great even for a digital computer.
The problem is to prescribe a manageable algorithm for finding the shortest route. No efficient algorithm for
problems of arbitrary size has yet been found, although many attempts have been made. Since this problem
has applications in operations research, some specific large-scale examples have been worked out. There
are also available several heuristic methods of solution that give a route very close to the shortest one, but
do not guarantee the shortest.
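The exhaustive approach can be written down directly for small n. The sketch below assumes a symmetric distance matrix dist (a hypothetical input format), fixes city 0 as the start, and skips the mirror image of each circuit, so exactly (n-1)!/2 circuits are examined.

```python
from itertools import permutations

def shortest_tour(dist):
    """Brute-force traveling salesman on a symmetric distance matrix."""
    n = len(dist)
    best_len, best_tour = None, None
    for perm in permutations(range(1, n)):
        # A circuit and its reverse have the same length; examine only one.
        if n > 2 and perm[0] > perm[-1]:
            continue
        tour = (0,) + perm + (0,)
        length = sum(dist[tour[i]][tour[i + 1]] for i in range(n))
        if best_len is None or length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour
```

For four cities only three circuits are checked; for twenty cities the count already exceeds 6 × 10^16, which illustrates why enumeration is impractical.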
Exercise 3
1. Draw all simple graphs of one, two, three and four vertices.
2. Name 10 situations that can be represented by means of graphs. Explain what each vertex and
edge represent.
3. Draw a connected graph that becomes disconnected when any edge is removed from it.
4. Draw all trees of n labeled vertices for n = 1, 2, 3, 4 and 5.
5. Sketch all binary trees with six pendant edges.
6. Sketch all spanning trees of the given graphs in this chapter.
7. Write the incidence matrix for all the graphs developed.
8. Find the spanning trees for all the graphs developed.
9. Draw a graph which has a Hamiltonian path but does not have a Hamiltonian circuit.
10. List different paths from vertex 1 to vertex n in each graph developed.
Divide and Conquer:
There are a number of general and powerful computational strategies that are repeatedly
used in computer science. It is often possible to phrase a problem in terms of these
general strategies. These general strategies are Divide and Conquer and Dynamic Programming.
The techniques of Greedy Search, Backtracking and Branch and Bound evaluation are
variations of the dynamic programming idea. All these strategies and techniques are discussed
in the subsequent chapters.
The most widely known and often used of these is the divide and conquer strategy.
The basic idea of divide and conquer is to divide the original problem into two or more sub-
problems which can be solved by the same technique. If it is possible to split the problem
further into smaller and smaller sub-problems, a stage is reached where the sub-problems
are small enough to be solved without further splitting. Combining the solutions of the
individual sub-problems, we get the final solution. Combining need not mean simply the union of
individual solutions.
Divide and Conquer involves four steps
1. Divide
2. Conquer [Initial Conquer, which occurs when the small sub-problems are solved]
3. Combine
4. Conquer [Final Conquer].
To be precise, the forward journey is divide and the backward journey is conquer. A general binary
divide and conquer algorithm is:
Procedure D&C (P,Q) //the data size is from p to q
{
   If size(P,Q) is small Then
      Solve(P,Q)
   Else
      M ← divide(P,Q)
      Combine (D&C(P,M), D&C(M+1,Q))
}
Sometimes, this type of algorithm is known as a control abstraction algorithm, as it gives an
abstract flow. This way of breaking down the problem has found wide application in sorting,
selection and searching algorithms.
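4.1 Binary Search:
Algorithm: (searches for Key in the sorted array A, between positions p and q)
While (p ≤ q) do the following, Else Stop ('unsuccessful')
   m ← (p+q)/2, discarding any fraction
   If (A(m) = Key) Then 'successful', stop
   Else
      If (A(m) > Key) Then
         q ← m-1
      Else
         p ← m+1
End Algorithm.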
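A runnable version of the search may make the control flow easier to follow. This is a minimal sketch, assuming a sorted Python list and the 1-based positions p, q and m used in these notes; the probe counter is an addition for illustration, so that the "units of time" can be checked.

```python
def binary_search(a, key):
    """Search the sorted list a for key.

    Returns (position, probes), where position is 1-based (None if the key
    is absent) and probes is the number of elements examined.
    """
    p, q = 1, len(a)
    probes = 0
    while p <= q:
        m = (p + q) // 2
        probes += 1
        if a[m - 1] == key:
            return m, probes
        if a[m - 1] > key:
            q = m - 1      # key can only lie in the left half
        else:
            p = m + 1      # key can only lie in the right half
    return None, probes
```

With the data set 12, 18, 22, 32, 46, 52, 59, 62, 68, searching for 46 takes one probe and searching for 32 or 68 takes four.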
Illustration:
Consider the data set with elements {12,18,22,32,46,52,59,62,68}. First let us consider the
simulation for successful cases.
Successful cases:
Key=12    P    Q    m    Search
          1    9    5    x
          1    4    2    x
          1    1    1    successful
To search 12, 3 units of time is required

Key=18    P    Q    m    Search
          1    9    5    x
          1    4    2    successful
To search 18, 2 units of time is required

Key=22    P    Q    m    Search
          1    9    5    x
          1    4    2    x
          3    4    3    successful
To search 22, 3 units of time is required

Key=32    P    Q    m    Search
          1    9    5    x
          1    4    2    x
          3    4    3    x
          4    4    4    successful
To search 32, 4 units of time is required

Key=46    P    Q    m    Search
          1    9    5    successful
To search 46, 1 unit of time is required

Key=52    P    Q    m    Search
          1    9    5    x
          6    9    7    x
          6    6    6    successful
To search 52, 3 units of time is required

Key=59    P    Q    m    Search
          1    9    5    x
          6    9    7    successful
To search 59, 2 units of time is required

Key=62    P    Q    m    Search
          1    9    5    x
          6    9    7    x
          8    9    8    successful
To search 62, 3 units of time is required

Key=68    P    Q    m    Search
          1    9    5    x
          6    9    7    x
          8    9    8    x
          9    9    9    successful
To search 68, 4 units of time is required

Successful average search time = (3+2+3+4+1+3+2+3+4)/9 = 25/9, about 2.78 units

Unsuccessful cases:
Key=25    P    Q    m    Search
          1    9    5    x
          1    4    2    x
          3    4    3    x
          4    4    4    x
To search 25, 4 units of time is required

Key=65    P    Q    m    Search
          1    9    5    x
          6    9    7    x
          8    9    8    x
          9    9    9    x
To search 65, 4 units of time is required

Unsuccessful search time = (4+4)/2 = 4 units

Average search time = (sum of unsuccessful search times + sum of successful search times)/(n+(n+1))

Max-Min Search:
The Max-Min search problem aims at finding the smallest as well as the biggest element in a
vector A of n elements.
Following the steps of Divide and Conquer, the vector can be divided into sub-problems.

(Figure: division of the vector into sub-problems.)

The search has now reduced to the comparison of 2 numbers. The time is spent in conquering
and comparing, which is the major step in the algorithm.

Algorithm: Max-Min (p, q, max, min)
{
   If (p = q) Then
      max = a(p)
      min = a(q)
   Else
      If (p = q-1) Then
         If a(p) > a(q) Then
            max = a(p)
            min = a(q)
         Else
            max = a(q)
            min = a(p)
         If End
      Else
         m ← (p+q)/2
         max-min(p,m,max1,min1)
         max-min(m+1,q,max2,min2)
         max ← large(max1,max2)
         min ← small(min1,min2)
      If End
   If End
Algorithm End.
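The same recursion can be sketched in Python. The version below assumes 0-based list indices and returns the pair (max, min) instead of using output parameters; otherwise it follows the three cases of the algorithm above.

```python
def max_min(a, p, q):
    """Return (largest, smallest) of a[p..q], both ends inclusive."""
    if p == q:                      # one element
        return a[p], a[p]
    if p == q - 1:                  # two elements: a single comparison
        return (a[p], a[q]) if a[p] > a[q] else (a[q], a[p])
    m = (p + q) // 2                # divide
    max1, min1 = max_min(a, p, m)
    max2, min2 = max_min(a, m + 1, q)
    return max(max1, max2), min(min1, min2)   # combine
```

A call such as max_min(data, 0, len(data) - 1) covers the whole list.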
Illustration
Consider a data set with elements {82,36,49,91,12,14,06,76,92}. Initially the max and min
variables have null values. In the first call, the list is broken into two nearly equal halves. Each
half is again broken down into two. This process is continued till the length of the list is either
two or one. Then the maximum and minimum values are chosen from the smallest list and
these values are returned to the preceding step, where the length of the list is slightly larger.
This process is continued till the entire list is searched. The detailed description is shown in fig
4.1
Integer Multiplication:
There are various methods of obtaining the product of two numbers. The repeated addition
method is left as an assignment for the reader. The reader is expected to find the product of
some bigger numbers using the repeated addition method.
Another way of finding the product is the one we generally use, i.e., the left shift method.
4.3.1 left shift method
    981*1234
       3924
      2943*
     1962**
     981***
    1210554
In this method, a=981 is the multiplicand and b=1234 is the multiplier. a is multiplied by
every digit of b starting from right to left. On each multiplication the subsequent products
are shifted one place left. Finally the products obtained by multiplying a by each digit of b are
summed up to obtain the final product.
The above product can also be obtained by a right shift method, which can be illustrated as
follows:
4.3.2 right shift method
    981*1234
    981
    1962
    *2943
    **3924
    1210554
In the above method, a is multiplied by each digit of b from leftmost digit to rightmost digit.
On every multiplication the product is shifted one place to the right and finally all the
products obtained by multiplying 'a' by each digit of 'b' are added to obtain the final result.
The product of two numbers can also be obtained by dividing 'a' and multiplying 'b' by 2
repeatedly until a ≤ 1.
4.3.3 halving and doubling method
Let a=981 and b=1234
The steps to be followed are
1. If a is odd, store b
2. a=a/2 (discarding any remainder) and b=b*2
3. Repeat step 1 and step 2 till a ≤ 1, storing b for the last row (a=1) as well; the product is
   the sum of the stored values
a        b          result
981      1234       1234
490      2468       ------
245      4936       4936
122      9872       ------
61       19744      19744
30       39488      ------
15       78976      78976
7        157952     157952
3        315904     315904
1        631808     631808
Sum=1210554
The above method is called the halving and doubling method.
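The halving and doubling table can be reproduced with a short loop. The sketch below assumes non-negative integers; the function name is illustrative.

```python
def halve_and_double(a, b):
    """Multiply two non-negative integers by halving a and doubling b."""
    total = 0
    while a >= 1:
        if a % 2 == 1:   # a is odd: this row's b is stored (added)
            total += b
        a //= 2          # halve, discarding the remainder
        b *= 2           # double
    return total
```

Each stored value corresponds to a 1 bit in the binary form of a, which is why the stored values sum to the product.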
4.3.4 Speed up algorithm:
In this method we split the number till it is easier to multiply, i.e., we split 0981 into 09 and
81 and 1234 into 12 and 34. 09 is then multiplied by both 12 and 34, and so is 81, but the products are
shifted 'n' places left before adding. The number of shifts 'n' is decided as follows
Multiplication sequence    shifts    product
09*12                      4         108****
09*34                      2         306**
81*12                      2         972**
81*34                      0         2754
Sum=1210554
For 0981*1234, multiplication of 34 and 81 takes zero shifts, 34*09 takes 2 shifts, 12 and
81 takes 2 shifts and so on.
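One level of the splitting scheme can be sketched as below. The parameter digits (the assumed width of the operands, 4 in the worked example) and the function name are hypothetical; the four partial products and their shifts match the rows of the table.

```python
def split_multiply(a, b, digits=4):
    """Multiply a and b by splitting each into a high and a low half."""
    half = digits // 2
    base = 10 ** half                  # shifting by `half` decimal places
    a_hi, a_lo = divmod(a, base)       # 0981 -> 09 and 81
    b_hi, b_lo = divmod(b, base)       # 1234 -> 12 and 34
    return (a_hi * b_hi * base * base  # shifted by 2 * half places
            + a_hi * b_lo * base       # shifted by half places
            + a_lo * b_hi * base       # shifted by half places
            + a_lo * b_lo)             # no shift
```

Applying the same split recursively to the half-size products would continue the "split till it is easier to multiply" idea; the sketch stops after one level for brevity.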
Exercise 4
1. Write the algorithm to find the product of two numbers for all the methods explained.
2. Hand simulate the algorithm for at least 10 different numbers.
3. Implement the same for verification.
4. Write a program to find the maximum and minimum of a list of n elements with and
without using recursion.
Greedy Method:
Greedy method is a method of choosing a subset of the dataset as the solution set that
results in some profit. Consider a problem having n inputs; we are required to obtain a
solution, which is a subset of the inputs that satisfies some constraints or conditions. Any subset
which satisfies these constraints is called a feasible solution. It is required to obtain the
feasible solution that maximizes or minimizes the objective function. The feasible solution
finally obtained is called the optimal solution.
One can often devise an algorithm that works in stages, considering one input at a time; at
each stage, a decision is taken on whether the data chosen leads to an optimal solution
or not. If the inclusion of a particular data item leads to an optimal solution, then the data is
added to the partial solution set. On the other hand, if the inclusion of that data results
in an infeasible solution, then the data is eliminated from the solution set.
The general algorithm for the greedy method is
1. Choose an element e belonging to dataset D.
2. Check whether e can be included into the solution set S; if yes, the solution set becomes
S ← S ∪ {e}.
3. Continue until S is filled up or D is exhausted, whichever is earlier.
5.1 Cassette Filling
Consider n programs that are to be stored on a tape of length L. Each program i is of length
li, where i lies between 1 and n. All programs can be stored on the tape iff the sum of the
lengths of the programs is at most L. It is assumed that, whenever a program is to be
retrieved, the tape is initially positioned at the start end.
Let tj be the time required to retrieve program ij, where the programs are stored in the order
I = i1, i2, i3, …, in.
The time taken to retrieve program ij is the sum of the lengths of the programs up to and including it,
i.e., tj = li1 + li2 + … + lij (the sum of lik for k = 1, 2, …, j). The mean retrieval time (MRT) is the
average of tj over all n programs.
Now the problem is to store the programs on the tape so that the MRT is minimized. From the
above discussion one can observe that the MRT can be minimized if the programs are stored
in increasing order of length, i.e., l1 ≤ l2 ≤ l3 ≤ … ≤ ln.
Hence the ordering defined minimizes the retrieval time. The solution set obtained need not
be a subset of the data but may be the data set itself in a different sequence.

Illustration
Assume that 3 files are given. Let the lengths of files A, B and C be 7, 3 and 5 units
respectively. All these three files are to be stored on a tape S in some sequence that
reduces the average retrieval time. The table shows the retrieval time for all possible
orders.
Order of recording    Retrieval time             MRT
ABC                   7+(7+3)+(7+3+5)=32         32/3
ACB                   7+(7+5)+(7+5+3)=34         34/3
BAC                   3+(3+7)+(3+7+5)=28         28/3
BCA                   3+(3+5)+(3+5+7)=26         26/3
CAB                   5+(5+7)+(5+7+3)=32         32/3
CBA                   5+(5+3)+(5+3+7)=28         28/3
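The table can be regenerated by enumerating every order. The sketch below assumes the file lengths are held in a dictionary keyed by file name (an illustrative choice) and uses exact fractions so that values such as 26/3 are not rounded.

```python
from fractions import Fraction
from itertools import permutations

def mean_retrieval_time(lengths):
    """MRT for programs stored in the given order of lengths."""
    total, elapsed = 0, 0
    for length in lengths:
        elapsed += length      # tape read so far to reach this program
        total += elapsed       # accumulate the retrieval time t_j
    return Fraction(total, len(lengths))

def best_order(programs):
    """Try every order and return the one with the smallest MRT."""
    return min(
        permutations(programs),
        key=lambda order: mean_retrieval_time([programs[name] for name in order]),
    )
```

Sorting the files by increasing length gives the same order without enumeration, which is the greedy rule stated above.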
General Knapsack Problem:
Greedy method is best suited to solve more complex problems such as a knapsack problem.
In a knapsack problem there is a knapsack or a container of capacity M and n items, where each
item i is of weight wi and is associated with a profit pi.
The problem of knapsack is to fill the available items into the knapsack so that the knapsack
gets filled up and yields a maximum profit. If a fraction xi of object i is placed into the
knapsack, then a profit pixi is earned. The constraint is that the weights of all chosen objects
should sum up to at most M.
Illustration
Consider a knapsack problem of finding the optimal solution where M=15, (p1,p2,p3,…,p7) =
(10, 5, 15, 7, 6, 18, 3) and (w1, w2, …, w7) = (2, 3, 5, 7, 1, 4, 1).
In order to find the solution, one can follow three different strategies.
Strategy 1: non-increasing profit values
Let (a,b,c,d,e,f,g) represent the items with profit (10,5,15,7,6,18,3); then the sequence of
objects with non-increasing profit is (f,c,a,d,e,b,g).
Item chosen      Quantity of item    Remaining space    PiXi
for inclusion    included            in M
f                1 full unit         15-4=11            18*1=18
c                1 full unit         11-5=6             15*1=15
a                1 full unit         6-2=4              10*1=10
d                4/7 unit            4-4=0              4/7*7=04
Profit = 47 units
The solution set is (1,0,1,4/7,0,1,0).

Strategy 2: non-decreasing weights
The sequence of objects with non-decreasing weights is (e,g,a,b,f,c,d).
Item chosen      Quantity of item    Remaining space    PiXi
for inclusion    included            in M
e                1 full unit         15-1=14            6*1=6
g                1 full unit         14-1=13            3*1=3
a                1 full unit         13-2=11            10*1=10
b                1 full unit         11-3=8             5*1=05
f                1 full unit         8-4=4              18*1=18
c                4/5 unit            4-4=0              4/5*15=12
Profit = 54 units
The solution set is (1,1,4/5,0,1,1,1).
Strategy 3: maximum profit per unit of capacity used
(This means that the objects are considered in decreasing order of the ratio Pi/Wi)
a) P1/w1 = 10/2 = 5      b) P2/w2 = 5/3 = 1.66    c) P3/w3 = 15/5 = 3
d) P4/w4 = 7/7 = 1       e) P5/w5 = 6/1 = 6       f) P6/w6 = 18/4 = 4.5
g) P7/w7 = 3/1 = 3
Hence, the sequence is (e,a,f,c,g,b,d)
Item chosen      Quantity of item    Remaining space    PiXi
for inclusion    included            in M
e                1 full unit         15-1=14            6*1=6
a                1 full unit         14-2=12            10*1=10
f                1 full unit         12-4=8             18*1=18
c                1 full unit         8-5=3              15*1=15
g                1 full unit         3-1=2              3*1=3
b                2/3 unit            2-2=0              2/3*5=3.33
Profit = 55.33 units
The solution set is (1,2/3,1,0,1,1,1).
In the above problem it can be observed that, if the sum of all the weights is ≤ M, then all
xi = 1 is an optimal solution. If we assume that the sum of all weights exceeds M, all xi's
cannot be one. Sometimes it becomes necessary to take a fraction of some items to
completely fill the knapsack. This type of knapsack problem is a general knapsack problem.
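The third strategy (decreasing profit per unit of weight) can be sketched as follows. The function name and the use of exact fractions are illustrative assumptions; items are indexed in the order a, b, c, d, e, f, g.

```python
from fractions import Fraction

def greedy_knapsack(profits, weights, capacity):
    """Fill the knapsack in decreasing order of profit/weight ratio.

    Returns (x, profit) where x[i] is the fraction of item i taken.
    """
    n = len(profits)
    order = sorted(range(n),
                   key=lambda i: Fraction(profits[i], weights[i]),
                   reverse=True)
    x = [Fraction(0)] * n
    remaining = Fraction(capacity)
    for i in order:
        if remaining == 0:
            break
        # Take the whole item if it fits, otherwise the fraction that fits.
        take = min(Fraction(1), remaining / weights[i])
        x[i] = take
        remaining -= take * weights[i]
    profit = sum(x[i] * profits[i] for i in range(n))
    return x, profit
```

For M=15 with the profits and weights of the illustration, the result is the solution set (1, 2/3, 1, 0, 1, 1, 1) with a profit of 166/3, that is about 55.33 units.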
Concept of Backtracking:
Problems which deal with searching a set of solutions, or which ask for an optimal solution
satisfying some constraints, can be solved using the backtracking formulation. The
backtracking algorithm yields the proper solution in fewer trials.
The basic idea of backtracking is to build up a vector one component at a time and to test
whether the vector being formed has any chance of success. The major advantage of this
algorithm is that if it is realized that the partial vector generated does not lead to an optimal
solution, then that vector may be ignored.
The backtracking algorithm determines the solution by systematically searching the solution
space for the given problem. This search is accomplished by using a tree organization.
Backtracking is a depth first search with some bounding function. All solutions using
backtracking are required to satisfy a complex set of constraints. The constraints may be
explicit or implicit.
Explicit constraints are rules which restrict each vector element to be chosen from the
given set. Implicit constraints are rules which determine which of the tuples in the solution
space actually satisfy the criterion function.
6.1.1 Cassette filling problem:
There are n programs that are to be stored on a tape of length L. Every program 'i' is of
length li. All programs can be stored on the tape if and only if the sum of the lengths of the
programs is at most L. In this problem, it is assumed that whenever a program is to be
retrieved, the tape is positioned at the start end. Hence, the time tj needed to retrieve
program ij from a tape having the programs in the order i1, i2, …, in is given by
tj = li1 + li2 + … + lij (the sum of lik for k = 1, 2, …, j)
and the mean retrieval time (MRT) is the average of tj over all n programs.
In the optimal storage on tape problem, we are required to find a permutation for the n
programs so that when they are stored on the tape, the MRT is minimized.
Let n=3 and (l1,l2,l3)=(5,10,3); there are n!=6 possible orderings. These orderings and their
respective MRTs are given in fig 6.1. Hence, the best order of recording is 3,1,2.
6.1.2 Subset problem:
There are n positive numbers given in a set. The desire is to find all possible subsets of this
set, the contents of which add up to a predefined value M.
Let there be n elements in the main set. W=w[1..n] represents the elements of the set, i.e.,
w = (w1,w2,w3,…,wn). The vector x = x[1..n] assumes either 0 or 1 value. If element w(i) is
included in the subset then x(i) = 1.
Consider n=6, m=30 and w[1..6]={5,10,12,13,15,18}. The partial backtracking tree is
shown in fig 6.2. The label to the left of a node represents the item number chosen for
insertion and the label to the right represents the space occupied in M. S represents a
solution to the given problem and B represents a bounding criterion if no solution can be
reached. For the above problem the solutions could be (1,1,0,0,1,0), (1,0,1,1,0,0) and
(0,0,1,0,0,1). Completion of the tree structure is left as an assignment for the reader.
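The backtracking search over the vector x can be sketched as below. The bounding tests used here (stop when the running total would exceed the target, or when the remaining elements cannot reach it) are one reasonable choice and are an assumption of this sketch; the notes do not spell out the bounding function.

```python
def subset_sums(w, target):
    """Return every 0/1 vector x with sum(w[i] * x[i]) == target.

    Assumes all numbers in w are positive.
    """
    n = len(w)
    solutions = []
    x = [0] * n

    def extend(i, used, rest):
        # used = total of chosen elements, rest = total of w[i:]
        if used == target:
            solutions.append(tuple(x))
            return
        if i == n or used + rest < target:
            return                     # bound: target can no longer be reached
        if used + w[i] <= target:      # bound: do not overshoot the target
            x[i] = 1
            extend(i + 1, used + w[i], rest - w[i])
            x[i] = 0
        extend(i + 1, used, rest - w[i])

    extend(0, 0, sum(w))
    return solutions
```

For w = {5,10,12,13,15,18} and a target of 30 the search finds exactly the three solution vectors listed above.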
6.3.3 8 queen problem:
The 8 queen problem can be stated as follows. Consider a chessboard of order 8×8. The
problem is to place 8 queens on this board such that no two queens can attack
each other.
Illustration.
Consider the problem of 4 queens; the backtracking solution for this is as shown in fig 6.3.
The figure shows a partial backtracking tree. Completion of the tree is left as an assignment
for the reader.
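A compact backtracking sketch for the general n queen problem follows. It places one queen per row and assumes the usual representation in which cols[r] is the column of the queen in row r; the function name is illustrative.

```python
def place_queens(n):
    """Return the first placement found as a list of columns, or None."""
    cols = []   # cols[r] = column of the queen already placed in row r

    def safe(row, col):
        # No earlier queen may share the column or a diagonal.
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def extend(row):
        if row == n:
            return True
        for col in range(n):
            if safe(row, col):
                cols.append(col)
                if extend(row + 1):
                    return True
                cols.pop()      # backtrack: undo the placement
        return False

    return cols if extend(0) else None
```

For n=4 the search first tries column 0 in row 0, fails, backtracks, and then succeeds with the queens in columns 1, 3, 0, 2 of successive rows.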
Concept of Branch and Bound:
The term branch and bound refers to all state space search methods in which all possible
branches are derived before any other node can become the E-node. In other words, the
exploration of a new node cannot begin until the current node is completely explored.
6.2.1 Tape filling:
The branch and bound tree for the records of length (5,10,3) is as shown in fig 6.4
Single Source Shortest Path:
Graphs can be used to represent the highway structure of a state or country with vertices
representing cities and edges representing sections of highway. The edges can then be
assigned weights which may be either the distance between the two cities connected by the
edge or the average time to drive along that section of highway. A motorist wishing to drive
from city A to B would be interested in answers to the following questions:
1. Is there a path from A to B?
2. If there is more than one path from A to B, which is the shortest path?
The problems defined by these questions are
special cases of the path problem we study in this section. The length of a path
is now defined to be the sum of the weights of the edges on that path. The
starting vertex of the path is referred to as the source and the last vertex the
destination. The graphs are digraphs representing streets. Consider a digraph
G=(V,E), with the distance to be traveled as weights on the edges. The
problem is to determine the shortest path from v0 to all the remaining
vertices of G. It is assumed that all the weights associated with the edges are
positive. The shortest path between v0 and some other node v is an ordering
among a subset of the edges. Hence this problem fits the ordering paradigm.


Example:
Consider the digraph of fig 7-1. Let the numbers on the edges be the costs of
travelling along that route. If a person is interested in travelling from v1 to v2, then
he encounters many paths. Some of them are
1. v1 → v2 = 50 units
2. v1 → v3 → v4 → v2 = 10+15+20 = 45 units
3. v1 → v5 → v4 → v2 = 45+30+20 = 95 units
4. v1 → v3 → v4 → v5 → v4 → v2 = 10+15+35+30+20 = 110 units
The cheapest path among these is the path along v1 → v3 → v4 → v2. The cost
of the path is 10+15+20 = 45 units. Even though there are three edges on
this path, it is cheaper than travelling along the path connecting v1 and v2
directly, i.e., the path v1 → v2 that costs 50 units. One can also notice that it
is not possible to travel to v6 from any other node.
To formulate a greedy based algorithm to generate the cheapest paths, we
must conceive of a multistage solution to the problem and also of an
optimization measure. One possibility is to build the shortest paths one by
one. As an optimization measure we can use the sum of the lengths of all
paths so far generated. For this measure to be minimized, each individual
path must be of minimum length. If we have already constructed i shortest
paths, then using this optimization measure, the next path to be constructed
should be the next shortest minimum length path. The greedy way is to
generate these paths in non-decreasing order of path length. First, a shortest
path to the nearest vertex is generated. Then a shortest path to the second
nearest vertex is generated, and so on.
A much simpler method would be to solve it using matrix representation. The
steps that should be followed are as follows:
Step 1: find the adjacency matrix for the given graph. The adjacency matrix
for fig 7.1 is given below
        V1     V2     V3     V4     V5     V6
V1      -      50     10     Inf    45     Inf
V2      Inf    -      15     Inf    10     Inf
V3      20     Inf    -      15     Inf    Inf
V4      Inf    20     Inf    -      35     Inf
V5      Inf    Inf    Inf    30     -      Inf
V6      Inf    Inf    Inf    3      Inf    -
Step 2: consider v1 to be the source and choose the minimum entry in the
row v1. In the above table the minimum in row v1 is 10.
Step 3: find out the column in which the minimum is present; for the above
example it is column v3. Hence, this is the node that has to be visited next.
Step 4: compute a matrix by eliminating the v1 and v3 columns. Initially retain only row v1.
The second row is computed by adding 10 to all values of row v3.
The resulting matrix is
                    V2        V4        V5        V6
V1 → Vw             50        Inf       45        Inf
V1 → V3 → Vw        10+Inf    10+15     10+Inf    10+Inf
Minimum             50        25        45        Inf
Step 5: find the minimum in each column. Now select the minimum from the resulting row.
In the above example the minimum is 25. Repeat step 3 followed by step 4 till all vertices
are covered or a single column is left.
The solution for fig 7.1 can be continued as follows
                         V2        V5        V6
V1 → Vw                  50        45        Inf
V1 → V3 → V4 → Vw        25+20     25+35     25+Inf
Minimum                  45        45        Inf

                              V5        V6
V1 → Vw                       45        Inf
V1 → V3 → V4 → V2 → Vw        45+10     45+Inf
Minimum                       45        Inf

                                   V6
V1 → Vw                            Inf
V1 → V3 → V4 → V2 → V5 → Vw        45+Inf
Minimum                            Inf
Finally, the vertices are reached from V1 in the order V1, V3, V4, V2, V5, at costs of 10 (V3),
25 (V4), 45 (V2) and 45 (V5); V6 cannot be reached.
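The same result can be obtained with the standard greedy single-source procedure usually credited to Dijkstra, which is swapped in here in place of the tabular steps. The sketch assumes a cost matrix with 0 on the diagonal and infinity for missing edges, and vertices numbered from 0 (so V1 is index 0); the names are illustrative.

```python
INF = float("inf")

def shortest_paths(cost, source):
    """Return (dist, order): cheapest costs from source and visiting order."""
    n = len(cost)
    dist = [INF] * n
    dist[source] = 0
    done = [False] * n
    order = []
    for _ in range(n):
        # Greedy choice: the nearest vertex not yet finalised.
        u = min((v for v in range(n) if not done[v]), key=lambda v: dist[v])
        if dist[u] == INF:
            break                      # the remaining vertices are unreachable
        done[u] = True
        order.append(u)
        for v in range(n):
            if not done[v] and dist[u] + cost[u][v] < dist[v]:
                dist[v] = dist[u] + cost[u][v]
    return dist, order
```

On the adjacency matrix of the example, the vertices are finalised in the order V1, V3, V4, V2, V5, matching the tables above, and V6 keeps an infinite cost.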
Minimum Cost Spanning Tree:
Let G=(V,E) be an undirected connected graph. A sub-graph t = (V,E') of G is a spanning
tree of G if and only if t is a tree.


The above figure shows the complete graph on four nodes together with three of its spanning
trees.
Spanning trees have many applications. For example, they can be used to obtain an
independent set of circuit equations for an electric network. First, a spanning tree for the
electric network is obtained. Let B be the set of network edges not in the spanning tree.
Adding an edge from B to the spanning tree creates a cycle. Kirchhoff's second law is used on
each cycle to obtain a circuit equation.
Another application of spanning trees arises from the property that a spanning tree is a
minimal sub-graph G' of G such that V(G') = V(G) and G' is connected. Any connected graph
with n vertices must have at least n-1 edges, and all connected graphs with n-1 edges are
trees. If the nodes of G represent cities and the edges represent possible communication
links connecting two cities, then the minimum number of links needed to connect the n
cities is n-1. The spanning trees of G represent all feasible choices.
In practical situations, the edges have weights assigned to them. These weights may
represent the cost of construction, the length of the link, and so on. Given such a weighted
graph, one would then wish to select links connecting all the cities with minimum total cost
or minimum total length. In either case the links selected have to form a tree. If this is not so, then the
selection of links contains a cycle. Removal of any one of the links on this cycle results in a
link selection of lower cost connecting all cities. We are therefore interested in finding a
spanning tree of G with minimum cost. Since the identification of a minimum-cost spanning
tree involves the selection of a subset of the edges, this problem fits the subset paradigm.
7.2.1 Prim's Algorithm
A greedy method to obtain a minimum-cost spanning tree builds this tree edge by edge.
The next edge to include is chosen according to some optimization criterion. The simplest
such criterion is to choose an edge that results in a minimum increase in the sum of the
costs of the edges so far included. There are two possible ways to interpret this criterion. In
the first, the set of edges so far selected forms a tree. Thus, if A is the set of edges selected
so far, then A forms a tree. The next edge (u,v) to be included in A is a minimum-cost edge
not in A with the property that A ∪ {(u,v)} is also a tree. The corresponding algorithm is
known as Prim's algorithm.
For Prim's algorithm, draw n isolated vertices and label them v1, v2, v3, ..., vn. Tabulate the
given weights of the edges of G in an n by n table. Set the weights of non-existent edges to
a very large value. Start from vertex v1 and connect it to its nearest neighbor (i.e., to the
vertex which has the smallest entry in row 1 of the table), say vk. Now consider v1 and vk as
one subgraph and connect this subgraph to its closest neighbor. Let this new vertex be vi.
Next regard the tree with v1, vk and vi as one subgraph and continue the process until all n
vertices have been connected by n-1 edges.
Consider the graph shown in fig 7.3. There are 6 vertices and 12 edges. The weights are
tabulated in the table given below.

        V1      V2      V3      V4      V5      V6
V1      -       10      16      11      10      17
V2      10      -       9.5     Inf     Inf     19.5
V3      16      9.5     -       7       Inf     12
V4      11      Inf     7       -       8       7
V5      10      Inf     Inf     8       -       9
V6      17      19.5    12      7       9       -

Start with v1 and pick the smallest entry in row 1, which is either (v1,v2) or (v1,v5). Let us
pick (v1,v5). The closest neighbor of the subgraph (v1,v5) is v4, as it is the smallest entry in
rows v1 and v5. The three remaining edges selected following the above procedure turn out
to be (v4,v6), (v4,v3) and (v3,v2), in that sequence. The resulting shortest spanning tree is
shown in fig 7.4. The weight of this tree is 41.5.
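
The procedure above can be sketched in C++ roughly as follows. This is a minimal sketch, not a reference implementation: the names `prim`, `MstResult` and `exampleWeights` are made up for illustration, vertices are numbered from 0 (so index 0 is v1), the "-" diagonal entries are stored as infinity, and ties are broken toward the later vertex only so that the run reproduces the choice of (v1,v5) made in the text.

```cpp
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

const double INF = std::numeric_limits<double>::infinity();

struct MstResult {
    std::vector<std::pair<int, int>> edges;  // edges in the order they were added
    double weight;                           // total weight of the tree
};

// Prim's algorithm on an n-by-n weight table; w[i][j] == INF means "no edge".
// Grows one tree from `start`, always adding the cheapest edge that joins a
// tree vertex to a vertex outside the tree.
MstResult prim(const std::vector<std::vector<double>>& w, int start = 0) {
    const int n = static_cast<int>(w.size());
    std::vector<bool> inTree(n, false);
    std::vector<double> best(n, INF);  // cheapest edge from the tree to each vertex
    std::vector<int> parent(n, -1);    // tree endpoint of that cheapest edge
    MstResult result{{}, 0.0};

    inTree[start] = true;
    for (int v = 0; v < n; ++v) {
        if (v != start) { best[v] = w[start][v]; parent[v] = start; }
    }
    for (int added = 1; added < n; ++added) {
        int next = -1;
        for (int v = 0; v < n; ++v) {
            // "<=" prefers the later vertex on ties (assumption, see lead-in).
            if (!inTree[v] && (next == -1 || best[v] <= best[next])) next = v;
        }
        if (next == -1 || best[next] == INF) break;  // graph is not connected
        inTree[next] = true;
        result.edges.push_back({parent[next], next});
        result.weight += best[next];
        for (int v = 0; v < n; ++v) {
            if (!inTree[v] && w[next][v] < best[v]) { best[v] = w[next][v]; parent[v] = next; }
        }
    }
    return result;
}

// Weight table of the six-vertex example (index 0 is v1, ..., index 5 is v6).
std::vector<std::vector<double>> exampleWeights() {
    return {
        {INF, 10,   16,  11,  10,  17},
        {10,  INF,  9.5, INF, INF, 19.5},
        {16,  9.5,  INF, 7,   INF, 12},
        {11,  INF,  7,   INF, 8,   7},
        {10,  INF,  INF, 8,   INF, 9},
        {17,  19.5, 12,  7,   9,   INF},
    };
}
```

Calling `prim(exampleWeights())` adds the edges in the same sequence as the worked example and gives a tree of weight 41.5. With a different tie-breaking rule the first edge could be (v1,v2) instead, which gives another spanning tree of the same weight.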

7.2.3 Kruskal's Algorithm:
There is a second possible interpretation of the optimization criterion mentioned earlier, in
which the edges of the graph are considered in non-decreasing order of cost. This
interpretation is that the set t of edges so far selected for the spanning tree be such that it
is possible to complete t into a tree. Thus t may not be a tree at all stages in the algorithm.
In fact, it will generally only be a forest, since the set of edges t can be completed into a tree
if and only if there are no cycles in t. This method is due to Kruskal.
Kruskal's algorithm can be illustrated as follows: list all edges of graph G in order of
non-decreasing weight. Next select a smallest edge that makes no circuit with previously
selected edges. Continue this process until (n-1) edges have been selected; these edges
will constitute the desired shortest spanning tree.
For fig 7.3 the Kruskal solution is as follows:
v1 to v2 = 10
v1 to v3 = 16
v1 to v4 = 11
v1 to v5 = 10
v1 to v6 = 17
v2 to v3 = 9.5
v2 to v6 = 19.5
v3 to v4 = 7
v3 to v6 = 12
v4 to v5 = 8
v4 to v6 = 7
v5 to v6 = 9
The above edges in ascending order of weight are
v3 to v4 = 7
v4 to v6 = 7
v4 to v5 = 8
v5 to v6 = 9
v2 to v3 = 9.5
v1 to v5 = 10
v1 to v2 = 10
v1 to v4 = 11
v3 to v6 = 12
v1 to v3 = 16
v1 to v6 = 17
v2 to v6 = 19.5
Select the minimum, i.e., v3 to v4, and connect them. Now select v4 to v6 and then v4 to v5.
If we now select v5 to v6 it forms a circuit, so drop it and go on to the next edge. Connect v2
and v3, and finally connect v1 and v5. Thus, we have a minimum spanning tree of weight
7 + 7 + 8 + 9.5 + 10 = 41.5, which is the same as the tree in figure 7.4.
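
The selection rule above ("take the next smallest edge unless it closes a circuit") can be sketched in C++ as below. This is only an illustration under stated assumptions: the circuit test uses a union-find structure, which the text does not mention, and the names `Edge`, `findRoot`, `kruskal`, `totalWeight` and `exampleEdges` are hypothetical. Vertices are numbered 1 to 6 as in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct Edge { int u, v; double w; };  // an undirected edge u-v with weight w

// Union-find: representative of x's component (with path halving).
int findRoot(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Kruskal: scan edges in non-decreasing weight and keep an edge only if its
// endpoints lie in different components, so that it closes no circuit.
std::vector<Edge> kruskal(int n, std::vector<Edge> edges) {
    std::stable_sort(edges.begin(), edges.end(),
                     [](const Edge& a, const Edge& b) { return a.w < b.w; });
    std::vector<int> parent(n + 1);
    std::iota(parent.begin(), parent.end(), 0);  // every vertex is its own component
    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        int ru = findRoot(parent, e.u);
        int rv = findRoot(parent, e.v);
        if (ru == rv) continue;                  // would form a circuit: drop it
        parent[ru] = rv;                         // merge the two components
        tree.push_back(e);
        if (static_cast<int>(tree.size()) == n - 1) break;
    }
    return tree;
}

double totalWeight(const std::vector<Edge>& tree) {
    double sum = 0.0;
    for (const Edge& e : tree) sum += e.w;
    return sum;
}

// The twelve edges of the example graph, in the order listed in the text.
std::vector<Edge> exampleEdges() {
    return {{1, 2, 10}, {1, 3, 16}, {1, 4, 11}, {1, 5, 10}, {1, 6, 17}, {2, 3, 9.5},
            {2, 6, 19.5}, {3, 4, 7}, {3, 6, 12}, {4, 5, 8}, {4, 6, 7}, {5, 6, 9}};
}
```

On the example, `kruskal(6, exampleEdges())` keeps five edges with total weight 41.5 and drops v5 to v6 as a circuit. Because v1 to v2 and v1 to v5 both weigh 10, the last edge chosen depends on how ties are ordered; this sketch happens to take v1 to v2, while the text takes v1 to v5. Both trees have the same weight.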
Technique for Graphs:
A fundamental problem concerning graphs is the reachability problem. In its simplest form it
requires us to determine whether there exists a path in the given graph G=(V,E) such that
this path starts at vertex v and ends at vertex u. A more general form is to determine, for a
given starting vertex v belonging to V, all vertices u such that there is a path from v to u.
This latter problem can be solved by starting at vertex v and systematically searching the
graph G for vertices that can be reached from v. The two search methods for this are:
1. Breadth first search.
2. Depth first search.
7.3.1 Breadth first search:
In breadth first search we start at vertex v and mark it as having been reached. The vertex
v at this time is said to be unexplored. A vertex is said to have been explored by an
algorithm when the algorithm has visited all vertices adjacent from it. All unvisited vertices
adjacent from v are visited next. These are new unexplored vertices. Vertex v has now been
explored. The newly visited vertices have not been explored and are put onto the end of the
list of unexplored vertices. The first vertex on this list is the next to be explored. Exploration
continues until no unexplored vertex is left. The list of unexplored vertices acts as a queue
and can be represented using any of the standard queue representations.
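
A minimal C++ sketch of this search is given below. It assumes the graph is stored as an adjacency list (`adj[v]` holds the vertices adjacent from v) with vertices numbered from 0. The function name `bfs` and this representation are illustrative choices, not something fixed by the text.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Breadth first search from `start`.
// Returns the vertices in the order in which they were visited (reached).
std::vector<int> bfs(const std::vector<std::vector<int>>& adj, int start) {
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::queue<int> unexplored;      // visited but not yet explored

    visited[start] = true;
    order.push_back(start);
    unexplored.push(start);
    while (!unexplored.empty()) {
        int v = unexplored.front();  // the first vertex on the list is explored next
        unexplored.pop();
        for (int w : adj[v]) {       // visit all unvisited vertices adjacent from v
            if (!visited[w]) {
                visited[w] = true;
                order.push_back(w);
                unexplored.push(w);  // new vertices go onto the end of the list
            }
        }
    }
    return order;
}
```

The returned vector answers the reachability question directly: a vertex u is reachable from `start` exactly when it appears in the result.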
7.3.2 Depth first search:
A depth first search of a graph differs from a breadth first search in that the exploration of a
vertex v is suspended as soon as a new vertex is reached. At this time the exploration of
the new vertex u begins. When this new vertex has been explored, the exploration of v
continues. The search terminates when all reached vertices have been fully explored. This
search process is best described recursively.
Algorithm DFS(v)
{
    visited[v] = 1;
    for each vertex w adjacent from v do
    {
        if (visited[w] = 0) then
            DFS(w);
    }
}
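
The pseudocode above translates almost line for line into C++. The sketch below assumes an adjacency-list graph with vertices numbered from 0. The `order` vector and the wrapper `reachableFrom` are additions for illustration and are not part of the algorithm as stated.

```cpp
#include <cassert>
#include <vector>

// Recursive depth first search. `visited` plays the role of visited[] in the
// pseudocode; `order` records the vertices in the order they are reached.
void dfs(const std::vector<std::vector<int>>& adj, int v,
         std::vector<bool>& visited, std::vector<int>& order) {
    visited[v] = true;
    order.push_back(v);
    for (int w : adj[v]) {                // each vertex w adjacent from v
        if (!visited[w]) {
            dfs(adj, w, visited, order);  // suspend v and explore w first
        }
    }
}

// Convenience wrapper: all vertices reachable from `start`, in DFS order.
std::vector<int> reachableFrom(const std::vector<std::vector<int>>& adj, int start) {
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    dfs(adj, start, visited, order);
    return order;
}
```

On very deep graphs the recursion can exhaust the call stack; an explicit stack is the usual alternative in that case.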
Operating System
Overview and History
Processes and Threads
Thread Creation, Manipulation and
Synchronization
Deadlock
Implementing Synchronization Operations
CPU Scheduling
OS Potpourri
Introduction to Memory Management
Introduction to Paging
Overview and History:
What is an operating system?
Hard to define precisely, because operating systems arose historically as people needed to solve problems
associated with using computers.
• Much of operating system history is driven by the relative cost factors of hardware and people. Hardware
started out fantastically expensive relative to people, and the relative cost has been decreasing ever since.
Relative costs drive the goals of the operating system.
    - In the beginning: Expensive Hardware, Cheap People. Goal: maximize hardware utilization.
    - Now: Cheap Hardware, Expensive People. Goal: make it easy for people to use computers.
• In the early days of computer use, computers were huge machines that were expensive to buy, run and
maintain. Computer used in single user, interactive mode. Programmers interact with the machine at a very
low level - flick console switches, dump cards into card reader, etc. The interface is basically the raw
hardware.
    - Problem: Code to manipulate external I/O devices is very complex, and is a major source of
      programming difficulty.
    - Solution: Build a subroutine library (device drivers) to manage the interaction with the I/O devices.
      The library is loaded into the top of memory and stays there. This is the first example of something
      that would grow into an operating system.
• Because the machine is so expensive, it is important to keep it busy.
    - Problem: computer idles while programmer sets things up. Poor utilization of huge investment.
    - Solution: Hire a specialized person to do setup. Faster than programmer, but still a lot slower than
      the machine.
    - Solution: Build a batch monitor. Store jobs on a disk (spooling), have computer read them in one at
      a time and execute them. Big change in computer usage: debugging now done offline from print
      outs and memory dumps. No more instant feedback.
    - Problem: At any given time, job is actively using either the CPU or an I/O device, and the rest of the
      machine is idle and therefore unutilized.
    - Solution: Allow the job to overlap computation and I/O. Buffering and interrupt handling added to
      subroutine library.
    - Problem: one job can't keep both CPU and I/O devices busy. (Have compute-bound jobs that tend
      to use only the CPU and I/O-bound jobs that tend to use only the I/O devices.) Get poor utilization
      either of CPU or I/O devices.
    - Solution: multiprogramming - several jobs share system. Dynamically switch from one job to
      another when the running job does I/O. Big issue: protection. Don't want one job to affect the
      results of another. Memory protection and relocation added to hardware, OS must manage new
      hardware functionality. OS starts to become a significant software system. OS also starts to take up
      significant resources on its own.
• Phase shift: Computers become much cheaper. People costs become significant.
    - Issue: It becomes important to make computers easier to use and to improve the productivity of
      the people. One big productivity sink: having to wait for batch output (but is this really true?). So, it
      is important to run interactively. But computers are still so expensive that you can't buy one for
      every person. Solution: interactive timesharing.
    - Problem: Old batch schedulers were designed to run a job for as long as it was utilizing the CPU
      effectively (in practice, until it tried to do some I/O). But now, people need reasonable response
      time from the computer.
    - Solution: Preemptive scheduling.
    - Problem: People need to have their data and programs around while they use the computer.
    - Solution: Add file systems for quick access to data. Computer becomes a repository for data, and
      people don't have to use card decks or tapes to store their data.
    - Problem: The boss logs in and gets terrible response time because the machine is overloaded.
    - Solution: Prioritized scheduling. The boss gets more of the machine than the peons. But, CPU
      scheduling is just an example of resource allocation problems. The timeshared machine was full of
      limited resources (CPU time, disk space, physical memory space, etc.) and it became the
      responsibility of the OS to mediate the allocation of the resources. So, developed things like disk
      and physical memory quotas, etc.
• Overall, time sharing was a success. However, it was a limited success. In practical terms, every
timeshared computer became overloaded and the response time dropped to annoying or unacceptable
levels. Hard-core hackers compensated by working at night, and we developed a generation of pasty-
looking, unhealthy insomniacs addicted to caffeine.
• Computers become even cheaper. It becomes practical to give one computer to each user. Initial cost is
very important in market. Minimal hardware (no networking or hard disk, very slow microprocessors and
almost no memory) shipped with minimal OS (MS-DOS). Protection, security less of an issue. OS resource
consumption becomes a big issue (computer only has 640K of memory). OS back to a shared subroutine
library.
• Hardware becomes cheaper and users more sophisticated. People need to share data and information
with other people. Computers become more information transfer, manipulation and storage devices rather
than machines that perform arithmetic operations. Networking becomes very important, and as sharing
becomes an important part of the experience so does security. Operating systems become more
sophisticated. Start putting back features present in the old time sharing systems (OS/2, Windows NT, even
Unix).
• Rise of network. Internet is a hugely popular phenomenon and drives new ways of thinking about
computing. Operating system is no longer interface to the lower level machine - people structure systems to
contain layers of middleware. So, a Java API or something similar may be the primary thing people need,
not a set of system calls. In fact, what the operating system is may become irrelevant as long as it supports
the right set of middleware.
• Network computer. Concept of a box that gets all of its resources over the network. No local file system,
just network interfaces to acquire all outside data. So have a slimmer version of OS.
• In the future, computers will become physically small and portable. Operating systems will have to deal
with issues like disconnected operation and mobility. People will also start using information with a pseudo-
real time component like voice and video. Operating systems will have to adjust to deliver acceptable
performance for these new forms of data.
• What does a modern operating system do?
    - Provides Abstractions. Hardware has low-level physical resources with complicated, idiosyncratic
      interfaces. OS provides abstractions that present clean interfaces. Goal: make computer easier to
      use. Examples: Processes, Unbounded Memory, Files, Synchronization and Communication
      Mechanisms.
    - Provides Standard Interface. Goal: portability. Unix runs on many very different computer
      systems. To a first approximation can port programs across systems with little effort.
    - Mediates Resource Usage. Goal: allow multiple users to share resources fairly, efficiently, safely
      and securely. Examples:
        o Multiple processes share one processor (preemptable resource).
        o Multiple programs share one physical memory (preemptable resource).
        o Multiple users and files share one disk (non-preemptable resource).
        o Multiple programs share a given amount of disk and network bandwidth (preemptable
          resource).
    - Consumes Resources. Solaris takes up about 8 Mbytes of physical memory (or about $400).
• Abstractions often work well - for example, timesharing, virtual memory and hierarchical and networked
file systems. But, may break down if stressed. Timesharing gives poor performance if too many users run
compute-intensive jobs. Virtual memory breaks down if working set is too large (thrashing), or if there are
too many large processes (machine runs out of swap space). Abstractions often fail for performance
reasons.
• Abstractions also fail because they prevent programmer from controlling machine at desired level.
Example: database systems often want to control movement of information between disk and physical
memory, and the paging system can get in the way. More recently, existing OS schedulers fail to adequately
support multimedia and parallel processing needs, causing poor performance.
• Concurrency and asynchrony make operating systems very complicated pieces of software. Operating
systems are fundamentally non-deterministic and event driven. Can be difficult to construct (hundreds of
person-years of effort) and impossible to completely debug. Examples of concurrency and asynchrony:
    - I/O devices run concurrently with CPU, interrupting CPU when done.
    - On a multiprocessor, multiple user processes execute in parallel.
    - Multiple workstations execute concurrently and communicate by sending messages over a network.
      Protocol processing takes place asynchronously.
Operating systems are so large that no one person understands the whole system. An operating system
outlives any of its original builders.
• The major problem facing computer science today is how to build large, reliable software
systems. Operating systems are one of very few examples of existing large software systems, and
by studying operating systems we may learn lessons applicable to the construction of larger
systems.
Overview of Process:
• A process is an execution stream in the context of a particular process state.
    - An execution stream is a sequence of instructions.
    - Process state determines the effect of the instructions. It usually includes (but is not restricted to):
        o Registers
        o Stack
        o Memory (global variables and dynamically allocated memory)
        o Open file tables
        o Signal management information
Key concept: processes are separated: no process can directly affect the state of another process.
• Process is a key OS abstraction that users see - the environment you interact with when you use a
computer is built up out of processes.
    - The shell you type stuff into is a process.
    - When you execute a program you have just compiled, the OS generates a process to run the
      program.
    - Your WWW browser is a process.
• Organizing system activities around processes has proved to be a useful way of separating out different
activities into coherent units.
• Two concepts: uniprogramming and multiprogramming.
    - Uniprogramming: only one process at a time. Typical example: DOS. Problem: users often wish to
      perform more than one activity at a time (load a remote file while editing a program, for example),
      and uniprogramming does not allow this. So DOS and other uniprogrammed systems put in things
      like memory-resident programs that are invoked asynchronously, but still have separation problems.
      One key problem with DOS is that there is no memory protection - one program may write the
      memory of another program, causing weird bugs.
    - Multiprogramming: multiple processes at a time. Typical of Unix plus all currently envisioned new
      operating systems. Allows system to separate out activities cleanly.
• Multiprogramming introduces the resource sharing problem - which processes get to use the physical
resources of the machine when? One crucial resource: CPU. Standard solution is to use preemptive
multitasking - OS runs one process for a while, then takes the CPU away from that process and lets another
process run. Must save and restore process state. Key issue: fairness. Must ensure that all processes get
their fair share of the CPU.
• How does the OS implement the process abstraction? Uses a context switch to switch from running one
process to running another process.
• How does machine implement context switch? A processor has a limited amount of physical resources.
For example, it has only one register set. But every process on the machine has its own set of registers.
Solution: save and restore hardware state on a context switch. Save the state in Process Control Block
(PCB). What is in PCB? Depends on the hardware.
    - Registers - almost all machines save registers in PCB.
    - Processor Status Word.
    - What about memory? Most machines allow memory from multiple processes to coexist in the
      physical memory of the machine. Some may require Memory Management Unit (MMU) changes on a
      context switch. But, some early personal computers switched all of process's memory out to disk
      (!!!).
• Operating Systems are fundamentally event-driven systems - they wait for an event to happen, respond
appropriately to the event, then wait for the next event. Examples:
    - User hits a key. The keystroke is echoed on the screen.
    - A user program issues a system call to read a file. The operating system figures out which disk
      blocks to bring in, and generates a request to the disk controller to read the disk blocks into
      memory.
    - The disk controller finishes reading in the disk block and generates an interrupt. The OS moves the
      read data into the user program and restarts the user program.
    - A Mosaic or Netscape user asks for a URL to be retrieved. This eventually generates requests to the
      OS to send request packets out over the network to a remote WWW server. The OS sends the
      packets.
    - The response packets come back from the WWW server, interrupting the processor. The OS figures
      out which process should get the packets, then routes the packets to that process.
    - Time-slice timer goes off. The OS must save the state of the current process, choose another
      process to run, then give the CPU to that process.
• When building an event-driven system with several distinct serial activities, threads are a key structuring
mechanism of the OS.
• A thread is again an execution stream in the context of a thread state. Key difference between processes
and threads is that multiple threads share parts of their state. Typically, allow multiple threads to read and
write same memory. (Recall that no process could directly access memory of another process.) But, each
thread still has its own registers. Also has its own stack, but other threads can read and write the stack
memory.
• What is in a thread control block? Typically just registers. Don't need to do anything to the MMU when
switching threads, because all threads can access same memory.
• Typically, an OS will have a separate thread for each distinct activity. In particular, the OS will have a
separate thread for each process, and that thread will perform OS activities on behalf of the process. In this
case we say that each user process is backed by a kernel thread.
    - When process issues a system call to read a file, the process's thread will take over, figure out which
      disk accesses to generate, and issue the low level instructions required to start the transfer. It then
      suspends until the disk finishes reading in the data.
    - When process starts up a remote TCP connection, its thread handles the low-level details of sending
      out network packets.
• Having a separate thread for each activity allows the programmer to program the actions associated with
that activity as a single serial stream of actions and events. Programmer does not have to deal with the
complexity of interleaving multiple activities on the same thread.
• Why allow threads to access same memory? Because inside OS, threads must coordinate their activities
very closely.
    - If two processes issue read file system calls at close to the same time, must make sure that the OS
      serializes the disk requests appropriately.
    - When one process allocates memory, its thread must find some free memory and give it to the
      process. Must ensure that multiple threads allocate disjoint pieces of memory.
Having threads share the same address space makes it much easier to coordinate activities - can build data
structures that represent system state and have threads read and write data structures to figure out what to
do when they need to process a request.
• One complication that threads must deal with: asynchrony. Asynchronous events happen arbitrarily as
the thread is executing, and may interfere with the thread's activities unless the programmer does
something to limit the asynchrony. Examples:
    - An interrupt occurs, transferring control away from one thread to an interrupt handler.
    - A time-slice switch occurs, transferring control from one thread to another.
    - Two threads running on different processors read and write the same memory.
• Asynchronous events, if not properly controlled, can lead to incorrect behavior. Examples:
    - Two threads need to issue disk requests. First thread starts to program disk controller (assume it is
      memory-mapped, and must issue multiple writes to specify a disk operation). In the meantime, the
      second thread runs on a different processor and also issues the memory-mapped writes to program
      the disk controller. The disk controller gets horribly confused and reads the wrong disk block.
    - Two threads need to write to the display. The first thread starts to build its request, but before it
      finishes a time-slice switch occurs and the second thread starts its request. The combination of the
      two threads issues a forbidden request sequence, and smoke starts pouring out of the display.
    - For accounting reasons the operating system keeps track of how much time is spent in each user
      program. It also keeps a running sum of the total amount of time spent in all user programs. Two
      threads increment their local counters for their processes, then concurrently increment the global
      counter. Their increments interfere, and the recorded total time spent in all user processes is less
      than the sum of the local times.
• So, programmers need to coordinate the activities of the multiple threads so that these bad things don't
happen. Key mechanism: synchronization operations. These operations allow threads to control the timing of
their events relative to events in other threads. Appropriate use allows programmers to avoid problems like
the ones outlined above.
Overview Of Thread:
• We first must postulate a thread creation and manipulation interface. Will use the one in Nachos:
class Thread {
  public:
    Thread(char* debugName);
    ~Thread();
    void Fork(void (*func)(int), int arg);
    void Yield();
    void Finish();
};
• The Thread constructor creates a new thread. It allocates a data structure with space for the TCB.
• To actually start the thread running, must tell it what function to start running when it runs.
The Fork method gives it the function and a parameter to the function.
• What does Fork do? It first allocates a stack for the thread. It then sets up the TCB so that when the
thread starts running, it will invoke the function and pass it the correct parameter. It then puts the thread on
a run queue someplace. Fork then returns, and the thread that called Fork continues.
• How does OS set up TCB so that the thread starts running at the function? First, it sets the stack pointer
in the TCB to the stack. Then, it sets the PC in the TCB to be the first instruction in the function. Then, it
sets the register in the TCB holding the first parameter to the parameter. When the thread system restores
the state from the TCB, the function will magically start to run.
• The system maintains a queue of runnable threads. Whenever a processor becomes idle, the thread
scheduler grabs a thread off of the run queue and runs the thread.
• Conceptually, threads execute concurrently. This is the best way to reason about the behavior of
threads. But in practice, the OS only has a finite number of processors, and it can't run all of the runnable
threads at once. So, must multiplex the runnable threads on the finite number of processors.
• Let's do a few thread examples. First example: two threads that increment a variable.
int a = 0;
void sum(int p) {
    a++;
    printf("%d : a = %d\n", p, a);
}
void main() {
    Thread *t = new Thread("child");
    t->Fork(sum, 1);
    sum(0);
}
• The two calls to sum run concurrently. What are the possible results of the program? To understand this
fully, we must break the sum subroutine up into its primitive components.
• sum first reads the value of a into a register. It then increments the register, then stores the contents of
the register back into a. It then reads the values of the control string, p and a into the registers that it
uses to pass arguments to the printf routine. It then calls printf, which prints out the data.
• The best way to understand the instruction sequence is to look at the generated assembly language
(cleaned up just a bit). You can have the compiler generate assembly code instead of object code by giving
it the -S flag. It will put the generated assembly in the same file name as the .c or .cc file, but with a .s
suffix.
    la a, %r0
    ld [%r0],%r1
    add %r1,1,%r1
    st %r1,[%r0]
    ld [%r0], %o3
    ! parameters are passed starting with %o0
    mov %o0, %o1
    la .L17, %o0
    call printf
• So when executed concurrently, the result depends on how the instructions interleave. What are possible
results? Each two-line block below is one possible output:

    0 : 1        0 : 1
    1 : 2        1 : 1

    1 : 2        1 : 1
    0 : 1        0 : 1

    1 : 1        0 : 2
    0 : 2        1 : 2

    0 : 2        1 : 2
    1 : 1        0 : 2

So the results are nondeterministic - you may get different results when you run the program more than
once. So, it can be very difficult to reproduce bugs. Nondeterministic execution is one of the things that
makes writing parallel programs much more difficult than writing serial programs.
• Chances are, the programmer is not happy with all of the possible results listed above. Probably wanted
the value of a to be 2 after both threads finish. To achieve this, must make the increment operation atomic.
That is, must prevent the interleaving of the instructions in a way that would interfere with the additions.
• Concept of atomic operation. An atomic operation is one that executes without any interference from
other operations - in other words, it executes as one unit. Typically build complex atomic operations up out
of sequences of primitive operations. In our case the primitive operations are the individual machine
instructions.
• More formally, if several atomic operations execute, the final result is guaranteed to be the same as if the
operations executed in some serial order.
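
As a rough illustration of the serial-order guarantee, the sketch below uses standard C++ threads and a `std::mutex` rather than Nachos (an assumption made so the example can be compiled on its own; the function name `countWithMutex` is made up). Two threads each increment a shared counter many times, and the lock makes every load/add/store sequence execute as one unit.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Two threads each add `perThread` to a shared counter. Holding the mutex
// around the increment makes the load, add and store execute as one unit.
int countWithMutex(int perThread) {
    int a = 0;
    std::mutex m;
    auto work = [&]() {
        for (int i = 0; i < perThread; ++i) {
            std::lock_guard<std::mutex> guard(m);  // start of the atomic operation
            ++a;                                   // load, add, store
        }                                          // guard releases the mutex here
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    return a;
}
```

Because each increment is atomic, the final value is the same as if all the increments had run in some serial order, namely twice `perThread`. Without the lock the increments can interleave as in the assembly listing and some of them can be lost.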
• In our case above, build an increment operation up out of loads, stores and add machine instructions.
Want the increment operation to be atomic.
• Use synchronization operations to make code sequences atomic. First synchronization abstraction:
semaphores. A semaphore is, conceptually, a counter that supports two atomic operations, P and V. Here is
the Semaphore interface from Nachos:
class Semaphore {
  public:
    Semaphore(char* debugName, int initialValue);
    ~Semaphore();
    void P();
    void V();
};
• Here is what the operations do:
    - Semaphore(name, count) : creates a semaphore and initializes the counter to count.
    - P() : Atomically waits until the counter is greater than 0, then decrements the counter and returns.
    - V() : Atomically increments the counter.
• Here is how we can use the semaphore to make the sum example work:

int a = 0;
Semaphore *s;
void sum(int p) {
    int t;
    s->P();
    a++;
    t = a;
    s->V();
    printf("%d : a = %d\n", p, t);
}
void main() {
    Thread *t = new Thread("child");
    s = new Semaphore("s", 1);
    t->Fork(sum, 1);
    sum(0);
}
• We are using semaphores here to implement a mutual exclusion mechanism. The idea behind mutual
exclusion is that only one thread at a time should be allowed to do something. In this case, only one thread
should access a. Use mutual exclusion to make operations atomic. The code that performs the atomic
operation is called a critical section.
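
For readers without a Nachos build, the same idea can be tried with standard C++. The sketch below is an assumption-laden stand-in: the `Semaphore` class is built here from `std::mutex` and `std::condition_variable`, which is not how Nachos implements it, and `sumWithSemaphore` is a made-up name. Only the P and V behaviour follows the description in the text.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Counting semaphore with the P/V interface described in the text.
class Semaphore {
  public:
    explicit Semaphore(int initialValue) : count_(initialValue) {}
    void P() {  // wait until the counter is greater than 0, then decrement it
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return count_ > 0; });
        --count_;
    }
    void V() {  // increment the counter and wake one waiting thread
        {
            std::lock_guard<std::mutex> lk(m_);
            ++count_;
        }
        cv_.notify_one();
    }
  private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Two threads each run the critical section `rounds` times.
int sumWithSemaphore(int rounds) {
    int a = 0;
    Semaphore s(1);  // initial value 1: at most one thread inside at a time
    auto sum = [&]() {
        for (int i = 0; i < rounds; ++i) {
            s.P();
            a++;     // critical section
            s.V();
        }
    };
    std::thread child(sum);
    sum();           // the calling thread does the same work
    child.join();
    return a;
}
```

With the semaphore initialized to 1, every increment is counted, so the function returns twice `rounds`.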
• Semaphores do much more than mutual exclusion. They can also be used to synchronize
producer/consumer programs. The idea is that the producer is generating data and the consumer is
consuming data. So a Unix pipe has a producer and a consumer. You can also think of a person typing at a
keyboard as a producer and the shell program reading the characters as a consumer.
• Here is the synchronization problem: make sure that the consumer does not get ahead of the producer.
But, we would like the producer to be able to produce without waiting for the consumer to consume. Can
use semaphores to do this. Here is how it works:
Semaphore *s;
void consumer(int dummy) {
    while (1) {
        s->P();
        // consume the next unit of data
    }
}
void producer(int dummy) {
    while (1) {
        // produce the next unit of data
        s->V();
    }
}
void main() {
    s = new Semaphore("s", 0);
    Thread *t = new Thread("consumer");
    t->Fork(consumer, 1);
    t = new Thread("producer");
    t->Fork(producer, 1);
}
In some sense the semaphore is an abstraction of the collection of data.
• In the real world, pragmatics intrude. If we let the producer run forever and never run the consumer, we
have to store all of the produced data somewhere. But no machine has an infinite amount of storage. So, we
want to let the producer get ahead of the consumer if it can, but only a given amount ahead. We need to
implement a bounded buffer which can hold only N items. If the bounded buffer is full, the producer must
wait before it can put any more data in.
Semaphore *full;
Semaphore *empty;
void consumer(int dummy) {
    while (1) {
        full->P();
        // consume the next unit of data
        empty->V();
    }
}
void producer(int dummy) {
    while (1) {
        empty->P();
        // produce the next unit of data
        full->V();
    }
}
void main() {
    empty = new Semaphore("empty", N);
    full = new Semaphore("full", 0);
    Thread *t = new Thread("consumer");
    t->Fork(consumer, 1);
    t = new Thread("producer");
    t->Fork(producer, 1);
}
An example of where you might use a producer and consumer in an operating system is the console (a
device that reads and writes characters from and to the system console). You would probably use
semaphores to make sure you don't try to read a character before it is typed.
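
The bounded buffer can be made concrete along the following lines. This is a sketch in standard C++, not Nachos code: the `Semaphore` class is a stand-in built from a mutex and a condition variable, and `BoundedBuffer`, `put`, `get` and `transfer` are hypothetical names. It also adds a third semaphore, `mutex_`, to protect the queue itself, which the outline above leaves implicit.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Counting semaphore with the P/V interface described in the text.
class Semaphore {
  public:
    explicit Semaphore(int initialValue) : count_(initialValue) {}
    void P() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return count_ > 0; });
        --count_;
    }
    void V() {
        {
            std::lock_guard<std::mutex> lk(m_);
            ++count_;
        }
        cv_.notify_one();
    }
  private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Buffer that holds at most `capacity` items.
class BoundedBuffer {
  public:
    explicit BoundedBuffer(int capacity) : empty_(capacity), full_(0), mutex_(1) {}
    void put(int x) {
        empty_.P();           // wait for a free slot
        mutex_.P();
        items_.push_back(x);
        mutex_.V();
        full_.V();            // one more item is available
    }
    int get() {
        full_.P();            // wait for an item
        mutex_.P();
        int x = items_.front();
        items_.pop_front();
        mutex_.V();
        empty_.V();           // one more slot is free
        return x;
    }
  private:
    Semaphore empty_;         // counts free slots, starts at N
    Semaphore full_;          // counts stored items, starts at 0
    Semaphore mutex_;         // protects items_ (an addition, see lead-in)
    std::deque<int> items_;
};

// A producer sends 0..n-1 through the buffer; returns what the consumer got.
std::vector<int> transfer(int n, int capacity) {
    BoundedBuffer buf(capacity);
    std::vector<int> received;
    std::thread consumer([&] { for (int i = 0; i < n; ++i) received.push_back(buf.get()); });
    std::thread producer([&] { for (int i = 0; i < n; ++i) buf.put(i); });
    producer.join();
    consumer.join();
    return received;
}
```

The producer can run ahead of the consumer by at most `capacity` items, and the consumer blocks in `get` instead of looping when the buffer is empty.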
[ Semaphores are one synchroni2ation abstraction1 &here is another called loc-s and condition "ariables1
[ Loc-s are an abstraction specifically for mtal exclsion only1 4ere is the Nachos loc- interface)
    class Lock {
      public:
        Lock(char* debugName);  // initialize lock to be FREE
        ~Lock();                // deallocate lock
        void Acquire();         // these are the only operations on a lock
        void Release();         // they are both *atomic*
    }
• A lock can be in one of two states: locked and unlocked. Semantics of lock operations:
  - Lock(name): creates a lock that starts out in the unlocked state.
  - Acquire(): atomically waits until the lock state is unlocked, then sets the lock state to locked.
  - Release(): atomically changes the lock state from locked to unlocked.
In assignment 1 you will implement locks in Nachos on top of semaphores.
• What are the requirements for a locking implementation?
  - Only one thread can acquire the lock at a time. (safety)
  - If multiple threads try to acquire an unlocked lock, one of the threads will get it. (liveness)
  - All unlocks complete in finite time. (liveness)
• What are desirable properties for a locking implementation?
  - Efficiency: take up as few resources as possible.
  - Fairness: threads acquire the lock in the order they ask for it. There are also weaker forms of fairness.
  - Simple to use.
• When you use locks, you typically associate a lock with pieces of data that multiple threads access. When
one thread wants to access a piece of data, it first acquires the lock. It then performs the access, then unlocks
the lock. So, the lock allows threads to perform complicated atomic operations on each piece of data.
• Can you implement an unbounded buffer using only locks? There is a problem - if the consumer wants to
consume a piece of data before the producer produces the data, it must wait. But locks do not allow the
consumer to wait until the producer produces the data. So, the consumer must loop until the data is ready.
This is bad because it wastes CPU resources.
• There is another synchronization abstraction called condition variables just for this kind of situation.
Here is the Nachos interface:
    class Condition {
      public:
        Condition(char* debugName);
        ~Condition();
        void Wait(Lock *conditionLock);
        void Signal(Lock *conditionLock);
        void Broadcast(Lock *conditionLock);
    }
• Semantics of condition variable operations:
  - Condition(name): creates a condition variable.
  - Wait(Lock *l): atomically releases the lock and waits. When Wait returns the lock will have been
    reacquired.
  - Signal(Lock *l): enables one of the waiting threads to run. When Signal returns the lock is still
    acquired.
  - Broadcast(Lock *l): enables all of the waiting threads to run. When Broadcast returns the lock is
    still acquired.
The lock passed to Wait, Signal and Broadcast on a given condition variable must always be the same lock.
In assignment 1 you will implement condition variables in Nachos on top of semaphores.
• Typically, you associate a lock and a condition variable with a data structure. Before the program
performs an operation on the data structure, it acquires the lock. If it has to wait before it can perform the
operation, it uses the condition variable to wait for another operation to bring the data structure into a state
where it can perform the operation. In some cases you need more than one condition variable.
• Let's say that we want to implement an unbounded buffer using locks and condition variables. In this
case we have 2 consumers.
    Lock *l;
    Condition *c;
    int avail = 0;
    void consumer(int dummy) {
      while (1) {
        l->Acquire();
        if (avail == 0) {
          c->Wait(l);
        }
        // consume the next unit of data
        avail--;
        l->Release();
      }
    }
    void producer(int dummy) {
      while (1) {
        l->Acquire();
        // produce the next unit of data
        avail++;
        c->Signal(l);
        l->Release();
      }
    }
    void main() {
      l = new Lock("l");
      c = new Condition("c");
      Thread *t = new Thread("consumer");
      t->Fork(consumer, 1);
      t = new Thread("consumer");
      t->Fork(consumer, 2);
      t = new Thread("producer");
      t->Fork(producer, 1);
    }
• There are two variants of condition variables: Hoare condition variables and Mesa condition variables. For
Hoare condition variables, when one thread performs a Signal, the very next thread to run is the waiting
thread.
For Mesa condition variables, there are no guarantees when the signalled thread will run. Other threads that
acquire the lock can execute between the signaller and the waiter. The example above will work with Hoare
condition variables but not with Mesa condition variables.
• What is the problem with Mesa condition variables? Consider the following scenario: three threads,
thread 1 producing data, threads 2 and 3 consuming data.
  - Thread 2 calls consumer, and suspends.
  - Thread 1 calls producer, and signals thread 2.
  - Instead of thread 2 running next, thread 3 runs next, calls consumer, and consumes the element.
    (Note: with Hoare monitors, thread 2 would always run next, so this would not happen.)
  - Thread 2 runs, and tries to consume an item that is not there. Depending on the data structure
    used to store produced items, it may get some kind of illegal access error.
• How can we fix this problem? Replace the if with a while.
    void consumer(int dummy) {
      while (1) {
        l->Acquire();
        while (avail == 0) {
          c->Wait(l);
        }
        // consume the next unit of data
        avail--;
        l->Release();
      }
    }
In general, this is a crucial point. Always put while's around your condition variable code. If you don't, you
can get really obscure bugs that show up very infrequently.
• In this example, what is the data that the lock and condition variable are associated with?
The avail variable.
• People have developed a programming abstraction that automatically associates locks and condition
variables with data. This abstraction is called a monitor. A monitor is a data structure plus a set of
operations (sort of like an abstract data type). The monitor also has a lock and, optionally, one or more
condition variables. See notes for Lecture 14.
• The compiler for the monitor language automatically inserts a lock operation at the beginning of each
routine and an unlock operation at the end of the routine. So, the programmer does not have to put in the
lock operations.
• Monitor languages were popular in the middle 80's - they are in some sense safer because they
eliminate one possible programming error. But more recent languages have tended not to support monitors
explicitly, and expose the locking operations to the programmer. So the programmer has to insert the lock
and unlock operations by hand. Java takes a middle ground - it supports monitors, but also allows
programmers to exert finer grain control over the locked sections by supporting synchronized blocks within
methods. But synchronized blocks still present a structured model of synchronization, so it is not possible to
mismatch the lock acquire and release.
• Laundromat Example: A local laundromat has switched to a computerized machine allocation scheme.
There are N machines, numbered 1 to N. By the front door there are P allocation stations. When you want to
wash your clothes, you go to an allocation station and put in your coins. The allocation station gives you a
number, and you use that machine. There are also P deallocation stations. When your clothes finish, you
give the number back to one of the deallocation stations, and someone else can use the machine. Here is
the alpha release of the machine allocation software:
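The same rule applies to std::condition_variable in standard C++, which gives no guarantee about when a notified thread runs and may also wake up spuriously. The following is a minimal sketch under those assumptions; run_two_consumers, consumed and done are hypothetical names added so the example terminates and can be checked.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// One producer hands out `items` units to two consumers. Returns how many
// units were consumed in total. All names here are illustrative.
int run_two_consumers(int items) {
    std::mutex l;
    std::condition_variable c;
    int avail = 0;      // units produced but not yet consumed
    int consumed = 0;   // units consumed so far
    bool done = false;  // set once the producer has finished

    auto consumer = [&] {
        for (;;) {
            std::unique_lock<std::mutex> lk(l);
            // Re-check the condition after every wakeup: another consumer
            // may have taken the unit before this thread got the lock.
            while (avail == 0 && !done) {
                c.wait(lk);
            }
            if (avail == 0) return;  // producer finished, nothing left
            --avail;
            ++consumed;
        }
    };

    std::thread c1(consumer), c2(consumer);
    for (int i = 0; i < items; ++i) {
        std::lock_guard<std::mutex> lk(l);
        ++avail;
        c.notify_one();
    }
    {
        std::lock_guard<std::mutex> lk(l);
        done = true;
    }
    c.notify_all();  // let both consumers observe `done`
    c1.join();
    c2.join();
    return consumed;
}
```

With the inner while replaced by an if, a consumer could decrement avail below zero whenever the other consumer takes the unit first.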
    void allocate(int dummy) {
      while (1) {
        // wait for coins from user
        n = get();
        // give number n to user
      }
    }
    void deallocate(int dummy) {
      while (1) {
        // wait for number n from user
        put(n);
      }
    }
    void main() {
      for (int i = 0; i < P; i++) {
        t = new Thread("allocate");
        t->Fork(allocate, 0);
        t = new Thread("deallocate");
        t->Fork(deallocate, 0);
      }
    }
• The key parts of the scheduling are done in the two routines get and put, which use an array data
structure a to keep track of which machines are in use and which are free.
    int a[N];
    int get() {
      for (int i = 0; i < N; i++) {
        if (a[i] == 0) {
          a[i] = 1;
          return(i+1);
        }
      }
    }
    void put(int i) {
      a[i-1] = 0;
    }
• It seems that the alpha software isn't doing all that well. Just looking at the software, you can see that
there are several synchronization problems.
• The first problem is that sometimes two people are assigned to the same machine. Why does this
happen? We can fix this with a lock:
    int a[N];
    Lock *l;
    int get() {
      l->Acquire();
      for (int i = 0; i < N; i++) {
        if (a[i] == 0) {
          a[i] = 1;
          l->Release();
          return(i+1);
        }
      }
      l->Release();
    }
    void put(int i) {
      l->Acquire();
      a[i-1] = 0;
      l->Release();
    }
So now, we have fixed the multiple assignment problem. But what happens if someone comes in to the laundry
when all of the machines are already taken? What does the machine return? We must fix it so that the system
waits until there is a machine free before it returns a number. The situation calls for condition variables.
    int a[N];
    Lock *l;
    Condition *c;
    int get() {
      l->Acquire();
      while (1) {
        for (int i = 0; i < N; i++) {
          if (a[i] == 0) {
            a[i] = 1;
            l->Release();
            return(i+1);
          }
        }
        c->Wait(l);
      }
    }
    void put(int i) {
      l->Acquire();
      a[i-1] = 0;
      c->Signal(l);
      l->Release();
    }
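As a rough illustration, the condition-variable version of get and put can be written with the standard C++ library. The class name MachinePool and the helper run_laundromat are assumptions made for this sketch; the logic follows the pseudocode above.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical machine allocator: machines are numbered 1..N.
class MachinePool {
public:
    explicit MachinePool(int n) : in_use_(n, false) {}

    // Blocks until some machine is free, marks it in use, returns its number.
    int get() {
        std::unique_lock<std::mutex> lk(l_);
        for (;;) {
            for (std::size_t i = 0; i < in_use_.size(); ++i) {
                if (!in_use_[i]) {
                    in_use_[i] = true;
                    return static_cast<int>(i) + 1;
                }
            }
            c_.wait(lk);  // every machine is taken: wait for a put()
        }
    }

    // Returns machine n (1-based) to the pool and wakes one waiting thread.
    void put(int n) {
        std::lock_guard<std::mutex> lk(l_);
        in_use_[n - 1] = false;
        c_.notify_one();
    }

private:
    std::mutex l_;               // protects in_use_
    std::condition_variable c_;
    std::vector<bool> in_use_;
};

// One machine, two customers: the second customer gets the machine only
// after the first has returned it. Returns the second customer's number.
int run_laundromat() {
    MachinePool pool(1);
    int first = pool.get();
    int second = 0;
    std::thread customer([&] { second = pool.get(); });
    pool.put(first);
    customer.join();
    return second;
}
```

The search loop inside get() plays the role of the while around Wait: a woken thread rescans the array instead of assuming a machine is free.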
• What data is the lock protecting? The a array.
• When would you use a broadcast operation? Whenever you want to wake up all waiting threads, not just
one. For an event that happens only once. For example, a bunch of threads may wait until a file is deleted.
The thread that actually deleted the file could use a broadcast to wake up all of the threads.
• Also use a broadcast for allocation/deallocation of variable sized units. Example: concurrent malloc/free.
    Lock *l;
    Condition *c;
    char *malloc(int s) {
      l->Acquire();
      while (cannot allocate a chunk of size s) {
        c->Wait(l);
      }
      // allocate chunk of size s
      l->Release();
      // return pointer to allocated chunk
    }
    void free(char *m) {
      l->Acquire();
      // deallocate m
      c->Broadcast(l);
      l->Release();
    }
• Example with malloc/free. Initially start out with 10 bytes free.

    Time  Process 1                Process 2                    Process 3
          malloc(10) - succeeds    malloc(5) - suspends lock    malloc(5) - suspends lock
    1                              gets lock - waits
    2                                                           gets lock - waits
    3     free(10) - broadcast
    4                              resume malloc(5) - succeeds
    5                                                           resume malloc(5) - succeeds
    6     malloc(7) - waits
    7                                                           malloc(3) - waits
    8                              free(5) - broadcast
    9     resume malloc(7) - waits
    10                                                          resume malloc(3) - succeeds

What would happen if we changed c->Broadcast(l) to c->Signal(l)? At step 10, process 3 would not
wake up, and it would not get the chance to allocate available memory. What would happen if we
changed the while loop to an if?
• You will be asked to implement condition variables as part of assignment 1. The following
implementation is INCORRECT. Please do not turn this implementation in.
    class Condition {
      private:
        int waiting;
        Semaphore *sema;
    }
    void Condition::Wait(Lock* l)
    {
      waiting++;
      l->Release();
      sema->P();
      l->Acquire();
    }
    void Condition::Signal(Lock* l)
    {
      if (waiting > 0) {
        sema->V();
        waiting--;
      }
    }
Why is this solution incorrect? Because in some cases the signalling thread may wake up a waiting thread
that called Wait after the signalling thread called Signal.
Overview of Deadlock:
You may need to write code that acquires more than one lock. This opens up the possibility of deadlock.
Consider the following piece of code:
    Lock *l1, *l2;
    void p() {
      l1->Acquire();
      l2->Acquire();
      // code that manipulates data that l1 and l2 protect
      l2->Release();
      l1->Release();
    }
    void q() {
      l2->Acquire();
      l1->Acquire();
      // code that manipulates data that l1 and l2 protect
      l1->Release();
      l2->Release();
    }
If p and q execute concurrently, consider what may happen. First, p acquires l1 and q acquires l2. Then, p
waits to acquire l2 and q waits to acquire l1. How long will they wait? Forever. This case is called deadlock.
What are the conditions for deadlock?
  - Mutual Exclusion: only one thread can hold a lock at a time.
  - Hold and Wait: at least one thread holds a lock and is waiting for another process to release a lock.
  - No Preemption: only the process holding the lock can release it.
  - Circular Wait: there is a set t1, ..., tn such that t1 is waiting for a lock held by t2, ..., tn is waiting
    for a lock held by t1.
How can p and q avoid deadlock? Order the locks, and always acquire the locks in that order. This eliminates
the circular wait condition. Occasionally you may need to write code that needs to acquire locks in different
orders. Here is what to do in this situation. First, most locking abstractions offer an operation that tries to
acquire the lock but returns if it cannot. We will call this operation TryAcquire. Use this operation to try to
acquire the lock that you need to acquire out of order. If the operation succeeds, fine. Once you've got the
lock, there is no problem. If the operation fails, your code will need to release all locks that come after the
lock you are trying to acquire. Make sure the associated data structures are in a state where you can release
the locks without crashing the system. Release all of the locks that come after the lock you are trying to
acquire, then reacquire all of the locks in the right order. When the code resumes, bear in mind that the data
structures might have changed between the time when you released and reacquired the lock.
Here is an example.
    int d1, d2;
    // The standard acquisition order for these two locks
    // is l1, l2.
    Lock *l1,  // protects d1
         *l2;  // protects d2
    // Decrements d2, and if the result is 0, increments d1
    void increment() {
      l2->Acquire();
      int t = d2;
      t--;
      if (t == 0) {
        if (l1->TryAcquire()) {
          d1++;
        } else {
          // Any modifications to d2 go here - in this case none
          l2->Release();
          l1->Acquire();
          l2->Acquire();
          t = d2;
          t--;
          // some other thread may have changed d2 - must recheck it
          if (t == 0) {
            d1++;
          }
        }
        l1->Release();
      }
      d2 = t;
      l2->Release();
    }
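The back-off pattern can be sketched with std::mutex, whose try_lock member plays the role of TryAcquire. This is an illustrative sketch, not the Nachos code: decrement_d2 is a hypothetical name, and std::unique_lock is used so the locks are released automatically on return.

```cpp
#include <mutex>

std::mutex l1, l2;   // standard acquisition order: l1, then l2
int d1 = 0, d2 = 0;  // l1 protects d1, l2 protects d2

// Decrements d2 and, if the result is 0, increments d1.
void decrement_d2() {
    std::unique_lock<std::mutex> g2(l2);                    // l2 is held first
    std::unique_lock<std::mutex> g1(l1, std::defer_lock);   // l1 not yet held
    if (d2 - 1 == 0 && !g1.try_lock()) {
        // Could not take l1 out of order: back off, then take both in order.
        g2.unlock();
        g1.lock();
        g2.lock();
    }
    // d2 may have changed while l2 was released, so it is re-read here.
    int t = d2 - 1;
    if (t == 0 && g1.owns_lock()) {
        ++d1;
    }
    d2 = t;
}  // g1 (if held) and g2 are released here
```

If the result is 0 on the final read, l1 is necessarily held: either try_lock succeeded, or the back-off path acquired it. When l2 was never released, d2 cannot have changed between the two reads.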
This example is somewhat contrived, but you will recognize the situation when it occurs.

There is a generalization of the deadlock problem to situations in which processes need multiple resources,
and the hardware may have multiple kinds of each resource - two printers, etc. Seems kind of like a batch
model - processes request resources, then the system schedules a process to run when resources are available.
In this model, processes issue requests to the OS for resources, and the OS decides who gets which resource
when. A lot of theory has been built up to handle this situation. A process first requests a resource, the OS
issues it and the process uses the resource, then the process releases the resource back to the OS.

Reason about resource allocation using a resource allocation graph. Each resource is represented with a box,
each process with a circle, and the individual instances of the resources with dots in the boxes. Arrows go
from processes to resource boxes if the process is waiting for the resource. Arrows go from dots in a resource
box to processes if the process holds that instance of the resource. See Fig. 7.1. If the graph contains no
cycles, there is no deadlock. If it has a cycle, there may or may not be deadlock. See Fig. 7.2, 7.3.

The system can either:
  - Restrict the way in which processes will request resources to prevent deadlock.
  - Require processes to give advance information about which resources they will require, then use
    algorithms that schedule the processes in a way that avoids deadlock.
  - Detect and eliminate deadlock when it occurs.

First consider prevention. Look at the deadlock conditions listed above.
  - Mutual Exclusion. To eliminate mutual exclusion, allow everybody to use the resource immediately if they
    want to. Unrealistic in general - do you want your printer output interleaved with someone else's?
  - Hold and Wait. To prevent hold and wait, ensure that when a process requests resources, it does not hold
    any other resources. It either asks for all resources before it executes, or dynamically asks for
    resources in chunks as needed, then releases all resources before asking for more. Two problems -
    processes may hold but not use resources for a long time because they will eventually need them. Also,
    may have starvation. If a process asks for lots of resources, it may never run because other processes
    always hold some subset of the resources.
  - Circular Wait. To prevent circular wait, order resources and require processes to request resources in
    that order.

Deadlock avoidance. Simplest algorithm - each process tells the max number of resources it will ever need.
As the process runs, it requests resources but never exceeds the max number of resources. The system
schedules processes and allocates resources in a way that ensures that no deadlock results.

Example: the system has 12 tape drives. The system is currently running P0 (needs max 10, has 5), P1 (needs
max 4, has 2) and P2 (needs max 9, has 2). Can the system prevent deadlock even if all processes request the
max? Well, right now the system has 3 free tape drives. If P1 runs first and completes, it will have 5 free tape
drives. P0 can run to completion with those 5 free tape drives even if it requests max. Then P2 can complete.
So, this schedule will execute without deadlock. If P2 requests two more tape drives, can the system give it
the drives? No, because it cannot be sure it can run all jobs to completion with only 1 free drive. So, the
system must not give P2 2 more tape drives until P1 finishes. If P2 asks for 2 tape drives, the system suspends
P2 until P1 finishes.

Concept: Safe Sequence. An ordering of processes such that all processes can execute to completion in that
order even if all request maximum resources.
Concept: Safe State. A state in which there exists a safe sequence. Deadlock avoidance algorithms always
ensure that the system stays in a safe state.

How can you figure out if a system is in a safe state? Given the current and maximum allocation, find a safe
sequence. The system must maintain some information about the resources and processes:
    Avail[j] = number of resource j available
    Max[i,j] = max number of resource j that process i will use
    Alloc[i,j] = number of resource j that process i currently has
    Need[i,j] = Max[i,j] - Alloc[i,j]
Notation: A<=B if for all processes i, A[i]<=B[i].

Safety Algorithm: will try to find a safe sequence. Simulate evolution of the system over time under worst
case assumptions of resource demands.
    1: Work = Avail; Finish[i] = False for all i;
    2: Find i such that Finish[i] = False and Need[i] <= Work. If no such i exists, goto 4
    3: Work = Work + Alloc[i]; Finish[i] = True; goto 2
    4: If Finish[i] = True for all i, system is in a safe state
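The safety algorithm translates fairly directly into code. The sketch below assumes resources are given as vectors indexed by resource type; the names Vec, Matrix, leq and is_safe are made up for the illustration.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;     // one entry per resource type
using Matrix = std::vector<Vec>;  // one row per process

// True if a[j] <= b[j] for every resource type j.
bool leq(const Vec& a, const Vec& b) {
    for (std::size_t j = 0; j < a.size(); ++j) {
        if (a[j] > b[j]) return false;
    }
    return true;
}

// Safety algorithm: returns true if some ordering lets every process run
// to completion even if each one requests its maximum.
bool is_safe(const Vec& avail, const Matrix& max, const Matrix& alloc) {
    const std::size_t n = alloc.size();
    Vec work = avail;                    // step 1
    std::vector<bool> finish(n, false);
    bool progress = true;
    while (progress) {                   // steps 2 and 3, repeated
        progress = false;
        for (std::size_t i = 0; i < n; ++i) {
            if (finish[i]) continue;
            Vec need(work.size());
            for (std::size_t j = 0; j < work.size(); ++j) {
                need[j] = max[i][j] - alloc[i][j];
            }
            if (leq(need, work)) {
                // Process i can finish and hand back what it holds.
                for (std::size_t j = 0; j < work.size(); ++j) {
                    work[j] += alloc[i][j];
                }
                finish[i] = true;
                progress = true;
            }
        }
    }
    for (bool f : finish) {              // step 4
        if (!f) return false;
    }
    return true;
}
```

For the tape drive example (one resource type, 3 drives free, maxima 10, 4, 9 and allocations 5, 2, 2) is_safe reports a safe state. After granting P2 two more drives (1 free, allocations 5, 2, 4) it reports an unsafe one.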
Now, we can use the safety algorithm to determine if we can satisfy a given resource demand. When a process
demands additional resources, see if we can give them to the process and remain in a safe state. If not,
suspend the process until the system can allocate the resources and remain in a safe state. Need an
additional data structure:
    Request[i,j] = number of j resources that process i requests
Here is the algorithm. Assume process i has just requested additional resources.
    1: If Request[i] <= Need[i] goto 2. Otherwise, process has violated its maximum resource claim.
    2: If Request[i] <= Avail goto 3. Otherwise, i must wait because resources are not available.
    3: Pretend to allocate resources as follows:
         Avail = Avail - Request[i]
         Alloc[i] = Alloc[i] + Request[i]
         Need[i] = Need[i] - Request[i]
       If this is a safe state, give the process the resources. Otherwise, suspend the process and restore
       the old state.
When to check if a suspended process should be given the resources and resumed? Obvious choice - when
some other process relinquishes its resources. Obvious problem - the process starves because other processes
with lower resource requirements are always taking freed resources. See Example in Section 7.5.3.3.

Third alternative: deadlock detection and elimination. Just let deadlock happen. Detect when it does, and
eliminate the deadlock by preempting resources. Here is the deadlock detection algorithm. It is very similar
to the safe state detection algorithm.
    1: Work = Avail; Finish[i] = False for all i;
    2: Find i such that Finish[i] = False and Request[i] <= Work. If no such i exists, goto 4
    3: Work = Work + Alloc[i]; Finish[i] = True; goto 2
    4: If Finish[i] = False for some i, system is deadlocked.
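A sketch of the detection algorithm follows. It differs from the safety check only in comparing outstanding requests, rather than maximum needs, against the work vector. The function name find_deadlocked and the type aliases are assumptions for this example.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;     // one entry per resource type
using Matrix = std::vector<Vec>;  // one row per process

// Returns the indices of deadlocked processes (empty if there is none).
std::vector<int> find_deadlocked(const Vec& avail, const Matrix& alloc,
                                 const Matrix& request) {
    const std::size_t n = alloc.size();
    Vec work = avail;                    // step 1
    std::vector<bool> finish(n, false);
    bool progress = true;
    while (progress) {                   // steps 2 and 3, repeated
        progress = false;
        for (std::size_t i = 0; i < n; ++i) {
            if (finish[i]) continue;
            bool can_run = true;
            for (std::size_t j = 0; j < work.size(); ++j) {
                if (request[i][j] > work[j]) can_run = false;
            }
            if (!can_run) continue;
            // Process i can be satisfied; assume it finishes and releases.
            for (std::size_t j = 0; j < work.size(); ++j) {
                work[j] += alloc[i][j];
            }
            finish[i] = true;
            progress = true;
        }
    }
    std::vector<int> stuck;              // step 4
    for (std::size_t i = 0; i < n; ++i) {
        if (!finish[i]) stuck.push_back(static_cast<int>(i));
    }
    return stuck;
}
```

With two resource types, nothing available, process 0 holding one unit of the first type and requesting the second, and process 1 holding the second and requesting the first, both processes are reported as deadlocked.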
When to run the deadlock detection algorithm? Obvious time: whenever a process requests more resources and
suspends. If deadlock detection takes too much time, maybe run it less frequently.

OK, now you've found a deadlock. What do you do? Must free up some resources so that some processes can
run. So, preempt resources - take them away from processes. Several different preemption cases:
  - Can preempt some resources without killing the job - for example, main memory. Can just swap out to disk
    and resume the job later.
  - If the job provides rollback points, can roll the job back to a point before it acquired the resources.
    At a later time, restart the job from the rollback point. Default rollback point - start of job.
  - For some resources must just kill the job. All resources are then free. Can either kill processes one by
    one until your system is no longer deadlocked. Or, just go ahead and kill all deadlocked processes.
In a real system, typically use different deadlock strategies for different situations based on resource
characteristics. This whole topic has a sort of 60's and 70's batch mainframe feel to it. How come these
topics never seem to arise in modern Unix systems?
Overview of Synchronization:
• How do we implement synchronization operations like locks? Can build synchronization operations out of
atomic reads and writes. There is a lot of literature on how to do this; one algorithm is called the bakery
algorithm. But this is slow and cumbersome to use. So, most machines have hardware support for
synchronization - they provide synchronization instructions.
• On a uniprocessor, the only thing that will make multiple instruction sequences not atomic is interrupts.
So, if you want to do a critical section, turn off interrupts before the critical section and turn on interrupts
after the critical section. Guaranteed atomicity. It is also fairly efficient. Early versions of Unix did this.
• Why not just turn off interrupts everywhere? Two main disadvantages: can't use it in a multiprocessor, and
can't use it directly from a user program for synchronization.
• Test-And-Set. The test and set instruction atomically checks if a memory location is zero, and if so, sets
the memory location to 1. If the memory location is 1, it does nothing. It returns the old value of the
memory location. You can use test and set to implement locks as follows:
  - The lock state is implemented by a memory location. The location is 0 if the lock is unlocked and 1 if
    the lock is locked.
  - The lock operation is implemented as:
        while (test-and-set(l) == 1);
  - The unlock operation is implemented as: *l = 0;
The problem with this implementation is busy-waiting. What if one thread already has the lock, and another
thread wants to acquire the lock? The acquiring thread will spin until the thread that already has the lock
unlocks it.
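In standard C++, std::atomic_flag provides a test_and_set operation with these semantics (it returns the previous value), so the busy-waiting lock can be sketched as below. SpinLock and run_spin_counter are illustrative names, not part of the notes.

```cpp
#include <atomic>
#include <thread>
#include <vector>

class SpinLock {
public:
    void lock() {
        // test_and_set returns the old value: true means the lock was
        // already held, so keep spinning until it returns false.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Several threads increment a shared counter under the spin lock.
long run_spin_counter(int threads, int increments) {
    SpinLock lock;
    long counter = 0;  // protected by lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Every thread that fails to get the lock burns CPU time in the while loop, which is the busy-waiting cost discussed above.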
• What if the threads are running on a uniprocessor? How long will the acquiring thread spin? Until it
expires its quantum and the thread that will unlock the lock runs. So on a uniprocessor, if a thread can't get
the lock the first time, it should just suspend. So, lock acquisition looks like this:
    while (test-and-set(l) == 1) {
      currentThread->Yield();
    }

Can make it even better by having a queue lock that queues up the waiting threads and gives the lock to the
first thread in the queue. So, threads never try to acquire the lock more than once.
• On a multiprocessor, it is less clear. The process that will unlock the lock may be running on another
processor. Maybe should spin just a little while, in hopes that the other process will release the lock. To
evaluate spinning and suspending strategies, need to come up with a cost for each suspension algorithm. The
cost is the amount of CPU time the algorithm uses to acquire a lock.
• There are three components of the cost: spinning, suspending and resuming. What is the cost of
spinning? Waste the CPU for the spin time. What is the cost of suspending and resuming? The amount of CPU
time it takes to suspend the thread and restart it when the thread acquires the lock.
• Each lock acquisition algorithm spins for a while, then suspends if it didn't get the lock. The optimal
algorithm is as follows:
  - If the lock will be free in less than the suspend and resume time, spin until you acquire the lock.
  - If the lock will be free in more than the suspend and resume time, suspend immediately.
Obviously, we cannot implement this algorithm - it requires knowledge of the future, which we do not in
general have.
• How do we evaluate practical algorithms - algorithms that spin for a while, then suspend? Well, we
compare them with the optimal algorithm in the worst case for the practical algorithm. What is the worst
case for any practical algorithm relative to the optimal algorithm? When the lock becomes free just after the
practical algorithm stops spinning.
• What is the worst-case cost of the algorithm that spins for the suspend and resume time, then suspends?
(Will call this the SR algorithm.) Two times the suspend and resume time. The worst case is when the lock is
unlocked just after the thread starts the suspend. The optimal algorithm just spins until the lock is unlocked,
taking the suspend and resume time to acquire the lock. The SR algorithm costs twice the suspend and
resume time - it first spins for the suspend and resume time, then suspends, then gets the lock, then
resumes.
• What about other algorithms that spin for a different fixed amount of time then block? All are worse
than the SR algorithm.
  - If spin for less than the suspend and resume time then suspend (call this the LT-SR algorithm), the worst
    case is when the lock becomes free just after the start of the suspend. In this case the algorithm will
    cost the spinning time plus the suspend and resume time. The SR algorithm will just cost the spinning
    time.
  - If spin for greater than the suspend and resume time then suspend (call this the GT-SR algorithm), the
    worst case is again when the lock becomes free just after the start of the suspend. In this case the SR
    algorithm will also suspend and resume, but it will spin for less time than the GT-SR algorithm.
Of course, in practice locks may not exhibit worst case behavior, so the best algorithm depends on the locking
and unlocking patterns actually observed.
• Here is the SR algorithm. Again, it can be improved with the use of queueing locks.
    notDone = test-and-set(l);
    if (!notDone) return;
    start = readClock();
    while (notDone) {
      stop = readClock();
      if (stop - start >= suspendAndResumeTime) {
        currentThread->Yield();
        start = readClock();
      }
      notDone = test-and-set(l);
    }

• There is an orthogonal issue. The test-and-set instruction typically consumes bus resources every time. But
a load instruction caches the data. Subsequent loads come out of cache and never hit the bus. So, can do
something like this for the initial algorithm:
    while (1) {
      if (!test-and-set(l)) break;
      while (*l == 1);
    }
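This "test, then test-and-set" idea can be sketched with std::atomic<bool>, using exchange in the role of test-and-set and load for the cached reads. TtasLock and run_ttas_counter are hypothetical names. Whether the loads actually stay off the bus depends on the hardware's cache coherence protocol, so this is an illustration of the structure rather than a performance claim.

```cpp
#include <atomic>
#include <thread>
#include <vector>

class TtasLock {
public:
    void lock() {
        for (;;) {
            // exchange returns the old value, like test-and-set.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // Lock is held: spin on plain loads until it looks free,
            // then go back and try the atomic exchange again.
            while (locked_.load(std::memory_order_relaxed)) {
            }
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Several threads increment a shared counter under the lock.
long run_ttas_counter(int threads, int increments) {
    TtasLock lock;
    long counter = 0;  // protected by lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```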

• There are other instructions that can be used to implement spin locks - the swap instruction, for example.
• On modern RISC machines, test-and-set and swap may cause implementation headaches. Would rather
do something that fits into the load/store nature of the architecture. So, have a non-blocking abstraction:
Load Linked (LL) / Store Conditional (SC).
• Semantics of LL: Load the memory location into a register and mark it as loaded by this processor. A
memory location can be marked as loaded by more than one processor.
• Semantics of SC: if the memory location is marked as loaded by this processor, store the new value and
remove all marks from the memory location. Otherwise, don't perform the store. Return whether or not the
store succeeded.
• Here is how to use LL/SC to implement the lock operation:
    while (1) {
      LL r1, lock
      if (r1 == 0) {
        LI r2, 1
        if (SC r2, lock) break;
      }
    }

The unlock operation is the same as before.
• Can also use LL/SC to implement some operations (like increment) directly. People have built up a whole
bunch of theory dealing with the difference in power between stuff like LL/SC and test-and-set.
    while (1) {
      LL r1, lock
      ADDI r1, 1, r1
      if (SC r1, lock) break;
    }

• Note that the increment operation is non-blocking. If two threads start to perform the increment at the
same time, neither will block - both will complete the add and only one will successfully perform the SC. The
other will retry. So, it eliminates problems with locking like: one thread acquires locks and dies, or one
thread acquires locks and is suspended for a long time, preventing other threads that need to acquire the
lock from proceeding.
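Portable C++ does not expose LL/SC directly. A compare-and-swap retry loop is the closest standard analogue and is used here as a substitute, not as the LL/SC instructions themselves. The names atomic_increment and run_increments are invented for the sketch.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Non-blocking increment: read the value, compute value + 1, and try to
// store it. The store succeeds only if nobody changed x in between.
void atomic_increment(std::atomic<int>& x) {
    int old = x.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `old` with the current
    // value of x, so the loop simply retries with fresh data.
    while (!x.compare_exchange_weak(old, old + 1)) {
    }
}

// Several threads increment one shared counter without any lock.
int run_increments(int threads, int per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                atomic_increment(counter);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

No thread ever holds a lock here, so a thread that is suspended or dies in the middle of atomic_increment cannot prevent the others from making progress.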
Overview of CPU Scheduling:
• What is CPU scheduling? Determining which processes run when there are multiple runnable processes.
Why is it important? Because it can have a big effect on resource utilization and the overall performance
of the system.
• By the way, the world went through a long period (late 80's, early 90's) in which the most popular
operating systems (DOS, Mac) had NO sophisticated CPU scheduling algorithms. They were single threaded
and ran one process at a time until the user directed them to run another process. Why was this true? More
recent systems (Windows NT) are back to having sophisticated CPU scheduling algorithms. What drove the
change, and what will happen in the future?
• Basic assumptions behind most scheduling algorithms:
  - There is a pool of runnable processes contending for the CPU.
  - The processes are independent and compete for resources.
  - The job of the scheduler is to distribute the scarce resource of the CPU to the different processes
    "fairly" (according to some definition of fairness) and in a way that optimizes some performance
    criteria.
In general, these assumptions are starting to break down. First of all, CPUs are not really that scarce -
almost everybody has several, and pretty soon people will be able to afford lots. Second, many applications
are starting to be structured as multiple cooperating processes. So, a view of the scheduler as mediating
between competing entities may be partially obsolete.
• How do processes behave? First, CPU/IO burst cycle. A process will run for a while (the CPU burst),
perform some IO (the IO burst), then run for a while more (the next CPU burst). How long between IO
operations? Depends on the process.
  - IO bound processes: processes that perform lots of IO operations. Each IO operation is followed by
    a short CPU burst to process the IO, then more IO happens.
  - CPU bound processes: processes that perform lots of computation and do little IO. Tend to have a
    few long CPU bursts.
One of the things a scheduler will typically do is switch the CPU to another process when one process does
IO. Why? The IO will take a long time, and we don't want to leave the CPU idle while waiting for the IO to
finish.
• When you look at CPU burst times across the whole system, you have the exponential or hyperexponential
distribution in Fig. 5.2.
• What are possible process states?
  - Running - process is running on the CPU.
  - Ready - ready to run, but not actually running on the CPU.
  - Waiting - waiting for some event like IO to happen.
• When do scheduling decisions take place? When does the CPU choose which process to run? There are a
variety of possibilities:
  - When a process switches from running to waiting. Could be because of an IO request, because it waits for
    a child to terminate, or because it waits for a synchronization operation (like lock acquisition) to
    complete.
  - When a process switches from running to ready - on completion of an interrupt handler, for example.
    Common example of an interrupt handler - the timer interrupt in interactive systems. If the scheduler
    switches processes in this case, it has preempted the running process. Another common case of an
    interrupt handler is the IO completion handler.
  - When a process switches from waiting to ready state (on completion of IO or acquisition of a lock, for
    example).
  - When a process terminates.
• How to evaluate a scheduling algorithm? There are many possible criteria:
  - CPU Utilization: keep CPU utilization as high as possible. (What is utilization, by the way?)
  - Throughput: number of processes completed per unit time.
  - Turnaround Time: mean time from submission to completion of a process.
  - Waiting Time: amount of time spent ready to run but not running.
  - Response Time: time between submission of a request and the first response to the request.
  - Scheduler Efficiency: the scheduler doesn't perform any useful work, so any time it takes is pure
    overhead. So, need to make the scheduler very efficient.
• Big difference: batch and interactive systems. In batch systems, typically want good throughput or
turnaround time. In interactive systems, both of these are still usually important (after all, want some
computation to happen), but response time is usually a primary consideration. And, for some systems,
throughput or turnaround time is not really relevant - some processes conceptually run forever.
• Difference between long and short term scheduling. The long term scheduler is given a set of processes and
decides which ones should start to run. Once they start running, they may suspend because of IO or
because of preemption. The short term scheduler decides which of the available jobs that the long term
scheduler has decided are runnable to actually run.
• Let's start looking at several vanilla scheduling algorithms.
• First-Come, First-Served. One ready queue, OS runs the process at the head of the queue, new processes
come in at the end of the queue. A process does not give up the CPU until it either terminates or performs IO.
• Consider performance of FCFS algorithm for three compute-bound processes. What if have 3 processes
P1 (takes 24 seconds), P2 (takes 3 seconds) and P3 (takes 3 seconds). If arrive in order P1, P2, P3, what is
    Waiting Time? (0 + 24 + 27) / 3 = 17
    Turnaround Time? (24 + 27 + 30) / 3 = 27.
    Throughput? 30 / 3 = 10 (three processes in 30 seconds, so one completes every 10 seconds on average).
What about if processes come in order P2, P3, P1? What is
    Waiting Time? (0 + 3 + 6) / 3 = 3
    Turnaround Time? (3 + 6 + 30) / 3 = 13.
    Throughput? 30 / 3 = 10 (unchanged, since the total work is the same).
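The FCFS arithmetic can be checked with a few lines of code. This is only a sketch: it assumes all processes are ready at time 0 and never do IO, and the function name fcfs_metrics is made up for illustration.

```python
def fcfs_metrics(bursts):
    """Average waiting and turnaround time for jobs run in list order.

    Assumes every job is ready at time 0 and runs to completion (no IO).
    """
    clock = 0
    waits, turnarounds = [], []
    for burst in bursts:
        waits.append(clock)         # time spent ready but not running
        clock += burst
        turnarounds.append(clock)   # submission (time 0) to completion
    n = len(bursts)
    return sum(waits) / n, sum(turnarounds) / n


print(fcfs_metrics([24, 3, 3]))   # order P1, P2, P3 -> (17.0, 27.0)
print(fcfs_metrics([3, 3, 24]))   # order P2, P3, P1 -> (3.0, 13.0)
```

Running the short jobs first cuts the average waiting time from 17 to 3 seconds, which is the motivation for Shortest-Job-First.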
• Shortest-Job-First (SJF) can eliminate some of the variance in Waiting and Turnaround time. In fact, it is
optimal with respect to average waiting time. Big problem: how does scheduler figure out how long it will
take the process to run?
• For long term scheduler running on a batch system, user will give an estimate. Usually pretty good - if it
is too short, system will cancel job before it finishes. If too long, system will hold off on running the process.
So, users give pretty good estimates of overall running time.
• For short-term scheduler, must use the past to predict the future. Standard way: use a time-decayed
exponentially weighted average of previous CPU bursts for each process. Let $T_n$ be the measured burst time
of the nth burst, $s_n$ be the predicted size of next CPU burst. Then, choose a weighting factor $\alpha$, where
$0 \le \alpha \le 1$, and compute $s_{n+1} = \alpha T_n + (1 - \alpha) s_n$. $s_0$ is defined as some default constant or system average.
• α tells how to weight the most recent burst relative to the older history. If choose α = .5, last observation
has as much weight as entire rest of the history. If choose α = 1, only last observation has any weight. Do a quick example.
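A quick example of the exponentially weighted average, as a sketch. The burst values and the default s0 = 10 are invented for illustration; only the update rule comes from the notes.

```python
def predict_next_burst(measured_bursts, alpha=0.5, s0=10.0):
    """Apply s_{n+1} = alpha * T_n + (1 - alpha) * s_n over a burst history.

    s0 is the initial guess (a default constant or system average).
    """
    s = s0
    for t in measured_bursts:
        s = alpha * t + (1 - alpha) * s
    return s


history = [6, 4, 6, 4]                         # hypothetical measured bursts
print(predict_next_burst(history, alpha=0.5))  # predictions go 8, 6, 6, then 5.0
print(predict_next_burst(history, alpha=1.0))  # only the last burst counts: 4.0
```

With α = 0 the prediction would never move from s0, so the measured bursts would be ignored entirely.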
• Preemptive vs. Non-preemptive SJF scheduler. Preemptive scheduler reruns scheduling decision when a
process becomes ready. If the new process has priority over the running process, the scheduler preempts the running
process and executes the new process. Non-preemptive scheduler only makes a scheduling decision when the
running process voluntarily gives up CPU. In effect, it allows every running process to finish its CPU burst.
• Consider 4 processes P1 (burst time 8), P2 (burst time 4), P3 (burst time 9), P4 (burst time 5) that
arrive one time unit apart in order P1, P2, P3, P4. Assume that after burst happens, process is not reenabled
for a long time (at least 100, for example). What does a preemptive SJF scheduler do? What about a non-
preemptive scheduler?
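One way to answer the question is to simulate both schedulers one time unit at a time. The sketch below assumes arrival times 0, 1, 2, 3, breaks ties in favour of the earlier arrival, and ignores context switch cost; the function name sjf and the job tuple layout are made up for this example.

```python
def sjf(jobs, preemptive):
    """Simulate Shortest-Job-First in unit time steps.

    jobs: list of (name, arrival_time, burst_time).
    Returns {name: completion_time}.
    """
    remaining = {name: burst for name, _, burst in jobs}
    arrival = {name: arr for name, arr, _ in jobs}
    done = {}
    clock = 0
    current = None
    while len(done) < len(jobs):
        ready = [n for n in remaining if arrival[n] <= clock]
        if not ready:
            clock += 1              # CPU idles until the next arrival
            continue
        if preemptive or current is None:
            # Shortest remaining time wins; ties go to the earliest arrival.
            current = min(ready, key=lambda n: (remaining[n], arrival[n]))
        remaining[current] -= 1
        clock += 1
        if remaining[current] == 0:
            done[current] = clock
            del remaining[current]
            current = None
    return done


jobs = [("P1", 0, 8), ("P2", 1, 4), ("P3", 2, 9), ("P4", 3, 5)]
print(sjf(jobs, preemptive=False))  # P1 finishes at 8, P2 at 12, P4 at 17, P3 at 26
print(sjf(jobs, preemptive=True))   # P2 finishes at 5, P4 at 10, P1 at 17, P3 at 26
```

In the preemptive run P2 takes the CPU from P1 at time 1 because its burst (4) is shorter than P1's remaining time (7).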
• Priority Scheduling. Each process is given a priority, then CPU executes process with highest priority. If
multiple processes with same priority are runnable, use some other criteria - typically FCFS. SJF is an
example of a priority-based scheduling algorithm. With the exponential decay algorithm above, the priorities
of a given process change over time.
• Assume we have 5 processes P1 (burst time 10, priority 3), P2 (burst time 1, priority 1), P3 (burst time
2, priority 3), P4 (burst time 1, priority 4), P5 (burst time 5, priority 2). Lower numbers represent higher
priorities. What would a standard priority scheduler do?
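The five-process example can be worked out with a stable sort. This sketch assumes all processes are ready at time 0, the scheduler is non-preemptive, and ties fall back to FCFS (list order); priority_schedule is a hypothetical helper name.

```python
def priority_schedule(jobs):
    """jobs: list of (name, burst, priority) in arrival order, all ready at time 0.

    Lower priority number runs first. Returns (run_order, waiting_times).
    """
    # sorted() is stable, so equal priorities keep their FCFS order.
    ordered = sorted(jobs, key=lambda job: job[2])
    clock, waits = 0, {}
    for name, burst, _ in ordered:
        waits[name] = clock
        clock += burst
    return [job[0] for job in ordered], waits


jobs = [("P1", 10, 3), ("P2", 1, 1), ("P3", 2, 3), ("P4", 1, 4), ("P5", 5, 2)]
order, waits = priority_schedule(jobs)
print(order)                             # ['P2', 'P5', 'P1', 'P3', 'P4']
print(sum(waits.values()) / len(waits))  # average waiting time 8.2
```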
• Big problem with priority scheduling algorithms: starvation or blocking of low-priority processes. Can
use aging to prevent this - make the priority of a process go up the longer it stays runnable but isn't run.
• What about interactive systems? Cannot just let any process run on the CPU until it gives it up - must
give response to users in a reasonable time. So, use an algorithm called round-robin scheduling. Similar to
FCFS but with preemption. Have a time quantum or time slice. Let the first process in the queue run until it
expires its quantum (i.e. runs for as long as the time quantum), then run the next process in the queue.
• Implementing round-robin requires timer interrupts. When scheduling a process, set the timer to go off
after the time quantum amount of time expires. If process does IO before timer goes off, no problem - just
run next process. But if process expires its quantum, do a context switch. Save the state of the running
process and run the next process.
• How well does RR work? Well, it gives good response time, but can give bad waiting time. Consider the
waiting times under round robin for 3 processes P1 (burst time 24), P2 (burst time 3), and P3 (burst time 4)
with time quantum 4. What happens, and what is average waiting time? What gives best waiting time?
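The round-robin question can be answered with a small queue simulation. This is a sketch under simplifying assumptions: all three processes are ready at time 0, none does IO, and context switches are free. The function name round_robin is made up.

```python
from collections import deque


def round_robin(jobs, quantum):
    """jobs: list of (name, burst), all ready at time 0.

    Returns {name: waiting_time}, where waiting time is the time spent
    ready but not running (completion time minus CPU time used).
    """
    bursts = dict(jobs)
    queue = deque(jobs)
    clock, waits = 0, {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        clock += run
        if remaining > run:
            queue.append((name, remaining - run))   # quantum expired: back of the queue
        else:
            waits[name] = clock - bursts[name]
    return waits


waits = round_robin([("P1", 24), ("P2", 3), ("P3", 4)], quantum=4)
print(waits)                             # {'P2': 4, 'P3': 7, 'P1': 7}
print(sum(waits.values()) / len(waits))  # 6.0
```

For comparison, running the same jobs shortest-first (P2, P3, P1) gives waiting times 0, 3 and 7, an average of about 3.3, so round robin trades some waiting time for better response time.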
• What happens with a really small quantum? It looks like you've got a CPU that is 1/n as powerful
as the real CPU, where n is the number of processes. Problem with a small quantum - context switch
overhead.
• What about having a really small quantum supported in hardware? Then, you have something called
multithreading. Give the CPU a bunch of registers and heavily pipeline the execution. Feed the processes
into the pipe one by one. Treat memory access like IO - suspend the thread until the data comes back from
the memory. In the meantime, execute other threads. Use computation to hide the latency of accessing
memory.
• What about a really big quantum? It turns into FCFS. Rule of thumb - want 80 percent of CPU bursts to
be shorter than time quantum.
• Multilevel Queue Scheduling - like RR, except have multiple queues. Typically, classify processes into
separate categories and give a queue to each category. So, might have system, interactive and batch
processes, with the priorities in that order. Could also allocate a percentage of the CPU to each queue.
• Multilevel Feedback Queue Scheduling - Like multilevel scheduling, except processes can move between
queues as their priority changes. Can be used to give IO bound and interactive processes CPU priority over
CPU bound processes. Can also prevent starvation by increasing the priority of processes that have been idle
for a long time.
• A simple example of a multilevel feedback queue scheduling algorithm. Have 3 queues, numbered 0, 1,
2 with corresponding priority. So, for example, execute a task in queue 2 only when queues 0 and 1 are
empty.
• A process goes into queue 0 when it becomes ready. When running a process from queue 0, give it a
quantum of 8 ms. If it expires its quantum, move to queue 1. When executing a process from queue 1, give it
a quantum of 16 ms. If it expires its quantum, move to queue 2. In queue 2, run a RR scheduler with a large
quantum if in an interactive system or an FCFS scheduler if in a batch system. Of course, preempt queue 2
processes when a new process becomes ready.
• Another example of a multilevel feedback queue scheduling algorithm: the Unix scheduler. We will go
over a simplified version that does not include kernel priorities. The point of the algorithm is to fairly allocate
the CPU between processes, with processes that have not recently used a lot of CPU resources given priority
over processes that have.
• Processes are given a base priority of 60, with lower numbers representing higher priorities. The system
clock generates an interrupt between 50 and 100 times a second, so we will assume a value of 60 clock
interrupts per second. The clock interrupt handler increments a CPU usage field in the PCB of the interrupted
process every time it runs.
• The system always runs the highest priority process. If there is a tie, it runs the process that has been
ready longest. Every second, it recalculates the priority and CPU usage field for every process according to
the following formulas.
    CPU usage field = CPU usage field / 2
    Priority = CPU usage field / 2 + base priority
• So, when a process has not used much CPU recently, its priority rises. The priorities of IO bound
processes and interactive processes therefore tend to be high and the priorities of CPU bound processes
tend to be low (which is what you want).
• Unix also allows users to provide a "nice" value for each process. Nice values modify the priority
calculation as follows:
    Priority = CPU usage field / 2 + base priority + nice value
So, you can reduce the priority of your process to be "nice" to other processes (which may include your
own).
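The once-per-second recalculation can be sketched as below. The use of integer division and the names recalc, BASE_PRIORITY are assumptions made for this example; the notes only give the two formulas.

```python
BASE_PRIORITY = 60   # lower numbers mean higher priority


def recalc(cpu_usage, nice=0):
    """Apply the once-per-second decay and return (new_cpu_usage, priority).

    Integer division is an assumption; the notes do not say how the
    division is rounded.
    """
    cpu_usage = cpu_usage // 2
    priority = cpu_usage // 2 + BASE_PRIORITY + nice
    return cpu_usage, priority


# A CPU bound process that was charged all 60 clock ticks in the last second:
usage, prio = recalc(60)
print(usage, prio)        # 30 75
# If it then stays off the CPU, its usage decays and its priority improves:
usage, prio = recalc(usage)
print(usage, prio)        # 15 67
# A process that used no CPU sits at the base priority:
print(recalc(0))          # (0, 60)
```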
• In general, multilevel feedback queue schedulers are complex pieces of software that must be tuned to
meet requirements.
• Anomalies and system effects associated with schedulers.
• Priority interacts with synchronization to create a really nasty effect called priority inversion. A priority
inversion happens when a low-priority thread acquires a lock, then a high-priority thread tries to acquire the
lock and blocks. Any middle-priority threads will prevent the low-priority thread from running and unlocking
the lock. In effect, the middle-priority threads block the high-priority thread.
• How to prevent priority inversions? Use priority inheritance. Any time a thread holds a lock that other
threads are waiting on, give the thread the priority of the highest-priority thread waiting to get the lock.
Problem is that priority inheritance makes the scheduling algorithm less efficient and increases the
overhead.
• Preemption can interact with synchronization in a multiprocessor context to create another nasty effect -
the convoy effect. One thread acquires the lock, then suspends. Other threads come along, and need to
acquire the lock to perform their operations. Everybody suspends until the thread that holds the lock wakes
up. At this point the threads are synchronized, and will convoy their way through the lock, serializing the
computation. So, drives down the processor utilization.
• If have non-blocking synchronization via operations like LL/SC, don't get convoy effects caused by
suspending a thread competing for access to a resource. Why not? Because threads don't hold resources
and prevent other threads from accessing them.
• Similar effect when scheduling CPU and IO bound processes. Consider a FCFS algorithm with several IO
bound and one CPU bound process. All of the IO bound processes execute their bursts quickly and queue up
for access to the IO device. The CPU bound process then executes for a long time. During this time all of the
IO bound processes have their IO requests satisfied and move back into the run queue. But they don't run -
the CPU bound process is running instead - so the IO device idles. Finally, the CPU bound process gets off
the CPU, and all of the IO bound processes run for a short time then queue up again for the IO devices.
Result is poor utilization of IO device - it is busy for a time while it processes the IO requests, then idle while
the IO bound processes wait in the run queues for their short CPU bursts. In this case an easy solution is to
give IO bound processes priority over CPU bound processes.
• In general, a convoy effect happens when a set of processes need to use a resource for a short time,
and one process holds the resource for a long time, blocking all of the other processes. Causes poor
utilization of the other resources in the system.
Overview of File System of Operating System:
When does a process need to access OS functionality? Here are several examples:
    Reading a file. The OS must perform the file system operations required to read the data off of disk.
    Creating a child process. The OS must set stuff up for the child process.
    Sending a packet out onto the network. The OS typically handles the network interface.
Why have the OS do these things? Why doesn't the process just do them directly?
    Convenience. Implement the functionality once in the OS and encapsulate it behind an interface
    that everyone uses. So, processes just deal with the simple interface, and don't have to write
    complicated low-level code to deal with devices.
    Portability. OS exports a common interface typically available on many hardware platforms.
    Applications do not contain hardware-specific code.
    Protection. If give applications complete access to disk or network or whatever, they can corrupt
    data from other applications, either maliciously or because of bugs. Having the OS do it eliminates
    security problems between applications. Of course, applications still have to trust the OS.
• How do processes invoke OS functionality? By making a system call. Conceptually, processes call a
subroutine that goes off and performs the required functionality. But OS must execute in a different
protection domain than the application. Typically, OS executes in supervisor mode, which allows it to do
things like manipulate the disk directly.
• To switch from normal user mode to supervisor mode, most machines provide a system call instruction.
This instruction causes an exception to take place. The hardware switches from user mode to supervisor
mode and invokes the exception handler inside the operating system. There is typically some kind of
convention that the process uses to interact with the OS.
• Let's do an example - the Open system call. System calls typically start out with a normal subroutine
call. In this case, when the process wants to open a file, it just calls the Open routine in a system library
someplace.
    /* Open the Nachos file "name", and return an "OpenFileId" that can
     * be used to read and write to the file. */
    OpenFileId Open(char *name);
• Inside the library, the Open subroutine executes a syscall instruction, which generates a system call
exception.
    Open:
        addiu $2,$0,SC_Open
        syscall
        j $31
        .end Open
By convention, the Open subroutine puts a number (in this case SC_Open) into register 2. Inside the
exception handler the OS looks at register 2 to figure out what system call it should perform.
• The Open system call also takes a parameter - the address of the character string giving the name of
the file to open. By convention, the compiler puts this parameter into register 4 when it generates the code
that calls the Open routine in the library. So, the OS looks in that register to find the address of the name of
the file to open.
• More conventions: succeeding parameters are put into register 5, register 6, etc. Any return values from
the system call are put into register 2.
• Inside the exception handler, the OS figures out what action to take, performs the action, then returns
back to the user program.
• There are other kinds of exceptions. For example, if the program attempts to dereference a NULL pointer,
the hardware will generate an exception. The OS will have to figure out what kind of exception took place
and handle it accordingly. Another kind of exception is a divide by 0 fault.
• Similar things happen on an interrupt. When an interrupt occurs, the hardware puts the OS into
supervisor mode and invokes an interrupt handler. The difference between interrupts and exceptions is that
interrupts are generated by external events (the disk IO completes, a new character is typed at the console,
etc.) while exceptions are generated by a running program.
• Object file formats. To run a process, the OS must load in an executable file from the disk into memory.
What does this file contain? The code to run, any initialized data, and a specification for how much space the
uninitialized data takes up. May also be other stuff to help debuggers run, etc.
• The compiler, linker and OS must agree on a format for the executable file. For example, Nachos uses
the following format for executables:
    #define NOFFMAGIC 0xbadfad    /* magic number denoting Nachos
                                   * object code file */
    typedef struct segment {
        int virtualAddr;          /* location of segment in virt addr space */
        int inFileAddr;           /* location of segment in this file */
        int size;                 /* size of segment */
    } Segment;
    typedef struct noffHeader {
        int noffMagic;            /* should be NOFFMAGIC */
        Segment code;             /* executable code segment */
        Segment initData;         /* initialized data segment */
        Segment uninitData;       /* uninitialized data segment --
                                   * should be zero'ed before use */
    } NoffHeader;
• What does the OS do when it loads an executable in?
    Reads in the header part of the executable.
    Checks to see if the magic number matches.
    Figures out how much space it needs to hold the process. This includes space for the stack, the
    code, the initialized data and the uninitialized data.
    If it needs to hold the entire process in physical memory, it goes off and finds the physical memory
    it needs to hold the process.
    It then reads the code segment in from the file to physical memory.
    It then reads the initialized data segment in from the file to physical memory.
    It zeros the stack and uninitialized memory.
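The first three loading steps (read the header, check the magic number, work out the space needed) can be sketched as follows. This is not Nachos code: it assumes the header is ten little-endian 32-bit integers with no padding, and the names parse_noff_header and address_space_size are made up for this example.

```python
import struct
from collections import namedtuple

NOFFMAGIC = 0xBADFAD
Segment = namedtuple("Segment", "virtual_addr in_file_addr size")

# Assumption: 1 magic int + 3 segments of 3 ints each, little-endian, unpadded.
HEADER = struct.Struct("<10i")


def parse_noff_header(data):
    """Return (code, init_data, uninit_data) segments, or raise on a bad magic number."""
    fields = HEADER.unpack_from(data)
    if fields[0] != NOFFMAGIC:
        raise ValueError("not a NOFF executable")
    return tuple(Segment(*fields[i:i + 3]) for i in (1, 4, 7))


def address_space_size(code, init_data, uninit_data, stack_size):
    """Space needed for code, both data segments and the stack."""
    return code.size + init_data.size + uninit_data.size + stack_size


# Build a fake header in memory instead of reading a real file.
raw = HEADER.pack(NOFFMAGIC, 0, 40, 256, 256, 296, 64, 320, 0, 128)
code, init_data, uninit_data = parse_noff_header(raw)
print(address_space_size(code, init_data, uninit_data, stack_size=1024))  # 1472
```

A real loader would go on to copy the code and initialized data segments from the file offsets given by in_file_addr and zero the rest.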
• How does the operating system do IO? First, we give an overview of how the hardware does IO.
• There are two basic ways to do IO - memory mapped IO and programmed IO.
    Memory mapped IO - the control registers on the IO device are mapped into the memory space of
    the processor. The processor controls the device by performing reads and writes to the addresses
    that the IO device is mapped into.
    Programmed IO - the processor has special IO instructions like IN and OUT. These control the IO
    device directly.
• Writing the low level, complex code to control devices can be a very tricky business. So, the OS
encapsulates this code inside things called device drivers. There are several standard interfaces that device
drivers present to the kernel. It is the job of the device driver to implement its standard interface for its
device. The rest of the OS can then use this interface and doesn't have to deal with complex IO code.
• For example, Unix has a block device driver interface. All block device drivers support a standard set of
calls like open, close, read and write. The disk device driver, for example, translates these calls into
operations that read and write sectors on the disk.
• Typically, IO takes place asynchronously with respect to the processor. So, the processor will start an IO
operation (like writing a disk sector), then go off and do some other processing. When the IO operation
completes, it interrupts the processor. The processor is typically vectored off to an interrupt handler, which
takes whatever action needs to take place.
• Here is how Nachos does IO. Each device presents an interface. For example, the disk interface is in
disk.h, and has operations to start a read and write request. When the request completes, the "hardware"
invokes the HandleInterrupt method.
• Only one thread can use each device at a time. Also, threads typically want to use devices
synchronously. So, for example, a thread will perform a disk operation then wait until the disk operation
completes. Nachos therefore encapsulates the device interface inside a higher level interface that provides
synchronous, synchronized access to the device. For the disk device, this interface is in synchdisk.h. This
provides operations to read and write sectors, for example.
• Each method in the synchronous interface ensures exclusive access to the IO device by acquiring a lock
before it performs any operation on the device.
• When the synchronous method gets exclusive access to the device, it performs the operation to start the
IO. It then uses a semaphore (P operation) to block until the IO operation completes. When the IO operation
completes, it invokes an interrupt handler. This handler performs a V operation on the semaphore to unblock
the synchronous method. The synchronous method then releases the lock and returns back to the calling
thread.
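The lock-plus-semaphore pattern can be sketched with Python's threading module. This is an illustration of the idea, not the Nachos implementation: the classes AsyncDisk and SynchDisk are invented here, and a timer thread stands in for the disk's completion interrupt.

```python
import threading


class AsyncDisk:
    """Stand-in for the raw device: starts a request, later calls the handler."""

    def __init__(self, on_complete):
        self.on_complete = on_complete
        self.sectors = {}

    def start_write(self, sector, data):
        def finish():
            self.sectors[sector] = data
            self.on_complete()      # plays the role of the completion interrupt

        threading.Timer(0.01, finish).start()


class SynchDisk:
    """Synchronous, synchronized wrapper around the asynchronous device."""

    def __init__(self):
        self.lock = threading.Lock()            # one thread on the device at a time
        self.done = threading.Semaphore(0)      # signalled by the "interrupt handler"
        self.disk = AsyncDisk(on_complete=self.done.release)   # handler does V

    def write_sector(self, sector, data):
        with self.lock:
            self.disk.start_write(sector, data)
            self.done.acquire()                 # P: block until the IO completes


disk = SynchDisk()
disk.write_sector(3, b"hello")
print(disk.disk.sectors[3])   # b'hello'
```

The semaphore starts at 0, so the caller always blocks in acquire() until the completion handler releases it, whichever of the two runs first.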
Overview of Memory Management System:
• Point of memory management algorithms - support sharing of main memory. We will focus on having
multiple processes sharing the same physical memory. Key issues:
    Protection. Must allow one process to protect its memory from access by other processes.
    Naming. How do processes identify shared pieces of memory?
    Transparency. How transparent is sharing? Does user program have to manage anything explicitly?
    Efficiency. Any memory management strategy should not impose too much of a performance
    burden.
• Why share memory between processes? Because want to multiprogram the processor. To time share
system, to overlap computation and I/O. So, must provide for multiple processes to be resident in physical
memory at the same time. Processes must share the physical memory.
• Historical Development.
    For first computers, loaded one program onto machine and it executed to completion. No sharing
    required. OS was just a subroutine library, and there was no protection. What addresses does
    program generate?
    Desire to increase processor utilization in the face of long I/O delays drove the adoption of
    multiprogramming. So, one process runs until it does I/O, then OS lets another process run. How do
    processes share memory? Alternatives:
    o Load both processes into memory, then switch between them under OS control. Must relocate
    program when load it. Big Problem: Protection. A bug in one process can kill the other process.
    MS-DOS, MS-Windows use this strategy.
    o Copy entire memory of process to disk when it does I/O, then copy back when it restarts. No
    need to relocate when load. Obvious performance problems. Early version of Unix did this.
    o Do access checking on each memory reference. Give each program a piece of memory that it
    can access, and on every memory reference check that it stays within its address space. Typical
    mechanism: base and bounds registers. Where is check done? Answer: in hardware for speed.
    When OS runs process, loads the base and bounds registers for that process. Cray-1 did this.
    Note: there is now a translation process. Program generates virtual addresses that get
    translated into physical addresses. But, no longer have a protection problem: one process
    cannot access another's memory, because it is outside its address space. If it tries to access it,
    the hardware will generate an exception.
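The base and bounds check is simple enough to write out. In this sketch a Python exception stands in for the hardware exception; the names AddressFault and translate_base_bounds, and the register values, are illustrative.

```python
class AddressFault(Exception):
    """Stands in for the hardware exception raised on an out-of-range access."""


def translate_base_bounds(vaddr, base, bounds):
    """Translate a virtual address using one process's base and bounds registers."""
    if not 0 <= vaddr < bounds:
        raise AddressFault(f"address {vaddr} outside address space of size {bounds}")
    return base + vaddr


# Hypothetical process loaded at physical address 5000 with 1000 bytes of memory.
print(translate_base_bounds(100, base=5000, bounds=1000))   # 5100
```

Because every address is offset by the base register, the program needs no relocation when it is loaded at a different physical address.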
• End up with a model where physical memory of machine is dynamically allocated to processes as they
enter and exit the system. Variety of allocation strategies: best fit, first fit, etc. All suffer from external
fragmentation. In worst case, may have enough memory free to load a process, but can't use it because it is
fragmented into little pieces.
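A small free-list allocator shows both strategies and the fragmentation problem. This is a sketch: the hole list, sizes and the function name allocate are invented for illustration, and freeing and coalescing of holes are left out.

```python
def allocate(holes, size, strategy="first"):
    """Carve `size` units out of a free list.

    holes: list of (start, length) free regions, modified in place.
    Returns the start address, or None if no single hole is big enough.
    """
    candidates = [i for i, (_, length) in enumerate(holes) if length >= size]
    if not candidates:
        return None
    if strategy == "best":
        i = min(candidates, key=lambda k: holes[k][1])   # smallest hole that fits
    else:
        i = candidates[0]                                # first hole that fits
    start, length = holes[i]
    if length == size:
        del holes[i]
    else:
        holes[i] = (start + size, length - size)
    return start


holes = [(0, 100), (300, 60), (500, 100)]
print(allocate(holes, 50, "best"))    # 300: the 60-unit hole is the tightest fit
print(allocate(holes, 50, "first"))   # 0: the first hole that is large enough
print(sum(length for _, length in holes))   # 160 units are still free in total...
print(allocate(holes, 120))           # ...but None: no single hole holds 120
```

The last call is external fragmentation in miniature: enough memory is free, but it is split into pieces that are each too small.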
• What if cannot find a space big enough to run a process? Either because of fragmentation or because
physical memory is too small to hold all address spaces. Can compact and relocate processes (easy with
base and bounds hardware, not so easy for direct physical address machines). Or, can swap a process out to
disk then restore when space becomes available. In both cases incur copying overhead. When move process
within memory, must copy between memory locations. When move to disk, must copy back and forth to
disk.
• One way to avoid external fragmentation: allocate physical memory to processes in fixed size chunks
called page frames. Present abstraction to application of a single linear address space. Inside machine,
break address space of application up into fixed size chunks called pages. Pages and page frames are same
size. Store pages in page frames. When process generates an address, dynamically translate to the physical
page frame which holds data for that page.
• So, a virtual address now consists of two pieces: a page number and an offset within that page. Page
sizes are typically powers of 2; this simplifies extraction of page numbers and offsets. To access a piece of
data at a given address, system automatically does the following:
    Extracts page number.
    Extracts offset.
    Translates page number to physical page frame id.
    Accesses data at offset in physical page frame.
• How does system perform translation? Simplest solution: use a page table. Page table is a linear array
indexed by virtual page number that gives the physical page frame that contains that page. What is lookup
process?
    Extract page number.
    Extract offset.
    Check that page number is within address space of process.
    Look up page number in page table.
    Add offset to the start address of the resulting physical page frame.
    Access memory location.
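The lookup steps above can be sketched with shifts and masks. The 4 KB page size, the list-based page table and the names PageFault and translate_paged are assumptions chosen for the example.

```python
OFFSET_BITS = 12                 # assumed page size: 2**12 = 4096 bytes
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """Raised when the page number is outside the process's address space."""


def translate_paged(vaddr, page_table):
    """page_table[i] is the physical frame number holding virtual page i."""
    page = vaddr >> OFFSET_BITS            # extract page number
    offset = vaddr & (PAGE_SIZE - 1)       # extract offset
    if page >= len(page_table) or page_table[page] is None:
        raise PageFault(f"no mapping for page {page}")
    frame = page_table[page]               # look up page number in page table
    return (frame << OFFSET_BITS) | offset # frame start address plus offset


page_table = [7, 3, None]                  # hypothetical: page 0 -> frame 7, page 1 -> frame 3
print(hex(translate_paged(0x1234, page_table)))   # 0x3234
```

Because the page size is a power of 2, extracting the page number and offset is a shift and a mask rather than a division.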
• With paging, still have protection. One process cannot access a piece of physical memory unless its page
table points to that physical page. So, if the page tables of two processes point to different physical pages,
the processes cannot access each other's physical memory.
• Fixed size allocation of physical memory in page frames dramatically simplifies allocation algorithm. OS
can just keep track of free and used pages and allocate free pages when a process needs memory. There is
no fragmentation of physical memory into smaller and smaller allocatable chunks.
• But, there are still pieces of memory that are unused. What happens if a program's address space does not
end on a page boundary? Rest of page goes unused. This kind of memory loss is called internal
fragmentation.
Introduction of Paging in Operating System:
• Basic idea: allocate physical memory to processes in fixed size chunks called page frames. Present
abstraction to application of a single linear address space. Inside machine, break address space of
application up into fixed size chunks called pages.
Pages and page frames are same size. Store pages in page frames. When process generates an address,
dynamically translate to the physical page frame which holds data for that page.
• So, a virtual address now consists of two pieces: a page number and an offset within that page. Page
sizes are typically powers of 2; this simplifies extraction of page numbers and offsets. To access a piece of
data at a given address, system automatically does the following:
    Extracts page number.
    Extracts offset.
    Translates page number to physical page frame id.
    Accesses data at offset in physical page frame.
• How does system perform translation? Simplest solution: use a page table. Page table is a linear array
indexed by virtual page number that gives the physical page frame that contains that page. What is lookup
process?
    Extract page number.
    Extract offset.
    Check that page number is within address space of process.
    Look up page number in page table.
    Add offset to the start address of the resulting physical page frame.
    Access memory location.
Problem: for each memory access that processor generates, must now generate two physical memory
accesses (one to read the page table entry, one for the data itself).
• Speed up the lookup problem with a cache. Store most recent page lookup values in TLB. TLB design
options: fully associative, direct mapped, set associative, etc. Can make direct mapped larger for a given
amount of circuit space.
• How does lookup work now?
    Extract page number.
    Extract offset.
    Look up page number in TLB.
    If there, add offset to the start address of the physical page frame and access memory location.
    Otherwise, trap to OS. OS performs check, looks up physical page number, and loads translation
    into TLB. Restarts the instruction.
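A direct mapped TLB in front of a page table can be sketched as follows. The class name DirectMappedTLB, the 4 KB page size and the slot count are assumptions for illustration, and a counter stands in for the trap to the OS on a miss.

```python
class DirectMappedTLB:
    """Each virtual page can live in exactly one slot: page number mod slot count."""

    def __init__(self, page_table, slots=4):
        self.page_table = page_table
        self.slots = [None] * slots        # each entry is (page, frame) or None
        self.hits = self.misses = 0

    def translate(self, vaddr):
        page, offset = vaddr >> 12, vaddr & 0xFFF    # assumed 4 KB pages
        index = page % len(self.slots)
        entry = self.slots[index]
        if entry is not None and entry[0] == page:
            self.hits += 1
        else:
            self.misses += 1               # models the trap to the OS
            if page >= len(self.page_table):
                raise IndexError(f"page {page} outside address space")
            entry = (page, self.page_table[page])
            self.slots[index] = entry      # OS loads the translation into the TLB
        return (entry[1] << 12) | offset


tlb = DirectMappedTLB(page_table=list(range(100, 108)), slots=4)
tlb.translate(0x0010)    # page 0: miss, loaded into slot 0
tlb.translate(0x0020)    # page 0 again: hit
tlb.translate(0x4000)    # page 4 also maps to slot 0: miss, evicts page 0
tlb.translate(0x0000)    # page 0 again: miss
print(tlb.hits, tlb.misses)   # 1 3
```

Pages 0 and 4 collide in slot 0, so a program that alternates between them misses every time. That is the bad case for a direct mapped TLB; a fully associative TLB of the same size would hold both translations.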
• Like any cache, TLB can work well, or it can work poorly. What is a good and bad case for a direct
mapped TLB? What about fully associative TLBs, or set associative TLBs?
• Fixed size allocation of physical memory in page frames dramatically simplifies allocation algorithm. OS
can just keep track of free and used pages and allocate free pages when a process needs memory. There is
no fragmentation of physical memory into smaller and smaller allocatable chunks.
• But, there are still pieces of memory that are unused. What happens if a program's address space does not
end on a page boundary? Rest of page goes unused. Book calls this internal fragmentation.
• How do processes share memory? The OS makes their page tables point to the same physical page
frames. Useful for fast interprocess communication mechanisms. This is very nice because it allows
transparent sharing at speed.
• What about protection? There are a variety of protections:
    Preventing one process from reading or writing another process' memory.
    Preventing one process from reading another process' memory.
    Preventing a process from reading or writing some of its own memory.
    Preventing a process from reading some of its own memory.
How is this protection integrated into the above scheme?
• Preventing a process from reading or writing memory: OS refuses to establish a mapping from virtual
address space to physical page frame containing the protected memory. When program attempts to access
this memory, OS will typically generate a fault. If user process catches the fault, can take action to fix things
up.
• Preventing a process from writing memory, but allowing a process to read memory. OS sets a write
protect bit in the TLB entry. If process attempts to write the memory, OS generates a fault. But, reads go
through just fine.
• Virtual Memory Introduction.
• When a segmented system needed more memory, it swapped segments out to disk and then swapped
them back in again when necessary. Page based systems can do something similar on a page basis.
• Basic idea: when OS needs a physical page frame to store a page, and there are none free, it can
select one page and store it out to disk. It can then use the newly free page frame for the new page. Some
pragmatic considerations:
    In practice, it makes sense to keep a few free page frames. When number of free pages drops
    below this threshold, choose a page and store it out. This way, can overlap I/O required to store out
    a page with computation that uses the newly allocated page frame.
    In practice the page frame size usually equals the disk block size. Why?
    Do you need to allocate disk space for a virtual page before you swap it out? (Not if always keep
    one page frame free) Why did BSD do this? At some point OS must refuse to allocate a process
    more memory because has no swap space. When can this happen? (malloc, stack extension, new
    process creation).
• When process tries to access paged out memory, OS must find a free page frame, run off to the disk,
then read page back off of disk into the page frame and restart process.
• What is advantage of virtual memory/paging?
    Can run programs whose virtual address space is larger than physical memory. In effect, one
    process shares physical memory with itself.
    Can also flexibly share machine between processes whose total address space sizes exceed the
    physical memory size.
    Supports a wide range of user-level stuff - See Li and Appel paper.
• Disadvantages of VM/paging: extra resource consumption.
    Memory overhead for storing page tables. In extreme cases, page table may take up a significant
    portion of virtual memory. One Solution: page the page table. Others: go to a more complicated
    data structure for storing virtual to physical translations.
    Translation overhead.
Lexical Analysis
Lexical Analyser
Regular Expressions and Strings
Languages
Regular Expressions
Example: Grammar Format
Working Example
Automata
Converting NFA to DFA
Converting Regular Expression to NFA
Lexical Analyser:
Introduction of Regular Expressions and Strings:
Introduction of Language in Lexical Analysis:
Introduction to Regular Expressions:
Introduction to Regular Grammar:
Examples:
Introduction To Finite Automata:
Conversion of Automata:
Regular -"pression to N,A Conversion:
For More material See www.computertech-dovari.blogspot.com
For More material See www.computertech-dovari.blogspot.com
Database Management Systems
Database System Concept and Architecture
Entity Relationship and Enhanced E-R
Relational Data Model and Relational Algebra
Relational Database Design
Query Language - SQL
Normalization
Database System Concept:
The term database originated within the computing discipline. Although its meaning
has been broadened by popular use, even to include non-electronic databases, this
section is about computer databases.
A computer database is a structured collection of records or data that is stored in a
computer system so that a computer program or person using a query language
can consult it to answer queries. The records retrieved
in answer to queries are information that can be used to make decisions.
The computer program used to manage and query a database is known as a
database management system (DBMS). The properties and design of database
systems are included in the study of information science.
A typical query could be a question such as, "How many hamburgers with two or
more beef patties were sold in the month of March in New Jersey?". To answer such
a question, the database would have to store information about hamburgers sold,
including number of patties, sales date, and the region.
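As a rough illustration of such a query, the sketch below stores a few made-up sales records in an in-memory SQLite database and counts the matching ones. The table name `hamburger_sales`, its columns and the sample rows are assumptions introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE hamburger_sales (
           sale_id   INTEGER PRIMARY KEY,
           patties   INTEGER NOT NULL,
           sale_date TEXT NOT NULL,      -- ISO date, e.g. 2024-03-05
           region    TEXT NOT NULL)"""
)
conn.executemany(
    "INSERT INTO hamburger_sales (patties, sale_date, region) VALUES (?, ?, ?)",
    [
        (1, "2024-03-02", "New Jersey"),  # too few patties
        (2, "2024-03-05", "New Jersey"),  # matches
        (3, "2024-03-20", "New Jersey"),  # matches
        (2, "2024-04-01", "New Jersey"),  # wrong month
        (2, "2024-03-09", "New York"),    # wrong region
    ],
)

# "How many hamburgers with two or more patties were sold in March in New Jersey?"
(count,) = conn.execute(
    """SELECT COUNT(*) FROM hamburger_sales
       WHERE patties >= 2
         AND region = 'New Jersey'
         AND strftime('%m', sale_date) = '03'"""
).fetchone()
print(count)  # 2
```

The query only works because the schema records the number of patties, the sale date and the region, which is the point made above. As written it matches March of any year.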
The central concept of a database is that of a collection of records, or pieces of
information. Typically, for a given database, there is a structural description of the
type of facts held in that database: this description is known as a schema. The
schema describes the objects that are represented in the database, and the
relationships among them. There are a number of different ways of organizing a
schema, that is, of modeling the database structure: these are known as database
models (or data models). The model in most common use today is the relational
model, which in layman's terms represents all information in the form of multiple
related tables each consisting of rows and columns (the formal definition uses
mathematical terminology). This model represents relationships by the use of
values common to more than one table. Other models such as the hierarchical
model and the network model use a more explicit representation of relationships.
The term database refers to the collection of related records, and the software
should be referred to as the database management system or DBMS. When the
context is unambiguous, however, many database administrators and programmers
use the term database to cover both meanings.
Many professionals consider a collection of data to constitute a database only if it
has certain properties: for example, if the data is managed to ensure its integrity
and quality, if it allows shared access by a community of users, if it has a schema,
or if it supports a query language. However, there is no definition of these
properties that is universally agreed upon.
Database management systems are usually categorized according to the data
model that they support: relational, object-relational, network, and so on. The data
model will tend to determine the query languages that are available to access the
database. A great deal of the internal engineering of a DBMS, however, is
independent of the data model, and is concerned with managing factors such as
performance, concurrency, integrity, and recovery from hardware failures. In these
areas there are large differences between products.
Purpose of Database System:

To see why database management systems are necessary, let's look at
a typical "file-processing system" supported by a conventional
operating system.
The application is a savings bank:
o Savings account and customer records are kept in permanent
system files.
o Application programs are written to manipulate files to perform the
following tasks:
   Debit or credit an account.
   Add a new account.
   Find an account balance.
   Generate monthly statements.
Development of the system proceeds as follows:
o New application programs must be written as the need arises.
o New permanent files are created as required.
o But over a long period of time files may be in different formats, and
o Application programs may be in different languages.

So we can see there are problems with the straight file-processing
approach:
Data redundancy and inconsistency
o Same information may be duplicated in several places.
o All copies may not be updated properly.
Difficulty in accessing data
o May have to write a new application program to satisfy an
unusual request.
o E.g. find all customers with the same postal code.
o Could generate this data manually, but a long job...
Data isolation
o Data in different files.
o Data in different formats.
o Difficult to write new application programs.
Multiple users
o Want concurrency for faster response time.
o Need protection for concurrent updates.
o E.g. two customers withdrawing funds from the same account
at the same time - account has $500 in it, and they withdraw
$100 and $50. The result could be $350, $400 or $450 if no
protection.
Security problems
o Every user of the system should be able to access only the
data they are permitted to see.
o E.g. payroll people only handle employee records, and cannot
see customer accounts; tellers only access account data and
cannot see payroll data.
o Difficult to enforce this with application programs.
Integrity problems
o Data may be required to satisfy constraints.
o E.g. no account balance below $25.00.
o Again, difficult to enforce or to change constraints with the
file-processing approach.
These problems and others led to the development of database
management systems.
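The concurrent-withdrawal problem listed under "Multiple users" can be reproduced with a small simulation. The sketch below is only an illustration: it assumes each withdrawal is an unprotected "read balance" step followed by a "write new balance" step, and the names `T1`, `T2`, `run` and `is_valid` are invented for the example.

```python
from itertools import permutations

WITHDRAWALS = {"T1": 100, "T2": 50}          # the two withdrawals from the text
STEPS = [(txn, op) for txn in WITHDRAWALS for op in ("read", "write")]


def run(schedule, start=500):
    """Replay one interleaving of unprotected read/write steps."""
    balance = start
    seen = {}                                # balance each transaction read
    for txn, op in schedule:
        if op == "read":
            seen[txn] = balance
        else:
            # The write uses the (possibly stale) value read earlier.
            balance = seen[txn] - WITHDRAWALS[txn]
    return balance


def is_valid(schedule):
    """Each transaction must read before it writes."""
    return all(
        schedule.index((txn, "read")) < schedule.index((txn, "write"))
        for txn in WITHDRAWALS
    )


outcomes = {run(s) for s in permutations(STEPS) if is_valid(s)}
print(sorted(outcomes))  # [350, 400, 450]
```

Only the schedules in which one withdrawal finishes before the other starts give the correct $350. A DBMS prevents the other two outcomes through concurrency control.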
Database System Architecture:
Database-centric architecture or data-centric architecture has several distinct
meanings, generally relating to software architectures in which databases play a
crucial role. Often this description is meant to contrast the design to an alternative
approach. For example, the characterization of an architecture as "database-
centric" may mean any combination of the following:
o using a standard, general-purpose relational database management
system, as opposed to customized in-memory or file-based data
structures and access methods. With the evolution of sophisticated
DBMS software, much of which is either free or included with the
operating system, application developers have become increasingly
reliant on standard database tools, especially for the sake of rapid
application development.
o using dynamic, table-driven logic, as opposed to logic embodied in
previously compiled programs. The use of table-driven logic, i.e. behavior
that is heavily dictated by the contents of a database, allows programs to
be simpler and more flexible. This capability is a central feature of dynamic
programming languages.
o using stored procedures that run on database servers, as opposed to
greater reliance on logic running in middle-tier application servers in a
multi-tier architecture. The extent to which business logic should be
placed at the back-end versus another tier is a subject of ongoing
debate. For example, Oracle presents a detailed analysis of alternative
architectures that vary in the placement of business logic, concluding
that a database-centric approach has practical advantages from the
standpoint of ease of development and maintainability.
o using a shared database as the basis for communicating between
parallel processes in distributed computing applications, as opposed to
direct inter-process communication via message passing functions and
message-oriented middleware. A potential benefit of database-centric
architecture in distributed applications is that it simplifies the design
by utilizing DBMS-provided transaction processing and indexing to
achieve a high degree of reliability, performance, and capacity. For
example, Base One describes a database-centric distributed computing
architecture for grid and cluster computing, and explains how this
design provides enhanced security, fault-tolerance, and scalability.
The E-R Model:

The entity-relationship model is based on a perception of the world as
consisting of a collection of basic objects (entities)
and relationships among these objects.

o An entity is a distinguishable object that exists.
o Each entity has associated with it a set of attributes describing it.
o E.g. number and balance for an account entity.
o A relationship is an association among several entities.
o E.g. a CustAcct relationship associates a customer with each
account he or she has.
o The set of all entities or relationships of the same type is called
the entity set or relationship set.
o Another essential element of the E-R diagram is the mapping
cardinalities, which express the number of entities to which
another entity can be associated via a relationship set.
We'll see later how well this model works to describe real world
situations.

The overall logical structure of a database can be expressed
graphically by an E-R diagram:
o rectangles: represent entity sets.
o ellipses: represent attributes.
o diamonds: represent relationships among entity sets.
o lines: link attributes to entity sets and entity sets to relationships.
See figure 1.2 for an example.

Figure 1.2: A sample E-R diagram.
Entity and Entity Set:

An entity is an object that exists and is distinguishable from other
objects. For instance, John Harris with S.I.N. 890-12-3456 is an entity,
as he can be uniquely identified as one particular person in the
universe.

An entity may be concrete (a person or a book, for example) or
abstract (like a holiday or a concept).

An entity set is a set of entities of the same type (e.g., all persons
having an account at a bank).

Entity sets need not be disjoint. For example, the entity set employee
(all employees of a bank) and the entity set customer (all customers
of the bank) may have members in common.

An entity is represented by a set of attributes.
o E.g. name, S.I.N., street, city for a "customer" entity.
o The domain of the attribute is the set of permitted values (e.g. a
telephone number must consist of seven digits).

Formally, an attribute is a function which maps an entity set into a
domain.
o Every entity is described by a set of (attribute, data value) pairs.
o There is one pair for each attribute of the entity set.
o E.g. a particular customer entity is described by the set {(name,
Harris), (S.I.N., 890-12-3456), (street, North), (city, Georgetown)}.
An analogy can be made with the programming language notion of
type definition.
o The concept of an entity set corresponds to the programming language
type definition.
o A variable of a given type has a particular value at a point in time.
o Thus, a programming language variable corresponds to an entity in the
E-R model.
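The analogy with type definitions can be sketched directly in code. In the illustration below the class `Customer` stands in for the entity set and the variable `harris` for one entity. The field names are assumptions chosen to mirror the attributes named above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Customer:
    """Type definition: plays the role of the entity set 'customer'."""
    name: str
    sin: str
    street: str
    city: str


# A variable of that type: plays the role of one entity.
harris = Customer(name="Harris", sin="890-12-3456",
                  street="North", city="Georgetown")

# The entity viewed as a set of (attribute, data value) pairs.
pairs = set(asdict(harris).items())
print(sorted(pairs))
```

There is exactly one pair per attribute of the entity set, matching the description above.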
Figure 2.1 shows two entity sets.
We will be dealing with five entity sets in this section:
o branch, the set of all branches of a particular bank. Each branch is
described by the attributes branch-name, branch-city and assets.
o customer, the set of all people having an account at the bank.
Attributes are customer-name, S.I.N., street and customer-city.
o employee, with attributes employee-name and phone-number.
o account, the set of all accounts created and maintained in the bank.
Attributes are account-number and balance.
o transaction, the set of all account transactions executed in the bank.
Attributes are transaction-number, date and amount.

Relationship and Relationship Sets:
A relationship is an association between several entities.
A relationship set is a set of relationships of the same type.
Formally it is a mathematical relation on $n \geq 2$ (possibly non-distinct) sets.
If $E_1, E_2, \ldots, E_n$ are entity sets, then a relationship set $R$ is a subset of
$\{(e_1, e_2, \ldots, e_n) \mid e_1 \in E_1, e_2 \in E_2, \ldots, e_n \in E_n\}$
where $(e_1, e_2, \ldots, e_n)$ is a relationship.
For example, consider the two entity sets customer and account (Fig. 2.1 in the
text). We define the relationship CustAcct to denote the association between
customers and their accounts. This is a binary relationship set (see Figure 2.2 in
the text).
Going back to our formal definition, the relationship set CustAcct is a subset of all
the possible customer and account pairings.
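The formal definition can be checked mechanically on a tiny example. In the sketch below the customer names and account numbers are invented sample data. `all_pairings` is the set of all possible (customer, account) pairs and `cust_acct` is one relationship set drawn from it.

```python
from itertools import product

# Two small entity sets (sample values are assumptions for illustration).
customers = {"Harris", "Smith"}
accounts = {"A-101", "A-102", "A-103"}

# Every possible (customer, account) pairing: the Cartesian product.
all_pairings = set(product(customers, accounts))

# One possible CustAcct relationship set: who actually holds which account.
cust_acct = {("Harris", "A-101"), ("Harris", "A-102"), ("Smith", "A-102")}

print(len(all_pairings))          # 6
print(cust_acct <= all_pairings)  # True: a relationship set is a subset
```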
This is a binary relationship. Occasionally there are relationships involving more
than two entity sets.
The role of an entity is the function it plays in a relationship. For example, the
relationship works-for could be ordered pairs of employee entities. The first
employee takes the role of manager, and the second one will take the role of
worker.
A relationship may also have descriptive attributes. For example, date (last date of
account access) could be an attribute of the CustAcct relationship set.

Attributes:
It is possible to define a set of entities and the relationships among them in
a number of different ways. The main difference is in how we deal with
attributes.
o Consider the entity set employee with attributes employee-name and
phone-number.
o We could argue that the phone be treated as an entity itself, with
attributes phone-number and location.
o Then we have two entity sets, and the relationship
set EmpPhn defining the association between employees and their
phones.
o This new definition allows employees to have several (or zero) phones.
o The new definition may more accurately reflect the real world.
o We cannot extend this argument easily to making employee-name an
entity.
The question of what constitutes an entity and what constitutes an attribute
depends mainly on the structure of the real world situation being modeled,
and the semantics associated with the attribute in question.
Mapping Constraints:
An E-R scheme may define certain constraints to which the contents of a
database must conform.
Mapping Cardinalities: express the number of entities to
which another entity can be associated via a relationship. For binary
relationship sets between entity sets A and B, the mapping cardinality must
be one of:

1. One-to-one: An entity in A is associated with at most one entity in B,
and an entity in B is associated with at most one entity in A. (Figure
2.3)

2. One-to-many: An entity in A is associated with any number in B. An
entity in B is associated with at most one entity in A. (Figure 2.4)

3. Many-to-one: An entity in A is associated with at most one entity in
B. An entity in B is associated with any number in A. (Figure 2.5)

4. Many-to-many: Entities in A and B are associated with any number
from each other. (Figure 2.6)
The appropriate mapping cardinality for a particular relationship set depends
on the real world being modeled. (Think about the CustAcct relationship...)

Existence Dependencies: if the existence of entity X depends on the
existence of entity Y, then X is said to be existence dependent on Y.
(Or we say that Y is the dominant entity and X is
the subordinate entity.)
For example,
o Consider account and transaction entity sets, and a
relationship log between them.
o This is one-to-many from account to transaction.
o If an account entity is deleted, its associated transaction entities
must also be deleted.
o Thus account is dominant and transaction is subordinate.
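One way a relational DBMS can express this existence dependency is a foreign key with a cascading delete. The sketch below uses SQLite from Python. The table names `account` and `txn`, the column names and the sample rows are assumptions made for the illustration, and other database engines may need slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance REAL)")
conn.execute(
    """CREATE TABLE txn (
           account_number     TEXT NOT NULL
               REFERENCES account(account_number) ON DELETE CASCADE,
           transaction_number INTEGER NOT NULL,
           amount             REAL,
           PRIMARY KEY (account_number, transaction_number))"""
)

conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")
conn.executemany(
    "INSERT INTO txn VALUES (?, ?, ?)",
    [("A-101", 1, -100.0), ("A-101", 2, -50.0)],
)

# Deleting the dominant entity removes its subordinate entities as well.
conn.execute("DELETE FROM account WHERE account_number = 'A-101'")
remaining = conn.execute("SELECT COUNT(*) FROM txn").fetchone()[0]
print(remaining)  # 0
```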
Keys:
Differences between entities must be expressed in terms of attributes.

A superkey is a set of one or more attributes which, taken
collectively, allow us to identify uniquely an entity in the entity set.

For example, in the entity set customer, the combination of customer-name
and S.I.N. is a superkey.

Note that customer-name alone is not, as two customers could have
the same name.

A superkey may contain extraneous attributes, and we are often
interested in the smallest superkey. A superkey for which no proper subset is
a superkey is called a candidate key.

In the example above, S.I.N. is a candidate key, as it is minimal, and
uniquely identifies a customer entity.

A primary key is a candidate key (there may be more than one)
chosen by the DB designer to identify entities in an entity set.
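The difference between a superkey and a candidate key can be sketched as two small checks over sample data. Strictly, keys are properties of the entity set as designed, not of one particular collection of rows, so the functions below (`is_superkey` and `is_candidate_key`, both invented for this illustration) only test whether the given rows are consistent with a proposed key.

```python
from itertools import combinations

# Sample customer entities (values are assumptions for illustration).
customers = [
    {"customer_name": "Harris", "sin": "890-12-3456", "city": "Georgetown"},
    {"customer_name": "Harris", "sin": "111-22-3333", "city": "Burnaby"},
    {"customer_name": "Smith",  "sin": "222-33-4444", "city": "Burnaby"},
]


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def is_candidate_key(rows, attrs):
    """A superkey none of whose proper, non-empty subsets is a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


print(is_superkey(customers, ("customer_name", "sin")))       # True
print(is_superkey(customers, ("customer_name",)))             # False: two Harrises
print(is_candidate_key(customers, ("sin",)))                  # True
print(is_candidate_key(customers, ("customer_name", "sin")))  # False: not minimal
```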
An entity set that does not possess sufficient attributes to form a primary
key is called a weak entity set. One that does have a primary key is called
a strong entity set.
For example,
o The entity set transaction has attributes transaction-number, date and amount.
o Different transactions on different accounts could share the same
number.
o These are not sufficient to form a primary key (uniquely identify a
transaction).
o Thus transaction is a weak entity set.
For a weak entity set to be meaningful, it must be part of a one-to-many
relationship set. This relationship set should have no descriptive attributes.
(Why?)
The idea of strong and weak entity sets is related to the existence
dependencies seen earlier.
o A member of a strong entity set is a dominant entity.
o A member of a weak entity set is a subordinate entity.
A weak entity set does not have a primary key, but we need a means of
distinguishing among the entities.
The discriminator of a weak entity set is a set of attributes that allows this
distinction to be made.
The primary key of a weak entity set is formed by taking the primary key
of the strong entity set on which its existence depends (see Mapping
Constraints) plus its discriminator.
To illustrate:
o transaction is a weak entity. It is existence-dependent on account.
o The primary key of account is account-number.
o transaction-number distinguishes transaction entities within the same
account (and is thus the discriminator).
o So the primary key for transaction would be (account-number,
transaction-number).
Just Remember: The primary key of a weak entity is found by taking the
primary key of the strong entity on which it is existence-dependent, plus the
discriminator of the weak entity set.
Primary Keys For Relationship Sets:
The attributes of a relationship set are the attributes that comprise the
primary keys of the entity sets involved in the relationship set.
For example:
o S.I.N. is the primary key of customer, and
o account-number is the primary key of account.
o The attributes of the relationship set CustAcct are then (account-
number, S.I.N.).
This is enough information to enable us to relate an account to a person.
If the relationship has descriptive attributes, those are also included in its
attribute set. For example, we might add the attribute date to the above
relationship set, signifying the date of last access to an account by a
particular customer.
Note that this attribute cannot instead be placed in either entity set as it
relates to both a customer and an account, and the relationship is many-to-
many.
The primary key of a relationship set depends on the mapping cardinality
and the presence of descriptive attributes.
With no descriptive attributes:
o many-to-many: all attributes in the relationship set.
o one-to-many: primary key for the "many" entity.
Descriptive attributes may be added, depending on the mapping cardinality
and the semantics involved (see text).
The Entity Relationship Diagram:
We can express the overall logical structure of a database graphically with
an E-R diagram.
Its components are:
o rectangles representing entity sets.
o ellipses representing attributes.
o diamonds representing relationship sets.
o lines linking attributes to entity sets and entity sets to relationship
sets.
In the text, lines may be directed (have an arrow on the end) to signify
mapping cardinalities for relationship sets. Figures 2.8 to 2.10 show some
examples.

Figure 2.7: An E-R diagram

Figure 2.8: One-to-many from customer to account

Figure 2.9: Many-to-one from customer to account

Figure 2.10: One-to-one from customer to account

Go back and review mapping cardinalities. They express the number of
entities to which an entity can be associated via a relationship.
The arrow positioning is simple once you get it straight in your mind, so do
some examples. Think of the arrow head as pointing to the entity that
"one" refers to.

Other Styles of E-R Diagram:
The text uses one particular style of diagram. Many variations exist.
Some of the variations you will see are:
Diamonds being omitted - a link between entities indicates a
relationship.
o Fewer symbols, clearer picture.
o What happens with descriptive attributes?
o In this case, we have to create an intersection entity to possess
the attributes.
Numbers instead of arrowheads indicating cardinality.
o Symbols 1, n and m used.
o E.g. 1 to 1, 1 to n, n to m.
o Easier to understand than arrowheads.
A range of numbers indicating optionality of relationship. (See
Elmasri & Navathe, p 58.)
o E.g. (0,1) indicates minimum zero (optional), maximum 1.
o Can also use (0,n), (1,1) or (1,n).
o Typically used on near end of link - confusing at first, but gives
more information.
o E.g. entity 1 (0,1) -- (1,n) entity 2 indicates that entity 1 is related
to between 0 and 1 occurrences of entity 2 (optional).
o Entity 2 is related to at least 1 and possibly many occurrences of
entity 1 (mandatory).
Multivalued attributes may be indicated in some manner.
o Means attribute can have more than one value.
o E.g. hobbies.
o Has to be normalized later on.
Extended E-R diagrams allowing more details/constraints in the real
world to be recorded. (See Elmasri & Navathe, chapter 21.)
o Composite attributes.
o Derived attributes.
o Subclasses and superclasses.
o Generalization and specialization.
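The (min, max) notation described in the list above can be read as a constraint on how many times each entity takes part in a relationship. The sketch below checks such a constraint on sample data. The function `participation_ok` and the entity names are assumptions made for this illustration, with `maximum=None` standing for "n".

```python
def participation_ok(pairs, position, members, minimum, maximum=None):
    """Check a (min, max) participation constraint for one side of a
    binary relationship; maximum=None means 'n' (no upper limit)."""
    counts = {member: 0 for member in members}
    for pair in pairs:
        counts[pair[position]] += 1
    return all(
        count >= minimum and (maximum is None or count <= maximum)
        for count in counts.values()
    )


# entity 1 (0,1) -- (1,n) entity 2
entity1 = {"a", "b"}
entity2 = {"x"}
links = {("a", "x")}

print(participation_ok(links, 0, entity1, 0, 1))  # True: each entity 1 has 0 or 1 links
print(participation_ok(links, 1, entity2, 1))     # True: each entity 2 has at least 1 link

# Linking "a" to a second entity 2 occurrence breaks the (0,1) side.
bad_links = {("a", "x"), ("a", "y")}
print(participation_ok(bad_links, 0, entity1, 0, 1))  # False
```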
Roles in E-R Diagrams
The function that an entity plays in a relationship is called its role. Roles are
normally implicit and not specified.
They are useful when the meaning of a relationship set needs clarification.
For example, the entity sets of a relationship may not be distinct. The
relationship works-for might be ordered pairs of employees (first is manager,
second is worker).
In the E-R diagram, this can be shown by labelling the lines connecting
entities (rectangles) to relationships (diamonds). (See figure 2.11).

Figure 2.11: E-R diagram with role indicators

Weak Entity Sets in E-R Diagrams
A weak entity set is indicated by a doubly-outlined box. For example, the
previously-mentioned weak entity set transaction is dependent on the strong
entity set account via the relationship set log.
Figure 2.12 shows this example.

Figure 2.12: E-R diagram with a weak entity set

Nonbinary Relationships
Non-binary relationships can easily be represented. Figure 2.13 shows an
example.

Figure 2.13: E-R diagram with a ternary relationship

This E-R diagram says that a customer may have several accounts, each
located in a specific bank branch, and that an account may belong to several
different customers.

Reducing E-R Diagrams to Tables:
A database conforming to an E-R diagram can be represented by a collection of
tables. We'll use the E-R diagram of Figure 2.14 as our example.
Figure 2.14: E-R diagram with strong and weak entity sets
For each entity set and relationship set, there is a unique table which is assigned
the name of the corresponding set. Each table has a number of columns with
unique names. (E.g. Figs. 2.14 - 2.18 in the text).
Representation of Strong Entity Sets:
We use a table with one column for each attribute of the set. Each row in the table
corresponds to one entity of the entity set. For the entity set account, see the table
of figure 2.14.
We can add, delete and modify rows (to reflect changes in the real world).
A row of a table will consist of an n-tuple where n is the number of attributes.
Actually, the table contains a subset of the set of all possible rows. We refer to the
set of all possible rows as the cartesian product of the sets of all attribute values.
We may denote this as $D_1 \times D_2$
for the account table, where $D_1$ and $D_2$ denote the set of all account numbers and
all account balances, respectively.
In general, for a table of n columns, we may denote the cartesian product of
$D_1, D_2, \ldots, D_n$ by $\times_{i=1}^{n} D_i$.

Representation of Weak Entity Sets:
For a weak entity set, we add columns to the table corresponding to the primary
key of the strong entity set on which the weak set is dependent.
For example, the weak entity set transaction has three attributes: transaction-
number, date and amount. The primary key of account (on
which transaction depends) is account-number. This gives us the table of figure
2.16.

Representation of Relationship Sets:
Let $R$ be a relationship set involving entity sets $E_1, E_2, \ldots, E_m$.
The table corresponding to the relationship set $R$ has the following attributes:
$\bigcup_{i=1}^{m} \text{primary-key}(E_i)$
If the relationship has $k$ descriptive attributes $a_1, a_2, \ldots, a_k$, we add them too:
$\bigcup_{i=1}^{m} \text{primary-key}(E_i) \cup \{a_1, a_2, \ldots, a_k\}$
An example:
The relationship set CustAcct involves the entity sets customer and account.
Their respective primary keys are S.I.N. and account-number.
CustAcct also has a descriptive attribute, date.
This gives us the table of figure 2.17.
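The rule for the columns of a relationship table can be sketched as a small helper. The function name `relationship_columns` and the underscore spellings of the attribute names are assumptions for the illustration. It takes the primary keys of the participating entity sets and appends any descriptive attributes.

```python
def relationship_columns(primary_keys, descriptive=()):
    """Columns of the table for a relationship set: the union of the
    participating entity sets' primary keys plus descriptive attributes."""
    columns = []
    for key in primary_keys:
        for attribute in key:
            if attribute not in columns:      # union: no repeated columns
                columns.append(attribute)
    columns.extend(a for a in descriptive if a not in columns)
    return columns


# CustAcct: customer (key S.I.N.) and account (key account-number), plus date.
cust_acct_columns = relationship_columns(
    [("sin",), ("account_number",)], descriptive=("date",)
)
print(cust_acct_columns)  # ['sin', 'account_number', 'date']
```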
Non-binary Relationship Sets
The ternary relationship of Figure 2.13 gives us the table of figure 2.18. As required, we take the primary
keys of each entity set. There are no descriptive attributes in this example.
Linking a Weak to a Strong Entity
These relationship sets are many-to-one, and have no descriptive attributes. The primary key of the weak
entity set is the primary key of the strong entity set it is existence-dependent on, plus its discriminator.
The table for the relationship set would have the same attributes, and is thus redundant.
Concept of Enhanced E-R Diagram:
We have seen weak entity sets, generalization and aggregation. Designers must decide when these features
are appropriate.
o Strong entity sets and their dependent weak entity sets may be regarded as a single "object" in
the database, as weak entities are existence-dependent on a strong entity.
o It is possible to treat an aggregated entity set as a single unit without concern for its inner structure
details.
o Generalization contributes to modularity by allowing common attributes of similar entity sets to be
represented in one place in an E-R diagram.
Excessive use of the features can introduce unnecessary complexity into the design.
The Relational Data Model:
The Relational Data Model has the relation at its heart, together with a whole series of rules governing keys,
relationships, joins, functional dependencies, transitive dependencies, multi-valued dependencies, and
modification anomalies.
The Relation
The Relation is the basic element in a relational data model.
Figure 3 - Relations in the Relational Data Model
A relation is subject to the following rules:
1. A relation (file, table) is a two-dimensional table.
2. An attribute (i.e. field or data item) is a column in the table.
3. Each column in the table has a unique name within that table.
4. Each column is homogeneous. Thus the entries in any column are all of the same type (e.g. age,
name, employee-number, etc).
5. Each column has a domain, the set of possible values that can appear in that column.
6. A tuple (i.e. record) is a row in the table.
7. The order of the rows and columns is not important.
8. Values of a row all relate to some thing or portion of a thing.
9. Repeating groups (collections of logically related attributes that occur multiple times within one
record occurrence) are not allowed.
10. Duplicate rows are not allowed (candidate keys are designed to prevent this).
11. Cells must be single-valued (but can be variable length). Single valued means the following:
o Cannot contain multiple values such as 'A1,B2,C3'.
o Cannot contain combined values such as 'ABC-XYZ' where 'ABC' means one thing and 'XYZ'
another.
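The single-valued rule can be illustrated by repairing a table that breaks it. In the sketch below the table contents and the function name `split_multivalued` are assumptions. A cell holding 'A1,B2,C3' is replaced by one row per value.

```python
def split_multivalued(rows, column, separator=","):
    """Return new rows in which `column` holds exactly one value per row."""
    result = []
    for row in rows:
        for value in row[column].split(separator):
            new_row = dict(row)               # copy the other attributes
            new_row[column] = value.strip()
            result.append(new_row)
    return result


# A table that violates the single-valued rule (sample data).
people = [
    {"person": "Ann", "codes": "A1,B2,C3"},
    {"person": "Bob", "codes": "A1"},
]

for row in split_multivalued(people, "codes"):
    print(row)
# {'person': 'Ann', 'codes': 'A1'}
# {'person': 'Ann', 'codes': 'B2'}
# {'person': 'Ann', 'codes': 'C3'}
# {'person': 'Bob', 'codes': 'A1'}
```

After the split, (person, codes) together identify a row, whereas person alone no longer does.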
A relation may be expressed using the notation R(A,B,C, ...) where:
o R = the name of the relation.
o (A,B,C, ...) = the attributes within the relation.
o A = the attribute(s) which form the primary key (conventionally underlined).
Keys
1. A simple key contains a single attribute.
2. A composite key is a key that contains more than one attribute.
3. A candidate key is an attribute (or set of attributes) that uniquely identifies a row. A candidate key
must possess the following properties:
o Unique identification - For every row the value of the key must uniquely identify that row.
o Non redundancy - No attribute in the key can be discarded without destroying the property of
unique identification.
4. A primary key is the candidate key which is selected as the principal unique identifier. Every
relation must contain a primary key. The primary key is usually the key selected to identify a row
when the database is physically implemented. For example, a part number is selected instead of a
part description.
5. A superkey is any set of attributes that uniquely identifies a row. A superkey differs from a
candidate key in that it does not require the non redundancy property.
6. A foreign key is an attribute (or set of attributes) that appears (usually) as a non key attribute in
one relation and as a primary key attribute in another relation. "Usually" because it is possible
for a foreign key to also be the whole or part of a primary key:
o A many-to-many relationship can only be implemented by introducing an intersection or link
table which then becomes the child in two one-to-many relationships. The intersection table
therefore has a foreign key for each of its parents, and its primary key is a composite of both
foreign keys.
o A one-to-one relationship requires that the child table has no more than one occurrence for
each parent, which can be enforced by letting the foreign key also serve as the primary
key.
7. A semantic or natural key is a key for which the possible values have an obvious meaning to the
user or the data. For example, a semantic primary key for a COUNTRY entity might contain the
value 'USA' for the occurrence describing the United States of America. The value 'USA' has
meaning to the user.
8. A technical or surrogate or artificial key is a key for which the possible values have no obvious
meaning to the user or the data. These are used instead of semantic keys for any of the following
reasons:
o When the value in a semantic key is likely to be changed by the user, or can have duplicates.
For example, on a PERSON table it is unwise to use PERSON_NAME as the key as it is possible
to have more than one person with the same name, or the name may change such as through
marriage.
o When none of the existing attributes can be used to guarantee uniqueness. In this case adding
an attribute whose value is generated by the system, e.g. from a sequence of numbers, is the
only way to provide a unique value. Typical examples would be ORDER_ID and INVOICE_ID.
The value '12345' has no meaning to the user as it conveys nothing about the entity to which it
relates.
9. A key functionally determines the other attributes in the row, thus it is always a determinant.
10. Note that the term 'key' in most DBMS engines is implemented as an index which does not allow
duplicate entries.
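The intersection-table case from item 6 can be sketched with SQLite. The table and column names below (`customer`, `account`, `cust_acct`, `last_access`) and the sample rows are assumptions for the illustration. The point is that the link table carries one foreign key per parent and uses both together as its composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customer (sin TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE account  (account_number TEXT PRIMARY KEY, balance REAL);

    -- Intersection (link) table: child of both customer and account.
    CREATE TABLE cust_acct (
        sin            TEXT NOT NULL REFERENCES customer(sin),
        account_number TEXT NOT NULL REFERENCES account(account_number),
        last_access    TEXT,
        PRIMARY KEY (sin, account_number)      -- composite of both foreign keys
    );

    INSERT INTO customer VALUES ('890-12-3456', 'Harris'), ('222-33-4444', 'Smith');
    INSERT INTO account  VALUES ('A-101', 500.0), ('A-102', 75.0);
    INSERT INTO cust_acct VALUES
        ('890-12-3456', 'A-101', '2024-03-01'),
        ('890-12-3456', 'A-102', '2024-03-02'),
        ('222-33-4444', 'A-102', '2024-03-03');
    """
)

# The same (customer, account) pairing cannot be recorded twice.
try:
    conn.execute("INSERT INTO cust_acct VALUES ('890-12-3456', 'A-101', '2024-04-01')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```

One customer appears with two accounts and one account with two customers, which is exactly the many-to-many situation a single foreign key could not represent.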
Relationships:
One table (relation) may be linked with another in what is known as a relationship. Relationships may be
built into the database structure to facilitate the operation of relational joins at runtime.
1. A relationship is between two tables in what is known as a one-to-many or parent-child
or master-detail relationship where an occurrence on the "one" or "parent" or "master" table
may have any number of associated occurrences on the "many" or "child" or "detail" table. To achieve
this the child table must contain fields which link back to the primary key on the parent table. These
fields on the child table are known as a foreign key, and the parent table is referred to as
the foreign table (from the viewpoint of the child).
2. It is possible for a record on the parent table to exist without corresponding records on
the child table, but it should not be possible for an entry on the child table to exist without a
corresponding entry on the parent table.
3. A child record without a corresponding parent record is known as an orphan.
o It is possible for a table to be related to itself. For this to be possible it needs a foreign
key which points back to the primary key. Note that these two keys cannot be comprised of
exactly the same fields, otherwise the record could only ever point to itself.
o A table may be the subject of any number of relationships, and it may be the parent in some
and the child in others.
o Some database engines allow a parent table to be linked via a candidate key, but if this were
changed it could result in the link to the child table being broken.
o Some database engines allow relationships to be managed by rules known as referential
integrity or foreign key constraints. These will prevent entries on child tables from being
created if the foreign key does not exist on the parent table, or will deal with entries
on child tables when the entry on the parent table is updated or deleted.
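The referential integrity rules described above can be tried out with a small sketch. It assumes SQLite (through Python's sqlite3 module) as the engine, and the table names parent and child and their columns are invented for illustration; other engines declare and enforce foreign keys in broadly similar ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE parent (parent_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE child ("
    " child_id INTEGER PRIMARY KEY,"
    " parent_id INTEGER NOT NULL REFERENCES parent(parent_id) ON DELETE CASCADE)"
)
conn.execute("INSERT INTO parent VALUES (1, 'first parent')")
conn.execute("INSERT INTO child VALUES (10, 1)")  # valid: parent 1 exists

# An orphan (child with no matching parent) should be refused.
try:
    conn.execute("INSERT INTO child VALUES (11, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the parent "deals with" its children: here they are deleted too.
conn.execute("DELETE FROM parent WHERE parent_id = 1")
remaining_children = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
print(orphan_rejected, remaining_children)
```

ON DELETE CASCADE is only one possible policy; an engine may equally be told to refuse the delete or to set the child's foreign key to null.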
Relational Joins
The join operator is used to combine data from two or more relations (tables) in order to satisfy a particular
query. Two relations may be joined when they share at least one common attribute. The join is implemented
by considering each row in an instance of each relation. A row in relation R1 is joined to a row in relation R2
when the value of the common attribute(s) is equal in the two relations. The join of two relations is often
called a binary join.
The join of two relations creates a new relation. The notation "R1 x R2" indicates the join of relations R1 and
R2. For example, consider the following:
Relation R1
A  B  C
1  5  3
2  4  5
8  3  5
9  3  3
1  6  5
5  4  3
2  7  5
Relation R2
B  D  E
4  7  4
6  2  3
5  7  8
7  2  3
3  2  2

Note that the instances of relation R1 and R2 contain the same data values for attribute B. Data
normalisation is concerned with decomposing a relation (e.g. R(A,B,C,D,E)) into smaller relations (e.g. R1
and R2). The data values for attribute B in this context will be identical in R1 and R2. The instances of R1
and R2 are projections of the instances of R(A,B,C,D,E) onto the attributes (A,B,C) and (B,D,E) respectively.
A projection will not eliminate data values: duplicate rows are removed, but this will not remove a data
value from any attribute.
The join of relations R1 and R2 is possible because B is a common attribute. The result of the join is:
Relation R1 x R2
A  B  C  D  E
1  5  3  7  8
2  4  5  7  4
8  3  5  2  2
9  3  3  2  2
1  6  5  2  3
5  4  3  7  4
2  7  5  2  3
The row (2 4 5 7 4) was formed by joining the row (2 4 5) from relation R1 to the row (4 7 4) from relation
R2. The two rows were joined since each contained the same value for the common attribute B. The row (2
4 5) was not joined to the row (6 2 3) since the values of the common attribute (4 and 6) are not the same.
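The join above can be reproduced with a few lines of Python. This is a minimal sketch using the R1 and R2 instances from the tables; the function name join_on_b is made up for the example, and a real DBMS would use an index or hash table rather than comparing every pair of rows.

```python
# Rows of R1 are (A, B, C); rows of R2 are (B, D, E).
R1 = [(1, 5, 3), (2, 4, 5), (8, 3, 5), (9, 3, 3), (1, 6, 5), (5, 4, 3), (2, 7, 5)]
R2 = [(4, 7, 4), (6, 2, 3), (5, 7, 8), (7, 2, 3), (3, 2, 2)]

def join_on_b(r1, r2):
    """Pair every R1 row with every R2 row that has the same value of B."""
    return [(a, b, c, d, e) for (a, b, c) in r1 for (b2, d, e) in r2 if b == b2]

result = join_on_b(R1, R2)
for row in result:
    print(*row)
```

Each R1 row finds exactly one partner here because every B value occurs once in R2, so the result has the same seven rows as the table above.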
Lossless Joins:
A set of relations satisfies the lossless join property if the instances can be joined without creating invalid
data (i.e. new rows). The term lossless join may be somewhat confusing. A join that is not lossless will
contain extra, invalid rows. A join that is lossless will not contain extra, invalid rows. Thus the term gainless
join might be more appropriate.
To give an example of incorrect information created by an invalid join let us take the following data
structure:
R(student, course, instructor, hour, room, grade)
Assuming that only one section of a class is offered during a semester we can define the following functional
dependencies:
1. (HOUR, ROOM) → COURSE
2. (COURSE, STUDENT) → GRADE
3. (INSTRUCTOR, HOUR) → ROOM
4. (COURSE) → INSTRUCTOR
5. (HOUR, STUDENT) → ROOM
Take the following sample data:
STUDENT  COURSE   INSTRUCTOR  HOUR  ROOM  GRADE
Smith    Math 1   Jenkins     8:00  100   A
Jones    English  Goldman     8:00  200   B
Brown    English  Goldman     8:00  200   C
Green    Algebra  Jenkins     9:00  400   A
The following four relations, each in 4th normal form, can be generated from the given and implied
dependencies:
R1(STUDENT, HOUR, COURSE)
R2(STUDENT, COURSE, GRADE)
R3(COURSE, INSTRUCTOR)
R4(INSTRUCTOR, HOUR, ROOM)
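Whether this decomposition is lossless for the sample data can be checked mechanically: project the original rows onto R1 to R4, join the projections back together, and compare with the original. The sketch below does that in Python; the helper names project and natural_join are invented for the example, and a check on one sample instance is only evidence, not a proof for every possible instance.

```python
ATTRS = ("student", "course", "instructor", "hour", "room", "grade")
R = [dict(zip(ATTRS, row)) for row in [
    ("Smith", "Math 1", "Jenkins", "8:00", "100", "A"),
    ("Jones", "English", "Goldman", "8:00", "200", "B"),
    ("Brown", "English", "Goldman", "8:00", "200", "C"),
    ("Green", "Algebra", "Jenkins", "9:00", "400", "A"),
]]

def project(rel, attrs):
    """Projection: keep only attrs and drop duplicate rows."""
    seen = {tuple(row[a] for a in attrs) for row in rel}
    return [dict(zip(attrs, values)) for values in seen]

def natural_join(left, right):
    """Join rows that agree on every attribute the two relations share."""
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

R1 = project(R, ("student", "hour", "course"))
R2 = project(R, ("student", "course", "grade"))
R3 = project(R, ("course", "instructor"))
R4 = project(R, ("instructor", "hour", "room"))

joined = natural_join(natural_join(natural_join(R1, R2), R3), R4)

def as_set(rel):
    return {tuple(row[a] for a in ATTRS) for row in rel}

lossless = as_set(joined) == as_set(R)
print(len(joined), lossless)
```

A join that was not lossless would show up here as extra rows in joined that are not in R.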
Note that the dependencies (HOUR, ROOM) → COURSE and (HOUR, STUDENT) → ROOM are not
explicitly represented in the preceding decomposition. The goal is to develop relations in 4th normal form
that can be joined to answer any ad hoc inquiries correctly. This goal can be achieved without representing
every functional dependency as a relation. Furthermore, several sets of relations may satisfy the goal.
Determinant and Dependent:
The terms determinant and dependent can be described as follows:
1. The expression X → Y means "if I know the value of X, then I can obtain the value of Y" (in a table
or somewhere).
2. In the expression X → Y, X is the determinant and Y is the dependent attribute.
3. The value X determines the value of Y.
4. The value Y depends on the value of X.
Functional Dependencies (FD)
A functional dependency can be described as follows:
1. An attribute is functionally dependent if its value is determined by another attribute.
2. That is, if we know the value of one (or several) data items, then we can find the value of another
(or several).
3. Functional dependencies are expressed as X → Y, where X is the determinant and Y is the
functionally dependent attribute.
4. If A → (B,C) then A → B and A → C.
5. If (A,B) → C, then it is not necessarily true that A → C and B → C.
6. If A → B and B → A, then A and B are in a 1-1 relationship.
7. If A → B then for A there can only ever be one value for B.
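Rule 7 gives a direct way to test a functional dependency against sample data: group the rows by the determinant and check that each group has only one value for the dependent attributes. The following sketch does this; the function name holds and the three sample rows are invented for illustration. Passing the test on a sample only shows the dependency is not contradicted by that sample.

```python
def holds(rows, determinant, dependent):
    """True if determinant -> dependent is not violated by these rows:
    rows that agree on the determinant must agree on the dependent."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        y = tuple(row[a] for a in dependent)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"A": 1, "B": 1, "C": 10},
    {"A": 1, "B": 2, "C": 20},
    {"A": 2, "B": 1, "C": 30},
]
ab_c = holds(rows, ["A", "B"], ["C"])  # (A,B) -> C
a_c = holds(rows, ["A"], ["C"])        # A -> C
b_c = holds(rows, ["B"], ["C"])        # B -> C
print(ab_c, a_c, b_c)
```

The sample illustrates rule 5: (A,B) → C holds, yet neither A → C nor B → C does.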
Transitive Dependencies (TD)
A transitive dependency can be described as follows:
1. An attribute is transitively dependent if its value is determined by another attribute which is not a
key.
2. If X → Y and X is not a key then this is a transitive dependency.
3. A transitive dependency exists when A → B → C, so that C depends on A only indirectly, through B.
Multi-Valued Dependencies (MVD)
A multi-valued dependency can be described as follows:
1. A table involves a multi-valued dependency if it may contain multiple values for an entity.
2. A multi-valued dependency may arise as a result of enforcing 1st normal form.
3. X →→ Y, i.e. X multi-determines Y, when for each value of X we can have more than one value of Y.
4. If A →→ B and A →→ C then we have a single attribute A which multi-determines two other
independent attributes, B and C.
5. If A →→ (B,C) then we have an attribute A which multi-determines a set of associated attributes, B
and C.
Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
Modification Anomalies
A major objective of data normalisation is to avoid modification anomalies. These come in two flavors:
1. An insertion anomaly is a failure to place information about a new database entry into all the
places in the database where information about that new entry needs to be stored. In a properly
normalized database, information about a new entry needs to be inserted into only one place in the
database. In an inadequately normalized database, information about a new entry may need to be
inserted into more than one place, and, human fallibility being what it is, some of the needed
additional insertions may be missed.
2. A deletion anomaly is a failure to remove information about an existing database entry when it is
time to remove that entry. In a properly normalized database, information about an old, to-be-
gotten-rid-of entry needs to be deleted from only one place in the database. In an inadequately
normalized database, information about that old entry may need to be deleted from more than one
place, and, human fallibility being what it is, some of the needed additional deletions may be
missed.
An update of a database involves modifications that may be additions, deletions, or both. Thus "update
anomalies" can be either of the kinds of anomalies discussed above.
All three kinds of anomalies are highly undesirable, since their occurrence constitutes corruption of the
database. Properly normalised databases are much less susceptible to corruption than are unnormalised
databases.
Types of Relational Join:
A JOIN is a method of creating a result set that combines rows from two or more tables (relations). When
comparing the contents of two tables the following conditions may occur:
o Every row in one relation has a match in the other relation.
o Relation R1 contains rows that have no match in relation R2.
o Relation R2 contains rows that have no match in relation R1.
INNER joins contain only matches. OUTER joins may contain mismatches as well.
Inner Join
This is sometimes known as a simple join. It returns all rows from both tables where there is a match. If
there are rows in R1 which do not have matches in R2, those rows will not be listed. There are two possible
ways of specifying this type of join:
SELECT * FROM R1, R2 WHERE R1.r1_field = R2.r2_field;
SELECT * FROM R1 INNER JOIN R2 ON R1.r1_field = R2.r2_field
If the fields to be matched have the same names in both tables then the ON condition, as in:
ON R1.fieldname = R2.fieldname
ON (R1.field1 = R2.field1 AND R1.field2 = R2.field2)
can be replaced by the shorter USING condition, as in:
USING (fieldname)
USING (field1, field2)
Natural Join
A natural join is based on all columns in the two tables that have the same name. It is semantically
equivalent to an INNER JOIN or a LEFT JOIN with a USING clause that names all columns that exist in both
tables.
SELECT * FROM R1 NATURAL JOIN R2
The alternative is a keyed join which includes an ON or USING condition.
Left [Outer] Join
Returns all the rows from R1 even if there are no matches in R2. If there are no matches in R2 then the R2
values will be shown as null.
SELECT * FROM R1 LEFT [OUTER] JOIN R2 ON R1.field = R2.field
Right [Outer] Join
Returns all the rows from R2 even if there are no matches in R1. If there are no matches in R1 then the R1
values will be shown as null.
SELECT * FROM R1 RIGHT [OUTER] JOIN R2 ON R1.field = R2.field
Full [Outer] Join
Returns all the rows from both tables even if there are no matches in one of the tables. If there are no
matches in one of the tables then its values will be shown as null.
SELECT * FROM R1 FULL [OUTER] JOIN R2 ON R1.field = R2.field
Self Join
This joins a table to itself. This table appears twice in the FROM clause and is followed by table aliases that
qualify column names in the join condition.
SELECT a.field1, b.field2 FROM R1 a, R1 b WHERE a.field = b.field
Cross Join
This type of join is rarely used as it does not have a join condition, so every row of R1 is joined to every row
of R2. For example, if both tables contain 100 rows the result will be 10,000 rows. This is sometimes known
as a cartesian product and can be specified in either one of the following ways:
SELECT * FROM R1 CROSS JOIN R2
SELECT * FROM R1, R2
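The difference between the join types is easiest to see on a tiny data set. The sketch below runs several of the statements above through SQLite using Python's sqlite3 module; the column names r1_val and r2_val and the three rows per table are invented for the example. RIGHT and FULL joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R1 (field INTEGER, r1_val TEXT);
    CREATE TABLE R2 (field INTEGER, r2_val TEXT);
    INSERT INTO R1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO R2 VALUES (2, 'x'), (3, 'y'), (4, 'z');
""")

def run(sql):
    return conn.execute(sql).fetchall()

# Only rows with a match on both sides.
inner = run("SELECT * FROM R1 INNER JOIN R2 ON R1.field = R2.field ORDER BY R1.field")
# Every R1 row; unmatched R2 columns come back as NULL (None in Python).
left = run("SELECT * FROM R1 LEFT OUTER JOIN R2 ON R1.field = R2.field ORDER BY R1.field")
# Joins on the shared column name 'field', which then appears only once.
natural = run("SELECT * FROM R1 NATURAL JOIN R2 ORDER BY field")
# No join condition: every row of R1 paired with every row of R2.
cross = run("SELECT * FROM R1 CROSS JOIN R2")

print(inner)
print(left)
print(natural)
print(len(cross))
```

Row 1 of R1 has no partner in R2, so it is missing from the inner join but present, padded with nulls, in the left join.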
The Relational Algebra:
The relational algebra is a procedural query language.
1. Six fundamental operations:
o select (unary)
o project (unary)
o rename (unary)
o cartesian product (binary)
o union (binary)
o set-difference (binary)
2. Several other operations, defined in terms of the fundamental operations:
o set-intersection
o natural join
o division
o assignment
3. Operations produce a new relation as a result.
Fundamental Operations:

The Select Operation

Select selects tuples that satisfy a given predicate. Select is denoted by a lowercase Greek sigma
($\sigma$), with the predicate appearing as a subscript. The argument relation is given in parentheses
following the $\sigma$.
For example, to select tuples (rows) of the borrow relation where the branch is "SFU", we would
write
$\sigma_{bname = \text{"SFU"}}(borrow)$
Let Figure 3.3 be the borrow and branch relations in the banking example.
Figure 3.3: The borrow and branch relations.
The new relation created as the result of this operation consists of one
tuple: the borrow row for the SFU branch.
We allow comparisons using $=$, $\neq$, $<$, $\leq$, $>$ and $\geq$ in the selection predicate.
We also allow the logical connectives $\vee$ (or) and $\wedge$ (and). For example:
$\sigma_{bname = \text{"Downtown"} \wedge amount > 1200}(borrow)$
Figure 3.4: The client relation.
Suppose there is one more relation, client, shown in Figure 3.4, with the scheme
$Client\_scheme = (cname, banker)$
we might write
$\sigma_{cname = banker}(client)$
to find clients who have the same name as their banker.
The Project Operation
Project copies its argument relation for the specified attributes only. Since a relation is a set, duplicate
rows are eliminated.
Projection is denoted by the Greek capital letter pi ($\Pi$). The attributes to be copied appear as subscripts.
For example, to obtain a relation showing customers and branches, but ignoring amount and loan#, we
write
$\Pi_{bname, cname}(borrow)$
We can perform these operations on the relations resulting from other operations.
To get the names of customers having the same name as their bankers,
$\Pi_{cname}(\sigma_{cname = banker}(client))$
Think of select as taking rows of a relation, and project as taking columns of a relation.
The Cartesian Product Operation
The cartesian product of two relations is denoted by a cross ($\times$), written
$r_1 \times r_2$ for relations $r_1$ and $r_2$.
The result of $r_1 \times r_2$ is a new relation with a tuple for each possible pairing of tuples from $r_1$ and $r_2$.
In order to avoid ambiguity, the attribute names have attached to them the name of the relation from which
they came. If no ambiguity will result, we drop the relation name.
The result is a very large relation. If $r_1$ has $n_1$ tuples, and $r_2$ has $n_2$ tuples,
then $r = r_1 \times r_2$ will have $n_1 n_2$ tuples.
The resulting scheme is the concatenation of the schemes of $r_1$ and $r_2$, with relation names added as
mentioned.
To find the clients of banker Johnson and the city in which they live, we need information in both client and
customer relations. We can get this by writing
$\sigma_{banker = \text{"Johnson"}}(client \times customer)$
However, the customer.cname column contains customers of bankers other than Johnson. (Why?)
We want rows where client.cname = customer.cname. So we can write
$\sigma_{client.cname = customer.cname}(\sigma_{banker = \text{"Johnson"}}(client \times customer))$
to get just these tuples.
Finally, to get just the customer's name and city, we need a projection:
$\Pi_{client.cname, ccity}(\sigma_{client.cname = customer.cname}(\sigma_{banker = \text{"Johnson"}}(client \times customer)))$

Formal Definition of Relational Algebra:

A basic expression consists of either
o A relation in the database.
o A constant relation.
General expressions are formed out of smaller subexpressions $E_1$ and $E_2$ using
o select $\sigma_p(E_1)$ (p a predicate)
o project $\Pi_s(E_1)$ (s a list of attributes)
o rename $\rho_x(E_1)$ (x a relation name)
o union $E_1 \cup E_2$
o set difference $E_1 - E_2$
o cartesian product $E_1 \times E_2$
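The six fundamental operations are small enough to sketch directly in Python, treating a relation as a list of distinct rows and a row as a dictionary. Everything below is illustrative: the function names, the sample client and customer rows, and the attribute name ccity are assumptions, and rename is simplified to prefixing each attribute with a relation name.

```python
def select(rel, predicate):
    """sigma: keep the rows that satisfy the predicate."""
    return [row for row in rel if predicate(row)]

def project(rel, attrs):
    """pi: keep only the listed attributes, dropping duplicate rows."""
    out = []
    for row in rel:
        picked = {a: row[a] for a in attrs}
        if picked not in out:
            out.append(picked)
    return out

def rename(rel, name):
    """rho (simplified): qualify every attribute with a relation name."""
    return [{f"{name}.{k}": v for k, v in row.items()} for row in rel]

def product(r, s):
    """Cartesian product: one row for each pairing of a row of r with a row of s."""
    return [{**a, **b} for a in r for b in s]

def union(r, s):
    return r + [row for row in s if row not in r]

def difference(r, s):
    return [row for row in r if row not in s]

# Hypothetical sample data for the client and customer relations.
client = [{"cname": "Ann", "banker": "Johnson"}, {"cname": "Bob", "banker": "Lee"}]
customer = [{"cname": "Ann", "ccity": "Burnaby"}, {"cname": "Bob", "ccity": "Surrey"}]

# Clients of banker Johnson and the city in which they live.
pairs = product(rename(client, "client"), rename(customer, "customer"))
johnson = select(pairs, lambda t: t["client.banker"] == "Johnson"
                 and t["client.cname"] == t["customer.cname"])
answer = project(johnson, ["client.cname", "customer.ccity"])
print(len(pairs), answer)
```

The query is built exactly as a general expression is defined: a product of two relations, wrapped in a select, wrapped in a project.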
The Tuple Relational Calculus:

The tuple relational calculus is a nonprocedural language. (The relational algebra was procedural.)
We must provide a formal description of the information desired.

A query in the tuple relational calculus is expressed as
$\{t \mid P(t)\}$
i.e. the set of tuples $t$ for which predicate $P$ is true.

We also use the notation
$t[a]$ to indicate the value of tuple $t$ on attribute $a$.
$t \in r$ to show that tuple $t$ is in relation $r$.
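A tuple calculus query reads much like a comprehension in Python: it states which tuples are wanted, not how to find them. As a rough sketch, with an invented borrow relation and invented attribute names, the query $\{t \mid t \in borrow \wedge t[amount] > 1200\}$ could be written as:

```python
# Hypothetical rows of a borrow relation.
borrow = [
    {"branch": "SFU", "customer": "Hayes", "amount": 1500},
    {"branch": "Downtown", "customer": "Jones", "amount": 1000},
    {"branch": "Downtown", "customer": "Smith", "amount": 2000},
]

# { t | t in borrow and t[amount] > 1200 }
result = [t for t in borrow if t["amount"] > 1200]
print([t["customer"] for t in result])
```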
Relational Database Design:

One of the best ways to understand database design is to start with an all-in-one, flat-file table design and
then toss in some sample data to see what happens. By analysing the sample data, you'll be able to identify
problems caused by the initial design. You can then modify the design to eliminate the problems, test some
more sample data, check for problems, and re-modify, continuing this process until you have a consistent
and problem-free design.
Once you grow accustomed to the types of problems poor table design can create, hopefully you'll be able to
skip the interim steps and jump immediately to the final table design.
A Simple Design Process:
Let's step through a sample database design process.
We'll design a database to keep track of students' sports activities. We'll track each activity a student takes
and the fee per semester to do that activity.
Step 1: Create an Activities table containing all the fields: student's name, activity and cost. Because
some students take more than one activity, we'll make allowances for that and include a second activity and
cost field. So our structure will be: Student, Activity 1, Cost 1, Activity 2, Cost 2
Step 2: Test the table with some sample data. When you create sample data, you should see what your
table lets you get away with. For instance, nothing prevents us from entering the same name for different
students, or different fees for the same activity, so do so. You should also imagine trying to ask questions
about your data and getting answers back (essentially querying the data and producing reports). For
example, how do I find all the students taking tennis?
Step 3: Analyse the data. In this case, we can see a glaring problem in the first field. We have two John
Smiths, and there's no way to tell them apart. We need to find a way to identify each student uniquely.
Uniquely identify records
Let's fix the glaring problem first, then examine the new results.
Step 4: Modify the design. We can identify each student uniquely by giving each one a unique ID, a new
field that we add, called ID. We scrap the Student field and substitute an ID field. Note the asterisk (*)
beside this field in the table below: it signals that the ID field is a key field, containing a unique value in
each record. We can use that field to retrieve any specific record. When you create such a key field in a
database program, the program will then prevent you from entering duplicate values in this field,
safeguarding the uniqueness of each entry.
Our table structure is now: ID, Activity 1, Cost 1, Activity 2, Cost 2
While it's easy for the computer to keep track of ID codes, it's not so useful for humans. So we're going to
introduce a second table that lists each ID and the student it belongs to. Using a database program, we can
create both table structures and then link them by the common field, ID. We've now turned our initial
flat-file design into a relational database: a database containing multiple tables linked together by key fields. If
you were using a database program that can't handle relational databases, you'd basically be stuck with our
first design and all its attendant problems. With a relational database program, you can create as many
tables as your data structure requires.
The Students table would normally contain each student's first name, last name, address, age and other
details, as well as the assigned ID. To keep things simple, we'll restrict it to name and ID, and focus on the
Activities table structure.
Step 5: Test the table with sample data.
Step 6: Analyse the data. There's still a lot wrong with the Activities table:
1. Wasted space. Some students don't take a second activity, and so we're wasting space when we
store the data. It doesn't seem much of a bother in this sample, but what if we're dealing with
thousands of records?
2. Addition anomalies. What if #219 (we can look him up and find it's Mark Antony) wants to do a
third activity? School rules allow it, but there's no space in this structure for another activity. We
can't add another record for Mark, as that would violate the unique key field ID, and it would also
make it difficult to see all his information at once.
3. Redundant data entry. If the tennis fees go up to $39, we have to go through every record
containing tennis and modify the cost.
4. Querying difficulties. It's difficult to find all people doing swimming: we have to search through
Activity 1 and Activity 2 to make sure we catch them all.
5. Redundant information. If 50 students take swimming, we have to type in both the activity and its
cost each time.
6. Inconsistent data. Notice that there are conflicting prices for swimming? Should it be $15 or $17?
This happens when one record is updated and another isn't.
Eliminate recurring fields
The Students table is fine, so we'll keep it. But there's so much wrong with the Activities table let's try to fix
it in stages.
Step 7: Modify the design. We can fix the first four problems by creating a separate record for each
activity a student takes, instead of one record for all the activities a student takes.
First we eliminate the Activity 2 and Cost 2 fields. Then we need to adjust the table structure so we can
enter multiple records for each student. To do that, we redefine the key so that it consists of two fields, ID
and Activity. As each student can only take an activity once, this combination gives us a unique key for each
record.
Our Activities table has now been simplified to: ID, Activity, Cost. Note how the new structure lets students
take any number of activities: they're no longer limited to two.
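The design reached at this point can be tried out in SQLite through Python's sqlite3 module. This is a sketch only: the column types, the ID numbers other than 219, and all activities and fees apart from the conflicting swimming prices are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (ID INTEGER PRIMARY KEY, Student TEXT NOT NULL);
    CREATE TABLE Activities (
        ID INTEGER NOT NULL REFERENCES Students(ID),
        Activity TEXT NOT NULL,
        Cost INTEGER NOT NULL,
        PRIMARY KEY (ID, Activity)        -- the two-field key from Step 7
    );
    INSERT INTO Students VALUES (84, 'John Smith'), (100, 'John Smith'), (219, 'Mark Antony');
    INSERT INTO Activities VALUES
        (84, 'Tennis', 36), (84, 'Swimming', 17),
        (100, 'Squash', 40), (100, 'Swimming', 15),
        (219, 'Swimming', 15), (219, 'Golf', 47), (219, 'Tennis', 36);
""")

# "How do I find all the students taking tennis?" is now a single condition.
tennis = conn.execute(
    "SELECT s.ID, s.Student FROM Students s JOIN Activities a ON a.ID = s.ID "
    "WHERE a.Activity = 'Tennis' ORDER BY s.ID"
).fetchall()

# The composite key stops a student being entered twice for the same activity.
try:
    conn.execute("INSERT INTO Activities VALUES (219, 'Golf', 47)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The inconsistent swimming fee is still possible in this design.
swimming_fees = conn.execute(
    "SELECT DISTINCT Cost FROM Activities WHERE Activity = 'Swimming' ORDER BY Cost"
).fetchall()
print(tennis, duplicate_rejected, swimming_fees)
```

Mark Antony (219) now has three activities, which the two-activity layout could not hold, while the two different swimming fees show the redundancy problem that remains.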
Step 8: Test sample data.
Step 9: Analyse the data. We know we still have the problems with redundant data (activity fees
repeated) and inconsistent data (what's the correct fee for swimming?). We need to fix these things, which
are both problems with editing or modifying records.
Introduction to SQL:
SQL is a standard computer language for accessing and manipulating databases.
o SQL stands for Structured Query Language
o SQL allows you to access a database
o SQL is an ANSI standard computer language
o SQL can execute queries against a database
o SQL can retrieve data from a database
o SQL can insert new records in a database
o SQL can delete records from a database
o SQL can update records in a database
SQL is a Standard - BUT....
SQL is an ANSI (American National Standards Institute) standard computer language for accessing and
manipulating database systems. SQL statements are used to retrieve and update data in a database. SQL
works with database programs like MS Access, DB2, Informix, MS SQL Server, Oracle, Sybase, etc.
Unfortunately, there are many different versions of the SQL language, but to be in compliance with the ANSI
standard, they must support the same major keywords in a similar manner (such as SELECT, UPDATE,
DELETE, INSERT, WHERE, and others).
Note: Most of the SQL database programs also have their own proprietary extensions in addition to the SQL
standard!
Basics of The Select Statement:
In a relational database, data is stored in tables. An example table would relate Social Security Number,
Name, and Address:
Employee Address Table
SSN        First Name  Last Name  Address          City          State
512687458  Joe         Smith      83 First Street  Howard        Ohio
758420012  Mary        Scott      842 Vine Ave.    Losantiville  Ohio
102254896  Sam         Jones      33 Elm St.       Paris         New York
876512563  Sarah       Ackerman   440 U.S. 110     Upton         Michigan
Now, let's say you want to see the address of each employee. Use the SELECT statement, like so:
SELECT FirstName, LastName, Address, City, State
FROM EmployeeAddressTable;
The following is the result of your query of the database:
First Name  Last Name  Address          City          State
Joe         Smith      83 First Street  Howard        Ohio
Mary        Scott      842 Vine Ave.    Losantiville  Ohio
Sam         Jones      33 Elm St.       Paris         New York
Sarah       Ackerman   440 U.S. 110     Upton         Michigan
To explain what you just did, you asked for all of the data in the EmployeeAddressTable, and specifically, you
asked for the columns called FirstName, LastName, Address, City, and State. Note that column names and
table names do not have spaces...they must be typed as one word; and that the statement ends with a
semicolon (;). The general form for a SELECT statement, retrieving all of the rows in the table is:
SELECT ColumnName, ColumnName, ...
FROM TableName;
To get all columns of a table without typing all column names, use:
SELECT *
FROM TableName;
Each database management system (DBMS) and database software has different methods for logging in to
the database and entering SQL commands; see the local computer "guru" to help you get onto the system,
so that you can use SQL.
Conditional Selection
To further discuss the SELECT statement, let's look at a new example table (for hypothetical purposes only):
EmployeeStatisticsTable
EmployeeIDNo  Salary  Benefits  Position
010           75000   15000     Manager
105           65000   15000     Manager
152           60000   15000     Manager
215           60000   12500     Manager
244           50000   12000     Staff
300           45000   10000     Staff
335           40000   10000     Staff
400           32000   7500      Entry-Level
441           28000   7500      Entry-Level
Relational Operators
There are six relational (comparison) operators in SQL, and after introducing them, we'll see how they're used:
=                        Equal
<> or != (see manual)    Not Equal
<                        Less Than
>                        Greater Than
<=                       Less Than or Equal To
>=                       Greater Than or Equal To
The WHERE clause is used to specify that only certain rows of the table are displayed, based on the criteria
described in that WHERE clause. It is most easily understood by looking at a couple of examples.
If you wanted to see the EMPLOYEEIDNO's of those making at or over $50,000, use the following:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE SALARY >= 50000;
Notice that the >= (greater than or equal to) sign is used, as we wanted to see those who made greater
than $50,000, or equal to $50,000, listed together. This displays:
EMPLOYEEIDNO
------------
010
105
152
215
244
The WHERE description, SALARY >= 50000, is known as a condition. The same can be done for text
columns:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE POSITION = 'Manager';
This displays the ID Numbers of all Managers. Generally, with text columns, stick to equal to or not equal to
conditions, and make sure that any text that appears in the statement is surrounded by single quotes (').
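The two WHERE queries above can be run against the sample table with SQLite through Python's sqlite3 module. This is a sketch under a few assumptions: the column types are chosen for the example (the ID is stored as text so that the leading zero of 010 survives), and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeStatisticsTable "
             "(EmployeeIDNo TEXT, Salary INTEGER, Benefits INTEGER, Position TEXT)")
conn.executemany("INSERT INTO EmployeeStatisticsTable VALUES (?, ?, ?, ?)", [
    ("010", 75000, 15000, "Manager"), ("105", 65000, 15000, "Manager"),
    ("152", 60000, 15000, "Manager"), ("215", 60000, 12500, "Manager"),
    ("244", 50000, 12000, "Staff"), ("300", 45000, 10000, "Staff"),
    ("335", 40000, 10000, "Staff"), ("400", 32000, 7500, "Entry-Level"),
    ("441", 28000, 7500, "Entry-Level"),
])

def ids(where):
    """Return the EmployeeIDNo values of the rows matching a WHERE condition."""
    sql = ("SELECT EmployeeIDNo FROM EmployeeStatisticsTable "
           f"WHERE {where} ORDER BY EmployeeIDNo")
    return [row[0] for row in conn.execute(sql)]

high_earners = ids("Salary >= 50000")
managers = ids("Position = 'Manager'")
print(high_earners)
print(managers)
```

Column and table names are not case sensitive in SQL, so the mixed-case names here refer to the same objects as the upper-case names in the text.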

More Complex Conditions: Compound Conditions

The AND operator joins two or more conditions, and displays a row only if that row's data
satisfies ALL conditions listed (i.e. all conditions hold true). For example, to display all staff making over
$40,000, use:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE SALARY > 40000 AND POSITION = 'Staff';
The OR operator joins two or more conditions, but returns a row if ANY of the conditions listed hold true. To
see all those who make less than $40,000 or have less than $10,000 in benefits, listed together, use the
following query:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE SALARY < 40000 OR BENEFITS < 10000;
AND & OR can be combined, for example:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE POSITION = 'Manager' AND SALARY > 60000 OR BENEFITS > 12000;
First, SQL finds the rows where the Position column is equal to 'Manager' and the salary is greater than
$60,000. It then adds every row where the benefits are greater than $12,000, whatever the position. A row
is displayed if it is in either list. Note that the AND operation is done first, because AND binds more
tightly than OR.
To generalize this process, SQL evaluates the conditions joined by AND first, and then applies the OR
operation(s) to those results (remember: any one of the OR-ed conditions being true is enough), and
displays the rows that remain.
To make the grouping explicit, or to force an OR to be evaluated before an AND, use parentheses. For
example, if you wanted to see a list of managers, together with anyone making a large salary (>$50,000)
and a large benefit package (>$10,000), whether he or she is or is not a manager, use:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE POSITION = 'Manager' OR (SALARY > 50000 AND BENEFITS > 10000);

IN & BETWEEN

An easier method of using compound conditions uses IN or BETWEEN. For example, if you wanted to list all
managers and staff:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE POSITION IN ('Manager', 'Staff');
or to list those making greater than or equal to $30,000, but less than or equal to $50,000, use:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE SALARY BETWEEN 30000 AND 50000;
To list everyone not in this range, try:
SELECT EMPLOYEEIDNO
FROM EMPLOYEESTATISTICSTABLE
WHERE SALARY NOT BETWEEN 30000 AND 50000;
Similarly, NOT IN lists all rows excluded from the IN list.
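Compound conditions are a common source of mistakes, so it can help to run them against the sample table. The sketch below uses SQLite through Python's sqlite3 module; the column types and the particular AND/OR thresholds are chosen for the example so that the effect of parentheses is visible, and they are not taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeStatisticsTable "
             "(EmployeeIDNo TEXT, Salary INTEGER, Benefits INTEGER, Position TEXT)")
conn.executemany("INSERT INTO EmployeeStatisticsTable VALUES (?, ?, ?, ?)", [
    ("010", 75000, 15000, "Manager"), ("105", 65000, 15000, "Manager"),
    ("152", 60000, 15000, "Manager"), ("215", 60000, 12500, "Manager"),
    ("244", 50000, 12000, "Staff"), ("300", 45000, 10000, "Staff"),
    ("335", 40000, 10000, "Staff"), ("400", 32000, 7500, "Entry-Level"),
    ("441", 28000, 7500, "Entry-Level"),
])

def ids(where):
    """Return the EmployeeIDNo values of the rows matching a WHERE condition."""
    sql = ("SELECT EmployeeIDNo FROM EmployeeStatisticsTable "
           f"WHERE {where} ORDER BY EmployeeIDNo")
    return [row[0] for row in conn.execute(sql)]

# Without parentheses the AND is evaluated first, then the OR.
and_first = ids("Position = 'Staff' AND Salary > 45000 OR Benefits > 12000")
# Parentheses force the OR to be evaluated first.
or_first = ids("Position = 'Staff' AND (Salary > 45000 OR Benefits > 12000)")

in_list = ids("Position IN ('Manager', 'Staff')")
in_range = ids("Salary BETWEEN 30000 AND 50000")      # both end points included
out_of_range = ids("Salary NOT BETWEEN 30000 AND 50000")

print(and_first)
print(or_first)
print(in_list, in_range, out_of_range)
```

The first query returns the one well-paid staff member plus every row with benefits over $12,000, while the parenthesised version returns only staff, which is why parentheses are worth writing whenever AND and OR are mixed.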

Using LIKE

Look at the EmployeeAddressTable, and say you wanted to see all people whose last names started with
"L"; try:
SELECT SSN
FROM EMPLOYEEADDRESSTABLE
WHERE LASTNAME LIKE 'L%';
The percent sign (%) is used to represent any possible character (number, letter, or punctuation) or set of
characters that might appear after the "L". To find those people with LastName's ending in "L", use '%L', or
if you wanted the "L" in the middle of the word, try '%L%'. The '%' can be used for any characters, in that
relative position to the given characters. NOT LIKE displays rows not fitting the given description. Other
possibilities of using LIKE, or any of these discussed conditionals, are available, though it depends on what
DBMS you are using; as usual, consult a manual or your system manager or administrator for the available
features on your system, or just to make sure that what you are trying to do is available and allowed. This
disclaimer holds for the features of SQL that will be discussed below. This section is just to give you an idea
of the possibilities of queries that can be written in SQL.
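The LIKE patterns can be tried on the last names from the Employee Address Table. The sketch below uses SQLite through Python's sqlite3 module and keeps only the SSN and LastName columns; since no sample name starts with "L", the patterns use other letters. Note that SQLite's LIKE ignores case for ASCII letters by default, which is one of the DBMS differences the text warns about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeAddressTable (SSN TEXT, LastName TEXT)")
conn.executemany("INSERT INTO EmployeeAddressTable VALUES (?, ?)", [
    ("512687458", "Smith"), ("758420012", "Scott"),
    ("102254896", "Jones"), ("876512563", "Ackerman"),
])

def last_names(pattern, negate=False):
    """Return the last names that match (or do not match) a LIKE pattern."""
    op = "NOT LIKE" if negate else "LIKE"
    sql = f"SELECT LastName FROM EmployeeAddressTable WHERE LastName {op} ? ORDER BY LastName"
    return [row[0] for row in conn.execute(sql, (pattern,))]

starts_with_s = last_names("S%")      # S followed by anything
ends_with_n = last_names("%n")        # anything followed by n
contains_o = last_names("%o%")        # o anywhere in the name
not_s = last_names("S%", negate=True)
print(starts_with_s, ends_with_n, contains_o, not_s)
```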

Joins:
In this section, we will only discuss inner joins, and equijoins, as in general, they are the most useful.
Good database design suggests that each table lists data only about a single entity, and detailed information
can be obtained in a relational database, by using additional tables, and by using a join.
First, take a look at these example tables:
AntiqueOwners
OwnerID  OwnerLastName  OwnerFirstName
01       Jones          Bill
02       Smith          Bob
15       Lawson         Patricia
21       Akins          Jane
50       Fowler         Sam

Orders
OwnerID  ItemDesired
02       Table
02       Desk
21       Chair
15       Mirror

Antiques
SellerID  BuyerID  Item
01        50       Bed
02        15       Table
15        02       Chair
21        50       Mirror
50        01       Desk
01        21       Cabinet
02        21       Coffee Table
15        50       Chair
01        15       Jewelry Box
02        21       Pottery
21        02       Bookcase
50        01       Plant Stand
Keys:
First, let's discuss the concept of keys. A primary key is a column or set of columns that uniquely identifies
the rest of the data in any given row. For example, in the AntiqueOwners table, the OwnerID column
uniquely identifies that row. This means two things: no two rows can have the same OwnerID, and, even if
two owners have the same first and last names, the OwnerID column ensures that the two owners will not
be confused with each other, because the unique OwnerID column will be used throughout the database to
track the owners, rather than the names.
A foreign key is a column in a table where that column is a primary key of another table, which means that
any data in a foreign key column must have corresponding data in the other table where that column is the
primary key. In DBMS-speak, this correspondence is known as referential integrity. For example, in the
Antiques table, both the BuyerID and SellerID are foreign keys to the primary key of the AntiqueOwners
table (OwnerID; for purposes of argument, one has to be an Antique Owner before one can buy or sell any
items), as, in both tables, the ID rows are used to identify the owners or buyers and sellers, and the
OwnerID is the primary key of the AntiqueOwners table. In other words, all of this "ID" data is used to refer
to the owners, buyers, or sellers of antiques themselves, without having to use the actual names.
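How a DBMS enforces these two kinds of key can be seen in a short experiment. This is a sketch under assumptions: it uses Python's sqlite3 module, the column types and REFERENCES declarations are illustrative choices not given in the text, and only two owner rows are loaded. SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
conn.execute(
    "CREATE TABLE AntiqueOwners ("
    "OwnerID TEXT PRIMARY KEY, OwnerLastName TEXT, OwnerFirstName TEXT)"
)
conn.execute(
    "CREATE TABLE Antiques ("
    "SellerID TEXT REFERENCES AntiqueOwners(OwnerID), "
    "BuyerID TEXT REFERENCES AntiqueOwners(OwnerID), "
    "Item TEXT)"
)
conn.executemany(
    "INSERT INTO AntiqueOwners VALUES (?, ?, ?)",
    [("01", "Jones", "Bill"), ("50", "Fowler", "Sam")],
)
conn.execute("INSERT INTO Antiques VALUES ('01', '50', 'Bed')")  # both IDs exist: accepted


def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Primary key: a second row with OwnerID '01' is refused, even with identical names.
duplicate_owner_rejected = rejected("INSERT INTO AntiqueOwners VALUES ('01', 'Jones', 'Bill')")
# Referential integrity: BuyerID '99' (a made-up ID) has no matching owner row.
unknown_buyer_rejected = rejected("INSERT INTO Antiques VALUES ('01', '99', 'Lamp')")

print(duplicate_owner_rejected, unknown_buyer_rejected)
```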

Performing a Join

The purpose of these keys is so that data can be related across tables, without having to repeat data in
every table; this is the power of relational databases. For example, you can find the names of those who
bought a chair without having to list the full name of the buyer in the Antiques table. You can get the name
by relating those who bought a chair with the names in the AntiqueOwners table through the use of the
OwnerID, which relates the data in the two tables. To find the names of those who bought a chair, use the
following query:
SELECT OWNERLASTNAME, OWNERFIRSTNAME
FROM ANTIQUEOWNERS, ANTIQUES
WHERE BUYERID = OWNERID AND ITEM = 'Chair';
Note the following about this query. Both tables involved in the relation are listed in the FROM
clause of the statement. In the WHERE clause, first notice that the ITEM = 'Chair' part restricts the listing to
those who have bought (and in this example, thereby own) a chair. Secondly, notice how the ID columns
are related from one table to the next by use of the BUYERID = OWNERID clause. Only where IDs match
across tables and the item purchased is a chair (because of the AND), will the names from the
AntiqueOwners table be listed. Because the joining condition used an equal sign, this join is called an
equijoin. The result of this query is two names: Smith, Bob & Fowler, Sam.
Dot notation refers to prefixing the table names to column names, to avoid ambiguity, as such:
SELECT ANTIQUEOWNERS.OWNERLASTNAME, ANTIQUEOWNERS.OWNERFIRSTNAME
FROM ANTIQUEOWNERS, ANTIQUES
WHERE ANTIQUES.BUYERID = ANTIQUEOWNERS.OWNERID AND ANTIQUES.ITEM = 'Chair';
As the column names are different in each table, however, this wasn't necessary.
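The chair query can be run against the example tables to confirm the two names. This is a sketch assuming Python's sqlite3 module; storing the IDs as TEXT (to keep the leading zeros) is an assumption. The second query is the same equijoin written with the explicit JOIN ... ON syntax, shown only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ANTIQUEOWNERS (OWNERID TEXT PRIMARY KEY, OWNERLASTNAME TEXT, OWNERFIRSTNAME TEXT);
CREATE TABLE ANTIQUES (SELLERID TEXT, BUYERID TEXT, ITEM TEXT);
""")
conn.executemany("INSERT INTO ANTIQUEOWNERS VALUES (?, ?, ?)", [
    ("01", "Jones", "Bill"), ("02", "Smith", "Bob"), ("15", "Lawson", "Patricia"),
    ("21", "Akins", "Jane"), ("50", "Fowler", "Sam"),
])
conn.executemany("INSERT INTO ANTIQUES VALUES (?, ?, ?)", [
    ("01", "50", "Bed"), ("02", "15", "Table"), ("15", "02", "Chair"),
    ("21", "50", "Mirror"), ("50", "01", "Desk"), ("01", "21", "Cabinet"),
    ("02", "21", "Coffee Table"), ("15", "50", "Chair"), ("01", "15", "Jewelry Box"),
    ("02", "21", "Pottery"), ("21", "02", "Bookcase"), ("50", "01", "Plant Stand"),
])

# The equijoin from the text: both tables in FROM, join condition in WHERE.
chair_buyers = sorted(conn.execute("""
    SELECT OWNERLASTNAME, OWNERFIRSTNAME
    FROM ANTIQUEOWNERS, ANTIQUES
    WHERE BUYERID = OWNERID AND ITEM = 'Chair'
"""))

# Equivalent form with dot notation and an explicit JOIN ... ON clause.
chair_buyers_explicit = sorted(conn.execute("""
    SELECT ANTIQUEOWNERS.OWNERLASTNAME, ANTIQUEOWNERS.OWNERFIRSTNAME
    FROM ANTIQUES
    JOIN ANTIQUEOWNERS ON ANTIQUES.BUYERID = ANTIQUEOWNERS.OWNERID
    WHERE ANTIQUES.ITEM = 'Chair'
"""))

print(chair_buyers)
```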

DISTINCT and Eliminating Duplicates

Let's say that you want to list the ID and names of only those people who have sold an antique. Obviously,
you want a list where each seller is only listed once; you don't want to know how many antiques a person
sold, just the fact that this person sold one (for counts, see the Aggregate Function section below). This
means that you will need to tell SQL to eliminate duplicate sales rows, and just list each person only once.
To do this, use the DISTINCT keyword.
First, we will need an equijoin to the AntiqueOwners table to get the detail data of the person's LastName
and FirstName. However, keep in mind that since the SellerID column in the Antiques table is a foreign key
to the AntiqueOwners table, a seller will only be listed if there is a row in the AntiqueOwners table listing the
ID and names. We also want to eliminate multiple occurrences of the SellerID in our listing, so we use
DISTINCT on the column where the repeats may occur.
To throw in one more twist, we will also want the list alphabetized by LastName, then by FirstName (on a
LastName tie), then by OwnerID (on a LastName and FirstName tie). Thus, we will use the ORDER BY
clause:
SELECT DISTINCT SELLERID, OWNERLASTNAME, OWNERFIRSTNAME
FROM ANTIQUES, ANTIQUEOWNERS
WHERE SELLERID = OWNERID
ORDER BY OWNERLASTNAME, OWNERFIRSTNAME, OWNERID;
In this example, since everyone has sold an item, we will get a listing of all of the owners, in alphabetical
order by last name. For future reference (and in case anyone asks), this type of join is considered to be in
the category of inner joins.
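The effect of DISTINCT shows up when the join is run with and without it. This is a sketch assuming Python's sqlite3 module and TEXT IDs. One deliberate deviation from the query above: the final sort key is SELLERID, not OWNERID. The two are equal under the join condition, and some systems (PostgreSQL, for one) insist that ORDER BY columns of a SELECT DISTINCT appear in the select list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ANTIQUEOWNERS (OWNERID TEXT PRIMARY KEY, OWNERLASTNAME TEXT, OWNERFIRSTNAME TEXT);
CREATE TABLE ANTIQUES (SELLERID TEXT, BUYERID TEXT, ITEM TEXT);
""")
conn.executemany("INSERT INTO ANTIQUEOWNERS VALUES (?, ?, ?)", [
    ("01", "Jones", "Bill"), ("02", "Smith", "Bob"), ("15", "Lawson", "Patricia"),
    ("21", "Akins", "Jane"), ("50", "Fowler", "Sam"),
])
conn.executemany("INSERT INTO ANTIQUES VALUES (?, ?, ?)", [
    ("01", "50", "Bed"), ("02", "15", "Table"), ("15", "02", "Chair"),
    ("21", "50", "Mirror"), ("50", "01", "Desk"), ("01", "21", "Cabinet"),
    ("02", "21", "Coffee Table"), ("15", "50", "Chair"), ("01", "15", "Jewelry Box"),
    ("02", "21", "Pottery"), ("21", "02", "Bookcase"), ("50", "01", "Plant Stand"),
])

# Without DISTINCT the join returns one row per sale, so sellers repeat.
rows_without_distinct = conn.execute("""
    SELECT SELLERID, OWNERLASTNAME, OWNERFIRSTNAME
    FROM ANTIQUES, ANTIQUEOWNERS
    WHERE SELLERID = OWNERID
""").fetchall()

# With DISTINCT each seller appears once; ORDER BY alphabetizes the list.
sellers = conn.execute("""
    SELECT DISTINCT SELLERID, OWNERLASTNAME, OWNERFIRSTNAME
    FROM ANTIQUES, ANTIQUEOWNERS
    WHERE SELLERID = OWNERID
    ORDER BY OWNERLASTNAME, OWNERFIRSTNAME, SELLERID
""").fetchall()

print(len(rows_without_distinct), sellers)
```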
Aliases in Subqueries:
In this section, we will talk about Aliases, In and the use of subqueries, and how these can be used in a 3-
table example. First, look at this query which prints the last name of those owners who have placed an order
and what the order is, only listing those orders which can be filled (that is, there is a buyer who owns that
ordered item):
SELECT OWN.OWNERLASTNAME "Last Name", ORD.ITEMDESIRED "Item Ordered"
FROM ORDERS ORD, ANTIQUEOWNERS OWN
WHERE ORD.OWNERID = OWN.OWNERID
AND ORD.ITEMDESIRED IN
(SELECT ITEM
FROM ANTIQUES);
This gives:
Last Name   Item Ordered
---------   ------------
Smith       Table
Smith       Desk
Akins       Chair
Lawson      Mirror
There are several things to note about this query:
1. First, the "Last Name" and "Item Ordered" in the Select lines give the headers on the report.
2. The OWN & ORD are aliases; these are new names for the two tables listed in the FROM clause that
are used as prefixes for all dot notations of column names in the query (see above). This eliminates
ambiguity, especially in the equijoin WHERE clause where both tables have the column named
OwnerID, and the dot notation tells SQL that we are talking about two different OwnerIDs from the
two different tables.
3. Note that the Orders table is listed first in the FROM clause; the listing is driven by the orders,
and the AntiqueOwners table is only used for the detail information (Last Name). For an inner join,
the order of the tables in the FROM clause does not change which rows are returned.
4. Most importantly, the AND in the WHERE clause forces the In Subquery to be invoked ("= ANY" or
"= SOME" are two equivalent uses of IN). What this does is, the subquery is performed, returning
all of the Items owned from the Antiques table, as there is no WHERE clause. Then, for a row from
the Orders table to be listed, the ItemDesired must be in that returned list of Items owned from the
Antiques table, thus listing an item only if the order can be filled from another owner. You can think
of it this way: the subquery returns a set of Items against which each ItemDesired in the Orders table
is compared; the In condition is true only if the ItemDesired is in that returned set from the
Antiques table.
Whew! That's enough on the topic of complex SELECT queries for now. Now on to other SQL statements.
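The filtering done by the IN subquery is easier to see when one order cannot be filled. This is a sketch assuming Python's sqlite3 module. The ANTIQUES table is cut down to its ITEM column because only that column is used here, and the order for a 'Lamp' by owner 50 is a hypothetical extra row that is not in the example tables. The report headers are written as quoted aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ANTIQUEOWNERS (OWNERID TEXT PRIMARY KEY, OWNERLASTNAME TEXT, OWNERFIRSTNAME TEXT);
CREATE TABLE ORDERS (OWNERID TEXT, ITEMDESIRED TEXT);
CREATE TABLE ANTIQUES (ITEM TEXT);  -- only the ITEM column is needed for this query
""")
conn.executemany("INSERT INTO ANTIQUEOWNERS VALUES (?, ?, ?)", [
    ("01", "Jones", "Bill"), ("02", "Smith", "Bob"), ("15", "Lawson", "Patricia"),
    ("21", "Akins", "Jane"), ("50", "Fowler", "Sam"),
])
conn.executemany("INSERT INTO ORDERS VALUES (?, ?)", [
    ("02", "Table"), ("02", "Desk"), ("21", "Chair"), ("15", "Mirror"),
    ("50", "Lamp"),  # hypothetical order: no antique named 'Lamp' exists
])
conn.executemany("INSERT INTO ANTIQUES VALUES (?)", [(item,) for item in [
    "Bed", "Table", "Chair", "Mirror", "Desk", "Cabinet", "Coffee Table",
    "Chair", "Jewelry Box", "Pottery", "Bookcase", "Plant Stand",
]])

cursor = conn.execute("""
    SELECT OWN.OWNERLASTNAME "Last Name", ORD.ITEMDESIRED "Item Ordered"
    FROM ORDERS ORD, ANTIQUEOWNERS OWN
    WHERE ORD.OWNERID = OWN.OWNERID
      AND ORD.ITEMDESIRED IN (SELECT ITEM FROM ANTIQUES)
""")
headers = [column[0] for column in cursor.description]  # the aliases become headers
fillable_orders = sorted(cursor.fetchall())

print(headers, fillable_orders)
```

The 'Lamp' order drops out because 'Lamp' is not in the set returned by the subquery; the four orders from the text remain.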
Normalization in Database:
Normalization is the process of efficiently organizing data in a database.
There are two goals of the normalization process: eliminating redundant data (for example, storing the
same data in more than one table) and ensuring data dependencies make sense (only storing related data in
a table). Both of these are worthy goals as they reduce the amount of space a database consumes and
ensure that data is logically stored.
The Normal Forms
The database community has developed a series of guidelines for ensuring that databases are normalized.
These are referred to as normal forms and are numbered from one (the lowest form of normalization,
referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical applications,
you'll often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen and
won't be covered in this overview.
Before we begin our discussion of the normal forms, it's important to point out that they are guidelines and
guidelines only. Occasionally, it becomes necessary to stray from them to meet practical business
requirements. However, when variations take place, it's extremely important to evaluate any possible
ramifications they could have on your system and account for possible inconsistencies. That said, let's
explore the normal forms.
First Normal Form (1NF)
First normal form (1NF) sets the very basic rules for an organized database:
Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
Second Normal Form (2NF)
Second normal form (2NF) further addresses the concept of removing duplicative data:
Meet all the requirements of the first normal form.
Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
Create relationships between these new tables and their predecessors through the use of foreign keys.
Third Normal Form (3NF)
Third normal form (3NF) goes one large step further:
Meet all the requirements of the second normal form.
Remove columns that are not dependent upon the primary key.
Fourth Normal Form (4NF)
Finally, fourth normal form (4NF) has one additional requirement:
Meet all the requirements of the third normal form.
A relation is in 4NF if it has no multi-valued dependencies.
Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all
the criteria of a 1NF database.
When designing a relational database, it is normally a good thing to "normalize" the database. There are
different degrees of normalization, but in general, relational databases should be normalized to the "third
normal form". Simply put, this means that the attributes in each table should "depend on the key, the
whole key and nothing but the key".
An example of a de-normalized database table is provided below. The database designer has assumed that
there will never be a need to have more than two order items in any one order:
By moving repeating groups of attributes to a separate database table, the database design becomes more
flexible. A single order can now support any number of order items, not just two. The primary key (PK)
of the Order Item table is the "Order Nbr" (represented by the relationship) plus the "Order Item Nbr":
The "Order Item Description" field is dependent on the "Order Item Code", not on the unique identifier of the
Order Item Table (i.e. "Order Nbr" + "Order Item Nbr"). By creating a classification table, the database
becomes even more flexible. New codes can easily be added. The "Order Item Description" for a given code
can easily be altered should the need ever arise (e.g. "blue widget" => "light blue widget"):
An RDBMS alone will not solve all data management issues. A good data analyst and/or database analyst is
needed to design a flexible and efficient relational database.
There are many different vendors that currently produce relational database management systems
(RDBMS). Relational databases vary significantly in their capabilities and in costs. Some products are
proprietary while others are open source. The leading vendors of RDBMS are listed below:
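The order-item design described above can be sketched as two tables. This is an illustration only: it assumes Python's sqlite3 module, and the table names, column names, order numbers and the code 'W1' are all hypothetical, since the original diagrams are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Classification table: one row per item code.
CREATE TABLE item_code (
    order_item_code        TEXT PRIMARY KEY,
    order_item_description TEXT
);
-- Order items: primary key is Order Nbr plus Order Item Nbr.
CREATE TABLE order_item (
    order_nbr       INTEGER,
    order_item_nbr  INTEGER,
    order_item_code TEXT REFERENCES item_code(order_item_code),
    PRIMARY KEY (order_nbr, order_item_nbr)
);
INSERT INTO item_code VALUES ('W1', 'blue widget');
INSERT INTO order_item VALUES (1001, 1, 'W1');
INSERT INTO order_item VALUES (1001, 2, 'W1');
INSERT INTO order_item VALUES (1001, 3, 'W1');
INSERT INTO order_item VALUES (1002, 1, 'W1');
""")

# An order is no longer limited to two items.
items_on_order_1001 = conn.execute(
    "SELECT COUNT(*) FROM order_item WHERE order_nbr = 1001"
).fetchone()[0]

# The description lives in one place, so one UPDATE changes it everywhere.
conn.execute(
    "UPDATE item_code SET order_item_description = 'light blue widget' "
    "WHERE order_item_code = 'W1'"
)
descriptions = {row[0] for row in conn.execute("""
    SELECT c.order_item_description
    FROM order_item AS i
    JOIN item_code AS c ON c.order_item_code = i.order_item_code
""")}

print(items_on_order_1001, descriptions)
```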
RDBMS Vendors            RDBMS
Computer Associates      INGRES
IBM                      DB2
INFORMIX Software        INFORMIX
Oracle Corporation       Oracle
Microsoft Corporation    MS Access
Microsoft Corporation    SQL Server
MySQL AB                 MySQL
NCR                      Teradata
PostgreSQL Dvlp Grp      PostgreSQL
Sybase                   Sybase 11
Although most businesses manage their corporate data in relational database management systems
(RDBMS), many businesses still operate application systems that use flat files for data storage. Many of
these systems are legacy "batch" systems that can't support online data transactions. A flat file can be
stored on computer tape or on a hard drive of some sort.
Network databases such as IDMS became popular in the 1980s, when computers were much less powerful
than the ones that exist today. Although network databases supported online transactions, the databases
were relatively inflexible. Once a database was designed, it was often costly to implement changes.
Hierarchical databases were also popular in the 1970s and 1980s.
Eliminate Repeating Groups(1NF):
In the original member list, each member name is followed by any databases that the member has
experience with. Some might know many, and others might not know any. To answer the question, "Who
knows DB2?" we need to perform an awkward scan of the list looking for references to DB2. This is
inefficient and an extremely untidy way to store information.
Moving the known databases into a separate table helps a lot. Separating the repeating groups of databases
from the member information results in first normal form. The MemberID in the database table matches
the primary key in the member table, providing a foreign key for relating the two tables with a join
operation. Now we can answer the question by looking in the database table for "DB2" and getting the list of
members.
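The "Who knows DB2?" question then becomes a plain join. This is a minimal sketch assuming Python's sqlite3 module; the table layouts, the name MemberDatabase for the database table, and the member rows are hypothetical stand-ins for the member list described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (MemberID INTEGER PRIMARY KEY, Name TEXT);
-- One row per (member, database) pair replaces the repeating group.
CREATE TABLE MemberDatabase (
    MemberID     INTEGER REFERENCES Member(MemberID),
    DatabaseName TEXT
);
INSERT INTO Member VALUES (1, 'Bill');
INSERT INTO Member VALUES (2, 'Mary');   -- knows no database: simply no rows below
INSERT INTO Member VALUES (3, 'Steve');
INSERT INTO MemberDatabase VALUES (1, 'DB2');
INSERT INTO MemberDatabase VALUES (1, 'Oracle');
INSERT INTO MemberDatabase VALUES (3, 'DB2');
""")

# "Who knows DB2?" is a lookup plus a join, not a scan of free-form lists.
db2_members = [row[0] for row in conn.execute("""
    SELECT m.Name
    FROM Member AS m
    JOIN MemberDatabase AS d ON d.MemberID = m.MemberID
    WHERE d.DatabaseName = 'DB2'
    ORDER BY m.Name
""")]

print(db2_members)
```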
Eliminate Redundant Data(2NF):
In the Database Table, the primary key is made up of the MemberID and the DatabaseID. This makes sense
for other attributes like "Where Learned" and "Skill Level", since they will be different for every
member/database combination. But the database name depends only on the DatabaseID. The same
database name will appear redundantly every time its associated ID appears in the Database Table.
Suppose you want to reclassify a database - give it a different DatabaseID. The change has to be made for
every member that lists that database! If you miss some, you'll have several members with the same
database under different IDs. This is an update anomaly.
Or suppose the last member listing a particular database leaves the group. His records will be removed from
the system, and the database will not be stored anywhere! This is a delete anomaly. To avoid these
problems, we need second normal form.
To achieve this, separate the attributes depending on both parts of the key from those depending only on
the DatabaseID. This results in two tables: "Database" which gives the name for each DatabaseID, and
"MemberDatabase" which lists the databases for each member.
Now we can reclassify a database in a single operation: look up the DatabaseID in the "Database" table and
change its name. The result will instantly be available throughout the application.
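Both anomalies can be checked against the two-table design. This is a sketch assuming Python's sqlite3 module; the table names "Database" and "MemberDatabase" come from the text, while the column names, the sample rows and the new name 'DB2 UDB' are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Database" (DatabaseID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE MemberDatabase (
    MemberID     INTEGER,
    DatabaseID   INTEGER REFERENCES "Database"(DatabaseID),
    WhereLearned TEXT,
    SkillLevel   TEXT,
    PRIMARY KEY (MemberID, DatabaseID)
);
INSERT INTO "Database" VALUES (1, 'DB2');
INSERT INTO "Database" VALUES (2, 'Oracle');
INSERT INTO MemberDatabase VALUES (1, 1, 'Work', 'Expert');
INSERT INTO MemberDatabase VALUES (3, 1, 'School', 'Novice');
INSERT INTO MemberDatabase VALUES (1, 2, 'Work', 'Novice');
""")

# No update anomaly: the name is stored once, so one UPDATE reaches every member.
conn.execute("""UPDATE "Database" SET Name = 'DB2 UDB' WHERE DatabaseID = 1""")
names_seen_by_members = {row[0] for row in conn.execute("""
    SELECT d.Name
    FROM MemberDatabase AS md
    JOIN "Database" AS d ON d.DatabaseID = md.DatabaseID
    WHERE md.DatabaseID = 1
""")}

# No delete anomaly: removing the last member who lists Oracle keeps Oracle on record.
conn.execute("DELETE FROM MemberDatabase WHERE DatabaseID = 2")
known_databases = [row[0] for row in conn.execute(
    """SELECT Name FROM "Database" ORDER BY DatabaseID"""
)]

print(names_seen_by_members, known_databases)
```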
Eliminate Columns Not Dependent on Key(3NF):
The Member table satisfies first normal form - it contains no repeating groups. It satisfies second normal
form - since it doesn't have a multivalued key. But the key is MemberID, and the company name and
location describe only a company, not a member. To achieve third normal form, they must be moved into a
separate table. Since they describe a company, CompanyCode becomes the key of the new "Company"
table.
The motivation for this is the same as for second normal form: we want to avoid update and delete anomalies.
For example, suppose no members from IBM were currently stored in the database. With the previous
design, there would be no record of its existence, even though 20 past members were from IBM!
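The IBM case can be reproduced with the split tables. This is a sketch assuming Python's sqlite3 module; "Company" and CompanyCode come from the text, while the other column names, the codes and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (CompanyCode TEXT PRIMARY KEY, Name TEXT, Location TEXT);
CREATE TABLE Member (
    MemberID    INTEGER PRIMARY KEY,
    Name        TEXT,
    CompanyCode TEXT REFERENCES Company(CompanyCode)
);
INSERT INTO Company VALUES ('IBM', 'IBM', 'Armonk');
INSERT INTO Company VALUES ('ORCL', 'Oracle', 'Redwood Shores');
INSERT INTO Member VALUES (1, 'Bill', 'ORCL');  -- no current member is from IBM
""")

# The company row survives even when no member refers to it.
companies_without_members = [row[0] for row in conn.execute("""
    SELECT c.Name
    FROM Company AS c
    LEFT JOIN Member AS m ON m.CompanyCode = c.CompanyCode
    WHERE m.MemberID IS NULL
""")]

print(companies_without_members)
```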
BCNF(Boyce-Codd Normal Form):
Boyce-Codd Normal Form states mathematically that:
A relation R is said to be in BCNF if whenever X -> A holds in R, and A is not in X, then X is a candidate key
for R (more precisely, X must contain a candidate key, i.e. be a superkey).
BCNF covers very specific situations where 3NF misses inter-dependencies between non-key (but candidate
key) attributes. Typically, any relation that is in 3NF is also in BCNF. However, a 3NF relation may fail to be
in BCNF if (a) there are multiple candidate keys, (b) the keys are composed of multiple attributes, and (c)
there are common attributes between the keys.
Basically, a humorous way to remember BCNF is that all functional dependencies are:
"The key, the whole key, and nothing but the key, so help me Codd."
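The definition can be turned into a small check based on attribute closure. This is a sketch, not a full normalization tool: the function names are made up for illustration, and the Student/Course/Instructor relation is a hypothetical textbook-style example chosen because it has two overlapping composite candidate keys, which is the situation described in (a) to (c).

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by the given attributes."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the non-trivial dependencies X -> A whose left side X is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(relation)
    ]


# Hypothetical relation: each instructor teaches exactly one course.
enrolment = ("Student", "Course", "Instructor")
enrolment_fds = [
    (("Student", "Course"), ("Instructor",)),
    (("Instructor",), ("Course",)),
]
enrolment_violations = bcnf_violations(enrolment, enrolment_fds)

# A relation whose only dependency starts from its key has no violations.
company = ("CompanyCode", "Name", "Location")
company_fds = [(("CompanyCode",), ("Name", "Location"))]
company_violations = bcnf_violations(company, company_fds)

print(enrolment_violations, company_violations)
```

Instructor determines Course, yet Instructor alone does not determine Student, so it is not a superkey and the dependency is reported.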
Isolate Independent Multiple Relationships(4NF):
This applies primarily to key-only associative tables, and appears as a ternary relationship, but has
incorrectly merged 2 distinct, independent relationships.
The way this situation starts is by a business request like the one shown below. This could be any 2 M:M
relationships from a single entity. For instance, a member could know many software tools, and a software
tool may be used by many members. Also, a member could have recommended many books, and a book
could be recommended by many members.

Figure: Initial business request
So, to resolve the two M:M relationships, we know that we should resolve them separately, and that would
give us 4th normal form. But, if we were to combine them into a single table, it might look right (it is in 3rd
normal form) at first. This is shown below, and violates 4th normal form.
Figure: Incorrect solution
To get a picture of what is wrong, look at some sample data, shown below. The first few records look right,
where Bill knows ERWin and recommends the ERWin Bible for everyone to read. But something is wrong
with Mary and Steve. Mary didn't recommend a book, and Steve doesn't know any software tools. Our
solution has forced us to do strange things like create dummy records in both Book and Software to allow
the record in the association, since it is a key-only table.
Figure: Sample data from incorrect solution
The correct solution, to cause the model to be in 4th normal form, is to ensure that all M:M relationships are
resolved independently if they are indeed independent, as shown below.
Figure: Correct 4th normal form
NOTE: This is not to say that ALL ternary associations are invalid. The above situation made it obvious that
Books and Software were independently linked to Members. If, however, there were distinct links between all
three, such that we would be stating that "Bill recommends the ERWin Bible as a reference for ERWin", then
separating the relationship into two separate associations would be incorrect. In that case, we would lose
the distinct information about the 3-way relationship.
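With the two relationships resolved separately, Mary and Steve need no dummy records. This is a sketch assuming Python's sqlite3 module; the table and column names are hypothetical, and which tool Mary knows and which book Steve recommends are assumptions made for the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two independent key-only association tables, one per M:M relationship.
CREATE TABLE MemberSoftware (Member TEXT, Software TEXT, PRIMARY KEY (Member, Software));
CREATE TABLE MemberBook (Member TEXT, Book TEXT, PRIMARY KEY (Member, Book));
INSERT INTO MemberSoftware VALUES ('Bill', 'ERWin');
INSERT INTO MemberSoftware VALUES ('Mary', 'ERWin');       -- assumed tool for Mary
INSERT INTO MemberBook VALUES ('Bill', 'ERWin Bible');
INSERT INTO MemberBook VALUES ('Steve', 'ERWin Bible');    -- assumed book for Steve
""")


def values(sql, member):
    """Return the first column of a single-parameter query as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql, (member,)))


# Mary recommends no book and Steve knows no tool: the rows are simply absent.
mary_books = values("SELECT Book FROM MemberBook WHERE Member = ?", "Mary")
steve_software = values("SELECT Software FROM MemberSoftware WHERE Member = ?", "Steve")
bill_books = values("SELECT Book FROM MemberBook WHERE Member = ?", "Bill")

print(mary_books, steve_software, bill_books)
```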
Isolate Semantically Related Multiple Relationships(5NF):
OK, now let's modify the original business diagram and add a link between the books and the software tools,
indicating which books deal with which software tools, as shown below.
Figure: Initial business request
This makes sense after the discussion on Rule 4, and again we may be tempted to resolve the multiple M:M
relationships into a single association, which would now violate 5th normal form. The ternary association
looks identical to the one shown in the 4th normal form example, and is also going to have trouble
displaying the information correctly. This time we would have even more trouble because we can't show the
relationships between books and software unless we have a member to link to, or we have to add our
favorite dummy member record to allow the record in the association table.
Figure: Incorrect solution
The solution, as before, is to ensure that all M:M relationships that are independent are resolved
independently, resulting in the model shown below. Now information about members and books, members
and software, and books and software are all stored independently, even though they are all very much
semantically related. It is very tempting in many situations to combine the multiple M:M relationships
because they are so similar. Within complex business discussions, the lines can become blurred and the
correct solution not so obvious.
Figure: Correct 5th normal form
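The three-way split can be sketched the same way. This again assumes Python's sqlite3 module with hypothetical table and column names; the point of the example is that a book-to-software link can be stored before any member is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One association table per independent M:M relationship.
CREATE TABLE MemberBook (Member TEXT, Book TEXT, PRIMARY KEY (Member, Book));
CREATE TABLE MemberSoftware (Member TEXT, Software TEXT, PRIMARY KEY (Member, Software));
CREATE TABLE BookSoftware (Book TEXT, Software TEXT, PRIMARY KEY (Book, Software));
-- Record which book deals with which tool; no member row is required.
INSERT INTO BookSoftware VALUES ('ERWin Bible', 'ERWin');
""")

books_about_erwin = [row[0] for row in conn.execute(
    "SELECT Book FROM BookSoftware WHERE Software = 'ERWin'"
)]
member_links = conn.execute(
    "SELECT (SELECT COUNT(*) FROM MemberBook) + (SELECT COUNT(*) FROM MemberSoftware)"
).fetchone()[0]

print(books_about_erwin, member_links)
```

In the single ternary table, the same fact would have needed a dummy member to complete the key.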
