Distributed computing

Mei-Ling Liu

12/08/21 Distributed Computing Introduction, M. Liu

Distributed system, distributed computing
- Early computing was performed on a single processor. Uni-processor computing can be called centralized computing.
- A distributed system is a collection of independent computers, interconnected via a network, capable of collaborating on a task.
- Distributed computing is computing performed in a distributed system.

Distributed Systems
[Figure: workstations on a local network, connected through the Internet to a network host.]

Examples of Distributed systems
- Network of workstations (NOW): a group of networked personal workstations connected to one or more server machines.
- The Internet
- An intranet: a network of computers and workstations within an organization, segregated from the Internet via a protective device (a firewall).

Example of a large-scale distributed system
– eBay (Source: Los Angeles Times.)

An example small-scale distributed system
(Source: Los Angeles Times.)

Computers in a Distributed System

- Workstations: computers used by end-users to perform computing
- Server machines: computers which provide resources and services
- Personal Assistance Devices: handheld computers connected to the system via a wireless communication link.

The power of the Internet
(Source: the Usability Professional Association’s site.)

- 60 million American households use computers. (The New York Times, 5/28/98)
- The number of computer users in the workplace has increased from 600,000 in 1976 to 80 million today. (San Francisco Examiner, 3/29/98)
- 84% of Internet users say that the Web is indispensable. Nearly the same percentage find e-mail indispensable. 85% use the Internet every day. (GVU, 1997)

The Power of the Internet – 2
(Source: www.cisco.com)

- BACKBONE CAPACITY: The capacity of the Internet backbone to carry information is doubling every 100 days. (U.S. Internet Council, Apr. 1999).
- DATA TRAFFIC SURPASSING VOICE: Voice traffic is growing at 10% per year or less, while data traffic is conservatively estimated to be growing at 125% per year, meaning voice will be less than 1% of the total traffic by 2007. (Technology Futures, Inc., March 2000).
- DOMAIN NAMES: There are 12,844,877 unique domain names (e.g. Cisco.com) registered worldwide, with 428,023 new domain names registered each week. (NetNames Statistics 12/28/1999).

The Power of the Internet – 3
(Source: www.cisco.com)

- HOST COMPUTERS: In July 1999 there were 56.2 million "host" computers supporting web pages. In July 1997 there were 19.5 million host computers, with 3.2 million hosts in July 1994, and a mere 80,000 in July 1989. (Internet Software Consortium – Internet Domain Survey).
- TOTAL AMOUNT OF DATA: 1,570,000,000 pages, 29,400,000,000,000 bytes of text, 353,000,000 images, and 5,880,000,000,000 bytes of image data. (The Censorware Project, Jan. 26, 1999).

The Power of the Internet – 4
(Source: www.cisco.com)

- EMAIL VOLUME: The average U.S. consumer will receive 1,600 commercial email messages in 2005, up from 40 in 1999, while non-marketing and personal correspondence will more than double from approximately 1,750 emails per year in 1999 to almost 4,000 in 2005 (Jupiter Communications, May 2000).
- 159 million computers in the U.S., 135 million in EU, and 116 million in Asia Pacific (as of April 2000).
- WEB HITS/DAY: U.S. web pages averaged one billion hits per day (aggregate) in October 1999. (eMarketer/Media Metrix, Nov. 1999).

NUMBER OF AMERICANS ONLINE – HISTORICAL
(Source: www.cisco.com)

- 1993 – 90,000 (U.S. Internet Council, Apr. 1999).
- 1997 – 19 million (Strategis Group, Apr. 1999).
- 1998 – 68 million (Strategis Group, Nov. 1999).
- 1998 – 84 million from home or work (Strategis Group, Apr. 1999).
- 1998 – 37 million DAILY (Strategis Group, Apr. 1999).
- 1999, Nov. – 118.4 million (Cyberatlas/Nielsen Net Ratings, Dec. 1999).
- 1999, Nov. – 74 million actually went online (Cyberatlas/Nielsen Net Ratings, Dec. 1999).

PERCENTAGE OF AMERICANS ONLINE
(Source: www.cisco.com)

- 1998 – 28% (IDC, Oct. 1999).
- 1998 – 42% of the U.S. adult population. (Strategis Group, Apr. 1999)
- 2003 – 62% (IDC, Oct. 1999).
- 2003 – 67% (Yankee Group, 1999).
- 2005 – 91% (Strategy Analytics, Dec. 1999).

The Power of the Internet – 5
(Source: www.cisco.com)

- NEW USERS Q1 2000: More than 5 million Americans joined the online world in the first quarter of 2000, which averages to roughly 55,000 new users each day, 2,289 new users each hour, or 38 new users each minute. (CyberAtlas / Telecommunications Reports International, May 2000).
- US INTERNET USAGE: The average US Internet user went online for 18 sessions, spent a total of 9 hours, 5 minutes and 24 seconds online, and visited 10 unique sites per month. (Nielsen NetRatings, June 2000).

The Power of the Internet – 6
(Source: www.cisco.com)

- E-MAIL 1998: The U.S. Postal Service delivered 101 billion pieces of paper mail in 1998. Estimates for e-mail messages sent in 1998 range from 618 billion to 4 trillion. (U.S. Internet Council, Apr. 1999).
- E-MAIL 1999: There are 270 million e-mailboxes in the U.S. -- roughly 2.5 per user. (eMarketer/Messaging Online, Nov. 1999).
- HOURS ONLINE (Veronis, Suhler & Associates, Nov. 1999):
  - 1997 – 28 hours per capita
  - 1998 – 74 hours per capita
  - 2003 – 192 hours per capita

ONLINE WORLDWIDE
(Source: www.cisco.com)

- 1998 – 95.43 million people. (eMarketer eStats 1999).
- 1998, Dec. – 144 million (IDC, Dec. 1999).
- 1999, Dec. – 240 million (IDC, Dec. 1999).
- 2002 – over 490 million (Computer Industry Almanac, Nov. 1999).
- 2005 – over 765 million (Computer Industry Almanac, Nov. 1999).
- U.S. -- 136 million (36% of world’s total) (eMarketer, May 2000), followed by Japan (27 M), UK (18 M), and China (16 M).

Wireless access to the Internet
(Source: www.cisco.com)

- U.S. WIRELESS USERS: 61.5 million Americans will be using wireless devices to access the Internet in 2003, up from 7.4 million in the US today (728% increase). (IDC Research, Feb. 2000).
- MOBILE DATA: Almost 80% of the US Internet population will access data from mobile phones in a year’s time, up from the current figure of 3%. (Corechange, Inc & Cap Gemini USA, Apr. 2000).

“The network really is the computer.”
Tim O’Reilly, in an address at 6/2000 Java One:
“By now, it's a truism that the Internet runs on open source. Bind, the
Berkeley Internet Name Daemon, is the single most mission critical
program on the Internet, followed closely by Sendmail and Apache, open
source servers for two of the Internet's most widely used application
protocols, SMTP and HTTP.”
Early “killer apps”:
- usenet: distributed bulletin board
- email
- talk
Recent “killer apps”:
- the web
- collaborative computing

Centralized vs. Distributed Computing

[Figure: two panels. Centralized computing: terminals attached to a mainframe computer. Distributed computing: workstations and network hosts connected by network links.]

Monolithic mainframe applications vs. distributed applications
(based on http://www.inprise.com/visibroker/papers/distributed/wp.html)

- The monolithic mainframe application architecture:
  - Separate, single-function applications, such as order-entry or billing
  - Applications cannot share data or other resources
  - Developers must create multiple instances of the same functionality (service).
  - Proprietary (user) interfaces
- The distributed application architecture:
  - Integrated applications
  - Applications can share resources
  - A single instance of functionality (service) can be reused.
  - Common user interfaces

Evolution of paradigms
- Client-server: Socket API, remote method invocation
- Distributed objects
- Object broker: CORBA
- Network service: Jini
- Object space: JavaSpaces
- Mobile agents
- Message-oriented middleware (MOM): Java Message Service
- Collaborative applications

Cooperative distributed computing projects
Cooperative distributed computing projects
(also called distributed computing in some
literature): these are projects that parcel out
large-scale computing to workstations, often
making use of surplus CPU cycles. Example:
seti@home: project to scan data retrieved by a
radio telescope to search for radio signals
from another world.

Why distributed computing?
- Economics: distributed systems allow the pooling of resources, including CPU cycles, data storage, input/output devices, and services.
- Reliability: a distributed system allows replication of resources and/or services, thus reducing service outage due to failures.
- The Internet has become a universal platform for distributed computing.

The Weaknesses and Strengths of Distributed Computing
In any form of computing, there is always a tradeoff between advantages and disadvantages.
Some of the reasons for the popularity of distributed computing:
- The affordability of computers and availability of network access
- Resource sharing
- Scalability
- Fault tolerance

The Weaknesses and Strengths of Distributed Computing
The disadvantages of distributed computing:
- Multiple points of failure: the failure of one or more participating computers, or one or more network links, can spell trouble.
- Security concerns: in a distributed system, there are more opportunities for unauthorized attack.

Introductory Basics

M. L. Liu

Basics in three areas
Some of the notations and concepts from these areas will be employed from time to time in the presentations for this course:
- Software engineering
- Operating systems
- Networks

Software Engineering Basics

Procedural versus Object-oriented Programming

In building network applications, there are two main classes of programming languages: procedural languages and object-oriented languages.
- Procedural languages, with the C language being the primary example, use procedures (functions) to break down the complexity of the tasks that an application entails.
- Object-oriented languages, exemplified by Java, use objects to encapsulate the details. Each object simulates an object in real life, carrying state data as well as behaviors. State data are represented as instance data. Behaviors are represented as methods.
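
To make the idea of state data and behaviors concrete, here is a minimal Java sketch. The class BankAccount, its fields, and its methods are hypothetical names invented for this illustration; they do not come from the course material.

```java
// Hypothetical example: all class and member names are invented for illustration.
class AccountDemo {
    public static void main(String[] args) {
        BankAccount account = new BankAccount("Alice", 1000);
        account.deposit(250);
        System.out.println(account.getOwner() + " has "
                + account.getBalanceCents() + " cents");
    }
}

class BankAccount {
    // State data: represented as instance data.
    private final String owner;
    private long balanceCents;

    BankAccount(String owner, long openingBalanceCents) {
        this.owner = owner;
        this.balanceCents = openingBalanceCents;
    }

    // Behavior: a method that changes the object's state.
    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents;
    }

    // Behaviors: methods that report the object's state.
    long getBalanceCents() { return balanceCents; }
    String getOwner() { return owner; }
}
```

The details of how the balance is stored stay hidden inside the object; callers only see the methods.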

UML Class Diagram Notations

[Figure: Basic UML class diagram notations. A class/interface is represented as a box with three compartments: the interface/class name; the attributes (static/instance variables/constants, written as name: type); and the operations (static or instance methods, written as method names).]

NOTE: The shape, the style of the line (dashed or solid), the direction of the arrow, and the shape of the arrowheads (pointed, hollow, or solid) are significant.

UML Class Diagram Notations

[Figure: two class diagram examples, each box listing attributes and operations. (1) class A and class B: class B depends on (uses) class A. (2) class C and someInterface: class C implements the Java interface someInterface.]

UML Class Diagram Notations

[Figure: two more class diagram examples, each box listing attributes and operations. (1) interface D and class E: class E implements the programmer-provided interface D. (2) interface F and class G: class G inherits from class F.]

The Architecture of Distributed Applications

[Figure: three layers, from top to bottom: Presentation, Application (Business) logic, Services.]

Operating Systems Basics

Operating systems basics
- A process consists of an executing program, its current values, state information, and the resources used by the operating system to manage its execution.
- A program is an artifact constructed by a software developer; a process is a dynamic entity which exists only when a program is run.
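
The distinction between a program and a process can be observed from inside a running Java program. The sketch below assumes Java 9 or later, where the standard ProcessHandle API is available; the class name ProcessInfoDemo and the helper currentPid are invented for this illustration.

```java
// A minimal sketch, assuming Java 9+ (ProcessHandle). Names are illustrative only.
class ProcessInfoDemo {
    // Returns the operating-system process ID of the running JVM process.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        ProcessHandle self = ProcessHandle.current();
        // The same program (class file) gets a different process ID on each run.
        System.out.println("process ID: " + self.pid());
        System.out.println("command:    "
                + self.info().command().orElse("(unavailable)"));
        System.out.println("alive:      " + self.isAlive());
    }
}
```

Running the program twice typically prints two different process IDs: the program is the same artifact, while each run creates a new process.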

Process State Transition Diagram

[Figure: Simplified finite state diagram for a process's lifetime. States: start, ready, running, blocked, terminated. Transition labels: queued, dispatch, waiting for event, event completion, exit.]

Java processes
- There are three types of Java programs: applications, applets, and servlets, all of which are written as classes.
  - A Java application program has a main method, and is run as an independent (standalone) process.
  - An applet does not have a main method, and is run using a browser or the appletviewer.
  - A servlet does not have a main method, and is run in the context of a web server.
- A Java program is compiled into bytecode, a universal object code. When run, the bytecode is interpreted by the Java Virtual Machine (JVM).

Three Types of Java programs
- Applications: a program whose byte code can be run on any system which has a Java Virtual Machine. An application may be standalone (monolithic) or distributed (if it interacts with another process).
- Applets: a program whose byte code is downloaded from a remote machine and is run in the browser’s Java Virtual Machine.
- Servlets: a program whose byte code resides on a remote machine and is run at the request of an HTTP client (a browser).

Three Types of Java programs

[Figure: three panels.
 (1) A standalone Java application is run on a local machine: a Java object runs in the Java Virtual Machine on the computer.
 (2) An applet is an object downloaded (transferred) from a remote machine, then run on a local machine in the Java Virtual Machine.
 (3) A servlet is an object that runs on a remote machine and interacts with a local program (a process) using a request-response protocol.]

A sample Java application

/*******************************************************
 * A sample of a simple Java application.
 * M. Liu 1/8/02
 ******************************************************/

import java.io.*;

class MyProgram {

  public static void main(String[] args)
      throws IOException {

    // Wrap standard input so that a whole line can be read at once.
    BufferedReader keyboard = new
        BufferedReader(new InputStreamReader(System.in));
    String theName;
    System.out.println("What is your name?");
    theName = keyboard.readLine();
    System.out.print("Hello " + theName);
    System.out.println(" - welcome to CSC369.\n");

  } // end main

} // end class

A Sample Java Applet

The applet source:

/***************************************************
 * A sample of a simple applet.
 * M. Liu 1/8/02
 ***************************************************/
import java.applet.Applet;
import java.awt.*;

public class MyApplet extends Applet {

  public void paint(Graphics g) {
    setBackground(Color.blue);
    Font Claude = new Font("Arial", Font.BOLD, 40);
    g.setFont(Claude);
    g.setColor(Color.yellow);
    g.drawString("Hello World!", 100, 100);
  } // end paint

} // end class

The web page that runs the applet:

<!-- A web page which, when browsed, will run -->
<!-- the MyApplet applet -->
<!-- M. Liu 1/8/02 -->
<title>Sample Applet</title>
<hr>
<applet code="MyApplet.class" width=500 height=500>
</applet>
<hr>
<a href="Hello.java">The source.</a>

A Sample Java Servlet

/*******************************************************
 * A sample of a simple Java servlet.
 * M. Liu 1/8/02
 ******************************************************/
import java.io.*;
import java.text.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class MyServlet extends HttpServlet {

  public void doGet(HttpServletRequest request,
                    HttpServletResponse response)
      throws ServletException, IOException {

    PrintWriter out;
    String title = "MyServlet Output";
    // set content type and other response header
    // fields first
    response.setContentType("text/html");
    // then write the data of the response
    out = response.getWriter();
    out.println("<HTML><HEAD><TITLE>");
    out.println(title);
    out.println("</TITLE></HEAD><BODY>");
    out.println("<H1>" + title + "</H1>");
    out.println("<P>Hello World!");
    out.println("</BODY></HTML>");
    out.close();
  } // end doGet

} // end class

Concurrent Processing
On modern-day operating systems, multiple processes appear to be executing concurrently on a machine by timesharing resources.

[Figure: Timesharing of a resource. Processes P1 through P4 take turns using the resource along a time axis.]

Concurrent processing within a process

It is often useful for a process to have parallel threads of execution, each of which timeshares the system resources in much the same way as concurrent processes.

[Figure: Concurrent processing within a process. Left: a parent process may spawn child processes. Right: a process may spawn child threads (a main thread with child thread 1 and child thread 2).]

Java threads
- The Java Virtual Machine allows an application to have multiple threads of execution running concurrently.
- Java provides a Thread class:
    public class Thread
    extends Object
    implements Runnable
- When a Java Virtual Machine starts up, there is usually a single thread (which typically calls the method named main of some designated class). The Java Virtual Machine continues to execute threads until either of the following occurs:
  - The exit method of class Runtime has been called and the security manager has permitted the exit operation to take place.
  - All threads that are not daemon threads have terminated, either by returning from the call to the run method or by throwing an exception that propagates beyond the run method.
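
The second condition, a thread terminating by returning from its run method, can be observed directly. The following is a minimal sketch; the class name ThreadLifetimeDemo and the helper method are invented for this illustration, while Thread.start, Thread.join, and Thread.isAlive are standard library calls.

```java
// A minimal sketch; ThreadLifetimeDemo and its method name are illustrative only.
class ThreadLifetimeDemo {
    // Starts one worker thread, waits for it, and reports whether it has terminated.
    static boolean workerTerminatesAfterJoin() {
        Thread worker = new Thread(() -> System.out.println("worker: running"));
        worker.start();
        try {
            worker.join(); // blocks until the worker's run method has returned
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker terminated: " + workerTerminatesAfterJoin());
        // When main returns, no non-daemon thread remains, so the JVM exits.
    }
}
```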

Two ways to create a new thread of execution

- Using a subclass of the Thread class
- Using a class that implements the Runnable interface

Create a class that is a subclass of the Thread class

Declare a class to be a subclass of Thread. This subclass should override the run method of class Thread. An instance of the subclass can then be allocated and started:

[Figure: UML class diagram of a Java application illustrating the use of the Thread class. Class Thread has operations run(), sleep(), start(), among others. Class SomeThread, with attribute myID and operation run(), is a subclass of Thread. Class RunThreads uses SomeThread.]

Create a class that is a subclass of the Thread class

RunThreads.java:

public class RunThreads
{
  public static void main(String[] args)
  {
    // Create and start three threads; each runs SomeThread.run() concurrently.
    SomeThread p1 = new SomeThread(1);
    p1.start();

    SomeThread p2 = new SomeThread(2);
    p2.start();

    SomeThread p3 = new SomeThread(3);
    p3.start();
  }
} // end class RunThreads

SomeThread.java:

public class SomeThread extends Thread {
  int myID;

  SomeThread(int id) {
    this.myID = id;
  }

  // Executed in the new thread once start() is called.
  public void run() {
    int i;
    for (i = 1; i < 11; i++)
      System.out.println("Thread" + myID + ": " + i);
  }
} // end class SomeThread

Java Threads-2
The other way to create a thread is to declare a class that implements the Runnable interface. That class then implements the run method. An instance of the class can then be allocated, passed as an argument when creating a Thread, and started.

[Figure: UML class diagram of a Java application illustrating the use of the Runnable interface. Class SomeThread2, with attribute myID and operation run(), implements Runnable. Class RunThreads2 uses Thread (operations run(), sleep(), start(), among others) and SomeThread2.]
12/08/21 49
Distributed Computing Introduction,
Create a class that implements the Runnable interface

c la s s S o m e T h r e a d 2 im p le m e n ts R u n n a b le {
p u b lic c la s s R u n T h r e a d s 2
in t m y I D ;
{
p u b lic s ta tic v o id m a in (S tr in g [] a r g s )
S o m e T h r e a d 2 (in t id ) {
{
th is .m y I D = id ;
T h re a d p1 = n e w T h re a d(n e w S o m e T h re a d2 (1 ));
}
p 1 .s ta r t();
p u b lic v o id r u n ( ) {
T h re a d p2 = n e w T h re a d(n e w S o m e T h re a d2 (2 ));
in t i;
p 2 .s ta r t();
fo r (i = 1 ; i < 1 1 ; i+ + )
S y s te m .o u t.p r in tln (" T h r e a d " + m y I D + " : " + i) ;
T h re a d p3 = n e w T h re a d(n e w S o m e T h re a d2 (3 ));
}
p 3 .s ta r t();
} //e n d c la s s S o m e T h r e a d
}
}

Program samples
 RunThreads.java
 SomeThread.java
 RunThreads2.java
 SomeThread2.java

Thread-safe Programming
- When two threads independently access and update the same data object, such as a counter, as part of their code, the updating needs to be synchronized. (See next slide.)
- Because the threads are executed concurrently, it is possible for one of the updates to be overwritten by the other due to the sequencing of the two sets of machine instructions executed on behalf of the two threads.
- To protect against this possibility, a synchronized method can be used to provide mutual exclusion.

Race Condition

[Figure: two timelines of the machine instructions executed by two concurrent processes or threads (1 and 2) that each increment a shared counter.
 Left (no interleaving): fetch value in counter and load into a register; increment value in register; store value in register to counter; then the same three steps again for the other thread. This execution results in the value 2 in the counter.
 Right (interleaved): both threads fetch the value in counter and load it into a register before either one stores; each increments the value in its register; each stores the value in its register to counter. This execution results in the value 1 in the counter.]

Synchronized method in a thread

class SomeThread3 implements Runnable {
  static int count = 0;

  SomeThread3() {
    super();
  }

  public void run() {
    update();
  }

  // synchronized: only one thread at a time can execute this method
  static public synchronized void update() {
    int myCount = count;
    myCount++;
    count = myCount;
    System.out.println("count=" + count +
        "; thread count=" + Thread.activeCount());
  }
}
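
To see the effect of the synchronized method, it helps to drive it from several threads and check the final count. The sketch below reuses the fetch, increment, store pattern of the update method above; the class name CounterDemo, the helper runThreads, and the thread and iteration counts are assumptions made for this illustration.

```java
// A minimal sketch; CounterDemo, runThreads, and the chosen counts are illustrative.
class CounterDemo {
    static int count = 0;

    // Only one thread at a time may execute this static synchronized method,
    // so the fetch / increment / store sequence cannot be interleaved.
    static synchronized void update() {
        int myCount = count;
        myCount++;
        count = myCount;
    }

    // Starts numThreads threads that each call update() updatesPerThread times,
    // waits for all of them, and returns the final counter value.
    static int runThreads(int numThreads, int updatesPerThread) {
        count = 0;
        Thread[] threads = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < updatesPerThread; i++) {
                    update();
                }
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("final count = " + runThreads(4, 10000));
    }
}
```

With the synchronized keyword in place, the final count equals the number of threads times the number of updates per thread. Removing the keyword allows lost updates, so the result may come out smaller on some runs.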

Network Basics

Network standards and protocols
- On public networks such as the Internet, it is necessary for a common set of rules to be specified for the exchange of data.
- Such rules, called protocols, specify such matters as the formatting and semantics of data, flow control, and error correction.
- Software can share data over the network using network software which supports a common set of protocols.

Protocols
- In the context of communications, a protocol is a set of rules that must be observed by the participants.
- In communications involving computers, protocols must be formally defined and precisely implemented. For each protocol, there must be rules that specify the following:
  - How is the data exchanged encoded?
  - How are events (sending, receiving) synchronized so that the participants can send and receive in a coordinated order?
- The specification of a protocol does not dictate how the rules are to be implemented.
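
As an illustration of an encoding rule, the sketch below defines a made-up message format: one byte for the message type, followed by a text field written with DataOutputStream.writeUTF. The format, the class name ToyProtocol, and its methods are hypothetical and exist only for this example; they are not part of any real protocol. The sketch covers only the encoding question, not the synchronization of events.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical toy protocol, invented for illustration only.
class ToyProtocol {
    // Encoding rule: 1 byte of message type, then the text in writeUTF format
    // (a 2-byte length followed by the encoded characters).
    static byte[] encode(int type, String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(type);
            out.writeUTF(text);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decoding must follow exactly the same rule, in the same order.
    static String decodeText(byte[] message) {
        try {
            DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(message));
            in.readByte(); // skip the message type
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] message = encode(1, "hello");
        System.out.println("encoded length: " + message.length + " bytes");
        System.out.println("decoded text:   " + decodeText(message));
    }
}
```

Any sender and receiver that agree on this rule can exchange messages, regardless of how each side implements it.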

The network architecture

- Network hardware transfers electronic signals, which represent a bit stream, between two devices.
- Modern-day network applications require an application programming interface (API) which masks the underlying complexities of data transmission.
- A layered network architecture allows the functionalities needed to mask the complexities to be provided incrementally, layer by layer.
- Actual implementation of the functionalities may not be clearly divided by layer.

The OSI seven-layer network architecture

[Figure: two protocol stacks side by side, each with the same seven layers, from top to bottom:]
  application layer
  presentation layer
  session layer
  transport layer
  network layer
  data link layer
  physical layer

Network Architecture
The division of the layers is conceptual: the implementation of the functionalities need not be clearly divided as such in the hardware and software that implement the architecture.
The conceptual division serves at least two useful purposes:
1. Systematic specification of protocols: it allows protocols to be specified systematically.
2. Conceptual data flow: it allows programs to be written in terms of logical data flow.

The TCP/IP Protocol Suite
- The Transmission Control Protocol/Internet Protocol suite is a set of network protocols which supports a four-layer network architecture.
- It is currently the protocol suite employed on the Internet.

[Figure: The Internet network architecture. Two stacks side by side, each with four layers, from top to bottom: Application layer, Transport layer, Internet layer, Physical layer.]

The TCP/IP Protocol Suite -2
- The Internet layer implements the Internet Protocol, which provides the functionalities for allowing data to be transmitted between any two hosts on the Internet.
- The Transport layer delivers the transmitted data to a specific process running on an Internet host.
- The Application layer supports the programming interface used for building a program.

Network Resources

- Network resources are resources available to the participants of a distributed computing community.
- Network resources include hardware such as computers and equipment, and software such as processes, email mailboxes, files, and web documents.
- An important class of network resources is network services such as the World Wide Web and file transfer (FTP), which are provided by specific processes running on computers.

Identification of Network Resources

One of the key challenges in distributed computing is the unique identification of resources available on the network, such as e-mail mailboxes and web documents.
- Addressing an Internet host
- Addressing a process running on a host
- Email addresses
- Addressing web contents: URL

Addressing an Internet Host

The Internet Topology

[Figure: The Internet Topology Model. Internet hosts are attached to subnets, which are connected to the Internet backbone.]

The Internet Topology
- The Internet consists of a hierarchy of networks, interconnected via a network backbone.
- Each network has a unique network address.
- Computers, or hosts, are connected to a network. Each host has a unique ID within its network.
- Each process running on a host is associated with zero or more ports. A port is a logical entity for data transmission.
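
The association between a process and a port can be sketched with the standard java.net socket classes. In the example below, the process asks the operating system for any free port (port number 0), then sends one line to that port over the loopback interface. The class name PortDemo and the helper sendOverLoopback are invented for this illustration, and the sketch assumes loopback networking is permitted on the machine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// A minimal sketch; PortDemo and sendOverLoopback are illustrative names.
class PortDemo {
    // Binds a server socket to an OS-chosen port, sends one line to that port
    // over the loopback interface, and returns what the receiving end read.
    static String sendOverLoopback(String line) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            int port = server.getLocalPort(); // the port now tied to this process
            try (Socket client = new Socket(loopback, port);
                 Socket accepted = server.accept()) {
                PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                out.println(line);
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(accepted.getInputStream()));
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("received: " + sendOverLoopback("hello"));
    }
}
```

Here a single process holds both ends of the connection for simplicity; in a distributed application the two sockets would belong to processes on different hosts.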

The Internet addressing scheme
- In IP version 4, each address is 32 bits long.
- The address space accommodates 2^32 (about 4.3 billion) addresses in total.
- Addresses are divided into 5 classes (A through E).

  Address class        Leading bits   Rest of the 32 bits (byte 0 through byte 3)
  class A address      0              network address, then host portion
  class B address      10             network address, then host portion
  class C address      110            network address, then host portion
  multicast address    1110           multicast group
  reserved address     11110          reserved

12/08/21 68
Distributed Computing Introduction,
The Internet addressing scheme - 2

Subdividing the host portion of an Internet address:

[Figure: a class B address (byte 0 through byte 3) with leading bits 10, followed by the network address and the host portion; the host portion is split into a subnet address and a local host address.]

A class A/C address space can also be similarly subdivided.
Which portion of the host address is used for the subnet identification
is determined by a subnet mask.
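A subnet mask is applied with a bitwise AND. The sketch below shows one way to do this with Python's standard `ipaddress` module; the function name `split_with_mask` and the mask 255.255.255.0 are assumptions chosen for illustration, not values taken from the slides.

```python
import ipaddress


def split_with_mask(address: str, mask: str):
    """Split an IPv4 address into (subnet address, local host number)."""
    addr = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    subnet = addr & mask_bits                    # bits selected by the mask
    local_host = addr & ~mask_bits & 0xFFFFFFFF  # bits outside the mask
    return str(ipaddress.IPv4Address(subnet)), local_host


# Assumed mask: the third byte of a class B address identifies the subnet.
print(split_with_mask("129.65.24.50", "255.255.255.0"))
```

With the assumed mask, the first three bytes name the subnet and the last byte names the host within it.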

Example:
Suppose the dotted-decimal notation for a particular Internet address
is 129.65.24.50. The 32-bit binary expansion of the notation is as follows:

129.65.24.50
10000001 01000001 00011000 00110010

Since the leading bit sequence is 10, the address is a Class B address.
Within the class, the network portion is identified by the remaining bits in
the first two bytes, that is, 00000101000001, and the host portion is the
values in the last two bytes, or 0001100000110010. For convenience, the
binary prefix for class identification is often included as part of the network
portion of the address, so that we would say that this particular address is at
network 129.65 and then at host address 24.50 on that network.
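The expansion above can be reproduced in a few lines. This is a small sketch; the helper name `to_binary` is an assumption, and the slicing positions follow the class B layout described in the example.

```python
def to_binary(address: str) -> str:
    """Expand a dotted-decimal IPv4 address into a 32-character bit string."""
    return "".join(f"{int(octet):08b}" for octet in address.split("."))


bits = to_binary("129.65.24.50")
class_prefix = bits[:2]     # leading bits that identify the class
network_bits = bits[2:16]   # rest of the first two bytes
host_bits = bits[16:]       # last two bytes

print(bits)
print(class_prefix, network_bits, host_bits)
```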

Another example:
Given the address 224.0.0.1, one can expand it as follows:

224.0.0.1
11100000 00000000 00000000 00000001

The binary prefix of 1110 signifies that this is a class D, or
multicast, address. Data packets sent to this address
should therefore be delivered to the multicast group
0000000000000000000000000001.

The Internet Address Scheme - 3
 For human readability, Internet addresses are
written in a dotted decimal notation:
nnn.nnn.nnn.nnn, where each nnn group is a decimal value
in the range of 0 through 255
# Internet host table (found in /etc/hosts file)
127.0.0.1 localhost
129.65.242.5 falcon.csc.calpoly.edu falcon loghost
129.65.241.9 falcon-srv.csc.calpoly.edu falcon-srv
129.65.242.4 hornet.csc.calpoly.edu hornet
129.65.241.8 hornet-srv.csc.calpoly.edu hornet-srv
129.65.54.9 onion.csc.calpoly.edu onion
129.65.241.3 hercules.csc.calpoly.edu hercules
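A host table of this form maps each address to one or more names. The sketch below parses an excerpt of the table into a name-to-address dictionary; the function name `parse_hosts` is an assumption, and the text is embedded as a string rather than read from /etc/hosts so that the example is self-contained.

```python
HOSTS_TEXT = """\
# Internet host table (found in /etc/hosts file)
127.0.0.1 localhost
129.65.242.5 falcon.csc.calpoly.edu falcon loghost
129.65.242.4 hornet.csc.calpoly.edu hornet
"""


def parse_hosts(text: str) -> dict:
    """Map every host name or alias in the table to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)   # first entry wins
    return table


hosts = parse_hosts(HOSTS_TEXT)
print(hosts["falcon"], hosts["hornet.csc.calpoly.edu"])
```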

IP version 6 Addressing Scheme

 Each address is 128 bits long.
 There are three types of addresses:
 Unicast: An identifier for a single interface.
 Anycast: An identifier for a set of interfaces (typically
belonging to different nodes). A packet sent to an
anycast address is delivered to one of the interfaces
identified by that address.
 Multicast: An identifier for a set of interfaces (typically
belonging to different nodes). A packet sent to a
multicast address is delivered to all interfaces identified
by that address.
 See Request for Comments: 2373
http://www.faqs.org/rfcs/ (link is in book’s
reference)
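As a small illustration, Python's standard `ipaddress` module can report the address length and whether an address falls in the multicast range. The function name `describe` is an assumption, and the sample addresses (`ff02::1`, `::1`) are chosen for illustration. Anycast addresses are allocated from the unicast address space, so they cannot be recognized from the address alone.

```python
import ipaddress


def describe(address: str) -> str:
    """Summarize the length and multicast status of an IPv6 address."""
    addr = ipaddress.IPv6Address(address)
    kind = "multicast" if addr.is_multicast else "not multicast"
    return f"{addr} ({addr.max_prefixlen}-bit address, {kind})"


print(describe("ff02::1"))  # a multicast address
print(describe("::1"))      # the loopback address, which is unicast
```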

The Domain Name System (DNS)

For user friendliness, each Internet address is mapped
to a symbolic name, using the DNS, in the format of:
<computer-name>.<subdomain hierarchy>.<organization>.<sector name>{.<country code>}
e.g., www.csc.calpoly.edu.us

[Figure: the DNS name tree. Below the root are the top-level domains: com, edu, gov, net, org, and mil in the U.S., plus the country codes. Below a top-level domain come the organization, its subdomains, and finally the host name.]
Top-level domain name has to be applied for.
Subdomain hierarchy and names are assigned by the organization.

The Domain Name System
 For network applications, a domain name must be
mapped to its corresponding Internet address.
 Processes known as domain name system servers
provide the mapping service, based on a
distributed database of the mapping scheme.
 The mapping service is offered by thousands of
DNS servers on the Internet, each responsible for a
portion of the name space, called a zone. A
server that has access to the DNS information
(zone file) for a zone is said to have authority for
that zone.
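The idea of zones can be pictured with a toy lookup that finds the most specific zone covering a name. This is only a sketch: the zone names and server labels in `ZONE_SERVERS` are hypothetical, and a real DNS server follows delegations between servers rather than consulting a single table.

```python
# Hypothetical zones and the servers assumed to hold authority for them.
ZONE_SERVERS = {
    "": "root-server",  # the root zone
    "edu": "edu-server",
    "calpoly.edu": "calpoly-server",
    "csc.calpoly.edu": "csc-server",
}


def authoritative_zone(name: str, zones=ZONE_SERVERS) -> str:
    """Return the longest zone suffix that covers the given domain name."""
    labels = name.rstrip(".").lower().split(".")
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in zones:
            return candidate
    return ""  # only the root zone covers this name


for host in ("falcon.csc.calpoly.edu", "www.ucsb.edu"):
    zone = authoritative_zone(host)
    print(host, "->", zone, "served by", ZONE_SERVERS[zone])
```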

Top-level Domain Names
 .com: For commercial entities, which anyone, anywhere in the
world, can register.
 .net: Originally designated for organizations directly involved in
Internet operations. It is increasingly being used by businesses when
the desired name under "com" is already registered by another
organization. Today anyone can register a name in the Net domain.
 .org: For miscellaneous organizations, including non-profits.
 .edu: For four-year accredited institutions of higher learning.
 .gov: For US Federal Government entities
 .mil: For US military
 Country codes: For individual countries, based on the codes of the International
Organization for Standardization (ISO). For example, ca for Canada and jp for
Japan.

Domain Name Hierarchy
[Figure: the domain name hierarchy. The root domain (.) branches into the country-code domains (.au ... .ca ... .us ... .zw) and the domains .com, .gov, .edu, .mil, .net, and .org. Under .edu are ucsb.edu ... calpoly.edu ..., and below those are subdomains (csc ... ee, english ... wireless; cs ... ece ...).]

Name lookup and resolution
 If a domain name is used to address a host, its
corresponding IP address must be obtained for the
lower-layer network software.
 The mapping, or name resolution, must be
maintained in some registry.
 For runtime name resolution, a network service is
needed; a protocol must be defined for the naming
scheme and for the service. Examples: the DNS
service resolves domain names; the Java RMI registry
supports RMI object lookup; JNDI (the Java Naming and
Directory Interface) is an API for looking up network
services by name.

Addressing a process running on a host

Logical Ports

[Figure: logical ports. Host A and host B each run processes, and each process is attached to one or more ports; the hosts communicate across the Internet. Each host has 65536 ports.]

Well Known Ports
 Each Internet host has 2^16 (65,536) logical
ports. Each port is identified by a number
between 0 and 65535 (port 0 is reserved), and
can be allocated to a particular process.
 Port numbers from 0 through 1023 are reserved
for processes which provide well-known
services such as finger, FTP, HTTP, and
email.

Well-known ports

Assignment of some well-known ports

Protocol             Port   Service
echo                 7      IPC testing
daytime              13     provides the current date and time
ftp                  21     file transfer protocol
telnet               23     remote, command-line terminal session
smtp                 25     simple mail transfer protocol
time                 37     provides a standard time
finger               79     provides information about a user
http                 80     web server
RMI Registry         1099   registry for Remote Method Invocation
special web server   8080   web server which supports servlets, JSP, or ASP

Choosing a port to run your program

 For our programming exercises: when a port
is needed, choose a random number above the
well-known ports: 1,024-65,535.
 If you are providing a network service for the
community, then arrange to have a port
assigned to and reserved for your service.
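One way to follow the advice for exercises is to draw random numbers above the well-known range until one can be bound. The sketch below assumes a TCP socket on the loopback address; the function name `pick_free_port` and the retry limit are illustrative choices.

```python
import random
import socket

WELL_KNOWN_MAX = 1023  # highest well-known port number
PORT_MAX = 65535       # highest valid port number


def pick_free_port(host: str = "127.0.0.1", attempts: int = 50) -> int:
    """Return a random port above the well-known range that can be bound."""
    for _ in range(attempts):
        port = random.randint(WELL_KNOWN_MAX + 1, PORT_MAX)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))  # fails if the port is in use
            except OSError:
                continue
            return port                  # socket is closed on return
    raise RuntimeError("no free port found")


print("using port", pick_free_port())
```

The port is released when the function returns, so another process could claim it before the program binds it again; for an exercise this is usually acceptable.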

Addressing a Web Document

The Uniform Resource Identifier (URI)
 Resources to be shared on a network need to
be uniquely identifiable.
 On the Internet, a URI is a character string
which allows a resource to be located.
 There are two types of URIs:
 URL (Uniform Resource Locator) points to a
specific resource at a specific location
 URN (Uniform Resource Name) points to a
specific resource at a nonspecific location.

URL

A URL has the format of:
protocol://host address[:port]/directory path/file name#section

A sample URL:
http://www.csc.calpoly.edu:8080/~mliu/CSC369/hw.html#hw1

protocol of server:              http
host name:                       www.csc.calpoly.edu
port number of server process:   8080
directory path:                  ~mliu/CSC369
file name:                       hw.html
section name:                    hw1

Other protocols that can appear in a URL are:
file, ftp, gopher, news, telnet, WAIS
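The parts of the sample URL can be pulled apart with `urlsplit` from Python's standard `urllib.parse` module. This is a brief sketch; splitting the path into a directory and a file name with `rpartition` is an illustrative choice.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.csc.calpoly.edu:8080/~mliu/CSC369/hw.html#hw1")

# Separate the directory path from the file name at the last slash.
directory, _, file_name = parts.path.rpartition("/")

print("protocol: ", parts.scheme)
print("host name:", parts.hostname)
print("port:     ", parts.port)
print("directory:", directory)
print("file name:", file_name)
print("section:  ", parts.fragment)
```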

More on URL
 The path in a URL is relative to the
document root of the server. On the CSL
systems, a user’s document root is ~/www.
 A URL may appear in a document in a
relative form:
<a href="another.html">
and the actual URL referred to will be
another.html preceded by the protocol,
host name, and directory path of the containing document.
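Resolving a relative reference against the URL of the containing document can be sketched with `urljoin` from Python's standard `urllib.parse` module. The base URL below reuses the sample URL from the earlier slide as an assumed location of the document.

```python
from urllib.parse import urljoin

# Assumed URL of the document that contains the relative link.
base = "http://www.csc.calpoly.edu:8080/~mliu/CSC369/hw.html"

# The relative reference replaces the file name and keeps the rest.
resolved = urljoin(base, "another.html")
print(resolved)
```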
Summary - 1

We discussed the following topics:
 What is meant by distributed computing
 Distributed system
 Distributed computing vs. parallel computing
 Basic concepts in operating system: processes
and threads

Summary - 2
 Basic concepts in data communication:
 Network architectures: the OSI model and the
Internet model
 Connection-oriented communication vs.
connectionless communication
 Naming schemes for network resources
• The Domain Name System (DNS)
• Protocol port numbers
• Uniform Resource Identifier (URI)
• Email addresses

Summary - 3
 Basic concepts in software engineering:
 Procedural programming vs. object-oriented
programming
 UML Class diagrams

 The three-layered architecture of distributed
applications: the presentation layer, the application
(business logic) layer, and the service layer
 The terms toolkit, framework, and component
