http://idc.hust.edu.cn/~rxli/ Ruixuan Li School of Computer Science and Technology Huazhong University of Science and Technology Sep. 10, 2013
Outline
Introduction
1$
(USA, 2003)
1 CPU 4 GB 1 1 GB 1 GB 3 10 M 10 TB 10 TB
2 GHz CPU, 2 GB RAM: $2,000
200 GB, 100 50MB: $200
1 Mbps: $100/
10 KWhrs 14
3 1000.
(Views of Jim Gray, 2003)
(Distributed Computing)
U. C. Berkeley
SETI@Home
305
1:
2:
Resource sharing
It characterizes the range of things that can usefully be shared in a networked computer. It extends from hardware components to software-defined entities. It includes the stream of video frames and the audio connection.
Collaborative computing
A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages.
Applications
P2P systems, search engines, online games, Gmail
Middleware
CORBA, DCOM, EJB
Networks
Internet, mobile phone networks, wireless sensor networks, corporation networks, factory networks, campus networks, home networks
Concurrency
concurrently executing programs share resources
No global clock
programs coordinate actions by exchanging messages
Independent failures
when some systems fail, others may not know
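The characteristics above can be illustrated with a small sketch. It assumes two threads stand in for networked computers and `queue.Queue` stands in for the network; the names `worker` and `send_and_wait` are hypothetical and not part of the slides. The point is only that the sender learns about its peer through messages and timeouts, never through shared state or a shared clock.

```python
import queue
import threading


def worker(inbox, outbox):
    """Stand-in for a remote computer: it reacts only to received messages."""
    while True:
        msg = inbox.get()
        if msg == "stop":
            break  # simulates the remote computer going away
        outbox.put(("ack", msg))


def send_and_wait(inbox, outbox, msg, timeout=0.5):
    """Send a message and wait for a reply.

    With no global clock and independent failures, a timeout is the only
    hint the sender gets that the peer might be gone.
    """
    inbox.put(msg)
    try:
        return outbox.get(timeout=timeout)
    except queue.Empty:
        return None  # peer is slow or has failed; the sender cannot tell which


to_worker, from_worker = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(to_worker, from_worker))
t.start()

reply_alive = send_and_wait(to_worker, from_worker, "hello")

to_worker.put("stop")
t.join()

# The peer has stopped, but the sender only finds out by not getting a reply.
reply_dead = send_and_wait(to_worker, from_worker, "hello again", timeout=0.1)
```

Here `reply_alive` carries the acknowledgement, while `reply_dead` is `None` because nothing is left to answer.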
Outline
The Internet, the intranet
DNS service
Distributed file system
P2P applications (BitTorrent, eMule)
Mobile and ubiquitous computing
Search engine, sensor network, cloud computing
The Internet
[Figure: a typical portion of the Internet, showing intranets, ISPs, and the backbone]
The 'Network Effect kicks in, and the web goes critical'
[Figure: Internet timeline, 1970 to 2010]
1969: 4 US universities linked to form ARPANET
1972: First e-mail program created
1976: Robert Metcalfe develops Ethernet
Web
1980: Tim Berners-Lee, Enquire (Enquire Within Upon Everything)
1990-11: first Web server, nxoc01.cern.ch; Tim Berners-Lee, first Web browser, WorldWideWeb
1991: CERN (European Particle Physics Laboratory), the Web
W3C: World Wide Web Consortium
Web
Web
Load on the first Web server (info.cern.ch): 1000 times what it had been 3 years earlier
Web
Number of web sites (source: Netcraft)
1993-1996: from 130 to 600,000 sites
2010-2013: from 200,000,000 to 780,000,000 sites
CNNIC, 2012-12: 4.20
1993: Marc Andreessen, Mosaic
"The great thing about the Internet--the thing that catalyzed it in the first place and renews it every day--is that there are so many people able to use it, able to do a million different things. It's an open platform that anybody can develop and create applications for. A lot of people are able to apply their energy, and see it bear fruit."
1994: Marc Andreessen, Netscape
1995: Microsoft, Internet Explorer 1.0, 2.0
1997: IE 4.0, DHTML, Winner
1998: Netscape
2004: Mozilla.org, Netscape, Firefox
2008: Google, Chrome
2010: UC
2012: HTML5
DOTCOM Bubble
The technology-heavy NASDAQ Composite index peaked in March 2000, reflecting the high point of the dot-com bubble.
WEB2.0
Web 1.0 -> Web 2.0
Ofoto -> Flickr
Akamai -> BitTorrent
mp3.com -> Napster
Britannica Online -> Wikipedia
personal websites -> blogging
evite -> upcoming.org and EVDB
domain name speculation -> SEO
page views -> cost per click
screen scraping -> web services
publishing -> participation
content management -> wikis
directories -> tagging, folksonomy
stickiness -> syndication
Web2.0 Buzzwords
Web
AdSense, Facebook, Twitter, mash-up, Wikipedia, Yahoo, eBay, Amazon, Flickr, del.icio.us, the perpetual beta
Web
www.eBay.com
www.wikipedia.com
www.napster.com
www.youtube.com
www.blogger.com
www.friendsreunited.com
www.drudgereport.com
Web
A typical intranet
[Figure: a typical intranet, showing desktop computers, Web server, email servers, file server, print and other servers, local area networks, router/firewall, and the rest of the Internet]
Issues in intranet
/C/S
Grid Computing
Peer-to-Peer Computing
Services Computing
Autonomous Computing
Edge Computing
Mobile Computing
Sensor Network
Pervasive/Ubiquitous Computing
Cloud Computing
Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
(MEMS)
Mote (Berkeley)
Cricket (MIT)
Mantis (UC Boulder)
SmartLocus (HP-Labs)
Smart Dust (Berkeley)
Mobile devices
Laptop computers
Handheld devices: PDA, mobile phone, pager, video camera, digital camera
Wearable devices: e.g. smart watches, digital glasses
Devices embedded in appliances: e.g. washing machines, hi-fi systems, cars and refrigerators
[Figure: portable and handheld devices in a distributed system, showing host intranet, wireless LAN, WAP gateway, and home intranet]
Discovery of resources
Eliminating the need for users to reconfigure their mobile devices
Coping with limited connectivity as they travel
Providing privacy and other security guarantees
[Figure: search engine architecture, showing crawler machines, DocIds, inverted index, and user query]
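The role of the inverted index in answering a user query can be sketched as follows. This is a minimal illustration under stated assumptions, not how any particular search engine is built: the three sample documents, whitespace tokenization, and the function names `build_inverted_index` and `search` are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of DocIds whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the DocIds that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents that a crawler might have fetched.
docs = {
    1: "distributed systems share resources",
    2: "cloud computing shares resources",
    3: "distributed file system",
}
index = build_inverted_index(docs)
hits = search(index, "distributed resources")
```

Only document 1 contains both query terms, so `hits` is `{1}`. A production index would also need stemming, ranking, and partitioning across machines, which this sketch leaves out.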
Cloud Computing:
Computing service is a standard utility
Users and corporations contract the services by units
Significantly reduces IT personnel and infrastructure costs
Makes good use of rich computing, storage, and Internet resources
Principles of cloud computing
Cost-effectiveness is the basis for computing, storage, and communication models in cloud computing
Targeting a standard computing model in a wide range
Exploiting locality and load sharing with low overhead
New challenges (CS@Berkeley, 2009)
(1) availability of service
(2) sharing data in different platforms
(3) data security
(4) minimizing communication cost
(5) unpredictable performance
(6) scalability of storage
(7) reliability of large scale distributed systems
(8) service scalability
(9) trust in the cloud service
(10) software licensing
Outline
Introduction
Network is reliable
Not reliable: failures of switches, power, and others; security attacks. Systems must be duplicated.
Latency is zero
Latency improvement significantly lags behind that of bandwidth. Latency reduction anywhere is most important.
Bandwidth is infinite
Network bandwidth is expensive and does not follow Moore's Law.
Topology doesn't change
Network topology is out of users' control and subject to change all the time.
There is one administrator
Networking/system administration rules differ between organizations.
Transport cost is zero
Two costs are involved: software overhead (e.g. TCP/IP, others) and monthly maintenance fee.
Network is homogeneous
Types of networks, computers, and software systems are very diverse.
Challenges
Heterogeneity
Networks
Ethernet, token ring, etc.
Computer hardware
big endian / little endian
Operating systems
different APIs of Unix and Windows
Programming languages
different representations for data structures
Implementations by different developers
no application standards
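The big endian / little endian difference in computer hardware can be made concrete with Python's standard `struct` module. This is a small sketch; the value `0x12345678` is an arbitrary example, not something taken from the slides.

```python
import struct

value = 0x12345678

# The same 32-bit integer laid out in the two byte orders.
big = struct.pack(">I", value)     # most significant byte first
little = struct.pack("<I", value)  # least significant byte first

# A receiver that assumes the wrong byte order silently gets a different number.
misread = struct.unpack("<I", big)[0]

# Agreeing on one wire format ("!" means network byte order, which is
# big endian) lets heterogeneous machines exchange the value safely.
wire = struct.pack("!I", value)
decoded = struct.unpack("!I", wire)[0]
```

Here `misread` comes out as `0x78563412` while `decoded` equals the original value, which is why protocols fix a byte order instead of relying on the host's native one.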
Heterogeneity (contd.)
Middleware
applies to a software layer that provides a programming abstraction as well as masking the heterogeneity of the underlying networks, hardware, OSs and programming languages
Mobile code
is used to refer to code that can be sent from one computer to another and run at the destination
Openness
is the characteristic that determines whether the system can be extended and re-implemented in various ways.
e.g. Unix
is determined by the degree to which new resource sharing services can be added and be made available for use by a variety of client programs.
e.g. Web
e.g. RFC
Openness (contd.)
Open APIs
Openness (contd.)
(e.g. XML-RPC)
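As an illustration of why a published protocol such as XML-RPC supports openness, the sketch below uses Python's standard `xmlrpc.client` module to encode and decode a call without any network. The method name `add` and its arguments are hypothetical; any client or server, in any language, that follows the same published format could produce or consume these messages.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote method add(2, 3) as an XML-RPC request.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")

# A server written independently can decode the request from the XML text alone.
params, method = xmlrpc.client.loads(request_xml)

# The reply is encoded the same way; a response carries exactly one value.
response_xml = xmlrpc.client.dumps((5,), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)
```

The request is plain XML text, so the interface is defined by the message format and not by any one implementation.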
Security
Confidentiality
Integrity
e.g. checksum
Availability
protection against interference with the means to access the resources
Scalability
A system is scalable if it remains effective when there is a significant increase in the number of resources and the number of users.
Scalability (contd.)
Design challenges
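Controlling the cost of physical resources
e.g., servers support users at most O(n)
Controlling the performance loss
e.g., DNS no worse than O(log n)
Preventing software resources running out
e.g., IP address
Avoiding performance bottlenecks
e.g., partitioning name table of DNS, cache and replication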
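The idea of partitioning the DNS name table can be pictured with a toy hierarchical lookup. This is a minimal sketch under stated assumptions: the zone data is hypothetical (names under the reserved example domains, addresses from the 192.0.2.0/24 documentation range), the nested dictionaries stand in for separate name servers, and caching and replication are not modeled.

```python
# Hypothetical partitioned name table: each nested dict plays the role of a
# zone that a separate server could hold.
ZONES = {
    "com": {"example": {"www": "192.0.2.10", "mail": "192.0.2.20"}},
    "org": {"example": {"www": "192.0.2.30"}},
}


def resolve(name, root=ZONES):
    """Resolve a name label by label, starting from the rightmost label.

    Returns (address or None, number of zones visited). The work depends on
    the number of labels in the name, not on how many names exist in total.
    """
    node = root
    hops = 0
    for label in reversed(name.split(".")):
        if not isinstance(node, dict) or label not in node:
            return None, hops
        node = node[label]
        hops += 1
    return (node if isinstance(node, str) else None), hops


address, hops = resolve("www.example.com")
missing, missing_hops = resolve("ftp.example.com")
```

Adding more names under `org` does not change the number of steps needed to resolve a name under `com`, which is the property that keeps lookup cost from growing with the size of the whole table.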
Failure handling
Detecting
e.g. checksum for corrupted data. Sometimes impossible, so suspect, e.g. a remote crashed server in the Internet
Masking
e.g. retransmit message, standby server
Tolerating
e.g. a web browser cannot contact a web server
Recovery
e.g. roll back
Redundancy
e.g. IP route, replicated name table of DNS
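Two of the techniques above, detecting corrupted data with a checksum and masking the failure by retransmitting, can be sketched together. This is an illustration under stated assumptions: CRC-32 from the standard `zlib` module stands in for "checksum", and `flaky_channel` is a hypothetical stand-in for an unreliable network that corrupts only the first packet.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 checksum (4 bytes, big endian)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def check_packet(packet: bytes):
    """Return the payload if the checksum matches, otherwise None."""
    checksum, payload = packet[:4], packet[4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None  # detecting: the data was corrupted in transit
    return payload


def send_with_retry(payload: bytes, channel, max_attempts: int = 3):
    """Mask transient corruption by retransmitting up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        received = check_packet(channel(make_packet(payload)))
        if received is not None:
            return received, attempt
    raise ConnectionError("all attempts delivered corrupted data")


def flaky_channel():
    """Hypothetical channel that flips bits in the first packet only."""
    calls = {"n": 0}

    def channel(packet: bytes) -> bytes:
        calls["n"] += 1
        if calls["n"] == 1:
            return packet[:-1] + bytes([packet[-1] ^ 0xFF])
        return packet

    return channel


data, attempts = send_with_retry(b"hello", flaky_channel())
```

The first transmission fails the checksum and the second succeeds, so the caller sees intact data and never observes the failure. A crashed remote server, in contrast, can only be suspected, for example through a timeout.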
Concurrency
Correctness
Performance
Transparency
Access transparency
using identical operations to access local and remote resources, e.g. a graphical user interface with folders
Location transparency
resources to be accessed without knowledge of their location, e.g. URL
Concurrency transparency
several processes operate concurrently using shared resources without interference between them
Replication transparency
multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers
Transparency (contd.)
Failure transparency
users and applications to complete their tasks despite the failure of hardware and software components, e.g., email
Mobility transparency
movement of resources and clients within a system without affecting the operation of users and programs, e.g., mobile phone
Scaling transparency
allows the system and applications to expand in scale without change to the system structure or the application algorithms
Summary
Distributed systems are pervasive
Distributed computing and resource sharing are primary motivations for constructing distributed systems
Characterization of distributed systems
Summary (contd.)
Q&A