The Internet is the world-wide federation of packet-switched networks running TCP/IP. Important applications include email and the World Wide Web (WWW). Computer networks are everywhere: they form an essential part of our infrastructure and are used at home, at work, by governments and on the move.
1. Computer Networks
2. The Internet
3. Part of the Internet
4. Packet Switching
5. History of the Internet
6. Growth of the Internet
7. Communication Protocols
8. Protocols and Layering
9. TCP/IP 5-layer Reference Model
10. TCP/IP layers with some protocols
11. Data Passing Through Layers
12. Headers and Layers
13. Internet Communication Paradigms
14. Connection-Oriented Communication
15. Client-Server Model
16. Client Software
17. Server Software
18. Server Identification
19. Service Identification
20. Client-Server Interaction
21. A specific example
1.1. Computer Networks
computer networks are everywhere
they form an essential part of our infrastructure
used
  o at home
  o at work
  o by governments
  o on the move
  o ...
there are many different types of networks and standards
we will concentrate on the Internet
1.2. The Internet
(the above is a very old figure showing an internet)
each ellipse represents a network connecting a number of computers directly
an internet is a federation of computer networks, connected by routers
the Internet is the world-wide federation of packet-switched networks running TCP/IP
important applications include email and the World Wide Web (WWW)
1.3. Part of the Internet
(the above figure is taken from the book by Kurose and Ross)
1.4. Packet Switching
early communication networks evolved from telephone systems
these used a physical pair of wires between two parties to form a dedicated circuit
circuit switching was the task of deciding which circuit to use when two parties wanted to communicate
the circuit is reserved for the two parties during communication, so it is not available to other parties
the Internet uses packet switching, which is considered more efficient
packet switching
  o divides data into small blocks, called packets
  o allows multiple users to share a network
  o includes identification of the intended recipient in each packet
  o relies on devices throughout the network that each have information about how to reach each possible destination
1.5. History of the Internet
(1957) Advanced Research Projects Agency (ARPA) established by US Department of Defense
(1968-9) first packet-switching networks
(1972) Telnet
(1973) File Transfer Protocol (FTP); ARPANET goes international:
  o University College, London (UK)
  o NORSAR (Norway)
(1974) design of TCP (Transmission Control Protocol)
(1977) email
(1982) TCP and IP (Internet Protocol) used for ARPANET
(1984) DNS (Domain Name System) introduced
(1991) WWW released
1.6. Growth of the Internet
1.7. Communication Protocols
communication always involves at least two entities
  o one that sends information and another that receives it
all entities in a network must agree on how information will be represented and communicated
  o the way that electrical signals are used to represent data
  o procedures used to initiate and conduct communication
  o the format of messages
all communicating parties follow the same set of rules, a set of specifications
a specification for network communication is called a communication protocol
1.8. Protocols and Layering
computer networks are complex systems including both hardware and software
rather than a single, huge specification for all possible forms of communication, designers divide the communication problem into subparts, called layers
the interfaces between the layers are defined by protocols
layers provide for modularity, making implementation and changes easier
the combination of layers is sometimes called a protocol stack
1.9. TCP/IP 5-layer Reference Model
physical layer corresponds to the basic network hardware
network interface, or link, layer specifies how data is divided into frames (link-layer packets) for transmission over a single network
Internet layer specifies how packets are forwarded to particular machines over the Internet, using the Internet Protocol (IP)
transport layer specifies how to communicate with particular processes on machines, using the Transmission Control Protocol (TCP) or User Datagram Protocol (UDP)
application layer specifies how applications use the Internet, and includes protocols such as the HyperText Transfer Protocol (HTTP) and the Domain Name System (DNS)
1.10. TCP/IP layers with some protocols
1.11. Data Passing Through Layers
1.12. Headers and Layers
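As a rough illustration of headers and layers, the sketch below lets each layer on the sending side prepend its own header, and each layer on the receiving side strip the matching header again. The header labels (`TCP|`, `IP|`, `FRAME|`) and the function names are made up for this sketch; real headers are binary structures with many fields.

```python
# Hypothetical text labels standing in for real binary headers.
# Order is the order in which a sender's layers add them (top of the stack first).
SENDING_ORDER = [
    ("transport", b"TCP|"),
    ("internet", b"IP|"),
    ("link", b"FRAME|"),
]


def send_down(app_message: bytes) -> bytes:
    """Pass an application message down the stack; each layer prepends a header."""
    unit = app_message
    for _layer, header in SENDING_ORDER:
        unit = header + unit
    return unit


def pass_up(frame: bytes) -> bytes:
    """Pass a received frame up the stack; each layer removes its own header."""
    unit = frame
    for layer, header in reversed(SENDING_ORDER):
        if not unit.startswith(header):
            raise ValueError(f"missing {layer} header")
        unit = unit[len(header):]
    return unit


if __name__ == "__main__":
    frame = send_down(b"GET /")
    print(frame)           # the outermost header belongs to the lowest layer
    print(pass_up(frame))  # the receiver recovers the original message
```

The outermost header always belongs to the lowest layer, which is why the receiver removes headers in the reverse of the order in which the sender added them.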
1.13. Internet Communication Paradigms
Internet supports two basic communication paradigms:
  o stream paradigm
  o message paradigm
stream paradigm                 message paradigm
connection-oriented             connectionless
one-to-one communication        many-to-many communication
sequence of individual bytes    sequence of individual messages
arbitrary length transfer       each message limited to 64 Kbytes
used by most applications       used for multimedia applications
built on TCP protocol           built on UDP protocol
we will focus on the stream paradigm
1.14. Connection-Oriented Communication
Internet stream service is connection-oriented
two applications must request that a connection be created
once it has been established, the connection allows the applications to send data in either direction
finally, when they finish communicating, the applications request that the connection be terminated
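The three phases of connection-oriented communication (create the connection, send data in either direction, terminate) can be tried out locally. The following is a minimal sketch, assuming Python 3.8 or later and a machine that allows TCP connections on the loopback address 127.0.0.1; the function names and the upper-casing "service" are invented for the example.

```python
import socket
import threading


def read_until_closed(conn: socket.socket) -> bytes:
    """Collect bytes until the peer signals that it has finished sending."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, reply with the upper-cased request, then close."""
    conn, _addr = listener.accept()
    with conn:
        request = read_until_closed(conn)
        conn.sendall(request.upper())


def run_exchange(message: bytes) -> bytes:
    # Port 0 asks the OS for any free port; loopback keeps the demo on one machine.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        # Phase 1: request that a connection be created.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            # Phase 2: data flows client -> server, then server -> client.
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # we have nothing more to send
            reply = read_until_closed(client)
        # Phase 3: leaving the with-block closes (terminates) the connection.
        server.join()
    return reply


if __name__ == "__main__":
    print(run_exchange(b"hello over a stream"))
```

Because the stream paradigm delivers a sequence of bytes rather than messages, the helper keeps reading until the other side closes its sending direction instead of assuming one `recv` call returns everything.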
1.15. Client-Server Model
server application
  o starts first
  o does not need to know which client will contact it
  o waits passively and arbitrarily long for contact from a client
  o communicates with a client by both sending and receiving data
  o stays running after servicing one client, and waits for another
client application
  o starts second
  o must know which server to contact
  o initiates contact whenever communication is needed
  o communicates with a server by both sending and receiving data
  o may terminate after interacting with a server
1.16. Client Software
is an arbitrary application program that becomes a client temporarily when remote access is needed, but also performs other computation
is invoked directly by a user, and executes only for one session
runs locally on a user's personal computer
actively initiates contact with a server
can access multiple services as needed, but usually contacts one remote server at a time
does not require especially powerful computer hardware
1.17. Server Software
is a special-purpose, privileged program
is dedicated to providing one service that can handle multiple remote clients at the same time
is invoked automatically when a system boots, and continues to execute through many sessions
runs on a large, powerful computer
waits passively for contact from arbitrary remote clients
accepts contact from arbitrary clients, but offers a single service
requires powerful hardware and a sophisticated operating system (OS)
1.18. Server Identification
Internet protocols divide identification into two pieces:
  o an identifier for the computer on which a server runs
  o an identifier for a service on the computer
identifying a computer
  o each computer on the Internet is assigned a unique 32-bit identifier known as an Internet Protocol address (IP address); this is the IPv4 address format
  o 4 bytes written as n1.n2.n3.n4 where each ni is a decimal number, e.g., 18.23.0.22
  o a client must specify the server's IP address
  o to make server identification easy for humans, each computer is also assigned a name, and the Domain Name System (DNS) is used to translate a name into an address
  o thus, a user specifies a name such as www.dcs.bbk.ac.uk rather than an integer address
1.19. Service Identification
each service available in the Internet is assigned a unique 16-bit identifier known as a protocol port number (or port number)
  o examples: email -> port number 25, and the web -> port number 80
when a server begins execution
  o it registers with its local OS by specifying the port number for its service
when a client contacts a remote server to request service
  o the request contains a port number
when a request arrives at a server computer
  o software on the server uses the port number in the request to determine which application on the server computer should handle the request
1.20. Client-Server Interaction
following diagram illustrates client-server interaction
  o client on the left
  o server on the right
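The two identifiers from sections 1.18 and 1.19 can be illustrated without touching a network. The sketch below converts a dotted IPv4 address into the 32-bit integer it stands for, and models the port table that operating system software consults when a request arrives. `PortTable` and its methods are hypothetical names for this example, not an OS interface.

```python
import ipaddress


def ip_to_int(dotted: str) -> int:
    """Return the 32-bit integer behind a dotted-decimal IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))


class PortTable:
    """Toy model of the OS table that maps port numbers to server applications."""

    def __init__(self) -> None:
        self._services = {}

    def register(self, port: int, service: str) -> None:
        # Port numbers are 16-bit values, so only 0..65535 are valid.
        if not 0 <= port <= 0xFFFF:
            raise ValueError("port numbers are 16-bit values (0-65535)")
        self._services[port] = service

    def dispatch(self, port: int) -> str:
        """Decide which registered application should handle a request."""
        return self._services[port]


if __name__ == "__main__":
    print(ip_to_int("18.23.0.22"))  # 18*2**24 + 23*2**16 + 0*2**8 + 22
    table = PortTable()
    table.register(25, "email")
    table.register(80, "web")
    print(table.dispatch(80))
```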
1.21. A specific example
suppose I want to retrieve a web page from www.w3.org
my browser will use DNS to find the IP address 128.30.52.37
my browser will compose a message based on HTTP asking to get the page
HTTP will ask TCP to connect to port 80 on 128.30.52.37
TCP will ask IP to send the message to 128.30.52.37
IP will send the message to a router on the local network
this router will send the message to another router
...
the router on the local network for 128.30.52.37 will receive the message
it will send the message to 128.30.52.37
IP will receive it and pass it up to TCP
TCP will see that it is for port 80 and will pass it to the web server process
the web server will interpret the HTTP and send the page to my browser
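The step "my browser will compose a message based on HTTP" can be made concrete. The sketch below only builds the request text and the (address, port) pair that would be handed to TCP; it does not contact any server. The function name is invented, the `Connection: close` header is an optional extra chosen for the example, and the address is the one quoted in the text, which may not be current.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Compose a minimal HTTP/1.1 GET request as the bytes sent over TCP."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # names the web site wanted on that server
        "Connection: close",     # ask the server to close after replying
        "",                      # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Address taken from the text above; port 80 is the default for http.
DESTINATION = ("128.30.52.37", 80)

if __name__ == "__main__":
    print(DESTINATION)
    print(build_get_request("www.w3.org").decode("ascii"))
```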
2. Internet Applications
1. Representation and Transfer
2. Web Protocols
3. Some Other Application Layer Protocols
4. Uniform Resource Identifiers (URIs)
5. Uniform Resource Locators (URLs)
6. URL schemes
7. Escaping Special URI characters
8. Domain Name System (DNS)
9. DNS Design
10. Top-Level Domains
11. DNS Server Hierarchy
12. DNS Server Model
13. Name Resolution
14. URIs, URNs and URLs
15. Uniform Resource Names (URNs)
16. Internet Electronic mail
17. Sending e-mail
18. Example SMTP Session
19. Email Representation Standards
20. Multi-purpose Internet Mail Extensions (MIME)
21. MIME Headers
22. Base64 Encoding
2.1. Representation and Transfer
application-layer protocols specify two aspects of interaction
  o representation
  o transfer
representation:
  o syntax of data items exchanged
  o specific form during transfer
  o translation of integers, characters and files between computers
transfer:
  o interaction between client and server
  o message syntax and semantics
  o valid and invalid exchange
  o error handling
  o termination of interaction
2.2. Web Protocols
World Wide Web (WWW) is one of the most widely used services on the Internet
major WWW standards are
  o HyperText Markup Language (HTML): representation standard specifying contents and layout of a web page
  o Uniform Resource Identifier (URI): representation standard specifying format and meaning of web page identifiers
  o HyperText Transfer Protocol (HTTP): transfer protocol specifying how a browser interacts with a web server
2.3. Some Other Application Layer Protocols
telnet (for remote login)
  o defined in RFC 318 (1972)
ftp (file transfer protocol)
  o defined in RFC 454 (1973)
email protocols
  o SMTP (Simple Mail Transfer Protocol)
  o POP3 (Post Office Protocol version 3)
  o IMAP4 (Internet Message Access Protocol version 4)
DNS (Domain Name System)
  o defined in RFC 1034 and RFC 1035 (1987)
RTP (Real-time Transport Protocol) for audio and video
  o defined in RFC 3550 (2003)
2.4. Uniform Resource Identifiers (URIs)
a Uniform Resource Identifier (URI) is a unique identifier for a resource on the Internet
basic syntax is: scheme ":" scheme-specific-part
where
  o scheme identifies a naming scheme, e.g., http
  o scheme-specific-part identifies the resource in some way specific to the scheme
  o most commonly used URIs are Uniform Resource Locators (URLs)
2.5. Uniform Resource Locators (URLs)
scheme examples include
  o ftp, http, https, mailto, telnet
in the following syntax, [ ... ] denotes an optional part, and everything else not in quotes denotes a string to be supplied
scheme-specific part has syntax
  "//" [ user [ ":" password ] "@" ] host [ ":" port ] [ "/" url-path ] [ "?" query-string ] [ "#" anchor ]
where
  o user and password are not often used
  o host is a fully qualified domain name or IP address
  o port is optional (usually a default)
  o url-path is the path to the resource, specific to scheme
  o query-string includes parameters associated with the request (usually form fields)
  o anchor is a reference to a part of a resource (a fragment identifier)
e.g. in http://vili.dcs.bbk.ac.uk/dept/staffperson05.asp?name=ptw
  o http is the scheme
  o vili.dcs.bbk.ac.uk is the host
  o dept/staffperson05.asp is the url-path
  o name=ptw is the query string
2.6. URL schemes
http
  o user name and password not applicable
  o default port number is 80
https
  o HTTP over Secure Sockets Layer (SSL), now Transport Layer Security (TLS)
  o default port number is 443
ftp
  o user name and password can be given
  o if not, anonymous ftp used
  o default port number is 21
telnet
  o host is mandatory
  o default port number is 23
mailto
  o no need for url-path to be specified
  o program should prompt user for message, then send using SMTP
2.7. Escaping Special URI characters
the space character is not allowed in URIs
/, #, ? have special meanings in URIs
also + is used in place of a space in a query string
so if we need any of these as an ordinary character in a URI, we use the escaped version
the escaped version is the character % followed by the ASCII hexadecimal value of the character
now % has a special meaning too
the escaped versions are as follows:
symbol   escaped version
%        %25
/        %2F
#        %23
?        %3F
space    %20
+        %2B
2.8. Domain Name System (DNS)
provides a service mapping (human-readable) DNS names to IP addresses
browsers, mail software and most other Internet applications use DNS
two advantages:
  o easier to remember www.w3.org than 128.30.52.37
  o higher level of abstraction allows simpler reorganisation
names are organised hierarchically:
  o most significant part of the name on the right (specified by DNS)
  o left-most segment of a name is the name of an individual computer
DNS is essentially
  o a distributed database implemented as a hierarchy of DNS servers
  o an application-layer protocol allowing hosts to query the database
2.9. DNS Design
why is DNS distributed?
a simpler design would have been to have one DNS server storing all the mappings
problems with this centralised design include:
  o it is a single point of failure
  o the need to handle huge volumes of queries
  o a single server cannot be "close" to all clients
  o it would also have to handle all updates for new hosts
2.10. Top-Level Domains
right-most domains of the hierarchy are top-level domains:
  o either country-code top-level domain (ccTLD)
  o or generic top-level domain (gTLD)
ccTLD represented by two-letter country-codes from ISO 3166, e.g., uk, fr, de, ch
gTLD given in Internet informational RFC 1591:
  o edu: educational institutions
  o com: commercial entities, i.e., companies
  o net: network providers
  o org: organisations, e.g. NGOs
  o gov: government agencies
  o mil: US military
  o int: organisations established by international treaties
2.11. DNS Server Hierarchy
the following shows a portion of the hierarchy of DNS servers
there are 13 root DNS servers (each is actually a cluster of replicated servers)
  o these return IP addresses of top-level domain servers
top-level domain servers are responsible for top-level domains
  o they return IP addresses of authoritative servers for organisations
each organisation must provide an authoritative DNS server for its publicly accessible hosts
2.12. DNS Server Model
each organisation is free to choose how to organise its servers
  o a small organisation might use an ISP to run a DNS server
  o a larger organisation might place all names on a single server
  o a large organisation might divide its names among several servers
DNS allows each organisation to assign names to computers or to change those names without informing a central authority
each DNS server contains information linking it to other DNS servers up and down the hierarchy
a given server can be replicated
replication is useful for heavily used servers, such as root servers that provide information about top-level domains
DNS servers employ caching in order to improve performance and reduce load
2.13. Name Resolution
translation of a domain name into an address is called name resolution
the name is said to be resolved to an address
software to perform the translation is known as a name resolver (or simply resolver)
DNS server is used by a browser to map DNS name to IP address:
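The way a resolver walks down the server hierarchy of sections 2.11 to 2.13, and then caches the answer, can be imitated with plain dictionaries. This is a toy model, not the DNS protocol: the server names (`root`, `org-tld-server`, `w3-authoritative`) and the record layout are invented for the sketch, and the single address is the one the text quotes for www.w3.org.

```python
# Each toy server maps a name suffix to either a referral or a final address.
SERVERS = {
    "root": {"org": ("referral", "org-tld-server")},
    "org-tld-server": {"w3.org": ("referral", "w3-authoritative")},
    "w3-authoritative": {"www.w3.org": ("address", "128.30.52.37")},
}


class Resolver:
    def __init__(self, servers):
        self.servers = servers
        self.cache = {}   # name -> address, filled in as names are resolved
        self.queries = 0  # how many server lookups have been made in total

    def _ask(self, server: str, name: str):
        """Return the record for the longest suffix of `name` the server knows."""
        labels = name.split(".")
        for start in range(len(labels)):
            suffix = ".".join(labels[start:])
            if suffix in self.servers[server]:
                return self.servers[server][suffix]
        raise KeyError(f"{server} knows nothing about {name}")

    def resolve(self, name: str) -> str:
        if name in self.cache:  # a cached answer avoids any server queries
            return self.cache[name]
        server = "root"         # always start at the top of the hierarchy
        while True:
            self.queries += 1
            kind, value = self._ask(server, name)
            if kind == "address":
                self.cache[name] = value
                return value
            server = value      # follow the referral one level down


if __name__ == "__main__":
    resolver = Resolver(SERVERS)
    print(resolver.resolve("www.w3.org"), resolver.queries)
    print(resolver.resolve("www.w3.org"), resolver.queries)  # served from cache
```

The second call returns the same address without increasing the query count, which is the performance benefit the notes attribute to caching.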
2.14. URIs, URNs and URLs
a Uniform Resource Identifier (URI) is either
  o a Uniform Resource Name (URN), or
  o a Uniform Resource Locator (URL)
a URN names a resource, while a URL gives its address
URN vs URL is analogous to DNS name vs IP address
2.15. Uniform Resource Names (URNs)
URNs overcome disadvantages of using URLs, namely:
  o dependence on host names
  o dependence on file structure on host
  o ease with which a URL can be invalidated
no syntactic difference between URN and URL
URNs not yet supported by browsers
syntax for URNs: "urn:" <NID> ":" <NSS>
where
  o scheme is urn
  o scheme-specific part is <NID> ":" <NSS>
  o <NID> is Namespace IDentifier, e.g., isbn
  o <NSS> is Namespace Specific String
2.16. Internet Electronic mail
e-mail client responsible for
  o retrieving mail from server (POP3, IMAP4)
  o sending mail to server (SMTP)
e-mail server responsible for
  o collecting mail from client (SMTP)
  o distributing mail to client (POP3, IMAP4)
  o relaying mail between e-mail servers (SMTP)
2.17. Sending e-mail
SMTP (Simple Mail Transfer Protocol) defined in RFC 821 and RFC 822 (1982), superseded by RFC 2821 and RFC 2822 (2001)
use mailto: prefix in URI in browser
uses TCP port 25
address of recipient is of the form name@dept.inst.ac.uk
uses DNS (Domain Name System) to map domain name to IP address
2.18. Example SMTP Session
mail message is transferred from user John_Q_Smith on computer example.edu to two users on computer somewhere.com
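As a rough sketch of the client side of such a session, the function below lists the lines an SMTP client sends, in order: HELO, MAIL FROM, one RCPT TO per recipient, DATA, the message, a line containing only ".", and QUIT. The server's numeric replies are left out. The function name and the two recipient addresses are hypothetical, since the text names only the sender and the two host computers.

```python
def smtp_client_commands(client_host, sender, recipients, message_lines):
    """List the lines an SMTP client would send for one message, in order."""
    commands = [f"HELO {client_host}", f"MAIL FROM:<{sender}>"]
    # One RCPT TO command is sent for each recipient.
    commands += [f"RCPT TO:<{recipient}>" for recipient in recipients]
    commands.append("DATA")
    # A message line starting with "." gets an extra "." so that it is not
    # mistaken for the end-of-message marker.
    commands += ["." + line if line.startswith(".") else line
                 for line in message_lines]
    commands.append(".")  # a line with a single "." ends the message
    commands.append("QUIT")
    return commands


if __name__ == "__main__":
    session = smtp_client_commands(
        "example.edu",
        "John_Q_Smith@example.edu",
        # Hypothetical recipients on the destination computer.
        ["user_one@somewhere.com", "user_two@somewhere.com"],
        ["Subject: greetings", "", "Hello to you both."],
    )
    print("\n".join(session))
```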
2.19. Email Representation Standards
two important standards exist
  o RFC (Request For Comments) 2822 mail message format
  o Multi-purpose Internet Mail Extensions (MIME)
RFC 2822 format comprises
  o a header section
  o a blank line
  o and a body
header lines each have the form keyword: information
where keywords include From, To, Subject, Cc
the mail message (including headers) makes up the DATA as sent by SMTP
2.20. Multi-purpose Internet Mail Extensions (MIME)
SMTP uses 7-bit ASCII format, which is inadequate for non-English and non-textual data
MIME defined in RFCs 2045, 2046, 2047, 2048, 2049; allows
  o non-ASCII message bodies
  o extensible set of different formats for non-textual bodies
  o multi-part message bodies
  o non-ASCII textual header information
2.21. MIME Headers
MIME headers include:
  o MIME-Version
  o Content-Type: specifies a type and subtype
  o Content-Transfer-Encoding: specifies auxiliary encoding for transfer
contents of the Content-Type header is the MIME type
examples of MIME types are text/html, image/gif and multipart/mixed
example of Content-Transfer-Encoding is base64:
  o preferred encoding for 8-bit binary data
  o each group of 3 bytes (24 bits) is encoded as 4 ASCII characters
2.22. Base64 Encoding
     0x00  0x10  0x20  0x30
0    A     Q     g     w
1    B     R     h     x
2    C     S     i     y
3    D     T     j     z
4    E     U     k     0
5    F     V     l     1
6    G     W     m     2
7    H     X     n     3
8    I     Y     o     4
9    J     Z     p     5
A    K     a     q     6
B    L     b     r     7
C    M     c     s     8
D    N     d     t     9
E    O     e     u     +
F    P     f     v     /
values in top row and leftmost column are hexadecimal numbers
range of values is 0x00 to 0x3F (111111)
encode 01011010, 10001010, 00011101, e.g., by
  1. splitting into 4 6-bit values: 010110, 101000, 101000, 011101
  2. converting to hex: 0x16, 0x28, 0x28, 0x1D
  3. using the table to encode: W, o, o, d
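The three steps above can be checked in code. The sketch below encodes one group of three bytes by hand, following the same split-and-look-up procedure, and compares the result with Python's standard `base64` module. The function name `encode_group` is made up for the example, and padding of inputs that are not a multiple of three bytes is not handled.

```python
import base64
import string

# The 64-character alphabet from the table: A-Z, a-z, 0-9, then + and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes (24 bits) as 4 base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch handles exactly 3 bytes")
    bits = int.from_bytes(three_bytes, "big")  # the 24 bits as one integer
    # Step 1: split into four 6-bit values, most significant first.
    sextets = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Steps 2 and 3: use each value as an index into the table.
    return "".join(ALPHABET[value] for value in sextets)


if __name__ == "__main__":
    example = bytes([0b01011010, 0b10001010, 0b00011101])
    print(encode_group(example))
    print(base64.b64encode(example).decode("ascii"))  # library result for comparison
```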
QUIZ
1. What does WWW stand for?
(A) World Wacky Web
(B) Wide World Wumpus
(C) World Wide Web
(D) Wide World of Why?
2. Which one of the following is a search engine?
(A) Netscape
(B) Java
(C) Altavista
(D) Internet
3. What is the URL of the search engine in question 2?
(A) http://www.yahoo.com
(B) http://www.altavista.net
(C) http://www.yahooligans.org
(D) http://www.altavista.com
4. What does URL stand for?
(A) united route link
(B) uniform resource locator
(C) unknown redirection link
(D) up real late
5. What is the name of the language you use to write a web page?
(A) HTTP
(B) FTP
(C) URL
(D) HTML
6. What do you do if you accidentally end up at an inappropriate web site?
(A) Hit "Back" button, and continue surfing
(B) Yell out "Hey look at this everyone!!"
(C) Hit "Back" button immediately, and raise your hand to tell the teacher or tell your parent
(D) Leave it on your screen until the teacher finds out
7. What are the consequences if you download files from the Internet to your computer?
(A) You can play lots of games
(B) Or listen to lots of music
(C) Your computer will get full up of junk
(D) You might download a virus and put your computer and others at risk; always check if the download site is trustworthy
8. Which of the following terms is a "browser"?
(A) Netscape
(B) World Wide Web
(C) Launcher
(D) E-mail
9. All web addresses start with which of the following?
(A) htp
(B) http://
(C) http:/
(D) WWW
10. A word that looks underlined on a web page is usually what?
(A) an important word
(B) the web address
(C) a "link" to another web page
(D) a mistake
11. What is the World Wide Web?
(A) a computer game
(B) a software program
(C) the part of the Internet that enables information-sharing via interconnected pages
(D) another name for the Internet
12. Which is the best search tool for finding Web sites that have been handpicked and recommended by someone else?
(A) subject directories
(B) search engines
(C) meta-search engines
(D) discussion groups
13. The Internet was originally developed by whom?
(A) computer hackers
(B) a corporation
(C) the U.S. Department of Defense
(D) the University of Michigan
14. Which description does NOT apply to the Internet?
(A) an interconnected system of networks that allows for communication through e-mail, LISTSERVS, and the World Wide Web
(B) a public network neither owned nor run by any one group or individual
(C) a vast network that connects millions of computers around the world
(D) a catalog of information organized and fact-checked by a governing body
15. Which one of the following is a search engine?
(A) Macromedia Flash
(B) Google
(C) Netscape
(D) Librarians Index to the Internet
16. Which of the following is a TRUE statement?
(A) You are free to copy information you find on the Web and include it in your research report.
(B) You do not have to cite the Web sources you use in your research report.
(C) You should never consult Web sources when you are doing a research report.
(D) Just like print sources, Web sources must be cited in your research report. You are not free to plagiarize information you find on the Web.
17. What is a URL?
(A) a computer software program
(B) a type of UFO
(C) the address of a document or "page" on the World Wide Web
(D) an acronym for Unlimited Resources for Learning
18. What are the three main search expressions, or operators, recognized by Boolean logic?
(A) FROM, TO, WHOM
(B) AND, OR, NOT
(C) SEARCH, KEYWORD, TEXT
(D) AND, OR, BUT
19. Which of the following is a true statement about the Internet and the library?
(A) They both have an expert librarian or specialist to answer your questions.
(B) They both provide up-to-the-minute news and information.
(C) They both close after hours.
(D) They both provide access to newspapers, magazines, and journals.
20. http://www.classzone.com is an example of what?
(A) a URL
(B) an access code
(C) a directory
(D) a server
21. HTML is used to
(A) Plot complicated graphs
(B) Solve equations
(C) Author webpages
(D) Translate one language into another
22. The "http" you type at the beginning of any site's address stands for
(A) HTML Transfer Technology Process
(B) Hyperspace Techniques and Technology Progress
(C) Hyper Text Transfer Protocol
(D) Hyperspace Terms and Technology Protocol
23. "www" stands for
(A) World Wide Wait
(B) World Wide Web
(C) World Wide War
(D) World Wide Wares
24. ISP stands for
(A) Integrated Service Provider
(B) Internet Security Protocol
(C) Internet Survey Period
(D) Internet Service Provider
25. Google (www.google.com) is a
(A) Number in Math
(B) Chat service on the web
(C) Directory of images
(D) Search Engine
26. Internet Explorer is a
(A) Web Browser
(B) News Reader
(C) Graphing Package
(D) Any person browsing the net
28. On which of the following sites can you set up your email account?
(A) www.gre.org
(B) www.hotmail.com
(C) www.linux.org
(D) www.syvum.com
29. The speed of your net access is defined in terms of
(A) MHz
(B) Megabytes
(C) RAM
(D) Kbps
30. AOL stands for
(A) America Over LAN
(B) America Online
(C) Arranged Outer Line
(D) Audio Over LAN
31. Which of the following is not a method of accessing the web?
(A) ISDN
(B) DSL
(C) Modem
(D) CPU
32. Yahoo (www.yahoo.com) is a
(A) Super Computer
(B) Organization that allocates web addresses
(C) Website for Consumers
(D) Portal
33. What is the name given to the temporary storage area that a web browser uses to store pages and graphics that it has recently opened?
(A) Niche
(B) Cellar
(C) Cache
(D) Webspace
34. A computer on the Internet that hosts data that can be accessed by web browsers using HTTP is known as:
(A) Web Server
(B) Web Rack
(C) Web Space
(D) Web Computer
35. Linux is
(A) A Web Browser
(B) An Operating System
(C) A Web Server
(D) A non-profit organization
36. Microsoft Windows is
(A) A Web Browser
(B) A Web Server
(C) An Operating System
(D) A Spreadsheet Package
37. A domain name ending with "org" is
(A) A commercial website
(B) An organization
(C) A network site
(D) A site which has very high traffic
38. At which of the following sites would you most probably buy books?
(A) www.hotmail.com
(B) www.amazon.com
(C) www.sun.com
(D) www.msn.com
39. What can you do with the Internet?
(A) Exchange information with friends and colleagues
(B) Access pictures, sounds, video clips and other media elements
(C) Find diverse perspectives on issues from a global audience
(D) Post and respond to inquiries on a variety of subjects
40. The Internet was developed in the...
(A) early 1990s
(B) late 1980s
(C) early 1970s
(D) late 1960s
41. According to CNN, how much did Internet traffic increase between 1994 and 1996?
(A) Two times
(B) Five times
(C) Ten times
(D) Twenty-five times
42. USENET is...
(A) A set of tools reserved exclusively for Internet administrators
(B) Short for United States Electronic Network
(C) A bulletin board system that allows for posting and responding to messages on the Internet
(D) A precursor to the Internet that is now obsolete
43. True or false: The Internet is managed by the U.S. government
(A) True
(B) False
44. What is a spider?
(A) A computer virus
(B) A program that catalogs Web sites
(C) A hacker who breaks into corporate computer systems
(D) An application for viewing Web sites
45. What is not always necessary for accessing the Web?
(A) A Web browser
(B) A connection to an Internet Access Provider
(C) A computer
(D) A modem