
1. Introduction to the Internet



1. Computer Networks
2. The Internet
3. Part of the Internet
4. Packet Switching
5. History of the Internet
6. Growth of the Internet
7. Communication Protocols
8. Protocols and Layering
9. TCP/IP 5-layer Reference Model
10. TCP/IP layers with some protocols
11. Data Passing Through Layers
12. Headers and Layers
13. Internet Communication Paradigms
14. Connection-Oriented Communication
15. Client-Server Model
16. Client Software
17. Server Software
18. Server Identification
19. Service Identification
20. Client-Server Interaction
21. A specific example

1.1. Computer Networks
computer networks are everywhere
they form an essential part of our infrastructure
used
o at home
o at work
o by governments
o on the move
o ...
there are many different types of networks and standards
we will concentrate on the Internet
1.2. The Internet

(the above is a very old figure showing an internet)
each ellipse represents a network connecting a number of computers
directly
an internet is a federation of computer networks, connected by routers
the Internet is the world-wide federation of packet-switched networks
running TCP/IP
important applications include email and the World Wide Web (WWW)








1.3. Part of the Internet

(the above figure is taken from the book by Kurose and Ross)

1.4. Packet Switching
early communication networks evolved from telephone systems
used a physical pair of wires between two parties to form a
dedicated circuit
circuit switching was the task of deciding which circuit to use when two
parties wanted to communicate
the circuit is reserved for the two parties during communication
so it is not available to other parties
the Internet uses packet switching which is considered more efficient
packet switching
o divides data into small blocks, called packets
o allows multiple users to share a network
o includes identification of the intended recipient in each packet
o devices throughout the network each have information about how
to reach each possible destination
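The packet-switching idea above can be sketched in a few lines of Python. This is only an illustration: the `Packet` class, the 4-byte packet size and the sequence number (used here to put the blocks back in order) are assumptions for the example, not part of any real protocol described in these notes.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    destination: str  # identifies the intended recipient
    sequence: int     # position of this block in the original data (assumed field)
    payload: bytes    # one small block of the data


def packetize(data: bytes, destination: str, size: int = 4) -> list[Packet]:
    """Divide data into packets carrying at most `size` bytes each."""
    return [
        Packet(destination, seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[Packet]) -> bytes:
    """Rebuild the original data, even if the packets arrived out of order."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    return b"".join(p.payload for p in ordered)


packets = packetize(b"hello, world", "18.23.0.22")
```

Because every packet names its destination, packets from many senders can share the same links and still be delivered and reassembled independently.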
1.5. History of the Internet
(1958) Advanced Research Projects Agency (ARPA) established by US
Department of Defense
(1968-9) first packet-switching networks
(1972) Telnet
(1973) File Transfer Protocol (FTP); ARPANET goes international:
o University College, London (UK)
o Royal Radar Establishment (Norway)
(1974) design of TCP (Transmission Control Protocol)
(1977) email
(1982) TCP and IP (Internet Protocol) used for ARPANET
(1984) DNS (Domain Name System) introduced
(1991) WWW released




1.6. Growth of the Internet


1.7. Communication Protocols
communication always involves at least two entities
o one that sends information and another that receives it
all entities in a network must agree on how information will be
represented and communicated
o the way that electrical signals are used to represent data
o procedures used to initiate and conduct communication
o the format of messages
all communicating parties follow the same set of rules, a set
of specifications
a specification for network communication is called a
communication protocol


1.8. Protocols and Layering
computer networks are complex systems including both hardware and
software
rather than a single, huge specification for all possible forms of
communication, designers divide the communication problem into
subparts, called layers
the interfaces between the layers are defined by protocols
layers provide for modularity, making implementation and changes
easier
the combination of layers is sometimes called a protocol stack
1.9. TCP/IP 5-layer Reference Model

physical layer corresponds to the basic network hardware
network interface, or link, layer specifies how data is organised into
frames for transmission over a single network
Internet layer specifies how packets are forwarded to
particular machines over the Internet, using the Internet Protocol (IP)
transport layer specifies how to communicate with
particular processes on machines, using the Transmission Control
Protocol (TCP) or User Datagram Protocol (UDP)
application layer specifies how applications use the Internet, and
includes protocols such as the HyperText Transfer Protocol (HTTP) and
the Domain Name System (DNS)
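One way to picture how the layers cooperate is that each layer below the application wraps the data it is given with its own header, and the receiving side removes the headers in the opposite order. The sketch below assumes made-up text headers such as `[transport-hdr]`; real headers are binary structures defined by each protocol.

```python
# Layers below the application, in the order data passes down the stack.
# The header strings are invented for illustration only.
LOWER_LAYERS = ["transport", "internet", "link"]


def encapsulate(data: bytes) -> bytes:
    """Going down the stack, each layer prepends its own header."""
    for layer in LOWER_LAYERS:
        data = f"[{layer}-hdr]".encode() + data
    return data


def decapsulate(frame: bytes) -> bytes:
    """Going up the stack, each layer strips the header its peer added."""
    for layer in reversed(LOWER_LAYERS):
        header = f"[{layer}-hdr]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


frame = encapsulate(b"GET /")
```

The outermost header belongs to the lowest layer, so it is the first one the receiving machine sees and removes.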

1.10. TCP/IP layers with some protocols

1.11. Data Passing Through Layers





1.12. Headers and Layers

1.13. Internet Communication Paradigms
the Internet supports two basic communication paradigms:
o stream paradigm
o message paradigm
stream paradigm | message paradigm
--- | ---
connection-oriented | connectionless
one-to-one communication | many-to-many communication
sequence of individual bytes | sequence of individual messages
arbitrary length transfer | each message limited to 64 Kbytes
used by most applications | used for multimedia applications
built on TCP protocol | built on UDP protocol
we will focus on the stream paradigm
1.14. Connection-Oriented Communication
Internet stream service is connection-oriented
two applications must request that a connection be created
once it has been established, the connection allows the applications to
send data in either direction
finally, when they finish communicating, the applications request that
the connection be terminated


1.15. Client-Server Model
server application | client application
--- | ---
starts first | starts second
does not need to know which client will contact it | must know which server to contact
waits passively and arbitrarily long for contact from a client | initiates contact whenever communication is needed
communicates with a client by both sending and receiving data | communicates with a server by both sending and receiving data
stays running after servicing one client, and waits for another | may terminate after interacting with a server
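The connection lifecycle and the two roles can be tried out on a single machine with Python's standard `socket` module. This is a minimal sketch, not a production server: the upper-casing "service", the loopback address and the OS-chosen port are assumptions made so the example is self-contained, and this server handles one client and then stops.

```python
import socket
import threading


def serve_one_client(listener: socket.socket) -> None:
    """Wait passively for one client, reply in upper case, then close."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


# The server starts first. Port 0 asks the OS for any free port;
# a real service would register a well-known port instead.
listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]
server = threading.Thread(target=serve_one_client, args=(listener,))
server.start()

# The client starts second, initiates contact, and data flows both ways.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)
# Leaving the "with" block terminates the connection.

server.join()
listener.close()
```

A long-running server would call `accept()` in a loop so that it stays available after servicing each client.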
1.16. Client Software
is an arbitrary application program that becomes a client temporarily
when remote access is needed, but also performs other computation
is invoked directly by a user, and executes only for one session
runs locally on a user's personal computer
actively initiates contact with a server
can access multiple services as needed, but usually contacts one remote
server at a time
does not require especially powerful computer hardware
1.17. Server Software
is a special-purpose, privileged program
is dedicated to providing one service that can handle multiple remote
clients at the same time
is invoked automatically when a system boots, and continues to execute
through many sessions
runs on a large, powerful computer
waits passively for contact from arbitrary remote clients
accepts contact from arbitrary clients, but offers a single service
requires powerful hardware and a sophisticated operating system (OS)
1.18. Server Identification
Internet protocols divide identification into two pieces:
o an identifier for the computer on which a server runs
o an identifier for a service on the computer
identifying a computer
o each computer on the Internet is assigned a unique 32-bit identifier
known as an Internet Protocol address (IP address)
o 4 bytes written as n1.n2.n3.n4 where each ni is a decimal number,
e.g., 18.23.0.22
o a client must specify the server's IP address
o to make server identification easy for humans, each computer is
also assigned a name, and the Domain Name System (DNS) is
used to translate a name into an address
o thus, a user specifies a name such as www.dcs.bbk.ac.uk rather
than an integer address
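The relationship between the dotted form and the underlying 32-bit value can be checked with a short sketch. The helper name `dotted_to_int` is made up for this example; the standard `ipaddress` module is used only to confirm the result.

```python
import ipaddress


def dotted_to_int(address: str) -> int:
    """Combine the four decimal bytes n1.n2.n3.n4 into one 32-bit integer."""
    n1, n2, n3, n4 = (int(part) for part in address.split("."))
    return (n1 << 24) | (n2 << 16) | (n3 << 8) | n4


value = dotted_to_int("18.23.0.22")

# The standard library performs the same conversion.
same = int(ipaddress.IPv4Address("18.23.0.22"))
```

Each of the four numbers occupies one byte, which is why none of them can exceed 255.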
1.19. Service Identification
each service available in the Internet is assigned a unique 16-bit
identifier known as a protocol port number (or port number)
o examples: email -> port number 25, and the web -> port number
80
when a server begins execution
o it registers with its local OS by specifying the port number for its
service
when a client contacts a remote server to request service
o the request contains a port number
when a request arrives at a server computer
o software on the server uses the port number in the request to
determine which application on the server computer should handle
the request
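The registration and lookup steps above can be modelled with a simple table. This is a toy model under stated assumptions: the `registered` dictionary and the `register` and `dispatch` functions are hypothetical stand-ins for what an operating system does internally.

```python
# Hypothetical registry kept by the OS: port number -> handling application.
registered: dict[int, str] = {}


def register(port: int, application: str) -> None:
    """A server registers the 16-bit port number of its service at start-up."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port numbers are 16-bit values (0 to 65535)")
    registered[port] = application


def dispatch(port: int) -> str:
    """Use the port number in an arriving request to pick the application."""
    return registered.get(port, "no server registered")


register(25, "mail server")
register(80, "web server")
```

A request arriving for port 80 is then handed to the web server, while a request for an unregistered port has nowhere to go.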
1.20. Client-Server Interaction
following diagram illustrates client-server interaction
o client on the left
o server on the right

1.21. A specific example
suppose I want to retrieve a web page from www.w3.org
my browser will use DNS to find the IP address 128.30.52.37
my browser will compose a message based on HTTP asking to get the
page
HTTP will ask TCP to connect to port 80 on 128.30.52.37
TCP will ask IP to send the message to 128.30.52.37
IP will send the message to a router on the local network
this router will send the message to another router
...
the router on the local network for 128.30.52.37 will receive the
message
it will send the message to 128.30.52.37
IP will receive it and pass it up to TCP
TCP will see that it is for port 80 and will pass it to the web server
process
the web server will interpret the HTTP and send the page to my browser
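As a rough illustration of the message the browser composes in this example, the sketch below builds a minimal HTTP GET request as bytes. The HTTP/1.1 version, the `Connection: close` header and the function name are assumptions for the example; the address and port are the ones quoted in the text and may no longer be current.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Compose a minimal HTTP GET request as it would be handed to TCP."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("www.w3.org")

# Where TCP and IP would be asked to deliver it, per the example in the text.
destination = ("128.30.52.37", 80)
```

Everything below HTTP treats `request` as opaque bytes; only the web server process at the far end interprets it.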













2. Internet Applications

1. Representation and Transfer
2. Web Protocols
3. Some Other Application Layer Protocols
4. Uniform Resource Identifiers (URIs)
5. Uniform Resource Locators (URLs)
6. URL schemes
7. Escaping Special URI characters
8. Domain Name System (DNS)
9. DNS Design
10. Top-Level Domains
11. DNS Server Hierarchy
12. DNS Server Model
13. Name Resolution
14. URIs, URNs and URLs
15. Uniform Resource Names (URNs)
16. Internet Electronic mail
17. Sending e-mail
18. Example SMTP Session
19. Email Representation Standards
20. Multi-purpose Internet Mail Extensions (MIME)
21. MIME Headers
22. Base64 Encoding

2.1. Representation and Transfer
application-layer protocols specify two aspects of interaction
o representation
o transfer
representation:
o syntax of data items exchanged
o specific form during transfer
o translation of integers, characters and files between computers
transfer:
o interaction between client and server
o message syntax and semantics
o valid and invalid exchange
o error handling
o termination of interaction
2.2. Web Protocols
World Wide Web (WWW) is one of the most widely used services on
the Internet
major WWW standards are
o HyperText Markup Language (HTML): representation standard
specifying contents and layout of a web page
o Uniform Resource Identifier (URI): representation standard
specifying format and meaning of web page identifiers
o HyperText Transfer Protocol (HTTP): transfer protocol specifying
how a browser interacts with a web server
2.3. Some Other Application Layer Protocols
telnet (for remote login)
o defined in RFC 318 (1972)
ftp (file transfer protocol)
o defined in RFC 454 (1973)
email protocols
o SMTP (Simple Mail Transfer Protocol)
o POP3 (Post Office Protocol version 3)
o IMAP4 (Internet Message Access Protocol version 4)
DNS (Domain Name System)
o defined in RFC 1034 and RFC 1035 (1987)
RTP (Real-time Transport Protocol) for audio and video
o defined in RFC 3550 (2003)
2.4. Uniform Resource Identifiers (URIs)
a Uniform Resource Identifier (URI) is a string that uniquely identifies
a resource on the Internet
basic syntax is:
scheme ":" scheme-specific-part
where
o scheme identifies a naming scheme, e.g., http
o scheme-specific-part identifies resource in some way specific to
the scheme
o most commonly used URIs are Uniform Resource
Locators (URLs)
2.5. Uniform Resource Locators (URLs)
scheme examples include
o ftp, http, https, mailto, telnet
in the following syntax, [ ... ] denotes an optional part
everything else not in quotes denotes a string to be supplied
scheme specific part has syntax
"//" [ user [ ":" password ] "@" ] host [ ":" port ] [ "/" url-path ]
[ "?" query-string ] [ "#" anchor ]
where
o user and password are not often used
o host is a fully qualified domain name or IP address
o port is optional (usually a default)
o url-path is the path to the resource, specific to scheme
o query-string includes parameters associated with the request
(usually form fields)
o anchor is a reference to a part of a resource (a fragment identifier)
e.g. in http://vili.dcs.bbk.ac.uk/dept/staffperson05.asp?name=ptw
o http is the scheme
o vili.dcs.bbk.ac.uk is the host
o dept/staffperson05.asp is the url-path
o name=ptw is the query string
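The same breakdown can be reproduced with Python's standard `urllib.parse` module. This is a small sketch using the URL from the example above; note that the library reports the path with its leading "/".

```python
from urllib.parse import parse_qs, urlsplit

url = "http://vili.dcs.bbk.ac.uk/dept/staffperson05.asp?name=ptw"
parts = urlsplit(url)

scheme = parts.scheme          # the naming scheme
host = parts.hostname          # the fully qualified domain name
port = parts.port              # None here: no explicit port, so the default applies
path = parts.path              # the url-path, including the leading "/"
query = parse_qs(parts.query)  # query-string parameters as a dictionary
```

Since no port is written in the URL, a client would fall back to the scheme's default port.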
2.6. URL schemes
http
o user name and password not applicable
o default port number is 80
https
o HTTP over Secure Sockets Layer (SSL)
o default port number is 443
ftp
o user name and password can be given
o if not, anonymous ftp used
o default port number is 21
telnet
o host is mandatory
o default port number is 23
mailto
o no need for url-path to be specified
o program should prompt user for message, then send using SMTP
2.7. Escaping Special URI characters
the space character is not allowed in URIs
/, #, ? have special meanings in URIs
also + has a special meaning in a query string (it stands for a space,
e.g., between search terms)
so if we need any of these as an ordinary character in a URI, we use
the escaped version
the escaped version is the character % followed by the two-digit
hexadecimal ASCII code of the character
now % has a special meaning too
the escaped versions are as follows:
symbol | escaped version
--- | ---
% | %25
/ | %2F
# | %23
? | %3F
space | %20
+ | %2B
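The escaping rule can be checked with a short sketch. The helper `escape_char` and the sample text are made up for illustration; `quote` and `unquote` are the standard-library functions from `urllib.parse`, and `safe=""` asks `quote` to escape "/" as well.

```python
from urllib.parse import quote, unquote


def escape_char(ch: str) -> str:
    """Escape one ASCII character as % followed by its two-digit hex code."""
    return f"%{ord(ch):02X}"


text = "50% off? a/b #1 + more"
escaped = quote(text, safe="")  # escape every special character
restored = unquote(escaped)     # decoding gives back the original text
```

Escaping % itself first (as %25) is what makes the encoding reversible.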
2.8. Domain Name System (DNS)
provides a service mapping (human-readable) DNS names to IP
addresses
browsers, mail software and most other Internet applications use DNS
two advantages:
o easier to remember www.w3.org than 128.30.52.37
o higher level of abstraction allows simpler reorganisation
names are organised hierarchically:
o most significant part of the name on the right (specified by DNS)
o left-most segment of a name is the name of an individual
computer
DNS is essentially
o a distributed database implemented as a hierarchy of DNS servers
o an application-layer protocol allowing hosts to query the
database
2.9. DNS Design
why is DNS distributed?
a simpler design would have been to have one DNS server storing all the
mappings
problems with this centralised design include:
o it is a single point of failure
o the need to handle huge volumes of queries
o a single server cannot be "close" to all clients
o it would also have to handle all updates for new hosts
2.10. Top-Level Domains
right-most domains of the hierarchy are top-level domains:
o either country-code top-level domain (ccTLD)
o or generic top-level domain (gTLD)
ccTLD represented by two-letter country-codes from ISO 3166,
e.g., uk, fr, de, ch
gTLD given in Internet informational RFC 1591:
o edu: educational institutions
o com: commercial entities, i.e., companies
o net: network providers
o org: organisations, e.g. NGOs
o gov: government agencies
o mil: US military
o int: organisations established by international treaties
2.11. DNS Server Hierarchy
the following shows a portion of the hierarchy of DNS servers

there are 13 root DNS servers (each is actually a cluster of replicated
servers)
o these return IP addresses of top-level domain servers
top-level domain servers are responsible for top-level domains
o they return IP addresses of authoritative servers for organisations
each organisation must provide an authoritative DNS server for its
publicly accessible hosts
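The three levels of the hierarchy can be imitated with nested dictionaries. This is a toy model, not a DNS client: the server names (`tld-org`, `ns-w3`) are invented, real servers return IP addresses of the next server rather than names, and the final address is simply the one quoted elsewhere in these notes.

```python
# Toy data standing in for the three levels of DNS servers.
ROOT = {"org": "tld-org"}                  # root -> top-level domain server
TLD = {"tld-org": {"w3.org": "ns-w3"}}     # TLD -> authoritative server
AUTHORITATIVE = {"ns-w3": {"www.w3.org": "128.30.52.37"}}


def resolve(name: str) -> str:
    """Follow root -> top-level domain -> authoritative server for `name`."""
    labels = name.split(".")
    tld_server = ROOT[labels[-1]]              # ask a root server
    domain = ".".join(labels[-2:])
    auth_server = TLD[tld_server][domain]      # ask the TLD server
    return AUTHORITATIVE[auth_server][name]    # ask the organisation's server


address = resolve("www.w3.org")
```

The lookup reads the name from right to left, matching the rule that the most significant part of a DNS name is on the right.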
2.12. DNS Server Model
each organisation is free to choose how to organise its servers
o a small organisation might use an ISP to run a DNS server
o a larger organisation might place all names on a single server
o a large organisation might divide its names among several servers
DNS allows each organisation to assign names to computers or
to change those names without informing a central authority
each DNS server contains information linking it to other DNS servers up
and down the hierarchy
a given server can be replicated
replication is useful for heavily used servers, such as root servers that
provide information about top-level domains
DNS servers employ caching in order to improve performance and
reduce load
2.13. Name Resolution
translation of a domain name into an address is called name resolution
the name is said to be resolved to an address
software to perform the translation is known as a name resolver (or
simply resolver)
DNS server is used by a browser to map DNS name to IP address:

2.14. URIs, URNs and URLs
a Uniform Resource Identifier (URI) is either
o a Uniform Resource Name (URN), or
o a Uniform Resource Locator (URL)
URN names a resource, while URL gives its address
URN vs URL analogous to DNS name vs IP address
2.15. Uniform Resource Names (URNs)
overcome disadvantages of using URLs, namely:
o dependence on host names
o dependence on file structure on host
o ease with which URL can be invalidated
no syntactic difference between URN and URL
URNs not yet supported by browsers
syntax for URNs:
"urn:" <NID> ":" <NSS>
where
o scheme is urn
o scheme specific part is <NID> ":" <NSS>
o <NID> is Namespace IDentifier, e.g., isbn
o <NSS> is Namespace Specific String
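The URN syntax is simple enough to split by hand. The sketch below assumes a hypothetical helper `parse_urn` and an ISBN value chosen purely for illustration.

```python
def parse_urn(urn: str) -> tuple[str, str]:
    """Split "urn:<NID>:<NSS>" into the pair (NID, NSS)."""
    scheme, nid, nss = urn.split(":", 2)  # the NSS may itself contain ":"
    if scheme.lower() != "urn":
        raise ValueError("not a URN")
    return nid, nss


nid, nss = parse_urn("urn:isbn:0451450523")
```

Nothing in the result says where the resource is stored, which is the point of a name as opposed to a locator.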
2.16. Internet Electronic mail
e-mail client responsible for
o retrieving mail from server (POP3, IMAP4)
o sending mail to server (SMTP)
e-mail server responsible for
o collecting mail from client (SMTP)
o distributing mail to client (POP3, IMAP4)
o relaying mail between e-mail servers (SMTP)

2.17. Sending e-mail
SMTP (Simple Mail Transfer Protocol)
defined in RFC 821 and 822 (1982), superseded by RFC 2821 and 2822 (2001)
use mailto: prefix in URI in browser
uses TCP port 25
address of recipient is of the form
name@dept.inst.ac.uk
uses DNS (Domain Name System) to map domain name to IP address
2.18. Example SMTP Session
mail message is transferred from user John_Q_Smith on
computer example.edu to two users on computer somewhere.com

2.19. Email Representation Standards
two important standards exist
o RFC (Request For Comments) 2822 mail message format
o Multi-purpose Internet Mail Extensions (MIME)
RFC 2822 format comprises
o a header section
o a blank line
o and a body
header lines each have the form
keyword: information
where keywords include From, To, Subject, Cc
the mail message (including headers) makes up the DATA as sent by
SMTP
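The header, blank line and body structure can be seen by parsing a small message with Python's standard `email` package. The addresses and text below are invented for the example (the domains echo the SMTP example in these notes).

```python
from email import message_from_string

# Header lines, then a blank line, then the body.
raw = (
    "From: john@example.edu\n"
    "To: jane@somewhere.com\n"
    "Subject: Meeting\n"
    "\n"
    "See you at ten.\n"
)

message = message_from_string(raw)
subject = message["Subject"]   # header lookup by keyword
body = message.get_payload()   # everything after the blank line
```

The parser relies on the first blank line to know where the headers stop and the body starts.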
2.20. Multi-purpose Internet Mail Extensions (MIME)
SMTP uses 7-bit ASCII format
inadequate for non-English and non-textual data
MIME defined in RFCs 2045, 2046, 2047, 2048, 2049; allows
o non-ASCII message bodies
o extensible set of different formats for non-textual bodies
o multi-part message bodies
o non-ASCII textual header information
2.21. MIME Headers
MIME headers include:
o MIME-Version
o Content-Type: specifies a type and subtype
o Content-Transfer-Encoding: specifies auxiliary encoding for
transfer
the value of the Content-Type header is the MIME type
examples of MIME types are text/html, image/gif and multipart/mixed
example of Content-Transfer-Encoding is base64:
o preferred encoding for 8-bit binary data
o each group of 3 bytes (24 bits) is encoded as 4 ASCII characters
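To see these headers on a real message, the sketch below builds a two-part message with Python's standard `email.message.EmailMessage`. The subject, text and file name `data.bin` are made up for illustration; the library chooses the header values.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("Text part of the message.")

# Attaching binary data turns the message into a multi-part message.
msg.add_attachment(
    b"\x5a\x8a\x1d",
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

text_part, binary_part = msg.iter_parts()
top_type = msg.get_content_type()
binary_encoding = binary_part["Content-Transfer-Encoding"]
```

Printing `msg.as_string()` shows the MIME-Version header once at the top and a separate Content-Type header for each part.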
2.22. Base64 Encoding

0x00 0x10 0x20 0x30
0 A Q g w
1 B R h x
2 C S i y
3 D T j z
4 E U k 0
5 F V l 1
6 G W m 2
7 H X n 3
8 I Y o 4
9 J Z p 5
A K a q 6
B L b r 7
C M c s 8
D N d t 9
E O e u +
F P f v /
values in top row and leftmost column are hexadecimal numbers
range of values is 0x00 to 0x3F (111111)
for example, to encode the three bytes 01011010, 10001010, 00011101:
1. split the 24 bits into four 6-bit values:
010110, 101000, 101000, 011101
2. convert each to hex: 0x16, 0x28, 0x28, 0x1D
3. use the table to encode: W, o, o, d
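The worked example can be verified with a few lines of Python. The function `encode_group` is a hypothetical helper that handles exactly one 3-byte group (no padding); the standard `base64` module is used only as a cross-check.

```python
import base64

# The 64 characters of the table, indexed by 6-bit value (0x00 to 0x3F).
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes (24 bits) as 4 Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    bits = int.from_bytes(three_bytes, "big")  # one 24-bit number
    sextets = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[s] for s in sextets)


data = bytes([0b01011010, 0b10001010, 0b00011101])
encoded = encode_group(data)
library_result = base64.b64encode(data).decode("ascii")
```

Both approaches give the same four characters as the table lookup in the steps above.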

QUIZ

1. What does WWW stand for?
World Wacky Web
Wide World Wumpus
World Wide Web
Wide World of Why?
2. Which one of the following is a search engine?
Netscape
Java
Altavista
Internet
3. What is the URL of the search engine in question 2?
http://www.yahoo.com
http://www.altavista.net
http://www.yahooligans.org
http://www.altavista.com
4. What does URL stand for?
united route link
uniform resource locator
unknown redirection link
up real late
5. What is the name of the language you use to write a web
page?
HTTP
FTP
URL
HTML
6. What do you do if you accidentally end up at an
inappropriate web site?
Hit "Back" button, and continue surfing
Yell out "Hey look at this everyone!!"
Hit "Back" button immediately, and raise your hand to tell
the teacher or tell your parent
Leave it on your screen until the teacher finds out

7. What are the consequences if you download files from the
Internet to your computer?
You can play lots of games
Or listen to lots of music
Your computer will fill up with junk
You might download a virus and put your computer and
others at risk; always check if the download site is trustworthy

8. Which of the following terms is a "browser"?
Netscape
World Wide Web
Launcher
E-mail

9. All web addresses start with which of the following?
htp
http://
http:/
WWW

10. A word that looks underlined on a web page is usually
what?
an important word
the web address
a "link" to another web page
a mistake

11. What is the World Wide Web?


(A) a computer game


(B) a software program


(C) the part of the Internet that enables
information-sharing via interconnected
pages


(D) another name for the Internet

12. Which is the best search tool for finding Web
sites that have been handpicked and
recommended by someone else?


(A) subject directories


(B) search engines


(C) meta-search engines


(D) discussion groups

13. The Internet was originally developed by
whom?


(A) computer hackers


(B) a corporation


(C) the U.S. Department of Defense


(D) the University of Michigan

14. Which description does NOT apply to the
Internet?


(A) an interconnected system of networks
that allows for communication through
e-mail, LISTSERVS, and the World Wide
Web


(B) a public network neither owned nor run
by any one group or individual


(C) a vast network that connects millions of
computers around the world


(D) a catalog of information organized and
fact-checked by a governing body

15. Which one of the following is a search engine?


(A) Macromedia Flash


(B) Google


(C) Netscape


(D) Librarians Index to the Internet

16. Which of the following is a TRUE statement?


(A) You are free to copy information you
find on the Web and include it in your
research report.


(B) You do not have to cite the Web sources
you use in your research report.


(C) You should never consult Web sources
when you are doing a research report.


(D) Just like print sources, Web sources
must be cited in your research report.
You are not free to plagiarize
information you find on the Web.

17. What is a URL?


(A) a computer software program


(B) a type of UFO


(C) the address of a document or "page" on
the World Wide Web


(D) an acronym for Unlimited Resources for
Learning

18. What are the three main search expressions,
or operators, recognized by Boolean logic?


(A) FROM, TO, WHOM


(B) AND, OR, NOT


(C) SEARCH, KEYWORD, TEXT


(D) AND, OR, BUT

19. Which of the following is a true statement
about the Internet and the library?


(A) They both have an expert librarian or
specialist to answer your questions.


(B) They both provide up-to-the-minute
news and information.


(C) They both close after hours.


(D) They both provide access to
newspapers, magazines, and journals.

20. http://www.classzone.com is an example of
what?


(A) a URL


(B) an access code


(C) a directory


(D) a server


21. HTML is used to
Plot complicated graphs
Solve equations
Author webpages
Translate one language into another


22. The "http" you type at the beginning of any site's address
stands for
HTML Transfer Technology Process
Hyperspace Techniques and Technology Progress
Hyper Text Transfer Protocol
Hyperspace Terms and Technology Protocol


23. "www" stands for
World Wide Wait
World Wide Web
World Wide War
World Wide Wares


24. ISP stands for
Integrated Service Provider
Internet Security Protocol
Internet Survey Period
Internet Service Provider


25. Google (www.google.com) is a
Number in Math
Chat service on the web
Directory of images
Search Engine


26. Internet Explorer is a
Web Browser
News Reader
Graphing Package
Any person browsing the net



27. Modem stands for
Memory Demagnetization
Monetary Devaluation Exchange Mechanism
Modulator Demodulator
Monetary Demarkation


28. On which of the following sites can you set up your email
account:
www.gre.org
www.hotmail.com
www.linux.org
www.syvum.com


29. The speed of your net access is defined in terms of
MHz
Megabytes
RAM
Kbps


30. AOL stands for
America Over LAN
America Online
Arranged Outer Line
Audio Over LAN


31. Which of the following is not a method of accessing the
web?
ISDN
DSL
Modem
CPU


32. Yahoo (www.yahoo.com) is a
Super Computer
Organization that allocates web addresses
Website for Consumers
Portal



33. What is the name given to the temporary storage area that
a web browser uses to store pages and graphics that it has
recently opened?
Niche
Cellar
Cache
Webspace


34. A computer on the Internet that hosts data, that can be
accessed by web browsers using HTTP is known as:
Web Server
Web Rack
Web Space
Web Computer


35. Linux is
A Web Browser
An Operating System
A Web Server
A non-profit organization


36. Microsoft Windows is
A Web Browser
A Web Server
An Operating System
A Spreadsheet Package



37. A domain name ending with "org" is
A commercial website
An organization
A network site
A site which has very high traffic


38. At which of the following sites would you most probably
buy books?
www.hotmail.com
www.amazon.com
www.sun.com
www.msn.com


39. What can you do with the Internet?
Exchange information with friends and colleagues
Access pictures, sounds, video clips and other media
elements
Find diverse perspectives on issues from a global audience
Post and respond to inquiries on a variety of subjects



40. The Internet was developed in the...
early 1990s
late 1980s
early 1970s
late 1960s


41. According to CNN, how much did Internet traffic increase
between 1994 and 1996?
Two times
Five times
Ten times
Twenty-five times


42. USENET is...
A set of tools reserved exclusively for Internet
administrators
Short for United States Electronic Network
A bulletin board system that allows for posting and
responding to messages on the Internet
A precursor to the Internet that is now obsolete

43. True or false: The Internet is managed by the U.S.
government
True
False

44. What is a spider?
A computer virus
A program that catalogs Web sites
A hacker who breaks into corporate computer systems
An application for viewing Web sites

45. What is not always necessary for accessing the Web?
A Web browser
A connection to an Internet Access Provider
A computer
A modem
