
Computer Networks

Lecture 7: Application layer: FTP and HTTP

Marcin Bienkowski
Institute of Computer Science
University of Wrocław

Computer networks (II UWr)

Lecture 7

1 / 23

Reminder: Internet reference model

FTP, HTTP, DNS (application layer)
BSD sockets interface
UDP, TCP (transport layer)
IP, ICMP (network layer)
ARP / RARP
Ethernet (link layer)


Outlook

FTP

HTTP
Protocol description
Proxy servers


FTP


File Transfer Protocol


Protocol for sending files to and receiving files from a server.
The server listens on port 21 (control connection).
After connecting, the client issues Unix-like commands.
For each data transfer, an additional connection on a separate port is opened.


Connection for data (e.g., downloading a file)

Active mode
The FTP client chooses a port, informs the server about it, and starts
listening on that port.
The FTP server connects to this port and sends the data to it.
Problematic if the client is behind a firewall.

Passive mode
The client requests that the server choose a port.
The server picks a port, informs the client about it, and starts listening.
The client connects to this port and receives the data from it.
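
The way the port number travels over the control connection can be sketched as follows. This is a minimal illustration, assuming the classic FTP encoding in which an address and port are written as six comma-separated decimal numbers (h1,h2,h3,h4,p1,p2) with port = p1 * 256 + p2. The function names and the example addresses are illustrative and not part of the lecture.

```python
import re


def parse_pasv_reply(reply: str) -> tuple:
    """Passive mode: extract (host, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a valid PASV reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    # The port is sent as two bytes: high byte first, then low byte.
    port = numbers[4] * 256 + numbers[5]
    return host, port


def format_port_command(host: str, port: int) -> str:
    """Active mode: build the PORT command announcing where the client listens."""
    octets = host.split(".")
    return "PORT {},{},{},{},{},{}".format(*octets, port // 256, port % 256)
```

For example, a server reply ending in (192,0,2,10,19,137) tells the client to connect to 192.0.2.10 on port 19 * 256 + 137 = 5001.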


HTTP


Protocol description

HyperText Transfer Protocol

Protocol for transferring files (like FTP).
Very mature and complex protocol (this lecture covers version 1.1).
Uses a different namespace than FTP.
The server listens on port 80 by default.


URL (Uniform Resource Locator) (1)

URL: Identifies a given resource


Consists of two colon-separated parts
scheme: (http, ftp, mailto, file, ...)
resource-dependent part

Examples:
http://www.ii.uni.wroc.pl/index.html
http://pl.wikipedia.org/wiki/URL
ftp://ftp.kernel.org/pub/index.html
mailto:jan.kowalski@serwer.com


URL (2)
URL for the http and ftp schemes
The part after the colon consists of:
//
domain name
optionally :port
/
resource identifier within the server
Example:
http://www.ii.uni.wroc.pl:80/mbi/dyd/sieciw_10s/
Note:
The / characters in the identifier denote a hierarchical structure.
The resource identifier is not necessarily a path to a file!
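
The decomposition above can be tried out with Python's standard urllib.parse module. This is only a sketch: the helper name split_url and the DEFAULT_PORTS table are assumptions made for the example, filling in the well-known port when the optional :port part is missing.

```python
from urllib.parse import urlsplit

# Assumed defaults used when the URL carries no explicit :port part.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def split_url(url: str) -> dict:
    """Split an http/ftp URL into scheme, domain name, port and resource identifier."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": port,
        # An empty identifier is treated as the root of the hierarchy.
        "resource": parts.path or "/",
    }
```

Calling split_url on the example URL from the slide yields the scheme http, the host www.ii.uni.wroc.pl, port 80 and the resource identifier /mbi/dyd/sieciw_10s/.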

HTTP request and reply


How it works:
The user enters a URL in the web browser; the browser splits it into parts
(we assume that scheme = http).
The web browser establishes a connection with the web server on
port 80.
It sends an HTTP request (GET method).
The server analyses the request and fetches the appropriate file from disk.
The server sets an appropriate reply header and MIME type.
The server sends the file.
The server closes the connection (or waits for another request).
The web browser performs an action depending on the MIME type
(displays the content / uses a plugin / uses an external application).
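
What the browser sends and what it gets back can be illustrated with a small sketch that builds a GET request and splits a reply into status code, header fields and body. The function names are made up for this example, and the parser is deliberately simplified: it assumes a well-formed reply with a reason phrase and ignores details such as chunked transfer encoding.

```python
def build_get_request(host: str, resource: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request; lines end with CRLF, headers end with an empty line."""
    lines = [
        f"GET {resource} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_reply(raw: bytes) -> tuple:
    """Split a complete reply into (status code, header dict, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body
```

The bytes returned by build_get_request could be written to a TCP socket connected to port 80 of the server (for instance one opened with socket.create_connection), and everything read back until the server closes the connection could be passed to parse_reply.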


Keep-alive connections

TCP connection handshake = large overhead.
Usually a web browser wants to download many documents at once
(e.g., an HTML page + pictures).
HTTP/1.1 standard: the connection is kept alive by default.
The connection is closed if the request contains Connection: close.
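
The rule for keeping the connection open can be written down as a short decision function. This is a sketch under stated assumptions: header names are expected in lower case, and the HTTP/1.0 branch (a connection persists only when Connection: keep-alive is sent explicitly) is the common convention for older clients rather than something stated in the lecture.

```python
def connection_persists(version: str, headers: dict) -> bool:
    """Decide whether the connection stays open after the current request."""
    value = headers.get("connection", "").lower()
    tokens = {token.strip() for token in value.split(",")}
    if "close" in tokens:
        return False
    if version == "HTTP/1.1":
        # HTTP/1.1: kept alive by default.
        return True
    # Assumption for older versions: persistence has to be requested explicitly.
    return "keep-alive" in tokens
```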


MIME type

For every file sent, the HTTP server should set the Content-Type field
appropriately. Examples:
text/plain                  text file
text/html                   HTML page
image/jpeg                  JPEG picture
video/mpeg                  MPEG video
application/msword          DOC document
application/pdf             PDF document
application/octet-stream    sequence of bytes without an interpretation
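
A server might pick the Content-Type from the file extension, as in the sketch below. The table only repeats the examples listed above, and the choice of extensions and the helper name content_type_for are assumptions for illustration; Python's standard mimetypes module offers a much fuller mapping.

```python
import os

# Extension-to-type table built from the examples on the slide (extensions are assumed).
CONTENT_TYPES = {
    ".txt": "text/plain",
    ".html": "text/html",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".mpeg": "video/mpeg",
    ".doc": "application/msword",
    ".pdf": "application/pdf",
}


def content_type_for(filename: str) -> str:
    """Return the Content-Type for a file, falling back to uninterpreted bytes."""
    extension = os.path.splitext(filename)[1].lower()
    return CONTENT_TYPES.get(extension, "application/octet-stream")
```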


HTTP replies

Important types of replies:


200 OK
301 Moved Permanently
302 Found
304 Not Modified
401 Unauthorized
403 Forbidden
404 Not Found
500 Internal Server Error
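
The first digit of a reply code gives its class, which can be checked with Python's standard http.HTTPStatus enumeration. The helper name describe and the wording of the class labels are choices made for this sketch.

```python
from http import HTTPStatus

# Class of a reply, determined by the first digit of the code.
STATUS_CLASSES = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}


def describe(code: int) -> str:
    """Return 'code phrase (class)' for a standard HTTP reply code."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase} ({STATUS_CLASSES.get(code // 100, 'other')})"
```

For instance, describe(404) gives "404 Not Found (client error)" and describe(301) gives "301 Moved Permanently (redirection)".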


HTML

HTTP was designed for sending hypertext = text + links to other texts.
The hypertext itself is written in HTML.
HTTP + HTML = WWW.
HTML standardization is a W3C task.


HTML versions
Quick look into history
HTML 1.0, 2.0: mainly academic usage, content is most important.
HTML 3.0, 3.2, 4.0: the emphasis shifts to presentation (mixed
with content).
HTML 4.01: also known as "everything is allowed"; with many sloppily
written web pages, the web browser has to cope not only with the
complicated standard but also with dozens of deviations from it.
XHTML 1.0: based on XML, rigid structure, separates content and
structure (HTML) from the presentation (CSS styles):
rigid format = easier processing,
automatic processing of data on the web page,
one HTML, many CSS = different versions for different recipients
(PDAs, phones, visually impaired users, ...).


Dynamic WWW
Client-side dynamics
JavaScript: simple object-oriented interpreted language, code
embedded in the HTML.
Java applets, Flash, Silverlight: applications executed by
different web browser plugins.
Server-side dynamics
A URI may point to a program whose output is HTML (+ HTTP
header).
CGI (Common Gateway Interface): standard allowing for execution
of an arbitrary external program.
Mechanisms integrated with the web server (PHP, JSP, ASP,
mod_perl, ...).

Forms, parameter passing (GET and POST methods).
Cookies = session handling; HTTP itself is stateless.
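
A CGI-style program producing HTML plus an HTTP header can be sketched as below. The example assumes the usual CGI convention that parameters of a GET request arrive in the QUERY_STRING environment variable; the parameter called name and the function cgi_response are hypothetical.

```python
import html
from urllib.parse import parse_qs


def cgi_response(query_string: str) -> str:
    """Produce the program output: a header, an empty line, then the HTML body."""
    params = parse_qs(query_string)
    # 'name' is a hypothetical form parameter; fall back to a default if it is absent.
    name = params.get("name", ["world"])[0]
    # Escape user input before embedding it in HTML.
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body
```

A real CGI script would read os.environ.get("QUERY_STRING", "") and print the result of cgi_response to standard output, from where the web server forwards it to the client.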

Abuses of HTTP protocol

Some WWW services allow automated (non-human) access.
Instead of creating a new protocol, they use HTTP as a transport.
REST (Representational State Transfer): creating a web service
using the existing HTTP methods (GET, PUT, POST, DELETE).
REST is not a standard, but rather a philosophy.
Easy to automate, but also human-readable.
Example services: eBay, Amazon, Twitter, Flickr, ...
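
How the HTTP methods map onto operations of a REST-style service can be sketched with the standard urllib.request module. The service URL, the notes resource and the JSON payload are purely hypothetical; the code only constructs request objects and does not contact any server.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://api.example.com/notes"  # hypothetical service, for illustration only


def rest_request(method: str, note_id: str = "",
                 payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one operation on the 'notes' resource."""
    url = f"{BASE_URL}/{note_id}" if note_id else BASE_URL
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Typical mapping: POST creates, GET reads, PUT replaces, DELETE removes.
create = rest_request("POST", payload={"text": "buy milk"})
read = rest_request("GET", "42")
replace = rest_request("PUT", "42", {"text": "buy bread"})
remove = rest_request("DELETE", "42")
```

Each of these objects could be passed to urllib.request.urlopen to perform the call against a real service.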


Proxy servers


Instead of connecting directly to the web server, the browser may connect
to a proxy server.
What for?
Limiting the traffic to remote web pages: web content is
stored in the proxy cache.
Controlling access to web resources.

Note: "proxy server" usually means a WWW proxy, but other services also
have proxy servers (ARP, DNS, DHCP, whois: programming task!).


Proxy server

How it works:
It usually listens on port 8080.
If its cache does not contain the requested page, or if the cached
copy is outdated, then the proxy
connects to the web server hosting the page and fetches it,
stores the reply in the cache.

The proxy returns the answer to the client.
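
The caching logic described above fits in a few lines. In this sketch the class name CachingProxy and the two callables it receives are assumptions: fetch stands for downloading a page from the origin server, and is_fresh stands for whatever freshness check the proxy applies to a cached copy.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy model of the proxy's cache lookup; networking is left to the 'fetch' callable."""

    def __init__(self, fetch: Callable[[str], bytes], is_fresh: Callable[[str], bool]):
        self.fetch = fetch
        self.is_fresh = is_fresh
        self.cache: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        # Contact the origin server only on a cache miss or when the copy is outdated.
        if url not in self.cache or not self.is_fresh(url):
            self.cache[url] = self.fetch(url)
        # In every case the answer returned to the client comes from the cache.
        return self.cache[url]
```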


Proxy server, cont.


How the proxy checks whether the page in its cache is up to date:
The WWW server sets the Expires: field in the reply header;
after this date, the proxy evicts the page from the cache.
The WWW server may set the field Pragma: no-cache and/or
Cache-Control: no-cache;
such a page will not be stored in the proxy cache at all.
The client may set these fields in the HTTP request;
the proxy will then ignore the contents of its cache.
In the remaining cases: a heuristic based on the Last-Modified:
field.
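
These checks can be sketched as two small functions. Several things here are assumptions and not taken from the lecture: header names are expected in lower case, dates are in the usual HTTP format with a GMT zone, and the Last-Modified heuristic treats a copy as fresh for 10% of the page's age at the time it was stored, which is one commonly used rule of thumb.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def may_store(headers: dict) -> bool:
    """A reply marked no-cache is not stored in the proxy cache at all."""
    pragma = headers.get("pragma", "").lower()
    cache_control = headers.get("cache-control", "").lower()
    return "no-cache" not in pragma and "no-cache" not in cache_control


def is_fresh(headers: dict, stored_at: datetime, now: datetime) -> bool:
    """Decide whether a cached reply may still be served at time 'now'."""
    if "expires" in headers:
        return now < parsedate_to_datetime(headers["expires"])
    if "last-modified" in headers:
        # Assumed heuristic: fresh for 10% of the page's age when it was stored.
        age_when_stored = stored_at - parsedate_to_datetime(headers["last-modified"])
        return now - stored_at < 0.1 * age_when_stored
    return False
```

With this heuristic, a page last modified 24 hours before it was cached is served from the cache for 2.4 hours.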


Anonymous proxy servers

A normal proxy server adds its own fields to our HTTP request, e.g.,
X-Forwarded-For: (our IP address)
Via: (proxy IP address)
There are anonymous proxy servers, which do not add these headers.
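
The difference between the two kinds of proxy comes down to which header fields are added before the request is forwarded. In the sketch below the function name, the anonymous flag and the example addresses are illustrative; the "1.1 " prefix in Via is the protocol version that normally precedes the proxy's name or address in that field.

```python
def forward_headers(headers: dict, client_ip: str, proxy_ip: str,
                    anonymous: bool = False) -> dict:
    """Return the header fields the proxy sends on to the web server."""
    forwarded = dict(headers)  # leave the client's original headers untouched
    if not anonymous:
        forwarded["X-Forwarded-For"] = client_ip
        forwarded["Via"] = f"1.1 {proxy_ip}"
    return forwarded
```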
