Birth of the World Wide Web

Prepared By: Syed Feroz Zainvi
Available At: http://www.zainvi.tophonors.com
E-mail: zainvi.sf@gmail.com

Historical Perspective

S.No.  Year          Activity
1.     1989          Tim Berners-Lee at CERN proposed a hypertext-based
                     information management system. Robert Cailliau, along
                     with Tim, reformatted the proposal as the World Wide
                     Web (WWW).
2.     1990          Berners-Lee implemented a server and a command-line
                     browser using the initial version of HTTP.
3.     1991          CERN made this software available for anonymous FTP
                     download.
4.     1993          50 different sites were running HTTP servers; this
                     grew to 200 within 6 months.
5.     1993 onwards  HTTP being an open specification, people started
                     writing their own browser and server software,
                     including GUI-based browsers that supported
                     typographic controls and display of images.

Building Blocks of the Web

Tim devised the following as essential components of web technology.

1. A markup language for formatting hypertext documents. HyperText Markup Language (HTML) allows cross-referencing of documents via hyperlinks.
2. A uniform notation scheme for addressing web-accessible resources over the network. Such a scheme is called a Uniform Resource Identifier (URI) or Uniform Resource Locator (URL).
3. A protocol for transporting messages over the network, e.g. HyperText Transfer Protocol (HTTP).

The Uniform Resource Locator

A flexible and extensible scheme to support other protocols besides HTTP.

Difference between URL, URN and URI

• URL utilizes ‘locator’ information that embeds both a server address and a file location.
• URN utilizes a simpler human-readable name that does not change even when the resource is moved to another location. URN failed to materialize as a globally supported web notation. So, practically, URL is used.
• URI: W3C defined it as the union of URL & URN. URI is formally more correct.
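A URL, as noted above, embeds both a server address and a file location, together with the protocol to use. As a rough illustration only, the sketch below splits a sample URL into those parts with Python's standard urllib.parse module. The choice of Python and the placeholder host www.mywebsite.com are assumptions made for this example.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative URL on a placeholder host (an assumption, not a real site).
url = "http://www.mywebsite.com/sj/test;id=8097?name=sviergn&x=true#stuff"

parts = urlparse(url)

print("scheme :", parts.scheme)            # protocol to use
print("host   :", parts.hostname)          # server address
print("port   :", parts.port or 80)        # no explicit port, so assume HTTP's default
print("path   :", parts.path)              # file location on the server
print("params :", parts.params)            # text after ';' in the last path segment
print("query  :", parse_qs(parts.query))   # name=value pairs separated by '&'
print("anchor :", parts.fragment)          # text after '#'
```

Note that urlparse reports a missing port as None, so the fallback to 80 is something this sketch adds for HTTP; other schemes have other defaults.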
Generalized Notation for URL

scheme://host[:port#]/path[;url-params][?query-string][#anchor]

→ scheme – underlying protocol to be used, e.g. HTTP or FTP.
→ host – name or IP address of the web server being accessed.
→ port# – port number that the target web server listens on. The default port for an HTTP server is 80.
→ path – file system path from the ‘root’ directory of the server to the desired document. In practice, a web server may make use of aliasing to point to documents, gateways & services that are not explicitly accessible from the server’s root directory.
→ url-params – used for session identifiers in web servers supporting the Java Servlet API.
→ query-string – produced as a result of user-entered variables in HTML forms. ‘=’ separates a parameter from its value, and ‘&’ marks the boundary between parameter-value pairs.
→ anchor – reference to a positional marker within the requested document, like a bookmark. If present, it follows a hash mark or pound sign (#).

[] – optional parameters
Pay attention to the positions of / ; ? = etc.

e.g. http://www.mywebsite.com/sj/test;id=8097?name=sviergn&x=true#stuff

This notation applies to most protocols, like http, https & ftp.

Fundamentals of HTTP

HTTP is an application-level protocol in the TCP/IP protocol suite, using TCP as the underlying transport protocol for transmitting messages. HTTP is a basic protocol that enables communication between web programs.

Advantage: Simple and widely used
Disadvantage: Stateless and limited functionality

a) HTTP uses the request/response paradigm.
b) The structure of a request or response consists of a group of lines containing message headers, followed by a blank line and then the message body.
c) It is a stateless protocol. An HTTP transaction is a single request and a single response.

HTTP Servers, Browsers and Proxies

→ Web servers are essentially HTTP servers.
→ Web browsers, however, are much more than HTTP clients. Usually, web browsers also have FTP, local file access, e-mail client, netnews, Gopher etc. functionality as well.
→ Proxies:
   → May act as a server, or as a client making requests to a web server on behalf of other clients
   → Enable HTTP transfers across firewalls
   → Support caching of HTTP messages
   → Filter HTTP requests
   → Fill many other roles as well
   → Generally, there are one or more proxies between servers and browsers.

A connection is defined as a virtual circuit that is composed of HTTP agents, including browsers, servers and intermediate proxies participating in the exchange.

Stateless Protocol

When a protocol supports ‘state’, this means that it provides for the interaction between client and server to contain a sequence of commands. The server is required to maintain the state of the connection throughout the transmission of successive commands, until the connection is terminated. The sequence of transmitted & executed commands is often called a session.

HTTP is stateless for simplicity, but this imposes limitations on the capabilities of web applications:

The lifetime of a connection is a single request/response exchange.

There is no way to maintain persistent information about a ‘session’ of successive interactions between a client and server.

There is no way to batch requests together, e.g. to ask a web server for an HTML page & all the images it references during the course of the connection.

Cookies can be used for maintaining state in web applications.

HTTP/1.1 supports connections that outlive a single request/response exchange. It assumes that a connection remains in place until it is broken, or until an HTTP client requests that it be broken. But it does so for the sake of efficiency and not state support.

Structure of HTTP Messages

METHOD /path-to-resource HTTP/version-number   ←--- REQUEST LINE
Header-Name1:value
Header-Name2:value
<CR><LF>
[optional request body]

METHOD – GET, POST, …
path-to-resource – path portion of the requested URL
version-number – HTTP version used by the client

e.g. for the URL http://www.mywebsite.com/sj/index.html the HTTP request message will be

GET /sj/index.html HTTP/1.1
Host: www.mywebsite.com
<CR><LF>

In the case of a GET request, there is no body.

The response message has a similar structure:

HTTP/version-number status-code message (human-readable)
Header-Name1:value
Header-Name2:value
<CR><LF>
[response body]

e.g.
HTTP/1.1 200 OK
Content-Type:text/html
Content-Length:9934
……….
……….
<HTML>
…..
…..
</HTML>
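The message layout above (a first line, header lines, a blank line, then an optional body) can be sketched in a few lines of Python. This is a minimal illustration, not a full HTTP implementation: the helper names build_request and parse_response are hypothetical, and the response text is a shortened, made-up stand-in for a real server reply.

```python
CRLF = "\r\n"

def build_request(method, path, headers, body=""):
    """Assemble a request: request line, headers, blank line, optional body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF + body

def parse_response(raw):
    """Split a response into (version, status code, message, headers, body)."""
    head, _, body = raw.partition(CRLF + CRLF)     # blank line ends the headers
    status_line, *header_lines = head.split(CRLF)
    version, code, message = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(code), message, headers, body

# The GET request for /sj/index.html on the placeholder host.
request = build_request("GET", "/sj/index.html", {"Host": "www.mywebsite.com"})
print(repr(request))

# Hypothetical response, shortened for illustration.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<HTML></HTML>"
)
version, code, message, headers, body = parse_response(raw_response)
print(version, code, message)
print(headers)
print(body)
```

The sketch keeps header names exactly as received; a real client would compare them case-insensitively and would read the body according to Content-Length rather than taking everything after the blank line.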
Request/response transmission is not that simple. Complex negotiations occur between browsers and servers to determine what information should be sent. For instance, HTML pages may contain references to other accessible resources, such as images and applets. Clients that support rendering of images and applets must parse the retrieved HTML page to determine what additional resources are needed and then send HTTP requests to retrieve those additional resources.
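The parsing step described above can be sketched with Python's standard html.parser module. The class name ResourceCollector, the page content and the URLs are all hypothetical; the sketch only looks at the src attribute of <img> and <script> tags, whereas a real browser handles many more cases, applets included.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ResourceCollector(HTMLParser):
    """Collect URLs of embedded resources (only <img> and <script> here)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag in ("img", "script"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative references against the page's own URL.
                self.resources.append(urljoin(self.base_url, src))

# Hypothetical page, assumed to have been retrieved from page_url.
page_url = "http://www.mywebsite.com/sj/index.html"
page_html = """
<HTML><BODY>
<IMG SRC="images/logo.gif">
<IMG SRC="/banner.jpg">
<P>Hello</P>
</BODY></HTML>
"""

collector = ResourceCollector(page_url)
collector.feed(page_html)
for resource in collector.resources:
    # Each of these would need its own HTTP request.
    print("GET", resource)
```

Each collected URL corresponds to one further request/response exchange, which is why a single page view usually involves several HTTP transactions.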
Server-browser interactions are much more complex for advanced applications.
To be continued…
References
Web Application Architecture – Principles, Protocols, Practices by Leon Shklar and Richard Rosen, John Wiley & Sons Ltd.