Web Technologies
Basic Web Concepts

1.1 Introduction

To learn internet programming, it is essential to understand how web pages and web servers interact. The Hyper Text Transfer Protocol (HTTP) is what lets a web client (the browser of a user who browses the web) communicate with a web server. HTTP is a request and response protocol between client and server, and it can transfer data in any format, such as text, graphics or images. Hyper Text Markup Language (HTML) is a language used for describing the text-based information on a web page. Thus one can design one's own web page using a simple markup language like HTML.

This chapter discusses these web-related concepts. We first look at the URL, then discuss MIME and CGI, and end with SGML.

1.2 URL

The Uniform Resource Locator (URL) is the unique address of a file that is to be accessed over the internet. When we want to access a web site, we enter its URL in the address bar of the web browser. For example, to access www.google.com we specify its URL as shown below:

http://www.google.com

Here http is the name of the protocol and www.google.com is the domain name of the server.

A URL need not refer to a web site as a whole; any other file, such as a text file, an image file or an HTML file, can also be specified. The URL contains the name of the protocol, such as http://. It may also name another protocol, such as ftp. For example:

ftp://ftp.funet.fi/pub/standards/RFC/rfc2166.txt

The protocol identifier and the resource name are separated by a colon and two forward slashes. The syntax of a URL is as given below:

protocol://username@hostname/path/filename

Sometimes an IP address is used instead of a domain name. For example:

http://192.168.0.1

However, using an IP address in a URL is not preferred, because people cannot remember numbers easily but can remember names easily.
1.3 MIME

On the Internet, data is sent in the form of bytes. The receiving software collects these bytes and arranges the data in the proper order. But from the bytes alone it is not clear whether they are text, a picture or a movie. So for a user at the receiving end the question arises: "How is it possible to know what the data is?" Suppose instead that a few additional bytes are sent along with the actual data, telling what the data is. Then the receiver can easily determine the type of the data or message.

This is what MIME does! It tells what is in a message so that the message contents can be used in an appropriate way.

Multipurpose Internet Mail Extensions (MIME) is an open standard. It specifies how messages must be formatted so that they can be exchanged between different e-mail systems. MIME is a very flexible format, permitting us to include virtually any type of file or document in an e-mail message. MIME messages can contain text, images, audio, video, or other application-specific data. MIME supports more than 100 different types of content.

The content is described at two levels:

Type: denotes whether the content is text, image or movie.
Subtype: denotes whether the content is a doc, rtf or html file, or whether it is a tiff, jpg or gif image.

For example, the content of a JPEG image is specified as image/jpeg. This means the type of the content is image and the subtype is jpeg.

The following table lists some of the content types used by MIME:

Content type       Description
text/html          HTML text as on the World Wide Web
image/gif          A common image in gif format
audio/midi         MIDI music format for synthesizers
application/pdf    Indicates a PDF document

Example of MIME

    HTTP/1.0 200 OK
    Date: Thu, 10 Jan 2008 00:04:30 GMT
    Server: Apache/1.1.0
    Content-type: image/jpeg
    Set-Cookie: Apache=localhost349860630670983; path=/
    Content-length: 32719
    Last-modified: Fri, 14 Dec 2007 18:45:59 GMT

Note that the content type here is image and the subtype is jpeg, so the data that follows is a JPEG image.

1.4 CGI

* Common Gateway Interface (CGI) is a simple protocol that can be used to communicate between web forms and your program. Using CGI programs, a web server can obtain data from (or send data to) databases, documents and other programs, and present that data to viewers via the web. More simply, a CGI program is a program intended to be run on the web.
* A CGI script can be written in any language that can read STDIN, write to STDOUT and read environment variables. These languages can be C, Perl, or even shell scripting.
* There are times when you might want some of your web pages to be dynamic or interactive for the site's users. For example, you may want a counter saying "You are visitor no: ###", or an opinion poll that collects user input and reports the responses to the questions asked. The best way to obtain this is by using CGI scripts!
* A CGI script can be run on any platform, but generally it is run on the UNIX platform. Other commonly used platforms are Windows XP and Mac OS.

How to write CGI scripts?

* Before writing CGI scripts it is necessary to install a web server on which the scripts can be executed. Apache is an open source web server and can be downloaded without any cost.
* The CGI script can be written in C or in Perl. It should be placed in the cgi-bin directory of your web server so that the server can execute it.
* The extension of these CGI script files is .cgi or .pl

Here is a simple example of a CGI program:

    #include <stdio.h>

    int main(void)
    {
        /* CGI header: the content type, then a blank line ends the headers */
        printf("Content-type: text/html\n\n");
        /* Body of the response, sent to the browser as HTML */
        printf("<html>\n<body>\nIt's my First CGI Program!\n</body>\n</html>\n");
        return 0;
    }

The output on the web browser will be:

It's my First CGI Program!

Input with CGI usually (but not always) requires environment variables, which can be obtained with a call to getenv() for each environment variable you want. The list of environment variables is shown below:

SERVER_SOFTWARE    The name and version of the server software you are running on.
SERVER_NAME        The server's hostname, DNS alias, or IP address.
SERVER_PROTOCOL    The name and revision of the information protocol this request came in with.
SERVER_PORT        The port the server listens on for connections, usually 80.
REQUEST_METHOD     The method with which the request was made. For the HTTP protocol, this is "GET", "HEAD" or "POST".
PATH_INFO          Scripts can be accessed by their virtual pathname, followed by extra information at the end of this path. The extra information is passed in PATH_INFO.
PATH_TRANSLATED    The server provides a translated version of PATH_INFO, which takes the path and applies any virtual-to-physical mapping to it. The result is stored in this environment variable.
SCRIPT_NAME        A virtual path to the script being executed.
QUERY_STRING       Any information following a ? in the URL which referred to this script.
REMOTE_HOST        The name of the remote host, that is, the host of the person calling the script.
REMOTE_ADDR        The IP address of the remote host making the request.
AUTH_TYPE          If the server supports authentication and the script is protected, this is the protocol-specific method used to validate the user.
REMOTE_USER        If the server supports authentication and the script is protected, this is the username or user ID.
CONTENT_TYPE       For requests which have attached information, such as HTTP POST and PUT, this is the content type of the data. For HTML form submissions it is usually application/x-www-form-urlencoded.
CONTENT_LENGTH     Size, in bytes, of the input given by the user.

1.5 Introduction to SGML

The Standard Generalized Markup Language (SGML) is a language for defining markup languages. SGML is a descendant of IBM's Generalized Markup Language (GML), developed in the 1960s by Charles Goldfarb, Edward Mosher and Raymond Lorie. Angle brackets are used to delimit the tags in SGML.

Applications of SGML

SGML was widely used to share documents in large projects. It is used in the printing and publishing industry. Due to its complex nature it is rarely used for small and general purpose applications.

One of the major applications of SGML is the Oxford English Dictionary (OED), which was and still is produced using an SGML-like markup.

Using SGML the following things can be done:

* Assemble a single document from many sources (such as SGML fragments, word processor files, database queries, graphics, video clips, and real-time data from sensing instruments).
* Define a document structure using a special grammar called a Document Type Definition (DTD).
* Validate that the document follows the structure that has been defined in the DTD.

Difference between SGML and HTML

HTML is an SGML application. Most HTML browsers do not support some basic SGML constructions, but all SGML authoring tools can produce HTML documents.

Review Questions

1. What do you mean by URL? Explain the various components of a URL.
2. What is MIME? Explain the purpose of MIME along with suitable examples.
3. What is CGI? Explain with an example.
4. Enlist the various environment variables that are associated with CGI.
5. What are the applications of SGML?
6. Differentiate between SGML and HTML.
