
WEB MINING

ABSTRACT
Web mining can be broadly defined as the discovery and analysis of useful information from the World Wide Web. In this paper, we present the analysis targets of web mining and its uses.

INTRODUCTION
With the explosive growth of information sources available on the World Wide Web, it has become necessary to create server-side and client-side intelligent systems that can effectively mine for knowledge. A major challenge for e-tailers is to identify and understand their new customer base. E-tailers need to learn as much as possible about the behaviour, individual tastes and preferences of the visitors to their sites. Web mining makes this possible.

WEB-MINING
Web mining is the application of data mining techniques to discover patterns from the Web. According to the analysis target, web mining can be divided into three types: Web usage mining, Web content mining and Web structure mining.

WEB USAGE MINING


Web usage mining is the process of extracting useful information from server logs, i.e. the users' history. Its aim is to find out what users are looking for on the Internet. Some users might be looking only at textual data, whereas others might be interested in multimedia data.
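As an illustration of mining server logs, the following minimal sketch counts successful requests per page and groups the requested pages by visitor. It assumes the logs are in the Common Log Format; the sample lines, the function names and the choice to count only 2xx responses are hypothetical and are not taken from any particular system.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_log(lines):
    """Yield one dict per line that matches the pattern; skip the rest."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            yield match.groupdict()

def page_popularity(lines):
    """Count successful (2xx) requests per path."""
    counts = Counter()
    for entry in parse_log(lines):
        if entry["status"].startswith("2"):
            counts[entry["path"]] += 1
    return counts

def paths_by_visitor(lines):
    """Group requested paths by client host, in log order."""
    history = {}
    for entry in parse_log(lines):
        history.setdefault(entry["host"], []).append(entry["path"])
    return history

# Hypothetical sample data for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.1 - - [01/Jan/2024:10:00:05 +0000] "GET /video/intro.mp4 HTTP/1.1" 200 90210',
    '10.0.0.2 - - [01/Jan/2024:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [01/Jan/2024:10:01:09 +0000] "GET /missing.html HTTP/1.1" 404 0',
]

if __name__ == "__main__":
    print(page_popularity(SAMPLE_LOG).most_common())
    print(paths_by_visitor(SAMPLE_LOG))
```

In this sample, the first visitor requests a text page and then a video, which is the kind of distinction between textual and multimedia interest described above.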

WEB CONTENT MINING

Web content mining is the process of discovering information from the text, image, audio or video data in web pages. It is sometimes called web text mining, because text content is the most widely researched area. The technologies normally used in web content mining are natural language processing (NLP) and information retrieval (IR).

Although data mining is a relatively new term, the technology is not. Companies have for years used powerful computers to sift through volumes of supermarket scanner data and to analyze market research reports. However, continuous innovations in computer processing power, disk storage and statistical software are dramatically increasing the accuracy of analysis while driving down its cost.
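For the text side of web content mining, a very simple information-retrieval step is to extract the visible text of a page and count its most frequent terms. The sketch below does this with Python's standard library only. The stop-word list, the sample page and the function names are assumptions made for illustration; a real system would use a fuller NLP pipeline.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)

def top_terms(html, n=3):
    """Return the n most frequent non-stop-word terms in the page text."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    words = re.findall(r"[a-z]+", " ".join(extractor.chunks).lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Hypothetical sample page for illustration only.
SAMPLE_PAGE = """
<html><head><title>Web mining</title><style>p {color: red}</style></head>
<body><p>Web mining is the mining of web data.</p>
<script>var mining = 1;</script></body></html>
"""

if __name__ == "__main__":
    print(top_terms(SAMPLE_PAGE))
```

Raw term counts are only a starting point; they show how a page's text can be turned into features that later classification or clustering steps could use.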

WEB STRUCTURE MINING


Web structure mining is the application of data mining to reconstruct the structure of a website or group of sites. It uses graph theory to analyze the node and connection structure of a web site. According to the type of web structural data, web structure mining can be divided into two kinds:

1. Extracting patterns from hyperlinks in the web: a hyperlink is a structural component that connects a web page to a different location.
2. Mining the document structure: analysis of the tree-like structure of pages to describe HTML or XML tag usage.
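Both kinds of structural data can be read from a single page in one pass. The following minimal sketch, which assumes well-formed HTML input and uses hypothetical class and function names, collects the hyperlinks of a page (the first kind) and counts how often each tag is used (the second kind).

```python
from collections import Counter
from html.parser import HTMLParser

class StructureParser(HTMLParser):
    """Record hyperlink targets and per-tag usage counts for one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.tag_counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def analyze_structure(html):
    """Return (list of link targets, Counter of tag usage) for the page."""
    parser = StructureParser()
    parser.feed(html)
    parser.close()
    return parser.links, parser.tag_counts

# Hypothetical sample page for illustration only.
SAMPLE_HTML = (
    "<html><body><ul>"
    '<li><a href="/products">Products</a></li>'
    '<li><a href="https://example.org/">Partner</a></li>'
    "</ul><p>No link here.</p></body></html>"
)

if __name__ == "__main__":
    links, tags = analyze_structure(SAMPLE_HTML)
    print(links)
    print(tags.most_common())
```

The extracted links are the edges leaving this page in the site graph, while the tag counts give a rough summary of the document tree.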

WHERE DID WEB DATA MINING COME FROM?


Web data mining has grown out of the large volumes of data freely available on the web. Prior to data mining becoming a stand-alone task, business analysts and statisticians extracted and analyzed datasets. However, the large volume and technical nature of the data necessitated the creation of data mining tools designed specifically for web data mining.

TYPES
Web data mining activities fall into three distinct areas: content mining, usage mining and structure mining. Content mining identifies and categorizes useful documents that contain specified words or phrases, as well as multimedia items such as images, graphics, video and audio, along with databases and tables. Web usage mining analyzes server logs, site registration forms and other user information to learn about visitors' behavior once they arrive at a specific website. Web structure mining attempts to find the relationships between websites. Searches retrieve information from the incoming and outgoing links at each website to reveal patterns, popularity, and similar or dissimilar keywords, content or themes.
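To illustrate how incoming and outgoing links can reveal popularity, the sketch below counts the links of each site in a small graph and then ranks the sites with a basic PageRank-style power iteration. PageRank is used here only as one well-known example of link analysis and is not prescribed by this paper; the site names, the damping factor of 0.85 and the iteration count are assumptions.

```python
def link_counts(graph):
    """Return (incoming, outgoing) link counts per site."""
    outgoing = {site: len(targets) for site, targets in graph.items()}
    incoming = {site: 0 for site in graph}
    for targets in graph.values():
        for target in targets:
            incoming[target] = incoming.get(target, 0) + 1
    return incoming, outgoing

def pagerank(graph, damping=0.85, iterations=50):
    """Basic power-iteration PageRank over a dict of site -> linked sites."""
    sites = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(sites)
    rank = {site: 1.0 / n for site in sites}
    for _ in range(iterations):
        # Rank held by sites without outgoing links is spread evenly.
        dangling = sum(rank[s] for s in sites if not graph.get(s))
        new_rank = {
            s: (1 - damping) / n + damping * dangling / n for s in sites
        }
        for source, targets in graph.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank

# Hypothetical link graph for illustration only.
SAMPLE_GRAPH = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}

if __name__ == "__main__":
    incoming, outgoing = link_counts(SAMPLE_GRAPH)
    print(incoming, outgoing)
    ranks = pagerank(SAMPLE_GRAPH)
    for site in sorted(ranks, key=ranks.get, reverse=True):
        print(site, round(ranks[site], 3))
```

Plain incoming-link counts treat every link equally, whereas the iterative ranking gives more weight to links that come from sites which are themselves well linked.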

USES OF WEB MINING


This technology has enabled e-commerce companies to do personalized marketing, which eventually results in higher trade volumes. The predictive capability of mining applications can benefit society by identifying criminal activities. Companies can establish better customer relationships by giving customers exactly what they need. They can understand customer needs better and react to them faster. They can find, attract and retain customers, and they can save on production costs by using the insight they have acquired into customer requirements. They can increase profitability through target pricing based on the profiles created. They can even identify a customer who might defect to a competitor; the company can then try to retain that customer with targeted promotional offers, thus reducing the risk of losing the customer.

PROS
Web mining has many advantages, which make the technology attractive to corporations and government agencies alike. Beyond the commercial uses described above (personalized marketing, better customer relationships, customer retention, lower production costs and profile-based target pricing), government agencies are using this technology to classify threats and to fight terrorism, and its predictive capability can benefit society by identifying criminal activities.

CONS
Web mining itself does not create issues, but the technology might cause concern when it is used on data of a personal nature. The most criticized ethical issue involving web mining is the invasion of privacy. Privacy is considered lost when information concerning an individual is obtained, used or disseminated, especially if this occurs without their knowledge or consent. The obtained data is analyzed and clustered to form profiles; it is made anonymous before clustering so that there are no personal profiles. These applications thus de-individualize users by judging them by their mouse clicks. De-individualization can be defined as a tendency to judge and treat people on the basis of group characteristics instead of their own individual characteristics and merits.

Another important concern is that companies collecting data for a specific purpose might use it for a totally different purpose, which violates the users' interests. The growing trend of selling personal data as a commodity encourages website owners to trade personal data obtained from their sites. This trend has increased the amount of data being captured and traded, increasing the likelihood of one's privacy being invaded. The companies that buy the data are obliged to make it anonymous, and they are considered the authors of any specific release of mining patterns. They are legally responsible for the contents of the release, and any inaccuracies in it can result in serious lawsuits, but there is no law preventing them from trading the data.

Mining algorithms may also rely on controversial attributes such as race or religion to categorize individuals. The applications make it hard to identify the use of such attributes, and there is no strong rule against using algorithms with them. This process could result in the denial of a service or a privilege to an individual on the basis of race or religion. At present, this situation can be avoided only through high ethical standards maintained by the data mining company.

The collected data is made anonymous so that neither the data nor the obtained patterns can be traced back to an individual. It might look as if this poses no threat to one's privacy, but in fact much additional information can be inferred by combining two separate pieces of data about the same user.

WEB-MINING TAXONOMY
[Figure: web-mining taxonomy (not reproduced)]

CONCLUSION
Many companies want an on-line presence, believing that all they have to do is build a website, sit back and reap the benefits. In most cases this has been a fruitless exercise, and companies will be unable to improve the situation without first gaining a basic understanding of the visitors to their web site. Companies can now optimise their e-business sites for maximum commercial impact and personalise the on-line content of their web site using web mining technology. It is the companies that adopt a web mining strategy now to learn about their customers that will gain the competitive edge in the new digital economy.
