Data Warehousing and Data Mining

Homework I

HOMEWORK I CAP617T DATA WAREHOUSING AND DATA MINING

DOA: 25 Aug 2012

DOS: 08 Sep 2012

Part A

1. Why is data mining very important, and how can an organisation achieve competitive growth by using data mining?

Ans Importance of data mining: Data mining has great importance in today's highly competitive business environment. The concept of business intelligence (BI) data mining has evolved and is now widely used by leading corporate houses to stay ahead of their competitors. BI helps provide up-to-date information and is used for competition analysis, market research, economic trends, consumer behavior, industry research, geographical information analysis and so on. In this way, business intelligence data mining supports decision-making.

How an organisation can achieve competitive growth by using data mining: Data mining is used in many applications, such as consumer research and marketing, product analysis, demand and supply analysis, e-commerce, investment trends in stocks and real estate, telecommunications and so on. It relies on mathematical algorithms and analytical skills to derive the desired results from huge database collections. Data mining applications are widely used in direct marketing, the health industry, e-commerce, customer relationship management (CRM), the FMCG industry, the telecommunication industry and the financial sector. Data mining takes various forms, such as text mining, web mining, audio and video data mining, pictorial data mining, relational database mining and social network data mining.

2. In what way does a data warehouse play an important role in the growth of an organisation? Distinguish between a data warehouse and a data mart with an appropriate example.

Ans Role of the data warehouse in the growth of an organisation: Data warehousing is an integral part of any organization. It gives all the employees of an organization easy access to collective information. A data warehouse system is implemented to support decision-making in an organization.
It helps in providing information or data when queries need extensive searching on a larger scale. Today, almost all businesses use data warehousing to acquire information when it is needed without interrupting the operational systems. This makes the data flow more consistent, and users find it easier to retrieve information from the system.

Data warehousing is the process of gathering information from different parts of a business into a centralized database; it can also be defined as the collection of data that employees in an organization use for easy access and smooth working. Since the early 1990s, data warehousing has become an essential part of any organization, a result of the emergence of information technology (IT) and the revolution in information management systems. A data warehouse is one of the must-haves for an organization, because business decisions can be shaped and achieved only with exact and complete data. Managing data in this way also addresses issues of time consumption and labor. Besides, data warehousing proves profitable for business organizations, as it helps them increase their productivity.

A data mart, by contrast, is a smaller subset of a data warehouse that serves a single department or subject area. For example, a company-wide warehouse may hold sales, finance and HR data together, while a sales data mart holds only the sales data needed by the sales team.
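
As a minimal sketch of the idea of a centralized store, the following Python example loads records from two hypothetical departmental sources into one in-memory SQLite table. It then defines a view restricted to a single department, which is roughly how a data mart relates to a warehouse. The table, column and department names are illustrative assumptions, not part of the assignment.

```python
import sqlite3

# Hypothetical departmental sources; in practice these would be separate systems.
sales_rows = [("sales", "2012-08-01", 1200.0), ("sales", "2012-08-02", 800.0)]
finance_rows = [("finance", "2012-08-01", 300.0)]


def build_warehouse(sources):
    """Gather rows from every source into one central table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE warehouse (department TEXT, day TEXT, amount REAL)")
    for rows in sources:
        conn.executemany("INSERT INTO warehouse VALUES (?, ?, ?)", rows)
    # A data mart modelled as a view over the warehouse, limited to one department.
    conn.execute(
        "CREATE VIEW sales_mart AS "
        "SELECT day, amount FROM warehouse WHERE department = 'sales'"
    )
    conn.commit()
    return conn


conn = build_warehouse([sales_rows, finance_rows])
# The warehouse answers organisation-wide queries; the mart answers departmental ones.
warehouse_total = conn.execute("SELECT SUM(amount) FROM warehouse").fetchone()[0]
mart_total = conn.execute("SELECT SUM(amount) FROM sales_mart").fetchone()[0]
print(warehouse_total, mart_total)
```

Modelling the mart as a view keeps a single copy of the data; a real deployment might instead copy the subset into its own database.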

3. Differentiate between data analysis and data mining using appropriate examples.

Ans Data analysis is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision-making. It has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.
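
The four activities named above (inspecting, cleaning, transforming, modeling) can be sketched on a toy data set. The records, the choice of mean-centering as the transformation and the least-squares line as the model are all assumptions made for illustration; real analyses would choose these steps to suit the data.

```python
from statistics import mean

# Hypothetical raw records: (advertising spend, units sold); None marks a missing value.
raw = [(1.0, 12.0), (2.0, 14.0), (None, 9.0), (3.0, 16.0), (4.0, 18.0)]


def inspect(records):
    """Inspecting: count how many records have a missing field."""
    return sum(1 for r in records if None in r)


def clean(records):
    """Cleaning: drop records with missing fields."""
    return [r for r in records if None not in r]


def transform(records):
    """Transforming: centre the spend column on its mean."""
    avg = mean(x for x, _ in records)
    return [(x - avg, y) for x, y in records]


def fit_line(records):
    """Modeling: ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


missing = inspect(raw)
cleaned = clean(raw)
slope, intercept = fit_line(transform(cleaned))
print(missing, slope, intercept)
```

The fitted line can then be used to estimate units sold for a spend value that was not observed, which is the step from describing data towards predicting from it.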

Data mining is a particular data analysis technique that focuses on modeling and knowledge discovery for predictive rather than purely descriptive purposes. Business intelligence covers data analysis that relies heavily on aggregation, focusing on business information. In statistical applications, some people divide data analysis into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data, and CDA on confirming or falsifying existing hypotheses. Predictive analytics focuses on the application of statistical or structural models for predictive forecasting or classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a species of unstructured data. All are varieties of data analysis.

Part B

4. Discuss the various kinds of data used for data mining.

Ans Various kinds of data used for data mining:

Database-oriented data sets and applications:
Relational database, data warehouse, transactional database

Advanced data sets and advanced applications:
Data streams and sensor data
Time-series data, temporal data, sequence data (incl. bio-sequences)
Structure data, graphs, social networks and multi-linked data
Object-relational databases
Heterogeneous databases and legacy databases
Spatial data and spatiotemporal data
Multimedia database
Text databases
The World-Wide Web

5. Comment on any five important data mining applications.

Ans Data mining applications:

Research and surveys. Data mining can be used for product research, surveys, market research and analysis. The information gathered is useful in driving new marketing campaigns and promotions.

Information collection. Through web scraping, it is possible to collect information about investors, investments and funds from related websites and databases.

Customer opinions. Customer views and suggestions play an important role in the way a company operates. This information can readily be found on forums, blogs and other resources where customers freely share their views.

Data scanning. Data that is collected and stored is of little value unless it is scanned. Scanning is important for identifying patterns and similarities contained in the data.

Extraction of information. This is the process of identifying useful patterns in data that can be used in decision-making, since decisions must be based on sound information and facts.

Pre-processing of data. The collected data is usually stored in a data warehouse and needs to be pre-processed. Pre-processing means that data deemed unimportant may be removed manually by data mining experts.
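
To illustrate pre-processing followed by scanning for patterns, here is a minimal Python sketch. It removes noisy records and then counts which pairs of items occur together often. The basket data, the cleaning rules and the threshold are hypothetical; pair counting is only one simple way to scan for patterns, and the cleaning here is automated rather than manual.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; the empty basket and the repeated item are noise.
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "butter"],
    [],
    ["milk", "butter", "milk"],
    ["bread", "butter", "milk"],
]


def preprocess(records):
    """Pre-processing: drop empty records and repeated items within a record."""
    return [sorted(set(r)) for r in records if r]


def frequent_pairs(records, min_count):
    """Scanning: keep item pairs that occur together in at least min_count records."""
    counts = Counter()
    for r in records:
        # Items are sorted, so each pair is always counted under the same key.
        counts.update(combinations(r, 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


clean_baskets = preprocess(baskets)
patterns = frequent_pairs(clean_baskets, min_count=3)
print(patterns)
```

Pairs that pass the threshold are candidate patterns that an analyst could then examine for use in decision-making.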

6. With an appropriate diagram, explain the important steps involved in achieving business intelligence.

Ans Step 1: Understand your data environment:- Before you can deploy BI successfully, you must build a good data model that generates high-quality data. How much mission-critical data is generated on Excel spreadsheets? How can you capture data centrally? Is data from different departments isolated across a variety of SQL Server databases? What is your plan for tagging data consistently?

Step 2: Understand your business environment:- The more you know about how people work in your organization, the better you will be able to build a solution. You want to know who needs business intelligence, what they need, when they need it, what kinds of reports they need, and how they will act upon the intelligence.

Step 3: Determine whether you need a data warehouse:- Businesses that generate a lot of transactions and a lot of structured data, or those that are already using ERP systems, might find that a data warehouse is not only necessary but strategically important as well.

Step 4: Consider your hardware infrastructure:- While most organizations view business intelligence as a software investment, you must also consider hardware. When scoping your BI project, make sure you plan for servers that deliver speed and accuracy in an environment that will be far more I/O-intensive than most other environments. According to the IDC 2011 Digital Universe study, information managed by data centers will grow by a factor of 50 over the next decade. BI solutions will have to manage ever-growing quantities and new types of data, such as Web clickstreams, sensor logs, social media analysis and more. High I/O operations per second are at a premium. In-memory technology, which BI software vendors use to increase the velocity of analysis, is also important. Security is crucial, so use devices with high levels of encryption that don't ruin system performance.

Step 5: Communicate the value and get business buy-in:- Corporate culture and business process issues should not be underestimated. Forrester points to what it calls "untamed processes" in describing some of the challenges of successful BI deployment. These processes are further complicated when they involve external constituents, such as customers, prospects, partners and regulators.

[Diagram: "Business intelligence" at the centre, surrounded by the five steps: Understand your data environment; Understand your business environment; Determine whether you need a data warehouse; Consider your hardware infrastructure; Communicate the value and get business buy-in]
