1. a) Explain the storage models of OLAP.
   b) How do data warehousing and data mining work together?
2. Suppose that the data for analysis includes the attribute age. The age values for the data tuples, in increasing order, are: 13, 16, 16, 23, 23, 25, 25, 25, 25, 30, 30, 30, 30, 35, 35, 35, 40, 40, 45, 45, 45, 70.
   a) How might you determine the outliers in the data?
   b) What other methods are there for data smoothing?
3. List and describe the primitives for the data mining task.
4. Why perform attribute relevance analysis? Explain its various methods.
5. a) How are association rules mined from large databases?
   b) Describe the different classifications of association rule mining.
6. How will you solve a classification problem using decision trees?
7. a) What are the fields in which clustering techniques are used?
   b) What are the major requirements of clustering analysis?
8. Write short notes on:
   i) Discriminating different classes
   ii) Statistical measures in large databases
9. Briefly compare the following concepts, using an example to explain your point(s):
   a) Snowflake schema, fact constellation
   b) Data cleaning, data transformation
10. a) Discuss various issues in data integration.
    b) Explain concept hierarchy generation for categorical data.
11. a) Why is it important to have a data mining query language?
    b) Define schema hierarchies and operation-derived hierarchies.
12. Outline a data cube-based incremental algorithm for mining analytical class comparisons.
13. List and explain the five techniques to improve the efficiency of the Apriori algorithm.
14. What is backpropagation? Explain classification by backpropagation.
15. Why is outlier mining important? Discuss different outlier detection approaches. Briefly discuss any two hierarchical clustering methods with suitable examples.
16. Write short notes on:
    i) Mining spatial databases
    ii) Mining the World Wide Web
17. a) Differentiate between OLAP and OLTP.
    b) Draw and explain the star schema for the data warehouse.
18. What is data compression? How would you compress data using principal component analysis (PCA)?
19. List and describe the various types of concept hierarchies.
20. List the statistical measures for the characterization of data dispersion, and discuss how they can be computed efficiently in large databases.
21. What is divide and conquer? How could it be helpful for the FP-growth method in generating frequent itemsets without candidate generation?
22. Can we get classification rules from decision trees? If so, how? What are the enhancements to the basic decision tree?
23. What are the different types of data used in cluster analysis? Explain each one in brief with an example.
24. Write short notes on:
    i) Data objects
    ii) Sequence data mining
    iii) Mining text databases

25. What are the various issues in data mining? Explain each one in detail.
26. Why preprocess the data? Explain in brief.
27. Write short notes on GUI and DMQL. How would you design a GUI based on DMQL?
28. How is class comparison performed? Can class comparison mining be implemented efficiently using data cube techniques? If yes, explain.
29. Describe an example of a data set for which the Apriori check would actually increase the cost.
30. Explain the various preprocessing steps to improve the accuracy, efficiency, and scalability of the classification or prediction process.
31. a) What are the differences between clustering and nearest neighbor prediction?
    b) Define nominal, ordinal, and ratio-scaled variables.
32. a) What are the various issues relating to the diversity of database types?
    b) Explain how data mining is used in health care analysis.
33. What is constraint-based mining? Briefly discuss the possible constraints in a high-level declarative DMQL and user interface.
34. Describe multidimensional association rule mining from relational databases.
35. What is association mining? Briefly describe the criteria for classifying association rules.
36. What is backpropagation? Describe the backpropagation algorithm.
37. Briefly outline the major steps of decision tree classification.
38. What is the misclassification rate of a classifier? Describe the sensitivity and specificity measures of a classifier.
39. What is density-based clustering? Describe the DBSCAN clustering algorithm.
40. What is cluster analysis? Describe the dissimilarity measures for interval-scaled variables and binary variables.
41. What is multimedia data mining? How can similarity search be performed on multimedia data? Describe the contents of a multimedia data cube.
42. What is spatial data? What is time-series data? Briefly describe time-series and sequence data mining.
43. What is clustering? What is conceptual clustering? Describe the dimensions and measures in a spatial data cube.
44. Briefly explain the classification of association rules with suitable examples.
45. Explain Bayesian classification.
46. What is prediction? With a suitable example, explain how it is carried out.
47. Explain the partitioning methods (k-means and k-medians).
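
Worked sketch for question 2: the age values given there are concrete enough to try out, so the short Python sketch below shows one possible way to approach parts a) and b). It is not a model answer. The choice of the 1.5 x IQR rule for outliers, the "median of each half" quartile convention, the bin depth of 4, and the names `AGES`, `iqr_outliers`, and `smooth_by_bin_means` are all assumptions made for illustration; the question itself does not prescribe them.

```python
from statistics import mean, median

# Age values copied from question 2 (already in increasing order).
AGES = [13, 16, 16, 23, 23, 25, 25, 25, 25, 30, 30,
        30, 30, 35, 35, 35, 40, 40, 45, 45, 45, 70]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves.
    Other quartile conventions can shift the fences slightly.
    """
    s = sorted(values)
    half = len(s) // 2
    q1 = median(s[:half])
    q3 = median(s[-half:])
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in s if v < low or v > high]


def smooth_by_bin_means(values, depth):
    """Equal-frequency binning: replace each value by the mean of its bin.

    The last bin may hold fewer than `depth` values.
    """
    s = sorted(values)
    smoothed = []
    for start in range(0, len(s), depth):
        bin_values = s[start:start + depth]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed


if __name__ == "__main__":
    print("Outliers:", iqr_outliers(AGES))
    print("Smoothed:", smooth_by_bin_means(AGES, depth=4))
```

Under these assumptions Q1 = 25 and Q3 = 40, so the fences are 2.5 and 62.5 and only the value 70 is flagged. Smoothing by bin medians or bin boundaries, regression, and clustering are other smoothing options that part b) could discuss.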