
1. Data mining, the extraction of hidden ......... information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses.
(a) predictive (b) preventive (c) proactive (d) provocative (e) none
Answer: (a) predictive

2. Data mining tools estimate future trends and behaviors, allowing businesses to make knowledge-......... decisions.
(a) driven (b) laden (c) loaded (d) ridden (e) none
Answer: (a) driven

3. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by ......... tools typical of decision support systems.
(a) introspective (b) usual common (c) reminiscent (d) retrospective (e) none
Answer: (d) retrospective

4. If someone told you that he had a good model to predict customer usage, the first thing you might try would be to ask him to apply his model to your customer ........., where you already knew the answer.
(a) base (b) drive (c) file (d) log (e) none
Answer: (a)

5. This is a data management practice that characterizes the content, quality, and structure of your data.
a. data cleaning b. data caving c. data refinement d. data profiling e. none
Answer: d

6. This term is used to define the process of consolidating and managing customer information from all available sources to create a single customer profile:
a. Customer Data Integration b. Customer Management c. data profiling d. data cleaning e. none
Answer: a

7. This type of data (also known as source data) is data that has not been processed for use:
a. staging data b. distilled data c. raw data d. pure data e. none
Answer: c

8. What is the process of removing erroneous, incomplete or old data from a database?
a. data eradication b. data cleansing c. data unloading d. data purging e. none
Answer: b

10. This is any type of process conducted to prepare data for some additional processing procedures:
a. data mining b. data preprocessing c. data staging d. data warehousing e. none
Answer: b

11. In an Internet context, this is the practice of tailoring Web pages to individual users' characteristics or preferences.
a. Web services b. customer-facing c. client/server d. customer valuation e. personalization
Answer: e

12. This is the processing of data analytics about customers and their relationship with the enterprise in order to improve the enterprise's future sales and service and lower cost.
a. distribution management b. database marketing c. customer relationship management d. CRM analytics e. B2C
Answer: d

13. This is a broad category of applications and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions.
a. best practice b. data mart c. business information warehouse d. business intelligence e. business warehouse
Answer: d

14. This is a systematic approach to the gathering, consolidation, and processing of consumer data (both for customers and potential customers) that is maintained in a company's databases.
a. database marketing b. marketing c. sales d. service oriented integration e. none
Answer: a

15. This is the practice of dividing a customer base into groups of individuals that are similar in specific ways relevant to marketing, such as age, gender, interests, spending habits, and so on.
a. customer service chat b. customer managed relationship c. customer life cycle d. customer segmentation e. change management
Answer: d

16. In data mining, this is a technique used to predict future behavior and anticipate the consequences of change.
a. predictive technology b. disaster recovery c. phase change d. Digital Silhouettes e. predictive modeling
Answer: e

17. The characteristics of a data warehouse include:
a. Data is subject oriented. b. Data is time variant. c. Data is not integrated. d. Data is nonvolatile. e. a and b
Answer: e

18. During data warehouse development, keep in mind that your data warehouse should be:
A. Scalable B. Manageable C. Available D. Integrated E. All
Answer: E

19. The data in a data warehouse is generally
a. Unclean data b. Dirty data c. Clean and dirty data d. None of the above e. All of the above
Answer: d

20. Sequence of jobs to load data into the warehouse
A. First load data into fact tables, then aggregates, if any
B. First load data into fact table and then to dimension tables, then aggregates, if any
C. First aggregates, then load data into dimension tables, then fact tables
D. Does not matter if we do not load
E. None of the above
Answer: B

21. Snowflaking means
A. Normalizing the data B. Denormalizing the data C. Same as star schema D. It is a data mart E. None of the above
Answer: A

23. The main organizational justification for implementing a data warehouse is to provide
A. Decision support
B. Cheaper ways of handling transactions
C. Large scale transaction processing
D. Storing large volumes of data
E. None
Answer: A

24. OLTP stands for
A. On Line Transaction Protocol
B. On Line Terminal Processing
C. On Line Transaction Processing
D. On Line Terminal Protocol
E. None
Answer: C

25. OLAP stands for
A. On Line Analytical Processing
B. On Line Abstraction Processing
C. On Line Abstraction Protocol
D. On Line Analytical Protocol
E. None
Answer: A

26. Which of the following statements is true?
A. A data warehouse is valuable only if the organisation has an interest in analyzing historical data.
B. A data warehouse is valuable to those organizations that need to keep an audit trail of their activities.
C. A data warehouse is necessary to all those organizations that are using relational OLTPs.
D. A data warehouse is useful to all organizations that currently use OLTPs.
E. None
Answer: A

27. A data warehouse
A. Must import data from transactional systems whenever significant changes occur in the transactional data.
B. Has to work on live transactional data to provide up-to-date and valid results.
C. Takes regular copies of non transaction data and stores it in a way that is optimized for query and reporting.
D. Takes regular copies of transaction data and does not store.
E. A and B
Answer: E

29. Dimensionality refers to
A. The level of detail that is held in the data warehouse
B. The data that describes the transactions
C. The number of dimension tables
D. The level of data that is in the fact table
E. None
Answer: A

30. Analytical processing
A. The act of summarising data on a regular basis (e.g. month end summaries)
B. The act of using a relational database to produce reports giving data summaries on a regular basis (e.g. monthly)
C. The act of exporting data into a spreadsheet for analysis
D. The act of using software to view the changes over time
Answer: A and B

31. The views in OLTP databases are
a. detailed, flat relational b. summarized, multidimensional c. primitive d. complex queries e. none
Answer: a

32. Multiple data sources are combined in
a) Data cleaning b) Data integration c) Data transforming d) Knowledge representation e) none
Answer: b

34. A data warehouse needs to be
a. Capable of integrating data from a wide variety of sources
b. Subject oriented
c. Time variant
d. Non-volatile
e. All of them
Answer: e

35. Data becomes inconsistent when
a. It contains discrepancies in codes or names
b. It lacks attribute values
c. It contains errors
d. It is integrated
e. none
Answer: a

36. Why is data preprocessing important?
A. Data warehouse needs consistent integration of quality data
B. Quality decisions must be based on quality data
C. No quality data, no quality mining results
D. A, B and C
E. None
Answer: D

37. A star schema is:
(a) A de-normalized arrangement of dimensions and facts along with related measures
(b) Same as snowflake schema
(c) A complex relational join
(d) A normalized arrangement of dimensions and facts along with related measures
(e) none
Answer: a

38. A snowflake schema is:
(a) Similar to a star schema except that the dimensions are denormalized
(b) A very complex slice and dice operation
(c) A specific multidimensional query
(d) Similar to a star schema except that the dimensions are normalized
(e) none
Answer: d

39. Data mining results
(a) Are explicit
(b) Expose interesting associations, clusters and classifications
(c) Are most accurate with small source samples
(d) Verify test or probe business queries
(e) none
Answer: b

40. This type of OLAP operation navigates from less detailed data to more detailed data
a. roll up b. pivot c. drill down d. slice e. none
Answer: c

41. It is the visualization operation of OLAP that rotates the data axes in view in order to provide an alternative presentation of the data
a. slice and dice b. only slice c. pivot d. drill through e. none
Answer: c

42. CRISP-DM and PMML are examples of
a. some standards of data mining query languages b. warehouse software c. data mart operations d. RDBMS tools e. none
Answer: a

43. The concept of ETL comes in this tier in the data warehouse architecture
a. tier 3 b. tier 2 c. tier 1 d. tier 4 e. none
Answer: c

44. These servers support multidimensional views
a. ROLAP servers b. MOLAP servers c. SQL servers d. HOLAP servers e. none
Answer: b

45. The following is an efficient data mining association rule-mining application
a) Market basket analysis b) Nearest neighbor c) Principal component analysis d) Back propagation e) none
Answer: a

46. Market basket analysis is an example for
a) Association b) Clustering c) Classification d) Integration e) none
Answer: a

47. Outliers are
a) Clusters b) Noise c) Data labels d) Bayes classifiers e) none
Answer: b

48. A star schema is:
A. A de-normalized arrangement of dimensions and facts along with related measures
B. Same as snowflake schema
C. A complex relational join
D. A normalized arrangement of dimensions and facts along with related measures
E. none
Answer: A

49. In this type of database, data is recorded at regular intervals
a. Multimedia databases b. Spatial databases c. Time series databases d. Object oriented databases e. none
Answer: c

50. It is the visualization operation of OLAP that rotates the data axes in view in order to provide an alternative presentation of the data
A. slice and dice B. only slice C. pivot D. drill through E. none
Answer: C
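The pivot operation asked about in the question above can be illustrated with a small sketch. This is only an assumption-laden toy: the table is a plain nested dictionary, and the names `pivot`, `sales` and `by_region` are made up for illustration rather than taken from any OLAP product.

```python
def pivot(table):
    """Swap the row and column axes of a 2-D table stored as {row: {col: value}}."""
    pivoted = {}
    for row_label, cells in table.items():
        for col_label, value in cells.items():
            # The old column label becomes the new row label, and vice versa.
            pivoted.setdefault(col_label, {})[row_label] = value
    return pivoted


# Hypothetical sales figures: rows are quarters, columns are regions.
sales = {
    "Q1": {"north": 120, "south": 90},
    "Q2": {"north": 150, "south": 110},
}

# After the pivot, rows are regions and columns are quarters.
by_region = pivot(sales)
print(by_region)
```

The data is unchanged; only the presentation axes are rotated, which is why pivoting twice returns the original layout.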

QUIZ 2

1. These servers support multidimensional views
a. ROLAP servers b. MOLAP servers c. SQL servers d. HOLAP servers e. none
Answer: b

2. Clustering is
A. Unsupervised B. Supervised C. Same as classification D. Both a and b E. none
Answer: A

3. These hierarchical techniques start with all records in one cluster and then split that cluster into smaller units
a. Divisive methods b. Agglomerative methods c. Partition methods d. Density methods e. none
Answer: a

4. Rule based systems consist of
a. Clusters b. Decision trees c. If-then-else rules d. Associations e. none
Answer: c

5. A good clustering method will produce high quality clusters with
a. High inter-class similarity, low intra-class similarity
b. High intra-class similarity, low inter-class similarity
c. High intra- and inter-class similarity
d. Low inter- and intra-class similarity
e. none
Answer: b

6. A --------------- variable is a generalized form of binary variable.
a. Nominal b. Interval scaled c. Ordinal d. Ratio scaled
Answer: a

7. An ordinal variable can be
a. discrete or continuous b. discrete and continuous c. discrete d. continuous e. none
Answer: a

8. -----------------------------: a positive measurement on a multiplicative scale, corresponding to exponential growth
a. nominal variable b. binary variable c. ratio scaled variable d. interval scaled variable e. none
Answer: c

9. Decomposing data objects into several levels of nested partitioning (a tree of clusters) is called a -----------------------------
a. cluster b. Classification c. Tree d. Dendrogram e. none
Answer: d

10. ------------------------------------ are normally used to measure the similarity or dissimilarity between two data objects
a. clustering b. distances c. classification d. k-means e. none
Answer: b

11. An ------------------------ variable can be discrete or continuous
a. ordinal b. categorical c. binary d. interval scaled e. none
Answer: a

12. In K-means clustering, each cluster is represented by the ----------------------------- of the cluster
a. median b. center c. mode d. reference points e. none
Answer: b

13. For predicting stock market ----------------------------------- clustering algorithm is used
a. k-means b. CLARA c. Nearest Neighborhood d. K-modes e. none
Answer: c

14. Retrieval of task relevant data in data mining is called
a) Data selection b) Pattern presentation c) Data integration d) Task data e) none
Answer: a

15. Multiple data sources are combined in
a) Data cleaning b) Data integration c) Data transforming d) Knowledge representation e) none
Answer: b

16. Market basket analysis is an example for
a) Association b) Clustering c) Classification d) Integration e) none
Answer: a

17. The following is an efficient association rule-mining algorithm
a) Apriori b) Nearest neighbor c) Principal component analysis d) Back propagation e) none
Answer: a

18. The following databases contain word descriptions for data objects
a) Time series databases b) Text databases c) Multimedia databases d) Spatial databases e) none
Answer: b

19. Outliers are
a) Clusters b) Noise c) Data labels d) Bayes classifiers e) none
Answer: b

20. Why isn't the data in operational systems appropriate for business analysis?
(a) It is not consistent.
(b) It is too detailed.
(c) It is not optimized for decision support applications and tools.
(d) All of the above.
(e) None
Answer: c

21. Knowledge management, data warehousing and decision support
(a) Are three interrelated information organization, manipulation, delivery and presentation disciplines
(b) Are unrelated components of Information Systems
(c) Are invented by IBM
(d) Are vendor marketing programs with little commercial relevance
(e) none
Answer: a

22. A data mart is
(a) A small data warehouse
(b) A departmental subset of data warehouse
(c) A simple data warehouse
(d) All of the above
(e) none
Answer: b

23. Data mining includes
(a) Analyzing large volumes of data to discover interesting associations or patterns
(b) Querying a large data warehouse to uncover undiscovered facts
(c) Very complex SQL query operations
(d) Slicing and dicing until you uncover interesting details
(e) none
Answer: a

24. Query and reporting tools are most appropriate for
(a) Controlled predictable query environments
(b) Ad hoc reporting requirements
(c) Complex multifaceted business query applications
(d) Discovery mode applications
(e) none
Answer: a

25. Which of the following is the best example of a specific multidimensional query?
(a) How many programmers worked more than 2000 hours last year?
(b) Who are the customers in the northeast?
(c) What is the profit for baby goods, by store, by month? By year?
(d) Why are our best customers shopping at the competition?
(e) none
Answer: c

26. A star schema is:
A. A de-normalized arrangement of dimensions and facts along with related measures
B. Same as snowflake schema
C. A complex relational join
D. A normalized arrangement of dimensions and facts along with related measures
E. none
Answer: A

27. A snowflake schema is:
(a) Similar to a star schema except that the dimensions are denormalised
(b) A very complex slice and dice operation
(c) A specific multidimensional query
(d) Similar to a star schema except that the dimensions are normalized
(e) none
Answer: d

28. Data mining results
(a) Are explicit
(b) Expose interesting associations, clusters and classifications
(c) Are most accurate with small source samples
(d) Verify test or probe business queries
(e) none
Answer: b

29. Enterprise Miner is developed by
A. SGI B. IBM C. SAS D. SPSS E. none
Answer: C

30. SPSS provides the following DM software
A. CLEMENTINE B. DBMINER C. STATISTICAL MINER D. STATMINE E. none
Answer: A

31. In this type of database, data is recorded at regular intervals
a. Multimedia databases b. Spatial databases c. Time series databases d. Object oriented databases e. none
Answer: c

32. Data stored in document databases is usually
A. Structured B. Semi-structured C. Unstructured D. Replicated E. none
Answer: B

33. Time series data can be
A. Irregular B. Seasonal C. Regular D. Non-trendy E. none
Answer: B

34. Classification predicts ---------------- labels while prediction models --------------- functions
a. Continuous, continuous b. Continuous, categorical c. Discrete, discrete d. Categorical, continuous e. none
Answer: d

35. Neural networks are used for
A. Prediction B. Clustering C. Association D. Aggregation E. none
Answer: A

36. These are data dispersion characteristics
A. mean B. median C. mode D. all E. none
Answer: D (all)

37. Binning is one of the data
A. smoothing techniques B. cleaning techniques C. quality techniques D. none E. all
Answer: A

38. Parsing and fuzzy matching techniques come under
A. data scrubbing technique B. data auditing technique C. both a & b D. only a E. none
Answer: A

39. Geographical databases are examples of
a. text databases b. relational databases c. spatial databases d. transactional databases e. none
Answer: c

40. A bin is replaced by the centroid of the bin. This method is called
a. Binning by median b. Binning by means c. Binning by boundary d. Binning by standard deviation e. none
Answer: b
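As a worked illustration of the smoothing by bin means named in the last two binning questions, the following is a minimal sketch. It assumes equal-frequency bins over sorted values; the function name `smooth_by_bin_means` and the sample `prices` list are hypothetical and chosen only for the example.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size items,
    and replace every value by the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        # Each member of the bin is replaced by the bin mean.
        smoothed.extend([mean] * len(bin_values))
    return smoothed


# Hypothetical sample data, three values per bin.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed_prices = smooth_by_bin_means(prices, 3)
print(smoothed_prices)
# [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

Binning by median or by boundary would follow the same loop, replacing the mean with the bin median or with the nearer of the bin's minimum and maximum.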

41. The process of locating relevant documents based on user input, such as keywords or example documents, is called
a. Machine learning b. Web mining c. Information retrieval d. Neural networks e. None
Answer: c

42. Market basket analysis is an example for
A. Association B. Clustering C. Classification D. Integration E. none
Answer: A
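Market basket analysis, treated above as an association technique, is usually described in terms of the support and confidence of a rule. The sketch below computes both for a toy set of baskets; the helper names `support` and `confidence` and the `baskets` data are illustrative assumptions, not part of any particular mining tool.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})          # 2 of 4 baskets
rule_confidence = confidence(baskets, {"bread"}, {"milk"})  # 2 of the 3 bread baskets
print(rule_support, rule_confidence)
```

An algorithm such as Apriori searches for all itemsets whose support exceeds a chosen threshold before rules are scored this way; the sketch only scores one rule that is given in advance.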

43. ---------------------- and --------------------------- are the techniques used for prediction.
A. Classification and nearest neighborhood
B. Clustering and classification
C. Decision trees and clustering
D. Association and clustering
E. none
Answer: A

44. CHAID stands for
A. Chi square automatic interaction detector
B. Chi square animated interaction detector
C. Chi square automatic information detector
D. Chi square automatic interaction device
E. none
Answer: A

45. CART stands for
A. Classification and rules tree
B. Classification and regression trees
C. Clustering and regression trees
D. Classification and roots trees
E. None
Answer: B

46. Rule based systems consist of
A. Clusters B. Decision trees C. If-then-else rules D. Associations E. None
Answer: C

47. Data mining is also called
a. knowledge discovery b. knowledge integration c. data warehousing d. data marting e. none
Answer: a

48. Data mart and data warehouse
a. Are the same b. Data warehouse is a subset of mart c. Mart is a subset of warehouse d. All are true e. None are true
Answer: c

49. OLAP and OLTP are
a. Same b. OLAP is for transactions c. OLTP is for analytics d. All are true e. None is true
Answer: e

50. Customer segmentation is an example of this data mining technique
a. clustering b. classification c. prediction d. neural networks e. none
Answer: a
