Dr. J. J. Magdum College of Engineering, Jaysingpur.
Department of Information Technology
Assignment for BE (IT) Student
Subject: Advanced Database Systems
Assignment no. 2
Subject Teacher: Prof. Nadaf A.P.
Date of Assignment: 22/07/2011
Deadline Date for Submission: 28/07/2011
Date of returning corrected copies: 01/08/2011
For Batch B1
Q1. For each of the three partitioning techniques, namely round-robin, hash partitioning, and range partitioning, give an example of a query for which that partitioning technique would provide the fastest response.
Q2. What form of parallelism (interquery, interoperation, or intraoperation) is likely to be the most important for each of the following tasks?
a. Increasing the throughput of a system with many small queries
b. Increasing the throughput of a system with a few large queries, when the number of disks and processors is large
Q3. Describe the benefits and drawbacks of pipelined parallelism.
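For reference when attempting Q1, the three partitioning techniques it names can be sketched as small functions that map a tuple to one of n disks. This is only an illustrative sketch, not part of the assignment: the function names are hypothetical, `zlib.crc32` stands in for whatever hash function a real system would use, and the range version assumes a sorted partitioning vector of n - 1 boundary values.

```python
from bisect import bisect_right
import zlib


def round_robin_partition(i, n):
    """Send the i-th tuple (in insertion order) to disk i mod n."""
    return i % n


def hash_partition(key, n):
    """Send a tuple to the disk chosen by hashing its partitioning attribute.

    zlib.crc32 is an assumed stand-in hash; any function that maps
    keys into 0..n-1 reasonably uniformly would do.
    """
    return zlib.crc32(str(key).encode("utf-8")) % n


def range_partition(key, vector):
    """Send a tuple to a disk based on a sorted partitioning vector.

    With vector [v0, ..., v(n-2)]: keys below v0 go to disk 0,
    keys with v(i) <= key < v(i+1) go to disk i + 1,
    and keys at or above the last boundary go to disk n - 1.
    """
    return bisect_right(vector, key)
```

With the assumed vector `[5, 11]`, a key of 2 maps to disk 0, a key of 8 to disk 1, and a key of 20 to disk 2.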
For Batch B2
Q4. In a range selection on a range-partitioned attribute, it is possible that only one disk may need to be accessed. Describe the benefits and drawbacks of this property.
Q5. Some parallel database systems store an extra copy of each data item on disks attached to a different processor, to avoid loss of data if one of the processors fails.
a. Why is it a good idea to partition the copies of the data items of a processor across multiple processors?
b. What are the benefits and drawbacks of using RAID storage instead of storing an extra copy of each data item?
Q6. What factors could result in skew when a relation is partitioned on one of its attributes by:
a. Hash partitioning
b. Range partitioning
In each case, what can be done to reduce the skew?
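To make the notion of skew in Q6 concrete, one possible way to quantify it is the ratio of the largest partition to the average partition size. The sketch below is an assumption for illustration only: the helper names and the choice of metric are not taken from the assignment, and `assign` is any function that maps a key to a partition number.

```python
from collections import Counter


def partition_sizes(keys, assign, n):
    """Count how many keys land in each of the n partitions."""
    counts = Counter(assign(k) for k in keys)
    return [counts.get(d, 0) for d in range(n)]


def skew_ratio(sizes):
    """Largest partition size divided by the average partition size.

    A value of 1.0 means the partitions are perfectly balanced;
    larger values mean one partition holds more than its share.
    Empty input data is treated as balanced (assumed convention).
    """
    avg = sum(sizes) / len(sizes)
    return max(sizes) / avg if avg > 0 else 1.0
```

For example, partition sizes of `[1, 3]` give a ratio of 1.5, meaning the busiest partition holds 50% more tuples than the average.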
For Batch B3
Q7. With pipelined parallelism, it is often a good idea to perform several operations in a pipeline on a single processor, even when many processors are available.
a. Explain why.
b. Would the arguments you advanced in part a hold if the machine has a shared-memory architecture? Explain why or why not.
c. Would the arguments in part a hold with independent parallelism? (That is, are there cases where, even if the operations are not pipelined and there are many processors available, it is still a good idea to perform several operations on the same processor?)
Q8. Describe a good way to parallelize each of the following.
a. The difference operation
b. Aggregation by the count operation
c. Aggregation by the count distinct operation
d. Aggregation by the avg operation
e. Left outer join, if the join condition involves only equality
f. Left outer join, if the join condition involves comparisons other than equality
g. Full outer join, if the join condition involves comparisons other than equality
For Batch B4
Q9. Consider join processing using symmetric fragment and replicate with range partitioning. How can you optimize the evaluation if the join condition is of the form |r.A - s.B| ≤ k, where k is a small constant? Here, |x| denotes the absolute value of x. A join with such a join condition is called a band join.
Q10. Give an example of a join that is not a simple equi-join for which partitioned parallelism can be used. What attributes should be used for partitioning?
Q11. Recall that histograms are used for constructing load-balanced range partitions.
a. Suppose you have a histogram where values are between 1 and 100, and are partitioned into 10 ranges, 1-10, 11-20, ..., 91-100, with frequencies 15, 5, 20, 10, 10, 5, 5, 20, 5, and 5, respectively. Give a load-balanced range partitioning function to divide the values into 5 partitions.
b. Write an algorithm for computing a balanced range partition with p partitions, given a histogram of frequency distributions containing n ranges.
---XX---