$$\pi_{name}\Big(\big(\pi_{sid}(S) - \pi_{sid}(C \bowtie \sigma_{color=red \,\lor\, color=green}(P))\big) \bowtie S\Big)$$
(b) Find the names of parts supplied by all suppliers.
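$$\pi_{name}\Big(\big(\pi_{sid,pid}(C) \,/\, \pi_{sid}(S)\big) \bowtie P\Big)$$
or
$$\pi_{name}\Big(\big(\pi_{pid}(C) - \pi_{pid}\big((\pi_{pid}(C) \times \pi_{sid}(S)) - \pi_{pid,sid}(C)\big)\big) \bowtie P\Big)$$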
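The two expressions for (b) should return the same set of parts: the second one spells out division using only projection, product, and difference. The following is a minimal sketch that checks this on a small made-up catalog, using Python sets in place of relations. The sample tuples and the function names are assumptions for illustration and are not part of the exam.

```python
# Hypothetical sample data: C holds (sid, pid) catalog pairs, S holds supplier ids.
C = {(1, 10), (1, 20), (2, 10), (3, 10), (3, 20), (3, 30)}
S = {1, 2, 3}

def parts_by_division(catalog, suppliers):
    """pi_{sid,pid}(C) / pi_sid(S): pids paired with every supplier."""
    pids = {pid for _, pid in catalog}
    return {pid for pid in pids
            if all((sid, pid) in catalog for sid in suppliers)}

def parts_by_difference(catalog, suppliers):
    """pi_pid(C) - pi_pid((pi_pid(C) x pi_sid(S)) - pi_{pid,sid}(C))."""
    pids = {pid for _, pid in catalog}
    all_pairs = {(pid, sid) for pid in pids for sid in suppliers}
    existing = {(pid, sid) for sid, pid in catalog}
    missing = all_pairs - existing              # (pid, sid) combinations that never occur
    disqualified = {pid for pid, _ in missing}  # parts that some supplier does not supply
    return pids - disqualified

by_division = parts_by_division(C, S)
by_difference = parts_by_difference(C, S)
print(by_division, by_difference)
```

In this sample only part 10 is supplied by all three suppliers, so both functions return the same one-element set. The final join with P to fetch the part names is omitted here.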
Problem 5 (15 Points, 7.5 pts. each)
Assume the following SQL schema representing the history of transactions performed by customers in a
supermarket (keys are in bold and underlined):
Product(pid, name, price, mfr) - the product id, name, price and manufacturer of the product
Customer(cid, name, age) - customer cid and his/her name and age
Transaction(cid, pid, datetime) - customer cid purchased product pid on some date & time
Write one SQL statement for each of the following queries.
(a, 7.5 pts.) For each customer who has spent at least double the average amount spent by active
customers (customers that have made at least one purchase), print his/her name, the amount spent by this
customer, as well as the price of the most expensive product this customer bought.
select A.name, A.spent, C.max_spent
from
  (select c.cid, c.name, sum(p.price) as spent
   from product p, customer c, transaction t
   where p.pid=t.pid AND c.cid=t.cid
   group by c.cid, c.name) A,
  (select AVG(S.spent) as avg_spent
   from (select sum(p.price) as spent
         from product p, transaction t
         where p.pid=t.pid
         group by t.cid) S) B,
  (select t.cid, max(p.price) as max_spent
   from product p, transaction t
   where p.pid=t.pid
   group by t.cid) C
where A.cid=C.cid AND A.spent >= B.avg_spent*2
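As a sanity check, the query for (a) can be run against a tiny dataset. The sketch below uses Python's built-in sqlite3 module and an alternative formulation with a common table expression, which computes the per-customer total and maximum in one pass. This is not the formulation given in the solution, only an equivalent one under the stated schema. The sample rows are made up, and the table name is double-quoted because TRANSACTION is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product(pid INTEGER PRIMARY KEY, name TEXT, price REAL, mfr TEXT);
CREATE TABLE Customer(cid INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE "Transaction"(cid INTEGER, pid INTEGER, datetime TEXT,
                           PRIMARY KEY (cid, pid, datetime));
-- Hypothetical sample data; Dee never buys anything, so she is not "active".
INSERT INTO Product VALUES (1,'milk',2.0,'A'), (2,'tv',100.0,'B'), (3,'bread',1.0,'A');
INSERT INTO Customer VALUES (1,'Ann',30), (2,'Bob',40), (3,'Cid',50), (4,'Dee',60);
INSERT INTO "Transaction" VALUES
  (1,2,'2020-01-01 10:00'), (1,2,'2020-01-02 10:00'), (1,1,'2020-01-03 10:00'),
  (2,1,'2020-01-01 11:00'),
  (3,3,'2020-01-01 12:00'), (3,1,'2020-01-02 12:00');
""")

query = """
WITH per_customer AS (
  -- One row per active customer: total spent and most expensive purchase.
  SELECT c.cid, c.name, SUM(p.price) AS spent, MAX(p.price) AS max_price
  FROM Product p, Customer c, "Transaction" t
  WHERE p.pid = t.pid AND c.cid = t.cid
  GROUP BY c.cid, c.name
)
SELECT name, spent, max_price
FROM per_customer
WHERE spent >= 2 * (SELECT AVG(spent) FROM per_customer)
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With these rows the active customers spend 202, 2, and 3, so the average is 69 and only Ann (202 spent, most expensive item 100) meets the threshold of 138.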
(b, 7.5 pts.) For each product that has been sold at least once, print the product name, the total quantity of
sales (the number of times the product has been sold), and the number of customers that bought the
product as well as their average age.
We will do it in steps to show the logic.
Let's first write down this portion of the query: the number of sales and the number of customers per product.
SELECT p.pid, p.name, count(*) AS nsales, COUNT(DISTINCT t.cid) AS ncustomers
FROM Transaction t, Product p
WHERE p.pid =t.pid
GROUP BY p.pid, p.name
We now have to figure out the last part of the query: for each product, find the average age of the customers
that bought this product. It is tempting to take the above SQL, add a join with Customer c, and add
AVG(DISTINCT c.age) or AVG(c.age) to the SELECT clause. However, both are wrong. The first AVG
averages only the distinct c.age values, whereas we want to include every customer's age even if two
customers are the same age. The second AVG counts a customer's age once for every time that customer
bought the product, which is also wrong. The solution is to write a second SQL statement that returns, for
each product, the age of each customer who bought it exactly once (even if several customers share an age):
SELECT t.pid, c.age
FROM Transaction t, Customer c
WHERE t.cid =c.cid
GROUP BY t.pid, c.age, c.cid
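The difference between the three ways of averaging shows up on a small example. The sketch below uses Python's sqlite3 module with made-up rows: two customers aged 30 who each bought the product once, and one customer aged 60 who bought it three times. The table name is double-quoted because TRANSACTION is a keyword in SQLite, and the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer(cid INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE "Transaction"(cid INTEGER, pid INTEGER, datetime TEXT);
-- Hypothetical sample data for a single product (pid = 1).
INSERT INTO Customer VALUES (1,'Ann',30), (2,'Bob',30), (3,'Cid',60);
INSERT INTO "Transaction" VALUES
  (1,1,'2020-01-01'), (2,1,'2020-01-01'),
  (3,1,'2020-01-01'), (3,1,'2020-01-02'), (3,1,'2020-01-03');
""")

join = 'FROM "Transaction" t, Customer c WHERE t.cid = c.cid'

# Tempting but wrong: Ann and Bob share an age, so one of them is dropped.
distinct_avg = conn.execute(
    f"SELECT AVG(DISTINCT c.age) {join} GROUP BY t.pid").fetchone()[0]

# Tempting but wrong: Cid's age is counted once per purchase (three times).
per_sale_avg = conn.execute(
    f"SELECT AVG(c.age) {join} GROUP BY t.pid").fetchone()[0]

# One row per (product, customer) first, then average the ages.
per_customer_avg = conn.execute(f"""
    SELECT AVG(tmp2.age)
    FROM (SELECT t.pid, c.age {join} GROUP BY t.pid, c.age, c.cid) tmp2
    GROUP BY tmp2.pid""").fetchone()[0]

print(distinct_avg, per_sale_avg, per_customer_avg)
```

On this data the distinct-age average is (30 + 60) / 2 = 45, the per-sale average is (30 + 30 + 3 × 60) / 5 = 48, and the per-customer average is (30 + 30 + 60) / 3 = 40, which is the value the question asks for.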
And now, we need to combine these two queries:
SELECT tmp1.name, tmp1.nsales, tmp1.ncustomers, AVG(tmp2.age)
FROM
(SELECT p.pid, p.name, count(*) AS nsales, COUNT(DISTINCT t.cid) AS ncustomers
FROM Transaction t, Product p
WHERE p.pid =t.pid
GROUP BY p.pid, p.name) tmp1,
(SELECT t.pid, c.age
FROM Transaction t, Customer c
WHERE t.cid =c.cid
GROUP BY t.pid, c.age, c.cid) tmp2
WHERE tmp2.pid =tmp1.pid
GROUP BY tmp1.pid, tmp1.name, tmp1.nsales, tmp1.ncustomers