
Tutorial: OLAP cube using Pentaho Business Analytics

This technical tutorial demonstrates how

• an OLAP cube is configured using
◦ measures & dimensions
• a MySQL database is used as data source using
◦ a fact table and dimension tables
• analytic reports are created using
◦ tables & charts

[Figure: schema diagram. The fact table sales_fact_199* (product_id, customer_id, time_id, store_sales, store_costs) is linked to the dimension tables time_by_day (the_date, the_month, the_year), customer (birthdate, gender, total_children, house_owner, num_cars), product (product_name, product_class_id) and product_class (product_subcategory, product_category).]

The relevant tables of the foodmart domain.
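
To make the fact/dimension split concrete, here is a minimal sketch of a cut-down version of this schema. It is an illustration under stated assumptions, not the real foodmart dump: an in-memory SQLite database stands in for MySQL, the fact table is simply called sales_fact, column names follow the figure, and the single sample row (including the product category) is invented.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL foodmart database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member.
CREATE TABLE time_by_day (
    time_id INTEGER PRIMARY KEY, the_date TEXT, the_month TEXT, the_year INTEGER);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY, gender TEXT, total_children INTEGER);
CREATE TABLE product_class (
    product_class_id INTEGER PRIMARY KEY,
    product_subcategory TEXT, product_category TEXT);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    product_class_id INTEGER REFERENCES product_class,
    product_name TEXT);

-- Fact table: foreign keys to the dimensions plus the numeric measures.
CREATE TABLE sales_fact (
    product_id INTEGER REFERENCES product,
    customer_id INTEGER REFERENCES customer,
    time_id INTEGER REFERENCES time_by_day,
    store_sales REAL, store_costs REAL);

-- Invented sample data, for illustration only.
INSERT INTO time_by_day VALUES (1, '1997-01-05', 'January', 1997);
INSERT INTO customer VALUES (10, 'F', 2);
INSERT INTO product_class VALUES (100, 'Waffles', 'Breakfast Foods');
INSERT INTO product VALUES (1000, 100, 'Imagine Waffles');
INSERT INTO sales_fact VALUES (1000, 10, 1, 4.5, 1.5);
""")

# Follow the keys from the fact row out to the dimension attributes.
row = conn.execute("""
SELECT t.the_year, p.product_name, pc.product_category, f.store_sales
FROM sales_fact AS f
JOIN product AS p ON p.product_id = f.product_id
JOIN product_class AS pc ON pc.product_class_id = p.product_class_id
JOIN time_by_day AS t ON t.time_id = f.time_id
""").fetchone()
print(row)
```

Each fact row carries only keys and numbers; everything a report groups or filters by lives in a dimension table and is reached through a join.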

Get the operational data into a MySQL database


1. run xamp-control in c:/xamp to start the MySQL server and the Apache server
   1. start the MySQL server
   2. start the Apache PHP server
   3. restore the foodmart database using localhost/phpmyadmin
      1. import foodmart*.sql.zip from Device Q (the file must be smaller than 2 MB)
2. add the MySQL connector to Pentaho
   1. copy mysqlconnection*.jar from Device Q to pentaho/biserver/tomcat/libs
   2. open Services (German “Dienste”) and restart Pentaho BA so that it picks up the new MySQL connector

Configure the analytic suite


3. set up a database connection
   1. start the Pentaho User Console (use the Start menu or go to localhost:8080/pentaho)
   2. manage datasource | new datasource | database tables
      1. press plus to add a connection and fill in
         1. host name: localhost
         2. database name: foodmart
         3. user name: root
         4. password: <empty>
         5. a connection name (e.g. foodmartmysql)
   3. select the connection foodmartmysql and switch to "requires star schema"
      1. give your datasource a name "foodmart"
      2. select tables
         1. sales_fact_1997
         2. customer
         3. product
         4. store
         5. time_by_day
      3. select sales_fact_1997 as fact table
      4. create joins (every selected table must be joined to the fact table)
         1. with table product using product_id
         2. with table customer using customer_id
         3. with table store using store_id
         4. with table time_by_day using time_id
4. create a cube (customize model now)
   1. delete the default model
   2. set store cost and store sales as measures
   3. add dimensions
      1. time: using the attribute "The date" as element
      2. location: using "store city"
5. you can now create an analysis report using the data source foodmart
   1. create a bar chart of sales by city for "Imagine Waffles" (use the filter)
      1. find out which city has the highest sales for Imagine Waffles
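
What the analysis report computes for the bar chart can be sketched as a plain SQL aggregation: sum the measure store_sales, group by the dimension attribute store_city, and filter on the product. The sketch below is only an approximation of what the analysis engine does internally. It assumes an in-memory SQLite database in place of MySQL, reduced tables, and invented rows, so the resulting numbers say nothing about the real foodmart data.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store (store_id INTEGER PRIMARY KEY, store_city TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE sales_fact_1997 (
    product_id INTEGER, store_id INTEGER, store_sales REAL);

INSERT INTO store VALUES (1, 'Seattle'), (2, 'Portland');
INSERT INTO product VALUES (1, 'Imagine Waffles'), (2, 'Other Product');
INSERT INTO sales_fact_1997 VALUES
    (1, 1, 8.0),
    (1, 2, 10.0),
    (1, 2, 5.0),
    (2, 1, 99.0);
""")

# Measure: SUM(store_sales). Dimension: store_city. Filter: product_name.
sales_by_city = conn.execute("""
SELECT s.store_city, SUM(f.store_sales) AS total_sales
FROM sales_fact_1997 AS f
JOIN store AS s ON s.store_id = f.store_id
JOIN product AS p ON p.product_id = f.product_id
WHERE p.product_name = ?
GROUP BY s.store_city
ORDER BY total_sales DESC
""", ("Imagine Waffles",)).fetchall()

for city, total in sales_by_city:
    print(f"{city}: {total}")
```

The first row of sales_by_city corresponds to the tallest bar in the chart. Running the same query against the restored MySQL database should give the figures to compare with the report.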

Use the cube for analytics


For deeper analysis, modify your cube (edit datasource) to add the following details and
dimensions
• dimension store
◦ using a hierarchy country, state, city, name (in this order, on the same level below the
hierarchy location)
◦ below name, add manager and phone (these are called attributes)
• dimension time
◦ using hierarchy the year, month of year (with property "the month")
• dimension product
◦ using product name
◦ task: in which month was the most beer sold?
▪ hint: use the filter to filter for any product name that contains beer

Tasks
1. create an analysis hierarchy that shows which stores belong to which city and which city
belongs to which state (drill down & roll up)
2. create a heat map of products sold by state and month to find out which state
performs best over the year (is it always the same state?)
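
The heat map in task 2 is a pivot of one measure over two dimensions. The following sketch shows the idea in plain Python, assuming the fact rows have already been joined to the store and time dimensions. The (state, month_of_year, store_sales) tuples are invented for illustration.

```python
from collections import defaultdict

# Invented rows: (store_state, month_of_year, store_sales).
# Several rows can share a cell, e.g. two stores in the same state.
rows = [
    ("CA", 1, 120.0), ("OR", 1, 90.0), ("WA", 1, 150.0),
    ("CA", 2, 160.0), ("OR", 2, 80.0), ("WA", 2, 140.0),
    ("WA", 1, 10.0),
]

# Roll the rows up into one cell per (state, month).
heat = defaultdict(lambda: defaultdict(float))
for state, month, sales in rows:
    heat[state][month] += sales

months = sorted({month for _, month, _ in rows})

# For every month, pick the state with the highest total.
leaders = {m: max(heat, key=lambda s: heat[s][m]) for m in months}

# Text version of the heat map: one line per state, one value per month.
for state in sorted(heat):
    print(state, [heat[state][m] for m in months])
print(leaders)
```

If leaders contains more than one distinct state, the best-performing state changes over the year, which is the question the task asks.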
