
IE 300 Project 1 Basic Data Analysis

Due in class, Tuesday, March 13, 2012

Preliminaries
Project 1 is to be completed by student teams, with each team containing three students (except for one or two teams that may contain four students). Teams are assigned by the instructor. Teams are identified by student last names and are assigned a corresponding project team number. Data for the project for all teams will be placed on the course Compass website. That is, by Tuesday, February 21, you will find on the Compass website a file for your team number containing data in .xls (Excel) format; e.g., the data for project team 14 is in the file Team14.xls. Each file contains 1 vector of discrete data and 6 vectors of continuous data (labeled as discrete and conts); descriptions of the data and the associated probability analysis questions you are required to answer are given below. For your project, you are responsible for analyzing only the two data sets that correspond to your team number. Data assigned to other teams should not be used in your analysis.

Data Analysis Assignment


For both data sets, you may utilize the capabilities of MATLAB, EXCEL, or Minitab to analyze the data.

I. Continuous Data
The continuous data represent average monthly temperature measurements in degrees Fahrenheit (daily high temperature, averaged over the month) for the months of February and August, over a span of 100 years, from 3 different cities in the US. For each of the 3 cities and 2 months in your data set, you should individually answer the following:
1. What are the statistical characteristics of the sample data?
2. What type of probability distribution seems to describe the data best and why?
3. Do the data appear to be normally distributed (based on basic normality tests)?
4. Are there any outliers in the data that may be affecting your analysis?
5. Are there any trends in the data?
6. Suppose you and your friends are planning a trip in February to the southernmost city in your data set. What city will this be? What is the probability that the temperature during your visit will be over 70 F? Between 60-69 F? Below 55 F?
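As a rough illustration of how questions 1, 4, 5, and 6 might be approached, here is a minimal Python sketch. Python is used only for illustration; the project itself is to be done in MATLAB, Excel, or Minitab. The function names and the example temperatures are hypothetical and are not project data. The 1.5 x IQR fence is one common outlier convention among several, and the probabilities assume a normal model, which is appropriate only if your answer to question 3 supports it.

```python
import statistics
from statistics import NormalDist


def summarize(temps):
    """Question 1: basic sample statistics for one city/month vector."""
    q1, _, q3 = statistics.quantiles(temps, n=4)
    return {
        "n": len(temps),
        "mean": statistics.mean(temps),
        "median": statistics.median(temps),
        "stdev": statistics.stdev(temps),  # sample standard deviation
        "min": min(temps),
        "max": max(temps),
        "q1": q1,
        "q3": q3,
    }


def iqr_outliers(temps, k=1.5):
    """Question 4: values outside the k * IQR fences (an assumed convention)."""
    q1, _, q3 = statistics.quantiles(temps, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [t for t in temps if t < low or t > high]


def trend_slope(temps):
    """Question 5: least-squares slope of temperature against year index."""
    n = len(temps)
    x_mean = (n - 1) / 2
    y_mean = statistics.mean(temps)
    sxy = sum((i - x_mean) * (t - y_mean) for i, t in enumerate(temps))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx  # degrees F per year


def normal_probabilities(temps):
    """Question 6: probabilities under a fitted normal model (an assumption)."""
    model = NormalDist(statistics.mean(temps), statistics.stdev(temps))
    return {
        "over_70": 1 - model.cdf(70),
        "60_to_69": model.cdf(69) - model.cdf(60),
        "below_55": model.cdf(55),
    }


# Placeholder values only, NOT real project data.
example = [58.1, 61.4, 63.0, 59.7, 65.2, 62.8, 60.3, 64.1]
print(summarize(example))
print(iqr_outliers(example))
print(trend_slope(example))
print(normal_probabilities(example))
```

A slope near zero suggests no linear trend over the years, but a plot of temperature against year is still worth inspecting before drawing that conclusion.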

II. Discrete Data


The discrete data given represent the number of students who enter the computer network assistance center at Pomona College. Each observation represents the worst hour in a 24-hour (one day) period. There are a total of 92 observations, covering a 3-month period. The computer center management requires an analysis of the data in order to determine staffing requirements. Given this data, they would like you to determine:
1. What are the statistical characteristics of the sample data?
2. What type of probability distribution seems to describe the data best and why?
3. Are there any outliers in the data that may be affecting your analysis?
4. What is the probability that larger numbers of students will arrive during a worst-case one-hour time period? (Note: you need to select, and justify your selection of, what "larger" means in this context. State clearly in your report what this value/range is and why you selected it based on course-related analyses.)
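For question 4, one possible approach is sketched below in Python (again for illustration only; the project tools are MATLAB, Excel, or Minitab). It assumes the counts are reasonably described by a Poisson distribution, which is something question 2 asks you to check rather than a given. The function names, the example counts, and the threshold are all hypothetical.

```python
import math
import statistics


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; near 1 is consistent with Poisson."""
    return statistics.variance(counts) / statistics.mean(counts)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_tail(threshold, lam):
    """P(X >= threshold) for X ~ Poisson(lam)."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(threshold))


# Placeholder values only, NOT real project data.
counts = [12, 15, 9, 14, 11, 17, 13, 10]
lam = statistics.mean(counts)   # Poisson rate estimated by the sample mean
threshold = 18                  # hypothetical choice of what "larger" means
print(dispersion_ratio(counts))
print(poisson_tail(threshold, lam))
```

If the dispersion ratio is far from 1, the Poisson assumption is doubtful and a different model, or the empirical relative frequencies, may be a better basis for the tail probability.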

Turn in the following:


1. A concise but complete report including your analysis and support for your conclusions.
2. Relevant histograms, graphs and/or plots.
3. A copy of your worksheets or spreadsheets from MATLAB/EXCEL; include references to routines used or any routines you wrote yourselves to complete the assignment.

Note that the text of your report should be typed unless one of you has superior penmanship that everyone can read; mathematical formulas may be written in by hand. You may discuss the approaches you are using with other teams; however, each team should complete its own analysis (as the data is different!). Submit one report which has been completed solely by the members of the team. All spelling errors and egregious grammatical errors will result in deductions from the final grade.
