
STA 6166, Section 8489, Fall 2007
Homework Assignment #1
Due Date: 6 September 2007

Please do the following Chapter Exercises in Freund and Wilson, Chapter 1:
Concept Questions 6 to 15, inclusive (pg. 50-51)
Exercise 2 (pg. 53-54)
Data for Exercise 2 (the first 3 lines are SAS code for those of you familiar with SAS; if not, please ignore):

Student Name: Ramin Shamshiri UFL ID#: 9021-3353

Ramin Shamshiri

STA6166, HW#1, Sep.06.2007

Page 1

1- What is the median of the following IQ scores? 95, 87, 96, 110, 150, 104, 112, 110
Solution: Ordering the IQ scores from low to high gives the table below:

i          1    2    3    4    5    6    7    8
IQ score   87   95   96   104  110  110  112  150

The median is the middle observation. The number of observations in this question is even (n = 8), so the median is the average of the two middle observations: (y4 + y5)/2 = (104 + 110)/2 = 107.

2- The concentration of DDT in milligrams per liter is:
Answer: A ratio variable.

3- If the interquartile range is zero, you can conclude that:
Answer: At least 50% of the observations have the same value.

4- The species of each insect found in a plot of cropland is:
Answer: A nominal variable.

5- The average type of grass used in Texas lawns is best described by:
Answer: The mode. Grass type is a nominal variable, so its values cannot be averaged or ranked; the most frequent category is the only meaningful "average".

6- A sample of 100 IQ scores produced the following statistics:
Mean = 95
Median = 100
Mode = 75
Lower quartile (Q1) = 70
Upper quartile (Q3) = 120
Standard deviation (s) = 30
Which statement(s) is/are correct?

Half of the scores are less than 95.
Answer: The median marks the middle of the observations when they are arranged from low to high, so half of the scores are less than 100, not 95. Thus the statement is NOT CORRECT.

The middle 50% of scores are between 100 and 120.

Answer: The middle half of the distribution lies between the lower and upper quartiles, which in this question are 70 and 120. Thus the statement is NOT CORRECT.
Note: The midpoint of the interquartile interval is (upper quartile + lower quartile)/2 = (120 + 70)/2 = 95. If the median (100) were at the center of the box (equal to 95), the middle portion of the distribution could be symmetric.

One-quarter of the scores are greater than 120.
Answer: 120 is the third quartile, so the remaining one-quarter of the scores lie above it. Thus the statement is CORRECT.

The most common score is 95.
Answer: The mode is the most frequently occurring observation. In this question the most common score is 75, not 95. Thus the statement is NOT CORRECT.

7- A sample of 100 IQ scores produced the following statistics:
Mean = 100
Median = 95
Mode = 75
Lower quartile = 70
Upper quartile = 120
Standard deviation = 30
Which statement(s) is/are correct?

Half of the scores are less than 100.
Answer: The median marks the middle of the observations when they are arranged from low to high, so half of the scores are less than the median, which is 95 here, and those scores are certainly also less than 100. Thus the statement is CORRECT in the sense that at least half of the scores are below 100.

The middle 50% of the scores are between 70 and 120.
Answer: The middle half of the distribution lies between the lower and upper quartiles, which in this question are 70 and 120. Thus the statement is CORRECT.

One-quarter of the scores are greater than 100.
Answer: Based on the box plot, 25% of the observations are greater than Q3 = 120, and all of those are also greater than 100. The statement is CORRECT if it means that at least one-quarter of the scores are greater than 100, and NOT CORRECT if it means that exactly one-quarter of the scores are greater than 100.

The most common score is 95.
Answer: The mode is the most frequently occurring observation. In this question the most common score is 75, not 95. Thus the statement is NOT CORRECT.
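As a check on the arithmetic in question 1, the median of the eight IQ scores can be recomputed with a short Python sketch. The variable names (iq_scores, ordered, by_hand) are illustrative choices and not part of the assignment; the standard-library statistics.median is used only as a cross-check of the hand calculation.

```python
from statistics import median

# IQ scores from question 1, in the order they were given.
iq_scores = [95, 87, 96, 110, 150, 104, 112, 110]

# Step 1: arrange the observations from low to high.
ordered = sorted(iq_scores)
n = len(ordered)

# Step 2: n is even, so average the two middle observations
# (y4 and y5 in the 1-based notation of the solution).
by_hand = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print("ordered scores:", ordered)
print("median by hand:", by_hand)
print("median from the statistics module:", median(iq_scores))
```

Both values agree with the 107 obtained in the solution.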

8- Identify which of the following is a measure of dispersion:
1) Median
2) 90th percentile
3) Interquartile range
4) Mean
Answer: The interquartile range is the length of the interval between the 25th and 75th percentiles. It describes the spread of the middle half of the distribution, so it is a measure of dispersion. Thus option No.3 is the CORRECT ANSWER.

9- A sample of pounds lost in a given week by individual members of a weight-reducing clinic produced the following statistics:
Mean = 5 pounds
Median = 7 pounds
Mode = 4 pounds
First quartile = 2 pounds
Third quartile = 8.5 pounds
Standard deviation = 2 pounds
Identify the correct statement:
1. One-fourth of the members lost less than 2 pounds
2. The middle 50% of the members lost between 2 and 8.5 pounds
3. The most common weight loss was 4 pounds
4. All of the above are correct
5. None of the above is correct
Answer: Considering the box plot, the middle 50% of the members lost between 2 and 8.5 pounds, so 25% of them lost less than 2 pounds and 25% lost more than 8.5 pounds. The mode is 4 pounds, so the most common weight loss was 4 pounds. It can also be read from the question that the average weight loss is 5 pounds. All of the statements are correct. Thus option No.4 is the CORRECT ANSWER.

10- A measurable characteristic of a population is:
1) A parameter
2) A statistic
3) A sample
4) An experiment
Answer: A parameter is a numerical characteristic of a population. A statistic describes a sample, and a sample is a subset of the population rather than a characteristic of it. Thus option No.1 is the CORRECT ANSWER.

11- What is the primary characteristic of a set of data for which the standard deviation is zero?
1) All values of the variable appear with equal frequency
2) All values of the variable have the same value
3) The mean of the values is also zero
4) All of the above are correct
5) None of the above is correct
Answer: The standard deviation of a set of observed values is defined as the positive square root of the variance, and the variance of a set of n observed values is the sum of the squared deviations divided by (n-1). The difference (distance) between the observed value yi and the mean ȳ is called the deviation of the i-th observation from the mean. A sum of squares is zero only if every term is zero, so a standard deviation of zero means that yi - ȳ = 0, or yi = ȳ, for every observation. In other words, all of the values of the variable are the same. Thus option No.2 is the CORRECT ANSWER.

12- Let X be the distance in miles from their present homes to residences when in high school of individuals at a class reunion. Then X is:
1) A categorical (nominal) variable
2) A continuous variable
3) A discrete variable
4) A parameter
5) A statistic
Answer: The distance is expressed in miles, and it can take any non-negative real value, not only whole numbers such as one or two miles. Distance is therefore a continuous variable. Thus option No.2 is the CORRECT ANSWER.

13- A subset of a population is:
1- A parameter
2- A population
3- A statistic
4- A sample
5- None of the above
Answer: A subset of a population is a sample. Thus option No.4 is the CORRECT ANSWER.

14- The median is a better measure of central tendency than the mean if:
1- The variable is discrete
2- The distribution is skewed
3- The variable is continuous
4- The distribution is symmetric
5- None of the above is correct
Answer: Option No.2 is correct. In a skewed distribution the mean is pulled toward the long tail by extreme values, while the median is not.
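The argument in question 11 can be illustrated with a small Python sketch. The two data sets below (constant and varied) are hypothetical values made up for illustration, and sample_std is an illustrative helper name; it follows the definition quoted in the answer (sum of squared deviations divided by n-1, then the positive square root).

```python
from statistics import mean, stdev


def sample_std(values):
    """Sample standard deviation: sqrt( sum((yi - ybar)^2) / (n - 1) )."""
    ybar = mean(values)
    squared_deviations = [(y - ybar) ** 2 for y in values]
    return (sum(squared_deviations) / (len(values) - 1)) ** 0.5


# Hypothetical data, for illustration only.
constant = [4, 4, 4, 4, 4]   # every value equals the mean
varied = [3, 4, 4, 4, 5]     # same mean, but two values deviate from it

print("std of constant data:", sample_std(constant))
print("std of varied data:  ", sample_std(varied))

# Cross-check against the standard library implementation.
print("library std of varied data:", stdev(varied))
```

Every squared deviation is non-negative, so the sum can only be zero when each observation equals the mean, which is what the constant data set shows.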

15- A small sample of automobile owners at Texas A&M University produced the following number of parking tickets during a particular year: 4, 0, 3, 2, 5, 1, 2, 1, 0. The mean number of tickets (rounded to the nearest tenth) is:
1- 1.7
2- 2.0
3- 2.5
4- 3.0
5- None of the above
Solution: The mean is the average of the data and is calculated as Σ yi / n = (4+0+3+2+5+1+2+1+0)/9 = 18/9 = 2.0. Thus option No.2 is the CORRECT ANSWER.

Exercise 2 (pg. 53-54)

a) Make a complete summary of one of these variables: compute the mean, median, and variance, and construct a bar chart and box plot.
Answer: Arranging the observations from low to high gives the table below. Each variable is sorted separately, so a row gives the order position within each column and not a single joint observation.

No.   WATER   VEG     FOWL
1     0       0       0
2     0       0       0
3     0.25    0       0
4     0.25    0       0
5     0.25    0       0
6     0.25    0       0
7     0.25    0       0
8     0.25    0       0
9     0.25    0       0
10    0.25    0       0
11    0.5     0       0
12    0.5     0       0
13    0.5     0       0
14    0.75    0       0
15    0.75    0       0
16    0.75    0       0
17    0.75    0       1
18    1       0       2
19    1       0       2
20    1       0       2
21    1       0       4
22    1       0       5
23    1       0       9
24    1.25    0       10
25    1.25    0       11
26    1.5     0       11
27    1.5     0       12
28    1.5     0       14
29    1.5     0       15
30    1.5     0       16
31    2       0       16
32    2       0.25    16
33    2       0.5     17
34    2       0.75    18
35    2       1       26
36    3       1       30
37    4       1       32
38    5       1.25    51
39    5       1.5     59
40    5       1.75    74
41    6       2       80
42    7       2       125
43    7       2       125
44    9       2       167
45    10      2.25    177
46    15      2.75    179
47    16      3       185
48    16      4       210
49    17      5.25    218
50    31      7       240
51    33      8       364
52    149     9       1410

Statistic            WATER      VEG        FOWL
Mean                 7.125      1.120192   75.63462
Median               1.5        0          11.5
Variance             452.864    4.327182   42197.33
Standard deviation   21.2806    2.080188   205.4199
Mode                 0.25       0          0
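The WATER column of the summary can be reproduced with Python's standard statistics module. This is only a sketch for checking the figures, not the method used in the original solution; the list name water is an illustrative choice, and the values are the 52 ordered WATER observations from the table in part a).

```python
from statistics import mean, median, mode, stdev, variance

# The 52 ordered WATER observations from part a).
water = [
    0, 0, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25,
    0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75, 1, 1, 1,
    1, 1, 1, 1.25, 1.25, 1.5, 1.5, 1.5, 1.5, 1.5,
    2, 2, 2, 2, 2, 3, 4, 5, 5, 5,
    6, 7, 7, 9, 10, 15, 16, 16, 17, 31,
    33, 149,
]

print("n        =", len(water))
print("mean     =", mean(water))
print("median   =", median(water))
# variance() and stdev() use the sample (n - 1) divisor.
print("variance =", round(variance(water), 3))
print("std dev  =", round(stdev(water), 4))
print("mode     =", mode(water))
```

The same calls applied to the VEG and FOWL columns should give the other two columns of the summary.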

The bar chart was plotted in MATLAB (Data1 = WATER, Data2 = VEG, Data3 = FOWL).
[Figure: bar chart of WATER, VEG and FOWL]

The process for constructing the box plot is as follows:
Q1 = 25% of the distribution = 0.25*52 = 13 => the 13th ordered observation: y13(WATER) = 0.5, y13(VEG) = 0, y13(FOWL) = 0
Q3 = 75% of the distribution = 0.75*52 = 39 => the 39th ordered observation: y39(WATER) = 5, y39(VEG) = 1.5, y39(FOWL) = 59
This means that:
The middle 50% of the observed values of WATER lie between 0.5 and 5,
The middle 50% of the observed values of VEG lie between 0 and 1.5,
The middle 50% of the observed values of FOWL lie between 0 and 59.

b) Constructing a frequency distribution for FOWL and using the frequency distribution to compute the mean and variance:

c) Make a scatter plot relating WATER or VEG to FOWL
Answer: Relating WATER to FOWL means that WATER lies on the vertical axis (y) and FOWL lies on the horizontal axis (x).
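The quartile positions used for the box plot in part a) can be checked with the sketch below. The helper name value_at_fraction and the list names are illustrative assumptions; the rule it applies is the one used in the solution (take the observation at position 0.25*n or 0.75*n in the ordered data), which works here because 0.25*52 and 0.75*52 are whole numbers.

```python
def value_at_fraction(ordered, fraction):
    """Return the observation at 1-based position fraction * n of ordered data.

    Assumes fraction * n is a whole number, as it is for n = 52.
    """
    position = round(fraction * len(ordered))
    return ordered[position - 1]


# Ordered observations from the table in part a).
water = [
    0, 0, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25,
    0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75, 1, 1, 1,
    1, 1, 1, 1.25, 1.25, 1.5, 1.5, 1.5, 1.5, 1.5,
    2, 2, 2, 2, 2, 3, 4, 5, 5, 5,
    6, 7, 7, 9, 10, 15, 16, 16, 17, 31,
    33, 149,
]
veg = [0] * 31 + [
    0.25, 0.5, 0.75, 1, 1, 1, 1.25, 1.5, 1.75, 2, 2, 2,
    2, 2.25, 2.75, 3, 4, 5.25, 7, 8, 9,
]
fowl = [0] * 16 + [
    1, 2, 2, 2, 4, 5, 9, 10, 11, 11, 12, 14, 15, 16, 16, 16, 17, 18,
    26, 30, 32, 51, 59, 74, 80, 125, 125,
    167, 177, 179, 185, 210, 218, 240, 364, 1410,
]

for name, data in [("WATER", water), ("VEG", veg), ("FOWL", fowl)]:
    q1 = value_at_fraction(data, 0.25)   # 13th ordered observation
    q3 = value_at_fraction(data, 0.75)   # 39th ordered observation
    print(f"{name}: Q1 = {q1}, Q3 = {q3}")
```

Other quartile conventions, for example ones that interpolate between neighbouring observations, can give slightly different values from this positional rule.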
