
INTRODUCTION

This statistical project focuses on vehicles. The data are a sample of 93 vehicles drawn from a population of different car models made by different manufacturers in the USA. The 93 vehicles are categorized into six types: compact, small, mid-size, large, sporty and van. Two aims guide the analysis of the data set.

The first aim is to compare price and mpg across the six vehicle categories. To achieve this aim, the price variables (basic price and top price) and the mpg variables (mpg town and mpg best) are grouped according to the six car types. The Analysis ToolPak in Tools within Excel is then used to create a table for each variable showing the descriptive statistics: mean, median, standard deviation, range, skew measure, standard error, sum and count, with the coefficient of variation added manually. Each table is followed by an interpretation that discusses the differences in mean prices or mean mpg across vehicle types and their possible causes, together with the shape of the price or mpg distribution for each vehicle type and what it means. Confidence intervals are used to make statements about prices and mpg in the population of cars. Several difference-of-means tests on both prices and mpg are carried out for pairs of car types whose means do not differ much. Probability, based on a frequency table, is used to find the likelihood of a cheap car or an efficient car across categories, and the pattern is tested using chi-squared tests.

The second aim is to predict the mpg. This is done by using correlation to look for a linear association between either town or best mpg and the other variables in the data set. A correlation matrix of the mpg measures against the other variables is shown, with comments on the implications of the correlation values. Scatter diagrams of mpg against the variables that are useful for prediction are shown together with the r² value and the equation of the line. Comments on what the r² value says about the model used to predict mpg, and an explanation of what the intercept and the slope mean in context, are also given.

PART 1, AIM: COMPARING PRICES AND MPGS ACROSS VEHICLE CATEGORIES

1. (A) COMPARISONS OF VARIABLE DISTRIBUTIONS

TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH BASIC PRICE

Type of measure            Compact   Small     Mid-size  Large     Sporty    Van
                           ($000)    ($000)    ($000)    ($000)    ($000)    ($000)
Mean                       15.694    8.429     24.114    22.936    16.857    16.200
Standard Error             1.468     0.326     2.164     1.888     2.110     0.676
Median                     14.05     8.2       23.05     19.9      13.7      16.6
Standard Deviation         5.873     1.493     10.152    6.261     7.895     2.028
Coefficient of variation   0.374     0.177     0.421     0.273     0.468     0.125
Skewness                   1.052     1.596     0.687     1.145     1.539     0.513
Range                      20.5      6.2       33        16.9      25.5      5.9
Minimum                    8.5       6.7       12.4      17.5      9.1       13.6
Maximum                    29        12.9      45.4      34.4      34.6      19.5
Sum                        251.1     177       530.5     252.3     236       145.8
Count                      16        21        22        11        14        9

Fig. 1 Basic price

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN
Fig. 1 above shows the mean basic prices as $15 694 for compact cars, $8 429 for small cars, $24 114 for mid-size cars, $22 936 for large cars, $16 857 for sporty cars and $16 200 for vans. Mid-size cars have the highest mean price. This may be because they have an average engine size of 3.1 litres, second only to large cars, which have the highest average engine size of 4.1 litres. The high number of airbags in mid-size cars also adds to their cost. Small cars show the lowest mean price of $8 429. The difference between this price and that of mid-size cars is about $15 685, which is very large. This might be because the mean engine size of small cars is 1.6 litres and that of mid-size cars is 3.1 litres, a difference of about 1.5 litres. Another possible cause of the large difference in mean prices between these two car types is that mid-size cars have a total of 25 airbags whilst small cars have only 5.

MEDIAN
Fig. 1 shows small cars having the lowest median price, $8 200. They are followed by sporty cars with a median price of $13 700, then compact cars with $14 050, vans with $16 600, large cars with $19 900 and lastly mid-size cars with $23 050. Unlike the mean, the median is not affected by extremely low or high prices.

SKEW MEASURE
Fig. 1 shows the skewness of compact cars as 1.052, small cars as 1.596, mid-size cars as 0.687, large cars as 1.145, sporty cars as 1.539 and vans as 0.513. All six car types therefore have positive skewness, meaning that each price distribution has a longer tail towards high prices. Vans display the least skewness. This might be because, of all the car types, vans have the lowest number of airbags (3), whereas the other car types have an average of 14.4 airbags.

COEFFICIENT OF VARIATION
Compact cars have a coefficient of variation (CV) of 0.374, small cars 0.177, mid-size cars 0.421, large cars 0.273, sporty cars 0.468 and vans 0.125. Sporty and mid-size cars have the highest CVs, which means that their prices are the most widely spread relative to the mean and include some extreme values. Vans have the lowest CV, which shows that their prices lie quite close to the mean.

RANGE
In Fig. 1, compact cars have a range in basic price of $20 500, small cars $6 200, mid-size cars $33 000, large cars $16 900, sporty cars $25 500 and vans $5 900. Mid-size cars have the highest range, and the difference between this range and that of vans is $27 100. This is a very large gap and could be because the most expensive mid-size cars cost far more than any van. However, the difference in range between mid-size cars and sporty cars is only $7 500, which could be because sporty cars and mid-size cars have a similar spread of prices.
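The coefficient of variation and the range discussed above can be recomputed from the other rows of Fig. 1. The following is a minimal sketch, assuming the CV is the standard deviation divided by the mean and the range is the maximum minus the minimum; the variable names are illustrative and are not part of the original Excel workbook.

```python
# Values copied from Fig. 1 (basic price, $000):
# car type -> (sum, count, standard deviation, minimum, maximum)
fig1 = {
    "Compact":  (251.1, 16, 5.873155739, 8.5, 29.0),
    "Small":    (177.0, 21, 1.493031432, 6.7, 12.9),
    "Mid-size": (530.5, 22, 10.15233004, 12.4, 45.4),
    "Large":    (252.3, 11, 6.260714452, 17.5, 34.4),
    "Sporty":   (236.0, 14, 7.895345687, 9.1, 34.6),
    "Van":      (145.8, 9, 2.027929979, 13.6, 19.5),
}

cv = {}           # coefficient of variation per car type
price_range = {}  # maximum minus minimum per car type

for car_type, (total, count, sd, lowest, highest) in fig1.items():
    mean = total / count          # mean = sum / count
    cv[car_type] = sd / mean      # assumed definition: CV = SD / mean
    price_range[car_type] = highest - lowest

# Car type with the widest spread of basic prices
widest = max(price_range, key=price_range.get)

for car_type in fig1:
    print(f"{car_type:9s} CV={cv[car_type]:.3f} range={price_range[car_type]:.1f}")
```

Because the CV is unit-free, it allows the spread of a cheap category (small cars) to be compared fairly with that of an expensive one (mid-size cars).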

TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH TOP PRICE

Type of measure            Compact   Small     Mid-size  Large     Sporty    Van
                           ($000)    ($000)    ($000)    ($000)    ($000)    ($000)
Mean                       20.725    11.905    30.314    25.673    21.957    22.033
Standard Error             1.990     0.612     3.216     2.011     2.291     1.003
Median                     18.5      11.3      27.35     21.9      21.2      21.7
Standard Deviation         7.961     2.803     15.086    6.669     8.573     3.009
Coefficient of variation   0.384     0.235     0.498     0.260     0.390     0.137
Skewness                   0.950     0.918     1.817     0.912     1.029     0.125
Range                      25.7      10.9      65.1      19.4      30.5      8.6
Minimum                    11.4      7.9       14.9      18.4      11        18
Maximum                    37.1      18.8      80        37.8      41.5      26.6
Sum                        331.6     250       666.9     282.4     307.4     198.3
Count                      16        21        22        11        14        9

Fig. 2 Top price

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN
Fig. 2 shows that compact cars have a mean top price of $20 725, small cars $11 905, mid-size cars $30 314, large cars $25 673, sporty cars $21 957 and vans $22 033. Mid-size cars have the highest mean price, which means that they are the most expensive cars on average. This may be because they have a higher average number of airbags than any other car type, a feature that adds to their cost. Small cars have the lowest mean price, which may be because they have the smallest average engine size of just 1.6 litres.

MEDIAN
The median top price of compact cars is $18 500, of small cars $11 300, of mid-size cars $27 350, of large cars $21 900, of sporty cars $21 200 and of vans $21 700. Mid-size cars have the highest median price and small cars the lowest. The difference between these two medians is $16 050, which could be because their features and extras are different.

SKEWNESS
Fig. 2 shows the skewness of compact cars as 0.950, small cars as 0.918, mid-size cars as 1.817, large cars as 0.912, sporty cars as 1.029 and vans as 0.125. All car types show positive skewness, and mid-size cars are the most strongly skewed. This might be because mid-size cars are the only car type with a total of 25 airbags, whereas the other car types have an average of 10 airbags.

COEFFICIENT OF VARIATION
Looking at Fig. 2, the coefficient of variation values for compact, small, mid-size, large, sporty and van cars are 0.384, 0.235, 0.498, 0.260, 0.390 and 0.137 respectively. Mid-size, sporty and compact cars have the highest values, which means that their prices are widely spread about their mean prices because some prices are very high and some very low. Vans, small cars and large cars have lower values, which shows that their prices lie closer to the mean price.

RANGE
Fig. 2 shows that mid-size cars have the highest range in top price of $65 100, followed by sporty cars with $30 500 and compact cars with $25 700. Next come large cars with a range of $19 400 and small cars with $10 900. Vans show the lowest range in top price, $8 600. The difference between the range of mid-size cars, which is higher than that of any other car type, and that of vans, which is the lowest, is $56 500. These differences in the range of top price could be due to a few extreme prices.

TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH MPG TOWN

Type of measure            Compact   Small     Mid-size  Large     Sporty    Van
Mean                       22.688    29.857    19.545    18.364    21.786    17.000
Standard Error             0.481     1.333     0.404     0.453     1.044     0.408
Median                     23        29        19        19        22.5      17
Standard Deviation         1.922     6.110     1.896     1.502     3.906     1.225
Coefficient of variation   0.085     0.205     0.097     0.082     0.179     0.072
Skewness                   -0.006    1.288     0.038     -0.546    0.477     -1.050
Range                      6         24        7         4         13        3
Minimum                    20        22        16        16        17        15
Maximum                    26        46        23        20        30        18
Sum                        363       627       430       202       305       153
Count                      16        21        22        11        14        9

Fig. 3 MPG town

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN
Fig. 3 shows that the mean mpg town for compact cars is 22.688 mpg, for small cars 29.857 mpg, for mid-size cars 19.545 mpg, for large cars 18.364 mpg, for sporty cars 21.786 mpg and for vans 17.000 mpg. Small cars have the highest mean mpg because they have a relatively small engine size, so they travel more miles per gallon than vans, which have the smallest mean mpg because they have bigger engines. Small cars are therefore the most fuel efficient and can travel further than vans on the same amount of fuel.

MEDIAN
Fig. 3 shows that the median values of mpg town, arranged in ascending order, are 17 mpg for vans, 19 mpg for large and mid-size cars, 22.5 mpg for sporty cars, 23 mpg for compact cars and 29 mpg for small cars. The median of these six medians is 20.75 mpg, as half of the values lie above it and half below it. The median is not affected by extremely high or low mpg values.

SKEWNESS
Fig. 3 shows that compact cars have a skew measure of -0.006, small cars 1.288, mid-size cars 0.038, large cars -0.546, sporty cars 0.477 and vans -1.050. Compact and mid-size cars are close to symmetric, which means that their mean and median mpg are nearly the same. Small cars show considerable positive skewness and sporty cars moderate positive skewness. Large cars and vans have negative skewness, which means that most of their values are clustered at the higher end with a tail towards low mpg.

COEFFICIENT OF VARIATION
The coefficient of variation for compact cars is 0.085, for small cars 0.205, for mid-size cars 0.097, for large cars 0.082, for sporty cars 0.179 and for vans 0.072. Small and sporty cars have the highest values, which means that some of their mpg values are well above or below the mean. The other car types have values below 0.10, so their mpg values lie close to the mean.

RANGE
Fig. 3 indicates that compact cars have a range of 6 mpg, small cars 24 mpg, mid-size cars 7 mpg, large cars 4 mpg, sporty cars 13 mpg and vans 3 mpg. Vans display the smallest range among all the car types, so we can conclude that there is less dispersion in their mpg. Small cars have the highest range in mpg, 24 mpg; this could be because they have a small engine size and a low fuel consumption rate.

TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH MPG BEST

Type of measure            Compact   Small     Mid-size  Large     Sporty    Van
Mean                       29.875    35.476    26.727    26.727    28.786    21.889
Standard Error             0.735     1.224     0.384     0.384     0.973     0.484
Median                     30        33        26.5      26        28.5      22
Standard Deviation         2.941     5.609     1.272     1.272     3.641     1.453
Coefficient of variation   0.098     0.158     0.048     0.048     0.126     0.066
Skewness                   0.589     1.185     0.122     0.091     0.502     0.071
Range                      10        21        9         3         12        4
Minimum                    26        29        22        25        24        20
Maximum                    36        50        31        28        36        24
Sum                        478       745       588       294       403       197
Count                      16        21        22        11        14        9

Fig. 4 MPG best

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN
Fig. 4 shows that the mean mpg best for compact cars is 29.875 mpg, for small cars 35.476 mpg, for mid-size cars 26.727 mpg, for large cars 26.727 mpg, for sporty cars 28.786 mpg and for vans 21.889 mpg. Small cars have the highest mean mpg of 35.476. They have a relatively small engine size of 1.6 litres per car and can therefore travel a long distance on a gallon of fuel. In contrast, vans have the lowest mean mpg of 21.889. They have a large engine size of about 3.2 litres per car and cover a shorter distance on a gallon of fuel. Small cars therefore have a lower fuel consumption rate than vans.

MEDIAN
According to Fig. 4, vans have the smallest median mpg best of 22 mpg, followed by large cars with 26 mpg. Mid-size and sporty cars have medians of 26.5 mpg and 28.5 mpg respectively. Compact cars have a median of 30 mpg and lastly small cars have a median of 33 mpg. The median mpg values are unaffected by extremely low or high values.

SKEWNESS
According to Fig. 4, small cars have the highest skew measure of 1.185, followed by compact cars with 0.589 and sporty cars with 0.502. Next come mid-size cars with 0.122, large cars with 0.091 and lastly vans with 0.071. All car types show positive skewness, which indicates a longer tail towards high mpg values. This might be because these car types have relatively medium-sized engines. Mid-size cars, large cars and vans have values close to 0, so their distributions are nearly symmetric.

COEFFICIENT OF VARIATION
The coefficient of variation for compact cars is 0.098, for small cars 0.158, for mid-size cars 0.048, for large cars 0.048, for sporty cars 0.126 and for vans 0.066. According to Fig. 4, small and sporty cars have the highest values, which means that some of their mpg values are well above or below the mean. Mid-size cars, large cars and vans have the lowest values, so their mpg values lie close to the mean.

RANGE
The range of compact cars is 10 mpg, of small cars 21 mpg, of mid-size cars 9 mpg, of large cars 3 mpg, of sporty cars 12 mpg and of vans 4 mpg. Small cars have the highest mpg range of 21 mpg and large cars have the lowest of 3 mpg. This means that there is less dispersion in the mpg of large cars than in that of small cars. That is, the mpg of large cars is clustered more closely around the mean, whereas that of small cars is dispersed away from the mean.

(B) HYPOTHESIS TEST OF TWO MEANS

We undertook hypothesis tests on the interesting differences between car types for prices and mpg. The t-test was used because each sample size n was less than 30.

BASIC PRICE

COMPACT AND VAN CAR TYPES
We tested the difference in mean basic prices because the two means are close to each other.

1. H0: μ1 = μ2; H1: μ1 ≠ μ2
2. α = 0.05
3. We used a t-test because n is less than 30 (n = 16 and n = 9) and the population standard deviation is not known.
4. Decision rule: reject H0 if tc < -2.086 or tc > 2.086. Critical value = 2.086.
5. T-statistic = -0.313191922. Therefore, at the 5% level of significance, we do not reject the null hypothesis. There is insufficient evidence that the mean basic prices of compact cars and vans differ, so the two samples are consistent with coming from populations with the same mean.
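The five steps above can be checked from the summary values in Fig. 1. This is a sketch only: the report does not state which form of t-test was run in Excel, so the code assumes the unequal-variance (Welch) form, which reproduces the quoted statistic from the Fig. 1 means and standard deviations. The function name `welch_t` is illustrative.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and degrees of freedom, unequal variances assumed."""
    v1 = sd1 ** 2 / n1  # squared standard error of sample 1
    v2 = sd2 ** 2 / n2  # squared standard error of sample 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Basic price ($000), compact (n = 16) versus van (n = 9), values from Fig. 1
t_stat, df = welch_t(15.69375, 5.873155739, 16, 16.2, 2.027929979, 9)

critical = 2.086                    # two-tailed 5% critical value quoted in step 4
reject_h0 = abs(t_stat) > critical  # decision rule from step 4

print(f"t = {t_stat:.4f}, df = {df:.1f}, reject H0: {reject_h0}")
```

Under this assumption the degrees of freedom come out just above 20, which is consistent with the critical value of 2.086 used in step 4.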

SPORTY AND VAN CAR TYPES
There is only a slight difference between the mean basic prices of these vehicle types.

1. H0: μ1 = μ2; H1: μ1 ≠ μ2
2. α = 0.05
3. We used a t-test because n is less than 30 (n = 14 and n = 9) and the population standard deviation is not known.
4. Critical value = 2.120, therefore reject H0 if tc < -2.120 or tc > 2.120.
5. T-statistic = 0.296577999. Therefore, at the 5% level of significance, the null hypothesis is not rejected. There is insufficient evidence that the mean basic prices of sporty cars and vans differ.

TOP PRICE

SPORTY AND VAN CAR TYPES

1. H0: μ1 = μ2; H1: μ1 ≠ μ2
2. α = 0.05
3. We used a t-test because n is less than 30 (n = 14 and n = 9) and the population standard deviation is not known.
4. Critical value = 2.179. Reject H0 if tc < -2.179 or tc > 2.179.
5. T-statistic = -0.030460311. At the 5% level of significance we do not reject the null hypothesis. There is insufficient evidence that the mean top prices of sporty cars and vans differ.

SPORTY AND COMPACT CAR TYPES

1. H0: μ1 = μ2; H1: μ1 ≠ μ2
2. α = 0.05
3. We used a t-test because n is less than 30 (n = 16 and n = 14) and the population standard deviation is not known.
4. Critical value = 2.052. Reject H0 if tc < -2.052 or tc > 2.052.
5. T-statistic = -0.405985032. At the 5% level of significance the null hypothesis is not rejected. There is insufficient evidence that the mean top prices of sporty and compact cars differ.

MPG TOWN

COMPACT AND SPORTY CAR TYPES
A hypothesis test is carried out on the slight difference in means between these two car types.

1. H0: μ1 = μ2; H1: μ1 ≠ μ2
2. α = 0.05
3. We used a t-test because n is less than 30 (n = 16 and n = 14) and the population standard deviation is not known.
4. Critical value = 2.101. Reject H0 if tc < -2.101 or tc > 2.101.
5. T-statistic = 0.7846646963. At the 5% level of significance the null hypothesis is not rejected. There is insufficient evidence that the mean mpg town of compact and sporty cars differ, so the two samples are consistent with coming from the same population.

MID-SIZE AND LARGE CAR TYPES
The difference in means between these two vehicle types was so small that we carried out a hypothesis test.

1. H0: μ1 = μ2; H1: μ1 ≠ μ2
2. α = 0.05
3. We used a t-test because n is less than 30 (n = 22 and n = 11) and the population standard deviation is not known.
4. Critical value = 2.060. Reject H0 if tc < -2.060 or tc > 2.060.
5. T-statistic = 1.947428329. At the 5% level of significance the null hypothesis is not rejected. There is insufficient evidence that the mean mpg town of mid-size and large cars differ, so the two samples are consistent with coming from the same population.

MPG BEST

MID-SIZE AND LARGE CAR TYPES
We carried out a hypothesis test as follows to test the difference in means for the above car types.

1. H0: μ1 = μ2; H1: μ1 ≠ μ2
2. α = 0.05
3. We used a t-test because n is less than 30 (n = 22 and n = 11) and the population standard deviation is not known.
4. Critical value = 2.086. Reject H0 if tc < -2.086 or tc > 2.086.
5. T-statistic = 0. At the 5% level of significance the null hypothesis is not rejected. There is no evidence that the mean mpg best of mid-size and large cars differ, so the two samples are consistent with coming from the same population.

COMPACT AND SPORTY CAR TYPES
A t-test is undertaken to test the difference in means between these two vehicle types.

1. H0: μ1 = μ2; H1: μ1 ≠ μ2
2. α = 0.05
3. We used a t-test because n is less than 30 (n = 16 and n = 14) and the population standard deviation is not known.
4. Critical value = 2.060. Reject H0 if tc < -2.060 or tc > 2.060.
5. T-statistic = 0.893084498. At the 5% level of significance the null hypothesis is not rejected. There is insufficient evidence that the mean mpg best of compact and sporty cars differ, so the two samples are consistent with coming from the same population.

(C) CONFIDENCE INTERVAL

Here confidence intervals are used for prices and mpg in the population of cars. A 95% confidence level was used, so each interval extends 1.96 standard errors either side of the sample mean, rather than the 2.58 standard errors of a 99% interval.


A TABLE SHOWING CONFIDENCE INTERVAL ACROSS CAR TYPES

BASIC PRICE

Car types   Mean price ($000)   Confidence interval ($000)
Compact     15.694              12.816 to 18.572
Small       8.429               7.790 to 9.067
Mid-size    24.114              19.871 to 28.356
Large       22.936              19.237 to 26.636
Sporty      16.857              12.721 to 20.993
Van         16.200              14.875 to 17.525

Figure 5

From Figure 5, there is 95% confidence that the true population mean for compact cars lies between $12 816 and $18 572, for small cars between $7 790 and $9 067, for mid-size cars between $19 871 and $28 356, for large cars between $19 237 and $26 636, for sporty cars between $12 721 and $20 993 and lastly for vans between $14 875 and $17 525.
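The intervals in Figure 5 follow from the mean and standard error in Fig. 1. The sketch below assumes the interval is computed as mean ± 1.96 × standard error, which reproduces the compact-car interval; the function name `confidence_interval` is illustrative.

```python
def confidence_interval(mean, std_error, z=1.96):
    """Return (lower, upper) limits of a z-based interval; z = 1.96 gives 95%."""
    margin = z * std_error
    return mean - margin, mean + margin


# Compact cars, basic price ($000): mean and standard error from Fig. 1
low, high = confidence_interval(15.69375, 1.468288935)

print(f"95% CI: {low:.3f} to {high:.3f} ($000)")
```

Passing `z=2.58` instead would give the wider 99% interval.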

TOP PRICE

Car types   Mean price ($)   Confidence interval ($)
Compact     20 725           16 824 to 24 626
Small       11 905           10 706 to 13 104
Mid-size    30 314           24 010 to 36 617
Large       25 673           21 732 to 29 614
Sporty      21 957           17 466 to 26 448
Van         22 033           20 067 to 23 999

Figure 6

Figure 6 shows that there is 95% confidence that the population mean for compact cars lies between $16 824 and $24 626, for small cars between $10 706 and $13 104, for mid-size cars between $24 010 and $36 617 and for large cars between $21 732 and $29 614. Sporty cars lie between $17 466 and $26 448 and lastly vans between $20 067 and $23 999.

MPG TOWN

Car types   Mean mpg   Confidence interval (mpg)
Compact     22.7       21.7 to 23.6
Small       29.9       27.2 to 32.5
Mid-size    19.5       18.7 to 20.3
Large       18.4       17.5 to 19.2
Sporty      21.8       19.7 to 23.8
Van         17.0       16.2 to 17.8

Figure 7

Figure 7 above shows that there is 95% confidence that the true population mean for compact cars lies between 21.7 mpg and 23.6 mpg, for small cars between 27.2 mpg and 32.5 mpg, for mid-size cars between 18.7 mpg and 20.3 mpg, for large cars between 17.5 mpg and 19.2 mpg and for sporty cars between 19.7 mpg and 23.8 mpg. Lastly, vans lie between 16.2 mpg and 17.8 mpg.

MPG BEST

Car types   Mean mpg   Confidence interval (mpg)
Compact     29.9       28.4 to 31.3
Small       35.5       33.1 to 37.9
Mid-size    26.7       26.0 to 27.5
Large       26.7       26.0 to 27.5
Sporty      28.8       26.9 to 30.7
Van         21.9       20.9 to 22.8

Figure 8

Figure 8 reveals that there is 95% confidence that the true population mean for compact cars lies between 28.4 mpg and 31.3 mpg, for small cars between 33.1 mpg and 37.9 mpg, for mid-size and large cars between 26.0 mpg and 27.5 mpg and for sporty cars between 26.9 mpg and 30.7 mpg, whereas vans lie between 20.9 mpg and 22.8 mpg.


(D) PROBABILITY

BASIC PRICE (Average price $17 100)

Car types   High price   Low price   Total
Compact     5            11          16
Small       0            21          21
Mid-size    13           9           22
Large       11           0           11
Sporty      4            10          14
Van         2            7           9
Total       35           58          93

Figure 9 (number of cars)

According to Figure 9, the probability that a car:
a) Is compact, given that it is low in basic price, is 11/58
b) Is large, given that it is high in basic price, is 11/35
c) Is a van is 9/93
d) Is of a low price is 58/93
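The probabilities listed for Figure 9 can be read straight off the frequency table. The sketch below uses exact fractions and also shows the joint probability of "compact and low price" (divided by all 93 cars) next to the conditional probability (divided by the 58 low-priced cars), since the two are easily confused. The variable names are illustrative.

```python
from fractions import Fraction

# Figure 9: car type -> (number with high basic price, number with low basic price)
counts = {
    "Compact": (5, 11),
    "Small": (0, 21),
    "Mid-size": (13, 9),
    "Large": (11, 0),
    "Sporty": (4, 10),
    "Van": (2, 7),
}

total = sum(high + low for high, low in counts.values())  # all cars
total_low = sum(low for _, low in counts.values())         # low-priced cars

p_van = Fraction(sum(counts["Van"]), total)                       # P(van)
p_low = Fraction(total_low, total)                                # P(low price)
p_compact_and_low = Fraction(counts["Compact"][1], total)         # joint probability
p_compact_given_low = Fraction(counts["Compact"][1], total_low)   # conditional probability

print(p_van, p_low, p_compact_and_low, p_compact_given_low)
```

Note that `Fraction` reduces results to lowest terms, so 9/93 is displayed as 3/31.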

TOP PRICE (Average price $21 900)

Car types   High price   Low price   Total
Compact     6            10          16
Small       0            21          21
Mid-size    14           8           22
Large       6            5           11
Sporty      6            8           14
Van         4            5           9
Total       36           57          93

Figure 10 (number of cars)

Figure 10 reveals that the probability that a car:
a) Is mid-sized is 22/93
b) Is low in top price is 57/93
c) Is low in top price, given that it is sporty, is 8/14
d) Is highly priced, given that it is compact, is 6/16

MPG TOWN (Average MPG 22.5 mpg)

Car types   High MPG   Low MPG   Total
Compact     9          7         16
Small       20         1         21
Mid-size    1          21        22
Large       11         0         11
Sporty      7          7         14
Van         0          9         9
Total       48         45        93

Figure 11 (number of cars)

From Figure 11, the probability that a car:
a) Is a van, given that it has a low mpg, is 9/45
b) Is sporty and has a high mpg is 7/93
c) Is small, given that it has a high mpg, is 20/48
d) Is a van, given that it has a high mpg, is 0/48


MPG BEST (Average MPG 29.1 mpg)

Car types   High MPG   Low MPG   Total
Compact     9          7         16
Small       19         2         21
Mid-size    4          18        22
Large       0          11        11
Sporty      6          8         14
Van         0          9         9
Total       38         55        93

Figure 12 (number of cars)

Figure 12 shows that the probability that a car:
a) Has a high mpg is 38/93
b) Has a low mpg, given that it is small, is 2/21
c) Is sporty, given that it has a high mpg, is 6/38
d) Is compact and has a high mpg is 9/93


(E) CHI-SQUARED

Chi-squared was used across both prices and MPGs to test the apparent pattern of probability.

BASIC PRICE

Car types   fo    fe     fo - fe   (fo - fe)²   (fo - fe)²/fe
Compact     5     5.83   -0.83     0.69         0.12
Small       0     5.83   -5.83     34.0         5.83
Mid-size    13    5.83   7.17      51.4         8.82
Large       11    5.83   5.17      26.7         4.58
Sporty      4     5.83   -1.83     3.35         0.57
Van         2     5.83   -3.83     14.67        2.52

Figure 13

fo: observed number of high-priced cars; fe: expected number of high-priced cars

From Figure 13, the computed χ² value is 22.44. This lies in the rejection region, beyond the critical value of 11.1. At the 0.05 level of significance we reject the null hypothesis and accept the alternate hypothesis. The difference between the observed and expected numbers of high-priced cars is large enough to be considered significant.
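The χ² value in Figure 13 can be reproduced with a few lines of code. This sketch assumes, as the table does, that the expected frequency is the same for every car type (the 35 high-priced cars shared equally across 6 types); the variable names are illustrative.

```python
# Observed numbers of high-priced cars (basic price) from Figure 13,
# in the order compact, small, mid-size, large, sporty, van
observed = [5, 0, 13, 11, 4, 2]

# Equal expected frequency per car type: 35 / 6, about 5.83
expected = sum(observed) / len(observed)

# Chi-squared statistic: sum of (fo - fe)^2 / fe over all car types
chi_sq = sum((fo - expected) ** 2 / expected for fo in observed)

critical = 11.1               # critical value quoted in the text (5% level)
reject_h0 = chi_sq > critical

print(f"chi-squared = {chi_sq:.2f}, reject H0: {reject_h0}")
```

The result differs from the 22.44 in the text only in the second decimal place, because the table rounds each term before adding.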


TOP PRICE

Car types   fo    fe   fo - fe   (fo - fe)²   (fo - fe)²/fe
Compact     6     6    0         0            0
Small       0     6    -6        36           6
Mid-size    14    6    8         64           10.67
Large       6     6    0         0            0
Sporty      6     6    0         0            0
Van         4     6    -2        4            0.67

Figure 14

fo: observed number of high-priced cars; fe: expected number of high-priced cars

From Figure 14, the computed χ² value is 17.34. This lies in the rejection region, beyond the critical value of 11.1. At the 0.05 level of significance we reject the null hypothesis and accept the alternate hypothesis. The difference between the observed and expected numbers of high-priced cars is large enough to be considered significant.


MPG TOWN

Car types   fo    fe   fo - fe   (fo - fe)²   (fo - fe)²/fe
Compact     9     8    1         1            0.125
Small       20    8    12        144          18
Mid-size    1     8    -7        49           6.125
Large       11    8    3         9            1.125
Sporty      7     8    -1        1            0.125
Van         0     8    -8        64           8

Figure 15

fo: observed number of high-mpg cars; fe: expected number of high-mpg cars

Figure 15 reveals that the computed χ² value is 33.5. This lies in the rejection region, beyond the critical value of 11.1. The null hypothesis is rejected at the 0.05 level of significance and we accept the alternate hypothesis.


MPG BEST

Car types   fo    fe     fo - fe   (fo - fe)²   (fo - fe)²/fe
Compact     9     6.33   2.67      7.13         1.13
Small       19    6.33   12.67     160.53       25.36
Mid-size    4     6.33   -2.33     5.43         0.86
Large       0     6.33   -6.33     40.01        6.32
Sporty      6     6.33   -0.33     0.11         0.02
Van         0     6.33   -6.33     40.01        6.32

Figure 16

fo: observed number of high-mpg cars; fe: expected number of high-mpg cars

Figure 16 reveals that the computed χ² value is 40.01. This lies in the rejection region, beyond the critical value of 11.1. The null hypothesis is rejected at the 0.05 level of significance and we accept the alternate hypothesis.


PART 2 AIM: PREDICTING MPG

(2) a. For this part we examine the relationship between MPG (miles per gallon) and other variables in the data set. MPG town seemed a suitable and reasonable measure to examine first.

A TABLE SHOWING A CORRELATION MATRIX ACROSS VEHICLE TYPES AND VARIABLES: MPG TOWN

Variable      Compact   Small    Mid-size   Large    Sporty   Van
HP            .021      -.574    -.742      -.335    -.776    .069
Length        .307      -.441    -.581      -.488    -.362    .123
Engine size   -.019     -.791    -.718      -.960    -.578    -.257
RPM           .031      .054     -.153      .870     -.040    .637
Weight        -.282     -.738    -.811      -.670    -.703    -.378

Figure 17

From Figure 17, the variables which seem to be relevant and reasonable determinants of MPG town are horsepower (hp), engine size (litres), maximum revolutions of engine per minute (RPM) and weight (pounds).

HORSEPOWER (hp)
From Figure 17, the coefficients of correlation r for MPG town and horsepower are as follows: compact 0.021, small -0.574, mid-size -0.742, large -0.335, sporty -0.776 and van 0.069. Sporty cars have the strongest negative correlation, so horsepower can be used to predict the MPG for sporty cars, while compact cars have the weakest correlation, with an r value quite close to 0.

ENGINE SIZE (litres)
Figure 17 shows that the coefficients of correlation r for MPG town and engine size are as follows: compact -0.019, small -0.791, mid-size -0.718, large -0.960, sporty -0.578 and van -0.257. Large cars have the strongest negative correlation, which means that there is a strong relationship between engine size and MPG town for large cars, so engine size can be used as an MPG determinant for large cars.

MAXIMUM REVOLUTIONS OF ENGINE (per minute)
Figure 17 shows that the coefficients of correlation r for MPG town and RPM are as follows: compact 0.031, small 0.054, mid-size -0.153, large 0.870, sporty -0.040 and van 0.637. Large cars have the strongest positive correlation and therefore a strong relationship between RPM and MPG town, so RPM can be used as a predictor of MPG town for large cars, unlike for compact cars, where r is close to 0.

WEIGHT (pounds)
From Figure 17, the coefficients of correlation r for MPG town and weight are as follows: compact -0.282, small -0.738, mid-size -0.811, large -0.670, sporty -0.703 and van -0.378. Mid-size cars show the strongest negative correlation among the car types, so there is a relationship between weight and MPG town for mid-size cars, and weight can be used as an MPG determinant for them.

SCATTER DIAGRAMS TO SHOW PREDICTION OF MPG TOWN AND RELEVANT VARIABLES


Graph 1

Graph 1 shows the coefficient of determination, r², as 0.1121. The equation of the line is y = -0.023x + 22.495, where y = MPG town and x = horsepower. Thus 11.2% of the variation in MPG town is explained by changes in horsepower. The slope shows that each unit increase in horsepower is associated with a decrease of 0.023 in MPG town.


Graph 2

Graph 2 shows the coefficient of determination, r², as 0.9224. The equation of the line is y = -1.8184x + 26.018, where y = MPG town and x = engine size. Thus 92.2% of the variation in MPG town is explained by changes in engine size. The slope shows that each one-litre increase in engine size is associated with a decrease of 1.8184 in MPG town. Engine size is therefore a very strong predictor of MPG town for large cars.
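The equation of the line in Graph 2 can be used directly for prediction. The sketch below plugs in the 4.1-litre average engine size quoted earlier for large cars and recovers the correlation coefficient from r², taking the negative root because the slope is negative. The function name `predict_mpg_town` is illustrative.

```python
import math


def predict_mpg_town(engine_size_litres, slope=-1.8184, intercept=26.018):
    """Predicted MPG town from the Graph 2 regression line for large cars."""
    return slope * engine_size_litres + intercept


# Prediction at the average engine size of large cars (4.1 litres)
predicted = predict_mpg_town(4.1)

# r is the square root of r², with the sign of the slope
r_squared = 0.9224
r = -math.sqrt(r_squared)

print(f"predicted MPG town = {predicted:.2f}, r = {r:.3f}")
```

The recovered r of about -0.960 agrees with the engine-size correlation for large cars in Figure 17, and the prediction is close to the large-car mean MPG town of 18.36 in Fig. 3. Predictions should only be made for engine sizes within the range of the sample.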

Graph 3

Graph 3 shows the coefficient of determination, r², as 0.6572. The equation of the line is y = -0.0048x + 35.803, where y = MPG town and x = weight. Thus 65.7% of the variation in MPG town is explained by changes in weight. The slope shows that each one-pound increase in weight is associated with a decrease of 0.0048 in MPG town. Weight is therefore a reasonably good predictor of MPG town for mid-sized cars.

A TABLE SHOWING A CORRELATION MATRIX ACROSS VEHICLE TYPES AND VARIABLES: MPG BEST

Variable      Compact   Small    Mid-size   Large    Sporty   Van
HP            -.053     -.546    -.685      -.222    -.724    .355
Length        .187      -.295    -.294      -.654    -.176    .296
Engine size   -.317     -.628    -.548      -.780    -.453    -.041
RPM           .149      -.014    -.254      .797     -.207    .549
Weight        -.452     -.620    -.688      -.745    -.612    -.198

Figure 18

From figure 18, variables which seem to be relevant and reasonable determinants of MPG best are horsepower (hp), engine size (litres), maximum revolutions of engine per minute (RPM) and weight (pounds). HORSEPOWER From figure 18, the coefficient of correlation r for MPG best and horsepower for car types are as follows, compact -0.053, small -0.546, mid-size -0.685, large cars -0.222, sporty cars -0.724 and van cars 0.355. Sporty cars have the strongest negative correlation therefore horsepower can be used to predict the MPG for sporty cars, while compact cars have the weakest positive correlation with an r value quite close to 0. ENGINE SIZE (litres) Figure 18 shows that the coefficient of correlation r for MPG best and engine size for car types are as follows, compact -0.317, small -0.628, mid-size -0.548, large -0.780, sporty -0.433 and van cars -0.041, this means that there is a relationship between engine size and large cars since they have the strongest negative correlation thus can be used as an MPG determinant for large cars.


WEIGHT (pounds)
From figure 18, the coefficients of correlation r between MPG best and weight by car type are: compact -0.452, small -0.620, mid-size -0.688, large -0.745, sporty -0.612 and van -0.198. Large cars show the strongest negative correlation between weight and MPG best of all the car types, so weight can be used as an MPG determinant for large cars, while vans show the weakest correlation.

MAXIMUM REVOLUTIONS OF ENGINE (per minute)
Figure 18 shows that the coefficients of correlation r between MPG best and RPM by car type are: compact 0.149, small -0.014, mid-size -0.254, large 0.797, sporty -0.207 and van 0.549. Large cars have the strongest positive correlation, and therefore a strong relationship between RPM and MPG best, so RPM can be used as a predictor of MPG best for large cars, unlike for compact cars.
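The comparisons above amount to finding, for each variable, the car type whose correlation with MPG best has the largest absolute value. A minimal sketch of that lookup is given below. The r values are copied from the correlation matrix; the dictionary layout and function names are assumptions made for illustration.

```python
# Correlations with MPG best, copied from the correlation matrix (figure 18).
CORRELATIONS = {
    "Compact":  {"HP": -0.053, "Length": 0.187, "Engine size": -0.317, "RPM": 0.149, "Weight": -0.452},
    "Small":    {"HP": -0.546, "Length": -0.295, "Engine size": -0.628, "RPM": -0.014, "Weight": -0.620},
    "Mid-size": {"HP": -0.685, "Length": -0.294, "Engine size": -0.548, "RPM": -0.254, "Weight": -0.688},
    "Large":    {"HP": -0.222, "Length": -0.654, "Engine size": -0.780, "RPM": 0.797, "Weight": -0.745},
    "Sporty":   {"HP": -0.724, "Length": -0.176, "Engine size": -0.453, "RPM": -0.207, "Weight": -0.612},
    "Van":      {"HP": 0.355, "Length": 0.296, "Engine size": -0.041, "RPM": 0.549, "Weight": -0.198},
}


def strongest_type(variable: str) -> tuple[str, float]:
    """Return the (car type, r) pair with the largest |r| for the variable."""
    pairs = ((car_type, row[variable]) for car_type, row in CORRELATIONS.items())
    return max(pairs, key=lambda pair: abs(pair[1]))


def weakest_type(variable: str) -> tuple[str, float]:
    """Return the (car type, r) pair with the smallest |r| for the variable."""
    pairs = ((car_type, row[variable]) for car_type, row in CORRELATIONS.items())
    return min(pairs, key=lambda pair: abs(pair[1]))


for variable in ("HP", "Engine size", "RPM", "Weight"):
    print(variable, "strongest:", strongest_type(variable), "weakest:", weakest_type(variable))
```

Ranking by absolute value treats a strong negative and a strong positive correlation as equally useful for prediction; the sign only indicates the direction of the relationship.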

SCATTER DIAGRAMS TO SHOW PREDICTION OF MPG BEST AND RELEVANT VARIABLES


Graph 4

Graph 4 shows a coefficient of determination, r2, of 0.609 for the equation y = -1.2518x + 31.996, where y = MPG best and x = engine size. About 60.9% of the variation in MPG best is explained by changes in engine size. The graph shows that each unit increase in engine size is associated with a decrease of about 1.25 in MPG best. Engine size is therefore a moderately good predictor of MPG best for large cars.


Graph 5

Graph 5 shows a coefficient of determination, r2, of 0.0028 for the equation y = -0.068x + 30.768, where y = MPG best and x = horsepower. Only 0.28% of the variation in MPG best is explained by changes in horsepower. The intercept implies that when horsepower is 0, MPG best is 30.768, which has no practical meaning. The slope implies that each unit (hp) increase in horsepower is associated with a decrease of 0.068 in MPG best. Horsepower therefore cannot be used to determine MPG best for compact cars.

Graph 6

Graph 6 shows a coefficient of determination, r2, of 0.0392 for the equation y = -0.0019x + 29.036, where y = MPG best and x = weight. Only 3.9% of the variation in MPG best is explained by changes in weight. The intercept implies that when weight is 0, MPG best is 29.036, which has no practical meaning. The slope implies that each unit (pound) increase in weight is associated with a decrease of 0.0019 in MPG best. This clearly shows that weight is a poor predictor of MPG best for vans.
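For a straight-line fit with a single predictor, the coefficient of determination is the square of the correlation coefficient, so the r values in the correlation matrix should approximately reproduce the r2 shown on the corresponding scatter diagram. The sketch below checks this for Graphs 4 to 6. The r values are taken from the correlation matrix; the list layout and function name are assumptions for illustration.

```python
# (graph, car type, variable, r from the MPG best correlation matrix)
CASES = [
    ("Graph 4", "large", "engine size", -0.780),
    ("Graph 5", "compact", "horsepower", -0.053),
    ("Graph 6", "van", "weight", -0.198),
]


def r_squared(r: float) -> float:
    """Coefficient of determination for a one-predictor linear fit."""
    return r * r


for graph, car_type, variable, r in CASES:
    r2 = r_squared(r)
    print(f"{graph} ({car_type}, {variable}): r = {r:+.3f}, "
          f"r2 = {r2:.4f}, {r2:.2%} of variation explained")
```

Small differences from the values on the graphs are expected because the matrix reports r to only three decimal places.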


CONCLUSION

For the first aim, the analysis of the data set shows that, across all the car types, mid-size cars have the highest mean prices, because they have the most features, such as a large number of airbags and large engine sizes. Both median prices are unaffected by extremely low or high prices. All the car types are positively skewed except vans, because these car types have a high average number of 14.4 airbags whereas vans have a low average of 3 airbags. The coefficient of variation shows that all car types have values spread away from the mean, with extreme values, except for vans and large cars, whose values lie close to the mean.

Of all the car types, small cars have the highest mean mpg, mainly because they have small engine sizes. Both mpg medians are unaffected by extremely high or low mpg values. The mpg variables show all the types of skewness (symmetric, positive and negative), which is attributed to the small, medium-sized and relatively larger engines respectively. The coefficient of variation shows that the mpg variables for all the car types are spread away from the mean, that is, there are extremely high and low mpg values. The difference of means tests conducted found no significant differences, which is consistent with these car types coming from the same population. The confidence intervals show the range in which the true population mean is likely to lie. The chi-square tests used to test the apparent pattern in the probabilities show that the difference between the observed and expected frequencies is large enough to be considered significant.

For the second aim, the correlation matrix shows that sporty cars have the strongest negative correlation between horsepower and mpg, so horsepower can be used to predict mpg for sporty cars, whereas compact cars have the weakest correlation, with an r value quite close to zero. Engine size can be used as a determinant of mpg for large cars, since there is a strong negative correlation between engine size and mpg for large cars. Large cars also have a strong positive correlation between maximum revolutions of the engine per minute and mpg town. Another significant relationship that can be used to predict mpg town is the strong negative correlation between weight and mpg town for large cars. From the equations of the scatter diagrams showing the relationship between different variables and mpg town, 11.2% of the variation in mpg town is explained by changes in horsepower for large cars, and 92% of the variation in mpg town is explained by engine size for large cars.

APPENDICES

BASIC PRICE

TOP PRICE


MPG TOWN


MPG BEST

The Excel Analysis ToolPak was used to obtain the descriptive statistics measures of mean, median, mode and skewness. The coefficient of variation was calculated by hand by applying the formula coefficient of variation = standard deviation / mean.
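The hand calculation can be sketched in a few lines. The example below applies the formula to the mean and standard deviation listed in the first basic price confidence interval table of this appendix; the function name is an assumption for illustration.

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Return the standard deviation expressed as a fraction of the mean."""
    return std_dev / mean


# Mean and standard deviation from the first basic price
# confidence interval table in the appendix.
cv = coefficient_of_variation(std_dev=5.87315574, mean=15.69375)
print(f"coefficient of variation = {cv:.4f} ({cv:.1%})")
```

Expressing the spread relative to the mean makes it possible to compare variability between car types whose average prices or mpg values differ.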

CONFIDENCE INTERVAL
BASIC PRICE

compact

Confidence interval - mean


confidence level          95%
mean                      15.69375
std. dev.                 5.87315574
n                         16
z                         1.960
half-width                2.8778
upper confidence limit    18.5715
lower confidence limit    12.8160

small

Confidence interval - mean


confidence level          95%
mean                      11.9047619
std. dev.                 2.803297378
n                         21
z                         1.960
half-width                1.1990
upper confidence limit    13.1037
lower confidence limit    10.7058

midsize

Confidence interval - mean


confidence level          95%
mean                      24.9363636
std. dev.                 10.15233004
n                         22
z                         1.960
half-width                4.2423
upper confidence limit    29.1787
lower confidence limit    20.6941

large

Confidence interval - mean


confidence level          95%
mean                      22.9363634
std. dev.                 6.260714452
n                         11
z                         1.960
half-width                3.6998
upper confidence limit    26.6361
lower confidence limit    19.2366

sporty

Confidence interval - mean


confidence level          95%
mean                      16.85714286
std. dev.                 7.895345687
n                         14
z                         1.960
half-width                4.1358
upper confidence limit    20.9929
lower confidence limit    12.7214

van

Confidence interval - mean


confidence level          95%
mean                      16.2
std. dev.                 2.027929979
n                         9
z                         1.960
half-width                1.325
upper confidence limit    17.525
lower confidence limit    14.875
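Each z-based interval in this appendix follows the same arithmetic: half-width = z × std. dev. / √n, with the limits at mean ± half-width. The sketch below reproduces the basic price interval for vans from its mean, standard deviation and sample size. The function name is an assumption, and the z value is computed from the standard normal distribution rather than rounded to 1.960, so the last decimal place may differ slightly from the tables.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean: float, std_dev: float, n: int, confidence: float = 0.95):
    """Return (half-width, upper limit, lower limit) of a z-based interval."""
    # Two-sided critical value, about 1.960 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * std_dev / sqrt(n)
    return half_width, mean + half_width, mean - half_width


# Basic price, van: mean 16.2, std. dev. 2.027929979, n = 9.
half_width, upper, lower = z_interval(16.2, 2.027929979, 9)
print(f"half-width = {half_width:.3f}, upper = {upper:.3f}, lower = {lower:.3f}")
```

The same function can be applied to any other row by substituting its mean, standard deviation and n.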

TOP PRICE

compact

Confidence interval - mean


confidence level          95%
mean                      20.725
std. dev.                 7.960947
n                         16
z                         1.960
half-width                3.901
upper confidence limit    24.626
lower confidence limit    16.824

small

Confidence interval - mean


confidence level          95%
mean                      11.90476
std. dev.                 2.803297
n                         21
z                         1.960
half-width                1.1990
upper confidence limit    13.1037
lower confidence limit    10.7058

mid-size

Confidence interval - mean


confidence level          95%
mean                      30.31364
std. dev.                 15.08554
n                         22
z                         1.960
half-width                6.3037
upper confidence limit    36.6174
lower confidence limit    24.0099

large

Confidence interval - mean


confidence level          95%
mean                      25.67273
std. dev.                 6.668747
n                         21
z                         1.960
half-width                2.8522
upper confidence limit    28.5249
lower confidence limit    22.8205

sporty

Confidence interval - mean


confidence level          95%
mean                      21.95714
std. dev.                 8.573099
n                         14
z                         1.960
half-width                4.4908
upper confidence limit    26.4479
lower confidence limit    17.4664

van

Confidence interval - mean


confidence level          95%
mean                      22.03333
std. dev.                 3.009153
n                         9
z                         1.960
half-width                1.9659
upper confidence limit    23.9993
lower confidence limit    20.0674

MPG TOWN

compact

Confidence interval - mean


confidence level          95%
mean                      22.6875
std. dev.                 1.922455028
n                         16
z                         1.960
half-width                0.9420
upper confidence limit    23.6295
lower confidence limit    21.7455

small

Confidence interval - mean


confidence level          95%
mean                      29.85714286
std. dev.                 6.109711239
n                         21
z                         1.960
half-width                2.6131
upper confidence limit    32.4703
lower confidence limit    27.2440

mid-size

Confidence interval - mean


confidence level          95%
mean                      19.545455
std. dev.                 1.895540449
n                         22
z                         1.960
half-width                0.7921
upper confidence limit    20.3375
lower confidence limit    18.7534

large

Confidence interval - mean


confidence level          95%
mean                      18.36363636
std. dev.                 1.501514387
n                         11
z                         1.960
half-width                0.8873
upper confidence limit    19.2510
lower confidence limit    17.4763

sporty

Confidence interval - mean


confidence level          95%
mean                      21.78571429
std. dev.                 3.906179944
n                         14
z                         1.960
half-width                2.0461
upper confidence limit    23.8319
lower confidence limit    19.7396

van

Confidence interval - mean


confidence level          95%
mean                      17
std. dev.                 1.224744871
n                         9
z                         1.960
half-width                0.800
upper confidence limit    17.800
lower confidence limit    16.200

MPG BEST

compact

Confidence interval - mean


confidence level          95%
mean                      29.875
std. dev.                 2.941088234
n                         16
z                         1.960
half-width                1.441
upper confidence limit    31.316
lower confidence limit    28.434

small

Confidence interval - mean


confidence level          95%
mean                      35.47619048
std. dev.                 5.60909126
n                         21
z                         1.960
half-width                2.3990
upper confidence limit    37.8752
lower confidence limit    33.0772

mid-size

Confidence interval - mean


confidence level          95%
mean                      26.72727273
std. dev.                 1.272077756
n                         22
z                         1.960
half-width                0.5316
upper confidence limit    27.2588
lower confidence limit    26.1957

large

Confidence interval - mean


confidence level          95%
mean                      26.72727273
std. dev.                 1.272077756
n                         11
z                         1.960
half-width                0.7517
upper confidence limit    27.4790
lower confidence limit    25.9755

sporty

Confidence interval - mean


confidence level          95%
mean                      28.78571429
std. dev.                 3.641186861
n                         14
t (df = 13)               2.160
half-width                2.1024
upper confidence limit    30.8881
lower confidence limit    26.6834


van

Confidence interval - mean


confidence level          95%
mean                      21.88888889
std. dev.                 1.452966315
n                         9
z                         1.960
half-width                0.9493
upper confidence limit    22.8381
lower confidence limit    20.9396
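The MPG best interval for sporty cars is the only one in this appendix that uses a t critical value (2.160, df = 13) instead of z = 1.960. The sketch below compares the two half-widths for that row. Both critical values are taken as given from the tables rather than computed, and the function name is an assumption for illustration.

```python
from math import sqrt

T_CRITICAL_DF13 = 2.160  # critical value as printed in the sporty table
Z_CRITICAL = 1.960       # critical value used in the other tables


def half_width(critical_value: float, std_dev: float, n: int) -> float:
    """Return critical value times the standard error of the mean."""
    return critical_value * std_dev / sqrt(n)


# MPG best, sporty: std. dev. 3.641186861, n = 14.
std_dev, n = 3.641186861, 14
t_half = half_width(T_CRITICAL_DF13, std_dev, n)
z_half = half_width(Z_CRITICAL, std_dev, n)
print(f"t-based half-width = {t_half:.4f}")
print(f"z-based half-width = {z_half:.4f}")
```

The t-based interval is wider because the t distribution allows for the extra uncertainty of estimating the standard deviation from a small sample, so intervals computed with t and with z are not directly comparable in width.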
