
STATISTICS - PRACTICAL SESSION 2

Giulia Marcon
1) Topics: Chapters 7 and 9
Measures of variability: range, variance, standard deviation, coefficient of variation
Relations between two variables: two-entry table, conditional frequencies, side-by-side chart.
2) Exercises:
Exercise 1.
A student examined all bars and coffee shops near the University and the following variables
were collected:
SEAT      is it easy to find a seat? (0 = NO, 1 = YES)
WINDOWS   number of (shop) windows
PRICE     price of the preferred sandwich
DIST      distance from the University (in meters)
SEAT   WINDOWS   PRICE   DIST     DIST²
                 2.8      650    422500
                 3.5      500    250000
                          700    490000
                 2.4      850    722500
                 3.2      200     40000
                          400    160000
                 4.5      350    122500
                 4.2      150     22500
                 3.6      200     40000
Total       18   31.2    4000   2270000

a) Compute the range and the standard deviation of the variable Distance (DIST).
b) What is the most adequate measure of central tendency (central position) to describe the
variable PRICE?
c) Use an adequate analysis (on the basis of graphical representations and/or tables and/or
statistical measures) to show if finding a seat easily depends on the number of windows of
the bars. Briefly explain your results.

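a) X = DIST
Range = Max - Min = 850 - 150 = 700

\[ \mu_X = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{4000}{9} = 444.4444 \]

\[ \sigma_X^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu_X^2 = \frac{2270000}{9} - 444.4444^2 = 54691.3580 \]

\[ \sigma_X = \sqrt{\sigma_X^2} = \sqrt{54691.3580} = 233.8618 \]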
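
The calculation in part a) can be checked numerically. The following is a minimal Python sketch, assuming the nine DIST values listed in the exercise table and the population (divide by N) form of the variance used in the solution; the variable names are illustrative only.

```python
import math

# DIST values (meters) as listed in the exercise table
dist = [650, 500, 700, 850, 200, 400, 350, 150, 200]

n = len(dist)
value_range = max(dist) - min(dist)
mean = sum(dist) / n
# Population variance via the shortcut: mean of squares minus squared mean
variance = sum(x * x for x in dist) / n - mean ** 2
std_dev = math.sqrt(variance)

print(f"range    = {value_range}")
print(f"mean     = {mean:.4f}")
print(f"variance = {variance:.4f}")
print(f"std dev  = {std_dev:.4f}")
```

The shortcut formula is used here only because it mirrors the DIST² column of the table; summing squared deviations from the mean gives the same variance.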

b) No mode can be observed in the data. The median would be preferable to the mean in the
presence of anomalous values or of an asymmetric distribution.
Here there is no reason to prefer the median to the mean (equal to 3.4667) as a
measure of central tendency. Conversely, the context does not allow us to conclude
that the mean is preferable to the median. The two values are very close to each other, and
either of them is an adequate summary measure for the PRICE variable.
c)
Contingency table (absolute frequencies)

Number of        Seat
windows      No    Yes   Total
1             1      2       3
2             1      2       3
3             1      2       3
Total         3      6       9

Contingency table (conditional row frequencies)

Number of        Seat
windows      No       Yes      Total
1            0.3333   0.6667   1.0
2            0.3333   0.6667   1.0
3            0.3333   0.6667   1.0
Total        0.3333   0.6667   1.0
[Figure: side-by-side chart. Horizontal axis: N. of windows. Vertical axis: Rel. Freq. (0 to 0.8). Series: "Not easy to find a seat" and "Easy to find a seat".]

The contingency table alone is sufficient to show that the relative frequency of not finding a
seat easily is the same (33.33%) for bars with 1, 2 or 3 windows, and likewise for finding one
easily (66.67%). Finding a seat easily is therefore independent of the number of windows.
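
The independence check above amounts to comparing each conditional distribution of SEAT with its marginal distribution. Below is a small Python sketch of that comparison. The absolute counts are an assumption: they are the only counts consistent with 9 bars, window values 1, 2 and 3, and row frequencies of 0.3333 and 0.6667. The names `counts`, `row_freqs` and `marginal` are illustrative.

```python
# Absolute frequencies by number of windows: (not easy, easy).
# Assumption: counts reconstructed from the 9 bars and the row frequencies.
counts = {1: (1, 2), 2: (1, 2), 3: (1, 2)}

# Conditional distribution of SEAT within each WINDOWS group
row_freqs = {
    w: tuple(c / sum(row) for c in row)
    for w, row in counts.items()
}

# Marginal distribution of SEAT over all bars
total = sum(sum(row) for row in counts.values())
marginal = tuple(
    sum(row[j] for row in counts.values()) / total for j in range(2)
)

# Independence holds when every conditional distribution equals the marginal
independent = all(
    all(abs(f - m) < 1e-12 for f, m in zip(freqs, marginal))
    for freqs in row_freqs.values()
)

for w, freqs in row_freqs.items():
    print(w, [round(f, 4) for f in freqs])
print("marginal", [round(m, 4) for m in marginal])
print("independent:", independent)
```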

Exercise 2.

In a population of 200 married couples, information on the number of children (X) and the yearly
income of the couple in thousands of Euros (Y) was collected. The resulting data are summarized
in the following two-way table:
Y\X          0     1     2
[0, 30)     10    50    60
[30, 60)     4    20    36
[60, 90)     6     8     6
a) Determine the frequency distribution of variable Number of children and provide an
appropriate graphical representation.
b) Compute the means of variable Number of children within the subpopulations obtained
for the different values of variable Yearly income.
c) Calculate the mean and variance for the two variables X and Y. Compare the variability of
the two variables by using an appropriate index.

a)
X        Frequency
0            20
1            78
2           102
Total       200

Draw a bar chart using absolute frequencies.


b)
\[ \bar{x}_{[0,30)} = \frac{50 + 120}{120} = 1.4167 \]

\[ \bar{x}_{[30,60)} = \frac{20 + 72}{60} = 1.5333 \]

\[ \bar{x}_{[60,90)} = \frac{8 + 12}{20} = 1 \]
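
The conditional means in part b) can be reproduced with a few lines of Python. This is only a sketch: the cell counts are an assumption, chosen to be consistent with the numerators and class totals (120, 60 and 20) used in the solution, and the names `table` and `conditional_mean` are hypothetical.

```python
# Two-way table: income class -> counts of couples with 0, 1, 2 children.
# Assumption: cell counts consistent with the totals used in the solution.
table = {
    "[0,30)": (10, 50, 60),
    "[30,60)": (4, 20, 36),
    "[60,90)": (6, 8, 6),
}
children = (0, 1, 2)


def conditional_mean(counts):
    """Mean number of children within one income class."""
    return sum(x * n for x, n in zip(children, counts)) / sum(counts)


cond_means = {income: conditional_mean(c) for income, c in table.items()}
for income, m in cond_means.items():
    print(f"{income}: {m:.4f}")
```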

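c) For Y, each income class is represented by its midpoint (15, 45 and 75).

\[ \bar{x} = 1.41 \qquad \bar{y} = 30 \]

\[ s_X^2 = \frac{(0 - 1.41)^2 \cdot 20 + (1 - 1.41)^2 \cdot 78 + (2 - 1.41)^2 \cdot 102}{200} = 0.4419 \qquad s_X = 0.6648 \]

\[ s_Y^2 = \frac{(15 - 30)^2 \cdot 120 + (45 - 30)^2 \cdot 60 + (75 - 30)^2 \cdot 20}{200} = 405 \qquad s_Y = 20.1246 \]

\[ C.V._X = \frac{0.6648}{1.41} = 0.4715 \qquad C.V._Y = \frac{20.1246}{30} = 0.6708 \]

The variable Y fluctuates more than variable X, since its coefficient of variation is larger.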
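
The figures in part c) can be verified with a short Python sketch. It assumes the frequency distribution of X from part a), the class midpoints 15, 45 and 75 for Y, and the population (divide by N) variance used in the solution; the helper `grouped_stats` is a hypothetical name introduced for illustration.

```python
import math


def grouped_stats(values, freqs):
    """Return (mean, population variance, coefficient of variation)."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    var = sum((v - mean) ** 2 * f for v, f in zip(values, freqs)) / n
    return mean, var, math.sqrt(var) / mean


# X: number of children with its absolute frequencies
mean_x, var_x, cv_x = grouped_stats([0, 1, 2], [20, 78, 102])
# Y: income classes represented by their midpoints (thousands of Euros)
mean_y, var_y, cv_y = grouped_stats([15, 45, 75], [120, 60, 20])

print(f"X: mean={mean_x:.2f} var={var_x:.4f} CV={cv_x:.4f}")
print(f"Y: mean={mean_y:.2f} var={var_y:.4f} CV={cv_y:.4f}")
```

The coefficient of variation is the appropriate index here because X and Y are measured in different units, so their standard deviations are not directly comparable.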
