
Maria Durban 2011

Chapter 2
Introduction to Random Variables
In the previous chapter we introduced the concept of an event to describe characteristics of the outcomes of experiments. Now we introduce a new concept, the random variable, which will allow events to be defined in a more consistent way: they will always be numerical.
1 Definition of a random variable
A random variable is essentially a random number. For example, suppose a coin is tossed three times and the results are observed; then:

E = {hhh, hht, hth, thh, tth, tht, htt, ttt}.

Examples of random variables would be: i) total number of heads, ii) number of heads minus number of tails, etc. Each of these functions assigns a real number to each element of E. Since each outcome of E is random, the corresponding real number is also random.
Definition
A random variable is a function which associates a real number to each element of the sample
space.
Random variables are represented by capital letters, generally the last letters of the alphabet: X, Y, Z, etc., and the values taken by the variable are represented by lowercase letters.
Examples of random variables are:
Number of defective units in a random sample of 5 units
Number of faults per cm2 of material
Lifetime of a lamp
Resistance to compression of concrete
[Figure: the random variable X maps each element of the sample space E to a real number in R_X; e.g. X(s_i) = b and X(s_k) = a.]
The space R_X is the set of ALL possible values of X(s). Each possible event of E has an associated value in R_X. Then, we can consider R_X as another sample space. The elements of E have a probability distribution, and this distribution is also associated with the values of the variable X. That is, every random variable preserves the probability structure of the random experiment that generates it:

P(X = x) = P(s ∈ E : X(s) = x)
2 Types of random variables

The range of a random variable is the set of possible values taken by the variable. Depending on the range, variables can be classified as:

1. Discrete: those that take a finite or infinite (countable) number of values (generally they count the number of times that something happens). For example:

Number of faults on a glass surface
Proportion of defective parts in a sample of 1000
Number of bits transmitted and received correctly

2. Continuous: those whose range is an interval of real numbers (generally they measure a magnitude). For example:

Electric current
Temperature
Weight
2.1 Discrete random variables

The values taken by a random variable change from one experiment to another, since the results of the experiment are different. A discrete random variable is defined by:

1. The values that it takes.
2. The probability of taking each value.
Then, a function p, called the probability function, is defined as the function that assigns a probability to each value of the random variable:

p(x_i) = P(X = x_i),    i = 1, 2, . . .
The probability function has several properties derived from the axioms of probability:
1. 0 ≤ p(x_i) ≤ 1

2. ∑_{i=1}^{n} p(x_i) = 1

3. If a < b < c,

P(a ≤ X ≤ c) = P(a ≤ X ≤ b) + P(b < X ≤ c)
Properties 1 and 2 are immediate from axioms 1 and 2 of probability. Property 3 is also immediate if we define A = {a ≤ X ≤ b} and B = {b < X ≤ c} and apply the third axiom.
Example
Suppose we toss two coins and define the random variable X = number of tails. The sample space has 4 elements, E = {HH, HT, TH, TT}, and the random variable X takes 3 values: 0, 1, 2.
[Figure: each outcome HH, HT, TH, TT in E is mapped by X to one of the values 0, 1, 2 in R_X.]
Each element of E has a certain probability, and it is associated with a value of the random variable. Therefore, the probability function for this random variable is:

x    p(x)
0    1/4
1    1/2
2    1/4
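The table above can be reproduced by enumerating the sample space. A minimal Python sketch (the code and its names are ours, not part of the original notes):

```python
from itertools import product
from fractions import Fraction

# Sample space of two coin tosses; each of the 4 outcomes has probability 1/4.
E = list(product("HT", repeat=2))

# X = number of tails; accumulate P(X = x) over the outcomes.
p = {}
for outcome in E:
    x = outcome.count("T")
    p[x] = p.get(x, Fraction(0)) + Fraction(1, len(E))

# p is {0: 1/4, 1: 1/2, 2: 1/4}, matching the table
assert sum(p.values()) == 1   # second property of a probability function
```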
In addition to the probability function, it is sometimes convenient to use the distribution function, which is defined as:

F_X(x) = P(X ≤ x),    −∞ < x < ∞.

The distribution function is non-decreasing and satisfies:

F_X(−∞) = 0,    F_X(+∞) = 1.
x    F_X(x)
0    1/4
1    1/4 + 1/2 = 3/4
2    1/4 + 1/2 + 1/4 = 1
In the previous example, note that the function jumps wherever p(x_i) > 0, and at each x_i the size of the jump is p(x_i).
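This jump property can be checked directly in code; the following sketch (ours, for illustration) builds F from the probability function of the coin example:

```python
from fractions import Fraction

# probability function of X = number of tails in two coin tosses
p = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def F(x):
    """Distribution function F(x) = P(X <= x)."""
    return sum(px for value, px in p.items() if value <= x)

# F jumps at each x_i and the size of the jump is exactly p(x_i);
# since X is integer-valued here, F(x_i^-) = F(x_i - 1).
for xi, pxi in p.items():
    assert F(xi) - F(xi - 1) == pxi
```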
2.2 Continuous random variables

In the case of continuous random variables, it doesn't make sense to calculate the probability that the random variable takes a single value, X = x, since that probability is 0. Also, the second property of the probability function, ∑_{i=1}^{n} p(x_i) = 1, does not apply here, since the set of values taken by the random variable is uncountable. We know that the generalization of the concept of sum (∑) is the integral (∫) (remember Riemann's integral), and so it is possible to introduce a new concept that plays the role of the probability function in the case of continuous random variables. This is the density function, which is an integrable function that satisfies two properties:

f(x) ≥ 0,    ∫_{−∞}^{+∞} f(x) dx = 1
Then, if X is a random variable with density function f_X(x), for any a < b the probability that X falls in the interval (a, b) is the area under the density function between a and b:

P(a < X < b) = ∫_a^b f(x) dx
The density function doesn't have to be symmetric, or be defined for all values. The shape of the curve will depend on one or more parameters. From the definition of the density function, it is easy to see that:

P(X = a) = ∫_a^a f(x) dx = 0

P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b)
Example

The density function of the use of a machine in a year (in hours × 100) is:

f(x) =
    (0.4/2.5) x          if 0 < x < 2.5
    0.8 − (0.4/2.5) x    if 2.5 ≤ x < 5
    0                    otherwise
What is the probability that a machine randomly selected has been used less than 320 hours?

P(X < 3.2) = ∫_0^{2.5} (0.4/2.5) x dx + ∫_{2.5}^{3.2} (0.8 − (0.4/2.5) x) dx = 0.74
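The value 0.74 can be confirmed numerically; this is an illustrative sketch (Python and the helper names are ours), using a plain midpoint rule since the density is piecewise linear:

```python
def f(x):
    """Density of yearly machine use (in hours x 100), from the example above."""
    if 0 < x < 2.5:
        return (0.4 / 2.5) * x
    if 2.5 <= x < 5:
        return 0.8 - (0.4 / 2.5) * x
    return 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint rule; accurate enough for this piecewise-linear density."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

total = integrate(f, 0, 5)    # area under the density: should be 1
prob = integrate(f, 0, 3.2)   # P(X < 3.2): approximately 0.7408
```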
As in the case of discrete random variables, we can define the distribution of a continuous random variable by means of the distribution function:

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du,    −∞ < x < ∞

In the discrete case, the probability function is obtained as the difference of adjacent values of F(x). In the case of continuous variables:

f(x) = dF(x)/dx

The distribution function satisfies the following properties:

1. a < b ⇒ F(a) ≤ F(b).
2. F(−∞) = 0 and F(+∞) = 1.
Example

The density function of the use of a machine in a year (in hours × 100) is:

f(x) =
    (0.4/2.5) x          if 0 < x < 2.5
    0.8 − (0.4/2.5) x    if 2.5 ≤ x < 5
    0                    otherwise

What is the distribution function?
F(x) =
    0                                                                  x ≤ 0
    ∫_0^x (0.4/2.5) u du                                               0 < x < 2.5
    ∫_0^{2.5} (0.4/2.5) u du + ∫_{2.5}^x (0.8 − (0.4/2.5) u) du        2.5 ≤ x < 5
    1                                                                  x ≥ 5

 =
    0                        x ≤ 0
    0.08 x²                  0 < x < 2.5
    −1 + 0.8x − 0.08x²       2.5 ≤ x < 5
    1                        x ≥ 5

(in the third branch, the first integral is P(X ≤ 2.5) and the second is P(2.5 < X ≤ x)).

3 Characteristic measures of a random variable

A random variable is characterized by its probability or density function and its distribution function. But there are also several measures that help to describe it.
3.1 Measures of Central Tendency
Central tendency of a random variable is the concentration of its values in the central part of the
distribution. A measure of central tendency is a value of the random variable which is representative
of the entire distribution of the variable. There are three main measures of central tendency:
1. Expected value or mean
2. Median
3. Mode
Expected value or mean of a random variable
The concept of expected value of a random variable is closely related to the notion of weighted average. In the case of a sample of data, the sample mean allocates a weight of 1/n to each value:

x̄ = (1/n) x₁ + (1/n) x₂ + . . . + (1/n) xₙ,
while the possible values of a random variable are weighted by their probabilities:

μ = E[X] = ∑_i x_i p(x_i)    for discrete random variables

μ = E[X] = ∫_{−∞}^{+∞} x f(x) dx    for continuous random variables
In the example of section 2.1, what would be the expected number of tails when a coin is tossed
twice?
E[X] = ∑_i x_i p(x_i) = 0 · (1/4) + 1 · (1/2) + 2 · (1/4) = 1
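The same weighted sum can be written in a couple of lines of Python (an illustration of ours, reusing the coin-toss probability function):

```python
from fractions import Fraction

# probability function of X = number of tails in two coin tosses
p = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# E[X] = sum_i x_i p(x_i): each value of X weighted by its probability
mean = sum(x * px for x, px in p.items())
assert mean == 1
```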
In the example of section 2.2, what would be the average time of use of a machine?

E[X] = ∫_{−∞}^{+∞} x f(x) dx = ∫_0^{2.5} (0.4/2.5) x² dx + ∫_{2.5}^{5} (0.8x − (0.4/2.5) x²) dx = 2.5
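In the continuous case the weighted sum becomes an integral of x f(x); a numeric sketch (ours, with a midpoint rule) recovers the same value:

```python
def f(x):
    """Triangular density of the machine-use example (0.16 = 0.4/2.5)."""
    if 0 < x < 2.5:
        return 0.16 * x
    if 2.5 <= x < 5:
        return 0.8 - 0.16 * x
    return 0.0

# E[X] = integral of x f(x) dx over (0, 5), approximated with a midpoint rule
n = 100_000
h = 5.0 / n
mean = sum((k + 0.5) * h * f((k + 0.5) * h) for k in range(n)) * h
# mean is approximately 2.5, matching the exact computation above
```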
Median
Intuitively, the median is the value that divides the total probability into two equal parts. Formally, the value X = x of the random variable for which the distribution function F(x) = 1/2 is called the median of X.

In the continuous case, this value always exists and is defined as the value m such that:

∫_{−∞}^{m} f(u) du = 1/2.
However, for discrete r.v. this might not be the case. If this happens, the median is defined as

m = x_k    such that    F(x_{k−1}) < 0.5 and F(x_k) ≥ 0.5,

where x_{k−1} and x_k are two consecutive values of X.
Also, other measures can be given using the distribution function, for example, quartiles:

F(Q₁) = 1/4    F(Q₂) = 1/2 (= median)    F(Q₃) = 3/4
In the example of section 2.2, if we want to know the time of use such that 50% of the machines have a use less than or equal to that value, we can use the distribution function that we calculated and solve the equation

F(m) = 0.5
Since the distribution function is defined differently in each interval, we need to solve the above equation for m in each interval:

0.08 m² = 0.5  ⇒  m = 2.5

−1 + 0.8m − 0.08 m² = 0.5  ⇒  m = 2.5

In this case, the value obtained is the same in both intervals. This is generally not the case; in most cases, only one of the equations has a valid solution.
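The two equations above are just quadratics; a small sketch (ours) solves each branch and keeps only roots inside that branch's interval:

```python
import math

# Solve F(m) = 0.5 on each branch of the distribution function of the
# machine-use example: F(x) = 0.08 x^2 on (0, 2.5),
# F(x) = -1 + 0.8 x - 0.08 x^2 on [2.5, 5).

# Branch 1: 0.08 m^2 = 0.5
m1 = math.sqrt(0.5 / 0.08)

# Branch 2: -1 + 0.8 m - 0.08 m^2 = 0.5, i.e. 0.08 m^2 - 0.8 m + 1.5 = 0;
# keep the root lying in [2.5, 5) (small tolerance for rounding error).
a, b, c = 0.08, -0.8, 1.5
disc = math.sqrt(b * b - 4 * a * c)
roots = [(-b - disc) / (2 * a), (-b + disc) / (2 * a)]
m2 = min(r for r in roots if 2.5 - 1e-9 <= r < 5)

# Both branches give the median m = 2.5
```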
Mode
In the case of a discrete random variable, a mode of X is a number m such that P(X = m) is
largest. This is the most likely value of X or one of the most likely values if X has several values
with the same largest probability. For a continuous random variable, a mode is a number m such
that the probability density function is highest at x = m.
Comparison of location measures
Each of the measures described above has a slightly different meaning. Suppose that you are betting on the outcome of an experiment:
If you choose the mode, you are essentially betting on the single most likely outcome of the
experiment.
If you choose the median, you are equally likely to be larger or smaller than the outcome.
If you choose the mean, you have the best chance of being closest to the outcome.
3.2 Measures of dispersion
When speaking of the variability of a random variable, we generally mean the range of values that
would commonly (i.e., most probably) be observed in an experiment. This property is called the
dispersion of the random variable. Dispersion refers to the range of values that are commonly
assumed by the variable. Experiments that produce outcomes that are highly variable will be
more likely to give values that are farther from the mean than similar experiments that are not as
variable. In other words, probability distributions tend to be broader as the variability increases.
Variance and standard deviation
They are measures of how dispersed the distribution of a random variable is about its center. The variance of a random variable X, Var[X] or σ²_X, is the expected value of (X − E[X])², the squared deviation of X from its mean value:

Var[X] = E[(X − μ)²],    where μ = E[X],

and the standard deviation, σ, is the square root of the variance.

σ²_X = Var[X] = ∑_i (x_i − μ)² p(x_i)    for discrete random variables

σ²_X = Var[X] = ∫_{−∞}^{+∞} (x − μ)² f(x) dx    for continuous random variables
Here, we have another weighted sum. The values being summed are the squared deviations of the
variable from the mean. The squared deviations indicate how far the value is from the mean value,
and the weights in the sum are the probabilities. Thus, broader probability distributions will tend
to have larger weights for values of X that have larger squared deviations (more distant from the mean).
The variance of X may also be calculated as:

Var[X] = E[X²] − (E[X])²

To prove it we use the fact that the expectation is a linear operator, and that E[X] is a constant that does not depend on X:

Var[X] = E[(X − E[X])²] = E[X² − 2X E[X] + (E[X])²]
       = E[X²] − 2 E[X] E[X] + (E[X])²
       = E[X²] − (E[X])²
Example
Let X be a random variable with density function:
f(x) = 2x, 0 x 1
The variance of X is:

Var[X] = E[X²] − (E[X])²

E[X] = ∫_{−∞}^{+∞} x f(x) dx = ∫_0^1 2x² dx = 2/3

E[X²] = ∫_{−∞}^{+∞} x² f(x) dx = ∫_0^1 2x³ dx = 1/2

Var[X] = 1/2 − 4/9 = 1/18
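The same computation can be checked numerically; a short sketch (ours, midpoint rule on [0, 1]):

```python
def f(x):
    """Density f(x) = 2x on [0, 1], from the example above."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, n=100_000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return sum(g((k + 0.5) * h) for k in range(n)) * h

EX = integrate(lambda x: x * f(x))       # E[X]  = 2/3
EX2 = integrate(lambda x: x * x * f(x))  # E[X^2] = 1/2
var = EX2 - EX ** 2                      # 1/2 - 4/9 = 1/18
```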
4 Transformation of random variables
In many situations we may wish to transform one random variable X into a new one Y by means
of a transformation
Y = h(X).
Normally, we will know the probability/density or distribution function of X and the problem is to calculate the probability/density or distribution function of Y. There are many cases depending on the type of random variable and transformation. We will study three situations: i) X is a continuous random variable and h is a continuous monotonic function, ii) X is continuous but h is non-monotonic, iii) X is discrete and h is continuous.
4.1 Monotonic transformation of a continuous random variable

Suppose h is a continuous, differentiable, monotonic function. Given a value y = h(x):

F_Y(y) = P(Y ≤ y) =
    P(X ≤ h⁻¹(y)) = F_X(h⁻¹(y))        if h is monotonically increasing
    P(X ≥ h⁻¹(y)) = 1 − F_X(h⁻¹(y))    if h is monotonically decreasing

We differentiate both sides to obtain the density function:

f_Y(y) = dF_Y(y)/dy =
    (dF_X(h⁻¹(y))/dx) (dx/dy) = f_X(h⁻¹(y)) (dx/dy)      if h is monotonically increasing
    −(dF_X(h⁻¹(y))/dx) (dx/dy) = −f_X(h⁻¹(y)) (dx/dy)    if h is monotonically decreasing

For h decreasing, h⁻¹ is also decreasing and consequently dx/dy = dh⁻¹(y)/dy will be negative, so we can put both equations together as:

f_Y(y) = f_X(h⁻¹(y)) |dx/dy|.
So, when calculating the density of Y, we need to follow these steps:

1. Write X as a function of Y, i.e., calculate h⁻¹.

2. Calculate |dx/dy|.

3. Substitute x, calculated in 1, into the density function of X, in order to calculate f_X(h⁻¹(y)).

4. Multiply 2 and 3.
Important: once you have calculated the density function of Y, you need to give the range of values taken by the new variable Y.
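The four steps can be illustrated with a simple hypothetical case that is not from the text: X ~ Uniform(0, 1) and Y = h(X) = X², which is monotonically increasing on (0, 1). Then h⁻¹(y) = √y, |dx/dy| = 1/(2√y), f_X(√y) = 1, so f_Y(y) = 1/(2√y) on (0, 1), implying F_Y(y) = √y. A simulation sketch (ours) checks that implied distribution function:

```python
import random

# X ~ Uniform(0, 1), Y = X^2; the transformation result gives F_Y(y) = sqrt(y).
random.seed(0)
N = 100_000
ys = [random.random() ** 2 for _ in range(N)]

# Compare the empirical distribution function with sqrt(y) at a few points;
# Monte Carlo error is about 1/sqrt(N), well below the 0.01 tolerance.
for y in (0.09, 0.25, 0.64):
    empirical = sum(v <= y for v in ys) / N
    assert abs(empirical - y ** 0.5) < 0.01
```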
4.2 Non-monotonic transformation of a continuous random variable

When the function h is non-monotonic, there might be more than one interval of values of X such that Y ≤ y. In this case:

f_Y(y) = ∑_n f_X(x_n) |dx/dy|_{x=x_n},

where x_n, n = 1, 2, . . ., are the solutions of y = h(x).
Example

The velocity of a gas particle is a random variable with density function:

f_V(v) =
    (b²/2) v² e^{−bv}    v > 0
    0                    elsewhere.

The kinetic energy of the particle is W = mV²/2. What is the density function of W? This is a case where the transformation is not monotonic, since

v = ±√(2w/m);

however, the density function of V is 0 if v < 0, therefore we only need to take into account the positive solution. Now, we follow the 4 steps given above:

1. v = √(2w/m)

2. |dv/dw| = √(1/(2mw))

3. f_V(h⁻¹(w)) = (b²/2) (√(2w/m))² e^{−b√(2w/m)} = (b²/2)(2w/m) e^{−b√(2w/m)}

4. f_W(w) = (b²/2)(2w/m) e^{−b√(2w/m)} √(1/(2mw)) = (b²/(2m)) √(2w/m) e^{−b√(2w/m)}

Now, we need to determine the set of values taken by W: since V > 0, then W = mV²/2 > 0. Therefore:

f_W(w) =
    (b²/(2m)) √(2w/m) e^{−b√(2w/m)}    w > 0
    0                                  elsewhere.
4.3 Transformation of a discrete random variable
If X is a discrete random variable and h is a continuous transformation, then:

p_Y(y) = ∑_n p(x_n),    where x_n are the solutions of y = h(x).
Example
A company packs microchips in lots. It is known that the probability distribution of the number of microchips per lot is given by:
x p(x) F(x)
11 0.03 0.03
12 0.03 0.06
13 0.03 0.09
14 0.06 0.15
15 0.26 0.41
16 0.09 0.5
17 0.12 0.62
18 0.21 0.83
19 0.14 0.97
20 0.03 1
What is P(X² ≤ 144)?

P(X² ≤ 144) = P(X ∈ A),    A = {x : x² ≤ 144}

P(X ≤ 12) = F(12) = 0.06
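The same sum over the qualifying values can be written directly from the table; a minimal sketch (ours):

```python
# probability function of X = number of microchips per lot (table above)
p = {11: 0.03, 12: 0.03, 13: 0.03, 14: 0.06, 15: 0.26,
     16: 0.09, 17: 0.12, 18: 0.21, 19: 0.14, 20: 0.03}

# Y = X^2; p_Y collects p(x_n) over the solutions x_n of y = h(x),
# so P(Y <= 144) sums p(x) over every x with x^2 <= 144.
prob = sum(px for x, px in p.items() if x * x <= 144)
# only x = 11 and x = 12 qualify, giving 0.03 + 0.03 = 0.06
```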
4.4 Expectations of functions of a random variable

Sometimes we will need to calculate E[h(X)], where X is a random variable and h is a function. If Y = h(X):

E[Y] = ∑_i h(x_i) p(x_i)    for discrete random variables

E[Y] = ∫_{−∞}^{+∞} h(x) f_X(x) dx    for continuous random variables

We will prove it in the case of a continuous random variable (with h monotonic, so that the change of variable y = h(x) applies):

E[Y] = ∫_{−∞}^{+∞} y f_Y(y) dy = ∫_{−∞}^{+∞} h(x) f_X(x) (dx/dy) dy = ∫_{−∞}^{+∞} h(x) f_X(x) dx
In the case of a linear transformation, the following properties are satisfied:

Y = a + bX

E[Y] = a + b E[X]

Var[Y] = b² Var[X]
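These two properties can be verified exactly on the coin-toss example with a sample transformation Y = 3 + 2X (our choice of a and b, for illustration):

```python
from fractions import Fraction

# X = number of tails in two coin tosses
p = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
a, b = 3, 2   # linear transformation Y = a + bX

EX = sum(x * px for x, px in p.items())
VarX = sum((x - EX) ** 2 * px for x, px in p.items())

EY = sum((a + b * x) * px for x, px in p.items())
VarY = sum((a + b * x - EY) ** 2 * px for x, px in p.items())

assert EY == a + b * EX        # E[Y] = a + b E[X]
assert VarY == b ** 2 * VarX   # Var[Y] = b^2 Var[X]
```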