
PDFs, Expectations and Variances

This is a (quickly and in places poorly written!) review of some very basic concepts in probability and statistics that are prerequisites for this Time Series course. It is not all-inclusive, but it should get you started.

1 Random Variables

Random variables, denoted by capital letters like X, Y, and Z, are variables that take on numerical values with certain probabilities. You can think of them as an outcome of an experiment that has been mapped to numbers. For example, suppose we flip a fair coin one time and we define

    X = 1, if the coin came up Heads
        0, if the coin came up Tails.

Then X is a discrete random variable which takes the value 1 with probability 1/2 and takes the value 0 with probability 1/2. A continuous random variable is one that can take on any of a whole continuum of values. For example, suppose our experiment is to measure the height of the next person to walk into the room. (Assume that we can measure with infinite accuracy.) If we let X be the height of the next person to walk into the room, then X is a continuous random variable. Note that, because there are so many possible outcomes for X, the probability of seeing any one exact value is zero. For example, the probability that the next person's height is between 3 feet and 7 feet is pretty large, the chance that it is between 60 inches and 62 inches is quite a bit smaller, and the chance that it is between 61.456856 and 61.456857 inches is really, really small. The chance that it equals one exact number shrinks to zero!

2 Probability Density Functions

2.1 Discrete Random Variables

Let X be a discrete random variable. The probability density function (pdf) for X is a function f(x) that is defined as f(x) = P(X = x). Note that the little x is only an argument of this function. If f is the pdf for X, then f(y) = P(X = y).

If you are dealing with several random variables at one time, we will usually still denote the pdfs by f but add subscripts for clarity. For example, suppose that X and Y are two discrete random variables. We can write the pdfs as f_X(x) = P(X = x) and f_Y(y) = P(Y = y).

It is sort of customary to make the arguments match, but be aware that it is just an argument and that f_X(y) = P(X = y).

2.2 Continuous Random Variables

Let X be a continuous random variable. As per the height discussion above, P(X = x) = 0 for any particular number x. Thus, it does not make any sense to define the pdf as P(X = x). Instead, we define the pdf of X as f(x) = a curve under which area represents probability. The pdf f for a continuous random variable must have two properties. We must have f(x) ≥ 0 for all x, and

    ∫_{-∞}^{∞} f(x) dx = 1.

Note that a pdf might be zero in a lot of places. For example, suppose that X is the continuous random variable with pdf

    f(x) = 3x^2, for 0 ≤ x ≤ 1
           0,    otherwise.

The probability that X is between 1/4 and 1/2 is

    P(1/4 < X < 1/2) = ∫_{1/4}^{1/2} 3x^2 dx = 7/64.

Note that P(1/4 ≤ X ≤ 1/2) = P(1/4 < X < 1/2) = P(1/4 < X ≤ 1/2) = P(1/4 ≤ X < 1/2). The probability that X is less than 1/2 is

    P(X < 1/2) = ∫_{-∞}^{1/2} f(x) dx = ∫_{-∞}^{0} 0 dx + ∫_{0}^{1/2} 3x^2 dx = ∫_{0}^{1/2} 3x^2 dx = 1/8.
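The integrals above can be sanity-checked numerically. A minimal Python sketch (my own illustration, not part of these notes), using a simple midpoint rule rather than any particular library:

```python
# pdf of the running example: f(x) = 3x^2 on [0, 1], zero elsewhere.
def f(x):
    return 3 * x**2 if 0 <= x <= 1 else 0.0

# Midpoint-rule approximation of the integral of f over [a, b].
def integrate(f, a, b, n=100_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(integrate(f, 0.25, 0.5))  # ≈ 7/64 = 0.109375
print(integrate(f, 0, 0.5))     # ≈ 1/8 = 0.125
print(integrate(f, 0, 1))       # ≈ 1 (total probability)
```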

2.3 Joint pdfs

Let X and Y be two discrete random variables. We will use f_{X,Y}(x, y) to denote the joint pdf of X and Y, which is defined as f_{X,Y}(x, y) = P(X = x, Y = y). (Note that P(X = x, Y = y) = P(X = x and Y = y).)

In the case that the discrete random variables only take on a small number of values, joint pdfs are often written in tabular form. For example:

                 x
            1     3     11
    y  3   0.2   0.1   0.2
       5   0.1   0.1   0.3

Here, for example, P(X = 11, Y = 5) = 0.3. Note that, if you want the probability that X is 11, you would have to add up two probabilities:

    P(X = 11) = P(X = 11, Y = 3) + P(X = 11, Y = 5) = 0.2 + 0.3 = 0.5.

In fact, considered alone, X takes on the values 1, 3, and 11 with probabilities 0.3, 0.2, and 0.5, respectively. So, we can write out the individual pdf for X alone, denoted by f_X(x), as a list of values written across the bottom row of that table. Similarly, we can write out the individual pdf for Y alone, denoted by f_Y(y), as a list of values written down a column to the right of that table. As all of these probabilities appear in the margins of the table, the individual pdfs for X and Y, in the context of coming from a joint pdf, are known as marginal pdfs.

From a Joint pdf to Marginals: Let X and Y be discrete random variables with joint pdf f_{X,Y}(x, y). The marginal pdfs for X and Y are given by

    f_X(x) = Σ_y f_{X,Y}(x, y),   and   f_Y(y) = Σ_x f_{X,Y}(x, y).
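The passage from a joint table to the marginals is easy to mechanize. A small Python sketch (my illustration; the dict layout is just one possible encoding of the table above):

```python
from collections import defaultdict

# Joint pmf from the table above, keyed by (x, y).
joint = {(1, 3): 0.2, (3, 3): 0.1, (11, 3): 0.2,
         (1, 5): 0.1, (3, 5): 0.1, (11, 5): 0.3}

fX = defaultdict(float)  # f_X(x) = sum over y of f_{X,Y}(x, y)
fY = defaultdict(float)  # f_Y(y) = sum over x of f_{X,Y}(x, y)
for (x, y), p in joint.items():
    fX[x] += p
    fY[y] += p

print(dict(fX))  # marginal for X: 0.3, 0.2, 0.5 (up to float rounding)
print(dict(fY))  # marginal for Y: 0.5, 0.5 (up to float rounding)
```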

If X and Y are continuous random variables, the joint pdf will still be denoted by f_{X,Y}(x, y) but, as in the univariate case, will no longer represent probability. Instead, f_{X,Y}(x, y) = a surface under which volume represents probability. If you wanted to know the probability that X is between 1 and 3 and Y is greater than 5, for example, you would write

    P(1 < X < 3, Y > 5) = ∫_{1}^{3} ∫_{5}^{∞} f_{X,Y}(x, y) dy dx.

If you wanted to know the probability that Y > X, you could compute this as

    P(Y > X) = ∫_{-∞}^{∞} ∫_{x}^{∞} f_{X,Y}(x, y) dy dx.

Note that

    ∫_{-∞}^{∞} ∫_{-∞}^{∞} f_{X,Y}(x, y) dy dx = 1.

Analogous to the discrete case, we have

From a Joint pdf to Marginals: Let X and Y be continuous random variables with joint pdf f_{X,Y}(x, y). The marginal pdfs for X and Y are given by

    f_X(x) = ∫_{-∞}^{∞} f_{X,Y}(x, y) dy,   and   f_Y(y) = ∫_{-∞}^{∞} f_{X,Y}(x, y) dx.

Definition: Random variables X and Y are said to be independent if the joint pdf factors into the marginal pdfs: f_{X,Y}(x, y) = f_X(x) f_Y(y).
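For the discrete table in Section 2.3, the factorization criterion fails, so those X and Y are not independent: for instance, f_{X,Y}(1, 3) = 0.2 but f_X(1) f_Y(3) = 0.3 · 0.5 = 0.15. A quick Python check (my illustration, not part of the notes):

```python
# Joint pmf from the table in Section 2.3, keyed by (x, y).
joint = {(1, 3): 0.2, (3, 3): 0.1, (11, 3): 0.2,
         (1, 5): 0.1, (3, 5): 0.1, (11, 5): 0.3}

# Marginals, by summing over the other variable.
fX = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in {1, 3, 11}}
fY = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in {3, 5}}

# Independence holds only if every joint entry factors into the marginals.
independent = all(abs(joint[(x, y)] - fX[x] * fY[y]) < 1e-12
                  for (x, y) in joint)
print(independent)  # False
```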

3 Cumulative Distribution Functions

The cumulative distribution function, or cdf, for a random variable X is generally denoted by F_X(x) and is defined, in both the discrete and continuous cases, as F_X(x) = P(X ≤ x). If X is discrete, then

    F_X(x) = P(X ≤ x) = Σ_{u ≤ x} P(X = u) = Σ_{u ≤ x} f_X(u).

If X is continuous, then

    F_X(x) = P(X ≤ x) = ∫_{-∞}^{x} f_X(u) du.

To go backwards from the cdf to the pdf of a continuous random variable,

    f_X(x) = (d/dx) F_X(x).
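For the running example with pdf f(x) = 3x^2 on [0, 1], the cdf works out to F_X(x) = x^3 on that interval, and differentiating it numerically recovers the pdf. A sketch (my illustration, not part of the notes):

```python
# cdf of the running example: F(x) = x^3 on [0, 1].
def F(x):
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x**3

# Centered finite difference: f(x) ≈ (F(x+h) - F(x-h)) / (2h).
def f_from_F(x, h=1e-6):
    return (F(x + h) - F(x - h)) / (2 * h)

print(f_from_F(0.5))  # ≈ 3 * 0.5**2 = 0.75, the pdf at x = 0.5
```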

**For the rest of this document, I will provide proofs of claims for continuous random variables only!

4 PDFs for Functions of Random Variables

Let X be a continuous random variable with pdf f_X(x). Let Y = g(X), where g is a continuous function of X that is invertible. We will now show that f_Y(y), the pdf for Y, is related to the pdf of X as

    f_Y(y) = f_X(g^{-1}(y)) |（d/dy) g^{-1}(y)|.

If g is invertible, then it must be either strictly increasing or strictly decreasing. To prove the formula above, we shall consider the two cases separately.

Case One: g is increasing

If g is increasing, then g^{-1} is also increasing. (The proof of this is left to you!) So,

    F_Y(y) = P(Y ≤ y) = P(g(X) ≤ y) = P(X ≤ g^{-1}(y)).

That last equality comes from applying the function g^{-1} to both sides of the inequality. Since g^{-1} is an increasing function, the direction of the inequality is preserved. Note that the far right-hand side is equal to the cdf for X evaluated at g^{-1}(y). So, we have F_Y(y) = F_X(g^{-1}(y)). Taking the derivative of both sides, using the fact that the derivative of the cdf is the pdf, and using the chain rule on the right-hand side, we get

    f_Y(y) = (d/dy) F_Y(y) = (d/dy) F_X(g^{-1}(y)) = f_X(g^{-1}(y)) · (d/dy) g^{-1}(y).

Since g^{-1} is increasing, its derivative is positive, so adding an absolute value won't change anything:

    f_Y(y) = f_X(g^{-1}(y)) |(d/dy) g^{-1}(y)|.

Case Two: g is decreasing

If g is decreasing, then g^{-1} is also decreasing. (Again, the proof of this is left to you.) So,

    F_Y(y) = P(Y ≤ y) = P(g(X) ≤ y) = P(X ≥ g^{-1}(y)).

Note that the inequality in that last probability has flipped because a decreasing function maps smaller values to larger values and vice versa. The right-hand side is equal to 1 − P(X < g^{-1}(y)), which, for a continuous random variable, is equal to 1 − P(X ≤ g^{-1}(y)) = 1 − F_X(g^{-1}(y)). Putting it all together, we now have F_Y(y) = 1 − F_X(g^{-1}(y)). Taking the derivative of both sides, using the fact that the derivative of the cdf is the pdf, and using the chain rule on the right-hand side, we get

    f_Y(y) = (d/dy) F_Y(y) = (d/dy)(1 − F_X(g^{-1}(y))) = 0 − f_X(g^{-1}(y)) · (d/dy) g^{-1}(y) = −f_X(g^{-1}(y)) · (d/dy) g^{-1}(y).

Since g^{-1} is decreasing, its derivative is negative. However, bringing that negative sign from the front over to the derivative gives −(d/dy) g^{-1}(y) = |(d/dy) g^{-1}(y)|. So,

    f_Y(y) = f_X(g^{-1}(y)) |(d/dy) g^{-1}(y)|,

as desired.
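As a worked instance of this formula (my own example, not from the notes): let X have pdf 3x^2 on [0, 1] and let Y = g(X) = X^2, which is increasing there. Then g^{-1}(y) = √y and (d/dy) g^{-1}(y) = 1/(2√y), so the formula gives f_Y(y) = 3(√y)^2 · 1/(2√y) = (3/2)√y on [0, 1]. A numeric cross-check against the cdf P(Y ≤ y) = P(X ≤ √y) = y^{3/2}:

```python
import math

# Transformed pdf from the change-of-variables formula: f_Y(y) = (3/2) sqrt(y) on [0, 1].
def f_Y(y):
    return 1.5 * math.sqrt(y) if 0 <= y <= 1 else 0.0

# Midpoint-rule integral.
def integrate(f, a, b, n=100_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

y = 0.49
print(integrate(f_Y, 0, y))  # ≈ y**1.5 = 0.343, matching P(X <= sqrt(y)) = 0.7**3
```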

5 Expected Values

The expectation, mean, or expected value of a random variable X is denoted by E[X] and is defined as a probability-weighted average of the possible values that X can take on.

Case: X is discrete:

    E[X] := Σ_x x · P(X = x) = Σ_x x · f_X(x)

Case: X is continuous:

    E[X] := ∫_{-∞}^{∞} x f_X(x) dx

For the continuous pdf example of the previous section,

    E[X] = ∫_{-∞}^{∞} x f_X(x) dx
         = ∫_{-∞}^{0} x · 0 dx + ∫_{0}^{1} x · 3x^2 dx + ∫_{1}^{∞} x · 0 dx
         = ∫_{0}^{1} 3x^3 dx = 3/4.
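The same midpoint-rule trick used earlier confirms this expectation numerically (my illustration, not part of the notes):

```python
# Midpoint-rule integral.
def integrate(f, a, b, n=100_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# E[X] = integral of x * f(x); the integrand is x * 3x^2 = 3x^3 on [0, 1].
EX = integrate(lambda x: x * 3 * x**2, 0, 1)
print(EX)  # ≈ 3/4
```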

5.1 Expected Values of Functions of Random Variables

Let X be a continuous random variable with pdf f_X(x) and let Y = g(X), where g is a continuous function of X. Then, we already know that

    E[g(X)] = E[Y] = ∫_{-∞}^{∞} y f_Y(y) dy.

It turns out that this is equivalent to

    E[g(X)] = ∫_{-∞}^{∞} g(x) f_X(x) dx.

This is very useful! We can find the expected value of a function of X directly, without first having to consider a transformation to a different pdf! I will prove this here in the case that g is a strictly increasing function. (In actuality, g does not even have to be invertible.) Since we know that

    f_Y(y) = f_X(g^{-1}(y)) |(d/dy) g^{-1}(y)| = f_X(g^{-1}(y)) (d/dy) g^{-1}(y),

we can write

    E[g(X)] = E[Y] = ∫_{-∞}^{∞} y f_Y(y) dy = ∫_{-∞}^{∞} y f_X(g^{-1}(y)) (d/dy) g^{-1}(y) dy.

Letting x = g^{-1}(y) (so that y = g(x)), we have dx = (d/dy) g^{-1}(y) dy, and this becomes

    ∫_{?}^{?} g(x) f_X(x) dx,

where ? and ? are new limits of integration. These limits will be the values that x = g^{-1}(y) takes on as y goes from −∞ to ∞. Fortunately, since Y was defined from X as Y = g(X) and g is invertible, x = g^{-1}(y) will end up taking on the original possible values for X, which are given as part of the pdf f_X(x), as it may be defined to be zero in a lot of places. So, we can write those limits as −∞ to ∞, and the pdf f_X(x) will be non-zero for precisely the right range of values of X. (Sorry, that explanation was a bit rushed!) In summary,

    E[g(X)] = ∫_{-∞}^{∞} g(x) f_X(x) dx.
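A numeric check of this equivalence (my choice of g, not from the notes): take g(x) = x^2 with X having pdf 3x^2 on [0, 1]. The direct route integrates g(x) f_X(x); the transformed route uses the pdf f_Y(y) = (3/2)√y of Y = X^2 obtained from the change-of-variables formula. Both should give 3/5:

```python
import math

# Midpoint-rule integral.
def integrate(f, a, b, n=100_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

direct = integrate(lambda x: x**2 * 3 * x**2, 0, 1)        # E[g(X)] = ∫ g(x) f_X(x) dx
via_Y = integrate(lambda y: y * 1.5 * math.sqrt(y), 0, 1)  # E[Y] = ∫ y f_Y(y) dy
print(direct, via_Y)  # both ≈ 3/5 = 0.6
```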

5.2 Expected Values for Functions of Two Random Variables

If X and Y are two continuous random variables, to find the expectation of a function g(X, Y), we must do a double integral against the joint pdf:

    E[g(X, Y)] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} g(x, y) f_{X,Y}(x, y) dy dx.
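A minimal numeric sketch (my own example, not from the notes): if (X, Y) is uniform on the unit square, so f_{X,Y}(x, y) = 1 there, then E[XY] = ∫_0^1 ∫_0^1 xy dy dx = 1/4. A two-dimensional midpoint rule reproduces this:

```python
# Two-dimensional midpoint rule over the unit square.
def double_integrate(g, n=400):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += g(x, y)
    return total * h * h

print(double_integrate(lambda x, y: x * y))  # ≈ 1/4
```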
