NUMERICAL AND STATISTICAL METHODS
What is a Random Variable?
Problem 1
(A) I only
(B) II only
(C) III only
(D) I and II
(E) II and III
Solution
x     P(x)
0     0.25
1     0.50
2     0.25
y = 1
y = 1 - 0.5x
Problem 1
(A) 0.10
(B) 0.15
(C) 0.25
(D) 0.50
(E) 0.90
Solution
Just like variables from a data set, random variables are described
by measures of central tendency (like the mean) and measures of
variability (like variance). This lesson shows how to compute these
measures for discrete random variables.
E(X) = μx = Σ [ xi * P(xi) ]
Example 1
Number of hits, x      0     1     2     3     4
Probability, P(x)    0.10  0.20  0.30  0.25  0.15
(A) 1.00
(B) 1.75
(C) 2.00
(D) 2.25
(E) None of the above.
Solution
The correct answer is E. Applying the expected value formula:
E(X) = Σ [ xi * P(xi) ]
E(X) = 0*0.10 + 1*0.20 + 2*0.30 + 3*0.25 + 4*0.15 = 2.15
Since 2.15 is not among the listed choices, the answer is (E) None of the above.
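To check this sum by computer, here is a minimal sketch in Python (the function name `expected_value` is illustrative, not from the text):

```python
# Expected value of a discrete random variable: E(X) = sum of x * P(x).
# Values and probabilities come from the hit-distribution table above.
values = [0, 1, 2, 3, 4]
probs = [0.10, 0.20, 0.30, 0.25, 0.15]

def expected_value(xs, ps):
    """Return E(X) = sum(x * P(x)) for a discrete distribution."""
    return sum(x * p for x, p in zip(xs, ps))

print(round(expected_value(values, probs), 2))  # 2.15
```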
σ² = Σ { [ xi - E(X) ]² * P(xi) }
Example 2
(A) 0.50
(B) 0.62
(C) 0.79
(D) 0.89
(E) 2.10
Solution
The correct answer is D. The solution has three parts. First, find
the expected value; then, find the variance; then, find the standard
deviation. Computations are shown below, beginning with the
expected value.
E(X) = Σ [ xi * P(xi) ]
E(X) = 1*0.25 + 2*0.50 + 3*0.15 + 4*0.10 = 2.10
σ² = Σ { [ xi - E(X) ]² * P(xi) }
σ² = (1 - 2.1)² * 0.25 + (2 - 2.1)² * 0.50 + (3 - 2.1)² * 0.15 + (4 - 2.1)² * 0.10
σ² = (1.21 * 0.25) + (0.01 * 0.50) + (0.81 * 0.15) + (3.61 * 0.10) =
0.3025 + 0.0050 + 0.1215 + 0.3610 = 0.79
And finally, the standard deviation is the square root of the variance: σ = √0.79 ≈ 0.89.
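The whole three-part solution (mean, then variance, then standard deviation) can be verified with a short script; the helper name `mean_var_sd` is illustrative:

```python
import math

# Mean, variance, and standard deviation of a discrete random variable,
# using the distribution from Example 2 above.
values = [1, 2, 3, 4]
probs = [0.25, 0.50, 0.15, 0.10]

def mean_var_sd(xs, ps):
    """Return (E(X), Var(X), SD(X)) for a discrete distribution."""
    mu = sum(x * p for x, p in zip(xs, ps))
    var = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))
    return mu, var, math.sqrt(var)

mu, var, sd = mean_var_sd(values, probs)
print(round(mu, 2), round(var, 2), round(sd, 2))  # 2.1 0.79 0.89
```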
The above conditions are equivalent: if either one is met, the other
condition is also met, and X and Y are independent. If either
condition is not met, X and Y are dependent.
You can use tables like this to figure out whether two discrete
random variables are independent or dependent. Problem 1 below
shows how.
Problem 1
Joint probabilities of X and Y:
            X
           0     1     2
Y    3    0.1   0.2   0.2
     4    0.1   0.2   0.2

Joint probabilities of A and B:
            A
           0     1     2
B    3    0.1   0.2   0.2
     4    0.2   0.2   0.1
(A) I only
(B) II only
(C) I and II
(D) Neither statement is true.
(E) It is not possible to answer this question, based on the
information given.
Solution
The correct answer is A. The solution requires several
computations to test the independence of random variables. Those
computations are shown below.
Thus, P(x|y) = P(x), for all values of X and Y, which means that X
and Y are independent. We repeat the same analysis to test the
independence of A and B.
Thus, P(a|b) is not equal to P(a), for all values of A and B. For
example, P(a=0) = 0.3; but P(a=0 | b=3) = 0.2. This means that A
and B are not independent.
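The cell-by-cell comparison of P(x, y) against P(x) * P(y) is easy to automate. A sketch, assuming the joint tables above are encoded as dictionaries (the name `is_independent` is ours, not standard):

```python
# Independence test for discrete random variables from a joint probability
# table: X and Y are independent iff P(x, y) = P(x) * P(y) for every cell.

def is_independent(joint, tol=1e-9):
    """joint: dict mapping (x, y) -> P(x, y)."""
    px, py = {}, {}
    for (x, y), p in joint.items():          # accumulate the marginals
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return all(abs(p - px[x] * py[y]) < tol for (x, y), p in joint.items())

# Joint distribution of X and Y (first table above): independent.
xy = {(0, 3): 0.1, (1, 3): 0.2, (2, 3): 0.2, (0, 4): 0.1, (1, 4): 0.2, (2, 4): 0.2}
# Joint distribution of A and B (second table above): dependent.
ab = {(0, 3): 0.1, (1, 3): 0.2, (2, 3): 0.2, (0, 4): 0.2, (1, 4): 0.2, (2, 4): 0.1}

print(is_independent(xy), is_independent(ab))  # True False
```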
Combinations of Random Variables
Problem 1
            X
           0     1     2
Y    3    0.1   0.2   0.2
     4    0.1   0.2   0.2
(A) 1.2
(B) 3.5
(C) 4.5
(D) 4.7
(E) None of the above.
Solution
The correct answer is D. The solution requires three computations:
(1) find the mean (expected value) of X, (2) find the mean
(expected value) of Y, and (3) find the sum of the means. Those
computations are shown below, beginning with the mean of X.
E(X) = Σ [ xi * P(xi) ]
E(X) = 0 * (0.1 + 0.1) + 1 * (0.2 + 0.2) + 2 * (0.2 + 0.2) = 0 + 0.4
+ 0.8 = 1.2
E(Y) = Σ [ yi * P(yi) ]
E(Y) = 3 * (0.1 + 0.2 + 0.2) + 4 * (0.1 + 0.2 + 0.2) = (3 * 0.5) + (4
* 0.5) = 1.5 + 2 = 3.5
And finally, the mean of the sum of X and Y is equal to the sum of
the means. Therefore,
E(X + Y) = E(X) + E(Y) = 1.2 + 3.5 = 4.7
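The marginal means can be read straight off the joint table by computer. A minimal sketch using the table above:

```python
# Mean of a sum: E(X + Y) = E(X) + E(Y), with the marginal means computed
# directly from the joint probability table above.
joint = {(0, 3): 0.1, (1, 3): 0.2, (2, 3): 0.2, (0, 4): 0.1, (1, 4): 0.2, (2, 4): 0.2}

ex = sum(x * p for (x, y), p in joint.items())   # E(X)
ey = sum(y * p for (x, y), p in joint.items())   # E(Y)
print(round(ex, 1), round(ey, 1), round(ex + ey, 1))  # 1.2 3.5 4.7
```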
Problem 2
(A) 2.65
(B) 5.00
(C) 7.00
(D) 25.0
(E) It is not possible to answer this question, based on the
information given.
Solution
Adding a constant: Y = X + b
Subtracting a constant: Y = X - b
Multiplying by a constant: Y = mX
Dividing by a constant: Y = X/m
Multiplying by a constant and adding a constant: Y = mX + b
Dividing by a constant and subtracting a constant: Y = X/m - b
Problem 1
(A) $500
(B) $3,000
(C) $3,500
(D) None of the above.
(E) There is not enough information to answer this question.
Solution
Y = mX + b
Y = 0.10 * X + 500
Y = mX + b
Y = 0.10 * $30,000 + $500 = $3,500
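As a quick computational check of the plug-in step (using the values from the solution above: m = 0.10, b = $500, and a mean salary of $30,000):

```python
# Linear transformation of a random variable: if Y = mX + b, then the mean
# transforms the same way, E(Y) = m * E(X) + b.
def mean_linear(m, mean_x, b):
    """Mean of Y = mX + b given the mean of X."""
    return m * mean_x + b

print(round(mean_linear(0.10, 30_000, 500), 2))  # 3500.0
```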
Problem 2
(A) $200
(B) $3,000
(C) $40,000
(D) None of the above.
(E) There is not enough information to answer this question.
Solution
Y = mX + b
Y = 0.10 * X + 500
And finally, since the standard deviation is equal to the square root
of the variance, the standard deviation of employee bonuses is
equal to the square root of 40,000 or $200.
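The rule at work here is that adding a constant leaves variability unchanged, so Var(Y) = m² Var(X). A sketch (the Var(X) value of 4,000,000 is not stated in the text; it is the value implied by Var(Y) = 40,000 with m = 0.10):

```python
import math

# Variance under a linear transformation: adding a constant does not change
# spread, so for Y = mX + b, Var(Y) = m**2 * Var(X) and SD(Y) = |m| * SD(X).
def var_linear(m, var_x):
    """Variance of Y = mX + b given the variance of X."""
    return m ** 2 * var_x

var_y = var_linear(0.10, 4_000_000)          # illustrative Var(X)
print(round(var_y), round(math.sqrt(var_y)))  # 40000 200
```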
What is a Probability Distribution?
Probability Distributions
Example 1
Suppose a die is tossed. What is the probability that the die will
land on 5?
Example 2
Note: The shaded area in the graph represents the probability that
the random variable X is less than or equal to a. This is a
cumulative probability. However, the probability that X is exactly
equal to a would be zero. A continuous random variable can take
on an infinite number of values. The probability that it will equal a
specific value (such as a) is always zero.
Binomial Experiment
A binomial experiment is a statistical experiment that has the
following properties:
Notation
The following notation is helpful, when we talk about binomial
probability.
x: The number of successes that result from the binomial
experiment.
n: The number of trials in the binomial experiment.
P: The probability of success on an individual trial.
Q: The probability of failure on an individual trial. (This is
equal to 1 - P.)
n!: The factorial of n (also known as n factorial).
b(x; n, P): Binomial probability - the probability that an n-
trial binomial experiment results in exactly x successes,
when the probability of success on an individual trial is P.
nCr: The number of combinations of n things, taken r at a
time.
Binomial Distribution
A binomial random variable is the number of successes x in n
repeated trials of a binomial experiment. The probability
distribution of a binomial random variable is called a binomial
distribution.
Suppose we flip a coin two times and count the number of heads
(successes). The binomial random variable is the number of heads,
which can take on values of 0, 1, or 2. The binomial distribution is
presented below.
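The two-flip distribution described above can be generated directly from the binomial formula; a short sketch:

```python
from math import comb

# Binomial distribution for two coin flips:
# P(X = x) = C(2, x) * 0.5**x * 0.5**(2 - x) for x = 0, 1, 2 heads.
def binom_pmf(x, n, p):
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

print([binom_pmf(x, 2, 0.5) for x in range(3)])  # [0.25, 0.5, 0.25]
```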
Example 1
b(x ≤ 45; 100, 0.5) = b(x = 0; 100, 0.5) + b(x = 1; 100, 0.5) + . . . +
b(x = 45; 100, 0.5)
b(x ≤ 45; 100, 0.5) = 0.184
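The cumulative probability above is just a sum of individual binomial probabilities, which a few lines of code can reproduce:

```python
from math import comb

# Cumulative binomial probability: b(x <= 45; 100, 0.5) is the sum of the
# individual binomial probabilities for x = 0 through 45.
def binom_cdf(k, n, p):
    """P(X <= k) for a binomial random variable with n trials."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

print(round(binom_cdf(45, 100, 0.5), 3))  # 0.184
```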
Example 2
Example 3
What is the probability that the World Series will last 4 games? 5
games? 6 games? 7 games? Assume that the teams are evenly
matched.
In the World Series, there are two baseball teams. The series ends
when the winning team wins 4 games. Therefore, we define a
success as a win by the team that ultimately becomes the World
Series champion.
For the purpose of this analysis, we assume that the teams are
evenly matched. Therefore, the probability that a particular team
wins a particular game is 0.5.
Let's look first at the simplest case. What is the probability that the
series lasts only 4 games? This can occur if one team wins the first
4 games. The probability of the National League team winning 4
games in a row is:
(0.5)(0.5)(0.5)(0.5) = 0.0625
Now let's tackle the question of finding the probability that the World
Series ends in 5 games. The trick in finding this solution is to
recognize that the series can only end in 5 games if one team has
won 3 out of the first 4 games. So let's first find the probability that
the American League team wins exactly 3 of the first 4 games:
b(3; 4, 0.5) = 4C3 * (0.5)^3 * (0.5)^1 = 0.25
Okay, here comes some more tricky stuff, so listen up. Given that
the American League team has won 3 of the first 4 games, the
American League team has a 50/50 chance of winning the fifth
game to end the series. Therefore, the probability of the American
League team winning the series in 5 games is 0.25 * 0.50 = 0.125.
Since the National League team could also win the series in 5
games, the probability that the series ends in 5 games would be
0.125 + 0.125 = 0.25.
The rest of the problem would be solved in the same way. You
should find that the probability of the series ending in 6 games is
0.3125; and the probability of the series ending in 7 games is also
0.3125.
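The whole series-length calculation can be packaged into one formula: the series ends at game g when one team wins 3 of the first g - 1 games and then wins game g. A sketch (valid for evenly matched teams, p = 0.5):

```python
from math import comb

# Probability that a best-of-seven series between evenly matched teams
# ends in exactly g games: one team wins 3 of the first g - 1 games and
# then wins game g; the factor of 2 counts either team as the winner
# (this symmetry argument relies on p = 0.5).
def series_length_prob(g, p=0.5):
    return 2 * comb(g - 1, 3) * p ** 3 * (1 - p) ** (g - 4) * p

for g in (4, 5, 6, 7):
    print(g, series_length_prob(g))  # 0.125, 0.25, 0.3125, 0.3125
```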
Negative Binomial Distribution
μ = r/P
where μ is the mean number of trials, r is the number of successes,
and P is the probability of a success on any given trial.
r = kP/Q
k = rQ/P
g(x; P) = P * Q^(x - 1)
Sample Problems
The problems below show how to apply your new-found
knowledge of the negative binomial distribution (see Example 1)
and the geometric distribution (see Example 2).
Example 1
b*(x; r, P) = (x-1)C(r-1) * P^r * Q^(x - r)
b*(5; 3, 0.7) = 4C2 * (0.7)^3 * (0.3)^2
b*(5; 3, 0.7) = 6 * 0.343 * 0.09 = 0.18522
Thus, the probability that Bob will make his third successful free
throw on his fifth shot is 0.18522.
Example 2
b*(x; r, P) = (x-1)C(r-1) * P^r * Q^(x - r)
b*(5; 1, 0.7) = 4C0 * (0.7)^1 * (0.3)^4
b*(5; 1, 0.7) = 0.00567
g(x; P) = P * Q^(x - 1)
g(5; 0.7) = 0.7 * (0.3)^4 = 0.00567
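Both computations above fit in a few lines; note how the r = 1 negative binomial reproduces the geometric result:

```python
from math import comb

# Negative binomial probability b*(x; r, P): probability that the r-th
# success occurs on trial x. The geometric distribution is the r = 1 case.
def neg_binom(x, r, p):
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

def geometric(x, p):
    return p * (1 - p) ** (x - 1)

print(round(neg_binom(5, 3, 0.7), 5))  # 0.18522
print(round(neg_binom(5, 1, 0.7), 5))  # 0.00567
print(round(geometric(5, 0.7), 5))     # 0.00567
```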
Notation
The following notation is helpful, when we talk about
hypergeometric distributions and hypergeometric probability.
Hypergeometric Experiments
A hypergeometric experiment is a statistical experiment that has
the following properties:
Note further that if you selected the marbles with replacement, the
probability of success would not change. It would be 5/10 on every
trial. Then, this would be a binomial experiment.
Hypergeometric Distribution
A hypergeometric random variable is the number of successes
that result from a hypergeometric experiment. The probability
distribution of a hypergeometric random variable is called a
hypergeometric distribution.
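The hypergeometric probability can be computed directly from binomial coefficients. A sketch using the marble setting mentioned above (10 marbles, 5 of which count as successes); the choice of 2 successes in 4 draws is illustrative:

```python
from math import comb

# Hypergeometric probability: draw n items without replacement from a
# population of N items, K of which are successes.
# P(X = k) = C(K, k) * C(N - K, n - k) / C(N, n)
def hypergeom_pmf(k, N, K, n):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Probability of exactly 2 successes in 4 draws from 10 marbles (5 successes).
print(round(hypergeom_pmf(2, 10, 5, 4), 4))  # 0.4762
```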
Example 1
Hypergeometric Calculator
As you surely noticed, the hypergeometric formula requires many
time-consuming computations. The Stat Trek Hypergeometric
Calculator can do this work for you - quickly, easily, and error-free.
Use the Hypergeometric Calculator to compute
hypergeometric probabilities and cumulative hypergeometric
probabilities. The calculator is free. It can be found under the Stat
Tables tab, which appears in the header of every Stat Trek web
page.
Example 1
Multinomial Experiment
A multinomial experiment is a statistical experiment that has the
following properties:
where n = n1 + n2 + . . . + nk.
Multinomial Calculator
As you may have noticed, the multinomial formula requires many
time-consuming computations. The Multinomial Calculator can do
this work for you - quickly, easily, and error-free. Use the
Multinomial Calculator to compute the probability of outcomes
from multinomial experiments. The calculator is free. It can be
found under the Stat Tables tab, which appears in the header of
every Stat Trek web page.
Multinomial Probability: Sample Problems
Example 1
Example 2
Note that the specified region could take many forms. For instance,
it could be a length, an area, a volume, a period of time, etc.
Notation
The following notation is helpful, when we talk about the Poisson
distribution.
Example 1
Poisson Calculator
f(x) = π if x = 1; 1 - π if x = 0; and 0 otherwise.
Another common way to write it is
f(x) = π^x (1 - π)^(1 - x),  x = 0, 1.
The mean of a Bernoulli random variable is
E(X) = 1(π) + 0(1 - π) = π,
and the variance of a Bernoulli is
V(X) = E(X²) - [E(X)]² = 1²(π) + 0²(1 - π) - π² = π(1 - π).
Binomial distribution
X ~ Bin(n, π).
Suppose that an experiment consists of n repeated Bernoulli-type
trials, each trial resulting in a success with probability π and a
failure with probability 1 - π. For example, toss a coin 100 times,
so n = 100, and count the number of times you observe heads, e.g.
X = # of heads. If all the trials are independent (that is, if the
probability of success on any trial is unaffected by the outcome of
any other trial; for instance, two coin tosses do not affect each
other), then the total number of successes in the experiment will
have a binomial distribution. The binomial distribution can be
written as
Pr(X = x) = [n! / (x!(n - x)!)] π^x (1 - π)^(n - x),  x = 0, 1, . . . , n.
Writing X = X1 + X2 + . . . + Xn as a sum of independent Bernoulli
variables, the mean and variance are
E(X) = E(X1) + E(X2) + . . . + E(Xn) = nπ
and
V(X) = V(X1) + V(X2) + . . . + V(Xn) = nπ(1 - π).
Note that X will not have a binomial distribution if the probability
of success is not constant from trial to trial, or if the trials are not
entirely independent (i.e. a success or failure on one trial alters the
probability of success on another trial).
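The identities E(X) = nπ and V(X) = nπ(1 - π) can be verified numerically from the pmf; the values n = 10, π = 0.3 below are illustrative:

```python
from math import comb

# Verify E(X) = n*pi and V(X) = n*pi*(1 - pi) directly from the binomial pmf.
def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # illustrative values
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))
print(round(mean, 6), round(var, 6))  # 3.0 2.1
```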
if X1 ~ Bin(n1, π) and X2 ~ Bin(n2, π), then X1 + X2 ~ Bin(n1 + n2, π)
As n increases, for fixed π, the binomial distribution approaches the
normal distribution N(nπ, nπ(1 - π)).
Hypergeometric distribution
p(t) = Pr(N1 = t) = C(n1, t) C(n2, m - t) / C(n, m),
t ∈ [max(0, m - n2), min(n1, m)]
Poisson distribution
f(x|λ) = Pr(X = x) = λ^x e^(-λ) / x!,  x = 0, 1, 2, . . . , and λ > 0.
n! / [x!(n - x)!] π^x (1 - π)^(n - x)  ≈  λ^x e^(-λ) / x!,  with λ = nπ.   (1)
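Equation (1) can be checked numerically: for large n and small π with λ = nπ held fixed, the two pmfs nearly coincide. The values n = 1000, π = 0.002 are illustrative:

```python
from math import comb, exp, factorial

# Poisson approximation to the binomial: for large n and small pi with
# lam = n * pi held fixed, the binomial pmf is close to the Poisson pmf.
def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    return lam ** x * exp(-lam) / factorial(x)

n, p = 1000, 0.002  # illustrative: lam = n * p = 2
for x in range(5):
    print(x, round(binom_pmf(x, n, p), 4), round(poisson_pmf(x, 2.0), 4))
```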
Overdispersion
Count data often exhibit variability exceeding that predicted by the
binomial or Poisson. This phenomenon is known as
overdispersion.
p(X = k, λ) = [(θ/μ)^θ / (Γ(θ) k!)] λ^(k + θ - 1) exp(-λ(θ/μ + 1))
p(X = k) = [(θ/μ)^θ / (Γ(θ) k!)] ∫₀^∞ λ^(k + θ - 1) exp(-λ(θ/μ + 1)) dλ
         = [(θ/μ)^θ / (Γ(θ) k!)] Γ(k + θ) / (θ/μ + 1)^(k + θ)
         = [Γ(k + θ) / (Γ(θ) Γ(k + 1))] (θ / (μ + θ))^θ (μ / (μ + θ))^k
with
E(X) = E[E(X|λ)] = E[λ] = μ
V(X) = E[var(X|λ)] + var[E(X|λ)] = E[λ] + var[λ] = μ + μ²/θ
     = μ(1 + μ/θ).
Beta-Binomial distribution
A family of discrete probability distributions on a finite support
arising when the probability of a success in each of a fixed or
known number of Bernoulli trials is either unknown or random.
For example, the researcher believes that the unknown probability
of having flu is not fixed and not the same for the entire
population, but is itself a random variable with its own
distribution. For example, in Bayesian analysis it will describe a
prior belief or knowledge about the probability of having flu based
on prior studies. Below, X is what we observe, such as the number
of flu cases.
P(X = k) = C(n, k) [Γ(α + β) / (Γ(α) Γ(β))] ∫₀¹ π^(k + α - 1) (1 - π)^(n - k + β - 1) dπ
         = [Γ(n + 1) / (Γ(k + 1) Γ(n - k + 1))] [Γ(α + β) / (Γ(α) Γ(β))]
           [Γ(α + k) Γ(β + n - k) / Γ(α + β + n)]
with
E(X) = E(nπ) = nα / (α + β)
Var(X) = E[nπ(1 - π)] + var[nπ] = nαβ(α + β + n) / [(α + β)² (α + β + 1)]
___________________________
PROBABILITY DISTRIBUTIONS
Theoretical distributions refer to a set of mathematical models
of the relative frequencies of a finite number of observations of
a variable. A theoretical distribution is a systematic arrangement
of the probabilities of mutually exclusive and collectively
exhaustive elementary events of an experiment. Observed
frequency distributions are based upon actual observation and
experimentation. We can deduce mathematically a frequency
distribution of a certain population based on the trend of the
known values. This kind of distribution, based on experience or
theoretical considerations, is known as a theoretical distribution
or probability distribution. These distributions may not fully
agree with actual observations or with the empirical distributions
based on sample observations. If the number of experiments is
increased sufficiently, the observed distributions may come
closer to the theoretical or probability distributions. Theoretical
distributions are useful for situations where actual observations
or experiments are not possible. Moreover, they can be used to
test the goodness of fit. They provide decision makers with a
logical basis for making decisions and are useful in making
predictions on the basis of limited information or theoretical
considerations.
There are broadly three theoretical distributions which are
generally applied in practice. They are:
1. Binomial distribution
2. Poisson distribution
3. Normal distribution
Binomial Distribution
It is a discrete distribution. The binomial distribution was
discovered by James Bernoulli in 1700 to deal with
dichotomous classification of events. It is a probability
distribution expressing the probability of one set of
dichotomous alternatives, i.e., success or failure. The binomial
probability distribution is developed under some assumptions
which are :
(i) An experiment is performed under similar conditions for a
number of times.
(ii) Each trial shall give two possible outcomes of the
experiment: success or failure.
S = {failure, success}
(iii) The probability of a success, denoted by p, remains constant
for all trials. The probability of a failure, denoted by q, is equal
to (1 - p).
(iv) All trials for an experiment are independent.
If a trial of an experiment can result in success with probability
p and failure with probability q = (1 - p), the probability of
exactly x successes in n trials is given by
P(x) = nCx p^x q^(n - x),  where x = 0, 1, 2, . . . , n
where P(x) = probability of x successes and
nCx = n! / [x!(n - x)!]  (! denotes factorial)
The entire probability distribution of x = 0, 1, 2, . . . , n can be
written as follows:
Binomial Probability Distribution
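The table itself is easy to tabulate from the formula; the values n = 4, p = 0.5 below are illustrative:

```python
from math import comb

# Tabulate the binomial distribution P(x) = nCx * p**x * q**(n - x)
# for x = 0..n, as described in the text.
def binom_table(n, p):
    return {x: comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)}

for x, prob in binom_table(4, 0.5).items():
    print(x, prob)  # 0 0.0625, 1 0.25, 2 0.375, 3 0.25, 4 0.0625
```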