Math Essentials
Jeff Howbert
Winter 2012
Notation
a ∈ A      set membership: a is a member of set A
| B |      cardinality: number of items in set B
|| v ||    norm: length of vector v
Notation
x, y, z, u, v   vector (bold, lower case)
A, B, X         matrix (bold, upper case)
y = f( x )      function (map): assigns unique value in range of y to each value in domain of x
dy / dx         derivative of y with respect to single variable x
y = f( x )      function on multiple variables, i.e. a vector of variables; function in n-space
∂y / ∂xi        partial derivative of y with respect to element i of vector x
Intuition:
In some process, several outcomes are possible. When the process is repeated a large number of times, each outcome occurs with a characteristic relative frequency, or probability. If a particular outcome happens more often than another outcome, we say it is more probable.
Probability spaces
A probability space is a random process or experiment with three components:
- Ω, the set of possible outcomes O
  - number of possible outcomes = | Ω | = N
- F, the set of possible events E
  - an event comprises 0 to N outcomes
  - number of possible events = | F | = 2^N
- P, the probability distribution, which assigns a probability p( E ) to every event E
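These counts can be checked directly. A minimal sketch (using a made-up three-outcome space) enumerates every event as a subset of Ω, confirming | F | = 2^N:

```python
from itertools import combinations

# Hypothetical outcome set with N = 3 outcomes.
omega = {"a", "b", "c"}

# An event is any subset of omega, from the empty event up to omega itself.
events = [set(c) for r in range(len(omega) + 1)
          for c in combinations(sorted(omega), r)]

print(len(events))  # 2**3 = 8 possible events
```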
Axioms of probability
1. Non-negativity: for any event E ∈ F, p( E ) ≥ 0
2. All possible outcomes: p( Ω ) = 1
3. Additivity of disjoint events: for all events E, E′ ∈ F where E ∩ E′ = ∅, p( E ∪ E′ ) = p( E ) + p( E′ )
Probability distributions
Discrete: example: sum of two fair dice
Continuous: example: waiting time between eruptions of Old Faithful (minutes)
[Figures: probability histogram for the dice sum; probability density for the Old Faithful waiting times]
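The discrete example can be tabulated by brute force. A short sketch (plain Python, exact arithmetic via Fraction) builds the distribution of the sum of two fair dice:

```python
from itertools import product
from fractions import Fraction

# Enumerate all 36 equally likely outcomes for two fair dice.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    s = d1 + d2
    pmf[s] = pmf.get(s, Fraction(0)) + Fraction(1, 36)

print(pmf[7])             # 1/6, the most probable sum
print(sum(pmf.values()))  # 1, as the axioms require
```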
Random variables
A random variable X is a function that associates a number x with each outcome O of a process.
Common notation: X( O ) = x, or just X = x.
Basically a way to redefine (usually simplify) a probability space to a new probability space.
- X must obey axioms of probability (over the possible values of x)
- X can be discrete or continuous
Example: X = number of heads in three flips of a coin
- Possible values of X are 0, 1, 2, 3
- p( X = 0 ) = p( X = 3 ) = 1 / 8
- p( X = 1 ) = p( X = 2 ) = 3 / 8
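The coin-flip example can be verified by treating X as a literal function from the 8 equally likely outcomes to the number of heads:

```python
from itertools import product
from fractions import Fraction

# Outcome space: all 2**3 = 8 equally likely sequences of three flips.
outcomes = list(product("HT", repeat=3))

# Random variable X( O ) = number of heads in outcome O.
def X(outcome):
    return outcome.count("H")

# p( X = x ) for each possible value x.
p = {x: Fraction(sum(1 for o in outcomes if X(o) == x), len(outcomes))
     for x in range(4)}

# p[0] == p[3] == 1/8 and p[1] == p[2] == 3/8, matching the slide.
print(p)
```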
[Figure: joint probability of X and Y, where X = model type (sport, SUV, minivan, sedan) and Y = manufacturer (American, Asian, European); probability axis from 0 to 0.2]
Expected value
Given:
- A discrete random variable X, with possible values x = x1, x2, …, xn
- Probabilities p( X = xi ) that X takes on the various values of xi
- A function yi = f( xi ) defined on X
The expected value of f is the probability-weighted average value of f( xi ):
E( f ) = Σi p( xi ) f( xi )
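As an illustration of the formula (reusing the two-dice sum as the random variable, with f the identity), E( X ) works out to 7:

```python
from itertools import product
from fractions import Fraction

# p( X = s ) for the sum s of two fair dice.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    pmf[d1 + d2] = pmf.get(d1 + d2, Fraction(0)) + Fraction(1, 36)

# E( f ) = sum_i p( x_i ) f( x_i ); here f( x ) = x, giving the mean of X.
expected = sum(p * x for x, p in pmf.items())
print(expected)  # 7
```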
Mean of a distribution
μ = E( X ) = Σi p( xi ) xi
Compare to formula for mean of an actual sample:
x̄ = ( 1 / N ) Σi=1..N xi
Variance of a distribution
σ² = E( ( x − μ )² ) = Σi p( xi ) ( xi − μ )²
σ is the standard deviation.
Compare to formula for variance of an actual sample:
s² = ( 1 / ( N − 1 ) ) Σi=1..N ( xi − x̄ )²
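The sample formulas can be cross-checked against Python's statistics module, which implements the same 1/N mean and 1/(N−1) sample variance (the data below are made up for illustration):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample
N = len(data)

# Sample mean: ( 1 / N ) * sum of the data.
mean = sum(data) / N

# Sample variance: 1 / ( N - 1 ) times the sum of squared deviations.
var = sum((x - mean) ** 2 for x in data) / (N - 1)

# statistics.mean and statistics.variance use the same formulas.
assert abs(mean - statistics.mean(data)) < 1e-12
assert abs(var - statistics.variance(data)) < 1e-12
print(mean, var)  # mean 5.0, sample variance 32/7
```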
Covariance
cov( x, y ) = ( 1 / ( N − 1 ) ) Σi=1..N ( xi − x̄ )( yi − ȳ )
[Figures: scatter plots illustrating high (positive) covariance and no covariance]
Correlation
Pearson's correlation coefficient is covariance normalized by the standard deviations of the two variables:
corr( x, y ) = cov( x, y ) / ( σx σy )
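Both sample formulas can be sketched directly in plain Python (the paired data are hypothetical, chosen to give a positive but imperfect correlation):

```python
# Hypothetical paired samples for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
N = len(x)

xbar = sum(x) / N
ybar = sum(y) / N

# cov( x, y ) = 1/(N-1) * sum_i ( x_i - xbar )( y_i - ybar )
cov = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (N - 1)

# Pearson's correlation: covariance divided by the two standard deviations.
sx = (sum((xi - xbar) ** 2 for xi in x) / (N - 1)) ** 0.5
sy = (sum((yi - ybar) ** 2 for yi in y) / (N - 1)) ** 0.5
corr = cov / (sx * sy)
print(round(corr, 3))  # 0.775
```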
Complement rule
Given: event A, which can occur or not
p( not A ) = 1 − p( A )
Product rule
Given: events A and B, which can co-occur (or not)
p( A, B ) = p( A | B ) p( B )
(same expression given previously to define conditional probability)
[Venn diagram: regions ( A, B ), ( A, not B ), ( not A, B ), ( not A, not B )]
Sum rule
Given: events A and B, which can co-occur (or not)
p( A ) = p( A, B ) + p( A, not B )
(same expression given previously to define marginal probability)
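Both the product rule and the sum rule can be checked on a small, made-up joint distribution over two binary events:

```python
# Hypothetical joint distribution p( A, B ) over two binary events.
joint = {
    (True, True): 0.2, (True, False): 0.1,
    (False, True): 0.3, (False, False): 0.4,
}

# Sum rule (marginal probability): p( A ) = p( A, B ) + p( A, not B )
p_A = joint[(True, True)] + joint[(True, False)]
p_B = joint[(True, True)] + joint[(False, True)]

# Conditional probability: p( A | B ) = p( A, B ) / p( B )
p_A_given_B = joint[(True, True)] / p_B

# Product rule: p( A, B ) = p( A | B ) p( B )
assert abs(p_A_given_B * p_B - joint[(True, True)]) < 1e-12
print(round(p_A, 10), round(p_B, 10))
```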
Independence
Given: events A and B, which can co-occur (or not)
p( A | B ) = p( A )
or
p( A, B ) = p( A ) p( B )
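Independence is not always obvious; as a sketch (events chosen for illustration), with two fair dice the events "first die is even" and "the sum is 7" turn out to be independent, which exhaustive enumeration confirms:

```python
from itertools import product
from fractions import Fraction

# Enumerate the 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o[0] % 2 == 0      # first die is even
B = lambda o: o[0] + o[1] == 7   # the sum is 7

p_A, p_B = prob(A), prob(B)
p_AB = prob(lambda o: A(o) and B(o))

# Independence: p( A, B ) = p( A ) p( B )
print(p_AB == p_A * p_B)  # True
```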
Bayes rule
A way to find conditional probabilities for one variable when conditional probabilities for another variable are known.
p( B | A ) = p( A | B ) p( B ) / p( A )
where p( A ) = p( A, B ) + p( A, not B )
Bayes rule
p( B | A ) = p( A | B ) p( B ) / p( A )
Here p( B | A ) is the posterior probability, p( A | B ) is the likelihood, and p( B ) is the prior probability.
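A standard numeric illustration (all numbers hypothetical): a diagnostic test with a known likelihood and a low prior, where Bayes rule gives the posterior probability of B (disease) given A (a positive test):

```python
# Hypothetical numbers for illustration.
p_B = 0.01              # prior: p( disease )
p_A_given_B = 0.90      # likelihood: p( positive | disease )
p_A_given_notB = 0.05   # false positive rate: p( positive | no disease )

# Sum rule: p( A ) = p( A, B ) + p( A, not B )
p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)

# Bayes rule: p( B | A ) = p( A | B ) p( B ) / p( A )
posterior = p_A_given_B * p_B / p_A
print(round(posterior, 3))  # 0.154: still unlikely despite a positive test
```

Despite the test being 90% sensitive, the low prior keeps the posterior small; this is the usual base-rate caution Bayes rule makes explicit.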