mathematics that will be essential in our succeeding discussions. I fashioned this module to be an easy
read. I will not discuss these in class anymore!
Also, please bring a 3x5 index card next meeting and a 1x1 picture of yourself.
Have a good weekend and start this term right!
Note: We define X & Y as random variables.
I.
a. Expected Value (Mean)
$E(X) = \mu_X = \sum_{X} X \, f(X)$
*Simply put, the equation says that the expected value of X is the weighted sum of the
actual/realized values of X, where each value is weighted by its probability of occurring, f(X).
The mean is the long-run average value of the random variable over many trials.
It is a measure of central tendency (another one is the median).
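As a minimal sketch of the weighted sum described above, the snippet below computes the expected value of a fair six-sided die. The die, the dictionary layout of `pmf`, and the function name `expected_value` are illustrative assumptions and are not part of this module.

```python
from fractions import Fraction

def expected_value(pmf):
    """Weighted sum of the realized values, weighted by their probabilities f(X)."""
    return sum(x * p for x, p in pmf.items())

# Hypothetical example: each face of a fair die has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean_die = expected_value(die)
print(float(mean_die))  # 3.5
```

Note that 3.5 is never an actual roll; it is the average we would approach over many trials.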
b. Variance
$Var(X) = \sigma_X^2 = E[(X - \mu_X)^2] = \sum_{X} (X - \mu_X)^2 \, f(X)$
It is the weighted sum of the squared deviations of the actual values from the long-run average
(a measure of variability).
As you can see from the equation, the variance is the weighted sum of the squared deviations of
our actual data from our computed average. So to say, it is just the average squared distance of
the actual data from the mean.
So why is it squared? Squaring keeps positive and negative deviations from cancelling each other
out; remember that the square of a negative number is positive.
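The cancelling-out problem can be seen in a short sketch. It reuses the hypothetical fair die as an illustration; the function names are assumptions made for this example only.

```python
from fractions import Fraction

def expected_value(pmf):
    """Weighted sum of the realized values."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Weighted sum of the squared deviations from the mean."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}
mu = expected_value(die)

# Without squaring, positive and negative deviations cancel exactly.
unsquared = sum((x - mu) * p for x, p in die.items())
var_die = variance(die)

print(unsquared)  # 0
print(var_die)    # 35/12
```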
c. Covariance
$Cov(X, Y) = \sum_{X} \sum_{Y} (X - \mu_X)(Y - \mu_Y) \, f(X, Y) = E\{(X - \mu_X)(Y - \mu_Y)\}$
From the equation, it can be seen that the covariance is the weighted sum of the products of the
deviations of the random variables (X & Y) from their average values ($\mu_X$ & $\mu_Y$).
It measures how the variables co-move.
For example, looking at the equation, it can be seen that if the difference of X from its mean is
positive and the difference of Y from its mean is negative, the product of the two will be a
negative number. Repeating this for each realized pair of values of X & Y and summing the weighted
products draws a clear picture of how X & Y move together. If the sum is a positive (negative)
number, then the variables tend to move in the same (opposite) direction.
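The sign logic above can be checked with a small sketch. Both joint distributions below are made-up numbers chosen for illustration, and storing the joint probabilities in a dictionary keyed by (x, y) pairs is an assumption of this example.

```python
from fractions import Fraction

def covariance(joint):
    """Weighted sum of the products of deviations from the means."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    return sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Hypothetical case 1: X and Y usually take the same value.
same = {
    (0, 0): Fraction(2, 5), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5),
}

# Hypothetical case 2: X and Y usually take opposite values.
opposite = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 5),
    (1, 0): Fraction(2, 5), (1, 1): Fraction(1, 10),
}

cov_same = covariance(same)
cov_opposite = covariance(opposite)
print(cov_same)      # 3/20
print(cov_opposite)  # -3/20
```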
d. Correlation Coefficient
$\rho_{XY} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$
The correlation coefficient is the standardized covariance, meaning we divide the covariance by
the product of the standard deviations of X & Y.
It measures the degree/strength of linear association between the two variables.
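A rough sketch of the standardization follows. The joint distribution is a made-up example, and the helper names `moments` and `correlation` are assumptions of this sketch. Multiplying X by 100 (as if changing its units) changes the covariance but leaves the correlation coefficient the same.

```python
import math

def moments(joint):
    """Return (covariance, variance of X, variance of Y) of a joint distribution."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())
    var_y = sum((y - mu_y) ** 2 * p for (x, y), p in joint.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
    return cov, var_x, var_y

def correlation(joint):
    """Covariance divided by the product of the standard deviations."""
    cov, var_x, var_y = moments(joint)
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))

# Hypothetical joint distribution f(X, Y), keyed by (x, y) pairs.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Same distribution with X expressed in units 100 times smaller.
rescaled = {(100 * x, y): p for (x, y), p in joint.items()}

rho = correlation(joint)
rho_rescaled = correlation(rescaled)
print(moments(joint)[0], moments(rescaled)[0])  # covariance grows 100-fold
print(rho, rho_rescaled)                        # both approximately 0.6
```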
Why do we standardize? It is to eliminate the units, because it is possible that X & Y have
different units.
The correlation coefficient is a number that lies between -1 and 1.
II. Hypothesis Testing
We define $H_0$ as the null hypothesis and $H_1$ as the alternative hypothesis.
The null hypothesis is our a priori expectation or our conventional wisdom, while our alternative
hypothesis is our challenge to this belief.
Therefore we can define hypothesis testing as trying to disprove current beliefs (the null) about the
population by using statistics computed from our sample at hand (mean, variance, etc.) as our evidence.
Possible outcomes of hypothesis testing
Decision         Null is true        Null is false
Reject           Type I error        Correct decision
Do Not Reject    Correct decision    Type II error
III.
The p-value is the lowest significance level at which the null could be rejected.
It can also be read as the exact chance of committing a type I error if we reject the null on the
basis of our computed statistic.
We usually set the significance level (our preset p-value threshold) at .01, .05 or .10.
The rejection rule: if the computed p-value (from testing) is less than our preset significance level
(either .01, .05 or .10), then we reject the null hypothesis.
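The rejection rule can be sketched as below. The two-sided z-test, the sample numbers, and the function names are assumptions chosen only to produce a p-value to compare against the preset thresholds; they are not taken from this module.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Chance, under the null, of a z statistic at least as extreme as the one observed."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha):
    """Apply the rejection rule: reject when the p-value is below the preset threshold."""
    return "Reject" if p_value < alpha else "Do Not Reject"

# Hypothetical numbers: sample mean 52, hypothesized mean 50,
# known standard deviation 10, sample size 100.
z = (52 - 50) / (10 / 100 ** 0.5)
p_value = two_sided_p_value(z)  # roughly 0.0455

for alpha in (0.01, 0.05, 0.10):
    print(alpha, decide(p_value, alpha))
```

With these made-up numbers the null is rejected at the .05 and .10 thresholds but not at .01, which shows why the threshold must be fixed before testing.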
Why?
Think of our preset p-value threshold (the significance level) as our maximum tolerable chance of
committing a mistake. If the computed p-value is greater than this threshold, then rejecting the null
would carry a chance of committing a type I error that is higher than we are willing to accept, so
we do not reject.

IV.
These properties will be very helpful to you once we start deriving in our discussions.
a.
Taken from: https://www.math.lsu.edu/~stoltz/Courses/06FM1550/Summary/reg_part.pdf (you can review the summation notation from this source)