Stats Class Notes

Stats Notes
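Only multiplying the data changes the standard deviation - adding a constant shifts the mean but leaves s unchanged; multiplying by c multiplies s by |c|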

s is the sample standard deviation
Z scores, z = (x - mean)/s, give the data a mean of 0 and a variance of 1
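A quick check of the notes on shifting, scaling and z scores. This is a minimal sketch using made-up numbers and only the Python standard library; the variable names are illustrative.

```python
import statistics

data = [2.0, 4.0, 4.0, 6.0, 9.0]  # made-up values for illustration

s = statistics.stdev(data)                             # sample standard deviation
s_shifted = statistics.stdev([x + 10 for x in data])   # add a constant to every value
s_scaled = statistics.stdev([3 * x for x in data])     # multiply every value by 3

mean = statistics.mean(data)
z = [(x - mean) / s for x in data]                     # z scores

print(s, s_shifted, s_scaled)                      # s_shifted matches s; s_scaled is 3 times s
print(statistics.mean(z), statistics.variance(z))  # approximately 0 and 1
```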
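Empirical rule: used when you don't have the data - relies on the assumption that the distribution is mound shaped (about 68%, 95% and 99.7% of data within 1, 2 and 3 standard deviations of the mean)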
Chebyshev's rule: the proportion of data that lies within k standard deviations of the
mean is at least
1 - 1/k^2, where k is larger than 1
The rule always works - no assumptions about the shape of the data
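Chebyshev's bound can be checked against data. A minimal sketch: the helper name chebyshev_bound and the data set are made up for illustration, and the data is deliberately not mound shaped.

```python
import statistics

def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be larger than 1")
    return 1 - 1 / k**2

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]  # made-up, deliberately skewed
mean = statistics.mean(data)
s = statistics.stdev(data)

k = 2
# Observed proportion of values within k standard deviations of the mean
within = sum(1 for x in data if abs(x - mean) <= k * s) / len(data)

print(chebyshev_bound(k))  # guaranteed minimum proportion for k = 2
print(within)              # observed proportion, never below the bound
```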
Classic outlier detection: |Z| >= 2 - but a poor system, as the mean and standard
deviation are themselves affected by outliers, which makes the outliers harder to detect
Box plot rule (IQR rule): X is an outlier if X < Q1 - 1.5(IQR) or X > Q3 + 1.5(IQR), where IQR = Q3 - Q1
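Both outlier rules applied to the same made-up data set. This is only a sketch: the quartiles come from statistics.quantiles with its default method, which is an assumption here, since textbooks and software use slightly different quartile conventions.

```python
import statistics

data = [10, 12, 12, 13, 14, 15, 15, 16, 18, 60]  # made-up; 60 is the suspect value

# Z score rule: flag values with |z| >= 2
mean = statistics.mean(data)
s = statistics.stdev(data)
z_outliers = [x for x in data if abs((x - mean) / s) >= 2]

# Box plot (IQR) rule: flag values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
iqr_outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print(z_outliers, iqr_outliers)
```

Note that the value 60 inflates both the mean and s used by the z score rule, while Q1 and Q3 barely move.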
Mean (first moment), variance (second moment), skewness (third moment)
Skewness = Σ(i=1 to n) (xi - mean)^3 / (n s^3)
Old skewness formula = 3(mean - median)/standard deviation
Transformations to symmetrize data - Y = sqrt(x), Y = log(x), Y = 1/x
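The skewness formula and the effect of a log transformation, sketched on made-up right-skewed data. The function name skewness is illustrative; it follows the third-moment formula with n and the sample standard deviation s in the denominator.

```python
import math
import statistics

def skewness(values):
    """Third-moment skewness: sum((x - mean)^3) / (n * s^3)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return sum((x - mean) ** 3 for x in values) / (n * s**3)

data = [1, 2, 2, 3, 4, 6, 9, 15, 40]   # made-up, long right tail
logged = [math.log(x) for x in data]   # Y = log(x)

print(skewness(data))    # clearly positive (right skewed)
print(skewness(logged))  # still positive here, but much closer to 0
```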
Covariance shows positive, negative or no relationship between two variables
Correlation gives the strength and direction of the relationship between two variables and ranges
between -1 and 1
positive covariance - x goes up, y goes up
negative - opposite
s_xy (s with subscript xy) means the covariance between x and y
s_xy = 1/(n-1) Σ(i=1 to n) (xi - x bar)(yi - y bar)
the Covariance Matrix - covariance of x with itself is the variance of x

You square the standard deviation to get the variance: variance = s^2
The correlation coefficient: r_xy = s_xy / (s_x s_y)
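Covariance and correlation computed straight from the definitions. A minimal sketch with made-up paired data; sample_cov is an illustrative helper name.

```python
import statistics

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up paired data
y = [2.0, 4.0, 5.0, 4.0, 5.0]

def sample_cov(a, b):
    """Sample covariance: (1/(n-1)) * sum((a_i - a_bar) * (b_i - b_bar))."""
    a_bar, b_bar = statistics.mean(a), statistics.mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)

s_xy = sample_cov(x, y)                                     # covariance of x and y
r_xy = s_xy / (statistics.stdev(x) * statistics.stdev(y))   # correlation coefficient

print(s_xy, r_xy)
print(sample_cov(x, x), statistics.variance(x))  # covariance of x with itself is its variance
```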
Combining data sets - to merge X and Y, create Z where Z = aX + bY
Mean of Z = a(mean of X) + b(mean of Y)
Variance of Z = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X,Y)
To minimise risk for a mixed portfolio, choose X and Y with negative covariance
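Why negative covariance lowers portfolio risk, as a small sketch. The weights, variances and covariances are hypothetical numbers chosen for illustration, and the function names are not from the course.

```python
def combined_mean(a, b, mean_x, mean_y):
    """Mean of Z = aX + bY."""
    return a * mean_x + b * mean_y

def combined_variance(a, b, var_x, var_y, cov_xy):
    """Variance of Z = aX + bY: a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)."""
    return a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

# Hypothetical 50/50 portfolio of two assets
var_x, var_y = 0.04, 0.09
for cov_xy in (0.03, 0.0, -0.03):
    # The portfolio variance falls as the covariance moves from positive to negative
    print(cov_xy, combined_variance(0.5, 0.5, var_x, var_y, cov_xy))
```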
Slide 272 maths Slide 252
An exact correlation of 1 (or -1) means the points lie exactly on a straight line
Y = b0 + b1X
We have new notation: Yi-hat = b0 + b1Xi
We define our fitting error as ei = Yi - b0 - b1Xi = Yi - Yi-hat
Most popular method of fitting a line - Least-squares Criterion Function
min over b0, b1 of Σ(i=1 to n) (Yi - b0 - b1Xi)^2
bottom number is b0 top is b1
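A sketch of fitting the line by least squares on made-up data. The closed-form solution used here (b1 = s_xy / s_x^2, b0 = y bar - b1 * x bar) is the standard result of minimising the criterion; it is not written out in the notes, so treat it as an added assumption.

```python
import statistics

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up data
y = [2.0, 4.0, 5.0, 4.0, 5.0]

x_bar, y_bar = statistics.mean(x), statistics.mean(y)

# Closed-form least-squares estimates (the 1/(n-1) factors cancel in the ratio)
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]               # fitted values Yi-hat
errors = [yi - yh for yi, yh in zip(y, y_hat)]   # fitting errors ei
sse = sum(e**2 for e in errors)                  # value of the least-squares criterion

print(b0, b1, sse)
```

With an intercept in the model, the fitting errors sum to zero (up to rounding).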
VAR(bX + dY) = b^2 VAR(X) + d^2 VAR(Y) + 2bd COV(X,Y)
Subjective, logical, experimental probabilities
If A and B are independent, P(A and B) = P(A)P(B)
If not, P(A and B) = P(A|B)P(B)
Independent if P(A|B) = P(A)
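The independence checks above, worked through for two fair dice. The events A, B and C are made up for illustration; exact fractions are used so the equalities can be tested without rounding.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice, 36 equally likely outcomes

def prob(event):
    """Probability of an event given as a true/false function of an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def A(o): return o[0] % 2 == 0        # first die is even
def B(o): return o[1] > 4             # second die shows 5 or 6
def C(o): return o[0] + o[1] >= 10    # total is at least 10

p_a_and_b = prob(lambda o: A(o) and B(o))
p_a_given_b = p_a_and_b / prob(B)     # P(A|B) = P(A and B) / P(B)

p_a_and_c = prob(lambda o: A(o) and C(o))
p_a_given_c = p_a_and_c / prob(C)

print(p_a_and_b, prob(A) * prob(B))   # 1/6 and 1/6: A and B are independent
print(p_a_given_c, prob(A))           # 2/3 and 1/2: A and C are not independent
```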
A random variable is a rule or function that translates the outcomes of a probability
experiment into numbers - Discrete (only a finite or countable set of values) or Continuous
(any value in an interval)
We use capital letters to denote random variables and lowercase letters to denote
their values, e.g. P(R = r)
Sample vs population - x bar is the sample version of μ
s^2 is the sample version of σ^2
P_X(x) = P(X = x)
σ_X^2 = Σ (x - μ_X)^2 P(X = x) = Var(X) = E(X^2) - (μ_X)^2
Chebyshev's rule also applies to random variables, using μ and σ in place of the sample mean and s
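The two variance formulas for a discrete random variable give the same answer. A minimal sketch with a hypothetical probability distribution; the dictionary pmf maps each value x to P(X = x).

```python
# Hypothetical distribution of a discrete random variable X: value -> P(X = x)
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mu = sum(x * p for x, p in pmf.items())                        # E(X)
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())       # sum of (x - mu)^2 P(X = x)
var_short = sum(x**2 * p for x, p in pmf.items()) - mu**2      # E(X^2) - mu^2

print(mu, var_def, var_short)  # about 1.1, 0.49 and 0.49
```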
For independent random variables: P_X,Y(x, y) = P(X=x and Y=y) = P_X(x) P_Y(y)
To show independence we must show that P(X=x and Y=y) = P(X=x)*P(Y=y) for every pair of values x and y
Variances simply add, Var(X + Y) = Var(X) + Var(Y), only when the covariance is zero, which holds when the two variables are independent
D~Bin(n=5,p=0.15)
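The D ~ Bin(n=5, p=0.15) example can be tabulated directly. This sketch uses the standard binomial probability formula, which is not written out in the notes; binom_pmf is an illustrative name.

```python
from math import comb

def binom_pmf(d, n, p):
    """P(D = d) for D ~ Bin(n, p): C(n, d) * p^d * (1 - p)^(n - d)."""
    return comb(n, d) * p**d * (1 - p) ** (n - d)

n, p = 5, 0.15
pmf = [binom_pmf(d, n, p) for d in range(n + 1)]

mean = sum(d * q for d, q in enumerate(pmf))
variance = sum((d - mean) ** 2 * q for d, q in enumerate(pmf))

print(pmf[0])     # P(D = 0) = 0.85^5
print(mean)       # equals n * p
print(variance)   # equals n * p * (1 - p)
```

The variance n * p * (1 - p) is n copies of the single-trial variance p * (1 - p) added together, which works because the trials are independent.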
Tables give area to the left
The variance of a sum of several independent random variables is the sum of their
variances
Variances never subtract - for independent X and Y, Var(X - Y) = Var(X) + Var(Y)
p = proportion of population with a certain characteristic - p hat is the sample statistic
used to estimate p
μ = mean value of a population variable - x bar is the sample statistic used to estimate μ
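The sample mean is random - it changes from sample to sample
Averages of things are approximately normally distributed (bell shaped) when n is large
X bar - random (before the sample is taken)
x bar - realised (the value from the observed sample)
As n goes towards infinity, X bar goes to μ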
For most distributions, n > 30 will give a sampling distribution that is nearly normal
For fairly symmetric distributions, n > 15
If the data are already normal, X bar is normal for any sample size
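A small simulation of the sampling distribution of X bar. This is a sketch under stated assumptions: the population is exponential with mean 1 and standard deviation 1 (chosen because it is strongly skewed), n = 30, and the seed is fixed only so the run is repeatable.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1 and sd 1."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

n = 30
means = [sample_mean(n) for _ in range(5000)]   # 5000 realised values of X bar

print(statistics.mean(means))                      # close to the population mean, 1
print(statistics.stdev(means), 1 / math.sqrt(n))   # close to sigma / sqrt(n)
```

A histogram of means would look roughly bell shaped even though the population itself is far from normal.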
Construct a 95% confidence interval for μ with x bar ± 1.96 s/sqrt(n)
For a proportion: p hat ± 1.96 sqrt(p hat (1 - p hat)/n)
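Both interval formulas as small functions. A minimal sketch: the function names and example numbers are made up, and 1.96 is the large-sample 95% multiplier.

```python
import math
import statistics

def mean_ci(values, z=1.96):
    """Large-sample interval for a mean: x bar +/- z * s / sqrt(n)."""
    n = len(values)
    x_bar = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

def proportion_ci(successes, n, z=1.96):
    """Large-sample interval for a proportion: p hat +/- z * sqrt(p hat (1 - p hat) / n)."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

data = list(range(1, 37))        # made-up sample of n = 36 values
print(mean_ci(data))             # interval centred on x bar = 18.5
print(proportion_ci(40, 100))    # roughly (0.304, 0.496)
```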
Type 1 error - reject the null when it's true
Type 2 error - don't reject the null when it's false
Type 1 error is normally the worse one (so we minimise its probability)
A confidence interval shows the range of plausible values (so whether the true value lies above or below a given number), whereas hypothesis testing only shows if one particular value is plausible
For a hypothesis test we reject Ho when μ0 (mu nought) is not in (x bar ± 1.96 s/sqrt(n)), which is the
same as the t score = (x bar - μ0)/(s/sqrt(n)) lying outside ±1.96
The question is: does μ = μ0, i.e. is x bar close enough to μ0?
The P value is the probability of seeing data at least as extreme as ours if Ho is true - it is not the probability that Ho is true (if P is low, Ho must go)
For hypothesis testing, three ways: t score, P value, confidence interval (the t score
with the 1.96 cutoff doesn't work for small sample sizes)
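The three approaches give the same decision in the large-sample case. This is a sketch under assumptions: a two-sided test at the 5% level, the normal approximation for the P value (via statistics.NormalDist), made-up data, and illustrative function names.

```python
import math
import statistics
from statistics import NormalDist

def t_score(values, mu0):
    """(x bar - mu0) / (s / sqrt(n))."""
    n = len(values)
    return (statistics.mean(values) - mu0) / (statistics.stdev(values) / math.sqrt(n))

def reject_by_t(values, mu0, z=1.96):
    return abs(t_score(values, mu0)) > z

def reject_by_ci(values, mu0, z=1.96):
    n = len(values)
    half_width = z * statistics.stdev(values) / math.sqrt(n)
    x_bar = statistics.mean(values)
    return not (x_bar - half_width <= mu0 <= x_bar + half_width)

def p_value(values, mu0):
    """Two-sided P value using the normal approximation (large n assumed)."""
    return 2 * (1 - NormalDist().cdf(abs(t_score(values, mu0))))

data = list(range(1, 37))   # made-up sample, n = 36, x bar = 18.5
for mu0 in (18, 25):
    print(mu0, reject_by_t(data, mu0), reject_by_ci(data, mu0), p_value(data, mu0) < 0.05)
```

For mu0 = 18 none of the three rejects; for mu0 = 25 all three reject.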
