STATISTICS
Report on:
DEGREES OF FREEDOM
Married man: There is only one subject, and my degrees of freedom
are zero. I should increase my "sample size."
Degrees of freedom are "the number of values in a distribution that are free to vary for any
particular statistic." In layman's terms, degrees of freedom count the values that remain free
to vary once the restrictions on the data are taken into account; using the correct count
reduces the error of the results in statistics.
This article is an attempt to understand the concept of degrees of freedom that occurs
throughout statistics and is used extensively in statistical inference.
The term “degrees of freedom” was introduced by Sir Ronald Fisher in 1922 without
mentioning its purpose. Various definitions of “degrees of freedom” have been given:
1. It is the number of independent parameters that are needed to specify the
configuration of a system.
2. It refers to the number of independent variables involved in a statistic.
3. It is a number which in some way represents the size of the sample/samples used
in a statistical test. (In some cases it is the sample size, and sometimes it has to be
calculated depending on the kind of test).
4. It is a parameter that appears in some probability distributions used in statistical
inference, particularly the t-distribution, chi-squared distribution and F-distribution.
5. It is a positive integer (although fractional numbers can occur in some
approximations), normally equal to the number of independent observations
in a sample minus the number of population parameters to be estimated from the sample.
Though all the above definitions attempt to explain the concept of degrees
of freedom, they appear vague and inconsistent with each other. However, a clearer
and more concise definition can be: the number of pieces of information that can be
freely varied without violating any restriction.
This definition can be illustrated with a contingency table.
For example, if we take a 2 x 2 contingency table with the marginal totals provided,
we have only one degree of freedom: we are free to choose the value of only one of
the four cells, and the values of the remaining three cells are then determined by
the marginal totals and the one value we chose.
          Column 1   Column 2   Total
Row 1     X          Y1         24
Row 2     Y2         Y3         20
Total     15         29         44
In the above example X is the only independent value, and therefore we have only
one degree of freedom. Once we set its value, for example X=5, all the other values
become known: Y1=19, Y2=10 and Y3=10.
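The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name fill_two_by_two and its argument names are made up for this example, and the cell names follow the table above.

```python
def fill_two_by_two(x, row_totals, col_totals):
    """Fill a 2 x 2 table from one free cell and the marginal totals.

    Layout (cell names follow the text):
        X   Y1 | row_totals[0]
        Y2  Y3 | row_totals[1]
    """
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row and column totals must share one grand total")
    y1 = row_totals[0] - x   # the rest of the first row
    y2 = col_totals[0] - x   # the rest of the first column
    y3 = row_totals[1] - y2  # the rest of the second row
    return [[x, y1], [y2, y3]]


table = fill_two_by_two(5, row_totals=(24, 20), col_totals=(15, 29))
print(table)  # [[5, 19], [10, 10]]
```

Only x is an input; every other cell is computed, which is what one degree of freedom means here.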
Suppose that in the above example we are given only the grand total. Then we have
three degrees of freedom.
          Column 1   Column 2   Total
Row 1     X1         X2
Row 2     X3         Y
Total                           44
Here we can set three values independently. Suppose X1=10, X2=20 and X3=5; then Y
must be equal to 9 so that the four cells add up to the grand total of 44.
Thus, at the most basic level, we can understand degrees of freedom as the number of
pieces of information that can be freely varied without violating any restriction.
In the second contingency table, where only the grand total was provided, we
calculate the degrees of freedom from the number of cells N as
N - 1 = 4 - 1 = 3
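Both table examples fit one rule: the degrees of freedom equal the number of values minus the number of independent restrictions placed on them. The sketch below is an illustration rather than a standard routine. It writes each restriction as a row of 0/1 coefficients over the four cells and counts the independent ones with Gaussian elimination; the helper names rank and degrees_of_freedom are invented for this example.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows), by Gaussian elimination on exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank_count = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a not-yet-used row with a nonzero entry in this column.
        pivot = next((r for r in range(rank_count, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank_count], m[pivot] = m[pivot], m[rank_count]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != rank_count and m[r][col] != 0:
                factor = m[r][col] / m[rank_count][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank_count])]
        rank_count += 1
    return rank_count


def degrees_of_freedom(n_values, constraints):
    """Number of values minus the number of independent linear restrictions."""
    return n_values - rank(constraints)


# Cells ordered as [top-left, top-right, bottom-left, bottom-right].
grand_total_only = [[1, 1, 1, 1]]
all_marginals = [
    [1, 1, 0, 0],  # first row total
    [0, 0, 1, 1],  # second row total
    [1, 0, 1, 0],  # first column total
    [0, 1, 0, 1],  # second column total
]
print(degrees_of_freedom(4, grand_total_only))  # 3
print(degrees_of_freedom(4, all_marginals))     # 1
```

The four marginal totals count as only three independent restrictions, because the row totals and the column totals must add up to the same grand total.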
Degrees of freedom can also be described as the maximum number of quantities whose values
are free to vary before the remainder of the quantities are determined. Examples of degrees
of freedom are present in everyday life. For example, suppose that in a day a student has to
attend 4 classes, i.e. English, Math, Science and History, within a specific set of time slots.
The student's schedule has 3 degrees of freedom: the student can place 3 of the classes in
time slots of his own choosing, but after those 3 classes are placed, the slot for the fourth
class is determined by default.
Furthermore, if the student is given the restriction that he has to attend the English
class first, his degrees of freedom decrease by one to 2, because the slot for English
can no longer be selected freely and cannot vary.
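The schedule example can be sketched as follows, assuming four classes and four time slots. The slot numbers and the function name complete_schedule are hypothetical and serve only as illustration.

```python
CLASSES = ["English", "Math", "Science", "History"]
SLOTS = [1, 2, 3, 4]  # hypothetical period numbers


def complete_schedule(chosen):
    """Given slots for three classes, return the schedule with the fourth filled in."""
    missing_classes = [c for c in CLASSES if c not in chosen]
    free_slots = [s for s in SLOTS if s not in chosen.values()]
    if len(missing_classes) != 1 or len(free_slots) != 1:
        raise ValueError("exactly three classes must be placed in distinct slots")
    schedule = dict(chosen)
    schedule[missing_classes[0]] = free_slots[0]  # no choice is left here
    return schedule


print(complete_schedule({"Math": 3, "Science": 1, "History": 4}))
# {'Math': 3, 'Science': 1, 'History': 4, 'English': 2}
```

Three slots are inputs and the fourth is computed, matching the 3 degrees of freedom. Fixing English to slot 1 in advance would leave only two of the inputs free.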
There are many other situations where we can talk about "degrees of freedom". For
instance, we may be studying the relationship between age, sex, and college GPA in a
population. In this situation each person is described by three values that can vary
independently, so we have three "degrees of freedom" for each person.
[Figure: F distribution with 3 and 36 degrees of freedom, where F = s1²/s2²]
To justify why there are n - 1 degrees of freedom when calculating a sample variance,
consider an example. Suppose there is a sample of 8 observations and nothing is known
about them, so each value can be chosen freely. To compute the sample variance of the 8
observations, however, we first need the sample mean:
X̄ = ∑Xi / n
Now suppose X̄ = 60. It is no longer true that all the observations are free to vary,
because
x1+x2+x3+x4+x5+x6+x7+x8 = nX̄ = 8 × 60 = 480
That is, the sum of the 8 observations must be 480. Seven of the values in the sample
are free to vary, but once those seven are fixed, the eighth value is determined by default.
Therefore there are 7 degrees of freedom, i.e. 8 - 1: seven values in the sample can vary
and the 8th value is obtained by default.
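A short numerical check may make this concrete. It assumes, as in the example, a sample of 8 observations whose mean is fixed at 60, so that the observations must sum to 8 × 60 = 480. The seven freely chosen values below are made up for illustration.

```python
import statistics

n = 8
target_mean = 60  # the fixed sample mean from the example

# Seven values chosen freely (illustrative numbers, not from the text).
free_values = [52, 61, 58, 67, 60, 63, 55]

# The eighth value is forced: all eight must sum to n * target_mean.
eighth = n * target_mean - sum(free_values)
sample = free_values + [eighth]

deviations = [x - target_mean for x in sample]
sum_of_squares = sum(d * d for d in deviations)

print(eighth)                       # 64
print(sum(deviations))              # 0
print(sum_of_squares / (n - 1))     # 24.0
print(statistics.variance(sample))  # same value; variance() divides by n - 1
```

Because the deviations from the mean must sum to zero, only seven of them carry independent information, which is why the sum of squares is divided by n - 1 rather than n.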