
Uploaded by Rudra Ardur


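Arithmetic mean

Ungrouped data:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Grouped data:
$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$$

fi are frequencies

xi are mid values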

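Median

Ungrouped data:

Step1: Arrange the given data in ascending order.

Step2: Count the given number of observations, n.

Step3:

Case I : If the number of observations is odd,
$$\text{median} = \left(\frac{n+1}{2}\right)^{th} \text{ element in ascending order}$$

Case II: If the number of observations is even,
$$\text{median} = \frac{\left(\frac{n}{2}\right)^{th} \text{ element} + \left(\frac{n}{2}+1\right)^{th} \text{ element}}{2} \text{ in ascending order}$$

Grouped data:
$$\text{median} = l + \frac{\frac{N}{2} - m}{f} \times h$$

N = total frequency

l = lower bound of the median class

m = cumulative frequency up to (i.e. preceding) the median class

f = frequency of the median class

h = width of the class interval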

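Mode

Ungrouped data: The observation which is repeated the greatest number of times is called the modal value.

Grouped data:
$$\text{mode} = l + \frac{f - f_0}{2f - f_0 - f_1} \times h$$

f = maximum frequency

f0 = frequency of the class above (preceding) the class with the maximum frequency

f1 = frequency of the class below (following) the class with the maximum frequency

l = lower bound of the class in which the maximum frequency lies

h = width of the class interval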

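Geometric mean (G.M) and Harmonic mean (H.M)

UNGROUPED DATA:
$$G.M = \text{Antilog}\left(\frac{\sum_{i=1}^{n} \log x_i}{n}\right), \qquad H.M = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

GROUPED DATA:
$$G.M = \text{Antilog}\left(\frac{\sum_{i=1}^{n} f_i \log x_i}{N}\right), \qquad H.M = \frac{N}{\sum_{i=1}^{n} \frac{f_i}{x_i}}$$

fi are frequencies

xi are mid values

N = total frequency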
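The geometric and harmonic mean formulas can be checked numerically with a short sketch. The function names and the small data sets below are illustrative assumptions, not part of the notes; passing no frequencies treats the data as ungrouped.

```python
import math

def geometric_mean(values, freqs=None):
    """G.M. = antilog( sum(f * log x) / N ); freqs default to 1 (ungrouped)."""
    freqs = freqs or [1] * len(values)
    total = sum(freqs)
    mean_log = sum(f * math.log10(x) for x, f in zip(values, freqs)) / total
    return 10 ** mean_log

def harmonic_mean(values, freqs=None):
    """H.M. = N / sum(f / x); freqs default to 1 (ungrouped)."""
    freqs = freqs or [1] * len(values)
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))

# Hypothetical ungrouped observations (not taken from the notes)
data = [2, 4, 8]
gm = geometric_mean(data)
hm = harmonic_mean(data)

# Hypothetical grouped data: class mid values with their frequencies
mids, class_freqs = [10, 20], [1, 3]
hm_grouped = harmonic_mean(mids, class_freqs)

print(round(gm, 4), round(hm, 4), round(hm_grouped, 4))
```

For these made-up values the G.M. is the cube root of 2 × 4 × 8 = 64, which gives a quick sanity check on the logarithm-based formula.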

MEASURES OF DISPERSION

Ungrouped data: Range = Max. Value − Min. Value

Grouped data: Range = Biggest upper bound − Smallest lower bound

Quartile deviation (Q.D)

Ungrouped data:

Step1: Arrange the given data in ascending order.

Step2: Count the given number of observations, n.

Step3:

Case I : If the number of observations is odd,
$$Q_3 = \left(\frac{3(n+1)}{4}\right)^{th} \text{ element in ascending order}, \qquad Q_1 = \left(\frac{n+1}{4}\right)^{th} \text{ element in ascending order}$$

Case II: If the number of observations is even,
$$Q_3 = \left(\frac{3n}{4}\right)^{th} \text{ element in ascending order}, \qquad Q_1 = \left(\frac{n}{4}\right)^{th} \text{ element in ascending order}$$

Step4:
$$Q.D = \frac{Q_3 - Q_1}{2}$$

Grouped data:
$$Q_3 = l_3 + \frac{\frac{3N}{4} - m_3}{f_3} \times h, \qquad Q_1 = l_1 + \frac{\frac{N}{4} - m_1}{f_1} \times h$$

N = total frequency

l3 = lower bound of the (3N/4) class; l1 = lower bound of the (N/4) class

m3 = cumulative frequency preceding the (3N/4) class; m1 = cumulative frequency preceding the (N/4) class

f3 = frequency of the (3N/4) class; f1 = frequency of the (N/4) class

h = width of the class interval

$$Q.D = \frac{Q_3 - Q_1}{2}$$

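Mean deviation (M.D)

UNGROUPED DATA:
$$M.D = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}, \quad \text{where } \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

GROUPED DATA:
$$M.D = \frac{\sum_{i=1}^{n} f_i |x_i - \bar{x}|}{N}, \quad \text{where } \bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{N}$$

fi are frequencies

xi are mid values

N = total frequency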

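Standard deviation (S.D)

UNGROUPED DATA:
$$S.D = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}, \quad \text{where } \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

GROUPED DATA:
$$S.D = \sqrt{\frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{N}}, \quad \text{where } \bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{N}$$

fi are frequencies

xi are mid values

N = total frequency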

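Coefficient of variation (C.V)

UNGROUPED DATA:
$$\text{mean} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad S.D = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}, \qquad C.V = \frac{s.d}{\text{mean}} \times 100$$

GROUPED DATA:
$$\text{mean} = \frac{\sum_{i=1}^{n} f_i x_i}{N}, \qquad S.D = \sqrt{\frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{N}}, \qquad C.V = \frac{s.d}{\text{mean}} \times 100$$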

TESTING OF HYPOTHESIS

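Population: The totality of individuals or items under investigation is called the population.

Eg:- If we want to study the average life time of electric bulbs, the population is the set of all electric bulbs under study.

Sample: A part of the population selected for study is called a sample, and the number of elements in that sample is known as the sample size.

Sampling error: The error involved in estimating population characteristics on the basis of the sample is known as sampling error. It is inherent and unavoidable in any sampling scheme.

Parameter: A statistical measure computed from the population is known as a parameter.

Eg:- i) Population mean $\mu$ ii) Population variance $\sigma^2$

Statistic: A statistical measure computed from the sample is known as a statistic.

Eg:- i) Sample mean $\bar{x}$ ii) Sample variance $s^2$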

Tests of significance: An important aspect of sampling theory is the study of the tests of significance, which enable us to decide on the basis of the sample whether

i) the deviation between the observed sample statistic and the hypothetical parameter value is significant or not.

ii) the deviation between two sample statistics is significant or not.

Null hypothesis: To apply a test of significance we first set up a hypothesis, i.e. a definite statement about the population parameter. Such a hypothesis, which is usually a hypothesis of no difference, is called the null hypothesis.

According to R.A. Fisher, the null hypothesis is the hypothesis which is tested for possible rejection under the assumption that it is true. It is usually denoted by H0.

Alternative hypothesis: Any hypothesis which is complementary to the null hypothesis is called an alternative hypothesis. It is usually denoted by H1.

Degrees of freedom: It is the number of independent observations in a set. In a test of hypothesis a sample is

drawn from the population of which the parameter is under a test. The

size of the sample varies since it depends either on the experimenter

or on the resources available. Moreover the test statistic involves the

estimated value of the parameter which depends on the number of

observations. Hence the sample size plays an important role in testing

of hypothesis and is taken care of by degrees of freedom.

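Two types of error can arise in testing a hypothesis.

                 Decision from Sample
                 Reject H0                 Accept H0
H0 True          Wrong (Type I error)      Correct
H0 False         Correct                   Wrong (Type II error)

Type I error: Probability of rejecting H0 when H0 is true.

Type II error: Probability of accepting H0 when H0 is false.

Level of significance: The probability of Type I error is known as the level of significance of the test. It is also called the size of the critical region.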

Point estimation: The objective is to find a statistic whose value is close to the true value of the parameter. The method of finding such a statistic is known as point estimation.

Eg:- The sample mean is a point estimate of the population mean.

Interval estimation: The objective is to find an interval in which a parameter is expected to lie with a particular probability. The method of finding such an interval is called interval estimation.

Eg:- The 95% confidence interval for the population mean is
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$

Step 1: Null hypothesis: Set up the null hypothesis H0.

Step 2: Alternative hypothesis: Set up the alternative hypothesis H1.

Step 3: Level of significance: Choose the appropriate level of

significance which is fixed in advance.

Step 4: Test statistic: Compute the test statistic under H0.

Step 5: Inference: We compare the computed value with the table value/critical value. If the absolute computed value is less than or equal to the table value/critical value, we accept H0. Otherwise we reject H0.

Critical values:

Level of significance    1%       5%       10%
Two-tailed test          2.58     1.96     1.645
Right-tailed test        2.33     1.645    1.28
Left-tailed test         -2.33    -1.645   -1.28

t-test

Test for Single mean:

H0: $\mu = \mu_0$

H1: $\mu \neq \mu_0$ (or) $\mu > \mu_0$ (or) $\mu < \mu_0$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim t_{n-1} \text{ at } \alpha\% \text{ level of significance}$$

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $|t_{cal}| \ge t_{\alpha/2,\,n-1}$. Otherwise accept H0.

2) If it is a right tailed test (>), reject H0 if $t_{cal} \ge t_{\alpha,\,n-1}$. Otherwise accept H0.

3) If it is a left tailed test (<), reject H0 if $t_{cal} \le -t_{\alpha,\,n-1}$. Otherwise accept H0.

1) A random sample of 10 individuals is taken and their heights (in inches) are given below. Test at 5% level of significance whether the sample comes from a normal population whose mean height is 66 inches.

63,63,66,67,68,69,70,70,71,71.

Sol:-

H0: $\mu = 66$

H1: $\mu \neq 66$

Level of significance:

Appropriate level of significance is 5% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim t_{n-1} \text{ at } \alpha\% \text{ level of significance}$$

$$\bar{x} = \frac{\sum x_i}{n} = 67.8, \qquad s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} = 3.0111$$

$$t = \frac{67.8 - 66}{3.0111/\sqrt{10}} = 1.89 \sim t_9 \text{ at 5% level of significance}$$

Inference:

The tabulated value of t at 5% level of significance for 9 degrees of freedom in a two tailed test is 2.262.

[$t_{\alpha/2,\,n-1} = t_{0.05/2,\,10-1} = t_{0.025,\,9} = 2.262$]

Here, $|t_{cal}| < t_{\alpha/2,\,n-1}$. So, we accept H0. Hence we conclude that the sample comes from the population whose mean height is 66 inches.
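The single-mean t-test arithmetic above can be reproduced with a minimal sketch using only the standard library. The function name is an assumption for illustration, and the critical value 2.262 is simply the table value quoted in the example rather than something the code derives.

```python
import math

def one_sample_t(sample, mu0):
    """Return (mean, s, t) for a one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample standard deviation with divisor n - 1
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    t = (mean - mu0) / (s / math.sqrt(n))
    return mean, s, t

heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
mean, s, t_cal = one_sample_t(heights, 66)

t_tab = 2.262  # t(0.025, 9), table value quoted in the worked example
reject_h0 = abs(t_cal) >= t_tab  # two-tailed decision rule

print(round(mean, 1), round(s, 4), round(t_cal, 2), reject_h0)
```

The decision flag comes out false, which matches the conclusion that H0 is accepted.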

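H0: $\mu_x = \mu_y$

H1: $\mu_x \neq \mu_y$ (or) $\mu_x > \mu_y$ (or) $\mu_x < \mu_y$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1+n_2-2} \text{ at } \alpha\% \text{ level of significance}$$

$$\text{where } S = \sqrt{\frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2}}$$

Inference:

1) In a two tailed test ($\neq$), reject H0 if $|t_{cal}| \ge t_{\alpha/2,\,n_1+n_2-2}$. Otherwise accept H0.

2) In a right tailed test (>), reject H0 if $t_{cal} \ge t_{\alpha,\,n_1+n_2-2}$. Otherwise accept H0.

3) In a left tailed test (<), reject H0 if $t_{cal} \le -t_{\alpha,\,n_1+n_2-2}$. Otherwise accept H0.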

1) The following are the heat producing capacities (in millions of calories per ton) of specimens of coal from two mines.

Mine 1    8260   8130   8350   8070   8340
Mine 2    7950   7890   7900   8140   7920   7840

Use the 0.01 level of significance to test whether the difference between the means of these two samples is significant.

Sol:-

H0: $\mu_x = \mu_y$

H1: $\mu_x \neq \mu_y$

Level of significance:

Appropriate level of significance is 1% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1+n_2-2} \text{ at } \alpha\% \text{ level of significance}$$

$$\bar{x} = \frac{\sum x_i}{n_1} = 8230, \qquad \bar{y} = \frac{\sum y_i}{n_2} = 7940$$

$$S = \sqrt{\frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2}} = 114.31$$

Under H0, the test statistic is given by
$$t = \frac{8230 - 7940}{114.31\sqrt{\frac{1}{5} + \frac{1}{6}}} = 4.19 \sim t_9 \text{ at 1% level of significance}$$

Inference:

The tabulated value of t at 1% level of significance for 9 degrees of freedom in a two tailed test is 3.250.

[$t_{\alpha/2,\,n_1+n_2-2} = t_{0.01/2,\,9} = t_{0.005,\,9} = 3.250$]

Here, $|t_{cal}| > t_{tab}$. So, we reject H0.
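As a cross-check of the two-sample calculation for the coal data, the pooled standard deviation and the t statistic can be sketched as follows. The function name is hypothetical and the critical value 3.250 is the table value quoted in the example.

```python
import math

def pooled_t(x, y):
    """Two-sample t statistic with a pooled S (divisor n1 + n2 - 2)."""
    n1, n2 = len(x), len(y)
    xbar, ybar = sum(x) / n1, sum(y) / n2
    ss = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y)
    S = math.sqrt(ss / (n1 + n2 - 2))
    t = (xbar - ybar) / (S * math.sqrt(1 / n1 + 1 / n2))
    return xbar, ybar, S, t

mine1 = [8260, 8130, 8350, 8070, 8340]
mine2 = [7950, 7890, 7900, 8140, 7920, 7840]
xbar, ybar, S, t_cal = pooled_t(mine1, mine2)

t_tab = 3.250  # t(0.005, 9), table value quoted in the worked example
reject_h0 = abs(t_cal) >= t_tab

print(xbar, ybar, round(S, 2), round(t_cal, 2), reject_h0)
```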

inches. Those of 10 randomly chosen soldiers are

61,62,65,66,69,69,70,71,72,73 inches. Discuss whether this data gives a

suggestion that the sailors are taller than soldiers.

Sol:-

H0: $\mu_x = \mu_y$

H1: $\mu_x > \mu_y$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1+n_2-2} \text{ at } \alpha\% \text{ level of significance}$$

$$\bar{x} = \frac{\sum x_i}{n_1} = 68, \qquad \bar{y} = \frac{\sum y_i}{n_2} = 67.8$$

$$S = \sqrt{\frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2}} = \sqrt{\frac{60 + 153.6}{14}} = 3.906$$

$$t = \frac{68 - 67.8}{3.906\sqrt{\frac{1}{6} + \frac{1}{10}}} = 0.099 \sim t_{14} \text{ at 5% level of significance}$$

Inference:

The tabulated value of t at 5% level of significance for 14 degrees of freedom in a right tailed test is 1.761.

[$t_{\alpha,\,n_1+n_2-2} = t_{0.05,\,14} = 1.761$]

Here, $t_{cal} < t_{\alpha,\,n_1+n_2-2}$. So, we accept H0. Hence we conclude that the data do not suggest that the sailors are, on average, taller than the soldiers.

Test for equality of Two means (n1= n2) (Paired t-test)

H0: $\mu_d = 0$

H1: $\mu_d \neq 0$ (or) $\mu_d > 0$ (or) $\mu_d < 0$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} \sim t_{n-1} \text{ at } \alpha\% \text{ level of significance}$$

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $|t_{cal}| \ge t_{\alpha/2,\,n-1}$. Otherwise accept H0.

2) If it is a right tailed test (>), reject H0 if $t_{cal} \ge t_{\alpha,\,n-1}$. Otherwise accept H0.

3) If it is a left tailed test (<), reject H0 if $t_{cal} \le -t_{\alpha,\,n-1}$. Otherwise accept H0.

1) The following are the average weekly losses of working hours due to accidents in 10 industrial plants before and after a certain safety program was put into operation. Use the 5% level of significance to test whether the safety program is effective.

Before    45   73   46   124   33   57   83   34   26   17
After     36   60   44   119   35   51   77   29   24   11

Sol:-

H0: $\mu_d = 0$ (There is no difference in the average working hours lost before and after the safety program is put into operation)

H1: $\mu_d > 0$ (The safety program is effective, i.e. the average working hours lost decreased after the safety program was implemented)

Level of significance:

Appropriate level of significance is 5% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} \sim t_{n-1} \text{ at } \alpha\% \text{ level of significance}$$

di (Before − After):   9   13   2   5   -2   6   6   5   2   6

$$\bar{d} = \frac{\sum d_i}{n} = 5.2, \qquad s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}} = 4.08$$

$$t = \frac{5.2 - 0}{4.08/\sqrt{10}} = 4.030 \sim t_9 \text{ at 5% level of significance}$$

Inference:

The tabulated value of t at 5% level of significance for 9 degrees of freedom in a right tailed test is 1.833.

[$t_{\alpha,\,n-1} = t_{0.05,\,10-1} = t_{0.05,\,9} = 1.833$]

Here, $t_{cal} > t_{\alpha,\,n-1}$. So, we reject H0. Hence we conclude that the safety program is effective.
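The paired calculation can be sketched in a few lines: form the differences, then apply the single-sample t formula to them. Variable names are illustrative, and 1.833 is the table value quoted in the example.

```python
import math

before = [45, 73, 46, 124, 33, 57, 83, 34, 26, 17]
after = [36, 60, 44, 119, 35, 51, 77, 29, 24, 11]

# Paired differences d_i = before - after
d = [b - a for b, a in zip(before, after)]
n = len(d)
d_bar = sum(d) / n
s_d = math.sqrt(sum((v - d_bar) ** 2 for v in d) / (n - 1))
t_cal = (d_bar - 0) / (s_d / math.sqrt(n))

t_tab = 1.833  # t(0.05, 9), table value quoted in the worked example
reject_h0 = t_cal >= t_tab  # right-tailed decision rule

print(d, round(d_bar, 1), round(s_d, 2), round(t_cal, 2), reject_h0)
```

Carrying full precision for s_d gives a t value that agrees with the hand calculation to two decimal places.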

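H0: $\sigma^2 = \sigma_0^2$

H1: $\sigma^2 \neq \sigma_0^2$ (or) $\sigma^2 > \sigma_0^2$ (or) $\sigma^2 < \sigma_0^2$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$\chi^2 = \frac{\sum (x_i - \bar{x})^2}{\sigma^2} \sim \chi^2_{n-1} \text{ at } \alpha\% \text{ level of significance}$$

or, equivalently,
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1} \text{ at } \alpha\% \text{ level of significance}$$

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $\chi^2_{cal} \ge \chi^2_{\alpha/2,\,n-1}$ (or if $\chi^2_{cal} \le \chi^2_{1-\alpha/2,\,n-1}$). Otherwise accept H0.

2) If it is a right tailed test (>), reject H0 if $\chi^2_{cal} \ge \chi^2_{\alpha,\,n-1}$. Otherwise accept H0.

3) If it is a left tailed test (<), reject H0 if $\chi^2_{cal} \le \chi^2_{1-\alpha,\,n-1}$. Otherwise accept H0.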

1) If 12 observations of the specific heat of iron have a standard deviation of 0.0086, test the null hypothesis that $\sigma = 0.01$ for such observations. Use the alternative hypothesis $\sigma \neq 0.01$ and level of significance 0.01.

Sol:-

H0: $\sigma = 0.01$

H1: $\sigma \neq 0.01$

Level of significance:

Appropriate level of significance is 1% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1} \text{ at } \alpha\% \text{ level of significance}$$

$$\chi^2 = \frac{11 \times 0.0086^2}{0.01^2} = 8.1356$$

Inference:

The table value of $\chi^2$ at 1% level of significance for 11 d.f in a two tailed test is 26.757.

$\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.005,\,11} = 26.757$

Here, $\chi^2_{cal} < \chi^2_{tab}$. So, we accept H0.

14,15,13,21,14,12,15,16,18,20,22,24,14,12,10. Test whether the population variance is 7.5 at 5% level of significance.

Sol:-

H0: $\sigma^2 = 7.5$

H1: $\sigma^2 \neq 7.5$

Level of significance:

Appropriate level of significance is 5% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$\chi^2 = \frac{\sum (x_i - \bar{x})^2}{\sigma^2} \sim \chi^2_{n-1} \text{ at } \alpha\% \text{ level of significance}$$

$$\chi^2 = \frac{236}{7.5} = 31.4667$$

Inference:

The table value of $\chi^2$ at 5% level of significance for 14 d.f in a two tailed test is 26.119.

$\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.025,\,14} = 26.119$

Here, $\chi^2_{cal} > \chi^2_{tab}$. So, we reject H0.
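The sum of squared deviations (236) and the resulting statistic for the 15 observations can be verified with a small sketch. The function name is an assumption, and 26.119 is the table value quoted in the example.

```python
def chi_square_variance(sample, sigma0_sq):
    """Return (sum of squared deviations, chi-square statistic) for H0: variance = sigma0_sq."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)
    return ss, ss / sigma0_sq

data = [14, 15, 13, 21, 14, 12, 15, 16, 18, 20, 22, 24, 14, 12, 10]
ss, chi2_cal = chi_square_variance(data, 7.5)

chi2_tab = 26.119  # chi-square(0.025, 14), table value quoted in the worked example
reject_h0 = chi2_cal >= chi2_tab

print(ss, round(chi2_cal, 4), reject_h0)
```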

H0: $\sigma_x^2 = \sigma_y^2$

H1: $\sigma_x^2 \neq \sigma_y^2$ (or) $\sigma_x^2 > \sigma_y^2$ (or) $\sigma_x^2 < \sigma_y^2$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$F = \frac{S_x^2}{S_y^2} \sim F_{n_1-1,\,n_2-1} \text{ at } \alpha\% \text{ level of significance}$$

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $F_{cal} \ge F_{\alpha,\,n_1-1,\,n_2-1}$. Otherwise accept H0.

2) For both right tailed and left tailed tests (>, <), reject H0 if $F_{cal} \ge F_{\alpha,\,n_1-1,\,n_2-1}$. Otherwise accept H0.

Note:-

$F_{cal} \ge 1$, i.e. the calculated value of F must always be greater than or equal to 1. If it is not, the test statistic becomes 1/F (the larger variance goes in the numerator and the degrees of freedom are interchanged accordingly).

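1) Random samples from two normal populations are given below.

Sample 1    16   26   27   23   24   22
Sample 2    33   42   35   32   28   31

Sol:-

H0: $\sigma_x^2 = \sigma_y^2$

H1: $\sigma_x^2 \neq \sigma_y^2$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$F = \frac{S_x^2}{S_y^2} \sim F_{n_1-1,\,n_2-1} \text{ at } \alpha\% \text{ level of significance}$$

$$\bar{x} = 23, \qquad S_x^2 = \frac{\sum (x_i - \bar{x})^2}{n_1 - 1} = 15.2$$

$$\bar{y} = 33.5, \qquad S_y^2 = \frac{\sum (y_i - \bar{y})^2}{n_2 - 1} = 22.7$$

$$F = \frac{15.2}{22.7} = 0.6696 < 1$$

Since F < 1, we take the reciprocal:
$$F = \frac{22.7}{15.2} = 1.4934 \sim F_{5,5} \text{ at 5% level of significance}$$

Inference:

The table value of F at 5% L.O.S for (5,5) d.f is 5.05.

$F_{\alpha,\,n_2-1,\,n_1-1} = F_{0.05,\,5,\,5} = 5.05$

$F_{cal} < F_{tab}$. We accept H0.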
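The variance-ratio calculation, including the rule that the larger sample variance goes in the numerator, can be sketched as below. The helper name is hypothetical, and 5.05 is the table value quoted in the example.

```python
def sample_variance(xs):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

sample1 = [16, 26, 27, 23, 24, 22]
sample2 = [33, 42, 35, 32, 28, 31]

var_x = sample_variance(sample1)
var_y = sample_variance(sample2)

# Keep F >= 1 by putting the larger variance in the numerator
F_cal = max(var_x, var_y) / min(var_x, var_y)

F_tab = 5.05  # F(0.05; 5, 5), table value quoted in the worked example
reject_h0 = F_cal >= F_tab

print(round(var_x, 1), round(var_y, 1), round(F_cal, 4), reject_h0)
```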

the following values

Sample 1    9    11   13   11   15   9    12   14
Sample 2    10   12   10   14   9    8    10

significant at 1% level of significance?

Sol:-

H0: $\sigma_x^2 = \sigma_y^2$

H1: $\sigma_x^2 \neq \sigma_y^2$

Level of significance:

Appropriate level of significance is 1% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$F = \frac{S_x^2}{S_y^2} \sim F_{n_1-1,\,n_2-1} \text{ at } \alpha\% \text{ level of significance}$$

$$\bar{x} = 11.75, \qquad S_x^2 = \frac{\sum (x_i - \bar{x})^2}{n_1 - 1} = 4.786$$

$$\bar{y} = 10.43, \qquad S_y^2 = \frac{\sum (y_i - \bar{y})^2}{n_2 - 1} = 3.952$$

$$F = \frac{4.786}{3.952} = 1.21 > 1 \sim F_{7,6} \text{ at 1% level of significance}$$

Inference:

The table value of F at 1% L.O.S for (7,6) d.f is 8.26.

$F_{\alpha,\,n_1-1,\,n_2-1} = F_{0.01,\,7,\,6} = 8.26$

$F_{cal} < F_{tab}$. We accept H0.

Level of significance    1%       5%       10%
Two-tailed test          2.58     1.96     1.645
Right-tailed test        2.33     1.645    1.28
Left-tailed test         -2.33    -1.645   -1.28

H0: $\mu = \mu_0$

H1: $\mu \neq \mu_0$ (or) $\mu > \mu_0$ (or) $\mu < \mu_0$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $|Z_{cal}| \ge Z_{critical\ value}$. Otherwise accept H0.

2) If it is a right tailed test (>), reject H0 if $Z_{cal} \ge Z_{critical\ value}$. Otherwise accept H0.

3) If it is a left tailed test (<), reject H0 if $Z_{cal} \le Z_{critical\ value}$. Otherwise accept H0.

1) A trucking firm is suspicious of the claim that the average life time of certain tyres is at least 28,000 miles. To check this claim the firm puts 40 of these tyres on its trucks and gets a mean life time of 27,468 miles with a standard deviation of 1,348 miles. What can we conclude if the probability of type I error is to be at most 0.01?

Sol:-

H0: $\mu \ge 28{,}000$ miles

H1: $\mu < 28{,}000$ miles

Level of significance:

Appropriate level of significance is 1% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, the test statistic is given by
$$z = \frac{27468 - 28000}{1348/\sqrt{40}} = -2.50$$

Inference:

The critical value of Z at 1% level of significance in a left tailed test is -2.33.

Here, $Z_{cal} < Z_{critical\ value}$. So, we reject H0.

2) A sample of 400 individuals has a mean height of 67.47 inches. Is it reasonable to regard the sample as drawn from a large population with mean height 67.39 inches and standard deviation 1.3 inches? Test at 1% level of significance.

Sol:-

H0: $\mu = 67.39$

H1: $\mu \neq 67.39$

Level of significance:

Appropriate level of significance is 1% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

$$z = \frac{67.47 - 67.39}{1.3/\sqrt{400}} = 1.23$$

Inference:

The critical value of Z at 1% level of significance in a two tailed test is 2.58. Here, $|Z_{cal}| < Z_{critical\ value}$. So, we accept H0.

3) In a sample of size 64, the mean and standard deviation of the number of acceptable pieces produced by an automatic stamping machine are $\bar{x} = 1038$, $s = 146$. At 0.05 level of significance, does this enable us to reject the null hypothesis $\mu = 1000$ against the alternative hypothesis $\mu > 1000$?

Sol:-

H0: $\mu = 1000$

H1: $\mu > 1000$

Level of significance:

Appropriate level of significance is 5% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

$$z = \frac{1038 - 1000}{146/\sqrt{64}} = 2.08$$

Inference:

The critical value of Z at 5% level of significance in a right tailed test is 1.645.

Here, $Z_{cal} > Z_{critical\ value}$. So, we reject H0.

H0: $\mu_x = \mu_y$

H1: $\mu_x \neq \mu_y$ (or) $\mu_x > \mu_y$ (or) $\mu_x < \mu_y$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_1} + \frac{\sigma_y^2}{n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $|Z_{cal}| \ge Z_{critical\ value}$. Otherwise accept H0.

2) If it is a right tailed test (>), reject H0 if $Z_{cal} \ge Z_{critical\ value}$. Otherwise accept H0.

3) If it is a left tailed test (<), reject H0 if $Z_{cal} \le Z_{critical\ value}$. Otherwise accept H0.

1) A company claims that its light bulbs are superior to those of its main competitor. A study showed that a sample of n1 = 40 of its bulbs has a mean life time of 647 hours with a S.D of 27 hours, while a sample of n2 = 40 bulbs made by the competitor has a mean life time of 638 hours with a S.D of 31 hours. Does this substantiate the claim at 0.05 level of significance?

H0: $\mu_x = \mu_y$

H1: $\mu_x > \mu_y$

Level of significance:

Appropriate level of significance is 5% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_1} + \frac{\sigma_y^2}{n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, the test statistic becomes
$$Z = \frac{(647 - 638) - 0}{\sqrt{\frac{27^2}{40} + \frac{31^2}{40}}} = 1.3846$$

Inference:

The critical value of Z at 5% level of significance in a right tailed test is 1.645.

Here, $Z_{cal} < Z_{critical\ value}$. So, we accept H0. Hence we conclude that there is no significant difference between the mean lives of the two makes of bulb, i.e. the claim is not substantiated.

2) A college conducts both day and night classes intended to be equally effective. A sample of 100 day-students yields examination results $\bar{x}_1 = 72.4$, $S_1 = 14.8$. A sample of 200 night-students yields examination results $\bar{x}_2 = 73.9$, $S_2 = 17.9$. Are the two means statistically equal at 10% significance level?

H0: $\mu_x = \mu_y$

H1: $\mu_x \neq \mu_y$

Level of significance:

Appropriate level of significance is 10% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_1} + \frac{\sigma_y^2}{n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, the test statistic becomes
$$Z = \frac{(72.4 - 73.9) - 0}{\sqrt{\frac{14.8^2}{100} + \frac{17.9^2}{200}}} = -0.7702$$

Inference:

The critical value of Z at 10% level of significance in a two tailed test is 1.645.

Here, $|Z_{cal}| < Z_{critical\ value}$. So, we accept H0. Hence we conclude that there is no significant difference between the means.

3) The mean yield of two sets and their variability are given below. Test whether the difference in the mean yields of the two sets is significant at 5% level of significance.

                 Set I       Set II
Mean yield       1258 kg     1243 kg
S.D/plot         34 kg       28 kg
No. of plots     40          60

H0: $\mu_x = \mu_y$

H1: $\mu_x \neq \mu_y$

Level of significance:

Appropriate level of significance is 5% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_1} + \frac{\sigma_y^2}{n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, the test statistic becomes
$$Z = \frac{(1258 - 1243) - 0}{\sqrt{\frac{34^2}{40} + \frac{28^2}{60}}} = 2.3154$$

Inference:

The critical value of Z at 5% level of significance in a two tailed test is 1.96.

Here, $|Z_{cal}| > Z_{critical\ value}$. So, we reject H0. Hence we conclude that the mean yields of the two sets are not equal.
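Because the large-sample test for two means needs only summary statistics, it is easy to sketch as a small function. The function name and argument order are assumptions for illustration; sample S.Ds stand in for the population S.Ds, as in the worked examples, and 1.96 is the critical value from the table.

```python
import math

def z_two_means(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample Z statistic for H0: mu_x = mu_y from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Yield example: Set I vs Set II
z_cal = z_two_means(1258, 34, 40, 1243, 28, 60)

z_crit = 1.96  # two-tailed critical value at 5% from the table of critical values
reject_h0 = abs(z_cal) >= z_crit

print(round(z_cal, 3), reject_h0)
```

The same function applied to the light-bulb data (647, 27, 40 against 638, 31, 40) reproduces the value 1.3846 obtained there.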

H0: $\sigma^2 = \sigma_0^2$

H1: $\sigma^2 \neq \sigma_0^2$ (or) $\sigma^2 > \sigma_0^2$ (or) $\sigma^2 < \sigma_0^2$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{s - \sigma}{\sqrt{\frac{\sigma^2}{2n}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $|Z_{cal}| \ge Z_{critical\ value}$. Otherwise accept H0.

2) If it is a right tailed test (>), reject H0 if $Z_{cal} \ge Z_{critical\ value}$. Otherwise accept H0.

3) If it is a left tailed test (<), reject H0 if $Z_{cal} \le Z_{critical\ value}$. Otherwise accept H0.

1) A random sample of size 300 is drawn from a population and the sample variance is observed to be 113.59. Can the sample be regarded as drawn from the population with variance 100?

H0: $\sigma^2 = 100$

H1: $\sigma^2 \neq 100$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{s - \sigma}{\sqrt{\frac{\sigma^2}{2n}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

$$z = \frac{10.6578 - 10}{\sqrt{\frac{100}{2 \times 300}}} = 1.6113$$

Inference:

The critical value of z at 5% level of significance in a two tailed test is 1.96. $|Z_{cal}| < Z_{critical\ value}$. We accept H0. Hence we conclude that the sample has been drawn from the population whose variance is 100.

2) A random sample of size 900 is drawn from a population and the sample S.D is observed to be 2.52. Can the sample be regarded as drawn from the population with S.D 2.56?

H0: $\sigma = 2.56$

H1: $\sigma \neq 2.56$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{s - \sigma}{\sqrt{\frac{\sigma^2}{2n}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

$$z = \frac{2.52 - 2.56}{\sqrt{\frac{2.56^2}{2 \times 900}}} = -0.663$$

Inference:

The critical value of z at 5% level of significance in a two tailed test is 1.96. $|Z_{cal}| < Z_{critical\ value}$. We accept H0. Hence we conclude that the sample has been drawn from the population whose S.D is 2.56.

Test for Two Variances/ Two S.Ds:

H0: $\sigma_x^2 = \sigma_y^2$

H1: $\sigma_x^2 \neq \sigma_y^2$ (or) $\sigma_x^2 > \sigma_y^2$ (or) $\sigma_x^2 < \sigma_y^2$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{(s_x - s_y) - (\sigma_x - \sigma_y)}{\sqrt{\frac{\sigma_x^2}{2n_1} + \frac{\sigma_y^2}{2n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $|Z_{cal}| \ge Z_{critical\ value}$. Otherwise accept H0.

2) If it is a right tailed test (>), reject H0 if $Z_{cal} \ge Z_{critical\ value}$. Otherwise accept H0.

3) If it is a left tailed test (<), reject H0 if $Z_{cal} \le Z_{critical\ value}$. Otherwise accept H0.

1) The mean yield of two sets and their variability are given below.

i) Test whether the difference in the mean yields of the two sets is significant.

ii) Test whether the difference in the variability of the yields is significant.

                 Set I       Set II
Mean yield       1258 kg     1243 kg
S.D/plot         34 kg       28 kg
No. of plots     40          60

i) Test for the means

H0: $\mu_x = \mu_y$

H1: $\mu_x \neq \mu_y$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_1} + \frac{\sigma_y^2}{n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, the test statistic becomes
$$Z = \frac{(1258 - 1243) - 0}{\sqrt{\frac{34^2}{40} + \frac{28^2}{60}}} = 2.3154$$

Inference:

The critical value of Z at 5% level of significance in a two tailed test is 1.96. Here, $|Z_{cal}| > Z_{critical\ value}$. So, we reject H0. Hence we conclude that the mean yields of the two sets are not equal.

ii) Test for the S.Ds

H0: $\sigma_x^2 = \sigma_y^2$

H1: $\sigma_x^2 \neq \sigma_y^2$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{(s_x - s_y) - (\sigma_x - \sigma_y)}{\sqrt{\frac{\sigma_x^2}{2n_1} + \frac{\sigma_y^2}{2n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

$$z = \frac{(34 - 28) - 0}{\sqrt{\frac{34^2}{2 \times 40} + \frac{28^2}{2 \times 60}}} = 1.31$$

Inference:

The critical value of Z at 5% level of significance in a two tailed test is 1.96. Here, $|Z_{cal}| < Z_{critical\ value}$. So, we accept H0. Hence we conclude that the variabilities of the two sets are equal.

2) The following data relate to the heights of adult males in two countries.

i) Is the difference between the means significant?

ii) Is the difference between the S.Ds significant?

                   Country A        Country B
Mean height        67.42 inches     67.25 inches
S.D of heights     2.58 inches      2.50 inches
No. of samples     1000             1200

i) Test for the means

H0: $\mu_x = \mu_y$

H1: $\mu_x \neq \mu_y$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_1} + \frac{\sigma_y^2}{n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, the test statistic becomes
$$Z = \frac{(67.42 - 67.25) - 0}{\sqrt{\frac{2.58^2}{1000} + \frac{2.50^2}{1200}}} = 1.561$$

Inference:

The critical value of Z at 5% level of significance in a two tailed test is 1.96. Here, $|Z_{cal}| < Z_{critical\ value}$. So, we accept H0. Hence we conclude that the mean heights of the adult males in the two countries are equal.

ii) Test for the S.Ds

H0: $\sigma_x^2 = \sigma_y^2$

H1: $\sigma_x^2 \neq \sigma_y^2$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{(s_x - s_y) - (\sigma_x - \sigma_y)}{\sqrt{\frac{\sigma_x^2}{2n_1} + \frac{\sigma_y^2}{2n_2}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

$$z = \frac{(2.58 - 2.50) - 0}{\sqrt{\frac{2.58^2}{2 \times 1000} + \frac{2.50^2}{2 \times 1200}}} = 1.04$$

Inference:

The critical value of Z at 5% level of significance in a two tailed test is 1.96. Here, $|Z_{cal}| < Z_{critical\ value}$. So, we accept H0. Hence we conclude that the S.Ds of the two sets are equal.

H0: $P = P_0$

H1: $P \neq P_0$ (or) $P > P_0$ (or) $P < P_0$

[Which of these 3 alternatives applies depends on the given problem]

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, the test statistic becomes
$$z = \frac{p - P_0}{\sqrt{\frac{P_0 Q_0}{n}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

where p is the sample proportion and Q = 1 − P.

Inference:

1) If it is a two tailed test ($\neq$), reject H0 if $|Z_{cal}| \ge Z_{critical\ value}$. Otherwise accept H0.

2) If it is a right tailed test (>), reject H0 if $Z_{cal} \ge Z_{critical\ value}$. Otherwise accept H0.

3) If it is a left tailed test (<), reject H0 if $Z_{cal} \le Z_{critical\ value}$. Otherwise accept H0.

1) A coin was tossed 400 times and the head turned up 2/6 times. Test

the hypothesis that the coin is unbiased at 5% level of significance.

H0: P = 1/2 (The coin is unbiased, i.e. if an unbiased coin is tossed, the probability of a head turning up is 1/2)

H1: $P \neq 1/2$

Level of significance:

Appropriate level of significance is 5% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, with the observed proportion p = 2/6 = 0.3333, the test statistic becomes
$$z = \frac{0.3333 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{400}}} = -6.668$$

Inference:

The critical value of Z at 5% level of significance in a two tailed test is 1.96.

Here, $|Z_{cal}| > Z_{critical\ value}$. So, we reject H0. Hence we conclude that the coin is biased.

2) A die was thrown 9000 times and a throw of 5 or 6 was obtained 3240 times. On the assumption of random throwing, do the data indicate that the die is unbiased?

H0: P = 2/6 = 1/3 (If an unbiased die is thrown, the probability of getting 5 or 6 is 2/6)

H1: $P \neq 1/3$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

Under H0, with p = 3240/9000 = 0.36, the test statistic becomes
$$z = \frac{0.36 - 0.3333}{\sqrt{\frac{0.3333 \times 0.6667}{9000}}} = 5.37$$

Inference:

The critical value of Z at 5% level of significance in a two tailed test is 1.96. Here, $|Z_{cal}| > Z_{critical\ value}$. So, we reject H0. Hence we conclude that the die is biased.

1) In a sample of 600 men from a certain city, 450 men are found to be smokers. In a sample of 900 from another city, 450 are found to be smokers. Do the data indicate that the two cities are significantly different with respect to the prevalence of the smoking habit among men?

Sol:-

H0: $P_x = P_y$

H1: $P_x \neq P_y$

Level of significance:

Appropriate level of significance is 5% (chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{(p_x - p_y) - (P_x - P_y)}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

$$p_x = \frac{x}{n_1} = \frac{450}{600} = 0.75, \qquad p_y = \frac{y}{n_2} = \frac{450}{900} = 0.5, \qquad P = \frac{x + y}{n_1 + n_2} = \frac{450 + 450}{600 + 900} = 0.6, \qquad Q = 1 - P = 0.4$$

$$z = \frac{(0.75 - 0.5) - 0}{\sqrt{0.6 \times 0.4\left(\frac{1}{600} + \frac{1}{900}\right)}} = 9.6824$$

Inference:

The critical value of Z at 5% level of significance in a two tailed test is 1.96.

Here, $|Z_{cal}| > Z_{critical\ value}$. So, we reject H0. Hence we conclude that the two cities are significantly different with respect to the prevalence of the smoking habit among men.
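The pooled-proportion calculation for the two cities can be sketched as a small function working from counts. The function name is an illustrative assumption, and 1.96 is the two-tailed critical value at 5% from the table.

```python
import math

def z_two_proportions(x, n1, y, n2):
    """Z statistic for H0: Px = Py using the pooled proportion P = (x + y) / (n1 + n2)."""
    px, py = x / n1, y / n2
    P = (x + y) / (n1 + n2)
    Q = 1 - P
    z = (px - py) / math.sqrt(P * Q * (1 / n1 + 1 / n2))
    return px, py, P, z

px, py, P, z_cal = z_two_proportions(450, 600, 450, 900)

z_crit = 1.96  # two-tailed critical value at 5%
reject_h0 = abs(z_cal) >= z_crit

print(px, py, round(P, 2), round(z_cal, 3), reject_h0)
```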

2) In a random sample of 2000 corporations, 58 had their 1992 returns audited. In another sample of 2500 corporations, 61 had their 1991 returns audited. Was the fraction of corporate returns audited in 1992 significantly different from the 1991 fraction? Test the appropriate hypotheses at $\alpha = 0.01$.

Sol:-

H0: $P_x = P_y$

H1: $P_x \neq P_y$

Level of significance:

Appropriate level of significance is 1% (given)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$z = \frac{(p_x - p_y) - (P_x - P_y)}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim N(0,1) \text{ at } \alpha\% \text{ level of significance}$$

$$p_x = \frac{x}{n_1} = \frac{58}{2000} = 0.029, \qquad p_y = \frac{y}{n_2} = \frac{61}{2500} = 0.0244, \qquad P = \frac{x + y}{n_1 + n_2} = \frac{58 + 61}{2000 + 2500} = 0.0264$$

$$z = \frac{(0.029 - 0.0244) - 0}{\sqrt{0.0264 \times 0.9736\left(\frac{1}{2000} + \frac{1}{2500}\right)}} \approx 0.96$$

Inference:

The critical value of Z at 1% level of significance in a two tailed test is 2.58.

Here, $|Z_{cal}| < Z_{critical\ value}$. So, we accept H0. Hence we conclude that there is no significant difference between the fractions of corporate returns audited in the years 1992 and 1991.

X    0    1    2     3     4     5    6
F    7    64   140   210   132   75   12

Sol:-

Mean = $\sum fx / \sum f$ = 1949/640 = 3.0453

p = mean/n = 3.0453/6 = 0.5075, q = 1 − p = 0.4925

x       f      fx      P(x) = nCx p^x q^(n-x)    N.P(x)    Expected frequency
0       7      0       0.014270                  9.13      9
1       64     64      0.088230                  56.47     56
2       140    280     0.227294                  145.47    145
3       210    630     0.312289                  199.87    200
4       132    528     0.241350                  154.46    154
5       75     375     0.099480                  63.67     64
6       12     72      0.017085                  10.93     11
Total   640    1949
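The expected binomial frequencies in the table can be regenerated with a short sketch. It assumes n = 6 trials, as implied by the range of x, and keeps p at full precision, so the results may differ from the tabulated values in the second decimal place.

```python
import math

# Observed frequency f for each value of x (from the table above)
observed = {0: 7, 1: 64, 2: 140, 3: 210, 4: 132, 5: 75, 6: 12}
n_trials = 6  # assumption: the largest possible value of x

N = sum(observed.values())
mean = sum(x * f for x, f in observed.items()) / N
p = mean / n_trials  # estimate of p from mean = n * p
q = 1 - p

# Expected frequency N * P(x), with P(x) = nCx p^x q^(n - x)
expected = {
    x: N * math.comb(n_trials, x) * p ** x * q ** (n_trials - x)
    for x in observed
}

for x, e in expected.items():
    print(x, observed[x], round(e, 2))
```

The expected frequencies add up to N because the binomial probabilities sum to one.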

The following mistakes per page were observed in a book. Fit a Poisson distribution and test the goodness of fit.

No. of mistakes / page    0     1    2    3   4
No. of pages              211   90   19   5   0

Sol:-

$\lambda = \sum fx / \sum f$ = 143/325 = 0.44

x       f      fx     P(x) = e^(-λ) λ^x / x!    N.P(x)    Expected frequency
0       211    0      0.644036                  209.31    209
1       90     90     0.283376                  92.10     92
2       19     38     0.062343                  20.26     20
3       5      15     0.009144                  2.97      3
4       0      0      0.001006                  0.33      0
Total   325    143
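A minimal sketch of the Poisson fit: estimate the parameter as the sample mean, then multiply each Poisson probability by the total number of pages. The variable names are illustrative.

```python
import math

# Number of pages observed with x mistakes (from the table above)
pages = {0: 211, 1: 90, 2: 19, 3: 5, 4: 0}

N = sum(pages.values())
lam = sum(x * f for x, f in pages.items()) / N  # mean number of mistakes per page

# Expected frequency N * e^(-lam) * lam^x / x!
expected = {
    x: N * math.exp(-lam) * lam ** x / math.factorial(x)
    for x in pages
}

for x, e in expected.items():
    print(x, pages[x], round(e, 2))
```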

$\chi^2$-test

H0: Factors in the contingency table are independent.

H1: Factors in the contingency table are dependent.

Level of significance:

Appropriate level of significance is $\alpha$% (given/chosen)

Test Statistic:

To test the above hypotheses the test statistic is given by
$$\chi^2 = \sum \frac{(O - E)^2}{E} \sim \chi^2_{(r-1)(c-1)} \text{ at } \alpha\% \text{ level of significance}$$

where O is the observed value in the (i, j)th cell and
$$E = \frac{i^{th} \text{ row total} \times j^{th} \text{ column total}}{\text{sample size}}$$

Inference:

If $\chi^2_{cal} \ge \chi^2_{(r-1)(c-1)}$, we reject H0. Otherwise we accept H0.

three types of bonus schemes. Total employees were divided into

four categories namely laborers, clerks, technicians and executives.

The results obtained by way of opinion survey are presented in the

form of contingency table as given below. Test the goodness of fit

at 5% level of significance.

CATEGORY      Type 1   Type 2   Type 3
Labour        190      243      197
Clerks        82       44       44
Technicians   23       78       34
Executives    5        12       8

Sol:- H0: Factors in the contingency table are independent.

H1: Factors in the contingency table are dependent.

EMPLOYEES            BONUS SCHEMES
CATEGORY             Type 1   Type 2   Type 3   TOTAL
Labour               190      243      197      630
  Expected Count     196.9    247.4    185.7
Clerks               82       44       44       170
  Expected Count     53.1     66.8     50.1
Technicians          23       78       34       135
  Expected Count     42.2     53.0     39.8
Executives           5        12       8        25
  Expected Count     7.8      9.8      7.4
Total                300      377      283      960

employee's performance in the company's training program and his success in the job, a sample of 400 cases was taken and the following results were obtained. Test at the 1% level of significance (l.o.s) whether the performance in the training program and success in the job are independent. The table is as given below.

Success      Below Avg.   Avg.   Above Avg.
in job
Poor         23           60     29
Avg          28           79     60
Very good    9            49     63

Sol:- H0: Factors in the contingency table are independent.

H1: Factors in the contingency table are dependent

Success              Below Avg.   Avg.    Above Avg.   Total
in job
Poor                 23           60      29           112
  Expected Count     16.8         52.6    42.6
Avg.                 28           79      60           167
  Expected Count     25.0         78.5    63.5
Very good            9            49      63           121
  Expected Count     18.2         56.9    46.0
Total                60           188     152          400
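The expected counts and the χ² statistic for a contingency table can be computed directly from the row and column totals. The sketch below uses the training/job-success table; the function name is illustrative, and the critical value 13.277 (χ² at 1% with 4 d.f.) is taken from standard tables rather than from the text.

```python
def chi_square_independence(table):
    """Return (chi2, degrees_of_freedom, expected_counts) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # E = (row total * column total) / sample size for every cell
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof, expected

observed = [[23, 60, 29],
            [28, 79, 60],
            [9, 49, 63]]
chi2, dof, expected = chi_square_independence(observed)

CHI2_CRIT_1PCT_4DF = 13.277  # table value, alpha = 0.01, 4 d.f.
reject_h0 = chi2 > CHI2_CRIT_1PCT_4DF
print(f"chi2 = {chi2:.3f}, d.f. = {dof}, reject H0: {reject_h0}")
```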

H1: There is no homogeneity among the means, i.e. μ1 ≠ μ2 ≠ … ≠ μk (the means are not all equal)

Calculations:

Raw sum of squares (R.S.S) $= \sum X_i^2$, where $X_i$ is the $i$th observation

Correction factor (C.F) $= \frac{G^2}{N}$, where $G$ is the grand total and $N$ is the number of observations in the entire experiment

Sum of squares due to total (S.S.T): $S_T^2 = \text{R.S.S} - \text{C.F}$

Sum of squares due to treatments (S.S.tr): $S_{tr}^2 = \sum \frac{T_i^2}{n_i} - \text{C.F}$, where $T_i$ is the $i$th row total and $n_i$ is the number of observations in the $i$th row

Sum of squares due to error (S.S.E): $S_e^2 = \text{S.S.T} - \text{S.S.tr} = S_T^2 - S_{tr}^2$

Source of     Sum of       Degrees of   Mean Sum of                    Variance
Variation     Squares      freedom      Squares                        Ratio
Treatments    $S_{tr}^2$   k-1          $s_{tr}^2 = S_{tr}^2/(k-1)$    $F = s_{tr}^2/s_e^2 \sim F_{k-1,N-k}$
Error         $S_e^2$      N-k          $s_e^2 = S_e^2/(N-k)$
Total         $S_T^2$      N-1          ---------                      ------------------------

Inference:

If $F_{cal} \ge F_{\alpha,k-1,N-k}$, we reject H0. Otherwise we accept H0.

Problem 1.

Suppose 3 drying formulas for curing a glue are studied and the following

times are observed. Carry out ANOVA one-way classification at 5% L.O.S

and comment

Formula A 13 10 8 11 8

Formula B 13 11 14 14

Formula C 4 1 3 4 2 4

Sol:-

H0: There is homogeneity among the means, i.e. μA = μB = μC

H1: There is no homogeneity among the means, i.e. μA ≠ μB ≠ μC

Appropriate level of significance is 5% (given)

            Observations                 Ti      Ti²/ni
Formula A   13   10   8    11   8   ---  50      500
Formula B   13   11   14   14   --- ---  52      676
Formula C   4    1    3    4    2   4    18      54
                                         G=120   ΣTi²/ni = 1230

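Raw sum of squares (R.S.S) $= \sum X_i^2 = 1262$

Correction factor (C.F) $= \frac{G^2}{N} = \frac{120^2}{15} = 960$

Sum of squares due to total (S.S.T): $S_T^2 = \text{R.S.S} - \text{C.F} = 1262 - 960 = 302$

Sum of squares due to treatments (S.S.tr): $S_{tr}^2 = \sum \frac{T_i^2}{n_i} - \text{C.F} = 1230 - 960 = 270$

Sum of squares due to error (S.S.E): $S_e^2 = \text{S.S.T} - \text{S.S.tr} = S_T^2 - S_{tr}^2 = 302 - 270 = 32$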

Source of     Sum of    Degrees of   Mean Sum of   Variance
Variation     Squares   freedom      Squares       Ratio
Treatments    270       2            135           50.625
(Formulae)
Error         32        12           2.667
Total         302       14

Inference

The table value of F at 5% level of significance for (2,12) d.f is 3.89

{Fα,k-1,N-k = F0.05,2,12 = 3.89}

Fcal > Ftab, so we reject H0. Hence we conclude that the mean times of the three formulas are not all equal.
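The one-way ANOVA calculations (R.S.S, C.F, S.S.T, S.S.tr, S.S.E and F) can be sketched in code for groups of unequal size. The function name and the dictionary keys below are illustrative choices; the data are the three drying formulas from Problem 1.

```python
def one_way_anova(groups):
    """One-way ANOVA by the correction-factor method; groups may differ in size."""
    all_obs = [x for g in groups for x in g]
    n_total, k = len(all_obs), len(groups)
    cf = sum(all_obs) ** 2 / n_total                    # C.F = G^2 / N
    ss_total = sum(x * x for x in all_obs) - cf         # R.S.S - C.F
    ss_tr = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    ss_err = ss_total - ss_tr
    ms_tr = ss_tr / (k - 1)
    ms_err = ss_err / (n_total - k)
    return {"ss_total": ss_total, "ss_tr": ss_tr, "ss_err": ss_err,
            "df": (k - 1, n_total - k), "F": ms_tr / ms_err}

formulas = [[13, 10, 8, 11, 8],      # Formula A
            [13, 11, 14, 14],        # Formula B
            [4, 1, 3, 4, 2, 4]]      # Formula C
result = one_way_anova(formulas)
print(result)
```

The returned F is then compared with the table value for the degrees of freedom in `result["df"]`.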

Problem 2.

As a part of the investigation of the collapse of the roof of a building, a testing laboratory is given all the available bolts that connected the steel structure at 3 different positions on the roof. The forces required to shear each of these bolts are as follows. Perform an ANOVA to test at 0.05 L.O.S whether the differences among the sample means at the 3 positions are significant.

Position 1 90 82 79 98 83 91

Position 2 105 89 93 104 89 95 86

Position 3 83 89 80 94

Sol:-

H0: There is homogeneity among the means, i.e. μ1 = μ2 = μ3

H1: There is no homogeneity among the means, i.e. μ1 ≠ μ2 ≠ μ3

Appropriate level of significance is 5% (given)

             Observations                            Ti       Ti²/ni
Position 1   90    82   79   98    83   91   ---     523      45,588.17
Position 2   105   89   93   104   89   95   86      661      62,417.28
Position 3   83    89   80   94    ---  ---  ---     346      29,929.00
                                                     G=1530   ΣTi²/ni = 1,37,934.45

Raw sum of squares (R.S.S) $= \sum X_i^2$ = 1,38,638

Correction factor (C.F) $= \frac{G^2}{N} = \frac{1530^2}{17}$ = 1,37,700

Sum of squares due to total (S.S.T): $S_T^2 = \text{R.S.S} - \text{C.F}$ = 1,38,638 - 1,37,700 = 938

Sum of squares due to treatments (S.S.tr): $S_{tr}^2 = \sum \frac{T_i^2}{n_i} - \text{C.F}$ = 1,37,934.45 - 1,37,700 = 234.45

Sum of squares due to error (S.S.E): $S_e^2 = \text{S.S.T} - \text{S.S.tr} = S_T^2 - S_{tr}^2$ = 938 - 234.45 = 703.55

Source of     Sum of    Degrees of   Mean Sum of   Variance
Variation     Squares   freedom      Squares       Ratio
Treatments    234.45    2            117.225       2.333
(Positions)
Error         703.55    14           50.253
Total         938       16

Inference

The table value of F at 5% level of significance for (2,14) d.f is 3.74

{Fα,k-1,N-k = F0.05,2,14 = 3.74}

Fcal < Ftab, so we accept H0. Hence we conclude that the differences among the sample means at the 3 positions are not significant.

ANOVA two-way classification

Null Hypothesis

H0(tr): There is homogeneity among the treatments, i.e. μ1 = μ2 = … = μk

H0(b): There is homogeneity among the blocks, i.e. β1 = β2 = … = βh

Alternative Hypothesis

H1(tr): There is no homogeneity among the treatments, i.e. μ1 ≠ μ2 ≠ … ≠ μk

H1(b): There is no homogeneity among the blocks, i.e. β1 ≠ β2 ≠ … ≠ βh

Appropriate level of significance is α% (given/chosen)

Treatments \ Blocks   1     2     . . .   h     Ti    Ti²
1                     X11   X12   . . .   X1h   T1    T1²
2                     X21   X22   . . .   X2h   T2    T2²
.                     .     .     . . .   .     .     .
.                     .     .     . . .   .     .     .
k                     Xk1   Xk2   . . .   Xkh   Tk    Tk²
Bj                    B1    B2    . . .   Bh    G     ΣTi²
Bj²                   B1²   B2²   . . .   Bh²   ΣBj²

Raw sum of squares (R.S.S) $= \sum X_i^2$, where $X_i$ is the $i$th observation

Correction factor (C.F) $= \frac{G^2}{N}$, where $G$ is the grand total and $N$ is the number of observations in the experiment

Sum of squares due to total (S.S.T): $S_T^2 = \text{R.S.S} - \text{C.F}$

Sum of squares due to treatments (S.S.tr): $S_{tr}^2 = \frac{1}{h}\sum_{i=1}^{k} T_i^2 - \text{C.F}$

Sum of squares due to blocks (S.S.b): $S_b^2 = \frac{1}{k}\sum_{j=1}^{h} B_j^2 - \text{C.F}$

Sum of squares due to error (S.S.E): $S_e^2 = \text{S.S.T} - \text{S.S.tr} - \text{S.S.b} = S_T^2 - S_{tr}^2 - S_b^2$

Source of     Sum of       Degrees of   Mean Sum of                      Variance
Variation     Squares      freedom      Squares                          Ratio
Treatments    $S_{tr}^2$   k-1          $s_{tr}^2 = S_{tr}^2/(k-1)$      $F_{tr} = s_{tr}^2/s_e^2 \sim F_{k-1,(k-1)(h-1)}$
Blocks        $S_b^2$      h-1          $s_b^2 = S_b^2/(h-1)$            $F_b = s_b^2/s_e^2 \sim F_{h-1,(k-1)(h-1)}$
Error         $S_e^2$      (k-1)(h-1)   $s_e^2 = S_e^2/((k-1)(h-1))$
Total         $S_T^2$      kh-1         ---------                        ------------------------

Blocks

Treatment 1 13 7 9 3

Treatment 2 6 6 3 1

Treatment 3 11 5 15 5

Null Hypothesis

H0(tr): There is homogeneity among the treatments, i.e. μ1 = μ2 = μ3

H0(b): There is homogeneity among the blocks, i.e. β1 = β2 = β3 = β4

Alternative Hypothesis

H1(tr): There is no homogeneity among the treatments, i.e. μ1 ≠ μ2 ≠ μ3

H1(b): There is no homogeneity among the blocks, i.e. β1 ≠ β2 ≠ β3 ≠ β4

Appropriate level of significance is 5% (chosen)

              Blocks                  Ti     Ti²
Treatment 1   13    7     9     3     32     1024
Treatment 2   6     6     3     1     16     256
Treatment 3   11    5     15    5     36     1296
Bj            30    18    27    9     G=84   ΣTi² = 2576
Bj²           900   324   729   81    ΣBj² = 2034

Raw sum of squares (R.S.S) $= \sum X_i^2 = 786$

Correction factor (C.F) $= \frac{G^2}{N} = \frac{84^2}{12} = 588$

Sum of squares due to total (S.S.T): $S_T^2 = \text{R.S.S} - \text{C.F} = 786 - 588 = 198$

Sum of squares due to treatments (S.S.tr): $S_{tr}^2 = \frac{1}{h}\sum_{i=1}^{k} T_i^2 - \text{C.F} = \frac{1}{4}(2576) - 588 = 56$

Sum of squares due to blocks (S.S.b): $S_b^2 = \frac{1}{k}\sum_{j=1}^{h} B_j^2 - \text{C.F} = \frac{1}{3}(2034) - 588 = 90$

Sum of squares due to error (S.S.E): $S_e^2 = S_T^2 - S_{tr}^2 - S_b^2 = 198 - 56 - 90 = 52$

Source of     Sum of    Degrees of   Mean Sum     Variance
Variation     Squares   freedom      of Squares   Ratio
Treatments    56        2            28           Ft = 3.23
Blocks        90        3            30           Fb = 3.46
Error         52        6            8.67
Total         198       11

Inference:-

1. {Fα,k-1,(k-1)(h-1) = F0.05,2,6 = 5.14}, Ft < Ftab, we accept H0(tr)

2. {Fα,h-1,(k-1)(h-1) = F0.05,3,6 = 4.76}, Fb < Ftab, we accept H0(b)
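The two-way classification can be sketched the same way, with rows as treatments and columns as blocks (one observation per cell). The function name and return layout are assumptions for illustration; the data are the 3 × 4 table of Problem 1.

```python
def two_way_anova(table):
    """Two-way ANOVA, one observation per cell: rows = treatments, columns = blocks."""
    k, h = len(table), len(table[0])
    grand = sum(sum(row) for row in table)
    cf = grand ** 2 / (k * h)                                   # C.F = G^2 / N
    ss_total = sum(x * x for row in table for x in row) - cf
    ss_tr = sum(sum(row) ** 2 for row in table) / h - cf        # (1/h) sum Ti^2 - C.F
    ss_b = sum(sum(col) ** 2 for col in zip(*table)) / k - cf   # (1/k) sum Bj^2 - C.F
    ss_err = ss_total - ss_tr - ss_b
    ms_err = ss_err / ((k - 1) * (h - 1))
    f_tr = (ss_tr / (k - 1)) / ms_err
    f_b = (ss_b / (h - 1)) / ms_err
    return ss_tr, ss_b, ss_err, f_tr, f_b

data = [[13, 7, 9, 3],
        [6, 6, 3, 1],
        [11, 5, 15, 5]]
ss_tr, ss_b, ss_err, f_tr, f_b = two_way_anova(data)
print(f"S.S.tr = {ss_tr}, S.S.b = {ss_b}, S.S.E = {ss_err}")
print(f"Ft = {f_tr:.2f}, Fb = {f_b:.2f}")
```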

Problem 2. Carry out ANOVA two-way classification to the following data.

Student 1 Student 2 Student 3 Student 4 Student 5

Form A 75 73 59 69 84

Form B 83 72 56 70 92

Form C 86 61 53 72 88

Form D 73 67 62 79 95

Null Hypothesis

H0(tr): There is homogeneity among the Forms, i.e. μ1 = μ2 = μ3 = μ4

H0(b): There is homogeneity among the Students, i.e. β1 = β2 = β3 = β4 = β5

Alternative Hypothesis

H1(tr): There is no homogeneity among the Forms, i.e. μ1 ≠ μ2 ≠ μ3 ≠ μ4

H1(b): There is no homogeneity among the Students, i.e. β1 ≠ β2 ≠ β3 ≠ β4 ≠ β5

Appropriate level of significance is 5% (chosen)

         S1       S2      S3      S4      S5       Ti       Ti²
Form A   75       73      59      69      84       360      129600
Form B   83       72      56      70      92       373      139129
Form C   86       61      53      72      88       360      129600
Form D   73       67      62      79      95       376      141376
Bj       317      273     230     290     359      G=1469   ΣTi² = 539705
Bj²      100489   74529   52900   84100   128881   ΣBj² = 440899

Raw sum of squares (R.S.S) $= \sum X_i^2 = 110607$

Correction factor (C.F) $= \frac{G^2}{N} = \frac{1469^2}{20} = 107898.05$

Sum of squares due to total (S.S.T): $S_T^2 = \text{R.S.S} - \text{C.F} = 110607 - 107898.05 = 2708.95$

Sum of squares due to treatments (S.S.tr): $S_{tr}^2 = \frac{1}{h}\sum_{i=1}^{k} T_i^2 - \text{C.F} = \frac{1}{5}(539705) - 107898.05 = 42.95$

Sum of squares due to blocks (S.S.b): $S_b^2 = \frac{1}{k}\sum_{j=1}^{h} B_j^2 - \text{C.F} = \frac{1}{4}(440899) - 107898.05 = 2326.70$

Sum of squares due to error (S.S.E): $S_e^2 = S_T^2 - S_{tr}^2 - S_b^2 = 2708.95 - 42.95 - 2326.70 = 339.3$

Source of     Sum of    Degrees of   Mean Sum     Variance
Variation     Squares   freedom      of Squares   Ratio
Treatments    42.95     3            14.3167      Ft = 0.506
Blocks        2326.7    4            581.675      Fb = 20.572
Error         339.3     12           28.275
Total         2708.95   19

Inference:-

1. Ft < 1, so we take the reciprocal ratio (error mean square over treatment mean square), F = 28.275/14.3167 = 1.975, and compare it with F0.05,12,3 = 8.74. F < Ftab, we accept H0(tr)

2. {Fα,h-1,(k-1)(h-1) = F0.05,4,12 = 3.26}, Fb > Ftab, we reject H0(b)

CORRELATION & REGRESSION

then sat for a certain examination. Their I.Qs and examination marks

are as follows:

Find the rank correlation coefficient.

Person   I.Q   Exam marks   Rank xi   Rank yi   di = xi - yi   di²
A        110   70           3         3         0              0
B        100   60           4         4         0              0
C        140   80           1         2         -1             1
D        120   90           2         1         1              1
E        80    10           6         6         0              0
F        90    20           5         5         0              0
                                                               Σdi² = 2

$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)} = 1 - \frac{6 \times 2}{6 \times 35} = 1 - 0.0571 = 0.9429$$

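X    Y    Rank xi   Rank yi   di = xi - yi   di²
68   62   4         5         -1             1
64   58   6         7         -1             1
75   68   2.5       3.5       -1             1
50   45   9         10        -1             1
64   81   6         1         5              25
80   60   1         6         -5             25
75   68   2.5       3.5       -1             1
40   48   10        9         1              1
55   50   8         8         0              0
64   70   6         2         4              16
                                             Σdi² = 72

In the x-series 64 has occurred thrice, so c.f = m(m²-1)/12 = 2

In the x-series 75 has occurred twice, so c.f = m(m²-1)/12 = 0.5

In the y-series 68 has occurred twice, so c.f = m(m²-1)/12 = 0.5

$$\rho = 1 - \frac{6\left[\sum d_i^2 + \sum \frac{m(m^2-1)}{12}\right]}{n(n^2-1)} = 1 - \frac{6(72 + 2 + 0.5 + 0.5)}{10 \times 99} = 1 - 0.4545 = 0.545$$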
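The rank correlation with tied ranks can be sketched as follows. The helper names are illustrative; rank 1 is given to the largest value and tied values share the average of the ranks they occupy, as in the tables above. The correction term m(m² - 1)/12 is added once for every group of m tied values in either series.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = largest value; tied values share the mean of their ranks."""
    ordered = sorted(values, reverse=True)
    first_position = {}
    for position, v in enumerate(ordered, start=1):
        first_position.setdefault(v, position)
    counts = Counter(values)
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]

def spearman_with_ties(xs, ys):
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = sum(m * (m * m - 1) / 12
                     for series in (xs, ys)
                     for m in Counter(series).values() if m > 1)
    return 1 - 6 * (sum_d2 + correction) / (n * (n * n - 1))

x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]
rho_tied = spearman_with_ties(x, y)

iq = [110, 100, 140, 120, 80, 90]
marks = [70, 60, 80, 90, 10, 20]
rho_untied = spearman_with_ties(iq, marks)   # no ties, so the correction is zero
print(round(rho_tied, 4), round(rho_untied, 4))
```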

MISC

1. In a partially destroyed laboratory record of an analysis of

correlation data the following results are only legible:

Variance of X =9

Regression equations are :

8X-10Y+66=0

40X-18Y=214

a) What were the mean values of X and Y

b) Correlation coefficient between X and Y

c) Standard deviation of Y

x y

Average 7.6 14.8

S.D 3.6 2.5

Correlation coefficient is 0.9

coefficient from the following data:

x 2 5 4 7 3 9

y 6 8 7 10 4 8

of x when y=15

x 2 5 4 8 7 6 10

y 3 8 3 7 9 8 11

x 5 6 7 9 12

y 13 11 9 8 6

A time series is a set of data collected over a period of time. Our purpose is to see what changes take place over time in the event we are observing. We can try to predict the future behavior of that event based on the data available, hoping that the recent behavior of the series will overshadow earlier behavior.

Forecasting or predicting is an essential tool in any decision-making process. Time series analysis is a quantitative method that we use to determine patterns in data collected over a period of time.

Analysis of time series helps us to understand the past behavior of time series data. With the knowledge of the past behavior, it is possible, within certain limits, to forecast the probable future variations of the data. Thus it helps in planning future operations.

A time series is divided into three components:

1. Long term variations (Trend)
2. Short term variations (periodic changes)
   i) Seasonal variations
   ii) Cyclical variations
3. Random or irregular variations

TREND

Trend means the general tendency of the data to increase or decrease during a long period of time. The term "long period of time" is relative and cannot be exactly defined. In some cases a week may be a long period of time, while in others a period as long as two years may be a short one.

For example, bacterial growth is very high every five minutes, so a week is a long period in the study of bacteria, while 2 years is a short period in the study of agricultural production. So the term trend depends upon the case under study.

Reasons for studying Trend:

1. The study of secular trends allows us to describe a historical pattern.

2. Studying secular trends permits us to project past patterns or trends into the future.

i) graphical method

ii) Method of semi-averages

iii) Method of Least squares

iv) Method of moving Averages

The free hand method or graphic method is the simplest method for studying trend. In this method the actual figures (given data) are first plotted as points on graph paper, with the time series data along the vertical axis and time along the horizontal axis. Then a straight line is drawn to fit the plotted points as closely as possible. (To draw the line, leave an equal number of points on both sides of it at more or less equal distances.) The line so obtained shows the direction of the trend, and the vertical distance of this line from the horizontal axis gives the trend value for each time period.

By this method a quick estimate of the trend can be obtained, but it depends too much on the judgment of the investigator. Different people will locate the line in different positions. This method should be used only when a quick approximate idea of the trend is required.

Fit a trend line to the following data by the free hand method

Year 1970 1971 1972 1973 1974 1975 1976 1977

million Rs)

[Figure: Trend line. Sales in millions (vertical axis) against year, 1969 to 1978 (horizontal axis).]

In the semi-average method, the given data is first divided into two parts and an average for each part is found. Then these two averages are plotted on the graph paper against the mid points of the time intervals covered by the respective two parts. These two points are joined by a straight line. This straight line is the required trend line.

Although this method is simple to apply, it may lead to poor results when used indiscriminately. It is applicable only where the trend is linear or approximately linear.

Draw a trend line by the semi average method using the following data:

Year 1973 1974 1975 1976 1977 1978

Production of steel 253 260 255 263 259 264

(in lakh tons)

Sol:- The average production of steel for the first three years = (253+260+255)/3 = 256

The average production of steel for the last three years = (263+259+264)/3 = 262

Thus we get two points, 256 and 262, which are plotted against the respective middle years 1974 and 1977 of the two parts 1973-1975 and 1976-1978. By joining these two points, the required trend line is obtained.

[Figure: Semi-average trend line. Production of steel (vertical axis) against year, 1972 to 1979 (horizontal axis).]

This method is widely used for measurement of trend.

Linear Trend:- Let (x1,y1), (x2,y2), ..., (xn,yn) be n pairs of observations, where yi represents the time series value and xi represents time.

Let Y = a + bX be the equation of the straight line. The normal equations are

$$\sum y = na + b\sum x$$

$$\sum xy = a\sum x + b\sum x^2$$

Problem 1:-

Determine the equation of a straight line which best fits the following data:

Year 1974 1975 1976 1977 1978

Sales(in000) 35 56 79 80 40

Sol:-

Year    Sales   X = Year-1976   Y     XY    X²
1974    35      -2              35    -70   4
1975    56      -1              56    -56   1
1976    79      0               79    0     0
1977    80      1               80    80    1
1978    40      2               40    80    4
Total           0               290   34    10

5a + 0b = 290

0a + 10b = 34, giving a = 58, b = 3.4

Required straight line equation is (X now denoting the year):

Y = 58 + 3.4(X - 1976)

Y = -6660.4 + 3.4X
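The normal equations can be solved in closed form for any choice of origin. The sketch below is one way to do it; `fit_linear_trend` is a hypothetical helper, and the origin year is passed in so that the coded X matches the table.

```python
def fit_linear_trend(years, values, origin):
    """Least-squares line Y = a + b (year - origin) from the two normal equations."""
    xs = [t - origin for t in years]
    n = len(xs)
    sx, sy = sum(xs), sum(values)
    sxy = sum(x * y for x, y in zip(xs, values))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

years = [1974, 1975, 1976, 1977, 1978]
sales = [35, 56, 79, 80, 40]
a, b = fit_linear_trend(years, sales, origin=1976)
intercept_in_calendar_years = a - b * 1976   # rewrites Y = a + b(X - 1976) as Y = c + bX
print(a, b, intercept_in_calendar_years)
```

The same function also covers the case where the coded X values do not sum to zero, since it does not assume ΣX = 0.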

Problem2:-

Fit a straight line to the following data:

X 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990

Y 42 50 61 75 92 111 120 127 140 138

Sol:-

X       Y     x = X-1986   xY     x²
1981    42    -5           -210   25
1982    50    -4           -200   16
1983    61    -3           -183   9
1984    75    -2           -150   4
1985    92    -1           -92    1
1986    111   0            0      0
1987    120   1            120    1
1988    127   2            254    4
1989    140   3            420    9
1990    138   4            552    16
Total   956   -5           511    85

10a - 5b = 956

-5a + 85b = 511, giving a = 101.5939, b = 11.9878

Y = 101.5939 + 11.9878(X - 1986)

Y = -23706.177 + 11.9878X

Problem3:-

Fit a trend line to the following data of sales of a commodity in a shop using least square

theory and estimate the volume of sales in 1997

Year 1990 1991 1992 1993 1994

Sales in 000 2.4 2.8 3.1 3.6 4.2

Sol:-

X       Y      x = X-1992   xY     x²
1990    2.4    -2           -4.8   4
1991    2.8    -1           -2.8   1
1992    3.1    0            0      0
1993    3.6    1            3.6    1
1994    4.2    2            8.4    4
Total   16.1   0            4.4    10

a = 16.1/5 = 3.22, b = 4.4/10 = 0.44

Y = 3.22 + 0.44(X - 1992)

Y = -873.26 + 0.44X

Y(X=1997) = 3.22 + 0.44(5) = 5.42

Problem4:- (2002 march)

Mr.Ramesh the auditor of a Govt. public school has received the inventory records to

determine if the current inventory holdings of text books are typical. The following

inventory amounts are from the previous 5 years

Year 1988 1989 1990 1991 1992

Inventory in Rs. 4620 4910 5490 5730 5990

b) Find the linear equation that describes the trend in the inventory holdings.

c) Estimate the value of inventory for the year 1993

X       Y       x = X-1990   xY      x²
1988    4620    -2           -9240   4
1989    4910    -1           -4910   1
1990    5490    0            0       0
1991    5730    1            5730    1
1992    5990    2            11980   4
Total   26740   0            3560    10

a = 26740/5 = 5348, b = 3560/10 = 356

Y = 5348 + 356(X - 1990)

Y = -703092 + 356X

Y(X=1993) = 6416

Fit a straight line by the principle of least squares to the data. Estimate the likely production for the year 2005.

X    1995   1996   1997   1998   1999   2000
Y    24     25     27     29     30     33

X       Y     x = X-1997   xY    x²
1995    24    -2           -48   4
1996    25    -1           -25   1
1997    27    0            0     0
1998    29    1            29    1
1999    30    2            60    4
2000    33    3            99    9
Total   168   3            115   19

6a + 3b = 168

3a + 19b = 115, giving a = 27.1143, b = 1.7714

Y = 27.1143 + 1.7714(X - 1997)

Y = -3510.3715 + 1.7714X

Y(X=2005) = 41.2855

Problem1:-

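$$Y = a + bx + cx^2$$

The normal equations are

$$\sum y = na + b\sum x + c\sum x^2$$

$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$$

$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$$

X       Y     x = X-1987   xY    x²   x³   x⁴   x²Y
1985    13    -2           -26   4    -8   16   52
1986    24    -1           -24   1    -1   1    24
1987    39    0            0     0    0    0    0
1988    65    1            65    1    1    1    65
1989    106   2            212   4    8    16   424
Total   247   0            227   10   0    34   565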
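The three normal equations of the second-degree trend form a 3 × 3 linear system in a, b and c. The sketch below builds the system from the coded x values and solves it by Gauss-Jordan elimination; both function names are illustrative, and the solver is a plain textbook routine rather than anything prescribed by the notes.

```python
def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_parabola(xs, ys):
    """Coefficients (a, b, c) of Y = a + b x + c x^2 from the normal equations."""
    def s(power):                      # sum of x^power
        return sum(x ** power for x in xs)
    def sxy(power):                    # sum of x^power * y
        return sum((x ** power) * y for x, y in zip(xs, ys))
    matrix = [[len(xs), s(1), s(2)],
              [s(1), s(2), s(3)],
              [s(2), s(3), s(4)]]
    return solve_linear_system(matrix, [sxy(0), sxy(1), sxy(2)])

coded_x = [-2, -1, 0, 1, 2]            # years 1985..1989 measured from 1987
y = [13, 24, 39, 65, 106]
a, b, c = fit_parabola(coded_x, y)
print(round(a, 4), round(b, 4), round(c, 4))
```

With ΣX = ΣX³ = 0 the middle equation decouples, so b can also be read off directly as ΣXY / ΣX².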

Problem2:- (April 2001)

The sales of a company in lakhs of rupees for the years 1975 through 1981 are

given below

Year 1975 1976 1977 1978 1979 1980 1981

Sales 32 47 65 92 132 190 275

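Estimate sales figures for the year 1982 using an equation of the form $Y = ab^X$, where X represents years and Y represents sales.

Taking logarithms, log Y = log a + X log b, which is a straight line log Y = A + BX with A = log a and B = log b.

Year    Y     X = Year-1978   log Y     X.log Y   X²
1975    32    -3              1.5051    -4.5153   9
1976    47    -2              1.6720    -3.3440   4
1977    65    -1              1.8129    -1.8129   1
1978    92    0               1.9637    0         0
1979    132   1               2.1205    2.1205    1
1980    190   2               2.2787    4.5574    4
1981    275   3               2.4393    7.3179    9
Total         0               13.7923   4.3236    28

A = 13.7923/7 = 1.9704 ; B = 4.3236/28 = 0.1544

a = antilog(1.9704) = 93.4114, b = antilog(0.1544) = 1.4269

$$Y = 93.4114 \times (1.4269)^X$$

For 1982, X = 4: log Y = 1.9704 + 4(0.1544) = 2.588, so Y = antilog(2.588) ≈ 387 lakhs of rupees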
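The exponential trend is fitted by applying the straight-line normal equations to log Y. A minimal sketch, assuming base-10 logarithms and an origin at 1978 as in the table (the function name is hypothetical):

```python
import math

def fit_exponential_trend(years, values, origin):
    """Fit Y = a * b**X with X = year - origin by least squares on log10(Y)."""
    xs = [t - origin for t in years]
    logs = [math.log10(v) for v in values]
    n = len(xs)
    sx, sl = sum(xs), sum(logs)
    sxl = sum(x * l for x, l in zip(xs, logs))
    sxx = sum(x * x for x in xs)
    big_b = (n * sxl - sx * sl) / (n * sxx - sx * sx)   # B = log b
    big_a = (sl - big_b * sx) / n                       # A = log a
    return 10 ** big_a, 10 ** big_b

years = list(range(1975, 1982))
sales = [32, 47, 65, 92, 132, 190, 275]
a, b = fit_exponential_trend(years, sales, origin=1978)
forecast_1982 = a * b ** (1982 - 1978)
print(round(a, 3), round(b, 4), round(forecast_1982, 1))
```

Because the logarithms are not rounded to four places here, the fitted constants may differ slightly from the hand calculation.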

the data by means of a moving average. A moving average of extent m is a series of successive averages of m terms at a time, starting with the 1st, 2nd, 3rd, ... terms. Thus the first average is the mean of the first m terms, the second is the mean of the m terms from the 2nd to the (m+1)th term, the third is the mean of the m terms from the 3rd to the (m+2)th term, and so on.

If m is odd, say m = 2k+1, the moving average is placed against the mid value of the time interval it covers, i.e. against t = k+1. If m is even, say m = 2k, it is placed between the two middle values of the time interval it covers, i.e. between t = k and t = k+1. In the latter case the moving average does not coincide with an original time period, so the moving averages are synchronized with the original data by centering: a moving average of extent two is taken of the moving averages and these values are placed against t = k+1. The graph obtained on plotting the moving averages against time gives the trend.

Problem1:-

Work out the centered 4 yearly moving averages for the following data

Year   T      Four year          Two year moving
              moving average*    average (centered)
1971   2204
1972   2500   2436
1973   2360   2491               2463.5
1974   2680   2524.50            2507.75
1975   2424   2660.50            2592.50
1976   2634   2765               2712.75
1977   2904   2952               2858.50
1978   3098   3031.50            2991.75
1979   3172   3117.50            3074.50
1980   2952   3136               3126.75
1981   3248
1982   3172

* Each four year moving average falls between the year on its row and the next year.

[Figure: Moving averages. Data and centered moving average (vertical axis) against year, 1970 to 1983 (horizontal axis).]

problem2:-

Calculate the 5 yearly moving average to the following data

Year T Moving

average

1966 19.3

1967 20.9

1968 17.8 18.34

1969 16.1 18.04

1970 17.6 17.52

1971 17.8 17.42

1972 18.3 18.48

1973 17.3 18.82

1974 21.4 18.88

1975 19.3 19.12

1976 18.1 19.50

1977 19.5 19.66

1978 19.2 19.98

1979 22.2 20.66

1980 20.9 21.14

1981 21.5

1982 21.9

[Figure: Moving averages. Data and 5 yearly moving average (vertical axis) against year, 1965 to 1983 (horizontal axis).]

Problem3:-

The data below give the index of industrial production from 1961 to 1970:

Determine the trend by means of moving averages i)3 year ii)5 year

Year 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970

Index of 109.2 119.8 129.7 140.8 153.8 153.2 152.6 163.0 175.3 184.3

production

Year   Index   3 yr avg
1961   109.2
1962   119.8   119.57
1963   129.7   130.10
1964   140.8   141.43
1965   153.8   149.27
1966   153.2   153.2
1967   152.6   156.27
1968   163.0   163.63
1969   175.3   174.2
1970   184.3

[Figure: Index and 3 year moving average (vertical axis) against year, 1960 to 1971 (horizontal axis).]

Year   Index   5 yr avg
1961   109.2
1962   119.8
1963   129.7   130.66
1964   140.8   139.46
1965   153.8   146.02
1966   153.2   152.68
1967   152.6   159.58
1968   163.0   165.68
1969   175.3
1970   184.3

[Figure: Index and 5 year moving average (vertical axis) against year, 1960 to 1971 (horizontal axis).]

Algorithm:-

Step1: Enter the number of observations, i.e. n
Step2: Read all the paired observations (t, Ut)
Step3: Enter the extent of the moving average, i.e. m
Step4: Calculate the moving averages
  M1 = (U1 + U2 + ... + Um)/m
  M2 = (U2 + U3 + ... + Um+1)/m
  .
  .
  Mq = (Uq + Uq+1 + ... + Un)/m, where q = n - m + 1
Step5: If m is odd, i.e. m = 2k+1, go to Step6; else if m is even, i.e. m = 2k, go to Step7
Step6: Place each moving average against the mid value of the time interval it covers, and stop
Step7: Place each moving average between the kth and (k+1)th time periods of the interval it covers
Step8: Synchronize the moving averages by centering (a moving average of extent two) so that each average falls against a time period
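The algorithm can be sketched as two small functions: one that forms the plain moving averages and one that places them against the original periods, centering when m is even. The names are illustrative, and `None` marks the periods at either end that have no trend value.

```python
def moving_average(values, m):
    """Successive means of m consecutive terms."""
    return [sum(values[i:i + m]) / m for i in range(len(values) - m + 1)]

def aligned_moving_average(values, m):
    """Moving average of extent m placed against the original periods.

    For even m the averages are first centered with a moving average of extent two.
    """
    averages = moving_average(values, m)
    if m % 2 == 0:
        averages = moving_average(averages, 2)   # centering step
    offset = m // 2                              # index of the first period that gets a value
    aligned = [None] * len(values)
    for i, avg in enumerate(averages):
        aligned[i + offset] = avg
    return aligned

four_yearly_data = [2204, 2500, 2360, 2680, 2424, 2634,
                    2904, 3098, 3172, 2952, 3248, 3172]      # 1971..1982
centered_4 = aligned_moving_average(four_yearly_data, 4)

index = [109.2, 119.8, 129.7, 140.8, 153.8,
         153.2, 152.6, 163.0, 175.3, 184.3]                  # 1961..1970
trend_3 = aligned_moving_average(index, 3)
trend_5 = aligned_moving_average(index, 5)
print(centered_4)
```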

a) Seasonal variations: Nature and custom are responsible for these variations. They are due to rhythmic forces which operate in a regular and periodic manner over a span of less than one year. Seasonal variations will therefore appear in a time series only if the data are recorded quarterly, monthly, weekly, daily, hourly and so on; in a time series where only annual figures are given, there are no seasonal variations. The following are responsible for the seasonal variations.

i) Those resulting from natural forces: Weather and climatic changes play an important role in seasonal movements. For instance, the sale of umbrellas picks up very fast in the rainy season, and the demand for electric fans and the sale of ice cream go up in summer.

ii) Those resulting from man-made conventions: These variations in a time series within a period of 12 months are due to the habits, fashions, customs and conventions of the people in the society. For example, the sale of gold ornaments goes up during marriages and festivals. These variations operate in a regular manner.

b) Cyclical variations: The oscillatory movements in a time series with a period of oscillation of more than one year are known as cyclical variations. One complete period is known as a cycle, often called the business cycle, which is also known as the four-facet cycle composed of prosperity (boom), recession, depression and recovery. Generally these variations operate over seven to eleven years.

2.Ratio to trend method.

3. Ratio to moving average method.

4. Link relative method

The data below give the average quarterly prices of a commodity for four years. Calculate the seasonal variation indices.

Year   IQ     IIQ    IIIQ   IVQ
1980   40.3   44.8   46.0   48.0
1981   50.1   53.1   55.3   59.5
1982   47.2   50.1   52.1   55.2
1983   55.4   59.0   61.6   65.3

Sol:-

Year             IQ      IIQ     IIIQ     IVQ
1980             40.3    44.8    46.0     48.0
1981             50.1    53.1    55.3     59.5
1982             47.2    50.1    52.1     55.2
1983             55.4    59.0    61.6     65.3
Average          48.25   51.75   53.75    57       Overall average x̄ = 52.69
Seasonal Index   91.57   98.21   102.01   108.18

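Algorithm:-

1. Read the data
2. Compute the averages for the IQ, IIQ, IIIQ & IVQ
3. Compute the overall average $\bar{x}$
4. Compute the seasonal index of each quarter by $\frac{\bar{x}_i}{\bar{x}} \times 100$, where $\bar{x}_i$ is the average for the $i$th quarter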
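The four steps of the simple averages method translate into a few lines. In this sketch each row of the input holds the four quarterly figures of one year; the function name is an assumption.

```python
def seasonal_indices_simple_average(rows):
    """Seasonal index of each quarter = (quarter average / overall average) * 100."""
    quarter_averages = [sum(col) / len(col) for col in zip(*rows)]
    overall = sum(quarter_averages) / len(quarter_averages)
    indices = [100 * q / overall for q in quarter_averages]
    return quarter_averages, overall, indices

prices = [[40.3, 44.8, 46.0, 48.0],   # 1980
          [50.1, 53.1, 55.3, 59.5],   # 1981
          [47.2, 50.1, 52.1, 55.2],   # 1982
          [55.4, 59.0, 61.6, 65.3]]   # 1983
quarter_averages, overall, indices = seasonal_indices_simple_average(prices)
print([round(v, 2) for v in indices])
```

The indices computed this way always total 400 for quarterly data, so no further correction is needed. They can differ from a hand calculation in the second decimal place when the overall average is rounded before dividing.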

problem1:-

Find the quarterly indices by Ratio to trend method to the following data

Table 1:-

Year    IQ   IIQ   IIIQ   IVQ   Ut      T = t-1980   Ut.T    T²
1978    10   27    21     40    24.5    -2           -49     4
1979    11   35    29     57    33      -1           -33     1
1980    14   51    33     74    43      0            0       0
1981    19   57    43     78    49.25   1            49.25   1
1982    22   67    45     101   58.75   2            117.5   4
Total                           208.5   0            84.75   10

Ut = a + bT

ΣUt = na + bΣT

ΣT.Ut = aΣT + bΣT², giving a = 41.7, b = 8.475

Ut = 41.7 + 8.475T

Ut = 41.7 + 8.475(t - 1980)

Ut = -16738.8 + 8.475t

Table2:-

Year           1978    1979     1980   1981     1982
Trend values   24.75   33.225   41.7   50.175   58.65

Quarterly increment Q.I = b/4 = 2.11875

IIQ = T.V - 1/2 Q.I

IIIQ = T.V + 1/2 Q.I

IQ = IIQ - Q.I

IVQ = IIIQ + Q.I

Table3:- Quarterly trend values

YEAR   IQ        IIQ       IIIQ      IVQ
1978   21.5718   23.6906   25.8093   27.9280
1979   30.0468   32.1656   34.2843   36.4030
1980   38.5218   40.6406   42.7593   44.8780
1981   46.9968   49.1156   51.2343   53.3530
1982   55.4718   57.5906   59.7093   61.8280

Original values as percentages of the quarterly trend values:

YEAR            IQ        IIQ        IIIQ      IVQ
1978            46.3568   113.9692   81.366    143.225
1979            36.6095   108.8118   84.5868   156.5805
1980            36.3430   125.4902   77.1761   164.8914
1981            40.4282   116.0527   83.9281   146.1969
1982            39.6602   116.3384   75.3650   163.356
AVG S.I         39.8795   116.1325   80.4844   154.85     Total = 391.3464
Corrected S.I   40.7613   118.7005   82.2641   158.2741   Total = 400.0000

Problem2:-

Calculate the seasonal variation for the following data of sales (in thousands of Rs.) of a firm by the Ratio to Trend Method.

Table 1:-

Year    IQ   IIQ   IIIQ   IVQ   Ut    T = t-1981   Ut.T   T²
1979    30   40    36     34    35    -2           -70    4
1980    34   52    50     44    45    -1           -45    1
1981    40   58    54     48    50    0            0      0
1982    54   76    68     62    65    1            65     1
1983    80   92    86     82    85    2            170    4
Total                           280   0            120    10

Ut = a + bT

ΣUt = na + bΣT

ΣT.Ut = aΣT + bΣT², giving a = 56, b = 12

Ut = 56 + 12T

Ut = 56 + 12(t - 1981)

Ut = -23716 + 12t

Table2:-

Year           1979   1980   1981   1982   1983
Trend values   32     44     56     68     80

Quarterly increment Q.I = b/4 = 3

IIQ = T.V - 1/2 Q.I

IIIQ = T.V + 1/2 Q.I

IQ = IIQ - Q.I

IVQ = IIIQ + Q.I

Table3:- Quarterly trend values

YEAR   IQ     IIQ    IIIQ   IVQ
1979   27.5   30.5   33.5   36.5
1980   39.5   42.5   45.5   48.5
1981   51.5   54.5   57.5   60.5
1982   63.5   66.5   69.5   72.5
1983   75.5   78.5   81.5   84.5

Original values as percentages of the quarterly trend values:

YEAR            IQ        IIQ        IIIQ       IVQ
1979            109.1     131.1      107.5      93.1
1980            86.1      122.4      109.9      90.7
1981            77.7      106.4      93.9       79.3
1982            85        114.3      97.8       85.5
1983            106       117.2      105.5      97
AVG S.I         92.78     118.28     102.92     89.12     Total = 403.10
Corrected S.I   92.0665   117.3704   102.1285   88.4346   Total = 400.0000

Algorithm :-

Step1: Read the quarterly data
Step2: Calculate the yearly average values from the quarterly data, i.e. Ut
Step3: Fit a straight line of the form Ut = a + bT to the yearly averages
Step4: Calculate the yearly trend values (T.V)
Step5: Calculate the quarterly increment by Q.I = b/4
Step6: Find the quarterly trend values by
  IIQ = T.V - 1/2 Q.I
  IIIQ = T.V + 1/2 Q.I
  IQ = IIQ - Q.I
  IVQ = IIIQ + Q.I
Step7: Calculate the percentage values by the formula original value / quarterly trend value * 100
Step8: Calculate the average seasonal indices (the average of the percentages for each quarter)
Step9: Corrected seasonal indices are obtained by avg S.I / total of avg S.I * 400
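The steps of the ratio to trend method can be sketched end to end. The function below is an illustration under the assumption of consecutive years with four quarters each; the time index is centered on the middle of the series so that ΣT = 0, which reproduces T = t - (middle year) when the number of years is odd.

```python
def ratio_to_trend(quarterly):
    """quarterly: one row [IQ, IIQ, IIIQ, IVQ] per consecutive year."""
    n = len(quarterly)
    yearly_avg = [sum(row) / 4 for row in quarterly]            # Ut
    ts = [i - (n - 1) / 2 for i in range(n)]                    # centered T, sums to zero
    a = sum(yearly_avg) / n
    b = sum(t * u for t, u in zip(ts, yearly_avg)) / sum(t * t for t in ts)
    step = b / 4                                                # quarterly increment Q.I
    ratios = []
    for t, row in zip(ts, quarterly):
        yearly_trend = a + b * t                                # T.V, centered mid-year
        trend = [yearly_trend + (q - 1.5) * step for q in range(4)]
        ratios.append([100 * obs / tr for obs, tr in zip(row, trend)])
    avg_si = [sum(col) / n for col in zip(*ratios)]
    scale = 400 / sum(avg_si)
    return a, b, avg_si, [v * scale for v in avg_si]

sales = [[30, 40, 36, 34],    # 1979
         [34, 52, 50, 44],    # 1980
         [40, 58, 54, 48],    # 1981
         [54, 76, 68, 62],    # 1982
         [80, 92, 86, 82]]    # 1983
a, b, avg_si, corrected_si = ratio_to_trend(sales)
print(a, b, [round(v, 2) for v in corrected_si])
```

The percentages are kept at full precision here, so the indices can differ in the second decimal place from a table built with rounded percentages.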

Ratio to Moving Averages: A resort hotel wanted to establish the seasonal pattern of room demand by its clientele. Hotel management wants to improve customer service and is considering several plans to employ personnel during peak periods to achieve this goal. The table contains the occupancy during each quarter of the last four years. Calculate S.I by ratio to moving averages.

YEAR IQ IIQ IIIQ IVQ

1980 75 60 54 59

1981 86 65 63 80

1982 90 72 66 85

1983 100 78 72 93

Sol:-

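Year   Quarter   Ut    Qtrly moving   Adj. (centered)
                       Avg.*          mov. Avg.
1980   IQ        75
       IIQ       60    62
       IIIQ      54    64.75          63.375
       IVQ       59    66             65.375
1981   IQ        86    68.25          67.125
       IIQ       65    73.5           70.875
       IIIQ      63    74.5           74
       IVQ       80    76.25          75.375
1982   IQ        90    77             76.625
       IIQ       72    78.25          77.625
       IIIQ      66    80.75          79.5
       IVQ       85    82.25          81.5
1983   IQ        100   83.75          83
       IIQ       78    85.75          84.75
       IIIQ      72
       IVQ       93

* Each quarterly moving average falls between the quarter on its row and the next quarter.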

Original values as percentages of the centered moving averages:

YEAR       IQ         IIQ       IIIQ      IVQ
1980                            85.2071   90.2485
1981       128.1192   91.7108   85.1351   106.1360
1982       117.4551   92.7536   83.0189   104.2945
1983       120.4819   92.0354
Avg S.I    122.0187   92.1666   84.4537   100.2263   Total = 398.865
Adj. S.I   122.3659   92.4288   84.694    100.5115   Total = 400.000

Step1: Read the quarterly data
Step2: Calculate the quarterly (four-quarter) moving averages of the given data
Step3: Adjust the quarterly moving averages by centering, so that each moving average is placed against a time series value
Step4: Calculate the percentage table by the formula original value / corresponding moving avg. * 100
Step5: Calculate the quarterly average values, which are known as seasonal indices
Step6: Adjust the seasonal indices by avg. seasonal index / total of seasonal indices * 400
Step7: The adjusted seasonal indices are the index values for the quarterly data
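The ratio to moving average steps can be sketched as below, using the hotel occupancy data. The function name is illustrative; the quarters are flattened into one series so that the four-quarter moving average runs across year boundaries.

```python
def ratio_to_moving_average(quarterly):
    """quarterly: one row [IQ, IIQ, IIIQ, IVQ] per consecutive year."""
    series = [v for row in quarterly for v in row]
    ma4 = [sum(series[i:i + 4]) / 4 for i in range(len(series) - 3)]
    centered = [(ma4[i] + ma4[i + 1]) / 2 for i in range(len(ma4) - 1)]
    by_quarter = [[], [], [], []]
    for i, c in enumerate(centered):
        position = i + 2                        # centered[i] lines up with series[i + 2]
        by_quarter[position % 4].append(100 * series[position] / c)
    avg_si = [sum(v) / len(v) for v in by_quarter]
    scale = 400 / sum(avg_si)
    return centered, avg_si, [v * scale for v in avg_si]

occupancy = [[75, 60, 54, 59],     # 1980
             [86, 65, 63, 80],     # 1981
             [90, 72, 66, 85],     # 1982
             [100, 78, 72, 93]]    # 1983
centered, avg_si, adjusted_si = ratio_to_moving_average(occupancy)
print([round(v, 4) for v in adjusted_si])
```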

YEAR   IQ   IIQ   IIIQ   IVQ
1981   68   60    61     63
1982   70   58    56     60
1983   68   63    68     67
1984   65   59    56     62
1985   60   55    51     58

Sol:-

Link relatives:

YEAR                    IQ         IIQ       IIIQ       IVQ
1981                    --------   88.2353   101.6666   103.2787
1982                    111.1111   82.857    96.5517    107.1428
1983                    113.3333   92.6470   107.9365   98.5294
1984                    97.0149    90.7692   94.9152    110.7142
1985                    96.774     91.6666   92.7272    113.7254
Average L.R             104.5582   89.2348   98.7593    106.6779
Chain relatives (C.R)   100        89.2348   88.1277    94.0128
Adjusted C.R (S.I)      100        89.6603   88.9785    95.2890    Total = 373.9281
Corrected S.Is          106.972    95.9114   95.1821    101.9327   Total = 399.9981

d = 1/4 {104.5582 * 94.0128/100 - 100} = -0.4254

Algorithm:-

Step1: Translate the original data into link relatives, given by
  Link Relative of any quarter = Current quarter figure / Previous quarter figure * 100
Step2: Find the average of the link relatives for each quarter
Step3: Convert the average link relatives to chain relatives on the basis of the IQ
  Chain Relative of IQ = 100
  Chain Relative of IIQ = L.R of IIQ * C.R of IQ / 100
  Chain Relative of IIIQ = L.R of IIIQ * C.R of IIQ / 100
  Chain Relative of IVQ = L.R of IVQ * C.R of IIIQ / 100
Step4: Adjust the chain relatives
  Adjusted C.R of IQ = 100
  Adjusted C.R of IIQ = calculated C.R - d
  Adjusted C.R of IIIQ = calculated C.R - 2d
  Adjusted C.R of IVQ = calculated C.R - 3d
  where d = 1/4 [X1 - 100] and X1 = L.R of the IQ * C.R of IVQ / 100
Step5: If the sum of the seasonal indices is not equal to 400, we adjust the seasonal indices as usual (multiply each by 400 / total)
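The link relative steps can be sketched as a single function. The names are illustrative; the first quarter of the first year has no link relative because there is no previous quarter, so the IQ average is taken over one fewer value.

```python
def link_relative_indices(quarterly):
    """quarterly: one row [IQ, IIQ, IIIQ, IVQ] per consecutive year."""
    series = [v for row in quarterly for v in row]
    links = [[], [], [], []]
    for i in range(1, len(series)):
        links[i % 4].append(100 * series[i] / series[i - 1])    # link relative
    avg_lr = [sum(v) / len(v) for v in links]
    chain = [100.0]                                             # C.R of IQ
    for q in range(1, 4):
        chain.append(avg_lr[q] * chain[q - 1] / 100)
    x1 = avg_lr[0] * chain[3] / 100
    d = (x1 - 100) / 4
    adjusted = [chain[q] - q * d for q in range(4)]
    scale = 400 / sum(adjusted)
    return avg_lr, chain, d, adjusted, [v * scale for v in adjusted]

data = [[68, 60, 61, 63],    # 1981
        [70, 58, 56, 60],    # 1982
        [68, 63, 68, 67],    # 1983
        [65, 59, 56, 62],    # 1984
        [60, 55, 51, 58]]    # 1985
avg_lr, chain, d, adjusted, corrected_si = link_relative_indices(data)
print(round(d, 4), [round(v, 4) for v in corrected_si])
```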

These variations cannot be predicted and are beyond human control. Earthquakes, floods, etc. are responsible for these variations.
