
EXPERIMENT 3

Name : Harshit kapoor


Reg. No: 15BCE0657
Slot : L11+L12
3a: For the given details, viz. (1) Country, (2) Net_Agricultural_Output, (3)
Population_Active_in_Agriculture, (4) Arables_Land_equivalent, (5)
Conversion_Ratio_of_Pasture_of_Arable_Land, (6) Productive_Livestock,
(7) Work_Stock, (8) Fertilizer_Consumption and (9)
Number_of_Tractors_in_Agriculture, fit a multiple regression model and
interpret your result.
Choose the dependent and independent variables according to your
requirement/description. File name: Agriculture dataset.
Solution :-

> # agricultural data set experiment


> a=read.csv("C:/Users/new/Desktop/sem 3 imp documents/stats/R LAB/submission/exp 2.csv")
> a
[Printed data frame: 22 rows, one per country, in the order U.S., Canada, U.K.,
Norway, France, W.Germany, Argentina, Denmark, Netherlands, South africa,
Ireland, Poland, Chile, Puertorico, Japan, Italy, Mexico, Greece, Turkey,
Egypt, Peru, India; 9 columns: Country, Net_Agricultural_Output,
Population_Active_in_Agriculture, Arables_Land_equivalent,
Conversion_Ratio_of_Pasture_of_Arable_Land, Productive_Livestock,
Work_Stock, Fertilizer_Consumption, Number_of_Tractors_in_Agriculture]
>
> agriculture_model <- lm(Net_Agricultural_Output ~ Population_Active_in_Agriculture + Fertilizer_Consumption, data = input_data)
Error in is.data.frame(data) : object 'input_data' not found
> agriculture_model <- lm(Net_Agricultural_Output ~ Population_Active_in_Agriculture + Fertilizer_Consumption, data = a)
> agriculture_model

Call:
lm(formula = Net_Agricultural_Output ~ Population_Active_in_Agriculture +
Fertilizer_Consumption, data = a)

Coefficients:
                     (Intercept)  Population_Active_in_Agriculture            Fertilizer_Consumption
                       -68.06161                           0.09917                           3.78036

> summary(agriculture_model)

Call:
lm(formula = Net_Agricultural_Output ~ Population_Active_in_Agriculture +
    Fertilizer_Consumption, data = a)

Residuals:
    Min      1Q  Median      3Q     Max
-3415.6  -154.4    22.1   567.4  1596.0

Coefficients:
                                  Estimate Std. Error t value Pr(>|t|)
(Intercept)                      -68.06161  305.41917  -0.223    0.826
Population_Active_in_Agriculture   0.09917    0.01369   7.244 7.08e-07 ***
Fertilizer_Consumption             3.78036    0.30291  12.480 1.33e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1188 on 19 degrees of freedom
Multiple R-squared:  0.9157,    Adjusted R-squared:  0.9068
F-statistic: 103.2 on 2 and 19 DF,  p-value: 6.258e-11
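
The adjusted R-squared and the F-statistic reported in this summary follow from the multiple R-squared, the number of observations (22, i.e. 19 residual degrees of freedom plus 3 estimated coefficients) and the number of predictors (2). A minimal sketch of that arithmetic, using only the rounded values printed in the output; the variable names r2, n and p are illustrative and not part of the original session:

```r
# Values copied from the summary output (rounded as printed)
r2 <- 0.9157   # Multiple R-squared
n  <- 22       # observations: 19 residual df + 3 estimated coefficients
p  <- 2        # predictors, excluding the intercept

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic on p and n - p - 1 degrees of freedom
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))

# Upper-tail p-value of the overall F-test
p_value <- pf(f_stat, p, n - p - 1, lower.tail = FALSE)

print(round(adj_r2, 4))   # 0.9068, matching the summary
print(round(f_stat, 1))   # 103.2, matching the summary
print(p_value)
```

Because the inputs are rounded, the recomputed p-value may differ slightly from the one printed by summary().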

Interpretation: The regression model can be stated as

Net_Agricultural_Output = -68.06161 +
0.09917*(Population_Active_in_Agriculture) +
3.78036*(Fertilizer_Consumption)

R2 is 0.9157, so about 92% of the variation in Net Agricultural Output is explained
by Population active in agriculture and Fertilizer Consumption of a country through
this linear model (adjusted R2 is 0.9068). Both explanatory variables have a
positive relationship with Net Agricultural Output, and both regression coefficients
are statistically significant at the 0.001 level (p-values 7.08e-07 and 1.33e-10).
Only the intercept is not statistically significant (p-value 0.826). The F-test
shows that the overall regression is significant at the 0.01 level (F = 103.2 on
2 and 19 DF, p-value 6.258e-11, which is almost zero).
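
The fitted equation can be used to estimate Net Agricultural Output for given predictor values. The sketch below hard-codes the coefficients from the output; the helper predict_output and the input values 1000 and 100 are hypothetical and chosen only for illustration, in whatever units the dataset uses:

```r
# Coefficients copied from the fitted model output
b0 <- -68.06161   # intercept
b1 <- 0.09917     # Population_Active_in_Agriculture
b2 <- 3.78036     # Fertilizer_Consumption

# Hypothetical helper (not part of the original session):
# evaluates the fitted regression equation for given predictor values
predict_output <- function(population_active, fertilizer) {
  b0 + b1 * population_active + b2 * fertilizer
}

# Hypothetical country: 1000 units of active population, 100 units of fertilizer
# -68.06161 + 99.17 + 378.036 = 409.14439
print(predict_output(1000, 100))
```

With the model object still in the session, predict(agriculture_model, newdata = data.frame(Population_Active_in_Agriculture = 1000, Fertilizer_Consumption = 100)) would give the same estimate without copying coefficients by hand, up to rounding of the printed values.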

3b. Use the Life Satisfaction dataset to fit the regression equation.

> a=read.csv("C:/Users/new/Desktop/sem 3 imp documents/stats/R LAB/submission/ex.csv")
> a
[Printed data frame: 20 rows, one per subject; 14 columns: Subject, Age, Gender,
Married, IncomeC, HealthC, ChildC, LifeSatC, SES, Smoke, Spirit, Finish,
LifeSat, Income]

> m=lm(Age+Gender~Married+HealthC+ChildC,data=a)

> m

Call:
lm(formula = Age + Gender ~ Married + HealthC + ChildC, data = a)

Coefficients:
(Intercept)      Married      HealthC       ChildC
   22.29824     -1.89921     -0.03077      3.21078

> summary(m)

Call:
lm(formula = Age + Gender ~ Married + HealthC + ChildC, data = a)

Residuals:
   Min     1Q Median     3Q    Max
-7.128 -4.984 -1.341  4.602 11.052

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.29824   11.01317   2.025   0.0599 .
Married     -1.89921    2.90593  -0.654   0.5227
HealthC     -0.03077    0.23294  -0.132   0.8966
ChildC       3.21078    1.90502   1.685   0.1113
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.001 on 16 degrees of freedom
Multiple R-squared:  0.1633,    Adjusted R-squared:  0.006404
F-statistic: 1.041 on 3 and 16 DF,  p-value: 0.4013

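Interpretation: The regression model can be stated as

Age + Gender = 22.29824 - 1.89921*(Married) - 0.03077*(HealthC) +
3.21078*(ChildC)

R2 is only 0.1633 (adjusted R2 is 0.006404), none of the three explanatory
variables is statistically significant at the 0.05 level, and the F-test shows that
the overall regression is not significant either (p-value 0.4013). This linear model
therefore explains very little of the variation in the response.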
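
In an lm formula, arithmetic on the left-hand side is evaluated before fitting, so Age + Gender ~ ... models the sum of the two columns as a single response. A small sketch on simulated stand-in data illustrates this; the data frame d and all its values are invented for illustration and are not the Life Satisfaction dataset, and Gender is assumed to be coded numerically:

```r
set.seed(1)

# Simulated stand-in data (NOT the Life Satisfaction dataset)
d <- data.frame(
  Age     = sample(16:40, 20, replace = TRUE),
  Gender  = sample(0:1, 20, replace = TRUE),   # assumed numeric coding
  Married = sample(0:1, 20, replace = TRUE),
  ChildC  = sample(0:3, 20, replace = TRUE)
)

# Response written as in the lab session: the sum Age + Gender
m_sum <- lm(Age + Gender ~ Married + ChildC, data = d)

# The same response built explicitly as a single column
d$AgePlusGender <- d$Age + d$Gender
m_explicit <- lm(AgePlusGender ~ Married + ChildC, data = d)

# Both fits give identical coefficients
print(all.equal(unname(coef(m_sum)), unname(coef(m_explicit))))

# Age alone as the response, for comparison
m_age <- lm(Age ~ Married + ChildC, data = d)
print(coef(m_age))
```

If the intent is to model a single outcome such as Age or LifeSat, naming that one column on the left-hand side keeps the coefficients directly interpretable.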
