
ASSIGNMENT

http://lib.stat.cmu.edu/DASL/Datafiles/USCrime.html
Datafile Name: US Crime
Datafile Subjects: Social science
Story Names: US Crime
Reference: Vandaele, W. (1978) Participation in illegitimate activities: Erlich
revisited. In Deterrence and Incapacitation, Blumstein, A., Cohen, J. and Nagin, D.,
eds., Washington, D.C.: National Academy of Sciences, 270-335. Methods: A Primer, New
York: Chapman & Hall, 11. Also found in: Hand, D.J., et al. (1994) A Handbook of
Small Data Sets, London: Chapman & Hall, 101-103.
Authorization: Contact author
Description: These data are crime-related and demographic statistics for 47 US
states in 1960. The data were collected from the FBI's Uniform Crime Report and
other government agencies to determine how the variable crime rate depends on the
other variables measured in the study.
Number of cases: 47
Variable Names:
R: Crime rate: number of offenses reported to police per million population
Age: The number of males of age 14-24 per 1000 population
S: Indicator variable for Southern states (0 = no, 1 = yes)
Ed: Mean number of years of schooling x 10 for persons of age 25 or older
Ex0: 1960 per capita expenditure on police by state and local government
Ex1: 1959 per capita expenditure on police by state and local government
LF: Labor force participation rate per 1000 civilian urban males of age 14-24
M: The number of males per 1000 females
N: State population size in hundred thousands
NW: The number of non-whites per 1000 population
U1: Unemployment rate of urban males per 1000 of age 14-24
U2: Unemployment rate of urban males per 1000 of age 35-39
W: Median value of transferable goods and assets or family income in tens of $
X: The number of families per 1000 earning below 1/2 the median income
Y: crime rate (the response variable R)

> RyM <- read.table(file.choose(), header=TRUE)
> RyM
       R Age S  Ed Ex0 Ex1  LF    M   N  NW  U1 U2   W   X
 1  79.1 151 1  91  58  56 510  950  33 301 108 41 394 261
 2 163.5 143 0 113 103  95 583 1012  13 102  96 36 557 194
 3  57.8 142 1  89  45  44 533  969  18 219  94 33 318 250
 4 196.9 136 0 121 149 141 577  994 157  80 102 39 673 167
 5 123.4 141 0 121 109 101 591  985  18  30  91 20 578 174
 6  68.2 121 0 110 118 115 547  964  25  44  84 29 689 126
 7  96.3 127 1 111  82  79 519  982   4 139  97 38 620 168
 8 155.5 131 1 109 115 109 542  969  50 179  79 35 472 206
 9  85.6 157 1  90  65  62 553  955  39 286  81 28 421 239
10  70.5 140 0 118  71  68 632 1029   7  15 100 24 526 174
11 167.4 124 0 105 121 116 580  966 101 106  77 35 657 170
12  84.9 134 0 108  75  71 595  972  47  59  83 31 580 172
13  51.1 128 0 113  67  60 624  972  28  10  77 25 507 206
14  66.4 135 0 117  62  61 595  986  22  46  77 27 529 190
15  79.8 152 1  87  57  53 530  986  30  72  92 43 405 264
16  94.6 142 1  88  81  77 497  956  33 321 116 47 427 247
17  53.9 143 0 110  66  63 537  977  10   6 114 35 487 166
18  92.9 135 1 104 123 115 537  978  31 170  89 34 631 165
19  75.0 130 0 116 128 128 536  934  51  24  78 34 627 135
20 122.5 125 0 108 113 105 567  985  78  94 130 58 626 166
21  74.2 126 0 108  74  67 602  984  34  12 102 33 557 195
22  43.9 157 1  89  47  44 512  962  22 423  97 34 288 276
23 121.6 132 0  96  87  83 564  953  43  92  83 32 513 227
24  96.8 131 0 116  78  73 574 1038   7  36 142 42 540 176
25  52.3 130 0 116  63  57 641  984  14  26  70 21 486 196
26 199.3 131 0 121 160 143 631 1071   3  77 102 41 674 152
27  34.2 135 0 109  69  71 540  965   6   4  80 22 564 139
28 121.6 152 0 112  82  76 571 1018  10  79 103 28 537 215
29 104.3 119 0 107 166 157 521  938 168  89  92 36 637 154
30  69.6 166 1  89  58  54 521  973  46 254  72 26 396 237
31  37.3 140 0  93  55  54 535 1045   6  20 135 40 453 200
32  75.4 125 0 109  90  81 586  964  97  82 105 43 617 163
33 107.2 147 1 104  63  64 560  972  23  95  76 24 462 233
34  92.3 126 0 118  97  97 542  990  18  21 102 35 589 166
35  65.3 123 0 102  97  87 526  948 113  76 124 50 572 158
36 127.2 150 0 100 109  98 531  964   9  24  87 38 559 153
37  83.1 177 1  87  58  56 638  974  24 349  76 28 382 254
38  56.6 133 0 104  51  47 599 1024   7  40  99 27 425 225
39  82.6 149 1  88  61  54 515  953  36 165  86 35 395 251
40 115.1 145 1 104  82  74 560  981  96 126  88 31 488 228
41  88.0 148 0 122  72  66 601  998   9  19  84 20 590 144
42  54.2 141 0 109  56  54 523  968   4   2 107 37 489 170
43  82.3 162 1  99  75  70 522  996  40 208  73 27 496 224
44 103.0 136 0 121  95  96 574 1012  29  36 111 37 622 162
45  45.5 139 1  88  46  41 480  968  19  49 135 53 457 249
46  50.8 126 0 104 106  97 599  989  40  24  78 25 593 171
47  84.9 130 0 121  90  91 623 1049   3  22 113 40 588 160
> RyM$S <- factor(RyM$S)
> levels(RyM$S)
[1] "0" "1"
> levels(RyM$S) <- c("no", "si")
> table(RyM$S)

no si
31 16

# na.omit(RyM) is not used to drop missing values, because the data set is complete
> summary(RyM)
       R               Age          S            Ed             Ex0
 Min.   : 34.20   Min.   :119.0   no:31   Min.   : 87.0   Min.   : 45.0
 1st Qu.: 65.85   1st Qu.:130.0   si:16   1st Qu.: 97.5   1st Qu.: 62.5
 Median : 83.10   Median :136.0           Median :108.0   Median : 78.0
 Mean   : 90.51   Mean   :138.6           Mean   :105.6   Mean   : 85.0
 3rd Qu.:105.75   3rd Qu.:146.0           3rd Qu.:114.5   3rd Qu.:104.5
 Max.   :199.30   Max.   :177.0           Max.   :122.0   Max.   :166.0
      Ex1               LF              M                N                NW
 Min.   : 41.00   Min.   :480.0   Min.   : 934.0   Min.   :  3.00   Min.   :  2.0
 1st Qu.: 58.50   1st Qu.:530.5   1st Qu.: 964.5   1st Qu.: 10.00   1st Qu.: 24.0
 Median : 73.00   Median :560.0   Median : 977.0   Median : 25.00   Median : 76.0
 Mean   : 80.23   Mean   :561.2   Mean   : 983.0   Mean   : 36.62   Mean   :101.1
 3rd Qu.: 97.00   3rd Qu.:593.0   3rd Qu.: 992.0   3rd Qu.: 41.50   3rd Qu.:132.5
 Max.   :157.00   Max.   :641.0   Max.   :1071.0   Max.   :168.00   Max.   :423.0
       U1               U2              W               X
 Min.   : 70.00   Min.   :20.00   Min.   :288.0   Min.   :126.0
 1st Qu.: 80.50   1st Qu.:27.50   1st Qu.:459.5   1st Qu.:165.5
 Median : 92.00   Median :34.00   Median :537.0   Median :176.0
 Mean   : 95.47   Mean   :33.98   Mean   :525.4   Mean   :194.0
 3rd Qu.:104.00   3rd Qu.:38.50   3rd Qu.:591.5   3rd Qu.:227.5
 Max.   :142.00   Max.   :58.00   Max.   :689.0   Max.   :276.0


> modelo <- lm(R ~ Ex1, RyM)
> modelo

Call:
lm(formula = R ~ Ex1, data = RyM)

Coefficients:
(Intercept)          Ex1
    16.5164       0.9222

> cat("Y=", modelo$coefficients[1], "+", modelo$coefficients[2], "X", "\n\n")
Y= 16.51642 + 0.9222031 X
> summary(modelo)

Call:
lm(formula = R ~ Ex1, data = RyM)

Residuals:
    Min      1Q  Median      3Q     Max
-59.558 -15.676   1.229  14.674  59.374

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.5164    13.0427   1.266    0.212
Ex1           0.9222     0.1537   6.001 3.11e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 29.14 on 45 degrees of freedom
Multiple R-squared:  0.4445,    Adjusted R-squared:  0.4322
F-statistic: 36.01 on 1 and 45 DF,  p-value: 3.114e-07
> attach(RyM)
> R
 [1]  79.1 163.5  57.8 196.9 123.4  68.2  96.3 155.5  85.6  70.5 167.4  84.9
[13]  51.1  66.4  79.8  94.6  53.9  92.9  75.0 122.5  74.2  43.9 121.6  96.8
[25]  52.3 199.3  34.2 121.6 104.3  69.6  37.3  75.4 107.2  92.3  65.3 127.2
[37]  83.1  56.6  82.6 115.1  88.0  54.2  82.3 103.0  45.5  50.8  84.9
> Age
 [1] 151 143 142 136 141 121 127 131 157 140 124 134 128 135 152 142 143 135 130
[20] 125 126 157 132 131 130 131 135 152 119 166 140 125 147 126 123 150 177 133
[39] 149 145 148 141 162 136 139 126 130
> S
 [1] si no si no no no si si si no no no no no si si no si no no no si no no no
[26] no no no no si no no si no no no si no si si no no si no si no no
Levels: no si
> Ed
 [1]  91 113  89 121 121 110 111 109  90 118 105 108 113 117  87  88 110 104 116
[20] 108 108  89  96 116 116 121 109 112 107  89  93 109 104 118 102 100  87 104
[39]  88 104 122 109  99 121  88 104 121
> Ex0
 [1]  58 103  45 149 109 118  82 115  65  71 121  75  67  62  57  81  66 123 128
[20] 113  74  47  87  78  63 160  69  82 166  58  55  90  63  97  97 109  58  51
[39]  61  82  72  56  75  95  46 106  90
> Ex1
 [1]  56  95  44 141 101 115  79 109  62  68 116  71  60  61  53  77  63 115 128
[20] 105  67  44  83  73  57 143  71  76 157  54  54  81  64  97  87  98  56  47
[39]  54  74  66  54  70  96  41  97  91
> LF
 [1] 510 583 533 577 591 547 519 542 553 632 580 595 624 595 530 497 537 537 536
[20] 567 602 512 564 574 641 631 540 571 521 521 535 586 560 542 526 531 638 599
[39] 515 560 601 523 522 574 480 599 623
> M
 [1]  950 1012  969  994  985  964  982  969  955 1029  966  972  972  986  986
[16]  956  977  978  934  985  984  962  953 1038  984 1071  965 1018  938  973
[31] 1045  964  972  990  948  964  974 1024  953  981  998  968  996 1012  968
[46]  989 1049
> N
 [1]  33  13  18 157  18  25   4  50  39   7 101  47  28  22  30  33  10  31  51
[20]  78  34  22  43   7  14   3   6  10 168  46   6  97  23  18 113   9  24   7
[39]  36  96   9   4  40  29  19  40   3
> NW
 [1] 301 102 219  80  30  44 139 179 286  15 106  59  10  46  72 321   6 170  24
[20]  94  12 423  92  36  26  77   4  79  89 254  20  82  95  21  76  24 349  40
[39] 165 126  19   2 208  36  49  24  22
> U1
 [1] 108  96  94 102  91  84  97  79  81 100  77  83  77  77  92 116 114  89  78
[20] 130 102  97  83 142  70 102  80 103  92  72 135 105  76 102 124  87  76  99
[39]  86  88  84 107  73 111 135  78 113
> U2
 [1] 41 36 33 39 20 29 38 35 28 24 35 31 25 27 43 47 35 34 34 58 33 34 32 42 21
[26] 41 22 28 36 26 40 43 24 35 50 38 28 27 35 31 20 37 27 37 53 25 40
> W
 [1] 394 557 318 673 578 689 620 472 421 526 657 580 507 529 405 427 487 631 627
[20] 626 557 288 513 540 486 674 564 537 637 396 453 617 462 589 572 559 382 425
[39] 395 488 590 489 496 622 457 593 588
> X
 [1] 261 194 250 167 174 126 168 206 239 174 170 172 206 190 264 247 166 165 135
[20] 166 195 276 227 176 196 152 139 215 154 237 200 163 233 166 158 153 254 225
[39] 251 228 144 170 224 162 249 171 160
Histogram for the variable R
> RyM$R
 [1]  79.1 163.5  57.8 196.9 123.4  68.2  96.3 155.5  85.6  70.5 167.4  84.9
[13]  51.1  66.4  79.8  94.6  53.9  92.9  75.0 122.5  74.2  43.9 121.6  96.8
[25]  52.3 199.3  34.2 121.6 104.3  69.6  37.3  75.4 107.2  92.3  65.3 127.2
[37]  83.1  56.6  82.6 115.1  88.0  54.2  82.3 103.0  45.5  50.8  84.9

> n<-length(RyM$R)
> n
[1] 47
# Sturges' rule for the number of class intervals
> k <- 1 + 3.3*log10(47)
> k
[1] 6.517923
> round(k)
[1] 7
# Range: A <- Xmax - Xmin
> A <- max(RyM$R) - min(RyM$R)
> A
[1] 165.1
# Class width: tic = A/round(k)
> tic <- A/round(k)
> tic
[1] 23.58571
# Since the data have one decimal place, the width is rounded to one decimal
> tic <- 23.6
# First class: LI1 = min(RyM$R), LS1 = LI1 + tic
> LI1 <- min(RyM$R)
> LI1
[1] 34.2
> LS1 <- LI1 + tic
> LS1
[1] 57.8
# Second class: LI2 = LI1 + tic, LS2 = LS1 + tic
> LI2 <- LI1 + tic
> LI2
[1] 57.8
> LS2 <- LS1 + tic
> LS2
[1] 81.4
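
The class limits computed step by step above can also be generated in one pass. The following is a minimal sketch, not the author's code. It re-enters the 47 crime rates as a vector named `crime` (a name chosen here for illustration), keeps the constant 3.3 used above, and rounds the limits to one decimal so that boundary values such as 57.8 compare exactly. Treating the classes as right-closed is an assumption.

```r
# The 47 crime rates (variable R of the data set), re-entered by hand.
crime <- c(79.1, 163.5, 57.8, 196.9, 123.4, 68.2, 96.3, 155.5, 85.6, 70.5,
           167.4, 84.9, 51.1, 66.4, 79.8, 94.6, 53.9, 92.9, 75.0, 122.5,
           74.2, 43.9, 121.6, 96.8, 52.3, 199.3, 34.2, 121.6, 104.3, 69.6,
           37.3, 75.4, 107.2, 92.3, 65.3, 127.2, 83.1, 56.6, 82.6, 115.1,
           88.0, 54.2, 82.3, 103.0, 45.5, 50.8, 84.9)

n <- length(crime)
k <- round(1 + 3.3 * log10(n))                 # Sturges' rule: 7 classes
# Class width, rounded up to one decimal place (23.6).
width <- ceiling((max(crime) - min(crime)) / k * 10) / 10
# Class limits: 34.2, 57.8, 81.4, ..., 199.4
limits <- round(min(crime) + width * (0:k), 1)

# Right-closed classes (an assumption); the minimum goes into the first class.
classes <- cut(crime, breaks = limits, include.lowest = TRUE, right = TRUE)
counts <- as.vector(table(classes))
print(data.frame(lower = head(limits, -1), upper = tail(limits, -1), fi = counts))
```

Under these assumptions the absolute frequencies come out as 11, 10, 14, 7, 0, 3 and 2.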

# Using the agricolae library to build the frequency table
> library(agricolae)
> h <- graph.freq(RyM$R, frequency=3)
> summary(h)
   Inf   Sup    MC fi        fri Fi       Fri
  34.2  57.8  46.0 11 0.23404255 11 0.2340426
  57.8  81.4  69.6 10 0.21276596 21 0.4468085
  81.4 105.0  93.2 14 0.29787234 35 0.7446809
 105.0 128.6 116.8  7 0.14893617 42 0.8936170
 128.6 152.2 140.4  0 0.00000000 42 0.8936170
 152.2 175.8 164.0  3 0.06382979 45 0.9574468
 175.8 199.4 187.6  2 0.04255319 47 1.0000000
> normal.freq(h, col="blue", frequency=3)
[Figure: density histogram of RyM$R with the normal curve drawn in blue]

# Frequency polygon for the variable R
> h <- graph.freq(RyM$R, frequency=3, border=FALSE)
> polygon.freq(h, frequency=3)
> grid(col="brown")
[Figure: frequency polygon of RyM$R on the density scale]

> qqnorm(RyM$R)
> shapiro.test(RyM$R)

        Shapiro-Wilk normality test

data:  RyM$R
W = 0.9127, p-value = 0.001882

[Figure: Normal Q-Q Plot of RyM$R; x-axis "Theoretical Quantiles", y-axis "Sample Quantiles"]
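
The Shapiro-Wilk result above can be turned into an explicit decision. This is a minimal sketch under two assumptions the original does not state: a 5% significance level, and a vector named `crime` that simply re-enters the 47 values of R.

```r
# The 47 crime rates (variable R of the data set), re-entered by hand.
crime <- c(79.1, 163.5, 57.8, 196.9, 123.4, 68.2, 96.3, 155.5, 85.6, 70.5,
           167.4, 84.9, 51.1, 66.4, 79.8, 94.6, 53.9, 92.9, 75.0, 122.5,
           74.2, 43.9, 121.6, 96.8, 52.3, 199.3, 34.2, 121.6, 104.3, 69.6,
           37.3, 75.4, 107.2, 92.3, 65.3, 127.2, 83.1, 56.6, 82.6, 115.1,
           88.0, 54.2, 82.3, 103.0, 45.5, 50.8, 84.9)

sw <- shapiro.test(crime)
alpha <- 0.05   # assumed significance level, not stated in the original

if (sw$p.value < alpha) {
  cat("Normality is rejected at level", alpha,
      "(p =", signif(sw$p.value, 4), ")\n")
} else {
  cat("No evidence against normality at level", alpha, "\n")
}
```

With the p-value reported above (0.001882) the first branch runs, so the crime rate would not be treated as normally distributed at that level.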

# Stem-and-leaf plot for the variable R
> stem(RyM$R, 1)

  The decimal point is 1 digit(s) to the right of the |

   2 | 47
   4 | 461124478
   6 | 568014559
   8 | 0233556823567
  10 | 3475
  12 | 22337
  14 | 6
  16 | 47
  18 | 79


> par(mfrow=c(1,3))
> hist(RyM$R)
> plot(density(RyM$R,na.rm=TRUE))
> plot(sort(RyM$R),pch=".")

[Figure: three panels. Left: "Histogram of RyM$R" (x-axis RyM$R, y-axis Frequency). Centre: "density.default(x = RyM$R, na.rm = TRUE)" (N = 47, Bandwidth = 12.41, y-axis Density). Right: sort(RyM$R) plotted against Index.]


# Coefficient of variation
# Defining the function cv
cv<-function(x){sd(x)/abs(mean(x))*100}
> cv(RyM$R)
[1] 42.73219
# Outliers for the variable R
Varanalizar <- RyM$R
outliers <- boxplot(Varanalizar, plot=FALSE)$out
nout <- as.character(outliers)
boxplot(Varanalizar, col="blue")
# Label each outlier with its observation number (x = 1 is the position of the box)
for (i in seq_along(outliers)) {
  text(1, outliers[i], labels=which(Varanalizar == outliers[i]), cex=.8, pos=4)
}
[Figure: boxplot of R with observations 26, 4 and 11 labelled above the upper whisker]

Observations 26, 4 and 11 are outliers.

# Outliers for the variable Ex1
Varanalizar <- RyM$Ex1
outliers <- boxplot(Varanalizar, plot=FALSE)$out
nout <- as.character(outliers)
boxplot(Varanalizar, col="blue")
# Label each outlier with its observation number (x = 1 is the position of the box)
for (i in seq_along(outliers)) {
  text(1, outliers[i], labels=which(Varanalizar == outliers[i]), cex=.8, pos=4)
}
[Figure: boxplot of Ex1 with observation 29 labelled above the upper whisker]

Outlier: observation 29
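
The two loops above read the observation numbers off the plot. As a hedged alternative, the same numbers can be obtained without drawing anything through `boxplot.stats()`, which applies the same 1.5 x IQR rule that `boxplot()` uses by default. The helper name `outlier_index` and the vectors `crime` and `ex1` are introduced here for illustration only; they re-enter the columns R and Ex1.

```r
# Columns R and Ex1 of the data set, re-entered by hand.
crime <- c(79.1, 163.5, 57.8, 196.9, 123.4, 68.2, 96.3, 155.5, 85.6, 70.5,
           167.4, 84.9, 51.1, 66.4, 79.8, 94.6, 53.9, 92.9, 75.0, 122.5,
           74.2, 43.9, 121.6, 96.8, 52.3, 199.3, 34.2, 121.6, 104.3, 69.6,
           37.3, 75.4, 107.2, 92.3, 65.3, 127.2, 83.1, 56.6, 82.6, 115.1,
           88.0, 54.2, 82.3, 103.0, 45.5, 50.8, 84.9)
ex1 <- c(56, 95, 44, 141, 101, 115, 79, 109, 62, 68, 116, 71, 60, 61, 53,
         77, 63, 115, 128, 105, 67, 44, 83, 73, 57, 143, 71, 76, 157, 54,
         54, 81, 64, 97, 87, 98, 56, 47, 54, 74, 66, 54, 70, 96, 41, 97, 91)

# Hypothetical helper: positions of the values that boxplot() would flag.
outlier_index <- function(x) which(x %in% boxplot.stats(x)$out)

print(outlier_index(crime))  # 4 11 26
print(outlier_index(ex1))    # 29
```

The positions are returned in increasing order, so the three crime-rate outliers appear as 4, 11 and 26 rather than in the order they were labelled on the plot.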
# Splitting the variable Ex1 according to the variable S
> split(RyM$Ex1, RyM$S)
$no
 [1]  95 141 101 115  68 116  71  60  61  63 128 105  67  83  73  57 143  71  76
[20] 157  54  81  97  87  98  47  66  54  96  97  91

$si
 [1]  56  44  79 109  62  53  77 115  44  54  64  56  54  74  70  41

Varanalizar <- split(RyM$Ex1, RyM$S)$no
outliers <- boxplot(Varanalizar, plot=FALSE)$out
nout <- as.character(outliers)
boxplot(Varanalizar, col="blue")
for (i in seq_along(outliers)) {
  text(1, outliers[i], labels=which(Varanalizar == outliers[i]), cex=.8, pos=4)
}
[Figure: boxplot of Ex1 for the group S = no, with one point labelled 20]

Outlier: position 20 within the group S = no (Ex1 = 157, observation 29 of the full data set)

Varanalizar <- split(RyM$Ex1, RyM$S)$si
outliers <- boxplot(Varanalizar, plot=FALSE)$out
nout <- as.character(outliers)
boxplot(Varanalizar, col="blue")
for (i in seq_along(outliers)) {
  text(1, outliers[i], labels=which(Varanalizar == outliers[i]), cex=.8, pos=4)
}
[Figure: boxplot of Ex1 for the group S = si, with points labelled 8 and 4]

Outliers: positions 8 and 4 within the group S = si (Ex1 = 115 and 109, observations 18 and 8 of the full data set)
> par(mfrow=c(1,2))
> plot(R~Ex1, RyM)
> plot(R~S, RyM)
[Figure: left, scatter plot of R against Ex1; right, boxplots of R for S = no and S = si]

###
> modelo <- lm(RyM$R ~ RyM$Ex1)
> summary(modelo)

Call:
lm(formula = RyM$R ~ RyM$Ex1)

Residuals:
    Min      1Q  Median      3Q     Max
-59.558 -15.676   1.229  14.674  59.374

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.5164    13.0427   1.266    0.212
RyM$Ex1       0.9222     0.1537   6.001 3.11e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 29.14 on 45 degrees of freedom
Multiple R-squared:  0.4445,    Adjusted R-squared:  0.4322
F-statistic: 36.01 on 1 and 45 DF,  p-value: 3.114e-07

R-squared comes out at almost 45%, influenced by the extreme values in the model.
> plot(modelo, 2)
[Figure: Normal Q-Q plot of the standardized residuals of lm(R ~ Ex1); observations 19 and 29 are labelled]

We can remove observations 29, 19 and 2 from the model to see how our R-squared
changes.
> modelo <- lm(R ~ Ex1, RyM[-c(2,19,29),])
> summary(modelo)

Call:
lm(formula = R ~ Ex1, data = RyM[-c(2, 19, 29), ])

Residuals:
    Min      1Q  Median      3Q     Max
-64.792 -14.427   2.869  15.707  33.953

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.7190    12.2912  -0.058    0.954
Ex1           1.1627     0.1518   7.659 1.68e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 25 on 42 degrees of freedom
Multiple R-squared:  0.5828,    Adjusted R-squared:  0.5729
F-statistic: 58.67 on 1 and 42 DF,  p-value: 1.682e-09
# our R-squared rises to 58%

> subset(RyM, select=c("R","Ex1"))
       R Ex1
1   79.1  56
2  163.5  95
3   57.8  44
4  196.9 141
5  123.4 101
6   68.2 115
7   96.3  79
8  155.5 109
9   85.6  62
10  70.5  68
11 167.4 116
12  84.9  71
13  51.1  60
14  66.4  61
15  79.8  53
16  94.6  77
17  53.9  63
18  92.9 115
19  75.0 128
20 122.5 105
21  74.2  67
22  43.9  44
23 121.6  83
24  96.8  73
25  52.3  57
26 199.3 143
27  34.2  71
28 121.6  76
29 104.3 157
30  69.6  54
31  37.3  54
32  75.4  81
33 107.2  64
34  92.3  97
35  65.3  87
36 127.2  98
37  83.1  56
38  56.6  47
39  82.6  54
40 115.1  74
41  88.0  66
42  54.2  54
43  82.3  70
44 103.0  96
45  45.5  41
46  50.8  97
47  84.9  91

> data <- subset(RyM, select=c("R","Ex1"))
> cor(data)
            R       Ex1
R   1.0000000 0.6667141
Ex1 0.6667141 1.0000000
> plot(R~Ex1, data)
> abline(0, 1)
> abline(lm(R~Ex1, data)$coef, lty=5)
> modelo <- lm(R~Ex1, data)
> summary(modelo)

Call:
lm(formula = R ~ Ex1, data = data)

Residuals:
    Min      1Q  Median      3Q     Max
-59.558 -15.676   1.229  14.674  59.374

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.5164    13.0427   1.266    0.212
Ex1           0.9222     0.1537   6.001 3.11e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 29.14 on 45 degrees of freedom
Multiple R-squared:  0.4445,    Adjusted R-squared:  0.4322
F-statistic: 36.01 on 1 and 45 DF,  p-value: 3.114e-07

# We can improve our R-squared by removing observations
> modelo <- lm(R~Ex1, data=RyM[-c(2,19,18,29,11,46,22),])
> summary(modelo)

Call:
lm(formula = R ~ Ex1, data = RyM[-c(2, 19, 18, 29, 11, 46, 22), ])

Residuals:
    Min      1Q  Median      3Q     Max
-69.022 -11.750   2.798  15.006  32.326

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -3.3652    11.9364  -0.282     0.78
Ex1           1.2225     0.1509   8.104 8.32e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.67 on 38 degrees of freedom
Multiple R-squared:  0.6335,    Adjusted R-squared:  0.6238
F-statistic: 65.67 on 1 and 38 DF,  p-value: 8.325e-10

> plot(R~Ex1, data=RyM[-c(2,19,18,29,11,46,22),])
> abline(modelo)
[Figure: scatter plot of R against Ex1 for the reduced data, with the fitted regression line]

# We can see that some observations are still far from the fitted regression line.
# With the observations that were removed, an R-squared of 63% was reached.
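
The effect of dropping observations on R-squared can be checked side by side. The sketch below is an illustration, not the author's code. It re-enters R and Ex1 in a data frame `d` (the names `d`, `crime` and `ex1` are chosen here), refits the line with and without observations 2, 19 and 29, and, as a swapped-in diagnostic that the original does not use, lists the cases with the largest Cook's distances. A higher R-squared after deleting points does not by itself show a better model, so it seems worth reporting the deleted cases alongside the fit.

```r
# Columns R and Ex1 of the data set, re-entered by hand.
d <- data.frame(
  crime = c(79.1, 163.5, 57.8, 196.9, 123.4, 68.2, 96.3, 155.5, 85.6, 70.5,
            167.4, 84.9, 51.1, 66.4, 79.8, 94.6, 53.9, 92.9, 75.0, 122.5,
            74.2, 43.9, 121.6, 96.8, 52.3, 199.3, 34.2, 121.6, 104.3, 69.6,
            37.3, 75.4, 107.2, 92.3, 65.3, 127.2, 83.1, 56.6, 82.6, 115.1,
            88.0, 54.2, 82.3, 103.0, 45.5, 50.8, 84.9),
  ex1 = c(56, 95, 44, 141, 101, 115, 79, 109, 62, 68, 116, 71, 60, 61, 53,
          77, 63, 115, 128, 105, 67, 44, 83, 73, 57, 143, 71, 76, 157, 54,
          54, 81, 64, 97, 87, 98, 56, 47, 54, 74, 66, 54, 70, 96, 41, 97, 91)
)

fit_all  <- lm(crime ~ ex1, data = d)                    # all 47 observations
fit_drop <- lm(crime ~ ex1, data = d[-c(2, 19, 29), ])   # without 2, 19, 29

r2 <- c(all = summary(fit_all)$r.squared,
        dropped = summary(fit_drop)$r.squared)
print(round(r2, 4))

# Cook's distance (an added diagnostic): the five most influential cases.
cd <- cooks.distance(fit_all)
print(head(sort(cd, decreasing = TRUE), 5))
```

The two R-squared values should reproduce the 0.4445 and 0.5828 reported in the summaries above.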

> summary(lm(R~Age+S+Ed+Ex0+Ex1+LF+M+N+NW+U1+U2+W+X, data=RyM))

Call:
lm(formula = R ~ Age + S + Ed + Ex0 + Ex1 + LF + M + N + NW +
    U1 + U2 + W + X, data = RyM)

Residuals:
    Min      1Q  Median      3Q     Max
-34.884 -11.923  -1.135  13.495  50.560

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.918e+02  1.559e+02  -4.438 9.56e-05 ***
Age          1.040e+00  4.227e-01   2.460  0.01931 *
Ssi         -8.308e+00  1.491e+01  -0.557  0.58117
Ed           1.802e+00  6.496e-01   2.773  0.00906 **
Ex0          1.608e+00  1.059e+00   1.519  0.13836
Ex1         -6.673e-01  1.149e+00  -0.581  0.56529
LF          -4.103e-02  1.535e-01  -0.267  0.79087
M            1.648e-01  2.099e-01   0.785  0.43806
N           -4.128e-02  1.295e-01  -0.319  0.75196
NW           7.175e-03  6.387e-02   0.112  0.91124
U1          -6.017e-01  4.372e-01  -1.376  0.17798
U2           1.792e+00  8.561e-01   2.093  0.04407 *
W            1.374e-01  1.058e-01   1.298  0.20332
X            7.929e-01  2.351e-01   3.373  0.00191 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.94 on 33 degrees of freedom
Multiple R-squared:  0.7692,    Adjusted R-squared:  0.6783
F-statistic: 8.462 on 13 and 33 DF,  p-value: 3.686e-07
Remove NW
> summary(lm(R~Age+S+Ed+Ex0+Ex1+LF+M+N+U1+U2+W+X, data=RyM))

Call:
lm(formula = R ~ Age + S + Ed + Ex0 + Ex1 + LF + M + N + U1 +
    U2 + W + X, data = RyM)

Residuals:
   Min     1Q Median     3Q    Max
-35.29 -11.72  -0.96  13.71  50.91

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -690.29703  153.01217  -4.511 7.32e-05 ***
Age            1.05547    0.39324   2.684  0.01116 *
Ssi           -7.50829   12.90942  -0.582  0.56466
Ed             1.78928    0.63096   2.836  0.00764 **
Ex0            1.59641    1.03837   1.537  0.13345
Ex1           -0.64185    1.10981  -0.578  0.56685
LF            -0.03540    0.14295  -0.248  0.80588
M              0.15887    0.20023   0.793  0.43303
N             -0.04116    0.12762  -0.323  0.74904
U1            -0.59321    0.42432  -1.398  0.17116
U2             1.79147    0.84356   2.124  0.04105 *
W              0.13472    0.10167   1.325  0.19401
X              0.79417    0.23139   3.432  0.00159 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.61 on 34 degrees of freedom
Multiple R-squared:  0.7691,    Adjusted R-squared:  0.6877
F-statistic: 9.44 on 12 and 34 DF,  p-value: 1.159e-07
Remove LF
> summary(lm(R~Age+S+Ed+Ex0+Ex1+M+N+U1+U2+W+X, data=RyM))

Call:
lm(formula = R ~ Age + S + Ed + Ex0 + Ex1 + M + N + U1 + U2 +
    W + X, data = RyM)

Residuals:
    Min      1Q  Median      3Q     Max
-35.300 -12.480  -1.150  13.050  50.860

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -683.94267  148.80924  -4.596  5.4e-05 ***
Age            1.06058    0.38739   2.738  0.00966 **
Ssi           -6.17049   11.56670  -0.533  0.59708
Ed             1.74411    0.59587   2.927  0.00598 **
Ex0            1.55088    1.00817   1.538  0.13297
Ex1           -0.58155    1.06816  -0.544  0.58959
M              0.13488    0.17288   0.780  0.44051
N             -0.04601    0.12440  -0.370  0.71372
U1            -0.55055    0.38253  -1.439  0.15898
U2             1.78151    0.83123   2.143  0.03912 *
W              0.13283    0.10002   1.328  0.19276
X              0.78095    0.22210   3.516  0.00123 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.32 on 35 degrees of freedom
Multiple R-squared:  0.7687,    Adjusted R-squared:  0.696
F-statistic: 10.58 on 11 and 35 DF,  p-value: 3.52e-08
Remove N
> summary(lm(R~Age+S+Ed+Ex0+Ex1+M+U1+U2+W+X, data=RyM))

Call:
lm(formula = R ~ Age + S + Ed + Ex0 + Ex1 + M + U1 + U2 + W +
    X, data = RyM)

Residuals:
    Min      1Q  Median      3Q     Max
-34.148 -12.773   0.667  12.649  49.797

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -702.74642  138.16819  -5.086 1.15e-05 ***
Age            1.06321    0.38265   2.779 0.008622 **
Ssi           -5.46561   11.27101  -0.485 0.630666
Ed             1.74234    0.58867   2.960 0.005416 **
Ex0            1.49202    0.98353   1.517 0.137996
Ex1           -0.55500    1.05289  -0.527 0.601343
M              0.16509    0.15053   1.097 0.280026
U1            -0.56407    0.37619  -1.499 0.142476
U2             1.77880    0.82117   2.166 0.036997 *
W              0.12639    0.09731   1.299 0.202227
X              0.75334    0.20666   3.645 0.000837 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.07 on 36 degrees of freedom
Multiple R-squared:  0.7678,    Adjusted R-squared:  0.7033
F-statistic: 11.91 on 10 and 36 DF,  p-value: 1.042e-08
Remove Ex1
> summary(lm(R~Age+S+Ed+Ex0+M+U1+U2+W+X, data=RyM))

Call:
lm(formula = R ~ Age + S + Ed + Ex0 + M + U1 + U2 + W + X, data = RyM)

Residuals:
    Min      1Q  Median      3Q     Max
-37.416 -13.828  -0.419  12.294  48.106

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -716.54697  134.33461  -5.334 5.01e-06 ***
Age            1.06853    0.37877   2.821 0.007652 **
Ssi           -6.50570   10.98812  -0.592 0.557407
Ed             1.70198    0.57794   2.945 0.005557 **
Ex0            0.98333    0.18793   5.232 6.86e-06 ***
M              0.17728    0.14728   1.204 0.236343
U1            -0.57520    0.37191  -1.547 0.130470
U2             1.80902    0.81113   2.230 0.031880 *
W              0.12828    0.09629   1.332 0.190899
X              0.77110    0.20189   3.819 0.000494 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.86 on 37 degrees of freedom
Multiple R-squared:  0.766,     Adjusted R-squared:  0.7091
F-statistic: 13.46 on 9 and 37 DF,  p-value: 3.093e-09
Remove S
> summary(lm(R~Age+Ed+Ex0+M+U1+U2+W+X, data=RyM))

Call:
lm(formula = R ~ Age + Ed + Ex0 + M + U1 + U2 + W + X, data = RyM)

Residuals:
     Min       1Q   Median       3Q      Max
-36.9357 -13.1677   0.7294  11.7858  49.9309

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -714.40473  133.13339  -5.366 4.21e-06 ***
Age            1.00832    0.36173   2.788 0.008248 **
Ed             1.74164    0.56912   3.060 0.004044 **
Ex0            0.97854    0.18614   5.257 5.94e-06 ***
M              0.18489    0.14546   1.271 0.211415
U1            -0.52958    0.36072  -1.468 0.150301
U2             1.70702    0.78582   2.172 0.036141 *
W              0.12714    0.09544   1.332 0.190732
X              0.73210    0.18921   3.869 0.000415 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.68 on 38 degrees of freedom
Multiple R-squared:  0.7638,    Adjusted R-squared:  0.7141
F-statistic: 15.36 on 8 and 38 DF,  p-value: 8.88e-10

Remove M
> summary(lm(R~Age+Ed+Ex0+U1+U2+W+X, data=RyM))

Call:
lm(formula = R ~ Age + Ed + Ex0 + U1 + U2 + W + X, data = RyM)

Residuals:
     Min       1Q   Median       3Q      Max
-42.4673 -12.8154  -0.3834  11.4613  51.9505

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -614.66543  108.39833  -5.670 1.49e-06 ***
Age            1.13175    0.35119   3.223 0.002566 **
Ed             2.02739    0.52695   3.847 0.000431 ***
Ex0            0.98629    0.18751   5.260 5.50e-06 ***
U1            -0.30919    0.31880  -0.970 0.338105
U2             1.44156    0.76352   1.888 0.066475 .
W              0.14548    0.09509   1.530 0.134103
X              0.79609    0.18382   4.331 0.000101 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.84 on 39 degrees of freedom
Multiple R-squared:  0.7538,    Adjusted R-squared:  0.7096
F-statistic: 17.06 on 7 and 39 DF,  p-value: 4.342e-10
Remove U1
> summary(lm(R~Age+Ed+Ex0+U2+W+X, data=RyM))

Call:
lm(formula = R ~ Age + Ed + Ex0 + U2 + W + X, data = RyM)

Residuals:
    Min      1Q  Median      3Q     Max
-38.306 -10.209  -1.313   9.919  54.544

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -618.5028   108.2456  -5.714 1.19e-06 ***
Age            1.1252     0.3509   3.207 0.002640 **
Ed             1.8179     0.4803   3.785 0.000505 ***
Ex0            1.0507     0.1752   5.996 4.78e-07 ***
U2             0.8282     0.4274   1.938 0.059743 .
W              0.1596     0.0939   1.699 0.097028 .
X              0.8236     0.1815   4.538 5.10e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.83 on 40 degrees of freedom
Multiple R-squared:  0.7478,    Adjusted R-squared:  0.71
F-statistic: 19.77 on 6 and 40 DF,  p-value: 1.441e-10
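
The sequence of fits above removes one predictor at a time, in most steps the one with the largest p-value. That procedure can be sketched as a loop. The function name `backward_eliminate`, the 0.10 threshold and the use of `drop1()` with an F test are choices made here, not taken from the original. The demonstration uses R's built-in `mtcars` data rather than the crime data so that the block runs on its own.

```r
# Hypothetical helper: drop the least significant term until every remaining
# term has a p-value at or below alpha.
backward_eliminate <- function(fit, alpha = 0.10) {
  repeat {
    # Stop if no predictors are left.
    if (length(attr(terms(fit), "term.labels")) == 0) break
    tab <- drop1(fit, test = "F")      # one row per term, plus "<none>"
    p <- tab[["Pr(>F)"]][-1]           # skip the "<none>" row
    if (max(p) <= alpha) break         # every remaining term is significant
    worst <- rownames(tab)[-1][which.max(p)]
    fit <- update(fit, as.formula(paste(". ~ . -", worst)))
  }
  fit
}

# Demonstration on a built-in data set (not the crime data).
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
reduced <- backward_eliminate(full)
print(formula(reduced))
print(summary(reduced)$coefficients)
```

For a term with one degree of freedom the F test from `drop1()` gives the same p-value as the t test in `summary()`, so on numeric predictors the loop follows the same criterion as reading the largest `Pr(>|t|)` off each summary. Applied to the crime data it would be called as `backward_eliminate(lm(R ~ ., data = RyM))`, assuming `RyM` is loaded as above.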
> modelo <- lm(R~Age+Ed+Ex0+U2+X, data=RyM)
> modelo

Call:
lm(formula = R ~ Age + Ed + Ex0 + U2 + X, data = RyM)

Coefficients:
(Intercept)          Age           Ed          Ex0           U2            X
  -524.3743       1.0198       2.0308       1.2331       0.9136       0.6349

> shapiro.test(residuals(modelo))

        Shapiro-Wilk normality test

data:  residuals(modelo)
W = 0.9715, p-value = 0.3017

# Our model improves when we remove some observations
# Observations 2, 19, 18, 29, 11 and 46 will be removed from the model
# We see that our R-squared improves (87%)
> modelo <- lm(R~Age+Ed+Ex0+U2+X, data=RyM[-c(2,19,18,29,11,46),])
> summary(modelo)

Call:
lm(formula = R ~ Age + Ed + Ex0 + U2 + X, data = RyM[-c(2, 19,
    18, 29, 11, 46), ])

Residuals:
     Min       1Q   Median       3Q      Max
-30.4608  -6.2641  -0.1716   7.9186  25.0147

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -385.2597    74.8634  -5.146 1.03e-05 ***
Age            0.7516     0.2599   2.892  0.00654 **
Ed             1.2572     0.3746   3.356  0.00191 **
Ex0            1.4829     0.1172  12.657 1.27e-14 ***
U2             0.3252     0.3293   0.987  0.33018
X              0.5398     0.1005   5.371 5.22e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 14.19 on 35 degrees of freedom
Multiple R-squared:  0.8724,    Adjusted R-squared:  0.8542
F-statistic: 47.88 on 5 and 35 DF,  p-value: 1.126e-14

> stem(RyM$Age)

  The decimal point is 1 digit(s) to the right of the |

  11 | 9
  12 | 1345566678
  13 | 000111234555669
  14 | 001122335789
  15 | 012277
  16 | 26
  17 | 7

> stem(RyM$Ed)
The decimal point is 1 digit(s) to the right of the |
8 | 77888999
9 | 013
9 | 69
10 | 0244444
10 | 578889999
11 | 001233
11 | 666788
12 | 111112
> stem(RyM$LF)

The decimal point is 1 digit(s) to the right of the |


48 | 07
50 | 0259
52 | 112360135677
54 | 02273
56 | 00471447
58 | 03615599
60 | 12
62 | 34128
64 | 1
> stem(RyM$M)
The decimal point is 1 digit(s) to the right of the |
92 | 48
94 | 803356
96 | 24445688992223478
98 | 1244556690468
100 | 228
102 | 498
104 | 59
106 | 1
> stem(RyM$N)
The decimal point is 1 digit(s) to the right of the |
0 | 3344667779900348889
2 | 22345890133469
4 | 0036701
6 | 8
8 | 67
10 | 13
12 |
14 | 7
16 | 8
> stem(RyM$NW)
The decimal point is 2 digit(s) to the right of the |
0 | 0011122222222334444
0 | 556788888999
1 | 00134
1 | 778
2 | 12
2 | 59
3 | 02
3 | 5
4 | 2
> stem(RyM$U1)

  The decimal point is 1 digit(s) to the right of the |

   7 | 02366777889
   8 | 0133446789
   9 | 12246779
  10 | 022223578
  11 | 1346
  12 | 4
  13 | 055
  14 | 2
> stem(RyM$U2)
The decimal point is 1 digit(s) to the right of the |
2 | 001244
2 | 5567778889
3 | 11233444
3 | 555556677889
4 | 0011233
4 | 7
5 | 03
5 | 8
> stem(RyM$W)

  The decimal point is 2 digit(s) to the right of the |

  2 | 9
  3 | 2
  3 | 89
  4 | 001233
  4 | 56679999
  5 | 0113344
  5 | 66667889999
  6 | 2223334
  6 | 6779

> stem(RyM$X)

  The decimal point is 1 digit(s) to the right of the |

  12 | 659
  14 | 42348
  16 | 0235666780012446
  18 | 0456
  20 | 0665
  22 | 4578379
  24 | 79014
  26 | 146

> h<-graph.freq(RyM$Ex0,frequency=3)
> summary(h)
   Inf   Sup     MC fi        fri Fi       Fri
  45.0  62.3  53.65 12 0.25531915 12 0.2553191
  62.3  79.6  70.95 12 0.25531915 24 0.5106383
  79.6  96.9  88.25  8 0.17021277 32 0.6808511
  96.9 114.2 105.55  7 0.14893617 39 0.8297872
 114.2 131.5 122.85  5 0.10638298 44 0.9361702
 131.5 148.8 140.15  0 0.00000000 44 0.9361702
 148.8 166.1 157.45  3 0.06382979 47 1.0000000
> lines(density(RyM$Ex0),col="red")
> polygon.freq(h,frequency=3,col="blue")
