. tabstat price rep78 weight mpg,by(foreignd) stat(mean sd min max skew kurt cv
> p25 p50 p75 p99) col(stat)

Summary for variables: price rep78 weight mpg
     by categories of: foreignd (foreign D)

foreignd |      mean        sd       min       max  skewness  kurtosis
---------+------------------------------------------------------------
       0 |  6384.682  2621.915      3748     12990  1.215236  3.555178
         |  4.285714  .7171372         3         5 -.4592793  2.104167
         |  2315.909  433.0035      1760      3420  1.056582  3.368013
         |  24.77273  6.611187        14        41   .657329   3.10734
---------+------------------------------------------------------------
       1 |  6072.423  3097.104      3291     15906  1.777939  5.090316
         |  3.020833   .837666         1         5 -.0388361  3.574874
         |  3317.115  695.3637      1800      4840   -.24371  2.784673
         |  19.82692  4.743297        12        34  .7712432  3.441459
---------+------------------------------------------------------------
   Total |  6165.257  2949.496      3291     15906  1.653434  4.819188
         |  3.405797  .9899323         1         5 -.0570331  2.678086
         |  3019.459  777.1936      1760      4840  .1481164  2.118403
         |   21.2973  5.785503        12        41  .9487176  3.975005
----------------------------------------------------------------------

foreignd |        cv       p25       p50       p75       p99
---------+--------------------------------------------------
       0 |  .4106571      4499      5759      7140     12990
         |   .167332         4         4         5         5
         |  .1869691      2020      2180      2650      3420
         |  .2668736        21      24.5        28        41
---------+--------------------------------------------------
       1 |  .5100278      4184    4782.5      6234     15906
         |  .2772963         3         3         3         5
         |   .209629      2790      3360      3730      4840
         |  .2392352      16.5        19        22        34
---------+--------------------------------------------------
   Total |   .478406      4195    5006.5      6342     15906
         |   .290661         3         3         4         5
         |  .2573949      2240      3190      3600      4840
         |  .2716543        18        20        25        41
------------------------------------------------------------
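The cv column can be cross-checked by hand, since the coefficient of variation is the standard deviation divided by the mean. A minimal Python sketch, assuming the first row of each group is price (the order of the varlist); the dictionary and variable names are illustrative and not part of the Stata session:

```python
# Cross-check of the cv column: cv = sd / mean.
# The (mean, sd) pairs are the price rows copied from the tabstat output.
price_stats = {
    "0": (6384.682, 2621.915),
    "1": (6072.423, 3097.104),
    "Total": (6165.257, 2949.496),
}

# Coefficient of variation for each group of foreignd.
cv = {group: sd / mean for group, (mean, sd) in price_stats.items()}

for group, value in cv.items():
    print(f"foreignd={group}: cv = {value:.7f}")
```

The three results should agree with the cv values reported for price, up to rounding of the printed mean and sd.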
---------+--------------------------------------------------------------------
    diff |   312.2587    754.4488    -1191.708    1816.225
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =   0.4139
Ho: diff = 0                                     degrees of freedom =       72

Ha: diff < 0         Pr(T < t) = 0.6599
Ha: diff != 0        Pr(|T| > |t|) = 0.6802

Ha: diff != 0        Pr(|T| > |t|) = 0.0005
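The reported t can be reproduced from the diff row. A minimal Python sketch, assuming the four numbers in that row are the mean difference, its standard error, and the 95% confidence limits (the header row did not survive in this log, so that reading is an assumption); the names are illustrative:

```python
# Values copied from the diff row of the t test output.
diff = 312.2587        # assumed: mean(0) - mean(1)
std_err = 754.4488     # assumed: standard error of the difference
ci_low, ci_high = -1191.708, 1816.225  # assumed: 95% confidence limits

# t statistic is the difference divided by its standard error.
t_stat = diff / std_err

# The confidence interval should be centred on the difference.
ci_midpoint = (ci_low + ci_high) / 2

# The two-sided p-value is twice the smaller one-sided tail probability.
p_lower = 0.6599                      # Pr(T < t) from the output
p_two_sided = 2 * (1 - p_lower)

print(f"t = {t_stat:.4f}")
print(f"CI midpoint = {ci_midpoint:.4f}")
print(f"two-sided p = {p_two_sided:.4f}")
```

With a two-sided p-value this large, Ho : diff = 0 cannot be rejected for this comparison.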
. save
invalid file specification
r(198);
. save, replace
file D:\amit\NIBM\TERM-III\RMPS\stata\autodata.dta saved
. su mpg

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         mpg |        74     21.2973    5.785503         12         41
. su mpg trunk

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         mpg |        74     21.2973    5.785503         12         41
       trunk |        74    13.75676    4.277404          5         23
. su mpg.detail
variable detail not found
r(111);
. su mpg.detail
variable detail not found
r(111);
. su mpg,detail

                             mpg
-------------------------------------------------------------
      Percentiles      Smallest
 1%           12             12
 5%           14             12
10%           14             14       Obs                  74
25%           18             14       Sum of Wgt.          74

50%           20                      Mean            21.2973
                        Largest       Std. Dev.      5.785503
75%           25             34
90%           29             35       Variance       33.47205
95%           34             35       Skewness       .9487176
99%           41             41       Kurtosis       3.975005
Here the median (50th percentile) of mpg is 20 and the mean is 21.30, with
min 12 and max 41. The max of 41 contributes to the large variance of 33.47.
Kurtosis is 3.97 (above 3), i.e. leptokurtic.
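The numbers in this note can be tied together with a little arithmetic. A minimal Python sketch using only values from the detail output for mpg (names are illustrative): the variance is the square of the standard deviation, and the sign of kurtosis minus 3 gives the leptokurtic or platykurtic label.

```python
# Values copied from the "su mpg, detail" output.
mean_mpg = 21.2973
median_mpg = 20
std_dev = 5.785503
kurtosis = 3.975005

# Variance is the squared standard deviation.
variance = std_dev ** 2

# Excess kurtosis is measured relative to 3, the value for a normal distribution.
excess_kurtosis = kurtosis - 3
shape = "leptokurtic" if excess_kurtosis > 0 else "platykurtic"

# A mean above the median is what a positive skewness would suggest.
mean_above_median = mean_mpg > median_mpg

print(f"variance = {variance:.5f}")
print(f"excess kurtosis = {excess_kurtosis:.6f} ({shape})")
print(f"mean above median: {mean_above_median}")
```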
. su trunk, detail

                            trunk
-------------------------------------------------------------
      Percentiles      Smallest
 1%            5              5
 5%            7              6
10%            8              7       Obs                  74
25%           10              7       Sum of Wgt.          74

50%           14                      Mean           13.75676
                        Largest       Std. Dev.      4.277404
75%           17             21
90%           20             21       Variance       18.29619
95%           21             22       Skewness       .0292034
99%           23             23       Kurtosis       2.192052
.
The mean (13.76) and median (14) are very close and the skewness is near 0,
which suggests a normal distribution, but we also need to look at the
4th moment, i.e. kurtosis, which is 2.19 (below 3), i.e. platykurtic. Hence
trunk may not be normal.
For trunk we cannot reject Ho : that there is no skewness, since the Pr is
very high. But we can reject Ho : that there is no excess kurtosis in trunk.
Jointly, at the 5% significance level, we cannot reject Ho : that there is no
excess skewness & kurtosis, so the joint test does not reject normality for
trunk.
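As a rough cross-check on the joint conclusion, the Jarque-Bera statistic can be computed from the moments reported by su, detail. This is not the test that produced the Pr values discussed above (its output is not shown in this log); it is a simpler large-sample statistic swapped in for illustration, so its p-values need not match Stata's. The function name is hypothetical.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Return (JB statistic, p-value) from sample moments.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), compared with chi-squared(2).
    For 2 degrees of freedom the upper-tail probability is exp(-x / 2).
    """
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Moments copied from the detail output for mpg and trunk (74 observations each).
jb_mpg, p_mpg = jarque_bera(74, 0.9487176, 3.975005)
jb_trunk, p_trunk = jarque_bera(74, 0.0292034, 2.192052)

print(f"mpg:   JB = {jb_mpg:.3f}, p = {p_mpg:.4f}")
print(f"trunk: JB = {jb_trunk:.3f}, p = {p_trunk:.4f}")
```

On this approximation the joint null is rejected for mpg but not for trunk at the 5% level, which agrees with the joint reading of trunk in the note.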
                                                       Number of obs =      69
                                                       F(  7,    61) =    9.55
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.5228
                                                       Adj R-squared =  0.4680
                                                       Root MSE      =  2124.3

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -58.55192   80.56935    -0.73   0.470    -219.6603    102.5565
    foreignd |    -3191.5   987.2306    -3.23   0.002     -5165.59    -1217.41
  gear_ratio |  -372.8468   1190.573    -0.31   0.755    -2753.545    2007.851
displacement |   26.77955   7.013338     3.82   0.000     12.75552    40.80359
    headroom |  -640.6969   356.8875    -1.80   0.078    -1354.338     72.9437
      length |   1.790722   27.67926     0.06   0.949    -53.55738    57.13882
       rep78 |   316.5449   342.8423     0.92   0.359    -369.0106      1002.1
   _cons (a) |   5935.526   6429.707     0.92   0.360    -6921.468    18792.52
------------------------------------------------------------------------------
Variables with a high |t| value and a low p-value are significant. For
variables with a high p-value we cannot reject Ho : Beta = 0. For variables
with a low p-value we can safely reject Ho and say that the coefficient is
significantly different from 0.
The F statistic checks overall significance. Here Ho : all slope coefficients
are 0, which means no variable affects the price, i.e.
Ho : B1 = B2 = ... = B7 = 0.
Here F is 9.55 and p = 0.0000, so we can safely reject the null: the model
has at least one variable that explains the price, but F does not tell us
which one. To find out we check the t tests. The t tests tell us which
variables are insignificant and are candidates for elimination from the model.
Hence go for the F test first and then for the t tests. If the F test itself
cannot reject Ho, the model is poor and we need not go further to the t tests.
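The summary statistics in the regression header are linked by standard formulas, so they can be checked against each other. A minimal Python sketch using only the numbers printed above (n = 69, 7 regressors, R-squared = 0.5228); the variable names are illustrative, and small gaps are expected because R-squared is rounded to four decimals.

```python
# Values copied from the regression output.
n_obs = 69
k = 7                      # number of regressors, excluding the constant
r_squared = 0.5228
df_resid = n_obs - k - 1   # 61, the second F degrees of freedom

# Overall F statistic from R-squared.
f_stat = (r_squared / k) / ((1 - r_squared) / df_resid)

# Adjusted R-squared penalises the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# Each t ratio is the coefficient divided by its standard error.
coefs = {
    "mpg": (-58.55192, 80.56935),
    "foreignd": (-3191.5, 987.2306),
    "displacement": (26.77955, 7.013338),
}
t_ratios = {name: coef / se for name, (coef, se) in coefs.items()}

print(f"F({k}, {df_resid}) = {f_stat:.2f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
for name, t in t_ratios.items():
    print(f"t({name}) = {t:.2f}")
```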
After the regression there are several further tests, to check for
1- multicollinearity, 2- heteroscedasticity & 3- serial correlation.
With multicollinear regressors I may get a high R-squared, but the individual
coefficient estimates become unreliable, because their standard errors are
inflated. Hence I will drop one variable of each highly correlated pair to
free the model from this problem. If our objective allows multicollinearity
up to a certain level, we can keep collinear variables up to that level.
Here the null is Ho : the correlation is zero. The lower value under each
correlation is the probability of committing an error in rejecting Ho. Hence
we can safely reject Ho : there is no correlation between mpg and gear_ratio,
so there is significant correlation between them. The sig option at the top,
in the command, asks for these significance levels.
If I modify the command, I am asking Stata to pinpoint the highly significant
correlations: report only the correlations significant at 5% and put a star
on those significant at 1%.
. pwcorr mpg foreignd gear_ratio displacement headroom length rep78,print(0.05)
> star(0.01)

             |      mpg foreignd gear_r~o displa~t headroom   length    rep78
-------------+---------------------------------------------------------------
         mpg |   1.0000
    foreignd |  -0.3934*  1.0000
  gear_ratio |   0.6162* -0.7067*  1.0000
displacement |  -0.7056*  0.6138* -0.8289*  1.0000
    headroom |  -0.4138*  0.2939  -0.3779*  0.4745*  1.0000
      length |  -0.7958*  0.5702* -0.6964*  0.8351*  0.5163*  1.0000
       rep78 |   0.4023* -0.5922*  0.4103* -0.4119*          -0.3606*  1.0000
Here the starred values are significant at the 1% level (99% confidence).
These highly correlated variables are the candidates to remove from the model.
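The stars can be sanity-checked with the usual t statistic for a correlation, t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom. A minimal Python sketch, assuming n = 74 for both pairs (neither variable is shown with missing values) and taking the critical values for 72 degrees of freedom as roughly 1.99 at 5% and 2.65 at 1%; these are approximate table values, not figures from this log, and the function name is hypothetical.

```python
import math

def corr_t_stat(r, n):
    """t statistic for Ho: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Approximate two-sided critical values for 72 degrees of freedom (assumed).
T_CRIT_5PCT = 1.99
T_CRIT_1PCT = 2.65

# Correlations copied from the pwcorr table.
t_mpg_foreignd = corr_t_stat(-0.3934, 74)        # starred in the table
t_headroom_foreignd = corr_t_stat(0.2939, 74)    # printed without a star

for label, t in [("mpg, foreignd", t_mpg_foreignd),
                 ("headroom, foreignd", t_headroom_foreignd)]:
    if abs(t) > T_CRIT_1PCT:
        verdict = "significant at 1% (star)"
    elif abs(t) > T_CRIT_5PCT:
        verdict = "significant at 5% only (printed, no star)"
    else:
        verdict = "not significant at 5% (blank)"
    print(f"{label}: t = {t:.3f} -> {verdict}")
```

Under these assumptions the first pair clears the 1% threshold and the second falls between the 5% and 1% thresholds, which matches how the two entries appear in the table.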
We can also use the spearman command, which reports rank correlations and can
show more information about each correlation (rho, number of observations and
significance level).
. spearman mpg foreignd gear_ratio displacement headroom length rep78,pw
> stats(rho obs p) star(0.01)

+-----------------+
| Key             |
|-----------------|
|       rho       |
|  Number of obs  |
|   Sig. level    |
+-----------------+

             |      mpg foreignd gear_r~o displa~t headroom   length    rep78
-------------+---------------------------------------------------------------
         mpg |   1.0000
             |       74
             |
             |
    foreignd |  -0.3629*  1.0000
             |       74       74
pr
             |   0.0015
             |