Joanna Pimento
5/16/2013
1. Executive Summary
This project analyzes the factors affecting the Exchange Traded Funds (ETFs) offered by Motilal
Oswal Asset Management Company. The aim was to develop a model using regression analysis
that explains the relationship between selected economic and market variables and the ETFs.
The analysis mainly tested the significance of interest rates, GDP, inflation and equity indices for
the ETFs.
ETF NAVs, daily returns on the underlying indices and the index values were used as
dependent variables in individual regression models. The CRISIL G10 values, CCIMIBOR,
inflation and GDP were the main predictors. The regression results were compared on R-Square, Adjusted R-Square and the P-values of the regression coefficients.
Using Microsoft Office Excel and Minitab, the analysis began with simple linear regression
models, then added and modified variables to form multiple regression models, and finally
concluded with lagged distributed regression models.
Contents
1. Executive Summary
2. Analysis of ETF NAVs
2.1. Most Shares M50
3. Analysis of Index Values of the Benchmark Index
3.1. NIFTY INDEX Analysis
4. Analysis of Daily Returns on the Benchmark Index
4.1. NIFTY Daily Returns
4.2. CNXMCAP Daily Returns
4.3. NASDAQ Daily Returns
4.4. Gold Daily Returns
5. Lagged Distributed Models
6. Time Series Plots
6.1. NIFTY v/s CCIMIBOR without Time Lag
6.2. NIFTY v/s CCIMIBOR with Time Lag
7. Autoregressive Models
7.1. NIFTY Daily Returns
7.2. CNXMCAP Daily Returns
8. Comparison of Regression Models
9. Conclusion
10. References
2. Analysis of ETF NAVs
Data for a period of 2 years is collected, and missing values are filled by interpolation.
Using Excel, the NAV of the ETF is regressed independently on each of the variables
mentioned above.
The level of significance is set at 0.05.
If the P-value of a regression coefficient is lower than 0.05, we reject the null
hypothesis that the coefficient is zero and conclude that the coefficient is statistically
significant.
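The test described above can be sketched in Python for a single predictor. This is a minimal illustration, not the Excel procedure used in the project: the data are hypothetical, the function name `simple_regression` is introduced here, and the P-value uses a normal approximation to the t distribution, which is only reasonable for large samples.

```python
import math
from statistics import NormalDist


def simple_regression(x, y, alpha=0.05):
    """Fit y = intercept + slope * x and test H0: slope = 0 at level alpha."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    r_square = 1 - rss / tss
    # Standard error of the slope from the residual variance
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se_slope
    # Assumption: normal approximation to the t distribution (large samples only)
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return {
        "slope": slope,
        "intercept": intercept,
        "r_square": r_square,
        "p_value": p_value,
        "significant": p_value < alpha,
    }


# Hypothetical data: a rate-like predictor and a NAV-like response
rate = [float(i) for i in range(1, 41)]
noise = [(2 * i) % 5 - 2 for i in range(1, 41)]
nav = [100.0 - 3.0 * r + e for r, e in zip(rate, noise)]
result = simple_regression(rate, nav)
print(result)
```

The `significant` flag applies the decision rule stated above: the null hypothesis is rejected when the P-value is below 0.05.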
Results
| Predictor | Correlation | R-Square | Intercept | P-Value (intercept) | Coefficient/slope | P-Value (slope) |
|---|---|---|---|---|---|---|
| CCIMIBOR | 45.06% | 20.30% | 103.38 | 9.83E-153 | -3.63 | 4.02E-26 |
| CRISIL G10 INDEX | 46.27% | 21.41% | 16.24 | 0.001 | 0.03 | 1.25E-27 |
| INFLATION | 14.86% | 2.21% | 79.88 | 5.49E-09 | -0.73 | 4.88E-01 |
| GDP | 3.56% | 0.13% | 69.77 | 0.13 | 0.0002 | 0.94 |
| GOLD Commodity | 11.99% | 1.44% | 79.78 | 1.54E-138 | -7.23E-05 | 0.007 |
CRISIL G10 INDEX has the highest correlation with M50 NAVs.
Its R-Square is also the highest, which suggests that the regression model using CRISIL G10 INDEX is better
suited for the analysis than the models using the other variables.
The intercept column gives the predicted value of the M50 NAV when the predictor variable is
zero.
From the intercept P-values we conclude that every intercept except the one for GDP is statistically significant.
Inflation, CCIMIBOR and Gold have negative slopes and therefore a negative relationship with M50 NAVs, while
CRISIL G10 INDEX and GDP have positive slopes.
The slope column gives the change in the M50 NAV when the predictor variable changes by one unit.
The slope P-values indicate that CCIMIBOR, CRISIL G10 INDEX and Gold are statistically significant, while GDP (0.94) and Inflation (4.88E-01) are not.
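In a regression with a single predictor, R-Square equals the square of the correlation coefficient, which is why the ranking by correlation and the ranking by R-Square agree. The short check below uses the correlation and R-Square figures reported for the M50 NAV regressions; the helper name `r_square_from_correlation` is introduced here for illustration only.

```python
# Correlation and R-Square (both in percent) as reported for the M50 NAV regressions
table = {
    "CCIMIBOR": (45.06, 20.30),
    "CRISIL G10 INDEX": (46.27, 21.41),
    "INFLATION": (14.86, 2.21),
    "GDP": (3.56, 0.13),
    "GOLD Commodity": (11.99, 1.44),
}


def r_square_from_correlation(corr_pct):
    """Square a correlation given in percent and return R-Square in percent."""
    return (corr_pct / 100) ** 2 * 100


for name, (corr, r2) in table.items():
    implied = r_square_from_correlation(corr)
    print(f"{name}: implied R-Square {implied:.2f}%, reported {r2:.2f}%")
```

The implied and reported values agree to rounding for all five predictors.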
3. Analysis of Index Values of the Benchmark Index
Data for a period of 5 years is collected, and missing values are filled by interpolation.
Using Excel, the index values themselves are regressed independently on each of the variables
mentioned above.
The level of significance is set at 0.05.
If the P-value of a regression coefficient is lower than 0.05, we reject the null
hypothesis that the coefficient is zero and conclude that the coefficient is statistically
significant.
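The report does not say which interpolation method was used for missing values. As one possible reading, the sketch below fills gaps by straight-line interpolation between the nearest known observations; the function name `interpolate_linear` and the sample series are hypothetical.

```python
def interpolate_linear(values):
    """Fill None gaps by straight-line interpolation between known neighbours.

    Leading and trailing gaps are left as None because they have only one neighbour.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled


# Hypothetical rate series with missing observations marked as None
series = [8.0, None, 9.0, None, None, None, 11.0]
print(interpolate_linear(series))
```

A monthly or quarterly series such as inflation or GDP would need a rule like this before it can be aligned with daily index data.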
NIFTY Index
Period: March 2008 - Feb 2013
Response Variable: NIFTY Index
Predictor Variables: CCIMIBOR, CRISIL G10 INDEX, CRISIL G10 RETURNS, MONTHLY INFLATION, GDP,
GOLD Commodity.
Results
| Predictor | Correlation | R-Square | Intercept | P-Value (intercept) | Coefficient/slope | P-Value (slope) |
|---|---|---|---|---|---|---|
| CCIMIBOR | 19.71% | 3.88% | 4451.68 | 2.59E-12 | 74.58 | 2.59E-12 |
| CRISIL G10 INDEX | 39.66% | 15.73% | 659.58 | 0.02 | 2.16 | 6.27E-48 |
| CRISIL G10 Returns | 3.26% | 0.11% | 4942.80 | 0.00 | -6183.73 | 0.25 |
| INFLATION | 35.86% | 12.86% | 4152.78 | 4.81E-21 | 101.97 | 0.005 |
| GDP | 63.73% | 40.61% | -159.21 | 0.92 | 0.42 | 0.003 |
| Gold Commodity | 56.25% | 31.64% | 3369.46 | 3.20E-292 | 0.03 | 2.71E-104 |
4. Analysis of Daily Returns on the Benchmark Index
Data for a period of 5 years is collected, and missing values are filled by interpolation.
Using Excel, the daily returns on each index are regressed independently on each of the variables
mentioned above.
The level of significance is set at 0.05.
If the P-value of a regression coefficient is lower than 0.05, we reject the null
hypothesis that the coefficient is zero and conclude that the coefficient is statistically
significant.
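The report does not define how daily returns were computed. A common convention, assumed here, is the simple percentage change between consecutive closing values; the function name `daily_returns` and the index levels below are hypothetical.

```python
def daily_returns(index_values):
    """Simple daily return: (today - yesterday) / yesterday for each consecutive pair."""
    return [
        (curr - prev) / prev
        for prev, curr in zip(index_values, index_values[1:])
    ]


# Hypothetical closing levels of an index on three consecutive trading days
levels = [100.0, 102.0, 99.96]
returns = daily_returns(levels)
print(returns)
```

A series of n index values yields n - 1 returns, so the return series is one observation shorter than the level series.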
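i) NIFTY Daily Returns
| Predictor | Correlation | R-Square | Intercept | P-Value (intercept) | Coefficient/slope | P-Value (slope) |
|---|---|---|---|---|---|---|
| CCIMIBOR | 5.00% | 0.25% | 0.26% | 0.09 | -0.04% | 0.08 |
| CRISIL G10 INDEX | 4.40% | 0.19% | -0.95% | 0.13 | 0.0005% | 0.12 |
| CRISIL G10 Returns | 3.87% | 0.15% | 0.00% | 0.94 | 14.67% | 0.17 |
| INFLATION | 34.12% | 11.64% | 0.35% | 0.02 | -0.05% | 0.01 |
| GDP | 10.95% | 1.20% | -0.25% | 0.66 | 0.000021% | 0.66 |
| Gold Commodity | 0.22% | 0.0005% | 0.02% | 0.63 | 5.197E-10 | 0.94 |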
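ii) CNXMCAP Returns
| Statistic | Values, one per predictor |
|---|---|
| P-Value (intercept) | 0.0006, 0.080, 0.98, 0.006, 0.78 |
| Coefficient/slope | -0.07%, 0.0005%, 18.36%, -0.06%, 0.000018% |
| P-Value (slope) | 0.0003, 0.078, 0.06, 0.003, 0.76 |
iii) NASDAQ Returns
| Statistic | Values, one per predictor |
|---|---|
| P-Value (intercept) | 0.06, 0.65, 0.18, 0.01, 0.33 |
| Coefficient/slope | -0.03%, 0.0002%, -9.80%, -0.02%, 0.000035% |
| P-Value (slope) | 0.13, 0.58, 0.35, 0.03, 0.26 |
iv) Gold Returns
| Statistic | Values, one per predictor |
|---|---|
| P-Values | 0.13, 0.97, 0.26, 0.66, 0.79 |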
5. Lagged Distributed Models
The Adjusted R-Square, the P-value of the coefficient of the maximum lag term, the Akaike information
criterion and the Schwarz information criterion are used to assess each model.
If a model with q lags was found to be inadequate, the lag q was reduced by 1 and a new model consisting
of q-1 lagged variables was tested, until the lowest P-value for the maximum lag coefficient was obtained.
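The reduction procedure can be sketched as follows. This is one reading of the steps above, assuming the search stops as soon as the coefficient of the longest lag is significant at 0.05 and that the current value of the predictor is kept alongside its lags. The function names `lagged_rows` and `select_lag` are introduced here; the P-values are those reported in this section's results for the maximum-lag coefficient.

```python
def lagged_rows(x, y, q):
    """Pair each y[t] with x[t], x[t-1], ..., x[t-q]; the first q observations are lost."""
    return [
        (y[t], [x[t - lag] for lag in range(q + 1)])
        for t in range(q, len(y))
    ]


def select_lag(q_start, p_value_of_max_lag, alpha=0.05):
    """Reduce q by 1 until the coefficient of the longest lag is significant."""
    q = q_start
    while q > 0 and p_value_of_max_lag(q) >= alpha:
        q -= 1
    return q


# P-values of the maximum-lag coefficient as reported for LAG 16 down to LAG 13
reported_p = {16: 0.382, 15: 0.713, 14: 0.033, 13: 0.108}
chosen = select_lag(16, reported_p.__getitem__)
print("chosen lag:", chosen)

# Tiny hypothetical series to show the shape of the lagged regressors
rows = lagged_rows([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], q=2)
print(rows)
```

In a full implementation `p_value_of_max_lag` would refit the regression for each q; here it simply looks up the reported value.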
Results
| Lag (q days) | R-Square | Adjusted R-Square | P-Value of q coefficient | AIC* | SIC* |
|---|---|---|---|---|---|
| LAG 16 | 8.1% | 6.9% | 0.382 | 0.000271 | 0.000291 |
| LAG 15 | 8.1% | 6.9% | 0.713 | 0.000271 | 0.000289 |
| LAG 14 | 8.1% | 7.0% | 0.033 | 0.000270 | 0.000288 |
| LAG 13 | 7.7% | 6.7% | 0.108 | 0.000273 | 0.000289 |
*AIC: Akaike Information Criterion; SIC: Schwarz Information Criterion
Inference:
From the table above, the P-value of the 16th lag coefficient (0.382) is greater than 0.05. Hence we cannot
reject the null hypothesis, and the variable is dropped from the model.
In the same way, the regression model with 15 lagged variables
is also rejected (P-value 0.713).
The best suited regression model consists of 14 lagged variables, giving an R-Square of 8.1%. The
increase in Adjusted R-Square from 6.9% to 7.0% suggests that the model without LAG 15 and LAG 16 is
better suited for the analysis.
Since the P-value of the 14th lag coefficient (0.033) is less than 0.05, we keep the variable in the model.
If the LAG 14 variable is dropped, the Adjusted R-Square falls to 6.7% in the LAG 13 model,
suggesting that dropping the variable reduces the efficiency of the model.
Lastly, comparing the AIC and SIC, the LAG 14 model has the lowest value of both and is best suited for
the analysis.
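The role of Adjusted R-Square in this inference can be illustrated with its standard formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where k is the number of regressors. The sample size is not stated in the report, so `n_assumed` below is an assumption made only for illustration.

```python
def adjusted_r_square(r_square, n, k):
    """Adjusted R-Square for n observations and k regressors (intercept excluded)."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)


n_assumed = 1222  # assumption: the report does not state the sample size
for lags, r2 in [(16, 0.081), (14, 0.081), (13, 0.077)]:
    adj = adjusted_r_square(r2, n_assumed, lags)
    print(f"LAG {lags}: R-Square {r2:.3f}, Adjusted R-Square {adj:.3f}")
```

With R-Square unchanged at 8.1%, removing two regressors raises the adjusted value, while the drop in R-Square to 7.7% at LAG 13 outweighs the saving of one more regressor.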
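Calculations:
AIC = e^(2K/N) × RSS/N
SIC = N^(K/N) × RSS/N
where RSS is the residual sum of squares, N is the number of observations and K is the number of estimated parameters.
| Model | RSS/N | K/N | AIC | N^(K/N) | SIC |
|---|---|---|---|---|---|
| LAG 16 | 0.00026 | 0.013912 | 0.000271 | 1.103942 | 0.000291 |
| LAG 15 | 0.00026 | 0.013083 | 0.000271 | 1.097467 | 0.000289 |
| LAG 14 | 0.00026 | 0.012255 | 0.000270 | 1.091039 | 0.000288 |
| LAG 13 | 0.00027 | 0.011429 | 0.000273 | 1.084658 | 0.000289 |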
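The K/N and N^(K/N) columns above are consistent with the forms of the two criteria given in Gujarati's Basic Econometrics: AIC = e^(2K/N) × RSS/N and SIC = N^(K/N) × RSS/N. The sketch below implements these forms. The values `n = 1222` and `k = 17` are assumptions: they are not stated in the report but reproduce the K/N and N^(K/N) figures listed for the LAG 16 model, and the RSS/N input is the rounded figure from the calculations, so the results only approximate the AIC and SIC listed there.

```python
import math


def aic(rss, n, k):
    """Akaike criterion in the form e^(2k/n) * RSS/n."""
    return math.exp(2 * k / n) * rss / n


def sic(rss, n, k):
    """Schwarz criterion in the form n^(k/n) * RSS/n."""
    return n ** (k / n) * rss / n


n, k = 1222, 17           # assumed values, chosen to match K/N = 0.013912
rss_over_n = 0.00026      # rounded RSS/N from the calculations
rss = rss_over_n * n

print("K/N      :", round(k / n, 6))
print("N^(K/N)  :", round(n ** (k / n), 6))
print("AIC      :", aic(rss, n, k))
print("SIC      :", sic(rss, n, k))
```

Because N^(K/N) exceeds e^(2K/N) for a sample of this size, SIC penalises each additional lag more heavily than AIC.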
6. Time-Series Plots
Objective: To compare time lagged models with simple linear models with the help of time
series plots.
1. NIFTY Daily Returns Regressed with CCIMIBOR (NO TIME LAG)
R-Square: 0.25%
Known NIFTY returns plotted along with Predicted NIFTY Returns obtained from the
regression model using CCIMIBOR (without lag) as the independent variable.
[Figure: Time Series Plot of NIFTY Returns, Predicted NIFTY Returns. Series: NIFTY Returns, Predicted NIFTY Returns; x-axis: Date.]
The NIFTY daily returns predicted from the regression model do not follow the trend of the known
NIFTY returns. This model is not suitable for the analysis.
Known NIFTY returns plotted along with Predicted NIFTY Returns obtained from the
regression model using CCIMIBOR (with a 14 day lag) as the independent variable.
[Figure: time series plot of NIFTY Returns and Predicted NIFTY Returns. Series: NIFTY Returns, Predicted NIFTY Returns; y-axis: Data; x-axis: Date.]
In comparison with the earlier time series plot (with no time lag), the NIFTY returns predicted from
the LAG 14 model are closer to the known NIFTY returns.
Conclusion:
The regression model using a 14-day lag of the CCIMIBOR values as the independent
variable is better suited for the analysis of NIFTY returns than a model with no time
lag.
7. Autoregressive Models
Objective: To determine the significance of lagged values of the dependent variable in the regression
equation.
Methodology:
Lagged values of the dependent variable, ranging from lag 1 to lag 15, are added to the regression
equations obtained above.
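Adding lagged values of the dependent variable means aligning each observation with its own earlier values. A minimal sketch of that alignment, with the hypothetical helper `add_own_lags` and a toy series, is:

```python
def add_own_lags(y, p):
    """Return (targets, lag_columns); lag_columns[j] holds y lagged by j + 1 periods."""
    targets = y[p:]
    lag_columns = [y[p - lag: len(y) - lag] for lag in range(1, p + 1)]
    return targets, lag_columns


# Toy series standing in for a daily return series
y = [1, 2, 3, 4, 5, 6]
targets, lag_columns = add_own_lags(y, p=2)
print(targets)      # values being explained
print(lag_columns)  # lag 1 column, then lag 2 column
```

Each lag column would then be added to the existing predictors as an extra regressor, at the cost of the first p observations.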
Results:
Dependent Variable: NIFTY Daily Returns
| Independent Variables | R-Square | Adjusted R-Square |
|---|---|---|
| CCIMIBOR Lag 7, CRISIL G10 Lag 14 | 0.80% | 0.60% |
| CCIMIBOR Lag 7, CRISIL G10 Lag 14, NIFTY Lag 1 | 0.90% | 0.70% |
| CCIMIBOR Lag 7, CRISIL G10 Lag 14, NIFTY Lag 1, NIFTY Lag 2 | 1.00% | 0.70% |
| CCIMIBOR Lag 7, CRISIL G10 Lag 14, NIFTY Lag 1, NIFTY Lag 2, NIFTY Lag 3 | 1.00% | 0.60% |
Dependent Variable: CNXMCAP Daily Returns
| Independent Variables | R-Square | Adjusted R-Square |
|---|---|---|
| CCIMIBOR LAG 10, CRISIL G10 LAG 14 | 2.8% | 2.6% |
| CCIMIBOR LAG 10, CRISIL G10 LAG 14, CNXMCAP LAG 1 | 2.9% | 2.7% |
| CCIMIBOR LAG 10, CRISIL G10 LAG 14, CNXMCAP LAG 1 to LAG 2 | 3.0% | 2.6% |
| CCIMIBOR LAG 10, CRISIL G10 LAG 14, CNXMCAP LAG 1 to LAG 3 | 3.0% | 2.6% |
Based on the above analysis using linear, multiple and lagged distributed models, the following
models are set up in conclusion.
Table 8.1
4.00%, 4.20%, 4.20%, 4.30%, 4.10%, 4.20%, 4.10%, 4.20%, 4.10%, 4.00%, 3.70%, 3.60%, 3.60%, 3.60%, 3.70%, 3.10%, 2.80%
Compare the colored models from Table 8.1 with the corresponding models in Table 8.2.
Table 8.2
| Independent Variables | R-Square | Adjusted R-Square |
|---|---|---|
| NIFTY Daily Returns (No Lag) | 75.10% | 75.10% |
| NIFTY (No Lag), NIFTY Lag 1 to NIFTY Lag 14 | 78.50% | 78.20% |
| NIFTY (No Lag), NIFTY Lag 1 to NIFTY Lag 10 | 78.50% | 78.30% |
| NIFTY (No Lag), NIFTY Lag 1 to NIFTY Lag 3 | 78.30% | 78.20% |
| NIFTY (No Lag), NIFTY Lag 1 to NIFTY Lag 2 | 77.50% | 77.40% |
Inference: The NIFTY Daily Returns without any Time LAG are largely significant in the analysis.
Model 2
Dependent Variable: CNXMCAP Daily Returns
Independent Variables: NIFTY, CCIMIBOR LAG 10, CRISIL G10 LAG 14
The regression equation is
CNXMCAP = 0.00083 + 0.759 NIFTY - 0.000308 CCIMIBOR LAG 10 + 0.000001 CRISIL G10 LAG 14
R-Sq = 75.3%
R-Sq(adj) = 75.3%
Model 3
Dependent Variable: CNXMCAP Daily Returns
Independent Variables: NIFTY, CCIMIBOR, CRISIL G10
The regression equation is
CNXMCAP = - 0.00122 + 0.772 NIFTY - 0.000419 CCIMIBOR + 0.000002 CRISIL G10 INDEX
R-Sq = 75.5%
R-Sq(adj) = 75.5%
Model 4
Dependent Variable: NIFTY Daily Returns
Independent Variables: CNXMCAP, CCIMIBOR LAG 14, CRISIL G10 LAG 7
The regression equation is
NIFTY = - 0.00274 + 0.000200 CCIMIBOR LAG 14 + 0.000001 CRISIL G10 LAG 7 + 0.986 CNXMCAP
R-Sq = 0.8%
R-Sq(adj) = 0.6%
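A fitted equation of this kind is used by substituting predictor values into it. The sketch below evaluates the Model 2 equation with the coefficients reported above; the function name and the input values are hypothetical and chosen only to show the arithmetic.

```python
def predict_cnxmcap_model2(nifty, ccimibor_lag10, crisil_g10_lag14):
    """Evaluate the reported Model 2 equation for CNXMCAP daily returns."""
    return (
        0.00083
        + 0.759 * nifty
        - 0.000308 * ccimibor_lag10
        + 0.000001 * crisil_g10_lag14
    )


# Hypothetical inputs: a 1% NIFTY return, an 8.0 overnight rate, a G10 index of 2000
predicted = predict_cnxmcap_model2(nifty=0.01, ccimibor_lag10=8.0, crisil_g10_lag14=2000.0)
print(f"predicted CNXMCAP daily return: {predicted:.6f}")
```

With these inputs the NIFTY term contributes 0.00759, larger than the combined contribution of the other two predictors, which matches the report's finding that NIFTY is the dominant regressor.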
9. Conclusion
The R-Square values obtained in the analysis were relatively low until another equity index was added to
the regression model. The linear models described at the start attained a maximum R-Square of 21%
when the NAVs of the ETFs were used as the dependent variable. When the daily returns on the underlying
index were regressed, the maximum R-Square was 15%, and using the index values themselves as the
dependent variable, the predictors explained up to 40% of the variation.
However, when time lagged values of the predictors were used in the regression, model efficiency
increased by approximately 8%. A 14-day lag in CCIMIBOR influenced NIFTY returns to a larger extent
than CCIMIBOR without lag. Autoregressive models did not reach an R-Square greater than 5%,
indicating that lagging the dependent variable was not significant.
Finally, when NIFTY returns were added to the model with CNXMCAP as the dependent variable, an
R-Square of 75% was obtained, clearly indicating a strong relation between the two indices. Lagging the values
of NIFTY returns in the model is not appropriate for the analysis, as indicated by Model 1. As Model 3
suggests, lagging the interest rate values makes a negligible difference to model efficiency, indicating that the central
parameter explaining variation in CNXMCAP returns is the NIFTY index.
In conclusion, interest rates, GDP and inflation do not have a very strong influence on index returns. A
regression model consisting of these parameters is not efficient for understanding the factors affecting ETFs.
Variation in CNXMCAP daily returns is most efficiently explained by NIFTY daily returns.
10. References
Books:
Gujarati, Damodar N: Basic Econometrics, 4th Edition, Tata McGraw Hill
Koop, Gary: Analysis of Economic Data, 3rd Edition.
Abner, David J.: The ETF Handbook: How to Value Exchange Traded Funds, Wiley Finance
Google Scholar:
1. Pricing Efficiency of Exchange Traded Funds- Pavel Prusevic
2. The pricing Of China region ETFs- An empirical Analysis-
Websites: