Econometrics Paper

De La Salle University
A Regression Analysis on the Factors Affecting Total Health Expenditure per Capita in Asian Countries
An Individual Report
Presented to
The Faculty of Economics Department
In partial fulfillment
Of the course requirements in
Basic Econometrics
Submitted to:
Dr. Cesar Rufino
Submitted by:
Maria Pamela A. Ramos
September 6, 2013
INTRODUCTION
Background of the Study
Statement of the Problem
Significance of the Study
Objectives of the Study
Scope and Limitation
REVIEW OF RELATED LITERATURE
THEORETICAL FRAMEWORK
OPERATIONAL FRAMEWORK
Description of Variables
A-Priori Expectations
Introduction of Hypothesized Econometric Model
METHODOLOGY
Presentation of Data
Empirical Procedures
EMPIRICAL RESULTS AND INTERPRETATION OF RESULTS

Summary Statistics
Initial Regression
Overall Test of Significance
Test for Multicollinearity
Test for Heteroscedasticity
Test for Misspecification
CONCLUSION AND RECOMMENDATION
BIBLIOGRAPHY
I. Introduction
A. Statement of the Problem
Health care is among the most important services a country provides. One way to
gauge whether health care is adequately provided in a country is to look at its total
health care expenditure per capita. This paper aims to determine what causes total
health care expenditure per capita to grow or to decrease.
B. Significance of the Study
In different countries, both government and private hospitals are supposed to
provide quality health care services to the people. The mix varies, however, because
there are countries where government expenditure on health is greater than private
expenditure on health. This study is important because it can help countries decide what
to do to provide quality health care services.
C. Objectives of the Study
a.) To determine what affects total health care expenditure per capita.
b.) To understand the relationship of government health expenditure,
private health expenditure and population with total health
expenditure per capita.
c.) To give policy recommendations that would help increase total health
expenditure per capita.
D. Scope and Limitation
This study uses cross-sectional data to compare the total health expenditure
per capita of selected countries in Asia. Data for the year 2008 were used because that
year had complete information and values. However, only 33 Asian countries were
selected because the data for the others were still incomplete. The data shown are
estimated or rounded values. Total health expenditure per capita is measured at
purchasing power parity (NCU per US$), government and private health expenditure
are measured in millions of current US$, and population is in thousands.
II. Review of Related Literature

The World Health Organization, together with other organizations, has
conducted studies on the determinants of health expenditure. One is the paper by
Ke Xu, Priyanka Saksena and Alberto Holly entitled The Determinants of
Health Expenditure: A Country-level Panel Data Analysis. Its abstract reads:
The rapid growth of health expenditure has become a great concern for
both households and governments. There is extensive literature on the
determinants of health expenditure in OECD countries, but the same is not
true for developing countries. The aim of this study is to understand the
trajectory of health expenditure in developing countries. We use panel
data from 143 countries over 14 years, from 1995 to 2008 to study this.
We apply both standard fixed effects and dynamic models to explore the
factors associated with the growth of total health expenditure as well as its
main components namely, government health expenditure and out-of-pocket payments. Our data show great variation across countries in health
expenditure as a share of GDP, which ranges from less than 5% to 15%.
Apart from income many factors contribute to this variation, ranging from
demographic factors to health system characteristics. Our results suggest
that health expenditure in general does not grow faster than GDP after
taking other factors into consideration. Income elasticity is between 0.75
and 0.95 in the fixed effect model while, it is much smaller in the dynamic
model. We found no difference in health expenditure between tax-based
and insurance based health financing mechanisms. The study also
confirms the existence of fungibility, where external aid for health reduces
government health spending from domestic sources. However, the
decrease is much smaller than a dollar to dollar substitution. The study also
finds that government health expenditure and out-of-pocket payments
follow different paths and that the pace of health expenditure growth is
different for countries at different levels of economic development.
III. Theoretical Framework

The concept of gross domestic product (GDP) per capita is used in this
study. Blanchard (2010) defines gross domestic product in three equivalent ways: (1)
GDP is the value of final goods and services produced in the economy during a given
period; (2) GDP is the sum of value added in the economy during a given period; and
(3) GDP is the sum of incomes in the economy during a given period. GDP has four
components. The first is consumption, the goods and services acquired by consumers.
The second is investment, the sum of nonresidential and residential investment. The
third is government expenditure, the total procurement of goods and services by the
government. The last is net exports, the difference between exports and imports.
GDP per capita, according to Investopedia (retrieved August 18, 2013), is the
quotient of a country's GDP and its population. A greater GDP per capita indicates
growth in the economy and higher productivity. It is useful when comparing the
performance of one country with that of another.
In this research, the total expenditure on health per capita will assume the role of
GDP per capita.
IV. Operational Framework

A. Description of Variables
Table 1: Variable List and Description

| Variable | Definition |
|---|---|
| Regressand or Dependent Variable | |
| Total expenditure on health / capita at Purchasing Power Parity (NCU per US$) | The quantitative variable that measures the quotient of a country's total health expenditure, which consists of total government health expenditure and total private expenditure on health, and the country's total population. This is measured at purchasing power parity. |
| Regressor or Independent Variable | |
| General Government Expenditure on Health | Total government expenditure on health. It is measured in million current US$. |
| Private Expenditure on Health | Total outlays for health by households as direct payments, also called out-of-pocket expenditure. |
| Population | Total number of de facto resident population, as provided by the United Nations Population Division through the World Health Organization. |
B. A-Priori Expectations of Regressors

Table 2: A-Priori Expectations

| Variable | Description | A-priori Expectation |
|---|---|---|
| Endogenous Variable | | |
| thecap | Total expenditure on health / capita at Purchasing Power Parity (NCU per US$) | |
| Exogenous Variable | | |
| geh | General Government Expenditure on Health | General government expenditure on health is expected to have a positive relationship with total health expenditure per capita. An increase in general government expenditure on health increases the total expenditure on health of a country; a larger total, divided by the total population, results in a higher total expenditure per capita. |
| peh | Private Expenditure on Health | Out-of-pocket expenditure on health is expected to have a positive effect on total health expenditure per capita. An increase in private expenditure on health increases the total expenditure on health of a country; a larger total, divided by the total population, results in a higher total expenditure per capita. |
| pop | Population | Population is expected to have a negative relationship with total health expenditure per capita. As population increases, total health expenditure per capita decreases, because the expenditure is divided among more residents or citizens in the country. |
C. Introduction to Hypothesized Econometric Model

Based on the economic theories that were discussed in the preceding chapters,
the hypothesized econometric model is developed below. The model was transformed
into a log-log model. This was done to put the variables on a comparable scale, so that
each slope coefficient can be read as an elasticity, and to make the model less
susceptible to data bias.
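Model for Estimation:
$$\ln(\text{thecap}_i) = \beta_1 + \beta_2 \ln(\text{geh}_i) + \beta_3 \ln(\text{peh}_i) + \beta_4 \ln(\text{pop}_i) + u_i$$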
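As a rough illustration of how a log-log model of this kind could be estimated outside Gretl, the sketch below takes logs of the four variables and solves the OLS normal equations in plain Python. The helper names `solve_linear_system` and `fit_log_log` are hypothetical, and the eight rows are a subset of Table 3 chosen only for illustration, so the printed coefficients are not expected to match the full 33-country estimates in Table 5.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations act on both sides at once.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("X'X is singular (perfect collinearity)")
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_log_log(rows):
    """OLS of ln(thecap) on a constant, ln(geh), ln(peh) and ln(pop).

    Each row is (thecap, geh, peh, pop). Returns [b1, b2, b3, b4].
    """
    y = [math.log(thecap) for thecap, _, _, _ in rows]
    x = [[1.0, math.log(geh), math.log(peh), math.log(pop)]
         for _, geh, peh, pop in rows]
    k = 4
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Illustrative subset of Table 3: (thecap, geh, peh, pop)
sample = [
    (230, 196, 244, 3079),          # Armenia
    (440, 228, 923, 4394),          # Georgia
    (1971, 9582, 5300, 7309),       # Israel
    (2878, 335561, 79834, 127692),  # Japan
    (479, 1193, 735, 5849),         # Jordan
    (532, 4651, 3775, 27502),       # Malaysia
    (62, 264, 392, 28905),          # Nepal
    (1472, 1815, 346, 1396),        # Qatar
]

for name, beta in zip(("const", "l_geh", "l_peh", "l_pop"), fit_log_log(sample)):
    print(f"{name}: {beta:.4f}")
```

Solving the normal equations directly keeps the sketch dependency-free; for real work a dedicated regression routine with standard errors would be the more sensible choice.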
V. Methodology
A. Data
The data used in this research come from the World Health Organization's (WHO)
Global Health Expenditure Database. This database supplies internationally comparable
numbers on national health expenditures. WHO updates the data annually from publicly
available reports such as national health accounts reports and reports from National
Statistics Offices and Central Banks, as well as public expenditure information from the
World Bank, the International Monetary Fund and the like.
The data taken for this empirical analysis are the values of total health
expenditure per capita, general government expenditure on health, out-of-pocket
expenditure and population of 33 Asian countries for the year 2008. The data are
therefore cross-sectional in nature.
Table 3: Data

| Country | thecap (PPP, NCU per US$) | geh (million current US$) | peh (million current US$) | pop (thousands) |
|---|---|---|---|---|
| Afghanistan | 30 | 64 | 837 | 29,840 |
| Armenia | 230 | 196 | 244 | 3,079 |
| Azerbaijan | 373 | 403 | 1,734 | 8,944 |
| Bangladesh | 19 | 1,003 | 1,812 | 145,478 |
| Bhutan | 246 | 57 | (missing) | 701 |
| Cambodia | 111 | 105 | 2,579 | 13,823 |
| China | 285 | 104,486 | 104,705 | 1,335,720 |
| Georgia | 440 | 228 | 923 | 4,394 |
| India | 112 | 13,383 | 37,468 | 1,190,864 |
| Indonesia | 110 | 5,827 | 8,682 | 234,951 |
| Iran (Islamic Republic of) | 754 | 8,840 | 13,742 | 72,289 |
| Israel | 1,971 | 9,582 | 5,300 | 7,309 |
| Japan | 2,878 | 335,561 | 79,834 | 127,692 |
| Jordan | 479 | 1,193 | 735 | 5,849 |
| Kazakhstan | 440 | 3,019 | 2,145 | 15,655 |
| Kuwait | 1,052 | 2,228 | 619 | 2,548 |
| Kyrgyzstan | 137 | 161 | 151 | 5,204 |
| Lao People's Democratic Republic | 90 | 53 | 169 | 6,022 |
| Lebanon | 886 | 915 | 1,312 | 4,167 |
| Malaysia | 532 | 4,651 | 3,775 | 27,502 |
| Maldives | 635 | 104 | 43 | 308 |
| Mongolia | 225 | 189 | 138 | 2,667 |
| Nepal | 62 | 264 | 392 | 28,905 |
| Oman | 618 | 966 | 280 | 2,637 |
| Pakistan | 84 | 1,263 | 3,581 | 167,442 |
| Philippines | 142 | 2,171 | 4,559 | 90,173 |
| Qatar | 1,472 | 1,815 | 346 | 1,396 |
| Republic of Korea | 1,723 | 33,650 | 26,496 | 48,949 |
| Russian Federation | 1,034 | 56,746 | 28,648 | 143,163 |
| Singapore | 2,378 | 2,184 | 5,783 | 4,772 |
| Sri Lanka | 160 | 686 | 759 | 20,474 |
| Thailand | 318 | 8,236 | 2,579 | 68,268 |
| Turkey | 1,034 | 56,746 | 11,971 | 143,163 |
B. Empirical Procedures
To analyze the hypothesized econometric model, it will undergo estimation and
inference and will be tested for overall significance. For estimation, a regression
analysis is run on the model to examine the statistical dependence of the dependent
variable on one or more other variables, called the explanatory variables. For
inference, a level of significance of α = 0.05 (a 95% confidence level) is used to assess
the values that are generated. This helps in determining whether the hypothesized
econometric model is significant.
The software Gretl is used to run the multiple regression analysis for the
estimation and inference. The estimates acquired are expected to have properties such
as sufficiency, unbiasedness, consistency and efficiency. To check whether the estimates
meet these properties, tests will be conducted to detect multicollinearity,
heteroscedasticity and misspecification. If these problems arise, remedies will be applied
to correct them.
VI. Empirical Testing and Interpretation of Results

A. Summary of Data
Table 4: Summary Statistics

| Variable | Mean | Median | Minimum | Maximum | Std. Dev. | C.V. | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| l_thecap | 5.8192 | 5.9225 | 2.9628 | 7.9674 | 1.2473 | 0.21434 | -0.26752 | -0.53123 |
| l_geh | 7.4097 | 7.1415 | 3.9788 | 12.724 | 2.3076 | 0.31144 | 0.40922 | -0.57362 |
| l_peh | 7.4737 | 7.5023 | 2.2180 | 11.559 | 2.1386 | 0.28615 | -0.15258 | -0.16750 |
| l_pop | 9.8367 | 9.6586 | 5.7289 | 14.105 | 2.0281 | 0.20617 | 0.19814 | -0.53410 |
Above are the summary statistics of the data used. The summary is computed on
the log form of each variable. The table reports the moments of each variable along
with related descriptive measures. The first moment, the mean, measures central
tendency; the median, minimum and maximum values are reported alongside it. The
second moment is the variance, reported here through the standard deviation, which
measures how dispersed the data are around the mean; the coefficient of variation
(C.V.) expresses the standard deviation relative to the mean. The third moment is
skewness, the measure of symmetry. The fourth moment is kurtosis, which measures the
tail density or peakedness of the data.
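The measures described above can be sketched in a few lines of Python. The function name `describe` is hypothetical, and the shape measures use simple central moments with an n divisor, which is an assumption: Gretl may apply a different small-sample convention, so the output is illustrative rather than a reproduction of Table 4. The input is a five-country subset of the thecap column of Table 3.

```python
import math
import statistics


def describe(values):
    """Descriptive measures of the kind reported in Table 4 for one variable."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    # Central moments with an n divisor, used for the shape measures.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "sd": sd,
        "cv": sd / abs(mean),                   # coefficient of variation
        "skewness": m3 / m2 ** 1.5,             # 0 for a symmetric sample
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # 0 for a normal distribution
    }


# Logs of thecap for Armenia, Georgia, Israel, Japan and Jordan (Table 3).
l_thecap_subset = [math.log(v) for v in (230, 440, 1971, 2878, 479)]
for name, value in describe(l_thecap_subset).items():
    print(f"{name}: {value:.4f}")
```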
B. Initial Regression
Table 5: Initial Regression

| Variable | Coefficient | Standard Error | t-Ratio | p-value | |
|---|---|---|---|---|---|
| const | 7.74626 | 0.254278 | 30.46 | 1.43e-23 | *** |
| l_geh | 0.482429 | 0.0431294 | 11.19 | 4.90e-12 | *** |
| l_peh | 0.355901 | 0.0627896 | 5.668 | 3.97e-06 | *** |
| l_pop | -0.829708 | 0.0447623 | -18.54 | 1.28e-17 | *** |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Mean dependent var | 5.819173 | S.D. dependent var | 1.247274 |
| Sum squared resid | 2.253002 | S.E. of regression | 0.278729 |
| R-squared | 0.954743 | Adjusted R-squared | 0.950061 |
| F(3, 29) | 203.9274 | P-value(F) | 1.39e-19 |
| Log-likelihood | -2.534944 | Akaike criterion | 13.06989 |
| Schwarz criterion | 19.05592 | Hannan-Quinn | 15.08400 |

Log-likelihood for thecap = -194.568

Given the generated estimates and substituting them into the hypothesized
econometric model, the sample regression is as follows:
$$\ln(\text{thecap}_i) = 7.74626 + 0.482429 \ln(\text{geh}_i) + 0.355901 \ln(\text{peh}_i) - 0.829708 \ln(\text{pop}_i) + \hat{u}_i$$
The results above are examined at the level of significance mentioned in the
preceding part. Given a level of significance of α = 0.05, if the p-value of an estimate
is less than 0.05, the estimate is statistically significant and the null hypothesis that
the coefficient is zero is rejected. Conversely, when the p-value of the estimate is
greater than 0.05, it is insignificant and there is no strong evidence to reject the null
hypothesis.
To interpret the results, the significance of each estimate is discussed first. The
intercept of the model has a positive value of 7.74626, which means that when the logs
of all the independent variables are 0 (that is, when each regressor equals 1), the log of
total expenditure per capita is equal to 7.74626. Given that its p-value is less than 0.05,
or 5%, it is statistically significant.
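The general government expenditure on health (geh) and the private expenditure
on health (peh) are significant at the 5% level. Their p-values are 4.90e-12 and
3.97e-06, respectively. The regression also shows that both variables have positive
coefficients, which means they have a positive relationship with total expenditure on
health per capita. Holding the other variables constant, a 1% increase in geh is
associated with an increase of about 0.48% in total health expenditure per capita, and a
1% increase in peh with an increase of about 0.36%.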
Population's p-value is less than 0.05, so it can be inferred that it is statistically
significant; there is strong evidence against the null hypothesis that the coefficient
is 0, and the null hypothesis is rejected. Since population has a negative coefficient,
an increase in population is associated with a decrease in total health expenditure per
capita. Because the model is in log-log form, the coefficient is an elasticity: holding
the other variables constant, a 1% increase in population is associated with a decrease
of about 0.83% in total health expenditure per capita.
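To make the elasticity reading of the population coefficient concrete, here is a minimal sketch that converts a percentage change in a regressor into the implied percentage change in the dependent variable under a log-log model. The function name `pct_change_in_y` is hypothetical; the coefficient is the l_pop estimate reported in Table 5, taken with its negative sign. For small changes the result is close to the coefficient itself, while for larger changes the exact figure differs noticeably from the linear approximation.

```python
def pct_change_in_y(elasticity, pct_change_in_x):
    """Exact percentage change in y implied by a log-log coefficient.

    In ln(y) = ... + b * ln(x), scaling x by (1 + p) scales y by (1 + p) ** b.
    """
    return 100.0 * ((1.0 + pct_change_in_x / 100.0) ** elasticity - 1.0)


BETA_POP = -0.829708  # l_pop coefficient reported in Table 5

for pct in (1, 10, 50):
    change = pct_change_in_y(BETA_POP, pct)
    print(f"population +{pct}% -> thecap {change:+.2f}%")
```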
To measure the overall fit of the chosen model to the given data, the R^2
must be analyzed. The R^2 is a value that lies between 0 and 1. If it equals 1,
the fitted regression line explains 100% of the variation in the dependent
variable, and the closer R^2 is to 1, the better the fit of the model (Gujarati & Porter,
2009). From the regression analysis, the R^2 that was generated was 0.954743. This
means that 95.4743% of the variation in the log of total health expenditure per capita is
explained by the model. The adjusted R^2, which penalizes the number of regressors,
is 0.950061.
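The adjusted R^2 can be checked by hand from the reported R^2. The worked computation below assumes the standard degrees-of-freedom correction with n = 33 observations and k = 4 estimated parameters (including the constant); the symbols n and k are notation introduced here, not taken from the original text.

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k}
          = 1 - (1 - 0.954743)\,\frac{33 - 1}{33 - 4}
          = 1 - 0.045257 \times \frac{32}{29}
          \approx 0.950061
```

The result agrees with the adjusted R^2 reported by Gretl.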
C. Overall Test of Significance
Given that a multiple regression analysis is being done, the null hypothesis is a
joint hypothesis. The overall test of significance is used to test this hypothesis. It
examines whether the dependent variable is linearly related to the independent
variables taken together. Analysis of Variance (ANOVA), also called the F-test, can be
used for this. It decomposes the Total Sum of Squares (TSS) into the Explained Sum of
Squares (ESS) and the Residual Sum of Squares (RSS).
The null hypothesis for this model is that all the slope coefficients of the
independent variables are 0, while the alternative hypothesis is that not all of these
coefficients are 0. The null hypothesis is rejected if the p-value of the F-statistic
is less than the level of significance. The ANOVA table was generated from Gretl and is
shown below:
Table 6: Analysis of Variance

| | Sum of squares | df | Mean square |
|---|---|---|---|
| Regression | 47.5292 | 3 | 15.8431 |
| Residual | 2.253 | 29 | 0.0776897 |
| Total | 49.7822 | 32 | 1.55569 |

R^2 = 47.5292 / 49.7822 = 0.954743
F(3, 29) = 15.8431 / 0.0776897 = 203.927 [p-value 1.39e-19]
From the results, the p-value is 1.39e-19, which is less than the level of
significance; therefore the null hypothesis is rejected and the model passes the test for
overall significance.
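The arithmetic behind the ANOVA table can be retraced with a short sketch. The function name `anova_summary` is hypothetical, the sums of squares are the values reported in Table 6, and the 5% critical value of about 2.93 for F(3, 29) is an approximate figure from standard F tables rather than something computed here.

```python
def anova_summary(ess, rss, n_obs, n_params):
    """Rebuild the ANOVA quantities from the explained and residual sums of squares."""
    df_reg = n_params - 1      # slope coefficients only
    df_res = n_obs - n_params  # observations minus estimated parameters
    tss = ess + rss
    ms_reg = ess / df_reg
    ms_res = rss / df_res
    return {
        "df_regression": df_reg,
        "df_residual": df_res,
        "tss": tss,
        "r_squared": ess / tss,
        "f_statistic": ms_reg / ms_res,
    }


F_CRIT_5PCT = 2.93  # approximate tabulated 5% critical value for F(3, 29)

result = anova_summary(ess=47.5292, rss=2.253, n_obs=33, n_params=4)
print(f"R^2 = {result['r_squared']:.6f}")
print(f"F({result['df_regression']}, {result['df_residual']}) = {result['f_statistic']:.3f}")
print("reject H0 at 5%:", result["f_statistic"] > F_CRIT_5PCT)
```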
D. Test for Multicollinearity
Ragnar Frisch coined the term multicollinearity in 1934. It refers to the condition
where there is either an exact or nearly exact linear relationship among the X variables.
It violates the classical linear regression model assumption that there should be no
multicollinearity among the independent variables. There are two types of
multicollinearity. The first is perfect multicollinearity, in which the X'X matrix is
singular and the regression cannot be estimated. The second is high but imperfect
multicollinearity, in which the variables are highly correlated with each other; this is
dangerous for the model. Under imperfect multicollinearity OLS is still BLUE; however,
several repercussions may arise: a coefficient may erroneously appear insignificant
because of a low t-ratio, confidence intervals will be wide, R^2 can be very high even
though few t-ratios are significant, and the OLS estimators and their standard errors will
be sensitive to changes in the data (Gujarati & Porter, 2009).
One way to test for multicollinearity is to compute the Variance Inflation Factor
(VIF). It reflects the speed with which variances and covariances increase, and it
indicates how the presence of multicollinearity inflates the variance of an estimator. The
value of VIF should be less than or equal to 10; when the VIF is greater than 10, the
variable is considered highly collinear. Possible remedies include doing nothing,
transforming the variables into logarithms, removing the culprit variable, or using panel
data.
The VIF for this model was generated using Gretl and the results are as follows:
Table 7: Variance Inflation Factors
Minimum possible value = 1.0
Values > 10.0 may indicate a collinearity problem
l_geh 4.080
l_peh 7.427
l_pop 3.395
VIF(j) = 1/(1 - R(j)^2), where R(j) is the multiple correlation coefficient between variable j
and the other independent variables
Properties of matrix X'X:
1-norm = 8696.0278
Determinant = 8121844.9
Reciprocal condition number = 0.00011235508
All the exogenous variables have a VIF less than 10, which indicates tolerable
multicollinearity. The logarithm of private expenditure on health has the highest VIF
(7.427), but since it is below 10 it is not expected to cause a problem.
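The VIF formula can also be read in reverse: each reported VIF implies how much of a regressor's variation is explained by the other regressors. The sketch below uses the formula VIF(j) = 1/(1 - R(j)^2) and the VIF values reported in Table 7; the function names are hypothetical.

```python
def vif_from_r2(r2_j):
    """VIF for regressor j given the R^2 of regressing it on the other regressors."""
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif):
    """Invert the VIF formula to recover the implied auxiliary R^2."""
    return 1.0 - 1.0 / vif


reported_vif = {"l_geh": 4.080, "l_peh": 7.427, "l_pop": 3.395}  # Table 7

for name, vif in reported_vif.items():
    flag = "high" if vif > 10.0 else "tolerable"
    print(f"{name}: VIF = {vif:.3f}, implied auxiliary R^2 = {r2_from_vif(vif):.4f} ({flag})")
```

Under this reading, the rule-of-thumb threshold of 10 corresponds to an auxiliary R^2 of 0.90.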
E. Test for Heteroscedasticity
If the classical linear regression model assumption that the disturbances u_i all
have the same variance σ^2 is not satisfied, then there is heteroscedasticity. The
unbiasedness and consistency of the OLS estimators are not destroyed, but the
estimators are no longer minimum variance, or efficient; therefore OLS is not BLUE. If
heteroscedasticity exists, the variances of the OLS estimators are not given by the usual
OLS formulas, so the t and F tests based on them can be misleading and lead to faulty
conclusions. There are two ways to detect heteroscedasticity: the informal graphical
method, and formal tests such as the Park Test, Glejser Test, Spearman's Rank
Correlation Test, Goldfeld-Quandt Test, Breusch-Pagan-Godfrey Test and White's
General Heteroscedasticity Test (Gujarati & Porter, 2009).
Both the informal and formal methods will be shown with the use of Gretl.
Figure 1: Scattergram of estimated residuals plotted against the variables
Heteroscedasticity can be seen from a graph if a pattern exists. In the
graphs above, no systematic pattern is visible, which suggests that the model is not
heteroscedastic. However, graphs are subjective to interpret, so they cannot establish
whether the model is truly heteroscedastic. The formal test, White's General
Heteroscedasticity Test, was therefore conducted using Gretl and the results are below:
Table 8: White's Test for Heteroscedasticity

OLS, using observations 1-33
Dependent variable: uhat^2

| | coefficient | std. error | t-ratio | p-value |
|---|---|---|---|---|
| const | 0.374249 | 0.808220 | 0.4631 | 0.6477 |
| l_geh | 0.0294932 | 0.182433 | 0.1617 | 0.8730 |
| l_peh | 0.359006 | 0.261361 | 1.374 | 0.1828 |
| l_pop | 0.344832 | 0.236222 | 1.460 | 0.1579 |
| sq_l_geh | 0.00935994 | 0.0147458 | 0.6348 | 0.5319 |
| X2_X3 | 0.0263258 | 0.0359429 | 0.7324 | 0.4713 |
| X2_X4 | 0.0118747 | 0.0259169 | 0.4582 | 0.6511 |
| sq_l_peh | 0.0537565 | 0.0297534 | 1.807 | 0.0839 |
| X3_X4 | 0.106507 | 0.0478626 | 2.225 | 0.0362 |
| sq_l_pop | 0.0564436 | 0.0234462 | 2.407 | 0.0245 |

Unadjusted R-squared = 0.331738
Test statistic: TR^2 = 10.947346,
with p-value = P(Chi-square(9) > 10.947346) = 0.279335
White's General Heteroscedasticity Test does not depend on the normality
assumption and is easy to implement. Its null hypothesis is homoscedasticity and its
alternative is heteroscedasticity. From the results above, the p-value is 0.279335, which
is greater than 0.05, so the null hypothesis cannot be rejected. The model is therefore
treated as homoscedastic: the variance of the residuals is constant, consistent with the
OLS assumption.
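The test statistic reported for White's test is simply the sample size times the R-squared of the auxiliary regression. A minimal sketch of that last step is shown below; the function name `white_test_statistic` is hypothetical, and the 5% critical value of 16.919 for a chi-square distribution with 9 degrees of freedom is hard-coded from standard tables rather than computed.

```python
CHI2_CRIT_5PCT_DF9 = 16.919  # tabulated 5% critical value, chi-square with 9 df


def white_test_statistic(n_obs, aux_r2):
    """n * R^2 from the auxiliary regression of squared residuals."""
    return n_obs * aux_r2


# n = 33 observations and the unadjusted R-squared from Table 8.
stat = white_test_statistic(33, 0.331738)
reject_homoscedasticity = stat > CHI2_CRIT_5PCT_DF9

print(f"TR^2 = {stat:.4f}")
print("reject homoscedasticity at 5%:", reject_homoscedasticity)
```

The nine degrees of freedom correspond to the nine regressors (levels, squares and cross-products) in the auxiliary regression, excluding the constant.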
F. Test for Mis-specification
Model specification error, or bias, is a violation of the classical linear regression
model assumption that the regression model used in the analysis is correctly
specified. There are several types of misspecification errors, but the three most
important are omitted variable bias, irrelevant variable bias and incorrect functional
form (Gujarati & Porter, 2009). Omitted variable bias arises from underfitting a model
by excluding a relevant variable. OLS becomes biased and inconsistent, which makes
interpretations of the statistical significance of the estimates and of the confidence
intervals misleading. Irrelevant variable bias arises from overfitting a model by
including an irrelevant variable. The confidence intervals remain valid and OLS remains
unbiased and consistent, but the variances of the estimates are larger than necessary,
making the estimates less precise. Incorrect functional form means that the chosen form
(for example linear, log-lin, lin-log or log-log) does not match the true relationship, so
the model must be transformed into a more suitable form.
To see if there is any specification error or bias the Ramsey Regression
Specification Error Test will be conducted and the results are as follows:
Table 9: Ramsey RESET Test
RESET test for specification (squares and cubes)
Test statistic: F = 1.791166,
with p-value = P(F(2, 27) > 1.79917) = 0.186
RESET test for specification (squares only)
with p-value = P(F(1, 28) > 3.3501) = 0.0779
RESET test for specification (cubes only)
with p-value = P(F(1, 28) > 3.17414) = 0.0857
The null hypothesis is that the model is correctly specified and the
alternative hypothesis is that there is misspecification error or bias in the model. All
the p-values are greater than 0.05, so there is not enough evidence to reject the null
hypothesis. It can therefore be concluded that the model shows no evidence of
misspecification error or bias at the 5% level.
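For reference, the F statistic behind the RESET results can be written as below. This is the standard textbook form of the test rather than output from Gretl. The symbols are notation introduced here: R^2_old and R^2_new are the fits before and after adding powers of the fitted values, q is the number of added terms, n is the number of observations, and k_new is the number of parameters in the augmented regression.

```latex
F = \frac{\left(R^2_{\text{new}} - R^2_{\text{old}}\right)/q}
         {\left(1 - R^2_{\text{new}}\right)/\left(n - k_{\text{new}}\right)}
    \sim F\left(q,\; n - k_{\text{new}}\right)
```

With n = 33 and four parameters in the original model, adding both squares and cubes gives q = 2 and n - k_new = 27, while adding only one power gives q = 1 and n - k_new = 28, which matches the degrees of freedom shown in Table 9.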
VII. Conclusion
This paper aimed to determine what causes total health care expenditure per
capita to grow or to decrease. An empirical procedure was carried out to test whether
general government expenditure on health, private expenditure on health and
population affect total health care expenditure per capita. Based on the findings, it
can be concluded that the hypothesized econometric model is valid: all three regressors
are statistically significant with the expected signs, and the model passed the tests for
multicollinearity, heteroscedasticity and misspecification.
For further studies, additional variables such as prevalence of diseases,
mortality rates or the percentage of health care services delivered should be added to
better explain total health care expenditure per capita.
References:
Global Health Expenditure Database. (n.d.). World Health Organization. Retrieved
August 31, 2013, from apps.who.int/nha/database/DataExplorerRegime.aspx
Gujarati, D., & Porter, D. (2009). Basic Econometrics (5th ed.). Singapore: McGraw-Hill.
Health financing for universal coverage. (n.d.). World Health Organization. Retrieved
August 18, 2013, from www.who.int/health_financing/documents/covreport_e_11-deter-he/en/
Per Capita GDP Definition | Investopedia. (n.d.). Investopedia - Educating the world
about finance. Retrieved August 18, 2013, from
http://www.investopedia.com/terms/p/per-capita-gdp.asp