
De La Salle University

A Regression Analysis on the Factors Affecting Total Health Expenditure per Capita in Asian Countries

An Individual Report
Presented to
The Faculty of Economics Department

In partial fulfillment
Of the course requirements in
Basic Econometrics

Submitted to:
Dr. Cesar Rufino

Submitted by:
Maria Pamela A. Ramos

September 6, 2013

INTRODUCTION
Background of the Study
Statement of the Problem
Significance of the Study
Objectives of the Study
Scope and Limitation

REVIEW OF RELATED LITERATURE

THEORETICAL FRAMEWORK

OPERATIONAL FRAMEWORK
Description of Variables
A-Priori Expectations
Introduction of Hypothesized Econometric Model

METHODOLOGY
Presentation of Data
Empirical Procedures

EMPIRICAL RESULTS AND INTERPRETATION OF RESULTS
Summary Statistics
Initial Regression
Overall Test of Significance
Test for Multicollinearity
Test for Heteroscedasticity
Test for Misspecification

CONCLUSION AND RECOMMENDATION

BIBLIOGRAPHY

I. Introduction
A. Statement of the Problem
Health care is one of the most important services in a country. One way to measure
whether health care is adequately provided in a country is total health care
expenditure per capita. This paper aims to determine what causes total health care
expenditure per capita to grow or to decrease.
B. Significance of the Study
In different countries, both government and private hospitals are supposed to
provide quality health care services to the people. The mix varies, however, because
there are countries where government expenditure on health is greater than private
expenditure on health. This study is important because it helps countries identify what
they can do to provide quality health care services.
C. Objectives of the Study
a.) To determine what affects total health care expenditure per capita.
b.) To understand the relationship of the government health expenditure,
private health expenditure and population towards the total health
expenditure per capita.
c.) To give policy recommendations that would help increase total health
expenditure per capita.
D. Scope and Limitation
This study uses cross-sectional data to compare the total health expenditure per
capita of selected countries in Asia. Data from the year 2008 were used because that
year had complete information and values. Only 33 Asian countries were selected,
however, because the data for the other countries were still incomplete.

The data shown are estimated or rounded values. Total health expenditure per capita is
measured at purchasing power parity (NCU per US$), government and private health
expenditure are measured in millions of current US$, and population is in thousands.

II. Review of Related Literature


The World Health Organization, together with other organizations, has
conducted studies on the determinants of health expenditure. One is the paper by
Ke Xu, Priyanka Saksena and Alberto Holly entitled The Determinants of
Health Expenditure: A Country-level Panel Data Analysis. The authors summarize the
study as follows:
The rapid growth of health expenditure has become a great concern for
both households and governments. There is extensive literature on the
determinants of health expenditure in OECD countries, but the same is not
true for developing countries. The aim of this study is to understand the
trajectory of health expenditure in developing countries. We use panel
data from 143 countries over 14 years, from 1995 to 2008 to study this.
We apply both standard fixed effects and dynamic models to explore the
factors associated with the growth of total health expenditure as well as its
main components namely, government health expenditure and out-of-pocket
payments. Our data show great variation across countries in health
expenditure as a share of GDP, which ranges from less than 5% to 15%.
Apart from income many factors contribute to this variation, ranging from
demographic factors to health system characteristics. Our results suggest
that health expenditure in general does not grow faster than GDP after
taking other factors into consideration. Income elasticity is between 0.75
and 0.95 in the fixed effect model while, it is much smaller in the dynamic
model. We found no difference in health expenditure between tax-based
and insurance based health financing mechanisms. The study also
confirms the existence of fungibility, where external aid for health reduces
government health spending from domestic sources. However, the
decrease is much smaller than a dollar to dollar substitution. The study also
finds that government health expenditure and out-of-pocket payments
follow different paths and that the pace of health expenditure growth is
different for countries at different levels of economic development.

III. Theoretical Framework


The concept of gross domestic product (GDP) per capita will be used in this
study. Blanchard (2010) defines GDP in three equivalent ways: (1) GDP is the value of
final goods and services produced in the economy during a given period; (2) GDP is the
sum of value added in the economy during a given period; and (3) GDP is the sum of
incomes in the economy during a given period. GDP has four components. The first is
consumption, the acquisition of goods and services by consumers. The second is
investment, the sum of nonresidential and residential investment. The third is
government expenditure, the total procurement of goods and services by the
government. The last component is net exports, the difference between exports and
imports.
GDP per capita, according to Investopedia (retrieved August 18, 2013), is the
quotient of a country's GDP and its population. A greater GDP per capita indicates
growth in the economy and means that there is more productivity. It is useful when
comparing the relative performance of one country with another.
In this research, the total expenditure on health will assume the role of GDP per
capita.

IV. Operational Framework


A. Description of Variables
Table 1: Variable List and Description

Regressand or Dependent Variable

Variable: Total expenditure on health / capita at Purchasing Power Parity (NCU per US$)
Definition: The quantitative variable that measures the quotient of a country's total
health expenditure, which consists of total government health expenditure and total
private expenditure on health, and the country's total population. This is measured
at purchasing power parity.

Regressors or Independent Variables

Variable: General Government Expenditure on Health
Definition: Total government expenditure on health. It is measured in million current
US$.

Variable: Private Expenditure on Health
Definition: Total outlays for health by households as direct payments, also called
out-of-pocket expenditure.

Variable: Population
Definition: Total number of de facto resident population, as provided by the United
Nations Population Division, from the World Health Organization.

B. A-Priori Expectations of Regressors

Table 2: A-Priori Expectations

Endogenous Variable
thecap: Total expenditure on health / capita at Purchasing Power Parity (NCU per US$)

Exogenous Variables and A-Priori Expectations

geh: General Government Expenditure on Health
General government expenditure on health is expected to have a positive relationship
with total health expenditure per capita. This is because an increase in general
government expenditure on health also increases the total expenditure on health of a
country. A larger total expenditure, divided by the total population, results in a
rise in total expenditure per capita.

peh: Private Expenditure on Health
Out-of-pocket expenditure on health is expected to have a positive effect on total
health expenditure per capita. This is because an increase in private expenditure on
health also increases the total expenditure on health of a country. A larger total
expenditure, divided by the total population, results in a rise in total expenditure
per capita.

pop: Population
Population is expected to have a negative relationship with total health expenditure
per capita. As population increases, total health expenditure per capita will
decrease, because the expenditure is divided among more residents or citizens of the
country.

C. Introduction to Hypothesized Econometric Model


Based on the economic theories that were discussed in the preceding chapters,
the hypothesized econometric model is developed below. The model was transformed
into a log-log model. This was done to standardize the units and to make the
model less susceptible to data bias.
Model for Estimation:
$$\ln(thecap_i) = \beta_1 + \beta_2 \ln(geh_i) + \beta_3 \ln(peh_i) + \beta_4 \ln(pop_i) + u_i$$
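The expected signs in Table 2 can be motivated by a short derivation. As an illustration only, assume that total health expenditure per capita is exactly total health expenditure divided by population. The symbol THE_i for total health expenditure is introduced here and does not appear in the original model.

```latex
\begin{aligned}
thecap_i &= \frac{THE_i}{pop_i} \\
\ln(thecap_i) &= \ln(THE_i) - \ln(pop_i)
\end{aligned}
```

Under this assumption, holding total expenditure fixed, the log of population enters with a coefficient of -1, and anything that raises total expenditure raises the log of thecap. The estimated coefficient on population need not equal -1, because thecap is measured at purchasing power parity while geh and peh are measured in current US$.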

V. Methodology
A. Data
The data used in this research are from the World Health Organization's (WHO)
Global Health Expenditure Database. This database supplies internationally comparable
numbers on national health expenditures. WHO updates the data annually from publicly
available reports such as national health accounts reports, reports from national
statistics offices and central banks, public expenditure information from the World
Bank and the International Monetary Fund, and the like.
The data taken for this empirical analysis are the values of total health
expenditure per capita, general government expenditure on health, out-of-pocket
expenditure, maternal mortality rate and population of 33 Asian countries for the year
2008. The data are therefore cross-sectional.
Table 3: Data
Country                                   thecap        geh        peh          pop
Afghanistan                                   30         64        837       29,840
Armenia                                      230        196        244        3,079
Azerbaijan                                   373        403      1,734        8,944
Bangladesh                                    19      1,003      1,812      145,478
Bhutan                                       246         57        ...          701
Cambodia                                     111        105      2,579       13,823
China                                        285    104,486    104,705    1,335,720
Georgia                                      440        228        923        4,394
India                                        112     13,383     37,468    1,190,864
Indonesia                                    110      5,827      8,682      234,951
Iran (Islamic Republic of)                   754      8,840     13,742       72,289
Israel                                     1,971      9,582      5,300        7,309
Japan                                      2,878    335,561     79,834      127,692
Jordan                                       479      1,193        735        5,849
Kazakhstan                                   440      3,019      2,145       15,655
Kuwait                                     1,052      2,228        619        2,548
Kyrgyzstan                                   137        161        151        5,204
Lao People's Democratic Republic              90         53        169        6,022
Lebanon                                      886        915      1,312        4,167
Malaysia                                     532      4,651      3,775       27,502
Maldives                                     635        104         43          308
Mongolia                                     225        189        138        2,667
Nepal                                         62        264        392       28,905
Oman                                         618        966        280        2,637
Pakistan                                      84      1,263      3,581      167,442
Philippines                                  142      2,171      4,559       90,173
Qatar                                      1,472      1,815        346        1,396
Republic of Korea                          1,723     33,650     26,496       48,949
Russian Federation                         1,034     56,746     28,648      143,163
Singapore                                  2,378      2,184      5,783        4,772
Sri Lanka                                    160        686        759       20,474
Thailand                                     318      8,236      2,579       68,268
Turkey                                     1,034     56,746     11,971      143,163


B. Empirical Procedures
To analyze the hypothesized econometric model, it will be tested for
overall significance. It will undergo the process of estimation and inference. For
estimation, a regression analysis will be run on the model. This is to examine the
statistical dependence of the dependent variable on one or more explanatory
variables. For inference, a level of significance α = 0.05, equivalent to a confidence
level of 95%, is used to assess the values that will be generated. This will help in
determining whether the hypothesized econometric model is significant.
The software Gretl is used to run the multiple regression analysis for the
estimation and inference. The estimates acquired are expected to have properties such
as sufficiency, unbiasedness, consistency and efficiency. To know whether the estimates
meet these properties, tests will be conducted to detect multicollinearity,
heteroscedasticity and misspecification. If these problems arise, remedies will be
applied to correct them.
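The estimation step can be sketched in plain Python. This is a minimal illustration of ordinary least squares on log-transformed data, not the Gretl procedure used in the paper. The helper names `solve` and `ols_loglog` are hypothetical, and the eight rows are an illustrative subset of Table 3, so the coefficients will differ from the full-sample results.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols_loglog(y, regressors):
    """OLS of ln(y) on a constant and the logs of each regressor column."""
    ly = [math.log(v) for v in y]
    rows = [[1.0] + [math.log(v) for v in obs] for obs in zip(*regressors)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, ly)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [t - sum(b * x for b, x in zip(beta, r)) for r, t in zip(rows, ly)]
    return beta, resid, rows

# Illustrative subset of Table 3: Afghanistan, Armenia, Azerbaijan, Georgia,
# India, Japan, Jordan, Malaysia (not the full 33-country sample).
thecap = [30, 230, 373, 440, 112, 2878, 479, 532]
geh = [64, 196, 403, 228, 13383, 335561, 1193, 4651]
peh = [837, 244, 1734, 923, 37468, 79834, 735, 3775]
pop = [29840, 3079, 8944, 4394, 1190864, 127692, 5849, 27502]

beta, resid, rows = ols_loglog(thecap, [geh, peh, pop])
```

Here `beta` holds the estimates in the order constant, l_geh, l_peh, l_pop. A useful sanity check on any OLS fit is that the residuals are orthogonal to every regressor column, which follows from the normal equations.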


VI. Empirical Testing and Interpretation of Results


A. Summary of Data
Table 4: Summary Statistics
Variable          Mean     Median    Minimum    Maximum
l_thecap        5.8192     5.9225     2.9628     7.9674
l_geh           7.4097     7.1415     3.9788     12.724
l_peh           7.4737     7.5023     2.2180     11.559
l_pop           9.8367     9.6586     5.7289     14.105

Variable     Std. Dev.       C.V.   Skewness   Kurtosis
l_thecap        1.2473    0.21434   -0.26752   -0.53123
l_geh           2.3076    0.31144    0.40922   -0.57362
l_peh           2.1386    0.28615   -0.15258   -0.16750
l_pop           2.0281    0.20617    0.19814   -0.53410


The table above presents the summary statistics of the data used. The summary is
computed on the log form of each variable. The table shows the moments of each
variable. The first moment is the measure of central tendency, the mean; the median
and the minimum and maximum values are reported alongside it. The second moment is
captured by the standard deviation, which, like the variance, measures how dispersed
the data are around the mean. The third moment is skewness, the measure of symmetry.
The fourth moment is kurtosis, which measures the tail density or peakedness of the
data.
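The statistics in Table 4 can be illustrated with a short Python sketch. The moment-based formulas for skewness and excess kurtosis are an assumption about what was computed (the negative kurtosis values in Table 4 suggest excess kurtosis), and Gretl's exact formulas may differ. The function name `summary` is hypothetical.

```python
import math

def summary(values):
    """Mean, sample std. dev., coefficient of variation, skewness, excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d ** 2 for d in dev) / n  # second central moment
    m3 = sum(d ** 3 for d in dev) / n  # third central moment
    m4 = sum(d ** 4 for d in dev) / n  # fourth central moment
    sd = math.sqrt(sum(d ** 2 for d in dev) / (n - 1))
    return {
        "mean": mean,
        "sd": sd,
        "cv": sd / abs(mean),
        "skewness": m3 / m2 ** 1.5,
        "ex_kurtosis": m4 / m2 ** 2 - 3.0,
    }

# Logs of a few thecap values from Table 3, for illustration only.
example = summary([math.log(v) for v in (30, 230, 373, 440, 112)])

# Consistency check on Table 4: C.V. = Std. Dev. / Mean, here for l_peh.
cv_l_peh = 2.1386 / 7.4737
```

The last line reproduces the reported C.V. of l_peh (0.28615) from its reported standard deviation and mean, which is a quick way to cross-check the entries of the table against each other.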


B. Initial Regression
Table 5: Initial Regression
Variable    Coefficient    Standard Error    t-Ratio    p-value
const           7.74626          0.254278      30.46    1.43e-23   ***
l_geh          0.482429         0.0431294      11.19    4.90e-12   ***
l_peh          0.355901         0.0627896      5.668    3.97e-06   ***
l_pop         -0.829708         0.0447623     -18.54    1.28e-17   ***

Mean dependent var     5.819173    S.D. dependent var    1.247274
Sum squared resid      2.253002    S.E. of regression    0.278729
R-squared              0.954743    Adjusted R-squared    0.950061
F(3, 29)               203.9274    P-value(F)            1.39e-19
Log-likelihood        -2.534944    Akaike criterion      13.06989
Schwarz criterion      19.05592    Hannan-Quinn          15.08400

Log-likelihood for thecap = -194.568



Substituting the generated estimates into the hypothesized econometric model gives
the following sample regression:

$$\ln(thecap_i) = 7.74626 + 0.482429 \ln(geh_i) + 0.355901 \ln(peh_i) - 0.829708 \ln(pop_i) + \hat{u}_i$$


The results above will be examined at the level of significance mentioned in the
preceding part. Given the level of significance α = 0.05, an estimate whose p-value is
less than 0.05 is statistically significant and the null hypothesis that its
coefficient is zero is rejected. Conversely, when the p-value of the estimate is
greater than 0.05, the estimate is insignificant and there is no strong evidence to
reject the null hypothesis.
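The same decision can be reached from the t-ratios, sketched here in Python. The coefficients and standard errors are the values reported in Table 5; the negative sign on l_pop follows the reported t-ratio of -18.54. The critical value 2.045 is the standard tabulated two-sided 5% value of Student's t with 29 degrees of freedom (33 observations minus 4 parameters), and the names used are illustrative.

```python
# Coefficient and standard error of each estimate, as reported in Table 5.
estimates = {
    "const": (7.74626, 0.254278),
    "l_geh": (0.482429, 0.0431294),
    "l_peh": (0.355901, 0.0627896),
    "l_pop": (-0.829708, 0.0447623),
}
T_CRITICAL = 2.045  # two-sided 5% critical value, Student's t with 29 df

def t_ratio(coef, se):
    """t-ratio for the null hypothesis that the coefficient is zero."""
    return coef / se

t_ratios = {name: t_ratio(c, s) for name, (c, s) in estimates.items()}
# An estimate is significant at the 5% level when |t| exceeds the critical value.
significant = {name: abs(t) > T_CRITICAL for name, t in t_ratios.items()}
```

Comparing |t| with the critical value is equivalent to comparing the p-value with 0.05, so both routes lead to the same conclusion for every coefficient.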
To interpret the estimates, their statistical significance is discussed first. The
intercept of the model has a positive value of 7.74626. Because the model is in log-log
form, this means that when the logarithms of the independent variables are all 0, the
logarithm of total expenditure per capita is equal to 7.74626. Given that its p-value
is less than 0.05 or 5%, it can be said that it is statistically significant.
The general government expenditure on health (geh) and the private expenditure
on health (peh) are significant at the 5% level. Their p-values are 4.90e-12 and
3.97e-06, respectively. The regression also shows that both variables have positive
coefficients, which means they have a positive relationship with the total expenditure
on health per capita.
Population's p-value is less than 0.05, so it can be inferred that it is statistically
significant and that there is strong evidence against the null hypothesis that its
coefficient is 0; hence the null hypothesis is rejected. Since population has a
negative coefficient, as population increases there will be a decrease in total health
expenditure per capita. In other words, there is a negative relationship in which a 1%
increase in population is associated with a decrease in total health expenditure per
capita of about 0.829708%, holding the other variables constant.
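Because the model is log-log, the slope estimates are elasticities. The following Python sketch shows how a percentage change in one regressor translates into a percentage change in thecap, holding the others fixed. The slopes are the Table 5 estimates (with the negative sign on population discussed above); the function name is hypothetical.

```python
# Slope estimates from Table 5; in a log-log model they are elasticities.
ELASTICITY = {"geh": 0.482429, "peh": 0.355901, "pop": -0.829708}

def pct_change_in_thecap(variable, pct_change):
    """Exact percentage change in thecap implied by a percentage change in one regressor."""
    factor = (1.0 + pct_change / 100.0) ** ELASTICITY[variable]
    return (factor - 1.0) * 100.0

# A 1% rise in population: close to the elasticity itself, about -0.82%.
pop_effect = pct_change_in_thecap("pop", 1.0)
# A 10% rise in government health expenditure: somewhat less than 10 * 0.482429.
geh_effect = pct_change_in_thecap("geh", 10.0)
```

For small changes the elasticity can be read directly as "percent per percent". For larger changes the exact formula above is the safer reading, since the approximation drifts.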
To measure the overall fit of the chosen model to the given data, R^2 must be
analyzed. R^2 is a value that lies between 0 and 1. If it is equal to 1, the fitted
regression line explains 100% of the variation in the dependent variable, and the
closer R^2 is to 1, the better the fit of the model (Gujarati & Porter, 2009).
From the regression analysis, the R^2 that was generated was 0.954743. This means
that 95.4743% of the variation in the dependent variable is explained by the model.
The adjusted R^2, on the other hand, is 0.950061.
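Both figures can be recovered from other quantities in the regression output, which makes a useful cross-check. The Python sketch below uses the reported sum of squared residuals and the standard deviation of the dependent variable; the variable names are illustrative.

```python
RSS = 2.253002   # sum of squared residuals, from the regression output
SD_Y = 1.247274  # standard deviation of the dependent variable
N, K = 33, 4     # observations; estimated parameters including the constant

# Total sum of squares recovered from the sample standard deviation.
tss = SD_Y ** 2 * (N - 1)
r_squared = 1.0 - RSS / tss
# Adjusted R^2 penalizes the number of estimated parameters.
adj_r_squared = 1.0 - (1.0 - r_squared) * (N - 1) / (N - K)
```

With these inputs, `r_squared` and `adj_r_squared` agree with the reported 0.954743 and 0.950061 up to rounding.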
C. Overall Test of Significance
Because a multiple regression analysis is being done, the null hypothesis is a
joint hypothesis. The overall test of significance will be used to test this
hypothesis. It examines whether the dependent variable is linearly related to the
independent variables. Analysis of Variance (ANOVA), also called the F-test, can be
used for this. It is the analysis of the Total Sum of Squares (TSS), which is composed
of the Explained Sum of Squares (ESS) and the Residual Sum of Squares (RSS).
The null hypothesis for this model is that all the coefficients of the independent
variables are 0, while the alternative hypothesis is that not all of these
coefficients are 0. The null hypothesis will be rejected if the p-value of the
F-statistic is less than the level of significance. The ANOVA or F-table was generated
from Gretl and is shown below:
Table 6: Analysis of Variance
               Sum of squares    df    Mean square
Regression            47.5292     3        15.8431
Residual                2.253    29      0.0776897
Total                 49.7822    32        1.55569


R^2 = 47.5292 / 49.7822 = 0.954743


F(3, 29) = 15.8431 / 0.0776897 = 203.927 [p-value 1.39e-19]

From the results, the p-value is 1.39e-19, which is less than the level of
significance; therefore the model passes the test for overall significance.
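The F-statistic can be reproduced by hand in two equivalent ways, sketched below in Python. The sums of squares and degrees of freedom are those used in the computation above (3 regressors, 33 - 4 = 29 residual degrees of freedom); the function name is illustrative.

```python
ESS, RSS = 47.5292, 2.253  # regression and residual sums of squares
DF_REG, DF_RES = 3, 29     # regression and residual degrees of freedom

def f_statistic(ess, rss, df_reg, df_res):
    """Ratio of the regression mean square to the residual mean square."""
    return (ess / df_reg) / (rss / df_res)

f_from_anova = f_statistic(ESS, RSS, DF_REG, DF_RES)

# Equivalent form in terms of R^2.
R2 = 0.954743
f_from_r2 = (R2 / DF_REG) / ((1.0 - R2) / DF_RES)
```

Both routes give a value of roughly 203.9, matching the reported F(3, 29). The second form makes explicit that testing overall significance is the same as testing whether R^2 is zero.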
D. Test for Multicollinearity
Ragnar Frisch coined the term multicollinearity in 1934. It indicates the condition
where there is either an exact or a nearly exact linear relationship among the X
variables. It violates the classical linear regression model assumption that there
should be no multicollinearity among the independent variables. There are two types
of multicollinearity. The first is perfect multicollinearity, in which the X'X matrix
is singular and regression is not possible. The second is high but imperfect
multicollinearity, in which the variables are highly correlated with each other; this
is dangerous for the model. Despite the violation, OLS is still BLUE. However, several
consequences may arise: a coefficient may wrongly be judged insignificant because of
its t-ratio, confidence intervals will be wide, R^2 will be very high, and the OLS
estimators and their standard errors will be sensitive to changes in the data
(Gujarati & Porter, 2009).
One way to test for multicollinearity is to compute the Variance Inflation Factor
(VIF). It reflects the speed with which variances and covariances increase, and it
indicates how the presence of multicollinearity inflates the variance of an estimator.
The value of VIF should be less than or equal to 10, because a variable with a VIF
greater than 10 is considered highly collinear. Possible responses to multicollinearity
include doing nothing, transforming the variables into logarithms, removing the culprit
variable, or using panel data.
The VIF for this model was generated using Gretl and the results are as follows:
Table 7: Variance Inflation Factors
Minimum possible value = 1.0
Values > 10.0 may indicate a collinearity problem

l_geh    4.080
l_peh    7.427
l_pop    3.395

VIF(j) = 1/(1 - R(j)^2), where R(j) is the multiple correlation coefficient between
variable j and the other independent variables

Properties of matrix X'X:

1-norm = 8696.0278
Determinant = 8121844.9
Reciprocal condition number = 0.00011235508

All the exogenous variables have a VIF of less than 10, which indicates tolerable
multicollinearity. The logarithm of private expenditure on health has the highest VIF,
but since it is still below 10 it is not expected to cause a problem.
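The VIF formula can be inverted to see how strongly each regressor is explained by the others. The Python sketch below uses the VIF values reported by Gretl; the function names are illustrative.

```python
VIF = {"l_geh": 4.080, "l_peh": 7.427, "l_pop": 3.395}  # values reported by Gretl
THRESHOLD = 10.0  # rule-of-thumb cutoff used in the text

def implied_r_squared(vif):
    """Invert VIF = 1 / (1 - R^2) to get the auxiliary-regression R^2."""
    return 1.0 - 1.0 / vif

def vif_from_r_squared(r2):
    """VIF implied by the R^2 of regressing one regressor on the others."""
    return 1.0 / (1.0 - r2)

aux_r2 = {name: implied_r_squared(v) for name, v in VIF.items()}
flagged = [name for name, v in VIF.items() if v > THRESHOLD]
```

A VIF of 10 corresponds to an auxiliary R^2 of 0.9. The highest VIF here, for l_peh, implies that about 87% of its variation is explained by the other two regressors, which is below that cutoff, so `flagged` is empty.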
E. Test for Heteroscedasticity


If the classical linear regression model assumption that the disturbances u_i all
have the same variance σ^2 is not satisfied, then there is heteroscedasticity. The
unbiasedness and consistency of the OLS estimators are not destroyed, but the
estimators no longer have minimum variance, that is, they are no longer efficient;
therefore OLS is not BLUE. If heteroscedasticity exists, the variances of the OLS
estimators are not given by the usual OLS formulas, so the t and F tests based on them
can be misleading and result in faulty conclusions. There are two ways to detect
heteroscedasticity: the informal graphical method, and formal tests such as the Park
test, the Glejser test, Spearman's rank correlation test, the Goldfeld-Quandt test,
the Breusch-Pagan-Godfrey test and White's general heteroscedasticity test
(Gujarati & Porter, 2009).
Both the informal and formal methods will be shown with the use of Gretl.
Figure 1: Scattergram of estimated residuals plotted against the variables


Heteroscedasticity can be seen from a graph if a pattern exists. In the graphs
above, there is no systematic pattern in the residuals, which suggests that the model
is not heteroscedastic. However, graphs are too subjective to settle whether the model
is truly heteroscedastic. So the formal test, White's General Heteroscedasticity Test,
was conducted using Gretl, and the results are below:


Table 8: White's Test for Heteroscedasticity

OLS, using observations 1-33
Dependent variable: uhat^2

             coefficient    std. error    t-ratio    p-value
const        0.374249       0.808220      0.4631     0.6477
l_geh        0.0294932      0.182433      0.1617     0.8730
l_peh        0.359006       0.261361      1.374      0.1828
l_pop        0.344832       0.236222      1.460      0.1579
sq_l_geh     0.00935994     0.0147458     0.6348     0.5319
X2_X3        0.0263258      0.0359429     0.7324     0.4713
X2_X4        0.0118747      0.0259169     0.4582     0.6511
sq_l_peh     0.0537565      0.0297534     1.807      0.0839
X3_X4        0.106507       0.0478626     2.225      0.0362
sq_l_pop     0.0564436      0.0234462     2.407      0.0245

Unadjusted R-squared = 0.331738

Test statistic: TR^2 = 10.947346,
with p-value = P(Chi-square(9) > 10.947346) = 0.279335

White's General Heteroscedasticity Test does not depend on the normality
assumption and is easy to implement. Its null hypothesis is homoscedasticity and its
alternative hypothesis is heteroscedasticity. From the results above, the p-value is
0.279335, which is greater than 0.05; this means that there is not enough evidence to
reject the null hypothesis. The model can therefore be treated as homoscedastic,
meaning that the variances of the residuals are constant, consistent with the OLS
assumption.
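The test statistic is simply the number of observations times the R-squared of the auxiliary regression. The Python sketch below reproduces it and compares it with 16.919, the standard tabulated 5% critical value of the chi-square distribution with 9 degrees of freedom (one per auxiliary regressor: three levels, three squares and three cross products). The variable names are illustrative.

```python
N_OBS = 33
AUX_R_SQUARED = 0.331738     # unadjusted R-squared of the auxiliary regression
CHI2_CRIT_5PCT_DF9 = 16.919  # 5% critical value, chi-square with 9 df

# White's LM statistic: T * R^2 from the regression of uhat^2 on the
# regressors, their squares and their cross products.
lm_statistic = N_OBS * AUX_R_SQUARED
reject_homoscedasticity = lm_statistic > CHI2_CRIT_5PCT_DF9
```

The statistic falls well below the critical value, so the null hypothesis of homoscedasticity is not rejected, in line with the p-value of 0.279335.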
F. Test for Mis-specification
Model specification error or bias arises when the classical linear regression
model assumption that the regression model used in the analysis is correctly
specified is violated. There are several types of misspecification errors, but the
three most important are omitted variable bias, irrelevant variable bias and incorrect
functional form (Gujarati & Porter, 2009). Omitted variable bias results from
underfitting a model by excluding a relevant variable. OLS becomes biased and
inconsistent, which leads to misleading and questionable interpretations of the
statistical significance of the estimates and of the confidence intervals. Overfitting
a model by including an irrelevant variable causes irrelevant variable bias. The
confidence intervals remain valid and OLS remains unbiased and consistent, but the
variances of the estimates will be larger than necessary, making the estimates less
precise. Incorrect functional form, on the other hand, means that the wrong form was
chosen among alternatives such as the linear (lin-lin), semi-logarithmic or log-log
forms, so the model must be transformed.
To see whether there is any specification error or bias, the Ramsey Regression
Specification Error Test (RESET) will be conducted and the results are as follows:
Table 9: Ramsey RESET Test
RESET test for specification (squares and cubes)
Test statistic: F = 1.799166,
with p-value = P(F(2, 27) > 1.79917) = 0.186

RESET test for specification (squares only)
Test statistic: F = 3.350097,
with p-value = P(F(1, 28) > 3.3501) = 0.0779

RESET test for specification (cubes only)
Test statistic: F = 3.174143,
with p-value = P(F(1, 28) > 3.17414) = 0.0857

The null hypothesis of this test is that the model is correctly specified, and the
alternative hypothesis is that there is misspecification error or bias in the model.
All the p-values are greater than 0.05, so there is not enough evidence to reject the
null hypothesis. It can therefore be concluded that the model does not have
misspecification error or bias.
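For reference, the auxiliary regression behind the squares-and-cubes variant can be written as follows. This is the standard textbook form of RESET applied to the log-log model of this paper. The coefficients γ_1 and γ_2, the error term v_i and the subscripted R^2 terms are notation introduced here and do not come from the Gretl output; ŷ_i denotes the fitted value of ln(thecap_i) from the original regression.

```latex
\ln(thecap_i) = \beta_1 + \beta_2 \ln(geh_i) + \beta_3 \ln(peh_i) + \beta_4 \ln(pop_i)
  + \gamma_1 \hat{y}_i^{2} + \gamma_2 \hat{y}_i^{3} + v_i

H_0: \gamma_1 = \gamma_2 = 0, \qquad
F = \frac{(R^2_{new} - R^2_{old}) / 2}{(1 - R^2_{new}) / (33 - 6)} \sim F(2, 27)
```

The degrees of freedom match the reported F(2, 27): two added terms in the numerator, and 33 observations minus 6 parameters in the denominator. The squares-only and cubes-only variants add a single term each, which gives F(1, 28).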


VII. Conclusion
This paper aimed to determine what causes total health care expenditure per
capita to grow or to decrease. An empirical procedure was carried out to test whether
general government expenditure on health, private expenditure on health and
population affect total health care expenditure per capita. Based on the findings, it
can be deduced that the hypothesized econometric model is valid.
For further studies, additional variables such as the prevalence of diseases,
mortality rates or the percentage of health care services delivered should be added to
better explain total health care expenditure per capita.

References:
Global Health Expenditure Database. (n.d.). World Health Organization. Retrieved
August 31, 2013, from apps.who.int/nha/database/DataExplorerRegime.aspx
Gujarati, D., & Porter, D. (2009). Basic Econometrics (5th ed.). Singapore: McGraw-Hill.
Health financing for universal coverage. (n.d.). World Health Organization. Retrieved
August 18, 2013, from www.who.int/health_financing/documents/covreport_e_11-deter-he/en/
Per Capita GDP Definition | Investopedia. (n.d.). Investopedia - Educating the world
about finance. Retrieved August 18, 2013, from
http://www.investopedia.com/terms/p/per-capita-gdp.asp
