KEY DETERMINANTS OF PER CAPITA INCOME GROWTH
CRP 5450 Final Term Paper

Tony H. Widjarnarso
Thw53@cornell.edu

Table of Contents
1. Introduction
2. Background and Motivation
3. Basic Data and Model Presentation
4. Initial Model & Diagnostics
5. Regression Results & Interpretation
6. Summary & Conclusion
Bibliography
Appendices
a. Data source & calculation methods
b. Diagnostics test figures and diagrams
c. Alternative regression models

Table of Figures
Figure 1 Initial regression residuals vs fitted values plot
Figure 2 Final regression residuals vs fitted values plot
Figure 3 Dependent - explanatory variables scatterplot (1)
Figure 4 Dependent - explanatory variables scatterplot (2)
Figure 5 Initial regression - Breusch-Pagan / Cook-Weisberg test for heteroscedasticity
Figure 6 Final regression - Breusch-Pagan / Cook-Weisberg test for heteroscedasticity
Figure 7 Initial model Ramsey reset test
Figure 8 Final model Ramsey reset test

Table 1 Basic regression model, and explanatory variables definition
Table 2 Table of summary statistics
Table 3 Variables matrix of correlation
Table 4 Initial multiple regression result
Table 5 Final regression model
Table 6 Variance Inflation Factor (VIF) test for multicollinearity (left initial regression, right final regression)
Table 7 Alternative regression 1 - dataset excludes DC
Table 8 Alternative regression 2 - log per capita income as dependent variable
Table 9 Alternative regression 3 - education, healthcare & social services as included industry sector
Table 10 Initial regression matrix of correlation
Table 11 Initial regression variables list

1. Introduction
Per capita income is often used by policymakers and the public as an overall index of well-being
or standard of living in an economy, as it gives a per-person measure of the income earned
or disbursed to individuals in the economy (Berger 1997). In his study, Berger also briefly
explained that, in its most basic form, the factors that influence per capita income are essentially
those that raise or lower the earnings of the labor force. These may include levels of education,
employment rates, and the types of employment.
While Berger's study of the key factors that determine the State of Kentucky's per capita
income was insightful, it was conducted nearly two decades ago, when data were scarcer and
statistical methods were less advanced. In addition, considering the amount of time that has
passed since Berger's study, it may be expected that many of the variables included in Berger's
report have changed dramatically, altering the magnitude of influence that previously studied
variables may have on per capita personal income (PCPI). An example of this change in significance
can be found in a similar study conducted by Connaughton and Madsen in 2004, which covered the
period 1950-2000. The study revealed that the important variables in explaining real PCPI between
states appear to be the percentage of the population that lives in urban areas, the percentage of
the population with a four-year college degree, and the percentage of the population employed in
the service sector (Connaughton and Madsen 2004).
Building on the work of previous researchers, the purpose of this paper is to explore the key
factors that explain per capita personal income growth among US states between 2000 and 2010.

2. Background and Motivation


Berger's econometric estimates in 1997 showed a positive correlation between the State of
Kentucky's levels of education and its per capita income (Berger 1997). This conclusion is
consistent with the general consensus that higher education leads to increased productivity
(Paulsen 1996), although other research notes that the link between growth and education
may vary with the level of pre-existing economic development (Petrakis and
Stamatakis 2001). The increase in per capita income may also be attributed to other non-economic
benefits that recipients of higher education get (Psacharopoulos 2006), as noted by
Moretti (Moretti 2004). Given the conclusions of these previous studies, we can expect to see a
positive correlation between educational attainment and per capita income.
From the standpoint of types of employment, Berger used private and public employment
to account for differences among states by employment type. A report by the Small Business
Administration (SBA) released in February 2013, however, stated that
private sector employment contributes 109,061,000 jobs to the economy, compared to just
22,265,000 jobs in the public sector, which accounts for only around 17 percent of total
employment in the US economy (Small Business Administration (SBA) Office of Advocacy
2013). Given the significantly lower level of employment in the public sector compared to the
private sector, it is reasonable to think that private sector employment and employers might
provide a more representative variable in estimating the factors that determine the economic
growth of US states.
Further examination of the US private sector shows the importance of small and medium-sized
businesses to the private sector and to the US economy as a whole. In the non-agricultural
business sector alone, small and medium-sized businesses with fewer than 500
employees account for 50.7% of GDP in the United States (Leung and Rispoli 2011). In the retail
sector at least, it has been documented that small business employment is more likely to generate
higher income for the region than employment at larger businesses (Neumark, Zhang and Ciccarela
2008). It is to be expected, then, that in most sectors of the US economy the generation of
employment and income is dominated by small businesses rather than big businesses, as can be seen in
agriculture (85.1% of industry employment), other services (85.8% of industry employment), and
construction (83.3% of industry employment) (SUSB - (Small Business Administration (SBA)
Office of Advocacy 2015)). Inferring from these data, we can expect to see a positive correlation
between small business growth and per capita income.
Several notable sectors that are not dominated by small businesses are
finance & insurance and manufacturing. The manufacturing sector in particular was still in large
part dominated by larger firms in 2012 (employment by large businesses accounted for 54.5% of
total industry employment) (SUSB - (Small Business Administration (SBA) Office of Advocacy
2015)), although several studies have also noted that, despite a long-standing decline in the
share of total employment attributable to manufacturing, the sector has managed to keep its
share of the economy, making the US a very large manufacturing country, second only to China
(Baily and Bosworth 2014). This might imply that manufacturing still makes up a significant portion
of US economic growth and a fair share of US states' per capita income, and that we can still
expect to see a positive correlation between manufacturing employment and per capita income
growth. It should be noted, however, that this hypothesis is still inconclusive, because
the sector is still rapidly declining (Pierce and Schott 2012).
Both Berger (Berger 1997) and Connaughton (Connaughton and Madsen 2004)
essentially used the percentage of the population living in rural areas in their models to
represent the number of workers. In both studies this variable showed a
significant negative correlation with per capita income, which is in line with the prevailing
conventional wisdom. This common notion is not unanimously agreed upon, however: Jones
documented arguments pointing to the large concentration
of poverty in urban areas (Jones and Kone 1996), and urbanization
does not necessarily always lead to growth (Fay and Opal 1999). While this argument is certainly
worth noting, Jones's regression model also arrived at the same conclusion as the
two aforementioned studies. Therefore, we will also hypothesize this variable to be negatively
correlated with per capita income.
Much of the previous work exploring the factors that determine per capita income has focused on
the part of the population that is actively working, which shows a consistent positive correlation
with per capita income; slower growth in the labor force has even been found to slow the
overall income growth rate of a region (Congressional Budget Office 2009). Retirees, on the other
hand, are a distinct part of the population that is no longer actively working and earning
additional income for the state, but that is still counted in the per capita income
calculation. Given that few studies have explored the relationship between
the retiree population and per capita income, that the aging trend is expected to continue
in developed countries such as the US (Zweifel, Felder and Meiers 1999), and that there are rising
concerns regarding "the major factor [that is] weighing down the long-term finances of state and
local governments [which] is the obligation to fund retiree benefits" (Lutz and Sheiner 2014), this
study will use retiree growth instead of the labor participation variables that are more
commonly used in per capita income estimation. From these considerations, this variable can be
expected to have a negative correlation with per capita income growth in the model proposed
in this paper.

3. Basic Data and Model Presentation


A multiple regression model was formulated to determine the relative impact of each selected
explanatory variable on the dependent variable. The multiple regression model and the
definition of each variable are presented below (see
Appendices for detailed information on the data):
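$$\text{PcpIG} = \beta_1\,\text{PcpI2000} + \beta_2\,\text{Sbe2010} + \beta_3\,\text{ManE2000} + \beta_4\,\text{ManEG} + \beta_5\,\text{BdM2000} + \beta_6\,\text{BdMG} + \beta_7\,\text{RG} + \beta_8\,\text{Rur2000} + \varepsilon$$
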
Variable | Description | Expected sign
PcpIG | % per capita income growth (year 2000-2010) |
PcpI2000 | Per capita income level (year 2000) |
Sbe2010 | Small business employment level (year 2010) as a percentage of total employment |
ManE2000 | % employment level in manufacturing (year 2000) |
ManEG | % growth of employment in the manufacturing sector (year 2000-2010) |
BdM2000 | Population above 25 years old with a bachelor's or associate degree or more, as a percentage of total population (year 2000) |
BdMG | % growth of population above 25 years old with a bachelor's or associate degree or more (year 2000-2010) |
RG | % growth of population above 65 years old (retirees) (year 2000-2010) |
Rur2000 | Population living in rural areas (year 2000), as a percentage of total population |

Table 1 Basic regression model, and explanatory variables definition
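
The growth variables in Table 1 (PcpIG, ManEG, BdMG, RG) can be illustrated with a minimal sketch, assuming each is a simple percentage change between the 2000 and 2010 levels. The exact calculation is documented in the appendices and may differ; the function name and the sample figures below are hypothetical and are not taken from the dataset.

```python
def percent_growth(level_2000: float, level_2010: float) -> float:
    """Percentage change from the 2000 level to the 2010 level."""
    if level_2000 == 0:
        raise ValueError("base-year level must be non-zero")
    return (level_2010 - level_2000) / level_2000 * 100.0


# Hypothetical per capita income levels for one state (not from the dataset).
example_growth = percent_growth(30_000.0, 39_000.0)
```

Under this assumption a state whose level rises from 30,000 to 39,000 would enter the regression with a growth value of about 30.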

All variables are collected at the state level for the years 2000 and 2010. Summary statistics and
the correlation matrix for the variables included in the linear regression model are listed below:

Table 2 Table of summary statistics

Matrix of correlation

Table 3 Variables matrix of correlation

The correlation matrix in Table 3 shows that the majority of the explanatory variables to be
included in the model are only modestly correlated with the other explanatory variables, but a
few of them show a significant correlation with other explanatory variables.
Unaddressed, these correlations may eventually lead to multicollinearity in the final multiple
regression model, giving the affected variables high standard errors and unreliable t-statistics.
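
A screening step of this kind can be sketched as follows. This is only an illustration: the 0.8 cut-off is an assumed rule of thumb rather than a threshold used in this paper, the helper names are hypothetical, and the toy columns are made-up numbers, not the state data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for pairs with |r| at or above threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical toy columns, not the paper's data.
toy = {
    "bdm2000": [20.0, 25.0, 30.0, 35.0, 40.0],
    "bdmg": [30.0, 26.0, 21.0, 15.0, 10.0],
    "rg": [12.0, 9.0, 14.0, 8.0, 13.0],
}
pairs = flag_correlated_pairs(toy)
```

In the toy data only the first two columns move together strongly enough to be flagged, which is the kind of pair that would then be examined with a VIF test.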
A second concern regarding the data is that, since the purpose of this
study is to explore the impact of various variables on per capita income growth for US states
from 2000 to 2010, all of the level data used in the model should ideally be based
on the period's base year (2000). Yet, because data on small business
employment were unavailable for the year 2000, the data for 2010 were
used instead as a proxy for the small business employment variable (sbe2010).
In a state-level regression, the number of observations is constrained to the total
number of states (51 including DC). The small sample size might cause the regression to be
biased, or make problems in the model difficult to diagnose, as there would not be enough data
points in the residuals vs fitted plots, or in any other scatter plot, to make a pattern obvious.
Lastly, in calculating the growth between 2000 and 2010, it would be ideal to use the
same data series and format (Decennial Census series, or American Community Survey series).
This, however, was not possible, as some of the surveys that use the same format appear to
have been discontinued in 2004. In response, the data for this paper had to be
constructed from different census series: the data for the year 2000 for the majority of variables
were obtained from the Decennial Census series, and the data for the year 2010 were obtained from
either the American Community Survey (ACS) series or the Annual Survey of Manufactures
(ASM). This might lead to slight inconsistencies in the final dataset, as the methods of obtaining
the data and the format in which the data were compiled in the original datasets might differ
to some extent.

4. Initial Model & Diagnostics

Table 4 Initial multiple regression result

The initial regression already showed a very high R² value and statistically
significant p-values (less than 0.05) for the majority of the variables. A scatter diagram of the
model's fitted values vs the residuals showed that the data points are
loosely distributed in a tube-like fashion around the zero line on the y axis, except for some
states such as Wyoming, Connecticut, and DC (Figure 1). This shows that the model fits the
observed data relatively well (the R² indicates that it explains about 80% of the variation). The
coefficients' signs agree with the previous theories studied in the construction of this
model, except for the variable rurg (% growth of population living in rural areas, 2000-2010),
which was expected to be statistically significant and show a negative coefficient. Given that
this study is expanding the included variables from previous work, rather than narrowing them
down, it is more likely that this problem was caused by the inclusion of irrelevant extra
variables than by omitted variable bias (OVB).
Additional regressions that were run using alternative variables and functional forms before
arriving at the initial model are shown in Table 8 and Table 9. After experimenting with various
explanatory variables, the variable rurg was removed from the initial model and replaced by
rur2000 (% of population living in rural areas in the year 2000). The change was based on the fact
that rur2000, which represents the level of population living in rural areas, has a more appropriate
functional form for the model than rurg, which represents the percent growth (change) of population
in rural areas. The change did not make the variable statistically significant in the model, but
the inclusion of this variable is still necessary to avoid OVB.
Beyond the relative fit of the model, the residuals vs fitted values plot also suggests
that the model is unlikely to have a heteroscedasticity problem (Figure 1). But considering
that heteroscedasticity may not be obvious in a scatter plot with so few data points
(51), a formal, non-graphical test was still run (Figure 5),
which returned a p-value of 0.3695 for the null hypothesis that the model has constant variance.
We therefore fail to reject the null hypothesis and find no evidence against treating the
residuals in this model as homoscedastic.
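
The logic of such a test can be sketched for the single-regressor case. This is a simplified illustration and not the procedure behind Figure 5: it uses the n·R² (studentized) variant of the Breusch-Pagan statistic with the fitted values as the only auxiliary regressor, the function names are hypothetical, and the two data series are constructed by hand rather than taken from the state dataset.

```python
from math import erfc, sqrt


def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


def breusch_pagan_nr2(x, y):
    """LM statistic n*R^2 and its chi-square(1) p-value for one regressor."""
    a, b, _ = simple_ols(x, y)
    fitted = [a + b * xi for xi in x]
    squared_resid = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    # Auxiliary regression: squared residuals on the fitted values.
    _, _, aux_r2 = simple_ols(fitted, squared_resid)
    lm = len(x) * aux_r2
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = erfc(sqrt(lm / 2.0))
    return lm, p_value


# Hypothetical deterministic data: noise of constant size vs noise growing with x.
xs = [float(i) for i in range(1, 41)]
signs = [1.0 if i % 2 == 0 else -1.0 for i in range(1, 41)]
y_constant = [2.0 + 3.0 * x + s for x, s in zip(xs, signs)]
y_growing = [2.0 + 3.0 * x + 0.5 * x * s for x, s in zip(xs, signs)]

lm_const, p_const = breusch_pagan_nr2(xs, y_constant)
lm_grow, p_grow = breusch_pagan_nr2(xs, y_growing)
```

A large p-value, as in the constant-noise series, means the null of constant variance is not rejected; the series whose noise grows with x produces a small p-value instead.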
Scatter diagrams of the model's dependent variable against each independent variable show
that the selected explanatory variables have the expected correlation with the dependent
variable, but they also reveal that states such as DC, North Dakota and Wyoming consistently
act as outliers (Figure 3 & Figure 4). There is a good possibility that these outlier states
may cause the model to be biased. But given the importance of these observations, it would not
be possible to drop these data. DC in particular shows a very distinct pattern in the data
set, having the highest concentration of population with an advanced degree (28.12%), the highest
level of small business employment (82.52%) among all of the states, and its entire
population living in an urban area. Considering its status as a
special federal district rather than a state, it may be reasonable to exclude DC from this research,
but this should be done separately from the final model, as omitting DC from the final regression
model may potentially cause OVB.
The alternative regression model that excludes DC (Table 7) was found to have a higher
R² of 0.84 and adjusted R² of 0.8066 compared to the earlier model, but it introduces several
complications in the other explanatory variables' coefficients. The most notable are the variable
sbcg (% growth of small business companies, 2000-2010), which becomes statistically significant
after the removal of DC from the dataset, and the variable bdm2000 (population above 25
years old with a bachelor's or associate degree or more, as a percentage of total population
in the year 2000), which becomes statistically insignificant after the removal of DC from the
dataset. The change in the latter variable (bdm2000) in particular shows the potential OVB that
omitting DC may cause in the final regression model.
The scatter diagrams of the dependent variable against each independent variable also
revealed that one of the included explanatory variables, sbe2010, may share a linear
relationship with pcpig (Figure 3, row 1, column 2). Previous studies argued that growth
in small businesses contributes to growth in per capita income. Yet, given the linear
relationship between sbe2010 and pcpig, it is also plausible that it is per capita
income growth that contributes to the growth in the number of small businesses. Ideally, an
instrumental variable that is representative of the value of sbe2010 (small business employment
in the year 2010 as a percentage of total employment) should be used as a replacement in the model,
but for the time being, sbe2010 will still be included in the linear regression model in this paper.
Sbcg, however, was dropped, due to its insignificant t-statistic and its possible correlation with
sbe2010, which interfered with the other variables' t-statistics.
To address the high degree of correlation between some of the explanatory
variables in the initial model (Table 10), a Variance Inflation Factor (VIF) test was run to check
the severity of the impact of multicollinearity on this model (Table 6). The highest VIF values
were found for the variables bdm2000 (7.23) and bdmg (5.79), both still well under the maximum
threshold of 10. Correlation between the two variables was expected, though, as one is defined to
be the percent growth of the other. It is generally possible to mitigate the effect of
multicollinearity by modifying the model's specification, including additional variables, or
increasing the model's sample size. However, because this model
is run at the state level, it is constrained to a fixed number of observations (51), and thus it
is not possible to mitigate the effect of multicollinearity on this model by increasing the
sample size, nor would it be necessary to do so according to the VIF test results.
A final check of the initial model's specification was Ramsey's RESET test
(Figure 7). The initial model received a p-value of 0.4447 in the test, whose null hypothesis is
that the model has no omitted variables. After these diagnostics, the model was subjected to final
iterations and rechecked using the same procedures for heteroscedasticity (Figure
6), multicollinearity (Table 6), and OVB and functional form (Figure 8), and was found to have
improved results on all three tests. The final model's residuals vs fitted values plot can be seen
in Figure 2 in the appendices.

5. Regression Results & Interpretation

Table 5 Final regression model

The final regression results presented in Table 5 show a relatively good fit for the
equation. The adjusted R² value is 0.7674, and the R² value is 0.8046. Six of
the eight variables included in the model are statistically significant at the 0.05 level.
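
The relationship between the two fit statistics can be checked with the usual adjusted R² formula. The sketch below assumes the model includes an intercept in addition to the eight explanatory variables and uses all 51 observations; the function name is hypothetical.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_regressors: int) -> float:
    """Adjusted R^2 for a model with an intercept and n_regressors slopes."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


# R^2, observation count, and variable count as reported in the text.
adj = adjusted_r_squared(0.8046, n_obs=51, n_regressors=8)
```

Under these assumptions the formula returns approximately 0.7674, in line with the adjusted R² reported above.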
The variable Pcpi2000 has a negative coefficient and is statistically significant, as was
predicted. Its coefficient of -0.0010167 indicates that,
all else being equal, a one-unit increase in Pcpi2000 is associated with a 0.0010167 decrease
in Pcpig. The confidence interval indicates that we are 95% confident that the actual value of this
variable's coefficient lies within [-0.0016956, -0.0003379].
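
A reported interval of this kind can be unpacked into its midpoint, half-width, and implied standard error. This is a sketch under stated assumptions: the interval is taken to be symmetric (coefficient plus or minus a critical value times the standard error), and the critical value of 2.018 is an assumed two-sided 95% t value for 51 - 9 = 42 degrees of freedom, not a figure taken from Table 5. The function name is hypothetical.

```python
def interval_summary(lower: float, upper: float, t_critical: float):
    """Midpoint, half-width, and implied standard error of a symmetric interval."""
    midpoint = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0
    return midpoint, half_width, half_width / t_critical


# Interval reported for Pcpi2000; the critical value 2.018 is an assumption
# (two-sided 95% t value with 51 - 9 = 42 degrees of freedom).
mid, half, implied_se = interval_summary(-0.0016956, -0.0003379, t_critical=2.018)
implied_t = -0.0010167 / implied_se
```

The midpoint recovers the reported coefficient to within rounding, and the implied t-statistic is about -3, consistent with the variable being significant at the 0.05 level.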
Sbe2010 has a positive coefficient and is statistically significant, as was predicted. The value
of the coefficient on Sbe2010 indicates that,
all else being equal, for each one-unit increase in Sbe2010, the Pcpig of the state is predicted to
increase by 0.4658011. The confidence interval shows that we are 95% confident that the real,
underlying value of the coefficient for Sbe2010 lies within [0.0736503, 0.0857952].
The variable Mane2000 is also statistically significant, with a large t-statistic of -5.17, and
its sign is negative as predicted. Its coefficient indicates that, all else being equal, a one-unit
increase in the value of Mane2000 is associated with a 1.066858 decrease in Pcpig. The
confidence interval shows that we can be 95% confident that the real coefficient for this variable
lies within [-1.438477, -0.65024].
Maneg is statistically significant, though its sign, unlike the prediction, is
positive. Its coefficient indicates that, all else being equal, a one-unit increase in the value of
Maneg is associated with a 0.0776296 increase in Pcpig. The confidence interval shows that we
are 95% confident that the real coefficient for this variable lies within [0.0350427, 0.1202165].
Bdm2000 is statistically significant at the 0.05 level. Its sign, as was predicted, is
positive. All else being equal, a one-unit increase in Bdm2000 is associated with a 1.086697
increase in Pcpig. The confidence interval shows that we are 95% confident that the real coefficient
for this variable lies within [0.176872, 2.155708].
Bdmg, on the other hand, has a p-value of 0.6 and is statistically insignificant.
Furthermore, contrary to what was predicted, the sign of Bdmg is
negative rather than positive.

Rg is also statistically significant, with a large t-statistic of -4.48. Its sign is negative,
as was previously hypothesized. The coefficient of Rg indicates that, all else being
equal, a one-unit increase in Rg is associated with a 0.8559418 decrease in Pcpig.
The variable rur2000, contrary to what was previously hypothesized, has a p-value of 0.153
and is thus statistically insignificant. Its coefficient's sign, unlike the prediction, is also
positive, at 0.1017825. This result is perplexing, as previous studies on per capita
income have consistently shown a strong negative correlation between the population that lives in
rural areas and per capita income.

6. Summary & Conclusion


The purpose of this study is to explore the key factors that explain per capita personal income
growth among US states, using per capita income growth to measure and compare the
increase in overall well-being among states. The regression model in this
study specifies state per capita income growth as a function of small business employment,
manufacturing employment, the level of population with advanced education, retirees, and the
percentage of the population that lives in rural areas. While the variables included in
this study vary slightly from the parameters commonly used in per capita income
functions, the model is still based largely on previous studies that examined the factors that
influence per capita income.
Small business employment (Sbe2010), for instance, was found to have a positive influence on
per capita income growth (PcpIG), which is consistent with previous studies that point to the
significant contribution that small businesses make to the overall US economy.
Employment in the manufacturing sector (ManE2000) was also found to be statistically
significant, but contrary to the initial hypothesis, this variable was found to have a positive
coefficient, although the other manufacturing variable, which represents growth in
manufacturing employment (ManEG), was found to have a negative coefficient, despite being
statistically insignificant. The positive coefficient of ManE2000 might reflect the generally
higher capital that needs to be invested in a region to get a new manufacturing operation started
(relative to other industry sectors), which would explain this variable's positive contribution to
per capita income. Another plausible explanation is that, because manufacturing
is typically a labor-intensive sector (compared to, for example, the service sector), it is
reasonable to think that manufacturing employs more people and therefore distributes a good
amount of wages and salaries to a larger portion of the population relative to other sectors. The
negative coefficient of ManEG, in turn, might come from the generally declining trend of
manufacturing sector employment, or from the relatively low return and wage increase in this sector
in the long run, considering that this sector typically employs less educated or less skilled
workers.
BdmG was also found to be statistically insignificant and to behave opposite to what
was hypothesized, though its related variable, the level of education represented by
Bdm2000 (population above 25 years old with a bachelor's or associate degree or more, as a
percentage of total population in the year 2000), was found to behave consistently with the initial
hypothesis. It might be plausible to attribute the negative coefficient of BdmG to the increasingly
high cost of education and the relatively long period before an investment
in education returns benefits. A concrete example of this might be the usually higher cost of
building a research laboratory compared to a manufacturing plant: even though a laboratory
may yield higher returns in the future, a manufacturing plant can commence production and
start generating significant returns almost immediately upon its completion. By this logic,
the positive coefficient on Bdm2000 can be thought of as reflecting a portion of the population
that has higher productivity than the rest of the generally less educated population, representing
the finished product of the investments in education made in the period before this study.
Retiree growth, as hypothesized, was found to have a statistically significant negative
relationship with per capita income growth. This result is interesting, though unsurprising, as
studies that explore the key determinants of per capita income typically focus on the
productive group of the population rather than the unproductive group. The variable rur2000,
which represents the population living in rural areas in the year 2000 as a percentage of total
population, was surprisingly found to be statistically insignificant and to have a positive
relationship with per capita income growth. This result contradicts previous studies that
designated urban areas and their population as the key to economic growth. While it is possible
that this result was simply caused by an error in the model, there is also a chance that it was
caused by the growing trend of urbanization, which has concentrated the poorer population
in cities and left the richer population living in rural areas.
These findings, especially those on the less-explored variables, might merit more detailed research
in the future, particularly in planning policies related to retiree populations. With regard
to the growing concerns about the aging of the US population, future policymakers may need
to look further at the impact that the older population has on the overall well-being of the state.

Another interesting finding of this study that calls for further investigation concerns
the small business employment variable. Despite its significance, relatively
few studies have addressed the state of small businesses and their
influence on the overall US economy. The findings on small businesses in this study would have
been stronger had the data on US small businesses for the year 2000 been more available and
comprehensive.
While this paper is not the first study to examine this question, ever-changing socioeconomic
structures mean that the magnitude and influence of these factors will always change over
time. The variables found in this research to behave consistently with previous studies and
reports might not necessarily behave in the same manner in the future. It might be
interesting to expand future research on the same subject with a more diverse set of variables on
social characteristics, to weigh and compare the varying contributions that different groups of
people and industries make to overall economic growth. Additional variables to
consider in future research may include the percentage of married households in a region, or the
number of lawyers (or any other relatively high-paying occupation) as a percentage of total employment.

Bibliography
Baily, Martin N, and Barry P Bosworth. 2014. "US Manufacturing: Understanding Its Past and Its
Potential Future." Journal of Economic Perspectives 28 (1): 3-26. Accessed November
27, 2015. http://www.brookings.edu/~/media/research/files/papers/2014/02/us-manufacturing-past-and-potential-future-baily-bosworth/us-manufacturing-past-and-potential-future-baily-bosworth.pdf.
Berger, Mark C. 1997. "Kentucky's Per Capita Income: Catching Up to the Rest of the Country."
Kentucky Annual Economic Report, February 15: 1-7. Accessed November 10, 2015.
http://cber.uky.edu/Downloads/annrpt97.pdf.
Congressional Budget Office. 2009. How Slower Growth in the Labor Force Could Affect the
Return on Capital. Background Paper, Washington DC: The Congress of the United
States.
Connaughton, John E, and Ronald A Madsen. 2004. "Explaining Per Capita Personal Income
Differences Between States." The Review of Regional Studies 34 (2): 206-220.
Accessed December 7, 2015. http://journal.srsa.org/ojs/index.php/RRS/article/view/87/38.

Fay, Marianne, and Charlotte Opal. 1999. "Urbanization Without Growth, a Not-So-Uncommon
Phenomenon." Policy Research Working Paper (The World Bank) 1-31. Accessed
December 8, 2015. http://dx.doi.org/10.1596/1813-9450-2412.
Jones, Barclay G, and Solomane Kone. 1996. "An Exploration of Relationships Between
Urbanization and Per Capita Income: United States and Countries of the World." Journal
of the Regional Science Association International (RSAI) (Regional Science Association
International (RSAI)) 75 (2): 135-153. Accessed December 5, 2015.
http://link.springer.com/article/10.1007%2FBF02404704.
Leung, Danny, and Luke Rispoli. 2011. "The Contribution of Small and Medium-sized
Businesses to Gross Domestic Product: A Canada - United States Comparison."
Economic Analysis (EA) Research Paper Series 1-19. Accessed November 29, 2015.
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1864144.
Lutz, Byron, and Louise Sheiner. 2014. "The Fiscal Stress Arising from State and Local Retiree
Health Obligations." Journal of Health Economics (38): 130-146. Accessed November
27, 2015. http://www.nber.org/papers/w19779.pdf.
Moretti, Enrico. 2004. "Estimating the Social Return to Higher Education: Evidence from
Longitudinal and Repeated Cross-Sectional Data." Journal of Econometrics (121): 175-212.
Accessed November 26, 2015. http://eml.berkeley.edu/~moretti/socret.pdf.
Neumark, David, Junfu Zhang, and Stephen Ciccarela. 2008. "The Effects of Wal-Mart on Local
Labor Market." Journal of Urban Economics (63): 405-430. Accessed November 16,
2015. http://www.nber.org/papers/w11782.pdf.
Paulse, Michael B, and Nasrin Fatima. 1996. "Higher Education and State Workforce
Productivity in the 1990s." The NEA Higher Education Journal 75-94. Accessed December 3,
2015. http://www.nea.org/assets/img/PubThoughtAndAction/TAA_04Sum_08.pdf.
Paulsen, Michael B. 1996. "Higher Education and State Workforce Productivity." The NEA
Higher Education Journal 55-78. Accessed December 3, 2015.
http://www.nea.org/assets/img/PubThoughtAndAction/TAA_96Spr_04.pdf.
Petrakis, P.E, and D Stamatakis. 2001. "Growth and Educational Levels: a Comparative
Analysis." Economics of Education (21): 513-521. Accessed December 7, 2015.
http://www.csus.edu/indiv/l/langd/Petrakis_Stamatakis.pdf.

Pierce, Justin R, and Peter K Schott. 2012. "The Surprisingly Swift Decline of US Manufacturing
Employment." NBER Working Paper Series (18655): 1-47. Accessed November 29, 2015.
http://www.usitc.gov/research_and_analysis/documents/Pierce%20and%20Schott%20%20The%20Surprisingly%20Swift%20Decline%20of%20U.S.%20Manufacturing%20Employment_0.pdf.
Psacharopoulos. 2006. "The Value of Investment in Education: Theory, Evidence, and Policy."
Journal of Education Finance (University of Illinois Press) 32 (2): 113-136. Accessed
December 8, 2015. http://www.jstor.org/stable/40704288.
Small Business Administration (SBA) Office of Advocacy. 2013. "Small Business Profiles."
Small Business Administration. February 1. Accessed October 16, 2015.
https://www.sba.gov/sites/default/files/allprofiles12.pdf.
Small Business Administration (SBA) Office of Advocacy. 2015. "Small Business Profiles for
the States and Territories." Small Business Administration. February 1. Accessed October 18, 2015.
https://www.sba.gov/sites/default/files/advocacy/SB%20Profiles%202014-15_0.pdf.
Zweifel, Peter, Stefan Felder, and Markus Meiers. 1999. "Ageing of Population and Health Care
Expenditure: A Red Herring?" Health Economics (8): 485-496. Accessed December 13,
2015. http://onlinelibrary.wiley.com/doi/10.1002/(SICI)1099-1050(199909)8:6%3C485::AID-HEC461%3E3.0.CO;2-4/epdf.

Appendices
a. Data source & calculation methods
1. (PcpIG) Per Capita Income Growth (year 2000 - 2010)
Expresses the percentage change in state per capita income between the years 2000 and
2010. The per capita income data used in this paper are expressed in real terms, accounting
for inflation by adjusting the 2000 dollar value to its equivalent in the year 2010.
Data Calculation:
a. 2000 per capita income adjusted to 2010 value by using CPI calculator:
http://data.bls.gov/cgi-bin/cpicalc.pl?cost1=41920&year1=2000&year2=2010
b. $PcpIG = \frac{\text{2010 per capita income} - \text{adjusted 2000 per capita income}}{\text{adjusted 2000 per capita income}} \times 100$
Data unit: percentage
Data source: US Department of Commerce, Bureau of Economic Analysis. Released March
2013. Website: http://www.bea.gov/iTable/index_regional.cfm
2. (PcpI2000) Per capita income level (year 2000)
Real dollar value of state per capita income level in the year 2000.
Data Calculation: 2000 per capita income adjusted to 2010 value by using CPI calculator:
http://data.bls.gov/cgi-bin/cpicalc.pl?cost1=41920&year1=2000&year2=2010
Data unit: US$
Data source: US Department of Commerce, Bureau of Economic Analysis. Released March
2013. Website: http://www.bea.gov/iTable/index_regional.cfm
3. (Sbe2010) Small business employment level (year 2010)
Employment in small businesses in 2010 as a proportion of total employment in the
year 2010. Small business firms are defined as firms that have fewer than 500 employees.
Data Calculation: $Sbe2010 = \frac{\text{small business employment}_{2010}}{\text{total employment}_{2010}} \times 100$
Data unit: percentage
Data source:
Small business 2010 employment data: US Small Business Administration (SBA)
Office of Advocacy, Small Business Profile 2012 Report. Website:
https://www.sba.gov/sites/default/files/allprofiles12.pdf
Total employment in the year 2010 data: American Community Survey. Selected
Economic Characteristics: 2010 American Community Survey 1-Year Estimates.
Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_1YR_DP03&prodType=table

4. (SbCG) Small business companies growth (year 2000 - 2010)
The percentage change in the number of small businesses (companies / firms) between 2000
and 2010.
Data Calculation: $SbCG = \frac{\text{small business companies}_{2010} - \text{small business companies}_{2000}}{\text{small business companies}_{2000}} \times 100$
Data unit: percentage
Data source: Small Business Administration (SBA) Office of Advocacy, Small Business
Profile 2012 Report. Website: https://www.sba.gov/sites/default/files/allprofiles12.pdf
5. (ManE2000) Manufacturing employment level (year 2000)
Employment in the manufacturing sector in 2000 as a proportion of total
employment in the year 2000.
Data Calculation: $ManE2000 = \frac{\text{manufacturing employment}_{2000}}{\text{total employment}_{2000}} \times 100$
Data unit: percentage
Data source:
Employment in the manufacturing sector, year 2000 data: Decennial Census. Profile
of Selected Economic Characteristics: 2000. Census 2000 Summary File 4 (SF 4)
Sample Data. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_00_SF4_DP3&prodType=table
Total employment in the year 2000 data: Decennial Census. Profile of Selected
Economic Characteristics: 2000. Census 2000 Summary File 3 (SF 3) Sample
Data. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_00_SF3_DP3&prodType=table

6. (ManEG) Manufacturing employment growth (year 2000 - 2010)
The percentage change in manufacturing sector employment between 2000
and 2010.
Data Calculation: $ManEG = \frac{\text{manufacturing employment}_{2010} - \text{manufacturing employment}_{2000}}{\text{manufacturing employment}_{2000}} \times 100$
Data unit: percentage
Data source:
Employment in the manufacturing sector, year 2000 data: Decennial Census. Profile
of Selected Economic Characteristics: 2000. Census 2000 Summary File 4 (SF 4)
Sample Data. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_00_SF4_DP3&prodType=table
Employment in the manufacturing sector, year 2010 data: American Community
Survey. Selected Economic Characteristics: 2010 American Community Survey 1-Year
Estimates. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_1YR_DP03&prodType=table

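7. (BdM2000) Population above 25 years old with bachelor degree/associate degree,
as percentage of total population in the year 2000
Data Calculation:
$\text{Pop}_{25+,\,\text{degree}}(2000) = \sum_{c} \text{Pop}_{25+,\,c}(2000)$, where $c$ ranges over the educational attainment categories counted (associate degree, bachelor degree, or more)
$BdM2000 = \frac{\text{Pop}_{25+,\,\text{degree}}(2000)}{\text{total population}_{2000}}$
Data unit: percentage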
Data source:
Population above 25 years old, with bachelor degree or more (2000) data: Decennial
Census. Sex by Educational Attainment for the Population 25 Years and Over [35].
Universe: Population 25 years and over. Census 2000 Summary File 3 (SF3) Sample
Data. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_00_SF3_P037&prodType=table
Total Population in 2000 data: Decennial Census. Total Population. Universe: Total
Population. 2000 Census Summary File 1. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_10_SF1_P1&prodType=table

8. (BdMG) Population above 25 years old with bachelor degree/associate degree or
more growth (year 2000 - 2010)
The percentage change in the number of people above 25 years old whose educational attainment
reached a bachelor's degree, its equivalent, or more between 2000 and 2010.

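Data Calculation:
a. Population above 25 years old with bachelor/associate degree or more in the years
2000 & 2010:
$\text{Pop}_{25+,\,\text{degree}}(2000) = \sum_{c} \text{Pop}_{25+,\,c}(2000)$
$\text{Pop}_{25+,\,\text{degree}}(2010) = \sum_{c} \text{Pop}_{25+,\,c}(2010)$
where $c$ ranges over the educational attainment categories counted (associate degree, bachelor degree, or more)
b. $BdMG = \frac{\text{Pop}_{25+,\,\text{degree}}(2010) - \text{Pop}_{25+,\,\text{degree}}(2000)}{\text{Pop}_{25+,\,\text{degree}}(2000)} \times 100$
Data unit: percentage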
Data source:
Population above 25 years old, with bachelor degree or more (2000) data: Decennial
Census. Sex by Educational Attainment for the Population 25 Years and Over [35].
Universe: Population 25 years and over. Census 2000 Summary File 3 (SF3) Sample
Data. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_00_SF3_P037&prodType=table
Population above 25 years old, with bachelor degree or more (2010) data: American
Community Survey. Sex by Educational Attainment for the Population 25 Years and
Over. Universe: Population 25 years and over. 2010 American Community Survey 1-Year
Estimates. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_1YR_B15002&prodType=table

9. (Rg) Population 65 years and over (retirees) growth (year 2000 - 2010)
The percentage change in the number of people who are 65 years old or older between 2000
and 2010.
Data Calculation:
$Rg = \frac{\text{population 65 and over}_{2010} - \text{population 65 and over}_{2000}}{\text{population 65 and over}_{2000}} \times 100$
Data unit: percentage
Data source:
Population 65 years and over in the year 2000 data: American Community Survey.
Educational Attainment. 2000 American Community Survey 1-Year Estimates. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_1YR_S1501&prodType=table
Population 65 years and over in the year 2010 data: American Community Survey.
Educational Attainment. 2010 American Community Survey 1-Year Estimates. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_1YR_S1501&prodType=table

10. (Rur2000) Population living in rural areas (year 2000)
The number of people living in rural areas in 2000, as a percentage of total
population.
Data Calculation:
$Rur2000 = \frac{\text{rural population (2000)}}{\text{total population (2000)}} \times 100$
Data unit: percentage
Data source:
Population living in rural areas 2000 data: Decennial Census. Urban and Rural.
Universe: Housing Units. Census 2000 Summary File 1 (SF1) 100-percent data.
Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_00_SF1_H002&prodType=table
Total Population in 2000 data: Decennial Census. Total Population. Universe: Total
Population. 2000 Census Summary File 1. Website:
http://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_10_SF1_P1&prodType=table
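
The calculations in this appendix take three forms: restating a 2000 dollar value in 2010 dollars, a percentage change between 2000 and 2010, and a percentage share of a total. The following Python sketch illustrates these three forms under stated assumptions; it is not the procedure that was used to build the dataset. The function names, the CPI index values, and the state figures are hypothetical placeholders, and the CPI adjustment is assumed to be a simple ratio of index values.

```python
def adjust_to_target_dollars(amount, cpi_base, cpi_target):
    """Restate an amount in target-year dollars using a ratio of CPI values."""
    if cpi_base <= 0:
        raise ValueError("cpi_base must be positive")
    return amount * cpi_target / cpi_base


def pct_growth(start, end):
    """Percentage change from start to end: (end - start) / start * 100."""
    if start == 0:
        raise ValueError("start value must be non-zero")
    return (end - start) / start * 100


def pct_share(part, total):
    """Part expressed as a percentage of total: part / total * 100."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return part / total * 100


# Hypothetical inputs for a single state (not taken from the paper's data).
CPI_2000, CPI_2010 = 100.0, 125.0      # placeholder index values
income_2000_nominal = 40_000.0         # per capita income in 2000 dollars
income_2010 = 55_000.0                 # per capita income in 2010 dollars

# Growth-type variable (same form as PcpIG): deflate first, then compare.
income_2000_real = adjust_to_target_dollars(income_2000_nominal, CPI_2000, CPI_2010)
pcpig = pct_growth(income_2000_real, income_2010)

# Level-type variable (same form as Sbe2010, ManE2000, Rur2000).
sbe2010 = pct_share(450_000, 1_000_000)

print(income_2000_real, pcpig, sbe2010)
```

Comparing the unadjusted 2000 income with the 2010 income would overstate growth, which is why the adjustment is applied before the percentage change is taken.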

b. Diagnostics test figures and diagrams

Figure 1 Initial regression residuals vs fitted values plot

Figure 2 Final regression residuals vs fitted values plot

Correlation scatterplot

Figure 3 Dependent - explanatory variables scatterplot (1)

Figure 4 Dependent - explanatory variables scatterplot (2)

Figure 5 Initial regression - Breusch-Pagan / Cook-Weisberg test for heteroscedasticity

Figure 6 Final regression - Breusch-Pagan / Cook-Weisberg test for heteroscedasticity

Table 6 Variance Inflation Factor (VIF) test for multicollinearity (left initial regression, right final regression)

Figure 7 Initial model Ramsey RESET test

Figure 8 Final model Ramsey RESET test

c. Alternative regression models

Table 7 Alternative regression 1 - dataset excludes DC

Table 8 Alternative regression 2 - log per capita income as dependent variable

Table 9 Alternative regression 3 - education, healthcare & social services as included industry sector

Table 10 Initial regression matrix of correlation


Variable | Description | Expected Sign
PcpIG | % Per capita income growth (year 2000 - 2010) |
PcpI2000 | Per capita income level (year 2000) |
Sbe2010 | Small business employment level (year 2010) as a percentage of total employment |
SbCG | % Small business companies growth (year 2000 - 2010) |
ManE2000 | % Employment level in manufacturing (year 2000) |
ManEG | % Employment in manufacturing sector growth (year 2000 - 2010) |
BdM2000 | Population above 25 years old with bachelor & associate degree or more, as a percentage of total population (year 2000) |
BdMG | % Population above 25 years old with bachelor & associate degree or more growth (year 2000 - 2010) |
Rg | % Population 65 years and over (retiree) growth (year 2000 - 2010) |
Rur2000 | % Population living in rural areas (year 2000) |

Table 11 Initial regression variables list
