
Who becomes an entrepreneur?

Occupational choice and the firm size distribution


Markus Poschke
McGill University
February 2008

Abstract
Why do some people become entrepreneurs (and others don't)? Why are firms so heterogeneous? To start, the paper briefly documents evidence from the empirical literature
that the relationship between entrepreneurship and education is U-shaped; that many entrepreneurs start a firm out of necessity; that most firms are small, remain so, yet persist
in the market; and that returns to entrepreneurship have a much larger cross-sectional variance than returns to wage work. Popular models of the firm size distribution cannot easily
account for the persistence of low-productivity firms. The facts can however be explained
in a model of occupational choice between wage work and entrepreneurship where agents
are heterogeneous in their ability as workers, and starting entrepreneurs face uncertainty
about their project's productivity. Then, if the variance of firms' productivity is high relative to that in workers' ability, the most and the least able individuals choose to become
entrepreneurs. This sorting is due to heterogeneous outside options in the labor market.
Low-ability agents benefit disproportionately from the ability to pursue only good business
projects, and abandon low-productivity ones, because their opportunity cost of doing so
is low. High-ability entrepreneurs are relatively more prevalent if the correlation between
ability as a worker and firm productivity is high. The persistence of low-productivity firms
suggests that empirically, this correlation is imperfect.
JEL codes: E20, J23, L11, L16
Keywords: occupational choice, firm entry, selection, firm size distribution

1 Introduction

Why do some people become entrepreneurs (and others don't)? Why do so many firms fail? Why
are firms so heterogeneous? With these questions in mind, this paper explores the occupational
choice between wage work and entrepreneurship when people are heterogeneous in their ability
as workers, and startups differ in productivity. This decision has implications for the firm
productivity distribution and, through this, for aggregate productivity.

I would like to thank Antonio Ciccone, Russell Cooper, Omar Licandro, Francesc Ortega, Morten Ravn,
Jaume Ventura, and seminar participants at McGill, Concordia, and Universitat Pompeu Fabra for valuable
comments and suggestions.

Contact: McGill University, Economics Department, 855 Sherbrooke St West, Montreal QC H3A 2T7,
Canada. e-mail: markus.poschke@mcgill.ca

A substantial number of people choose to become entrepreneurs. In the U.S., for instance,
the ratio of entrepreneurs to entrepreneurs plus wage and salary workers is 12.8%, using CPS
data for 2003 cited in Hipple (2004). This rate is even higher in most other industrialized
economies (see Blanchflower 2000). Of course, who becomes an entrepreneur may matter a lot
for the firm productivity distribution, aggregate productivity, and welfare.
Before modeling the occupational choice, I review some relevant facts about entrepreneurship
from the empirical literature. First, do the most or the least able become entrepreneurs? A
priori, this is not clear, and depends on what type of firm one thinks of. Lazear (2005), for
instance, puts it this way:
It is tempting to argue that the most talented people become entrepreneurs because
they have the skills required to engage in creative activity. Perhaps so, but this flies
in the face of some facts. The man who opens up a small dry-cleaning shop with two
employees might be termed an entrepreneur, whereas the half-million-dollar-per-year
executive whose suit he cleans is someone else's employee. It is unlikely that the shop
owner is more able than the typical executive.
The reverse might be true. As necessity is the mother of invention, perhaps entrepreneurs are created when a worker has no alternatives. Rather than coming from
the top of the ability distribution, they are what is left over. This argument also flies
in the face of some facts. Any ability measure that classifies John D. Rockefeller,
Andrew Carnegie, or, more recently, Bill Gates near the bottom of the distribution
needs to be questioned.
Indeed, Section 2.1 shows that, on closer inspection, the empirical literature on entrepreneurship
reveals a U-shaped relationship between entrepreneurship and ability when educational attainment
is used as a proxy for ability: self-employment rates are highest for people with relatively high
or low levels of education, and lower for people with intermediate levels of education. This
pattern only emerges when no linear specification is imposed on education. When the probability
of being an entrepreneur is simply regressed on years of schooling, for instance, results are
often not significant, as may well occur if the underlying relationship is actually U-shaped.1
The remainder of Section 2 revisits three better-known facts about entrepreneurship. Firstly,
a substantial fraction of entrepreneurs (more than 10% in the U.S.) make their occupational
choice not to pursue some golden opportunity, but out of necessity. Secondly, the bulk of
firms are small, remain so, and yet persist in the market. Many are smaller than popular models
of firm entry and survival allow them to be, making structural estimation or calibration of those
models hard. Finally, returns to entrepreneurship have a much larger cross-sectional variance
than returns to wage or salaried work.
In Section 3, I set out a simple and pretty general model that can explain these facts. It
describes a world where people differ in both productive ability (output as workers) and the
productivity of firms they start, and choose the most rewarding occupation. Whereas productive ability is known, the productivity of entrepreneurial projects can only be found out by
implementing them, i.e. by becoming an entrepreneur. Average productivity of firms started by
a person may however be correlated with his/her productive ability.
Because a project's productivity is not known ex ante, entrepreneurs may start low-productivity
ventures and then abandon them in the hope of starting a more productive project next time.
Their optimal continuation policy hence consists in a reservation productivity, similar to that
in McCall (1970) labor market search. Less productive projects are abandoned. This reservation productivity is higher for more able agents. In Section 4, I then show that if the variance
in productivity of potential projects is relatively high compared to the variance in the ability
of workers, it will be the most able and the least able people who start firms, with agents of
intermediate ability choosing to become workers.
The reason for this is the following: The cost of starting a firm is an opportunity cost in
terms of foregone wages. This is higher the more discriminating the reservation productivity
policy is. Low-ability agents face low wages and therefore have a low opportunity cost of starting
a firm. High-ability agents have particularly high potential benefits. But agents of intermediate
ability fall in between, and as a result do not find it optimal to start a firm.
The correlation between individuals' ability as workers and the productivity of their startups
determines the relative size of the groups of high- and low-ability entrepreneurs. A high correlation makes firm formation particularly attractive for high-ability agents. A low correlation
implies that expected firm value does not vary much with ability. Then the firm entry decision
will mainly be determined by the return to working. As a result, there will be many low-ability
entrepreneurs.

1 Lazear, following his observation, then goes on to focus on heterogeneity in the structure of skills, and not in
ability. The potential importance of other dimensions of heterogeneity notwithstanding, the model proposed in
this paper is capable of explaining the entrepreneurship-ability relationship, which is puzzling at first sight.

In considering a setting with two dimensions of heterogeneity, the paper goes beyond the
classic models of entrepreneurial choice of Lucas (1978) and Kihlstrom and Laffont (1979). With
only one dimension of heterogeneity as there, it is relatively obvious who will start a firm: the
least risk averse, or the most able entrepreneurs. This however does not square well with the
empirical evidence on entrepreneurship and ability, or even with casual observation. Cagetti
and De Nardi (2006) do consider a model of occupational choice with two dimensions of ability.
However, their entrepreneurial ability is a binary variable, just indicating whether someone is
able to start a firm or not. They then focus on how the possibility of starting a firm shapes the
wealth distribution and the distribution of returns to entrepreneurship when there are financial
constraints, which is quite distinct from the role of heterogeneity in this paper.
The model is close to another classic, though: the Roy (1951) model of occupational choice.
Jovanovic (1994) analyzes such a Roy model with known, heterogeneous managerial and working
abilities. The model here extends this by adding uncertainty about a startup's productivity, and by
agents' ability to search for a good project. Section 5 shows that while the relative variance
condition is similar to the one that governs occupational choice in the Roy model, the crucial
difference that gives rise to substantial firm entry by both high- and low-ability agents is the
ability to abandon bad projects.

2 Some facts on entrepreneurship

This section documents several relevant facts about entrepreneurship, some well-known, some
new. Firstly, the relationship between entrepreneurship and education is U-shaped. That is,
people with very low or high levels of education are more likely to be entrepreneurs than people
with intermediate levels of education. Secondly, there is a substantial fraction of people who
become entrepreneurs out of necessity, and not to pursue an opportunity. This fits with the
third fact: most firms are small. Most of these firms remain small and are not much more likely
to exit than their larger counterparts. Many existing models have trouble accounting for the
smallest of them. Finally, returns to entrepreneurship have a much higher variance than returns
to being an employee. This emerges robustly from the recent literature on entrepreneurship and
will therefore be reiterated only briefly here.

2.1 Who becomes an entrepreneur? Entrepreneurship and educational attainment

Are more productive or less productive people more likely to become entrepreneurs? As suggested by the quote from Lazear (2005), either argument might be made, depending on the type
of firm one is thinking of. This also suggests that the answer does not have to be either/or.
This section presents evidence on the relationship between entrepreneurship and ability.
Whereas there is an abundant literature on the impact of an additional year of schooling
on wages or salaries of employees, the relationship between entrepreneurship and schooling has
received much less attention, and much less sophisticated econometric treatment. Therefore, I
will simply focus on results in the literature on the proportion of entrepreneurs by educational
attainment.
Studies that look only for a linear effect, e.g. by regressing the probability of being an
entrepreneur on years of schooling, often remain inconclusive. This is also the case for the
meta-study by van der Sluis, van Praag and Vijverberg (2003). The reason for this is that on
closer inspection, as shown below, a U-shape appears: People at the extremes of the education
distribution are more likely to be entrepreneurs than people with intermediate levels of education.
Looking for a purely linear relationship will either result in hiding the U-shape or will be likely
to yield insignificant estimates.
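
The point about linear specifications can be illustrated with a small numerical sketch. The numbers below are synthetic and purely illustrative (they are not taken from any of the cited studies): a self-employment probability that is U-shaped in years of schooling is fitted once with a linear and once with a quadratic specification.

```python
import numpy as np

# Synthetic, illustrative data (an assumption, not from the cited studies):
# years of schooling and a self-employment probability that is U-shaped in it.
schooling = np.arange(8, 21)                 # 8, 9, ..., 20 years
midpoint = schooling.mean()                  # 14 years
prob = 0.08 + 0.001 * (schooling - midpoint) ** 2

# Linear specification: prob = b0 + b1 * schooling
b1, b0 = np.polyfit(schooling, prob, deg=1)

# Quadratic specification: prob = c0 + c1 * schooling + c2 * schooling**2
c2, c1, c0 = np.polyfit(schooling, prob, deg=2)

print(f"linear slope:          {b1:.6f}")
print(f"quadratic coefficient: {c2:.6f}")
```

In this stylized case the linear slope is essentially zero although schooling matters a great deal, while the quadratic term picks up the U-shape.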
Table 1 summarizes recent evidence. It shows self-employment rates by educational category
from a variety of sources, covering different countries and time periods. The columns refer to
elementary school (E), less than high school (<HS), high school (HS), less than college (<C),
college (C), and advanced degrees (>C). Not all sources report data for all of the educational
categories. More detail on the sources can be found in Table 2.
Table 1: Self-employment rates by education category

Borjas and Bronars (1989)


Hamilton (2000)
Hipple (2004) (unincorporated)
Hipple (2004) (incorporated)
Lin, Picot and Compton (2000)
Schjerning and Le Maire (2007)

data source
U.S., 1980 Census
U.S., 1984 SIPP
U.S., 2003 CPS
U.S., 2003 CPS
Canada, 1994
Denmark, 1980-96

18.4
10.9

educational
<HS HS
4.8
4.2
12.6 11.1
9.1
8.7
2.1
3.1
13.5 11.4
10.9

attainment
<C
C
4.6
6.5
12.6 15
8.2
7.5
3.9
5.5
10.1 11.1
7.4
3.6

>C

9.1
6.6
13.2
12.9

Table 2: Self-employment rates and education: detailed data sources

Blanchflower (2000)               Micro data from 19 countries, 1975-1996, ages 16-64, from
                                  Eurobarometer Surveys and General Social Survey
Borjas and Bronars (1989)         U.S., 1980 Census: white men aged 25-64, residing in
                                  metropolitan areas, not employed in agriculture
Hamilton (2000)                   U.S., 1984 SIPP: male school leavers aged 18-65
                                  working in the nonfarm sector
Hipple (2004)                     U.S., 2003 CPS: men and women, aged 16 and older
Lin et al. (2000)                 Statistics Canada 1994 Survey of Labour and Income
                                  Dynamics (SLID), ages 15-64
Schjerning and Le Maire (2007)    Statistics Denmark Integrated Database for Labor Market
                                  Research (IDA) and the Danish Income Registry (IKR),
                                  1980-1996 data, ages 30-55

The most remarkable feature of the data reported in Table 1 is that self-employment rates are
higher for the lowest and highest levels of schooling, and lower for intermediate levels. Hence,
the relationship between the self-employment rate and educational attainment is U-shaped.
This holds across data sources, time periods, and (some) countries, giving the regularity some
support. The only group for which it does not hold is the incorporated self-employed. This
exception arises because small businesses are unlikely to incorporate.
Econometric exercises show that these differences are not simply due to e.g. cohort effects.
The U-shape in education persists when regressing the probability of being an entrepreneur on a
set of demographics using discrete choice models. This is found both by Blanchflower (2000) in
data across 19 OECD countries, and by Schjerning and Le Maire (2007) on Danish data, using
very fine education categories. Blanchflower finds that, controlling for age, education, gender,
household size, the number of children under the age of 15 in the household and the gender-specific
country unemployment rate, "the least educated (age left school < age 15) and the most
educated (age left school > 22 years) have the highest probabilities of being self-employed"
(p. 488). This pattern is statistically significant. Schjerning and Le Maire, controlling for
age, wealth, number of children by age, marriage status, immigrant status and origin, and the
spouse's self-employment status, still find that the probability of being self-employed is lowest for
the intermediate education categories of post-secondary education and a short cycle of higher
education, and higher at the extremes. A linear specification for education would not be able
to pick this up. Evans and Leighton (1989), for instance, using years of schooling as a measure
of education, do not find it to be significant when controlling for urban vs rural, experience,
unemployment status, father's occupation, and some sectors.
As far as the (limited) evidence goes, the U-shaped relationship between entrepreneurship
and educational attainment hence emerges as a fairly robust fact. Further research has to show
whether it also holds for other measures of ability.

2.2 Why start a firm? Opportunity vs necessity

Are all entrepreneurs out to pursue some golden opportunity? Most firms stay small
(though, of course, many large firms started small), and yet they persist. In fact, a substantial
fraction of startups enters the market not to innovate and embark on a growth path, but out
of necessity. This emerges from data collected through the Global Entrepreneurship Monitor
(GEM) project in 47 industrialized and developing countries. The GEM is an academic research
consortium led by London Business School and Babson College. Its data provide the broadest
information on entrepreneurship across countries.
Table 3: Fraction of entrepreneurs starting a firm out of necessity (GEM data)

Western Europe            other OECD                Latin America
Belgium        10.8%      Australia      16.7%      Argentina      39.1%
Denmark         6.1%      Canada         16.9%      Brazil         46.7%
Spain          16.4%      Japan          26.3%      average        42.9%
Finland         9.7%      New Zealand    13.5%
France         23.0%      USA            12.3%      Asia
Germany        26.5%      average        17.1%      Singapore      15.7%
Iceland         7.1%
Ireland        16.4%      Transition Economies      Africa
Italy          13.5%      Croatia        37.3%      South Africa   39.2%
Netherlands    10.2%      Hungary        33.0%
Norway          8.0%      Slovenia       19.3%
Sweden         12.6%      average        29.9%
UK             13.7%
average        13.4%

The GEM survey targets people aged 18 to 64 years who are involved in some nascent
entrepreneurial activity. The relevant group is identified in the context of household surveys.
Table 3 shows the fraction of people responding to "Are you involved in this start-up to take
advantage of a business opportunity or because you have no better choices for work?" as "have
no better choice". In the GEM data, this is called the Necessity Entrepreneurial Activity Index.
The numbers shown are time averages for the period 2001 to 2005 for the countries where
observations for at least 4 years were available.2
Two facts stand out. First, the fraction of entrepreneurs out of necessity is by far the
highest in Latin America, South Africa, and Eastern European transition economies. Most likely,
issues of (in)formality play a role here. Secondly, there is a substantial fraction of entrepreneurs
out of necessity everywhere, even in industrialized countries. In most countries, the number
is above 10%. The average for industrialized countries is 14.4%. Hence, not all entrepreneurs
are out to innovate or pursue a golden opportunity.
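
The group averages quoted above can be recomputed from the entries of Table 3. The following is a minimal sketch; it assumes that "industrialized countries" refers to the Western European and other OECD countries pooled together with equal weights, which is an interpretation rather than something stated explicitly in the text.

```python
# Shares of necessity entrepreneurs (in %), as tabulated in Table 3.
western_europe = {
    "Belgium": 10.8, "Denmark": 6.1, "Spain": 16.4, "Finland": 9.7,
    "France": 23.0, "Germany": 26.5, "Iceland": 7.1, "Ireland": 16.4,
    "Italy": 13.5, "Netherlands": 10.2, "Norway": 8.0, "Sweden": 12.6,
    "UK": 13.7,
}
other_oecd = {
    "Australia": 16.7, "Canada": 16.9, "Japan": 26.3,
    "New Zealand": 13.5, "USA": 12.3,
}


def mean(values):
    """Unweighted arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)


avg_western_europe = mean(western_europe.values())
avg_other_oecd = mean(other_oecd.values())

# Assumption: "industrialized" pools the two groups, unweighted.
industrialized = {**western_europe, **other_oecd}
avg_industrialized = mean(industrialized.values())
count_above_10 = sum(v > 10.0 for v in industrialized.values())

print(round(avg_western_europe, 1), round(avg_other_oecd, 1),
      round(avg_industrialized, 1), count_above_10, len(industrialized))
```

Under this pooling assumption, the rounded averages match the 13.4%, 17.1% and 14.4% reported in the table and text, and 14 of the 18 industrialized countries lie above 10%.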

2.3 Implications for the firm size distribution

The presence of low-ability entrepreneurs, combined with a substantial fraction of entrepreneurs
starting firms out of necessity, has implications for the firm size distribution. Indeed, small
firms are an important and persistent part of it. For instance, in the U.S., 55% of employer
firms have fewer than 5 employees. In addition, there are around 10 million self-employed. While
small firms are more likely to exit, the difference is small once age is controlled for (Bartelsman,
Scarpetta and Schivardi 2003, Figure 6). Hence, small firms are there to stay, and are a persistent
part of the firm size distribution.
Recent research attempting to match the firm size distribution has mainly focussed on its
right tail (see e.g. Luttmer 2007, Chatterjee and Rossi-Hansberg 2007), not paying much attention to the left tail. Indeed, popular models have problems accounting for just how small and
persistent small firms can be. For instance, in settings like that of Hopenhayn (1992) and the
many models based on it, a fixed cost or a uniform outside option implies that there is a strictly
positive minimum firm size. In the data, however, this minimum size is zero, measured in terms
of employees. Hence, estimated versions of such models have trouble accounting for small firms
and their persistence. Due to the non-linearity of these models, this may affect other parameter
estimates and predictions. Heterogeneity in outside options could solve that problem, as shown
below in the model.
2 Tabulated data are from the macro overview data of the GEM (micro data are also available). They were
downloaded from http://www.entrepreneurship-sme.eu/.

2.4 Returns to entrepreneurship

Finally, one fact that has emerged robustly from the recent literature on the return to entrepreneurship is that its cross-sectional variance is much larger than that of wages. Whereas
measurement issues pose serious problems in comparing the average return to entrepreneurship
to that to wage work or to public equity (Hamilton 2000, Moskowitz and Vissing-Jørgensen
2002, Cagetti and De Nardi 2006), the difference in variance is so large, and largely immune to
shifts in the mean, that there is no disagreement on it.
To illustrate, in an earlier study, Borjas and Bronars (1989) estimated that the standard
deviation of log weekly income for the self-employed is up to twice that of wage-earners. Depending on the measure used for income from self-employment, this ratio is between two and almost
four in the sample from the SIPP used by Hamilton (2000).
These four facts are related. They suggest that entrepreneurs have very heterogeneous outside
options, so some become entrepreneurs out of necessity. These may (a conjecture) be mainly
people with low levels of education. The firms they run most likely will remain small, if they
manage to survive. Suppose that some variance in returns to entrepreneurship also arises from
heterogeneous quality of projects. Finally suppose that, while any budding entrepreneur could
end up running projects of varying return, those with higher education would on average run
their projects better, or run better projects. Then it is clear that the fact that entrepreneurs come
from the extremes of the ability distribution implies that the observed post-selection variance
in returns will be high relative to the fundamental variance in returns any individual might
face. Hence, selection from the extremes of the ability distribution, arising from heterogeneous
outside options, increases observed variance in returns to entrepreneurship.
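
A rough Monte Carlo sketch can illustrate this argument. Everything below is an illustrative assumption: log ability and log project productivity are drawn from a bivariate normal distribution with correlation `rho`, the parameter values are hypothetical, and the selection rule (only the bottom and top 20% of the ability distribution run firms) is imposed exogenously instead of being derived from the model. Selection on project quality itself is ignored.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho, sigma_a, sigma_theta = 0.5, 1.0, 1.5   # hypothetical parameter values

ln_a = rng.normal(0.0, sigma_a, n)
# ln(theta) given ln(a): mean rho * (sigma_theta / sigma_a) * ln(a),
# standard deviation sigma_theta * sqrt(1 - rho^2).
cond_sd = sigma_theta * np.sqrt(1.0 - rho**2)
ln_theta = rho * sigma_theta / sigma_a * ln_a + rng.normal(0.0, cond_sd, n)

# Hypothetical selection rule: only the bottom and top 20% by ability run firms.
a_low, a_high = np.quantile(ln_a, [0.2, 0.8])
entrepreneur = (ln_a <= a_low) | (ln_a >= a_high)

var_individual = cond_sd**2                   # variance a given individual faces
var_observed = ln_theta[entrepreneur].var()   # variance across all entrepreneurs
var_middle = ln_theta[~entrepreneur].var()    # counterfactual: middle group only

print(var_individual, var_observed, var_middle)
```

Because the selected group mixes individuals with very different expected productivity, the cross-sectional variance among entrepreneurs exceeds the variance any single individual faces.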
The model developed in the next section shows how selection from the extremes can occur
naturally in a pretty general setting. It also suggests that one-sided selection models, as usually
employed in empirical work, will only capture part of the selection mechanism.

3 The economy

Time is discrete. The economy consists of a continuum of risk-neutral individuals of measure 1.
People derive utility from consumption, and can earn income either as workers or by running
their own firm. Every period, they retire with probability $\lambda > 0$, and a measure $\lambda$ of people
newly enter the labor market. When an entrepreneur retires, the firm is dissolved. Employees can
however immediately find a new job on a competitive labor market. Future utility is discounted
at a rate $r > 0$. Combined with the retirement probability, this implies discounting future utility
using a discount factor $\beta = 1/(1 + r + \lambda) \in (0, 1)$.
Firms produce a homogeneous good, which is used as the numeraire. They produce output
with the production function
$$y(\theta, n) = \theta n^{\alpha}, \qquad 0 < \alpha < 1. \qquad (1)$$

This production function combines as inputs one manager/owner, who is essential to operate
the firm, with a labor input n. (Any individual can run at most one firm at any moment
in time.) Production exhibits decreasing returns to scale in the only variable input, labor, so
that optimal firm size is finite.3 This could be due for instance to limits in managers' span of
control (Lucas 1978): as activity expands, it becomes more difficult to control, and the marginal
product of the variable factor diminishes. Firms vary in their total factor productivity $\theta$, which
is constant over time for a given firm.
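
To see how decreasing returns pin down firm size, the following sketch computes profit-maximizing employment. It assumes the production function has the form y = theta * n**alpha with 0 < alpha < 1 and that the firm pays a wage `wage` per efficiency unit of labor; the function names and parameter values are hypothetical.

```python
def optimal_employment(theta, wage, alpha):
    """Labor input n* that maximizes theta * n**alpha - wage * n.

    From the first-order condition alpha * theta * n**(alpha - 1) = wage.
    """
    return (alpha * theta / wage) ** (1.0 / (1.0 - alpha))


def max_profit(theta, wage, alpha):
    """Profit of the owner at the optimal employment level."""
    n_star = optimal_employment(theta, wage, alpha)
    return theta * n_star**alpha - wage * n_star


theta, wage, alpha = 2.0, 1.0, 0.5   # hypothetical values
n_star = optimal_employment(theta, wage, alpha)
print(n_star, max_profit(theta, wage, alpha))
```

Under these assumptions, optimal size and profit are both finite and increase in productivity theta, so the productivity distribution maps directly into a firm size distribution.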
While firms differ in their productivity, individuals differ in their productive ability a. In
the following, this will be referred to as "ability" for short, to distinguish it from productivity,
which is a firm-level concept. Ability a is observable. Workers are perfectly substitutable in
production, so a worker with ability a is equivalent to a workers of ability 1. A firm's total
employment then is the integral over the abilities a of its employees.
Individuals have the choice between working or running a firm for a living. A would-be
entrepreneur can start a firm by putting into practice some business idea. For the moment,
suppose that there is no entry cost. It is well-documented that it is hard to precisely assess
the quality of a project and of its implementation ex ante, before starting the firm.4 Hence,
assume that prior to entry, a potential entrepreneur faces a known probability distribution of
potential values of productivity $\theta$ his/her project could have. The precise productivity of a
concrete project, however, is only revealed upon starting a firm.
Timing is as follows. At the beginning of any period, agents face the choice between working
and starting a firm. Ability a being observable, agents know their return to working. People
who already run a firm also know the quality of that project and can take this into account
in their choice. The productivity of a new firm, however, is not known. Only the distribution
of productivities is known, so agents make their decision based on this. If they start a firm,
they find out their project's productivity $\theta$. They can then either pursue that project and start
production the following period, or abandon the project. As a result, there is an entry cost
in terms of foregone wages: it takes a period to start production, so starting a firm entails an
opportunity cost equal to foregoing a period's wages.
A firm's productivity depends on the quality of its founder's idea and management. Suppose
that more able (higher a) individuals have a tendency to have better ideas, or that there are some
general skills which are useful both in production and for running a firm. Then the correlation
between individuals' productive ability and the productivity of the firms they might run is
positive. To summarize, the population consists of individuals of heterogeneous ability a. Each
of them has the choice between working or running a firm for a living. In case they start a firm,
the firm's productivity is not known ex ante, but the distribution is known, and improves with
an individual's ability a in a first-order stochastic dominance sense.

3 The setting is easy to extend to include a variable capital input, as long as there is a source of decreasing
returns ensuring finite optimal firm size.
4 Theoretically, this point has been made many times, see for instance Jovanovic (1982). The clearest supporting
evidence comes from high failure rates of young firms. This is amply documented; for some recent estimates of
survival hazards see e.g. Bartelsman et al. (2003).

Whereas this is not needed for results to go through, it will in some cases be useful to make
concrete parametric assumptions on the distributions of $a$ and $\theta$. So suppose that the logarithms
of people's productive ability $a$ and of the productivity $\theta$ of firms they may operate are drawn
from a bivariate normal distribution with variances $\sigma_\theta^2$ and $\sigma_a^2$, and correlation $\rho \geq 0$. Denote
the joint pdf by $\phi(\cdot)$, and the cdf by $\Phi(\cdot)$. The expectation of $\ln\theta$ conditional on $a$ then is linear
in the realization of $\ln a$, and the slope depends on $\rho$ and the two variances.
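
For reference, the standard conditioning formula for the bivariate normal makes this slope explicit. The notation is an assumption where the text does not fix it: $\mu_a$ and $\mu_\theta$ denote the means of $\ln a$ and $\ln\theta$, and $\sigma_a$, $\sigma_\theta$ their standard deviations.

```latex
% Assumed notation: \mu_a, \mu_\theta are the means of \ln a and \ln\theta;
% \sigma_a, \sigma_\theta their standard deviations; \rho their correlation.
\ln\theta \mid a \;\sim\; N\!\left(
    \mu_\theta + \rho\,\frac{\sigma_\theta}{\sigma_a}\,(\ln a - \mu_a),\;
    \sigma_\theta^2\,(1-\rho^2)
\right),
\qquad
E[\ln\theta \mid a] = \mu_\theta + \rho\,\frac{\sigma_\theta}{\sigma_a}\,(\ln a - \mu_a).
```

The slope with respect to $\ln a$ is thus $\rho\,\sigma_\theta/\sigma_a$, and the conditional variance does not depend on $a$.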
In short, while a person's productive ability and the quality of project they might run are
correlated ($\rho \geq 0$), they are not the same. In fact, the literature on entrepreneurship (see in
particular Lazear 2005) stresses that entrepreneurs need to be "Jacks-of-all-Trades" and have
broad talents, as opposed to more specialized skills of production workers.
A competitive equilibrium in this economy consists of a wage schedule w(a) and a distribution
of agents over activities such that taking prices and wages as given, agents choose their occupation optimally, firms choose employment to maximize profits, and the labor market clears. The
firm productivity distribution is directly determined by the distribution of agents over activities
and their optimal occupational choice.

4 Occupational choice

The occupational choice problem has the following basic structure. Workers find it optimal to
start a firm if this yields higher value than employment. The latter depends on the worker's
productive ability. This is also true for the former, because more able workers tend to run more
productive firms. As a result, not everyone will find it optimal to start a firm; who does so
depends on the correlation between $\theta$ and $a$. Once a firm has been started, the owner will only
pursue the concern if this gives a higher value than either giving it up and looking for a job
again, or giving it up and trying out a new project. The former is never going to be optimal:
any startup has the same expected value, so anyone who finds it optimal to start a firm once
is going to find it optimal again, even if the first project turned out to be unsuccessful. The
other option of starting a new project, however, implies that entrants will only continue if they
are sufficiently productive. For someone who realizes that his business idea was not good, it is
preferable to try out a new idea instead.
A starting entrepreneur's problem is thus analogous to a McCall (1970) search problem in
the labor market. Someone who has decided that trying to start a firm is the optimal thing
to do also has to decide which level of productivity is good enough to continue operating. The
reason is that the entrepreneur can always decide to try a new project next period, at the cost of
delaying production. Let the value of running a firm with productivity $\theta$ forever be $F(\theta)$. Let
expected firm value be $V(a) \equiv E[F(\theta)]$. An entrepreneur who has just realized that his project
has productivity $\theta$ then has two options: pursue it and get $F(\theta)$, or try another project and get
$V(a)$ next period. He is thus indifferent between the two actions if $F(\theta) = V(a)$. This defines
a reservation productivity $\theta_R$: for draws of $\theta$ above $\theta_R$ it is optimal to continue, and for draws
below $\theta_R$ it is optimal to try a different project. Firm value at the reservation productivity has
to satisfy

$$F(\theta_R(a)) = E\{\max[F(\theta), F(\theta_R(a))] \mid a\} = V(a). \qquad (2)$$

Note that $\theta_R(a)$ is a function of $a$ because individuals' ability determines the productivity
distribution they face. By standard arguments, this functional equation in $\theta_R(a)$ has a unique
solution.
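
A reservation productivity of this kind can be computed numerically by fixed-point iteration, as in standard McCall-type search problems. The sketch below rests on several assumptions that are not taken from the paper: log productivity is normal with mean `mu` and standard deviation `sigma`; F is the discounted profit stream of a firm with output theta * n**alpha that pays `wage` per efficiency unit of labor; and the cost of delay is modeled by discounting the value of a fresh draw with a factor `beta`, i.e. the code solves V = beta * E[max(F(theta), V)] and then inverts F(theta_R) = V. All names and parameter values are hypothetical.

```python
import numpy as np


def reservation_productivity(mu, sigma, beta=0.9, alpha=0.5, wage=1.0,
                             n_nodes=40, tol=1e-10, max_iter=10_000):
    """Return (theta_R, V) solving V = beta * E[max(F(theta), V)].

    Illustrative assumptions: ln(theta) ~ N(mu, sigma^2) and
    F(theta) = profit(theta) / (1 - beta) with profit from y = theta * n**alpha.
    """
    # Gauss-Hermite nodes and weights for expectations over a normal variable.
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    theta = np.exp(mu + np.sqrt(2.0) * sigma * x)
    prob = w / np.sqrt(np.pi)                 # weights now sum to one

    # F(theta) = scale * theta**(1 / (1 - alpha))
    scale = (1 - alpha) * (alpha / wage) ** (alpha / (1 - alpha)) / (1 - beta)
    F = scale * theta ** (1.0 / (1.0 - alpha))

    V = 0.0
    for _ in range(max_iter):
        V_new = beta * np.sum(prob * np.maximum(F, V))
        converged = abs(V_new - V) < tol
        V = V_new
        if converged:
            break

    theta_R = (V / scale) ** (1.0 - alpha)    # invert F(theta_R) = V
    return theta_R, V


low = reservation_productivity(mu=0.0, sigma=0.5)    # worse distribution
high = reservation_productivity(mu=0.5, sigma=0.5)   # better distribution
print(low, high)
```

In this sketch, shifting the productivity distribution upward (a stand-in for higher ability) raises both the value of starting a firm and the reservation productivity.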
Agents' occupational choice problem then consists of comparing the value of starting a firm,
V (a), to the value of working. Denote the latter by W (a). As a does not change over an
individual's life, agents make the same choice every period, or we can think of them as making
an occupational choice when entering the labor market.5 With perfectly substitutable labor
inputs and a competitive labor market, the wage w(a) is linear in a, and so is the value W (a)
of working forever. The shape of V (a) then determines the pattern of occupational choice.
Apart from the reservation productivity, a crucial determinant of $V(a)$ is the distribution of
$\theta$ conditional on $a$.

5 The model abstracts from many features such as the need to finance entry investment or unemployment
episodes that can lead to alternation of periods of employment and entrepreneurship; I am pursuing this in
separate work.

Assumption 1 $E[\theta \mid a]$ is weakly convex in $a$.
With bivariate lognormal $a$ and $\theta$, a sufficient condition for this is $\rho\sigma_\theta/\sigma_a \geq 1$, where $\rho$ is the
correlation between $\ln a$ and $\ln\theta$ and $\sigma_a$, $\sigma_\theta$ are their standard deviations (see Appendix
for details). Essentially, it implies that the variance in productivity of potential projects exceeds
that in ability in the population by some margin, and more so the smaller $\rho$. While observations
on the variance in wages and profits cannot be used to verify this assumption because they are
post-occupational choice measures, the next proposition implies that they are consistent with
it.
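
The sufficient condition can be checked numerically. With bivariate lognormal variables, E[theta | a] is proportional to a raised to the power rho * sigma_theta / sigma_a, so it is convex in a when that exponent is at least one. The sketch below evaluates this on a grid; the means `mu_theta` and `mu_a` of the logs and all parameter values are assumptions made for illustration.

```python
import numpy as np


def expected_theta_given_a(a, rho, sigma_theta, sigma_a, mu_theta=0.0, mu_a=0.0):
    """E[theta | a] for bivariate lognormal (a, theta).

    mu_theta and mu_a are assumed means of ln(theta) and ln(a).
    """
    slope = rho * sigma_theta / sigma_a
    cond_var = sigma_theta**2 * (1.0 - rho**2)
    return np.exp(mu_theta + slope * (np.log(a) - mu_a) + 0.5 * cond_var)


def is_weakly_convex(values, tol=1e-9):
    """True if all second differences on an evenly spaced grid are >= -tol."""
    return bool(np.all(np.diff(values, n=2) >= -tol))


a_grid = np.linspace(0.1, 5.0, 200)
# Exponent rho * sigma_theta / sigma_a = 1.5: condition holds.
convex_case = expected_theta_given_a(a_grid, rho=0.5, sigma_theta=3.0, sigma_a=1.0)
# Exponent rho * sigma_theta / sigma_a = 0.5: condition fails.
concave_case = expected_theta_given_a(a_grid, rho=0.5, sigma_theta=1.0, sigma_a=1.0)

print(is_weakly_convex(convex_case), is_weakly_convex(concave_case))
```

The first parameterization satisfies the condition and yields a convex conditional expectation, while the second does not.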
Proposition 1 If Assumption 1 holds, $V(a)$ is increasing and convex in $a$, with $\lim_{a \to 0} V(a) > 0$.
Then there are two cutoffs $a_L$ and $a_H$ ($a_L < a_H$) where $V(a) = W(a)$, and agents with
$a \in (a_L, a_H)$ choose to work, while agents with $a \leq a_L$ or $a \geq a_H$ become entrepreneurs.
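
The geometry behind the two cutoffs can be sketched with a toy example. The functional forms below are purely hypothetical and chosen only because they have the properties stated in the proposition: V(a) = v0 + c * a**2 is increasing and convex with a positive intercept, and W(a) = s * a is linear. They are not the value functions of the model.

```python
import math


def occupational_cutoffs(v0, c, s):
    """Cutoffs (a_L, a_H) where V(a) = v0 + c * a**2 crosses W(a) = s * a.

    Hypothetical functional forms, with v0 > 0, c > 0 and s > 0.
    Returns None if V(a) >= W(a) everywhere, so that nobody prefers wage work.
    """
    disc = s**2 - 4.0 * c * v0          # discriminant of c*a^2 - s*a + v0 = 0
    if disc <= 0.0:
        return None
    root = math.sqrt(disc)
    return (s - root) / (2.0 * c), (s + root) / (2.0 * c)


def occupation(a, v0, c, s):
    """Occupation chosen by an agent of ability a in the toy example."""
    return "entrepreneur" if v0 + c * a**2 >= s * a else "worker"


a_L, a_H = occupational_cutoffs(v0=2.0, c=1.0, s=3.0)
print(a_L, a_H, [occupation(a, 2.0, 1.0, 3.0) for a in (0.5, 1.5, 2.5)])
```

With these toy values, agents below the lower cutoff and above the upper cutoff start firms, while those in between work.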
Expected firm value $V(a)$ increases in $a$ if
$$\frac{dV}{da} = \frac{\partial V}{\partial a} + \frac{\partial V}{\partial \theta_R}\,\theta_R'(a) \tag{3}$$
is strictly positive. Changes in $a$ affect the value of starting a firm through their impact on the distribution of productivities an agent faces, and through their effect on the optimal policy. Clearly, $\partial V/\partial a > 0$ if $\rho > 0$. For the second part, it is necessary to investigate the shape of $\theta_R(a)$. To find this, taking the derivative of equation (2) with respect to $a$ yields


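$$F'(\theta_R)\,\theta_R'(a) = \frac{dV}{da} = \frac{\partial V}{\partial a} + \frac{\partial V}{\partial \theta_R}\,\theta_R'(a). \tag{4}$$
Solving this for $\theta_R'(a)$ gives
$$\theta_R'(a) = \frac{\partial V/\partial a}{F'(\theta_R) - \partial V/\partial \theta_R}. \tag{5}$$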

Again, the numerator is positive for $\rho > 0$. The denominator is positive by the following argument. Because of the max operator, $E[\max(F(\theta), F(\bar\theta))|a] > F(\bar\theta)$ for any $\bar\theta$. However, the difference between the two shrinks as we contemplate higher $\bar\theta$. Hence, $F$ catches up with $V$,
implying that $F'(\theta_R) > \partial V/\partial\theta_R$. (Note for later use that this also implies that $\partial^2 V/\partial\theta_R^2 > F''(\theta_R)$.) With both numerator and denominator positive, $\theta_R'(a) > 0$, i.e. entrepreneurs with a higher $a$ set a higher reservation productivity. Fundamentally, they can be more demanding because they draw from a better distribution. As a consequence, both terms making up $dV/da$ are positive: more able entrepreneurs expect a higher firm value because they face a better distribution, and because they optimally set a higher reservation productivity.
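The reservation-policy logic can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are not taken from the paper: profits are linear in productivity, a rejected draw yields the discounted value `beta * V` of trying again next period (so the reservation productivity equals `beta * V`), and the log of productivity conditional on ability is normal with the moments given in the Appendix. All function names and parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cond_moments(a, rho, sig_t, sig_a, mu_t=0.0, mu_a=0.0):
    """Mean and std. dev. of ln(theta) given a (bivariate lognormal case)."""
    m = mu_t + rho * sig_t / sig_a * (math.log(a) - mu_a)
    s = sig_t * math.sqrt(1.0 - rho ** 2)
    return m, s


def expected_max(m, s, c):
    """E[max(theta, c)] for ln(theta) ~ N(m, s^2) and a cutoff c > 0."""
    z = (math.log(c) - m) / s
    return c * norm_cdf(z) + math.exp(m + 0.5 * s * s) * norm_cdf(s - z)


def solve_entry_value(a, rho=0.5, sig_t=1.5, sig_a=0.5, beta=0.9, tol=1e-12):
    """Iterate V = E[max(theta, beta * V) | a] to its fixed point.

    Returns (V, theta_R, one_draw) with theta_R = beta * V (assumed linear
    profits) and one_draw = E[theta | a], the value without the option to retry.
    """
    m, s = cond_moments(a, rho, sig_t, sig_a)
    one_draw = math.exp(m + 0.5 * s * s)
    v = one_draw
    for _ in range(10_000):
        v_new = expected_max(m, s, beta * v)
        converged = abs(v_new - v) <= tol * max(1.0, abs(v))
        v = v_new
        if converged:
            break
    return v, beta * v, one_draw


if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0):
        v, theta_r, one_draw = solve_entry_value(a)
        print(f"a={a}: V={v:.3f}, theta_R={theta_r:.3f}, E[theta|a]={one_draw:.3f}")
```

In this stylized setting the value with the option to reject exceeds the one-draw value, and the reservation productivity rises with ability, in line with the argument above. It is not meant to reproduce the full model, which also involves the wage and the profit function.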
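The shape of $V(a)$ is governed by its second derivative
$$\frac{d^2V}{da^2} = \frac{\partial^2 V}{\partial a^2} + \frac{\partial V}{\partial\theta_R}\,\theta_R''(a) + \frac{\partial^2 V}{\partial\theta_R^2}\,\theta_R'(a). \tag{6}$$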

The first term on the right hand side, i.e. the second derivative of $V$ with respect to $a$ keeping the reservation productivity fixed, is clearly positive if $E[\theta|a]$ is weakly convex in $a$.6 The second derivative of $V$ with respect to $\theta_R$, keeping $a$ fixed, equals $\Pr(\theta < \theta_R|a)F''(\theta_R) + F'(\theta_R)\,\partial\Pr(\theta < \theta_R|a)/\partial\theta_R$, which is positive by convexity of $F(\cdot)$. The sign of the middle term in equation (6) depends on the sign of $\theta_R''(a)$. From taking the derivative of (4) with respect to $a$ and solving for $\theta_R''(a)$, this is
$$\theta_R''(a) = \frac{\theta_R'(a)}{\partial V/\partial a}\left[\frac{\partial^2 V}{\partial a^2} + \theta_R'(a)\left(\frac{\partial^2 V}{\partial\theta_R^2} - F''(\theta_R)\right)\right]. \tag{7}$$
From above we know that the term in round brackets is strictly positive, so $\theta_R''(a) > 0$, and hence $V(a)$ is convex in $a$.


The last expression shows that while the shape of $E[\theta|a]$ matters for the shapes of $V(a)$ and of the reservation productivity $\theta_R(a)$, it does not drive them alone. The effect of being able to try again, the option behind the reservation policy, enters through $\partial^2 V/\partial\theta_R^2 - F''(\theta_R) > 0$. Effectively, the reservation policy puts a floor under optimal realized firm value, and thereby raises expected firm value. This also implies that $\lim_{a \to 0} V(a)$ is strictly positive, so that for the least able workers, starting a firm is always optimal. With these people starting firms, $V(a)$ convex, and $W(a)$ linear, it is also clear that the equilibrium must feature two regions where agents start firms, for $a \leq a_L$ and for $a \geq a_H$, where $a_L$ and $a_H$ are the $a$ for which $V(a) = W(a)$, with $a_L < a_H$. (A situation where everyone starts a firm cannot be an equilibrium as total labor input would be zero, so the marginal product would approach infinity, and high wages would convince some individuals to work.) Note that the argument also goes through with linear $F(\cdot)$. This is because it is the choice to keep or dump a project, not convexity of $F$, that drives convexity of $V$.
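The geometry of the two cutoffs can be seen in a toy example. The sketch below uses purely hypothetical functional forms, not the model's solution: a convex value of entry with a positive intercept, `V(a) = 1 + a**2`, and a linear value of working, `W(a) = 2.5 * a`. The helper `find_cutoffs` is likewise an illustrative name; it scans a grid for sign changes of `V - W` and refines each by bisection.

```python
def find_cutoffs(V, W, lo, hi, n=1_000, iters=60):
    """Locate abilities in [lo, hi] where V(a) - W(a) changes sign."""
    def diff(a):
        return V(a) - W(a)

    cutoffs = []
    step = (hi - lo) / n
    for i in range(n):
        left, right = lo + i * step, lo + (i + 1) * step
        if diff(left) * diff(right) >= 0:
            continue  # no sign change inside this grid cell
        for _ in range(iters):  # bisection on the bracketing cell
            mid = 0.5 * (left + right)
            if diff(left) * diff(mid) <= 0:
                right = mid
            else:
                left = mid
        cutoffs.append(0.5 * (left + right))
    return cutoffs


def V(a):
    """Hypothetical convex value of entry with a positive floor at a = 0."""
    return 1.0 + a ** 2


def W(a):
    """Hypothetical linear value of working."""
    return 2.5 * a


a_L, a_H = find_cutoffs(V, W, 0.01, 5.0)
print(a_L, a_H)
```

With these forms, entry is preferred below the lower cutoff and above the upper one, and working is preferred in between. A steeper `W` (a higher wage) moves the two cutoffs apart.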
What is the economic intuition behind the result? First of all, it is important to see that
it does not come purely from the convexity of $F(\cdot)$. Suppose $E[\theta|a]$ is linear in $a$. Then, by
bringing close to zero, the first term in (6) can be made arbitrarily small. The other source of
convexity coming from the reservation productivity, however, remains. Hence, it is the ability to reject low draws and try to start a new, more productive project that makes $V(a)$ strictly convex. It is thus the potential to choose to keep good projects, and discard bad ones, that matters for the result. This choice makes $V(a)$ convex by putting a floor under the firm value: there is a limit to how low productivity can be because too low values can be rejected. Figure 1 depicts the values of the choices as a function of $a$.

6 This condition is somewhat stronger than needed. $V(a)$ is still convex for some slightly concave $E[\theta|a]$.
Why is it the people at the extremes of the ability distribution who become entrepreneurs?
The decision to start a firm involves a tradeoff between costs (foregone wages) and benefits
(future profits). For high-ability people, future profits are particularly high, so they start firms.
For low-ability people, future profits are not particularly high, but the opportunity cost is very
low. This matters in particular if repeated attempts are needed to find a good project. In this
case, there may be several periods of foregone wages, for benefits several periods in the future.
But foregone wages matter less for people expecting low wages. For agents with intermediate
ability, the trade-off is not favorable, costs exceed benefits, so they work. Thus, an interaction of
heterogeneous benefits with heterogeneous outside options generates selection from the extremes
of the ability distribution.
Figure 1: The value of starting a firm ($V(a)$, red, convex line) and of working ($W(a)$, blue line)
In this way, the model explains a feature of the firm size distribution not commonly well explained in existing models: despite low profits, small firms are a persistent feature of the firm size distribution because their owners have even lower outside options in the labor market. This effect is particularly strong in industries with low entry costs. The result is also consistent with a substantial fraction of firms making little profit, as documented by Moskowitz and Vissing-Jørgensen (2002) (and dubbed a possible "Private Equity Premium Puzzle"). While they reject quite a number of possible explanations of this pattern, a low outside option is a new candidate. A better quantitative characterization of the firm size and profit distribution in the model here [in progress] would help in assessing how far it goes.
The argument is related to but distinct from that made by Steinberger and Hintermaier
(2005). They argue that it is valuable for agents to find out about their (permanent, ex ante
unknown) entrepreneurial ability. Because this information has value, they are ready to suffer
some possible low returns in order to obtain it. This is somewhat similar to the element of
experimentation present here but quite different from the effects of heterogeneous ability. In
particular, their argument can explain why low returns occur, but it cannot explain the persistence of small low-profit firms, which arises naturally in a setting with heterogeneous ability in
wage work.
The result that there are always two regions with entrepreneurs relies on the domain of $\theta$ stretching from zero to infinity. Truncating it might wipe out one of the groups, but would hardly lead to a more general setting. Entry costs, however, be they administrative or for financing some sunk investment, may matter. Without them, the cost of entry consists only in foregone wages. Adding entry costs would shift the value of starting a firm down, more so the more attempts are needed to find a good project. It would also reduce the reservation productivity, as repeated attempts become more costly. As a result, it is possible that $a_L$ drops to zero, wiping out the region of low-$a$ entrepreneurs. (The high-$a$ region would persist as $V$ would remain convex and eventually exceed $W$ for high enough $a$.) Empirically, this may be plausible, but only up to a point, as low-$a$ individuals could choose low-entry cost industries. Indeed, Lofstrom and Bates
(2007) find that college-educated people are more likely to enter high-barrier industries, and
people with less education are more likely to enter low-barrier industries. Similarly, Hurst and
Lusardi (2004) show that in data from the 1987 National Survey of Small Business Finances
(NSSBF) 25% of new firms were started with less than $5,000 in capital, and the median starting
capital provided by the founder was $22,700. Hence, choice of industry helps overcome entry
barriers to some extent.
To close the model, the firm productivity distribution can easily be obtained by equating inflows and outflows. Let $\mu(a, \theta)$ be the measure of firms of productivity $\theta$ with an owner of ability $a$. This is only positive for $a < a_L$ or $a > a_H$ (it is optimal for the owner to start a firm) and for $\theta \geq \theta_R(a)$ (the firm is productive enough so that the project is pursued), so focus only on that case. A fraction $\lambda$ of these firms exit every period due to retirement. Entry comes from entrants to the labor market who choose to start a firm, or from entrepreneurs who previously attempted entry, but failed to generate a productive enough project. The inflow into $\mu(a, \theta)$ hence is
$$\lambda\phi(a,\theta) + \lambda\phi(a,\theta)(1-\lambda)\Phi(\theta_R(a)|a) + \lambda\phi(a,\theta)(1-\lambda)^2\Phi(\theta_R(a)|a)^2 + \ldots = \lambda\phi(a,\theta)\left[1 + \frac{(1-\lambda)\Phi(\theta_R(a)|a)}{1-(1-\lambda)\Phi(\theta_R(a)|a)}\right]$$
where $\phi(\cdot)$ is the joint pdf of $a$ and $\theta$, and $\Phi(\cdot)$ is the cdf. With an outflow of $\lambda\mu(a,\theta)$, the stock is given by
$$\mu(a,\theta) = \frac{\phi(a,\theta)}{1-(1-\lambda)\Phi(\theta_R(a)|a)}. \tag{8}$$
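The geometric series behind equation (8) can be checked numerically. In the sketch below, `lam` stands for the per-period exit (retirement) rate, `phi` for the density of a given (ability, productivity) pair among new draws, and `Phi_R` for the probability that a draw falls below the reservation productivity. These argument names and the numbers used are illustrative assumptions, and the inflow is assumed to be scaled by the exit rate so that it balances the outflow.

```python
def inflow_series(phi, lam, Phi_R, n_terms=500):
    """Sum the inflow over first attempts, second attempts, and so on."""
    q = (1.0 - lam) * Phi_R  # survive one period AND reject the draw
    return sum(lam * phi * q ** k for k in range(n_terms))


def stock(phi, lam, Phi_R):
    """Stationary measure of firms implied by inflow = outflow."""
    return phi / (1.0 - (1.0 - lam) * Phi_R)


lam, phi = 0.05, 0.2
for Phi_R in (0.0, 0.3, 0.6):
    inflow = inflow_series(phi, lam, Phi_R)
    outflow = lam * stock(phi, lam, Phi_R)
    print(Phi_R, inflow, outflow)
```

The truncated sum and the closed form agree, and the stock above the threshold grows with the rejection probability: retries shift mass towards accepted draws.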

Note that while the owner ability distribution features two disjoint parts, this does not have to be the case for the firm productivity distribution. This is of course important, as the empirical distribution is smooth. If $a_L$ and $a_H$ are not too far apart, some (relatively) high-productivity firms operated by low-ability people and borderline firms operated by agents just above $a_H$ may have similar levels of productivity, in particular if $\sigma_\theta$ is significantly larger than $\sigma_a$.7 Variation in the taste for running one's own business and in risk aversion, dimensions abstracted from in this paper, would also help to smooth the owner ability distribution and, as a consequence, the firm productivity distribution. Note that while they would also help to explain the existence of small firms (some of them persist because some psychological benefits compensate the owner for lower income), they would not on their own explain the U-shaped relationship between education and entrepreneurship, so heterogeneous ability and selection remain crucial.
Equation (8) also shows the effect that the ability to reject bad draws has on the firm size distribution. Given any $a$, a higher reservation productivity $\theta_R$ raises the probability $\Phi(\theta_R)$ of rejecting a given draw. It thereby increases the proportion of firms that ultimately end up with $\theta > \theta_R$ relative to that fraction without rejection. For any $a \leq a_L$ or $a \geq a_H$, as starting entrepreneurs reject low productivity draws, they will eventually end up with a good draw. The more stringent the threshold, the stronger this redistribution of probability mass to the area above the threshold has to be. While taking time, this benefits aggregate productivity.
Aggregate labor demand follows directly from the firm productivity distribution and the wage. It increases in the number of firms and decreases in the wage. Letting the marginal distribution of $a$ be $\Phi_a(\cdot)$, labor supply is given by $\Phi_a(a_H) - \Phi_a(a_L)$, the measure of people for whom it is optimal to work. As a higher wage makes $W(a)$ steeper, this increases in the wage for any $V(a)$. The equilibrium wage then is determined by labor market clearing.

7 For an illustration of what this can look like, see Figure 2 in the Appendix from an old, obsolete version of the paper, updating in progress.

Relationship to the Roy (1951) model of occupational choice

Assumption 1, or its parametric counterpart for the bivariate lognormal distribution, $\rho\sigma_\theta/\sigma_a \geq 1$, looks familiar from the Roy (1951) model of occupational choice. There, workers choose between two sectors of activity. Payoffs are known to the workers. In the aggregate, they are bivariate lognormally distributed, with a correlation of $\rho$ between the logarithms of the payoffs. Let the standard deviations of the logarithm of the random payoffs be $\sigma_\theta$ and $\sigma_a$, respectively, and assume $\sigma_\theta > \sigma_a$. Then, if $\rho\sigma_\theta/\sigma_a \geq 1$, outputs are relatively highly correlated, and relatively productive workers tend to choose the sector with the higher variance. This increases the variance of the income distribution.
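The selection pattern in the Roy model can be illustrated with a small simulation. This is a sketch under assumed parameter values, with hypothetical names: each worker has known log payoffs in a high-variance and a low-variance sector (both with mean zero), picks the sector with the higher payoff, and we record the average low-variance-sector log payoff of those who pick the high-variance sector. A positive average means that the workers who choose the high-variance sector would also be above average in the other one.

```python
import math
import random


def roy_selection(rho, sig_hi, sig_lo, n=50_000, seed=0):
    """Mean low-variance-sector log payoff among high-variance-sector choosers."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        log_lo = sig_lo * z1
        # correlated draw: corr(log_hi, log_lo) = rho
        log_hi = sig_hi * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        if log_hi > log_lo:  # worker chooses the high-variance sector
            total += log_lo
            count += 1
    return total / count


high_corr = roy_selection(rho=0.8, sig_hi=2.0, sig_lo=1.0)  # rho*sig_hi/sig_lo = 1.6
low_corr = roy_selection(rho=0.2, sig_hi=2.0, sig_lo=1.0)   # rho*sig_hi/sig_lo = 0.4
print(high_corr, low_corr)
```

When the ratio exceeds one, the high-variance sector draws workers who are above average in both sectors; when it is below one, the selection on the low-variance-sector payoff turns negative.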
This pattern is similar to what obtains in the model presented here. If Assumption 1 holds, and the variance of the return to starting a firm is relatively high, more able workers choose to start firms. However, the model has an additional prediction: entry by low-ability workers. This is driven not by the elements akin to the Roy model, but by entrants' capacity to retain good projects and dump bad ones, which is particularly valuable if the opportunity cost of new attempts is low.
To see this more clearly, the model could be brought close to the Roy model by eliminating entrepreneurs' capacity to start again with a new project, and restricting them to accept their first draw and to stay in business thereafter (certainly a draconian restriction). By Jensen's inequality, expected firm value then is convex and bounded below by $F(E(\theta|a))$ if $\rho\sigma_\theta/\sigma_a > 1$. This implies that high-$a$ individuals optimally start firms, choosing the sector with the higher variance, as in the Roy model. For low-$a$ individuals, absent the ability to try a new project, $\lim_{a \to 0} F(E(\theta|a)) = 0$. Expected firm value lies slightly above this but not much. So convexity of the profit function combined with uncertainty about a project's productivity implies that, differently from the Roy model, some agents with very low $a$ start firms. However, they would disappear as $F$ becomes more linear. The bulk of low-$a$ individuals, thus, choose to become wage workers, choosing the low-variance sector, again as in the Roy model.
What really increases the expected value of starting a firm for low-$a$ individuals is the ability to drop bad projects and keep trying until they find a good one. This matters most for low-ability agents as the cost of doing this consists in foregone wages, which are particularly low for them, giving them a comparative advantage in keeping trying. Hence, whereas convexity of the profit function already convinces a small group of low-$a$ agents to start firms, the feature that may make this phenomenon quantitatively important, and that distinguishes the predictions of the model here more strongly from those of the Roy model, is the ability to try again.

$\rho$ and the allocation of talent

One important feature of the economy described here is the influence of $\rho$, the correlation between productive ability and the quality of the ideas generated, on occupational choice and welfare. Note in passing that $\rho$ is a parameter of technology, not information, so this is not an issue of imperfections in some sense, but a feature of the available technology.
It is easiest to understand its impact starting from some extreme cases. First, suppose that $\rho = 0$. Then, the expected project quality is the same, regardless of an entrepreneur's ability $a$. As a consequence, the expected value of entry $V(a)$ is also independent of $a$. With wages increasing in $a$, this implies that in this case, there is only one cutoff, $a_L$, and all agents with $a$ below that cutoff become entrepreneurs. All agents have the same expected payoff from starting a firm, so they take their decisions purely on the basis of their outside option: those with the highest ability become wage workers. Given $\rho = 0$, this is an efficient outcome. This result also obtains if $\rho < 0$, as shown by Jovanovic (1994).
Now suppose that $\rho = 1$. In this case, given $a$, everyone is certain about the quality of a
project they could obtain. Hence, firm value is known before entry, and is increasing and convex
for > 0. High-ability workers thus start firms. Something crucial happens in this case for
low-ability agents, however. With perfectly known $\theta$, they lose the ability to keep good projects
and drop bad ones, as every draw would be the same. (The same occurs for = 0.) The
floor under the firm value introduced by the ability to reject bad draws disappears. As a result, $\lim_{a \to 0} V(a) = 0$, $a_L = 0$, and the group of low-ability entrepreneurs disappears. There are only high-ability entrepreneurs. This case is similar to the outcome of e.g. the Lucas (1978) model of the firm size distribution.
For $0 < \rho < 1$, however, both groups of entrepreneurs exist. How do they change with $\rho$? An increase in $\rho$ raises $E[\theta|a]$ for high $a$ and reduces it for low $a$, thereby favoring the entrepreneurial prospects of high-ability agents. Expected firm value becomes steeper in $a$, rises for high $a$, and declines for low $a$. As a result, both $a_H$ and $a_L$ tend to fall. Whereas it is hard to say anything about the resulting change in the number of entrepreneurs in general, it is clear that their composition changes towards more high-ability and high-productivity ones.
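The effect of the correlation on expected project quality can be made concrete with the lognormal formula from the Appendix. The sketch below is illustrative only: the function name, the default parameter values (unit standard deviations and zero means of the logs) and the chosen ability levels are assumptions, not values from the paper.

```python
import math


def cond_mean_theta(a, rho, sig_t=1.0, sig_a=1.0, mu_t=0.0, mu_a=0.0):
    """E[theta | a] when (ln theta, ln a) are bivariate normal."""
    slope = rho * sig_t / sig_a
    return math.exp(mu_t + slope * (math.log(a) - mu_a)
                    + 0.5 * (1.0 - rho ** 2) * sig_t ** 2)


for a in (0.2, 20.0):
    print(a, [round(cond_mean_theta(a, rho), 3) for rho in (0.0, 0.3, 0.6)])
```

With zero correlation the expected quality is the same for every ability level. Raising the correlation lowers it for the low-ability agent and raises it for the high-ability one, which is the tilt in favor of high-ability entrepreneurs described above.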

Empirical evidence of strictly positive $\rho$ is strong. While the empirical literature is grappling with the effect of education on the probability of being an entrepreneur, the positive effect on income both as a worker and as an entrepreneur is pretty robust. Nevertheless, the evidence presented in Section 2 suggests that $\rho$ must be some way below 1.
Useful bounds could be put on it by calibrating the model to the firm size distribution; this work is in progress.

Concluding remarks

Extension in progress: labor market frictions; entry by workers vs the unemployed.


References
Bartelsman, E., Scarpetta, S. and Schivardi, F. (2003), Comparative Analysis of Firm Demographics and Survival: Micro-Level Evidence for the OECD Countries, OECD Economics Department Working Paper 348.
Blanchflower, D. G. (2000), Self-employment in OECD countries, Labour Economics 7, 471–505.
Borjas, G. J. and Bronars, S. G. (1989), Consumer Discrimination and Self-Employment, Journal of Political Economy 97(3), 581–605.
Cagetti, M. and De Nardi, M. (2006), Entrepreneurship, Frictions, and Wealth, Journal of Political Economy 114(5), 835–870.
Chatterjee, S. and Rossi-Hansberg, E. (2007), Spin-offs and the Market for Ideas, mimeo.
Evans, D. S. and Leighton, L. S. (1989), Some Empirical Aspects of Entrepreneurship, American Economic Review 79(3), 519–535.
Hamilton, B. H. (2000), Does Entrepreneurship Pay? An Empirical Analysis of the Returns to Self-Employment, Journal of Political Economy 108(3), 604–631.
Hipple, S. (2004), Self-employment in the United States: an update, Monthly Labor Review 127(7), 13–23.
Hopenhayn, H. (1992), Entry, Exit, and Firm Dynamics in Long Run Equilibrium, Econometrica 60(5), 1127–1150.
Hurst, E. and Lusardi, A. (2004), Liquidity Constraints, Household Wealth, and Entrepreneurship, Journal of Political Economy 112(2), 319–347.
Jovanovic, B. (1982), Selection and the Evolution of Industry, Econometrica 50(3), 649–670.
Jovanovic, B. (1994), Firm Formation with Heterogeneous Management and Labor Skills, Small Business Economics 6, 185–191.
Kihlstrom, R. and Laffont, J.-J. (1979), A General Equilibrium Entrepreneurial Theory of Firm Formation Based on Risk Aversion, Journal of Political Economy 87(4), 719–748.
Lazear, E. P. (2005), Entrepreneurship, Journal of Labor Economics 23(4), 649–680.
Lin, Z., Picot, G. and Compton, J. (2000), The Entry and Exit Dynamics of Self-Employment in Canada, Small Business Economics 15(2), 105–125.
Lofstrom, M. and Bates, T. (2007), African Americans' Pursuit of Self-Employment, IZA Discussion Paper 3156.
Lucas, R. E. (1978), On the size distribution of business firms, Bell Journal of Economics 9, 508–523.
Luttmer, E. (2007), Selection, Growth and the Size Distribution of Firms, Quarterly Journal of Economics 122(3).
McCall, J. J. (1970), Economics of Information and Job Search, Quarterly Journal of Economics 84(1), 113–126.
Moskowitz, T. J. and Vissing-Jørgensen, A. (2002), The Returns to Entrepreneurial Investment: A Private Equity Premium Puzzle?, American Economic Review 92(4), 745–778.
Roy, A. D. (1951), Some Thoughts on the Distribution of Earnings, Oxford Economic Papers 3(2), 135–146.
Schjerning, B. and Le Maire, D. (2007), Earnings, Uncertainty, and the Self-Employment Choice, Center for Economics and Business Research Discussion Paper 2007-04 04.
Steinberger, T. and Hintermaier, T. (2005), Occupational choice and the private equity premium puzzle, Journal of Economic Dynamics and Control 29, 1765–1783.
van der Sluis, J., van Praag, C. and Vijverberg, W. (2003), Education and Entrepreneurship in Industrialized Countries, Tinbergen Institute Discussion Paper 2003-046/3.


Appendix
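A The shape of $E[\theta|a]$

With bivariate lognormal $\theta$ and $a$ (assume $\sigma_\theta, \sigma_a > 0$),
$$E[\ln\theta|a] = \mu_\theta + \rho\frac{\sigma_\theta}{\sigma_a}(\ln a - \mu_a),$$
$$\mathrm{var}(\ln\theta|a) = \sigma_\theta^2 - \frac{\mathrm{cov}(\ln\theta, \ln a)^2}{\sigma_a^2} = (1-\rho^2)\sigma_\theta^2.$$
Then
$$E[\theta|a] = \exp\left(E[\ln\theta|a] + \mathrm{var}(\ln\theta|a)/2\right) = \exp\left(\mu_\theta + \rho\frac{\sigma_\theta}{\sigma_a}(\ln a - \mu_a) + \frac{(1-\rho^2)\sigma_\theta^2}{2}\right).$$
It is easy to show that $E[\theta|a]$ is increasing in $a$ if $\rho > 0$, and is convex in $a$ if $\rho\sigma_\theta/\sigma_a > 1$.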
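The last claim can be verified in one step. Write $\mu_\theta, \mu_a$ for the means and $\sigma_\theta, \sigma_a$ for the standard deviations of $\ln\theta$ and $\ln a$, and $\rho$ for their correlation. The shorthand $b$ and $K$ below is introduced here for convenience and is not the paper's notation.

```latex
% Collect everything that does not depend on a into a constant K > 0.
b \equiv \rho\,\frac{\sigma_\theta}{\sigma_a}, \qquad
K \equiv \exp\!\left(\mu_\theta - b\,\mu_a + \frac{(1-\rho^2)\sigma_\theta^2}{2}\right) > 0,
\qquad\text{so that}\qquad E[\theta|a] = K\,a^{b}.

% First and second derivatives with respect to a (for a > 0):
\frac{\partial E[\theta|a]}{\partial a} = b\,K\,a^{b-1},
\qquad
\frac{\partial^2 E[\theta|a]}{\partial a^2} = b\,(b-1)\,K\,a^{b-2}.
```

Since $K a^{b-1} > 0$, the first derivative has the sign of $b$, hence of $\rho$. For $b > 0$ the second derivative has the sign of $b - 1$, so $E[\theta|a]$ is strictly convex for $\rho\sigma_\theta/\sigma_a > 1$ and linear, hence weakly convex, at equality.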


Figure 2: Firm size distribution for $\rho$ = 0.5, 0.75, 0.95. Attention: from old version of the paper.

Blue line: firm size (employment) distribution. Green lines: firm size distribution for low-ability ($a \leq a_L$) and high-ability ($a \geq a_H$) entrepreneurs. Red line: firm size distribution if everyone was an entrepreneur and had one draw only ($\mu(a, \theta) = \phi(a, \theta)$).
