
The Pareto Distribution

2 pareto-distribution.nb

Introduction

The Pareto Distribution

named after Vilfredo Pareto (1848-1923, Italian economist)

also known as Bradford distribution

continuous power-law probability distribution



Pareto's Proposal

Pareto proposed that the number of people (N) with incomes higher than x can be modeled log-linearly:

$\log N = \log A - a \log x$

Letting the total population be $N_0$ and the minimum income be $x_0$, so that $\log N_0 = \log A - a \log x_0$, we can write this in proportionate terms as

$\log (N / N_0) = -a \log (x / x_0)$

Note that for $x_1, x_2 > x_0$ and associated $N_1, N_2$, this also implies

$\log (N_1 / N_2) = -a \log (x_1 / x_2)$

So the same relationship holds in any tail of the income distribution.
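
Exponentiating the proportionate form makes the link to a tail probability explicit. A short derivation, using only the symbols defined above:

```latex
\frac{N}{N_0}
  = \exp\!\left(-a \log \frac{x}{x_0}\right)
  = \left(\frac{x}{x_0}\right)^{-a}
  = \left(\frac{x_0}{x}\right)^{a},
  \qquad x \ge x_0 .
```

Read as a share of the population, N/N0 is the probability of observing an income above x.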



Pareto Distribution (Tail Function)

The Pareto Distribution is often presented in terms of its survival function (or reliability function, or tail
function), which gives the probability of seeing larger values than x. (I.e., it is 1-CDF; see below.) The
survival function is

Clear[x0, a, x];
SurvivalFunction[ParetoDistribution[x0, a], x]

(x/x0)^(-a)    x >= x0
1              True

Here x0 > 0 is the scale parameter (the minimum value, sometimes loosely called the location parameter), and a > 0 is the shape parameter (or slope parameter, or Pareto index). We are only interested in x > x0, and we are usually interested in a > 1 (which is required for a finite expected value).

Plot[% /. {x0 -> 10000, a -> 2}, {x, 0, 100000}]

[Plot: the survival function for x0 = 10000, a = 2, with x from 0 to 100000 and the vertical axis from 0 to 1.]


Relation to Exponential

Here is the survival function of the exponential distribution:

$R(y) = P[Y > y] = e^{-a y} = (e^{y})^{-a}$

Define $X = x_0 e^{Y}$, where Y has an exponential distribution. Then

$P[X > x] = P[x_0 e^{Y} > x] = P[e^{Y} > x/x_0] = P[Y > \log(x/x_0)] = (x/x_0)^{-a}$

Comparing to our survival function for the Pareto distribution, we see that X has a Pareto distribution.
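
The argument can be spot-checked numerically. A minimal sketch, assuming the illustrative values x0 = 10000, a = 2 and x = 100000 used elsewhere in these slides; the names expSide and paretoSide are introduced only for this check:

```wolfram
(* Exponential survival function with rate a = 2, evaluated at Log[x/x0]. *)
expSide = SurvivalFunction[ExponentialDistribution[2], Log[100000/10000]];

(* Pareto survival function with x0 = 10000 and a = 2, evaluated at x. *)
paretoSide = SurvivalFunction[ParetoDistribution[10000, 2], 100000];

(* The two probabilities should coincide (both about 0.01). *)
N[{expSide, paretoSide}]
```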

Survival (Tail) Function



Log-Linear Survival

Let's focus on values greater than the minimum:

tailPareto = Simplify[
  SurvivalFunction[ParetoDistribution[x0, a], x],
  Assumptions -> x > x0 > 0]

(x0/x)^a

Note that the log of the survival probability is linear in $\log(x/x_0)$. We can say that the size elasticity of the survival rate is $-a$, that is, a in absolute value. (We will return to this.)

Simplify[
  Log[tailPareto],
  Assumptions -> x > x0 > 0 && a > 0]

a Log[x0/x]

Survival and the Pareto Index


In[370]:=
plotoptions =
  {PlotRange -> {{0, 100000}, {0, 1}}, AxesLabel -> {"Income", "Survival Rate"},
   ImageSize -> 250, Ticks -> {1000*{20, 40, 80}, {0, 0.25, 0.5, 1}}}
Manipulate[GraphicsRow[{
   Plot[SurvivalFunction[ParetoDistribution[10000, $a], $x],
    {$x, 10000, 100000}, Evaluate[plotoptions]],
   LogLogPlot[SurvivalFunction[ParetoDistribution[10000, $a], $x],
    {$x, 10000, 100000}, Evaluate[plotoptions]]
   }],
 {{$a, 2, "Pareto index"}, 1, 5}]

Out[370]=
{PlotRange -> {{0, 100000}, {0, 1}}, AxesLabel -> {Income, Survival Rate},
 ImageSize -> 250, Ticks -> {{20000, 40000, 80000}, {0, 0.25, 0.5, 1}}}

Out[371]=
[Interactive plot with a "Pareto index" slider: survival rate against income, on linear axes (left) and on log-log axes (right).]

Survival: Intuition

Consider a population of households and suppose sampling household incomes is like sampling from a Pareto[10000, 2].
What proportion of households earn more than $100,000? From the form of the survival function, the answer is 1%, since (10000/100000)^2 = 1/100: only 1 in 100 households earn more than $100,000.

SurvivalFunction[ParetoDistribution[10000, 2], 100000]

1/100

Note: given a = 2 and any x0, we find that 1% of the population has income greater than 10*x0. This is why the Pareto distribution (along with other power law distributions) is called scale free.

Simplify[
  SurvivalFunction[ParetoDistribution[x0, 2], 10*x0],
  Assumptions -> x0 > 0]

1/100

What is more, this relationship holds for subgroups as well: only 1% of the top 1% will have incomes another ten times higher. This typifies a continuous power law distribution.

Simplify[
  SurvivalFunction[ParetoDistribution[x0, 2], 100*x0]/
   SurvivalFunction[ParetoDistribution[x0, 2], 10*x0],
  Assumptions -> x0 > 0]

1/100
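
These results are instances of one identity. As a sketch, write S for the survival function and let k >= 1 be any scaling factor (k is a symbol introduced here for illustration). For any x >= x0:

```latex
\frac{S(kx)}{S(x)}
  = \frac{\left(x_0 / (kx)\right)^{a}}{\left(x_0 / x\right)^{a}}
  = k^{-a}
```

With a = 2 and k = 10 the ratio is 1/100 whatever the values of x and x0, which is the sense in which the distribution is scale free.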

Pareto Distribution: CDF and PDF

The cumulative distribution function (CDF) gives the probability of seeing a given size or lower. Note that x0 is the minimum value; it is the scale parameter of the distribution (sometimes loosely called the location parameter).

cdfPareto = Simplify[
  CDF[ParetoDistribution[x0, a], x],
  Assumptions -> x > x0 > 0]

1 - (x0/x)^a

The PDF is of course the derivative of the CDF.

pdfPareto = Simplify[
  PDF[ParetoDistribution[x0, a], x],
  Assumptions -> x > x0 > 0]
pdfPareto == D[cdfPareto, x] // PowerExpand

a x^(-1 - a) x0^a

True
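
The differentiation that the check above confirms can be written out by hand. A short worked step, with f denoting the PDF (a label introduced here):

```latex
f(x) = \frac{d}{dx}\left[1 - \left(\frac{x_0}{x}\right)^{a}\right]
     = \frac{d}{dx}\left[1 - x_0^{a}\, x^{-a}\right]
     = a\, x_0^{a}\, x^{-a-1},
     \qquad x > x_0 .
```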

CDF and PDF of Pareto Distributions

In[1047]:=
options01 = {PlotRange -> {{0, 5}, {0, 1}},
   AxesLabel -> {"Size", "CDF"}, ImageSize -> 250};
options02 =
  {PlotRange -> {{0, 5}, {0, 5}}, AxesLabel -> {"Size", "PDF"}, ImageSize -> 250};
Manipulate[
 GraphicsRow[{
   Plot[CDF[ParetoDistribution[1, $$a], $x], {$x, 1, 5}, Evaluate[options01]],
   Plot[PDF[ParetoDistribution[1, $$a], $x], {$x, 1, 5}, Evaluate[options02]]
   }],
 {{$$a, 1, "Pareto index"}, 0.1, 5}]

Out[1048]=
[Interactive plot with a "Pareto index" slider (shown at 2.04): CDF (left) and PDF (right) against size, for sizes from 0 to 5.]

PDF of Pareto Distributions: Loglinearity

options02 =
  {PlotRange -> {{0, 5}, {0, 5}}, AxesLabel -> {"Size", "PDF"}, ImageSize -> 250};
Manipulate[
 LogLogPlot[PDF[ParetoDistribution[1, $$a], $x],
  {$x, 1, 5}, Evaluate[options02]],
 {{$$a, 1, "Pareto index"}, 0.1, 5}]

[Interactive plot with a "Pareto index" slider: PDF against size on log-log axes.]

Continuous Power Law Distribution

A power law distribution with shape parameter a has probability density function
$p(x) = c\, x^{-(1+a)}$ for $x > x_0$

Clear[c, x, a]
pdfPower = c x^-(1 + a);

The constant c must be chosen so that total probability is one (unitarity). In the continuous case, we compute the integral

Assuming[x0 > 0 && a > 0,
 Integrate[pdfPower, {x, x0, +Infinity}]
 ]
soln = Solve[% == 1, c] // Flatten

(c x0^-a)/a

{c -> a x0^a}

Plugging in our solution, we get the PDF of the Pareto distribution.

pdfPower /. soln

a x^(-1 - a) x0^a

Lorenz Curve and Gini Coefficient



Average Size

Recall the PDF of the Pareto distribution.

pdfPareto

a x^(-1 - a) x0^a

Use this PDF to compute the average size of a draw from the Pareto distribution as $\int_{x_0}^{\infty} x\, p(x)\, dx$.

Clear[x0, x, a]
meanPareto = Assuming[x0 > 0 && a > 1,
  Integrate[x*pdfPareto, {x, x0, +Infinity}]
  ]

(a x0)/(-1 + a)
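
As a cross-check on the integral, the built-in Mean can be applied to the distribution directly. A small sketch; the numeric case x0 = 10000, a = 2 is the one used in the estimation slides later on:

```wolfram
(* Built-in mean of the Pareto distribution; for a > 1 it should agree with meanPareto. *)
Mean[ParetoDistribution[x0, a]]

(* Numeric instance: 10000*2/(2 - 1), i.e. 20000. *)
Mean[ParetoDistribution[10000, 2]]
```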

Proportion of Total Income

The weighted sum of all incomes less than or equal to t is

$\int_{x_0}^{t} x\, p(x)\, dx$

Clear[x0, x, a, t]
cumSize = Assuming[t > x0 > 0 && a > 1,
  Integrate[x*pdfPareto, {x, x0, t}]
  ]

(a t^-a (t^a x0 - t x0^a))/(-1 + a)

Dividing by the mean (i.e., the probability weighted sum of all incomes) produces an expression for the proportion of total income constituted by incomes of t or less: $1 - (t/x_0)^{1-a}$.
Note that this expression only makes sense for a > 1, the case in which the mean exists. Also note that it only depends on the ratio $t/x_0$.

cumShare = cumSize/meanPareto // Simplify
Simplify[cumShare == 1 - (t/x0)^(1 - a), Assumptions -> a > 1 && t > x0 > 0]

1 - t^(1 - a) x0^(-1 + a)

True
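
The same result can be reached by hand, which shows where the mean enters. A worked version of the integral, using the Pareto PDF $a\, x_0^{a} x^{-a-1}$:

```latex
\int_{x_0}^{t} x \cdot a\, x_0^{a}\, x^{-a-1}\, dx
  = a\, x_0^{a} \left[\frac{x^{1-a}}{1-a}\right]_{x_0}^{t}
  = \frac{a\, x_0}{a-1}\left[1 - \left(\frac{t}{x_0}\right)^{1-a}\right]
```

The leading factor $a x_0/(a-1)$ is the mean, so dividing by the mean leaves only the bracketed term.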

Lorenz Curve

A Lorenz curve for income plots this cumulative share of income against the cumulative share of the population earning it.
We have just found that the cumulative share of income for incomes less than t can be written as $1 - (t/x_0)^{1-a}$. Recall that the CDF at income t, which gives the proportion of the population earning less than t, is $1 - (t/x_0)^{-a}$. So for any a > 1, we can make a parametric plot of the Lorenz curve. Defining $m = x_0/t$ we can write:

Manipulate[
 ParametricPlot[{1 - m^$a, 1 - m^($a - 1)}, {m, 0, 1},
  PlotRange -> {{0, 1}, {0, 1}}, AspectRatio -> 1, ImageSize -> Small],
 {{$a, 2}, 1.01, 5}]

[Interactive plot with a slider for $a: the Lorenz curve on the unit square.]

Lorenz Curve

Our parametric representation of the Lorenz curve can be used to derive a function that expresses the cumulative share of income as a function of the cumulative share of the population earning it. Recall that the CDF at income t gives the proportion of the population (say, $s(t)$) earning less than t. Since the CDF is strictly increasing for t > x0, we can produce the inverse function $t(s)$, which we can then substitute into cumShare.

ts = Solve[s == cdfPareto, x] /. {x -> t} // Flatten
Simplify[cumShare /. ts,
 Assumptions -> x0 > 0 && a > 1]
Manipulate[
 Plot[{s, 1 - (1 - s)^(($a - 1)/$a)}, {s, 0, 1}, AspectRatio -> 1],
 {{$a, 2}, 1.01, 5}]

Solve::ifun :
Inverse functions are being used by Solve, so some solutions may not be found; use Reduce for complete solution information.

{t -> (1 - s)^(-1/a) x0}

1 - (1 - s)^(1 - 1/a)

[Interactive plot with a slider for $a: the 45 degree line and the Lorenz curve, for s from 0 to 1.]
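
The substitution can also be followed by hand. A sketch of the two steps, writing L(s) for the Lorenz curve (a label introduced here):

```latex
s = 1 - \left(\frac{x_0}{t}\right)^{a}
\;\Longrightarrow\;
t(s) = x_0\,(1 - s)^{-1/a}
\;\Longrightarrow\;
L(s) = 1 - \left(\frac{t(s)}{x_0}\right)^{1-a} = 1 - (1 - s)^{1 - 1/a}
```

For example, with a = 2 the poorer half of the population holds $L(1/2) = 1 - (1/2)^{1/2} \approx 0.29$ of total income.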



Gini Coefficient

The Gini coefficient is twice the area between the 45 degree line and the Lorenz curve. We can calculate that area as

Assuming[a > 1,
 Integrate[s - 1 + (1 - s)^(1 - 1/a), {s, 0, 1}]
 ]
gini = 2*% // Simplify

1/(-2 + 4 a)

1/(-1 + 2 a)

Solve[g == gini, a]

{{a -> (1 + g)/(2 g)}}

Solve[d == 1 - 1/a, a] // Flatten
gini /. % // Simplify
Solve[g == %, d]

{a -> 1/(1 - d)}

(1 - d)/(1 + d)

{{d -> (1 - g)/(1 + g)}}
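
The integral behind the Gini coefficient is short enough to check by hand. A worked version, with G denoting the Gini coefficient (a label introduced here) and one numeric instance:

```latex
G = 2\int_0^1 \left[s - 1 + (1-s)^{1-1/a}\right] ds
  = 2\left[\frac{1}{2} - 1 + \frac{a}{2a-1}\right]
  = \frac{1}{2a-1},
\qquad a = 2 \;\Rightarrow\; G = \tfrac{1}{3}.
```

As a approaches 1 from above, G approaches 1 (maximal inequality); as a grows, G falls toward 0.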

Pareto Distribution and Lorenz Curve

L[F_, k_] := 1 - (1 - F)^(1 - 1./k);
options03 = {PlotRange -> {{0, 1}, {0, 1}}, AspectRatio -> 1, ImageSize -> 250};
Manipulate[Plot[{F, L[F, k], 1 - F}, {F, 0, 1}, Evaluate[options03]],
 {{k, 3, "Pareto index"}, 1, 10}]

[Interactive plot with a "Pareto index" slider: the 45 degree line F, the Lorenz curve L[F, k], and the line 1 - F, on the unit square.]

80-20 Rule: Pareto (1906) noticed that about 80% of the land in Italy was owned by about 20% of the population.

However, his British tax return data showed something closer to 70-30.

There will always be some such proportion: look for where the Lorenz curve crosses the unit simplex (the line 1 - F in the plot).

80-20 Rule

We have seen with a = 2 that 1% of the population has a size at least 10 times the minimum, and 1% of that 1% has a size 10 times that.

More generally, if a > 1 (so that the expected value is finite), there is some fraction $0 \le f \le 1/2$ such that f of those sampled receive $(1 - f)$ of all income, and similarly for every real (not necessarily integer) n > 0, $100 f^n$% of all people receive $100 (1 - f)^n$% of all income.

Assuming[a > 1,
 Solve[1 - s == 1 - (1 - s)^(1 - 1/a), s]
 ]
Assuming[1 > d > 0,
 Solve[s == (1 - s)^d, s]
 ]
Assuming[d > 0,
 Solve[s + s^(1/d) == 1, s]
 ]

Solve::nsmet : This system cannot be solved with the methods available to Solve.

Solve[1 - s == 1 - (1 - s)^(1 - 1/a), s]

Solve::nsmet : This system cannot be solved with the methods available to Solve.

Solve[s == (1 - s)^d, s]

Solve::nsmet : This system cannot be solved with the methods available to Solve.

Solve[s + s^(1/d) == 1, s]

Solve cannot isolate s, but it can solve for the parameter given s:

Solve[1 - s == 1 - (1 - s)^(1 - 1/a), a]

Solve::ifun :
Inverse functions are being used by Solve, so some solutions may not be found; use Reduce for complete solution information.

{{a -> Log[1 - s]/(Log[1 - s] - Log[s])}}

Solve[s + s^(1/d) == 1, d]

Solve::ifun :
Inverse functions are being used by Solve, so some solutions may not be found; use Reduce for complete solution information.

{{d -> Log[s]/Log[1 - s]}}

Plot[Log[s]/Log[1 - s], {s, 0.5, 1}]

[Plot: Log[s]/Log[1 - s] for s from 0.5 to 1.]
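
As a worked instance of the solved expression for a, the sketch below evaluates the Pareto index implied by the 80-20 and 70-30 splits mentioned earlier. Here s is the population share of the poorer group (0.8 or 0.7), and the helper name impliedIndex is introduced only for this illustration.

```wolfram
(* Pareto index implied by an s / (1 - s) split, taken from the Solve output for a. *)
impliedIndex[s_] := Log[1 - s]/(Log[1 - s] - Log[s])

(* 80-20 and 70-30 splits; the values come out near 1.16 and 1.42. *)
indices = impliedIndex /@ {0.8, 0.7}

(* Gini coefficients implied by those indices, using 1/(2 a - 1); near 0.76 and 0.54. *)
1/(2 indices - 1)
```

Both indices are only slightly above 1, the region where the mean barely exists and inequality is high.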



Data Creation and Analysis



Sampling from Power Law (Pareto) Distributions

We can generate a sample from a Pareto distribution by sampling from a uniform distribution on (0,1].
We transform each point U from the uniform according to $X = x_0 / U^{1/a}$.
Then looking at the survival function we have

$P[X > x] = P[x_0 / U^{1/a} > x] = P[U^{1/a} < x_0/x] = P[U < (x_0/x)^{a}] = (x_0/x)^{a}$

Technical note: we must rule out drawing a 0 from our uniform distribution. Most software draws from the interval [0, 1). In this case, just use 1 - U for your sample.
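
A minimal sketch of this transform next to Mathematica's built-in sampler, under assumed illustrative parameters (alpha = 2, xmin = 10000, 2000 points). The names manual and builtin are introduced here; the comparison of medians is only a rough sanity check, since both samples are random.

```wolfram
(* Assumed illustrative parameters. *)
alpha = 2; xmin = 10000; npts = 2000;

(* Hand-rolled transform: 1 - U lies in (0, 1] when U is drawn from [0, 1). *)
manual = xmin/(1 - RandomReal[1, npts])^(1/alpha);

(* Built-in sampler for the same distribution. *)
builtin = RandomVariate[ParetoDistribution[xmin, alpha], npts];

(* Both sample medians should lie near the theoretical median xmin*2^(1/alpha). *)
{Median[manual], Median[builtin], N[Median[ParetoDistribution[xmin, alpha]]]}
```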

Clear[sizedata, incomes]
alpha = 2; xmin = 10000; npts = 2000;
sizedata = xmin/(1 - RandomReal[1, npts])^(1/alpha);
noise = RandomVariate[NormalDistribution[0, 100], npts];
sizedata = Sort[sizedata + noise];
Clear[noise]
proportionlarger = Reverse[Range[npts]]/npts // N;
survivaldata = Transpose[{sizedata, proportionlarger}] // N;
ListPlot[survivaldata]
llplot = ListPlot[Log[survivaldata]]

[Plot: empirical survival rate against size.]

[Plot: log survival rate against log size.]


Simplest Approach to Estimation

An obvious estimator for x0 is the minimum observation (which is also the maximum likelihood estimator).
Recall that the mean of the distribution is $x_0 a/(a - 1)$. We can then estimate a using

Clear[mean, x0, a]
minsize = Min[sizedata]
{meansize = Mean[sizedata], theoreticalmean = xmin*alpha/(alpha - 1)}
Solve[mean == x0*a/(a - 1), a] /. {mean -> meansize, x0 -> minsize}

9792.57

{19129.2, 20000}

{{a -> 2.04883}}

Not bad.
We might improve a little on this by estimating x0 using the expected value for the minimum observation given the sample size.
(See http://www.math.umt.edu/gideon/pareto.pdf)
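
The Solve step has a simple closed form. Writing $\bar{x}$ for the sample mean and $\hat{x}_0$ for the sample minimum (notation introduced here), and plugging in the values reported above:

```latex
\bar{x} = \frac{\hat{x}_0\, \hat{a}}{\hat{a} - 1}
\;\Longrightarrow\;
\hat{a} = \frac{\bar{x}}{\bar{x} - \hat{x}_0}
        = \frac{19129.2}{19129.2 - 9792.57}
        \approx 2.0488
```

The estimate depends on the gap between the mean and the minimum, so it is sensitive to any error in the estimate of x0.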

Fitting a Pareto Distribution to the Data

Recall that the survival function told us that the log of the proportion surviving is linear in $\log(x_0/x)$. So we can look for a simple linear fit. The negative of the coefficient on x (which here stands for log size) is our estimate of the Pareto index.

alpha
fit = Fit[Log[survivaldata], {1, x}, x]  (* linear fit to logged data *)
coefs = CoefficientList[fit, x]
Show[{llplot,
  Plot[fit, {x, Log[First[sizedata]], Log[Last[sizedata]]}, PlotStyle -> {Red}]}]
Exp[-coefs[[1]]/coefs[[2]]]  (* implied value of x0 *)

18.7679 - 2.03951 x

{18.7679, -2.03951}

[Plot: logged survival data with the fitted line in red.]

9918.88

A nonlinear model fit of the survival function produces similar results.

nlm01 = NonlinearModelFit[survivaldata, SurvivalFunction[
   ParetoDistribution[x0hat, alphahat], $x], {x0hat, alphahat}, $x]

FittedModel[
 9.66847*10^7/$x^1.9…    $x >= 9949.45
 1                       True
]
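
For comparison, Mathematica also offers direct distribution fitting. The sketch below is an alternative to the regression approaches above and is not part of the original analysis; it assumes sizedata from the sampling slide is still defined, and k0 and a0 are placeholder parameter names introduced here.

```wolfram
(* Fit both Pareto parameters directly to the sample (maximum likelihood by default). *)
EstimatedDistribution[sizedata, ParetoDistribution[k0, a0]]

(* The same estimates, returned as replacement rules. *)
FindDistributionParameters[sizedata, ParetoDistribution[k0, a0]]
```

Because the sample includes added noise, these estimates need not match the regression results exactly.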

Sampling from Power Law (Pareto) Distributions

We are often forced to work with binned data. Let's create some.

npts = 10^6; alpha = 2; x0 = 1; x = x0/(1 - RandomReal[1, npts])^(1/alpha);
(* we'll need x again for next figure *)
xmax = 10*x0; bins = Table[i, {i, x0, xmax, xmax/100}];
relfreq = BinCounts[x, {bins}]/npts;
{ListPlot[Transpose[{bins[[2 ;;]], relfreq}],
  PlotRange -> {{0, xmax}, Automatic}],
 ListLogLogPlot[Transpose[{bins[[2 ;;]], relfreq}]]}

[Plots: relative frequency by bin for sizes up to 10, on linear axes (left) and on log-log axes (right).]


Power Law Frequency Distribution

In our last slide, we cheated a bit by showing only the bins for relatively small sizes, which occur with the greatest frequency. As size increases, relative frequency falls, and statistical noise becomes more prominent, even if we substantially increase bin size.

xmax = 200*x0; bins = Table[i, {i, x0, xmax, xmax/100}];
relfreq = BinCounts[x, {bins}]/npts;
mypoints = Transpose[{Rest[bins], relfreq}];
ListLinePlot[mypoints, PlotRange -> {{100*x0, Automatic}, {0, 10^-5}}]

[Plot: relative frequency by bin for sizes from 100 to 200, on a vertical scale from 0 to 10^-5.]

Power Law: Log Scale

We might hope to address this by moving to a log scale. This proves informative but is only partially successful. Why? Our bins are still linear.
Notice the empty bins for large sizes.

ListLogLogPlot[mypoints]

[Plot: relative frequency by bin on log-log axes, for sizes up to 200.]

Logarithmic Binning

It works better to let our bin size grow as we consider larger size realizations: we can use logarithmic binning.

logxmax = Log[xmax]; logbins = Table[i, {i, 0, logxmax, logxmax/100}];
relfreq = BinCounts[Log[x], {logbins}]/npts;
data = Transpose[{Rest[logbins], Log[relfreq]}];
ListLinePlot[data]

[Plot: log relative frequency against log size, using logarithmic bins.]

Clear[x]
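
Why the logarithmically binned plot is close to a straight line follows from the relation to the exponential distribution. A sketch for the case x0 = 1 used here, with u the left edge of a log bin and h its width (both symbols introduced for illustration):

```latex
P[u < \log X \le u + h]
  = e^{-a u} - e^{-a (u + h)}
  = e^{-a u}\left(1 - e^{-a h}\right)
\;\Longrightarrow\;
\log P = -a\,u + \log\left(1 - e^{-a h}\right)
```

With equal-width log bins the second term is constant, so the log relative frequency falls with slope -a (here -2), up to sampling noise in the sparsely populated upper bins.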

Nonlinear Curve Fitting



Some Census Data

Data from http://www.census.gov/hhes/www/income/data/historical/inequality/IE-1.pdf table A.3. Income is in 2010 dollars.

Year                      2010      2009      2008      2007      2006      2005      2004
10th percentile limit     11,904    12,320    12,315    12,789    12,977    12,607    12,589
20th percentile limit     20,000    20,791    20,974    21,337    21,666    21,419    21,338
50th (median)             49,445    50,599    50,939    52,823    52,124    51,739    51,174
80th percentile limit    100,065   101,651   101,508   105,156   104,930   102,420   101,580
90th percentile limit    138,923   139,904   140,050   143,012   143,825   140,823   139,514
95th percentile limit    180,810   182,972   182,277   186,126   188,175   185,397   181,399

Census Data

incomes2010 = {11904, 20000, 49445, 100065, 138923, 180810};
cdf2010 = {10, 20, 50, 80, 90, 95}/100.;
tail2010 = 1 - cdf2010;
incomecdf2010 = Transpose[{incomes2010, cdf2010}];
incometail2010 = Transpose[{incomes2010, tail2010}];
Labeled[GraphicsRow[{
   g`cdf2010 =
    ListPlot[incomecdf2010, AxesLabel -> {"income", "cdf"}, AxesOrigin -> {0, 0},
     PlotStyle -> PointSize[0.02], Ticks -> {{{15000, "$15k"}, {50000, "$50k"},
        {100000, "$100k"}, {150000, "$150k"}}, Automatic}, ImageSize -> 400],
   g`tail2010 = ListPlot[incometail2010, AxesLabel -> {"income", "tail"},
     AxesOrigin -> {0, 0},
     PlotStyle -> PointSize[0.02], Ticks -> {{{15000, "$15k"}, {50000, "$50k"},
        {100000, "$100k"}, {150000, "$150k"}}, Automatic}, ImageSize -> 400]
   }], "2010 Census Data"]

Out[722]=
[Plots, labeled "2010 Census Data": cdf points (left) and tail points (right) against income.]


Loglinear Survival Fit


In[806]:=
Clear[x]
fit2010 = Fit[Log[Transpose[{incomes2010, tail2010}]], {1, $x}, $x];
(* linear fit to logged data *)
coefs2010 = CoefficientList[fit2010, $x];
ahat2010 = -coefs2010[[2]];
x0hat2010 = Exp[coefs2010[[1]]/ahat2010]; (* implied value of x0 *)
tail2010fit = Piecewise[{{(x0hat2010/x)^ahat2010, x > x0hat2010}, {1, True}}]
cdf2010fit = 1 - tail2010fit;
GraphicsRow[{
  Show[{g`tail2010, Plot[tail2010fit,
     {x, First[incomes2010], Last[incomes2010]}, PlotStyle -> {Red}]}]
  ,
  Show[{g`cdf2010, Plot[cdf2010fit,
     {x, First[incomes2010], Last[incomes2010]}, PlotStyle -> {Red}]}]
  }]

Out[810]=
17914.6 (1/x)^1.01727    x > 15169.9
1                        True

Out[812]=
[Plots: tail points (left) and cdf points (right) against income, each with the fitted curve in red.]

A problem with this log-linear survival fit is that it estimates minimum income at a value above the minimum observed value. But the same thing happens with a nonlinear estimation.

Nonlinear Least Squares: Fit CDF to 2010 Data

In[695]:=
incomecdf2010 = {{11904, 0.1}, {20000, 0.2},
   {49445, 0.5}, {100065, 0.8}, {138923, 0.9}, {180810, 0.95}}
nlm01 = NonlinearModelFit[incomecdf2010,
   CDF[ParetoDistribution[khat, ahat], $x], {khat, ahat}, $x];
nlm01["BestFitParameters"]
gpareto2010 = Show[ListPlot[incomecdf2010, AxesOrigin -> {0, 0}],
  Plot[nlm01[$x], {$x, 0, 200000}, PlotStyle -> {Red}]]

Out[695]=
{{11904, 0.1}, {20000, 0.2}, {49445, 0.5},
 {100065, 0.8}, {138923, 0.9}, {180810, 0.95}}

Out[697]=
{khat -> 16104.6, ahat -> 0.858956}

Out[698]=
[Plot: 2010 cdf points with the fitted Pareto CDF in red.]


Some Earlier (2004) Income Data

For purposes of comparison, we use data from Maclachlan (2006).

In[1022]:=
incomes2004 = {15000, 25000, 35000, 50000, 75000, 100000};
cdf2004 = {0.154, 0.283, 0.402, 0.55, 0.733, 0.843};
tail2004 = 1 - cdf2004;
data2004fm = Transpose[{incomes2004, cdf2004}]
g`data2004fm = ListPlot[data2004fm, PlotStyle -> PointSize[0.01],
  AxesOrigin -> {0, 0}, AxesLabel -> {"income", "cdf"},
  Ticks -> {{{15000, "$15k"}, {50000, "$50k"}, {100000, "$100k"}}, Automatic},
  ImageSize -> 400]
Clear[x]
fit2004 = Fit[Log[Transpose[{incomes2004, tail2004}]], {1, x}, x]
(* linear fit to logged data *)
coefs2004 = CoefficientList[fit2004, x]
ahat2004 = -coefs2004[[2]]
x0hat2004 = Exp[coefs2004[[1]]/ahat2004] (* implied value of x0 *)
cdf2004fit = 1 - (x0hat2004/x)^ahat2004
temp = Plot[cdf2004fit,
   {x, First[incomes2004], Last[incomes2004]}, PlotStyle -> {Red}];
Show[{g`data2004fm, temp}]

Out[1025]=
{{15000, 0.154}, {25000, 0.283}, {35000, 0.402},
 {50000, 0.55}, {75000, 0.733}, {100000, 0.843}}

Out[1026]=
[Plot: 2004 cdf points against income.]

Out[1028]=
8.43993 - 0.872352 x

Out[1029]=
{8.43993, -0.872352}

Out[1030]=
0.872352

Out[1031]=
15913.4

Out[1032]=
1 - 4628.25 (1/x)^0.872352

Out[1034]=
[Plot: 2004 cdf points with the fitted log-linear Pareto CDF in red.]

Fit to Pareto Distribution

Let's fit these data points to a Pareto distribution, using NonlinearModelFit. (Mathematica 9 gives a perfect match to the same estimation in Maclachlan (2006), who used Mathematica 5.)

In[1040]:=
data2004fm == {{15000, 0.154}, {25000, 0.283},
   {35000, 0.402}, {50000, 0.55}, {75000, 0.733}, {100000, 0.843}}
nlm01 = NonlinearModelFit[data2004fm,
   CDF[ParetoDistribution[khat, ahat], $x], {khat, ahat}, $x];
nlm01["BestFitParameters"]
g`pareto =
 Show[{g`data2004fm, Plot[nlm01[$x], {$x, 0, 100000}, PlotStyle -> {Red}]}]

Out[1040]=
True

Out[1042]=
{khat -> 12989.3, ahat -> 0.658768}

Out[1043]=
[Plot: 2004 cdf points with the fitted Pareto CDF in red.]

Puzzle

If I recall correctly, Mathematica 8 gave different results.

nlm01["BestFitParameters"]
nlm01["ParameterConfidenceIntervals"]
nlm01["ParameterErrors"]

{khat -> 18243.5, ahat -> 0.902423}

{{9379.06, 27107.9}, {0.312716, 1.49213}}

{3192.72, 0.212397}

[Plot: the corresponding fit, for incomes from 0 to 100000.]


Other Two Parameter Distributions


Maclachlan (2006) proposes consideration of the LogNormal and Beaman distributions. This section replicates her results for these, using her data.

Lognormal
In[944]:=
GraphicsRow[{
  Plot[SurvivalFunction[LogNormalDistribution[0, .2], x],
   {x, 0, 2}, AxesOrigin -> {0, 0}],
  LogLogPlot[SurvivalFunction[LogNormalDistribution[0, .2], x],
   {x, 0.01, 2}, PlotRange -> All]
  }, ImageSize -> Large]
GraphicsRow[{
  Plot[CDF[LogNormalDistribution[0, .2]][x], {x, 0, 2}, AxesOrigin -> {0, 0}],
  Plot[PDF[LogNormalDistribution[0, 0.2]][x], {x, 0, 2}, AxesOrigin -> {0, 0}]
  }, ImageSize -> Large]

Out[944]=
[Plots: lognormal survival function on linear axes (left) and on log-log axes (right).]

Out[945]=
[Plots: lognormal CDF (left) and PDF (right), for x from 0 to 2.]

In[949]:=
Simplify[
 {CDF[LogNormalDistribution[m, s], x], PDF[LogNormalDistribution[m, s], x]},
 Assumptions -> x > 0]

Out[949]=
{1/2 Erfc[(m - Log[x])/(Sqrt[2] s)], E^(-(m - Log[x])^2/(2 s^2))/(Sqrt[2 Pi] s x)}


Fit Lognormal
In[983]:=
Clear[x]
model = CDF[LogNormalDistribution[m, s], x]
nlm02 = NonlinearModelFit[data2004fm, model, {{m, 10}, {s, 1}}, x]
nlm02["BestFitParameters"]
fittedmodel2004fm = model /. %

Out[984]=
1/2 Erfc[(m - Log[x])/(Sqrt[2] s)]    x > 0
0                                      True

Out[985]=
FittedModel[
 1/2 Erfc[0.763334 (10.6606 - Log[x])]    x > 0
 0                                         True
]

Out[986]=
{m -> 10.6606, s -> 0.92634}

Out[987]=
1/2 Erfc[0.763334 (10.6606 - Log[x])]    x > 0
0                                         True

Visualize Lognormal Fit


In[1039]:=
g`lognormal = Show[
  {g`data2004fm, Plot[fittedmodel2004fm, {x, 15000, 100000}]}, ImageSize -> 300]

Out[1039]=
[Plot: 2004 cdf points with the fitted lognormal CDF.]

In[1044]:=
g2 = Show[{g`lognormal, g`pareto},
  PlotLabel -> "Compare Lognormal (Blue) and Pareto (Red)"]

Out[1044]=
[Plot, titled "Compare Lognormal (Blue) and Pareto (Red)": 2004 cdf points with both fitted CDFs.]

In[1046]:=
Show[{Plot[fittedmodel2004fm, {x, 1000, 200000}], g`pareto}]

Out[1046]=
[Plot: the two fitted CDFs over incomes from 0 to 200000.]


Beaman CDF

In[365]:=
beamanCDF = 1/(1 + (b - a)^3/(x - a)^3)
Limit[beamanCDF, x -> Infinity]
Limit[beamanCDF, x -> a]
Limit[beamanCDF, x -> b]
D[beamanCDF, x] // Simplify

Out[365]=
1/(1 + (-a + b)^3/(-a + x)^3)

Out[366]=
1

Out[367]=
0

Out[368]=
1/2

Out[369]=
(3 (-a + b)^3)/((1 + (a - b)^3/(a - x)^3)^2 (a - x)^4)

Beaman: 1/(1 + (b - a)^3/(x - a)^3)

Transpose[{Range[12000, 100000, 100],
   1 - 1/(1 + (44 + 27)^3/(Range[12000, 100000, 100] + 27)^3)}];
ListLogLogPlot[%, Ticks -> None]


Fit Beaman Distribution

Beaman

Maclachlan (2006) reports that the Beaman distribution was used at duPont in the 1970s to model sales volume of products at various price points, because it gave a better fit than the lognormal. There are two parameters: a represents the minimum price point and b represents the median price point. The distribution allows negative values. See http://library.wolfram.com/infocenter/MathSource/6021/ for details. For elaboration, see http://home.manhattan.edu/~fiona.maclachlan//beaman/beaman_notebook/.

beamanDistribution[$x_, a_, b_] := 1/(1 + (b - a)^3/($x - a)^3);
incomepoints = {{15000, 0.154}, {25000, 0.28300000000000003}, {35000, 0.402},
   {50000, 0.55}, {75000, 0.7330000000000001}, {100000, 0.8430000000000001}};
nlm03 = NonlinearModelFit[incomepoints,
   {1/(1 + (b - a)^3/($x - a)^3), a < 0, b > 25000}, {b, a}, $x];
Normal[nlm03]
bfp = nlm03["BestFitParameters"]

1/(1 + (3.48475*10^14)/(25583.6 + $x)^3)

{b -> 44786.9, a -> -25583.6}

These results are close to those of Maclachlan (who got values {b -> 44884.7, a -> -27515.7}). Unfortunately, the estimates are *highly* sensitive to the constraint values. In any case, we use constraints based on Maclachlan (2006) and get a pretty good fit, as illustrated here.

gbeaman = Plot[beamanDistribution[$x, a, b] /. bfp, {$x, 5000, 100000},
   PlotStyle -> {Green}, ImageSize -> 300, AxesLabel -> {"income", "cdf"}];
gbeaman2 = Show[gdata, gbeaman, ImageSize -> 300, AxesLabel -> {"income", "cdf"}]

[Plot: income cdf points with the fitted Beaman CDF in green.]

Two way comparison

Show[gdata, gbeaman, glognormal,
 PlotLabel -> "Beaman in Green & Lognormal in Blue"]

[Plot, titled "Beaman in Green & Lognormal in Blue": income cdf points with both fitted CDFs.]

References
Maclachlan, Fiona (2006). Investigating Power Laws with Mathematica. http://library.wolfram.com/infocenter/Conferences/6461/
