Pareto Distribution

The Pareto Distribution
Introduction
The Pareto Distribution
named after Vilfredo Pareto (1848-1923, Italian economist)
also known as Bradford distribution
continuous power-law probability distribution

Pareto's Proposal
Pareto proposed that the number of people (N) with incomes higher than x can be modeled log-linearly:
$\log N = \log A - a \log x$
Letting the total population be N0 and the minimum income be x0, so that $\log N_0 = \log A - a \log x_0$, we can write this in proportionate terms as
$\log(N/N_0) = -a \log(x/x_0)$
Note that for x1, x2 > x0 and associated N1, N2, this also implies
$\log(N_1/N_2) = -a \log(x_1/x_2)$
So the same relationship holds in any tail of the income distribution.

Pareto Distribution (Tail Function)
The Pareto Distribution is often presented in terms of its survival function (or reliability function, or tail
function), which gives the probability of seeing larger values than x. (I.e., it is 1-CDF; see below.) The
survival function is
Clear[x0, a, x];
SurvivalFunction[ParetoDistribution[x0, a], x]
Out: (x/x0)^(-a) for x >= x0, and 1 otherwise
Here x0 > 0 is the location parameter, and a > 0 is the shape parameter (or slope parameter, or Pareto
index). We are only interested in x > x0, and we are usually interested in a > 1 (which is required for finite
expected value).
Plot[% /. {x0 -> 10000, a -> 2}, {x, 0, 100000}]
[Figure: survival function for x0 = 10000 and a = 2, plotted for x from 0 to 100000 on a vertical scale from 0 to 1.]

Relation to Exponential
Here is the survival function of the exponential distribution with rate a:
$R(y) = P[Y > y] = e^{-a y} = (e^{y})^{-a}$
Define $X = x_0 e^{Y}$, where Y has an exponential distribution. Then, for x > x0,
$P[X > x] = P[x_0 e^{Y} > x] = P[e^{Y} > x/x_0] = P[Y > \log(x/x_0)] = (x/x_0)^{-a}$
Comparing to our survival function for the Pareto distribution, we see that X has a Pareto distribution.
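The link to the exponential distribution can be checked by simulation. The following is a minimal Python sketch (not part of the notebook, which uses Mathematica); the function names and the seed are illustrative assumptions. It draws Y from an exponential distribution with rate a, forms X = x0 e^Y, and compares the empirical tail with (x/x0)^(-a).

```python
import math
import random

def pareto_from_exponential(x0, a, n, seed=0):
    """Draw n values X = x0 * exp(Y), with Y ~ Exponential(rate a)."""
    rng = random.Random(seed)
    return [x0 * math.exp(rng.expovariate(a)) for _ in range(n)]

def empirical_survival(sample, x):
    """Fraction of the sample strictly greater than x."""
    return sum(1 for v in sample if v > x) / len(sample)

x0, a = 10_000, 2.0
sample = pareto_from_exponential(x0, a, n=200_000)
for x in (20_000, 50_000):
    # empirical tail next to the Pareto survival function (x/x0)^(-a)
    print(x, empirical_survival(sample, x), (x / x0) ** (-a))
```

The two printed columns should agree up to sampling error.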
Survival (Tail) Function

Log-Linear Survival
Let's focus on values greater than the minimum:
tailPareto = Simplify[
  SurvivalFunction[ParetoDistribution[x0, a], x],
  Assumptions -> x > x0 > 0]
Out: (x0/x)^a
Note that the log of the survival probability is linear in log(x/x0), with slope -a. We can say that the size elasticity of the survival rate is -a. (We will return to this.)
Simplify[
  Log[tailPareto],
  Assumptions -> x > x0 > 0 && a > 0]
Out: a Log[x0/x]
Survival and the Pareto Index

plotoptions =
  {PlotRange -> {{0, 100000}, {0, 1}}, AxesLabel -> {"Income", "Survival Rate"},
   ImageSize -> 250, Ticks -> {1000*{20, 40, 80}, {0, 0.25, 0.5, 1}}}
Manipulate[GraphicsRow[{
   Plot[SurvivalFunction[ParetoDistribution[10000, $a], $x],
    {$x, 10000, 100000}, Evaluate[plotoptions]],
   LogLogPlot[SurvivalFunction[ParetoDistribution[10000, $a], $x],
    {$x, 10000, 100000}, Evaluate[plotoptions]]
   }],
 {{$a, 2, "Pareto index"}, 1, 5}]
[Figure: Survival Rate against Income for the selected Pareto index, on linear axes (left) and log-log axes (right).]
Survival: Intuition
Consider a population of households and suppose sampling household incomes is like sampling from a
Pareto[10000,2].
What proportion of households earn more than $100000? From the form of the survival function, the answer is (10000/100000)^2 = 1%: only 1 in 100 households earns more than $100000.
SurvivalFunction[ParetoDistribution[10000, 2], 100000]
Out: 1/100
Note: given a = 2 and any x0, we find that 1% of the population has income greater than 10*x0. This is why
the Pareto distribution (along with other power law distributions) is called scale free.
Simplify[
  SurvivalFunction[ParetoDistribution[x0, 2], 10*x0],
  Assumptions -> x0 > 0]
Out: 1/100
What is more, this relationship holds as well for subgroups: only 1% of the top 1% will have incomes again
ten times higher. This typifies a continuous power law distribution.
Simplify[
  SurvivalFunction[ParetoDistribution[x0, 2], 100*x0]/
   SurvivalFunction[ParetoDistribution[x0, 2], 10*x0],
  Assumptions -> x0 > 0]
Out: 1/100
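The scale-free property can also be seen numerically. This is a small Python sketch under the assumption a = 2 used in the text; `pareto_survival` is a hypothetical helper name, not a notebook function. For several unrelated values of x0, it prints the share above 10 x0 and the share of that group above 100 x0.

```python
def pareto_survival(x, x0, a):
    """P[X > x] for a Pareto distribution with minimum x0 and index a."""
    return 1.0 if x <= x0 else (x0 / x) ** a

for x0 in (1.0, 10_000.0, 3.5e6):
    top = pareto_survival(10 * x0, x0, 2)                 # share above 10 * x0
    top_of_top = pareto_survival(100 * x0, x0, 2) / top   # share of that group above 100 * x0
    print(x0, top, top_of_top)
```

Both shares come out at about 1% whatever the value of x0, because only the ratio x/x0 enters the survival function.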
Pareto Distribution: CDF and PDF
The cumulative distribution function (CDF) gives the probability of seeing a given size or lower. Note that x0 is a minimum value, called the location parameter.
cdfPareto = Simplify[
  CDF[ParetoDistribution[x0, a], x],
  Assumptions -> x > x0 > 0]
Out: 1 - (x0/x)^a
The PDF is of course the derivative of the CDF.
pdfPareto = Simplify[
  PDF[ParetoDistribution[x0, a], x],
  Assumptions -> x > x0 > 0]
pdfPareto == D[cdfPareto, x] // PowerExpand
Out: x^(-1 - a) x0^a a
Out: True
CDF and PDF of Pareto Distributions
options01 = {PlotRange -> {{0, 5}, {0, 1}},
   AxesLabel -> {"Size", "CDF"}, ImageSize -> 250};
options02 =
  {PlotRange -> {{0, 5}, {0, 5}}, AxesLabel -> {"Size", "PDF"}, ImageSize -> 250};
Manipulate[
 GraphicsRow[{
   Plot[CDF[ParetoDistribution[1, $$a], $x], {$x, 1, 5}, Evaluate[options01]],
   Plot[PDF[ParetoDistribution[1, $$a], $x], {$x, 1, 5}, Evaluate[options02]]
   }],
 {{$$a, 1, "Pareto index"}, 0.1, 5}]
[Figure: CDF (left) and PDF (right) against Size from 0 to 5, shown with the Pareto index slider at 2.04.]
PDF of Pareto Distributions: Loglinearity
options02 =
  {PlotRange -> {{0, 5}, {0, 5}}, AxesLabel -> {"Size", "PDF"}, ImageSize -> 250};
Manipulate[
 LogLogPlot[PDF[ParetoDistribution[1, $$a], $x],
  {$x, 1, 5}, Evaluate[options02]],
 {{$$a, 1, "Pareto index"}, 0.1, 5}]
[Figure: log-log plot of the Pareto PDF against Size for the selected Pareto index.]
Continuous Power Law Distribution
A power law distribution with shape parameter a has probability density function
$p(x) = c\,x^{-(1+a)}$ for $x > x_0$
Clear[c, x, a]
pdfPower = c x^-(1 + a);
The constant c must be chosen to satisfy unitarity (the density must integrate to 1). In the continuous case, we compute the integral
Assuming[x0 > 0 && a > 0,
 Integrate[pdfPower, {x, x0, +Infinity}]
 ]
Out: c x0^(-a)/a
soln = Solve[% == 1, c] // Flatten
Out: {c -> x0^a a}
Plugging in our solution, we get the PDF of the Pareto distribution.
pdfPower /. soln
Out: x^(-1 - a) x0^a a
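For reference, the normalization step can be written out by hand. The following is just the integral above evaluated explicitly, for a > 0 and x0 > 0:

```latex
\int_{x_0}^{\infty} c\,x^{-(1+a)}\,dx
  = c\left[-\frac{x^{-a}}{a}\right]_{x_0}^{\infty}
  = \frac{c\,x_0^{-a}}{a} = 1
\quad\Longrightarrow\quad
c = a\,x_0^{a},
\qquad
p(x) = a\,x_0^{a}\,x^{-(1+a)} \quad \text{for } x > x_0 .
```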
Lorenz Curve and Gini Coefficient

Average Size
Recall the PDF of the Pareto distribution.
pdfPareto
x-1-a x0a a
Use this PDF to compute the average size of a draw from the Pareto distribution as $\int_{x_0}^{\infty} x\,p(x)\,dx$.
Clear[x0, x, a]
meanPareto = Assuming[x0 > 0 && a > 1,
  Integrate[x*pdfPareto, {x, x0, +Infinity}]
  ]
Out: x0 a/(-1 + a)
Proportion of Total Income
The weighted sum of all incomes less than or equal to t is $\int_{x_0}^{t} x\,p(x)\,dx$.
Clear[x0, x, a, t]
cumSize = Assuming[t > x0 > 0 && a > 1,
  Integrate[x*pdfPareto, {x, x0, t}]
  ]
Out: t^(-a) (t^a x0 - t x0^a) a/(-1 + a)
Dividing by the mean (i.e., the probability-weighted sum of all incomes) produces an expression for the proportion of total income constituted by incomes of t or less: $1 - (t/x_0)^{1-a}$.
Note that this expression only makes sense for a > 1, the case in which the mean exists. Also note that it depends only on the ratio t/x0.
cumShare = cumSize/meanPareto // Simplify
Out: 1 - t^(1 - a) x0^(-1 + a)
Simplify[cumShare == 1 - (t/x0)^(1 - a), Assumptions -> a > 1 && t > x0 > 0]
Out: True
Lorenz Curve
A Lorenz curve for income plots this cumulative share of income vs the cumulative share of the population
earning it.
We have just found that the cumulative share of income for incomes less than t can be written as $1 - (t/x_0)^{1-a}$. Recall that the CDF at income t, which gives the proportion of the population earning less than t, is $1 - (t/x_0)^{-a}$. So for any a, we can make a parametric plot of the Lorenz curve. Defining m = x0/t, so that the population share is 1 - m^a and the income share is 1 - m^(a-1), we can write:
Manipulate[
 ParametricPlot[{1 - m^$a, 1 - m^($a - 1)}, {m, 0, 1},
  PlotRange -> {{0, 1}, {0, 1}}, AspectRatio -> 1, ImageSize -> Small],
 {{$a, 2}, 1.01, 5}]
[Figure: Lorenz curve on the unit square for the selected index $a, with population share on the horizontal axis and income share on the vertical axis.]
Lorenz Curve
Our parametric representation of the Lorenz curve can be used to derive a function that expresses the cumulative share of income as a function of the cumulative share of the population earning it. Recall that the CDF at income t gives the proportion of the population (say, s(t)) earning less than t. Since the CDF is strictly increasing, we can produce the inverse function t(s), which we can then substitute into cumShare.
ts = Solve[s == cdfPareto, x] /. {x -> t} // Flatten
Solve::ifun : Inverse functions are being used by Solve, so some solutions may not be found; use Reduce for complete solution information.
Out: {t -> (1 - s)^(-1/a) x0}
Simplify[cumShare /. ts,
 Assumptions -> x0 > 0 && a > 1]
Out: 1 - (1 - s)^(1 - 1/a)
Manipulate[
 Plot[{s, 1 - (1 - s)^(($a - 1)/$a)}, {s, 0, 1}, AspectRatio -> 1],
 {{$a, 2}, 1.01, 5}]
[Figure: the 45 degree line s and the Lorenz curve 1 - (1 - s)^(1 - 1/$a) for s from 0 to 1, for the selected index $a.]
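The closed form 1 - (1 - s)^(1 - 1/a) can be compared with a Lorenz curve computed directly from a simulated sample. This is a hedged Python sketch, not the notebook's code: the function names, the seed, the choice a = 3, and x0 = 1 are all assumptions made for illustration.

```python
import random

def lorenz_pareto(s, a):
    """Income share of the poorest fraction s, for Pareto index a > 1."""
    return 1.0 - (1.0 - s) ** (1.0 - 1.0 / a)

def empirical_lorenz(sample, s):
    """Income share held by the poorest fraction s of a sample."""
    ordered = sorted(sample)
    k = int(s * len(ordered))
    return sum(ordered[:k]) / sum(ordered)

rng = random.Random(1)
a = 3.0
# Inverse-transform draws with x0 = 1; 1 - random() lies in (0, 1], so no division by zero.
sample = [(1.0 - rng.random()) ** (-1.0 / a) for _ in range(100_000)]
for s in (0.25, 0.5, 0.9):
    print(s, lorenz_pareto(s, a), empirical_lorenz(sample, s))
```

For indexes close to 1 the sample total is dominated by a few very large draws, so the empirical curve converges slowly; a = 3 is used here only to keep the comparison stable.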

Gini Coefficient
The Gini coefficient is twice the area between the 45 degree line and the Lorenz curve. We can calculate that area as
Assuming[a > 1,
 Integrate[s - 1 + (1 - s)^(1 - 1/a), {s, 0, 1}]
 ]
Out: 1/(-2 + 4 a)
gini = 2*% // Simplify
Out: 1/(-1 + 2 a)
Solve[g == gini, a]
Out: {{a -> (1 + g)/(2 g)}}
Solve[d == 1 - 1/a, a] // Flatten
Out: {a -> 1/(1 - d)}
gini /. % // Simplify
Out: (1 - d)/(1 + d)
Solve[g == %, d]
Out: {{d -> (1 - g)/(1 + g)}}
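The closed form 1/(2a - 1) can be checked against a direct numerical integration of the Lorenz curve. A minimal Python sketch, assuming a simple midpoint rule; the function names are illustrative and not from the notebook.

```python
def lorenz_pareto(s, a):
    """Income share of the poorest fraction s, for Pareto index a > 1."""
    return 1.0 - (1.0 - s) ** (1.0 - 1.0 / a)

def gini_numeric(a, steps=100_000):
    """Twice the area between the 45 degree line and the Lorenz curve (midpoint rule)."""
    h = 1.0 / steps
    area = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        area += (s - lorenz_pareto(s, a)) * h
    return 2.0 * area

def gini_closed_form(a):
    """Gini coefficient of a Pareto distribution with index a > 1."""
    return 1.0 / (2.0 * a - 1.0)

def index_from_gini(g):
    """Invert g = 1/(2a - 1) for the Pareto index."""
    return (1.0 + g) / (2.0 * g)

for a in (1.5, 2.0, 3.0):
    print(a, gini_numeric(a), gini_closed_form(a))
```

`index_from_gini` goes the other way, recovering the index implied by an observed Gini coefficient under the Pareto assumption.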
Pareto Distribution and Lorenz Curve
L[F_, k_] := 1 - (1 - F)^(1 - 1./k);
options03 = {PlotRange -> {{0, 1}, {0, 1}}, AspectRatio -> 1, ImageSize -> 250};
Manipulate[Plot[{F, L[F, k], 1 - F}, {F, 0, 1}, Evaluate[options03]],
 {{k, 3, "Pareto index"}, 1, 10}]
[Figure: the 45 degree line F, the Lorenz curve L(F, k), and the line 1 - F on the unit square, for the selected Pareto index.]
80-20 Rule: Pareto (1906) noticed that about 80% of the land in Italy was owned by about 20% of the population.
However, his British tax return data showed something closer to 70-30.
There will always be some such proportion: look for where the Lorenz curve crosses the unit simplex (the line 1 - F).
80-20 Rule
We have seen with a = 2 that 1% of the population has a size at least 10 times the minimum, and 1% of that
1% has a size 10 times that.
More generally, if a > 1 (so that the expected value is finite), there is some fraction 0 <= f <= 1/2 such that f of those sampled receive (1 - f) of all income, and similarly for every real (not necessarily integer) n > 0, 100 f^n % of all people receive 100 (1 - f)^n % of all income.
Assuming[a > 1,
 Solve[1 - s == 1 - (1 - s)^(1 - 1/a), s]
 ]
Assuming[1 > d > 0,
 Solve[s == (1 - s)^d, s]
 ]
Assuming[d > 0,
 Solve[s + s^(1/d) == 1, s]
 ]
Solve::nsmet : This system cannot be solved with the methods available to Solve.
Out: each of the three Solve calls is returned unevaluated.
Solve[1 - s == 1 - (1 - s)^(1 - 1/a), a]
Solve::ifun (inverse functions warning)
Out: {{a -> Log[1 - s]/(Log[1 - s] - Log[s])}}
Solve[s + s^(1/d) == 1, d]
Solve::ifun (inverse functions warning)
Out: {{d -> Log[s]/Log[1 - s]}}
Plot[Log[s]/Log[1 - s], {s, 0.5, 1}]
[Figure: Log[s]/Log[1 - s] for s from 0.5 to 1, on a vertical scale from 0 to 1.]
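The solution for a gives the Pareto index implied by any such split. A short Python sketch, with hypothetical function names, evaluating it for the 80-20 and 70-30 cases mentioned above and checking the result against the Lorenz curve:

```python
import math

def index_for_split(s):
    """Pareto index at which the poorest fraction s holds income share 1 - s (0.5 < s < 1)."""
    return math.log(1.0 - s) / (math.log(1.0 - s) - math.log(s))

def lorenz_pareto(s, a):
    """Income share of the poorest fraction s, for Pareto index a > 1."""
    return 1.0 - (1.0 - s) ** (1.0 - 1.0 / a)

a_8020 = index_for_split(0.8)   # about 1.16
a_7030 = index_for_split(0.7)   # about 1.42
print(a_8020, lorenz_pareto(0.8, a_8020))
print(a_7030, lorenz_pareto(0.7, a_7030))
```

So the 80-20 rule corresponds to a Pareto index only slightly above 1, which is a very heavy tail.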

Data Creation and Analysis

Sampling from Power Law (Pareto) Distributions
We can generate a sample from a Pareto distribution by sampling from a uniform distribution on (0,1].
We transform each point U from the uniform according to $X = x_0 / U^{1/a}$.
Then looking at the survival function we have
$P[X > x] = P[x_0 / U^{1/a} > x] = P[U^{1/a} < x_0/x] = P[U < (x_0/x)^a] = (x_0/x)^a$
Technical note: we must rule out drawing a 0 from our uniform distribution. Most software draws from the interval [0, 1). In this case, just use 1 - U for your sample.
Clear[sizedata, incomes]
alpha = 2; xmin = 10000; npts = 2000;
sizedata = xmin/(1 - RandomReal[1, npts])^(1/alpha);
noise = RandomVariate[NormalDistribution[0, 100], npts];
sizedata = Sort[sizedata + noise];
Clear[noise]
proportionlarger = Reverse[Range[npts]]/npts // N;
survivaldata = Transpose[{sizedata, proportionlarger}] // N;
ListPlot[survivaldata]
llplot = ListPlot[Log[survivaldata]]
[Figure: empirical survival rate against size (sizes shown up to about 40000), followed by the same data in logs.]
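The inverse-transform step carries over directly to other languages. Below is a minimal Python sketch of the same recipe, without the added noise; `sample_pareto`, `survival_points`, and the seed are illustrative assumptions rather than notebook names.

```python
import random

def sample_pareto(x0, a, n, seed=None):
    """Inverse-transform sampling: X = x0 / U**(1/a), with U uniform on (0, 1]."""
    rng = random.Random(seed)
    # random() returns values in [0, 1), so 1 - random() lies in (0, 1] and is never 0.
    return [x0 / (1.0 - rng.random()) ** (1.0 / a) for _ in range(n)]

def survival_points(sample):
    """Pairs (x_i, share of observations >= x_i) for the sorted sample."""
    ordered = sorted(sample)
    n = len(ordered)
    return [(x, (n - i) / n) for i, x in enumerate(ordered)]

sizedata = sample_pareto(10_000, 2, 2000, seed=42)
survivaldata = survival_points(sizedata)
print(survivaldata[0], survivaldata[-1])
```

As in the notebook, the smallest observation is paired with a survival share of 1 and the largest with 1/n.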

Simplest Approach to Estimation
An obvious estimator for x0 is the minimum observation (which is also the maximum likelihood estimator).
Recall that the mean of the distribution is x0 a/(a - 1). We can then estimate a by equating this to the sample mean:
Clear[mean, x0, a]
minsize = Min[sizedata]
Out: 9792.57
{meansize = Mean[sizedata], theoreticalmean = xmin*alpha/(alpha - 1)}
Out: {19129.2, 20000}
Solve[mean == x0*a/(a - 1), a] /. {mean -> meansize, x0 -> minsize}
Out: {{a -> 2.04883}}
Not bad.
We might improve a little on this by estimating x0 using the expected value of the minimum observation given the sample size.
(See http://www.math.umt.edu/gideon/pareto.pdf)
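Solving mean = x0 a/(a - 1) for a gives a = mean/(mean - x0), so the estimate needs only the sample mean and minimum. A small Python sketch of this calculation, with hypothetical function names, applied to the two sample statistics reported above:

```python
def index_from_moments(mean, x0):
    """Solve mean = x0 * a / (a - 1) for a; requires mean > x0."""
    if mean <= x0:
        raise ValueError("the sample mean must exceed the minimum")
    return mean / (mean - x0)

def estimate_pareto_by_moments(sample):
    """Return (x0_hat, a_hat) using the sample minimum and the sample mean."""
    x0_hat = min(sample)
    mean = sum(sample) / len(sample)
    return x0_hat, index_from_moments(mean, x0_hat)

# Sample mean and minimum as reported in the text.
a_hat = index_from_moments(19129.2, 9792.57)
print(a_hat)
```

Note that the estimate is only defined when the mean exists (a > 1), and it becomes unstable as the sample mean approaches the minimum.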
Fitting a Pareto Distribution to the Data
Recall that the survival function told us that the log of the proportion surviving is linear in log(x0/x). So we can look for a simple linear fit. The negative of the coefficient on x (which here stands for log size) is our estimate of the Pareto index.
alpha
fit = Fit[Log[survivaldata], {1, x}, x] (* linear fit to logged data *)
Out: 18.7679 - 2.03951 x
coefs = CoefficientList[fit, x]
Out: {18.7679, -2.03951}
Show[{llplot,
  Plot[fit, {x, Log[First[sizedata]], Log[Last[sizedata]]}, PlotStyle -> {Red}]}]
[Figure: logged survival data with the fitted line in red.]
Exp[-coefs[[1]]/coefs[[2]]] (* implied value of x0 *)
Out: 9918.88
A nonlinear model fit of the survival function produces similar results.
nlm01 = NonlinearModelFit[survivaldata, SurvivalFunction[
    ParetoDistribution[x0hat, alphahat], $x], {x0hat, alphahat}, $x]
Out: FittedModel of the form 9.66847×10^7/$x^1.9… for $x >= 9949.45, and 1 otherwise
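The log-linear fit is an ordinary least-squares regression of log survival on log size. Here is a hedged Python sketch using only the standard library; the helper names and the seed are assumptions, and the simulated sample omits the notebook's added noise.

```python
import math
import random

def ols_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_pareto_loglog(sample):
    """Fit log(survival) = intercept + slope * log(x); return (x0_hat, a_hat)."""
    ordered = sorted(sample)
    n = len(ordered)
    log_x = [math.log(x) for x in ordered]
    log_s = [math.log((n - i) / n) for i in range(n)]
    intercept, slope = ols_line(log_x, log_s)
    a_hat = -slope
    # The fitted line reaches log(survival) = 0 at log(x0_hat) = intercept / a_hat.
    return math.exp(intercept / a_hat), a_hat

rng = random.Random(7)
sample = [10_000 / (1.0 - rng.random()) ** 0.5 for _ in range(2000)]  # x0 = 10000, a = 2
x0_hat, a_hat = fit_pareto_loglog(sample)
print(x0_hat, a_hat)
```

The largest observations are few but sit far out on the log scale, so they have high leverage on the fitted slope; this is one reason the estimate varies noticeably from sample to sample.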
Sampling from Power Law (Pareto) Distributions
We are often forced to work with binned data. Let's create some.
npts = 10^6; alpha = 2; x0 = 1; x = x0/(1 - RandomReal[1, npts])^(1/alpha);
(* we'll need x again for next figure *)
xmax = 10*x0; bins = Table[i, {i, x0, xmax, xmax/100}];
relfreq = BinCounts[x, {bins}]/npts;
{ListPlot[Transpose[{bins[[2 ;;]], relfreq}],
  PlotRange -> {{0, xmax}, Automatic}],
 ListLogLogPlot[Transpose[{bins[[2 ;;]], relfreq}]]}
[Figure: relative frequency by bin for sizes up to 10, on linear axes (left) and log-log axes (right).]

Power Law Frequency Distribution
In our last slide, we cheated a bit by showing only the bins for relatively small sizes, which occur with the greatest frequency. As size increases, relative frequency falls, and statistical noise becomes more prominent, even if we substantially increase bin size.
xmax = 200*x0; bins = Table[i, {i, x0, xmax, xmax/100}];
relfreq = BinCounts[x, {bins}]/npts;
mypoints = Transpose[{Rest[bins], relfreq}];
ListLinePlot[mypoints, PlotRange -> {{100*x0, Automatic}, {0, 10^-5}}]
[Figure: relative frequency for sizes from 100 to 200, on a vertical scale from 0 to 10^-5.]
Power Law: Log Scale
We might hope to address this by moving to a log scale. This proves informative but is only partially successful. Why? Our bins are still linear.
Notice the empty bins for large sizes.
ListLogLogPlot[mypoints]
[Figure: log-log plot of relative frequency against size, for sizes up to 200.]
Logarithmic Binning
It works better to let our bin size grow as we consider larger size realizations: we can use logarithmic
binning.
logxmax = Log[xmax]; logbins = Table[i, {i, 0, logxmax, logxmax/100}];
relfreq = BinCounts[Log[x], {logbins}]/npts;
data = Transpose[{Rest[logbins], Log[relfreq]}];
ListLinePlot[data]
[Figure: log relative frequency against log size, with logarithmic bins.]
Clear[x]
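Logarithmic binning amounts to binning log(x) in equal-width bins. The following Python sketch mirrors that idea under the same assumption as the notebook code (x0 = 1, so log sizes start at 0); the function name and the seed are illustrative. It returns raw relative frequencies, leaving the log transform to the caller so that empty bins do not produce log(0).

```python
import math
import random

def log_binned_frequencies(sample, xmax, nbins=100):
    """Relative frequency of log(x) in equal-width bins on [0, log(xmax))."""
    width = math.log(xmax) / nbins
    counts = [0] * nbins
    for x in sample:
        k = math.floor(math.log(x) / width)
        if 0 <= k < nbins:          # values outside the binned range are dropped
            counts[k] += 1
    n = len(sample)
    # Pair each frequency with the right edge of its bin (in logs).
    return [((k + 1) * width, c / n) for k, c in enumerate(counts)]

rng = random.Random(3)
x = [(1.0 - rng.random()) ** (-0.5) for _ in range(100_000)]  # x0 = 1, alpha = 2
data = [(edge, math.log(f)) for edge, f in log_binned_frequencies(x, xmax=200.0) if f > 0]
print(data[:3])
```

Because each bin is a fixed multiple wider than the last, the large-size bins collect enough observations to keep the noise down.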
Nonlinear Curve Fitting

Some Census Data
Data from http://www.census.gov/hhes/www/income/data/historical/inequality/IE-1.pdf table A.3. Income is in 2010 dollars.
year                      2010     2009     2008     2007     2006     2005     2004
10th percentile limit   11,904   12,320   12,315   12,789   12,977   12,607   12,589
20th percentile limit   20,000   20,791   20,974   21,337   21,666   21,419   21,338
50th (median)           49,445   50,599   50,939   52,823   52,124   51,739   51,174
80th percentile limit  100,065  101,651  101,508  105,156  104,930  102,420  101,580
Census Data
incomes2010 = {11904, 20000, 49445, 100065, 138923, 180810};
cdf2010 = {10, 20, 50, 80, 90, 95}/100.;
tail2010 = 1 - cdf2010;
incomecdf2010 = Transpose[{incomes2010, cdf2010}];
incometail2010 = Transpose[{incomes2010, tail2010}];
Labeled[GraphicsRow[{
   g`cdf2010 =
    ListPlot[incomecdf2010, AxesLabel -> {"income", "cdf"}, AxesOrigin -> {0, 0},
     PlotStyle -> PointSize[0.02], Ticks -> {{{15000, "$15k"}, {50000, "$50k"},
        {100000, "$100k"}, {150000, "$150k"}}, Automatic}, ImageSize -> 400],
   g`tail2010 = ListPlot[incometail2010, AxesLabel -> {"income", "tail"},
     AxesOrigin -> {0, 0},
     PlotStyle -> PointSize[0.02], Ticks -> {{{15000, "$15k"}, {50000, "$50k"},
        {100000, "$100k"}, {150000, "$150k"}}, Automatic}, ImageSize -> 400]
   }], "2010 Census Data"]
[Figure: "2010 Census Data": cdf (left) and tail (right) against income.]

Loglinear Survival Fit

Clear[x]
fit2010 = Fit[Log[Transpose[{incomes2010, tail2010}]], {1, $x}, $x];
(* linear fit to logged data *)
coefs2010 = CoefficientList[fit2010, $x];
ahat2010 = -coefs2010[[2]];
x0hat2010 = Exp[coefs2010[[1]]/ahat2010]; (* implied value of x0 *)
tail2010fit = Piecewise[{{(x0hat2010/x)^ahat2010, x > x0hat2010}, {1, True}}]
Out: 17914.6 (1/x)^1.01727 for x > 15169.9, and 1 otherwise
cdf2010fit = 1 - tail2010fit;
GraphicsRow[{
  Show[{g`tail2010, Plot[tail2010fit,
     {x, First[incomes2010], Last[incomes2010]}, PlotStyle -> {Red}]}]
  ,
  Show[{g`cdf2010, Plot[cdf2010fit,
     {x, First[incomes2010], Last[incomes2010]}, PlotStyle -> {Red}]}]
  }]
[Figure: 2010 tail (left) and cdf (right) data points against income, with the fitted curves in red.]
A problem with this log-linear survival fit is that it estimates minimum income at a value above the minimum observed value. But the same thing happens with a nonlinear estimation.
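The fit can be reproduced outside Mathematica with a few lines of least squares. This Python sketch uses the six 2010 data points given above; the variable names follow the notebook's, but the code itself is an illustrative reimplementation, not the author's.

```python
import math

incomes2010 = [11904, 20000, 49445, 100065, 138923, 180810]
cdf2010 = [0.10, 0.20, 0.50, 0.80, 0.90, 0.95]

log_x = [math.log(v) for v in incomes2010]
log_tail = [math.log(1.0 - p) for p in cdf2010]

# Ordinary least squares for log(tail) = intercept + slope * log(income).
n = len(log_x)
mean_x, mean_y = sum(log_x) / n, sum(log_tail) / n
sxx = sum((x - mean_x) ** 2 for x in log_x)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_tail))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

ahat2010 = -slope                              # Pareto index estimate
x0hat2010 = math.exp(intercept / ahat2010)     # implied minimum income
print(ahat2010, x0hat2010)
```

The implied minimum exceeds the lowest observed income (11904), which is the problem noted above: the fitted tail is flat at 1 below x0hat2010 and so cannot match the 10th percentile point.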
Nonlinear Least Squares: Fit CDF to 2010 Data
incomecdf2010 = {{11904, 0.1}, {20000, 0.2},
   {49445, 0.5}, {100065, 0.8}, {138923, 0.9}, {180810, 0.95}}
nlm01 = NonlinearModelFit[incomecdf2010,
   CDF[ParetoDistribution[khat, ahat], $x], {khat, ahat}, $x];
nlm01["BestFitParameters"]
Out: {khat -> 16104.6, ahat -> 0.858956}
gpareto2010 = Show[ListPlot[incomecdf2010, AxesOrigin -> {0, 0}],
  Plot[nlm01[$x], {$x, 0, 200000}, PlotStyle -> {Red}]]
[Figure: 2010 cdf data with the fitted Pareto CDF in red, for incomes up to 200000.]

Some Earlier (2004) Income Data
For purposes of comparison, we use data from Maclachlan (2006).
incomes2004 = {15000, 25000, 35000, 50000, 75000, 100000};
cdf2004 = {0.154, 0.283, 0.402, 0.55, 0.733, 0.843};
tail2004 = 1 - cdf2004;
data2004fm = Transpose[{incomes2004, cdf2004}]
Out: {{15000, 0.154}, {25000, 0.283}, {35000, 0.402}, {50000, 0.55}, {75000, 0.733}, {100000, 0.843}}
g`data2004fm = ListPlot[data2004fm, PlotStyle -> PointSize[0.01],
  AxesOrigin -> {0, 0}, AxesLabel -> {"income", "cdf"},
  Ticks -> {{{15000, "$15k"}, {50000, "$50k"}, {100000, "$100k"}}, Automatic},
  ImageSize -> 400]
[Figure: 2004 cdf data points against income.]
Clear[x]
fit2004 = Fit[Log[Transpose[{incomes2004, tail2004}]], {1, x}, x]
(* linear fit to logged data *)
Out: 8.43993 - 0.872352 x
coefs2004 = CoefficientList[fit2004, x]
Out: {8.43993, -0.872352}
ahat2004 = -coefs2004[[2]]
Out: 0.872352
x0hat2004 = Exp[coefs2004[[1]]/ahat2004] (* implied value of x0 *)
Out: 15913.4
cdf2004fit = 1 - (x0hat2004/x)^ahat2004
Out: 1 - 4628.25 (1/x)^0.872352
temp = Plot[cdf2004fit,
   {x, First[incomes2004], Last[incomes2004]}, PlotStyle -> {Red}];
Show[{g`data2004fm, temp}]
[Figure: 2004 cdf data points with the fitted log-linear Pareto CDF in red.]
Fit to Pareto Distribution
Let's fit these data points to a Pareto distribution, using NonlinearModelFit. (Mathematica 9 gives a perfect match to the same estimation in Maclachlan (2006), who used Mathematica 5.)
data2004fm == {{15000, 0.154}, {25000, 0.283},
  {35000, 0.402}, {50000, 0.55}, {75000, 0.733}, {100000, 0.843}}
Out: True
nlm01 = NonlinearModelFit[data2004fm,
   CDF[ParetoDistribution[khat, ahat], $x], {khat, ahat}, $x];
Out (best-fit parameters): {khat -> 12989.3, ahat -> 0.658768}
g`pareto =
 Show[{g`data2004fm, Plot[nlm01[$x], {$x, 0, 100000}, PlotStyle -> {Red}]}]
[Figure: 2004 cdf data points with the fitted Pareto CDF in red.]
Puzzle
If I recall correctly, Mathematica 8 gave different results:
nlm01["BestFitParameters"]
nlm01["ParameterConfidenceIntervals"]
nlm01["ParameterErrors"]
{khat -> 18243.5, ahat -> 0.902423}
{{9379.06, 27107.9}, {0.312716, 1.49213}}
{3192.72, 0.212397}
[Figure: fitted Pareto CDF for incomes up to 100000.]

Other Two Parameter Distributions

Maclachlan (2006) proposes consideration of the LogNormal and Beaman distributions. This section replicates her results for these, using her data.
Lognormal
GraphicsRow[{
  Plot[SurvivalFunction[LogNormalDistribution[0, .2], x],
   {x, 0, 2}, AxesOrigin -> {0, 0}],
  LogLogPlot[SurvivalFunction[LogNormalDistribution[0, .2], x],
   {x, 0.01, 2}, PlotRange -> All]
  }, ImageSize -> Large]
[Figure: survival function of LogNormal[0, 0.2] for x from 0 to 2, on linear axes (left) and log-log axes (right).]
GraphicsRow[{
  Plot[CDF[LogNormalDistribution[0, .2]][x], {x, 0, 2}, AxesOrigin -> {0, 0}],
  Plot[PDF[LogNormalDistribution[0, 0.2]][x], {x, 0, 2}, AxesOrigin -> {0, 0}]
  }, ImageSize -> Large]
[Figure: CDF (left) and PDF (right) of LogNormal[0, 0.2] for x from 0 to 2.]
Simplify[
 {CDF[LogNormalDistribution[m, s], x], PDF[LogNormalDistribution[m, s], x]},
 Assumptions -> x > 0]
Out: {1/2 Erfc[(m - Log[x])/(Sqrt[2] s)], E^(-(m - Log[x])^2/(2 s^2))/(Sqrt[2 Pi] x s)}

Fit Lognormal
Clear[x]
model = CDF[LogNormalDistribution[m, s], x]
Out: 1/2 Erfc[(m - Log[x])/(Sqrt[2] s)] for x > 0, and 0 otherwise
nlm02 = NonlinearModelFit[data2004fm, model, {{m, 10}, {s, 1}}, x]
Out: FittedModel[1/2 Erfc[0.763334 (10.6606 - Log[x])] for x > 0, and 0 otherwise]
nlm02["BestFitParameters"]
Out: {m -> 10.6606, s -> 0.92634}
fittedmodel2004fm = model /. %
Out: 1/2 Erfc[0.763334 (10.6606 - Log[x])] for x > 0, and 0 otherwise
Visualize Lognormal Fit

g`lognormal = Show[
  {g`data2004fm, Plot[fittedmodel2004fm, {x, 15000, 100000}]}, ImageSize -> 300]
[Figure: 2004 cdf data points with the fitted lognormal CDF.]
g2 = Show[{g`lognormal, g`pareto},
  PlotLabel -> "Compare Lognormal (Blue) and Pareto (Red)"]
[Figure: "Compare Lognormal (Blue) and Pareto (Red)": both fitted CDFs over the 2004 data.]
Show[{Plot[fittedmodel2004fm, {x, 1000, 200000}], g`pareto}]
[Figure: the fitted lognormal and Pareto CDFs for incomes from 1000 to 200000.]

Beaman CDF
beamanCDF = 1/(1 + (b - a)^3/(x - a)^3)
Out: 1/(1 + (-a + b)^3/(-a + x)^3)
Limit[beamanCDF, x -> Infinity]
Out: 1
Limit[beamanCDF, x -> a]
Out: 0
Limit[beamanCDF, x -> b]
Out: 1/2
D[beamanCDF, x] // Simplify
Out: 3 (-a + b)^3/((1 + (a - b)^3/(a - x)^3)^2 (a - x)^4)
Beaman
$F(x) = \dfrac{1}{1 + (b-a)^3/(x-a)^3}$
Transpose[{Range[12000, 100000, 100],
   1 - 1/(1 + (44 + 27)^3/(Range[12000, 100000, 100] + 27)^3)}];
ListLogLogPlot[%, Ticks -> None]
[Figure: log-log plot of the Beaman tail function.]
Fit Beaman Distribution
Beaman
Maclachlan (2006) reports that the Beaman distribution was used at duPont in the 1970s to model sales volume of products at various price points, because it gave a better fit than the lognormal. There are two parameters: a represents the minimum price point and b represents the median price point. The distribution allows negative values. See http://library.wolfram.com/infocenter/MathSource/6021/ for details. For elaboration, see http://home.manhattan.edu/~fiona.maclachlan//beaman/beaman_notebook/.
beamanDistribution[$x_, a_, b_] := 1/(1 + (b - a)^3/($x - a)^3);
incomepoints = {{15000, 0.154}, {25000, 0.283}, {35000, 0.402},
   {50000, 0.55}, {75000, 0.733}, {100000, 0.843}};
nlm03 = NonlinearModelFit[incomepoints,
   {1/(1 + (b - a)^3/($x - a)^3), a < 0, b > 25000}, {b, a}, $x];
Normal[nlm03]
Out: 1/(1 + 3.48475×10^14/(25583.6 + $x)^3)
bfp = nlm03["BestFitParameters"]
Out: {b -> 44786.9, a -> -25583.6}
These results are close to those of Maclachlan (who got values {b -> 44884.7, a -> -27515.7}). Unfortunately, the estimates are *highly* sensitive to the constraint values. In any case, we use constraints based on Maclachlan (2006) and get a pretty good fit, as illustrated here.
gbeaman = Plot[beamanDistribution[$x, a, b] /. bfp, {$x, 5000, 100000},
   PlotStyle -> {Green}, ImageSize -> 300, AxesLabel -> {"income", "cdf"}];
gbeaman2 = Show[gdata, gbeaman, ImageSize -> 300, AxesLabel -> {"income", "cdf"}]
[Figure: 2004 cdf data points against income with the fitted Beaman CDF in green.]
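The quality of the fit can be checked by evaluating the Beaman CDF at the data points. The following Python sketch plugs in the best-fit values reported above; `beaman_cdf` is an illustrative name, and no fitting is done here.

```python
def beaman_cdf(x, a, b):
    """Beaman CDF 1 / (1 + ((b - a) / (x - a))**3), evaluated here for x > a."""
    return 1.0 / (1.0 + ((b - a) / (x - a)) ** 3)

# Best-fit parameters reported in the text (a: minimum, b: median).
a_fit, b_fit = -25583.6, 44786.9

data2004 = [(15000, 0.154), (25000, 0.283), (35000, 0.402),
            (50000, 0.55), (75000, 0.733), (100000, 0.843)]

residuals = [beaman_cdf(x, a_fit, b_fit) - p for x, p in data2004]
for (x, p), r in zip(data2004, residuals):
    print(x, p, round(beaman_cdf(x, a_fit, b_fit), 4), round(r, 4))
```

By construction the CDF equals 1/2 at x = b, which is why b is read as the median.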
Two way comparison
Show[gdata, gbeaman, glognormal,
 PlotLabel -> "Beaman in Green & Lognormal in Blue"]
[Figure: "Beaman in Green & Lognormal in Blue": fitted Beaman and lognormal CDFs over the 2004 income data.]
References
Maclachlan, Fiona (2006). Investigating Power Laws with Mathematica. http://library.wolfram.com/infocenter/Conferences/6461/
