
Rhodes, Greece, June 26-July 1, 2016

Copyright 2016 by the International Society of Offshore and Polar Engineers (ISOPE)

ISBN 978-1-880653-88-3; ISSN 1098-6189

Sample Size Determination and Confidence Interval Derivation for Exponential Distribution

Department of Engineering and Safety,

University of Tromsø, Tromsø, Norway

ABSTRACT

The exponential distribution is the simplest and most commonly used distribution in risk and reliability data analysis. This paper investigates the distribution of the estimator obtained from maximum likelihood estimation (MLE). This estimator distribution shows that the MLE estimator is biased and should be corrected for small-sample problems. The derived distribution is applied to determine the optimal sample size for experiment design and to derive a confidence interval. This confidence interval is compared with the confidence intervals derived from the Fisher information matrix and the likelihood ratio method.

KEY WORDS: Exponential Distribution; Sample Size; Confidence Interval; Estimator Distribution.

NOMENCLATURE

Sample size
Time to failure
Variance
PDF function
Likelihood function
Gamma distribution
Inverse Gamma distribution
CDF Cumulative distribution function

INTRODUCTION

The exponential distribution is a commonly used distribution to model failure data (Meeker and Escobar, 1998, Yang and Sirvanci, 1977). This simple distribution model with a constant failure rate has been widely applied in life testing and spare part management (Barlow et al., 1996). The life of an electronic system or of some complex engineering systems tends to follow the exponential distribution. Moreover, its parameter estimation is simpler than for other distributions.

Given failure data, MLE is the common method for parameter estimation. It is generally known that the MLE estimator has good consistency: when the sample size becomes large, the estimator asymptotically reaches the true parameters. However, in terms of bias, the maximum likelihood method does not perform well. The estimator from MLE is biased (Kendall et al., 2004). The bias implies that, for small data sizes, the MLE tends to deviate from the true parameters. Using this estimator could carry a high risk of making a wrong decision. The bias problem should therefore be a concern in risk and reliability engineering. Figure 1 demonstrates how the estimator is distributed when the sample size is only 5, which shows how large the risk could be if one of these estimates is chosen. The estimator is widely dispersed around the true parameter (dashed line).

Figure 1. Estimator Distribution (histogram; vertical axis: Count)

This bias problem of the MLE estimator has been ignored by many practitioners in risk and reliability. Although some publications try to develop methods to correct the bias, most state-of-the-art applications still use the MLE for small-sample problems. The application is not wrong, but it is not reasonable.

The exponential distribution is the simplest distribution, and the exact distribution of its MLE estimator can be derived. The derived estimator distribution can be used to find the mean value and the exact variance of the estimator. For other distributions, for example the Weibull distribution, one has to use simulation or an approximate method to estimate the variance. This paper derives the exact distribution of the MLE estimator of the exponential distribution. This exact distribution is used to determine the sample size for experiment design and to correct the MLE estimator. It is then used to find a confidence interval for the estimator.

EXACT DISTRIBUTION OF EXPONENTIAL ESTIMATOR

Although the MLE estimator is biased, the bias can be corrected if the expectation of the estimator, or the distribution of the estimator, is known. Unfortunately, in reliability engineering the MLE for most distributions does not have an explicit expression, so the expectation of the estimator cannot be obtained. Fortunately, for the exponential distribution, the distribution of the MLE estimator can be obtained from the Gamma distribution. Some publications on statistics have obtained similar results by using the moment generating method for censored data (Childs et al., 2012, Cheng et al., 2013, Gupta and Kundu, 2007). This paper considers complete data. The probability density function of the Gamma distribution with shape $k$ and scale $\theta$ is

$$f(x; k, \theta) = \frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\,\theta^{k}}; \quad x > 0 \qquad (1)$$

Let $X_1, \dots, X_n$ be independent samples drawn from the Gamma distribution $\mathrm{Gamma}(k, \theta)$. The summation of the random variables still follows a Gamma distribution:

$$\sum_{i=1}^{n} X_i \sim \mathrm{Gamma}(nk, \theta) \qquad (2)$$

It is straightforward to see that the exponential distribution is a special case of the Gamma distribution with $k = 1$. Therefore, the summation of $n$ random variables $t_1, \dots, t_n$ from an exponential distribution with failure rate $\lambda$ is

$$S = \sum_{i=1}^{n} t_i \sim \mathrm{Gamma}(n, 1/\lambda) \qquad (3)$$

The probability density function is

$$f(s) = \frac{\lambda^{n} s^{n-1} e^{-\lambda s}}{\Gamma(n)} \qquad (4)$$

The maximum likelihood estimator of the exponential distribution for $n$ observed samples maximizes the likelihood

$$L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda t_i} \qquad (5)$$

and is

$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i} \qquad (6)$$

Equation (6) can be rewritten as

$$\hat{\lambda} = \frac{n}{S} \qquad (7)$$

The sum $S$ follows the Gamma distribution in the form of $\mathrm{Gamma}(n, 1/\lambda)$. Thus $1/S$ follows an inverse Gamma distribution with $E[1/S] = \lambda/(n-1)$ for $n > 1$. Therefore, the exact expectation of the estimator is

$$E[\hat{\lambda}] = \frac{n}{n-1}\,\lambda \qquad (8)$$

From (8), it is readily seen that the maximum likelihood estimator for the exponential distribution is biased. For small sample sizes, the estimator can be corrected by multiplying by a correction factor, i.e.

$$\hat{\lambda}_{c} = \frac{n-1}{n}\,\hat{\lambda} \qquad (9)$$

Similarly, the variance of the estimator can be derived:

$$\mathrm{Var}(\hat{\lambda}) = \frac{n^{2}\lambda^{2}}{(n-1)^{2}(n-2)}; \quad n > 2 \qquad (10)$$

The Cramér-Rao (CR) lower bound for the exponential distribution is

$$\mathrm{Var}(\hat{\lambda}) \ge \frac{\lambda^{2}}{n} \qquad (11)$$

Since $n^{2}/((n-1)^{2}(n-2)) > 1/n$, the CR lower bound is lower than the true variance of $\hat{\lambda}$. The CR variance is derived from the Fisher information matrix.

The true distribution of the maximum likelihood estimator can be derived from the distribution of the sum in (2). This paper omits the detailed procedure, which can be found in textbooks of probability and statistics (Rohatgi, 1975). The estimator distribution is

$$f(\hat{\lambda}) = \frac{(n\lambda)^{n}}{\Gamma(n)}\,\hat{\lambda}^{-(n+1)}\,e^{-n\lambda/\hat{\lambda}}; \quad \hat{\lambda} > 0 \qquad (12)$$

The CDF of (12) is not easy to compute. One can instead use (2) to get the CDF for (12) indirectly.

The estimator distribution can be used to determine the number of samples in experiment design when the life of the product is known to follow an exponential distribution. Equation (12) depends on the unknown parameter $\lambda$. If the criterion to choose the sample size is based on the ratio

$$r = \hat{\lambda}/\lambda \qquad (13)$$

the sample size can be chosen without knowing $\lambda$. The probability density function of $r$ can be derived as

$$f(r) = \frac{n^{n}}{\Gamma(n)}\,r^{-(n+1)}\,e^{-n/r}; \quad r > 0 \qquad (14)$$

The only unknown parameter in (14) is the sample size $n$, i.e. (14) shows the distribution of $r$ for each $n$. Figure 2 shows the PDF $f(r)$ against $r$.
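Because the ratio of the estimator to the true failure rate has a distribution that depends only on the sample size, the probability that the estimate falls within a given relative error can be evaluated for any candidate n, and the smallest adequate n can be found by stepping upward. The following is a minimal Python sketch of that search, not the authors' implementation. It assumes the tolerance means (1 - eps)·λ <= λ̂ <= (1 + eps)·λ, uses the fact that λ times the sum of the n failure times follows Gamma(n, 1), and evaluates the Gamma CDF for integer shape through a finite sum. The names `erlang_cdf`, `prob_within`, `min_sample_size` and the cap `n_max` are illustrative only.

```python
import math


def erlang_cdf(x: float, n: int) -> float:
    """P(G <= x) for G ~ Gamma(shape=n, scale=1) with integer n.

    Uses P(G <= x) = 1 - sum_{j<n} exp(-x) * x**j / j!, with every term
    evaluated in log space so that large n does not overflow.
    """
    if x <= 0.0:
        return 0.0
    log_x = math.log(x)
    tail = sum(math.exp(j * log_x - x - math.lgamma(j + 1)) for j in range(n))
    return min(1.0, max(0.0, 1.0 - tail))


def prob_within(n: int, eps: float) -> float:
    """Probability that the MLE lies within relative error eps of the true rate.

    Assumption: the event is (1 - eps) * lam <= lam_hat <= (1 + eps) * lam.
    With lam_hat = n / S and lam * S ~ Gamma(n, 1) this is equivalent to
    n / (1 + eps) <= lam * S <= n / (1 - eps), so lam cancels out.
    """
    lower = n / (1.0 + eps)
    upper = n / (1.0 - eps)
    return erlang_cdf(upper, n) - erlang_cdf(lower, n)


def min_sample_size(eps: float, level: float = 0.95, n_max: int = 5000) -> int:
    """Smallest n, searched upward from 1, whose probability reaches `level`."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    for n in range(1, n_max + 1):
        if prob_within(n, eps) >= level:
            return n
    raise ValueError("no sample size up to n_max meets the requirement")


if __name__ == "__main__":
    for eps in (0.5, 0.2):
        print(f"relative error {eps:.0%}: n = {min_sample_size(eps)}")
```

The search needs no value of λ, which is the point of working with the ratio. A different definition of the tolerance band would change the bounds inside `prob_within` but not the structure of the loop.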

Figure 2. Distribution of Estimator Ratio (curves for n = 5, 10, 20, 50 and 400; horizontal axis: Efficiency; vertical axis: PDF)

Alternatively, for convenience of computation, one can choose the criterion as

$$s = \lambda/\hat{\lambda} \qquad (15)$$

It can be deduced from (12) that $x = ns$ has the density

$$f(x) = \frac{x^{n-1} e^{-x}}{\Gamma(n)}; \quad x > 0 \qquad (16)$$

Equation (16) is still a Gamma distribution, with $\theta = 1$ and $k = n$ in the Gamma distribution (1). It is easier to compute than (12).

Suppose the relative error tolerated is $\varepsilon$, so that the bounds on $s$ are $s_L = 1/(1+\varepsilon)$ and $s_U = 1/(1-\varepsilon)$. If the sample size used is $n$, then by (16) the probability that the test failure rate is near the true value is given by the Gamma distribution between $X_L = n s_L$ and $X_U = n s_U$, as shown by the shaded area in Figure 3:

$$P(X_L < ns < X_U) = \int_{X_L}^{X_U} \frac{x^{n-1} e^{-x}}{\Gamma(n)}\,dx \qquad (17)$$

Given a certain confidence level, e.g. 95%, we start from n = 1, obtain $X_L$ and $X_U$ and calculate (17). If the result does not reach the requirement, we increase n and repeat the procedure until the requirement is met.

Figure 3. Distribution of ns (vertical axis: Gamma PDF; shaded region: X(L) < P < X(U))

This procedure requires iterative computation. Table 1 tabulates the minimum required sample size for various relative errors. Although the method requires this iterative computation, its advantage is evident: it is accurate without requiring any asymptotic distribution, and it does not require knowing the parameter value.

Table 1. Minimum required sample size for various relative errors

No.   Relative Error   Upper bound   Lower bound   Sample size
1     5%               1.05          0.95          1541
2     10%              1.11          0.91          385
3     20%              1.25          0.833         101
4     50%              2             0.67          21

CONFIDENCE INTERVAL DERIVATION

The distribution of the estimator can be applied to find a confidence interval. In general, confidence interval derivation in statistics uses the asymptotic normality of the estimator, i.e. according to the central limit theorem, the distribution of the estimator is normal when the sample size is infinite. The Fisher information matrix method and the likelihood ratio method are based on this. It is evident that these methods require the sample size to be sufficiently large. Unlike the Fisher and likelihood ratio methods, this section derives a confidence interval from the estimator distribution.

Estimator Distribution Based Method

Since the exact distribution of the estimator is known as (12), the confidence interval of the estimator is readily derived. Instead of using (12), this paper uses (2) to derive the confidence interval. Since the failure rate estimator is inversely proportional to the sum in (2), we first find the interval for the sum $S = \sum t_i$. It is known that $S$ follows the Gamma distribution $\mathrm{Gamma}(n, 1/\lambda)$. The unknown $\lambda$ can be replaced by the maximum likelihood estimator:

$$S \sim \mathrm{Gamma}(n, 1/\hat{\lambda}), \qquad (18)$$

The confidence interval with level $1-\alpha$ is then

$$\frac{n}{G^{-1}(1-\alpha/2;\, n, 1/\hat{\lambda})} \le \lambda \le \frac{n}{G^{-1}(\alpha/2;\, n, 1/\hat{\lambda})} \qquad (19)$$

where $G^{-1}(p;\, k, \theta)$ denotes the $p$ quantile of the Gamma distribution (1).
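The plug-in construction of (18) and (19) can be explored numerically. The following Python sketch is one possible reading of those equations rather than the authors' code: it assumes the sum of the failure times is treated as Gamma with shape n and scale 1/λ̂, takes its α/2 and 1 - α/2 quantiles, and inverts them through λ = n/S. It then estimates by simulation how often the interval covers the true rate, with and without the (n - 1)/n correction of the estimate. All function names are hypothetical, and the quantile is found by plain bisection instead of a library routine.

```python
import math
import random


def erlang_cdf(x: float, n: int) -> float:
    """P(G <= x) for G ~ Gamma(shape=n, scale=1), integer n, via a finite sum."""
    if x <= 0.0:
        return 0.0
    log_x = math.log(x)
    tail = sum(math.exp(j * log_x - x - math.lgamma(j + 1)) for j in range(n))
    return min(1.0, max(0.0, 1.0 - tail))


def erlang_quantile(p: float, n: int) -> float:
    """Quantile of Gamma(shape=n, scale=1), found by bisection on the CDF."""
    lo, hi = 0.0, float(n)
    while erlang_cdf(hi, n) < p:  # widen the bracket until it contains p
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def plug_in_interval(rate_estimate: float, n: int, alpha: float = 0.05):
    """Interval for the failure rate, treating `rate_estimate` as the true rate.

    Assumed reading of (18)-(19): S ~ Gamma(n, scale=1/rate_estimate), and the
    quantiles of S are mapped to the rate through lam = n / S.
    """
    s_low = erlang_quantile(alpha / 2.0, n) / rate_estimate
    s_high = erlang_quantile(1.0 - alpha / 2.0, n) / rate_estimate
    return n / s_high, n / s_low


def coverage(true_rate: float, n: int, reps: int, corrected: bool, seed: int = 1) -> float:
    """Fraction of simulated samples whose interval contains `true_rate`."""
    rng = random.Random(seed)
    # Both bounds are proportional to the estimate, so compute them once.
    low_mult, high_mult = plug_in_interval(1.0, n)
    hits = 0
    for _ in range(reps):
        total_time = sum(rng.expovariate(true_rate) for _ in range(n))
        estimate = n / total_time              # MLE of the failure rate
        if corrected:
            estimate *= (n - 1) / n            # bias-corrected estimate
        if low_mult * estimate <= true_rate <= high_mult * estimate:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for flag in (False, True):
        print("corrected" if flag else "plain MLE", coverage(1.0, 5, 20000, flag))
```

Because the bounds scale linearly with the plugged-in estimate, any bias in that estimate shifts the whole interval, which is why the choice between the plain and the corrected estimate matters for small n.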

Using the estimator as the true parameter is risky. The performance of the confidence interval relies on how close the estimator is to the true parameter. This is a disadvantage of this method.

Fisher Information Matrix Method

The variance of the estimator can be obtained from the Fisher information matrix (Lehmann and Casella, 1998). This method is general and not specific to the exponential distribution. The log-likelihood function of the exponential distribution is:

$$\ln L(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^{n} t_i \qquad (20)$$

The second derivative of the log-likelihood function is

$$\frac{\partial^{2} \ln L}{\partial \lambda^{2}} = -\frac{n}{\lambda^{2}} \qquad (21)$$

Thus the variance is

$$\mathrm{Var}(\hat{\lambda}) = \left[-\frac{\partial^{2} \ln L}{\partial \lambda^{2}}\right]^{-1} = \frac{\lambda^{2}}{n} \qquad (22)$$

According to the central limit theorem, when the sample size is infinite, the estimator follows the normal distribution

$$\hat{\lambda} \sim N\!\left(\lambda, \frac{\lambda^{2}}{n}\right) \qquad (23)$$

The quantity $(\hat{\lambda} - \lambda)/(\hat{\lambda}/\sqrt{n})$ follows the standard normal distribution. Let $z_{\alpha/2}$ be the upper $\alpha/2$ percentage point of the standard normal distribution. The confidence interval of $\lambda$ can be derived as

$$\hat{\lambda} \pm z_{\alpha/2}\,\frac{\hat{\lambda}}{\sqrt{n}} \qquad (24)$$

[...] not realistic. In practice, one can use $\ln \hat{\lambda}$ instead of $\hat{\lambda}$ to estimate the interval (Meeker and Escobar, 1998). By the Delta method, the variance of $\ln \hat{\lambda}$ is

$$\mathrm{Var}(\ln \hat{\lambda}) \approx \frac{\mathrm{Var}(\hat{\lambda})}{\hat{\lambda}^{2}} = \frac{1}{n} \qquad (25)$$

The standard error of $\ln \hat{\lambda}$ is therefore $1/\sqrt{n}$. A confidence interval of $\lambda$ can be derived as

$$\left[\hat{\lambda}\,e^{-z_{\alpha/2}/\sqrt{n}},\; \hat{\lambda}\,e^{z_{\alpha/2}/\sqrt{n}}\right] \qquad (26)$$

Similar to the estimator distribution method, this method also uses the estimator as the true parameter.

Likelihood Ratio Method

When the sample size is infinite, minus twice the logarithm of the ratio between the likelihood value at the true parameter and the likelihood value at the estimator follows the chi-square distribution with one degree of freedom, $\chi^{2}_{1}$. More detail can be found in (Lehmann and Romano, 2005). This ratio can also be used to derive a confidence interval, as some commercial software does. The likelihood ratio method is claimed to be more suitable for small sample sizes than the Fisher method. Let $L(\lambda)$ denote the likelihood function. For the exponential distribution, it has

$$-2 \ln \frac{L(\lambda)}{L(\hat{\lambda})} = \chi^{2}_{1;\,1-\alpha} \qquad (27)$$

The likelihood is $L(\lambda) = \lambda^{n} e^{-\lambda \sum t_i}$. Equation (27) can be rewritten as

$$L(\lambda) = L(\hat{\lambda})\,e^{-\chi^{2}_{1;\,1-\alpha}/2} \qquad (28)$$

It is a nonlinear equation. For the function

$$g(\lambda) = L(\lambda) - L(\hat{\lambda})\,e^{-\chi^{2}_{1;\,1-\alpha}/2}, \qquad (29)$$

it is easy to see that (29) has only one maximum, since the likelihood function of the exponential distribution has only one maximum. This maximum is at the MLE estimator $\hat{\lambda}$, so it is the global maximum of (29). This property facilitates finding the upper and lower bounds of $\lambda$. Figure 4 shows $g(\lambda)$ for an exponential distribution, and shows that $g(\lambda) = 0$ has two solutions in $\lambda$. A confidence interval of $\lambda$ can be derived by taking the lower solution as the lower bound and the higher solution as the upper bound. The disadvantage of the method is that it requires a numerical method to find the roots of (28). The numerical method will have difficulty when the sample size is big: the factors $\lambda^{n}$ and $e^{-\lambda \sum t_i}$ become extremely big or small when n is large.

Figure 4. Numerical plot of likelihood function (horizontal axis: Lambda; vertical axis: Likelihood)

SIMULATION STUDY

The Fisher-based and the likelihood-based methods are the two most common confidence intervals in applications. The estimator-distribution-based method derived in this paper is rarely applied or discussed in the state of the art. Of the three methods, the estimator distribution method relies least on the normality assumption. This simulation study evaluates the performance of the three methods.

The simulation chooses three exponential distributions, with parameters greater than 1, equal to 1 and less than 1. The sample size is chosen from 5 to 500. Choosing larger samples is not necessary, as the performance will

become similar when the data size is sufficiently large. For each sample size, a total of 50,000 iterations has been run.

The first simulation tests the bias correction (9). When the true parameter is 1, over the 50,000 iterations for sample size 5, the mean of the adjusted estimator is 1.0051, while the mean of the unadjusted estimator is 1.2564. The adjusted estimator significantly outperforms the original MLE estimator. Figure 5 shows the distributions of the two estimators, where the distribution of the adjusted estimator is shifted to the left proportionally. For extremely small sample sizes, the bias correction is very necessary.

Figure 5. Adjusted Estimator vs the original

For the different confidence intervals, the performance is defined as the percentage of intervals that cover the true parameter out of the 50,000 iterations. This iteration number is large enough; more iterations show similar results. The simulation results are shown in Table 2.

Table 2. CI Simulation Results (0.95 confidence level)

Exponential     Size   Estimator   Lik Ratio   Fisher
Distribution           Method
2               5      0.8963      *           0.9310
2               10     0.9231      0.9493      0.9429
2               20     0.9360      0.9479      0.9437
2               50     0.9442      0.9487      0.9446
2               100    0.9470      0.9495      0.9490
2               500    0.9492      0.9505      0.9500
1               5      0.8989      0.9454      0.9316
1               10     0.9234      0.9422      0.9504
1               20     0.9374      0.9500      0.9468
1               50     0.9444      0.9489      0.9477
1               100    0.9470      0.9493      0.9494
1               500    0.9492      *           0.9493
0.1             5      0.9006      0.9485      0.9339
0.1             10     0.9235      0.9464      0.9402
0.1             20     0.9363      0.9446      0.9451
0.1             50     0.9449      0.9458      0.9490
0.1             100    0.9474      0.9477      0.9494
0.1             500    0.9493      1           0.9494

1. * denotes that, for this method, the computer running the simulation overflowed.

2. The last digit of each number is italicized, as it differs when a new simulation is run. The randomness in this digit is due to the precision limitation of the simulation software.

As shown in Table 2, for extremely small sample sizes the estimator distribution method shows the worst performance. When the sample size increases, the performance improves. The performances are similar when the data size is 500. The likelihood ratio method shows the best performance: it has a higher coverage percentage than the other two methods when the data size is extremely small. When the data size is big, the three methods show almost equal performance. The Fisher method is very stable, although for extremely small data sizes it does not outperform the likelihood ratio method. In some cases the likelihood ratio method does not work because of numerical overflow: a term of the likelihood is too big for the numerical software to handle.

When the estimator distribution method uses the true parameter in (19), its coverage almost reaches 1, i.e. it always covers the true parameter. The estimator method relies too heavily on the closeness of the estimator to the true parameter. When the sample size is big, the estimator is close to the true parameter, and therefore the confidence interval becomes better. When the estimator deviates largely from the true parameter, the method performs poorly.

The next simulation uses the unbiased estimator for the estimator distribution method. The simulation only runs for small data sizes, as for large sizes the biased estimator works well. The results, shown in Table 3, indicate that the confidence interval works well. It is much better than the performance using the biased MLE estimator. The performance with the unbiased estimator approaches the performance of the likelihood ratio method.

Table 3. CI using Biasness Correction Estimator (0.95 confidence level)

Exponential     Size   Estimator   Unbiased
Distribution                       Estimator Dis.
2               5      0.8995      0.9451
2               20     0.9356      0.9494
1               5      0.8958      0.9434
1               20     0.9356      0.9470
0.1             5      0.9009      0.9476
0.1             20     0.9378      0.9513

The simulation shows that, for small data sizes, using the bias-corrected estimator is very necessary both for point estimation and for confidence interval derivation. In reliability and risk data analysis, when the data size is small, the MLE should be corrected to be unbiased; otherwise, there is a high risk of making a wrong decision.

CONCLUSION

The exact distribution can help to define the optimal sample size for experiment design. However, as the simulation implies, the confidence interval derived from the estimator distribution is not able to outperform the Fisher information matrix method and the likelihood ratio method when the original maximum likelihood estimator is used. The confidence interval is significantly improved by using the bias-corrected estimator. The paper shows the importance of bias correction for small-sample problems. Using the biased maximum likelihood estimator carries a high risk of leading to wrong decisions in risk and reliability data analysis.

ACKNOWLEDGMENT

The authors would like to thank Professor Gilberto for reconsidering and reviewing this paper. Special thanks are owed to the lab staff of the Department of Engineering at the University of Tromsø for motivating this paper.
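The Fisher-based interval on the logarithmic scale and the likelihood ratio interval compared in the simulation study can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind Tables 2 and 3: it assumes a 0.95 level, the log-scale Fisher interval λ̂·exp(±z/√n), and a likelihood ratio interval obtained by solving 2n(r - 1 - ln r) = χ² for r = λ/λ̂, which is the log-likelihood form of the ratio equation for complete exponential data. Function names and the bisection helper are hypothetical.

```python
import math
import random

CHI2_1_95 = 3.841458820694124  # 95% quantile of chi-square with 1 d.o.f.
Z_975 = 1.959963984540054      # 97.5% quantile of the standard normal


def fisher_log_interval(estimate: float, n: int):
    """Normal-approximation interval built on ln(estimate), variance 1/n."""
    half_width = Z_975 / math.sqrt(n)
    return estimate * math.exp(-half_width), estimate * math.exp(half_width)


def log_lik_drop(ratio: float, n: int) -> float:
    """2 * [lnL(lam_hat) - lnL(lam)] expressed through ratio = lam / lam_hat."""
    return 2.0 * n * (ratio - 1.0 - math.log(ratio))


def bisect_root(func, lo: float, hi: float, iters: int = 200) -> float:
    """Root of func on [lo, hi]; func(lo) and func(hi) must differ in sign."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def likelihood_ratio_interval(estimate: float, n: int):
    """Both roots of the likelihood ratio equation, scaled by the estimate."""
    def gap(ratio: float) -> float:
        return log_lik_drop(ratio, n) - CHI2_1_95

    lower = bisect_root(gap, 1e-12, 1.0)   # gap > 0 near 0, gap(1) < 0
    hi = 2.0
    while gap(hi) < 0:                     # widen until the sign changes
        hi *= 2.0
    upper = bisect_root(gap, 1.0, hi)
    return lower * estimate, upper * estimate


def coverage(true_rate: float, n: int, reps: int, seed: int = 1) -> dict:
    """Share of simulated samples whose interval contains `true_rate`."""
    rng = random.Random(seed)
    bounds = {
        "fisher": fisher_log_interval(1.0, n),
        "likelihood_ratio": likelihood_ratio_interval(1.0, n),
    }
    hits = {name: 0 for name in bounds}
    for _ in range(reps):
        estimate = n / sum(rng.expovariate(true_rate) for _ in range(n))
        for name, (low, high) in bounds.items():
            if low * estimate <= true_rate <= high * estimate:
                hits[name] += 1
    return {name: count / reps for name, count in hits.items()}


if __name__ == "__main__":
    print(coverage(1.0, 5, 20000))
```

Working with the log-likelihood difference rather than the raw likelihood keeps every quantity in a moderate numerical range even for large n, which is one way to sidestep overflow when the roots are sought numerically.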

REFERENCES

Barlow, RE., Proschan, F. & Hunter, LC. (1996). Mathematical theory of reliability. Series: Classics in applied mathematics, vol. 17. Philadelphia: SIAM.

Cheng, CH., Chen, JY. & Bai, JM. (2013). Exact inferences of the two-parameter exponential distribution and Pareto distribution with censored data. Journal of Applied Statistics, 40(7), pp. 1464-1479.

Childs, A., Balakrishnan, N. & Chandrasekar, B. (2012). Exact distribution of the MLEs of the parameters and of the quantiles of two-parameter exponential distribution under hybrid censoring. Statistics, 46(4), pp. 441-458.

Gupta, RD. & Kundu, D. (2007). Generalized exponential distribution: Existing results and some recent developments. Journal of Statistical Planning and Inference, 137(11), pp. 3537-3547.

Kendall, MG., O'Hagan, A. & Forster, J. (2004). Kendall's advanced theory of statistics. 2nd ed. Series: Kendall's library of statistics. London: Arnold.

Lehmann, EL. & Casella, G. (1998). Theory of point estimation. 2nd ed. Series: Springer texts in statistics. New York: Springer.

Lehmann, EL. & Romano, JP. (2005). Testing statistical hypotheses. 3rd ed. Series: Springer texts in statistics. New York: Springer.

Meeker, WQ. & Escobar, LA. (1998). Statistical methods for reliability data. Series: Wiley series in probability and statistics, Applied probability and statistics section. New York: Wiley.

Rohatgi, VK. (1975). An introduction to probability theory and mathematical statistics. Series: Wiley series in probability and mathematical statistics. New York: Wiley.

Yang, G. & Sirvanci, M. (1977). Estimation of a Time-Truncated Exponential Parameter Used in Life Testing. Journal of the American Statistical Association, 72(358), pp. 444-447. doi: 10.2307/2286816.
