
In probability theory and statistics, the Poisson distribution (pronounced [pwasɔ̃]), also called the Poisson law of small numbers, is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event. (The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area or volume.)

The distribution was first introduced by Siméon Denis Poisson (1781–1840) and published, together with his probability theory, in 1838 in his work Recherches sur la probabilité des jugements en matière criminelle et en matière civile (Research on the Probability of Judgments in Criminal and Civil Matters). The work focused on certain random variables N that count, among other things, the number of discrete occurrences (sometimes called arrivals) that take place during a time interval of given length. If the expected number of occurrences in this interval is $\lambda$, then the probability that there are exactly n occurrences (n = 0, 1, 2, ...) is $f(n; \lambda) = \lambda^n e^{-\lambda} / n!$. As a function of n, this is the probability mass function.

The Poisson distribution can be derived as a limiting case of the binomial distribution. It can be applied to systems with a large number of possible events, each of which is rare. A classic example is the nuclear decay of atoms. The Poisson distribution is sometimes called a Poissonian, analogous to the term Gaussian for a Gauss or normal distribution.
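A minimal sketch of the probability mass function, assuming the standard form $P(N = n) = \lambda^n e^{-\lambda} / n!$ with mean $\lambda > 0$. The helper name `poisson_pmf` is illustrative and not from the source; the computation runs in log space so that large values of n do not overflow.

```python
import math

def poisson_pmf(n: int, lam: float) -> float:
    """P(N = n) for a Poisson variable with mean lam > 0 (illustrative helper)."""
    if n < 0:
        return 0.0
    # log of lam**n * exp(-lam) / n!, using lgamma(n + 1) = log(n!)
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

# The probabilities over n = 0, 1, 2, ... should sum to (almost exactly) 1.
lam = 3.0  # illustrative mean
probabilities = [poisson_pmf(n, lam) for n in range(50)]
print(sum(probabilities))
```

Because the mass beyond n = 50 is negligible for a mean of 3, the printed sum is a quick sanity check that the function is a proper distribution.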

Poisson noise and characterizing small occurrences

The correlation of the mean and standard deviation in counting independent, discrete occurrences is useful scientifically. By monitoring how the fluctuations vary with the mean signal, one can estimate the contribution of a single occurrence, even if that contribution is too small to be detected directly. For example, the charge e on an electron can be estimated by correlating the magnitude of an electric current with its shot noise. If N electrons pass a point in a given time t on the average, the mean current is $I = eN/t$; since the current fluctuations should be of the order $\sigma_I = e\sqrt{N}/t$ (i.e. the standard deviation of the Poisson process), the charge e can be estimated from the ratio $t\sigma_I^2 / I$.

An everyday example is the graininess that appears as photographs are enlarged; the graininess is due to Poisson fluctuations in the number of reduced silver grains, not to the individual grains themselves. By correlating the graininess with the degree of enlargement, one can estimate the contribution of an individual grain (which is otherwise too small to be seen unaided). Many other molecular applications of Poisson noise have been developed, e.g., estimating the number density of receptor molecules in a cell membrane.
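The shot-noise argument can be tried on simulated data. The sketch below assumes that the electron count per window is Poisson distributed and recovers the per-electron charge from the ratio t·var(I)/mean(I). The function names, the seed, and the numerical values are illustrative assumptions, and the sampler is Knuth's multiplication method, which is only practical for moderate means.

```python
import math
import random

def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def estimate_unit_charge(e_true, mean_count, t, windows, seed=0):
    """Estimate the charge per event from current fluctuations alone."""
    rng = random.Random(seed)
    # Current measured in each window: charge per event * events / time.
    currents = [e_true * sample_poisson(mean_count, rng) / t
                for _ in range(windows)]
    mean_i = sum(currents) / windows
    var_i = sum((c - mean_i) ** 2 for c in currents) / (windows - 1)
    return t * var_i / mean_i

# Illustrative run: 20000 windows with 40 electrons per window on average.
print(estimate_unit_charge(e_true=1.6e-19, mean_count=40.0, t=1.0, windows=20000))
```

The estimate uses only the mean and variance of the measured current, never the individual counts, which is the point of the technique.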

Related distributions

The Poisson distribution can be derived as a limiting case of the binomial distribution as the number of trials goes to infinity and the expected number of successes remains fixed (see the law of rare events below). Therefore it can be used as an approximation of the binomial distribution if n is sufficiently large and p is sufficiently small. There is a rule of thumb stating that the Poisson distribution is a good approximation of the binomial distribution if n is at least 20 and p is smaller than or equal to 0.05, and an excellent approximation if $n \ge 100$ and $np \le 10$.[2]

For sufficiently large values of $\lambda$ (say $\lambda > 1000$), the normal distribution with mean $\lambda$ and variance $\lambda$ (standard deviation $\sqrt{\lambda}$) is an excellent approximation to the Poisson distribution. If $\lambda$ is greater than about 10, then the normal distribution is a good approximation if an appropriate continuity correction is performed, i.e., $P(X \le x)$, where (lower-case) x is a non-negative integer, is replaced by $P(X \le x + 0.5)$.
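To see what the continuity correction buys, the following sketch compares the exact Poisson cumulative probability $P(X \le x)$ with the normal approximation evaluated at x and at x + 0.5. The mean of 20 and the helper names are illustrative choices, not taken from the source.

```python
import math

def poisson_cdf(x: int, lam: float) -> float:
    """Exact P(X <= x), summing the mass function in log space."""
    return sum(
        math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        for k in range(x + 1)
    )

def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

lam = 20.0           # illustrative mean, above the "about 10" threshold
sd = math.sqrt(lam)  # Poisson standard deviation
for x in (15, 20, 25):
    exact = poisson_cdf(x, lam)
    plain = normal_cdf(x, lam, sd)            # no correction
    corrected = normal_cdf(x + 0.5, lam, sd)  # continuity correction
    print(x, round(exact, 4), round(plain, 4), round(corrected, 4))
```

At x equal to the mean, the uncorrected approximation returns exactly 0.5, while the exact value is noticeably larger; the corrected value lands much closer.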

Variance-stabilizing transformation: When a variable is Poisson distributed with mean $\lambda$, its square root is approximately normally distributed with expected value of about $\sqrt{\lambda}$ and variance of about 1/4.[3] Under this transformation, the convergence to normality is far faster than for the untransformed variable. Other, slightly more complicated, variance-stabilizing transformations are available,[4] one of which is the Anscombe transform. See Data transformation (statistics) for more general uses of transformations.

If the number of arrivals in a given time interval [0, t] follows the Poisson distribution with mean $\lambda t$, then the lengths of the inter-arrival times follow the exponential distribution with mean $1/\lambda$.
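Both statements can be checked with a small simulation. The sketch below builds arrival streams from exponential gaps with mean 1/rate, counts the arrivals in [0, t], and then looks at the mean and variance of the counts and of their square roots. The rate, interval length, seed, and helper names are illustrative assumptions.

```python
import math
import random

def count_arrivals(rate: float, t: float, rng: random.Random) -> int:
    """Count arrivals in [0, t] when gaps are exponential with mean 1/rate."""
    clock = rng.expovariate(rate)  # time of the first arrival
    count = 0
    while clock <= t:
        count += 1
        clock += rng.expovariate(rate)
    return count

def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v

rng = random.Random(42)
rate, t = 2.0, 5.0  # illustrative values; expected count is rate * t = 10
counts = [count_arrivals(rate, t, rng) for _ in range(20000)]
count_mean, count_var = mean_and_variance(counts)
root_mean, root_var = mean_and_variance([math.sqrt(c) for c in counts])
print(count_mean, count_var)  # both should be near rate * t
print(root_mean, root_var)    # near sqrt(rate * t) and near 1/4
```

The counts have roughly equal mean and variance, as a Poisson variable should, while the variance of the square roots stays near 1/4 regardless of the chosen rate.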

Occurrence

The Poisson distribution arises in connection with Poisson processes. It applies to various phenomena of discrete properties (that is, those that may happen 0, 1, 2, 3, ... times during a given period of time or in a given area) whenever the probability of the phenomenon happening is constant in time or space. Examples of events that may be modelled as a Poisson distribution include:

- The number of soldiers killed by horse-kicks each year in each corps in the Prussian cavalry. This example was made famous by a book of Ladislaus Josephovich Bortkiewicz (1868–1931).
- The number of phone calls at a call centre per minute.
- Under an assumption of homogeneity, the number of times a web server is accessed per minute.
- The number of mutations in a given stretch of DNA after a certain amount of radiation.
- The proportion of cells that will be infected at a given multiplicity of infection.

How does this distribution arise? The law of rare events

[Figure: Comparison of the Poisson distribution (black dots) and the binomial distribution with n=10 (red line), n=20 (blue line), n=1000 (green line). All distributions have a mean of 5. The horizontal axis shows the number of events k. Notice that as n gets larger, the Poisson distribution becomes an increasingly better approximation for the binomial distribution with the same mean.]

The approximation of the binomial distribution by the Poisson distribution is sometimes known as the law of rare events, since each of the n individual Bernoulli events rarely occurs. The name may be misleading because the total count of success events in a Poisson process need not be rare if the parameter np is not small. For example, the number of telephone calls to a busy switchboard in one hour follows a Poisson distribution with the events appearing frequent to the operator, but they are rare from the point of view of the average member of the population, who is very unlikely to make a call to that switchboard in that hour.
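The convergence described for the figure can be quantified. This sketch computes the total variation distance between a binomial distribution with mean 5 and the Poisson distribution with the same mean, for the three values of n mentioned in the caption. Total variation distance is a choice made here for illustration, and the function names are hypothetical.

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of k successes in n trials (0 < p < 1)."""
    if k > n:
        return 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k events with mean lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def total_variation(n: int, mean: float, k_max: int = 200) -> float:
    """Half the summed absolute difference between the two mass functions."""
    p = mean / n  # keeps the binomial mean fixed at n * p = mean
    return 0.5 * sum(abs(binom_pmf(k, n, p) - poisson_pmf(k, mean))
                     for k in range(k_max + 1))

for n in (10, 20, 1000):
    print(n, total_variation(n, 5.0))
```

The distance shrinks as n grows with the mean held fixed, which is the law of rare events in numerical form.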

Properties

- The expected value of a Poisson-distributed random variable is equal to $\lambda$, and so is its variance.
- The higher moments of the Poisson distribution are Touchard polynomials in $\lambda$, whose coefficients have a combinatorial meaning. In fact, when the expected value of the Poisson distribution is 1, Dobiński's formula says that the nth moment equals the number of partitions of a set of size n.
- The mode of a Poisson-distributed random variable with non-integer $\lambda$ is equal to $\lfloor \lambda \rfloor$, the largest integer less than or equal to $\lambda$. This is also written as floor($\lambda$). When $\lambda$ is a positive integer, the modes are $\lambda$ and $\lambda - 1$.
- All of the cumulants of the Poisson distribution are equal to the expected value $\lambda$. The nth factorial moment of the Poisson distribution is $\lambda^n$.
- The Poisson distributions are infinitely divisible probability distributions.
- The directed Kullback-Leibler divergence between Pois($\lambda$) and Pois($\lambda_0$) is given by $D_{KL}(\lambda \,\|\, \lambda_0) = \lambda_0 - \lambda + \lambda \log(\lambda / \lambda_0)$.
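Several of these properties are easy to confirm numerically. The sketch below sums the mass function directly to obtain raw moments, the mode, and the Kullback-Leibler divergence, and compares the divergence with the standard closed form $\lambda_0 - \lambda + \lambda \log(\lambda/\lambda_0)$. All helper names and test values are illustrative, and the sums are truncated at 200 terms, which is ample for small means.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k events with mean lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def raw_moment(n: int, lam: float, terms: int = 200) -> float:
    """E[X**n], by truncated summation over the mass function."""
    return sum(k ** n * poisson_pmf(k, lam) for k in range(terms))

def mode(lam: float, terms: int = 200) -> int:
    """Smallest k that maximises the mass function."""
    return max(range(terms), key=lambda k: poisson_pmf(k, lam))

def kl_divergence(lam: float, lam0: float, terms: int = 200) -> float:
    """Sum of p(k) * log(p(k) / q(k)) with p = Pois(lam), q = Pois(lam0)."""
    log_ratio = lambda k: k * math.log(lam / lam0) - lam + lam0
    return sum(poisson_pmf(k, lam) * log_ratio(k) for k in range(terms))

# Moments at mean 1 should match the set-partition counts 1, 1, 2, 5, 15, 52.
print([round(raw_moment(n, 1.0)) for n in range(6)])
print(mode(4.7))  # floor(4.7)
print(kl_divergence(4.0, 2.5), 2.5 - 4.0 + 4.0 * math.log(4.0 / 2.5))
```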

Applications

The Poisson distribution has two applications:

1) The Poisson distribution can be used as an alternative to the binomial distribution in the case of very large samples. Your hypothesis was that you would find 'x' (the top cell) occurrences of a phenomenon, whereas in fact you found 'n' (the second input cell). The phenomenon might be the number of cases of a rare disease, the number of accidents at a busy junction, the number of stoppages on a production line, or the number of ethnic children at a football match. You want to test the assumption that environmental and other factors influencing the phenomenon were constant between observation periods. The program echoes the point probability and the probability of there being 'n' or more occurrences of a phenomenon given your expectation of 'x' occurrences. For the Poisson distribution you do not need to give a sample size. If the sample size is known, it is generally preferable to use the binomial. The main difference between the Poisson distribution and the binomial distribution is that in the binomial all eligible phenomena are studied, whereas in the Poisson distribution only the cases with a particular outcome are studied. For example: in the binomial all cars are studied to see whether they have had an accident or not, whereas using the Poisson distribution only the cars which have had an accident are studied.

2) The Poisson distribution can also be used to study how 'accidents' or 'malfunctions', or the chance of winning the lottery never, once or more than once, are distributed on the level of a population. If having one 'accident' has no influence on the chance of having another, so that the victim is 'put back into the population' immediately after an 'event', people may have one, two, three, or more accidents during a certain period of time. The Poisson distribution tells you how these chances are distributed.

Mean or incidence is the number of accidents divided by the size of the population and is given to the program in the top expectation box. Note that although your calculation may result in a value between zero and one, this value is not a proportion but a true mean. You would get a true proportion if you divided the number of people who had an accident by the number of people in the population (for a discussion of the relationships between these numbers see Uitenbroek 1995). In the second 'observed' box is given the number of accidents you want to study. If you give 10 in the observed box the output gives you the proportion of the population who had '0' (zero) accidents, the proportion who had '1' (one) accident, the proportion who had '2' (two) accidents etc. The cumulative distribution tells you the proportion that had '1' or more accidents, '2' or more etc.

One assumption in this application of the Poisson distribution is that the chance of having an accident is randomly distributed: every individual has an equal chance. Mathematically this is expressed in the fact that the variance and the mean for the Poisson distribution are equal. A good way to check whether this assumption of equal chances is correct is to compare the variance of an (accident) distribution with its mean. If the variance is larger, then the assumption was not correct. The Negative Binomial Version 1 has been implemented to provide an alternative for the Poisson distribution in the case of a non-random distribution.
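The two quantities the program reports, and the variance-to-mean check, can be sketched as follows. This is not the program's own code: the function names, the example expectation of 5 with 10 observed, and the small list of accident counts are illustrative assumptions.

```python
import math

def poisson_pmf(k: int, expected: float) -> float:
    """Probability of exactly k occurrences when `expected` are anticipated."""
    return math.exp(k * math.log(expected) - expected - math.lgamma(k + 1))

def point_and_tail(expected: float, observed: int):
    """Point probability of `observed`, and probability of that many or more."""
    point = poisson_pmf(observed, expected)
    tail = 1.0 - sum(poisson_pmf(k, expected) for k in range(observed))
    return point, tail

def dispersion_index(counts):
    """Sample variance divided by sample mean; near 1 for Poisson counts."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return variance / mean

# Expected 5 occurrences, observed 10 (illustrative numbers).
print(point_and_tail(5.0, 10))

# Accidents per person in a made-up population of eight people.
print(dispersion_index([0, 1, 1, 2, 0, 3, 1, 0]))
```

A dispersion index well above 1 suggests that individuals do not share an equal chance, which is the situation in which the text recommends the negative binomial instead.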

CASE STUDY

Modeling fatal injury rates using Poisson regression: A case study of workers in agriculture, forestry, and fishing

Injury surveillance data serve as the foundation of many safety studies. These studies frequently gather information on the number of injuries along with the number of employees at risk of injury in each of several strata, where the strata are defined in terms of a series of important predictor variables. It is common for analyses of such data to examine injury rates separately for each predictor variable. The analysis of the crude or unadjusted injury rates gives an overall indication of injury rate changes as a function of a particular predictor variable; however, further insights may be gained from analyses using Poisson regression models.

Poisson regression models are described as a means of analyzing rates adjusting for one or more predictor variables. In these models, the log rate of injury is expressed as a linear function of predictor variables. The interpretation of model parameters is given along with a presentation of the basic formulation of such models. Testing for trend, evaluation of confounding, and effect modification are illustrated using surveillance data describing occupational fatal injury rates as a function of year (1983–1992), gender and age for White workers employed in agriculture, forestry, or fishing. Data for this analysis were obtained from two sources: the National Traumatic Occupational Fatality (NTOF) database from the National Institute for Occupational Safety and Health provided counts of the fatal injuries, while data from the U.S. Bureau of Labor Statistics (BLS) provided counts on employment. Using an unadjusted trend model, a statistically nonsignificant decline in fatal injury rates over 1983–1992 is observed. Further analysis using Poisson regression revealed an interaction between gender and calendar year, with males experiencing a weak, albeit significant, decrease and females experiencing a strong and significant increase.
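A generic sketch of the kind of model described, written out as an equation. The symbols are illustrative assumptions rather than the study's own notation: $d_i$ is the number of fatal injuries and $n_i$ the number of workers employed in stratum $i$, and the covariates are calendar year, an indicator for female workers, and age.

```latex
d_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log\!\left(\frac{\mu_i}{n_i}\right)
  = \beta_0
  + \beta_1\,\mathrm{year}_i
  + \beta_2\,\mathrm{female}_i
  + \beta_3\,(\mathrm{year}_i \times \mathrm{female}_i)
  + \beta_4\,\mathrm{age}_i
```

Equivalently, $\log \mu_i = \log n_i + \beta_0 + \dots$, so the employment count enters as an offset. Under this parameterization $e^{\beta_1}$ is the rate ratio per calendar year for males and $e^{\beta_1 + \beta_3}$ the corresponding ratio for females; a nonzero $\beta_3$ is what an interaction between gender and calendar year means in such a model.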

In many retail stores and banks, management has tried to reduce the frustration of customers by increasing the speed of the checkout and cashier lines. Although most grocery stores seem to have retained the multiple line/multiple checkout system, many banks, credit unions, and fast food providers have gone in recent years to a queuing system where customers wait for the next available cashier. The frustrations of "getting in a slow line" are removed because that one slow transaction does not affect the throughput of the remaining customers. Walmart and McDonald's are other examples of companies which open up additional lines when there are more than about three people in line. In fact, Walmart has roaming clerks now who can total up your purchases and leave you with a number which the cashier enters to complete the financial aspect of your sale. Disney is another company that faces thousands of people a day. One method to ameliorate the problem has been to use queuing theory. It has been shown that throughput improves and customer satisfaction increases when queues are used instead of separate lines. Queues are also used extensively in computing: web servers and print servers are now common. Banks of 800 service phone numbers are a final example I will cite.

Queuing theory leads one directly to the Poisson distribution, named after the famous French mathematician Siméon Denis Poisson (1781–1840), who first studied it in 1837. It was later applied to such morbid results as the probability of death in the Prussian army resulting from the kick of a horse and suicides among women and children. As hinted above, operations research has applied it to model random arrival times.

Poisson Distribution

The Poisson distribution is the limit of the discrete binomial distribution as the number of trials grows while the expected number of occurrences stays fixed. It depends on the following four assumptions:

1. It is possible to divide the time interval of interest into many small subintervals (like an hour into seconds).
2. The probability of an occurrence remains constant throughout the large time interval (random).
3. The probability of two or more occurrences in a subinterval is small enough to be ignored.
4. Occurrences are independent.

Clearly, bank arrivals might have problems with assumption number four, where payday, lunch hour, and car pooling may affect independence. However, the Poisson distribution finds applicability in a surprisingly large variety of situations.
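The four assumptions translate almost directly into a simulation. In this sketch the interval is cut into many subintervals, each of which independently contains at most one arrival with the same small probability; the resulting counts are then compared with the Poisson mass function. The arrival rate, number of subintervals, seed, and function names are all illustrative assumptions.

```python
import math
import random

def count_in_interval(rate: float, subintervals: int, rng: random.Random) -> int:
    """One interval: independent subintervals, each with at most one arrival."""
    p = rate / subintervals  # constant, small probability per subinterval
    return sum(1 for _ in range(subintervals) if rng.random() < p)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k events with mean lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

rng = random.Random(1)
rate, subintervals, runs = 4.0, 600, 2000  # illustrative values
counts = [count_in_interval(rate, subintervals, rng) for _ in range(runs)]
sim_mean = sum(counts) / runs
sim_var = sum((c - sim_mean) ** 2 for c in counts) / (runs - 1)
print(sim_mean, sim_var)  # both should be near the rate

# Observed frequency of each count next to the Poisson probability.
for k in range(9):
    print(k, counts.count(k) / runs, round(poisson_pmf(k, rate), 3))
```

If arrivals clustered, as on payday or at lunch hour, the independence assumption would fail and the simulated variance would no longer track the mean.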
