
Glossary of probability and statistics - Wikipedia, the free encyclopedia http://en.wikipedia.org/wiki/Glossary_of_probability_and_statistics

Glossary of probability and statistics


From Wikipedia, the free encyclopedia.

Terms in statistics and probability theory :

Related fields
Probability theory
Algebra of random variables (linear algebra)
Statistics
Measure theory
Estimation theory

Probability interpretations:
Bayesian probability (or "personal probability")
Frequency probability
Eclectic probability

Glossary
Atomic event : another name for elementary event.
Bias can refer either to a sample not being representative of the population, or to the difference between the
expected value of an estimator and the true value.
Conditional distribution : Given two jointly distributed random variables X and Y, the conditional probability
distribution of Y given X (written "Y | X") is the probability distribution of Y when X is known to be a particular
value.
Conditional probability is the probability of some event A, given that event B has occurred. Conditional probability is
written P(A|B), and is read "the probability of A, given B". When P(B) > 0, it equals P(A ∩ B) / P(B).
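As a minimal sketch of this definition, the following Python snippet computes P(A|B) for one roll of a fair die by counting outcomes, using the identity P(A|B) = P(A ∩ B) / P(B). The events A ("the roll is even") and B ("the roll is greater than 3") and the helper prob are illustrative choices, not part of the glossary.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (all outcomes equally likely).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) when outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # illustrative event: the roll is even
B = {4, 5, 6}  # illustrative event: the roll is greater than 3

# P(A|B) = P(A and B) / P(B); only defined when P(B) > 0.
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 2/3
```

Knowing that B occurred raises the probability of A from 1/2 to 2/3 in this example.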
Completeness
Correlation, also called correlation coefficient, is a numeric measure of the strength of the linear relationship
between two random variables (one can use it to quantify, for example, how shoe size and height are correlated in
the population). An example is the Pearson product-moment correlation coefficient, which is found by dividing the
covariance of the two variables by the product of their standard deviations. Independent variables have a
correlation of 0, although a correlation of 0 does not by itself imply independence.
The Covariance between two random variables X and Y, with expected values E(X) = µ and E(Y) = ν, is defined
as the expected value of random variable (X − µ)(Y − ν), and is written cov(X, Y). It is used for measuring
correlation.
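The two definitions above can be sketched in Python by exact enumeration. The setup is an assumption made for illustration: two fair dice are rolled, X is the first die and Y is the sum of both dice. The helper names expectation and covariance are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# All 36 equally likely outcomes (a, b) of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

def x(a, b):
    return a        # X: value of the first die

def y(a, b):
    return a + b    # Y: sum of both dice

def expectation(f):
    """E[f] as the probability-weighted sum over all outcomes."""
    return sum(p * f(a, b) for a, b in outcomes)

def covariance(f, g):
    """cov(f, g) = E[(f - E[f]) (g - E[g])]."""
    mf, mg = expectation(f), expectation(g)
    return expectation(lambda a, b: (f(a, b) - mf) * (g(a, b) - mg))

mu = expectation(x)   # E(X)
nu = expectation(y)   # E(Y)
cov_xy = covariance(x, y)

# Pearson correlation: covariance divided by the product of standard deviations.
sigma_x = sqrt(covariance(x, x))
sigma_y = sqrt(covariance(y, y))
correlation = float(cov_xy) / (sigma_x * sigma_y)

print(cov_xy)                 # 35/12
print(round(correlation, 4))  # 0.7071
```

Because Y contains X as one of its two summands, the variables are positively but not perfectly correlated.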
A data set is a sample and the associated data points.
A data point is a typed measurement - it can be a boolean value, a real number, a vector (in which case it's also
called a data vector), etc.
A Distribution function is the function that gives the probability distribution of a random variable. Expressed as a
probability mass or density function, it cannot be negative, and its sum or integral over the sample space is equal to 1.
Efficiency
An Elementary event (or atomic event) is an event with only one element. For example, when pulling a card out
of a deck, "getting the jack of spades" is an elementary event, while "getting a king or an ace" is not.
An Estimator is a function of the known data that is used to estimate an unknown parameter; an estimate is the result
of actually applying the function to a particular set of data. The sample mean, for example, can be used as an
estimator of the population mean.
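To make the estimator entry concrete, and to connect it with the second meaning of bias given earlier in this glossary, here is a small Python sketch. It assumes a toy population (the faces of a fair die) and samples of size 2 drawn with replacement, and compares two variance estimators by enumerating every possible sample. The names variance_estimator and ddof are illustrative.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # toy population, assumed for illustration

def average(values):
    values = list(values)
    return sum(values) / Fraction(len(values))

def variance_estimator(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    m = average(sample)
    return sum((v - m) ** 2 for v in sample) / Fraction(len(sample) - ddof)

# The true parameter: variance of the whole population (divide by N).
true_variance = variance_estimator(population, ddof=0)

# Every equally likely sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# Expected value of each estimator over all samples.
expected_divide_by_n = average(variance_estimator(s, ddof=0) for s in samples)
expected_divide_by_n_minus_1 = average(variance_estimator(s, ddof=1) for s in samples)

# Bias = expected value of the estimator minus the true value.
print(expected_divide_by_n - true_variance)          # -35/24
print(expected_divide_by_n_minus_1 - true_variance)  # 0
```

Each call to variance_estimator on one sample yields an estimate; the function itself is the estimator. Dividing by n systematically underestimates the variance here, while dividing by n − 1 does not.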
The Expected value (or expectation) of a random variable is the sum of the probability of each possible outcome
of the experiment multiplied by its payoff ("value"). Thus, it represents the average amount one "expects" to win
per bet if bets with identical odds are repeated many times. For example, the expected value of a fair six-sided die
roll is 3.5. The concept is similar to the mean. The expected value of random variable X is typically written E(X) or µ (mu).
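The die example can be checked with a short Python sketch that applies the definition directly: multiply each value by its probability and add the results. The function name expected_value and the bet used at the end are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Probability of each face of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expected_value(distribution):
    """Sum of value * probability over all possible outcomes."""
    return sum(value * p for value, p in distribution.items())

print(expected_value(die))  # 7/2

# Illustrative bet: win 2 with probability 1/3, lose 1 with probability 2/3.
bet = {2: Fraction(1, 3), -1: Fraction(2, 3)}
fair_bet_value = expected_value(bet)
```

Note that 7/2 is not itself a possible outcome of a roll; the expected value is a long-run average, not a prediction of any single result.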
Experiment
An event is a subset of the sample space, to which a probability can be assigned. For example, on rolling a die,
"getting a five or a six" is an event (with a probability of one third if the die is fair).
Generating function
Independence or Statistical independence : Two events are independent if the occurrence of one does not affect the
probability of the other (for example, getting a 1 on one die roll does not affect the probability of getting a 1 on a second
roll). Similarly, when we assert that two random variables are independent, we intuitively mean that knowing
something about the value of one of them does not yield any information about the value of the other.
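One common way to test independence of two events is the product rule P(A ∩ B) = P(A) P(B). The Python sketch below applies it to two rolls of a fair die; the event names and the helper independent are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely results of rolling a fair die twice.
rolls = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(rolls))

def independent(a, b):
    """Product rule: A and B are independent iff P(A and B) == P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

first_is_1 = {r for r in rolls if r[0] == 1}
second_is_1 = {r for r in rolls if r[1] == 1}
sum_is_2 = {r for r in rolls if sum(r) == 2}

print(independent(first_is_1, second_is_1))  # True
print(independent(first_is_1, sum_is_2))     # False
```

The second pair fails the test because a sum of 2 is only possible when the first roll is a 1, so one event carries information about the other.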
Joint distribution : Given two random variables X and Y, the joint distribution of X and Y is the probability
distribution of X and Y together.
Joint probability is the probability of two events occurring together. The joint probability of A and B is written
P(A ∩ B) or P(A, B).
Kurtosis is a measure of the "peakedness" of the probability distribution of a real-valued random variable.
Higher kurtosis means more of the variance is due to infrequent extreme deviations, as opposed to frequent
modestly-sized deviations.
A likelihood function (or just likelihood) is a conditional probability function considered a function of its
second argument with its first argument held fixed. For example, imagine pulling a numbered ball with the number
k from a bag of n balls, numbered 1 to n. Then you could describe a likelihood function for the random variable N
as the probability of getting k given that there are n balls: the likelihood will be 1/n for n greater than or equal to k, and
0 for n smaller than k. Unlike a probability distribution function, this likelihood function will not sum to 1 over
the possible values of n.
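The numbered-ball example can be sketched in Python as follows. The observed value k = 3 and the cut-off of 10 candidate bag sizes are arbitrary assumptions for illustration, and prob_of_draw is a hypothetical helper name.

```python
from fractions import Fraction

def prob_of_draw(k, n):
    """P(drawing ball k | the bag holds balls 1..n): 1/n if 1 <= k <= n, else 0."""
    return Fraction(1, n) if 1 <= k <= n else Fraction(0)

k_observed = 3              # assumed observation
candidate_n = range(1, 11)  # arbitrary cut-off on the bag sizes considered

# Likelihood: the same function, read as a function of n with k held fixed.
likelihood = {n: prob_of_draw(k_observed, n) for n in candidate_n}
total_likelihood = sum(likelihood.values())

# Probability distribution: a function of k with n held fixed; this one sums to 1.
n_fixed = 5
total_probability = sum(prob_of_draw(k, n_fixed) for k in range(1, n_fixed + 1))

most_likely_n = max(likelihood, key=likelihood.get)
print(most_likely_n)  # 3
```

Here total_probability equals 1 while total_likelihood does not, which is the contrast the entry describes. The likelihood is largest at n = k, the smallest bag consistent with the observation.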
Marginal distribution : given two jointly distributed random variables X and Y, the marginal distribution of X
is simply the probability distribution of X ignoring information about Y.
Marginal probability is the probability of an event, ignoring any information about other events. The marginal
probability of A is written P(A). Contrast with conditional probability.
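The relationship between joint, marginal, and conditional distributions can be sketched with a small table. The joint probabilities below are invented for illustration, and the helper names marginal and conditional_y_given_x are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Made-up joint distribution of two binary random variables, keyed by (x, y).
joint = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8),
    (1, 1): Fraction(2, 8),
}

def marginal(joint, axis):
    """Sum the joint probabilities over the other variable."""
    result = defaultdict(Fraction)
    for outcome, p in joint.items():
        result[outcome[axis]] += p
    return dict(result)

def conditional_y_given_x(joint, x):
    """Distribution of Y given X = x: joint probability divided by P(X = x)."""
    px = marginal(joint, 0)[x]
    return {y: p / px for (xv, y), p in joint.items() if xv == x}

marginal_x = marginal(joint, 0)               # P(X=0) = 1/2, P(X=1) = 1/2
marginal_y = marginal(joint, 1)               # P(Y=0) = 3/8, P(Y=1) = 5/8
y_given_x0 = conditional_y_given_x(joint, 0)  # P(Y=0|X=0) = 1/4, P(Y=1|X=0) = 3/4
```

The marginal of Y ignores X entirely, whereas the conditional distribution of Y changes once X is known to be 0.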
The Mean of a random variable is its expected value. The mean (or sample mean) of a data set is just the average
value.
Moment about the mean
Mutual independence : A collection of events is mutually independent if for any subset of the collection, the joint
probability of all events in that subset occurring is equal to the product of the probabilities of the individual events.
Think of the results of a series of coin flips. This is a stronger condition than pairwise independence.
Pairwise independence : a pairwise independent collection of random variables is a set of random variables any
two of which are independent.
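A standard counterexample shows why mutual independence is stronger than pairwise independence. In the Python sketch below, two fair coins X and Y are flipped and a third value Z is set to 1 exactly when X and Y differ. This construction and the helper prob are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

# Four equally likely outcomes (x, y, z), where z = x XOR y.
outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

def prob(indices):
    """Probability that every listed coordinate equals 1."""
    hits = [o for o in outcomes if all(o[i] == 1 for i in indices)]
    return Fraction(len(hits), len(outcomes))

# Pairwise: every pair of events satisfies the product rule.
pairwise = all(
    prob([i, j]) == prob([i]) * prob([j])
    for i, j in combinations(range(3), 2)
)

# Mutual: the product rule must also hold for all three events together.
mutual = pairwise and prob([0, 1, 2]) == prob([0]) * prob([1]) * prob([2])

print(pairwise)  # True
print(mutual)    # False
```

Any two of the three values reveal nothing about each other, yet any two of them determine the third, so the triple fails the product rule.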
parameter : Can be a population parameter, a distribution parameter, or an unobserved parameter.
Often written θ.
Prior probability
A population or statistical population is a set of entities about which statistical inferences are to be drawn, often
based on random sampling. One can also talk about a population of measurements or values.
Population parameter : See statistical parameter.
Posterior probability
Probability density is used to describe probability in a continuous probability distribution. For example, you cannot
meaningfully say that the probability of a man being exactly six feet tall is 20%, but you can say that he has a 20%
chance of being between five and six feet tall. Probability density is given by a probability density function. Contrast
with probability mass.
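The distinction can be sketched with the NormalDist class from Python's standard statistics module. The height parameters (mean 69 inches, standard deviation 3 inches) are hypothetical values chosen for illustration, not data from this glossary.

```python
from statistics import NormalDist

# Hypothetical model of adult male height in inches (assumed parameters).
heights = NormalDist(mu=69, sigma=3)

# Probability of an interval: difference of the cumulative distribution function.
p_between_5ft_and_6ft = heights.cdf(72) - heights.cdf(60)

# The density at a single point is not a probability...
density_at_6ft = heights.pdf(72)

# ...and can even exceed 1 for a sufficiently narrow distribution.
narrow = NormalDist(mu=0, sigma=0.1)
print(narrow.pdf(0) > 1)  # True
```

Only areas under the density curve, such as p_between_5ft_and_6ft, are probabilities; the probability of any single exact height is zero under a continuous model.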
A probability density function gives the probability distribution for a continuous random variable.
A probability distribution is a function that gives the probability of all elements in a given space. (See the article on
probability distributions for a list of different distributions.)
A Probability measure gives the probability of events in a probability space.
A probability space is a sample space over which a probability measure has been defined.
Random function
A random variable can be, for example, the outcome of a die roll considered before the roll is observed (so it is not
assigned a particular value). The distribution function of a random variable gives the probability of different results.
We can also derive the mean and variance of a random variable.
Discrete random variable
Continuous random variable
A Random vector (or multivariate random variable) is a vector whose components are random variables on the
same probability space.
A sample is that part of a population which is actually observed.
The sample space is the set of possible outcomes of an experiment. For example, the sample space for rolling a
six-sided die will be {1, 2, 3, 4, 5, 6}.
Sampling is a process of selecting observations to obtain knowledge about a population. There are many methods
for choosing the sample on which to make the observations.
A sampling distribution is the probability distribution, under repeated sampling of the population, of a given
statistic.
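For a very small case, a sampling distribution can be computed exactly rather than by simulation. The Python sketch below assumes a toy population (the faces of a fair die), samples of size 2 drawn with replacement, and the sample mean as the statistic; all of these choices are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = range(1, 7)  # toy population, assumed for illustration
sample_size = 2

# Every equally likely sample of the given size, drawn with replacement.
samples = list(product(population, repeat=sample_size))

# Count how often each value of the statistic (the sample mean) occurs.
counts = Counter(Fraction(sum(s), sample_size) for s in samples)

# Sampling distribution: probability of each possible value of the sample mean.
sampling_distribution = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(counts.items())
}

print(sampling_distribution[Fraction(7, 2)])  # 1/6
```

The sample mean takes values from 1 to 6 in steps of 1/2, with 7/2 the most probable. With larger populations or sample sizes, this enumeration becomes impractical and repeated random sampling is the usual substitute.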
Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable.
Roughly speaking, a distribution has positive skew (right-skewed) if the higher tail is longer and negative skew
(left-skewed) if the lower tail is longer (confusing the two is a common error).
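Skewness is commonly computed as the third standardized moment, and kurtosis (see the Kurtosis entry above) as the fourth. The Python sketch below uses that convention, which is an assumption since this glossary does not give a formula. The example distributions are invented for illustration.

```python
from math import sqrt

def standardized_moment(pmf, k):
    """k-th standardized moment E[((X - mean) / sigma) ** k] of a discrete distribution."""
    mean = sum(x * p for x, p in pmf.items())
    sigma = sqrt(sum((x - mean) ** 2 * p for x, p in pmf.items()))
    return sum(((x - mean) / sigma) ** k * p for x, p in pmf.items())

# Invented distribution with a long upper tail, and its mirror image.
right_tailed = {0: 0.6, 1: 0.3, 10: 0.1}
left_tailed = {-x: p for x, p in right_tailed.items()}

# A symmetric distribution for comparison.
fair_die = {face: 1 / 6 for face in range(1, 7)}

print(standardized_moment(right_tailed, 3) > 0)  # True: positive (right) skew
print(standardized_moment(left_tailed, 3) < 0)   # True: negative (left) skew

die_skewness = standardized_moment(fair_die, 3)  # approximately 0
die_kurtosis = standardized_moment(fair_die, 4)  # approximately 1.73
```

Mirroring a distribution flips the sign of its skewness but leaves its kurtosis unchanged, since the fourth power removes the sign.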
The standard deviation is the most commonly used measure of statistical dispersion. It is the square root of the
variance, and is generally written σ (sigma).
Standardized moment
A statistic is the result of applying a statistical algorithm to a data set. It can also be described as an observable
random variable.
Statistical inference is inference about a population from a random sample drawn from it or, more generally,
about a random process from its observed behavior during a finite period of time.
Statistical dispersion (also called statistical variability) is a measure of how diverse some data is. It can be
expressed by the variance or the standard deviation.
A Statistical parameter is a parameter that indexes a family of probability distributions.
Sufficiency
The variance of a random variable is a measure of its statistical dispersion, indicating how far from the
expected value its values typically are. The variance of random variable X is typically designated as var(X),
σ_X², or simply σ².
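As a minimal sketch, the variance of a fair die roll can be computed in Python as the expected squared deviation from the mean, E[(X − µ)²], and cross-checked against the equivalent form E[X²] − µ². The standard deviation is then its square root. The helper name expectation is hypothetical.

```python
from fractions import Fraction
from math import sqrt

# Probability of each face of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(distribution, f=lambda x: x):
    """E[f(X)] as a probability-weighted sum."""
    return sum(f(x) * p for x, p in distribution.items())

mu = expectation(die)

# Definition: expected squared distance from the expected value.
variance = expectation(die, lambda x: (x - mu) ** 2)

# Equivalent form: E[X^2] - (E[X])^2.
shortcut = expectation(die, lambda x: x ** 2) - mu ** 2

sigma = sqrt(variance)  # standard deviation

print(variance)         # 35/12
print(round(sigma, 4))  # 1.7078
```

The standard deviation is expressed in the same units as the variable itself, which is one reason it is often reported instead of the variance.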

See also
Notation in probability
Probability axioms
List of statistical topics
List of probability topics

Retrieved from "http://en.wikipedia.org/wiki/Glossary_of_probability_and_statistics"

Categories: Mathematics stubs | Probability and statistics | Glossaries

This page was last modified 01:04, 19 August 2005.


All text is available under the terms of the GNU Free Documentation License (see Copyrights for
details).

3 of 3 9/10/2005 9:35 PM
