
Probability & Statistics

HANDOUT 1. Probability basics

Bibliography: Michael Baron, "Probability and Statistics for Computer Scientists", Chapter 2

INTRODUCTION
Probability theory and Statistics are branches of mathematics developed to deal with uncertainty.
Probability theory provides the basic tools for the science of statistical inference through experimentation and data analysis.
Applications of probability theory to computer science and engineering include the assessment of system reliability, the interpretation of measurement accuracy, the maintenance of suitable quality controls, etc.

The goal of Probability Theory is to provide a mathematical structure for understanding or explaining
the chances or likelihoods of the various outcomes that may occur in an experiment.

SAMPLE SPACE, EVENTS AND PROBABILITY


• experiment = a repeatable process that yields a result or an observation
• outcome = an elementary result of an experiment
• sample space = a collection of all possible outcomes of an experiment
→ denoted by Ω or S
→ The elements of a sample space are also called sample points or elementary events.
• event = a set of outcomes (a subset of the sample space)
♣ A sample space of n outcomes yields 2ⁿ possible events (see the sketch after this list).
• certain event (Ω) = the event which happens with certitude at each repetition of an experiment
• impossible event (∅) = an event which never takes place in a random experiment
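
A short Python sketch (not part of the original handout; the die-roll example is chosen purely for illustration) makes these notions concrete: the sample space of one die roll, an event as a subset, and the count of 2ⁿ possible events:

    from itertools import chain, combinations

    omega = {1, 2, 3, 4, 5, 6}              # sample space of one die roll
    even = {2, 4, 6}                        # the event "even number"
    print(even <= omega)                    # an event is a subset of Ω -> True
    # enumerate every subset of omega: all 2^n possible events
    all_events = list(chain.from_iterable(
        combinations(sorted(omega), r) for r in range(len(omega) + 1)))
    print(len(all_events))                  # 64 == 2 ** 6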

OPERATIONS WITH EVENTS


Since events are sets of outcomes, operations with events rely on set theory.
• union A ∪ B of two events A and B
= the event which takes place when at least one of the events A OR B occur
→ consists of all the outcomes of these two events A and B
• intersection A ∩ B of two events A and B
= the event which occurs when both events A AND B take place at the same time
→ consists of outcomes that are common in these two events A and B
• complement Ā of an event A
= the event that occurs every time when A does NOT occur
→ consists of outcomes excluded from A
• difference A \ B of two events A and B
= the event which takes place when A occurs but B does not occur
→ consists of the outcomes of A which are not in B
♣ A \ B = A ∩ B̄
• Events A, B, C, ... are disjoint (incompatible or mutually exclusive) if no two of them can occur together: A ∩ B = ∅, A ∩ C = ∅, B ∩ C = ∅, ...
→ disjoint events cannot occur at the same time
• Events A, B, C... are exhaustive if A ∪ B ∪ C ∪ ... = Ω
→ exhaustive events ”cover” all the sample space Ω
♣ Any event A and its complement Ā are disjoint and exhaustive.
• Event A implies event B (A ⇒ B or A ⊂ B) if the occurrence of A means that B occurs as well.
♣ Any event implies the certain event.
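
The operations above map directly onto Python's built-in set type; a minimal sketch (with the same illustrative die-roll events):

    omega = {1, 2, 3, 4, 5, 6}
    A = {2, 4, 6}                          # "even number"
    B = {4, 5, 6}                          # "greater than 3"
    print(A | B)                           # union A ∪ B -> {2, 4, 5, 6}
    print(A & B)                           # intersection A ∩ B -> {4, 6}
    print(omega - A)                       # complement of A -> {1, 3, 5}
    print(A - B)                           # difference A \ B -> {2}
    print((A - B) == (A & (omega - B)))    # A \ B = A ∩ B̄ -> True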

FREQUENCY
Let us consider an event A associated to a given experiment.
We repeat the same experiment N times and we denote by α the number of occurrences of the event A.
→ α ∈ {0, 1, 2, ..., N } is called absolute frequency of the event A
♣ The number of occurrences of the event Ā is N − α.
The number fN(A) = α/N is called relative frequency of the event A.
♣ 0 ≤ fN(A) ≤ 1, for any N ∈ ℕ*.
♣ fN (Ω) = 1, where Ω is the certain event.
♣ If A ∩ B = ∅ then fN (A ∪ B) = fN (A) + fN (B).
In the long run (over a large number of repetitions N), the probability of an event A can be estimated by its relative frequency:

lim fN(A) = P(A)   (as N → ∞)
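
This estimation is easy to see numerically; a quick simulation sketch (assumes a fair die, with A = "even number"):

    import random

    random.seed(0)                         # reproducible runs
    for N in (100, 10_000, 1_000_000):
        alpha = sum(1 for _ in range(N) if random.randint(1, 6) in {2, 4, 6})
        print(N, alpha / N)                # fN(A) approaches P(A) = 1/2 as N grows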

EQUALLY LIKELY OUTCOMES


• Two events A and B associated to an experiment are equally likely (equally probable) if there is
no reason to suppose that the occurrence of one of the two events is favored with respect to the other.
If the sample space Ω consists of equally likely outcomes ω1 , ω2 , ..., ωn , the probability of
each outcome is equal to the inverse of the number of outcomes from the sample space:
P(ωk) = 1/n

The probability of any event E consisting of k outcomes is:


P(E) = k/n = (number of outcomes in E) / (number of outcomes in Ω)

The outcomes forming an event E are also called favorable (to E). Therefore:


P(E) = NF / NT = (number of favorable outcomes) / (total number of outcomes)
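
A one-line check of the favorable/total rule in Python (the event is chosen for illustration):

    omega = {1, 2, 3, 4, 5, 6}
    E = {k for k in omega if k > 4}        # event "roll greater than 4"
    print(len(E) / len(omega))             # NF / NT = 2/6 ≈ 0.3333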

COMBINATORICS - COUNTING TECHNIQUES

Multiplication rule. If an experiment consists of k components (sub-experiments) with n1, n2, ..., nk possible outcomes respectively, then the total number of experimental outcomes (the size of the sample space) is equal to n1 × n2 × ... × nk.
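
For instance, combining a coin flip (2 outcomes), a die roll (6) and a card suit (4), an illustrative choice, the rule can be verified by enumerating the product space:

    from itertools import product

    outcomes = list(product("HT", range(1, 7), "SHDC"))   # coin × die × suit
    print(len(outcomes))                                  # 48 == 2 * 6 * 4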

Permutations
• permutation = a possible selection of k distinguishable objects from a set of n objects (n ≥ k)
→ the order of the sampled objects is important!
If the selection is performed:
1. with replacement then the experiment is made up of k identical components, each with n possible
outcomes. The number of possible ways to select the k objects is

Pr(n, k) = nᵏ

2. without replacement then the experiment is made up of k components with n, n − 1, ..., n − k + 1 outcomes respectively. The number of possible ways to select the k objects is

P(n, k) = n(n − 1)(n − 2)...(n − k + 1) = n!/(n − k)!
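
Both counts can be cross-checked against brute-force enumeration (a sketch with the arbitrary choice n = 5, k = 3):

    from itertools import permutations, product
    from math import factorial

    n, k = 5, 3
    # with replacement: n^k ordered selections
    print(n ** k, len(list(product(range(n), repeat=k))))       # 125 125
    # without replacement: n!/(n-k)! ordered selections
    print(factorial(n) // factorial(n - k),
          len(list(permutations(range(n), k))))                 # 60 60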

Combinations
• combination = a possible selection of k indistinguishable objects from n objects (n ≥ k)
→ the order of the sampled objects is not taken into account!
If the selection is performed:
1. without replacement then the number of possible ways to select the k objects is

C(n, k) = Cₙᵏ = P(n, k)/P(k, k) = n!/(k!(n − k)!)

2. with replacement then the number of possible ways to select the k objects is

Cr(n, k) = C(k + n − 1, k) = (k + n − 1)!/(k!(n − 1)!)
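
The same kind of cross-check works for combinations (again with the arbitrary n = 5, k = 3):

    from itertools import combinations, combinations_with_replacement
    from math import comb

    n, k = 5, 3
    # without replacement: n!/(k!(n-k)!) unordered selections
    print(comb(n, k), len(list(combinations(range(n), k))))      # 10 10
    # with replacement: (k+n-1)!/(k!(n-1)!) unordered selections
    print(comb(k + n - 1, k),
          len(list(combinations_with_replacement(range(n), k)))) # 35 35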

AXIOMATIC DEFINITION OF PROBABILITY


• We call probability on the sample space Ω a function P which associates to every event A ∈ P(Ω) a
number P (A), called probability of A, such that the following conditions (axioms) are fulfilled:

i) P (A) ≥ 0, ∀A ∈ P(Ω);
ii) P (Ω) = 1;
iii) A ∩ B = ∅ ⇒ P (A ∪ B) = P (A) + P (B), ∀A, B ∈ P(Ω).

• The function P : P(Ω) → ℝ₊ is called probability measure.

• The pair (Ω, P) is called probability space.
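
On a finite sample space the axioms are easy to verify directly once each outcome is given a weight; a sketch for a fair die (the equal weights are the assumption):

    p = {w: 1/6 for w in range(1, 7)}      # fair die: equal outcome weights

    def P(event):                          # probability = sum of outcome weights
        return sum(p[w] for w in event)

    print(P({2, 4, 6}) >= 0)                          # axiom i) -> True
    print(abs(P(set(p)) - 1) < 1e-12)                 # axiom ii): P(Ω) = 1 -> True
    A, B = {1, 2}, {5, 6}                             # disjoint events
    print(abs(P(A | B) - (P(A) + P(B))) < 1e-12)      # axiom iii) -> True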

BASIC PROBABILITY RULES


♣ P (∅) = 0
♣ If A = {ω1, ω2, ..., ωk}, where ωi are outcomes, then P(A) = P({ω1}) + P({ω2}) + ... + P({ωk}).
♣ P(Ā) = 1 − P(A).
♣ If A1, A2, ..., An ∈ P(Ω) and Ai ∩ Aj = ∅, ∀i ≠ j, then P(A1 ∪ A2 ∪ ... ∪ An) = P(A1) + P(A2) + ... + P(An).
♣ P (A ∪ B) = P (A) + P (B) − P (A ∩ B) for any A, B ∈ P(Ω).
→ This formula can be generalized for the union of n events. For example, for n = 3:
P (A ∪ B ∪ C) = P (A) + P (B) + P (C) − P (A ∩ B) − P (A ∩ C) − P (B ∩ C) + P (A ∩ B ∩ C)
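
The inclusion-exclusion rule for two events can be checked with the same finite-weight setup (fair die assumed):

    p = {w: 1/6 for w in range(1, 7)}

    def P(event):
        return sum(p[w] for w in event)

    A, B = {2, 4, 6}, {4, 5, 6}            # overlapping events
    print(abs(P(A | B) - (P(A) + P(B) - P(A & B))) < 1e-12)   # -> True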

Inequalities
♣ If A ⊂ B then P (A) ≤ P (B).
♣ For any A1, A2, ..., An ∈ P(Ω) we have: P(A1 ∪ A2 ∪ ... ∪ An) ≤ P(A1) + P(A2) + ... + P(An), ∀n ∈ ℕ.
♣ For any A1, A2, ..., An ∈ P(Ω) we have: P(A1 ∩ A2 ∩ ... ∩ An) ≥ 1 − (P(Ā1) + P(Ā2) + ... + P(Ān)), ∀n ∈ ℕ.
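
Both inequalities can be checked numerically on the fair-die model (the events are picked arbitrarily):

    p = {w: 1/6 for w in range(1, 7)}

    def P(event):
        return sum(p[w] for w in event)

    events = [{1, 2}, {2, 3}, {3, 4}]
    union = set().union(*events)
    inter = set.intersection(*events)
    print(P(union) <= sum(P(A) for A in events))            # union bound -> True
    print(P(inter) >= 1 - sum(1 - P(A) for A in events))    # 1 - P(A) = P(Ā) -> True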

INDEPENDENT EVENTS
• The events E1 , E2 ,..., En are called independent if the occurrence of one of these events does not
affect the probabilities of the others.
If events E1 , E2 ,..., En are independent then

P (E1 ∩ E2 ∩ ... ∩ En ) = P (E1 ) · P (E2 ) · ... · P (En )

♣ If A and B are independent events, then the events Ā and B; A and B̄; Ā and B̄ are also independent.
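
A concrete check of the product rule on two fair coin tosses (an illustrative model in which independence is built in):

    from itertools import product

    omega = set(product("HT", repeat=2))   # 4 equally likely outcomes

    def P(event):
        return len(event) / len(omega)

    A = {w for w in omega if w[0] == "H"}  # first toss is heads
    B = {w for w in omega if w[1] == "H"}  # second toss is heads
    print(abs(P(A & B) - P(A) * P(B)) < 1e-12)   # product rule -> True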

CONDITIONAL PROBABILITY
• conditional probability of event A, given event B
= the probability that A occurs, when B is known to occur
→ is defined by P(A|B) = P(A ∩ B)/P(B), assuming that P(B) ≠ 0
Properties of the conditional probability:
♣ 0 ≤ P (A|B) ≤ 1
♣ P (Ω|B) = 1
♣ if A1 , A2 are incompatible then P (A1 ∪ A2 |B) = P (A1 |B) + P (A2 |B)
♣ if A and B are independent events then P (A|B) = P (A) and P (B|A) = P (B)
Probability of the intersection of a series of events:
If A1, A2, ..., An are events such that P(A1 ∩ A2 ∩ ... ∩ An) ≠ 0 (i.e. they are compatible), then
P (A1 ∩ A2 ∩ ... ∩ An ) = P (A1 ) · P (A2 |A1 ) · P (A3 |(A1 ∩ A2 )) · ... · P (An |(A1 ∩ ... ∩ An−1 ))
♣ consequence: if A1 , A2 , ..., An are independent events then
P (A1 ∩ A2 ∩ ... ∩ An ) = P (A1 ) · P (A2 ) · ... · P (An )
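
The definition and the chain rule for two events, checked on the fair-die model from above:

    p = {w: 1/6 for w in range(1, 7)}

    def P(event):
        return sum(p[w] for w in event)

    A, B = {2, 4, 6}, {4, 5, 6}
    P_A_given_B = P(A & B) / P(B)                  # (2/6)/(3/6) = 2/3
    print(P_A_given_B)
    # chain rule for n = 2: P(A ∩ B) = P(B) · P(A|B)
    print(abs(P(A & B) - P(B) * P_A_given_B) < 1e-12)   # -> True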
Law of Total Probability:
• The events B1 , B2 , ..., Bk ∈ P(Ω) form a partition of the sample space Ω if:
i) Bi ∩ Bj = ∅, ∀i ≠ j;   ii) B1 ∪ B2 ∪ ... ∪ Bk = Ω;   iii) P(Bi) > 0, ∀i = 1, 2, ..., k

(i.e. they are mutually exclusive and exhaustive)


• The events of a partition of the sample space are called hypotheses.
If the events B1 , B2 , ..., Bk form a partition of the sample space Ω and A ∈ P(Ω), then:
P(A) = P(B1) · P(A|B1) + P(B2) · P(A|B2) + ... + P(Bk) · P(A|Bk)

♣ Considering a partition formed by B and B̄, the law of total probability becomes:


P(A) = P(B)P(A|B) + P(B̄)P(A|B̄)
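
Numerically, with hypothetical values P(B) = 0.3, P(A|B) = 0.9 and P(A|B̄) = 0.2:

    P_B = 0.3                                      # hypothetical prior
    P_A_given_B, P_A_given_notB = 0.9, 0.2         # hypothetical conditionals
    P_A = P_B * P_A_given_B + (1 - P_B) * P_A_given_notB
    print(P_A)                                     # 0.27 + 0.14 = 0.41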
Bayes’ Rule:
For any two events A and B with P(A) ≠ 0 we have:

P(B|A) = P(A|B)P(B)/P(A)
Bayes’ Theorem:
If the events B1 , B2 , ..., Bk form a partition of the sample space Ω and are the cause of the
occurrence of an event A, then for any j ∈ {1, 2, ..., k} we have:
P(Bj|A) = P(Bj) · P(A|Bj) / (P(B1) · P(A|B1) + P(B2) · P(A|B2) + ... + P(Bk) · P(A|Bk))

• the probabilities P(Bi), i = 1, ..., k are called prior probabilities


• the probabilities P (Bi |A) are called posterior probabilities
• the event A is called evidence
Before we receive the evidence, we have a set of prior probabilities P(Bi), i = 1, ..., k, for the hypotheses. If we know the correct hypothesis, we know the probability of the evidence; that is, we know P(A|Bi), i = 1, ..., k. When we want to find the conditional probabilities of the hypotheses given the evidence, that is, P(Bi|A), we can use Bayes' theorem.
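
Putting the pieces together, a sketch of the prior → evidence → posterior computation for k = 2 hypotheses (all numbers hypothetical, matching the total-probability example above):

    priors = [0.3, 0.7]                            # P(B1), P(B2)
    likelihoods = [0.9, 0.2]                       # P(A|B1), P(A|B2)
    evidence = sum(pb * pa for pb, pa in zip(priors, likelihoods))     # P(A) = 0.41
    posteriors = [pb * pa / evidence for pb, pa in zip(priors, likelihoods)]
    print(posteriors)                              # P(B1|A) ≈ 0.659, P(B2|A) ≈ 0.341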
