Reading
Chapter 1 in the book. Chapter 3 pages 46-51.
Definition: Experiment
Definition: Event
An EVENT is a set of possible outcomes of an experiment. An ELEMENTARY EVENT is whatever you decide it is. For example:
The outcome of 1 roll of a die
The outcomes of n rolls of a die
The residue at position 237 in a protein
The residues at position 237 in a family of proteins
The weight of a person
Compound Events
A COMPOUND EVENT is a set of one or more elementary events. For example, you might define two compound events in a die-rolling experiment: E = roll less than 3, F = roll greater than or equal to 3. Then E = {1, 2} and F = {3, 4, 5, 6}.
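As a minimal sketch, the die-rolling example can be written with Python sets, treating each face of the die as an elementary event. The variable names (sample_space, union, intersection) are illustrative and do not come from the lecture.

```python
# Elementary events: the six faces of one die roll.
sample_space = {1, 2, 3, 4, 5, 6}

# Compound events are sets of elementary events.
E = {x for x in sample_space if x < 3}    # roll less than 3
F = {x for x in sample_space if x >= 3}   # roll greater than or equal to 3

union = E | F          # outcomes in E or in F
intersection = E & F   # outcomes in both E and F

print(sorted(E), sorted(F))
print(sorted(union), sorted(intersection))
```

Here E and F share no outcomes, so their intersection is empty, and together they cover the whole sample space.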
Timothy L. Bailey BIOL3014 6
[Figure: Venn diagrams of events E and F, their union E ∪ F, and their intersection E ∩ F]
Notation
Joint Probability: Pr(E,F)
The probability of E and F
Bayes' rule can be used to reverse the roles of E and F: Pr(F|E) = Pr(E|F) Pr(F) / Pr(E)
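The rule can be checked numerically on a small example. The sketch below assumes a fair die and, for illustration only, redefines F as the hypothetical event "the roll is even" so that E and F overlap. The helper pr is not from the lecture.

```python
from fractions import Fraction

def pr(event, space):
    """Probability of an event when every outcome in space is equally likely."""
    return Fraction(len(event & space), len(space))

space = {1, 2, 3, 4, 5, 6}   # assumption: a fair six-sided die
E = {1, 2}                   # roll less than 3
F = {2, 4, 6}                # hypothetical event: roll is even

pr_E = pr(E, space)
pr_F = pr(F, space)

# Conditional probabilities computed directly by counting outcomes.
pr_E_given_F = Fraction(len(E & F), len(F))
pr_F_given_E_direct = Fraction(len(E & F), len(E))

# The same quantity obtained by reversing the roles of E and F.
pr_F_given_E_bayes = pr_E_given_F * pr_F / pr_E

print(pr_F_given_E_direct, pr_F_given_E_bayes)
```

Both routes give the same value, which is what the rule states.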
Sequence Models
Observed biological sequences (DNA, RNA, protein) can be thought of as the outcomes of random processes. So, it makes sense to model sequences using probabilistic models. You can think of a sequence model as a little machine that randomly generates sequences.
Emission Probabilities
[Figure: a state that emits A, C, G, or T with probabilities q_A, q_C, q_G, q_T, with an outgoing transition labeled 1-p]
Generating a Sequence
This Markov model can generate any DNA sequence. Associated with each sequence is a path and a probability.
1. Start in state S: P = 1
2. Move to state M: P = 1 × P
3. Print x: P = q_x × P
4. Move to state M: P = p × P, or to state E: P = (1-p) × P
5. If in state M, go to 3. If in state E, stop.
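The five steps above can be sketched as a short Python function that returns both the generated sequence and the probability P accumulated along its path. The emission probabilities, the value of p, and the function name generate_sequence are assumptions chosen for illustration.

```python
import random

def generate_sequence(q, p, rng):
    """Run the S -> M -> E model once; return (sequence, path probability)."""
    letters = list(q)
    weights = [q[x] for x in letters]
    seq = []
    prob = 1.0  # steps 1 and 2: start in S, move to M with probability 1
    while True:
        # Step 3: print a letter x drawn with probability q[x].
        x = rng.choices(letters, weights=weights)[0]
        seq.append(x)
        prob *= q[x]
        # Step 4: stay in M with probability p, else move to E.
        if rng.random() < p:
            prob *= p
        else:
            prob *= 1 - p
            # Step 5: in state E, stop.
            return "".join(seq), prob

# Assumed parameters: uniform emissions and p = 0.9.
q = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
rng = random.Random(0)  # fixed seed so the run is repeatable
seq, prob = generate_sequence(q, 0.9, rng)
print(seq, prob)
```

With these parameters the model prints at least one letter, and larger values of p tend to give longer sequences.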
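[Figure: Markov model with states S, M, and E. State M returns to itself with probability p, moves to E with probability 1-p, and emits letters with probabilities q_A, q_C, q_G, q_T]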
This simple sequence model is called a 0-order Markov model because the probability distribution of the next letter to be generated does not depend on any (zero) of the letters preceding it.
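Because no letter depends on the ones before it, the probability that the 0-order model generates a particular sequence of length L is a product of L emission probabilities, L-1 transitions back to M, and one transition to E. A minimal sketch, working in log space to avoid underflow on long sequences; the parameter values and the function name are assumptions, not part of the lecture.

```python
import math

def log_prob_zero_order(seq, q, p):
    """Log-probability that the 0-order model generates exactly seq."""
    if not seq:
        raise ValueError("the model always prints at least one letter")
    log_p = sum(math.log(q[x]) for x in seq)   # one emission per letter
    log_p += (len(seq) - 1) * math.log(p)      # L-1 transitions M -> M
    log_p += math.log(1 - p)                   # final transition M -> E
    return log_p

# Assumed parameters for illustration.
q = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
lp = log_prob_zero_order("ACG", q, 0.9)
print(math.exp(lp))
```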
The Markov Property: Let X = X_1 X_2 ... X_L be a sequence. In an n-order Markov sequence model, the probability distribution of the next letter depends only on the previous n letters generated.
0-order: Pr(X_i | X_1 X_2 ... X_{i-1}) = Pr(X_i)
1-order: Pr(X_i | X_1 X_2 ... X_{i-1}) = Pr(X_i | X_{i-1})
n-order: Pr(X_i | X_1 X_2 ... X_{i-1}) = Pr(X_i | X_{i-1} X_{i-2} ... X_{i-n})
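For the 1-order case, the probability of a sequence factors into the probability of its first letter times one conditional probability per following letter. The sketch below assumes a uniform distribution for the first letter and a hypothetical transition table. It does not model the end state, and the names are illustrative.

```python
def prob_first_order(seq, initial, transition):
    """Pr(X_1) times the product of Pr(X_i | X_{i-1}) for a non-empty seq."""
    prob = initial[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= transition[prev][cur]
    return prob

letters = "ACGT"
# Assumption: the first letter is drawn uniformly.
initial = {x: 0.25 for x in letters}
# Hypothetical table: a letter repeats with probability 0.4,
# and each of the other three letters follows with probability 0.2.
transition = {a: {b: (0.4 if a == b else 0.2) for b in letters} for a in letters}

pr_aac = prob_first_order("AAC", initial, transition)
print(pr_aac)
```

An n-order model follows the same pattern, with the lookup keyed on the previous n letters instead of one.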
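[Figure: higher-order Markov models. One diagram labels a transition with Pr(C|G) and shows emission probabilities q_A, q_C, q_G, q_T. Another shows states AA, AG, AT, AC and end state E, with transitions labeled Pr(T|AA) and Pr(G|AA)]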