Bioinformatics Tutorials
BIOINFORMATICS AND GENE DISCOVERY
Iosif Vaisman
1998
From genes to proteins
DNA (promoter elements) -> TRANSCRIPTION -> RNA (splice sites) -> SPLICING -> mRNA (start codon, stop codon) -> TRANSLATION -> PROTEIN
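The flow from DNA to protein can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names, the toy sequence, and the exon coordinates are all hypothetical, and the sketch assumes the standard genetic code and a coding strand given 5' to 3'.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def transcribe(dna):
    """Coding-strand DNA -> RNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def splice(pre_mrna, exons):
    """Join the exon intervals (half-open, 0-based) and drop the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def translate(mrna):
    """Translate from the first AUG start codon to the first stop codon."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino == "*":  # stop codon ends translation
            break
        protein.append(amino)
    return "".join(protein)


# Hypothetical toy gene: exon 1, an intron (GT...AG), exon 2.
toy_dna = "ATGGCC" + "GTAAGTTTTCAG" + "TGGTAA"
toy_exons = [(0, 6), (18, 24)]
mrna = splice(transcribe(toy_dna), toy_exons)
protein = translate(mrna)
```

Real gene finding is the inverse problem: the exon coordinates are unknown and have to be predicted from signals such as promoter elements, splice sites, and start and stop codons.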
Comparative Sequence Sizes (bases)
• Yeast chromosome 3 350,000
• Escherichia coli (bacterium) genome 4,600,000
• Largest yeast chromosome now mapped 5,800,000
• Entire yeast genome 15,000,000
• Smallest human chromosome (Y) 50,000,000
• Largest human chromosome (1) 250,000,000
• Entire human genome 3,000,000,000
Low-resolution physical map of chromosome 19
Chromosome 19 gene map
Computational Gene Prediction
Information Theory
[Figure: a choice between two alternatives (0, 1) takes 1 bit; a choice among four alternatives (00, 01, 10, 11) takes 1 bit + 1 bit]
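The bit counts on these slides follow from the base-2 logarithm of the number of equally likely alternatives. A minimal Python sketch, with hypothetical function names; the Shannon entropy function is an added generalization for unequal frequencies, not something shown on the slides:

```python
import math
from collections import Counter


def bits_for_choices(n_alternatives):
    """Bits needed to pick one of n equally likely alternatives."""
    return math.log2(n_alternatives)


def shannon_entropy(sequence):
    """Average information per symbol, in bits, from observed frequencies."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


one_bit = bits_for_choices(2)    # 0 or 1
two_bits = bits_for_choices(4)   # 00, 01, 10, 11, or one of the bases A, C, G, T
```

Under this view a DNA base with four equally likely values carries at most 2 bits, and a sequence with skewed base composition carries less per position.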
Scientific Models
Physical models -- Mathematical models
[Figure: crossover and mutation. Labels: Parent B, crossover point, Child AB, Child BA, Mutation]
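The crossover and mutation labels can be illustrated with a short Python sketch. It assumes single-point crossover on equal-length strings and bit-flip mutation, which is one common reading of such a figure; the function names and the presence of a second parent (called parent A here) are assumptions, not taken from the slide.

```python
import random


def crossover(parent_a, parent_b, point):
    """Single-point crossover: swap the tails of two parents at `point`."""
    child_ab = parent_a[:point] + parent_b[point:]
    child_ba = parent_b[:point] + parent_a[point:]
    return child_ab, child_ba


def mutate(bits, rate, rng=random):
    """Flip each bit of a '0'/'1' string independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in bits)


child_ab, child_ba = crossover("000000", "111111", 2)
mutated = mutate(child_ab, rate=0.1, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the mutation step reproducible when experimenting.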
Markov Model (or Markov Chain)
A T C T A G
Probability of a sequence
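For a first-order Markov chain, the probability of a sequence is the probability of its first symbol multiplied by the transition probability of each following symbol given the one before it. A minimal Python sketch; the uniform probabilities below are placeholder values for illustration, not parameters from the slides:

```python
def sequence_probability(seq, initial, transition):
    """P(seq) = P(x1) * product over i of P(x_i | x_{i-1})."""
    p = initial[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= transition[prev][cur]
    return p


# Placeholder model: every base and every transition equally likely.
ALPHABET = "ACGT"
uniform_initial = {b: 0.25 for b in ALPHABET}
uniform_transition = {a: {b: 0.25 for b in ALPHABET} for a in ALPHABET}

p_example = sequence_probability("ATCTAG", uniform_initial, uniform_transition)
```

In practice the transition table would be estimated from training sequences, for example separately for coding and non-coding regions, so that the two models assign different probabilities to the same sequence.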
Sensitivity and Specificity

                  REALITY
                  c      nc
PREDICTION   c    TP     FP
             nc   FN     TN

Sensitivity: Sn = TP / (TP + FN)
Specificity: Sp = TP / (TP + FP)
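The two formulas above can be computed directly from a per-nucleotide comparison of prediction and reality. This Python sketch assumes that c and nc stand for coding and non-coding and encodes them as the characters "c" and "n"; the function names and the toy labels are hypothetical.

```python
def confusion_counts(reality, prediction):
    """Count TP, FP, FN, TN over two equal-length strings of 'c' / 'n' labels."""
    if len(reality) != len(prediction):
        raise ValueError("reality and prediction must have the same length")
    tp = fp = fn = tn = 0
    for real, pred in zip(reality, prediction):
        if pred == "c" and real == "c":
            tp += 1
        elif pred == "c":
            fp += 1
        elif real == "c":
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Sn = TP / (TP + FN): fraction of real coding positions that were predicted."""
    return tp / (tp + fn)


def specificity(tp, fp):
    """Sp = TP / (TP + FP): fraction of predicted coding positions that are real."""
    return tp / (tp + fp)


tp, fp, fn, tn = confusion_counts("ccccnnnn", "cccnncnn")
sn = sensitivity(tp, fn)
sp = specificity(tp, fp)
```

Note that Sp as defined on the slide is the quantity other fields call precision or positive predictive value; it does not use TN.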
Measures of Prediction Accuracy
Exon Level
[Figure: predicted exons aligned against real exons, labelled WRONG EXON, CORRECT EXON, and MISSING EXON]
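The three exon categories can be sketched in Python under one common set of definitions, which is an assumption here since the slide gives only the labels: a correct exon is a predicted exon whose boundaries match a real exon exactly, a wrong exon is a predicted exon that overlaps no real exon, and a missing exon is a real exon that overlaps no predicted exon. Exons are hypothetical (start, end) half-open intervals.

```python
def overlaps(a, b):
    """True if two half-open intervals (start, end) share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def classify_exons(real, predicted):
    """Return (correct, wrong, missing) exon lists under the assumed definitions."""
    real_set = set(real)
    correct = [p for p in predicted if p in real_set]
    wrong = [p for p in predicted if not any(overlaps(p, r) for r in real)]
    missing = [r for r in real if not any(overlaps(r, p) for p in predicted)]
    return correct, wrong, missing


real_exons = [(10, 50), (100, 150), (200, 260)]
predicted_exons = [(10, 50), (60, 80), (105, 150)]
correct, wrong, missing = classify_exons(real_exons, predicted_exons)
```

A predicted exon that overlaps a real one without matching both boundaries, such as (105, 150) here, falls into none of the three lists, which is why exon-level scores are usually reported alongside nucleotide-level ones.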