RAW READS (SEQUENCE DATA) -> processing + analysis -> GENOMIC VARIATION
Bedrock of somatic variant discovery: T/N pair comparisons (TUMOR vs. NORMAL)
http://www.path.cam.ac.uk/~pawesh/BreastCellLineDescriptions/HCC1954.html
Various patterns of rearrangements
[Figure: "weird" pairs, Chr1 and Chr6; tracks: TUMOR, NORMAL]
Not to be confused with mere germline deletions
[Figure: 10.7 Kb region; tracks: TUMOR, NA12878 (trio)]
Beware artifacts from mapping anomalies
[Figure: Chr18 and Chr6; tracks: TUMOR, NORMAL]
Beware also artifacts from homology
[Figure: scale markers 16.80 Mb, 25 Kb, 16.81 Mb; pairmates map to chr1:20,840,192 and to chr1:144,004,088]
TUMOR: pairmates are from multiple distant locations
NORMAL: pairmates are oriented randomly
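The distinction drawn on these slides (mates clustering at one partner locus versus mates scattered over several distant loci) can be illustrated with a minimal Python sketch. The `ReadPair` record, the 1000 bp insert-size cutoff, the 1 Mb bin size, and the anchor positions are all hypothetical choices made for illustration; only the two chr1 mate coordinates echo the slide. This is not how any particular caller works.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical minimal record for one aligned read pair."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def is_weird_pair(pair: ReadPair, max_insert: int = 1000) -> bool:
    """Flag pairs whose mates map to different chromosomes or too far apart.

    max_insert is an assumed cutoff, not a value from the slides.
    """
    if pair.chrom1 != pair.chrom2:
        return True
    return abs(pair.pos2 - pair.pos1) > max_insert


def distinct_mate_targets(pairs, bin_size: int = 1_000_000) -> int:
    """Count distinct (chromosome, bin) locations hit by mates of weird pairs."""
    return len({(p.chrom2, p.pos2 // bin_size) for p in pairs if is_weird_pair(p)})


# Illustrative data: mates that all land in one region of another chromosome.
consistent = [ReadPair("chr1", 1_000 + i, "chr6", 5_000_000 + 10 * i) for i in range(5)]

# Illustrative data: mates that land in several distant places (anchor positions made up).
scattered = [
    ReadPair("chr1", 16_800_500, "chr1", 20_840_192),
    ReadPair("chr1", 16_800_900, "chr1", 144_004_088),
]

n_consistent = distinct_mate_targets(consistent)
n_scattered = distinct_mate_targets(scattered)
```

A single target bin is what one would expect from a simple rearrangement; several targets for the same anchor region is a hint to suspect homology or mapping artifacts before believing the call.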
Types of variants
[Figure: TUMOR vs. NORMAL]
Two main types of false positives
[Figure: TUMOR vs. NORMAL for each type]
At risk: every base. Source: misread bases, sequencing artifacts, misaligned reads.
At risk: ~1000 germline / Mb (known). Source: low coverage in normal.
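To make the "low coverage in normal" source concrete, here is a rough Python sketch of how often a germline variant would show no supporting reads in the normal sample. It assumes every germline site is heterozygous (alt read probability 0.5), ignores sequencing error, and treats reads as independent draws; the function names are illustrative, and the 1000 sites per Mb figure is the one quoted on the slide.

```python
def prob_germline_missed(normal_depth: int, alt_fraction: float = 0.5) -> float:
    """Probability that none of the normal reads carries the germline allele.

    Assumes independent reads and no sequencing error (a simplification).
    """
    if normal_depth < 0:
        raise ValueError("normal_depth must be non-negative")
    return (1.0 - alt_fraction) ** normal_depth


def expected_missed_per_mb(normal_depth: int, germline_per_mb: float = 1000.0) -> float:
    """Expected germline sites per Mb with zero supporting reads in the normal."""
    return germline_per_mb * prob_germline_missed(normal_depth)


# Compare a shallow and a deeper normal sample.
missed_at_8x = expected_missed_per_mb(8)
missed_at_30x = expected_missed_per_mb(30)
```

Under these assumptions the expected number of unseen germline sites halves with each additional read of normal coverage, which is why shallow normals are the main concern for this type of false positive.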
[Figure: TUMOR and NORMAL samples; mixture of 33% N and 67% T]
The (variant) allelic fraction is the fraction of alleles (DNA molecules) from a locus that carry the variant
-> Also the expected fraction of supporting reads
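As a worked illustration of this definition, the Python sketch below computes an observed allelic fraction from read counts and an expected one from sample purity. The purity-and-copy-number formula is a common simplification and an assumption here, not something stated on the slide; the function and parameter names are illustrative.

```python
def observed_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a locus that support the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def expected_allele_fraction(
    purity: float,
    variant_copies: int = 1,
    tumor_copy_number: int = 2,
    normal_copy_number: int = 2,
) -> float:
    """Expected fraction of DNA molecules at the locus that carry the variant.

    purity is the fraction of cells in the sample that are tumor cells.
    Assumes the variant is present in every tumor cell (no subclones).
    """
    variant_alleles = purity * variant_copies
    total_alleles = purity * tumor_copy_number + (1.0 - purity) * normal_copy_number
    return variant_alleles / total_alleles


# A heterozygous somatic variant in a diploid region of a 67% T / 33% N sample.
af_expected = expected_allele_fraction(purity=0.67)

# 12 supporting reads out of 40 at the locus.
af_observed = observed_allele_fraction(alt_reads=12, total_reads=40)
```

With these inputs the expected fraction works out to about one third rather than one half, because the normal cells in the sample dilute the variant allele.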
Carter et al., Nat. Biotechnol. (2012)
Which is even worse when the tumor involves subclones
For MuTect: sensitivity (recall) decreases with the variant allele fraction
[Figure: sensitivity vs. false positive rate (Mb^-1) and sensitivity vs. tumor sample sequencing depth, for allele fractions f = 0.4, 0.2, 0.1 and 0.05. Series: Calculation (Q35), MuTect STD, MuTect HC (virtual tumors), MuTect HC (downsampling), MuTect HC + PON (downsampling)]
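The trend in this figure can be approximated with a simple binomial model: at a given depth, how likely is it to see at least a few reads supporting a variant with a given allele fraction? The sketch below is a simplification under stated assumptions (independent reads, no sequencing error, a hypothetical threshold of 3 supporting reads); it is not the calculation used for the figure and not MuTect's statistical model.

```python
from math import comb


def prob_at_least_k_alt_reads(depth: int, allele_fraction: float, min_alt_reads: int = 3) -> float:
    """P(X >= min_alt_reads) for X ~ Binomial(depth, allele_fraction)."""
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be between 0 and 1")
    # Sum the probabilities of seeing fewer than min_alt_reads supporting reads.
    upper = min(min_alt_reads, depth + 1)
    p_fewer = sum(
        comb(depth, i) * allele_fraction**i * (1.0 - allele_fraction) ** (depth - i)
        for i in range(upper)
    )
    return 1.0 - p_fewer


# Detection probability at 30x tumor depth for the allele fractions in the figure.
power_by_af = {f: prob_at_least_k_alt_reads(30, f) for f in (0.4, 0.2, 0.1, 0.05)}
```

Even in this crude model, the probability of collecting enough supporting reads drops as the allele fraction falls and recovers as depth increases, which matches the direction of the curves described above.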
Mapping and pre-processing
Same as germline: BWA + Picard + GATK
Done separately for each sample in a tumor/normal pair
Cancer-specific pre-processing
Estimation of cross-sample contamination
Variant discovery
Call SNPs with MuTect
Indels coming soon! (M2)
Post-processing
Some post-processing to rescue TiN (tumor-in-normal) variants and eliminate artifacts
Further reading
Documentation coming soon to the GATK website
In the meantime, see http://www.broadinstitute.org/cancer/cga/Home