
REVIEWS

APPLICATIONS OF NEXT-GENERATION SEQUENCING

Translating RNA sequencing into clinical diagnostics: opportunities and challenges

Sara A. Byron1, Kendall R. Van Keuren-Jensen2, David M. Engelthaler3, John D. Carpten4 and David W. Craig2
Abstract | With the emergence of RNA sequencing (RNA-seq) technologies, RNA-based biomolecules
hold expanded promise for their diagnostic, prognostic and therapeutic applicability in various
diseases, including cancers and infectious diseases. Detection of gene fusions and differential
expression of known disease-causing transcripts by RNA-seq represent some of the most immediate
opportunities. However, it is the diversity of RNA species detected through RNA-seq that holds new
promise for the multi-faceted clinical applicability of RNA-based measures, including the potential of
extracellular RNAs as non-invasive diagnostic indicators of disease. Ongoing efforts towards the
establishment of benchmark standards, assay optimization for clinical conditions and demonstration
of assay reproducibility are required to expand the clinical utility of RNA-seq.

1 Center for Translational Innovation, Translational Genomics Research Institute, Phoenix, Arizona 85004, USA.
2 Neurogenomics Division, Translational Genomics Research Institute, Phoenix, Arizona 85004, USA.
3 Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, Arizona 86001, USA.
4 Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, Arizona 85004, USA.
Correspondence to D.W.C. dcraig@tgen.org
doi:10.1038/nrg.2016.10
Published online 21 Mar 2016

RNA is a dynamic and diverse biomolecule with an essential role in numerous biological processes. From a molecular diagnostic standpoint, RNA-based measurements have the potential for broad application across diverse areas of human health, including disease diagnosis, prognosis and therapeutic selection. Technological advancements have continually shaped the way that RNA-based measurements are used in the clinic (BOX 1). With the evolution of next-generation sequencing (NGS) technologies, the use of RNA sequencing (RNA-seq) to investigate the vast diversity of RNA species is an obvious and exciting application, which opens up entirely new opportunities for improving diagnosis and treatment of human disease. RNA-seq provides an in-depth view of the transcriptome, detecting novel RNA transcript variation1. Beyond operating as an open platform technology, RNA-seq has a number of potential advantages over gene expression microarrays, including an increased dynamic range of expression, measurement of focal changes (such as single nucleotide variants (SNVs), insertions and deletions), detection of different transcript isoforms, splice variants and chimeric gene fusions (including previously unidentified genes and/or transcripts), and, fundamentally, it can be performed on any species. Although RNA-seq assays are now commercially available2,3, these early tests belie the considerable promise for broader applicability of RNA-seq-based clinical tests.

Next-generation sequencing (NGS). High-throughput, massively parallel sequencing technology that is used in various applications, including whole-genome sequencing, exome sequencing and RNA sequencing.

Here we review a selection of current and potential clinical applications of RNA-seq, focusing on differential expression, rare or fusion transcript detection and allele-specific expression, before discussing the emerging areas in pathogen detection and measurement of non-coding RNA (ncRNA) species. An overview of the clinically relevant RNA species discussed within the Review is summarized within FIG. 1, focusing on those RNA species that hold the greatest promise for directly impacting current and future clinical testing, with an understanding that the full diversity of RNA species and their putative roles are covered in other reviews4,5. In addition, we do not review NGS assays and technologies, as this topic is well reviewed and beyond our intended scope. We finish by discussing the challenges faced in translating this technology into clinical practice, including the regulatory environment and ongoing efforts to establish reference standards and the best practices for RNA-seq as a clinical test that is capable of high reproducibility, accuracy and precision.

Opportunities enabled by RNA-seq

As in whole-genome and whole-exome sequencing, RNA-seq involves sequencing samples with billions of bases across tens to hundreds of millions of paired or unpaired short reads. This vast amount of short-read RNA-seq data must be bioinformatically realigned and assembled to detect and measure expression of hundreds of thousands of RNA transcripts. Not only can RNA-seq detect underlying genomic alterations at single nucleotide resolution within expressed regions of the genome, it can also quantify expression levels and capture variation not detected at the genomic level, including the

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 1



© 2016 Macmillan Publishers Limited. All rights reserved.

expression of alternative transcripts1,6. Similar to serial analysis of gene expression (SAGE), a predecessor tag-based sequencing method for genome-wide expression analysis7, RNA-seq allows quantification of transcripts without pre-defining the RNA targets of interest and provides improved detection of RNA splice events1,6. Unlike most historical platforms for clinical RNA measurement, such as microarrays and quantitative reverse transcription PCR (qRT-PCR), RNA-seq is fundamentally an open platform technology, allowing both quantification of known or pre-defined RNA species and the capability to detect and quantify rare and novel RNA transcript variants within a sample1. RNA-seq also has a greater dynamic range for quantifying transcript expression compared to microarray technology8, providing the potential for increased detection of variation within a sample. Overall, RNA-seq can identify thousands of differentially expressed genes, tens of thousands of differentially expressed gene isoforms and can detect mutations and germline variations for hundreds to thousands of expressed genetic variants (thus facilitating the assessment of allele-specific expression of these variants), as well as detecting chimeric gene fusions, transcript isoforms and splice variants5,9. In addition, RNA-seq can characterize previously unidentified transcripts and diverse types of ncRNAs, including microRNAs (miRNAs), PIWI-interacting RNAs (piRNAs) and tRNAs5. Indeed, the open platform of RNA-seq for detecting and measuring temporally dynamic RNA species sets the stage for considerable challenges and even more considerable opportunities associated with RNA-seq moving into the clinical test environment.

Box 1 | How technology has shaped the evolution of RNA as a clinical biomarker

Historically, gene expression analysis within the clinic has focused primarily on single gene tests using quantitative reverse transcription PCR (qRT-PCR)131, such as in the detection of the influenza virus132. This method has several advantages, including being fast, accurate, sensitive, high-throughput in terms of the number of clinical samples that can be analysed, cost-effective and requiring low sample input. For these reasons and the historic nature of this platform, qRT-PCR is generally deemed the gold standard method for measuring transcript levels, particularly in the clinical space; however, there are a number of limitations, including the fact that although it is a high sample throughput technology, relatively few markers or measurements can be made in a single assay. After the initial studies describing the relatively reproducible hybridization-based methods to assess the expression of multiple gene targets using arrayed probes on solid surfaces133, it became clear that this microarray technological revolution would lead to new opportunities for clinical assay development. Measurement of several RNA targets at one time (as gene expression profiles) became associated with potential diagnostic or prognostic parameters in research. It was crucial to assess the clinical validity of these technologies for multi-gene profile tests. From a gene expression standpoint, MammaPrint (Agendia) provides an excellent example of a microarray-based clinical test that simultaneously measures the expression of 70 genes in breast tumours as a profile to help predict the risk of recurrence134. Although powerful, microarray-based assays can have limitations in some environments, such as those related to laboratory-to-laboratory variation in sample preparation that can affect reproducibility. Moreover, for some applications, microarray signal-to-noise ratios can affect the limit of detection. Interestingly, a number of additional cancer multi-gene profile tests are clinically available, such as Oncotype DX (Genomic Health)135 for breast cancer recurrence risk and Prolaris (Myriad)136 for prostate cancer aggressiveness. These tests are based on qRT-PCR technologies, rather than microarrays, largely owing to the belief that qRT-PCR is more reliable, reproducible, sensitive and accurate.

As we enter the era of next-generation sequencing (NGS) technologies, RNA sequencing (RNA-seq) can be brought to bear on clinical gene testing. RNA-seq-based tests can provide unprecedented flexibility, sensitivity and accuracy to gene expression measurements. Moreover, the diversity of RNA species opens up simultaneous measurements of rare transcripts, splice variants and non-coding RNA species. For example, the diverse reach of RNA species from RNA-seq is becoming increasingly relevant, particularly in cancer. In addition to providing direct detection of RNA from fused genes, RNA-seq detection of specific oncogenic splice variants, such as from EGFR137 and androgen receptor31, will probably have prognostic and therapeutic relevance. Indeed, whereas microarrays and qRT-PCR are closed platforms, with clearly defined transcript detection and measurement, RNA-seq is an open platform by nature. Likewise, the ability to identify novel transcripts may introduce clinical interpretation challenges, perhaps calling for terminology analogous to the 'variants of unknown significance' used in clinical genomic DNA sequencing. Still, and perhaps more than is often appreciated, establishment and standardization of methods for assessing reproducibility, accuracy and precision in a variety of clinically relevant conditions are needed to facilitate adoption of RNA-seq tests in the clinical laboratory setting.

Open platform. A technology platform that does not depend on genome annotation, or on predesigned species-specific or transcript-specific probes, for transcript measurement. RNA-seq technology functions as an open platform allowing for unbiased detection of both known and novel transcripts.

Single nucleotide variants (SNVs). Single nucleotide (A, T, G or C) alterations in a DNA sequence.

Detecting aberrant transcription in human disease

mRNA expression profiling. Multigene mRNA signature-based assays are being increasingly incorporated into clinical management. These assays use various technology platforms to measure mRNA expression of different multigene panels and have broad clinical application (TABLE 1). For example, in breast cancer, recent clinical guidelines support the use of multigene mRNA-based prognostic assays to assist in treatment decisions in conjunction with clinicopathological factors10,11. Indeed, the Oncotype DX 21-gene expression assay was recently validated in a prospectively conducted study in breast cancer12. A study comparing clinically relevant breast cancer gene expression signatures across microarrays and RNA-seq reported strong cross-platform correlation for expression of genes from the Oncotype DX and MammaPrint signatures (Spearman correlations of 0.965 and 0.97, respectively)13. In other work14, systematic evaluation of RNA-seq-based and microarray-based classifiers found that RNA-seq outperformed arrays in characterizing the transcriptome of cancer and performed similarly to arrays in clinical endpoint prediction.

AlloMap is a non-invasive gene expression-based blood test that is used to manage the clinical care of heart transplant recipients, providing a quantified score for the risk of rejection based on the measurement of expression of 20 genes, a subset of which are related to immune system activation and signalling15,16. The potential for using RNA-seq in immune-related diseases is expanding rapidly, and the ability to quickly target and sequence the repertoire of T cell and B cell receptors from patients is beginning to mature, using techniques such as those from Adaptive Biotechnologies and ImmunoSeq. These strategies allow examination of immune-related diseases and immunotherapy response in new ways, as exemplified in a recent report in which RNA-seq and exome sequencing were used together to evaluate mutation load, expressed neoantigens and immune microenvironment expression as predictors of response to immune checkpoint inhibitor therapy in melanoma17.

Gene fusions. Oncogenic gene fusions are well recognized for their pathogenic role in cancer. In some cases, recurrent gene fusions correlate with specific tumour subtypes, allowing gene fusion status to be used for


diagnostic purposes. According to the 2008 WHO (World Health Organization) classification, diagnosis of acute myeloid leukaemia (AML) can be made regardless of blast count based on detection of recurrent genetic abnormalities, such as the t(8;21)(q22;q22) translocation, RUNX1–RUNX1T1 fusion (involving isoforms of runt-related transcription factor 1 (RUNX1); this fusion is also known as AML1–ETO)18. Gene fusions have also been associated with prognosis and have been suggested as biomarkers for screening and assessment of cancer risk, as exemplified by the TMPRSS2–ERG fusion (involving transmembrane protease serine 2 gene (TMPRSS2) and v-ets avian erythroblastosis virus E26 oncogene homologue (ERG)) in prostate cancer19.

Several US Food and Drug Administration (FDA)-approved targeted agents have clinical biomarkers amenable to RNA-seq, including agents with activity against known oncogenic fusions. The prototypical example is the marked efficacy of kinase inhibitors (for example, imatinib) in BCR–ABL1-positive (involving breakpoint cluster region (BCR) and tyrosine-protein kinase ABL1 (ABL1)) chronic myeloid leukaemia (CML)20. While karyotyping is typically used for diagnosis of CML, qRT-PCR measurement of BCR–ABL1 transcripts is recommended to monitor the molecular response during kinase inhibitor treatment21. Continued advances in long-read RNA-seq technology promise to further expand the utility of fusion transcript measurements, allowing detection of mutation phasing by measuring across the full fusion transcript. Recently, single-molecule long-read RNA-seq was applied to longitudinal samples from patients with BCR–ABL1-positive CML with poor treatment response22. The results provided a clonal view of the range of resistance mutations, distinguishing between compound mutations in the same RNA molecule and independent alterations present on different molecules of the BCR–ABL1 fusion transcript. The authors reported sensitive detection that resulted in the identification of several mutations that escaped detection by routine clinical analysis and, for one of the proof-of-concept cases, detected a known drug-resistant mutation in a longitudinal patient sample four months earlier than detected by Sanger sequencing22.

Aside from monitoring BCR–ABL1 fusion transcript levels during treatment, current clinical guidelines rarely recommend RNA-based detection of gene fusions. The EML4–ALK fusion (involving echinoderm microtubule associated protein like 4 (EML4) and anaplastic lymphoma receptor tyrosine kinase (ALK)) was originally reported in a subset of non-small-cell lung cancers (NSCLC) in 2007, and the ALK inhibitors crizotinib and ceritinib gained FDA approval in ALK-rearrangement-positive NSCLC in 2011 and 2015, respectively. EML4–ALK is typically detected by fluorescence in situ hybridization (FISH) using commercial break-apart probes that flank a highly conserved translocation breakpoint in the ALK genomic locus23, with the emerging use of immunohistochemistry-based strategies to detect overexpression of ALK protein24. Recent clinical guidelines recommend against using qRT-PCR-based ALK fusion detection for treatment selection in lung cancer owing to the challenges of RNA sample quality in routine formalin-fixed paraffin-embedded (FFPE) pathology samples, as well as the risk of false negatives resulting from limitations in detecting fusions involving novel ALK fusion partners25.

In recent years, clinical detection of gene fusions has advanced beyond assays to detect individual fusion events to the introduction of RNA-seq assays, which allow a more comprehensive evaluation of potential gene fusions. For example, the FoundationOne Heme assay uses RNA-seq with genomic sequencing to detect common gene fusions in haematological cancers and sarcomas2,3. Reports are emerging of clinical responses by patients receiving treatment on the basis of gene fusion detection with this assay2,3. RNA-seq promises to expand the repertoire of detectable gene fusions, not only by capturing more subtle intrachromosomal rearrangements but also allowing detection of fusion products with uncharacterized fusion partners. Efforts are underway to catalogue gene fusions detected across various tumour types using RNA-seq data26, although additional studies are needed to define the clinical value for identified fusions.

Alternative transcripts. Alternative transcript variants, arising from splicing alterations or structural variants, have been identified and implicated in a range of human diseases, including developmental disorders27, neurodegenerative disorders28,29 and cancers30. There is also growing evidence that the presence of alternative transcripts can have therapeutic implications. For example, expression of the alternatively spliced androgen receptor variant 7 (AR-V7) has been detected in the circulating tumour cells of ~30% of men with castration-resistant prostate cancer and is associated with reduced response to androgen receptor-directed therapies31,32. Similarly, expression of the tumour-specific epidermal growth factor receptor (EGFR) variant III (EGFRvIII) transcript is well described in glioblastoma, arising from an in-frame deletion encompassing exons 2–7 (REF. 33); clinical trials targeting EGFRvIII are underway34. In some cases, the mechanisms contributing to the generation of alternative transcripts may be missed by exome sequencing. Alternative breast cancer 1 (BRCA1) transcripts have been identified in a subset of patients with breast cancer who have a family history of breast and/or ovarian cancer35. Notably, these patients had previously tested negative for pathogenic BRCA1 or BRCA2 mutations by conventional genomic analysis35. It is anticipated that RNA-seq data will provide a more complete view of altered splicing and disease-specific transcripts, and that the growing body of transcriptome data will be a rich resource for discovery of disease-specific isoform transcripts as potential diagnostic markers and therapeutic targets.

Reference standards. Highly characterized and standardized control materials that are used to ensure accuracy and comparability of assays.

Spearman correlation. Statistical measure of the strength of association between two rank-ordered variables.

Phasing. Evaluation of closely situated mutations to determine whether they reside on the same or different alleles.

Break-apart probes. A DNA probe system used to detect rearrangements involving specific loci. Probes for a region 5′ of the designated breakpoint are labelled with one colour and probes for a region 3′ of the breakpoint are labelled with another colour. An overlapping signal (such as yellow for red and green probes) indicates a normal pattern, whereas distinct signals (that is, red and green) indicate the presence of a rearrangement.

Structural variants. Genomic variants, other than single-nucleotide variants, involving large regions of DNA, including insertions, deletions, inversions and duplications.

Allele-specific expression. Allele-specific expression (ASE) can arise through multiple mechanisms, including genetic imprinting, X chromosome inactivation and allele-specific transcription36; in some cases, ASE has been associated with predisposition to disease37,38. One


[Figure 1: schematic in four panels (small RNA sequencing; genomic DNA sequencing; transcriptome sequencing; mRNA sequencing). Recoverable panel labels include miRNA (~18–24 nucleotides) and piRNA (~26–32 nucleotides).]



of the most common mechanisms for ASE is genomic imprinting, whereby one allele is silenced through DNA methylation and histone modifications, leaving biased expression of the transcribed single nucleotide polymorphisms (SNPs) in a parent-of-origin specific manner. Imprinted gene clusters are frequently associated with human disease, as disease syndromes can arise from alterations on the single non-silenced, parental allele. For example, Angelman syndrome, a neurogenetic disorder associated with intellectual disability, speech impairment and a risk of seizures, is a well-studied imprinting disorder caused by deficient maternal allele expression of ubiquitin protein ligase E3A (UBE3A) in the brain39. In addition to epigenetic and transcriptional regulation, post-transcriptional mechanisms such as alternative splicing and protein-truncating mutations can also contribute to ASE40. For example, variants affecting splicing can cause exons to be skipped, leading to ASE of variants contained within the exon41; likewise, a premature stop codon can lead to nonsense-mediated decay of one allele, resulting in ASE of the other42,43.

Evaluating ASE within RNA-seq data can inform our understanding of regulatory variation and aid in the functional interpretation of genetic variants44. Initial applications of ASE-RNA-seq focused on genomic regions that contribute to variation in transcript expression levels, termed expression quantitative trait loci, looking at Nigerian individuals from the International HapMap project. RNA-seq of lymphoblastoid cell lines derived from these individuals, coupled with the corresponding genotypes from the HapMap project, resulted in the identification of over 1,000 genes for which genetic variation influenced transcript levels or splicing and showed high concordance between polymorphisms located near genes and ASE45. More recently, as part of the Genotype–Tissue Expression (GTEx) Program, RNA-seq is being carried out on samples from a broad range of tissues from hundreds of post-mortem donors to, in part, examine the influence of genetic variation on gene expression. By analysing ASE in the pilot GTEx data, the effects of truncating mutations on nonsense-mediated transcript decay were characterized, demonstrating the utility of using RNA-seq and ASE analysis to aid in the functional interpretation of genetic variants at the DNA level43.

Given that some events may be difficult to detect or predict, ASE can be an important correlative biomarker towards identifying a pathogenically relevant genetic variant. Overexpression of a mutant allele may indicate the presence of an otherwise unidentified promoter mutation or intronic variant impacting splicing. For example, ASE of transforming growth factor beta receptor 1 (TGFBR1) has been observed and is associated with an increased risk of colon cancer, even though the mechanism for ASE has not been identified37. In our own work, we used RNA-seq and ASE analysis to characterize both the chromosomal parent-of-origin and the extent of X-inactivation in a female child with mild cognitive impairment46. By performing family-trio (child and both parents) whole-exome sequencing and RNA-seq, we defined a de novo heterozygous deletion encompassing 1.6 kb on chromosome X, which contained several genes associated with neurological dysfunction; using SNPs as phasing markers, we demonstrated that the focal deletion was present on the paternal allele. The RNA-seq data further provided the ability to use ASE analysis to estimate skewed X chromosome inactivation, demonstrating favoured expression of the cytogenetically normal maternal allele, which we suggested contributes to the observed modest phenotype. Notably, RNA-seq provided a unique advantage over the traditional Humara assay, which measures DNA methylation of the androgen receptor gene on chromosome X, providing both a parent-of-origin and chromosome-wide view of X-inactivation46.

Degner et al.47 provided one of the first reports to ascertain RNA-seq read-mapping allele specificity, demonstrating a mapping bias for alleles with SNPs represented in the reference sequence, compared to that of the alternative allele, thus producing reference-biased ASE45,47. Filtering to remove biased sites resulted in an enrichment of SNPs with ASE in genes with previously reported cis-regulatory mechanisms or gene imprinting47. Measurement of ASE can also be confounded by difficulty aligning RNA split reads that harbour neighbouring SNPs and small indels, which can also lead to reference-biased ASE42. Recent reports provide recommendations for bioinformatic analysis and data processing to address these and other challenges and introduce tools for improved detection of ASE from RNA-seq data42,48.

Single nucleotide polymorphisms (SNPs). Single nucleotide alterations that represent single base-pair variation at a specific DNA position among individuals, the majority of which are inherited.

Nonsense-mediated decay. A translation-coupled RNA decay mechanism whereby aberrant mRNAs with premature stop codons are recognized and degraded.

Expression quantitative trait loci (eQTLs). Genomic loci that regulate the quantitative phenotypic trait of gene expression. Genetic markers at these loci are associated with measurable changes in gene expression.

RNA split reads. RNA sequencing reads that are split, for example to accommodate exon junctions.

Extracellular RNAs (exRNAs). RNAs found outside of the cell; they can be protected within vesicles or in association with RNA-binding proteins, and they can include exogenous sequences.

Extracellular RNAs. The investigation of extracellular RNAs (exRNAs) in biofluids to monitor disease is a rapidly growing area of diagnostic research. exRNAs are released from all cells in the body and are protected from degradation by carriage within extracellular vesicles or association with RNA-binding proteins (RBPs) and lipoproteins49–52,160. Measurement of exRNA is appealing as a non-invasive method for monitoring disease; as biofluids are more readily accessible than tissues, more frequent longitudinal sampling can occur. Transcripts from many tissue types, including neurons from the brain, have been shown to be accessible and detectable in plasma samples53. One obvious drawback is a lack of tissue specificity, as biofluids contain exRNAs and RBPs released from many tissue types. However, the size of the organ or tissue and proximity to the biofluid can influence the RNAs detected. For example, plasma samples have high levels of transcripts released from the

Figure 1 | Diversity of RNA species detection enabled by RNA sequencing applications. Various RNA sequencing (RNA-seq) methodologies can be used to measure diverse, clinically relevant RNA species. Small RNA deep sequencing uses size selection to sequence various small non-coding RNAs, including microRNAs (miRNAs) and PIWI-interacting RNAs (piRNAs). Precursor RNAs can be measured using random primer amplification and oligo(dT) primers can be used to select polyadenylated transcripts. RNA-seq also allows for detection and measurement of alternative transcripts, chimeric gene fusion transcripts and viral RNA transcripts, as well as evaluation for allele-specific expression. HPV, human papillomavirus; lncRNA, long non-coding RNA; poly(A), polyadenylation; qRT-PCR, quantitative reverse transcription PCR; rRNA, ribosomal RNA; snoRNA, small nucleolar RNA; VUSs, variants of undetermined significance.
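
As a concrete illustration of the allele-specific expression (ASE) analysis discussed in this section, the following minimal Python sketch tests whether the reads covering one heterozygous SNP depart from the expected allelic balance. It is not the method of any study cited here. The function names, the read counts and the `expected_ref_fraction` parameter are all assumptions introduced for illustration. The exact binomial test shown is one simple choice; real RNA-seq counts are often overdispersed, and dedicated ASE tools address this along with the reference mapping bias described above.

```python
import math


def reference_allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a heterozygous site that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def _binomial_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability mass, computed in log space to stay stable at high depth."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def ase_binomial_p_value(ref_reads: int, alt_reads: int,
                         expected_ref_fraction: float = 0.5) -> float:
    """Two-sided exact binomial p-value for allelic imbalance at one site.

    expected_ref_fraction is 0.5 under balanced expression; a value slightly
    above 0.5 could be supplied to allow for reference mapping bias
    (hypothetical parameter, not taken from the source).
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    if not 0.0 < expected_ref_fraction < 1.0:
        raise ValueError("expected_ref_fraction must lie strictly between 0 and 1")
    observed = _binomial_pmf(ref_reads, n, expected_ref_fraction)
    # Sum the probability of every outcome at least as unlikely as the observed one.
    total = sum(
        pmf
        for pmf in (_binomial_pmf(k, n, expected_ref_fraction) for k in range(n + 1))
        if pmf <= observed * (1.0 + 1e-7)
    )
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical (reference, alternative) read counts at three heterozygous SNPs.
    sites = {"snp_balanced": (50, 50), "snp_skewed": (90, 10), "snp_low_depth": (10, 0)}
    for name, (ref, alt) in sites.items():
        fraction = reference_allele_fraction(ref, alt)
        p_value = ase_binomial_p_value(ref, alt)
        print(f"{name}: reference fraction = {fraction:.2f}, p = {p_value:.3g}")
```

The same allelic fraction, aggregated over phased SNPs across chromosome X, is the kind of quantity that could serve as a rough estimate of skewed X chromosome inactivation. In practice, per-site p-values would also need correction for multiple testing.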


liver and heart, while saliva has abundant transcripts from salivary glands and the oesophagus (K.R.V.K.-J., unpublished observations). More recently, researchers are addressing this challenge by testing methods to selectively pull down extracellular vesicles derived from specific tissues, such as by immunoprecipitation for specific membrane proteins (such as L1 cell adhesion molecule (L1CAM) for neuronally derived vesicles)54,55.

Use of exRNAs for cancer detection has parallels to extracellular DNA in that mutations can be detected and measured in RNA transcripts, provided that the mutations are transcribed. Potential advantages of exRNAs are that there are many more copies of the RNA than the DNA (making assessment potentially more sensitive) and differences in expression level can indicate that an organ or tissue is injured or diseased, in a way that cannot be described by DNA measurements. The catalogue of exRNA contains a large number of mRNAs and a range of regulatory RNAs that can be thoroughly evaluated by RNA-seq. There is growing interest in using this non-invasive analysis of exRNAs to monitor disease, using changes in exRNAs as readouts for key disease pathways and indicators of therapeutic efficacy. Companies such as Exosome Diagnostics are developing exRNA-based Clinical Laboratory Improvement Amendments (CLIA) diagnostic tests to monitor key gene fusions (EML4–ALK) and mutations (EGFR T790M) from plasma samples56. ExoDx Lung (ALK) is the first such test, measuring EML4–ALK transcripts isolated from exosomes in plasma from patients with NSCLC. Notably, several groups have also used circulating RNA information to provide feedback about fetal health57.

The US National Center for Advancing Translational Sciences (NCATS) has recently launched the Extracellular RNA Communication Consortium58 to develop the use of exRNA as a diagnostic tool59. They have funded several groups to help develop a catalogue of exRNAs in healthy individuals and in a number of diseases60. With increasing support for exRNA research, there should be substantial gains in understanding how to best examine these biomolecules and overcome variability in detection.

Table 1 | Selected examples of current RNA-based clinical tests

RNA biomolecule | Method | Examples | Use
Viral RNA | qRT-PCR | Influenza virus68; Dengue virus69; HIV70; Ebola virus71 | Viral detection and typing
mRNA | qRT-PCR | AlloMap (CareDx; heart transplant)15,16; Cancer TypeID (BioTheranostics)143 | Diagnosis
mRNA | Microarray | Afirma Thyroid Nodule Assessment (Veracyte)116 | Diagnosis
mRNA | qRT-PCR | OncotypeDx (Genomic Health; breast, prostate and colon cancer)144–147; Breast Cancer Index (BioTheranostics)148; Prolaris (Myriad; prostate cancer)136 | Prognosis
mRNA | Digital barcoded mRNA analysis | Prosigna Breast Cancer Prognostic Gene Signature (Nanostring)149 | Prognosis
mRNA | Microarray | MammaPrint (Agendia; breast cancer)134; ColoPrint (Agendia; colon cancer)150; Decipher (Genome Dx; prostate cancer)151 | Prognosis
miRNA | Microarray | Cancer Origin (Rosetta Genomics)152 | Diagnosis
Fusion transcript | qRT-PCR | AML (RUNX1–RUNX1T1)18 | Diagnosis
Fusion transcript | qRT-PCR | BCR–ABL1 (REF. 21) | Monitoring molecular response during therapy
Fusion transcript | qRT-PCR (exosomal RNA) | ExoDx Lung (ALK) (Exosome Dx)161 | Fusion detection
Fusion transcript | RNA-seq | FoundationOne Heme2,3 | Fusion detection

AML, acute myeloid leukaemia; BCR, breakpoint cluster region; miRNA, microRNA; qRT-PCR, quantitative reverse transcription PCR; RNA-seq, RNA sequencing; RUNX1, runt-related transcription factor 1; RUNX1T1, runt-related transcription factor 1 translocated to 1 (cyclin D related).

Clinical Laboratory Improvement Amendments (CLIA). All laboratory testing on humans in the United States is regulated by The Centers for Medicare and Medicaid Services through CLIA. The purpose is to ensure quality and uniformity of laboratory tests.

Non-coding RNA species. Beyond mRNA quantification and detection of alternative transcripts, RNA-seq opens up possibilities to measure a considerable diversity of RNA species including long non-coding RNAs (lncRNAs) and various short RNA species including miRNAs and piRNAs (TABLE 2). Owing to their stability and regulatory role in health and disease, miRNAs have been extensively examined as potential diagnostic markers of disease. Currently, small RNA-seq of miRNAs and other targeted miRNA-array platforms have fallen short for reliable cross-platform accuracy61,62. A considerable obstacle to using small RNA-seq is the low level of validation observed across PCR and sequencing platforms. The US National Institute of Standards and Technology (NIST) has begun development of small RNA controls; external RNA controls to support the validation of assay results and improve platforms are a necessary next step for the utility of small RNA-seq and are discussed further in later sections.

Of the small regulatory RNAs, miRNAs are the best studied and have an updated, well-curated repository for sequence information: miRBase63. With increasing accessibility and popularity of small RNA-seq, there is growing interest in using this technology in other categories of regulatory RNA. However, correct alignment and categorization is hampered by the state of the small RNA databases. Some small RNA databases are well-maintained and curated (such as the Genomic tRNA Database64), whereas other databases maintain sequences that are predicted but not experimentally validated. There is also substantial sequence overlap between categories of RNA, such as piRNA, tRNA and rRNA, making downstream data analysis challenging.

There are new types of regulatory RNAs for which the diagnostic potential is unknown. Circular RNAs (circRNAs) were recently rediscovered in RNA-seq experiments searching for chromosomal rearrangements in cancer65. Although many groups are identifying new circRNAs and their potential functional roles


in the cell, there are few reports of function in disease pathogenesis. However, circRNAs have been found in high abundance in biofluids and tissues and have been found to be more stable than mRNAs, increasing their potential for diagnostic purposes66. Other regulatory RNAs have had new roles identified. For example, a role for tRNAs has been reported in cancer, whereby cleavage of tRNAs produced fragments that could displace the RNA-binding protein Y-box binding protein 1 (YBX1) from oncogenic transcripts, altering stability and suppressing breast cancer growth, invasion and metastasis67. A current challenge is associating new RNA discoveries, newly identified sequence information and the emerging roles for regulatory RNAs that challenge dogma with disease and diagnosis.

RNA-seq for infectious disease diagnosis

RNA-based pathogen diagnostics. Given the large number of clinically important RNA viruses (HIV, the Ebola, West Nile, dengue, hepatitis A, hepatitis D, hepatitis E, coxsackie and influenza viruses, and the severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) coronaviruses), qRT-PCR assays have been developed and are commonly used in the clinic for viral detection and typing68–71. It is likely that many of these targets will be translated into RNA-based sequencing assays in the near future. For example, unbiased non-targeted metagenomic RNA-seq has recently been used to directly detect influenza virus RNA in respiratory fluids, with additional viral pathogens detected in a subset of cases72. In a public health context, RNA-seq was used to track the origin and transmission patterns of the Ebola virus during the 2014 outbreak in West Africa73. RNA-based amplicon sequencing is also being explored for viral quasi-species (that is, mixed allele population) assessment for hepatitis C virus (HCV) and HIV; such analyses are necessary in the clinic to determine the presence and relative quantity of drug-resistance mutations for patient therapy, which can occur as minor components in a larger viral population74,75. However, clinical application of RNA-based diagnostics for infectious disease is still rare beyond the qRT-PCR assays for viral pathogens.

Metagenomic RNA-seq. A method of sequencing the entirety of the available RNA in a complex (for example, clinical or environmental) sample, which may or may not include steps to subtract the host RNA to improve or enrich for microbial RNA.

RNA-based amplicon sequencing. A method of direct sequencing of cDNA amplicons of RNA targets from a clinical sample. This can be multiplexed and can involve RNA viral genomes, microbial or host mRNA transcripts, or exogenous RNA targets.

Microbiome. The totality of the genomic content of microbial community members in a complex (for example, clinical or environmental) sample. In the human microbiome, each body site has its own unique microbiome; the entirety of the microbiome on and in an individual person is considered that person's pan-microbiome.

Microbial exogenous small RNA. A tremendous diversity of exogenous RNAs from non-human sources has been seen in human plasma, which indicates there is a relationship between the host and the microbiome, food sources and/or the environment76,77. The sources and importance of these microbial exogenous RNAs, which may or may not be encapsulated in outer-membrane vesicles78, are still being explored, particularly in the context of infection159. However, they hold a great deal of promise for new diagnostic targets. Extensive analysis

Table 2 | Regulatory non-coding RNA species

RNA species | Description | Potential clinical application
miRNA | miRNAs are ~18–24 nucleotides in length and represent the most extensively characterized group of small ncRNAs having activity in gene repression. | miRNAs are being pursued as potential biomarkers in a broad spectrum of diseases, from cancer to Alzheimer disease to cardiovascular disease. A microarray-based miRNA test is currently available for use in characterizing cancer origin152.
piRNA | piRNAs are ~26–32 nucleotides in length, with functions in transposon repression and maintenance of germline genome integrity. | piRNAs have been implicated in cancer, with an initial study demonstrating an association between increased expression of piRNA and poor prognosis in soft-tissue sarcomas153.
snRNA | snRNAs are ~100–300 nucleotides in length, localized to the nucleus, with functions in RNA processing and splicing. | Circulating levels of U2 snRNA fragments (RNU2-1f) have been proposed as potential diagnostic biomarkers in various tumour types, including pancreatic cancer and colorectal cancer154.
snoRNA | snoRNAs have two main classes, box C/D snoRNAs, ~60–90 nucleotides in length, and box H/ACA snoRNAs, ~120–140 nucleotides. snoRNAs play a key role in ribosome biogenesis and rRNA modifications. | Levels of snoRNA and/or their functional fragments have been proposed as potential clinical diagnostic measures, with applications being pursued in fields such as cancer and neurodegenerative disorders. Two snoRNAs were recently identified in sputum samples and shown to have potential use as diagnostic biomarkers in lung cancer155.
lncRNA | lncRNAs represent the category of ncRNAs that are greater than 200 nucleotides in length and function to regulate gene expression. | lncRNAs have been associated with cancer prognosis, with potential utility as biomarkers in cancer. Tests such as ExoIntelliScore Prostate include lncRNA as a biomarker156.
circRNA | circRNAs are lncRNAs that contain a covalent bond between the 5′ and 3′ ends, resulting in a continuous circular loop. circRNAs can act as miRNA sponges and regulators of splicing and transcription. | Although little is known about the association of circRNAs with disease, initial studies are exploring circRNA levels as potential biomarkers in cancer; a recent study showed reduced levels of a specific circRNA (hsa_circ_002059) in gastric tumours compared to adjacent non-tumour tissue157.
tRNA | tRNAs help with translation of mRNA to protein. tRNAs are highly structured and have many modifications to bases, making them difficult to sequence through. | Recent evidence suggests that tRNA fragments are cleaved in the presence of hypoxic or other stressful conditions. They can, in some cases, act as decoys for RNA binding proteins, causing destabilization of other transcripts158.

circRNA, circular RNA; lncRNA, long non-coding RNA; miRNA, microRNA; piRNA, PIWI-interacting RNA; snRNA, small nuclear RNA; snoRNA, small nucleolar RNA; tRNA, transfer RNA.
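The approximate length ranges in Table 2 show why length alone cannot assign a transcript to one RNA class. The following Python sketch is illustrative only and is not a method from this Review: the dictionary, the threshold constant and the function `candidate_classes` are hypothetical names, and the ranges are the approximate values from Table 2. A real annotation workflow would rely on alignment to curated databases rather than on length.

```python
from __future__ import annotations

# Approximate transcript lengths in nucleotides, taken from Table 2.
# These are rough guides, not strict definitions of each class.
LENGTH_RANGES_NT = {
    "miRNA": (18, 24),
    "piRNA": (26, 32),
    "snRNA": (100, 300),
    "snoRNA (box C/D)": (60, 90),
    "snoRNA (box H/ACA)": (120, 140),
}

# Table 2 describes lncRNAs as greater than 200 nucleotides in length.
LNCRNA_MIN_EXCLUSIVE_NT = 200


def candidate_classes(length_nt: int) -> list[str]:
    """Return every RNA class whose approximate length range includes length_nt."""
    if length_nt <= 0:
        raise ValueError("length_nt must be a positive integer")
    matches = [
        name
        for name, (low, high) in LENGTH_RANGES_NT.items()
        if low <= length_nt <= high
    ]
    if length_nt > LNCRNA_MIN_EXCLUSIVE_NT:
        matches.append("lncRNA")
    return matches


if __name__ == "__main__":
    for length in (22, 25, 130, 250):
        print(length, candidate_classes(length))
```

Lengths that fall in more than one range (for example, within both the snRNA and box H/ACA snoRNA ranges) return several candidates, and lengths outside every range return none, which is one reason sequence-based annotation is needed.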


(in vitro, ex vivo and in vivo) and cataloguing of small RNAs produced by pathogens has been under way for several years and will provide a comprehensive reference database of exogenous RNA signals that may be useful for future clinical infection studies79–81. For example, ex vivo studies on Neisseria meningitidis infections in human blood have yielded dozens of small RNAs that seem to be associated with bacteraemic infections82. Similarly, multiple studies of the Mycobacterium tuberculosis microRNAome have yielded numerous biomarkers that are currently being explored for diagnostic purposes83 and even for phenotype–genotype predictive diagnostics (such as for identifying the presence of multi-drug resistance)84.

Pathogen mRNA. Measurement of microbial mRNA may be a useful marker of infection, as expression may improve detection in cases of low-level infections (for example, bacteraemia and cerebrospinal fluid infections) and could act as a better predictor of disease compared to direct genomic detection. For example, the simple detection of human papilloma virus (HPV) DNA is not sufficient to diagnose HPV-related squamous cell carcinoma (HPV DNA is detectable in ~14% of healthy control women85); thus, RNA-based diagnostics to detect HPV have been developed. HPV early oncoprotein E6/E7 mRNA detection, as a surrogate for active infection, may provide a better predictive value for cervical cancer.

Host RNA. Host response, in the form of mRNA signatures, is also likely to become useful for monitoring specific infections. For example, upregulation of specific host immune factors (interferon beta 1 (IFNB1), interferon lambda 2 (IFNL2) and interferon lambda 3 (IFNL3)) was recently demonstrated for genotype 3 HCV infection, which is associated with accelerated liver fibrosis and is an independent risk factor for hepatocellular carcinoma, compared to non-genotype 3 HCV infection86,87. Host microRNA–gene interactions during the infection response are also proving to be a fruitful source of potential diagnostic biomarkers for specific infections, as well as distinguishing between active and latent infections; for example, the use of such markers to discriminate between latent infection and active disease with M. tuberculosis88. As with the other diagnostic uses of small RNA biomarkers described in this Review, a number of hurdles exist for the development and validation of infection assays, including sensitivity, specificity and, perhaps more difficult to overcome, the normalization of small RNA in clinical samples, which will vary between conditions, tissues and individual hosts and is an active area of study89. However, the application of RNA-seq provides a useful orthogonal approach to genomics-based diagnostics for clinical microbiology.

Challenges moving RNA-seq to the clinic

Translating an assay to the clinic. Translation and broader adoption of a laboratory test into the clinic involves evaluation and demonstration of analytical validity, clinical validity and, eventually, clinical utility90 (FIG. 2). Analytical validity generally refers to the ability of the test to measure the intended biomolecules within clinically relevant conditions. Establishment of an analytically valid test can have different meanings depending on the regulatory framework that the test falls under, as discussed below. However, analytical validity generally implies that the test has undergone thorough technical performance characterization. Clinical validity refers to the ability of a test to predict a clinical outcome given a set of events, irrespective of whether the test results can enable an effective therapy. Clinical utility indicates whether a test provides useful information, positive or negative, for the patient being tested. Tests that can either indicate a more effective therapy, such as a companion diagnostic, or provide information on avoiding some therapies may both have clinical utility.

Companion diagnostic. In vitro diagnostic tests that provide information critical for the safe and effective use of a corresponding therapeutic agent. These tests are used to select patients for treatment with specific agents, including identifying patient populations with predicted efficacy as well as those that should not receive the agent due to a low likelihood of effectiveness or possible serious adverse events from the therapy.

Performance metrics and reference standards. To be analytically valid, a laboratory test must deliver accurate information with reproducible and robust performance. Accuracy is determined by evaluating a measured or calculated value compared to a reference gold standard, with evaluation of sensitivity (ability to detect true positives) and specificity (ability to detect true negatives). The test must also provide the same or similar results with repeat testing (reproducibility) and withstand small, deliberate changes in pre-analytic or analytic variables associated with testing (robustness).

Establishing reference standards and the best practices for measuring RNA-seq accuracy, reproducibility and robustness has initially been ad hoc with individual groups providing the initial steps. In 2008, Marioni and colleagues8 provided some of the earliest technical assessments of reproducibility for measuring gene expression levels by RNA-seq, reporting high reproducibility across technical replicates for a single RNA sample. Of the genes denoted significantly differentially expressed by microarray analysis, 81% were also differentially expressed with RNA-seq, with fold-change correlations between the two technologies similar or better than those reported in comparisons of different microarray platforms8. RNA-seq detected ~30% more differentially expressed genes than microarray analysis, with qRT-PCR confirmation for a subset, suggesting a large proportion may represent true positives and this difference may be a result of the broader dynamic range of RNA-seq and/or the ability to resolve splicing changes8. In 2013, the Genetic European Variation in Disease (GEUVADIS) consortium demonstrated the feasibility and reproducibility of performing RNA-seq across multiple laboratories, sequencing lymphoblastoid cell lines for 465 individuals, in seven sequencing centres using a single platform91. On the basis of this study, the consortium proposed a set of quality checks to assess technical biases in RNA-seq data, including differences in GC content, fragment size, transcript length and the percentage of reads mapped to annotated exons91.

Although providing necessary guidance on RNA-seq assay metrics, early assessments often used collection methods that may not reflect the conditions observed


Analytical validity: Accuracy and reliability of a test to measure a specific biomarker
Analytical sensitivity: How often is the test positive when the biomarker is present?
Analytical specificity: How often is the test negative when the biomarker is not present?
Robustness: Repeatability and reproducibility of the assay within and across laboratories.
Limits of detection: Lowest level of reliable detection of transcripts.
Stability: Collection, handling, transport of sample and impact on robustness.
Gold standards: Reference sets for assessing sensitivity and specificity.

Clinical validity: The accuracy of how well a test detects or predicts clinical diagnosis or outcome
Clinical sensitivity: How often is the test positive in patients with the disease or clinical outcome?
Clinical specificity: How often is the test negative in patients without the disease or clinical outcome?
Prevalence: The proportion of individuals that will have a disease or outcome.
Positive predictive value: Given prevalence, the probability that subjects with a positive test result for a disorder or outcome will have the disease or outcome.
Negative predictive value: For negative tests, the probability that subjects truly will not have the disease or outcome.
Penetrance: The proportion of subjects with the biomarker that have the predicted outcome or diagnosis.

Clinical utility: The likelihood the test is to inform clinical decisions and improve outcome
Appropriate intervention: Assessment of test impact on patient care, publishing of clinical trials.
Quality assurance: Quality control measures for tests, reagents and/or facilities.
Monitoring: Long-term monitoring of patients and establishment of guidelines for performance.
Economics: Financial costs and economic benefits associated with test.
Education: Educational materials and informed consent requirements.
ELSI: Assessment of ethical, legal and societal implications that arise in the context of the test.

Figure 2 | Criteria for clinical test development and adoption. Before initial clinical introduction, a clinical test must demonstrate analytical validity, showing sufficient assay performance to produce accurate and reproducible technical results. Demonstration of analytical validity involves several measures, including sensitivity (true technical positives), specificity (true technical negatives), robustness and limits of detection. Clinical validity follows analytical validity and, depending on the approval path, demonstration of clinical validity can come before (US Food and Drug Administration (FDA) in vitro diagnostic device) or after (Clinical Laboratory Improvement Amendments (CLIA) laboratory-developed test) test clearance or approval. Clinical validity refers to the concordance between the test result and the clinical diagnosis or outcome and involves measures of sensitivity (true clinical positives) and specificity (true clinical negatives), as well as determination of positive and negative predictive values. Demonstration of both analytical validity and clinical validity occurs before that of clinical utility. Clinical utility requires clinical evidence that use of the test has an impact on patient care and includes evaluation of patient outcomes and the economic benefits associated with the test. ELSI, National Human Genome Research Institute's Ethical, Legal and Social Implications Research Program.
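The clinical validity measures listed in Figure 2 can be made concrete with a short calculation. The Python sketch below is illustrative and not part of the Review: the function names and the example counts are hypothetical, and the predictive values are derived from sensitivity, specificity and an assumed prevalence using Bayes' rule.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly positive cases that the test calls positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly negative cases that the test calls negative."""
    return tn / (tn + fp)


def predictive_values(sens: float, spec: float, prevalence: float) -> tuple:
    """Return (PPV, NPV) for a given sensitivity, specificity and prevalence.

    PPV = P(disease | positive test); NPV = P(no disease | negative test).
    A value is reported as NaN when its denominator is zero.
    """
    for name, value in (("sens", sens), ("spec", spec), ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    true_neg = spec * (1.0 - prevalence)
    false_neg = (1.0 - sens) * prevalence
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else math.nan
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else math.nan
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical validation counts, chosen only for illustration.
    sens = sensitivity(tp=90, fn=10)
    spec = specificity(tn=950, fp=50)
    for prevalence in (0.01, 0.2):
        ppv, npv = predictive_values(sens, spec, prevalence)
        print(f"prevalence={prevalence}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, the positive predictive value rises as prevalence rises, which is why Figure 2 defines it as dependent on prevalence.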

within the clinic. For example, within oncology, most studies of transcriptome analysis by RNA-seq come from fresh-frozen tumour samples that are stringently collected in terms of cellularity, tumour necrosis and RNA quality, whereas most pathological samples are collected through formalin fixation to preserve the protein and cellular structure. As RNA-seq libraries are typically prepared from total RNA using polyadenylation (poly(A)) enrichment of mRNAs, this method does not adequately capture partially degraded mRNAs, as are found in FFPE samples. Given the clear challenge of low-quality RNA from clinical FFPE samples, concerted effort has focused on optimization and evaluation of protocol modifications, including rRNA depletion (ribo-depletion) protocols to remove rRNA without poly(A) enrichment92 and the use of capture sequencing, such as with oligonucleotide probe hybridization93. The utility of cDNA-Capture sequencing (exome capture and RNA-seq) was demonstrated for differential gene expression analysis from FFPE samples94. In addition to differential expression, this capture protocol was recently applied in the Michigan Oncology Sequencing Center (MI-ONCOSEQ) clinical sequencing programme, which demonstrated that the use of capture libraries improved performance for fusion and splice junction detection, compared to typical poly(A)-enriched RNA-seq using low-quality RNA (FFPE) samples95. Analytical tools are also being developed and evaluated for their effect on test performance. In a comparison study of several common software tools (Cufflinks–Cuffdiff2, DESeq and edgeR) used to analyse differential expression by RNA-seq, using qRT-PCR and microarray results as benchmark standards, Zhang and colleagues96 found that each tool had different strengths and recommended an ensemble-based approach combining two or more tools to reduce false positives.

Although these early efforts advanced the field and indeed proposed reference standards and best practices, the most substantial large-scale efforts have mainly emerged in the past few years and include consortium members spanning public regulatory bodies (such as the FDA, NIST and the Centers for Disease Control and Prevention), academic groups and industry. One of the first sets of reference standards for RNA-seq was the set of synthetic RNA spike-in controls developed by the External RNA Controls Consortium (ERCC)97. The ERCC RNA spike-in controls contain 92 polyadenylated transcripts pre-formulated into two sets (Mix1 and Mix2), each with the full complement of transcripts spanning an approximately 10^6-fold concentration range, present at defined Mix1/Mix2 molar ratios in four subgroups. These stock solutions can be diluted and added to each RNA sample, thus providing a post hoc measurement of assay performance from a set


of synthetic transcripts98. Synthetic gene-fusion spike-ins, composed of polyadenylated RNA transcripts corresponding to known oncogenic gene fusions, are also available with corresponding RNA-seq data99. In late 2014, the Sequencing Quality Control (SEQC) project (Phase III of the Microarray Quality Control (MAQC) Consortium)100 and the Association of Biomolecular Resource Facilities (ABRF)101 independently reported on large-scale efforts to define the performance characteristics of RNA-seq. Both studies represented remarkable coordinated analysis of inter-sample, cross-platform and inter-site variability for RNA-seq, compared to gold-standard methods such as microarrays and qRT-PCR, with extensive use of ERCC RNA spike-in controls. Both studies extensively examined how assay differences such as poly(A)-enrichment selection or random priming plus ribo-depletion affect reproducibility. Specifically, the FDA-led SEQC study examined the reproducibility and accuracy of RNA-seq in a multi-site study using different sequencing platforms. The SEQC study found that the correlation of relative gene expression measurements between different RNA-seq platforms (SOLiD, Life Technologies; HiSeq 2000, Illumina) and Affymetrix HuGene U133 Plus 2.0 microarrays with TaqMan qRT-PCR was high (Spearman and Pearson correlation coefficients of >0.9), although none of the platforms provided accurate absolute quantification of transcript levels, based on evaluation of ERCC spike-in control titration values98. Sensitivity, specificity and reproducibility of differential expression calls across sites were dependent on the analysis pipeline and the use of filters. Applying filters for P value, fold-change and expression level improved the false discovery rate for differential gene expression analysis, with most pipelines showing high reproducibility for differential expression calls across sites and greatest concordance for the most highly expressed genes. Consistent with previous reports, the SEQC study found RNA-seq to be fundamentally superior at detecting novel transcripts, such as those resulting from alternative splicing, validating over 80% by qRT-PCR100. The ABRF study examined intra- and inter-laboratory reproducibility at 15 different laboratories, comparing different library preparation methods, sample variables (RNA integrity, size-specific fractionation), analysis algorithms and sequencing platforms. The ABRF study reported an overall high concordance for normalized gene expression measures within platforms (Spearman correlation >0.86) and between platforms (Spearman correlation >0.83). Deficiencies in cross-platform detection were identified and associated with read-length, analysis approaches and technology differences101. Similar to previous reports, library preparation methods influenced transcript enrichment, with poly(A) libraries containing more exonic reads and ribo-depletion libraries containing more intronic reads. Notably, although differences were observed, differential gene expression results were comparable between poly(A) enrichment and ribo-depletion library preparation methods, as well as between degraded and non-degraded RNA samples101.

The importance of these studies goes well beyond the initial reporting, as companion papers helped to frame the specific areas where analytical validity can be substantially improved. Particularly relevant to clinical specimens, a concordance between RNA-seq and microarray was reported for 27 different chemical treatments, showing that differentially expressed genes correlated with the effect size of the treatment given and, whereas both platforms showed high concordance with qRT-PCR data for highly expressed genes, RNA-seq showed higher concordance than microarrays for genes with low expression102. Furthermore, multiple groups have examined the role of normalization methods in controlling biases resulting from GC content, sequencing coverage and insert size103,104.

Variant call format. Standard text file format for storing genomic sequence variant data, with each line of the file describing a variant present at a specific genomic region or position.

Binary alignment/map format (BAM format). Standard file format for storing sequencing reads with alignments. BAM files are binary representations of the sequence alignment/map (SAM) format.

Analysis paralysis and other bioinformatic challenges. The SEQC–MAQC100 and ABRF101 consortium papers identified numerous substantial bioinformatic challenges that must be addressed for RNA-seq to become broadly adopted into clinical laboratories. Recognizing that excellent bioinformatics reviews are available elsewhere105, here we attempt only to highlight the major overarching bioinformatics challenges. We find three large themes that frequently contribute to analysis paralysis during the development of bioinformatics solutions to RNA-seq: first is the lack of consensus by governing bodies advocating best practices and reference standards for validating RNA-seq pipelines; second is the overabundance of software tools, options and combinations thereof for RNA-seq analysis; and third are highly complex pipelines consisting of chaining together multiple tools that are independently developed, maintained and licensed.

First and probably most relevant, RNA-seq analysis has largely grown organically without extensive standards or dominating governing bodies. By comparison, standards were established early on for DNA-based NGS by the 1,000 Genomes Project, including variant call format (VCF), binary alignment/map format (BAM format) and genotype likelihoods, essentially providing the best practice approaches (REFS 106–111). Before 2014, reference standards, ERCC spike-in controls and the general MAQC were major contributors towards building a reproducible RNA-seq pipeline, but they pale in comparison to having a consensus germline-genomic reference standard, such as the NA12878 human reference genome112. As a consequence of this lack of standards, RNA-seq provided fertile grounds for the emergence of a multitude of software tools and other options, all competing for relevance as part of a broad range of RNA-seq solutions. RNA-seq pipelines represent the laboratory-specific wrapper scripts, chaining together collections of software tools with the goal of reporting hundreds to thousands of test results from gigabases of data. Unfortunately, having RNA-seq pipelines composed of several independently developed components, each with continual versioning and variable licenses, can be challenging in a clinical testing laboratory. Although excellent for agile software development in a non-regulated and rapidly changing field,

10 | ADVANCE ONLINE PUBLICATION www.nature.com/nrg



© 2016 Macmillan Publishers Limited. All rights reserved.

this type of fertile environment creates major challenges in a clinical laboratory. For example, whereas genome builds change every few years, transcript definitions often change quarterly.

Although the fundamental challenges of bioinformatics are unlikely to be easily solved, the framework for how they are managed has improved considerably. With the ERCC⁹⁷, SEQC/MAQC¹⁰⁰ and ABRF¹⁰¹ consortia providing initial models and reference standards, the emergence of bake-offs and best practices will be an essential next step in reducing some of the bioinformatics challenges of clinical RNA-seq. Moreover, the emergence of tools that allow containerization, such as Docker¹¹³, provides a platform for distributing fully contained pipelines such that a pipeline run in one facility could be reproducibly deployed in another laboratory. Some fully packaged pipelines have recently emerged with the expressed intent of being relevant for clinical applications¹¹⁴,¹¹⁵.

Regulatory considerations. Deployment of a clinical assay in the United States involves two paths, each with accompanying regulations administered under the US Department of Health and Human Services (HHS). The first path is through the CLIA of 1988, which allows laboratory-developed tests (LDTs). LDTs are in vitro diagnostic tests that are developed and used within a single approved laboratory and are not marketed towards any other laboratory. CLIA regulations monitor the laboratory process to ensure the accuracy, reliability and appropriateness of laboratory testing, from sample acquisition, handling and storage to the interpretation and reporting of test results. The guidelines for approving CLIA laboratories are established by accredited professional organizations, such as the College of American Pathologists, or by other agencies approved by the Centers for Medicare and Medicaid Services (CMS), such as in the state of New York. CLIA regulations of LDTs do not address the clinical validity or clinical utility of an assay, but instead provide a framework whereby clinical laboratories validate analytical performance measures of the LDTs within their own laboratory facility. The second set of regulations for clinical assay deployment is the Medical Device Amendments of 1976, which expanded FDA oversight for the marketing of in vitro diagnostic devices (IVDs). FDA premarket review of IVDs assures that the assay has established analytical and clinical validity; with the exception of companion diagnostics, the FDA does not typically require demonstration of clinical utility for clearance or approval of IVDs. Demonstration of clinical validity (for LDTs) and clinical utility (for LDTs and IVDs) can follow the initial clearance or approval of a diagnostic test; clinical utility, in particular, requires broader clinical evaluation across multiple sites and/or within clinical trials. For example, the Afirma (Veracyte) microarray-based gene expression classifier for thyroid nodule assessment was launched in 2011 as a CLIA-regulated LDT. Subsequent studies have reported on clinical validity¹¹⁶ and clinical utility¹¹⁷, the latter involving a multi-site study that demonstrated the effects of the Afirma test on clinical care recommendations, which resulted in a reduction of unnecessary surgeries¹¹⁷.

In vitro diagnostic tests
Laboratory tests used to detect health conditions, infections or diseases. These diagnostic tests are performed using a sample collected from the patient without direct physical interaction between the test and the patient. In the United States, in vitro diagnostics are regulated by the US Food and Drug Administration (FDA).

In the US, the distinction over when NGS assays are under regulatory oversight by the FDA or the CMS is emerging as an area of regulatory and legislative debate. In late 2014, the FDA proposed a regulatory framework for LDTs¹¹⁸,¹¹⁹ that will, in all likelihood, alter the regulatory landscape discussion for RNA-seq assays moving to the clinic. The FDA also provided a perspective on the mammoth shifts created by technological advances associated with NGS, and the requirement for the agency to change from the current general enforcement discretion, in which the FDA has generally not enforced regulations with respect to LDTs, to having a more active role, with proposed premarket review and quality system regulation requirements¹²⁰. Under this proposed LDT framework, the CMS (under CLIA) would oversee the laboratory operations and testing processes and the FDA would monitor compliance with quality system regulations.

The effects of expanding regulatory oversight by the FDA on RNA-seq are predicated around the FDA approval process for the first FDA-cleared NGS instrument and NGS in vitro diagnostic tests: the Illumina MiSeqDx and the associated in vitro diagnostic assays for cystic fibrosis, the Illumina MiSeqDx Cystic Fibrosis 139-Variant and Cystic Fibrosis Clinical Sequencing Assays. Accuracy was evaluated using a representative subset of variants, rather than evaluating all possible variants, and relied on publicly available data to support clinical relevance of the variants. Although evaluation of analytical performance may continue to involve this subset-based approach, the proposed new standards, as outlined by the FDA¹²⁰, could include defined technical metrics for data quality, additional standards for computational approaches and standard best practices for quality assurance. The debate over FDA oversight is largely focused on the presence or absence detection of DNA variants, such as germline cystic fibrosis transmembrane conductance regulator (CFTR) or BRCA2 testing. While the FDA guidance and debate are limited in their use of examples, the broad scope of additional regulation on all NGS-developed tests, including RNA-seq, may create regulatory uncertainty for RNA-seq and impede its adoption in the clinic. The proposed FDA regulations around NGS have not gone without debate, with concerns that the limited enforcement capabilities and regulatory guidance could unnecessarily stifle adoption and innovation¹²¹. International regulatory frameworks vary across jurisdictions¹²²,¹²³, with evolving practice guidelines and regulations for the clinical use of NGS¹¹⁸,¹¹⁹,¹²³,¹²⁴. For example, in the European Union (EU), IVD tests require a Conformité Européenne (CE) mark to indicate compliance with the EU IVD Directive (98/79/EC). Similar to the US, the EU is reviewing policy changes related to IVDs, with proposed changes to harmonize the IVD market and increase oversight, including the use of a risk-based classification scheme to define clinical evidence requirements, such as analytical and clinical performance, for IVD approval¹²⁵. The pending regulatory

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 11




changes, both in the US and internationally, may substantially impact the clinical utility of RNA, particularly until greater consensus is reached towards reference standards.

Integrating DNA and RNA sequencing
In clinical studies, integrative DNA and RNA analysis has provided additional evidence for dysregulation of mutated genes, as well as detection of gene fusions and splicing variants and, in some cases, helped prioritize variants for therapeutic intervention (BOX 2). In addition to associating specific genomic alterations with potential therapeutic response, exciting and emerging work suggests that integrated sequencing strategies may also aid in the identification of patient-specific immunogenic neoantigens expressed in the tumour. Recently, RNA-seq and exome sequencing were used to identify and filter for predicted immunogenic neoantigens that were expressed in melanoma tumours, demonstrating an association between these expressed neoantigens and response to the immune-checkpoint inhibitor ipilimumab¹⁷.

Integration of DNA sequencing and RNA-seq holds promise beyond oncology. For example, in the transplant field, reports are emerging for the utility of circulating DNA in monitoring for transplant rejection¹²⁹,¹³⁰; as discussed earlier, RNA-based measures are already used for the early detection of rejection in heart transplant recipients. The integration of RNA and DNA sequencing to improve transplant rejection diagnosis is an important area currently being examined. The combination of genotyping circulating DNA from donor and recipient and assessing changes in expression level may provide insights into the degree of rejection and the molecular mechanisms underlying rejection, and could suggest possible therapeutic strategies to keep the transplant viable. In the same way, integration of DNA and RNA sequencing could benefit fetal medicine. DNA sequencing of the fetus from maternal blood, in combination with changes in transcript expression levels, could provide additional accuracy and insight for assessing developmental complications.

Although challenges exist, the demonstrated utility of integrative sequencing strategies in research studies is growing across broad health applications and points to the promise for incorporation of RNA-seq into clinical medicine.

Conclusion and perspectives
With its unprecedented ability to simultaneously detect global gene transcript levels and diverse RNA species, RNA-seq has the potential to revolutionize clinical testing for a wide range of diseases. Although recent efforts have set the stage for the establishment of benchmark standards for technical and analytical best practices in order to better standardize RNA-seq accuracy, reproducibility and precision, additional steps toward test proficiency and validation will be required to expand the utility of RNA-seq.

Conceptually, RNA-seq is shot-gun sequencing of the transcriptome, lending to both potential utility and considerable hurdles towards translating RNA to the clinic. The dynamic range and expanding approaches for sample preparation and analysis allow for incredible fine-tuning of sensitivity, specificity and reproducibility. To some extent, the same flexibility and seemingly infinite set of options for RNA-seq that has spurred incredible discoveries into the dynamic nature of human disease has also hindered its path to the clinic. In particular, the establishment of standards has lagged until recently. Several paths forward exist and it is likely that many will be taken towards the direct use of RNA-seq in clinical applications. Once the discovery phase is complete, many diagnostic tests will become targeted assays, sensitive enough to detect small numbers of rare transcripts. The fixed nature of probe sets with microarrays or qRT-PCR offers an accelerated path for clinical test development, as the data are the data without the lure of the latest and newest analysis methods. Therefore, although RNA-seq as a platform has great promise, continuing studies are needed to demonstrate analytical validity and facilitate its adoption within the clinical laboratory setting.

Triple-negative breast cancer
A breast cancer subtype characterized as oestrogen receptor-negative, progesterone receptor-negative and ERBB2 (also known as HER2)-negative.

Box 2 | Selected examples of integrating DNA and RNA analysis in oncology

Recent clinical sequencing reports from various groups point to the value of incorporating RNA sequencing (RNA-seq) with DNA sequencing to evaluate the expression of mutant alleles, to detect both known and novel gene fusions, and to detect splice variants¹²⁷,¹³⁸,¹³⁹. For example, Mody and colleagues¹³⁹ recently reported results from the Pediatric Michigan Oncology Sequencing (MI-ONCOSEQ) programme, incorporating clinical exome-sequencing of tumour and germline DNA and transcriptome sequencing of tumour RNA into the management of children and young adults with refractory or relapsed cancer. The application of these integrative sequencing strategies resulted in changes to patient management in 46% (42 out of 91) of cases, changes to therapy in 15% (14 out of 91) and partial or complete clinical remissions in 10% (9 out of 91), including cases in which potentially actionable events (mainly gene fusions) were detected by RNA-seq, but absent in DNA sequencing¹³⁹. In one reported case, RNA-seq identified a cryptic ETV6–ABL1 fusion (involving ets variant 6 (ETV6) and ABL1), which was not detected by standard cytogenetics or fluorescence in situ hybridization (FISH), in a patient with precursor B cell acute lymphoblastic leukaemia; the patient maintained molecular remission following treatment with the ABL1 inhibitor imatinib. Using RNA-seq and whole-genome sequencing, Andersson and colleagues¹²⁷ reported fusion detection in infant mixed lineage leukaemia (MLL)-rearranged acute lymphoblastic leukaemia. In addition, they identified frequent activating mutations in tyrosine kinase–PI3K (phosphatidylinositol-4,5-bisphosphate 3-kinase)–RAS signalling pathway genes, at low DNA allele frequencies, which is suggestive of clonal populations; however, RNA-seq data demonstrated expression of the mutant allele in all cases¹²⁷.

The value of integrating DNA and RNA analysis is also evident in our own clinical research sequencing experience. In one example, whole-genome sequencing and RNA-seq were used to detect a highly expressed CTLA4–CD28 fusion (involving cytotoxic T-lymphocyte associated protein 4 (CTLA4) and CD28 molecule (CD28)) in a patient with advanced Sézary syndrome, with rapid clinical response noted following treatment with ipilimumab¹⁴⁰, a monoclonal antibody targeting CTLA4. Integrated analysis of DNA sequencing and RNA-seq data in triple-negative breast cancer samples also revealed the consequence of a splice site alteration in the tumour suppressor retinoblastoma 1 (RB1), providing transcript evidence for an in-frame exon skipping event that was suggested to result in RB1 inactivation, indicating a lack of benefit from CDK4/CDK6 inhibitors¹⁴¹. In another study¹⁴², integrated whole-genome and whole-transcriptome analysis of cholangiocarcinoma tumours revealed fibroblast growth factor receptor 2 (FGFR2) fusions in three of six cases, two of which received FGFR-targeted therapy with evidence of clinical response. In an additional case, preferential allele-specific expression of a loss-of-function mutation in ERBB receptor feedback inhibitor 1 (ERRFI1), a negative regulator of EGFR, was detected and the patient went on to experience marked disease regression following treatment with an EGFR inhibitor¹⁴². Together, these selected oncology examples illustrate the potential clinical value for integrating DNA- and RNA-based measures.
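The Box 2 observations that a mutation present at a low DNA allele frequency can still be expressed, or preferentially expressed, in RNA can be made concrete with a small calculation. The Python sketch below computes the variant allele fraction at one position from reference and alternate read counts, then applies an exact binomial test of whether the RNA fraction departs from the DNA fraction. All read counts are invented and the function names are hypothetical; a real allele-specific expression analysis would also need to address issues such as read-mapping bias and overdispersion, which this sketch ignores.

```python
from math import comb


def allele_fraction(ref_reads, alt_reads):
    """Fraction of reads at a position that carry the alternate allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


def binomial_two_sided_p(alt_reads, total_reads, expected_fraction):
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than the
    observed alternate-read count under the expected fraction.
    """
    probs = [
        comb(total_reads, k)
        * expected_fraction ** k
        * (1 - expected_fraction) ** (total_reads - k)
        for k in range(total_reads + 1)
    ]
    observed = probs[alt_reads]
    # Small relative tolerance so ties are not lost to floating-point error.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


# Invented counts for one hypothetical variant position.
dna_ref, dna_alt = 90, 10   # tumour DNA: low allele frequency
rna_ref, rna_alt = 30, 70   # tumour RNA: mutant allele dominates

dna_vaf = allele_fraction(dna_ref, dna_alt)
rna_vaf = allele_fraction(rna_ref, rna_alt)
p_value = binomial_two_sided_p(rna_alt, rna_ref + rna_alt, dna_vaf)

print(f"DNA allele fraction: {dna_vaf:.2f}")
print(f"RNA allele fraction: {rna_vaf:.2f}")
print(f"p-value for RNA vs DNA fraction: {p_value:.3g}")
```

Using the DNA allele fraction as the expected value is a simplifying design choice: it treats the DNA measurement as exact, which is reasonable only as a first approximation.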


1. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).
2. Doebele, R. C. et al. An oncogenic NTRK fusion in a soft tissue sarcoma patient with response to the tropomyosin-related kinase (TRK) inhibitor LOXO-101. Cancer Discov. 5, 1049–1057 (2015).
3. Sonu, R. J., Jonas, B. A., Dwyre, D. M., Gregg, J. P. & Rashidi, H. H. Optimal molecular methods in detecting p190 (BCR-ABL) fusion variants in hematologic malignancies: a case report and review of the literature. Case Rep. Hematol. 2015, 458052 (2015).
4. Cech, T. R. & Steitz, J. A. The noncoding RNA revolution-trashing old rules to forge new ones. Cell 157, 77–94 (2014).
5. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63 (2009).
6. Wilhelm, B. T. et al. Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453, 1239–1243 (2008).
   This is one of the earliest applications of RNA-seq indicating diagnostic potential for detecting and quantifying various RNA species including mRNA, alternative transcripts and non-coding RNA.
7. Velculescu, V. E., Zhang, L., Vogelstein, B. & Kinzler, K. W. Serial analysis of gene expression. Science 270, 484–487 (1995).
   This paper provides an early description of multiplexed RNA detection that helped set the stage for use of microarray and spotted arrays within diagnostics.
8. Marioni, J. C., Mason, C. E., Mane, S. M., Stephens, M. & Gilad, Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 18, 1509–1517 (2008).
9. Sultan, M. et al. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321, 956–960 (2008).
10. Senkus, E. et al. Primary breast cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann. Oncol. 26, v8–v30 (2015).
11. Coates, A. S. et al. Tailoring therapies-improving the management of early breast cancer: St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2015. Ann. Oncol. 26, 1533–1546 (2015).
12. Sparano, J. A. et al. Prospective validation of a 21-gene expression assay in breast cancer. N. Engl. J. Med. 373, 2005–2014 (2015).
   This paper helps establish clinical validity for a prognostic signature for endocrine therapy alone for patients with hormone-receptor-positive, ERBB2-negative, axillary node-negative breast cancer.
13. Fumagalli, D. et al. Transfer of clinically relevant gene expression signatures in breast cancer: from Affymetrix microarray to Illumina RNA-Sequencing technology. BMC Genomics 15, 1008 (2014).
14. Zhang, W. et al. Comparison of RNA-seq and microarray-based models for clinical endpoint prediction. Genome Biol. 16, 133 (2015).
15. Deng, M. C. et al. Noninvasive discrimination of rejection in cardiac allograft recipients using gene expression profiling. Am. J. Transplant. 6, 150–160 (2006).
16. Starling, R. C. et al. Molecular testing in the management of cardiac transplant recipients: initial clinical experience. J. Heart Lung Transplant. 25, 1389–1395 (2006).
17. Van Allen, E. M. et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science 350, 207–211 (2015).
18. Vardiman, J. W. et al. The 2008 revision of the World Health Organization (WHO) classification of myeloid neoplasms and acute leukemia: rationale and important changes. Blood 114, 937–951 (2009).
19. Font-Tello, A. et al. Association of ERG and TMPRSS2-ERG with grade, stage, and prognosis of prostate cancer is dependent on their expression levels. Prostate 75, 1216–1226 (2015).
20. Druker, B. J. et al. Efficacy and safety of a specific inhibitor of the BCR-ABL tyrosine kinase in chronic myeloid leukemia. N. Engl. J. Med. 344, 1031–1037 (2001).
21. O'Brien, S. et al. Chronic Myelogenous Leukemia, version 1.2014. J. Natl Compr. Canc. Netw. 11, 1327–1340 (2013).
22. Cavelier, L. et al. Clonal distribution of BCR-ABL1 mutations and splice isoforms by single-molecule long-read RNA sequencing. BMC Cancer 15, 45 (2015).
23. Leighl, N. B. et al. Molecular testing for selection of patients with lung cancer for epidermal growth factor receptor and anaplastic lymphoma kinase tyrosine kinase inhibitors: American Society of Clinical Oncology endorsement of the College of American Pathologists/International Association for the study of lung cancer/association for molecular pathology guideline. J. Clin. Oncol. 32, 3673–3679 (2014).
24. Wynes, M. W. et al. An international interpretation study using the ALK IHC antibody D5F3 and a sensitive detection kit demonstrates high concordance between ALK IHC and ALK FISH and between evaluators. J. Thorac. Oncol. 9, 631–638 (2014).
25. Lindeman, N. I. et al. Molecular testing guideline for selection of lung cancer patients for EGFR and ALK tyrosine kinase inhibitors: guideline from the College of American Pathologists, International Association for the Study of Lung Cancer, and Association for Molecular Pathology. Arch. Pathol. Lab. Med. 137, 828–860 (2013).
26. Wang, Y., Wu, N., Liu, J., Wu, Z. & Dong, D. FusionCancer: a database of cancer fusion genes derived from RNA-seq data. Diagn. Pathol. 10, 131 (2015).
27. Magri, F. et al. Clinical and molecular characterization of a cohort of patients with novel nucleotide alterations of the Dystrophin gene detected by direct sequencing. BMC Med. Genet. 12, 37 (2011).
28. Liu, F. & Gong, C. X. Tau exon 10 alternative splicing and tauopathies. Mol. Neurodegener. 3, 8 (2008).
29. La Cognata, V., D'Agata, V., Cavalcanti, F. & Cavallaro, S. Splicing: is there an alternative contribution to Parkinson's disease? Neurogenetics 16, 245–263 (2015).
30. Chen, J. & Weiss, W. A. Alternative splicing in cancer: implications for biology and therapy. Oncogene 34, 1–14 (2015).
31. Dehm, S. M., Schmidt, L. J., Heemers, H. V., Vessella, R. L. & Tindall, D. J. Splicing of a novel androgen receptor exon generates a constitutively active androgen receptor that mediates prostate cancer therapy resistance. Cancer Res. 68, 5469–5477 (2008).
32. Antonarakis, E. S. et al. AR-V7 and resistance to enzalutamide and abiraterone in prostate cancer. N. Engl. J. Med. 371, 1028–1038 (2014).
33. Sugawa, N., Ekstrand, A. J., James, C. D. & Collins, V. P. Identical splicing of aberrant epidermal growth factor receptor transcripts from amplified rearranged genes in human glioblastomas. Proc. Natl Acad. Sci. USA 87, 8602–8606 (1990).
34. Reardon, D. A. et al. 107 ReACT: overall survival from a randomized Phase II study of rindopepimut (CDX-110) plus Bevacizumab in relapsed glioblastoma. Neurosurgery 62 (Suppl. 1), 198–199 (2015).
35. Gambino, G., Tancredi, M., Falaschi, E., Aretini, P. & Caligo, M. A. Characterization of three alternative transcripts of the BRCA1 gene in patients with breast cancer and a family history of breast and/or ovarian cancer who tested negative for pathogenic mutations. Int. J. Mol. Med. 35, 950–956 (2015).
36. Massah, S., Beischlag, T. V. & Prefontaine, G. G. Epigenetic events regulating monoallelic gene expression. Crit. Rev. Biochem. Mol. Biol. 50, 337–358 (2015).
37. Valle, L. et al. Germline allele-specific expression of TGFBR1 confers an increased risk of colorectal cancer. Science 321, 1361–1365 (2008).
38. de la Chapelle, A. Genetic predisposition to human disease: allele-specific expression and low-penetrance regulatory loci. Oncogene 28, 3345–3348 (2009).
39. Lossie, A. C. et al. Distinct phenotypes distinguish the molecular classes of Angelman syndrome. J. Med. Genet. 38, 834–845 (2001).
40. Li, G. et al. Identification of allele-specific alternative mRNA processing via transcriptome sequencing. Nucleic Acids Res. 40, e104 (2012).
41. Klimpe, S. et al. Evaluating the effect of spastin splice mutations by quantitative allele-specific expression assay. Eur. J. Neurol. 18, 99–105 (2011).
42. Wood, D. L. A. et al. Recommendations for accurate resolution of gene and isoform allele-specific expression in RNA-seq data. PLoS ONE 10, e0126911–e0126927 (2015).
43. Rivas, M. A. et al. Human genomics. Effect of predicted protein-truncating genetic variants on the human transcriptome. Science 348, 666–669 (2015).
44. Larson, N. B. et al. Comprehensively evaluating cis-regulatory variation in the human prostate transcriptome by using gene-level allele-specific expression. Am. J. Hum. Genet. 96, 869–882 (2015).
45. Pickrell, J. K. et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464, 768–772 (2010).
46. Szelinger, S. et al. Characterization of X chromosome inactivation using integrated analysis of whole-exome and mRNA sequencing. PLoS ONE 9, e113036 (2014).
47. Degner, J. F. et al. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25, 3207–3212 (2009).
48. Castel, S. E., Levy-Moonshine, A., Mohammadi, P., Banks, E. & Lappalainen, T. Tools and best practices for data processing in allelic expression analysis. Genome Biol. 16, 195 (2015).
49. Valadi, H. et al. Exosome-mediated transfer of mRNAs and microRNAs is a novel mechanism of genetic exchange between cells. Nat. Cell Biol. 9, 654–659 (2007).
50. Arroyo, J. D. et al. Argonaute2 complexes carry a population of circulating microRNAs independent of vesicles in human plasma. Proc. Natl Acad. Sci. USA 108, 5003–5008 (2011).
51. Michael, A. et al. Exosomes from human saliva as a source of microRNA biomarkers. Oral Dis. 16, 34–38 (2010).
52. Skog, J. et al. Glioblastoma microvesicles transport RNA and proteins that promote tumour growth and provide diagnostic biomarkers. Nat. Cell Biol. 10, 1470–1476 (2008).
53. Koh, W. et al. Noninvasive in vivo monitoring of tissue-specific global gene expression in humans. Proc. Natl Acad. Sci. USA 111, 7361–7366 (2014).
54. Goetzl, E. J. et al. Altered lysosomal proteins in neural-derived plasma exosomes in preclinical Alzheimer disease. Neurology 85, 40–47 (2015).
55. Shi, M. et al. Plasma exosomal α-synuclein is likely CNS-derived and increased in Parkinson's disease. Acta Neuropathol. 128, 639–650 (2014).
56. Brock, G., Castellanos-Rizaldos, E., Hu, L., Coticchia, C. & Skog, J. Liquid biopsy for cancer screening, patient stratification and monitoring. Translat. Cancer Res. 4, 280–290 (2015).
57. Tsui, N. B. et al. Maternal plasma RNA sequencing for genome-wide transcriptomic profiling and identification of pregnancy-associated transcripts. Clin. Chem. 60, 954–962 (2014).
58. Ainsztein, A. M. et al. The NIH Extracellular RNA Communication Consortium. J. Extracell. Vesicles 4, 27493 (2015).
59. Quinn, J. F. et al. Extracellular RNAs: development as biomarkers of human disease. J. Extracell. Vesicles 4, 27495 (2015).
60. Laurent, L. C. et al. Meeting report: discussions and preliminary findings on extracellular RNA measurement methods from laboratories in the NIH Extracellular RNA Communication Consortium. J. Extracell. Vesicles 4, 26533 (2015).
61. Mestdagh, P. et al. Evaluation of quantitative miRNA expression platforms in the microRNA quality control (miRQC) study. Nat. Methods 11, 809–815 (2014).
   This paper provides a critical assessment of current platform performance in measuring miRNA expression.
62. Kelly, H. et al. Cross Platform standardisation of an experimental pipeline for use in the identification of dysregulated human circulating miRNAs. PLoS ONE 10, e0137389 (2015).
63. Kozomara, A. & Griffiths-Jones, S. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 42, D68–73 (2014).
64. Chan, P. P. & Lowe, T. M. GtRNAdb: a database of transfer RNA genes detected in genomic sequence. Nucleic Acids Res. 37, D93–D97 (2009).
65. Salzman, J., Gawad, C., Wang, P. L., Lacayo, N. & Brown, P. O. Circular RNAs are the predominant transcript isoform from hundreds of human genes in diverse cell types. PLoS ONE 7, e30733 (2012).
66. Memczak, S., Papavasileiou, P., Peters, O. & Rajewsky, N. Identification and characterization of circular RNAs as a new class of putative biomarkers in human blood. PLoS ONE 10, e0141214 (2015).
67. Goodarzi, H. et al. Endogenous tRNA-derived fragments suppress breast cancer progression via YBX1 displacement. Cell 161, 790–802 (2015).
   Researchers identified a novel regulatory role for fragments of tRNA; further research may find that other RNA fragments can have novel functions.


68. Sutter, D. E. et al. Performance of five FDA-approved rapid antigen tests in the detection of 2009 H1N1 influenza A virus. J. Med. Virol. 84, 1699–1702 (2012).
69. Santiago, G. A. et al. Analytical and clinical performance of the CDC real time RT-PCR assay for detection and typing of dengue virus. PLoS Negl. Trop. Dis. 7, e2311 (2013).
70. Styer, L. M., Miller, T. T. & Parker, M. M. Validation and clinical use of a sensitive HIV-2 viral load assay that uses a whole virus internal control. J. Clin. Virol. 58 (Suppl. 1), e127–e133 (2013).
71. Pinsky, B. A. et al. Analytical performance characteristics of the Cepheid GeneXpert Ebola Assay for the detection of Ebola virus. PLoS ONE 10, e0142216 (2015).
72. Fischer, N. et al. Evaluation of unbiased next-generation sequencing of RNA (RNA-seq) as a diagnostic method in influenza virus-positive respiratory samples. J. Clin. Microbiol. 53, 2238–2250 (2015).
   This study describes the use of an unbiased RNA-seq application for positive detection and characterization of influenza in respiratory samples, paving the way for this tool as a diagnostic methodology.
73. Gire, S. K. et al. Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak. Science 345, 1369–1372 (2014).
   This paper describes the use of RNA-seq for viral detection and measurement for epidemic emergence and tracking.
74. Gregori, J. et al. Ultra-deep pyrosequencing (UDPS) data treatment to study amplicon HCV minor variants. PLoS ONE 8, e83361 (2013).
75. Ekici, H. et al. Cost-efficient HIV-1 drug resistance surveillance using multiplexed high-throughput amplicon sequencing: implications for use in low- and middle-income countries. J. Antimicrob. Chemother. 69, 3349–3355 (2014).
76. Wang, K. et al. The complex exogenous RNA spectra in human plasma: an interface with human gut biota? PLoS ONE 7, e51009 (2012).
77. Beatty, M. et al. Small RNAs from plants, bacteria and fungi within the order Hypocreales are ubiquitous in human plasma. BMC Genomics 15, 933 (2014).
   The authors aptly analyse the small RNA component of human plasma that relates to the microbiome. Although small in sample size, this study helps to pave the way to more robust circulating small RNA studies combining both host and microbial RNA-omes.
89. Barry, S. E. et al. Identification of miR-93 as a suitable miR for normalizing miRNA in plasma of tuberculosis patients. J. Cell. Mol. Med. 19, 1606–1613 (2015).
90. Haddow, J. E. & Palomaki, G. E. in Human genome epidemiology: a scientific foundation for using genetic information to improve health and prevent disease. (eds Khoury, M. J., Little, J. & Burke, W.) 217–233 (Oxford Univ. Press, 2004).
91. 't Hoen, P. A. et al. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories. Nat. Biotechnol. 31, 1015–1022 (2013).
92. Zhao, W. et al. Comparison of RNA-Seq by poly (A) capture, ribosomal RNA depletion, and DNA microarray for expression profiling. BMC Genomics 15, 419 (2014).
93. Mercer, T. R. et al. Targeted sequencing for gene discovery and quantification using RNA CaptureSeq. Nat. Protoc. 9, 989–1009 (2014).
94. Cabanski, C. R. et al. cDNA hybrid capture improves transcriptome analysis on low-input and archived samples. J. Mol. Diagn. 16, 440–451 (2014).
95. Cieslik, M. et al. The use of exome capture RNA-seq for highly degraded RNA with application to clinical cancer sequencing. Genome Res. 25, 1372–1381 (2015).
   This paper describes the application of exome-capture transcriptome sequencing to FFPE samples in the clinical setting.
96. Zhang, Z. H. et al. A Comparative study of techniques for differential expression analysis on RNA-seq data. PLoS ONE 9, e103207–e103211 (2014).
97. Munro, S. A. et al. Assessing technical performance in differential gene expression experiments with external spike-in RNA control ratio mixtures. Nat. Commun. 5, 5125 (2014).
98. Jiang, L. et al. Synthetic spike-in standards for RNA-seq experiments. Genome Res. 21, 1543–1551 (2011).
   This article discusses the development of widely adopted synthetic RNA spike-ins providing sensitivity, accuracy and biases in RNA-seq experiments; this work also provides an early characterization of the limits for discovery of rare transcripts in RNA-seq.
99. Tembe, W. D. et al. Open-access synthetic spike-in mRNA-seq data for cancer gene fusions. BMC Genomics 15, 19 (2014).
   Publicly available RNA-seq data using synthetic spike-ins of known cancer gene fusions.
100. Su, Z. et al. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat. Biotechnol. 32, 903–914 (2014).
   This study reports results from the SEQC project,
107. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
108. Handsaker, R. E., Korn, J. M., Nemesh, J. & McCarroll, S. A. Discovery and genotyping of genome structural polymorphism by sequencing on a population scale. Nat. Genet. 43, 269–276 (2011).
109. Rehm, H. L. et al. ACMG clinical laboratory standards for next-generation sequencing. Genet. Med. 15, 733–747 (2013).
110. Wang, Y., Lu, J., Yu, J., Gibbs, R. A. & Yu, F. An integrative variant analysis pipeline for accurate genotype/haplotype inference in population NGS data. Genome Res. 23, 833–842 (2013).
111. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
112. Zook, J. M. et al. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls. Nat. Biotechnol. 32, 246–251 (2014).
113. Di Tommaso, P. et al. The impact of Docker containers on the performance of genomic pipelines. PeerJ 3, e1273 (2015).
114. Kalari, K. R. et al. MAP-RSeq: Mayo Analysis Pipeline for RNA sequencing. BMC Bioinformatics 15, 224 (2014).
115. Nasser, S. et al. An integrated framework for reporting clinically relevant biomarkers from paired tumor/normal genomic and transcriptomic sequencing data in support of clinical trials in personalized medicine. Pac. Symp. Biocomput. 56–67 (2015).
116. Alexander, E. K. et al. Preoperative diagnosis of benign thyroid nodules with indeterminate cytology. N. Engl. J. Med. 367, 705–715 (2012).
117. Alexander, E. K. et al. Multicenter clinical experience with the Afirma gene expression classifier. J. Clin. Endocrinol. Metab. 99, 119–125 (2014).
118. US Department of Health & Human Services. Center for Devices and Radiological Health. FDA notification and medical device reporting for laboratory developed tests (LDTs) draft guidance. [online] http://www.fda.gov/downloads/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/UCM416684.pdf (2014).
119. US Department of Health & Human Services. Center for Devices and Radiological Health. Framework for regulatory oversight of laboratory developed tests (LDTs) draft guidance. [online] http://www.fda.gov/downloads/medicaldevices/deviceregulationandguidance/guidancedocuments/ucm416685.pdf (2014).
120. US Department of Health & Human Services.
78. Ghosal,A. etal. The extracellular RNA complement of evaluating RNA-seq performance assessments such Optimizing FDA s regulatory oversight of next
Escherichia coli. Microbiologyopen http://dx.doi.org/ as reproducibility and accuracy across sequencing generation sequencing diagnostic tests preliminary
10.1002/mbo3.235 (2015). platforms and collaborative sites. discussion paper. [online] http://www.fda.gov/
79. Chen,S.J. etal. Characterization of EpsteinBarr 101. Li,S. etal. Multi-platform assessment of downloads/medicaldevices/newsevents/
virus miRNAome in nasopharyngeal carcinoma by transcriptome profiling using RNA-seq in the ABRF workshopsconferences/ucm427869.pdf (2014).
deep sequencing. PLoS ONE 5, e12745 (2010). next-generation sequencing study. Nat. Biotechnol. 121. Evans,B.J., Burke,W. & Jarvik,G.P. The FDA
80. Meshesha,M.K. etal. The microRNA transcriptome 32, 915925 (2014). andgenomic testsgetting regulation right.
of human cytomegalovirus (HCMV). Open Virol. J. 6, This study reports results from the ABRF-NGS N.Engl.J.Med. 372, 22582264 (2015).
3848 (2012). study assessing the influence of experimental This commentary discusses FDA oversight of
81. Cucher,M. etal. High-throughput characterization protocols & RNA conditions across sequencing genome-scale tests, including NGS, framing the
ofEchinococcus spp. metacestode miRNomes. platforms and collaborative sites. discussion around the concepts of analytical and
Int.J.Parasitol. 45, 253267 (2015). 102. Wang,C. etal. The concordance between RNA-seq clinical validity.
82. Fagnocchi,L. etal. Global transcriptome analysis and microarray data depends on chemical treatment 122. Tazawa,Y. Perspective for the development of
reveals small RNAs affecting Neisseria meningitidis and transcript abundance. Nat. Biotechnol. 32, companion diagnostics and regulatory landscape
bacteremia. PLoS ONE 10, e0126325 (2015). 926932 (2014). toencourage personalized medicine in Japan.
83. Latorre,I. etal. A novel whole-blood miRNA signature This study reports results from the SEQC project, BreastCancer 23,1923 (2015).
for a rapid diagnosis of pulmonary tuberculosis. evaluating the influence of chemical treatment on 123. Pignatti,F. etal. Cancer drug development and the
Eur.Respir. J. 45, 11731176 (2015). RNA-seq performance. evolving regulatory framework for companion
84. Ren,N. etal. MicroRNA signatures from 103. Li,S. etal. Detecting and correcting systematic diagnostics in the European union. Clin. Cancer Res.
multidrugresistant Mycobacterium tuberculosis. variation in large-scale RNA sequencing data. 20, 14581468 (2014).
Mol.Med. Rep. 12, 65616567 (2015). Nat.Biotechnol. 32, 888895 (2014). 124. Matthijs,G. etal. Guidelines for diagnostic next-
85. Lie,A.K. & Kristensen,G. Human papillomavirus References 100103 describe large-scale efforts generation sequencing. Eur. J.Hum. Genet. 24, 25
E6/E7 mRNA testing as a predictive marker for from the SEQC and ABRF consortia towards (2016).
cervical carcinoma. Expert Rev. Mol. Diagn. 8, establishing standards for RNA-seq. This study 125. European Commission. Proposal for a regulation of
405415 (2008). evaluated systematic biases across platforms theEuropean Parliament and of the council on invitro
86. Kanwal,F., Kramer,J.R., Ilyas,J., Duan,Z. & andlaboratory sites, and assessed data analysis diagnostic medical devices. [online] http://ec.europa.eu/
ElSerag,H.B. HCV genotype 3 is associated with an approaches for data normalization and bias health/medical-devices/files/revision_docs/
increased risk of cirrhosis and hepatocellular cancer in correction. proposal_2012_541_en.pdf (2012).
a national sample of U. S.Veterans with HCV. Hepatol. 104. Risso,D., Ngai,J., Speed,T.P. & Dudoit,S. 126. Rashid,N.U. etal. Differential and limited expression
60, 98105 (2014). Normalization of RNA-seq data using factor analysis of mutant alleles in multiple myeloma. Blood 124,
87. Mitchell,A.M. etal. Transmitted/founder hepatitis C ofcontrol genes or samples. Nat. Biotechnol. 32, 31103117 (2014).
viruses induce cell-type- and genotype-specific 896902 (2014). 127. Andersson,A.K. etal. The landscape of somatic
differences in innate signaling within the liver. MBio 6, 105. Garber,M., Grabherr,M.G., Guttman,M. & mutations in infant MLL-rearranged acute
e02510 (2015). Trapnell,C. Computational methods for transcriptome lymphoblastic leukemias. Nat. Genet. 47, 330337
88. Wu,L.S. etal. Systematic expression profiling analysis annotation and quantification using RNA-seq. (2015).
identifies specific microRNA-gene interactions that Nat.Methods 8, 469477 (2011). 128. Chandrasekharappa,S.C. etal. Massively parallel
may differentiate between active and latent 106. Genomes Project,C. etal. A map of human genome sequencing, aCGH, and RNA-Seq technologies provide
tuberculosis infection. Biomed. Res. Int. 2014, variation from population-scale sequencing. Nature a comprehensive molecular diagnosis of Fanconi
895179 (2014). 467, 10611073 (2010). anemia. Blood 121, e138e148 (2013).

14 | ADVANCE ONLINE PUBLICATION www.nature.com/nrg



© 2016 Macmillan Publishers Limited. All rights reserved.

129. De Vlaminck, I. et al. Circulating cell-free DNA enables noninvasive diagnosis of heart transplant rejection. Sci. Transl Med. 6, 241ra77 (2014).
130. De Vlaminck, I. et al. Noninvasive monitoring of infection and rejection after lung transplantation. Proc. Natl Acad. Sci. USA 112, 13336–13341 (2015).
131. Heid, C. A., Stevens, J., Livak, K. J. & Williams, P. M. Real time quantitative PCR. Genome Res. 6, 986–994 (1996).
132. Wang, W. et al. Design of multiplexed detection assays for identification of avian influenza A virus subtypes pathogenic to humans by SmartCycler real-time reverse transcription-PCR. J. Clin. Microbiol. 47, 86–92 (2009).
133. Lockhart, D. J. et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat. Biotechnol. 14, 1675–1680 (1996).
134. Mook, S., Van't Veer, L. J., Rutgers, E. J., Piccart-Gebhart, M. J. & Cardoso, F. Individualization of therapy using Mammaprint: from development to the MINDACT Trial. Cancer Genom. Proteom. 4, 147–155 (2007).
This paper describes the development and clinical validation trial for the Mammaprint gene expression based breast cancer recurrence test.
135. Paik, S. et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N. Engl. J. Med. 351, 2817–2826 (2004).
This paper describes the development and clinical validation trial for the OncotypeDX gene expression based breast cancer recurrence test.
136. Cooperberg, M. R. et al. Validation of a cell-cycle progression gene panel to improve risk stratification in a contemporary prostatectomy cohort. J. Clin. Oncol. 31, 1428–1434 (2013).
137. Learn, C. A. et al. Resistance to tyrosine kinase inhibition by mutant epidermal growth factor receptor variant III contributes to the neoplastic phenotype of glioblastoma multiforme. Clin. Cancer Res. 10, 3216–3224 (2004).
138. Robinson, D. et al. Integrative clinical genomics of advanced prostate cancer. Cell 161, 1215–1228 (2015).
139. Mody, R. J. et al. Integrative clinical sequencing in the management of refractory or relapsed cancer in youth. JAMA 314, 913–925 (2015).
140. Sekulic, A. et al. Personalized treatment of Sezary syndrome by targeting a novel CTLA4:CD28 fusion. Mol. Genet. Genom. Med. 3, 130–136 (2015).
141. Craig, D. W. et al. Genome and transcriptome sequencing in prospective metastatic triple-negative breast cancer uncovers therapeutic vulnerabilities. Mol. Cancer Ther. 12, 104–116 (2013).
One of the first papers investigating integration of whole-transcriptome sequencing and genome sequencing for targeted therapy selection in advanced metastatic triple-negative breast cancer.
142. Borad, M. J. et al. Integrated genomic characterization reveals novel, therapeutically relevant drug targets in FGFR and EGFR pathways in sporadic intrahepatic cholangiocarcinoma. PLoS Genet. 10, e1004135 (2014).
This paper describes the discovery of therapeutically actionable events including novel oncogenic fusions in FGFR2 identified by RNA-sequencing.
143. Greco, F. A., Lennington, W. J., Spigel, D. R. & Hainsworth, J. D. Molecular profiling diagnosis in unknown primary cancer: accuracy and ability to complement standard pathology. J. Natl Cancer Inst. 105, 782–790 (2013).
144. Paik, S. et al. Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer. J. Clin. Oncol. 24, 3726–3734 (2006).
145. Albain, K. S. et al. Prognostic and predictive value of the 21-gene recurrence score assay in postmenopausal women with node-positive, oestrogen-receptor-positive breast cancer on chemotherapy: a retrospective analysis of a randomised trial. Lancet Oncol. 11, 55–65 (2010).
146. Knezevic, D. et al. Analytical validation of the Oncotype DX prostate cancer assay - a clinical RT-PCR assay optimized for prostate needle biopsies. BMC Genomics 14, 690 (2013).
147. Clark-Langone, K. M., Sangli, C., Krishnakumar, J. & Watson, D. Translating tumor biology into personalized treatment planning: analytical performance characteristics of the OncotypeDX Colon Cancer Assay. BMC Cancer 10, 691 (2010).
148. Zhang, Y. et al. Breast cancer index identifies early-stage estrogen receptor-positive breast cancer patients at risk for early- and late-distant recurrence. Clin. Cancer Res. 19, 4196–4205 (2013).
149. Wallden, B. et al. Development and verification of the PAM50-based Prosigna breast cancer gene signature assay. BMC Med. Genom. 8, 54 (2015).
150. Salazar, R. et al. Gene expression signature to improve prognosis prediction of stage II and III colorectal cancer. J. Clin. Oncol. 29, 17–24 (2011).
151. Erho, N. et al. Discovery and validation of a prostate cancer genomic classifier that predicts early metastasis following radical prostatectomy. PLoS ONE 8, e66855 (2013).
152. Meiri, E. et al. A second-generation microRNA-based assay for diagnosing tumor tissue origin. Oncologist 17, 801–812 (2012).
153. Taubert, H. et al. Expression of the stem cell self-renewal gene Hiwi and risk of tumour-related death in patients with soft-tissue sarcoma. Oncogene 26, 1098–1100 (2007).
154. Baraniskin, A. et al. Circulating U2 small nuclear RNA fragments as a novel diagnostic biomarker for pancreatic and colorectal adenocarcinoma. Int. J. Cancer 132, E48–E57 (2013).
155. Su, J. et al. Analysis of small nucleolar RNAs in sputum for lung cancer diagnosis. Oncotarget http://dx.doi.org/10.18632/oncotarget.4219 (2015).
156. Du, Z. et al. Integrative genomic analyses reveal clinically relevant long noncoding RNAs in human cancer. Nat. Struct. Mol. Biol. 20, 908–913 (2013).
157. Li, P. et al. Using circular RNA as a novel type of biomarker in the screening of gastric cancer. Clin. Chim. Acta 444, 132–136 (2015).
158. Guo, Y. et al. Transfer RNA detection by small RNA deep sequencing and disease association with myelodysplastic syndromes. BMC Genomics 16, 727 (2015).
159. Westermann, A. J. et al. Dual RNA-seq unveils noncoding RNA functions in host-pathogen interactions. Nature 529, 496–501 (2016).
160. Patton, J. G. et al. Biogenesis, delivery, and function of extracellular RNA. J. Extracell. Vesicles 4, 27494 (2015).
161. Brinkmann, K. et al. Exosomal RNA-based liquid biopsy detection of EML4-ALK in plasma from NSCLC patients. 16th World Conference on Lung Cancer Abstract 2591 (IASLC, 2015).

Acknowledgements
The authors acknowledge funding from the Ben and Catherine Ivy Foundation and the National Center for Advancing Translational Sciences (exRNA Signatures Predict Outcomes After Brain Injury; UH3TR000891). Research was also supported by a Stand Up To Cancer Melanoma Research Alliance Melanoma Dream Team Translational Cancer Research Grant. Stand Up To Cancer is a programme of the Entertainment Industry Foundation administered by the American Association for Cancer Research. The authors apologize to those whose work could not be cited or discussed owing to space constraints.

Competing interests statement
The authors declare no competing interests.

DATABASES
circBase: http://www.circbase.org/
Genomic tRNA Database: http://gtrnadb.ucsc.edu/genomes/eukaryota/Hsapi19/hg19-tRNAs.fa
miRBase: http://www.mirbase.org/
piRBase: http://regulatoryrna.org/database/piRNA/
piRNABank: http://pirnabank.ibab.ac.in/
SILVA ribosomal RNA database project: http://www.arb-silva.de/
The Ribosomal Database Project: https://rdp.cme.msu.edu/
The National Comprehensive Cancer Network: http://www.nccn.org/icl/default.aspx

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 15


