
Introduction to Microarrays

BTCH-Paper XI Unit- IV: DNA Microarrays


A Brief History of Genomics

• 1953: the breakthrough description of the structure of the DNA helix by James D. Watson and Francis H. C. Crick.
• 1961: F. Jacob and J. Monod, gene regulation in prokaryotes.
• 1973: Daniel Nathans and Hamilton Smith, restriction endonucleases and DNA ligase.
• As early as 1972, Paul Berg and his colleagues at Stanford University developed an animal virus (SV40) vector containing bacteriophage lambda genes for the insertion of foreign DNA into E. coli cells.
• 1985: Kary Mullis (USA) invented PCR.
• 1986: an international conference in Santa Fe, New Mexico, discussed implementing a human genome project.
• This meeting led to a 1988 study by the National Research Council titled Mapping and Sequencing the Human Genome.
At the same time, under the leadership of Director James
Wyngaarden, the National Institutes of Health established the Office
of Genome Research which in 1989 became the National Center for
Human Genome Research, directed by James D. Watson. The next ten
years witnessed rapid progress and technology developments in
automated sequencing methods. These technologies led to the
establishment of large-scale DNA sequencing projects at many public
research institutions around the world, such as the Whitehead Institute
in Cambridge, MA and the Sanger Centre near Cambridge, UK.

These activities were accompanied by the rapid development of
computational and informational methods to meet challenges created
by an increasing flow of data from large-scale genome sequencing
projects.
In 1991 Craig Venter at the National Institutes of Health developed a
way of finding human genes that did not require sequencing of the
entire human genome.
 In 1992, Venter left NIH to establish The Institute for Genomic
Research, TIGR.
In 1998 the Human Genome Program announced a plan to complete
the human genome sequence by 2003, the 50th anniversary of
Watson and Crick’s description of the structure of DNA.
The first in situ probe synthesis method for manufacturing DNA
Microarrays

• Understand the principles of the microarray technique.
• Appreciate the limitations of microarrays and problems associated with the technique.
• Know what types of output are generated from different microarray analysis packages and what they mean.
Why Learn about Microarrays?

• Extremely useful and powerful technology – given a sample of human tissue, allows you to determine the expression level of all human genes within that tissue.

• Now extremely widely used, not only in research laboratories but also within commercial companies and diagnostically in hospitals.

• Many research articles written involving microarray data – bioinformatics is vital for understanding these data and results.
What is a Microarray?
• Mark Schena, one of the founders of the technology at Stanford in the early 1990s, says microarrays need to be:
– Microscopic ordered arrays of specific probes on a planar surface.

[Figure: 1. Label sample; 2. Wash over array; 3. Scan]


For Example - Protein Microarrays

[Figure: different proteins (sample) and different antibodies (array). 1. Label sample; 2. Wash over array; 3. Scan]


Gene Expression Microarrays

[Figure: different mRNAs (sample) and complementary DNA sequences (array). 1. Label sample; 2. Wash over array; 3. Scan]


Gene Expression Microarrays – Key Concepts

[Figure: starting from a tissue biopsy or cell culture. 1. mRNA extraction; 2. Reverse transcription to cDNA and labelling; 3. Sequence-specific nucleic acid hybridization]
Goals of a Microarray Experiment
1. Find the genes that change expression
between experimental and control
samples
2. Classify samples based on a gene
expression profile
3. Find patterns: Groups of biologically
related genes that change expression
together across samples/treatments
Two Major Gene Expression Microarray Technologies

• Spotted Arrays
• Affymetrix GeneChips
Spotted Microarrays – Manufacture

• DNA probes spotted onto the microarray can either be cDNA created by PCR or synthetic oligonucleotides.
• The probes are physically spotted onto particular positions on a glass slide using a robot and immobilized using specific surface chemistry.
• Spotted microarrays can also be manufactured ‘by hand’.
Spotted Microarrays – Example Use
• As an example, we will look at detecting differences between acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML). This was one of the first successful uses of microarrays in cancer classification.
• Total mRNA was extracted from bone marrow taken from patients with ALL and AML.
• The mRNA was converted to cDNA by reverse transcription (since mRNA is sensitive to degradation).
Spotted Microarrays – Two Channels

[Figure: ALL cDNAs are labelled with Cy3 dye and AML cDNAs with Cy5 dye; the two are then mixed and hybridized onto the slide]

Spot Interpretation
Green Spot – higher expression in ALL.
Red Spot – higher expression in AML.
Yellow Spot – equal expression in both.
Black Spot – not expressed in either.
Affymetrix GeneChip – Manufacture
Oligonucleotides are synthesized one nucleotide at a time on the surface of a quartz wafer using photolithographic chemistry in clean room conditions.
Affymetrix “Gene chip” system

• Uses 25-base oligos synthesized in place on a chip (20 pairs of oligos for each gene); 20,000 genes/chip
• RNA labeled and scanned in a single “color”
• Arrays get smaller every year (more genes)
• Chips are expensive
• Proprietary system: “black box” software, can only use their chips
Affymetrix GeneChip –
Multiple Probes per Gene
Affymetrix GeneChip – Single Channel

[Figure: ALL cDNAs and AML cDNAs: Biotin label, Hybridize, Scan]
Example Yeast grown in Oxygen

21 © 2003 Discovering genomics AM Campbell LJ Heyer


Measuring Fluorescence



1. Why is there a dark center in the middle of each spot?
2. What differences and similarities does a DNA chip have with
a Southern blot?
Oxygen and Gene Expression



What colors could we use to represent gene expression?
Oxygen and Gene Expression

Were any of the genes transcribed similarly?
Two-color spotted DNA microarrays
M = log2(R/G) = log2(R) - log2(G)
• M < 0, gene is over-expressed in green-labeled sample compared to red-labeled sample.
• M = 0, gene is equally expressed in both samples.
• M > 0, gene is over-expressed in red-labeled sample compared to green-labeled sample.
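
As a rough illustration of the log ratio above, this Python sketch computes M for a single spot and reports which channel is higher. The function names and the example intensities are assumptions made for illustration; they do not come from any particular analysis package.

```python
import math


def log_ratio(red, green):
    """Return M = log2(R/G) for one spot. Both intensities must be positive."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)


def interpret(m):
    """Translate the sign of M into the channel with higher expression."""
    if m > 0:
        return "red"
    if m < 0:
        return "green"
    return "equal"


# Hypothetical spot: red channel four times brighter than green.
m = log_ratio(800.0, 200.0)
print(m, interpret(m))
```

Working on the log2 scale makes the ratio symmetric: a four-fold change in either direction gives M of the same size with opposite sign.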
Image Analysis – Spotted Arrays

[Figure panels: Gridding; Segmentation and background extraction]

Complications include spots lying on curves, spots of different shapes and sizes, variation in background fluorescence, etc.
Image Analysis – Affymetrix
Affymetrix manufacturing and processing is much more controlled and thus image analysis is more straightforward (generally using their own software).

But Affymetrix arrays can still suffer from surface defects such as scratches, so visual inspection is important.
Data Acquisition

• Scan the arrays
• Quantitate each spot
• Subtract background
• Normalize
• Export a table of fluorescent intensities for each gene in the array
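
The background subtraction and export steps can be sketched as follows. This is a minimal Python example, not the behaviour of any scanner software: the input layout (gene, foreground, background), the `floor` value used to avoid zero or negative intensities, and the tab-separated output are all assumptions. Normalization is left out of this sketch.

```python
import csv
import io


def subtract_background(spots, floor=1.0):
    """spots: list of (gene, foreground, background) tuples.

    Returns (gene, corrected intensity) pairs. Values that would fall
    below `floor` are clamped so that logs can still be taken later.
    """
    return [(gene, max(fg - bg, floor)) for gene, fg, bg in spots]


def to_table(rows):
    """Render (gene, intensity) pairs as tab-separated text with a header."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["gene", "intensity"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical quantitated spots.
spots = [("geneA", 1200.0, 200.0), ("geneB", 150.0, 180.0)]
print(to_table(subtract_background(spots)))
```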
Why do we need Normalization?
Variation in gene expression values can either be:

• Variations that we are interested in, for example caused by particular sample treatments or disease states (Signal)
• Variations that we are not interested in, for example caused by differences in the chip spotting process, sample handling and labelling, quantity of sample applied to the microarrays, hybridization time and temperature, image scanning parameters, image processing, etc. (Systematic Bias or Noise)

Normalization is a term used to describe a collection of methods which try to eliminate the unwanted variations between sample expression values while retaining the variation that we are interested in.

These problems can be demonstrated by microarraying the same sample twice and comparing the expression values.
10 Minute Activity

• We examine two types of acute leukemia: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL).

• Bone marrow samples were taken from 27 ALL and 11 AML patients.

• RNA was extracted and hybridized to Affymetrix microarrays, which were then scanned.

• First spend 5 minutes by yourself writing down:
- The variation that this experiment wishes to detect (signal).
- Causes of variation that this experiment is not interested in (noise).
- If this experiment used solid tissue biopsies, what other sources of variation might result?

• Then form into groups of 2 or 3 people for 5 minutes and compare your lists. Do you agree on the causes of variation?
Sources of Variability
• Image analysis (identifying and quantitating
each spot on the array)
• Scanning (laser and detector, chemistry of the fluorescent label)

• Hybridization (temperature, time, mixing, etc.)


• Probe labeling
• RNA extraction
• Sample variability
Self-against-Self Experiment

This shows random noise and systematic bias.
Ideally, all data points should lie on this line.
MVA Plot Shows This Better

M = log(chip1/chip2) = log(chip1) – log(chip2)
Essentially the ‘difference’ in log intensity between chip1 and chip2.

A = (log(chip1) + log(chip2))/2
Essentially the ‘average log intensity’ for chip1 and chip2, or average signal.

The red lowess regression line indicates that we have systematic bias which depends on signal.
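
The M and A values plotted in an MVA plot can be computed per gene as in the sketch below. The slide does not state the base of the logarithm; base 2 is assumed here, and the function name and example intensities are hypothetical.

```python
import math


def ma_values(chip1, chip2):
    """Return a list of (M, A) pairs, one per gene.

    M = log2(chip1) - log2(chip2)        (difference in log intensity)
    A = (log2(chip1) + log2(chip2)) / 2  (average log intensity)
    """
    pairs = []
    for x, y in zip(chip1, chip2):
        lx, ly = math.log2(x), math.log2(y)
        pairs.append((lx - ly, (lx + ly) / 2))
    return pairs


# Hypothetical intensities for three genes measured on two chips.
for m, a in ma_values([8.0, 64.0, 100.0], [2.0, 64.0, 50.0]):
    print(round(m, 3), round(a, 3))
```

In a self-against-self experiment every M would ideally be 0, so any trend of M against A points to systematic bias.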
Lowess Normalization

[Figure: plots BEFORE and AFTER normalization]

This type of normalization corrects any systematic bias which is intensity dependent. It assumes similar numbers of genes will be upregulated and downregulated, and shifts the lowess fit to lie along the x-axis.
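
The idea can be sketched in plain Python: fit a smooth curve of M against A, then subtract the fitted value from each M so that the curve lies along the x-axis. The smoother below is a simplified stand-in for lowess (a single pass of tricube-weighted local linear regression, without the robustness iterations of the full algorithm). The function names and the `frac` parameter are assumptions for illustration.

```python
import math


def local_linear_fit(a, m, frac=0.5):
    """Return the smoothed value of m at each a[i].

    For every point, a straight line is fitted to the nearest `frac`
    share of the data using tricube weights.
    """
    n = len(a)
    k = min(n, max(2, math.ceil(frac * n)))  # points per local window
    fitted = []
    for x0 in a:
        # Bandwidth: distance from x0 to its k-th nearest neighbour.
        h = sorted(abs(x - x0) for x in a)[k - 1] or 1e-12
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in a]
        sw = sum(w)
        mean_a = sum(wi * x for wi, x in zip(w, a)) / sw
        mean_m = sum(wi * y for wi, y in zip(w, m)) / sw
        sxx = sum(wi * (x - mean_a) ** 2 for wi, x in zip(w, a))
        sxy = sum(wi * (x - mean_a) * (y - mean_m)
                  for wi, x, y in zip(w, a, m))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(mean_m + slope * (x0 - mean_a))
    return fitted


def normalize_m(a, m, frac=0.5):
    """Subtract the intensity-dependent trend from each M value."""
    fit = local_linear_fit(a, m, frac)
    return [mi - fi for mi, fi in zip(m, fit)]


# Hypothetical data: M drifts upward with A (pure systematic bias).
a = [6 + 0.5 * i for i in range(20)]
m = [0.1 * (x - 10) for x in a]
print(max(abs(v) for v in normalize_m(a, m)))
```

In practice a tested implementation from a statistics library would normally be used instead of a hand-written smoother.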
Normalization
• Can control for many of the experimental
sources of variability (systematic, not
random or gene specific)
• Bring each image to the same average
brightness
• Can use simple math or something fancier:
– divide by the mean (whole chip or by sectors)
– LOESS (locally weighted regression)
• No sure biological standards
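
The simplest option in the list above, dividing by the mean of the whole chip, can be sketched as follows. The function name, the `target` parameter, and the example intensities are assumptions for illustration.

```python
def scale_to_mean(intensities, target=1.0):
    """Rescale one chip so that its mean intensity equals `target`."""
    mean = sum(intensities) / len(intensities)
    return [x * target / mean for x in intensities]


# Hypothetical chips: the second was scanned twice as bright as the first.
chip1 = [100.0, 200.0, 300.0]
chip2 = [200.0, 400.0, 600.0]
print(scale_to_mean(chip1))
print(scale_to_mean(chip2))
```

After scaling, both chips have the same average brightness, so the remaining differences between them are no longer dominated by overall scan intensity. Applying the same function to each sector of a chip separately would give the by-sector variant.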
Other Types of Biases

Other types of biases which may need correcting include:

• Dye biases due to different behaviours of fluorescent dyes.


• Spatial biases across the microarray surface.
• Print tip biases due to physical differences between printing tips.

It is always better, but not always possible, to minimize these unwanted variations through good Experimental Design and Experimental Procedures.
For instance, dye bias can be removed by carrying out replicate samples in which the dyes have been swapped (dye swap).
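
One way to see why a dye swap helps is the sketch below. It assumes the dye bias adds the same amount to the log ratio M on both arrays, which is a simplification. Under that assumption the forward array measures (true M + bias), the swapped array measures (-true M + bias), and half their difference recovers the true M. The function name and the numbers are hypothetical.

```python
def dye_swap_average(m_forward, m_swapped):
    """Combine log ratios from a forward array and its dye-swapped replicate.

    Assumes an additive dye bias shared by both arrays, which cancels
    when the swapped log ratio is subtracted from the forward one.
    """
    return [(f - s) / 2 for f, s in zip(m_forward, m_swapped)]


# Hypothetical true log ratios 1.0 and -2.0 with a dye bias of 0.25.
forward = [1.0 + 0.25, -2.0 + 0.25]
swapped = [-1.0 + 0.25, 2.0 + 0.25]
print(dye_swap_average(forward, swapped))
```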
Sample Replicates
In the ALL and AML study we have 27 independent samples from
patients with ALL and 11 independent samples from patients with
AML.

These are biological replicates, in that they provide information about the variation between biological samples.

Replicates are essential in microarray studies: they not only make the mean expression values more accurate (reducing random noise), but also provide information about the variability of a particular expression value in the natural population (essential for hypothesis testing … to come later).
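
Both roles of replicates can be illustrated with a short Python sketch using the standard library. The expression values are invented for illustration and the function name is hypothetical. The standard deviation estimates the variability between samples, and the standard error shows how the uncertainty of the mean shrinks as replicates are added.

```python
import math
import statistics


def summarize(values):
    """Return (mean, standard deviation, standard error of the mean)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)       # sample standard deviation
    sem = sd / math.sqrt(len(values))   # uncertainty of the mean
    return mean, sd, sem


# Hypothetical log expression values of one gene in three replicates.
print(summarize([2.0, 4.0, 6.0]))
```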
Role of microarrays in drug discovery
