Principles of Mass Spectrometry-Based Proteomics

Steven Gygi
Department of Cell Biology
Harvard Medical School, Boston, MA
The Flow of Information

Networks

Complexes

Proteins

Genes
mRNA Profile Patterns on a Global Level

The Full Yeast Genome on a Chip

Statistics:
6116 Yeast Genes
96 Intergenic regions

DeRisi et al, 1997, Science 278:680


Control of Eukaryotic Gene Expression

[Figure: control points in eukaryotic gene expression. Nucleus: DNA to primary RNA transcript (transcriptional control), then to mRNA (RNA processing control). Nucleus to cytosol: RNA transport control. Cytosol: mRNA to protein (translational control), mRNA to inactive mRNA, and protein to active or inactive protein (protein activity control).]
Proteomics:

Systematic identification and characterization of proteins for their quantity, structure, function, activity and molecular interactions
Mass-Spectrometry-Based Proteomics

[Figure: proteomics by mass spectrometry covers identification, PTMs, and quantification; spectra are plotted as intensity vs. m/z]
If all you have is a hammer, everything looks like a nail
-- Bernard Baruch
Biochemists have cool hammers
Ion Trap Mass Spectrometry

IPI:IPI00215977.1 Homo sapiens (Human) SPLICE ISOFORM 2 OF INSULIN-LIKE GROWTH FACTOR II PRECURSOR.
MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELV
DTLQFVCGDRGFYFRLPGRPASRVSRRSRGIVEECCF
RSCDLALLETYCATPAKSERDVSTPPTVLPDNFPRYPVG
KFFQYDTWKQSTQRLRRGLPALLRARRGHVLAKELEA
FREAKRHRPLIALPTQDPAHGGAPPEMASNRK
Unambiguously identify proteins

Determine the precise site of a PTM (figure labels: P, Ac)

Quantify protein abundance

[Figure: spectra plotted as intensity vs. m/z and intensity vs. time]
Protein Sequencing

NH2 -Glu-Gly-Ser-Thr-Ser-Pro-Pro-His-Ala-His-Leu-Lys-COOH

Edman-type degradation:
1 hr = Glu
1 hr = Gly
1 hr = Ser
...
1 hr = Lys
Total Time = 12 hours

Tandem mass spectrometry:
Total Time = ~1 second


Overview

How Proteins Are Studied with Mass Spectrometry

Protein/Peptide Mass Spectrometry
    Instrumentation
    Mass Spectrometry (MS) and Tandem Mass Spectrometry (MS/MS)
    Peptide Sequencing by MS/MS

Large-Scale MS Based Proteomics Studies
    Large-Scale Data Acquisition
    Automated Assignment of MS/MS Data

Studying Posttranslational Modification with MS

Quantitative MS Proteomics
Protein/Peptide Mass Spectrometry

General Scheme of a Mass Spectrometer

Ionization source -> Mass analyzer -> Detector

Ionization source: Evaporation, Desorption, Ionization -> Gas-phase ions
Mass analyzer: Separation of Ions Based on their m/z (High Vacuum)
Protein/Peptide Mass Spectrometry

Ionization of Large Biomolecules - Soft Ionization Techniques

Matrix-Assisted Laser Desorption/Ionization (MALDI):
    Singly-Charged Ions
    Fast Analysis

Electrospray Ionization (ESI):
    Multiply-Charged Ions
    Direct On-Line Coupling with Chromatographic Separation Techniques (LC-MS, LC-MS/MS)

Nobel Prize in Chemistry 2002: John B. Fenn, Koichi Tanaka


Protein/Peptide Mass Spectrometry

Multiply Charged Ions from the Electrospray Process


Mass Determination of Large Proteins

Cytochrome C (equine), molecular weight 12,360 Da

Deconvoluted Spectrum

Fenn et al. (1989) Science 246, 64-71
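
As a rough illustration of how a multiply charged ESI envelope yields a protein mass, the sketch below computes m/z values for a neutral mass and then recovers charge and mass from two adjacent peaks. The 12,360 Da value is taken from the slide; the charge states 15 and 16 and the function names are assumptions chosen for illustration, and real deconvolution software uses many peaks and intensities.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz_of(mass, z):
    """m/z of an [M + zH]z+ ion for a neutral mass given in Da."""
    return (mass + z * PROTON) / z

def deconvolute(mz_low_charge, mz_high_charge):
    """Infer charge and neutral mass from two adjacent peaks of one envelope.

    mz_low_charge is the peak at higher m/z (charge z); mz_high_charge is
    its neighbour at lower m/z (charge z + 1).
    """
    # From z * (m1 - H) = (z + 1) * (m2 - H) it follows that
    # z = (m2 - H) / (m1 - m2).
    z = round((mz_high_charge - PROTON) / (mz_low_charge - mz_high_charge))
    mass = z * (mz_low_charge - PROTON)
    return z, mass

# Hypothetical adjacent charge states for a 12,360 Da protein.
peak_15 = mz_of(12360.0, 15)
peak_16 = mz_of(12360.0, 16)
charge, mass = deconvolute(peak_15, peak_16)
print(charge, round(mass, 2))
```

Using only two peaks keeps the algebra visible; averaging the mass estimate over every peak in the envelope would be the natural next step.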


Protein/Peptide Mass Spectrometry

Multi-Step Mass Analysis: MS and MS/MS

The MS Experiment: ESI -> Quadrupole Mass Analyzer -> Detector (Electron Multiplier)

Rod pair potentials: $+(U + V\cos\omega t)$ and $-(U + V\cos\omega t)$

[Figure: MS spectrum, intensity vs. m/z]
Protein/Peptide Mass Spectrometry

Protein Identification at the MS level by Peptide-Mass Fingerprinting

Relatively High Purity of Proteins Is Required
2D gel electrophoresis

>gi|1723752|sp|P50086|
MSNYPLHQACMENEFFKVQELLHSKPSLLLQKDQDGRIPLHW
SVSFQAHEITSFLLSKMENVNLDDYPDD SGWTPFHIACSVG
NLEVVKSLYDRPLKPDLNKITNQGVTCLHLAVGKKWFEVSQF
LIENGASVRIKDKFN QIPLHRAASVGSLKLIELLCGLGKSA
VNWQDKQGWTPLFHALAEGHGDAAVLLVEKYGAEYDLVDNKG
AKAEDVALNEQVKKFFLNNV
Digestion with Trypsin (Peptides Better Amenable to MS)

MS (typically MALDI)
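
The fingerprinting idea can be sketched in a few lines: digest a database sequence in silico and list the peptide masses that would be compared with the measured MS peaks. This is a minimal sketch, assuming a simplified trypsin rule (cleave after every K or R, no proline exception, no missed cleavages) and standard monoisotopic residue masses; the function names are hypothetical.

```python
import re

# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # added once per peptide for the free termini

def tryptic_digest(sequence):
    """Split a protein after every K or R (simplified trypsin rule)."""
    return re.findall(r"[^KR]*[KR]|[^KR]+$", sequence)

def peptide_mass(peptide):
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER

# First line of the P50086 entry shown on the slide.
fragment = "MSNYPLHQACMENEFFKVQELLHSKPSLLLQKDQDGRIPLHW"
for pep in tryptic_digest(fragment):
    print(pep, round(peptide_mass(pep), 4))
```

A fingerprint match would then count how many measured masses fall within a tolerance of this predicted list for each database protein.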
Protein/Peptide Mass Spectrometry

Multi-Step Mass Analysis: MS and MS/MS


The MS/MS Experiment
Triple Quadrupole Analyzer
Q1 Q2 Q3

Scan Rf-Only
No Filter Function

[Figure: MS spectrum]

MS/MS Experiments Are Initialized with an MS Experiment
Protein/Peptide Mass Spectrometry

Multi-Step Mass Analysis: MS and MS/MS


The MS/MS Experiment

Q1: Isolating Defined Ion
Q2: Collision with Inert Gas (CID*)
Q3: Scanning

[Figure: MS spectrum with the peak labeled 677, followed by the resulting MS/MS spectrum]
Protein/Peptide Mass Spectrometry

Fragmentation of Peptide Ions

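[Figure: peptide backbone between residues m and m+1 (side chains Rm and Rm+1) showing the cleavage sites that give the N-terminal fragments a_m, b_m, c_m and the C-terminal fragments x_(n-m), y_(n-m), z_(n-m)]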
Protein/Peptide Mass Spectrometry

Fragmentation of Peptide Ions

[Figure: eight-residue peptide, H2N to OH, with side chains R1 to R8; the y ions y7 to y1 and the ions a2 and b2 to b7 are marked along the backbone]
Protein/Peptide Mass Spectrometry

NH2-NSGDIVNLGSIAGR-COOH

YMR134W, yeast protein involved in iron metabolism
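
To make the b and y ion series concrete, the sketch below computes singly charged fragment m/z values for the peptide shown on the slide. It assumes unmodified residues, monoisotopic masses, and 1+ fragments only; the function name is hypothetical.

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def fragment_ions(peptide):
    """Return (b, y) lists of singly charged fragment m/z values.

    b[i] holds b(i+1), built from the N-terminus; y[i] holds y(i+1),
    built from the C-terminus.
    """
    masses = [MONO[aa] for aa in peptide]
    b, y = [], []
    running = 0.0
    for m in masses[:-1]:            # b ions: prefixes, no water
        running += m
        b.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # y ions: suffixes plus water
        running += m
        y.append(running + WATER + PROTON)
    return b, y

b_ions, y_ions = fragment_ions("NSGDIVNLGSIAGR")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} {b:.4f}   y{i} {y:.4f}")
```

Because each b ion and its complementary y ion together contain every residue, their m/z values sum to the neutral peptide mass plus two protons, which is a handy sanity check when reading a spectrum.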


Mass Spectrometry of Peptides and Proteins

Large-Scale Protein Profiling


Large-Scale MS Based Proteomics Studies

Large-Scale Protein Profiling

Fractionation of Proteins

High Number of Proteins in Biological Samples
High Concentration Range of Proteins
Limited Number of Sequencing Attempts (MS/MS Experiments) in a Mass Spectrometric Analysis
Limited Dynamic Range of Mass Spectrometers (Partly Due to Peptides Competing for Protons in the Ionization Process: Ionization Suppression)
Large-Scale MS Based Proteomics Studies

Large-Scale Protein Profiling

Why Are Peptides and Not Proteins Analyzed?

Better Solubility, Allows Analysis of e.g. Membrane Proteins
More Amenable to Chromatographic Separation (Mostly RP)
More Amenable to Mass Spectrometric Analysis:
    Ionization
    Fragmentation (Energy Impact in CID Process Too Small to Fragment Proteins)
    Digestion with Trypsin (Cleaves After K and R), Protonation on Both Termini of Peptide Ions -> b and y Ion Series
Various Posttranslational Modifications (PTMs) Lead to a Combinatorial Explosion of Protein Sequence Databases
Large-Scale MS Based Proteomics Studies

Large-Scale Protein Profiling

Why Are Peptides and Not Proteins Analyzed?

Intact Protein Analysis: Top-Down Proteomics (Bottom-Up for Peptides)

Histone H4 (Human): Function Controlled by Cassettes of PTMs, Not Accessible by Bottom-Up Proteomics

Electron Capture Dissociation for Fragmentation

Various Known Modifications (Acetylation, Methylation, Phosphorylation): 46,875 Sequences Considered

Pesavento et al. (2003) JACS 126


Large-Scale MS Based Proteomics Studies

Large-Scale Protein Profiling

Mass Spectrometers Used in Proteomic Studies


Two Different Instruments with Similar Performance: LTQ Orbitrap and LTQ FT

The LTQ Orbitrap

Makarov (2000) Anal. Chem. 72, 1156.
Makarov (1999) US Patent 5,886,346.
The LTQ Orbitrap

1. Ions are stored in the Linear Trap
2. ... are axially ejected
3. ... and trapped in the C-trap
4. ... they are squeezed into a small cloud and injected into the Orbitrap
5. ... where they are electrostatically trapped, while rotating around the central electrode and performing axial oscillation

The oscillating ions induce an image current into the two outer halves of the Orbitrap, which can be detected using a differential amplifier

Ions of only one mass generate a sine wave signal
The LTQ Orbitrap

The axial oscillation frequency follows the formula

$\omega = \sqrt{\dfrac{k}{m/z}}$

where $\omega$ = oscillation frequency
k = instrumental constant
m/z = ... well, we have seen this before

Many ions in the Orbitrap generate a complex signal whose frequencies are determined using a Fourier Transformation
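
Since the axial frequency obeys $\omega = \sqrt{k/(m/z)}$, each frequency recovered by the Fourier transformation maps back to m/z as $m/z = k/\omega^2$. The sketch below shows that conversion, assuming the instrumental constant k is obtained from a single calibrant ion; the numbers and function names are hypothetical and the units are arbitrary.

```python
import math

def calibrate_k(mz_ref, omega_ref):
    """Instrumental constant from one ion of known m/z and measured frequency."""
    return mz_ref * omega_ref ** 2

def mz_to_frequency(mz, k):
    """Axial oscillation frequency: omega = sqrt(k / (m/z))."""
    return math.sqrt(k / mz)

def frequency_to_mz(omega, k):
    """Invert the relation: m/z = k / omega^2."""
    return k / omega ** 2

# Hypothetical calibrant: m/z 400 observed at frequency 1000 (arbitrary units).
k = calibrate_k(400.0, 1000.0)
print(mz_to_frequency(1600.0, k))   # a 4x heavier ion oscillates at half the frequency
print(frequency_to_mz(500.0, k))
```

The inverse square relation means a small relative error in frequency doubles when converted to m/z, which is one reason accurate frequency measurement matters.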
Large-Scale MS Based Proteomics Studies

Large-Scale Protein Profiling

Data-Dependent MS/MS Spectra Acquisition

More than 10,000 MS/MS Spectra Acquired in a Typical LC-MS/MS Run


Large-Scale MS Based Proteomics Studies

Large-Scale Protein Profiling

High Mass Accuracy and High Mass Resolution:

Allows Charge Determination of Peptides
Enhances Confidence in Peptide Identifications
Lowers the Influence of Noise in Quantitative Studies
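
Charge determination at high resolution rests on the spacing of isotopic peaks: neighbouring isotopologues differ by about one 13C-for-12C substitution, so they sit roughly 1.00335/z apart in m/z. A minimal sketch under that assumption, with hypothetical peak values and function names:

```python
C13_SPACING = 1.003355  # mass difference between 13C and 12C in Da
PROTON = 1.007276       # mass of a proton in Da

def charge_from_isotopes(mz_peaks):
    """Estimate the charge from the mean spacing of consecutive isotopic peaks."""
    peaks = sorted(mz_peaks)
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return round(C13_SPACING / mean_gap)

def neutral_mass(mz, z):
    """Neutral monoisotopic mass from the monoisotopic m/z and the charge."""
    return z * (mz - PROTON)

# Hypothetical isotope cluster spaced 0.5 m/z apart, i.e. a doubly charged peptide.
cluster = [650.34, 650.84, 651.34]
z = charge_from_isotopes(cluster)
print(z, round(neutral_mass(cluster[0], z), 4))
```

At low resolution the 0.5 or 0.33 m/z spacing of 2+ and 3+ ions blurs into one peak, which is why this step depends on the instrument.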
Large-Scale MS Based Proteomics Studies

Large-Scale Protein Profiling

De Novo Sequencing of Peptides Based on MS/MS Data: Ambitious Task

Whole-Genome Sequence Information:

Automated MS/MS Spectra Assignment Through Matching Acquired MS/MS Spectra With Those Predicted on the Basis of Protein Sequence Databases

Facilitates Large-Scale Proteomics


Large-Scale MS Based Proteomics Studies

Large-Scale Protein Profiling

MS/MS Spectra Database Search Algorithms:

SEQUEST:
    Theoretical Spectra Predicted Based on Database Sequences
    Cross-Correlation to Match Acquired with Predicted Data

Mascot:
    Determination of Corresponding b- and y-ions
    Probability for Random Match Calculated

[Figure: acquired MS/MS spectrum compared with the in silico predicted spectrum of b- and y-ions]
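
The cross-correlation idea can be sketched as follows: bin the acquired and predicted spectra, take their overlap at zero offset, and subtract the average overlap at nearby offsets as a background. This is a simplified illustration of the principle only, not SEQUEST's actual scoring; it ignores intensities and preprocessing, and the peak lists, bin width, and offset window are assumptions.

```python
def bin_spectrum(mz_values, n_bins=2000, bin_width=1.0):
    """Turn a peak list into a 0/1 vector with one entry per m/z bin."""
    vec = [0.0] * n_bins
    for mz in mz_values:
        idx = int(mz / bin_width)
        if 0 <= idx < n_bins:
            vec[idx] = 1.0
    return vec

def dot_at_shift(obs, pred, shift):
    """Overlap of obs with pred when pred is offset by `shift` bins."""
    n = len(obs)
    total = 0.0
    for i in range(n):
        j = i + shift
        if 0 <= j < n:
            total += obs[i] * pred[j]
    return total

def xcorr(obs, pred, max_shift=75):
    """Zero-offset overlap minus the mean overlap over non-zero offsets."""
    at_zero = dot_at_shift(obs, pred, 0)
    shifts = [s for s in range(-max_shift, max_shift + 1) if s != 0]
    background = sum(dot_at_shift(obs, pred, s) for s in shifts) / len(shifts)
    return at_zero - background

# Hypothetical acquired peaks and two candidate predicted spectra.
acquired = bin_spectrum([175.1, 232.1, 345.2, 458.3, 571.4])
candidates = {
    "candidate_A": bin_spectrum([175.1, 232.1, 345.2, 458.3, 571.4]),
    "candidate_B": bin_spectrum([180.1, 240.2, 350.3, 470.4, 590.9]),
}
scores = {name: xcorr(acquired, vec) for name, vec in candidates.items()}
print(max(scores, key=scores.get), scores)
```

Subtracting the off-zero background penalizes candidates that overlap with the acquired spectrum only by chance at arbitrary offsets.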
Large-Scale MS Based Proteomics Studies

Examples

4 Life Cycle Stages Characterized


2,400 Proteins Profiled
Aim: Targets for Drugs/Vaccines to Interrupt Life Cycle
Florens et al. (2002) Nature 419, 520
Large-Scale MS Based Proteomics Studies

Examples
Profiling of Posttranslational Modifications

Posttranslational Modifications (PTMs):
    >200 known
    Localization, Activity State, Turnover, Interaction

Identifying PTMs with MS:
    LC-MS/MS Analogous to Analysis of Unmodified Peptides
    Database Search: Modified Residue(s) Considered as Additional Residue Potentially Replacing the Unmodified Residue

Residues considered in the search: A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S or pS, T or pT, W, Y or pY, V
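
Treating a modified residue as an extra residue that may replace the unmodified one means every S, T, or Y doubles the number of candidate sequences. The sketch below enumerates those candidates for phosphorylation, assuming a monoisotopic shift of 79.96633 Da (HPO3) per site; the function name and the `max_sites` parameter are hypothetical.

```python
from itertools import combinations

PHOSPHO = 79.96633  # monoisotopic mass added by one phosphate group (HPO3), Da

def phospho_isoforms(peptide, max_sites=None):
    """List (labelled sequence, added mass) for every S/T/Y phosphorylation pattern."""
    sites = [i for i, aa in enumerate(peptide) if aa in "STY"]
    if max_sites is None:
        max_sites = len(sites)
    isoforms = []
    for k in range(max_sites + 1):
        for chosen in combinations(sites, k):
            labelled = "".join(
                "p" + aa if i in chosen else aa for i, aa in enumerate(peptide)
            )
            isoforms.append((labelled, k * PHOSPHO))
    return isoforms

# MSFEILR has a single candidate site, so two forms are searched.
for seq, delta in phospho_isoforms("MSFEILR"):
    print(seq, round(delta, 5))
```

With n candidate sites and no cap the list holds 2^n entries, which is the combinatorial growth that makes search engines limit the number of variable modifications per peptide.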
Quantitative Studies by MS

Stable Isotopes
Internal Standards

[Figure: chemical structures of the phosphopeptide M-pS-F-E-I-L-R (894.47 Da) and its stable-isotope-labeled internal standard M-pS-F-E-I-L*-R (901.47 Da), where L* marks the isotope-labeled residue]
Internal Standards in MS

Stable Isotope Dilution: Enrich heavier isotope occurrence from natural abundance to ~100% for certain atoms in the analyte

Carbon: 12C = 98.93%, 13C = 1.07%
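
The natural 13C abundance quoted above determines how much of an unlabeled molecule already carries one or more heavy carbons. A minimal sketch, assuming carbon is the only element considered and each carbon is independently 13C with probability 0.0107 (a binomial model); the function name and the carbon count of 40 are hypothetical.

```python
from math import comb

P13C = 0.0107  # natural abundance of 13C, from the slide

def c13_distribution(n_carbons, max_heavy=4):
    """Probability of finding 0..max_heavy 13C atoms among n_carbons carbons."""
    return [
        comb(n_carbons, k) * P13C ** k * (1 - P13C) ** (n_carbons - k)
        for k in range(max_heavy + 1)
    ]

# Hypothetical peptide with 40 carbon atoms.
for k, p in enumerate(c13_distribution(40)):
    print(f"{k} x 13C: {p:.4f}")
```

The probabilities fall off quickly with the number of heavy atoms, so a standard shifted by several daltons lands largely clear of the natural isotope envelope of the analyte.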
Isotope Coded Affinity Tags (ICAT)

ICAT Reagents:
Heavy reagent: d8-ICAT (X = deuterium)
Light reagent: d0-ICAT (X = hydrogen)

[Figure: ICAT reagent structure with three parts: biotin tag, linker (heavy or light, carrying the X positions), and thiol-specific reactive group]
The ICAT Strategy for Quantitative Proteomics
Principles of ICAT Strategy

Quantitative

Potential to identify unknown proteins

Automated

Complexity of peptide mixture is reduced

Redundant quantification if multiple cysteines present

Database search is constrained by SH-specificity

Relative protein quantity is maintained through biochemical, immunological, or physical fractionation

Compatible with analysis of low abundance proteins


Comparison of Yeast Utilizing Ethanol or Galactose as Carbon Source

Galactose: 100 µg protein, Heavy ICAT reagent
Ethanol: 100 µg protein, Light ICAT reagent
-> ICAT analysis

(Figure label: Transition states)
Differential Protein Expression in Yeast Growing on Ethanol or Galactose

Gene    Peptide Sequence              Ratio (Eth : Gal)   Gal-repressed / Glu-repressed
ACH1    KHNC#LHEPHMLK                 >100 : 1            Y
ADH1    YSGVC#HTDLHAWHGDWPLPVK        0.57 : 1
        C#C#SDVFNQVVK                 0.48 : 1
ADH2    YSGVC#HTDLHAWHGDWPLPTK        >200 : 1            Y Y
        C#SSDVFNHVVK                  >200 : 1
PEP4    KGWTGQYTLDC#NTR               2.60 : 1            Y
LPD1    VC#HAHPTLSEAFK                1.30 : 1            Y
TEF1    RGNVC#GDAK                    0.81 : 1
        C#GGIDK                       0.70 : 1
        FVPSKPMC#VEAFSEYPPLGR         0.74 : 1
GAL1    LTGAGWGGC#TVHLVPGGPNGNIEK     1 : >200            Y
GAL10   HHIPFYEVDLC#DR                1 : >200            Y
        DC#VTLK                       1 : >200
PGM2    C#TGGIILTASHNPGGPENDMGIK      0.58 : 1            Y
        LSIC#GEESFGTGSNHVR            0.62 : 1
PCK1    IPC#LADSHPK                   1.47 : 1            Y
        C#INLSAEKEPEIFDAIK            1.52 : 1
        C#AYPIDYIPSAK                 1.41 : 1
        IVEEPTSKDEIWWGPVNKPC#SER      1.85 : 1
SOD1    GFHIHEFGDATNGC#VSAGPHFNPFK    0.46 : 1            Y
QCR6    ALVHHYEEC#AER                 1.30 : 1            Y
Ethanol Metabolism in Yeast

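[Figure: ethanol metabolism in yeast. Glucose or galactose enters glycolysis (2 NAD+ to 2 NADH) to give pyruvate. The alcohol dehydrogenase isozymes ADH1 and ADH2 interconvert acetaldehyde (CH3-CHO) and ethanol (CH3-CH2-OH), coupled to NADH/NAD+. Acetaldehyde is also converted to acetate, which feeds the TCA cycle (oxaloacetate shown) and oxidative phosphorylation, ending in CO2 + H2O.]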
ICAT Analysis of Yeast Utilizing Different Carbon Sources:
Alcohol Dehydrogenase Isozymes

ADH1: YSGVCHTDLHAWHGDWPLPVK (Ratio: 0.57)
ADH2: YSGVCHTDLHAWHGDWPLPTK (Ratio: >200)

[Figure: two mass spectra, relative abundance (0 to 100) vs. m/z (1430 to 1480), one per isozyme, each showing a galactose/ethanol peak pair; labeled peaks at m/z 1453.1, 1454.1, and 1457.1]


Metabolic Labeling For Quantitative Proteomics

IR

293T cells
in Light DMEM: [12C6, 14N2]-Lys (MW: 146), [12C6, 14N4]-Arg (MW: 174)
in Heavy DMEM: [13C6, 15N2]-Lys (MW: 154), [13C6, 15N4]-Arg (MW: 184)

Trypsinize
Enrich Phosphopeptides

LC-MS/MS ID and Quantification
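
The mass offset between a light peptide and its heavy partner follows from the label composition: each heavy lysine adds six 13C and two 15N, each heavy arginine six 13C and four 15N. A minimal sketch of that arithmetic and of the resulting m/z gap at a given charge, assuming complete labeling; the function names are hypothetical.

```python
DELTA_13C = 1.003355  # 13C minus 12C, Da
DELTA_15N = 0.997035  # 15N minus 14N, Da

# Mass added per labeled residue: [13C6, 15N2]-Lys and [13C6, 15N4]-Arg.
HEAVY_SHIFT = {
    "K": 6 * DELTA_13C + 2 * DELTA_15N,
    "R": 6 * DELTA_13C + 4 * DELTA_15N,
}

def silac_shift(peptide):
    """Total heavy-minus-light mass difference for a fully labeled peptide."""
    return sum(HEAVY_SHIFT.get(aa, 0.0) for aa in peptide)

def silac_mz_gap(peptide, z):
    """Spacing between the light and heavy peaks at charge z."""
    return silac_shift(peptide) / z

def heavy_to_light_ratio(light_intensity, heavy_intensity):
    """Relative abundance of the heavy sample versus the light one."""
    return heavy_intensity / light_intensity

print(round(silac_shift("MSFEILR"), 4))       # one Arg
print(round(silac_mz_gap("MSFEILR", 2), 4))   # same pair observed at 2+
```

Because trypsin cleaves after K and R, nearly every tryptic peptide carries at least one labeled residue, so almost every peptide appears as a quantifiable pair.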


Quantification of Relative Difference in Protein Phosphorylation

[Figure: two MS spectra, relative intensity (0 to 100) vs. m/z, each showing a light/heavy peptide pair separated by 6 Da; one panel labeled Unregulated, the other Regulated]
DNA Damage Induces A Phosphorylation Cascade And Cell Cycle Arrest

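[Figure: signaling cascade. DNA damage activates Mec1 (ATR) and Tel1 (ATM), which act on Rad53 and Chk1; Rad53 acts on Dun1. Question marks at each level mark substrates that are not yet known.]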
DNA Replication Is A Dangerous Event For Complex Genomes
Chromosomal Breakage in RAD17-/- Cells

[Figure: chromosome spreads from RAD17+/+ and RAD17-/- cells, with chromosome breaks marked. Credit: Lei Li]
Cancer Susceptibility Linked to DNA Damage Responses

Syndrome                      Defect
Fanconi's Anemia              Crosslink Repair / ALL
Werner's Syndrome             Aging, Damage-sensitivity
Ataxia-Telangiectasia         IR sensitivity, Ataxia / ALL, NHL
Xeroderma Pigmentosum         Excision Repair / Skin Cancer
Bloom's Syndrome              SCE, Leukemia
Nijmegen Breakage Syndrome    IR sensitivity / ALL, NHL
Seckel Syndrome               Microcephaly, Checkpoint
Li-Fraumeni (p53, Chk2)       Cell cycle, Apoptosis, Many tumors
HNPCC                         Mismatch repair / colon cancer + others
BRCA1, BRCA2                  Recombination, Breast + Ovarian Cancer
Measuring Phosphorylation Differences After IR Treatment
Quantification and identification of phospho-SQ/TQ peptides by LC-MS/MS

[Figure: two MS spectra, relative intensity (0 to 100) vs. m/z, each showing a light/heavy peptide pair separated by 6 Da; one panel labeled Unregulated, the other Regulated after DNA damage]
Example of XIC for Quantitative Pair after DNA Damage
Phosphorylation Changes After IR Treatment in Yeast
Examples of IR-induced Phosphorylation Changes
Final Thoughts

Mass spectrometry can identify, characterize, and quantify proteins

Scale can be in the thousands, not global

Stable isotopes are powerful proteome-labeling atoms

Large-scale experiments often require significant validation

Mass spectrometry-based proteomics is enabling the measurement of new endpoints in biology
Acknowledgements

Gygi Lab
Dept. Cell Biology
Harvard Medical School
Taplin Biological MS Facility
