Journal of Proteomics 105 (2014) 1–4

Available online at www.sciencedirect.com

ScienceDirect
www.elsevier.com/locate/jprot

Editorial

Challenges and prospects of proteomics of non-model organisms

The elucidation of the double helical structure of DNA, and
the mechanistic implications summarized in one sentence ("It
has not escaped our notice that the specific pairing we have
postulated immediately suggests a possible copying mechanism for
the genetic material.") at the end of the seminal article
published in 1953 by James D. Watson and Francis H. C.
Crick [1], represented the birth of molecular biology. Less than
50 years later, the first draft of the entire human genome had
been sequenced [2,3], heralding the new era of genomics.
With an increasing list of genomes of organisms across all
domains of life being routinely sequenced, it becomes clear
that a large gap exists in predicting the phenotype from the
genotype. In other words, understanding the genome alone
does not allow obvious extrapolations into the complex
biology that occurs downstream from the gene. Proteins are
the major workhorses of biological systems and the targets of
natural selection, and their abundance is regulated by
post-transcriptional and post-translational processes that
cannot be detected, or even inferred, by genomic and
transcriptomic analyses alone. Gene regulation gives the cell control
over structure and function, and is the basis for cellular
differentiation, morphogenesis and the versatility and adaptability of any organism. Gene regulation may also serve as a
substrate for evolutionary change, since control of the timing,
location, and amount of gene expression can have a profound
effect on the phenotype of a cell or multicellular organism,
and phenotypic plasticity impacts on diversification and
speciation [4].
At any time point in the natural history of a living system,
the genotype–phenotype map is the outcome of very complex
dynamics that include environmental effects. Bridging the
gap between the genotype and the phenotype is synonymous
with understanding these dynamics, and this goal demands a
quantitative biology approach. The need for the complete
picture engendered the development of new tools (most
notably biological mass spectrometry [5]) and even a new
field, proteomics [6,7]. Established in the 1990s as a powerful
analytical technique, proteomics has catalyzed an expansion
of the scope of biological studies from the reductionist
approach of conventional protein chemistry to the parallel
and rapid proteome-wide measurement of hundreds to
thousands of proteins. The subsequent development of
quantitative MS techniques for the simultaneous study of
the proteins, and ultimately the whole proteome, expressed
in response to changes in gene activity has represented a
major advance in the field over the first decade of the 21st
century [8].

http://dx.doi.org/10.1016/j.jprot.2014.04.034
1874-3919/© 2014 Elsevier B.V. All rights reserved.

Although significant advances in the comprehensive
profiling of proteins have occurred in model organisms (i.e.
those that, because of their small size and short generation
times, facilitate experimental laboratory research), proteomics research in non-model organisms, which represent
the vast majority of the biodiversity that exists on Earth [9],
has not advanced at the same pace. Several factors
hamper the application of high-throughput proteomics
to non-model, particularly to non-genome-sequenced, organisms. On the one hand, peptide sequencing via MS/MS
requires a great deal of de novo interpretation of product ion
mass spectra [10], involving the manual intervention of an
expert mass spectrometrist, to subsequently carry out
cross-species, homology-based database searches [11]. Automated de novo peptide sequencing remains a challenge, and
even a simple MS/MS spectrum may require minutes for a
trained expert to interpret. On the other hand, the application
of proteomic approaches to non-model organisms whose
genome is unavailable or poorly defined entails the principal
challenge of having to infer protein function
from the evolutionary conservation of gene sequences and
protein domains between species. In addition, shotgun
protocols do not work for non-genome-sequenced organisms,
precluding the use of shotgun and targeted proteomics for the
relative or absolute quantification of proteins. Notwithstanding these limitations, the proteomics of non-model organisms
also presents its own strengths. For instance, proteogenomics [12]
and proteotranscriptomics [13] offer the possibility of refining,
and even proof-checking, species-specific genome and transcriptome databases using proteomics data. Further opportunities for proteomics investigations of non-model organisms
can be illustrated with an analogy borrowed from the field of
structural biology.
The first crystal structure of an integral membrane protein
(non-model protein) dates from 1985 [14]. By then, almost
three decades after Kendrew, Phillips and others reported the
first structure of a protein, myoglobin [15], protein crystallography had produced hundreds of crystal structures of soluble
(model) proteins. At present, the growth of crystals suitable
for X-ray diffraction studies remains the major bottleneck of
membrane protein structure determination. On the other
hand, the advent of structural genomics (SG) initiatives, which
seek to describe the three-dimensional structure of every protein
encoded by a given genome, has greatly accelerated the
structural elucidation of model proteins. As opposed to traditional
structural biology, the determination of a protein structure
through an SG effort often comes before anything is known
regarding the protein function. Conversely, non-SG structural
biology approaches focusing on non-model membrane proteins whose significance is already appreciated have immediate scientific impact, because the non-model protein
structure serves as a Rosetta Stone to interpret in structural terms a
number of detailed biochemical and biophysical studies
accumulated over the years. Similarly, the combined application of
multidisciplinary analytical methodologies to unravel the
biology and natural history of non-model organisms facilitates the comprehensive integration of proteomics data and
findings from diverse research areas across the biological
system.
Historically, research communities have focused on model
organisms to gain an insight into the general principles that
underlie various disciplines, such as genetics, development
and evolution. As a consequence, species that are not among
the handful of exhaustively studied model organisms have
often been ignored owing to the lack of tractable genetics and
searchable genomic databases. However, the situation is
changing: the number of sequenced genomes of
non-model organisms is growing faster than
ever, with no signs of deceleration [16]. Incorporating proteomics analyses in ecology and population studies of
non-model organisms, which are often characterized by
unique genomic and phenotypic features, offers
an exciting reverse-genetics avenue towards the study of
adaptation, trait evolution, and species divergence, including
ontogenetic changes [17] and the impact of non-genetic
mechanisms in the evolution of phenotypic diversity within
a lineage and the fixation of fitness-related traits [18], and for
addressing a wide range of old evolutionary questions from a
new perspective [19]. During the past few years, environmental physiologists and ecotoxicologists have made great progress in applying proteomics to the study of how organisms
respond to a changing environment and to pollution. The
contributions to a recent symposium on "Comparative Proteomics of Environmental and Pollution Stress" [20] offer the
novice a starting point for assessing the potential opportunities and challenges of using proteomics to generate novel hypotheses about how organisms interact with their environment.
The current Special Issue of the Journal of Proteomics
brings together a collection of proteomics studies carried out
in such diverse groups of non-model organisms as plants
(Mangifera indica and Malus domestica fruits; the arsenic
hyperaccumulator Pteris vittata; the nuclear
phosphoproteome of developing chickpea, Cicer arietinum,
seedlings; mechanisms of environmental stress response of
the important fiber crop, Gossypium spp. (cotton); and proteomics of seeds and needles of two Mediterranean tree species
of agronomic interest, Quercus ilex and Pinus radiata); marine
organisms (including population proteomics of the great
scallop, Pecten maximus; lipid accumulation under nitrogen
starvation in the domesticated oleaginous alga Tisochrysis
lutea, a non-model organism of major interest for biomass
feedstock, food and biofuel production; proteomics of dinoflagellates, producers of an essential component of the food
chain in the marine ecosystem and major causative species
of various shellfish poisonings; osmoregulation in the Japanese eel, Anguilla japonica, during acclimatization from seawater to freshwater conditions; proteome variance within
European whitefish, Coregonus lavaretus, populations adapted
to different salinity environments; proteomic characterization
of the hemolymph of Octopus vulgaris to understand the basis
of octopus tolerance-resistance to the protozoan parasite
Aggregata octopiana; proteomics investigation to disclose
mechanisms underlying the adaptive response of different
organisms to ecological and stress conditions); arthropods
(structure and post-translational modifications of a major
ampullate silk protein produced by Nephila spiders for use in
the construction of the frame and radii of orb webs, and as a
dragline to escape from predators; studies of the salivary
proteomes of phytotoxic Greenbug, Schizaphis graminum
Rondani, biotypes; first report on the proteome of the most
important Amblyomma tick species for their relevance as
vectors of zoonotic pathogens worldwide; elucidation of the
unexplored biodiversity of ant venom peptidomes, and their
application for chemotaxonomy); parasitic helminths (elucidation of the surface proteome involved in the interaction of
Dicrocoelium dendriticum with the host, representing potential
targets for intervention against parasitic helminths); prokaryotes (host's environmental signals sensed by a pathogenic
strain of Bacillus anthracis to regulate its expression of
virulence-related genes); bats (proteomics investigation of
the strategy adopted by torpid Myotis ricketti bats against brain
dysfunction during hibernation); and snakes (venomics and
functional characterization of the venom of the eastern coral
snake, Micrurus fulvius, responsible for numerous snake bites
in the southern United States; overview of the venom proteomes
of venomous snakes of Costa Rica, and preclinical
antivenomics analysis of the homologous and paraspecific
efficacy of a polyspecific antivenom; proteomics characterization of the ontogenetic and activity changes of the Chinese
short-tailed pitviper, Gloydius brevicaudus, venom; and proteomics assessment of the stability of snake venoms stored for
up to eight decades!).
Snake venomics and antivenomics form part of a
biology-driven conceptual framework to unveil the genesis
and natural history of venoms [13]. Identifying the molecular
basis of adaptations in natural populations is an important
yet largely unrealized goal in evolutionary biology. Such
information is of broad significance because it addresses
fundamental questions about the connection between genotype and phenotype for fitness-related traits, such as venom,
and more explicitly, the relative importance of structural
variation in proteins versus gene regulatory changes as the
basis for adaptive variation in phenotype. The importance of
gene regulation in natural populations has been inferred from
measurements of variation in gene expression using a variety
of techniques. The most widely used approach has been to
measure variation in mRNA levels for specific genes using
microarrays developed for model organisms [21]. While this
approach is powerful, allowing the simultaneous assessment
of transcript variation in many genes, it assumes that there is a close link between transcript abundance
and protein levels. Further, it requires the existence of a
microarray with homologous loci. For non-model organisms
where such resources do not exist, an alternative is to directly
measure variation in the amounts of specific proteins,
quantified using recently developed proteomics techniques.
Fortunately, deconstructing complex venom phenotypes is
now within the reach of proteomic technologies [13], and
understanding evolutionary trends across venoms, and their
within- and between-species toxicological and immunological divergences and similarities, is of applied importance for
generating broad-ranging polyspecific antivenoms with
which to fight the neglected pathology of snakebite
envenoming [22–24].
The application of the powerful Combinatorial Peptide Ligand
Library (CPLL) technique for the in-depth exploration of the trace
proteome of Champagne wines, and the proteomics-based
identification of the composition and manufacturing recipe of a
2500-year-old sourdough bread unearthed by archeological
excavations in a Subeixi cemetery in China, complete the
journey across a small region of the vast landscape of
non-model systems amenable to proteomics studies.
The community working on model organisms is growing
steadily and the number of model organisms for which
proteome data are being generated is continuously increasing.
A few years ago, the Journal of Proteomics devoted a Special
Issue to model organism proteomics [25]. More recently, a Human Proteome Organisation
(HUPO) initiative on model organism proteomes (iMOP, http://www.imop.uzh.ch)
was approved at the HUPO Ninth Annual
World Congress in Sydney, 2010 [26]. The iMOP initiative seeks
to promote proteomics within the model organism communities working on all non-human species, and to integrate and link
proteome and other organism-specific databases. Increased
interactions among model organism proteomics researchers
are also expected to foster the development of joint
resources and standards that should ultimately lead to
improved and innovative science.
Model organisms represent an important testing ground for
the development and validation of analytical approaches for
proteomics. The research community on non-model organisms
may also benefit from these advances, particularly when the work is performed
on close relatives of species with well-annotated reference sequences.
Large-scale whole-genome sequencing projects such
as the Genome 10K Project (https://genome10k.soe.ucsc.edu), a
proposal to obtain whole-genome sequences for 10,000 representative vertebrate species spanning evolutionary diversity
across living mammals, birds, nonavian reptiles, amphibians,
and fishes [27], aim to anchor the investigation of relatives in a
diversity of phylogenetic neighborhoods. However, researchers
of non-model organisms must also be well aware that, as
highlighted in the articles of this Special Issue, proteomics in
their non-model organisms of interest, although challenging, is
possible given sufficient human resources and commitment.
The time and effort invested in arranging this Special Issue will
have been worthwhile if reading it attracts new explorers to
delve into the largely unexplored field of proteomics of
non-model organisms.

REFERENCES
[1] Watson JD, Crick FHC. A structure for deoxyribose nucleic
acid. Nature 1953;171:737–8.
[2] International Human Genome Sequencing Consortium.
Initial sequencing and analysis of the human genome.
Nature 2001;409:860–921.
[3] Venter JC, et al. The sequence of the human genome. Science
2001;291:1304–51.
[4] Pfennig DW, Wund MA, Snell-Rood EC, Cruickshank T,
Schlichting CD, Moczek AP. Phenotypic plasticity's impacts
on diversification and speciation. Trends Ecol Evol
2010;25:459–67.
[5] Calvete JJ. The expanding universe of mass analyzer
configurations for biological analysis. Methods Mol Biol
2014;1072:61–81.
[6] Klose J. From 2-D electrophoresis to proteomics. Electrophoresis
2009;30:S142–9.
[7] James P. Protein identification in the post-genome era: the
rapid rise of proteomics. Q Rev Biophys 1997;30:279–331.
[8] Eyers CE, Gaskell S, editors. Quantitative proteomics. RSC
Publishing. ISBN 978-1-84973-808-8; 2014. p. 1–371.
[9] Hedges SB. The origin and evolution of model organisms. Nat
Rev Genet 2002;3:838–49.
[10] Ma B, Johnson R. De novo sequencing and homology
searching. Mol Cell Proteomics 2012;11 [O111.014902].
[11] Junqueira M, Spirin V, Balbuena TS, Thomas H, Adzhubei I,
Sunyaev S, et al. Protein identification pipeline for the
homology-driven proteomics. J Proteomics 2008;71:346–56.
[12] Armengaud J, Trapp J, Pible O, Geffard O, Chaumot A,
Hartmann EM. Non-model organisms, a species endangered
by proteogenomics. J Proteomics 2014;105:5–18.
[13] Calvete JJ. Next-generation snake venomics: protein-locus
resolution through venom decomplexation. Expert Rev
Proteomics Mar. 29 2014 [Epub ahead of print].
[14] Deisenhofer J, Epp O, Miki K, Huber R, Michel H. Structure of
the protein subunits in the photosynthetic reaction centre of
Rhodopseudomonas viridis at 3 Å resolution. Nature
1985;318:618–24.
[15] http://www.nobelprize.org/nobel_prizes/chemistry/laureates/1962/kendrew-lecture.pdf.
[16] Ellegren H. Genome sequencing and population genomics in
non-model organisms. Trends Ecol Evol 2014;29:51–63.
[17] Durban J, Pérez A, Sanz L, Gómez A, Bonilla F, Rodríguez S,
et al. Integrated omics profiling indicates that miRNAs are
modulators of the ontogenetic venom composition shift in
the Central American rattlesnake, Crotalus simus simus. BMC
Genomics 2013;14:234.
[18] Gibbs HL, Sanz L, Calvete JJ. Snake population venomics:
proteomics-based analyses of individual variation reveals
significant gene regulation effects on venom protein
expression in Sistrurus rattlesnakes. J Mol Evol 2009;68:113–25.
[19] Diz AP, Martínez-Fernández M, Rolán-Alvarez E. Proteomics
in evolutionary ecology: linking the genotype with the
phenotype. Mol Ecol 2012;21:1060–80.
[20] Tomanek L. Introduction to the symposium "Comparative
Proteomics of Environmental and Pollution Stress". Integr
Comp Biol 2012;52:622–5.
[21] Whitehead A, Crawford DL. Variation within and among
species in gene expression: raw material for evolution. Mol
Ecol 2006;15:1197–211.
[22] Warrell DA, Gutiérrez JM, Calvete JJ, Williams D. New
approaches & technologies of venomics to meet the challenge
of human envenoming by snakebites in India. Indian J Med
Res 2013;138:38–59.
[23] Gutiérrez JM, Solano G, Pla D, Herrera M, Segura Á, Villalta M,
et al. Assessing the preclinical efficacy of antivenoms: from
the lethality neutralization assay to antivenomics. Toxicon
2013;69:168–79.
[24] Williams DJ, Gutiérrez JM, Calvete JJ, Wüster W,
Ratanabanangkoon K, Paiva O, et al. Ending the drought: new
strategies for improving the flow of affordable, effective
antivenoms in Asia and Africa. J Proteomics 2011;74:1735–67.
[25] Ahrens CH, Schrimpf SP, Brunner E, Aebersold R. Model
organism proteomics. J Proteomics 2010;73:2051–3.
[26] Jones AM, Aebersold R, Ahrens CH, Apweiler R, Baerenfaller
K, Baker M, et al. The HUPO initiative on model organism
proteomes, iMOP. Proteomics 2012;12:340–5.
[27] Genome 10K Community of Scientists. A proposal to obtain
whole-genome sequence for 10,000 vertebrate species. J
Hered 2009;100:659–74.

Juan J. Calvete
Instituto de Biomedicina de Valencia CSIC, Jaime Roig 11,
46010 Valencia, Spain
Tel.: + 34 963391778.
E-mail address: jcalvete@ibv.csic.es.
