Abstract
The green fluorescent protein (GFP) is an invaluable asset to molecular biology: it
fluoresces under UV light and has allowed microbiologists around the globe to accurately track
the locations and behaviors of viruses and proteins as they move through organisms. Examining
the nucleotide sequences of 7 different GFPs showed that variation among the source organisms
was quite high despite the shared green fluorescence. The sequences varied widely in base pair
length, but this was likely due to codon-based alignment issues and gaps in the sequence
data. The amino acid compositions showed a smaller, but still significant, range of variation.
The 2 subgroups, wild type and synthetic GFP, did not separate clearly, as modified GFPs for
bio-tagging can be made in a myriad of ways.
Introduction
The green fluorescent protein, or GFP for short, was first observed in and extracted from
the jellyfish Aequorea victoria (a hydroid medusa) in 1962. It can now also be obtained from a
myriad of other marine life and bacteria. GFP is a protein that exhibits bright green
fluorescence when exposed to light in the blue to ultraviolet range. It is commonly used as a
reporter of expression in cell and molecular biology, even in mammalian cells, including those
of Homo sapiens. In recent times, it has been used to create biosensors and has been introduced
into organisms through breeding, cell transformation, and injection of a viral vector (a method
of delivering genetic material into cells) to demonstrate the expression of a gene throughout
an organism. The GFP gene has been introduced into many bacteria, fish, plants, flies, and
mammalian cells, including human cells.
Many different mutants of GFP have been engineered thus far because of their high potential
for use. The first was a single point mutation that improved the spectral characteristics of
GFP, resulting in increased fluorescence. The improved protein matched the spectral
characteristics of fluorescein (FITC), a commonly available synthetic organic compound, which
greatly increased GFP's practical use in research. Many color mutants have been made as well,
such as those in blue, yellow, cyan, and red.
These color variants came about because no commonly used laser line lies near 395 nm, the
major excitation peak of wild-type GFP. Confocal microscopy excites GFP with a laser line,
specifically an argon laser at 488 nm. Very little to no fluorescence was seen with this setup
because expression of GFP was curtailed by aberrant mRNA splicing. Therefore, modified
versions of GFP were created to improve expression of the fluorescent protein. These
modifications resulted in brighter emission and in different color variants, e.g., in order
from shortest to longest emission wavelength: blue (FP/BFP), cyan (CFP), yellow (YFP), and red
(RFP). Green falls between CFP and YFP on the spectrum.
man-made bio-tagging GFP variants, the amino acid and nucleotide variances may be found
anywhere on the trees and composition charts, as many color-emitting GFPs have already been
created for fluorescent tagging.
Methods
The data were divided into 2 subgroups of green fluorescent proteins. One subgroup
consisted of naturally occurring forms of GFP found in several marine lifeforms, and the
other contained synthetic and other forms of GFP, such as probes and expression vectors,
that are commonly used to tag proteins.
The genes were selected to cover a variety of source organisms carrying GFP. The synthetic
ones were not chosen for any particular significance; they were selected based on their source
organisms. The final selection consisted of 5 wild types (those extracted from natural
sources) and 2 synthetics (those that were man-made). The list included Saccharomyces
cerevisiae, Aequorea victoria, the copepod Chiridius poppei, Pseudomonas aeruginosa,
Herbaspirillum frisingense, Expression Vector pPV472, and Synthetic Construct; this information
was taken from the Source Organism field of the NCBI records. The nucleotide sequences
were aligned by opening a saved session and choosing between the options at the prompt that
followed: Analyze or Align. Unknown parameters prevented the ClustalW protein sequence
alignments from being used for further analysis, so the nucleotide alignment was the only way
around the issue.
After the nucleotides were aligned, a neighbor-joining phylogenetic tree was generated
from the protein sequence data found under the sequence data explorer window. The tree
shows how similar the seven GFP sequences are and gives a general picture of how far, and
where, they deviate from one another.
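As a rough illustration of what a neighbor-joining step does with a table of pairwise distances, the sketch below picks the first pair of taxa to join using the standard Q criterion. It is not the procedure run by the software used in this report; the function name `nj_pair_to_join` and the toy distance matrix are assumptions made for the example, and the numbers are not GFP data.

```python
from itertools import combinations


def nj_pair_to_join(names, dist):
    """Return the pair of taxa that neighbor joining would merge first.

    `dist` is a symmetric matrix (list of lists) of pairwise distances.
    """
    n = len(names)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i, j in combinations(range(n), 2):
        # Q criterion: low values mean i and j are close to each other
        # relative to their distances to all other taxa.
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if best_q is None or q < best_q:
            best_pair, best_q = (names[i], names[j]), q
    return best_pair, best_q


# Toy distances for five hypothetical taxa (illustrative only, not the GFP data).
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, q_value = nj_pair_to_join(toy_names, toy_dist)
print(pair, q_value)
```

A full tree is built by repeating this step: the chosen pair is replaced by a new internal node, the distances to that node are recomputed, and the selection runs again on the smaller matrix.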
The composition tests were run separately, as they required different sequence data. The
nucleotide composition test required the DNA sequence to be open in the data explorer
analysis window, while the amino acid composition test drew its values from the protein
sequence data window. These tests produce an Excel spreadsheet, or other text-based data
sheet, for a comparative view of the variation in nucleotide and amino acid makeup among the 7
GFP sources.
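The composition tests amount to counting how often each nucleotide or amino acid occurs in a sequence. A minimal sketch of that calculation follows, assuming gaps and unknown symbols are ignored; the function name `composition_percent` and the short demo sequences are made up for illustration and are not taken from the seven GFP records.

```python
from collections import Counter


def composition_percent(sequence, alphabet):
    """Percentage of each symbol in `alphabet`, ignoring gaps and unknowns."""
    counts = Counter(ch for ch in sequence.upper() if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        return {symbol: 0.0 for symbol in alphabet}
    return {symbol: 100.0 * counts[symbol] / total for symbol in alphabet}


# Hypothetical demo sequences (not real GFP data); "-" marks alignment gaps.
demo_dna = "ATGAGT--AAAGGA"
demo_protein = "MSKGEELFTG"

nucleotide_comp = composition_percent(demo_dna, "ACGT")
amino_acid_comp = composition_percent(demo_protein, "ACDEFGHIKLMNPQRSTVWY")

for base, pct in nucleotide_comp.items():
    print(f"{base}: {pct:.1f}%")
```

Running the same function over each of the 7 sequences and writing the rows to a text file would give a table comparable to the spreadsheet described above.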
Using the PDB file acquired from the Protein Data Bank, a 3D model of the green
fluorescent protein was generated with the molecular visualization software Protein Insight.
After the PDB file was loaded into the software, all of the protein chains under the VP List
were selected to display the protein in green, as seen below. There were a few other, more
in-depth visualizations of the protein involving the surface and stick models, but they are not
shown here because they obscured the structure of the protein. The image below was generated
simply by clicking Save Workspace after all of the aforementioned steps were carried out.
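For readers without the visualization software, the chains contained in a PDB file can also be listed directly from the file text, since the chain identifier sits in column 22 of each ATOM and HETATM record. The sketch below assumes a standard fixed-width PDB file; `chain_ids` and `demo_atom_line` are hypothetical helpers written for this example, and the demo records are truncated placeholders rather than lines from the GFP structure.

```python
def chain_ids(pdb_text):
    """Chain identifiers in order of first appearance in ATOM/HETATM records."""
    seen = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")) and len(line) > 21:
            chain = line[21]  # column 22 of the fixed-width PDB record
            if chain not in seen:
                seen.append(chain)
    return seen


def demo_atom_line(serial, chain):
    # Hypothetical helper: builds a minimal, truncated ATOM record for the demo only.
    return f"ATOM  {serial:>5}  CA  GLY {chain}{serial:>4}"


demo_pdb = "\n".join(
    [demo_atom_line(1, "A"), demo_atom_line(2, "A"), demo_atom_line(3, "B")]
)
print(chain_ids(demo_pdb))
```

With a real file, the text would be read with `open(path).read()` and passed to `chain_ids` in the same way.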
Results
Sequences Tested
1. Saccharomyces cerevisiae: yeast
2. Aequorea victoria: bioluminescent hydrozoan jellyfish
3. Copepods: group of small crustaceans
4. Pseudomonas aeruginosa: common disease-causing bacterium
5. Herbaspirillum frisingense: nitrogen-fixing bacteria
6. Expression Vector pPV472
7. Synthetic Construct
Figure 3: Rectangular neighbor-joining phylogenetic tree, labeled with the GFP type followed by
the source organism on the right-hand side.
Figure 4: Pairwise distances between the amino acid sequences of the GFPs, with their average
Figure 5: Pairwise distances between the nucleotide sequences of the GFP genes, with their average
The results of these tests showed that the sequences in the two groups are similar in some
respects but quite different in others. The hypothesis, that there would be variation among the
source organisms because many are smaller organisms whose lower complexity makes any mutation
significant, is supported; however, it is unclear whether the complexity of the organism is the
root of this variation. GFP appears to change considerably across species. The difference
between the genes is considerable, as the base pair readings range from 48 to 112, which is a
large range. This range may largely be due to improper alignment of the stop codons during
ClustalW alignment; in addition, there were many blank spaces (gaps) in the protein sequence
data. The analyses of the amino acid compositions also showed that the sequences were not very
similar, as they all had differing values over a range that was large, albeit smaller than the
nucleotide range. The pairwise distances between the amino acid sequences were large, with an
average distance of 2.169.
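To make the pairwise distance figures easier to interpret, the sketch below computes the simplest such measure, the proportion of differing sites (p-distance) between aligned sequences, together with its average over all pairs. An uncorrected proportion cannot exceed 1, so an average such as 2.169 presumably comes from a corrected distance model; which model was used is not stated here, and the Poisson correction shown is only one common possibility. All function names and the three short aligned sequences are assumptions for illustration.

```python
import math
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing sites, skipping columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def poisson_distance(p):
    """Poisson-corrected distance for p < 1; grows past 1 as p approaches 1."""
    return -math.log(1.0 - p)


def mean_pairwise(seqs, distance=p_distance):
    """Average of `distance` over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical aligned protein fragments (not the GFP alignment).
aligned = ["MSKGEE", "MSRGEE", "M-KGDE"]
print(round(mean_pairwise(aligned), 4))
print(round(poisson_distance(p_distance(aligned[1], aligned[2])), 4))
```

Because gapped columns are skipped pair by pair, heavily gapped alignments leave few comparable sites, which is one way alignment problems can inflate or destabilize the distances.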
In conclusion, it appears that had the nucleotides been aligned correctly with their stop
codons, the nucleotide-based data would have better matched the consistency of the amino
acid-based data and calculations.
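One way to check whether stop codons line up is to translate each coding sequence in frame and see where translation ends. The following is a minimal sketch using the standard genetic code; it assumes the sequence starts in frame at its first base and that any gaps come in multiples of three, as in a codon-based alignment. The name `translate_to_stop` and the demo sequence are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second, third base over T, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_to_stop(dna):
    """Translate in frame from the first base, stopping at the first stop codon."""
    dna = dna.upper().replace("-", "")
    protein = []
    for start in range(0, len(dna) - 2, 3):
        # Codons containing ambiguous bases are reported as "X".
        amino = CODON_TABLE.get(dna[start:start + 3], "X")
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# Hypothetical coding fragment with a TAA stop codon before the trailing bases.
demo_cds = "ATGAGTAAAGGATAAGGG"
print(translate_to_stop(demo_cds))
```

Comparing the length of each translated protein with the length of its nucleotide sequence divided by three would show which sequences carry extra bases beyond the stop codon.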
Wild
3. Chain B A Gfp-Like Protein From Marine Copepod Chiridius Poppei. Source: Copepod
Chiridius Poppei
http://www.ncbi.nlm.nih.gov/protein/126030216
Copepods (meaning "oar-feet") are a group of small crustaceans found in the sea
and nearly every freshwater habitat.
Synthetic
6. GFP Expression vector pPV472. Caenorhabditis elegans Source: Expression Vector
pPV472
http://www.ncbi.nlm.nih.gov/protein/AGH70233.1
7. Synthetic Construct Source: Synthetic Construct
http://www.ncbi.nlm.nih.gov/protein/ADN93293.1
http://www.ncbi.nlm.nih.gov/protein/61680649