
Structure and Functionality of the Green Fluorescent Protein

Harrison Lu (undergrad), Braden Chudzik, Farrukh Mohiuddin

Abstract
The green fluorescent protein (GFP) is an invaluable asset to the field of molecular biology:
it fluoresces green under UV light and has allowed microbiologists around the globe to
accurately track the locations and behaviors of viruses and proteins as they move through
organisms. Examination of 7 different GFP sequences showed that variation among the wild-type
sequences was quite high despite their shared green fluorescent function. The sequences varied
widely in base pair length, but this was largely due to codon-based alignment problems and gaps
in the source sequence data. The amino acid compositions showed a smaller range of variation,
though it was still significant. The 2 subgroups, wild-type and synthetic GFP, did not separate
clearly, since modified GFPs for bio-tagging can be, and have been, made in a myriad of ways.

Introduction
The green fluorescent protein, or GFP for short, was first observed in and extracted from
the hydromedusa jellyfish Aequorea victoria in 1962. It can now also be obtained from a
myriad of other marine life and from bacteria. GFP exhibits bright green fluorescence when
exposed to light in the blue to ultraviolet range. It is commonly used as a reporter of gene
expression in cell and molecular biology, including in mammalian cells such as those of Homo
sapiens. More recently, it has been used to create biosensors and has been introduced into
organisms through breeding, cell transformation, and injection of a viral vector (a vehicle for
delivering genetic material into cells) to demonstrate the expression of a gene throughout an
organism. The GFP gene has been introduced into many bacteria, fish, plants, flies, and
mammalian cells, including human cells.
Many different mutants of the green fluorescent protein have been engineered so far
because of their high potential for use. The first was a single point mutation that improved
the spectral characteristics of GFP, resulting in increased fluorescence. This improvement
matched the spectral characteristics of fluorescein (FITC), a commonly available synthetic
organic compound, and therefore greatly increased GFP's practical use in research. Many color
mutants have been made as well, including blue, yellow, cyan, and red variants.
These variants came about partly because there is no commonly used laser line near 395
nm. Confocal microscopy uses a laser line for GFP excitation, specifically an argon laser at 488
nm. Very little to no fluorescence was seen when this was used because GFP expression was
curtailed by aberrant mRNA splicing. Modified versions of GFP were therefore created to improve
expression of the fluorescent protein. These modifications resulted in increased brightness and
in different color variants, in order from shortest to longest emission wavelength: blue (BFP),
cyan (CFP), yellow (YFP), and red (RFP). Green falls between CFP and YFP on the spectrum.

For general fluorescence microscopy (as opposed to confocal microscopy), FITC filter
sets have been used for viewing GFP. However, these filter sets, with excitation at 475-495 nm
and emission at 520-560 nm, are poorly suited to wild-type GFP. To alleviate this problem,
several more modified versions of GFP with increased fluorescence have been constructed.
Perhaps more important, the major excitation peak has been red-shifted to 490 nm while the
emission stays at 510 nm. This is preferable for use with FITC filter sets, as the modified
GFP has the same excitation range as FITC.
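The filter-matching argument can be illustrated with a very small check of whether an excitation peak falls inside a filter's pass band. This is only a sketch: the wavelengths are the ones quoted in this paper, while the helper name `in_band` and the idea of reducing a spectrum to a single peak are simplifying assumptions, since real excitation spectra are broad curves.

```python
# Hypothetical helper: does a single peak wavelength fall inside a filter band?
# Real spectra are broad; a peak-in-band test is only a crude proxy.

FITC_EXCITATION_BAND = (475, 495)  # nm, excitation range quoted in the text


def in_band(peak_nm, band):
    """Return True if peak_nm lies within the inclusive (low, high) band."""
    low, high = band
    return low <= peak_nm <= high


# 395 nm: the wavelength the text gives for wild-type excitation.
# 490 nm: the red-shifted excitation peak of the modified versions.
wild_type_ok = in_band(395, FITC_EXCITATION_BAND)
modified_ok = in_band(490, FITC_EXCITATION_BAND)
print(wild_type_ok, modified_ok)
```

Under these assumptions only the red-shifted peak lands inside the FITC excitation band, which is the point the paragraph above makes in words.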
GFP is also significant because of the complex nature and pathways of the many
thousands of proteins being expressed on a daily basis in any organism. This density of
different proteins makes tracking difficult for bioscientists, which is a serious obstacle when
hazardous protein behavior, as in disease and cancer, threatens an organism's health. GFP glows
bright green when exposed to certain UV light, which offers a potential long-term solution for
finding and tracking the pathways of targeted proteins. This development has transformed a
beautiful deep-sea light show into a bio-tagging tool.

Figure 1: An illustration of a San Diego beach drawn with living bacteria that express 8
different colors of fluorescent proteins, of which green recurs most often.

These research developments earned 3 remarkable scientists, Osamu Shimomura, Martin
Chalfie, and Roger Y. Tsien, the 2008 Nobel Prize in Chemistry. The prize of 10 million SEK,
the equivalent of 1.522 million USD, was shared equally among them for the discovery and
development of the green fluorescent protein. Fluorescent marking and bio-tagging are what
helped bring this protein to stardom in such a short time.
Before this form of bio-tagging came into the picture, scientists could only view cell
interactions in broad and vague terms. Viewing the work of a virus meant seeing only the
aftermath in the cell that had been infected, but now, with GFP and other fluorescent proteins,
they can for the first time watch the virus as it operates. The same approach allows molecular
biologists to track other proteins as they move throughout the organism. One group of
scientists at Oklahoma State University has used this technology to observe how a virus infects
potato crops and expects to learn more about how crops and animals are infected. Brainbow mice
are genetically altered mice whose brains glow with a rainbow of colors when viewed through a
fluorescence microscope. This is possible because genes for GFP and other fluorescent proteins
were inserted into their DNA such that the cells in the mice's brains produce enough of the
proteins to glow visibly.
Hypothesis: There will be high variation among the wild-type GFPs because their source
organisms are small and of relatively low complexity. In addition, because GFP in the wild
only needs to emit light, it has no need to be the same between organisms. As for the man-made
bio-tagging GFP variants, their amino acid and nucleotide variation may fall anywhere on the
trees and composition charts, as many color-emitting GFPs have already been created for
fluorescent tagging.

Methods
The data were divided into 2 subgroups of green fluorescent proteins. One subgroup was
dedicated to naturally occurring forms of GFP found in several marine lifeforms, and the other
contained synthetic and other forms of GFP, such as probe and expression vectors, that are
commonly used to tag proteins.
The genes were selected to cover a variety of organisms and for their GFP. The synthetic
ones were not chosen for any particular significance; they were selected based on their source
organisms. The final selection consisted of 5 wild types (those extracted from natural sources)
and 2 synthetics (those that were man-made). The list comprised Saccharomyces cerevisiae,
Aequorea victoria, the copepod Chiridius poppei, Pseudomonas aeruginosa, Herbaspirillum
frisingense, Expression Vector pPV472, and Synthetic Construct; this information was taken
from the Source Organism field of the NCBI records. The nucleotide sequences were aligned by
opening a saved session, which was followed by the prompt "Analyze or Align". Because of
unidentified parameter problems, the ClustalW protein sequence alignments could not be used
for further analysis, so aligning the nucleotide sequences was the only way around the issue.
After the nucleotides were aligned, a neighbor-joining phylogenetic tree was generated
from the analyzed protein sequence data found under the sequence data explorer window. The
tree shows how similar the seven GFP sequences are and gives a general picture of how far,
and where, they deviate from one another.
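To make the neighbor-joining step more concrete, the sketch below computes the Q matrix that neighbor joining uses to decide which two taxa to join first. It is an illustration of the general technique, not the routine inside the software used for this paper; the function names and the five-taxon distance matrix are invented for the example and are not the GFP data.

```python
def q_matrix(dist):
    """Compute the neighbor-joining Q matrix from a symmetric distance matrix.

    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
    return q


def closest_pair(dist):
    """Return the index pair (i, j), i < j, with the smallest Q value."""
    q = q_matrix(dist)
    n = len(dist)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return min(pairs, key=lambda p: q[p[0]][p[1]])


# Illustrative distances for five made-up taxa (not the GFP sequences).
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
first_join = closest_pair(example)
print(first_join)
```

A full neighbor-joining run would repeat this selection, replacing the joined pair with a new internal node and recomputing the distances until the tree is complete.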
The composition tests were run separately, as they required different sequence data to
compute from. The nucleotide composition test required the DNA sequences to be loaded in the
data explorer analysis window. The amino acid composition test drew its values from the protein
sequence data window. These tests produced an Excel spreadsheet, or other text-based data
sheet, for a comparative view of the variation in nucleotide and amino acid makeup among the 7
GFP sources.
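A composition test of this kind can be sketched in a few lines. The example below is a minimal illustration, not the software's own computation: the function name `composition` is hypothetical, gaps and unknown symbols are simply ignored, and the two demo strings are arbitrary fragments rather than any of the seven sequences.

```python
from collections import Counter


def composition(seq, alphabet):
    """Return the percentage of each symbol in `alphabet` found in `seq`.

    Gap characters and any symbol outside `alphabet` are ignored.
    """
    counts = Counter(ch for ch in seq.upper() if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        return {ch: 0.0 for ch in alphabet}
    return {ch: 100.0 * counts[ch] / total for ch in alphabet}


NUCLEOTIDES = "ACGT"
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Arbitrary fragments for illustration only; not the GFP records.
dna_demo = "AACCGG--TTAAAA"
protein_demo = "MKVLAAGE"

dna_pct = composition(dna_demo, NUCLEOTIDES)
protein_pct = composition(protein_demo, AMINO_ACIDS)
print(dna_pct)
print(protein_pct["A"])
```

Running the same function over each of the 7 sequences and writing the dictionaries out row by row would give a table comparable to the spreadsheet described above.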
Using a PDB file acquired from the Protein Data Bank, a 3D model of the green
fluorescent protein was generated with the molecular visualization software Protein Insight.
After the PDB file was loaded into the software, all of the protein chains under the VP List
were selected to give the protein its green color, as seen below. There were a few other,
more in-depth visualizations of the protein, involving the surface and stick models, but they
are not shown here because they obscured the structure of the protein. The image below was
generated simply by clicking Save Workspace after all of the aforementioned steps were carried
out.

Figure 2: Image of Green Fluorescent Protein generated via Protein Insight
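For readers without the visualization software, the chains in a PDB file can also be listed directly from the text of the file. The sketch below relies on the fixed-column PDB format, in which the chain identifier of an ATOM or HETATM record sits in column 22. The function names and the demo records are invented for illustration; they do not come from the structure file used for Figure 2.

```python
def chain_ids(pdb_lines):
    """Return the chain identifiers found in ATOM/HETATM records, in order.

    In the fixed-column PDB format the record name occupies columns 1-6
    and the chain identifier sits in column 22 (string index 21).
    """
    seen = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21]
            if chain not in seen:
                seen.append(chain)
    return seen


def demo_atom(serial, name, residue, chain, resseq):
    """Build a minimal ATOM line with correct column positions (demo only)."""
    return f"ATOM  {serial:>5} {name:<4} {residue:>3} {chain}{resseq:>4}"


# Invented records standing in for a real structure file.
demo = [
    "HEADER    DEMO STRUCTURE",
    demo_atom(1, " N", "MET", "A", 1),
    demo_atom(2, " CA", "MET", "A", 1),
    demo_atom(3, " N", "SER", "B", 1),
]
print(chain_ids(demo))
```

With a real file, the same function could be called on an open file handle, for example `chain_ids(open("structure.pdb"))`, where the file name is a placeholder.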

Results
Sequences Tested
1. Saccharomyces cerevisiae: yeast
2. Aequorea victoria: bioluminescent hydrozoan jellyfish
3. Copepods: group of small crustaceans
4. Pseudomonas aeruginosa: common disease causing bacterium
5. Herbaspirillum frisingense: nitrogen-fixing bacteria
6. Expression Vector pPV472
7. Synthetic Construct

Figure 3: Rectangular neighbor-joining phylogenetic tree. Each entry gives the GFP type, with
the source organism to its right.

Figure 4: Pairwise distances between the amino acid sequences of the GFPs, with their average

Figure 5: Pairwise distances between the nucleotide sequences of the GFP genes, with their average
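As a rough illustration of what a pairwise distance table and its average contain, the sketch below computes the simplest distance measure, the p-distance (the proportion of aligned sites that differ), for every pair of sequences and then averages the results. The choice of p-distance, the function names, and the demo alignment are all assumptions made for the example. Because a p-distance cannot exceed 1, the average of 2.169 reported in this paper was presumably computed under a different distance model, which is not stated.

```python
from itertools import combinations


def p_distance(a, b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def average_distance(aligned):
    """Mean p-distance over every unordered pair of sequences."""
    pairs = list(combinations(aligned.values(), 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Invented aligned fragments; the labels are placeholders, not the GFP records.
demo_alignment = {
    "seq1": "ACGTACGT",
    "seq2": "ACGTACGA",
    "seq3": "AC-TTCGA",
}
overall = average_distance(demo_alignment)
print(round(overall, 4))
```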

Figure 6: Nucleotide comparison charts

Figure 7: Amino acid comparison charts


Discussion

The results of these tests showed that the sequences in the two groups of genes are
similar in some respects but quite different in others. The hypothesis that the wild-type
sequences would vary, because many come from smaller organisms in which any mutation is
significant owing to their lower complexity, is supported; however, it is unclear whether the
complexity of the organism is the root of this variation. GFP appears to change considerably
across species. The difference between the two groups of genes is considerable, as the base
pair readings range from 48 to 112, which is a large range. This particularly large range
likely has to do with improper alignment of the stop codons during ClustalW alignment; in
addition, there were many blank spaces in the protein sequences. The analysis of the amino acid
compositions also showed that the sequences were not very similar, as they all had differing
values with a large range, albeit one smaller than the nucleotide range. The pairwise distances
for the amino acid sequences were large, with an average distance of 2.169.
In conclusion, had the nucleotides been aligned correctly with their stop codons, the
nucleotide-based data would likely have better matched the consistency of the amino acid-based
data and calculations.
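One way to catch this kind of problem before a codon-based alignment is to check each coding sequence for codon completeness. The sketch below is a minimal illustration under the standard genetic code; the function name `codon_report` and the demo sequences are hypothetical and are not drawn from the seven records.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def codon_report(cds):
    """Summarize whether a coding sequence looks codon-complete.

    Gap characters are removed first. If the length is not a multiple of
    three, the trailing partial codon is dropped before the stop checks.
    """
    seq = cds.upper().replace("-", "")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    internal_stops = [i for i, c in enumerate(codons[:-1]) if c in STOP_CODONS]
    return {
        "length": len(seq),
        "multiple_of_three": len(seq) % 3 == 0,
        "ends_with_stop": bool(codons) and codons[-1] in STOP_CODONS,
        "internal_stop_positions": internal_stops,
    }


# Invented sequences for illustration only.
complete = codon_report("ATGGCCAAATAA")     # in frame, ends with a stop codon
truncated = codon_report("ATGGCCAAATA")     # one base short, stop codon lost
early_stop = codon_report("ATGTAAGCCTGA")   # stop codon in the second position
print(complete)
print(truncated)
print(early_stop)
```

Sequences that fail these checks are candidates for trimming or re-downloading before alignment, since a missing or shifted stop codon can distort length and distance comparisons.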

Wild

1. RecName: Full=UV excision repair protein RAD23. Source: Saccharomyces cerevisiae
http://www.ncbi.nlm.nih.gov/protein/P32628.1
Saccharomyces cerevisiae is a species of yeast.

2. RecName: Full=Green fluorescent protein. Source: Aequorea victoria
http://www.ncbi.nlm.nih.gov/protein/1169893
Aequorea victoria, also sometimes called the crystal jelly, is a bioluminescent hydrozoan
jellyfish.

3. Chain B, A Gfp-Like Protein From Marine Copepod Chiridius Poppei. Source: Copepod
Chiridius poppei
http://www.ncbi.nlm.nih.gov/protein/126030216
Copepods (meaning "oar-feet") are a group of small crustaceans found in the sea and nearly
every freshwater habitat.

4. hypothetical protein V563_01252 Pseudomonas aeruginosa PAO1-GFP. Source:
Pseudomonas aeruginosa
http://www.ncbi.nlm.nih.gov/protein/611862750
Pseudomonas aeruginosa is a common bacterium that can cause disease in animals,
including humans.

5. green fluorescent protein. Source: Herbaspirillum frisingense
http://www.ncbi.nlm.nih.gov/protein/WP_006462513.1
Herbaspirillum frisingense (Herbaspirillum frisingense sp. nov.) is a nitrogen-fixing bacterium
that was found in C4-fibre plants such as prairie cordgrass (Spartina pectinata) and Chinese
silver grass.

Synthetic
6. GFP Expression vector pPV472. Caenorhabditis elegans Source: Expression Vector
pPV472
http://www.ncbi.nlm.nih.gov/protein/AGH70233.1
7. Synthetic Construct Source: Synthetic Construct
http://www.ncbi.nlm.nih.gov/protein/ADN93293.1
http://www.ncbi.nlm.nih.gov/protein/61680649

