
As faculty of Weill Cornell Medical College in Qatar we are
committed to providing transparency for any and all external
relationships prior to giving an academic presentation.
I, Moncef LADJIMI
DO NOT have a financial interest
in commercial products or services.

PRINCIPLES OF BIOCHEMISTRY
SPRING 2015
(BIOMG 3350)
Lecture 3:

1/ Levels of Structure in Proteins:


From Primary to Quaternary.
2/ Protein Folding
Moncef LADJIMI
mol2007@qatar-med.cornell.edu
Office: C-169
Additional material for this lecture may be found in:
Lehninger's Biochemistry (6th ed.), chapter 3: p. 96-104, chapter 4: p. 115-146



PROTEINS: STRUCTURE, FUNCTION,


FOLDING

Learning goals:
Structure and properties of the peptide bond
Structural hierarchy in proteins
Structure and function of fibrous proteins
Structure analysis of globular proteins
Protein folding and denaturation

STRUCTURE OF PROTEINS
Unlike most organic polymers, protein
molecules adopt a specific three-dimensional
conformation.
This structure is able to fulfill a specific
biological function
This structure is called the native fold
The native fold has a large number of favorable
interactions within the protein
There is a cost in conformational entropy of
folding the protein into one specific native fold

FAVORABLE INTERACTIONS
IN PROTEINS
Hydrophobic effect
Release of water molecules from the structured solvation layer around
the molecule as the protein folds increases the net entropy

Hydrogen bonds
Interaction of N-H and C=O of the peptide bond leads to local regular
structures such as α-helices and β-sheets

Van der Waals interactions
Medium-range weak attraction between all atoms contributes
significantly to the stability in the interior of the protein

Electrostatic interactions
Long-range strong interactions between permanently charged groups
Salt bridges, especially those buried in the hydrophobic environment, strongly
stabilize the protein

THE THREE-DIMENSIONAL (3D) STRUCTURE OF A PROTEIN


IS STABILIZED BY WEAK INTERACTIONS
The 3D structure of Chymotrypsin

FOUR LEVELS OF STRUCTURE


IN PROTEINS

Levels of structure in proteins. The primary structure consists of a sequence of amino
acids linked together by peptide bonds and includes any disulfide bonds. The resulting
polypeptide can be arranged into units of secondary structure, such as an α helix. The helix is
a part of the tertiary structure of the folded polypeptide, which is itself one of the subunits that
make up the quaternary structure of the multisubunit protein, in this case hemoglobin.

STRUCTURE
OF THE PEPTIDE
BOND
Structure of the protein is partially dictated
by the properties of the peptide bond
The peptide bond is a resonance hybrid of two
canonical structures
The resonance causes the peptide bonds
to be less reactive compared to esters, for
example
to be quite rigid and nearly planar
to exhibit a large dipole moment in the favored
trans configuration

RESONANCE
IN THE PEPTIDE BOND

The carbonyl oxygen has a partial negative charge and the amide nitrogen a
partial positive charge, setting up a small electric dipole.
Virtually all peptide bonds in proteins occur in this trans configuration (99.95%).
The exception is the peptide bond involving the imino nitrogen of proline, which occurs
in both the trans (96%) and the cis (4%) configurations (many of the cis bonds occur in β-turns).
Proline isomers
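
The trans/cis percentages quoted above can be turned into a rough free-energy difference. The sketch below assumes that the percentages are equilibrium populations at 25 °C and applies ΔG° = −RT ln K. The function name and the temperature are illustrative choices, not part of the lecture.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def cis_trans_delta_g(frac_trans, frac_cis, temperature_k=298.15):
    """Free-energy cost (kJ/mol) of the cis form relative to trans.

    Assumption (not from the lecture): the two fractions are equilibrium
    populations, so K = [cis]/[trans] and dG = -RT ln K.
    """
    k_eq = frac_cis / frac_trans
    return -R * temperature_k * math.log(k_eq) / 1000.0

# Populations taken from the slide
print(round(cis_trans_delta_g(0.9995, 0.0005), 1))  # non-proline peptide bonds
print(round(cis_trans_delta_g(0.96, 0.04), 1))      # X-Pro peptide bonds
```

Under these assumptions the cis form is penalized far less for X-Pro bonds than for other peptide bonds, which is why cis-proline is seen at a measurable frequency.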

THE RIGID PEPTIDE PLANE AND


THE PARTIALLY FREE ROTATIONS
Rotation around the peptide bond is not permitted
Rotation around bonds connected to the alpha
carbon is permitted
φ (phi): angle around the α-carbon-amide
nitrogen bond
ψ (psi): angle around the α-carbon-carbonyl
carbon bond
In a fully extended polypeptide, both ψ and φ are
180°

THE POLYPEPTIDE IS MADE UP OF A
SERIES OF PLANES LINKED
AT α CARBONS

Three bonds separate sequential α carbons in a polypeptide chain. The N-Cα and
Cα-C bonds can rotate, described by dihedral angles designated φ and ψ,
respectively. The peptide C-N bond is not free to rotate. Other single bonds in the
backbone may also be rotationally hindered, depending on the size and charge of
the R groups.
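
A dihedral angle such as φ or ψ is computed from the coordinates of four consecutive backbone atoms. The following is a minimal sketch in plain Python. The function name and the toy coordinates are illustrative assumptions, and real coordinates would come from a structure file.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180..180) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components of b0 and b2 that lie along the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Toy geometry: the outer atoms lie on opposite sides of the central bond
# (trans arrangement), as in a fully extended chain.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

For residue i, φ would be obtained from the atoms C(i-1), N(i), Cα(i), C(i), and ψ from N(i), Cα(i), C(i), N(i+1).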

DISTRIBUTION OF φ
AND ψ DIHEDRAL ANGLES
Some φ and ψ combinations are very unfavorable because
of steric crowding of backbone atoms with other atoms in
the backbone or side chains
Some φ and ψ combinations are more favorable because of
the chance to form favorable H-bonding interactions along the
backbone
A Ramachandran plot shows the distribution of φ and ψ
dihedral angles that are found in a protein
shows the common secondary structure elements
reveals regions with unusual backbone structure

THE RAMACHANDRAN PLOT


Allowed values (blue) of the φ and ψ dihedral angles for L-alanine residues
(theoretical)
Peptide conformations are defined by the
values of φ and ψ.
- dark blue represent conformations that
involve no steric overlap and thus are
fully allowed;
- medium blue indicates conformations
allowed at the extreme limits for
unfavorable atomic contacts;
- the lightest blue indicates conformations
that are permissible if a little flexibility is
allowed in the dihedral angles.
- The white regions are conformations
that are not allowed.
Conformations deemed possible are those
that involve little or no steric interference,
based on calculations using known
van der Waals radii and dihedral angles

SECONDARY STRUCTURES
Secondary structure refers to a local spatial
arrangement of the polypeptide backbone
Two regular arrangements are common:
The α helix
stabilized by hydrogen bonds between nearby
residues

The β sheet
stabilized by hydrogen bonds between adjacent
segments that may not be nearby

Irregular arrangement of the polypeptide chain is


called the random coil

THE α HELIX
Helical backbone is held together by hydrogen
bonds between the backbone amides of residues n
and n+4
Right-handed helix with 3.6 residues (5.4 Å) per
turn
Peptide bonds are aligned roughly parallel with
the helical axis
Side chains point out and are roughly
perpendicular with the helical axis
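
The helix parameters above (3.6 residues and 5.4 Å per turn, hydrogen bonds from residue n to n+4) fix a few derived quantities. The sketch below is illustrative arithmetic only, and the function and key names are made up for this example.

```python
RESIDUES_PER_TURN = 3.6   # from the slide
PITCH_ANGSTROM = 5.4      # rise along the helix axis per full turn

def helix_geometry(n_residues):
    """Derived quantities for an ideal alpha helix of n_residues."""
    rise = PITCH_ANGSTROM / RESIDUES_PER_TURN  # Angstrom per residue
    return {
        "rise_per_residue": rise,
        "length": n_residues * rise,
        "turns": n_residues / RESIDUES_PER_TURN,
        # one n -> n+4 hydrogen bond for every residue that has a partner
        "h_bonds": max(n_residues - 4, 0),
    }

print(helix_geometry(36))
```

The rise per residue, 5.4 / 3.6 = 1.5 Å, is the number worth remembering. The first and last four residues of a helix cannot all take part in n to n+4 hydrogen bonds, which is why the count is n - 4 rather than n.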

THE α HELIX

Models of the α helix, showing different aspects of its structure

THE α HELIX: TOP VIEW
The inner diameter of the helix (no side chains) is
about 4-5 Å
Too small for anything to fit inside
The outer diameter of the helix (with side chains)
is 10-12 Å
Happens to fit well into the major groove of
dsDNA
Residues 1 and 8 align nicely on top of each other
What kind of sequence gives an α helix with
one hydrophobic face?
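
One way to approach the question above is a helical wheel: with 3.6 residues per turn, each residue is rotated 100° about the helix axis relative to the previous one. The sketch below lists which positions fall on the same face as residue 1. The 60° half-width used to define a "face" is an arbitrary assumption for illustration.

```python
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn

def wheel_angle(position):
    """Angle (degrees) of a 1-based residue position on the helical wheel."""
    return ((position - 1) * DEGREES_PER_RESIDUE) % 360.0

def same_face_positions(n_residues, half_width=60.0):
    """Positions whose wheel angle lies within half_width of residue 1."""
    positions = []
    for pos in range(1, n_residues + 1):
        angle = wheel_angle(pos)
        offset = min(angle, 360.0 - angle)  # angular distance from residue 1
        if offset <= half_width:
            positions.append(pos)
    return positions

print(wheel_angle(8))
print(same_face_positions(8))
```

With these assumptions positions 1, 4, 5 and 8 share a face, so a sequence with hydrophobic residues at i, i+3, i+4 and i+7 and polar residues elsewhere is expected to give a helix with one hydrophobic side.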

AMINO ACID SEQUENCE AFFECTS


HELIX STABILITY
Not all polypeptide sequences adopt α-helical structures
Small hydrophobic residues such as Ala and Leu are
strong helix formers
Pro acts as a helix breaker because the rotation around
the N-Cα bond is impossible
Gly acts as a helix breaker because the tiny R-group
supports other conformations
Attractive or repulsive interactions between side chains
3 to 4 amino acids apart will affect helix formation


THE HELIX DIPOLE


Recall that the peptide bond
has a strong dipole moment
Carbonyl O negative
Amide H positive
All peptide bonds in the
helix have a similar
orientation
The helix has a large
macroscopic dipole moment
Negatively charged residues
often occur near the positive
end of the helix dipole

The electric dipole of a
peptide bond is transmitted
along an α-helical segment
through the intrachain
hydrogen bonds, resulting in
an overall helix dipole. In this
illustration, the amino and
carbonyl constituents of
each peptide bond are
indicated by + and −
symbols, respectively. Non-hydrogen-bonded amino and
carbonyl constituents of the
peptide bonds near each
end of the α-helical region
are shown in red.

β SHEETS
The planarity of the peptide bond and tetrahedral
geometry of the α-carbon create a pleated sheet-like structure
Sheet-like arrangement of backbone is held
together by hydrogen bonds between the backbone
amides in different strands
Side chains protrude from the sheet alternating in
up and down direction

PARALLEL AND ANTIPARALLEL
β SHEETS
Parallel or antiparallel orientation of two
chains within a sheet is possible
In parallel β sheets the H-bonded strands run
in the same direction
Resulting in bent H-bonds (weaker)

In antiparallel β sheets the H-bonded strands
run in opposite directions
Resulting in linear H-bonds (stronger)

β SHEETS

The β-conformation organizes polypeptide chains into sheets

β TURNS
β turns occur frequently whenever strands in β sheets
change direction
The 180° turn is accomplished over four amino acids
The turn is stabilized by a hydrogen bond from a
carbonyl oxygen to the amide proton three residues down
the sequence
Proline in position 2 or glycine in position 3 is
common in β turns

β TURNS
β-turns (or hairpin loops) are
connecting regions
between secondary structure
elements
Type I and type II turns are
most common; type I turns
occur more than twice as
frequently as type II.
Type II turns usually have
Gly as the third residue.
Note the hydrogen bond
between the peptide groups
of the first and fourth
residues of the bends.
(Individual amino acid
residues are framed by large
blue circles.)

THE RAMACHANDRAN PLOT OF PROTEIN STRUCTURES


(experimental)

The values of φ and ψ for various allowed
secondary structures are overlaid on the plot of
theoretically allowed conformations.
Although left-handed helices extending over
several amino acid residues are theoretically
possible, they have not been observed in
proteins.
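
To make the overlay concrete, a measured (φ, ψ) pair can be assigned to the nearest idealized secondary structure. The sketch below uses commonly tabulated ideal angles for the right-handed α helix and for β strands. These reference values, the 40° cutoff and all names are assumptions for illustration, not taken from the slide.

```python
import math

# Idealized (phi, psi) values in degrees; approximate textbook figures,
# used here as assumptions.
IDEAL_ANGLES = {
    "alpha helix (right-handed)": (-57.0, -47.0),
    "beta strand (antiparallel)": (-139.0, 135.0),
    "beta strand (parallel)": (-119.0, 113.0),
}

def _angular_diff(a, b):
    """Smallest difference between two angles, accounting for wrap-around."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def nearest_conformation(phi, psi, cutoff=40.0):
    """Name of the closest ideal conformation, or 'other' beyond the cutoff."""
    best_name, best_dist = "other", float("inf")
    for name, (phi0, psi0) in IDEAL_ANGLES.items():
        dist = math.hypot(_angular_diff(phi, phi0), _angular_diff(psi, psi0))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= cutoff else "other"

print(nearest_conformation(-60.0, -45.0))
print(nearest_conformation(-140.0, 130.0))
print(nearest_conformation(60.0, 45.0))
```

This is only a nearest-point lookup. Real assignment tools also use hydrogen-bond patterns, so the result should be read as a rough region label on the Ramachandran plot.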

The values of φ and ψ for all the amino acid
residues except Gly in the enzyme pyruvate
kinase are overlaid on the plot of
theoretically allowed conformations.
The small, flexible Gly residues were
excluded because they frequently
fall outside the expected (blue) ranges.

CIRCULAR DICHROISM (CD)
ANALYSIS
CD measures the molar absorption difference of left-
and right-circularly polarized light: Δε = εL − εR
Chromophores in the chiral environment produce
characteristic signals
CD signals from peptide bonds depend on the chain
conformation

COMMON SECONDARY STRUCTURES IN PROTEINS


CAN BE ASSESSED
BY CIRCULAR DICHROISM SPECTROSCOPY
Spectra of polylysine
entirely as α helix, as
β conformation,
or as a denatured,
random coil
The CD spectrum for
a given protein can
provide a rough
estimate for the
fraction of the protein
made up of the two
most common
secondary structures

PROTEIN TERTIARY STRUCTURE


Tertiary structure refers to the overall spatial arrangement of
atoms in a protein
Stabilized by numerous weak interactions between amino acid
side chains.
- Largely hydrophobic and polar interactions
- Can be stabilized by disulfide bonds
Interacting amino acids are not necessarily next to each other in
the primary sequence.
Two major classes
- Fibrous with polypeptide chains arranged in long
strands or sheets. Fibrous proteins are insoluble in water.
- Globular with polypeptide chains folded into a
spherical or globular shape

FIBROUS PROTEINS:
FROM STRUCTURE TO FUNCTION
Fibrous proteins are adapted for a structural function

FIBROUS PROTEINS: α-KERATIN
(FOUND IN HAIR, FEATHERS AND NAILS)
Figure labels: protofilament (4 chains); protofibril (8 chains); intermediate filament (4 protofibrils = 32 chains)

Hair α-keratin is an elongated α helix with thicker elements
near the amino and carboxyl termini.
Pairs of these helices are interwound in a left-handed
sense to form two-chain coiled coils.
These then combine in higher-order structures called
protofilaments and protofibrils.
About four protofibrils (32 strands of α-keratin in all)
combine to form an intermediate filament.

Structure of hair. A hair is an array of
many α-keratin filaments, made up of the
substructures shown on the left.

CHEMISTRY OF PERMANENT WAVING OF HAIR
(AN OXIDATION-REDUCTION REACTION OF KERATIN)

STRUCTURE OF COLLAGEN
Collagen is an important constituent of connective
tissue: tendons, cartilage, bones, cornea of the
eye
Each collagen chain is a long Gly- and Pro-rich
left-handed helix
Three collagen chains intertwine into a right-handed superhelical triple helix
The triple helix has higher tensile strength than a
steel wire of equal cross section
Many triple-helices assemble into a collagen fibril

FIBROUS PROTEINS: COLLAGEN


(FOUND IN TENDONS AND BONE MATRIX)
Collagen Fibrils

(a) The repeating
tripeptide sequence Gly-X-Pro or Gly-X-4-Hyp of
the α chain of collagen
(about 1,000 residues)
adopts a left-handed
helical structure with
three residues per turn
(c) Three of these helices (shown
here in gray, blue, and purple)
wrap around one another with a
right-handed twist

Collagen (Mr 300,000) is a rod-shaped molecule, about 3,000 Å
long and only 15 Å thick.

4-HYDROXYPROLINE IN COLLAGEN
Forces the proline ring into a favorable pucker

Offers more hydrogen bonds between the three
strands of collagen
The post-translational processing is catalyzed by
prolyl hydroxylase and requires α-ketoglutarate,
molecular oxygen, and ascorbate (vitamin C)

VITAMIN C IN PROLYL 4-HYDROXYLASE
RESTORES Fe2+ STATE
Reactions catalyzed by prolyl 4-hydroxylase:
(a) The normal reaction, coupled to
proline hydroxylation, which does
not require ascorbate, leads to the
oxidation of Fe2+. (The fate of the
two oxygen atoms from O2 is
shown in red.)
(b) The uncoupled reaction, in which
α-ketoglutarate is oxidatively
decarboxylated without
hydroxylation of proline,
regenerates Fe2+ in the presence
of ascorbate, which is consumed
stoichiometrically in this process
as it is converted to
dehydroascorbate.

SILK FIBROIN
Fibroin is the main protein in silk from moths and
spiders
Antiparallel β sheet structure
Small side chains (Ala and Gly) allow the close
packing of sheets
Structure is stabilized by
hydrogen bonding within sheets
Van der Waals interactions between sheets

SILK FIBROIN

The fibers in silk cloth and in a spider web are made up of the protein fibroin (made of layers of
antiparallel β sheets rich in Ala and Gly residues). The small side chains interdigitate and allow
close packing of the sheets, as shown in the ball-and-stick view.

SPIDER SILK
Used for webs, egg sacs, and
wrapping the prey
Extremely strong material
stronger than steel
can stretch a lot before
breaking
A composite material
crystalline parts (fibroin-rich)
rubber-like stretchy parts

Strands of fibroin (blue) emerge from


the spinnerets of a spider.

WATER-SOLUBLE
GLOBULAR PROTEINS
Globular protein structures are compact and varied
Approximate dimensions of human
serum albumin (Mr 64,500),
585 residues in a single chain, if it
occurred entirely in extended
conformation or as an α helix
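
The dimensions referred to above can be estimated from the rise per residue. The sketch below uses 1.5 Å per residue for an α helix (5.4 Å / 3.6 residues) and assumes roughly 3.5 Å per residue for a fully extended chain. The second value is an approximate figure brought in for illustration, not a number from the slide.

```python
ALBUMIN_RESIDUES = 585            # from the slide
RISE_ALPHA_HELIX = 1.5            # Angstrom per residue (5.4 / 3.6)
RISE_EXTENDED = 3.5               # Angstrom per residue, assumed approximate value

def chain_length(n_residues, rise_per_residue):
    """End-to-end length (Angstrom) of a chain in one regular conformation."""
    return n_residues * rise_per_residue

print(chain_length(ALBUMIN_RESIDUES, RISE_ALPHA_HELIX))
print(chain_length(ALBUMIN_RESIDUES, RISE_EXTENDED))
```

Either hypothetical form would be many hundreds of Å long, far more elongated than the compact globular protein, which is the point of the comparison.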

Tertiary structure of sperm whale myoglobin.


(a) The polypeptide backbone in a ribbon
representation, which highlights regions of
secondary structure. The α-helical regions are
evident.
(b) Surface contour image; this is useful for
visualizing pockets in the protein where other
molecules might bind.

GLOBULAR PROTEINS HAVE A VARIETY


OF TERTIARY STRUCTURES

MOTIFS (FOLDS)
Specific arrangement of several secondary
structure elements (generally fewer than 100
amino acids)
All alpha-helix
All beta-sheet
Both

Motifs can be found as reoccurring structures


in numerous proteins
Proteins are made of different motifs folded
together

EXAMPLES OF PROTEIN MOTIFS

Helix-loop-helix motif
Found in DNA-binding proteins (Cro repressor)

Helix-loop-helix (EF hand) motif
Found in calcium-binding proteins (troponin C)

EXAMPLES OF PROTEIN MOTIFS

motif

Hairpin motif
found in snake/scorpion venom toxin

Antiparallel β motif found
in porins

β-α-β motif
In triose phosphate isomerase

DOMAINS (MULTIPLE MOTIFS)


A domain is a part of a polypeptide (generally more than
100-150 amino acids), consisting of one or multiple
motifs, forming a structural unit that:
Is independently stable thermodynamically
could undergo movements as a single entity with respect to the
entire protein
can fold/unfold independently relative to the entire protein (a
folding unit)
can evolve separately from the rest of the protein (an
evolutionary unit)
can have its own function (a functional unit)

SINGLE MOTIFS COMBINE TO FORM COMPLEX MOTIFS


AND COMPLEX MOTIFS COMBINE TO FORM DOMAINS

Domains in Phosphofructokinase

Motifs and Domains in -crystallin

Domains in troponin C. This calcium-binding
protein of muscle has two separate
calcium-binding domains, indicated in blue
and purple.

LARGE POLYPEPTIDE CHAINS FOLD INTO


SEVERAL STRUCTURAL DOMAINS

Domains as independent units of structure, folding, function and evolution

PROTEIN MOTIFS ARE THE BASIS FOR


PROTEIN STRUCTURAL CLASSIFICATION
All α proteins

PROTEIN MOTIFS ARE THE BASIS FOR


PROTEIN STRUCTURAL CLASSIFICATION
All β proteins

PROTEIN MOTIFS ARE THE BASIS FOR


PROTEIN STRUCTURAL CLASSIFICATION
α/β proteins

PROTEIN TERTIARY
AND QUATERNARY STRUCTURE
Tertiary structure is the overall three-dimensional
arrangement of all atoms in a protein. It can be made of a single
domain or multiple domains, linked together by the peptide
bonds (covalent) within the same polypeptide chain.
Quaternary structure is the arrangement of several protein
tertiary structures (protein subunits) in the three-dimensional
complex, for proteins that contain 2 or more separate
polypeptide chains (or subunits). Subunits within a
quaternary structure are generally not associated through
covalent bonds, but through weak interactions (H-bonds,
ionic interactions, hydrophobic interactions, van der Waals
interactions).

PROTEIN STABILITY
AND FOLDING
A protein's function depends on its 3D structure
Loss of structural integrity with accompanying loss of
activity is called denaturation
Proteins can be denatured by:
heat or cold
pH extremes
organic solvents
chaotropic agents: urea and guanidinium
hydrochloride

PROTEIN FOLDING:
THE SECOND HALF OF THE GENETIC CODE

First half of the genetic code
(from RNA to a protein sequence)

Unfolded protein -> Folded protein

Second half of the genetic code
(folding according to the Anfinsen principle and other factors such as
molecular chaperones)

LOSS OF PROTEIN STRUCTURE (DENATURATION)


RESULTS IN LOSS OF FUNCTION
Thermal denaturation of horse
apomyoglobin (myoglobin without the
heme prosthetic group) and disulfide-intact ribonuclease A

Denaturation of disulfide-intact
ribonuclease A by guanidine
hydrochloride (GdnHCl), monitored
by circular dichroism.

Transition from the folded to the unfolded state is abrupt (a melting temperature, or Tm , can be
defined), suggesting cooperativity in the unfolding process
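
The abrupt transition and the meaning of Tm can be illustrated with a simple two-state model (native and unfolded states only). The sketch below assumes a temperature-independent unfolding enthalpy and ignores the heat-capacity change. The Tm and ΔH values are hypothetical numbers chosen for illustration, not data for any protein in the figure.

```python
import math

R = 8.314e-3  # gas constant, kJ/(mol*K)

def fraction_unfolded(temp_k, tm_k, delta_h_kj):
    """Fraction of unfolded protein in a two-state N <-> U model.

    Assumes dG_unfold(T) = dH * (1 - T/Tm), i.e. constant dH and dS.
    """
    delta_g = delta_h_kj * (1.0 - temp_k / tm_k)
    k_unfold = math.exp(-delta_g / (R * temp_k))
    return k_unfold / (1.0 + k_unfold)

# Hypothetical protein: Tm = 330 K, dH of unfolding = 400 kJ/mol
for temp in (310, 320, 325, 330, 335, 340, 350):
    print(temp, round(fraction_unfolded(temp, 330.0, 400.0), 3))
```

In this model half of the molecules are unfolded at Tm, and a larger ΔH gives a steeper, more cooperative transition.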

RENATURATION OF UNFOLDED, DENATURED


RIBONUCLEASE
RNase A:
A small, single-domain protein of
124 amino acid residues (14
kDa), having 4 S-S bonds, that
catalyzes the hydrolysis of single-stranded RNA.
Urea denatures the ribonuclease,
and 2-mercaptoethanol
(HOCH2CH2SH) reduces, and
thus cleaves, the disulfide bonds
to yield eight Cys residues.
When urea and 2-mercaptoethanol are removed,
the protein spontaneously
refolds, and the correct disulfide
bonds are re-formed
Denaturation (or unfolding) and
renaturation (or folding) are
reversible and cooperative
processes.

The amino acid
sequence of a
polypeptide chain
contains all the
information needed
for the formation of a
unique, biologically
active, 3D structure.
Anfinsen principle:
The amino acid
sequence of a protein
alone determines its
native conformation
(or tertiary structure)

THE DRIVING FORCE OF PROTEIN FOLDING


(THE HYDROPHOBIC EFFECT)

Competition between
self-interactions within the
protein and interactions
with water drives protein
folding.

THE HYDROPHOBIC EFFECT


The hydrophobic effect, also called
the hydrophobic collapse, brings about
the clustering of hydrophobic side chains
from diverse parts of the polypeptide chain
and causes the polypeptide to become
compact and start folding.
Thus, the non-polar amino acid side
chains are buried on the inside of the
protein to form a hydrophobic core that
is hidden from water, and the polar
amino acid side chains tend to gather on
the outside of the protein where they can
interact with water.

HOW CAN PROTEINS FOLD SO FAST?


Proteins fold to the lowest-energy fold on the
microsecond-to-second time scale. How can
they find the right fold so fast?
Protein folding cannot occur by randomly trying
every conformation until the lowest-energy one is
found: the search would take astronomically long
(Levinthal's paradox)
Search for the minimum is not random because
the direction toward the native structure is
thermodynamically most favorable

HOW DO PROTEINS FOLD?


In living cells, proteins are assembled from amino acids at a very high rate
(E. coli can make a complete, biologically active protein of 100 amino acid
residues in about 5 seconds). Folding times vary from a few microseconds to
seconds.
How does such a polypeptide arrive at its native conformation?
If each amino acid residue of this 100-residue protein could take up 10
different conformations on average, this would give 10^100 polypeptide
conformations.
If the protein folded spontaneously by a random process in which it tried all
possible conformations, and if each conformation were sampled in the shortest
time possible, 10^-13 second (the time required for a single molecular vibration), it would
take about 10^77 years to sample all possible conformations (for comparison,
the age of the universe is estimated to be 13.7 × 10^9 years).
Protein folding cannot be a completely random, trial and error, process,
and there must be shortcuts
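
The Levinthal estimate can be reproduced in a few lines. The sketch below uses the inputs stated on the slide (100 residues, 10 conformations per residue, one conformation every 10^-13 s); the function name and the year length are illustrative choices.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.7e9  # from the slide

def levinthal_search_years(n_residues, conformations_per_residue=10,
                           seconds_per_conformation=1e-13):
    """Years needed to sample every conformation one at a time."""
    total_conformations = conformations_per_residue ** n_residues
    return total_conformations * seconds_per_conformation / SECONDS_PER_YEAR

years = levinthal_search_years(100)
print("log10(years) =", round(math.log10(years), 1))
print("multiples of the age of the universe: 10^",
      round(math.log10(years / AGE_OF_UNIVERSE_YEARS)))
```

Multiplying the stated inputs directly gives a value on the order of 10^79 years, somewhat larger than the rounded figure quoted on the slide. On either number the search would outlast the age of the universe by dozens of orders of magnitude, so the conclusion is unchanged.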

PROTEIN FOLDING FOLLOWS
A DISTINCT PATH
A protein-folding pathway as defined
for a small protein (a hierarchical
pathway is shown, based on
computer modeling).
Small regions of secondary structure
are assembled first and then
gradually incorporated into larger
structures.
1/ Random coil
2/ Helices and sheets
3/ Long range interactions
4/ Continued process until folding is
complete.
The numbers indicate the amino acid residues in this
56-residue peptide that have acquired their final
structure in each of the steps shown.

THE THERMODYNAMICS OF PROTEIN FOLDING


DEPICTED AS A FREE-ENERGY FUNNEL
Folding is initiated by a
spontaneous collapse into a
compact state, mediated by
hydrophobic interactions.
The collapsed state, called a
molten globule, has high
secondary structure content, but the
amino acids are not yet in their final positions.
1/ Random coil
2/ Molten globule
3/ Final rearrangements

At the top, the number of conformations, and hence the conformational entropy, is large. Only a small fraction of
the intramolecular interactions that will exist in the native conformation are present.
As folding progresses, the thermodynamic path down the funnel reduces the number of states present
(decreases entropy), increases the amount of protein in the native conformation, and decreases the free energy.
Depressions on the sides of the funnel represent semi-stable folding intermediates, which in some cases may
slow the folding process.

GENERAL PRINCIPLES GOVERNING


PROTEIN FOLDING
The primary structure dictates the 3D structure. Most proteins fold
spontaneously; others need assistance from molecular chaperones.
Folding is driven by the hydrophobic effect. For soluble proteins, the
core of a protein is tightly packed and is hydrophobic. The surface of
a protein is polar and interacts with water.
Secondary structure elements are necessary to satisfy the hydrogen
bonding capabilities of the main chain atoms in the interior of the
protein
The native, folded state is the lowest free-energy state.
The stability of the folded structure is determined mainly by non-covalent
interactions, with disulfide bonds as a covalent contribution in some proteins
Interactions between charged residues (salt bridges)
Hydrogen bonds
van der Waals interactions (tight packing)
Disulfide bonds (covalent)

SUMMARY
In this lecture, we learned about:
the two most important secondary structures
α helices
β sheets

how properties and function of fibrous proteins are


related
Motifs and domains in globular proteins
Tertiary and quaternary structure
one of the largest unsolved puzzles in modern
biochemistry: how proteins fold

Remember to prepare for next lecture:


Lehninger's Biochemistry (6th ed.),
chapter 4: p. 131-133
chapter 5: p. 157-174
