
Protein structure and function

General amino acid structure and chemistry
Structure, characteristics and classification of 20 amino acids
Protein assembly and function
Concept of: primary structure, secondary structure, tertiary structure, quaternary structure, functional domain

Proteins are an important class of biological macromolecules present in all organisms, and are made up of elements such as carbon, hydrogen, nitrogen, oxygen and sulphur. All proteins are polymers of amino acids. The polymers, also known as polypeptides, consist of a sequence built from 20 different L-α-amino acids, also referred to as residues. For chains under 40 residues the term peptide is frequently used instead of protein. To be able to perform their biological function, proteins fold into one or more specific spatial conformations, driven by a number of non-covalent interactions such as hydrogen bonding, ionic interactions, van der Waals forces and hydrophobic packing. In order to understand the functions of proteins at a molecular level, it is often necessary to determine their three-dimensional structure. This is the topic of the scientific field of structural biology, which employs techniques such as X-ray crystallography or NMR spectroscopy to determine the structure of proteins.

Ionic bonds involve interactions between the oppositely charged groups of a molecule - for example the positively charged side chains of lysine and arginine, and the negatively charged carboxyl groups of glutamic and aspartic acid.

Hydrogen bonds are formed by "sharing" of a hydrogen atom between two electronegative atoms such as N and O.

van der Waals forces are very weak attractions (or repulsions) which occur between atoms at close range.

The hydrophobic amino acids of a protein will tend to cluster together, not as a result of attraction, but as a result of their repulsion by the hydrogen bonded water network in which the protein is dissolved. Hydrophobic regions of a protein will preferentially locate away from the surface of the molecule.

A certain number of residues is necessary to perform a particular biochemical function, and around 40-50 residues appears to be the lower limit for a functional domain size. Protein sizes range from this lower limit to several thousand residues in multi-functional or structural proteins. However, the current estimate for the average protein length is around 300 residues. Very large aggregates can be formed from protein subunits, for example many thousand actin molecules assemble into an actin filament.

Proteins play crucial roles in almost every biological process. They are responsible for a variety of physiological functions including:

Enzymatic catalysis - almost all biological reactions are enzyme catalyzed. Enzymes commonly increase the rate of a biological reaction by a factor of 10^6 or more. Several thousand enzymes have been identified to date.

Binding, transport and storage - small molecules are often carried by proteins in the physiological setting (for example, the protein hemoglobin is responsible for the transport of oxygen to tissues). Many drug molecules are partially bound to serum albumins in the plasma.

Molecular switching - conformational changes in response to pH or ligand binding can be used to control cellular processes.

Coordinated motion - muscle is mostly protein, and muscle contraction is mediated by the sliding motion of two protein filaments, actin and myosin.

Structural support - skin and bone are strengthened by the protein collagen.

Immune protection - antibodies are protein structures that are responsible for reacting with specific foreign substances in the body.
Generation and transmission of nerve impulses - some amino acids act as neurotransmitters (glutamate and aspartate in the brain, both non-essential amino acids), which carry signals from one nerve cell to another. Gamma-aminobutyric acid (GABA) is the major inhibitory neurotransmitter in the brain; glycine is an inhibitory neurotransmitter found in the brain stem and spinal cord. In addition, receptors for neurotransmitters, drugs, etc. are protein in nature. An example of this is the acetylcholine receptor, which is a protein structure embedded in the postsynaptic membrane.

Control of growth and differentiation - proteins can be critical to the control of growth, cell differentiation and expression of DNA. For example, repressor proteins may bind to specific segments of DNA, preventing expression and thus the formation of the product of that DNA segment. Also, many hormones and growth factors that regulate cell function, such as insulin or thyroid stimulating hormone, are proteins.

Amino acid structure and chemistry

The amino and carboxyl moieties in an amino acid are both attached to the same carbon, the alpha carbon; also located on the alpha carbon is an "R" group.
The α-C atom is bound to an amino group, a carboxyl group, a hydrogen and a side chain. An exception to this rule is proline, where the side chain is also bonded to the amino nitrogen, closing a ring. The nature of the R-group (called the side chain) determines the identity of a particular amino acid. It also determines the chemical properties of the α-amino acid and may be any one of the 20 different side chains. A total of 20 amino acids are used to make up proteins (some modified or otherwise unusual amino acids exist).

In solution at physiological pH (7.4), amino acids undergo an acid-base reaction to form zwitterions. In a zwitterion, the + and - charges cancel to give a molecule with a net charge of zero. This follows from the pKa values for a typical amino acid (glycine for example), which are 9.6 and 2.3 for the amino and carboxyl groups, respectively: at pH 7.4 the amino group is protonated and the carboxyl group is deprotonated.
If the pH of an amino acid solution is lowered significantly from 7.4, a species results in which the amine group has a positive charge, while the carboxyl is neutral. Likewise, if the pH is raised significantly from 7.4, a species results in which the amine group is neutral, while the carboxyl has a negative charge. Thus, the ionization state of amino acids is pH dependent.
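The pH dependence described above can be sketched with the Henderson-Hasselbalch relation. This is a minimal illustration, assuming the two glycine pKa values quoted in the text and treating the two groups as independent; the function names are hypothetical and chosen only for this example.

```python
# pKa values for glycine as quoted in the text
PKA_AMINO = 9.6
PKA_CARBOXYL = 2.3


def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of a group that carries its proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def glycine_net_charge(ph: float) -> float:
    """Average net charge of glycine at a given pH (illustrative helper)."""
    # Protonated amino group (-NH3+) contributes +1.
    positive = fraction_protonated(ph, PKA_AMINO)
    # Deprotonated carboxyl group (-COO-) contributes -1.
    negative = 1.0 - fraction_protonated(ph, PKA_CARBOXYL)
    return positive - negative


if __name__ == "__main__":
    for ph in (1.0, 7.4, 12.0):
        print(f"pH {ph:4.1f}: net charge {glycine_net_charge(ph):+.3f}")
```

At low pH the result approaches +1, near pH 7.4 it is close to zero (the zwitterion), and at high pH it approaches -1.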

Figure: general structure of an amino acid (labels: amino group, α-carbon, carboxyl group, side chain)

The α-carbon is chiral (except for glycine)
At pH 7.0, amino acids with uncharged side chains are zwitterions
The α-carbon has a tetrahedral structure

Amino Acid Enantiomers

Stereoisomers / enantiomers. Biological systems synthesize and use only L-amino acids to build proteins.

Classification of amino acids


The 20 naturally occurring amino acids can be divided into several groups based on their chemical properties. Important factors are charge, hydrophobicity/hydrophilicity, size and functional groups. The nature of the interaction of the different side chains with the aqueous environment plays a major role in molding protein structure. Hydrophobic side chains tend to be buried in the middle of the protein, whereas hydrophilic side chains are exposed to the solvent. Examples of hydrophobic residues are leucine, isoleucine, phenylalanine and valine, as well as tyrosine, alanine and tryptophan. The charge of the side chains plays an important role in protein structures, since ionic bonding can stabilize protein structures, and an unpaired charge in the middle of a protein can disrupt them. Charged residues are strongly hydrophilic, and are usually found on the outside of proteins.
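One common way to put numbers on the hydrophobic/hydrophilic distinction is a hydropathy scale. The sketch below uses the Kyte-Doolittle scale, which is not part of these notes and is brought in here only as an illustrative assumption; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
# Assumption: this scale is used for illustration; the notes do not name one.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq: str) -> float:
    """Average hydropathy of a sequence given in one-letter code."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq: str, window: int = 5) -> list[float]:
    """Sliding-window average; peaks suggest segments likely to be buried."""
    return [
        mean_hydropathy(seq[i:i + window])
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    print(mean_hydropathy("LIVF"))   # hydrophobic stretch, positive value
    print(mean_hydropathy("KRDE"))   # charged stretch, negative value
```

Different scales rank some residues differently (tyrosine and tryptophan score as mildly hydrophilic here), so the output is a rough guide rather than a definitive classification.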

Positively charged side chains are found in lysine and arginine, and in some cases in histidine. Negative charges are found in glutamate and aspartate. The rest of the amino acids have smaller, generally hydrophilic side chains with various functional groups. Serine and threonine have hydroxyl groups, and asparagine and glutamine have amide groups. Some amino acids have special properties. Examples: cysteine can form covalent disulphide bonds to other cysteines. Proline is cyclic, and glycine is small and more flexible than the other amino acids.

Amino acid classes, from hydrophobic to hydrophilic: aliphatic, aromatic, sulfur containing, polar/uncharged, basic/acidic

Aliphatic (alkane) Amino Acids


Glycine (Gly, G) - only non-chiral amino acid, not hydrophobic
Alanine (Ala, A) - R group = methyl group
Valine (Val, V) - Think V!
Leucine (Leu, L)
Isoleucine (Ile, I) - 2 chiral carbons
Proline (Pro, P) - cyclic imino acid

Hydrophobicity increases with the size of the aliphatic side chain.

Aromatic Amino Acids


All very hydrophobic
All contain an aromatic group
Absorb UV at 280 nm
Phenylalanine (Phe, F)
Tyrosine (Tyr, Y) - -OH ionizable (pKa = 10.5), H-bonding
Tryptophan (Trp, W) - bicyclic indole ring, H-bonding
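Because the aromatic residues absorb at 280 nm, a protein's absorbance can be estimated from its sequence. The sketch below is an approximation under stated assumptions: the per-residue coefficients (5500 for Trp, 1490 for Tyr, 125 per disulfide, in M^-1 cm^-1) are commonly used literature values, not figures from these notes, and phenylalanine is ignored because its contribution at 280 nm is negligible. Function names are hypothetical.

```python
# Approximate molar extinction contributions at 280 nm (M^-1 cm^-1).
# Assumption: commonly quoted empirical values, not taken from these notes.
EPSILON_TRP = 5500
EPSILON_TYR = 1490
EPSILON_DISULFIDE = 125


def extinction_coefficient_280(seq: str, n_disulfides: int = 0) -> int:
    """Estimate the molar extinction coefficient from a one-letter sequence."""
    return (
        EPSILON_TRP * seq.count("W")
        + EPSILON_TYR * seq.count("Y")
        + EPSILON_DISULFIDE * n_disulfides
    )


def absorbance_280(seq: str, molar_conc: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: A = epsilon * c * l."""
    return extinction_coefficient_280(seq) * molar_conc * path_cm


if __name__ == "__main__":
    peptide = "MKWYYAF"  # made-up example sequence
    print(extinction_coefficient_280(peptide))
    print(absorbance_280(peptide, molar_conc=1e-5))
```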

Sulfur Containing Amino Acids


Methionine (Met, M) - start amino acid, very hydrophobic

Cysteine (Cys, C) - sulfur in the form of a sulfhydryl group, important in disulfide linkages, weak acid, can form hydrogen bonds.

Acidic Amino Acids


Contain side chain carboxyl groups (weaker acids than the α-carboxyl group)
Negatively charged at physiological pH, present as conjugate bases (hence the names end in "-ate" rather than "-ic acid")
Carboxyl groups function as nucleophiles in some enzymatic reactions
Aspartate (Asp, D)
Glutamate (Glu, E)

Basic Amino Acids


Hydrophilic nitrogenous bases
Positively charged at physiological pH
Histidine - imidazole ring can be protonated/ionized (pKa 6.0); only amino acid side chain that functions as a buffer in the physiological range
Lysine - diamino acid, protonated at pH 7.0
Arginine - guanidinium ion, always protonated; most basic amino acid

Polar Uncharged Amino Acids


Polar side groups, hydrophilic in nature, can form hydrogen bonds
Hydroxyls of Ser and Thr are weakly ionizable
Serine (Ser, S) - looks like Ala with -OH
Threonine (Thr, T) - 2 chiral carbons
Asparagine (Asn, N) - amide of aspartic acid
Glutamine (Gln, Q) - amide of glutamic acid
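The grouping used on the preceding slides can be written down as a small lookup table. This is only a sketch of the classification as presented here (other textbooks draw the boundaries differently, for example placing tyrosine or cysteine among the polar residues); the names `CLASSES`, `CLASS_OF` and `class_composition` are hypothetical.

```python
from collections import Counter

# One-letter codes grouped as on the slides above.
CLASSES = {
    "aliphatic": "GAVLIP",
    "aromatic": "FYW",
    "sulfur containing": "MC",
    "acidic": "DE",
    "basic": "HKR",
    "polar uncharged": "STNQ",
}

# Reverse lookup: residue -> class name.
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def class_composition(seq: str) -> Counter:
    """Count how many residues of each class a sequence contains."""
    return Counter(CLASS_OF[aa] for aa in seq)


if __name__ == "__main__":
    print(class_composition("MKVLDESW"))
```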

Essential/Non-Essential Amino Acids


Essential: histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, valine

Non-essential: alanine, arginine*, aspartate, asparagine, cysteine*, glutamate, glutamine, glycine*, proline*, serine, tyrosine*

Protein assembly: The peptide bond


Two amino acids can be combined in a condensation reaction, which releases a molecule of water. By repeating this reaction, long chains of residues (amino acids joined by peptide bonds) can be generated. This reaction is catalysed by the ribosome in a process known as translation. The peptide bond is planar, because delocalization of electrons between the carbonyl group and the amide nitrogen gives it partial double-bond character. In contrast to the rather rigid peptide bond angle ω (the bond between C1 and N), which is always close to 180 degrees, the dihedral angles φ (the bond between N and Cα) and ψ (the bond between Cα and C1) can have a certain range of possible values. These angles are the degrees of freedom of a protein, and they control the protein's three dimensional structure.
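A dihedral angle such as φ, ψ or ω is defined by four consecutive backbone atoms. The following is a minimal sketch of that calculation using plain tuples for coordinates; the function name `dihedral_degrees` and the toy coordinates are assumptions made for illustration, not data from a real structure.

```python
import math

Vec = tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed torsion angle about the p1-p2 bond, in degrees (-180..180)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Atom order for the backbone angles:
    #   phi:   C(i-1), N(i),  CA(i), C(i)
    #   psi:   N(i),   CA(i), C(i),  N(i+1)
    #   omega: CA(i),  C(i),  N(i+1), CA(i+1)
    trans = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
    print(abs(trans))  # a planar trans arrangement, like omega
```

In the toy example the first and last atoms lie on opposite sides of the central bond in the same plane, which is the trans arrangement the text describes for ω.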

Bond angles for φ and ψ

Levels of protein structure


Primary structure of a peptide or protein - refers to the linear sequence of the different amino acids. Counting of residues always starts at the N-terminal end (NH2-group), which is the end where the amino group is not involved in a peptide bond. The primary structure of a protein is determined by the gene corresponding to the protein. A specific sequence of nucleotides in DNA is transcribed into mRNA, which is read by the ribosome in a process called translation. The sequence of a protein is unique to that protein, and defines the structure and function of the protein. The sequence of a protein can be determined by methods such as Edman degradation or tandem mass spectrometry. Often, however, it is read directly from the sequence of the gene using the genetic code.
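Reading a protein sequence from a gene with the genetic code can be sketched in a few lines. This illustration assumes the standard genetic code and a coding sequence that is already in frame; the function name `translate` is hypothetical, and real genes may need intron removal and frame detection first.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # T..
    "LLLLPPPPHHQQRRRR"   # C..
    "IIIMTTTTNNKKSSRR"   # A..
    "VVVVAAAADDEEGGGG"   # G..
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA or mRNA sequence, N- to C-terminus."""
    seq = coding_sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":          # stop codon ends the chain
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AUGGCCAAAUAA"))  # Met-Ala-Lys, then a stop codon
```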

Post-translational modifications such as disulfide bond formation, phosphorylations and glycosylations are usually also considered as part of the primary structure, but cannot be read from the gene.

Primary structure determines all the other levels of protein structure:

Sickle cell anemia - a single amino acid change in hemoglobin is related to the disease

Osteoarthritis - a single amino acid change in collagen protein causes joint damage

Secondary structure - refers to the arrangement of amino acids that are close together in a chain. Examples of secondary structures are helices and pleated sheets. An alpha helix is a tightly coiled, rod-like structure which has an average of 3.6 amino acids per turn. The helix is stabilized by hydrogen bonding between the backbone carbonyl of one amino acid and the backbone NH of the amino acid four residues away. All main chain NH and carbonyl groups within the helix are hydrogen bonded, and the R groups stick out from the structure in a spiral arrangement.
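The two numbers in this description (3.6 residues per turn and hydrogen bonds between residues four positions apart) are enough for a small worked example. The sketch below lists the backbone hydrogen-bond pairs for an idealized helix and the angular position of each residue when viewed down the helix axis; the function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN  # about 100 degrees


def helix_hbond_pairs(n_residues: int) -> list[tuple[int, int]]:
    """Pairs (i, i+4): C=O of residue i bonded to N-H of residue i+4.

    Residues are numbered from 1 at the N-terminus.
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


def wheel_angle(position: int) -> float:
    """Angle around the helix axis for a 0-based position (helical wheel)."""
    return (position * DEGREES_PER_RESIDUE) % 360.0


if __name__ == "__main__":
    print(helix_hbond_pairs(8))
    for pos in range(8):
        print(pos, round(wheel_angle(pos), 1))
```

Residues whose wheel angles fall close together (for example positions 0, 4 and 7) lie on the same face of the helix, which is why side chains appear in a spiral arrangement.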

Alpha-Helix
Side chain groups point outwards from the helix
Amino acids with bulky side chains are less common in alpha-helices
Glycine and proline destabilize the alpha-helix; they form a bend

Another type of secondary structure, the beta pleated sheet, is composed of two or more extended chains (strands) that are hydrogen bonded side by side.
If the amino termini are on the same end of each chain, the sheet is termed parallel, and if the chains run in the opposite direction (amino termini on opposite ends), the sheet is termed antiparallel. All of the amides are hydrogen bonded except those on the outer strands. Pleated sheets may be formed from a single chain if it contains a beta turn, which forms a hairpin loop structure. Often proline and glycine can be found in a beta turn, since they place a "bend" in the chain. When the beta sheet curves around itself and the outer edges on either side hydrogen bond to one another, it forms a structure called a beta barrel, which is a common structural motif in proteins.

In beta turns, the carbonyl oxygen of one residue is H-bonded to the amide proton of the residue three positions further along the chain

Loops and turns


Loops
Loops usually contain hydrophilic residues
Found on surfaces of proteins
Connect alpha-helices and beta-sheets

Turns
Loops with < 5 amino acids are called turns
Beta-turns are common in protein motifs

Tertiary structure: The elements of secondary structure are usually folded into a compact shape using a variety of loops and turns.
The formation of tertiary structure is usually driven by the burial of hydrophobic residues, but other interactions such as hydrogen bonding, ionic interactions and disulfide bonds can also stabilize the tertiary structure. The tertiary structure encompasses all the noncovalent interactions that are not considered secondary structure. It defines the overall fold of the protein and is usually indispensable for the protein's function.

The quaternary structure: is the interaction between several polypeptide chains. The individual chains are called subunits.
The individual subunits are not necessarily covalently connected, but may be linked by disulfide bonds. Not all proteins have quaternary structure, since they might be functional as monomers. The quaternary structure is stabilized by the same range of interactions as the tertiary structure. Complexes of two or more polypeptides (i.e. multiple subunits) are called multimers. Specifically, a complex is called a dimer if it contains two subunits, a trimer if it contains three subunits, and a tetramer if it contains four subunits. Multimers made up of identical subunits may be referred to with a prefix of "homo-" (e.g. a homotetramer) and those made up of different subunits may be referred to with a prefix of "hetero-" (e.g. a heterodimer).
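The naming rules for multimers can be expressed as a short function. This is only an illustration of the convention described above; the function name `describe_complex` and the subunit labels are hypothetical, and complexes with more than four subunits are simply labelled "multimer" here.

```python
SIZE_NAMES = {1: "monomer", 2: "dimer", 3: "trimer", 4: "tetramer"}


def describe_complex(subunits: list[str]) -> str:
    """Name a complex from a list of subunit labels, one entry per chain."""
    count = len(subunits)
    size_name = SIZE_NAMES.get(count, "multimer")
    if count == 1:
        return size_name                      # no quaternary structure
    prefix = "homo" if len(set(subunits)) == 1 else "hetero"
    return prefix + size_name


if __name__ == "__main__":
    # Adult hemoglobin has two alpha and two beta chains.
    print(describe_complex(["alpha", "alpha", "beta", "beta"]))
    print(describe_complex(["A", "A"]))
```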

Motifs, domains and folds in protein structure


Many proteins are organised into several units. A structural domain ("domain") is an element of the protein's overall structure that is self-stabilizing and often folds independently of the rest of the protein chain. Many domains are not unique to the protein products of one gene or one gene family but instead appear in a variety of proteins. Domains often are named and singled out because they figure prominently in the biological function of the protein they belong to; for example, the "calcium-binding domain" of calmodulin. Because they are self-stabilizing, domains can be "swapped" by genetic engineering between one protein and another to make chimeras. A motif in this sense refers to a small specific combination of secondary structural elements (such as helix-turn-helix). These elements are often called supersecondary structures.

Fold refers to a global type of arrangement, like helix-bundle or beta-barrel.


Structure motifs usually consist of just a few elements, e.g. the 'helix-turn-helix' has just three. Note that while the spatial sequence of elements is the same in all instances of a motif, they may be encoded in any order within the underlying gene. Protein structural motifs often include loops of variable length and unspecified structure, which in effect create the "slack" necessary to bring together in space two elements that are not encoded by immediately adjacent DNA sequences in a gene. Note also that even when two genes encode the secondary structural elements of a motif in the same order, they may nevertheless specify somewhat different sequences of amino acids. This is true not only because of the complicated relationship between tertiary and primary structure, but also because the size of the elements varies from one protein to the next.

Despite the fact that there are about 100,000 different proteins expressed in eukaryotic systems, there are far fewer different domains, structural motifs and folds. This is partly a consequence of evolution, since genes or parts of genes can be duplicated or moved around within the genome.
This means that, for example, a protein domain might be moved from one protein to another, thus giving the protein a new function. Because of these mechanisms, the same domains and motifs tend to be reused in several different proteins.

Protein folding: Refers to the process by which the higher structures form and is a consequence of the primary structure. A unique polypeptide may have more than one stable folded conformation, which could have a different biological activity, but usually, only one conformation is considered to be the active, or native conformation.
