
BCH 4053 Biochemistry I

Fall 2001
Dr. Michael Blaber

Lecture 22

Cloning, DNA libraries

The E. coli genome is a double-stranded DNA molecule, about 5 million basepairs long,
that is circular (the ends are covalently joined by a 5'→3' phosphodiester bond).

• A specific region in the E. coli chromosome controls the initiation of replication
of the chromosome. This region is called the replication origin (or ori) region

• If a fragment of DNA containing the ori region is inserted into a circular piece of
DNA it will confer upon that piece of DNA the ability to replicate in an E. coli cell
o The fragment containing the origin of replication may be isolated by
digesting the E. coli DNA with an appropriate restriction endonuclease
and isolating the desired fragment
o If the circular DNA molecule contains a single site recognized by the same
restriction endonuclease, it will be linearized (i.e. "opened up"). The ends
of the fragment with the origin of replication and the linearized circular
DNA will be complementary since they were cleaved by the same type of
restriction endonuclease.
o Covalent phosphodiester bonds can be reformed between the fragments
(creating a new recombinant circular DNA molecule) using an enzyme
known as DNA ligase.

• If this circularly closed DNA molecule with the ori region is inserted into an E. coli
cell it will autonomously replicate. Essentially, the E. coli now has two
chromosomes!

o This autonomously replicating DNA molecule is commonly known as a plasmid.

o There are naturally occurring E. coli plasmids. One is the colE1 plasmid
o These plasmids have origin of replication regions that are slightly different
from the sequence in the natural E. coli chromosome, but serve the same
purpose (to initiate DNA replication).
o The colE1 origin of replication is most often used instead of the E. coli
chromosomal origin (oriC) for routine construction of plasmids

Although, in principle, the introduced ori region will confer autonomous replication on the
plasmid DNA, such plasmids will typically be "lost" by the host E. coli after a few
generations of replication of the host cell. Why is this?
• The plasmid is a metabolic burden on the host E. coli
• During cell division of the host E. coli one of the daughter cells may not receive
a copy of the plasmid (distribution into daughter cells is a random event)
• Any daughter E. coli cell that does not get the plasmid has an "advantage" over
those that did get the plasmid (e.g. without the metabolic burden of the plasmid
an E. coli cell can replicate faster, or does not need as much energy to replicate)
• After a few generations, the E. coli containing the plasmid will be out-competed
by those that do not have the plasmid
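
The competition argument above can be made concrete with a toy calculation. The sketch below is a minimal deterministic model, not a measured one: the function name `plasmid_fraction`, the loss probability per division (`loss_per_division`) and the relative growth rate of plasmid-bearing cells (`relative_growth`) are illustrative assumptions, not values from the lecture.

```python
def plasmid_fraction(generations, loss_per_division=0.05, relative_growth=0.9):
    """Return the fraction of plasmid-bearing cells after each generation.

    Assumed toy model (hypothetical parameters):
    - plasmid-free cells double each generation (growth factor 2)
    - plasmid-bearing cells grow by a factor 2 * relative_growth
    - a fraction loss_per_division of the daughters of plasmid-bearing
      cells receives no plasmid
    """
    with_plasmid, without_plasmid = 1.0, 0.0  # start from a pure culture
    fractions = []
    for _ in range(generations):
        daughters = with_plasmid * 2 * relative_growth
        new_with = daughters * (1 - loss_per_division)
        new_without = without_plasmid * 2 + daughters * loss_per_division
        total = new_with + new_without
        # Renormalize so the numbers stay bounded over many generations.
        with_plasmid, without_plasmid = new_with / total, new_without / total
        fractions.append(with_plasmid)
    return fractions


if __name__ == "__main__":
    for generation, fraction in enumerate(plasmid_fraction(50), start=1):
        if generation % 10 == 0:
            print(f"generation {generation}: {fraction:.3f} carry the plasmid")
```

With any loss rate above zero and any growth penalty, the plasmid-bearing fraction falls every generation in this model, which is the "out-competed" outcome described in the list.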

The E. coli needs a metabolic "reason" to retain the plasmid

Drug resistance

• By far the most common approach to the maintenance (i.e. retention) of plasmids
is through the incorporation of drug resistance genes.
o The drug in question will be an antibiotic that will normally kill the E. coli
o The drug-resistance gene codes for a protein that will confer resistance to
the antibiotic
• Drug-resistance genes are also known as selectable markers, i.e. we can select
for their presence in an E. coli cell by including antibiotics in the growth media.

Ampicillin is an antibiotic drug

• Ampicillin binds to and inhibits a number of enzymes in the bacterial membrane
that are involved in the synthesis of the Gram-negative cell wall.
o Therefore, addition of ampicillin to the growth media causes faulty cell wall
production in E. coli and this causes problems particularly during cell
replication (when new cell wall needs to be made)
o Proper cell replication cannot occur in the presence of ampicillin and the
E. coli will die (hence, ampicillin is an antibiotic)
o The chemical structure of ampicillin contains a β-lactam ring.

• The ampicillin resistance gene (amp^r) codes for an enzyme (β-lactamase) that is
secreted into the periplasmic space of the bacterium where it catalyzes
hydrolysis of the β-lactam ring of the ampicillin.
o Hydrolysis (i.e. opening up) of the β-lactam ring in the ampicillin destroys its function.
o Thus, the enzyme product of the amp^r gene destroys the antibiotic.

Modification of the plasmid DNA to include a drug-resistant gene

• If the ampicillin resistance gene (coding for the β-lactamase enzyme) is inserted into
the autonomously replicating plasmid, it will confer drug resistance

• Any E. coli that contains this plasmid will be resistant to ampicillin. E. coli
that do not have this plasmid will be killed by the antibiotic. This is the selective
pressure needed to force retention of the plasmid by E. coli (i.e. the plasmid is still
a metabolic load on the bacteria that carry it, but it confers resistance
to ampicillin).

The use of drug resistance (i.e. a selectable marker) to maintain a plasmid therefore
comprises two elements:

• The presence of an antibiotic (drug) in the media that will kill normal bacteria
• A relevant drug-resistance gene on a plasmid that can confer drug resistance to the
host bacteria

• If the drug is removed from the media, the bacteria will again lose the plasmid
(i.e. will be outcompeted by bacteria that have lost the plasmid)

Cutting and pasting DNA

Plasmids provide a means by which foreign DNA can be introduced into a bacterium, and
the machinery of the bacterium is put to work in replicating and, in some cases,
transcribing and translating the genetic information into mRNA and protein.

However, we need a way to easily insert such fragments. This involves two general
steps:

1. Cutting the plasmid (i.e. opening it up) at a desired location


2. Chemically linking the desired piece of duplex DNA into the opened site in
the plasmid DNA (i.e. formation of appropriate phosphodiester bonds between
the plasmid DNA and introduced DNA)

Restriction endonucleases are used to open the plasmid DNA and to generate the
fragment of duplex DNA to be inserted

• Most plasmids are between 3000 and 6000 basepairs long (i.e. 3-6kb)
• Most common restriction endonucleases recognize a specific 6-basepair
palindromic sequence.
o EcoRI recognizes 5' GAATTC 3'
o BamHI recognizes 5' GGATCC 3'
o HindIII recognizes 5' AAGCTT 3'
• Such enzymes are known as "six-cutters" and their recognition sequences are
"restriction sites" on the plasmid
• The probability of finding such a 6-basepair restriction site at a given position in a
random piece of DNA would be (1/4)^6, or once every 4096 base pairs (i.e. about once
every 4 kb). Therefore, you expect there to be only about one such site on average in a
typical-size plasmid. Thus, such enzymes are often useful for "opening up" plasmids, as
there is often only a single available site for digestion
• However, plasmids are usually engineered so that options for different restriction
sites are not left to chance. The DNA sequence of plasmids is often changed so
that all restriction sites for commonly available 6-cutters are eliminated. Then a
specially designed piece of DNA is inserted. It is designed to contain, in a
very short stretch of DNA, the recognition sequences for many common 6-cutter
restriction endonucleases. Such regions are called polylinkers.
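
The two properties of six-cutter sites used above, that the site is a palindrome and that it occurs about once per 4^6 bases, can be checked with a few lines of Python. This is a minimal sketch: the recognition sequences come from the text, the helper names are made up for illustration, and the spacing estimate assumes random sequence with equal frequencies of the four bases.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Recognition sequences quoted in the text (written 5' -> 3').
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def reverse_complement(seq):
    """Return the complementary strand, also written 5' -> 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site):
    """A site is palindromic if it reads the same on both strands."""
    return site == reverse_complement(site)


def expected_spacing(site_length):
    """Average distance in bp between sites in random, unbiased DNA."""
    return 4 ** site_length


def expected_sites(dna_length, site_length):
    """Expected number of sites in a random sequence of dna_length bp."""
    return dna_length / expected_spacing(site_length)


if __name__ == "__main__":
    for enzyme, site in SITES.items():
        print(enzyme, site, "palindromic:", is_palindromic(site))
    print("6-cutter spacing (bp):", expected_spacing(6))
    print("expected sites in a 4 kb plasmid:", expected_sites(4000, 6))
```

Real genomes are not random sequence, so actual site frequencies can differ noticeably from this estimate.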

Here is a polylinker region from a plasmid known as pUC18 (plasmid names are usually
prefixed with a lower case 'p', and the letters afterwards commonly refer to the initials of
the person that designed the plasmid. The numbers at the end are often the version of
the plasmid - they are modified and "upgraded" more often than Windows™).

• Now we have a truly useful plasmid design:

• The polylinker region allows us to insert a DNA fragment that might have been
generated by any number of combinations of restriction endonucleases.
• Suppose, for example, that we can isolate a fragment of human DNA that
contains a gene of interest by digesting human DNA with a combination of KpnI
and PstI restriction endonucleases. Once isolated, this fragment of DNA can be
inserted into the above plasmid after it has been "opened up" with the same two
enzymes

How to make phosphodiester bonds between two DNA fragments

The enzyme that makes phosphodiester bonds between DNA fragments is DNA ligase

• There are several different ligases available. They require either ATP or NAD+ as
a cofactor (energy is required to make the covalent bond).
• They require a 5' phosphate group and a 3' hydroxyl group for ligation
• Ligases are available that can join either blunt-end or cohesive-end DNA fragments.
Both types of fragments can be generated by different restriction endonucleases
• Notice that the ligation of complementary (cohesive-end) DNA fragments
(produced by digestion with the same restriction endonuclease) results in
regeneration of the restriction site.
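
The regeneration of the restriction site can be illustrated with a simple string model. The sketch below is a simplification under stated assumptions: it tracks only the top strand of linear DNA, uses the EcoRI cut position (G^AATTC), and the two input sequences and the function names are made up for illustration.

```python
ECORI_SITE = "GAATTC"
CUT_OFFSET = 1  # EcoRI cuts the top strand after the first G: G^AATTC


def digest(top_strand, site=ECORI_SITE, cut_offset=CUT_OFFSET):
    """Cut a linear top strand at every occurrence of the site."""
    fragments = []
    start = 0
    position = top_strand.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(top_strand[start:cut])
        start = cut
        position = top_strand.find(site, position + 1)
    fragments.append(top_strand[start:])
    return fragments


def ligate(left, right):
    """Join two fragments end to end (top strand only)."""
    return left + right


if __name__ == "__main__":
    # Hypothetical sequences, each with a single EcoRI site.
    left_a, right_a = digest("CCGAATTCGG")
    left_b, right_b = digest("ATGAATTCTA")
    recombinant = ligate(left_a, right_b)
    print(recombinant, "site regenerated:", ECORI_SITE in recombinant)
```

Because every EcoRI end carries the same overhang, a left end from one molecule joined to a right end from another always rebuilds GAATTC at the junction.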


Genomic libraries, cDNA libraries and expression vectors

Plasmids are often used for two different things:

• Generation of "libraries" of genetic information


• Expression of desired genes to produce the associated proteins

Genomic libraries

In genomic libraries the goal is to fragment a particular genome (e.g. human) into useful
sized pieces and to have a mechanism whereby each piece can be isolated, identified
and manipulated. One essential manipulation is the ability to replicate the fragment for
further use and study

Fragmentation of the genome, followed by insertion of the fragments into a plasmid, is a
useful method to achieve these goals

• Plasmids can reliably maintain and replicate a DNA fragment of up to 5-10kb


• Fragmentation of a genome can be accomplished using an appropriate restriction
endonuclease. Use a 6-cutter for an average fragment size of about 4kb. Use an
8-cutter for an average fragment size of about 65kb.
• Open the plasmid using the same restriction enzyme and the fragments are
cohesive with the linearized plasmid
• Ligate using DNA ligase
• Each plasmid gets on average one fragment

How big a library (i.e. how many plasmids) is required to hold the information in a
given genome?

• E. coli genome has about 5 x 10^6 bp
• If cut with a 6-cutter, we would have approximately 1,220 fragments with an
average size of 4kb
• Thus, a library of approximately 1,200 plasmids (or "clones") would be required
• Human genome has about 3 x 10^9 bp. If cut with a 6-cutter we would have
approximately 732,000 fragments, and therefore need about 732,000 clones (quite a
large library to work with!)
• A genomic library contains all the genetic information in the DNA of an
organism
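
The library-size arithmetic above is the genome length divided by the average fragment length. A minimal sketch, assuming one fragment per clone and an average fragment size of 4^n bp for an n-cutter (the function name `clones_needed` is made up for illustration):

```python
def clones_needed(genome_bp, cutter_length=6):
    """Estimate clones needed if each clone holds one average-size fragment."""
    average_fragment_bp = 4 ** cutter_length  # 4096 bp for a 6-cutter
    return genome_bp / average_fragment_bp


if __name__ == "__main__":
    genomes = {"E. coli": 5_000_000, "human": 3_000_000_000}
    for name, size in genomes.items():
        print(f"{name}: about {clones_needed(size):,.0f} clones with a 6-cutter")
```

This count is a lower bound, since it assumes every fragment is captured exactly once.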

cDNA libraries
Another type of library is a cDNA library (where the "c" means "complementary").

• A cDNA library starts with the mRNA extracted from a specific tissue or cell type
from an organism
• The mRNA is converted "back" into its complementary DNA using an RNA-dependent
DNA polymerase (reverse transcriptase, from a virus)
• The duplex DNA thus produced is subcloned into a plasmid library
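
The conversion of mRNA into duplex cDNA follows the ordinary base-pairing rules, which can be sketched as string operations. This is only an illustration of the sequence relationship, not of the enzymology; the function names and the example mRNA are made up.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """DNA strand complementary to the mRNA, written 5' -> 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand_cdna(first_strand):
    """DNA strand complementary to the first strand, written 5' -> 3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(first_strand))


if __name__ == "__main__":
    mrna = "AUGGCC"  # hypothetical message, 5' -> 3'
    first = first_strand_cdna(mrna)
    second = second_strand_cdna(first)
    print("first strand: ", first)
    print("second strand:", second)
```

The second strand has the same sequence as the mRNA with T in place of U, which is why the duplex cDNA carries the coding information of the expressed gene.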

The unique feature of a cDNA library is that it contains the genetic information for
expressed genes (i.e. those that are producing mRNA for protein production) and that
these genes are from a specific tissue.

• cDNA libraries from different tissues of identical organisms will be different. Also,
cDNA libraries from the same tissue, but different developmental stages (e.g.
embryo versus adult) will be different. So will cDNA libraries from identical
organisms with the same age and tissue, but with different disease states.

Getting information "out" of a library

Of the thousands of clones in a genomic or cDNA library, which one is the one you are
interested in? Often you will have some DNA sequence information that can help
identify the gene of interest.

• Screening of libraries takes advantage of the ability of complementary strands of
DNA to hybridize and form a stable duplex
• If a short region of single stranded DNA containing a known sequence of interest
is incubated with DNA from a plasmid library, it will hybridize (i.e. form a stable
duplex) with that clone that contains the gene of interest. But how to know which
clone it hybridized with?
• If the short piece of DNA is radiolabeled (e.g. with radioactive phosphate in the
phosphodiester backbone) then the clone that forms a stable duplex will also be
radioactive (the others will not bind the radiolabeled DNA fragment)
• The radioactive clone can be identified using x-ray film
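
The screening logic above amounts to asking which clone contains a sequence complementary to the probe. A minimal sketch, assuming exact matching (real hybridization tolerates some mismatches) and a made-up three-clone library; the function names and sequences are hypothetical:

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' -> 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def screen_library(library, probe):
    """Return names of clones whose insert can base-pair with the probe.

    Each insert is given as its top strand. The probe can pair with the
    top strand (which then contains the probe's reverse complement) or
    with the bottom strand (the top strand then contains the probe itself).
    """
    target = reverse_complement(probe)
    return [
        name
        for name, insert in library.items()
        if probe in insert or target in insert
    ]


if __name__ == "__main__":
    # Hypothetical library: clone name -> top strand of the insert.
    library = {
        "clone_1": "ATGCGTACGTTAGC",
        "clone_2": "GGCTAACCGGTTAA",
        "clone_3": "TTGACCATGGCAAT",
    }
    print(screen_library(library, "GGTCAA"))
```

In the laboratory the "positive" result is the radioactive spot on the x-ray film; here it is simply the list of matching clone names.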

Expression Libraries

In addition to being benign carriers of genetic information, plasmids can also be
modified to permit transcription of introduced DNA fragments.

• This requires the insertion into the plasmid of a promoter region. A promoter
region is a piece of DNA that directs RNA polymerase to start transcription of the
DNA downstream (i.e. 3' to the promoter).
• The RNA polymerase is part of the transcriptional machinery in the host bacteria,
and the plasmid is simply recruiting the host RNA polymerase to transcribe DNA
from the plasmid, rather than the host genome
• The mRNA thus produced can, again, use the host translational machinery to
produce the associated protein that is coded for by the RNA
• Specialized plasmids can be used for both library production (genomic or cDNA)
and expression
• Alternatively, genes of interest that are identified in libraries can be sub-cloned
into specialized expression vectors for protein production


© 2001 Dr. Michael Blaber
