1 Bioprocess Engineering Laboratory, Bioprocess Division, Research Center for Biotechnology,
LIPI. Cibinong Science Center-Jalan Raya Bogor Km. 46, Cibinong 16911-Bogor
Construction of a CSF3-Synthetic Gene for Recombinant Human G-CSF Expression in Yeast Using a TBIO
(Thermodynamically Balanced Inside-Out) Method
JOURNAL of BIOTECHNOLOGY RESEARCH in TROPICAL REGION, Vol. 1, Oct. 2008 (Special Edition) ISSN: 1979-9756
Human G-CSF is encoded by a single gene called CSF3, which maps to chromosome 17. Two isoforms
of the mature G-CSF protein are most commonly found in the body: one of 177 aa (NP_000750) and
one of 174 aa (NP_757373). Both isoforms show similar bioactivities. Three types of G-CSF are
commercially available: lenograstim (Granocyte®), filgrastim (Neupogen®) and pegylated
filgrastim (Neulasta®). These drugs work in a similar way. Pegylated filgrastim carries a
covalently attached polyethylene glycol (PEG) chain, which helps the drug act longer.
Chemical synthesis of DNA sequences provides a powerful tool for creating, modifying and
studying gene function, such as studying a gene's structure and expression in a given host
cell. In the past, the most direct method of constructing a synthetic gene was to mix
overlapping preformed double-stranded DNA duplexes and ligate them to each other enzymatically.
However, the yield of the full-length product declines sharply with an increasing number of DNA
duplexes. A more common method is to construct separate DNA segments of the gene from a smaller
number of DNA duplexes, amplify each fragment by sub-cloning into a plasmid vector, and then
ligate the fragments to give the full-length gene. Although this method can efficiently produce
the intermediate product for each step, the intermittent sub-cloning and bacterial
amplification steps make the procedure very tedious and time-consuming. A more recent method
assembles several synthetic primers or oligonucleotides into a DNA sequence of up to 500 bp in
single-tube annealing and ligation reactions.
In this study, we applied a method for synthetic gene construction called the TBIO
(Thermodynamically Balanced Inside-Out) method. This method, originally reported by Gao et al.
(2003), offers a very efficient route to gene construction without the use of restriction and
ligation procedures. It is a PCR-based single-step DNA synthesis that uses sense-strand and
antisense-strand primers, each set covering half of the gene length. Primer elongation runs in
both directions. Thus, TBIO bidirectional elongation must be completed for a given outside
primer pair before the next round of bidirectional elongation can take place. The method was
reported to have been used successfully to construct DNA sequences of up to 1712 bp in length.
The same method was used, in slightly modified form, to generate longer DNA sequences (Xiong et
al., 2004). That variant was reported to give a high-fidelity and cost-effective PCR-based
two-step DNA synthesis for the construction of long segments of DNA. The long DNA sequence
(2,382 bp) was dissected into five DNA segments of around 500 bp in length, each of which was
constructed with the TBIO method.
The purpose of this study was to synthesize a yeast-codon-optimized CSF3 gene that could be
expressed extracellularly from the methylotrophic yeast Pichia pastoris. An mRNA variant of the
human CSF3 gene (variant-2), which produces hG-CSF isoform-b, was altered to obtain a synthetic
CSF3 gene with codon usage optimized for protein expression in P. pastoris. In this research,
we constructed a DNA sequence, or ORF (Open Reading Frame), of the hG-CSF gene, called CSF3,
with the TBIO method. The method was slightly modified in that the PCR reactions were run
sequentially, starting with a given primer pair from the middle of the gene. The PCR product
was then used as the template for sequence elongation with the next primer pair, located
further out on the sequence being generated.
The amino acid sequence of hG-CSF isoform-b (NP_757373) was used as the template for generation
of the synthetic gene sequence. The signal peptide (first 30 amino acids) was excluded from the
sequence, resulting in a protein sequence of 174 aa in length.

MATERIALS AND METHODS

Materials: The oligonucleotide primers were synthesized by Generay Biotech. The high-fidelity
Pfu DNA polymerase and dNTP mix were purchased from Fermentas. The cloning plasmid InsTAclone™
was purchased from Fermentas. The plasmid DNA gel extraction kit was from RBC. The XhoI and
SalI restriction enzymes and T4 DNA ligase were from Fermentas.

CSF3 Open Reading Frame (ORF) and primer design: The CSF3 ORF was generated from the protein
sequence of hG-CSF isoform-b, which has 174 amino acid residues. The protein sequence was
retrieved from the gene database (GenBank accession no. NP_757373) or the protein database
(SwissProt accession no. P09919-2). The synthetic gene excludes the 30 aa native signal
peptide. The DNA fragment of the synthesized CSF3 ORF was 522 bp in length. However, in the
design of the CSF3 ORF, two restriction sequences (XhoI and SalI) were added at the ends of the
synthetic gene, as well as a linker peptide (KREAEA) at the 5' end. The resulting ORF sequence
was 558 bp in length. The protein sequence was submitted to a software program, DNAWorks 3.1,
which then generated the oligonucleotide sequence (ORF) of the synthetic gene. The gene was
optimized to contain P. pastoris codon preference. Some parameters were set up for
Fuad, et.al. 2
the required oligonucleotide synthesis, including primer length (set at 60 nt), annealing
temperature (set at 60°C), codon frequency threshold (set at 10%) and output mode (set to the
TBIO method, thermodynamically balanced inside-out). The program returned a detailed output of
the DNA sequence as well as the primer sequences required for construction of the synthetic
gene. Fourteen primers (oligonucleotides) were to be synthesized, each about 60 nt
(nucleotides) long on average, with overlap regions of 18 to 25 nt between adjacent primers.

CSF3 Open Reading Frame construction and experimental design: The synthetic gene was
constructed according to a modified TBIO method (Gao et al., 2003). PCR-based primer extension,
the basic principle of the TBIO method, was started from the middle of the gene sequence. As
shown in Fig. 1, the 3'-terminal ends of the first pair of 60-mer sense- and antisense-strand
TBIO primers (P7 and P8) overlap in the middle of the synthetic gene sequence. Gene synthesis
started at this point by primer extension. The PCR was continued with the next pair of 60-mer
outer primers (P6 and P9), one from the sense strand and one from the antisense strand. The
reaction was repeated with the next pair of primers. The primer pairs were added sequentially
to extend the sequence in both directions. Each PCR mix contained 40 nM of each primer of the
pair, 0.2 mM dNTPs, 1x Pfu buffer and 1.25 U of Pfu DNA polymerase in a 50 µl reaction volume.
For each subsequent PCR, 2.5 µl of the preceding (inner) PCR product was added as template,
together with the next pair of primers, for DNA sequence elongation. The PCR program was an
initial denaturation at 95°C for 2 min; 25 cycles of denaturation at 95°C for 1 min, annealing
at 59°C for 30 sec and elongation at 72°C for 1 min; and a final elongation at 72°C for 5 min.

Cloning and analysis of the synthetic gene: The PCR products of each sequential PCR were
analyzed by 1.5% agarose gel electrophoresis. The final length of the synthetic DNA sequence
is 558 bp. To this final PCR product, 1 to 2 U of Taq DNA polymerase (Fermentas) was added and
the mixture was incubated at 72°C for at least 30 min. This step adds an additional "A"
(adenine) at the 3'-end of each strand of the double-stranded synthetic DNA. The DNA product
was then sub-cloned into a commercial A/T cloning plasmid, InsTAclone™ pTZ57R/T (Fermentas).
The recombinant plasmid was transformed into E. coli XL1-Blue. The recombinant clones were then
selected on LB-agar selection medium containing IPTG and X-gal. Positive clones (white
colonies) were picked and cultured in an appropriate medium, and the recombinant plasmids were
extracted and analyzed by single and double restriction digestion (with XhoI and SalI). Cloning
and transformation with the cloning plasmid kit were carried out according to the
manufacturer's protocols. Plasmid preparation and restriction analysis were done using the
general protocols for molecular cloning according to Ausubel et al. (2002). The recombinant
plasmid(s) harboring the correct DNA insert were then submitted for DNA sequence analysis using
appropriate primers.

RESULTS AND DISCUSSION

CSF3 Open Reading Frame (ORF) and primer design: Human G-CSF is encoded by a single gene called
CSF3, which belongs to the IL-6 superfamily. The gene is located on chromosome 17 and was
mapped to locus 17q11.2-q21 by in situ hybridization (Tweardy et al., 1987). The gene produces
3 mRNA variants that result in 3 different types of hG-CSF preprotein. However, two isoforms of
mature hG-CSF are most commonly found in the body: isoform-a (177 aa, GenBank accession no.
NP_000750) and isoform-b (174 aa, GenBank accession no. NP_757373). Although isoform-b lacks
the three amino acids VSE at positions 66-68, both isoforms show similar bioactivity.
In this study, the protein sequence of the short version of hG-CSF (isoform-b) was used as the
template to generate the synthetic CSF3 gene sequence (CSF3syn). The human CSF3 gene that codes
for hG-CSF was codon-optimized for expression in yeast. However, the 30 aa native signal
peptide was excluded from the synthetic gene design. Instead, a linker peptide, KREAEA, was
added at the N-terminus of the sequence. The linker peptide presents proteolytic cleavage
site(s) that will be useful for secretion of the recombinant protein in P. pastoris, since the
target protein will be fused with a yeast-derived signal sequence, factor-α. Table 1 shows the
hG-CSF protein sequence (isoform-b) with the peptide linker. The sequence was submitted as
input to the DNAWorks 3.1 program (online) to generate the required DNA sequence. The synthetic
gene was designed to contain optimized codon preferences for expression in P. pastoris. The ORF
of the synthetic gene generated by the program is shown in Table 2.
Two restriction sites, XhoI and SalI, were added at the two ends of the sequence for cloning
into the yeast expression vector. The stop codon was excluded since the sequence would be fused
with a poly-His tag at the C-terminus. However, a stop codon will be included when constructing
a version of the gene without the tag. The resulting ORF sequence was 558 bp in length.
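
The lengths quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the variable names are illustrative, and the size of the flanking restriction-site sequences is inferred as the difference between the reported 558 bp and the coding region, since the text does not list the flanking bases.

```python
# Length bookkeeping for the CSF3syn design, using only figures quoted in the text.
CODON_NT = 3
MATURE_AA = 174          # hG-CSF isoform-b without the 30 aa signal peptide
LINKER = "KREAEA"        # linker peptide placed at the N-terminus
REPORTED_ORF_BP = 558    # final length reported in the text

mature_nt = MATURE_AA * CODON_NT          # coding length of the mature protein
linker_nt = len(LINKER) * CODON_NT        # coding length of the linker
coding_nt = mature_nt + linker_nt         # linker + mature protein
flank_nt = REPORTED_ORF_BP - coding_nt    # inferred here, not stated in the source

print(mature_nt, linker_nt, coding_nt, flank_nt)
```

Under these assumptions the mature protein accounts for 522 nt and the linker for 18 nt, which leaves 18 nt for the sequences carrying the XhoI and SalI sites.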
Table 1. Polypeptide sequence of CSF3 (or hG-CSF) used as input for the synthetic gene
design using the DNAWorks 3.1 program.
Polypeptide sequence
1 KREAEATPLGPASSLPQSFLLKCLEQVRKIQGDGAALQEKLCATYKLCHPEELVLLGHSL
61 GIPWAPLSSCPSQALQLAGCLSQLHSGLFLYQGLLQALEGISPELGPTLDTLQLDVADFA
121 TTIWQQMEELGMAPALQPTQGAMPAFASAFQRRAGGVLVASHLQSFLEVSYRVLRHLAQP
Note: The KREAEA sequence is a peptide linker between signal sequence Factor-α and
the synthetic gene
Table 2. DNA sequence of the CSF3 synthetic gene as output from DNAWorks 3.1.
DNA sequence
1 AAGAGAGAGGCTGAAGCTACTCCACTAGGCCCAGCTTCTTCTTTGCCACAATCTTTTCTT
61 TTGAAGTGTTTGGAACAAGTTAGAAAGATTCAGGGTGATGGTGCTGCCTTGCAGGAAAAG
121 TTGTGTGCTACTTACAAGCTGTGTCATCCAGAAGAATTGGTCTTGCTGGGACATTCTTTG
181 GGTATTCCATGGGCTCCATTGTCTTCTTGTCCATCTCAAGCTCTGCAATTGGCTGGTTGT
241 TTGTCTCAGTTGCATTCTGGTTTGTTTCTGTACCAAGGATTGTTGCAAGCTTTGGAAGGT
301 ATTTCTCCAGAGTTGGGACCAACTTTGGATACTTTGCAACTTGATGTTGCTGATTTTGCT
361 ACTACTATTTGGCAACAAATGGAAGAACTAGGTATGGCTCCTGCTTTGCAGCCAACTCAA
421 GGTGCTATGCCAGCCTTTGCATCAGCTTTTCAGAGAAGAGCTGGTGGTGTTTTGGTTGCT
481 TCTCATTTGCAGTCTTTCCTAGAAGTTTCTTACAGAGTTTTGAGACATTTGGCTCAACCA
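
As a consistency check on Tables 1 and 2, the DNA sequence can be translated back and compared with the input polypeptide. The following is a minimal sketch, not part of the original workflow: it assumes the standard genetic code, transcribes both sequences from the tables, and uses the known recognition sequences of XhoI (CTCGAG) and SalI (GTCGAC) to look for internal sites that would interfere with cloning. The names `translate`, `CODON_TABLE` and `SITES` are illustrative.

```python
from itertools import product

# Sequences transcribed from Table 1 (protein) and Table 2 (DNA).
PROTEIN = (
    "KREAEATPLGPASSLPQSFLLKCLEQVRKIQGDGAALQEKLCATYKLCHPEELVLLGHSL"
    "GIPWAPLSSCPSQALQLAGCLSQLHSGLFLYQGLLQALEGISPELGPTLDTLQLDVADFA"
    "TTIWQQMEELGMAPALQPTQGAMPAFASAFQRRAGGVLVASHLQSFLEVSYRVLRHLAQP"
)
DNA = (
    "AAGAGAGAGGCTGAAGCTACTCCACTAGGCCCAGCTTCTTCTTTGCCACAATCTTTTCTT"
    "TTGAAGTGTTTGGAACAAGTTAGAAAGATTCAGGGTGATGGTGCTGCCTTGCAGGAAAAG"
    "TTGTGTGCTACTTACAAGCTGTGTCATCCAGAAGAATTGGTCTTGCTGGGACATTCTTTG"
    "GGTATTCCATGGGCTCCATTGTCTTCTTGTCCATCTCAAGCTCTGCAATTGGCTGGTTGT"
    "TTGTCTCAGTTGCATTCTGGTTTGTTTCTGTACCAAGGATTGTTGCAAGCTTTGGAAGGT"
    "ATTTCTCCAGAGTTGGGACCAACTTTGGATACTTTGCAACTTGATGTTGCTGATTTTGCT"
    "ACTACTATTTGGCAACAAATGGAAGAACTAGGTATGGCTCCTGCTTTGCAGCCAACTCAA"
    "GGTGCTATGCCAGCCTTTGCATCAGCTTTTCAGAGAAGAGCTGGTGGTGTTTTGGTTGCT"
    "TCTCATTTGCAGTCTTTCCTAGAAGTTTCTTACAGAGTTTTGAGACATTTGGCTCAACCA"
)

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon using the standard genetic code."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Recognition sequences of the two enzymes used for cloning.
SITES = {"XhoI": "CTCGAG", "SalI": "GTCGAC"}
internal_sites = {name: DNA.count(site) for name, site in SITES.items()}

print(len(DNA), len(PROTEIN))
print(translate(DNA) == PROTEIN)
print(internal_sites)
```

With the sequences as transcribed here, the 540-nt ORF translates to the 180-residue polypeptide of Table 1 and contains neither recognition sequence internally.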
CSF3 Open Reading Frame construction with TBIO method: In addition to the synthetic gene
sequence, the DNAWorks 3.1 program generated 14 oligonucleotides (primers) for the gene
construction. The primers have an average length of 60 nt (nucleotides). As seen in Fig. 1,
half of them have sense-strand sequences and the other half have antisense-strand sequences
(primer sequences not shown). Adjacent primers overlap over regions of 18 to 25 nt. Although
the overlap lengths differ, the overlap regions have been optimized to have nearly equal Tm
values (annealing temperature) of 60±1°C.
The TBIO-designed primer set was used for gene synthesis by a PCR-based method. The gene was
synthesized in seven sequential 'inside-out' bidirectional elongation reactions running from
the middle toward both ends of the synthetic gene sequence, which correspond to the N- and
C-termini of the encoded protein. One pair of TBIO primers was used in each elongation. This
method efficiently produced the desired DNA product, the ORF of the synthetic gene, as shown
in Fig. 2.
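
The inside-out order of primer addition, together with the reaction figures given under Materials and Methods, can be summarized in a short sketch. It assumes that the primers are numbered P1 to P14 along the gene, so that each step pairs the next primer on either side of the middle (the text confirms P7/P8 and then P6/P9). The function name `tbio_schedule` and the constants are illustrative, and the time estimate counts hold times only, ignoring thermal ramping.

```python
def tbio_schedule(n_primers):
    """Pair primers from the middle of the gene outward (inside-out order)."""
    if n_primers % 2 != 0:
        raise ValueError("expected an even number of primers")
    mid = n_primers // 2
    return [(mid - step, mid + 1 + step) for step in range(mid)]


# Reaction figures quoted in the Materials and Methods text.
N_PRIMERS = 14
CYCLES = 25
INITIAL_DENATURATION_MIN = 2.0
CYCLE_MIN = 1.0 + 0.5 + 1.0      # denaturation + annealing + elongation
FINAL_ELONGATION_MIN = 5.0
REACTION_UL = 50.0
TEMPLATE_UL = 2.5
PRIMER_NM = 40.0

schedule = tbio_schedule(N_PRIMERS)
run_min = INITIAL_DENATURATION_MIN + CYCLES * CYCLE_MIN + FINAL_ELONGATION_MIN
total_min = run_min * len(schedule)
dilution = REACTION_UL / TEMPLATE_UL             # carry-over dilution of the inner product
primer_pmol = PRIMER_NM * REACTION_UL / 1000.0   # nM x µl gives fmol; /1000 gives pmol

for step, (left, right) in enumerate(schedule, start=1):
    print(f"step {step}: P{left} + P{right}")
print(run_min, total_min, dilution, primer_pmol)
```

Under these assumptions the schedule has seven steps ending with P1 + P14, each run holds for 69.5 min, the inner product is diluted 20-fold when carried into the next reaction, and each reaction contains 2 pmol of each primer.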
[Figure 1 schematic: primers p1 to p14 drawn along the gene; legend: forward primer, reverse primer.]
Fig. 1. Construction method of the CSF3 synthetic gene using the TBIO (thermodynamically
balanced inside-out) method. Primer extension was started from the middle part of the gene,
followed by exterior primer pairs. The process was carried out by sequential PCR with each
pair of primers. Half of the primers have the sense-strand sequence, whereas the other half
have the antisense-strand sequence.
Table 3. Some characteristics of the oligonucleotides used for the synthesis of the CSF3
synthetic gene. (A) Codon frequency range; (B) Annealing temperature range (Tm); (C) Overlap
region range; (D) Oligo length.

A                      B                      C                        D
Frequency   No. of     Tm range   No. of      Overlap length  No. of   Length   No. of
range (%)   codons     (°C)       overlaps    range (nt)      oligos   range    oligos
0-4         0          < 58       0           < 17            0        < 49     2
5-9         0          58         0           17              0        49-50    0
10-14       4          59         5           18              3        51-52    0
15-19       7          60         8           19              2        53-54    0
20-24       2          61         0           20              3        55-56    0
25-29       15         62         0           21              0        57-58    0
30-34       27         63         0           22              3        59-60    12
35-39       14         64         0           23              1        61-62    0
40-44       30         65         0           24              0        63-64    0
45-49       25         66         0           25              1        65-66    0
50          56                                26              0        67-68    0
                                              27              0        69       0
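
The tallies in Table 3 can be checked against the design figures quoted in the text. The sketch below copies only the non-zero rows of the table into dictionaries (the variable names are illustrative) and confirms that the codon counts add up to the 180 codons of the ORF, that the 14 oligos share 13 overlaps, and that no codon falls below the 10% frequency threshold.

```python
# Non-zero rows of Table 3, keyed by the range labels used in the table.
codons_by_freq = {"10-14": 4, "15-19": 7, "20-24": 2, "25-29": 15, "30-34": 27,
                  "35-39": 14, "40-44": 30, "45-49": 25, "50": 56}
overlaps_by_tm = {59: 5, 60: 8}                                # Tm in °C
overlaps_by_len = {18: 3, 19: 2, 20: 3, 22: 3, 23: 1, 25: 1}   # overlap length in nt
oligos_by_len = {"< 49": 2, "59-60": 12}                       # oligo length in nt

total_codons = sum(codons_by_freq.values())
total_overlaps_tm = sum(overlaps_by_tm.values())
total_overlaps_len = sum(overlaps_by_len.values())
total_oligos = sum(oligos_by_len.values())

# Codons below the 10% threshold: the 0-4 and 5-9 rows are both zero in the table.
codons_below_threshold = 0

# Mean overlap length, weighted by the number of overlaps of each length.
mean_overlap = sum(n * k for n, k in overlaps_by_len.items()) / total_overlaps_len

print(total_codons, total_overlaps_tm, total_overlaps_len, total_oligos)
print(round(mean_overlap, 2))
```

The 13 overlaps are the junctions between 14 consecutive primers, and every overlap Tm lies at 59 or 60°C, consistent with the 60±1°C target described above.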
Cloning and analysis of the synthetic gene: PCR-based gene synthesis with the TBIO method
yielded a final product of the target gene 558 bp in length. Each step of the sequential PCR
produced DNA products of incremental length, ranging from around 100 bp up to the 558 bp final
product (Fig. 2). In all amplification steps, Pfu DNA polymerase was used to ensure the
accuracy of the DNA sequence amplified from the primers. However, Pfu DNA polymerase produces
only blunt-ended products. To allow cloning of the synthetic gene product into an 'A/T'
cloning vector, 'A' overhangs were added at both ends by incubating the product with Taq DNA
polymerase (at 72°C for 1 h) prior to ligation into the cloning plasmid (pTZ57R/T).
Transformation of the ligation product yielded E. coli (strain XL1-Blue) transformants
harboring the recombinant plasmid containing the putative synthetic gene sequence. Restriction
analysis (with XhoI and SalI) was performed on some of the positive clones obtained, and some
of them showed the correct DNA insert (Figs. 3 and 4).
[Gel images for Figs. 3 and 4: numbered lanes, a DNA size marker (250, 500, 750, 1000, 1500,
2000 and 3000 bp) and a band labeled CSF3syn.]
Fig. 3. Recombinant plasmid miniprep which contains the CSF3 synthetic gene. Lane 1: plasmid
without DNA insert (control); lanes 2-12: different clones of recombinant plasmids.
Fig. 4. Restriction analysis of recombinant plasmid TZ57R-CSF3syn with XhoI and SalI
restriction enzymes.