Bats are a major natural reservoir for hepaciviruses and pegiviruses
Phenix-Lan Quan (a,1), Cadhla Firth (a), Juliette M. Conte (a), Simon H. Williams (a), Carlos M. Zambrana-Torrelio (b), Simon J. Anthony (a,b), James A. Ellison (c), Amy T. Gilbert (c), Ivan V. Kuzmin (c,2), Michael Niezgoda (c), Modupe O. V. Osinubi (c), Sergio Recuenco (c), Wanda Markotter (d), Robert F. Breiman (e), Lems Kalemba (f), Jean Malekani (f), Kim A. Lindblade (g), Melinda K. Rostal (b), Rafael Ojeda-Flores (h), Gerardo Suzan (h), Lora B. Davis (i), Dianna M. Blau (j), Albert B. Ogunkoya (k), Danilo A. Alvarez Castillo (l), David Moran (l), Sali Ngam (m), Dudu Akaibe (n), Bernard Agwanda (o), Thomas Briese (a), Jonathan H. Epstein (b), Peter Daszak (b), Charles E. Rupprecht (c,3), Edward C. Holmes (p), and W. Ian Lipkin (a)
(a) Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York, NY 10032;
(b) EcoHealth Alliance, New York, NY 10001;
(c) Poxvirus and Rabies Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30333;
(d) Department of Microbiology and Plant Pathology, University of Pretoria, Pretoria 0002, South Africa;
(e) Centers for Disease Control and Prevention in Kenya, Nairobi, Kenya;
(f) University of Kinshasa, Kinshasa 11, Democratic Republic of the Congo;
(g) Centers for Disease Control and Prevention Guatemala, 01015, Guatemala City, Guatemala;
(h) Facultad de Medicina Veterinaria y Zootecnia, Universidad Nacional Autónoma de México, Ciudad Universitaria, 04510 México D. F., Mexico;
(i) Centers for Disease Control and Prevention Nigeria, Abuja, Nigeria;
(j) Infectious Diseases Pathology Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30333;
(k) Department of Veterinary Medicine, Ahmadu Bello University, Samaru, Zaria, Kaduna State, Nigeria;
(l) Center for Health Studies, Universidad del Valle de Guatemala, 01015, Guatemala City, Guatemala;
(m) Laboratoire National Vétérinaire, B.P. 503 Garoua, Cameroon;
(n) Centre de Surveillance de la Biodiversité, University of Kisangani, B.P. 2012 Kisangani, Democratic Republic of the Congo;
(o) Mammalogy Section, National Museums of Kenya, 00100 Nairobi, Kenya;
(p) Sydney Emerging Infections and Biosecurity Institute, School of Biological Sciences and Sydney Medical School, University of Sydney, Sydney, NSW 2006, Australia
Edited* by Harvey Alter, National Institutes of Health, Bethesda, MD, and approved March 25, 2013 (received for review February 19, 2013)
Although there are over 1,150 bat species worldwide, the diversity of viruses harbored by bats has only recently come into focus as a result of expanded wildlife surveillance. Such surveys are important for determining the potential for novel viruses to emerge in humans and for optimal management of bats and their habitats. To enhance our knowledge of the viral diversity present in bats, we initially surveyed 415 sera from African and Central American bats. Unbiased high-throughput sequencing revealed the presence of a highly diverse group of bat-derived viruses related to hepaciviruses and pegiviruses within the family Flaviviridae. Subsequent PCR screening of 1,258 bat specimens collected worldwide indicated the presence of these viruses also in North America and Asia. A total of 83 bat-derived viruses were identified, representing an infection rate of nearly 5%. Evolutionary analyses revealed that all known hepaciviruses and pegiviruses, including those previously documented in humans and other primates, fall within the phylogenetic diversity of the bat-derived viruses described here. The prevalence, unprecedented viral biodiversity, phylogenetic divergence, and worldwide distribution of the bat-derived viruses suggest that bats are a major and ancient natural reservoir for both hepaciviruses and pegiviruses and provide insights into the evolutionary history of hepatitis C virus and the human GB viruses.
Determining the natural reservoirs for emerging pathogens is critical for disease prediction and prevention. Nearly 60% of emerging infectious diseases in humans are zoonotic, with up to 70% originating from wildlife (1). Bats [order Chiroptera, suborders Yinpterochiroptera and Yangochiroptera (2, 3)] are a natural reservoir for many important zoonotic viruses that cause severe disease in humans, including lyssaviruses, severe acute respiratory syndrome (SARS)-related coronaviruses (SARS-related CoV), filoviruses, henipaviruses, and other paramyxoviruses (4–13). In addition, several viruses establish persistent viral infections in bats, in which apparent clinical signs appear only rarely (5). Bats possess unique characteristics that may contribute to their capacity to function as a major reservoir host for viruses, including long lifespan, high species diversity, unique immune systems, gregarious roosting behaviors, and high spatial mobility and population densities (5).
Although hepatitis C virus (HCV) was discovered more than 20 y ago, its origin is unknown, and, until recently, humans appeared to be its only natural host (14, 15). HCV was initially isolated from the serum of a person with non-A, non-B hepatitis and is now a leading cause of chronic liver disease, cirrhosis, and hepatocellular carcinoma, with ~3% of the global population infected (16, 17). HCV and its distant relative, GB virus B (GBV-B), belong to the genus Hepacivirus within the family Flaviviridae of single-stranded, positive-sense RNA viruses (18, 19). New World monkeys can be experimentally infected with GBV-B, resulting in clinical hepatitis (20). Recently, nonprimate hepaciviruses (NPHV) were discovered in domestic animals (dogs, horses), expanding the natural host range of these agents (21–23). A new flavivirus genus, Pegivirus, has been proposed for GB viruses GBV-A, GBV-C, and GBV-D (19). GBV-A viruses have been identified in nonhuman primates and are not known to infect humans, whereas GBV-C is frequently isolated from humans and chimpanzees (19). GBV-D was recently identified in Old World frugivorous bats (24). No pegiviruses have yet been shown to cause disease in their hosts (19).
Here, we report the discovery of a highly diverse group of bat-derived viruses related to hepaciviruses and pegiviruses. The prevalence, unprecedented viral biodiversity, and broad geographic distribution of these viruses, identified throughout the order Chiroptera, suggest that bats are a key natural reservoir for both hepaciviruses and pegiviruses.
Results
Discovery of Bat-Derived Hepaciviruses and Pegiviruses. We surveyed serum specimens from 415 apparently healthy bats captured from five different countries (Guatemala, Cameroon, Nigeria,
Author contributions: P.-L.Q. and W.I.L. designed research; P.-L.Q., J.M.C., and S.H.W. performed research; P.-L.Q., C.M.Z.-T., S.J.A., J.A.E., A.T.G., I.V.K., M.N., M.O.V.O., S.R., W.M., R.F.B., L.K., J.M., K.A.L., M.K.R., R.O.-F., G.S., L.B.D., D.M.B., A.B.O., D.A.A.C., D.M., S.N., D.A., B.A., J.H.E., P.D., and C.E.R. contributed new reagents/analytic tools; P.-L.Q., C.F., J.M.C., S.H.W., C.M.Z.-T., T.B., E.C.H., and W.I.L. analyzed data; and P.-L.Q., C.F., T.B., E.C.H., and W.I.L. wrote the paper.
The authors declare no conflict of interest.
*This Direct Submission article had a prearranged editor.
Data deposition: The sequences reported in this paper have been deposited in the GenBank database. For a list of accession numbers for the viruses identified in this study, see SI Appendix, Tables S6 and S7.
(1) To whom correspondence should be addressed. E-mail: pq2106@columbia.edu.
(2) Present address: Aravan, LLC, Lilburn, GA 30047.
(3) Present address: The Global Alliance for Rabies Control, Manhattan, KS 66502.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1303037110/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1303037110 PNAS Early Edition | 1 of 6
MICROBIOLOGY
Democratic Republic of the Congo, and Kenya) that represent 33 species, 26 genera, and seven families (SI Appendix, Tables S1 and S2). Aliquots of sera were combined into 43 pools for nucleic acid extraction; total RNA was randomly amplified and subjected to unbiased high-throughput sequencing (UHTS) (25). After removal of host sequences, each pool yielded between 87 and 3,098 unique reads and 1128 assembled contiguous sequences. The analysis of these sequence data at the nucleotide (nt) and amino acid (aa) levels revealed the presence of sequences with distant similarity to flaviviruses in 29 serum pools. Between 1 and 716 flavivirus sequences, ranging from 50 to 10,803 nt in length, were identified in the positive pools. The sequences showed aa sequence similarities ranging from 24% to 100% to members of the genus Hepacivirus or Pegivirus. Specific and degenerate primers targeting a conserved motif in the RNA-dependent RNA polymerase gene (RdRp or NS5B) were designed from the UHTS data and used to confirm the initial findings and then to screen all individual serum specimens in the positive pools. Using this approach, genetic material from a total of 58 bat-derived viruses was detected (SI Appendix, Tables S3 and S4).
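The two-stage approach described above (pooled analysis first, then individual testing of every specimen in a positive pool) can be illustrated with a simple assay-count calculation. This is only a sketch under stated assumptions: the helper names `split_into_pools` and `two_stage_test_count` are hypothetical, the pools are assumed to be as equal in size as possible, and the choice of which pools are positive is arbitrary, because the actual pool composition is not given in the main text.

```python
def split_into_pools(n_specimens: int, n_pools: int) -> list[int]:
    """Return pool sizes that distribute specimens as evenly as possible."""
    base, extra = divmod(n_specimens, n_pools)
    return [base + 1 if i < extra else base for i in range(n_pools)]


def two_stage_test_count(pool_sizes: list[int], positive_pools: set[int]) -> int:
    """Assays needed: one per pool, plus one per specimen in each positive pool."""
    retests = sum(pool_sizes[i] for i in positive_pools)
    return len(pool_sizes) + retests


# 415 sera combined into 43 pools, as reported in the text.
sizes = split_into_pools(415, 43)

# Assumption for illustration only: the first 29 pools are the positive ones.
total_assays = two_stage_test_count(sizes, set(range(29)))
```

Under these assumptions the two-stage design needs fewer assays than testing all 415 sera individually; the saving shrinks as the fraction of positive pools grows.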
To further explore the distribution of these viruses in other countries and bat species, an additional 1,166 sera/plasma specimens and 83 tissue specimens from 34 bats (kidney, liver, and lung), representing 40 species, 31 genera, and seven families of bats from Nigeria, Bangladesh, and Mexico, were screened (SI Appendix, Table S2). Twenty-three sera/plasma specimens and one lung specimen were positive. Additional specimens from 9 of the bats from Mexico with positive sera specimens, consisting of seven oral and two rectal swabs, were also tested. One positive rectal swab was detected (SI Appendix, Tables S3 and S4).
In total, 1,673 specimens, collected between 2007 and 2011 from 1,615 bats representing 58 species, 44 genera, and eight families in seven countries worldwide, were screened (SI Appendix, Table S2). The genetic material of bat-derived viruses was detected in six of the eight families of bats tested, representing all major phylogenetic lineages of the Chiroptera. A total of 78 sera/plasma specimens, one lung specimen, and one rectal swab tested positive,
Fig. 1. Bayesian phylogenetic tree of a 300-nt conserved region of the RdRp gene of the bat-derived viruses and selected members of the Hepacivirus and Pegivirus genera. The 83 bat-derived viruses identified in this study are shown along with the clade designations (A to K); an asterisk indicates viruses for which the near full-length genome was obtained. The countries of origin for the sequences generated in this study are indicated by the branch color; the host and the bat families associated with each clade are indicated next to the clade name. Bayesian posterior probabilities are shown only for nodes with significant support (>0.7). The scale bar indicates the average number of nucleotide substitutions per site. The virus names corresponding to the abbreviations and GenBank accession numbers are provided in SI Appendix, Table S10. C, canine hepacivirus; , experimentally infected New World monkeys.
representing 21 species, 20 genera, and six families of bats (SI Appendix, Tables S3, S4, and S5). Quantitative PCR assays indicated the presence of 10^3 to 10^8 RNA copies/mL of these viruses in the sera or plasma of infected bats (SI Appendix, Table S4). A total of 83 bat-derived viruses, which could potentially represent 22 novel viral species (3 within the Hepacivirus genus and 19 within the proposed Pegivirus genus), were identified (SI Appendix, Table S6).
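As a rough check on the reported infection rate of nearly 5%, the counts given in the text can be turned into a crude proportion with an uncertainty interval. The sketch below is illustrative and not part of the authors' analysis: it treats each of the 83 viruses as one infected bat (an approximation, since a few bats carried coinfections) and uses the Wilson score interval, a standard choice for binomial proportions that the paper itself does not mention.

```python
import math


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion k/n (z = 1.96 for ~95%)."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


viruses, bats = 83, 1615      # counts reported in the text
crude_rate = viruses / bats   # approximation: one virus counted as one infected bat
low, high = wilson_interval(viruses, bats)
```

Because coinfected bats are counted more than once here, the crude rate slightly overstates the fraction of infected animals.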
Genetic Diversity of Bat-Derived Hepaciviruses and Pegiviruses. Preliminary phylogenetic analysis of a conserved 300-nt region of the RdRp gene revealed the presence of a highly diverse group of viruses that clustered within the Hepacivirus and Pegivirus genera (Fig. 1). Phylogenetic trees based on the complete helicase and RdRp coding sequences of a representative subset of these viruses, along with previously known hepaci- and pegivirus sequences, supported the clustering of these two genera into 11 strongly supported clades [Bayesian posterior probability (BPP) values in all cases ≥0.9] (SI Appendix, Fig. S1). Importantly, these analyses show that, in both hepaciviruses and pegiviruses, all previously described viruses fall within the phylogenetic diversity of the bat-derived viruses described here, indicating that bats are a major and ancient reservoir for genetic diversity in both genera. Within the Hepacivirus genus, the viruses fell into three highly divergent clades composed solely of viruses from two species of African bats (Hipposideros vittatus, Otomops martiensseni) (BPP = 1.0). Clade A viruses were distinct from, but most closely related to, GBV-B, whereas clade C and D viruses occupied a basal position relative to the clades containing NPHV (clade E) and HCV (clade F). The bat-derived pegiviruses identified in this study formed three distinct lineages (BPP = 1.0). Viruses from clades G and K formed phylogenetically distinct clusters within the genus, whereas clade H viruses clustered with the previously identified GBV-D viruses (24). Notably, the African, North American, and Asian bat pegivirus lineages were all paraphyletic, suggesting a long history of diversification and cross-species transmission. Although we were unable to detect a history of recombination in our dataset, coinfections were identified in individuals from three species of African bats (Mops condylurus, O. martiensseni, and Taphozous sp.) and one species from Bangladesh (Pteropus giganteus), suggesting the potential for recombination to occur. Three coinfections involved two distinct pegiviruses from clades G or K, whereas, in the fourth case, a bat was infected with both a hepacivirus (clade C) and a pegivirus (clade K) (SI Appendix, Table S4).
Genomic Organization of Bat-Derived Hepaciviruses and Pegiviruses. Twenty near full-length genome sequences were generated, representing all six clades in which the bat hepaciviruses (BHVs) and bat pegiviruses (BPgVs) were identified (SI Appendix, Table S7). Similar to all other members of the Flaviviridae, the BHVs and BPgVs have a positive-sense, single-stranded RNA genome that contains a single large ORF encoding a polyprotein precursor (26). The ORF is flanked at the 5′ and 3′ ends by nontranslated regions (NTRs). The BHV and BPgV genome sequences comprise at least 8,916 nt and encode precursor polyproteins of 2,842–3,469 aa in length (Fig. 2, SI Appendix, Table S7). Conserved flavivirus protein domains consistent with structural and nonstructural proteins were recognized in the BHV and BPgV predicted polyproteins, enabling accurate prediction of the location and function of the encoded proteins (SI Appendix, Tables S8 and S9). A comparison of predicted protein products indicated that the envelope (E) and the nonstructural proteins NS2 and NS5A were most variable, with less than 34.5%, 33.5%, and 20.4% (hepaciviruses) and 48.3%, 59.2%, and 37.7% (pegiviruses) aa sequence identities, respectively. In contrast, NS3 and NS5B were the most conserved, with aa sequence identities in the range of 39.2–53.3% (NS3) and 35.3–46% (NS5B) for hepaciviruses and 43.8–78.7% (NS3) and 39.5–71.1% (NS5B) for pegiviruses (Datasets S1 and S2).
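The amino acid sequence identities quoted above are pairwise comparisons of aligned protein sequences. A minimal sketch of one common way to compute such a value follows. The function name `percent_identity`, the convention of ignoring gapped columns, and the toy sequences are all assumptions made for illustration; the main text does not state which identity definition or software the authors used.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration (not real NS3 sequences).
identity = percent_identity("GSGKST-KVP", "GSGKTTLRVP")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence) give different percentages, so values are comparable only when computed the same way.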
Among the 20 BHVs and BPgVs with complete polyprotein-coding sequences, 14 presented a genomic organization similar to that of hepaciviruses or pegiviruses (Fig. 2, SI Appendix, Table S7). Resembling HCV (27–30), the genome of the BHV PDB-829 in clade D contains a potential alternate reading frame (ARF) that overlaps the core protein-coding gene (SI Appendix, Fig. S2). The HCV ARF proteins (ARFPs) are derived either through ribosomal frameshift or through internal translation initiation (27, 30). The putative BHV PDB-829 ARF begins at the alternative GUG start codon located 41 nt upstream of the main C ORF and has a coding capacity of 156 aa (31). In contrast to HCV, there is no in-frame stop codon that terminates the BHV PDB-829 ARF at the start of the main C ORF (27, 31). Interestingly, the ARF identified in BHV PDB-829 is absent in the BHVs from clade C. Genomes of viruses from clade G presented a distinct genomic feature, a region of variable length referred to as the variable region (VR, 273–843 nt in length) located upstream of the predicted E1 (Fig. 3). No sequence homology to this region is observed in members of the Flaviviridae or in any other viral sequence present in the GenBank/European Molecular Biology Laboratory (EMBL) databases. The predicted aa coding sequences deduced from the VRs are basic (pI = 8.49–9.22) with two to four transmembrane regions. Predicted signal peptidase cleavage sites are identified 18–21 aa downstream of the putative initiation codon (Fig. 3). Further studies are necessary to determine whether the VR is part of E1 or encodes a separate protein. The translated envelope protein sequences (E1, E2) of the BHVs and BPgVs contain 5–9 potential N-linked glycosylation sites, compared with 10–16 and 3–7 sites in hepaciviruses and pegiviruses, respectively (SI Appendix, Tables S8, S9 and Fig. S3A). The BHVs contain 8–11 of the 18 E2 cysteine residues strictly conserved among HCV genotypes and shown to form nine
Fig. 2. Comparison of the genome organization and putative proteomic map of the bat-derived viruses with representative members of the Hepacivirus and Pegivirus genera. The putative genomic organization of the bat-derived viruses is shown for each clade and was predicted using sequence comparisons with known hepaciviruses and pegiviruses. The HCV genomic RNA contains a unique ORF that encodes a precursor polyprotein that is cleaved by viral and cellular proteases into structural proteins [core (C), envelope glycoproteins (E1, E2)], and nonstructural proteins (NS2, NS3, NS4A, NS4B, NS5A, and NS5B). The HCV ARFPs are translated in an alternate reading frame overlapping C and represented in orange. The structural and nonstructural proteins are shown in green and red, respectively; the region between E2 and NS2 is shown in light blue, and the variable region (VR) identified in viruses from clade G is represented by the dark blue jagged box. Arrowheads indicate putative host peptidase (purple) and viral NS2-NS3 (white) and NS3-NS4A (yellow) peptidase cleavage sites. A predicted protein of 7–28 kDa is present between E2 and NS2 in the BHVs and BPgVs. *, HCV p7 (41, 42) and GBV-B p13 (43); **, predicted proteins of 21 kDa for GBV-A and 6 kDa for GBV-C (19); ***, predicted proteins of 27–28 kDa; NTR, nontranslated region; , signal peptidase cleavage site predicted for GBV-C (19).
disulfide bridges essential for proper folding of the envelope glycoprotein (SI Appendix, Fig. S3 A and B) (32).
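Potential N-linked glycosylation sites such as those counted above are conventionally predicted from the sequon N-X-S/T, where X is any residue except proline. The sketch below shows one way to locate such sequons in a protein sequence. It is an illustration rather than the authors' method: the function name `n_glyc_sites` and the toy sequence are hypothetical, and a sequon match only indicates a potential site, not confirmed glycosylation.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead keeps matches zero-width past N, so overlapping sequons are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glyc_sites(protein: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T sequons (X is not P)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy sequence, invented for illustration (not a real envelope protein).
sites = n_glyc_sites("ANGTNPSNNSA")
```

In the toy sequence, the asparagine followed by proline is skipped, which is the reason for excluding proline at the X position.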
Discussion
Our study suggests that bats are a major natural reservoir for both hepaciviruses and pegiviruses. Hepaciviruses, formerly detected only in primates, were recently identified in horses and dogs (21–23). Pegiviruses have been found only in primates and in one species of Old World frugivorous bat (P. giganteus) (24). The viruses characterized in this study showed an unprecedented viral biodiversity and represent the most diverse repertoire of hepaciviruses and pegiviruses described to date (Fig. 1 and SI Appendix, Fig. S1). In addition, our phylogenetic analyses suggest that the bat-derived viruses are likely to occupy a basal position relative to the previously described hepaciviruses and pegiviruses, although a wider sampling of various mammalian species is required to determine whether bats represent the ultimate reservoir host for these viruses. The BHVs and BPgVs are widely distributed, are present on several continents of the New and Old World, and are identified in both suborders of bats (Yinpterochiroptera and Yangochiroptera) from 21 species, 20 genera, and six families, indicating a widespread circulation (Fig. 4). Nearly five percent of the bats studied here were infected with BHVs or BPgVs (0.6% and 4%, respectively; 0.4% were undetermined). All bats collected were apparently healthy despite the high levels of viremia detected, suggesting that BHVs and BPgVs may not be pathogenic in their hosts.
A long evolutionary history of these viruses in bats is further supported by the distribution across the different taxonomic groups of bats and geographic areas in which the viruses were found. The three distinct clades of BHVs that were all identified in Kenyan specimens were each associated with common insectivorous African bats (O. martiensseni and H. vittatus) of the distantly related families Molossidae and Hipposideridae (Fig. 1). An additional BHV of undetermined clade was also identified in an insectivorous Scotoecus sp. (Family Vespertilionidae) bat from Kenya (SI Appendix, Table S4). The BPgVs clustered in three distinct clades, all of which were paraphyletic, indicating a long history of diversification and cross-species transmission. Whereas the viruses from clades G and H were separated by deep branches associated with distinct bat species (with the exception of PDB-303 and PDB-1479), the clusters in clade K were associated with multiple bat species from different genera and families, representing Yinpterochiroptera and Yangochiroptera bats. Moreover, within each cluster, the closely related BPgVs originated from different countries or even continents as well as from taxonomically distinct bat hosts. Transmission of viruses between bats may occur through shedding in excreta, as suggested by the identification of BPgV in feces, via mating, or during gestation or parturition. However, the mechanism by which these viruses (or their ancestors) moved between continents remains to be elucidated. Further studies, especially in Asian bats, are likely to
Fig. 3. Schematic representation of the genomic organization of viruses from clade G. (A) Overall genomic organization and expanded diagram of the predicted variable region (VR). The putative gene products after polyprotein cleavage are indicated in green for the structural proteins, red for the nonstructural proteins, light blue for the region analogous to the HCV-p7, and dark blue for the variable region (VR). ***, predicted proteins of 27–28 kDa. (B) Amino acid sequence alignment of the predicted VR of viruses from clade G. Putative AUG initiator codons were chosen based on the presence of predicted signal peptidase cleavage sites identified 18–21 aa downstream as indicated in red open box. Predicted signal peptidase cleavage sites between VR and E1 are shown in blue open box. Strictly conserved and similar aa are indicated by black or gray shades, respectively.
provide additional information on the evolutionary history and geographical distribution of these viruses.
If bats are currently or historically implicated in transmission of hepaci- and pegiviruses to other species, potential routes for transmission include consumption of fresh bat bodies, contamination of food with bat excreta, direct exposure to bat blood or excreta, or infection via intermediate hosts. Such mechanisms were identified for Nipah virus (NiV), where humans become exposed through contact with infected pigs, consumption of date palm sap contaminated with infected bat excreta, or direct exposure to NiV-infected blood/excreta from the Indian flying fox, P. giganteus (33). In the case of SARS, the emergence of SARS-CoV in the human population probably occurred from bats to humans via palm civets present in Chinese wet markets (34–36). Whatever the mechanism of interspecies transmission, the distinct bat-derived pegivirus lineages identified in six bat families from Africa, Asia, and Central America and their widespread distribution indicate a much older association of pegiviruses with bats than with any human or nonhuman primate host. Similarly, whereas earlier studies have suggested that the evolution of both GBV-A and GBV-C is compatible with virus–host codivergence within primates (37–39), our phylogenetic analysis reveals multiple cross-species transmissions (both within bats and likely among other mammalian species) over an unknown time-scale. Therefore, the evolutionary history of the hepaciviruses and pegiviruses is clearly more complex than previously thought.
Taken together, our data indicate that bats are a major natural reservoir for hepaciviruses and pegiviruses. Although our current knowledge does not allow us to assess the true prevalence, host range, or geographical distributions of BHVs and BPgVs, it does reveal that these viruses are highly genetically diverse and are circulating in major lineages of Chiroptera in both the Old and New World. As such, our findings shed light on the possible evolutionary history of HCV and human GB viruses, which in turn opens avenues for investigating cross-species transfer and zoonotic potential. Additional field surveillance for these hepaci- and pegiviruses in other mammalian taxa will be necessary to unveil the diversity, biogeography, epizootiology, and natural history of these viruses, as well as the mechanisms that drive spill-over infections and host shifts.
Materials and Methods
Bat Sample Collection. Bats were captured at multiple sites in seven countries (SI Appendix, Table S1). Animals were collected with mist netting or hoop nets in caves and around human dwellings or from roost locations and foraging sites. All bats captured were apparently healthy. The species of each animal was identified by field biologists and recorded, as well as the sex and morphometric measurements (forearm, body lengths, and weight). Bats captured in Bangladesh and Mexico were anesthetized using isoflurane or manually restrained, respectively. Animals were safely released at the site of capture following sample collection. Oral and rectal swabs and blood were collected (when possible) from each animal. Bats sampled in Africa and Guatemala were anesthetized by intramuscular inoculation with ketamine hydrochloride (0.05–0.1 mg/g of body weight) and euthanized under sedation by intracardiac exsanguination and cervical dislocation. Animals were necropsied by aseptic technique for tissue collection, and tissues were immediately stored on dry ice or in liquid nitrogen in the field, for final storage at −80 °C. One hundred microliters to 3 mL of blood were collected and placed immediately into serum separator tubes (BD vacutainer or microtainer; Becton Dickinson). Serum was kept on ice, separated by centrifugation, and immediately frozen on dry ice or in liquid nitrogen. The capture and sample collection protocols were approved by the Institutional Animal Care and Use Committee of the Centers for Disease Control and Prevention (Atlanta, GA), the University of California (Davis, CA; protocol 16048), and the Tufts New England Medical Center (Boston, MA; protocol G2907) as well as by local authorities in the countries of sampling.
Samples and High-Throughput Sequencing. Aliquots of sera from bat specimens collected in Africa and Guatemala were combined into 43 pools for nucleic acid extraction using the NucliSENS easyMAG (bioMérieux). Total nucleic acid was treated with DNase-I before random amplification and pyrosequencing (25). After assembly (Newbler v2.3; 454 Life Sciences), reads were clustered and assembled into contiguous fragments for comparison with the GenBank database of nucleic acids and proteins via BLAST (40).
Screening and Quantitative PCR. All bat specimens were initially screened for viral genetic material in pools of 8–10 specimens. BHV and BPgV screening was performed using four nested PCR assays amplifying a 300-nt conserved fragment of the RdRp gene (NS5B) (corresponding to the nt positions 8127–8483 in the HCV-1a genome, GenBank accession number M62321). Consensus primers (SI Appendix, Table S11) were designed by multiple alignments of NS5B flavivirus nucleotide sequences identified in the unbiased high-throughput sequencing (UHTS) data. Thereafter, individual specimens from each positive pool were screened using the same set of primers. Additionally, specific primers were designed for all sera pools for which flavivirus sequences were identified in the UHTS data but were negative by NS5B consensus PCRs. Reverse transcription was performed using SuperScript II RT (Invitrogen). PCR primers were applied at 0.2 μM concentrations with 1.5 μL of cDNA (diluted 1:5) and HotStar polymerase (Qiagen). First round cycle conditions were: 95 °C for 7 min; 15 cycles at 95 °C for 30 s, 65 °C for 35 s (−1 °C per cycle), and 72 °C for 40 s; 35 cycles at 95 °C for 30 s, 48 °C for 30 s, and 72 °C for 40 s; and 1 cycle at 72 °C for 7 min. Second round cycling conditions were: 95 °C for 7 min; 35
Fig. 4. Geographic distribution of the bat-derived hepaciviruses and pegiviruses from specimens collected between 2007 and 2011. The proportion of the different clades of bat-derived hepaciviruses and pegiviruses is represented in pie charts for each country. Numbers of samples were log converted. The number of samples tested is indicated in parentheses. Map and pie charts were produced using the statistical software R (44).
cycles at 95 °C for 30 s, 48 °C for 30 s, and 72 °C for 40 s; and 1 cycle at 72 °C for
7 min. All PCR products were sequenced in both directions to confirm the
presence of viral genetic material in samples. Quantitative SYBR green PCR
assays were developed to determine the BHV or BPgV genome copy number.
Standards were prepared for each assay by cloning the NS5B amplicon from
representative viruses (see SI Appendix, Table S12 for primer sequences).
Quantitative PCR (qPCR) reactions were cycled on the 7500 Fast Real-Time
PCR System (Applied Biosystems) under the following conditions: 50 °C for 2
min, 95 °C for 10 min, then 45 cycles of 95 °C for 15 s and 60 °C for 60 s.
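The first-round cycling conditions described above can be written out as an explicit step list. The following is a minimal sketch only: it assumes that the "1 °C per cycle" step is a decrement of the annealing temperature (a touchdown profile starting at 65 °C), and the function name, step labels, and tuple layout are hypothetical conveniences rather than part of the published protocol. Ramp times are ignored.

```python
def first_round_program(start_anneal=65.0, step=1.0, touchdown_cycles=15,
                        fixed_cycles=35, fixed_anneal=48.0):
    """Return the first-round PCR program as (label, temperature_C, seconds) steps.

    Assumption: the annealing temperature drops by `step` degrees after each
    of the first `touchdown_cycles` cycles (65, 64, ..., 51 by default).
    """
    program = [("initial denaturation", 95.0, 7 * 60)]
    for cycle in range(touchdown_cycles):
        anneal = start_anneal - step * cycle
        program.append(("denature", 95.0, 30))
        program.append(("touchdown anneal", anneal, 35))
        program.append(("extend", 72.0, 40))
    for _ in range(fixed_cycles):
        program.append(("denature", 95.0, 30))
        program.append(("anneal", fixed_anneal, 30))
        program.append(("extend", 72.0, 40))
    program.append(("final extension", 72.0, 7 * 60))
    return program


if __name__ == "__main__":
    steps = first_round_program()
    hold_seconds = sum(seconds for _, _, seconds in steps)
    print(f"{len(steps)} steps, {hold_seconds / 60:.1f} min of hold time")
```

Under this reading, the last touchdown cycle anneals at 51 °C, just above the 48 °C used for the remaining 35 cycles.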
Genome Sequencing and Phylogenetic and Evolutionary Analyses. Details are
provided in SI Appendix, SI Materials and Methods.
GenBank accession numbers for the viruses identified in this study are
provided in SI Appendix, Tables S6 and S7.
ACKNOWLEDGMENTS. We thank the King and Chiefs of the Idanre community,
Ondo State, Nigeria; the Federal Ministry of Health, Abuja,
Nigeria; S. Wuyah, M. Lawal, M. Ari, A. Mohammed, I. Onaja, and G. Kia;
the Faculty of Veterinary Medicine, Ahmadu Bello University (ABU), Zaria,
Nigeria; and the Vice Chancellor and Management of ABU. We thank Luis
Escobar, Alejandra Estevez, María Rene López, Ramon Medrano, Maria
E. Morales, and María Luisa Muller (Center for Health Studies, Universidad
del Valle de Guatemala) and Julio Martinez (Guatemala Ministry of
Agriculture-Animal Health Department). We thank Sophronia Yu, Edwin Danga,
Evelyne Mulama, Solomon Gikundi, Leonard Nderitu, and Eric Ogola (Centers
for Disease Control-Kenya), Rafael Ciraiz (Guatemala Ministry of Public
Health and Social Assistance), and Arif Islam (from Bangladesh) for excellent
technical assistance and logistics. This work was supported by National
Institutes of Health Grants AI051292 and AI57158 (Northeast Biodefense Center)
(to W.I.L.), National Institute of Allergy and Infectious Diseases Grant
5R01AI079231-02, US Agency for International Development PREDICT Grant
GHNA 0009 0001 000, and an award from the US Department of Defense.
This study was also supported by the Emerging Pandemic Threats Program of
the US Agency for International Development, the National Center for
Emerging and Zoonotic Infectious Diseases, the Centers for Disease Control
and Prevention (Atlanta, GA), and Technical Support Corps funds from the
Global Disease Detection Program of the Centers for Disease Control and
Prevention (Atlanta, GA).
1. Jones KE, et al. (2008) Global trends in emerging infectious diseases. Nature 451(7181):990–993.
2. Teeling EC, et al. (2005) A molecular phylogeny for bats illuminates biogeography and the fossil record. Science 307(5709):580–584.
3. Zhang G, et al. (2013) Comparative analysis of bat genomes provides insight into the evolution of flight and immunity. Science 339(6118):456–460.
4. Newman SH, Field HE, de Jong CE, Epstein JH, eds (2011) Investigating the Role of Bats in Emerging Zoonoses: Balancing Ecology, Conservation and Public Health Interests (FAO Animal Production and Health Manual No. 12) (Food and Agriculture Organization of the United Nations, Rome).
5. Calisher CH, Childs JE, Field HE, Holmes KV, Schountz T (2006) Bats: Important reservoir hosts of emerging viruses. Clin Microbiol Rev 19(3):531–545.
6. Chua KB, et al. (2000) Nipah virus: A recently emergent deadly paramyxovirus. Science 288(5470):1432–1435.
7. Drexler JF, et al. (2012) Bats host major mammalian paramyxoviruses. Nat Commun 3:796.
8. Lau SK, et al. (2005) Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats. Proc Natl Acad Sci USA 102(39):14040–14045.
9. Leroy EM, et al. (2005) Fruit bats as reservoirs of Ebola virus. Nature 438(7068):575–576.
10. Li W, et al. (2005) Bats are natural reservoirs of SARS-like coronaviruses. Science 310(5748):676–679.
11. Luis AD, et al. (2013) A comparison of bats and rodents as reservoirs of zoonotic viruses: Are bats special? Proc R Soc B 280(1756):20122753.
12. Murray K, et al. (1995) A morbillivirus that caused fatal disease in horses and humans. Science 268(5207):94–97.
13. Yob JM, et al. (2001) Nipah virus infection in bats (order Chiroptera) in peninsular Malaysia. Emerg Infect Dis 7(3):439–441.
14. Alter HJ (1989) Discovery of the non-A, non-B hepatitis virus: The end of the beginning or the beginning of the end. Transfus Med Rev 3(2):77–81.
15. Choo QL, et al. (1989) Isolation of a cDNA clone derived from a blood-borne non-A, non-B viral hepatitis genome. Science 244(4902):359–362.
16. Ray Kim W (2002) Global epidemiology and burden of hepatitis C. Microbes Infect 4(12):1219–1225.
17. Simmonds P (2004) Genetic diversity and evolution of hepatitis C virus – 15 years on. J Gen Virol 85(Pt 11):3173–3188.
18. Moradpour D, Penin F, Rice CM (2007) Replication of hepatitis C virus. Nat Rev Microbiol 5(6):453–463.
19. Stapleton JT, Foung S, Muerhoff AS, Bukh J, Simmonds P (2011) The GB viruses: A review and proposed classification of GBV-A, GBV-C (HGV), and GBV-D in genus Pegivirus within the family Flaviviridae. J Gen Virol 92(Pt 2):233–246.
20. Bukh J, Apgar CL, Govindarajan S, Purcell RH (2001) Host range studies of GB virus-B hepatitis agent, the closest relative of hepatitis C virus, in New World monkeys and chimpanzees. J Med Virol 65(4):694–697.
21. Burbelo PD, et al. (2012) Serology-enabled discovery of genetically diverse hepaciviruses in a new host. J Virol 86(11):6171–6178.
22. Lyons S, et al. (2012) Nonprimate hepaciviruses in domestic horses, United Kingdom. Emerg Infect Dis 18(12):1976–1982.
23. Kapoor A, et al. (2011) Characterization of a canine homolog of hepatitis C virus. Proc Natl Acad Sci USA 108(28):11608–11613.
24. Epstein JH, et al. (2010) Identification of GBV-D, a novel GB-like flavivirus from old world frugivorous bats (Pteropus giganteus) in Bangladesh. PLoS Pathog 6:e1000972.
25. Palacios G, et al. (2008) A new arenavirus in a cluster of fatal transplant-associated diseases. N Engl J Med 358(10):991–998.
26. Simmonds P, et al. (2012) in Virus Taxonomy: Ninth Report of the International Committee on Taxonomy of Viruses, eds King AMQ, Lefkowitz E, Adams MJ, Carstens EB (Academic, New York), pp 1003–1020.
27. Branch AD, Stump DD, Gutierrez JA, Eng F, Walewski JL (2005) The hepatitis C virus alternate reading frame (ARF) and its family of novel products: The alternate reading frame protein/F-protein, the double-frameshift protein, and others. Semin Liver Dis 25(1):105–117.
28. Walewski JL, Keller TR, Stump DD, Branch AD (2001) Evidence for a new hepatitis C virus antigen encoded in an overlapping reading frame. RNA 7(5):710–721.
29. Xu Z, et al. (2001) Synthesis of a novel hepatitis C virus protein by ribosomal frameshift. EMBO J 20(14):3840–3848.
30. Vassilaki N, Mavromara P (2009) The HCV ARFP/F/core+1 protein: Production and functional analysis of an unconventional viral product. IUBMB Life 61(7):739–752.
31. Ina Y, Mizokami M, Ohba K, Gojobori T (1994) Reduction of synonymous substitutions in the core protein gene of hepatitis C virus. J Mol Evol 38(1):50–56.
32. Krey T, et al. (2010) The disulfide bonds in glycoprotein E2 of hepatitis C virus reveal the tertiary organization of the molecule. PLoS Pathog 6(2):e1000762.
33. Luby SP, et al. (2006) Foodborne transmission of Nipah virus, Bangladesh. Emerg Infect Dis 12(12):1888–1894.
34. Guan Y, et al. (2003) Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China. Science 302(5643):276–278.
35. Lau SK, et al. (2010) Ecoepidemiology and complete genome comparison of different strains of severe acute respiratory syndrome-related Rhinolophus bat coronavirus in China reveal bats as a reservoir for acute, self-limiting infection that allows recombination events. J Virol 84(6):2808–2819.
36. Song HD, et al. (2005) Cross-host evolution of severe acute respiratory syndrome coronavirus in palm civet and human. Proc Natl Acad Sci USA 102(7):2430–2435.
37. Charrel RN, De Micco P, de Lamballerie X (1999) Phylogenetic analysis of GB viruses A and C: Evidence for cospeciation between virus isolates and their primate hosts. J Gen Virol 80(Pt 9):2329–2335.
38. Patel MR, Loo YM, Horner SM, Gale M, Jr., Malik HS (2012) Convergent evolution of escape from hepaciviral antagonism in primates. PLoS Biol 10(3):e1001282.
39. Sharp PM, Simmonds P (2011) Evaluating the evidence for virus/host co-evolution. Curr Opin Virol 1(5):436–441.
40. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410.
41. Griffin SD, et al. (2003) The p7 protein of hepatitis C virus forms an ion channel that is blocked by the antiviral drug, Amantadine. FEBS Lett 535(1-3):34–38.
42. Lin C, Lindenbach BD, Prágai BM, McCourt DW, Rice CM (1994) Processing in the hepatitis C virus E2-NS2 region: Identification of p7 and two distinct E2-specific products with different C termini. J Virol 68(8):5063–5073.
43. Takikawa S, et al. (2006) Functional analyses of GB virus B p13 protein: Development of a recombinant GB virus B hepatitis virus with a p7 protein. Proc Natl Acad Sci USA 103(9):3345–3350.
44. R Core Team (2012) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria). Available at www.R-project.org/.
6 of 6 | www.pnas.org/cgi/doi/10.1073/pnas.1303037110 Quan et al.


Supporting Information

Bats are a major natural reservoir for hepaciviruses and
pegiviruses



Phenix-Lan Quan(a,1), Cadhla Firth(a), Juliette M. Conte(a), Simon H. Williams(a),
Carlos M. Zambrana-Torrelio(b), Simon J. Anthony(a,b), James A. Ellison(c),
Amy T. Gilbert(c), Ivan V. Kuzmin(c,2), Michael Niezgoda(c), Modupe O. V. Osinubi(c),
Sergio Recuenco(c), Wanda Markotter(d), Robert F. Breiman(e), Lems Kalemba(f),
Jean Malekani(f), Kim A. Lindblade(g), Melinda K. Rostal(b), Rafael Ojeda-Flores(h),
Gerardo Suzan(h), Lora B. Davis(i), Dianna M. Blau(j), Albert B. Ogunkoya(k),
Danilo A. Alvarez Castillo(l), David Moran(l), Sali Ngam(m), Dudu Akaibe(n),
Bernard Agwanda(o), Thomas Briese(a), Jonathan H. Epstein(b), Peter Daszak(b),
Charles E. Rupprecht(c,3), Edward C. Holmes(p), W. Ian Lipkin(a)




(a) Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York, NY 10032;
(b) EcoHealth Alliance, New York, NY 10001;
(c) Poxvirus and Rabies Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30333;
(d) Department of Microbiology and Plant Pathology, University of Pretoria, Pretoria 0002, South Africa;
(e) Centers for Disease Control and Prevention in Kenya, Nairobi, Kenya;
(f) University of Kinshasa, Kinshasa 11, Democratic Republic of the Congo;
(g) Centers for Disease Control and Prevention Guatemala, 01015, Guatemala City, Guatemala;
(h) Facultad de Medicina Veterinaria y Zootecnia, Universidad Nacional Autónoma de México, Ciudad Universitaria, 04510, México D.F., Mexico;
(i) Centers for Disease Control and Prevention Nigeria, Abuja, Nigeria;
(j) Infectious Diseases Pathology Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30333;
(k) Department of Veterinary Medicine, Ahmadu Bello University, Samaru, Zaria, Kaduna State, Nigeria;
(l) Center for Health Studies, Universidad del Valle de Guatemala, 01015, Guatemala City, Guatemala;
(m) Laboratoire National Vétérinaire, B.P. 503 Garoua, Cameroon;
(n) Centre de Surveillance de la Biodiversité, University of Kisangani, B.P. 2012 Kisangani, Democratic Republic of the Congo;
(o) Mammalogy Section, National Museums of Kenya, 00100 Nairobi, Kenya;
(p) Sydney Emerging Infections and Biosecurity Institute, School of Biological Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW 2006, Australia.

Current affiliations: (2) Aravan, LLC, Lilburn, GA 30047; (3) The Global Alliance for Rabies Control, Manhattan, KS 66502, USA.


SI Materials and Methods

Genome sequencing. Sequences identified in the unbiased high-throughput
sequencing (UHTS) data with similarities to flaviviruses were assembled against
prototype viruses from the Hepacivirus or the Pegivirus genera. PCR primers for
amplification across sequence gaps were designed based on the UHTS data and the
draft genome was sequenced by overlapping PCR products. Additional methods applied
to obtain the genome sequence included primer walking using specific and consensus
primers and 3′ and 5′ RACE kit (Clontech). Products were purified (Qiagen) and directly
dideoxy sequenced in both directions with ABI Prism BigDye Terminator 1.1 cycle
sequencing kits (Applied Biosystems).

Phylogenetic Analyses. To determine the phylogenetic relationships of the BHVs and
BPgVs within the Hepacivirus and Pegivirus genera, three data sets were assembled: (i)
a nucleotide data set containing all partial NS5B genes obtained in this study along with
representatives of known hepaci/pegiviruses available on GenBank (300 nt, N=116); (ii)
an amino acid data set of the complete NS3 genes from representatives from each clade
of bat viruses along with all known hepaci/pegiviruses from GenBank (641 aa, N=58);
and (iii) an amino acid data set containing complete NS5B genes of the same viruses as
above (629 aa, N=58). Each sequence alignment was manually created using the
program Se-Al v2.0a11 (http://tree.bio.ed.ac.uk/software/) and assessed for the
presence of recombination using the RDP, GENECONV, Chimaera, MaxChi, BootScan,
SiScan, 3Seq, and LARD algorithms within the RDP3 software package, as well as the SBP
and GARD methods available in the datamonkey analysis package (1-3). For each data
set, phylogenetic trees were inferred using MrBayes v3.2 assuming the HKY85 model of
nucleotide substitution with four categories of gamma-distributed rate variation, or the
WAG model of amino acid substitution (4). Two independent Markov chain Monte Carlo
(MCMC) runs were performed for each data set for 10 million generations, with trees
sampled every 5000 generations. Each run was terminated after convergence was
reached (standard deviation of the split frequencies <0.01). A consensus tree was
created for each data set by summarizing all trees from both runs after a 10% burn-in
was removed from each.
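The convergence criterion above (standard deviation of the split frequencies <0.01) compares how often each bipartition appears in the two independent runs. The following is a minimal sketch of that diagnostic, not MrBayes code: the function name and the dictionary input are hypothetical, and it assumes that only splits reaching a frequency of at least 0.10 in at least one run are compared and that the sample standard deviation is used.

```python
from statistics import stdev


def average_sd_split_frequencies(run_freqs, min_freq=0.10):
    """Average standard deviation of split frequencies across MCMC runs.

    run_freqs: one dict per run, mapping a split label to the fraction of
    sampled (post burn-in) trees that contain that split. At least two runs
    are required.
    """
    all_splits = set().union(*run_freqs)
    # Assumption: ignore splits that are rare in every run.
    kept = [s for s in all_splits
            if max(run.get(s, 0.0) for run in run_freqs) >= min_freq]
    if not kept:
        return 0.0
    deviations = [stdev([run.get(s, 0.0) for run in run_freqs]) for s in kept]
    return sum(deviations) / len(deviations)


if __name__ == "__main__":
    # Toy split frequencies for two runs; the labels are placeholders.
    run1 = {"AB|CD": 0.90, "AC|BD": 0.05}
    run2 = {"AB|CD": 0.80, "AC|BD": 0.15}
    print(round(average_sd_split_frequencies([run1, run2]), 4))
```

In the toy input the two runs disagree by 0.10 on both splits, so the value is far above the 0.01 cutoff and the runs would not be treated as converged.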
Pairwise percent amino acid similarities based on the complete NS3 and NS5B gene
alignments for all taxa were calculated using the program Geneious v6.0.4
(www.geneious.com). Major clades in each phylogeny (denoted clades A through K)
were defined as containing sequences that were ≥68% similar at the amino acid level in
both genes. Within the Hepacivirus genus, viral species were defined based on the
pairwise percent amino acid similarity within HCV, the only species classified by the
International Committee on Taxonomy of Viruses (ICTV) (ICTV 9th report, 2012). All
novel sequences that were ≥70.3% similar at the amino acid level were considered to be
a single species. To date, no species within the proposed Pegivirus genus have been
recognized by the ICTV (ICTV 9th report, 2012). However, the genetic diversity between
two putative and closely related species (84.3% between GBV-Ctro and GBV-C) was
used here as a standard to delineate species within the Pegivirus genus (5).
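The threshold-based grouping described above can be illustrated with a short sketch. It is not the Geneious calculation: it assumes that similarity is measured as percent identity over aligned, non-gap positions, that the thresholds are read as "at least", and that sequences are joined transitively (single linkage). The function names and toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identical residues over aligned columns where neither has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def group_by_threshold(seqs, threshold):
    """Group aligned sequences whose pairwise identity is >= threshold."""
    names = list(seqs)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if percent_identity(seqs[a], seqs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


if __name__ == "__main__":
    # Toy aligned fragments, not real NS3 or NS5B sequences.
    toy = {"v1": "MKTAYIAKQR", "v2": "MKTAYIAKQL", "v3": "MLSGWVCRQE"}
    print(group_by_threshold(toy, 70.3))
```

With the toy input, v1 and v2 are 90% identical and fall into one group at the 70.3% cutoff, whereas v3 (20% identical to both) remains separate.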
Protein family analysis was performed using Pfam (http://pfam.sanger.ac.uk/).
Predictions of signal peptide cleavage sites, glycosylation sites, and transmembrane
domains were performed using respective prediction servers available at the Center for
Biological Sequence Analysis (http://www.cbs.dtu.dk/services/ and
http://bioinf.cs.ucl.ac.uk/).


Table S1. Bat sampling site information
Global positioning system (GPS) coordinates for each site, collection date and number of bats sampled. -: not applicable; N: north; E: east; S: south; W: west; ND: no data available; DRC: Democratic Republic of the Congo.

Country     Site                                     GPS latitude  GPS longitude  Earliest collection date  Latest collection date  Total bats
                                                                                  (month/year)              (month/year)
Cameroon    Caves                                    09.27322 N    013.50449 E    04/2010                   -                       3
Cameroon    Lanavet, Quarter D                       09.25280 N    013.45507 E    04/2010                   -                       2
Cameroon    Maya oulu                                09.67698 N    013.61771 E    04/2010                   -                       8
Cameroon    Ngong                                    09.02397 N    013.51726 E    04/2010                   -                       9
DRC         Kinshasa, school (1)                     04.26334 S    015.18567 E    07/2011                   -                       11
DRC         Kinshasa, school (2)                     04.20379 S    015.15902 E    07/2011                   -                       4
DRC         Kisangani, Cimestan                      00.29603 N    025.14575 E    07/2011                   -                       2
DRC         Kisangani, Mayele Island                 00.29678 N    025.12772 E    07/2011                   -                       15
DRC         Kisangani, rain forest next to Layoko    00.17645 N    025.17334 E    07/2011                   -                       8
DRC         Kisangani, rain forest next to Masako    00.37209 N    025.14443 E    07/2011                   -                       3
DRC         Kisantu church                           05.12394 S    015.08354 E    07/2011                   -                       1
Kenya       Asembo, church and school                00.08123 S    034.21230 E    06/2010                   -                       5
Kenya       Gilgil mine                              00.43109 S    036.17384 E    07/2010                   -                       1
Kenya       Kisii                                    00.67925 S    034.77254 E    08/2011                   -                       8
Kenya       Malindi                                  03.21569 S    040.11980 E    06/2010                   -                       3
Kenya       Mount Elgon, Kitum cave                  01.01965 N    034.45462 E    06/2010                   -                       3
Kenya       Mount Elgon, Makingeni cave              01.02482 N    034.44730 E    06/2010                   -                       9
Kenya       Shimoni cave                             04.38832 S    039.22820 E    06/2010                   08/2010                 16
Kenya       Suswa cave                               01.07915 S    036.24279 E    06/2010                   -                       15
Kenya       Three caves                              04.36841 S    039.21245 E    06/2010                   08/2010                 69
Kenya       Tsavo East, Ndololo camp                 03.36083 S    038.64574 E    06/2010                   -                       5
Kenya       Watamu                                   03.21071 S    040.00915 E    06/2010                   -                       4
Kenya       Site not specified                       ND            ND             ND                        ND                      2
Nigeria     College of Agriculture                   07.14795 N    005.11205 E    09/2010                   -                       51
Nigeria     Idanre cave                              07.13444 N    005.13222 E    06/2008                   09/2010                 95
Nigeria     Rockefeller                              11.17177 N    007.63211 E    06/2008                   -                       15
Nigeria     Site not specified                       ND            ND             ND                        ND                      7
Guatemala   Agüero                                   14.18893 N    091.05343 W    2010                      -                       48
Guatemala   Aldea Salacuim, Coban                    15.84441 N    090.72086 W    2008                      2009                    14
Guatemala   Los Hilos                                13.97765 N    090.27381 W    2009                      -                       1
Guatemala   Montanas Azules                          14.41379 N    091.06316 W    2010                      -                       12
Mexico      A01                                      16.12055 N    90.93166 W     07/2010                   09/2010                 3
Mexico      A02                                      16.13486 N    90.90363 W     07/2010                   10/2010                 19
Mexico      A03                                      16.10610 N    90.98520 W     07/2010                   01/2011                 15
Mexico      A04                                      16.13486 N    90.92220 W     07/2010                   -                       2
Mexico      A05                                      16.11970 N    90.92530 W     07/2010                   09/2010                 29
Mexico      A06                                      16.15342 N    90.89575 W     10/2010                   01/2011                 24
Mexico      A07                                      16.09928 N    90.92489 W     07/2010                   10/2010                 36
Mexico      A08                                      16.11490 N    90.94080 W     09/2010                   -                       4
Mexico      A09                                      16.28072 N    90.83722 W     01/2011                   -                       10
Mexico      A10                                      16.08397 N    90.96814 W     01/2011                   -                       8
Mexico      A11                                      16.10528 N    91.01492 W     01/2011                   -                       6
Mexico      B01                                      18.40933 N    89.89911 W     08/2010                   -                       31
Mexico      B02                                      18.31625 N    89.85644 W     08/2010                   10/10                   27
Mexico      B03                                      18.18547 N    89.74688 W     08/2010                   -                       14
Mexico      B05                                      18.63419 N    90.16111 W     08/2010                   10/10                   19
Mexico      B06                                      18.52308 N    89.59913 W     08/2010                   10/10                   21
Mexico      B07                                      18.52281 N    89.82363 W     08/2010                   -                       9
Mexico      B08                                      18.39761 N    89.44280 W     10/2010                   -                       12
Mexico      C01                                      19.29615 N    99.10335 W     07/2010                   08/2010                 25
Mexico      C02                                      19.29660 N    99.09329 W     ND                        ND                      1
Mexico      C03                                      19.55055 N    99.46972 W     ND                        ND                      1
Mexico      C04                                      19.31901 N    99.19352 W     ND                        ND                      2
Mexico      C05                                      19.35972 N    98.57666 W     09/2010                   -                       5
Mexico      Site not specified                       ND            ND             07/2010                   09/2010                 41
Bangladesh  Faridipur                                ND            ND             07/2007                   06/2010                 612
Bangladesh  Khulna                                   ND            ND             01/2009                   -                       50
Bangladesh  Kushtia                                  ND            ND             08/2007                   -                       100
Bangladesh  Sylket                                   ND            ND             09/2008                   -                       40

Total bats by family: Emballonuridae 3; Hipposideridae 84; Miniopteridae 1; Molossidae 66; Mormoopidae 21; Phyllostomidae 362; Pteropodidae 1022; Vespertilionidae 40; Family not specified 16; Total 1615.
[Per-site counts by bat family are not reproduced.]
Table S2. Bat specimen collection by country
Sm: Serum; PL: Plasma; Lu: Lung; Lv: Liver; Kd: Kidney; Or: Oral swab; Re: Rectal swab; n: number of bats.

Columns: Family; Genus; Species; specimen counts by type (Sm or PL, Lu, Lv, Kd, Or, Re) for each of Mexico, Bangladesh, Guatemala, Cameroon, DRC, Kenya, and Nigeria; Total specimens.

Taxa sampled, by family:
Emballonuridae (n=3): Saccopteryx bilineata; Taphozous (species not identified).
Hipposideridae (n=85): Hipposideros vittatus, H. gigas, H. fuliginosus, Hipposideros (species not identified).
Miniopteridae (n=1): Miniopterus sp.
Molossidae (n=67): Chaerephon (species not identified); Molossus sinaloae; Mops condylurus; Nyctinomops macrotis; Otomops martiensseni; Tadarida brasiliensis.
Mormoopidae (n=21): Mormoops megalophylla; Pteronotus parnellii.
Phyllostomidae (n=368): Artibeus jamaicensis, A. lituratus, A. phaeotis, A. toltecus, A. watsoni; Carollia perspicillata, C. sowelli; Choeroniscus godmani; Chrotopterus auritus; Desmodus rotundus; Glossophaga commissarisi, G. soricina; Lonchorhina aurita; Lophostoma brasiliense; Micronycteris microtis, M. schmidtorum; Mimon cozumelae; Phylloderma stenops; Phyllostomus discolor; Platyrrhinus helleri; Sturnira lilium, S. ludovici, Sturnira (species not identified); Tonatia saurophila; Trachops cirrhosus; Uroderma bilobatum; Vampyrodes caraccioli.
Pteropodidae (n=1053): Eidolon helvum; Epomophorus labiatus, E. wahlbergi; Hypsignathus monstrosus; Lissonycteris angolensis; Megaloglossus woermanni; Myonycteris torquata; Pteropus giganteus; Rousettus aegyptiacus.
Vespertilionidae (n=57): Bauerus dubiaquercus; Eptesicus fuscus; Myotis occultus, M. velifer; Pipistrellus (species not identified); Scotoecus (species not identified); Scotophilus dinganii, S. leucogaster, S. nigrita.
Unknown (n=18).

Total specimens: 1673 (Serum/Plasma 1581; Lung 34; Liver 31; Kidney 18; Oral swab 7; Rectal swab 2).
Total specimens by country: Mexico 373; Bangladesh 802; Guatemala 75; Cameroon 22; DRC 44; Kenya 140; Nigeria 217.
[Per-species counts by country and specimen type are not reproduced.]
Table S3. Bat-derived hepacivirus and pegivirus detected by PCR shown by clade and host
Clades of bat-derived hepacivirus (clades A, C, D) and pegivirus (clades G, H, K) were defined by phylogenetic analyses (see Fig. 1, Fig. S1). Numbers in colored boxes represent the number of viruses detected by clade. Viruses with undetermined clade either had insufficient NS5B sequence data for phylogenetic analysis, or were negative by NS5B PCR screen but were detected with alternate specific PCR primers designed from high-throughput sequencing data (see Methods). n: number of bats screened by PCR; U: undetermined clade; DRC: Democratic Republic of the Congo.

Columns: Family; Genus; Species; then n and the number of viruses detected in clades A, C, D, G, H, K, and U for each of Mexico, Bangladesh, Cameroon, DRC, Kenya, Nigeria, and Guatemala.

Bats screened by family: Emballonuridae (n=3); Hipposideridae (n=84); Miniopteridae (n=1); Molossidae (n=66); Mormoopidae (n=21); Phyllostomidae (n=362); Pteropodidae (n=1022); Vespertilionidae (n=40); Unknown (n=16).
[Per-species counts by country and clade are not reproduced.]
Country Field Site Specimen Collection date (month/year) Family Genus Species Common name Diet Sex Comments
Clade A
PDB-112 Kenya Three caves Serum 06/2010 Hipposideridae Hipposideros H. vittatus Striped leaf-nosed bat Insect F
PDB-113 Kenya Three caves Serum 06/2010 Hipposideridae Hipposideros H. vittatus Striped leaf-nosed bat Insect M
Clade C
PDB-445 Kenya Suswa cave Serum 07/2010 Molossidae Otomops O. martiensseni Large-eared Free-tailed bat Insect M
PDB-452 Kenya Suswa cave Serum 07/2010 Molossidae Otomops O. martiensseni Large-eared Free-tailed bat Insect M
Clade D
PDB-261 Kenya Shimoni cave Serum 06/2010 Hipposideridae Hipposideros H. vittatus Striped leaf-nosed bat Insect F
PDB-632B Kenya Shimoni cave Serum 08/2011 Hipposideridae Hipposideros H. vittatus Striped leaf-nosed bat Insect F
PDB-829 Kenya Shimoni cave Serum 08/2011 Hipposideridae Hipposideros H. vittatus Striped leaf-nosed bat Insect F
PDB-830 Kenya Shimoni cave Serum 08/2011 Hipposideridae Hipposideros H. vittatus Striped leaf-nosed bat Insect M
Clade G
PDB-99 Cameroon Caves Serum 04/2010 Emballonuridae Taphozous Species not identified Sac-winged bat Insect M
PDB-620 Kenya Tsavo East, Ndololo camp Serum 06/2010 Vespertilionidae Scotophilus S. dinganii African yellow bat Insect M
PDB-1698 Guatemala Agüero Serum 2010 Phyllostomidae Carollia C. perspicillata Seba's short-tailed bat Fruit/insect M
PDB-1734 Guatemala Agüero Serum 2010 Phyllostomidae Carollia C. perspicillata Seba's short-tailed bat Fruit/insect F
PMX-1615 Mexico A11 Plasma 01/2011 Phyllostomidae Artibeus A. watsoni Watson's fruit-eating bat Fruit ND
PMX-1641 Mexico A06 Plasma 01/2011 Phyllostomidae Glossophaga G. commissarisi Commissaris's long-tongued bat Fruit/insect ND
Clade H
PDB-303 DRC Kisangani, rain forest next to Masako Serum 07/2011 Pteropodidae Megaloglossus M. woermanni Woermann's fruit bat Fruit M
PDB-366 DRC Kisangani, Mayele Island Serum 07/2011 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit M
PDB-694 Nigeria College of Agriculture Serum 09/2010 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit F
PRB-891 Bangladesh Faridipur Serum 07/2007 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PRB-1479 Bangladesh Faridipur Serum 02/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit F
PRB-1447 Bangladesh Faridipur Serum 05/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PRB-1554 Bangladesh Khulna Serum 01/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PDB-1699 Guatemala Agero Serum 2010 Phyllostomidae Desmodus D. rotundus Vampire bat Blood M
PDB-1715 Guatemala Montanas Azules Serum 2010 Phyllostomidae Sturnira S. lilium Little yellow-shouldered bat Fruit M
PMX-1588 Mexico A07 Plasma 10/2010 Phyllostomidae Carollia C. perspicillata Seba's short-tailed bat Fruit/insect ND
Clade K
PDB-24 Cameroon Ngong Serum 04/2010 Molossidae Chaerephon Species not identified Free-tailed bat Insect F Pregnant
PDB-28A Cameroon Ngong Serum 04/2010 Molossidae Chaerephon Species not identified Free-tailed bat Insect M
PDB-41A Cameroon Ngong Serum 04/2010 Molossidae Chaerephon Species not identified Free-tailed bat Insect M
PDB-106 Cameroon Maya oulu Serum 04/2010 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit M
PDB-47 DRC Kinshasa, school (1) Serum 07/2011 Molossidae Mops M. condylurus Angolan Free-tailed bat Insect M
PDB-130 Kenya Three caves Serum 06/2010 Hipposideridae Hipposideros H. vittatus Striped leaf-nosed bat Insect F
PDB-152 Kenya Three caves Serum 06/2010 Hipposideridae Hipposideros H. vittatus Striped leaf-nosed bat Insect M
PDB-401 Kenya Asembo, church and school Serum 06/2010 Pteropodidae Epomophorus E. labiatus Ethiopian epauletted fruit bat Fruit F Pregnant
PDB-423 Kenya Asembo, church and school Serum 06/2010 Molossidae Chaerephon Species not identified Free-tailed bat Insect F
PDB-534 Kenya Suswa cave Serum 07/2010 Molossidae Otomops O. martiensseni Large-eared Free-tailed bat Insect M
PDB-737B Kenya Kisii Serum 08/2011 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit M
PDB-838 Kenya Three caves Serum 08/2011 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-840 Kenya Three caves Serum 08/2011 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-854 Kenya Three caves Serum 08/2011 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-623 Nigeria ND Serum ND ND ND ND ND ND ND
PDB-664 Nigeria College of Agriculture Serum 09/2010 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit M
PDB-690 Nigeria College of Agriculture Serum 09/2010 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit M
PDB-692 Nigeria College of Agriculture Serum 09/2010 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit M
PDB-702 Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-706 Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit F Pregnant
PDB-716 Nigeria College of Agriculture Serum 09/2010 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit M
PDB-722 Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-725 Nigeria College of Agriculture Serum 09/2010 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit F
PDB-743 Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-798 Nigeria ND Serum ND ND ND ND ND ND ND
PDB-883A Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit F Pregnant
PDB-895 Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-898 Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-903 Nigeria Idanre cave Serum 09/2010 Hipposideridae Hipposideros H. gigas Giant leaf-nosed bat Insect F
PDB-921 Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-935 Nigeria ND Serum ND ND ND ND ND ND ND
PDB-307.lu Nigeria Idanre, Ondo State Lung 06/2008 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PRB-1170 Bangladesh Faridipur Serum 06/2010 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit F
PRB-1515 Bangladesh Khulna Serum 01/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit F Pregnant
PRB-1476 Bangladesh Faridipur Serum 02/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PRB-1520 Bangladesh Khulna Serum 01/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PRB-1522 Bangladesh Khulna Serum 01/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PRB-1129 Bangladesh Kushtia Serum 08/2007 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PRB-1547 Bangladesh Khulna Serum 01/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit F
PRB-1549 Bangladesh Khulna Serum 01/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PRB-940 Bangladesh Faridipur Serum 07/2007 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PRB-978 Bangladesh Faridipur Serum 04/2008 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit F Pregnant
PRB-1085 Bangladesh Faridipur Serum 10/2008 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
PMX-1376.pl^^ Mexico C01 Plasma 08/2010 Molossidae Nyctinomops N. macrotis Big Free-tailed bat Insect ND
PMX-1376.re^^ Mexico C01 Rectal swab 09/2010 Molossidae Nyctinomops N. macrotis Big Free-tailed bat Insect ND
Co-infections
C / K PDB-491.1/2 Kenya Suswa cave Serum 07/2010 Molossidae Otomops O. martiensseni Large-eared Free-tailed bat Insect F 4.56 x 10^7 / 5.20 x 10^7
G / K PDB-76.1/2 Cameroon Caves Serum 04/2010 Emballonuridae Taphozous Species not identified Sac-winged bat Insect M 4.43 x 10^7 / 3.02 x 10^4
G / K PDB-34.1/2 DRC Kinshasa, school (1) Serum 07/2011 Molossidae Mops M. condylurus Angolan Free-tailed bat Insect F 1.01 x 10^8 / 7.05 x 10^5
K / K* PRB-1006.1/2 Bangladesh Faridipur Serum 11/2009 Pteropodidae Pteropus P. giganteus Indian flying fox Fruit M
Clade undetermined
PDB-909# Nigeria Idanre cave Serum 09/2010 Pteropodidae Rousettus R. aegyptiacus Egyptian fruit bat Fruit M
PDB-975# Nigeria College of Agriculture Serum 09/2010 Pteropodidae Eidolon E. helvum Straw-coloured fruit bat Fruit M
PDB-1707# Guatemala Aldea Salacuim, Coban Serum 2010 Phyllostomidae Desmodus D. rotundus Vampire bat Blood F
PDB-638B$ Kenya Tsavo East, Ndololo camp Serum 06/2010 Vespertilionidae Scotoecus Species not identified Desert yellow bat Insect F
PMX-1503$ Mexico ND Plasma 07/2010 Phyllostomidae Trachops T. cirrhosus Fringe-lipped bats Carnivorous/Insect ND
PMX-1508$ Mexico ND Plasma 09/2010 Phyllostomidae ND ND ND ND ND
PMX-1608$ Mexico ND Plasma 01/2011 Phyllostomidae Sturnira S. ludovici Highland yellow-shouldered bat Fruit ND
ND: data not available
Column groups: Virus ID; Host Information; Viral load (RNA copies/mL serum)
Viral load values: 3.00 x 10^8; 1.25 x 10^7; 2.69 x 10^5; 1.92 x 10^6; 4.29 x 10^7; 1.22 x 10^6; 1.07 x 10^5; 2.62 x 10^3; 1.57 x 10^5; 7.23 x 10^3; 1.78 x 10^8; 1.04 x 10^4; 5.02 x 10^7; 5.05 x 10^5; 1.38 x 10^4; 1.51 x 10^6; 4.13 x 10^6; 1.67 x 10^6; 9.00 x 10^5; 2.17 x 10^6; 1.97 x 10^5; 8.13 x 10^5; 2.30 x 10^6; 2.07 x 10^5; 2.58 x 10^5; 9.09 x 10^2 copies/g
Table S4. Bat-derived hepacivirus and pegivirus identified in this study with host information
Clades of bat-derived hepacivirus (clades A, C, D) and pegivirus (clades G, H, K) were defined by phylogenetic analyses (see Fig. 1 and Fig. S1).
ND: no data available; DRC: Democratic Republic of the Congo.
^^: Plasma and rectal swab collected from same bat; *: distinct strains within clade K; #: NS5B PCR screen negative, viruses detected using specific PCR
primers designed from high-throughput sequencing information; $: insufficient sequence information for NS5B phylogenetic analysis.
Clade Bat species Cameroon DRC Kenya Nigeria Guatemala Mexico Bangladesh % Infected
A Hipposideros vittatus 2/75 2.7%
C Otomops martiensseni 3/15 20.0%
D Hipposideros vittatus 4/75 5.3%
G Artibeus watsoni 1/10 10.0%
Carollia perspicillata 2/4 0/8 16.7%
Glossophaga commissarisi 1/1 100.0%
Mops condylurus 1/10 10.0%
Scotophilus dinganii 1/2 50.0%
Taphozous sp. 2/2 100.0%
H Carollia perspicillata 0/4 1/8 8.3%
Desmodus rotundus 1/30 0/7 2.7%
Eidolon helvum 0/8 1/17 0/11 1/50 2.3%
Megaloglossus woermanni 1/3 33.3%
Pteropus giganteus 4/802 0.5%
Sturnira lilium 1/13 0/20 3.0%
K Chaerephon sp. 3/11 0/6 1/2 21.1%
Eidolon helvum 1/8 0/17 1/11 5/50 8.1%
Epomophorus labiatus 1/3 33.3%
Hipposideros vittatus 2/75 4.0%
Hipposideros gigas 1/5 20.0%
Mops condylurus 2/10 20.0%
Nyctinomops macrotis 1/10 10.0%
Otomops martiensseni 2/15 13.3%
Pteropus giganteus 12/802 1.5%
Rousettus aegyptiacus 3/26 9/91 10.3%
Taphozous sp. 1/2 50.0%
unknown 0/2 3/7 0/7 18.8%
Undetermined Desmodus rotundus 1/30 0/7 2.7%
Eidolon helvum 0/8 0/17 0/11 1/50 1.2%
Rousettus aegyptiacus 0/26 1/91 0.9%
Scotoecus sp. 1/1 100.0%
Sturnira ludovici 1/25 4.0%
Trachops cirrhosus 1/4 25.0%
unknown 0/2 0/7 1/7 6.3%
Percent Infected 27.3% 9.1% 15.0% 12.5% 6.7% 1.9% 2.0%

Table S5. Percentage of bats infected with BHVs and BPgVs
Clades of bat-derived hepacivirus (clades A, C, D) and pegivirus (clades G, H, K) were defined by phylogenetic analyses (see Fig. 1 and
Fig. S1). Viruses with undetermined clade had insufficient NS5B sequence data for phylogenetic analysis or were negative by NS5B
PCR screen but were positive with alternate specific PCR primers designed from high-throughput sequencing data. Red text indicates
positive.
BHV: bat hepacivirus; BPgV: bat pegivirus; sp.: species; DRC: Democratic Republic of the Congo.
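The "% Infected" column appears to pool the positive/tested counts for a species across all countries sampled. The following is a minimal sketch of that arithmetic, assuming each cell is a "positive/tested" string; the function name `pooled_percent` is hypothetical and not part of the original study.

```python
def pooled_percent(counts):
    """Pool 'positive/tested' strings across sites and return percent infected.

    Assumption: every entry has the form 'positive/tested', as in Table S5.
    """
    positive = 0
    tested = 0
    for entry in counts:
        pos, tot = entry.split("/")
        positive += int(pos)
        tested += int(tot)
    # One decimal place, matching the precision used in the table.
    return round(100 * positive / tested, 1)


# Clade G, Carollia perspicillata: 2/4 and 0/8 in the two countries sampled.
print(pooled_percent(["2/4", "0/8"]))
# Clade H, Eidolon helvum: four countries sampled.
print(pooled_percent(["0/8", "1/17", "0/11", "1/50"]))
```

Under this pooling assumption the two calls reproduce the tabulated 16.7% and 2.3%.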
Genus Clade Species Virus ID Virus Nomenclature(a) Country Accession no.
Hepacivirus A 1 PDB-112 BHV/A1/H.vittatus/PDB112/KEN/2010 Kenya KC796077
PDB-113 BHV/A1/H.vittatus/PDB113/KEN/2010 Kenya KC796033
Hepacivirus C 1 PDB-452 BHV/C1/O.martiensseni/PDB452/KEN/2010 Kenya KC796090
PDB-445 BHV/C1/O.martiensseni/PDB445/KEN/2010 Kenya KC796091
PDB-491.1 BHV/C1/O.martiensseni/PDB491.1/KEN/2010 Kenya KC796078
Hepacivirus D 1 PDB-261 BHV/D1/H.vittatus/PDB261/KEN/2010 Kenya KC796031
PDB-829 BHV/D1/H.vittatus/PDB829/KEN/2011 Kenya KC796074
PDB-632B BHV/D1/H.vittatus/PDB632B/KEN/2011 Kenya KC796021
PDB-830 BHV/D1/H.vittatus/PDB830/KEN/2011 Kenya KC796039
Pegivirus G 1 PMX-1615 BPgV/G1/A.watsoni/PMX1615/MEX/2011 Mexico KC796072
2 PMX-1641 BPgV/G2/G.commisarisi/PMX1641/MEX/2011 Mexico KC796066
3 PDB-1698 BPgV/G3/C.perspicillata/PDB1698/GUA/2010 Guatemala KC796080
PDB-1734 BPgV/G3/C.perspicillata/PDB1734/GUA/2010 Guatemala KC796087
4 PDB-34.1 BPgV/G4/M.condylurus/PDB34.1/DRC/2011 DRC KC796093
5 PDB-620 BPgV/G5/S.dinganii/PDB620/KEN/2010 Kenya KC796076
6 PDB-76.1 BPgV/G6/Taphozous_sp/PDB76.1/CAM/2010 Cameroon KC796084
PDB-99 BPgV/G6/Taphozous_sp/PDB99/CAM/2010 Cameroon KC796079
Pegivirus H 1 PDB-1699 BPgV/H1/D.rotundus/PDB1699/GUA/2010 Guatemala KC796061
2 PDB-1715 BPgV/H2/S.lilium/PDB1715/GUA/2010 Guatemala KC796088
3 PMX-1588 BPgV/H3/C.perspicillata/PMX1588/MEX/2010 Mexico KC796067
4 PRB-1447 BPgV/H4/P.giganteus/PRB1447/BAN/2009 Bangladesh KC796069
5* GBV-D strain 68 BPgV/H5/P.giganteus/68/BAN/2007 Bangladesh GU566734
GBV-D strain 93 BPgV/H5/P.giganteus/93/BAN/2007 Bangladesh GU566735
6 PDB-366 BPgV/H6/E.helvum/PDB366/DRC/2011 DRC KC796029
PDB-694 BPgV/H6/E.helvum/PDB694/NIG/2010 Nigeria KC796083
7 PDB-303 BPgV/H7/M.woermanni/PDB303/DRC/2011 DRC KC796073
PRB-1479 BPgV/H7/P.giganteus/PRB1479/BAN/2009 Bangladesh KC796071
8 PRB-891 BPgV/H8/P.giganteus/PRB891/BAN/2007 Bangladesh KC796051
PRB-1554 BPgV/H8/P.giganteus/PRB1554/BAN/2009 Bangladesh KC796063
Pegivirus K 1 PMX-1376-pl BPgV/K1/N.macrotis/PMX1376-pl/MEX/2010 Mexico KC796065
PMX-1376-re BPgV/K1/N.macrotis/PMX1376-re/MEX/2010 Mexico KC796060
2 PDB-47 BPgV/K2/M.condylurus/PDB47/DRC/2011 DRC KC796028
PDB-130 BPgV/K2/H.vittatus/PDB130/KEN/2010 Kenya KC796035
PDB-41A BPgV/K2/Chaerephon_sp/PDB41A/CAM/2010 Cameroon KC796036
PDB-24 BPgV/K2/Chaerephon_sp/PDB24/CAM/2010 Cameroon KC796082
PDB-28A BPgV/K2/Chaerephon_sp/PDB28A/CAM/2010 Cameroon KC796025
PRB-1476 BPgV/K2/P.giganteus/PRB1476/BAN/2009 Bangladesh KC811078
PRB-978 BPgV/K2/P.giganteus/PRB978/BAN/2008 Bangladesh KC811075
3 PDB-534 BPgV/K3/O.martiensseni/PDB534/KEN/2010 Kenya KC796085
PDB-491.2 BPgV/K3/O.martiensseni/PDB491.2/KEN/2010 Kenya KC796089
PDB-743 BPgV/K3/R.aegypticus/PDB743/NIG/2010 Nigeria KC796055
PDB-152 BPgV/K3/H.vittatus/PDB152/KEN/2010 Kenya KC796092
PDB-716 BPgV/K3/E.helvum/PDB716/NIG/2010 Nigeria KC796053
PDB-401 BPgV/K3/E.labiatus/PDB401/KEN/2010 Kenya KC796032
PRB-1129 BPgV/K3/P.giganteus/PRB1129/BAN/2007 Bangladesh KC796056
PDB-690 BPgV/K3/E.helvum/PDB690/NIG/2010 Nigeria KC796034
PDB-664 BPgV/K3/E.helvum/PDB664/NIG/2010 Nigeria KC796018
PDB-935 BPgV/K3/unknown/PDB935/NIG/unknown Nigeria KC796047
PDB-722 BPgV/K3/R.aegypticus/PDB722/NIG/2010 Nigeria KC796044
PDB-725 BPgV/K3/E.helvum/PDB725/NIG/2010 Nigeria KC796041
4 PDB-737B BPgV/K4/E.helvum/PDB737B/KEN/2011 Kenya KC796081
PRB-1006.1 BPgV/K4/P.giganteus/PRB1006.1/BAN/2009 Bangladesh KC796052
PRB-940 BPgV/K4/P.giganteus/PRB940/BAN/2007 Bangladesh KC796050
PRB-1170 BPgV/K4/P.giganteus/PRB1170/BAN/2010 Bangladesh KC796095
PDB-106 BPgV/K4/E.helvum/PDB106/CAM/2010 Cameroon KC796075
PRB-1085 BPgV/K4/P.giganteus/PRB1085/BAN/2008 Bangladesh KC796054
5 PDB-702 BPgV/K5/R.aegypticus/PDB702/NIG/2010 Nigeria KC796019
PDB-903 BPgV/K5/H.gigas/PDB903/NIG/2010 Nigeria KC796048
PDB-307.lu BPgV/K5/R.aegypticus/PDB307.lu/NIG/2008 Nigeria KC811079
6 PDB-623 BPgV/K6/unknown/PDB623/NIG/unknown Nigeria KC796020
PRB-1520 BPgV/K6/P.giganteus/PRB1520/BAN/2009 Bangladesh KC796059
PRB-1522 BPgV/K6/P.giganteus/PRB1522/BAN/2009 Bangladesh KC796062
PRB-1515 BPgV/K6/P.giganteus/PRB1515/BAN/2009 Bangladesh KC796057
PRB-1549 BPgV/K6/P.giganteus/PRB1549/BAN/2009 Bangladesh KC796058
PRB-1006.2 BPgV/K6/P.giganteus/PRB1006.2/BAN/2009 Bangladesh KC796043
PRB-1547 BPgV/K6/P.giganteus/PRB1547/BAN/2009 Bangladesh KC796070
PDB-798 BPgV/K6/unknown/PDB798/NIG/unknown Nigeria KC796040
PDB-838 BPgV/K6/R.aegypticus/PDB838/KEN/2011 Kenya KC796086
PDB-898 BPgV/K6/R.aegypticus/PDB898/NIG/2010 Nigeria KC796049
PDB-895 BPgV/K6/R.aegypticus/PDB895/NIG/2010 Nigeria KC796038
PDB-883A BPgV/K6/R.aegypticus/PDB883A/NIG/2010 Nigeria KC796037
PDB-921 BPgV/K6/R.aegypticus/PDB921/NIG/2010 Nigeria KC796094
PDB-854 BPgV/K6/R.aegypticus/PDB854/KEN/2011 Kenya KC796046
PDB-840 BPgV/K6/R.aegypticus/PDB840/KEN/2011 Kenya KC796045
PDB-706 BPgV/K6/R.aegypticus/PDB706/NIG/2010 Nigeria KC796022
PDB-76.2 BPgV/K6/Taphozous_sp/PDB76.2/CAM/2010 Cameroon KC796030
PDB-34.2 BPgV/K6/M.condylurus/PDB34.2/DRC/2011 DRC KC796024
PDB-423 BPgV/K6/Chaerephon_sp/PDB423/KEN/2010 Kenya KC796027
PDB-692 BPgV/K6/E.helvum/PDB692/NIG/2010 Nigeria KC796023
Clade undetermined
PDB-909 BPgV/R.aegypticus/PDB909/NIG/2010 Nigeria KC811073
PDB-975 BPgV/E.helvum/PDB975/NIG/2010 Nigeria KC811077
PDB-1707 BPgV/D.rotundus/PDB1707/GUA/2010 Guatemala KC796096
PDB-638B BHV/Scotoecus_sp/PDB638B/KEN/2010 Kenya KC796026
PMX-1503 BPgV/T.cirrhosus/PMX1503/MEX/2010 Mexico KC811074
PMX-1508 BPgV/unknown/PMX1508/MEX/2010 Mexico KC796068
PMX-1608 BPgV/S.ludovici/PMX1608/MEX/2011 Mexico KC811076
Table S6. Proposed BHV and BPgV species identified in this study
(a) Genus/species/virus ID/country/year.
BHV: Bat hepacivirus; BPgV: Bat pegivirus; CAM: Cameroon; DRC: Democratic Republic of the Congo; GUA: Guatemala; KEN: Kenya; MEX: Mexico;
NIG: Nigeria; BAN: Bangladesh; ID: identification; sp: species; lu: lung; pl: plasma; re: rectal swab; *: bat-derived viruses identified in a previous study (6).
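The nomenclature strings in Table S6 are slash-delimited, so they can be split into fields mechanically. The sketch below is illustrative only: the field names (`genus`, `species`, `host`, `virus_id`, `country`, `year`) are assumptions read off the table rows, and the handling of five-field names for viruses of undetermined clade is a guess at the convention rather than a rule stated by the authors.

```python
from typing import NamedTuple, Optional


class VirusName(NamedTuple):
    genus: str              # BHV or BPgV
    species: Optional[str]  # clade plus species number, e.g. "K2"; None if undetermined
    host: str               # abbreviated bat species, e.g. "H.vittatus"
    virus_id: str
    country: str            # three-letter code, e.g. "KEN"
    year: Optional[int]     # None when recorded as "unknown"


def parse_virus_name(name: str) -> VirusName:
    """Split a Table S6 style name into its slash-delimited fields."""
    # Dropping empty fields tolerates an accidental doubled slash.
    parts = [p for p in name.split("/") if p]
    if len(parts) == 6:
        genus, species, host, virus_id, country, year = parts
    elif len(parts) == 5:
        # Assumed layout for viruses whose clade could not be determined.
        genus, host, virus_id, country, year = parts
        species = None
    else:
        raise ValueError(f"unexpected number of fields in {name!r}")
    return VirusName(genus, species, host, virus_id, country,
                     int(year) if year.isdigit() else None)


print(parse_virus_name("BPgV/K2/Chaerephon_sp/PDB41A/CAM/2010"))
```

Names with an unknown host or year, such as BPgV/K3/unknown/PDB935/NIG/unknown, parse with `year` set to `None`.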
Lengths are in nucleotides (nt) for Genome, 5' NTR, 3' NTR and ORF, and in amino acids (aa) for ORF polyprotein, VR, C, E1, E2, p, NS2, NS3, NS4A, NS4B, NS5A and NS5B.
Genus# | Clade | Virus ID | Accession no. | Genome Organization | Genome | 5' NTR | 3' NTR | ORF | % GC | ORF Polyprotein | VR | C | E1 | E2 | p | NS2 | NS3 | NS4A | NS4B | NS5A | NS5B
Hepacivirus | A | PDB-112 | KC796077 | C-E1-E2-P7-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 8916 | 173 | 37 | 8706 | 55.5 | 2901 | NA | 161 | 203 | 265 | 59 | 212 | 629 | 57 | 259 | 459 | 597
Hepacivirus | B | GBV-B | AF179612 | C-E1-E2-P13-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9399 | 445 | 359 | 8595 | 50.1 | 2864 | NA | 156 | 192 | 265 | 119 | 208 | 620 | 55 | 248 | 411 | 590
Hepacivirus | B | GBV-B | U22304 | C-E1-E2-P13-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9143 | 445 | 103 | 8595 | 50.1 | 2864 | NA | 156 | 192 | 265 | 119 | 208 | 627 | 48 | 248 | 411 | 590
Hepacivirus | C | PDB-445 | KC796091 | C-E1-E2-P7-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9170 | 156 | 98 | 8916 | 52.4 | 2971 | NA | 189 | 192 | 322 | 63 | 217 | 628 | 55 | 259 | 456 | 590
Hepacivirus | C | PDB-452 | KC796090 | C-E1-E2-P7-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9224 | 205 | 103 | 8916 | 52.2 | 2971 | NA | 189 | 192 | 322 | 63 | 217 | 628 | 55 | 259 | 456 | 590
Hepacivirus | C | PDB-491.1 | KC796078 | C-E1-E2-P7-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9188 | 171 | 101 | 8916 | 52.5 | 2971 | NA | 189 | 192 | 322 | 63 | 217 | 628 | 55 | 259 | 456 | 590
Hepacivirus | D | PDB-829 | KC796074 | C-E1-E2-P7-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9609 | 280 | 254 | 9075 | 56.5 | 3024 | NA | 190 | 192 | 326 | 63 | 217 | 628 | 56 | 257 | 504 | 591
Hepacivirus | E | CHV | JF744991/JF744997 | C-E1-E2-P7-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9468 | 366 | 273 | 8829 | 50.2 | 2942 | NA | 204 | 188 | 334 | 63 | 217 | 631 | 54 | 257 | 406 | 588
Hepacivirus | E | NPHV | JQ434008 | C-E1-E2-P7-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9229 | 387 | 7 | 8835 | 50.2 | 2944 | NA | 204 | 188 | 335 | 63 | 217 | 631 | 54 | 257 | 407 | 588
Hepacivirus | F | HCV-1a | M62321 | C-E1-E2-P7-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9401 | 341 | 24 | 9036 | 58.8 | 3011 | NA | 190 | 193 | 363 | 63 | 217 | 631 | 54 | 257 | 452 | 591
Pegivirus* | G | PDB-34.1 | KC796093 | VR-E1-E2-P28-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 10580 | 88 | 148 | 10344 | 60.0 | 3447 | 257 | NI | 188 | 342 | 262 | 240 | 628 | 62 | 271 | 609 | 568
Pegivirus* | G | PDB-76.1 | KC796084 | VR-E1-E2-P27-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 10487 | 285 | 392 | 9810 | 57.9 | 3269 | 73 | NI | 190 | 341 | 256 | 242 | 628 | 73 | 270 | 611 | 567
Pegivirus* | G | PDB-99 | KC796079 | VR-E1-E2-P27-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 10391 | 323 | 259 | 9810 | 57.9 | 3269 | 73 | NI | 190 | 341 | 256 | 242 | 628 | 73 | 270 | 611 | 567
Pegivirus* | G | PDB-620 | KC796076 | VR-E1-E2-P28-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 10767 | 49 | 308 | 10410 | 56.8 | 3469 | 260 | NI | 191 | 350 | 260 | 239 | 628 | 63 | 272 | 616 | 569
Pegivirus* | G | PDB-1734 | KC796087 | VR-E1-E2-P27-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 10529 | 279 | 272 | 9978 | 60.9 | 3325 | 138 | NI | 189 | 362 | 255 | 240 | 628 | 63 | 269 | 587 | 574
Pegivirus* | G | PDB-1698 | KC796080 | VR-E1-E2-P27-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 10465 | 216 | 271 | 9978 | 61.0 | 3325 | 137 | NI | 189 | 362 | 255 | 240 | 628 | 63 | 269 | 587 | 574
Pegivirus* | H | GBV-D | GU566734 | E1-E2-P26-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9630 | 46 | 263 | 9321 | 56.7 | 3065 | NK | NI | 193 | 337 | 242 | 240 | 628 | 80 | 284 | 472 | 576
Pegivirus* | H | GBV-D | GU566735 | E1-E2-P26-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9536 | 0 | 230 | 9306 | 56.6 | 3065 | NK | NI | 193 | 337 | 242 | 240 | 628 | 80 | 284 | 472 | 576
Pegivirus* | H | PDB-303 | KC796073 | E1-E2-P23-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9892 | 516 | 193 | 9183 | 57.3 | 3060 | NI | NI | 192 | 353 | 217 | 236 | 627 | 82 | 284 | 477 | 576
Pegivirus* | H | PDB-694 | KC796083 | E1-E2-P25-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9684 | 354 | 174 | 9156 | 56.7 | 3051 | NI | NI | 192 | 349 | 232 | 237 | 627 | 81 | 284 | 459 | 575
Pegivirus* | H | PDB-1715 | KC796088 | E1-E2-P16-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9777 | 566 | 100 | 9111 | 58.8 | 3036 | NI | NI | 191 | 357 | 152 | 238 | 629 | 82 | 323 | 470 | 579
Pegivirus* | I | GBV-C | HQ331234 | E1-E2-P6-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9264 | 420 | 315 | 8529 | 59.2 | 2842 | NI | NI | 194 | 373 | 73 | 239 | 603 | 109 | 260 | 414 | 563
Pegivirus* | J | GBV-A | U22303 | E1-E2-P21-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9653 | 593 | 195 | 8865 | 56.6 | 2954 | NI | NI | 188 | 348 | 171 | 254 | 625 | 63 | 285 | 457 | 561
Pegivirus* | J | GBV-A-tri | AF023425 | E1-E2-P21-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9625 | 528 | 82 | 9018 | 60.0 | 3005 | NI | NI | 194 | 348 | 171 | 254 | 625 | 64 | 294 | 479 | 565
Pegivirus* | K | PDB-24 | KC796082 | E1-E2-P18-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9503 | 452 | 18 | 9033 | 59.9 | 3010 | NI | NI | 190 | 369 | 175 | 236 | 625 | 72 | 297 | 461 | 570
Pegivirus* | K | PDB-106 | KC796075 | E1-E2-P18-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9330 | 389 | 22 | 8919 | 60.5 | 2972 | NI | NI | 190 | 368 | 175 | 236 | 625 | 73 | 298 | 430 | 564
Pegivirus* | K | PDB-491.2 | KC796089 | E1-E2-P18-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9566 | 566 | 60 | 8940 | 61.1 | 2979 | NK | NI | 190 | 371 | 174 | 237 | 625 | 71 | 298 | 431 | 567
Pegivirus* | K | PDB-534 | KC796085 | E1-E2-P18-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9428 | 436 | 52 | 8940 | 60.8 | 2979 | NK | NI | 190 | 371 | 175 | 236 | 625 | 71 | 298 | 431 | 567
Pegivirus* | K | PDB-838 | KC796086 | E1-E2-P18-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9394 | 419 | 41 | 8934 | 59.2 | 2977 | NK | NI | 190 | 369 | 171 | 239 | 625 | 72 | 298 | 435 | 564
Pegivirus* | K | PDB-737B | KC796081 | E1-E2-P18-NS2-NS3-NS4A-NS4B-NS5A-NS5B | 9494 | 544 | 31 | 8919 | 60.2 | 2972 | NI | NI | 190 | 368 | 175 | 236 | 625 | 73 | 298 | 430 | 564

Table S7. Comparison of the genomic organization of BHVs and BPgVs with representative members of the Hepacivirus and Pegivirus genera
The potential AUG initiator codon of the polyprotein and the cleavage map of the BHV and BPgV polyproteins were predicted from the alignment with representative members of the Hepacivirus and Pegivirus genera and SignalP 4.1 (7).
BHV: bat hepacivirus; BPgV: bat pegivirus; NTR: non-translated region; ORF: open reading frame corresponding to the polyprotein-coding sequence; VR: variable region; C: Core; E1, E2: Envelope glycoproteins; p: p protein; NS: Nonstructural proteins (NS2, NS3, NS4A, NS4B, NS5A, NS5B). NA: not applicable; NI: not identified; NK: not known; CHV: canine hepacivirus; HCV: Hepatitis C virus; NPHV: Nonprimate hepacivirus; ID: identification.
*: Proposed genus (5); #: All bat-derived viruses identified in this study are proposed to be classified in the Hepacivirus genus (clades A, C and D) or the Pegivirus genus (clades G, H, and K); p column: predicted proteins in BHVs and BPgVs present in a region analogous to the p proteins (HCV p7 (8), GBV-B p13 (9)) and to the 21 kDa GBV-A, 6 kDa GBV-C and 26 kDa GBV-D proteins.
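The lengths reported in Table S7 can be cross-checked arithmetically. The sketch below assumes two relationships that are not stated by the authors but hold for most rows: the genome length equals 5' NTR + ORF + 3' NTR, and the ORF length in nucleotides equals three times the polyprotein length plus one stop codon. The function name `check_lengths` is hypothetical.

```python
def check_lengths(genome, ntr5, ntr3, orf_nt, polyprotein_aa):
    """Return which of two assumed length relationships a Table S7 row satisfies."""
    return {
        # Assumption: the genome is the 5' NTR, the ORF and the 3' NTR laid end to end.
        "genome_matches": genome == ntr5 + orf_nt + ntr3,
        # Assumption: the ORF length counts the stop codon (3 nt per residue, plus 3).
        "orf_matches": orf_nt == 3 * (polyprotein_aa + 1),
    }


# PDB-112 (clade A): genome 8916 nt, 5' NTR 173 nt, 3' NTR 37 nt, ORF 8706 nt, 2901 aa.
print(check_lengths(8916, 173, 37, 8706, 2901))
# HCV-1a (clade F): genome 9401 nt, 5' NTR 341 nt, 3' NTR 24 nt, ORF 9036 nt, 3011 aa.
print(check_lengths(9401, 341, 24, 9036, 3011))
```

Rows for which either check fails are candidates for partial sequences or a different counting convention, not necessarily errors.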
Region | Function | Feature/Conserved Motif | Clade A: PDB-112 (KC796077) | Clade B: GBV-B (AF179612) | Clade C: PDB-452 (KC796090) | Clade D: PDB-829 (KC796074) | Clade E: CHV (JF744991) | Clade F: HCV (M62321)
5' NTR | Translation initiation | IRES (1) | NK | Type III (2) | NK | NK | Type IV (3) | Type III (2)
C | Nucleocapsid | Pfam (4) | Pfam01542 | Pfam01542 | Pfam01542 | Pfam01542 | Pfam01542 | Pfam01542
ARFPs (5,6) | Unknown | | NI | ARF (7) | NI | ARF | NK (3) | ARFPs (5,6)
E1/E2 | Envelope glycoproteins | N-linked glycosylation sites | 6* | 9* | 7* | 6* | 14* | 16
p | Ion channel (8) | | ~7 kDa# | p13 (9,10) | ~7 kDa# | ~7 kDa# | ~7.4 kDa# | p7 (8,11)
NS2 | Cysteine protease | NS2/NS3 autocatalytic triad (H,E,C) (12,13) | H829, D848, C869 | H870, E888, C909 | H910, E930, C951 | H915, E935, C956 | H932, E952, C973 | H952, E972, C993
NS3 | Serine protease | Catalytic triad (HX23, DX56, S) (14) | H957, D981, S1040 | H997, D1021, S1079 | H1040, D1064, S1123 | H1045, D1069, S1128 | H1063, D1087, S1145 | H1083, D1107, S1165
NS3 | Serine protease | Cysteine zinc-binding residues (CXnCXnCX3H) (15,16,17) | C997XC999X46C1046X3H1050 | C1037XC1039X45C1085X3H1089 | C1080XC1082X46C1129X3H1133 | C1085XC1087X46C1134X3H1138 | C1103XC1105X45C1151X3D1155 | C1123XC1125X45C1171X3H1175
NS3 | Serine protease | S1 pocket residues (L,F,A) (16,17) | NI | NI | L1119 F1138 A1141 | L1124 F1143 T1146 | L1141 F1160 A1163 | L1161 F1180 A1183
NS3 | NTPase/Helicase | Motif Ia (2) | Q1207 IVRMLVAPTGSGKSTRLP | Y1137 SVQILIAPTGSGKSTKLP | Y1181 SVKFLHAPTGSGKSTKMP | F1293 AIKLLHAPTGAGKSTKMP | Y1203 QVSFLHAPTGSGKSTKMP | V1225 AHLHAPTGSGKSTKVP
NS3 | NTPase/Helicase | Motif Ib (2) | Y1233 SVLVLNPSVATTLNF | Y1163 EVLVLNPSVATTASM | Y1207 RTLVCNPSVATTRSM | Y1319 SVLVLNPSVATTIAM | Y1229 HVLVLNPSVASTLSF | Y1249 KVLVLNPSVAATLGF
NS3 | NTPase/Helicase | Motif II (2) | V1295 IICDECH | V1224 IICDECH | V1270 VVCDECH | V1382 IICDECH | I1292 IICDECH | I1312 IICDECH
NS3 | NTPase/Helicase | Motif III (2) | V1327 ILATATPP | V1256 VLATATPP | V1302 ILATATPP | V1414 VLATATPP | V1324 VLATATPP | V1344 VLATATPP
NS3 | NTPase/Helicase | Motif IV (2) | L1374 IFQASKAH | L1303 IFEATKKH | L1349 IFCHSKKK | L1461 IFCHSKAK | L1371 IFCHSKKK | L1391 IFCHSKKK
NS3 | NTPase/Helicase | Motif V (2) | V1416 VVATDALMTGYTGNFDSVT | V1344 VVATDALCTGYTGDFDSVY | V1391 VVATDALMTGYTGNFDSVY | V1503 VIATDALMTGYTGNFDSVT | V1413 VVATDALMTGYSGNFDTVT | V1433 VVATDALMTGYTGDFDSVI
NS3 | NTPase/Helicase | Motif VI (2) | T1459 LPSSNVVRMQRRGR | V1387 CGVSAIVKGQRRGR | T1434 RPSDSVQRTQRRGR | S1546 KPADSITRTQRRGR | P1456 KPSDAVCRTQRRGR | T1476 LPQDAVSRTQRRGR
NS4A | NS3-protease cofactor | Pfam | NI | NI | Pfam01006 | Pfam01006 | NI | Pfam01006
NS4B | Membrane alterations | Nucleotide binding A-motif (GX4GK) (18) | NI | NI | NI | NI | NI | G1846 SVGLGK
NS4B | Membrane alterations | Nucleotide binding B-motif (DXXA) (18) | NI | NI | NI | NI | NI | D1942 AAA
NS4B | Membrane alterations | Pfam | Pfam01001 | Pfam01001 | NI | Pfam01001 | NI | Pfam01001
NS5A | Phosphoprotein | Zinc finger motif (CX17CXCXnC) (19) | C1891X17C1909XC1911X21C1933 | C1905X17C1923XC1925X21C1947 | C1965X17C1983XC1985X21C2007 | C1969X17C1987XC1989X21C2011 | C1987X17C2005XC2007X20C2028 | C2011X17C2029XC2031X20C2052
NS5B | RNA-dependent RNA polymerase | Motif I (2) | V2447 PKIET | V2413 PKEEV | M2519 PKVEV | M2571 AKEEI | M2493 AKNEV | M2559 AKNEV
NS5B | RNA-dependent RNA polymerase | Motif II (2) | M2462 KPPRLIAYPHLEVRVAEKMYLGDVAQRV | K2428 KPPRLISYPHLEMRCVEKMYYGQVAPDV | K2535 KPPRLIMYPDLITRAVEKKVLGDIGPK | R2587 KPPRLIMFPDLIVRATEKAVLGDLAPK | R2506 KPARLIVYPDLPVRACEKRAMYDLFQK | R2574 KPARLIVFPDLGVRVCEKMALYDVVTK
NS5B | RNA-dependent RNA polymerase | Motif III (2) | G2500 FQYSPKQRVDYL | G2466 FVDPRTRVKRLL | G2573 FQYTPQQRVDRM | G2625 FAYTPKERVEKI | G2544 FQYTPRQRVDRL | G2612 FQYSPGQRVEFL
NS5B | RNA-dependent RNA polymerase | Motif IV (2) | F2527 DTHCFDSNVT | C2490 DTVCFDSTIT | C2600 DTQCFDSTIT | C2652 DTVCFDSTVT | Y2571 DTKCFDSTVT | Y2639 DTRCFDSTVT
NS5B | RNA-dependent RNA polymerase | Motif V (2) | R2585 FCRASGTYTTSAGNTITCFLKAKAAA | R2548 RCRSSGVYTTSSSNSLTCWLKVNAAA | R2658 QCRASGVFTTSSSNCLTAWLKVRASA | R2710 NCRASGVYTTSSSNCLTAWIKVHAAA | R2629 ECRASGVFPTSMGNTLTNFIKASAAA | R2697 RCRASGVLTTSCGNTLTCYIKARAAC
NS5B | RNA-dependent RNA polymerase | Motif VI (2) | F2621 LIHGDDCLV | F2584 LICGDDCTV | L2694 LVSGDDVFG | L2746 LVTGDDVFG | F2665 LICGDDLVC | M2733 LVCGDDLVV
NS5B | RNA-dependent RNA polymerase | Motif VII (2) | E2669 LLDTCSSN | E2632 ELTSCSSN | E2741 ELTCCSQN | E2793 ELTCCSSN | E2713 QITSCSSN | E2781 LITSCSSN
NS5B | RNA-dependent RNA polymerase | Motif VIII (2) | Y2690 YLTRDPSIPLAR | Y2653 FLTRDPRIPLGR | Y2762 YLTRDPRVPLAR | Y2814 YLTRDPRTPFAR | Y2734 FLTRDPTTPLAR | Y2802 YLTRDPTTPLAR
3' NTR | Replication | Polyuridine sequences (20,21,22) | NK | Poly U (20) | NK | NK | NI (3) | Poly U-UC (21,22)
3' NTR | Replication | Conserved non-homopolymeric region (21,22,23) | NK | Y tail (23) | NK | NK | NI (3) | X tail (21,22)

One representative genome from each clade was used for the analysis. Amino acids are numbered with respect to the methionine initiation codon of the polyprotein.
NTR: Non-translated region; C: Core; ARFP/F: Alternate reading frame protein/frameshift protein; E1/E2: Envelope glycoproteins; NS: Non-structural proteins (NS1, NS2, NS3, NS4A, NS4B, NS5A, NS5B); IRES: Internal Ribosome Entry Site; NK: not known; NI: not identified; NC: not classified.
*Predicted using NetNGlyc 1.0 Server (20)
**Analogous region to HCV p7 and GBV-B p13.
1
.
J. T
. S
ta
p
le
to
n
, S
. F
o
u
n
g
, A
. S
. M
u
e
rh
o
ff, J. B
u
k
h
, P
. S
im
m
o
n
d
s, T
h
e
G
B
v
iru
se
s: a
re
v
ie
w
a
n
d
p
ro
p
o
se
d
c
la
ssific
a
tio
n
o
f G
B
V
-A
, G
B
V
-C
(H
G
V
), a
n
d
G
B
V
-D
in
g
e
n
u
s P
e
g
iv
iru
s w
ith
in
th
e
fa
m
ily
F
la
v
iv
irid
a
e
. T
h
e
J
o
u
rn
a
l o
f g
e
n
e
ra
l v
iro
lo
g
y
9
2
, 2
3
3
(F
e
b
, 2
0
1
1
).
2
.
A
. K
a
p
o
o
r e
t a
l., C
h
a
ra
c
te
riz
a
tio
n
o
f a
c
a
n
in
e
h
o
m
o
lo
g
o
f h
e
p
a
titis C
v
iru
s. P
ro
c
e
e
d
in
g
s o
f th
e
N
a
tio
n
a
l A
c
a
d
e
m
y
o
f S
c
ie
n
c
e
s o
f th
e
U
n
ite
d
S
ta
te
s o
f A
m
e
ric
a
1
0
8
, 1
1
6
0
8
(Ju
l 1
2
, 2
0
1
1
).
3
.
J. L
. W
a
le
w
sk
i, T
. R
. K
e
lle
r, D
. D
. S
tu
m
p
, A
. D
. B
ra
n
c
h
, E
v
id
e
n
c
e
fo
r a
n
e
w
h
e
p
a
titis C
v
iru
s a
n
tig
e
n
e
n
c
o
d
e
d
in
a
n
o
v
e
rla
p
p
in
g
re
a
d
in
g
fra
m
e
. R
n
a
7
, 7
1
0
(M
a
y
, 2
0
0
1
).
4
.
A
. D
. B
ra
n
c
h
, D
. D
. S
tu
m
p
, J. A
. G
u
tie
rre
z
, F
. E
n
g
, J. L
. W
a
le
w
sk
i, T
h
e
h
e
p
a
titis C
v
iru
s a
lte
rn
a
te
re
a
d
in
g
fra
m
e
(A
R
F
) a
n
d
its fa
m
ily
o
f n
o
v
e
l p
ro
d
u
c
ts: th
e
a
lte
rn
a
te
re
a
d
in
g
fra
m
e
p
ro
te
in
/F
-p
ro
te
in
, th
e
d
o
u
b
le
-fra
m
e
sh
ift p
ro
te
in
, a
n
d
o
th
e
rs. S
e
m
in
a
rs in
liv
e
r d
ise
a
se
2
5
, 1
0
5
(F
e
b
, 2
0
0
5
).
5
.
Z
. X
u
e
t a
l., S
y
n
th
e
sis o
f a
n
o
v
e
l h
e
p
a
titis C
v
iru
s p
ro
te
in
b
y
rib
o
so
m
a
l fra
m
e
sh
ift. T
h
e
E
M
B
O
jo
u
rn
a
l 2
0
, 3
8
4
0
(Ju
l 1
6
, 2
0
0
1
).
6
.
S
. D
. G
riffin
e
t a
l., T
h
e
p
7
p
ro
te
in
o
f h
e
p
a
titis C
v
iru
s fo
rm
s a
n
io
n
c
h
a
n
n
e
l th
a
t is b
lo
c
k
e
d
b
y
th
e
a
n
tiv
ira
l d
ru
g
, A
m
a
n
ta
d
in
e
. F
E
B
S
le
tte
rs 5
3
5
, 3
4
(Ja
n
3
0
, 2
0
0
3
).
7
.
D
. G
h
ib
a
u
d
o
, L
. C
o
h
e
n
, F
. P
e
n
in
, A
. M
a
rtin
, C
h
a
ra
c
te
riz
a
tio
n
o
f G
B
v
iru
s B
p
o
ly
p
ro
te
in
p
ro
c
e
ssin
g
re
v
e
a
ls th
e
e
x
iste
n
c
e
o
f a
n
o
v
e
l 1
3
-k
D
a
p
ro
te
in
w
ith
p
a
rtia
l h
o
m
o
lo
g
y
to
h
e
p
a
titis C
v
iru
s p
7
p
ro
te
in
. J
B
io
l C
h
e
m
2
7
9
, 2
4
9
6
5
(Ju
n
1
1
, 2
0
0
4
).
8
.
C
. L
in
, B
. D
. L
in
d
e
n
b
a
c
h
, B
. M
. P
ra
g
a
i, D
. W
. M
c
C
o
u
rt, C
. M
. R
ic
e
, P
ro
c
e
ssin
g
in
th
e
h
e
p
a
titis C
v
iru
s E
2
-N
S
2
re
g
io
n
: id
e
n
tific
a
tio
n
o
f p
7
a
n
d
tw
o
d
istin
c
t E
2
-sp
e
c
ific
p
ro
d
u
c
ts w
ith
d
iffe
re
n
t C
te
rm
in
i. J
o
u
rn
a
l o
f v
iro
lo
g
y
6
8
, 5
0
6
3
(A
u
g
, 1
9
9
4
).
9
.
M
. H
ijik
a
ta
e
t a
l., T
w
o
d
istin
c
t p
ro
te
in
a
se
a
c
tiv
itie
s re
q
u
ire
d
fo
r th
e
p
ro
c
e
ssin
g
o
f a
p
u
ta
tiv
e
n
o
n
stru
c
tu
ra
l p
re
c
u
rso
r p
ro
te
in
o
f h
e
p
a
titis C
v
iru
s. J
o
u
rn
a
l o
f v
iro
lo
g
y
6
7
, 4
6
6
5
(A
u
g
, 1
9
9
3
).
1
0
.
A
. G
ra
k
o
u
i, D
. W
. M
c
C
o
u
rt, C
. W
y
c
h
o
w
sk
i, S
. M
. F
e
in
sto
n
e
, C
. M
. R
ic
e
, A
se
c
o
n
d
h
e
p
a
titis C
v
iru
s-e
n
c
o
d
e
d
p
ro
te
in
a
se
. P
ro
c
e
e
d
in
g
s o
f th
e
N
a
tio
n
a
l A
c
a
d
e
m
y
o
f S
c
ie
n
c
e
s o
f th
e
U
n
ite
d
S
ta
te
s o
f A
m
e
ric
a
9
0
, 1
0
5
8
3
(N
o
v
1
5
, 1
9
9
3
).
1
1
.
J. F
. B
a
z
a
n
, R
. J. F
le
tte
ric
k
, D
e
te
c
tio
n
o
f a
try
p
sin
-lik
e
se
rin
e
p
ro
te
a
se
d
o
m
a
in
in
fla
v
iv
iru
se
s a
n
d
p
e
stiv
iru
se
s. V
iro
lo
g
y
1
7
1
, 6
3
7
(A
u
g
, 1
9
8
9
).
1
2
.
C
. M
. F
a
illa
, E
. P
iz
z
i, R
. D
e
F
ra
n
c
e
sc
o
, A
. T
ra
m
o
n
ta
n
o
, R
e
d
e
sig
n
in
g
th
e
su
b
stra
te
sp
e
c
ific
ity
o
f th
e
h
e
p
a
titis C
v
iru
s N
S
3
p
ro
te
a
se
. F
o
ld
in
g
&
d
e
sig
n
1
, 3
5
(1
9
9
6
).
1
3
.
J. L
. K
im
e
t a
l., C
ry
sta
l stru
c
tu
re
o
f th
e
h
e
p
a
titis C
v
iru
s N
S
3
p
ro
te
a
se
d
o
m
a
in
c
o
m
p
le
x
e
d
w
ith
a
sy
n
th
e
tic
N
S
4
A
c
o
fa
c
to
r p
e
p
tid
e
. C
e
ll 8
7
, 3
4
3
(O
c
t 1
8
, 1
9
9
6
).
1
4
.
R
. A
. L
o
v
e
e
t a
l., T
h
e
c
ry
sta
l stru
c
tu
re
o
f h
e
p
a
titis C
v
iru
s N
S
3
p
ro
te
in
a
se
re
v
e
a
ls a
try
p
sin
-lik
e
fo
ld
a
n
d
a
stru
c
tu
ra
l z
in
c
b
in
d
in
g
site
. C
e
ll 8
7
, 3
3
1
(O
c
t 1
8
, 1
9
9
6
).
1
5
.
S
. E
in
a
v
, M
. E
la
z
a
r, T
. D
a
n
ie
li, J. S
. G
le
n
n
, A
n
u
c
le
o
tid
e
b
in
d
in
g
m
o
tif in
h
e
p
a
titis C
v
iru
s (H
C
V
) N
S
4
B
m
e
d
ia
te
s H
C
V
R
N
A
re
p
lic
a
tio
n
. J
o
u
rn
a
l o
f v
iro
lo
g
y
7
8
, 1
1
2
8
8
(O
c
t, 2
0
0
4
).
1
6
.
T
. L
. T
e
llin
g
h
u
ise
n
, J. M
a
rc
o
trig
ia
n
o
, A
. E
. G
o
rb
a
le
n
y
a
, C
. M
. R
ic
e
, T
h
e
N
S
5
A
p
ro
te
in
o
f h
e
p
a
titis C
v
iru
s is a
z
in
c
m
e
ta
llo
p
ro
te
in
. J
B
io
l C
h
e
m
2
7
9
, 4
8
5
7
6
(N
o
v
1
9
, 2
0
0
4
).
1
7
.
T
. T
a
n
a
k
a
, N
. K
a
to
, M
. J. C
h
o
, K
. S
u
g
iy
a
m
a
, K
. S
h
im
o
to
h
n
o
, S
tru
c
tu
re
o
f th
e
3
' te
rm
in
u
s o
f th
e
h
e
p
a
titis C
v
iru
s g
e
n
o
m
e
. J
o
u
rn
a
l o
f v
iro
lo
g
y
7
0
, 3
3
0
7
(M
a
y
, 1
9
9
6
).
1
8
.
T
. T
a
n
a
k
a
, N
. K
a
to
, M
. J. C
h
o
, K
. S
h
im
o
to
h
n
o
, A
n
o
v
e
l se
q
u
e
n
c
e
fo
u
n
d
a
t th
e
3
' te
rm
in
u
s o
f h
e
p
a
titis C
v
iru
s g
e
n
o
m
e
. B
io
c
h
e
m
ic
a
l a
n
d
b
io
p
h
y
sic
a
l re
se
a
rc
h
c
o
m
m
u
n
ic
a
tio
n
s 2
1
5
, 7
4
4
(O
c
t 1
3
, 1
9
9
5
).
1
9
.
A
. A
. K
o
ly
k
h
a
lo
v
, S
. M
. F
e
in
sto
n
e
, C
. M
. R
ic
e
, Id
e
n
tific
a
tio
n
o
f a
h
ig
h
ly
c
o
n
se
rv
e
d
se
q
u
e
n
c
e
e
le
m
e
n
t a
t th
e
3
' te
rm
in
u
s o
f h
e
p
a
titis C
v
iru
s g
e
n
o
m
e
R
N
A
. J
o
u
rn
a
l o
f v
iro
lo
g
y
7
0
, 3
3
6
3
(Ju
n
, 1
9
9
6
).
2
0
.
A
. S
b
a
rd
e
lla
ti, E
. S
c
a
rse
lli, L
. T
o
m
e
i, A
. S
. K
e
k
u
le
, C
. T
ra
b
o
n
i, Id
e
n
tific
a
tio
n
o
f a
n
o
v
e
l se
q
u
e
n
c
e
a
t th
e
3
' e
n
d
o
f th
e
G
B
v
iru
s B
g
e
n
o
m
e
. J
o
u
rn
a
l o
f v
iro
lo
g
y
7
3
, 1
0
5
4
6
(D
e
c
, 1
9
9
9
).
2
1
.
P
. D
. B
u
rb
e
lo
e
t a
l., S
e
ro
lo
g
y
-e
n
a
b
le
d
d
isc
o
v
e
ry
o
f g
e
n
e
tic
a
lly
d
iv
e
rse
h
e
p
a
c
iv
iru
se
s in
a
n
e
w
h
o
st. J
o
u
rn
a
l o
f v
iro
lo
g
y
8
6
, 6
1
7
1
(Ju
n
, 2
0
1
2
).
2
2
.
N
e
tN
G
ly
c
1
.0
S
e
rv
e
r. A
c
c
e
ssib
le
o
n
lin
e
: h
ttp
://w
w
w
.c
b
s.d
tu
.d
k
/se
rv
ic
e
s/N
e
tN
G
ly
c
/

Table S8. Comparison of BHV features with representative members of the Hepacivirus genus
One representative viral genome from each clade was used for the analysis. Amino acids are numbered according to the AUG initiator codon of the polyprotein.
NTR: Non-translated region; IRES: Internal ribosome entry site; C: Core; ARF: Alternate reading frame; ARFPs: Alternate reading frame protein/frameshift proteins; E1/E2: Envelope glycoproteins; p: p protein; Pfam: Protein family; NS: Non-structural proteins (NS2, NS3, NS4A, NS4B, NS5A, NS5B); NI: not identified; NK: not known; BHV: bat hepacivirus; CHV: canine hepacivirus; HCV: Hepatitis C virus.
*Predicted; # predicted proteins in the BHVs present in a region analogous to the p proteins (HCV p7, GBV-B p13).

1. Walsh D & Mohr I (2011) Viral subversion of the host protein synthesis machinery. Nat Rev Microbiol 9(12):860-875.
2. Stapleton JT, Foung S, Muerhoff AS, Bukh J, & Simmonds P (2011) The GB viruses: a review and proposed classification of GBV-A, GBV-C (HGV), and GBV-D in genus Pegivirus within the family Flaviviridae. J Gen Virol 92(Pt 2):233-246.
3. Kapoor A, et al. (2011) Characterization of a canine homolog of hepatitis C virus. Proc Natl Acad Sci U S A 108(28):11608-11613.
4. Punta M, et al. (2012) The Pfam protein families database. Nucleic Acids Res 40(Database issue):D290-301.
5. Branch AD, Stump DD, Gutierrez JA, Eng F, & Walewski JL (2005) The hepatitis C virus alternate reading frame (ARF) and its family of novel products: the alternate reading frame protein/F-protein, the double-frameshift protein, and others. Semin Liver Dis 25(1):105-117.
6. Vassilaki N & Mavromara P (2009) The HCV ARFP/F/core+1 protein: production and functional analysis of an unconventional viral product. IUBMB Life 61(7):739-752.
7. Xu Z, et al. (2001) Synthesis of a novel hepatitis C virus protein by ribosomal frameshift. EMBO J 20(14):3840-3848.
8. Griffin SD, et al. (2003) The p7 protein of hepatitis C virus forms an ion channel that is blocked by the antiviral drug, Amantadine. FEBS Lett 535(1-3):34-38.
9. Ghibaudo D, Cohen L, Penin F, & Martin A (2004) Characterization of GB virus B polyprotein processing reveals the existence of a novel 13-kDa protein with partial homology to hepatitis C virus p7 protein. J Biol Chem 279(24):24965-24975.
10. Takikawa S, et al. (2006) Functional analyses of GB virus B p13 protein: development of a recombinant GB virus B hepatitis virus with a p7 protein. Proc Natl Acad Sci U S A 103(9):3345-3350.
11. Lin C, Lindenbach BD, Pragai BM, McCourt DW, & Rice CM (1994) Processing in the hepatitis C virus E2-NS2 region: identification of p7 and two distinct E2-specific products with different C termini. J Virol 68(8):5063-5073.
12. Grakoui A, McCourt DW, Wychowski C, Feinstone SM, & Rice CM (1993) A second hepatitis C virus-encoded proteinase. Proc Natl Acad Sci U S A 90(22):10583-10587.
13. Hijikata M, et al. (1993) Two distinct proteinase activities required for the processing of a putative nonstructural precursor protein of hepatitis C virus. J Virol 67(8):4665-4675.
14. Bazan JF & Fletterick RJ (1989) Detection of a trypsin-like serine protease domain in flaviviruses and pestiviruses. Virology 171(2):637-639.
15. Failla CM, Pizzi E, De Francesco R, & Tramontano A (1996) Redesigning the substrate specificity of the hepatitis C virus NS3 protease. Fold Des 1(1):35-42.
16. Kim JL, et al. (1996) Crystal structure of the hepatitis C virus NS3 protease domain complexed with a synthetic NS4A cofactor peptide. Cell 87(2):343-355.
17. Love RA, et al. (1996) The crystal structure of hepatitis C virus NS3 proteinase reveals a trypsin-like fold and a structural zinc binding site. Cell 87(2):331-342.
18. Einav S, Elazar M, Danieli T, & Glenn JS (2004) A nucleotide binding motif in hepatitis C virus (HCV) NS4B mediates HCV RNA replication. J Virol 78(20):11288-11295.
19. Tellinghuisen TL, Marcotrigiano J, Gorbalenya AE, & Rice CM (2004) The NS5A protein of hepatitis C virus is a zinc metalloprotein. J Biol Chem 279(47):48576-48587.
20. Bukh J, Apgar CL, & Yanagi M (1999) Toward a surrogate model for hepatitis C virus: An infectious molecular clone of the GB virus-B hepatitis agent. Virology 262(2):470-478.
21. Kolykhalov AA, Feinstone SM, & Rice CM (1996) Identification of a highly conserved sequence element at the 3' terminus of hepatitis C virus genome RNA. J Virol 70(6):3363-3371.
22. Tanaka T, Kato N, Cho MJ, Sugiyama K, & Shimotohno K (1996) Structure of the 3' terminus of the hepatitis C virus genome. J Virol 70(5):3307-3312.
23. Sbardellati A, Scarselli E, Tomei L, Kekule AS, & Traboni C (1999) Identification of a novel sequence at the 3' end of the GB virus B genome. J Virol 73(12):10546-10550.

Region | Function | Feature/Conserved Motif | Clade G: PDB-34.1 (KC796093) | Clade H: PDB-694 (KC796083) | Clade I: GBV-C (HQ331234) | Clade J: GBV-A (U22303) | Clade K: PDB-106 (KC796075)
5' NTR | Translation initiation | IRES (1) | NK | NK | NC (2, 3) | NC (2, 3) | NK
VR | Unknown | VR | NI | NI | NI | NI
E1/E2 | Envelope glycoproteins | N-linked glycosylation sites | 8* | 6* | 4* | 4* | 5*
p# | Unknown | | ~28 kDa | ~25 kDa | 6 kDa (2) | 21 kDa (2) | ~18 kDa
NS2 | Cysteine protease | NS2/NS3 autocatalytic triad (H,E,C) (4, 5) | H1233, E1253, C1274 | H949, E969, C990 | H818, E838, C859 | H887, E907, C928 | H906, E926, C947
NS3 | Serine protease | Catalytic triad (HX23, DX56, S) (6) | H1365, D1389, S1447 | H1081, D1105, S1162 | H950, D974, S1031 | H1019, D1043, S1099 | H1038, D1062, S1118
NS3 | Serine protease | Cysteine zinc-binding residues (CXnCXnCX3H) (7, 8, 9) | C1405X1C1407X45C1453X3H1457 | C1121X1C1123X44C1168X3H1172 | C990X1C992X44C1037X3H1041 | C1059X1C1061X43C1105X3H1109 | C1078X1C1080X43C1124X3H1128
NS3 | Serine protease | S1 pocket residues (L,F,A) (8, 9) | NI | NI | NI | NI | NI
NS3 | NTPase/Helicase | Motif Ia (2) | V1507SYVAPTGSGKSTKLP | F1217EEKPLFVPTGSGKSTKIP | F1087KEAPLFMPTGAGKSTRVP | Y1155REAPLFLPTGAGKSTRVP | F1174REAPLFLPTGSGKSTRVP
NS3 | NTPase/Helicase | Motif Ib (2) | H1530RVLVLNPSVVTTKAM | Q1243NVLVCNPSIATTMAM | H1113KVLILNPSVATVRAM | H1181KVLVLNPSIATVRAM | H1200KVLVLNPSIATTRAM
NS3 | NTPase/Helicase | Motif II (2) | V1597VICDECH | V1310VICDEAH | V1180VICDECH | V1248VICDELH | V1267VICDECH
NS3 | NTPase/Helicase | Motif III (2) | L1629ILATATPP | L1342LYATATPA | V1212LYATATPP | L1280LFATATPP | L1299LFATATPP
NS3 | NTPase/Helicase | Motif IV (2) | V1676IFCHSKAE | L1389IFCHSKDQ | L1258VFCHSKAE | L1327LFCHSKVE | I1346IFCHSKLE
NS3 | NTPase/Helicase | Motif V (2) | T1718VVATDAISTGYTGNFASCT | V1431VCATDALMSGYTGNFDTVT | V1299VCATDALSTGYTGNFDSVT | C1367VCATDALSTGYTGNFDTVT | C1386VCATDALSTGYTGNFDTVT
NS3 | NTPase/Helicase | Motif VI (2) | T1761KPADAALRMQRRGR | V1474EPSPADTRMQRRGR | T1342VPASAELSMQRRGR | T1410VPAPAELRAQRRGR | T1429VPAPAELRMQRRGR
NS4A | NS3-protease cofactor | Pfam (10) | NI | NI | Pfam01006 | Pfam01006
NS4B | Membrane alterations | Nucleotide binding A-motif (GX4GK) (11) | NI | NI | NI | NI | NI
NS4B | Membrane alterations | Nucleotide binding B-motif (DXXA) (11) | NI | NI | NI | NI | NI
NS4B | Membrane alterations | Pfam | Pfam01001 | Pfam01001 | Pfam01001 | Pfam01001 | Pfam01001
NS5A | Phosphoprotein | Zinc finger motif (CX17CXCXnC) (12) | C2307X17C2325C2327X26C2354 | C2054X17C2072XC2074X22C2097 | C1901X17C1919XC1921X22C1944 | C1972X17C1990XC1992X22C2015 | C2014X17C2032XC2034X22C2057
NS5B | RNA-dependent RNA polymerase | Motif I (2) | R3016PKSEV | T2613AKQEV | T2415VKKEV | V2527TKREV | T2544CKREV
NS5B | RNA-dependent RNA polymerase | Motif II (2) | P3029KPPRLICYPSLEFRVAEKMILGDPAVV | R2626KPPRLICYPSLEFRVAEKMILGDPSVV | E2428KAPRLIVFPPLDFRIAEKLILGDPGRV | R2539KPPRFIVFPPLDFRIAEKMILGDPGIV | R2556KPPRFIVHPPLDFRVAEKMILGDPGKV
NS5B | RNA-dependent RNA polymerase | Motif III (2) | G3066FQHPPHKRAKVL | G2663FQYTPVERVRVL | A2465FQYTPNQRVREM | L2576FQYTPNQRVKAL | A2593FQYTPNQKVKHL
NS5B | RNA-dependent RNA polymerase | Motif IV (2) | V3093DGACFDSTIT | V2690DAICFDSTIT | V2492DATCFDSSIT | V2603DATCFDSSID | V2620DASVFDSTIT
NS5B | RNA-dependent RNA polymerase | Motif V (2) | R3147RCRASGTLTTSAGNSITCYIKVTAAC | R2744ACRASGVLTTSSSNSITCFLKVSAAC | R2546YCRSSGVLTTSASNCLTCYIKVKAAC | R2657QCRSSGVLTTSSANSITCYIKVSAAC | R2674RCRASGVLTTSSGNSITAYLKVKAAC
NS5B | RNA-dependent RNA polymerase | Motif VI (2) | F3183LIHGDDVVI | L2780LIHGDDTLI | L2582LIAGDDCLI | F2693FIAGDDCLI | L2710LIAGDDCLI
NS5B | RNA-dependent RNA polymerase | Motif VII (2) | S3225TAESCSAT | D2822TAESCSAT | D2624TAPFCSTW | D2735TAECCSAY | D2752TAETCSAY
NS5B | RNA-dependent RNA polymerase | Motif VIII (2) | P3245VLTTDMRRGLGR | W2846FLTTDFRRVLAR | F2644FLTTDFRRPLAR | W2755WLSTDMRKPLAR | W2772WMSTDMRKPLAR
3' NTR | Replication | Polyuridine sequences (13, 14) | NK | NK | NI (2) | NI (2) | NK
3' NTR | Replication | Conserved non-homopolymeric region (13, 14, 15) | NK | NK | NI (2) | NI (2) | NK

Table S9. Comparison of BPgV features with representative members of the Pegivirus genus
One representative viral genome from each clade was used for the analysis. Amino acids are numbered according to the AUG initiator codon of the polyprotein.
BPgV: bat pegivirus; NTR: Non-translated region; IRES: internal ribosome entry site; E1/E2: Envelope glycoproteins; p: p protein; NS: Non-structural proteins (NS2, NS3, NS4A, NS4B, NS5A, NS5B); Pfam: Protein family; NC: not classified; NI: not identified; NK: not known.
*Predicted; # predicted proteins in GBV-A, GBV-C and BPgVs present in a region analogous to the p proteins (HCV p7, GBV-B p13).

1. Walsh D & Mohr I (2011) Viral subversion of the host protein synthesis machinery. Nat Rev Microbiol 9(12):860-875.
2. Stapleton JT, Foung S, Muerhoff AS, Bukh J, & Simmonds P (2011) The GB viruses: a review and proposed classification of GBV-A, GBV-C (HGV), and GBV-D in genus Pegivirus within the family Flaviviridae. J Gen Virol 92(Pt 2):233-246.
3. Simons JN, Desai SM, Schultz DE, Lemon SM, & Mushahwar IK (1996) Translation initiation in GB viruses A and C: evidence for internal ribosome entry and implications for genome organization. J Virol 70(9):6126-6135.
4. Grakoui A, McCourt DW, Wychowski C, Feinstone SM, & Rice CM (1993) A second hepatitis C virus-encoded proteinase. Proc Natl Acad Sci U S A 90(22):10583-10587.
5. Hijikata M, et al. (1993) Two distinct proteinase activities required for the processing of a putative nonstructural precursor protein of hepatitis C virus. J Virol 67(8):4665-4675.
6. Bazan JF & Fletterick RJ (1989) Detection of a trypsin-like serine protease domain in flaviviruses and pestiviruses. Virology 171(2):637-639.
7. Failla CM, Pizzi E, De Francesco R, & Tramontano A (1996) Redesigning the substrate specificity of the hepatitis C virus NS3 protease. Fold Des 1(1):35-42.
8. Kim JL, et al. (1996) Crystal structure of the hepatitis C virus NS3 protease domain complexed with a synthetic NS4A cofactor peptide. Cell 87(2):343-355.
9. Love RA, et al. (1996) The crystal structure of hepatitis C virus NS3 proteinase reveals a trypsin-like fold and a structural zinc binding site. Cell 87(2):331-342.
10. Punta M, et al. (2012) The Pfam protein families database. Nucleic Acids Res 40(Database issue):D290-301.
11. Einav S, Elazar M, Danieli T, & Glenn JS (2004) A nucleotide binding motif in hepatitis C virus (HCV) NS4B mediates HCV RNA replication. J Virol 78(20):11288-11295.
12. Tellinghuisen TL, Marcotrigiano J, Gorbalenya AE, & Rice CM (2004) The NS5A protein of hepatitis C virus is a zinc metalloprotein. J Biol Chem 279(47):48576-48587.
13. Tanaka T, Kato N, Cho MJ, Sugiyama K, & Shimotohno K (1996) Structure of the 3' terminus of the hepatitis C virus genome. J Virol 70(5):3307-3312.
14. Kolykhalov AA, Feinstone SM, & Rice CM (1996) Identification of a highly conserved sequence element at the 3' terminus of hepatitis C virus genome RNA. J Virol 70(6):3363-3371.
15. Sbardellati A, Scarselli E, Tomei L, Kekule AS, & Traboni C (1999) Identification of a novel sequence at the 3' end of the GB virus B genome. J Virol 73(12):10546-10550.

Genus/virus Accession No.
Hepacivirus
Canine hepacivirus (CHV) JF744991/JF744997
GB virus B (GBV-B)
GB virus B (GBV-B) AF179612
GB virus B (GBV-B) U22304
GB virus B (GBV-B) AB630361
GB virus B (GBV-B) AJ277947
Hepatitis C virus (HCV)
Hepatitis C virus 1a (HCV-1a) M62321
Hepatitis C virus 1b (HCV-1b) D90208
Hepatitis C virus 2a (HCV-2a) D00944
Hepatitis C virus 2b (HCV-2b) D10988
Hepatitis C virus 3a (HCV-3a) NC_009824
Hepatitis C virus 3f (HCV-3f) D63821
Hepatitis C virus 4a (HCV-4a) Y11604
Hepatitis C virus 4d (HCV-4d) DQ516083
Hepatitis C virus 5a (HCV-5a) Y13184
Hepatitis C virus 6a (HCV-6a) Y12083
Hepatitis C virus 6g (HCV-6g) D63822
Hepatitis C virus 7a (HCV-7a) EF108306
Hepatitis C virus strain H77 AF011751
Nonprimate hepacivirus (NPHV)
Nonprimate hepacivirus (NPHV) JQ434001
Nonprimate hepacivirus (NPHV) JQ434002
Nonprimate hepacivirus (NPHV) JQ434003
Nonprimate hepacivirus (NPHV) JQ434004
Nonprimate hepacivirus (NPHV) JQ434008
Pegivirus*
GB virus A (GBV-A)
GB virus A (GBV-A) U22303
GB virus A-Alab (GBV-A-Alab) U94421
GB virus A-myx (GBV-A-myx) AF023424
GB virus A-tri (GBV-A-tri) AF023425
GB virus C (GBV-C)
GB virus C 1 (GBV-C-1) AB003291
GB virus C 2 (GBV-C-2) AB003289
GB virus C 3 (GBV-C-3) D87715
GB virus C 4 (GBV-C-4) AB018667
GB virus C 5 (GBV-C-5) AY949771
GB virus C 6 (GBV-C-6) AB003292
GB virus C 7 (GBV-C-7) HQ331234
GB virus C-trog (GBV-C-trog) AF070476
GB virus D (GBV-D)
GB virus D (GBV-D) GU566734
GB virus D (GBV-D) GU566735
*proposed genus (5)
Table S10. Virus abbreviations and GenBank accession numbers used in this study
Nested PCR Assay   Oligonucleotide ID   Target Region   Sequence (5'→3')
1 (PCR1)           BHV-1-F1             NS5B            GTAGCGGAGAAGATGTATCTGGG
1 (PCR1)           BHV-1-R1             NS5B            GCCTTAGCCTTGAGAAAGCAGGTGAT
1 (PCR2)           BHV-1-F2             NS5B            GAGAAGATGTATCTGGGGGACGT
1 (PCR2)           BHV-1-R2             NS5B            AGAAAGCAGGTGATGGTATTGCC
2 (PCR1)           BHV-2-F1             NS5B            CCAAARGTWGTBAAGGCTGTGCT
2 (PCR1)           BHV-2-R1             NS5B            ACTTTGAKCCASGCAGTKARACAGTT
2 (PCR2)           BHV-2-F2             NS5B            GCTGTGCTSAAGGAMGAGTACGGCT
2 (PCR2)           BHV-2-R2             NS5B            CCASGCAGTKARACAGTTACTRGAG
3 (PCR1)           BPgV-3-F1            NS5B            GTSGCBGARAAGATGATCCTGGG
3 (PCR1)           BPgV-3-R1            NS5B            GCCTTCACCTTSAGGTARCAGGTGAT
3 (PCR2)           BPgV-3-F2            NS5B            GAGAAGATGTATCTGGGGGACGT
3 (PCR2)           BPgV-3-R2            NS5B            AGAAAGCAGGTGATGGTATTGCC
4 (PCR1)           BPgV-4-F1            NS5B            GTRGCNGARAARATGATCHTGGG
4 (PCR1)           BPgV-4-R1            NS5B            GCMGTSACTTTRAGGTARCAKGTRAT
4 (PCR2)           BPgV-4-F2            NS5B            GARAARATGATCHTGGGMGAYCC
4 (PCR2)           BPgV-4-R2            NS5B            AGGTARCAKGTRATGGARTTGCC
Table S11. Oligonucleotides used for BHV and BPgV screening
ID: identification; BHV: Bat hepacivirus, BPgV: Bat pegivirus; F: forward; R: reverse; NS: nonstructural.
B=C/G/T, H= A/C/T, K= G/T, M=A/C, N= A/C/G/T, R=A/G, S=C/G, Y=C/T, W=A/T
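The degenerate-base codes in the legend above determine how many distinct sequences each screening primer represents. The following is a minimal sketch, not part of the original study: the function names `degeneracy` and `expand` are illustrative assumptions, and the code table is limited to the codes listed in the legend.

```python
from itertools import product

# Degenerate-base codes as listed in the Table S11 legend, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "H": "ACT", "K": "GT", "M": "AC",
    "N": "ACGT", "R": "AG", "S": "CG", "Y": "CT", "W": "AT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every non-degenerate sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    bhv2_f1 = "CCAAARGTWGTBAAGGCTGTGCT"  # BHV-2-F1 from Table S11
    print(degeneracy(bhv2_f1))
    for sequence in expand(bhv2_f1):
        print(sequence)
```

Enumerating sequences is only practical for modest degeneracy; for highly degenerate primers, `degeneracy` alone avoids building the full list.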
Genus Clade Species Virus ID Oligonucleotide ID Sequence (5'→3')
Hepacivirus A 1 PDB-112 PDB112-7720F GGCGCTCCAAAGCAAGTCCCTT
PDB112-7868R GGAGAGAACCCTGATTCCGT
PDB-113 PDB113-98F GGCGCTCCAAAGCAAGTCCCTT
PDB113-246R GGAGAGAACCCTGATTCCGT
C 1 PDB-452 PDB452-8003F TGCGATACCCAATGCTTCGACT
PDB452-8143R TGGTCCACCTTGGTAGAGGCGA
PDB-445 PDB445-8109F TGCGATACCCAATGCTTCGACT
PDB445-8249R TGGTCCACCTTGGTAGAGGCGA
PDB-491.1 PDB491.1-7969F TGCGATACCCAATGCTTCGACT
PDB491.1-8109R TGGTCCACCTTGGTAGAGGCGA
D 1 PDB-829 PDB829-8202F GGCACCAAACCAAAAATCCTGTCGC
PDB829-8360R ACAGCCTGTCATGCAAAGCACG
PDB-261 PDB261-50F GGCACCAAACCAAAAATCCTGTCGC
PDB261-208R ACAGCCTGTCATGCAAAGCACG
PDB-632B PDB632B-56F GGCACCAAACCAAAAATCCTGTCGC
PDB632B-214R ACAGCCTGTCATGCAAAGCACG
PDB-830 PDB830-56F GGCACCAAACCAAAAATCCTGTCGC
PDB830-214R ACAGCCTGTCATGCAAAGCACG
UN PDB-638B PDB638-145F GTCAGATTTATCAGGCAGCTGCAAA
PDB638-270R GGCAGTGCCTGTAGCCGCAA
Pegivirus G 1 PMX-1615 PMX1615-83F GATGGCGTGTGCTTTGACAG
PMX1615-156R GAAGCGGCGGCAAAGATGTT
2 PMX-1641 PMX1641-116F GGACATCGAACGCGAGGCGAA
PMX1641-196R GCGTAGTGCTCGTGCA
3 PDB-1698 PDB1698-9147F GGACATTGATCGCGAGGCGAA
PDB1698-9309R GGCACTGGTGGTCAGTGT
PDB-1734 PDB1734-9278F GGACATTGATCGCGAGGCGAA
PDB1734-9440R GGCACTGGTGGTCAGTGT
4 PDB-34.1 PDB34.1-9285F CCTGATGTCGGCTTGGGTGAT
PDB34.1-9412R GGTTCCAACACCCGCCGCACAA
5 PDB-620 PDB620-9334F GGCTGAGGTTCTCTATCGCAT
PDB620-9498R GCATGCAATCGGCGAACCAGA
6 PDB-76.1 PDB76.1-9016F CCAGTGTGCTACACATTGGA
PDB76.1-9141R GGCATGCAGCCGACGGACAA
PDB-99 PDB99-9054F CCGGTGTGTTACACATTGGA
PDB99-9179R GGAGTGCAGCCGACGAACAA
H 2 PDB-1715 PDB1715-8564F GCCTGCCGCTATATCGGTTGAC
PDB1715-8704R CCCTCGGCATAATACTTGCC
6 PDB-694 PDB694-8406F GCCTGGTTGCATCACAGTGGAT
PDB694-8546R CCACCTGAGTAATACGCA
7 PDB-303 PDB303-8582F GCCAGCATAGTACTTGCCGA
PDB303-8733R CCTCGAAGGTTAGGCCTGGTT
K 2 PDB-24 PDB24-8369F GCGCAGATTGGTGGACGTT
PDB24-8507R CGGCCTGTCACTGGCAGCA
3 PDB-491.2 PDB491.2-8391F CGCAGGTTGGTCGACGTGT
PDB491.2-8509R GCGAACAACTTCGCCTCGAA
PDB-534 PDB534-8362F CGCAGGTTGGTCGACGTGT
PDB534-8480R GCGAACAACTTCGCCTCAAA
PDB-664 PDB-664-181F AAGTTGTTCGCGGCTGCGAGCG
PDB-664-304R CCGAAGACCGACACCGCCGGTA
PDB-725 PDB-725-178F AAGTTGTTCGCGGCTGCGAGCG
PDB-725-301R CCGAAGACCGACACCGCCGGTA
4 PDB-106 PDB106-8234F GCGCTGCATAACGGTTGAT
PDB106-8344R CGCACGAGCTCAGGCTGAT
5 PDB-307.lu PDB307.lu-172F ACCCCGTCAGGGGTGACCAT
PDB307.lu-290R CACCCGTCAGGAGTAACCAT
6 PDB-34.2 PDB34.2-99F TGGTTGAGATGTGGCAGGAGAA
PDB34.2-217R CGCTAGGCTGAAAATGTCGCGCT
PDB-76.2 PDB76.2-51F GGGAAAGCTTATGCGTTCCAAT
PDB76.2-201R CGCGCTCGACGCGCATGTCCT
PDB-838 PDB838-8322F ACGGAAGAGGACATGCG
PDB838-8440R ACCCCGTCAGGGGTGACCAT
PDB-423 PDB423-169F ACGGAGGAGGACATGCG
PDB838-287R ACCCCGTCAGGGGTTACCAT
Table S12. NS5B real time PCR oligonucleotides used for virus quantification

UN: undetermined clade, ID: identification; F: forward; R: reverse; NS: nonstructural, lu: lung.
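The oligonucleotide IDs in Table S12 follow a visible pattern: a virus label, a number, and a trailing F or R. The sketch below splits an ID into those parts. It assumes, without confirmation from the source, that the number is a nucleotide coordinate, so the difference it reports for a primer pair is only a rough proxy for amplicon size. The names `parse_oligo_id` and `pair_spacing` are hypothetical helpers introduced for illustration.

```python
import re

# Pattern read off the IDs in Table S12: <virus label>-<number><F or R>.
# Assumption: the number is a nucleotide coordinate (the source does not say so).
OLIGO_ID = re.compile(r"^(?P<virus>.+)-(?P<pos>\d+)(?P<strand>[FR])$")


def parse_oligo_id(oligo_id: str) -> tuple[str, int, str]:
    """Split an ID such as 'PDB112-7720F' into (virus label, number, F/R)."""
    match = OLIGO_ID.match(oligo_id)
    if match is None:
        raise ValueError(f"unrecognised oligonucleotide ID: {oligo_id!r}")
    return match["virus"], int(match["pos"]), match["strand"]


def pair_spacing(forward_id: str, reverse_id: str) -> int:
    """Difference between the numbers encoded in a forward/reverse ID pair."""
    _, forward_pos, forward_strand = parse_oligo_id(forward_id)
    _, reverse_pos, reverse_strand = parse_oligo_id(reverse_id)
    if (forward_strand, reverse_strand) != ("F", "R"):
        raise ValueError("expected a forward (F) ID followed by a reverse (R) ID")
    return reverse_pos - forward_pos


if __name__ == "__main__":
    print(parse_oligo_id("PDB491.1-7969F"))
    print(pair_spacing("PDB112-7720F", "PDB112-7868R"))
```

The virus labels of a pair are deliberately not compared, because the table itself contains IDs whose labels differ in punctuation from the Virus ID column.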

[Figure S1 image: two Bayesian phylogenetic trees (helicase, left; RdRp, right) with clades labeled A to K, node posterior probability values, and scale bar labels 0.3 and 0.2.]
16/20
Figure S1
Figure S1: Bayesian phylogenetic trees of the complete helicase (left) and RdRp (right) genes of
selected members of the Hepacivirus and Pegivirus genera and representatives from each clade of
viruses identified in this study. The different clades (A to K) are highlighted in color. Bayesian
posterior probabilities > 0.7 for each clade are designated for major nodes only. The scale bar
indicates the average number of amino acid substitutions per site. The virus names corresponding
to the abbreviations and GenBank accession numbers are included in Table S10.
[Figure S2 image: schematic of the 5' NTR, core (C) coding region, and 3' NTR of HCV (AF011751) and of BHVs PDB-829, PDB-445, PDB-452, and PDB-491.1, with labels for the AUG start codons, GUG and GCG codons, the -41 nt position, the ARF, and the HCV ARFP/F +1 ribosomal frameshift.]
Figure S2: Comparison of the open reading frame in the core protein-coding region of HCV and in
BHV PDB-829. The HCV alternate reading frame proteins (ARFPs) are derived either through
ribosomal frameshift or internal translation initiation (10, 11). The HCV ARFP/F protein (orange) is
encoded in an alternate reading frame overlapping the core-encoding region (green) by ribosomal
frameshift (+1 frame relative to the main C ORF). A putative alternate reading frame (ARF, yellow)
is identified for BHV PDB-829 (clade D) at the potential GUG start codon (-41 nt) that continues
into the core-coding region and has a coding capacity of 156 aa. No alternate reading frames were
identified for the BHVs from clade C (PDB-445, PDB-452, and PDB-491.1). The AUG start codon for C is
indicated in bold. NTR: non-translated region; C: core.
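
The idea of an alternate reading frame that starts upstream of the main AUG and runs into the core-coding region can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the RNA string is a made-up toy sequence (not a BHV or HCV sequence), the function names are hypothetical, and only the three standard stop codons are considered.

```python
# Toy illustration of comparing a main ORF with an upstream alternate frame.
# The sequence below is hypothetical and chosen only to keep the example small.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def codons_before_stop(rna: str, start: int) -> int:
    """Count sense codons read from `start` until a stop codon or the 3' end."""
    count = 0
    for i in range(start, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return count


def frame_offset(alt_start: int, main_start: int) -> int:
    """Reading frame of `alt_start` relative to `main_start` (0, 1, or 2)."""
    return (alt_start - main_start) % 3


if __name__ == "__main__":
    toy = "GUGCCAUGGCAUAAGCUUGA"      # GUG at index 0, AUG at index 5
    main_start = toy.index("AUG")
    alt_start = toy.index("GUG")
    print("main ORF codons:", codons_before_stop(toy, main_start))
    print("alternate frame codons:", codons_before_stop(toy, alt_start))
    print("frame offset:", frame_offset(alt_start, main_start))
```

If the A of the AUG is taken as position 0 (a numbering convention assumed here, not stated in the legend), `frame_offset(-41, 0)` evaluates to 1.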
| | | | | | | | | | |
HCV-1a - - - Y Q V R N - - S T G L Y H V T N D C P N S S I V Y E A A D A I L H T P G C V P C V R E G N A S - R C W V A M T P T V A T R D G K L P A T - Q L R R H I D L L V G S A T L C S A L Y V G D L C G S V
HCV-1b - - - Y E V R N - - V S G I Y H V T N D C S N S S I V Y E A A D M I M H T P G C V P C V R E S N F S - R C W V A L T P T L A A R N S S I P T T - T I R R H V D L L V G A A A L C S A M Y V G D L C G S V
HCV-2a - - - A E V K N - - I S T G Y M V T N D C T N D S I T W Q L Q A A V L H V P G C V P C E K V G N T S - R C W I P V S P N V A V Q Q P G A L T Q - G L R T H I D M V V M S A T L C S A L Y V G D L C G G V
HCV-2b - - - V E V R N - - I S S S Y Y A T N D C S N N S I T W Q L T D A V L H L P G C V P C E N D N G T L - H C W I Q V T P N V A V K H R G A L T R - S L R T H V D M I V M A A T A C S A L Y V G D V C G A V
HCV-3a - - - L E W R N - - T S G L Y V L T N D C S N S S I V Y E A D D V I L H T P G C V P C V Q D G N T S - T C W T P V T P T V A V R Y V G A T T A - S I R S H V D L L V G A A T M C S A L Y V G D M C G A V
HCV-3f - - - L E Y R N - - A S G L Y T V T N D C S N G S I V Y E A G D V I L H L P G C I P C V R L N N A S - K C W T P V S P T V A V S R P G A A T A - S L R T H V D M M V G A A T L C S A L Y V G D L C G A L
HCV-4a - - - V N Y R N - - V S G I Y H V T N D C P N S S I V Y E A D H H I M H L P G C V P C V R E G N Q S - R C W V A L T P T V A A P Y I G A P L E - S L R S H V D L M V G A A T V C S G L Y I G D L C G G L
HCV-4d - - - Y N Y R N - - S S G V Y H V T N D C P N S S I V Y E T E H H I L H L P G C V P C V R A G N K S - S C W V S L T P T V A A P H L N A P L E - S L R R H V D L M V G S A T L C S A L Y I G D V C G G A
HCV-5a - - - V P Y R N - - A S G V Y H V T N D C P N S S I V Y E A D N L I L H A P G C V P C V L E D N V S - R C W V Q I T P T L S A P S F G A V T A - L L R R A V D Y L A G G A A F C S A L Y V G D A C G A L
HCV-6g - - - V N Y A N - - K S G I Y H L T N D C P N S S M V Y E A E A I I L H L P G C V P C I R T G N Q S - R C W T P A T P T L A I P N S T V P A S - G F R Q H I D L M V G A A A L C S A M Y L G D L C G G V
HCV-6a - - - L T Y G N - - S S G L Y H L T N D C S N S S I V L E A D A M I L H L P G C L P C V R V G N Q S - T C W H A V S P T L A T P N A S T P A T - G F R R H V D L L A G A A V V C S S L Y I G D L C G S L
HCV-7a - - - Y E V R N - - S S G V Y H L T N D C P N A S I V Y E T D N A I L H E P G C V P C V R E G N T S - R C W E P V A P T L A V R Y R G A L T D - D L R T H I D L V V A S A T L C S A L Y V G D I C G A I
GBV-B - - A R V T D P - - D T N T T I L T N C C Q R N Q V I Y C S P S T C L H E P G C V I C A D E - - - - - - C W V P A N P Y I S H P S N W T G T D S F L A D H I D F V M G A L V T C D A L D I G E L C G A C
NPHV - - S V - V R N - - - G G - H V V S N D C N S S Q I L W A A S D W A I H E V G C V P C V D S - - - - - T C W V P L T S S I S V R N E S V I V R - G L G S H I D V L A A M A S V C S T L G I G E A C G T A
CHV - - S V - V R N - - - G G - H V V S N D C N S S Q I L W A S S D W A I H E V G C I P C V D G - - - - - V C W V P L T S S I S V R N E S V I V R - G L G S H I D V L S A M A S V C S T L G I G E A C G A A
PDB-112 V P A I R F K A E Q E H T W Y A L T N C C P P E S V R Y C T F H T C L H D S G C A I C E R A G D G N V T C W I P D G V F S S H P P G Y E G V D P W L A N H I E Y V S A A V L L C D W L E V G E I C S M T
PDB-829 - - S Y A S H T C Q V G S D V V F T N A C N P D E I Y F C T D Y G C W H A G G C V P C V D G - - - - - E C W H R L S P S F S L K N D S L E S L - G L I P H I D A L M M L C A T C D A L Y I G E A C G M A
PDB-445 - - S F A S H S C Q V G N D V I V T N A C N S D E I Y F C S E D I C W H A G G C V P C E G G - - - - - K C W E R I G V T L S I R N E S V R L T - S M L P H I D G L L M L C A A C D A L G I G E V C G V G
PDB-491.1 - - S F A S H S C Q V G N D V I V T N A C N S D E I Y F C S E D I C W H A G G C V P C E G G - - - - - K C W E R I G V T L S I R N E S V R L T - S M L P H I D G L L M L C A A C D A L G I G E V C G V G
PDB-452 - - S F A S H S C Q V G N D V I V T N A C N S D E I Y F C S E D I C W H A G G C V P C E G G - - - - - K C W E R I G V T L S I R N E S V R L T - S M L P H I D G L L M L C A A C D A L G I G E V C G V G
| | | | | | | | | | |
HCV-1a F L V G Q - L F T F S P R R H W T T Q G C N C S I Y P G H I T G H R M A W D M M M N W - - - - S P T T A L V M A Q L L R I P Q A I L D M I A G A H W G V L A G I A Y F S M V G N W A K V L V V L L L F A
HCV-1b F L V S Q - L F T F S P R R Y E T V Q D C N C S I Y P G H V S G H R M A W D M M M N W - - - - S P T T A L V V S Q L L R I P Q A V V D M V A G A H W G V L A G L A Y Y S M V G N W A K V L I V M L L F A
HCV-2a M L A A Q - M F I V S P Q H H W F V Q D C N C S I Y P G T I T G H R M A W D M M M N W - - - - S P T A T M I L A Y A M R V P E V I I D I I G G A H W G V M F G L A Y F S M Q G A W A K V V V I L L L A A
HCV-2b M I L S Q - A F M V S P Q R H N F T Q E C N C S I Y Q G H I T G H R M A W D M M L S W - - - - S P T L T M I L A Y A A R V P E L V L E I I F G G H W G V V F G L A Y F S M Q G A W A K V I A I L L L V A
HCV-3a F L V G Q - A F T F R P R R H Q T V Q T C N C S L Y P G H L S G H R M A W D M M M N W - - - - S P A V G M V V A H V L R L P Q T L F D I M A G A H W G I L A G L A Y Y S M Q G N W A K V A I I M V M F S
HCV-3f F L V G Q - G F S W R H R Q H W T V Q D C N C S I Y P G H L T G H R M A W D M M M N W - - - - S P A M T L I V S Q V L R L P Q T M F D L V I G A H W G V M A G V A Y Y S M Q G N W A K V F L V L C L F S
HCV-4a F L V G Q - M F S F R P R R H W T T Q D C N C S I Y T G H I T G H R M A W D M M M N W - - - - S P T T T L V L A Q V M R I P T T L V D L L S G G H W G V L V G V A Y F S M Q A N W A K V I L V L F L F A
HCV-4d F L V G Q - L F T F Q P R R H W T T Q D C N C S I Y T G H I T G H R M A W D M M M N W - - - - S P T T T L V L A Q L M R I P S A M V D L L A G G H W G I L V G V A Y F S M Q A N W A K V I L V L F L F A
HCV-5a S L V G Q - M F T Y K P R Q H T T V Q D C N C S I Y S G H I T G H R M A W D M M M K W - - - - S P T T A L L M A Q L L R I P Q V V I D I I A G G H W G V L L A A A Y F A S T A N W A K V I L V L F L F A
HCV-6g F L V G Q - L F T F R P R I H Q T V Q D C N C S I Y T G H V T G H R M A W D M M M N W - - - - S P T A T F V V S S A L R A P Q V L F D I F A G G H W G I I G A L L Y Y S T A A N W A K V I I V L L L F A
HCV-6a F L A G Q - L F A F Q P R R H W T V Q D C N C S I Y T G H V T G H K M A W D M M M N W - - - - S P T T T L V L S S I L R V P E I C A S V I F G G H W G I L L A V A Y F G M A G N W L K V L A V L F L F A
HCV-7a F I A S Q - A V L W K P G G G R I V Q D C N C S I Y P G H V T G H R M A W D M M Q N W - - - - A P A L S M V A A Y A V R V P G V I I T T V A G G H W G V L F G L A Y F G M A G N W A K V I L I M L L M S
GBV-B V L V G D W L V R H W L I H I D L N E T G T C - - Y L E V P T G I D P G F L G F I G W M A G K V E A V I F L T K L A S Q V P Y A I A T M F S S V H Y L A V G A L I Y Y A S R G K W Y Q L L L A L M L Y I
NPHV T L T Y I T F L S R F F M S L N L T N D C E C F L Y P G A I S T F E F T L R A L Q S M - - - - M P N L S G F V S M F S G V P N T L F T I F T N G H W G V I L A L C L Y G T T N N Y F K L C L L L L A Y S
CHV T L T Y I T F L S R F F M P L N L T N D C E C F L Y P G A I S T F E F T M R A L Q S M - - - - M P N L S G F L S M F S G L P N T L F T I F T N G H W G V I L A L C L Y G T T N D Y F K L C L L L L A Y S
PDB-112 V W A V D W S L G H M Y H H I D L T Q N A T C - - W L S K P T G I D P G I V S W L G W V K S E L G L I A Y F I G W L S K L P V A V V H L V V N M H Y F T L A S F L Y Y F S Q G K P V K V A L V F F V Y -
PDB-829 V L G F E W I F H L F H S S Y E F T C E C D C Y L L L E A P S S I K V S F D V F Q S Y - - - - F S G L Q W L G A V L A E V P G A L L G L V T G R H L G V L F A V A Y Y A M G T A P L R A V G V I L L Y L
PDB-445 V L V F E T T Y H L H S V S R N F S C N C D C H L L E T P K S A S A I S F S V V S S Y - - - - F K D L T W V T S L F A E V P G A V L Q L V G G G H L G V L F A L L Y Y G L G P A P L R A V L V L L L F L
PDB-491.1 V L V F E T T Y H L H S V S R N F S C N C D C H L L E T P K S A S A I S F S V V S S Y - - - - F K D L T W V T S L F A E V P G A V L Q L V G G G H L G V L F A L L Y Y G L G P A P L R A V L V L L L F L
PDB-452 V L V F E T T Y H L H S V S R N F S C N C D C H L L E T P K S A S A I S F S V V S S Y - - - - F K D L T W V T S L F A E V P G A V L Q L V G G G H L G V L F A L L Y Y G L G P A P L R A V L V L L L F L
| | | | |
*
| | | | | |
HCV-1a G V D A E T H V T G G S A G H T V S G F V S - L L A P G A K Q N V Q L I N T N G S W H - - - - L N S T A L N C N D S L N T G W L A G L F Y H H K F N S S G C P E R L A S C R P L T D F D Q G W G P I S Y
HCV-1b G V D G H T H V T G G R V A S S T Q S L V S - W L S Q G P S Q K I Q L V N T N G S W H - - - - I N R T A L N C N D S L Q T G F I A A L F Y A H R F N A S G C P E R M A S C R P I D E F A Q G W G P I T H
HCV-2a G V D A Q T H T V G G S T A H N A R T L T G - M F S L G A R Q K I Q L I N T N G S W H - - - - I N R T A L N C N D S L H T G F L A S L F Y T H S F N S S G C P E R M S A C R S I E A F R V G W G A L Q Y
HCV-2b G V D A T T Y S S G Q E A G R T V A G F A G - L F T T G A K Q N L Y L I N T N G S W H - - - - I N R T A L N C N D S L Q T G F L A S L F Y T H K F N S S G C P E R L S S C R G L D D F R I G W G T L E Y
HCV-3a G V D A H T Y T T G G T A S R H T Q A F A G - L F D I G P Q Q K L Q L V N T N G S W H - - - - I N S T A L N C N E S I N T G F I A G L F Y Y H K F N S T G C P Q R L S S C K P I T F F R Q G W G P L T D
HCV-3f G V D A S T T I T G G V A A S G A F T I T S - L F S T G A K Q P L H L V N T N G S W H - - - - I N R T A L N C N D S L N T G F I A G L L Y Y H K F N S S G C V E R M S A C S P L D R F A Q G W G P L G P
HCV-4a G V D A E T H V S G A A V G R S T A G L A N - L F S S G S K Q N L Q L I N S N G S W H - - - - I N R T A L N C N D S L N T G F L A S L F Y T H K F N S S G C S E R L A C C K S L D S Y G Q G W G P L G V
HCV-4d G V D A Q T H I T G G K A G R D A L T F A G - L F T M G G Q Q H I Q L I N T N G S W H - - - - I N R T A L N C N D S L N T G F L A S L F Y Y R R F N S S G C P E R L A S C S S L D S L P Q G W G P L G I
HCV-5a G V D G R T H T V G G T V G Q G L K S L T S - F F N P G P Q R Q L Q F V N T N G S W H - - - - I N S T A L N C N D S L Q T G F I A G L M Y A H K F N S S G C P E R M S S C R P L A A F D Q G W G T I S Y
HCV-6g G V D A S T Y V A - S S V S Q A T S G L V S - L F S A G A R Q N L Q L I N T N G S W H - - - - I N R T A L N C N D S L Q T G F I A S L F Y R N K F N A T G C P E R L S A C K T L D S F D Q G W G P I T Y
HCV-6a G V E A Q T M I A - H G V S Q T T S G F A S - L L T P G A K Q N I Q L I N T N G S W H - - - - I N R T A L N C N D S L Q T G F L A S L F Y T H K F N S S G C P E R M A A C K P L A E F R Q G W G Q I T H
HCV-7a G V D A E T M A V G A R A A H T T G A L V S - L L N P G P S Q R L Q L I N T N G S W H - - - - I N R T A L N C N D S L Q T G F I A A L F Y T H R F N S S G C P E R M A S C K P L S D F D Q G W G P L W Y
GBV-B E A T S G N P I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - R V P T G C S I A E F - - - - - - - - - - - - - - - - - C S P L M I P C - - - - - - - - - - - - - - -
NPHV G - - - - - - - - - - L V S C D - - - S D Y I N V S L S C N F T V K Q M W G W T F F P K W A L L N G Q R L N C - - - - - - - - - - - - - - - - - - T E G S P Y N P K - - C K G P M D F - N I T T D P V V
CHV G - - - - - - - - - - L V S C D D Y L N V S - - - - L S C N F T V K E M W G W T F F P K W A L L N G Q R L N C - - - - - - - - - - - - - - - - - - T E G S P Y N P K - - C K G P F D F - N V T T D P Y I
PDB-112 - V E A A A A - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - M P V N C S W F A T Q N E L D L W - - - - - - - - - - C S P L V Q P C - - - - - - - - - - - - - - -
PDB-829 T A A Q A N P I P T R V A N A S D I S A C S P I Q P P G P L W G L S W L - - G G V W H - - - - R Q I T D A P V - - - - - - - - - - - - - - - - - - L L A K P P N Y T A - L R N F P I H Q K P H P Y G M Y
PDB-445 T A S Q A A E H T - R V A G A D D F G S C G - I R P P G P L W S S L W L N Y F - - W K - - - - T N T T T A P - - - - - - - - - - - - - - - - - - - T L G T T P Q H P P S G H - L T V - - Q P W P N T T W
PDB-491.1 T A S Q A A E H T - R V A G A E D F G S C G - I R P P G P L W S S L W L N Y F - - W K - - - - T N I T T A P - - - - - - - - - - - - - - - - - - - T L G T T P Q H P P S G H - L T V - - Q P W P N T T W
PDB-452 T A S Q A A E H T - R V A G A D D F G S C G - I R P P G P L W S S L W L N Y F - - W K - - - - T N T T T A P - - - - - - - - - - - - - - - - - - - T L G T T P Q H P P S G H - L V V - - Q P W P N T T W
| | | | | | | | | |
HCV-1a - - A N G S G P D Q R P Y C W H Y P P K P C G I - V P A K S V C G P V Y C F T P S P V - V V G T T D R S G A P T Y S W G E N D T D V F V L N N T R P P L G N W F G C T W M N S T G F T K V C G - - - - -
HCV-1b - - D M P E S S D Q R P Y C W H Y A P R P C G I - V P A S Q V C G P V Y C F T P S P V - V V G T T D R F G A P T Y S W G E N E T D V L L L S N T R P P Q G N W F G C T W M N S T G F T K T C G - - - - -
HCV-2a E D N V T N P E D M R P Y C W H Y P P R Q C G V - V S A S S V C G P V Y C F T P S P V - V V G T T D R L G A P T Y T W G E N E T D V F L L N S T R P P Q G S W F G C T W M N S T G Y T K T C G - - - - -
HCV-2b E T N V T N D G D M R P Y C W H Y P P R P C G I - V P A R T V C G P V Y C F T P S P V - V V G T T D K Q G V P T Y T W G E N E T D V F L L N S T R P P R G A W F G C T W M N G T G F T K T C G - - - - -
HCV-3a - A N I T G P S D D R P Y C W H Y A P R P C D I - V P A S S V C G P V Y C F T P S P V - V V G T T D A R G V P T Y T W G E N E K D V F L L K S Q R P P S G R W F G C S W M N S T G F L K T C G - - - - -
HCV-3f - A N I S G P S S E K P Y C W H Y A P R P C D T - V P A Q S V C G P V Y C F T P S P V - V V G A T D K R G A P T Y T W G E N E S D V F L L E S A R P P T E P W F G C T W M N G S G Y V K T C G - - - - -
HCV-4a A - N I S G S S D D R P Y C W H Y A P R P C G I - V P A S S V C G P V Y C F T P S P V - V V G T T D H V G V P T Y T W G E N E T D V F L L N S T R P P H G A W F G C V W M N S T G F T K T C G - - - - -
HCV-4d - - Y Q P N V P D T R P Y C W N Y T P R P C G T - V S A L T V C G P V Y C F T P S P V - V V G T T D R R G A P T Y T W G E N E T D V F L L N T T R P P R G A W F G C T W M N S T G F T K S C G - - - - -
HCV-5a - A T I S G P S D D K P Y C W H Y P P R P C G V - V P A R D V C G P V Y C F T P S P V - V V G T T D R R G C P T Y N W G S N E T D I L L L N N I R P P A G N W F G C T W M N S T G F V K N C G - - - - -
HCV-6g - A N I S G P A V E K P Y C W H Y P P R P C E V - V S A L N V C G P V Y C F T P S P V - V L G T T D R R G N P T Y T W G A N E T D V F M M S S L R P P A G G W Y G C T W M N T S G F V K T C G - - - - -
HCV-6a - K N V S G P S D D R P Y C W H Y A P R P C E V - V P A R S V C G P V Y C F T P S P V - V V G T T D K R G N P T Y T W G E N E T D V F M L E S L R P P T G G W F G C T W M N S T G F T K T C G - - - - -
HCV-7a - N S T E R P S D Q R P Y C W H Y A P S P C G I - V P A K D V C G P V Y C F T P S P V - V V G T T D R R G V P T Y T W G E N E S D V F L L N S T R P P Q G S W F G C S W M N T T G F T K T C G - - - - -
GBV-B - - - - - - - - - - - - - - - - - - - - P C - H S Y L S E N V - S E V I C Y S P K W T R P V T L E Y N N S I S W Y P Y - - - - - - - - - - - - - - T I P G A R - G C M V K F K N N - T W G C C R I R N V
NPHV A Y S G T R S H P P C P Y - - - H V S R P C - S V L N A S R V C G K P T C F G P A P I - E V G V T D R D G N L A S W N D T G Q F Y F D L R S P H R P P R G R W Y G C V W L N S T G W V K Q C G - - - - -
CHV A Y S G T R S H P P C P Y - - - H V S R P C - S V L D A S R V C G K P T C F G P A P I - E V G V T D R D G K L A S W N D S G Q F F F D L R S P H R P P R G R W Y G C V W L N S T G W V K Q C G - - - - -
PDB-112 - - - - - - - - - - - - - - - - - - - - R C S H L N L A E N V - S E A V C F N P F S V - P T K I - - - - - I G S Y V E - - - - - - - - - - - - - - L P P K A W - G C V I K F H N G - S A K C C A A R R V
PDB-829 - - - - G R S D T H G P Y - - - R I F P R C K K R L Y D H E V C G V V T C F N P W P H D L V R Q R G P N G T G W F N - - - - - - - - - L P Q T N R G P P D H N W G C L W L N L T N A L K G C G - - - - -
PDB-445 - G D G R G T R T H S P Y - - - R I F P R C K P F I P D G N V C G P V T C F T P W P F D L E R D K N K S G - - - Y H - - - - - - - - - L P Q G S R S A P D H F Y G C V W L N R T G F L L G C G - - - - -
PDB-491.1 - G D G R G T R T H S P Y - - - R I F P R C K P F I P D G N V C G P V T C F T P W P F D L E R D K N K S G - - - Y H - - - - - - - - - L P Q G S R S A P D H F Y G C V W L N R T G F L L G C G - - - - -
PDB-452 - G D G R G T R T H S P Y - - - R I F P R C K P F I P D G N V C G P V T C F T P W P F D L E R D K N K S G - - - Y H - - - - - - - - - L P Q G S R S A P D H F Y G C V W L N R T G F L L G C G - - - - -
| | | | | | | | | |
HCV-1a A P P C V I G G A G N N T - - - - - L H C P T D C F R K H P D A T Y S R C G S G P W I T P - - - - - - - - - - R C L V D Y P Y R L W H Y P C T I N Y T I F K I R M Y V G G V E H R L E - A A C N W T R G
HCV-1b G P P C N I G G V G N N T - - - - - L V C P T D C F R K H P E A T Y T K C G S G P W L T P - - - - - - - - - - R C M V D Y P Y R L W H Y P C T V N F T V F K V R M Y V G G V E H R L N - A A C N W T R G
HCV-2a A P P C R I R A D F N A S M D - - - L L C P T D C F R K H P D T T Y I K C G S G P W L T P - - - - - - - - - - R C L I D Y P Y R L W H Y P C T V N Y T I F K I R M Y V G G V E H R L T - A A C N F T R G
HCV-2b A P P C R I R K D Y N S T I D - - - L L C P T D C F R K H P D A T Y L K C G A G P W L T P - - - - - - - - - - R C L V D Y P Y R L W H Y P C T V N F T I F K A R M Y V G G V E H R F S - A A C N F T R G
HCV-3a A P P C N I Y G G E G N P H N E S D L F C P T D C F R K H P E T T Y S R C G A G P W L T P - - - - - - - - - - R C M V D Y P Y R L W H Y P C T V D F R L F K V R M F V G G F E H R F T - A A C N W T R G
HCV-3f A P P C H I Y G G R E G K S N N S - L V C P T D C F R K H P D A T Y N R C G A G P W L T P - - - - - - - - - - R C L V D Y P Y R L W H Y P C T V N Y T I F K V R M F V G G L E H R F N - A A C N W T R G
HCV-4a A P P C E V N T N N G T - - - - - - W H C P T D C F R K H P E T T Y A K C G S G P W I T P - - - - - - - - - - R C L I D Y P Y R L W H F P C T A N F S V F N I R T F V G G I E H R M Q - A A C N W T R G
HCV-4d G P P C S I T A N G S T - - - - - - W G C P T D C F R K H P E A T Y T K C G S G P W L T P - - - - - - - - - - R C L V D Y P Y R L W H Y P C T V N Y T V F K V R M Y I G G I E H R L D - A A C N W T R G
HCV-5a A P P C N L G P T G N N S - - - - - L K C P T D C F R K H P D A T Y T R C G S G P W L T P - - - - - - - - - - R C L V H Y P Y R L W H Y P C T V N Y T I F K V R M F I G G L E H R L E - A A C N W T Y G
HCV-6g A P P C N I R P N P E E N R T E T - L R C P T D C F R K H P G A T Y A K C G S G P W L T P - - - - - - - - - - R C L V D Y P Y R L W H Y P C T V N Y T L H K V R M Y I A G S E H R F T - A A C N W T R G
HCV-6a A P P C Q I V P G N Y N S S A N E - L L C P T D C F R K H P E A T Y Q R C G S G P W V T P - - - - - - - - - - R C L V D Y A Y R L W H Y P C T V N F T L H K V R M F V G G T E H R F D - V A C N W T R G
HCV-7a G P P C K I R P Q G A Q S N T S - - L T C P T D C F R K H P R A T Y S A C G S G P W L T P - - - - - - - - - - R C M V H Y P Y R L W H Y P C T V N F T I H K V R L Y I G G V E H R L D - A A C N W T R G
GBV-B P S Y C - - - - - - - - - - - - - - - T M G T D A V W N D T R N T Y E A C G V T P W L T T - A W H N G S A L K L A I L Q Y P G S K E M F K - P H N W M S G - H L Y F E G S D T P I V - - Y F Y D P V N S
NPHV A P P C N M K L M S N T S K T - - - F V C P S D C F R Q N P K A T Y Q L C G Q G P W I S H - - - - - - - - - - N C L I D Y T D R Y L H F P C T E N F T V Y P V R M I L G D G A R D V R - V A C K F N R S
CHV A P P C N M R L M S N K S K P - - - F V C P T D C F R Q N P K A T Y Q L C G Q G P W I T H - - - - - - - - - - S C L I D Y T D R Y L H F P C T E N F T V Y P V R M I L G D D A R D V R - V A C K F N R S
PDB-112 P D Y C - - - - - - - - - - - - - - K G C S S D C S W Q D P R Q T F E N C G T T P W V S T V R T P E G G V S K V L V L A H D N I P T I L G V P Y S W P S Y Q T Q W P E A R N R L L Y L K Y N N S W S D L
PDB-829 P P P C Q S G A - - - - - - - - - - Y V C G R D C F E V N P R M R F E A C G Q A P W L T D - - - - - - - - - - K L I I D Y P M R P I H Y P Q T T D W G I Y Q L R V S F P I L D G D L A - A A A F L N R S
PDB-445 P P P C L I G R - - - - - - - - - - Y A C A R D C F E V N P R A T F T L C G Q G P W I S P - - - - - - - - - - T A L I K Y P M A H V H W P Q V A E Y G E Y T I R F S S S L H S G N L P L L A K R T N N S
PDB-491.1 P P P C L I G R - - - - - - - - - - Y A C A R D C F E V N P R A T F T L C G Q G P W I S P - - - - - - - - - - T A L I K Y P M A H V H W P Q V A E Y G E Y T I R F S S S L H S G N L P L L A K R T N N S
PDB-452 P P P C L I G R - - - - - - - - - - Y A C A R D C F E V N P R A T F T L C G Q G P W I S P - - - - - - - - - - T A L I K Y P M A H V H W P Q V A E Y G E Y T I R F S S S L H S G N L P L L A K R T N N S
| | | | | | | | | |
HCV-1a E R C D L E D R D R S E L S P L L L T T T Q W Q V L P C S F T T L P A L S T G L I H L H Q N I V D V Q Y L Y G V G S S I A S W A I K W E Y V V L L F L L L A D A R V C S C L W M M L L I S - Q A E A - -
HCV-1b E R C D L E D R D R S E L S P L L L S T T E W Q I L P C S F T T L P A L S T G L I H L H R N I V D V Q Y L Y G I G S A V V S F A I K W E Y I L L L F L L L A D A R V C A C L W M M L L I A - Q A E A - -
HCV-2a D R C N L E D R D R S Q L S P L L H S T T E W A I L P C T Y S D L P A L S T G L L H L H Q N I V D V Q F M Y G L S P A L T K Y I V R W E W V V L L F L L L A D A R V C A C L W M L I L L G - Q A E A - -
HCV-2b D R C R L E D R D R G Q Q S P L L H S T T E W A V L P C S F S D L P A L S T G L L H L H Q N I V D V Q Y L Y G L S P A L T R Y I V K W E W V I L L F L L L A D A R I C A C L W M L I I L G - Q A E A - -
HCV-3a E R C D I E D R D R S E Q H P L L H S T T E L A I L P C S F T P M P A L S T G L I H L H Q N I V D V Q Y L Y G V G S G M V G W A L K W E F V I L V F L L L A D A R V C V A L W L M L M I S - Q T E A - -
HCV-3f E R C N L E D R D R S E M Y P L L H S T T E Q A I L P C S F V P I P A L S T G L I H L H Q N I V D V Q Y L Y G I S S G L V G W A I K W E F V I L I F L L L A D A R V C V V L W M M M L I S - Q A E A - -
HCV-4a E V C G L E H R D R V E L S P L L L T T T A W Q I L P C S F T T L P A L S T G L I H L H Q N I V D V Q Y L Y G V G S A V V S W A L K W E Y V V L A F L L L A D A R V S A Y L W M M F M V S - Q V E A - -
HCV-4d E P C D L E H R D R T E I S P L L L S T T Q W Q V L P C S F T T L P A L S T G L I H L H Q N I V D V Q Y L Y G V G S A V V S W A L K W E Y V V L A F L L L A G A R I C A C L W M M L L V A - Q V E A - -
HCV-5a E R C D L E D R D R A E L S P L L H T T T Q W A I L P C S F T P T P A L S T G L I H L H Q N I V D T Q Y L Y G L S S S I V S W A V K W E Y I M L V F L L L A D A R I C T C L L I L L L I - C Q A E A - -
HCV-6g E R C D L A D R D R I E M S P L L F S T T E L A I L P C S F T T M P A L S T G L I H L H Q N V V D V Q Y L Y G L S T S I V N W A I K W E Y V V L L F L V L A D S R I C L A L W L M L L I G - Q A E A - -
HCV-6a E R C E L H D R N R I E M S P L L F S T T Q L S I L P C S F S T M P A L S T G L I H L H Q N I V D V Q Y L Y G V S T N V T S W V V K W E Y I V L M F L V L A D A R I C T C L W L M L L I S - T V E A - -
HCV-7a E R C D L E D R D R V D M S P L L H S T T E L A I L P C S F V P L P A L S T G L I H L H Q N I V D A Q Y L Y G L S P A I I S W A I R W E W V V L V F L L L A D A R I C A C L W M M M L M A - Q A E A - -
GBV-B T L L P P E R W A R L P G T P P V V R G S W L Q V P Q G F Y S D V K D L A T G L I T K D K A W K N Y Q V L Y S A T G A L S L T G V T T K A V V L I L L G L C G S K Y L I L A Y L C Y L S L C F G R A S G
NPHV I S C R T E D R L R A S I V S L L Y S V T T A A V P P C H F S P L P A F T T G L I H L D R N L S D V Q Y V W A M T P S A V N I F L R L E W A V F F L L L L M D A K V C A I L W F C L C L A L Q A E A - -
CHV V S C R T E D R L R A S I V S L L Y S V T T A A V P P C H F S P L P A F T T G L I H L D R N L S D V Q Y V W A M T P S A V N I F L R L G W A V F F L L L L M D A K V C A I L W F C L C L A L Q A E A - -
PDB-112 D P H P - Q H W G R I P G W P E S Y R S W W I W V P K G L Y A D T R D M S T G L L T K D A K Y P E Y Q L V M S A T G S L S L A S I S T T I V V A A I M A F L G G R W S L L L - F C L A Q M L - - E G - -
PDB-829 E P V S A G R W R R A L G R P K L W T A T K I A V P P G H A Y E M P A L A S G F I S K D P F H Q D V Q T F I N G P G F A L G P L V S I K I A V L L T L L L M G S R I V L V G W I I I W - A Y W A D A - -
PDB-445 Q S V T K G R W Y R V P G N P N L Y D T V R M Q V P P N H F F P I P A M A S A Y I S K D P F Y T D V Q I F S S S P Q T S L I P L V S L K M A V L M L L L L M N A R V V L V L W V L F W - A Y A A E G - -
PDB-491.1 E P V T K G R W Y R V P G N P N L Y D T V R M Q V P P N H F F P I P A M A S A Y I S K D P F F T D V Q I F S S S P Q T S L I P L V S L K M A V L M L L L L M N A R V V L V L W V L F W - A Y A A E G - -
PDB-452 E P V T K G R W Y R V P G N P N L Y D T V R M Q V P P N H F F P I P A M A S A Y I S K D P F Y T D V Q I F S S S P Q T S L I P L V S L K M A V L M L L L L M N A R V V L V L W V L F W - A Y A A E G - -
Figure S3A: Amino acid sequence alignment of the envelope E1 and E2 of BHVs and representative members
of the Hepacivirus genus. A) Cysteine and asparagine residues are highlighted in green and blue, respectively.
Cysteine residues experimentally shown to be involved in disulfide bridges in HCV E2 are numbered
sequentially (CI-CXVIII) according to their location in the alignment and indicated above the sequences; numbers
under the sequences indicate disulfide-bond connectivity (12). Predicted N-glycosylation sites in E1 and E2 are
shown in purple open boxes, and experimentally determined sites in HCV E2 are shown in red open boxes
(13). HCV E2 residues shown to interact with CD81 are indicated with orange asterisks (12). Amino acid
residues are numbered with respect to the E1 protein. GenBank accession numbers are provided in Table
S10.
Disulfide bridge   Cysteine pair
1                  CI-CVIII
2                  CII-CIII
3                  CIV-CV
4                  CVI-CVII
5                  CIX-CX
6                  CXI-CXII
7                  CXIII-CXV
8                  CXIV-CXVI
9                  CXVII-CXVIII

Clade   Virus ID   Accession number
A       PDB-112    KC796077
B       GBV-B      U22304
C       PDB-452    KC796090
D       PDB-829    KC796074
E       CHV        JF744991
E       NPHV       JQ434008
F       HCV-3a     NC_009824

[Figure S3B image: the dot matrix marking which bridges are present in each virus is in the original figure.]
Figure S3B: Comparison of the predicted disulfide-bond connectivity in E2 of BHVs and NPHVs.
The 18 cysteine residues (CI-CXVIII) experimentally identified to be involved in the 9 disulfide
bridges (numbered 1 through 9) in HCV E2 are shown according to their disulfide-bond
connectivity (12). Dots indicate predicted (clades A-E) or experimental (clade F) disulfide
bridges.
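
The pairing of 18 cysteines into 9 bridges can be checked for internal consistency: every cysteine from CI to CXVIII should appear in exactly one pair. The sketch below is an illustration with hypothetical function names. It assumes the eighth pair is CXIV-CXVI; the column header as extracted from the figure reads CIV-CXVI, which would use CIV twice and leave CXIV unpaired.

```python
# Consistency check for a disulfide pairing table (illustrative sketch).

ROMAN = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral built from I, V, X to an integer."""
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN[ch]
        # Subtractive notation: a smaller value before a larger one is subtracted.
        total += -value if nxt in ROMAN and ROMAN[nxt] > value else value
    return total


def is_perfect_pairing(bridges: dict[int, tuple[str, str]], n_cys: int = 18) -> bool:
    """True if each cysteine 1..n_cys occurs in exactly one bridge."""
    used = sorted(roman_to_int(c) for pair in bridges.values() for c in pair)
    return used == list(range(1, n_cys + 1))


# Bridge 8 is written as XIV-XVI here by assumption (see the lead-in above).
BRIDGES = {
    1: ("I", "VIII"), 2: ("II", "III"), 3: ("IV", "V"),
    4: ("VI", "VII"), 5: ("IX", "X"), 6: ("XI", "XII"),
    7: ("XIII", "XV"), 8: ("XIV", "XVI"), 9: ("XVII", "XVIII"),
}

if __name__ == "__main__":
    print("consistent pairing:", is_perfect_pairing(BRIDGES))
```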
SI References



1. Delport W, Poon AF, Frost SD, & Kosakovsky Pond SL (2010) Datamonkey
2010: a suite of phylogenetic analysis tools for evolutionary biology.
Bioinformatics 26(19):2455-2457.
2. Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, & Frost SD (2006)
GARD: a genetic algorithm for recombination detection. Bioinformatics
22(24):3096-3098.
3. Martin DP, et al. (2010) RDP3: a flexible and fast computer program for
analyzing recombination. Bioinformatics 26(19):2462-2463.
4. Ronquist F & Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic
inference under mixed models. Bioinformatics 19(12):1572-1574.
5. Stapleton JT, Foung S, Muerhoff AS, Bukh J, & Simmonds P (2011) The GB
viruses: a review and proposed classification of GBV-A, GBV-C (HGV), and
GBV-D in genus Pegivirus within the family Flaviviridae. J Gen Virol 92(Pt 2):233-
246.
6. Epstein JH, et al. (2010) Identification of GBV-D, a novel GB-like flavivirus from
old world frugivorous bats (Pteropus giganteus) in Bangladesh. PLoS Pathog
6:e1000972.
7. Petersen TN, Brunak S, von Heijne G, & Nielsen H (2011) SignalP 4.0:
discriminating signal peptides from transmembrane regions. Nat Methods
8(10):785-786.
8. Lin C, Lindenbach BD, Pragai BM, McCourt DW, & Rice CM (1994) Processing
in the hepatitis C virus E2-NS2 region: identification of p7 and two distinct E2-
specific products with different C termini. J Virol 68(8):5063-5073.
9. Ghibaudo D, Cohen L, Penin F, & Martin A (2004) Characterization of GB virus B
polyprotein processing reveals the existence of a novel 13-kDa protein with
partial homology to hepatitis C virus p7 protein. J Biol Chem 279(24):24965-
24975.
10. Branch AD, Stump DD, Gutierrez JA, Eng F, & Walewski JL (2005) The hepatitis
C virus alternate reading frame (ARF) and its family of novel products: the
alternate reading frame protein/F-protein, the double-frameshift protein, and
others. Semin Liver Dis 25(1):105-117.
11. Vassilaki N & Mavromara P (2009) The HCV ARFP/F/core+1 protein: production
and functional analysis of an unconventional viral product. IUBMB Life 61(7):739-
752.
12. Krey T, et al. (2010) The disulfide bonds in glycoprotein E2 of hepatitis C virus
reveal the tertiary organization of the molecule. PLoS Pathog 6(2):e1000762.
13. Whidby J, et al. (2009) Blocking hepatitis C virus infection with recombinant form
of envelope protein 2 ectodomain. J Virol 83(21):11078-11089.
