An introduction to

biological databases
What is a database?
A collection of...
structured
searchable (index) - table of contents
updated periodically (release) - new edition
cross-referenced (hyperlinks) - links with other db
...data
Includes also associated tools (software)
necessary for db access, db updating, db
information insertion, db information
deletion...
Databases: a simple example
Accession number: 1
First Name: Amos
Last Name: Bairoch
Course: DEA=oct-nov-dec 2000
http://expasy4.expasy.ch/people/amos.html
//
Accession number: 2
First Name: Laurent
Last Name: Falquet
Course: EMBnet=sept 2000;DEA=oct-nov-dec 2000;
//
Accession number: 3
First Name: Marie-Claude
Last Name: Blatter Garin
Course: EMBnet=sept 2000;DEA=oct-nov-dec 2000;
http://expasy4.expasy.ch/people/Marie-Claude.Blatter-Garin.html
//
Easy to manage: all the entries are visible at the same time!
Introduction To Database Teacher Database (ITDTdb)
(flat file, 3 entries)
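
A flat file like the one above can be read with a few lines of code. The following is a minimal sketch, assuming each field sits on one "Key: value" line and each entry ends with "//"; the function name parse_flat_file is hypothetical, and the URL lines of the example are left out of the sample because they do not follow the "Key: value" layout.

```python
FLAT_FILE = """\
Accession number: 1
First Name: Amos
Last Name: Bairoch
Course: DEA=oct-nov-dec 2000
//
Accession number: 2
First Name: Laurent
Last Name: Falquet
Course: EMBnet=sept 2000;DEA=oct-nov-dec 2000;
//
"""


def parse_flat_file(text):
    """Split a '//'-terminated flat file into a list of field dictionaries."""
    entries, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line == "//":              # end-of-entry marker
            if current:
                entries.append(current)
            current = {}
        elif ":" in line:
            key, value = line.split(":", 1)
            # keys are lower-cased so "Last Name" and "Last name" are equivalent
            current[key.strip().lower()] = value.strip()
    return entries


entries = parse_flat_file(FLAT_FILE)
# Build an index on the accession number (the "table of contents")
by_accession = {entry["accession number"]: entry for entry in entries}
print(by_accession["2"]["last name"])
```

Every query requires reading the whole file first, which is acceptable for 3 entries but is one reason larger collections need an index.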
Databases: a simple example (cont.)
Teacher     Accession number   Education
Amos        1                  Biochemistry
Laurent     2                  Biochemistry
M-Claude    3                  Biochemistry

Course      Date               Involved teachers
DEA         Oct-nov-dec 2000   1,3
EMBnet      Sept 2000          2,3
Relational database ("table file"):
Easier to manage, choice of the output
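
The two tables above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table and column names are hypothetical, and the extra "involvement" table is an assumed way of storing the "Involved teachers" column as one row per course/teacher pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (accession INTEGER PRIMARY KEY, name TEXT, education TEXT);
CREATE TABLE course (name TEXT PRIMARY KEY, date TEXT);
CREATE TABLE involvement (course TEXT REFERENCES course(name),
                          teacher INTEGER REFERENCES teacher(accession));
""")
conn.executemany("INSERT INTO teacher VALUES (?, ?, ?)", [
    (1, "Amos", "Biochemistry"),
    (2, "Laurent", "Biochemistry"),
    (3, "M-Claude", "Biochemistry"),
])
conn.executemany("INSERT INTO course VALUES (?, ?)", [
    ("DEA", "Oct-nov-dec 2000"),
    ("EMBnet", "Sept 2000"),
])
# One row per (course, teacher) pair, as listed in the "Involved teachers" column
conn.executemany("INSERT INTO involvement VALUES (?, ?)", [
    ("DEA", 1), ("DEA", 3), ("EMBnet", 2), ("EMBnet", 3),
])


def teachers_for(course):
    """Return the names of the teachers involved in a course."""
    rows = conn.execute(
        "SELECT t.name FROM teacher t "
        "JOIN involvement i ON i.teacher = t.accession "
        "WHERE i.course = ? ORDER BY t.accession",
        (course,),
    )
    return [name for (name,) in rows]


print(teachers_for("EMBnet"))
```

The "choice of the output" is the SELECT statement: the same tables can answer "which teachers for this course" or "which courses for this teacher" without reorganising the data.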
Why biological databases?
Explosive growth in biological data
Data (sequences, 3D structures, 2D gel
analysis, MS analysis...) are no longer
published in a conventional manner, but
directly submitted to databases
Essential tools for biological research, as
classical publications used to be!
Some databases in the field of molecular biology...
AATDB, AceDb, ACUTS, ADB, AFDB, AGIS, AMSdb,
ARR, AsDb, BBDB, BCGD, Beanref, BioImage,
BioMagResBank, BIOMDB, BLOCKS, BovGBASE,
BOVMAP, BSORF, BTKbase, CANSITE, CarbBank,
CARBHYD, CATH, CAZY, CCDC, CD40Lbase, CGAP,
ChickGBASE, Colibri, COPE, CottonDB, CSNDB, CUTG,
CyanoBase, dbCFC, dbEST, dbSTS, DDBJ, DGP, DictyDb,
Dicty_cDB, DIP, DOGS, DOMO, DPD, DPInteract, ECDC,
ECGC, ECO2DBASE, EcoCyc, EcoGene, EMBL, EMD db,
ENZYME, EPD, EpoDB, ESTHER, FlyBase, FlyView,
GCRDB, GDB, GENATLAS, Genbank, GeneCards,
Genline, GenLink, GENOTK, GenProtEC, GIFTS,
GPCRDB, GRAP, GRBase, gRNAsdb, GRR, GSDB,
HAEMB, HAMSTERS, HEART-2DPAGE, HEXAdb, HGMD,
HIDB, HIDC, HIVdb, HotMolecBase, HOVERGEN, HPDB,
HSC-2DPAGE, ICN, ICTVDB, IL2RGbase, IMGT, Kabat,
KDNA, KEGG, Klotho, LGIC, MAD, MaizeDb, MDB,
Medline, Mendel, MEROPS, MGDB, MGI, MHCPEP,
Micado, MitoDat, MITOMAP, MJDB, MmtDB, Mol-R-Us,
MPDB, MRR, MutBase, MycDB, NDB, NRSub, O-GlycBase,
OMIA, OMIM, OPD, ORDB, OWL, PAHdb, PatBase, PDB,
PDD, Pfam, PhosphoBase, PigBASE, PIR, PKR, PMD,
PPDB, PRESAGE, PRINTS, ProDom, Prolysis, PROSITE,
PROTOMAP, RatMAP, RDP, REBASE, RGP, SBASE,
SCOP, SeqAnalRef, SGD, SGP, SheepMap, Soybase,
SPAD, SRNA db, SRPDB, STACK, StyGene, Sub2D,
SubtiList, SWISS-2DPAGE, SWISS-3DIMAGE,
SWISS-MODEL Repository, SWISS-PROT, TelDB, TGN, tmRDB,
TOPS, TRANSFAC, TRR, UniGene, URNADB, V BASE,
VDRR, VectorDB, WDCM, WIT, WormPep, YEPD, YPD,
YPM, etc .................. !!!!
Biological databases
Some statistics
More than 1000 different databases
Generally accessible through the web
(useful link: www.expasy.ch/alinks.html)
Variable size: 100 Kb to 10 Gb
DNA: 10 Gb
Protein: 1 Gb
3D structure: 5 Gb
Other: smaller
Update frequency: daily to annually
Categories of databases for Life Sciences
Sequences (DNA, protein) - Primary db
Genomics
Protein domain/family - Secondary db
Mutation/polymorphism
Proteomics (2D gel, MS)
3D structure - Structure db
Metabolism
Bibliography
Others
Distribution of sequence databases
Books, articles     1968 - 1985
Computer tapes      1982 - 1992
Floppy disks        1984 - 1990
CD-ROM              1989 - ?
FTP                 1989 - ?
On-line services    1982 - 1994
WWW                 1993 - ?
DVD                 2001 - ?
Sequence Databases: some technical definitions
Data storage management:
flat file: text file
relational (e.g., Oracle)
object oriented (rare in biological field)
Format (flat file):
fasta
GCG
NBRF/PIR
MSF...
standardized format?
Federated databases: different autonomous, redundant,
heterogeneous db linked together by links/hyperlinks.
Ideal minimal content of a sequence db
Sequences !!
Accession number (AC)
References
Taxonomic data
ANNOTATION/CURATION
Keywords
Cross-references
Documentation
Sequence database: example
ID EPO_HUMAN STANDARD; PRT; 193 AA.
AC P01588;
DT 21-JUL-1986 (Rel. 01, Created)
DT 21-JUL-1986 (Rel. 01, Last sequence update)
DT 30-MAY-2000 (Rel. 39, Last annotation update)
DE Erythropoietin precursor.
GN EPO.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE; 85137899.
RA Jacobs K., Shoemaker C., Rudersdorf R., Neill S.D., Kaufman R.J.,
RA Mufson A., Seehra J., Jones S.S., Hewick R., Fritsch E.F.,
RA Kawakita M., Shimizu T., Miyake T.;
RT "Isolation and characterization of genomic and cDNA clones of human
RT erythropoietin.";
RL Nature 313:806-810(1985).
...
CC -!- FUNCTION: ERYTHROPOIETIN IS THE PRINCIPAL HORMONE INVOLVED IN THE
CC REGULATION OF ERYTHROCYTE DIFFERENTIATION AND THE MAINTENANCE OF A
CC PHYSIOLOGICAL LEVEL OF CIRCULATING ERYTHROCYTE MASS.
CC -!- SUBCELLULAR LOCATION: SECRETED.
CC -!- TISSUE SPECIFICITY: PRODUCED BY KIDNEY OR LIVER OF ADULT MAMMALS
CC AND BY LIVER OF FETAL OR NEONATAL MAMMALS.
CC -!- PHARMACEUTICAL: Available under the names Epogen (Amgen) and
CC Procrit (Ortho Biotech).
CC -!- DATABASE: NAME=R&D Systems' cytokine source book;
CC WWW="http://www.rndsystems.com/cyt_cat/epo.html".
DR EMBL; X02158; CAA26095.1; -.
DR EMBL; X02157; CAA26094.1; -.
DR EMBL; M11319; AAA52400.1; -.
DR EMBL; AF053356; AAC78791.1; -.
DR EMBL; AF202308; AAF23132.1; -.
DR EMBL; AF202306; AAF23132.1; JOINED.
...
KW Erythrocyte maturation; Glycoprotein; Hormone; Signal; Pharmaceutical.
FT SIGNAL 1 27
FT CHAIN 28 193 ERYTHROPOIETIN.
FT PROPEP 190 193 MAY BE REMOVED IN PROCESSED PROTEIN.
FT DISULFID 34 188
...
SWISS-PROT
Flat file
reference
taxonomy
annotations
keywords
Cross-references
Sequence database: example (cont.)
FT DISULFID 34 188
FT DISULFID 56 60
FT CARBOHYD 51 51 N-LINKED (GLCNAC...).
FT CARBOHYD 65 65 N-LINKED (GLCNAC...).
FT CARBOHYD 110 110 N-LINKED (GLCNAC...).
FT CARBOHYD 153 153
FT CONFLICT 40 40 E -> Q (IN CAA26095).
FT CONFLICT 85 85 Q -> QQ (IN REF. 5).
FT CONFLICT 140 140 G -> R (IN CAA26095).
Chromosomal location: 7q22
SQ SEQUENCE 193 AA; 21306 MW; C91F0E4C26A52033 CRC64;
MGVHECPAWL WLLLSLLSLP LGLPVLGAPP RLICDSRVLE RYLLEAKEAE NITTGCAEHC
SLNENITVPD TKVNFYAWKR MEVGQQAVEV WQGLALLSEA VLRGQALLVN SSQPWEPLQL
HVDKAVSGLR SLTTLLRALG AQKEAISPPD AASAAPLRTI TADTFRKLFR VYSNFLRGKL
KLYTGEACRT GDR
//
sequence
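
Because every line of the entry starts with a two-letter line code (ID, AC, DE, DR, KW, FT...), the flat file can be split into fields mechanically. The sketch below is an illustration under that single assumption; the function name group_by_line_code is hypothetical and only a shortened copy of the EPO_HUMAN entry is used.

```python
from collections import defaultdict

ENTRY = """\
ID   EPO_HUMAN STANDARD; PRT; 193 AA.
AC   P01588;
DE   Erythropoietin precursor.
DR   EMBL; X02158; CAA26095.1; -.
DR   EMBL; X02157; CAA26094.1; -.
KW   Erythrocyte maturation; Glycoprotein; Hormone; Signal; Pharmaceutical.
//
"""


def group_by_line_code(entry):
    """Collect the content of each line under its two-letter line code."""
    fields = defaultdict(list)
    for line in entry.splitlines():
        if line.startswith("//"):      # end of entry
            break
        code, _, rest = line.partition(" ")
        fields[code].append(rest.strip())
    return fields


fields = group_by_line_code(ENTRY)
accession = fields["AC"][0].rstrip(";")
# Second ';'-separated item of an EMBL cross-reference is the EMBL accession
embl_xrefs = [dr.split(";")[1].strip() for dr in fields["DR"] if dr.startswith("EMBL")]
keywords = [k.strip() for k in fields["KW"][0].rstrip(".").split(";")]
print(accession, embl_xrefs, keywords)
```

The cross-references recovered this way (X02158, X02157) are the accession numbers of the corresponding nucleotide entries.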
Sequence database: example
...a SWISS-PROT entry, in fasta format:
>sp|P01588|EPO_HUMAN ERYTHROPOIETIN PRECURSOR - Homo sapiens (Human).
MGVHECPAWLWLLLSLLSLPLGLPVLGAPPRLICDSRVLERYLLEAKEAE
NITTGCAEHCSLNENITVPDTKVNFYAWKRMEVGQQAVEVWQGLALLSEA
VLRGQALLVNSSQPWEPLQLHVDKAVSGLRSLTTLLRALGAQKEAISPPD
AASAAPLRTITADTFRKLFRVYSNFLRGKLKLYTGEACRTGDR
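
The fasta format keeps only a one-line header (starting with ">") followed by the sequence, so it is simple to read. A minimal sketch, with a hypothetical function name read_fasta, applied to the entry above:

```python
FASTA = """\
>sp|P01588|EPO_HUMAN ERYTHROPOIETIN PRECURSOR - Homo sapiens (Human).
MGVHECPAWLWLLLSLLSLPLGLPVLGAPPRLICDSRVLERYLLEAKEAE
NITTGCAEHCSLNENITVPDTKVNFYAWKRMEVGQQAVEVWQGLALLSEA
VLRGQALLVNSSQPWEPLQLHVDKAVSGLRSLTTLLRALGAQKEAISPPD
AASAAPLRTITADTFRKLFRVYSNFLRGKLKLYTGEACRTGDR
"""


def read_fasta(text):
    """Return a dict mapping each header line (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


records = read_fasta(FASTA)
header, sequence = next(iter(records.items()))
accession = header.split("|")[1]      # second '|'-separated field of the header
print(accession, len(sequence))
```

The length obtained from the sequence lines agrees with the "193 AA" stated on the ID line of the full entry; all annotation is lost in this format.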
Databases 1: nucleotide sequence
The main DNA sequence db are
EMBL (Europe)/GenBank (USA)/DDBJ (Japan)
There are also specialized databases for the different
types of RNAs (i.e. tRNA, rRNA, tmRNA, uRNA, etc...)
3D structure (DNA and RNA)
Others: Aberrant splicing db, Eucaryotic promoter db
(EPD), RNA editing sites, Multimedia Telomere
Resource ...
EMBL/GenBank/DDBJ
These 3 db contain mainly the same information
within 2-3 days (few differences in the format
and syntax)
Serve as archives containing all sequences (single
genes, ESTs, complete genomes, etc.) derived
from:
Genome projects and sequencing centers
Individual scientists
Patent offices (i.e. European Patent Office, EPO)
Non-confidential data are exchanged daily
Currently: 8.3 x 10^6 sequences, over 9.7 x 10^9 bp
Sequences from 50'000 different species
EMBL/GenBank/DDBJ
Heterogeneous sequence length: genomes,
variants, fragments...
Sequence sizes:
max 300'000 bp/entry (! genomic sequences, overlapping)
min 10 bp/entry
Archive: nothing goes out - highly redundant!
full of errors: in sequences, in annotations, in CDS
attribution...
no consistency of annotations; most annotations
are done by the submitters; heterogeneity of the
quality, completeness and updating of the
information
EMBL/GenBank/DDBJ
Unexpected information you can find in these db:
FT source 1..124
FT /db_xref="taxon:4097"
FT /organelle="plastid:chloroplast"
FT /organism="Nicotiana tabacum"
FT /isolate="Cuban cahibo cigar, gift from President Fidel
FT Castro"
Or:
FT source 1..17084
FT /chromosome="complete mitochondrial genome"
FT /db_xref="taxon:9267"
FT /organelle="mitochondrion"
FT /organism="Didelphis virginiana"
FT /dev_stage="adult"
FT /isolate="fresh road killed individual"
FT /tissue_type="liver"
EMBL entry: example
ID HSERPG standard; DNA; HUM; 3398 BP.
XX
AC X02158;
XX
SV X02158.1
XX
DT 13-JUN-1985 (Rel. 06, Created)
DT 22-JUN-1993 (Rel. 36, Last updated, Version 2)
XX
DE Human gene for erythropoietin
XX
KW erythropoietin; glycoprotein hormone; hormone; signal peptide.
XX
OS Homo sapiens (human)
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN [1]
RP 1-3398
RX MEDLINE; 85137899.
RA Jacobs K., Shoemaker C., Rudersdorf R., Neill S.D., Kaufman R.J.,
RA Mufson A., Seehra J., Jones S.S., Hewick R., Fritsch E.F., Kawakita M.,
RA Shimizu T., Miyake T.;
RT Isolation and characterization of genomic and cDNA clones of human
RT erythropoietin;
RL Nature 313:806-810(1985).
XX
DR GDB; 119110; EPO.
DR GDB; 119615; TIMP1.
DR SWISS-PROT; P01588; EPO_HUMAN.
XX
...
taxonomy
Cross-references
references
keyword
EMBL entry (cont.)
CC Data kindly reviewed (24-FEB-1986) by K. Jacobs
FH Key Location/Qualifiers
FH
FT source 1..3398
FT /db_xref=taxon:9606
FT /organism=Homo sapiens
FT mRNA join(397..627,1194..1339,1596..1682,2294..2473,2608..3327)
FT CDS join(615..627,1194..1339,1596..1682,2294..2473,2608..2763)
FT /db_xref=SWISS-PROT:P01588
FT /product=erythropoietin
FT /protein_id=CAA26095.1
FT /translation=MGVHECPAWLWLLLSLLSLPLGLPVLGAPPRLICDSRVLQRYLLE
FT AKEAENITTGCAEHCSLNENITVPDTKVNFYAWKRMEVGQQAVEVWQGLALLSEAVLRG
FT QALLVNSSQPWEPLQLHVDKAVSGLRSLTTLLRALGAQKEAISPPDAASAAPLRTITAD
FT TFRKLFRVYSNFLRGKLKLYTGEACRTGDR
FT mat_peptide join(1262..1339,1596..1682,2294..2473,2608..2763)
FT /product=erythropoietin
FT sig_peptide join(615..627,1194..1261)
FT exon 397..627
FT /number=1
FT intron 628..1193
FT /number=1
FT exon 1194..1339
FT /number=2
FT intron 1340..1595
FT /number=2
FT exon 1596..1682
FT /number=3
FT intron 1683..2293
FT /number=3
FT exon 2294..2473
FT /number=4
FT intron 2474..2607
FT /number=4
FT exon 2608..3327
FT /note=3' untranslated region
FT /number=5
XX
SQ Sequence 3398 BP; 698 A; 1034 C; 991 G; 675 T; 0 other;
agcttctggg cttccagacc cagctacttt gcggaactca gcaacccagg catctctgag 60
tctccgccca agaccgggat gccccccagg aggtgtccgg gagcccagcc tttcccagat 120
annotation
sequence
GenBank entry: example
LOCUS HSERPG 3398 bp DNA PRI 22-JUN-1993
DEFINITION Human gene for erythropoietin.
ACCESSION X02158
VERSION X02158.1 GI:31224
KEYWORDS erythropoietin; glycoprotein hormone; hormone; signal peptide.
SOURCE human.
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Vertebrata; Mammalia; Eutheria;
Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 3398)
AUTHORS Jacobs,K., Shoemaker,C., Rudersdorf,R., Neill,S.D., Kaufman,R.J.,
Mufson,A., Seehra,J., Jones,S.S., Hewick,R., Fritsch,E.F.,
Kawakita,M., Shimizu,T. and Miyake,T.
TITLE Isolation and characterization of genomic and cDNA clones of human
erythropoietin
JOURNAL Nature 313 (6005), 806-810 (1985)
MEDLINE 85137899
COMMENT Data kindly reviewed (24-FEB-1986) by K. Jacobs.
FEATURES Location/Qualifiers
source 1..3398
/organism="Homo sapiens"
/db_xref="taxon:9606"
mRNA join(397..627,1194..1339,1596..1682,2294..2473,2608..3327)
exon 397..627
/number=1
sig_peptide join(615..627,1194..1261)
CDS join(615..627,1194..1339,1596..1682,2294..2473,2608..2763)
/codon_start=1
/product="erythropoietin"
/protein_id="CAA26095.1"
/db_xref="GI:312304"
/db_xref="SWISS-PROT:P01588"
/translation="MGVHECPAWLWLLLSLLSLPLGLPVLGAPPRLICDSRVLQRYLL
EAKEAENITTGCAEHCSLNENITVPDTKVNFYAWKRMEVGQQAVEVWQGLALLSEAVL
RGQALLVNSSQPWEPLQLHVDKAVSGLRSLTTLLRALGAQKEAISPPDAASAAPLRTI
...
GenBank entry (cont.)
TADTFRKLFRVYSNFLRGKLKLYTGEACRTGDR"
intron 628..1193
/number=1
exon 1194..1339
/number=2
mat_peptide join(1262..1339,1596..1682,2294..2473,2608..2760)
/product="erythropoietin"
intron 1340..1595
/number=2
exon 1596..1682
/number=3
intron 1683..2293
/number=3
exon 2294..2473
/number=4
intron 2474..2607
/number=4
exon 2608..3327
/note="3' untranslated region"
/number=5
BASE COUNT 698 a 1034 c 991 g 675 t
ORIGIN
1 agcttctggg cttccagacc cagctacttt gcggaactca gcaacccagg catctctgag
61 tctccgccca agaccgggat gccccccagg aggtgtccgg gagcccagcc tttcccagat
121 agcagctccg ccagtcccaa gggtgcgcaa ccggctgcac tcccctcccg cgacccaggg
181 cccgggagca gcccccatga cccacacgca cgtctgcagc agccccgtca gccccggagc
241 ctcaacccag gcgtcctgcc cctgctctga ccccgggtgg cccctacccc tggcgacccc
DDBJ entry: example
LOCUS HSERPG 3398 bp DNA HUM 22-JUN-1993
DEFINITION Human gene for erythropoietin.
ACCESSION X02158
VERSION X02158.1
KEYWORDS erythropoietin; glycoprotein hormone; hormone; signal peptide.
SOURCE human.
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 3398)
AUTHORS Jacobs,K., Shoemaker,C., Rudersdorf,R., Neill,S.D., Kaufman,R.J.,
Mufson,A., Seehra,J., Jones,S.S., Hewick,R., Fritsch,E.F.,
Kawakita,M., Shimizu,T. and Miyake,T.
TITLE Isolation and characterization of genomic and cDNA clones of human
erythropoietin
JOURNAL Nature 313, 806-810(1985)
MEDLINE 85137899
COMMENT Data kindly reviewed (24-FEB-1986) by K. Jacobs
FEATURES Location/Qualifiers
source 1..3398
/db_xref="taxon:9606"
/organism="Homo sapiens"
mRNA join(397..627,1194..1339,1596..1682,2294..2473,2608..3327)
CDS join(615..627,1194..1339,1596..1682,2294..2473,2608..2763)
/db_xref="SWISS-PROT:P01588"
/product="erythropoietin"
/protein_id="CAA26095.1"
/translation="MGVHECPAWLWLLLSLLSLPLGLPVLGAPPRLICDSRVLQRYLLE
AKEAENITTGCAEHCSLNENITVPDTKVNFYAWKRMEVGQQAVEVWQGLALLSEAVLRG
QALLVNSSQPWEPLQLHVDKAVSGLRSLTTLLRALGAQKEAISPPDAASAAPLRTITAD
TFRKLFRVYSNFLRGKLKLYTGEACRTGDR
...
DDBJ entry (cont.)
mat_peptide join(1262..1339,1596..1682,2294..2473,2608..2763)
/product="erythropoietin"
sig_peptide join(615..627,1194..1261)
exon 397..627
/number=1
intron 628..1193
/number=1
exon 1194..1339
/number=2
intron 1340..1595
/number=2
exon 1596..1682
/number=3
intron 1683..2293
/number=3
exon 2294..2473
/number=4
intron 2474..2607
/number=4
exon 2608..3327
/note="3' untranslated region"
/number=5
BASE COUNT 698 a 1034 c 991 g 675 t
ORIGIN
1 agcttctggg cttccagacc cagctacttt gcggaactca gcaacccagg catctctgag
61 tctccgccca agaccgggat gccccccagg aggtgtccgg gagcccagcc tttcccagat
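
The EMBL, GenBank and DDBJ entries above all describe the mRNA and CDS features with the same join(...) locations. As a small worked example, the sketch below sums the exon segments of those locations; the function names are hypothetical, and only plain "start..end" segments are handled (no complement(), no partial "<" or ">" positions).

```python
import re


def parse_join(location):
    """Return (start, end) pairs from a 'join(a..b,c..d)' or 'a..b' location."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]


def spliced_length(location):
    """Total number of bases covered by all segments (positions are inclusive)."""
    return sum(end - start + 1 for start, end in parse_join(location))


CDS = "join(615..627,1194..1339,1596..1682,2294..2473,2608..2763)"
MRNA = "join(397..627,1194..1339,1596..1682,2294..2473,2608..3327)"

cds_bases = spliced_length(CDS)
codons = cds_bases // 3
print(cds_bases, codons, spliced_length(MRNA))
```

The CDS covers 582 bases, i.e. 194 codons: 193 amino acids plus the stop codon, which matches the 193 AA of the SWISS-PROT entry P01588 that the CDS cross-references.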
The tremendous increase in nucleotide sequences
EMBL data... first increase in data due to the PCR development...
1980: 80 genes fully sequenced!
EMBL divisions
EMBL has been divided into subdatabases
to allow easier data management and
searches
fun, hum, inv, mam, org, phg, pln, pro, rod, syn,
unc, vrl, vrt
est, gss, htg, sts, patent
RefSeq: a SWISS-PROT clone?
The NCBI Reference Sequence project (RefSeq) will provide
reference sequence standards for the naturally occurring molecules
of the central dogma, from chromosomes to mRNAs to proteins.
RefSeq standards provide a foundation for the functional
annotation of the human genome. They provide a stable reference
point for mutation analysis, gene expression studies, and
polymorphism discovery.
Molecule            Accession Format   Genome
Complete Genome     NC_######          Archaea, Bacterial, Organelle, Virus, Viroid
Complete Chrom.     NC_######          Eukaryote
Complete Sequence   NC_######          Plasmid
Genomic Contig      NT_######          Homo sapiens
mRNA                NM_######          Homo sapiens, Mus musculus, Rattus norvegicus
Protein             NP_######          All of the above
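
The accession format in the table (a two-letter prefix, an underscore and six digits) tells the molecule type at a glance. A minimal sketch of that lookup, assuming exactly the four prefixes listed above; the function name classify_refseq and the accession NM_123456 are made up for illustration.

```python
import re

# Prefix -> molecule type, taken from the table above
REFSEQ_PREFIXES = {
    "NC": "complete genome, chromosome or plasmid sequence",
    "NT": "genomic contig",
    "NM": "mRNA",
    "NP": "protein",
}


def classify_refseq(accession):
    """Return the molecule type for an 'NX_######' accession, else None."""
    match = re.fullmatch(r"(N[CTMP])_(\d{6})", accession)
    return REFSEQ_PREFIXES[match.group(1)] if match else None


print(classify_refseq("NM_123456"))   # hypothetical accession
print(classify_refseq("X02158"))      # an EMBL accession, not RefSeq
```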
RefSeq: a SWISS-PROT clone?
RefSeq records are created via a process consisting of:
identifying sequences that represent distinct genes
establishing the correct gene name-to-accession number association
identifying the full extent of available sequence data
creating a new RefSeq record with a status of:
PREDICTED
PROVISIONAL
REVIEWED
Provisional RefSeq records are reviewed by a biologist who
confirms the initial name-to-sequence association, adds information
including a summary of gene function, and, more importantly,
corrects, re-annotates, or extends the sequence data using data
available in other GenBank records.
Databases 2: genomics
Contain information on genes, gene location
(mapping), gene nomenclature and links to
sequence databases...
Exist for most organisms important for life
science research...
Examples: MIM, GDB (human), MGD (mouse),
FlyBase (Drosophila), SGD (yeast), MaizeDB
(maize), SubtiList (B. subtilis), etc...
Format: generally relational (Oracle, Sybase or
AceDb).
MIM
OMIM: Online Mendelian Inheritance in
Man
a catalog of human genes and genetic
disorders
contains a summary of literature, pictures,
and reference information. It also contains
numerous links to articles and sequence
information.
MIM: example
133170 ERYTHROPOIETIN; EPO
Alternative titles; symbols
EP
TABLE OF CONTENTS
TEXT
REFERENCES
SEE ALSO
CONTRIBUTORS
CREATION DATE
EDIT HISTORY
Database Links
Gene Map Locus: 7q21
Note: pressing the symbol will find the citations in MEDLINE whose text most closely
matches the text of the preceding OMIM paragraph, using the Entrez MEDLINE
neighboring function.
TEXT
Human erythropoietin is an acidic glycoprotein hormone with molecular weight 34,000.
As the prime regulator of red cell production, its major functions are to promote
erythroid differentiation and to initiate hemoglobin synthesis. Sherwood and Shouval
(1986) described a human renal carcinoma cell line that continuously produces
erythropoietin. Eschbach et al. (1987) demonstrated the effectiveness of recombinant
human erythropoietin in treating the anemia of end-stage renal disease. Lee-Huang
(1984) cloned human erythropoietin cDNA in E. coli. McDonald et al. (1986) and
Shoemaker and Mitsock (1986) cloned the mouse gene and the latter workers showed that
coding DNA and amino acid sequence are about 80% conserved between man and mouse.
This is a much higher order of conservation than for various interferons,
interleukin-2, and GM-CSF.
...
Ensembl
Contains all the human genome DNA sequences
currently available in the public domain.
Automated annotation: by using different
software tools, features are identified in the DNA
sequences:
Genes (known or predicted)
Single nucleotide polymorphisms (SNPs)
Repeats
Homologies
Created and maintained by the EBI and the Sanger
Center (UK)
www.ensembl.org
Database 3: protein sequence
SWISS-PROT: created in 1986 (A. Bairoch)
TrEMBL: created in 1996; complement to SWISS-PROT; derived
from automated EMBL CDS translations ("proteomic" version of
EMBL)
GenPept: derived from automated GenBank CDS translations and
journal scans ("proteomic" version of GenBank)
PIR: Protein Information Resources
MIPS: Martinsried Institute for Protein Sequences
PIR + PATCHX (supplement of unverified protein sequences from
external sources)
Database 3: protein sequence
NRL-3D: produced by PIR from PDB (3D structure) sequences
Many specialized protein databases for specific families or groups
of proteins.
Examples: YPD (yeast proteins), AMSDb (antibacterial peptides),
GPCRDB (7 TM receptors), IMGT (immune system) etc.
SWISS-PROT
Collaboration between the SIB (CH) and EMBL/EBI
(UK)
Annotated (manually), non-redundant, cross-referenced,
documented protein sequence database.
88'000 sequences from more than 6'800 different
species; 70'000 references (publications); 550'000
cross-references (databases); ~200 Mb of
annotations.
Weekly releases; available from about 50 servers
across the world, the main source being ExPASy
SWISS-PROT: example
Never changed
SWISS-PROT (cont.)
SWISS-PROT (cont.)
TrEMBL (Translation of EMBL)
Computer-annotated supplement to SWISS-PROT,
as it is impossible to cope with the flow of data...
Well-structured SWISS-PROT-like resource
Derived from automated EMBL CDS translation
(maintained at the EBI (UK))
TrEMBL is automatically generated and annotated
using software tools (not comparable with
SWISS-PROT in terms of quality)
TrEMBL contains everything that is not yet in
SWISS-PROT
Yerk!! But there is no choice and these software
tools are becoming quite good!
The simplified story of a Sprot entry
cDNAs, genomes, ...
EMBLnew  EMBL
TrEMBLnew  TrEMBL
SWISS-PROT
Automatic
Redundancy check (merge)
InterPro (family attribution)
Annotation
Manual
Redundancy (merge, conflicts)
Annotation
Sprot tools (macros...)
Sprot documentation
Medline
Databases (MIM, MGD...)
Brain storming
Once in Sprot, the entry is no more in TrEMBL, but still in EMBL (archive)
CDS
SWISS-PROT introduces a new arithmetical concept!
How many sequences in SWISS-PROT + TrEMBL?
88'000 + 300'000 = about 240'000
SWISS-PROT and TrEMBL (SPTR)
a minimum of redundancy
TrEMBL divisions
TrEMBL: SPTrEMBL + REMTrEMBL
SPTrEMBL: TrEMBL entries that will eventually be
integrated into SWISS-PROT, but that have not yet been
manually annotated
REMTrEMBL: sequences that are not destined to be
included in SWISS-PROT
Immunoglobulins and T-cell receptors
Synthetic sequences
Patented sequences
Small fragments (< 8 aa)
CDS not coding for real proteins
TrEMBLnew: updates to the latest release of TrEMBL
TrEMBL divisions
Subdivisions
Archaea             arc
Fungus              fun
Human               hum
Invertebrate        inv
Mammals             mam
Major Hist. Comp.   mhc
Organelles          org
Phage               phg
Plant               pln
Prokaryote          pro
Rodent              rod
Uncommented         unc
Viral               vrl
Vertebrate          vrt
TrEMBL: example
GenPept (translation of GenBank)
GenPept is a protein database translated from the
last release of GenBank (+ journal scans)
The current release has 484'496 entries
In contrast to TrEMBL, keeps all protein sequences
including small fragments (< 8 aa), immunoglobulins...
Redundancy: 20 entries for human EPO
GenPept: example
LOCUS L33410_1 [HUMMLCMPL]
DEFINITION Human c-mpl ligand (ML) mRNA, complete cds;
erythropoietin homology domain bp 66..522.
DATE 07-JAN-1995
ACCESSION L33410
NID
ORGANISM Homo_SP_sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
COMMENT CDS 216..1277
/gene="ML"
/product="c-mpl ligand"
/protein_id="AAA59857.1"
/db_xref="GI:506827"
WEIGHT 37823
LENGTH 353
ORIGIN
I MLTLLLVV MLLLTAPLTL SSPAPPACDL PVLSILLPDS VLSPLSQC PVPLPTPV
oI LLPAVDFSL0 WITQMTI AQDIL0AVTL LL0VMAAP0 QL0PTCLSSL L0QLS0QVPL
IZI LL0ALQSLL0 TQLPPQ0PTT AIDPMAIFL SFQLLP0IV PFLMLV00ST LCVPPAPPTT
I8I AVPSPTSLVL TLMLPMPTS 0LLTMFTAS APTT0S0LLI WQQ0FPAIIP 0LLMQTSPSL
Z4I DQIP0YLMPI LLM0TP0L FP0PSPPTL0 APDISS0TSD T0SLPPMLQP 0YSPSPTPP
30I T0QYTLFPLP PTLPTPVVQL PLLPDPSAP TPTPTSPLLM TSYTSQMLS Q0
//
PIR
Protein Information Resource; created in 1984
Successor of the National Biomedical Research
Foundation (NBRF) protein sequence database
developed in 1965 by M. O. Dayhoff ("Atlas of
Protein Sequence and Structure")
Maintained by MIPS (Germany) and JIPID (Japan)
Provides some cross-referencing to
EMBL/GenBank/DDBJ and PDB, GDB, FlyBase,
OMIM, SGD, and MGD
In August 2000: 178'050 entries.
Redundancy: 3 entries for human EPO
PIR: example
>P1;ZUHU
erythropoietin precursor - human
C;Species: Homo sapiens (man)
C;Date: 27-Nov-1985 #sequence_revision 27-Nov-1985 #text_change 22-Jun-1999
C;Accession: A01855; A24744; A25384; A22210; S56178
R;Jacobs, K.; Shoemaker, C.; Rudersdorf, R.; Neill, S.D.; Kaufman, R.J.; Mufson, A.; Seehra, J.; Jones, S.S.; Hewick, R.; Fritsch,
E.F.; Kawakita, M.; Shimizu, T.; Miyake, T.
Nature 313, 806-810, 1985
A;Title: Isolation and characterization of genomic and cDNA clones of human erythropoietin.
A;Reference number: A01855; MUID:85137899
A;Accession: A01855
A;Molecule type: mRNA; DNA
A;Residues: 1-193
A;Cross-references: GB:X02157; GB:X02158
R;Lin, F.K.; Suggs, S.; Lin, C.H.; Browne, J.K.; Smalling, R.; Egrie, J.C.; Chen, K.K.; Fox, G.M.; Martin, F.; Stabinsky, Z.;
Badrawi, S.M.; Lai, P.H.; Goldwasser, E.
Proc. Natl. Acad. Sci. U.S.A. 82, 7580-7584, 1985
A;Title: Cloning and expression of the human erythropoietin gene.
A;Reference number: A24744; MUID:86067948
A;Accession: A24744
A;Molecule type: DNA
A;Residues: 1-193
A;Cross-references: GB:M11319; NID:g182197; PIDN:AAA52400.1; PID:g182198
R;Lai, P.H.; Everett, R.; Wang, F.F.; Arakawa, T.; Goldwasser, E.
J. Biol. Chem. 261, 3116-3121, 1986
A;Title: Structural characterization of human erythropoietin.
A;Reference number: A25384; MUID:86140080
A;Accession: A25384
A;Molecule type: protein
A;Residues: 28-86,'Q',87-193
A;Experimental source: urine
A;Note: forms without the carboxyl-terminal residue and the four carboxyl-terminal residues were observed
R;Yanagawa, S.; Hirade, K.; Ohnota, H.; Sasaki, R.; Chiba, H.; Ueda, M.; Goto, M.
J. Biol. Chem. 259, 2707-2710, 1984
A;Title: Isolation of human erythropoietin with monoclonal antibodies.
A;Reference number: A22210; MUID:84135751
PIR (cont.)
A;Accession: A22210
A;Molecule type: protein
A;Residues: 28-29,'X',31-33,'L',35-50,'X',52-53,'D',55,'G',57
R;Matsumoto, S.; Ikura, K.; Ueda, M.; Sasaki, R. Plant Mol. Biol. 27, 1163-1172, 1995
A;Title: Characterization of a human glycoprotein (erythropoietin) produced in cultured tobacco cells.
A;Reference number: S56178; MUID:95284365
A;Accession: S56178
A;Molecule type: protein
A;Residues: 28-33,'X',35-37
C;Comment: Erythropoietin is produced by kidney or liver of adult mammals and by liver of fetal or neonatal mammals.
C;Genetics:
A;Gene: GDB:EPO
A;Cross-references: GDB:119110; OMIM:133170
A;Map position: 7q21.3-7q22.1
A;Introns: 5/1; 53/3; 82/3; 142/3
C;Function:
A;Description: the primary inducer of erythrocyte formation
C;Superfamily: erythropoietin
C;Keywords: erythropoiesis; glycoprotein; hormone; kidney; liver
F;1-27/Domain: signal sequence #status predicted
F;28-193/Product: erythropoietin #status experimental
F;34-188,56-60/Disulfide bonds: #status experimental
F;51,65,110/Binding site: carbohydrate (Asn) (covalent) #status experimental
F;153/Binding site: carbohydrate (Ser) (covalent) #status experimental
>P1;ZUHU
MGVHECPAWL WLLLSLLSLP LGLPVLGAPP RLICDSRVLE RYLLEAKEAE NITTGCAEHC
SLNENITVPD TKVNFYAWKR MEVGQQAVEV WQGLALLSEA VLRGQALLVN SSQPWEPLQL
HVDKAVSGLR SLTTLLRALG AQKEAISPPD AASAAPLRTI TADTFRKLFR VYSNFLRGKL
KLYTGEACRT GDR
Composite protein sequence db
NRDB             OWL          MIPSX                  SPTrEMBL *
PDB              SWISS-PROT   PIR                    SWISS-PROT
SWISS-PROT       PIR          MIPS                   SPTrEMBL
PIR              GenBank      NRL-3D                 TrEMBLnew
GenPept          NRL-3D       SWISS-PROT
SP update                     EMBL translation
GenPept update                GenBank translation
                              Kabat (immuno)
                              PseqIP
Different composite db use different primary sources and
different redundancy criteria in their amalgamation procedures
Redundancy priority criteria
* Also called SWall at EBI
SWIP: SPTrEMBL + Wormpep
Composite: protein family
The proteins/genes are classified by
superfamily/family according to Blast/Fasta
(homology) results
General:
ProtFam: PIR
ProtoMap: SWISS-PROT
SYSTERS: SWISS-PROT and PIR (non redundant)
ProClass: PIR and PROSITE
Species specific:
HOVERGEN: vertebrates
HOBACGEN: bacteria
COG: complete organism genome
ProtoMap: example
ProtoMap (cont.)
Database 4: protein domain/family
Contains biologically significant patterns /
profiles / HMMs formulated in such a way
that, with appropriate computational tools, it
can rapidly and reliably be determined to which
known family of proteins (if any) a new
sequence belongs
- tools to identify the function of
uncharacterized proteins translated from
genomic or cDNA sequences ("functional
diagnostic")
Protein domain/family
Most proteins have a "modular" structure
Estimation: ~3 domains / protein
Domains (conserved sequences or structures) are
identified by multiple sequence alignments
Domains can be defined by different methods:
Pattern (regular expression); used for very conserved domains
Profiles (weighted matrices): two-dimensional tables of position specific
match-, gap-, and insertion-scores, derived from aligned sequence
families; used for less conserved domains
Hidden Markov Model (HMM); probabilistic models; another method to
generate profiles.
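
To make the idea of a "weighted matrix" concrete, here is a deliberately simplified sketch: it builds position-specific match scores (log-odds against a uniform background, with a pseudocount) from a toy alignment and slides them along a sequence. It is an illustration only: real profiles also carry gap and insertion scores, which are left out here, and the alignment, function names and scoring scheme are assumptions made for the example.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """One dict of log-odds match scores per alignment column."""
    background = 1.0 / len(ALPHABET)
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return profile


def best_window(profile, sequence):
    """Return (score, start) of the best ungapped window in the sequence."""
    width = len(profile)
    scored = [
        (sum(col[aa] for col, aa in zip(profile, sequence[i:i + width])), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Toy alignment of three hypothetical family members (equal length, no gaps)
alignment = ["CDSRVL", "CDARIL", "CDSRLL"]
profile = build_profile(alignment)
score, start = best_window(profile, "PPRLICDSRVLERY")
print(start, round(score, 2))
```

Unlike a pattern, the matrix tolerates a mismatch at a variable position: the residue simply contributes a lower score instead of breaking the match.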
Some statistics
15 most common protein domains for H. sapiens (Incomplete)
Immunoglobulin and major histocompatibility complex domain
Eukaryotic protein kinase
Zinc finger, C2H2 type
Rhodopsin-like GPCR superfamily
Src homology 3 (SH3) domain
RNA-binding region RNP-1 (RNA recognition motif)
Fibronectin type III domain
Pleckstrin homology (PH) domain
Homeobox domain
Major histocompatibility complex protein, Class I
EF-hand family
EGF-like domain
RING finger
Cadherin domain
PDZ domain (also known as DHR or GLGF)
Serine proteases, trypsin family
http://www.ebi.ac.uk/proteome/HUMAN/interpro/top15d.html
Protein domain/family db
Secondary databases are the fruit of analyses of
the sequences found in the primary db
Either manually curated (e.g. PROSITE, Pfam,
etc.) or automatically generated (e.g. ProDom,
DOMO)
Some depend on the method used to detect if a
protein belongs to a particular domain/family
(patterns, profiles, HMM)
Protein domain/family db
Secondary db   Primary source        Information
PROSITE        SWISS-PROT            Patterns (Regular expression)
PROSITE        SWISS-PROT            Profiles (Weighted matrices)
PRINTS         OWL and SWISS-PROT    Aligned motifs (Fingerprints)
Pfam           SWISS-PROT            HMM (Hidden Markov Models)
BLOCKS         PROSITE/PRINTS        Aligned motifs
IDENTIFY       BLOCKS/PRINTS         Fuzzy regular expressions
Prosite
Created in 1988 (SIB)
Contains functional domains fully annotated, based
on two methods: patterns and profiles
Entries are deposited in PROSITE in two distinct
files:
Pattern/profiles with the lists of all matches in the
parent version of SWISS-PROT
Documentation
Aug 2000: contains 1064 documentation entries that
describe 1424 different patterns, rules and
profiles/matrices.
Prosite (pattern): example
ID EPO_TPO; PATTERN.
AC PS00817;
DT OCT-1993 (CREATED); NOV-1995 (DATA UPDATE); JUL-1998 (INFO UPDATE).
DE Erythropoietin / thrombopoietin signature.
PA P-x(4)-C-D-x-R-[LIVM](2)-x-[KR]-x(14)-C.
NR /RELEASE=38,80000;
NR /TOTAL=14(14); /POSITIVE=14(14); /UNKNOWN=0(0); /FALSE_POS=0(0);
NR /FALSE_NEG=0; /PARTIAL=1;
CC /TAXO-RANGE=??E??; /MAX-REPEAT=1;
CC /SITE=3,disulfide; /SITE=11,disulfide;
DR P48617, EPO_BOVIN , T; P33707, EPO_CANFA , T; P33708, EPO_FELCA , T;
DR P01588, EPO_HUMAN , T; P07865, EPO_MACFA , T; Q28513, EPO_MACMU , T;
DR P07321, EPO_MOUSE , T; P49157, EPO_PIG , T; P29676, EPO_RAT , T;
DR P33709, EPO_SHEEP , T; P42705, TPO_CANFA , T; P40225, TPO_HUMAN , T;
DR P40226, TPO_MOUSE , T; P49745, TPO_RAT , T;
DR P42706, TPO_PIG , P;
DO PDOC00644;
//
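
A PROSITE pattern is close to a regular expression, so it can be translated and applied to a sequence directly. The sketch below is a minimal translator, assuming only the syntax elements x, [..], {..}, (n) and (n,m); anchors such as "<" and ">" are not handled and the function name prosite_to_regex is hypothetical. It is applied to the first 60 residues of EPO_HUMAN (P01588) shown earlier.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern (PA line) into a Python regular expression."""
    regex = []
    for element in pattern.rstrip(".").split("-"):
        # element = residue spec, optionally followed by (n) or (n,m)
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":                    # any residue
            core = "."
        elif core.startswith("{"):         # {AB} = any residue except A or B
            core = "[^" + core[1:-1] + "]"
        if low:
            core += "{" + low + ("," + high if high else "") + "}"
        regex.append(core)
    return "".join(regex)


PATTERN = "P-x(4)-C-D-x-R-[LIVM](2)-x-[KR]-x(14)-C."
EPO_N_TERM = "MGVHECPAWLWLLLSLLSLPLGLPVLGAPPRLICDSRVLERYLLEAKEAENITTGCAEHC"

regex = prosite_to_regex(PATTERN)
match = re.search(regex, EPO_N_TERM)
first, last = match.start() + 1, match.end()   # 1-based, inclusive positions
print(regex, first, last)
```

The match runs from residue 29 to residue 56; the two cysteines flagged as /SITE=3 and /SITE=11 fall on positions 34 and 56, the same residues that carry the DISULFID features in the SWISS-PROT entry.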
Diagnostic
performance
List of
matches
Prosite (profile): example
PROSITE: PS50097
ID BTB; MATRIX.
AC PS50097;
DT DEC-1999 (CREATED); DEC-1999 (DATA UPDATE); DEC-1999 (INFO UPDATE).
DE BTB domain profile.
MA /GENERAL_SPEC: ALPHABET='ABCDEFGHIKLMNPQRSTVWYZ'; LENGTH=67;
MA /DISJOINT: DEFINITION=PROTECT; N1=6; N2=62;
MA /NORMALIZATION: MODE=1; FUNCTION=LINEAR; R1=.9751; R2=.02068202; TEXT='-LogE';
MA /CUT_OFF: LEVEL=0; SCORE=363; N_SCORE=8.5; MODE=1; TEXT='!';
MA /CUT_OFF: LEVEL=-1; SCORE=267; N_SCORE=6.5; MODE=1; TEXT='?';
MA /DEFAULT: D=-20; I=-20; B1=-50; E1=-50; MI=-105; MD=-105; IM=-105; DM=-105; MM=1; M0=-2;
MA /I: B1=0; BI=-105; BD=-105;
MA /M: SY='C'; M=-6,-10,28,-14,-9,-15,-20,-14,-19,-15,-17,-14,-8,-19,-14,-15,0,0,-9,-32,-17,-12;
MA /M: SY='D'; M=-16,41,-28,53,15,-34,-11,-1,-33,0,-27,-25,21,-11,0,-8,2,-6,-26,-38,-19,7;
MA /M: SY='V'; M=2,-23,-8,-28,-24,-1,-24,-25,16,-20,7,6,-20,-25,-23,-20,-10,-4,24,-23,-9,-24;
MA /M: SY='T'; M=-2,-13,-18,-19,-13,-7,-24,-19,6,-8,-2,1,-11,-17,-11,-10,-1,10,10,-24,-6,-13;
MA /M: SY='L'; M=-11,-30,-22,-33,-24,15,-32,-23,25,-29,35,17,-26,-27,-23,-22,-24,-9,16,-17,3,-24;
MA /M: SY='V'; M=0,-11,-18,-13,-10,-12,-20,-13,1,-6,-4,2,-10,-19,-6,-7,-4,-2,8,-25,-9,-9;
MA /M: SY='V'; M=1,-25,-3,-29,-25,-2,-26,-26,17,-22,10,7,-23,-25,-23,-22,-11,-3,24,-27,-10,-25;
MA /M: SY='D'; M=-6,7,-26,8,7,-25,6,-7,-27,0,-23,-17,8,-13,0,-3,3,-6,-23,-27,-17,3;
MA /I: I=-5; MI=0; IM=0; DM=-15; MD=-15;
MA /M: SY='G'; M=-6,8,-27,8,-3,-27,22,-7,-30,-8,-26,-19,10,-14,-8,-9,2,-9,-24,-28,-21,-6;
MA /M: SY='K'; M=-7,-4,-23,-4,7,-23,-13,-2,-21,10,-18,-9,-3,-12,7,9,-2,-4,-16,-25,-12,6;
MA /M: SY='E'; M=-8,-6,-21,-8,1,-15,-21,-7,-7,-1,-10,-5,-3,-14,0,-1,-2,-2,-6,-26,-9,-1;
MA /M: SY='F'; M=-12,-28,-22,-34,-26,31,-31,-21,18,-26,16,9,-22,-27,-27,-21,-20,-9,14,-6,13,-26;
MA /M: SY='R'; M=-13,-9,-24,-10,-3,-11,-21,7,-17,7,-16,-4,-4,-8,2,9,-9,-9,-16,-20,-1,-2;
MA /M: SY='A'; M=21,-15,-8,-22,-17,-10,-10,-23,0,-15,-5,-5,-14,-18,-17,-19,4,6,12,-24,-15,-17;
MA /M: SY='H'; M=-15,5,-22,2,-1,-20,-16,65,-26,-8,-21,-5,15,-19,6,-2,-2,-11,-26,-32,7,0;
MA /M: SY='K'; M=-12,-5,-29,-5,5,-25,-18,-8,-26,34,-24,-9,-1,-14,8,34,-8,-8,-17,-20,-10,5;
MA /M: SY='A'; M=4,-12,-12,-16,-10,-6,-18,-14,-2,-13,-1,-2,-11,-17,-12,-13,-3,1,2,-24,-8,-11;
MA /M: SY='V'; M=-7,-26,-19,-31,-26,7,-32,-24,27,-23,14,11,-22,-25,-23,-22,-13,0,28,-19,3,-26;
MA /M: SY='L'; M=-10,-30,-20,-30,-21,9,-30,-20,22,-29,47,20,-29,-29,-20,-20,-29,-10,12,-20,0,-21;
MA /M: SY='A'; M=18,-6,0,-12,-8,-18,-6,-16,-15,-10,-18,-12,-2,-14,-8,-13,18,11,-5,-32,-19,-8;
...
Prosite (profile): example (cont.)
...
MA   /M: SY='T'; M=-3,3,-16,1,-3,-18,-12,-9,-20,-6,-19,-15,2,-7,-6,-6,10,15,-13,-27,-12,-5;
MA   /M: SY='G'; M=-1,1,-25,2,-9,-26,31,-12,-32,-10,-26,-18,4,-17,-12,-10,1,-12,-24,-25,-22,-11;
MA   /M: SY='E'; M=-9,3,-24,4,13,-25,-16,-1,-24,13,-21,-13,3,-9,6,13,-3,-6,-20,-27,-13,8;
MA   /M: SY='I'; M=-6,-21,-18,-25,-21,-2,-29,-21,21,-21,14,10,-19,-24,-17,-19,-13,-3,19,-23,-3,-20;
MA   /M: SY='E'; M=-4,3,-23,3,4,-18,-11,-7,-17,-1,-18,-13,3,-9,-1,-5,1,-4,-14,-25,-11,1;
MA   /M: SY='I'; M=-8,-25,-23,-27,-20,1,-30,-21,21,-20,18,12,-22,-18,-18,-18,-18,-7,16,-21,-1,-20;
MA   /M: SY='P'; M=-6,0,-24,2,1,-22,-13,-8,-21,-2,-23,-15,1,14,-4,-7,3,2,-19,-31,-18,-3;
MA   /M: SY='E'; M=-7,1,-27,4,11,-24,-15,-4,-19,2,-18,-11,0,-1,6,-1,-2,-6,-19,-25,-14,7;
MA   /I: E1=0; IE=-105; DE=-105;
NR   /RELEASE=39,87397;
NR   /TOTAL=46(44); /POSITIVE=45(43); /UNKNOWN=1(1); /FALSE_POS=0(0);
NR   /FALSE_NEG=0; /PARTIAL=0;
CC   /TAXO-RANGE=??E?V; /MAX-REPEAT=2;
DR   O14867, BACH1_HUMAN, T; P97302, BACH1_MOUSE, T; P97303, BACH2_MOUSE, T;
DR   P41182, BCL6_HUMAN, T; P41183, BCL6_MOUSE, T; Q01295, BRC1_DROME, T;
DR   Q01296, BRC2_DROME, T; Q01293, BRC3_DROME, T; Q28068, CALI_BOVIN, T;
DR   Q13939, CALI_HUMAN, T; Q08605, GAGA_DROME, T; Q01820, GCL1_DROME, T;
DR   P10074, HKR3_HUMAN, T; Q04652, KELC_DROME, T; P42283, LOLL_DROME, T;
DR   P42284, LOLS_DROME, T; O14682, PI10_HUMAN, T; Q05516, PLZF_HUMAN, T;
DR   O43791, SPOP_HUMAN, T; P42282, TTKA_DROME, T; P17789, TTKB_DROME, T;
DR   P21073, VA55_VACCC, T; P24768, VA55_VACCV, T; P21037, VC02_VACCC, T;
DR   P17371, VC02_VACCV, T; P32228, VC04_SPVKA, T; P32206, VC13_SPVKA, T;
DR   P21013, VF03_VACCC, T; P24357, VF03_VACCV, T; P22611, VMT8_MYXVL, T;
DR   P08073, VMT9_MYXVL, T; O43167, Y441_HUMAN, T; Q10225, YAZ4_SCHPO, T;
DR   P40560, YIA1_YEAST, T; P34324, YKV2_CAEEL, T; P34371, YLJ8_CAEEL, T;
DR   P34568, YMV5_CAEEL, T; P41886, YPT9_CAEEL, T; Q09563, YR47_CAEEL, T;
DR   Q10017, YSW1_CAEEL, T; Q13105, Z151_HUMAN, T; Q60821, Z151_MOUSE, T;
DR   P24278, ZN46_HUMAN, T;
DR   Q13829, TNP1_HUMAN, ?;
DO   PDOC50097;
//
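The list of matches above can be tallied by status flag to cross-check the diagnostic-performance (NR) lines. The sketch below is a minimal illustration, not a PROSITE tool: it assumes DR lines written as "accession, entry name, flag;" triples, and the names `DR_LINES`, `DR_ENTRY` and `tally_dr_flags` are hypothetical. Only a short excerpt of the BTB list is used as input.

```python
import re
from collections import Counter

# Excerpt of PROSITE-style DR lines: "accession, entry name, flag;" triples.
DR_LINES = """\
DR   P41182, BCL6_HUMAN, T; P41183, BCL6_MOUSE, T; Q01295, BRC1_DROME, T;
DR   P24278, ZN46_HUMAN, T;
DR   Q13829, TNP1_HUMAN, ?;
"""

# One match = accession, entry name, single-character status flag.
DR_ENTRY = re.compile(r"(\w+)\s*,\s*(\w+)\s*,\s*([A-Z?])\s*;")


def tally_dr_flags(text):
    """Count DR entries per status flag (T, P, ?, ...)."""
    counts = Counter()
    for line in text.splitlines():
        if not line.startswith("DR"):
            continue  # skip any non-DR line
        for _accession, _name, flag in DR_ENTRY.findall(line[2:]):
            counts[flag] += 1
    return counts


counts = tally_dr_flags(DR_LINES)
print(dict(counts))
```

Run over the complete BTB list, the same tally gives 43 entries flagged T and one flagged ?, which agrees with the sequence counts given in parentheses in /POSITIVE=45(43) and /UNKNOWN=1(1).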
PRINTS
Compendium of protein motif fingerprints
Most protein families are characterized by
several conserved motifs
Fingerprint: set of motif(s) (simple or
composite, such as multidomains) =
signature of family membership
True family members exhibit all elements
of the fingerprint, while subfamily
members may possess only a part
ProDom
consists of an automated compilation of
homologous domain alignments (procedure
based on PSI-BLAST searches)
Updating problem!
Last ProDom update: February 7, 2000
built from SWISS-PROT 38 + TrEMBL + TrEMBL
updates - October 22, 1999
ProDom: example
Your query
Protein domain/family: Composite databases
Example: InterPro
Unification of PROSITE, PRINTS, Pfam and
ProDom into an integrated resource of protein
families, domains and functional sites;
Single set of documents linked to the various
methods;
Will be used to improve the functional annotation
of SWISS-PROT (classification of unknown proteins...)
This release contains 3052 entries, representing 574 domains,
2418 families, 46 repeats and 14 post-translational modification
sites.
InterPro: example
IPR001323
Name
Erythropoietin/thrombopoietin
Type
Family
Abstract
Erythropoietin, a plasma glycoprotein, is the primary physiological mediator of erythropoiesis [1]. It is involved in
the regulation of the level of peripheral erythrocytes by stimulating the differentiation of erythroid progenitor cells,
found in the spleen and bone marrow, into mature erythrocytes [2]. It is primarily produced in adult kidneys and
foetal liver, acting by attachment to specific binding sites on erythroid progenitor cells, stimulating their
differentiation [3]. Severe kidney dysfunction causes reduction in the plasma levels of erythropoietin, resulting in
chronic anaemia - injection of purified erythropoietin into the blood stream can help to relieve this type of anaemia.
Levels of erythropoietin in plasma fluctuate with varying oxygen tension of the blood, but androgens and
prostaglandins also modulate the levels to some extent [3]. Erythropoietin glycoprotein sequences are well
conserved, a consequence of which is that the hormones are cross-reactive among mammals, i.e. that from one
species, say human, can stimulate erythropoiesis in other species, say mouse or rat [4].
Thrombopoietin (TPO), a glycoprotein, is the mammalian hormone which functions as a megakaryocytic lineage
specific growth and differentiation factor affecting the proliferation and maturation from their committed progenitor
cells acting at a late stage of megakaryocyte development. It acts as a circulating regulator of platelet numbers.
...
InterPro: example
...
Example list
P33708
P33709
P49745
view matches for the examples
Publications
1. Shoemaker C.B., Mitsock L.D. 849-858 (1986)
2. Takeuchi M., Takasaki S., Miyazaki H., Kato T., Hoshi S., Kochibe N., Kobata A. J. Biol. Chem. 263:
3657-3663 (1988)
3. Lin F.K., Lin C.H., Lai P.H., Browne J.K., Egrie J.C., Smalling R., Fox G.M., Chen K.K., Castro M., Suggs
S. Gene 44: 201-209 (1986)
4. Nagao M., Suga H., Okano M., Masuda S., Narita H., Ikura K., Sasaki R.
Nucleotide sequence of rat erythropoietin.
1171: 99-102 (1992)
Children
IPR003013
Signatures
PROSITE PS00817 EPO_TPO
PFAM PF00758 EPO_TPO
Matches
Table Graphical
Databases 5: mutation/polymorphism
Contain information on sequence variations that are linked or not
to genetic diseases;
Mainly human but: OMIA - Online Mendelian Inheritance in Animals
General db:
OMIM
HGMD - Human Gene Mutation db
SVD - Sequence variation db
HGBASE - Human Genic Bi-Allelic Sequences db
dbSNP - Human single nucleotide polymorphism (SNP) db
Disease-specific db: most of these databases are either linked to
a single gene or to a single disease;
p53 mutation db
ADB - Albinism db (Mutations in human genes causing albinism)
Asthma and Allergy gene db
...
Mutation/polymorphisms: definitions
SNPs: single nucleotide polymorphisms
c-SNPs: coding single nucleotide polymorphisms
(Single Nucleotide Polymorphisms within cDNA sequences)
SAPs: single amino-acid polymorphisms
Missense mutation: -> SAP
Nonsense mutation: -> STOP
Insertion/deletion of nucleotides -> frameshift...
Numbering of the mutation depends on the db (aa
no. 1 is not necessarily the initiator Met)!
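The definitions above can be illustrated with a short classifier. This is a minimal sketch under stated assumptions: it uses the standard genetic code, works on a single codon at a time, and the function names `classify_substitution` and `classify_indel` are hypothetical, not taken from any of the databases listed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon substitution by its effect on the protein."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense (STOP)"
    if ref_aa == "*":
        return "stop lost"
    return "missense (SAP)"


def classify_indel(length):
    """An insertion/deletion shifts the reading frame unless its length is a multiple of 3."""
    return "in-frame" if length % 3 == 0 else "frameshift"


print(classify_substitution("GAG", "GTG"))  # Glu -> Val
print(classify_substitution("TGG", "TGA"))  # Trp -> stop
print(classify_indel(1))
```

The classifier says nothing about residue numbering, which, as noted above, differs between databases.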
Mutation/polymorphisms
dbSNP consortium http://snp.cshl.org/
Bayer, Roche, IBM, Pfizer, Novartis, Motorola...
Mission: develop up to 300,000 SNPs distributed evenly throughout the
human genome and make the information related to these SNPs
available to the public without intellectual property restrictions. The
project started in April 1999 and is anticipated to continue until the
end of 2001.
dbSNP at NCBI http://www.ncbi.nlm.nih.gov/SNP/
Collaboration between the National Human Genome Research Institute and the
National Center for Biotechnology Information (NCBI)
Mission: central repository for both single base nucleotide substitutions and short
deletion and insertion polymorphisms
Aug 24, 2000: dbSNP has submissions for 803557 SNPs.
Chromosome 21 dbSNP http://csnp.isb-sib.ch/
A joint project between the Division of Medical Genetics of the
University of Geneva Medical School and the SIB
Mission: comprehensive cSNP (Single Nucleotide Polymorphisms within
cDNA sequences) database and map of chromosome 21
Mutation/polymorphisms
Very heterogeneous format;
Generally modest size;
There are initiatives to standardize and to unify
these databases (SVD - Sequence Variation
Database project at EBI: MutDB)
Databases 6: proteomics
Contain information obtained by 2D-PAGE: master
images of the gels and description of identified
proteins
Examples: SWISS-2DPAGE, ECO2DBASE, Maize-
2DPAGE, Sub2D, Cyano2DBase, etc.
Format: composed of image and text files
Most 2D-PAGE databases are "federated" and
use SWISS-PROT as a master index
There is currently no protein Mass Spectrometry
(MS) database (not for long...)
Databases 7: 3D structure
Contain the spatial coordinates of macromolecules whose 3D
structure has been obtained by X-ray or NMR studies
Proteins represent more than 90% of available structures
(others are DNA, RNA, sugars, viruses, protein/DNA complexes...)
PDB (Protein Data Bank), SCOP (structural classification of
proteins (according to the secondary structures)), BMRB
(BioMagResBank; NMR results)
Future: Homology-derived 3D structure db.
PDB
Protein Data Bank, managed by RCSB
Currently there are ~13'000 structures for about
4'000 different molecules, but far fewer protein
families!
There are also databases that contain data
derived from PDB. Examples: HSSP (homology-
derived secondary structure of proteins),
SWISS-3DIMAGE (images),
Restriction enzyme
PDB: example
HEADER    LYASE(OXO-ACID)                01-OCT-91   12CA          12CA   2
COMPND    CARBONIC ANHYDRASE /II (CARBONATE DEHYDRATASE) (/HCA II) 12CA   3
COMPND   2 (E.C.4.2.1.1) MUTANT WITH VAL 121 REPLACED BY ALA (/V121A) 12CA   4
SOURCE    HUMAN (HOMO SAPIENS) RECOMBINANT PROTEIN                 12CA   5
AUTHOR    S.K.NAIR,D.W.CHRISTIANSON                                12CA   6
REVDAT   1   15-OCT-92 12CA    0                                   12CA   7
JRNL        AUTH   S.K.NAIR,T.L.CALDERONE,D.W.CHRISTIANSON,C.A.FIERKE 12CA   8
JRNL        TITL   ALTERING THE MOUTH OF A HYDROPHOBIC POCKET.     12CA   9
JRNL        TITL 2 STRUCTURE AND KINETICS OF HUMAN CARBONIC ANHYDRASE 12CA  10
JRNL        TITL 3 /II$ MUTANTS AT RESIDUE VAL-121                 12CA  11
JRNL        REF    J.BIOL.CHEM.   V. 266 17320 1991                12CA  12
JRNL        REFN   ASTM JBCHA3  US ISSN 0021-9258   071            12CA  13
REMARK   1                                                         12CA  14
REMARK   2                                                         12CA  15
REMARK   2 RESOLUTION. 2.4 ANGSTROMS.                              12CA  16
REMARK   3                                                         12CA  17
REMARK   3 REFINEMENT.                                             12CA  18
REMARK   3   PROGRAM PROLSQ                                        12CA  19
REMARK   3   AUTHORS HENDRICKSON,KONNERT                           12CA  20
REMARK   3   R VALUE 0.170                                         12CA  21
REMARK   3   RMSD BOND DISTANCES 0.011 ANGSTROMS                   12CA  22
REMARK   3   RMSD BOND ANGLES 1.3 DEGREES                          12CA  23
REMARK   4                                                         12CA  24
REMARK   4 N-TERMINAL RESIDUES SER 2, HIS 3, HIS 4 AND C-TERMINAL  12CA  25
REMARK   4 RESIDUE LYS 260 WERE NOT LOCATED IN THE DENSITY MAPS AND, 12CA  26
REMARK   4 THEREFORE, NO COORDINATES ARE INCLUDED FOR THESE RESIDUES. 12CA  27
...
PDB (cont.)
SHEET    3 S10 PHE  66  PHE  70 -1  O  ASN  67  N  LEU  60   12CA  68
SHEET    4 S10 TYR  88  TRP  97 -1  O  PHE  93  N  VAL  68   12CA  69
SHEET    5 S10 ALA 116  ASN 124 -1  O  HIS 119  N  HIS  94   12CA  70
SHEET    6 S10 LEU 141  VAL 150 -1  O  LEU 144  N  LEU 120   12CA  71
SHEET    7 S10 VAL 207  LEU 212  1  O  ILE 210  N  GLY 145   12CA  72
SHEET    8 S10 TYR 191  GLY 196 -1  O  TRP 192  N  VAL 211   12CA  73
SHEET    9 S10 LYS 257  ALA 258 -1  O  LYS 257  N  THR 193   12CA  74
SHEET   10 S10 LYS  39  TYR  40  1  O  LYS  39  N  ALA 258   12CA  75
TURN     1 T1  GLN  28  VAL  31  TYPE VIB (CIS-PRO 30)       12CA  76
TURN     2 T2  GLY  81  LEU  84  TYPE II(PRIME) (GLY 82)     12CA  77
TURN     3 T3  ALA 134  GLN 137  TYPE I (GLN 136)            12CA  78
TURN     4 T4  GLN 137  GLY 140  TYPE I (ASP 139)            12CA  79
TURN     5 T5  THR 200  LEU 203  TYPE VIA (CIS-PRO 202)      12CA  80
TURN     6 T6  GLY 233  GLU 236  TYPE II (GLY 235)           12CA  81
CRYST1   42.700   41.700   73.000  90.00 104.60  90.00 P 21  2   12CA  82
ORIGX1   1.000000  0.000000  0.000000  0.00000               12CA  83
ORIGX2   0.000000  1.000000  0.000000  0.00000               12CA  84
ORIGX3   0.000000  0.000000  1.000000  0.00000               12CA  85
SCALE1   0.023419  0.000000  0.006100  0.00000               12CA  86
SCALE2   0.000000  0.023981  0.000000  0.00000               12CA  87
SCALE3   0.000000  0.000000  0.014156  0.00000               12CA  88
ATOM      1  N   TRP   5    8.519  -0.751  10.738  1.00 13.37   12CA  89
ATOM      2  CA  TRP   5    7.743  -1.668  11.585  1.00 13.42   12CA  90
ATOM      3  C   TRP   5    6.786  -2.502  10.667  1.00 13.47   12CA  91
ATOM      4  O   TRP   5    6.422  -2.085   9.607  1.00 13.57   12CA  92
ATOM      5  CB  TRP   5    6.997  -0.917  12.645  1.00 13.34   12CA  93
ATOM      6  CG  TRP   5    5.784  -0.209  12.221  1.00 13.40   12CA  94
ATOM      7  CD1 TRP   5    5.681   1.084  11.797  1.00 13.29   12CA  95
ATOM      8  CD2 TRP   5    4.417  -0.667  12.221  1.00 13.34   12CA  96
ATOM      9  NE1 TRP   5    4.388   1.418  11.515  1.00 13.30   12CA  97
ATOM     10  CE2 TRP   5    3.588   0.375  11.797  1.00 13.35   12CA  98
ATOM     11  CE3 TRP   5    3.837  -1.877  12.645  1.00 13.39   12CA  99
ATOM     12  CZ2 TRP   5    2.216   0.208  11.656  1.00 13.39   12CA 100
ATOM     13  CZ3 TRP   5    2.465  -2.043  12.504  1.00 13.33   12CA 101
ATOM     14  CH2 TRP   5    1.654  -1.001  12.009  1.00 13.34   12CA 102
...
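To show how the spatial coordinates in such an entry can be used, here is a minimal sketch that reads three ATOM records from the example and computes the distance between the backbone N and CA atoms of TRP 5. It assumes whitespace-separated fields as they appear on the slide (old-style records without a chain identifier); real PDB files are fixed-column, so a production parser should slice by column instead. The names `ATOM_LINES` and `parse_atom` are hypothetical.

```python
import math

# Three ATOM records from entry 12CA, fields separated by whitespace.
ATOM_LINES = """\
ATOM 1 N TRP 5 8.519 -0.751 10.738 1.00 13.37 12CA 89
ATOM 2 CA TRP 5 7.743 -1.668 11.585 1.00 13.42 12CA 90
ATOM 3 C TRP 5 6.786 -2.502 10.667 1.00 13.47 12CA 91
"""


def parse_atom(line):
    """Parse one whitespace-separated ATOM record into a dict."""
    f = line.split()
    return {
        "serial": int(f[1]),
        "name": f[2],
        "residue": f[3],
        "resnum": int(f[4]),
        "xyz": tuple(float(v) for v in f[5:8]),  # coordinates in angstroms
        "occupancy": float(f[8]),
        "bfactor": float(f[9]),
    }


atoms = {a["name"]: a for a in map(parse_atom, ATOM_LINES.splitlines())}
n_ca = math.dist(atoms["N"]["xyz"], atoms["CA"]["xyz"])
print(f"N-CA distance: {n_ca:.2f} angstroms")
```

The result, about 1.47 angstroms, is a typical N-CA bond length, which is a quick sanity check that the coordinates were read correctly.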
Databases 8: metabolic
Contain information that describes enzymes,
biochemical reactions and metabolic pathways;
ENZYME and BRENDA: nomenclature databases that
store information on enzyme names and reactions;
Examples of metabolic databases: EcoCyc (specialized
on Escherichia coli), KEGG, EMP/WIT;
Usually these databases are tightly coupled with query
software that allows the user to visualise reaction
schemes.
Databases 9: bibliographic
Bibliographic reference databases contain
citations and abstract information of
published life science articles;
Example: Medline
Other more specialized databases also exist
(example: Agricola).
Medline
MEDLINE covers the fields of medicine, nursing,
dentistry, veterinary medicine, the health care
system, and the preclinical sciences
Covers more than 4,000 biomedical journals published in the
United States and 70 other countries
Contains over 10 million citations from 1966 until
now
Contains links to biological db and to some journals
New records are added to PreMEDLINE daily!
Many papers not dealing with humans are not in Medline!
Before 1970, keeps only the first 10 authors!
Not all journals have citations since 1966!
Medline/PubMed
PubMed is developed by the National Center for
Biotechnology Information (NCBI)
PubMed provides access to bibliographic
information such as MEDLINE, PreMEDLINE,
HealthSTAR, and to integrated molecular biology
databases (composite db)
PMID: 10923642 (PubMed ID), UI: 20378145
(Medline ID)
Databases 10: others
There are many databases that cannot be
classified in the categories listed previously;
Examples: ReBase (restriction enzymes),
TRANSFAC (transcription factors), O-
GLYCBASE (O-linked sugars), Protein-protein
interactions db (DIP), biotechnology patents
db, etc.;
As well as many other resources concerning
any aspects of macromolecules and molecular
biology.
Proliferation of databases
What is the best db for sequence analysis?
Which contains the highest quality data?
Which is the most comprehensive?
Which is the most up-to-date?
Which is the least redundant?
Which is the best indexed (allows complex
queries)?
Which Web server responds most quickly?
...??????
Some important practical remarks
Databases: many errors (automated
annotation)!
Not all db are available on all servers
The update frequency is not the same for
all servers; creation of db_new between
releases (example: EMBLnew,
TrEMBLnew...)
Some servers automatically add useful
cross-references to an entry (implicit
links) in addition to already existing links
(explicit links)
Database retrieval tools
Sequence Retrieval System (SRS, Europe) allows any
flat-file db to be indexed to any other; queries can
be formulated across a wide range of different
db types via a single interface, without any worry
about data structure, query languages...
Entrez (USA): less flexible than SRS but exploits
the concept of "neighbouring", which allows related
articles in different db to be linked together,
whether or not they are cross-referenced directly
ATLAS: specific for macromolecular sequence db
(e.g. NRL-3D)
...
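The idea of indexing one flat-file db against another can be sketched in a few lines. This is only an illustration of the principle, not how SRS or Entrez is implemented: the two-letter line codes, the simplified DR line layout and the function names `parse_flat_file`, `build_index` and `linked_entries` are assumptions made for the example. The accession numbers come from the PROSITE example earlier in the course.

```python
def parse_flat_file(text):
    """Split a flat file into entries terminated by '//'; each entry maps line code -> values."""
    entries, current = [], {}
    for line in text.splitlines():
        if line.startswith("//"):
            if current:
                entries.append(current)
            current = {}
        elif line.strip():
            code, _, value = line.partition(" ")
            current.setdefault(code, []).append(value.strip())
    return entries


def build_index(entries, key="AC"):
    """Index entries by the first value of the given line code (accession number by default)."""
    return {entry[key][0].rstrip(";"): entry for entry in entries}


def linked_entries(entry, target_db, target_index):
    """Follow the DR cross-references of an entry into another indexed database."""
    hits = []
    for dr in entry.get("DR", []):
        db, accession = [part.strip() for part in dr.rstrip(";").split(";")][:2]
        if db == target_db and accession in target_index:
            hits.append(target_index[accession])
    return hits


# Two toy flat files in a simplified, hypothetical layout.
PROTEINS = """\
AC P33708;
ID EPO_FELCA
DR PROSITE; PS00817;
//
AC P40225;
ID TPO_HUMAN
DR PROSITE; PS00817;
//
"""
PATTERNS = """\
AC PS00817;
ID EPO_TPO
//
"""

protein_index = build_index(parse_flat_file(PROTEINS))
pattern_index = build_index(parse_flat_file(PATTERNS))
hits = linked_entries(protein_index["P33708"], "PROSITE", pattern_index)
print([hit["ID"][0] for hit in hits])
```

Once every db is indexed this way, a query can hop from one db to another through the cross-references without the user knowing how each file is laid out.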
More information about
SWISS-PROT
The golden goals of SWISS-PROT
Annotated / curated
Complete
Non-redundant
Highly cross-referenced
Available from a variety of servers and
through sequence analysis software tools
Associated with a wide range of documentation
Review: Protein sequence databases
R. Apweiler (2000), Adv. in protein chemistry, 54, 31-70
SWISS-PROT: species
6'840 different species
20 species represent about 45% of all
sequences in the database
5'000 species are only represented by one
to three sequences. In most cases, these
are sequences which were obtained in the
context of a phylogenetic study
SWISS-PROT: cross-references
SWISS-PROT was the first database with cross-
references.
Explicitly cross-referenced to 34 databases
Cross-ref to DNA (EMBL/GenBank/DDBJ), 3D-
structure (PDB), literature (Medline), genomic
(MIM, MGD, FlyBase, SGD, SubtiList, etc.), 2D-gel
(SWISS-2DPAGE), specialized db (PROSITE,
TRANSFAC)
Implicitly cross-referenced to additional db on
the WWW (GeneCards, PRODOM, etc.)
Annotations
Function(s)
Post-translational modifications (PTM)
Domains
Quaternary structure
Similarities
Diseases, mutagenesis
Conflicts, variants
Cross-references
...
A Swiss-Prot entry
[Slides showing a Swiss-Prot entry over five pages; images not reproduced]
Future for human proteins
Original estimate: from 70'000 to 100'000 genes
Incyte recently announced an estimate of 140'000 genes
More recent estimates give about 30'000 to 40'000 genes
C. elegans and Drosophila have ~15'000 genes. There were two rounds
of genome duplication in the evolutionary history leading to
vertebrates. Very roughly, this means that:
Human genes = ~60'000 genes (15'000 x 2 x 2) - losses + new genes
But more than 1 million proteins!
(due to PTM, alternative products, variants...)
Genesweep
http://www.ensembl.org/genesweep.html
What after genomes?
Proteome projects are an essential tool for
the understanding of real proteins
There will be a flood of characterization
data (MS, 2D) that will be the equivalent
of ESTs at the protein level
Protein databases are going to be more and
more important for new biological studies
Databases in GCG
DNA
EMBL, EPD, RepBase, vectordb (NCBI)
Protein
Swiss-Prot, TrEMBL, PDB
Other
PROSITE, REBASE
How to access databases in GCG?
Fetch or typedata?
Stringsearch
Name
Lookup (based on SRS)
Useful to generate list files