
UCSC Genome Browser

http://genome.ucsc.edu/

Louis Tang
Bioinformatics R&D
National Genotyping Center
Academia Sinica.
louis@ibms.sinica.edu.tw
Quick Overview of features on
genomic regions
End of The Human Genome War
W. James Kent
Coordinate Genome Assembly
Annotation Tracks Gene

Regulation

Variation

Expression

More…
Annotation Tracks
Coordinate
Everlasting Assembly
Ever-Changing Tracks
http://genome.ucsc.edu/
Coordinate

chr3:1,000,000-2,000,000
chr3:1,000,000+2000

chr3:1,000,000-1,001,999
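The two coordinate forms above (start-end and start+length) can be normalized with a few lines of Python. This is a minimal sketch, not the browser's own parser; the function names `parse_position` and `format_position` are made up for illustration, and coordinates are taken to be 1-based and inclusive, as typed into the position box.

```python
import re

def parse_position(text):
    """Parse 'chrom:start-end' or 'chrom:start+length' (1-based, inclusive).

    Hypothetical helper for illustration; returns (chrom, start, end).
    """
    m = re.fullmatch(r"(\w+):([\d,]+)([-+])([\d,]+)", text.strip())
    if m is None:
        raise ValueError(f"unrecognized position: {text!r}")
    chrom, first, sep, second = m.groups()
    start = int(first.replace(",", ""))
    value = int(second.replace(",", ""))
    # 'start+length' covers `length` bases, so the last base is start+length-1.
    end = value if sep == "-" else start + value - 1
    return chrom, start, end

def format_position(chrom, start, end):
    """Format a region the way the browser displays it."""
    return f"{chrom}:{start:,}-{end:,}"

print(format_position(*parse_position("chr3:1,000,000+2000")))
```

The "+2000" form yields a region of exactly 2,000 bases, which is why the end is 1,001,999 and not 1,002,000.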
Landmark

chromosome
Gene
SNP
STS EST
Cytogenetic band
chr7

chr7:1-158,821,424
20q12

chr20:37,100,001-41,100,000
apoe

chr19:50,100,879-50,104,490
rs328

chr8:19864004

±250

chr8:19,863,754-19,864,254
D16S2837

chr16:81,120,163-81,120,399

±100,000

chr16:81,020,163-81,220,399
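The padding arithmetic in the rs328 and D16S2837 examples is simple enough to check by hand or in code. A small sketch, assuming 1-based inclusive coordinates; `pad_region` is a hypothetical name, not a browser function.

```python
def pad_region(chrom, start, end, padding):
    """Extend a 1-based inclusive region by `padding` bases on each side.

    The start is clamped at 1 so the region never runs off the chromosome start.
    """
    return chrom, max(1, start - padding), end + padding

# rs328 is a single base at chr8:19864004; pad by 250 on each side.
print(pad_region("chr8", 19864004, 19864004, 250))
# D16S2837 spans chr16:81,120,163-81,120,399; pad by 100,000 on each side.
print(pad_region("chr16", 81120163, 81120399, 100000))
```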
Landmark1;Landmark2

Landmark1 Landmark2
rs328;rs316

chr8:19,864,004 chr8:19,862,716

Sort

chr8:19,862,716-19,864,004
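The "Landmark1;Landmark2" lookup amounts to resolving each landmark, sorting, and taking the span between them. A sketch of that last step, assuming both landmarks have already been resolved to 1-based positions on the same chromosome; `span_landmarks` is an illustrative name only.

```python
def span_landmarks(first, second):
    """Return the region covering two (chrom, start, end) landmarks.

    The order of the arguments does not matter; the result is sorted.
    """
    chrom_a, start_a, end_a = first
    chrom_b, start_b, end_b = second
    if chrom_a != chrom_b:
        raise ValueError("landmarks are on different chromosomes")
    return chrom_a, min(start_a, start_b), max(end_a, end_b)

rs328 = ("chr8", 19864004, 19864004)
rs316 = ("chr8", 19862716, 19862716)
print(span_landmarks(rs328, rs316))
```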
Author
McAndrew,P.E.
Browser Graphic

Track Control
Mark
Gene

UTR Intron CDS

>>>>>> >>>>
PDB

>>>>>> >>>>
Reviewed

>>>>>> >>>>
RefSeq Provisional

>>>>>> >>>>
Non-RefSeq

>>>>>> >>>>
Alignment

TAACCAGCTGCCCAA--------TAGAAACTACGAGAGACAACAGGGAGT
||||| ||||||||| ||||||| | ||||||||||
TAACCAGCTGCCCAACTGTAGAAACTACCAACTCATTTCGAACAGGGAGT
Wiggle
Mapping and Sequencing
Phenotype and Disease Association (OMIM)

Genes and Gene Prediction (sno/miRNA)

mRNA and EST

Expression

Regulation (TFBS, miRNA target)

Comparative Genomics

Variation and Repeats (SNP, CNV)

ENCODE Pilot
Display Mode
Full
Pack
Squish
Dense
Hide
Configuration

Description
Display Convention

Method

Credit

Data Usage Restriction

References
Different Display Mode
Exhibits Different Behavior
Stroll Along The Genome
Click
Drag
Ctrl + Click
Practice

Find out if the mouse Brca1 gene has non-synonymous SNPs, color them blue, and get external data about one codon-changing SNP.

(Hint: Color option hides in SNP track control)

http://www.openhelix.com/downloads/ucsc/ucsc_home.shtml
Where is the Sequence?
RefSeq Gene
SNP
Browser Graphic Download
chr8:19,863,754-19,864,254

rs328

McAndrew,P.E.

?
AACTAGAAATCAGTCAACAAATTGGATGCTTAGGATAAATTCAAGAACTG
AGTAGAGAAATAAAGCTTAATGAATGACCTTTTGGGCTCCTTCCAGTTCC
AAGGTTTTAGTATTCTAAAATTTTCGGCACAGAACAACTCCAAATGCTCA
GGAAATAAGAATGAGGTCTGTTTTTAAAAGGTGCAGTTTGGAGCATGTTG
GGTGGATGAGGCTATAAAAAGTGAAGTACGATTTTCAAGGAAAGGAAGCT
GACCAATCAAAGTCTTTTGGGCAGCCCCTCCAGAAATCCAGGTGAAGCCC
GGCTCCAGGCTGAGTTGCTGTTACTCTACACGAAAGCCAGGCCGCTACTT
BLAT

BLAST-Like Alignment Tool


W. James Kent
DNA & RNA

500x
Protein

50x
DNA & RNA

95%
25 bases
Protein

80%
20 amino acids
DNA

mRNA
BLAT’s Guess
DNA (RNA)
Query Genome

DNA DNA
Protein
Query Genome

Protein DNA
6 frames

Protein
Translated RNA
Query Genome

RNA DNA
3 frames 6 frames

Protein Protein
Translated DNA
Query Genome

DNA DNA
6 frames 6 frames

Protein Protein
Query

Genome
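"6 frames" in the translated modes above means translating the sequence in three reading frames on each strand. The sketch below only illustrates that idea with the standard genetic code; it is not how BLAT is implemented, and the function names are made up for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (last base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons from the first base; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frames(dna):
    """Return six translations: frames 0-2 forward, then frames 0-2 reverse."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    return [translate(strand[offset:]) for strand in (dna, reverse) for offset in range(3)]

for frame in six_frames("ATGGCCTAA"):
    print(frame)
```

A translated RNA query needs only the three forward frames, because the transcript's strand is already known; the genome side still needs all six.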
Practice

Find the protein sequence for mouse APOE (mm9). BLAT this sequence against the human genome (hg19) to find the human homolog. Look for SNPs (SNPs (130) track) in the coding region of this gene. Obtain the human DNA sequence for this region, and underline the SNPs.

http://www.openhelix.com/downloads/ucsc/ucsc_home.shtml
Pretty graphic is good,
but…
I want raw data
Table Browser
Text-based access to features
on genomic regions
Mission

(Hg18, SNP130)
Find all single SNPs on tp53
(uc002gij.2)
Genome Browser and Table Browser

[Diagram: one database holds many tables. The Genome Browser draws each annotation track from these tables; the Table Browser reads the same tables directly.]
Apoe

Start Stop
Genome Browser 50100879 50104490
Table Browser 50100878 50104490
C A C C T C A G A
Biology 1 2 3 4 5 6 7 8 9
Computer 0 1 2 3 4 5 6 7 8
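The Apoe row shows the practical effect: the two browsers report the same end but starts that differ by one. A minimal sketch of the conversion, assuming the Table Browser stores zero-based, half-open intervals (start counted from 0, end excluded), which is consistent with the numbers above; the helper names are illustrative.

```python
def one_based_to_zero_based(start, end):
    """Genome Browser style (1-based, inclusive) to table style (0-based, half-open)."""
    return start - 1, end

def zero_based_to_one_based(start, end):
    """Table style (0-based, half-open) to Genome Browser style (1-based, inclusive)."""
    return start + 1, end

# Apoe: Genome Browser 50100879-50104490, Table Browser 50100878-50104490.
print(one_based_to_zero_based(50100879, 50104490))

# Python slices are zero-based and half-open too, so table coordinates
# can be used directly on a sequence string.
sequence = "CACCTCAGA"
start, end = one_based_to_zero_based(3, 5)   # "biology" bases 3 to 5
print(sequence[start:end])                    # the three bases C, C, T
print(end - start)                            # length needs no +1 in this system
```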
3 Questions

Table?
Output Format?
Filter Criteria?
Table
Output
Practice

hg18
snp130
chr6:1,000,000-1,001,000
Sequence Output
100 extra bases upstream and downstream
Filter Criteria
Positional

Non-Positional
Positional

chrom  chromStart  chromEnd  name        strand
chr17  7512594     7512595   rs1794293   -
chr17  7512594     7512596   rs34734132  +
chr17  7512595     7512596   rs1794292   -
chr17  7512715     7512716   rs55817367  +
chr17  7512765     7512766   rs35659787  -
chr17  7512796     7512797   rs17884586  -
chr17  7512825     7512826   rs17884306  -
chr17  7512854     7512854   rs35940853  +
chr17  7512977     7512978   rs34182553  +
Non-positional

ccds        srcDb  mrnaAcc             protAcc
CCDS10.1    H      ENST00000379268     ENSP00000368570
CCDS10.1    N      NM_004195.2         NP_004186.1
CCDS10.1    H      OTTHUMT00000004083  OTTHUMP00000001519
CCDS100.2   H      ENST00000377411     ENSP00000366628
CCDS100.2   N      NM_024980.4         NP_079256.4
CCDS100.2   H      OTTHUMT00000127658  OTTHUMP00000082681
CCDS1000.1  H      ENST00000368849     ENSP00000357842
CCDS1000.1  N      NM_020127.2         NP_064512.1
One-Based:
Coordinate
Landmark
Landmark;Landmark
Author

Zero-Based:
chrX 151073054 151173000
chrX 151183000 151190000
chrX 151283000 151290000

One-Based:
chrX:151,073,055-151,173,000
chrX:151,183,001-151,190,000
chrX:151,283,001-151,290,000
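The two region lists describe the same three regions in different notations. A sketch that accepts either notation and prints the one-based position form, assuming whitespace-separated lines are zero-based, half-open and "chrom:start-end" strings are one-based, inclusive; `region_to_position` is a hypothetical helper name.

```python
import re

def region_to_position(line):
    """Normalize a region line to a one-based 'chrom:start-end' string."""
    line = line.strip()
    m = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", line)
    if m:
        # Already one-based; only re-format the numbers.
        chrom = m.group(1)
        start = int(m.group(2).replace(",", ""))
        end = int(m.group(3).replace(",", ""))
    else:
        # Three whitespace-separated columns: zero-based start, exclusive end.
        chrom, start_text, end_text = line.split()[:3]
        start = int(start_text) + 1
        end = int(end_text)
    return f"{chrom}:{start:,}-{end:,}"

for row in ["chrX 151073054 151173000", "chrX:151,183,001-151,190,000"]:
    print(region_to_position(row))
```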
Practice Time

hg18
snp130
chr6:1,000,000-1,500,000
chr6:2,000,000-2,500,000

Item Count: 6,739


Practice
hg18
snp130
genome
rs100, rs200, rs300, rs400, rs500
Data Type

String e.g. Gene Name (TP53)

Number e.g. Chrom Start (100000)

Enumeration e.g. Strand (+/-)


String

value does / doesn't match criteria

TP53 = TP53

? = any single character

* = any character, any length
Number

value is [ignored | in range | < | <= | = | != | >= | >] criteria
in range

100,200 or 100 200 or 100, 200
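The string and number filters can be imitated outside the Table Browser. The sketch below uses Python's `fnmatch` module for the `*` and `?` wildcards and a small parser for the three accepted range spellings. Two assumptions are made here and are not confirmed by the slides: matching is case-sensitive, and range bounds are inclusive. Note also that `fnmatch` gives `[...]` a special meaning, which the Table Browser may not.

```python
import re
from fnmatch import fnmatchcase

def string_matches(value, pattern):
    """True if value matches a pattern with * (any length) and ? (one character)."""
    return fnmatchcase(value, pattern)

def parse_range(text):
    """Parse '100,200', '100 200' or '100, 200' into a (low, high) pair."""
    low, high = (float(part) for part in re.split(r"[,\s]+", text.strip()))
    return low, high

def in_range(value, text):
    """True if value lies within the range; inclusive bounds are an assumption."""
    low, high = parse_range(text)
    return low <= value <= high

print(string_matches("TP53", "TP5?"), string_matches("TP53", "TP*"))
print(in_range(150, "100, 200"), in_range(250, "100 200"))
```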


Enumeration

value does / doesn't match (choices)
Practice Time
Assembly: hg18
Group: Variation and Repeats
Track: Simple Repeats
Table: simpleRepeat
chrX:1,000,000-2,000,000
Copy number > 100

Item Count: 47
SNP

Gene
SNP

Intersect
Gene
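Intersecting a SNP table with a gene table comes down to an interval-overlap test on each pair of rows. A sketch under stated assumptions: rows are (chrom, start, end, name) with zero-based, half-open coordinates, and "intersect" means any overlap at all (the Table Browser also offers stricter overlap options). The SNP row is taken from the practice output in these slides; the gene row is a hypothetical placeholder, not real annotation.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if two zero-based, half-open intervals share at least one base."""
    return start_a < end_b and start_b < end_a

def intersect(features, regions):
    """Keep each feature that overlaps any region on the same chromosome."""
    return [
        feature
        for feature in features
        if any(
            feature[0] == region[0]
            and overlaps(feature[1], feature[2], region[1], region[2])
            for region in regions
        )
    ]

snps = [
    ("chr1", 1092425, 1092426, "rs72563729"),
    ("chr1", 1500000, 1500001, "hypothetical_snp"),      # placeholder row
]
genes = [("chr1", 1092346, 1092441, "hypothetical_miRNA")]  # placeholder row

print(intersect(snps, genes))
```

This nested loop is fine for a few thousand rows; for whole-genome tables a sorted sweep or an interval index would be the usual choice.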
Practice Time
hg18
chr1:1,000,000-2,000,000
Find SNPs (v.130) on sno/miRNA

chr1 1092425 1092426 rs72563729 0 +


Custom Track
http://tiny.cc/c9gb2

browser hide all


track visibility=full useScore=1 itemRgb=on
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200
GFF
GTF PSL
bedGraph MAF
BED
bigWig WIG BED15
bigBed BAM
Data

browser hide all


track visibility=full useScore=1 itemRgb=on
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200

chr start end

Zero-Based
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200

name
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200

score (0 ~ 1000)
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200

strand
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200

thickStart thickEnd
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200

color

Red,Green,Blue
0~255
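The nine columns walked through above can be read into named fields and checked against the ranges the slides give (score 0 to 1000, each color component 0 to 255). This is a sketch for the nine-column line used in this example only; the class and function names are made up, and real BED files may have fewer or more columns.

```python
from dataclasses import dataclass

@dataclass
class BedRecord:
    chrom: str
    chrom_start: int   # zero-based
    chrom_end: int     # exclusive
    name: str
    score: int         # 0 to 1000
    strand: str
    thick_start: int
    thick_end: int
    item_rgb: tuple    # (red, green, blue), each 0 to 255

def parse_bed9(line):
    """Parse one nine-column BED line and validate the documented ranges."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    rgb = tuple(int(part) for part in fields[8].split(","))
    record = BedRecord(fields[0], int(fields[1]), int(fields[2]), fields[3],
                       int(fields[4]), fields[5], int(fields[6]), int(fields[7]), rgb)
    if not 0 <= record.score <= 1000:
        raise ValueError("score must be between 0 and 1000")
    if len(rgb) != 3 or any(not 0 <= part <= 255 for part in rgb):
        raise ValueError("color must be three values between 0 and 255")
    if not record.chrom_start <= record.thick_start <= record.thick_end <= record.chrom_end:
        raise ValueError("thick region must lie inside the item")
    return record

record = parse_bed9("chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200")
print(record)
```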
Track

browser hide all


track visibility=full useScore=1 itemRgb=on
chr4 1000000 1005000 myData 200 + 1000100 1004900 100,150,200
Track

track visibility=full useScore=1 itemRgb=on

Track
Track

track visibility=full useScore=1 itemRgb=on

full
pack
squish
dense (Default)
hide
Track

track visibility=full useScore=1 itemRgb=on

Shading: 1
No Shading: 0 (Default)
Track

track visibility=full useScore=1 itemRgb=on

Color: on
No Color: off (Default)
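A track line is a list of key=value settings, and anything left out falls back to a default. The sketch below parses such a line and fills in the three defaults named on these slides (visibility dense, useScore 0, itemRgb off). It covers only those three attributes; `parse_track_line` is an illustrative name, and `shlex` is used so that quoted values containing spaces stay in one piece.

```python
import shlex

# Defaults as stated on the slides; other track attributes are not covered here.
TRACK_DEFAULTS = {"visibility": "dense", "useScore": "0", "itemRgb": "off"}

def parse_track_line(line):
    """Return the track settings as a dict, with slide defaults filled in."""
    tokens = shlex.split(line)
    if not tokens or tokens[0] != "track":
        raise ValueError("not a track line")
    settings = dict(TRACK_DEFAULTS)
    for token in tokens[1:]:
        key, _, value = token.partition("=")
        settings[key] = value
    return settings

print(parse_track_line("track visibility=full useScore=1 itemRgb=on"))
print(parse_track_line("track"))
```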
Track

browser hide all

track visibility=full useScore=1 itemRgb=on

chr4 1000000 1001000 myData 200 + 1000300 1000700 100,150,200


in silico PCR
min 15 bases
5’ 3’

Max Product Size


3’ 5’
5’ 3’

Min Perfect Match


(>= 15 bases)
3’ 5’
5’ 3’

Min Good Match


5’ 3’

Flip
Help
Mailing List
Subject: in-Silico PCR Min Perfect/Good Match confuses me
From: louis@ibms.sinica.edu.tw
To: genome@soe.ucsc.edu
Date: 04/22/2010 03:24:06 PM

Hi,

On the in-silico PCR input page there are two options: min perfect match
& min good match. There are some explanations on the same page:

But how can these two options be specified simultaneously? Will one
option be overridden by the other?

Louis
Subject: Re: [Genome] in-Silico PCR Min Perfect/Good Match confuses me
From: Galt Barber galt@soe.ucsc.edu
To: genome@lists.soe.ucsc.edu
Date: 04/23/2010 02:50:01 AM

Hi, Louis!

All conditions apply at once:

You must have valid forward and reverse primers matching


to give a result.

This does allow you to increase Min Perfect Match above 15 if you want.
It allows you to increase Min Good Match over Min Perfect Match if you
want. But the specificity near the 3' end of the primers is always at
least 15 bp perfectly matching.

-Galt
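Galt's point that all conditions apply at once can be sketched as a single check on one primer and its candidate binding site. This is not the In-Silico PCR source code. It assumes the definitions as I recall them from the tool's input page: Min Perfect Match is the number of bases at the 3' end that must match exactly, and Min Good Match is the number of 3' bases over which at least 2 out of 3 must match. Reading "2 out of 3" as an overall two-thirds ratio is a further simplifying assumption, and `primer_site_ok` is a hypothetical name.

```python
def primer_site_ok(primer, site, min_perfect=15, min_good=15):
    """Check a primer against an equally long candidate site, both written 5' to 3'.

    Both conditions must hold at once:
      1. the last `min_perfect` bases (3' end) match exactly;
      2. over the last `min_good` bases, at least two thirds match (assumed reading).
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    if len(primer) < min_perfect:
        return False
    primer, site = primer.upper(), site.upper()
    if primer[-min_perfect:] != site[-min_perfect:]:
        return False
    # Min Good Match can exceed Min Perfect Match but not the primer length.
    window = min(max(min_good, min_perfect), len(primer))
    matches = sum(a == b for a, b in zip(primer[-window:], site[-window:]))
    return 3 * matches >= 2 * window

tail = "ACGTACGTACGTACG"                 # 15 bases, identical in primer and site
primer = "A" * 15 + tail
site = "T" * 15 + tail                   # 5' half mismatched, 3' half perfect
print(primer_site_ok(primer, site, min_perfect=15, min_good=15))
print(primer_site_ok(primer, site, min_perfect=15, min_good=30))
```

Raising Min Good Match makes the test stricter toward the 5' end, while the 3' end always keeps its exact-match requirement, in line with the reply above.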
"It's been a wonderful stone soup, where
other people have contributed bits,"

- James Kent
http://tiny.cc/d56k6
