
The Scientist :: The Dark Side of the Genome, May 19, 2003


Volume 17 | Issue 10 | 31 | May 19, 2003


The Dark Side of the Genome


Researchers shine their lights on noncoding sequence | By Brendan A. Maher


Erica P. Johnson


The dark side of the moon is a misnomer. Light reaches la luna's entire surface, but one half is unviewable from Earth. The human genome, the now essentially decoded 1 map of life, likewise has a light side--the genes encoding mRNA and protein--and a dark side, which is coming into view for the first time.

The dark side encompasses more than its opposite: The majority of the genome comprises intronic regions, stretches of repeat sequence, and other assorted gibberish that has attained the ignoble dubbing, "junk." The first exploratory missions to the human genome's faceted surface are turning up traces regarding the extent of the junk. At a recent National Human Genome Research Institute (NHGRI) conference, numerous presenters invoked Sydney Brenner's classic distinction: "Garbage you throw away and junk you keep, because you think you might want to do something useful with it, and of course you never do."2

Comparative, computational, and experimental studies can shine light on these unexplored DNA elements. Some are known regulatory stretches; others encode RNAs but offer scarce hints at their function. Eric Green, chief of NHGRI's genome sequencing branch, says, "I think the challenge is going to be in the



nongenic, functional portion of the genome."

THREE BIG GIGS IN THE SKY

Comparative analysis of mouse and human3 demonstrates that in each of these 3-gigabase mammalian genomes, roughly 5% of the sequence is under selective pressure, and therefore, one would assume, of some use. Roughly 1.5% of the genome encodes protein; the remaining 3.5% appears to have biological functions other than protein coding. Those numbers are based on presumptions, cautions Green, but they do help in understanding the problem's scale.

His sequencing center carved a niche for itself in postsequence genomics by doing comparative studies. At the April conference, Green discussed data on a 1.8-megabase region surrounding the highly studied cystic fibrosis gene, CFTR. Using the visualization tool PIP (percentage identity plot), he showed conserved sequence in DNA across human, chimp, mouse, chicken, puffer fish, and other vertebrates. Exonic regions had close matches, of course, but conservation in noncoding introns was also high. Among the species surveyed, 4% of the sequence was conserved. Of that, 28% coded protein, 5% was untranslated regulatory sequence, and roughly 68% had no known function.

Janet Rowley, professor of medicine at the University of Chicago, presented data from SAGE (serial analysis of gene expression) experiments on acute myeloid leukemia cells, which colleague San Ming Wang is conducting. The process pulls about 100,000 polyadenylated RNA fragments from cells; about 20% to 30% do not correspond to known genes. She offers possible explanations, including novel genes, transcripts derived from alternative splicing, antisense RNA, and intergenic transcripts. In the past, she says, many have attributed such findings to PCR artifacts. "The experimental work that we've done in the laboratory ... indicates that these low-level transcripts are really valid," she says.
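For a sense of scale, the percentages Green cites can be turned into approximate base-pair counts. The short Python sketch below is only illustrative: it assumes the "3-gigabase" genome is exactly 3,000,000,000 base pairs, assumes the 4% conserved figure applies to the full 1.8-megabase CFTR region, and uses a hypothetical helper named share_bp that is not from the article. The three CFTR categories sum to 101% because the published figures are rounded.

```python
from fractions import Fraction


def share_bp(total_bp: int, percent: str) -> int:
    """Return the base-pair count that `percent` percent of `total_bp` represents.

    Hypothetical helper for illustration; Fraction avoids floating-point drift.
    """
    return round(total_bp * Fraction(percent) / 100)


# Assumption: "3-gigabase" is treated as exactly 3 billion base pairs.
GENOME_BP = 3_000_000_000
genome = {
    "under selective pressure (5%)": share_bp(GENOME_BP, "5"),
    "protein coding (1.5%)": share_bp(GENOME_BP, "1.5"),
    "other function (3.5%)": share_bp(GENOME_BP, "3.5"),
}

# Assumption: the 4% conserved fraction is taken over the whole 1.8-megabase region.
REGION_BP = 1_800_000
conserved_bp = share_bp(REGION_BP, "4")
region = {
    "protein coding (28%)": share_bp(conserved_bp, "28"),
    "untranslated regulatory (5%)": share_bp(conserved_bp, "5"),
    "no known function (68%)": share_bp(conserved_bp, "68"),
}

for label, count in genome.items():
    print(f"genome, {label}: {count:,} bp")
print(f"CFTR region, conserved (4%): {conserved_bp:,} bp")
for label, count in region.items():
    print(f"CFTR conserved, {label}: {count:,} bp")
```

On these assumptions, the nonprotein-coding portion under selection comes to about 105 million base pairs genome-wide, and roughly 49,000 conserved base pairs around CFTR have no known function.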
Referring to the genome's nonprotein-coding elements as "our own dark matter," Rowley asks: "Is there a whole world within the nucleus about which we're fairly ignorant?"

To develop tools to explore that world, the NHGRI instituted ENCODE (encyclopedia of DNA elements).4 The $12 million (US) pilot program will investigate a preselected 1% of the human genome sequence. Half of the selected sequence comprises 30 short segments chosen at random but meeting certain criteria for gene density and for nonexonic conservation with the mouse genome. Fourteen targets in the other half were chosen for depth


of previous study and the existence of substantial comparative sequence data.5 The ultimate goal is to explain what every functional element in that 1% is doing, and then to see what tools best brought about the answer. Green, who chaired a subcommittee involved in selecting target sequences, says, "By everybody focusing their attention on the same 1%, you get a very rich dataset of multiple techniques [and] multiple ideas all being applied to the same portion of the genome."

In 2001, major media attention heralded the human genome's first draft; its publication soon after attracted similar interest. But the newest achievement is the brass ring. Laments National Academy of Sciences president Bruce Alberts: "We've been crying wolf so many times, I don't think the public understands how much this means."

Brendan A. Maher can be contacted at bmaher@thescientist.com.
References

1. T. Powledge, "Human genome project complete," The Scientist, available online at www.biomedcentral.com/news/20030415/03
2. "Decoding the Book of Life," Nova, WGBH Educational Foundation, Oct. 31, 1989.
3. T. Asif et al., "Initial sequencing and comparative analysis of the mouse genome," Nature, 420:520-62, 2002.
4. L. Pray, "Post-genome project launches," The Scientist, available online at www.biomedcentral.com/news/20030305/02
5. See www.genome.gov/10506163


2003, The Scientist Inc.


http://www.the-scientist.com/yr2003/may/research3_030519.html

5/18/2003