INTRODUCTION
The human genome comprises a sequence of approximately 3 billion
component parts, called nucleotides, which are organized into DNA
molecules, the double helix. The nucleotides, which serve as the
alphabet for the language of life, are represented by just four
letters: A, C, G, and T, corresponding to adenine, cytosine, guanine,
and thymine. The nucleotide alphabet codes for the sequence of
amino acids the body will use to build proteins.
Combinations of three nucleotides indicate one of twenty
possible amino acids (for example, GGT codes for the amino acid
glycine), so sets of nucleotide triplets form the instructions that cells
use to build proteins. These proteins perform the work of the cells
from development throughout life, contributing to both our physical
attributes and many of our less tangible features, such as behavior,
learning, and predisposition to disease. A segment of a DNA
molecule that codes for one complete protein is called a gene. The
human genome is carried on 23 different chromosomes, or DNA
molecules.
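The triplet code described above can be sketched in a few lines of Python. The table is a deliberately small subset of the standard genetic code, written with coding-strand DNA letters; the function name `translate` and the example sequence are illustrative assumptions, not tools used by the project.

```python
# A partial codon table (coding-strand DNA triplets -> amino acids).
# Only a handful of the 64 codons are listed; this is an illustrative subset.
CODON_TABLE = {
    "ATG": "Met",   # methionine, also the usual start codon
    "GGT": "Gly",   # glycine
    "GCA": "Ala",   # alanine
    "AAA": "Lys",   # lysine
    "TGG": "Trp",   # tryptophan
    "TAA": "Stop",  # one of the three stop codons
}


def translate(dna):
    """Read a DNA string three letters at a time and look up each triplet."""
    amino_acids = []
    usable_length = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = dna[i:i + 3]
        residue = CODON_TABLE.get(codon, "???")  # codon missing from this subset
        if residue == "Stop":
            break
        amino_acids.append(residue)
    return amino_acids


print(translate("ATGGGTGCAAAATAA"))  # ['Met', 'Gly', 'Ala', 'Lys']
```

A complete table would cover all 64 triplets, several of which map to the same amino acid.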
Genomes of other species contain more or fewer nucleotides
and chromosomes but follow the same basic organizational scheme
as the human genome.
To study the human genome in detail, a large-scale project
called the Human Genome Project was undertaken.
The Human Genome Project was an international scientific effort to
map all of the genes on the 23 pairs of human chromosomes and to
sequence the 3.1 billion DNA base pairs that make up the
chromosomes. Begun in 1990 with the goal of enabling scientists to
understand the basis of genetic diseases and to gain insight into
human evolution, the project was largely completed in 2000 when
85% of the human genome was decoded, and ended in 2003 with
99% decoded; detailed analyses of all the pairs were published by
2006. In the process, scientists identified genes for cystic fibrosis,
neurofibromatosis, Huntington's disease, and an inherited form of
breast cancer. In addition, the project decoded the genomes of the
bacterium E. coli, a fruit fly, and a nematode worm, in order to study
genetic similarities among species, and the genome of a mouse was
also decoded.
The Human Genome Project involved laboratories in the
United States, France, Great Britain, Germany, and Japan. It was
financed in the United States by the National Institutes of Health
and by the Department of Energy and in Great Britain by the
Wellcome Trust of London. A comparable project using new
DNA-sequencing machines was begun as a private industry venture in
the United States in 1998, with a stated goal of completing the
mapping of the genome in three years. Early in 2001 scientists from
both teams jointly announced the completion of the mapping of the
human genome, indicating that they had identified an estimated
30,000 genes instead of the expected 100,000, constituting just 1%
of the total human DNA. Subsequent comparison of the two teams'
data has indicated that, because of differences in the genes
identified by the teams, there may in fact be as many as 40,000
human genes. A subsequent, more refined estimate based on
additional work on the genome was that there are between 20,000
and 25,000 genes.
Further work continues on refining the sequencing of genes on
chromosomes, eliminating the remaining gaps in the genome map,
and identifying the extent of variation in the human genome. In
2007 the first sequences of human individuals were released.
Craig Venter's genome was the first individual full diploid human genome.
The NIH's National Center for Biotechnology Information maintains
GenBank, a database of publicly available genetic sequences from
the genomes of plants and animals, including some extinct species.

HISTORY
The Human Genome Project traces its roots to an initiative in
the U.S. Department of Energy. Since 1945, the Department of Energy
and its predecessor agencies have been charged by Congress with
developing new energy resources and technologies and with
pursuing a deeper understanding of potential health and
environmental risks posed by their production and use. Such studies
have since provided the scientific basis for individual risk
assessments of nuclear medicine technologies, for example.


In 1986, DOE took a bold step in announcing its Human
Genome Initiative, convinced that DOE's missions would be well
served by a reference human genome sequence. Shortly thereafter,
DOE and the National Institutes of Health developed a plan for a joint
HGP that officially began in 1990.

SIGNIFICANT FEATURES
Some of the significant observations drawn from the Human Genome
Project are as follows:

The human genome contains 3,164.7 million nucleotide bases.

An average gene consists of 3,000 bases, but sizes vary greatly,
with the largest known human gene being dystrophin at 2.4
million bases.

The total number of genes is estimated at 30,000, much lower
than previous estimates of 80,000 to 140,000 genes.

Almost all (99.9 per cent) nucleotide bases are exactly the same in all
people.

The functions are unknown for over 50 per cent of the
discovered genes.

Less than 2 per cent of the genome codes for proteins.

Repeated sequences make up a very large portion of the human
genome.

Repetitive sequences are stretches of DNA that are
repeated many times, sometimes hundreds to thousands of times.
They are thought to have no direct coding functions, but they
shed light on chromosome structure, dynamics and evolution.

Chromosome 1 has the most genes (2,968), and the Y chromosome has the
fewest (231).

Scientists have identified about 1.4 million locations where
single-base DNA differences (SNPs, single nucleotide
polymorphisms) occur in humans. This information promises to
revolutionise the processes of finding chromosomal locations
for disease-associated sequences and tracing human history.
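To give a feel for the scale of these figures, the short calculation below turns the percentages in the list into absolute numbers of bases. It is only back-of-the-envelope arithmetic on the values quoted above; the variable names are illustrative.

```python
# Figures quoted in the list above.
GENOME_BASES = 3164.7e6      # total nucleotide bases
SHARED_FRACTION = 0.999      # fraction identical in all people
CODING_FRACTION = 0.02       # upper bound: "less than 2 per cent"
AVERAGE_GENE_BASES = 3000
DYSTROPHIN_BASES = 2.4e6

# Bases that may differ from one person to another (the remaining 0.1 per cent).
variable_bases = GENOME_BASES * (1 - SHARED_FRACTION)

# Upper bound on the number of bases that code for proteins.
coding_bases_upper = GENOME_BASES * CODING_FRACTION

# How much longer dystrophin is than an average gene.
dystrophin_ratio = DYSTROPHIN_BASES / AVERAGE_GENE_BASES

print(f"Bases that may differ between people: about {variable_bases / 1e6:.1f} million")
print(f"Protein-coding bases: under {coding_bases_upper / 1e6:.1f} million")
print(f"Dystrophin is about {dystrophin_ratio:.0f} times the average gene length")
```

Even a 0.1 per cent difference therefore corresponds to roughly three million positions.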

GOALS
The mega project had several important goals which are as
follows:

Identify all the approximately 20,000 to 25,000 genes in human DNA.

Determine the sequences of the 3 billion chemical base pairs
that make up human DNA.

Store this information in databases.

Improve tools for data analysis.

Transfer related technologies to other sectors, such as industries.

Address the ethical, legal and social issues (ELSI) that may
arise from the project.

METHODOLOGIES
The methods involved two major approaches:

One approach focused on identifying all the genes that are
expressed as RNA, referred to as Expressed Sequence Tags (ESTs).

The other approach was the blind approach of simply sequencing
the whole genome, containing all the coding and
non-coding sequence, and later assigning functions to different
regions of the sequence, referred to as Sequence Annotation.

SEQUENCING OF A GENOME
Sequencing means determining the exact order of the base
pairs in a segment of DNA. Human chromosomes range in size from
about 50,000,000 to 300,000,000 base pairs. Because the bases
exist as pairs, and the identity of one of the bases in the pair
determines the other member of the pair, only one base of each
pair needs to be reported.
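The pairing rule can be shown with a small sketch: given one strand, the partner strand follows directly from the standard pairing of A with T and C with G. The function names are illustrative.

```python
# Watson-Crick pairing: A pairs with T, and C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base that pairs with each position of `strand`."""
    return "".join(PAIRING[base] for base in strand)


def reverse_complement(strand):
    """Return the partner strand written in its own reading direction."""
    return complement(strand)[::-1]


print(complement("ATGCCG"))          # TACGGC
print(reverse_complement("ATGCCG"))  # CGGCAT
```

Because the second strand is fully determined by the first, sequence databases store a single strand.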

Steps involved in the sequencing of a genome:

Total DNA is isolated from a cell and broken into random
fragments of relatively small size.

The DNA fragments are cloned using cloning
vectors such as BACs (bacterial artificial chromosomes) and YACs
(yeast artificial chromosomes).

The fragments are sequenced using automated DNA
sequencers that work on the principle of a method
developed by Frederick Sanger.

The resulting sequences are then arranged in order, based on
the overlapping regions present in them.
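The final step, arranging fragments by their overlaps, can be sketched as a greedy merge. This is a toy illustration under simplifying assumptions (error-free reads, a single strand, no repeats) and is not the assembly software actually used by the project; the function names and the example reads are hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments, min_overlap=3):
    """Repeatedly merge the two fragments with the largest overlap."""
    pieces = list(fragments)
    while len(pieces) > 1:
        best = (0, None, None)  # (overlap size, index of left, index of right)
        for i, left in enumerate(pieces):
            for j, right in enumerate(pieces):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # no overlaps left; remaining pieces stay separate
        merged = pieces[i] + pieces[j][size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (i, j)] + [merged]
    return pieces


reads = ["GATTACAGG", "CAGGTTCA", "TTCAGCTA"]
print(assemble(reads))  # ['GATTACAGGTTCAGCTA']
```

Real genomes contain long repeated sequences, which is one reason assembly in practice relies on mapped clones rather than overlaps alone.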

WHOSE DNA WAS SEQUENCED FOR THE HUMAN GENOME PROJECT?
This is intentionally not known to protect the volunteers who
provided DNA samples for sequencing. The sequence is derived
from the DNA of several volunteers. To ensure that the identities of
the volunteers cannot be revealed, a careful process was developed
to recruit the volunteers and to collect and maintain the blood
samples that were the source of the DNA.
The volunteers responded to local public advertisements near
the laboratories where the DNA libraries were prepared. Candidates
were recruited from a diverse population. The volunteers provided
blood samples after being extensively counselled and then giving
their informed consent. About 5 to 10 times as many volunteers
donated blood as were eventually used, so that not even the
volunteers would know whether their sample was used. All labels
were removed before the actual samples were chosen.

SINGLE NUCLEOTIDE POLYMORPHISM
Slight variations in our DNA sequences can have a major
impact on whether or not we develop a disease and on our
particular responses to such environmental insults as bacteria,
viruses, and toxins. They also impact our reactions to drugs and
other therapies. One of the most common types of sequence
variation is the single nucleotide polymorphism (SNP). SNPs are sites
in the human genome where individuals differ in their DNA
sequence, often by a single base. For example, one person might
have the base A (adenine) where another might have C (cytosine),
and so on. Researchers in public and private sectors are generating
maps of these sites, which can occur in genes as well as in
non-coding regions. Scientists believe such SNP maps will help them
identify the multiple genes associated with such complex diseases
as cancer, diabetes, vascular disease, and some forms of mental
illness. SNP maps provide valuable targets for biomedical and
pharmaceutical research.
The human genome has at least 10 million SNPs. Most of
these SNPs contribute to human variation; some of them may
influence the development of diseases and susceptibility to
certain drugs, toxins and infectious agents.
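A minimal sketch of how single-base differences can be located is shown below. It assumes the two sequences are already aligned and of equal length, which real SNP discovery cannot take for granted; the function name and the example sequences are illustrative.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_a, base_b) for each single-base difference.

    Assumes both sequences are already aligned and have the same length.
    Positions are counted from zero.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


person_1 = "ATGGCATTCGA"
person_2 = "ATGGCCTTCGA"
print(find_snps(person_1, person_2))  # [(5, 'A', 'C')]
```

Here one person carries A and the other C at the same site, matching the example given in the text.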

TECHNICAL ASPECTS
The process of determining the human genome involves first
mapping, or characterizing, the chromosomes, and then sequencing,
or determining the order of DNA bases on each chromosome. Mapping
produces both genetic maps and physical maps.
Mapping Strategies
To sequence the human genome, maps are needed. Physical
maps are a series of overlapping pieces of DNA isolated in bacteria.
Physical maps are used to describe the DNA's chemical
characteristics. Mapping involves dividing the chromosomes into
fragments that can be propagated and characterized, and then
ordering them to correspond to their respective chromosomal
locations.
Genetic markers are invaluable for genome mapping. Markers
are any inherited physical or molecular characteristics that are
different among individuals of a population. One example of a marker
is the restriction fragment length polymorphism (RFLP). RFLPs
reflect sequence differences in DNA sites that can be cleaved by
restriction enzymes. To be useful in mapping, markers must be
polymorphic, that is, have more than one form among individuals, so
that they are detectable in studies.
Another marker is the variable number tandem repeat
(VNTR), a short section of DNA repeated in tandem. VNTRs are
prevalent in human DNA, and the number of repeats varies widely.
This variability gives individuals distinctive VNTR regions, which is
the basis for matching blood samples to individuals in criminal cases. A
genetic map shows the relative locations of these specific markers
on chromosomes.
Restriction enzymes are used to detect RFLP markers. These enzymes
recognize short sequences of DNA and cut them at specific sites.
Since scientists have characterized hundreds of different restriction
enzymes, DNA can be cut into many different fragments. These
fragments are the DNA pieces used in physical maps.
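The idea behind an RFLP can be sketched by cutting two versions of a sequence with the same enzyme and comparing fragment lengths. The defaults below model EcoRI, which recognizes GAATTC and cuts after the first G; the two allele sequences are made up for illustration, and only one strand is considered.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut `dna` at every occurrence of `site` and return the fragment lengths.

    `cut_offset` is the number of bases into the site at which the cut falls.
    """
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return [len(fragment) for fragment in fragments]


# Hypothetical alleles: the second has a single-base change (A -> G)
# that destroys the second recognition site.
allele_1 = "AAAGAATTCTTTTTGAATTCCC"
allele_2 = "AAAGAATTCTTTTTGAGTTCCC"

print(digest(allele_1))  # [4, 11, 7]
print(digest(allele_2))  # [4, 18]
```

The differing fragment lengths are what make the variation visible once the fragments are separated by size.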
Different types of physical maps exist. Low-resolution physical
maps include chromosomal or cytogenetic maps that are based on
distinctive banding patterns of stained chromosomes. High-resolution
physical maps represent sets of DNA fragments that were
cut by restriction enzymes and placed in order.
Sequencing Strategies
To sequence DNA, it must first be amplified, or increased in
quantity. Two methods of DNA amplification are:

Cloning
Polymerase chain reaction (PCR)

Cloning involves the propagation of DNA fragments in a foreign
host, a technique known as recombinant DNA technology.
DNA fragments isolated using restriction enzymes are joined to a
vector and then reproduced along with the host cell's DNA.
Vectors normally used are derived from viruses, bacteria,
and yeast cells. Cloning provides an unlimited amount of DNA for
experimental study.
With PCR, DNA can be amplified hundreds of millions of times in a
matter of hours, a task that would have taken days
with recombinant DNA technology. PCR is valuable because the
reaction is highly specific, easily automated,
and capable of amplifying very
small amounts of DNA. For these
reasons, PCR has had major impacts on clinical medicine, genetic
disease diagnosis, forensic science, and evolutionary
biology.
PCR is a process through which a specialized polymerase enzyme
synthesizes a complementary strand of DNA to a given strand of DNA in
a mixture of DNA bases and short DNA fragments (primers). The mixture
is heated, separating the two strands in a double-stranded DNA molecule.
The mixture is then cooled so that the primers find and bind to their
complementary sequences on the now separated strands, and the
polymerase enzyme extends them into new complementary strands. The
result is two double helices from one. Repeated
heating and cooling cycles in PCR machines amplify the target DNA
exponentially. In less than 90 minutes, PCR cycles can amplify DNA
by a million fold.
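The exponential growth can be checked with a short calculation. The sketch assumes ideal doubling in every cycle by default; the `efficiency` parameter is an illustrative way to model less-than-perfect cycles and is not a figure from the text.

```python
import math


def copies_after(cycles, starting_copies=1, efficiency=1.0):
    """Copies present after `cycles` rounds; efficiency=1.0 means perfect doubling."""
    return starting_copies * (1 + efficiency) ** cycles


def cycles_needed(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least `fold` amplification."""
    return math.ceil(math.log(fold) / math.log(1 + efficiency))


print(copies_after(20))          # 1048576.0
print(cycles_needed(1_000_000))  # 20
```

With perfect doubling, about 20 cycles give a million-fold amplification, so a 90-minute run leaves roughly four and a half minutes per cycle.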
Now that the DNA has been amplified, sequencing can begin.
Two basic approaches are:

Maxam-Gilbert sequencing
Sanger sequencing

Both methods are successful because gel electrophoresis can
produce high-resolution separations of DNA molecules.
Electrophoresis separates DNA fragments according to size by
passing an electric current through a gel; the DNA is stained so
that the fragments can be seen. Even fragments that differ in length
by a single nucleotide can be separated. Almost all of the
steps in both of these sequencing methods are now automated.
Maxam-Gilbert sequencing, also called the chemical degradation
method, cleaves DNA at specific bases using chemicals. The result is
a set of fragments of different lengths. A refinement to this method known as
multiplex sequencing enables scientists to analyze approximately 40
clones on a single DNA sequencing gel.
Sanger sequencing, also called the chain termination or dideoxy
method, uses enzymes to synthesize DNA of varying length in four
different reactions, stopping the replication at positions occupied by
one of the four bases, and then determining the resulting fragment
lengths.
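The logic of reading a sequence from the four chain-termination reactions can be sketched as follows. This is a simplified model, not laboratory software: it assumes each reaction yields exactly one fragment for every position holding its base, and that fragments can be ordered perfectly by length, as an ideal gel would. The function names are illustrative.

```python
BASES = "ACGT"


def chain_termination_fragments(new_strand):
    """For each of the four reactions, list the lengths at which synthesis stops.

    The reaction for a given base yields one fragment ending at every
    position where that base occurs in the newly synthesized strand.
    """
    return {
        base: [i + 1 for i, b in enumerate(new_strand) if b == base]
        for base in BASES
    }


def read_sequence(fragments_by_base):
    """Order all fragments by length and read off the base that ends each one."""
    tagged = [
        (length, base)
        for base, lengths in fragments_by_base.items()
        for length in lengths
    ]
    return "".join(base for _, base in sorted(tagged))


reactions = chain_termination_fragments("GATTACA")
print(reactions)                 # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_sequence(reactions))  # GATTACA
```

Sorting by length plays the role of the gel: the shortest fragment identifies the first base, the next shortest the second, and so on.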
A major goal of the HGP is to develop automated sequencing
technology that can accurately sequence more than 100,000 bases
per day. Specific focuses include developing sequencing and
detection schemes that are faster, more sensitive, accurate, and
economical.

WHY IS GENOME SEQUENCING IMPORTANT?
Genome sequencing is important for the following reasons:

To obtain a blueprint: DNA directs all the instructions needed for
cell development and function.

DNA underlies almost every aspect of human health, both in
function and dysfunction.

To study gene expression in a specific tissue, organ or tumor.

To study human variation.

To study how humans relate to other organisms.

To find correlations between genome information and the development
of cancer, susceptibility to certain diseases, and drug metabolism
(pharmacogenomics).

APPLICATIONS
Scientists estimate that chromosomes in the human
population differ at about 0.1% of positions. Understanding these differences
could lead to discovery of heritable diseases, as well as diseases
and other traits that are common to man. Information gained from
the HGP has already fuelled many positive discoveries in health
care. Well-publicized successes include the cloning of genes
responsible for Duchenne muscular dystrophy, retinoblastoma,
cystic fibrosis, and neurofibromatosis. Increasingly detailed genomic
maps have also aided researchers seeking genes associated with
fragile X syndrome, types of inherited colon cancer, Alzheimer's
disease, and familial breast cancer.
If other disease-related genes are isolated, scientists can
begin to understand the structure and pathology of other disorders
such as heart disease, cancer, and diabetes. This knowledge would
lead to better medical management of these diseases and
pharmaceutical discovery.
Current and potential applications of genome research will
address national needs in molecular medicine, waste control and
environmental cleanup, biotechnology, energy sources, and risk
assessment.

Molecular Medicine
Through genetic research, medicine will look more into
the fundamental causes of diseases rather than concentrating
on treating symptoms. Genetic screening will enable rapid and
specific diagnostic tests, making it possible to treat countless
maladies. DNA-based tests clarify diagnosis quickly and
enable geneticists to detect carriers within families. Genomic
information can indicate the future likelihood of some
diseases. As an example, if the gene responsible for
Huntington's disease is present, it may be certain that
symptoms will eventually occur, although predicting the exact
time may not be possible. Other diseases where susceptibility
may be determined include heart disease, cancer, and
diabetes.
Medical researchers will be able to create therapeutic products
based on new classes of drugs, immunotherapy techniques,
and possible augmentation or replacement of defective genes
through gene therapy.


Waste Control and Environmental Cleanup
Through advances gained by the HGP, the DOE
formulated the Microbial Genome Initiative to sequence the
genomes of bacteria useful in the areas of energy production,
environmental remediation, toxic waste reduction, and
industrial processing. Resulting from that project, six microbes
that live under extreme temperature and pressure conditions
have been sequenced. By learning the unique protein
structure of these microbes, researchers may be able to use
the organisms and their enzymes for such practical purposes
as waste control and environmental cleanup.


Biotechnology
The potential for commercial development presents U.S.
industry with a wealth of opportunities. Sales of biotechnology
products were projected to exceed $20 billion by the year 2000.
The HGP has stimulated significant investment by large
corporations and promoted the development of new biotechnology
companies hoping to capitalize on the implications of HGP research.


Energy Sources
Biotechnology, strengthened by the HGP, will be
important in improving the use of fossil-based resources.

Increased energy demands require strategies to circumvent
the many problems with today's dominant energy
technologies. Biotechnology will help address these needs by
providing a cleaner means for the bioconversion of raw
materials to refined products. Additionally, there is the
possibility of developing entirely new biomass-based energy
sources. Having the genomic sequence of the methane-producing
microorganism Methanococcus jannaschii, for
example, will allow researchers to explore the process of
methanogenesis in more detail and could lead to cheaper
production of fuel-grade methane.

Risk Assessment
Understanding the human genome will have an
enormous impact on the ability to assess risks posed to
individuals by environmental exposure to toxic agents.
Scientists know that genetic differences cause some people to
be more susceptible than others to such agents. More work
must be done to determine the genetic basis of such
variability, but this knowledge will directly address the
Department of Energy's long-term mission to understand the
effects of low-level exposures to radiation and other
energy-related agents, especially in terms of cancer risk. Additional
positive spin-offs from this research include better
understanding of biology, increased taxonomic understanding,
increased development of pest-resistant and productive crops
and livestock, and other commercially useful microorganisms.

ETHICAL, LEGAL, AND SOCIAL IMPLICATIONS ADDRESSED BY THE HUMAN GENOME PROJECT
The Ethical, Legal, and Social Implications (ELSI) program was
founded in 1990 as an integral part of the Human Genome Project.
The mission of the ELSI program was to identify and address issues
raised by genomic research that would affect individuals, families,
and society. A percentage of the Human Genome Project budget at
the National Institutes of Health and the U.S. Department of Energy
was devoted to ELSI research.
The ELSI program focused on the possible consequences of
genomic research in four main areas:

Privacy and fairness in the use of genetic information,
including the potential for genetic discrimination in
employment and insurance.

The integration of new genetic technologies, such as genetic
testing, into the practice of clinical medicine.

Ethical issues surrounding the design and conduct of genetic
research with people, including the process of informed
consent.

The education of healthcare professionals, policy makers,
students, and the public about genetics and the complex
issues that result from genomic research.

CONCLUSION
Medical researchers did not wait to use data from the Human
Genome Project. When the project began in 1990, fewer than 100
human disease genes had been identified. At the project's
conclusion in 2003, the number of identified disease genes had
risen to more than 1,400.
The Human Genome Project focused on establishing a reference DNA
sequence for the human genome. Advances in this research open up new
scope in the field of medicine. The project established sequencing
as a practical technique, which may lead to treatments for
many genetic disorders caused by abnormal coding.
As this research continues, many new possibilities open up in this
field. If well planned and implemented, work that builds on the
data obtained from the Human Genome Project remains very promising.

