Sequence (verb): the act of determining the order of the individual units of a macromolecule.
o Ex: The DNA segment was sequenced.
construct made of many small overlapping
“DNA sequencing”. When sequencing large quantities of DNA, it must be fragmented into smaller, more manageable segments.
a time, eventually creating one contiguous
Figure 1. A breakdown of the structure of DNA, showing a molecule of DNA in its natural
helical form (left), a linearized fragment of DNA showing the complementary nucleotide
bases as colored vertical lines (middle), and the specific sequence of this fragment
showing the bases as their letter code (right).
Figure 2. Two representations of the process through which a contig is created. (A) An
example sentence is broken up into fragments, showing how the fragments can be
realigned based on the letters they share (Taylor, 2018). (B) A visualization in the
format more typical of the biological sciences, in which bars represent pieces of DNA
and the vertical lines connecting the bars represent the individual overlapping bases.
In this example, components 1 through 3 are combined to make one contig
(Genome Reference Consortium).
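The overlap idea illustrated in Figure 2 can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by any real assembler: the function names, the example reads, and the minimum overlap of three bases are all assumptions made for the example, and it ignores sequencing errors, reverse complements, and fragments that sit entirely inside another fragment.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that is also a prefix of `right`.

    Overlaps shorter than `min_overlap` (an assumed threshold) are treated as no overlap.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble_contigs(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Greedily merge the two fragments with the largest overlap until no overlaps remain.

    Returns the remaining sequences; a single item means everything joined into one contig.
    """
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to join
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up fragments, analogous to components 1 through 3 in Figure 2B.
reads = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]
print(assemble_contigs(reads))  # ['ATGCGTACCTGATTAG']
```

Each fragment shares four bases with its neighbor ("GTAC" and "CTGA"), so the three fragments collapse into one contiguous sequence, just as the sentence fragments in Figure 2A are realigned by their shared letters.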
History
The term “contig” was first used in a publication by Robert Staden in 1980. His team devised a system for a more
order to more easily refer to the product of overlapping many fragments into one consensus sequence that is used in later
related to one another by overlap of their contigs can be linked together by their
and only one contig, and each contig larger segment, eventually creating one
[readings] in a contig can be summed to represents the original sample. In this way,
and the length of this sequence is the length of the contig” (Staden, 1980).
Though the wording may change, the definition of this term has not
the 1990s, this was the method used to sequence the human genome. Scientists from all over the world were brought together, making unparalleled
is sometimes used interchangeably with reference available, great leaps have been
like shotgun sequencing, the DNA is broken up into many fragments and sequenced individually, then pieced back together into contigs. In the same way in
sequencing” (NGS). With NGS, large segments of DNA can be sequenced much faster and at a much lower price. RNA sequencing is a type of NGS that uses
contigs to build transcriptomes. RNA is collaborative project between the U.S.
many reads. These reads are then DNA DataBank of Japan (DDBJ), and
can be mapped back to it to determine the collection of genomic data from research
available, there are programs that create a transcriptome from scratch by linking the
submissions from researchers, including a type called “Whole Genome Shotgun
Gene identities can then be inferred based genomes constructed from annotated, and
(Manfred et al., 2011). Eventually all reads given a four-letter identifier followed by a
collection of all the genes that are turned individual contig is given an additional
computer data make their data publicly
ABCD0100005, while a contig from an
progress. It saves researchers time and to fewer contigs constructed, thus, less
already gathered data, rather than a genome (Roberts, Carneiro, and Schatz,
to become faster and more extensive than genome without having to break it into
Pacific Biosciences (PacBio) Single until this day, constructing contigs will
In this extended definition, I used three graphics: a mini-glossary for terms that the
general audience may not fully understand, and two diagrams. For other words that may
appropriate. These help the reader understand the concepts described. I also used
everyday language and examples that everyone has heard of (like the Human Genome Project).
I used headings to break the definition up into separate topics, as well as bolding and
italics to emphasize different features. To some degree, I used etymology, in that I wrote
Definitions - Genome Reference Consortium. (n.d.). Retrieved February 20, 2018, from
https://www.ncbi.nlm.nih.gov/grc/help/definitions
King, R. C., Mulligan, P. K., & Stansfield, W. D. (2014). A dictionary of genetics (7th
Loman, N. J., Quick, J., & Simpson, J. T. (2015). A complete bacterial genome
doi:10.1101/015552
Manfred, G. G., Brian, J. H., Moran, Y., Joshua, Z. L., Dawn, A. T., Ido, A., Adiconis, X.,
Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke,
A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K.,
doi:10.1038/nbt.1883
Roberts, R., Carneiro, M., & Schatz, M. (2017). The advantages of SMRT sequencing
Staden, R. (1980). A new computer method for the storage and manipulation of DNA
http://gcat.davidson.edu/phast/