
ECE/BME/MIIT

EXPERIMENT 11

NGEE ANN POLYTECHNIC


SCHOOL OF ENGINEERING
DIPLOMA IN BIOMEDICAL ENGINEERING

Medical Imaging, Informatics and Telemedicine

Experiment 11_DNA: Biomedical Computation Lab (2): Multiple Sequence Alignment

Theory:
Multiple sequence alignment (MSA) aligns three or more sequences. It can be viewed as a simple extension of pair-wise alignment to multiple sequences.

Question 1: How many different pair-wise alignments are possible for the following two sequences, assuming that you are free to introduce gaps and are not looking for an optimal alignment?

Seq1: ABCDE
Seq2: BDCAE

Answer: ____________

However, introducing an additional sequence to the alignment complicates the whole process. The number of combinations, and consequently the number of possible alignments, grows exponentially with each added sequence.

Question 2: Assuming a third sequence (Seq3 below) is to be aligned with the above two sequences, give any 3 possible alignments.

Seq3: ABBCE

Answer:
1. ___________
   ___________
   ___________
2. ___________
   ___________
   ___________
3. ___________
   ___________
   ___________
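To get a feel for how quickly the number of possible alignments grows, the counting can be sketched in Python. This is only an illustration under an assumed counting convention: an alignment is a list of columns, each column holds either the next residue or a gap for every sequence, and a column made of gaps only is not allowed. The function name count_alignments is hypothetical.

```python
from functools import lru_cache
from itertools import product


def count_alignments(*lengths):
    """Count gapped alignments of sequences with the given lengths.

    Assumed convention: every column consumes the next residue of a
    non-empty subset of the sequences; all-gap columns are not allowed.
    """
    k = len(lengths)
    # Each possible column: 1 = residue, 0 = gap, at least one residue.
    moves = [m for m in product((0, 1), repeat=k) if any(m)]

    @lru_cache(maxsize=None)
    def ways(remaining):
        if not any(remaining):
            return 1  # every residue has been placed
        total = 0
        for move in moves:
            nxt = tuple(r - m for r, m in zip(remaining, move))
            if min(nxt) >= 0:
                total += ways(nxt)
        return total

    return ways(tuple(lengths))


print(count_alignments(2, 2))     # two sequences of length 2: 13
print(count_alignments(2, 2, 2))  # adding a third sequence of length 2: 409
```

Even for these very short toy lengths, adding a third sequence raises the count from 13 to 409 under this convention.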

CLUSTALW: A computer program for MSA
CLUSTALW is a more recent version of CLUSTAL. The W stands for weighting, meaning that the program is able to assign weights to the sequences and parameters. CLUSTALW performs a global multiple alignment using the following steps:
1. Perform pair-wise alignments of all of the sequences.
2. Use the alignment scores to produce a phylogenetic tree*.
3. Guided by the phylogenetic tree, align the most closely related sequences first, and then add further sequences and groups of sequences.
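The first step above, together with the idea of aligning the closest pair first, can be sketched in Python. This is a minimal illustration and not CLUSTALW's actual implementation: the scoring values (match = 1, mismatch = -1, gap = -2), the function names and the toy sequences are all assumptions, and CLUSTALW itself uses weighted sequences and more elaborate gap penalties.

```python
from itertools import combinations


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (score only, no traceback)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * gap]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def alignment_order(seqs):
    """Return ((name1, name2), score) pairs, most similar pair first."""
    scores = {
        (p, q): global_score(seqs[p], seqs[q])
        for p, q in combinations(sorted(seqs), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy sequences, not the sequences used in this experiment.
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTTTACGA"}
print(alignment_order(toy)[0])  # (('s1', 's2'), 6)
```

In a progressive alignment, the pair at the top of this list would be aligned first, and the remaining sequence would then be added to that alignment.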

OSC Rev200604


*Phylogenetic tree: Sometimes also referred to as a dendrogram, a phylogenetic tree is a graphical representation of the evolutionary relationship among three or more genes or organisms. The justification is that organisms with higher degrees of molecular similarity are expected to be more closely related than those that are dissimilar.

Question 3: Construct a phylogenetic tree for the three sequences given above. (Hint: Without introducing any gaps, count the number of matches between each pair of sequences.)

Answer:

(Note: Researchers have devised algorithms that generate phylogenetic trees automatically from data, e.g. the Distance Method and the Neighbour-joining Method. Details of these algorithms are beyond the scope of this experiment.)

Objectives:
A. To use the Entrez program to retrieve a published mRNA sequence for alignment.
B. To use the BLAST program to find closely related DNA sequences.
C. To select another two DNA sequences from the BLAST result.
D. To use the CLUSTALW program to perform a multiple sequence alignment on the 3 sequences.
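For readers curious about the distance methods mentioned in the note on phylogenetic trees, the following Python sketch shows one simple variant: average-linkage (UPGMA-style) clustering. It is an assumption-laden illustration, not the method used by CLUSTALW; the function names p_distance and upgma_newick, and the toy sequences, are hypothetical. The output is a Newick string showing the tree topology only, without branch lengths.

```python
def p_distance(a, b):
    """Fraction of mismatching positions, compared without gaps.

    Assumes both sequences are non-empty; extra trailing residues of
    the longer sequence are ignored.
    """
    n = min(len(a), len(b))
    return sum(x != y for x, y in zip(a, b)) / n


def upgma_newick(names, dist):
    """Cluster leaves by average distance and return a Newick string.

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    """
    # Each cluster is (newick_text, tuple_of_leaf_names).
    clusters = [(n, (n,)) for n in names]

    def avg(c1, c2):
        pairs = [dist[frozenset((a, b))] for a in c1[1] for b in c2[1]]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the two closest clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: avg(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        merged = ("(" + a[0] + "," + b[0] + ")", a[1] + b[1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0] + ";"


# Hypothetical toy sequences, not the sequences used in this experiment.
toy = {"s1": "AAAA", "s2": "AAAT", "s3": "TTTT"}
names = sorted(toy)
dist = {
    frozenset((p, q)): p_distance(toy[p], toy[q])
    for p in names for q in names if p < q
}
print(upgma_newick(names, dist))  # (s3,(s1,s2));
```

The two most similar sequences are joined first, and the remaining sequence is attached as the outgroup.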

Procedures:

A.
1. The BLAST program can also be accessed from the Sequence Manipulation Suite on the BII Web site at: http://mammoth.bii.a-star.edu.sg/sms/index.html
(The website might have been moved due to the frequent restructuring of the computer systems in BII. If so, search for the BLAST program from the BII website. If you do not use BLAST from the BII site, state the web source where you ran the BLAST program, e.g. the NCBI website.)

2. Look for the Entrez search tool. [There are various web sites that provide Entrez.] Search the Nucleotide database for: M57414
3. Click on M57414 to see the details. [You may use alternate ways to find this sequence.]
4. Use Edit > Copy to copy the sequence of M57414 (the text after the word ORIGIN). (This is the first of the 3 sequences on which we want to perform an MSA.)
5. Click on the Back button until you reach the BLAST home page again.
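As one of the alternate ways to find the sequence, the record can also be retrieved programmatically through the NCBI E-utilities efetch service. The sketch below uses only the Python standard library; the function names efetch_url and fetch_record are hypothetical, and fetch_record needs network access. It is offered as an optional convenience and does not replace the manual steps above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an E-utilities URL for one nucleotide record.

    rettype="fasta" returns FASTA text; rettype="gb" returns the
    GenBank flat file, which contains the ORIGIN block.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def fetch_record(accession, rettype="fasta"):
    """Download the record as text (requires network access)."""
    with urlopen(efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")


print(efetch_url("M57414"))
```

Calling fetch_record("M57414", rettype="gb") should return the same flat-file view that the Entrez page displays, including the sequence after the word ORIGIN.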

B.
6. Run BLAST.
7. Select Nucleotide - Nucleotide BLAST (blastn).
8. Use Edit > Paste to paste the copied sequence into the Search box.
9. Click the BLAST! button.
10. Click the FORMAT! button. (You have now performed a BLAST search for similar sequences in the databases, as you did in Experiment 10: pair-wise sequence alignment.)
11. Wait if the search results are not ready yet. (The page will be updated automatically when they are ready.)
12. Keep this result page for use in the following steps, i.e. click the _ button at the top of the page.

C & D.
13. To perform multiple sequence alignment (MSA), access the CLUSTALW Web site at: http://mammoth.bii.a-star.edu.sg/clustalw
14. Select DNA Sequences.
15. Paste your query sequence (from step 4) into the sequence editing box. (If there is text inside, click the Reset Input button to clear it.)
16. From the BLAST result page (step 12), find the sequences for:
    a. gi|4507344|ref|NM_001057.1 Homo sapiens tachykinin recepto
    b. gi|206986|gb|M31838.1|RATSKR Rat substance K receptor mRNA
17. Copy and paste these 2 sequences into the sequence editing box of CLUSTALW. (Note: 1. Copy only the DNA sequences after the word ORIGIN. 2. Place these two sequences below the query sequence and insert a blank line between each of them.)
18. Now we must give names to the sequences. Edit the text inside the box so that it looks like the following:
>gi|189134| Human neurokinin A receptor
Query sequence obtained in step 4
>gi|4507344| Homo sapiens tachykinin receptor
First sequence from step 16.


>gi|206986| Rat substance K receptor mRNA
Second sequence from step 16.
(Note: Names after the symbol > are required for CLUSTALW to identify the sequences so that it can report the results later. Each sequence starts at the beginning of the line following its > line.)
19. Delete all line numbers from the 3 sequences.
20. We are now ready to perform a multiple sequence alignment (MSA). Click on the Run Clustal W button. (Accept all default settings for gap penalty, weight matrix, etc.)
21. Print a copy of the result page as part of your report and fill in the following information:
Sequence 1: gi|189134| Human neurokinin A receptor   __________ bp
Sequence 2: _________________________________   __________ bp
Sequence 3: _________________________________   __________ bp
Sequence (1:2) Aligned. Score: _________
Sequence (1:3) Aligned. Score: _________
Sequence (2:3) Aligned. Score: _________
22. Click the JalView button to view a graphic display of the results.
23. Based on the scores obtained in step 21, what phylogenetic tree would you propose for these three sequences?
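The clean-up described in steps 17 to 19 (keeping only the sequence after ORIGIN, deleting line numbers, and adding a > name line) can also be done with a short script. The Python sketch below is only an optional aid; the function names clean_origin and to_fasta are hypothetical, and the short sequence in the example is made up for illustration.

```python
import re


def clean_origin(block):
    """Strip line numbers, spaces and the ORIGIN and // marker lines
    from a GenBank sequence block, returning one upper-case string."""
    letters = []
    for line in block.splitlines():
        line = line.strip()
        if not line or line.upper().startswith("ORIGIN") or line == "//":
            continue
        # Keep letters only: drops the leading position number and spaces.
        letters.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(letters).upper()


def to_fasta(name, sequence, width=60):
    """Format a sequence as FASTA text: a > name line, then the sequence."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Made-up example block, not the real M57414 sequence.
example = """ORIGIN
        1 acgtacgtaa ccggttaacc
       21 ttgg
//"""
print(to_fasta("gi|189134| Human neurokinin A receptor", clean_origin(example)))
```

Repeating this for each of the three records and joining the results gives text that can be pasted into the CLUSTALW sequence editing box.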


Conclusion:
Multiple sequence alignment (MSA) is a mathematically complex and computationally intensive process. For a given set of sequences, there is no single correct alignment. Computer programs only produce an alignment that is optimal with respect to a certain set of criteria. Some MSA programs (e.g. CLUSTAL) rely on heuristics rather than an exhaustive search of all possible alignments, so it is not possible to be sure that their alignments are truly optimal. Which alignment is best for a given set of sequences therefore depends very much on the judgment of the researcher. The outcome of an alignment can be very different if the parameters used by the alignment program (gap penalties, protein scoring matrices, etc.) are changed. It is thus vital to always confirm alignment results with laboratory experiments.

~~ The End ~~
