Objectives
To expose students to areas where computing techniques are applicable in Bioinformatics.
To expose students to why prediction is important in solving biological problems.
To expose students to how machine learning is applicable in Bioinformatics.
To expose students to how expert systems are applicable in Bioinformatics.
Dr V.C. Osamor CIS427
Prediction in bioinformatics
This will largely require development of your own computational tools or the use of existing tools in areas such as:
Predicting the location of genes in DNA
Predicting gene roles in an organism
Predicting errors in a genetic transcription
Predicting the function of proteins
Predicting diseases from molecular samples
Anything that involves making a judgment: a yes/no decision about whether some sample datum does or does not have some property.
Representation
DATA - DATA
0101011101100101011001010111010000101101
to the computer, everything is binary!
0101011101100101011001010111010000101101
0101101100100111111011010011010000101101
AACGTCATTCGATGATTCGA
Just as we can teach a computer to predict things about a sequence of letters in English prose, we can also teach it to predict things about other sequences, such as a genetic sequence.
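The point that text and DNA are both just bits to the computer can be sketched in a few lines of Python. This is only an illustration: the function names and the 2-bit nucleotide code are assumptions made for this example, not part of the course material or any standard.

```python
# Assumed 2-bit code for the four nucleotides (illustrative, not a standard).
NUCLEOTIDE_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def text_to_bits(text):
    """Encode ASCII text as a string of 8-bit groups."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def bits_to_text(bits):
    """Decode a string of 8-bit groups back into ASCII text."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return bytes(int(chunk, 2) for chunk in chunks).decode("ascii")


def dna_to_bits(sequence):
    """Encode a DNA sequence using two bits per nucleotide."""
    return "".join(NUCLEOTIDE_BITS[base] for base in sequence.upper())


# The first bit string on the slide, read as 8-bit ASCII characters.
print(bits_to_text("0101011101100101011001010111010000101101"))
print(dna_to_bits("AACGTCATTCGATGATTCGA"))
```

Whichever encoding is chosen, a prediction program only ever sees the bit pattern. The meaning (English letters or nucleotides) comes from how we choose to interpret it.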
A gene is a blueprint that provides biochemical instructions on how to construct a sequence of amino acids so as to make a working protein that will perform some function in the organism.
[Figure: structure of a gene, labelled with the transcription factor, the untranslated regions, and the encoding region]
ttgcaatcggcgctacgcttcaaaatttattatattcccggc
What transcription factors bind to this gene? Where is the transcription factor binding site?
ttgcaatcggcgctacgcttcaaaatttattatattcccggc
Clues: Where there is one binding site, often there is another nearby.
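A minimal sketch of how the "nearby sites" clue could be turned into code: scan the sequence for every occurrence of a candidate motif, then report pairs of occurrences that lie close together. The motif "tat" and the gap of 5 bases are placeholders chosen for illustration only. They are not known binding sites or recommended thresholds, and real binding-site prediction relies on far richer models than exact string matching.

```python
def find_motif(sequence, motif):
    """Return every 0-based start index where motif occurs (overlaps included)."""
    sequence, motif = sequence.lower(), motif.lower()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def nearby_pairs(positions, max_gap):
    """Pair up sites whose start positions are at most max_gap bases apart."""
    return [
        (p, q)
        for i, p in enumerate(positions)
        for q in positions[i + 1:]
        if q - p <= max_gap
    ]


sequence = "ttgcaatcggcgctacgcttcaaaatttattatattcccggc"
sites = find_motif(sequence, "tat")  # placeholder motif, not a real binding site
print(sites)
print(nearby_pairs(sites, max_gap=5))
```

A learning program could use the presence of a nearby second site as one feature among many when deciding whether a candidate position is a true binding site.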
Proteomics
Three consecutive nucleotides in the coding region form a codon, i.e. together they encode one amino acid. A string of amino acids makes a protein.
4^3 = 64 possible codons
But there are only 20 amino acids!
Proteomics
There is quite a bit of redundancy in codons.
Glycine: GGA, GGC, GGG, GGT
Tyrosine: TAT, TAC
Methionine: ATG
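The codon arithmetic and the redundancy can be checked with a short Python sketch. The lookup table below is deliberately partial: it contains only the three amino acids listed above, and the names `ALL_CODONS`, `PARTIAL_CODON_TABLE`, and `translate` are made up for this example.

```python
from itertools import product

# Every ordered triple over the four nucleotides: 4 * 4 * 4 = 64 codons.
ALL_CODONS = ["".join(triple) for triple in product("ACGT", repeat=3)]

# Partial codon table, limited to the examples given in the text.
PARTIAL_CODON_TABLE = {
    "GGA": "Glycine", "GGC": "Glycine", "GGG": "Glycine", "GGT": "Glycine",
    "TAT": "Tyrosine", "TAC": "Tyrosine",
    "ATG": "Methionine",
}


def translate(coding_sequence):
    """Split a coding sequence into codons and look each one up."""
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing incomplete codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [PARTIAL_CODON_TABLE.get(codon, "unknown") for codon in codons]


print(len(ALL_CODONS))
print(translate("ATGGGCTACGGT"))
```

Because several codons map to the same amino acid, two different DNA sequences can encode exactly the same protein.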
Amino Acid
[Figure: general structure of an amino acid, labelled with the R group, the amino group, and the carboxyl group]
Amino Acid
[Figure: structures of tyrosine and glycine]
Artificial Intelligence
Getting computers to do things that otherwise only human brains can do
[Figure: a human expert alongside an expert system]
Machine learning
What is machine learning?
Creating computer programs that:
get better with experience
learn how to make expert judgments
discover previously hidden, potentially useful information (data mining)
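As a rough illustration of a program that derives a yes/no judgment from examples rather than from a hand-written rule, the sketch below learns a single threshold on GC content. The training sequences and their labels are invented purely for this example, and the midpoint-of-class-means rule is one simple assumed learning method, not the method used in any particular bioinformatics tool.

```python
def gc_fraction(sequence):
    """Fraction of bases in the sequence that are G or C."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def learn_threshold(examples):
    """Learn a cut-off from (sequence, has_property) pairs.

    The threshold is the midpoint between the mean GC fraction of the
    positive examples and that of the negative examples.
    """
    pos = [gc_fraction(s) for s, label in examples if label]
    neg = [gc_fraction(s) for s, label in examples if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(sequence, threshold):
    """Yes/no judgment: does this sequence have the property?"""
    return gc_fraction(sequence) > threshold


# Invented toy data: True marks sequences that "have the property".
training = [
    ("GCGCGGCC", True), ("GGCATGCC", True), ("CCGGTACG", True),
    ("ATATTAAT", False), ("TTAGATAA", False), ("AATTCATA", False),
]
threshold = learn_threshold(training)
print(predict("GCGCATGC", threshold), predict("ATTTAGAT", threshold))
```

Adding more labelled examples changes the learned threshold, which is the sense in which such a program's behaviour depends on its experience.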
Bioinformatics Applications
At the beginning of the "genomic revolution", bioinformatics was applied to the creation and maintenance of databases that store biological information such as nucleotide and amino acid sequences. Developing this type of database involved not only design issues but also the development of complex interfaces through which researchers could both access existing data and submit new or revised data.
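The two roles of such a database, accessing existing records and submitting new or revised ones, can be sketched with Python's built-in `sqlite3` module. The table layout, the function names, and the accession "DEMO0001" are hypothetical and far simpler than any real sequence database.

```python
import sqlite3


def create_database():
    """Create an in-memory database with one table of sequence records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences ("
        " accession TEXT PRIMARY KEY,"
        " kind TEXT NOT NULL CHECK (kind IN ('nucleotide', 'protein')),"
        " residues TEXT NOT NULL)"
    )
    return conn


def submit(conn, accession, kind, residues):
    """Insert a new record, or revise the record with the same accession."""
    conn.execute(
        "INSERT OR REPLACE INTO sequences (accession, kind, residues)"
        " VALUES (?, ?, ?)",
        (accession, kind, residues),
    )
    conn.commit()


def fetch(conn, accession):
    """Return (kind, residues) for an accession, or None if it is absent."""
    return conn.execute(
        "SELECT kind, residues FROM sequences WHERE accession = ?",
        (accession,),
    ).fetchone()


conn = create_database()
submit(conn, "DEMO0001", "nucleotide", "ATGGGCTAC")   # new submission
submit(conn, "DEMO0001", "nucleotide", "ATGGGCTACGGT")  # revised data
print(fetch(conn, "DEMO0001"))
```

The `CHECK` constraint is one example of the design issues mentioned above: the schema itself rejects records whose type is neither nucleotide nor protein.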
Biotechnology
Biologists know proteins, computer scientists know machine learning
Together, they can find out a lot of hidden information about genes and proteins
Biotechnology is a multi-billion dollar industry
Biotechnology is one of the best funded areas of scientific research
Sequence Analysis
This sequence information is analyzed to determine genes that encode polypeptides (proteins), RNA genes, regulatory sequences, structural motifs, and repetitive sequences. A comparison of genes within a species or between different species can show similarities between species (the use of molecular systematics to construct phylogenetic trees). With the growing amount of data, it became impractical to analyze DNA sequences manually. Today, computer programs such as BLAST are used.
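To give a feel for what automated sequence comparison involves, here is a minimal sketch of a global alignment score computed by dynamic programming (the Needleman-Wunsch recurrence). This is not how BLAST works internally: BLAST uses a faster heuristic local search. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of sequences a and b.

    Keeps only the previous row of the dynamic-programming table, so
    memory use is proportional to len(b).
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix of a aligned against the empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "ACGT"))
print(global_alignment_score("ACGT", "AGT"))
```

A higher score means the two sequences are more similar under the chosen scoring scheme. Comparing such scores across many gene pairs is one simple basis for grouping related genes or species.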
Comparative Genomics
The core of comparative genome analysis is the establishment of the correspondence between genes (orthology analysis) or other genomic features in different organisms.