I690:Algorithm in Bioinformatics: (3CR)
Fall Semester 2007
Instructor: Haixu Tang
Office: Informatics 225
Time & location: EIG 921; MW 11:15-12:30
Description: We aim to introduce the commonly used combinatorial and probabilistic algorithms in bioinformatics, including
- String pattern matching;
- Greedy algorithm;
- Dynamic programming;
- Divide-and-Conquer;
- Linear and integer programming;
- Clustering;
- graph theory;
- complexity theory;
- Randomized algorithms;
Many important bioinformatics topics will be used as examples to illustrate how the formulation and solution of a computer science problem can help to answer a biological question.
This course is designed for the advanced level bioinformatics graduate students. Graduate students with either biology or phisical/computer science backgrounds who have taken a fundemental bioinformatics course are also welcome to take this course.
Textbook: Pavel Pevzner:
Computational molecular biology: an algorithmic approach
, MIT Press,
1998 (CMB). Last updated: 8/10/2007
Some of the topics from the course can not be found in this book. We will
distribute complementary lecture notes and reading materials along the course
for these topics. We also recommend the students to read the book,
Dan Gusfield,
Algorithms on Strings, Trees and Sequences:
Computer Science and Computational Biology, Cambridge university press, 1997 (DAN).
Assignments: We will have 4 take-home4
assignments.
Grading: Combine
assignments (50%), One mid-term exam (20%), Final exam (25%), Attendence
(5%).
Prerequisites:
I519 or equivalent.
Preliminary agenda [This may
change!]:
Week
Date
Contents
Lecture notes
1
8/27
Introduction to Algorithms
Correctness and Efficiency
Algorithm Complexity
Slides
8/29
Analysis of algorithm complexity
2
9/3
Exact string matching
Homework 1 tex (Due in two weeks)Slides
DAN 2.3
DAN 1.4
9/5
Data structure: Suffix tree
Slides
DAN 6.1
A book chapter on suffix tree and suffix array by S. Aluru
3
9/10
Suffix tree: applications
Slides
DAN 3.4
DAN 7.2
9/12
Suffix tree: applications
DAN 7.4
DAN 7.10
MUMmer paper
4
9/17
Suffix tree: applications
Homework 1 due
Homework 2 tex (Due in two weeks)
DAN 7.14
DAN 9.5
DAN 9.6
9/19
Mismatch tree and the motif finding problem
Slides
CMB 8.6
MITRA paper
5
9/24
Breaking problems down: dynamic programming
Slides
A book Chapter by Yuzhen Ye and Haixu Tang
CMB 6.2
CMB 6.3
9/26
Applications of DP
Slides
CMB 6.7
6
10/1
Applications of DP
Homework 2 due
Homework 3 tex (Due in two weeks)
10/3
Combinatorial search: intractable problems
Slides
7
10/8
Integer programming I
Slides
CMB 6.11
10/10
Integer programming II
8
10/15
Reductions
Homework 3 due
CMB 6.10
10/17
Mid-term
9
10/22
Graph algorithms: trees
10/24
Perfect phylogeny
DAN 17.3-17.5
A tutorial written by Dr. Fernandez-Baca
Haplotyping via perfect phylogeny
Notes
10
10/29
Comparing trees
Homework 4 tex (Due 11/14, Wed)
10/31
De Bruijn graph
CMB 5.2
CMB 5.3
CMB 5.4
Notes
11
11/5
Eulerian graphs
CMB 2.5
CMB 2.6
11/7
Minimum Spanning trees
Shortest paths
12
11/12
Matching problems
11/14
Network flow
13
11/19
Breakpoint graph
CMB chapter 10
11/21
no meeting (thanks giving holiday)
14
11/26
Set cover and set packing
11/28
Graph-based clustering
15
12/3
Heuristic algorithms I
15
12/5
Heuristic algorithms II
16
12/10
Final due