I690:Algorithm in Bioinformatics: (3CR)

Fall Semester 2007
Instructor: Haixu Tang
Office: Informatics 225
Time & location: EIG 921; MW 11:15-12:30

Description: We aim to introduce the commonly used combinatorial and probabilistic algorithms in bioinformatics, including
- String pattern matching;
- Greedy algorithm;
- Dynamic programming;
- Divide-and-Conquer;
- Linear and integer programming;
- Clustering;
- graph theory;
- complexity theory;
- Randomized algorithms;

Many important bioinformatics topics will be used as examples to illustrate how the formulation and solution of a computer science problem can help to answer a biological question.

This course is designed for the advanced level bioinformatics graduate students. Graduate students with either biology or phisical/computer science backgrounds who have taken a fundemental bioinformatics course are also welcome to take this course.

 

Textbook: Pavel Pevzner: Computational molecular biology: an algorithmic approach , MIT Press, 1998 (CMB).
Some of the topics from the course can not be found in this book. We will distribute complementary lecture notes and reading materials along the course for these topics. We also recommend the students to read the book, Dan Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge university press, 1997 (DAN).

Assignments: We will have 4 take-home4 assignments.

Grading: Combine assignments (50%), One mid-term exam (20%), Final exam (25%), Attendence (5%).

Prerequisites:
  I519 or equivalent.

Preliminary agenda [This may change!]:

 


Week
Date
Contents
Lecture notes
1
8/27
Introduction to Algorithms
Correctness and Efficiency
Algorithm Complexity
Slides

8/29
Analysis of algorithm complexity

2
9/3
Exact string matching
Homework 1 tex (Due in two weeks)
Slides

DAN 2.3
DAN 1.4

9/5
Data structure: Suffix tree
Slides
DAN 6.1
A book chapter on suffix tree and suffix array by S. Aluru
3
9/10
Suffix tree: applications
Slides
DAN 3.4
DAN 7.2

9/12
Suffix tree: applications

DAN 7.4
DAN 7.10
MUMmer paper
4
9/17
Suffix tree: applications
Homework 1 due
Homework 2 tex (Due in two weeks)

DAN 7.14
DAN 9.5
DAN 9.6

9/19
Mismatch tree and the motif finding problem
Slides
CMB 8.6
MITRA paper
5
9/24
Breaking problems down: dynamic programming
Slides
A book Chapter by Yuzhen Ye and Haixu Tang
CMB 6.2
CMB 6.3

9/26
Applications of DP
Slides
CMB 6.7
6
10/1
Applications of DP
Homework 2 due
Homework 3 tex (Due in two weeks)


10/3
Combinatorial search: intractable problems
Slides
7
10/8
Integer programming I
Slides
CMB 6.11

10/10
Integer programming II

8
10/15
Reductions
Homework 3 due
CMB 6.10

10/17
Mid-term

9
10/22
Graph algorithms: trees


10/24
Perfect phylogeny
DAN 17.3-17.5
A tutorial written by Dr. Fernandez-Baca
Haplotyping via perfect phylogeny
Notes
10
10/29
Comparing trees
Homework 4 tex (Due 11/14, Wed)


10/31
De Bruijn graph

CMB 5.2
CMB 5.3
CMB 5.4
Notes
11
11/5
Eulerian graphs

CMB 2.5
CMB 2.6

11/7
Minimum Spanning trees
Shortest paths

12
11/12
Matching problems


11/14
Network flow

13
11/19
Breakpoint graph

CMB chapter 10

11/21
no meeting (thanks giving holiday)

14
11/26
Set cover and set packing


11/28
Graph-based clustering

15
12/3
Heuristic algorithms I

15
12/5
Heuristic algorithms II

16
12/10
Final due

Last updated: 8/10/2007