I690: Computational techniques in comparative genomics (3CR)

Spring Semester 2006
Lecture: Tuesday/Thursday, 4-5:15pm, Eigenmann 921

Office Hour: TBA
Instructors: Sun Kim and Haixu Tang
AI: Jasen Lee

Description: This course will summarize computational techniques for comparing genomes at the DNA and protein sequence levels. Topics include state-of-the-art computational techniques and their applications: understanding hereditary diseases and cancer, mobile genetic elements, genome rearrangements, genome evolution, and the identification of potential drug targets in microbial genomes.

This course is designed for advanced bioinformatics graduate students. Graduate students with an entry-level background in bioinformatics research (e.g. after taking L519 or an equivalent course) are welcome to take this course. Students with a biology background who are interested in comparative genomics are also welcome.


Textbook: E. Koonin and M. Y. Galperin: Sequence-Evolution-Function: Computational Approaches in Comparative Genomics, Springer, 2002. (COMP) We chose this book as a reference for the course because its full text is available online. We will also distribute complementary lecture notes and papers on these topics throughout the course.

Textbook: Dan Gusfield: Algorithms on Strings, Trees, and Sequences. (ALG) This book covers most of the algorithms we will discuss in class.

Assignments: In addition to TWO take-home assignments, each student will present ONE research paper in class and complete ONE class project.

Grading: Assignments (20%), Paper presentation (15%), Quiz (10%), Final project (50%), Attendance (5%).

Suggested Reading: Click Here and here.
Schedule for paper presentation can be found here.

Preliminary syllabus [This may change!]:


Instructor/Lecture notes
1/10 (Tue)
A brief overview of string pattern matching algorithms:
Boyer-Moore (BM) and Knuth-Morris-Pratt (KMP) algorithms
Suffix trees

Sun Kim
ALG Sections I, II

1/12 (Thr)
A brief overview of sequence alignment algorithms
Sun Kim
ALG Chapter 11
1/17 (Tue)
Time and memory efficient algorithms for DNA sequence alignment:
Myers-Miller algorithm
Banded alignment
Haixu Tang

1/19 (Thr)
Sparse dynamic programming
Haixu Tang
1/24 (Tue)
Pairwise genome alignment
Sun Kim

1/26 (Thr)
Multiple genome alignment
Haixu Tang
1/31 (Tue)
Protein coding gene finding using multiple genomes I
Sun Kim

2/2 (Thr)
Protein coding gene finding using multiple genomes II
Sun Kim
2/7 (Tue)
Non-coding RNA gene finding using multiple genomes
Haixu Tang

2/9 (Thr)
Regulatory element finding using multiple genomes
Haixu Tang
2/14 (Tue)

Orthology and paralogy

Sun Kim
COMP Chapter 6

2/16 (Thr)
Horizontal gene transfer/Remote homology detection
Sun Kim
COMP Chapter 6
2/21 (Tue)
Operon/Evolution of metabolic pathways
Sun Kim
COMP Chapter 7

2/23 (Thr)
Mobile genetic elements
Haixu Tang
2/28 (Tue)
Segmental duplications
Haixu Tang

3/2 (Thr)
Whole genome duplication
Haixu Tang
3/7 (Tue)
Paper presentation I

3/9 (Thr)
Paper presentation II
Spring recess
3/21 (Tue)
Genome rearrangement: random breakage model
Haixu Tang

3/23 (Thr)
Sorting by reversals
Haixu Tang
ALG Chapter 19
3/28 (Tue)
Whole genome phylogeny
Haixu Tang

3/30 (Thr)
Gene fusion
Sun Kim
4/4 (Tue)
Paper presentation III

4/6 (Thr)
Prediction of functionally correlated gene sets
Sun Kim
4/11 (Tue)
Protein interaction networks in multiple genomes
Sun Kim

4/13 (Thr)
Paper presentation IV
4/18 (Tue)
Project presentation I

4/20 (Thr)
Project presentation II
4/25 (Tue)
Project presentation III

4/27 (Thr)
Project presentation IV
5/2 (Tue)
Project presentation V

5/4 (Thr)
Project presentation VI

5/5 (Fri)
Final report due