I529/B659: Bioinformatics in Molecular Biology and Genetics: Practical Applications (3CR)
Spring Semester 2016
Lecture: MW 2:30-3:20pm (I2 122)
Lab: F 2:30-3:20pm (I109)
Office Hours: Thu 2-3:30pm (LH 301D), Haixu Tang
Fri 10-11am (LH 406), Chao Tao
Instructor: Haixu Tang
AI: Chao Tao
Description: We aim to introduce a broad range of applications,
from fundamental to advanced, of bioinformatics methods and tools
to problems in genomics and molecular biology.
Prior to this class, students should have learned the basic methods and
theories of bioinformatics, e.g. by taking I519. In this class, we will focus on
how to apply them to real-world biological problems.
Some advanced computational techniques that are widely applied in bioinformatics,
e.g. the hidden Markov model (HMM) and the Bayesian network (BN),
will be discussed in detail in class.
The important themes that will be covered by this course include
- Sequence modeling and classification
- Genome annotation
- Motif finding
- Genome comparison
- Protein families
- Non-coding RNAs
- MicroRNAs and their targets
- Functional prediction
- Phylogenetics
- Mass spectrometry and proteomics
This class has a separate lab section, in which students will be taught
how to solve biological problems step by step. The programs covered
in the lab include
- Sequence modeling using Markov chains: seq++;
- Pair HMM: SLAM, TwinScan, QRNA;
- HMM: Genscan;
- Profile HMM: Hmmer, Pfam;
- Non-coding RNA search: Rsearch;
Students will be instructed to write scripts (Python and PHP preferred) and/or programs that make use of existing implementations of sophisticated algorithms, such as HMMs, BNs, and dynamic Bayesian networks (DBNs), to solve biological problems.
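To give a flavor of the kind of script involved, here is a minimal sketch of a first-order Markov chain over DNA, one of the sequence models listed above. It is not seq++ or any course-provided tool: the function names, the add-one pseudocount, and the toy training sequences are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"

def train_markov_chain(sequences, pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a).

    Assumes sequences contain only A, C, G, T. The pseudocount is an
    illustrative choice that keeps unseen transitions from getting
    probability zero.
    """
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    probs = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in ALPHABET}
    return probs

def log_likelihood(seq, probs):
    """Sum of log transition probabilities in seq (the first base is ignored)."""
    return sum(math.log(probs[a][b]) for a, b in zip(seq, seq[1:]))

# Toy CG-rich training data, hypothetical and for illustration only.
model = train_markov_chain(["ACGCGCGT", "CGCGCGAA"])
print(log_likelihood("CGCG", model), log_likelihood("ATAT", model))
```

Comparing log-likelihoods under two such models (for example, one trained on coding and one on non-coding sequence) is a simple form of the sequence classification listed among the course themes.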
This course is designed for advanced bioinformatics graduate students who have taken I519. Graduate students with a biology or physical/computer science background who are interested in bioinformatics applications in molecular biology are also welcome to take this course.
Textbook: Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison,
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids,
Cambridge University Press, 1999 (BSA)
Optional textbook: Kevin Murphy,
Machine Learning: A Probabilistic Perspective
Last updated: 1/12/2016
Assignments: We will have 5 take-home assignments and 1 class project.
Grading: Combined assignments (30%), one midterm exam (20%), final exam (25%), class project (20%), attendance (5%).
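As a rough illustration of how these weights combine, the sketch below computes a weighted course score. The function name `course_score`, the 0-100 scale for each component, and the sample scores are assumptions for illustration only, not the official grading procedure.

```python
# Weights taken from the grading breakdown above (they sum to 100%).
WEIGHTS = {
    "assignments": 0.30,
    "midterm": 0.20,
    "final": 0.25,
    "project": 0.20,
    "attendance": 0.05,
}

def course_score(scores):
    """Return the weighted total, assuming each component is scored 0-100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# Hypothetical component scores, for illustration only.
example = {
    "assignments": 90,
    "midterm": 80,
    "final": 85,
    "project": 95,
    "attendance": 100,
}
print(round(course_score(example), 2))
```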
Prerequisites: I519 or equivalent knowledge of bioinformatics is required.
Group Assignment: The class will be divided into several small groups for mini projects.
Tentative syllabus [This is subject to minor changes!]:
Week
Date
Contents
Lecture notes
1
1/11 Mon.
Introduction to the class
BSA 1.1 - 1.2
1/13 Wed.
Probabilistic modeling
BSA 1.4, Chapter 11
Notes
1/15 Fri.
Lab 1: Unix servers; web site; group assignments.
2
1/18 Mon.
No class (Martin Luther King Jr. Day)
1/20 Wed.
Probabilistic sequence modeling I: frequency and profiles
Notes
1/22 Fri.
Lab 2: GeneMark, Glimmer
Slides
3
1/25 Mon.
Probabilistic sequence modeling II: Markov Chain
BSA Chapter 3
Notes
1/27 Wed.
Hidden Markov Model I: Model structure
BSA Chapter 3
Notes
1/29 Fri.
Lab 3: presentation of mini-project 1
4
2/1 Mon.
Hidden Markov Model II: Generalized HMM
(Homework 2)
2/3 Wed.
Hidden Markov Model III: parameter estimation
BSA Chapter 3
Notes
2/5 Fri.
Lab 4: Gene Ontology and function prediction
Slides
5
2/8 Mon.
HMM for multiple sequences I
Notes
2/10 Wed.
HMM for multiple sequences II
2/12 Fri.
Lab 5: HMM for protein sequence analysis
MarCoil: prediction of coiled coils
TMHMM: prediction of transmembrane helices
Slides
6
2/15 Mon.
HMM for multiple sequences III
(Homework 3)
2/17 Wed.
Coalescent HMM
Notes
PSMC paper
MSMC paper
2/19 Fri.
Lab 6: Presentation of mini-project 2
7
2/22 Mon.
Pair HMM I
Notes
2/24 Wed.
Pair HMM II
2/26 Fri.
Lab 7: Genscan and Twinscan
Slides
8
2/29 Mon.
EM algorithm I
BSA Chapter 4
A short tutorial
Notes
3/2 Wed.
EM algorithm II
3/4 Fri.
Lab 8: WebLogo, MEME, Gibbs Motif Sampler
Slides
9
3/7 Mon.
Q & A
3/9 Wed.
Midterm
10
Spring recess
11
3/21 Mon.
Profile HMM I
BSA Chapter 5
A paper by S. Eddy
Notes
3/23 Wed.
Profile HMM II
3/25 Fri.
Lab 9: Presentation of mini-project 3
12
3/28 Mon.
Profile HMM III
3/30 Wed.
Gibbs Sampling
Notes
4/1 Fri.
Lab 10: Hmmer and Pfam
Slides
13
4/4 Mon.
Bayesian Network I
Short Tutorial
Notes
4/6 Wed.
Bayesian Network II
4/8 Fri.
Discussion about final project topics
14
4/11 Mon.
Junction tree algorithm
A working example
Notes
4/13 Wed.
Module network
Notes
4/15 Fri.
Lab 11: TBD
15
4/18 Mon.
Dynamic Bayesian network I
Book Chapter by Kevin Murphy on DBN
Notes
4/20 Wed.
Dynamic Bayesian network II
4/22 Fri.
Project presentation
16
4/25 Mon.
Project presentation
4/27 Wed.
Project presentation
17
5/4 Wed. (Note: time changed!)
Final Exam: 5-6:30 pm I2 122
5/6 Fri.
Final project report due