L519:  Lab Session 7 (10/21/05)


Today's Topics
    1. Gene Prediction Tools (GenScan, TwinScan)
    2. GNUPlot



1. GenScan
    A. Developed by Chris Burge (Currently at MIT)

    B. Eukaryotic Gene Prediction.

    C. Model : Statistical (Hidden Markov Model)

    E. Web GenScan Service
         http://genes.mit.edu/GENSCAN.html

         1) Limit : One million base pairs (1 Mbp) in length.
         2) Three different types of organisms
              Vertebrate, Arabidopsis, Maize
         3) It predicts genes and exons.
         4) For sequences longer than 1 Mbp, you should use
            the email server or a local standalone version.
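
Before submitting a sequence to the web service, it can help to check its length against the 1 Mbp limit. The following is a minimal sketch, assuming the input is a single-record FASTA file already read into a string; the function names and the `WEB_LIMIT_BP` constant are made up for this illustration and are not part of GenScan.

```python
WEB_LIMIT_BP = 1_000_000  # limit stated for the web service (one million base pairs)


def read_fasta_length(fasta_text):
    """Count sequence characters in a FASTA string, skipping header lines."""
    total = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


def fits_web_service(fasta_text, limit=WEB_LIMIT_BP):
    """Return True if the sequence is no longer than the web service limit."""
    return read_fasta_length(fasta_text) <= limit


if __name__ == "__main__":
    demo = ">demo\nACGTACGT\nACGT\n"
    print(read_fasta_length(demo), fits_web_service(demo))
```

If the check fails, the sequence is a candidate for the email server or the standalone version.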

     F. Example
          Let's use GenScan to predict genes in the following 100 kbp Arabidopsis genomic sequence.
          
          Arabidopsis genomic sequence

          * Run GenScan.         GenScan output file (HTML, PDF view)
          Q1) How many genes have you found in this piece of DNA?
          Q2) How many exons does the predicted gene#10 have?
          Q3) What protein corresponds to the predicted gene#14?

          Another example: Homo sapiens, chromosome 18, positions 58,941,000 to 59,140,000
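
To cut such a region out of a chromosome sequence, the coordinates have to be turned into string indices. The sketch below assumes the positions are 1-based and inclusive, which the handout does not state; the function names are made up for this example.

```python
def region_length(start, end):
    """Length of a 1-based, inclusive interval [start, end]."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(chrom_seq, start, end):
    """Slice a 1-based, inclusive interval out of a sequence string."""
    region_length(start, end)  # validates the coordinates
    return chrom_seq[start - 1:end]


if __name__ == "__main__":
    # Under the 1-based inclusive assumption the human example spans:
    print(region_length(58_941_000, 59_140_000))
    # Toy sequence to show the slicing convention:
    print(extract_region("ACGTACGT", 3, 5))
```

The human region is well under the 1 Mbp web limit, so it can be submitted to the web service directly.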

      G. Standalone Version of GenScan at Biokdd
          1) Located at
              /usr/local/biokdd/bin/genscan
              /home4/genbank/genomes/all-fnas/software/genscanlinux
          2) Options for GenScan
               usage: genscan parfname seqfname [-v] [-cds] [-subopt cutoff] [-ps psfname scale]
              

          3) Check /tmp/L519FALL2005/Lab7 for required files.
          4) Sample Usage

              >genscan Arabidopsis.smat Arabidopsis.fas > Arabidopsis.out
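
The sample command can also be driven from a script, which is convenient when many sequences have to be processed. The following is a minimal sketch built only from the usage line above: the helper names are hypothetical, the `-ps` option is left out for brevity, and `genscan` is assumed to be on the PATH.

```python
import subprocess


def genscan_command(parfname, seqfname, verbose=False, cds=False,
                    subopt=None, genscan="genscan"):
    """Build the argument list: genscan parfname seqfname [-v] [-cds] [-subopt cutoff]."""
    cmd = [genscan, parfname, seqfname]
    if verbose:
        cmd.append("-v")
    if cds:
        cmd.append("-cds")
    if subopt is not None:
        cmd += ["-subopt", str(subopt)]
    return cmd


def run_genscan(cmd, out_path):
    """Run the command and write its standard output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


if __name__ == "__main__":
    # Only prints the command; call run_genscan(cmd, "Arabidopsis.out") to execute it.
    cmd = genscan_command("Arabidopsis.smat", "Arabidopsis.fas")
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.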


2. TwinScan

     A. TwinScan Service at Washington University.
          
http://genes.cs.wustl.edu/

     B. TwinScan combines an HMM with sequence similarity between two related genomes
         (e.g. Human and Mouse)

     C.  Try a short Arabidopsis genomic sequence  <SEQ>
           1) TwinScan Result
           2) GenScan Result

     D.  Use the Arabidopsis genomic sequence above to run TwinScan, and compare its result with the GenScan result.
           1) TwinScan Result     (PDF)
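
One simple way to compare the two predictions is to treat each exon as a (start, end) pair and look at which pairs both programs report. The sketch below assumes the exon coordinates have already been copied out of each result; the coordinate values and variable names are invented for illustration only.

```python
def compare_exons(exons_a, exons_b):
    """Split two exon lists into exact matches and program-specific exons."""
    a, b = set(exons_a), set(exons_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


if __name__ == "__main__":
    # Invented coordinates, not real predictions.
    twinscan_exons = [(1201, 1350), (1500, 1700), (2100, 2250)]
    genscan_exons = [(1201, 1350), (1500, 1712), (2100, 2250)]
    print(compare_exons(twinscan_exons, genscan_exons))
```

Exact matching is strict: exons that differ at only one boundary are listed as different, so overlapping but unequal pairs are worth inspecting by hand.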
            


3. EST_GENOME

     A. Similarity-based gene prediction program developed at the Wellcome Trust Centre for Human Genetics

     B. Webpage : http://www.well.ox.ac.uk/~rmott/ESTGENOME/est_genome.shtml

      

4. GENEMARK

     A. Prokaryotic Gene Prediction Program
 
     B. Webpage : http://opal.biology.gatech.edu/GeneMark/

 


Last Modified : October 21, 2005

Maintained by : Junguk Hur