L519: Lab Session 6 (10/7/05)
Today's Topics
1. Phred/Phrap
2. Genome browsers
1. Phred
A. Developed by Dr. Phil Green [Group Homepage]
B. The phred software
1) reads DNA sequencing trace files
2) calls bases
3) assigns a quality value to each called base
C. Quality value
The quality value is a log-transformed error probability:
Q = -10 log10(Pe)
where Q is the quality value and Pe is the error probability of a particular base call. For example, Pe = 0.001 gives Q = 30.
Phred quality score | Probability that the base is called wrong | Accuracy of the base call
---|---|---
10 | 1 in 10 | 90%
20 | 1 in 100 | 99%
30 | 1 in 1,000 | 99.9%
40 | 1 in 10,000 | 99.99%
50 | 1 in 100,000 | 99.999%
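The relation Q = -10 log10(Pe) can be checked numerically. The following is a minimal Python sketch, not part of phred itself; the function names `error_probability` and `quality_value` are chosen here for illustration only.

```python
import math

def error_probability(q):
    """Return the error probability Pe for a phred quality value Q."""
    return 10 ** (-q / 10)

def quality_value(pe):
    """Return the phred quality value Q = -10 log10(Pe)."""
    return -10 * math.log10(pe)

# Print error probability and accuracy for the quality values in the table
for q in (10, 20, 30, 40, 50):
    pe = error_probability(q)
    print(f"Q={q:2d}  Pe={pe:.5f}  accuracy={100 * (1 - pe):.3f}%")
```

Each step of 10 in Q corresponds to a tenfold drop in the error probability, which is why the accuracy column gains one more 9 per row.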
D. Executable Programs
1) Installed on Biokdd in /usr/local/biokdd/bin
2) To view full options (doc)
>phred -doc | more
3) Sample Trace Files
/tmp/L519FALL2005/Lab/Lab5/chromat_dir
ABI Trace Files (Sample)
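4) Usage
>phred -id [chromatogram directory] -pd [phd directory] -sd [sequence output directory] -qd [quality score directory]
Write the output into separate directories:
>phred -id chromat_dir -pd phd_dir -sd fasta_dir -qd quality_dir
Write all sequences to one FASTA file and all quality values to one quality file:
>phred -id chromat_dir -sa read.fas -st fasta -qa read.qual -qt fasta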
5) Vector Trimming - To remove sequencing vector sequences
-trim [vector name || vector sequence file]
-trim_out [ -trim_fasta || -trim_scf || -trim_phd]
* Trimming Result Comparison
6) Sample Base Calling Results Using /tmp/L519Fall2005/Lab/Lab5/chromat-dir
* Without trimming, without quality file
* With trimming, without quality file
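The phred command in item 4 that uses -qa read.qual -qt fasta writes the quality values in FASTA format. The sketch below assumes the usual layout of such a file: a '>' header line per read, followed by lines of whitespace-separated integer quality values. The function name `read_qual` and the sample records are made up for illustration and are not real phred output.

```python
import io

def read_qual(handle):
    """Yield (read_name, [quality values]) from a FASTA-format quality file."""
    name, values = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if name is not None:
                yield name, values
            fields = line[1:].split()
            name, values = (fields[0] if fields else ""), []
        else:
            values.extend(int(tok) for tok in line.split())
    if name is not None:
        yield name, values

# Made-up sample data standing in for a real read.qual file
sample = io.StringIO(
    ">read1 example header\n"
    "8 12 25 40 40\n"
    "35 30 9\n"
    ">read2\n"
    "20 20 20\n"
)

records = list(read_qual(sample))
for name, quals in records:
    high = sum(q >= 20 for q in quals)
    print(f"{name}: {len(quals)} bases, {high} with Q >= 20")
```

To try it on real output, replace the `io.StringIO` sample with `open("read.qual")`. Counting bases with Q >= 20 (error probability of 1 in 100 or better) is one simple way to compare the trimmed and untrimmed results.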
E. Viewer
1) BioEdit : http://www.mbio.ncsu.edu/BioEdit/page2.html
2) Chromas : http://www.technelysium.com.au/chromas.htm
3) GeneStudio : http://www.genestudio.com/download_gspro.htm
4) Collections : http://www.roswellpark.org/document_5729.html
F. Online Repository for Trace Files
TraceDB : http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?
2. Phrap
A. Developed by Dr. Phil Green [Group Homepage]
B. The phrap software (Doc)
1) assembles shotgun DNA fragment sequences
2) gives better assembly results when a phred quality file is available (the FASTA file name with '.qual' appended)
C. Try to assemble the sample sequences
>phrap <FASTA_Seq_File_Name>
D. Sample Assembly Results
* Without trimming, no quality score file
* Without trimming, with quality score file
* With trimming, no quality score file
* With trimming, with quality score file
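Item B.2 says phrap finds the quality file by name, so it is worth checking that the file is in place before assembling. The sketch below assumes the convention that the quality file is the FASTA file name with '.qual' appended (for example read.fas and read.fas.qual). The helper names are hypothetical, and the script only checks file names; it does not run phrap.

```python
import os
import tempfile

def qual_path_for(fasta_path):
    """Return the quality file name expected for a FASTA file (assumed convention)."""
    return fasta_path + ".qual"

def has_quality_file(fasta_path):
    """Return True if the expected quality file exists next to the FASTA file."""
    return os.path.isfile(qual_path_for(fasta_path))

# Demonstrate with empty placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    fasta = os.path.join(tmp, "read.fas")
    open(fasta, "w").close()
    before = has_quality_file(fasta)          # no quality file yet
    open(qual_path_for(fasta), "w").close()
    after = has_quality_file(fasta)           # quality file now present

print("quality file found before:", before)
print("quality file found after: ", after)
```

Under this convention, a quality file written as read.qual would need to be copied or renamed to read.fas.qual before running phrap on read.fas. This reading of the naming rule is an assumption and should be checked against the phrap documentation.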
3. Genome Browsers
A. UCSC Genome Browser
B. ENSEMBL Genome Browser
C. VISTA
4. BLAST Exercise
A. Exercise1
B. Exercise2
Last Modified : October 7, 2005