L519: Lab Session 2 (9/9/05)
Today's Topics :
1. Protein Databases
2. Swiss-Prot
3. PDB
4. Simple CGI Programming
1. Protein Databases
* There are numerous protein-related databases. Here are some of the most frequently used ones.
Category | Database |
---|---|
Sequence | PIR (Protein Information Resource) |
Sequence | UniProt (Universal Protein Resource) |
Domain | Pfam (Protein Families Database) |
Domain | PRINTS |
Domain | SMART (Simple Modular Architecture Research Tool) |
Protein Structure | CATH (Class, Architecture, Topology, Homologous superfamily) |
Protein Structure | SCOP (Structural Classification of Proteins) |
Protein Structure | PDB (Protein Data Bank) |
* Introduction to Protein Domain / Structure Databases (PDF)
* For a more comprehensive list of available databases, please check the NAR (Nucleic Acids Research) Database Issue.
2. Swiss-Prot
The ExPASy (Expert Protein Analysis System)
proteomics server from the Swiss Institute of
Bioinformatics (SIB) is dedicated to molecular biology with an emphasis on
data relevant to proteins. It allows you to browse through a number of databases produced in Geneva, such
as Swiss-Prot, PROSITE, SWISS-2DPAGE, SWISS-3DIMAGE, ENZYME, as well as other cross-referenced databases (such as
EMBL/GenBank/DDBJ, OMIM, Medline, FlyBase, ProDom, SGD, SubtiList, etc). It also
allows access to many analytical tools for the identification of proteins, the
analysis of their sequence and the prediction of their tertiary structure.
ExPASy also offers many documents relevant to these fields of research,
and the server provides links to the most relevant sources of information
across the Web.
Swiss-Prot is developed and maintained by the Swiss Institute of Bioinformatics (SIB) and the European Bioinformatics Institute (EBI). Its main strength is that it is a comprehensive protein information source curated by human experts, so it can serve as one of the highest-quality protein resources.
* Swiss-Prot on the Web : http://www.expasy.org/swissprot
Sample record : GRAA_HUMAN
* Swiss-Prot as a text file
Sample record : GRAA_HUMAN
* List of line codes for the Swiss-Prot flat file
Line code | Content | Occurrence in an entry | |
---|---|---|---|
ID | Identification | Once; starts the entry | |
AC | Accession number(s) | Once or more | |
DT | Date | Three times | |
DE | Description | Once or more | |
GN | Gene name(s) | Optional | |
OS | Organism species | Once or more | |
OG | Organelle | Optional | |
OC | Organism classification | Once or more | |
OX | Taxonomy cross-reference(s) | Once or more | |
RN | Reference number | Once or more | |
RP | Reference position | Once or more | |
RC | Reference comment(s) | Optional | |
RX | Reference cross-reference(s) | Optional | |
RG | Reference group | Once or more (Optional if RA line) | |
RA | Reference authors | Once or more (Optional if RG line) | |
RT | Reference title | Optional | |
RL | Reference location | Once or more | |
CC | Comments or notes | Optional | |
DR | Database cross-references | Optional | |
KW | Keywords | Optional | |
FT | Feature table data | Optional | |
SQ | Sequence header | Once | |
(blanks) | Sequence data | Once or more | |
// | Termination line | Once; ends the entry |
* Refer to the Swiss-Prot manual for a full description of the line codes
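The line codes above make the flat file easy to split into fields. Here is a minimal sketch in Python, assuming the usual layout (two-letter code in columns 1-2, data starting at column 6, sequence lines starting with blanks). The function name `parse_swissprot_entry` and the `SAMPLE` entry are made up for illustration; `SAMPLE` is not real Swiss-Prot data.

```python
from collections import defaultdict

# Made-up entry for illustration only; NOT a real Swiss-Prot record.
SAMPLE = [
    "ID   TEST_HUMAN     STANDARD;      PRT;    12 AA.",
    "AC   P00000;",
    "DE   Hypothetical test protein.",
    "DE   Second description line.",
    "SQ   SEQUENCE   12 AA;",
    "     MKTAYIAKQR QI",
    "//",
]


def parse_swissprot_entry(lines):
    """Parse one entry; return ({line code: [data lines]}, sequence)."""
    fields = defaultdict(list)
    sequence = []
    in_sequence = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("//"):      # termination line ends the entry
            break
        if in_sequence:
            # Sequence data lines start with blanks; drop all whitespace.
            sequence.append("".join(line.split()))
        elif line[:2] == "SQ":         # sequence header; data lines follow
            fields["SQ"].append(line[5:])
            in_sequence = True
        elif line[:2].strip():         # any other two-letter line code
            fields[line[:2]].append(line[5:])
    return dict(fields), "".join(sequence)


fields, seq = parse_swissprot_entry(SAMPLE)
print(fields["AC"], len(seq))
```

Codes that may occur more than once (DE, OS, RN, ...) are collected as lists, so repeated lines stay in order. A real script would also split a multi-entry file on the `//` lines.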
3. PDB (Protein Data Bank)
The PDB is the single worldwide repository for the processing and distribution
of 3-D structure data of large molecules such as proteins and nucleic acids. The 3-D structures are determined experimentally, using X-ray crystallography and NMR (Nuclear Magnetic Resonance).
* PDB on the Web : http://www.rcsb.org/pdb/
Sample : 1A06 Calmodulin-dependent Protein Kinase from Rat
* PDB file format
Sample : 1A06.pdb (txt)
* File Format Guide
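PDB files are fixed-column text, so coordinates can be read by slicing each line. Here is a minimal sketch in Python that reads only ATOM records, using the standard column positions (serial 7-11, atom name 13-16, residue name 18-20, chain 22, residue number 23-26, x/y/z 31-54). The names `Atom` and `parse_atoms` are made up for this example.

```python
from typing import Iterable, List, NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atoms(lines: Iterable[str]) -> List[Atom]:
    """Collect ATOM records; every other record type is skipped."""
    atoms = []
    for line in lines:
        # Record name is in columns 1-6; coordinates end at column 54.
        if not line.startswith("ATOM  ") or len(line) < 54:
            continue
        atoms.append(Atom(
            serial=int(line[6:11]),
            name=line[12:16].strip(),
            res_name=line[17:20].strip(),
            chain_id=line[21],
            res_seq=int(line[22:26]),
            x=float(line[30:38]),
            y=float(line[38:46]),
            z=float(line[46:54]),
        ))
    return atoms
```

To try it on the sample file, open 1A06.pdb and pass the file object to `parse_atoms`; it returns one `Atom` per ATOM line. Slicing by column is used instead of splitting on whitespace because adjacent fields in a PDB file may touch without a space between them.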
4. Simple CGI Programming
A. Sample CGI 1 (txt)
B. Sample CGI 2 (txt)
C. Introduction to CGI
D. Please use Google or a Perl CGI book for further reference.
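For orientation, a CGI program is just a script that reads the request from environment variables and prints an HTTP header, a blank line, and then the page. Here is a minimal sketch of that protocol. It is written in Python rather than Perl and is not one of the course samples. The parameter name `id` and the function `build_response` are assumptions made for this example.

```python
#!/usr/bin/env python3
import html
import os
from urllib.parse import parse_qs


def build_response(query_string):
    """Return the full CGI output (header + blank line + HTML body)."""
    params = parse_qs(query_string)
    # "id" is a hypothetical parameter, e.g. script.cgi?id=GRAA_HUMAN
    entry = params.get("id", ["(none given)"])[0]
    # Escape user input before echoing it back into the page.
    body = ("<html><body><p>Requested entry: %s</p></body></html>"
            % html.escape(entry))
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server passes the part of the URL after "?" in QUERY_STRING.
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```

The script can be tried without a web server by setting `QUERY_STRING` in the shell before running it. The blank line after the `Content-Type` header is required; without it the server cannot tell where the header ends.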
Last Modified : September 1, 2005