L519:  Lab Session 2 (9/9/05)

 

Today's Topics :
    1. Protein Databases
    2. Swiss-Prot
    3. PDB
    4. Simple CGI Programming


 

1. Protein Databases
     
* There are numerous protein-related databases. Here are some of the most frequently used ones.

Category           Function        Databases
-----------------  --------------  ------------------------------------------------------------
Sequence           Annotation      GenPept (NCBI)
                                   Swiss-Prot / TrEMBL
                                   PIR (Protein Information Resource)
                                   UniProt (Universal Protein Resource)

Domain             Protein Family  Pfam (Protein Families Database of Alignments and HMMs)
                                   Prosite
                                   PRINTS (Compendium of Protein Fingerprints)
                                   ProDom
                                   ProtoMaps
                                   SMART (Simple Modular Architecture Research Tool)
                                   InterPro

Protein Structure  Classification  CATH (Class, Architecture, Topology, Homologous Superfamily)
                                   SCOP (Structural Classification of Proteins)

                                   PDB (Protein Data Bank)
                                   The Dali Database

   * Introduction to Protein Domain / Structure Databases (PDF)
   * For a more comprehensive list of available databases, please check the NAR (Nucleic Acids Research) Database Issue.


2. Swiss-Prot
    
The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to molecular biology, with an emphasis on data relevant to proteins. It lets you browse a number of databases produced in Geneva, such as Swiss-Prot, PROSITE, SWISS-2DPAGE, SWISS-3DIMAGE, and ENZYME, as well as other cross-referenced databases (such as EMBL/GenBank/DDBJ, OMIM, Medline, FlyBase, ProDom, SGD, SubtiList, etc.). It also provides access to many analytical tools for identifying proteins, analyzing their sequences, and predicting their tertiary structure. ExPASy also offers many documents relevant to these fields of research, and the server links to the most relevant sources of information across the Web.
 

   Swiss-Prot is developed and maintained by the Swiss Institute of Bioinformatics (SIB) and the European Bioinformatics Institute (EBI). Its main strength is that it is a comprehensive protein information source curated by human experts, so it can serve as the highest-quality protein resource.
 

   * Swiss-Prot on the Web : http://www.expasy.org/swissprot
      Sample record : GRAA_HUMAN
   * Swiss-Prot as a text file
      Sample record : GRAA_HUMAN

    * List of line codes for the Swiss-Prot flat file

 

  Line code  Content                        Occurrence in an entry
  ---------  -----------------------------  -------------------------------------
  ID         Identification                 Once; starts the entry
  AC         Accession number(s)            Once or more
  DT         Date                           Three times
  DE         Description                    Once or more
  GN         Gene name(s)                   Optional
  OS         Organism species               Once or more
  OG         Organelle                      Optional
  OC         Organism classification        Once or more
  OX         Taxonomy cross-reference(s)    Once or more
  RN         Reference number               Once or more
  RP         Reference position             Once or more
  RC         Reference comment(s)           Optional
  RX         Reference cross-reference(s)   Optional
  RG         Reference group                Once or more (Optional if RA line)
  RA         Reference authors              Once or more (Optional if RG line)
  RT         Reference title                Optional
  RL         Reference location             Once or more
  CC         Comments or notes              Optional
  DR         Database cross-references      Optional
  KW         Keywords                       Optional
  FT         Feature table data             Optional
  SQ         Sequence header                Once
  (blanks)   Sequence data                  Once or more
  //         Termination line               Once; ends the entry

    * Refer to the Swiss-Prot manual for a full description of the line codes
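
    * Because every flat-file line begins with a two-letter line code, an entry can be read by grouping lines on that code. The Python sketch below illustrates the idea under some assumptions: the toy entry is made up for this example (it is not the real GRAA_HUMAN record), the content is assumed to start at the sixth column, and parse_entry is a hypothetical helper name. It is not a complete Swiss-Prot parser.

```python
from collections import defaultdict

# Made-up toy entry for illustration only; NOT a real Swiss-Prot record.
TOY_ENTRY = """\
ID   TOY_EXAMPLE    STANDARD;      PRT;    12 AA.
AC   X00000;
DE   Hypothetical toy protein.
OS   Imaginary organism.
KW   Example; Toy.
SQ   SEQUENCE   12 AA;
     MKTAYIAKQR QI
//
"""


def parse_entry(text):
    """Group the lines of one flat-file entry by their two-letter line code.

    Returns (fields, sequence): fields maps each line code to the list of
    its content strings; sequence is the residue string with spaces removed.
    """
    fields = defaultdict(list)
    sequence = []
    for line in text.splitlines():
        if not line.strip():
            continue                      # skip empty lines
        if line.startswith("//"):         # termination line ends the entry
            break
        code = line[:2]                   # columns 1-2 hold the line code
        content = line[5:].strip()        # assumption: content starts at column 6
        if code.strip() == "":            # blank code means sequence data
            sequence.append(content.replace(" ", ""))
        else:
            fields[code].append(content)
    return dict(fields), "".join(sequence)


if __name__ == "__main__":
    fields, seq = parse_entry(TOY_ENTRY)
    print(fields["ID"][0])
    print(seq)
```

      Line codes that may occur more than once (AC, DE, OS, ...) simply accumulate as several items in the same list.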


3. PDB (Protein Data Bank)
    
The PDB is the single worldwide repository for the processing and distribution of 3-D structure data of biological macromolecules such as proteins and nucleic acids. The 3-D structures are obtained experimentally, using X-ray crystallography and NMR (Nuclear Magnetic Resonance).

    * PDB on the Web : http://www.rcsb.org/pdb/
      Sample : 1A06 Calmodulin-dependent Protein Kinase from Rat

    * PDB file format
       Sample : 1A06.pdb (txt)
 
    * File Format Guide
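
    * PDB files are fixed-column text, so coordinate records can be read by slicing each line at known column positions. The Python sketch below assumes the standard ATOM/HETATM column layout described in the File Format Guide. The toy line is made up for this example (it is not taken from 1A06), and parse_atom and iter_atoms are hypothetical helper names.

```python
# A made-up ATOM record, assembled field by field to show the column layout.
TOY_LINE = (
    "ATOM  "      # cols  1-6   record name
    "    1"       # cols  7-11  atom serial number
    " "           # col  12     unused
    " N  "        # cols 13-16  atom name
    " "           # col  17     alternate location indicator
    "MET"         # cols 18-20  residue name
    " "           # col  21     unused
    "A"           # col  22     chain identifier
    "   1"        # cols 23-26  residue sequence number
    "    "        # cols 27-30  insertion code + unused
    "  10.000"    # cols 31-38  x coordinate
    "  20.000"    # cols 39-46  y coordinate
    "  30.000"    # cols 47-54  z coordinate
    "  1.00"      # cols 55-60  occupancy
    " 15.00"      # cols 61-66  temperature factor
    "          "  # cols 67-76  unused
    " N"          # cols 77-78  element symbol
)


def parse_atom(line):
    """Slice one ATOM/HETATM line into a dict (Python slices are 0-based)."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "resname": line[17:20].strip(),
        "chain": line[21],
        "resseq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "element": line[76:78].strip(),
    }


def iter_atoms(lines):
    """Yield parsed atoms, skipping every other record type (HEADER, etc.)."""
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM")):
            yield parse_atom(line)


if __name__ == "__main__":
    for atom in iter_atoms(["HEADER    TOY", TOY_LINE]):
        print(atom["resname"], atom["chain"], atom["x"], atom["y"], atom["z"])
```

      Splitting on whitespace is not reliable here, because adjacent fields may touch when the values are wide. That is why the sketch slices by column.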

 

4. Simple CGI Programming
     A. Sample CGI 1  (txt)
     B. Sample CGI 2  (txt)
     C. Introduction to CGI
     D. Please use Google or a Perl CGI book for further reference.
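
     For orientation, here is a minimal sketch of what a CGI program does: the web server passes the request's query string in the QUERY_STRING environment variable, and the program writes a header block, a blank line, and then the page body to standard output. The sketch is in Python and is not one of the linked samples. The form field "name" and the function build_response are hypothetical names chosen for this example.

```python
#!/usr/bin/env python3
import html
import os
from urllib.parse import parse_qs


def build_response(query_string):
    """Return a full CGI response: header, blank line, then the HTML body."""
    params = parse_qs(query_string)
    # "name" is a hypothetical form field; fall back to "world" if it is absent.
    name = params.get("name", ["world"])[0]
    # Escape user input before placing it in HTML.
    body = "<html><body><h1>Hello, {}!</h1></body></html>".format(html.escape(name))
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets QUERY_STRING to the part of the URL after "?".
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```

     Keeping the response logic in a function makes it possible to try the script from the command line, for example by setting QUERY_STRING to "name=Ada" before running it.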


Last Modified : September 1, 2005

Maintained by : Junguk Hur