REFOLD




   [Buckle et al (2005) Nature Methods. 2,3]
   [Chow et al (2005) Protein Expr Purif. 46, 166-171]
Monash University

Using REFOLD::Useful Databases and Websites

Databases and websites useful to REFOLD

Websites directly relevant to REFOLD: UniProt database, SCOP database, PubMed database
Useful websites to assist REFOLD data entry: ExPASy website, PDB database, Pfam database


UniProt Database: http://www.ebi.uniprot.org/index.shtml

The Universal Protein Resource database (UniProt). Each protein in the UniProt database has its own unique ID, based on the protein and the organism in which it is naturally expressed. Each entry provides information on relevant publications, sequence, function, secondary structure, molecular weight and length, as well as links to other databases and websites, including Pfam, PubMed and ExPASy. You can search UniProt using the protein name and the organism from which it comes (e.g. antitrypsin homo sapiens OR antitrypsin human). Once the query results appear, you may need to browse them further to pinpoint exactly which entry pertains to your protein. You can view an entry in more detail by selecting the ID/Accession no. on the left-hand side of the table. The UniProt ID is the first ID listed in this field (e.g. A1AT_HUMAN).
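
As a rough illustration of the ID format described above, the following Python sketch splits a UniProt ID into its protein and organism parts before it is entered into REFOLD. The function name and the pattern (one to five letters or digits on each side of the underscore) are assumptions made for this example only; they are not part of the REFOLD or UniProt software.

```python
import re

# Assumed pattern for illustration: PROTEIN_ORGANISM, each part 1-5 letters/digits.
_UNIPROT_ID = re.compile(r"([A-Z0-9]{1,5})_([A-Z0-9]{1,5})")


def split_uniprot_id(uniprot_id):
    """Return (protein_code, organism_code), or None if the text does not fit the pattern."""
    # Trim surrounding spaces and upper-case the text before matching.
    match = _UNIPROT_ID.fullmatch(uniprot_id.strip().upper())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Text that does not fit the assumed pattern, such as a plain protein name, gives None, which may be a hint that a search term rather than an ID was copied.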


SCOP Database: http://scop.mrc-lmb.cam.ac.uk/scop/index.html

The Structural Classification of Proteins (SCOP) database. Proteins which have had their structures solved are grouped into families with homologues and other proteins with similar structures. Proteins in the SCOP database are sorted at several hierarchical levels:

Class: the general structural classification of the protein, according to its dominant structural components (e.g. alpha, beta, alpha/beta, small proteins, multi-domain proteins)
Fold: the major tertiary fold and arrangement of secondary structural components within the protein (e.g. in β-sheets, barrels)
Superfamily: relates proteins which have a “probable common evolutionary origin”: they share the same fold and similar functional and structural features, but have low sequence homology
Family: relates proteins with a “clear evolutionary relationship”. Generally this suggests that they share relatively high sequence homology and/or very similar functions and structures
Protein: the individual proteins and structures, taking into account the organism of origin and specific sequence

Each individual grouping within the Class, Fold, Superfamily, Family and Protein groups has its own specific ID allocated. The ID is listed after the grouping name in square brackets [ ].
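
The hierarchy and the bracketed IDs described above can be sketched in Python as follows. This is only a minimal illustration: the names SCOP_LEVELS, ScopGrouping and parse_grouping are hypothetical, and the label used in the usage note is a made-up placeholder, not a real SCOP entry.

```python
import re
from typing import NamedTuple, Optional

# The five SCOP levels named in the text, from most general to most specific.
SCOP_LEVELS = ("Class", "Fold", "Superfamily", "Family", "Protein")


class ScopGrouping(NamedTuple):
    level: str    # one of SCOP_LEVELS
    name: str     # grouping name as displayed
    scop_id: str  # the ID shown in square brackets after the name


# Matches "Some grouping name [ID]" and captures the name and the ID.
_LABEL = re.compile(r"(?P<name>.*?)\s*\[(?P<id>[^\[\]]+)\]")


def parse_grouping(level: str, label: str) -> Optional[ScopGrouping]:
    """Split a displayed label into name and ID; return None if no bracketed ID is found."""
    if level not in SCOP_LEVELS:
        raise ValueError("unknown SCOP level: " + level)
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        return None
    return ScopGrouping(level, match.group("name"), match.group("id").strip())
```

For example, parse_grouping("Family", "Example family [12345]") would give a ScopGrouping with the name "Example family" and the ID "12345".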

When searching the SCOP Database, it is best to search on single-word terms (e.g. “sapiens” rather than “homo sapiens”), then select the appropriate choice from the list provided. The SCOP Database may also be searched using reference codes from the PDB Database (see PDB Database for more details). Note that some proteins have not been structurally characterized, in which case the SCOP class and family should be listed as “unknown”.


PubMed Database: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed


ExPASy website: http://www.expasy.ch

The ExPASy website has many useful tools and links for protein and DNA analysis. For REFOLD, you will mainly need the “Primary structural analysis” tools found in the “Tools and software packages” box. You can select either the “ProtParam” or the “pI/MW” tool to calculate the theoretical pI and/or molecular weight of your protein. After selecting the tool, paste your protein sequence into the box provided (there is no need to remove numbers; the program will ignore them automatically). Click the appropriate option and the program will calculate the relevant data for you.
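
To give a feel for what such a tool does with a pasted sequence, here is a minimal Python sketch of a theoretical molecular weight calculation. It is not the ExPASy implementation: the function names are hypothetical, the residue masses are commonly quoted average values, and the result may differ slightly from the figure the website reports. The theoretical pI is not covered here.

```python
# Average residue masses in daltons (commonly quoted values; an assumption of this sketch).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini


def clean_sequence(raw):
    """Keep letters only, which drops digits, spaces and line breaks."""
    return "".join(ch for ch in raw.upper() if ch.isalpha())


def molecular_weight(raw):
    """Sum the average residue masses of the cleaned sequence and add one water."""
    sequence = clean_sequence(raw)
    if not sequence:
        raise ValueError("no residues found in the input")
    unknown = sorted(set(sequence) - set(RESIDUE_MASS))
    if unknown:
        raise ValueError("unrecognised residue letters: " + ", ".join(unknown))
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
```

Because clean_sequence discards the position numbers, a sequence copied straight from a database entry can be passed in unchanged.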


PDB Database: http://www.rcsb.org/pdb

The Protein Data Bank (PDB) database is a listing of all proteins for which the structure has been solved. You can search on either protein or organism names. As with the SCOP database, be careful when searching on multiple terms: specifying too many terms may produce no results at all, and it is often more productive to search on one term only and then browse through the query results. Individual entries in the PDB database may be viewed by clicking on the unique 4-character PDB entry ID at the left-hand side of each listing. Once you have identified a structure as being relevant to your protein, you can search the SCOP Database using the 4-character PDB entry ID, which should produce the relevant SCOP entry for the protein. Alternatively, a direct link to the relevant entry in the SCOP database is provided at the bottom of each PDB entry (in the summary information).
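
Before using a PDB entry ID to search SCOP, it can help to confirm that the text copied really looks like a 4-character ID. The Python sketch below assumes the classic PDB convention of a digit from 1 to 9 followed by three letters or digits; this pattern and the function name are assumptions for illustration and are not stated on this page.

```python
import re

# Assumed classic PDB ID shape: one digit (1-9) followed by three letters or digits.
_PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")


def normalise_pdb_id(text):
    """Return the ID in upper case, or None if the text is not a 4-character PDB-style ID."""
    candidate = text.strip()
    if _PDB_ID.fullmatch(candidate) is None:
        return None
    return candidate.upper()
```

A protein name or a longer code gives None, which suggests the value copied was not the entry ID from the left-hand side of the listing.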


Pfam Database: http://www.sanger.ac.uk/Software/Pfam/search.shtml

The Protein Families (Pfam) database sorts proteins into clusters of multiple alignments, linking families of proteins together. Pfam also contains information about domain boundaries and disulfide bonds, and links to the PDB and UniProt databases. You can search Pfam for specific proteins using UniProt ID numbers (Search by “Protein name or sequence”).

 
Copyright © 2008 The REFOLD team. All Rights Reserved.