Databases over Databases in Molecular Biology

SRS Sequence Retrieval System (network browser for databanks in molecular biology)

Survey of Molecular Biology Databases and Servers

BioMedNet Library

DBGET Database Links

Harvard Genome Research Databases and Selected Servers

Johns Hopkins Univ. OWL Web Server

Index of Biology Internet Servers, USGS

Listing of Molecular Biology Databases (LiMB) gopher://

WWW Server for Virology, UW-Madison

UK MRC Human Genome Mapping Project Resource Centre

WWW for the Molecular Biologists and Biochemists

Links to other Bio-Web servers

http://www.gdb.org/biolinks.html

Molecular Modelling Servers and Databases

EMBO Practical Structural Databases

Web Resources for Protein Scientists

ExPASy Molecular Biology Server

The Antibody Resource Page

Bioinformatics WWW Sites

Bioinformatics and Computational Biology at George Mason University

INFOBIOGEN Catalog of Databases

National Biotechnology Information Facility

Human Genome Project Information

Archives for biological software and databases

Proteome Research: New Frontiers in Functional Genomics (book contents)

13.4 Sequence and Structure Databases

13.4.1 Major Public Sequence Databases

EMBL WWW Services

GenBank Database Query Form (get a GenBank entry)

Protein Data Bank WWW Server (get a PDB structure)

European Bioinformatics Institute (EBI)

EBI Industry support

http://industry. .uk/

SWISS-PROT (protein sequences)

PROSITE (functional protein sites)

Macromolecular Structures Database

Molecules R Us (search and view a protein molecule)

PIR-International Protein Sequence Database

SCOP (structural classification of proteins), MRC

HIV Sequence Database, Los Alamos

HIV Molecular Immunology Database, Los Alamos

TIGR Database

The NCBI WWW Entrez Browser

Cambridge Structural Database (small-molecule organic and organometallic crystal structures)

Gene Ontology Consortium

13.4.2 Specialized Databases

ANU Bioinformatics Hypermedia Server (virus databases, classification and nomenclature of viruses)

O-GLYCBASE (a revised database of O-glycosylated proteins)

Genome Sequence Database (GSDB) (relational database of annotated DNA sequences)

EBI Protein topology atlas

Database of Enzymes and Metabolic Pathways (EMP)

MAGPIE (multipurpose automated genome project investigation environment)

E. coli database collection (ECDC) (compilation of DNA sequences of E. coli K12)

Haemophilus influenzae database (HIDC) (genetic map, contigs, searchable index)

EcoCyc: Encyclopedia of Escherichia coli Genes and Metabolism

Eddy Lab snoRNA Database

GenProtEc (genes and proteins of Escherichia coli)

NRSub (non-redundant database for Bacillus subtilis) http://pbil.univ-lyon1.fr/nrsub/nrsub.html

YPD (proteins from Saccharomyces cerevisiae)

Saccharomyces Genome Database

LISTA, LISTA-HOP and LISTA-HON (compilation of homology databases from yeast)

FlyBase (Drosophila database)

MPDB (molecular probe database)

Compilation of tRNA sequences and sequences of tRNA genes

Small RNA database, Baylor College of Medicine

SRPDB (signal recognition particle database)

RDP (the Ribosomal Database Project) .uiuc .edu/

Structure of small ribosomal subunit RNA

Structure of large ribosomal subunit RNA http://rrna.uia.ac.be/lsu/index.html

RNA modification database

HAMSTeRS (haemophilia A mutation database) and factor VIII mutation database

Haemophilia B (point mutations and short additions and deletions) .uk/pub/databases/haemb/

Human p53, hprt and lacZ genes and mutations http://sunsite.unc.edu/dnam/mainpage.html

PAH mutation analysis (disease-producing human PAH loci)

ESTHER (cholinesterase gene server)

IMGT (immunogenetics database) .uk/imgt/

p53 mutations in human tumors and cell lines

Androgen receptor gene mutations database

Glucocorticoid receptor resource

Thyroid hormone receptor resource

16SMDB and 23SMDB (16S and 23S ribosomal RNA mutation database)

MITOMAP (human mitochondrial genome database)

SWISS-2DPAGE (database of two-dimensional polyacrylamide gel electrophoresis)

PRINTS (protein fingerprint database)

KabatMan (database of antibody structure and sequence information)

ALIGN (compendium of protein sequence alignments)

CATH (protein structure classification system)

ProDom (protein domain database)

Blocks database (system for protein classification)

HSSP (homology-derived secondary structure of proteins)

FSSP (fold classification based on structure-structure alignment of proteins)

SBASE protein domains (annotated protein sequence segments)

TransTerm (database of translational signals)

GRBase (database linking information on proteins involved in gene regulation)

ENZYME (nomenclature of enzymes)

REBASE (database of restriction enzymes and methylases)

RNaseP database

REGULONDB (database on transcriptional regulation in E. coli)

TRANSFAC (database on transcription factors and their DNA binding sites)

MHCPEP (database of MHC-binding peptides)

Mouse genome database

Mouse knockout database

ATCC (American type culture collection)

Histone sequence database of highly conserved nucleoprotein sequences

3Dee (database of protein structure domain definitions)

InterPro (integrated resource of protein domains and functional sites) .uk/interpro/

NRL_3D (sequence-structure database derived from PDB, pictures and searches)

VBASE (human variable immunoglobulin gene sequences)

GPCRD (G protein-coupled receptor data)

Human Cytogenetics (chromosomes and karyotypes)

Protein Kinase resource

Carbohydrate databases

Borrelia Molecular Biology Home Page

Human papillomaviruses database

Human 2-D PAGE databases for proteome analysis in health and disease

DBA mammalian genome size database

DOGS (Database Of Genome Sizes)

U.S. patent citation database

13.5 Sequence Similarity Searches

Sequence similarity search page at EBI

NCBI: BLAST notebook


EMBL WWW services

Pattern scan of proteins or nucleotides

MEME (motif discovery and search)

CoreSearch (identification of consensus elements in DNA sequences)

The PRINTS/PROSITE scanner (search motif databases with query sequence)

DARWIN system at ETH Zurich

PimaII find sequence similarity using dynamic programming

DashPat find sequence similarity using a hashcode comparison with a pattern library

PROPSEARCH (search based on amino acid composition, EMBL)

Sequence search protocol (integrated pattern search)

ProtoMap (automatic hierarchical classification of all SWISS-PROT proteins)

GenQuest (Fasta, Blast, Smith Waterman; search in any database) http://www.gdb.org/Dan/gq/gq.form.html

SSearch (searches against a specified database)

Peer Bork search list (motif/pattern/profile searches)

PROSITE Database Searches (search for functional sites in your sequence)

PROWL—Protein Information Retrieval at Skirball Institute

CEPH genotype database

13.6 Alignment

13.6.1 Pairwise Sequence and Structure Alignment

Pairwise protein alignment (SIM)

LALNVIEW alignment viewer program ftp://expasy.hcuge.ch/pub/lalnview

BCM Search Launcher (pairwise sequence alignment)

DALI compare protein structures in 3D http://www2 .uk/dali/

DIALIGN (alignment program without explicit gap penalties)

13.6.2 Multiple Alignment and Phylogeny

ClustalW (multiple sequence alignment at BCM)

PHYLIP (programs for inferring phylogenies)

Other phylogeny programs, a compilation from PHYLIP documentation

Tree of Life Home Page (information about phylogeny and biodiversity)

Links for Palaeobotanists

Phylogenetic analysis programs (the tree of life list)


Cladistic software (a list from the Willi Hennig Society)

BCM search launcher for multiple sequence alignments

AMAS (analyse multiply aligned sequences)

Vienna RNA Secondary Structure Package http://www.tbi.univie.ac.at/~ivo/RNA/

WebLogo (sequence logo) .uk/cgi-bin/seqlogo/logo.cgi

Protein sequence logos using relative entropy

RNA structure-sequence logo

RNA mutual information plots


13.7 Selected Prediction Servers

13.7.1 Prediction of Protein Structure from Sequence

PHD PredictProtein server for secondary structure, solvent accessibility, and transmembrane segments

PhdThreader (fold recognition by prediction-based threading)

PSIpred (protein structure prediction server)

THREADER (David Jones)

TMHMM (prediction of transmembrane helices in proteins)

Protein structural analysis, BMERC

Submission form for protein domain and foldclass prediction

NNSSP (prediction of protein secondary structure by nearest-neighbor algorithms)

Swiss-Model (automated knowledge-based protein homology modeling server)

SSPRED (secondary structure prediction with multiple alignment)

SSCP (prediction of secondary structure content from amino acid composition)

SOPM (Self Optimized Prediction Method, secondary structure) at IBCP, France.

NNPREDICT (neural network for residue-by-residue prediction)

SSpro (secondary structure in 3 classes)

SSpro8 (secondary structure in 8 classes)

ACCpro (solvent accessibility)

CONpro (contact number)

TMAP (service predicting transmembrane segments in proteins)

TMpred (prediction of transmembrane regions and orientation)

MultPredict (secondary structure of multiply aligned sequences)

NIH Molecular Modeling Homepage (modelling homepage with links)

BCM Search Launcher (protein secondary structure prediction)

COILS (prediction of coiled coil regions in proteins)

Coiled Coils

Paircoil (location of coiled coil regions in amino acid sequences)

PREDATOR (protein secondary structure prediction from single sequence)

DAS (Dense Alignment Surface; prediction of transmembrane regions in proteins)

Fold-recognition at UCLA-DOE structure prediction server

Molecular Modelling Servers and Databases

EVA (automatic evaluation of protein structure prediction servers)

13.7.2 Gene Finding and Intron Splice Site Prediction

NetGene (prediction of intron splice sites in human genes)

NetPlantGene (prediction of intron splice sites in Arabidopsis thaliana)

GeneQuiz (automated analysis of genomes)

GRAIL interface (protein coding regions and functional sites)

GENEMARK (WWW system for predicting protein coding regions)

GENSCAN Web Server: Complete gene structures in genomic DNA

FGENEH Genefinder: Prediction of gene structure in human DNA sequences

GRAIL and GENQUEST (E-mail sequence analysis, gene assembly, and sequence comparison)

CpG islands finder

Eukaryotic Pol II promoter prediction

Promoter prediction input form

Web Signal Scan Service (scan DNA sequences for eukaryotic transcriptional elements)

Gene Discovery Page

List of genome sequencing projects

13.7.3 DNA Microarray Data and Methods

Cyber-T (DNA microarray data analysis server)

Brown Lab guide to microarraying

Stanford Microarray Database

Stanford MicroArray Forum

Brazma microarray page at EBI

Web resources on gene expression and DNA microarray technologies http://industry. .uk/~ alan/MicroArray/

Gene-X (array data management and analysis system)

UCI functional genomics array tools and software

Matern's DNA Microarray Page

Public source for microarraying information, tools, and protocols

Weisshaar's listing of DNA microarray links

DNA microarray technology to identify genes controlling spermatogenesis

13.7.4 Other Prediction Servers

NetStart (translation start in vertebrate and A. thaliana DNA)

NetOGlyc (O-glycosylation sites in mammalian proteins)

YinOYang (O-β-GlcNAc sites in eukaryotic protein sequences)


(signal peptide and cleavage sites in gram+, gram-, and eukaryotic proteins)

NetChop (cleavage sites of the human proteasome)

NetPhos (serine, threonine and tyrosine phosphorylation sites in eukaryotic proteins)

TargetP (prediction of subcellular location)

ChloroP (chloroplast transit peptide prediction)

PSORT (prediction of protein-sorting signals and localization from sequence)

PEDANT (protein extraction, description, and analysis tool)

Compare your sequence to COG database

Prediction of HLA-binding peptides from sequences

13.8 Molecular Biology Software Links

Visualization for bioinformatics alan/VisSupp/

The EBI molecular biology software archive

The BioCatalog

Archives for biological software and databases

Barton group software (ALSCRIPT, AMPS, AMAS, STAMP, ASSP, JNET, and SCANPS)

Cohen group software rotamer library, BLoop, QPack, FOLD, Match,

Bayesian bioinformatics at Wadsworth Center

RasMol software and script documentation



Biosym (Discover) discover/html/Disco_Home.html

SAM software for sequence consensus HMMs at UC Santa Cruz

HMMER (source code for hidden Markov model software)

ClustalW .uk/clustalw/

DSSP program

Bootscanning for viral recombinations

Blocking Gibbs sampling for linkage analysis in very large pedigrees

ProMSED (protein multiple sequences editor for Windows)

DBWatcher for Sun/Solaris

ProFit (protein least squares fitting software)

Indiana University IUBIO software and data http://iubio

Molecular biology software list at NIH

ProAnalyst software for protein/peptide analysis

DRAGON protein modelling tool using distance geometry

Molecular Surface Package

Biotechnological Software and Internet Journal

MCell (Monte Carlo simulator of cellular microphysiology)

HMMpro (HMM simulator for sequence analysis with graphical interface)

13.9 Ph.D. Courses over the Internet

Biocomputing course resource list: course syllabi

Ph.D. course in biological sequence analysis and protein modeling

The Virtual School of Molecular Sciences

EMBnet Biocomputing Tutorials

Collaborative course in protein structure

GNA's Virtual School of Natural Sciences

Algorithms in molecular biology

ISCB education working group

