IBISS: Interactive Bovine In Silico SNP Database

The Interactive Bovine In Silico SNP (IBISS) database was created at the Commonwealth Scientific and Industrial Research Organisation (CSIRO) by Rachel Hawken, Wes Barris and Brian Dalrymple. It is an in silico SNP database built from the large number of bovine EST sequences available in the public domain.

IBISS is not only a database of SNPs, but also a database of model bovine mRNAs. The mRNAs have a standard and uniform annotation, based on the human orthologues. IBISS is searchable by standard human gene symbol, GenBank human RefSeq accession number, GenBank bovine accession number, text search of the descriptions, or sequence similarity using BLAST.

IBISS was constructed by clustering publicly available EST and mRNA sequences for Bos species (324,031 sequences). The GenBank sequences were cleaned by removing chimeric sequences, sequences containing more than 10% Ns, and some sequences that are entirely vector. The remaining data set was masked using RepeatMasker and CrossMatch to remove vector sequence, repetitive elements and similar regions (masked regions are represented by xxxx in the sequences). It was then clustered and aligned using stackPACK, which pipelines the data through a number of algorithms and programs.
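The N-content cleaning step can be illustrated with a minimal Python sketch. This is not the IBISS pipeline code: the function names, the `max_n` parameter and the example sequences are hypothetical, and only the 10% threshold comes from the description above.

```python
def n_fraction(seq):
    """Return the fraction of bases in seq that are N (case-insensitive)."""
    if not seq:
        return 1.0  # assumption: treat an empty sequence as unusable
    return seq.upper().count("N") / len(seq)


def filter_by_n_content(records, max_n=0.10):
    """Keep only sequences whose N fraction does not exceed max_n.

    records maps a sequence name to its sequence string.
    """
    return {name: seq for name, seq in records.items() if n_fraction(seq) <= max_n}


# Hypothetical example records, not real GenBank entries.
example = {
    "est_ok": "ACGTACGTACGTACGTACGT",     # 0 of 20 bases are N
    "est_noisy": "ACGTNNNNACGTACGTACGT",  # 4 of 20 bases are N (20%)
}
kept = filter_by_n_content(example)
```

Here `kept` retains only `est_ok`, because `est_noisy` is 20% N and exceeds the threshold.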

For each model mRNA, several important features have been included in this interactive web site.

  1. Model mRNAs were annotated using the description lines from the top BLAST hits to the human mRNA and protein RefSeq sets (blastn, blastx and FASTY).
  2. The most likely protein sequence was determined using either FASTY or ESTscan2.
  3. Putative SNPs were identified by examining the multiple sequence alignment for each cluster (SNPs altering an amino acid sequence are also flagged).
  4. Putative gene structure (intron-exon boundaries) of each model mRNA was identified using BLAT searches of each model mRNA versus the human chromosome sequence database (July build of the human chromosome sequences from the UCSC Genome Browser).
  5. The predicted protein sequence was aligned to the respective mRNA using Nap.
  6. A bovine/human genome browser enables chromosome browsing to look for bovine genes in particular areas of human chromosomes.
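As an illustration of item 3, the sketch below scans the columns of a small multiple sequence alignment for positions where two or more bases are each supported by several sequences. It is an assumption-laden toy, not the method IBISS uses: the function name, the `min_count` threshold and the example alignment are all hypothetical.

```python
from collections import Counter


def putative_snps(alignment, min_count=2):
    """Return (column, allele_counts) for each alignment column in which at
    least two different bases are each seen in min_count or more sequences.

    alignment is a list of equal-length strings; gaps ('-') and masked
    positions ('x') are ignored when counting bases.
    """
    snps = []
    for col in range(len(alignment[0])):
        counts = Counter(
            row[col].upper() for row in alignment if row[col].upper() in "ACGT"
        )
        # Requiring support from several sequences reduces the chance of
        # reporting a single-read sequencing error as a SNP.
        supported = {base: n for base, n in counts.items() if n >= min_count}
        if len(supported) >= 2:
            snps.append((col, supported))
    return snps


# Hypothetical five-sequence cluster alignment.
aligned = [
    "ACGTTGCA",
    "ACGTTGCA",
    "ACATTGCA",
    "ACATTGC-",
    "ACGTTGCx",
]
candidates = putative_snps(aligned)
```

In this example only column 2 (zero-based) qualifies, with G seen three times and A seen twice.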

Data submission

A condition of the IBISS Licence is that users of this web site submit to us any sequence that confirms or refutes putative SNPs.

Referencing IBISS

Manuscripts referencing IBISS should include:

Hawken RJ, Barris WC, McWilliam SM, Dalrymple BP. 2004. An interactive bovine in silico SNP database (IBISS). Mamm Genome. 15(10):819-27.

How have the model mRNAs been annotated?

All primary and alternate consensus sequences were BLASTed against the latest versions of the human protein and mRNA RefSeq data sets from the NCBI. These data sets contain known and predicted proteins and mRNAs, which can be identified from their accession numbers as follows:

NP_xxxxxx - known protein
XP_xxxxxx - theoretical protein
NM_xxxxxx - known mRNA
XM_xxxxxx - theoretical mRNA
NG_xxxxxx - pseudogenes
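
The prefix table above can be applied programmatically. The following Python sketch is illustrative only: the function name and the example accessions are hypothetical, and the categories are copied from the list above.

```python
# Accession prefixes and their meanings, as listed above.
REFSEQ_PREFIXES = {
    "NP": "known protein",
    "XP": "theoretical protein",
    "NM": "known mRNA",
    "XM": "theoretical mRNA",
    "NG": "pseudogene",
}


def classify_refseq(accession):
    """Return the category for a RefSeq-style accession such as 'NM_000518'.

    Accessions without an underscore, or with a prefix that is not in the
    table, are reported as 'unknown'.
    """
    prefix, sep, _ = accession.partition("_")
    if not sep:
        return "unknown"
    return REFSEQ_PREFIXES.get(prefix.upper(), "unknown")


# Hypothetical accessions used only to exercise the function.
labels = [classify_refseq(acc) for acc in ("NM_000518", "XP_123456", "AB123456")]
```

The third accession has no underscore-delimited prefix, so it is classified as `unknown` rather than guessed at.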