HGM2002 Poster Abstracts: 1. Genome Informatics and Annotation
POSTER NO: 36
The Proteome BioKnowledge® Library: Bringing Biology to Genomics and Proteomics
Incyte's Proteome Model Organism and Mammalian Proteome Databases: The Proteome BioKnowledge® Library is a growing collection of searchable databases that integrate knowledge from the research literature with genomic information. It is intended as a resource for bioinformatics scientists and for biologists of all disciplines. The current volumes in the BioKnowledge Library are: YPD™ for S. cerevisiae; PombePD™ for S. pombe; MycoPathPD™ for a collection of fungal pathogens; WormPD™ for C. elegans; HumanPSD™, which includes survey information on human, mouse, and rat proteins; and GPCR-PD™, which provides detailed information on G protein-coupled receptors in human, mouse, and rat.
Databases are created using extensive human editorial expertise: Editorial selection and prioritization of the literature direct curators to the most relevant material. Teams of Ph.D.-level scientific curators read the papers and extract experimental information and predictions from them. This information is then categorized and attached to the appropriate protein sequence data. Quality control teams continually monitor the databases for both scientific correctness and technical accuracy.
Gene Ontology™ (GO) System of Controlled Vocabulary: The descriptive terminology of the Gene Ontology™ Consortium ( http://www.geneontology.org ) has now been integrated into the BioKnowledge Library. Proteome has assigned GO terms in all volumes of the BioKnowledge Library and was the first to apply the ontology to H. sapiens, R. norvegicus, C. albicans, and other fungal pathogen proteins. Incyte curators use the GO controlled vocabulary (Nat. Genetics 25:25-29) to record structured textual annotations. Examples below show how curation with GO terms alone creates a detailed picture of protein function, based solely on the GO molecular function, biological process, and cellular component properties.
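As a rough illustration of how annotations in the three GO categories can be combined into a single picture of a protein, the following Python sketch groups a protein's GO terms by category. It is not taken from the poster or from the BioKnowledge Library itself: the function name `summarize`, the tuple layout, and the example protein's term set are all assumptions made for illustration. The GO identifiers used are real GO terms.

```python
from collections import defaultdict

# The three GO categories named in the abstract.
ASPECTS = ("molecular_function", "biological_process", "cellular_component")


def summarize(annotations):
    """Group (aspect, go_id, term) tuples by GO aspect.

    Hypothetical helper: the tuple layout is an assumption for this sketch,
    not the BioKnowledge Library's actual data model.
    """
    grouped = defaultdict(list)
    for aspect, go_id, term in annotations:
        if aspect not in ASPECTS:
            raise ValueError(f"unknown GO aspect: {aspect}")
        grouped[aspect].append(f"{term} ({go_id})")
    # Always return all three aspects, in a fixed order.
    return {aspect: grouped[aspect] for aspect in ASPECTS}


# Illustrative annotations for a hypothetical, unnamed protein kinase.
example_annotations = [
    ("molecular_function", "GO:0004672", "protein kinase activity"),
    ("molecular_function", "GO:0005524", "ATP binding"),
    ("biological_process", "GO:0006468", "protein phosphorylation"),
    ("cellular_component", "GO:0005634", "nucleus"),
]

profile = summarize(example_annotations)
for aspect, terms in profile.items():
    print(f"{aspect}: {', '.join(terms) or 'none recorded'}")
```

Read together, the three lists describe what the protein does, which process it takes part in, and where in the cell it acts.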
GO Terminology and BioKnowledge Transfer: The information in our databases enables knowledge about characterized proteins to be transferred to uncharacterized but related proteins. The common vocabulary provided by GO can also be used to describe the predicted properties of these uncharacterized proteins. The example below shows how knowledge incorporated into the BioKnowledge Library, combined with GO terms, predicts functional features of proteins of interest.
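The idea of transferring GO terms from characterized proteins to related uncharacterized ones can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the method used to build the BioKnowledge Library: the abstract does not say how relatedness is established, so the sketch assumes a precomputed mapping from each uncharacterized protein to its characterized relatives (for instance, from a sequence similarity search). The function `transfer_annotations` and all protein names are hypothetical.

```python
def transfer_annotations(known, related_to):
    """Predict GO terms for uncharacterized proteins from their relatives.

    known:      dict mapping a characterized protein to a set of GO ids.
    related_to: dict mapping an uncharacterized protein to a list of
                characterized relatives (assumed to be precomputed).
    Returns a dict mapping each uncharacterized protein to a sorted list
    of GO ids collected from all of its relatives.
    """
    predictions = {}
    for unknown, relatives in related_to.items():
        terms = set()
        for relative in relatives:
            # Relatives without curated annotations contribute nothing.
            terms |= known.get(relative, set())
        predictions[unknown] = sorted(terms)
    return predictions


# Hypothetical curated annotations for two characterized proteins.
known = {
    "KinaseA": {"GO:0004672", "GO:0006468"},  # protein kinase activity, protein phosphorylation
    "KinaseB": {"GO:0004672", "GO:0005524"},  # protein kinase activity, ATP binding
}

# Hypothetical relatedness mapping; "ORF_Y" has no characterized relatives.
related_to = {
    "ORF_X": ["KinaseA", "KinaseB"],
    "ORF_Y": [],
}

predictions = transfer_annotations(known, related_to)
for protein, go_ids in predictions.items():
    print(protein, go_ids)
```

Terms produced this way are predictions rather than experimental findings, so a real system would record them separately from curated annotations.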