HGM2002 Poster Abstracts: 1. Genome Informatics and Annotation
POSTER NO: 32
Initial assembly of Synechococcus sp. PCC 7002 Genome
Tao Li, Zhou Yu, Jindong Zhao, Jingchu Luo
Synechococcus sp. strain PCC 7002 was originally isolated by Chase van Baalen in 1961. It is unicellular or forms short filaments of several cells. The whole genome of this organism has been sequenced by the random shotgun approach, supplemented by targeted sequencing using BAC and cosmid libraries [1,2]. A total of 26,684 reads, including 985 walking sequences, were obtained. The predicted genome size is about 2.75 Mb, which is close to the size reported from physical mapping. The sequencing depth is more than 5.8-fold, suggesting that more than 99% of the genome has been sequenced. There are 398 contigs ranging from 684 bp to 155 kb, 230 of which are longer than 2 kb. Preliminary analysis shows that ~1800 ORFs are present in both Synechocystis sp. PCC 6803 and Synechococcus sp. PCC 7002, and ~2500 in both Anabaena sp. PCC 7120 and Synechococcus sp. PCC 7002. About 500 to 600 ORFs were found only in Synechococcus sp. PCC 7002.
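The abstract does not say how the "more than 99%" figure was derived. One common way to reason about it is the Lander-Waterman (Poisson) model, in which the expected fraction of a genome covered by at least one read at depth c is 1 - exp(-c). The sketch below applies that model to the numbers quoted above; the choice of model and the function names are assumptions for illustration, and the model ignores cloning bias and repeats.

```python
import math


def expected_fraction_covered(depth: float) -> float:
    """Expected fraction of the genome covered by at least one read,
    under the idealized Lander-Waterman (Poisson) model."""
    return 1.0 - math.exp(-depth)


def implied_mean_read_length(depth: float, genome_size_bp: float, n_reads: int) -> float:
    """Mean read length implied by depth = n_reads * read_length / genome_size."""
    return depth * genome_size_bp / n_reads


# Figures quoted in the abstract.
DEPTH = 5.8              # fold coverage
GENOME_SIZE_BP = 2.75e6  # predicted genome size
N_READS = 26684          # total reads

fraction = expected_fraction_covered(DEPTH)
read_len = implied_mean_read_length(DEPTH, GENOME_SIZE_BP, N_READS)

print(f"expected fraction covered: {fraction:.4f}")
print(f"implied mean read length:  {read_len:.0f} bp")
```

Under these assumptions the expected covered fraction is above 0.99, in line with the abstract's statement, and the implied mean read length is roughly 600 bp.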
During the initial assembly of this genome, we developed a software package, the prokaryotic genome assembly assistant system (PGAAS), to accelerate the finishing phase of genome assembly, especially for whole-genome shotgun sequencing of prokaryotic species. The approach is to confirm the order of contigs and fill the gaps between them through peptide links, obtained by searching each contig end with BLASTX against protein databases. We used the contig dataset of the cyanobacterium Synechococcus sp. strain PCC 7002 after it had been sequenced to approximately six-fold coverage and assembled using the Phrap package. The subject database was the protein database of the cyanobacterium Synechocystis sp. strain PCC 6803. We found more than 100 non-redundant peptide segments, each of which can link at least 2 contigs. We tested one pair of linked contigs by sequencing and obtained a satisfactory result. PGAAS provides a graphical user interface to show the linking ("bridge") peptides and the contigs they connect ("pier" contigs). We integrated Primer3 into our package to design PCR primers at the adjacent ends of the pier contigs.
References
1. Bryant, D. A., Zhao, J., et al., The complete genomic sequence of Synechococcus sp. strain PCC 7002: a progress report. Final program of the 7th Cyanobacterial workshop, July 27-31, 2001, Pacific Grove, CA, 26.
2. Marquardt, J., Zhao, J., et al., Preliminary analysis of the genome of the marine cyanobacterium Synechococcus sp. strain PCC 7002. Final program of the 7th Cyanobacterial workshop, July 27-31, 2001, Pacific Grove, CA, 82.
3. Yu, Z., Li, T., Zhao, J., Luo, J. (2002) PGAAS: A prokaryotic genome assembly assistant system. Bioinformatics (in press).
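The peptide-link idea described in the abstract can be illustrated with a minimal sketch. It assumes the BLASTX hits of contig ends have already been parsed into records, and it treats any protein hit by the ends of two or more different contigs as a candidate bridge peptide. This is not the PGAAS implementation: the record layout, the function name `find_bridges`, and all contig and protein identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class EndHit(NamedTuple):
    """One BLASTX hit of a contig end against a protein (hypothetical layout)."""
    contig: str   # contig identifier
    end: str      # which end of the contig was searched: "left" or "right"
    protein: str  # subject protein identifier
    s_start: int  # start of the hit on the protein (amino acids)
    s_end: int    # end of the hit on the protein (amino acids)


def find_bridges(hits: List[EndHit]) -> Dict[str, List[EndHit]]:
    """Return proteins hit by ends of at least two distinct contigs.

    For each such bridge protein, the hits are sorted by their position
    on the protein, which suggests the relative order of the pier contigs.
    """
    by_protein: Dict[str, List[EndHit]] = defaultdict(list)
    for hit in hits:
        by_protein[hit.protein].append(hit)

    bridges: Dict[str, List[EndHit]] = {}
    for protein, protein_hits in by_protein.items():
        if len({h.contig for h in protein_hits}) >= 2:
            bridges[protein] = sorted(
                protein_hits, key=lambda h: min(h.s_start, h.s_end)
            )
    return bridges


# Made-up example: protA is matched by the ends of two different contigs,
# protB by only one contig, so only protA is a candidate bridge.
example_hits = [
    EndHit("contig7", "left", "protA", 180, 320),
    EndHit("contig3", "right", "protA", 1, 150),
    EndHit("contig3", "left", "protB", 10, 90),
]

bridges = find_bridges(example_hits)
for protein, linked in bridges.items():
    print(protein, "links", [f"{h.contig}:{h.end}" for h in linked])
```

In this sketch, the gap between contig3 and contig7 would be the natural target for PCR primers designed at the two adjacent contig ends. A real pipeline would also need to filter hits by score and check strand and frame consistency.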