HGM2002 Poster Abstracts: 1. Genome Informatics and Annotation
POSTER NO: 7
New Strategy for Sequencing Large Mammalian Genomes
Rui Chen, George M. Weinstock, Richard A. Gibbs
To date, two methods have been used to sequence mammalian genomes: the 'Clone by Clone' approach and the 'whole genome shotgun' (WGS) approach. With the 'Clone by Clone' method, individual clones, each covering part of the genome, are sequenced and assembled separately. To reduce redundancy, a tiling path covering the genome must be established and a minimally redundant clone set selected for sequencing. Because the 'Clone by Clone' approach reduces a global problem to many local problems, the assembly and finishing of individual clones become relatively easy. However, substantial resources are needed up front to build the tiling path. In contrast, with the WGS strategy, shotgun libraries are generated directly from whole genomic DNA. As a result, a tiling path is not necessary, and the assembly is conducted at the whole-genome scale rather than at a local level. The disadvantages of this approach are that less information is available at the finishing step and that a minimum level of coverage of the whole genome is needed before an assembly can be generated.
In sequencing the rat genome, we have used a strategy that combines the two methods in order to maximize the advantages of both. In addition to the WGS sequences, selected BAC clones are sequenced at 1x coverage. Software tools have been developed to map WGS reads to individual BAC clones, so that each BAC clone can serve as the skeleton for the final assembly and finishing steps. The BAC clones also provide a template with which researchers can prioritize regions of interest for finishing. One key challenge for this strategy is selecting a low-redundancy clone set that covers all regions of the genome without a priori knowledge of a tiling path. A software package is being developed in our lab to establish a tiling path dynamically, using fingerprint data, BAC end sequence data, and sequence data generated from individual BAC clones as well as WGS reads. Using this package, we will be able to minimize the overlap among the BAC clones selected for sequencing and therefore cover the whole genome efficiently. The details of the strategy and results will be reported.
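As an illustration of what selecting a low-redundancy clone set involves, the sketch below picks a minimal tiling path with a standard greedy interval-cover rule. It is not the software package described above. It assumes that approximate clone coordinates have already been estimated (for example from fingerprint and BAC end sequence data), which is exactly the information the dynamic approach must build up as it goes. The names `Clone` and `greedy_tiling_path` and all coordinates are hypothetical.

```python
from typing import List, NamedTuple


class Clone(NamedTuple):
    name: str
    start: int  # assumed approximate start coordinate (bp)
    end: int    # assumed approximate end coordinate (bp), exclusive


def greedy_tiling_path(clones: List[Clone], region_start: int,
                       region_end: int) -> List[Clone]:
    """Return a minimal set of clones covering [region_start, region_end).

    Greedy rule: among the clones that start at or before the current
    covered position, always take the one that extends furthest.
    """
    ordered = sorted(clones, key=lambda c: c.start)
    path: List[Clone] = []
    covered_to = region_start
    i = 0
    while covered_to < region_end:
        best = None
        # Scan every clone that overlaps or abuts the covered region.
        while i < len(ordered) and ordered[i].start <= covered_to:
            if best is None or ordered[i].end > best.end:
                best = ordered[i]
            i += 1
        if best is None or best.end <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best)
        covered_to = best.end
    return path


# Hypothetical clones on a 320 kb region (coordinates in kb for brevity).
example = [
    Clone("A", 0, 100),
    Clone("B", 50, 180),
    Clone("C", 90, 200),
    Clone("D", 150, 300),
    Clone("E", 190, 320),
]
selected = greedy_tiling_path(example, 0, 320)
print([c.name for c in selected])
```

In this toy example, clones B and D are skipped because C and E reach further from the same starting overlap, so three clones cover the region instead of five. A gap in coverage raises an error rather than returning a partial path.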