HGM2002 Poster Abstracts: 1. Genome Informatics and Annotation
POSTER NO: 52
The Human Genome Project at RIKEN Genomic Sciences Center: Progress of Chromosomes 11q, 18p and 21
1Todd D. Taylor, 1Atsushi Toyoda, 3Takehiko Itoh, 2Tetsushi Yada, 1Yasushi Totoki, 1Hidemi Watanabe, 1Hideki Noguchi, 1Asao Fujiyama, 1Masahira Hattori, 1Yoshiyuki Sakaki
As part of the International Human Genome Project, RIKEN Genomic Sciences Center (http://hgp.gsc.riken.go.jp/top.html) is responsible for mapping, sequencing, and annotating chromosomes 11q, 18p, and half of chromosome 21. Since finishing chromosome 21 nearly two years ago, we have completed the draft sequences for the other two regions. Our next goal is to completely finish chromosomes 11q and 18p, the first of which is nearly done. Here we report on the current status of each of these projects.
Mapping: We have constructed several mono-chromosomal cosmid and fosmid libraries using a random fragmentation cloning protocol. In addition, we have developed a procedure for constructing chromosome-enriched BAC libraries. These libraries have proven very useful for detecting clones near the centromeres, telomeres, and gap regions where the standard libraries have failed.
Sequencing: We use the nested-deletion strategy coupled with the shotgun approach. Using 18 MegaBACE 1000 and 9 ABI 3700 capillary sequencers, we can produce over 10 million bases of raw sequence per day. Our current finishing rate is about 50-60 clones per month, corresponding to up to 10Mb of high-quality sequence.
Annotating: Several systems for automated data assembly, annotation, and release have been implemented. Our data are released through our web site and DDBJ according to the Bermuda rules.
Chromosome 11q: This chromosome arm, approximately 82Mb in length, should be nearly completed by the time of this meeting (projected date: May 2002). As of January 1, over 560 clones (out of ~700, including a few from other centers) have been finished for a non-redundant total of more than 66Mb, or about 81% of 11q. We have reached both the centromeric and telomeric repeats. Only five clone gaps remain, and we continue to screen additional libraries to close them. Chromosome 11q is very gene-dense and harbors genes for disorders such as ataxia telangiectasia, insulin-dependent diabetes mellitus, and several cancers, as well as many susceptibility genes (e.g. osteoarthritis, obesity, asthma). A preliminary analysis of the finished regions will be presented.
Chromosome 18p: The short arm of chromosome 18, our next target for finishing, is about 17Mb in length. The draft sequence is nearly complete, but currently less than 2Mb, or about 11%, is finished. Approximately 120 clones in total will be needed to complete this sequence. While there are no internal clone gaps, we are still trying to identify clones that contain the telomeric and centromeric repeats. We expect to complete the sequence of 18p before the end of this year. This gene-poor chromosome arm has been linked to several neurological disorders, including schizophrenia, bipolar affective disorder, and torsion dystonia. The current state of the map will be presented.
Chromosome 21: For this chromosome, which is responsible for Down syndrome, our center completed nearly 17Mb, or 50%, of the finished sequence. In an ongoing effort to improve the quality of the sequence and annotation, several minor changes have been incorporated and are now available from our web site and in the public databases. The annotation data have been significantly enhanced, with more types of analyses having been performed. Of note is the inclusion of new genes, SNP information from clone overlaps and other sources, cross-species comparison analysis, additional gene-finding predictions, and homology information. A brief summary will be presented.
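As a rough check of the completion figures quoted above for 11q and 18p, the following minimal Python sketch recomputes the finished fraction of each arm. The names `regions` and `percent_finished` are illustrative only, and the inputs are the approximate bounds given in the text ("more than 66Mb" of about 82Mb, "less than 2Mb" of about 17Mb), so the results are estimates rather than exact values.

```python
# Approximate figures taken from the abstract, in megabases (Mb).
# "finished_mb" is a lower bound for 11q and an upper bound for 18p.
regions = {
    "11q": {"finished_mb": 66.0, "total_mb": 82.0},
    "18p": {"finished_mb": 2.0, "total_mb": 17.0},
}


def percent_finished(finished_mb, total_mb):
    """Return the finished sequence as a percentage of the region length."""
    return 100.0 * finished_mb / total_mb


for name, r in regions.items():
    pct = percent_finished(r["finished_mb"], r["total_mb"])
    print(f"{name}: {pct:.1f}% finished")
```

With these inputs the calculation gives about 80.5% for 11q (a lower bound, since more than 66Mb is finished) and about 11.8% for 18p (an upper bound, since less than 2Mb is finished), consistent with the "about 81%" and "about 11%" stated above.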