Introduction - The Plasmodium yoelii yoelii Genome Database (PYDB)
The P. y. yoelii genome
TIGR and the Naval Medical Research Center (NMRC) have collaborated on a research program to provide partial sequence coverage of the genome of Plasmodium yoelii yoelii, a rodent model for malaria. Funding for this project was provided by the U.S. Department of Defense through a cooperative agreement with the U.S. Army and NMRC. The project was completed in the summer of 2002 and has resulted in the identification of more than 95% of the genes in the P. y. yoelii genome.
The goal of the P. y. yoelii research program differed from the large-scale, dense sequencing of individual chromosomes of the P. falciparum genome. Instead, a fast, cost-effective and highly efficient whole-genome shotgun method was used to sequence the rodent malaria parasite genome to five-fold coverage. This approach has provided approximately 5,700 contigs for gene discovery and low-density gene organization. These sequences will be assembled, and the DNA contigs edited and annotated. This demonstration, the fact that the P. falciparum sequencing project is near completion, and the need to develop cost-effective approaches to sequencing other species of Plasmodium encouraged us to carry out large-scale sequencing and annotation of this model parasite genome without the investment of expensive and time-consuming gap closure.
Shotgun libraries were prepared using total genomic DNA from Plasmodium yoelii, strain 17X NL, clone 1.1, raised in mice. Briefly, leukocyte-free infected erythrocytes were isolated from mice and parasite genomic DNA was prepared by standard methods. Total genomic DNA was mechanically sheared, and fragments of 1-2 kb were size-selected and cloned into a pUC19-derived vector using BstXI adaptors. DNA was sequenced using dye-terminator chemistry on ABI 377 and ABI 3700 DNA sequencers.
To give malaria researchers early access for "jump starting" biological experiments, we have made the P. yoelii sequences publicly available. Be advised that the DNA assemblies and raw sequence reads are considered preliminary data and should be used carefully. These sequences have not been verified for base call ambiguities and may contain errors.
These data are being released to the public and are governed by the data release policy established for the P. falciparum sequencing project (see below), to which all the other genome projects carried out at TIGR subscribe. Please contact Dr. Jane Carlton should you need any further information.
As of May 29, 2001, we have 223,907 sequence reads, which corresponds to approximately 5x genome representation. The contigs and singletons can be downloaded from our ftp site.
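The relationship between read count and fold coverage can be sketched as a short calculation. Only the read count (223,907) comes from the text above; the mean read length (500 bp) and genome size (23 Mb) are assumptions chosen for illustration and are not stated on this page. Under those assumptions the estimate comes out close to the 5x figure quoted.

```python
def fold_coverage(num_reads: int, mean_read_length_bp: float, genome_size_bp: float) -> float:
    """Return expected fold coverage: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return num_reads * mean_read_length_bp / genome_size_bp


NUM_READS = 223_907                   # read count as of May 29, 2001 (from the text)
ASSUMED_MEAN_READ_LENGTH_BP = 500     # assumption: not given in the text
ASSUMED_GENOME_SIZE_BP = 23_000_000   # assumption: not given in the text

coverage = fold_coverage(NUM_READS, ASSUMED_MEAN_READ_LENGTH_BP, ASSUMED_GENOME_SIZE_BP)
print(f"Estimated coverage: {coverage:.1f}x")
```

Changing either assumed value shifts the estimate proportionally, so the result should be read as a rough check rather than a measured figure.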
These sequences were assembled into contigs using TIGR Assembler.
The annotation process is fully automated and therefore has not undergone extensive human review. Please be advised that these are unverified transcriptional units (or gene models) identified by GlimmerM, the gene-finding software, trained with P. falciparum sequences. In addition, some transcriptional units have been identified by BLAST hits of start-stop translation. A version of Glimmer trained to identify P. yoelii genes is under development. Further, contig editing will likely alter the reading frames of these transcriptional units. Therefore, please use these data with caution.
The preliminary annotation of the 2x DNA assemblies of 2 kb or larger can be viewed. GlimmerM has identified 4,686 transcriptional units and 27 tRNAs. We are in the process of updating the annotation tables with the 5x genome data.
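For readers working with the downloadable contigs, a size cutoff like the 2 kb threshold mentioned above can be applied with a few lines of code. This is a minimal sketch, not the procedure used for the annotation: it assumes the contigs are in FASTA format, and the contig names and sequences in the example are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_min_length(
    records: Iterable[Tuple[str, str]], min_length_bp: int = 2000
) -> List[Tuple[str, str]]:
    """Keep only records whose sequence is at least min_length_bp long."""
    return [(name, seq) for name, seq in records if len(seq) >= min_length_bp]


# Hypothetical input: one 1,500 bp contig and one 2,400 bp contig split over two lines.
example_lines = [">contig_1", "A" * 1500, ">contig_2", "ACGT" * 300, "ACGT" * 300]
kept = filter_by_min_length(parse_fasta(example_lines))
print([name for name, _ in kept])
```

In practice the lines would come from an open file handle rather than an in-memory list; the parser accepts any iterable of strings.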
For comments or questions, send mail to py@tigr.org.