Plasmodium vivax Genome: Update Notes.
We have made some updates to the final release of the Plasmodium vivax
genome assembly and functional annotation. The final genomic sequence
consists of:
The core assembly: this group consists of 51 contigs that have been manually annotated for gene structure
and gene function. The contigs have been sequenced to 10X coverage and, with the
exception of two contigs, they are larger than 10 kb. Click here to see the list of core
contigs and their sizes. Chromosome assignment for these contigs is currently underway, and we hope to have this information available in the near future.
Small contigs:
- Annotated contigs smaller than 10 kb: 2729 small contigs, likely of subtelomeric/telomeric origin, that could not be assembled into larger contigs. BLAST results indicate that these contigs contain Vir gene
candidates and other putative gene candidates (although often truncated due to the short
length of the contigs). For this reason, these contigs have been processed for manual gene
modeling and annotation. There are 293 gene models in this group.
A description of all the datasets available for BLAST searches can be found in the
Sequence search section of the Annotation Data link.
Latest assembly release data: very few changes
Those who are familiar with the previous TIGR release of September 2005 will notice small changes in:
- Number of contigs in the assembly
- Gene identifiers
The number of contigs has changed due to the closing of gaps between
linked contigs, a process called "closure". This process
reduced the number of contigs in the core assembly from 56 to 51. Where
a contig remains unchanged, its genes retain their old identifiers; for contigs
that have been modified, new identifiers have been assigned. The 51 new
and larger contigs are being mapped to their corresponding
chromosomes.
TIGR gene identifiers are of the format "12.m00001", for example, where:
- "12" is a contig id number
- ".m" corresponds to "model", i.e. a complete gene
model
- 5 digits (00001) accommodate all genes in the contig. These are
assigned in consecutive order
from the start to the end of the contig, but as annotation is an ongoing process,
newly identified genes can be added between existing genes if necessary. Thus TIGR identifiers are not
necessarily sequential (newly inserted genes will receive the first available
identifier).
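The identifier format described above can be illustrated with a short Python sketch. This is not TIGR code: the names parse_gene_id, format_gene_id and first_available are hypothetical, and reading "first available identifier" as the lowest unused number on the contig is an assumption made only for illustration.

```python
import re
from typing import NamedTuple

# Matches identifiers such as "12.m00001": contig id, ".m", then exactly five digits.
TIGR_ID = re.compile(r"(\d+)\.m(\d{5})")


class GeneId(NamedTuple):
    contig: int  # contig id number, e.g. 12
    serial: int  # five-digit gene number within the contig, e.g. 1


def parse_gene_id(text: str) -> GeneId:
    """Split an identifier like "12.m00001" into its contig and gene number."""
    match = TIGR_ID.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a TIGR-style gene identifier: {text!r}")
    return GeneId(int(match.group(1)), int(match.group(2)))


def format_gene_id(gene: GeneId) -> str:
    """Rebuild the identifier string, zero-padding the gene number to 5 digits."""
    return f"{gene.contig}.m{gene.serial:05d}"


def first_available(existing: list[str], contig: int) -> str:
    """Return the lowest unused identifier on a contig.

    Assumption: "first available" means the smallest gene number not yet
    in use on that contig; the source does not spell out the exact rule.
    """
    used = {g.serial for g in map(parse_gene_id, existing) if g.contig == contig}
    serial = 1
    while serial in used:
        serial += 1
    return format_gene_id(GeneId(contig, serial))
```

For example, parse_gene_id("12.m00001") yields contig 12 and gene number 1. Because a newly inserted gene takes whatever number is free rather than one matching its position, sorting identifiers numerically does not necessarily reproduce gene order along the contig.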
For public release purposes, we have also assigned locus names to each
gene.
Click here for a list correlating locus names with TIGR gene
identifiers.
For comments or questions, send mail to pv@tigr.org.