We constructed the silkworm gene set from 11,104 full-length cDNAs, 408,172 expressed sequence tags (ESTs), 2,089 publicly available mRNA sequences and 16,625 gene models. An overview of the procedure is shown in the image below. All transcripts other than the gene models were aligned to the silkworm scaffold (Build2) using NCBI-BLAST and est2genome. Based on their aligned positions, the transcripts were grouped into 16,823 gene sites (Gene set A). ESTs that aligned successfully to the genome but could not be assigned to Gene set A (27,102 ESTs) were likewise grouped into 7,240 genes (Gene set B; EST-based genes). The Gene IDs for Gene set B are prefixed with "e". All transcripts that could not be aligned to the genomic sequence were assembled into 6,160 contigs with CLOBB2, and these contigs were named Gene set C.
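As a rough illustration of grouping transcripts by aligned position, the sketch below clusters alignments into gene sites. The exact grouping criterion used for Gene set A is not stated here, so the rule applied (alignments that overlap on the same scaffold and strand belong to one site) is an assumption, as are the `Alignment` record, the `group_into_gene_sites` function and the example identifiers and coordinates.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one transcript aligned to a scaffold."""
    transcript_id: str
    scaffold: str
    strand: str  # "+" or "-"
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


def group_into_gene_sites(alignments):
    """Cluster alignments that overlap on the same scaffold and strand.

    Assumption: positional overlap defines a gene site. The criterion
    actually used to build Gene set A may differ.
    """
    by_location = defaultdict(list)
    for aln in alignments:
        by_location[(aln.scaffold, aln.strand)].append(aln)

    sites = []
    for key in sorted(by_location):
        current, current_end = [], None
        # Sweep left to right; start a new site when a gap appears.
        for aln in sorted(by_location[key], key=lambda a: (a.start, a.end)):
            if current and aln.start > current_end:
                sites.append(current)
                current, current_end = [], None
            current.append(aln)
            current_end = aln.end if current_end is None else max(current_end, aln.end)
        if current:
            sites.append(current)
    return sites


# Made-up example data, for illustration only.
example = [
    Alignment("cdna_1", "scaf1", "+", 100, 500),
    Alignment("est_1", "scaf1", "+", 450, 900),
    Alignment("est_2", "scaf1", "+", 2000, 2300),
    Alignment("mrna_1", "scaf1", "-", 120, 480),
]
sites = group_into_gene_sites(example)
for site in sites:
    print([a.transcript_id for a in site])
```

With this toy input, `cdna_1` and `est_1` overlap on the plus strand and form one site, while `est_2` and the minus-strand `mrna_1` each form their own. Under the same assumption, ESTs that fall outside every such site would be the candidates for a separate EST-based grouping like Gene set B.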
For more information about the analysis results obtained with this gene set, please refer to our paper, "Large scale full-length cDNA sequencing reveals a unique genomic landscape in a lepidopteran model insect, Bombyx mori" (PMID: 23821615).
|Group||Gene ID||Chr.||Chr. start||Chr. end||Scaffold||Sc. start||Sc. end||Direction||Tissue specific gene||Orthology||Ortholog in D.plexippus||Ortholog in H.melpomene|