DCAR V1.0 Gene Prediction

Resource Type: 
Analysis
Name: 
DCAR V1.0 Gene Prediction
Description: 

For gene model prediction, mobile element–related repeats were masked using RepeatMasker. De novo prediction was performed with AUGUSTUS v2.5.5, GENSCAN v1.1.0, and GlimmerHMM v3.0.1, trained using the model-species training sets of Arabidopsis thaliana and Solanum lycopersicum. The protein sequences of S. lycopersicum, Solanum tuberosum, A. thaliana, Brassica rapa, and Oryza sativa were mapped to the carrot genome using TBLASTN (BLAST All 2.2.23) and analyzed with GeneWise version 2.2.0. Carrot ESTs were aligned to the genome using BLAT and analyzed with PASA to detect spliced gene models. RNA-seq reads from 20 DH1 libraries were aligned with TopHat 2.0.9, and transcripts were predicted with Cufflinks. All gene models produced by de novo prediction, protein homology search and prediction, and transcript-based evidence were integrated using GLEAN v1.1. Putative gene functions were assigned using the best BLASTP match to the SwissProt and TrEMBL databases. Gene motifs and domains were determined with InterProScan version 4.7 against the ProDom, PRINTS, Pfam, SMART, PANTHER, and PROSITE protein databases. GO IDs for each gene were obtained from the corresponding InterPro entries. All genes were aligned against KEGG (release 58) proteins.
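The "best BLASTP match" step can be illustrated with a minimal sketch. It assumes 12-column tabular BLAST output (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and assumes "best" means the highest bit score per query; the record does not state which criterion was used. The gene and subject identifiers below are hypothetical placeholders, not real entries.

```python
import csv
import io


def best_hits(tabular_text):
    """Return {query_id: (subject_id, bit_score)} with the top-scoring hit per query.

    Assumption: column 1 is the query, column 2 the subject, and
    column 12 the bit score (standard 12-column tabular BLAST output).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query, subject, bit_score = row[0], row[1], float(row[11])
        # Keep the hit only if it beats the best score seen so far for this query.
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return best


# Hypothetical hits for two hypothetical gene models.
example = (
    "gene_1\tsubj_A\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-100\t350\n"
    "gene_1\tsubj_B\t90.0\t210\t21\t0\t1\t210\t1\t210\t1e-120\t410\n"
    "gene_2\tsubj_C\t70.0\t150\t45\t0\t1\t150\t1\t150\t1e-50\t180\n"
)

for gene, (subject, score) in sorted(best_hits(example).items()):
    print(gene, subject, score)
```

Ranking by E-value instead of bit score would be an equally plausible reading of "best match"; only the comparison inside the loop would change.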

Data from this analysis can be viewed in JBrowse here.

Publication: 
Iorizzo M, Ellison S, Senalik D, Zeng P, Satapoomin P, Huang J, Bowman M, Iovene M, Sanseverino W, Cavagnaro P, Yildiz M, Macko-Podgórni A, Moranska E, Grzebelus E, Grzebelus D, Ashrafi H, Zheng Z, Cheng S, Spooner D, Van Deynze A, Simon P. A high-quality carrot genome assembly provides new insights into carotenoid accumulation and asterid genome evolution. Nature Genetics. 2016 Jun; 48(6):657-66.
Relationship: 
There are 2 relationships.
The analysis, DCAR V1.0 Gene Prediction, is a part of analysis, Carrot Genome Assembly DCARv2.
The analysis, DCAR Gene annotation V1.0 locations on Carrot Genome Assembly DH1 V3.0, derives from analysis, DCAR V1.0 Gene Prediction.
Program, Pipeline, Workflow or Method Name: 
Glean (1.1). model species training
Program Version: 
1.0
Algorithm: 
De novo prediction, protein homology search and prediction, and transcript-based evidence
Date Performed: 
Friday, May 16, 2014 - 00:00
Data Source: 
Source Name: PRJNA285926
Source Version: 1
Source URI: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA285926
Organism: 
Name | Common Name | Comment
Carrot
For a general overview of carrot, see the Carrot Facts Page
Feature: 
There are 160795 exon features
There are 32113 gene features
There are 32113 mRNA features
There are 32113 polypeptide features
Total: 257134
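
As a quick consistency check on the counts above, the following sketch sums the four feature types and derives an average number of exons per mRNA. The average assumes each exon feature belongs to exactly one mRNA, which this record does not state explicitly.

```python
# Feature counts as listed in this record.
counts = {
    "exon": 160795,
    "gene": 32113,
    "mRNA": 32113,
    "polypeptide": 32113,
}

# The sum of all feature types should match the reported total.
total = sum(counts.values())
print(total)  # 257134

# Mean exons per mRNA, assuming each exon is attached to a single mRNA.
exons_per_mrna = counts["exon"] / counts["mRNA"]
print(round(exons_per_mrna, 2))  # 5.01
```

The equal gene, mRNA, and polypeptide counts are consistent with one transcript and one protein product per gene model in this annotation.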