Carrot (Daucus carota L.), a member of the genus Daucus, is one of the most economically important crops in the Apiaceae family (Grzebelus et al. 2014). Carrots are grown on more than one million hectares in temperate regions worldwide (Grzebelus et al. 2014) and provide provitamin A and fiber, along with antiulcer, anti-aging and antioxidant properties. Black carrot is of interest as a source of natural colorants and fiber. Taste, high nutritive value, good storage life and relatively low cost have made carrot a popular vegetable with consumers (Simon et al. 2019). The relatively slow growth of carrots in the field restricts the breeding cycle to one season per year. Moreover, carrot is susceptible to several disease-causing pathogens, including bacteria, fungi, viruses and nematodes, which significantly reduce yields. Molecular markers can facilitate the selection and combination of multiple favorable traits, allowing faster progress in breeding programs. Additionally, using natural or innate resistance conferred by resistance (R) genes is an economical and sustainable way to prevent or manage carrot disease. To advance marker-assisted selection (MAS) and the incorporation or gene editing of R genes, a high-quality genome is critical (Simon 2019a).
The first carrot genome assembly, ‘Double Haploid Orange Nantes Type (DH1) carrot genome v2’, was published in 2016 (Iorizzo et al. 2016) and was developed primarily using second-generation sequencing technologies such as Illumina. While robust, it needs improvement to raise its low contig-level contiguity and to increase the fraction of the genome anchored at the chromosome level. Third-generation sequencing and scaffolding technologies can generate long DNA reads and span long physical distances, providing an opportunity to improve the carrot genome assembly. The objectives of this study were to use third-generation sequencing and scaffolding technologies to create an improved version 3 of the DH1 carrot genome, and to predict R genes in both the v2 and v3 genomes so that the results could be compared between genome versions.
The v3 carrot assembly comprises 692 contigs with a total length of 438,921,773 bp and an N50 of 4,945,074 bp, accounting for 92.8% of the estimated genome size. An additional 24 Mb of sequence was anchored at the chromosome level, and the contig N50 increased 159-fold compared with the published DH1 v2 genome assembly. With fewer but longer contigs, scaffolds and super-scaffolds, the accuracy and contiguity of the whole genome increased substantially. In the v3 genome, over 300 more R genes and over 3,500 more R gene domains were predicted than in the v2 genome. The percentage of R genes located on the nine chromosomes of carrot also increased slightly, from 98.3% to 99.4%, and short R genes were predicted. The R genes in the v3 genome contained more domains on average and had better continuity than those in the v2 genome. Finally, a pairwise comparison of the two genome versions showed that the v3 genome contains several new R genes.
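For readers unfamiliar with the contiguity statistics quoted above, the following is a minimal Python sketch of how a contig N50 is conventionally computed, followed by back-of-the-envelope arithmetic on the reported figures. It is not the pipeline used in this study: the function name `n50` and the toy contig lengths are illustrative assumptions, and the derived values are only approximate because they are back-calculated from the rounded 92.8% and 159-fold figures.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths, invented for illustration only (not carrot data).
toy_contigs = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Figures quoted in the text above.
v3_total_bp = 438_921_773
v3_contig_n50 = 4_945_074
fraction_of_estimate = 0.928   # 92.8% of the estimated genome size
fold_increase = 159            # v3 contig N50 relative to v2

# Approximate values implied by the rounded figures above.
implied_genome_size_bp = v3_total_bp / fraction_of_estimate
implied_v2_contig_n50 = v3_contig_n50 / fold_increase

print(f"toy N50: {toy_n50} bp")
print(f"implied estimated genome size: {implied_genome_size_bp / 1e6:.0f} Mb")
print(f"implied v2 contig N50: {implied_v2_contig_n50 / 1e3:.1f} kb")
```

Under these assumptions, the reported numbers imply an estimated genome size of roughly 473 Mb and a v2 contig N50 on the order of 31 kb.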