Development Of A Galaxy Workflow For SNP Detection In Grapevine And Poplar Whole Genome Illumina Resequencing Data

Poster by Nathalie Choisne, Marc Bras, Nacer Mohellibi, Sandie Arnoux, Hadi Quesneville, Juliette Goarin, Jean-Michel Boursiquot, Vanina Guérin, Marie-Christine Le Paslier, Aurélie Bérard, Stéphane Schlub, Dominique Brunel, Rémi Bounon, Frédérique Bitton, Patricia Faivre-Rampant, and Anne-Françoise Adam-Blondon, all of INRA.

Presented at PAG 2011.


Large SNP discovery projects are under way in poplar and grapevine using Illumina sequencing technology. In grapevine, 30 genotypes from different Vitis species are currently being resequenced, and the reads obtained will be aligned to the grapevine reference genome sequence. In poplar, resequencing of P. nigra is divided into two steps. The first is the deep resequencing of a few individuals (i) to construct a reference genome for the species and (ii) to identify SNPs. The second consists of resequencing several genotypes at low coverage (2x) to maximize SNP discovery. Library construction and paired-end sequencing (2x75 bp and 2x100 bp) on the GAIIx were performed by the EPGV group and the CNG (Centre National de Génotypage, Evry, France) Biological Resources and Sequencing platforms. Sequencing data are being analysed using MAPHiTS (Mapping Analysis Pipeline for High-Throughput Sequences), a pipeline for SNP detection developed by the URGI platform using the Galaxy workflow manager. The MAPHiTS pipeline currently runs the following public tools: BWA, SAMtools, Tablet and VarScan. The MAPHiTS workflow can deliver all SNPs and small indels found in the data set and filter them according to various parameters such as genome coverage, allele frequency and p-value. Preliminary results concerning genome coverage and SNP identification will be discussed.
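The filtering step described in the abstract (coverage, allele frequency and p-value) can be illustrated with a minimal Python sketch. This is not the MAPHiTS implementation: the Variant record, the filter_variants function, the threshold values and the example records are all hypothetical, chosen only to show how the three criteria could be combined.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are illustrative assumptions."""
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int        # read coverage at the site
    alt_freq: float   # fraction of reads supporting the alternate allele (0 to 1)
    p_value: float    # significance of the variant call


def filter_variants(
    variants: Iterable[Variant],
    min_depth: int = 8,          # assumed threshold, not taken from the poster
    min_alt_freq: float = 0.2,   # assumed threshold, not taken from the poster
    max_p_value: float = 0.01,   # assumed threshold, not taken from the poster
) -> List[Variant]:
    """Keep variants that pass all three criteria at once."""
    return [
        v for v in variants
        if v.depth >= min_depth
        and v.alt_freq >= min_alt_freq
        and v.p_value <= max_p_value
    ]


# Made-up records for illustration only; they are not results from the study.
example_calls = [
    Variant("chr1", 1042, "A", "G", depth=25, alt_freq=0.48, p_value=1e-6),
    Variant("chr1", 2310, "C", "T", depth=3, alt_freq=0.67, p_value=0.04),
    Variant("chr2", 587, "G", "A", depth=40, alt_freq=0.05, p_value=0.20),
]

kept = filter_variants(example_calls)
for v in kept:
    print(v.chrom, v.pos, v.ref, v.alt, v.depth, v.alt_freq, v.p_value)
```

With low-coverage data such as the 2x poplar resequencing mentioned above, a depth threshold would need to be set much lower than in deep resequencing, which is one reason to keep the thresholds as parameters rather than constants.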

