The recent advent of high-throughput sequencing and genotyping technologies (Next Generation Sequencing, NGS) enables the comparison of patterns of polymorphism at a very large number of markers, which makes it possible to characterize the genomic regions involved in the adaptation of organisms to their environment. However, current statistical approaches for identifying genomic signatures of selection are limited by the over-simplified demo-genetic models on which they rely. Furthermore, these methods generally neglect the information provided by linkage disequilibrium (LD) among genetic markers.
The main objective of the PhD is therefore to propose new methodological developments for detecting signatures of selection, in a Bayesian framework, along two main axes:
- to propose a better modeling of the demo-genetic history of populations, by improving existing methods that rely on an island model at migration-drift equilibrium (Vitalis et al., 2014) or on the explicit modeling of the divergence history of populations (Gautier and Vitalis, 2013). An alternative approach will consist in estimating the correlation structure of allele frequencies among populations (Guillot et al., 2014).
- to make better use of the information provided by the spatial organization of markers (LD) and the haplotype structure of populations. This will involve, e.g., integrating the spatial dependency among markers into the models (using autoregressive models or hidden Markov models), or analyzing phased data (obtained by reconstructing haplotypes with unsupervised classification methods) and then treating haplotype blocks as multi-allelic markers.
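As an illustration of the last idea in the second axis, recoding haplotype blocks as multi-allelic markers could be sketched as follows. This is only a minimal sketch under simplifying assumptions: haplotypes are already phased and encoded as strings of '0'/'1', blocks are fixed-width windows of consecutive SNPs (real block boundaries would more likely be defined from LD), and the function names `recode_blocks` and `allele_counts` are hypothetical and do not come from any of the cited software.

```python
from collections import Counter


def recode_blocks(haplotypes, block_size):
    """Recode phased haplotypes into multi-allelic block markers.

    haplotypes: list of equal-length strings of '0'/'1', one per phased chromosome.
    block_size: number of consecutive SNPs per block (assumed fixed-width here).
    Returns one list per block, holding an integer allele label per haplotype.
    """
    if not haplotypes:
        return []
    n_snps = len(haplotypes[0])
    if any(len(h) != n_snps for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    blocks = []
    for start in range(0, n_snps, block_size):
        labels = {}   # distinct local haplotype -> allele label (order of first appearance)
        alleles = []
        for h in haplotypes:
            segment = h[start:start + block_size]
            if segment not in labels:
                labels[segment] = len(labels)
            alleles.append(labels[segment])
        blocks.append(alleles)
    return blocks


def allele_counts(block):
    """Count how many haplotypes carry each allele of a block marker."""
    return dict(Counter(block))


# Toy data: four phased chromosomes typed at four SNPs, split into two blocks.
haps = ["0011", "0011", "0110", "1110"]
blocks = recode_blocks(haps, 2)
counts_first_block = allele_counts(blocks[0])
```

In this toy example the first block carries three distinct local haplotypes and therefore behaves as a tri-allelic marker, whose allele counts could then be analyzed per population in place of single-SNP counts.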
These new methodological developments will be directly applied to NGS data (Pool-seq) obtained within the European BiodivERsA EXOTIC project, which aims to characterize the genetic basis of adaptation during the invasion of a flagship species: the harlequin ladybird Harmonia axyridis (Lombaert et al., 2014). These data, already available, will be used to contrast the genomic characteristics of native populations and invasive populations on a global scale.
Gautier M & Vitalis R (2012). An R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics, 28: 1176-1177.
Gautier M & Vitalis R (2013). Inferring population histories using genome-wide allele frequency data. Molecular Biology and Evolution, 30: 654-668.
Guillot G, Vitalis R, le Rouzic A & Gautier M (2014). Detecting correlation between allele frequencies and environmental variables as a signature of selection. A fast computational approach for genome-wide studies. Spatial Statistics, 8: 145-155.
Lombaert E, Guillemaud T, Lundgren J, Koch R, Facon B, Grez A, Loomans A, Malausa T, Nedved O, Rhule E, Staverlokk A, Steenberg T & Estoup A (2014). Complementarity of statistical treatments to reconstruct worldwide routes of invasion: the case of the Asian ladybird Harmonia axyridis. Molecular Ecology, 23: 5931-6205.