PhD position on comparative pangenomics

A PhD position is open at the LABGeM (Laboratory of Bioinformatics Analyses for Genomics and Metabolism). Our team is one of the bioinformatics labs of the Genoscope (CEA, French Genome Institute), located in Evry (35 km south of Paris). Our areas of expertise are the design of bioinformatics methods and information systems for the exploration of microbial genomics data.

This thesis is funded by the CEA Phare PhD program, with a gross monthly salary of €2,040 in years 1 and 2, rising to €2,100 in year 3 (CFR funding).

To apply: Interested candidates should send their CV and a cover letter to Alexandra Calteau (acalteau@genoscope.cns.fr) and David Vallenet (vallenet@genoscope.cns.fr).

Description of the PhD project:

Comparative analysis of pangenomes: from genome plasticity to the metabolic diversity of the microbial world

Recent years have seen an explosion of sequencing projects, leading to a deluge of several hundred thousand genomes available in public sequence databases. Comparative genomics approaches in microbiology now use thousands of genomes to analyze a single species. Indeed, many studies focus on the overall gene content of a species (the pangenome) to understand its evolution in terms of common genes (“core-genome”) and accessory genes (“variable-genome”) in the light of epidemiological or environmental data [1]. Nevertheless, processing this mass of data requires a paradigm shift in knowledge representation and in the algorithms used [2,3]. In this context, our laboratory is working on a new way of structuring genomic data in the form of a pangenome graph. This representation compresses the information from thousands of genomes while preserving the chromosomal organization of the genes.
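To make the idea of a pangenome graph concrete, here is a minimal sketch in Python. It is an illustration only and not the laboratory's actual data model: it assumes each genome is already given as an ordered list of gene-family identifiers, and the function names (build_pangenome_graph, partition_families) and the toy strains are hypothetical. Nodes are gene families, edges link families that are chromosomal neighbours in at least one genome, and a family is called "core" only if it occurs in every genome (a deliberately strict definition).

```python
from collections import Counter


def build_pangenome_graph(genomes):
    """Build a toy pangenome graph.

    genomes: dict mapping a genome name to its ordered list of
    gene-family identifiers (assumed input format).
    Returns two Counters: genomes per family (nodes) and genomes
    per adjacency between two families (edges).
    """
    node_count = Counter()
    edge_count = Counter()
    for families in genomes.values():
        # Count each family once per genome, even if duplicated.
        node_count.update(set(families))
        # Undirected adjacencies between consecutive genes.
        edges = {
            frozenset((a, b))
            for a, b in zip(families, families[1:])
            if a != b
        }
        edge_count.update(edges)
    return node_count, edge_count


def partition_families(node_count, n_genomes):
    """Split families into core (present in all genomes) and variable."""
    core = {fam for fam, n in node_count.items() if n == n_genomes}
    variable = set(node_count) - core
    return core, variable


# Hypothetical toy data: three strains of one species.
genomes = {
    "strainA": ["f1", "f2", "f3", "f4"],
    "strainB": ["f1", "f2", "f5", "f3", "f4"],
    "strainC": ["f1", "f2", "f3", "f4", "f6"],
}

node_count, edge_count = build_pangenome_graph(genomes)
core, variable = partition_families(node_count, len(genomes))
```

In this toy example the three genomes collapse into six nodes, and the edge weights record how many strains share each gene neighbourhood, which is how the graph keeps chromosomal organization while compressing redundant content.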

The aim of this PhD thesis is to carry out methodological developments for the comparative study of pangenomes. Firstly, a tool will be developed to detect variable regions in a pangenome graph and to describe them in terms of functional submodules. These regions of genomic plasticity (RGPs) include both regions that are exchanged between strains by horizontal gene transfer (e.g. genomic islands) and regions lost differentially among lineages. They are of paramount importance for understanding the adaptive potential of bacteria. Secondly, methodological developments will allow inter-pangenome comparisons and, thus, the exploration of RGP dynamics between different species. The algorithms and tools developed during this thesis will be applied to study different bacterial groups of medical, agronomic or biotechnological interest, such as actinobacteria, firmicutes or enterobacteria, for which a large amount of data is available. Particular attention will be given to the functional analysis of the RGPs with regard to the metabolism of organisms, in terms of secondary metabolite production or catabolic pathways.
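As a rough illustration of what detecting variable regions can mean in practice, the following Python sketch reports maximal runs of consecutive non-core genes along a single genome. This is a naive stand-in chosen for clarity, not the method to be developed in the thesis: the function name find_variable_runs, the min_len threshold and the toy gene identifiers are all assumptions, and the genome is treated as a linear sequence (circular chromosomes are not handled).

```python
def find_variable_runs(families, core, min_len=2):
    """Return maximal runs of consecutive non-core gene families.

    families: ordered gene-family identifiers along one genome.
    core: set of families considered core.
    min_len: hypothetical minimum run length to report a region.
    """
    runs = []
    current = []
    for fam in families:
        if fam in core:
            # A core gene closes the current variable run, if any.
            if len(current) >= min_len:
                runs.append(current)
            current = []
        else:
            current.append(fam)
    # Flush a run that reaches the end of the sequence.
    if len(current) >= min_len:
        runs.append(current)
    return runs


# Hypothetical toy genome: "c" = core family, "v" = variable family.
toy_core = {"c1", "c2", "c3"}
toy_genome = ["c1", "v1", "v2", "v3", "c2", "v4", "c3", "v5", "v6"]
candidate_regions = find_variable_runs(toy_genome, toy_core)
```

With the default threshold, the isolated variable gene v4 is ignored and two candidate regions are returned (v1 to v3, and v5 to v6). A real tool would additionally have to tolerate interspersed core genes, work on the graph rather than on one genome at a time, and annotate each region functionally.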

This work will benefit from the developments and integrated tools of the MicroScope platform as well as the expertise of the LABGeM team on microbial metabolism [4,5]. The tools developed in the context of the thesis will be made available within the MicroScope platform to meet the analysis needs of academic and industrial partners. One of the original aspects of this thesis work lies in its pangenomic approach to comparative genomics, which makes it possible to address one of the challenges of bioanalysis in the era of big data in biology.

References:

1. Rouli L, Merhej V, Fournier P-E, Raoult D. The bacterial pangenome as a new tool for analysing pathogenic bacteria. New Microbes New Infect. 2015;7: 72–85.

2. Chan AP, Sutton G, DePew J, Krishnakumar R, Choi Y, Huang X-Z, et al. A novel method of consensus pan-chromosome assembly and large-scale comparative analysis reveal the highly flexible pan-genome of Acinetobacter baumannii. Genome Biol. 2015;16: 143.

3. Computational Pan-Genomics Consortium. Computational pan-genomics: status, promises and challenges. Brief Bioinform. 2016; doi:10.1093/bib/bbw089

4. Vallenet D, Calteau A, Cruveiller S, Gachet M, Lajus A, Josso A, et al. MicroScope in 2017: an expanding and evolving integrated resource for community expertise of microbial genomes. Nucleic Acids Res. 2017;45: D517–D528.

5. Médigue C, Calteau A, Cruveiller S, Gachet M, Gautreau G, Josso A, et al. MicroScope-an integrated resource for community expertise of gene functions and comparative analysis of microbial genomic and metabolic data. Brief Bioinform. 2017; doi:10.1093/bib/bbx113
