Ph.D opportunity: Comparative analysis of archaeal meta-pangenomes: linking genomic diversity of species to ecosystems

We are looking for an enthusiastic Ph.D student to work on pangenomes of Archaea. A description of the project follows.

Natural populations are, in the majority of cases, composed of individuals that are not genetically identical to each other. In the microbial world, variation between individuals appears both as divergence at the single-nucleotide level and as the presence of hypervariable genomic islands within a more stable set of genes shared by multiple individuals. The total pool of genetic material carried by all members of a species is referred to as the ‘pangenome’ [1, 2]. It consists of the core/persistent genome, which is common to almost all members of a species, plus the flexible/variable genome, which is present in only some members of the species.
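As a rough illustration of the core/flexible distinction, the sketch below splits gene families by how many genomes carry them. It assumes a simple frequency cutoff (the `core_threshold` parameter, the function name and the toy gene content are all hypothetical); it is not the partitioned pangenome graph approach of PPanGGOLiN [9].

```python
from collections import Counter


def partition_pangenome(genomes, core_threshold=0.95):
    """Split gene families into core and flexible sets by frequency.

    genomes: dict mapping a genome name to the set of gene families it carries.
    core_threshold: assumed cutoff; a family found in at least this fraction
    of genomes is called core, everything else is called flexible.
    """
    n_genomes = len(genomes)
    if n_genomes == 0:
        raise ValueError("need at least one genome")
    # Count in how many genomes each gene family occurs.
    counts = Counter(fam for fams in genomes.values() for fam in set(fams))
    core = {fam for fam, c in counts.items() if c / n_genomes >= core_threshold}
    flexible = set(counts) - core
    return core, flexible


# Toy, made-up gene content for three genomes of one species.
genomes = {
    "MAG_1": {"rpoB", "gyrA", "cas1"},
    "MAG_2": {"rpoB", "gyrA"},
    "MAG_3": {"rpoB", "gyrA", "amoA"},
}
core, flexible = partition_pangenome(genomes)
print(sorted(core))      # ['gyrA', 'rpoB']
print(sorted(flexible))  # ['amoA', 'cas1']
```

With only three genomes, any family missing from a single genome falls below the cutoff, which is one reason a larger number of genomes per species is needed for a meaningful partition.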

Genome-resolved metagenomics, in which shotgun sequencing of environmental DNA is assembled and binned into draft genomes, has profoundly reshaped our understanding of the distribution, functionalities and roles of Archaea. Within the domain, the major supergroups are the Euryarchaeota, which include many methanogens; the TACK, which includes the Thaumarchaeota that contribute to ammonia oxidation in soils and the ocean; the Asgard, which includes lineages inferred to be ancestral to eukaryotes; and the DPANN, a group of mostly symbiotic small-celled archaea. These archaea are not restricted to extreme habitats, but are widely distributed in diverse ecosystems [3–5].

Archaea phylogeny
Phylogenetic tree of the 1,179 representative genomes of Archaea. The maximum-likelihood tree was calculated based on the concatenation of 14 ribosomal proteins (L2, L3, L4, L5, L6, L14, L15, L18, L22, L24, S3, S8, S17, and S19) using the LG plus gamma model of evolution. Scale bar indicates the average substitutions per site.

However, there has been only limited analysis of the extent of heterogeneity in gene content within archaeal species [6, 7]. The wealth of metagenome-assembled genomes (MAGs) gives access to gene content heterogeneity within environmental populations of uncultivated archaea. In fact, 34 species-level groups of Archaea, as defined by the Genome Taxonomy Database [8], contain more than 10 distinct genomes, a number that has been shown to be sufficient to define pangenomes and detect genomic islands using PPanGGOLiN and panRGP, two tools that we recently developed in our lab [9, 10].

The aim of this PhD thesis is to leverage the hundred thousand available MAGs, together with our recent methodological developments, for the comparative study of meta-pangenomes in Archaea. First, we propose to systematically analyze the pangenomes of archaeal species. Particular attention will be given to the functional analysis of the genomic islands with regard to the biological capacities of the organisms, in terms of defense systems and metabolic processes. The description of pangenomes from several archaeal species will then allow inter-pangenome comparisons and, thus, the exploration of the dynamics of genomic islands between different species of Archaea. Future discoveries will benefit from further functional characterization by biochemists at our institute.

The second part of the PhD project will consist of a meta-pangenomic approach to track the spatio-temporal distribution and abundance of genomes that belong to the same species, using read recruitment from metagenomic projects [11]. We plan to add available physical and chemical parameters from the sampling sites and to perform correlation analysis between these environmental parameters and the genome abundances. The most interesting species to investigate will be selected in part based on the metabolic capacities defined in the first aim of the project. We anticipate that this will yield unique insights into the functional basis of microbial niche partitioning and the fitness of archaeal species.
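To make the planned correlation analysis concrete, here is a minimal sketch that computes a Spearman rank correlation between genome abundances and one environmental parameter across samples. The choice of Spearman correlation, the function names and the values are assumptions made for illustration; the project does not specify a statistic, and the numbers are not real data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values: relative abundance of one genome
# (e.g. from read recruitment) and water temperature at each site.
abundance = [0.10, 0.40, 0.35, 0.80, 0.90]
temperature = [4.0, 10.0, 9.0, 18.0, 21.0]
rho = spearman(abundance, temperature)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based statistic is used here because abundances are rarely normally distributed; with many genomes and parameters, the resulting p-values would also need correction for multiple testing.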

For more information, you may contact Raphaël Méheust (raphael.meheust@genoscope.cns.fr) and David Vallenet (vallenet@genoscope.cns.fr). The position will be located at the Genoscope in Evry.

References

1. Medini D, Donati C, Tettelin H, Masignani V, Rappuoli R. The microbial pan-genome. Curr Opin Genet Dev 2005; 15: 589–594.

2. Tettelin H, Masignani V, Cieslewicz MJ, Donati C, Medini D, Ward NL, et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial ‘pan-genome’. Proc Natl Acad Sci U S A 2005; 102: 13950–13955.

3. Adam PS, Borrel G, Brochier-Armanet C, Gribaldo S. The growing tree of Archaea: new perspectives on their diversity, evolution and ecology. ISME J 2017; 11: 2407–2425.

4. Spang A, Caceres EF, Ettema TJG. Genomic exploration of the diversity, ecology, and evolution of the archaeal domain of life. Science 2017; 357.

5. Baker BJ, De Anda V, Seitz KW, Dombrowski N, Santoro AE, Lloyd KG. Diversity, ecology and evolution of Archaea. Nat Microbiol 2020; 5: 887–900.

6. Deschamps P, Zivanovic Y, Moreira D, Rodriguez-Valera F, López-García P. Pangenome evidence for extensive interdomain horizontal transfer affecting lineage core and shell genes in uncultured planktonic thaumarchaeota and euryarchaeota. Genome Biol Evol 2014; 6: 1549–1563.

7. Tschitschko B, Erdmann S, DeMaere MZ, Roux S, Panwar P, Allen MA, et al. Genomic variation and biogeography of Antarctic haloarchaea. Microbiome 2018; 6: 113.

8. Parks DH, Chuvochina M, Chaumeil P-A, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol 2020; 38: 1079–1086.

9. Gautreau G, Bazin A, Gachet M, Planel R, Burlot L, Dubois M, et al. PPanGGOLiN: Depicting microbial diversity via a partitioned pangenome graph. PLoS Comput Biol 2020; 16: e1007732.

10. Bazin A, Gautreau G, Médigue C, Vallenet D, Calteau A. panRGP: a pangenome-based method to predict genomic islands and explore their diversity. Bioinformatics 2020; 36: i651–i658.

11. Delmont TO, Kiefl E, Kilinc O, Esen OC, Uysal I, Rappé MS, et al. Single-amino acid variants reveal evolutionary processes that shape the biogeography of a global SAR11 subclade. Elife 2019; 8.
