A major challenge in modern biology is the discovery of in vivo metabolic or physiological functions of unknown proteins. Our institute has developed an integrated strategy based on in silico prediction of enzymatic activities and in vitro screening of enzymes
MetX and MetA are two phylogenetically unrelated protein families, but both are implied in the first step of the L-methionine biosynthesis pathway. MetX proteins are known as L-homoserine O-acetyltransferases (HAT), while MetA proteins are generally annotated as L-homoserine O-succinyltransferase (HST).
We developed a method to classify proteins of a family based on their conserved genomic contexts. Each protein genomic context is compared against all others to determine syntenies. A graph is generated where nodes are input proteins connected by edges
Nearly 35% of the proteins from large-scale sequencing of microbial genomes are annotated with unknown function. Our objective is to explore, by analogy, active sites from proteins of unknown function using 3D tools in order to suggest enzymatic activities. The
There is a massive amount of sequence and structural data available, and the accumulation rate exceeds the pace of functional studies. One way to enhance functional assessment is to mine the available data to inform a strategy that can be
The proportion of protein sequences of unknown function in public databases stills very important (42% of UniProt sequences are labelled as “hypothetical”, “uncharacterized”, “unknown” or “putative”). On the other hand, a number of enzyme activities (about 30%) remain orphan (i.e.
Genome-resolved metagenomics have profoundly reshaped our understanding of the distribution, functionalities and roles of Archaea. Within the domain, major supergroups are Euryarchaeota, which includes many methanogens, the TACK, which includes Thaumarchaeaota that impact ammonia oxidation in soils and the ocean,
Horizontal gene transfer (HGT) is a major source of variability in prokaryotic genomes. Regions of Genome Plasticity (RGPs) are clusters of genes located in highly variable genomic regions. Most of them arise from HGT and correspond to Genomic Islands (GIs).