Nearly 35% of the proteins identified by large-scale sequencing of microbial genomes are annotated as having unknown function. Our objective is to explore the active sites of proteins of unknown function with 3D tools, by analogy with known active sites, in order to suggest enzymatic activities. The first stage is the creation of a database of potential active sites, generated by predictive software that locates the active sites of an enzyme family and classifies that family into sub-families. The second stage combines: (i) the building of a second database containing active sites from enzymes of known function; and (ii) the comparison of these two databases using a 3D pattern search tool. The third stage validates the predictions made by the software by running docking simulations of potential substrates on representatives of the protein families. Finally, the pipeline is validated on two families of known function and tested on nine families of unknown function. The protocol highlights new aldolase and mutarotase activities for two families.
These enzymatic activities will soon be experimentally tested by a team from the UMR.
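To illustrate the kind of comparison performed in the second stage, here is a minimal Python sketch of a 3D pattern match between two active sites. It is not the tool used in this project, which the abstract does not name. It assumes, purely for illustration, that an active site is reduced to a list of (residue type, coordinates) pairs and that two sites match when their residue types agree and all inter-residue distances agree within a tolerance. The function names, the 1.0 Å tolerance and the coordinates are hypothetical.

```python
import math
from itertools import combinations, permutations


def pairwise_distances(site):
    """Distances between every pair of residues, in index order."""
    return [math.dist(a[1], b[1]) for a, b in combinations(site, 2)]


def sites_match(query, template, tol=1.0):
    """Return True if some ordering of the query residues has the same
    residue types as the template and all pairwise distances within
    `tol` angstroms (hypothetical default) of the template's."""
    if len(query) != len(template):
        return False
    template_types = [res_type for res_type, _ in template]
    template_dists = pairwise_distances(template)
    # Brute force over residue orderings: fine for sites of a few residues.
    for candidate in permutations(query):
        if [res_type for res_type, _ in candidate] != template_types:
            continue
        dists = pairwise_distances(candidate)
        if all(abs(d - t) <= tol for d, t in zip(dists, template_dists)):
            return True
    return False


# Made-up coordinates (angstroms) for a three-residue template.
template = [("HIS", (0.0, 0.0, 0.0)),
            ("ASP", (3.0, 0.0, 0.0)),
            ("SER", (0.0, 4.0, 0.0))]

# Same geometry, translated and listed in a different order.
query_similar = [("SER", (10.0, 14.0, 10.0)),
                 ("HIS", (10.0, 10.0, 10.0)),
                 ("ASP", (13.0, 10.0, 10.0))]

# Same residue types, but the serine sits much farther away.
query_different = [("SER", (10.0, 18.0, 10.0)),
                   ("HIS", (10.0, 10.0, 10.0)),
                   ("ASP", (13.0, 10.0, 10.0))]

print(sites_match(query_similar, template))
print(sites_match(query_different, template))
```

Comparing distances rather than raw coordinates makes the match independent of how each protein is positioned and oriented, but it cannot tell a site from its mirror image; a real 3D pattern search tool would typically add a superposition step to handle that.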
People: Karine Bastard, Jordan Langlois, Khaoula Jlassi
Keywords: Enzymatic activities, Protein families, Database, Analogy of active sites