Research Repository

Entabolons: how metabolites modify the biochemical function of proteins and cause the correlated behavior of proteins in pathways. [Dataset]

Contributors

Jeffrey Skolnick
Data Collector

Bharath Srinivasan
Data Collector

Samuel Skolnick
Data Collector

Brice Edelman
Data Collector

Hongyi Zhou
Data Collector

Abstract

The interior of a cell is a remarkably crowded environment awash with many different types of molecules. For example, the molar concentration of proteins in a HeLa cell is ∼1 mM, while the total cellular concentration of free metabolites is 200 mM–300 mM; at least 31 metabolites have a concentration above 1 mM. Thus, unlike the situation for human-designed drugs where nM activities are desired, there is an excess of many metabolites relative to the total number of cellular proteins. Moreover, the human body likely contains over 100,000 distinct metabolites. Given the energy requirements to maintain such a molecular inventory, their presence seems to be a necessary component of a living system. Indeed, metabolites play important roles in transcription, cellular signaling including mitochondrial nuclear communication, epigenetic regulation, phospholipid homeostasis, regulation of the immune response, and enzymatic activity. Furthermore, metabolite dysregulation is a biomarker of many diseases. Thus, metabolites are not passive but actively participate in many aspects of cellular function. The recognition of the importance of metabolites in living systems spawned metabolomics, which evolved from compiling an inventory of metabolites accompanied by a purely phenomenological description of their behavior to elucidating the mechanisms by which they accomplish their biochemical and ultimately phenotypical function. Yet, many questions remain, and a general mechanistic molecular characterization of what they do in cells is lacking. Having such insight could improve the understanding of how biology works and ultimately lead to new diagnostic and therapeutic approaches.

Citation

SKOLNICK, J., SRINIVASAN, B., SKOLNICK, S., EDELMAN, B. and ZHOU, H. 2025. Entabolons: how metabolites modify the biochemical function of proteins and cause the correlated behavior of proteins in pathways. [Dataset]. Journal of chemical information and modeling [online], ASAP articles. Available from: https://pubs.acs.org/doi/10.1021/acs.jcim.5c00462?goto=supporting-info

Acceptance Date May 7, 2025
Online Publication Date May 16, 2025
Deposit Date May 29, 2025
Publicly Available Date May 29, 2025
Publisher ACS Publications
DOI https://doi.org/10.1021/acs.jcim.5c00462
Keywords Ligands; Metabolism; Monomers; Oligomers; Peptides and proteins
Public URL https://rgu-repository.worktribe.com/output/2849119
Publisher URL https://pubs.acs.org/doi/10.1021/acs.jcim.5c00462?goto=supporting-info
Related Public URLs https://rgu-repository.worktribe.com/output/2849091 (Supplementary data associated with this journal output)
Type of Data ZIP folder containing TXT files
Collection Date Mar 5, 2025
Collection Method A library of 770 human metabolites bound to PDB structures was collected. The metabolites form a diverse collection, ranging from lipids to amino acids to essential metabolites such as ATP, ADP, AMP, or GTP. All pockets are identified using the CAVITATOR pocket detection algorithm, which is capable of finding the pockets in which 97% of ligands in the PDB are bound. Using CAVITATOR, we then constructed a template library of 91,269 pockets in which the 770 types of human metabolites are bound.

LIGMAP, our ligand-binding prediction algorithm, works as follows: the entire pocket library is structurally aligned against the largest pocket in the target protein, as identified by CAVITATOR, using the APoc pocket alignment algorithm. We align the entire pocket containing the metabolite against the entire pocket in the target's structure. Empirically, we found that this provides the best precision and recall in small molecule screening.

We next considered the case where one predicted binding metabolite is a COLIG of another ligand bound to the native structure. To be considered, the native metabolite must have at least 10 Cα contacts with its binding protein. Then, to identify the partner metabolite, the standard LIGMAP algorithm is run with a sequence identity cutoff of 0.99, as the goal is to find examples of singly bound native ligands in the PDB. The predicted metabolite's pose must have at least 10 contacts with the protein Cαs. We then count the number of repulsive Cα contacts, defined as cases where a ligand het atom is within 3 Å of any Cα atom. If there are more than four such contacts, the predicted metabolite is rejected. Similarly, the predicted binding pose of the metabolite cannot have more than four overlaps with the native bound ligand. To be accepted, the metabolite must also have at least 30 interactions with the native ligand at distances >3 Å and <4.5 Å; i.e., it contacts the native ligand. If all these conditions are satisfied, the native-predicted metabolite pair is a COLIG.

A library of 28,896 human dimeric proteins was clustered at 80% sequence identity, giving 1106 dimer interfacial pockets. These are labeled "inter" for the purpose of this analysis. Interfacial pockets are pockets that are present in the unbound monomers, are absent in the dimer, and contain at least 10 buried residues in the dimer interface. Interface-adjacent pockets, termed "adj", are pockets that are not found in the isolated monomers but occur adjacent to the protein–protein dimeric interface and contain at least five residues from each of the two chains.

For each protein in a given pathway, LIGMAP is used to predict which human metabolites, if any, it is likely to bind. Two proteins have a metabolic contact if they bind the same metabolite. We first consider the largest set of proteins that share a common binding metabolite. To identify metabolites that likely disrupt protein–DNA binding, for a given PDB structure that contains a protein bound to DNA, we compiled a list of proteins that are at least 50% identical to the target protein but that do not have DNA bound.
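The COLIG acceptance thresholds and the metabolic-contact definition described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the contact counts have already been computed by some upstream geometry step, and the names `PoseCounts`, `is_colig`, `proteins_by_metabolite`, `metabolic_contacts`, and `largest_shared_set`, as well as the input layout (a mapping from protein identifier to its set of predicted metabolites), are hypothetical choices made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class PoseCounts:
    """Precomputed counts for one native/predicted metabolite pair (hypothetical container)."""
    native_ca_contacts: int      # Cα contacts of the native metabolite with its protein
    predicted_ca_contacts: int   # Cα contacts of the predicted metabolite's pose
    repulsive_ca_contacts: int   # ligand het atoms within 3 Å of any Cα atom
    native_overlaps: int         # overlaps between the predicted pose and the native ligand
    ligand_ligand_contacts: int  # interactions with the native ligand at >3 Å and <4.5 Å


def is_colig(c: PoseCounts) -> bool:
    """Apply the thresholds stated in the Collection Method text."""
    return (
        c.native_ca_contacts >= 10
        and c.predicted_ca_contacts >= 10
        and c.repulsive_ca_contacts <= 4   # rejected if more than four
        and c.native_overlaps <= 4         # no more than four overlaps
        and c.ligand_ligand_contacts >= 30
    )


def proteins_by_metabolite(predictions):
    """Invert {protein: {metabolites}} into {metabolite: {proteins}}."""
    groups = defaultdict(set)
    for protein, metabolites in predictions.items():
        for metabolite in metabolites:
            groups[metabolite].add(protein)
    return dict(groups)


def metabolic_contacts(predictions):
    """Return protein pairs that share at least one predicted metabolite."""
    pairs = set()
    for proteins in proteins_by_metabolite(predictions).values():
        pairs.update(combinations(sorted(proteins), 2))
    return pairs


def largest_shared_set(predictions):
    """Return the metabolite bound by the most proteins, with those proteins.

    Ties are broken alphabetically by metabolite name (an assumption made
    here only so that the result is deterministic).
    """
    groups = proteins_by_metabolite(predictions)
    if not groups:
        return None, set()
    metabolite = max(sorted(groups), key=lambda m: len(groups[m]))
    return metabolite, groups[metabolite]


if __name__ == "__main__":
    # Toy input; protein and metabolite assignments are invented for illustration.
    toy = {"P1": {"ATP", "GTP"}, "P2": {"ATP"}, "P3": {"ATP", "AMP"}, "P4": {"AMP"}}
    print(sorted(metabolic_contacts(toy)))
    print(largest_shared_set(toy))
    print(is_colig(PoseCounts(12, 11, 4, 4, 30)))
```

Keeping the geometric counting separate from the threshold check means the cutoffs quoted in the text sit in one place, while the distance calculations, which the record does not specify in full, stay outside the sketch.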

Files

SKOLNICK 2025 Entabolons (DATA) (943 Kb)
Archive

Copyright Statement
Most electronic Supporting Information files are available without a subscription to ACS Web Editions. Such files may be downloaded by article for research use (if there is a public use license linked to the relevant article, that license may permit other uses). Permission may be obtained from ACS for other uses through requests via the RightsLink permission system: http://pubs.acs.org/page/copyright/permissions.html.