Journal of Chemical Information and Modeling 2017-11-29

Descriptor Data Bank (DDB): A Cloud Platform for Multi-Perspective Modeling of Protein-Ligand Interactions

Hossam Ashtawy, Nihar Ranjan Mahapatra

DOI: 10.1021/acs.jcim.7b00310


Abstract

Protein-ligand (PL) interactions play a key role in many life processes such as molecular recognition, molecular binding, signal transmission, and cell metabolism. Examples of interaction forces include hydrogen bonding, hydrophobic effects, steric clashes, electrostatic contacts, and van der Waals attractions. Currently, a large number of hypotheses and perspectives for modeling these interaction forces are scattered throughout the literature and largely forgotten. Had they been assembled and utilized collectively, they would have substantially improved the accuracy of predicting the binding affinity of protein-ligand complexes. In this work, we present Descriptor Data Bank (DDB), a data-driven cloud platform for facilitating multi-perspective modeling of PL interactions. DDB is an open-access hub for depositing, hosting, executing, and sharing descriptor extraction tools and data for a large number of interaction modeling hypotheses. The platform also implements a machine-learning (ML) toolbox for automatic descriptor filtering and analysis, and for scoring function (SF) fitting and prediction. The descriptor filtering module filters out irrelevant and/or noisy descriptors and produces a compact subset of all available features. We seed DDB with 16 diverse descriptor extraction tools developed in-house and collected from the literature. Together, the tools generate over 2,700 descriptors that characterize (i) proteins, (ii) ligands, and (iii) protein-ligand complexes. The in-house descriptors are protein-specific and are based on pairwise primary and tertiary alignment of protein structures followed by clustering and trilateration. We built and used DDB's ML library to fit SFs to the in-house descriptors and those collected from the literature. We then evaluated them on several data sets constructed to reflect real-world drug screening scenarios. We found that multi-perspective SFs constructed using a large number of diverse DDB descriptors, which capture various PL interactions in different ways, outperformed their single-perspective counterparts in all evaluation scenarios, with an average improvement of more than 15%. We also found that our proposed protein-specific descriptors improve the accuracy of SFs.
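
The descriptor filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which filtering method DDB uses, so the variance and correlation criteria below are assumptions chosen for illustration, as are the function names, the thresholds, the tool and descriptor names, and the toy values. The sketch pools descriptors from two hypothetical extraction tools and keeps only those that vary across complexes and correlate with the measured affinity.

```python
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def filter_descriptors(columns, affinity, min_variance=1e-8, min_abs_corr=0.3):
    """Keep descriptors that vary across complexes and track the affinity.

    columns: dict mapping descriptor name -> list of values, one per complex.
    affinity: list of measured binding affinities, one per complex.
    Both thresholds are hypothetical defaults, not values from the paper.
    """
    kept = {}
    for name, values in columns.items():
        if pvariance(values) < min_variance:
            continue  # near-constant descriptor carries no signal
        if abs(pearson(values, affinity)) < min_abs_corr:
            continue  # weakly related to the target: treated as noise here
        kept[name] = values
    return kept


# Toy data: descriptors pooled from two hypothetical extraction tools.
descriptors = {
    "toolA_hbond_count": [2, 4, 6, 8, 10],
    "toolA_constant": [1, 1, 1, 1, 1],
    "toolB_contact_score": [0.9, 2.1, 2.9, 4.2, 5.0],
    "toolB_noise": [3, -1, 0, -1, 3],
}
affinity = [4.1, 5.0, 6.2, 6.9, 8.1]

selected = filter_descriptors(descriptors, affinity)
print(sorted(selected))
```

In this toy run the constant and the noisy descriptor are dropped, leaving one descriptor from each tool. A univariate filter like this one is only a stand-in: it cannot detect descriptors that are informative in combination, which a model-based filter could.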
