Chapter title | Automated Inference of Chemical Discriminants of Biological Activity |
---|---|
Chapter number | 16 |
Book title | Computational Drug Discovery and Design |
Published in | Methods in Molecular Biology, January 2018 |
DOI | 10.1007/978-1-4939-7756-7_16 |
PubMed ID | |
Book ISBNs | 978-1-4939-7755-0, 978-1-4939-7756-7 |
Authors | Sebastian Raschka, Anne M. Scott, Mar Huertas, Weiming Li, Leslie A. Kuhn |
Abstract | Ligand-based virtual screening has become a standard technique for the efficient discovery of bioactive small molecules. Following assays to determine the activity of compounds selected by virtual screening, or other approaches in which dozens to thousands of molecules have been tested, machine learning techniques make it straightforward to discover the patterns of chemical groups that correlate with the desired biological activity. Defining the chemical features that generate activity can be used to guide the selection of molecules for subsequent rounds of screening and assaying, as well as help design new, more active molecules for organic synthesis. The quantitative structure-activity relationship machine learning protocols we describe here, using decision trees, random forests, and sequential feature selection, take as input the chemical structure of a single, known active small molecule (e.g., an inhibitor, agonist, or substrate) for comparison with the structure of each tested molecule. Knowledge of the atomic structure of the protein target and its interactions with the active compound are not required. These protocols can be modified and applied to any data set that consists of a series of measured structural, chemical, or other features for each tested molecule, along with the experimentally measured value of the response variable you would like to predict or optimize for your project, for instance, inhibitory activity in a biological assay or ΔGbinding. To illustrate the use of different machine learning algorithms, we step through the analysis of a dataset of inhibitor candidates from virtual screening that were tested recently for their ability to inhibit GPCR-mediated signaling in a vertebrate. |
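
The abstract names sequential feature selection as one of the protocols applied to a table of per-molecule features and a measured response. The sketch below is not the authors' code; it is a minimal illustration of sequential forward selection only, and it does not cover the decision tree or random forest steps. To stay within the Python standard library it swaps in a 1-nearest-neighbour classifier scored by leave-one-out accuracy, which is an assumption made here, not something the chapter record states. The feature names, the toy data, and the function names `loo_accuracy` and `sequential_forward_selection` are all hypothetical.

```python
# Hypothetical toy data: one row per tested molecule, one column per feature.
# A feature is 1 if the tested molecule matches the known active (reference)
# molecule at that chemical group, else 0. Names and values are invented.
FEATURE_NAMES = ["sulfate_match", "hydroxyl_match", "keto_match"]
X = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = active in the assay, 0 = inactive


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the columns listed in `features`."""
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = None, None
        for j in range(len(X)):
            if i == j:
                continue  # molecule i is held out
            dist = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def sequential_forward_selection(X, y, n_features):
    """Greedily add, one at a time, the feature whose inclusion
    gives the highest leave-one-out accuracy."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_features:
        # Ties are broken in favour of the lowest column index.
        best = max(remaining, key=lambda f: loo_accuracy(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


selected = sequential_forward_selection(X, y, n_features=2)
print([FEATURE_NAMES[f] for f in selected])
```

In this toy set the first column separates actives from inactives on its own, so it is picked first. With real assay data, a held-out test set or cross-validation would be needed before reading any selected feature as a discriminant of activity.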
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 23 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Doctoral Student | 4 | 17% |
Student > Bachelor | 4 | 17% |
Student > Master | 3 | 13% |
Student > Ph. D. Student | 2 | 9% |
Professor | 1 | 4% |
Other | 2 | 9% |
Unknown | 7 | 30% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 7 | 30% |
Computer Science | 3 | 13% |
Chemistry | 2 | 9% |
Psychology | 1 | 4% |
Medicine and Dentistry | 1 | 4% |
Other | 1 | 4% |
Unknown | 8 | 35% |