A Ligand for Every Target; A Target for Every Ligand.
A key problem in drug discovery is selecting which compounds to screen. Although over a third of a billion compounds may now be purchased within a few weeks, the biological activity of most of them is unknown. To enable facile access to new chemistry for biology, we have used two chemical informatics methods to predict the activity of every molecule for sale in ZINC. Over 40 percent of these (152 million) have high-significance predictions (P-value < 1e-21) by the Similarity Ensemble Approach (SEA) for one or more of 1382 targets that are well described by ligands in the literature. An additional 4.6 million compounds had predictions against a further 1347 targets, using a maximum ECFP4 Tanimoto coefficient of greater than 40% to annotated actives as the prediction threshold. To gauge whether these additional predictions are sensible, we investigated 75 predictions for 50 drugs lacking a binding affinity annotation in ChEMBL. Predictions may be classified from likely to unlikely based on the maximum Tanimoto coefficient to the nearest active, allowing for a range of applications from analog hunting to library design. The 546 million target predictions for 156 million compounds at 2629 targets are freely accessible, including full purchasing information and evidence to support each prediction, at https://zinc15.docking.org/predictions/home.
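The second prediction method above, a maximum Tanimoto coefficient to annotated actives, can be illustrated with a minimal sketch. It is not the ZINC or SEA implementation. It assumes each fingerprint is already available as a set of on-bit indices (in practice ECFP4-style fingerprints would come from a cheminformatics toolkit), and the names `tanimoto` and `predict_targets`, the target labels, and the toy bit sets are hypothetical.

```python
from typing import Dict, FrozenSet, List

Fingerprint = FrozenSet[int]  # assumed representation: indices of on bits


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def predict_targets(
    query: Fingerprint,
    actives_by_target: Dict[str, List[Fingerprint]],
    threshold: float = 0.40,  # the 40% cutoff quoted in the abstract
) -> Dict[str, float]:
    """Return targets whose nearest annotated active exceeds the threshold."""
    predictions: Dict[str, float] = {}
    for target, actives in actives_by_target.items():
        # Maximum similarity of the query to any annotated active of this target.
        best = max((tanimoto(query, fp) for fp in actives), default=0.0)
        if best > threshold:
            predictions[target] = best
    return predictions


# Toy data, invented purely for illustration.
query = frozenset({1, 2, 3, 4})
actives = {
    "target_A": [frozenset({1, 2, 3, 5}), frozenset({9, 10})],
    "target_B": [frozenset({7, 8})],
}
result = predict_targets(query, actives)
```

Because the returned value is the maximum coefficient itself, the same number could be used to rank a prediction from likely to unlikely, as the abstract describes.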
John Irwin is an Adjunct Professor at the University of California Department of Pharmaceutical Chemistry. He develops computational tools and databases to enable research in pharmacology and chemical biology. He is best known for ZINC, a database of commercially available compounds for virtual screening (zinc15.docking.org); the Similarity Ensemble Approach, a method to predict the biological targets of small molecules (sea16.docking.org); DUDE, a benchmark for molecular docking (dude.docking.org); and DOCK Blaster, a web-based molecular docking system (blaster.docking.org). He uses these tools in collaborations to discover new chemical matter for biology.