ci9b01212_si_001.pdf (199.46 kB)
Predicting Binding from Screening Assays with Transformer Network Embeddings
journal contribution
posted on 2020-07-02, 03:03 authored by Paul Morris, Rachel St. Clair, William Edward Hahn, Elan Barenholtz
Cheminformatics aims to assist in chemistry applications that depend
on molecular interactions, structural characteristics, and functional
properties. The arrival of deep learning and the abundance of easily
accessible chemical data from repositories like PubChem have enabled
advancements in computer-aided drug discovery. Virtual high-throughput
screening (vHTS) is one such technique that integrates chemical domain
knowledge to perform in silico biomolecular simulations, but prediction
of binding affinity is restricted due to limited availability of ground-truth
binding assay results. Here, text representations of 83 000 000
molecules are leveraged to perform single-target binding affinity
prediction directly on the outcome of screening assays. The embedding
of an end-to-end transformer neural network, trained to encode the
structural characteristics of a molecule via a text-based translation
task, is repurposed through transfer learning to classify binding
affinity to single targets with few known binding compounds. We quantify
the observed increase in AUC on binding prediction tasks between classifiers
trained on the translation embedding versus those using an untrained
embedding. Visualization of the embedding space reveals organization
of structural and functional properties that aid binding prediction.
The pretrained transformer, data, and associated software to extract
embeddings are made publicly available at https://github.com/mpcrlab/MolecularTransformerEmbeddings.
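
The comparison described above (a classifier trained on a pretrained embedding versus one trained on an untrained embedding, scored by AUC) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the embeddings are synthetic NumPy arrays rather than outputs of the released transformer, the classifier is a scikit-learn logistic regression chosen only for brevity, and AUC is taken to mean ROC AUC. The helper name `embedding_auc` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def embedding_auc(embeddings, labels, seed=0):
    """Fit a simple classifier on fixed embeddings and return held-out ROC AUC."""
    x_train, x_test, y_train, y_test = train_test_split(
        embeddings, labels, test_size=0.3, stratify=labels, random_state=seed
    )
    clf = LogisticRegression(max_iter=1000)
    clf.fit(x_train, y_train)
    # Score the test set with the predicted probability of the "binder" class.
    scores = clf.predict_proba(x_test)[:, 1]
    return roc_auc_score(y_test, scores)


rng = np.random.default_rng(0)
n_molecules, dim = 400, 32

# Synthetic binary binding labels (1 = binder, 0 = non-binder). Assumption:
# real labels would come from screening assay outcomes.
labels = rng.integers(0, 2, size=n_molecules)

# Stand-in for a pretrained embedding: vectors carry label-related structure.
informative = rng.normal(size=(n_molecules, dim)) + labels[:, None]

# Stand-in for an untrained embedding: vectors are independent of the labels.
uninformative = rng.normal(size=(n_molecules, dim))

auc_pretrained = embedding_auc(informative, labels)
auc_untrained = embedding_auc(uninformative, labels)
print(f"AUC (informative embedding):   {auc_pretrained:.3f}")
print(f"AUC (uninformative embedding): {auc_untrained:.3f}")
print(f"AUC increase:                  {auc_pretrained - auc_untrained:.3f}")
```

With real data, the two synthetic arrays would be replaced by embeddings extracted for the same molecules from the pretrained and the untrained network, so that the AUC difference reflects only the effect of pretraining.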
Keywords
silico biomolecular simulations; single-target binding affinity pred...; 83 000 000; 83 000 000 molecules; text-based translation task; chemical domain knowledge; Transformer Network Embeddings Chem...; Virtual high-throughput screening; embedding; AUC; binding prediction tasks; binding affinity; ground-truth binding assay results; aid binding prediction