%0 Journal Article
%A Zilian, David
%A Sotriffer, Christoph A.
%D 2016
%T SFCscoreRF: A Random Forest-Based Scoring Function for Improved Affinity Prediction of Protein–Ligand Complexes
%U https://acs.figshare.com/articles/journal_contribution/SFCscore_sup_i_RF_i_sup_A_Random_Forest_Based_Scoring_Function_for_Improved_Affinity_Prediction_of_Protein_Ligand_Complexes/2383789
%R 10.1021/ci400120b.s001
%2 https://acs.figshare.com/ndownloader/files/4023454
%K Affinity Prediction
%K binding affinities
%K SAR series
%K data sets
%K regression methods
%K PDBbind training
%K SFCscore functions
%K exercise point
%K training sets
%K 1005 complexes
%K performance
%K SFCscore descriptors
%K SFCscoreRF
%K CSAR 2012
%X A major shortcoming of empirical scoring functions for protein–ligand complexes is the low degree of correlation between predicted and experimental binding affinities, as frequently observed not only for large and diverse data sets but also for SAR series of individual targets. Improvements can be envisaged by developing new descriptors, employing larger training sets of higher quality, and resorting to more sophisticated regression methods. Herein, we describe the use of SFCscore descriptors to develop an improved scoring function by means of a PDBbind training set of 1005 complexes in combination with random forest for regression. This provided SFCscoreRF as a new scoring function with significantly improved performance on the PDBbind and CSAR–NRC HiQ benchmarks in comparison to previously developed SFCscore functions. A leave-cluster-out cross-validation and performance in the CSAR 2012 scoring exercise point out remaining limitations but also directions for further improvements of SFCscoreRF and empirical scoring functions in general.
%I ACS Publications