10.1021/ci700404c.s001 Xiang S. Wang Hao Tang Alexander Golbraikh Alexander Tropsha Combinatorial QSAR Modeling of Specificity and Subtype Selectivity of Ligands Binding to Serotonin Receptors 5HT1E and 5HT1F American Chemical Society 2008 data mining approaches Combinatorial QSAR Modeling 5 HT 1E ligands MolConnZ descriptors 5 HT 1F ligands Automated Lazy Learning set R 2 values Several descriptor types PDSP Ki Database receptor subtype selectivity MHF value activity data QSAR models MOE FSG 5 HT 1E agonists Scripps Research Institute 5 HT 1E subtype selectivity data combinatorial QSAR modeling PLS model binding affinity 5 HT 1F ligand binding k Nearest Neighbor 5 HT 1E Molecular Hologram Fingerprints TSRI GPCR Serotonin Receptors 5 HT 1E 2008-05-27 00:00:00 Journal contribution https://acs.figshare.com/articles/journal_contribution/Combinatorial_QSAR_Modeling_of_Specificity_and_Subtype_Selectivity_of_Ligands_Binding_to_Serotonin_Receptors_5HT1E_and_5HT1F/2936992 The Quantitative Structure-Activity Relationship (QSAR) approach has been applied to model the binding affinity and receptor subtype selectivity of ligands of the human 5HT1E and 5HT1F receptors. The experimental data were obtained from the PDSP Ki Database. Several descriptor types and data-mining approaches were used in the context of combinatorial QSAR modeling. Data-mining approaches included <i>k</i> Nearest Neighbor (<i>k</i>NN), Automated Lazy Learning (ALL), and partial least squares (PLS); descriptor types included MolConnZ, MOE, DRAGON, Frequent Subgraphs (FSG), and Molecular Hologram Fingerprints (MHFs). Highly predictive QSAR models were generated for all three data sets (i.e., for ligands of each receptor subtype and for subtype selectivity), and different individual techniques proved best in each case. 
For the real-valued activity data available for 5HT1E and 5HT1F ligand binding, models were characterized by the leave-one-out cross-validated <i>R</i><sup>2</sup> (<i>q</i><sup>2</sup>) for the training sets and by predictive <i>R</i><sup>2</sup> values for the test sets. The best models for 5HT1E ligands were obtained with the <i>k</i>NN approach combined with MolConnZ descriptors (<i>q</i><sup>2</sup> = 0.69, <i>R</i><sup>2</sup> = 0.92); for 5HT1F ligands, the ALL QSAR method using MolConnZ descriptors gave the best results (<i>R</i><sup>2</sup> = 0.92). Rigorously validated classification models were also developed for the 5HT1E/5HT1F subtype selectivity data set, with high correct classification rates (CCR) for both the training (CCR<sub>train</sub> = 0.88) and test (CCR<sub>test</sub> = 1.00) sets, using <i>k</i>NN with MolConnZ descriptors. The external predictive power of the QSAR models was further validated by virtual screening of The Scripps Research Institute (TSRI) screening library, which recovered 5HT1E agonists and antagonists (not present in the original PDSP data set) with high enrichment factors. The successful development of externally predictive and interpretable QSAR models supports the further design and discovery of novel subtype-specific GPCR agents.
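The abstract reports three kinds of validation statistics (<i>q</i><sup>2</sup>, CCR, and enrichment factors) without defining them. The sketch below illustrates the definitions most commonly used for these quantities; it is not taken from the paper, and the exact formulas, distance metric, and neighbor weighting used by the authors may differ. It assumes an unweighted <i>k</i>NN regressor with Euclidean distance, CCR computed as the mean of the per-class fractions of correct predictions, and the enrichment factor as the hit rate in the selected subset divided by the hit rate in the whole library. All function names are hypothetical.

```python
from math import dist


def loo_q2_knn(X, y, k=1):
    """Leave-one-out q^2 for an unweighted kNN regressor.

    Assumption: Euclidean distance and a plain average over the k nearest
    neighbours; the paper's kNN variant may select and weight differently.
    """
    preds = []
    for i, xi in enumerate(X):
        # Rank every other compound by distance to the held-out compound.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: dist(xi, X[j]),
        )
        neighbours = others[:k]
        preds.append(sum(y[j] for j in neighbours) / len(neighbours))
    mean_y = sum(y) / len(y)
    # q^2 = 1 - PRESS / total sum of squares about the mean activity.
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


def ccr(y_true, y_pred):
    """Correct classification rate as the mean per-class accuracy (assumed)."""
    per_class = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(per_class) / len(per_class)


def enrichment_factor(hits_selected, n_selected, hits_total, n_total):
    """Hit rate among selected compounds relative to the library hit rate."""
    return (hits_selected / n_selected) / (hits_total / n_total)
```

For example, `enrichment_factor(5, 100, 10, 1000)` compares a 5% hit rate among 100 selected compounds with a 1% hit rate across a 1000-compound library. The numbers here are illustrative only and are unrelated to the TSRI screening results.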