Conservation of Hot Spots and Ligand Binding Sites in Protein Models by AlphaFold2

Bekar-Cesaretli, Ayse A.; Khan, Omeir; Nguyen, Thu; Kozakov, Dima; Joseph-McCarthy, Diane; Vajda, Sandor

doi:10.1021/acs.jcim.3c01761.s001

ci3c01761_si_001.pdf (2.41 MB)

journal contribution

posted on 2024-01-23, 00:30 authored by Ayse A. Bekar-Cesaretli, Omeir Khan, Thu Nguyen, Dima Kozakov, Diane Joseph-McCarthy, Sandor Vajda

The neural network-based program AlphaFold2 (AF2) provides high-accuracy structure predictions for a large fraction of globular proteins. An important question is whether these models are accurate enough for reliable docking of small ligands. Several recent papers and the results of CASP15 reveal that local conformational errors reduce the success rates of direct ligand docking. Here, we focus on the ability of the models to conserve the location of binding hot spots, regions on the protein surface that contribute significantly to the binding free energy of the protein–ligand interaction. Clusters of hot spots predict the location and even the druggability of binding sites and hence are important for computational drug discovery. The hot spots are determined by protein mapping, which is based on the distribution of small fragment-sized probes on the protein surface and is less sensitive to local conformation than docking. Mapping models taken from the AlphaFold Protein Structure Database shows that identifying binding sites is more reliable than docking, but the success rates are still 5% to 10% lower than those obtained by mapping X-ray structures. The drop in accuracy is particularly large for models of multidomain proteins. However, both the model binding sites and the mapping results can be substantially improved by generating AF2 models for the ligand binding domains of interest rather than for the entire proteins, and improved further by using forced sampling with multiple initial seeds. The mapping of such models tends to reach the accuracy of results obtained by mapping the X-ray structures.
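
To make the idea of hot spot conservation concrete, the following is a minimal Python sketch and not the procedure used in the paper. It assumes that each hot spot has already been reduced to a single center coordinate (in angstroms), that the model has been superimposed on the X-ray structure, and that a hot spot counts as conserved when a model hot spot lies within a distance cutoff of it. The function name `conserved_fraction`, the 4.0 Å cutoff, and the coordinates are all hypothetical choices made for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def conserved_fraction(xray_hot_spots, model_hot_spots, cutoff=4.0):
    """Return the fraction of X-ray hot spots that have at least one
    model hot spot within `cutoff` angstroms.

    Both arguments are lists of (x, y, z) tuples in a common frame,
    i.e. the model is assumed to be superimposed on the X-ray structure.
    The cutoff value is an illustrative assumption.
    """
    if not xray_hot_spots:
        return 0.0
    conserved = sum(
        1
        for ref in xray_hot_spots
        if any(dist(ref, candidate) <= cutoff for candidate in model_hot_spots)
    )
    return conserved / len(xray_hot_spots)


# Made-up hot spot centers, for illustration only.
xray = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 12.0, 0.0)]
model = [(1.0, 1.0, 0.0), (10.5, 0.0, 0.0), (0.0, 20.0, 0.0)]

print(conserved_fraction(xray, model))
```

In this toy input, two of the three reference hot spots have a model counterpart within the cutoff, while the third is displaced by 8 Å and is counted as not conserved. A real comparison would also need to account for hot spot strength and for how the structures are aligned, neither of which is modeled here.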