Predictive Global Models of Cruzain Inhibitors with Large Chemical Coverage
Dataset posted on 2021-03-05, 23:14, authored by Jose Guadalupe Rosas-Jimenez, Marco A. Garcia-Revilla, Abraham Madariaga-Mazon, Karina Martinez-Mayorga
Chagas disease affects 8–11 million people worldwide, most of them living in Latin America. Moreover, migration has spread the infection beyond endemic areas. Developing new pharmacological therapies is paramount because the pharmacological profile of the two currently marketed drugs, nifurtimox and benznidazole, needs to be improved. Cruzain, a parasitic cysteine protease, is one of the most attractive biological targets because of its roles in parasite survival and immune evasion. In this work, we compiled and curated a database of diverse cruzain inhibitors previously reported in the literature. From this data set, quantitative structure–activity relationship (QSAR) models for predicting pIC₅₀ values were generated using the k-nearest neighbors and random forest algorithms. Local and global models were calculated and compared. The statistical parameters for internal and external validation indicate significant predictive ability, with leave-one-out cross-validated q² (q²_LOO) values around 0.66 and 0.61 and external R² coefficients of 0.725 and 0.766. The applicability domain is quantitatively defined, according to QSAR good practices, using the leverage and similarity methods. The models described in this work are readily available in a Python script for the discovery of novel cruzain inhibitors.
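As a rough illustration of the internal validation statistic quoted above, the sketch below computes a leave-one-out q² for a k-nearest neighbors regressor, using the common definition q² = 1 − PRESS / SS_tot. It is not the authors' script: the function names `knn_predict` and `q2_loo`, the Euclidean distance, the unweighted neighbor average, and the toy one-descriptor data are all assumptions made for this example, and the numbers have no connection to the cruzain data set.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a response as the unweighted mean of the k nearest
    training points (Euclidean distance). Hypothetical helper."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def q2_loo(X, y, k=3):
    """Leave-one-out q2 = 1 - PRESS / SS_tot.

    PRESS sums the squared errors of predictions made for each compound
    by a model that never saw that compound. SS_tot is taken around the
    mean of the full response vector (one of several conventions)."""
    y_mean = sum(y) / len(y)
    press = 0.0
    for i in range(len(X)):
        # Hold out compound i and predict it from all the others.
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        pred = knn_predict(train_X, train_y, X[i], k)
        press += (y[i] - pred) ** 2
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Toy data: one made-up descriptor per compound, made-up activities.
    X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    y = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(round(q2_loo(X, y, k=2), 3))  # 0.55
```

In this toy case only the two endpoint compounds are mispredicted, because their nearest neighbors all lie on one side of them; this is the kind of extrapolation that an applicability domain check is meant to flag.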