Posted on 2020-09-15, 15:48. Authored by Dingyan Wang, Zeen Yang, Bingqing Zhu, Xuefeng Mei, Xiaomin Luo
A machine-learning
model trained on the entire Cambridge Structural
Database (CSD) was developed to assist high-throughput cocrystal screening.
With only 2D structures taken as inputs, the probability of cocrystal
formation is returned for two given molecules. All of the cocrystal
records in the CSD were used as positive samples, while negative samples
were constructed by randomly combining different molecules into chemical
pairs. Our model showed predictive performance comparable to that
of a widely used ab initio method in a head-to-head
comparison test. Experimental and virtual cocrystal screening
of captopril were conducted in parallel to further validate
the model. Two cocrystals of captopril with L-proline and
sarcosine were obtained and characterized by PXRD, DSC, and FT-IR.
These two coformers were also successfully predicted by our model.
These results suggest that the tool we developed can be used to effectively
guide coformer selection in the discovery of new cocrystals.
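
The negative-sample construction described above (randomly combining different molecules into pairs) can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say how molecules were encoded or how pairs were filtered, so the use of SMILES strings for the 2D structures, the exclusion of known cocrystal pairs, and the names `canonical_pair` and `sample_negative_pairs` are all assumptions made for illustration.

```python
import random
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def canonical_pair(a: str, b: str) -> Pair:
    """Return an order-independent key for a two-molecule pair."""
    return (a, b) if a <= b else (b, a)


def sample_negative_pairs(
    molecules: Iterable[str],
    positives: Iterable[Pair],
    n_negatives: int,
    seed: int = 0,
) -> List[Pair]:
    """Draw random molecule pairs that are not known cocrystals.

    Assumption: molecules are identified by strings (e.g. SMILES), and
    any pair absent from the positive set is treated as a negative.
    """
    rng = random.Random(seed)
    known = {canonical_pair(a, b) for a, b in positives}
    pool = sorted(set(molecules))
    if len(pool) < 2:
        raise ValueError("need at least two distinct molecules")

    negatives = set()
    attempts = 0
    max_attempts = 100 * max(n_negatives, 1)  # guard against endless loops
    while len(negatives) < n_negatives and attempts < max_attempts:
        attempts += 1
        a, b = rng.sample(pool, 2)  # two different molecules
        pair = canonical_pair(a, b)
        if pair not in known:
            negatives.add(pair)

    if len(negatives) < n_negatives:
        raise ValueError("could not draw enough negative pairs")
    return sorted(negatives)


# Hypothetical toy data, not taken from the CSD.
molecules = ["CCO", "c1ccccc1", "CC(=O)O", "NCC(=O)O"]
positives = [("CCO", "CC(=O)O")]
negatives = sample_negative_pairs(molecules, positives, n_negatives=3)
```

One caveat of this kind of sampling is that a randomly drawn pair may be a real cocrystal that simply has not been reported, so the negative labels are noisy by construction.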