Machine-Learning-Guided Cocrystal Prediction Based on Large Data Base
journal contribution
posted on 2020-09-15, 15:48 authored by Dingyan Wang, Zeen Yang, Bingqing Zhu, Xuefeng Mei, Xiaomin Luo
A machine-learning
model trained on the whole Cambridge Structural
Database was developed to assist high-throughput cocrystal screening.
With only 2D structures taken as inputs, the probability of cocrystal
formation is returned for two given molecules. All of the cocrystal
records in the CSD were used as positive samples, while negative samples
were constructed by randomly combining different molecules into chemical
pairs. Our model showed a prediction ability comparable with that
of a widely used ab initio method in a head-to-head
comparison test. Both experimental and virtual cocrystal screening
against captopril were conducted at the same time to further validate
the model. Two cocrystals of captopril with L-proline and
sarcosine were obtained and characterized by PXRD, DSC, and FT-IR.
These two coformers were also successfully predicted by our model.
These results suggest that the tool we developed can be used to effectively
guide coformer selection in the discovery of new cocrystals.
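
The negative-sample construction described above (randomly combining molecules into chemical pairs) can be illustrated with a minimal sketch. This is not the authors' code: the function name `build_negative_pairs`, the placeholder molecule labels, the fixed random seed, and the choice to reject random pairs that already appear among the positives are all assumptions made for illustration.

```python
import random


def build_negative_pairs(positive_pairs, n_negatives, seed=0):
    """Sample random molecule pairs to serve as negative examples.

    Assumption (not stated in the abstract): a random pair that is
    already a known cocrystal is rejected, and pairs are unordered.
    """
    rng = random.Random(seed)
    # Treat (a, b) and (b, a) as the same pair.
    known = {frozenset(pair) for pair in positive_pairs}
    molecules = sorted({m for pair in positive_pairs for m in pair})

    # Guard against asking for more pairs than can exist.
    max_possible = len(molecules) * (len(molecules) - 1) // 2 - len(known)
    if n_negatives > max_possible:
        raise ValueError("not enough distinct molecule pairs available")

    negatives = set()
    while len(negatives) < n_negatives:
        a, b = rng.sample(molecules, 2)  # two distinct molecules
        pair = frozenset((a, b))
        if pair not in known:
            negatives.add(pair)

    # Return sorted tuples so the result is reproducible.
    return [tuple(sorted(p)) for p in sorted(negatives, key=sorted)]


# Hypothetical positive records; real inputs would be 2D structures
# (for example SMILES strings) taken from cocrystal entries.
positives = [("mol_A", "mol_B"), ("mol_C", "mol_D"), ("mol_A", "mol_C")]
negatives = build_negative_pairs(positives, n_negatives=3)
```

Randomly combined pairs are only presumed negatives, since an untested pair may still form a cocrystal, so labels built this way are best read as approximate.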
Keywords
2 D structures; comparison test; machine-learning model; ab initio method; Large Data Base; prediction ability; FT-IR; PXRD; high-throughput cocrystal screening; chemical pairs; cocrystal records; CSD; cocrystal screening; Machine-Learning-Guided Cocrystal P...; Cambridge Structural Database; DSC; cocrystal formation; guide coformer selection