Modeling Phospholipidosis Induction: Reliability and Warnings

Drug-induced phospholipidosis (PLD) is characterized by the accumulation of phospholipids, the inducing drugs, and lamellar inclusion bodies in the lysosomes of affected tissues. This side effect must be considered as early as possible during drug discovery, and numerous in silico models designed to predict PLD have indeed been published. However, no in silico model can be better than the experimental data set used to build it. The present paper gives an overview of the difficulties and errors encountered in generating the databases used for the published PLD models. A new database of 466 compounds was constructed from seven literature sources, containing only publicly available compounds. Comparing the PLD assignments across selected databases revealed inconsistencies and raised doubts about the PLD+ and PLD– classifications previously assigned to some chemicals. Finally, a Partial Least Squares Discriminant Analysis (PLS-DA) approach was applied, revealing further anomalies and clearly showing that metabolism as well as data quality must be taken into account when developing accurate methods for predicting the likelihood that a compound will induce PLD. A new curated database of 331 compounds is proposed.