ci9b00143_si_002.xlsx (50.98 kB)

Gene Expression Data Based Deep Learning Model for Accurate Prediction of Drug-Induced Liver Injury in Advance

posted on 12.06.2019, 00:00 by Chunlai Feng, Hengwei Chen, Xianqin Yuan, Mengqiu Sun, Kexin Chu, Hanqin Liu, Mengjie Rui
Drug-induced liver injury (DILI), one of the most common adverse drug effects, leads to drug development failure or market withdrawal in most cases, which makes accurate early-stage prediction of DILI a pressing challenge. The large volume of gene expression data now available provides valuable information for identifying DILI on a genomic scale. Deep learning (DL), moreover, is a powerful strategy for automatically learning important features from raw and noisy data and has shown great success in medical diagnosis. In this study, a gene expression data based deep learning model was developed to predict DILI in advance, using DILI-associated gene expression data collected from ArrayExpress, and was then optimized by feature gene selection and parameter optimization. In addition, a support vector machine (SVM), a conventional machine learning algorithm, was used to construct a second prediction model on the same data sets so that its performance could be compared with that of the optimal DL model. In an evaluation test using 198 randomly selected samples, the optimal DL model achieved 97.1% accuracy, 97.4% sensitivity, 96.8% specificity, a Matthews correlation coefficient of 0.942, and an area under the ROC curve of 0.989, whereas the SVM model reached only 88.9% accuracy, 78.8% sensitivity, 99.0% specificity, a Matthews correlation coefficient of 0.794, and an area under the ROC curve of 0.901. Furthermore, external data set verification and animal experiments were conducted to assess the performance of the optimal DL model, and its predictions were largely consistent with the experimental results. These results indicate that the gene expression data based deep learning model can systematically and accurately predict DILI in advance. It could serve as a useful tool for providing safety information in early-stage drug discovery and for rational clinical drug use, and could become an important part of drug safety assessment.