Posted on 2024-01-05, 22:30. Authored by Ting-Fei Zhu, Rong Qian, Xiao Wei, Ai-Ping Lu, Dong-Sheng Cao
Patents play a crucial role in drug research and development, providing
early access to unpublished data and offering unique insights. Identifying
key compounds in patents is essential to finding novel lead compounds.
This study collected a comprehensive data set comprising 1555 patents,
encompassing 1000 key compounds, to explore innovative approaches
for predicting these key compounds. Our novel PatentNetML framework
integrates network science and machine learning algorithms, combining
network measures, ADMET properties, and physicochemical properties
to construct robust classification models that identify key compounds.
Through model interpretation and an analysis of three compelling
case studies, we showcase the potential of PatentNetML in unveiling
hidden patterns and connections within diverse patents. While our
framework is pioneering, we acknowledge its limitations when applied
to patents that deviate from the assumed central pattern. This work
serves as a promising foundation for future research endeavors aimed
at efficiently identifying promising drug candidates and expediting
drug discovery in the pharmaceutical industry.
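
As a rough illustration of how a network measure might be combined with per-compound descriptors into one feature vector, the following minimal sketch builds a similarity network over the compounds of a single patent and computes a normalized degree for each. It is not the PatentNetML implementation: the abstract does not say which network measures, similarity function, or threshold are used, so the Jaccard similarity over substructure-key sets, the 0.5 threshold, the compound names, and the two descriptor values per compound are all hypothetical choices made for this example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of (hypothetical) substructure keys."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def degree_centrality(fingerprints, threshold=0.5):
    """Link compounds whose similarity meets the threshold and
    return each compound's degree divided by (n - 1)."""
    names = list(fingerprints)
    degree = {name: 0 for name in names}
    for u, v in combinations(names, 2):
        if jaccard(fingerprints[u], fingerprints[v]) >= threshold:
            degree[u] += 1
            degree[v] += 1
    n = len(names)
    return {name: (degree[name] / (n - 1) if n > 1 else 0.0) for name in names}


def build_features(fingerprints, descriptors, threshold=0.5):
    """Prepend the network measure to each compound's descriptor values."""
    centrality = degree_centrality(fingerprints, threshold)
    return {
        name: [centrality[name]] + list(descriptors[name])
        for name in fingerprints
    }


# Toy data for one patent (all values invented for illustration only).
fingerprints = {
    "cpd_A": {1, 2, 3, 4},
    "cpd_B": {1, 2, 3, 5},
    "cpd_C": {2, 3, 4, 6},
    "cpd_D": {7, 8, 9},
}
# Placeholder descriptors, e.g. a molecular weight and a logP estimate.
descriptors = {
    "cpd_A": (312.4, 2.1),
    "cpd_B": (326.4, 2.5),
    "cpd_C": (298.3, 1.8),
    "cpd_D": (410.5, 4.2),
}

features = build_features(fingerprints, descriptors)
most_central = max(features, key=lambda name: features[name][0])
```

Feature vectors of this kind could then be passed to any standard classifier. The sketch also makes the stated limitation concrete: a key compound that resembles few other compounds in its patent, like `cpd_D` here, receives a low network score regardless of its other properties.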