Extraction and Analysis of Chemical Modification Patterns in Drug Development

Most drugs have been developed through successive modification of prototypic compounds. The history of these chemical modifications is expected to contain a wealth of medicinal chemists’ knowledge, and the KEGG DRUG structure maps have been compiled to capture it. Here we attempted to extract chemical modification patterns from 3745 approved drugs in the KEGG DRUG database and 255 drug pairs in the KEGG DRUG structure maps. We first identified 236 core structures and 506 peripheral fragments from the KEGG DRUG database using bit-represented fingerprints and hierarchical clustering of similar structures. We then examined position-dependent relationships between core structures and peripheral fragments, which revealed a tendency for specific fragments to be connected to specific modification sites on the core structures. Next, we converted the drug pairs into 204 peripheral fragment changes at the modification sites. Each change was represented by a transformation profile, defined as the difference between fingerprint bit patterns, and similar transformation profiles were grouped by hierarchical clustering. We thus identified 125 chemical modification patterns that characterize the KEGG DRUG structure maps. These patterns were further applied to the reconstruction of a new structure map. The approach presented here may be applicable to systematic in silico drug modification.
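
The idea of a transformation profile can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the study: the 6-bit toy fingerprints, the signed difference (+1 for a bit gained, -1 for a bit lost), the Manhattan distance, the single-linkage criterion, and the distance cutoff. The function names are hypothetical.

```python
from itertools import combinations


def transformation_profile(before, after):
    """Signed bit difference of two equal-length fingerprints.

    +1 marks a bit gained, -1 a bit lost, 0 an unchanged bit.
    (Assumed encoding; the study only states "difference of bit patterns".)
    """
    if len(before) != len(after):
        raise ValueError("fingerprints must have the same length")
    return tuple(a - b for b, a in zip(before, after))


def profile_distance(p, q):
    """Manhattan distance between two profiles (an illustrative choice)."""
    return sum(abs(x - y) for x, y in zip(p, q))


def single_linkage_clusters(profiles, max_distance):
    """Single-linkage hierarchical clustering, cut at max_distance.

    Cutting a single-linkage tree at a threshold is equivalent to taking
    connected components of the graph that links every pair of profiles
    within that distance, so a small union-find structure is sufficient.
    """
    parent = list(range(len(profiles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(profiles)), 2):
        if profile_distance(profiles[i], profiles[j]) <= max_distance:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(profiles)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy (before, after) fingerprints for three peripheral fragment changes.
changes = [
    ((1, 0, 0, 1, 0, 0), (1, 1, 0, 0, 0, 0)),
    ((0, 0, 1, 1, 0, 0), (0, 1, 1, 0, 0, 0)),
    ((1, 0, 0, 0, 0, 0), (1, 0, 0, 0, 1, 1)),
]
profiles = [transformation_profile(b, a) for b, a in changes]
clusters = single_linkage_clusters(profiles, max_distance=1)
print(clusters)  # [[0, 1], [2]]
```

In this toy case the first two changes start from different fragments but gain and lose the same bits, so they share a profile and fall into one cluster, which is the sense in which a cluster of profiles can be read as one modification pattern.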