Design and Evaluation of Bonded Atom Pair Descriptors
Journal contribution posted on 26.04.2010, 00:00 by Hany E. A. Ahmed, Martin Vogt, Jürgen Bajorath
Atom pairs were among the first systematically derived fragment-type topological descriptors and are one of the origins of two-dimensional fingerprint searching. These descriptors remain popular and widely used to date. Herein we introduce a new type of atom pair descriptor, the bonded atom pair, that exclusively captures short-range atom environment information and thus departs in its design from other topological descriptors that enumerate bond paths of varying length. Bonded atom pairs combine different types of structural information, including element type, hybridization state, aliphatic/aromatic character, and cyclic/acyclic arrangement. Systematic design led to a set of 117 bonded atom pairs, all of which exist in synthetic compounds. A further expanded bonded atom pair set that accounts for specific halogen atoms and comprises a total of 159 descriptors is also provided. Analysis of atom pair distributions and frequencies in sets of compounds with different selectivity reveals that conventional and bonded atom pairs capture complementary structural information. In similarity searching, bonded atom pairs meet or exceed the performance of standard atom pairs and structural fragment fingerprints. The complementary nature of the structural information captured by atom pairs of different design is also reflected in individual search calculations. Taken together, our findings indicate that bonded atom pairs extend the current repertoire of topological molecular descriptors.
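
To make the descriptor idea concrete, the following is a minimal Python sketch of how a bonded atom pair might be encoded and compared. It is not the authors' implementation. The abstract does not give the actual definitions of the 117 or 159 descriptors, so the label format, the `Atom` record, and the helper names `atom_label`, `bonded_atom_pairs`, and `tanimoto` are all assumptions made for illustration. The sketch types each atom by the four properties named above (element, hybridization, aliphatic/aromatic, cyclic/acyclic), forms one descriptor per bond, and compares two molecules with a Tanimoto coefficient on the sets of descriptors present, which is a common choice for fingerprint similarity searching.

```python
from collections import Counter
from typing import NamedTuple


class Atom(NamedTuple):
    """Hypothetical atom record; fields mirror the properties named in the abstract."""
    element: str        # e.g. "C", "N", "O"
    hybridization: str  # e.g. "sp", "sp2", "sp3"
    aromatic: bool      # True for aromatic, False for aliphatic
    in_ring: bool       # True for cyclic, False for acyclic


def atom_label(atom):
    """Build an illustrative type label; the format is an assumption."""
    character = "ar" if atom.aromatic else "al"
    ring = "cy" if atom.in_ring else "ac"
    return f"{atom.element}.{atom.hybridization}.{character}.{ring}"


def bonded_atom_pairs(atoms, bonds):
    """Count one descriptor per bond: the sorted pair of atom type labels."""
    counts = Counter()
    for i, j in bonds:
        # Sorting makes the pair independent of bond direction.
        pair = tuple(sorted((atom_label(atoms[i]), atom_label(atoms[j]))))
        counts[pair] += 1
    return counts


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient on the sets of descriptors present in each molecule."""
    keys_a, keys_b = set(fp_a), set(fp_b)
    union = keys_a | keys_b
    if not union:
        return 0.0
    return len(keys_a & keys_b) / len(union)


# Heavy-atom graphs entered by hand for ethanol (C-C-O) and ethylamine (C-C-N).
C_SP3 = Atom("C", "sp3", False, False)
O_SP3 = Atom("O", "sp3", False, False)
N_SP3 = Atom("N", "sp3", False, False)

ethanol_atoms, ethanol_bonds = [C_SP3, C_SP3, O_SP3], [(0, 1), (1, 2)]
ethylamine_atoms, ethylamine_bonds = [C_SP3, C_SP3, N_SP3], [(0, 1), (1, 2)]

fp_ethanol = bonded_atom_pairs(ethanol_atoms, ethanol_bonds)
fp_ethylamine = bonded_atom_pairs(ethylamine_atoms, ethylamine_bonds)
similarity = tanimoto(fp_ethanol, fp_ethylamine)
```

In this toy case each molecule yields two descriptors, and the two molecules share only the carbon-carbon pair, so one of three distinct descriptors is common to both. In practice the atom properties would be perceived by a cheminformatics toolkit rather than entered by hand, and the descriptor set would be the fixed one defined in the paper.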