Posted on 2014-04-28, 00:00. Authored by Robert P. Sheridan.
In the pharmaceutical industry, it is common for large numbers
of compounds to be tested for off-target activities. Given a compound
synthesized for an on-target project P, what is the
best way to predict its off-target activity X? Is
it better to use a global quantitative structure–activity relationship
(QSAR) model calibrated against all compounds tested for X, or is it better to use a local model for X calibrated
against only the set of compounds in project P? The literature is
not consistent on this topic, and strong claims have been made for
either approach. One particular idea is that local models will be superior
to global models in prospective prediction if one generates many local
models and chooses the type of local model that best predicts recent
data. We tested this idea via simulated prospective prediction using
in-house data involving compounds in 11 projects tested for 9 off-target
activities. In our hands, the local model that best predicts the recent
past is seldom the local model that is best at predicting the immediate
future. Also, the local model that best predicts the recent past is
not systematically better than the global model. This means the complexity
of having project- or series-specific models for X can be avoided; a single global model for X is
sufficient. We suggest that the relative predictivity of global vs
local models may depend on the type of chemical descriptor used. Finally,
we speculate on why intuition suggests that local models should be
superior to global models, contrary to what we observed.
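
The selection procedure described above can be illustrated with a minimal sketch of one simulated prospective step. It is not the study's actual protocol: the three model types, the single numeric descriptor, the helper names (`best_on_recent`, `prospective_comparison`) and the synthetic data are all assumptions made for illustration. A project's compounds are split chronologically into past, recent, and future; each local model type is trained on the past and scored on the recent compounds; the winner is then compared, on the future compounds, with the other local types and with a global model trained on all compounds tested for X.

```python
import math
import random

# Each compound is a (descriptor, activity) pair; lists are in chronological order.

def fit_mean(train):
    """Baseline: always predict the mean training activity."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_linear(train):
    """Ordinary least squares on a single descriptor."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def fit_nearest(train):
    """Predict the activity of the nearest training compound."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

# Hypothetical stand-ins for "many types of local model".
LOCAL_TYPES = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}

def rmse(model, data):
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))

def best_on_recent(past, recent):
    """Name of the local model type, trained on past, with lowest RMSE on recent."""
    return min(LOCAL_TYPES, key=lambda name: rmse(LOCAL_TYPES[name](past), recent))

def prospective_comparison(project, others, n_recent, n_future):
    """Pick a local model type on recent data, then score everything on the future."""
    future = project[-n_future:]
    recent = project[-(n_future + n_recent):-n_future]
    past = project[:-(n_future + n_recent)]
    chosen = best_on_recent(past, recent)
    known = past + recent  # everything available before the future compounds
    local_future = {name: rmse(fit(known), future) for name, fit in LOCAL_TYPES.items()}
    # Assumption: the global model is linear and sees compounds from all projects.
    global_rmse = rmse(fit_linear(others + known), future)
    return chosen, local_future, global_rmse

def synthetic(n, slope, noise):
    """Synthetic compounds: activity is linear in the descriptor plus Gaussian noise."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    return [(x, slope * x + random.gauss(0, noise)) for x in xs]

random.seed(0)
chosen, local_future, global_rmse = prospective_comparison(
    synthetic(40, 1.0, 1.0), synthetic(200, 1.0, 1.0), n_recent=10, n_future=10)
print("chosen on recent data:", chosen)
print("best local type on future data:", min(local_future, key=local_future.get))
print("global RMSE on future data:", round(global_rmse, 3))
```

Repeating this step over many time points, projects, and activities would give the two quantities the abstract discusses: how often the type chosen on recent data is also the best on future data, and how its future error compares with that of the global model.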