American Chemical Society
ci800374h_si_001.pdf (22.73 kB)

Modified Particle Swarm Optimization Algorithm for Adaptively Configuring Globally Optimal Classification and Regression Trees

journal contribution
posted on 2009-05-22, 00:00 authored by Yan-Ping Zhou, Li-Juan Tang, Jian Jiao, Dan-Dan Song, Jian-Hui Jiang, Ru-Qin Yu
Classification and regression trees (CART) are conventionally configured in two stages: tree-growing by greedy recursive partitioning, which selects the splitting parameters (i.e., splitting variables and values) used in the tree, and tree-pruning, which aims to obtain a final tree of the right size. This method is successful for most applications; however, it has some well-known limitations, such as poor comprehensibility, a tendency to overfit, and suboptimal solutions. In the present study, a modified discrete particle swarm optimization method was used to adaptively configure the globally optimal CART (MPSOCART) by simultaneously selecting the optimal splitting parameters and the appropriate structure of the CART. A new objective function was formulated to determine the appropriate CART architecture and the optimum splitting parameters. The proposed MPSOCART was applied to predict the bioactivities of flavonoid derivatives and the inhibitory activities of inhibitors of epidermal growth factor receptor tyrosine kinase, and was compared with partial least-squares and CART induced by greedy recursive partitioning. The comparison revealed that MPSO was a useful tool for inducing a globally optimal CART, converging quickly to the optimal solution and avoiding overfitting to a great extent.
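
To make the conventional baseline mentioned in the abstract concrete, the following is a minimal sketch of one greedy split-selection step for a regression tree. It scans every splitting variable and candidate value and keeps the pair that minimizes the summed squared error of the two child nodes. It is not the authors' MPSOCART procedure or their objective function. The function names `sse` and `best_greedy_split`, the midpoint rule for candidate thresholds, and the toy data are all assumptions made for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_greedy_split(X, y):
    """Return (variable_index, threshold, total_sse) for the best single split.

    X is a list of rows (lists of floats) and y is a list of responses.
    A sample goes to the left child when row[variable_index] <= threshold.
    Hypothetical helper for illustration; not taken from the paper.
    """
    best = None
    n_vars = len(X[0])
    for j in range(n_vars):
        # Assumed candidate thresholds: midpoints between distinct sorted values.
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[j] <= threshold]
            right = [yi for row, yi in zip(X, y) if row[j] > threshold]
            score = sse(left) + sse(right)
            # Keep the first split that strictly improves on the best so far.
            if best is None or score < best[2]:
                best = (j, threshold, score)
    return best


# Toy data, invented for illustration: only the second variable
# separates the low responses from the high ones.
X = [[1.0, 0.2], [2.0, 0.9], [3.0, 0.1], [4.0, 0.8]]
y = [1.0, 5.0, 1.0, 5.0]
split = best_greedy_split(X, y)
print(split)
```

Growing a full tree repeats this step recursively on each child node. Because every split is fixed before the splits below it are considered, the resulting tree can be locally rather than globally optimal, which is the limitation the abstract says a simultaneous search over splitting parameters and tree structure is meant to address.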