CATS: A Tool for Clustering the Ensemble of Intrinsically Disordered Peptides on a Flat Energy Landscape

Posted on 2018-10-26; authored by Jacob C. Ezerski and Margaret S. Cheung
We introduce the combinatorial averaged transient structure (CATS) clustering method, which clusters protein structure ensembles based on the distributions of protein backbone descriptor coordinates. In our study, we use the phi and psi dihedral angles of the protein backbone as descriptors because they are invariant under translation and rotation. The CATS method was developed to produce unique structure ensembles, which are typically difficult to obtain from flat energy landscapes using a one-dimensional separation value (e.g., an RMSD cutoff). By using higher-dimensional descriptor coordinates, we remedy the structure resolution shortcomings that standard clustering algorithms suffer because of large RMSD fluctuations between structures. We compare the performance of CATS to that of GROMOS, an RMSD-based clustering method that may not be the best choice for clustering intrinsically disordered peptides (IDPs), since its separation quality relies heavily on cutoff values rather than on energy landscape minima. We demonstrate the performance of CATS and GROMOS by analyzing all-atom molecular dynamics trajectories, taken from prior studies, of the Tau/R2(273–284) fragment in solution with the osmolytes TMAO and urea. Our study reveals that the CATS method produces more unique clusters than the GROMOS method as a result of the higher-dimensional distributions of the descriptor coordinates. The cluster centers produced by CATS correspond to local minima in the multidimensional potential of mean force, yielding a structure ensemble that adequately samples the energy landscape.
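The invariance property that motivates dihedral-angle descriptors can be checked numerically. The following is a minimal sketch, not part of the CATS implementation: the helper names (`dihedral`, `rigid_transform`) and the four coordinates are hypothetical, chosen only for illustration. It computes a dihedral angle from four points, applies an arbitrary rotation and translation to all of them, and confirms that the angle is unchanged, whereas the raw Cartesian coordinates are not.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    The sign follows one common atan2-based convention; other
    conventions differ only by an overall sign.
    """
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)      # normals of the two planes
    b1_len = math.sqrt(dot(b1, b1))
    b1_unit = tuple(x / b1_len for x in b1)
    m1 = cross(n1, b1_unit)                    # completes a frame with n1 and b1
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))

def rigid_transform(p, angle_z, angle_x, shift):
    """Rotate about z, then about x, then translate (illustrative only)."""
    x, y, z = p
    cz, sz = math.cos(angle_z), math.sin(angle_z)
    x, y = cz * x - sz * y, sz * x + cz * y    # rotation about the z axis
    cx, sx = math.cos(angle_x), math.sin(angle_x)
    y, z = cx * y - sx * z, sx * y + cx * z    # rotation about the x axis
    return (x + shift[0], y + shift[1], z + shift[2])

# Hypothetical coordinates standing in for four consecutive backbone atoms.
points = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.5, 0.5, 1.0)]
moved = [rigid_transform(p, 0.7, 1.1, (3.0, -2.0, 5.0)) for p in points]

before = dihedral(*points)
after = dihedral(*moved)
print(abs(before - after) < 1e-9)  # the angle survives the rigid-body motion
print(points[0] == moved[0])       # the Cartesian coordinates do not
```

Because the angle depends only on the relative geometry of the four points, descriptors of this kind can be compared across frames without first superimposing structures, which an RMSD-based comparison requires.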