Posted on 2017-12-21, 00:00. Authored by Rahul Kaushik, Ankita Singh, B. Jayaram
The fact that amino acid sequences dictate the tertiary structures
of proteins has been known for more than five decades. While the molecular
pathways to tertiary structure are still being worked out, with the
axiom that similar sequences adopt similar structures, computational
methods are being developed continually in parallel, utilizing the
Protein Data Bank structural repository and homologue detection strategies
to predict structures of sequences of interest. The success of this
approach is limited by the ability to unravel the hidden similarities
among amino acid sequences. We consider here the 20 amino acids as
a complete set of chemical templates in the physicochemical space
of proteins and propose a new structural and chemical classification
of amino acids. An integration of this perspective into the conventional
evolutionary methods of similarity detection leads to an unprecedented
increase in the accuracy of homologue detection, resulting in improved
protein structure prediction. The performance is validated on a large
data set of 11716 unique proteins, and the results are benchmarked
against conventional methods. The availability of good quality protein
structures helps in structure-based drug design endeavors and in establishing
protein structure–function correlations.
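
The idea of uncovering hidden similarities through a chemical classification of amino acids can be illustrated with a minimal sketch. The grouping below is a generic textbook-style physicochemical grouping chosen purely for illustration; it is an assumption and is not the classification proposed by the authors. The function names and the example sequences are likewise hypothetical. The sketch compares two pre-aligned, equal-length sequences by exact identity and by group membership.

```python
# Illustrative grouping only (an assumption): this is NOT the authors'
# classification, just a common physicochemical grouping of the 20 amino acids.
ILLUSTRATIVE_GROUPS = {
    "aliphatic": "AVLIM",
    "aromatic": "FWY",
    "polar": "STNQ",
    "positive": "KRH",
    "negative": "DE",
    "special": "CGP",
}

# Reverse lookup: one-letter residue code -> group name.
RESIDUE_TO_GROUP = {
    residue: name
    for name, residues in ILLUSTRATIVE_GROUPS.items()
    for residue in residues
}


def _check_aligned(seq_a, seq_b):
    """Require two non-empty, equal-length (pre-aligned, gap-free) sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")


def percent_identity(seq_a, seq_b):
    """Percentage of positions carrying exactly the same residue."""
    _check_aligned(seq_a, seq_b)
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def percent_group_similarity(seq_a, seq_b):
    """Percentage of positions whose residues fall in the same group."""
    _check_aligned(seq_a, seq_b)
    matches = sum(
        RESIDUE_TO_GROUP[x] == RESIDUE_TO_GROUP[y]
        for x, y in zip(seq_a, seq_b)
    )
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical toy sequences with no identical residues.
    seq_a, seq_b = "AVKDF", "LIRES"
    print(percent_identity(seq_a, seq_b))          # 0.0
    print(percent_group_similarity(seq_a, seq_b))  # 80.0
```

In this toy case the two sequences share no identical residues, yet four of five positions fall in the same illustrative group, which is the kind of similarity a purely identity-based comparison would miss.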