pr0500399_si_001.pdf (71.1 kB)

Predicting Enzyme Subclass by Functional Domain Composition and Pseudo Amino Acid Composition

journal contribution
posted on 13.06.2005, 00:00 by Yu-Dong Cai, Kuo-Chen Chou
As part of a continuing effort to use a sequence-based approach to identify enzymatic function at a deeper level, investigations are extended from the main enzyme classes (Protein Sci. 2004, 13, 2857−2863) to their subclasses. This extension is indispensable for understanding the molecular mechanism of an enzyme in greater depth. For each of the 6 main enzyme classes (i.e., oxidoreductase, transferase, hydrolase, lyase, isomerase, and ligase), a subclass training dataset is constructed. To reduce homology bias, a stringent cutoff is imposed so that all entries included in the datasets have less than 40% sequence identity to each other. To capture the core features that are intimately related to biological function, each protein sample is represented by hybridizing the functional domain composition and the pseudo amino acid composition. On the basis of this hybrid representation, the FunD-PseAA predictor is established. Jackknife cross-validation tests demonstrate that the overall success rate in identifying the 21 subclasses of oxidoreductases is above 86%, and the corresponding rates in identifying the subclasses of the other 5 main enzyme classes are 94−97%. These high success rates imply that the FunD-PseAA predictor may become a useful tool in bioinformatics and proteomics of the post-genomic era.
Keywords: ENZYME database • 40% cutoff • functional domain • pseudo amino acid composition • ISort predictor • FunD-PseAA predictor • bioinformatics • proteomics
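The hybrid representation and the jackknife test named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the domain identifiers (`D1`, `D2`, `D3`), the toy sequences and subclass labels, the cosine similarity measure, and the nearest-neighbor decision rule. The pseudo amino acid part is reduced to the 20 conventional composition components, and the sequence-order terms are omitted.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_composition(seq):
    """Return the 20 normalized residue frequencies of a sequence.

    Simplification: the sequence-order terms of pseudo amino acid
    composition are left out of this sketch.
    """
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]

def domain_composition(hits, domain_ids):
    """Return a binary vector: 1.0 where the protein hits a domain, else 0.0."""
    return [1.0 if d in hits else 0.0 for d in domain_ids]

def hybrid_vector(seq, hits, domain_ids):
    """Concatenate the domain composition and the amino acid composition."""
    return domain_composition(hits, domain_ids) + amino_acid_composition(seq)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def jackknife_success_rate(vectors, labels):
    """Leave each sample out in turn and predict it from all the others.

    Assumption: the prediction is the label of the most similar
    remaining sample (a plain nearest-neighbor rule).
    """
    correct = 0
    for i, query in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        nearest = max(others, key=lambda j: cosine(query, vectors[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(vectors)

# Hypothetical toy data: (sequence, domain hits, subclass label).
domain_ids = ["D1", "D2", "D3"]
samples = [
    ("AAAAKK", {"D1"}, "sub1"),
    ("AAAKKK", {"D1"}, "sub1"),
    ("WWWWYY", {"D2"}, "sub2"),
    ("WWWYYY", {"D2"}, "sub2"),
]
vectors = [hybrid_vector(seq, hits, domain_ids) for seq, hits, _ in samples]
labels = [label for _, _, label in samples]
rate = jackknife_success_rate(vectors, labels)
print(f"jackknife success rate: {rate:.2f}")
```

In the jackknife test every sample serves once as the query while all remaining samples act as the training set, so the reported success rate does not depend on an arbitrary split of the data.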