A Novel Computational Approach To Predict Transcription Factor DNA Binding Preference
Dataset posted on 06.02.2009, 00:00 by Yudong Cai, JianFeng He, XinLei Li, Lin Lu, XinYi Yang, KaiYan Feng, WenCong Lu, XiangYin Kong
Transcription is one of the most important processes in the cell, in which transcription factors help transcribe DNA sequences into RNA sequences. Accurate prediction of the DNA binding preference of transcription factors is valuable for understanding the transcriptional regulatory mechanism and elucidating regulatory networks. Here we predict the DNA binding preference of transcription factors based on protein amino acid composition and physicochemical properties, a 0/1 encoding system for nucleotides, the minimum Redundancy Maximum Relevance (mRMR) feature selection method, and the Nearest Neighbor Algorithm. The overall prediction accuracy in the Jackknife cross-validation test is 91.1%, indicating that this approach is a useful tool for exploring the relation between a transcription factor and its binding sites. Moreover, we find that the secondary structure and polarizability of the transcription factor contribute most to the prediction. In particular, a 7-nt motif with an AT-rich region in the DNA binding sites, discovered via our method, is consistent with the statistical analysis of the TRANSFAC database.
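The pipeline named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that the 0/1 encoding maps each nucleotide to four bits, that neighbors are compared by squared Euclidean distance, and that the Jackknife test is a leave-one-out loop. The mRMR feature selection step and the physicochemical property features are omitted, and all function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed 0/1 encoding: one bit per possible nucleotide at each position.
NUCLEOTIDE_BITS = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def amino_acid_composition(protein):
    """Fraction of each of the 20 standard amino acids in the protein."""
    counts = Counter(protein)
    return [counts[aa] / len(protein) for aa in AMINO_ACIDS]


def encode_site(site):
    """Concatenate the 4-bit code of every nucleotide in the binding site."""
    bits = []
    for nucleotide in site:
        bits.extend(NUCLEOTIDE_BITS[nucleotide])
    return bits


def featurize(protein, site):
    """Feature vector for one (transcription factor, DNA site) pair."""
    return amino_acid_composition(protein) + encode_site(site)


def squared_distance(u, v):
    # Assumption: Euclidean distance; the abstract does not name the metric.
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_neighbor_label(query, samples):
    """Label of the sample whose vector is closest to the query."""
    return min(samples, key=lambda s: squared_distance(query, s[0]))[1]


def jackknife_accuracy(samples):
    """Leave each sample out in turn and predict it from the rest."""
    hits = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbor_label(vector, rest) == label:
            hits += 1
    return hits / len(samples)
```

Under these assumptions, a 7-nt site contributes 28 binary features and the protein contributes 20 composition features, so `featurize` returns a 48-element vector for each pair.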