Signal‑3L 3.0: Improving Signal Peptide Prediction through Combining Attention Deep Learning with Window-Based Scoring
journal contribution
posted on 2020-07-01, 20:10 authored by Wei-Xun Zhang, Xiaoyong Pan, Hong-Bin Shen

Signal peptides play
an important role in guiding and transferring
transmembrane proteins and secreted proteins. With the explosive growth
of protein sequence data in recent years, computational prediction of
signal peptides and their cleavage sites from protein sequences is
highly desirable. In this work, we present an improved approach, Signal-3L
3.0, for signal peptide recognition and cleavage-site prediction using
a 3-layer hybrid method that integrates deep learning algorithms with
window-based scoring. There are three main components in the Signal-3L
3.0 prediction engine: (1) a deep bidirectional long short-term memory
(Bi-LSTM) network with soft self-attention learns abstract features
from sequences to determine whether a query protein contains a signal
peptide; (2) a statistical-propensity, window-based cleavage site
screening method generates the set of candidate cleavage
sites; (3) the prediction of a conditional random field with a hybrid
convolutional neural network (CNN) and Bi-LSTM is fused with the window-based
score to identify the final unique cleavage site. Experimental
results on the benchmark datasets show that the new deep learning-driven
Signal-3L 3.0 yields promising performance. The online server of Signal-3L
3.0 is available at http://www.csbio.sjtu.edu.cn/bioinf/Signal-3L/.
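
As a rough illustration of components (2) and (3), the sketch below scores fixed-width windows around each position with a per-position log-odds table, keeps the top-scoring positions as candidate cleavage sites, and then fuses the window score with a per-position probability from some downstream model. This is not the authors' implementation: the uniform background, the pseudocount, the window layout (`left` residues before the site), the min-max normalization, the fusion `weight`, and all function names are assumptions made for the example, and `model_prob` merely stands in for the output of a CRF/CNN/Bi-LSTM predictor.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_log_odds(windows, pseudocount=1.0):
    """Per-position log-odds table from aligned, equal-length windows
    taken around known cleavage sites (uniform background assumed)."""
    width = len(windows[0])
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    uniform = 1.0 / len(AMINO_ACIDS)
    table = []
    for pos in range(width):
        column = Counter(w[pos] for w in windows)
        table.append({
            aa: math.log(((column[aa] + pseudocount) / total) / uniform)
            for aa in AMINO_ACIDS
        })
    return table


def window_score(seq, site, table, left):
    """Score the window that places `left` residues before `site`,
    where `site` is the index of the first residue after the cleavage."""
    start = site - left
    if start < 0 or start + len(table) > len(seq):
        return float("-inf")  # window does not fit inside the sequence
    window = seq[start:start + len(table)]
    # Non-standard residues contribute a neutral score of 0.
    return sum(col.get(aa, 0.0) for col, aa in zip(table, window))


def candidate_sites(seq, table, left, top_k=3, max_site=40):
    """Return up to `top_k` (site, score) pairs, best window score first."""
    scored = [(window_score(seq, s, table, left), s)
              for s in range(left, min(max_site, len(seq)))]
    scored.sort(reverse=True)
    return [(s, sc) for sc, s in scored[:top_k] if sc != float("-inf")]


def pick_site(candidates, model_prob, weight=0.5):
    """Fuse the min-max normalized window score with a hypothetical
    per-position model probability and return the single best site."""
    scores = [sc for _, sc in candidates]
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero when all scores tie

    def fused(item):
        site, sc = item
        return weight * (sc - lo) / span + (1 - weight) * model_prob.get(site, 0.0)

    return max(candidates, key=fused)[0]
```

In this toy layout, a call such as `pick_site(candidate_sites(seq, table, left=3), model_prob)` returns one position per sequence; how the real server weights and combines the two scores is not stated in the abstract.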