posted on 2023-01-17, 19:37, authored by Stefano Grasso, Valentina Dabene, Margriet M. W. B. Hendriks,
Priscilla Zwartjens, René Pellaux, Martin Held, Sven Panke, Jan Maarten van Dijl, Andreas Meyer, Tjeerd van Rij
The passage of proteins across biological membranes via the general
secretory (Sec) pathway is a universally conserved process with critical
functions in cell physiology and important industrial applications.
Proteins are directed into the Sec pathway by a signal peptide at
their N-terminus. Estimating the impact of physicochemical signal
peptide features on protein secretion levels has not been achieved
so far, partially due to the extreme sequence variability of signal
peptides. To elucidate relevant features of the signal peptide sequence
that influence secretion efficiency, an evaluation of ∼12,000
different designed signal peptides was performed using a novel miniaturized
high-throughput assay. The results were used to train a machine learning
model, and a post-hoc explanation of the model is provided. By describing
each signal peptide with a selection of 156 physicochemical features,
it is now possible to both quantify feature importance and predict
the protein secretion levels directed by each signal peptide. Our
analyses allow the detection and explanation of the relevant signal
peptide features influencing the efficiency of protein secretion,
generating a versatile tool for the de novo design and in silico evaluation
of signal peptides.
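
To make the idea of describing a signal peptide by physicochemical features concrete, here is a minimal sketch in Python. It is not the feature set or code used in the study: the three descriptors (length, net charge, mean hydropathy), the function name `describe_peptide`, and the example sequence are all assumptions chosen for illustration. The hydropathy values are the standard Kyte-Doolittle scale.

```python
# Illustrative sketch only: three simple descriptors, NOT the 156 features
# used in the study. Function name and example sequence are hypothetical.

# Kyte-Doolittle hydropathy scale (higher value = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def describe_peptide(seq):
    """Return a small dictionary of descriptors for an amino acid sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")

    # Simplified net charge: +1 for each K/R, -1 for each D/E.
    # Histidine and the terminal groups are deliberately ignored here.
    positive = sum(1 for aa in seq if aa in "KR")
    negative = sum(1 for aa in seq if aa in "DE")

    # Mean hydropathy over the whole sequence (often called GRAVY).
    mean_hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

    return {
        "length": len(seq),
        "net_charge": positive - negative,
        "mean_hydropathy": mean_hydropathy,
    }


if __name__ == "__main__":
    # Hypothetical short N-terminal fragment, used only to exercise the function.
    print(describe_peptide("MKKLLA"))
```

Each peptide then becomes one row of numbers, which is the form a machine learning model needs as input. A real feature set would likely compute such descriptors separately for the distinct regions of the signal peptide, but that split is not shown here.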