Posted on 2020-01-16, 18:03. Authored by Christina Nutschel, Alexander Fulton, Olav Zimmermann, Ulrich Schwaneberg, Karl-Erich Jaeger, Holger Gohlke
Improving an enzyme’s (thermo-)stability or tolerance against
solvents and detergents is highly relevant in protein engineering
and biotechnology. Recent developments have tended toward data-driven
approaches, where available knowledge about the protein is used to
identify substitution sites with high potential to yield protein variants
with improved stability, and subsequently, substitutions are engineered
by site-directed or site-saturation (SSM) mutagenesis. However, the
development and validation of algorithms for data-driven approaches
have been hampered by the lack of large-scale data
measured in a uniform way and unbiased with respect to substitution
types and locations. Here, we extend our knowledge on guidelines for
protein engineering following a data-driven approach by scrutinizing
the impact of substitution sites on thermostability and/or detergent
tolerance for Bacillus subtilis lipase A (BsLipA) at very large scale. We systematically analyze a
complete experimental SSM library of BsLipA containing
all 3439 possible single variants, which was evaluated for thermostability
and tolerance against four detergents, each under uniform
conditions. Our results provide systematic and unbiased reference
data at unprecedented scale for a biotechnologically important protein,
identify consistently defined hot spot types for evaluating the performance
of data-driven protein-engineering approaches, and show that the rigidity-theory-
and ensemble-based approach Constraint Network Analysis yields
hot spot predictions with an up to ninefold gain in precision over
random classification.
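
The "gain in precision over random classification" can be read as the ratio of a predictor's precision to the fraction of sites that are hot spots, which is the precision a random classifier would be expected to reach. The sketch below illustrates that reading only. The function names and the toy site numbers are hypothetical and not taken from the study, and the exact evaluation protocol used by the authors may differ.

```python
def precision(predicted_sites, hot_spot_sites):
    """Fraction of predicted sites that are true hot spots."""
    predicted = set(predicted_sites)
    if not predicted:
        raise ValueError("at least one predicted site is required")
    return len(predicted & set(hot_spot_sites)) / len(predicted)


def gain_over_random(predicted_sites, hot_spot_sites, n_sites):
    """Precision divided by the hot spot base rate.

    The base rate is the expected precision of a classifier that
    picks sites uniformly at random (an assumption of this sketch).
    """
    hot_spots = set(hot_spot_sites)
    if not hot_spots or n_sites <= 0:
        raise ValueError("need at least one hot spot and a positive site count")
    base_rate = len(hot_spots) / n_sites
    return precision(predicted_sites, hot_spots) / base_rate


# Toy numbers, purely illustrative (not data from the study):
# 100 sequence positions, 10 of which are hot spots.
toy_hot_spots = range(1, 11)      # positions 1..10
toy_predicted = [3, 7, 42, 77]    # 2 of the 4 predictions are hot spots

toy_gain = gain_over_random(toy_predicted, toy_hot_spots, n_sites=100)
print(f"precision gain over random: {toy_gain:.1f}-fold")
```

In this toy case the precision is 0.5 against a base rate of 0.1, which corresponds to a fivefold gain.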