Posted on 2025-07-18, 12:02, authored by Jiang-Yu Yang, Yong-Heng Rong, Zhao-Xi Liu, Long-Jiao Gao, Yong-Qi Liu, Wen-Ya Liu, Jun Zhou, Min Chen
The microbiota serves as a link between the environment and the
human body, with glycans mediating this interaction through structurally
distinct domains analogous to protein domains. However, the “glycan
code” governing structure–function relationships remains
underexplored. We developed the Glycan Substructure Mining tool (GSMtool),
a graph-based Python framework for systematic identification and comparative
analysis of conserved glycan substructures in large-scale microbial
data sets. GSMtool identified specific glycan substructures in gut
microbiota or pathogenic bacteria, including subtle differences
in Shigella flexneri and the diarrhea-associated
αDGlcp(1–3)βDGalpNAc epitope. Validation with Helicobacter pylori and
Clostridium difficile confirmed that GSMtool can
identify pathogen-specific diagnostic glycan substructures. The framework
integrates two analytical pipelines accommodating diverse research
objectives. This methodology advances our understanding of the relationship
between glycan substructures and their functions while identifying potential
targets for microbial diagnostics and carbohydrate-based vaccine development.
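
To make the idea of graph-based substructure mining concrete, the sketch below treats each glycan as a small rooted tree of residues, enumerates its short linear substructures, and reports those found in one group of glycans but not in another. It is only an illustration under stated assumptions and is not GSMtool's algorithm or API: the function names, the tree encoding, the restriction to linear chains, and the two toy glycans are all hypothetical, and "a"/"b" stand in for α/β.

```python
from collections import Counter

# Assumed encoding (not GSMtool's): a glycan is a rooted tree where
# `residues` maps node id -> monosaccharide label and
# `links` maps child id -> (parent id, linkage string).

def linear_substructures(residues, links, max_len=3):
    """Return all linear chains (child -> ... -> ancestor) of up to max_len residues."""
    found = set()
    for start in residues:
        chain = residues[start]
        node = start
        found.add(chain)  # a single residue counts as a substructure
        for _ in range(max_len - 1):
            if node not in links:  # reached the root of the tree
                break
            parent, linkage = links[node]
            chain = f"{chain}({linkage}){residues[parent]}"
            found.add(chain)
            node = parent
    return found

def group_specific(group_a, group_b, max_len=3):
    """Map each substructure seen in group_a but never in group_b
    to the number of group_a glycans that contain it."""
    count_a = Counter()
    for residues, links in group_a:
        count_a.update(linear_substructures(residues, links, max_len))
    seen_b = set()
    for residues, links in group_b:
        seen_b |= linear_substructures(residues, links, max_len)
    return {s: n for s, n in count_a.items() if s not in seen_b}

# Toy glycans invented for illustration; they differ only in the linkage.
glycan_1 = ({0: "bDGalpNAc", 1: "aDGlcp"}, {1: (0, "1-3")})
glycan_2 = ({0: "bDGalpNAc", 1: "aDGlcp"}, {1: (0, "1-4")})

print(group_specific([glycan_1], [glycan_2]))
# → {'aDGlcp(1-3)bDGalpNAc': 1}
```

Restricting the search to linear chains keeps the example short; a fuller treatment would also enumerate branched connected subtrees and would need a canonical form so that equivalent branched substructures compare equal.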