
PXD074117-1

PXD074117 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: Identification of a Stable Stool Peptidomic Signature for Monitoring IBD Activity
Description:
Introduction: Monitoring disease activity in inflammatory bowel disease (IBD) is essential for guiding therapy and preventing irreversible tissue damage. Colonoscopy, although the gold standard, is invasive and unsuitable for frequent monitoring, while fecal calprotectin lacks accuracy within its diagnostic gray zone (fecal calprotectin 100–250 µg/g). Stool proteomics offers a non-invasive alternative by directly capturing molecular signatures of intestinal inflammation. We conducted a proof-of-concept study to determine whether stool-derived peptides can accurately classify IBD activity (Active vs Remission) using a fully unbiased and reproducible nested cross-validation machine-learning framework.
Methods: A total of 174 stool samples from IBD patients were collected and profiled using SWATH-DIA mass spectrometry. Feature selection was performed within the training loops only (Boruta, LASSO, RFE) across repeated subsampling, retaining peptides consistently identified in ≥70% of runs. Stable features were used to train four classifiers (GLMNet, SVM-Radial, SVM-Linear, Naïve Bayes) under inner 5-fold tuning. Outer test folds provided fully unseen evaluation, and model performance was additionally assessed exclusively on gray zone samples extracted from the outer test splits to quantify diagnostic resolution in this clinically challenging subgroup.
Results: Nested cross-validation identified a consensus panel of nine stool-derived peptides from five proteins. Across candidate classifiers, performance was broadly similar, with GLMNet consistently achieving the best trade-off between metrics. For GLMNet, outer-fold mean AUC was 0.93 and balanced accuracy 0.88, with specificity 0.94, sensitivity 0.82, and F1-score 0.85; close agreement between inner- and outer-fold metrics indicated minimal overfitting. Approximately half of the misclassified test patients were shared with the other models, suggesting intrinsically ambiguous cases rather than GLMNet-specific errors. Within the calprotectin gray zone subgroup (n = 34), GLMNet maintained good performance (accuracy 0.76, balanced accuracy 0.78, F1 0.79, AUC 0.80), confirming that the peptide signature remains informative in this diagnostically challenging range. SHAP and correlation analyses confirmed directionally consistent, largely non-redundant peptide contributions, and network analysis showed that the underlying proteins participate in related but distinct inflammatory pathways.
Conclusions: A stool-based multi-peptide signature, evaluated with a rigorously nested, leakage-free machine-learning framework, can reliably classify IBD activity and retain discriminative power within the gray zone. This biologically interpretable five-protein panel provides a strong basis for targeted mass-spectrometry assay development and prospective validation as a non-invasive tool for personalized IBD monitoring.
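The stability filter mentioned in the Methods (retaining peptides selected in ≥70% of subsampling runs) can be illustrated with a minimal Python sketch. This is not the submitters' code: the function name `stable_features`, the placeholder peptide labels, and the ten example runs are all hypothetical, and the sketch covers only the frequency threshold, not the Boruta, LASSO, or RFE selectors themselves.

```python
from collections import Counter


def stable_features(selected_per_run, min_fraction=0.70):
    """Return features selected in at least `min_fraction` of runs, sorted by name.

    `selected_per_run` is a list with one entry per subsampling run; each entry
    is the collection of feature names that a selector kept in that run.
    """
    if not selected_per_run:
        return []
    n_runs = len(selected_per_run)
    # Count each feature at most once per run, even if it is listed twice.
    counts = Counter(f for run in selected_per_run for f in set(run))
    return sorted(f for f, c in counts.items() if c / n_runs >= min_fraction)


# Hypothetical selections from 10 subsampling runs (placeholder peptide names).
runs = (
    [["pep_A", "pep_B", "pep_C"]] * 7
    + [["pep_A", "pep_D"]] * 2
    + [["pep_B", "pep_D"]]
)

print(stable_features(runs))  # ['pep_A', 'pep_B', 'pep_C']
```

In this toy input, pep_A, pep_B, and pep_C appear in 9, 8, and 7 of the 10 runs and pass the threshold, while pep_D appears in only 3 runs and is dropped. In a leakage-free setup such a filter would be applied to training-fold data only, as the description states.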
HostingRepository: PRIDE
AnnounceDate: 2026-05-25
AnnouncementXML: Submission_2026-05-24_16:13:44.237.xml
DigitalObjectIdentifier:
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Unsupported dataset by repository
PrimarySubmitter: Elmira Shajari
SpeciesList: scientific name: Homo sapiens (Human); NCBI TaxID: NEWT:9606
ModificationList: carbamoylated residue; monohydroxylated residue
Instrument: TripleTOF 5600
Dataset History
Revision | Datetime | Status | ChangeLog Entry
0 | 2026-02-05 21:15:26 | ID requested |
1 | 2026-05-24 16:13:45 | announced |
Publication List
DOI: 10.3389/fmolb.2026.1768474
Shajari E, Gagné D, Malick M, Roy P, Noël JF, Gagnon H, Delisle M, Boisvert FM, Brunet MA, Beaulieu JF. Non-invasive assessment of inflammatory bowel disease activity using a DIA-derived stool peptidomic signature and machine learning. Front Mol Biosci, 13:1768474 (2026)
Keyword List
submitter keyword: Human, peptidomics, machine learning, Inflammatory bowel disease, biomarker discovery, Proteomics, stool, clinical mass spectrometry, nested cross validation, IBD activity
Contact List
Jean-Francois Beaulieu
contact affiliation: Laboratory of Intestinal Physiopathology, Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
contact email: Jean-Francois.Beaulieu@USherbrooke.ca
lab head
Elmira Shajari
contact affiliation: PhD candidate
contact email: elmira.shajari@usherbrooke.ca
dataset submitter
Full Dataset Link List
Dataset FTP location
ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/05/PXD074117
NOTE: Most web browsers no longer support FTP natively. To access the files, open this address in an FTP client such as FileZilla.
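As an alternative to a graphical FTP client, the directory can in principle be listed from a script. The following is a minimal sketch using Python's standard `ftplib`; the helper names `split_ftp_url` and `list_dataset_files` are hypothetical, and it assumes the server accepts anonymous logins and that the dataset directory is readable at the path given above.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address taken from this dataset record.
DATASET_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/05/PXD074117"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path


def list_dataset_files(url=DATASET_URL, timeout=30):
    """Return the file names in the dataset directory.

    Assumption: anonymous login is permitted by the server.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # no arguments means an anonymous login
        ftp.cwd(path)  # change into the dataset directory
        return ftp.nlst()
```

Calling `list_dataset_files()` performs the network request; individual files could then be fetched with `FTP.retrbinary`.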
PRIDE project URI
Repository Record List