PXD074117
PXD074117 is an original dataset announced via ProteomeXchange.
Dataset Summary
| Title | Identification of a Stable Stool Peptidomic Signature for Monitoring IBD Activity |
| Description | Introduction: Monitoring disease activity in inflammatory bowel disease (IBD) is essential for guiding therapy and preventing irreversible tissue damage. Colonoscopy, although the gold standard, is invasive and unsuitable for frequent monitoring, while fecal calprotectin lacks accuracy within its diagnostic gray zone (fecal calprotectin 100–250 µg/g). Stool proteomics offers a non-invasive alternative by directly capturing molecular signatures of intestinal inflammation. We conducted a proof-of-concept study to determine whether stool-derived peptides can accurately classify IBD activity (Active vs Remission) using a fully unbiased and reproducible nested cross-validation machine-learning framework. Methods: A total of 174 stool samples from IBD patients were collected and profiled using SWATH-DIA mass spectrometry. Feature selection was performed within the training loops only (Boruta, LASSO, RFE) across repeated subsampling, retaining peptides consistently identified in ≥70% of runs. Stable features were used to train four classifiers (GLMNet, SVM-Radial, SVM-Linear, Naïve Bayes) under inner 5-fold tuning. Outer test folds provided fully unseen evaluation, and model performance was additionally assessed exclusively on gray zone samples extracted from the outer test splits to quantify diagnostic resolution in this clinically challenging subgroup. Results: Nested cross-validation identified a consensus panel of nine stool-derived peptides from five proteins. Across candidate classifiers, performance was broadly similar, with GLMNet consistently achieving the best trade-off between metrics. For GLMNet, outer-fold mean AUC was 0.93 and balanced accuracy 0.88, with specificity 0.94, sensitivity 0.82, and F1-score 0.85; close agreement between inner- and outer-fold metrics indicated minimal overfitting. Approximately half of the misclassified test patients were shared with the other models, suggesting intrinsically ambiguous cases rather than GLMNet-specific errors. Within the calprotectin gray zone subgroup (n = 34), GLMNet maintained good performance (accuracy 0.76, balanced accuracy 0.78, F1 0.79, AUC 0.80), confirming that the peptide signature remains informative in this diagnostically challenging range. SHAP and correlation analyses confirmed directionally consistent, largely non-redundant peptide contributions, and network analysis showed that the underlying proteins participate in related but distinct inflammatory pathways. Conclusions: A stool-based multi-peptide signature, evaluated with a rigorously nested, leakage-free machine-learning framework, can reliably classify IBD activity and retain discriminative power within the gray zone. This biologically interpretable five-protein panel provides a strong basis for targeted mass-spectrometry assay development and prospective validation as a non-invasive tool for personalized IBD monitoring. |
| HostingRepository | PRIDE |
| AnnounceDate | 2026-05-25 |
| AnnouncementXML | Submission_2026-05-24_16:13:44.237.xml |
| DigitalObjectIdentifier | |
| ReviewLevel | Peer-reviewed dataset |
| DatasetOrigin | Original dataset |
| RepositorySupport | Unsupported dataset by repository |
| PrimarySubmitter | Elmira Shajari |
| SpeciesList | scientific name: Homo sapiens (Human); NCBI TaxID: NEWT:9606; |
| ModificationList | carbamoylated residue; monohydroxylated residue |
| Instrument | TripleTOF 5600 |
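The description above mentions two steps that are easy to illustrate: retaining peptides selected in ≥70% of subsampling runs, and reporting balanced accuracy alongside sensitivity and specificity. The following is a minimal Python sketch of those two ideas only, not the submitters' pipeline. The function names, the peptide labels (`pep_A` and so on), and the toy selection runs are hypothetical, and the actual Boruta, LASSO, and RFE selection steps are not reproduced here.

```python
from collections import Counter


def stable_features(selections, min_fraction=0.70):
    """Return features selected in at least `min_fraction` of the runs.

    `selections` is a list with one set of selected feature names per
    subsampling run. This only implements the frequency filter; how each
    run selects its features is outside the scope of this sketch.
    """
    if not selections:
        return []
    n_runs = len(selections)
    # Count each feature at most once per run.
    counts = Counter(feature for run in selections for feature in set(run))
    return sorted(f for f, c in counts.items() if c / n_runs >= min_fraction)


def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy for a binary classifier: mean of the two rates."""
    return (sensitivity + specificity) / 2


# Hypothetical selection results from 10 subsampling runs (illustrative only).
runs = (
    [{"pep_A", "pep_B", "pep_C"}] * 6
    + [{"pep_A", "pep_B"}]
    + [{"pep_A", "pep_D"}]
    + [{"pep_A"}] * 2
)

print(stable_features(runs))
# Sensitivity 0.82 and specificity 0.94 are the GLMNet values quoted above.
print(round(balanced_accuracy(0.82, 0.94), 2))
```

In this toy input, `pep_A` appears in all 10 runs and `pep_B` in 7, so both pass the 70% threshold, while `pep_C` (6 runs) and `pep_D` (1 run) do not. The reported GLMNet sensitivity and specificity average to 0.88, which matches the balanced accuracy stated in the description.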
Dataset History
| Revision | Datetime | Status | ChangeLog Entry |
|---|---|---|---|
| 0 | 2026-02-05 21:15:26 | ID requested | |
| ⏵ 1 | 2026-05-24 16:13:45 | announced | |
Publication List
| 10.3389/fmolb.2026.1768474; |
| Shajari E, Gagné D, Malick M, Roy P, Noël JF, Gagnon H, Delisle M, Boisvert FM, Brunet MA, Beaulieu JF, Non-invasive assessment of inflammatory bowel disease activity using a DIA-derived stool peptidomic signature and machine learning. Front Mol Biosci, 13():1768474(2026) |
Keyword List
| submitter keyword: Human, peptidomics, machine learning, Inflammatory bowel disease, biomarker discovery, Proteomics, stool, clinical mass spectrometry, nested cross validation, IBD activity |
Contact List
| Jean-Francois Beaulieu | |
|---|---|
| contact affiliation | Laboratory of Intestinal Physiopathology, Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada |
| contact email | Jean-Francois.Beaulieu@USherbrooke.ca |
| lab head | |
| Elmira Shajari | |
| contact affiliation | PhD candidate |
| contact email | elmira.shajari@usherbrooke.ca |
| dataset submitter | |
Full Dataset Link List
| Dataset FTP location (most web browsers no longer support FTP natively, so use an FTP client such as FileZilla): ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/05/PXD074117 |
| PRIDE project URI |
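As an alternative to a graphical FTP client, the dataset directory at the FTP address listed above can in principle be listed from a script. The sketch below uses only Python's standard library (`ftplib`, `urllib.parse`). The helper names `split_ftp_url` and `list_dataset_files` are hypothetical, and the sketch assumes the server accepts anonymous logins and that the directory is publicly readable, neither of which is confirmed by this record.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address as given in the dataset link list.
FTP_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/05/PXD074117"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_dataset_files(url=FTP_URL, timeout=30):
    """Return the file names in the dataset directory.

    Assumes anonymous login is permitted (an assumption, not stated
    in the dataset record).
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # anonymous login
        ftp.cwd(path)  # change to the dataset directory
        return ftp.nlst()
```

Calling `list_dataset_files()` opens a network connection, so it is left out of the block; individual files could then be retrieved with `ftplib.FTP.retrbinary`.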




