
PXD057120

PXD057120 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: Predictive Stool-Based Protein Biomarkers for the Classification of Crohn's Disease and Ulcerative Colitis Using a Machine Learning Approach
Description: Background and Aim: Crohn's disease (CD) and ulcerative colitis (UC) are the two major chronic inflammatory bowel diseases (IBD). Although their symptoms are similar, their pathological features and clinical treatments differ. Currently, distinguishing between these diseases involves invasive procedures such as colonoscopy and histopathology, causing discomfort and inconvenience to patients. The use of fecal proteins as non-invasive biomarkers offers a promising alternative due to their stability and proximity to inflamed tissues. This study focuses on using high-throughput data-independent acquisition (DIA) mass spectrometry to develop accurate biomarker signatures from complex stool samples. Methods: Stool samples obtained from 46 active CD patients and 23 active UC patients were analyzed. Using DIA-based SWATH mass spectrometry, we explored the stool proteome, identifying and quantifying approximately 1,250 proteins. The samples were divided into training and testing groups. After data processing, various feature selection algorithms were applied to the training group to determine proteins that differed significantly between the CD and UC groups. Additionally, six machine learning algorithms (k-Nearest Neighbors, Naive Bayes, eXtreme Gradient Boosting, Random Forest, Support Vector Machine, and glmnet) were evaluated to identify the best-performing classifier. Results: Sixteen proteins were selected based on several feature selection algorithms, and the six ML models were trained on them. Based on the performance metrics of each algorithm on the training dataset, the Naive Bayes model was selected. For performance validation, the final predictive model was applied to 16 prospective samples as the test dataset. The model achieved an AUC of 0.95 on the training dataset and an AUC of 0.96 on the test dataset, indicating robustness and a lack of overfitting.
Conclusion: This study demonstrates the effectiveness of SWATH-based proteomics and machine learning in developing predictive models to classify CD and UC. Further validation on a larger cohort using targeted MRM mass spectrometry would help establish the clinical utility and reliability of this approach.
HostingRepository: PRIDE
AnnounceDate: 2025-12-04
AnnouncementXML: Submission_2025-12-03_19:48:44.911.xml
DigitalObjectIdentifier:
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Unsupported dataset by repository
PrimarySubmitter: Elmira Shajari
SpeciesList: scientific name: Homo sapiens (Human); NCBI TaxID: NEWT:9606;
ModificationList: carbamoylated residue
Instrument: TripleTOF 5600
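
The description above reports classifier performance as AUC (area under the ROC curve). As a rough illustration of what that metric measures, the sketch below computes AUC as the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the dataset or from the authors' pipeline.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of classifier scores, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every positive sample outranks every negative one, while 0.5 corresponds to random ranking.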
Dataset History
Revision | Datetime | Status | ChangeLog Entry
0 | 2024-10-23 16:50:28 | ID requested |
1 | 2025-12-03 19:48:45 | announced |
Publication List
10.14309/ctg.0000000000000925;
Shajari E, Gagné D, Bourassa F, Malick M, Roy P, Noël JF, Gagnon H, Delisle M, Boisvert FM, Brunet M, Beaulieu JF, Stool-Based Proteomic Signature for the Noninvasive Classification of Crohn's Disease and Ulcerative Colitis Using Machine Learning. Clin Transl Gastroenterol, 16(11):e00925(2025) [pubmed]
Keyword List
submitter keyword: Crohn’s disease, inflammatory bowel disease (IBD) subtyping, protein biomarkers, DIA mass spectrometry, quantitative proteomics, machine learning, ulcerative colitis
Contact List
Jean-Francois Beaulieu
contact affiliation: Laboratory of Intestinal Physiopathology, Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, J1H 5N4, Canada
contact email: jean-francois.beaulieu@usherbrooke.ca
lab head
Elmira Shajari
contact affiliation: PhD candidate
contact email: elmira.shajari@usherbrooke.ca
dataset submitter
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers have now discontinued native support for FTP access within the browser window. But you can usually install another FTP app (we recommend FileZilla) and configure your browser to launch the external application when you click on this FTP link. Or otherwise, launch an app that supports FTP (like FileZilla) and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2025/12/PXD057120
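
As an alternative to a graphical FTP client, the directory at the address above can also be listed from a script. The following is a minimal sketch using Python's standard `ftplib`. It assumes the server accepts anonymous logins, which is not stated on this page. The helper names `split_ftp_url` and `list_dataset_files` are hypothetical.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address quoted in the note above.
DATASET_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2025/12/PXD057120"


def split_ftp_url(url):
    """Split an ftp:// URL into (hostname, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_dataset_files(url=DATASET_URL, timeout=30):
    """Return the file names in the dataset directory.

    Assumption: the server allows anonymous login.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()      # no arguments means anonymous login
        ftp.cwd(path)    # move into the dataset directory
        return ftp.nlst()
```

Calling `list_dataset_files()` requires network access. Individual files could then be fetched with `FTP.retrbinary`.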
PRIDE project URI
Repository Record List