
PXD064794

PXD064794 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: IsoBayes: a Bayesian approach for single-isoform proteomics inference
Description: The data in this ProteomeXchange entry are specific to the WTC11 benchmarking samples described in the "IsoBayes" publication (Bollon et al.).

Motivation: Studying protein isoforms is an essential step in biomedical research; at present, the main approach for analyzing proteins is bottom-up mass spectrometry proteomics, which returns peptide identifications that are used indirectly to infer the presence of protein isoforms. However, the detection and quantification processes are noisy; in particular, peptides may be erroneously detected, and most peptides, known as shared peptides, are associated with multiple protein isoforms. As a consequence, studying individual protein isoforms is challenging, and inferred protein results are often abstracted to the gene level or to groups of protein isoforms.

Results: Here, we introduce IsoBayes, a novel statistical method to perform inference at the isoform level. Our method enhances the information available by integrating mass spectrometry proteomics and transcriptomics data in a Bayesian probabilistic framework. To account for the uncertainty in the measurement process, we propose a two-layer latent variable approach: first, we sample whether a peptide has been correctly detected (or, alternatively, filter peptides); second, we allocate the abundance of the selected peptides across the protein(s) they are compatible with. This enables us, starting from peptide-level data, to recover protein-level data; in particular, we: i) infer the presence/absence of each protein isoform (via a posterior probability), ii) estimate its abundance (and credible interval), and iii) target isoforms where transcript and protein relative abundances significantly differ. We benchmarked our approach in simulations and in two multi-protease real datasets: our method displays good sensitivity and specificity when detecting protein isoforms, its estimated abundances correlate highly with the ground truth, and it can detect changes between protein and transcript relative abundances.

Availability and implementation: IsoBayes is freely distributed as a Bioconductor R package, and is accompanied by an example usage vignette.
HostingRepository: PRIDE
AnnounceDate: 2025-06-10
AnnouncementXML: Submission_2025-06-10_07:47:49.365.xml
DigitalObjectIdentifier:
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Unsupported dataset by repository
PrimarySubmitter: Erin Jeffery
SpeciesList: scientific name: Homo sapiens (Human); NCBI TaxID: 9606
ModificationList: monohydroxylated residue; iodoacetamide derivatized residue
Instrument: Orbitrap Eclipse
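
The two-layer latent variable approach outlined in the Description can be illustrated with a toy sketch. This is not the IsoBayes implementation (which is an R package); the function name, the input layout, and the use of fixed isoform weights are all assumptions made for illustration. It shows one sampling sweep only: each peptide is first kept or dropped according to an assumed detection probability, and the abundance of each kept peptide is then assigned to one of its compatible isoforms.

```python
import random


def allocate_once(peptides, weights, rng):
    """Run one sweep of a toy two-layer allocation (hypothetical helper).

    peptides: list of (abundance, detection_prob, compatible_isoforms)
    weights:  dict mapping isoform -> current relative-abundance weight (> 0)
    rng:      a random.Random instance
    """
    totals = {isoform: 0.0 for isoform in weights}
    for abundance, detection_prob, isoforms in peptides:
        # Layer 1: sample whether this peptide is a correct detection.
        if rng.random() >= detection_prob:
            continue
        # Layer 2: assign the abundance to one compatible isoform, chosen
        # with probability proportional to the current weights.
        chosen = rng.choices(isoforms, weights=[weights[i] for i in isoforms])[0]
        totals[chosen] += abundance
    return totals


# Made-up example data: one unique peptide, one rejected, one shared.
example_peptides = [
    (10.0, 1.0, ["A"]),
    (5.0, 0.0, ["A", "B"]),
    (4.0, 1.0, ["A", "B"]),
]
example_totals = allocate_once(example_peptides, {"A": 1.0, "B": 1.0}, random.Random(0))
```

In a full Bayesian sampler the weights would themselves be updated between sweeps and the sweeps repeated many times; that part is omitted here.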
Dataset History
Revision  Datetime             Status        ChangeLog Entry
0         2025-06-09 15:50:53  ID requested
1         2025-06-10 07:47:49  announced
Publication List
Dataset with its publication pending
Keyword List
submitter keyword: Protein Inference, Orbitrap Eclipse, WTC-11 bottom-up, computational proteomics, DDA
Contact List
Gloria Sheynkman
contact affiliation: University of Virginia School of Medicine, Department of Molecular Physiology and Biological Physics
contact email: gs9yr@virginia.edu
lab head
Erin Jeffery
contact affiliation: University of Virginia, Department of Molecular Physiology and Biological Physics
contact email: edf4n@virginia.edu
dataset submitter
Full Dataset Link List
Dataset FTP location: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2025/06/PXD064794
NOTE: Most web browsers no longer support FTP natively. To retrieve the files, open the address above in a dedicated FTP client such as FileZilla, or configure your browser to launch one when an FTP link is clicked.
PRIDE project URI
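
As an alternative to a graphical FTP client, the dataset directory can also be listed from a script. The following is a minimal sketch using Python's standard ftplib; the helper names are hypothetical, and it assumes the server accepts anonymous logins and that the directory is still present at the address given above.

```python
from ftplib import FTP
from urllib.parse import urlparse

FTP_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2025/06/PXD064794"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parsed = urlparse(url)
    if parsed.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parsed.hostname, parsed.path


def list_dataset_files(url=FTP_URL, timeout=30):
    """Return the file names in the dataset directory (needs network access)."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # anonymous login; assumed to be permitted by the server
        ftp.cwd(path)
        return ftp.nlst()
```

Calling list_dataset_files() returns the directory listing; individual files could then be fetched with ftplib's retrbinary.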
Repository Record List