
PXD076296-1

PXD076296 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: Zero-Shot De Novo Peptide Sequencing with Open Post-Translational Modification Discovery
Description: Proteins play essential roles in biology, yet identifying their precise sequences and modifications remains challenging. De novo peptide sequencing offers a powerful solution by directly inferring sequences from mass spectrometry data without relying on protein databases. Recent deep learning models have significantly advanced this task but remain trapped in a major dilemma: they require labeled training data to recognize post-translational modifications (PTMs), which is unavailable for most biologically relevant but rare or unknown modifications. We solve this long-standing problem by introducing RNovA, a transformer-based de novo sequencing algorithm enhanced with relative positional embeddings and a reinforcement-learning–style sequential decision framework. RNovA enables open PTM discovery in a zero-shot setting, without retraining or a predefined list of candidate residues, while maintaining state-of-the-art performance on standard benchmarks. Demonstrating this capability, we successfully identified peptides modified by kynurenine, an uncommon and biologically relevant PTM, in clinical samples from rheumatoid arthritis patients. RNovA overcomes key limitations of existing methods and provides a foundation for exploring previously inaccessible regions of the proteome, including peptides with unexpected or unannotated modifications. This capability is widely needed in immunology, biomarker discovery, and biomedical research.
HostingRepository: PRIDE
AnnounceDate: 2026-03-29
AnnouncementXML: Submission_2026-03-29_09:56:06.564.xml
DigitalObjectIdentifier:
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Unsupported dataset by repository
PrimarySubmitter: Zeping Mao
SpeciesList: scientific name: Homo sapiens (Human); NCBI TaxID: NEWT:9606; scientific name: Saccharomyces cerevisiae (Baker's yeast); NCBI TaxID: NEWT:4932
ModificationList: No PTMs are included in the dataset
Instrument: Orbitrap Fusion Lumos; orbitrap; Orbitrap Fusion
Dataset History
Revision  Datetime             Status        ChangeLog Entry
0         2026-03-29 09:04:07  ID requested
1         2026-03-29 09:56:07  announced
Publication List
Dataset with its publication pending
Keyword List
submitter keyword: De Novo, Deep Learning, PTM
Contact List
Ming Li
contact affiliation: Canada Research Chair in Bioinformatics, University Professor, David R. Cheriton School of Computer Science, University of Waterloo, Ontario, Canada
contact email: mli@uwaterloo.ca
lab head
Zeping Mao
contact affiliation: University of Waterloo
contact email: z37mao@uwaterloo.ca
dataset submitter
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers have discontinued native support for FTP access within the browser window. You can usually install a separate FTP client (we recommend FileZilla) and configure your browser to launch it when you click on this FTP link. Alternatively, launch any application that supports FTP (such as FileZilla) and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/03/PXD076296
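For scripted access, the FTP address above can also be read with Python's standard library instead of a graphical client. The following is a minimal sketch, not an official PRIDE tool: the helper names `split_ftp_url` and `list_dataset_files` are hypothetical, and it assumes the server accepts anonymous logins and that the directory exists at the address given.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address taken from the dataset record above.
FTP_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/03/PXD076296"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_dataset_files(url, timeout=30):
    """Return the entry names in the dataset directory (needs network access)."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # anonymous login; assumed to be permitted by the server
        ftp.cwd(path)  # change into the dataset directory
        return ftp.nlst()
```

Calling `list_dataset_files(FTP_URL)` on a machine with network access should return the file names in the dataset directory; individual files could then be fetched with `FTP.retrbinary`.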
PRIDE project URI
Repository Record List