PXD076296 is an original dataset announced via ProteomeXchange.
Dataset Summary
| Title | Zero-Shot De Novo Peptide Sequencing with Open Post-Translational Modification Discovery |
| Description | Proteins play essential roles in biology, yet identifying their precise sequences and modifications remains challenging. De novo peptide sequencing offers a powerful solution by directly inferring sequences from mass spectrometry data without relying on protein databases. Recent deep learning models have significantly advanced this task but remain trapped in a major dilemma: they require labeled training data to recognize post-translational modifications (PTMs), which is unavailable for most biologically relevant but rare or unknown modifications. We solve this long-standing problem by introducing RNovA, a transformer-based de novo sequencing algorithm enhanced with relative positional embeddings and a reinforcement-learning–style sequential decision framework. RNovA enables open PTM discovery in a zero-shot setting, without retraining or a predefined list of candidate residues, while maintaining state-of-the-art performance on standard benchmarks. Demonstrating this capability, we successfully identified peptides modified by kynurenine, an uncommon and biologically relevant PTM, in clinical samples from rheumatoid arthritis patients. RNovA overcomes key limitations of existing methods and provides a foundation for exploring previously inaccessible regions of the proteome, including peptides with unexpected or unannotated modifications. This capability is widely needed in immunology, biomarker discovery, and biomedical research. |
| HostingRepository | PRIDE |
| AnnounceDate | 2026-03-29 |
| AnnouncementXML | Submission_2026-03-29_09:56:06.564.xml |
| DigitalObjectIdentifier | |
| ReviewLevel | Peer-reviewed dataset |
| DatasetOrigin | Original dataset |
| RepositorySupport | Unsupported dataset by repository |
| PrimarySubmitter | Zeping Mao |
| SpeciesList | scientific name: Homo sapiens (Human); NCBI TaxID: NEWT:9606; scientific name: Saccharomyces cerevisiae (Baker's yeast); NCBI TaxID: NEWT:4932; |
| ModificationList | No PTMs are included in the dataset |
| Instrument | Orbitrap Fusion Lumos; orbitrap; Orbitrap Fusion |
Dataset History
| Revision | Datetime | Status | ChangeLog Entry |
| 0 | 2026-03-29 09:04:07 | ID requested | |
| ⏵ 1 | 2026-03-29 09:56:07 | announced | |
Publication List
| Dataset with its publication pending |
Keyword List
| submitter keyword: De Novo, Deep Learning, PTM |
Contact List
| Ming Li |
| contact affiliation | Canada Research Chair in Bioinformatics, University Professor, David R. Cheriton School of Computer Science, University of Waterloo, Ontario, Canada |
| contact email | mli@uwaterloo.ca |
| lab head | |
| Zeping Mao |
| contact affiliation | University of Waterloo |
| contact email | z37mao@uwaterloo.ca |
| dataset submitter | |
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers have discontinued native support for FTP access within the browser window. You can usually install a separate FTP app (we recommend FileZilla) and configure your browser to launch it when you click this FTP link. Alternatively, launch an app that supports FTP (such as FileZilla) directly and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/03/PXD076296 |
| PRIDE project URI |
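For scripted access, the FTP address above can also be read with Python's standard `ftplib` instead of a graphical client. The following is a minimal sketch, not an official PRIDE client. It assumes the server accepts anonymous logins and that the dataset directory is readable as given. The helper names `split_ftp_url` and `list_dataset_files` are hypothetical and introduced only for illustration.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address quoted in the note above.
DATASET_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/03/PXD076296"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_dataset_files(url=DATASET_URL, timeout=30):
    """Return the entry names in the dataset directory.

    Assumption: anonymous login is permitted on this server.
    Requires network access.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # no arguments: anonymous login
        ftp.cwd(path)  # change into the dataset directory
        return ftp.nlst()


# Example (requires network access):
#   for name in list_dataset_files():
#       print(name)
```

Individual files could then be fetched with `FTP.retrbinary`. Which files the directory contains is not stated in this record, so the sketch only lists names.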
Repository Record List
- PRIDE
- PXD076296
- Label: PRIDE project
- Name: Zero-Shot De Novo Peptide Sequencing with Open Post-Translational Modification Discovery