
PXD074105

PXD074105 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: InstaNovo-P: A de novo peptide sequencing model for phosphoproteomics
Description: Phosphorylation, a crucial post-translational modification (PTM), plays a central role in cellular signaling and disease mechanisms. Mass spectrometry-based phosphoproteomics is widely used for system-wide characterization of phosphorylation events. However, traditional methods struggle with accurate phosphorylated site localization, complex search spaces, and detecting sequences outside the reference database. Advances in de novo peptide sequencing offer opportunities to address these limitations, but have yet to be integrated and adapted for phosphoproteomics datasets. Here, we present InstaNovo-P, a phosphorylation-specific version of our transformer-based InstaNovo model, fine-tuned on extensive phosphoproteomics datasets. InstaNovo-P significantly surpasses existing methods in phosphorylated peptide detection and phosphorylated site localization accuracy across multiple datasets, including complex experimental scenarios. Our model robustly identifies peptides with single and multiple phosphorylated sites, effectively localizing phosphorylation events on serine, threonine, and tyrosine residues. We experimentally validate our model predictions by studying FGFR2 signaling, further demonstrating that InstaNovo-P uncovers phosphorylated sites previously missed by traditional database searches. These predictions align with critical biological processes, confirming the model’s capacity to yield valuable biological insights. InstaNovo-P adds value to phosphoproteomics experiments by effectively identifying biologically relevant phosphorylation events without prior information, providing a powerful analytical tool for the dissection of signaling pathways.
HostingRepository: PRIDE
AnnounceDate: 2026-05-20
AnnouncementXML: Submission_2026-05-20_01:49:20.050.xml
DigitalObjectIdentifier:
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Unsupported dataset by repository
PrimarySubmitter: Vahap Canbay
SpeciesList: scientific name: Homo sapiens (Human); NCBI TaxID: NEWT:9606
ModificationList: phosphorylated residue; acetylated residue; monohydroxylated residue; iodoacetamide derivatized residue
Instrument: Orbitrap Exploris 480
Dataset History
Revision  Datetime             Status        ChangeLog Entry
0         2026-02-05 13:51:09  ID requested
1         2026-05-20 01:49:20  announced
Publication List
Dataset with its publication pending
Keyword List
submitter keyword: InstaNovo, De Novo Sequencing, Phosphoproteomics, Breast Cancer, FGFR2 signaling
Contact List
Konstantinos Kalogeropoulos (lab head)
contact affiliation: Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark; Department of Bionanoscience, Delft University of Technology, 2629 HZ Delft, Netherlands; Kavli Institute of Nanoscience, 2629 HZ Delft, Netherlands
contact email: konka@dtu.dk
Vahap Canbay (dataset submitter)
contact affiliation: Technical University of Denmark
contact email: vahcan@dtu.dk
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers have discontinued native support for FTP access within the browser window. You can usually install a separate FTP application (we recommend FileZilla) and configure your browser to launch it when you click on this FTP link. Alternatively, open any application that supports FTP (such as FileZilla) and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/05/PXD074105
PRIDE project URI
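For scripted access, the FTP address given in the note above can also be read without a browser or GUI client. The following is a minimal sketch using only Python's standard library. The helper names `split_ftp_url` and `list_dataset_files` are hypothetical, and the sketch assumes the server accepts anonymous logins, which this record does not state.

```python
from ftplib import FTP
from urllib.parse import urlsplit

# FTP address copied from the dataset record above.
FTP_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2026/05/PXD074105"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_dataset_files(url, timeout=30):
    """Return the entry names in the dataset directory (needs network access)."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # anonymous login: an assumption, not stated in the record
        ftp.cwd(path)  # change into the dataset directory
        return ftp.nlst()
```

Calling `list_dataset_files(FTP_URL)` would return whatever entries the server reports for that directory. Individual files could then be fetched with `FTP.retrbinary`.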
Repository Record List