
PXD053291-1

PXD053291 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: A transformer model for de novo sequencing of data-independent acquisition mass spectrometry data
Description: A core computational challenge in the analysis of mass spectrometry data is the de novo sequencing problem, in which the generating amino acid sequence is inferred directly from an observed fragmentation spectrum without the use of a sequence database. Recently, deep learning models have made significant advances in de novo sequencing by learning from massive datasets of high confidence labeled mass spectra. However, these methods are primarily designed for data-dependent acquisition (DDA) experiments. Over the past decade, the field of mass spectrometry has been moving toward using data-independent acquisition (DIA) protocols for the analysis of complex proteomic samples due to their superior specificity and reproducibility. Hence, we present a new de novo sequencing model called Cascadia, which uses a transformer architecture to handle the more complex data generated by DIA protocols. In comparisons with existing approaches for de novo sequencing of DIA data, Cascadia achieves improved performance across a range of instruments and experimental protocols. Additionally, we demonstrate Cascadia’s ability to accurately discover de novo coding variants and peptides from the variable region of antibodies.
HostingRepository: PanoramaPublic
AnnounceDate: 2024-06-21
AnnouncementXML: Submission_2024-06-21_17:18:30.512.xml
DigitalObjectIdentifier:
ReviewLevel: Non peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Supported dataset by repository
PrimarySubmitter: Michael MacCoss
SpeciesList: scientific name: Mus musculus; NCBI TaxID: 10090; scientific name: Homo sapiens; NCBI TaxID: 9606;
ModificationList: Carbamidomethyl
Instrument: Orbitrap Astral; Q Exactive HF-X
Dataset History
Revision  Datetime             Status        ChangeLog Entry
0         2024-06-21 16:54:02  ID requested
1         2024-06-21 17:18:30  announced
Publication List
Justin Sanders, Bo Wen, Paul Rudnick, Rich Johnson, Christine C. Wu, Sewoong Oh, Michael J. MacCoss, William Stafford Noble. A transformer model for de novo sequencing of data-independent acquisition mass spectrometry data. bioRxiv 2024.06.03.597251
doi: https://doi.org/10.1101/2024.06.03.597251
Keyword List
submitter keyword: de novo, data independent acquisition, sequence variant, extracellular vesicles, Mag-Net
Contact List
Michael MacCoss
contact affiliation: University of Washington
contact email: maccoss@uw.edu
lab head
Michael MacCoss
contact affiliation: University of Washington
contact email: maccoss@uw.edu
dataset submitter
Full Dataset Link List
Panorama Public dataset URI