PXD013641 is an original dataset announced via ProteomeXchange.
Dataset Summary
Title | Extremely fast and accurate open modification spectral library searching of high-resolution mass spectra using feature hashing and graphics processing units |
Description | Open modification searching (OMS) is a powerful search strategy to identify peptides with any type of modification. OMS works by using a very wide precursor mass window to allow modified spectra to match against their unmodified variants, after which the modification types can be inferred from the corresponding precursor mass differences. A disadvantage of this strategy, however, is its large computational requirements, as each query spectrum has to be compared against a multitude of candidate peptides. We have previously introduced the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. Here we demonstrate how this candidate selection procedure can be further optimized using graphics processing units. Additionally, we introduce a feature hashing scheme to convert high-resolution spectra to low-dimensional vectors. Based on these algorithmic advances, along with low-level code optimizations, the new version of ANN-SoLo is up to an order of magnitude faster than its initial version. This makes it possible to efficiently perform open searches on a large scale to gain a deeper understanding of the protein modification landscape. We demonstrate the computational efficiency and identification performance of ANN-SoLo based on a large data set of the draft human proteome. |
HostingRepository | PRIDE |
AnnounceDate | 2019-12-06 |
AnnouncementXML | Submission_2019-12-06_02:45:43.xml |
DigitalObjectIdentifier | |
ReviewLevel | Peer-reviewed dataset |
DatasetOrigin | Original dataset |
RepositorySupport | Unsupported dataset by repository |
PrimarySubmitter | Wout Bittremieux |
SpeciesList | scientific name: Homo sapiens (Human); NCBI TaxID: 9606; scientific name: Saccharomyces cerevisiae (Baker's yeast); NCBI TaxID: 4932; |
ModificationList | No PTMs are included in the dataset |
Instrument | TripleTOF 5600 |
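The description above mentions two ideas that a short example may make more concrete: converting a high-resolution spectrum to a low-dimensional vector by feature hashing, and inferring a modification mass from the precursor mass difference. The following is a minimal sketch only, not the ANN-SoLo implementation. The function names, the bin width, the vector length, and the use of CRC32 as the hash function are all assumptions chosen for illustration.

```python
import math
import zlib


def hash_spectrum(mz, intensity, bin_width=0.05, dim=800):
    """Hash fine-grained m/z bins into a fixed-length, unit-norm vector.

    bin_width and dim are hypothetical values, not taken from the dataset.
    """
    vec = [0.0] * dim
    for m, i in zip(mz, intensity):
        bin_idx = int(m / bin_width)  # index of the narrow m/z bin
        # Assumed hash function: CRC32 of the bin index, folded into dim slots.
        slot = zlib.crc32(bin_idx.to_bytes(8, "little")) % dim
        vec[slot] += i  # bins that collide in a slot add their intensities
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def dot(a, b):
    """Dot product; for unit-norm vectors this is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def precursor_mass_difference(query_mz, library_mz, charge):
    """Mass difference implied by two precursor m/z values at the same charge."""
    return (query_mz - library_mz) * charge
```

In an open search, a query spectrum and a library spectrum hashed this way could be compared with `dot`, and a match with a non-zero `precursor_mass_difference` would point to a candidate modification mass.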
Dataset History
Revision | Datetime | Status | ChangeLog Entry |
0 | 2019-04-26 01:40:05 | ID requested | |
1 | 2019-12-06 02:45:44 | announced | |
Publication List
Bittremieux W, Laukens K, Noble WS, Extremely Fast and Accurate Open Modification Spectral Library Searching of High-Resolution Mass Spectra Using Feature Hashing and Graphics Processing Units. J Proteome Res, 18(10):3792-3799(2019) [pubmed] |
Keyword List
submitter keyword: mass spectrometry, proteomics, open modification searching, spectral library, post-translational modification, approximate nearest neighbor indexing, graphics processing unit, feature hashing |
Contact List
Kris Laukens |
contact affiliation | Department of Mathematics and Computer Science, University of Antwerp, Belgium |
contact email | kris.laukens@uantwerpen.be |
lab head | |
Wout Bittremieux |
contact affiliation | University of Antwerp |
contact email | wout.bittremieux@uantwerpen.be |
dataset submitter | |
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers no longer support FTP natively. To access the files, use an FTP client (such as FileZilla) with this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2019/12/PXD013641 |
PRIDE project URI |
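As an alternative to a graphical FTP client, the dataset directory can be listed with a script. The sketch below uses Python's standard `ftplib` and assumes that the server accepts anonymous logins. The helper names `split_ftp_url` and `list_dataset_files` are hypothetical.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address as given in the dataset record.
DATASET_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2019/12/PXD013641"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parsed = urlparse(url)
    return parsed.hostname, parsed.path


def list_dataset_files(url=DATASET_URL):
    """Return the file names in the dataset directory (needs network access)."""
    host, path = split_ftp_url(url)
    with FTP(host) as ftp:
        ftp.login()  # anonymous login; assumed to be permitted by the server
        ftp.cwd(path)
        return ftp.nlst()
```

Calling `list_dataset_files()` would return the names found in the directory, which could then be fetched one by one with `FTP.retrbinary`.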
Repository Record List
- PRIDE
- PXD013641
- Label: PRIDE project
- Name: Extremely fast and accurate open modification spectral library searching of high-resolution mass spectra using feature hashing and graphics processing units