PXD057948 is an original dataset announced via ProteomeXchange.
Dataset Summary
| Title | Exploration of Artificial Intelligence-Based Spectral Libraries for Data Independent Acquisition in Complex Ocean Metaproteomic Analyses |
| Description | Ocean metaproteomics provides valuable insights into the structure and function of marine microbial communities. Yet, ocean samples are challenging due to their extensive biological diversity that results in a very large number of peptides with a large dynamic range. This study characterized the capabilities of data independent acquisition (DIA) mode for use in ocean metaproteomic samples. Spectral libraries were constructed from discovered peptides and proteins using machine learning algorithms to remove incorporation of false positives in the libraries. When compared with 1-dimensional and 2-dimensional data dependent acquisition analyses (DDA), DIA outperformed DDA both with and without gas phase fractionation. We found that larger discovered protein spectral libraries performed better, regardless of the geographic distance between where samples were collected for library generation and where the test samples were collected. Moreover, the spectral library containing all unique proteins present in the Ocean Protein Portal outperformed smaller libraries generated from individual sampling campaigns. However, a spectral library constructed from all open reading frames in a metagenome was found to be too large to be workable, resulting in low peptide identifications due to challenges maintaining a low false discovery rate with such a large database size. Given sufficient sequencing depth and validation studies, spectral libraries generated from previously discovered proteins can serve as a community resource, saving resequencing efforts. The spectral libraries generated in this study are available at the Ocean Protein Portal for this purpose. |
| HostingRepository | PRIDE |
| AnnounceDate | 2025-09-29 |
| AnnouncementXML | Submission_2025-09-29_14:48:51.994.xml |
| DigitalObjectIdentifier | |
| ReviewLevel | Peer-reviewed dataset |
| DatasetOrigin | Original dataset |
| RepositorySupport | Unsupported dataset by repository |
| PrimarySubmitter | Matthew McIlvin |
| SpeciesList | scientific name: Bacteria; NCBI TaxID: NCBITaxon:2; scientific name: environmental samples |
| ModificationList | acetylated residue; monohydroxylated residue; iodoacetamide derivatized residue |
| Instrument | Orbitrap Fusion |
Dataset History
| Revision | Datetime | Status | ChangeLog Entry |
| 0 | 2024-11-15 11:17:47 | ID requested | |
| ⏵ 1 | 2025-09-29 14:48:52 | announced | |
Publication List
Keyword List
| submitter keyword: DDA DIA Ocean metaproteomics proteomics |
Contact List
| Mak Saito |
| contact affiliation | Woods Hole Oceanographic Institution |
| contact email | msaito@whoi.edu |
| lab head | |
| Matthew McIlvin |
| contact affiliation | Woods Hole Oceanographic Inst. |
| contact email | mmcilvin@whoi.edu |
| dataset submitter | |
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers no longer support FTP natively. To download the files, install a separate FTP client (we recommend FileZilla) and either configure your browser to launch it when you click the FTP link, or open the client directly and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2025/09/PXD057948 |
| PRIDE project URI |
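For scripted access without a graphical FTP client, the directory above can also be listed with Python's standard `ftplib` module. The following is a minimal sketch, not an official PRIDE tool: the helper names `split_ftp_url` and `list_dataset_files` are hypothetical, and it assumes the server accepts anonymous logins and that the dataset directory exists at the address given in the note.

```python
from ftplib import FTP
from urllib.parse import urlparse

# Address copied from the FTP note in this record.
FTP_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2025/09/PXD057948"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory). Hypothetical helper."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_dataset_files(url=FTP_URL, timeout=30):
    """Return the entry names in the dataset directory (needs network access)."""
    host, directory = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()         # anonymous login; assumed to be permitted
        ftp.cwd(directory)  # move into the dataset directory
        return ftp.nlst()   # names only, no sizes or timestamps


if __name__ == "__main__":
    for name in list_dataset_files():
        print(name)
```

Running the script prints one entry name per line. Individual files could then be fetched with `ftp.retrbinary`, which is left out here to keep the sketch short.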
Repository Record List
- PRIDE
- PXD057948
- Label: PRIDE project
- Name: Exploration of Artificial Intelligence-Based Spectral Libraries for Data Independent Acquisition in Complex Ocean Metaproteomic Analyses