PXD030708 is an original dataset announced via ProteomeXchange.
Dataset Summary
Title | MetaNovo : an open-source pipeline for probabilistic peptide discovery in complex metaproteomic datasets |
Description | Results: We compared MetaNovo to published results from the MetaPro-IQ pipeline on 8 human mucosal-luminal interface samples, with comparable numbers of peptide and protein identifications, many shared peptide sequences and a similar bacterial taxonomic distribution compared to that found using a matched metagenome database - but simultaneously identified proteins present in the samples that are derived from known gut organisms that were missed by the previous analyses. Finally, MetaNovo was benchmarked on samples of known microbial composition against matched metagenomic and whole genomic database workflows, yielding many more MS/MS for the expected taxa, with improved taxonomic representation, while also highlighting previously described genome sequencing quality concerns for one of the organisms, and providing evidence for a known sample contaminant without prior expectation. Conclusions: By estimating taxonomic and peptide level information directly on microbiome samples from tandem mass spectrometry data, MetaNovo enables the simultaneous identification of peptides from all domains of life in metaproteome samples, bypassing the need for curated sequence search databases. We show that the MetaNovo approach to mass spectrometry metaproteomics can be more accurate than current gold standard approaches of tailored or matched genomic database searches, can identify sample contaminants without prior expectation, and that increases in assigned spectra from this approach can yield novel insights into previously unidentified metaproteomic signals - building on the potential for complex mass spectrometry metaproteomic data to speak for itself. The pipeline source code is available on GitHub, and documentation is provided to run the software as a Singularity-compatible Docker image available from Docker Hub. |
HostingRepository | PRIDE |
AnnounceDate | 2023-05-16 |
AnnouncementXML | Submission_2023-05-16_04:49:40.189.xml |
DigitalObjectIdentifier | |
ReviewLevel | Peer-reviewed dataset |
DatasetOrigin | Original dataset |
RepositorySupport | Unsupported dataset by repository |
PrimarySubmitter | Matthys Potgieter |
SpeciesList | scientific name: human gut metagenome; NCBI TaxID: 408170; |
ModificationList | monohydroxylated residue |
Instrument | Q Exactive |
Dataset History
Revision | Datetime | Status | ChangeLog Entry |
0 | 2022-01-04 03:15:47 | ID requested | |
⏵ 1 | 2023-05-16 04:49:40 | announced | |
Publication List
Dataset with its publication pending |
Keyword List
submitter keyword: metaproteomics, de novo sequencing, probabilistic |
Contact List
Nicola Mulder |
contact affiliation | Computational Biology Division, Department of Integrative Biomedical Sciences, University of Cape Town, Cape Town, South Africa. |
contact email | nicola.mulder@uct.ac.za |
lab head | |
Matthys Potgieter |
contact affiliation | Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, South Africa |
contact email | matthys.potgieter@gmail.com |
dataset submitter | |
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers have now discontinued native support for FTP access within the browser window. But you can usually install another FTP app (we recommend FileZilla) and configure your browser to launch the external application when you click on this FTP link. Or otherwise, launch an app that supports FTP (like FileZilla) and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2023/05/PXD030708 |
PRIDE project URI |
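As an alternative to a graphical FTP client, the dataset directory can also be listed from a script. The following is a minimal sketch using Python's standard `ftplib`; it assumes the PRIDE FTP server accepts anonymous logins, and the helper names `split_ftp_url` and `list_dataset_files` are hypothetical, introduced here only for illustration. The address is the one given in the note above.

```python
from ftplib import FTP
from urllib.parse import urlsplit

# FTP address copied from the dataset note above.
DATASET_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2023/05/PXD030708"


def split_ftp_url(url):
    """Return (host, directory) for an ftp:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path or "/"


def list_dataset_files(url=DATASET_URL, timeout=30):
    """List entries in the dataset directory (requires network access)."""
    host, directory = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # assumption: anonymous login is permitted
        ftp.cwd(directory)
        return ftp.nlst()


if __name__ == "__main__":
    for name in list_dataset_files():
        print(name)
```

Individual files could then be fetched with `FTP.retrbinary`; that step is omitted here because the file names in the directory are not listed in this record.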
Repository Record List
- PRIDE
- PXD030708
- Label: PRIDE project
- Name: MetaNovo : an open-source pipeline for probabilistic peptide discovery in complex metaproteomic datasets