PXD003500 is an original dataset announced via ProteomeXchange.
Dataset Summary
Title | Proteogenomic analysis of Mycobacterium smegmatis using high resolution mass spectrometry |
Description | Biochemical evidence is vital for accurate genome annotation. Integrating experimental data collected at the proteome level using high-resolution mass spectrometry improves genome annotation by providing evidence for novel gene models while validating or modifying others. Here we report the results of a proteogenomic analysis of a reference strain of Mycobacterium smegmatis (mc2155), a fast-growing model organism for the pathogenic Mycobacterium tuberculosis, the causative agent of tuberculosis. By integrating high-throughput LC/MS/MS proteomic data with genomic six-frame translation and ab initio gene prediction databases, a total of 2887 ORFs were identified, including 2810 ORFs annotated to a Reference protein, and 63 ORFs not previously annotated to a Reference protein. Further, the translational start site (TSS) was validated for 558 Reference proteome gene models, while upstream translational evidence was identified for 81. In addition, N-terminus-derived peptide identifications allowed for downstream TSS modification of a further 24 gene models. We validated the existence of 6 previously described interrupted coding sequences at the peptide level, and provide evidence for 4 novel frameshift positions. Analysis of peptide posterior error probability (PEP) scores indicates that the novel peptide identifications are of high confidence and that the genome of M. smegmatis is not yet fully annotated. |
HostingRepository | PRIDE |
AnnounceDate | 2016-03-31 |
AnnouncementXML | Submission_2016-04-21_01:23:08.xml |
DigitalObjectIdentifier | |
ReviewLevel | Peer-reviewed dataset |
DatasetOrigin | Original dataset |
RepositorySupport | Unsupported dataset by repository |
PrimarySubmitter | Matthys Potgieter |
SpeciesList | scientific name: Mycobacterium smegmatis (strain ATCC 700084 / mc(2)155); NCBI TaxID: 246196; |
ModificationList | No PTMs are included in the dataset |
Instrument | Q Exactive |
Dataset History
Revision | Datetime | Status | ChangeLog Entry |
0 | 2016-01-21 02:04:14 | ID requested | |
1 | 2016-03-31 05:43:02 | announced | |
2 | 2016-04-21 01:23:09 | announced | Updated publication reference for PubMed record(s): 27092112. |
3 | 2024-10-22 04:27:48 | announced | 2024-10-22: Updated project metadata. |
Publication List
Potgieter MG, Nakedi KC, Ambler JM, Nel AJ, Garnett S, Soares NC, Mulder N, Blackburn JM. Proteogenomic Analysis of Mycobacterium smegmatis Using High Resolution Mass Spectrometry. Front Microbiol, 7:427 (2016) [PubMed: 27092112] |
Keyword List
curator keyword: Biological |
submitter keyword: Mycobacterium smegmatis, Mass Spectrometry, Proteogenomics, Genome Annotation, Proteomics |
Contact List
Nicola Mulder |
contact affiliation | Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, South Africa |
contact email | nicola.mulder@uct.ac.za |
lab head | |
Matthys Potgieter |
contact affiliation | Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, South Africa |
contact email | matthys.potgieter@gmail.com |
dataset submitter | |
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers no longer support FTP access natively. To open this FTP link, install a separate FTP client (we recommend FileZilla) and either configure your browser to launch it when you click the link, or start the client yourself and enter this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2016/03/PXD003500 |
PRIDE project URI |
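For scripted access, the FTP address above can also be read without a graphical client. The following is a minimal sketch using only the Python standard library; it assumes the server accepts anonymous logins, and the helper names split_ftp_url and list_dataset_files are illustrative, not part of any PRIDE or ProteomeXchange tooling.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address taken from the dataset record above.
DATASET_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2016/03/PXD003500"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory). Hypothetical helper."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_dataset_files(url=DATASET_URL, timeout=30):
    """Return the names found in the dataset directory.

    Assumption: the server permits anonymous login.
    """
    host, directory = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()          # anonymous login
        ftp.cwd(directory)   # move into the dataset directory
        return ftp.nlst()    # plain list of entry names


if __name__ == "__main__":
    # Requires network access; prints one entry name per line.
    for name in list_dataset_files():
        print(name)
```

Only the URL parsing runs offline; the listing step depends on the server being reachable and on its current directory layout.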
Repository Record List
- PRIDE
- PXD003500
- Label: PRIDE project
- Name: Proteogenomic analysis of Mycobacterium smegmatis using high resolution mass spectrometry