PXD030226
PXD030226 is an original dataset announced via ProteomeXchange.
Dataset Summary
Title | Sequence coverage by multiple reads in shotgun proteomics to validate single amino acid variants |
Description | Mass spectrometry-based shotgun proteomics currently relies on matching mass spectra of protein fragments produced by protease digestion to amino acid sequences predicted from nucleic acid sequences. At the same time, the method cannot reliably identify every single amino acid of proteins proteome-wide. We proposed a way to interpret shotgun proteomics results, specifically in data-dependent acquisition mode, as protein sequence coverage by multiple reads, just as is done in nucleic acid sequencing for calling single nucleotide variants. Multiple reads for each letter in the proteome could be provided by overlapping distinct peptides, which confirm the presence of certain amino acid residues in the overlapping stretch with a much lower false discovery rate than the conventional 1%. These overlapping distinct peptides were, first, miscleaved tryptic peptides in combination with their properly cleaved counterparts and, second, peptides generated by digesting the same specimen with several proteases of different specificities and analyzed separately. We illustrated this approach using publicly available multiprotease proteomic datasets and in-house data for HEK-293 cell line subproteomes obtained using trypsin, LysC and GluC proteases. Overall proteome coverage in the example datasets, counting even a single read, was 20-30% at 5-8 thousand protein groups identified. Within this fraction, 5-7% of the whole proteome was covered at least two-fold and, thus, identified with increased reliability. Of 36 single amino acid variants identified in the HEK-293 cell line, seven were covered at least two-fold. Sequence coverage by multiple reads may be increased further with greater proteome depth and a larger number of proteases used. |
HostingRepository | MassIVE |
AnnounceDate | 2022-05-11 |
AnnouncementXML | Submission_2022-05-11_00:22:57.492.xml |
DigitalObjectIdentifier | |
ReviewLevel | Peer-reviewed dataset |
DatasetOrigin | Original dataset |
RepositorySupport | Supported dataset by repository |
PrimarySubmitter | Ksenia |
SpeciesList | scientific name: Homo sapiens; common name: human; NCBI TaxID: 9606; |
ModificationList | CarbamidomethylDTT; Oxidation |
Instrument | instrument model |
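The Description above treats each distinct identified peptide as one "read" over the residues it spans. The following is a minimal sketch of that bookkeeping, not the authors' implementation: the function names, the toy protein sequence, and the peptide list are all hypothetical and chosen only for illustration, and peptides are placed by exact substring matching rather than by a search engine.

```python
def coverage_depth(protein, peptides):
    """Return, for each residue, how many distinct peptides cover it."""
    depth = [0] * len(protein)
    for pep in set(peptides):  # each distinct peptide counts as one read
        if not pep:
            continue
        covered = set()
        start = protein.find(pep)
        while start != -1:  # collect every occurrence of the peptide
            covered.update(range(start, start + len(pep)))
            start = protein.find(pep, start + 1)
        for i in covered:
            depth[i] += 1
    return depth


def fraction_covered(depth, min_reads):
    """Fraction of residues covered by at least min_reads distinct peptides."""
    if not depth:
        return 0.0
    return sum(d >= min_reads for d in depth) / len(depth)


# Hypothetical toy data: a miscleaved tryptic peptide, its properly
# cleaved counterpart, and one unrelated peptide.
protein = "MKTAYIAKQRQISFVK"
peptides = ["MKTAYIAK", "TAYIAK", "QISFVK"]

depth = coverage_depth(protein, peptides)
print(depth)                       # [1, 1, 2, 2, 2, 2, 2, 2, 0, 0, 1, 1, 1, 1, 1, 1]
print(fraction_covered(depth, 1))  # 0.875
print(fraction_covered(depth, 2))  # 0.375
```

In this toy case the residues in the stretch `TAYIAK` are covered two-fold because the miscleaved and properly cleaved peptides overlap, which is the situation the Description refers to as increased reliability.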
Dataset History
Revision | Datetime | Status | ChangeLog Entry |
---|---|---|---|
0 | 2021-12-07 09:18:18 | ID requested | |
1 | 2022-05-11 00:22:58 | announced | |
Publication List
Levitsky LI, Kuznetsova KG, Kliuchnikova AA, Ilina IY, Goncharov AO, Lobas AA, Ivanov MV, Lazarev VN, Ziganshin RH, Gorshkov MV, Moshkovskii SA. Validating Amino Acid Variants in Proteogenomics Using Sequence Coverage by Multiple Reads. J Proteome Res, 21(6):1438-1448 (2022) |
Keyword List
submitter keyword: shotgun proteomics, proteome coverage, multi protease analysis |
Contact List
Sergei Moshkovskii | |
---|---|
contact affiliation | Research and Clinical Center of Physical-Chemical Medicine |
contact email | smosh@mail.ru |
lab head | |
Ksenia | |
contact affiliation | Research and Clinical Center of Physical-Chemical Medicine |
contact email | kuznetsova.ks@gmail.com |
dataset submitter | |
Full Dataset Link List
MassIVE dataset URI |
Dataset FTP location: ftp://massive.ucsd.edu/MSV000088536/ (NOTE: most web browsers no longer support FTP natively; open this address in an FTP client such as FileZilla) |
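As an alternative to a graphical FTP client, the dataset directory can be listed from a script. The sketch below uses Python's standard `ftplib` and assumes the server accepts anonymous logins, which this record does not state; the helper names `split_ftp_url` and `list_dataset` are hypothetical.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address given in the Full Dataset Link List.
DATASET_URL = "ftp://massive.ucsd.edu/MSV000088536/"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, remote directory)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path or "/"


def list_dataset(url=DATASET_URL, timeout=60):
    """Return the names found in the dataset's top-level directory."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # assumption: anonymous access is permitted
        ftp.cwd(path)
        return ftp.nlst()
```

Calling `list_dataset()` opens a network connection, so it is left out of the block itself; individual files could then be fetched with `FTP.retrbinary`.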