
PXD030226

PXD030226 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: Sequence coverage by multiple reads in shotgun proteomics to validate single amino acid variants
Description: Mass spectrometry-based shotgun proteomics is currently based on assigning matches between mass spectra of protein fragments resulting from protease digestion and amino acid sequences predicted from nucleic acid sequences. At the same time, the method lacks reliability in identifying every single amino acid of proteins proteome-wide. We proposed a way to interpret shotgun proteomics results, specifically in data-dependent acquisition mode, as protein sequence coverage by multiple reads, just as is done in nucleic acid sequencing for the calling of single nucleotide variants. Multiple reads for each letter in the proteome could be provided by overlapping distinct peptides, which confirm the presence of certain amino acid residues in the overlapping stretch with a much lower false discovery rate than the conventional 1%. These overlapping distinct peptides were, first, miscleaved tryptic peptides in combination with their properly cleaved counterparts and, second, peptides generated by several proteases with different specificities after digestion of the same specimen and analyzed separately. We illustrated this approach using publicly available multiprotease proteomic datasets and in-house data for HEK-293 cell line subproteomes obtained using trypsin, LysC and GluC proteases. The overall proteome coverage in the example datasets, even with a single read, was 20-30% at 5-8 thousand protein groups identified. Within this percentage, 5-7% of the whole proteome was covered at least two-fold and thus identified with increased reliability. Of 36 single amino acid variants identified in the HEK-293 cell line, seven variants were covered at least two-fold. The sequence coverage by multiple reads may be further increased with greater proteome depth and a larger number of proteases used.
HostingRepository: MassIVE
AnnounceDate: 2022-05-11
AnnouncementXML: Submission_2022-05-11_00:22:57.492.xml
DigitalObjectIdentifier:
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Supported dataset by repository
PrimarySubmitter: Ksenia
SpeciesList: scientific name: Homo sapiens; common name: human; NCBI TaxID: 9606;
ModificationList: CarbamidomethylDTT; Oxidation
Instrument: instrument model
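
The "coverage by multiple reads" idea in the Description can be illustrated with a small sketch. It is not the authors' pipeline: the function names, the toy protein sequence, and the peptide list below are all hypothetical, and peptides are located by exact substring matching, which is a simplifying assumption. For each residue it counts how many distinct identified peptides cover it, then reports the fraction of the sequence covered at least once and at least two-fold.

```python
def coverage_depth(protein, peptides):
    """Return, for each residue, the number of distinct peptides covering it.

    Assumption: a peptide "covers" a residue if it matches the protein
    sequence exactly at a position spanning that residue.
    """
    depth = [0] * len(protein)
    for pep in set(peptides):  # count each distinct peptide only once
        if not pep:
            continue
        covered = set()
        start = protein.find(pep)
        while start != -1:
            covered.update(range(start, start + len(pep)))
            start = protein.find(pep, start + 1)
        for i in covered:
            depth[i] += 1
    return depth


def covered_fraction(depth, min_reads=1):
    """Fraction of residues covered by at least `min_reads` distinct peptides."""
    if not depth:
        return 0.0
    return sum(d >= min_reads for d in depth) / len(depth)


# Hypothetical toy example: a properly cleaved tryptic peptide, its
# miscleaved counterpart, and one further peptide.
protein = "MKTAYIAKQRQISFVK"
peptides = ["TAYIAK", "TAYIAKQR", "QISFVK"]

depth = coverage_depth(protein, peptides)
print(depth)
print(covered_fraction(depth, min_reads=1))  # covered by at least one read
print(covered_fraction(depth, min_reads=2))  # covered at least two-fold
```

In this toy case the residues shared by the properly cleaved and miscleaved peptides are the only ones with a depth of two, which mirrors the first source of overlapping reads named in the Description. Peptides from a second protease would simply be added to the same peptide list.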
Dataset History
Revision  Datetime             Status        ChangeLog Entry
0         2021-12-07 09:18:18  ID requested
1         2022-05-11 00:22:58  announced
Publication List
Levitsky LI, Kuznetsova KG, Kliuchnikova AA, Ilina IY, Goncharov AO, Lobas AA, Ivanov MV, Lazarev VN, Ziganshin RH, Gorshkov MV, Moshkovskii SA. Validating Amino Acid Variants in Proteogenomics Using Sequence Coverage by Multiple Reads. J Proteome Res, 21(6):1438-1448 (2022)
Keyword List
submitter keyword: shotgun proteomics, proteome coverage, multi protease analysis
Contact List
Sergei Moshkovskii
contact affiliation: Research and Clinical Center of Physical-Chemical Medicine
contact email: smosh@mail.ru
lab head
Ksenia
contact affiliation: Research and Clinical Center of Physical-Chemical Medicine
contact email: kuznetsova.ks@gmail.com
dataset submitter
Full Dataset Link List
MassIVE dataset URI
Dataset FTP location
NOTE: Most web browsers no longer support FTP access natively. To download the files, install an FTP client (we recommend FileZilla) and either configure your browser to launch it when you click the FTP link, or open the client directly and connect to this address: ftp://massive.ucsd.edu/MSV000088536/
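
As an alternative to a graphical FTP client, the dataset directory can be listed from a script. The following is a minimal sketch using Python's standard `ftplib`. It assumes the server accepts anonymous logins, which this page does not state, and the helper names `split_ftp_url` and `list_dataset` are hypothetical.

```python
from ftplib import FTP
from urllib.parse import urlparse


def split_ftp_url(url):
    """Split an ftp:// URL into (host, path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path or "/"


def list_dataset(url, timeout=30):
    """Return the entry names in the directory the URL points to.

    Assumption: the server allows anonymous login.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # no arguments means an anonymous login
        ftp.cwd(path)  # change to the dataset directory
        return ftp.nlst()
```

Calling `list_dataset("ftp://massive.ucsd.edu/MSV000088536/")` would attempt the connection and return the directory entries. Individual files could then be fetched with `FTP.retrbinary`.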