
PXD024676

PXD024676 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: A systematic evaluation of semispecific peptide search parameter enables identification of previously undescribed N-terminal peptides and conserved proteolytic processing in cancer cell lines
Description: Background: Liquid chromatography-tandem mass spectrometry (LC-MS/MS) has become the most commonly used technique in explorative proteomic research. A variety of open-source tools for peptide-spectrum matching have become available. Most analyses of explorative MS data are performed using conventional settings, such as fully specific enzymatic constraints. Here we evaluated the impact of the fragment mass tolerance as well as the enzymatic constraints on the performance of three search engines. Methods: The open-source search engines Myrimatch, xTandem and MSGF+ were evaluated with regard to their suitability for semi- and unspecific searches as well as the importance of accurate fragment mass spectra. Applying the best-suited parameters, we performed a semispecific reanalysis of the published NCI-60 deep proteome data. Results: Semi- and unspecific LC-MS/MS data analyses particularly benefit from accurate fragment mass spectra, while this effect is less pronounced for conventional, fully specific peptide-spectrum matching. Search speed differed notably between the three search engines for semi- and unspecific peptide-spectrum matching. Semispecific reanalysis of NCI-60 proteome data revealed hundreds of previously undescribed N-terminal peptides, including cases of proteolytic processing or likely alternative translation start sites, some of which were present in all cell lines of the reanalyzed panel. Conclusions: Highly accurate MS2 fragment data in combination with modern open-source search algorithms facilitate the confident identification of semispecific peptides from large proteomic datasets. The identification of previously undescribed N-terminal peptides in published studies highlights the potential of future reanalysis and data mining in proteomic datasets. The converted .mzML files as well as the sequence databases for the different biological samples are provided.
The analysis results are provided as compressed folders containing the results of multiple searches for each .mzML file, e.g. using different enzymatic constraints as well as different fragment mass tolerance settings. The NCI-60 raw data from nine representative cancer cell lines were retrieved from https://www.proteomicsdb.org/proteomicsdb/#projects/35/258 and converted to the open mzML format with msconvert, using default settings and an additional "metadataFixer" filter. Here we provide the sequence database file as well as the complete reanalysis as compressed Galaxy history files, which can be downloaded, extracted and imported at https://usegalaxy.eu.
HostingRepository: MassIVE
AnnounceDate: 2021-05-25
AnnouncementXML: Submission_2021-05-25_02:40:18.600.xml
DigitalObjectIdentifier:
ReviewLevel: Non peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Unsupported dataset by repository
PrimarySubmitter: Matthias Fahrner
SpeciesList: scientific name: Homo sapiens; common name: human; NCBI TaxID: 9606; scientific name: Mus musculus; common name: house mouse; NCBI TaxID: 10090; scientific name: Escherichia coli; common name: E. coli; NCBI TaxID: 562
ModificationList: Carbamidomethyl; Acetyl; Oxidation
Instrument: Q Exactive Plus; LTQ Orbitrap Elite
Dataset History
Revision   Datetime              Status         ChangeLog Entry
0          2021-03-11 03:41:30   ID requested
1          2021-05-25 02:40:18   announced
Publication List
no publication
Keyword List
submitter keyword: Endogenous proteolysis, Fragment mass tolerance, NCI-60 reanalysis, Semispecific peptide search, Mass spectrometry
Contact List
Oliver Schilling
contact affiliation: University of Freiburg
contact email: oliver.schilling@mol-med.uni-freiburg.de
lab head
Matthias Fahrner
contact affiliation: University of Freiburg
contact email: matthias.fahrner@mol-med.uni-freiburg.de
dataset submitter
Full Dataset Link List
MassIVE dataset URI
Dataset FTP location
NOTE: Most web browsers have discontinued native support for FTP access. To use this FTP link, install a separate FTP client (we recommend FileZilla) and configure your browser to launch it when you click the link, or open the client directly and enter this address: ftp://massive.ucsd.edu/MSV000087034/
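For scripted access, the same FTP address can also be reached without a graphical client. The following is a minimal sketch using Python's standard `ftplib`. It assumes the server accepts anonymous logins, which this page does not state, and the helper names `split_ftp_url` and `list_dataset` are hypothetical, introduced only for illustration.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address quoted in the note above.
DATASET_URL = "ftp://massive.ucsd.edu/MSV000087034/"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path or "/"


def list_dataset(url=DATASET_URL, timeout=30):
    """Return the entry names in the dataset's top-level directory.

    Assumption: the server allows anonymous login (not confirmed here).
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()      # no credentials given, so ftplib logs in anonymously
        ftp.cwd(path)    # change into the dataset directory
        return ftp.nlst()
```

Calling `list_dataset()` requires network access and returns whatever the server currently lists; individual files could then be fetched with `ftp.retrbinary`.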