PXD024676-1
PXD024676 is an original dataset announced via ProteomeXchange.

Dataset Summary
| Title | A systematic evaluation of semispecific peptide search parameter enables identification of previously undescribed N-terminal peptides and conserved proteolytic processing in cancer cell lines | 
| Description | Background: Liquid chromatography-tandem mass spectrometry (LC-MS/MS) has become the most commonly used technique in explorative proteomic research. A variety of open-source tools for peptide-spectrum matching have become available. Most analyses of explorative MS data are performed using conventional settings, such as fully specific enzymatic constraints. Here we evaluated the impact of the fragment mass tolerance as well as the enzymatic constraints on the performance of three search engines. Methods: The open-source search engines MyriMatch, X!Tandem and MS-GF+ were evaluated with regard to their suitability for semi- and unspecific searches as well as the importance of accurate fragment mass spectra. Applying the most suitable parameters, we performed a semispecific reanalysis of the published NCI-60 deep proteome data. Results: Semi- and unspecific LC-MS/MS data analyses particularly benefit from accurate fragment mass spectra, while this effect is less pronounced for conventional, fully specific peptide-spectrum matching. Search speed differed notably between the three search engines with regard to semi- and non-specific peptide-spectrum matching. Semispecific reanalysis of NCI-60 proteome data revealed hundreds of previously undescribed N-terminal peptides, including cases of proteolytic processing or likely alternative translation start sites, some of which were ubiquitously present in all cell lines of the reanalyzed panel. Conclusions: Highly accurate MS2 fragment data in combination with modern open-source search algorithms facilitate the confident identification of semispecific peptides from large proteomic datasets. The identification of previously undescribed N-terminal peptides in published studies highlights the potential of future reanalysis and data mining in proteomic datasets. The converted .mzML files as well as the sequence databases for the different biological samples are provided. The analysis results are provided as compressed folders containing the results of multiple searches for each .mzML file, e.g. using different enzymatic constraints as well as different fragment mass tolerance settings. The NCI-60 raw data from nine representative cancer cell lines were retrieved from https://www.proteomicsdb.org/proteomicsdb/#projects/35/258 and converted to the open mzML format with msconvert using default settings and an additional "metadataFixer" filter. Here we provide the sequence database file as well as the complete reanalysis as compressed Galaxy history files, which can be downloaded, extracted and imported on https://usegalaxy.eu. | 
| HostingRepository | MassIVE | 
| AnnounceDate | 2021-05-25 | 
| AnnouncementXML | Submission_2021-05-25_02:40:18.600.xml | 
| DigitalObjectIdentifier | |
| ReviewLevel | Non peer-reviewed dataset | 
| DatasetOrigin | Original dataset | 
| RepositorySupport | Unsupported dataset by repository | 
| PrimarySubmitter | Matthias Fahrner | 
| SpeciesList | scientific name: Homo sapiens; common name: human; NCBI TaxID: 9606; scientific name: Mus musculus; common name: house mouse; NCBI TaxID: 10090; scientific name: Escherichia coli; common name: E. coli; NCBI TaxID: 562; | 
| ModificationList | Carbamidomethyl; Acetyl; Oxidation | 
| Instrument | Q Exactive Plus; LTQ Orbitrap Elite | 
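
The description contrasts fully specific and semispecific enzymatic constraints. As a rough illustration of what that difference means for the search space, the sketch below enumerates both peptide sets for a toy sequence. It is not taken from the dataset or from any of the search engines named above: the trypsin rule (cleave after K or R, but not before P), the absence of missed cleavages, the minimum length, and the function names are all assumptions made for this example.

```python
import re

def cleavage_sites(protein):
    """Positions where the assumed trypsin rule cuts, plus both protein ends."""
    sites = {0, len(protein)}
    # Assumed rule: cut after K or R unless the next residue is P.
    for match in re.finditer(r"[KR](?!P)", protein):
        sites.add(match.end())
    return sorted(sites)

def fully_specific(protein, min_len=2):
    """Peptides whose N- and C-termini both lie on cleavage sites."""
    sites = cleavage_sites(protein)
    return [protein[a:b] for a, b in zip(sites, sites[1:]) if b - a >= min_len]

def semi_specific(protein, min_len=2):
    """Peptides with at least one terminus on a cleavage site.

    Simplification: no missed cleavages, so every peptide lies inside
    a single fully specific fragment.
    """
    sites = cleavage_sites(protein)
    peptides = set()
    for a, b in zip(sites, sites[1:]):
        # Specific C-terminus, free N-terminus.
        for start in range(a, b):
            if b - start >= min_len:
                peptides.add(protein[start:b])
        # Specific N-terminus, free C-terminus.
        for end in range(a + 1, b + 1):
            if end - a >= min_len:
                peptides.add(protein[a:end])
    return sorted(peptides)

toy = "MAKTPEPR"  # hypothetical sequence, not from the dataset
print(fully_specific(toy))
print(semi_specific(toy))
```

Even for this eight-residue toy sequence the semispecific set is several times larger than the fully specific one, which is consistent with the description's remarks that search speed and fragment mass accuracy matter more for semispecific searches.
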
Dataset History
| Revision | Datetime | Status | ChangeLog Entry | 
|---|---|---|---|
| 0 | 2021-03-11 03:41:30 | ID requested | |
| ⏵ 1 | 2021-05-25 02:40:18 | announced | |
Publication List 
| no publication | 
Keyword List 
| submitter keyword: Endogenous proteolysis, Fragment mass tolerance, NCI-60 reanalysis, Semispecific peptide search, Mass spectrometry | 
Contact List 
| Oliver Schilling | |
|---|---|
| contact affiliation | University of Freiburg | 
| contact email | oliver.schilling@mol-med.uni-freiburg.de | 
| lab head | |
| Matthias Fahrner | |
| contact affiliation | University of Freiburg | 
| contact email | matthias.fahrner@mol-med.uni-freiburg.de | 
| dataset submitter | |
Full Dataset Link List 
| MassIVE dataset URI | 
| Dataset FTP location: ftp://massive.ucsd.edu/MSV000087034/ (NOTE: most web browsers have discontinued native FTP support; open this address in an FTP client such as FileZilla, or configure your browser to launch one when you click an FTP link.) | 
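
Since browsers no longer open FTP links, the FTP address above can also be reached from a script. The following is a minimal sketch using Python's standard `ftplib`. It assumes the server accepts anonymous logins and that the dataset directory can be listed directly; the helper names `split_ftp_url` and `list_dataset` are made up for this example.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address as given in the dataset link list.
DATASET_URL = "ftp://massive.ucsd.edu/MSV000087034/"

def split_ftp_url(url):
    """Split an ftp:// URL into (host, remote directory)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path or "/"

def list_dataset(url=DATASET_URL, timeout=60):
    """Return the entry names in the dataset's top-level directory.

    Assumption: anonymous login is permitted by the server.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()      # no credentials means anonymous login
        ftp.cwd(path)
        return ftp.nlst()
```

Calling `list_dataset()` with a working network connection should return the names of the top-level entries; individual files could then be fetched with `FTP.retrbinary`.
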



