PXD027467

PXD027467 is an original dataset announced via ProteomeXchange.

Dataset Summary

Title	ComBat HarmonizR enables the integrated analysis of independently generated proteomic datasets through data harmonization with appropriate handling of missing values
Description	The integration of proteomic datasets, generated by non-cooperating laboratories using different LC-MS/MS setups, can overcome limitations in statistically underpowered sample cohorts but has not been demonstrated to date. In proteomics, differences in sample preservation and preparation strategies, chromatography and mass spectrometry approaches, and the quantification strategy used distort protein abundance distributions in integrated datasets. The removal of these technical batch effects requires setup-specific normalization and strategies that can deal with missing at random (MAR) and missing not at random (MNAR) type values at the same time. Algorithms for batch effect removal, such as the ComBat algorithm, commonly used for other omics types, disregard proteins with MNAR missing values and significantly reduce the informational yield and the effect size for combined datasets. Here, we present a strategy for data harmonization across different tissue preservation techniques, LC-MS/MS instrumentation setups and quantification approaches. To enable batch effect removal without the need for data reduction or error-prone imputation, we developed an extension to the ComBat algorithm, ComBat HarmonizR, that performs data harmonization with appropriate handling of MAR and MNAR missing values by matrix dissection. The ComBat HarmonizR based strategy enables the combined analysis of independently generated proteomic datasets for the first time. Furthermore, we found ComBat HarmonizR to be superior for removing batch effects between different Tandem Mass Tag (TMT)-plexes, compared to commonly used internal reference scaling (iRS). Due to the matrix dissection approach without the need for data imputation, the HarmonizR algorithm can be applied to any type of -omics data while assuring minimal data loss.
HostingRepository	PRIDE
AnnounceDate	2022-05-23
AnnouncementXML	Submission_2022-05-23_14:24:49.892.xml
DigitalObjectIdentifier
ReviewLevel	Peer-reviewed dataset
DatasetOrigin	Original dataset
RepositorySupport	Unsupported dataset by repository
PrimarySubmitter	Hannah Voß
SpeciesList	scientific name: Escherichia coli; NCBI TaxID: 562; scientific name: Mus musculus (Mouse); NCBI TaxID: 10090; scientific name: Homo sapiens (Human); NCBI TaxID: 9606; scientific name: Saccharomyces cerevisiae (Baker's yeast); NCBI TaxID: 4932;
ModificationList	monohydroxylated residue; iodoacetamide derivatized residue
Instrument	Q Exactive; Orbitrap Fusion; TripleTOF 6600
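
The description above mentions batch effect removal "by matrix dissection" without imputation. The following is only an illustrative sketch of that general idea, not the HarmonizR implementation: protein rows are grouped by the set of batches in which they have enough observed values, and each group is then adjusted using only those batches. All function names and the `min_values` threshold are hypothetical, and simple per-batch mean-centering is used as a stand-in for ComBat.

```python
from collections import defaultdict
from statistics import mean


def dissect_by_batch_presence(matrix, batches, min_values=2):
    """Group protein rows by the batches in which they are sufficiently observed.

    matrix: dict mapping protein ID -> list of values (None = missing).
    batches: list of batch labels, one per sample column.
    min_values: assumed minimum number of observed values per batch.
    """
    batch_ids = sorted(set(batches))
    groups = defaultdict(dict)
    for protein, values in matrix.items():
        present = tuple(
            b for b in batch_ids
            if sum(v is not None for v, vb in zip(values, batches) if vb == b)
            >= min_values
        )
        groups[present][protein] = values
    return dict(groups)


def center_batches(values, batches, present):
    """Stand-in for ComBat: shift each present batch onto the row's grand mean."""
    observed = [v for v, b in zip(values, batches) if v is not None and b in present]
    grand_mean = mean(observed)
    batch_mean = {
        b: mean([v for v, vb in zip(values, batches) if vb == b and v is not None])
        for b in present
    }
    # Missing values and values from batches outside `present` are left untouched.
    return [
        v - batch_mean[b] + grand_mean if v is not None and b in present else v
        for v, b in zip(values, batches)
    ]


def harmonize(matrix, batches, min_values=2):
    """Adjust each sub-matrix separately, then reassemble the full matrix."""
    result = {}
    groups = dissect_by_batch_presence(matrix, batches, min_values)
    for present, sub_matrix in groups.items():
        for protein, values in sub_matrix.items():
            if len(present) > 1:
                result[protein] = center_batches(values, batches, present)
            else:
                # Nothing to harmonize when fewer than two batches are usable.
                result[protein] = list(values)
    return result
```

Because each sub-matrix contains only batches with enough observed values for its proteins, no row has to be dropped or imputed before adjustment in this sketch; rows usable in fewer than two batches are passed through unchanged.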

Dataset History

Revision	Datetime	Status	ChangeLog Entry
0	2021-07-21 05:41:50	ID requested
⏵ 1	2022-05-23 14:24:50	announced

Publication List

Dataset with its publication pending

Keyword List

submitter keyword: Data integration, metastudy, Tissue, FFPE, Fresh-Frozen, ComBat, SILAC, TMT, DIA, SWATH, DDA, Missing values, Harmonization

Contact List

Prof. Dr. Hartmut Schlüter
contact affiliation	Section of Mass Spectrometric Proteomics, University Medical Center Hamburg Eppendorf
contact email	h.schluet@uke.de
lab head
Hannah Voß
contact affiliation	University Medical Center Hamburg Eppendorf, Institute of Clinical Chemistry and Laboratory Medicine, Group of Mass Spectrometric Proteomics
contact email	ha.voss@uke.de
dataset submitter

Full Dataset Link List

Dataset FTP location NOTE: Most web browsers have now discontinued native support for FTP access within the browser window. But you can usually install another FTP app (we recommend FileZilla) and configure your browser to launch the external application when you click on this FTP link. Or otherwise, launch an app that supports FTP (like FileZilla) and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2022/05/PXD027467
PRIDE project URI
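
As an alternative to a graphical FTP client, the directory at the address above could be listed from a script. The following is a minimal sketch using Python's standard `ftplib`; it assumes the server accepts anonymous logins, and the helper names `split_ftp_url` and `list_dataset_files` are hypothetical.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP address as given in this dataset record.
FTP_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2022/05/PXD027467"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parsed = urlparse(url)
    if parsed.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parsed.hostname, parsed.path


def list_dataset_files(url=FTP_URL, timeout=30):
    """Return the entry names in the dataset directory.

    Assumes anonymous login is permitted; requires network access.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()      # no arguments: anonymous login
        ftp.cwd(path)    # change into the dataset directory
        return ftp.nlst()
```

Calling `list_dataset_files()` returns the entry names as a list of strings; individual files could then be fetched with `FTP.retrbinary`.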
