Updated project metadata.
Updated publication reference for PubMed record(s): 29631402.
A natural way to benchmark an analytical experimental setup is to analyze samples of known content and assess how accurately that content can be inferred from the data. In shotgun proteomics, an inherent difficulty in interpreting data is that the measured analytes are peptides, not the proteins themselves. Because some proteins share proteolytic peptides, more than one set of proteins may explain a given set of detected peptides. Hence, there is a need for mechanisms that infer proteins from a list of detected peptides. Today's commercially available samples of known content do not expose these complications in protein inference, because their proteins are deliberately selected to produce tryptic peptides that are unique to a single protein. A realistic benchmark of protein inference procedures therefore requires samples of known content in which the present proteins share peptides with known absent proteins. Here, we present such a standard, based on E. coli-expressed protein fragments.
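
The ambiguity described above can be illustrated with a small sketch. It is not the inference procedure used with this standard. It assumes a simple parsimony criterion (the smallest protein sets that explain all detected peptides), and the function name, peptide sequences, and protein identifiers are all hypothetical.

```python
from itertools import combinations


def minimal_explanations(peptide_to_proteins, observed):
    """Return every smallest protein set that explains all observed peptides.

    Assumption: a protein set "explains" a peptide if it contains at least
    one protein that can give rise to that peptide. Brute force, so only
    suitable for toy inputs.
    """
    # Candidate proteins are those linked to at least one observed peptide.
    proteins = sorted({p for pep in observed for p in peptide_to_proteins[pep]})
    for size in range(1, len(proteins) + 1):
        hits = [
            set(combo)
            for combo in combinations(proteins, size)
            if all(set(combo) & peptide_to_proteins[pep] for pep in observed)
        ]
        if hits:
            return hits
    return []


# Hypothetical toy data: P1 and P2 share both of their peptides,
# while P3 has a unique peptide.
peptide_to_proteins = {
    "PEPTIDEA": {"P1", "P2"},
    "PEPTIDEB": {"P1", "P2"},
    "PEPTIDEC": {"P3"},
}
observed = ["PEPTIDEA", "PEPTIDEB", "PEPTIDEC"]

explanations = minimal_explanations(peptide_to_proteins, observed)
for protein_set in explanations:
    print(sorted(protein_set))
```

In this toy case two protein sets explain the peptides equally well, one containing P1 and one containing P2, so the peptide evidence alone cannot tell which of the two proteins was in the sample. A standard in which one of them is known to be present and the other known to be absent makes it possible to score how an inference method resolves such cases.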