As tools for quantitative label-free mass spectrometry (MS) rapidly develop, no consensus about best practices is yet apparent. In the work described here, we compared five popular statistical methods for detecting differential protein expression from quantitative MS data. We used both controlled experiments, in which specific proteins used as standards have known quantitative differences, and ‘real’ experiments, in which differences in protein abundance are not known a priori. Our results suggest that data-driven reproducibility optimization can consistently produce reliable differential expression rankings for label-free proteome tools and is straightforward to apply.