Protein quantification via label-free mass spectrometry (MS) has become an increasingly popular method for predicting genome-wide absolute protein abundances. A known caveat of this approach, however, is its poor technical reproducibility, that is, how consistent predictions are when the same sample is measured repeatedly. Here, we measured proteomics data for Saccharomyces cerevisiae with both biological and inter-batch technical triplicates, to analyze both the accuracy and the precision of protein quantification via MS. Moreover, we analyzed how these metrics vary when applying different methods for converting MS intensities to absolute protein abundances. We demonstrate that our simple normalization and rescaling approach can perform as accurately as, yet more precisely than, methods that rely on external standards. Additionally, we show that inter-batch reproducibility is worse than biological reproducibility for all evaluated methods. These results offer a new benchmark for assessing MS data quality for protein quantification, while also underscoring current limitations of this approach.
- Fold Change
- median absolute Fold Change
- intensity-Based Absolute Quantification
- Mass Spectrometry
- Principal Component
- coefficient of determination
- Stable Isotope Labeling by Amino acids in Cell culture
- Total Protein Approach
- Universal Proteomics Standard
Mass spectrometry (MS) is currently the main technology used for predicting genome-wide protein copy numbers per cell, thanks to its high sensitivity, specificity, and multiplexing capacity. Among the different MS technologies available, quantitative label-free methods are becoming increasingly popular, due to their relative ease of use and cost-effectiveness, particularly when compared to more expensive and laborious methods, such as isotope-labeled peptide-based approaches. In quantitative label-free methods, normalization of the raw data is a critical step when predicting absolute protein abundance [3-6]. Two fundamental metrics for assessing the quality of these predictions are: (i) accuracy, that is, how far the prediction is from the true value, and (ii) precision, that is, how variable different predictions are when the same measurement is repeated (also referred to as reproducibility).
Several factors affect the precision and accuracy of absolute protein abundance predictions generated via MS: (i) the intrinsic biological nature of the proteome, with the dynamic range of intracellular protein abundance spanning several orders of magnitude; (ii) the physicochemical nature of amino acids: as peptide molecules can have different ionization properties, two similarly abundant molecules can be detected by the MS with different efficiencies; and (iii) the differences in MS instrumentation (e.g., Orbitraps versus time-of-flight instruments), chromatography and experimental protocols. Together, these factors yield only modest agreement between MS-based predictions and the true protein concentration values [7-9], and are highly likely to contribute to the large variability observed across different proteomics studies [7, 10, 11].
Studies that compute absolute protein abundance commonly address biological reproducibility by running biological replicates in the same MS batch [7, 12, 13]. However, how the MS instrument itself impacts protein abundance, that is, technical reproducibility, has been less studied. This can be determined by running the same biological sample in the same batch, or in separate batches, with the latter often referred to as “the batch effect.” As different normalization/scaling methods can be used to predict protein abundance from raw MS intensities, it is interesting to study how these methods propagate the inter-batch technical variability into uncertainty in the final protein abundance predictions. In this study, we analyze both the accuracy and the technical precision of intensity-based absolute quantification of a proteomics dataset from S. cerevisiae and show how prediction quality can be improved using different normalization/scaling methods. In particular, we show that a simple rescaling method performs as accurately as, but more precisely than, alternatives that rely on the use of costly external standards.
We generated a proteomics dataset using the S. cerevisiae strain CEN.PK113-7D, containing both biological triplicate and technical replicate samples. Samples were obtained from aerobic glucose-limited chemostats at a dilution rate of 0.1/h and were mixed with an internal standard, using stable isotope labeling by amino acids in cell culture (SILAC). Here, a lysine auxotrophic strain was grown in medium supplemented with double-labelled heavy 15N, 13C-lysine (Cambridge Isotope Laboratories Inc.); this labelled sample was then mixed in a 1:1 ratio with each of the other non-labelled (“light”) samples. The internal standard was also mixed with an external standard of known concentrations, in a ratio of 6:1.1. The external standard used here was the Proteomics Dynamic Range Standard Set (UPS2) mix (Merck), consisting of 48 human proteins in a dynamic concentration range from 500 amol to 50 pmol. All mixed samples were stored at –80°C until their analysis, at which point they were all processed in the same way; this step is crucial to ensure that the observed variability originates from either the biological source or the MS equipment, and not from other sources such as differences in sample preparation.
For proteome identification, samples were digested with 1:50 LysC overnight at room temperature. Peptides were separated on an Ultimate 3000 RSLCnano system (Dionex), eluted to a Q Exactive Plus (Thermo Fisher Scientific) tandem mass spectrometer and identified with the MaxQuant 188.8.131.52 software package, maintaining the peptide-spectrum match and the protein false discovery rate below 1% using a target-decoy approach. Each sample was measured six times: in three separate batches of the MS instrument (with a time difference of 12 and 30 days), and twice within each batch, using Top5 and Top10 data-dependent acquisition strategies, wherein only the five or ten highest-intensity peptide peaks per MS full scan, respectively, were selected for MS/MS analysis (additional details on the experimental setup can be found in the Supplementary Material).
Using the generated dataset, we evaluated the accuracy and precision of the abundances predicted by the four different methods. To evaluate accuracy, we computed the differences as fold changes between the predicted abundances of the external standard proteins detected by the MS (n = 31/48) and the known values in the UPS2 mix. Here, Methods 1, 2 and 4 performed similarly, whereas Method 3 had a significantly higher error (Figure 1A, Figure S3). Specifically, more than 50% of protein abundance predictions from Method 3 deviated from the true value by more than two-fold. We further evaluated the accuracy of each method by testing protein predictions in the ribosome, a protein complex whose subunits are present in equal stoichiometry. Of these subunits, 62 out of 79 were detected in the internal standard, after accounting for paralogs, and each was compared to the median subunit abundance, with the expectation that each ribosomal subunit has the same abundance as all others in the complex. Once again, Methods 1, 2 and 4 performed similarly, and outperformed Method 3 (Figure 1B, Figures S4-S5), which we found to be true in the abundance predictions of both the internal standard and the biological triplicates (Figures S6-S7).
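The two accuracy checks described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' scripts: the function names, the protein identifiers and the abundance values are hypothetical, and the use of log2 fold changes is an assumption made for convenience.

```python
import math
from statistics import median

def log2_fold_changes(predicted, known):
    """log2(predicted / known) for every protein present in both dicts."""
    return {
        protein: math.log2(predicted[protein] / known[protein])
        for protein in predicted
        if protein in known and predicted[protein] > 0 and known[protein] > 0
    }

def fraction_within_fold(log2_fcs, fold=2.0):
    """Share of proteins whose prediction lies within `fold` of the reference."""
    limit = math.log2(fold)
    values = list(log2_fcs.values())
    return sum(abs(v) <= limit for v in values) / len(values)

def stoichiometry_fold_changes(subunit_abundances):
    """log2 fold change of each subunit relative to the median of the complex."""
    reference = median(subunit_abundances.values())
    return {
        name: math.log2(value / reference)
        for name, value in subunit_abundances.items()
    }

# Hypothetical external-standard values (fmol), for illustration only.
known = {"std_A": 50.0, "std_B": 500.0, "std_C": 5000.0}
predicted = {"std_A": 120.0, "std_B": 450.0, "std_C": 5200.0}
fcs = log2_fold_changes(predicted, known)
within_two_fold = fraction_within_fold(fcs)

# Hypothetical subunit abundances of one complex (equal stoichiometry expected).
subunits = {"sub_1": 10.0, "sub_2": 20.0, "sub_3": 40.0}
subunit_fcs = stoichiometry_fold_changes(subunits)
```

With a perfectly accurate method, all values in `fcs` and `subunit_fcs` would be zero; the spread around zero is what the accuracy comparison between methods summarizes.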
We next proceeded to evaluate precision, by comparing protein predictions between all three batches both for the internal standard and the biological triplicates (Figures S8-S9). A cumulative distribution of all fold changes within the internal standard (Figure 1C) showed that Methods 3 and 4 significantly outperformed Methods 1 and 2 (all P-values <0.001). In particular, by using Methods 3 or 4, protein abundance varied by less than two-fold for nearly 75% of all proteins, whereas in the case of Method 1 this was under 60%. Similar observations can also be made when looking at the biological triplicates (Figure S10). Higher inter-batch variability of Methods 1 and 2 was observed both for lowly and highly abundant proteins but especially for proteins below the detection range of the external standard curve (Figures S11-S12), and can be explained by the bias introduced by the external standard (Figures S13-S14), which Methods 3 and 4 did not use.
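As a rough illustration of this precision analysis, the sketch below computes one inter-batch fold change per protein and the cumulative fraction of proteins below a set of thresholds. It assumes that the fold change of a protein is the ratio between its highest and lowest prediction across batches; how the comparison between the three batches was actually summarized is not specified here, and all names and values are hypothetical.

```python
def max_fold_change(values):
    """Largest ratio between any two positive replicate values of one protein."""
    positive = [v for v in values if v > 0]
    if len(positive) < 2:
        return None  # not enough detections to compare batches
    return max(positive) / min(positive)

def cumulative_fraction(fold_changes, thresholds):
    """Empirical cumulative distribution: share of proteins at or below each threshold."""
    n = len(fold_changes)
    return {t: sum(fc <= t for fc in fold_changes) / n for t in thresholds}

# Hypothetical abundances of four proteins measured in three batches.
batches = {
    "prot_A": [100.0, 150.0, 120.0],
    "prot_B": [10.0, 45.0, 20.0],
    "prot_C": [300.0, 310.0, 590.0],
    "prot_D": [5.0, 6.0, 5.5],
}

fold_changes = [
    fc for fc in (max_fold_change(v) for v in batches.values()) if fc is not None
]
cdf = cumulative_fraction(fold_changes, thresholds=[1.5, 2.0, 5.0])
```

Reading the resulting distribution at a threshold of 2.0 gives the kind of "fraction of proteins varying by less than two-fold" figure quoted for each method.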
Taking into consideration the results of both the accuracy and the precision tests that we performed (Figure 1), we conclude that the best-performing method is Method 4, which omits the use of an external standard and instead rescales normalized MS intensities to equal the injected sample mass. Methods 1 and 2 perform similarly to Method 4 in terms of accuracy, but are not as precise, whereas Method 3 is as precise as Method 4, but not as accurate. Therefore, considering that iBAQ involves significant additional costs to users (including the purchase of the external standard and additional MS running time) yet does not yield better performance, we propose that the rescaling of normalized MS intensities can be used instead. This method can also be used as a benchmark for assessing the predictive power of alternative approaches for computing absolute protein abundances from MS data.
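The rescaling idea can be illustrated with a minimal sketch. It assumes that each normalized intensity is proportional to the molar amount of its protein, so that a single scaling factor can be chosen to make the total protein mass equal the injected sample mass; the normalization step itself is not reproduced, and the function name, molecular weights and values are hypothetical.

```python
def rescale_to_injected_mass(normalized_intensity, molecular_weight, injected_mass_g):
    """Convert normalized intensities to molar amounts (mol).

    Assumption: normalized intensity is proportional to molar amount, so one
    factor is chosen such that sum(amount * molecular weight) equals the
    injected sample mass.
    """
    total = sum(
        normalized_intensity[p] * molecular_weight[p] for p in normalized_intensity
    )
    scale = injected_mass_g / total
    return {p: scale * normalized_intensity[p] for p in normalized_intensity}

# Hypothetical example: two proteins and 1 microgram of injected protein.
intensity = {"prot_A": 2.0, "prot_B": 1.0}
mol_weight = {"prot_A": 50_000.0, "prot_B": 100_000.0}  # g/mol
amounts = rescale_to_injected_mass(intensity, mol_weight, injected_mass_g=1e-6)

# The rescaled amounts account for the whole injected mass.
recovered_mass = sum(amounts[p] * mol_weight[p] for p in amounts)
```

Because the scaling factor depends only on the measured intensities and the injected mass, no external standard curve is needed, which is the property the comparison above relies on.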
It is worth noting that, for all methods, the variability between biological replicates in the same MS batch is considerably lower than the variability between batches of the same biological sample. We exemplify this with the biological and batch variability from Method 1 predictions (Figure 2A and B, respectively), and with a principal component analysis of the same predictions (Figure 2C), wherein samples cluster based on batches, not biological replicates. Although inter-batch variability is lowest when using Method 4 (Figure S8, Table S1), coming much closer to biological variability levels, ∼25% of predictions in the internal standard still vary by more than two-fold. This remaining variability is most likely due to the presence of stochastic and non-linear effects in shotgun proteomics [22, 23]. For instance, for each protein there were on average close to five peptides that differed between batches (Table S2, Figure S15), due to differences in the selection of the most intense (top N) precursor ions, which ultimately affects protein abundance predictions. Researchers working with computational methods that rely on absolute protein abundances should therefore be aware of these limitations and interpret results accordingly.
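One way to quantify how many peptides differ between batches for a given protein is sketched below. It assumes that "different" means present in one batch but not in the other (the symmetric difference of the identified peptide sets), averaged over all pairs of batches; the exact counting behind Table S2 may differ, and the peptide sequences shown are hypothetical.

```python
from itertools import combinations
from statistics import mean

def differing_peptides(peptides_by_batch):
    """Mean number of peptides not shared between two batches, over all batch pairs."""
    pairs = combinations(peptides_by_batch.values(), 2)
    return mean(len(a ^ b) for a, b in pairs)

# Hypothetical peptide identifications for one protein in three batches.
protein_peptides = {
    "batch_1": {"AAK", "LLEK", "VTDK"},
    "batch_2": {"LLEK", "VTDK", "GGSK"},
    "batch_3": {"AAK", "LLEK", "VTDK"},
}
avg_diff = differing_peptides(protein_peptides)
```

Averaging this value over all proteins would give a single summary of how much the set of quantified peptides changes from batch to batch.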
In conclusion, we present a comprehensive proteomics dataset of yeast, designed for assessment of absolute protein quantification for different biological replicates and batches of samples. Furthermore, we show that a simple method of normalization and rescaling can yield superior results over more complicated and expensive methods such as iBAQ. As protein intensity is used as input, this method can be used both on pre-existing and future datasets regardless of how intensity values were generated, including labeled or unlabeled methods. We therefore expect both our dataset and method to be of benefit to users when assessing accuracy and precision of MS-based approaches in current and future proteomics studies.
ACKNOWLEDGMENTS
The authors would like to thank Dr. Christine Räisänen and Gang Li for reviewing the manuscript, and the anonymous reviewers who contributed valuable feedback. This project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreements Nos. 686070 and 668997, the Novo Nordisk Foundation and the Knut and Alice Wallenberg Foundation. BJS and PJL acknowledge financial support from CONICYT (grant #6222/2014) and the Estonian Research Council (grant PUT1488P), respectively.
AUTHOR CONTRIBUTIONS
Conceptualization: B.J.S., P.J.L., J.N.; Data generation: P.J.L., S.K.; Data Analysis: B.J.S., K.C., S.K., R.Y., I.D., A.Z.; Project Supervision: J.N.; Writing – Original Draft: B.J.S.; Writing – Review and Editing: B.J.S., P.J.L., K.C., S.K., R.Y., I.D., A.Z., J.N.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
DATA AVAILABILITY STATEMENT
All MS data used in this study have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD011725. Output tables from MaxQuant, together with all necessary scripts to reproduce the results presented in this study, are available at https://github.com/SysBioChalmers/reproduce and have been archived in Zenodo.
REFERENCES
- 1. (2013). The coming age of complete, accurate, and ubiquitous proteomes. Molecular Cell, 49, 583–590.
- 2. (2017). Standardization approaches in absolute quantitative proteomics with mass spectrometry. Mass Spectrometry Reviews, 37, 715–737.
- 3. (2011). Global quantification of mammalian gene expression control. Nature, 473, 337–342.
- 4. (2012). Comparative analysis of different label-free mass spectrometry based protein abundance estimates and their correlation with RNA-Seq gene expression data. Journal of Proteome Research, 11, 2261–2271.
- 5. (2012). Extensive quantitative remodeling of the proteome between normal colon tissue and adenocarcinoma. Molecular Systems Biology, 8, 611.
- 6. (2014). A “Proteomic Ruler” for protein copy number and concentration estimation without spike-in standards. Molecular & Cellular Proteomics, 13, 3497–3506.
- 7. (2012). Comparison and applications of label-free absolute proteome quantification methods on Escherichia coli. Journal of Proteomics, 75, 5437–5448.
- 8. (2013). An automated pipeline for high-throughput label-free quantitative proteomics. Journal of Proteome Research, 12, 1628–1644.
- 9. (2017). Absolute protein quantification by mass spectrometry: Not as simple as advertised. Analytical Chemistry, 89, 7406–7415.
- 10. (2016). Direct and absolute quantification of over 1800 yeast proteins via selected reaction monitoring. Molecular & Cellular Proteomics, 15, 1309–1322.
- 11. (2018). Unification of protein abundance datasets yields a quantitative Saccharomyces cerevisiae proteome. Cell Systems, 6, 192–205.e3.
- 12. (2016). The cytotoxic T cell proteome and its shaping by mammalian target of rapamycin. Nature Immunology, 17, 104–112.
- 13. (2017). Absolute quantification of protein and mRNA abundances demonstrate variability in gene-specific translation efficiency in yeast. Cell Systems, 4, 495–504.e5.
- 14. (2015). MS1-based label-free proteomics using a quadrupole orbitrap mass spectrometer. Journal of Proteome Research, 14, 1979–1986.
- 15. (2018). Cost-effective generation of precise label-free quantitative proteomes in high-throughput by microLC and data-independent acquisition. Scientific Reports, 8, 1–10.
- 16. (2014). aLFQ: An R-package for estimating absolute protein quantities from label-free LC-MS/MS proteomics data. Bioinformatics, 30, 2511–2513.
- 17. (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology, 26, 1367–1372.
- 18. (2014). Multi-enzyme digestion FASP and the ‘Total Protein Approach’-based absolute quantification of the Escherichia coli proteome. Journal of Proteomics, 109, 322–331.
- 19. (2010). Super-SILAC mix for quantitative proteomics of human tumor tissue. Nature Methods, 7, 383–385.
- 20. (2012). Crystal structure of the 80S yeast ribosome. Current Opinion in Structural Biology, 22, 759–767.
- 21. (2014). Comparison of label-free quantification methods for the determination of protein complexes subunits stoichiometry. EuPA Open Proteomics, 4, 82–86.
- 22. (2012). The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics, 28, 882–883.
- 23. ETH Zurich (2018).
- 24. (2017). Molecular Systems Biology, 13, 935.
- 25. (2016). Nucleic Acids Research, 44, D447–D456.
- 26. (2020). https://doi.org/10.5281/zenodo.4192409.