Proteomics Analysis Indicates Greater Abundance of Proteins Involved in Major Metabolic Pathways in Lactuca sativa cv. Salinas than Lactuca serriola Accession US96UC23

Lettuce (Lactuca sativa), cultivated mainly for its edible leaves and stems, is an important vegetable crop worldwide. Genomes of cultivated lettuce (L. sativa cv. Salinas) and its wild relative L. serriola accession US96UC23 are sequenced, but a clear understanding of the genetic basis for divergence in phenotypes of the two species is lacking. Tandem mass tag (TMT) based mass spectrometry is used to quantitatively compare protein levels between these two species. Four‐day old seedlings are transplanted into 500 mL pots filled with soil. Plants are grown for 8 weeks under 250 µmol m−2 sec−1 continuous light, 20 °C, and relative humidity of 50–70%. Leaf discs (1 cm diameter) from three individuals per biological replicate are analyzed. A total of 3000 proteins are identified, of which the levels of 650 are significantly different between ‘Salinas’ and US96UC23. Pathway analysis indicates a higher flux of carbon in ‘Salinas’ than in US96UC23. Many essential metabolic pathways, such as tetrapyrrole metabolism and fatty acid biosynthesis, are upregulated in ‘Salinas’ compared with US96UC23. This study provides a reference proteome for researchers interested in understanding lettuce biology and improving traits for cultivation.


Introduction
Cultivated lettuce (Lactuca sativa L.) is one of the most popular leafy green vegetables around the world, with production totaling 27 million tons in 2017 (http://www.fao.org/faostat/, accessed 2019-12-18). It is the most important leafy green vegetable grown in the U.S., valued at $1.8 billion in California, U.S.A. in the 2018 crop year (https://www.cdfa.ca.gov/statistics/, accessed 2019-12-18). It comes in many different shapes, sizes, and colors, and is very distinct in phenotype from wild lettuce. [1] Cultivated lettuce is believed to be a result of crossing between different lettuce genera, but is most similar genetically to Lactuca serriola L., which is widely considered to be the direct wild progenitor of L. sativa L. [2] Cultivated lettuce has larger, broader leaves, whereas L. serriola has deeply lobed leaves that orient themselves vertically after bolting. [3,4] L. serriola forms deeper tap roots than L. sativa, which forms more lateral roots near the surface, likely as the result of inadvertent selection under irrigation and fertilization. [5,6] Understanding differences and similarities between crops and their wild relatives has been called an urgent priority to bolster crop biodiversity and food security [7] because wild relatives of crops can provide an important source of genetic diversity for biotic and abiotic stress tolerance in crop plants. [8,9] Wild lettuce shows distinct physiological traits compared with cultivated lettuce. A large number of studies have compared cultivated lettuce and various species of wild lettuce with the goal of improving traits in cultivated lettuce, including disease resistance, [10,11] pest resistance, [12,13] salt tolerance, [14,15] drought tolerance, [5,16] heat tolerance, [17,18] and bolting resistance. [19] A number of these studies have been successful in facilitating the release of cultivars with genes from wild relatives conferring resistance to downy mildew. [20,21]
A number of molecular resources are available for cultivated and wild lettuce. The genome of a commonly grown crisphead-type cultivar 'Salinas' has been sequenced and assembled into nine chromosomal linkage groups, [22] and a draft genome assembly is in progress for L. serriola US96UC23 (https://www.ncbi.nlm.nih.gov/bioproject/412928, accessed 2019-09-16). An ultra-high-density transcript-based genetic map, [23] as well as recombinant inbred lines derived from a cross of these two lines, [24] also exist.
www.advancedsciencenews.com www.proteomics-journal.com
Previous proteomics studies on lettuce focused on analysis of latex proteins, [25] analyzing leaf proteome profiles after exposure to cylindrospermopsin, [26] analysis of microbial communities in packaging, [27] comparative study in response to salinity and zinc stress, [28] thermoinhibition of seed germination, [29] bolting induced by high temperature, [30] growth enhancement by rhizobacteria, [31] identification of bacterial pathogens in lettuce [32] and analysis of changes in lettuce proteome related to color and senescence under modified atmosphere. [33] In this article, we report results from a proteomics study conducted on plants grown under highly controlled conditions. This study provides a reference towards future research aimed at understanding the basis of difference in phenotype between the two species under various stress conditions. It also provides valuable information about which metabolic pathways and proteins are dominant in L. sativa and L. serriola, and which proteins could have a more profound impact on morphology and physiology.

Quality Checks and Fold Change Distribution of Candidate Proteins
Initial sample loading amounts were very similar, as observed from the reporter ion intensities (Figure 1A), and needed minimal adjustment to normalize (Figure 1B). The biological replicates of US96UC23 were more closely clustered than those of 'Salinas.' Overall, clustering and correlation coefficients do not indicate large differences between 'Salinas' and US96UC23 (Figure 2). The data were compatible with the statistical tests used, as indicated by the flat distribution of large p-values of noncandidates and an increase at small p-values of true candidates (Figure 1C). Candidates had a fairly even distribution across the low, medium, and high fold-change categories (Figure 2).
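The loading adjustment described above can be illustrated with a minimal sketch of sample-loading normalization, in which each TMT channel is scaled so that its summed reporter ion intensity equals the mean channel total. This is an assumption about the general approach, not a record of the exact procedure used in this study; the function name and the toy intensities are hypothetical.

```python
def sample_loading_normalize(intensities):
    """Scale each channel (column) so that all channel totals equal the mean total.

    intensities: list of rows, one row per protein, one value per TMT channel.
    Returns (normalized_rows, scaling_factors).
    """
    n_channels = len(intensities[0])
    # Summed reporter ion intensity per channel
    totals = [sum(row[c] for row in intensities) for c in range(n_channels)]
    # Target total is the mean of the channel totals
    target = sum(totals) / n_channels
    factors = [target / t for t in totals]
    normalized = [[v * f for v, f in zip(row, factors)] for row in intensities]
    return normalized, factors


# Hypothetical toy data: 3 proteins x 6 TMT channels (not values from this study)
toy = [
    [100.0, 110.0, 90.0, 105.0, 95.0, 100.0],
    [200.0, 220.0, 180.0, 210.0, 190.0, 200.0],
    [50.0, 55.0, 45.0, 52.5, 47.5, 50.0],
]
normalized, factors = sample_loading_normalize(toy)
```

Scaling factors close to 1 for every channel would correspond to the "minimal adjustment" noted above.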

Significantly Differentially Regulated Proteins
A total of 3000 proteins were identified at a 5% protein and peptide threshold with a minimum of 2 peptides per protein. Of these, 400 were significantly downregulated and 250 were significantly upregulated in US96UC23 compared with 'Salinas.'

Significance Statement
Lettuce (Lactuca sativa L.) is an important commodity and is economically the most important leafy green vegetable in the U.S. Genomes of both L. sativa L. and its wild relative, Lactuca serriola L., have been sequenced, with 'Salinas' representative of L. sativa and accession US96UC23 representative of L. serriola. However, we currently lack information about the proteomes of the two species. Our study fills this gap by using tandem mass tag-based labeling and highly accurate and sensitive mass spectrometry to study the differences in protein levels between the two species under control conditions. Our results suggest significantly lower flux through many essential metabolic pathways in US96UC23 compared with 'Salinas'. Perturbations in phytochrome signaling may be one possible cause of the shorter habit of L. sativa and need further study, and a higher quantity and/or activity of Fatty Acid Desaturase 2, known to contribute to tolerance to many abiotic stresses, may be an important factor contributing to high abiotic stress tolerance in US96UC23 and L. serriola in general. This study provides new information for understanding the difference in physiology and phenotype of the two species. It also provides the first reference proteome for researchers studying lettuce biology, both for answering fundamental questions and for improving agronomically important traits.

Enrichment and KEGG Pathway Analysis
Gene Ontology (GO) term enrichment analysis indicated a large representation of proteins involved in the "biological processes" category, with a significant representation of processes such as photosynthesis, carbohydrate metabolic process, nitrogen, peptide, and amide related biosynthetic processes, and generation of precursor metabolites and energy. From the "cellular components" category, a significant proportion of proteins were part of plastids, chloroplasts, and thylakoids (Figure 3). This suggests that a major fraction of proteins that were different between 'Salinas' and US96UC23 were most likely involved in photosynthesis and/or chlorophyll metabolism, suggesting major differences in the energy metabolism processes between the two species. From the "molecular function" category, the most significant representation was of proteins with catalytic activity, that is, enzymes participating in a wide range of processes (Figure 3).
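As an illustration of how a GO term can be tested for over-representation, the following minimal sketch computes a one-sided Fisher's exact (hypergeometric) p-value from a 2 x 2 contingency of annotated and non-annotated proteins. It is a generic sketch under stated assumptions, not the Blast2GO implementation; the function name, the one-sided formulation, and the example counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value for over-representation of a GO term.

    k: proteins in the test set annotated with the term
    n: size of the test set
    K: proteins annotated with the term in test and reference sets combined
    N: total number of proteins in test and reference sets combined
    """
    upper = min(n, K)
    total = comb(N, n)
    # Probability of observing k or more annotated proteins in the test set
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts for a single GO term (not values from this study)
p_value = fisher_enrichment_p(k=12, n=100, K=30, N=1000)
```

In practice, one such test is run per GO term and the resulting p-values are corrected for multiple testing.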
Globally, the most represented pathways included carbohydrate metabolism, fatty acid metabolism, nucleotide metabolism, carotenoid biosynthesis, photosynthesis, and carbon fixation (Figure 4). The quantity of proteins representing these pathways differs markedly between the two species and suggests that flux through fatty acid biosynthesis, nucleotide metabolism, carotenoid biosynthesis, sugar metabolism, energy mobilization (pentose phosphate pathway), and other pathways involved in carbon utilization and storage could be significantly higher in L. sativa compared with L. serriola (Figure 4).

Tetrapyrrole Metabolism
Tetrapyrroles are molecules essential for the survival of all living beings on Earth. [34] In higher plants, four classes of tetrapyrroles are produced, namely, chlorophyll, heme, siroheme, and phytochromobilin [35] (Figure S1, Supporting Information). The first three steps of tetrapyrrole biosynthesis are conserved across all living organisms (Figure S1, Supporting Information). Tetrapyrrole biosynthesis (Figure S1, Supporting Information) is highly conserved in higher plants. [35] It begins with the precursor glutamyl tRNA being converted to glutamate-1-semialdehyde by the enzyme glutamyl tRNA reductase (GluTR); glutamate-1-semialdehyde is then converted to 5-aminolevulinic acid, the first committed intermediate in the tetrapyrrole biosynthesis pathway (Figure S1, Supporting Information). Tetrapyrrole biosynthesis is tightly regulated by various mechanisms. [36] GluTR is the first enzyme in the pathway and is subject to transcriptional, post-transcriptional, and post-translational regulation. [36] Porphobilinogen deaminase (hydroxymethylbilane synthase) converts porphobilinogen to hydroxymethylbilane; coproporphyrinogen III oxidase (chloroplastic) (CPOX) converts coproporphyrinogen III to protoporphyrinogen IX; and protoporphyrinogen IX oxidase converts protoporphyrinogen IX to protoporphyrin IX [37] (Figure S1, Supporting Information). Protoporphyrin IX is a shared intermediate, used as a precursor in chlorophyll and heme biosynthesis. Organisms that do not perform photosynthesis lack the chlorophyll branch of the pathway. Glutamyl tRNA reductase -1 and -2, which constitute the glutamyl-tRNA reductase enzyme, were both present in a higher amount in 'Salinas' compared with US96UC23 (Figure S2, Supporting Information). Glutamyl tRNA reductase catalyzes a key rate-limiting step in the pathway, and one way in which its activity is regulated is by controlling the level of protein. Thus, higher amounts of enzyme suggest increased flux through this pathway. This interpretation is supported by the observation that other important enzymes in the tetrapyrrole biosynthesis pathway, such as CPOX, were also present in a significantly larger quantity in 'Salinas' compared with US96UC23 (Table 1).

Chlorophyll Metabolism
Chlorophyll metabolism pathways and many of the enzymes involved are highly conserved across all organisms performing oxygenic photosynthesis. [34][35][36][37][38][39] Magnesium protoporphyrin IX chelatase (Mg chelatase) is a key enzyme that commits protoporphyrin IX to chlorophyll biosynthesis instead of heme biosynthesis. [35][36][37] It consists of ChlI, ChlD, and ChlH subunits, and its activity is further regulated by Genomes Uncoupled 4. [40][41][42] This enzyme and its components are highly conserved across all photosynthetic organisms, and their functions are essential for survival. [43] This step is tightly regulated in photosynthetic organisms because porphyrins are highly photoreactive molecules, and any perturbation in their metabolism can lead to the generation of extremely toxic reactive oxygen species in the presence of light and water. [34,44] One of the regulatory mechanisms includes controlling the amount and ratio of the Mg chelatase subunits. [45] Our data indicate a significant decrease in the level of enzymes involved in chlorophyll biosynthesis, such as ChlI, ChlD, magnesium protoporphyrin IX monomethyl ester oxidative cyclase, and protochlorophyllide oxidoreductase, in US96UC23 compared with 'Salinas' (Table 1, Figure S2, Supporting Information). Among the enzymes involved in chlorophyll degradation, pheophytinase was present at slightly higher levels in US96UC23 compared with 'Salinas' (Table 1). Thus, we can hypothesize that chlorophyll biosynthesis occurs at a higher rate, and chlorophyll degradation at a reduced rate, in 'Salinas' compared with US96UC23. The size of the light harvesting antenna is regulated at least in part by chlorophyll biosynthesis. [46] Thus, it is possible that higher chlorophyll levels resulting from increased chlorophyll biosynthesis are also associated with an increased number of light harvesting complexes, resulting in conversion of relatively more light energy into chemical energy.
Figure 3. Enrichment analysis. Test set represents 'Salinas' and reference set represents US96UC23. Enrichment analysis of Gene Ontology (GO) terms was conducted by Blast2GO and Fisher's Exact Test.

Heme Metabolism
Heme biosynthesis in most organisms takes place by a highly conserved route. [47] Ferrochelatase catalyzes the reaction inserting Fe into the protoporphyrin IX porphyrin ring, resulting in heme. [35,48] Heme Oxygenase 1 (HO1) catalyzes the opening of the heme ring to form the linear tetrapyrrole, biliverdin. [35] Chloroplastic phytochromobilin:ferredoxin oxidoreductase (HY2) converts biliverdin to phytochromobilin. [49] Ferrochelatase was present at a slightly higher level, whereas HO1 and HY2 were present in lower quantities, in US96UC23 compared with 'Salinas' (Table 1, Figure S2, Supporting Information). Heme is a cofactor in many essential enzymes such as cytochromes, catalases, peroxidases, and nitrate reductase, while phytochromobilin is the chromophore incorporated into phytochrome photoreceptors. [34,35,50] In addition to photoreception, phytochromes act as signaling molecules that regulate physiological changes in the plant in response to red and far-red light conditions. [51] This in turn could affect root growth and gravitropism, in addition to phototropism, via their signaling activity. Some reports indicate that elevated amounts of the different phytochromes can result in shorter stems, dark green leaves, and changes in shade avoidance responses, with the plant spending less energy on growing tall and using it instead to produce more seeds and expand its root system. [52] Thus, a higher flux down the heme branch can have many drastic implications for the plant. Based on results in 'Salinas' and US96UC23, it is possible that an increase in heme and phytochromobilin biosynthesis in L. sativa results in a higher amount of phytochromes, and may be one of the components of a signaling cascade that ultimately results in shorter stems and bigger leaves in L. sativa compared with L. serriola.

Photosynthesis and Carbon Fixation
KEGG pathway analysis indicates downregulation of some processes and upregulation of others involved in photosynthesis and carbon fixation in US96UC23 compared with 'Salinas' (Figure 4). Carbonic anhydrases and aquaporins have been shown to explain variation in mesophyll conductance in olive (Olea europaea). [53] In order for gaseous CO2 to move from outside the cell to aqueous sites of carboxylation in chloroplasts, it must be converted to HCO3− by carbonic anhydrases (CAs). CAs are zinc metalloenzymes that are grouped into three families in plants (αCA, βCA, γCA) and assist in a wide range of physiological processes. A cluster of βCA1-like proteins was significantly upregulated in US96UC23 compared with 'Salinas' (Table 1). Ribulose bisphosphate carboxylase/oxygenase (rubisco) subunit quantities were not significantly different between 'Salinas' and US96UC23; however, a small but significant increase in rubisco activase was observed in US96UC23 compared with 'Salinas' (Table 1). Rubisco activase is accepted as the major limiting factor in carbon assimilation, at least at species-specific high temperatures. [54] Adaptation to high temperature environments across the range of red maple (Acer rubrum) has been attributed to an increased ratio of rubisco activase to rubisco. [55] A higher quantity and/or activity of rubisco activase and CAs, leading to higher mesophyll CO2 concentration and supply of metabolites, could lead to higher rates of carbon assimilation in US96UC23 compared with 'Salinas'. Gas exchange measurements reported in a previous study of the same lines indicate a higher rate of carbon fixation in US96UC23 than in 'Salinas,' [56] thus supporting our hypothesis.

Lipid Metabolism
Sugars from photosynthetic CO2 fixation are converted to pyruvate, an important intermediate in the conversion of carbohydrates to fatty acids. [57] Formation of pyruvate and acetyl CoA are important steps in energy conversion, and both pyruvate kinase and acetyl CoA carboxylase are subject to multiple forms of regulation. [58][59][60] KEGG pathway analysis indicates an overall downregulation of fatty acid metabolism in US96UC23 compared with 'Salinas' (Figure 4). We observed lower amounts of pyruvate kinase and a significantly lower amount of pyruvate dehydrogenase in US96UC23 compared with 'Salinas' (Table 1). Biotin carboxyl carrier protein (BCCP) as well as the (homomeric) acetyl CoA carboxylase were significantly lower in US96UC23 compared with 'Salinas' (Table 1, Figure S5, Supporting Information). In contrast, delta-12 fatty acid desaturase 2 (FAD2) and omega-6 fatty acid desaturase, responsible for introducing the second double bond in the biosynthesis of 18:3 fatty acids, were present in higher quantities in US96UC23 than in 'Salinas' (Table 1, Figure S5, Supporting Information). An increase in the activity of FAD2 leads to an increase in the amount of polyunsaturated fatty acids, especially in ER and plastid membranes, which has been shown to improve physiological and vegetative characteristics, especially tolerance to heat, cold, and salt stress. [61,62]

Concluding Remarks
In addition to individual proteins, we analyzed the cooperative effect of proteins representing the respective metabolic pathways. We identified key regulatory proteins in essential metabolic pathways that were differentially regulated between 'Salinas' and US96UC23. Because of their regulatory roles, we can predict with a high level of confidence that they affect other protein targets in the respective pathways. We can hypothesize that these protein level differences affect the quantity of product or intermediates from that pathway, and thus influence downstream metabolic pathways. Our data suggest comparatively lower chlorophyll biosynthesis, fatty acid biosynthesis, carotenoid metabolism, nucleotide and amino acid biosynthesis, and carbohydrate metabolism, and a higher carbon fixation rate, in US96UC23 than in 'Salinas'. A previous study measuring photosynthesis and carbon fixation rates in US96UC23 and 'Salinas' using gas exchange measurements supports our prediction about carbon fixation. This provides a certain level of confidence in our hypotheses about metabolic fluxes through the other branches that could be reliably represented from our dataset. High tolerance to abiotic stress in US96UC23 and possibly other wild lettuce lines may be due to a higher level and/or activity of FAD2, which has a known function in tolerance to various abiotic stresses. The shorter growth habit of 'Salinas' and possibly L. sativa in general may at least partly be due to perturbations in the phytochrome signaling pathway, as suggested by an increased flux down the heme and phytochrome precursor branch.
Due to limitations of current proteomics methods, only a small proportion of the proteome of higher eukaryotic cells can be monitored. [63] Low-abundance proteins and other proteins that may play crucial roles in regulation or metabolism but could not be detected in our study could potentially change the current prediction of metabolic fluxes. Also, metabolic fluxes may not always correlate straightforwardly with protein abundance, as post-translational modifications and non-protein regulatory components, including availability of metabolites, add an additional layer of complexity. [64] Thus, interpretations of protein abundance with respect to metabolic flux should be validated in the future via targeted experiments to measure key metabolites in the respective pathways. Likewise, gene knockouts and/or mutants should be used to test the hypothesized effects of upregulation or downregulation of protein levels in the respective lines. Regardless, our study provides baseline proteome differences between the two most well-studied wild and cultivated lettuce genotypes under highly controlled conditions, and provides a reference for pursuing interesting avenues of research in the future.

Experimental Section
Growth Conditions: L. sativa 'Salinas' and L. serriola US96UC23 seeds were germinated in Petri dishes in distilled water. Four-day old seedlings were transplanted into 500 mL pots filled with potting mix and thinned to three seedlings per pot. Plants were grown under controlled conditions in a Conviron growth chamber (Conviron, Winnipeg, Canada) maintained at approximately 250 µmol m−2 sec−1 continuous light in the photosynthetically active radiation region, 20 °C, and 50-70% RH. The plants were fertilized at 1 and 7 weeks with a 20N-20P-20K fertilizer according to the manufacturer's instructions. Approximately 8 weeks after transplantation, three 1 cm leaf discs were excised and pooled from each individual from three pots, for a total of three biological replicates per genotype. Excised leaf discs were flash-frozen in liquid N2 and stored at −80 °C.
Protein Extraction and Qualitative Analysis: Proteins were extracted as described previously. [40] Briefly, the leaf discs mentioned above were crushed in liquid N2 and ground to a fine powder. A small aliquot of this frozen powder was dissolved in ice-cold protein extraction buffer in a 1:3 ratio. This mixture was thoroughly homogenized using a plastic micro-pestle. The suspension was clarified by centrifugation at 16 000 × g at 4 °C for 10 min. The protein concentration of the clarified supernatant was quantified using the Coomassie Plus (Bradford) protein assay (Thermo Fisher Scientific, Waltham, MA, USA) and a bovine serum albumin standard. Samples were normalized to 2 µg µL−1 and qualitatively analyzed by SDS-PAGE.
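The normalization to 2 µg µL−1 amounts to a simple dilution calculation (C1V1 = C2V2). The sketch below is illustrative only: the function name, the final volume of 50 µL, and the example stock concentration are assumptions and are not taken from the protocol.

```python
def dilution_volumes(stock_conc, target_conc=2.0, final_volume=50.0):
    """Return (sample_volume, buffer_volume) in µL for diluting a protein extract.

    stock_conc and target_conc are in µg/µL; final_volume is in µL.
    The 50 µL default final volume is a hypothetical value.
    """
    if stock_conc < target_conc:
        raise ValueError("extract is more dilute than the target concentration")
    # C1 * V1 = C2 * V2, solved for V1
    sample_volume = target_conc * final_volume / stock_conc
    return sample_volume, final_volume - sample_volume


# Hypothetical extract measured at 5 µg/µL by Bradford assay
sample_ul, buffer_ul = dilution_volumes(5.0)
```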
Proteolytic Digestion: Protein samples (100 µg) were digested in-gel according to a published protocol [65] with modifications. Gel bands were dehydrated using 100% acetonitrile and incubated with 10 mM dithiothreitol in 100 mM ammonium bicarbonate, pH ≈ 8, at 56 °C for 45 min, dehydrated again, and incubated in the dark with 50 mM iodoacetamide in 100 mM ammonium bicarbonate for 20 min. Gel bands were then washed with ammonium bicarbonate and dehydrated again. Sequencing grade modified trypsin was prepared to 0.01 µg µL−1 in 50 mM ammonium bicarbonate, and ≈100 µL of this was added to each gel band so that the gel was completely submerged. Bands were then incubated at 37 °C overnight. Peptides were extracted from the gel by water bath sonication in a solution of 60% acetonitrile/1% trifluoroacetic acid and vacuum dried to ≈2 µL.
Isobaric Peptide Labeling: Peptide samples (100 µg each) were resuspended in 100 µL of 100 mM triethylammonium bicarbonate (TEAB) and labeled with TMT6 reagents from Thermo Fisher Scientific according to the manufacturer's instructions. Aliquots of 5 µL were taken from each labeled sample and used to test labeling efficiency by MS. The remaining labeled peptides were mixed in equal portions, and the combined sample was de-salted using reverse phase C18 SepPaks. Eluted peptides were dried by vacuum centrifugation to ≈2 µL and stored at −20 °C.
OffGel Fractionation: The dried peptide sample was re-suspended in Agilent OffGel sample buffer to 1.5 mL and fractionated into 12 portions using an Agilent 3100 OFFGEL Fractionator (www.agilent.com) over a nonlinear 3-10 pH gradient according to the manufacturer's instructions. Following electrophoresis, each fraction was de-salted using C18 stageTips. [66] Purified peptide samples were then dried to ≈2 µL using vacuum centrifugation and frozen at −20 °C.
LC/MS/MS Analysis: The samples were re-suspended to 20 µL using 2% acetonitrile, 0.1% formic acid, and an injection of 8 µL was automatically made using a Thermo EASYnLC 1200 onto a Thermo Acclaim PepMap RSLC 0.1 mm x 20 mm C18 trapping column and washed for ≈5 min with buffer A. Bound peptides were then eluted onto a Thermo Acclaim PepMap RSLC 0.075 mm x 500 mm C18 resolving column over 95 min with a gradient of 8% B to 42% B in 84 min, ramping to 90% B at 85 min and held at 90% B for the duration of the run (Buffer A = 99.9% water, 0.1% formic acid; Buffer B = 80% acetonitrile, 0.1% formic acid, 19.9% water) at a constant flow rate of 300 nL min−1. Eluted peptides were sprayed into a Thermo Fisher Scientific Q-Exactive mass spectrometer using a FlexSpray spray ion source. Survey scans were taken in the Orbitrap (60 000 resolution, determined at m/z 200), and the top ten ions in each survey scan were then subjected to automatic higher-energy collisional dissociation (HCD) with fragment spectra acquired at 30 000 resolution. The resulting MS/MS spectra were converted to peak lists using Proteome Discoverer, v2.2 (Thermo Fisher Scientific, Waltham, MA, USA) and searched against a protein database containing all L. sativa sequences available from NCBI (www.ncbi.nlm.nih.gov, downloaded 10/26/2018) appended with common laboratory contaminants (downloaded from www.thegpm.org, cRAP project) using the Mascot search algorithm, v2.6. The search output was then analyzed using Scaffold, v4.8.9 (Proteome Software, Portland, OR, USA) to probabilistically validate protein identifications. Assignments validated using the Scaffold 1% FDR confidence filter are considered true. Quantification of reporter ion intensities was done using the Q+S module within Scaffold.
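The elution gradient described above can be expressed as a piecewise-linear function of time. The following sketch assumes that the gradient clock starts at t = 0 at 8% B and that the ramps are linear; both are assumptions, and the function names are hypothetical. Because buffer B is 80% acetonitrile and buffer A contains none, the acetonitrile content of the mobile phase is 0.8 times the %B value.

```python
def percent_b(t_min):
    """Approximate %B at time t_min (minutes) for the 95 min run.

    Assumes the gradient starts at t = 0 and that all ramps are linear.
    """
    if not 0 <= t_min <= 95:
        raise ValueError("time must be within the 95 min run")
    if t_min <= 84:
        # 8% B to 42% B over 84 min
        return 8.0 + (42.0 - 8.0) * t_min / 84.0
    if t_min <= 85:
        # Ramp from 42% B to 90% B between 84 and 85 min
        return 42.0 + (90.0 - 42.0) * (t_min - 84.0)
    # Hold at 90% B for the remainder of the run
    return 90.0


def acetonitrile_percent(t_min):
    """Acetonitrile content of the mobile phase (buffer B is 80% acetonitrile)."""
    return 0.8 * percent_b(t_min)


# Sample the profile at a few time points (minutes)
profile = [(t, percent_b(t)) for t in (0, 42, 84, 85, 95)]
```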
Mascot parameters for all databases were as follows: 1) allow up to two missed tryptic sites; 2) fixed modifications of carbamidomethyl cysteine and TMT6-plex on lysine and peptide N-terminus; 3) variable modifications of oxidation of methionine and acetylation of protein N-terminus; 4) quantitation using the TMT6-plex reporter ion method in Distiller; 5) peptide tolerance of ±10 ppm; 6) MS/MS tolerance of 0.02 Da; 7) FDR calculated using a randomized database search.
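To make the two tolerance settings concrete, the sketch below converts a relative tolerance in ppm into an absolute m/z window; unlike the fixed 0.02 Da fragment tolerance, the width of a ppm window scales with m/z. The function name and the example m/z value are hypothetical and serve only as illustration.

```python
def ppm_window(mz, tol_ppm=10.0):
    """Return the (low, high) m/z bounds for a +/- tol_ppm tolerance around mz."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


# Hypothetical precursor at m/z 500: 10 ppm corresponds to +/- 0.005 in m/z
low, high = ppm_window(500.0)
```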
Proteomics Data Analysis: Reporter ion intensities were quantified using the Q+ module in Scaffold. Non-parametric statistical analysis was conducted by applying the permutation test with Benjamini-Hochberg correction at a significance level of p < 0.05. Data normalizations, quality control checks, and data visualizations were conducted using R following example analyses available at https://github.com/pwilmart/TMT_analysis_examples. Gene ontology (GO) and enrichment analysis using Fisher's Exact Test was conducted using Blast2GO (BioBam Bioinformatics S.L., Valencia, Spain). Mapping of significantly upregulated or downregulated proteins on metabolic pathways was conducted using the Kyoto Encyclopedia of Genes and Genomes (KEGG). [67] Proteins with no KEGG identifiers could not be mapped onto the metabolic pathways.
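The statistical approach named above can be sketched generically as an exhaustive two-group permutation test on the difference in means, followed by Benjamini-Hochberg adjustment of the resulting p-values. This is not Scaffold's implementation, whose internals are not described here; the function names, the choice of test statistic, and the toy values are assumptions for illustration.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            extreme += 1
        total += 1
    return extreme / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical log-intensities for one protein in two groups (toy values only)
p = permutation_p_value([10.1, 10.4, 9.9, 10.2, 10.0], [11.0, 11.3, 10.9, 11.2, 11.1])
adjusted = benjamini_hochberg([p, 0.20, 0.03, 0.50])
```

A protein would then be called a candidate when its adjusted p-value falls below the chosen significance level.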

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.