Efficient protein extraction for shotgun proteomics from hydrated and desiccated leaves of resurrection Ramonda serbica plants

The resurrection plant Ramonda serbica is a suitable model to investigate vegetative desiccation tolerance. However, the detailed study of these mechanisms at the protein level is hampered by the severe tissue water loss, high amounts of phenolics and polysaccharides, and possible protein modification and aggregation during the extraction and purification steps. When applied to R. serbica leaves, widely used protein extraction protocols containing polyvinylpolypyrrolidone and ascorbate, as well as the phenol/SDS/buffer–based protocol recommended for recalcitrant plant tissues, failed to eliminate persistent contamination and ensure high protein quality. Here we compared three protein extraction approaches aiming to establish the optimal one for both hydrated and desiccated R. serbica leaves. To evaluate the efficacy of these protocols by shotgun proteomics, we also created the first R. serbica annotated transcriptome database, available at http://www.biomed.unipd.it/filearrigoni/Trinity_Sample_RT2.fasta. The detergent-free phenol-based extraction combined with dodecyl-β-d-maltoside-assisted extraction enabled high-yield and high-purity protein extracts. The phenol-based protocol improved the protein-band resolution, band number, and intensity upon electrophoresis, and increased the protein yield and the number of identified peptides and protein groups by LC-MS/MS. Additionally, dodecyl-β-d-maltoside enabled solubilisation and identification of more membrane-associated proteins. The presented study paves the way for investigating the desiccation tolerance in R. serbica, and we recommend this protocol for similar recalcitrant plant material.


Introduction
Quantitative proteomics is a powerful tool in plant physiology, helping to understand the molecular mechanisms of plant responses to different environmental stimuli. It allows simultaneous identification and relative quantification of peptides/proteins. Protein extraction, the first step in proteomic studies, is critical for obtaining accurate and reliable results. However, depending on species and tissue, plant material usually contains a high amount of various compounds that may interfere with the conventional protein extraction and separation protocols [1].
Our research focuses on metabolic pathways underlying desiccation tolerance in the resurrection plant Ramonda serbica (Fig. 1), which can survive desiccation for a long period and recover metabolic functions within several hours of watering [2,3]. Cellular desiccation (down to < 10% of relative water content [RWC]) leads to protein denaturation, aggregation, and degradation, making such tissues challenging for proteomic approaches [4]. Besides drought, other adverse environmental factors such as freezing, salinity, and osmotic stress reduce intracellular water content in plants.
An additional challenge in obtaining protein fractions that yield high-quality mass spectrometric data is the high content of phenolics and polysaccharides and the increased activity of oxidative enzymes, which are induced during desiccation and during oxidative stress upon rehydration of R. serbica [2,3,5]. To cope with severe water loss, resurrection plants stimulate biosynthesis of osmoprotectants, such as mono-, oligo-, and polysaccharides, which increase the viscosity of the protein extracts [6]. Besides, the hairy leaves of R. serbica are rich in polyphenols (catechins, hydroxybenzoic and hydroxycinnamic acids, apigenin, and anthocyanins) [7], in the range of those found in Camellia sinensis, which is known for high concentrations of catechins and other polyphenols [8].
Phenolics can interact with proteins by hydrogen bonding (especially phenolics with ortho-dihydroxy groups), ionic and hydrophobic interactions (highly methoxylated phenolics with more aromatic rings), and coordinate bonds involving cations [9]. These interactions make proteins more hydrophobic and susceptible to aggregation and precipitation, depending on the structure of phenolic compounds [10]. Moreover, in the presence of pro-oxidants, particularly metals, phenolics easily oxidise to phenoxyl radicals and quinones, which can irreversibly covalently modify protein side chains. Besides non-enzymatic phenolic oxidation, polyphenol oxidases and laccases (ortho-diphenol oxidases) and class III peroxidases (in the presence of hydrogen peroxide) can catalyse the oxidation of various phenolics.
The main goal of our study was to obtain high-yield and high-purity protein fractions for shotgun proteomic analysis of samples from recalcitrant fully hydrated (HL) and desiccated leaves (DL) of R. serbica. To the best of our knowledge, no method has been established for the effective total protein isolation from HL and DL of R. serbica. The elimination of interfering compounds is a prerequisite for satisfactory protein extraction, and we assumed that lower water content in such recalcitrant material presents an additional challenge for protein extraction compatible with liquid chromatography-tandem mass spectrometry (LC-MS/MS). We compared several commonly used protein extraction methods generally applied in such cases, in particular soluble protein extraction with the simultaneous removal of polyphenols by polyvinylpolypyrrolidone (PVPP), trichloroacetic acid (TCA)/acetone precipitation, and phenol-based extraction, to find the best conditions for obtaining a good protein yield and quality from R. serbica leaves. To achieve an efficient membrane-bound protein solubilisation, different detergents were also evaluated: sodium dodecyl sulphate (SDS), dodecyl-β-D-maltoside (DDM), and Triton X-100. Protein annotation was done against our newly obtained transcriptome database of R. serbica. Since the genome of this species is unknown, we generated the first de novo transcriptome of R. serbica, which is now publicly available at: http://www.biomed.unipd.it/filearrigoni/Trinity_Sample_RT2.fasta. The results of our evaluation and the new annotated database provide new resources for studying the desiccation tolerance mechanisms in this model system.

Plant material
The resurrection plants Ramonda serbica Pančić were collected from their natural habitat in a gorge near the city of Niš in south-eastern Serbia. Desiccation was induced as described in Veljović-Jovanović et al. [2]. The mature leaves (4-5 per plant) of 5 hydrated plants (with approximately 90% of relative water content [RWC]) and of 5 desiccated plants (15% RWC; Fig. 1) were harvested, frozen in liquid nitrogen, and stored at −80°C for RNA and protein extraction.

Total RNA extraction
For transcriptome analysis, total RNA from HL of R. serbica was isolated using a TRIzol-based protocol [11]. All isolations were performed in triplicate. The leaves from four individual plants were pooled as one RNAseq sample.

RNA purification, cDNA library construction, and transcriptome sequencing of R. serbica leaves
For R. serbica transcriptome construction, high-quality RNA from three pooled replicates (mix of leaves from 4 plants) was sequenced. cDNA libraries were sequenced on an Illumina Hi-Seq platform (Illumina, San Diego, CA, USA). The ambiguous nucleotides, adapter sequences, and low-quality sequences were trimmed, and the quality of the reads was checked before and after the trimming. The high-quality reads were used to perform transcriptome assembly using the Trinity platform (the workflow can be found at http://trinityrnaseq.github.io). The transcript abundance was estimated by the alignment-based quantification method RSEM [12].

Protein extraction
For soluble protein extraction (see Electronic Supplementary Material (ESM) Fig. S1A), buffer A (50 mM HEPES-KOH buffer, pH 8.0 containing 2 mM EDTA, 10 mM ascorbate, 2 mM phenylmethylsulphonyl fluoride [PMSF]) and 10% of water-insoluble PVPP (Polyclar AT, Serva) were added to ground material of 4-5 HL or DL per plant. To remove metal ions and other contaminants, PVPP was treated according to Kallinich et al. [9]. After centrifugation, pellets were re-extracted and re-extracts were pooled with corresponding extracts, concentrated, and stored at −80°C until analysis. The proteins in the extracts were concentrated by freeze-drying or through ultrafiltration using the Amicon® Ultra-4 Centrifugal filters (with cut-off 10 kDa, Millipore). Optionally, additional protein purification was done by TCA/acetone precipitation [13].
Phenol-based extraction and precipitation with ammonium acetate in methanol were performed according to Wang et al. [14] with minor modifications (ESM Fig. S1B). After grinding, 10% TCA in acetone containing 0.07% of 2-mercaptoethanol (βME) was added to leaf powder, followed by centrifugation at 16,000g and 4°C for 10 min and additional precipitation with 80% methanol containing 0.1 M ammonium acetate. The obtained pellet was washed with 80% acetone, precipitated, and flushed with argon, and proteins were extracted with a 1:1 mixture of buffer B (50 mM Tris-HCl pH 8.0 containing 2 mM EDTA, 2 mM PMSF, and 30% sucrose to enable phase inversion regarding phenol) and phenol equilibrated with Tris-HCl pH 8. The lower aqueous phase should contain salts, carbohydrates, glycosylated phenolics, nucleic acids, and insoluble cell debris, while the upper phenol phase (buffered to pH 8.0) should contain proteins, hydrophobic phenolics, lipids, and pigments [13,15].
Proteins were precipitated with 80% methanol containing 0.1 M ammonium acetate and washed with acetone, and final pellets were resuspended in 0.1 M Tris-HCl buffer (pH 7.5) containing 7 M urea, 0.4% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulphonate (CHAPS), 2 mM EDTA, and 20 mM dithiothreitol.

Electrophoresis and protein quantification
The quality of protein extracts (purity and yield) of DL and HL of R. serbica was analysed by SDS polyacrylamide gel electrophoresis (PAGE). Protein extract aliquots (normalised to HL and DL dry weights) were denatured and resolved on 12% SDS-PAGE according to Vidović et al. [14]. For protein band visualisation, two procedures were employed: silver staining [16] and SYPRO Ruby Protein Gel Staining, as recommended by the respective manufacturer (Molecular Probes, Invitrogen).
Since common colourimetric protein quantification assays [17][18][19] were not applicable for poor-quality protein extracts (ESM Fig. S1A and S1C), the "in-gel" protein quantification upon SDS-PAGE followed by SYPRO Ruby Protein Gel Staining was done densitometrically (with two technical replicates for each extraction procedure) with Gel Doc XR gel imager (Bio-Rad) and Image Lab™ software. An Escherichia coli protein extract (protein concentration: 0.5 and 1.0 mg mL⁻¹ estimated by the Bradford protein assay) was used as a reference and was loaded on each gel, while empty lanes served as the negative control for background subtraction. In improved, phenol-based extracts with removed contaminants (ESM Fig. S1B), protein concentration was quantified by the Bradford protein assay [17], using the abovementioned E. coli extract as a reference.
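The in-gel quantification described above can be illustrated with a minimal sketch. It assumes a linear relation between background-subtracted lane intensity and protein concentration over the range of the two reference loads; the source does not state which calibration model was used, and all intensity values and function names below are hypothetical.

```python
def fit_line(points):
    """Least-squares slope and intercept through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_concentration(lane_intensity, background, standards):
    """Estimate protein concentration (mg/mL) of a sample lane.

    standards: (concentration, raw intensity) pairs for the reference lanes.
    background: raw intensity of an empty lane, subtracted from every lane.
    """
    corrected = [(conc, raw - background) for conc, raw in standards]
    slope, intercept = fit_line(corrected)
    return (lane_intensity - background - intercept) / slope


# Hypothetical densitometry readings (arbitrary units), not measured data.
reference_lanes = [(0.5, 1100.0), (1.0, 2100.0)]  # E. coli reference loads
empty_lane = 100.0
print(estimate_concentration(1600.0, empty_lane, reference_lanes))
```

With only two reference loads the fitted line passes exactly through both points, so the estimate is only as good as the linearity assumption; sample lanes far outside the 0.5-1.0 mg/mL range would be extrapolations.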

Liquid chromatography-tandem mass spectrometry
Peptides from all biological replicates (four HL replicates and four DL replicates) were analysed using an UltiMate 3000 nanoLC system (Dionex, Thermo Fisher Scientific) coupled to an LTQ-Orbitrap XL mass spectrometer (Thermo Fisher Scientific) according to the method described in Borgo et al. [21]. Samples were loaded into an 11-cm-long chromatographic column (Picofrit, 75 μm I.D., 15 μm Tip, New Objective) packed in house with C18 material (Aeris Peptide 3.6 μm XB-C18, Phenomenex) and separated using a linear gradient of acetonitrile/0.1% formic acid, from 3 to 50% in 90 min. Source voltage and temperature were set to 1.2-1.3 kV and 200°C, respectively. The instrument was operated in a data-dependent mode, by performing a full scan at a high resolution (60,000) on the Orbitrap, followed by MS/MS scans on the ten most intense ions acquired in the linear ion trap with CID fragmentation.
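The core idea of data-dependent acquisition, picking the ten most intense ions of each full scan for MS/MS, can be sketched as follows. This is a simplified illustration only: real instrument logic also involves features such as dynamic exclusion and charge-state filtering that are not modelled here, and the peak list and function name are hypothetical.

```python
import heapq


def select_precursors(full_scan, top_n=10):
    """Return the top_n most intense (m/z, intensity) peaks, most intense first."""
    return heapq.nlargest(top_n, full_scan, key=lambda peak: peak[1])


# Hypothetical full-scan peak list: (m/z, intensity in arbitrary units).
scan = [(400.0 + i, 10.0 * i) for i in range(15)]
for mz, intensity in select_precursors(scan):
    print(f"{mz:.1f}\t{intensity:.1f}")
```

Because only the most intense ions at a given moment are fragmented, low-abundance peptides may be selected in one run and missed in another, which is one reason identifications differ between replicate injections.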

Interpretation of MS/MS data and statistical analysis
Raw MS/MS files were analysed by Proteome Discoverer 1.4 (Thermo Fisher Scientific) connected to a Mascot Search Engine server version 2.2.4 (Matrix Science). MS/MS spectra were searched against the new R. serbica transcriptomic database concatenated with a database of common contaminants found in proteomics experiments. Trypsin was selected as digesting enzyme with up to 1 missed cleavage allowed. Mass tolerance was set to 10 ppm for precursor ions and 0.6 Da for fragment ions. Carbamidomethylation of cysteines and methionine oxidation were set as fixed and variable modifications, respectively. Data were preliminarily filtered to exclude MS/MS spectra containing fewer than five peaks. Proteins were considered positive hits if at least 2 unique peptides per protein were identified. Peptides were grouped into protein families according to the principle of maximum parsimony. The algorithm Percolator was used to assess the reliability of protein/peptide identification, and results were filtered using a q value ≤ 0.01 both at protein and at peptide levels.
The prediction of possible transmembrane α-helices in the identified proteins was obtained with the TMHMM predictor (http://www.cbs.dtu.dk/services/TMHMM/). The evaluation of the grand average of hydropathicity (GRAVY) of identified proteins was performed by the GRAVY calculator (http://www.gravy-calculator.de/). Venn diagrams were generated by a tool publicly available at http://bioinformatics.psb.ugent.be/webtools/Venn/.
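The acceptance criteria listed above (10 ppm precursor tolerance, at least 2 unique peptides, q value ≤ 0.01) can be expressed as a small sketch. It is not the Proteome Discoverer or Mascot implementation; the `ProteinGroup` record, the accessions, and the numbers are hypothetical and serve only to show how the thresholds combine.

```python
from dataclasses import dataclass


@dataclass
class ProteinGroup:
    accession: str        # hypothetical identifier
    unique_peptides: int  # unique peptides supporting the group
    q_value: float        # Percolator-style q value


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the precursor mass error is inside the search tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def confident_groups(groups, min_unique=2, max_q=0.01):
    """Keep groups with enough unique peptides and a low enough q value."""
    return [g for g in groups
            if g.unique_peptides >= min_unique and g.q_value <= max_q]


hits = [
    ProteinGroup("PG1", unique_peptides=5, q_value=0.001),
    ProteinGroup("PG2", unique_peptides=1, q_value=0.001),  # too few peptides
    ProteinGroup("PG3", unique_peptides=3, q_value=0.050),  # q value too high
]
print([g.accession for g in confident_groups(hits)])
print(within_tolerance(500.004, 500.000))  # about 8 ppm
print(within_tolerance(500.006, 500.000))  # about 12 ppm
```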
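For orientation, the GRAVY index is the mean Kyte-Doolittle hydropathy value over all residues of a sequence; more negative values indicate more hydrophilic proteins. The following is a minimal sketch of that calculation, not the code of the cited web calculator; the function name and the toy peptides are illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity of a one-letter protein sequence.

    Characters outside the 20 standard residues (e.g. X or *) are skipped.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


print(round(gravy("AILV"), 3))  # hydrophobic toy peptide: positive score
print(round(gravy("KDER"), 3))  # charged toy peptide: negative score
```

A sample-level value such as those reported for whole extracts would then be an average over the identified proteins; how that averaging was done is not specified in the text.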
Shotgun proteomic analyses were conducted on four biological replicates (hydrated or desiccated leaves harvested from four different plants) for each evaluated protocol.
To test for significant differences in the protein amount and in the number of annotated peptides/protein groups identified by different protocols, the Mann-Whitney U test was used, with the significance threshold set at 0.05. All statistical analyses were conducted with the IBM SPSS statistics software (Version 20.0, SPSS Inc., Chicago, USA).
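To make the test concrete, the sketch below computes the Mann-Whitney U statistic and an exact two-sided p value by enumerating all group assignments, which is feasible for samples as small as n = 4 per group. It is an illustration in plain Python, not the SPSS procedure used in the study (SPSS may report an asymptotic p value), and the counts are hypothetical.

```python
from itertools import combinations


def u_statistic(a, b):
    """Number of (a_i, b_j) pairs with a_i > b_j; ties count as 0.5."""
    return sum((ai > bj) + 0.5 * (ai == bj) for ai in a for bj in b)


def mann_whitney_exact(x, y):
    """Return (U for x, exact two-sided p value) by full enumeration."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    mean_u = n1 * n2 / 2
    observed_dev = abs(u_statistic(x, y) - mean_u)
    indices = range(len(pooled))
    extreme = total = 0
    # Enumerate every way of assigning n1 pooled values to the first group.
    for chosen in combinations(indices, n1):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(u_statistic(a, b) - mean_u) >= observed_dev - 1e-12:
            extreme += 1
    return u_statistic(x, y), extreme / total


# Hypothetical protein-group counts for two protocols, four replicates each.
protocol_a = [610, 640, 655, 700]
protocol_b = [450, 480, 500, 560]
u, p = mann_whitney_exact(protocol_a, protocol_b)
print(u, round(p, 4))
```

With four replicates per group there are only 70 possible assignments, so the smallest attainable two-sided exact p value is 2/70 ≈ 0.029; a result below 0.05 therefore requires complete separation of the two groups.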

Results
This study aimed to establish an efficient sample preparation protocol for LC-MS/MS shotgun proteomics from HL and DL of R. serbica. First, a new transcriptomics database of R. serbica was created. Then, we focused on developing a procedure for extracting high-yield, high-quality soluble and membrane-associated protein fractions. All procedures and protocols compared in this study are summarised in ESM Fig. S1. The most efficient approach in terms of protein quality, yield, and the number of peptides and proteins identified was finally selected. All data regarding protein and peptide identifications and all parameters useful to assess the reliability of MS/MS interpretation are reported in the ESM.

De novo transcriptome assembly
An attempt to analyse proteomics data against a generic database (such as the plant section of the UniProt database) gave completely unsatisfactory results. We therefore decided to create a dedicated R. serbica transcriptomics database using authentic RNAseq data, their de novo assembly into potential transcripts, and translation in reading frames to create the proteomics database.
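Turning assembled transcripts into searchable protein sequences can be sketched as a six-frame translation with the standard genetic code. This is an assumption about the general approach: the text does not specify which tool or which reading-frame strategy was used (a dedicated ORF predictor may have been applied instead), and the toy transcript and function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(transcript):
    """Map frame labels (+1..+3, -1..-3) to translated sequences."""
    seq = transcript.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```

In practice the translated frames would be split at stop codons (`*`) and filtered by length before being written to a FASTA file for the search engine.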
After de novo assembly, a transcriptome was generated encompassing 308,288 transcripts with 48,137 and 47,000 genes annotated with InterProScan and BlastHits, respectively (ESM Table S1). In total, 2524 protein sequences were annotated by Blast2GO, and around 10% of annotated proteins were attributed to oxidation-reduction processes (20% of annotated enzymes belonged to the oxidoreductases family), 4.7% to transmembrane transport, 3.5% to proteolysis, and 1.24% were annotated as proteins involved in response to stress.

Comparison of three different protein extraction protocols
In order to obtain high-yield and good-quality proteins required for shotgun proteomics, we compared three methods recommended for protein extraction from recalcitrant plant tissues overviewed in ESM Fig. S1.
Firstly, soluble protein extracts of HL and DL were obtained by the common buffer/PVPP/ascorbate extraction protocol, previously employed to characterise the oxidative enzyme activities in R. serbica leaves [2,3] and to eliminate polyphenols and their oxidation products. However, these extracts showed a significant background smear and low resolution of polypeptide bands upon SDS-PAGE followed by silver staining (ESM Fig. S1A). SYPRO Ruby Protein Gel staining, on the other hand, gave a reduced smear and a larger number of protein bands, but the result was still not satisfactory for further proteomic analysis. The yield of protein extraction was estimated by densitometry after SDS-PAGE and protein staining, in comparison with a reference (E. coli extract).
The observed smears were more intense in the high molecular weight range and were more pronounced in DL than in HL protein lanes. The additional protein purification with TCA and acetone did not remove the smear, and the protein amount decreased 3-fold (ESM Fig. S1A). To prevent possible protein oxidation and subsequent aggregation during the extraction (as a potential cause of smears visible in the gels) and to improve protein separation, different variations of Laemmli sample buffer, as well as different temperatures (75, 85, 95°C) and incubation times (0, 2, 5, 10 min), were tested. However, no differences in the smearing and streaks were obtained (data not shown). Upon extraction, soluble proteins were concentrated by lyophilisation or with Amicon ultrafiltration tubes. The latter halved the protein amount compared with lyophilisation (Table 1).
In order to remove contaminants from protein extracts and to obtain gels without smears and with improved separation of polypeptides, we further employed phenol-based extraction including 2% SDS as described by Wang et al. [22]. Since no phase separation between phenol and SDS/Tris buffer could be obtained, we decided to optimise this protocol. Thus, we excluded SDS from the buffer and shortened the procedure by excluding the additional purification step with methanol. Such detergent-free phenol-based extraction provided significantly better protein extraction from both HL and DL of R. serbica than the previous method based on buffer/PVPP/ascorbate extraction, as the protein bands upon SDS-PAGE were significantly more numerous and well resolved (ESM Fig. S1B). Although the silver-stained gels still exhibited a slight smear, particularly in the case of DL, it was drastically reduced compared with the poor electrophoretic separation of proteins obtained after buffer/PVPP/ascorbate extraction. In this way, a relatively high protein yield (Table 1) was achieved, and the most widely used colourimetric assays for determining protein concentration showed no interference [17][18][19]. No statistically significant differences between HL and DL were found according to the Mann-Whitney U test, indicating that extraction was not less efficient for desiccated tissue than for fully hydrated tissue.
To complement the detergent-free phenol-based extracted proteins with membrane-bound proteins, we additionally tested extractions with a high concentration of the ionic detergent SDS (2%) and of the non-ionic detergents DDM (1%) and Triton X-100 (1%) (ESM Fig. S1C). Protein extracts containing 2% SDS showed the most pronounced smears in SDS-PAGE gels. Surprisingly, total protein extracts of HL and DL obtained in the presence of the detergents contained 2-3 times less protein and showed fewer protein bands than those obtained without detergent. The comparison of results obtained with the different detergents showed that the highest amount of proteins was detected in the extracts containing the non-ionic detergent DDM and the lowest in the extracts containing Triton X-100 (Table 1).

Proteomic analysis of proteins extracted with buffer/PVPP/ascorbate protocol
Shotgun proteomics of all lyophilised protein extracts obtained by the three compared protocols (ESM Fig. S1) was performed on four biological replicates of HL and four biological replicates of DL. Database search of MS/MS spectra and protein annotation was carried out against the de novo transcriptome R. serbica database generated as described in the "Material and method" section. LC-MS/MS technical replicates were acquired to assess the variation between two separately extracted protein extracts from the same biological sample (homogenised leaf tissue harvested from the same plant) following the same procedures of extraction and digestion, as well as applying the same instrumental parameters for the LC-MS/MS analysis. The percentage of the number of different protein groups exclusively annotated in one of the two technical replicates was 14.25 ± 1.04 (n = 5; data not shown), in agreement with the partially stochastic nature of data-dependent acquisition in shotgun proteomics.
Briefly, 563 and 488 different functional protein groups were identified in buffer/PVPP/ascorbate-based extracts from HL (n = 4) and DL (n = 4), respectively (Table 1). In total, 602 different groups of proteins were identified from both HL and DL by this method. The identified proteins (with their corresponding peptides) are listed in ESM Tables S2 and S3 and summarised in Table 2.
As expected, the majority of identified proteins were soluble and only a small proportion (2.6%) was manually assigned as membrane-bound proteins according to the UniProt database (Fig. 2), mostly belonging to highly abundant components of photosystem (PS) II. In addition, the calculated GRAVY index for four samples originating from different HL biological replicates was −0.24 (Fig. 2a).
Among the three selected biological replicates of HL and DL, 51% and 47% of identified protein groups, respectively, were common (ESM Fig. S2). Glyceraldehyde-3-phosphate dehydrogenase, Rubisco subunits, glutamate synthase, sucrose synthase, and three isoforms of polyphenol oxidases were the proteins with the highest number of PSMs (peptide-spectrum matches) per protein in HL samples (ESM Table S3). Besides these proteins, in DL extracts, proteins with the highest number of PSMs corresponded to two stress-responsive family proteins and subunits of two chloroplastic chaperones.

Proteomic analysis of detergent-free phenol-based extracted proteins from R. serbica leaves
Qualitative comparison of protein extracts revealed that phenol-based methods gave a higher protein yield when compared with the buffer/PVPP/ascorbate extraction of soluble proteins. By detergent-free phenol-based extraction, 836 different protein groups were identified in total from HL (n = 4) and DL (n = 4), about 34% more than obtained by the buffer/PVPP/ascorbate extraction of soluble proteins (Table 1). The protein yield and the number of annotated proteins were similar irrespective of tissue-water content (640 in HL and 711 in DL, no statistical difference according to the Mann-Whitney U test). All identified proteins and their corresponding peptides are listed in ESM Tables S2 and S4. As shown in Table 2, compared with buffer/PVPP/ascorbate extraction, following the detergent-free phenol-based extraction, 1.5 and 2 times higher number of total PSMs were assigned and 1.3 and 1.8 times more peptides were identified in HL and DL samples, respectively.
Good distribution and overlapping in the number of annotated proteins and peptides among different biological replicates for each physiological state were achieved (ESM Fig. S2). For illustration, in 3 biological replicates of DL, 619 different protein groups were annotated, of which ~50% were common for all of them, and ~8% were present in at least two, while up to 10% was exclusively present in one biological replicate. Similar results were obtained for HL samples. The variability obtained in these samples can be attributed to the intrinsic biological variation of the plant samples, which were collected from their natural habitat.
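The kind of replicate-overlap summary described above can be sketched with simple set counting. The sketch reports, for each k, the percentage of all observed protein groups found in exactly k replicates; how the published percentages were binned (for example "at least two") is not fully specified, so this is one plausible reading, and the identifiers are hypothetical.

```python
from collections import Counter


def replicate_overlap(replicates):
    """Percent of protein groups found in exactly k replicates, for each k."""
    # Count in how many replicates each protein group occurs.
    occurrences = Counter(pg for rep in replicates for pg in set(rep))
    total = len(occurrences)
    per_k = Counter(occurrences.values())
    return {k: 100 * per_k.get(k, 0) / total
            for k in range(1, len(replicates) + 1)}


# Hypothetical protein-group identifiers from three biological replicates.
rep1 = {"A", "B", "C"}
rep2 = {"A", "B", "D"}
rep3 = {"A", "E"}
print(replicate_overlap([rep1, rep2, rep3]))
```

The same counting applies to technical replicates: with two replicates, the value for k = 1 is the share of protein groups annotated in only one of them.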

Proteomic analysis of extracts obtained in the presence of different detergents
Regarding different detergents, the highest number of identified protein groups from the same HL biological sample was obtained by DDM (284 protein groups), followed by SDS (207 protein groups) and Triton X-100 (124 protein groups) (Fig. 3 and ESM Tables S2 and S5). These results correlate well with the protein contents measured after SDS-PAGE separation (Table 1). Using Triton X-100 to re-extract the pellets remaining after extraction with DDM did not lead to any significant improvement, as the gel showed very few weak protein bands (data not shown). Hydrophobicity of proteins obtained with SDS and DDM was similar (based on the GRAVY index, data not shown), while that of proteins in extracts obtained in the presence of Triton X-100 was slightly lower (higher average protein polarity).
Proteomic data revealed that ~10% of the total protein groups annotated in the buffer/PVPP/ascorbate-based extracts with and without different detergents were common in all four extract types (Fig. 3a). Most of these proteins were related to highly abundant chloroplastic membrane-bound and soluble proteins and two late embryogenesis abundant proteins (ESM Tables S2 and S5). By using DDM, 27% additional proteins beyond those obtained by buffer/PVPP/ascorbate-based extraction were identified, significantly more than with SDS and Triton X-100 (15% and 11%, respectively).
Several subunits of chloroplastic ATP synthase, PS I and II, proteasome, and V-type ATPase, as well as superoxide dismutase and chloroplast-lipid-associated protein, were among 39 protein groups identified only in SDS extracts (compared with two other detergents; Fig. 3a). The majority of proteins common for all three detergent fractions were annotated as chloroplastic ones, such as several subunits of PS I and PS II and chlorophyll a-b binding proteins, while both Triton X-100 and DDM could extract polyphenol oxidases and some mitochondrial enzymes, such as Mn-superoxide dismutase and aquaporin TIP1-1-like proteins. Protein groups exclusively found in Triton X-100-based extracts (only 48; Fig. 3) were the additional components of photosynthetic electron transport chain (PETC), aquaporins, and mitochondrial ADP/ATP carrier proteins.
Twenty-five percent of DDM-extracted proteins were predicted to contain transmembrane α-helices, and by manual inspection, we could confirm that 67% of them were real transmembrane proteins (85% if annotated proteins with unknown function are excluded; Fig. 2). Accordingly, the GRAVY index calculated for the identified proteins extracted in the presence of DDM showed a less negative value (−0.18), indicating a higher proportion of hydrophobic proteins compared with proteins obtained by the other two methods (Fig. 2a).
On this basis, we selected DDM for the solubilisation of membrane-bound proteins of HL and DL samples over SDS and Triton X-100. Results obtained for DDM-based samples of different biological replicates were similar, regarding protein distribution and the number of annotated proteins (ESM Fig. S2). Almost 50% of annotated protein groups were detected in all biological replicates; 5-10% were found in at least two, while 8-14% were detected exclusively in one replicate.
Among 103 protein groups annotated in extracts obtained by DDM, and not in buffer/PVPP/ascorbate-based extracts (Fig. 3b), 46 protein groups were classified as membraneassociated proteins according to the UniProt database. Among them, 29 were constituents of thylakoids and chloroplast envelope (ATP synthase subunits, chlorophyll a-b binding proteins, constituents of PETC, TIC and TOC proteins, and allene oxide synthase), 4 were attributed to mitochondrial inner and outer membranes, and 13 were associated with plasma membrane and tonoplast (ras-related protein Rab7, transporters, and aquaporins) (ESM Tables S2, S5).
The ratio of confirmed membrane-bound proteins in DDM-based extracts was 4.6 and 6.4 times higher compared with those obtained by phenol-based and buffer/PVPP/ascorbate extraction, respectively (Fig. 2b).
When compared with the detergent-free phenol-based extraction approach, up to 200 additional protein groups were identified in extracts obtained with DDM (ESM Tables S2 and S4). In at least three biological replicates obtained by DDM-assisted extraction, 147 different protein groups, absent from detergent-free phenol-based extracts, were present in HL extracts, and 148 different protein groups in DL extracts. On average, 151 ± 10 different protein groups exclusively present in the fractions obtained by DDM-assisted extraction were derived from HL, while slightly fewer, 125 ± 6, were obtained from DL (not statistically significant, P = 0.171, Mann-Whitney U test, n = 4). Among these proteins, 22.7% were chloroplastic, and 10.3% were annotated as mitochondrial, while 9.6% were attributed to tonoplast and plasma membrane (SE was less than 5%; no statistical differences between HL and DL samples). Moreover, within the protein groups annotated in DDM-based extracts and absent from detergent-free phenol-based extracts of both HL (n = 4) and DL (n = 4), the highest number of PSMs was related to chloroplastic membrane-bound proteins (ATP synthase delta chain, chlorophyll a-b binding proteins, constituents of PET chain, TIC and TOC proteins, and allene oxide synthase), mitochondrial membrane-bound proteins (ADP/ATP carrier protein, ATP synthase subunits and components of inner and outer mitochondrial membrane) and plasma membrane- and tonoplast-associated proteins (V-type proton ATPase subunits, ras-related protein Rab, ABC, calcium and sucrose transporters, aquaporins) (ESM Tables S2, S4).

Discussion
The aim of this study was to provide an efficient protocol for sample preparation suitable for LC-MS/MS shotgun proteomics from the hydrated (HL) and desiccated (DL) R. serbica leaves. We have previously shown that these leaves are rich in pro-oxidants and polyphenols [2,3,7], common contaminants found in protein extracts. We focused on the development of the procedure for the extraction of high-yield and high-quality protein fractions, and different detergents were tested in order to increase the amount of membrane-bound proteins. The efficiency of selected methods was assessed based on the obtained yield, purity, and quality (evaluated by SDS-PAGE separation profile) and by the number of proteins, peptides, and PSMs identified by shotgun proteomics.

De novo transcriptome assembly
R. serbica is a hexaploid species, with 1261 Mbp 1C genome size [23], but the genome is not sequenced. Therefore, the prerequisite for proteome analysis of HL and DL of R. serbica was to obtain a reliable database. De novo transcriptome assembly results obtained for R. serbica are comparable with the de novo transcriptome of another ancient resurrection species, Haberlea rhodopensis, belonging to the same family as R. serbica (Gesneriaceae), which consisted of 91,753 to 51,046 annotated genes, depending on the database used for annotation [24]. The number of annotated genes in leaves of R. serbica was slightly higher than in Boea hygrometrica, a phenotypically similar perennial resurrection plant from the same family as R. serbica (36,365, of which 24,230 with assigned gene descriptions) [25]. A similar difference was observed in comparison with another resurrection plant, Craterostigma plantagineum (29,400, with more than 15,000 UniProt identities) [26].

Proteomic analysis of common buffer/PVPP/ascorbate-based extracts
The commonly used protocols for the extraction of soluble proteins (e.g. buffer/PVPP/ascorbate extraction) were not suitable for R. serbica leaves, although recommended precaution steps were taken. Poor resolution of protein bands and smear in the range of high molecular weight upon SDS-PAGE followed by silver staining could be the result of non-migrating contaminants, as well as continuous protein aggregation [1,28,29]. Moreover, silver staining was shown to interact with most of the protein extraction interfering compounds, such as phenolics, polysaccharides, and nucleic acids [16,30], causing the smears on SDS-PAGE gels.
Since intense smears were obtained in SDS-PAGE gels even in the presence of 2% SDS in protein extracts (ESM Fig. S1C), we tend to exclude protein aggregation as a potential cause of smearing, and we think that protein extracts were probably associated with contaminating polymers (polyphenols, nucleic acids, and polysaccharides). Indeed, HL and DL protein pellets were gelatinous, sticky, and brownish, indicating the possible cross-linking of protein aggregates with oxidised polyphenols or co-precipitation with nucleic acids [15,31]. Moreover, the significant protein loss upon TCA/acetone precipitation might be attributed to the poor solubility of possible protein/polyphenol co-precipitates as described in Carpentier et al. [32]. In order to eliminate possible phenolic contamination, we applied a high amount of water-insoluble PVPP (5-7% w/v) [33] as recommended for polyphenol-rich leaves, such as olive, cotton, and pine [15,34]. Besides, pH 8 should prevent ionised phenolics from forming hydrogen bonds with the peptide bonds of proteins [32], while 10 mM ascorbate (an efficient quinone reductant) and EDTA (a chelating agent of pro-oxidative metal ions) were used to prevent possible oxidation of phenolics to phenoxyl radicals, which could covalently modify the amino acid residues [35,36].
The reduced protein amount obtained when Amicon ultrafiltration tubes, rather than lyophilisation, were used to concentrate the extracted proteins (Table 1) could be due to (i) protein hydrophobicity and a tendency to form non-specific protein-membrane interactions or (ii) protein unfolding, aggregation, and precipitation, which block the membrane. Both would be favoured if phenol-protein interactions occurred.
Although all tested workflows and the corresponding LC-MS/MS analyses were conducted starting with the same amount of initial sample, our data were not obtained with a rigorous quantitative proteomics approach (i.e. comparing the relative amounts of proteins in the different samples was not our primary goal). However, based on the number of PSMs per protein (which has been shown to correlate with protein abundance [37]), our data indicate a high abundance of three polyphenol oxidases in extracts of both leaf tissues of R. serbica, thus confirming previous notions based on the high activity of this enzyme [3]. In addition, the high number of peptides corresponding to chaperone proteins in DL might indicate their upregulation during desiccation, suggesting that they might play a role in maintaining the structure of key proteins under cellular water loss [4,6].
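The use of PSM counts as a rough proxy for abundance can be illustrated with a minimal spectral-counting sketch. It is not the analysis pipeline used in this study; the function names, the input format (pairs of spectrum and protein identifiers), and the identifiers in the example are hypothetical and serve only to show the principle.

```python
from collections import Counter


def psm_counts(psms):
    """Count PSMs per protein.

    `psms` is an iterable of (spectrum_id, protein_id) pairs; this input
    format is an assumption made for the illustration.
    """
    return Counter(protein for _spectrum, protein in psms)


def rank_by_psms(psms, top=3):
    """Return the `top` proteins with the most PSMs, highest count first."""
    return psm_counts(psms).most_common(top)


# Hypothetical identifications (made-up spectrum and protein identifiers).
example = [
    ("scan1", "PPO_1"), ("scan2", "PPO_1"), ("scan3", "PPO_1"),
    ("scan4", "HSP70"), ("scan5", "HSP70"),
    ("scan6", "RBCL"),
]

for protein, n_psms in rank_by_psms(example):
    print(protein, n_psms)
```

Such raw counts are only semi-quantitative; they are not normalised for protein length or total spectra per run, which is consistent with the cautious interpretation given above.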

Proteomic analysis of proteins obtained by detergent-free phenol-based extraction
Given the poor protein quality and the low number of identified peptides and proteins obtained by the common extraction protocols based on buffer/PVPP/ascorbate, we tried to improve protein extraction from HL and DL by employing a phenol/SDS-based protein extraction, recommended for recalcitrant species by several studies [13,15,22,38]. However, we could not achieve buffer/phenol phase separation even when 30% sucrose was added to the buffer solution. This outcome may be explained by the presence of interfering compounds characteristic of R. serbica leaves, which prevented phase separation. For instance, Wu et al. [39] reported no phase separation when large amounts of phenolic compounds were present in the powdered tissue or when a high SDS concentration was applied (e.g. > 5%, w/v).
Our shortened, detergent-free phenol-based extraction provided a better protein yield. Although the silver-stained gels still exhibited a slight smear, particularly in the case of DL, it was drastically reduced compared with the poor electrophoretic separation of proteins obtained after conventional buffer/PVPP/ascorbate extraction, implying no protein degradation and low contamination with phenolics, polysaccharides, and/or nucleic acids. Moreover, phenol causes dissociation of compounds that can interact with proteins and change their hydrophobicity and oxidation level, thereby preventing possible protein aggregation [40]. Therefore, this protocol should separate proteins from water-soluble impurities, such as salts, nucleic acids, and carbohydrates, while the subsequent washing steps with organic solvents should remove polyphenols, pigments, and lipids [15].
However, we could speculate that if polyphenols were the main contaminants during common soluble protein extraction, due to their high initial concentration in R. serbica leaves, their removal with organic solvents (acetone, methanol) before protein extraction was a better choice than the application of 10% PVPP, as it avoids possible protein loss. Due to the highly oxidising matrix of R. serbica leaves (high pro-oxidative activities of enzymes and polyphenols) [2,3,5,7,11], the probability of proteins being oxidised during the extraction is increased. We propose that the presence of organic solvents (initial TCA/acetone precipitation, followed by phenol-based extraction) limits the enzymatic activities and enables fractional removal of other low molecular weight interfering compounds, and therefore provides a better protein extraction yield and quality compared with the buffer containing EDTA, ascorbate, and PVPP that is usually recommended for common plant protein extractions. Similarly, protein extraction from cotton leaves in the presence of 7% PVPP gave a lower protein yield and more blurred gels with poorer band resolution than phenol-based extraction combined with TCA precipitation [34]. A combined approach, with the initial addition of polyvinylpyrrolidone to homogenised hydrated and desiccated C. plantagineum leaves followed by TCA/acetone/phenol-SDS-based extraction of total proteins, resulted in well-resolved protein bands upon SDS-PAGE and SYPRO staining, with low background staining and an intense band corresponding to the large Rubisco subunit [38], similar to what we observed (ESM Fig. S1C).
The high abundance of soluble proteins related to phenolic metabolism found in HL and DL extracts is not surprising, since polyphenol metabolism, phenolics, and related enzymes (particularly seven isoforms of polyphenol oxidases) have already been recognised as key players during hydration/desiccation transitions in R. serbica [3] as well as in B. hygrometrica [41].
Compared with buffer/PVPP/ascorbate extraction, the detergent-free phenol-based extraction led to the identification of a higher number of soluble proteins as well as membrane-associated proteins (Fig. 2). Although the difference in the number of identified protein groups between HL extracts obtained by common buffer/PVPP/ascorbate extraction and phenol-based extraction was 13%, almost 50% more protein groups were identified in DL phenol-based extracts than in common buffer extracts, suggesting that the phenol-based protocol is particularly efficient in the case of the more challenging, desiccated tissue. However, the superiority of our phenol-based protocol becomes much more marked when the numbers of unique peptides and PSMs are compared. In the case of HL, the number of PSMs increases by more than 40%, while in the more striking case of DL, the number of PSMs is almost double (Table 2). The relatively limited number of protein identifications in this study is mainly due to a rather short LC gradient and the lack of sample fractionation, but even under these conditions the better quality of the phenol-based extracts is evident. As a consequence, although a relatively small number of proteins are identified, the sequence coverage (i.e. the percentage of the protein sequence covered by the identified peptides) is significantly impacted.
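The definition of sequence coverage given above can be made concrete with a short sketch. It is an illustration under simplifying assumptions (exact string matching of peptides to the protein sequence, no handling of modifications or isobaric residues), not the calculation performed by the search software used in this study; the function name and the example sequence are hypothetical.

```python
def sequence_coverage(protein, peptides):
    """Percentage of residues in `protein` covered by at least one peptide.

    Peptides are located by exact substring matching, which is a
    simplification; overlapping peptides are counted only once per residue.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            # Keep searching so that repeated occurrences are also marked.
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Made-up 22-residue sequence with two non-overlapping peptides
# (6 + 5 = 11 covered residues).
protein = "MKTAYIAKQRQISFVKSHFSRQ"
print(sequence_coverage(protein, ["TAYIAK", "SHFSR"]))
```

With a cleaner extract, more peptides per protein are identified, so the covered fraction of each sequence rises even when the number of identified proteins changes little.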
We demonstrated that detergent-free phenol-based extraction provided a significant amount of chloroplastic proteins and membrane-associated proteins (e.g. from thylakoids, the mitochondrial membrane, and the tonoplast). This is in accordance with the abovementioned study on Selaginella tamariscina [42], where the proportion of photosynthesis-associated proteins was the highest (22%), followed by proteins involved in carbohydrate and energy metabolism. In addition, in protein extracts from cotton leaves obtained by a combination of a TCA precipitation step and phenol-based extraction, a significant percentage of proteins was attributed to plastids, thylakoids, and mitochondria [34].
The percentage of proteins predicted to contain transmembrane α-helices was slightly higher with detergent-free phenol-based extraction (3.6%) than with buffer/PVPP/ascorbate extraction (2.6%), while the GRAVY index was similar (Fig. 2a). Thus, the larger number of identified membrane-associated proteins produced by phenol-based extraction could be attributed to the higher protein yield, compared with common buffer extraction.
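For readers unfamiliar with the GRAVY index (grand average of hydropathy), it is the mean Kyte-Doolittle hydropathy value over all residues of a sequence, with positive values indicating more hydrophobic proteins. The sketch below illustrates this calculation; it is not necessarily the tool used to generate Fig. 2, and the function name and the handling of non-standard residues (they are skipped) are assumptions made for the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy of a protein sequence.

    Residues outside the 20 standard amino acids (e.g. X, B, Z) are
    skipped; this is a choice made for the illustration.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)


# A leucine/isoleucine/valine-rich stretch scores higher (more hydrophobic)
# than a lysine/glutamate/aspartate-rich stretch.
print(gravy("LLIVALLIVA"), gravy("KKEEDDKKEE"))
```

Comparing the distribution of such values between extraction protocols is one way to check whether a protocol favours hydrophobic, membrane-associated proteins.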
A comparison of the results obtained by our improved approach with those of other proteomics studies conducted on recalcitrant plant tissues is hard to perform, since they are based on different proteomic approaches. For example, in protein extracts obtained by TCA/acetone precipitation from hydrated and desiccated leaves of the evolutionarily similar resurrection species B. hygrometrica, around 400 spots were detected and numbered on 2D gels [41]. In another resurrection species, S. tamariscina, Wang and co-workers [42] employed a similar detergent-free TCA/acetone/phenol-based extraction procedure to compare the proteomes of hydrated and desiccated shoots using 2D electrophoresis (2DE). In this way, they obtained 653 reproducibly matched protein spots and, after digestion, identified 159 gel spots using MALDI TOF/TOF MS. Ingle et al. [43] used Tris buffer containing 1% Triton X-100, followed by phenol-based extraction, ammonium acetate/methanol precipitation, and "in-gel" digestion to detect 430 protein spots on 2D gels from the resurrection plant Xerophyta viscosa. Label-free shotgun proteomics of proteins extracted from desiccation-tolerant rice grains using 2% DDM, TCA/acetone precipitation, and "in-solution" tryptic digestion resulted in the annotation of 899 different proteins against the current rice genome annotation, which includes 56,797 genes [44].
Besides the LC gradient and the extent and type of fractionation performed to reduce sample complexity, the number of proteins that can be identified by shotgun proteomics depends on the database quality (the number of annotated genes) and the technical performance of the MS instrument. Indeed, in the already mentioned study of proteins extracted from cotton leaves by a comparable TCA/phenol-based protocol, an almost 10 times higher number of proteins was identified [34]. The authors could rely on an available full genome database, performed extensive peptide fractionation, and acquired the LC-MS/MS data on a higher-performance instrumental platform. Nevertheless, the aim of our study was to evaluate and establish a protein extraction protocol that could significantly improve the yield and quality of proteins isolated from R. serbica leaves, compared with other conventionally available protocols that worked well in particular cases but failed, with our samples, to provide high-quality protein extracts suitable for LC-MS/MS analysis.

Proteomic analysis of extracts obtained in the presence of different detergents
For the efficient solubilisation of membrane-bound proteins, a protocol based on 1% DDM (in contrast to 2% SDS and 1% Triton X-100) proved to be the most effective method, as previously found with Pelargonium zonale leaves [14], providing the greatest number of identified peptides and protein groups, as well as the highest number of proteins classified as membrane-bound and of their PSMs (Fig. 3b). Our findings are consistent with the previous observation that DDM is more efficient than Triton X-100 at solubilising aggregation-prone membrane proteins and maintaining their functionality in solution, due to its larger polar head and longer non-polar tail [45]. Additionally, Triton X-100 can be contaminated with polyethylene glycol and peroxides, which may oxidise extracted proteins [46].
Based on the number of predicted transmembrane α-helices and the GRAVY index, the hydrophobicity of proteins extracted by the DDM-based protocol was significantly higher than that of proteins obtained by the other two approaches (Fig. 2). Moreover, the number of confirmed membrane-bound protein groups was the highest in the DDM-based fraction compared with the other two extraction protocols (Fig. 2b). Although some membrane-attributed protein groups were detected in fractions obtained by both phenol-based and DDM-assisted extraction, they were more abundant (according to the number of detected PSMs) in the DDM fraction.
Therefore, a combined approach consisting of the detergent-free phenol-based extraction plus the additional DDM-assisted extraction outperformed the other protocols in terms of yield, quality, and number of identified soluble and membrane-bound protein groups from HL and from the more challenging DL of R. serbica (Table 1). Compared with other more commonly used protocols, this approach allowed us to drastically improve the sample quality for shotgun proteomic analysis, enabling the total annotation of 953 different proteins from R. serbica leaves, which are characterised by a high amount of phenolic compounds and sugars, as well as by particularly elevated activity of pro-oxidative enzymes. Another combined approach was also employed with hydrated and desiccated leaves of X. viscosa; the authors re-extracted proteins from crude extracts obtained with Tris buffer and 1% Triton X-100 by phenol-based extraction [43]. In this way, they were able to identify around 430 different protein spots upon 2DE.
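How identifications from the two extraction steps add up to a combined total can be sketched with simple set operations. This is only an illustration of the bookkeeping, assuming each fraction is represented by a collection of protein identifiers; the function name and the identifiers in the example are hypothetical and do not correspond to proteins reported in this study.

```python
def combine_fractions(phenol_ids, ddm_ids):
    """Summarise protein identifications from two extraction fractions.

    Returns the union (total non-redundant identifications), the proteins
    shared by both fractions, and those exclusive to each fraction.
    """
    phenol_ids, ddm_ids = set(phenol_ids), set(ddm_ids)
    return {
        "total": phenol_ids | ddm_ids,
        "shared": phenol_ids & ddm_ids,
        "phenol_only": phenol_ids - ddm_ids,
        "ddm_only": ddm_ids - phenol_ids,
    }


# Hypothetical identifiers for the phenol-based and DDM-assisted fractions.
summary = combine_fractions(
    ["RBCL", "PPO_1", "HSP70", "ATPB"],
    ["ATPB", "PSBA", "LHCB1", "RBCL"],
)
for key, proteins in summary.items():
    print(key, len(proteins))
```

The proteins exclusive to the detergent-assisted fraction indicate what the second step adds beyond the phenol-based extraction alone.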
In conclusion, we reported the first proteomic study of hydrated and desiccated R. serbica leaves, based on an established protein extraction protocol and a completely new, specific annotated database obtained from RNAseq data. Here we described an LC-MS/MS-compatible, efficient, and low-cost two-step protocol (detergent-free phenol-based plus DDM-assisted extraction) followed by FASP tryptic digestion. All evaluated parameters, such as the quality of SDS-PAGE gels (less smear, more bands), the protein yield (Table 1), and the number of identified PSMs, peptides, and protein groups (Table 2), very clearly show that this protocol is far superior to the other tested extraction methods for both hydrated and desiccated R. serbica leaves. Our further work will include quantitative proteomics to investigate the differentially expressed proteins in R. serbica under desiccation conditions. Revealing the molecular mechanisms underlying vegetative desiccation tolerance may have significant implications for stress-related studies of crops grown in arid areas. Additionally, we envisage that this protocol would be suitable for proteomic analyses of desiccated plant material.