A comprehensive methodology based on NTME/GC-MS data and chemometric tools for lemons discrimination according to geographical origin



Introduction
Agro-food products, including citrus fruits, carry distinctive geographic characteristics arising from the terroir and edaphoclimatic conditions. These factors influence metabolomic signatures and, consequently, the quality and value of the product [1,2], which in turn determine consumer acceptance. The use of a geographical indication for a given product therefore allows producers to obtain premium prices and market recognition.
T to the chemical additives in the food industry [1,5,6], encompassing both the need for safety and the consumers demand for natural food components.
The volatile composition of food matrices is one of the most important factors influencing flavour and, consequently, consumer acceptance [4]. In lemon, it has been widely reported that the metabolic pathways and the corresponding volatile composition are influenced by several factors related to genotype (existence of numerous hybrid cultivars), maturation and geography [1,4]. Currently, different methods are used to establish the volatile composition of food matrices and food-related samples. Gas chromatography-mass spectrometry (GC-MS) is the gold-standard instrumental technique for the analysis of volatile organic compounds (VOCs) in a wide range of samples [7-9]. However, the preceding sample preparation step, often disregarded, is crucial to concentrate VOCs and remove interferences, particularly from complex matrices [9,10]. Classical extraction techniques, including solvent extraction, distillation and headspace techniques, are mainly based on the solubility or volatility of the VOCs. Such approaches allow the definition of fingerprints of the volatile composition and provide comprehensive information on the flavour/aroma of the target sample. Currently, solid phase microextraction (SPME) is a well-established technique in the field of VOC analysis [11-13], but its extraction capacity is limited by the small amount of sorbent normally used (60-100 μm). As a non-exhaustive technique, SPME efficiency depends on the mass transfer of a small portion of the analytes toward the extracting medium, while large amounts of analytes remain in the sample solution/matrix [11,14,15]. This drawback is more critical for low-abundance VOCs, whose identification is often not possible. In recent years, needle trap microextraction (NTME) has been introduced as a simple and fast isolation/extraction technique for VOCs in different matrices [16-19]. NTME is mechanically more robust than SPME, since the sorbent particles are protected inside the needle trap device (NTD) (Fig. 1) [18,19]. Moreover, it is an exhaustive extraction technique, meaning that the sample VOCs can be completely extracted, at least until breakthrough (sorbent bed saturation) occurs. In addition, NTME sensitivity can be improved by increasing the sample volume, and its capacity can be expanded by increasing the volume of sorbent packed in the NTD [11,18-20]. Since NTME requires small sample volumes to extract large amounts of analytes, a sampling volume smaller than the breakthrough volume is normally used [11,16-19,21]. The analyte concentration ($C_0$) can be calculated from the equation $n = C_0 V$, where $n$ is the mass extracted by the NTD, $C_0$ is the concentration of the analyte and $V$ is the sample volume [21,22].
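As a minimal sketch of the relation $n = C_0 V$, the snippet below rearranges it to estimate the analyte concentration from the extracted mass. The function name, the units (ng and mL) and the numbers are illustrative assumptions and are not taken from this work; the relation is only meaningful for sampling volumes below the breakthrough volume.

```python
def analyte_concentration(extracted_mass_ng: float, sample_volume_ml: float) -> float:
    """Rearrange n = C0 * V to C0 = n / V (valid below the breakthrough volume).

    Units (ng, mL) are an assumption chosen for illustration.
    """
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    if extracted_mass_ng < 0:
        raise ValueError("extracted mass cannot be negative")
    return extracted_mass_ng / sample_volume_ml


# Hypothetical values: 15 ng trapped from 30 mL of headspace.
c0 = analyte_concentration(extracted_mass_ng=15.0, sample_volume_ml=30.0)
print(f"C0 = {c0} ng/mL")
```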
Desorption of the analytes requires only a quick (a few seconds) single-stage thermal desorption, and the analytes are efficiently transferred to the GC-MS system with minimal or no carryover [15,23,24,26]. Under the same GC injection conditions, thermal desorption is faster in NTME than in SPME, which overcomes this drawback of SPME and provides higher sensitivity and speed [7,24]. As summarized by Barkhordari et al., NTME has been used mainly for the isolation of VOCs from air, water and exhaled breath samples [24].
The main purpose of this work was to explore the potential of an integrated analytical approach, NTME/GC-MS combined with chemometric tools, for the identification of geographical markers of lemons from the same cultivar (Eureka) cultivated in different countries: Portugal (mainland and Madeira Island), Argentina and South Africa. Key NTME experimental parameters that can influence the extraction efficiency, namely extraction temperature, equilibration time, headspace volume and sample amount, were optimized. The acquired data set of non-targeted fingerprints was then processed using multi-dimensional chemometric strategies to identify volatile markers able to discriminate lemons (Eureka variety) according to their geographical origin. As far as we are aware, this is the first work reporting the high potential of NTME as an extraction approach for food research.

Chemicals and Materials
All standards used for VOC confirmation (purity higher than 98.5%) and the n-alkane mixture containing C8-C20 straight-chain alkanes in hexane were obtained from Sigma-Aldrich (St. Louis, MO, USA). Helium of ultra-pure grade (Air Liquide, Portugal) was used as carrier gas in the GC system. Clear glass screw-cap vials for extraction, with PTFE/silica septa, were purchased from Supelco (Bellefonte, PA, USA). The NTDs used in this work, "NeedleEx", were custom manufactured by Shinwa Ltd., Japan (60 mm × 0.41 mm i.d., 0.72 mm o.d., triple-bed configuration: divinylbenzene/Carboxen 1000/Carbopack X, DVB/Car1000/CarX) and purchased from PAS Technology (Magdala, Germany). Prior to use, the NTDs were conditioned in a custom-made heating device (PAS Technology, Magdala, Germany) at 250 °C under permanent helium flow for at least 20 h to eliminate any contamination from the manufacturing process or shipping. Afterwards, both ends of the needles were sealed with Teflon caps and the NTDs were stored. Before being used, the NTDs were conditioned again for 30 min in the heating device.

Lemon samples
Lemon samples of the same variety (Eureka), but cultivated in different regions (Portugal, both mainland and Madeira Island, Argentina and South Africa), were selected randomly from a local market. After selection, the peel (exocarp) of each lemon was collected individually and immediately stored under nitrogen at −80 °C in 250 mg aliquots until analysis.

Optimization of needle trap microextraction (NTME)
To increase the NTME efficiency, key experimental parameters were optimized [19,27], including (i) the extraction temperature (30, 40 and 50 °C), (ii) the equilibration time (10, 30 and 50 min), and (iii) the headspace volume (20, 30 and 40 mL), using a 'Design of Experiments' (DoE) optimisation approach. All extractions were performed in triplicate. The DoE approach is relatively straightforward and greatly facilitates the optimisation assays; here it generated a model with 16 combinations. The resulting data matrix was submitted to statistical treatment.
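To make the size of the experimental space concrete, the sketch below enumerates every combination of the three factors at the three levels given above. This is only the full factorial grid (27 runs); the 16-combination model reported here is a reduced design whose type is not stated in the text, so the snippet does not attempt to reproduce it. Variable names are illustrative.

```python
from itertools import product

# Factor levels as stated in the text.
temperatures_c = (30, 40, 50)       # extraction temperature, °C
equilibration_min = (10, 30, 50)    # equilibration time, min
headspace_ml = (20, 30, 40)         # headspace volume, mL

# Full factorial grid: every temperature x time x volume combination.
full_factorial = list(product(temperatures_c, equilibration_min, headspace_ml))

print(len(full_factorial))  # number of candidate runs before any design reduction
for temperature, time_min, volume in full_factorial[:3]:
    print(temperature, time_min, volume)
```

A reduced design selects a subset of these runs, which is why fewer than 27 combinations can still support a response-surface model.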

NTME procedure
Following the optimization step, 250 mg of sample was placed into a 20 mL extraction tube, and 100 µL of 2-heptanol (30 ppm) was added as internal standard. The extraction tubes were sealed, and the system was equilibrated for 10 min at 50 ± 1 °C. Then, the NTD, pre-attached to a disposable 1 mL syringe, was inserted into the headspace of the extraction tube, and 30 mL of the gas phase was manually loaded through the sorbent (30 withdraw-loading cycles, average speed 10 ± 2 mL min⁻¹). After the extraction, the syringe was discarded and the NTD was sealed at both ends with PTFE caps. Finally, the NTD was injected into the GC-MS system at 250 °C for 60 s to achieve the thermal desorption of the extracted VOCs. Before the next extraction, the sorbent was reactivated by placing the NTDs in a conditioner at 250 °C under a constant flow of helium (purity 5.0, Air Liquide, Portugal) at a constant pressure of 1 bar for 30 min. Unless otherwise indicated, all procedures were repeated with at least three different samples (N = 3) and analysed in triplicate (n = 3).

Gas chromatography-quadrupole mass spectrometry analysis (GC-qMS)
The analysis was carried out on an Agilent 6890N gas chromatograph (Agilent Technologies, Palo Alto, CA, USA) coupled to an Agilent 5975 quadrupole inert mass selective detector. The extracted compounds were separated on a BP-20 fused silica capillary column (60 m × 0.25 mm i.d. × 0.25 µm film thickness). Splitless injection was employed, using helium as carrier gas at a constant flow rate of 1.0 mL min⁻¹. The oven temperature programme was: 45 °C (held for 2 min), followed by a temperature ramp from 45 °C (held for 1 min) to 90 °C at 2 °C min⁻¹ (held for 3 min), then to 160 °C at 3 °C min⁻¹ (held for 6 min), and finally from 160 °C to 220 °C at 6 °C min⁻¹ (held for 15 min). The injection and ion source temperatures were 250 °C and 230 °C, respectively. The mass spectra of the compounds were acquired in electron-impact (EI) mode at 70 eV. The electron multiplier was set by the autotune procedure. Data acquisition was performed in scanning mode (mass range m/z = 35-300 amu; six scans per second). Chromatograms and spectra were recorded and processed using the Enhanced ChemStation software for GC-MS (Agilent Technologies, Palo Alto, CA, USA). VOC identification was based on: (i) comparison of the GC retention times (RT) of the chromatographic peaks with those of authentic standards, when available, run under the same conditions; (ii) comparison of the mass spectra with the data system library (NIST, 2005 software, Mass Spectral Search Program v.2.0d; NIST 2005, Washington, DC), a single VOC peak being considered identified when its experimental spectrum matched the library spectrum with a score above 80%; and (iii) determination of Kováts retention index (KI) values using a C8-C20 n-alkane series, which were compared, when available, with values reported in the literature for similar chromatographic columns.
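The retention-index step can be sketched as follows. The text does not give the formula used; the snippet assumes the linear (van den Dool and Kratz) form commonly applied to temperature-programmed runs, in which the index is interpolated between the two n-alkanes that bracket the analyte. The function name and the retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt: float, alkane_rts: dict[int, float]) -> float:
    """Interpolate a retention index between bracketing n-alkanes.

    alkane_rts maps carbon number -> retention time (min) and is assumed
    to increase with carbon number. Linear formula (an assumption here):
    RI = 100 * [n + (N - n) * (rt - t_n) / (t_N - t_n)]
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt < times[-1]:
        raise ValueError("retention time outside the n-alkane window")
    i = bisect_right(times, rt) - 1          # index of the alkane eluting just before
    n, next_n = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    return 100.0 * (n + (next_n - n) * (rt - t_n) / (t_next - t_n))


# Hypothetical alkane retention times (min), not taken from this work.
alkanes = {8: 10.0, 9: 14.0, 10: 19.0}
print(linear_retention_index(12.0, alkanes))
```

An analyte eluting exactly halfway between C8 and C9 receives an index halfway between 800 and 900, which is the value that would then be compared with literature data for a similar column.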
Chromatographic peak areas, expressed in arbitrary units (a.u.), were determined from the full-scan chromatogram and used to estimate the relative content of each volatile metabolite. For semi-quantification purposes, each sample was injected in triplicate, and the chromatographic peak areas (as kcounts) were determined from a reconstructed full-scan chromatogram using specific quantification ions for each compound: the base ion (m/z with 100% intensity), the molecular ion (M⁺) and another characteristic ion of each molecule.

Multivariate statistical analysis
The multivariate data analysis (MVDA) was performed using the MetaboAnalyst 4.0 web-based tool [28]. The raw GC-qMS data were first pre-processed by normalization to the sample median, cubic root transformation and autoscaling. Analysis of variance (ANOVA, p < 0.05), together with PCA, was used for variable reduction and to convert a set of highly correlated variables into a set of independent variables by linear transformation. Hierarchical cluster analysis (HCA) was carried out using the 40 most significant VOCs identified in lemon samples by ANOVA (generated using the Ward algorithm and Pearson distance analysis). The ratio of VOCs was first calculated by the average algorithm and Pearson distance analysis, and the metabolic alterations were then represented as log10(ratio), depicting distinct clustering patterns among the studied groups. Principal component analysis (PCA) was used as an unsupervised pattern-recognition procedure that converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables (principal components) by orthogonal transformation. This proof-of-concept work clearly shows the potential of NTME coupled to GC-MS in the definition of geographical markers for lemon varieties. Future work involving a larger number of samples will certainly facilitate data analysis and improve the robustness of the statistical models and of the results obtained.
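The three pre-processing steps named above can be sketched in plain Python. This is an illustration of the general operations, not MetaboAnalyst's own code: it assumes that median normalization divides each sample by its own median, that the transformation is an element-wise cube root, and that autoscaling mean-centres each variable and divides by its sample standard deviation. The matrix values are hypothetical.

```python
from statistics import mean, median, stdev


def preprocess(peak_areas: list[list[float]]) -> list[list[float]]:
    """Rows are samples, columns are VOCs; areas are assumed non-negative."""
    # 1. Normalize each sample (row) by its median peak area.
    normalized = []
    for row in peak_areas:
        row_median = median(row)
        if row_median <= 0:
            raise ValueError("sample median must be positive")
        normalized.append([value / row_median for value in row])

    # 2. Cube-root transformation to compress large peak areas.
    transformed = [[value ** (1.0 / 3.0) for value in row] for row in normalized]

    # 3. Autoscaling: centre each VOC (column) and scale to unit variance.
    scaled_columns = []
    for column in zip(*transformed):
        mu, sd = mean(column), stdev(column)
        if sd == 0:
            scaled_columns.append([0.0] * len(column))  # constant variable
        else:
            scaled_columns.append([(value - mu) / sd for value in column])

    # Transpose back to samples x VOCs.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical 3 samples x 3 VOCs matrix of peak areas.
example = [[1.0, 8.0, 27.0], [2.0, 4.0, 6.0], [5.0, 1.0, 3.0]]
for row in preprocess(example):
    print([round(value, 3) for value in row])
```

After autoscaling, every VOC has zero mean and unit standard deviation across samples, so abundant monoterpenes do not dominate the subsequent PCA and HCA simply because of their scale.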

Optimization of NTME Procedure
A properly optimized method ensures good accuracy, precision and sensitivity. Accordingly, the most relevant parameters affecting NTME (sample amount, extraction temperature, equilibration time and headspace volume) were optimized using a DoE optimization approach (Table 1SM, Supplementary Material). From the different experiments performed (Fig. 2), the DoE predicts the influence of the parameters considered and the outcome of different combinations between the minimum and maximum values tested for each parameter.

Sample amount
The sample amount should be selected based on the established distribution constant (K_D) of the volatile composition. Depending on the sorbent and the nature of the volatiles, K_D may vary substantially, leading to different extraction yields. The dependence of the extraction efficiency on the sample amount gives useful information for NTME method development. Our initial assay revealed that 1 g of lemon peel was excessive, contributing to poor chromatographic resolution (data not shown). The sample amount was then adjusted to 250 mg of lemon peel, with the respective chromatograms showing good peak resolution and sensitivity. In addition, the effect of the particle size of the lemon peel was evaluated by comparing a single 250 mg piece with 250 mg of smaller slices of approximately 1 mm² (obtained using the ULTRA-TURRAX T25 disperser). Not surprisingly, the smaller slices provided a higher surface area, allowing more efficient extractions, and so this condition was used in all further assays.

Extraction temperature
The temperature employed during extraction is one of the most important parameters affecting the efficiency of NTME. This is mainly because high extraction temperatures improve the kinetics of mass transfer from the bulk sample to the headspace, favouring the efficiency of the extraction procedure [10,23,25]. However, excessively high temperatures may cause thermal degradation of the most labile VOCs and decrease the trapping capability of the sorbent, owing to the exothermic nature of the sorption process [25,29]. In this work, the best results were obtained using 50 °C as the extraction temperature (Fig. 2). A direct correlation was observed between the extraction temperature and the total instrument signal. Furthermore, the results shown in Fig. 2 indicate that the extraction temperature is the main factor in the extraction process.

Equilibration time
NTME, in contrast with SPME, has an exhaustive character, and its extraction ability can be extended until sorbent saturation (breakthrough point) [16-19]. Nevertheless, the selection of a proper equilibration time has a direct influence on the amount of target analytes available for extraction and, consequently, on the sensitivity and precision of the NTME method [16-19]. The results (Fig. 2) show that the equilibration time was the second most important parameter in the optimization model. A linear correlation between equilibration time and instrument signal was obtained, meaning that the longer the equilibration time, the higher the instrument signal (within the time range studied).

Headspace volume
Since NTME is an exhaustive technique, the response will be proportional to the sample headspace volume that is loaded through the sorbent ($n = C_0 V$) [21,22]. This agrees with the results obtained in the DoE (Fig. 2). According to Trefz et al. [27], the extractive capacity of the sorbent is greater when larger sample headspace volumes are used, at least for a set of model metabolites such as isoprene, pentane, toluene and pentanal [18]. In conventional NTDs the breakthrough is about 0.5 mg for a packing length of 1 cm [30], while for the DVB/CAR/CAR fibre the breakthrough volume was not reached up to 60 mL of headspace volume [18]. Based on the results obtained, and to minimize the extraction time, avoid signal saturation on the GC-MS and increase the reusability of the sorbents, 30 mL was selected as the appropriate headspace sample volume. This selection agrees with the results obtained in the DoE (Fig. 2).

VOCs profile of lemon peels from different geographical regions
As can be seen in Fig. 3, the volatile profiles of the peels of the Eureka lemon samples from different geographical regions are quite similar (Table 1). The retention indices obtained experimentally were in good agreement with those reported in the literature, with a linear correlation of $r^2 = 0.995$ (Supplementary Fig. 1). The contribution of each VOC to the total volatile fraction, expressed as relative peak area, was calculated as follows:

$$\text{Relative peak area} = \frac{\text{Peak area of analyte}}{\text{Peak area of internal standard}}$$
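The ratio above can be sketched as a small helper. The function name and the peak areas are hypothetical; the internal standard is 2-heptanol, as stated in the NTME procedure.

```python
def relative_peak_area(analyte_area: float, internal_standard_area: float) -> float:
    """Peak area of the analyte divided by the peak area of the internal standard."""
    if internal_standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return analyte_area / internal_standard_area


# Hypothetical peak areas (arbitrary units) for one VOC and for 2-heptanol.
print(relative_peak_area(analyte_area=250.0, internal_standard_area=100.0))
```

Dividing by the internal standard compensates for run-to-run differences in extraction and injection, so the values are comparable across samples even though they are not absolute concentrations.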
Despite the apparent similarity of the VOC profiles obtained for the four lemon samples analysed, there are some distinctive features related to the relative abundance of each VOC and functional group. As shown in Table 1, the lemons from South Africa seem to be more aromatic than the other groups, as the sum of the relative peak areas is significantly higher (1.5, 2 and 3 times higher than for lemons from Madeira, Portugal mainland and Argentina, respectively). Monoterpenes are by far the most abundant functional class in all samples analysed, representing over 95% of the volatile fraction. This is mainly due to D-limonene, followed by α- and β-pinene, β-myrcene and γ-terpinene, which are the most abundant VOCs identified in all lemon samples. In contrast, higher alcohols are much more abundant in the Portuguese mainland lemons than in the other groups analysed. A similar trend was observed for aldehydes, which are more abundant in the lemons cultivated in Portugal (mainland and Madeira) than in the samples from Argentina and South Africa. To obtain a closer snapshot of the importance of each functional class in each type of lemon analysed, the VOC relative areas were normalized and compared. As can be seen in Fig. 4, although monoterpenes represent almost all of the lemon VOC composition (ranging from 95 to almost 98% of the volatile composition), there are significant variations in the abundance of minor classes, particularly aldehydes, higher alcohols, ketones and sesquiterpene hydrocarbons. These features are also evident at the level of individual VOCs. Hexanal and ethanol, for instance, are far more abundant in the lemons from Portugal (mainland and Madeira) than in those from Argentina and South Africa. Overall, esters and ketones are among the least abundant classes of VOCs (Fig. 4), but very interesting differences were also observed among the four lemon groups analysed. 2-Heptanone, for instance, is four times more abundant in the lemon samples from Portugal (mainland and Madeira) than in those from Argentina, and it was not detected in the lemons from South Africa. Similarly, esters are very poorly represented in the lemon samples analysed: only three such compounds were identified, and the most abundant, ethyl hexanoate, was detected only in the lemons cultivated in Portugal mainland. In addition, 2-methyl-2-heptenal was identified only in lemon peels from South Africa, while 1-butanol, 2-heptanone, α-thujene and toluene were not identified in these samples. Overall, these compounds can be considered potential geographic markers, and this possibility was assessed using the advanced statistical analysis discussed in the next section.
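The normalization of relative areas into class percentages described above can be sketched as follows. The class names follow the text, but the numerical values are hypothetical placeholders and are not the data of Table 1.

```python
def class_percentages(relative_area_by_class: dict[str, float]) -> dict[str, float]:
    """Express the summed relative peak area of each chemical class as % of the total."""
    total = sum(relative_area_by_class.values())
    if total <= 0:
        raise ValueError("total relative area must be positive")
    return {name: 100.0 * area / total for name, area in relative_area_by_class.items()}


# Hypothetical summed relative areas per class for one lemon origin.
sample = {
    "monoterpenes": 480.0,
    "aldehydes": 8.0,
    "higher alcohols": 6.0,
    "ketones": 1.0,
    "sesquiterpene hydrocarbons": 5.0,
}
for name, share in class_percentages(sample).items():
    print(f"{name}: {share:.1f} %")
```

Because each origin is normalized to its own total, differences in the minor classes remain visible even when one origin is more aromatic overall.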
The volatile composition of lemons of the Eureka variety has been reported in the literature using a range of analytical techniques. Lota et al. [31], for instance, identified 22 VOCs, while Zhong et al. [32] identified 34 VOCs and Zhang et al. [33] identified 54 VOCs in Eureka and 67 VOCs in Limonia lemons. By comparison, the 75 VOCs reported in this work are indicative of the higher throughput that NTME allows. In agreement with our results, these reports indicate a monoterpene-rich volatile fingerprint for lemons and a similar volatile composition with respect to the major VOCs, namely D-limonene, β-pinene, γ-terpinene, β-myrcene and α-pinene.
The first principal component of the PCA (PC1) explains 52.1% of the variance and separates the lemons produced in Portugal (mainland and Madeira Island) from the remaining groups, with ethanol, ethyl octanoate, trans-β-terpinol, α-panansinene, perilla aldehyde and nerol being the VOCs responsible for this separation. The second principal component (PC2) accounts for 24.4% of the total variance of the model and separates the lemons produced in South Africa from those produced in Argentina (α-pinene, α-thujene, toluene, 1-butanol, D-limonene and 2-methyl-2-heptenal).
Following the PCA, HCA was also performed using the 40 most significant VOCs identified in lemon samples by ANOVA, as described in Section 2.5. This strategy allows a better identification of the inherent clustering patterns of each geographic origin, complementing the statistical analysis carried out previously. The result of this treatment can be visualized in the heat map (Fig. 6) and in a dendrogram (Fig. 1SM, Supplementary Material).
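The dissimilarity measure behind the clustering can be illustrated with a short sketch. It assumes that "Pearson distance" means one minus the Pearson correlation coefficient between two profiles, which is a common convention but is not spelled out in the text; the Ward linkage step itself is not implemented here, and the profiles are hypothetical.

```python
from math import sqrt


def pearson_distance(x: list[float], y: list[float]) -> float:
    """Return 1 - r, where r is the Pearson correlation between x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - covariance / (spread_x * spread_y)


# Hypothetical scaled abundances of three VOCs in three samples.
sample_a = [1.0, 2.0, 3.0]
sample_b = [2.0, 4.0, 6.0]   # same pattern as sample_a, different magnitude
sample_c = [3.0, 2.0, 1.0]   # opposite pattern
print(round(pearson_distance(sample_a, sample_b), 6))
print(round(pearson_distance(sample_a, sample_c), 6))
```

Under this definition, samples with the same relative pattern of VOCs are close regardless of overall intensity, so clusters reflect the shape of the volatile profile and not how aromatic a sample is.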

Conclusions
In this work we reported the identification of lemons according to their geographical origin using a simple analytical layout and a fairly economical experimental set-up. The optimized analytical approach, NTME/GC-MS, provided a deep and comprehensive insight into the volatile composition of lemon peels (exocarp) of the Eureka variety cultivated in different geographical origins: Portugal (mainland and Madeira Island), Argentina and South Africa. A total of 75 VOCs were identified in lemon peels of the Eureka variety, a number slightly higher than those reported in previously published works for the same variety. Monoterpenes are the dominant family of VOCs, contributing about 95% of the volatomic composition of lemon peels of the Eureka variety. D-limonene, β-pinene and γ-terpinene are the major volatiles identified in lemon peels from the targeted geographical origins.
The VOCs identified in this work were able to differentiate lemons according to their geographic region. Butanal, α-pinene, α-thujene, 2-heptanone, D-limonene, 2-methyl-2-heptenal, nonanal, decanal, 1-octanol, limonene oxide, β-caryophyllene and 2,6-dimethyl-2,6-octadiene were the VOCs that contributed most to this discrimination. Future work involving more samples and harvesting seasons will certainly further improve the robustness of the geographical biomarkers identified here. In addition, this analytical approach provides a feasible strategy for the authentication of citrus fruits based on the volatile fingerprint of their exocarp. NTME/GC-MS shows great potential for application to other fruits and food matrices, for their analytical characterization and authentication based on their volatomic composition, enabling effective strategies to support food integrity. The results also suggest a wide range of applications for lemon peels of the Eureka variety based on the identified VOCs, namely health benefits, potential food additives such as flavour and fragrance agents, and the cosmetic industry.
The robustness, high-throughput capacity, ease of use and sample storage ability for in-field sampling will make NTME a very popular extraction approach over a wide range of applications beyond the food analysis reported here. Environmental and clinical analysis will certainly be among those successful applications.

Fig. 6. Hierarchical cluster analysis (HCA). The heat map with the 40 most significant volatiles identified in lemon samples by ANOVA was generated using the Ward algorithm and Pearson distance analysis.

Fig. 2. DoE results as Estimated Response Surface Mesh and Standardized Pareto Chart for Total Area, from different key parameters that influence NTME: Temperature (°C), Equilibration time (min) and Headspace volume (mL).

Fig. 4. Schematic representation of the contribution of the different classes of VOCs identified in the lemon peels. For simplification, total peak areas of each sample type were normalized and represented as percentage (%).

Fig. 5. (A) Scatter plot of the two principal components (PC1 and PC2) using the VOCs obtained by NTME/GC-MS, and (B) variables with highest contribution for the PCA differentiation.

Table 1
Volatile organic compounds (VOCs) identified in lemon peel from Eureka variety, from different geographic origins.

Table 1 (continued)
VOCs indicated in bold were confirmed against commercial standards.
a RT: retention time expressed in min.
b RI calc: experimental Kováts index.
c RI lit: Kováts index reported in the literature.
d Relative Peak Area (×10⁻²): (VOC peak area/Internal Standard peak area).
e Peak number ordered by VOC retention time.
f nd: not detected.
g IS: Internal Standard (2-heptanol).

Table 2
VOCs responsible for the discrimination of Eureka lemons according to the geographic region of production.
a VOC identified only in this geographical region.