ARTICLE   Open Access    

Comprehensive study of non-volatile and volatile metabolites in five water lily species and varieties (Nymphaea spp.) using widely targeted metabolomics

More Information
  • Water lilies, members of the Nymphaeaceae family, are globally cultivated aquatic plants known for their diverse colors and significant ornamental, economic, beverage, medicinal, and ecological value. In this study, we employed ultra-high performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) to analyze the non-volatile components and simultaneous distillation extraction (SDE) in combination with two-dimensional gas chromatography-time-of-flight mass spectrometry (GC×GC-TOFMS) to analyze the volatile components in five water lily species and varieties. Results showed that 118 differential metabolites including flavonoids, phenolic acids, amino acids, and lipids were screened among 533 non-volatiles. Cyanidin-type anthocyanins, including cyanidin-3-O-galactoside, cyanidin-3-O-glucoside, and cyanidin-3-rutinoside, are present in high amounts in the purple-colored Nymphaea 'Detective Erika'. Conversely, delphinidin is found in significant quantities in Nymphaea 'Blue Bird', which exhibits a blue color. KEGG analysis showed that flavonoid biosynthesis and anthocyanin biosynthesis exhibited significant enrichment. Additionally, a total of 166 volatiles were screened in water lilies, mainly including aromatic compounds, alkynes, ketones, alcohols and esters. Among them, the concentrations of key compounds including 1,11-dodecadiene, benzyl alcohol, benzaldehyde, α-farnesene and dimethyl sulfide, varied significantly among different samples. This study reveals significant variations in chemical compounds among different Nymphaea species and varieties. These findings contribute to enhancing our comprehension of the metabolic variability and composition of water lilies, which might shed light on unlocking new possibilities for their potential application in the beverage industry.
  • Genetic variations and epigenetics play important roles in regulating flavonoid accumulation in plants. Genetic variation refers to DNA sequence differences present in the genomes of different individuals, impacting coding sequences, promoter regions, and regulatory elements. These differences can directly or indirectly influence gene expression levels and regulatory patterns, thereby impacting the plant's phenotype. The most common genetic variation is single nucleotide polymorphism (SNP), resulting from changes in a single nucleotide in the DNA sequence. They can be classified as deletion, insertion, or substitution mutations. Research has indicated that SNPs in coding regions can alter amino acid sequences, impacting protein function. Numerous studies on the diversity of gene sequences related to flavonoid biosynthesis have been conducted in plants such as Arabidopsis thaliana[1], maize[2], mango[3], barley[4], and tomato[5]. The results revealed a significant correlation between SNP content in genes associated with flavonoid biosynthesis and the accumulation of flavonoid compounds. Furthermore, genetic variations lead to phenotypic differences among cultivars of important crops like jujube[6], camellia[7], hemp[8], bamboo[9], rice[10], and maize[11]. Plant phenotypic changes are influenced not only by genetic variations but also by epigenetic factors[12]. Epigenetics refers to heritable changes in gene expression patterns without altering the DNA sequence, encompassing DNA methylation, histone modifications, and chromatin remodeling[13]. DNA methylation is a conserved epigenetic marker across eukaryotes, and it is involved in many important biological processes, such as human aging[14], cancer development[15], genome integrity[16], gene imprinting, and plant stress responses. Studies have shown that DNA methylation impacts the accumulation of flavonoid compounds in plants. For instance, in Lithocarpus polystachyus Rehd[17], Triticum aestivum L.[18] and tomatoes[19], DNA methylation occurring in promoter and coding regions regulates gene expression associated with flavonoid biosynthesis, impacting flavonoid compound accumulation. DNA methylation plays a pivotal role in crucial biological processes in seeds[20], embryos[21], and flowers[22]. Consequently, both genetic and epigenetic variations significantly contribute to flavonoid compound accumulation in plants.

    However, sequence variations and epigenetic modifications are not independent phenomena, they interact intricately and culminate in phenotypic transformation. Methylated cytosines are more prone to induce a higher mutation rate at the sequence level, leading to changes in methylation levels and consequently influencing gene expression. Previous studies have shown that in Arabidopsis[23], lotus[24], and sugarcane[25], methylated cytosines are more susceptible to mutation compared to unmethylated cytosines, resulting in CG=>TG type variations and causing changes in methylation that impact plant gene expression. However, SNPs can also lead to changes in the sequence type or methylation levels of DNA. When cytosine is mutated to other types of bases (A, T, and G), the methylation pattern or level can also change, resulting in a difference in methylation levels at these sites[26]. Similarly, when other bases (A, T, and G) mutate to cytosine, the methylation type and methylation level were acquired, resulting in differences between the two sites. In addition, non-cytosine mutations also result in changes in methylation patterns and methylation levels, for example, CHG=>CHH and CHH=>CG. Studies in Arabidopsis[27] and cassava[26] have revealed that SNPs can change both methylation type and level. This indicated a close relationship between genetic variations and epigenetic during plant growth and development. However, current research on the co-regulatory mechanisms of genetic and epigenetic variations in plant phenotypes remains relatively limited.

    Honeysuckle (Lonicera japonica Thunb.), belongs to the genus Lonicera of the family Caprifoliaceae is named for its flower development process (Fig. 1a), in which the color changes from silver-white to golden-yellow. Honeysuckle is rich in medicinally active compounds, including luteoloside, chlorogenic acid, flavonoids, and sesquiterpenes. Pharmacopeia records indicate that the flowering stage, known as the white bud (WB) stage, has the highest content of medicinally active compounds[28]. Among these, flavonoids play a crucial role in antiviral activity[29] and have been used to treat various viruses. Sijihua and Beihua No. 1 (Beihua) are the two most common honeysuckle cultivars in China and show significant phenotypic differences. Sijihua has a short WB stage, lasting only 2−3 d, whereas Beihua can have a WB stage lasting up to 20 d, allowing for a longer storage period of high-content medicinal active compounds. Additionally, the flowers exhibit conspicuous color variations at distinct developmental stages. Research has demonstrated that Beihua consistently displays higher levels of total flavonoids and other essential medicinal compounds compared to Sijihua[30]. A recent study revealed that DNA methylation in Sijihua affects the expression of carotenoid-related genes, leading to variations in flower color during different stages of development[31]. Honeysuckle, a crucial antiviral medicinal plant source, exhibits significant phenotypic differences among cultivars, including variations in metabolite accumulation and other traits. Growth and development, particularly flowering, are regulated by DNA methylation. This makes the honeysuckle an ideal material for studying the co-regulatory mechanisms of genetic variations and epigenetics in flavonoid compound synthesis.

    The release of the honeysuckle genome has laid the foundation for studying the genomic structure of different cultivars[32]. Whole-genome resequencing technology to sequence honeysuckle will provide a comprehensive and high-resolution view of genetic variations among different cultivars. We aimed to elucidate the molecular mechanisms underlying phenotypic differences and accumulation of active compounds in different honeysuckle cultivars from the perspectives of genetic and epigenetic variations. In this study, multi-omics approaches were employed, including whole-genome resequencing, whole-genome bisulfite sequencing, and transcriptome sequencing. This comprehensive investigation focused on Beihua No. 1 (Beihua) and Sijihua during two key stages (WB and GF), characterized by differences in phenotypes and metabolite accumulation. Using whole-genome resequencing, a large number of genetic variations were identified a pseudo-reference genome was constructed for Beihua, single-base resolution DNA methylation profiles were compared between Beihua and Sijihua at different flowering stages, the characteristics of DNA methylation changes examined in different contexts. The relationship between SNPs and differentially methylated cytosines (DMCs) in honeysuckle genomes during two developmental stages were explored. Furthermore, the impact of SNPs on key enzyme-encoding genes involved in the flavonoid biosynthesis pathwaywere analyzed, thereby revealing crucial factors contributing to the differences in flavonoid biosynthesis and accumulation between the two cultivars. This study reveals the molecular mechanisms by which genetic variations and epigenetic co-regulation to phenotypic differences and variations in metabolite accumulation in the two honeysuckle cultivars. This study provides new insights into honeysuckle breeding and cultivation techniques and serves as a reference for exploring regulatory mechanisms of growth and development through the application of genetic variations and epigenetics in other species.

    In this study, flower tissues in different stages of honeysuckle cultivar 'Beihua No.1' (abbreviated as 'Beihua' throughout the entire article) were collected, including juvenile bud (JB), green bud (GB), white bud (WB), silver flower (SF), and golden flower (GF), from Zhongke Honeysuckle Planting Cooperative in Pingyi County, Shandong, China (35°31'02'' N, 117°36'55'' E). Whole-genome bisulfite sequencing (WGBS) and transcriptome sequencing were performed using flower tissues from two stages of the honeysuckle cultivar Beihua, namely, the WB and GF stages. Young leaf tissue from Beihua was used for the resequencing.

    Three biological WGBS libraries were sequenced using the Hiseq X10 sequencer (Illumina, San Diego, CA, USA) as paired-end 150-bp reads. Three biological libraries were generated using the VHTS Universal V6 RNA-seq Library Prep Kit following the manufacturer's instructions (Illumina). All libraries were sequenced using the Novaseq 6000 platform (Illumina, San Diego, CA, USA), and 150 bp paired-end reads were generated.

    The reference genome for the cultivar 'Sijihua' of honeysuckle was obtained from the NCBI database under accession number PRJNA794868. Additionally, transcriptome data for the bud stage (WB: SRX14408207 − SRX14408209) and the flowering stage (GF: SRX144082013 − SRX144082015) were downloaded from the same database. Whole-genome bisulfite sequencing (WGBS) data for the WB stage (SRR18684863 − SRR18684865) and GF stage (SRR18684866 − SRR18684868) were also retrieved from NCBI. It is worth noting that the collection location and environmental conditions for the cultivation of the Sijihua cultivar of honeysuckle are identical to those of Beihua (Honeysuckle Planting Cooperative in Pingyi County, Shandong, China (35°31'02'' N, 117°36'55'' E)[31].

    For resequencing library construction, DNA isolation was fragmented using Bioruptor (ThermoFisher Scientific, Waltham, MA, USA), resulting in libraries with approximately 300 bp fragment sizes. Quality control of the libraries was performed using the Qubit dsDNA HS Assay Kit (ThermoFisher Scientific) and Agilent 2100 Bioanalyzer System (Agilent Technologies, Santa Clara, CA, USA). High-quality DNA libraries were then sequenced on the BGISEQ-500 platform, generating reads of 150 bp in length.

    WGBS libraries were constructed and prepared using the TruSeq DNA L T kit (Illumina) as described previously. Three biological WGBS libraries were sequenced on Illumina Hiseq X10 sequencers as paired-end 150-bp reads.

    Total RNA was extracted from the honeysuckle cultivar 'Beihua' petals at the WB and GF stages using the cetyltrimethylammonium bromide (CTAB) method. Three biological libraries were prepared using the VHTS Universal V6 RNA-seq Library Prep Kit following the manufacturer's instructions from Illumina. Subsequently, all libraries were sequenced using the Novaseq 6000 platform from Illumina, located in San Diego, CA, USA, producing 150 bp paired-end reads.

    We used FASQC (www.bioinformatics.babraham.ac.uk/projects/fastqc) for quality control of the generated FASTQ files, after which low-quality reads were filtered out, including those with more than 10% N content and more than 50% of low-quality bases (< 10). The net data obtained after filtering were compared to the Sijihua reference genome using BWA software (v0.7.12). Duplicates generated by PCR amplification were labeled and removed using the Picard package (https://sourceforge.net/projects/picard/). The 'HaplotypeCaller' function of GATK4 (version 4.1.4.1, https://hub.docker.com/r/broadinstitute/gatk/) was employed to generate GVCF files. Raw variant calling sets underwent hard filtering with parameters 'QD < 2.0 || MQ < 40.0 || FS > 60.0 || SOR > 3.0 || MQRankSum < −12.5 || ReadPosRankSum < −8.0'. The VCF file generated by hard filtering included the chromosome number of SNPs, SNP positions, reference bases, mutant bases, etc. Next, bedtools maskfasta was used to align the SNP information of the VCF file in the reference genome of Sijihua and finally constructed the pseudo-genome of Beihua. The downstream analysis script add_ka_ks.pl from MCScanX was used for calculating Ka/Ks (non-synonymous/synonymous) values for each gene pair.

    SNP annotation was performed using SNPEff software (https://sourceforge.net/projects/snpeff/) based on the constructed Beihua pseudogenome, SNPs were classified into intergenic regions, upstream, downstream, Splice-site-donor, Splice-site-acceptor, UTR-5-prime, UTR-3-prime, and SNPs encoding exons were further divided into synonymous and nonsynonymous mutations.

    Raw bisulfite sequencing reads of Beihua was trimmed by Trimmomatic v0.39 to obtain clean data and aligned to the replaced Beihua pseudogenome by BSMAP v2.90. A 4% mismatch rate was allowed per 150 bp of read length and only uniquely mapped reads were retained for subsequent analysis. Next, reads aligned to unmethylated lambda DNA were used to estimate the conversion rate. Whether cytosine is methylated or not was identified mainly based on the conversion rate and binomial distribution. The methylation level of each cytosine was calculated using the methration.py script of the BSMAP software. The formula for calculating the level of cytosine methylation is #C/(#C+#T), with #C representing the methylated cytosine and #T being the unmethylated cytosine. Similarly, whole genome bisulfite sequencing data of Sijihua were calculated by the same method. DMCs between the two cultivars of Beihua and Sijihua were identified using methylkit default parameters. In addition, the absolute methylation levels of white and golden flowers of both cultivars, under the same contexts of CG, CHG, and CHH, were to vary more than 40%, 20%, and 10%, respectively. In this paper, both the loss of CG, CHG, and CHH (CG-loss, CHG-loss, and CHH-loss) and the gain of CG, CHG, and CHH (CG-gain, CHG-gain, and CHH-gain) were defined as DMCs, and this type of DMCs is relative to the two extremes of variation from the presence to the absence and from the absence to the presence of the two cultivars.

    For each biological replicate of the Beihua and Sijihua transcriptome data, we used Trimmomatic v0.39 for quality control and aligned to the Beihua pseudogenome and Sijihua reference genome using HISAT2 default parameters, and we kept only uniquely mapped reads to estimate expression values (Supplemental Table S1). The expression of each gene we quantified using stringTie v2.1.7. Next, DESeq2 v1.32.0 was used to identify differentially expressed genes in white buds and golden flowers of both Beihua and Sijihua cultivars and differentially expressed genes were required to satisfy Log2|FC| > 2 and FDR < 0.05.

    GOATOOLS was used for gene ontology (GO) enrichment of overlapping genes of differentially expressed genes (DEGs) in white buds and golden flowers of both Beihua and Sijihua cultivars. Only GO terms with P-values less than 0.05 were retained for analysis. Heatmap, and GO enrichment plot of DEGs were made using R software version 3.5.

    To reveal the genomic variations between Sijihua and Beihua, Beihua was resequenced (Fig. 1a), generating 3.7 Gb of high-quality clean data. Reads were mapped to Sijihua's reference genome using BWA software. Over 91% of read pairs successfully aligned at a coverage depth of 30 × (Supplemental Table S2), facilitating subsequent genetic variation analysis. 9,909,981 SNPs were identified in the Sijihua genome. The majority of these SNPs were situated in intergenic regions (72.42%), followed by intronic regions (8.4%), upstream (8.35%), and downstream (7.34%) regions (Fig. 1b, Supplemental Table S3). SNPs were classified into four groups based on their impact on genes (https://pcingola.github.io/SnpEff/se_inputoutput/). 12,688 high-impact SNPs (0.13%) resulted in stop-gained, stop-loss, splice-site acceptor, splice-site donor, and start-lost mutations. 102,188 low-impact SNPs (1.03%) resulted in synonymous coding, start gained, synonymous stop, or non-synonymous start mutations. Additionally, 158,107 moderate-impact SNPs (1.6%) caused non-synonymous mutations in the coding region. The remaining 9,636,998 SNPs (97.25%) had a modifying impact on intergenic, intronic, upstream, downstream, UTR-3-prime, and UTR-5-prime regions. A total of 38,808 genes were affected by four SNP categories. These included 7,555 genes associated with high-impact SNPs, 24,967 genes associated with moderate-impact SNPs, 23,256 genes associated with low-impact SNPs, and 38,787 genes associated with modifier-type SNPs (Fig. 1c, Supplemental Table S4). High-impact SNPs may be involved in the expression of important genes with key functions[33]. Therefore, a Gene Ontology (GO) functional enrichment analysis of genes associated with high-impact SNPs were performed. Regardless of the white bud (WB) or golden flower (GF) stage, genes related to high-impact SNPs were significantly enriched in response to stimuli, response to viruses, gene silencing regulation, and RNA interference regulation (Fig. 1d).

    Previous studies have reported the construction of pseudo-reference genomes with incomplete genome assembly by using genetic variation substitution in phylogenetic closed cultivars, such as to obtain pseudo-reference genomes for maize Mo17 and rice Matsumae, researchers substituted SNP information for maize B73[34], and rice RZ35[35]. Since the genomes of Sijihua and Beihua are very similar (SNP ratio of 1.12%), a pseudo-reference genome was created for Beihua by replacing the SNP sites with the reference genome of Sijihua.

    Figure 1.  Identification and classification of SNPs. (a) Different development stages of Beihua and Sijihua. Five stages are juvenile bud (JB), green bud (GB), white bud (WB), silver flower (SF), and golden flower (GF), respectively. (b) Classification of SNPs based on genome location. (c) Number of genes affected by SNPs (SNPs were divided into four impact types: high, moderate, low, and modifier). (d) GO enrichment analysis of DEGs related to high-impact SNPs.

    To investigate DNA methylation distinctions between the two honeysuckle cultivars, DNA methylation was sequenced at two stages of floral development (WB and GF) in Beihua using the WGBS technique (Supplemental Table S5). Compared with DNA methylation in Sijihua[31], no significant differences in global CG DNA methylation levels were found between the two cultivars at different developmental stages (T-test, WB: p-value = 0.17, GF: p-value = 0.62). we observed Beihua had significantly higher global DNA methylation levels than Sijihua in the CHG and CHH sequence contexts (T-test, p-value < 0.05). This phenomenon was consistently observed in both WB and GF stages (Fig. 2a). The methylation of WB and GF in Beihua and Sijihua were also analyzed. It was found that in Beihua, only the CHH methylation level showed significant differences, while in Sijihua, methylation levels in all three contexts exhibited significant differences (Supplemental Fig. S1). Furthermore, the average DNA methylation levels in 500 Kb windows across the genome were calculated and observed varying degrees of bias in all three sequence contexts. Specifically, the CG methylation distribution deviated to the right (Supplemental Fig. S2a & e), indicating that CG methylation levels were lower in Beihua than in Sijihua. The differential distribution of CHG and CHH methylation showed a significant left deviation (Supplemental Fig. S2b, f, c, & h), indicating that the methylation levels of CHG and CHH were significantly higher in Beihua than in Sijihua. These results were consistent with global DNA methylation levels (Fig. 2a).

    The number of methylated cytosines were further compared between Sijihua and Beihua, revealing that Sijihua had 138,186,587 methylated cytosines, whereas Beihua had 113,582,475. During the WB stage, Sijihua had a higher number of methylated cytosines in the CG (27,577,348), CHG (21,837,724), and CHH (88,771,515) contexts compared to Beihua, which had CG (22,491,550), CHG (17,537,206), and CHH (17,537,206) methylated cytosines. A similar trend was observed during the golden flower stage, indicating that Beihua's DNA structure is less susceptible to methylation than Sijihua (Fig. 2b). When calculating the methylation levels of each cytosine, both CG and CHG methylation in the two honeysuckle cultivars showed frequent occurrences of two extremes (0 and 1), representing the completely methylated and unmethylated states, respectively. Both showed a bimodal distribution pattern. In Sijihua, 72.9% and 36.6% of the CG and CHG methylation sites, respectively, occurred in a completely methylated state, whereas in Beihua, 69% and 33.7% of the CG and CHG methylation sites, respectively, occurred in a completely methylated state. The proportion of completely methylated CG and CHG in Sijihua was higher than that in Beihua, this trend remained consistent at the GF stage (Fig. 2c & d). Furthermore, we found that the higher percentage of completely methylated CG in Sijihua, approximately 80.7% (18,525,847 / 22,956,440) of CG methylation, could be maintained from WB to GF. In contrast, in Beihua, only 76.6% (13,962,522 / 18,227,835) of CG methylation was maintained from WB to GF. This suggests that during the process of maintaining these methylation types, CG methylation in Sijihua may be more stable and faithfully replicated during DNA replication than that in Beihua[36].

    To investigate methylation distribution in two honeysuckle genomes, gene density, transposable elements (TE) density, SNP density, and average methylation level were calculated in the three methylation contexts of Sijihua and Beihua (Fig. 2e). The TE-enriched regions of both Sijihua and Beihua showed higher DNA methylation levels and lower gene density at both WB and GF stages (Fig. 2e, Supplemental Fig. S2d). This phenomenon in many flowering plants, such as Arabidopsis[37], soybean[38], maize[39], rapeseed[40], and tomato[41]. SNPs showed an uneven distribution across the genome, with a high content occurring in regions with elevated CG and CHG methylation levels, while being scarce in CHH-methylated regions (Fig. 2e). To explore the relationship between DNA methylation and SNP density, the correlation between SNP density and methylation in three sequence contexts were analyzed. Significantly, we observed that SNP density was positively correlated with CG and CHG methylation levels, and negatively correlated with CHH methylation (Supplemental Fig. S3). This suggests that highly methylated cytosines may influence the stability of DNA sequences, especially the CG and CHG methylation.

    To examine the methylation patterns in the coding regions of the two honeysuckles, the methylation levels within the gene coding regions and their 2 Kb flanking regions for both Sijihua and Beihua were calculated. CG methylation levels in the coding regions were higher in Sijihua than in Beihua, whereas CHG and CHH methylation levels were lower in Sijihua than in Beihua. This pattern was consistent across both WB and GF (Supplemental Figs S4ac, S5a-c). Based on previous reports indicating the abundance of transposable element insertions in the coding regions of honeysuckles[31], the genes were classified into two types: TE-related genes and TE-unrelated genes. Subsequently, the DNA methylation patterns of these two gene types were calculated. The methylation pattern of the coding regions of TE-related genes was the same as the genome-wide methylation pattern in both honeysuckle cultivars during the two stages, whereas TE-unrelated genes showed significantly reduced methylation levels. This suggests that TE insertions play a crucial role in maintaining the DNA methylation levels in both Sijihua and Beihua (Supplemental Figs. S4df, S5df). In addition, the DNA methylation of the TE region and the 2 Kb flanking region were calculated. The CG methylation levels of TE in Beihua was higher than that in Sijihua at two stages. However, the CHG methylation level in the TE regions did not differ significantly between these two cultivars at the WB stage, but showed marked differences at the GF stage, indicating dynamic methylation levels of TEs during honeysuckle flower development. Surprisingly, CHH methylation levels of TE regions in Beihua were significantly higher than in Sijihua at both stages (Supplemental Figs S4gi, S5gi). Overall, the methylation levels of Sijihua and Beihua differed significantly and varied with developmental stages.

    Figure 2.  DNA methylation landscape of Beihua and Sijihua. (a) Comparison of whole-genome DNA methylation levels in Beihua and Sijihua during the WB and the GF stages. Three biological replicates were calculated as dots over the bar graph (T-test, * p-value < 0.05, NS represents non-significant). (b) The relative proportions of methyl-cytosine in the contexts of CG, CHG, and CHH in Beihua and Sijihua during the WB and GF stages. (c)−(d) Distribution of methyl-cytosine methylation levels in all three sequence contexts in Beihua and Sijihua during the WB(C) and GF(D) stages were compared. (e) Circos plot showing gene density, TE density, SNP density, and DNA methylation levels for all three contexts of the WB stage.

    By comparing the DNA methylation patterns between Sijihua and Beihua, significant differences in methylation levels of all three sequence contexts were observed. To gain a deeper understanding of the specific sites displaying differential methylation, the study focused on comparing methylation levels between Sijihua and Beihua. During the WB stage, the analysis identified 1,602,266 (19%) DMCs, including 641,008 CG DMCs, 589,493 CHG DMCs, and 371,765 CHH DMCs. Notably, hyper-methylation (Sijihua > Beihua) was more prevalent than hypomethylation in three methylation contexts. This trend persisted in the GF stage (Fig. 3a). However, it is important to consider that genetic variation (SNPs) can also contribute to the variation in DNA methylation sites by influencing the sequence context or altering methylation levels[26]. To further understand how genetic variations (SNPs) impact DNA methylation sites in honeysuckles, 6,679,481 (81%) SNP-associated DMCs were identified at both WB and GF stages. It was observed that 2,539,019 DMCs resulted from cytosine loss (36%), 1,808,116 DMCs were attributed to cytosine gain (26%), and 2,332,346 DMCs were ascribed to other types of mutations (38%). For example, an illustrative snapshot from the Genome Browser displayed various SNP-associated DMCs, including CHH-gain, CG-gain, CHG-loss, CG-loss, and CHH-loss (Fig. 3b). Genetic variations can lead to changes in both methylation type and level. Moreover, it was found that the number of SNP-associated DMCs was much greater than the number of non-SNP-associated DMCs (DMCs caused by different methylation levels in the same context) when comparing differentially methylated cytosines within the two honeysuckles (Fig. 3a). These findings underscore the potential substantial contribution of genetic variations in driving disparities in DNA methylation levels between the two honeysuckles.

    Regardless of the number of cytosine methylation type mutations or non-cytosine mutations, it was found that the CHH type was the most affected in terms of cytosine loss, gain, and other types of mutations, followed by the CHG and CG types (Fig. 3c & d). Among other types of mutations, the significance of guanine (G) loss and gain was particularly evident, playing a pivotal role in the dynamic transition between distinct methylation types. Transitions from CG to CHH and from CHG to CHH are the most prevalent, with a predominant proportion of guanine (G) mutations mutating to adenine (A) or thymine (T). Following this trend, mutations from CHH to CG and CHH to CHG emerged as the next most frequent occurrences, primarily involving the conversion of adenine (A) to guanine (G) (Fig. 3e). It is worth noting that these mutations not only led to changes in methylation type but also influenced the methylation levels themselves.

    Specially, when CG sites underwent mutations, transforming into CHG or CHH sites, they exhibited a higher likelihood of becoming hypomethylated. Conversely, CHH sites transitioning to CHG or CG sites were more prone to hypermethylation (Supplemental Fig. S6a). Extending the examination to the diversity of the 500 bp flanking sequences surrounding the SNP-associated DMC sites, it was observed that downstream nucleotide diversity was higher than that upstream of the DMC sites. Additionally, nucleotide diversity at the DMC sites markedly exceeded that within the adjacent flanking sequences. Intriguingly, this phenomenon was not observed at non-DMC sites. The higher nucleotide diversity at DMC sites underscores their susceptibility to frequent natural selection, implying the potential for intricate interplay between SNPs and DNA methylation dynamics (Supplemental Fig. S6b).

    Figure 3.  Identification of differentially methylated cytosines in the honeysuckle genome. (a) The proportion of two DMC types in the honeysuckle genome, including non-SNP-associated DMCs and SNP-associated DMCs. (b) Representative screenshots of DMCs. Methylation site losses, gains, and mC context changes were indicated on the underside of the track (red: Sijihua; blue: Beihua; Light blue shading represents non-SNP-associated DMCs; Light green represents SNP-associated DMCs). (c) The proportion of methylation types of the SNP-associated DMCs. (d) Sankey diagram showing the number of changes in DNA methylation types in three contexts between Sijihua and Beihua. (e) Sunburst chart showing the number of methylation type changes caused by nucleotide mutation.

    To further investigate the effect of cytosine loss on DNA methylation levels, the base substitution rates were calculated for homologous genes between Beihua and Sijihua. The rate of C=>T base substitutions was higher than that of other mutation types (Fig. 4a), consistent with studies in Arabidopsis[23] and lotus[42]. Within the honeysuckle genome, methylated cytosines exhibited a greater propensity to mutate into C=>T compared to unmethylated cytosines (1,155,283 for mC=>T, 636,736 for non-mC=>T, binomial test, p-value = 1.727e-10), resulting in a higher frequency of CG to TG mutations[24]. However, CG=>TG mutations were not evenly distributed throughout the genome. The majority of these mutations occurred in intergenic regions (272,395), followed by introns (19,132) and exons (44,634) (Fig. 4b).

    To validate whether the mutation from CG methylation type to TG in honeysuckle impacted gene methylation levels, we assessed CG methylation levels within the gene body and the 2 Kb flanking regions of both Sijihua and Beihua genes. The analysis revealed significant differences in the gene body regions of both Sijihua and Beihua at both stages (Mann-Whitney test, p-value < 0.001) (Fig. 4c, Supplemental Fig. S7a). When genes unaffected by CG=>TG mutations were considered, as expected, no significant differences (Mann-Whitney test, p-value > 0.05) were observed in the gene-body regions at two stages of Sijihua and Beihua (Fig. 4d, Supplemental Fig. S7b). Subsequently, the study was focused specifically on genes with CG=>TG mutations and calculated the CG methylation levels in their gene body and 2 Kb flanking regions[43]. Intriguingly, significant differences in the gene body methylation levels between Sijihua and Beihua were observed (Mann-Whitney test, p-value < 0.001) (Fig. 4e, Supplemental Fig. S7c). This indicates that genetic variations leading to CG=>TG mutations directly affect CG methylation levels in the gene body. Furthermore, it was observed that genes affected by CG=>TG mutations not only showed reduced CG methylation levels in the gene body but also displayed a significant reduction in gene expression levels in Beihua and Sijihua (Fig. 4fh, Supplemental Fig. S7df). This observation is consistent with previous findings in other plants such as Arabidopsis[44], and Brassica napus[45], where higher CG methylation within the gene body promotes gene transcription.

    In summary, CG=>TG mutations not only altered the methylation levels, but also affected gene expression levels. To understand the functions of these genes affected by CG=>TG mutations, GO enrichment analysis was performed and identified their predominant enrichment in transcriptional regulation, stress response, flavonoid biosynthesis pathways, responses to viruses, biosynthesis of flavonoids, flavonols, and flavonoids (Supplemental Fig. S8). Collectively, these findings underscore the critical role played by the interactions between epigenetic and genetic variations in regulating gene functions in honeysuckle.

    Figure 4.  Relationship of CG=>TG type mutations with CG methylation and gene expression. (a) The substitution rate of nucleotides in homologous genes between Beihua and Sijihua. (b) The distribution of CG=>TG mutation sites on the honeysuckle genome. Comparison of CG methylation levels in the body region and the flanking 2 Kb region of (c) all genes, (d) CG=>TG unrelated genes, and (e) CG=>TG-related genes during the WB stage in Sijihua and Beihua. The expression levels of (f) all genes, (g) CG=>TG unrelated genes, and (h) CG=>TG-related genes during the WB stage in Sijihua and Beihua were compared (Mann-Whitney test, *** p-value < 0.001, NS represents non-significant).

    To investigate deeper into the impact of SNP-associated DMCs, genes were identified that had overlaps with SNP-associated DMCs within their gene body and 2 Kb flanking regions, designating them as genes related to SNP-associated DMCs. It was found that SNP-associated DMCs were linked to 76% (29,899/39,320) of the genes within the honeysuckle genome. Among these genes, 5,324 and 7,067 genes showed differential expression (DEGs, log2|FC| > 2, FDR < 0.05) in the WB and GF stages, respectively, with an overlap of 3,325 genes displaying differential expression in both stages (Fig. 5a). Notably, the majority of genes (3,158/3,325) that exhibited differential expression at both stages showed consistent expression trends. Specifically, in both the WB and GF stages, the expression levels of these genes in Beihua were either higher than those in Sijihua (C3) or lower than those in Sijihua (C4) (Supplemental Fig. S9a). Interestingly, GO enrichment analysis of these genes with consistent expression trends revealed significant enrichment in processes related to flavonoid biosynthesis, flavonol biosynthesis, cellular glucan metabolism, stress response pathways, and floral whorl development (Supplemental Fig. S9b). Overall, genes affected by both SNPs and DMCs play important biological functions in various critical pathways, indicating their substantial regulatory significance.

    The above analysis has highlighted those genes affected by both SNPs and DMCs were significantly enriched in the flavonoid biosynthesis pathway. Flavonoids represent a vital group of secondary metabolites in plants renowned for their natural medicinal properties. They play crucial roles in plant growth, development, and defense against biotic and abiotic stresses. The synthesis of flavonoids originates from phenylalanine through the phenylpropanoid pathway, involving several key enzymes (Fig. 5d)[46]. In the investigation of the effects of SNPs and DMCs on flavonoid biosynthesis in honeysuckle, 35 homologous genes encoding key enzymes in the flavonoid biosynthesis pathway in the honeysuckle genome were identified (Fig. 5d, Supplemental Table S6). Despite the presence of a high number of SNPs in both the gene body and promoter regions of these 35 homologous genes, their Ka/Ks ratios are considerably lower than 1. This suggests that these genes have undergone purifying selection during evolution and possess relatively conserved structures and sequences. Consequently, genetic variation may not be the primary driving factor behind the differences in flavonoid biosynthesis between the two honeysuckles (Fig. 5b & e).

    Furthermore, by comparing the methylation levels of these 35 genes in Sijihua and Beihua at different stages, significant differences were observed in both the gene body and promoter regions. Additionally, some genes showed significant changes in expression. For example, LjPAL (EVM0007831), LjCHS (EVM0026111), LjCHI (EVM0013981), and LjFLS (EVM0027194) showed significantly higher expression levels in Beihua than in Sijihua at both the WB and GF stages (Fig. 5c, Supplemental Fig. S10). Interestingly, it was found that both the gene body and promoter regions of these 35 genes contained SNP-associated DMCs, as well as non-SNP-associated DMCs. These DMCs appear to play crucial roles in the regulation of gene expression (Supplemental Fig. S11). For example, a genome browser showed that LjCHI (EVM0013981) and LjFLS (EVM0027194) have a large number of SNP-associated DMCs and non-SNP-associated DMCs in the promoter region. Additionally, the expression levels of these genes were significantly higher in Beihua than in Sijihua (Fig. 5f & g, Supplemental Fig. S12). Therefore, the promoter regions of these key enzyme genes contain numerous nucleotide mutations that lead to changes in promoter region methylation, directly affecting gene expression. This may be a crucial factor contributing to the higher flavonoid levels in Beihua compared to Sijihua.

    Figure 5.  The relationship between genes related to SNP-associated DMCs and the flavonoid pathway. (a) Venn diagram showing the number of overlapping differentially expressed genes (DEGs) associated with SNP-associated DMCs between WB and GF. (b) Boxplot showing Ka/Ks ratios of 35 key genes and genomes in the flavonoid pathway of honeysuckle (Mann-Whitney test. *** p-value < 0.001). (c) Boxplot showing the methylation levels of CG, CHG, and CHH in the promoter regions of 35 key genes in the flavonoid pathway during the WB in Beihua and Sijihua (Mann-Whitney test. *** p-value < 0.001). (d) The pathway of flavonoid synthesis in plants. (e) Heatmaps showing the expression levels of 35 key genes of the flavonoid pathway, the number of SNPs in the upstream region of the genes, the number of SNP-associated with DMCs, and the number of non-SNP-associated with DMCs for Beihua and Sijihua at the WB and GF (* represents DEGs in the WB and GF stages). (f) Genome browser showing DNA methylation level and expression level of LjCHI (EVM0013981). (g) Genome browser showing representative screenshots of two types of DMCs in the promoter region (Chr01: 82302800 − 82302800) of LjCHI (EVM0013981).

    In this study, whole-genome resequencing, whole-genome bisulfite sequencing, and transcriptome sequencing techniques were employed to sequence different honeysuckle cultivars. From both the genetic and epigenetic perspectives, the factors underlying the phenotypic differences observed between the two honeysuckles were investigated. Previous studies have revealed that high-impact SNPs in the genomes of various plants, such as rice[47], tomato[48], cucumber[49], cotton[50], and peach[51], have significant effects on important agronomic traits. These high-impact SNPs have been observed to play crucial roles in plant resistance. For example, in tomato, resistance genes carrying high-impact SNPs exhibit reduced susceptibility to Oidium neolycopersici infection, resulting in enhanced resistance. In cotton, the GhDRP gene with high-impact SNPs showed decreased expression upon infection with V. dahliae, resulting in leaf yellowing, shedding, and severe cell damage. This indicates a robust lignification response to counteract pathogen attacks. The present study identified a substantial number of genes in the honeysuckle genome affected by high-impact SNPs, significantly enriched in resistance pathways, such as response to fungi, defense response to virus, and response to mechanical stimulus. These findings provide a solid foundation for improving the desirable traits of honeysuckle cultivars.

    Subsequently, the genome-wide single-base resolution DNA methylomes between Beihua and Sijihua were compared. Notably, significant differences in DNA methylation were observed between these two honeysuckles in the three contexts (CG, CHG, and CHH). Some studies have suggested that changes in DNA methylation are influenced by genetic variations, subsequently regulating alterations in gene expression. In the present study, a correlation between SNP density and different methylation types was observed. Specifically, SNP density showed a significant positive correlation with CG and CHG methylation, but a negative correlation with CHH methylation. Previous reports have also indicated a positive correlation between differential methylation regions and the density of SNP variations, suggesting that genetic variations in proximity may influence some of the methylation differences observed between different cultivars[52]. Furthermore, a higher density of SNPs were also observed at sites with differentially methylated cytosines, a phenomenon previously observed in the cassava genome[26].

    In the genome of honeysuckle, there is a pronounced preference for nucleotide mutations (C=>T), primarily driven by cytosine methylation. While DNA methylation plays a critical role in regulating gene expression, it can also have detrimental effects because methylated cytosines are more susceptible to deamination, leading to the conversion of cytosine to thymine. Similar phenomena have been observed in the Arabidopsis[23], canola[24], and sugarcane[25]. The interplay between genetic variation and DNA methylation has a regulatory effect on gene expression[53]. The results revealed that genes related to Beihua affected by CG=>TG had significantly lower methylation levels in the gene body compared to Sijihua, and a significant difference in the gene expression between Beihua and Sijihua.

    Numerous studies have indicated that phenylalanine ammonia-lyase (PAL)[54] and chalcone isomerase (CHI)[55] served as the first and second rate-limiting enzymes in flavonoid synthesis, respectively. Overexpression of these enzymes leads to a significant accumulation of flavonoids. Additionally, the overexpression of flavanol synthase (FLS) results in a substantial accumulation of flavonoids[56]. Therefore, the promoter regions of these key enzyme genes contain numerous nucleotide mutations that lead to changes in promoter region methylation, directly affecting gene expression. This may be a crucial factor contributing to the higher flavonoid levels in Beihua compared to Sijihua. These findings align with prior research on the apple genome[57]. In summary, the interplay between SNP and DMCs in the biosynthesis of flavonoids in honeysuckle regulates the expression changes of genes related to key enzymes, thereby influencing the accumulation of flavonoids.

    A large number of DMCs in the genomes of two honeysuckles were identified, with most of them being SNP-associated DMCs. These DMCs were associated with approximately 80% of the genes and were significantly enriched in several vital pathways, such as the flavonoid biosynthesis pathway, flavonol biosynthesis pathway, anticancer pathway, and cellular glucan metabolic process. In particular, SNP-associated DMCs were found in genes related to key enzymes in the flavonoid synthesis pathway. Some of these genes directly influenced the accumulation of flavonoids and exhibited significant differential expression, including LjPAL (EVM0007831), LjCHI (EVM0013981), and LjFLS (EVM0027194). Previous studies have shown that the expression levels of LjPAL, LjCHI, and LjFLS are positively correlated with flavonoid accumulation in other plants, such as Arabidopsis[55], tomato[58], Dendrobium officinale[59], and Brassica napus[60]. In summary, the interplay between genetic variation and epigenetic regulation exerts a substantial influence on gene expression, leading to noticeable phenotypic differences between the two honeysuckles. These findings provide novel insights into breeding and cultivation techniques for honeysuckles.

    In this study, 9,909,981 SNPs were identified in Sijihua, including 12,688 high-impact SNPs that were significantly enriched in the stress resistance pathways. By comparing the DNA methylation patterns of Beihua and Sijihua, significant differences in DNA methylation levels were observed between the two honeysuckles. Thus, a substantial number of DMCs were identified between these two honeysuckles, with 81% of which were SNP-associated DMCs. Furthermore, methylated cytosines are more prone to mutation, resulting in CG=>TG, and altered DNA methylation, further regulating gene expression. SNP-associated DMCs are linked to 76% of protein-coding genes, with 3,325 genes exhibiting differential expression in both stages (WB and GF), and significantly enriched in the biosynthetic pathway of flavonoids. In the flavonoid pathway, important genes affecting flavonoid accumulation such as LjPAL, LjCHI, and LjFLS showed significant differences. The flanking 2 Kb region and body region of these genes produced a large number of SNP-associated DMCs, which is likely to be the reason that SNP-associated DMCs are regulating the overexpression of genes in Beihua, resulting in an increase in the accumulation of flavonoids in Beihua.

    The authors confirm contribution to the paper as follows: study conception and design: Wang H; re-sequencing performing and DNA methylation analysis: Yu X; transcriptome performing and biological pathway analysis: Yu X, Yu H; tissues collection and analysis: Lu Y, Zhang C; draft manuscript preparation: Yu X, Wang H. All authors reviewed the results and approved the final version of the manuscript.

    The raw reads generated in this study have been deposited in the CNCB sequence read archive (SRA) with the accession number PRJCA018541. The reference genome, WGBS, and RNA-seq public data of Sijihua were downloaded from the NCBI database under the project numbers: PRJNA794868, PRJNA824715, and PRJNA813701, respectively.

    This work was supported by the National Natural Science Foundation of China (32160142), Guangxi Natural Science Foundation (2023GXNSFDA026034), Sugarcane Research Foundation of Guangxi University (2022GZA002), and State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources (SKLCUSA-b202302) to Haifeng Wang.

  • The authors declare that they have no conflict of interest.

  • Supplemental Table S1 Crucial changed differential non-volatile metabolites in five water lily samples.
    Supplemental Fig. S1 The spectrogram of non-volatile components in five water lily samples obtained from UPLC-MS/MS.
    Supplemental Fig. S2 Heatmap of crucially changed differential non-volatile metabolites in five water lily samples.
    Supplemental Fig. S3 Differential non-volatile metabolite KEGG enrichment maps.
    Supplemental Fig. S4 The spectrogram of volatile components in five water lily samples obtained from GC×GC-TOFMS.
    Supplemental Fig. S5 The heat map of volatile compounds with VIP scores of ≥1.
  • [1]

    Plants of the World Online. 2023. Nymphaea L. Kew Science. http://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:330032-2

    [2]

    Zhou Q, Shi M, Zhang H, Zhu Z. 2022. Comparative study of the petal structure and fragrance components of the Nymphaea hybrid, a precious water lily. Molecules 27:408

    doi: 10.3390/molecules27020408

    CrossRef   Google Scholar

    [3]

    Yuan R, 2014. Studies on the composition of volatiles in different cultivars of water lily and functional component and antioxidant activity evaluation in its tea. Thesis. Nanjing Agriculture University, China. pp. 1−2.

    [4]

    Zhao Y, Fan YY, Yu WG, Wang J, Lu W, et al. 2019. Ultrasound-enhanced subcritical fluid extraction of essential oil from Nymphaea alba var and its antioxidant activity. Journal of AOAC International 102:1448−54

    doi: 10.1093/jaoac/102.5.1448

    CrossRef   Google Scholar

    [5]

    Abelti AL, Teka TA, Bultosa G. 2023. Review on edible water lilies and lotus: Future food, nutrition and their health benefits. Applied Food Research 3:100264

    doi: 10.1016/j.afres.2023.100264

    CrossRef   Google Scholar

    [6]

    Wang Z, Cheng Y, Zeng M, Wang Z, Qin F, et al. 2021. Lotus ( Nelumbo nucifera Gaertn) leaf: A narrative review of its Phytoconstituents, health benefits and food industry applications. Trends in Food Science & Technology 112:631−50

    doi: 10.1016/j.jpgs.2021.04.033

    CrossRef   Google Scholar

    [7]

    Yin DD, Yuan RY, Wu Q, Li SS, Shao S, et al. 2015. Assessment of flavonoids and volatile compounds in tea infusions of water lily flowers and their antioxidant activities. Food Chemistry 187:20−28

    doi: 10.1016/j.foodchem.2015.04.032

    CrossRef   Google Scholar

    [8]

    Zhao Y, Zhou W, Chen Y, Li Z, Song X, et al. 2021. Metabolite analysis in Nymphaea 'Blue Bird' petals reveal the roles of flavonoids in color formation, stress amelioration, and bee orientation. Plant Science 312:111025

    doi: 10.1016/j.plantsci.2021.111025

    CrossRef   Google Scholar

    [9]

    Shen S, Zhan C, Yang C, Fernie AR, Luo J. 2023. Metabolomics-centered mining of plant metabolic diversity and function: Past decade and future perspectives. Molecular Plant 16:43−63

    doi: 10.1016/j.molp.2022.09.007

    CrossRef   Google Scholar

    [10]

    Farag MA, Elmetwally F, Elghanam R, Kamal N, Hellal K, et al. 2023. Metabolomics in tea products; a compile of applications for enhancing agricultural traits and quality control analysis of Camellia sinensis. Food Chemistry 404:134628

    doi: 10.1016/j.foodchem.2022.134628

    CrossRef   Google Scholar

    [11]

    Divekar PA, Narayana S, Divekar BA, Kumar R, Gadratagi BG, et al. 2022. Plant secondary metabolites as defense tools against herbivores for sustainable crop protection. International Journal of Molecular Sciences 23:2690

    doi: 10.3390/ijms23052690

    CrossRef   Google Scholar

    [12]

    Anand A, Komati A, Katragunta K, Shaik H, Nagendla NK, et al. 2021. Phytometabolomic analysis of boiled rhizome of Nymphaea nouchali (Burm. f.) using UPLC-Q-TOF-MSE, LC-QqQ-MS & GC–MS and evaluation of antihyperglycemic and antioxidant activities. Food Chemistry 342:128313

    doi: 10.1016/j.foodchem.2020.128313

    CrossRef   Google Scholar

    [13]

    Wu Q, Wu J, Li SS, Zhang HJ, Feng CY, et al. 2016. Transcriptome sequencing and metabolite analysis for revealing the blue flower formation in waterlily. BMC Genomics 17:897

    doi: 10.1186/s12864-016-3226-9

    CrossRef   Google Scholar

    [14]

    Wang H, Hua J, Yu Q, Li J, Wang J, et al. 2021. Widely targeted metabolomic analysis reveals dynamic changes in non-volatile and volatile metabolites during green tea processing. Food Chemistry 363:130131

    doi: 10.1016/j.foodchem.2021.130131

    CrossRef   Google Scholar

    [15]

    Zhou J, Fang T, Li W, Jiang Z, Zhou T, Zhang L, et al. 2022. Widely targeted metabolomics using UPLC-QTRAP-MS/MS reveals chemical changes during the processing of black tea from the cultivar Camellia sinensis (L.) O. Kuntze cv. Huangjinya. Food Research International 162:112169

    doi: 10.1016/j.foodres.2022.112169

    CrossRef   Google Scholar

    [16]

    Maia ACD, de Lima CT, Navarro DM do AF, Chartier M, Giulietti AM, et al. 2014. The floral scents of Nymphaea subg. Hydrocallis (Nymphaeaceae), the new world night-blooming water lilies, and their relation with putative pollinators. Phytochemistry 103:67−75

    doi: 10.1016/j.phytochem.2014.04.007

    CrossRef   Google Scholar

    [17]

    Yuan R, Li S, Zheng X, Wu Q, Zhang H, et al. 2014. Determination of volatiles in water lily flowers using gas chromatography–mass spectrometry. Analytical Letters 47:1541−51

    doi: 10.1080/00032719.2013.878840

    CrossRef   Google Scholar

    [18]

    Yu M, Yang P, Song H, Guan X. 2022. Research progress in comprehensive two-dimensional gas chromatography-mass spectrometry and its combination with olfactometry systems in the flavor analysis field. Journal of Food Composition and Analysis 114:104790

    doi: 10.1016/j.jfca.2022.104790

    CrossRef   Google Scholar

    [19]

    Shi J, Zhu Y, Zhang Y, Lin Z, Lv HP. 2019. Volatile composition of Fu-brick tea and Pu-erh tea analyzed by comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry. LWT 103:27−33

    doi: 10.1016/j.lwt.2018.12.075

    CrossRef   Google Scholar

    [20]

    Chen W, Gong L, Guo Z, Wang W, Zhang H, et al. 2013. A novel integrated method for large-scale detection, identification, and quantification of widely targeted metabolites: Application in the study of rice metabolomics. Molecular Plant 6:1769−80

    doi: 10.1093/mp/sst080

    CrossRef   Google Scholar

    [21]

    Shi Y, Zhu Y, Ma W, Shi J, Peng Q, et al. 2022. Comprehensive investigation on non-volatile and volatile metabolites in four types of green teas obtained from the same tea cultivar of Longjing 43 ( Camellia sinensis var. sinensis) using the widely targeted metabolomics. Food Chemistry 394:133501

    doi: 10.1016/j.foodchem.2022.133501

    CrossRef   Google Scholar

    [22]

    Tungmunnithum D, Drouet S, Garros L, Hano C. 2022. Differential flavonoid and other phenolic accumulations and antioxidant activities of Nymphaea lotus L. populations throughout Thailand. Molecules 27:3590

    doi: 10.3390/molecules27113590

    CrossRef   Google Scholar

    [23]

    Castañeda-Ovando A, Pacheco-Hernández MDL, Páez-Hernández ME, Rodríguez JA, Galán-Vidal CA. 2009. Chemical studies of anthocyanins: A review. Food Chemistry 113:859−71

    doi: 10.1016/j.foodchem.2008.09.001

    CrossRef   Google Scholar

    [24]

    Lev-Yadun S, Gould KS. 2009. Role of anthocyanins in plant defence. In Anthocyanins, eds. Winefield C, Davies K, Gould K. New York: Springer. pp. 22–28. https://doi.org/10.1007/978-0-387-77335-3_2

    [25]

    Zhu M, Zheng X, Shu Q, Li H, Zhong P, et al. 2012. Relationship between the composition of flavonoids and flower colors variation in tropical water lily ( Nymphaea) cultivars. PLoS ONE 7:e34335

    doi: 10.1371/journal.pone.0034335

    CrossRef   Google Scholar

    [26]

    Ono E, Ruike M, Iwashita T, Nomoto K, Fukui Y. 2010. Co-pigmentation and flavonoid glycosyltransferases in blue Veronica persica flowers. Phytochemistry 71:726−35

    doi: 10.1016/j.phytochem.2010.02.008

    CrossRef   Google Scholar

    [27]

    Shi J, Simal-Gandara J, Mei J, Ma W, Peng Q, et al. 2021. Insight into the pigmented anthocyanins and the major potential co-pigmented flavonoids in purple-coloured leaf teas. Food Chemistry 363:130278

    doi: 10.1016/j.foodchem.2021.130278

    CrossRef   Google Scholar

    [28]

    Cudalbeanu M, Ghinea IO, Furdui B, Dah-Nouvlessounon D, Raclea R, et al. 2018. Exploring new antioxidant and mineral compounds from Nymphaea alba wild-grown in danube delta biosphere. Molecules 23:1247

    doi: 10.3390/molecules23061247

    CrossRef   Google Scholar

    [29]

    Bakr RO, El-Naa MM, Zaghloul SS, Omar MM. 2017. Profile of bioactive compounds in Nymphaea alba L. leaves growing in Egypt: hepatoprotective, antioxidant and anti-inflammatory activity. BMC Complementary and Alternative Medicine 17:52

    doi: 10.1186/s12906-017-1561-2

    CrossRef   Google Scholar

    [30]

    Xiong X, Zhang J, Yang Y, Chen Y, Su Q, et al. 2023. Water lily research: Past, present, and future. Tropical Plants 2:1

    doi: 10.48130/TP-2023-0001

    CrossRef   Google Scholar

    [31]

    Yin P, Kong YS, Liu PP, Wang JJ, Zhu Y, et al. 2022. A critical review of key odorants in green tea: Identification and biochemical formation pathway. Trends in Food Science & Technology 129:221−32

    doi: 10.1016/j.jpgs.2022.09.013

    CrossRef   Google Scholar

    [32]

    Cheng Y, Han L, Huang L, Tan X, Wu H, et al. 2023. Association between flavor composition and sensory profile in thermally processed mandarin juices by multidimensional gas chromatography and multivariate statistical analysis. Food Chemistry 419:136026

    doi: 10.1016/j.foodchem.2023.136026

    CrossRef   Google Scholar

    [33]

    Zheng Y, Fei Y, Yang Y, Jin Z, Yu B, et al. 2020. A potential flavor culture: Lactobacillus harbinensis M1 improves the organoleptic quality of fermented soymilk by high production of 2,3-butanedione and acetoin. Food Microbiology 91:103540

    doi: 10.1016/j.fm.2020.103540

    CrossRef   Google Scholar

    [34]

    Zhou Q, Zeng Y, Shan J, Zhu Z. 2019. Study of volatiles from flowers of Nymphaea hybrida by HS-SPME-GC-MS. Chemistry of Natural Compounds 55:951−52

    doi: 10.1007/s10600-019-02857-7

    CrossRef   Google Scholar

    [35]

    Zhang L, Chen F, Zhang X, Li Z, Zhao Y, et al. 2020. The water lily genome and the early evolution of flowering plants. Nature 577:79−84

    doi: 10.1038/s41586-019-1852-5

    CrossRef   Google Scholar

  • Cite this article

    Yang G, Wei J, Wu Y, Chen S, Yu C, et al. 2024. Comprehensive study of non-volatile and volatile metabolites in five water lily species and varieties (Nymphaea spp.) using widely targeted metabolomics. Beverage Plant Research 4: e012 doi: 10.48130/bpr-0024-0005
    Yang G, Wei J, Wu Y, Chen S, Yu C, et al. 2024. Comprehensive study of non-volatile and volatile metabolites in five water lily species and varieties (Nymphaea spp.) using widely targeted metabolomics. Beverage Plant Research 4: e012 doi: 10.48130/bpr-0024-0005

Figures(6)  /  Tables(1)

Article Metrics

Article views(4000) PDF downloads(668)

ARTICLE   Open Access    

Comprehensive study of non-volatile and volatile metabolites in five water lily species and varieties (Nymphaea spp.) using widely targeted metabolomics

Beverage Plant Research  4 Article number: e012  (2024)  |  Cite this article

Abstract: Water lilies, members of the Nymphaeaceae family, are globally cultivated aquatic plants known for their diverse colors and significant ornamental, economic, beverage, medicinal, and ecological value. In this study, we employed ultra-high performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) to analyze the non-volatile components and simultaneous distillation extraction (SDE) in combination with two-dimensional gas chromatography-time-of-flight mass spectrometry (GC×GC-TOFMS) to analyze the volatile components in five water lily species and varieties. Results showed that 118 differential metabolites including flavonoids, phenolic acids, amino acids, and lipids were screened among 533 non-volatiles. Cyanidin-type anthocyanins, including cyanidin-3-O-galactoside, cyanidin-3-O-glucoside, and cyanidin-3-rutinoside, are present in high amounts in the purple-colored Nymphaea 'Detective Erika'. Conversely, delphinidin is found in significant quantities in Nymphaea 'Blue Bird', which exhibits a blue color. KEGG analysis showed that flavonoid biosynthesis and anthocyanin biosynthesis exhibited significant enrichment. Additionally, a total of 166 volatiles were screened in water lilies, mainly including aromatic compounds, alkynes, ketones, alcohols and esters. Among them, the concentrations of key compounds including 1,11-dodecadiene, benzyl alcohol, benzaldehyde, α-farnesene and dimethyl sulfide, varied significantly among different samples. This study reveals significant variations in chemical compounds among different Nymphaea species and varieties. These findings contribute to enhancing our comprehension of the metabolic variability and composition of water lilies, which might shed light on unlocking new possibilities for their potential application in the beverage industry.

    • The genus Nymphaea, belonging to the family Nymphaeaceae and commonly known as water lily, comprises over 60 species of perennial aquatic plants[1]. This precious aquatic plant is distributed across frigid zones to the tropical regions worldwide. In horticulture, water lilies can be divided into two categories based on their ecological habits: tropical water lilies and hardy water lilies[2]. In addition to their aesthetic appeal, economically, Nymphaea species possess a substantial quantity of phytochemicals and nutrients, making them widely utilized for beverage preparation[3], essential oil extraction[4], and as a valuable source of food, nutrition, and medicinal purposes[5,6]. The incorporation of water lily in scented tea offers potential benefits in the tea beverage industry, combining ornamental value and potential health benefits[7]. Understanding the chemical composition and metabolic profiles of water lilies is important for uncovering their biological functions and applications. Metabolomics, a comprehensive study of small molecule metabolites, provides insights into the diverse range of non-volatile and volatile metabolites in water lilies[8,9].

      Non-volatile metabolites encompass a wide range of compounds, including primary metabolites such as carbohydrates, amino acids, and organic acids, as well as secondary metabolites such as flavonoids, alkaloids, and phenolic compounds[10]. These metabolites are essential for the growth, development, and defense mechanisms of plants[11]. Phytochemical screening of water lilies indicates the presence of several bioactive compounds like phenolic, flavonoids, triterpenes, glycosides, carbohydrates, and other compounds[12]. Phenols and flavonoids are the main phytochemicals found in water lilies and are responsible for their health benefits[5]. In addition, the captivating colors of water lily petals are attributed to their richanthocyanin content. For instance, delphinidin 3′-O-(2″-O-galloyl-6″-O-acetyl-β-galactopyranoside was identified as the primary blue anthocyanin in N. colorata[13]. Most detection methods for water lily metabolites are either targeted, with limited coverage, or non-targeted, with lower sensitivity and accuracy[14]. The combination of ultra-high-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) with widely targeted metabolomics techniques provides fast separation, high sensitivity, and broad coverage[15].

      Volatile metabolites, low molecular weight compounds that evaporate at ambient temperatures, are essential for the aroma, pollinator attraction, communication, and defense in plants. Its compositions vary among different species of water lilies. For example, day-blooming species emitting aromatic alcohols and ethers, while nocturnal species have a higher abundance of aromatic ethers, aliphatic esters, and C5-branched chain esters, which may play a role in attracting potential pollinators through olfactory cues[16]. In addition, previous studies using solid phase microextraction gas chromatography-mass spectrometry (SPME-GC-MS) detected 79 and 71 volatile compounds in tropical water lilies and hardy water lilies, respectively, with aromatic substances being their major volatile components[17]. However, with the advancements in analytical techniques, researchers have sought to explore more advanced methods for volatile compound analysis. Currently, comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry (GC×GC-TOFMS) has been widely adopted for the analysis of volatile compounds in a variety of foods due to its high resolution, high sensitivity, and peak capacity[18,19].

      In this study, we utilized UPLC-MS/MS and GC×GC-TOFMS techniques to comprehensively identify both non-volatile and volatile constituents of water lilies in five different colors. By comparing metabolite profiles across various Nymphaea species and varieties, we can uncover the discrepancies in chemical compounds that maybe linked to variations in scent, coloration, and potential biological activity. The findings of this study will enhance our understanding of the metabolic variability and chemical constitution of water lilies, thereby improving our knowledge of their distinctive attributes and potential uses.

    • Methanol, acetonitrile and ethanol were purchased from Merck (Darmstadt, Germany). Ether (GC) was acquired from Tedia (Fairfield, OH, USA). Anhydrous sodium sulfate (AR) and ethyl decanoate (AR, internal standard) were purchased from Sigma-Aldrich (Shanghai, China). Distilled water was obtained from Wahaha Group Company (Hangzhou, China). The carboxen/divinylbenzene/polydimethylsiloxane (CAR-DVB-PDMS; 50/30 μm) microextraction fiber was obtained from Supelco (Bellefonte, PA, USA). The n-alkanes (C3-C9, C8-C40) were purchased from J&K Scientific (Beijing, China).

    • As shown in Fig. 1, five different water lily samples were collected from Hangzhou Aquatic Plant Society, including two species — N. lotus (NL), N. rubra (NR), and three varieties — Nymphaea 'Texas Dawn' (TD), Nymphaea 'Blue Bird' (BB), and Nymphaea 'Detective Erika' (DE). Three fully expanded flowers (on the first day after flowering) were collected in the morning, three samples of each variety, and processed immediately.

      Figure 1. 

      Five different species and varieties of water lilies selected in this study.

    • The petals of water lily samples were ground using a mixer mill (MM 400,) with a zirconia bead for 1.5 min at 30 Hz. One hundred mg of sample powder was dissolved with 0.6 mL 70% aqueous methanol; the dissolved sample was placed in the refrigerator at 4 °C for 12 h, during which the sample was vortexed six times; finally, the sample was centrifuged at 4 °C for 10 min at 12,000 r/min, and the extracts were absorbed and filtrated (SCAA-104, 0.22 μm pore size; ANPEL, Shanghai, China), and then sealed in the injection vial for subsequent UPLC-MS/MS analysis.

    • The analysis was conducted by MetWare (Wuhan, China)[20]. UPLC (Shim-pack UFLC SHIMADZU CBM30A) equipped with tandem mass spectrometry (Applied Biosystems 4500 QTRAP) was used for the wide-targeted metabolomic assays of non-volatiles in water lily samples. The analytical conditions were as follows: Waters ACQUITY UPLC HSS T3 C18 column (2.1 mm × 100.0 mm, 1.8 μm); flow rate of 0.35 mL/min; column temperature of 40 °C; injection volume of 4 μL. The mobile phase A was ultrapure water (dissolved in 0.04% acetic acid), and the mobile phase B was acetonitrile (dissolved in 0.04% acetic acid). The mobile phase elution gradient was as follows: the proportion of phase B was 5% at 0.00 min, the proportion of phase B increased linearly to 95% at 10.00 min and maintained at 95% for 1 min, the proportion of phase B was reduced to 5% at 11.00–11.10 min and equilibrated with the proportion of phase B at 5% for 14 min. The effluent was alternatively connected to an ESI-triple quadrupole-linear ion trap (QTRAP)-MS.

      Linear ion trap (LIT) and triple quadrupole (QQQ) scans were acquired on a triple quadrupole-linear ion trap mass spectrometer (Q TRAP), API 4500 Q TRAP UPLC/MS/MS System, equipped with an ESI Turbo Ion-Spray interface, operating in positive and negative ion mode and controlled by Analyst 1.6.3 software (AB Sciex). The ESI source operation parameters were as follows: ion source, turbo spray; source temperature 550 °C; ion spray voltage (IS) 5,500 V (positive ion mode)/–4,500 V (negative ion mode); ion source gas I (GSI), gas II (GSII), curtain gas (CUR) were set at 50, 60, and 30 psi, respectively; the collision gas (CAD) was high. Instrument tuning and mass calibration were performed with 10 and 100 μmol/L polypropylene glycol solutions in QQQ and LIT modes, respectively. QQQ scans were acquired as MRM experiments with collision gas (nitrogen) set to 5 psi. Declustering potential (DP) and collision energy (CE) for individual MRM transitions was carried out with further DP and CE optimization. A specific set of MRM transitions were monitored for each period according to the metabolites eluted within this period.

    • For the qualitative analysis, the metabolites were identified by matching the retention time, fragmentation patterns, and accurate m/z values to the standards in the self-constructed metabolite database (MetWare, Wuhan, China). The quantitative analysis was conducted based on the signal intensities of metabolites obtained from characteristic ions. MultiaQuant software was used to integrate and calibrate chromatographic peaks. The peak area of each chromatographic peak represented the relative content of the corresponding substance.

    • The sample preparation method underwent minor modifications, as outlined in the previous study[19]. Specifically, simultaneous distillation extraction (SDE) method was used to extract the aroma components of water lily flower. The specific steps were as follows: weigh 10.00 g of the water lily flower sample to be tested, put it in a 500 mL round-bottomed flask and add 300 mL of boiling distilled water, and heat it to a slight boil with an electric heating jacket. Add 30 mL of redistilled anhydrous ether into the extraction flask and distill the extract at 50 °C for 1 h. Remove the water from the obtained material with anhydrous sodium sulfate and concentrate under nitrogen, then put into the injection bottle for sealing and storage at –20 °C for measurement.

    • The analysis of aroma components in the water lily samples was conducted using a GC×GC-TOFMS system, which consisted of a gas chromatograph (Agilent 7890B; Santa Clara, CA, USA) coupled with a TOFMS instrument (LECO Pegasus 4D; LECO Corporation, St. Joseph, MI, USA). The first dimension (1st D) column was a non-polar DB-5MS column (30 m × 250 μm × 0.25 μm) and the second dimension (2nd D) column was a moderate polar DB-17HT column (1.9 m × 100 μm × 0.10 μm), and both above columns were purchased from Agilent Technologies (Santa Clara, CA, USA). Injection port temperature: 280 °C; Transfer line temperature: 270 °C; Carrier gas: helium (purity 99.999%); No split injection; Modulation time interval: 4.0 s; Sample injection volume: 1.0 μL. The 1st D column temperature program: hold at 60 °C for 3.0 min, ramp at 4.0 °C/min to 280 °C, hold for 2.5 min; 2nd D column temperature program: hold at 65 °C for 3.0 min, ramp at 4.0 °C/min to 280 °C, hold for 2.5 min; Total analysis time: 60.5 min. MS conditions: Electron ionization source; Ionization energy: –70 eV; Mass scanning range: 33 to 600 m/z; Ion source temperature: 220 °C.

    • The GC×GC-TOFMS data were processed using the LECO ChromaTOF software. The calculated retention index (RI) values calculated using C8–C40 n-alkanes, and the reference RI values were obtained from the NIST 2014 semi-standard non-polar (DB-5) column. A compound was considered tentatively identified if the difference between the calculated and reference RI values was less than 20.

    • Patrial least squares discriminant analysis (PLS-DA) was performed by MetaboAnalyst 5.0 (www.metaboanalyst.ca). The data was log transformed (log10) and normalization by median before PLS-DA. Significantly regulated metabolites between groups were determined by VIP ≥ 1 and absolute Log2FC (fold change) ≥ 1. To mitigate the risk of overfitting, we conducted a permutation test with 100 permutations.

      The identified non-volatile metabolites were annotated using Kyoto Encyclopedia of Genes and Genomes (KEGG) Compound database (www.kegg.jp/kegg/compound), annotated metabolites were then mapped to KEGG pathway database (www.kegg.jp/kegg/pathway.html). Pathways with significantly regulated metabolites mapped to were then fed into metabolite sets enrichment analysis, their significance was determined using p-values obtained from the hypergeometric test.

    • A total of 533 non-volatile metabolites including nine categories were tentatively identified for water lilies by a comparison with tandem mass spectrum information from published databases and standards from the MetWare self-constructed metabolite database (Supplemental Fig. S1). As shown in Fig. 2a, these were 151 flavonoids (including 64 flavonoid and flavonoid carbonoside, 51 flavonols, 11 anthocyanins, 11 flavanols, nine dihydroflavone and dihydroflavonol, five isoflavones), 109 phenolic acids, 68 amino acids and derivatives, 57 lipids, 39 nucleotides and derivatives, 30 organic acids, 27 alkaloids, 23 saccharides and alcohols, and 29 other metabolites. Results showed that flavonoids, phenolic acids, lipids, amino acids and derivatives were the dominant non-volatile metabolites in the five water lilies. To better understand the differences in the content of non-volatile components between water lily samples from different species and varieties, the analysis was calculated by Log2FC (fold change) ≥ 2 or ≤ 0.5. The plots reflect the information on differential metabolite up-regulation and down-regulation (Fig. 2b). The numbers of differential metabolites identified in NL, TD, NR, and DE were 329 (237 up, 92 down), 310 (218 up, 92 down), 314 (213 up, 101 down), and 282 (177 up, 105 down), respectively.

      Figure 2. 

      Overview of the non-volatile components. (a) Quantitative distribution of chemical classes of volatile compounds. (b) Number of differentiated compounds with fold change ≥ 2 or ≤ 0.5. Note: NL, N. lotus; NR, N. rubra; TD, Nymphaea 'Texas Dawn'; BB, Nymphaea 'Blue Bird'; DE, Nymphaea 'Detective Erika'.

    • In the present study, all differential metabolites were investigated based on fold change, and a total of 118 non-volatiles were screened in five water lilies, mainly consisted of flavonoids, phenolic acids, amino acids and derivatives, lipids, and organic acids metabolites (Supplemental Table S1 & Supplemental Fig. S2). Specifically, we screened 10 up-regulated and 10 down-regulated metabolites with the highest fold change values in different water lilies (Fig. 3). Comparing the NL samples to the BB samples, we observed higher levels of chrysoeriol-O-malonylhexoside, 6,7,8-tetrahydroxy-5-methoxyflavone, myricetin-O-glucoside-rhamnoside, and kaempferol-3-O-neohesperidoside. On the other hand, the BB samples contained higher levels of isoquercitrin and luteolin-7-O-β-D-gentiobioside. Moreover, the TD samples exhibited high levels of neochlorogenic acid, chlorogenic acid, 1-O-p-coumaroylquinic acid, and tetragallic acid, while the BB samples exhibited high levels of 2'-hydroxygenistein, phloretin 2'-O-glucoside, and 3,4-dihydroxybenzaldehyde. Comparing the NR samples to the BB samples, we found higher levels of cyanidin-3-rutinoside, cyanidin-3-O-galactoside, and myricetin-O-glucoside-rhamnoside. Conversely, the BB samples exhibited higher levels of 7-methoxycoumarin, myricitrin, and naringenin-7-O-glucoside. In comparison to the BB samples, the DE samples displayed higher levels of myricetin-O-glucoside-rhamnoside, cis-4-hydroxy-D-proline, myricetin-3-O-rhamnoside-7-O-rhamnoside, and neochlorogenic acid. On the other hand, the BB samples contained higher levels of myricitrin, isoquercitrin, and naringenin-7-O-glucoside.

      Figure 3. 

      The highest fold change values of non-volatile metabolites. (a) Fold change plot of NL vs BB. (b) Fold change plot of TD vs BB. (c) Fold change plot of NR vs BB. (d) Fold change plot of DE vs BB. Note: The horizontal coordinate is the log2FC of the differentially metabolized metabolite, and the vertical coordinate is the differentially metabolized metabolite. Red represents up-regulated differentially expressed metabolites, cyan represents down-regulated differentially expressed metabolites. NL, N. lotus; NR, N. rubra; TD, Nymphaea 'Texas Dawn'; BB, Nymphaea 'Blue Bird'; DE, Nymphaea 'Detective Erika'.

      We further identified the enrichment of differential metabolites in the KEGG mapping. The results of pathway enrichment analysis of the detected differential compounds using the KEGG database are shown in Supplemental Fig. S3. A total of 354, 327, 380, 299 differential compounds from BB vs NL, BB vs TD, BB vs NR, BB vs DE samples could be annotated to the relevant metabolic pathways, which were mainly significantly enriched in the pathways of biosynthesis of secondary metabolites, flavonoid biosynthesis, anthocyanin biosynthesis, flavonoids and flavonols biosynthesis, isoflavonoid biosynthesis. Furthermore, we observed significant enrichment in pathways associated with tryptophan metabolism and phenylpropanoid biosynthesis in the BB vs TD group. Additionally, caffeine metabolism exhibited significant enrichment in the BB vs NR group.

    • The fragrance of water lily contains volatile compounds such as terpenes, phenylpropanoids, benzenoids, fatty acid derivatives, and amino acid derivatives. These compounds not only attract pollinators, but also play a crucial role in transmitting signals in plant-plant interactions and providing protection and defense for the plant[17]. In this study, the volatile compounds of water lily samples were analyzed by GC×GC-TOFMS (Supplemental Fig. S4). By comparing the MS of the compounds and comparing the chromatographic peaks and tested RI values with the reported RI values, 166 volatiles were identified, including 46 aromatic compounds, 34 alkynes, 22 ketones, 10 alcohols, 18 esters, 20 aldehydes, three carboxylic acids, five heterocyclics, five sulfur-containing compounds, and three other compounds (Fig. 4a). We identified 128, 141, 142, 135, and 129 volatile compounds in BB, NL, TD, NR, and DE, respectively, and 108 of the 166 metabolites were common to all water lily samples (Fig. 4b). We observed that 20 volatile compounds were exclusively detected in specific water lily samples. For instance, in BB variety, compounds such as amorpha-4,11-diene, (Z)-geranylacetone, α-bisabolol, and (E)-β-ionone were found. In NR variety, pyrocinchonic anhydride and 2-tetradecanone were exclusively detected. NL variety exhibited unique compounds including mequinol, 4-ethylresorcinol, p-xylene, and 2,5-hexanedione. TD variety showed the presence of terpilene, (E,E)-2,6-dimethyl-2,4,6-octatriene, 1-nonanol, β-bisabolene, sabinene, 3-carvomenthenone, (4E,6Z)-2,6-dimethyl-2,4,6-octatriene, ipsdienol, and 2-thujene. Lastly, DE variety exclusively contained 2-pentoxyethyl acetate.

      Figure 4. 

      Overview of the volatile components. (a) Quantitative distribution of chemical classes of volatile compounds. (b) Venn plot; numbers represent the identified metabolites. (c) Relative abundance of different types of volatile compounds. Note: NL, N. lotus; NR, N. rubra; TD, Nymphaea 'Texas Dawn'; BB, Nymphaea 'Blue Bird'; DE, Nymphaea 'Detective Erika'.

      The relative content of volatiles calculated from the total ion chromatograms varied in the concentration and proportion of each chemical class in different samples. Among them, the highest peak area of volatile components was found in BB, followed by NL (Fig. 4c). Alkenes were found to be the most abundant volatile components in BB, accounting for 69.58% of the total volatile components. Aromatic compounds were found to be the most abundant volatile components in NL, accounting for 64.25% of the total volatile components. In addition, alcohols and alkenes accounted for 28.05% and 26.89% of the total volatile components in DE, respectively.

      We investigated the relative amounts of the main 24 chemical compounds released that were greater than 1% in BB, NL, TD, NR, DE, with total relative contents of 90.92%, 91.45%, 83.19%, 77.29%, and 90.83%, respectively (Table 1 & Fig. 5). The concentrations of the identical chemical compounds varied across distinct samples. Within the provided BB sample, three compounds of the alkene class, namely 1,11-dodecadiene (27.30%), (E)-β-famesene (18.28%), and α-farnesene (14.44%), collectively constitute more than half of the total volatile compounds. Among the NL samples, 2,5-dimethoxytoluene (56.18%) emerged as the most abundant compound. In TD and NR samples, dimethyl sulfide was the predominant volatile compound, comprising 33.61% and 18.75% of the total volatile content, respectively. In addition, in DE sample, the concentration of benzyl alcohol (23.83%) was the most abundant, followed by dimethyl sulfide (16.67%) and α-farnesene (16.41%).

      Table 1.  Comparison of the main volatile compounds in five water lily samples.

      No.CompoundsClassRI (ref)[a]RI (cal)[b]CASIonOdor type[c]Flavor[c]Relative content (%)
      BBNLTDNRDE
      1Benzyl alcoholAlcohols1036 ± 41037100-51-679FloralFloral, rose, balsamic0.114.486.328.1323.83
      2BenzaldehydeAldehydes962 ± 3969100-52-777FruityAlmond, burntsugar, sweet8.102.133.175.223.91
      3HexanalAldehydes800 ± 280166-25-141GreenGreen, fatty, leafy0.150.690.613.840.34
      4BenzeneacetaldehydeAldehydes1045 ± 41049122-78-191GreenGreen, floral, honey0.160.141.030.090.85
      51,11-DodecadieneAlkenes1179 ± 217635876-87-967//27.303.6712.627.0411.45
      6(E)-β-FameseneAlkenes1457 ± 2145318794-84-841WoodyWoody, citrus, herbal18.280.390.840.381.27
      7α-FarneseneAlkenes1508 ± 21506502-61-441WoodyCitrus, herbal, neroli14.440.1710.380.1816.41
      8(E)-α-BergamoteneAlkenes1433 ± 3143713474-59-493WoodyWoody2.020.110.030.031.65
      9β-SesquiphellandreneAlkenes1524 ± 2152820307-83-969HerbalHerbal, fruity, woody3.430.120.150.101.37
      10(E)-β-OcimeneAlkenes1049 ± 210483779-61-193/Herbal, sweet0.010.033.890.040.01
      112,5-DimethoxytolueneAromatic compounds1251 ± 5124924599-58-4137//0.3856.180.4912.180.52
      121,4-DimethoxybenzeneAromatic compounds1168 ± 91166150-78-7123GreenGreen, hay, sweet0.053.620.026.630.02
      13PhenolAromatic compounds980 ± 4981108-95-294PhenolicPhenolic, plastic, rubbery0.170.870.561.460.62
      14Acetic acidCarboxylic acids610 ± 1058164-19-745AcidicPungent, sour3.422.012.180.771.58
      15Benzoic acid, methyl esterEsters1094 ± 3109693-58-3105PhenolicPhenolic, wintergreen, almond0.030.570.071.860.03
      16Ethyl acetateEsters612 ± 5613141-78-643EtherealEthereal fruity sweet0.380.390.140.002.06
      17Acetic acid, phenylmethyl esterEsters1164 ± 21166140-11-4108FloralFloral, fruity, jasmine1.880.010.060.020.45
      186-Methyl-5-heptene-2-oneKetones986 ± 2984110-93-043CitrusCitrus, green, musty1.292.162.153.673.05
      19(E)-3-Penten-2-oneKetones735 ± N/A7443102-33-869//0.211.010.311.580.40
      20(E)-GeranylacetoneKetones1453 ± 214483796-70-143FloralFruity, fresh, rose2.150.760.650.721.18
      212-HeptadecanoneKetones1902 ± 719002922-51-258//1.680.411.180.700.80
      22Dimethyl sulfideSulfur-containing compounds520 ± 755375-18-347SulfurousSulfurous, sweet corn4.378.7633.6118.7516.67
      23Carbon disulfideSulfur-containing compounds549 ± 1356575-15-076/Sweet0.551.171.961.291.39
      24BenzothiazoleSulfur-containing compounds1229 ± 8124095-16-9135SulfurousSulfurous, rubbery, vegetable, cooked0.290.480.561.160.53
      Total relative content (%)90.9291.4583.1977.2990.83
      RI, Retention index, Ion, Qualitative ion; [a] RI (ref): The RI values (median ± deviation) were the reference values for semi-standard non-polar (DB-5) column obtained from NIST 2014; [b] RI (cal): The RI values were calculated from C8-C40 n-alkanes; [c] odor type and flavor were obtained from website (www.thegoodscentscompany.com/search2.html). Note: NL, N. lotus; NR, N. rubra; TD, N. 'Texas Dawn'; BB, N. 'Blue Bird'; DE, N. 'Detective Erika'.

      Figure 5. 

      Molecular formulas of major volatile compounds.

    • To gain a comprehensive understanding of the variations in volatile compound content among the five water lily samples, we utilized PLS-DA with the peak areas of 166 volatile compounds as input variables. As shown in Fig. 6a, the five samples were distinctly segregated from the remaining samples along the principal component 1 axis (R2X [1] = 37.6%) and principal component 2 axis (R2X [2] = 27.1%). Cross-validation was performed using the leave-one-out method, with the first two principal components explaining 99.6% of the total variance (R2X). The model exhibited good predictive ability (Q2 = 85.1%) and was not overfitted. The evident segregation and high reproducibility observed among the various sample groups substantiated the presence of significant disparities in the volatile compositions of the five water lily species and varieties (Fig. 6a).

      Figure 6. 

      The partial least squares discriminant analysis (PLS-DA) of the volatile compounds. (a) Score plot of PLS-DA. (b) The loading plot of PLS-DA. (c) Variable importance in the project (VIP) plot of PLS-DA. Note: NL, N. lotus; NR, N. rubra; TD, Nymphaea 'Texas Dawn'; BB, Nymphaea 'Blue Bird'; DE, Nymphaea 'Detective Erika'.

      The variable importance in projection (VIP) value is a comprehensive metric that quantifies the contribution of a variable in describing the data and indicates the significance of an independent variable for the model[21]. By utilizing the PLS-DA model, we identified 42 key volatile compounds with VIP scores of ≥ 1 across all samples (Supplemental Fig. S5). Subsequently, a one-way analysis of variance (ANOVA) was conducted on these compounds, revealing statistically significant differences among the distinct sample groups (p < 0.05). Notably, 26 metabolites of volatile compounds exhibited VIP values exceeding 1.5 (Fig. 6c), with 2,3-butanedione, octanal, 1-methyl-4-(1-hydroxy-1-methylethyl)benzene, acetic acid, phenylmethyl ester, 2,5-dimethoxytoluene, (E)-β-ocimene being among the top-ranked compounds in descending order of VIP values.

    • Although previous studies have been reported on the volatile and non-volatile components of water lilies, the number of compounds reported in these studies is relatively limited[2,22]. By employing widely targeted metabolomics, we overcome the limitations associated with both targeted and non-targeted metabolite detection methods. This approach provided us with a high-throughput analysis, increased sensitivity, and wide coverage, enabling a more comprehensive understanding of the metabolomic profile of water lilies. In the present study, we comprehensively identified 533 non-volatile and 166 volatile components in five different colors of water lilies using UPLC-MS/MS and GC × GC-TOFMS techniques.

      It is widely recognized that flower color is a crucial characteristic of ornamental plants and is influenced by various factors. Specifically, the type and concentration of anthocyanins are generally considered to be the primary determinants[23]. Anthocyanins play an important role in plant physiology, serving as attractions for pollinators and herbivores, acting as deterrents against herbivores and parasites, and also influencing visual signals and mimicry of defensive structures[24]. Among the diverse range of flower colors, blue coloration primarily results from the presence of anthocyanins derived from delphinidin[25]. In the petals of Nymphaea 'King of Siam', four anthocyanins were identified: delphinidin 3-O-β-galactopyranoside, delphinidin 3′-O-(2″-O-galloyl-β-galactopyranoside, delphinidin 3-O-(6″-O-acetyl-β-glucopyranoside, delphinidin 3′-O-(2″-O-galloyl-6″-O-acetyl-β-galactopyranoside[13]. In the present study, a total of 11 anthocyanins were detected in five different-colored water lilies. Among them, delphinidin-derived anthocyanins exhibited relatively higher relative abundances in BB, TD, and DE, while its abundance was low in DE. Notably, cyanidin-3-O-galactoside, cyanidin-3-O-glucoside, cyanidin-3-rutinoside, cyanidin-O-acetylhexoside, and cyanidin-O-syringic acid, which are glycosides of the cyanidin type, were found to be the most abundant exclusively in the DE samples. In contrast, these compounds were found at very low levels or were not detected in the other four water lily samples.

      In addition, numerous studies conducted on various plant species have provided evidence that flavonoids (including flavones, flavanols, and isoflavonoids) and their glycosides play a significant role in co-pigmentation[26,27]. According to reports, N. lotus stamen extracts contain a higher concentration of flavonoids compared to perianth extracts[22]. The flavonoids identified in the stamen extracts include kaempferol 3-O-galactoside, quercetin 3′-O-xyloside, quercetin 3-O-rhamnoside, isorhamnetin 7-O-galactoside, and myricetin 3′-O-xyloside. Another study indicated that the content of flavonoids in the petals of the N. alba flower is significantly higher than in the stem and root[28]. In addition, a large number of phenolic acids, for example, caffeic acid, chlorogenic acid, p-coumaric acid have been identified in water lilies[12]. Our present study identified various flavonoid glycosides with glucose, rhamnose, galactose, and arabinose as the major sugar ligands. As a result, the number of detected flavonoids and their glycosides in water lilies has significantly expanded. Among them, quercetin, kaempferol, apigenin, myricetin, and luteolin were identified as the five major flavonols present in water lilies. In addition, many previous studies have reported the antioxidant potential of water lily extracts that are associated with the accumulation of its phytochemicals, especially flavonoids[5,7,29]. Therefore, the variations in the composition of flavonoids among distinct water lily species and varieties, as revealed in this investigation, are expected to provide valuable insights into their diverse antioxidant properties.

      Water lilies not only exhibit a wide variety of colors throughout their flowering period but also emit a captivating fragrance, which has garnered significant attention among researchers in recent studies[2,30]. Among the seven species of N. subg. Hydrocallis (Nymphaeaceae), N. lotus was only found to contain detectable levels of 2,5-dimethoxytoluene, and the content was notably high[16]. This is consistent with the results of our study. Dimethyl sulfide, a group of sulfur-containing compounds, was detected at significant concentrations in all five water lilies. Notably, it has been identified as a crucial aromatic volatile in green tea[31] and mandarin juices[32], and even in trace amounts, it contributes to the development of sulfurous notes. Benzyl alcohol, renowned for its delightful floral and sweet essence, has been extensively documented in various Nymphaea species[2]. Its relative concentration in the DE samples was 23.83% of the total volatiles. This high concentration may be the main reason for the sweet and floral aroma of the DE samples. Additionally, according to the report, 2,3-butanedione is known for its buttery, sweet, and creamy flavor, which imparts a cheese-like taste to soy milk[33]. Our research findings indicate that the BB samples had the highest concentration of 2,3-butanedione, which may be responsible for its sweet and creamy flavor, while it was not detected in the NL sample.

      In addition, previous studies reported that the flower aroma of the cold-resistant water lily is influenced by nerolidol and lilac alcohols, with orange and clove flower aromas, while the aroma of tropical water lilies is influenced by ethyl benzoate, acetic acid phenylmethyl ester, and 2-heptadecanone, which provide a fruity and sweet flower aroma[2]. Moreover, the primary volatile compounds identified in the flowers of Nymphaea hybrida were benzyl alcohol, benzyl acetate, benzaldehyde, (E)-α-bergamotene, and these may constitute its main fragrant components[34]. We observed that the NL sample releases a greater proportion of aromatic compounds than the other samples, which may account for its fruity and sweet flavor. In addition, (E)-α-Bergamotene, (E)-β-famesene and 11 other different volatile compounds were reported in N. colorata flowers, which may serve as olfactory cues for insect pollinators[35]. (E)-β-famesene and α-farnesene possess woody, citrus, and herbal aromatic attributes, was found to be more pronounced in BB samples, accounting for 18.28% and 14.44% of the total volatile composition, respectively. These compounds are likely responsible for the woody fragrance observed in BB samples. Overall, the varying concentrations and relative amounts of volatile compounds contribute to the rich and diverse fragrance qualities of these five water lily species and varieties.

    • To sum up, our study utilized advanced widely targeted metabolomics to comprehensively investigate the non-volatile and volatile components in five water lily species and varieties. The results revealed significant differences in the composition and abundance of metabolites among the different colors of water lily samples. Regarding non-volatile components, we found that cyanidin-type anthocyanins were abundant in the purple-colored DE samples, while delphinidin-derived anthocyanins were prominent in BB, which exhibited a blue color. Flavonoids, phenolic acids, amino acids, lipids, and organic acids were found to be the dominant non-volatile metabolites, expanding our knowledge of their metabolic profile. Pathway enrichment analysis of the differential compounds indicated the involvement of various metabolic pathways such as flavonoid biosynthesis and anthocyanin biosynthesis in water lily metabolism. In terms of volatile components, water lilies contain diverse volatile compounds, including aromatic compounds, alkynes, ketones, alcohols, esters, and aldehydes. Among these, the concentration ofkey compounds including 1,11-dodecadiene, benzyl alcohol, benzaldehyde, α-farnesene and dimethyl sulfide, showed significant differences among samples. By investigating both non-volatile and volatile metabolites, we have obtained a comprehensive understanding of the metabolic pathways and interplay within the water lily species and varieties. Nevertheless, the intricate mechanisms underlying the synthesis and release of metabolites in water lilies warrant further investigation.

    • The authors confirm contribution to the paper as follows: study conception and design: Chen Y, Lv H; data collection: Wei J, Wu Y; resources: Chen S, Yu C; analysis and interpretation of results: Lin Z, Zhu Y; draft manuscript preparation: Yang G. All authors reviewed the results and approved the final version of the manuscript.

    • The datasets generated during and/or analyzed during the current study are not publicly available due to management requests, but are available from the corresponding author on reasonable request.

      • This work was supported by the Science and Technology Innovation Project of Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2014-TRICAAS) for financial support.

      • The authors declare that they have no conflict of interest.

      • Supplemental Table S1 Crucial changed differential non-volatile metabolites in five water lily samples.
      • Supplemental Fig. S1 The spectrogram of non-volatile components in five water lily samples obtained from UPLC-MS/MS.
      • Supplemental Fig. S2 Heatmap of crucially changed differential non-volatile metabolites in five water lily samples.
      • Supplemental Fig. S3 Differential non-volatile metabolite KEGG enrichment maps.
      • Supplemental Fig. S4 The spectrogram of volatile components in five water lily samples obtained from GC×GC-TOFMS.
      • Supplemental Fig. S5 The heat map of volatile compounds with VIP scores of ≥1.
      • Copyright: © 2024 by the author(s). Published by Maximum Academic Press, Fayetteville, GA. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.
    Figure (6)  Table (1) References (35)
  • About this article
    Cite this article
    Yang G, Wei J, Wu Y, Chen S, Yu C, et al. 2024. Comprehensive study of non-volatile and volatile metabolites in five water lily species and varieties (Nymphaea spp.) using widely targeted metabolomics. Beverage Plant Research 4: e012 doi: 10.48130/bpr-0024-0005
    Yang G, Wei J, Wu Y, Chen S, Yu C, et al. 2024. Comprehensive study of non-volatile and volatile metabolites in five water lily species and varieties (Nymphaea spp.) using widely targeted metabolomics. Beverage Plant Research 4: e012 doi: 10.48130/bpr-0024-0005

Catalog

  • About this article

/

DownLoad:  Full-Size Img  PowerPoint
Return
Return