-
This study analyzed 422 gene sequences, the homologs of 42 genes in the flowering induction regulatory pathways, and found that 143 of them contained CpG islands. The CpG island search results for each gene are shown in Supplemental Table S1.
These 42 genes comprise genes from four floral induction pathways and six floral pathway integrator genes (Fig. 1a). The percentage of homologous sequences containing CpG islands varies among genes. For example, the percentage reached 100% for ZTL, GAI, VIP2, and REF6, whereas the homologs of some genes (e.g., CCA1, LHY, and PIF3) contained no CpG islands. The distribution of CpG islands therefore shows a gene preference, and we surmise that the DNA methylation regulation mechanisms of homologous genes may share certain similarities.
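The criteria used to call CpG islands are not restated in this section. As an illustration only, the following Python sketch assumes the widely used Gardiner-Garden and Frommer thresholds (a window of at least 200 bp, GC content of at least 50%, and an observed/expected CpG ratio of at least 0.6). The function names and the simple sliding-window scan are hypothetical and are not the tool used in this study.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    expected = c * g / n  # expected CpG count if C and G were independent
    ratio = cpg / expected if expected > 0 else 0.0
    return gc_fraction, ratio


def has_cpg_island(seq, window=200, step=1, min_gc=0.5, min_ratio=0.6):
    """True if any window meets the assumed GC and obs/exp thresholds."""
    for start in range(0, len(seq) - window + 1, step):
        gc, ratio = cpg_stats(seq[start:start + window])
        if gc >= min_gc and ratio >= min_ratio:
            return True
    return False


if __name__ == "__main__":
    print(has_cpg_island("CG" * 100))  # CpG-rich toy sequence
    print(has_cpg_island("AT" * 100))  # sequence without C or G
```

Applying such a test to every homologous sequence of a gene gives the per-gene percentage of sequences with CpG islands discussed above.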
Figure 1.
Sequence analysis of the flowering-related genes of C. lavandulifolium. (a) The percentage of CpG islands in the 42 homologous genes associated with flowering: (1) Photoperiod pathway; (2) Vernalization pathway; (3) GA pathway; (4) Autonomous pathway; (5) Floral pathway integrator. (b) CpG loci information of the 143 homologous gene sequences containing CpG islands. (c) Gene classification based on the percentage of homologous sequences with CpG islands (0−2,000 bp represents the promoter region). (d) Network of floral induction pathways in C. lavandulifolium (modified from Wang et al.[47]). Labelling of genes regulated by DNA methylation: black crowns mark genes that may be regulated by methylation, and pink crowns mark genes that are highly likely to be regulated by methylation.
Based on the CpG island analysis of all the homologous sequences of each gene, the genes can be divided into three categories with respect to CG site methylation: highly likely to be regulated by methylation; possibly regulated by methylation; and highly likely not to be regulated by methylation (Fig. 1c).
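The three-way classification can be sketched as a simple threshold rule on the per-gene percentage. The cut-offs below (10% and 80%) are an assumption borrowed from the 10%−80% range that this article associates with genes that might be regulated by DNA methylation; the exact boundaries used for Fig. 1c, and the function names, are hypothetical.

```python
def percent_with_islands(flags):
    """Percentage of homologous sequences of one gene that contain a CpG island."""
    return 100.0 * sum(flags) / len(flags)


def classify_gene(percent, low=10.0, high=80.0):
    """Assign one of the three categories; low/high cut-offs are assumed."""
    if percent > high:
        return "highly likely regulated by methylation"
    if percent >= low:
        return "possibly regulated by methylation"
    return "highly likely not regulated by methylation"


if __name__ == "__main__":
    # Percentages taken from the text: ZTL 100%, WRKY 40%, CCA1 no CpG islands.
    for gene, pct in [("ZTL", 100.0), ("WRKY", 40.0), ("CCA1", 0.0)]:
        print(gene, classify_gene(pct))
```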
The screened floral induction genes were placed in the floral induction pathways[47], and at least one gene in each of the four pathways was found to be highly likely to be regulated by DNA methylation (Fig. 1d): the FCA and REF6 genes in the autonomous pathway, the ZTL gene in the photoperiod pathway, the VIP2 gene in the vernalization pathway and the GAI gene in the gibberellin pathway.
CpG islands location preference
-
We analyzed all gene sequences containing CpG islands. Of the 422 gene sequences from the 42 homologous genes, 143 contained CpG islands, and these islands were mapped to determine their locations. The CpG islands were located primarily within the promoter region, or spanned the promoter region and the gene body; in other words, the distribution of CpG islands shows a position preference (Fig. 1b).
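A minimal sketch of how island positions could be tallied by location is given below. It assumes that coordinates are counted from the start of each analyzed sequence and that the first 2,000 bp represent the promoter region, as stated in the legend of Fig. 1; the function names and the example coordinates are hypothetical.

```python
from collections import Counter

PROMOTER_END = 2000  # 0-2,000 bp is treated as the promoter region (Fig. 1 legend)


def island_location(start, end, promoter_end=PROMOTER_END):
    """Classify one CpG island by its start/end coordinates (in bp)."""
    if end <= promoter_end:
        return "promoter"
    if start >= promoter_end:
        return "gene body"
    return "promoter and gene body"


def summarize_locations(islands):
    """Count islands per location class for a list of (start, end) pairs."""
    return Counter(island_location(start, end) for start, end in islands)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    example = [(100, 400), (50, 260), (1900, 2300), (2500, 2800)]
    print(summarize_locations(example))
```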
MBD enrichment genome sequencing results
-
Box plots of the six samples from the two developmental stages (vegetative growth and budding) were used to characterize the overall genomic DNA methylation level. The methylation levels of the six samples were largely the same, with no significant difference (Fig. 2a).
Figure 2.
MBD sequencing was used to screen for genes regulated by DNA methylation. (a) Boxplot of methylation levels for six samples. (b) The length distribution of nucleotide sequences in the MBD-seq results. (c) Species statistics of the Nr annotation results. (d) Sequence type statistics of the Nr annotation results. (e, f) Volcano plot and heat map of differentially enriched genes between the seedling stage and bud stage.
The MBD-enriched genomic DNA of six C. lavandulifolium samples was sequenced, and a total of 328,644,614 reads were obtained, which contained 430,941,997 nucleotide sequence information. De novo assembly of the obtained sequences produced 605,110 contigs. The CG content was 38.22%, the contig length ranged from 224 to 28,176 bp, the average length was 312.81 bp and the N50 was 572 bp. Contigs of 200−500 bp, 500−1,000 bp, 1,000−2,000 bp and ≥ 2,000 bp accounted for 71.63%, 25.35%, 2.74%, and 0.28% of the total, respectively. Most contigs were therefore between 200 bp and 500 bp long, although several long fragments were also obtained (Fig. 2b).
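For readers unfamiliar with these assembly statistics, the sketch below shows how an N50 value and a length distribution of this kind can be computed from a list of contig lengths. It is not the pipeline used in this study: the function names are hypothetical, the bins are assumed to be half-open (lower edge included, upper edge excluded), and the example lengths are made up.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


# Assumed half-open bins [low, high); the text does not say which edge is inclusive.
BINS = [
    ("200-500 bp", 200, 500),
    ("500-1,000 bp", 500, 1000),
    ("1,000-2,000 bp", 1000, 2000),
    (">= 2,000 bp", 2000, None),
]


def length_distribution(lengths):
    """Percentage of contigs falling in each length bin."""
    counts = {label: 0 for label, _, _ in BINS}
    for length in lengths:
        for label, low, high in BINS:
            if length >= low and (high is None or length < high):
                counts[label] += 1
                break
    n = len(lengths)
    if n == 0:
        return {label: 0.0 for label in counts}
    return {label: 100.0 * count / n for label, count in counts.items()}


if __name__ == "__main__":
    toy_lengths = [250, 300, 600, 1500, 2500]  # made-up contig lengths
    print(n50(toy_lengths))
    print(length_distribution(toy_lengths))
```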
Among these sequences, 118,289 sequences of good quality were annotated against the NCBI Nr database and matched to 825 species (Fig. 2c). Chrysanthemum sequences were the most common, with 85,973 cases (72.68%), followed by Artemisia annua (5,755 cases, 4.87%) and Helianthus annuus (3,429 cases, 2.90%), and then Vigna angularis (947), Solanum pennellii (685), Vitis vinifera (755) and Triticum aestivum (263). In total, 83.02% of the sequences were annotated to the top seven species, mainly to the genus Chrysanthemum.
The sequence types of the annotated samples were analyzed. The most common type was microsatellites, with a total of 41,675 sequences, which reflects the fact that highly repetitive sequences in plants are highly methylated (Fig. 2d). The next most common type was complete genome sequences, usually the DNA sequence of the chloroplast or mitochondria. Our focus was on the annotation results for mRNA-coding sequences and promoter regions.
Establishing the gene pool according to the results of the Nr database
-
The main sequence types annotated from the Nr database were microsatellite sequences, complete sequences, mRNA sequences, promoter regions, and UTR regions. The Nr annotations were clustered, and entries with similar structures or functions were summarized and counted for the genes with clear functions. This brief summary of gene names and functions provides a gene pool for subsequent research, which can be divided into three main categories: genes that maintain basic metabolism; families of transcription factors; and important single genes.
We analyzed the annotation results for the promoter regions and the mRNA regions in detail. In the Nr annotation results, 4,949 contigs were annotated as promoter regions, of which 3,488 sequences had unambiguous annotations that corresponded to 28 genes. These annotations are summarized in Supplemental Table S2.
The genes involved in floral induction include the ERF1 gene, the WRKY transcription factor and the LEAFY gene. Most of the remaining genes encode enzymes in secondary metabolic pathways, such as the allene oxide cyclase (AOC) gene, C4-sterol methyl oxidase gene, linalool synthase (LS) gene, and artemisinic aldehyde delta11(13) reductase (DBR2) gene.
In the Nr annotation results, 2,044 contigs were annotated as mRNA regions, of which 137 sequences had clear annotations that corresponded to 61 genes. These annotations are summarized in Supplemental Table S3. The genes involved in floral induction include the DOF transcription factor family, WRKY transcription factor, GRAS protein, FT, PHYA, CRY1a, GI, bHLH2, and AG1.
Differential enrichment sequence at the vegetative growth stage and bud stage
-
Visualization of the differential enrichment shows the distribution of the differentially enriched sequences (Fig. 2e). Comparing the vegetative growth stage with the bud stage, fewer genes were downregulated than upregulated in the bud stage. This result indicates that the DNA methylation status of several sequences changed significantly during flower development, while the overall methylation status of the genome did not change significantly. Downregulated genes were more highly expressed than upregulated genes, indicating that DNA demethylation occurred in more of the genes involved in floral induction (Fig. 2f).
Differential enrichment analysis was performed between the two groups of samples at the vegetative growth stage and bud stage. The scatter plot shows the sequences with significant upregulation and downregulation. The sequences with significant differences produced 1,738 annotation results. Most were microsatellite sequences (1,118 sequences, 64.32%), followed by promoter regions (201) and mRNA sequences (81).
The differentially enriched sequences annotated as promoter regions included the WRKY transcription factor, DBR2, ALDH1, AOC, CPR, LS, and the C4-sterol methyl oxidase gene, as shown in Table 1.
Table 1. Differentially enriched sequences annotated as promoter regions.

| Contig No. | Species | Nr annotation |
|---|---|---|
| 7 | Artemisia annua | aldehyde reductase (DBR2) gene |
| 2 | Artemisia annua | ALDH1 gene |
| 2 | Artemisia annua | allene oxide cyclase (AOC) gene |
| 12 | Artemisia annua | artemisinic aldehyde delta11(13) reductase (DBR2) gene |
| 140 | Artemisia annua | C4-sterol methyl oxidase gene |
| 1 | Artemisia annua | amorpha-4,11-diene 12-hydroxylase |
| 2 | Artemisia annua | cytochrome P450 reductase (CPR) gene |
| 24 | Artemisia annua | epi-cedrol synthase gene |
| 5 | Artemisia annua | linalool synthase (LS) gene |
| 1 | Artemisia annua | WRKY-like transcription factor gene |

The differentially enriched sequences annotated as mRNA regions included DELLA protein, DOF transcription factor, GRAS17, GRAS3, HB15, MYB44, MYB46, and S-adenosyl-L-homocysteine hydrolase, as shown in Table 2.
Table 2. Differentially enriched sequences annotated as mRNA regions.

| Contig No. | Species | Nr annotation |
|---|---|---|
| 1 | Artemisia annua | cytochrome P450 mono-oxygenase (cyp03) |
| 1 | Artemisia annua | DELLA protein (DELLA) |
| 1 | Chrysanthemum × morifolium | ChlH mRNA for magnesium chelatase subunit H |
| 1 | Chrysanthemum × morifolium | DOF transcription factor 17 |
| 1 | Chrysanthemum × morifolium | GRAS protein (GRAS17) |
| 1 | Chrysanthemum × morifolium | GRAS protein (GRAS3) mRNA |
| 1 | Chrysanthemum × morifolium | HD-ZIP protein (HB15) |
| 1 | Chrysanthemum × morifolium | nitrate transporter 2.3 |
| 3 | Chrysanthemum × morifolium | trihelix protein (TH11) |
| 6 | Gymnocladus dioica | succinate dehydrogenase subunit 4 (sdh4) |
| 1 | Helianthus annuus | knotted-1-like protein 2 |
| 1 | Medicago truncatula | SPRY domain protein |
| 1 | Morus notabilis | Calcium-transporting ATPase 2 |
| 1 | Nicotiana tabacum | S-adenosyl-L-homocysteine hydrolase (SAHH3) |
| 1 | Arachis duranensis | tubulin alpha-4 chain |
| 1 | Beta vulgaris | alanine--tRNA ligase |
| 1 | Beta vulgaris | UDP-glucuronate 4-epimerase 6 |
| 1 | Beta vulgaris | zinc finger MYM-type protein 1-like |
| 1 | Brassica rapa | 1-aminocyclopropane-1-carboxylate synthase 5 |
| 1 | Brassica rapa | condensin complex subunit 2 |
| 1 | Camelina sativa | L-ascorbate oxidase homolog |
| 1 | Capsicum annuum | ABC transporter F family member 1 |
| 1 | Capsicum annuum | chaperone protein dnaJ 11 |
| 1 | Capsicum annuum | probable pectate lyase 8 |
| 1 | Citrus sinensis | F-box/kelch-repeat protein |
| 1 | Daucus carota | E3 ubiquitin-protein ligase UPL1 |
| 1 | Daucus carota | ESCRT-related protein CHMP1B |
| 1 | Daucus carota | inositol-tetrakisphosphate 1-kinase 1-like |
| 1 | Drosophila ficusphila | glycine-rich cell wall structural protein 1.0 |
| 1 | Erythranthe guttatus | DNA topoisomerase 2 |
| 1 | Fragaria vesca | transcription factor MYB46 |
| 1 | Gossypium hirsutum | heat shock protein-like |
| 1 | Jatropha curcas | ABC transporter C family member 4 |
| 1 | Malus × domestica | pectinesterase 3-like |
| 1 | Nicotiana sylvestris | probable methyltransferase PMT11 |
| 1 | Phoenix dactylifera | MYB44 |
| 1 | Ricinus communis | ATP synthase subunit a |
| 1 | Sesamum indicum | glycylpeptide N-tetradecanoyltransferase 1-like |
| 1 | Solanum tuberosum | transcription elongation factor SPT5 |

Comparison of the results of two analytical methods
-
Based on the DNA hypermethylation status, we calculated the percentage of CpG islands in the promoter regions of the homologous genes, and the results indicate that most of the genes might be regulated by DNA methylation (10%−80%). For example, CpG islands were identified in 40% of the promoter regions of the WRKY transcription factor genes, 62.29% of those of the GRAS protein homologs, 38.13% of those of the DOF transcription factor family, 100% of those of the LEAFY gene and 100% of those of the DELLA protein (Table 3). Earlier studies found that demethylation of the CpG island in the LEAFY promoter region is involved in the regulation of LEAFY expression during development[51], which agrees with our results. The LEAFY gene was therefore identified as a gene that is highly likely to be regulated by DNA methylation.
Table 3. Summary and analysis of key candidate genes in C. lavandulifolium.
Columns 2 to 5 give the type of methylation analysis.

| Genes | CpG islands | Promoter | mRNA | Enrichment of differences |
|---|---|---|---|---|
| WRKY transcription factors family | 40.00% | Yes | Yes | Yes |
| GRAS protein family | 61.29% | − | Yes | Yes |
| DOF transcription factors family | 38.13% | − | Yes | Yes |
| DELLA protein family | 100.00% | | Yes | Yes |
| LEAFY | 100.00% | Yes | − | − |
| FT | 26.32% | | Yes | − |
| GI | 31.03% | | Yes | − |
| PHYA | 60.00% | | Yes | − |

The sequence analysis approach rests on the possibility that sequences with specific features may be regulated by DNA methylation, whereas the MBD protein enrichment technique was used to detect the methylation status of the sequences. The genes selected by the two approaches corroborate each other to some extent. Because DNA methylation plays a key role in floral induction, the role of the genes in this gene pool requires further confirmation through experimental analysis.
Expression analysis of candidate genes by RT-PCR
-
During the seven days of short-day induction, almost all the detected genes were differentially expressed, but the timing or pattern of expression differed among genes. For example, the ClLS gene was expressed only on D3 (day 3), ClFCA was expressed only on D3 and D4, and ClFLC, ClCOL4, ClGAI and ClMET were expressed from D3 to D5. ClDOF was expressed only between D2 and D5, and ClROS1 was expressed on D0, D4 and D5 (Fig. 3a).
Figure 3.
Expression of candidate genes during floral induction of Chrysanthemum lavandulifolium. (a) Dynamic changes of differentially expressed genes. (b) Differential expression screening of the WRKY gene family by RT-PCR. (c) Relative gene expression during floral induction of C. lavandulifolium. (d) Gene expression and DNA methylation status markers of ClWRKY21; the red arrows represent the disappearance of DNA methylation at that stage.
The ClDELLA and ClFTL genes were not expressed at the initial stage of floral induction but were highly expressed at later stages. The DELLA protein gene was not expressed from D0 to D2 but was highly expressed between D3 and D7, indicating that the DELLA protein played an important role in inducing flower formation from D3 to D7. The ClFTL gene was expressed on only two days, D6 and D7 (Fig. 3a). The expression of the ClFRI, ClDEMETER, ClCRY1b, ClLSL and ClPIE genes was downregulated during floral induction; in other words, they were highly expressed in the early stage of floral induction but not in the late stage of SD induction. ClFTL, ClCMT, ClDML, ClLHY and other genes showed no significant difference in expression during short-day induction (Fig. 3a).
In addition, the expression of WRKY gene family members was analyzed by RT-PCR. Among them, ClWRKY14, ClWRKY56, ClWRKY21-4 and ClWRKY15-2 were expressed only in the middle stage of floral induction, ClWRKY10 was downregulated, and ClWRKY12-2 and ClWRKY13 showed no obvious pattern (Fig. 3b). Members of the WRKY gene family thus show different expression patterns during floral induction, and different members may play different roles in the flowering induction process.
We performed a semi-quantitative analysis of the expression of 30 genes, including five DNA methylation-related genes, 14 flowering genes and 11 members of the WRKY gene family. The differentially expressed genes differed before and after floral induction, and their expression patterns differed during the floral induction period. Different genes showed high expression at different stages of floral induction (early, mid, late), while some genes were highly expressed throughout the floral induction period with no obvious differences. Five DNA methylation-related genes (ClMET, ClDEMETER, ClCMT, ClDML, and ClROS1) were differentially expressed, indicating that DNA methylation plays an important role in the flowering induction process of C. lavandulifolium. Different members of the WRKY gene family showed similar patterns during the floral induction of C. lavandulifolium. Considering that DNA methylation needs some response time to affect flowering, more detailed studies should be carried out on the genes that are highly expressed during floral induction.
qRT-PCR expression analysis of candidate flowering genes
-
qRT-PCR was used to analyze the expression of nine key candidate genes. All nine genes were differentially expressed during flowering induction, with expression first increasing and then decreasing in the later stages. Three of the genes (ClGRAS, ClLSL, ClWRKY12) had similar dynamic expression changes, and the remaining six genes showed very similar expression trends during the induction of flowering, with expression peaks all appearing at D5 (Fig. 3c).
In the early stage of induction, the expression of these six genes gradually increased; it peaked at day 5 after induction (the middle and late stages of floral induction) and then decreased sharply (late stages of flowering induction). Such genes play an important role in floral induction and flower development and are likely to be key regulatory factors in the floral induction pathway of C. lavandulifolium. The regulatory mechanism of DNA methylation in these genes should be further explored.
Candidate gene promoter region methylation analysis (MSP)
-
After the sample was treated with bisulfite, a site was considered methylated if the fragment was amplified by the methylation-specific primer (Primer-M), and unmethylated if the fragment was amplified by the unmethylation-specific primer (Primer-U). For example, ClWRKY12 was amplified only with the unmethylation-specific primers, indicating a lack of DNA methylation. If both primer sets amplified the fragment, as for ClFT, DFL, and ClMET, the methylated and unmethylated states coexist across the genome. Because the amplification of the methylated fragments changed during floral induction, the methylation state of ClWRKY17 and ClWRKY21 changes dynamically (Table 4).
Table 4. MSP results and DNA methylation status analysis.
Columns D0 to D6 give the flowering induction stage.

| Type | Gene name | D0 | D1 | D2 | D3 | D4 | D5 | D6 | Conclusion |
|---|---|---|---|---|---|---|---|---|---|
| Quantitative change | ClFT-M | + | + | + | + | + | + | + | Both |
| | ClFT-U | + | + | + | + | + | + | + | |
| | DFL-M | + | + | + | + | + | + | + | Both |
| | DFL-U | + | + | + | + | + | + | + | |
| | ClMET-M | + | + | + | + | + | + | + | Both |
| | ClMET-U | + | + | + | + | + | + | + | |
| Qualitative change | ClWRKY12-M | × | × | × | × | × | × | × | Unmethylated |
| | ClWRKY12-U | + | + | + | + | + | + | + | |
| | ClWRKY17-M | + | + | + | + | + | × | + | Dynamic change |
| | ClWRKY17-U | + | + | + | + | + | + | + | |
| | ClWRKY21-M | + | + | × | + | + | + | × | Dynamic change |
| | ClWRKY21-U | + | + | + | + | + | + | + | |

According to the results, the methylated and unmethylated fragments of ClFT, DFL and ClMET coexist; in other words, only the amount of methylation changes during floral induction, so further quantitative analysis is needed.
ClWRKY12 remained unmethylated throughout the floral induction process, and its methylation state did not change. The methylation states of ClWRKY17 and ClWRKY21 changed during floral induction: that of ClWRKY17 changed at D5, while that of ClWRKY21 changed twice, at D2 and D6. ClWRKY17 and ClWRKY21 were demethylated during floral induction, leading to high gene expression (Table 4).
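The interpretation of the MSP band patterns can be sketched as a small rule set. The code below encodes the M and U rows of Table 4 as strings ("+" for a band, "x" standing in for the × symbol) and derives a conclusion per gene; the function names and the label "no product" are hypothetical, and the rule is an illustration of the reading used in Table 4 rather than the authors' own procedure.

```python
def parse_bands(row):
    """Turn a string such as '++x++' into a list of booleans (True = band present)."""
    return [symbol == "+" for symbol in row]


def msp_call(m_band, u_band):
    """Interpret one time point from the Primer-M and Primer-U results."""
    if m_band and u_band:
        return "both"
    if m_band:
        return "methylated"
    if u_band:
        return "unmethylated"
    return "no product"


def summarize(m_row, u_row):
    """Return (conclusion, days on which only the U band was amplified)."""
    calls = [msp_call(m, u) for m, u in zip(parse_bands(m_row), parse_bands(u_row))]
    conclusion = calls[0] if len(set(calls)) == 1 else "dynamic change"
    u_only_days = [day for day, call in enumerate(calls) if call == "unmethylated"]
    return conclusion, u_only_days


# Band patterns for D0 to D6, transcribed from Table 4 ("x" stands for ×).
TABLE4 = {
    "ClFT": ("+++++++", "+++++++"),
    "ClWRKY12": ("xxxxxxx", "+++++++"),
    "ClWRKY17": ("+++++x+", "+++++++"),
    "ClWRKY21": ("++x+++x", "+++++++"),
}

if __name__ == "__main__":
    for gene, (m_row, u_row) in TABLE4.items():
        print(gene, summarize(m_row, u_row))
```

With these inputs, the M band of ClWRKY17 is missing at D5 and that of ClWRKY21 at D2 and D6, matching the time points described above.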
In combination with the expression of the ClWRKY21 gene, the methylation state of the ClWRKY21 gene may have changed twice during the floral induction process (Fig. 3d).
While the expression level of the ClFT gene changed, its methylation level remained stable within a certain range without significant change. The DFL gene remained methylated across all stages; however, its methylation level was higher before short-day exposure and lower afterwards. The expression of the DFL gene may therefore be regulated through changes in its DNA methylation level (Fig. 4a). No correspondence was found between the expression of the ClMET gene and its methylation level, so the change in ClMET expression may be unrelated to the change in its DNA methylation level. ClMET is an important DNA methyltransferase gene (Fig. 4a). Although this gene is not regulated by DNA methylation during floral induction, the difference in its expression level is likely to regulate the methylation of other genes and thereby contribute to flowering regulation[37]. However, there is no clear evidence of a relationship between the MET and DFL genes, and the high expression of MET may regulate other flowering inhibitors.
Figure 4.
The result of DNA methylation of key genes and construction of a floral induction network of C. lavandulifolium. (a) The expression level and methylation level of key flowering genes. (b) Role of the ClWRKY21 gene in floral induction of C. lavandulifolium.
Role sites of the ClWRKY21 floral induction regulation network
-
Based on the above results, the ClMET, ClWRKY21, DFL and ClFT genes all play a key role in the floral induction of C. lavandulifolium. Specifically, these genes are differentially expressed during floral induction, with the highest expression at D5, the critical stage. There was no significant change in the DNA methylation level in the ClFT promoter region during floral induction, while the DNA methylation level in the DFL gene promoter region decreased gradually during short-day induced flowering (Fig. 4a). DNA methylation in the promoter region of ClWRKY21 disappeared twice during the flowering process, suggesting that ClWRKY21 is a key factor regulated by DNA methylation in the floral induction of C. lavandulifolium in response to short days. The dynamic changes in DNA methylation of the above genes may be affected by the expression level of the DNA methyltransferase gene ClMET (Fig. 4b). However, ClMET is a flowering suppressor, and early flowering can be achieved by silencing the gene. In this study, the ClMET expression data only indicate that the gene is highly expressed in the flowering induction stage, and further research is needed on the downstream target genes of ClMET.
-
During floral induction, a complex DNA methylation regulation mechanism is activated. We constructed a system for screening groups of genes regulated by DNA methylation and obtained such gene groups in the C. lavandulifolium floral induction pathway. This approach provides an effective method for related epigenetic studies in other species without a reference genome. Based on the gene groups determined to be regulated by DNA methylation, this study supplemented the existing flowering regulation network with the genes regulated by DNA methylation (Fig. 6). Based on the qRT-PCR and MSP results, it was verified that the DNA methylation changes observed in ClWRKY21 and DFL lead to their differential expression and thus regulate the flowering process of C. lavandulifolium.
-
About this article
Cite this article
Kang D, Dai S, Wang Z. 2022. MBD protein recognizes flower control genes regulated by DNA methylation in Chrysanthemum lavandulifolium. Ornamental Plant Research 2:3 doi: 10.48130/OPR-2022-0003
MBD protein recognizes flower control genes regulated by DNA methylation in Chrysanthemum lavandulifolium
- Received: 30 September 2021
- Accepted: 25 January 2022
- Published online: 24 February 2022
Abstract: Dynamic changes in DNA methylation regulate the expression of genes and play important roles, especially in the flowering processes of higher plants. Methyl-CpG-binding domain (MBD) proteins specifically recognize hypermethylated regions in the genome; thus, MBD sequencing technology and CpG island analysis of the sequences were used to identify candidate genes regulated by DNA methylation, in particular during the flowering induction stage of Chrysanthemum lavandulifolium. MBD-seq identified 89 candidate genes, which included 49 genes exhibiting changes in DNA methylation status during floral induction. Based on CpG island analysis of the sequences, 27 candidate genes were selected that may be regulated by DNA methylation. The expression levels of 30 candidate genes and nine key genes were determined by RT-PCR and qRT-PCR during floral induction (7D), and four genes (ClFT, ClMET, DFL and ClWRKY21) were similarly up-regulated. Methylation-specific PCR analysis also indicated changes in the DNA methylation status of DFL and ClWRKY21. The changes in DNA methylation status during the induction phase of flowering may lead to changes in gene expression. In this study, a set of genes proposed to be involved in floral induction was identified, together with two key genes (DFL, ClWRKY21) that were regulated by DNA methylation during the flowering process of C. lavandulifolium.
-
Key words:
- DNA Methylation
- MBD
- CpG Islands
- Floral Induction
- Chrysanthemum lavandulifolium
- Gene Screening