The six AtCYP78A proteins were used as queries to search the eggplant genome with BlastP, which identified six putative SmCYP78A genes. Pfam and SMART confirmed the presence of a P450 domain in each of the six putative SmCYP78A proteins, indicating that they are members of the eggplant CYP78A family (Table 1).
Table 1. Summary information of CYP78A family genes in eggplant, Arabidopsis, rice and tomato.
| Gene ID | Gene name | Chr | Start | End | Length (aa) | MW (kDa) | pI | PSL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Smechr0101302 | CYP78A6 | Chr1 | 12640581 | 12643245 | 538 | 60.35 | 8.90 | ER |
| Smechr0302971 | CYP78A5 | Chr3 | 89886800 | 89889184 | 516 | 58.19 | 6.79 | ER |
| Smechr0400048 | CYP78A7 | Chr4 | 420566 | 422920 | 525 | 58.98 | 9.23 | ER |
| Smechr0500049 | CYP78A8 | Chr5 | 783272 | 785346 | 528 | 59.76 | 9.30 | ER |
| Smechr0502180 | CYP78A9 | Chr5 | 74969262 | 74971177 | 537 | 60.55 | 7.52 | ER |
| Smechr1100733 | CYP78A10 | Chr11 | 10534819 | 10536967 | 551 | 61.70 | 8.23 | ER |
| AT1G01190 | CYP78A8 | Chr1 | 83045 | 84946 | 541 | 60.91 | 8.22 | ER |
| AT1G13710 | CYP78A5 | Chr1 | 4702657 | 4704694 | 518 | 57.64 | 8.57 | ER |
| AT1G74110 | CYP78A10 | Chr1 | 27866667 | 27868368 | 538 | 60.18 | 7.84 | ER |
| AT2G46660 | CYP78A6 | Chr2 | 19153328 | 19155579 | 531 | 59.57 | 8.29 | ER |
| AT3G61880 | CYP78A9 | Chr3 | 22905868 | 22907958 | 556 | 62.62 | 9.04 | ER |
| AT5G09970 | CYP78A7 | Chr5 | 3111945 | 3114239 | 537 | 59.49 | 6.69 | ER |
| LOC_Os10g26340 | CYP78A11/PLA1 | Chr10 | 13658790 | 13660543 | 556 | 59.08 | 7.06 | ER |
| LOC_Os11g29720 | CYP78A5 | Chr11 | 17234285 | 17238178 | 539 | 59.64 | 10.19 | ER |
| LOC_Os03g04190 | CYP78A9 | Chr3 | 1920043 | 1921896 | 516 | 55.80 | 8.10 | ER |
| LOC_Os03g30420 | CYP78A6/GL3.2 | Chr3 | 17340415 | 17342284 | 516 | 56.05 | 9.00 | ER |
| LOC_Os03g40600 | CYP78A7 | Chr3 | 22567670 | 22568685 | 194 | 20.29 | 7.99 | ER |
| LOC_Os03g40610 | CYP78A8 | Chr3 | 22572706 | 22574008 | 308 | 32.84 | 10.12 | ER |
| LOC_Os07g41240 | CYP78A13/GE | Chr7 | 24713778 | 24715813 | 526 | 55.89 | 8.68 | ER |
| LOC_Os08g43390 | CYP78A15/BSR2 | Chr8 | 27420501 | 27422836 | 552 | 59.79 | 9.38 | ER |
| LOC_Os09g35940 | CYP78A10 | Chr9 | 20691306 | 20693116 | 554 | 60.74 | 9.07 | ER |
| Solyc01g096280 | CYP78A6 | Chr1 | 79622266 | 79624618 | 539 | 61.05 | 8.60 | ER |
| Solyc03g114940 | CYP78A5/KLUH | Chr3 | 59217389 | 59219730 | 517 | 58.38 | 6.21 | ER |
| Solyc05g015350 | CYP78A8 | Chr5 | 10475028 | 10479267 | 332 | 37.85 | 6.08 | ER |
| Solyc05g047680 | CYP78A7 | Chr5 | 58506390 | 58508292 | 532 | 60.49 | 9.20 | ER |
| Solyc10g009310 | CYP78A9 | Chr10 | 3224421 | 3227123 | 526 | 61.64 | 9.05 | ER |
| Solyc12g056810 | CYP78A10 | Chr12 | 62510941 | 62512676 | 537 | 60.71 | 7.50 | ER |

Chr, Start and End give the genomic location; Length, MW and pI describe the deduced polypeptide; PSL, predicted subcellular localization.

Information on the six SmCYP78As, including chromosomal location, number of amino acids (length), pI, MW and predicted subcellular localization (PSL), is listed in Table 1.
To gain further insight into the CYP78A family in plants, information on the CYP78As from Arabidopsis, rice and tomato is also included in Table 1. The SmCYP78A proteins range in length from 516 (SmCYP78A5) to 551 amino acids (SmCYP78A10), in pI from 6.79 (SmCYP78A5) to 9.30 (SmCYP78A8) and in MW from 58.19 (SmCYP78A5) to 61.70 kDa (SmCYP78A10). Interestingly, rice CYP78A7 (194 aa) and CYP78A8 (308 aa) and tomato CYP78A8 (SlCYP78A8; 332 aa) are much shorter than the other CYP78As from Arabidopsis, rice, tomato and eggplant. It would be interesting to know whether these three short CYP78A proteins function similarly to the other CYP78As. Notably, all CYP78As were predicted to localize to the endoplasmic reticulum (ER), in agreement with the in vivo subcellular localizations of AtCYP78A5 and TaCYP78A5[4,16].
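The ranges quoted above can be checked directly against the eggplant rows of Table 1. The following is a minimal sketch; the dictionary layout and the helper name `extremes` are illustrative choices, not part of the original analysis.

```python
# Values copied from the eggplant rows of Table 1: (length in aa, MW in kDa, pI).
sm_cyp78a = {
    "SmCYP78A5": (516, 58.19, 6.79),
    "SmCYP78A6": (538, 60.35, 8.90),
    "SmCYP78A7": (525, 58.98, 9.23),
    "SmCYP78A8": (528, 59.76, 9.30),
    "SmCYP78A9": (537, 60.55, 7.52),
    "SmCYP78A10": (551, 61.70, 8.23),
}


def extremes(column):
    """Return (gene with the minimum, gene with the maximum) for one column index."""
    lowest = min(sm_cyp78a, key=lambda gene: sm_cyp78a[gene][column])
    highest = max(sm_cyp78a, key=lambda gene: sm_cyp78a[gene][column])
    return lowest, highest


for label, column in (("length", 0), ("MW", 1), ("pI", 2)):
    lowest, highest = extremes(column)
    print(f"{label}: {lowest} ({sm_cyp78a[lowest][column]}) "
          f"to {highest} ({sm_cyp78a[highest][column]})")
```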
To elucidate the evolutionary relationships of the SmCYP78As, an unrooted neighbor-joining (NJ) phylogenetic tree was constructed from the full-length protein sequences of the six AtCYP78As, nine OsCYP78As, six SlCYP78As and six SmCYP78As. The resulting tree contained five distinct clades (C1-C5) (Fig. 1a). C1 and C4 were shared by all four species. C1 contained an equal number of CYP78A proteins from eggplant, Arabidopsis, tomato and rice (Fig. 1a). C4 contained two SmCYP78As and one CYP78A each from tomato, Arabidopsis and rice, indicating an expansion of SmCYP78As in this clade relative to the other three species. C2 and C3 did not contain SmCYP78As. C2 was shared by rice and tomato but absent from Arabidopsis and eggplant, suggesting that the CYP78As in C2 have unique roles that were likely acquired or expanded in tomato and rice after divergence from the last common ancestor with eggplant and Arabidopsis. Notably, C3 contained only rice CYP78As, and OsCYP78A8 did not fit into any clade; these genes may have evolved after divergence and may have specialized roles in rice. Moreover, C5 included members from eggplant, tomato and Arabidopsis but no rice CYP78As, suggesting that the CYP78As in C5 may have been lost in rice during evolution.
Figure 1.
Phylogenetic relationships, gene structures and conserved protein motifs of CYP78A genes from eggplant, Arabidopsis, rice and tomato. (a) The phylogenetic tree was constructed from the full-length protein sequences of six AtCYP78As, nine OsCYP78As, six SlCYP78As and six SmCYP78As using MEGA 7.0. Eggplant, Arabidopsis, rice and tomato CYP78As are labeled with red, black, pink and green dots, respectively. (b) Exon-intron structures of CYP78A genes. Black lines indicate introns. Numbers indicate the phases of the corresponding introns. (c) Motif composition of CYP78A proteins. Motifs 1–10 are displayed as boxes of different colors. The sequence logos and E values for each motif are given in Supplemental Fig. S1.
Gene structure analysis showed that exon number was conserved among the 27 CYP78A genes: most contain two exons, except OsCYP78A7 and AtCYP78A9, which contain one and three exons, respectively (Fig. 1b). In addition, all introns in the CYP78A genes are phase 0 (Fig. 1b), further suggesting that CYP78A genes have been highly conserved during the evolution of the four plants.
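For readers unfamiliar with intron phase, it is the position of an intron relative to the reading frame: the number of coding bases upstream of the intron, modulo 3. A phase 0 intron therefore falls between two intact codons. The sketch below illustrates the definition only; the function name and the exon lengths are hypothetical and are not taken from the CYP78A gene models.

```python
def intron_phases(cds_exon_lengths):
    """Return the phase of each intron, given coding-exon lengths in 5'->3' order.

    Phase 0: the intron lies between two codons.
    Phase 1: after the first base of a codon; phase 2: after the second base.
    """
    phases = []
    coding_bases = 0
    for length in cds_exon_lengths[:-1]:  # one intron follows every exon but the last
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


# Hypothetical exon lengths, made up for illustration.
print(intron_phases([900, 654]))  # [0]  two exons, phase 0 intron
print(intron_phases([901, 653]))  # [1]  same CDS length, shifted boundary
print(intron_phases([1554]))      # []   single exon, no intron
```

A gene with a single coding exon has no intron and hence no phase to report.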
Ten conserved motifs were identified in the 27 CYP78A proteins using MEME (Fig. 1c; Supplemental Fig. S1). Twenty-four CYP78A proteins contain all 10 motifs, with motifs 6, 2, 9, 7 and 10 at the N terminus and motifs 1, 5, 3, 4 and 8 at the C terminus (Fig. 1c), suggesting that these CYP78As have similar functions. The three shortest CYP78A proteins lack some motifs (Fig. 1c): SlCYP78A8 lacks motifs 2, 6 and 10; OsCYP78A7 lacks motifs 1, 6, 2, 7, 9 and 10; and OsCYP78A8 lacks motifs 1, 5, 3, 4 and 8. Further studies are required to determine how these motifs contribute to the functions of CYP78As.
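The motif losses reported above are easier to read as the set of motifs each short protein retains. A small sketch, using only the motif numbers given in the text (the variable and function names are illustrative):

```python
# Motif order reported for the 24 full-length proteins (Fig. 1c).
N_TERMINAL = [6, 2, 9, 7, 10]
C_TERMINAL = [1, 5, 3, 4, 8]

# Motifs reported as absent from the three short proteins.
missing = {
    "SlCYP78A8": {2, 6, 10},
    "OsCYP78A7": {1, 6, 2, 7, 9, 10},
    "OsCYP78A8": {1, 5, 3, 4, 8},
}


def retained(protein):
    """Return the retained motifs, split into N- and C-terminal groups."""
    gone = missing[protein]
    return ([m for m in N_TERMINAL if m not in gone],
            [m for m in C_TERMINAL if m not in gone])


for protein in missing:
    n_part, c_part = retained(protein)
    print(protein, "N-terminal:", n_part, "C-terminal:", c_part)
```

Read this way, OsCYP78A7 keeps only C-terminal motifs (5, 3, 4 and 8), OsCYP78A8 keeps only the N-terminal motifs, and SlCYP78A8 keeps motifs 9 and 7 plus the full C-terminal set.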
Synteny analysis of CYP78A genes from eggplant, Arabidopsis, tomato and rice
The six SmCYP78A genes were mapped to five chromosomes: E01, E03, E04, E05 and E11 (Fig. 2a). Interestingly, all six SmCYP78A genes were located near the ends of their chromosomes. Similar locations of CYP78As were also found in the Arabidopsis, tomato and rice genomes (Fig. 2). Synteny analysis of the eggplant genome was performed using MCScanX to identify duplication events among the SmCYP78As. Only one gene pair, SmCYP78A6 and SmCYP78A7, was identified in the eggplant genome, indicating that segmental duplication contributed to the expansion of the CYP78A family in eggplant.
Figure 2.
Gene duplication and synteny analysis of SmCYP78A genes. (a) Schematic representation of the chromosomal distribution and interchromosomal relationships of SmCYP78A genes. Gray lines indicate all synteny blocks in the eggplant genome, and red lines indicate segmentally duplicated SmCYP78A gene pairs. (b) Synteny analysis of CYP78A genes between eggplant and Arabidopsis. (c) Synteny analysis of CYP78A genes between eggplant and tomato. (d) Synteny analysis of CYP78A genes between eggplant and rice. Gray lines in the background indicate the collinear blocks between genomes, and red lines highlight the syntenic blocks harboring CYP78A gene pairs.
Comparative synteny analyses were performed between the eggplant genome and the genomes of Arabidopsis, tomato and rice. Three (SmCYP78A5, SmCYP78A6 and SmCYP78A7), four (SmCYP78A5, SmCYP78A6, SmCYP78A8 and SmCYP78A9) and one (SmCYP78A5) SmCYP78A genes showed syntenic relationships with genes in Arabidopsis, tomato and rice, respectively (Fig. 2). Interestingly, SmCYP78A6 and SmCYP78A7 were syntenic with three Arabidopsis CYP78A genes (AtCYP78A6, AtCYP78A8 and AtCYP78A9). Notably, SmCYP78A5 showed syntenic relationships with AtCYP78A10, SlCYP78A5/KLUH and OsCYP78A13/GE, suggesting that these orthologous pairs likely existed before the ancestral divergence and may have conserved functions.
Expression profiles for SmCYP78A genes in different tissues
Real-time quantitative RT-PCR was used to detect the expression of the six SmCYP78A genes in roots, stems, leaves, young flower buds, petals, sepals, pericarp and fruit flesh. The six genes showed distinct tissue-specific expression patterns and relatively low expression in most tissues (Fig. 3a). SmCYP78A5, SmCYP78A7, SmCYP78A8, SmCYP78A9 and SmCYP78A10 were specifically expressed in young flower buds, roots, petals, roots and stems, respectively (Fig. 3a). SmCYP78A6 showed high transcript abundance in roots and pericarp (Fig. 3a). These different expression patterns suggest that the SmCYP78As have distinct roles in physiological and developmental processes.
Figure 3.
Expression profiles of the six SmCYP78A genes in different tissues. (a) Relative transcript abundances of the SmCYP78A genes examined by qRT-PCR. (b) Expression of the SmCYP78A genes in young flower buds and developing ovaries detected by RNA-seq. Rt, Root; St, Stem; Le, Leaf; Pet, Petal; Se, Sepal; Per, Pericarp; FF, Fruit flesh; YB, Young flower buds; DBA, Days before anthesis; DPA, Days post anthesis.
Considering the important roles of CYP78A genes in fruit development and the fact that fruit size is largely determined at early developmental stages, we analyzed RNA-seq data from young flower buds and developing ovaries in eggplant. The six SmCYP78A genes showed different expression patterns in developing ovaries (Fig. 3b). SmCYP78A5 and SmCYP78A10 were highly expressed, whereas the other four SmCYP78A genes were barely expressed in young flower buds and developing ovaries (Fig. 3b). SmCYP78A5 expression was highest in young flower buds, decreased gradually as the ovary developed and was undetectable at 0 DPA (Fig. 3b). Interestingly, the expression pattern of SmCYP78A5 in developing ovaries resembled that of tomato KLUH, its closest ortholog[11], suggesting conserved roles in regulating fruit size. SmCYP78A10 was abundantly expressed in developing ovaries, with peak expression at 10 DBA and very low expression at 0 DPA (Fig. 3b).
Analysis of co-expression genes of SmCYP78A5 and SmCYP78A10
The high expression of SmCYP78A5 and SmCYP78A10 in developing ovaries (Fig. 3b) suggested that they play important roles in regulating fruit development in eggplant. To gain further insight into their functions, co-expression analysis was performed using fuzzy C-means clustering. Twelve co-expression clusters were identified, with Clusters 6 and 11 containing SmCYP78A10 and SmCYP78A5, respectively (Fig. 4; Supplemental Table S2).
Figure 4.
Twelve co-expression clusters identified by fuzzy C-means clustering in Mfuzz using normalized expression values (z-scores). Red lines represent the average expression values, and gray lines represent the expression values of the co-expressed genes. YB, Young flower buds; DBA, Days before anthesis; DPA, Days post anthesis.
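To make the clustering step concrete, the sketch below shows the two ingredients named in the caption: per-gene z-score standardization and the standard fuzzy C-means membership formula. It is a pure-Python illustration, not the Mfuzz implementation. The expression values, the two cluster centers and the fuzzifier m = 2 are assumptions chosen for the example, and the centers are fixed here rather than fitted iteratively.

```python
import math


def zscore(profile):
    """Standardize one gene's expression profile to mean 0 and sample SD 1."""
    mean = sum(profile) / len(profile)
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / (len(profile) - 1))
    return [(x - mean) / sd for x in profile]


def memberships(profile, centers, m=2.0):
    """Fuzzy C-means membership of one profile in each cluster center.

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)), where d is the Euclidean
    distance to a center and m > 1 is the fuzzifier. Memberships sum to 1.
    """
    dists = [math.dist(profile, center) for center in centers]
    zero_hits = dists.count(0.0)
    if zero_hits:  # profile coincides with at least one center
        return [1.0 / zero_hits if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]


# Hypothetical expression values at four stages (e.g. YB, 10 DBA, 7 DBA, 0 DPA).
z = zscore([2.0, 8.0, 7.0, 1.0])

# Hypothetical cluster centers: "high in buds" and "high mid-development".
centers = [[1.5, -0.5, -0.5, -0.5], [-0.8, 0.9, 0.8, -0.9]]
u = memberships(z, centers)
print([round(value, 3) for value in u])
```

Unlike hard clustering, each gene receives a graded membership in every cluster, so a gene with an intermediate profile is not forced into a single group.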
Cluster 6 comprised genes that were expressed at higher levels in ovaries at 10 DBA and 7 DBA than in young flower buds and ovaries at 0 DPA (Fig. 4). Cluster 6 was significantly enriched in genes involved in cellular processes such as 'Cell cycle process', 'Organelle fission' and 'Microtubule-based process' (Fig. 5). Genes involved in these processes included SmCYP78A10 and putative orthologs of Arabidopsis SUN1, TON1, TUA6 and NEK1 (Fig. 5). Genes in Cluster 11 showed their highest expression in young flower buds and low expression in developing ovaries (Fig. 4). Cluster 11 was enriched in genes involved in photosynthesis-related processes, including 'photosynthesis', 'carbon fixation' and 'response to high light intensity'. Genes involved in these processes included NDFs, PRK and PPH1 (Fig. 5). The GO enrichment analysis suggested that SmCYP78A10 and SmCYP78A5 likely regulate fruit development through different mechanisms.
Figure 5.
Significantly enriched GO terms (biological process) of co-expression genes in (a) Cluster 6 and (b) Cluster 11. Only the top five enriched GO terms are shown. The color of lines represents different GO terms.
Since transcription factors (TFs) are the main regulators of gene expression, we searched for TFs in the two clusters. Cluster 6 harbored 77 TFs (7.60%), which were classified into 29 families (Fig. 6a; Supplemental Table S3). The 10 most abundant TF families in Cluster 6 were HB (8), GRAS (6), MYB (5), bHLH (5), B3 (5), ERF (4), zf-HD (3), NAC (3), MYB-related (3) and GRF (3) (Fig. 6a; Supplemental Table S3). Cluster 11 contained 63 TFs (6.52%), mainly from the HB (8), bHLH (5), MIKC (4), MYB (4), NF-YA (4), bZIP (3), C2C2-CO-like (3), C2C2-YABBY (3), C3H (3) and HSF (3) families (Fig. 6b; Supplemental Table S4). Interestingly, HB, MYB and bHLH TFs were found in both Clusters 6 and 11, suggesting that TFs from these families might play important roles in regulating the expression of CYP78As in eggplant.
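The overlap between the two clusters can be reproduced from the family counts quoted above. This is a minimal sketch restricted to the ten families listed for each cluster, so families shared outside these lists would not appear; the variable names are illustrative.

```python
# TF family counts as listed in the text for each cluster (top ten only).
cluster6 = {"HB": 8, "GRAS": 6, "MYB": 5, "bHLH": 5, "B3": 5,
            "ERF": 4, "zf-HD": 3, "NAC": 3, "MYB-related": 3, "GRF": 3}
cluster11 = {"HB": 8, "bHLH": 5, "MIKC": 4, "MYB": 4, "NF-YA": 4,
             "bZIP": 3, "C2C2-CO-like": 3, "C2C2-YABBY": 3, "C3H": 3, "HSF": 3}

# Families present in both lists.
shared = sorted(cluster6.keys() & cluster11.keys())
print(shared)  # ['HB', 'MYB', 'bHLH']

# Number of TFs covered by the listed families, out of 77 and 63 in total.
print(sum(cluster6.values()), sum(cluster11.values()))  # 45 40
```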
Figure 6.
Overview of distribution of TF families that were co-expressed with (a) SmCYP78A10 in Cluster 6 and (b) SmCYP78A5 in Cluster 11. The Plant Transcription Factor Database v5.0 (http://planttfdb.gao-lab.org) was used to identify TFs in the eggplant genome.
Transcription factor binding site analysis in the promoters of SmCYP78As
To gain further insight into the transcriptional regulation of SmCYP78A5 and SmCYP78A10, we scanned the 1.5 kb regulatory region upstream of the ATG of each gene (Supplemental Table S5) for transcription factor binding sites (TFBSs) using PlantRegMap. Interestingly, two HB TFs, Smechr0402062 and Smechr0101299, which are co-expressed with SmCYP78A5 in Cluster 11, were predicted to directly target SmCYP78A5 (Table 2). SmCYP78A10 was identified as a candidate target of Smechr0402092 (AP2), Smechr0902218 (cysteine-rich polycomb-like protein, CPP), Smechr0801604 (MYB) and Smechr0201168 (TCP), all of which are co-expressed with SmCYP78A10 in Cluster 6. Orthologs of some of these TFs are known from other studies to be involved in organ size regulation in plants. For example, AINTEGUMENTA (ANT), the Arabidopsis ortholog of Smechr0402092, has been demonstrated to be a positive regulator of organ size that stimulates cell proliferation and modulates auxin biosynthesis[35,36]. Smechr0902218 encodes a CPP TF closely related to Arabidopsis TCX2/SOL2, which has been reported to regulate both cell fate and cell division[37,38]. Smechr0201168 is a putative ortholog of Arabidopsis TCP20, which has been proposed to control cell division and growth by directly binding to the GCCCR element in the promoter of the cyclin gene CYCB1;1[39]. In addition, studies in Arabidopsis have shown that HB TFs play important roles in regulating organ size[40,41]. Therefore, these TFs may regulate eggplant fruit development by directly binding the promoters of CYP78As.
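As an illustration of what scanning a promoter for a binding element involves, the sketch below searches both strands of a sequence for an IUPAC consensus such as GCCCR (R = A or G). This is a simplified stand-in: it performs exact consensus matching and is not the scoring procedure used by PlantRegMap. The function names and the toy promoter sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]",
         "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def scan(promoter, motif):
    """Return (position, strand, matched site) for every consensus hit.

    Positions are 0-based offsets on the forward strand; a minus-strand hit is
    reported at the forward-strand coordinate of its leftmost base.
    """
    # The lookahead allows overlapping matches to be reported.
    pattern = re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(promoter)]
    for m in pattern.finditer(reverse_complement(promoter)):
        start = len(promoter) - m.start() - len(motif)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


toy_promoter = "ATGCCCATTTCGGGCAA"  # hypothetical sequence, not from the paper
print(scan(toy_promoter, "GCCCR"))
# [(2, '+', 'GCCCA'), (10, '-', 'GCCCG')]
```

Reporting the strand alongside each hit mirrors the Strand column of Table 2, where predicted sites are listed on either the + or − strand.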
Table 2. Candidate transcription factors binding promoters of SmCYP78As identified by PlantRegMap.
| Gene ID | TF family | Arabidopsis ortholog | Binding sequence | Strand | P value |
| --- | --- | --- | --- | --- | --- |
| Smechr0402062 | HB | AT4G08150 | CACTTCCCTTCTCTCTCTCT | + | 1.71E-05 |
| Smechr0101299 | HB | AT2G46680 | TCATTTATTGAAC | − | 9.07E-05 |
| | | | GGAATGATTGTAA | − | 9.88E-05 |
| Smechr0402092 | AP2 | AT4G37750 | CATCACAAATTCCAAAATCCC | + | 2.73E-05 |
| | | | AAACACTCTCCCCCACGTATA | − | 7.73E-05 |
| Smechr0902218 | CPP | AT4G14770 | TAAAATTTTAAAA | − | 7.34E-05 |
| | | | TGAAATTTAAAAA | − | 8.37E-05 |
| | | | TCAAATTTAAAAA | + | 8.47E-05 |
| Smechr0801604 | MYB | | CTTGAAGACCGTTGA | + | 9.42E-05 |
| Smechr0201168 | TCP | AT3G27010 | TTGCCCCAC | + | 5.27E-05 |
In this work, we identified six CYP78A family genes in the eggplant genome and provided a comprehensive analysis of CYP78A genes from eggplant, Arabidopsis, rice and tomato. The results indicated a close evolutionary relationship and functional conservation among CYP78A genes in plants. The high expression of SmCYP78A5 and SmCYP78A10 in young flower buds and developing ovaries suggested that they play important roles in controlling fruit development. Co-expression clustering, GO enrichment analysis and TF binding site analysis suggested that SmCYP78A5 and SmCYP78A10 regulate fruit development through different mechanisms and identified six potential upstream TFs that may directly bind the promoters of SmCYP78A5 and SmCYP78A10.
About this article
Cite this article
Zhou M, Zhang L, Luo S, Song L, Shen S, et al. 2023. Comprehensive analysis of CYP78A family genes reveals the involvement of CYP78A5 and CYP78A10 in fruit development in eggplant. Vegetable Research 3:5 doi: 10.48130/VR-2023-0005
Comprehensive analysis of CYP78A family genes reveals the involvement of CYP78A5 and CYP78A10 in fruit development in eggplant
- Received: 19 November 2022
- Accepted: 27 December 2022
- Published online: 14 February 2023
Abstract: The CYP78A family is a plant-specific family whose members are considered promising targets for yield improvement because of their important roles in regulating organ size. Eggplant is an important vegetable cultivated worldwide. However, the limited information available on the eggplant CYP78As (SmCYP78As) restricts their potential use in crop improvement. In this study, we identified six CYP78A genes in the eggplant genome and named them SmCYP78A5 to SmCYP78A10 according to their phylogenetic relationships to Arabidopsis CYP78As. Phylogenetic analysis of CYP78As from eggplant, Arabidopsis, rice and tomato classified the 27 CYP78As into five clades. SmCYP78As were found in three of the five clades. This classification is consistently supported by their gene structures, domains and conserved motifs. Segmental duplication events were found to contribute to the expansion of the SmCYP78A family. Comparative synteny analysis provided further insight into the phylogenetic relationships of CYP78A genes from the four plants. qRT-PCR analysis revealed that each of the six SmCYP78As was expressed in at least one of the eight tissues examined, showing a tissue-specific pattern. Notably, SmCYP78A5 and SmCYP78A10 were highly expressed in developing ovaries, suggesting their involvement in fruit development in eggplant. Co-expression clustering and GO enrichment analysis suggested that SmCYP78A5 and SmCYP78A10 likely regulate fruit development through different pathways. In addition, six transcription factors were identified as promising candidates that may directly bind the promoters of SmCYP78A5 and SmCYP78A10. This study provides a comprehensive overview of the SmCYP78A family and lays a foundation for further understanding of its evolution and function.
Key words:
- Eggplant
- CYP78A
- Organ size
- Developing ovaries
- Fruit development