-
A snapshot of visual variation in FBD and WCI is presented in Fig. 1. The average WCI and FBD across all ~900 seedlings ranged from 0 to 7, and from 0% to 58%, respectively. Based on the MLM analysis, the estimated narrow-sense heritability (h2) of WCI and FBD was 0.57 (standard error = 0.18) and 0.09 (standard error = 0.05), respectively. The estimated genetic correlation between WCI and FBD was 0.58, and several fruit quality traits displayed unfavourable correlations with WCI (Supplemental Fig. S1). Seedlings with higher WCI scores were generally characterised by poor firmness and crispness, plus higher astringency and sourness. Estimated phenotypic and genetic correlations between all pairs of traits are listed in Supplemental Table S1.
Figure 1.
Transverse cross-sections of apple slices showing range in (a) flesh colouration, and (b) flesh browning disorder for Type 1 red-fleshed apple. The weighted cortical intensity (WCI) scores (0−9 scale) and the proportion of the cortex area showing symptoms of flesh browning disorder are also displayed.
Characteristics of DNA pools, and variant calling
-
A few seedlings had no red pigment in the cortex, but the average WCI score across all seedlings was 2.25. About two-thirds of the seedlings did not display any FBD symptoms, but among the remaining seedlings, FBD ranged between 1% and 58% (Fig. 2). The average WCI score for the 'low' and 'high' WCI pool was 0.45 and 5.2, respectively, while the average FBD was 0% and 20.6% for the 'low' and 'high' FBD pool, respectively (Supplemental Table S2). The average WCI score of the seedlings in the FBD pools was similar (Low: 5.5; High: 4.6), while the average FBD of the high- and low-WCI pools was 4.5% and 0.1%, respectively.
Figure 2.
The distribution of weighted cortex intensity (WCI) scores (a) and the internal flesh browning disorder IFBD%; (b) in the population of ~900 apple seedlings. The green and red circles highlight the individuals used to form the 'low' and 'high' pools of samples.
FBD-associated genomic regions, and candidate gene screening
-
After filtering, about 204,000 SNPs were used and the average sequencing depth of SNP loci was similar for the two pools (42 vs 44). There was a near-perfect correlation between the Z-test statistics and G-statistics, so only the latter are discussed hereafter. A plot of the G' values, smoothed over 2 Mb windows, is shown for all 17 chromosomes (Chrs) in Supplemental Fig. S2. XP-GWAS identified genomic regions significantly associated with FBD on 12 out of the 17 Chrs (Fig. 3), and putative candidate genes within ±1.0 Mb distanceof the significant G' peaks were identified (Table 1). Additional genomic regions, which did not meet the significance threshold but displayed distinguished G' peaks, were also identified across all chromosomes (Supplemental Table S3).
Figure 3.
G'-statistics across the linkage groups (LG) showing significant association with the flesh browning disorder (FBD) in apple The horizontal red lines indicate the significance threshold. The putative candidate genes (refer to Table 1) underpinning various G' peaks are also shown.
Table 1. A list of the genomic regions associated with internal flesh browning disorder (FBD) in apples. Putative candidate genes residing within these regions are also listed using GDDH13v1.1 reference genome assembly.
Chr Genomic region (Mb) Putative genes functions 2 21.9–23.5 Ethylene-responsive element binding factor 13 (MdERF13: MD02G1213600); 3 3.1–4.9 cinnamate 4-hydroxylase (C4H) enzyme (MD03G1051100, MD03G1050900 and MD03G1051000); MdWRKY2: MD03G1044400; MdWRKY33 (MD03G1057400) 3 8.2–9.6 ascorbate peroxidase 1 (MdAPX1: MD03G1108200, MD03G1108300) 3 14.7–16.6 senescence-related MdNAC90 (MD03G1148500) 3 36.5–37.5 Ethylene response sensor 1 (MdERS1: MD03G1292200); flavonoid biosynthesis protein MdMYB12 (MD03G1297100); Heat shock protein DnaJ (MD03G1296600, MD03G1297000); pectin methylesterase (MdPME) inhibitor protein (MD03G1290800, MD03G1290900, MD03G1291000). 4 11.0–13.0 phenylalanine and lignin biosynthesis protein MdMYB85 (MD04G1080600) 4 22.1–24.4 MYB domain protein 1 (MD04G1142200); HSP20-like protein (MD04G1140600); UDP-glucosyltransferase (UGT) proteins UGT85A7 (MD04G1140700, MD04G1140900); UGT85A3 (MD04G1140800); UGT (MD04G1141000, MD04G1141300); UGT85A2 (MD04G1141400); UGT85A4 (MD04G1141500); DNAJ heat shock protein (MD04G1153800, MD04G1153900, MD04G1154100) 4 27.6–29.6 HCT/HQT regulatory genes MD04G1188000 and MD04G1188400 6 29.8–31.7 Volz et al. (2013) QTL for IFBD; anthocyanin regulatory proteins MdMYB86 (MD06G1167200); triterpene biosynthesis transcription factor MdMYB66 (MD06G1174200); Cytochrome P450 (MD06G1162600; MD06G1162700, MD06G1162800, MD06G1163100, MD06G1163300, MD06G1163400, MD06G1163500, MD06G1163600, MD06G1163800, MD06G1164000, MD06G1164100, MD06G1164300, MD06G1164400, MD06G1164500, MD06G1164700) 7 13.9–15.9 heat shock protein 70B (MD07G1116300) 7 17.4–19.0 Drought-stress WRKY DNA-binding proteins (MdWRKY56: MD07G1131000, MD07G1131400) 7 23.6–25.2 MdPAL2 (MD07G1172700); drought-stress gene NGA1 (MD07G1162400); DNAJ heat shock family protein (MD07G1162300, MD07G1162200), stress-response protein (MdNAC69: MD07G1163700, MD07G1164000) 9 7.9–11.8 MD09G1110500 involved in ascorbate oxidase (AO); MdUGT proteins (MD09G1141200, MD09G1141300, MD09G1141500, MD09G1141600, MD09G1141700, MD09G1141800, MD09G1142000, MD09G1142500, MD09G1142600, MD09G1142800, MD09G1142900, MD09G1143000, MD09G1143200, MD09G1143400) involved in flavonoids biosynthesis; heat shock proteins 89.1 (MD09G1122200) and HSP70 (MD09G1137300); Ethylene-forming enzyme MD09G1114800; Anthocyanin regulatory protein MdNAC42 (MD09G1147500, MD09G1147600) 9 14.9–16.5 triterpene biosynthesis transcription factor protein MdMYB66 (MD09G1183800); 11 1.7–3.0 ethylene response factor proteins (MdEIN-like 3: MD11G1022400) 11 38.6–40.6 Senescence-related gene 1 (MD11G1271400, MD11G1272300, MD11G1272000, MD11G1272100, MD11G1272300, MD11G1272400, and MD11G1272500); Chalcone-flavanone isomerase (CHI) protein (MD11G1273600) and MdbHLH3 (MD11G1286900, MDP0000225680); cytochrome P450 enzyme (MD11G1274000, MD11G1274100, MD11G1274200, MD11G1274300, MD11G1274500, and MD11G1274600); heat shock transcription factor A6B (MdHSFA6B: MD11G1278900) – involved in ABA-mediated heat response and flavonoid biosynthesis. 12 2.2–3.5 heat shock protein 70-1 ((MD12G1025600, MD12G1025700 and MD12G1026300) and heat shock protein 70 (MD12G1025800 and MD12G1025900 and MD12G1026000); ethylene (MD12G1032000) and auxin-responsive (MD12G1027600) proteins. 12 7.2–8.4 DNAJ heat shock domain-containing protein (MD12G1065200 and MD12G1067400) 13 27.5–30.5 MD13G1257800 involved in polyphenol 4-coumarate:CoA ligase (4CL) 13 37.5–39.5 pectin methyl esterase inhibitor superfamily protein MdPMEI (MD13G1278600) 14 25.4–27.5 Drought-stress response gene MdWRKY45 (MD14G1154500); chalcone synthase (CHS) family proteins (MD14G1160800 and MD14G1160900); triterpene biosynthesis transcription factor MdMYB66 (MD14G1180700, MD14G1181000, MD14G1180900); anthocyanin biosynthesis protein (MdMYB86: MD14G1172900); cytochrome P450 proteins (MD14G1169000, MD14G1169200, MD14G1169600, MD14G1169700) 15 0–1.5 Ethylene synthesis proteins (MD15G1020100, MD15G1020300 and MD15G1020500); dihydroflavonol reductase (DFR) gene (MD15G1024100) 15 4.6–6.8 MdMYB73 (MD15G1076600, MD15G1088000) modulates malate transportation/accumulation via interaction with MdMYB1/10; MdNAC52: MD15G1079400) regulates anthocyanin/PA; heat shock transcription factor B4 (MD15G1080700); stress-response protein MdWRKY7 (MD15G1078200) 15 53.5–54.9 MdC3H (MD15G1436500) involved in chlorogenic acid biosynthesis; MdEIN3 (MD15G1441000) involved in regulating ethylene synthesis and anthocyanin accumulation 16 8.9–10.9 SAUR-like auxin-responsive protein (MD16G1124300) and MdNAC83: (MD16G1125800) associated with fruit ripening; MD16G1140800 regulates proanthocyanidin; MdPAE genes (MD16G1132100, MD16G1140500) regulates ethylene production. The ethylene-responsive factor 13 (MdERF13: MD02G1213600) resided within the significant region (21.9–23.5 Mb) on Chr2, while the ascorbate peroxidase 3 (MdAPX3: MD02G1127800, MD02G1132200) resided in the prominent region between 9.45 and 11.28 Mb) (Fig. 3, Table 1). There were several genomic regions showing association with FBD on Chr3. The first region (8.2–9.5 Mb) flanked MdAPX1 (MD03G1108200, MD03G1108300), while MdERF3 (MD03G1194300) and heat shock protein 70 (HSP70: MD03G1201800, D03G1201700) resided within the prominent peak region (25.5–27.5 Mb). Another significant region (36.5–37.5 Mb) at the bottom on Chr3 harboured ethylene response sensor 1 (MdERS1: MD03G1292200), a flavonoid-biosynthesis related protein MdMYB12 (MD03G1297100), HSP DnaJ and pectin methyl esterase (PME) inhibitor proteins (Fig. 3, Table 1).
The significant G' region (27.6–29.6 Mb) on Chr4 flanked the genes MD04G1188000 and MD04G1188400 involved in the biosynthesis of hydroxycinnamoyl CoA shikimate/quinate hydroxycinnamoyl transferase (HCT/HQT), while the adjacent region (22.1–24.3 Mb) harboured a cluster of UDP-glucosyltransferase (UGT) proteins and HSP (Table 1). Another region between 11.0 and 13.0 Mb encompassed phenylalanine and lignin biosynthesis gene MYB85 (MD04G1080600)[17]. The only significant region associated with FBD on Chr6 spanned between 29.8 and 31.7 Mb, which included a SNP earlier reported associated with FBD[4]. This region also harboured several MYB proteins (MdMYB86: MD06G1167200; MdMYB98: MD06G1172900; MdMYB66: MD06G1174200) and a large cluster of cytochrome P450 proteins (Table 1).
A significant FBD-associated region (13.9–15.9 Mb) on Chr7 encompassed the HSP 70B (MD07G1116300), while another significant region between 23.6 Mb and 25.2 Mb flanked the gene coding for phenylalanine ammonia lyase 2 (MdPAL2: MD07G1172700), a drought stress gene NGA1 (MD07G1162400), DNAJ HSP, and stress-response protein (MdNAC69: MD07G1163700, MD07G1164000) (Fig. 3, Table 1). A large significant region spanning between 7.9 and 11.8 Mb on Chr9 encompassed the gene MD09G1110500 putatively involved in ascorbate oxidase (AO), HSP (HSP70: MD09G1137300; HSP89.1: MD09G1122200), an ethylene-forming enzyme (MD09G1114800), and MdNAC42 (MD09G1147500, MD09G1147600). Another significant genomic region on Chr9 was between 14.9 and 16.5 Mb, which harbours the MYB domain protein MdMYB66 (MD09G1183800) (Table 1).
A sharp G' peak region (1.7–3.0 Mb) on Chr11 associated with FBD encompassed ethylene insensitive 3 (MdEIN3: MD11G1022400) along with a cluster of UGT proteins, WD-40, and bHLHL proteins (Table 1). A significant region between 38.6 and 40.6 Mb at the bottom of Chr11 was dominated by clusters of senescence-related genes and cytochrome P450 enzymes. This genomic region also flanked a chalcone-flavanone isomerase (CHI) family protein (MD11G1273600) and a bHLH protein (MdbHLH3: MD11G1286900, MDP0000225680), along with the heat shock transcription factor A6B (HSFA6B: MD11G1278900) (Table 1, Supplemental Table S3).
The region (2.2–3.3 Mb) associated with FBD on Chr12 flanked genes for the HSP 70 and 70-1, NAC proteins, ethylene and auxin-responsive proteins (Table 1). An adjacent significant region (7.2–8.4 Mb) flanked DNAJ HSP, along with bZIP, bHLH and WD-40 repeat-like proteins (Table 1; Supplemental Table S3). There was a large genomic region on Chr13 showing a significant association with FBD. In this region, the first G' peak (27.5–30.5 Mb) harboured MD13G1257800, which regulates polyphenol 4-coumarate: CoA ligase (4CL) synthesis. The second G' peak region (32.6–34.7 Mb) corresponded to an earlier mapped QTL for flesh browning in white-fleshed apples[18].
The significant region (25.4–27.5 Mb) on Chr14 harboured various genes for proteins with different putative functions, such as AP2 proteins, WD-40 repeat family proteins, bHLH proteins, MdNAC83 (MD14G1150900), MdWRKY45 (MD14G1154500), chalcone synthase (CHS) family proteins (MD14G1160800 and MD14G1160900), cytochrome P450 proteins, and several MYB domain proteins (MdMYB86: MD14G1172900; MdMYB98: MD14G1179000; MdMYB66: MD14G1180700, MD14G1181000, MD14G1180900) (Fig. 3, Table 1, Supplemental Table S3). A sharp G' peak (15.2–17.2 Mb) on Chr14 did not meet the significance threshold corresponding to the FBD QTL in white-fleshed apples[18].
The upper 1.5 Mb region on Chr15 associated with FBD encompassed several transcription factor families, including WD-40 repeats, bHLH, bZIP, ethylene synthesis proteins, and dihydroflavonol reductase (DFR) protein (MD15G1024100) (Fig. 3, Table 1; Supplemental Table S3). Another significant region on Chr15 (4.6−6.8 Mb) harboured HSF B4, MdMYB73 (which interacts with MdMYB1/10 to modulate malate transportation) and MdNAC52 (MD15G1079400), which regulates anthocyanin and PA synthesis by directly regulating MdLAR[19]. A significantly associated region at the bottom of Chr15 (53.5–54.9 Mb) harboured MdNAC35 (MD15G1444700), MdC3H (MD15G1436500) involved in the production of p-coumarate 3-hydroxylase (C3H) enzyme, which plays a role in chlorogenic acid biosynthesis, and MdEIN3 (MD15G1441000).
The significant region between 8.9 and 10.9 Mb on Chr16 harboured a gene for SAUR-like auxin-responsive protein (MD16G1124300) and MdNAC83 (MD16G1125800), both of which have been reportedly associated with apple fruit ripening[20]. This region also encompassed MD16G1140800, which regulates PA[11], and a gene for the pectin acetyl esterase protein MdPAE10 (MD16G1132100) involved in ethylene production and shelf-life[21]. A sharp G' peak region (20.4–22.4 Mb) in the middle of Chr16 flanked MdERF1B (MD16G1216900) and MdMYB15 (MD16G1218000 and MD16G1218900), involved in altering anthocyanin and PA concentrations[10] (Fig. 3, Table 1, Supplemental Table S3).
WCI-associated genomic regions, and candidate genes screening
-
The average sequencing depth of the SNP loci (~160,000) retained for marker-trait association was similar for the two WCI pools (41 vs 44). The significant regions were located on Chrs 2, 4, 6, 7, 10, 15 and 16 (Fig. 4). The genomic intervals within ±1.0 Mb of the significant G' peaks, and the putative candidate genes within those intervals, are listed in Table 2. Additional genomic regions, which did not meet the significance threshold but displayed distinct G' peaks, were also identified across most chromosomes ( Supplemental Fig. S2, Supplemental Table S3). A significant region on Chr4 encompassed chalcone synthase (CHS) genes, along with an ERF (MD04G1009000) involved in regulating PA biosynthesis[11].
Figure 4.
G’-statistics across the linkage groups (LG) showing significant association with the weighted cortex intensity (WCI) in apple. The horizontal red lines indicate the significance threshold. The putative candidate genes (refer to Table 2) underpinning various G' peaks are also shown.
Table 2. A list of the genomic regions significantly associated with the weighted cortex intensity (WCI) in apples. Putative candidate genes residing within these regions are also listed using GDDH13v1.1 reference genome assembly.
Chr Genomic region (Mb) Putative genes functions 2 11.2–13.2 UDP-glucosyltransferase (UGT) proteins (MD02G1153000, MD02G1153100, MD02G1153200, MD02G1153300; MD02G1153400; MD02G1153500; MD02G1153700; MD02G1153800; MD02G1153900); 4 0–1.2 Chalcone synthase (CHS) genes (MD04G1003000; MD04G1003300 and MD04G1003400); DNAJ heat shock protein (MD04G1003500); MdERF (MD04G1009000) involved in regulating PA biosynthesis. 6 9.5–11.0 Ubiquitin protein (MD06G1061100); stress-response WRKY protein MdWRKY21 (MD06G1062800); 6 12.1–13.6 pectin methylesterase inhibitor superfamily protein (MdPME: MD06G1064700); phenylalanine and lignin biosynthesis gene (MdMYB85) 6 16.0–17.6 MD06G1071600 (MDP0000360447) involved in leucoanthocyanidin dioxygenase (LDOX) synthesis 6 24.0–26.0 UDP-glycosyltransferase proteins (MD06G1103300, MD06G1103400, MD06G1103500 and MD06G1103600); auxin response factor 9 (MdARF9: MD06G1111100) and heat shock protein 70 (Hsp 70; MD06G1113000). 7 4.1–5.6 Ethylene insensitive 3 protein (MdEIN3: MD07G1053500 and MD07G1053800) involved in proanthocyanidins (PA) biosynthesis; ubiquitin-specific protease (MD07G1051000, MD07G1051100, MD07G1051200, MD07G1051500, MD07G1051300, MD07G1051700 and MD07G1051800). 10 15.9–18.3 bHLH proteins (MD10G1098900, MD10G1104300, and MD10G1104600); UGT protein (MD10G1101200); UGT 74D1 (MD10G1110800), UGT 74F1 (MD10G1111100), UGT 74F2 (MD10G1111000). 10 35.6–38.6 Stress-response WRKY proteins (MdWRKY28: MD10G1266400; MdWRKY65: MD10G1275800). ethylene responsive factors (MdERF2: MD10G1286300; MdERF4: MD10G1290400, and MdERF12: MD10G1290900) and a NAC domain protein (MdNAC73: MD10G1288300); polyphenol oxidase (PPO) genes ((MD10G1298200; MD10G1298300; MD10G1298400; MD10G1298500; MD10G1298700; MD10G1299100; MD10G1299300; MD10G1299400). 15 24.8–26.8 heat shock factor 4 (MD15G1283700), drought-stress response WRKY protein 7 (MdWRKY7: MD15G1287300), MdMYB73 (MD15G1288600) involved in ubiquitination and malate synthesis 15 31.7–34.2 MYB domain protein 93 (MdMYB93: MD15G1323500) regulates flavonoids and suberin accumulation (Legay et al. 2016); ubiquitin -specific protease 3 (MD15G1318500), 16 1.5–3.4 MYB domain proteins (MD16G1029400) regulates anthocyanin; senescence-associated gene 12 (MD16G1031600); ethylene response factor proteins (MdERF118: MD16G1043500, and MD16G1047700 (MdRAV1); MdMYB62 ( MD16G1040800) flavonol regulation; malate transporter MdMa2 (MD16G1045000: MDP0000244249), Ubiquitin-like superfamily protein (MD16G1036000), 16 5.4–7.4 MdMYB88 (MD16G1076100) regulates phenylpropanoid synthesis and ABA-mediated anthocyanin biosynthesis; MdMYB66 (MD16G1093200) regulates triterpene biosynthesis, MdWRKY72 (MD16G1077700) mediates ultraviolet B-induced anthocyanin synthesis. There were distinct G' peaks within a large genomic region (spanning between 9.0 and 18.0 Mb) significantly associated with WCI on Chr6 (Fig. 4). The G' peak region 12.1–13.6 Mb encompassed MdMYB85 (MD06G1064300), and a gene for a pectin methyl esterase
(PME) inhibitor protein (MD06G1064700); while the region between 16.0 and 17.6 Mb flanked the genes involved in leucoanthocyanidin dioxygenase (LDOX) synthesis (Table 2). The auxin response factor 9 (MdARF9: MD06G1111100) and the HSP70 (MD06G1113000) resided in the significant region between 24.0 and 26.0 Mb on Chr6 (Table 2).
The genomic region (4.1–5.6 Mb) with significant association with WCI on Chr7 flanked MdEIN3 (MD07G1053500, MD07G1053800), which plays an important role in the regulatory network of PA biosynthesis[11]. A significant region (35.6–38.7 Mb) on Chr10 encompassed gene clusters for bHLH and WRKY proteins along with an ethylene repressor factor (MdERF2: MD10G1286300) (Fig. 4, Table 2; Supplemental Table S3). This region also harboured MdERF4 (MD10G1290400), MdERF12 (MD10G1290900) and NAC domain genes (MdNAC73: MD10G1288300), which have been reported to be associated with fruit ripening[20]. In addition, there was a cluster of polyphenol oxidase (PPO) genes residing in this region (Table 2).
On Chr15, the significant WCI-associated region spanning between 31.7 Mb and 34.2 Mb encompassed several TF families, including MdMYB93 (MD15G1323500) (Fig. 4, Table 2, Supplemental Table S3). A distinguished G' peak region (24.9–26.9 Mb) on Chr15 harboured genes for WD-40 repeat-like proteins, bHLH proteins, redox responsive transcription factor 1 (MD15G1283200), cytochrome P450 proteins, HSF4 (MD15G1283700), and MdWRKY7 (MD15G1287300). The MdMYB73 (MD15G1288600) residing in this region has been shown to interact with MdMYB1/10 and regulates several functions, including cold-stress response, ubiquitination and malate synthesis[22, 23].
The significant 1.5–3.4 Mb region on Chr16 flanked an anthocyanin repressor MYB protein (MD16G1029400), senescence-associated gene 12 (MD16G1031600), malate transporter MdMa2 (MD16G1045000: MDP0000244249), and two ethylene response factor genes (MdERF118: MD16G1043500; MdRAV1: MD16G1047700), which interact in retaining flesh firmness[24]. The MdMYB62 (MD16G1040800) gene residing in this region is phylogenetically linked to MdMYB8 (MD06G1217200), which plays a major role in flavonoid biosynthesis[25]. Another significant region (5.4–7.4 Mb) on Chr16 harboured several bHLH genes, including MdMYB88 (MD16G1076100) involved in phenylpropanoid synthesis resulting in drought resistance[26] and ABA-mediated anthocyanin production[27]. The MdMYB66 (MD16G1093200) gene, which is involved in triterpene biosynthesis, and the MdWRKY72 (MD16G1077700) gene involved in anthocyanin synthesis, were also present in this region (Table 2, Fig. 4).
Homologous regions associated with FBD and WCI
-
The genomic regions tagged by XP-GWAS were enriched for regulatory functions, and several of these have a paralog (Table 3). For example, paralogues for a trio of genes (MdMYB86, MdMYB98 and Cytochrome 450) resided in the FBD-associated genomic regions on Chr6 and Chr14. Several pairs of genes resided together in the FBD-associated paralog regions; for example, MdNAC90 and germin-like protein 10 on Chr3 and Chr11; WRKY55 and WRKY70 on Chr1 and Chr7; and ERF1 and ERF5 on Chr4 and Chr6. The ethylene response factor MdERF1B had a paralog in the FBD-associated region on Chr13 and Chr16. Interestingly, three copies of MdNAC83 (Chrs 14, 16 17) and four copies of MdMYB66 (Chrs 4, 6, 9 and 14) resided in the FBD-associated regions. A pair of genes (RAV1 and MYB62) resided together in the WCI-associated paralog regions on Chrs13 and 16 (Table 3).
Table 3. A list of homologues genes/genomic regions associated with the internal flesh browning disorder (IFBD) and red flesh (WCI) in apples. Putative candidate genes residing within these regions are also listed using GDDH13v1.1 reference genome assembly.
Trait Chr (genomic region: Mb) Gene name Predicted gene ID Putative function IFBD Chr6 (29.8–31.7 Mb) MdMYB66 MD06G1174200 Suberin/triterpene deposition IFBD Chr14 (25.4–27.5 Mb) MdMYB66 MD14G1180700, MD14G1181000, MD14G1180900 Suberin/triterpene deposition IFBD Chr4 (7.0–8.1 Mb) MdMYB66 MD04G1060200 Suberin/triterpene deposition IFBD Chr9 (14.9–16.5 Mb) MdMYB66 MD09G1183800 Suberin/triterpene deposition IFBD Chr6 (29.8–31.7 Mb) MdMYB86 MD06G1167200 Anthocyanin regulation IFBD Chr14 (25.4–27.5 Mb) MdMYB86 MD14G1172900 Anthocyanin regulation IFBD Chr6 (29.8–31.7 Mb) MdMYB98 MD06G1172900 Drought stress response IFBD Chr14 (25.4–27.5 Mb) MdMYB98 MD14G1179000 Drought stress response IFBD Chr6 (29.8–31.7 Mb) Cytochrome P450 MD06G1162600, MD06G1162700, MD06G1162800, MD06G1163100, MD06G1163300, MD06G1163400, MD06G1163500, MD06G1163600, MD06G1163800, MD06G1164000, MD06G1164100, MD06G1164300, MD06G1164400, MD06G1164500, MD06G1164700 Flavonoid and triterpenic metabolism IFBD Chr14 (25.4–27.5 Mb) Cytochrome P450 MD14G1169000, MD14G1169200, MD14G1169600, MD14G1169700 Flavonoid and triterpenic metabolism IFBD Chr3 (14.7–16.6 Mb) MdNAC90 MD03G1148500 Senescence-related IFBD Chr11 (16.5–18.5 Mb) MdNAC90 MD11G11679000 Senescence-related IFBD Chr14 (25.4–27.5 Mb) MdNAC83 MD14G1150900 Senescence/ripening-related IFBD Chr16 (8.9–10.9 Mb) MdNAC83 MD16G1125800 Senescence/ripening-related IFBD Chr17 (0–0. 7 Mb) MdNAC83 MD17G1010300 Senescence/ripening-related IFBD Chr3 (14.7–16.6 Mb) Germin-like protein 10 MD03G1148000 Polyphenol oxidase IFBD Chr11 (16.5–18.5 Mb) Germin-like protein 10 MD11G1167000, MD11G1167100, MD11G1167400, MD11G1169200 Polyphenol oxidase IFBD Chr1 (26.9–28.5 Mb) MdWRKY55 MD01G1168500 Drought stress response IFBD Chr7 (30.3–31.9 Mb) MdWRKY55 MD07G1234600 Drought stress response IFBD Chr1 (26.9–28.5 Mb) MdWRKY70 MD01G1168600 Drought stress response IFBD Chr7 (30.3–31.9 Mb) MdWRKY70 MD07G1234700 Drought stress response IFBD Chr4 (7.0–8.1 Mb) MdERF1 MD04G1058000 Ethylene responsive factor IFBD Chr6 (5.5–7.3 Mb) MdERF1 MD06G1051800 Ethylene responsive factor IFBD Chr4 (7.0–8.1 Mb) MdERF5 MD04G1058200 Ethylene responsive factor IFBD Chr6 (5.5–7.3 Mb) MdERF5 MD06G1051900 Ethylene responsive factor IFBD Chr13 (18.0–20.0 Mb) MdERF1B MD13G1213100 Ethylene response factor 1 IFBD Chr16 (20.4–22.4 Mb) MdERF1B MD16G1216900 Ethylene response factor 1 WCI Chr13 (2.6–4.4 Mb) MdRAV1 MD13G1046100 Ethylene responsive factor WCI Chr16 (1.5–3.4 Mb) MdRAV1 MD16G1047700 Ethylene responsive factor WCI Chr13 (2.6–4.4 Mb) MdMYB62 MD13G1039900 Flavonol biosynthesis WCI Chr16 (1.5–3.4 Mb) MdMYB62 MD16G1040800 Flavonol biosynthesis IFBD Chr9 (7.9–11.8 Mb) MdNAC42 MD09G1147500, MD09G1147600 Anthocyanin accumulation WCI Chr17 (11.4–12.4 Mb) MdNAC42 MD17G1134400 Anthocyanin accumulation IFBD Chr9 (7.9–11.8 Mb) HSP 70 MD09G1137300 Heat stress response WCI Chr17 (11.4–12.4 Mb) HSP 70 MD17G1127600 Heat stress response IFBD Chr4 (11.0–13.0 Mb) MdMYB85 MD04G1080600 Phenylalanine and lignin biosynthesis WCI Chr6 (12.1–13.6 Mb) MdMYB85 MD06G1064300 Phenylalanine and lignin biosynthesis IFBD Chr15 (13.2–14.2 Mb) MdEBF1 MD15G1171800 Ethylene inhibition WCI Chr8 (15.2–17.2 Mb) MdEBF1 MD08G1150200 Ethylene inhibition IFBD Chr15 (53.5–54.9 Mb) MdEIN3 MD15G1441000 Ethylene insensitive 3 protein WCI Chr8 (30.3–31.6 Mb) MdEIN3 MD08G1245800 Ethylene insensitive 3 protein IFBD Chr11 (1.7–3.0 Mb) MdEIN3 MD11G1022400 Ethylene insensitive 3 protein WCI Chr7 (4.1–5.6 Mb) MdEIN3 MD07G1053500, MD07G1053800 Ethylene insensitive 3 protein IFBD Chr15 (4.6–6.8 Mb) MdMYB73 MD15G1076600, MD15G1088000 Cold-stress response & malate accumulation WCI Chr15 (24.8–26.8 Mb) MdMYB73 MD15G1288600 Cold-stress response & malate accumulation IFBD Chr15 (4.6–6.8M b) MdWRKY7 MD15G1078200 Anthocyanin accumulation WCI Chr15 (24.8–26.8 Mb) MdWRKY7 MD15G1287300 Anthocyanin accumulation IFBD Chr15 (43.5–45.5 Mb) MdMYB93 MD15G1369700 Flavonoid & suberin accumulation WCI Chr15 (31.7–34.2 Mb) MdMYB93 MD15G1323500 Flavonoid & suberin accumulation Paralogs of several regulatory functions were also found in the regions associated with either FBD or WCI; for example, a significant region (7.9–11.8 Mb) harbouring a gene trio (MdNAC42, HSP70 and HSP89.1) on Chr9 was associated with FBD, but the paralogs of this trio also resided in a distinct G' region associated with WCI on Chr17 (Table 3, Supplemental Table S3). Some other examples included MdMYB85 (Chr4 for FBD, and Chr6 for WCI), MdEBF1 (Chr8 for WCI, and Chr15 for FBD), MdEIN3 (Chr8 for WCI, and Chr15 for FBD), and EIN3 (Chr7 for WCI, and Chr11 for FBD). A pair of genes (MdMYB73 and MdWRKY7) resided together in the separate regions associated with FBD (4.6–6.8 Mb) and WCI (24.8–26.8 Mb) within Chr15 (Table 3).
Differentially expressed genes in R6:MdMYB10 apple fruit
-
The red pigmentation in fruit flesh differed amongst transgenic lines, with fruit from line A10 presenting the most deeply pigmented tissues (Supplemental Fig. S3), while those from lines A2 and A4 were similar in having a lower intensity of pigmentation. No pigmentation was observed in the flesh of control 'Royal Gala' (RG) fruit. RNAseq analysis of a representative transgenic line (A2) compared with RG revealed that a total of 1,379 genes were differentially expressed (log 2-fold), with 658 genes upregulated and 721 genes downregulated. This list was then assessed for commonality with the genomic regions and candidate genes from the XP-GWAS.
Genes that contained mis-sense SNPs (which would result in a change in predicted protein) and that were also differentially expressed between A2 red-fleshed transgenic line and white-fleshed 'Royal Gala' (control) apples included anthocyanin-related flavanone 3-hydroxylase, chalcone synthase and dihydroflavonol 4-reductase (Table 2 & 4). Many more upstream DNA variants (e.g. in potential promoter-controlling elements) were seen in this group of differentially expressed genes (DEGs) that were also in regions underlying WCI or FBD. Genes encoding enzymes that may be involved in FBD, such as a Rho GTPase activating protein, peroxidase, lipoxygenase 1, and ethylene-forming enzyme (ACO4), were DEGs and showed mis-sense SNPs, including a potential stop codon in the Rho GTPase-activating protein (Table 4). This intersection between DNA change and differential expression warrants further research to evaluate the functions of these genes.
Table 4. List of candidate genes associated with WCI or FBD in the XP-GWA and R6:MdMYB10 apple datasets.
Mutations in GWAS apple population Predicted gene function Expression in R6 and ‘Royal Gala’ apples Gene Missense
SNPsSNP
StopUpstream
variantsDataset Locus Annotation and TAIR ID Average
RPKM
R6 fleshAverage RPKM
WT fleshlog2Fold
ChangeMD02G1132200 1 FBD Supplemental Table S3 Chr02:9450615-11289014 flavanone 3-hydroxylase
(F3H, TT6, F3'H) AT3G51240387 75 2.47 MD02G1133600 3 FBD Supplemental Table S3 Chr02:9450615-11289014 fatty acid desaturase 5
(FAD5) AT3G15850181 0 9.98 MD02G1153700 1 WCI Table 2 Chr02:11278254-13234556 UDP-Glycosyltransferase, lignin related AT2G18560 1038 534 1.03 MD03G1059200 2 FBD Supplemental Table S3 Chr03:3177994-4910866 Peroxidase AT5G05340 25 1 4.21 MD03G1143300 4 22 FBD Table 1 Chr03:14797661-16611554 bZIP transcription factor (DPBF2, AtbZIP67) AT3G44460 114 852 -2.80 MD03G1147700 12 1 FBD Table 1 Chr03:14797661-16611554 Rho GTPase activating protein AT5G22400 205 89 1.29 MD04G1003400 3 31 WCI Table 2 Chr04:1-1284894 Chalcone synthase (CHS, TT4) AT5G13930 1443 312 2.34 MD04G1204100 6 3 FBD Table 1 Chr04:27629536-29612282 lipoxygenase 1 (LOX1, ATLOX1) AT1G55020 546 268 1.09 MD06G1160700 1 29 FBD Table 1 Chr06:29862341-31737341 peptide met sulfoxide reductase AT4G25130 21212 5456 2.04 MD06G1161400 1 4 FBD Table 1 Chr06:29862341-31737341 Pectin lyase-like protein AT5G63180 6141 29530 -2.22 MD07G1240700 1 26 FBD Supplemental Table S3 Chr07:30357332-31979025 Fe superoxide dismutase 2 AT5G51100 105 1826 -3.99 MD07G1306900 6 FBD Supplemental Table S3 Chr07:34607570-36531467 UDP-glucosyl transferase 78D2 AT5G17050 677 111 2.78 MD08G1249100 2 WCI Supplemental Table S3 Chr08:30486569-31607516 HSP20-like chaperone (ATHSP22.0) AT4G10250 3186 45 6.12 MD09G1114000 3 2 FBD Table 1 Chr09:7999212-11883202 fatty acid desaturase 5 (FAD5) AT3G15850 76 0 8.68 MD09G1146800 1 FBD Table 1 Chr09:7999212-11883202 PHYTOENE SYNTHASE (PSY) AT5G17230 38456 16090 1.32 MD10G1328100 4 5 FBD Supplemental Table S3 Chr10:40235258-41736791 ethylene-forming enzyme (ACO4) AT1G05010 876903 391033 1.21 MD15G1023600 1 FBD Table 1 Chr15:1-1487288 jasmonic acid carboxyl methyltransferase (JMT) AT1G19640 4721 1155 2.07 MD15G1024100 8 FBD Table 1 Chr15:1-1487288 dihydroflavonol 4-reductase (DFR, TT3, M318) AT5G42800 872 309 1.63 MD17G1133400 4 WCI Supplemental Table S3 Chr17:11422073-12433463 PHYTOENE SYNTHASE (PSY) AT5G17230 600 172 1.87 MD17G1260600 2 FBD Supplemental Table S3 Chr17:31098955-32776972 dehydroascorbate reductase 1 (DHAR3) AT5G16710 18 67 −1.78 -
A population of 900 apple seedlings composed of 24 full-sib families was generated in 2011 by selected crossings between six red-leaved pollen parents and six white-fleshed female parents. All six pollen parents inherited their red-leaf phenotype from the same great-grandparent 'Redfield'[1]. Each pollen parent was involved in four crosses, and the female parents were involved in three to six crosses each. Foliage colour of young seedlings is a phenotypic marker for Type 1 RF apple. The main purpose of this trial was to understand the flesh colour variation and FBD in the Type 1 RF seedlings, so only the seedlings with red foliage (i.e. carrying MdMYB10) were kept for this trial. The number of seedlings per family varied from 10 to 95. The seedlings were grafted onto 'M9' rootstock and were planted in duplicate at the Plant & Food Research orchard in Hawke's Bay, New Zealand (39°39′ S, 176°53′ E) in 2015.
Phenotyping for RF and FBD was conducted over two consecutive fruiting seasons (2017 and 2018). Fruit were harvested once, when judged mature, based on a change in skin background colour from green to yellow, and when the starch pattern index (SPI) was between 1.0 and 2.0 (on a scale of 0 to 7). In each season, six fruit were harvested from each plant and stored for 70 d at 0.5 °C, followed by 7 d at 20 °C before fruit evaluation. Fruit were cut in half across the equator and the proportion of the cortex area (PRA) that was red in colour, and the intensity of the red colour (RI) (= 1 (low) to 9 (high)) was scored. A weighted cortical intensity (WCI) was then calculated (PRA × RI) as an estimation of the amount of red pigment in the fruit. The proportion of the cortex area showing symptoms of FBD was also recorded. WCI and FBD were averaged over all fruit for a particular seedling. Fruit were also assessed for the following eating quality traits on a 1 (lowest) to 9 (highest) scale: firmness, crispness, juiciness, sweetness, sourness and astringency, to understand the genetic correlations of eating-quality traits with WCI and FBD.
Generation of R6:MdMYB10 apple fruit
-
The binary vector pSA277-R6:MYB10 was transferred into Agrobacterium tumefaciens strain LAB4404 by electroporation. Transgenic 'Royal Gala' plants were generated by Agrobacterium-mediated transformation of leaf pieces, using a method previously reported[5]. Wild-type 'Royal Gala' and three independent transgenic lines (A2, A4 and A10) of R6:MdMYB10 were grown under glasshouse conditions in full potting mix with natural light. The resulting fruit were assessed for flesh colour phenotypes at harvest (around 135 d after full bloom). Fruit peel and cortex from three biological replicates were collected and frozen in liquid nitrogen, with each replicate compiled from five pooled mature fruit for each transgenic line or wild-type control.
Real-Time quantitative RT-qPCR analysis
-
Total RNA of 36 samples (3 R6:MYB10 lines and 1 wild type control, 3 time points, 3 biological replicates) was extracted, using Spectrum Plant Total RNA Kit (SIGMA). Removal of genomic DNA contamination and first-strand cDNA synthesis were carried out using the mixture of oligo (dT) and random primers according to the manufacturer's instructions (QuantiTect Reverse Transcription Kit, Qiagen). Real-time qPCR DNA amplification and analysis was carried out using the LightCycler 480 Real-Time PCR System (Roche), with LightCycler 480 software version 1.5. The LightCycler 480 SYBR Green I Master Mix (Roche) was used following the manufacturer's method. The qPCR conditions were 5 min at 95 °C, followed by 45 cycles of 5 s at 95 °C, 5 s at 60°C, and 10 s at 72 °C, followed by 65 °C to 95 °C melting curve detection. The qPCR efficiency of each gene was obtained by analyzing the standard curve of a cDNA serial dilution of that gene. The expression was normalized to Malus × domestica elongation factor 1-alpha MdEF1α (XM_008367439) due to its consistent transcript levels throughout samples, with crossing threshold values changing by less than 2.
Phenotypic data analysis
-
Individual fruit measurements were first averaged for each seedling. As the phenotyping was repeated over two years, we used a mixed linear model (MLM) accounting for this 'permanent environmental effect', as previously described[58]. Pedigree-based additive genetic relationships among seedlings were taken into account for estimation of genetic parameters using ASReml software[60]. Product-moment correlations between best linear unbiased predictions (BLUP) of breeding values of all seedlings for different traits were used as estimates of genetic correlation among traits.
Identification of extreme phenotypes, pool-sequencing, and QC of sequence data
-
A selective DNA pooling procedure was adopted to construct DNA pools. A high-pool and a low-pool were constructed separately for the two traits (WCI and FBD). Genomic DNA was extracted from the leaves of selected seedlings, and quantified by fluorimetry using the picogreen reagent (Cat#P11496, Thermo). The low and high pools consisted of 35 seedlings each, and normalised amounts (~300 ng) of DNA from individuals were pooled. The pools were dried down with DNA Stable reagent (Cat#93021001, Biomatrica) in a centrifugal evaporator and shipped for sequencing. Each DNA pool was sequenced using paired-end 125 bp reads on the Illumina HiSeq 2500 platform. The quality of raw sequence reads was checked with FastQC/0.11.2 and MultiQC/1.2. Based on the quality control reports, the reads were aligned to the published apple reference genome GDDH13 v1.1[61] using the program bowtie2/2.3.4.3[62] with trimming from both ends before alignments and aligning in full read length ("-5 6 -3 5 –end-to-end"). The mapping results were marked for duplicate alignments, sorted, compressed and indexed with samtools/1.12[63]. Based on the alignment of binary alignment map (BAM) files of the high and low pools, single nucleotide polymorphism (SNP) identification was performed using samtools/1.12 ('samtools mpileup') and bcftools/1.12 ('bcftools call –mv')[63, 64]. Variant sites with missing genotype in any of the pools, or having the same genotype between the pools, were discarded. To minimise the influence of sequencing quality on association analysis, the identified SNPs were further filtered according to the following criteria: 1) a Phred-scaled quality score > 20; and 2) the read depth in each pool was neither < 35, nor > 500.
Association test using DNA pools
-
The allele frequencies between each pair of bulk DNAs (low versus high WCI; low versus high FBD) were compared at each SNP locus. Differences in the allele frequencies between the low and high pool were expected to be negligible for unlinked SNP markers, but allele frequency differences would be larger for SNPs closely linked to the underlying quantitative trait loci (QTLs) contributing to the extreme phenotypes. A nonparametric test (G-statistic = 2 × Σni ln(ni/nexp), where ni (i = 1 to 4) represented counts of reference and alternate alleles at a particular SNP generated from sequencing of the low and high pool, and nexp was the expected allele count assuming no allele frequency divergence between the two DNA pools[65].
We then calculated a modified statistic (G'), which took into account read count variation caused by sampling of segregants as well as variability inherent in short-read sequencing of pooled samples[65]. Using R package QTLseqr[66], firstly a G-statistic was calculated for each SNP marker, and then a weighted average using Nadaraya-Watson kernel was obtained to yield a G' statistic for a sliding genomic window of 2 Mb size. The Nadaraya-Watson method weights neighbouring markers' G-values by their distance from the focal SNP so that closer SNPs receive higher weights. The 95th percentile value of G' was used as a threshold to identify significant hotspots and to identify the putative candidate genes residing within the ±1 Mb region around the G' peak. For comparison purposes, a standard two-sided Z-test[14] was also performed to determine the significance of allele frequency differences at SNP loci between the pools for each trait.
Candidate gene identification
-
The GDDH gene models intersecting with the XP-GWA hotspots were pulled out with bedtools/2.30.0 ("bedtools intersect -wo -nonamecheck"). The selected genes were further blasted to TAIR10 ("-evalue 1e-5") and the annotated functions from Arabidopsis genes with the best blast score, the highest % identity, and the longest aligned length, were used. Then the expressions of genes located in the GWA hotspots were extracted from the RNAseq analysis of the R6:MdMYB10 representative transgenic line, and the log 2-fold change between the R6:MdMYB10 and 'Royal Gala' apples were calculated.
-
About this article
Cite this article
Kumar S, Deng C, Molloy C, Kirk C, Plunkett B, et al. 2022. Extreme-phenotype GWAS unravels a complex nexus between apple (Malus domestica) red-flesh colour and internal flesh browning. Fruit Research 2:12 doi: 10.48130/FruRes-2022-0012
Extreme-phenotype GWAS unravels a complex nexus between apple (Malus domestica) red-flesh colour and internal flesh browning
- Received: 20 May 2022
- Accepted: 12 August 2022
- Published online: 29 August 2022
Abstract: The genetic link between apple red flesh (RF) coloration and the internal flesh browning disorder (FBD) is a major challenge when breeding high fruit quality RF apple cultivars. A genome-wide association study (GWAS) was conducted in a population of about 900 red-leaved seedlings to identify genomic regions and putative candidate genes using whole genome sequencing of the pools of extreme phenotypes (XP) for the RF colour coverage (using the weighted cortex index (WCI)) and FBD. This study identified novel genomic regions contributing to WCI and FBD variation in the red-leaved seedlings. The FBD-associated regions were enriched for genes regulating senescence, heat shock proteins, cytochrome P450, ascorbate metabolism and pectin methyl esterases. Although there were no significant regions in common for WCI and FBD, there were several genes (e.g. MYB85, MYB66, ethylene insensitive 3, DNAJ heat shock protein, WRKY7, and NAC42) enriched commonly between the genomic regions associated with these traits, potentially underpinning the genetic connection between WCI and FBD. Some of the differentially expressed genes between the R6:MdMYB10 and white-fleshed ‘control’ apples resided within the GWAS hotspot for WCI (e.g. chalcone synthase, UDP-Glycosyl transferase) and FBD (e.g. Rho GTPase activating protein, lipoxygenase 1, phytoene synthase) – validating the XP-GWAS findings. Paralogs of several genes resided in the trait-associated genomic regions, suggesting that whole genome duplication plays an important role in the regulation of these traits. Adverse genetic correlations between WCI and sensory traits were observed, and strategies to develop FBD-free high fruit quality RF cultivars are discussed.
-
Key words:
- Red-fleshed apple /
- Internal browning /
- Fruit quality /
- XP-GWAS /
- Genetic architecture