ARTICLE   Open Access    

Infrared guided smart food formulation: an innovative spectral reconstruction strategy to develop anticipated and constant apple puree products

  • An innovative chemometric method was developed to exploit visible and near-infrared (Vis-NIR) spectroscopy to guide food formulation toward the anticipated and constant quality of final products. First, a total of 671 spectral variables related to the puree quality characteristics were identified by spectral variable selection methods. Second, the concentration profiles from multivariate curve resolution-alternating least squares (MCR-ALS) made it possible to reconstruct the identified spectral variables of formulated purees. Partial least squares regression based on the reconstructed Vis-NIR spectral variables was shown to predict the final puree quality, such as a* values (RPD = 3.30), total sugars (RPD = 2.64), titratable acidity (RPD = 2.55) and malic acid (RPD = 2.67), based only on the spectral data of the component puree cultivars. These results open the possibility of controlling puree formulation: a multiparameter optimization of the color and taste of final puree products can be obtained using only the Vis-NIR spectral data of single-cultivar purees.
  • As a major staple crop, maize today accounts for approximately 40% of total worldwide cereal production (http://faostat.fao.org/). Since its domestication ~9,000 years ago from a subgroup of teosinte (Zea mays ssp. parviglumis) in the tropical lowlands of southwest Mexico[1], its cultivation area has greatly expanded, covering most of the world[2]. Human breeding and utilization of maize have gone through several stages, from landraces to open-pollinated varieties (OPVs), double-cross hybrids (1930s-1950s) and, since the mid-1950s, single-cross hybrids. Nowadays, global maize production is mostly provided by single-cross hybrids, which exhibit higher yields and better stress tolerance than OPVs and double-cross hybrids[3].

    Besides its agronomic importance, maize has also been used as a model plant species for genetic studies due to its out-crossing habit, the large quantities of seeds it produces and the availability of diverse germplasm. The abundant mutants of maize facilitated the development of the first genetic and cytogenetic maps of plants, and made it an ideal plant species to identify regulators of developmental processes[4−6]. Although maize initially lagged behind other model plant species (such as Arabidopsis and rice) in multi-omics research, the recent rapid development of sequencing and transformation technologies, together with various new tools (such as CRISPR technologies, doubled haploids etc.), is repositioning maize research at the frontiers of plant research. It will surely continue to reveal fundamental insights into plant biology and to accelerate molecular breeding for this vitally important crop[7, 8].

    During domestication from teosinte to maize, a number of distinguishing morphological and physiological changes occurred, including increased apical dominance, reduced glumes, suppression of ear prolificacy, increase in kernel row number, loss of seed shattering, nutritional changes etc.[9] (Fig. 1). At the genomic level, genome-wide genetic diversity was reduced due to a population bottleneck effect, accompanied by directional selection at specific genomic regions underlying agronomically important traits. Over a century ago, Beadle initially proposed that four or five genes or blocks of genes might be responsible for much of the phenotypic changes between maize and teosinte[10,11]. Later studies by Doebley et al. used teosinte–maize F2 populations to dissect several quantitative trait loci (QTL) to the responsible genes (such as tb1 and tga1)[12,13]. On the other hand, based on analysis of single-nucleotide polymorphisms (SNPs) in 774 genes, Wright et al.[14] estimated that 2%−4% of maize genes (~800−1,700 genes genome-wide) were selected during maize domestication and subsequent improvement. Taking advantage of next-generation sequencing (NGS) technologies, Hufford et al.[15] conducted resequencing analysis of a set of wild relatives, landraces and improved maize varieties, and identified ~500 selective genomic regions during maize domestication. In a recent study, Xu et al.[16] conducted a genome-wide survey of 982 maize inbred lines and 190 teosinte accessions. They identified 394 domestication sweeps and 360 adaptation sweeps. Collectively, these studies suggest that maize domestication likely involved hundreds of genomic regions. Nevertheless, far fewer domestication genes have been functionally studied so far.

    Figure 1.  Main traits of maize involved in domestication and improvement.

    During maize domestication, one of the most profound morphological changes was an increase in apical dominance, transforming the multi-branched plant architecture of teosinte into a single-stalked plant (terminated by a tassel) in maize. The tillers and long branches of teosinte are terminated by tassels and bear many small ears. In contrast, the single maize stalk bears few ears and is terminated by a tassel[9,12,17]. A series of landmark studies by Doebley et al. elegantly demonstrated that tb1, which encodes a TCP transcription factor, is responsible for this transformation[18, 19]. Later studies showed that insertion of a Hopscotch transposon located ~60 kb upstream of tb1 enhances the expression of tb1 in maize, thereby repressing branch outgrowth[20, 21]. Through ChIP-seq and RNA-seq analyses, Dong et al.[22] demonstrated that tb1 acts to regulate multiple phytohormone signaling pathways (gibberellins, abscisic acid and jasmonic acid) and sugar sensing. Moreover, several other domestication loci, including teosinte glume architecture1 (tga1) and prol1.1/grassy tillers1, were identified as its putative targets. Elucidating the precise regulatory mechanisms of these loci and signaling pathways will be an interesting and rewarding area of future research. Also worth noting, studies showed that tb1 and its homologous genes in Arabidopsis (Branched1 or BRC1) and rice (FINE CULM1 or FC1) play a conserved role in repressing the outgrowth of axillary branches in both dicotyledonous and monocotyledonous plants[23, 24].

    Teosinte ears possess two ranks of fruitcase-enclosed kernels, while maize produces hundreds of naked kernels on the ear[13]. tga1, which encodes a squamosa-promoter binding protein (SBP) transcription factor, underlies this transformation[25]. It has been shown that a de novo mutation occurred during maize domestication, causing a single amino acid substitution (Lys to Asn) in the TGA1 protein, altering its binding activity to its target genes, including a group of MADS-box genes that regulate glume identity[26].

    Prolificacy, the number of ears per plant, is also a domestication trait. It has been shown that grassy tillers 1 (gt1), which encodes an HD-ZIP I transcription factor, suppresses prolificacy by promoting lateral bud dormancy and suppressing elongation of the lateral ear branches[27]. The expression of gt1 is induced by shading and requires the activity of tb1, suggesting that gt1 acts downstream of tb1 to mediate the suppressed branching activity in response to shade. Later studies mapped a large-effect QTL for prolificacy (prol1.1) to a 2.7 kb 'causative region' upstream of the gt1 gene[28]. In addition, a recent study identified a new QTL, qEN7 (for ear number on chromosome 7). Zm00001d020683, which encodes a putative INDETERMINATE DOMAIN (IDD) transcription factor, was identified as the likely candidate gene based on its expression pattern and signature of selection during maize improvement[29]. However, its functionality and regulatory relationship with tb1 and gt1 remain to be elucidated.

    A smaller leaf angle, and thus a more compact plant architecture, is a desired trait for modern maize varieties. Tian et al.[30] used a maize-teosinte BC2S3 population and cloned two QTLs (Upright Plant Architecture1 and 2 [UPA1 and UPA2]) that regulate leaf angle. Interestingly, the authors showed that the functional variant of UPA2 is a 2-bp InDel located 9.5 kb upstream of ZmRAVL1, which encodes a B3 domain transcription factor. The 2-bp InDel flanks the binding site of the transcription factor Drooping Leaf1 (DRL1)[31], which represses ZmRAVL1 expression through interacting with Liguleless1 (LG1), an SBP-box transcription factor essential for leaf ligule and auricle development[32]. UPA1 encodes brassinosteroid C-6 oxidase1 (brd1), a key enzyme for biosynthesis of active brassinolide (BR). The teosinte-derived allele of UPA2 binds DRL1 more strongly, leading to lower expression of ZmRAVL1 and thus lower expression of brd1 and lower BR levels, and ultimately a smaller leaf angle. Notably, the authors demonstrated that the teosinte-derived allele of UPA2 confers enhanced yields under high planting densities when introgressed into modern maize varieties[30, 33].

    Maize plants exhibit a salient vegetative phase change, which marks the vegetative transition from the juvenile stage to the adult stage and is characterized by several differences between leaves produced before and after the transition, such as the production of leaf epicuticular wax and epidermal hairs. Previous studies reported that Glossy15 (Gl15), which encodes an AP2-like transcription factor, promotes juvenile leaf identity and suppresses adult leaf identity. Ectopic overexpression of Gl15 causes delayed vegetative phase change and flowering, while loss-of-function gl15 mutants display earlier vegetative phase change[34]. In another study, Gl15 was identified as a major QTL (qVT9-1) controlling the difference in the vegetative transition between maize and teosinte. Further, it was shown that a pre-existing low-frequency standing variation, SNP2154-G, was selected during domestication and likely represents the causal variation underlying differential expression of Gl15, and thus the difference in the vegetative transition between maize and teosinte[35].

    A number of studies documented evidence that tassels replace upper ears1 (tru1) is a key regulator of the conversion of the male terminal lateral inflorescence (tassel) in teosinte to a female terminal inflorescence (ear) in maize. tru1 encodes a BTB/POZ ankyrin repeat domain protein, and it is directly targeted by tb1, suggesting their close regulatory relationship[36]. In addition, a number of regulators of maize inflorescence morphology were also shown to be selective targets during maize domestication, including ramosa1 (ra1)[37, 38], which encodes a putative transcription factor repressing inflorescence (ear and tassel) branching; Zea Agamous-like1 (zagl1)[39], which encodes a MADS-box transcription factor regulating flowering time and ear size; Zea floricaula leafy2 (zfl2, homologue of Arabidopsis Leafy)[40, 41], which likely regulates ear rank number; and barren inflorescence2 (bif2, ortholog of the Arabidopsis serine/threonine kinase PINOID)[42, 43], which regulates the formation of spikelet pair meristems and branch meristems on the tassel. The detailed regulatory networks of these key regulators of maize inflorescence remain to be elucidated.

    Kernel row number (KRN) and kernel weight are two important determinants of maize yield. A number of domestication genes modulating KRN and kernel weight have been identified and cloned, including KRN1, KRN2, KRN4 and qHKW1. KRN4 was mapped to a 3-kb regulatory region located ~60 kb downstream of Unbranched3 (UB3), which encodes an SBP transcription factor and negatively regulates KRN by impinging on multiple hormone signaling pathways (cytokinin, auxin and CLV-WUS)[44, 45]. Studies have also shown that a harbinger TE in the intergenic region and a SNP (S35) in the third exon of UB3 act in an additive fashion to regulate the expression level of UB3 and thus KRN[46].

    KRN1 encodes an AP2 transcription factor that pleiotropically affects plant height, spike density and grain size of maize[47], and is allelic to ids1/Ts6 (indeterminate spikelet 1/Tassel seed 6)[48]. Notably, KRN1 is homologous to the wheat domestication gene Q, a major regulator of spike/spikelet morphology and grain threshability in wheat[49].

    KRN2 encodes a WD40 domain protein and negatively regulates kernel row number[50]. Selection in a ~700-bp upstream region (containing the 5′ UTR) of KRN2 during domestication resulted in reduced expression and thus increased kernel row number. Interestingly, its orthologous gene in rice, OsKRN2, was also shown to have been selected during rice domestication; it negatively regulates secondary panicle branches and thus grain number. These observations suggest that convergent selection of yield-related genes occurred during the parallel domestication of cereal crops.

    qHKW1 is a major QTL for hundred-kernel weight (HKW)[51]. It encodes a CLAVATA1 (CLV1)/BARELY ANY MERISTEM (BAM)-related receptor kinase-like protein that positively regulates HKW. An 8.9-kb insertion in its promoter region was found to enhance its expression, leading to enhanced HKW[52]. In addition, Chen et al.[53] reported cloning of a major QTL for kernel morphology, qKM4.08, which encodes ZmVPS29, a retromer complex component. Sequencing and association analysis revealed that ZmVPS29 was a selective target during maize domestication. The authors also identified two polymorphic sites in its promoter region that are significantly associated with kernel morphology. Moreover, a strong selective signature was detected in ZmSWEET4c during maize domestication. ZmSWEET4c encodes a hexose transporter protein functioning in sugar transport across the basal endosperm transfer cell layer (BETL) during seed filling[54]. The favorable alleles of these genes could serve as valuable targets for genetic improvement of maize yield.

    In a recent effort to analyze more systematically the teosinte alleles that could contribute to the yield potential of maize, Wang et al.[55] constructed four backcrossed maize-teosinte recombinant inbred line (RIL) populations and conducted detailed phenotyping of 26 agronomic traits under five environmental conditions. They identified 71 QTL associated with 24 plant architecture and yield related traits through inclusive composite interval mapping. Interestingly, they identified Zm00001eb352570 and Zm00001eb352580, both of which encode ethylene-responsive transcription factors, as two key candidate genes regulating ear height and the ratio of ear to plant height. Chen et al.[56] constructed a teosinte nested association mapping (TeoNAM) population, and performed joint-linkage mapping and GWAS analyses of 22 domestication and agronomic traits. They found that the maize homologue of PROSTRATE GROWTH1, a rice domestication gene controlling the switch from prostrate to erect growth, is also a QTL associated with tillering in teosinte and maize. They also detected multiple QTL for days-to-anthesis (such as ZCN8 and ZmMADS69) and other traits (such as tassel branch number and tillering) that could be exploited for maize improvement. These lines of work highlight again the value of mining the vast number of superior alleles hidden in teosinte for future maize genetic improvement.

    Loss of seed shattering was also a key trait of maize domestication, as in other cereals. shattering1 (sh1) encodes a zinc finger and YABBY domain protein that regulates seed shattering. Interestingly, sh1 was demonstrated to have undergone parallel domestication in several cereals, including rice, maize, sorghum, and foxtail millet[57]. Later studies showed that the foxtail millet sh1 gene represses lignin biosynthesis in the abscission layer, and that an 855-bp Harbinger transposable element insertion in sh1 causes loss of seed shattering in foxtail millet[58].

    In addition to morphological traits, a number of physiological and nutrition-related traits have also been selected during maize domestication. Based on a survey of nucleotide diversity, Whitt et al.[59] reported that six genes involved in starch metabolism (ae1, bt2, sh1, sh2, su1 and wx1) are selective targets during maize domestication. Palaisa et al.[60] reported selection of the Y1 gene (encoding a phytoene synthase) for increased nutritional value. Karn et al.[61] identified two, three, and six QTLs for starch, protein and oil, respectively, and showed that teosinte alleles can be exploited for the improvement of kernel composition traits in modern maize germplasm. Fan et al.[62] reported strong selection imposed on waxy (wx) in the Chinese waxy maize population. Moreover, a recent exciting study reported the identification of a teosinte-derived allele of teosinte high protein 9 (Thp9) conferring increased protein level and nitrogen utilization efficiency (NUE). It was further shown that Thp9 encodes an asparagine synthetase 4 and that incorrect splicing of Thp9-B73 transcripts in temperate maize varieties is responsible for its diminished expression, and thus reduced NUE and protein content[63].

    Teosinte is known to confer superior disease resistance and adaptation to extreme environments (such as low phosphorus and high salinity). de Lange et al. and Lennon et al.[64−66] reported the identification of teosinte-derived QTLs for resistance to gray leaf spot and southern leaf blight in maize. Mano & Omori reported that teosinte-derived QTLs could confer flooding tolerance[67]. Feng et al.[68] identified four teosinte-derived QTL that could improve resistance to Fusarium ear rot (FER) caused by Fusarium verticillioides. Recently, Wang et al.[69] reported a MYB transcription repressor of teosinte origin (ZmMM1) that confers resistance to northern leaf blight (NLB), southern corn rust (SCR) and gray leaf spot (GLS) in maize, while Zhang et al.[70] reported that an elite teosinte-derived allele of ZmHKT1 (SNP947-G; ZmHKT1 encodes a sodium transporter) can effectively improve salt tolerance by exporting Na+ from the above-ground plant parts. Gao et al.[71] reported that ZmSRO1d-R can regulate the balance between crop yield and drought resistance by increasing the ROS level in guard cells, and that it underwent selection during maize domestication and breeding. These studies argue for devoting more effort to tapping the genetic resources hidden in maize’s wild relatives. The genes involved in maize domestication that have been cloned so far are summarized in Table 1. Notably, the enrichment of transcription factors among the cloned domestication genes highlights a crucial role of transcriptional re-wiring in maize domestication.

    Table 1.  Key domestication genes cloned in maize.
    | Gene | Phenotype | Functional annotation | Selection type | Causative change | References |
    |---|---|---|---|---|---|
    | tb1 | Plant architecture | TCP transcription factor | Increased expression | ~60 kb upstream of tb1 enhancing expression | [18−22] |
    | tga1 | Hardened fruitcase | SBP-domain transcription factor | Protein function | A SNP in exon (K-N) | [25, 26] |
    | gt1 | Plant architecture | Homeodomain leucine zipper | Increased expression | prol1.1 in 2.7 kb upstream of the promoter region increasing expression | [27, 28] |
    | Zm00001d020683 | Plant architecture | INDETERMINATE DOMAIN transcription factor | Protein function | Unknown | [29] |
    | UPA1 | Leaf angle | Brassinosteroid C-6 oxidase1 | Protein function | Unknown | [30] |
    | UPA2 | Leaf angle | B3 domain transcription factor | Increased expression | A 2 bp indel in 9.5 kb upstream of ZmRAVL1 | [30] |
    | Gl15 | Vegetative phase change | AP2-like transcription factor | Altered expression | SNP2154: a stop codon (G-A) | [34, 35] |
    | tru1 | Plant architecture | BTB/POZ ankyrin repeat protein | Increased expression | Unknown | [36] |
    | ra1 | Inflorescence architecture | Transcription factor | Altered expression | Unknown | [37, 38] |
    | zfl | Plant architecture | Transcription factor | Altered expression | Unknown | [40, 41] |
    | UB3 | Kernel row number | SBP-box transcription factor | Altered expression | A TE in the intergenic region; SNP (S35): third exon of UB3 (A-G) increasing expression of UB3 and KRN | [44−46] |
    | KRN1/ids1/Ts6 | Kernel row number | AP2 transcription factor | Increased expression | Unknown | [47, 48] |
    | KRN2 | Kernel row number | WD40 domain | Decreased expression | Unknown | [50] |
    | qHKW1 | Kernel row weight | CLV1/BAM-related receptor kinase-like protein | Increased expression | 8.9 kb insertion upstream of HKW | [51, 52] |
    | ZmVPS29 | Kernel morphology | A retromer complex component | Protein function | Two SNPs (S-1830 and S-1558) in the promoter of ZmVPS29 | [53] |
    | ZmSWEET4c | Seed filling | Hexose transporter | Protein function | Unknown | [54] |
    | ZmSh1 | Shattering | A zinc finger and YABBY transcription factor | Protein function | Unknown | [57, 58] |
    | Thp9 | Nutrition quality | Asparagine synthetase 4 enzyme | Protein function | A deletion in 10th intron of Thp9 reducing NUE and protein content | [63] |
    | ZmMM1 | Biotic stress | MYB transcription repressor | Protein function | Unknown | [69] |
    | ZmHKT1 | Abiotic stress | A sodium transporter | Protein function | SNP947-G: a nonsynonymous variation increasing salt tolerance | [70] |
    | ZmSRO1d-R | Drought resistance and production | PolyADP-ribose polymerase and C-terminal RST domain | Protein function | Three non-synonymous variants: SNP131 (A44G), SNP134 (V45A) and InDel433 | [71] |

    After its domestication from its wild progenitor teosinte in tropical southwestern Mexico, maize has become the most widely cultivated crop worldwide owing to its extensive range expansion and adaptation to diverse environmental conditions (such as temperature and day length). A key prerequisite for the spread of maize from tropical to temperate regions is reduced photoperiod sensitivity[72]. It was recently shown that ZEA CENTRORADIALIS 8 (ZCN8), a FLOWERING LOCUS T (FT) homologue, underlies a major quantitative trait locus (qDTA8) for flowering time[73]. Interestingly, it has been shown that step-wise cis-regulatory changes occurred in ZCN8 during maize domestication and post-domestication expansion. SNP-1245 was a target of selection during early maize domestication for latitudinal adaptation, and after its fixation, selection of InDel-2339 (most likely introgressed from Zea mays ssp. mexicana) likely contributed to the spread of maize from tropical to temperate regions[74].

    ZCN8 interacts with the basic leucine zipper transcription factor DLF1 (Delayed flowering 1) to form the florigen activation complex (FAC) in maize. Interestingly, DLF1 was found to underlie qLB7-1, a flowering time QTL identified in a maize-teosinte BC2S3 population. Moreover, it was shown that DLF1 directly activates ZmMADS4 and ZmMADS67 in the shoot apex to promote floral transition[75]. In addition, ZmMADS69 underlies the flowering time QTL qDTA3-2 and encodes a MADS-box transcription factor. It acts to inhibit the expression of ZmRap2.7, thereby relieving its repression of ZCN8 expression and causing earlier flowering. Population genetic analyses showed that DLF1, ZmMADS67 and ZmMADS69 are all targets of artificial selection and likely contributed to the spread of maize from the tropics to temperate zones[75, 76].

    In addition, a few genes regulating the photoperiod pathway and contributing to the acclimation of maize to higher latitudes in North America have been cloned, including Vgt1, ZmCCT (also named ZmCCT10), ZmCCT9 and ZmELF3.1. Vgt1 was shown to act as a cis-regulatory element of ZmRap2.7, and a MITE TE located ~70 kb upstream of Vgt1 was found to be significantly associated with flowering time and was a major target for selection during the expansion of maize to the temperate and high-latitude regions[77−79]. ZmCCT is another major flowering-time QTL and it encodes a CCT-domain protein homologous to rice Ghd7[80]. Its causal variation is a 5122-bp CACTA-like TE inserted ~2.5 kb upstream of ZmCCT10[72, 81]. ZmCCT9 was identified as underlying a QTL for days to anthesis (qDTA9). A Harbinger-like TE located ~57 kb upstream of ZmCCT9 showed the most significant association with DTA and is thus believed to be the causal variation[82]. Notably, the CACTA-like TE of ZmCCT10 and the Harbinger-like TE of ZmCCT9 are not observed in surveyed teosinte accessions, hinting that they are de novo mutations that occurred after the initial domestication of maize[72, 82]. ZmELF3.1 was shown to underlie the flowering time QTL qFT3_218. It was demonstrated that ZmELF3.1 and its homolog ZmELF3.2 can form the maize Evening Complex (EC) through physically interacting with ZmELF4.1/ZmELF4.2 and ZmLUX1/ZmLUX2. Zmelf3.1 knockout mutants and Zmelf3.1/3.2 double mutants exhibited delayed flowering under both long-day and short-day conditions. It was further shown that the maize EC promotes flowering by repressing the expression of several known flowering suppressor genes (e.g., ZmCCT9, ZmCCT10, ZmCOL3, ZmPRR37a and ZmPRR73), consequently alleviating their inhibition of several maize florigen genes (ZCN8, ZCN7 and ZCN12). Insertion of two closely linked retrotransposon elements upstream of the ZmELF3.1 coding region increases the expression of ZmELF3.1, thus promoting flowering[83]. The increased frequencies of the causal TEs in Vgt1, ZmCCT10, ZmCCT9 and ZmELF3.1 in temperate maize compared to tropical maize highlight a critical role of these genes during the spread and adaptation of maize to higher-latitude temperate regions through promoting flowering under long-day conditions[72, 81−83].

    In addition, Barnes et al.[84] recently showed that the High Phosphatidyl Choline 1 (HPC1) gene, which encodes a phospholipase A1 enzyme, contributed to the spread of the initially domesticated maize from the warm Mexican southwest to the highlands of Mexico and South America by modulating phosphatidylcholine levels. The mexicana-derived allele harbors a polymorphism that impairs protein function, leading to accelerated flowering and better fitness in the highlands.

    Besides the QTLs and genes characterized above, additional genetic elements likely also contributed to the pre-Columbian spread of maize. Based on the detection of bi-directional gene flow between maize and mexicana, Hufford et al.[85] proposed that incorporation of mexicana alleles into maize may have helped the expansion of maize to the highlands of central Mexico. This proposal was supported by a recent study showing evidence of introgression from the mexicana genome for over 10% of the maize genome[86]. Consistently, Calfee et al.[87] found that sequences of mexicana ancestry increase in high-elevation maize populations, supporting the notion that introgression from mexicana facilitated adaptation of maize to the highland environment. Moreover, a recent study examined the genome-wide genetic diversity of the Zea genus and showed that dozens of flowering-related genes (such as GI, BAS1 and PRR7) are associated with high-latitude adaptation[88]. These studies together demonstrate that introgression of genes from mexicana and selection of genes in the photoperiod pathway contributed to the spread of maize to the temperate regions.

    The genes involved in the pre-Columbian spread of maize that have been cloned so far are summarized in Fig. 2 and Table 2.

    Figure 2.  Genes involved in the pre-Columbian spread of maize to higher latitudes and the temperate regions. World maize production in 2020 is represented by the green bars in the map, from Ritchie et al. (2023). Ritchie H, Rosado P, and Roser M. 2023. "Agricultural Production". Published online at OurWorldInData.org. Retrieved from: 'https://ourworldindata.org/agricultural-production' [Online Resource].
    Table 2.  Flowering time related genes contributing to the pre-Columbian spread of maize.
    | Gene | Functional annotation | Causative change | References |
    |---|---|---|---|
    | ZCN8 | Florigen protein | SNP-1245 and Indel-2339 in promoter | [73, 74] |
    | DLF1 | Basic leucine zipper transcription factor | Unknown | [75] |
    | ZmMADS69 | MADS-box transcription factor | Unknown | [76] |
    | ZmRap2.7 | AP2-like transcription factor | MITE TE inserted ~70 kb upstream | [77−79] |
    | ZmCCT | CCT-domain protein | 5122-bp CACTA-like TE inserted ~2.5 kb upstream | [72, 81] |
    | ZmCCT9 | CCT transcription factor | A harbinger-like element at 57 kb upstream | [82] |
    | ZmELF3.1 | Unknown | Two retrotransposons in the promoter | [83] |
    | HPC1 | Phospholipase A1 enzyme | Unknown | [84] |
    | ZmPRR7 | Unknown | Unknown | [88] |
    | ZmCOL9 | CO-like transcription factor | Unknown | [88] |

    Subsequent to domestication ~9,000 years ago, maize has been continuously subject to human selection during the post-domestication breeding process. Through re-sequencing analysis of 35 improved maize lines, 23 traditional landraces and 17 wild relatives, Hufford et al.[15] identified 484 and 695 selective sweeps during maize domestication and improvement, respectively. Moreover, they found that about a quarter (23%) of domestication sweeps (107) were also selected during improvement, indicating that a substantial portion of the domestication loci underwent continuous selection during post-domestication breeding.

    Genetic improvement of maize culminated in the development of high planting density tolerant hybrid maize to increase grain yield per unit land area[89, 90]. To investigate the key morphological traits that have been selected during modern maize breeding, we recently conducted sequencing and phenotypic analyses of 350 elite maize inbred lines widely used in the US and China over the past few decades. We identified four convergently improved morphological traits related to adaptation to increased planting density, i.e., reduced leaf angle, reduced tassel branch number (TBN), reduced relative plant height (EH/PH) and accelerated flowering. A genome-wide association study (GWAS) identified a total of 166 loci associated with the four selected traits, and found evidence of convergent increases in the frequency of putatively favorable alleles at the identified loci. Moreover, a genome scan using the cross-population composite likelihood ratio approach (XP-CLR) identified a total of 1,888 selective sweeps during modern maize breeding in the US and China. Gene ontology analysis of the 5,356 genes encompassed in the selective sweeps revealed enrichment of genes related to biosynthesis or signaling processes of auxin and other phytohormones, and to responses to light and to biotic and abiotic stresses. This study provides a valuable resource for mining genes regulating morphological and physiological traits underlying adaptation to high-density planting[91].

    In another study, Li et al.[92] identified ZmPGP1 (ABCB1 or Br2) as a selected target gene during maize domestication and genetic improvement. ZmPGP1 is involved in auxin polar transport, and has been shown to have a pleiotropic effect on plant height, stalk diameter, leaf length, leaf angle, root development and yield. Sequence and phenotypic analyses of ZmPGP1 identified SNP1473 as the most significant variant for kernel length and ear grain weight and showed that the SNP1473T allele was selected during both the domestication and improvement processes. Moreover, the authors identified a rare allele of ZmPGP1 carrying a 241-bp deletion in the last exon, which results in significantly reduced plant height and ear height, increased stalk diameter and erect leaves, yet has no negative effect on yield[93], highlighting its potential utility in breeding high-density tolerant maize cultivars.

    Shade avoidance syndrome (SAS) is a set of adaptive responses triggered when plants sense a reduction in the red to far-red light (R:FR) ratio under high planting density conditions, commonly manifested by increased plant height (and thus greater susceptibility to lodging), suppressed branching, accelerated flowering and reduced resistance to pathogens and pests[94, 95]. High-density planting can also cause an extended anthesis-silking interval (ASI), reduced tassel size, smaller ears, and even barrenness[96, 97]. Thus, breeding maize cultivars with attenuated SAS is a priority for adaptation to increased planting density.

    Extensive studies have been performed in Arabidopsis to dissect the regulatory mechanism of SAS, and this topic has recently been reviewed extensively[98]. We recently showed that a major signaling mechanism regulating SAS in Arabidopsis is the regulation of the miR156-SPL module-mediated aging pathway by the phytochrome-PIFs module[99]. We proposed that a similar phytochrome-PIFs-miR156-SPL regulatory pathway might regulate SAS in maize and that the maize SPL genes could be exploited as valuable targets for genetic improvement of plant architecture tailored for high-density planting[100].

    In support of this, it has been shown that the ZmphyBs (ZmphyB1 and ZmphyB2), ZmphyCs (ZmphyC1 and ZmphyC2) and ZmPIFs are involved in regulating SAS in maize[101−103]. In addition, earlier studies have shown that as direct targets of miR156s, three homologous SPL transcription factors, UB2, UB3 and TSH4, regulate multiple agronomic traits including vegetative tillering, plant height, tassel branch number and kernel row number[44, 104]. Moreover, it has been shown that ZmphyBs[101, 105], ZmPIF3.1[91], ZmPIF4.1[102] and TSH4[91] are selective targets during modern maize breeding (Table 3).

    Table 3.  Selective genes underpinning genetic improvement during modern maize breeding.
    | Gene | Phenotype | Functional annotation | Selection type | Causative change | References |
    |---|---|---|---|---|---|
    | ZmPIF3.1 | Plant height | Basic helix-loop-helix transcription factor | Increased expression | Unknown | [91] |
    | TSH4 | Tassel branch number | Transcription factor | Altered expression | Unknown | [91] |
    | ZmPGP1 | Plant architecture | ATP binding cassette transporter | Altered expression | A 241 bp deletion in the last exon of ZmPGP1 | [92, 93] |
    | PhyB2 | Light signal | Phytochrome B | Altered expression | A 10 bp deletion in the translation start site | [101] |
    | ZmPIF4.1 | Light signal | Basic helix-loop-helix transcription factor | Altered expression | Unknown | [102] |
    | ZmKOB1 | Grain yield | Glycotransferase-like protein | Protein function | Unknown | [121] |

    In a recent study to dissect the signaling process regulating inflorescence development in response to the shade signal, Kong et al.[106] compared the gene expression changes along the male and female inflorescence development under simulated shade treatments and normal light conditions, and identified a large set of genes that are co-regulated by developmental progression and simulated shade treatments. They found that these co-regulated genes are enriched in plant hormone signaling pathways and transcription factors. By network analyses, they found that UB2, UB3 and TSH4 act as a central regulatory node controlling maize inflorescence development in response to shade signal, and their loss-of-function mutants exhibit reduced sensitivity to simulated shade treatments. This study provides a valuable genetic source for mining and manipulating key shading-responsive genes for improved tassel and ear traits under high density planting conditions.

    Nowadays, global maize production is mostly provided by hybrid maize, which exhibits heterosis (or hybrid vigor) in yield and stress tolerance over open-pollinated varieties[3]. Hybrid maize breeding has gone through several stages, from the 'inbred-hybrid method' stage by Shull[107] and East[108] in the early twentieth century, to the 'double-cross hybrids' stage (1930s−1950s) by Jones[109], and then the 'single-cross hybrids' stage since the 1960s. Once developed, the single-cross hybrid was quickly adopted globally due to its superior heterosis and ease of production[3].

    Single-cross maize hybrids are produced from crossing two unrelated parental inbred lines (female × male) belonging to genetically distinct pools of germplasm, called heterotic groups. Heterotic groups allow better exploitation of heterosis, since inter-group hybrids display a higher level of heterosis than intra-group hybrids. A specific pair of female and male heterotic groups expressing pronounced heterosis is termed as a heterotic pattern[110, 111]. Initially, the parental lines were derived from a limited number of key founder inbred lines and empirically classified into different heterotic groups (such as SSS and NSS)[112]. Over time, they have expanded dramatically, accompanied by formation of new 'heterotic groups' (such as Iodent, PA and PB). Nowadays, Stiff Stalk Synthetics (SSS) and PA are generally used as FHGs (female heterotic groups), while Non Stiff Stalk (NSS), PB and Sipingtou (SPT) are generally used as the MHGs (male heterotic groups) in temperate hybrid maize breeding[113].

    With the development of molecular biology, various molecular markers, ranging from RFLPs and SSRs to, more recently, high-density genome-wide SNP data, have been utilized to assign newly developed inbred lines into various heterotic groups, and to guide crosses between heterotic pools to produce the most productive hybrids[114−116]. Multiple studies with molecular markers have suggested that heterotic groups have diverged genetically over time for better heterosis[117−120]. However, there has been a lack of a systematic assessment of the effect and contribution of breeding selection on phenotypic improvement and the underlying genomic changes of FHGs and MHGs for different heterotic patterns on a population scale during modern hybrid maize breeding.

    To systematically assess the phenotypic improvement and the underlying genomic changes of FHGs and MHGs during modern hybrid maize breeding, we recently conducted re-sequencing and phenotypic analyses of 21 agronomic traits for a panel of 1,604 modern elite maize lines[121]. Several interesting observations were made: (1) The MHGs experienced more intensive selection than the FHGs during the progression from era I (before the year 2000) to era II (after the year 2000). Significant changes were observed for 18 out of 21 traits in the MHGs, but only 10 of the 21 traits showed significant changes in the FHGs; (2) The MHGs and FHGs experienced both convergent and divergent selection towards different sets of agronomic traits. Both the MHGs and FHGs experienced a decrease in flowering time and an increase in yield and plant architecture related traits, but three traits potentially related to seed dehydration rate were selected in opposite directions in the MHGs and FHGs. GWAS analysis identified 4,329 genes associated with the 21 traits. Consistent with the observed convergent and divergent changes of different traits, we observed a convergent increase in the frequencies of favorable alleles for the convergently selected traits in both the MHGs and FHGs, and anti-directional changes in the frequencies of favorable alleles for the oppositely selected traits. These observations highlight a critical contribution of the accumulation of favorable alleles to agronomic trait improvement of the parental lines of both FHGs and MHGs during modern maize breeding.

    Moreover, FST statistics showed increased genetic differentiation between the respective MHGs and FHGs of the US_SS × US_NSS and PA × SPT heterotic patterns from era I to era II. Further, we detected significant positive correlations between the number of accumulated heterozygous superior alleles of the differentiated genes with increased grain yield per plant and better parent heterosis, supporting a role of the differentiated genes in promoting maize heterosis. Further, mutational and overexpressional studies demonstrated a role of ZmKOB1, which encodes a putative glycotransferase, in promoting grain yield[121]. While this study complemented earlier studies on maize domestication and variation maps in maize, a pitfall of this study is that variation is limited to SNP polymorphisms. Further exploitation of more variants (Indels, PAVs, CNVs etc.) in the historical maize panel will greatly deepen our understanding of the impact of artificial selection on the maize genome, and identify valuable new targets for genetic improvement of maize.

    The ever-increasing worldwide population and anticipated climate deterioration pose a great challenge to global food security and call for more effective and precise breeding methods for crops. To accommodate the projected population increase in the next 30 years, it is estimated that cereal production needs to increase at least 70% by 2050 (FAO). As maize is a staple cereal crop, breeding maize cultivars that are not only high-yielding and of superior quality, but also resilient to environmental stresses, is essential to meet this demand. Recent advances in genome sequencing, genotyping and phenotyping technologies, the generation of multi-omics data (including genomic, phenomic, epigenomic, transcriptomic, proteomic, and metabolomic data), the creation of novel superior alleles by genome editing, and the development of more efficient doubled haploid technologies, integrated with machine learning and artificial intelligence, are ushering in the transition of maize breeding from the Breeding 3.0 stage (biological breeding) into the Breeding 4.0 stage (intelligent breeding)[122, 123]. However, several major challenges remain to be effectively tackled before such a transition can be implemented. First, most agronomic traits of maize are controlled by numerous small-effect QTL and complex genotype-environment interactions (G × E). Thus, elucidating the contribution of the abundant genetic variation in the maize population to phenotypic plasticity remains a major challenge in the post-genomic era of maize genetics and breeding. Secondly, most maize cultivars cultivated nowadays are hybrids that outperform their parental lines through heterosis. Hybrid maize breeding involves the development of elite inbred lines with high general combining ability (GCA) and specific combining ability (SCA) that allow maximal exploitation of heterosis. Despite much effort to dissect the mechanisms of maize heterosis, its molecular basis is still a debated topic[124−126]. Thirdly, only limited maize germplasm is amenable to genetic manipulation (genetic transformation, genome editing, etc.), which significantly hinders the efficiency of genetic improvement. The development of an efficient genotype-independent transformation procedure will greatly boost maize functional genomic research and breeding. Notably, the Smart Corn System recently launched by Bayer promises to revolutionize global corn production in the coming years. At the heart of the new system is short-stature hybrid corn (~30%−40% shorter than traditional hybrids), which offers several advantages: sturdier stems and exceptional lodging resistance under higher planting densities (20%−30% more plants per hectare), higher and more stable yield per unit land area, easier management and application of plant protection products, better use of solar energy, water and other natural resources, and an improved greenhouse gas footprint[127]. Indeed, a new age of maize green revolution is yet to come!

    This work was supported by grants from the Key Research and Development Program of Guangdong Province (2022B0202060005), National Natural Science Foundation of China (32130077) and Hainan Yazhou Bay Seed Lab (B21HJ8101). We thank Professors Hai Wang (China Agricultural University) and Jinshun Zhong (South China Agricultural University) for valuable comments and helpful discussion on the manuscript. We apologize to authors whose excellent work could not be cited due to space limitations.

  • The authors declare that they have no conflict of interest. Haiyang Wang is an Editorial Board member of Seed Biology who was blinded from reviewing or making decisions on the manuscript. The article was subject to the journal's standard procedures, with peer-review handled independently of this Editorial Board member and his research groups.

  • [1] Research FM. 2019. Fruit Puree Market Size and Forecasts (2021−2031), Global and Regional Share, Trends, and Growth Opportunity Analysis. Report. The Insight Partners, New York, USA. www.theinsightpartners.com/reports/fruit-puree-market/

    [2] Defernez M, Kemsley EK, Wilson RH. 1995. Use of Infrared Spectroscopy and Chemometrics for the Authentication of Fruit Purees. Journal of Agricultural and Food Chemistry 43:109−13. doi: 10.1021/jf00049a021

    [3] Lan W, Renard CMGC, Jaillais B, Buergy A, Leca A, et al. 2021. Mid-infrared technique to forecast cooked puree properties from raw apples: A potential strategy towards sustainability and precision processing. Food Chemistry 355:129636. doi: 10.1016/j.foodchem.2021.129636

    [4] Espinosa L, To N, Symoneaux R, Renard CMGC, Biau N, et al. 2011. Effect of processing on rheological, structural and sensory properties of apple puree. Procedia Food Science 1:513−20. doi: 10.1016/j.profoo.2011.09.078

    [5] Lan W, Renard CMGC, Jaillais B, Leca A, Bureau S. 2020. Fresh, freeze-dried or cell wall samples: Which is the most appropriate to determine chemical, structural and rheological variations during apple processing using ATR-FTIR spectroscopy? Food Chemistry 330:127357. doi: 10.1016/j.foodchem.2020.127357

    [6] Picouet PA, Landl A, Abadias M, Castellari M, Viñas I. 2009. Minimal processing of a Granny Smith apple purée by microwave heating. Innovative Food Science & Emerging Technologies 10:545−50. doi: 10.1016/j.ifset.2009.05.007

    [7] O'Sullivan M. 2016. A handbook for sensory and consumer-driven new product development: Innovative technologies for the food and beverage industry. Cambridge: Woodhead Publishing. https://doi.org/10.1016/C2014-0-03843-9

    [8] Giovanelli G, Sinelli N, Beghi R, Guidetti R, Casiraghi E. 2014. NIR spectroscopy for the optimization of postharvest apple management. Postharvest Biology and Technology 87:13−20. doi: 10.1016/j.postharvbio.2013.07.041

    [9] McGlone VA, Jordan RB, Martinsen PJ. 2002. Vis/NIR estimation at harvest of pre- and post-storage quality indices for 'Royal Gala' apple. Postharvest Biology and Technology 25:135−44. doi: 10.1016/S0925-5214(01)00180-6

    [10] Zude M, Herold B, Roger JM, Bellon-Maurel V, Landahl S. 2006. Non-destructive tests on the prediction of apple fruit flesh firmness and soluble solids content on tree and in shelf life. Journal of Food Engineering 77:254−60. doi: 10.1016/j.jfoodeng.2005.06.027

    [11] Contal L, León V, Downey G. 2002. Detection and quantification of apple adulteration in strawberry and raspberry purées using visible and near infrared spectroscopy. Journal of Near Infrared Spectroscopy 10:289−99. doi: 10.1255/jnirs.345

    [12] Kemsley EK, Holland JK, Defernez M, Wilson RH. 1996. Detection of adulteration of raspberry purees using infrared spectroscopy and chemometrics. Journal of Agricultural and Food Chemistry 44:3864−70. doi: 10.1021/jf960089l

    [13] Lan W, Bureau S, Chen S, Leca A, Renard CMGC, et al. 2021. Visible, near- and mid-infrared spectroscopy coupled with an innovative chemometric strategy to control apple puree quality. Food Control 120:107546. doi: 10.1016/j.foodcont.2020.107546

    [14] Vohland M, Ludwig M, Thiele-Bruhn S, Ludwig B. 2014. Determination of soil properties with visible to near- and mid-infrared spectroscopy: Effects of spectral variable selection. Geoderma 223-225:88−96. doi: 10.1016/j.geoderma.2014.01.013

    [15] Guo Z, Wang M, Agyekum AA, Wu J, Chen Q, et al. 2020. Quantitative detection of apple watercore and soluble solids content by near infrared transmittance spectroscopy. Journal of Food Engineering 279:109955. doi: 10.1016/j.jfoodeng.2020.109955

    [16] Zhang D, Xu Y, Huang W, Tian X, Xia Y, et al. 2019. Nondestructive measurement of soluble solids content in apple using near infrared hyperspectral imaging coupled with wavelength selection algorithm. Infrared Physics & Technology 98:297−304. doi: 10.1016/j.infrared.2019.03.026

    [17] Tian X, Fan S, Li J, Xia Y, Huang W, et al. 2019. Comparison and optimization of models for SSC on-line determination of intact apple using efficient spectrum optimization and variable selection algorithm. Infrared Physics & Technology 102:102979. doi: 10.1016/j.infrared.2019.102979

    [18] Engelen L, de Wijk RA. 2012. Oral processing and texture perception. In Food Oral Processing, eds. Chen J, Engelen L. Hoboken: Blackwell Publishing. pp. 157−76. https://doi.org/10.1002/9781444360943.ch8

    [19] The R Core Team. 2019. R: A language and environment for statistical computing. http://lib.stat.cmu.edu/R/CRAN/doc/manuals/r-devel/fullrefman.pdf

    [20] Cordella CBY, Bertrand D. 2014. SAISIR: A new general chemometric toolbox. TrAC Trends in Analytical Chemistry 54:75−82. doi: 10.1016/j.trac.2013.10.009

    [21] de Juan A, Tauler R. 2006. Multivariate curve resolution (MCR) from 2000: progress in concepts and applications. Critical Reviews in Analytical Chemistry 36:163−76. doi: 10.1080/10408340600970005

    [22] Nicolaï BM, Beullens K, Bobelyn E, Peirs A, Saeys W, et al. 2007. Nondestructive measurement of fruit and vegetable quality by means of NIR spectroscopy: A review. Postharvest Biology and Technology 46:99−118. doi: 10.1016/j.postharvbio.2007.06.024

    [23] Lan W, Jaillais B, Chen S, Renard CMGC, Leca A, et al. 2022. Fruit variability impacts puree quality: Assessment on individually processed apples using the visible and near infrared spectroscopy. Food Chemistry 390:133088. doi: 10.1016/j.foodchem.2022.133088

    [24] Buergy A, Rolland-Sabaté A, Leca A, Renard CMGC. 2020. Pectin modifications in raw fruits alter texture of plant cell dispersions. Food Hydrocolloids 107:105962. doi: 10.1016/j.foodhyd.2020.105962

    [25] Buergy A, Rolland-Sabaté A, Leca A, Renard CMGC. 2021. Apple puree's texture is independent from fruit firmness. LWT 145:111324. doi: 10.1016/j.lwt.2021.111324

    [26] Chen J, Engelen L. 2012. Food oral processing: fundamentals of eating and sensory perception. UK: Blackwell Publishing. https://doi.org/10.1002/9781444360943

    [27] Lan W, Jaillais B, Leca A, Renard CMGC, Bureau S. 2020. A new application of NIR spectroscopy to describe and predict purees quality from the non-destructive apple measurements. Food Chemistry 310:125944. doi: 10.1016/j.foodchem.2019.125944

    [28] Takos AM, Ubi BE, Robinson SP, Walker AR. 2006. Condensed tannin biosynthesis genes are regulated separately from other flavonoid biosynthesis genes in apple fruit skin. Plant Science 170:487−99. doi: 10.1016/j.plantsci.2005.10.001

    [29] Zude-Sasse M, Truppel I, Herold B. 2002. An approach to non-destructive apple fruit chlorophyll determination. Postharvest Biology and Technology 25:123−33. doi: 10.1016/S0925-5214(01)00173-9

    [30] Bobelyn E, Serban AS, Nicu M, Lammertyn J, Nicolai BM, et al. 2010. Postharvest quality of apple predicted by NIR-spectroscopy: Study of the effect of biological variability on spectra and model performance. Postharvest Biology and Technology 55:133−43. doi: 10.1016/j.postharvbio.2009.09.006

    [31] Solomakhin A, Blanke MM. 2010. Can coloured hailnets improve taste (sugar, sugar: acid ratio), consumer appeal (colouration) and nutritional value (anthocyanin, vitamin C) of apple fruit? LWT - Food Science and Technology 43:1277−84. doi: 10.1016/j.lwt.2010.02.020

    [32] Omar AF, Atan H, MatJafri MZ. 2012. Peak response identification through near-infrared spectroscopy analysis on aqueous sucrose, glucose, and fructose solution. Spectroscopy Letters 45:190−201. doi: 10.1080/00387010.2011.604065

    [33] Lu R, Guyer DE, Beaudry RM. 2000. Determination of firmness and sugar content of apples using near-infrared diffuse reflectance. Journal of Texture Studies 31:615−30. doi: 10.1111/j.1745-4603.2000.tb01024.x

    [34] Zhu D, Ma Z, Lu A, Zhao L, Tu Z, et al. 2010. The Detection of Soluble Solid Contents and Conductivity of Apple Juice by Homemade Near Infrared Spectrometer. Sensor Letters 8:158−62. doi: 10.1166/sl.2010.1219

    [35] Wang H, Peng J, Xie C, Bao Y, He Y. 2015. Fruit quality evaluation using spectroscopy technology: a review. Sensors 15:11889−927. doi: 10.3390/s150511889

    [36] Malvandi A, Feng H, Kamruzzaman M. 2022. Application of NIR spectroscopy and multivariate analysis for Non-destructive evaluation of apple moisture content during ultrasonic drying. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 269:120733. doi: 10.1016/j.saa.2021.120733

    [37] Le Dréau Y, Dupuy N, Artaud J, Ollivier D, Kister J. 2009. Infrared study of aging of edible oils by oxidative spectroscopic index and MCR-ALS chemometric method. Talanta 77:1748−56. doi: 10.1016/j.talanta.2008.10.012

  • Cite this article

    Wang Z, Bureau S, Jaillais B, Renard CMGC, Chen X, et al. 2024. Infrared guided smart food formulation: an innovative spectral reconstruction strategy to develop anticipated and constant apple puree products. Food Innovation and Advances 3(1): 20−30 doi: 10.48130/fia-0024-0003


Food Innovation and Advances 2024, 3(1): 20−30

Abstract: An innovative chemometric method was developed to exploit visible and near-infrared (Vis-NIR) spectroscopy to guide food formulation to reach the anticipated and constant quality of final products. First, a total of 671 spectral variables related to the puree quality characteristics were identified by spectral variable selection methods. Second, the concentration profiles from multivariate curve resolution-alternating least squares (MCR-ALS) made it possible to reconstruct the identified spectral variables of formulated purees. Partial least squares regression based on the reconstructed Vis-NIR spectral variables was shown to predict the final puree quality, such as a* values (RPD = 3.30), total sugars (RPD = 2.64), titratable acidity (RPD = 2.55) and malic acid (RPD = 2.67), based only on the spectral data of the component single-cultivar purees. These results open the possibility of controlling puree formulation: a multiparameter optimization of the color and taste of final puree products can be obtained using only the Vis-NIR spectral data of single-cultivar purees.

    • Apple puree accounts for the second-largest share of the fruit puree market, with a global market value of about 2,000 million USD annually[1]. It is used as the basic ingredient of jams, preserves and compotes, which are popular among people of all ages, especially babies and the elderly[2]. According to previous work, the large diversity of apple cultivars[3] and processing conditions (cooking parameters, grinding intensity, refining levels, etc.)[4−6] can introduce strong chemical, textural and rheological variations in processed purees. However, current puree processing systems are not always adapted and optimized to the raw apples, yet they must still meet the constant quality standards required of final products. Further, growing consumer demand for a variety of tailored products has put pressure on industrial manufacturers to develop personalized puree products. Therefore, it would be highly beneficial to develop innovative puree production strategies that can accommodate a large variability of raw materials and provide new solutions to reach the anticipated and constant taste and texture of purees.

      Puree formulation, that is, blending single apple varieties in different proportions, is one of the most economical and efficient strategies for manufacturers to adjust the texture and taste of final puree products[7]. However, fruit manufacturers often lack reliable guidance on how to formulate puree products when the raw materials vary widely. The challenge is therefore to develop innovative strategies that provide specific guidance for the formulation of final puree products based on information from single-cultivar purees, so as to reach their anticipated and constant taste and texture.

      Visible and near-infrared (Vis-NIR) spectroscopy has been applied as a simple and rapid technique to give good predictions of the chemical, physical and textural properties of raw apples[8−10] and processed purees[5,11−13]. In our previous work, a spectral reconstruction strategy based on the concentration profiles of multivariate curve resolution-alternating least squares (MCR-ALS) was first developed[13]. With this strategy, the mid-infrared (MIR) spectra of single-cultivar apple purees can be used to reconstruct the spectra of differently formulated purees, and multivariate regression models built on the reconstructed spectra can then successfully predict the quality characteristics of the formulated purees (soluble sugars, titratable acidity, pH, viscosity, etc.)[13]. However, this strategy could not reconstruct the Vis-NIR spectra of formulated purees from the corresponding spectra of single-cultivar purees, because the MCR-ALS concentration profiles were unacceptable. One possible reason is the large set of strongly collinear and noisy spectral variables, which may compromise the spectral reconstruction.

      To partly compensate for these effects, spectral variable selection is an important approach, as it tends toward a parsimonious data representation and can result in multivariate models with greater predictive ability[14]. In particular, several spectral variable selection methods, such as competitive adaptive reweighted sampling (CARS)[15], the successive projections algorithm (SPA)[16], and uninformative variable elimination (UVE)[17], can significantly improve Vis-NIR prediction accuracy. The challenge here was to reconstruct, by MCR-ALS, the spectra of final formulated purees from the most relevant Vis-NIR spectral variables of single-cultivar apple purees. Predictive models of formulated puree quality traits (physical and chemical) built on the reconstructed spectral dataset could then predict the properties of formulated puree products from the relevant Vis-NIR spectra of their component single-cultivar purees. If so, this new strategy would open the possibility of practical and suitable approaches for the multicriteria optimization of puree formulation with anticipated and constant quality.

      This study aimed to develop a smart food formulation model based on rapid and high-throughput spectral information from the individual components, which could provide formulation guidance and monitor the quality parameters of final products. To reach this objective, (i) the Vis-NIR technique coupled with spectral variable selection methods (CARS, SPA, UVE) was applied both to the different formulated purees and to their corresponding single-cultivar purees to highlight their characteristic spectral variables; (ii) the selected spectral variables of single-cultivar purees were used to reconstruct the spectra of formulated purees; and (iii) the possibility of developing regression models to evaluate the quality parameters of final formulated purees was investigated.
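
      The reconstruction idea described above can be illustrated with a minimal sketch. It is a generic, simplified MCR-ALS written with NumPy, not the authors' implementation: the function name `mcr_als`, the clipping-based non-negativity constraint, the closure step and the synthetic spectra are all assumptions made for illustration. The data matrix D (samples × spectral variables) is factored as C · S, where S is initialised with the single-cultivar spectra and C holds the concentration profiles; the blend spectra are then rebuilt as C · S.

```python
import numpy as np

def mcr_als(D, S_init, n_iter=50, eps=1e-12):
    """Simplified MCR-ALS: factor D (samples x variables) as C @ S.

    S_init: initial pure-component spectra (components x variables),
            here the single-cultivar spectra (an illustrative choice).
    Constraints: non-negative C by clipping and closure (rows sum to 1).
    """
    S = np.asarray(S_init, dtype=float).copy()
    for _ in range(n_iter):
        # Concentration step: least-squares solution of D ~ C @ S for C
        C = np.clip(D @ np.linalg.pinv(S), 0.0, None)
        C = C / (C.sum(axis=1, keepdims=True) + eps)
        # Spectral step: least-squares solution of D ~ C @ S for S
        S = np.linalg.pinv(C) @ D
    return C, S

# Synthetic, noise-free example (made-up spectra, not measured data)
rng = np.random.default_rng(0)
S_true = rng.uniform(0.1, 1.0, size=(2, 40))      # two "single-cultivar" spectra
fractions = np.array([0.10, 0.25, 0.50, 0.75, 0.90, 0.95])
C_true = np.column_stack([fractions, 1.0 - fractions])
D = C_true @ S_true                                # spectra of the blends

C_hat, S_hat = mcr_als(D, S_true)
D_hat = C_hat @ S_hat                              # reconstructed blend spectra
print(np.round(C_hat[:, 0], 3))
```

      On real Vis-NIR data the mixing is not perfectly linear and the variables are noisy and collinear, which is the motivation given in the text for selecting variables before reconstruction.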

    • The experiment was conducted on four apple varieties: 'Golden Delicious' (GD), 'Granny Smith' (GS), 'Braeburn' (BR), and 'Royal Gala' (GA). They were harvested at commercial maturity from La Pugère experimental orchard (Mallemort, Bouches du Rhône, France) in 2019, and stored at 4 °C and around 90% relative humidity for up to 2 months to ensure starch regression. A multi-functional processing system (Roboqbo, Qb8-3, Bentivoglio, Italy) was used to process apple purees following a Hot Break recipe: cooked at 95 °C for 5 min at a 1,500 rpm grinding speed, then cooled down to 65 °C while maintaining the grinding speed. Over three successive weeks, around 2 kg of each apple cultivar was processed into single-cultivar puree, then conditioned in two hermetically sealed cans: one was cooled in a cold room (4 °C) before formulation, while the other was stored at –20 °C for biochemical measurement of individual sugars (fructose, sucrose, and glucose) and malic acid.

    • After puree processing, the four single-cultivar purees were blended pairwise into six experimental groups named 'A' (GD × GS), 'B' (GD × BR), 'C' (GD × GA), 'D' (GS × BR), 'E' (GS × GA) and 'F' (BR × GA), respectively (Fig. 1). Each experimental group (A−F) included nine samples with different weight proportions, which were divided into two subsets: the first included six proportions (10%:90%, 25%:75%, 50%:50%, 75%:25%, 90%:10%, 95%:5%) for the modeling set, while the second included three proportions (80%:20%, 33%:67%, 14%:86%) for the external prediction set. Finally, all the single and formulated purees were prepared for both the Vis-NIR spectral measurements and quality characterizations.

      Figure 1.  Experimental scheme of puree formulation, quality characterization, and spectral acquisition.
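
      As a quick check of the design size, the formulation scheme can be enumerated in a few lines. This is only an illustrative sketch: the variable names are hypothetical, and group 'A' is taken here to be GD × GS, the one pairwise combination of the four cultivars not otherwise listed among groups B to F.

```python
from itertools import combinations

cultivars = ["GD", "GS", "BR", "GA"]

# All pairwise blends of the four cultivars, labelled A-F in enumeration order
pairs = list(combinations(cultivars, 2))
groups = dict(zip("ABCDEF", pairs))

# Weight proportions (first cultivar : second cultivar), as listed in the text
modeling = [(10, 90), (25, 75), (50, 50), (75, 25), (90, 10), (95, 5)]
prediction = [(80, 20), (33, 67), (14, 86)]

design = [
    (label, pair, ratio, subset)
    for label, pair in groups.items()
    for subset, ratios in (("modeling", modeling), ("prediction", prediction))
    for ratio in ratios
]

n_modeling = sum(1 for row in design if row[3] == "modeling")
n_prediction = sum(1 for row in design if row[3] == "prediction")
print(groups)
print(len(design), n_modeling, n_prediction)
```

      The enumeration gives 6 groups × 9 proportions = 54 formulated purees per processing week, split into 36 modeling and 18 external prediction samples.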

    • The puree color was measured three times using a CR-400 chromameter (Minolta, Osaka, Japan) and expressed in the CIE 1976 L*a*b* color space (illuminant D65, 0° view angle, illumination area diameter 8 mm). Puree rheological measurements were carried out using a Physica MCR-301 controlled stress rheometer (Anton Paar, Graz, Austria) and a 6-vane geometry (FL100/6W) with a gap of 3.46 mm, at 22.5 °C. The flow curves were recorded after a pre-shearing period of 1 min at a shear rate of 50 s−1, followed by 5 min at rest. The viscosity was then measured over a controlled shear rate range of 10 to 250 s−1 on a logarithmic ramp. The viscosity values at 50 and 100 s−1 (η50 and η100, respectively) were kept as final indicators of the puree viscosity linked to sensory characteristics during consumption[18].

    • The dry matter content (DMC) was estimated from the weight of freeze-dried samples upon reaching a constant weight (freeze-drier, 5 d). Titratable acidity (TA) was determined by titration up to pH 8.1 with 0.1 mol/L NaOH using an autotitrator (Metrohm, Herisau, Switzerland) and expressed in mmol H+ kg−1 of fresh weight (FW). Individual sugars and malic acid were quantified using colorimetric enzymatic kits (R-Biopharm, Darmstadt, Germany). The contents of glucose, fructose, sucrose, and malic acid were expressed in g kg−1 FW. The total sugar content (TSC) of each puree sample was calculated as the sum of the measured glucose, sucrose and fructose contents. The individual sugar (fructose, glucose, sucrose) and malic acid contents of formulated puree samples were calculated from the values measured on the processed single-cultivar purees.
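
      The calculation of blend composition from single-cultivar values can be sketched as follows. The text does not state the formula, so a mass-weighted average of the two single-cultivar contents is assumed here; the function name `blend` and the numerical contents are hypothetical placeholders, not measured data.

```python
def blend(values_a, values_b, fraction_a):
    """Mass-weighted average of two single-cultivar compositions (g/kg FW).

    fraction_a is the weight fraction of puree A in the blend (0 to 1).
    Linear mixing is an assumption of this sketch.
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie between 0 and 1")
    return {
        key: fraction_a * values_a[key] + (1.0 - fraction_a) * values_b[key]
        for key in values_a
    }

# Hypothetical contents in g/kg FW, for illustration only
puree_a = {"glucose": 20.0, "fructose": 60.0, "sucrose": 40.0, "malic_acid": 4.0}
puree_b = {"glucose": 30.0, "fructose": 50.0, "sucrose": 20.0, "malic_acid": 8.0}

mix = blend(puree_a, puree_b, fraction_a=0.25)     # a 25%:75% blend
# Total sugar content as the sum of the three individual sugars
tsc = mix["glucose"] + mix["fructose"] + mix["sucrose"]
print(mix, tsc)
```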

    • A multi-purpose analyzer spectrometer (Bruker Optics®, Wissembourg, France) with OPUS software Version 5.0 (Bruker Optics®) was used to acquire the Vis-NIR spectral data of purees at 23 °C. It provides diffuse reflectance measurements with a spectral resolution of 8 cm−1 from 500 to 2,500 nm. A total of 32 scans were recorded and averaged for each puree spectrum. Purees were transferred into 10 mL glass vials (5 cm height × 18 mm diameter), which were placed on the automated sample wheel of the spectrophotometer. A reference background measurement was automatically activated before each data set acquisition using an internal Spectralon reference. Each puree sample was measured three times on different aliquots. In total, 36 spectra of single-cultivar purees (4 cultivars × 3 testing weeks × 3 replicates) and 486 spectra of their formulated purees (6 experimental groups × 9 formulated proportions × 3 testing weeks × 3 replicates) were acquired (Fig. 1).

    • After checking the normal distribution with a Shapiro-Wilk test (α = 0.05), the reference data of processed purees were presented as mean values and standard deviations (SD) in Tables 1 & 2. Analysis of variance (ANOVA) and Pearson correlation analysis were carried out to determine the significant differences and internal correlations of puree quality traits among the different single apple cultivars and formulated puree groups using the XLSTAT data analysis toolbox (version 2018.5.52037, Addinsoft SARL, Paris, France), as described in our previous work[13]. The physical (a* and b* values), rheological (viscosity η50), and biochemical (TSC, TA, and fructose) parameters of formulated purees were displayed as boxplots using R software (version 2.6.2)[19] (Fig. 2).

      Table 1.  PLS prediction of physical, chemical and rheological parameters of all formulated purees using Vis-NIR (500–2,500 nm) spectra or their selected spectral variables based on SPA, CARS and UVE methods.

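      | Parameter | Range | SD | Method | Variables | R2c | RMSEC | R2p | RMSEP | RPD |
      |---|---|---|---|---|---|---|---|---|---|
      | L* | 41.6−48.9 | 1.5 | FULL | 2,722 | 0.87 | 0.6 | 0.73 | 0.7 | 1.76 |
      | | | | SPA | 6 | 0.83 | 0.6 | 0.81 | 0.6 | 2.13 |
      | | | | CARS | 51 | 0.87 | 0.6 | 0.81 | 0.6 | 2.14 |
      | | | | UVE | 1,259 | 0.88 | 0.5 | 0.80 | 0.6 | 2.06 |
      | a* | (−4.8)−2.4 | 2.0 | FULL | 2,722 | 0.98 | 0.3 | 0.96 | 0.4 | 5.17 |
      | | | | SPA | 6 | 0.98 | 0.3 | 0.96 | 0.4 | 5.24 |
      | | | | CARS | 33 | 0.98 | 0.3 | 0.97 | 0.3 | 5.56 |
      | | | | UVE | 1,596 | 0.98 | 0.3 | 0.97 | 0.4 | 5.38 |
      | b* | 9.6−18.4 | 1.7 | FULL | 2,722 | 0.71 | 0.9 | 0.54 | 1.2 | 1.48 |
      | | | | SPA | 7 | 0.73 | 0.9 | 0.58 | 1.2 | 1.54 |
      | | | | CARS | 143 | 0.75 | 0.8 | 0.57 | 1.2 | 1.53 |
      | | | | UVE | 997 | 0.75 | 0.8 | 0.59 | 1.2 | 1.55 |
      | Viscosity η100 | 834−1,721 | 210 | FULL | 2,722 | 0.83 | 84.6 | 0.79 | 98.2 | 2.17 |
      | | | | SPA | 9 | 0.79 | 95.5 | 0.82 | 92.4 | 2.30 |
      | | | | CARS | 44 | 0.84 | 83.3 | 0.82 | 91.2 | 2.33 |
      | | | | UVE | 1,193 | 0.87 | 76.0 | 0.85 | 85.4 | 2.49 |
      | Viscosity η50 | 526−1,029 | 119 | FULL | 2,722 | 0.88 | 40.2 | 0.83 | 50.8 | 2.38 |
      | | | | SPA | 11 | 0.82 | 49.0 | 0.82 | 51.5 | 2.35 |
      | | | | CARS | 166 | 0.86 | 43.6 | 0.85 | 47.1 | 2.57 |
      | | | | UVE | 1,133 | 0.88 | 40.9 | 0.87 | 44.5 | 2.73 |
      | DMC (g/g FW) | 0.14−0.17 | 0.01 | FULL | 2,722 | 0.74 | 0.004 | 0.49 | 0.006 | 1.39 |
      | | | | SPA | 9 | 0.56 | 0.006 | 0.50 | 0.006 | 1.42 |
      | | | | CARS | 92 | 0.75 | 0.004 | 0.58 | 0.005 | 1.56 |
      | | | | UVE | 1,497 | 0.80 | 0.004 | 0.63 | 0.005 | 1.66 |
      | TSC (g/kg FW) | 93.2−145.4 | 12.6 | FULL | 2,722 | 0.96 | 2.7 | 0.90 | 3.6 | 3.57 |
      | | | | SPA | 9 | 0.96 | 2.8 | 0.92 | 3.6 | 3.53 |
      | | | | CARS | 101 | 0.97 | 2.7 | 0.92 | 3.5 | 3.66 |
      | | | | UVE | 1,531 | 0.96 | 2.7 | 0.92 | 3.6 | 3.63 |
      | TA (meq/kg FW) | 28.0−94.8 | 16.2 | FULL | 2,722 | 0.96 | 0.3 | 0.91 | 0.5 | 3.31 |
      | | | | SPA | 10 | 0.94 | 0.4 | 0.90 | 0.5 | 3.12 |
      | | | | CARS | 92 | 0.96 | 0.3 | 0.91 | 0.5 | 3.37 |
      | | | | UVE | 1,061 | 0.96 | 0.3 | 0.91 | 0.5 | 3.34 |
      | pH | 3.39−4.47 | 0.23 | FULL | 2,722 | 0.90 | 0.07 | 0.86 | 0.09 | 2.57 |
      | | | | SPA | 8 | 0.69 | 0.13 | 0.69 | 0.14 | 1.76 |
      | | | | CARS | 51 | 0.88 | 0.08 | 0.88 | 0.09 | 2.69 |
      | | | | UVE | 1,385 | 0.88 | 0.08 | 0.87 | 0.09 | 2.73 |
      | Glucose (g/kg FW) | 13.2−28.3 | 3.7 | FULL | 2,722 | 0.86 | 1.2 | 0.85 | 1.3 | 2.42 |
      | | | | SPA | 5 | 0.85 | 1.3 | 0.84 | 1.4 | 2.51 |
      | | | | CARS | 38 | 0.87 | 1.1 | 0.86 | 1.3 | 2.58 |
      | | | | UVE | 1,181 | 0.86 | 1.2 | 0.85 | 1.3 | 2.54 |
      | Fructose (g/kg FW) | 40.2−80.3 | 9.1 | FULL | 2,722 | 0.71 | 4.9 | 0.54 | 6.0 | 1.46 |
      | | | | SPA | 12 | 0.59 | 5.8 | 0.57 | 6.2 | 1.42 |
      | | | | CARS | 92 | 0.64 | 5.5 | 0.61 | 5.5 | 1.61 |
      | | | | UVE | 1,235 | 0.63 | 5.6 | 0.61 | 5.5 | 1.61 |
      | Sucrose (g/kg FW) | 33.2−57.3 | 5.5 | FULL | 2,722 | 0.74 | 2.8 | 0.66 | 3.1 | 1.72 |
      | | | | SPA | 16 | 0.67 | 3.2 | 0.64 | 3.2 | 1.65 |
      | | | | CARS | 92 | 0.73 | 2.9 | 0.63 | 3.3 | 1.61 |
      | | | | UVE | 1,279 | 0.67 | 3.2 | 0.64 | 3.4 | 1.60 |
      | Malic acid (g/kg FW) | 3.0−8.8 | 1.3 | FULL | 2,722 | 0.93 | 0.3 | 0.91 | 0.4 | 3.33 |
      | | | | SPA | 20 | 0.91 | 0.4 | 0.90 | 0.4 | 3.16 |
      | | | | CARS | 92 | 0.92 | 0.4 | 0.91 | 0.4 | 3.34 |
      | | | | UVE | 952 | 0.92 | 0.4 | 0.92 | 0.4 | 3.36 |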
      R2c: determination coefficient of the calibration test; R2p: determination coefficient of the external prediction test; RMSEP: root mean square error of prediction test; RPD: the residual predictive deviation of the prediction test.

      Table 2.  Prediction results of chemical and rheological parameters of all formulated purees from the reconstructed spectra computed by the concentration of MCR-ALS and the selected spectral variables of single-cultivar purees.

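      | Parameter | Range | SD | LVs | R2c | RMSEC | R2p | RMSEP | RPD |
      |---|---|---|---|---|---|---|---|---|
      | L* | 41.6−48.9 | 1.5 | 10 | 0.77 | 0.7 | 0.62 | 0.8 | 1.57 |
      | a* | (−4.8)−2.4 | 2.0 | 8 | 0.91 | 0.6 | 0.92 | 0.6 | 3.30 |
      | b* | 9.6−18.4 | 1.7 | 9 | 0.58 | 1.0 | 0.42 | 1.4 | 1.31 |
      | Viscosity η100 | 834−1,721 | 210 | 10 | 0.81 | 87 | 0.81 | 96 | 2.22 |
      | Viscosity η50 | 526−1,029 | 119 | 10 | 0.82 | 48 | 0.82 | 54 | 2.26 |
      | DMC (g/g FW) | 0.14−0.17 | 0.01 | 10 | 0.57 | 0.005 | 0.43 | 0.006 | 1.38 |
      | TSC (g/kg FW) | 93.2−145.4 | 12.6 | 10 | 0.91 | 3.6 | 0.86 | 4.8 | 2.64 |
      | TA (meq/kg FW) | 28.0−94.8 | 16.2 | 8 | 0.92 | 0.4 | 0.85 | 0.6 | 2.55 |
      | pH | 3.39−4.47 | 0.23 | 10 | 0.84 | 0.09 | 0.85 | 0.10 | 2.47 |
      | Glucose (g/kg FW) | 13.2−28.3 | 3.7 | 10 | 0.85 | 1.3 | 0.82 | 1.5 | 2.25 |
      | Fructose (g/kg FW) | 40.2−80.3 | 9.1 | 10 | 0.68 | 5.0 | 0.60 | 5.6 | 1.58 |
      | Sucrose (g/kg FW) | 33.2−57.3 | 5.5 | 13 | 0.82 | 2.3 | 0.77 | 2.6 | 2.08 |
      | Malic acid (g/kg FW) | 3.0−8.8 | 1.3 | 8 | 0.93 | 0.3 | 0.86 | 0.5 | 2.67 |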
      R2c: determination coefficient of the calibration test; R2p: determination coefficient of the external prediction test; RMSEP: root mean square error of prediction test; RPD: the residual predictive deviation of prediction.

      Figure 2. 

      Boxplot of colors (a* and b*), rheological parameters (η50), total sugars (TSC), titratable acidity (TA) and fructose of different formulated puree groups.

    • Spectral discrimination and multivariate analyses were performed with MATLAB 7.5 (Mathworks Inc. Natick, MA, USA) software using the SAISIR package[20]. Principal component analysis (PCA) was carried out on the single-cultivar puree spectra to evaluate their variability and identify the contributing wavelengths. ANOVA was performed on the Vis-NIR spectra of all formulated purees to analyze their variations during puree formulation.

    • Three spectral variable selection methods, namely competitive adaptive reweighted sampling (CARS), successive projections algorithm (SPA) and uninformative variable elimination (UVE), were each applied to the formulated puree spectral matrix D (n × λ), made up of the number of samples (n) and the intensity at each wavelength (λ, with 2,722 spectral variables from 400 to 2,500 nm) (Fig. 3), to extract the most informative wavelengths for the prediction model of each puree quality trait. After comparison, the specific spectral variables related to all puree quality traits were extracted by CARS and assembled into the matrix D′ (n × λ′) of all formulated purees, consisting of the same number of samples (n) and the intensity at the selected variables (λ′, with 671 spectral variables from 400 to 2,500 nm).
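
      As a minimal sketch of how a reduced matrix D′ might be assembled from per-trait selections, the example below pools the column indices selected for each trait and slices D accordingly. Pooling by set union is an assumption (the text only says the variables were extracted and composed into D′), and the trait-to-index mapping and the helper name `build_selected_matrix` are hypothetical.

```python
import numpy as np


def build_selected_matrix(D, selected_by_trait):
    """Return D' and the sorted column indices kept for at least one trait."""
    keep = sorted(set().union(*selected_by_trait.values()))
    return D[:, keep], keep


rng = np.random.default_rng(0)
D = rng.random((5, 12))  # toy stand-in: 5 spectra x 12 spectral variables

# Hypothetical indices returned by a variable selection step for each trait
selected_by_trait = {"a*": [1, 4, 7], "TA": [4, 5], "TSC": [7, 10]}

D_prime, keep = build_selected_matrix(D, selected_by_trait)
print(keep, D_prime.shape)
```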

      Figure 3. 

      Processing of Vis-NIR spectral data by multivariate curve resolution-alternating least squares (MCR-ALS) and spectral reconstruction of formulated purees.

    • Multivariate curve resolution-alternating least squares (MCR-ALS) is an effective multivariate self-modeling curve resolution method, which can simultaneously elucidate the pure spectra of different species present in processed products and their concentration profiles[21]. As displayed in Fig. 3, the Vis-NIR spectra of formulated purees were reconstructed from their constituent single-cultivar purees based on our previously developed method[13]. The ST matrix (s × λ′) is the spectroscopic matrix describing the selected spectral variables (λ′) of all single-cultivar purees (s). The matrix D′ can be mathematically decomposed into the individual contributions related to the spectral information of matrix ST according to Eqn (1), and is iteratively transformed using an alternating least squares (ALS) procedure as in Eqn (2).

      $D' = C\,S^{T} + E$ (1)
      $C = D'\,(S^{T})^{+}$ (2)

      Matrix C (n × q) is the concentration matrix describing the contribution of every single-cultivar puree (q) to the reconstructed purees (n). E is the error matrix containing the data variation not explained by these contributions. The matrix (ST)+ is the pseudo-inverse of ST. A general constraint used in curve resolution methods is non-negativity of the concentration profiles.
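
      The least-squares step of Eqn (2) can be illustrated with NumPy's pseudo-inverse. This is a simplified sketch, not the authors' MATLAB implementation: it performs a single update with the pure spectra held fixed, and it imposes non-negativity by clipping, whereas dedicated MCR-ALS software typically iterates and uses constrained least squares. The matrices are toy values.

```python
import numpy as np


def concentration_profiles(D_prime, S_T):
    """One least-squares step of Eqn (2), C = D' (S^T)^+, with negatives clipped to zero."""
    C = D_prime @ np.linalg.pinv(S_T)
    return np.clip(C, 0.0, None)


# Toy example: 2 pure spectra described by 4 spectral variables
S_T = np.array([[1.0, 0.0, 2.0, 1.0],
                [0.0, 1.0, 1.0, 3.0]])
C_true = np.array([[0.8, 0.2],
                   [0.5, 0.5],
                   [0.0, 1.0]])
D_prime = C_true @ S_T            # noise-free mixtures following Eqn (1) with E = 0

C_est = concentration_profiles(D_prime, S_T)
E = D_prime - C_est @ S_T         # residual matrix of Eqn (1)
print(np.round(C_est, 3))
```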

      Once the concentration profiles (matrix C) for each single-cultivar spectrum of Golden Delicious (CGD), Granny Smith (CGS), Braeburn (CBR), and Royal Gala (CGA) were obtained, they were used to reconstruct a new spectroscopic matrix R (n × λ′) for monitoring all formulated purees. Each row Ri (i = 1,…n) was made up of a reconstructed spectrum. Each column Rj (j = 1,…λ′) gave the reconstructed spectral intensity at each selected Vis-NIR wavelength based on the corresponding pure puree spectra of Golden Delicious (λ′GD), Granny Smith (λ′GS), Braeburn (λ′BR) and Royal Gala (λ′GA), following Eqn (3).

      $R = C_{GD}\,\lambda'_{GD} + C_{GS}\,\lambda'_{GS} + C_{BR}\,\lambda'_{BR} + C_{GA}\,\lambda'_{GA}$ (3)
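
      Eqn (3) amounts to a sum of outer products between each concentration column and the corresponding pure spectrum. The sketch below assumes the columns of C are ordered GD, GS, BR, GA; the spectra and concentration values are hypothetical, and the function name `reconstruct_spectra` is introduced only for illustration.

```python
import numpy as np


def reconstruct_spectra(C, pure_spectra, order=("GD", "GS", "BR", "GA")):
    """Eqn (3): sum over cultivars of (concentration column) x (pure spectrum)."""
    n_samples = C.shape[0]
    n_vars = len(pure_spectra[order[0]])
    R = np.zeros((n_samples, n_vars))
    for k, name in enumerate(order):
        R += np.outer(C[:, k], pure_spectra[name])
    return R


# Hypothetical intensities of the four pure purees at 3 selected variables
pure_spectra = {
    "GD": np.array([1.0, 2.0, 3.0]),
    "GS": np.array([2.0, 0.0, 1.0]),
    "BR": np.array([0.0, 1.0, 1.0]),
    "GA": np.array([1.0, 1.0, 0.0]),
}
# Hypothetical concentration profiles for two formulated purees
C = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.25, 0.75]])

R = reconstruct_spectra(C, pure_spectra)
print(R)
```

      The same result is obtained as a single matrix product of C with the stacked pure spectra; the explicit loop is kept here only to mirror the four terms of Eqn (3).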
    • Spectral pre-processing and multivariate regression were performed with MATLAB 7.5 (Mathworks Inc. Natick, MA, USA) software with the 'PLS' toolbox and displayed in Fig. 4. For all spectral datasets, standard normal variate (SNV) and derivative transform calculation (Savitzky–Golay method, window size = 11, 21, 31, 41) of the first or second order were compared before multivariate regression. SNV pre-processing applied on the Vis-NIR spectral data showed the best performances to predict puree quality and was then systematically used.
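
      For reference, SNV centres and scales each spectrum by its own mean and standard deviation. The sketch below is a generic NumPy version, not the toolbox routine used in the study; the use of the sample standard deviation (ddof = 1) is a common convention and an assumption here.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


# Two toy spectra that differ only by a multiplicative factor
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 20.0, 30.0, 40.0]])
X_snv = snv(X)
print(np.round(X_snv, 3))
```

      After SNV the two toy spectra coincide, which illustrates why the transform is used to reduce multiplicative scatter effects.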

      Figure 4. 

      Overview of the applied methodology of Vis-NIR spectra pre-processing, spectral variable selection, spectral reconstruction and multivariate regression.

      Partial least squares (PLS) regression models were developed to predict the quality characteristics of formulated purees based on: i) their full spectral variables (FULL) and the selected spectral matrices from CARS, UVE, and SPA, respectively (Table 1); and ii) the reconstructed Vis-NIR spectral matrix (Table 2). All aforementioned spectral matrices correspond to the same reference dataset. The 324 spectra of the formulated purees (6 groups × 6 proportions × 3 weeks × 3 replicates) were used for model calibration. The calibrated models were then validated with the external prediction set of 162 puree spectra (6 groups × 3 proportions × 3 weeks × 3 replicates) (see Fig. 1). The optimal numbers of latent variables (LVs) for PLSR models were selected by the Venetian blinds cross-validation method. The prediction ability of the developed models was described by the determination coefficients of calibration (R2c) and external prediction (R2p), the root mean square errors of calibration (RMSEC) and prediction (RMSEP), and the residual predictive deviation (RPD), as described by Nicolai et al.[22]. The most contributing spectral variables (variable importance in projection, VIP) for each puree characteristic during model training were obtained and analyzed.
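
      The performance statistics listed above can be computed as in the following sketch. The exact formulas used by the authors are not given in this section, so two common conventions are assumed: the determination coefficient is computed as 1 − SSres/SStot, and RPD as the sample standard deviation of the reference values divided by the RMSE. The reference and predicted values are hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE and RPD for one set of reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    rmse = np.sqrt(np.mean(residuals ** 2))
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "RPD": np.std(y_true, ddof=1) / rmse,  # assumed convention: SD / RMSE
    }


# Hypothetical reference and predicted values (for example TSC in g/kg FW)
y_ref = [100.0, 110.0, 120.0, 130.0, 140.0]
y_hat = [102.0, 108.0, 121.0, 129.0, 141.0]
metrics = regression_metrics(y_ref, y_hat)
print(metrics)
```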

    • After processing, the four single-cultivar purees presented significant (p < 0.05) differences in physical, chemical and rheological properties, in line with our previous results[23] (Fig. 2). In particular, the BR purees had the highest redness (a* values = 1.76 ± 0.27) and lowest yellowness (b* values = 10.18 ± 0.45) among the four puree groups. The four apple cultivars differed significantly (p < 0.05) in puree viscosity (η50 and η100). GD, GS and BR purees presented a significantly higher (p < 0.0001) viscosity than GA purees, which was attributed to their larger particle sizes and promoted cell adhesion with more branched pectin[24,25]. The TSC and DMC of GD purees were significantly (p < 0.0001) higher than those of the other three groups, while the GS purees gave the highest acidity values (TA = 89.1 ± 1.3 meq/kg FW, and malic acid = 8.1 ± 0.5 g/kg FW). In addition, GD and GA purees presented the largest significant differences (p < 0.0001) in all individual sugar contents (fructose, glucose and sucrose).

      After puree formulation, the three groups containing BR purees (B, D, F) introduced a more intense variation of color parameters (a* and b* values) than the other groups (A, C, E) (Fig. 2a & b). In addition, the formulated purees prepared with GS purees (A, D, E) presented a relatively large variability of viscosity at a shear rate of 50 s−1 (η50), which is commonly used to describe the in-mouth texture perception of fluid foods[26], compared with the other groups (B, C, F) (Fig. 2c). The variability of TSC in the different puree groups was ranked as C > E > F > A > B > D, and the addition of GA purees (C, E, F) resulted in a relatively large range of TSC (Fig. 2). However, the variability of fructose, the dominant individual sugar in purees[27], differed from that of TSC and DMC. Three puree groups formulated with GS (C, E, F) gave a relatively larger variation of fructose than the other groups (A, B, D) (Fig. 2f). The formulation of GS and BR purees (E) resulted in the largest variations of TA values among the six groups, whereas TA changed little in the GD and BR formulated purees (B) (Fig. 2e).

      Consequently, the different formulation strategies based on these four different single cultivar purees can provide a large variability of appearance, and rheological and chemical properties.

    • PCA was performed on the Vis-NIR spectra of the four single-cultivar purees to highlight their most variable wavelength regions (Fig. 5). The first principal component (PC1) discriminated between 'GD' and 'BR' purees and accounted for 54.2% of the total variation (Fig. 5a). The wavelengths at 524–528 nm and 672 nm in the Vis range and around 800–1,250 nm in the NIR range were the main contributors to PC1 (Fig. 5b). The two bands at 528 and 672 nm are explained by the anthocyanin and chlorophyll contents of apples, respectively[28,29]. The NIR region at 800–1,250 nm is known for the absorption of apple carbohydrates and water[8,10] and is already used for apple cultivar discrimination[30]. The 'GS' and 'GA' purees could be separated by the second principal component (PC2), which explained 33.4% of the variation. The PC2 score was highly correlated with the visible wavelengths at 612 and 672 nm, indicating the large differences in greenness between 'GS' and 'GA' purees[31].
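
      A PCA of this kind can be sketched with a singular value decomposition of the mean-centred spectra: the scores give the discrimination map and the loadings indicate the contributing wavelengths. This is a generic NumPy illustration on random toy data, not the SAISIR routine used by the authors.

```python
import numpy as np


def pca_svd(X, n_components=2):
    """PCA by SVD of the column-centred matrix; returns scores, loadings, explained variance ratio."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]
    return scores, loadings, explained[:n_components]


rng = np.random.default_rng(1)
X = rng.normal(size=(36, 50))  # toy stand-in: 36 spectra x 50 spectral variables

scores, loadings, explained = pca_svd(X)
print(scores.shape, loadings.shape, np.round(explained, 3))
```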

      Figure 5. 

      Principal component analysis (PCA) of the SNV pre-treated Vis-NIR spectra of four single cultivar purees. (a) Discrimination map of four apple cultivars. (b) Beta-coefficients of first principal component (PC1). (c) Beta-coefficients of second principal component (PC2).

      ANOVA was performed on the Vis-NIR spectra of all formulated apple purees (Fig. 6a & b) to identify the wavelengths that varied with the formulation strategy. According to the F-values, the variability due to formulation was much higher in the visible spectral region (F-values from 7.1 to 147.6) (Fig. 6a) than in the NIR region (F-values from 2.4 to 15.3) (Fig. 6b). From the spectroscopic point of view, this result demonstrated that color varied more intensely than the chemical parameters among the different formulated apple purees. In particular, the most variable visible wavelengths were located at 528, 614, and 672 nm, similar to the PCA results for the four single-cultivar purees (Fig. 5). In the NIR region, the most informative wavelengths were located at 800−1,400 nm, in particular at 916, 1,070, and 1,270 nm. The NIR region at around 916 nm was related to the C-H and O-H bands of sucrose[32], and the typical peaks at 1,070 and 1,075 nm were attributed to the soluble sugars of apples[33] and processed juices[34].
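
      A wavelength-by-wavelength one-way ANOVA of this kind can be sketched as below, returning one F-value per spectral variable. The grouping factor (here a generic group label per spectrum) and the toy data are assumptions; the section does not detail the factor design used in the original analysis.

```python
import numpy as np


def anova_f_per_variable(X, groups):
    """One-way ANOVA F-value for every column of X, using the group label of each row."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, k = X.shape[0], labels.size
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for g in labels:
        Xg = X[groups == g]
        group_mean = Xg.mean(axis=0)
        ss_between += Xg.shape[0] * (group_mean - grand_mean) ** 2
        ss_within += ((Xg - group_mean) ** 2).sum(axis=0)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data: column 0 differs strongly between groups, column 1 does not
X = np.array([[1.0, 5.0], [2.0, 5.5], [3.0, 4.5],
              [7.0, 5.2], [8.0, 4.8], [9.0, 5.0]])
groups = ["A", "A", "A", "B", "B", "B"]
F = anova_f_per_variable(X, groups)
print(F)
```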

      Figure 6. 

      ANOVA results of the SNV pre-treated (a) visible (500–780 nm) spectra and (b) NIR (780–2,500 nm) spectra of all formulated apple purees.

      Accordingly, the specific Vis-NIR spectral wavelengths at 524−528 nm, 614, 672, 916, 1,070 and 1,270 nm were shown to be potentially linked to the different chemical and physical variations among different single-cultivar purees and their formulated products.

    • In this part, PLS regression coupled with all the spectral variables (FULL) or with the spectral variables selected by SPA, CARS and UVE was applied to compare their ability to predict the color, rheological and biochemical characteristics of formulated purees (Table 1). As expected, decreases in the determination coefficients from the calibration set (R2c) to the external validation set (R2p) were observed for the developed PLS models. Based on RPD values over 2.5[22], acceptable to good predictions were obtained for a*, viscosity (η50), TA, pH, glucose and malic acid.

      For all the color parameters (L*, a* and b*), PLS models coupled with the three spectral variable selection methods gave better predictions than those with the full spectral variables. Excellent predictions of a* values were obtained for all PLS models, with RPD decreasing in the order CARS (RPD = 5.56), UVE (RPD = 5.38) and SPA (RPD = 5.24). The spectral variables selected for puree a* values were mainly located in the visible spectral region at around 672 and 614 nm, which were described previously in this study.

      Apparent puree viscosity at a shear rate of 50 s−1 (η50) was well predicted by UVE-PLS models, with an R2p of 0.87 and an RPD value of 2.73. In addition, the UVE-PLS prediction of puree viscosity at a shear rate of 100 s−1 (η100) (R2p = 0.85 and RPD = 2.49) was much better than its prediction from the full spectral variables (R2p = 0.79 and RPD = 2.17). The specific spectral variables for puree viscosity were identified at 672−677 nm, 1,043−1,079 nm and 1,379−1,381 nm, mainly corresponding to chlorophyll[29], the C-H and O-H vibrations of soluble solids (mainly sugars and acids)[35] and water content[36]. These results show that spectral variable selection on Vis-NIR spectra clearly improves the estimation of puree viscosity across formulation strategies. In contrast to the poor predictions of puree viscosity using the full spectral variables of NIR[27] and Vis-NIR[13], the possible reasons for these improvements could be: i) the exclusion of non-relevant or unimportant spectral variables that disturbed the calibration modeling, and ii) the good collinear relationship (r2 = 0.72) between puree color a* values and viscosity in the different formulated purees, which has also been observed between raw and processed apples in our previous work[23].

      For biochemical characteristics, none of the developed PLS models gave satisfactory predictions of the DMC (R2P < 0.66, RPD < 1.66) of the formulated purees. These results were poorer than our previous Vis-NIR predictions of SSC (R2P = 0.80, RPD = 2.30) and DMC (R2P = 0.73, RPD = 1.9) of individually processed purees[23]. The main reason was probably the lower variation of DMC among these formulated purees (SD = 0.006 g/g) in comparison with the previous study, which took into account the variability of individual apples (SD = 0.008 g/g). TSC, representing all the individual sugars (sucrose, fructose and glucose), was satisfactorily predicted by CARS-PLS with 101 selected spectral variables (R2P = 0.92, RPD = 3.66). However, the Vis-NIR prediction was only acceptable for glucose (RPD > 2.42) but not for fructose (RPD < 1.61) and sucrose (RPD < 1.60).

      Concerning the different characteristics of puree acidity, including pH, TA and malic acid, both CARS-PLS and UVE-PLS models provided good predictions with R2p > 0.87 and RPD > 2.69. Better puree acidity predictions were obtained using these two methods than with the full spectral variables, which was also reported for apples[15]. In particular, the most contributing spectral variables were located at 664−668 nm and 1,065−1,165 nm, which correspond to the O-H vibration of carboxylic acids in fruits[35].

      Consequently, PLS models coupled with the three spectral variable selection methods offered better predictions of puree physical, rheological and biochemical characteristics than the full Vis-NIR spectral variables alone. Among the three spectral variable selection methods, CARS always extracted an intermediate number of spectral variables (more than SPA and far fewer than UVE), but reached PLS prediction results similar to those of the UVE method with its much larger sets of selected variables. In summary, PLS models coupled with selected Vis-NIR spectral variables are promising for estimating the a* color parameter, viscosity (η50), TSC, TA, pH, glucose and malic acid in formulated apple purees.
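
      To make the regression step concrete, the sketch below implements single-response PLS (PLS1) with the NIPALS algorithm, one common way of fitting such models. It is not the MATLAB toolbox implementation used in the study; the function names, the synthetic data and the number of components are illustrative assumptions.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """PLS1 by NIPALS; returns the regression vector and the means used for centring."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)           # weight vector
        t = Xk @ w                       # scores
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        c = (yk @ t) / tt                # y loading
        Xk = Xk - np.outer(t, p)         # deflate X
        yk = yk - c * t                  # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean


def pls1_predict(X, beta, x_mean, y_mean):
    return (np.asarray(X, dtype=float) - x_mean) @ beta + y_mean


# Synthetic data: 30 samples, 6 variables, linear response with small noise
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 6))
b_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0, -1.0])
y = X @ b_true + 3.0 + 0.01 * rng.normal(size=30)

beta, x_mean, y_mean = pls1_fit(X, y, n_components=6)
y_fit = pls1_predict(X, beta, x_mean, y_mean)
rmse = float(np.sqrt(np.mean((y - y_fit) ** 2)))
print(np.round(beta, 2), round(rmse, 4))
```

      With as many components as variables the fit coincides with ordinary least squares; in practice the number of latent variables would be chosen by cross-validation, as described in the methods.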

    • Based on the CARS-PLS prediction results, a total of 671 Vis-NIR spectral variables were extracted from the 2,722 spectral variables of the full wavelength range. Following the method described above, MCR-ALS was applied to the selected Vis-NIR spectral variables of all formulated purees and of the four single-cultivar purees to compute the concentration profiles of the relevant single-cultivar composition. In total, 486 spectra of formulated purees were reconstructed (324 spectra of the calibration set and 162 spectra of the validation set) from their corresponding 36 spectra of single-cultivar purees.

      PLS models were developed using these reconstructed Vis-NIR spectra and the reference data of all formulated purees. A good prediction of puree a* values was obtained, with R2p and RPD of 0.92 and 3.30, but not of L* (R2p = 0.62) or b* values (R2p = 0.42). However, the PLS models based on the reconstructed Vis-NIR spectra were not precise enough to evaluate puree rheological parameters (η50 and η100) (R2p ≤ 0.82, RPD ≤ 2.26), as their RPD values were below 2.5[22]. This result indicated the difficulty of predicting the viscosity of formulated purees from the Vis-NIR spectra of their single-cultivar purees. In addition, the unsatisfactory predictions of DMC of formulated purees (R2p = 0.46, RPD = 1.38) were in line with the PLS results based on the real spectra (Table 1). Although the PLS results for TSC based on the reconstructed Vis-NIR spectra (Table 2) were less accurate than the predictions from real spectra (Table 1), the PLS models had an acceptable ability to estimate it for all formulated purees (RMSEP = 4.8 g/kg FW, RPD = 2.64). Considering the global acidity parameters, although the predictions were slightly less accurate than those obtained directly on the real Vis-NIR spectra (RMSEP = 0.5 meq/kg, RPD = 3.37 for TA, and RMSEP = 0.09, RPD = 2.73 for pH), the PLS models had an acceptable ability to estimate them for all formulated purees (RMSEP = 0.6 meq/kg, RPD = 2.55 for TA, and RMSEP = 0.10, RPD = 2.47 for pH). In particular, similar VIP wavelengths were observed for the reconstructed spectra and for the spectra acquired directly on formulated purees, mainly at 664−668 nm and 1,065−1,165 nm, as described earlier. For individual sugars and acids, PLS models provided acceptable predictions of malic acid (R2p = 0.86, RPD = 2.67), but not of the individual sugars (R2p ≤ 0.82, RPD ≤ 2.25).

    • The concentration profiles of MCR-ALS opened a potential way to directly estimate the a* value, TA, pH, malic acid, and glucose of formulated purees based only on the CARS-selected spectral variables of single-cultivar purees. Overall, the PLS results for these quality parameters based on the reconstructed spectra of formulated purees were less accurate than those obtained directly on real puree spectra, because the non-negativity constraint on the concentration profiles could limit the spectral reconstruction[37]. Compared to our previous prediction models obtained on the mid-infrared spectra of purees[3], these results provide further evidence for our new chemometric strategy of using the concentration profiles of MCR-ALS to reconstruct the Vis-NIR spectra of formulated purees from the corresponding spectra of single-cultivar purees. In addition, what stands out in this work is the necessity of spectral variable selection on the puree spectra to obtain useful concentration profiles for the spectral reconstruction of formulated purees.

      This is the first time that the spectra of formulated products have been reconstructed using the concentration profiles of MCR-ALS from selected Vis-NIR spectral variables of their constituent raw materials. This strategy opens the possibility of guiding the production of constant and anticipated purees in the apple industry by simply scanning the single-cultivar apple purees. For instance, after acquiring Vis-NIR spectra of the four single-cultivar purees, our developed PLS models could: i) provide several strategies to formulate purees with defined tastes (e.g. 127.3 ± 5.7 g/kg FW of TSC and 7.8 ± 0.2 meq/kg FW of TA, which might be reached with formulation solutions such as 80% GS–20% GA, 33.3% GD–66.6% GS and 80% GS–20% BR purees), depending on what might be used in industry; or ii) simulate and optimize puree formulation for the development of anticipated products depending on the market, such as 20% GD−80% GS purees (low sweetness, high acidity), 33.3% GD–66.6% GA purees (high sweetness, low acidity) and GS 14%–GA 86% (low sweetness, low acidity).
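
      The idea of screening formulations against target specifications can be sketched as a simple grid search over two-cultivar blends. Note the swapped-in simplification: here the blend traits are obtained by linear mixing of trait values, whereas the study predicts them with PLS models applied to reconstructed spectra. The cultivar values, targets, tolerances and function name are all hypothetical.

```python
from itertools import combinations

# Hypothetical single-cultivar values, for illustration only (not measured data)
PUREES = {
    "GD": {"TSC": 140.0, "TA": 40.0},
    "GS": {"TSC": 100.0, "TA": 90.0},
    "BR": {"TSC": 120.0, "TA": 50.0},
    "GA": {"TSC": 125.0, "TA": 30.0},
}


def find_blends(target, tolerance, step=0.05):
    """List (cultivar_a, cultivar_b, share_of_a) blends whose traits fall within tolerance."""
    hits = []
    n_steps = round(1 / step)
    for a, b in combinations(PUREES, 2):
        for i in range(n_steps + 1):
            p = i / n_steps
            blend = {t: p * PUREES[a][t] + (1 - p) * PUREES[b][t] for t in target}
            if all(abs(blend[t] - target[t]) <= tolerance[t] for t in target):
                hits.append((a, b, p))
    return hits


hits = find_blends(target={"TSC": 120.0, "TA": 65.0},
                   tolerance={"TSC": 2.0, "TA": 3.0})
for a, b, p in hits:
    print(f"{p:.0%} {a} + {1 - p:.0%} {b}")
```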

    • This study first demonstrated that Vis-NIR spectroscopy coupled with advanced chemometric methods (CARS, UVE and SPA variable selection and PLS regression) estimates the physical (a* value), rheological (η50 and η100) and chemical characteristics of apple purees better than the use of full wavelengths.

      Further, an innovative spectral reconstruction strategy based on MCR-ALS and spectral variable selection was developed to provide practical and suitable strategies for the multicriteria optimization of puree formulation with anticipated and constant quality (a* value, TSC, TA, glucose, malic acid) from the constituent single-cultivar purees. As far as we know, this is the first report providing a potential formulation strategy to develop anticipated and constant final fruit products using the Vis-NIR spectral information of the initial purees based on a spectral reconstruction approach. Furthermore, this new chemometric strategy has the potential to provide production guidance for other food formulations, such as multifruit juices, blended oils, and even mixed flavoring agents, based on the Vis-NIR spectra acquired directly on their constituent raw materials.

      • The authors thank Patrice Reling, Barbara Gouble, Marielle Boge, Caroline Garcia and Gisèle Riqueau (INRAE, SQPOV unit) for their technical help. This work was supported by the 'Interfaces' project, an Agropolis Foundation Flagship project publicly funded through the ANR (French Research Agency) under the 'Investissements d'Avenir' program (ANR-10-LABX-01-001 Labex Agro, coordinated by Agropolis Fondation), the National Natural Science Foundation of China (NSFC, 32302204), and the Research Startup Foundation of Nanjing Agricultural University (No. 804120).

      • The authors confirm contribution to the paper as follows: investigation: Wang Z, Sun Y; resources: Renard C; Pan L; supervision: Renard C, Pan L, Lan W; conceptualization, funding acquisition: Lan W; data analysis: Wang Z; writing – original draft: Wang Z; writing – review & editing: Bureau S; Jaillais B; Chen X; Lv D, Lan W. All authors reviewed the results and approved the final version of the manuscript.

      • The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

      • The authors declare that they have no conflict of interest. Catherine M.G.C. Renard is the Editorial Board member of Food Innovation and Advances who was blinded from reviewing or making decisions on the manuscript. The article was subject to the journal's standard procedures, with peer-review handled independently of this Editorial Board member and the research groups.

      • Copyright: © 2024 by the author(s). Published by Maximum Academic Press on behalf of China Agricultural University, Zhejiang University and Shenyang Agricultural University. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.
    Cite this article
    Wang Z, Bureau S, Jaillais B, Renard CMGC, Chen X, et al. 2024. Infrared guided smart food formulation: an innovative spectral reconstruction strategy to develop anticipated and constant apple puree products. Food Innovation and Advances 3(1): 20−30 doi: 10.48130/fia-0024-0003