Back Propagation Neural Network model for analysis of hyperspectral images to predict apple firmness

Shuiping Li; Yueyue Chen; Xiaobo Zhang; Junbo Wang; Xuanxiang Gao; Yunhong Jiang; Zhaojun Ban; Cunkun Chen

doi:10.48130/fia-0025-0004

2025 Volume 4

ARTICLE Open Access

1. School of Biological and Chemical Engineering, Zhejiang University of Science and Technology, Zhejiang Provincial Key Laboratory of Chemical and Biological Processing Technology of Farm Products, Zhejiang Provincial Collaborative Innovation Center of Agricultural Biological Resources Biochemical Manufacturing, Hangzhou 310023, China
2. Computer of Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
3. Aksu Youneng Agricultural Technology Co., Ltd, Aksu 843001, China
4. Department of Applied Sciences, Faculty of Health and Life Science, Northumbria University, Newcastle Upon Tyne, NE1 8ST, United Kingdom
5. Institute of Agricultural Products Preservation and Processing Technology (National Engineering Technology Research Center for Preservation of Agriculture Product), Tianjin Academy of Agricultural Sciences; Key Laboratory of Postharvest Physiology and Storage of Agricultural Products, Ministry of Agriculture and Rural Affairs of P.R.China, Tianjin 300384, China

Corresponding authors: banzhaojun@zust.edu.cn; cck0318@126.com

Received: 02 August 2024
Revised: 04 December 2024
Accepted: 05 December 2024
Published online: 16 January 2025
Food Innovation and Advances 2025, 4(1): 1−9

Abstract

The potential of employing hyperspectral imaging (HSI) in the near-infrared (NIR) range (386.82−1,004.50 nm) for predicting the firmness of 'Fuji' apples cultivated in Aksu was evaluated. Seven preprocessing algorithms and two feature selection algorithms were compared, using the coefficient of determination (R²) and root mean square error (RMSE) of Partial Least Squares (PLS) models built on the different inputs. The results show that Multiplicative Scatter Correction (MSC) was the best preprocessing algorithm ($ {R}_{p}^{2} $ = 0.7925, RMSEP = 0.6537), and that Competitive Adaptive Reweighted Sampling (CARS) was the better feature selection algorithm ($ {R}_{p}^{2} $ = 0.8325, RMSEP = 0.6257). Based on these findings, PLS, Multiple Linear Regression (MLR), Heterogeneous Transfer Learning (HTL), and Back Propagation Neural Network (BPNN) models were constructed for comparison. The CARS-BPNN model gave the best prediction performance, with an $ {R}_{p}^{2} $ of 0.9350 and an RMSEP of 0.4654. These results indicate that a deep learning method combined with hyperspectral imaging can be used to detect the firmness of 'Fuji' apples non-destructively, and is potentially applicable to post-harvest fruit firmness monitoring. This research provides a reference for choosing preprocessing algorithms, feature selection algorithms, and firmness prediction models in the non-destructive testing of apples.
- Non-destructive detection
- Deep learning
- 'Fuji' apple
- Hyperspectral image
- Feature selection
Rights and permissions
Copyright: © 2025 by the author(s). Published by Maximum Academic Press on behalf of China Agricultural University, Zhejiang University and Shenyang Agricultural University. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.

References

[1]	Wang J, Liu T. 2022. Spatiotemporal evolution and suitability of apple production in China from climate change and land use transfer perspectives. Food and Energy Security 11:e386 doi: 10.1002/fes3.386
[2]	Oyenihi AB, Belay ZA, Mditshwa A, Caleb OJ. 2022. "An apple a day keeps the doctor away": The potentials of apple bioactive constituents for chronic disease prevention. Journal of Food Science 87:2291−309 doi: 10.1111/1750-3841.16155
[3]	Harker FR, Roigard CM, Colonna AE, Jin D, Ryan G, et al. 2024. The relative importance of postharvest eating quality and sustainability attributes for apple fruit: A case study using new sensory-consumer approaches. Postharvest Biology and Technology 217:113099 doi: 10.1016/j.postharvbio.2024.113099
[4]	Cheng XY, Zhao YY, Xue HL, Bi Y, Sun CC, et al. 2022. Model fit based on the weight loss and texture parameters of MAP cherry tomatoes during storage. Journal of Food Processing and Preservation 46:e16204 doi: 10.1111/jfpp.16204
[5]	Bejaei M, Stanich K, Cliff MA. 2021. Modelling and classification of apple textural attributes using sensory, instrumental and compositional analyses. Foods 10:384−97 doi: 10.3390/foods10020384
[6]	Yang X, Zhu L, Huang X, Zhang Q, Li S, et al. 2022. Determination of the soluble solids content in korla fragrant pears based on visible and near-infrared spectroscopy combined with model analysis and variable selection. Frontiers in Plant Science 13:938162 doi: 10.3389/fpls.2022.938162
[7]	Liu Y, Wu Q, Huang J, Zhang X, Zhu Y, et al. 2021. Comparison of apple firmness prediction models based on non-destructive acoustic signal. International Journal of Food Science and Technology 56:6443−50 doi: 10.1111/ijfs.15311
[8]	Martínez Gila DM, Navarro Soto JP, Satorres Martínez S, Gómez Ortega J, Gámez García J. 2022. The advantage of multispectral images in fruit quality control for extra virgin olive oil production. Food Analytical Methods 15:75−84 doi: 10.1007/s12161-021-02099-w
[9]	Wang C, Liu B, Liu L, Zhu Y, Hou J, et al. 2021. A review of deep learning used in the hyperspectral image analysis for agriculture. Artificial Intelligence Review 54:5205−53 doi: 10.1007/s10462-021-10018-y
[10]	Wesoły M, Przewodowski W, Ciosek-Skibińska P. 2023. Electronic noses and electronic tongues for the agricultural purposes. Trends in Analytical Chemistry 164:117082−103 doi: 10.1016/j.trac.2023.117082
[11]	Schlie TP, Dierend W, Köpcke D, Rath T. 2022. Detecting low-oxygen stress of stored apples using chlorophyll fluorescence imaging and histogram division. Postharvest Biology and Technology 189:111901−09 doi: 10.1016/j.postharvbio.2022.111901
[12]	Peng K, Ma W, Lu J, Tian Z, Yang Z. 2023. Application of machine vision technology in citrus production. Applied Sciences 13:9334 doi: 10.3390/app13169334
[13]	Zhang P, Wang H, Ji H, Li Y, Zhang X, et al. 2023. Hyperspectral imaging-based early damage degree representation of apple: a method of correlation coefficient. Postharvest Biology and Technology 199:112309−17 doi: 10.1016/j.postharvbio.2023.112309
[14]	Arendse E, Nieuwoudt H, Magwaza LS, Nturambirwe JFI, Fawole OA, et al. 2021. Recent advancements on vibrational spectroscopic techniques for the detection of authenticity and adulteration in horticultural products with a specific focus on oils, juices and powders. Food and Bioprocess Technology 14:1−22 doi: 10.1007/s11947-020-02505-x
[15]	Shlezinger N, Whang J, Eldar YC, Dimakis AG. 2023. Model based deep learning. Proceedings of the IEEE 111:465−99 doi: 10.1109/JPROC.2023.3247480
[16]	Zhou L, Zhang C, Liu F, Qiu Z, He Y. 2019. Application of deep learning in food: A review. Comprehensive Reviews in Food Science and Food Safety 18:1793−811 doi: 10.1111/1541-4337.12492
[17]	Li S, Song Q, Liu Y, Zeng T, Liu S, et al. 2023. Hyperspectral imaging-based detection of soluble solids content of loquat from a small sample. Postharvest Biology and Technology 204:112454 doi: 10.1016/j.postharvbio.2023.112454
[18]	Tian Y, Sun J, Zhou X, Yao K, Tang N. 2022. Detection of soluble solid content in apples based on hyperspectral technology combined with deep learning algorithm. Journal of Food Processing and Preservation 46:e16414 doi: 10.1111/jfpp.16414
[19]	Ma T, Xia Y, Inagaki T, Tsuchikawa S. 2021. Non-destructive and fast method of mapping the distribution of the soluble solids content and pH in kiwifruit using object rotation near-infrared hyperspectral imaging approach. Postharvest Biology and Technology 174:111440−47 doi: 10.1016/j.postharvbio.2020.111440
[20]	Park B, Shin T, Cho JS, Lim JH, Park KJ. 2023. Improving blueberry firmness classification with spectral and textural features of microstructures using hyperspectral microscope imaging and deep learning. Postharvest Biology and Technology 195:112154−64 doi: 10.1016/j.postharvbio.2022.112154
[21]	Ragavendra S, Ganguli S, Selvan PT, Nayak MM, Chaudhury S, et al. 2022. Deep learning based dual channel banana grading system using convolution neural network. Journal of Food Quality 2022:6050284 doi: 10.1155/2022/6050284
[22]	Xiang Y, Chen Q, Su Z, Zhang L, Chen Z, et al. 2022. Deep learning and hyperspectral images based tomato soluble solids content and firmness estimation. Frontiers in Plant Science 13:860656−66 doi: 10.3389/fpls.2022.860656
[23]	Xu M, Sun J, Yao K, Cai Q, Shen J, et al. 2022. Developing deep learning based regression approaches for prediction of firmness and pH in kyoho grape using Vis/NIR hyperspectral imaging. Infrared Physics and Technology 120:104003−12 doi: 10.1016/j.infrared.2021.104003
[24]	Liu P, Zhang P, Ni F, Hu Y. 2021. Feasibility of nondestructive detection of apple crispness based on spectroscopy and machine vision. Journal of Food Process Engineering 44:13802 doi: 10.1111/jfpe.13802
[25]	Shao Y, Ji S, Xuan G, Wang K, Xu L, et al. 2024. Soluble solids content monitoring and shelf life analysis of winter jujube at different maturity stages by Vis-NIR hyperspectral imaging. Postharvest Biology and Technology 210:112773−80 doi: 10.1016/j.postharvbio.2024.112773
[26]	Kim HJ, Baek JW, Chung K. 2021. Associative knowledge graph using fuzzy clustering and min-max normalization in video contents. IEEE Access 9:74802−16 doi: 10.1109/ACCESS.2021.3080180
[27]	Siino M, Tinnirello I, La Cascia M. 2024. Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on transformers and traditional classifiers. Information Systems 121:102342 doi: 10.1016/j.is.2023.102342
[28]	Liu HL, Yu CH, Wan LC, Qin SJ, Gao F, et al. 2022. Quantum mean centering for block-encoding-based quantum algorithm. Physica A: Statistical Mechanics and its Applications 607:128227 doi: 10.1016/j.physa.2022.128227
[29]	Sohn SI, Pandian S, Oh YJ, Zaukuu JLZ, Na CS, et al. 2022. Vis-NIR spectroscopy and machine learning methods for the discrimination of transgenic Brassica napus L. and their hybrids with B. juncea. Processes 10:240 doi: 10.3390/pr10020240
[30]	Butt UM, Letchmunan S, Ali M, Hassan FH, Baqir A, et al. 2021. Machine learning based diabetes classification and prediction for healthcare applications. Journal of Healthcare Engineering 2021:9930985 doi: 10.1155/2021/9930985
[31]	Figueiredo NS, Ferreira LHC, Dutra OO. 2019. An approach to savitzky-golay differentiators. Circuits Systems and Signal Processing 38:4369−79 doi: 10.1007/s00034-019-01045-w
[32]	Endut R, Sabri MSA, Aljunid SA, Ali N, Laili AR, et al. 2023. Prediction of potassium (K) content in soil analysis utilizing near-infrared (NIR) spectroscopy. Journal of Advanced Research in Applied Sciences and Engineering Technology 33:92−101 doi: 10.37934/araset.33.1.92101
[33]	Gao Q, Wang P, Niu T, He D, Wang M, et al. 2022. Soluble solid content and firmness index assessment and maturity discrimination of Malus micromalus Makino based on near-infrared hyperspectral imaging. Food Chemistry 370:131013 doi: 10.1016/j.foodchem.2021.131013
[34]	Xuan W, Wang Y. 2021. Competitive adaptive reweighted sampling method for fault detection. Journal of Physics: Conference Series 1820:012078 doi: 10.1088/1742-6596/1820/1/012078
[35]	Sun J, Yang W, Feng M, Liu Q, Kubar M. 2020. An efficient variable selection method based on random frog for the multivariate calibration of NIR spectra. RSC Advances 10:16245−53 doi: 10.1039/d0ra00922a
[36]	Meng Q, Shang J, Huang R, Zhang Y. 2021. Determination of soluble solids content and firmness in plum using hyperspectral imaging and chemometric algorithms. Journal of Food Process Engineering 44:e13597 doi: 10.1111/jfpe.13597
[37]	Chen Y, Jiang X, Liu Q, Wei Y, Wang F, et al. 2024. A hyperspectral imaging technique for rapid non-destructive detection of soluble solid content and firmness of wolfberry. Journal of Food Measurement and Characterization 18:7927−41 doi: 10.1007/s11694-024-02775-5
[38]	Tang N, Sun J, Yao K, Zhou X, Tian Y, et al. 2021. Identification of Lycium barbarum varieties based on hyperspectral imaging technique and competitive adaptive reweighted sampling-whale optimization algorithm-support vector machine. Journal of Food Process Engineering 44:e13603 doi: 10.1111/jfpe.13603
[39]	Xia Z, Yang J, Wang J, Wang S, Liu Y. 2020. Optimizing rice near-infrared models using fractional order savitzky–golay derivation (FOSGD) combined with competitive adaptive reweighted sampling (CARS). Applied Spectroscopy 74:417−26 doi: 10.1177/0003702819895799
[40]	Meng Q, Tan T, Feng S, Wen Q, Shang J. 2024. Prediction and visualization map for physicochemical indices of kiwifruits by hyperspectral imaging. Frontiers in Nutrition 11:1364274 doi: 10.3389/fnut.2024.1364274
[41]	Xing Z, Du C, Shen Y, Ma F, Zhou J. 2021. A method combining FTIR-ATR and raman spectroscopy to determine soil organic matter: Improvement of prediction accuracy using competitive adaptive reweighted sampling (CARS). Computers and Electronics in Agriculture 191:106549 doi: 10.1016/j.compag.2021.106549
[42]	Yang B, Li X, Wu L, Chen Y, Zhong F, et al. 2022. Citrus huanglongbing detection and semi-quantification of the carbohydrate concentration based on micro-FTIR spectroscopy. Analytical and Bioanalytical Chemistry 414:6881−97 doi: 10.1007/s00216-022-04254-6
[43]	Chen S, Lou F, Tuo Y, Tan S, Peng K, et al. 2023. Prediction of soil water content based on hyperspectral reflectance combined with competitive adaptive reweighted sampling and random frog feature extraction and the back-propagation artificial neural network method. Water 15:2726 doi: 10.3390/w15152726
[44]	Gorzelany J, Belcar J, Kuźniar P, Niedbała G, Pentoś K. 2022. Modelling of mechanical properties of fresh and stored fruit of large cranberry using multiple linear regression and machine learning. Agriculture 12:200 doi: 10.3390/agriculture12020200
[45]	Luo Y, Wen Y, Liu T, Tao D. 2019. Transferring knowledge fragments for learning distance metric from a heterogeneous domain. IEEE Transactions on Pattern Analysis and Machine Intelligence 41:1013−26 doi: 10.1109/TPAMI.2018.2824309
[46]	Elsherbiny O, Fan Y, Zhou L, Qiu Z. 2021. Fusion of feature selection methods and regression algorithms for predicting the canopy water content of rice based on hyperspectral data. Agriculture 11:51−71 doi: 10.3390/agriculture11010051
[47]	Ciccoritti R, Paliotta M, Amoriello T, Carbone K. 2019. FT-NIR spectroscopy and multivariate classification strategies for the postharvest quality of green-fleshed kiwifruit varieties. Scientia Horticulturae 257:108622−31 doi: 10.1016/j.scienta.2019.108622

About this article

Cite this article

Li S, Chen Y, Zhang X, Wang J, Gao X, et al. 2025. Back Propagation Neural Network model for analysis of hyperspectral images to predict apple firmness. Food Innovation and Advances 4(1): 1−9 doi: 10.48130/fia-0025-0004

Figures(7) / Tables(3)


Introduction

China holds a dominant position in the world's apple production, with the greatest apple-growing area and the largest export volume^[1]. Apples are rich in trace elements and organic components, which give them a delicious flavor and significant nutritional benefits. The quality of apples is determined by several factors, including shape, size, sugar, acids, external colour, soluble solids content (SSC), and texture^[2]. However, weight loss, disease, and chilling injury are the most common forms of postharvest loss. These losses may affect customer purchasing decisions and result in a decline in apple sales^[3]. Detecting all postharvest loss parameters is unquestionably intricate. Research has revealed a substantial link between fruit weight reduction and texture attributes, suggesting that texture trait evaluations may be used to determine postharvest loss^[4].

Firmness is an important metric for analyzing textural characteristics and determining the degree of postharvest loss. At present, firmness is determined largely by conventional physicochemical methods and sensory analysis. However, these approaches are known to be destructive, time-consuming, and arduous^[5]. The industry standard for determining the firmness of fruits is a penetrometer test that involves piercing the fruit flesh to a given depth with a Magness-Taylor instrument, which leads to financial losses^[6]. Hence, it is essential to develop a rapid, non-destructive test technique for monitoring apple firmness.

Several studies have carried out non-destructive evaluation of fruit quality using acoustic signals^[7], multispectral imaging^[8], hyperspectral imaging^[9], electronic nose^[10], fluorescence imaging^[11], machine vision^[12], and so on. Hyperspectral imaging stands out as the most comprehensive of these approaches since it allows both visual and spectral data from the sample to be used for firmness detection^[13]. Detection is based on measuring the spectrum from the fruit surface by reflection, interactance, or transmission. Because the measured spectrum is related to the content and structure of the fruit, chemometric techniques can be applied to relate the relevant wavelength variations in the spectrum to firmness^[14].

However, because hyperspectral images are extremely high-dimensional, standard processing techniques find it challenging to handle the massive volume of data. In this project, we intend to introduce deep learning techniques. Using huge amounts of data or high-dimensional data, deep learning builds deep neural networks that simulate human brain neurons and perform complicated function approximation^[15]. With numerous successful applications in the fields of food, image processing, speech recognition, and object detection, these techniques have demonstrated their value for big data analysis^[16]. Hyperspectral imaging has been studied for quality assessment of fruits such as loquat^[17], apple^[18], kiwifruit^[19], blueberry^[20], and banana^[21]. Xiang et al.^[22] carried out nondestructive testing of SSC and firmness in tomatoes by applying hyperspectral imaging and deep learning. They proposed a novel regression model based on a one-dimensional (1D) convolutional ResNet (Con1dResNet) and evaluated it against existing techniques. The evaluation results indicate that with a sufficiently large number of samples, this technique outperforms the state-of-the-art technique by 26.4% for SSC and 33.7% for firmness^[22]. Hyperspectral imaging and deep learning were utilized by Xu et al. to predict the firmness and pH of Kyoho grapes^[23]. Their research demonstrated that grape firmness and acidity may be rapidly and non-destructively assessed through the integration of stacked auto-encoders (SAE) with hyperspectral imaging. At the moment, the majority of published studies concentrate on optimizing a single model, so they offer little comparison with other models and have limited reference value.

The specific goals of this study were to: (1) process the spectral data and firmness indicators of the collected apple samples in order to determine the optimal spectral preprocessing method; (2) compare feature wavelength extraction methods in order to determine the best one; (3) build and optimize prediction models using multiple linear regression (MLR), heterogeneous transfer learning (HTL), and backpropagation neural network (BPNN); (4) compare the modeling and prediction results for apple firmness and select the method with the larger coefficient of determination (R²) and the smaller root mean squared error (RMSE) as the best prediction model.

Materials and methods

Apple material

All 220 tested apples (Malus domestica Borkh) were harvested from local farms (80°20' E, 41°28' N, Aksu Prefecture, Xinjiang, China) within a week after Frost's Descent (25−30 October, 2023). The average weight of the apple samples was 233.42 g and the average diameter was 83.40 mm. After the frost wax was removed, the apple samples were numbered and labeled, and then subjected to spectral analysis and the associated experiments. The raw data set was split at a 4:1 ratio using the Kennard-Stone (KS) algorithm, giving two distinct datasets: a calibration set of 176 apple samples and a prediction set of 44 apple samples.
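The Kennard-Stone split can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance between spectra (the distance metric is not stated in the text), and the function name `kennard_stone_split` and the random placeholder data are hypothetical.

```python
import numpy as np

def kennard_stone_split(X, n_calibration):
    """Return (calibration_idx, prediction_idx) chosen by Kennard-Stone."""
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    # Pairwise Euclidean distances between all samples (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Start with the two samples that are farthest apart.
    first, second = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(first), int(second)]
    remaining = [i for i in range(n_samples) if i not in selected]

    while len(selected) < n_calibration:
        # Distance from each remaining sample to its nearest selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the remaining sample that is farthest from the selected set.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected), np.array(remaining)

# Placeholder data only: 220 samples x 50 bands of random numbers.
rng = np.random.default_rng(0)
spectra = rng.random((220, 50))
cal_idx, pred_idx = kennard_stone_split(spectra, n_calibration=176)
print(len(cal_idx), len(pred_idx))  # 176 44
```

Because each new calibration sample is the one farthest from those already chosen, the calibration set tends to span the spectral space, while the remaining samples form the prediction set.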

Hyperspectral image
In the actual screening process, the position of the apple is not fixed and there are numerous potential orientations. To enhance the precision of detection, four surfaces of each apple lying flat were imaged, in order to capture as much surface information as feasible. The experiment used a push-and-scan hyperspectral imaging camera (Resonon Pika KC2 imaging spectrometer, Beijing Liga, Beijing, China), a linear mobile platform, an installation tower, a lighting device, a head, an NB single-phase current intelligent detector, a GST36U12-P1JW power supply sensor (Mingwei, Taiwan), a DMX-J-SA-17 stepper motor (Arcas, USA), and an acA1920-155um array camera (Basler, Germany). The hyperspectral spectrometer was operated with an exposure time of 20.0 ms, a frame frequency of 20.0 Hz, a spectral resolution of 1.3 nm, and a spectral range of 386.82−1,004.50 nm. The platform had a maximum operating rate of 355 pps (pulses per second), a maximum gain of 3, and a fixed distance of 20.0 cm between the sample and the camera^[24].

Firmness measurement
Firmness was measured using a GY-1 fruit firmness tester (Dongguan Sanliang Measuring Tools Co., Ltd, Dongguan, China), which is suitable for apples, pears, and other high-firmness fruits. The scale display range is 2−15 kg/cm², the probe head diameter is 3.5 mm, the index value is 0.1 kg/cm², and the indentation depth of the indenter is standardized to 10 mm. The external dimensions of the firmness tester are 140 mm × 60 mm × 30 mm and the net weight is 0.5 kg. Because the GY-1 firmness tester is operated manually, firmness was measured three times on each apple to ensure accuracy, and the average of the three readings was taken as the final firmness of the sample. The three measurements were taken at three different points on the equatorial line of the apple, spaced 120 degrees apart.

Spectral data extraction
In order to obtain a stable light environment, the hyperspectral imaging equipment should be turned on and warmed up for approximately half an hour before scanning each sample. By using Eqn (1), the raw hyperspectral image is calibrated with the standard white and dark reference images in order to remove the effects of uneven illumination and dark current noise^[25].

$ {R}_{c}=\left({R}_{0}-B\right)/ (W-B) $ (1)

In this context, R_c represents the calibrated hyperspectral image, R₀ designates the raw hyperspectral data, W is the standard white reference image obtained using a rectangular Teflon plate, and B is the standard black reference image, obtained by completely covering the lens with an opaque black cover.
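As a small worked illustration of Eqn (1), the sketch below applies the white/dark correction element-wise to arrays. The function name `calibrate_reflectance`, the `eps` guard against a zero denominator, and the toy pixel values are assumptions added for the example and are not part of the original method description.

```python
import numpy as np

def calibrate_reflectance(raw, white, dark, eps=1e-12):
    """Apply R_c = (R_0 - B) / (W - B) element-wise."""
    raw, white, dark = (np.asarray(a, dtype=float) for a in (raw, white, dark))
    denom = white - dark
    # Assumed safeguard: avoid dividing by zero where W equals B.
    denom = np.where(np.abs(denom) < eps, eps, denom)
    return (raw - dark) / denom

# Toy 2 x 2 example with made-up intensity counts.
raw = np.array([[120.0, 300.0], [210.0, 480.0]])
white = np.full((2, 2), 520.0)
dark = np.full((2, 2), 20.0)
calibrated = calibrate_reflectance(raw, white, dark)
print(calibrated)  # [[0.2  0.56] [0.38 0.92]]
```

The same function works unchanged on a full image cube, as long as the white and dark references have a shape that broadcasts against it.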

A hyperspectral image was acquired for every apple, then preprocessed and used to extract the spectra. To extract the spectral data from the acquired image of each apple sample, a 150 × 150 pixel region of interest (ROI) was identified in the vicinity of the equatorial plane. The Environment for Visualizing Images software (ENVI 5.1, Research Systems Inc, Boulder, CO, USA) was used to calculate the raw average reflectance from the ROI.
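The study computed the ROI average in ENVI; the sketch below only illustrates the same averaging step in code. It assumes the calibrated image is stored as an array of shape (rows, columns, bands), and the function name `roi_mean_spectrum`, the ROI center arguments, and the synthetic cube are hypothetical.

```python
import numpy as np

def roi_mean_spectrum(cube, center_row, center_col, size=150):
    """Average the spectra of all pixels in a size x size square ROI."""
    cube = np.asarray(cube, dtype=float)
    half = size // 2
    r0, c0 = center_row - half, center_col - half
    r1, c1 = r0 + size, c0 + size
    # Reject ROIs that would extend past the image border.
    if r0 < 0 or c0 < 0 or r1 > cube.shape[0] or c1 > cube.shape[1]:
        raise ValueError("ROI falls outside the image")
    roi = cube[r0:r1, c0:c1, :]
    # Mean over the two spatial axes leaves one value per band.
    return roi.mean(axis=(0, 1))

# Synthetic cube: every pixel has the spectrum [0, 1, 2, 3, 4].
cube = np.ones((200, 220, 5)) * np.arange(5)
mean_spectrum = roi_mean_spectrum(cube, center_row=100, center_col=110)
print(mean_spectrum)  # [0. 1. 2. 3. 4.]
```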

Spectral processing

Spectral preprocessing can boost accuracy, remove redundant and erroneous information, and lessen the impact on the measured spectra of light, noise, and background interference produced by the test instrument during spectrum acquisition. To reduce noise from various electronic sources and variations in sample conditions, the raw mean spectra were preprocessed using seven standard methods: Min Max Scaler (MMS)^[26], Standard Scaler (SS)^[27], Mean Centering (CT)^[28], Standard Normal Variate (SNV)^[29], Moving Average (MA)^[30], Savitzky-Golay smoothing filtering (SG)^[31], and Multiplicative Scatter Correction (MSC)^[32].
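Three of the transforms named above (SNV, MSC, and MA) can be sketched in a few lines. These are common textbook formulations, assumed here because the text does not give the exact variants or parameters used: MSC regresses each spectrum on the mean spectrum, SNV uses the sample standard deviation, and the moving-average window of 5 is an arbitrary choice. SG smoothing is omitted; `scipy.signal.savgol_filter` is one widely used option.

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum (row)."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative Scatter Correction against a reference spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: the mean spectrum serves as the reference.
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        # Fit s = intercept + slope * ref, then undo offset and scaling.
        slope, intercept = np.polyfit(ref, s, deg=1)
        corrected[i] = (s - intercept) / slope
    return corrected

def moving_average(spectra, window=5):
    """Smooth each spectrum with a flat window (edges are zero-padded)."""
    spectra = np.asarray(spectra, dtype=float)
    kernel = np.ones(window) / window
    return np.apply_along_axis(lambda s: np.convolve(s, kernel, mode="same"), 1, spectra)

# Toy spectra: one base curve with different gains and offsets.
base = np.linspace(0.2, 0.8, 30)
demo = np.vstack([base, 2.0 * base + 0.5, 0.5 * base - 0.2])
snv_out = snv(demo)
msc_out = msc(demo)
ma_out = moving_average(demo)
```

In this toy case the three spectra differ only by gain and offset, so MSC maps all of them onto the mean spectrum and SNV makes them identical, which is the scatter-removal behaviour both methods aim for.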

The Partial least squares regression (PLS) model was developed using the pre-treated spectrum and the raw spectrum, with the determination coefficient (R²) and the root mean square error (RMSE) as the evaluation indexes, and the optimal scheme was determined by selecting the one with greater R² and a smaller RMSE. The calculation formulas of R² and RMSE are shown in Eqns (2) & (3):

$ {R}_{c}^{2},{R}_{p}^{2}= 1- [{\textstyle\sum }_{i=1}^{n}{\left({y}_{i}-{\hat{y}}_{i}\right)}^{2}/{\textstyle\sum }_{i=1}^{n}{({y}_{i}-{y}_{m})}^{2}] $ (2)

$ RMSEC,\;RMSEP=\sqrt{\dfrac{1}{n}{\textstyle\sum }_{i=1}^{n}{({y}_{i}-{\hat{y}}_{i})}^{2}} $ (3)

where $ {\hat{y}}_{i} $ is the predicted value of the i^th sample, y_i is the measured value of the i^th sample, n is the total number of samples, and y_m is the mean of the measured values.
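The two evaluation indexes can be computed as in the sketch below, where RMSE is taken over the residuals between measured and predicted values. The function names and the four made-up firmness values are illustrative assumptions.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square of the prediction residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up values: residuals are all +/-0.5.
measured = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 6.5, 8.5, 8.5]
print(r_squared(measured, predicted))  # 0.8
print(rmse(measured, predicted))       # 0.5
```

Applying the same functions to the calibration set gives R_c² and RMSEC, and to the prediction set gives R_p² and RMSEP.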

Feature selection algorithm

The selection of effective wavelengths is a crucial aspect of spectral data analysis. Its function is to eliminate superfluous information present in the spectrum, retain data pertinent to the current task, and thereby reduce the data dimension^[33]. The Competitive Adaptive Reweighted Sampling (CARS) algorithm identifies an optimal combination of key variables, thereby enhancing the detection of the corresponding indicators^[34]. The Random Frog (RF) algorithm is employed to find the most likely significant variables, after which a local search is used to widen the interval of significant variables^[35]. Both CARS and RF have been applied successfully to firmness prediction with hyperspectral technology, and each has its own advantages. On this basis, these two feature selection methods were chosen for the experiments.

Compared with other feature wavelength selection algorithms, including the Successive Projection Algorithm (SPA)^[36] and Principal Component Analysis (PCA)^[37], CARS not only improves the accuracy of firmness prediction but also increases the efficiency of the prediction model. It chooses the wavelengths with the greatest absolute values of the PLS model's regression coefficients, emulating the Darwinian principle of 'survival of the fittest'^[38]. CARS can screen the most informative bands out of complex spectra, and can be combined with other processing methods to enhance the accuracy and stability of the model^[39]. It has also been used to establish a correlation between hyperspectral images and kiwifruit firmness, with successful results in predicting and visualizing this variable^[40]. Therefore, the use of CARS is both reasonable and appropriate for this experiment.

In each sampling run, the algorithm builds a PLS model on a randomly drawn 80% of the samples. The absolute value of a wavelength's regression coefficient is taken as a measure of how much that wavelength explains the target variable. Every sampling run goes through four sequential steps: (1) Monte Carlo model sampling; (2) enforced wavelength selection using an exponentially decreasing function; (3) competitive wavelength selection by adaptive reweighted sampling (ARS); (4) cross-validation to assess the subset^[41]. In this study, the number of Monte Carlo sampling runs was 500, the maximum number of principal components was 10, the sampling rate was 0.8, and the optimal number of iterations was 195. Equations (4)−(6) describe the core of the CARS theory.

$ {r}_{i}=\alpha {e}^{-ki} $ (4)

$ \alpha ={(P/ 2)}^{1 / (N- 1)} $ (5)

$ k=\mathrm{ln}\left(P/ 2\right)/ (N- 1) $ (6)

where r_i is the ratio of wavelength points retained in the i^th run, i is the index of the Monte Carlo sampling run, α and k are two constants, P is the number of raw wavelengths, and N is the preset number of Monte Carlo sampling runs.
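To make Eqns (4)−(6) concrete, the sketch below evaluates the exponentially decreasing function. With these constants the ratio is 1 in the first run (all wavelengths retained) and 2/P in the last run (two wavelengths retained). N = 500 is taken from the text; the wavelength count P = 400 and the function name are hypothetical, since the number of bands is not stated in this section.

```python
import math

def cars_retention_ratios(n_wavelengths, n_runs):
    """Ratio of wavelengths kept in runs i = 1..N: r_i = alpha * exp(-k * i)."""
    alpha = (n_wavelengths / 2) ** (1 / (n_runs - 1))
    k = math.log(n_wavelengths / 2) / (n_runs - 1)
    return [alpha * math.exp(-k * i) for i in range(1, n_runs + 1)]

# P = 400 is a placeholder value; N = 500 runs as reported in the text.
P, N = 400, 500
ratios = cars_retention_ratios(n_wavelengths=P, n_runs=N)
kept = [round(r * P) for r in ratios]
print(kept[0], kept[-1])  # 400 2
```

The retained count drops quickly in early runs and slowly in later ones, which gives a fast coarse elimination followed by a finer selection stage.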

The RF algorithm is based on post-heuristic particle swarm optimization and runs iteratively, integrating the benefits of the reversible jump Markov chain Monte Carlo algorithm. The selection probability of each variable is calculated from the Markov chain based on its stationary distribution. The optimal bands selected by RF provide a technical foundation for subsequent semi-quantitative modeling with spectroscopy and chemometrics^[42]. Because CARS may introduce errors (for example, by ignoring interactions between features), the RF algorithm was used alongside the CARS algorithm to select spectral features^[43]. The process has four steps: (1) Set the initial number of frog population variables Q, which form a subset V₀; (2) Calculate the positional fitness of each variable; (3) Transform the frog position according to its fitness and relevance to the problem; (4) After N iterations, calculate the probability of each variable being selected according to Eqn (7)^[35]. The RF algorithm was configured as follows: the number of iterations N was 3000, the number of frog population variables Q was 6, and the resampling factor for variable adjustment was 10.

$ Probability_j=N_j/N,\quad j=1,2,3,\dots,p $ (7)

where Probability_j is the selection probability of the j^th variable, N_j is the number of iterations in which the j^th variable was selected, N is the total number of iterations, and p is the total number of wavelengths.
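The selection probability in Eqn (7) is a frequency count over the variable subsets visited during the iterations: the number of iterations in which a wavelength appeared, divided by the total number of iterations. A minimal sketch follows, assuming the subsets are available as lists of wavelength indices; the function name and the four toy subsets are hypothetical and do not reproduce the RF search itself.

```python
import numpy as np

def selection_probability(subsets, n_wavelengths):
    """Fraction of iterations in which each wavelength index was selected."""
    counts = np.zeros(n_wavelengths)
    for subset in subsets:
        # Count each wavelength at most once per iteration.
        for j in set(subset):
            counts[j] += 1
    return counts / len(subsets)

# Toy history of N = 4 iterations over p = 5 wavelengths.
history = [[0, 2], [2, 3], [2], [0, 2, 4]]
prob = selection_probability(history, n_wavelengths=5)
print(prob)  # [0.5  0.   1.   0.25 0.25]
```

Wavelengths are then ranked by this probability, and those with the highest values are kept as feature wavelengths.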

Firmness prediction models

To create models relating the apple's firmness to the feature wavelengths, four common techniques were used: Back Propagation Neural Network (BPNN), Heterogeneous Transfer Learning (HTL), Multiple Linear Regression (MLR), and PLS. Supervised learning in hyperspectral imaging analysis frequently uses methods like MLR, because it can identify the linear relationship between a single dependent variable and multiple independent variables.

The objective of MLR is to minimize the discrepancies between predicted and actual results by assigning values to the coefficients of the independent variables^[44]. The majority of HTL techniques used today deal with heterogeneous domains by either learning an asymmetric transformation between them or discovering a common subspace for them. The objective is to use knowledge (or information) from related tasks to enhance performance on the target learning task^[45].
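For MLR, minimizing the discrepancy between predicted and actual values is an ordinary least-squares problem. The sketch below fits the coefficients with `numpy.linalg.lstsq` on synthetic, noise-free data; the function names, the two-feature data, and the coefficient values are illustrative assumptions, not the study's model.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit; returns [intercept, coef_1, ..., coef_d]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict with the fitted intercept and coefficients."""
    return coef[0] + np.asarray(X, dtype=float) @ coef[1:]

# Synthetic data generated from y = 2 + 3*x1 - 1*x2 (no noise).
rng = np.random.default_rng(1)
X_demo = rng.random((40, 2))
y_demo = 2.0 + 3.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1]
coef = fit_mlr(X_demo, y_demo)
print(np.round(coef, 6))  # [ 2.  3. -1.]
```

With real spectra, the columns of `X_demo` would be the reflectance values at the selected feature wavelengths, which is why feature selection matters: MLR becomes unstable when there are more, or highly collinear, wavelengths than the samples can support.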

By using an end-to-end feature extraction method in place of manual feature extraction, BPNN can quickly extract hidden feature information from a given data set^[46]. A neural network is made up of different types of layers: (1) the input layer, which holds the input data of the network; (2) the hidden layer, which acts as an intermediary between the input layer and the output layer; and (3) the output layer, which generates the output based on the input.

After the regression models were built, their accuracy was evaluated using the determination coefficients of calibration and prediction ($ {R}_{c}^{2} $ and $ {R}_{p}^{2} $ ) and the root mean square errors of calibration and prediction (RMSEC and RMSEP). The formulas for these parameters are given in the spectral processing section.

Using the characteristic wavelength data selected by CARS and RF, the prediction model of apple firmness based on the MLR algorithm was established. For transfer learning, Adaptive Moment Estimation (ADAM) was selected as the network model optimizer; Leaky Rectified Linear Unit (Leaky ReLU) was taken as the activation function; Mean Squared Error (MSE) was chosen as the loss function; Mean Absolute Error (MAE) was chosen as the training evaluation criterion; the batch size was 64; the verification ratio of each round was 0.2; the initial learning rate was 0.0005. The initial number of training epochs was 20. The value of the output layer represents the predicted firmness, and the number of layers is 1. The number of neurons in the hidden layer is 26, and the training stride is 20000.
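A back-propagation network with the reported settings can be sketched in plain NumPy as below. This is an illustrative reconstruction under stated assumptions, not the authors' code: it assumes a single hidden layer of 26 neurons feeding one output node, a Leaky ReLU slope of 0.01, He-style random initialization, and the usual Adam constants (0.9, 0.999, 1e-8), none of which are given in the text. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def leaky_relu(z, slope=0.01):
    return np.where(z > 0, z, slope * z)

def leaky_relu_grad(z, slope=0.01):
    return np.where(z > 0, 1.0, slope)

def predict_bpnn(params, X):
    hidden = leaky_relu(X @ params["W1"] + params["b1"])
    return (hidden @ params["W2"] + params["b2"]).ravel()

def train_bpnn(X, y, hidden=26, lr=0.0005, batch_size=64, epochs=20, seed=0):
    """Train a one-hidden-layer regressor with MSE loss and Adam updates."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    params = {
        "W1": rng.normal(0.0, np.sqrt(2.0 / d), (d, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, np.sqrt(2.0 / hidden), (hidden, 1)),
        "b2": np.zeros(1),
    }
    m1 = {key: np.zeros_like(val) for key, val in params.items()}
    m2 = {key: np.zeros_like(val) for key, val in params.items()}
    beta1, beta2, eps, step = 0.9, 0.999, 1e-8, 0  # assumed Adam constants
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], y[idx].reshape(-1, 1)
            # Forward pass.
            z1 = xb @ params["W1"] + params["b1"]
            h = leaky_relu(z1)
            pred = h @ params["W2"] + params["b2"]
            # Backward pass: gradients of the batch MSE.
            d_pred = 2.0 * (pred - yb) / len(idx)
            d_z1 = (d_pred @ params["W2"].T) * leaky_relu_grad(z1)
            grads = {"W2": h.T @ d_pred, "b2": d_pred.sum(axis=0),
                     "W1": xb.T @ d_z1, "b1": d_z1.sum(axis=0)}
            # Adam update with bias correction.
            step += 1
            for key in params:
                m1[key] = beta1 * m1[key] + (1 - beta1) * grads[key]
                m2[key] = beta2 * m2[key] + (1 - beta2) * grads[key] ** 2
                m_hat = m1[key] / (1 - beta1 ** step)
                v_hat = m2[key] / (1 - beta2 ** step)
                params[key] -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return params

# Synthetic stand-in for 176 calibration samples with 10 feature wavelengths.
rng = np.random.default_rng(42)
X_demo = rng.random((176, 10))
y_demo = 7.0 + X_demo @ rng.normal(0.0, 0.5, 10)
params = train_bpnn(X_demo, y_demo)
firmness_pred = predict_bpnn(params, X_demo)
```

The default arguments mirror the reported batch size, learning rate, hidden size, and initial epoch count. With so few updates at this learning rate the network barely moves from its initial state, so in practice `epochs` would need to be much larger.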

Method	$ {R}_{c}^{2} $	RMSEC	$ {R}_{p}^{2} $	RMSEP
MMS	0.7247	0.8081	0.7130	0.8330
SS	0.7795	0.6855	0.7690	0.7336
CT	0.7758	0.6935	0.7780	0.7083
SNV	0.7750	0.6993	0.7861	0.6770
MA	0.7876	0.6481	0.6778	0.8902
SG	0.7913	0.6434	0.6680	0.9013
MSC	0.7862	0.6525	0.7925	0.6537
RW	0.7942	0.6357	0.7891	0.6613
