An interpretable machine learning model for predicting forest fire danger based on Bayesian optimization

Zhiyang Liu; Kuibin Zhou; Qichao Yao; Pedro Reszka

doi:10.48130/emst-0024-0026

2024 Volume 4

ARTICLE Open Access

1. College of Safety Science and Engineering, Nanjing Tech University, Nanjing 211816, Jiangsu, China
2. National Institute of Natural Hazards, Haidian, Beijing 100085, China
3. Faculty of Engineering and Sciences, Universidad Adolfo Ibáñez, Santiago 7941169, Chile

Corresponding author: kbzhou@njtech.edu.cn

Received: 30 August 2024
Revised: 02 November 2024
Accepted: 07 November 2024
Published online: 12 December 2024
Emergency Management Science and Technology 4, Article number: e025 (2024)

Abstract

As global warming increases forest fire frequency, early prevention and effective management become crucial. This requires models that are both accurate and easily understood. However, traditional machine learning models, which typically use preset parameters, are often inaccurate and hard to interpret. Therefore, this study introduces an enhanced approach using data from 2000 to 2019 in the Sichuan and Yunnan provinces of China, incorporating 18 driving factors. Bayesian optimization with two probabilistic proxy models, the Gaussian Process (GP) and the Tree-structured Parzen Estimator (TPE), was used to optimize the hyperparameters of LightGBM, Random Forest (RF), and Support Vector Machine (SVM). Forest fire danger prediction models were then constructed to draw forest fire danger maps, and performance was compared across the models. Predictive performance was evaluated using accuracy, recall, precision, balanced F score (F1), and area under the curve (AUC). The evaluation demonstrated that TPE-LightGBM achieved high accuracy (AUC = 0.962). The forest fire danger map categorizes the study area into five danger levels. TPE-LightGBM classifies 62.58% of the study area as low-danger level and 5.33% as high-danger Level V. The Shapley additive explanation (SHAP) interpretation of TPE-LightGBM highlights the daily average relative humidity, sunshine hours, elevation, daily average air pressure, and daily maximum ground surface temperature as the primary influential factors, followed by human activity, indexed by the gross domestic product (GDP), and the distance to the nearest railway.

Keywords: Forest fires, Bayesian optimization, LightGBM, Fire danger, SHAP

Supplementary information

Supplementary Table S1 Forest fire factors and their classification in the study area.
Supplementary Table S2 Optimal hyperparameter values for machine learning model achieved through GP and TPE.
Supplementary Fig. S1 Generalized plot of each factor: (a) elevation, (b) slope, (c) aspect, (d) topographic wetness index, (e) distance to nearest railway, (f) distance to nearest road, (g) distance to nearest settlement, (h) density of population, (i) GDP, (j) vegetation type.
Supplementary Fig. S2 Framework for model hyperparameter optimization.

Rights and permissions
Copyright: © 2024 by the author(s). Published by Maximum Academic Press on behalf of Nanjing Tech University. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.

References

[1]	Cilli R, Elia M, D'Este M, Giannico V, Amoroso N, et al. 2022. Explainable artificial intelligence (XAI) detects wildfire occurrence in the Mediterranean countries of Southern Europe. Scientific Reports 12:16349 doi: 10.1038/s41598-022-20347-9
[2]	National Bureau of Statistics of China. 2023. Statistical Communique of the People's Republic of China on National Economic and Social Development 2018−2022. https://www.stats.gov.cn/
[3]	Kondylatos S, Prapas I, Ronco M, Papoutsis I, Camps-Valls G, et al. 2022. Wildfire danger prediction and understanding with deep learning. Geophysical Research Letters 49(17):e2022GL099368 doi: 10.1029/2022gl099368
[4]	Ma W, Feng Z, Cheng Z, Chen S, Wang F. 2020. Identifying forest fire driving factors and related impacts in China using random forest algorithm. Forests 11(5):507 doi: 10.3390/f11050507
[5]	Liu N, Lei J, Gao W, Chen H, Xie X. 2021. Combustion dynamics of large-scale wildfires. Proceedings of the Combustion Institute 38(1):157−98 doi: 10.1016/j.proci.2020.11.006
[6]	Li S, Wu Z, Liang Y, He H. 2017. The temporal and spatial clustering characteristics of forest fires in the Great Xing'an Mountains. Chinese Journal of Ecology 36(1):198−204 doi: 10.13292/j.1000-4890.201701.034
[7]	Rodrigues M, de la Riva J. 2014. An insight into machine-learning algorithms to model human-caused wildfire occurrence. Environmental Modelling & Software 57:192−201 doi: 10.1016/j.envsoft.2014.03.003
[8]	Mandallaz D, Ye R. 1997. Prediction of forest fires with Poisson models. Canadian Journal of Forest Research 27(10):1685−94 doi: 10.1139/x97-103
[9]	Tien Bui D, Bui QT, Nguyen QP, Pradhan B, Nampak H, et al. 2017. A hybrid artificial intelligence approach using GIS-based neural-fuzzy inference system and particle swarm optimization for forest fire susceptibility modeling at a tropical area. Agricultural and Forest Meteorology 233:32−44 doi: 10.1016/j.agrformet.2016.11.002
[10]	Van Beusekom AE, Gould WA, Monmany AC, Khalyani AH, Quiñones M, et al. 2018. Fire weather and likelihood: characterizing climate space for fire occurrence and extent in Puerto Rico. Climatic Change 146(1):117−31 doi: 10.1007/s10584-017-2045-6
[11]	Yue W, Ren C, Liang Y, Guo Y, Zhang S. 2024. Study of wildfire hazard susceptibility in Nanning based on interpretable machine learning model. Science Technology and Engineering 24(2):858−70 doi: 10.12404/j.issn.1671-1815.2301676
[12]	Wang Z, Wang K, Li Y, Li G. 2023. Research on forest fire prediction in Yunnan province based on LightGBM and SHAP. Fire Science and Technology 42(11):1567−71 doi: 10.3969/j.issn.1009-0029.2023.11.021
[13]	Huang CL, Dun JF. 2008. A distributed PSO–SVM hybrid system with feature selection and parameter optimization. Applied Soft Computing 8(4):1381−91 doi: 10.1016/j.asoc.2007.10.007
[14]	Putatunda S, Rama K. 2018. A Comparative Analysis of Hyperopt as Against Other Approaches for Hyper-Parameter Optimization of XGBoost. SPML '18: Proceedings of the 2018 International Conference on Signal Processing and Machine Learning, Shanghai, China, 2018. New York, NY, USA: Association for Computing Machinery. pp. 6−10. doi: 10.1145/3297067.3297080
[15]	Abdollahi A, Pradhan B. 2023. Explainable artificial intelligence (XAI) for interpreting the contributing factors feed into the wildfire susceptibility prediction model. Science of The Total Environment 879:163004 doi: 10.1016/j.scitotenv.2023.163004
[16]	Iban MC, Sekertekin A. 2022. Machine learning based wildfire susceptibility mapping using remotely sensed fire data and GIS: A case study of Adana and Mersin Provinces, Turkey. Ecological Informatics 69:101647 doi: 10.1016/j.ecoinf.2022.101647
[17]	Al-Bashiti MK, Naser MZ. 2022. Machine learning for wildfire classification: Exploring blackbox, eXplainable, symbolic, and SMOTE methods. Natural Hazards Research 2(3):154−65 doi: 10.1016/j.nhres.2022.08.001
[18]	Xie L, Zhang R, Zhan J, Li S, Shama A, et al. 2022. Wildfire risk assessment in Liangshan prefecture, China based on an integration machine learning algorithm. Remote Sensing 14(18):4592 doi: 10.3390/rs14184592
[19]	Li Y, Li G, Wang K, Wang Z, Chen Y. 2023. Forest fire risk prediction based on stacking ensemble learning for Yunnan Province of China. Fire 7(1):13 doi: 10.3390/fire7010013
[20]	Ma W, Feng Z, Cheng Z, Wang F. 2020. Study on driving factors and distribution pattern of forest fires in Shanxi province. Journal of Central South University of Forestry & Technology 40(9):57−69 doi: 10.14067/j.cnki.1673-923x.2020.09.007
[21]	Guo F, Wang G, Su Z, Liang H, Wang W, et al. 2016. What drives forest fire in Fujian, China? Evidence from logistic regression and Random Forests. International Journal of Wildland Fire 25(5):505−19 doi: 10.1071/WF15121
[22]	Catry FX, Rego FC, Bação FL, Moreira F. 2009. Modeling and mapping wildfire ignition risk in Portugal. International Journal of Wildland Fire 18(8):921−31 doi: 10.1071/wf07123
[23]	Chang Y, Zhu Z, Bu R, Chen H, Feng Y, et al. 2013. Predicting fire occurrence patterns with logistic regression in Heilongjiang Province, China. Landscape Ecology 28(10):1989−2004 doi: 10.1007/s10980-013-9935-4
[24]	Oliveira S, Oehler F, San-Miguel-Ayanz J, Camia A, Pereira JMC. 2012. Modeling spatial patterns of fire occurrence in Mediterranean Europe using Multiple Regression and Random Forest. Forest Ecology and Management 275:117−29 doi: 10.1016/j.foreco.2012.03.003
[25]	Syphard AD, Radeloff VC, Keuler NS, Taylor RS, Hawbaker TJ, et al. 2008. Predicting spatial patterns of fire on a southern California landscape. International Journal of Wildland Fire 17(5):602−13 doi: 10.1071/WF07087
[26]	Maingi JK, Henry MC. 2007. Factors influencing wildfire occurrence and distribution in eastern Kentucky, USA. International Journal of Wildland Fire 16:23−33 doi: 10.1071/wf06007
[27]	Eskandari S, Pourghasemi HR, Tiefenbacher JP. 2020. Relations of land cover, topography, and climate to fire occurrence in natural regions of Iran: Applying new data mining techniques for modeling and mapping fire danger. Forest Ecology and Management 473:118338 doi: 10.1016/j.foreco.2020.118338
[28]	Moore ID, Grayson RB, Ladson AR. 1991. Digital terrain modelling: A review of hydrological, geomorphological, and biological applications. Hydrological Processes 5(1):3−30 doi: 10.1002/hyp.3360050103
[29]	Zhang C, Yang Q, Li R. 2005. Advancement in topographic wetness index and its application. Progress in Geography 24:116−23 doi: 10.3969/j.issn.1007-6301.2005.06.014
[30]	Cardille JA, Ventura SJ, Turner MG. 2001. Environmental and social factors influencing wildfires in the upper Midwest, United States. Ecological Applications 11(1):111−27 doi: 10.1890/1051-0761(2001)011[0111:EASFIW]2.0.CO;2
[31]	Garcia CV, Woodard PM, Titus SJ, Adamowicz WL, Lee BS. 1995. A logit model for predicting the daily occurrence of human caused forest-fires. International Journal of Wildland Fire 5(2):101−11 doi: 10.1071/WF9950101
[32]	Oliveira S, Pereira JMC, San-Miguel-Ayanz J, Lourenço L. 2014. Exploring the spatial patterns of fire density in Southern Europe using Geographically Weighted Regression. Applied Geography 51:143−57 doi: 10.1016/j.apgeog.2014.04.002
[33]	Guo F, Su Z, Wang G, Sun L, Lin F, et al. 2016. Wildfire ignition in the forests of southeast China: Identifying drivers and spatial distribution to predict wildfire likelihood. Applied Geography 66:12−21 doi: 10.1016/j.apgeog.2015.11.014
[34]	Pham BT, Jaafari A, Avand M, Al-Ansari N, Dinh Du T, et al. 2020. Performance evaluation of machine learning methods for forest fire modeling and prediction. Symmetry 12(6):1022 doi: 10.3390/sym12061022
[35]	Shi C, Zhang F. 2023. A forest fire susceptibility modeling approach based on integration machine learning algorithm. Forests 14(7):1506 doi: 10.3390/f14071506
[36]	Gigović L, Pourghasemi HR, Drobnjak S, Bai S. 2019. Testing a new ensemble model based on SVM and random forest in forest fire susceptibility assessment and its mapping in Serbia's Tara National Park. Forests 10(5):408 doi: 10.3390/f10050408
[37]	Peng W, Wei Y, Chen G, Lu G, Ye Q, et al. 2023. Analysis of wildfire danger level using logistic regression model in Sichuan Province, China. Forests 14(12):2352 doi: 10.3390/f14122352
[38]	Bergstra J, Bardenet R, Bengio Y, Kégl B. 2011. Algorithms for hyper-parameter optimization. NIPS'11: Proceedings of the 24th International Conference on Neural Information Processing Systems, Granada, Spain, 2011. Red Hook, NY, USA: Curran Associates Inc. pp. 2546−54
[39]	Ke G, Meng Q, Finley T, Wang T, Chen W, et al. 2017. LightGBM: a highly efficient gradient boosting decision tree. NIPS 2017: 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA, 2017. Red Hook, NY, USA: Curran Associates Inc. pp. 3149−57
[40]	Breiman L. 2001. Random Forests. Machine Learning 45:5−32 doi: 10.1023/A:1010933404324
[41]	Jing X, Li X, Zhang D, Liu W, Zhang W, Zhang Z. 2024. Forecast zoning of forest fire occurrence: A case study in southern China. Forests 15(2):265 doi: 10.3390/f15020265
[42]	Vapnik VN. 2000. The Nature of Statistical Learning Theory. New York: Springer. doi: 10.1007/978-1-4757-3264-1
[43]	Yao X, Tham LG, Dai FC. 2008. Landslide susceptibility mapping based on Support Vector Machine: A case study on natural slopes of Hong Kong, China. Geomorphology 101(4):572−82 doi: 10.1016/j.geomorph.2008.02.011
[44]	Lundberg S, Lee SI. 2017. A unified approach to interpreting model predictions. NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, California, USA, 2017. Red Hook, NY, United States: Curran Associates Inc. pp. 4768−77
[45]	Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, et al. 2020. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence 2(1):56−67 doi: 10.1038/s42256-019-0138-9
[46]	Bui DT, Ngo PTT, Pham TD, Jaafari A, Minh NQ, et al. 2019. A novel hybrid approach based on a swarm intelligence optimized extreme learning machine for flash flood susceptibility mapping. CATENA 179:184−96 doi: 10.1016/j.catena.2019.04.009
[47]	Abbas F, Zhang F, Ismail M, Khan G, Iqbal J, et al. 2023. Optimizing machine learning algorithms for landslide susceptibility mapping along the Karakoram Highway, Gilgit Baltistan, Pakistan: A comparative study of baseline, Bayesian, and metaheuristic hyperparameter optimization techniques. Sensors 23(15):6843 doi: 10.3390/s23156843
[48]	Liang M, An B, Li K, Du L, Deng T, et al. 2022. Improving genomic prediction with machine learning incorporating TPE for hyperparameters optimization. Biology 11(11):1647 doi: 10.3390/biology11111647
[49]	Zhang G, Wang M, Liu K. 2019. Forest fire susceptibility modeling using a convolutional neural network for Yunnan Province of China. International Journal of Disaster Risk Science 10(3):386−403 doi: 10.1007/s13753-019-00233-1
[50]	Wang SSC, Qian Y, Leung LR, Zhang Y. 2021. Identifying key drivers of wildfires in the contiguous US using machine learning and game theory interpretation. Earth's Future 9(6):e2020EF001910 doi: 10.1029/2020EF001910
[51]	Chen J, Di XY. 2015. Forest fire prevention management legal regime between China and the United States. Journal of Forestry Research 26(2):447−55 doi: 10.1007/s11676-015-0067-3
[52]	Ying L, Cheng H, Shen Z, Guan P, Luo C, et al. 2021. Relative humidity and agricultural activities dominate wildfire ignitions in Yunnan, Southwest China: patterns, thresholds, and implications. Agricultural and Forest Meteorology 307:108540 doi: 10.1016/j.agrformet.2021.108540
[53]	Wang J, Li D, Chen F, Wang S, Niu S. 2012. Study on spatial distribution and trend change of forest fires in Sichuan Province. Journal of Wildland Fire Science (2):26−30 doi: 10.3969/j.issn.1002-2511.2012.02.012

About this article

Cite this article

Liu Z, Zhou K, Yao Q, Reszka P. 2024. An interpretable machine learning model for predicting forest fire danger based on Bayesian optimization. Emergency Management Science and Technology 4: e025 doi: 10.48130/emst-0024-0026


Introduction

In the context of climate change and global warming, forest fires increasingly threaten life, property, forest resources, and the environment^[1]. According to the National Bureau of Statistics of China^[2], a total of 7,301 forest fires occurred from 2018 to 2022, burning an area of 48,000 hectares. Therefore, the development of accurate and interpretable forest fire danger models is crucial for early warning and emergency response.

Forest fires involve the interaction of multiple factors at different spatial and temporal scales, including vegetation, topography, meteorology, and human activities^[3−5]. Early studies of forest fires mainly explored their temporal and spatial distribution and estimated the spatial clustering characteristics of fire occurrence^[6], but they were limited to judging the macroscopic distribution of forest fires. Remote sensing technology coupled with Geographic Information Systems (GIS) facilitates extensive data acquisition, which in turn supports the application of logistic regression models, Geographically Weighted Logistic Regression^[7], Poisson models^[8], and various other statistical methods for analyzing the relationships among factors. However, statistical methods assume that the interactions between factors are linear, which leads to poor prediction accuracy in the resulting models^[9].

Many recent studies have used the 'black box' approach of machine learning to address the complex relationships among factors. Machine learning models have been shown to handle the complex nonlinear relationships among meteorological, topographical, anthropogenic, and vegetative factors, thereby enabling precise mapping of forest fire danger. Van Beusekom et al.^[10] conducted a study in Puerto Rico using meteorological data and human activities as predictors, and applied RF to analyze how these predictors correlate with fire occurrence. Yue et al.^[11] focused on Nanning City, incorporating meteorology, topography, human activities, and vegetation as predictors. They employed LightGBM, Classification and Regression Tree (CART), RF, and XGBoost to develop a susceptibility prediction model, and found that XGBoost outperformed the others, particularly in identifying high-danger areas within a specific region of Nanning. Wang et al.^[12], in their research on Yunnan Province, selected 16 predictors covering meteorological, topographical, and vegetation data, along with measures such as the distance from vegetation to rivers or roads and population density. They compared Logistic Regression (LR), SVM, Artificial Neural Network (ANN), RF, Gradient Boosting Decision Tree (GBDT), and LightGBM models. LightGBM was the most accurate and was subsequently used to construct forest fire susceptibility models and to map the associated danger areas.

Although machine learning models have performed well in forest fire danger assessment, the choice of model hyperparameters is crucial for achieving high classification accuracy and effective danger mapping. The 'black box' nature of these models poses an additional challenge, because it makes their results less transparent. There is therefore a need for models that are not only accurate but also understandable, so that one can interpret what causes forest fires and why the model predicts what it does. Optimization algorithms can be used to fine-tune the hyperparameters of machine learning models, thereby enhancing their predictive performance^[13,14]. Furthermore, interpretable artificial intelligence (AI) offers solutions to the 'black box' dilemma, with the Shapley additive explanation (SHAP) model being a notable example: it quantifies the impact and contribution of each factor to the model output^[15−17]. Previous studies have often relied on Gaussian Process (GP) models as probabilistic proxies for hyperparameter optimization^[12,18], whereas the potential of tree-structured Parzen estimator (TPE) models as probabilistic proxies has been somewhat overlooked. Further research is needed to compare the advantages and disadvantages of TPE for predicting forest fires.
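
To make the TPE idea concrete, the following is a minimal sketch on a one-dimensional toy problem. It is not the implementation used in this study: the objective `toy_loss`, the search range [0, 1], the quantile `gamma`, the fixed kernel bandwidth, and all function names are illustrative assumptions. The sketch splits past trials into a "good" and a "bad" group, fits a Parzen (kernel) density to each, and proposes the candidate that maximizes the ratio of the two densities.

```python
import math
import random


def toy_loss(x):
    """Hypothetical validation loss with its minimum at x = 0.3."""
    return (x - 0.3) ** 2


def kde(x, points, bandwidth):
    """Parzen (Gaussian kernel) density estimate evaluated at x."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
    return total / (len(points) * norm)


def tpe_minimize(loss, n_init=10, n_iter=40, gamma=0.25,
                 n_candidates=24, bandwidth=0.1, seed=0):
    """Simplified TPE loop on the interval [0, 1] (illustrative only)."""
    rng = random.Random(seed)
    # Warm-up: purely random trials.
    trials = [(x, loss(x)) for x in (rng.random() for _ in range(n_init))]
    for _ in range(n_iter):
        # Split trials into the best gamma fraction and the rest.
        trials.sort(key=lambda t: t[1])
        n_good = max(1, int(gamma * len(trials)))
        good = [x for x, _ in trials[:n_good]]
        bad = [x for x, _ in trials[n_good:]]
        # Draw candidates from the density of good trials, l(x).
        candidates = []
        for _ in range(n_candidates):
            c = rng.gauss(rng.choice(good), bandwidth)
            candidates.append(min(1.0, max(0.0, c)))
        # Evaluate the candidate with the largest ratio l(x) / g(x).
        best = max(
            candidates,
            key=lambda c: kde(c, good, bandwidth) / (kde(c, bad, bandwidth) + 1e-12),
        )
        trials.append((best, loss(best)))
    return min(trials, key=lambda t: t[1])


best_x, best_loss = tpe_minimize(toy_loss)
print(f"best x = {best_x:.3f}, loss = {best_loss:.5f}")
```

A GP-based optimizer would instead fit a Gaussian process to the observed losses and choose the next point with an acquisition function; in practice both variants are usually taken from an established optimization library rather than written by hand.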

In this study, an interpretable machine learning model is developed to predict forest fire danger based on GP and TPE optimization. Fire occurrence data from 2000−2019 in Sichuan and Yunnan provinces, China, were used for analysis. Eighteen factors, encompassing vegetation, topography, meteorology, and human activities, were selected to interpret the temporal and spatial distribution of forest fires. Six optimized machine learning models were developed by using the GP and TPE probabilistic proxy models within a Bayesian optimization framework to fine-tune the hyperparameters of LightGBM, RF, and SVM. The six models were compared using accuracy, precision, recall, balanced F score (F1), and area under the curve (AUC). The SHAP model was used to interpret the optimal machine learning models, providing insights into the contribution and influence of each factor. Finally, a forest fire danger map was produced to serve as a scientific foundation for forest fire likelihood prediction and early warning systems in Sichuan and Yunnan.
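
The attribution idea behind SHAP can be illustrated with an exact Shapley value computation on a tiny model. This is a sketch under stated assumptions, not the procedure used in this study: `toy_fire_score`, its coefficients, the three features, and the baseline values are all hypothetical, and features outside a coalition are simply replaced by baseline values. SHAP libraries rely on faster, model-specific algorithms, since the exact enumeration below grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one instance by enumerating all coalitions.

    Assumption: a feature outside the coalition takes its baseline value.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_fire_score(x):
    """Hypothetical danger score; coefficients are made up for illustration."""
    humidity, sunshine, elevation = x
    return -0.5 * humidity + 0.3 * sunshine + 0.1 * elevation + 0.2 * humidity * sunshine


instance = [0.2, 0.9, 0.7]   # hypothetical scaled feature values
baseline = [0.6, 0.5, 0.5]   # hypothetical reference point
phi = shapley_values(toy_fire_score, instance, baseline)
for name, contribution in zip(["humidity", "sunshine", "elevation"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the score of the instance and the score of the baseline, which is the additivity property that makes per-factor rankings comparable.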

No.	Factor	VIF value before eliminating factor	VIF value after eliminating factor
1	Da_AVGTEM	142.109	3.859
2	Da_MINTEM	51.681	−
3	Da_MAXTEM	30.345	−
4	Da_PRE	1.245	1.242
5	Da_AVGRH	3.440	2.220
6	Da_AVGWIN	2.609	2.420
7	Da_MAXWIN	2.603	2.536
8	Da_AVGPRS	3.999	3.855
9	SSD	3.424	2.577
10	Da_AVGGST	40.164	−
11	Da_MAXGST	11.016	4.639
12	Elevation	3.902	3.876
13	Slope	1.000	1.304
14	Aspect	1.001	1.001
15	TWI	1.040	1.190
16	Dis_to_railway	1.450	1.400
17	Dis_to_road	1.382	1.390
18	Dis_to_sett	1.458	1.459
19	Den_pop	4.882	4.871
20	GDP	3.811	3.806
21	Forest	1.104	1.108
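
As a reminder of how the values in the table above are defined, the variance inflation factor of factor j is VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing factor j on all other factors; equivalently, VIF_j is the j-th diagonal entry of the inverse correlation matrix. The sketch below uses that identity in plain Python. The three data columns are invented for illustration and are not taken from the study data, and the function names are hypothetical.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    devs = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(d * d for d in dev) ** 0.5 for dev in devs]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(devs[i], devs[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Invented example: two nearly collinear temperature series and rainfall.
temp_avg = [10.0, 12.0, 15.0, 18.0, 21.0, 25.0]
temp_max = [15.1, 16.8, 20.2, 22.9, 26.1, 29.8]
rain = [3.0, 0.0, 5.0, 1.0, 0.0, 2.0]
demo = vif([temp_avg, temp_max, rain])
for name, value in zip(["temp_avg", "temp_max", "rain"], demo):
    print(f"{name}: VIF = {value:.2f}")
```

Dropping one of two nearly collinear columns and recomputing lowers the VIF of the remaining one, which is the pattern visible in the table for the temperature-related factors.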

Model parameters	TPE-LightGBM	TPE-RF	TPE-SVM	GP-LightGBM	GP-RF	GP-SVM
TP	5779	5727	5570	5705	5709	5505
TN	8695	8633	8254	8505	8511	8213
FP	917	979	1358	1105	1101	1399
FN	633	685	842	707	703	907
ACC (%)	90.3	89.6	86.3	88.7	88.7	85.6
Precision (%)	86.3	85.4	80.4	83.8	83.8	79.7
Recall (%)	90.1	89.3	86.8	88.9	89.0	85.9
F1 (%)	88.2	87.3	83.5	86.3	86.3	82.7
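
The percentage rows follow from the confusion-matrix counts through the standard definitions: ACC = (TP + TN) / (TP + TN + FP + FN), Precision = TP / (TP + FP), Recall = TP / (TP + FN), and F1 = 2 · Precision · Recall / (Precision + Recall). The short sketch below recomputes them from the counts in the table above; the function and variable names are illustrative. The recomputed values agree with the tabulated ones to within about 0.1 percentage point.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"ACC": accuracy, "Precision": precision, "Recall": recall, "F1": f1}


# (TP, TN, FP, FN) copied from the table above.
counts = {
    "TPE-LightGBM": (5779, 8695, 917, 633),
    "TPE-RF": (5727, 8633, 979, 685),
    "TPE-SVM": (5570, 8254, 1358, 842),
    "GP-LightGBM": (5705, 8505, 1105, 707),
    "GP-RF": (5709, 8511, 1101, 703),
    "GP-SVM": (5505, 8213, 1399, 907),
}

for name, (tp, tn, fp, fn) in counts.items():
    metrics = classification_metrics(tp, tn, fp, fn)
    summary = ", ".join(f"{k} = {100 * v:.1f}%" for k, v in metrics.items())
    print(f"{name}: {summary}")
```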

No.	Forest fire occurrence probability	Fire danger level	Description of fire
1	0−0.2	I	Virtually no fire
2	0.2−0.4	II	Unlikely to occur
3	0.4−0.6	III	Possible to occur
4	0.6−0.8	IV	Prone to occur
5	0.8−1	V	Highly likely to occur
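
The classification in the table above amounts to a simple threshold lookup on the predicted probability. The sketch below shows one way to express it. The table does not say which level a probability exactly on a boundary (for example 0.2) belongs to, so assigning it to the higher level is an assumption made here, as are the function and constant names.

```python
import bisect

LEVELS = ["I", "II", "III", "IV", "V"]
BOUNDARIES = [0.2, 0.4, 0.6, 0.8]  # interior thresholds from the table above


def danger_level(probability):
    """Map a predicted fire occurrence probability in [0, 1] to a danger level."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # Assumption: a value exactly on a boundary is assigned to the higher level.
    return LEVELS[bisect.bisect_right(BOUNDARIES, probability)]


for p in (0.05, 0.35, 0.5, 0.75, 0.93):
    print(f"p = {p:.2f} -> level {danger_level(p)}")
```

Applying this lookup to every grid cell of a predicted probability raster yields the five-level danger map described in the text.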
