The Urban Road Traffic Performance Index (TPI)[25] is an indicator that comprehensively reflects the operational status of a road network by measuring the proportion of congested road mileage in the urban area of a city. The standard divides the TPI into the range 0.0−10.0, with 0.0−2.0 representing free flow, 2.0−4.0 basically free flow, 4.0−6.0 mild congestion, 6.0−8.0 moderate congestion, and 8.0−10.0 severe congestion. The TPI is computed as the ratio between the travel time during congested periods and the travel time under free-flow conditions.
$ TPI = \dfrac{{\displaystyle\sum\nolimits_{i = 1}^N {\dfrac{{{L_i}}}{{{V_i}}}{k_i}} }}{{\displaystyle\sum\nolimits_{i = 1}^N {\dfrac{{{L_i}}}{{{V_{free\_i}}}}{k_i}} }} $ (1)

where $L_i$ represents the length of section $i$, $V_i$ the speed of section $i$, $k_i$ the weight of section $i$, $V_{free\_i}$ the free-flow speed of section $i$, and $N$ the number of road sections.
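For illustration, a minimal Python sketch of Eqn (1) is shown below; the section lengths, speeds, and weights are hypothetical values, not data from the study.

```python
import numpy as np

def traffic_performance_index(lengths, speeds, free_flow_speeds, weights):
    """Eqn (1): weighted congested travel time over weighted free-flow travel time."""
    lengths = np.asarray(lengths, dtype=float)                     # L_i, section lengths (km)
    speeds = np.asarray(speeds, dtype=float)                       # V_i, observed speeds (km/h)
    free_flow_speeds = np.asarray(free_flow_speeds, dtype=float)   # V_free_i (km/h)
    weights = np.asarray(weights, dtype=float)                     # k_i, section weights

    congested_time = np.sum(lengths / speeds * weights)            # numerator of Eqn (1)
    free_flow_time = np.sum(lengths / free_flow_speeds * weights)  # denominator of Eqn (1)
    return congested_time / free_flow_time

# Hypothetical three-section example (not real Beijing data)
print(traffic_performance_index(
    lengths=[1.2, 0.8, 2.5],
    speeds=[20.0, 35.0, 15.0],
    free_flow_speeds=[60.0, 50.0, 60.0],
    weights=[1.0, 0.5, 2.0],
))
```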
This study uses TPI data as the dependent variable. The data cover the period from January 1, 2018 to June 30, 2019 at a 15-min interval from 5:00 AM to 11:00 PM each day, giving a total sample size of more than 40,460 samples over 18 months.
Influencing factors set
The analysis of influencing factors is the basis for extracting road network operation characteristics and carrying out TPI prediction. This study focuses on predicting the TPI at the daily level. Existing research has mainly constructed sets of factors influencing road network operating conditions from a temporal perspective, considering factors such as time period, month, week, workday, summer or winter vacation, and weather type. In addition to these temporal factors, this study incorporates the specific day within a holiday, special holidays, the car usage restriction policy, and special events into the set of influencing factors, aiming to fully consider the impact of external disturbances on fluctuations in road network operations.
Time period
The TPI shows regular fluctuation across different periods and has obvious temporal characteristics: the indices are lowest in February and relatively high in September and October[26]. The TPI during peak hours on a working day is shown in Fig. 1. Within a week, traffic pressure is higher during the Monday morning peak and the Friday evening peak. Within a day, traffic is likewise divided into peak and off-peak hours. Travelers in different weeks, days, and hours exhibit different travel behaviors, which drives the regular fluctuation of the TPI. Therefore, it is necessary to include the three indicators of month, week, and time period in the factor set.
Holidays
The traffic conditions during holidays differ considerably from those during working days, and the impact of different types of holidays on traffic conditions also differs significantly. Holidays are divided into three types: summer and winter vacations, public holidays (e.g. national holidays), and special holidays. In China, some special holidays such as Valentine's Day and Christmas are not public holidays, but travel demand during them tends to be high. Each of the three types of holidays is represented as a categorical variable.
Car usage restriction policy
To reduce the frequency of car usage and alleviate traffic congestion, the governments of different cities usually formulate traffic demand management policies on car usage. During weekdays, Beijing implements a traffic restriction policy based on the last digit of license plate numbers. As the proportions of vehicles with different last digits differ greatly, the impact of different restriction dates on the operational status of the road network is also clearly different.
Weather condition
Weather conditions also have a significant impact on traffic operation states. Adverse weather includes rain, snow, haze, etc. When adverse weather occurs, decreased visibility, wet road surfaces, and reduced vehicle speeds often result in a higher TPI. Therefore, the weather conditions that have a negative impact are included in the factor set.
Special events
Special events are divided into short-term events (e.g. concerts, sports competitions) and all-day events (e.g. exhibitions). People gather and disperse in large numbers before and after major events, which leads to an increase in the regional TPI.
The influencing factors described above are represented as categorical variables; a minimal encoding sketch is given below, and their descriptive statistics are shown in Table 1.
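As a hedged illustration of how such a factor set could be encoded from a timestamped TPI series (the column names, placeholder values, and join logic here are illustrative and not the study's actual data pipeline):

```python
import pandas as pd

# Hypothetical 15-min TPI records; in the study these come from the Beijing TPI platform.
df = pd.DataFrame({
    "timestamp": pd.date_range("2018-01-01 05:00", periods=8, freq="15min"),
    "tpi": [1.8, 2.1, 2.4, 2.9, 3.5, 4.2, 4.8, 5.1],
})

# Temporal factors, following the coding of Table 1.
df["month"] = df["timestamp"].dt.month - 1                          # 0: January ... 11: December
df["week"] = (df["timestamp"].dt.dayofweek + 1) % 7                 # 0: Sunday ... 6: Saturday
df["time_period"] = (df["timestamp"].dt.hour * 60
                     + df["timestamp"].dt.minute) // 15 + 1         # 21: 05:00-05:15 ... 92: 22:45-23:00
df["day_type"] = df["week"].isin([0, 6]).astype(int)                # 0: weekday, 1: weekend

# External factors would be joined from calendars and event records; placeholders here.
df["public_holiday"] = 0
df["restriction_group"] = 5                                         # 5: no limit on this hypothetical day
df["weather"] = 0                                                   # 0: sunny or cloudy
```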
Table 1. Descriptive statistics of influencing factors.
Name | Symbol | Count
Month | 0: January; 1: February; ...; 11: December | 18 months
Week | 0: Sunday; 1: Monday; ...; 6: Saturday | 72 weeks
Time period | 21: 05:00−05:15; 22: 05:15−05:30; ...; 92: 22:45−23:00 | 39,312 periods
Day type | 0: Weekday; 1: Weekend | 546 d
Public holiday | 1: First day of holiday | 12 d
 | 2: Middle day(s) during holiday | 25 d
 | 3: Last day of holiday | 12 d
Summer or winter vacation | 0: Normal days | 426 d
 | 1: Summer and winter vacation | 120 d
Special holiday | 0: Normal day | 421 d
 | 1: Special holiday | 5 d
Car usage restriction policy | 0: The last digit of the license plate number is 0 or 5 | 73 d
 | 1: The last digit of the license plate number is 1 or 6 | 74 d
 | 2: The last digit of the license plate number is 2 or 7 | 73 d
 | 3: The last digit of the license plate number is 3 or 8 | 71 d
 | 4: The last digit of the license plate number is 4 or 9 | 70 d
 | 5: No limit | 185 d
Weather | 0: Sunny or cloudy | 490 d
 | 1: Rain | 63 d
 | 2: Snow | 6 d
 | 3: Haze | 31 d
Special events | 1: Short-term events | 252 times
 | 2: Large events lasting the whole day | 314 times

The importance of influencing factors
Feature importance is used to observe the contribution of different features and to demonstrate the interpretability of the model. The XGBoost model can identify the relative importance, or contribution, of each weather and temporal variable in predicting the daily TPI. The relative importance of a variable depends on the number of times it is selected as a splitting point and the improvement of the squared error in each iteration. For a single base decision tree $T$, the relative importance of a variable $l$ on the TPI is defined as the summation of the improvement of the squared error over the $J - 1$ internal nodes:

$ R_l^2(T) = \displaystyle\sum\nolimits_{t = 1}^{J - 1} {E_t^2\;\mathbb{1}({v_t} = l)} $ (2)

where $v_t$ denotes the splitting variable associated with internal node $t$, and $E_t^2$ denotes the corresponding improvement of the squared error after splitting. For an ensemble of trees $\{ {T_k}\} _1^K$, the relative importance can be evaluated by averaging over all base trees:

$ R_l^2 = \displaystyle\frac{1}{K}\displaystyle\sum\nolimits_{k = 1}^K {R_l^2({T_k})} $ (3)

Figure 2 shows the relative importance of all factors. It indicates that temporal variables such as time period, week, and month have the greatest influence on the change of TPI, followed by holidays (public holidays and vacations), travel restrictions, weather, etc. The special holiday feature is removed because it contributes almost nothing to the change of the TPI.
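In practice, such relative importances can be read directly from a fitted model. The following hedged sketch uses the xgboost scikit-learn wrapper on synthetic stand-in data (the study's actual training pipeline is not reproduced here); gain-based importance corresponds to the squared-error improvement at split points summarized by Eqns (2) and (3).

```python
import numpy as np
import pandas as pd
import xgboost as xgb

rng = np.random.default_rng(0)
# Hypothetical factor set with Table 1's variables (synthetic values, not the study's data).
X = pd.DataFrame({
    "month": rng.integers(0, 12, 1000),
    "week": rng.integers(0, 7, 1000),
    "time_period": rng.integers(21, 93, 1000),
    "public_holiday": rng.integers(0, 4, 1000),
    "restriction_group": rng.integers(0, 6, 1000),
    "weather": rng.integers(0, 4, 1000),
})
# Synthetic TPI, driven mainly by time period and day of week, plus noise.
y = 2 + 0.05 * X["time_period"] + 0.5 * (X["week"] == 1) + rng.normal(0, 0.3, 1000)

model = xgb.XGBRegressor(n_estimators=160, max_depth=5, learning_rate=0.1)
model.fit(X, y)

# Normalized gain-based importances, sorted from most to least influential.
importance = pd.Series(
    model.get_booster().get_score(importance_type="gain")
).sort_values(ascending=False)
print(importance / importance.sum())
```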
Extreme Gradient Boosting (XGBoost) is an improved algorithm based on gradient boosted decision trees (GBDT)[27, 28], a powerful sequential ensemble technique with a parallel learning modular structure that achieves fast computation. For this study, XGBoost demonstrates good robustness to missing and abnormal values, effectively handling datasets whose influencing factors contain missing or abnormal values and thus avoiding impacts on predictive performance due to data quality issues. It provides feature importance rankings that help explain the factors behind the predicted results, giving good interpretability. XGBoost optimizes the model by iteratively and automatically selecting and combining features, and its various hyperparameters can be tuned, resulting in good predictive accuracy. These characteristics make XGBoost a suitable means to predict and explain the heterogeneity of the TPI. The XGBoost prediction model can be expressed as:
$ {\hat y_i} = \displaystyle\sum\nolimits_{k = 1}^t {{f_k}({x_i})} = \hat y_i^{(t - 1)} + {f_t}({x_i}) $ (4)

where $ {f_t}({x_i}) $ represents the $t$-th tree, and $ {\hat y_i} $ represents the predicted result for sample $ {x_i} $. XGBoost implements a balancing algorithm between model performance and computation speed. To learn the set of functions used in the model, the following regularized objective is minimized:
$ obj = \displaystyle\sum\nolimits_{i = 1}^n {l({y_i},{{\hat y}_i})} + \displaystyle\sum\nolimits_{k = 1}^t {\Omega ({f_k})} $ (5)

where $ l $ represents a differentiable convex loss function that measures the difference between the prediction $ {\hat{y}}_{i} $ and the target $ {y}_{i} $, $ \Omega $ represents the complexity of the model, and $ n $ represents the total amount of training data. The second term $ \Omega $ penalizes the complexity of the model (i.e., the regression tree functions). This additional regularization term helps to smooth the final learned weights and avoid over-fitting. Intuitively, the regularized objective tends to select a model employing simple and predictive functions.

$ \Omega ({f_k}) = \gamma T + \displaystyle\frac{1}{2}\lambda \displaystyle\sum\nolimits_{j = 1}^T {\omega _j^2} $ (6)

where $ \gamma $ and $ \lambda $ are manually set parameters, $T$ represents the total number of leaves, $ {\omega _j} $ represents the score on the $j$-th leaf, and $ \dfrac{1}{2}\lambda {\displaystyle\sum }_{j=1}^{T}{\omega _j^2} $ is the squared L2 norm of the leaf scores. When the regularization parameters are zero, XGBoost degenerates into a traditional boosting model. The model iterates using additive training to further minimize the objective function, updating it at each iteration.
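For reference, these regularization terms are exposed as hyperparameters in the xgboost library; a minimal, hedged sketch follows (the tree settings are placeholders that anticipate the tuned values discussed later, not prescribed defaults).

```python
from xgboost import XGBRegressor

# gamma corresponds to the per-leaf complexity penalty (gamma) and reg_lambda to the
# L2 leaf-weight penalty (lambda) in Eqn (6).
model = XGBRegressor(gamma=0.1, reg_lambda=1.0,
                     max_depth=5, learning_rate=0.1, n_estimators=160)
```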
As XGBoost is an algorithm in the boosting family, it follows forward stage-wise additive modeling, and the model objective function at step $t$ can be expressed as:
$ ob{j^{(t)}} = \displaystyle\sum\nolimits_{i = 1}^n {l({y_i},\hat y_i^{(t - 1)} + {f_t}({x_i}))} + \Omega ({f_t}) $ (7)

To find the function $f_t$ that minimizes the objective function, XGBoost expands the loss function in a second-order Taylor series around $f_t = 0$. The objective function is thus approximated as:
$ ob{j^{(t)}} \simeq \displaystyle\sum\nolimits_{i = 1}^n {[l({y_i},\hat y_i^{(t - 1)}) + {g_i}{f_t}({x_i}) + \displaystyle\frac{1}{2}{h_i}f_t^2({x_i})]} + \Omega ({f_t}) $ (8)

Dropping the constant term $ l({y_i},\hat y_i^{(t - 1)}) $ and regrouping the summation by leaf nodes, Eqn (8) can be rewritten as:

$ \begin{aligned} obj & \simeq \displaystyle\sum\nolimits_{i = 1}^n {[{g_i}{f_t}({x_i}) + \displaystyle\frac{1}{2}{h_i}f_t^2({x_i})]} + \Omega ({f_t}) \\ & = \displaystyle\sum\nolimits_{j = 1}^T {[(\displaystyle\sum\nolimits_{i \in {I_j}} {{g_i}} ){\omega _j} + \displaystyle\frac{1}{2}(\displaystyle\sum\nolimits_{i \in {I_j}} {{h_i}} + \lambda )\omega _j^2]} + \gamma T \end{aligned} $ (9)

where $obj$ represents the objective function, $ {g_i} = {\partial _{\hat y^{(t - 1)}}}l({y_i},{\hat y^{(t - 1)}}) $ represents the first derivative, and $ {h_i} = \partial _{\hat y^{(t - 1)}}^2 l({y_i},{\hat y^{(t - 1)}}) $ represents the second derivative. Equation (9) rewrites the objective function as a univariate quadratic function of the leaf node score $ \omega $. The optimal $ \omega $ and the corresponding value of the objective function are obtained as follows:

$ \omega _j^ * = - \displaystyle\frac{{{G_j}}}{{{H_j} + \lambda }} $ (10)

$ ob{j^ * } = - \displaystyle\frac{1}{2}\displaystyle\sum\nolimits_{j = 1}^T {\displaystyle\frac{{G_j^2}}{{{H_j} + \lambda }}} + \gamma T $ (11)

where $ {G_j} = \displaystyle\sum\nolimits_{i \in {I_j}} {{g_i}} $ and $ {H_j} = \displaystyle\sum\nolimits_{i \in {I_j}} {{h_i}} $. For example, with the squared-error loss $ l({y_i},{\hat y_i}) = {({y_i} - {\hat y_i})^2} $, $ {g_i} = 2(\hat y_i^{(t - 1)} - {y_i}) $ and $ {h_i} = 2 $, so each leaf weight in Eqn (10) is simply the mean residual of the samples falling into that leaf, shrunk by the regularization parameter $ \lambda $. The pseudo-code of the XGBoost algorithm is shown in Table 2.
Table 2. The pseudo-code of XGBoost algorithm.
XGBoost pseudo-code:
Input: Training set D = {(xi, yi)}, where xi represents the i-th input vector and yi is the corresponding label.
Output: Prediction model f(x).
// Step 1: Initialize the ensemble
Initialize the base prediction model as a constant value: f0(x) = initialization_constant
// Step 2: Iterate over the boosting rounds
for m = 1 to M:   // M is the number of boosting rounds
    // Step 3: Compute the pseudo-residuals
    Compute the negative gradient of the loss function with respect to the current model's predictions:
    rmi = −∂L(yi, fm−1(xi)) / ∂fm−1(xi)
    // Step 4: Fit a base learner to the pseudo-residuals
    Fit a base learner (e.g., a decision tree) to the pseudo-residuals: hm(x)
    // Step 5: Update the prediction model
    fm(x) = fm−1(x) + η · hm(x), where η is the learning rate
// Step 6: Output the final prediction model
f(x) = fM(x)

In each iteration, the XGBoost algorithm calculates the prediction residuals of the current model and uses these residuals to train a new regression tree. The predictions of this tree are then weighted and added to the previous model's predictions, updating the overall model. This process is repeated until the specified number of iterations is reached. The learning rate parameter controls the contribution of each tree to the overall model.
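The pseudo-code above corresponds to generic gradient boosting with squared-error loss. A minimal, self-contained Python sketch of these steps is given below for illustration only; it is not the xgboost library's actual implementation, which additionally uses the second-order terms and regularization of Eqns (5)−(11).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boosting_fit(X, y, n_rounds=100, learning_rate=0.1, max_depth=3):
    """Minimal gradient boosting for squared-error loss, following Steps 1-6."""
    f0 = float(np.mean(y))                        # Step 1: constant initialization
    trees = []
    pred = np.full(len(y), f0)
    for m in range(n_rounds):                     # Step 2: boosting rounds
        residuals = y - pred                      # Step 3: pseudo-residuals (negative gradient)
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                    # Step 4: fit base learner to residuals
        pred += learning_rate * tree.predict(X)   # Step 5: update the ensemble
        trees.append(tree)
    return f0, trees                              # Step 6: final model

def gradient_boosting_predict(X, f0, trees, learning_rate=0.1):
    pred = np.full(X.shape[0], f0)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred

# Tiny synthetic demo (hypothetical data, not the study's TPI series)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 2 + 0.5 * X[:, 0] + rng.normal(0, 0.2, 200)
f0, trees = gradient_boosting_fit(X, y, n_rounds=50)
print(gradient_boosting_predict(X[:5], f0, trees))
```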
Model parameter
This study constructs an initial decision tree through a machine learning algorithm, then carries out feature selection and searches for parameters with stronger generalization ability and higher scores. Model optimization can greatly improve the accuracy of the learners, reduce the training time of the model, and prevent under-fitting and over-fitting.
Smaller learning rates need more iterations on the same training set. The learning rate and its corresponding optimal number of trees are therefore considered together when determining the fitting effect of the model. While examining different combinations of the learning rate and the number of trees, the optimal tree depth for each combination can also be found. Model performance scores of different combinations are shown in Table 3, where the number of trees reported is the optimal number under each learning rate.
Table 3. Performance of extreme gradient boosting (XGBoost) models for daily TPI prediction.
Learning rate | Number of trees | R² | MAE | MSE
Maximum depth of the tree = 3
0.05 | 1,400 | 0.8800 | 0.4934 | 0.4911
0.1 | 1,300 | 0.8779 | 0.4978 | 0.4998
0.5 | 160 | 0.8666 | 0.5274 | 0.5461
1 | 140 | 0.8117 | 0.6442 | 0.7708
Maximum depth of the tree = 4
0.05 | 700 | 0.8797 | 0.4923 | 0.4927
0.1 | 600 | 0.8978 | 0.4640 | 0.4430
0.5 | 120 | 0.8872 | 0.4763 | 0.4620
1 | 110 | 0.8889 | 0.4791 | 0.4550
Maximum depth of the tree = 5*
0.05 | 350 | 0.8865 | 0.4734 | 0.4646
0.1* | 160* | 0.8950 | 0.4474 | 0.4309
0.5 | 50 | 0.8886 | 0.4730 | 0.4560
1 | 30 | 0.8756 | 0.5103 | 0.5095
Maximum depth of the tree = 6
0.05 | 195 | 0.8896 | 0.4655 | 0.4520
0.1 | 70 | 0.8791 | 0.4902 | 0.4950
0.5 | 30 | 0.8945 | 0.4572 | 0.4321
1 | 20 | 0.8860 | 0.4838 | 0.4666
* Selected combination.

In this model, the combinations {max_depth = 4, learning_rate = 0.1, n_estimators = 600} and {max_depth = 5, learning_rate = 0.1, n_estimators = 160} have better performance. Given that it takes a long time for the learner to iterate 600 times, {max_depth = 5, learning_rate = 0.1, n_estimators = 160} is selected as the preferred combination.
For 'min_samples_split' and 'min_samples_leaf', the default values are 2. It is recommended to increase these values as the sample size increases. By comparing parameters, {min_samples_leaf = 40, min_samples_split = 2} is selected as the preferred combination, which means that a node is pruned together with its sibling node when the sample size of a leaf node is less than 40.
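A hedged sketch of how such a parameter search could be run with the xgboost scikit-learn interface and GridSearchCV follows; the data and parameter grid are illustrative only (note also that native XGBoost controls leaf size through parameters such as min_child_weight rather than min_samples_leaf).

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

# Synthetic stand-in for the factor set and TPI values (not the study's data).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 8))
y = 2 + 3 * X[:, 0] + rng.normal(0, 0.2, 500)

param_grid = {
    "max_depth": [3, 4, 5, 6],
    "learning_rate": [0.05, 0.1, 0.5, 1.0],
    "n_estimators": [50, 160, 600, 1400],
}
search = GridSearchCV(
    XGBRegressor(objective="reg:squarederror"),
    param_grid,
    scoring="r2",
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```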
This study collected the TPI data and the various influencing factor data of Beijing from January 1, 2018 to June 30, 2019 to build the data set. To improve the generalization of the model and prevent over-fitting, 70% of the data are used as the training set and 30% as the test set. Python is used to build the prediction model and to carry out parameter calibration and accuracy verification of the model.
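A minimal end-to-end sketch of this workflow is given below with synthetic stand-in data, the preferred hyperparameters from Table 3, and the evaluation indicators defined in the next subsection; it illustrates the procedure rather than reproducing the study's actual pipeline.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from xgboost import XGBRegressor

# Synthetic stand-in for the factor set and TPI series (not the study's data).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(2000, 8))
y = 2 + 4 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.3, 2000)

# 70% training / 30% test split, as in the study.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Preferred hyperparameter combination reported in Table 3.
model = XGBRegressor(max_depth=5, learning_rate=0.1, n_estimators=160,
                     objective="reg:squarederror")
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print("MAE:", mean_absolute_error(y_test, y_pred))   # Eqn (12)
print("MSE:", mean_squared_error(y_test, y_pred))    # Eqn (13)
print("R2 :", r2_score(y_test, y_pred))              # Eqn (14)
```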
Model evaluation indicator
Accurate and reasonable evaluation indicators play an important role in optimizing model parameters, selecting a reasonable model, and checking the accuracy of prediction results. The following evaluation indicators are selected for the regression model:

a. Mean Absolute Error (MAE)

$ MAE(y,\hat y ) = \displaystyle\frac{1}{{{n_{samples}}}}\displaystyle\sum\limits_{i = 0}^{{n_{samples}} - 1} {\left| {{y_i} - {{\hat y}_i}} \right|} $ (12)

b. Mean Squared Error (MSE)

$ MSE(y,\hat y ) = \displaystyle\frac{1}{{{n_{samples}}}}\displaystyle\sum\limits_{i = 0}^{{n_{samples}} - 1} {{{({y_i} - {{\hat y}_i})}^2}} $ (13)

c. Coefficient of determination (r2_score)

$ {R^2}(y,\hat y) = 1 - \frac{{\displaystyle\sum\nolimits_{i = 0}^{{n_{samples}} - 1} {({y_i} - {{\hat y}_i}} {)^2}}}{{\displaystyle\sum\nolimits_{i = 0}^{{n_{samples}} - 1} {{{({y_i} - \overline y )}^2}} }} $ (14)

Model accuracy evaluation
The XGBoost model is used to predict the TPI of Beijing over four weeks from August 26 to September 29, 2019, and the observed TPI data are used for the accuracy test. The traffic restriction policy is in effect during the forecasting period, which also includes 24 large-scale events with more than 5,000 attendees, 4 days of rain, and the Mid-Autumn Festival holiday, so it can reflect the prediction performance of the model under different factors.
Taking one week (September 16 to September 22, 2019) as an example, the prediction accuracy of the peak-hour TPI in the morning and evening is calculated. The prediction results are shown in Fig. 3. The average accuracy over the whole week is 90.1%, and 94.8% during workday peak hours, so the overall prediction accuracy is good. The prediction accuracies for working days and non-working days are 91.5% and 89.2%, respectively. The reason is that residents' travel demand is more flexible during non-working days and is more susceptible to weather, temperature, and other factors.
This study selects four weeks from April to May in 2019 as an example to verify the accuracy of daily dimension TPI prediction results. The average prediction accuracy of four consecutive weeks TPI is shown in Table 4. Examples demonstrate that the average prediction accuracy of this model can reach more than 90%. Among them, the accuracy of prediction in week 2 is relatively low, which may be attributed to the elastic demand for residents' travel during Labor Day, thereby causing the road network TPI to exhibit markedly different characteristics from the norm.
Table 4. Forecast accuracy of TPI for each week.
Forecast data | Prediction accuracy
Week 1 (April 22 to April 28, 2019) | 94.3%
Week 2 (April 29 to May 5, 2019) | 85.3%
Week 3 (May 6 to May 12, 2019) | 91.1%
Week 4 (May 13 to May 19, 2019) | 89.1%
Average value | 90.0%

To validate the stability of the prediction model under extreme weather conditions, this study selects six days each of rainy, snowy, and hazy weather from the predicted results and calculates the prediction accuracy of the morning and evening peak TPI for the three weather conditions separately. The predicted period for rainy weather ranges from July 5 to July 10, 2018, with a prediction accuracy of 85.3%. The predicted period for snowy weather ranges from December 13 to December 18, 2019, with a prediction accuracy of 86.1%. The predicted period for hazy weather ranges from January 11 to January 16, 2019, with a prediction accuracy of 85.6%. The prediction results are shown in Fig. 4.
Comparison of models
To verify the forecasting performance of the XGBoost model, Bayesian Ridge[29], Linear Regression[30], ElasticNet[31], and SVR[32] are selected for comparison.
The accuracy of the above models is verified with the evaluation indicators, and the calculated values are shown in Table 5. Compared with the other models, the XGBoost model has the lowest MAE and MSE values, 0.396 and 0.989, respectively, while its R² value is the highest at 0.786. The model comparison results further confirm the advantages of XGBoost in modeling the complex relationship between the road network TPI and the different factors influencing road network operation quality.
Table 5. Accuracy verification result of different models.
Performance of different models (measured by MAE, MSE, and R²):
TPI prediction | SVR | ElasticNet | Bayesian Ridge | Linear Regression | XGBoost
MAE | 0.611 | 1.668 | 1.581 | 2.189 | 0.396*
MSE | 1.693 | 3.111 | 4.121 | 3.553 | 0.989*
R² | 0.784 | 0.034 | 0.113 | 0.391 | 0.786*
* Best value. MAE, Mean Absolute Error; MSE, Mean Squared Error.
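A hedged sketch of how such a comparison could be set up with scikit-learn and xgboost is shown below; the data are a synthetic stand-in and the models use default settings except for the XGBoost hyperparameters of Table 3, so it illustrates the procedure rather than reproducing Table 5.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge, LinearRegression, ElasticNet
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

# Synthetic stand-in for the factor set and TPI values (not the study's data).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(2000, 8))
y = 2 + 4 * X[:, 0] + 2 * X[:, 1] ** 2 + rng.normal(0, 0.3, 2000)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "SVR": SVR(),
    "ElasticNet": ElasticNet(),
    "Bayesian Ridge": BayesianRidge(),
    "Linear Regression": LinearRegression(),
    "XGBoost": XGBRegressor(max_depth=5, learning_rate=0.1, n_estimators=160),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print(f"{name}: MAE={mean_absolute_error(y_test, y_pred):.3f}, "
          f"MSE={mean_squared_error(y_test, y_pred):.3f}, "
          f"R2={r2_score(y_test, y_pred):.3f}")
```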
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request. The Traffic Performance Index (TPI) data used in this article are provided by the Beijing Key Laboratory of Integrated Traffic Operation Monitoring and Service.
About this article
Cite this article
Weng J, Feng K, Fu Y, Wang J, Mao L. 2023. Extreme gradient boosting algorithm based urban daily traffic index prediction model: a case study of Beijing, China. Digital Transportation and Safety 2(3):220−228 doi: 10.48130/DTS-2023-0018
- Received: 02 June 2023
- Accepted: 12 September 2023
- Published online: 28 September 2023
Abstract: The exhaust emissions and frequent traffic incidents caused by traffic congestion have affected the operation and development of urban transport systems. Monitoring and accurately forecasting urban traffic operation is a critical task for formulating pertinent strategies to alleviate traffic congestion. Compared with traditional short-term traffic prediction, this study proposes a machine learning algorithm-based traffic forecasting model for daily-level peak hour traffic operation status prediction by using abundant historical data of the urban traffic performance index (TPI). The study also constructs a multi-dimensional influencing factor set to further investigate the relationship between different factors and the quality of road network operation, including day of week, time period, public holiday, car usage restriction policy, special events, etc. Based on long-term historical TPI data, this research proposes a daily dimensional road network TPI prediction model using the extreme gradient boosting algorithm (XGBoost). The model validation results show that the prediction accuracy can reach higher than 90%. Compared with other prediction models, including Bayesian Ridge, Linear Regression, ElasticNet, and SVR, the XGBoost model has better performance and proves its superiority on large high-dimensional data sets. The daily dimensional prediction model proposed in this paper has important application value for predicting traffic status and improving the operation quality of urban road networks.