Figure 1.
Fluctuation characteristics of TPI over different periods.
Figure 2.
Relative importance of different influencing factors.
Figure 3.
Comparison of TPI prediction results for one week.
Figure 4.
Comparison of TPI prediction results for rainy, snowy, and hazy weather.
Name                          Symbol                                                  Count
Month                         0: January; 1: February; ...; 11: December              18 months
Week                          0: Sunday; 1: Monday; ...; 6: Saturday                  72 weeks
Time period                   21: 0500-0515; 22: 0515-0530; ...; 92: 2245-2300        39,312 periods
Day type                      0: Weekday; 1: Weekend                                  546 d
Public holiday                1: First day of holiday                                 12 d
                              2: Middle day(s) during holiday                         25 d
                              3: Last day of holiday                                  12 d
Summer or winter vacation     0: Normal days                                          426 d
                              1: Summer and winter vacation                           120 d
Special holiday               0: Normal day                                           421 d
                              1: Special holiday                                      5 d
Car usage restriction policy  0: The last digit of license plate number is 0 or 5.    73 d
                              1: The last digit of license plate number is 1 or 6.    74 d
                              2: The last digit of license plate number is 2 or 7.    73 d
                              3: The last digit of license plate number is 3 or 8.    71 d
                              4: The last digit of license plate number is 4 or 9.    70 d
                              5: No limit                                             185 d
Weather                       0: Sunny or cloudy                                      490 d
                              1: Rain                                                 63 d
                              2: Snow                                                 6 d
                              3: Haze                                                 31 d
Special events                1: Short-term events                                    252 times
                              2: Large events lasting the whole day                   314 times
Table 1.
Descriptive statistics of influencing factors.
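The calendar-based codes in Table 1 can be derived directly from a timestamp. The following is a minimal sketch, assuming the encodings listed in the table (month starting at 0 for January, week starting at 0 for Sunday, 15-minute time periods numbered so that 05:00-05:15 is period 21, and restriction codes 0-4 keyed on the last plate digit). The function names and the dictionary keys are hypothetical and do not come from the original study.

```python
from datetime import datetime
from typing import Optional


def encode_calendar_features(ts: datetime) -> dict:
    """Encode a timestamp with the codes listed in Table 1 (illustrative only)."""
    minutes_since_midnight = ts.hour * 60 + ts.minute
    return {
        # 0: January ... 11: December
        "month": ts.month - 1,
        # Python's weekday() is 0 for Monday; Table 1 uses 0 for Sunday.
        "week": (ts.weekday() + 1) % 7,
        # 15-minute slots counted from midnight, starting at 1,
        # so 05:00-05:15 is 21 and 22:45-23:00 is 92.
        # Table 1 only lists periods 21 to 92.
        "time_period": minutes_since_midnight // 15 + 1,
        # 0: weekday, 1: weekend (Saturday or Sunday)
        "day_type": 1 if ts.weekday() >= 5 else 0,
    }


def encode_plate_restriction(restricted_digit: Optional[int]) -> int:
    """Map a restricted last plate digit to the Table 1 code.

    Digits 0/5 -> 0, 1/6 -> 1, 2/7 -> 2, 3/8 -> 3, 4/9 -> 4.
    None means no restriction is in force -> 5 ("No limit").
    """
    if restricted_digit is None:
        return 5
    return restricted_digit % 5


monday_morning = encode_calendar_features(datetime(2019, 4, 22, 5, 0))
sunday_night = encode_calendar_features(datetime(2019, 4, 28, 22, 45))
```

Holiday, vacation, weather, and special-event codes depend on external calendars and records, so they are left out of this sketch.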
XGBoost pseudo-code

Input: Training set D = {(x_i, y_i)}, where x_i is the i-th input vector and y_i is the corresponding label.
Output: Prediction model f(x).

// Step 1: Initialize the ensemble
Initialize the base prediction model as a constant value: f_0(x) = initialization_constant

// Step 2: Iterate over the boosting rounds
for m = 1 to M:    // M is the number of boosting rounds

    // Step 3: Compute the pseudo-residuals
    Compute the negative gradient of the loss function with respect to the current model's predicted values:
    r_mi = -∂L(y_i, f_{m-1}(x_i)) / ∂f_{m-1}(x_i)

    // Step 4: Fit a base learner to the pseudo-residuals
    Fit a base learner (e.g., decision tree) to the pseudo-residuals: h_m(x).

    // Step 5: Update the prediction model
    Update the prediction model by adding the new base learner:
    f_m(x) = f_{m-1}(x) + η * h_m(x), where η is the learning rate.

// Step 6: Output the final prediction model
Output the final prediction model: f(x) = f_M(x)

Table 2.
Pseudo-code of the XGBoost algorithm.
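The six steps in Table 2 can be sketched in a few lines of plain Python. This is an illustrative gradient-boosting loop, not the XGBoost library's implementation: it assumes a squared-error loss L = (y - f)^2 / 2, for which the negative gradient is simply the residual y - f, and it uses a one-split regression stump on a single feature as the base learner. The names `fit_stump` and `gradient_boost` are hypothetical. The XGBoost library additionally uses second-order gradient information and a regularized objective, which this sketch omits.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression stump on a single feature by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the mean
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting loop following Steps 1-6 of Table 2 (squared-error loss assumed)."""
    f0 = sum(ys) / len(ys)                      # Step 1: constant initial model
    learners = []
    preds = [f0] * len(xs)
    for _ in range(n_rounds):                   # Step 2: boosting rounds
        # Step 3: pseudo-residuals = negative gradient of (y - f)^2 / 2
        residuals = [y - p for y, p in zip(ys, preds)]
        h = fit_stump(xs, residuals)            # Step 4: fit base learner
        learners.append(h)
        # Step 5: f_m(x) = f_{m-1}(x) + eta * h_m(x)
        preds = [p + learning_rate * h(x) for x, p in zip(xs, preds)]

    def model(x):                               # Step 6: final model f_M
        return f0 + learning_rate * sum(h(x) for h in learners)

    return model


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = gradient_boost(xs, ys, n_rounds=100, learning_rate=0.1)
```

On this toy step function, each round shrinks the remaining residual by the factor (1 - η), which shows why a smaller learning rate generally needs more boosting rounds to reach the same fit.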
Learning rate    Number of trees    R2        MAE       MSE
Maximum depth of the tree = 3
0.05             1,400              0.8800    0.4934    0.4911
0.1              1,300              0.8779    0.4978    0.4998
0.5              160                0.8666    0.5274    0.5461
1                140                0.8117    0.6442    0.7708
Maximum depth of the tree = 4
0.05             700                0.8797    0.4923    0.4927
0.1              600                0.8978    0.4640    0.4430
0.5              120                0.8872    0.4763    0.4620
1                110                0.8889    0.4791    0.4550
Maximum depth of the tree = 5*
0.05             350                0.8865    0.4734    0.4646
0.1*             160*               0.8950    0.4474    0.4309
0.5              50                 0.8886    0.4730    0.4560
1                30                 0.8756    0.5103    0.5095
Maximum depth of the tree = 6
0.05             195                0.8896    0.4655    0.4520
0.1              70                 0.8791    0.4902    0.4950
0.5              30                 0.8945    0.4572    0.4321
1                20                 0.8860    0.4838    0.4666
Table 3.
Performance of extreme gradient boosting (XGBoost) models for daily TPI prediction.
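Choosing a configuration from Table 3 amounts to ranking the rows by one of the reported metrics. The sketch below transcribes the table and picks the row with the lowest MSE and the row with the highest R2. The `Result` record and the `best_by` helper are hypothetical names introduced for illustration, and the selection rule itself is an assumption, since the source does not state how the starred configuration was chosen.

```python
from collections import namedtuple

Result = namedtuple("Result", "max_depth learning_rate n_trees r2 mae mse")

# Rows transcribed from Table 3.
RESULTS = [
    Result(3, 0.05, 1400, 0.8800, 0.4934, 0.4911),
    Result(3, 0.1, 1300, 0.8779, 0.4978, 0.4998),
    Result(3, 0.5, 160, 0.8666, 0.5274, 0.5461),
    Result(3, 1, 140, 0.8117, 0.6442, 0.7708),
    Result(4, 0.05, 700, 0.8797, 0.4923, 0.4927),
    Result(4, 0.1, 600, 0.8978, 0.4640, 0.4430),
    Result(4, 0.5, 120, 0.8872, 0.4763, 0.4620),
    Result(4, 1, 110, 0.8889, 0.4791, 0.4550),
    Result(5, 0.05, 350, 0.8865, 0.4734, 0.4646),
    Result(5, 0.1, 160, 0.8950, 0.4474, 0.4309),
    Result(5, 0.5, 50, 0.8886, 0.4730, 0.4560),
    Result(5, 1, 30, 0.8756, 0.5103, 0.5095),
    Result(6, 0.05, 195, 0.8896, 0.4655, 0.4520),
    Result(6, 0.1, 70, 0.8791, 0.4902, 0.4950),
    Result(6, 0.5, 30, 0.8945, 0.4572, 0.4321),
    Result(6, 1, 20, 0.8860, 0.4838, 0.4666),
]


def best_by(rows, metric, maximize=False):
    """Return the row with the best value of the named metric."""
    pick = max if maximize else min
    return pick(rows, key=lambda row: getattr(row, metric))


lowest_mse = best_by(RESULTS, "mse")
lowest_mae = best_by(RESULTS, "mae")
highest_r2 = best_by(RESULTS, "r2", maximize=True)
```

In this transcription, the row with the lowest MSE and the lowest MAE is the one marked with asterisks in Table 3 (depth 5, learning rate 0.1, 160 trees), while the highest R2 belongs to a different row (depth 4, learning rate 0.1, 600 trees).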
Forecast data                              Prediction accuracy
Week 1 (April 22 to April 28, 2019)        94.3%
Week 2 (April 29 to May 5, 2019)           85.3%
Week 3 (May 6 to May 12, 2019)             91.1%
Week 4 (May 13 to May 19, 2019)            89.1%
Average value                              90.0%
Table 4.
Forecast accuracy of TPI for each week.
TPI prediction: performance of different models (measured by MAE, MSE and R2)
Models: SVR; ElasticNet; Bayesian Regression; XGBoost

MAE    1.611    1.668    1.581    2.189    0.396*
MSE    1.693    3.111    4.121    3.553    0.989*
R2     0.784    0.034    0.113    0.391    0.786*

MAE, Mean Absolute Error; MSE, Mean Squared Error
Table 5.
Accuracy verification results of different models.
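The three measures used in Tables 3 and 5 have standard definitions, sketched below in plain Python for reference. The function names are hypothetical, and the source does not state which implementation the authors used, so this should be read as the textbook formulas rather than the study's exact code.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """MSE: average of (y - y_hat)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """R2: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only; these are not TPI data.
observed = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
mae = mean_absolute_error(observed, predicted)
mse = mean_squared_error(observed, predicted)
r2 = r2_score(observed, predicted)
```

Lower MAE and MSE indicate smaller prediction errors, while an R2 closer to 1 indicates that the model explains more of the variance in the observed values.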