Using the unbiased grey Markov chain optimization model to forecast passenger flows through existing urban rail transit stations

Fuquan Pan; Wenzheng Li; Lixia Zhang; Xiaoxia Yang; Hailiang Tang; Yongkai Xia

doi:10.48130/dts-0026-0002

To accurately forecast the passenger flows of existing stations in urban rail transit and provide a scientific basis for subway operation management departments, a passenger flow prediction model optimized by an unbiased grey Markov chain was constructed. Using 18 months of average daily passenger flow data from Shilaoren Bathing Beach Station on Qingdao Metro Line 2, a grey prediction model was established and tested. The data were classified into six states according to their relative errors for Markov state modeling, and the classification results were validated. A state transition probability matrix and the corresponding k-step transition probabilities were then established. The unbiased grey Markov chain prediction model was used to predict the passenger flow data of existing stations in the next 3 months. By comparing the prediction results with the actual passenger flow data, it was found that the prediction error after optimization by the unbiased grey Markov model was controlled within 10%, which corresponded well with the actual characteristics of urban rail transit passenger flows. This model is suitable for predicting the passenger flows of existing urban rail transit stations and can provide an effective reference for managing subway operations and ensuring safety.


Introduction

Urban rail transit, with its advantages of large capacity, high speed, and low pollution, has become a core aspect of modern urban public transportation systems. It can effectively alleviate ground traffic congestion and promote the transformation of the urban transportation structure towards low carbon and high efficiency. However, the growing passenger demand has posed greater challenges to rail transit operations and management, especially during holidays, when a surge in passenger flows in a short period of time can easily create operational stress. In this context, accurate passenger flow forecasting has become the key to ensuring operational safety and improving service quality. It can also provide a scientific basis for line operation management departments to optimize capacity allocation and formulate emergency plans^[1].

At present, passenger flow prediction methods are mainly divided into two categories: traditional prediction methods and intelligent algorithms. Traditional prediction methods are based on linear assumptions and low-dimensional data processing, covering time series methods, grey model-based methods, linear regression methods, Kalman filter methods, etc. Zhou et al.^[2] combined singular spectrum analysis (SSA) with an AdaBoost-weighted extreme learning machine to establish an urban rail transit transfer passenger flow prediction model. Jiao^[3] successfully predicted passenger flows during regular periods by fitting the linear trend of historical passenger flows, based on the autoregressive integrated moving average (ARIMA) model. Wang et al.^[4] achieved short-term predictions of stable passenger flows through linear state estimation with the Kalman filter model. In addition, most models are based on clear mathematical assumptions, and the parameters' meanings are intuitive and easy to understand and explain. Carmona-Benítez et al.^[5] optimized the damped trend grey prediction model (DTGM) using seasonal dynamic damping factors and constructed a seasonal ARIMA damped trend grey prediction model (SDTGM). The seasonal damping factor was used to directly quantify the seasonal fluctuations of passenger flows, and the model was highly interpretable. This type of method does not require large-scale sample support and can achieve low-cost and rapid predictions in scenarios where the amount of passenger flow data is small, the data fluctuations are gentle, and the linear pattern is significant.

In view of the limitations of linear modeling, existing improvements have mostly been made by adding correction factors to the traditional model and improving the Kalman filter on the basis of the maximum entropy principle. Peng et al.^[6] combined the traditional grey prediction model with the Markov chain state transfer matrix to construct a grey Markov prediction model of railway passenger flows. Li et al.^[7] constructed a grey system theory model and a grey Markov chain prediction model based on traffic accident data over several years to predict data for the next 2 years. Ye et al.^[8] proposed an adaptive grey Markov prediction model with fusion parameters, applied it to the passenger flow prediction of Chengdu Metro Line 1, and improved the accuracy of predicted passenger flows by correcting the absolute error. Ding et al.^[9] constructed a combined model of the grey model GM (1,1) and a Markov chain, and used the particle swarm algorithm to iteratively optimize it. The average error after optimization was reduced by 37%, and the prediction accuracy was significantly improved. Cai et al.^[10] addressed a defect of the traditional Kalman filter, which exhibits large errors in the presence of non-Gaussian noise. However, these improvements remain local optimizations within the linear modeling framework, and they have not fundamentally overcome its structural constraints.

Intelligent algorithm methods are centered on nonlinear modeling and high-dimensional feature extraction, covering support vector regression (SVR), neural networks (back propagation [BP], recurrent neural networks [RNN]) and deep learning models (long short-term memory [LSTM], convolutional neural networks [CNN], graph convolutional networks [GCN]). They do not require a preset data distribution and can automatically extract nonlinear features from large-scale, high-dimensional data. They are suitable for complex scenarios such as abnormal passenger flows during holidays and special events. Hu et al.^[11] used the typical SVR algorithm to construct an urban rail transit passenger flow prediction model and optimized the prediction model with the improved particle swarm optimization (IPSO) algorithm. Shi et al.^[12] successfully captured the sudden fluctuations in holiday passenger flows by improving the Variational Mode Decomposition-Genetic Algorithm-Back Propagation Neural Network (VMD-GA-BP) model. Xue et al.^[13] proposed a hybrid deep neural network framework based on traditional smart card data and social media data to construct a station entry flow and social media interference model for predicting subway passenger flows during special events. Mulerikkal et al.^[14] used an RNN to generate an intermediate feature space, integrated spatial features into the time series, and introduced an outlier detection and elimination algorithm based on support vector machines (SVMs) to improve the performance in predicting subway passenger flows. Bapaume et al.^[15] proposed a computer vision framework based on deep learning methods for predicting the real-time passenger flows and departure intervals of subway lines in urban transportation networks. Tu et al.^[16] integrated internet event data based on the DeepSPE model to achieve multistep passenger flow prediction during large-scale events. Yue et al.^[17] used an LSTM network to process historical transfer passenger flow data and combined it with the Transformer prediction model to predict short-term transfer passenger flows between integrated transportation hubs in urban agglomerations. In addition, some scholars have proposed a deep learning model for predicting rail transit passenger flows based on a bidirectional LSTM (BILSTM) network that considers temporal characteristics^[18,19]. Some scholars have combined CNNs with LSTM for predicting short-term passenger flows in urban rail transit^[20,21]. Other scholars have proposed multigraph convolutional recurrent neural network (MGC-RNN) and flow-similarity attention graph convolutional network (F-SAGCN) models that comprehensively consider various factors affecting passenger flows and are suitable for predicting short-term passenger flows in urban rail transit systems^[22,23]. Xiu et al.^[24] combined the correlation-based spatiotemporal feature selection (Cor-STFS) model for optimal input selection with the STA-PTCN-BiGRU model that can capture dynamic patterns to propose a new subway passenger flow prediction framework. This method improves the prediction efficiency and accuracy through parallel computing. Although deep learning models have certain advantages, their network structure requires a large number of labeled samples for training and requires high-performance computing support. In small-sample scenarios, overfitting or a sharp drop in prediction accuracy is prone to occur.

Because of the high randomness and volatility of urban rail transit passenger flows, traditional prediction models, though suitable for small sample sizes and linear scenarios, struggle to cope with the periodic and sudden fluctuations in passenger flows. They can only fit long-term trends and rely heavily on empirical factors to correct for short-term fluctuations. Intelligent optimization algorithms, though capable of handling nonlinear and complex passenger flows, are poorly adapted to small sample sizes and have high training costs. Their predictions for small sample sizes are suboptimal, capturing short-term fluctuations but lacking stability and interpretability for long-term trends. Furthermore, most model improvements rely on high-precision multisource data or are designed only for specific scenarios, making them challenging to implement in scenarios where basic data are missing or in cross-scenario applications. To address these shortcomings, this study proposes a prediction model based on the combined optimization of an unbiased grey model and a Markov algorithm to predict irregular fluctuations in urban rail transit passenger flows. The model first eliminates the systematic deviation of passenger flow data through unbiased processing, uses the unbiased grey model to accurately fit the long-term trend to make up for the trend prediction defects of the traditional model, and then introduces the Markov optimization model to divide the state through the data and quantify the short-term fluctuations, thereby achieving full coverage prediction of various samples and multiple scenarios. Ultimately, it provides a high-precision and easy-to-implement prediction plan for the average daily passenger flows entering the station on weekdays in the next 3 months, effectively making up for the shortcomings of the existing model in terms of scenario adaptability, modeling accuracy, and practical universality.

Establishment of the unbiased grey prediction model

Discussion

This study aims to predict passenger flows at existing urban rail transit stations. It conducts a multidimensional analysis of historical passenger flow data from typical stations on Qingdao Metro Line 2 and covers the construction, accuracy verification, passenger flow state classification, and case validation of a combined unbiased grey Markov chain prediction model. By avoiding the accumulation and subtraction operations of the traditional GM (1,1) model, the unbiased grey model effectively captures the long-term evolution of passenger flows. For Shilaoren Bathing Beach Station, the model's deviation value D = 0.658 falls within the third-order (Level 3) accuracy band, 0.5 ≤ D < 0.75, and thus meets the qualified accuracy standard. The Markov chain corrects for random fluctuations in passenger flows by classifying the data into six discrete states according to their relative errors. Forecasts for passenger flows over the subsequent 3 months show that the relative error is within 10%. The model also effectively captures the passenger flow characteristics of stations in scenic areas during the peak tourist season, verifying its predictive effectiveness and providing a quantitative basis for optimizing line capacity allocation and organizing train operations. The model achieves stable predictions with only 18 months of sample data, so it adapts well to existing stations with a short operating life and to complex passenger flow scenarios characterized by both long-term trends and short-term fluctuations. Compared with existing research, this study replaces the traditional GM (1,1) model with an unbiased grey model, avoiding the bias caused by accumulation and subtraction operations. The model eliminates the need for complex data preprocessing, significantly simplifies the modeling process, and improves the engineering application efficiency. The model also significantly reduces the sample size requirement and eliminates the need for high-performance computing equipment, reducing the model's complexity and computational cost. This makes it easier for operations managers to understand, master, and apply it, resulting in greater practicality and operability.
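The Markov step summarized above (classify the relative errors into six states, estimate a state transition probability matrix, then take k-step transition probabilities) can be sketched as follows. This is only an illustration under stated assumptions: the relative errors are copied from the prediction table, but the interval boundaries in `EDGES` are hypothetical and are not the state intervals used in the study, and the helper names `classify`, `transition_matrix`, and `k_step` are invented for this sketch.

```python
# Relative errors (%) of the grey model fit, copied from the prediction table.
errors = [0.00, -33.41, -17.87, -5.20, -3.24, 8.73, 24.80, 33.36, 2.63,
          -9.23, -9.19, -9.86, -12.96, -14.90, -4.11, -3.66, -1.11, 8.27]

# HYPOTHETICAL interior boundaries that yield six states (0..5).
# They are chosen for illustration only and are not the paper's intervals.
EDGES = [-15.0, -9.0, -3.0, 3.0, 15.0]


def classify(values, edges):
    """Map each value to a state index: the number of boundaries it reaches."""
    return [sum(1 for edge in edges if v >= edge) for v in values]


def transition_matrix(states, n_states):
    """Estimate one-step transition probabilities from consecutive states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state with no observed departures keeps an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def k_step(matrix, k):
    """Return the k-step transition matrix, i.e. the k-th matrix power."""
    n = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = [[sum(result[i][m] * matrix[m][j] for m in range(n))
                   for j in range(n)] for i in range(n)]
    return result


states = classify(errors, EDGES)
P = transition_matrix(states, n_states=len(EDGES) + 1)
P2 = k_step(P, 2)

for i, row in enumerate(P):
    print(f"state {i}: " + "  ".join(f"{p:.2f}" for p in row))
```

In a grey Markov correction, the most probable next state would typically be used to adjust the grey forecast by a representative relative error of that state; that adjustment is omitted here because it depends on the actual state intervals.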

This study has certain limitations. First, the spatiotemporal distribution of passenger flows varies across stations and lines, resulting in the uneven predictive accuracy of the Markov chain optimization model across lines. Furthermore, the current classification of passenger flow states relies on researchers' empirical judgment, making it difficult to completely eliminate the impact of subjective factors on the model's accuracy. The study only considered passenger flow scenarios under good weather conditions and normal operating conditions, and did not incorporate external disturbances such as inclement weather, major holidays, and large-scale sport or entertainment events, which can cause sudden changes in passenger flows. This results in insufficient adaptability to extreme scenarios.

Therefore, future improvements to the model should focus on the following directions:

1. Reduce subjectivity: Classify passenger flow states more objectively by incorporating clustering techniques to minimize human bias.

2. Enhance adaptability: Improve the model's robustness to highly volatile data by integrating adaptive algorithms or hybrid models capable of dynamically responding to sudden changes in passenger flows.

3. Incorporate external factors: Extend the model to account for external influences such as severe weather, major holidays, and large-scale events.

These improvements will significantly enhance the practical value and reliability of the model in diverse and dynamic environments.

[1]	Wang Y, Ma J, Zhang J. 2019. Metro passenger flow forecast with a novel Markov-grey model. Periodica Polytechnica Transportation Engineering 48(1):70−75 doi: 10.3311/pptr.11131
[2]	Zhou W, Wang W, Zhao D. 2020. Passenger flow forecasting in metro transfer station based on the combination of singular spectrum analysis and AdaBoost-weighted extreme learning machine. Sensors 20(12):3555 doi: 10.3390/s20123555
[3]	Jiao B. 2025. Analysis and prediction of railway passenger flow patterns based on the ARIMA model. Journal of Engineering Innovations & Technology 7(2):25−61
[4]	Wang H, Teng J, Ye L, Chen Y. 2024. Short-term prediction method for passenger density in urban rail transit based on nonlinear Kalman filter. Urban Mass Transit 27(6):33−38,43
[5]	Carmona-Benítez RB, Nieto MR. 2020. SARIMA damp trend grey forecasting model for airline industry. Journal of Air Transport Management 82:101736 doi: 10.1016/j.jairtraman.2019.101736
[6]	Peng L, Shao X, Huang W. 2022. Prediction of Yantai railway passenger volume based on gray Markov model. Journal of Ludong University (Natural Science Edition) 38(1):50−56 (in Chinese)
[7]	Li H, Wen C. 2024. Prediction of road traffic accidents based on the grey-Markov model. Communications Science and Technology Heilongjiang 47(5):170−174 (in Chinese) doi: 10.16402/j.cnki.issn1008-3383.2024.05.020
[8]	Ye J, Xu Z, Gou X. 2022. An adaptive Grey-Markov model based on parameters self-optimization with application to passenger flow volume prediction. Expert Systems With Applications 202:117302 doi: 10.1016/j.eswa.2022.117302
[9]	Ding T, Pan N, Du B, Ai W. 2023. Forecast of cargo throughput in port based on improved grey Markov. Journal of Chongqing Jiaotong University (Natural Science) 42(9):130−136 (in Chinese)
[10]	Cai L, Zhang Z, Yang J, Yu Y, Zhou T, et al. 2019. A noise-immune Kalman filter for short-term traffic flow forecasting. Physica A: Statistical Mechanics and Its Applications 536:122601 doi: 10.1016/j.physa.2019.122601
[11]	Hu S, Ji M, Chang Z, Wang H, Kong X. 2025. An improved particle swarm optimization algorithm based urban rail passenger flow prediction model: a case study in Beijing, China. Digital Transportation and Safety 4(2):101−107 doi: 10.48130/dts-0025-0005
[12]	Shi F, Yang X, Hu X, Xu G, Wu R. 2019. A VMD-GA-BP method for predicting non-holiday passenger flow of high speed railway based on data replacement correction. China Railway Science 40(3):129−136 doi: 10.3969/j.issn.1001-4632.2019.03.18
[13]	Xue G, Liu S, Ren L, Ma Y, Gong D. 2022. Forecasting the subway passenger flow under event occurrences with multivariate disturbances. Expert Systems with Applications 188:116057 doi: 10.1016/j.eswa.2021.116057
[14]	Mulerikkal J, Thandassery S, Rejathalal V, Kunnamkody DMD. 2022. Performance improvement for metro passenger flow forecast using spatio-temporal deep neural network. Neural Computing and Applications 34(2):983−994 doi: 10.1007/s00521-021-06522-5
[15]	Bapaume T, Côme E, Ameli M, Roos J, Oukhellou L. 2023. Forecasting passenger flows and headway at train level for a public transport line: focus on atypical situations. Transportation Research Part C-Emerging Technologies 153:104195 doi: 10.1016/j.trc.2023.104195
[16]	Tu Q, Geng G, Zhang Q. 2023. Multi-step subway passenger flow prediction under large events using website data. Tehnički Vjesnik 30(5):1585−1593
[17]	Yue M, Ma S. 2023. LSTM-based transformer for transfer passenger flow forecasting between transportation integrated hubs in urban agglomeration. Applied Sciences 13(1):637 doi: 10.3390/app13010637
[18]	Halyal S, Mulangi RH, Harsha MM. 2022. Forecasting public transit passenger demand: With neural networks using APC data. Case Studies on Transport Policy 10(2):965−975 doi: 10.1016/j.cstp.2022.03.011
[19]	Qi Q, Cheng R, Ge H. 2023. Short-term inbound rail transit passenger flow prediction based on BILSTM model and influence factor analysis. Digital Transportation and Safety 2(1):12−22 doi: 10.48130/dts-2023-0002
[20]	Cao B, Li Y, Chen Y, Yang A. 2024. A CNN-LSTM model for short-term passenger flow forecast considering the built environment in urban rail transit stations. Journal of Transportation Engineering, Part A: Systems 150(11):04024072 doi: 10.1061/JTEPBS.TEENG-8579
[21]	Bogaerts T, Masegosa AD, Angarita-Zapata JS, Onieva E, Hellinckx P. 2020. A graph CNN-LSTM neural network for short and long-term traffic forecasting based on trajectory data. Transportation Research Part C: Emerging Technologies 112:62−77 doi: 10.1016/j.trc.2020.01.010
[22]	He Y, Li L, Zhu X, Tsui KL. 2022. Multi-graph convolutional-recurrent neural network (MGC-RNN) for short-term forecasting of transit passenger flow. IEEE Transactions on Intelligent Transportation Systems 23(10):18155−18174 doi: 10.1109/TITS.2022.3150600
[23]	Yu S, Luo A, Wang X. 2023. Railway passenger flow forecasting by integrating passenger flow relationship and spatiotemporal similarity. Intelligent Automation and Soft Computing 37(2):1877−1893 doi: 10.32604/iasc.2023.039132
[24]	Xiu C, Zhan S, Pan J, Peng Q, Lin Z, et al. 2026. Correlation-based feature selection and parallel spatiotemporal networks for efficient passenger flow forecasting in metro systems. Transportmetrica A: Transport Science 22(1):1 doi: 10.1080/23249935.2024.2335244
[25]	Pan J, Ma C. 2018. Passenger flow forecast based on improved grey Markov model. Technology & Economy in Areas of Communications 20(6):52−56,69
[26]	Guo X, Chen X, Yu Q, Sun Y. 2019. Traffic state prediction based on cluster analysis and Markov model. Highway 64(8):304−309
[27]	Bao L. 2017. Real-time forecast of passenger flow volume in urban rail transit. Urban Mass Transit 20(5):104−106,111
[28]	Yu Q, Yao Z. 2019. Research on Markov particle filter for traffic flow prediction. Journal of Transportation Systems Engineering and Information Technology 19(2):209−215

Inspection-level accuracy	First-order accuracy	Second-order accuracy	Third-order accuracy	Fourth-order accuracy
D	D < 0.35	0.35 ≤ D < 0.5	0.5 ≤ D < 0.75	D ≥ 0.75

Date	Average daily passenger flow (persons)	Date	Average daily passenger flow (persons)
2018.1	6,138	2018.10	8,049
2018.2	6,284	2018.11	8,100
2018.3	7,155	2018.12	8,098
2018.4	8,064	2019.1	7,923
2018.5	8,266	2019.2	7,836
2018.6	9,406	2019.3	8,699
2018.7	11,484	2019.4	8,789
2018.8	13,036	2019.5	9,065
2018.9	8,976	2019.6	10,051

Date	Predicted value (persons)	Relative error (%)	Date	Predicted value (persons)	Relative error (%)
2018.1	6,138	0.00	2018.10	8,792	−9.23
2018.2	8,383	−33.41	2018.11	8,844	−9.19
2018.3	8,433	−17.87	2018.12	8,897	−9.86
2018.4	8,484	−5.20	2019.1	8,950	−12.96
2018.5	8,534	−3.24	2019.2	9,003	−14.90
2018.6	8,585	8.73	2019.3	9,057	−4.11
2018.7	8,636	24.80	2019.4	9,111	−3.66
2018.8	8,688	33.36	2019.5	9,165	−1.11
2018.9	8,740	2.63	2019.6	9,220	8.27
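The fitted values and relative errors in the table above can be approximated with a short script. The sketch below assumes the commonly used textbook form of the unbiased GM (1,1) model: least-squares estimates a and b from the accumulated series, then β = ln((2 − a)/(2 + a)), A = 2b/(2 + a), and fitted values A·exp(βk) with the first observation kept unchanged. This section does not state the paper's own derivation, so the formulas and the function names (`unbiased_gm11`, `fitted_values`, `relative_errors`) are assumptions. The relative error is taken as (actual − predicted)/actual × 100, which agrees with the signs in the table.

```python
import math

# Average daily passenger flows (persons), 2018.1 to 2019.6, from the data table.
flows = [6138, 6284, 7155, 8064, 8266, 9406, 11484, 13036, 8976,
         8049, 8100, 8098, 7923, 7836, 8699, 8789, 9065, 10051]


def unbiased_gm11(x0):
    """Return (A, beta) of a textbook unbiased GM(1,1) fitted to x0 (assumed form)."""
    # Accumulated series x1 and background values z (means of neighbours).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    z = [(x1[k - 1] + x1[k]) / 2 for k in range(1, len(x0))]
    y = x0[1:]
    # Ordinary least squares for y = -a * z + b.
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    slope = (sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
             / sum((zi - z_mean) ** 2 for zi in z))
    a = -slope
    b = y_mean - slope * z_mean
    # Unbiased parameters: the fitted curve is A * exp(beta * k).
    beta = math.log((2 - a) / (2 + a))
    A = 2 * b / (2 + a)
    return A, beta


def fitted_values(x0, horizon=0):
    """Fitted values for the sample plus `horizon` extrapolated periods."""
    A, beta = unbiased_gm11(x0)
    # The first point is kept as observed; later points follow the exponential.
    return [float(x0[0])] + [A * math.exp(beta * k)
                             for k in range(1, len(x0) + horizon)]


def relative_errors(actual, predicted):
    """Relative error in percent: (actual - predicted) / actual * 100."""
    return [(a - p) / a * 100 for a, p in zip(actual, predicted)]


fit = fitted_values(flows, horizon=3)
errors = relative_errors(flows, fit)
for month, (p, e) in enumerate(zip(fit, errors), start=1):
    print(f"month {month:2d}: fitted {p:8.0f}  relative error {e:7.2f}%")
```

With `horizon=3`, the last three entries of `fit` are the uncorrected grey extrapolations for the following 3 months; the Markov correction would be applied to these values afterwards.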

Parameter	$\overline{X}$	$S_1$	$\overline{q}$	$S_2$	D
Numerical value	8,634	1,661	1,023	1,093	0.658
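The value of D in the table above is consistent with the usual posterior-variance ratio of grey model testing, D = S₂/S₁, since 1,093/1,661 ≈ 0.658. The sketch below assumes that definition (this section does not spell it out) and grades the result with the thresholds from the accuracy table; the function names are illustrative only.

```python
def posterior_variance_ratio(s1, s2):
    """Ratio of the residual deviation S2 to the data deviation S1 (assumed definition of D)."""
    return s2 / s1


def accuracy_grade(d):
    """Grade D using the thresholds listed in the accuracy table."""
    if d < 0.35:
        return "First-order accuracy"
    if d < 0.5:
        return "Second-order accuracy"
    if d < 0.75:
        return "Third-order accuracy"
    return "Fourth-order accuracy"


# S1 and S2 as reported in the parameter table.
d = posterior_variance_ratio(s1=1661, s2=1093)
print(f"D = {d:.3f} -> {accuracy_grade(d)}")
```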
