Application of Phase Space Reconstruction in Enhancing the Performance of Machine Learning Models for Drought Prediction (Case Study: Bandar Abbas Synoptic Station)

Document Type: Research Paper

Authors

1 PhD Student, Department of Natural Resources Engineering, Faculty of Agricultural and Natural Resources Engineering, University of Hormozgan, Bandar Abbas, Iran.

2 Professor, Department of Natural Resources Engineering, Faculty of Agricultural and Natural Resources Engineering, University of Hormozgan, Bandar Abbas, Iran.

3 Assistant Professor, Department of Natural Resources Engineering, Faculty of Agricultural and Natural Resources Engineering, University of Hormozgan, Bandar Abbas, Iran.

4 Assistant Professor, Department of Mathematics and Statistics, Faculty of Basic Sciences, University of Hormozgan, Bandar Abbas, Iran.

10.29252/aridbiom.2026.4035

Abstract

Drought is among the most complex and multidimensional climatic phenomena, marked by high uncertainty and nonlinear dependencies among climatic variables. This study develops a hybrid framework that integrates phase space reconstruction (PSR), Vine Copula, and quantile regression to improve the accuracy and interpretability of drought prediction in the hyper-arid climate of Bandar Abbas over the period 1960–2022. The results show that PSR, by exposing the hidden dynamical structure of the SPEI time series, substantially improves model performance: the coefficient of determination (R²) in the testing phase rises from approximately 0.10 to more than 0.80. The Vine Copula approach captures the nonlinear and asymmetric dependencies among climatic variables and, combined with PSR, provides a coherent framework for representing the dynamic behavior of the drought system. Quantile-based models, including QRF, QXGBoost, and QVineCopula, improve predictive accuracy and also allow data-driven quantification of uncertainty, yielding superior prediction interval coverage (PICP = 0.91). Variable importance analysis highlights the dominant role of lagged SPEI components, particularly SPEI9 and SPEI8, along with maximum temperature and wind speed, in governing regional drought dynamics. Overall, the hybrid PSR–QVineCopula model balances accuracy, robustness, and interpretability, and offers a novel framework for both deterministic and probabilistic drought prediction in hyper-arid regions.
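As a rough illustration of two ideas named in the abstract, the sketch below builds a time-delay embedding (the usual construction behind phase space reconstruction) and computes the prediction interval coverage probability (PICP). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `delay_embed` and `picp`, the embedding dimension `dim`, the delay `tau`, and the toy series are all hypothetical, and the abstract does not state which embedding parameters were used in the study.

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Return rows [x(t), x(t - tau), ..., x(t - (dim - 1) * tau)].

    dim and tau are illustrative parameters, not values from the paper.
    """
    x = np.asarray(series, dtype=float)
    start = (dim - 1) * tau
    if start >= len(x):
        raise ValueError("series too short for this dim and tau")
    # Column j holds the series delayed by j * tau steps.
    cols = [x[start - j * tau : len(x) - j * tau] for j in range(dim)]
    return np.column_stack(cols)

def picp(y_true, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    y = np.asarray(y_true, dtype=float)
    inside = (y >= np.asarray(lower)) & (y <= np.asarray(upper))
    return float(inside.mean())

# Toy series standing in for a monthly SPEI record (not real data).
toy = np.arange(10, dtype=float)
states = delay_embed(toy, dim=3, tau=2)

# Toy interval bounds, e.g. lower and upper predicted quantiles.
coverage = picp([0, 1, 2, 3], lower=[-1, 0, 3, 2], upper=[1, 2, 4, 4])
```

Each row of `states` is one reconstructed state vector that could serve as the input features of a regression model, and a PICP close to the nominal interval level indicates well-calibrated prediction intervals.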


