Predicting Water Distribution Pipe Failures Using Machine Learning and Cross-Infrastructure Data
- Authors: Daniel Kozelj, David Abert Fernández
- Citation: Acta hydrotechnica, vol. 38, no. 68, pp. 53-64, 2025. https://doi.org/10.15292/acta.hydro.2025.05
- Abstract: Water pipeline failures in urban networks are a significant source of non-revenue water, service disruptions, and high maintenance costs. This study develops a machine learning model to predict pipeline failure probabilities and inform risk-based maintenance strategies. Trained on real-world asset and geospatial data from 2010 to 2025, the model incorporates standard pipe attributes (material, age, diameter, network type, and maintenance history) alongside spatially derived indicators of the surrounding infrastructure. Notably, it quantifies the predictive impact of adjacent infrastructure systems, including electricity grids, gas pipelines, district heating, sewage systems, and roads, using spatial buffering and overlay techniques. Several of these cross-utility features, particularly road category, electricity voltage, and sewer type, showed meaningful predictive importance, reflecting their indirect but consistent influence on the risk of pipe failure. The ML model, built with the XGBoost algorithm and validated through stratified K-fold cross-validation, achieved high performance (ROC AUC: 0.9102, recall: 0.7750, accuracy: 0.8750). Despite lower precision due to class imbalance, the F1 score (0.2261) and LogLoss (0.2500) confirm its reliability. This study introduces a novel, spatially enriched approach to failure prediction, advancing urban infrastructure management through context-aware, data-driven insights.
- Keywords: Water distribution systems, Pipe failure prediction, Machine learning, XGBoost, Spatial analysis, Condition assessment.
- Full text: a38dk.pdf
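The abstract reports recall and F1 but not precision itself. As a rough illustration of what "lower precision due to class imbalance" means in numbers, the sketch below solves the standard definition F1 = 2PR / (P + R) for precision P. The function names are hypothetical, and the resulting value is a back-calculation from the two rounded figures in the abstract, not a number reported by the authors.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for P, given F1 and recall R.

    Rearranging gives P = F1 * R / (2*R - F1).
    """
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("F1 cannot be >= 2 * recall for a valid precision")
    return f1 * recall / denom


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2.0 * precision * recall / (precision + recall)


# Values as reported in the abstract (rounded to four decimals there).
REPORTED_F1 = 0.2261
REPORTED_RECALL = 0.7750

# Assumption: both metrics were computed at the same decision threshold.
precision = implied_precision(REPORTED_F1, REPORTED_RECALL)

print(f"implied precision: {precision:.3f}")
print(f"round-trip F1:     {f1_from(precision, REPORTED_RECALL):.4f}")
```

Under that assumption the implied precision is about 0.13, so roughly one in seven or eight pipes flagged as likely to fail would correspond to an actual failure, while about three quarters of actual failures are caught. For a risk-based maintenance screen this trade-off may be acceptable, since a missed failure is typically costlier than an unnecessary inspection.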