A new implementation of stacked generalisation approach for modelling arsenic concentration in multiple water sources

dc.contributor.author: Ibrahim, B.
dc.contributor.author: Ewusi, A.
dc.contributor.author: Ziggah, Y.Y.
dc.contributor.author: Ahenkorah, I.
dc.date.issued: 2023
dc.description: Data source: Supplementary information, https://doi.org/10.1007/s13762-023-05343-4
dc.description.abstract: The current study proposes an effective machine learning model based on a stacked generalisation technique for predicting arsenic content in water sources (groundwater, surface water and drinking water) based on physicochemical water parameters (turbidity, pH, electrical conductivity and total suspended solids). In the proposed approach, random forest and decision trees were stacked as base regressors in the first layer. Then, extreme gradient boosting was employed as a meta-regressor in the second layer to compute the final predictions. A comprehensive assessment of the proposed approach was performed using reliable statistical metrics and diagnostic plots of the observed and predicted arsenic concentration. The results demonstrated a better generalisation performance of the proposed stacked approach as compared with the standalone models of decision trees, random forest, extreme gradient boosting, generalised regression neural network, light gradient boosting, multi-layer perceptron, multivariate adaptive regression splines and other stacked variant models. The proposed stacked approach outperformed all comparative models by achieving the lowest RMSE and MAPE of 8.041E-04 and 0.4689, respectively, and the highest NSE and R² of 0.9778 and 0.9787, respectively. Overall, the results indicate that the performance of the proposed stacked generalisation approach is very sensitive to the choice of base learners. The outcome of this study indicates that a stronger predictive potential of base learners could lead to higher performance of the overall stacking model. Hence, the proposed approach could play a principal role in predicting arsenic concentration in water sources.
dc.identifier.citation: International Journal of Environmental Science and Technology, 2023; 21(5):5035-5052
dc.identifier.doi: 10.1007/s13762-023-05343-4
dc.identifier.issn: 1735-1472
dc.identifier.issn: 1735-2630
dc.identifier.uri: https://hdl.handle.net/11541.2/37067
dc.language.iso: en
dc.publisher: Center for Environment and Energy Research and Studies
dc.rights: Copyright 2023 The Author(s) under exclusive licence to Iranian Society of Environmentalists (IRSEN) and Science and Research Branch, Islamic Azad University
dc.source.uri: https://doi.org/10.1007/s13762-023-05343-4
dc.subject: arsenic
dc.subject: machine learning
dc.subject: modelling
dc.subject: stacked generalisation
dc.subject: water sources
dc.title: A new implementation of stacked generalisation approach for modelling arsenic concentration in multiple water sources
dc.type: Journal article
pubs.publication-status: Published
ror.mmsid: 9916812526301831
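
Note on the metrics named in the abstract: the record does not give the formulas behind RMSE, MAPE, NSE and R², so the following is only a minimal sketch of their common textbook definitions, not the authors' implementation. Two points are assumptions: MAPE is scaled as a percentage, and R² is taken as the squared Pearson correlation between observed and predicted values (which would explain why it can differ from NSE). The function names and the sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error (assumed scaled to percent).

    Undefined when any observed value is zero.
    """
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, pred):
    """Squared Pearson correlation (assumed meaning of R² here)."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)


if __name__ == "__main__":
    # Hypothetical arsenic concentrations, for illustration only.
    observed = [0.010, 0.020, 0.035, 0.050]
    predicted = [0.011, 0.019, 0.036, 0.048]
    for name, fn in [("RMSE", rmse), ("MAPE", mape), ("NSE", nse), ("R2", r_squared)]:
        print(name, fn(observed, predicted))
```

Under these definitions, predictions that are perfectly correlated with the observations but biased score an R² of 1 while NSE drops, so the two metrics are not interchangeable.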
