Machine Learning Predicts Groundwater Heavy Metal Pollution in Densu Basin
A recent study posted on arXiv introduces a predictive modeling framework for heavy metal contamination in the Densu Basin's groundwater. To handle the skewed Heavy Metal Pollution Index (HPI), the study compared three treatments of the response (raw scale, log transformation, and Gaussian copula transformation) across six machine learning methods, including support vector regression and k-nearest neighbors. The findings suggest that raw-scale models yield overly optimistic results, with R² values near 1.0 for Elastic Net and the ensemble model. The log transformation, by contrast, stabilized variance and made predictions more reliable.
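To make the response treatments concrete, here is a minimal sketch of a log transform and a rank-based normal-score transform, which is one common way to build a Gaussian copula style transformation. The summary does not describe the study's exact construction, so the function names, the rank / (n + 1) rescaling, and the sample values are all illustrative assumptions.

```python
import math
from statistics import NormalDist


def log_transform(values):
    """Natural log of each value; assumes all values are strictly positive."""
    return [math.log(v) for v in values]


def normal_scores(values):
    """Rank-based Gaussian (normal-score) transform.

    Each value is replaced by the standard normal quantile of its
    rescaled rank, rank / (n + 1). Tied values share their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the group while the next sorted value is tied with this one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the group
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# Made-up, right-skewed index values (NOT data from the study).
hpi = [12.0, 15.5, 18.2, 22.0, 40.1, 95.0, 310.0]
log_hpi = log_transform(hpi)
z = normal_scores(hpi)
```

The log transform compresses the long right tail, while the normal-score transform keeps only the ordering of the observations and maps it onto a symmetric bell-shaped scale.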
Key facts
- Study focuses on groundwater heavy metal pollution in the Densu Basin.
- Conventional methods fail to capture statistical complexity and spatial heterogeneity.
- HPI is skewed and affected by correlated contaminants.
- Three transformations applied: raw, log, Gaussian copula.
- Six learners used: support vector regression, k-NN, CART, Elastic Net, kernel ridge regression, stacked Lasso ensemble.
- Raw-scale models gave R² ≈ 1.0 for Elastic Net and the ensemble, suggesting over-optimistic results.
- Log transformation stabilized variance.
- Framework integrates response transformations with nested cross-validated ensemble machine learning.
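The combination of a response transformation with nested cross-validated ensemble learning can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the study's pipeline: the data are synthetic, the hyperparameters and fold counts are arbitrary, and "stacked Lasso ensemble" is assumed here to mean a stacking model whose meta-learner is a Lasso.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.ensemble import StackingRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT from the study): positive, right-skewed response.
rng = np.random.default_rng(0)
X = rng.lognormal(size=(80, 5))
y = np.exp(0.5 * np.log(X).sum(axis=1) + rng.normal(scale=0.3, size=80))

# Five base learners; a Lasso meta-learner combines their predictions.
base_learners = [
    ("svr", make_pipeline(StandardScaler(), SVR())),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor())),
    ("cart", DecisionTreeRegressor(max_depth=3, random_state=0)),
    ("enet", make_pipeline(StandardScaler(), ElasticNet(alpha=0.1))),
    ("krr", make_pipeline(StandardScaler(), KernelRidge(kernel="rbf", alpha=1.0))),
]
stack = StackingRegressor(
    estimators=base_learners, final_estimator=Lasso(alpha=0.01), cv=3
)

# Fit on log(y); predictions are mapped back with exp, so scores are
# computed on the original scale of the response.
model = TransformedTargetRegressor(regressor=stack, func=np.log, inverse_func=np.exp)

# Nested cross-validation: the inner loop tunes the meta-learner penalty,
# the outer loop scores the tuned model on folds it never saw during tuning.
inner = KFold(n_splits=3, shuffle=True, random_state=1)
outer = KFold(n_splits=4, shuffle=True, random_state=2)
search = GridSearchCV(
    model,
    {"regressor__final_estimator__alpha": [0.001, 0.01, 0.1]},
    cv=inner,
    scoring="r2",
)
scores = cross_val_score(search, X, y, cv=outer, scoring="r2")
print("Outer-fold R2:", np.round(scores, 3))
```

Keeping the tuning inside the inner loop means the outer-fold R² is not inflated by hyperparameter selection, which is the kind of over-optimism that nested cross-validation is designed to guard against.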
Entities
Institutions
- arXiv
Locations
- Densu Basin