高校地质学报 (Geological Journal of China Universities), 2024, Vol. 30, Issue (04): 485-495. DOI: 10.16108/j.issn1006-7493.2023033

Prediction of Heavy Metal Content through Fusion of Spatial and Environmental Factors in Machine Learning Models: A Case Study of Naozhou Island

JIA Lili, HU Feiyue, LI Tingting, ZHU Xin, YI Longke

  1. Geological Investigation Institute of Guangdong Province, Guangzhou 510080, China
  • Online: 2024-08-20    Published: 2024-08-20

Abstract: Conventional methods for assessing soil heavy metal contamination typically rely only on spatial interpolation of data from a limited number of sampling points, and so overlook the influence of environmental covariates such as geological background, human activities, and geographical features on heavy metal distribution. Conversely, predictions based solely on environmental covariates fail to capture the spatial aggregation of heavy metals. This paper therefore proposes an approach that fuses spatial and environmental factors as covariates and uses three models, Random Forest (RF), Extreme Gradient Boosting (XGBoost), and a deep neural network (DNN), to predict the spatial distribution of heavy metals on Naozhou Island. The results show that incorporating spatial factors into the models significantly improves predictive performance. Data splitting and external data validation were used to verify the robustness of the improved models, and the best-performing model was selected for prediction. Based on the predicted results, correlation and cluster analyses of the heavy metal elements were carried out at the data-feature level, and LISA (local indicators of spatial association) cluster analysis at the spatial-distribution level. These analyses indicate that the distributions of Cu, Ni, Cr, and Zn on the island are controlled mainly by geological factors, those of Pb, As, and Hg mainly by human activities, and that of Cd by human activities and geological background together.

Key words: heavy metals, spatial heterogeneity, machine learning, accurate prediction
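
The LISA spatial clustering mentioned in the abstract is usually based on the local Moran's I statistic. The abstract does not say how the authors computed it, so the following is only a minimal sketch under stated assumptions: the k-nearest-neighbour weighting, the helper names `knn_weights` and `local_moran`, and the sample coordinates and values are all illustrative and do not come from the paper. The permutation test normally used to judge significance is omitted.

```python
from math import hypot


def knn_weights(coords, k):
    """Row-standardised k-nearest-neighbour weights (assumed weighting scheme).

    Returns one dict per point mapping neighbour index -> weight 1/k.
    """
    weights = []
    for i, (xi, yi) in enumerate(coords):
        # Rank every other point by Euclidean distance to point i.
        order = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: hypot(coords[j][0] - xi, coords[j][1] - yi),
        )
        weights.append({j: 1.0 / k for j in order[:k]})
    return weights


def local_moran(values, weights):
    """Local Moran's I and quadrant label for each sample.

    I_i = (z_i / m2) * sum_j w_ij * z_j, where z is the deviation from the
    mean and m2 is the mean squared deviation. Requires non-constant values.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n
    results = []
    for i in range(n):
        # Spatial lag: weighted average deviation of the neighbours.
        lag = sum(w * dev[j] for j, w in weights[i].items())
        stat = dev[i] * lag / m2
        if dev[i] >= 0:
            label = "High-High" if lag >= 0 else "High-Low"
        else:
            label = "Low-Low" if lag < 0 else "Low-High"
        results.append((stat, label))
    return results


# Hypothetical data: two groups of three sampling points along a transect,
# one with high concentrations and one with low concentrations.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
values = [9.0, 10.0, 11.0, 1.0, 2.0, 3.0]

results = local_moran(values, knn_weights(coords, k=2))
for (stat, label), value in zip(results, values):
    print(f"value={value:5.1f}  I={stat:6.3f}  {label}")
```

In this toy case every point sits among neighbours of the same kind, so all local statistics are positive and the points fall into High-High or Low-Low clusters. A High-Low or Low-High label would flag a spatial outlier.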