Digital Soil Mapping Using Machine Learning Algorithms in a Tropical Mountainous Area
Machine learning techniques are increasingly applied in digital soil mapping (DSM) for a range of soil mapping purposes. Given the variety of models available, it is important to know how they perform in relation to the soil data and environmental variables involved in soil mapping. This paper investigated the performance of eight machine learning algorithms for soil mapping in a tropical mountainous area of an official rural settlement in the Zona da Mata region in Brazil. Morphometric maps generated from a digital elevation model, together with Landsat-8 satellite imagery and climatic maps, were among the candidate covariates submitted to the Recursive Feature Elimination algorithm, which selected the predictors of soil types used by the machine learning algorithms. Mapping performance was assessed using the confusion matrix and a Z-test among the Kappa indexes of the matrices. In a conventional soil survey, the soils described and classified according to the Brazilian System of Soil Classification [Argissolos Vermelho-Amarelos Distróficos – PVAd (Acrisols), Cambissolos Háplicos Tb Distróficos – CXbd (Cambisols), Gleissolos Háplicos Tb Distróficos – GXbd (Gleysols), Latossolos Amarelos Distróficos – LAd (Xanthic Ferralsols), Latossolos Vermelho-Amarelos Distróficos – LVAd (Rhodic Ferralsols), and Neossolos Litólicos Distróficos – RLd (Neossols)] were grouped into composite mapping units (MU). The eight algorithms performed similarly, with no statistically significant difference among them (Kappa 0.42-0.48). Soils occurring on varying slopes (LAd, LVAd, CXbd) were mapped with lower accuracy, whereas soils on hydromorphic lowlands (GXbd) were classified more accurately. A map algebra comparison gave a satisfactory result, with 63-67% agreement between the conventional soil map and the maps produced by machine learning.
The largest disagreement in the DSM occurred in the LAd unit, owing to subtle color variation in the Latossolos mantle that has no clear relation to any environmental variable. This highlights the difficulty of applying DSM to hillslope landforms. Model performance was satisfactory, and the good agreement with the conventional soil map demonstrates the potential of DSM as a complementary tool for soil mapping in mountainous areas of Brazil for the purpose of land use planning.
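
The comparison of Kappa indexes mentioned above can be illustrated with a minimal sketch. It assumes two independent confusion matrices and uses a common large-sample approximation for the variance of Kappa; the abstract does not state which variance estimator the study used, so the formula, the function names, and the example matrices below are illustrative assumptions rather than the authors' procedure.

```python
import math

def kappa_and_variance(matrix):
    """Return (kappa, approximate variance) for a square confusion matrix.

    The variance uses the simple large-sample approximation
    p_o * (1 - p_o) / (n * (1 - p_e)^2), which is an assumption here.
    """
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: proportion of samples on the diagonal.
    p_o = sum(matrix[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    # Chance agreement from the row and column marginals.
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (p_o - p_e) / (1 - p_e)
    variance = p_o * (1 - p_o) / (n * (1 - p_e) ** 2)
    return kappa, variance

def kappa_z_test(matrix_a, matrix_b):
    """Z statistic for the difference between two independent Kappa values."""
    kappa_a, var_a = kappa_and_variance(matrix_a)
    kappa_b, var_b = kappa_and_variance(matrix_b)
    return (kappa_a - kappa_b) / math.sqrt(var_a + var_b)

# Hypothetical two-class confusion matrices, not data from the study.
model_a = [[40, 10], [10, 40]]
model_b = [[35, 15], [15, 35]]
z = kappa_z_test(model_a, model_b)
```

Under this sketch, an absolute Z value below 1.96 would indicate no significant difference between the two Kappa indexes at the 5% level, which is the kind of outcome the abstract reports for the eight algorithms.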