Rev. Bras. Ciênc. Solo. 2021;45:e0210084.
Optimized data-driven pipeline for digital mapping of quantitative and categorical properties of soils in Colombia
24/Nov/2021
DOI: 10.36783/18069657rbcs20210084

Highlights
We propose a toolbox facilitating the workflow of a digital soil mapping project.
The toolbox was tested across a relatively large area of 14,537 km² in Colombia.
The results confirm that the derived products offer the robustness required for a DSM project.
Parallel processing increases the toolbox's performance in the covariate selection and modeling steps.
ABSTRACT
Soil maps provide a method for graphically communicating what is known about the spatial distribution of soil properties in nature. We proposed an optimized pipeline, named dino-soil toolbox, programmed in R for mapping quantitative and categorical soil properties from legacy soil data. The pipeline, composed of four main modules (data preprocessing, covariate selection, exploratory data analysis, and modeling), was tested across a study area of 14,537 km² located between the departments of Cesar and Magdalena, Colombia. We assessed the feasibility of the toolbox for modeling three soil properties: pH at two depth intervals (0.00-0.30 and 0.30-1.00 m), soil taxonomy (great group), and taxonomic family by particle size, using a set of 25 environmental factors derived from auxiliary layers of climate, land cover, and terrain. As a result, we successfully deployed the proposed semi-automatic and sequential pipeline, yielding rapid digital soil mapping (DSM) outputs across the study area. By providing multiple outputs such as tables, charts, maps, and geospatial data across its four main modules, the pipeline offers considerable robustness to support the outcomes and analysis of a DSM project. Future studies could extend the pipeline with further machine learning frameworks for predictive modeling of soil properties, such as ensembles and deep learning models, which have shown high performance for DSM.

