Identifying appropriate reference ecosystems based on soil indicators to evaluate postmining reclamation: A multivariate framework

ABSTRACT Large-scale mining operations, such as those associated with iron extraction, disturb soils and vegetation and create the need for effective rehabilitation practices. The Iron Quadrangle region of southeastern Brazil is one of the world’s biodiversity hotspots; however, iron mining activities threaten many natural and seminatural ecosystem types in which many rare/protected species occur. The Iron Quadrangle has four main ecosystem types: Atlantic Forest (AF); ferruginous rupestrian grassland with dense vegetation (FRG-D); ferruginous rupestrian grassland with sparse vegetation (FRG-S); and quartzite rupestrian grassland (QRG). To support rehabilitation and monitoring plans, we evaluated reference areas and identified the soil and vegetation attributes that best differentiated between these four ecosystems. We measured thirty-four physical, chemical, and biological soil properties and two vegetation parameters and, using a multivariate analysis, detected: 1) correlations between properties and 2) differences between areas. We identified twelve properties that best differentiated the areas (in order from most to least relevant): nickel content; exchangeable aluminum; clay content; above-ground vegetation volume; aluminum saturation; particle density; bulk density; arsenic content; zinc content; lead content; fine sand plus silt content; and fine sand content. Soil physicochemical properties proved to be more sensitive to differences in ecosystem type, and in particular, parameters related to fertility and the presence of metals and semi-metals differentiated the AF from the FRG-D and FRG-S. Soil physical properties, including fine sand and silt content, were most important for differentiating QRG from the other ecosystems, possibly resulting from the exposure of quartzite material to erosive processes. This study demonstrates the importance of identifying appropriate reference areas for post-mining reclamation.


INTRODUCTION
Metallurgical and steel industries have increased demand for mineral resources, leading to extensive exploration of iron ore deposits (Carvalho et al., 2014). This activity can cause many detrimental effects, such as releasing greenhouse gases, inducing ecotoxicity in freshwater, generating solid waste, and transforming natural environments (Liu et al., 2020). Moreover, mined land often has very different chemical, physical and biological properties compared to the pre-mining state (Sousa et al., 2020; Woźniak et al., 2022). Such changes make it challenging to restore these areas to their former condition.
Post-mining efforts can be classified as land restoration or land rehabilitation (Gastauer et al., 2019). Both sets of practices require the ability to evaluate or measure changes that result from the intervention. However, land restoration efforts also need appropriate or optimal reference areas to use as comparators (i.e., goals) (Toma et al., 2023). These target areas must still have most of their original functions (Uehara and Gandara, 2011), because they are used to estimate feasible trajectories for disturbed sites (SER, 2004). However, in diverse environments, post-mined areas under recovery may lie close to several types of reference areas that act as propagule sources (Toma et al., 2023), which can alter their recovery trajectory.
As an example, the Iron Quadrangle (IQ) region in Minas Gerais State, Brazil, holds numerous mining operations and contains remarkable geological, geomorphological, pedological, biological, and phytophysiognomic diversity (Messias et al., 2012; Schaefer et al., 2015; Fernandes, 2016; Coelho et al., 2017). The local flora is associated with the Atlantic Forest and Brazilian Savanna ecotones, along with Rupestrian grasslands that form transitional areas between the two (Sousa et al., 2020).
Rupestrian grasslands have a broad diversity of habitat types, primarily controlled by lithological, topographic and pedological factors (Schaefer et al., 2023). Such habitats include physiognomies with a predominance of native grasses, rocky outcrops, or gallery forests; forest fragments on hilltops are also common (Carvalho Filho et al., 2010; Fernandes et al., 2016a). Together they reveal the true identity and representativeness of this unique and neglected ecosystem. However, reference areas in this hotspot have not yet been described clearly enough. Recent studies have emphasized the need for a deeper evaluation of reference areas to guide and improve the quality of ecological restoration and prevent the loss of important ecosystems (Fernandes et al., 2016b; Toma et al., 2023).
The need to incorporate ecosystem functions into restoration ecology studies, as these functions have been affected by land degradation and climate change, has been repeatedly stressed (Higgs et al., 2014; Kollmann et al., 2016; Gastauer et al., 2019). Soil provides many of these functions (Pereira et al., 2018), which justifies the relevance of soil quality for post-mining restoration (Ramos et al., 2022). In Rupestrian grasslands specifically, soil also plays a key role in driving the distribution of physiognomies because of its great variability, which influences the local vegetation structure (Schaefer et al., 2023). Furthermore, difficulties in managing and amending the soil, and in ensuring the survival of the species used during revegetation, can contribute to the less effective recovery of post-mine areas in Brazil (Guedes et al., 2021). Thus, greater importance must be given to the soil in the selection of reference areas.
To properly identify the most appropriate reference environments in areas such as the IQ, and assist the construction of knowledge about reference sites in this important region, this study aimed to identify and describe the most relevant soil properties and vegetation parameters to differentiate among potential reference sites within the Iron Quadrangle.

Study site
This study was performed in the city and surroundings of Nova Lima, located in the IQ geological province in the state of Minas Gerais. The region's climate was classified as humid subtropical (Cwa), according to the Köppen-Geiger classification system, with an average annual rainfall of 1390 mm and an average temperature of 21 °C.
Four different areas were evaluated during the study and they were named according to the vegetative cover present: ferruginous rupestrian grassland with small shrub vegetation in flat relief (FRG-S); ferruginous rupestrian grassland with dense shrub vegetation and steeper relief (FRG-D); quartzitic rupestrian grassland with the predominance of grasses in steep relief (QRG); and Atlantic Forest (AF) with dense and arboreal vegetation (Figure 1). The AF area was part of the Nova Lima group (gneiss and granitoid rocks of the Bonfim Complex and the Quartzite Moeda) in the Rio das Velhas supergroup. The FRG-D site was located in the Cauê Formation of the Itabira group (itabirite metamorphic rock), and the QRG site was located in the Caraça group (quartzite); both groups were part of the Minas supergroup. The FRG-S covered an area of detritus-lateritic cover, which represented the most recent unit in the IQ. Soils were classified as Leptosols (Neossolo Litólico distrófico) in the FRG-D and QRG, Petroplinthic Plinthosols (Plintossolo Pétrico concrecionário) in the FRG-S, and Cambisols (Cambissolo Háplico Tb distrófico) in the AF, following the WRB/FAO classification system. The areas ranged in altitude from 1,277 to 1,482 m, and the FRG-S had flat relief, while the other three areas had moderate-to-steep relief (2-10 % slopes). All four sites included native soil and vegetation that had not previously been altered by anthropogenic processes and therefore represented suitable reference areas to guide local restoration/rehabilitation efforts.

Soil samples and vegetation analysis
Four plots of 100 m² were delimited in each area. A composite sample was collected in each plot by homogenizing four simple samples that were collected randomly. Due to shallow soil depths in the FRG-S and QRG areas, a single composite sample (0.000-0.025 m deep) was collected per plot. In the other two areas (FRG-D and AF), separate composite samples were collected from the 0.000-0.025 and 0.025-0.100 m depths. All samples were air-dried, macerated in a porcelain mortar, and sieved through a 2-mm sieve. The chemical, physical, and biological soil properties analyzed and the vegetation parameters are summarized in table 1.
Due to the challenges of collecting volumetric rings in shallow and stony soils, we determined bulk density (Bd) using a modified version of the beaker method (Teixeira et al., 2017a), normally used for sandy soils, in which we determined the mass of oven-dried soil (24 h at 105 °C) necessary to completely fill a 10 cm³ cylinder. We obtained particle density (Pd) using the volumetric flask method (Viana et al., 2017). Particle size analysis was performed using the pipette method (Donagemma et al., 2017), where coarse sand is formed by particles between 2 and 0.2 mm, fine sand by particles between 0.2 and 0.053 mm, silt by particles between 0.053 and 0.002 mm, and clay by particles smaller than 0.002 mm. Total porosity (ϕ) was calculated from the measured bulk and particle densities (Almeida et al., 2017). Gravimetric water contents at field capacity (FC) and wilting point (WP) were measured using a positive pressure system set to respective tensions of 10 and 1,500 kPa (Teixeira and Behring, 2017). The available soil water content (WC) was determined by subtracting WP from FC. Total organic carbon content (TOC) was obtained by dry combustion (Shimadzu SSM 5000-A COT-L Analyzer), following the procedure of Carmo and Silva (2012). Soil fertility analyses were performed using air-dried fine soils. The pH was determined by the potentiometric method with a soil:water ratio of 1:2.5 (Teixeira et al., 2017b). Potential acidity (H+Al) was determined by extraction with 0.5 mol L⁻¹ calcium acetate buffered at pH 7.0, followed by titration (Campos et al., 2017a). Exchangeable contents of multivalent cations were obtained using a 1 mol L⁻¹ KCl extractant solution that was analyzed by titration (Al³⁺) and atomic absorption spectroscopy (Ca²⁺ and Mg²⁺), following Teixeira et al. (2017a). Available potassium (K⁺) and phosphorus (P) were extracted with a Mehlich-1 solution, with K⁺ concentration then determined by flame spectrophotometry and P concentration determined by UV-Vis spectrophotometry (Teixeira et al., 2017c). The remaining phosphorus content (P-rem) was measured using a 1:10 ratio of air-dried fine soil to a 0.010 mol L⁻¹ CaCl₂ solution containing 60 mg L⁻¹ of P (Alvarez V. et al., 2017).
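The two derived physical quantities in the paragraph above can be illustrated with a minimal sketch. It assumes the standard relation ϕ = 1 − Bd/Pd for total porosity (the text only states that ϕ was calculated from Bd and Pd) and the stated difference FC − WP for available water. The function names and the example values are hypothetical.

```python
def total_porosity(bulk_density, particle_density):
    """Total porosity (m3 m-3), assuming phi = 1 - Bd / Pd.

    Both densities must be in the same units (e.g. Mg m-3).
    """
    if particle_density <= 0 or not 0 <= bulk_density <= particle_density:
        raise ValueError("expected 0 <= Bd <= Pd and Pd > 0")
    return 1.0 - bulk_density / particle_density


def available_water(field_capacity, wilting_point):
    """Available water content (WC): FC minus WP, both gravimetric (kg kg-1)."""
    return field_capacity - wilting_point


# Hypothetical plot: Bd = 1.3 and Pd = 2.6 Mg m-3, FC = 0.30 and WP = 0.12 kg kg-1
phi = total_porosity(1.3, 2.6)
wc = available_water(0.30, 0.12)
```
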
Soil samples were again macerated and passed through a 100-mesh sieve for the analysis of the total contents of metals and semi-metals. Total contents of As, Co, Fe, Mn, Ni, and Pb were obtained by semi-open acid digestion with a mixture of hydrochloric and nitric acids (HCl:HNO₃, 3:1, v v⁻¹), according to the German standard DIN 38414-S7; the procedure was performed in triplicate. The elemental quantification was carried out on the extracts using inductively coupled plasma optical emission spectroscopy (ICP-OES). The certified SS-1 EnviroMAT™ Contaminated Soil sample from SCP Science was used for each digestion. The detection and quantification limits for each element evaluated were calculated as described in the Eurachem laboratory guide (Magnusson and Örnemark, 2014).
Microbial biomass carbon (MBC) and microbial biomass nitrogen (MBN) were extracted by the irradiation-extraction method (Ferreira et al., 1999) and determined by titration according to Tedesco et al. (1985). Accumulated C-CO₂ content was evaluated by a respiration test [adapted from Stotzky (1965)] conducted under controlled conditions (25 ± 1 °C, in the dark). Soil moisture was adjusted to approximately 70 % of FC, and then 50 g of soil were incubated for 21 days in 600 mL glass jars with screw-on tops containing a central septum. The gases produced were collected with 60 mL syringes at the beginning of the test and then 12, 24, 48, 96, 192, 288, 384, and 504 h later. After each collection, the jars were opened, ventilated, and left open for 15 min. The accumulated CO₂ concentration was then measured using cavity ring-down spectroscopy (CRDS; G2121-i, Picarro, Santa Clara CA USA 95054). The carbon associated with accumulated CO₂ (i.e., C-CO₂ accumulated) was calculated as the sum of the C-CO₂ contents (mg kg⁻¹ of soil) obtained in each period. Basal soil respiration (BSR) (mg kg⁻¹ h⁻¹) was determined using equation 1.
$$\mathrm{BSR} = \frac{\text{C-CO}_{2\ \text{accumulated}}}{t} \qquad \text{(Eq. 1)}$$
in which: t is the total test duration (h).
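As a minimal sketch of this calculation, the snippet below sums the C-CO₂ released in each sampling period and divides by the total test duration. It assumes that BSR is the accumulated C-CO₂ divided by t, which is what the stated units (mg kg⁻¹ h⁻¹) imply. The function name and the per-period values are hypothetical.

```python
def basal_soil_respiration(c_co2_per_period, total_hours):
    """Basal soil respiration (mg kg-1 h-1).

    c_co2_per_period: C-CO2 released in each sampling period (mg kg-1 of soil).
    total_hours: total duration of the respiration test (h).
    """
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    accumulated = sum(c_co2_per_period)  # C-CO2 accumulated over the test
    return accumulated / total_hours


# Hypothetical C-CO2 values for the eight periods of a 504 h (21 day) test
periods = [4.0, 5.0, 6.5, 8.0, 9.0, 7.0, 6.0, 4.9]
bsr = basal_soil_respiration(periods, total_hours=504)
```
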
Aerial images were captured using a DJI Phantom 4 Professional drone (quadcopter) coupled with a 4K full resolution (5472 × 3648 pixel) camera to obtain vegetation data in a low-cost and straightforward way to monitor recovery areas effectively. The georeferenced images were processed in Agisoft Metashape® to obtain orthorectified images of the area.
Images of each plot were cropped from the larger image using the ArcGis® software. The ground cover percentage (COV) of each plot was determined from Modified Photochemical Reflectance Index (MPRI) images, computed from the red and green (RG) bands, using ArcGis®. The digital terrain and surface models were generated in the Agisoft Metashape® software, and the two models were then differenced from one another to determine vegetation volume (VOL). This value only approximates the vegetation size since it does not consider empty spaces in the vegetation (e.g., voids in the canopy).
Rev Bras Cienc Solo 2023;47:e0230014
Determining the vegetation volume does not provide any information about the existing biodiversity, but it can be an interesting parameter to assess the initial stage of recovery and represents an advance in relation to the simple determination of soil cover.
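The two image-derived parameters can be sketched on raster arrays as follows. This is an illustration only, not the ArcGis®/Metashape® workflow used in the study. It assumes the commonly cited form MPRI = (G − R)/(G + R), a hypothetical MPRI threshold for classifying a pixel as vegetated, a known pixel area, and that negative height differences are treated as zero.

```python
import numpy as np


def vegetation_volume(dsm, dtm, pixel_area_m2):
    """Approximate vegetation volume (m3): surface model minus terrain model.

    Negative differences are clipped to zero (an assumption of this sketch).
    """
    height = np.clip(np.asarray(dsm, float) - np.asarray(dtm, float), 0.0, None)
    return float(height.sum() * pixel_area_m2)


def mpri(red, green):
    """MPRI per pixel, assuming MPRI = (G - R) / (G + R)."""
    red = np.asarray(red, float)
    green = np.asarray(green, float)
    total = green + red
    out = np.zeros_like(total)
    np.divide(green - red, total, out=out, where=total != 0)
    return out


def ground_cover_percent(red, green, threshold=0.0):
    """Share of pixels (%) whose MPRI exceeds a hypothetical threshold."""
    return float(100.0 * np.mean(mpri(red, green) > threshold))


# Tiny hypothetical rasters (elevations in m, 0.25 m2 per pixel)
dsm = [[2.0, 3.0], [1.0, 1.0]]
dtm = [[1.0, 1.0], [1.0, 2.0]]
vol = vegetation_volume(dsm, dtm, pixel_area_m2=0.25)
cov = ground_cover_percent(red=[[50, 100]], green=[[150, 100]])
```

As the text notes, a volume obtained this way ignores voids in the canopy, so it is best read as an upper-bound proxy for vegetation size.
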

Statistical analyses
We used a Principal Component Analysis (PCA) to evaluate how soil properties and vegetation parameters were distributed among the four reference areas. This analysis was used to support the selection of the most relevant properties that distinguish between these areas. First, since plots in the FRG-S and QRG areas had only a single (0.000-0.025 m) layer whereas the FRG-D and AF areas had two depths (0.000-0.025 and 0.025-0.100 m), in the latter two areas we used a weighted approach to generate mean values for each plot. Properties evaluated in the 0.000-0.025 m layer were weighted by 0.25 (25 %), and properties in the 0.025-0.100 m layer were weighted by 0.75 (75 %), in proportion to layer thickness. Second, we performed a Pearson's correlation analysis to evaluate correlations between variables, which is a necessary step when performing PCA (Figueiredo Filho and Silva Júnior, 2010). Third, we standardized the variables by their means and standard deviations (resulting in a mean of 0 and a standard deviation of 1 for all variables) so that scale differences did not influence the principal components (Jolliffe and Cadima, 2016). Then we used permutational multivariate analysis of variance (PERMANOVA) to test for significant differences between properties of the different reference areas (α = 0.05). PERMANOVA is an adaptation of ANOVA that operates on a matrix of distances, here calculated by the Jaccard method, and the associated Non-Metric Multidimensional Scaling (nMDS) ordination reports stress values. Statistical analyses were performed in the RStudio integrated development environment of the R software (R Development Core Team, 2022), using the Hmisc, permute, lattice, vegan and factoextra packages.
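The first and third preprocessing steps can be sketched as below. The study itself used R; this Python version is only an illustration. It assumes that the 0.25 and 0.75 weights come from layer thickness relative to the 0.100 m profile and that standardization is the usual z-score. The function names and example values are hypothetical.

```python
import numpy as np


def depth_weighted_mean(values, top_depths, bottom_depths):
    """Thickness-weighted mean of a property measured in several soil layers."""
    values = np.asarray(values, float)
    thickness = np.asarray(bottom_depths, float) - np.asarray(top_depths, float)
    weights = thickness / thickness.sum()  # e.g. 0.25 and 0.75 for the two layers
    return float(np.dot(weights, values))


def standardize(X):
    """Z-score each column: subtract its mean, divide by its sample SD."""
    X = np.asarray(X, float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


# Hypothetical clay contents (kg kg-1) in the 0.000-0.025 and 0.025-0.100 m layers
clay = depth_weighted_mean([0.40, 0.48], [0.000, 0.025], [0.025, 0.100])

# Hypothetical matrix: rows are plots, columns are two properties on different scales
Z = standardize([[1.2, 300.0], [1.4, 280.0], [1.1, 350.0], [1.5, 310.0]])
```
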

Selection of the most relevant properties
We used the component selection criteria proposed by Bhardwaj et al. (2011), and retained the principal components with the highest eigenvalues and the variables with the highest load values. We specifically selected principal components with eigenvalues >1 (Kaiser criterion), as that condition ensured that we only retained components that explained more of the total variance than a single original variable. Once we selected these principal components, we used only variables with loads ≥0.80.
The data were submitted to PCA in multiple cycles, using the variables selected in the previous cycle as the starting point for each new analysis. At each PCA cycle, any principal components and variables that met the elimination criteria (eigenvalue ≤1 or load <0.80) were excluded. The cycles were applied until no further exclusion of variables was possible.
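The cyclic selection can be sketched as follows. This is a simplified Python illustration rather than the R workflow used in the study. It assumes an unrotated PCA on the correlation matrix, loads defined as eigenvectors scaled by the square root of their eigenvalue, and a variable being kept when its largest absolute load on any retained component reaches the threshold. The function names are hypothetical.

```python
import numpy as np


def pca_loadings(X):
    """Eigenvalues (descending) and loads of a PCA on the correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Load = eigenvector * sqrt(eigenvalue): correlation of variable and component
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, loadings


def select_variables(X, names, min_eigenvalue=1.0, min_loading=0.80):
    """Repeat the PCA, dropping weakly loaded variables, until nothing changes."""
    X = np.asarray(X, float)
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        eigvals, loadings = pca_loadings(X[:, keep])
        retained = eigvals > min_eigenvalue  # Kaiser criterion
        if not retained.any():
            break
        strongest = np.abs(loadings[:, retained]).max(axis=1)
        survivors = [k for k, s in zip(keep, strongest) if s >= min_loading]
        if len(survivors) == len(keep):  # no further exclusion possible
            break
        keep = survivors
    return [names[k] for k in keep]


# Hypothetical data: x1-x3 share one strong gradient, x4 is only weakly related
u = np.array([-1, -1, -1, -1, 1, 1, 1, 1], float)
w = np.array([-1, 1, -1, 1, -1, 1, -1, 1], float)
v = np.array([-1, -1, 1, 1, -1, -1, 1, 1], float)
X = np.column_stack([u + 0.1 * u * v, u - 0.1 * u * v, u + 0.1 * u * w, w + 0.3 * u])
selected = select_variables(X, ["x1", "x2", "x3", "x4"])
```

Constant columns would produce undefined correlations, so they should be removed before calling the function.
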

Assignment of weights
The contribution of each selected indicator was represented by its relative weight ($W_i$), which was calculated according to equation 2 (Borges, 2013).
in which: $R_{ij}$ is the load of property i on component j; $F_j$ is the eigenvalue of component j; i is the index of the property and j is the index of the retained component with eigenvalue >1. Note that the sum of all weights must be equal to 1, i.e., $\sum_{i} W_i = 1$.
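One weighting scheme that is consistent with the definitions above is sketched below. It is an assumption, not necessarily the exact form given by Borges (2013): each property receives a score equal to the sum over retained components of |R_ij| × F_j, and scores are then normalized so that the weights sum to 1. The function name and the numbers are hypothetical.

```python
import numpy as np


def indicator_weights(loadings, eigenvalues):
    """Relative weights from loads R_ij and eigenvalues F_j (assumed form).

    loadings: array of shape (n_properties, n_retained_components).
    eigenvalues: array of shape (n_retained_components,).
    """
    loadings = np.abs(np.asarray(loadings, float))
    eigenvalues = np.asarray(eigenvalues, float)
    scores = loadings @ eigenvalues  # sum_j |R_ij| * F_j for each property i
    return scores / scores.sum()     # normalize so that the weights sum to 1


# Hypothetical loads of two properties on two retained components
weights = indicator_weights([[0.9, 0.1], [0.3, 0.8]], [2.0, 1.0])
```
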

RESULTS
The final PCA resulted in the selection of only two principal components (PC1 and PC2), which together explained 91.8 % of the total variance, exceeding the minimum recommended value of 70 % (Jolliffe and Cadima, 2016). The selected variables included many physical and chemical properties of the soil and one vegetation parameter, but no biological soil properties (Figure 2). The following properties (in descending order) were most relevant to PC1: Ni, Al, Clay, VOL, m, Bd, Pd, and As, whereas the following properties were more important for PC2: Zn, Pb, FS+S, and FS. The first component axis (PC1) provided most of the differences between the AF versus FRG-S and FRG-D areas but had little influence on QRG. The second component axis (PC2) better differentiated between the three rupestrian grassland areas. Soil texture was an important property to both components: clay was one of the properties with highest loads in PC1, whereas parameters associated with the fine sand fraction (FS and FS+S) had the highest loads of all attributes in PC2 (Table 2). However, certain soil chemical properties also had high loads, including Pb, Zn, Ni, and Al, and vegetation volume also had a strong load in PC1. These results show the importance of measuring multiple parameters when differentiating between areas.
The selected physical properties (FS, FS+S, Clay, Bd, and Pd) had greater variation in the ferruginous rupestrian grassland soils, especially in the FRG-S, than in the Atlantic Forest area (Figure 3). The QRG area had the highest FS+S content of all areas, whereas the Atlantic Forest soils had the highest clay contents. The FRG-S and FRG-D areas had the highest values of Bd and Pd, in line with their more hematitic mineralogy [iron being denser than quartz (Enkin et al., 2020)]. The QRG area had low metal and semi-metal content, and no Zn was detected. Arsenic, Ni, and Zn contents were higher in AF compared to other areas. The AF area also had the highest values for exchangeable aluminum content (Al³⁺) and aluminum saturation (m), whereas there were no significant differences between FRG-S and FRG-D for these parameters. However, the contents of other metals and semi-metals were higher in FRG-S than in FRG-D. The highest total Pb content was found in FRG-S, followed by FRG-D. Nickel contents were low in all areas, except for the AF. Finally, vegetation volume (VOL) was the only vegetation parameter that showed significant variation among the evaluated areas. As expected, the AF had the highest mean value, followed by the FRG-D, FRG-S, and QRG.
The weights calculated for the selected properties were widely distributed, with a gradual reduction, suggesting that these properties had similar importance for differentiating between areas (Figure 4a). Nickel and Al³⁺ had the highest weights, with $W_i$ = 0.104 for both. When grouped by parameter type, soil chemical properties contributed the most to distinguishing between areas (50.9 % contribution), followed by physical attributes (38.9 % contribution) and, finally, by the vegetation volume (10.2 % contribution; Figure 4b).
The PERMANOVA showed that the selected soil and vegetation properties could identify significant differences between all areas (Jaccard, p < 0.001, R² = 0.91). The areas FRG-S, FRG-D, and QRG all had relatively similar properties to each other, while the AF area was very distinct from the others (Figure 5).
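For readers unfamiliar with the test, the logic of a one-way PERMANOVA can be sketched in a few lines. This is not the vegan implementation used in the study: the sketch swaps in squared Euclidean distances instead of the Jaccard method, uses a plain label permutation, and all names and data are hypothetical.

```python
import numpy as np


def pseudo_f(dist_sq, groups):
    """Pseudo-F and R2 from a matrix of squared distances and group labels."""
    n = len(groups)
    labels = np.unique(groups)
    ss_total = dist_sq[np.triu_indices(n, k=1)].sum() / n
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        sub = dist_sq[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    f = (ss_among / (a - 1)) / (ss_within / (n - a))
    return f, ss_among / ss_total


def permanova(X, groups, n_perm=999, seed=0):
    """One-way PERMANOVA on squared Euclidean distances (assumed metric)."""
    X = np.asarray(X, float)
    groups = np.asarray(groups)
    diff = X[:, None, :] - X[None, :, :]
    dist_sq = (diff ** 2).sum(axis=2)
    f_obs, r2 = pseudo_f(dist_sq, groups)
    rng = np.random.default_rng(seed)
    # Count permutations whose pseudo-F is at least as large as the observed one
    hits = sum(
        pseudo_f(dist_sq, rng.permutation(groups))[0] >= f_obs
        for _ in range(n_perm)
    )
    return f_obs, r2, float((hits + 1) / (n_perm + 1))


# Hypothetical data: two areas with four plots each, clearly separated
X = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
areas = ["A", "A", "A", "A", "B", "B", "B", "B"]
f_obs, r2, p_value = permanova(X, areas)
```

With only four plots per area, the number of distinct permutations is small, which limits how low the p-value can go in a two-group comparison.
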
The complexity of the ferruginous rupestrian grassland areas can be observed in the greater variability between the different plots evaluated within each area, represented by the larger polygons. The Atlantic Forest area, by contrast, had the most homogeneous properties.

Selecting attributes to differentiate between reference areas
In this study, we measured 36 different soil- and vegetation-related indicators in four reference sites in the IQ region of Brazil, with the goal of determining which characteristics were most useful for differentiating between these areas. We focused primarily on soil properties, since losses in soil quality can impact the surrounding ecosystem (Doran and Parkin, 1994).
Our analysis revealed that twelve indicators were most important for detecting differences between sites: fine sand, fine sand plus silt, clay, bulk density, particle density, Pb content, Ni content, As content, Zn content, exchangeable aluminum content, aluminum saturation, and vegetation volume. In total, there were five physical indicators, six chemical indicators, and one vegetation indicator. Interestingly, no soil biological properties were detected as being meaningful for selecting reference sites. We speculate that this result may be due to the frequent burning that occurs in our study areas (Kolbek and Alves, 2008), which likely impacts soil biological activity and consumes soil organic matter.

Soil chemical properties
Soil chemical properties proved to be the most relevant for distinguishing between reference areas, possibly due to differences in: (i) fertility and organic matter content, and (ii) metals and semi-metals. The presence and concentrations of metals and semi-metals vary widely in the region, due to different parent materials (Costa, 2003). Nickel, Zn, and As contents present in AF soil may be linked to the occurrence of Algoma-type iron formations present in extensions belonging to the Rio das Velhas supergroup (Rossi, 2014). Nickel, As, Fe, Pb, and Zn are commonly associated with these formations (Branco, 1982). Likewise, the Pb and Zn contents in FRG-S and FRG-D are possibly related to the occurrence of ferruginous crusts (canga or concretions) of the detritus-lateritic cover of the itabirites of the Cauê Formation. The QRG soils have low or undetectable metal and semi-metal contents, as these elements are not commonly found in quartzite and phyllite-type rocks.
Chemical properties also clearly distinguished the ferruginous rupestrian grasslands from the AF, the latter of which had higher active and exchangeable acidity, along with higher aluminum saturation and exchangeable aluminum content. Forest-covered Cambisols in the IQ tend to have high acidity (Carvalho Filho et al., 2010; Coelho et al., 2017). The high Al³⁺ content combined with the high carbon content in this soil is possibly explained by the greater water availability and consequent greater weathering (Carvalho Filho et al., 2010). The highest aluminum saturation was found in AF, probably due to low soil natural fertility and greater cation exchange capacity. Denser and more voluminous vegetation generates residues that contribute more to the increase of soil organic matter, and often promotes greater absorption of basic cations from the soil; consequently, a high amount of Al³⁺ occupies the exchange sites in the organic particles (Jiang et al., 2018). Thus, aluminum saturation plays an influential role in characterizing the AF area.
Lower values of properties related to soil acidity in FRG-S and FRG-D are possibly justified by the mineralogical composition of the source material, which are ferruginous rocks with little silica and alumina (Schaefer et al., 2015). In addition, Coelho et al. (2017) also reported considerably lower levels of Al³⁺ in ferruginous rupestrian grasslands compared to forest soils. Quartzitic rupestrian grassland soils of the Caraça group are derived from quartzite and phyllite (Costa, 2003), which are rocks that consist mostly of sericite, kaolinite, and quartz. These minerals were a probable source of the higher Al³⁺ content in QRG than in FRG-S and FRG-D. The reduced cation exchange capacity observed in QRG, explained by the low carbon and clay contents, also may help to explain the higher aluminum saturation in the soil of this area compared to ferruginous soils.

Soil physical properties
Physical properties associated with soil granulometry were also essential to differentiate between reference areas. Clay content presented the third highest weight among the selected indicators, and the highest contents were measured in the AF (mean of 0.47 kg kg⁻¹). The variation in clay content was much smaller between the rupestrian grasslands (0.16 to 0.22 kg kg⁻¹), yet even minor changes in clay content can alter water dynamics, nutrient availability, and other physicochemical properties (Schnabel et al., 2013; Kome et al., 2019). Therefore, even though post-mining rehabilitation activities will likely not change clay content, measuring this property at the beginning of the recovery process can help to define the recovery trajectory and an appropriate reference area.
Beyond clay content, the sum of fine sand and silt contents (FS+S) is another essential indicator for identifying reference areas. The FS+S quantity encompasses the grain sizes most commonly found in post-mining environments, such as waste rock piles. At the same time, higher FS+S values are associated with greater susceptibility to soil erosion (Sun et al., 2021). In the ferruginous rupestrian grassland areas, the FS+S contents exceed those of clay. These mineral particles carry the same chemical characteristics as the canga or itabirite from which they originated (Varajão et al., 2009; Coelho et al., 2017). The coarser particle sizes observed in rupestrian grassland soils (with relatively high silt contents) are associated with erosive processes that occur in young soils combined with rugged relief, mild temperatures, and high resistance related to the chemical composition of the minerals present (Carvalho Filho, 2008; Leonardi, 2014). Thus, in QRG, belonging to the Minas Supergroup, the highest contents of FS+S are possibly related to the source material consisting essentially of quartzites with intercalated phyllite layers (Salgado et al., 2007). It is noteworthy that young soils in the IQ are fundamental for landscape formation from erosion processes (Coelho et al., 2017).
Bulk density was the sixth most important property. The coarser texture of the soils can explain the high bulk density values observed in the FRG-S and FRG-D, since the aggregation of particles is small, resulting in a lower porosity (Lal and Shukla, 2004), and because the mineralogy is rich in Fe, which generally has a density >3.0 Mg m⁻³ (Carvalho Filho, 2008). We emphasize that the bulk density values were high even when using air-dried, sieved, and repacked soil. Lower particle density values are likely to be related to the contribution of non-ferrous materials, such as those found in QRG and AF. The AF soils also may have had lower particle densities because of higher organic matter content.

Vegetation parameters
Our analysis showed that vegetation volume was the only vegetation parameter with a significant influence in distinguishing the areas, with the fourth-largest weight of all indicators. In terms of absolute values, the AF area had much greater vegetation volume than the other areas. This fact was expected because it is an area with dense arboreal vegetation. This physiognomy is important as a reference since post-mining areas usually have a deep substrate (Technosols), on which dense arboreal vegetation can be established. In addition, having more roots can contribute to better soil structuring, increasing its porosity (Cunha Neto et al., 2018). Thus, the large vegetation volume in AF is related to a deep and well-structured soil with higher organic matter content, total porosity, and clayey granulometry.
This result suggests that, on the one hand, vegetation volume can be a low-cost, easy-to-measure parameter to identify target reference areas. On the other hand, it can take many years or decades for vegetation to recover in nutrient-poor soils such as those in Brazil. Therefore, vegetation volumes in reference areas may not be informative for monitoring short-to-intermediate term progress during rehabilitation. Instead, we can use our analysis to identify soil-related characteristics that may influence the ability of different areas to support healthy vegetation. As one example, the organic carbon content in these environments can affect vegetation development both directly and indirectly, via benefits such as faster germination, additional root absorption, better soil structure, higher cation exchange capacity, and greater water retention (Selle, 2007).
As another example, vegetation recovery can be expected to be limited in soils with low nutrient availability, rocky outcrops, or thin substrates, such as areas in the ferruginous rupestrian grasslands with lateritic cangas close to the surface (Messias et al., 2012). In our study the FRG-D area had less clay and greater rockiness than the FRG-S, yet had more voluminous vegetation. This higher volume probably occurs because vegetation is also influenced by the effective soil depth and morphological aspects (Pereira, 2010; Messias et al., 2012), mainly through their positive or negative effects on the root development of native plants. Furthermore, the gravel and pebbles from petroplinthic nodules, and their arrangement in the soil profile, dramatically influence the water dynamics at the site and can be decisive for the development of denser and more voluminous vegetation (Coelho et al., 2017). Similarly, the QRG soil has the lowest TOC, MBC, and MBN, low porosity and CEC, and the highest fine sand and silt contents. These attributes indicate inefficient cycling, weak soil structure, and low nutritional potential, which is consistent with the smaller vegetation.

A transferable framework to identify appropriate reference areas
Our analysis showed that PCA and PERMANOVA were efficient statistical tools to differentiate between reference areas based on soil properties and vegetation parameters. The PCA was particularly useful in reducing the number of evaluated properties and eliminating redundant variables (Masto et al., 2007). In our study, the first two principal components together explained more than 90 % of the variation in the data. Likewise, the PERMANOVA showed, from the selected variables, the homogeneity of the areas (the smaller the area of the polygon in figure 5, the greater the homogeneity) and whether the differences between them were statistically significant (the areas do not differ from each other if their polygons overlap). Thus, our approach proved to be a parsimonious solution capable of explaining much of the variation in the data (Figueiredo Filho and Silva Júnior, 2010).
The PCA also allowed us to better understand variation within and between these reference areas. For example, the FRG-S and FRG-D have previously been considered to represent the same phytophysiognomy, despite having different vegetation densities. As a result, there has not previously been a way to appropriately distinguish between them when selecting reference areas. Our analysis showed that these two areas have different characteristics (based on the PERMANOVA results), meaning that selecting one or the other as a reference can lead to different interpretations about restoration or rehabilitation processes. Our analysis also revealed that the AF region was very distinct from the rupestrian grassland areas. Beyond the expected differences in vegetation volumes, we also determined that the AF area had different clay, As, Ni, and Al³⁺ contents and aluminum saturation (m), and lower Bd and Pd.
These methods can be extended beyond the current study and be used in other post-mining reclamation areas. The selection of the best indicators for monitoring the reclamation process and selecting the reference area is specific to each environment or ecosystem studied. We recommend choosing possible reference areas for each ecosystem to be restored and starting monitoring with as many variables as possible, necessarily including soil chemical, physical and biological indicators, and vegetation parameters. After the first year of analyses, monitoring can be continued using only the selected indicators and parameters.

CONCLUSIONS
Our study compared 36 characteristics from four possible reference areas in the Iron Quadrangle, Brazil: an Atlantic Forest site and three rupestrian grassland areas. We found that 12 indicators were most useful for separating the areas. The indicators selected as most important were, in descending order, Ni content, exchangeable Al content, clay, vegetation volume, aluminum saturation, bulk density, particle density, As content, Zn content, Pb content, fine sand + silt, and fine sand. Based on this list, soil chemical and physical properties were the most important for distinguishing between reference environments and, therefore, should be considered most important for defining reference areas for environmental monitoring during post-mining reclamation.
The primary focus was on soil because of its importance in affecting other environmental processes and ecosystem functions.At the same time, soil properties at the conclusion of land reclamation activities strongly influence the subsequent recovery process.These factors suggest focusing on soil-based measurements for both characterizing reference areas and monitoring post-mining reclamation.However, future studies may draw more robust conclusions by considering other vegetation parameters, including species endemism and phytosociology.

Figure 1. Location of the Iron Quadrangle in Minas Gerais (a) and location of the studied areas (b) where iron mines are concentrated: ferruginous rupestrian grassland with small shrub vegetation at the altitude of 1,482 m (in the image is FRG-S), ferruginous rupestrian grassland with dense shrub vegetation at the altitude of 1,277 m (in the image is FRG-D), quartzitic rupestrian grassland at the altitude of 1,459 m (in the image is QRG), and Atlantic Forest at the altitude of 1,014 m (in the image is AF).

Figure 3. Boxplots of the properties identified as being important by principal component analysis for the four study areas: Atlantic Forest (AF); ferruginous rupestrian grassland with dense vegetation (FRG-D); ferruginous rupestrian grassland with small shrubs (FRG-S), and quartzite rupestrian grassland (QRG).

Figure 4. Weights of selected properties from the principal component analysis (PCA) (a), and contribution of selected chemical, physical, and vegetation properties (b) within the entire parameter set used to distinguish the areas.

Table 2. Loads, eigenvalues and percentage of total and accumulated variances explained by the two principal components (PC1 and PC2) extracted from the principal component analysis (PCA) of soil and vegetation properties
FS: fine sand; FS+S: fine sand + silt; Al³⁺: exchangeable aluminum; As: arsenic total content; Bd: bulk density; Pd: particle density; m: aluminum saturation; Ni: semi-total nickel content; Pb: lead total content; VOL: vegetation volume; Zn: zinc total content.