Assembling spatial units into meaningful clusters is a challenging task, as it must cope with considerable computational complexity while controlling for the modifiable areal unit problem (MAUP), spatial autocorrelation and attribute multicollinearity. Nevertheless, these effects can reveal significant interactions among diverse spatial phenomena, such as segregation and economic specialization. Various regionalization methods have been developed to address these questions, but key fundamental properties of the aggregation of spatial entities are still poorly understood. In particular, for lack of an objective stopping rule, the question of determining an optimal number of clusters remains unresolved. We therefore develop a clustering algorithm that is sensitive to scale-dependent variations in multivariate spatial correlations, recalculating PCA scores at several aggregation steps in order to account for differences in the span of autocorrelation effects across variables. With these settings, the evolution across scales of correlation, compactness and isolation measures is compared between empirical data and 120 random datasets, using two dissimilarity measures. Remarkably, adjusting several indicators with real and simulated data allows for a clear definition of a stopping rule for spatial hierarchical clustering. Indeed, correlations that increase with scale in random datasets are spurious MAUP effects, so they can be discounted from the real-data results in order to identify an optimal clustering level, defined as the maximum of authentic spatial self-organization. This allows singling out the most socially distressed areas in Greater Santiago, thus providing relevant socio-spatial insights from their cartographic and statistical analysis. In sum, we develop a useful methodology to improve the fundamental understanding of spatial interdependence and multiscalar self-organizing phenomena, while linking these questions to relevant real-world issues.
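The stopping rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that one indicator (for example, a correlation measure) has already been computed at each aggregation level, both for the empirical data and for each random dataset, and that the spurious MAUP component is estimated as the mean of the random runs. The function name `optimal_level` and the toy numbers are hypothetical.

```python
import numpy as np


def optimal_level(empirical, random_runs):
    """Pick the aggregation level where the empirical indicator most
    exceeds the random baseline (hypothetical helper, not from the paper).

    empirical   : 1-D sequence, indicator value at each aggregation level
    random_runs : 2-D sequence, one row per random dataset, same levels
    """
    empirical = np.asarray(empirical, dtype=float)
    random_runs = np.asarray(random_runs, dtype=float)

    # Assumption: the spurious scale effect is approximated by the mean
    # indicator value over the random datasets at each level.
    baseline = random_runs.mean(axis=0)

    # Discount the spurious component from the empirical curve.
    excess = empirical - baseline

    # The candidate stopping point is the level with the largest excess.
    return int(np.argmax(excess)), excess


# Toy values only: five aggregation levels, two random datasets.
toy_empirical = [0.2, 0.4, 0.7, 0.75, 0.8]
toy_random = [
    [0.1, 0.2, 0.3, 0.5, 0.7],
    [0.1, 0.2, 0.3, 0.5, 0.7],
]
best_level, excess_curve = optimal_level(toy_empirical, toy_random)
print(best_level, excess_curve)
```

In this toy case both curves rise with aggregation, but the gap between them peaks at an intermediate level, which is the kind of maximum the abstract identifies as the optimal clustering level. The study itself combines several indicators and two dissimilarity measures; this sketch uses a single indicator for brevity.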
|Number of pages||11|
|Journal||Computers, Environment and Urban Systems|
|State||Published - 1 Mar 2016|
- Data mining
- Spatial clustering
- Stopping rule