TY - JOUR
T1 - A selective view of climatological data and likelihood estimation
AU - Blasi, Federico
AU - Caamaño-Carrillo, Christian
AU - Bevilacqua, Moreno
AU - Furrer, Reinhard
N1 - Publisher Copyright: © 2022 The Author(s)
PY - 2022/8
Y1 - 2022/8
AB - This article gives a narrative overview of what constitutes climatological data and their typical features, with a focus on aspects relevant to statistical modeling. We restrict the discussion to univariate spatial fields and focus on maximum likelihood estimation. To address the problem of enormous datasets, we study three common approximation schemes: tapering, direct misspecification, and composite likelihood for Gaussian and non-Gaussian distributions. We focus particularly on the so-called ‘sinh-arcsinh distribution’, obtained through a specific transformation of the Gaussian distribution. Because it has flexible marginal distributions – possibly skewed and/or heavy-tailed – it has a wide range of applications. One appealing property of the transformation involved is the existence of an explicit inverse transformation that makes likelihood-based methods straightforward. We describe a simulation study illustrating the effects of the different approximation schemes. To the best of our knowledge, a direct comparison of tapering, direct misspecification, and composite likelihood has never been made previously, and we show that direct misspecification is inferior. In some metrics, composite likelihood has a minor advantage over tapering. We use the estimation approaches to model a high-resolution global climate change field. All simulation code is available as a Docker container and is thus fully reproducible. Additionally, the present article describes where and how to get various climate datasets.
KW - CMIP6 data
KW - Composite likelihood
KW - Random field
KW - Sinh-arcsinh distribution
KW - Spatial process
KW - Tapering
UR - http://www.scopus.com/inward/record.url?scp=85124004258&partnerID=8YFLogxK
U2 - 10.1016/j.spasta.2022.100596
DO - 10.1016/j.spasta.2022.100596
M3 - Article
AN - SCOPUS:85124004258
SN - 2211-6753
VL - 50
JO - Spatial Statistics
JF - Spatial Statistics
M1 - 100596
ER -