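Abstract (translation):
In climate studies, detecting spatial patterns that deviate strongly from the sample mean remains a statistical challenge. Although principal component analysis (PCA), or equivalently empirical orthogonal function (EOF) decomposition, is often used for this purpose, it gives meaningful results only when the underlying multivariate distribution is Gaussian. Indeed, PCA is based on optimizing second-order moment quantities, and the covariance matrix captures the full dependence structure only for multivariate Gaussian vectors. Whenever the application at hand cannot satisfy this normality assumption (for example, precipitation data), alternatives and/or improvements to PCA must be developed and studied. To go beyond the second-order statistics constraint that limits the applicability of PCA, we use the cumulant function, which yields higher-order moment information. This cumulant function, well known in the statistical literature, lets us propose a new, simple, and fast method for identifying spatial patterns in non-Gaussian data. Our algorithm consists of maximizing the cumulant function. To illustrate the approach, explicit computations are carried out on three families of multivariate random vectors. In addition, we show that our algorithm corresponds to selecting the directions along which the projected data display the largest spread over the tails of the marginal probability density.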
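The remark that the cumulant function carries higher-order moment information can be made concrete with its standard series expansion. The notation below ($K$, $\theta$, $\mu$, $\Sigma$, $\kappa_{ijk}$) is assumed here for illustration and is not taken from the abstract:

```latex
% Cumulant (generating) function of a random vector X with mean \mu and covariance \Sigma
K(\theta) \;=\; \log \mathbb{E}\!\left[ e^{\theta^{\top} X} \right]
        \;=\; \theta^{\top}\mu
        \;+\; \tfrac{1}{2}\,\theta^{\top}\Sigma\,\theta
        \;+\; \tfrac{1}{3!}\sum_{i,j,k} \kappa_{ijk}\,\theta_i\theta_j\theta_k
        \;+\; \cdots

% For Gaussian X, every cumulant of order three and above vanishes, so
K_{\text{Gauss}}(\theta) \;=\; \theta^{\top}\mu + \tfrac{1}{2}\,\theta^{\top}\Sigma\,\theta .
```

For centered Gaussian data, maximizing $K$ over directions of fixed length $\lVert\theta\rVert = s$ therefore reduces to maximizing $\theta^{\top}\Sigma\,\theta$, which gives the leading PCA direction. The third- and higher-order terms are what distinguish the two criteria on non-Gaussian data.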
---
English title:
Detecting spatial patterns with the cumulant function. Part I: The theory
---
Authors:
Alberto Bernacchia, Philippe Naveau
---
Year of latest submission:
2007
---
Classification:
Primary category: Mathematics
Secondary category: Statistics Theory
Category description: Applied, computational and theoretical statistics: e.g. statistical inference, regression, time series, multivariate analysis, data analysis, Markov chain Monte Carlo, design of experiments, case studies
--
Primary category: Statistics
Secondary category: Statistics Theory
Category description: stat.TH is an alias for math.ST. Asymptotics, Bayesian Inference, Decision Theory, Estimation, Foundations, Inference, Testing.
--
---
English abstract:
In climate studies, detecting spatial patterns that largely deviate from the sample mean still remains a statistical challenge. Although a Principal Component Analysis (PCA), or equivalently an Empirical Orthogonal Functions (EOF) decomposition, is often applied for this purpose, it can only provide meaningful results if the underlying multivariate distribution is Gaussian. Indeed, PCA is based on optimizing second-order moment quantities, and the covariance matrix can only capture the full dependence structure for multivariate Gaussian vectors. Whenever the application at hand cannot satisfy this normality hypothesis (e.g. precipitation data), alternatives and/or improvements to PCA have to be developed and studied. To go beyond this second-order statistics constraint that limits the applicability of PCA, we take advantage of the cumulant function, which can produce higher-order moment information. This cumulant function, well known in the statistical literature, allows us to propose a new, simple and fast procedure to identify spatial patterns for non-Gaussian data. Our algorithm consists in maximizing the cumulant function. To illustrate our approach, we apply it to three families of multivariate random vectors for which explicit computations are obtained. In addition, we show that our algorithm corresponds to selecting the directions along which projected data display the largest spread over the marginal probability density tails.
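
The idea of "maximizing the cumulant function" can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the empirical cumulant function of centered data, a fixed-length constraint `|theta| = s`, and plain projected gradient ascent. The function names, the step size `lr`, and the choice of `s` are all hypothetical.

```python
import numpy as np


def empirical_cumulant(X, theta):
    """Empirical cumulant function: log of the sample mean of exp(theta . (x - mean))."""
    z = (X - X.mean(axis=0)) @ theta
    m = z.max()  # shift by the maximum for numerical stability (log-sum-exp trick)
    return m + np.log(np.mean(np.exp(z - m)))


def max_cumulant_direction(X, s=1.0, n_iter=500, lr=0.1, seed=0):
    """Projected gradient ascent on the sphere |theta| = s (an assumed constraint)."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    theta = rng.normal(size=X.shape[1])
    theta *= s / np.linalg.norm(theta)
    for _ in range(n_iter):
        z = Xc @ theta
        w = np.exp(z - z.max())
        w /= w.sum()                        # exponential tilting weights
        grad = w @ Xc                       # gradient of the empirical cumulant function
        theta = theta + lr * grad           # ascent step
        theta *= s / np.linalg.norm(theta)  # project back onto the sphere
    return theta


# Toy data: both coordinates have unit variance, so PCA has no preferred
# direction, but only the first coordinate is skewed (exponential).
rng = np.random.default_rng(1)
X = np.column_stack([rng.exponential(1.0, 20000), rng.normal(0.0, 1.0, 20000)])
theta = max_cumulant_direction(X, s=0.4)
print(theta / np.linalg.norm(theta))
```

In this toy setup the covariance is close to isotropic, so a second-order criterion cannot separate the two axes. The cumulant criterion can, because the exponential coordinate has a heavier right tail than the Gaussian one.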
---
PDF link:
https://arxiv.org/pdf/707.0574