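Abstract translation:
We show that the Kullback-Leibler distance is a good measure of the statistical uncertainty of correlation matrices estimated from a finite data set. For correlation matrices of multivariate Gaussian variables, we analytically determine the expected value of the Kullback-Leibler distance between a sample correlation matrix and a reference model, and we show that this expected value is known even when the specific model is unknown. We propose using the Kullback-Leibler distance to estimate the information extracted from a correlation matrix by correlation filtering procedures. We also show how to use this distance to measure the stability of filtering procedures with respect to statistical uncertainty. We illustrate the effectiveness of the method by comparing four filtering procedures, two based on spectral analysis and two based on hierarchical clustering. We compare these techniques as applied to both simulations of factor models and empirical data. We investigate the ability of these filtering procedures to recover the model correlation matrix from simulations. We discuss this ability in terms of both the heterogeneity of the model parameters and the length of the data series. We also show that the two spectral techniques are typically more informative about the sample correlation matrix than the techniques based on hierarchical clustering, whereas the latter are more stable with respect to statistical uncertainty.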
---
English title:
Kullback-Leibler distance as a measure of the information filtered from multivariate data
---
Authors:
Michele Tumminello, Fabrizio Lillo, Rosario Nunzio Mantegna
---
Year of latest submission:
2007
---
Classification:
Top-level category: Physics
Subcategory: Data Analysis, Statistics and Probability
Description: Methods, software and hardware for physics data analysis: data processing and storage; measurement methodology; statistical and mathematical aspects such as parametrization and uncertainties.
--
Top-level category: Physics
Subcategory: Physics and Society
Description: Structure, dynamics and collective behavior of societies and groups (human or otherwise). Quantitative analysis of social networks and other complex networks. Physics and engineering of infrastructure and systems of broad societal impact (e.g., energy grids, transportation networks).
--
Top-level category: Quantitative Finance
Subcategory: Statistical Finance
Description: Statistical, econometric and econophysics analyses with applications to financial markets and economic data
--
---
English abstract:
We show that the Kullback-Leibler distance is a good measure of the statistical uncertainty of correlation matrices estimated by using a finite set of data. For correlation matrices of multivariate Gaussian variables we analytically determine the expected values of the Kullback-Leibler distance of a sample correlation matrix from a reference model and we show that the expected values are known also when the specific model is unknown. We propose to make use of the Kullback-Leibler distance to estimate the information extracted from a correlation matrix by correlation filtering procedures. We also show how to use this distance to measure the stability of filtering procedures with respect to statistical uncertainty. We explain the effectiveness of our method by comparing four filtering procedures, two of them being based on spectral analysis and the other two on hierarchical clustering. We compare these techniques as applied both to simulations of factor models and empirical data. We investigate the ability of these filtering procedures in recovering the correlation matrix of models from simulations. We discuss such an ability in terms of both the heterogeneity of model parameters and the length of data series. We also show that the two spectral techniques are typically more informative about the sample correlation matrix than techniques based on hierarchical clustering, whereas the latter are more stable with respect to statistical uncertainty.
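
As a rough illustration of the quantity the abstract refers to, the sketch below computes the Kullback-Leibler distance between two zero-mean multivariate Gaussian distributions described by their correlation matrices, using the standard closed form K(Σ1, Σ2) = ½ [ln(det Σ2 / det Σ1) + tr(Σ2⁻¹ Σ1) − n]. The function name `kl_distance`, the 3×3 model matrix, the series length of 500, and the random seed are assumptions chosen for this example; they are not taken from the paper, and the paper's own filtering procedures are not reproduced here.

```python
import numpy as np


def kl_distance(sigma_1, sigma_2):
    """KL distance K(P1 || P2) between zero-mean Gaussians with
    correlation (or covariance) matrices sigma_1 and sigma_2."""
    n = sigma_1.shape[0]
    # slogdet returns (sign, log|det|); it is more stable than log(det(...)).
    _, logdet_1 = np.linalg.slogdet(sigma_1)
    _, logdet_2 = np.linalg.slogdet(sigma_2)
    # tr(sigma_2^{-1} sigma_1), computed without forming the inverse explicitly.
    trace_term = np.trace(np.linalg.solve(sigma_2, sigma_1))
    return 0.5 * (logdet_2 - logdet_1 + trace_term - n)


# Hypothetical model: three variables with a uniform correlation of 0.5.
model = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])

# Simulate a finite data series and estimate the sample correlation matrix.
rng = np.random.default_rng(0)
data = rng.multivariate_normal(np.zeros(3), model, size=500)
sample = np.corrcoef(data, rowvar=False)

# Distance of the sample correlation matrix from the model.
print(kl_distance(sample, model))
```

The distance is zero when the two matrices coincide and grows as the sample matrix departs from the model, so repeating the simulation with shorter series should, on average, give larger values.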
---
PDF link:
https://arxiv.org/pdf/0706.0168