Title:
How Much Data Do You Need? An Operational, Pre-Asymptotic Metric for Fat-tailedness
---
Author:
Nassim Nicholas Taleb
---
Year of latest submission:
2018
---
Abstract:
This note presents an operational measure of fat-tailedness for univariate probability distributions, taking values in $[0,1]$, where 0 is maximally thin-tailed (Gaussian) and 1 is maximally fat-tailed. Among other things, it 1) helps assess the comparative sample size $n$ needed for statistical significance, 2) allows practical comparisons across classes of fat-tailed distributions, and 3) helps explain some inconsistent attributes of the lognormal, depending on the parametrization of its scale parameter. The literature on asymptotic behavior is rich, but there is a large void for finite values of $n$, those needed for operational purposes. Conventional measures of fat-tailedness, namely 1) the tail index for the power law class and 2) kurtosis for finite-moment distributions, fail to apply to some distributions and do not allow comparisons across classes and parametrizations, that is, between power laws outside the Levy-Stable basin, between power laws and distributions in other classes, or between power laws for different numbers of summands. How can one compare a sum of 100 Student T distributed random variables with 3 degrees of freedom to one in a Levy-Stable or a Lognormal class? How can one compare a sum of 100 Student T with 3 degrees of freedom to a single Student T with 2 degrees of freedom? We propose an operational and heuristic measure that allows us to compare $n$-summed independent variables under all distributions with finite first moment. The method is based on the rate of convergence of the Law of Large Numbers for finite sums, $n$-summands specifically. We get either explicit expressions or simulation results and bounds for the lognormal, exponential, Pareto, and Student T distributions in their various calibrations, in addition to the general Pearson classes.
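The abstract describes the method only in words, so the following is a minimal illustrative sketch, not the paper's own code. It assumes one particular way of turning the convergence rate of $n$-sums into a number in $[0,1]$: take $M(n) = E|S_n - E[S_n]|$, the mean absolute deviation of a sum of $n$ i.i.d. draws, and set $\kappa(n_0, n) = 2 - (\log n - \log n_0)/\log(M(n)/M(n_0))$. Under this assumed definition a Gaussian gives a value near 0 because $M(n)$ grows like $\sqrt{n}$, and slower convergence pushes the value toward 1. The function names, the sampler interface, and the choice $n_0 = 1$, $n = 30$ are hypothetical choices made for this example.

```python
import numpy as np


def mean_abs_dev_of_sum(sampler, n, trials, rng):
    """Monte Carlo estimate of M(n) = E|S_n - E[S_n]| for sums of n i.i.d. draws."""
    sums = sampler(rng, (trials, n)).sum(axis=1)
    # The sample mean of the sums stands in for E[S_n].
    return np.mean(np.abs(sums - sums.mean()))


def kappa(sampler, n0, n, trials=100_000, seed=0):
    """Assumed metric: 2 - log(n / n0) / log(M(n) / M(n0)).

    This formula is an assumption of the sketch; the abstract does not state it.
    """
    rng = np.random.default_rng(seed)
    m0 = mean_abs_dev_of_sum(sampler, n0, trials, rng)
    m1 = mean_abs_dev_of_sum(sampler, n, trials, rng)
    return 2.0 - (np.log(n) - np.log(n0)) / np.log(m1 / m0)


def gaussian(rng, size):
    """Standard normal draws (thin-tailed benchmark)."""
    return rng.standard_normal(size)


def student_t3(rng, size):
    """Student T draws with 3 degrees of freedom (finite mean, fat tails)."""
    return rng.standard_t(3, size=size)


if __name__ == "__main__":
    print("Gaussian    kappa(1, 30):", kappa(gaussian, 1, 30))
    print("Student T(3) kappa(1, 30):", kappa(student_t3, 1, 30))
```

Because the estimate relies on simulation, it is only meaningful for distributions with a finite first moment, as the abstract requires, and it becomes noisy as the tails get fatter.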
---
Classification:
Primary category: Statistics
Subcategory: Methodology
Description: Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
--
Primary category: Quantitative Finance
Subcategory: Statistical Finance
Description: Statistical, econometric and econophysics analyses with applications to financial markets and economic data
--
---