Translated abstract:
Common high-dimensional prediction methods rely on either a sparse signal model, in which most parameters are zero and a small number of non-zero parameters are large in magnitude, or a dense signal model, in which there are no large parameters and very many small non-zero parameters. We consider a generalization of these two basic models, termed here the "sparse+dense" model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structure poses problems for traditional sparse estimators, such as the lasso, and for traditional dense estimators, such as ridge. We propose a new penalization-based method, called lava, which is computationally efficient. With suitable choices of the penalty parameters, the method strictly dominates both lasso and ridge. We derive analytic expressions for the finite-sample risk function of the lava estimator in the Gaussian sequence model, and we also give a deviation bound for the prediction risk in the Gaussian regression model with fixed design. In both cases we provide Stein's unbiased estimator of lava's prediction risk. A simulation example compares lava with lasso, ridge, and the elastic net in a regression setting with feasible, data-dependent penalty parameters, and illustrates lava's improved performance relative to these benchmarks.
---
English title:
A lava attack on the recovery of sums of dense and sparse signals
---
Authors:
Victor Chernozhukov, Christian Hansen, Yuan Liao
---
Year of latest submission:
2015
---
Classification:
Primary category: Statistics
Secondary category: Methodology
Category description: Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
--
Primary category: Computer Science
Secondary category: Information Theory
Category description: Covers theoretical and experimental aspects of information theory and coding. Includes material in ACM Subject Class E.4 and intersects with H.1.1.
--
Primary category: Economics
Secondary category: Econometrics
Category description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--
Primary category: Mathematics
Secondary category: Information Theory
Category description: math.IT is an alias for cs.IT. Covers theoretical and experimental aspects of information theory and coding.
--
---
English abstract:
Common high-dimensional methods for prediction rely on having either a sparse signal model, a model in which most parameters are zero and there are a small number of non-zero parameters that are large in magnitude, or a dense signal model, a model with no large parameters and very many small non-zero parameters. We consider a generalization of these two basic models, termed here a "sparse+dense" model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structure poses problems for traditional sparse estimators, such as the lasso, and for traditional dense estimation methods, such as ridge estimation. We propose a new penalization-based method, called lava, which is computationally efficient. With suitable choices of penalty parameters, the proposed method strictly dominates both lasso and ridge. We derive analytic expressions for the finite-sample risk function of the lava estimator in the Gaussian sequence model. We also provide a deviation bound for the prediction risk in the Gaussian regression model with fixed design. In both cases, we provide Stein's unbiased estimator for lava's prediction risk. A simulation example compares the performance of lava to lasso, ridge, and elastic net in a regression example using feasible, data-dependent penalty parameters and illustrates lava's improved performance relative to these benchmarks.
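As a rough illustration of the method described in the abstract, the following Python sketch (not the authors' code) fits a lava-type estimator by alternating minimization. It assumes the lava objective takes the form (1/n)*||y - X(beta + delta)||^2 + lam2*||beta||_2^2 + lam1*||delta||_1, with beta the dense, ridge-penalized component, delta the sparse, lasso-penalized component, and beta + delta the fitted signal; the function name lava_fit and the penalty values are illustrative placeholders, not the feasible data-dependent choices studied in the paper.

import numpy as np
from sklearn.linear_model import Lasso

def lava_fit(X, y, lam1=0.1, lam2=1.0, n_iter=50):
    # Block-coordinate descent on the (jointly convex) lava-type objective:
    #   (1/n) * ||y - X @ (beta + delta)||^2 + lam2 * ||beta||^2 + lam1 * ||delta||_1
    n, p = X.shape
    beta = np.zeros(p)   # dense component (ridge-penalized)
    delta = np.zeros(p)  # sparse component (lasso-penalized)
    ridge_mat = X.T @ X / n + lam2 * np.eye(p)
    # sklearn's Lasso minimizes (1/(2n))||r - X w||^2 + alpha*||w||_1,
    # so alpha = lam1 / 2 matches the scaling assumed above.
    lasso = Lasso(alpha=lam1 / 2, fit_intercept=False, max_iter=10_000)
    for _ in range(n_iter):
        # Ridge step: closed-form update of beta given the current delta.
        beta = np.linalg.solve(ridge_mat, X.T @ (y - X @ delta) / n)
        # Lasso step: update delta on the partial residual given beta.
        lasso.fit(X, y - X @ beta)
        delta = lasso.coef_
    return beta + delta, beta, delta

# Illustrative use on simulated "sparse + dense" data:
# rng = np.random.default_rng(0)
# X = rng.standard_normal((200, 50))
# theta = 0.1 * rng.standard_normal(50); theta[:2] += [3.0, -2.0]
# y = X @ theta + rng.standard_normal(200)
# theta_hat, beta_hat, delta_hat = lava_fit(X, y, lam1=0.1, lam2=0.5)

Because the assumed objective is jointly convex in (beta, delta), this alternating ridge/lasso scheme converges to a lava-type solution; tuning lam1 and lam2, e.g. by cross-validation, would stand in for the feasible, data-dependent penalty parameters used in the paper's simulation comparison.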
---
PDF link:
https://arxiv.org/pdf/1502.03155