English title:
Aggregation for Gaussian regression
---
Authors:
Florentina Bunea, Alexandre B. Tsybakov, Marten H. Wegkamp
---
Latest submission year:
2007
---
Classification:
Primary: Mathematics
Secondary: Statistics Theory
Description: Applied, computational and theoretical statistics: e.g. statistical inference, regression, time series, multivariate analysis, data analysis, Markov chain Monte Carlo, design of experiments, case studies
--
Primary: Statistics
Secondary: Statistics Theory
Description: stat.TH is an alias for math.ST. Asymptotics, Bayesian Inference, Decision Theory, Estimation, Foundations, Inference, Testing.
--
---
English abstract:
This paper studies statistical aggregation procedures in the regression setting. A motivating factor is the existence of many different methods of estimation, leading to possibly competing estimators. We consider here three different types of aggregation: model selection (MS) aggregation, convex (C) aggregation and linear (L) aggregation. The objective of (MS) is to select the optimal single estimator from the list; that of (C) is to select the optimal convex combination of the given estimators; and that of (L) is to select the optimal linear combination of the given estimators. We are interested in evaluating the rates of convergence of the excess risks of the estimators obtained by these procedures. Our approach is motivated by recently published minimax results [Nemirovski, A. (2000). Topics in non-parametric statistics. Lectures on Probability Theory and Statistics (Saint-Flour, 1998). Lecture Notes in Math. 1738 85--277. Springer, Berlin; Tsybakov, A. B. (2003). Optimal rates of aggregation. Learning Theory and Kernel Machines. Lecture Notes in Artificial Intelligence 2777 303--313. Springer, Heidelberg]. There exist competing aggregation procedures achieving optimal convergence rates for each of the (MS), (C) and (L) cases separately. Since these procedures are not directly comparable with each other, we suggest an alternative solution. We prove that all three optimal rates, as well as those for the newly introduced (S) aggregation (subset selection), are nearly achieved via a single ``universal'' aggregation procedure. The procedure consists of mixing the initial estimators with weights obtained by penalized least squares. Two different penalties are considered: one of them is of the BIC type, the second one is a data-dependent $\ell_1$-type penalty.
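The "universal" procedure described in the abstract mixes the initial estimators with weights obtained by penalized least squares. As a minimal illustration only (not the authors' exact estimator), the $\ell_1$-penalized weight fit can be sketched as a small coordinate-descent lasso over the matrix of the initial estimators' predictions. The function name `aggregate_l1`, the cyclic coordinate-descent solver, and the fixed penalty level `lam` are assumptions for the sketch; in the paper the $\ell_1$ penalty is data-dependent, and the alternative BIC-type penalty would instead charge for the number of nonzero weights.

```python
import numpy as np

def aggregate_l1(F, y, lam=0.1, n_iter=300):
    """Aggregate initial estimators by l1-penalized least squares (sketch).

    F   : (n, M) matrix whose columns hold the predictions of the M
          initial estimators at the n design points.
    y   : (n,) response vector.
    lam : fixed l1 penalty level (illustrative; the paper's penalty is
          data-dependent).

    Returns weights theta approximately minimizing
        (1/n) * ||y - F @ theta||^2 + lam * ||theta||_1
    via cyclic coordinate descent with soft-thresholding updates.
    """
    n, M = F.shape
    theta = np.zeros(M)
    col_sq = (F ** 2).sum(axis=0) / n  # per-coordinate curvature
    for _ in range(n_iter):
        for j in range(M):
            if col_sq[j] == 0.0:
                continue
            # Partial residual: add coordinate j's contribution back in.
            r = y - F @ theta + F[:, j] * theta[j]
            rho = F[:, j] @ r / n
            # Soft-thresholding solves the 1-D penalized subproblem.
            theta[j] = np.sign(rho) * max(abs(rho) - lam / 2.0, 0.0) / col_sq[j]
    return theta

# Illustrative use: two "initial estimators" (a linear trend and a constant),
# aggregated against a response they can reproduce jointly.
x = np.linspace(0.0, 1.0, 100)
F = np.column_stack([x, np.ones_like(x)])
theta = aggregate_l1(F, 3.0 * x + 1.0, lam=0.001)
```

With a small `lam`, the fitted mixture `F @ theta` tracks the target closely; larger penalty levels shrink some weights exactly to zero, which is what connects this convex program to the (MS) and (S) subset-selection behavior discussed in the abstract.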
---
PDF link:
https://arxiv.org/pdf/0710.3654