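Abstract (translated):
We develop results for using Lasso and Post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, $p$. Our results apply even when $p$ is much larger than the sample size $n$. We show that the IV estimator based on Lasso or Post-Lasso in the first stage is root-n consistent and asymptotically normal when the first stage is approximately sparse; that is, when the conditional expectation of the endogenous variables given the instruments can be well approximated by a relatively small set of variables whose identities may be unknown. We also show that the estimator is semiparametrically efficient when the structural error is homoscedastic. Notably, our results allow for imperfect model selection and do not rely on the unrealistic "beta-min" conditions that are widely used to establish the validity of inference after model selection. In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared with recently advocated many-instrument-robust procedures. In an empirical example on the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark. In developing the IV results, we establish a series of new results for Lasso and Post-Lasso estimators of nonparametric conditional expectation functions, which are of independent theoretical and practical interest. We construct a modification of Lasso, designed to handle non-Gaussian, heteroscedastic disturbances, that uses a data-weighted $\ell_1$-penalty function. Using moderate deviation theory for self-normalized sums, we show that under the condition $\log p = o(n^{1/3})$ the resulting Lasso and Post-Lasso estimators converge at rates as sharp as the corresponding rates in the homoscedastic Gaussian case.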
---
English title:
Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain
---
Authors:
Alexandre Belloni, Daniel Chen, Victor Chernozhukov, Christian Hansen
---
Latest submission year:
2015
---
Classification:
Category: Statistics
Subcategory: Methodology
Description: Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
--
Category: Economics
Subcategory: Econometrics
Description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--
Category: Mathematics
Subcategory: Statistics Theory
Description: Applied, computational and theoretical statistics: e.g. statistical inference, regression, time series, multivariate analysis, data analysis, Markov chain Monte Carlo, design of experiments, case studies
--
Category: Statistics
Subcategory: Statistics Theory
Description: stat.TH is an alias for math.ST. Asymptotics, Bayesian Inference, Decision Theory, Estimation, Foundations, Inference, Testing.
--
---
Abstract (original English):
We develop results for the use of Lasso and Post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, $p$. Our results apply even when $p$ is much larger than the sample size, $n$. We show that the IV estimator based on using Lasso or Post-Lasso in the first stage is root-n consistent and asymptotically normal when the first-stage is approximately sparse; i.e. when the conditional expectation of the endogenous variables given the instruments can be well-approximated by a relatively small set of variables whose identities may be unknown. We also show the estimator is semi-parametrically efficient when the structural error is homoscedastic. Notably our results allow for imperfect model selection, and do not rely upon the unrealistic "beta-min" conditions that are widely used to establish validity of inference following model selection. In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark.

In developing the IV results, we establish a series of new results for Lasso and Post-Lasso estimators of nonparametric conditional expectation functions which are of independent theoretical and practical interest. We construct a modification of Lasso designed to deal with non-Gaussian, heteroscedastic disturbances which uses a data-weighted $\ell_1$-penalty function. Using moderate deviation theory for self-normalized sums, we provide convergence rates for the resulting Lasso and Post-Lasso estimators that are as sharp as the corresponding rates in the homoscedastic Gaussian case under the condition that $\log p = o(n^{1/3})$.
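
The two-step idea in the abstract (select instruments with Lasso, refit by least squares, then use the fitted first stage as the instrument) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are simulated, the Lasso is a plain coordinate-descent solver, and the penalty level is a simple rule of thumb rather than the paper's data-weighted penalty. The function name `lasso_cd` and all simulation settings (`n`, `p`, the three relevant instruments, the error correlation) are assumptions made for illustration only.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200, tol=1e-8):
    """Minimize (1/(2n))*||y - X b||^2 + lam*||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    r = y.copy()                          # running residual y - X b
    col_sq = (X ** 2).sum(axis=0) / n     # per-column scale
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Partial correlation of column j with the residual, adding back b[j].
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            if new != b[j]:
                r -= X[:, j] * (new - b[j])
                max_change = max(max_change, abs(new - b[j]))
                b[j] = new
        if max_change < tol:
            break
    return b

# Simulated design (assumption): p > n, only the first 3 instruments matter.
rng = np.random.default_rng(0)
n, p, alpha = 400, 500, 1.0
Z = rng.standard_normal((n, p))
v = rng.standard_normal(n)
e = 0.6 * v + 0.8 * rng.standard_normal(n)   # structural error, correlated with v
d = Z[:, :3].sum(axis=1) + v                 # endogenous regressor (first stage)
y = alpha * d + e                            # structural equation

# Step 1: Lasso on standardized instruments with a rule-of-thumb penalty
# (an assumption; the paper uses a data-weighted penalty instead).
Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)
dc = d - d.mean()
lam = 1.1 * dc.std() * np.sqrt(2 * np.log(2 * p) / n)
selected = np.flatnonzero(lasso_cd(Zs, dc, lam))
if selected.size == 0:
    raise ValueError("no instruments selected; the IV step is undefined")

# Step 2: Post-Lasso, i.e. an unpenalized least-squares refit on the selected set.
Zsel = np.column_stack([np.ones(n), Z[:, selected]])
coef, *_ = np.linalg.lstsq(Zsel, d, rcond=None)
d_hat = Zsel @ coef

# Step 3: IV estimate using the fitted first stage as the instrument.
yc = y - y.mean()
d_hat_c = d_hat - d_hat.mean()
alpha_iv = (d_hat_c @ yc) / (d_hat_c @ dc)
alpha_ols = (dc @ yc) / (dc @ dc)            # OLS ignores endogeneity, for comparison

print("selected instruments:", selected)
print("IV estimate:", alpha_iv, "OLS estimate:", alpha_ols)
```

In this toy design the OLS estimate is biased because `e` and `v` are correlated, while the IV step relies only on variation in `d` explained by the selected instruments.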
---
PDF link:
https://arxiv.org/pdf/1010.4345