Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain

2022-03-02

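Abstract (translation):
We develop results on using Lasso and Post-Lasso methods to form first-stage predictions and to estimate optimal instruments in linear instrumental variables (IV) models with many instruments, $p$. The results hold even when $p$ is much larger than the sample size, $n$. We show that the IV estimator that uses Lasso or Post-Lasso in the first stage is root-n consistent and asymptotically normal when the first stage is approximately sparse, that is, when the conditional expectation of the endogenous variables given the instruments can be well approximated by a relatively small set of variables whose identities may be unknown. We also show that the estimator is semiparametrically efficient when the structural error is homoscedastic. Notably, the results allow for imperfect model selection and do not rely on the unrealistic "beta-min" conditions that are widely used to establish the validity of inference after model selection. In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared with recently advocated many-instrument-robust procedures. In an empirical example on the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark. In developing the IV results, we establish a series of new results for Lasso and Post-Lasso estimators of nonparametric conditional expectation functions, which are of independent theoretical and practical interest. We construct a modification of Lasso, designed to handle non-Gaussian, heteroscedastic disturbances, that uses a data-weighted $\ell_1$-penalty function. Using moderate deviation theory for self-normalized sums, we provide convergence rates for the resulting Lasso and Post-Lasso estimators that are as sharp as the corresponding rates in the homoscedastic Gaussian case, under the condition that $\log p = o(n^{1/3})$.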
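For orientation, the setting the abstract describes can be written out as follows. This is only a sketch for a single endogenous variable; the symbols $y_i$, $d_i$, $z_i$, $\alpha_0$, $\beta_0$, $r_i$, $s$, $\lambda$ and $\hat\gamma_j$ are notation assumed here for illustration and do not appear in the abstract itself.

$$
y_i = d_i\,\alpha_0 + \varepsilon_i, \qquad \mathrm{E}[\varepsilon_i \mid z_i] = 0,
$$

$$
d_i = \mathrm{E}[d_i \mid z_i] + v_i, \qquad \mathrm{E}[d_i \mid z_i] = z_i'\beta_0 + r_i, \qquad \|\beta_0\|_0 \le s \ll n,
$$

where $z_i \in \mathbb{R}^p$ collects the instruments and $r_i$ is a small approximation error (this is what "approximately sparse" is taken to mean here). A Lasso first stage with a data-weighted $\ell_1$ penalty then has the form

$$
\hat\beta \in \arg\min_{\beta \in \mathbb{R}^p} \; \frac{1}{n}\sum_{i=1}^{n} (d_i - z_i'\beta)^2 + \frac{\lambda}{n}\sum_{j=1}^{p} \hat\gamma_j\,|\beta_j|,
$$

and the fitted value $\hat D_i = z_i'\hat\beta$ (or its Post-Lasso refit by least squares on the selected instruments) serves as the estimated optimal instrument:

$$
\hat\alpha = \Big(\sum_{i=1}^{n} \hat D_i\, d_i\Big)^{-1} \sum_{i=1}^{n} \hat D_i\, y_i .
$$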
---
English title:
Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain
---
Authors:
Alexandre Belloni, Daniel Chen, Victor Chernozhukov, Christian Hansen
---
Year of latest submission:
2015
---
Classification:

Top-level category: Statistics
Subcategory: Methodology
Category description: Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
--
Top-level category: Economics
Subcategory: Econometrics
Category description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--
Top-level category: Mathematics
Subcategory: Statistics Theory
Category description: Applied, computational and theoretical statistics: e.g. statistical inference, regression, time series, multivariate analysis, data analysis, Markov chain Monte Carlo, design of experiments, case studies
--
Top-level category: Statistics
Subcategory: Statistics Theory
Category description: stat.TH is an alias for math.ST. Asymptotics, Bayesian Inference, Decision Theory, Estimation, Foundations, Inference, Testing.
--

---
English abstract:
We develop results for the use of Lasso and Post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, $p$. Our results apply even when $p$ is much larger than the sample size, $n$. We show that the IV estimator based on using Lasso or Post-Lasso in the first stage is root-n consistent and asymptotically normal when the first-stage is approximately sparse; i.e. when the conditional expectation of the endogenous variables given the instruments can be well-approximated by a relatively small set of variables whose identities may be unknown. We also show the estimator is semi-parametrically efficient when the structural error is homoscedastic. Notably our results allow for imperfect model selection, and do not rely upon the unrealistic "beta-min" conditions that are widely used to establish validity of inference following model selection. In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark. In developing the IV results, we establish a series of new results for Lasso and Post-Lasso estimators of nonparametric conditional expectation functions which are of independent theoretical and practical interest. We construct a modification of Lasso designed to deal with non-Gaussian, heteroscedastic disturbances which uses a data-weighted $\ell_1$-penalty function. Using moderate deviation theory for self-normalized sums, we provide convergence rates for the resulting Lasso and Post-Lasso estimators that are as sharp as the corresponding rates in the homoscedastic Gaussian case under the condition that $\log p = o(n^{1/3})$.
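The two-step idea in the abstract (select instruments with Lasso, refit by least squares, then use the fitted first stage as the instrument) can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `lasso_cd` and `post_lasso_iv` are hypothetical, the data-generating process is invented for the example, and the penalty is a simplified plug-in with the first-stage noise scale set to 1 rather than the data-weighted, heteroscedasticity-robust penalty the abstract describes.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column mean squares
    r = y.copy()                        # running residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            # correlation of column j with the partial residual
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            if new != b[j]:
                r -= X[:, j] * (new - b[j])
                b[j] = new
    return b

def post_lasso_iv(y, d, Z, lam):
    """Lasso selection, OLS refit on the selected instruments, then IV."""
    Zc = Z - Z.mean(axis=0)             # centering absorbs the intercepts
    dc = d - d.mean()
    yc = y - y.mean()
    sel = np.flatnonzero(lasso_cd(Zc, dc, lam))
    if sel.size == 0:
        raise ValueError("Lasso selected no instruments")
    coef, *_ = np.linalg.lstsq(Zc[:, sel], dc, rcond=None)
    d_hat = Zc[:, sel] @ coef           # Post-Lasso first-stage fit
    alpha = (d_hat @ yc) / (d_hat @ dc) # IV estimate with d_hat as instrument
    return alpha, sel

# Simulated example (assumed design): p > n, only 3 relevant instruments.
rng = np.random.default_rng(0)
n, p = 200, 300
Z = rng.standard_normal((n, p))
v = rng.standard_normal(n)                      # first-stage error
eps = 0.6 * v + 0.8 * rng.standard_normal(n)    # structural error, endogenous
d = Z[:, :3].sum(axis=1) + v
y = 1.0 * d + eps                               # true coefficient is 1.0

lam = 1.1 * np.sqrt(2 * np.log(2 * p) / n)      # simplified plug-in penalty
alpha_hat, selected = post_lasso_iv(y, d, Z, lam)
print("selected instruments:", selected)
print("post-Lasso IV estimate:", alpha_hat)
```

The refit step matters because Lasso shrinks the selected coefficients toward zero; re-estimating them by least squares on the selected set removes that shrinkage before the fitted values are used as the instrument.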
---
PDF link:
https://arxiv.org/pdf/1010.4345
