2022-03-02
Abstract (translated):
Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems well. Achieving this goal does not mean that these methods automatically deliver good estimators of causal parameters. Examples of such parameters include individual regression coefficients, average treatment effects, average lifts, and demand or supply elasticities. In fact, because of regularization bias, estimates of these causal parameters obtained by naively plugging ML estimators into the estimating equations for these parameters can behave very poorly. Fortunately, this regularization bias can be removed by using ML tools to solve auxiliary prediction problems. Specifically, we can form an orthogonal score for the target low-dimensional parameter by combining the auxiliary and main ML predictions. This score is then used to build a debiased estimator of the target parameter, which will typically converge at the fastest possible 1/root(n) rate, be approximately unbiased and normally distributed, and yield valid confidence intervals for the parameters of interest. The resulting method can therefore be called a "double ML" method, since it relies on estimating both a primary and an auxiliary predictive model. To avoid overfitting, the construction also makes use of K-fold sample splitting, which we call cross-fitting. This allows a very broad set of ML prediction methods to be used for the auxiliary and main prediction problems, such as random forests, lasso, ridge, deep neural nets, boosted trees, as well as various hybrids and aggregators of these methods.
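The abstract refers to an "orthogonal score" built from a main and an auxiliary prediction without writing it down. As a concrete sketch, assuming the partially linear model that serves as a leading example in the paper, Y = D*theta_0 + g_0(X) + U with D = m_0(X) + V, the main prediction is ell_0(X) = E[Y|X], the auxiliary prediction is m_0(X) = E[D|X], and the partialling-out score and the resulting debiased estimator are:

\[
  \psi(W;\theta,\eta) \;=\; \bigl(Y-\ell(X)-\theta\,(D-m(X))\bigr)\,\bigl(D-m(X)\bigr),
  \qquad \eta=(\ell,m),
\]
\[
  \hat\theta \;=\; \frac{\sum_{i=1}^{n}\bigl(D_i-\hat m(X_i)\bigr)\bigl(Y_i-\hat\ell(X_i)\bigr)}
                        {\sum_{i=1}^{n}\bigl(D_i-\hat m(X_i)\bigr)^{2}},
\]

where \hat\ell and \hat m are ML fits computed on the other folds (cross-fitting). Setting the sample average of \psi to zero gives an estimator that is first-order insensitive to small errors in the two nuisance fits, which is what removes the regularization bias.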
---
English title:
《Double/Debiased Machine Learning for Treatment and Causal Parameters》
---
Authors:
Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo,
  Christian Hansen, Whitney Newey, and James Robins
---
Latest submission year:
2017
---
Classification:

Primary category: Statistics
Secondary category: Machine Learning
Category description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding.
--
Primary category: Economics
Secondary category: Econometrics
Category description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--

---
English abstract:
  Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically deliver good estimators of causal parameters. Examples of such parameters include individual regression coefficients, average treatment effects, average lifts, and demand or supply elasticities. In fact, estimates of such causal parameters obtained via naively plugging ML estimators into estimating equations for such parameters can behave very poorly due to the regularization bias. Fortunately, this regularization bias can be removed by solving auxiliary prediction problems via ML tools. Specifically, we can form an orthogonal score for the target low-dimensional parameter by combining auxiliary and main ML predictions. The score is then used to build a de-biased estimator of the target parameter which typically will converge at the fastest possible 1/root(n) rate and be approximately unbiased and normal, and from which valid confidence intervals for these parameters of interest may be constructed. The resulting method thus could be called a "double ML" method because it relies on estimating primary and auxiliary predictive models. In order to avoid overfitting, our construction also makes use of the K-fold sample splitting, which we call cross-fitting. This allows us to use a very broad set of ML predictive methods in solving the auxiliary and main prediction problems, such as random forest, lasso, ridge, deep neural nets, boosted trees, as well as various hybrids and aggregators of these methods.
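A minimal runnable sketch of the cross-fitting recipe described above, for the partially linear model Y = D*theta_0 + g_0(X) + U, D = m_0(X) + V, with random forests standing in for the ML learners. The simulated data, the choice of scikit-learn regressors, and all tuning values are illustrative assumptions, not anything prescribed by the paper.

# Cross-fitted double ML for the partially linear model (illustrative sketch)
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p, theta_true = 2000, 10, 0.5
X = rng.normal(size=(n, p))
D = X[:, 0] + rng.normal(size=n)                 # treatment, m0(X) = X[:, 0]
Y = theta_true * D + np.sin(X[:, 1]) + rng.normal(size=n)

res_y = np.empty(n)  # out-of-fold residuals Y - E_hat[Y|X]  (main prediction)
res_d = np.empty(n)  # out-of-fold residuals D - E_hat[D|X]  (auxiliary prediction)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    ml_y = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[train], Y[train])
    ml_d = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[train], D[train])
    res_y[test] = Y[test] - ml_y.predict(X[test])
    res_d[test] = D[test] - ml_d.predict(X[test])

# Debiased estimate: set the sample average of the orthogonal score to zero
theta_hat = np.sum(res_d * res_y) / np.sum(res_d ** 2)
# Plug-in standard error from the score, and a 95% confidence interval
psi = (res_y - theta_hat * res_d) * res_d
se = np.sqrt(np.mean(psi ** 2) / np.mean(res_d ** 2) ** 2 / n)
print(f"theta_hat = {theta_hat:.3f}, 95% CI = [{theta_hat - 1.96 * se:.3f}, "
      f"{theta_hat + 1.96 * se:.3f}]")

Swapping RandomForestRegressor for lasso, ridge, boosting, or a neural network only changes the two fit/predict lines; the cross-fitting loop, the score, and the confidence-interval formula stay the same, which is the point of the construction.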
---
PDF link:
https://arxiv.org/pdf/1608.00060