2022-03-02
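Abstract (translated from the Chinese version):
In this paper we study the problem of estimating heterogeneity in causal effects in experimental or observational studies, and of conducting inference about the size of the differences in treatment effects across subsets of the population. In applications, our method provides a data-driven way to determine which subpopulations have large or small treatment effects and to test hypotheses about the differences in these effects. For experiments, our method lets researchers identify heterogeneity in treatment effects that was not specified in a pre-analysis plan, without having to worry that multiple testing invalidates the inference. In most of the literature on supervised machine learning (e.g. regression trees, random forests, LASSO), the goal is to build a model of the relationship between a unit's attributes and an observed outcome. Cross-validation plays a prominent role in these methods: it compares predictions with actual outcomes in test samples in order to select the level of model complexity that gives the best predictive power. Our method is closely related, but it differs in that it is tailored to predicting the causal effect of a treatment rather than a unit's outcome. The challenge is that the "ground truth" for a causal effect is not observed for any individual unit: we observe a unit either with the treatment or without it, but never both at once. It is therefore not obvious how to use cross-validation to determine whether a causal effect has been predicted accurately. We propose several new cross-validation criteria for this problem and show through simulations the conditions under which they outperform standard methods on the causal-effect problem. We then apply the method to a large-scale field experiment that re-ranked results on a search engine.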
---
English title:
"Recursive Partitioning for Heterogeneous Causal Effects"
---
Authors:
Susan Athey and Guido Imbens
---
Year of latest submission:
2015
---
Classification:

Primary category: Statistics
Secondary category: Machine Learning
Category description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding
--
Primary category: Economics
Secondary category: Econometrics
Category description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--

---
Original English abstract:
  In this paper we study the problems of estimating heterogeneity in causal effects in experimental or observational studies and conducting inference about the magnitude of the differences in treatment effects across subsets of the population. In applications, our method provides a data-driven approach to determine which subpopulations have large or small treatment effects and to test hypotheses about the differences in these effects. For experiments, our method allows researchers to identify heterogeneity in treatment effects that was not specified in a pre-analysis plan, without concern about invalidating inference due to multiple testing. In most of the literature on supervised machine learning (e.g. regression trees, random forests, LASSO, etc.), the goal is to build a model of the relationship between a unit's attributes and an observed outcome. A prominent role in these methods is played by cross-validation which compares predictions to actual outcomes in test samples, in order to select the level of complexity of the model that provides the best predictive power. Our method is closely related, but it differs in that it is tailored for predicting causal effects of a treatment rather than a unit's outcome. The challenge is that the "ground truth" for a causal effect is not observed for any individual unit: we observe the unit with the treatment, or without the treatment, but not both at the same time. Thus, it is not obvious how to use cross-validation to determine whether a causal effect has been accurately predicted. We propose several novel cross-validation criteria for this problem and demonstrate through simulations the conditions under which they perform better than standard methods for the problem of causal effects. We then apply the method to a large-scale field experiment re-ranking results on a search engine.
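The abstract's central difficulty, that no unit is ever observed both with and without treatment, can be illustrated with a small simulation. The sketch below is not the paper's cross-validation criteria, which the abstract does not spell out. It uses one device known from the causal-inference literature, the transformed outcome Y* = Y (W - p) / (p (1 - p)), whose conditional mean equals the conditional average treatment effect under random assignment with known probability p. The data-generating process, the function names `transformed_outcome` and `subgroup_effects`, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def transformed_outcome(y, w, p):
    """Return Y* = Y * (W - p) / (p * (1 - p)).

    Assumes random assignment with known treatment probability p.
    Under that assumption E[Y* | X] equals the conditional average
    treatment effect, so Y* is a noisy but observable stand-in for
    the unobserved unit-level effect.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return y * (w - p) / (p * (1.0 - p))


def subgroup_effects(y, w, groups):
    """Treated-minus-control difference in mean outcomes per subgroup."""
    effects = {}
    for g in np.unique(groups):
        mask = groups == g
        treated_mean = y[mask & (w == 1)].mean()
        control_mean = y[mask & (w == 0)].mean()
        effects[int(g)] = treated_mean - control_mean
    return effects


# Hypothetical simulated experiment (all numbers are illustrative).
rng = np.random.default_rng(0)
n = 200_000
p = 0.5                                    # known assignment probability
x = rng.integers(0, 2, size=n)             # one binary unit attribute
w = rng.binomial(1, p, size=n)             # randomized treatment indicator
true_effect = np.where(x == 1, 2.0, 0.5)   # effect differs by subgroup
y = 1.0 + x + true_effect * w + rng.normal(size=n)

# Each unit reveals only one potential outcome, yet subgroup averages
# recover the heterogeneous effects in two different ways.
effects = subgroup_effects(y, w, x)
y_star = transformed_outcome(y, w, p)
star_means = {g: y_star[x == g].mean() for g in (0, 1)}

print("difference in means by subgroup:", effects)
print("mean transformed outcome by subgroup:", star_means)
```

Both summaries should land near the simulated effects of 0.5 and 2.0, with the transformed-outcome averages noticeably noisier. That extra noise is one reason a criterion built directly on Y* need not be the best choice, and it is consistent with the abstract's point that several criteria are worth comparing.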
---
PDF link:
https://arxiv.org/pdf/1504.01132