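Abstract (translated):
Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such effects is that long-term outcomes are not observed within the time frame needed to make policy decisions. One way to overcome this missing-data problem is to analyze the effect of the treatment on an intermediate outcome, often called a statistical surrogate, provided it satisfies the condition that treatment and outcome are independent conditional on the surrogate. The validity of this surrogacy condition is often disputed. Here we exploit the fact that in modern datasets researchers often observe a large number of intermediate outcomes, possibly hundreds or thousands, that are thought to lie on or close to the causal chain between the treatment and the long-term outcome of interest. Even if no individual proxy satisfies the statistical surrogacy criterion by itself, using multiple proxies can be useful in causal inference. We focus primarily on a setting with two samples: an experimental sample containing data on the treatment indicator and the surrogates, and an observational sample containing information on the surrogates and the primary outcome. We state assumptions under which the average treatment effect can be identified and estimated with a high-dimensional vector of proxies that collectively satisfy the surrogacy assumption, derive the bias that results from violations of the surrogacy assumption, and show that even if the primary outcome is also observed in the experimental sample, there is still information to be gained from using surrogates.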
---
Title:
Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index
---
Authors:
Susan Athey, Raj Chetty, Guido Imbens, Hyunseung Kang
---
Year of latest submission:
2020
---
Classification:
Primary category: Statistics
Secondary category: Methodology
Category description: Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
--
Primary category: Economics
Secondary category: Econometrics
Category description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--
Primary category: Statistics
Secondary category: Machine Learning
Category description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding
--
---
Abstract (original English):
Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such treatment effects is that long-term outcomes are unobserved in the time frame needed to make policy decisions. One approach to overcome this missing data problem is to analyze treatment effects on an intermediate outcome, often called a statistical surrogate, if it satisfies the condition that treatment and outcome are independent conditional on the statistical surrogate. The validity of the surrogacy condition is often controversial. Here we exploit the fact that in modern datasets, researchers often observe a large number, possibly hundreds or thousands, of intermediate outcomes, thought to lie on or close to the causal chain between the treatment and the long-term outcome of interest. Even if none of the individual proxies satisfies the statistical surrogacy criterion by itself, using multiple proxies can be useful in causal inference. We focus primarily on a setting with two samples, an experimental sample containing data about the treatment indicator and the surrogates and an observational sample containing information about the surrogates and the primary outcome. We state assumptions under which the average treatment effect can be identified and estimated with a high-dimensional vector of proxies that collectively satisfy the surrogacy assumption, derive the bias from violations of the surrogacy assumption, and show that even if the primary outcome is also observed in the experimental sample, there is still information to be gained from using surrogates.
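
To make the two-sample setting concrete, here is a minimal sketch of a surrogate-index style estimate on simulated data. It is an illustration under stated assumptions, not the authors' implementation: the function name `surrogate_index_ate`, the use of ordinary least squares for the outcome model, and the simulated data-generating process are all hypothetical choices. The sketch assumes the surrogacy condition holds and that the relationship between surrogates and outcome is the same in both samples.

```python
import numpy as np

def surrogate_index_ate(S_obs, Y_obs, S_exp, W_exp):
    """Sketch of a surrogate-index estimate (hypothetical helper).

    S_obs, Y_obs: surrogates and primary outcome in the observational sample.
    S_exp, W_exp: surrogates and binary treatment in the experimental sample.
    """
    # Step 1: in the observational sample, fit a linear model for E[Y | S].
    # (Linear least squares is an assumption made for this illustration.)
    X_obs = np.column_stack([np.ones(len(S_obs)), S_obs])
    beta, *_ = np.linalg.lstsq(X_obs, Y_obs, rcond=None)

    # Step 2: predict the primary outcome from the surrogates in the
    # experimental sample; this prediction plays the role of the index.
    X_exp = np.column_stack([np.ones(len(S_exp)), S_exp])
    index = X_exp @ beta

    # Step 3: compare the mean index between treated and control units.
    return index[W_exp == 1].mean() - index[W_exp == 0].mean()

# Simulated data (entirely made up for the example).
rng = np.random.default_rng(0)
n, p = 5000, 3
gamma = np.array([1.0, 0.5, -0.5])  # effect of treatment on each surrogate
theta = np.array([2.0, 1.0, 1.0])   # effect of each surrogate on the outcome

# Experimental sample: treatment and surrogates only, no primary outcome.
W_exp = rng.integers(0, 2, size=n)
S_exp = np.outer(W_exp, gamma) + rng.normal(size=(n, p))

# Observational sample: surrogates and primary outcome, no treatment.
S_obs = rng.normal(size=(n, p))
Y_obs = S_obs @ theta + rng.normal(size=n)

estimate = surrogate_index_ate(S_obs, Y_obs, S_exp, W_exp)
true_effect = float(gamma @ theta)  # treatment acts on Y only through S
print(f"estimate={estimate:.3f}, true effect={true_effect:.3f}")
```

In this simulation the treatment affects the outcome only through the surrogates, so the estimate should land near the true effect. If the treatment also had a direct effect on the outcome, the surrogacy condition would fail and the estimate would be biased.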
---
PDF link:
https://arxiv.org/pdf/1603.09326