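Abstract (translated from the Chinese):
As more technology companies undertake rigorous economic analyses, we face a data problem: in-house papers cannot be replicated because they rely on sensitive, proprietary, or private data. Readers can only assume that the obscured true data (e.g., internal Google information) did produce the reported results, or they must look for comparable public-facing data (e.g., Google Trends) that yield similar results. One way to ease this reproducibility problem is for researchers to release synthetic datasets based on their true data; this lets external parties replicate the internal researchers' methodology. In this brief overview, we explore synthetic data generation at a high level for economic analyses.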
---
English title:
Synthetic Data Generation for Economists
---
Authors:
Allison Koenecke and Hal Varian
---
Year of latest submission:
2020
---
Classification:
Top-level category: Economics
Subcategory: General Economics
Category description: General methodological, applied, and empirical contributions to economics.
--
Top-level category: Computer Science
Subcategory: Machine Learning
Category description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Top-level category: Quantitative Finance
Subcategory: Economics
Category description: q-fin.EC is an alias for econ.GN. Economics, including micro and macro economics, international economics, theory of the firm, labor economics, and other economic topics outside finance.
--
---
Original English abstract:
As more tech companies engage in rigorous economic analyses, we are confronted with a data problem: in-house papers cannot be replicated due to use of sensitive, proprietary, or private data. Readers are left to assume that the obscured true data (e.g., internal Google information) indeed produced the results given, or they must seek out comparable public-facing data (e.g., Google Trends) that yield similar results. One way to ameliorate this reproducibility issue is to have researchers release synthetic datasets based on their true data; this allows external parties to replicate an internal researcher's methodology. In this brief overview, we explore synthetic data generation at a high level for economic analyses.
---