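Abstract (translated):
Scanner data offer new opportunities for CPI or HICP calculation. They can be obtained from a wide variety of retailers (supermarkets, home electronics stores, Internet shops, etc.) and provide information at the barcode level. One advantage of scanner data is that they contain complete transaction information, i.e. the price and quantity of every item sold. Scanner data must be processed carefully before use. After the data are cleaned and product names unified, products should be carefully classified (e.g. to COICOP 5 or below), matched, filtered and aggregated. These steps often require building new IT systems or writing custom scripts (R, Python, Mathematica, SAS, etc.). One of the new challenges associated with scanner data is the appropriate choice of index formula. This article presents a proposal for implementing the individual stages of scanner data processing, and points out problems that may arise during processing together with their solutions. Finally, a large number of price index methods are compared on real scanner datasets, and their sensitivity to the adopted data filtering and aggregation methods is verified.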
---
English title:
Scanner data in inflation measurement: from raw data to price indices
---
Authors:
Jacek Białek, Maciej Beręsewicz
---
Year of latest submission:
2020
---
Classification:
Primary category: Statistics
Subcategory: Applications
Category description: Biology, Education, Epidemiology, Engineering, Environmental Sciences, Medical, Physical Sciences, Quality Control, Social Sciences
--
Primary category: Economics
Subcategory: General Economics
Category description: General methodological, applied, and empirical contributions to economics.
--
Primary category: Quantitative Finance
Subcategory: Economics
Category description: q-fin.EC is an alias for econ.GN. Economics, including micro and macro economics, international economics, theory of the firm, labor economics, and other economic topics outside finance.
--
---
English abstract:
Scanner data offer new opportunities for CPI or HICP calculation. They can be obtained from a wide variety of retailers (supermarkets, home electronics, Internet shops, etc.) and provide information at the level of the barcode. One of the advantages of using scanner data is that they contain complete transaction information, i.e. prices and quantities for every item sold. Before scanner data can be used, they must be carefully processed. After cleaning the data and unifying product names, products should be carefully classified (e.g. into COICOP 5 or below), matched, filtered and aggregated. These procedures often require creating new IT systems or writing custom scripts (R, Python, Mathematica, SAS, others). One of the new challenges connected with scanner data is the appropriate choice of the index formula. In this article we present a proposal for the implementation of the individual stages of handling scanner data. We also point out potential problems during scanner data processing and their solutions. Finally, we compare a large number of price index methods based on real scanner datasets and we verify their sensitivity to the adopted data filtering and aggregating methods.
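The stages named in the abstract (aggregation, matching, filtering, index calculation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy transactions, the function names and the `max_ratio` threshold are all hypothetical and chosen only for illustration. The Jevons (unweighted geometric mean of price relatives) and Fisher (geometric mean of Laspeyres and Paasche) formulas are standard bilateral indices, used here as two examples of the many formulas one might compare.

```python
from collections import defaultdict
from math import exp, log, sqrt

# Hypothetical toy transactions: (period, barcode, unit price, quantity sold).
TRANSACTIONS = [
    ("2020-01", "A", 2.00, 10),
    ("2020-01", "A", 2.20, 10),
    ("2020-01", "B", 5.00, 4),
    ("2020-01", "C", 1.00, 7),
    ("2020-02", "A", 2.31, 20),
    ("2020-02", "B", 5.50, 4),
    ("2020-02", "D", 3.00, 1),
]


def aggregate(transactions):
    """Aggregate to one (unit value, total quantity) pair per (period, barcode)."""
    sales = defaultdict(float)
    qty = defaultdict(float)
    for period, code, price, q in transactions:
        sales[(period, code)] += price * q
        qty[(period, code)] += q
    return {key: (sales[key] / qty[key], qty[key]) for key in sales}


def matched(data, base, cur, max_ratio=None):
    """Keep barcodes sold in both periods; optionally drop extreme price changes.

    max_ratio is an assumed filter: a product is dropped when its price
    relative falls outside [1 / max_ratio, max_ratio].
    """
    base_codes = {code for (period, code) in data if period == base}
    cur_codes = {code for (period, code) in data if period == cur}
    rows = []
    for code in sorted(base_codes & cur_codes):
        p0, q0 = data[(base, code)]
        p1, q1 = data[(cur, code)]
        ratio = p1 / p0
        if max_ratio is not None and not (1 / max_ratio <= ratio <= max_ratio):
            continue
        rows.append((p0, q0, p1, q1))
    return rows


def jevons(rows):
    """Unweighted geometric mean of price relatives."""
    return exp(sum(log(p1 / p0) for p0, _, p1, _ in rows) / len(rows))


def fisher(rows):
    """Geometric mean of the Laspeyres and Paasche indices."""
    laspeyres = (sum(p1 * q0 for p0, q0, p1, _ in rows)
                 / sum(p0 * q0 for p0, q0, _, _ in rows))
    paasche = (sum(p1 * q1 for _, _, p1, q1 in rows)
               / sum(p0 * q1 for p0, _, _, q1 in rows))
    return sqrt(laspeyres * paasche)


data = aggregate(TRANSACTIONS)
rows = matched(data, "2020-01", "2020-02")
print(f"Jevons: {jevons(rows):.4f}, Fisher: {fisher(rows):.4f}")
```

In this toy set only barcodes A and B are sold in both months, so C and D drop out at the matching step. Tightening the hypothetical `max_ratio` filter changes which products enter the index, which is the kind of sensitivity to filtering the abstract refers to.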
---
PDF link:
https://arxiv.org/pdf/2005.11233