2022-03-06
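Abstract (translation):
This paper studies inference in quantile regression (QR) when the sample size $n$ is large but memory is limited and can hold only a small batch of data of size $m$. A natural approach is the naive divide-and-conquer method, which splits the data into batches of size $m$, computes a local QR estimator on each batch, and then aggregates the estimators by averaging. However, this method works only when $n=o(m^2)$ and is computationally expensive. The paper proposes a computationally efficient method that needs only an initial QR estimate on a small batch of data and then successively refines the estimate through multiple rounds of aggregation. Theoretically, as long as $n$ grows polynomially in $m$, the authors establish asymptotic normality of the resulting estimator and show that, with only a few rounds of aggregation, it achieves the same efficiency as the QR estimator computed on all the data. The results also allow the dimension $p$ to go to infinity. The method can further be applied to QR problems in distributed computing environments (for example, in large-scale sensor networks) or with real-time streaming data.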
---
English title:
Quantile Regression Under Memory Constraint
---
Authors:
Xi Chen, Weidong Liu, Yichen Zhang
---
Year of latest submission:
2018
---
Classification:

Primary category: Statistics
Secondary category: Methodology
Category description: Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
--
Primary category: Economics
Secondary category: Econometrics
Category description: Econometric Theory, Micro-Econometrics, Macro-Econometrics, Empirical Content of Economic Relations discovered via New Methods, Methodological Aspects of the Application of Statistical Inference to Economic Data.
--
Primary category: Statistics
Secondary category: Machine Learning
Category description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding
--

---
English abstract:
This paper studies the inference problem in quantile regression (QR) for a large sample size $n$ but under a limited memory constraint, where the memory can only store a small batch of data of size $m$. A natural method is the naïve divide-and-conquer approach, which splits data into batches of size $m$, computes the local QR estimator for each batch, and then aggregates the estimators via averaging. However, this method only works when $n=o(m^2)$ and is computationally expensive. This paper proposes a computationally efficient method, which only requires an initial QR estimator on a small batch of data and then successively refines the estimator via multiple rounds of aggregations. Theoretically, as long as $n$ grows polynomially in $m$, we establish the asymptotic normality for the obtained estimator and show that our estimator with only a few rounds of aggregations achieves the same efficiency as the QR estimator computed on all the data. Moreover, our result allows the case that the dimensionality $p$ goes to infinity. The proposed method can also be applied to address the QR problem under distributed computing environment (e.g., in a large-scale sensor network) or for real-time streaming data.
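
To make the naive divide-and-conquer baseline from the abstract concrete, here is a minimal Python sketch under strong simplifying assumptions. It covers only the intercept-only special case of QR, where the local estimator on a batch is a sample quantile, and it reads the data as a stream so that at most $m$ observations are held at once. The function names (`local_quantile`, `batches`, `naive_dc_quantile`) are hypothetical and do not come from the paper. This is the baseline the abstract describes, not the refined multi-round method the authors propose.

```python
import math
import random


def local_quantile(batch, tau):
    """Intercept-only QR on one batch (assumed special case).

    With no covariates, a minimizer of the summed check loss
    rho_tau(u) = u * (tau - 1{u < 0}) is the ceil(tau * m)-th order
    statistic of the batch, so sorting is enough.
    """
    ordered = sorted(batch)
    k = math.ceil(tau * len(ordered)) - 1  # 0-based index
    k = max(0, min(len(ordered) - 1, k))   # guard against edge cases
    return ordered[k]


def batches(stream, m):
    """Yield consecutive batches of at most m observations.

    Only the current batch is kept, mimicking the memory constraint.
    """
    batch = []
    for y in stream:
        batch.append(y)
        if len(batch) == m:
            yield batch
            batch = []
    if batch:  # a final, shorter batch if n is not a multiple of m
        yield batch


def naive_dc_quantile(stream, m, tau):
    """Average the local estimates over all batches (naive aggregation).

    Every batch gets equal weight, including a shorter final batch.
    """
    total, count = 0.0, 0
    for batch in batches(stream, m):
        total += local_quantile(batch, tau)
        count += 1
    if count == 0:
        raise ValueError("stream is empty")
    return total / count


if __name__ == "__main__":
    rng = random.Random(0)
    n, m, tau = 20_000, 200, 0.5
    # Simulated standard normal data, generated lazily one value at a time.
    stream = (rng.gauss(0.0, 1.0) for _ in range(n))
    print(naive_dc_quantile(stream, m, tau))
```

Averaging reduces the variance of the local estimates but not their bias, which is one way to read the abstract's statement that this approach only works when $n=o(m^2)$. A full QR with covariates would replace `local_quantile` with a regression solver run on each batch.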
---
PDF link:
https://arxiv.org/pdf/1810.08264