Title:
Agglomerative Likelihood Clustering
---
Authors:
Lionel Yelibi, Tim Gebbie
---
Year of latest submission:
2021
---
Abstract:
We consider the problem of fast time-series data clustering. Building on previous work modeling the correlation-based Hamiltonian of spin variables we present an updated fast non-expensive Agglomerative Likelihood Clustering algorithm (ALC). The method replaces the optimized genetic algorithm based approach (f-SPC) with an agglomerative recursive merging framework inspired by previous work in Econophysics and Community Detection. The method is tested on noisy synthetic correlated time-series data-sets with built-in cluster structure to demonstrate that the algorithm produces meaningful non-trivial results. We apply it to time-series data-sets as large as 20,000 assets and we argue that ALC can reduce compute time costs and resource usage cost for large scale clustering for time-series applications while being serialized, and hence has no obvious parallelization requirement. The algorithm can be an effective choice for state-detection for online learning in a fast non-linear data environment because the algorithm requires no prior information about the number of clusters.
---
Classification:
Top-level category: Quantitative Finance
Subcategory: Computational Finance
Description: Computational methods, including Monte Carlo, PDE, lattice and other numerical methods with applications to financial modeling
--
Top-level category: Physics
Subcategory: Data Analysis, Statistics and Probability
Description: Methods, software and hardware for physics data analysis: data processing and storage; measurement methodology; statistical and mathematical aspects such as parametrization and uncertainties.
--
Top-level category: Statistics
Subcategory: Machine Learning
Description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding
--