基于因子的三向受限Boltzmann的背景减除机器

280

收藏 2022-03-08

摘要翻译：
本文提出了一种基于连续感官输入的三维模型重建方法。机器人可以利用各种传感器从现实世界中获取极其大量的数据。然而，感官输入通常是噪声太大的高维数据。当机器人试图建立三维模型时，利用这些原始数据进行处理是非常困难和耗时的。因此，需要有一种方法，可以提取有用的信息，从这些感官输入。为了解决这个问题，我们的方法利用了对象语义层次(OSH)的概念。与以往的基于层次结构的运动信息提取方法不同，本文采用深度信任网络技术来提取运动信息，而不是采用经典的计算机视觉方法。我们对平移和旋转的两大组随机点图像（10,000）进行了训练，并成功地提取了几个解释平移和旋转运动的基。基于这个平移和旋转的基础，背景减法已经成为可能使用对象语义层次。
---
英文标题：
《Background subtraction using the factored 3-way restricted Boltzmann
machines》
---
作者：
Soonam Lee and Daekeun Kim
---
最新提交年份：
2018
---
分类信息：

一级分类：Computer Science 计算机科学
二级分类：Computer Vision and Pattern Recognition 计算机视觉与模式识别
分类描述：Covers image processing, computer vision, pattern recognition, and scene understanding. Roughly includes material in ACM Subject Classes I.2.10, I.4, and I.5.
涵盖图像处理、计算机视觉、模式识别和场景理解。大致包括ACM课程I.2.10、I.4和I.5中的材料。
--
一级分类：Computer Science 计算机科学
二级分类：Machine Learning 机器学习
分类描述：Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
关于机器学习研究的所有方面的论文（有监督的，无监督的，强化学习，强盗问题，等等），包括健壮性，解释性，公平性和方法论。对于机器学习方法的应用，CS.LG也是一个合适的主要类别。
--
一级分类：Electrical Engineering and Systems Science 电气工程与系统科学
二级分类：Image and Video Processing 图像和视频处理
分类描述：Theory, algorithms, and architectures for the formation, capture, processing, communication, analysis, and display of images, video, and multidimensional signals in a wide variety of applications. Topics of interest include: mathematical, statistical, and perceptual image and video modeling and representation; linear and nonlinear filtering, de-blurring, enhancement, restoration, and reconstruction from degraded, low-resolution or tomographic data; lossless and lossy compression and coding; segmentation, alignment, and recognition; image rendering, visualization, and printing; computational imaging, including ultrasound, tomographic and magnetic resonance imaging; and image and video analysis, synthesis, storage, search and retrieval.
用于图像、视频和多维信号的形成、捕获、处理、通信、分析和显示的理论、算法和体系结构。感兴趣的主题包括：数学，统计，和感知图像和视频建模和表示；线性和非线性滤波、去模糊、增强、恢复和重建退化、低分辨率或层析数据；无损和有损压缩编码；分割、对齐和识别；图像渲染、可视化和打印；计算成像，包括超声、断层和磁共振成像；以及图像和视频的分析、合成、存储、搜索和检索。
--

---
英文摘要：
In this paper, we proposed a method for reconstructing the 3D model based on continuous sensory input. The robot can draw on extremely large data from the real world using various sensors. However, the sensory inputs are usually too noisy and high-dimensional data. It is very difficult and time consuming for robot to process using such raw data when the robot tries to construct 3D model. Hence, there needs to be a method that can extract useful information from such sensory inputs. To address this problem our method utilizes the concept of Object Semantic Hierarchy (OSH). Different from the previous work that used this hierarchy framework, we extract the motion information using the Deep Belief Network technique instead of applying classical computer vision approaches. We have trained on two large sets of random dot images (10,000) which are translated and rotated, respectively, and have successfully extracted several bases that explain the translation and rotation motion. Based on this translation and rotation bases, background subtraction have become possible using Object Semantic Hierarchy.
---
PDF链接：
https://arxiv.org/pdf/1802.01522

扫码加我拉你入群

请注明：姓名-公司-职位

以便审核进群资格，未注明则拒绝

扫码加我 拉你入群

分享

扫码加好友，拉您进群

扫码加我拉你入群