Object Localization and Size Estimation from RGB-D Images

nandehutu2022

2022-04-06

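Abstract (translated):
Depth-sensing cameras (e.g., the Kinect sensor and Tango phones) can acquire color and depth images registered to a common viewpoint. This makes it possible to develop algorithms that exploit the strengths of both sensing modalities. Traditionally, cues from color images have been used for object localization (e.g., YOLO). Adding a depth image, however, makes it possible to segment image regions that would otherwise have identical color information. The depth image can also be used to estimate object size (height/width) in real-world units such as meters, whereas image-based segmentation only supports drawing bounding boxes around objects of interest. In this paper, we first collect color camera information and depth information using a custom Android application on a Tango Phab2 phone. Second, we align the two data sources in time and space. Finally, we evaluate several ways of measuring the height of an object of interest in the captured images under a variety of settings.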
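The abstract mentions aligning the color and depth streams in time but does not say how. One common approach, shown here only as a minimal sketch and not as the paper's method, is to pair each color frame with the depth frame closest in time. The function name `align_by_timestamp`, the timestamp lists, and the `max_gap` threshold are all illustrative assumptions.

```python
from bisect import bisect_left


def align_by_timestamp(color_ts, depth_ts, max_gap=0.05):
    """Pair each color frame with the nearest depth frame in time.

    color_ts, depth_ts: sorted lists of timestamps in seconds
                        (hypothetical inputs, not from the paper).
    max_gap: largest tolerated time difference in seconds (assumed value).
    Returns a list of (color_index, depth_index) pairs.
    """
    pairs = []
    for i, t in enumerate(color_ts):
        j = bisect_left(depth_ts, t)
        # Candidates: the depth frame just before t and the one at or after t.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(depth_ts)]
        if not candidates:
            continue  # no depth frames at all
        best = min(candidates, key=lambda k: abs(depth_ts[k] - t))
        # Drop the color frame if even the nearest depth frame is too far away.
        if abs(depth_ts[best] - t) <= max_gap:
            pairs.append((i, best))
    return pairs
```

Color frames with no depth frame within `max_gap` are dropped rather than paired with a stale depth frame, which trades coverage for cleaner matches.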
---
English title:
"Object Localization and Size Estimation from RGB-D Images"
---
Authors:
ShreeRanjani SrirangamSridharan, Oytun Ulutan, Shehzad Noor Taus Priyo, Swati Rallapalli, Mudhakar Srivatsa
---
Latest submission year:
2018
---
Categories:

Top-level category: Computer Science
Subcategory: Computer Vision and Pattern Recognition
Description: Covers image processing, computer vision, pattern recognition, and scene understanding. Roughly includes material in ACM Subject Classes I.2.10, I.4, and I.5.
--
Top-level category: Electrical Engineering and Systems Science
Subcategory: Image and Video Processing
Description: Theory, algorithms, and architectures for the formation, capture, processing, communication, analysis, and display of images, video, and multidimensional signals in a wide variety of applications. Topics of interest include: mathematical, statistical, and perceptual image and video modeling and representation; linear and nonlinear filtering, de-blurring, enhancement, restoration, and reconstruction from degraded, low-resolution or tomographic data; lossless and lossy compression and coding; segmentation, alignment, and recognition; image rendering, visualization, and printing; computational imaging, including ultrasound, tomographic and magnetic resonance imaging; and image and video analysis, synthesis, storage, search and retrieval.
--

---
English abstract:
Depth sensing cameras (e.g., Kinect sensor, Tango phone) can acquire color and depth images that are registered to a common viewpoint. This opens the possibility of developing algorithms that exploit the advantages of both sensing modalities. Traditionally, cues from color images have been used for object localization (e.g., YOLO). However, the addition of a depth image can be further used to segment images that might otherwise have identical color information. Further, the depth image can be used for object size (height/width) estimation (in real-world measurement units, such as meters) as opposed to image-based segmentation that would only support drawing bounding boxes around objects of interest. In this paper, we first collect color camera information along with depth information using a custom Android application on a Tango Phab2 phone. Second, we perform timing and spatial alignment between the two data sources. Finally, we evaluate several ways of measuring the height of the object of interest within the captured images under a variety of settings.
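To illustrate why a registered depth image allows size estimates in meters, here is a minimal sketch based on the standard pinhole camera model: an object spanning h pixels vertically at depth Z, seen through a camera with vertical focal length f_y in pixels, has a real height of roughly h * Z / f_y. This is not necessarily one of the methods the paper evaluates. The function name `estimate_height_m`, the bounding-box layout, and the use of the median depth are all assumptions made for the example.

```python
from statistics import median


def estimate_height_m(depth, box, fy):
    """Estimate object height in meters from a depth image and a bounding box.

    depth: 2D list of depth values in meters, 0 meaning "no reading"
           (assumed encoding; real sensors differ).
    box:   (left, top, right, bottom) in pixels, right/bottom exclusive
           (hypothetical layout, e.g. from a detector such as YOLO).
    fy:    vertical focal length in pixels (from camera calibration).
    Returns None when the box holds no valid depth reading.
    """
    left, top, right, bottom = box
    # Collect valid depth readings inside the box.
    valid = [d for row in depth[top:bottom] for d in row[left:right] if d > 0]
    if not valid:
        return None
    # The median is less sensitive to background pixels than the mean.
    z = median(valid)
    # Pinhole model: real height = pixel height * depth / focal length.
    return (bottom - top) * z / fy


# Usage: a 4-pixel-tall box at about 2 m with fy = 500 px.
depth_image = [[2.0, 2.0, 2.0, 2.0] for _ in range(4)]
depth_image[1][1] = 0.0  # one missing reading
height = estimate_height_m(depth_image, (1, 0, 3, 4), 500.0)
```

The estimate assumes the object is roughly fronto-parallel to the camera and that the box tightly bounds it; a loose box or a tilted object would bias the result.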
---
PDF link:
https://arxiv.org/pdf/1808.00641
