Abstract (translated):
Multi-camera full-body pose capture of humans and animals in outdoor environments is a highly challenging problem. Our approach involves a team of cooperating micro aerial vehicles (MAVs) equipped with on-board cameras only. The key enabling aspect of our approach is the on-board person detection and tracking method. Recent state-of-the-art methods based on deep neural networks (DNNs) are highly promising in this context. However, real-time DNNs are severely constrained in their input data dimensions compared with available camera resolutions. As a result, DNNs often fail on objects that are far from the camera or small in scale, which is typical of aerial robot scenarios. The core problem addressed in this paper is therefore how to achieve on-board, real-time, continuous, and accurate vision-based detection with DNNs for visual person tracking by MAVs. Our solution leverages cooperation among multiple MAVs. First, each MAV fuses its own detections with those obtained by the other MAVs to perform cooperative visual tracking. This makes it possible to predict future poses of the tracked person, which are then used to selectively process only the relevant regions of future images, even at high resolutions. Consequently, using our DNN-based detector, we are able to continuously track even distant humans with high accuracy and speed. We demonstrate the effectiveness of our approach through real robot experiments in which two aerial robots track a person while maintaining an active, perception-driven formation. Our solution runs entirely on the on-board CPU and GPU of our MAVs, with no remote processing. ROS-based source code is provided for the benefit of the community.
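As an illustration of the selective-processing idea described above, the following minimal Python sketch projects a predicted 3D person position into the camera image and crops a fixed-size region of interest around it, so that only that crop would be passed to the detector. The pinhole camera model, the function names, and the numeric parameters are assumptions for illustration only and are not taken from the authors' released ROS code.

    import numpy as np

    def project_point(K, R, t, p_world):
        """Project a 3D world point to pixel coordinates (pinhole model)."""
        p_cam = R @ p_world + t
        if p_cam[2] <= 0:
            return None  # point lies behind the camera
        uvw = K @ p_cam
        return uvw[:2] / uvw[2]

    def roi_around_prediction(K, R, t, p_pred, img_w, img_h, roi_size=300):
        """Square ROI centred on the projected predicted position, clamped to the image."""
        uv = project_point(K, R, t, p_pred)
        if uv is None:
            return (0, 0, img_w, img_h)  # fall back to processing the full frame
        half = roi_size // 2
        x0 = int(np.clip(uv[0] - half, 0, img_w - roi_size))
        y0 = int(np.clip(uv[1] - half, 0, img_h - roi_size))
        return (x0, y0, x0 + roi_size, y0 + roi_size)

    # Example with made-up intrinsics and a person predicted ~12 m ahead.
    K = np.array([[1200.0, 0.0, 2048.0],
                  [0.0, 1200.0, 1080.0],
                  [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.zeros(3)
    print(roi_around_prediction(K, R, t, np.array([1.0, 0.5, 12.0]),
                                img_w=4096, img_h=2160))

Cropping a fixed-size window keeps the detector input at a constant, DNN-friendly resolution regardless of how large the full camera frame is.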
---
English title:
Deep Neural Network-based Cooperative Visual Tracking through Multiple Micro Aerial Vehicles
---
Authors:
Eric Price, Guilherme Lawless, Heinrich H. Bülthoff, Michael Black and Aamir Ahmad
---
Latest submission year:
2018
---
Classification:
Primary classification: Computer Science
Secondary classification: Robotics
Classification description: Roughly includes material in ACM Subject Class I.2.9.
--
Primary classification: Electrical Engineering and Systems Science
Secondary classification: Image and Video Processing
Classification description: Theory, algorithms, and architectures for the formation, capture, processing, communication, analysis, and display of images, video, and multidimensional signals in a wide variety of applications. Topics of interest include: mathematical, statistical, and perceptual image and video modeling and representation; linear and nonlinear filtering, de-blurring, enhancement, restoration, and reconstruction from degraded, low-resolution or tomographic data; lossless and lossy compression and coding; segmentation, alignment, and recognition; image rendering, visualization, and printing; computational imaging, including ultrasound, tomographic and magnetic resonance imaging; and image and video analysis, synthesis, storage, search and retrieval.
--
---
English abstract (original):
Multi-camera full-body pose capture of humans and animals in outdoor environments is a highly challenging problem. Our approach to it involves a team of cooperating micro aerial vehicles (MAVs) with on-board cameras only. The key enabling-aspect of our approach is the on-board person detection and tracking method. Recent state-of-the-art methods based on deep neural networks (DNN) are highly promising in this context. However, real time DNNs are severely constrained in input data dimensions, in contrast to available camera resolutions. Therefore, DNNs often fail at objects with small scale or far away from the camera, which are typical characteristics of a scenario with aerial robots. Thus, the core problem addressed in this paper is how to achieve on-board, real-time, continuous and accurate vision-based detections using DNNs for visual person tracking through MAVs. Our solution leverages cooperation among multiple MAVs. First, each MAV fuses its own detections with those obtained by other MAVs to perform cooperative visual tracking. This allows for predicting future poses of the tracked person, which are used to selectively process only the relevant regions of future images, even at high resolutions. Consequently, using our DNN-based detector we are able to continuously track even distant humans with high accuracy and speed. We demonstrate the efficiency of our approach through real robot experiments involving two aerial robots tracking a person, while maintaining an active perception-driven formation. Our solution runs fully on-board our MAV's CPU and GPU, with no remote processing. ROS-based source code is provided for the benefit of the community.
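The cooperative-tracking step in the abstract can likewise be illustrated with a small sketch. Under the simplifying assumption that each MAV converts its detection of the person into a 3D position estimate with a covariance and exchanges it with its teammates, the estimates can be fused in information form and propagated with a constant-velocity model to predict the next pose; the function names and noise values below are hypothetical and do not come from the released ROS code.

    import numpy as np

    def fuse_estimates(means, covs):
        """Fuse independent Gaussian position estimates (information form)."""
        info = np.zeros((3, 3))
        info_vec = np.zeros(3)
        for mu, cov in zip(means, covs):
            cov_inv = np.linalg.inv(cov)
            info += cov_inv
            info_vec += cov_inv @ mu
        fused_cov = np.linalg.inv(info)
        return fused_cov @ info_vec, fused_cov

    def predict_constant_velocity(pos, vel, dt):
        """Predict the person's position dt seconds ahead."""
        return pos + vel * dt

    # Example: the same person seen by two MAVs, the second one less certain.
    mean1, cov1 = np.array([4.0, 2.0, 0.9]), 0.2 * np.eye(3)
    mean2, cov2 = np.array([4.3, 1.8, 1.1]), 0.8 * np.eye(3)
    fused_pos, fused_cov = fuse_estimates([mean1, mean2], [cov1, cov2])
    vel = np.array([0.5, 0.0, 0.0])  # assumed walking-speed estimate
    print(fused_pos)
    print(predict_constant_velocity(fused_pos, vel, dt=0.1))

The fused estimate is weighted toward the more confident MAV, and the predicted position is what each MAV would use to choose which image region to process in the next frame.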
---
PDF link:
https://arxiv.org/pdf/1802.01346