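Abstract (translated):
Scene classification is a fundamental problem in understanding high-resolution remote sensing imagery. In recent years, convolutional neural networks (ConvNets) have achieved remarkable performance on a range of tasks, and a variety of representations have been developed for satellite image scene classification. In this paper, we propose a new ConvNet-based representation with context aggregation. The proposed two-pathway ResNet (ResNet-TP) architecture uses ResNet as its backbone, and the two pathways allow the network to model local details and regional context simultaneously. The ResNet-TP representation is generated by global average pooling over the last convolutional layers of both pathways. Experiments on two scene classification datasets, UCM Land Use and NWPU-RESISC45, show that the proposed mechanism achieves promising improvements over existing methods.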
---
English title:
Satellite Image Scene Classification via ConvNet with Context Aggregation
---
Authors:
Zhao Zhou, Yingbin Zheng, Hao Ye, Jian Pu, Gufei Sun
---
Latest submission year:
2018
---
Classification:
Primary category: Electrical Engineering and Systems Science
Secondary category: Image and Video Processing
Category description: Theory, algorithms, and architectures for the formation, capture, processing, communication, analysis, and display of images, video, and multidimensional signals in a wide variety of applications. Topics of interest include: mathematical, statistical, and perceptual image and video modeling and representation; linear and nonlinear filtering, de-blurring, enhancement, restoration, and reconstruction from degraded, low-resolution or tomographic data; lossless and lossy compression and coding; segmentation, alignment, and recognition; image rendering, visualization, and printing; computational imaging, including ultrasound, tomographic and magnetic resonance imaging; and image and video analysis, synthesis, storage, search and retrieval.
---
English abstract:
Scene classification is a fundamental problem to understand the high-resolution remote sensing imagery. Recently, convolutional neural network (ConvNet) has achieved remarkable performance in different tasks, and significant efforts have been made to develop various representations for satellite image scene classification. In this paper, we present a novel representation based on a ConvNet with context aggregation. The proposed two-pathway ResNet (ResNet-TP) architecture adopts the ResNet as backbone, and the two pathways allow the network to model both local details and regional context. The ResNet-TP based representation is generated by global average pooling on the last convolutional layers from both pathways. Experiments on two scene classification datasets, UCM Land Use and NWPU-RESISC45, show that the proposed mechanism achieves promising improvements over state-of-the-art methods.
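
Illustrative note: the abstract says the representation comes from global average pooling on the last convolutional layers of both pathways. The sketch below shows only that pooling step on toy feature maps stored as nested lists (channels x height x width). It is not the authors' implementation: the function names are hypothetical, the toy values are made up for illustration, and joining the two pooled vectors by concatenation is an assumption, since the abstract does not say how they are combined.

```python
from statistics import fmean


def global_average_pool(feature_map):
    """Reduce a C x H x W feature map (nested lists) to C per-channel means."""
    return [fmean(v for row in channel for v in row) for channel in feature_map]


def aggregate_two_pathways(local_map, context_map):
    """Pool each pathway's final feature map and join the results.

    Assumption: the pooled vectors are concatenated. The abstract only
    states that pooling is applied to both pathways.
    """
    return global_average_pool(local_map) + global_average_pool(context_map)


# Toy inputs (hypothetical values): a 2-channel "local" map and a
# 1-channel "context" map, each with 2 x 2 spatial size.
local_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
context_map = [
    [[2.0, 2.0], [2.0, 2.0]],
]

representation = aggregate_two_pathways(local_map, context_map)
print(representation)  # one value per channel, local channels first
```

In a real network the two inputs would be the final convolutional outputs of the two ResNet pathways, and the resulting vector would presumably be passed to a classifier.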
---
PDF link:
https://arxiv.org/pdf/1802.00631