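Abstract (translated from the Chinese):
A key challenge in learning-based video compression is that motion predictive coding, a highly effective video compression tool, is difficult to train as a neural network. This paper proposes the concept of the pixel-motion neural network (PixelMotionCNN, PMCNN), which comprises motion extension and hybrid prediction networks. PMCNN can model spatiotemporal coherence and thereby perform predictive coding effectively inside the learned network. Building on PMCNN, we further explore a learning-based video compression framework that adds iterative analysis/synthesis, binarization, and other components. Experimental results demonstrate the effectiveness of the scheme. Although this paper uses neither entropy coding nor complex configurations, we still show superior performance compared with MPEG-2 and achieve results comparable to H.264. The proposed learning-based video coding scheme offers a possible new direction for further improving the compression efficiency and functionality of video coding.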
---
English title:
Learning for Video Compression
---
Authors:
Zhibo Chen, Tianyu He, Xin Jin, Feng Wu
---
Year of latest submission:
2019
---
Classification:
Primary category: Computer Science
Secondary category: Multimedia
Category description: Roughly includes material in ACM Subject Class H.5.1.
--
Primary category: Electrical Engineering and Systems Science
Secondary category: Image and Video Processing
Category description: Theory, algorithms, and architectures for the formation, capture, processing, communication, analysis, and display of images, video, and multidimensional signals in a wide variety of applications. Topics of interest include: mathematical, statistical, and perceptual image and video modeling and representation; linear and nonlinear filtering, de-blurring, enhancement, restoration, and reconstruction from degraded, low-resolution or tomographic data; lossless and lossy compression and coding; segmentation, alignment, and recognition; image rendering, visualization, and printing; computational imaging, including ultrasound, tomographic and magnetic resonance imaging; and image and video analysis, synthesis, storage, search and retrieval.
--
---
English abstract:
One key challenge to learning-based video compression is that motion predictive coding, a very effective tool for video compression, can hardly be trained into a neural network. In this paper we propose the concept of PixelMotionCNN (PMCNN) which includes motion extension and hybrid prediction networks. PMCNN can model spatiotemporal coherence to effectively perform predictive coding inside the learning network. On the basis of PMCNN, we further explore a learning-based framework for video compression with additional components of iterative analysis/synthesis, binarization, etc. Experimental results demonstrate the effectiveness of the proposed scheme. Although entropy coding and complex configurations are not employed in this paper, we still demonstrate superior performance compared with MPEG-2 and achieve comparable results with H.264 codec. The proposed learning-based scheme provides a possible new direction to further improve compression efficiency and functionalities of future video coding.
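
The abstract names iterative analysis/synthesis and binarization as framework components without describing them further. As a rough illustration of that general idea only, the sketch below codes a block of values as successive one-bit planes: each pass measures the residual against the current reconstruction, binarizes it by sign, and refines the reconstruction. It is not the paper's method. The learned analysis and synthesis networks are replaced here by a fixed halving step size, and the names `binarize`, `encode`, and `decode` are hypothetical.

```python
def binarize(residual):
    """Map each residual value to one bit in {-1, +1} (sign binarization)."""
    return [1 if r >= 0 else -1 for r in residual]


def encode(block, iterations):
    """Code `block` (values assumed to lie in [-1, 1]) as `iterations` bit planes.

    Assumption: a fixed step that halves every pass stands in for the
    learned analysis/synthesis networks mentioned in the abstract.
    """
    recon = [0.0] * len(block)
    step = 0.5
    planes = []
    for _ in range(iterations):
        # Analysis: what the current reconstruction still gets wrong.
        residual = [x - y for x, y in zip(block, recon)]
        bits = binarize(residual)
        # Synthesis: refine the reconstruction using only the transmitted bits.
        recon = [y + step * b for y, b in zip(recon, bits)]
        planes.append(bits)
        step /= 2
    return planes


def decode(planes, length):
    """Rebuild the block from the bit planes alone, mirroring the encoder."""
    recon = [0.0] * length
    step = 0.5
    for bits in planes:
        recon = [y + step * b for y, b in zip(recon, bits)]
        step /= 2
    return recon


block = [0.8, -0.3, 0.05, -1.0]
planes = encode(block, 6)
recon = decode(planes, len(block))
max_error = max(abs(x - y) for x, y in zip(block, recon))
```

In this toy setting each extra pass costs one bit per value and halves the worst-case error, so `max_error` is at most 2^-6 after six passes. The decoder needs only the bit planes, which is the property that lets the bit rate be varied by stopping early.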
---
PDF link:
https://arxiv.org/pdf/1804.09869