Deep Image Compression via End-to-End Learning


2022-03-14

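Abstract (translated):
We present a lossy image compression method based on deep convolutional neural networks (CNNs) that, measured by multi-scale structural similarity (MS-SSIM) at the same bit rate, outperforms BPG, WebP, JPEG2000 and JPEG. Most current CNN-based approaches train the network with an L2 loss between the reconstruction and the ground truth in the pixel domain, which leads to over-smoothed results and degraded visual quality, especially at very low bit rates. We therefore add a perceptual loss and an adversarial loss to improve subjective quality. For better rate-distortion optimization (RDO), we also introduce easy-to-hard transfer learning when adding the quantization error and the rate constraint. Finally, we evaluate the method on the public Kodak set and on Test Dataset P/M released by the Computer Vision Lab of ETH Zurich, obtaining average BD-rate reductions of 7.81% and 19.1% over BPG, respectively.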
---
English title:
Deep Image Compression via End-to-End Learning
---
Authors:
Haojie Liu, Tong Chen, Qiu Shen, Tao Yue, Zhan Ma
---
Latest submission year:
2018
---
Categories:

Top-level category: Electrical Engineering and Systems Science
Subcategory: Image and Video Processing
Category description: Theory, algorithms, and architectures for the formation, capture, processing, communication, analysis, and display of images, video, and multidimensional signals in a wide variety of applications. Topics of interest include: mathematical, statistical, and perceptual image and video modeling and representation; linear and nonlinear filtering, de-blurring, enhancement, restoration, and reconstruction from degraded, low-resolution or tomographic data; lossless and lossy compression and coding; segmentation, alignment, and recognition; image rendering, visualization, and printing; computational imaging, including ultrasound, tomographic and magnetic resonance imaging; and image and video analysis, synthesis, storage, search and retrieval.
--
Top-level category: Computer Science
Subcategory: Computer Vision and Pattern Recognition
Category description: Covers image processing, computer vision, pattern recognition, and scene understanding. Roughly includes material in ACM Subject Classes I.2.10, I.4, and I.5.
--

---
Original English abstract:
We present a lossy image compression method based on deep convolutional neural networks (CNNs), which outperforms the existing BPG, WebP, JPEG2000 and JPEG as measured via multi-scale structural similarity (MS-SSIM), at the same bit rate. Currently, most of the CNNs based approaches train the network using a L2 loss between the reconstructions and the ground-truths in the pixel domain, which leads to over-smoothing results and visual quality degradation especially at a very low bit rate. Therefore, we improve the subjective quality with the combination of a perception loss and an adversarial loss additionally. To achieve better rate-distortion optimization (RDO), we also introduce an easy-to-hard transfer learning when adding quantization error and rate constraint. Finally, we evaluate our method on public Kodak and the Test Dataset P/M released by the Computer Vision Lab of ETH Zurich, resulting in averaged 7.81% and 19.1% BD-rate reduction over BPG, respectively.
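
The BD-rate figures in the abstract are Bjøntegaard delta rates: the average bit-rate difference between two codecs at equal quality. The sketch below shows one common way to compute it, assuming a cubic polynomial fit of log-rate against the quality score and integration over the overlapping quality range. The function name `bd_rate` and the sample numbers in the usage note are hypothetical, and this is not the authors' evaluation script.

```python
import numpy as np


def bd_rate(rate_anchor, quality_anchor, rate_test, quality_test):
    """Approximate Bjontegaard delta rate, in percent, of a test codec
    relative to an anchor codec. Negative values mean the test codec
    needs fewer bits for the same quality.

    Assumption: each curve has at least four rate/quality points and
    quality is on a roughly log-linear scale (e.g. PSNR or MS-SSIM in dB).
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    qa = np.asarray(quality_anchor, dtype=float)
    qt = np.asarray(quality_test, dtype=float)

    # Fit log-rate as a cubic polynomial of quality for each codec.
    poly_a = np.polyfit(qa, log_ra, 3)
    poly_t = np.polyfit(qt, log_rt, 3)

    # Compare only over the quality interval covered by both curves.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())

    # Average log-rate of each curve over [lo, hi] via the antiderivative.
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Convert the mean log-rate gap back to a percentage difference.
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0
```

For example, calling `bd_rate([0.1, 0.2, 0.4, 0.8], [10, 13, 16, 19], [0.09, 0.18, 0.36, 0.72], [10, 13, 16, 19])` compares a hypothetical codec that spends 10% fewer bits at every quality point against its anchor, so the result should be close to -10.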
---
PDF link:
https://arxiv.org/pdf/1806.01496
