Xin Yu · Peng Dai · Wenbo Li · Lan Ma · Jiajun Shen · Jia Li · Xiaojuan Qi
Dataset Code Paper
When photographing content displayed on a digital screen, frequency aliasing between the camera's color filter array (CFA) and the screen's LCD subpixels is inevitable. The captured images are thus mixed with colorful stripes, known as moiré patterns, which severely degrade their perceptual quality. Although a plethora of dedicated demoiréing methods have been proposed in the research community recently, they are still far from achieving promising results in real-world scenes.
The key limitation of these methods is that they are developed and evaluated only on low-resolution or synthetic images. However, with the rapid development of mobile devices, modern widely used mobile phones typically allow users to capture 4K-resolution (i.e., ultra-high-definition) images, so the effectiveness of these methods in this practical scenario is not guaranteed.
In this work, we explore moiré pattern removal for ultra-high-definition images. First, we propose the first ultra-high-definition demoiréing dataset (UHDM), which contains 5,000 real-world 4K-resolution image pairs, and conduct a benchmark study of current state-of-the-art methods. Then, we analyze the limitations of these methods and summarize their key issue: they are not scale-robust. To address this deficiency, we deliver a plug-and-play semantic-aligned scale-aware module (SAM) that helps us build a frustratingly simple baseline model for tackling 4K moiré images. Our framework is easy to implement and fast at inference, achieving state-of-the-art results on four demoiréing datasets while being much more lightweight. We hope our investigation inspires future research on this more practical setting for image demoiréing.
We show the visual results of our demoiréing method when applied to full-resolution 4K images from mobile phones (i.e., 4032x3024 to 4624x3472) below. Note that the default test setting in our benchmark is to process center-cropped images at standard 4K resolution (3840x2160). Since our method is lightweight enough to handle full-resolution images, we present results on full 4K scenes here. To play with the demo, click the button in the middle of each image and slide it to the right. For results on other datasets, please check our paper or try our code directly.
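For concreteness, here is a minimal sketch of the benchmark's default preprocessing step: center-cropping a full-resolution mobile capture (e.g., 4032x3024) to standard 4K (3840x2160). The use of torchvision and the function name `to_standard_4k` are illustrative assumptions; any image library would do.

```python
import torch
from torchvision.transforms.functional import center_crop

def to_standard_4k(img: torch.Tensor) -> torch.Tensor:
    """Center-crop a (C, H, W) image tensor to standard 4K (2160x3840)."""
    return center_crop(img, [2160, 3840])

# e.g., a 3024x4032 mobile capture becomes a 2160x3840 benchmark input
img = torch.rand(3, 3024, 4032)
assert to_standard_4k(img).shape == (3, 2160, 3840)
```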
For training 4K demoiréing models and evaluating existing methods, we collect a large-scale ultra-high-definition demoiréing dataset (UHDM). We shoot screen images from different camera views to produce varied pattern appearances, and combine multiple devices to produce diverse degradation styles (including pattern appearance and global color style), boosting the generalization ability of trained methods. Our dataset contains 5,000 image pairs in total, randomly split into 4,500 for training and 500 for testing.
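A minimal sketch of loading UHDM-style moiré/clean pairs with PyTorch is shown below. The directory layout (`moire/` and `clean/` subfolders with matching filenames under a `train` or `test` split) is a hypothetical assumption for illustration; the released dataset may be organized differently.

```python
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class UHDMPairs(Dataset):
    """Paired moiré/clean images; assumed layout: <root>/<split>/{moire,clean}/<name>."""

    def __init__(self, root: str, split: str = "train"):
        self.moire_dir = os.path.join(root, split, "moire")
        self.clean_dir = os.path.join(root, split, "clean")
        # Pairs are matched by filename across the two folders.
        self.names = sorted(os.listdir(self.moire_dir))
        self.to_tensor = transforms.ToTensor()

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        moire = Image.open(os.path.join(self.moire_dir, name)).convert("RGB")
        clean = Image.open(os.path.join(self.clean_dir, name)).convert("RGB")
        return self.to_tensor(moire), self.to_tensor(clean)
```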
Based on our dataset, we conduct a benchmark study of state-of-the-art methods. Our empirical study reveals that most methods struggle to simultaneously remove moiré patterns spanning a much wider range of scales in 4K images while keeping computational cost tractable and preserving image fine details. We attribute their deficiencies to the lack of an effective multi-scale feature extraction strategy (see our paper for details). Existing methods use features from different depths to obtain multi-scale representations. However, features at different depths carry different levels of semantic information, so they cannot represent multi-scale information at the same semantic level, which provides important cues for boosting a model's multi-scale modeling capability.

To this end, we deliver SAM, a module that extracts multi-scale features within the same semantic level and allows them to interact and be dynamically fused, significantly improving the model's ability to handle moiré patterns with a wide range of scales. SAM incorporates a pyramid context extraction module to effectively and efficiently extract multi-scale features aligned at the same semantic level. Further, a cross-scale dynamic fusion module selectively fuses the multi-scale features, with fusion weights that are learned and dynamically adapted to individual images. Equipped with SAM, we can build an efficient and scale-robust single-stage framework, avoiding the complex coarse-to-fine two-stage modeling used by previous work on 1080p image demoiréing, which incurs heavy computational cost when applied to 4K images yet still fails to remove moiré patterns sufficiently.
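To make the two ingredients concrete, here is a minimal PyTorch sketch of a SAM-style block. The layer sizes, the use of shared convolutions across scales, and the softmax-normalized fusion weights are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticAlignedScaleAwareModule(nn.Module):
    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # Shared weights keep every branch at the same semantic level:
        # all scales pass through the *same* convolutions.
        self.shared = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Predict one fusion weight per scale from global context,
        # so the fusion adapts dynamically to each input image.
        self.weight_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, len(scales), 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        branches = []
        for s in self.scales:
            # Pyramid context extraction: process the feature map at
            # several resolutions, then align them back to full size.
            feat = F.avg_pool2d(x, kernel_size=s) if s > 1 else x
            feat = self.shared(feat)
            if s > 1:
                feat = F.interpolate(feat, size=(h, w), mode="bilinear",
                                     align_corners=False)
            branches.append(feat)
        # Cross-scale dynamic fusion: per-image softmax weights over scales.
        weights = torch.softmax(self.weight_head(x), dim=1)  # (N, S, 1, 1)
        fused = torch.zeros_like(x)
        for i, feat in enumerate(branches):
            fused = fused + weights[:, i:i + 1] * feat
        return x + fused

# Usage: drop the module into any backbone operating on feature maps.
sam = SemanticAlignedScaleAwareModule(64)
out = sam(torch.randn(1, 64, 128, 128))  # shape preserved: (1, 64, 128, 128)
```

Sharing the convolutions across pyramid branches is one simple way to keep the multi-scale features semantically aligned; the actual module in the paper may realize this alignment differently.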
Last updated: July 2022 · Contact: yuxin27g@gmail.com