Dual-domain Deep Convolutional Neural Networks for Image Demoireing

An Gia Vien, Hyunkook Park, and Chul Lee
Department of Multimedia Engineering
Dongguk University, Seoul, Korea
viengiaan@mme.dongguk.edu, email@example.com, firstname.lastname@example.org

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF2019R1A2C4069806).

Abstract

We develop deep convolutional neural networks (CNNs) for moiré artifact removal by exploiting the complex properties of moiré patterns in multiple complementary domains, i.e., the pixel and frequency domains. In the pixel domain, we employ multi-scale features to remove the moiré artifacts associated with specific frequency bands using multi-resolution feature maps. In the frequency domain, we design a network that processes discrete cosine transform (DCT) coefficients to remove moiré artifacts. Next, we develop a dynamic filter generation network that learns dynamic blending filters. Finally, the results from the pixel and frequency domains are combined using the blending filters to yield moiré-free images. In addition, we extend the proposed approach to arbitrary-length burst image demoireing. Specifically, we develop a new attention network to effectively extract useful information from each image in the burst and align it with the reference image. We demonstrate the effectiveness of the proposed demoireing algorithm by evaluating it on the test set of the NTIRE 2020 Demoireing Challenge: Track 1 (Single image) and Track 2 (Burst).

1. Introduction

Moiré patterns occur in images captured by digital cameras when the subject contains repetitive details that exceed the resolution of the camera sensor. The captured images contain strange-looking patterns called moiré artifacts. Moiré patterns have various and complex shapes; they can appear as stripes, curves, and ripples. Furthermore, moiré patterns introduce color variations that are superposed onto the images. Moiré artifacts significantly degrade the visual quality of images and the performance of subsequent image processing and computer vision applications. Thus, it is crucial to develop effective demoireing algorithms to remove moiré artifacts, and many of these have recently been proposed.

The most common classical approach to removing moiré patterns is to add an optical low-pass filter to the camera lens for anti-aliasing. However, this approach may cause over-smoothing due to the loss of high-frequency components. Another approach employs color filter array subsampling based on the gradients of color difference interpolation. However, this approach has a high computational complexity, which renders it unsuitable for practical applications, and its output quality relies heavily on the green channel. Recently, a signal processing-based approach that explores low-rank and sparsity constraints for moiré pattern removal in the frequency domain was developed. However, it may fail in regions with complex moiré patterns.

Recent works have shown that deep learning-based approaches are more effective than model-based algorithms. For example, Sun et al. proposed a method for modeling moiré patterns by learning from a huge dataset. Although it provides better results than model-based approaches, the network may yield poor results when test images are taken with camera settings different from those of the training data. He et al. developed a neural network for moiré pattern removal by investigating multiple properties of the moiré patterns in the pixel domain. More recently, in the AIM 2019 Demoireing Challenge, several deep learning-based approaches have been proposed to remove moiré artifacts from images captured from monitors. However, these approaches still have difficulties in removing severe moiré artifacts or those with strong color textures due to the lack of accurate models of moiré patterns.

In this work, we develop deep convolutional neural networks (CNNs) to remove moiré artifacts in images in multiple complementary domains, specifically in the pixel and frequency domains. The proposed network is composed of three subnetworks: the pixel network, frequency network, and fusion network. First, the pixel network converts images with moiré patterns into feature maps and processes these features at different resolutions, as a moiré pattern
spans a wide range of frequencies. Second, inspired by recent observations on moiré patterns [32, 13], moiré artifacts are removed in the frequency domain using the discrete cosine transform (DCT). The frequency network processes DCT coefficients to remove moiré artifacts in the frequency domain. Subsequently, the dynamic filter generation network learns the dynamic blending filters. Finally, the outputs of the pixel network and the frequency network are combined by the dynamic blending filters to generate a moiré-free image. Additionally, we extend the proposed network to burst-image demoireing, which removes moiré artifacts using multiple images of the same scene with different geometric transformations. We demonstrate the effectiveness of the proposed demoireing algorithm through the NTIRE 2020 Demoireing Challenge. We achieved an average PSNR of 38.28 dB for Track 1 (Single image) and 38.50 dB for Track 2 (Burst).

The remainder of this paper is organized as follows. Section 2 provides an overview of related works. Section 3 describes the proposed algorithm. Section 4 discusses the experimental results. Finally, Section 5 concludes the paper.

2. Related Works

Moiré pattern removal: Moiré patterns are a common degradation that occurs in images captured by conventional cameras because of the interference between the frequencies of textures in images or display screens and those of camera sensors. Several algorithms have been proposed to remove different types of moiré patterns. For example, Yang et al. proposed a low-rank constraint and sparse matrix decomposition to remove moiré patterns in high-frequency textures by observing and analyzing the moiré patterns of textures in the frequency domain. Recently, several deep learning-based demoireing algorithms have been developed and have been shown to be more effective than conventional model-based algorithms. Sun et al. exploited intrinsic correlations between moiré patterns and image components at different levels in a multi-resolution pyramid network. He et al. proposed a framework to remove moiré patterns by considering three components: a multi-scale feature aggregation in the pixel domain, an edge predictor to estimate the edge map of moiré-free images, and an appearance classification to classify moiré patterns using multiple appearance labels. Recently, in the AIM 2019 Demoireing Challenge, several deep network architectures were proposed. These networks employed state-of-the-art blocks and modules that have been applied to image restoration to remove moiré patterns.

Image restoration: Image restoration generally focuses on noise removal, contrast enhancement, or high-frequency detail reconstruction. Deep learning models have been successfully applied to image restoration tasks, including super-resolution [10, 16, 36, 18], denoising [37, 19], deblurring, dehazing [4, 35], and compression artifact reduction [9, 5]. These learning-based algorithms have achieved state-of-the-art performance in image quality improvement. It was shown that a block or module developed for a certain image restoration task also performs well in other restoration tasks.

Demoireing can be considered as image restoration, as it attempts to reconstruct a clean image by removing moiré artifacts. Among the state-of-the-art modules and blocks in image restoration, the dense block (DB) [14, 29, 35] and residual dense block (RDB) are most closely related to demoireing. DB is effective in super-resolution because it preserves low-level information to reconstruct high-frequency details, while RDB is an extension of DB that extracts abundant local features via densely connected convolutional layers and avoids vanishing gradients in a deep network. Because our goal is not only to remove moiré patterns but also to reconstruct missing information, DB and RDB are important and relevant modules for this work. However, because moiré patterns are complex and difficult to distinguish from texture and color in images, the straightforward adoption of DB and RDB modules may not provide high performance.

Attention mechanisms in deep learning: Attention mechanisms help deep neural networks determine where to focus and improve the representation of the features of interest. Recently, attention mechanisms have been shown to be a critical component in deep learning and have been extensively employed in computer vision [39, 21, 1, 11]. Among the many variations of the attention module, the convolutional block attention module (CBAM) showed efficacy in image denoising [3, 27] and super-resolution [8, 7] because it directs the network to focus on essential features and suppresses unnecessary ones. Therefore, we employ CBAM in the proposed network.

Additionally, there is an approach that incorporates attention processing to allow the models to identify misaligned regions before merging the features. By determining misaligned image regions at an early stage of the network, the algorithm yields high-quality results with fewer artifacts than conventional algorithms. Thus, we employ the attention module of that approach for burst-image demoireing to avoid misaligned features before moiré pattern removal.

3. Proposed Algorithms

3.1. Dual-domain Network

We design a dual-domain network to effectively remove moiré artifacts and generate a high-quality clean image. Figure 1 shows the architecture of the proposed network, which takes the moiré image as input and then reconstructs a clean image. Specifically, the proposed dual-domain network is composed of three subnetworks: the pixel network, frequency network, and fusion network. First, we estimate the individual results produced by the pixel network and the frequency network. Subsequently, the fusion network uses these results to generate moiré-free images by learning the dynamic filters.

Figure 1: Architecture of the proposed dual-domain network.

Pixel network: The pixel network processes an input image in the pixel value domain. Based on the recent observation that multi-scale contextual information is effective in image restoration [6, 12], we use multi-scale features in the pixel value domain. In addition, assuming that the moiré artifacts have global structures in images, we first extract edge maps using the edge extraction layer and concatenate them with the inputs. The pixel network is composed of multiple branches of different resolutions. The branch at the top level processes feature maps of the original resolution of the input image, while the other branches process coarser feature maps. The first convolutional layer (Conv) in each branch, with a kernel size of 2 × 2 and a stride of 2, downsamples the feature map from the higher-level branch by a factor of 2. By converting the input image into multiple feature maps at different resolutions, we exploit different levels of detail from the input images. At each branch, the output feature maps from the first layer are fed into a sequence of an attention dense block (ADB) and residual attention dense blocks (RADB), as shown in Figures 2(a) and (b), respectively, which are composed of CBAM, DB [14, 29], and RDB. We increase the resolution of the feature map at each branch using the upsampling module, which is a combination of a single convolutional layer and pixel shuffle, and then concatenate the resulting feature map with that of the finer branch. Finally, at the end of the top branch, a convolutional layer is used to generate the final output.

Figure 2: Architectures of the ADB, RADB, and fusion network. (a) ADB, (b) RADB, (c) fusion network.

Frequency network: Moiré patterns are complex in terms of distributions in the frequency domain. According to a previous observation, images with moiré patterns are indistinguishable from clean images, as moiré patterns are spread across a wide range of frequency bands. Therefore, exploring the properties of moiré patterns in the frequency
domain is necessary for their efficient removal. Thus, we develop an additional subnetwork to process the DCT coefficients of the input images to remove moiré artifacts in the frequency domain. The network is built by cascading a single ADB and three RADBs, shown in Figures 2(a) and (b), respectively, and a convolutional layer. The final output image is obtained by applying the inverse discrete cosine transform (IDCT).

Fusion network: We obtain the moiré-free image by combining the two images obtained from the pixel network and the frequency network, as shown in Figure 1. Since these two images have different characteristics, they are used as complementary candidates of the moiré-free image. A common approach to yield the final result from the two candidates is to use a convolutional layer with a kernel size of 1 × 1 for pixel-wise blending. However, this approach may cause the final image to retain the artifacts if either network fails to accurately remove them. To alleviate this issue, we instead employ a dynamic filter generation network that takes the aforementioned pair of candidates as input and outputs local blending filters. These filters are then used to yield the moiré-free image. Figure 2(c) shows the proposed fusion network using dynamic local blending filters. The coefficients of the filters are learned through a dynamic blending filter network.

3.2. Extension to Burst-Image Demoireing

We extend the dual-domain network in Section 3.1 to burst-image demoireing. Burst-image demoireing considers a set of images captured of the same target, where each image has a different geometric transformation. While each image contains different moiré patterns, they still retain pieces of useful information about the underlying clean image. This additional information in the sequence should be exploited for the effective removal of moiré artifacts. To this end, we add a subnetwork to the pixel network, i.e., the attention network, for feature extraction and alignment, as shown in Figure 3. The architecture of the proposed attention network is illustrated in Figure 4.

Figure 3: Architecture of the proposed network for burst-image demoireing.

Figure 4: Architecture of the proposed attention network.

Attention network: The advantage of burst images over a single image is the redundant information across the images. To fully exploit this advantage, we develop an attention network composed of global feature extraction, feature alignment, and feature merging. First, global feature extraction is performed by the first global module in the attention network. The key design goal is to capture the variations between the images and generate global feature maps that contain additional information from the burst that can be combined effectively. Additionally, to ensure that the additional information can directly influence the moiré pattern removal, we fuse the global features with their input features using a convolutional layer. Second, feature alignment considers the misalignment in the images due to the different geometric transformations and selects the center image as the reference. We employ the attention module discussed in Section 2 to predict the attention maps against the reference. These attention maps can suppress the different geometric distortions in the non-reference images, which prevents undesirable features from reaching the merging process, whose results are used as inputs for the dual-domain network. Finally, feature merging processes the stack of aligned features with respect to the reference and the global module to obtain the final global features containing additional information for the moiré artifact removal network.

Dual-domain network: We use the same frequency network as in Section 3.1 for burst-image demoireing. Specifically, the frequency network takes only the center image in the sequence as input and removes moiré artifacts therein.

3.3. Implementation Details

The proposed algorithm includes three neural networks: the pixel network, frequency network, and fusion network. In our implementation, each convolution is followed by a scaled exponential linear unit (SELU) activation function. In each training batch, we apply geometric transformations of ±90° and 180° rotations and horizontal and vertical flipping, thereby producing seven additional augmented versions of each image. We trained the networks using the AdamW optimizer with β1 = 0.9 and β2 = 0.999. The learning rate was fixed to 10⁻³, and the batch size was set to 16. We trained the network using the L2 loss. We implemented our model using PyTorch.

We experimentally found that training the networks separately is more efficient than end-to-end training in terms of both training time and memory space. Thus, we first trained the pixel network and the frequency network separately. Then, we trained the fusion network. In the challenge, we also employed ensemble strategies by applying geometric transformations of ±90° and 180° rotations and horizontal and vertical flipping, thereby producing seven additional augmented versions of each image. Therefore, we obtain eight different output images using the proposed network. Inverse geometric transformations are applied to the output images generated from the augmented images. The final output image is then generated by averaging the eight output images.

We used the training dataset provided by the NTIRE 2020 Demoireing Challenge. Because the challenge does not provide ground-truth images for validation, we randomly selected 500 images from the training dataset as the test set for the experiments. Thus, the new training dataset contains 9,500 images out of 10,000. The training took approximately two days for single images and three days for burst images using a computer with an Intel Core i9-9900X @ 4.4 GHz CPU, 64 GB RAM, and an Nvidia RTX 2080 Ti GPU.

4. Experimental Results

4.1. Quantitative and Qualitative Evaluation

In the testing phase, the dual-domain network is fully end-to-end, in that it takes a moiré image as input and produces a moiré-free image. Table 1 shows the quantitative comparisons on the test set for both tracks in the challenge,

Figure 5: Single-image demoireing results for the test set. (a) Ground-truth, (b) moiré images, and outputs of (c) pixel network, (d) frequency network, and (e) fusion network. PSNR scores are provided below each image.

Table 1: Quantitative comparison of the demoireing performances of the proposed dual-domain network in the NTIRE 2020 Demoireing Challenge: Track 1 (Single image) and Track 2 (Burst)
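The ensemble strategy described in Section 3.3 (eight geometric variants, inverse transforms, averaging) can be sketched as follows. This is a minimal illustration on plain nested lists rather than PyTorch tensors, not the authors' code. The helper names and the choice of the eight variants as four rotations with an optional horizontal flip are assumptions made for illustration; `model` stands for any function that maps an image to an image of the same size.

```python
def rot90(img):
    """Rotate a 2-D list by 90 degrees counterclockwise."""
    rows, cols = len(img), len(img[0])
    return [[img[r][c] for r in range(rows)] for c in range(cols - 1, -1, -1)]


def hflip(img):
    """Mirror a 2-D list left to right."""
    return [row[::-1] for row in img]


def apply_transform(img, k, flip):
    """Optionally flip, then rotate k times by 90 degrees."""
    out = hflip(img) if flip else img
    for _ in range(k):
        out = rot90(out)
    return out


def invert_transform(img, k, flip):
    """Undo apply_transform: rotate back first, then undo the flip."""
    out = img
    for _ in range((4 - k) % 4):
        out = rot90(out)
    return hflip(out) if flip else out


def self_ensemble(model, img):
    """Average the model outputs over 8 geometric variants of img."""
    rows, cols = len(img), len(img[0])
    total = [[0.0] * cols for _ in range(rows)]
    # Assumed variant set: 4 rotations, each with and without a flip.
    variants = [(k, flip) for flip in (False, True) for k in range(4)]
    for k, flip in variants:
        output = model(apply_transform(img, k, flip))
        restored = invert_transform(output, k, flip)
        for r in range(rows):
            for c in range(cols):
                total[r][c] += restored[r][c]
    return [[v / len(variants) for v in row] for row in total]
```

With an identity model, `self_ensemble` returns the input unchanged, which is a quick check that every variant is inverted correctly before averaging. A model whose error depends on orientation has that error spread evenly over the eight orientations.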
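The data flow of the frequency network in Section 3.1 (DCT, processing of the coefficients, IDCT) can be illustrated with a small pure-Python sketch. The assumptions here are loud ones: the transform is an orthonormal DCT-II applied to a whole single-channel image, and `process` is a hypothetical placeholder for the learned ADB/RADB stack, which the sketch does not implement. The paper does not state the DCT normalization or block size.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n nested list."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Transpose a nested list."""
    return [list(col) for col in zip(*a)]


def dct2(img):
    """2-D DCT computed as C_rows * img * C_cols^T."""
    c_rows, c_cols = dct_matrix(len(img)), dct_matrix(len(img[0]))
    return matmul(matmul(c_rows, img), transpose(c_cols))


def idct2(coeffs):
    """Inverse 2-D DCT; the basis is orthonormal, so inverse = transpose."""
    c_rows, c_cols = dct_matrix(len(coeffs)), dct_matrix(len(coeffs[0]))
    return matmul(matmul(transpose(c_rows), coeffs), c_cols)


def frequency_branch(img, process):
    """DCT -> process the coefficients -> IDCT.

    `process` is a placeholder for the learned network (an assumption).
    """
    return idct2(process(dct2(img)))
```

Passing an identity function as `process` reproduces the input up to floating-point error, which confirms that any change in the output comes only from what is done to the coefficients.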
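The local blending step of the fusion network in Section 3.1 can be sketched as a per-pixel filtering of the two candidate images. This is only an illustration under stated assumptions: the kernel size `k`, the layout of the weights, and the zero padding at the borders are all hypothetical, and the filters are passed in directly, whereas in the paper they are produced by the dynamic filter generation network.

```python
def local_blend(pixel_out, freq_out, filters, k=3):
    """Blend two candidate images with a separate filter at every pixel.

    filters[r][c] holds 2 * k * k weights: the first k * k apply to the
    k x k neighbourhood of pixel_out, the rest to that of freq_out.
    Positions outside the image contribute zero (assumed padding).
    """
    rows, cols, half = len(pixel_out), len(pixel_out[0]), k // 2
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            weights = filters[r][c]
            acc = 0.0
            for idx, cand in enumerate((pixel_out, freq_out)):
                for dr in range(-half, half + 1):
                    for dc in range(-half, half + 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols:
                            pos = idx * k * k + (dr + half) * k + (dc + half)
                            acc += weights[pos] * cand[rr][cc]
            result[r][c] = acc
    return result


def center_filter(w_pixel, w_freq, k=3):
    """Filter that only weights the centre pixel of each candidate."""
    weights = [0.0] * (2 * k * k)
    weights[(k * k) // 2] = w_pixel
    weights[k * k + (k * k) // 2] = w_freq
    return weights
```

With `center_filter(0.5, 0.5)` at every position, `local_blend` reduces to the plain pixel-wise average that a fixed 1 × 1 convolution would give. Varying the weights per pixel is what lets the blend favor whichever candidate is cleaner in a given region.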