HDR Image Reconstruction From A Single Exposure Using Deep CNNs


GABRIEL EILERTSEN, Linköping University, Sweden
JOEL KRONANDER, Linköping University, Sweden
GYORGY DENES, University of Cambridge, UK
RAFAŁ K. MANTIUK, University of Cambridge, UK
JONAS UNGER, Linköping University, Sweden

Fig. 1. The exposure of the input LDR image in the bottom left has been reduced by 3 stops, revealing loss of information in saturated image regions. Using the proposed CNN trained on HDR image data, we can reconstruct the highlight information realistically (top right). The insets show that the high luminance of the street lights can be recovered (top row), as well as colors and details of larger saturated areas (bottom row). The exposures of the insets have been reduced by 5 and 4 stops in the top and bottom rows, respectively, in order to facilitate comparisons. All images have been gamma corrected for display.

Camera sensors can only capture a limited range of luminance simultaneously, and in order to create high dynamic range (HDR) images a set of different exposures are typically combined. In this paper we address the problem of predicting information that has been lost in saturated image areas, in order to enable HDR reconstruction from a single exposure. We show that this problem is well-suited for deep learning algorithms, and propose a deep convolutional neural network (CNN) that is specifically designed taking into account the challenges in predicting HDR values. To train the CNN we gather a large dataset of HDR images, which we augment by simulating sensor saturation for a range of cameras. To further boost robustness, we pre-train the CNN on a simulated HDR dataset created from a subset of the MIT Places database.
We demonstrate that our approach can reconstruct high-resolution visually convincing HDR results in a wide range of situations, and that it generalizes well to reconstruction of images captured with arbitrary and low-end cameras that use unknown camera response functions and post-processing. Furthermore, we compare to existing methods for HDR expansion, and show high quality results also for image based lighting. Finally, we evaluate the results in a subjective experiment performed on an HDR display. This shows that the reconstructed HDR images are visually convincing, with large improvements as compared to existing methods.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2017 Association for Computing Machinery.
0730-0301/2017/11-ART178 $15.00
https://doi.org/10.1145/3130800.3130816

CCS Concepts: • Computing methodologies → Image processing; Neural networks;

Additional Key Words and Phrases: HDR reconstruction, inverse tone-mapping, deep learning, convolutional network

ACM Reference format:
Gabriel Eilertsen, Joel Kronander, Gyorgy Denes, Rafał K. Mantiuk, and Jonas Unger. 2017. HDR image reconstruction from a single exposure using deep CNNs. ACM Trans. Graph. 36, 6, Article 178 (November 2017), 15 pages. https://doi.org/10.1145/3130800.3130816

1 INTRODUCTION

High dynamic range (HDR) images can significantly improve the viewing experience, whether viewed on an HDR capable display or by means of tone-mapping.
With the graphics community as an early adopter, HDR images are now routinely used in many applications including photorealistic image synthesis and a range of post-processing operations; for an overview see [Banterle et al. 2011; Dufaux et al. 2016; Reinhard et al. 2010]. The ongoing rapid development of HDR technologies and cameras has now made it possible to collect the data required to explore recent advances in deep learning for HDR imaging problems.

In this paper, we propose a novel method for reconstructing HDR images from low dynamic range (LDR) input images, by estimating missing information in bright image parts, such as highlights, lost due to saturation of the camera sensor. We base our approach on a fully convolutional neural network (CNN) design in the form of a hybrid dynamic range autoencoder. Similarly to deep autoencoder architectures [Hinton and Salakhutdinov 2006; Vincent et al. 2008], the LDR input image is transformed by an encoder network to produce a compact feature representation of the spatial context of the image. The encoded image is then fed to an HDR decoder network, operating in the log domain, to reconstruct an HDR image. Furthermore, the network is equipped with skip-connections that transfer data between the LDR encoder and HDR decoder domains in order to make optimal use of high resolution image details in the reconstruction. For training, we first gather data from a large set of existing HDR image sources in order to create a training dataset. For each HDR image we then simulate a set of corresponding LDR exposures using a virtual camera model. The network weights are optimized over the dataset by minimizing a custom HDR loss function. As the amount of available HDR content is still limited we utilize transfer-learning, where the weights are pre-trained on a large set of simulated HDR images, created from a subset of the MIT Places database [Zhou et al. 2014].

Expansion of LDR images for HDR applications is commonly referred to as inverse tone-mapping (iTM). Most existing inverse tone-mapping operators (iTMOs) are not very successful in reconstruction of saturated pixels. This has been shown in a number of studies [Akyüz et al. 2007; Masia et al. 2009], in which naïve methods or non-processed images were preferred over the results of those operators. The existing operators focus on boosting the dynamic range to look plausible on an HDR display, or on producing rough estimates needed for image based lighting (IBL). The proposed method demonstrates a step improvement in the quality of reconstruction, in which the structures and shapes in the saturated regions are recovered. It offers a range of new applications, such as exposure correction, tone-mapping, or glare simulation.

The main contributions of the paper can be summarized as:

(1) A deep learning system that can reconstruct a high quality HDR image from an arbitrary single exposed LDR image, provided that saturated areas are reasonably small.
(2) A hybrid dynamic range autoencoder that is tailored to operate on LDR input data and output HDR images. It utilizes HDR specific transfer-learning, skip-connections, color space and loss function.
(3) The quality of the HDR reconstructions is confirmed in a subjective evaluation on an HDR display, where predicted images are compared to HDR and LDR images as well as a representative iTMO, using a random selection of test images in order to avoid bias in image selection.
(4) The HDR reconstruction CNN, together with trained parameters, is made available online, enabling prediction from any LDR images: https://github.com/gabrieleilertsen/hdrcnn

2 RELATED WORK

2.1 HDR reconstruction

In order to capture the entire range of luminance in a scene it is necessary to use some form of exposure multiplexing. While static scenes commonly are captured using multiplexing exposures in the time domain [Debevec and Malik 1997; Mann and Picard 1994; Unger and Gustavson 2007], dynamic scenes can be challenging as robust exposure alignment is needed. This can be solved by techniques such as multi-sensor imaging [Kronander et al. 2014; Tocci et al. 2011] or by varying the per-pixel exposure [Nayar and Mitsunaga 2000] or gain [Hajisharif et al. 2015]. Furthermore, saturated regions can be encoded in glare patterns [Rouf et al. 2011] or with convolutional sparse coding [Serrano et al. 2016]. However, all these approaches introduce other limitations such as bulky and custom built systems, calibration problems, or decreased image resolution. Here, we instead tackle the problem by reconstructing visually convincing HDR images from single images that have been captured using standard cameras, without any assumptions on the imaging system or camera calibration.

2.2 Inverse tone-mapping

Inverse tone-mapping is a general term used to describe methods that utilize LDR images for HDR image applications [Banterle et al. 2006]. The intent of different iTMOs may vary. If it is to display standard images on HDR capable devices, maximizing the subjective quality, there is some evidence that global pixel transformations may be preferred [Masia et al. 2009]. Given widely different input materials, such methods are less likely to introduce artifacts compared to more advanced strategies. The transformation could be a linear scaling [Akyüz et al. 2007] or some non-linear function [Masia et al. 2009, 2017]. These methods modify all pixels without reconstructing any of the lost information.

A second category of iTMOs attempts to reconstruct saturated regions to mimic a true HDR image. These are expected to generate results that look more like a reference HDR, which was also indicated by the pair-wise comparison experiment on an HDR display performed by Banterle et al. [2009]. Meylan et al. [2006] used a linear transformation, but applied different scalings in highlight regions. Banterle et al. [2006] first linearized the input image, followed by boosting highlights using an expand map derived from the median cut algorithm. The method was extended for video processing, and with additional features such as automatic calibration and cross-bilateral filtering of the expand map [Banterle et al. 2008]. Rempel et al. [2007] also utilized an expand map, but computed this from Gaussian filtering in order to achieve real-time performance. Wang et al. [2007] applied inpainting techniques on the reflectance component of highlights. The method is limited to textured highlights, and requires some manual interaction. Another semi-manual method was proposed by Didyk et al. [2008], separating the image into diffuse, reflections and light sources. The reflections and light sources were enhanced, while the diffuse component was left unmodified. More recent methods include the iTMO by Kovaleski and Oliveira [2014], which focuses on achieving good results over a wide range of exposures, making use of a cross-bilateral expand map [Kovaleski and Oliveira 2009].

For an in-depth overview of inverse tone-mapping we refer to the survey by Banterle et al. [2009]. Compared to the existing iTMOs, our approach achieves significantly better results by learning from exploring a wide range of different HDR scenes. Furthermore, the reconstruction is completely automatic with no user parameters and runs within a second on modern hardware.
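For concreteness, the first category of iTMOs above (global pixel transformations) amounts to linearizing the display values with an assumed response curve and scaling the result. The sketch below illustrates this; the gamma value and peak luminance are illustrative assumptions, not parameters taken from any of the cited operators:

```python
def global_itmo(ldr_pixels, gamma=2.2, peak_nits=1000.0):
    """Naive global expansion: linearize LDR display values with an
    assumed gamma curve, then scale linearly to a target peak luminance.
    Every pixel is modified, but no saturated content is reconstructed."""
    return [peak_nits * (v ** gamma) for v in ldr_pixels]

# Black stays black, display white maps to the peak luminance,
# and mid-grey is expanded non-linearly in between.
expanded = global_itmo([0.0, 0.5, 1.0])
```

Since the mapping is a fixed pixel-wise function, clipped highlights remain featureless; they are only pushed to higher luminance, which is precisely the limitation that the reconstruction-based methods above try to address.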

2.3 Bit-depth extension

A standard 8-bit LDR image is affected not only by clipping but also by quantization. If the contrast or exposure is significantly increased, quantization can be revealed as banding artifacts. Existing methods for decontouring, or bit-depth extension, include dithering methods that use noise in order to hide the banding artifacts [Daly and Feng 2003]. Decontouring can also be performed using low-pass filtering followed by quantization, in order to detect false contours [Daly and Feng 2004]. There are also a number of edge-preserving filters used for the same purpose. In this work we do not focus on decontouring, which is mostly a problem in under-exposed images. Also, since we treat the problem of predicting saturated image regions, the bit depth will be increased with the reconstructed information.

2.4 Convolutional neural networks

CNNs have recently been applied to a large range of computer vision tasks, significantly improving on the performance of classical supervised tasks such as image classification [Simonyan and Zisserman 2014], object detection [Ren et al. 2015] and semantic segmentation [Long et al. 2015], among others. Recently CNNs have also shown great promise for image reconstruction problems related to the challenges faced in inverse tone-mapping, such as compression artifact reduction [Svoboda et al. 2016], super-resolution [Ledig et al. 2016], and colorization [Iizuka et al. 2016]. Recent work on inpainting [Pathak et al. 2016; Yang et al. 2016] has also utilized variants of Generative Adversarial Networks (GANs) [Goodfellow et al. 2014] to produce visually convincing results. However, as these methods are based on adversarial training, results tend to be unpredictable and can vary widely from one training iteration to the next. To stabilize training, several tricks are used in practice, including restricting the output intensity, which is problematic for HDR generation. Furthermore, these methods are limited to a single image resolution, with results only shown so far for very low resolutions.

Recently, deep learning has also been successfully applied for improving classical HDR video reconstruction from multiple exposures captured over time [Kalantari and Ramamoorthi 2017]. In terms of reconstructing HDR from one single exposed LDR image, the recent work by Zhang and Lalonde [2017] is most similar to ours. They use an autoencoder [Hinton and Salakhutdinov 2006] in order to reconstruct HDR panoramas from single exposed LDR counterparts. However, the objective of this work is specifically to recover high intensities near the sun in order to use the prediction for IBL. Also, the method is only trained using rendered panoramas of outdoor environments where the sun is assumed to be in the same azimuthal position in all images. Given these restrictions, and that predictions are limited to 128 × 64 pixels, the results are only applicable for IBL of outdoor scenes. Compared to this work, we propose a solution to a very general problem specification without any such assumptions, and where any types of saturated regions are considered. We also introduce several key modifications to the standard autoencoder design [Hinton and Salakhutdinov 2006], and show that this significantly improves the performance.

Finally, it should be mentioned that the concurrent work by Endo et al. [2017] also treats inverse tone-mapping using deep learning algorithms, by using a different pipeline design. Given a single exposure input image, the method uses autoencoders in order to predict a set of LDR images with both shorter and longer exposures. These are subsequently combined using standard methods, in order to reconstruct the final HDR image.

3 HDR RECONSTRUCTION MODEL

Fig. 3. Fully convolutional deep hybrid dynamic range autoencoder network, used for HDR reconstruction. The encoder converts an LDR input to a latent feature representation, and the decoder reconstructs this into an HDR image in the log domain. The skip-connections include a domain transformation from LDR display values to logarithmic HDR, and the fusion of the skip-layers is initialized to perform an addition. The network is pre-trained on a subset of the Places database, and deconvolutions are initialized to perform bilinear upsampling. While the specified spatial resolutions are given for a 320 × 320 pixels input image, which is used in the training, the network is not restricted to a fixed image size.

3.1 Problem formulation and constraints

Our objective is to predict values of saturated pixels given an LDR image produced by any type of camera. In order to produce the final HDR image, the predicted pixels are combined with the linearized input image. The final HDR reconstructed pixel Ĥ_{i,c} with spatial index i and color channel c is computed using a pixel-wise blending with the blend value α_i,

    Ĥ_{i,c} = (1 − α_i) f⁻¹(D_{i,c}) + α_i exp(ŷ_{i,c}),    (1)

where D_{i,c} is the input LDR image pixel and ŷ_{i,c} is the CNN output (in the log domain). The inverse camera curve f⁻¹ is used to transform the input to the linear domain. The blending is a linear ramp starting from pixel values at a threshold τ, and ending at the maximum pixel value,

    α_i = max(0, max_c(D_{i,c}) − τ) / (1 − τ).    (2)

In all examples we use τ = 0.95, where the input is defined to be in the range [0, 1]. The linear blending prevents banding artifacts between predicted highlights and their surroundings, as compared to a binary mask. It is also used to define the loss function in the training, as described in Section 3.4. For an illustration of the components of the blending, see Figure 2. Due to the blending, predictions are focused on reconstructing around the saturated areas, and artifacts may appear in other image regions (Figure 2(b)).

Fig. 2. Zoom-in of an example of the components of the blending operation in Equation 1, compared to the ground truth HDR image. (a) is the input image f⁻¹(D), (b) is the prediction exp(ŷ), (c) is the blending mask α, (d) is the blending Ĥ of (a-b) using (c), and (e) is the ground truth H. Gamma correction has been applied to the images, for display purpose.

The blending means that the input image is kept unmodified in the non-saturated regions, and linearization has to be made from either knowledge of the specific camera used or by assuming a certain camera curve f. We do not attempt to perform linearization or color correction with the CNN. Furthermore, information lost due to quantization is not recovered. We consider these problems separate for the following reasons:

(1) Linearization: The most general approach would be to linearize either within the network or by learning the weights of a parametric camera curve. We experimented with both these approaches, but found them to be too problematic given any input image. Many images contain too little information in order to evaluate an accurate camera curve, resulting in high variance in the estimation. On average a carefully chosen assumed transformation performs better.

(2) Color correction: The same reasoning applies to color correction. Also, this would require all training data to be properly color graded, which is not the case. This means that given a certain white balancing transformation of the input, the saturated regions are predicted within this transformed color space.

(3) Quantization recovery: Information lost due to quantization can potentially be reconstructed from a CNN. However, this problem is more closely related to super-resolution and compression artifact reduction, for which deep learning techniques have been successfully applied [Dong et al. 2015; Ledig et al. 2016; Svoboda et al. 2016]. Furthermore, a number of filtering techniques can reduce banding artifacts due to quantization [Bhagavathy et al. 2007; Daly and Feng 2004].

Although we only consider the problem of reconstructing saturated image regions, we argue that this is by far the most important part when transforming LDR images to HDR, and that it can be used
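The per-pixel blending of Equations 1 and 2 can be sketched in a few lines of Python. In this minimal sketch the inverse camera curve f⁻¹ is assumed to be a plain gamma curve, an illustrative stand-in rather than a calibrated curve, and pixels are simple per-channel lists:

```python
import math

def blend_alpha(d_pixel, tau=0.95):
    """Equation 2: linear ramp from the threshold tau up to the
    maximum pixel value, based on the brightest color channel."""
    return max(0.0, max(d_pixel) - tau) / (1.0 - tau)

def reconstruct_pixel(d_pixel, y_pixel, tau=0.95, gamma=2.0):
    """Equation 1: blend the linearized input with the CNN prediction.

    d_pixel: LDR values per channel, in [0, 1].
    y_pixel: CNN output per channel, in the log domain.
    The inverse camera curve f^-1 is assumed here to be x**gamma
    (an illustrative choice, not the paper's calibrated curve).
    """
    a = blend_alpha(d_pixel, tau)
    return [(1.0 - a) * (d ** gamma) + a * math.exp(y)
            for d, y in zip(d_pixel, y_pixel)]

# An unsaturated pixel (all channels below tau) passes through the
# assumed camera curve unchanged, while a fully saturated pixel is
# replaced entirely by exp(y).
```

Note that α depends on the maximum over the color channels (Equation 2), so a pixel saturated in only one channel is still blended in all channels, keeping the ramp consistent across channels.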

