Tunc Ozan Aydın    Nikolce Stefanoski    Simone Croci    Markus Gross    Aljoscha Smolic
Disney Research Zurich    ETH Zurich
Figure 1: Our method allows local tone mapping of HDR video where the scene illumination varies greatly over time, in this case more than 4 log10 units (c). Note that our method is temporally stable despite the complex motion of both the camera and the actor (b), and the resulting tone mapped video shows strong local contrast (a). The plots show the log-mean luminance of the HDR video and the mean pixel values of the tone mapped video.
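The log-mean luminance plotted in Figure 1 is the mean of the per-pixel log10 luminance of each frame. The helper below is a hypothetical sketch of that computation (the function name and the epsilon guard are our assumptions, not the authors' code):

```python
import numpy as np

def log_mean_luminance(frames, eps=1e-6):
    """Per-frame log-mean luminance of an HDR sequence, in log10 units.

    frames: iterable of (H, W) float arrays of absolute luminance.
    The small eps guards against log10(0) in dark pixels.
    """
    return np.array([np.mean(np.log10(f + eps)) for f in frames])

# Example: a sequence whose illumination rises by 4 log10 units,
# as in Figure 1.
frames = [np.full((4, 4), 10.0 ** k) for k in range(5)]
print(log_mean_luminance(frames))  # ≈ [0, 1, 2, 3, 4]
```

Plotting this quantity over time, next to the mean pixel value of the tone mapped output, is what makes temporal instability (flicker) visible as jitter in the curves.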
stimulus signal is composed of a step function that is modulated with a sine wave of increasing frequency (a chirp signal) and Gaussian noise. In (b) and (c), corresponding detail layers and tone mapping results are shown, respectively. Tone mapping is performed by compressing the base layer from (a) and keeping the corresponding details from (b). After one iteration, only fine-scale details (the Gaussian noise) are present in the detail layer (this result is identical for all λ), while after 50 iterations medium-scale details (from the sinusoidal wave) are also added to the detail layer. Note that λ = 1 significantly reduces the halo artifact, i.e. the overshooting on both sides of the step edge in (c).

Equation 7 is therefore highly desirable near strong edges, because it significantly reduces halo artifacts by shifting the filtered result towards the original input. Based on these observations we conclude that our instantiation of the filter shown in Equation 6 should use λ = 1 and it should have a low diffusion strength in areas close to strong image edges.

Figure 5 shows filtering results for different settings of λ on a representative 1D example (we use our instantiation of matrix H that will be introduced in the following section). Note that in the case λ = 0, strong edge-aware smoothing creates significant halo artifacts in the tone mapping result. These artifacts are substantially reduced in the case λ = 1, since the filtering result is biased towards the stimulus signal in the area close to the step edge.

    π_p := (1 + |(I_p − I_{p+1}) / σ|^α)^(−1).   (8)

The permeability between p and its right neighbor pixel is close to 0 if the absolute value of the corresponding color difference is high, and it is 1 if the difference is low. The parameter σ indicates the point of transition from large to low permeability, while α controls the slope of the transition around σ. In this work we used α = 2 and a σ in the range 0.1–1. Similar to [Cigla and Alatan 2013], we use these permeability weights to define the permeability between two arbitrary pixels p and q as

    π_pq := { 1                               : p = q
              ∏_{n=p_x}^{q_x−1} π_(n,p_y)     : p_x < q_x, p_y = q_y
              ∏_{n=q_x}^{p_x−1} π_(n,p_y)     : p_x > q_x, p_y = q_y
              0                               : otherwise }.   (9)

Thus, the permeability between two pixels p and q of the same row is large if the permeability between each pair of neighboring pixels on the path between p and q is large. In particular, the permeability between p and q is significantly reduced if there is a pair of neighboring pixels along the path with low permeability (e.g. a strong vertical edge between pixels p and q). The permeability between pixels of different rows is defined to be zero. To obtain a stochastic matrix H := {h_pq}, we normalize these weights according to

    h_pq := π_pq / Σ_{n=1}^{w} π_((n,p_y), q).   (10)
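As a concrete illustration of the permeability construction above, the sketch below builds the neighbor permeabilities, the pairwise weights, and a row-stochastic matrix H for a single image row, then applies one biased iteration. The function names, the row-wise normalization, and the exact biased update (averaging the filtered row with the input when λ = 1) are our assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def permeability(row, sigma=0.5, alpha=2.0):
    """Permeability between each pixel and its right neighbor.

    Close to 1 for small differences, close to 0 across strong edges;
    sigma sets the transition point and alpha its slope.
    """
    diff = np.abs(np.diff(row))                   # |I_p - I_{p+1}|
    return 1.0 / (1.0 + (diff / sigma) ** alpha)  # length len(row) - 1

def row_permeability_matrix(row, sigma=0.5, alpha=2.0):
    """Pairwise weights pi_pq within one row: the product of neighbor
    permeabilities along the path from p to q (empty product = 1)."""
    n = len(row)
    pi = permeability(row, sigma, alpha)
    P = np.ones((n, n))
    for p in range(n):
        for q in range(n):
            lo, hi = sorted((p, q))
            P[p, q] = np.prod(pi[lo:hi])
    return P

def stochastic_H(row, sigma=0.5, alpha=2.0):
    """Normalize the weights so each row of H sums to one."""
    P = row_permeability_matrix(row, sigma, alpha)
    return P / P.sum(axis=1, keepdims=True)

def biased_filter(row, lam=1.0, sigma=0.5, alpha=2.0):
    """One biased iteration: pull the filtered result back toward the
    input, which suppresses halos near strong edges (lambda = 1)."""
    H = stochastic_H(row, sigma, alpha)
    return (H @ row + lam * row) / (1.0 + lam)

# A step edge: permeability across the edge is tiny, so the filter
# weights barely mix the two sides and overshoots stay small.
row = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
H = stochastic_H(row)
smoothed = H @ row
```

With sigma = 0.5, the single large difference at the edge gets permeability 1/(1 + 4) = 0.2, so every path crossing the edge is down-weighted by that factor, which is exactly the halo-suppression behavior described in the text.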
Figure 9: Comparison of our method with current TMOs capable of processing videos. For each method, we show two representative frames and plot the mean pixel values (tone mapped) and log-mean luminance (HDR) over time. Note that the HDR plots are shifted and compressed by an exponent for presentation.
weights. In our method, by alternating the direction of diffusion between horizontal and vertical directions with each iteration, pixels are diffused over potentially large image regions after only a few iterations. However, although both smoothing methods converge against mathematically different solutions, filtering results are visually similar in practice.

5 Results

Our video tone mapping method was developed by necessity for a film production. Similar to Eilertsen et al. [2013], our experience with the state-of-the-art in video tone mapping yielded unsatisfactory results. Our newly developed method has been successfully used in a professional post-production workflow at FullHD and 2880×1620 resolutions. Excluding I/O, our unoptimized Matlab implementation of the full tone mapping pipeline requires on average 3.445 seconds to process a color frame at FullHD resolution, and 6.465 seconds at 2880×1620 resolution on a current desktop computer. The computation time of our method scales approximately linearly with the resolution of the input HDR video.

The results in this section have been generated using the default parameter values presented in Sections 3 and 4, with the exceptions of σ_s (which controls spatial permeability) and the tone curve parameters (either the compression factor c or the bias parameter of the adaptive logarithmic tone curve, depending on which method was used). Since the spatial permeability and tone curve parameters mainly govern the aesthetics of the tone mapped content, their values can be set according to personal taste. For generating the results in this section and the supplemental video we used σ_s = [0.1, 1] and bias = [0.2, 0.5] whenever we used the adaptive logarithmic tone curve. When we desired a stronger dynamic range compression, we utilized the log curve with the compression factor c = [0.2, 0.4]. We also conservatively set the number of spatial filtering iterations to 20. (Performance benchmarks have been obtained with these settings and a temporal neighborhood size of 21.)

In the remainder of this section, we first qualitatively compare our method's results on the publicly available HDR video data provided by Eilertsen et al. [2013] (Section 5.1). On the same public data set we also performed a controlled subjective study, which validates our initial claim that our method maintains a high level of local contrast with fewer temporal artifacts compared to the state-of-the-art (Section 5.2). After briefly discussing user interaction (Section 5.3), we present our main results obtained from processing HDR footage shot with an Arri Alexa XT camera at FullHD or 2880×1620 resolutions (Section 5.4); refer to the supplemental video for more results. Finally we present examples of how our method allows control over spatial and temporal contrast (Section 5.5), and demonstrate the advantage of using our method in low-light HDR sequences (Section 5.6).

5.1 Comparison with Other Methods

In Figure 9 we show a qualitative comparison between a number of current tone mapping operators on the publicly available Hallway sequence [Eilertsen et al. 2013; Kronander et al. 2013a; Kronander et al. 2013b]. In addition to the provided tone mapping results, we generated another sequence by running the Bilateral TMO using standard parameters. Presenting a concise and meaningful visual comparison between multiple video TMOs is challenging for various reasons. In Figure 9, we show for each video two representative frames to demonstrate its ability to show local contrast, as well as a plot of the mean brightness of both the HDR and the tone mapped sequences over time, which is helpful for investigating temporal coherence.
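The base/detail tone mapping recipe used throughout these results (compress only the base layer with a log-domain factor c, then add the details back) can be sketched as follows. The function, its signature, and the detail_gain knob are illustrative assumptions rather than the paper's exact pipeline:

```python
import numpy as np

def tone_map(log_lum, base, c=0.3, detail_gain=1.0):
    """Sketch of base/detail tone mapping in the log domain.

    log_lum: log10 luminance of the HDR frame, shape (H, W)
    base:    edge-aware smoothed version of log_lum, shape (H, W)
    c:       compression factor; the text uses c in [0.2, 0.4]
    Returns linear-domain display luminance.
    """
    detail = log_lum - base              # detail layer (log domain)
    compressed = c * base                # compress the base layer only
    return 10.0 ** (compressed + detail_gain * detail)

# Toy frame with 4 log10 units of range and no detail (base == input):
hdr_log = np.log10(np.array([[1.0, 1e4]]))
ldr = tone_map(hdr_log, hdr_log.copy(), c=0.25)  # → [[1., 10.]]
```

With c < 1 the base-layer range shrinks by the factor c (here from 4 to 1 log10 unit) while the detail layer, which carries local contrast, passes through unchanged; this is why the base layer must be spatially and temporally stable.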
The brightness flickering problems of local methods

5.3 User Interaction

HDR tone mapping is an artistic process that often involves experimentation through trial and error. As such, any tone mapping software is expected to give the user visual feedback at interactive rates while tone mapping parameters are being changed. The dilemma is that interactive video editing on current standard hardware at high resolutions is challenging to achieve beyond simple operations. Our solution for interactive editing consists of the offline pre-computation of the base and detail layers required for tone mapping, followed by an interactive editing session

Figure 12: A screenshot from our user interface during editing.

Our framework also allows explicit control over the temporal contrast (discussed in Section 3.1) of the tone mapped video. One possibility when tone mapping HDR sequences with high temporal contrast is to apply a strongly compressive tone curve, which reduces the dynamic range of the whole sequence such that it fits into the display dynamic range without generating any over- or underexposed regions. However, such a tone mapping that significantly reduces temporal contrast may sometimes be undesirable. The other option of applying a less compressive tone curve, on the other hand, may generate strongly over- or underexposed frames (Figure 15, top row). This can easily be remedied in our framework using a brightness compensation factor that controls the brightness of each frame. Since our method's base layer is temporally stable, such a brightness compensation factor can be modulated by the inverse of the log-mean luminance of the spatiotemporal base layer without causing brightness flickering (Figure 15, bottom row).
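The brightness compensation described above can be sketched as a per-frame gain modulated by the inverse of the base layer's log-mean luminance. The function name and the target parameter are our assumptions for illustration:

```python
import numpy as np

def brightness_compensation(base_frames, target_log_mean=0.0):
    """Per-frame multiplicative gain (linear domain) that steers each
    frame's log-mean toward a target.

    base_frames: list of (H, W) base layers in log10 luminance.
    Because the base layer is temporally stable, the resulting gains
    vary smoothly over time and introduce no flicker.
    """
    gains = []
    for b in base_frames:
        log_mean = float(np.mean(b))                        # log-mean of base
        gains.append(10.0 ** (target_log_mean - log_mean))  # inverse modulation
    return gains

# A frame whose base layer sits 2 log10 units above the target gets
# gain 0.01; one sitting 1 unit below gets gain 10.
gains = brightness_compensation([np.full((2, 2), 2.0),
                                 np.full((2, 2), -1.0)])
```

Computing the gain from the filtered base layer rather than from the raw HDR frame is the key point: per-frame statistics of the input would flicker, while the temporally filtered base layer does not.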
rameter that controls the base layer smoothness a priori. That said, we found our two-step tone mapping workflow to work well in practice, as σ_s and the temporal filtering parameters can efficiently be set by trial and error on a small number of representative frames.

Figure 15: The brightness compensation, applied as a function of the log-mean value of our method's temporally coherent base layer, can be used to limit the temporal contrast in tone mapping configurations where a strongly compressive tone curve is not desired.

The main application of our method is local video tone mapping. Figure 13, our teaser and the supplemental video show our method's results produced from HDR sequences obtained with an Arri Alexa XT camera at FullHD or 2880×1620 resolution. (Due to their high contrast, our results are best viewed on a computer display. Printed versions may not be representative of the intended results.) The input video sequences contain complex motion and strong changes in scene illumination. Despite that, our method maintains a temporally stable mean brightness over time, as well as high local contrast.

5.6 Tone Mapping of Low-light Sequences

Tone mapping video sequences shot in low-light scenes is often difficult due to camera noise. TMOs cannot distinguish camera
Figure 13: Representative local tone mapped frames from two sequences (a, d). Despite the high local contrast and strong dynamic range compression, our method maintains a temporally stable log-mean luminance over time (b, e). The input HDR videos, which span more than 4 log10 units, are visualized in false color (c, f).
noise from scene details, which means that as scene details become more visible during the tone mapping process, so does the noise. Due to the denoising effect of the temporal filtering involved in our tone mapping pipeline, our method is especially suitable for such low-light sequences. Figure 16 shows an example where the TMO significantly amplifies camera noise (a), which can be somewhat suppressed by denoising the HDR frame before applying tone mapping (b). The advantage of our method is that our results are comparable to the latter without the need of an extra denoising step (c).

6 Discussion

While our method does not offer the final solution to the open problem of local HDR video tone mapping, we believe that it takes a big step forward by significantly advancing the state-of-the-art and paves the way for the (much needed) future work in this field. As the entertainment industry is moving towards HDR video pipelines, we think that the relevant research can be stimulated by further advances in filtering techniques geared towards tone mapping applications, and the emergence of high quality public HDR video data.

Our method is not without limitations. As an example, for artistic reasons it is desirable to have multiple detail layers at different contrast scales. Due to the many technical challenges involved in obtaining even a single detail layer, we deferred a multi-scale approach to future work. Also, as with every optical flow based method, our technique is always susceptible to visual artifacts due to the inevitable errors in the flow computation. Even though the outcome of our subjective study (Section 5.2) suggests that these artifacts can be suppressed to a reasonable level, we still feel that our method can be improved by using additional and more sophisticated flow confidence measures (e.g. by utilizing machine learning [Mac Aodha et al. 2013]). Finally, our method currently provides only a basic means for controlling chrominance in the form of a saturation slider, which we hope to address in future work.

7 Conclusion

We presented a local HDR video tone mapping method that can significantly reduce the input dynamic range while preserving local contrast. Our key difference is the use of temporal filtering through per-pixel motion paths, which allowed us to achieve temporal stability without ghosting. We formulated an edge-aware filter that is applied by iterative shift-variant convolutions and shares the same halo suppression properties with the WLS filter, which allowed us to efficiently realize our tone mapping framework. We presented qualitative and subjective comparisons to the state-of-the-art which resulted favorably for our method. We showed results produced with various tone mapping configurations from challenging HDR sequences, and presented a simple yet effective user interface. Finally we noted the advantage of our method in low-light HDR sequences.
Figure 16: A frame-by-frame tone mapping using the bilateral TMO results in significantly visible camera noise (a), which can be reduced by denoising each frame before tone mapping using external video editing software; as an example we use The Foundry NUKE (b). On the other hand, our method can achieve comparably low noise levels without an extra denoising step (c).
References

BENNETT, E. P., AND MCMILLAN, L. 2005. Video enhancement using per-pixel virtual exposures. ACM Trans. Graph. 24, 3.

BENOIT, A., ALLEYSSON, D., HERAULT, J., AND CALLET, P. L. 2009. Spatio-temporal tone mapping operator based on a retina model. In CCIW, Springer, vol. 5646 of Lecture Notes in Computer Science, 12–22.

BOITARD, R., BOUATOUCH, K., COZOT, R., THOREAU, D., AND GRUSON, A. 2012. Temporal coherency for video tone mapping. Proc. SPIE 8499.

BOITARD, R., COZOT, R., THOREAU, D., AND BOUATOUCH, K. 2014. Survey of temporal brightness artifacts in video tone mapping. In HDRi2014.

BOITARD, R., COZOT, R., THOREAU, D., AND BOUATOUCH, K. 2014. Zonal brightness coherency for video tone mapping. Signal Processing: Image Communication 29, 2, 229–246.

CIGLA, C., AND ALATAN, A. A. 2013. Information permeability for stereo matching. Signal Processing: Image Communication 28, 9, 1072–1088.

DRAGO, F., MYSZKOWSKI, K., ANNEN, T., AND CHIBA, N. 2003. Adaptive logarithmic mapping for displaying high contrast scenes. Computer Graphics Forum 22, 3, 419–426.

DURAND, F., AND DORSEY, J. 2002. Fast bilateral filtering for the display of high-dynamic-range images. ACM Trans. Graph. 21, 3, 257–266.

EILERTSEN, G., WANAT, R., MANTIUK, R. K., AND UNGER, J. 2013. Evaluation of tone mapping operators for HDR-video. Computer Graphics Forum 32, 7, 275–284.

GASTAL, E. S. L., AND OLIVEIRA, M. M. 2011. Domain transform for edge-aware image and video processing. ACM Trans. Graph. 30, 4, 69:1–69:12.

GUTHIER, B., KOPF, S., EBLE, M., AND EFFELSBERG, W. 2011. Flicker reduction in tone mapped high dynamic range video. Proc. SPIE 7866.

HE, K., SUN, J., AND TANG, X. 2013. Guided image filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 6, 1397–1409.

IRAWAN, P., FERWERDA, J. A., AND MARSCHNER, S. R. 2005. Perceptually based tone mapping of high dynamic range image streams. In Proceedings of Eurographics Conf. on Rendering Techniques, EGSR'05, 231–242.

KANG, S. B., UYTTENDAELE, M., WINDER, S., AND SZELISKI, R. 2003. High dynamic range video. ACM Trans. Graph. 22, 3, 319–325.

KISER, C., REINHARD, E., TOCCI, M., AND TOCCI, N. 2012. Real time automated tone mapping system for HDR video. In Proceedings of the IEEE International Conference on Image Processing, 2749–2752.

KRONANDER, J., GUSTAVSON, S., BONNET, G., AND UNGER, J. 2013. Unified HDR reconstruction from raw CFA data. In Computational Photography (ICCP), 2013 IEEE International Conference on, 1–9.

KRONANDER, J., GUSTAVSON, S., BONNET, G., YNNERMAN, A., AND UNGER, J. 2013. A unified framework for multi-sensor HDR video reconstruction. In Signal Processing: Image Communications.

LANG, M., WANG, O., AYDIN, T., SMOLIC, A., AND GROSS, M. 2012. Practical temporal consistency for image-based graphics applications. ACM Trans. Graph. 31, 4 (July), 34:1–34:8.

LEDDA, P., SANTOS, L. P., AND CHALMERS, A. 2004. A local model of eye adaptation for high dynamic range images. AFRIGRAPH, 151–160.

LEE, C., AND KIM, C.-S. 2007. Gradient domain tone mapping of high dynamic range videos. In Image Processing, 2007. ICIP 2007. IEEE International Conference on, vol. 3, III-461–III-464.

LISCHINSKI, D., FARBMAN, Z., UYTTENDAELE, M., AND SZELISKI, R. 2006. Interactive local adjustment of tonal values. ACM Trans. Graph., 646–653.

MAC AODHA, O., HUMAYUN, A., POLLEFEYS, M., AND BROSTOW, G. J. 2013. Learning a confidence measure for optical flow. IEEE Trans. Pattern Anal. Mach. Intell. 35, 5, 1107–1120.

MANTIUK, R., MYSZKOWSKI, K., AND SEIDEL, H.-P. 2006. A perceptual framework for contrast processing of high dynamic range images. ACM Trans. Appl. Percept. 3, 3, 286–308.

MANTIUK, R., DALY, S., AND KEROFSKY, L. 2008. Display adaptive tone mapping. ACM Trans. Graph. 27, 3, 68:1–68:10.

MEYER, C. D. 2001. Matrix Analysis and Applied Linear Algebra. SIAM: Society for Industrial and Applied Mathematics.

MILANFAR, P. 2013. A tour of modern image filtering. IEEE Signal Processing Magazine 30, 1, 106–128.

MRAZEK, P., WEICKERT, J., AND BRUHN, A. 2004. On robust estimation and smoothing with spatial and tonal kernels. In Proc. Dagstuhl Seminar: Geometric Properties from Incomplete Data, Springer.

NORDSTROM, K. N. 1989. Biased anisotropic diffusion: a unified regularization and diffusion approach to edge detection. Tech. rep., EECS Dept., UC Berkeley.

PARIS, S., HASINOFF, S. W., AND KAUTZ, J. 2011. Local Laplacian filters: Edge-aware image processing with a Laplacian pyramid. ACM Trans. Graph. 30, 4, 68:1–68:12.

PATTANAIK, S. N., TUMBLIN, J., YEE, H., AND GREENBERG, D. P. 2000. Time-dependent visual adaptation for fast realistic image display. In Proc. of Conf. on Computer Graphics and Interactive Techniques, SIGGRAPH '00, 47–54.

PETIT, J., AND MANTIUK, R. K. 2013. Assessment of video tone-mapping: Are cameras' S-shaped tone-curves good enough? Journal of Visual Communication and Image Representation 24, 7, 1020–1030.

RAMSEY, S. D., JOHNSON III, J. T., AND HANSEN, C. 2004. Adaptive temporal tone mapping. In Proceedings of the 7th IASTED International Conference on Computer Graphics and Imaging, 124–128.

REINHARD, E., AND DEVLIN, K. 2005. Dynamic range reduction inspired by photoreceptor physiology. IEEE Transactions on Visualization and Computer Graphics 11, 13–24.

REINHARD, E., STARK, M., SHIRLEY, P., AND FERWERDA, J. 2002. Photographic tone reproduction for digital images. ACM Trans. Graph. 21, 3, 267–276.

REINHARD, E., WARD, G., PATTANAIK, S., DEBEVEC, P., HEIDRICH, W., AND MYSZKOWSKI, K. 2010. HDR Imaging: Acquisition, Display, and Image-Based Lighting, Second Edition. Morgan Kaufmann.

REINHARD, E., POULI, T., KUNKEL, T., LONG, B., BALLESTAD, A., AND DAMBERG, G. 2012. Calibrated image appearance reproduction. ACM Trans. Graph. 31, 6 (Nov.), 201:1–201:11.

TOCCI, M. D., KISER, C., TOCCI, N., AND SEN, P. 2011. A versatile HDR video production system. ACM Trans. Graph. 30, 4, 41:1–41:10.

TOMASI, C., AND MANDUCHI, R. 1998. Bilateral filtering for gray and color images. In ICCV, 839–846.

VAN HATEREN, J. H. 2006. Encoding of high dynamic range video with a model of human cones. ACM Trans. Graph. 25, 4, 1380–1399.

YE, G., GARCES, E., LIU, Y., DAI, Q., AND GUTIERREZ, D. 2014. Intrinsic video and applications. ACM Trans. Graph. (To Appear) 33, 4.

ZIMMER, H., BRUHN, A., AND WEICKERT, J. 2011. Optic flow in harmony. International Journal of Computer Vision 93, 3, 368–388.