Legendre-Fenchel transformation
The Legendre-Fenchel (LF) transformation of a continuous but not necessarily differentiable function $f : \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$ is defined as
$$f^*(p) = \sup_{x \in \mathbb{R}} \{ px - f(x) \}$$ (1)
Geometrically, this means that we are interested in finding a point $x$ on the graph of $f$ such that the line of slope $p$ passing through $(x, f(x))$ has the maximum intercept on the $y$-axis. This happens at the point of the curve whose tangent has slope $p$, i.e.
$$p = f'(x)$$ (2)
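As a quick numerical illustration of definition (1) (a sketch of mine, not part of the original text; the helper name lf_transform and the grids are arbitrary choices), the supremum can be approximated by a maximum over a discretised domain. For $f(x) = x^2$ it reproduces the conjugate $p^2/4$ derived below.

```python
import numpy as np

def lf_transform(f, xs, ps):
    """Approximate f*(p) = sup_x { p*x - f(x) } by a maximum over the grid xs."""
    vals = np.outer(ps, xs) - f(xs)[None, :]   # (len(ps), len(xs)) matrix of p*x - f(x)
    return vals.max(axis=1)

xs = np.linspace(-10.0, 10.0, 2001)
ps = np.linspace(-4.0, 4.0, 9)
fstar = lf_transform(lambda x: x**2, xs, ps)
print(np.allclose(fstar, ps**2 / 4, atol=1e-3))   # True: f*(p) = p^2/4
```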
A deeper understanding of a function is often required: for instance, one would like to know whether a given function is linear, or whether it is well behaved in a given domain, to name a few. Transformations are one way of mapping the function to another space where better and easier ways of understanding the function emerge; take for instance the Fourier and Laplace transforms. The Legendre-Fenchel transform is one such transform: it maps the $(x, f(x))$ space to the space of slopes and conjugates, that is $(p, f^*(p))$. However, while the Fourier transform consists of an integration with a kernel, the Legendre-Fenchel transform uses the supremum as the transformation procedure. Under the assumption that the transformation is reversible, one form is the dual of the other. This is easily expressed as
$$(x, f(x)) \leftrightarrow (p, f^*(p))$$ (4)
Here $p$ is the slope and $f^*(p)$ is called the convex conjugate of the function $f(x)$. A conjugate allows one to build a dual problem which may be easier to solve than the primal problem. The Legendre-Fenchel conjugate is always convex.
There are two ways of viewing a curve or a surface: either as a locus of points or as an envelope of tangents [1]. Now let us imagine that we want to use the duality of tangents, instead of points, to represent a function $f(x)$. A tangent is parameterised by two variables, namely its slope $p$ and the intercept $c$ it cuts on the negative $y$-axis (using the negative $y$-axis for the intercept is purely a matter of choice). We provide two ways to solve for the intercept and the slope and arrive at the same result.
3.1 Motivation I

A line with slope $p$ and intercept $-c$ on the $y$-axis is written as
$$y = px - c$$ (5)
Now imagine this line is to touch the function $f(x)$ at $x$; then we can equate the two and write
$$px - c = f(x)$$ (6)
Also suppose that the function $f(x)$ is convex, say $f(x) = x^2$, a parabola (as an example). Then we obtain the quadratic equation
$$x^2 - px + c = 0$$ (7)
But we know that this line should touch the convex function only once (if the function were non-convex, the line could touch the function at two points), because we want to use this line as a tangent to represent the function in the dual space. Therefore the roots of this quadratic equation should be equal, which is to say that its discriminant should be zero, i.e. $p^2 - 4c = 0$, which gives us
$$f^*(p) = c = \frac{p^2}{4}$$ (9)
3.2 Motivation II

Alternatively, for a differentiable convex function (take again $f(x) = x^2$), we can take the first-order derivative to obtain the slope, which is
$$p = f'(x) = 2x$$ (16)
so that
$$x = f'^{-1}(p) = \frac{p}{2}$$ (17)
Replacing $y$ we get
$$y = f(x)$$ (18)
$$y = f(f'^{-1}(p))$$ (19)
$$y = f\!\left(\frac{p}{2}\right) = \frac{p^2}{4}$$ (20)
The intercept is then
$$c = px - y = \frac{p^2}{2} - \frac{p^2}{4} = \frac{p^2}{4}$$ (21)
which agrees with Motivation I.
Let us now turn our attention towards the cases where a function may not be differentiable at a given point $x^*$, where it has a value $f(x^*)$. In this case we can rewrite our equation as
$$px^* - c = f(x^*)$$ (24)
which means
$$c = px^* - f(x^*)$$ (25)

Fig. 2. At the point $x^*$ we can draw as many tangents as we want, with slopes ranging over $[-1, 1]$, and they form the subgradients of the curve at $x^*$.
Fig. 3. An illustration of duality, where lines are mapped to points while points are mapped to lines in the dual space.
In simpler terms, a subdifferential is defined as the set of slopes of lines through a point $(x, f(x))$ that either touch or remain below the graph of the function. For the same notorious $|x|$ function, the derivative is not defined at $x = 0$ because $\frac{x}{|x|}$ is not defined at $x = 0$, while the subdifferential at $x = 0$ is the closed interval $[-1, 1]$, because we can always draw a line with a slope in $[-1, 1]$ which stays below the function. The subdifferential at any point $x < 0$ is the singleton set $\{-1\}$, while the subdifferential at any point $x > 0$ is the singleton $\{1\}$. Members of the subdifferential are called subgradients.
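A small numerical check (my own sketch, not from the source) ties this back to the conjugate: $\sup_x \{px - |x|\}$ is zero exactly for the slopes $p \in [-1, 1]$ that occur as subgradients, and diverges outside that interval.

```python
import numpy as np

# f(x) = |x|: the supremum of p*x - |x| is 0 for |p| <= 1 and unbounded for |p| > 1.
xs = np.linspace(-100.0, 100.0, 20001)
for p in (0.0, 0.5, 1.0, 1.5):
    print(f"p = {p:3.1f}   sup(p*x - |x|) on grid = {np.max(p * xs - np.abs(xs)):.1f}")
# The first three print 0.0; for p = 1.5 the value (50.0 here) grows with the grid size.
```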
4.2 Convexity of the conjugate

In order to prove that the conjugate is convex, we need to show that for any $0 \le \lambda \le 1$ it obeys Jensen's inequality, i.e.
$$f^*(\lambda z_1 + (1-\lambda) z_2) \le \lambda f^*(z_1) + (1-\lambda) f^*(z_2)$$ (28)
Writing the left-hand side out,
$$f^*(\lambda z_1 + (1-\lambda) z_2) = \sup_{x \in \mathbb{R}^n} \{ x^t (\lambda z_1 + (1-\lambda) z_2) - f(x) \}$$ (29)
$$= \sup_{x \in \mathbb{R}^n} \{ \lambda (x^t z_1 - f(x)) + (1-\lambda)(x^t z_2 - f(x)) \}$$ (30)
Assuming $p_1 = (x^t z_1 - f(x))$ and $p_2 = (x^t z_2 - f(x))$ for brevity, we know that the supremum of a sum is at most the sum of the suprema,
$$\sup_{x \in \mathbb{R}^n} \{ \lambda p_1 + (1-\lambda) p_2 \} \le \sup_{x \in \mathbb{R}^n} \{ \lambda p_1 \} + \sup_{x \in \mathbb{R}^n} \{ (1-\lambda) p_2 \}$$ (32)
and therefore
$$f^*(\lambda z_1 + (1-\lambda) z_2) \le \lambda f^*(z_1) + (1-\lambda) f^*(z_2)$$ (33)
Therefore $f^*(z)$ is always convex, irrespective of whether the function $f(x)$ is convex or not.
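This can be checked numerically; in the sketch below (the double-well test function and the grids are my own choices) the grid conjugate is a pointwise maximum of functions that are linear in $z$, hence convex, so its discrete second differences are non-negative even though $f$ itself is nonconvex.

```python
import numpy as np

f = lambda x: (x**2 - 1.0)**2          # a nonconvex double well
xs = np.linspace(-3.0, 3.0, 3001)
ps = np.linspace(-5.0, 5.0, 501)
fstar = np.array([np.max(p * xs - f(xs)) for p in ps])
print(np.all(np.diff(fstar, 2) >= -1e-8))   # True: second differences non-negative
```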
5 Examples

5.1 Example 1: Norm function (non-differentiable at zero)

$$f(y) = \|y\|$$ (37)
Its conjugate is
$$f^*(z) = \sup_y \{ y^t z - \|y\| \}$$ (38)
A norm can itself be written as a supremum,
$$\|y\| = \max_{\|b\| \le 1} y^t b$$ (39)
Now, we know that the maximum value of $y^t b$ over $\|b\| \le 1$ is $\|y\|$, so it is trivial to see that
$$\max_{\|b\| \le 1} \{ y^t b - \|y\| \} = 0 \quad \forall y \in \mathbb{R}$$ (40)
so $f^*(z) = 0$ if $\|z\| \le 1$ and $f^*(z) = \infty$ otherwise: the conjugate of a norm is the indicator function of the (dual) unit ball.
Fig. 4. The image explains the process involved when fitting a tangent with slope $p = 2$ (i.e. $z = 2$). Any line with slope $\notin [-1, 1]$ has to intersect the $y$-axis at $-\infty$ to be tangent to the function.
5.2 Example 2: Parabola

$$f(y) = y^2$$ (41)
$$f^*(z) = \sup_y \{ yz - y^2 \}$$ (42)
Setting the derivative with respect to $y$ to zero,
$$z - 2y = 0$$ (43)
$$y = \frac{z}{2}$$ (44)
Substituting this value of $y$ into $f^*(z)$ we get
$$f^*(z) = z\,\frac{z}{2} - \left(\frac{z}{2}\right)^2 = \frac{1}{4} z^2$$ (45)
5.3 Example 3: Quadratic form

$$f(y) = \frac{1}{2} y^t A y$$ (47)
Let us assume that $A$ is symmetric and invertible. The LF transform of the function $f(y)$ for an $n$-dimensional vector $y$ is defined as
$$f^*(z) = \sup_{y \in \mathbb{R}^n} \{ y^t z - f(y) \}$$ (48)

Fig. 5. The plot shows the parabola $y^2$ and its conjugate, which is also a parabola, $\frac{1}{4} z^2$.

Setting the gradient with respect to $y$ to zero,
$$\nabla_y \left( y^t z - \tfrac{1}{2} y^t A y \right) = z - A y = 0 \quad [\text{since } A \text{ is symmetric}]$$ (49)
$$y = A^{-1} z$$ (50)
and substituting back,
$$f^*(z) = z^t A^{-1} z - \tfrac{1}{2} z^t A^{-1} A A^{-1} z = \tfrac{1}{2} z^t A^{-1} z$$ (51)
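As a sanity check of this closed form (a sketch with an arbitrarily chosen positive definite $A$ and random $z$, not part of the original text), the maximiser $y = A^{-1}z$ indeed attains $\frac{1}{2} z^T A^{-1} z$:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)            # symmetric positive definite
z = rng.standard_normal(4)

y = np.linalg.solve(A, z)                 # maximiser of y^T z - (1/2) y^T A y
sup_val = y @ z - 0.5 * y @ A @ y
print(np.isclose(sup_val, 0.5 * z @ np.linalg.solve(A, z)))   # True
```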
5.4 Example 4: $l_p$ norms

$$f(y) = \frac{1}{p} \|y\|^p, \qquad 1 < p < \infty$$ (57)
$$f^*(z) = \sup_y \left\{ y^t z - \frac{1}{p} \|y\|^p \right\}$$ (58)
Setting the gradient with respect to $y$ to zero,
$$\nabla_y \left( y^t z - \frac{1}{p} \|y\|^p \right) = z - \|y\|^{p-1} \frac{y}{\|y\|} = 0$$ (59)
$$z = \|y\|^{p-2} \, y$$ (60)
Taking norms on both sides,
$$\|z\| = \|y\|^{p-1}, \qquad \|y\| = \|z\|^{\frac{1}{p-1}}$$ (61)
$$y = \frac{z}{\|z\|^{\frac{p-2}{p-1}}}$$ (62)
Substituting back,
$$f^*(z) = \frac{z^t z}{\|z\|^{\frac{p-2}{p-1}}} - \frac{1}{p} \|z\|^{\frac{p}{p-1}}$$ (63)
$$= \|z\|^{2 - \frac{p-2}{p-1}} - \frac{1}{p} \|z\|^{\frac{p}{p-1}}$$ (64)
$$= \|z\|^{\frac{2(p-1)-(p-2)}{p-1}} - \frac{1}{p} \|z\|^{\frac{p}{p-1}}$$ (65)
$$= \|z\|^{\frac{2p-2-p+2}{p-1}} - \frac{1}{p} \|z\|^{\frac{p}{p-1}}$$ (66)
$$= \|z\|^{\frac{p}{p-1}} - \frac{1}{p} \|z\|^{\frac{p}{p-1}}$$ (67)
$$= \left(1 - \frac{1}{p}\right) \|z\|^{\frac{p}{p-1}} = \left(1 - \frac{1}{p}\right) \|z\|^{\frac{1}{1 - \frac{1}{p}}}$$ (68)
Let us call
$$q = \frac{1}{1 - \frac{1}{p}}$$ (76)
Therefore, we obtain
$$f^*(z) = \frac{1}{q} \|z\|^q, \qquad \frac{1}{p} + \frac{1}{q} = 1$$ (77, 78)
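The conjugate-exponent pair can be verified on a grid (a sketch of mine with an arbitrary $p$ and test points, not part of the derivation above):

```python
import numpy as np

p = 3.0
q = 1.0 / (1.0 - 1.0 / p)                      # conjugate exponent, q = 1.5 here
ys = np.linspace(-50.0, 50.0, 200001)
for z in (0.5, 1.0, 2.0):
    fstar = np.max(z * ys - np.abs(ys)**p / p)
    print(np.isclose(fstar, abs(z)**q / q, atol=1e-4))   # True: f*(z) = (1/q)|z|^q
```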
5.5 Example 5: Exponential

Fig. 6. The plot shows the function $e^x$ and its conjugate, which is $z(\ln z - 1)$.

$$f(y) = e^y$$ (79)
$$f^*(z) = \sup_{y \in \mathbb{R}} \{ yz - e^y \}$$ (80)
If $z < 0$: $\sup_{y \in \mathbb{R}} \{ yz - e^y \}$ is unbounded, so $f^*(z) = \infty$.
If $z > 0$: $\sup_{y \in \mathbb{R}} \{ yz - e^y \}$ is bounded and can be computed from
$$\nabla_y (yz - e^y) = 0$$ (82)
$$z - e^y = 0$$ (83)
$$y = \ln z$$ (84)
which gives
$$f^*(z) = z \ln z - z = z(\ln z - 1)$$ (85)
If $z = 0$: $\sup_{y \in \mathbb{R}} \{ yz - e^y \} = \sup_{y \in \mathbb{R}} \{ -e^y \} = 0$ (86)
5.6 Example 6: Negative logarithm

Fig. 7. The plot shows the function $-\log y$ and its conjugate, which is $-1 - \log(-z)$.

For $f(y) = -\log y$,
$$f^*(z) = \sup_y \{ yz + \log y \}$$ (88)
$$\nabla_y (yz + \log y) = z + \frac{1}{y} = 0$$ (89)
$$y = -\frac{1}{z}$$ (90)
and substituting back (the conjugate is finite for $z < 0$),
$$f^*(z) = -1 - \log(-z)$$ (91)
Many problems in computer vision can be expressed in the form of energy minimisations [7]. A general class of the functions representing these problems can be written as
$$\min_{x \in X} \{ F(Kx) + G(x) \}$$ (93)
where $F$ and $G$ are proper convex functions and $K \in \mathbb{R}^{n \times m}$. Usually $F(Kx)$ corresponds to a regularisation term of the form $\|Kx\|$ and $G(x)$ corresponds to the data term. The dual form can be easily derived by replacing $F(Kx)$ with its convex conjugate, that is
$$\min_{x \in X} \max_{y \in Y} \{ \langle Kx, y \rangle - F^*(y) + G(x) \}$$ (94)
The dot product satisfies
$$\langle Kx, y \rangle = \langle x, K^T y \rangle$$ (96)
and in case the dot product is defined on a Hermitian space we can write it as
$$\langle Kx, y \rangle = \langle x, K^* y \rangle$$ (97)
Now, by definition,
$$\max_{x \in X} \{ \langle x, -K^* y \rangle - G(x) \} = G^*(-K^* y)$$ (100)
because of which
$$\min_{x \in X} \{ \langle x, K^* y \rangle + G(x) \} = -\max_{x \in X} \{ -\langle x, K^* y \rangle - G(x) \}$$ (101)
$$= -\max_{x \in X} \{ \langle x, -K^* y \rangle - G(x) \}$$ (102)
$$= -G^*(-K^* y)$$ (103)
Under weak assumptions in convex analysis, min and max can be switched in Eqn. (94); the dual problem then becomes
$$\max_{y \in Y} \{ -G^*(-K^* y) - F^*(y) \}$$ (104)
The saddle point can be found with the first-order primal-dual iterations of [7],
$$y^{n+1} = \mathrm{Prox}_{\sigma F^*}(y^n + \sigma K \bar{x}^n)$$ (105)
$$x^{n+1} = \mathrm{Prox}_{\tau G}(x^n - \tau K^* y^{n+1})$$ (106)
$$\bar{x}^{n+1} = x^{n+1} + \theta (x^{n+1} - x^n)$$ (107)
It can be shown that if $0 \le \theta \le 1$ and $\sigma \tau \|K\|^2 < 1$, $x^n$ converges to the minimiser of the original energy function.
Note: The standard way of writing the update equations is via the proximity operator, but here it reduces to standard pointwise gradient descent with a projection onto the unit ball, which is expressed through the $\max(1, |p|)$ factor in the subsequent equations. This projection comes from the indicator function.
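The three updates translate directly into a generic solver skeleton. Below is a minimal sketch, assuming $K$, its adjoint Kt and the two proximity operators are supplied as callables; the function name and signature are mine, not from [7].

```python
import numpy as np

def primal_dual(K, Kt, prox_Fstar, prox_G, x0, y0, sigma, tau, theta=1.0, iters=200):
    """Generic first-order primal-dual iteration, updates (105)-(107)."""
    x, x_bar, y = x0.copy(), x0.copy(), y0.copy()
    for _ in range(iters):
        y = prox_Fstar(y + sigma * K(x_bar))    # dual ascent step (105)
        x_new = prox_G(x - tau * Kt(y))         # primal descent step (106)
        x_bar = x_new + theta * (x_new - x)     # over-relaxation step (107)
        x = x_new
    return x
```

For the ROF model below, for instance, $K$ would be the discrete gradient, Kt its adjoint $-\mathrm{div}$, prox_Fstar the projection onto the unit ball and prox_G the data-term proximity operator.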
7.1 Preliminaries
Given that $a$ and $x$ are column vectors, the dot product $\langle a, x \rangle$ can be written as
$$\langle a, x \rangle = a^T x$$ (111)
Consider a matrix $A_{m,n}$ with entries $a_{i,j}$,
$$A_{m,n} = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{pmatrix}$$ (113)
stacked column-wise into a single vector $(a_{1,1}, a_{1,2}, \ldots, a_{m,n})^T$. The discrete gradient operator is then a sparse matrix that applies the difference operators $\partial_x$ and $\partial_y$ to every stacked entry,
$$\nabla A = \left( \partial_x a_{1,1}, \ldots, \partial_x a_{m,n}, \; \partial_y a_{1,1}, \ldots, \partial_y a_{m,n} \right)^T$$ (114)
and the discrete divergence maps a stacked field $(a^x, a^y)$ back entry-wise,
$$(\mathrm{div}\, A)_{i,j} = \partial_x a^x_{i,j} + \partial_y a^y_{i,j}$$ (115)
defined so that $\mathrm{div} = -\nabla^T$, i.e. $\langle p, \nabla u \rangle = -\langle \mathrm{div}\, p, u \rangle$.
7.2 Representation of norms

For a discrete image $u$ of size $W \times H$,
$$\|\nabla u\|_1 = \sum_{i=1}^{W} \sum_{j=1}^{H} |\nabla u_{i,j}|$$ (117)
$$|\nabla u_{i,j}| = \sqrt{(\partial_x u_{i,j})^2 + (\partial_y u_{i,j})^2}$$ (118)
where the backward and forward differences are, e.g. in the $y$ direction,
$$\partial_y^- u_{i,j} = u(i,j) - u(i,j-1)$$ (120)
$$\partial_y^+ u_{i,j} = u(i,j+1) - u(i,j)$$ (123)
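In code, these forward/backward differences give the standard discrete gradient and divergence pair. The sketch below is mine (the boundary convention is one common choice); it also verifies the adjoint relation $\langle \nabla u, p \rangle = -\langle u, \mathrm{div}\, p \rangle$.

```python
import numpy as np

def grad(u):
    """Forward-difference gradient; last row/column of each component is zero."""
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    """Backward-difference divergence, the negative adjoint of grad."""
    dx = np.zeros_like(px); dy = np.zeros_like(py)
    dx[0, :] = px[0, :]; dx[1:-1, :] = px[1:-1, :] - px[:-2, :]; dx[-1, :] = -px[-2, :]
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

# Adjoint check: <grad u, p> = -<u, div p>.
rng = np.random.default_rng(1)
u = rng.standard_normal((8, 8))
px, py = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))
gx, gy = grad(u)
print(np.isclose((gx * px + gy * py).sum(), -(u * div(px, py)).sum()))  # True
```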
7.3 ROF Model

The ROF model for image denoising is
$$\min_{u \in X} \|\nabla u\|_1 + \frac{\lambda}{2} \|u - g\|_2^2$$ (125)
Dualising the $L^1$ norm,
$$\|\nabla u\|_1 = \max_{p \in P} \langle p, \nabla u \rangle - \delta_P(p)$$ (126)
where $\delta_P$ is the indicator function of the unit ball $P$, yields the primal-dual energy
$$E(u, p) = \langle p, \nabla u \rangle - \delta_P(p) + \frac{\lambda}{2} \|u - g\|_2^2$$ (127)
The derivatives with respect to the two variables are
$$\nabla_p E(u, p) = \nabla u$$ (132)
$$\nabla_u E(u, p) = -\mathrm{div}\, p + \lambda (u - g)$$ (133)
which give the update equations
$$p^{n+1} = p^n + \sigma \nabla u^n$$ (138)
$$p^{n+1} = \frac{p^n + \sigma \nabla u^n}{\max(1, |p^n + \sigma \nabla u^n|)}$$ (139)
$$\frac{u^n - u^{n+1}}{\tau} = \nabla_u E(u, p) = -\mathrm{div}\, p^{n+1} + \lambda (u^{n+1} - g)$$ (140)
$$u^{n+1} = \frac{u^n + \tau (\mathrm{div}\, p^{n+1} + \lambda g)}{1 + \tau \lambda}$$ (141)
The projection onto the unit ball, $\max(1, |p|)$, comes from the indicator function, as explained in the note above. In the subsequent equations we use this property wherever a projection is made.
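Putting updates (138)-(141) together, a minimal ROF denoiser might look as follows; it reuses the grad/div helpers from the earlier sketch, and the step sizes are my own choice satisfying $\sigma\tau\|\nabla\|^2 < 1$ (with $\|\nabla\|^2 \le 8$ for this discretisation).

```python
import numpy as np

def rof_denoise(g, lam=8.0, sigma=0.25, tau=0.25, iters=300):
    """Primal-dual ROF denoising following updates (138)-(141);
    grad/div are the operators from the earlier sketch."""
    u = g.copy()
    px, py = np.zeros_like(g), np.zeros_like(g)
    for _ in range(iters):
        gx, gy = grad(u)
        px, py = px + sigma * gx, py + sigma * gy
        norm = np.maximum(1.0, np.sqrt(px**2 + py**2))   # projection onto unit ball
        px, py = px / norm, py / norm
        u = (u + tau * (div(px, py) + lam * g)) / (1.0 + tau * lam)   # update (141)
    return u
```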
7.4 Huber-ROF model

The interesting thing about the Huber model is that it has a continuous first derivative, so a simple gradient descent on the function can bring us to the minimum; the Newton-Raphson method, which requires the second-order derivative, cannot be applied because the second derivative of the Huber model is not continuous. The function we want to minimise is
$$\min_{u \in X} \|\nabla u\|_h + \frac{\lambda}{2} \|u - g\|_2^2$$ (143)
where the Huber norm is defined pointwise as
$$|x|_h = \begin{cases} \dfrac{|x|^2}{2\epsilon} & \text{if } |x| \le \epsilon \\[4pt] |x| - \dfrac{\epsilon}{2} & \text{if } |x| > \epsilon \end{cases}$$ (144)
Its conjugate is
$$f^*(p) = \begin{cases} \dfrac{\epsilon \|p\|_2^2}{2} & \text{if } \|p\| \le 1 \\[4pt] \infty & \text{otherwise} \end{cases}$$ (145)
Minimisation The minimisation can be carried out following a series of steps.
1. Compute the derivative with respect to $p$, i.e. $\nabla_p E(u, p)$, which for the dualised Huber term is
$$\nabla_p E(u, p) = \nabla u - \epsilon p$$ (147)
2. A semi-implicit ascent step in $p$ then gives
$$\frac{p^{n+1} - p^n}{\sigma} = \nabla u^n - \epsilon p^{n+1}$$ (158)
$$p^{n+1} = \frac{p^n + \sigma \nabla u^n}{1 + \sigma \epsilon}$$ (159)
followed by the projection
$$p^{n+1} = \frac{\dfrac{p^n + \sigma \nabla u^n}{1 + \sigma \epsilon}}{\max\left(1, \left|\dfrac{p^n + \sigma \nabla u^n}{1 + \sigma \epsilon}\right|\right)}$$ (160)
3. The update in $u$ is the same as for the ROF model,
$$\frac{u^n - u^{n+1}}{\tau} = \nabla_u E(u, p) = -\mathrm{div}\, p^{n+1} + \lambda (u^{n+1} - g)$$ (162)
$$u^{n+1} = \frac{u^n + \tau (\mathrm{div}\, p^{n+1} + \lambda g)}{1 + \tau \lambda}$$ (163)
TVL1 denoising

The TVL1 model replaces the quadratic data term with an $L^1$ norm,
$$\min_{u \in X} \|\nabla u\|_1 + \lambda \|u - f\|_1$$ (164)
This can be further rewritten, with $\lambda$ subsumed inside the norm, as
$$\min_{u \in X} \|\nabla u\|_1 + \|\lambda (u - f)\|_1$$ (165)
Dualising both norms, with dual variables $p \in P$ and $q \in Q$, gives
$$E(u, p, q) = \langle p, \nabla u \rangle - \delta_P(p) + \langle q, \lambda (u - f) \rangle - \delta_Q(q)$$ (166)
Since the indicator terms vanish inside the constraint sets,
$$\nabla_p \delta_P(p) = 0, \qquad \nabla_q \delta_Q(q) = 0$$ (173)
the derivatives of the energy are
$$\nabla_p E(u, p, q) = \nabla u$$ (174)
$$\nabla_q E(u, p, q) = \lambda (u - f)$$ (180)
$$\nabla_u E(u, p, q) = -\mathrm{div}\, p + \lambda q$$ (186)
The update equations are
$$p^{n+1} = p^n + \sigma \nabla u^n$$ (187)
$$p^{n+1} = \frac{p^n + \sigma \nabla u^n}{\max(1, |p^n + \sigma \nabla u^n|)}$$ (188)
$$\frac{q^{n+1} - q^n}{\sigma} = \nabla_q E(u, p, q) = \lambda (u^n - f)$$ (189)
$$q^{n+1} = q^n + \sigma \lambda (u^n - f)$$ (190)
$$q^{n+1} = \frac{q^n + \sigma \lambda (u^n - f)}{\max(1, |q^n + \sigma \lambda (u^n - f)|)}$$ (191)
$$\frac{u^n - u^{n+1}}{\tau} = \nabla_u E(u, p, q) = -\mathrm{div}\, p^{n+1} + \lambda q^{n+1}$$ (192)
$$u^{n+1} = u^n + \tau (\mathrm{div}\, p^{n+1} - \lambda q^{n+1})$$ (193)
Image Deconvolution

When the image has additionally been degraded by a linear blur operator $A$, the model becomes
$$\min_{u \in X} \|\nabla u\|_1 + \frac{\lambda}{2} \|Au - g\|_2^2$$ (203)
Dualising the regulariser gives the primal-dual energy
$$E(u, p) = \langle p, \nabla u \rangle + \frac{\lambda}{2} \|Au - g\|_2^2 - \delta_P(p)$$ (204)
$$\nabla_p E(u, p) = \nabla u$$ (205)
For $\nabla_u E$ we need the derivative of the quadratic term. Expanding,
$$(u^T A^T - g^T)(Au - g) = u^T A^T A u - u^T A^T g - g^T A u + g^T g$$ (219)
Let us say $B$ is $A^T A$; then
$$\nabla_u\, u^T A^T A u = \nabla_u\, u^T B u = (B + B^T) u$$ (220)
$$\nabla_u\, u^T A^T A u = (A^T A + (A^T A)^T) u$$ (221)
$$\nabla_u\, u^T A^T A u = (A^T A + A^T A) u$$ (222)
$$\nabla_u\, u^T A^T A u = 2 A^T A u$$ (223)
The update equations are therefore
$$\frac{p^{n+1} - p^n}{\sigma} = \nabla_p E(u, p) = \nabla u^n$$ (224)
$$p^{n+1} = p^n + \sigma \nabla u^n$$ (225)
$$p^{n+1} = \frac{p^n + \sigma \nabla u^n}{\max(1, |p^n + \sigma \nabla u^n|)}$$ (226)
$$\frac{u^n - u^{n+1}}{\tau} = \nabla_u E(u, p) = -\mathrm{div}\, p^{n+1} + \lambda (A^T A u^{n+1} - A^T g)$$ (227)
$$u^{n+1} \left( I + \tau \lambda A^T A \right) = u^n + \tau\, \mathrm{div}\, p^{n+1} + \tau \lambda A^T g$$ (229)
$$u^{n+1} = \left( I + \tau \lambda A^T A \right)^{-1} \left( u^n + \tau\, \mathrm{div}\, p^{n+1} + \tau \lambda A^T g \right)$$ (230)
This requires a matrix inversion. In some cases the matrix may be singular, and since it is generally large and sparse, direct inversion is not a feasible solution; one then resorts to Fourier analysis.
Another alternative is to dualise again, this time the data term, which yields
$$E(u, p, q) = \langle p, \nabla u \rangle - \delta_P(p) + \langle q, Au - g \rangle - \frac{1}{2\lambda} \|q\|^2$$ (231)
with
$$\nabla_q E(u, p, q) = Au - g - \frac{1}{\lambda} q$$ (233)
$$p^{n+1} = p^n + \sigma_p \nabla u^n$$ (235)
$$p^{n+1} = \frac{p^n + \sigma_p \nabla u^n}{\max(1, |p^n + \sigma_p \nabla u^n|)}$$ (236)
$$\frac{q^{n+1} - q^n}{\sigma_q} = \nabla_q E(u, p, q) = A u^n - g - \frac{1}{\lambda} q^{n+1}$$ (237)
$$q^{n+1} = \frac{q^n + \sigma_q (A u^n - g)}{1 + \frac{\sigma_q}{\lambda}}$$ (238)
$$\frac{u^n - u^{n+1}}{\tau} = \nabla_u E(u, p) = -\mathrm{div}\, p^{n+1} + A^T q^{n+1}$$ (239)
This avoids the matrix inversion entirely, one of the benefits of using the Legendre-Fenchel transformation.
Interesting tip Imagine we have a function of the form
$$E = (h \ast u - f)^2$$ (242)
where the operator $\ast$ denotes convolution. If one wants to take the derivative with respect to $u$, one can make use of the fact that $h \ast u$ can be expressed as a linear function with a sparse matrix $D$, i.e. $Du$. Rewriting the equation we can derive
$$E = (Du - f)^2 = (Du - f)^T (Du - f)$$ (243)
Now it is trivial to see the derivative of this function with respect to $u$. Referring to Eqn. 76 in [2], we can write the derivative of $E$ with respect to $u$ as
$$\frac{\partial E}{\partial u} = 2 D^T (Du - f)$$ (244)
Since
$$Du = h \ast u$$ (245)
applying $D^T$ corresponds to convolution with the mirrored kernel $\bar{h}$,
$$D^T (Du - f) = \bar{h} \ast (h \ast u - f)$$ (246)
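This identity is easy to sanity-check with a finite-difference probe. The sketch below is mine (1-D signal, circular boundaries, arbitrary data) and uses the fact that convolution with the mirrored kernel $\bar{h}$ is exactly correlation with $h$:

```python
import numpy as np
from scipy.ndimage import convolve1d, correlate1d

rng = np.random.default_rng(2)
h = rng.standard_normal(5)
u = rng.standard_normal(64)
f = rng.standard_normal(64)

def energy(u):
    return np.sum((convolve1d(u, h, mode='wrap') - f)**2)

# dE/du = 2 * hbar * (h * u - f): convolve the residual with the flipped kernel,
# i.e. correlate it with h.
grad_u = 2.0 * correlate1d(convolve1d(u, h, mode='wrap') - f, h, mode='wrap')

eps, e10 = 1e-6, np.zeros(64)
e10[10] = 1.0
fd = (energy(u + eps * e10) - energy(u - eps * e10)) / (2 * eps)
print(np.isclose(fd, grad_u[10]))   # True
```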
Optic Flow

Optic flow was popularised by Horn and Schunck's seminal paper [4], which over the following two decades sparked great interest in minimising the energy function associated with computing optic flow in its various formulations [5][6][9]. The standard $L^1$-norm based optic flow energy is
$$\min_{u \in X, v \in Y} \|\nabla u\|_1 + \|\nabla v\|_1 + \lambda |I_1(x + f) - I_2(x)|$$ (247)
where $f$ is the flow vector $(u, v)$ at any pixel $(x, y)$ in the image; for brevity $(x, y)$ is replaced by $x$. Substituting $p$ for the dual variable corresponding to $u$, $q$ for $v$ and $r$ for $I_1(x + f) - I_2(x)$, we can rewrite the original energy formulation in its primal-dual form as
$$\max_{p, q, r} \; \min_{u, v} \; \langle p, \nabla u \rangle + \langle q, \nabla v \rangle + \langle r, \lambda (I_1(x + f) - I_2(x)) \rangle - \delta_P(p) - \delta_Q(q) - \delta_R(r)$$ (248)
The corresponding dual update equations are
$$\frac{p^{n+1} - p^n}{\sigma_p} = \nabla u^n$$ (259)
$$p^{n+1} = \frac{p^n + \sigma_p \nabla u^n}{\max(1, |p^n + \sigma_p \nabla u^n|)}$$ (260)
$$\frac{q^{n+1} - q^n}{\sigma_q} = \nabla v^n$$ (261)
$$q^{n+1} = \frac{q^n + \sigma_q \nabla v^n}{\max(1, |q^n + \sigma_q \nabla v^n|)}$$ (262)
$$\frac{r^{n+1} - r^n}{\sigma_r} = \lambda (I_1(x + f^n) - I_2(x))$$ (263)
$$r^{n+1} = \frac{r^n + \sigma_r \lambda (I_1(x + f^n) - I_2(x))}{\max(1, |r^n + \sigma_r \lambda (I_1(x + f^n) - I_2(x))|)}$$ (264)
8.3 Super-Resolution

The formulation was first used in [8], but we will describe the minimisation procedure below. The energy to minimise is
$$\min_{u \in X} \|\nabla u\|_{\epsilon_u} + \lambda \sum_{i=1}^{N} \|DBW_i u - f_i\|_{\epsilon_d}$$ (267)
With $\lambda > 0$, let us now derive the conjugate of $\lambda \|\cdot\|$. We see
$$f^*(p) = \sup_{u \in \mathbb{R}} \; \langle p, u \rangle - \lambda \|u\|$$ (268)
$$f^*(p) = \lambda \left( \sup_{u \in \mathbb{R}} \; \langle \tfrac{p}{\lambda}, u \rangle - \|u\| \right)$$ (269)
Let us now denote $k = p/\lambda$ (270), so that (the positive factor $\lambda$ does not change an indicator)
$$f^*(p) = \tilde{f}^*(k)$$ (271)
But we know that $\sup_{u \in \mathbb{R}} \langle k, u \rangle - \|u\|$ is an indicator function, defined by
$$\tilde{f}^*(k) = \begin{cases} 0 & \text{if } \|k\| \le 1 \\ \infty & \text{otherwise} \end{cases}$$ (272)
Therefore, replacing $k$ by $p/\lambda$, we can write $f^*(p)$ as
$$f^*(p) = \begin{cases} 0 & \text{if } \left\| \frac{p}{\lambda} \right\| \le 1 \\ \infty & \text{otherwise} \end{cases}$$ (274)
Dualising both Huber norms in (267) yields the primal-dual energy
$$E(u, p, q_i) = \langle p, \nabla u \rangle - \frac{\epsilon_u}{2 h^2} \|p\|^2 + \sum_{i=1}^{N} \left\{ \langle q_i, (\lambda h)^2 (DBW_i u - f_i) \rangle - \frac{\epsilon_d}{2 (\lambda h)^2} |q_i|^2 \right\}$$ (275)
(the factors of the grid size $h$ follow the discretisation used in [8]).

Minimisation The minimisation equations can be written as follows.
1. The derivatives of $E$ are
$$\nabla_p E(u, p, q_i) = \nabla u - \frac{\epsilon_u}{h^2} p$$ (278)
$$\nabla_{q_i} E(u, p, q_i) = (\lambda h)^2 (DBW_i u - f_i) - \frac{\epsilon_d}{(\lambda h)^2} q_i$$ (279)
$$\nabla_u E(u, p, q_i) = -\mathrm{div}\, p + (\lambda h)^2 \sum_{i=1}^{N} W_i^T B^T D^T q_i$$ (282)
2. A semi-implicit ascent step in $p$, followed by the projection, gives
$$p^{n+1} = \frac{p^n + \sigma_p \nabla u^n}{1 + \sigma_p \frac{\epsilon_u}{h^2}}$$ (283)
$$p^{n+1} = \frac{p^{n+1}}{\max(1, |p^{n+1}|)}$$ (286)
3. Similarly for each $q_i$,
$$\frac{q_i^{n+1} - q_i^n}{\sigma_q} = (\lambda h)^2 (DBW_i u^n - f_i) - \frac{\epsilon_d}{(\lambda h)^2} q_i^{n+1}$$ (287)
$$q_i^{n+1} = \frac{q_i^n + \sigma_q (\lambda h)^2 (DBW_i u^n - f_i)}{1 + \sigma_q \frac{\epsilon_d}{(\lambda h)^2}}$$ (289)
$$q_i^{n+1} = \frac{q_i^{n+1}}{\max(1, |q_i^{n+1}|)}$$ (290)
4. Finally the primal descent step in $u$,
$$\frac{u^n - u^{n+1}}{\tau} = \nabla_u E(u, p, q_i) = -\mathrm{div}\, p^{n+1} + (\lambda h)^2 \sum_{i=1}^{N} W_i^T B^T D^T q_i^{n+1}$$ (291)
$$u^{n+1} = u^n + \tau \left( \mathrm{div}\, p^{n+1} - (\lambda h)^2 \sum_{i=1}^{N} W_i^T B^T D^T q_i^{n+1} \right)$$ (292)
8.4 Joint tracking and super-resolution

Let us now turn our attention towards doing full joint tracking and super-resolution image reconstruction. Before we derive anything, let us formulate the problem from a Bayesian point of view. We are given the downsampling and blurring operators, and we want to determine the optical flow between the images and reconstruct the super-resolution image at the same time. The posterior probability can be written as
$$P(u, \{w_i\}_{i=1}^{N} \mid \{f_i\}_{i=1}^{N}, D, B)$$ (293)
Using the standard Bayes rule, we can write this in terms of likelihoods and priors as
$$P(u, \{w_i\}_{i=1}^{N} \mid \{f_i\}_{i=1}^{N}, D, B) \propto \prod_{i=1}^{N} P(f_i \mid w_i, u, D, B)\, P(w_i, u, D, B)$$ (294)
Here $P(w_i, u, D, B)$ marks our prior for the super-resolution image and the flow. It can be easily simplified under the assumption that the flow prior is independent of the super-resolution prior:
$$P(w_i, u, D, B) = \prod_{i=1}^{N} P(w_i)\, P(u)$$ (296)
Taking negative logarithms turns the priors into regularisers, e.g.
$$-\log P(u) = \|\nabla u\|$$ (298)
and the overall energy to minimise becomes
$$E(u, \{w_i\}_{i=1}^{N}) = \sum_{i=1}^{N} \|DBu(x + w_i) - f_i\| + \sum_{i=1}^{N} \lambda_i \{ \|\nabla w_{xi}\| + \|\nabla w_{yi}\| \} + \|\nabla u\|$$ (299)
We dualise with respect to each $L^1$ norm and obtain the following expression:
$$E(u, \{w_i\}_{i=1}^{N}) = \sum_{i=1}^{N} \langle q_i, DBu(x + w_i) - f_i \rangle + \sum_{i=1}^{N} \lambda_i \{ \langle r_{xi}, \nabla w_{xi} \rangle + \langle r_{yi}, \nabla w_{yi} \rangle \} + \langle p, \nabla u \rangle$$ (300)
The dual ascent steps follow the familiar pattern, e.g. for $r_{yi}$
$$\frac{r_{yi}^{n+1} - r_{yi}^n}{\sigma_r} = \nabla w_{yi}^n$$ (304)
The warped image is linearised around the current flow estimate $w_i^{n-1}$:
$$DBu(x + w_i^{n-1} + dw_i^n) - f_i = DB\{ u(x + w_i^{n-1}) + \partial_x u(x + w_i^{n-1})\, dw_{xi}^n + \partial_y u(x + w_i^{n-1})\, dw_{yi}^n \} - f_i$$ (307)
Replacing $dw_{xi}^n$ and $dw_{yi}^n$ by $w_{xi}^n - w_{xi}^{n-1}$ and $w_{yi}^n - w_{yi}^{n-1}$ respectively, we can rewrite the above equation as
$$DBu(x + w_i^{n-1} + dw_i^n) - f_i = DB\{ u(x + w_i^{n-1}) + \partial_x u(x + w_i^{n-1})(w_{xi}^n - w_{xi}^{n-1}) + \partial_y u(x + w_i^{n-1})(w_{yi}^n - w_{yi}^{n-1}) \} - f_i$$ (308)
Treating $w_i^{n-1}$ as constant, we can now minimise the energy function with respect to $w_{xi}^n$ and $w_{yi}^n$ respectively. Differentiating the linearised, dualised energy and performing a gradient step yields the update equations
$$\frac{w_{xi}^{n+1} - w_{xi}^n}{\tau_w} = -I_x^T B^T D^T q_i^{n+1} + \mathrm{div}\, r_{xi}^{n+1}, \qquad I_x = \mathrm{diag}(\partial_x(u(x + w_i^n)))$$ (311, 312)
and similarly
$$\frac{w_{yi}^{n+1} - w_{yi}^n}{\tau_w} = -I_y^T B^T D^T q_i^{n+1} + \mathrm{div}\, r_{yi}^{n+1}, \qquad I_y = \mathrm{diag}(\partial_y(u(x + w_i^n)))$$ (313, 314)
Optimisations with respect to $q_i$, $w_{xi}$, $w_{yi}$, $r_{xi}$ and $r_{yi}$ are done in a coarse-to-fine pyramid fashion.
Optimising with respect to u Given the current solution for $w_i^n$, we can write $u(x + w_i^n)$ as a linear function of $u$ by multiplying it with the warping matrix $W_i^n$, i.e. $W_i^n u$. The update then is
$$\frac{u^{n+1} - u^n}{\tau} = -\sum_{i=1}^{N} (W_i^{n+1})^T B^T D^T q_i^{n+1} + \mathrm{div}\, p^{n+1}$$ (315)
The constants $\tau$ and $\sigma$ are usually very easy to set if the operator $K$ in the equation
$$\min_{x \in X} F(Kx) + G(x)$$ (316)
is a simple operator, in which case $\tau$ and $\sigma$ can easily be found from the constraint $\sigma \tau L^2 \le 1$, where $L = \|K\|$ is the operator norm. For instance, if we take the ROF model
$$\min_{u \in X} \|\nabla u\|_1 + \frac{\lambda}{2} \|u - g\|_2^2$$ (317)
and dualise both terms, we can identify
$$K = \begin{pmatrix} \nabla \\ I \end{pmatrix}, \quad x = u, \quad y = \begin{pmatrix} p \\ q \end{pmatrix}, \quad F^*(y) = \delta_P(p) + \langle q, g \rangle + \frac{1}{2\lambda} \|q\|^2 \quad \text{and} \quad G(x) = 0$$ (318)
It is easy to see that if $K$ has a simple form, we can write a closed-form expression for its norm $\|K\|$. However, if $K$ has a more complicated structure, e.g. in the case of deblurring or super-resolution, $K$ has different entries in each row and it is hard to come up with a closed-form expression for its norm; in that case one would like to know how to set $\tau$ and $\sigma$ so that we can still carry out the succession of iterations for the variables involved in the minimisation. A formulation from Pock et al. [10] describes a way to set $\tau$ and $\sigma$ such that the optimality condition of convergence still holds:
$$\sigma_j = \frac{1}{\sum_{i=1}^{N} |K_{ij}|^{2-\alpha}} \qquad \text{and} \qquad \tau_i = \frac{1}{\sum_{j=1}^{M} |K_{ij}|^{\alpha}}$$ (321)
where generally $\alpha = 1$.
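In code, these per-element steps amount to reciprocal column and row sums of powers of $|K|$ (a sketch for a dense $K$; the guard against all-zero rows or columns is my own addition):

```python
import numpy as np

def diag_precond(K, alpha=1.0):
    """sigma_j = 1 / sum_i |K_ij|^(2-alpha), tau_i = 1 / sum_j |K_ij|^alpha, cf. (321)."""
    A = np.abs(K)
    col_sums = (A**(2.0 - alpha)).sum(axis=0)    # one sigma per column j
    row_sums = (A**alpha).sum(axis=1)            # one tau per row i
    sigma = 1.0 / np.maximum(col_sums, 1e-12)
    tau = 1.0 / np.maximum(row_sums, 1e-12)
    return sigma, tau
```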
10 Discussion

It may at first seem a bit confusing that adding more variables using duality makes the optimisation quicker. However, expressing a convex function as a combination of simple linear functions in the dual space makes the whole problem easier to handle. Working on the primal and dual problems at the same time brings us closer to the solution more quickly. Being able to switch min and max during the optimisation means that strong duality holds.
10.1 A fully differentiable example

Let us take an example of a function which is convex and differentiable everywhere. We take the ROF model and replace the $L^1$ norm with the standard (squared) $L^2$ norm, i.e.
$$E(u, \nabla u) = \min_{u \in X} \|\nabla u\|_2^2 + \frac{\lambda}{2} \|u - g\|_2^2$$ (322)
The Euler-Lagrange equation of this functional is
$$\frac{\partial E}{\partial u} - \frac{\partial}{\partial x} \frac{\partial E}{\partial u_x} - \frac{\partial}{\partial y} \frac{\partial E}{\partial u_y} = 0$$ (323)
where of course $\|\nabla u\|^2$ is defined as $(u_x^2 + u_y^2)$, with $u_x$ the derivative with respect to $x$ and similarly for $y$. Now if we were to write the gradient descent update step with respect to $u$, we would obtain the following update equation:
$$\frac{u^{n+1} - u^n}{\tau} = -\left( \lambda (u - g) - 2 \frac{\partial}{\partial x}(u_x) - 2 \frac{\partial}{\partial y}(u_y) \right)$$ (325)
Since
$$\nabla^2 u = \frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$$ (326)
our minimisation with respect to $u$ leads to the final gradient step update
$$\frac{u^{n+1} - u^n}{\tau} = -\left( \lambda (u - g) - 2 \nabla^2 u \right)$$ (327)
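A small sketch of this explicit descent (my own: five-point Laplacian, reflecting boundaries, step size chosen inside the stability bound $\tau(\lambda + 16) < 2$):

```python
import numpy as np

def l2_denoise(g, lam=1.0, tau=0.1, iters=500):
    """Explicit gradient descent (327) for min ||grad u||_2^2 + (lam/2)||u - g||_2^2."""
    u = g.copy()
    for _ in range(iters):
        up = np.pad(u, 1, mode='edge')
        lap = (up[:-2, 1:-1] + up[2:, 1:-1] +
               up[1:-1, :-2] + up[1:-1, 2:] - 4.0 * u)   # 5-point Laplacian
        u = u - tau * (lam * (u - g) - 2.0 * lap)
    return u
```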
For comparison, dualising both quadratic terms gives the saddle-point form of the same problem,
$$\max_{p, q} \; \min_{u} \; \langle p, \nabla u \rangle - \frac{1}{4} \|p\|_2^2 + \langle q, u - g \rangle - \frac{1}{2\lambda} \|q\|_2^2$$ (328)
and the update equations then follow the same pattern as in the previous sections.
References

1. Rockafellar, R.T.: Convex Analysis. Princeton University Press.
2. Petersen, K.B., Pedersen, M.S.: The Matrix Cookbook. http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pdf/imm3274.pdf
3. http://gpu4vision.icg.tugraz.at/
4. Horn, B.K.P., Schunck, B.G.: Determining Optical Flow. Artificial Intelligence, vol. 17, pp. 185-203, 1981.
5. Baker, S., Matthews, I.: Lucas-Kanade 20 Years On: A Unifying Framework. International Journal of Computer Vision, vol. 56, no. 3, pp. 221-255.
6. Zach, C., Pock, T., Bischof, H.: A Duality Based Approach for Realtime TV-L1 Optical Flow. Proceedings of the DAGM Conference on Pattern Recognition, 2007.
7. Chambolle, A., Pock, T.: A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. Journal of Mathematical Imaging and Vision.
8. Unger, M., Pock, T., Werlberger, M., Bischof, H.: A Convex Approach for Variational Super-Resolution.
9. Steinbruecker, F., Pock, T., Cremers, D.: Large Displacement Optical Flow Computation without Warping. International Conference on Computer Vision, 2009.
10. Pock, T., Chambolle, A.: Diagonal Preconditioning for First Order Primal-Dual Algorithms in Convex Optimization. International Conference on Computer Vision, 2011.