
Sampling: 60 years after Shannon

Michael Unser
Biomedical Imaging Group
EPFL, Lausanne
Switzerland

Plenary talk, DSP2009, Santorini, Greece, July 2009

SAMPLING: 60 years after Shannon

Analog/physical world: continuous signals f ∈ L2(R)
Discrete domain: discrete signals c ∈ ℓ2(Z)
sampling: L2(R) → signal subspace → ℓ2(Z); interpolation: back to the continuum
Discrete-domain tools: reconstruction algorithms, denoising, signal processing, ...

Universal (no prior knowledge):
- Shannon's sampling theory (1940s)
- Generalized sampling using splines (1970s and 1990s)
- Consistent sampling (noise-free scenario) (1990s)

Constrained (prior knowledge):
- Regularized sampling (for noisy data) (2000s)
- Sampling with sparsity constraints
- ...

Shannon's sampling reinterpreted

Generating function: φ(x) = sinc(x)
Subspace of bandlimited functions: V(φ) = span{φ(x − k)}_{k∈Z}

[Block diagram: f(x) ∈ L2 → anti-aliasing filter φ(x) → sampling (× Σ_{k∈Z} δ(x − k)) → ideal synthesis filter φ(x) → f̃(x)]

Analysis: f(k) = ⟨sinc(· − k), f⟩  (ideal anti-aliasing filter followed by sampling)
Synthesis: f̃(x) = Σ_{k∈Z} f(k) sinc(x − k)  (ideal reconstruction filter)
Orthonormal basis: ⟨sinc(· − k), sinc(· − l)⟩ = δ_{k−l}
⇒ the scheme implements an orthogonal projection operator!
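The synthesis formula above is easy to emulate numerically. A minimal numpy sketch (truncated to finitely many samples, so it only approximates the ideal bandlimited expansion):

```python
import numpy as np

def sinc_interp(samples, x):
    """Evaluate f~(x) = sum_k samples[k] * sinc(x - k), with k = 0..N-1."""
    k = np.arange(len(samples))
    # np.sinc(t) = sin(pi*t)/(pi*t), which is exactly Shannon's interpolation kernel
    return np.sinc(x[:, None] - k[None, :]) @ samples
```

Since sinc(x − k) equals 1 at x = k and 0 at every other integer, evaluating at the sample instants returns the samples themselves, whatever the underlying signal.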

Fundamental sampling questions

Q1: Are there alternative choices of representations?
ANSWER: Yes, of course! Specification of the reconstruction space.

Q2: How good is the representation/signal reconstruction?
ANSWER: Approximation theory; rate of decay of the error as the sampling step goes to zero.

Q3: How should we formulate the reconstruction problem?
- Noise-free: consistent (but not exact) reconstruction
- Noisy data: regularized sampling with smoothness and/or sparsity constraints

Q4: Can we design fast/efficient algorithms?

Q5: Can we specify optimal reconstruction spaces/solutions?
ANSWER: Yes, under specific conditions.

Q6: Should we redesign the whole system?
Compressive sensing ...

Part 1: Sampling theory and splines

More general generating function: sinc(x) → φ(x) → βⁿ(x)  (polynomial B-spline of degree n)

Justifications for using (B-)splines
- Ease of use: short, piecewise-polynomial basis functions
- Generality: progressive transition from piecewise-constant (n = 0) to bandlimited (n → ∞)
- Improved performance: best cost/quality tradeoff
- Optimal from a number of perspectives:
  - Approximation theory: shortest basis functions for a given order of approximation
  - Link with differential operators (Green's functions)
  - Variational properties
  - Minimum mean-square error estimators for certain classes of stochastic processes
  - Fundamental role in wavelet theory

PRELIMINARIES

- Function and sequence spaces
- Shift-invariant subspaces
- Splines and operators

Continuous-domain signals
Mathematical representation: a function of the continuous variable x ∈ R

Lebesgue's space of finite-energy functions
L2(R) = { f(x), x ∈ R : ∫_{x∈R} |f(x)|² dx < +∞ }
L2 inner product: ⟨f, g⟩ = ∫_{x∈R} f(x) g*(x) dx
L2 norm: ‖f‖_{L2} = ( ∫_{x∈R} |f(x)|² dx )^{1/2} = ⟨f, f⟩^{1/2}

Fourier transform
Integral definition: f̂(ω) = ∫_{x∈R} f(x) e^{−jωx} dx
Parseval relation: ‖f‖²_{L2} = (1/2π) ∫ |f̂(ω)|² dω

Discrete-domain signals
Mathematical representation: a sequence indexed by the discrete variable k ∈ Z

Space of finite-energy sequences
ℓ2(Z) = { a[k], k ∈ Z : Σ_{k∈Z} |a[k]|² < +∞ }
ℓ2 norm: ‖a‖_{ℓ2} = ( Σ_{k∈Z} |a[k]|² )^{1/2}

Discrete-time Fourier transform
z-transform: A(z) = Σ_{k∈Z} a[k] z^{−k}
Fourier transform: A(e^{jω}) = Σ_{k∈Z} a[k] e^{−jωk}

Shift-invariant spaces
Integer-shift-invariant subspace associated with a generating function φ (e.g., a B-spline):
V(φ) = { f(x) = Σ_{k∈Z} c[k] φ(x − k) : c ∈ ℓ2(Z) }
Generating function: φ(x) ↔ φ̂(ω) = ∫_{x∈R} φ(x) e^{−jωx} dx

Autocorrelation (or Gram) sequence
a_φ[k] = ⟨φ(·), φ(· − k)⟩ ↔ A_φ(e^{jω}) = Σ_{n∈Z} |φ̂(ω + 2πn)|²

Riesz-basis condition
Positive-definite Gram sequence: 0 < A² ≤ A_φ(e^{jω}) ≤ B² < +∞
⇔ A ‖c‖_{ℓ2} ≤ ‖ Σ_{k∈Z} c[k] φ(· − k) ‖_{L2} ≤ B ‖c‖_{ℓ2}

Orthonormal basis: a_φ[k] = δ_k ⇔ A_φ(e^{jω}) = 1 ⇒ ‖c‖_{ℓ2} = ‖f‖_{L2} (Parseval)

Examples of reconstruction spaces

Piecewise-constant functions: φ(x) = rect(x) = β⁰(x)
a_φ[k] = δ_k ⇒ the basis is orthonormal

Bandlimited functions: φ(x) = sinc(x)
Σ_{n∈Z} |φ̂(ω + 2πn)|² = 1 ⇒ the basis is orthonormal

Polynomial splines of degree n: φ(x) = βⁿ(x) = (β⁰ ∗ β⁰ ∗ ⋯ ∗ β⁰)(x)  ((n+1)-fold convolution)
Autocorrelation sequence: a_{βⁿ}[k] = (βⁿ ∗ βⁿ)(x)|_{x=k} = β^{2n+1}(k)

Proposition. The B-spline of degree n, βⁿ(x), generates a Riesz basis with lower and
upper Riesz bounds A = inf_ω {A_{βⁿ}(e^{jω})}^{1/2} and B = sup_ω {A_{βⁿ}(e^{jω})}^{1/2} = 1.
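The proposition can be checked numerically: build βⁿ by repeated convolution of the unit box on a fine grid, sample its autocorrelation at the integers, and scan the Gram filter A_{βⁿ}(e^{jω}). A numpy sketch (the grid density m is an accuracy knob, not part of the theory):

```python
import numpy as np

def bspline_gram(n, m=500):
    """Gram sequence a[k] = <beta_n, beta_n(.-k)>, k = 0..n, on a grid of step 1/m."""
    h = 1.0 / m
    s = np.ones(m)                       # samples of beta_0, the unit box
    for _ in range(n):                   # beta_n = (n+1)-fold convolution of boxes
        s = np.convolve(s, np.ones(m)) * h
    return np.array([h * np.dot(s[k * m:], s[:len(s) - k * m]) for k in range(n + 1)])

def riesz_bounds_sq(n, m=500):
    """(inf, sup) of the Gram filter A(e^{jw}) = a[0] + 2 sum_k a[k] cos(k w)."""
    a = bspline_gram(n, m)
    w = np.linspace(0.0, np.pi, 1024)
    A = a[0] + 2 * sum(a[k] * np.cos(k * w) for k in range(1, len(a)))
    return A.min(), A.max()
```

For the cubic B-spline (n = 3) this returns sup A ≈ 1 (so B = 1) and inf A ≈ 0.054, i.e., a lower Riesz bound A ≈ 0.23: the basis is stable but not orthonormal.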

Cardinal L-splines
L{·}: differential operator (translation-invariant)
δ(x): Dirac distribution

Definition
The continuous-domain function s(x) is called a cardinal L-spline iff
L{s}(x) = Σ_{k∈Z} a[k] δ(x − k)

Location of singularities = spline knots (integers)

Generalization: includes polynomial splines as a particular case (L = d^N/dx^N)

Example: piecewise-constant splines

Spline-defining operators
Continuous-domain derivative: D = d/dx
Discrete derivative: Δ₊{·} ↔ 1 − e^{−jω}

Piecewise-constant or D-spline
D{s}(x) = Σ_{k∈Z} a[k] δ(x − k)
s(x) = Σ_{k∈Z} s(k) β⁰₊(x − k)

B-spline function
β⁰₊(x) = Δ₊ D^{−1}{δ}(x) ↔ (1 − e^{−jω}) / (jω)

Basic sampling problem

Sampling system
f(x) ∈ L2(R) → acquisition device (filter h) → sampling → + noise n[k]
Discrete measurements: g[k] = (h ∗ f)(x)|_{x=k} + n[k]
→ reconstruction algorithm, subject to constraints (prior knowledge)

Continuous-domain reconstruction
f̃(x) = Σ_{k∈Z} c[k] φ(x − k), with signal coefficients {c[k]}_{k∈Z} and the Riesz-basis property

Goal: Specify a set of constraints, a reconstruction space and a reconstruction algorithm so that f̃(x) is a good approximation of f(x)

VARIATIONAL RECONSTRUCTION

- Regularized interpolation
- Generalized smoothing splines
- Optimal reconstruction space
- Splines and total variation

Regularized interpolation (ideal sampler)

Given the noisy data g[k] = f(k) + n[k], obtain an estimate f̃ of f that is
1. (piecewise-)smooth, to reduce the effect of noise (regularization)
2. consistent with the given data (data fidelity)

Variational formulation
f̃ = arg min_{f∈V(φ)} J(f, g; λ)
J(f, g; λ) = Σ_{k∈Z} |g[k] − f(k)|²  (data-fidelity term)  +  λ ∫_R Φ(|L{f}(x)|) dx  (regularization)

L: differential operator used to quantify the lack of smoothness; e.g., D = d/dx or D²
Φ(·): increasing potential function used to penalize non-smooth solutions (e.g., Φ(u) = |u|²)
λ ≥ 0: regularization parameter that strikes a balance between smoothing and consistency

Regularized fit: smoothing splines

Theorem: The solution (among all functions) of the smoothing spline problem
min_{f(x)} Σ_{k∈Z} |g[k] − f(k)|² + λ ∫_R |D^m f(x)|² dx
is a cardinal polynomial spline of degree 2m − 1. Moreover, its B-spline coefficients
can be obtained by suitable recursive filtering of the input samples g[k].
[Schoenberg, 1973; U., 1992]

Discrete, noisy input: g[k] = f(k) + n[k]
Smoothing spline filter: c[k] = (h_λ ∗ g)[k]
Polynomial spline reconstruction: f̃(x) = Σ_{k∈Z} c[k] βⁿ(x − k), with n = 2m − 1
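For m = 2 the smoothing-spline filter has the explicit frequency response H_λ(e^{jω}) = 1/(B₃(e^{jω}) + λ(2 sin(ω/2))⁴), where B₃(e^{jω}) = 2/3 + cos(ω)/3 is the sampled cubic B-spline filter. The sketch below applies it with periodic boundary conditions via the FFT; the papers cited above use recursive filters instead, which is faster:

```python
import numpy as np

def _b3(w):
    """Frequency response of the sampled cubic B-spline filter."""
    return 2.0 / 3.0 + np.cos(w) / 3.0

def smoothing_spline_coeffs(g, lam):
    """Cubic smoothing-spline coefficients c = h_lam * g (periodic, via FFT)."""
    w = 2 * np.pi * np.arange(len(g)) / len(g)
    q = (2 * np.sin(w / 2)) ** 4          # spectrum of the discretized D^4 operator
    return np.real(np.fft.ifft(np.fft.fft(g) / (_b3(w) + lam * q)))

def resample(c):
    """Values f~(k) = sum_l c[l] * beta3(k - l) (periodic)."""
    w = 2 * np.pi * np.arange(len(c)) / len(c)
    return np.real(np.fft.ifft(np.fft.fft(c) * _b3(w)))
```

With λ = 0 the filter reduces to the exact cubic-spline interpolator (the resampled values reproduce g); increasing λ trades data fidelity for smoothness.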

Draftsman's spline: m = 2 and λ → 0

The minimum-curvature interpolant is a cubic spline!

Photo courtesy of Carl De Boor

Generalized smoothing spline

L: spline-defining differential operator
Theorem: The solution (among all functions) of the smoothing spline problem
min_{f(x)} Σ_{k∈Z} |g[k] − f(k)|² + λ ∫_R |Lf(x)|² dx
is a cardinal L*L-spline. The solution can be calculated as
f̃(x) = Σ_{k∈Z} (h_λ ∗ g)[k] φ_L(x − k)
where φ_L is an optimal B-spline generator and h_λ a corresponding digital reconstruction filter parametrized by λ.
[U.-Blu, IEEE-SP, 2005]

Variational reconstruction: optimal discretization

Definition: φ_L is an optimal generator with respect to L iff
- it generates a shift-invariant Riesz basis {φ_L(x − k)}_{k∈Z}
- φ_L is a cardinal L*L-spline; i.e., there exists a sequence q[k] s.t.
  L*L{φ_L}(x) = Σ_{k∈Z} q[k] δ(x − k)
- Short support: φ_L can be chosen of size 2N, where N is the order of the operator

Optimal digital reconstruction filter
H_λ(z) = 1 / (B_L(z) + λ Q(z))  with  B_L(z) = Σ_{k∈Z} φ_L(k) z^{−k}

Stochastic optimality of splines

Stationary processes
A smoothing spline estimator provides the MMSE estimate of a continuously-defined signal f(x)
given its noisy samples iff L is the whitening operator of the process and λ = σ²/σ₀² [Unser-Blu, 2005].
Advantages: the spline machinery often yields a most efficient implementation:
shortest basis functions (B-splines) together with recursive algorithms (especially in 1D).

Fractal processes
Fractional Brownian motion (fBm) is a self-similar process of great interest for the modeling of natural
signals and images. fBms are non-stationary, meaning that the Wiener formalism is not applicable
(their power spectrum is not defined!).
Yet, using a distributional formalism (Gelfand's theory of generalized stochastic processes), it can be
shown that these processes are whitened by fractional derivatives.
The MMSE estimate of an fBm with Hurst exponent H is a fractional smoothing spline of order
γ = 2H + 1: L̂(ω) = (jω)^{γ/2} [Blu-Unser, 2007].
Special case: the MMSE estimate of the Wiener process (Brownian motion) is a linear spline (γ = 2).

Generalization: non-quadratic data term

General cost function with quadratic regularization
J(f, g) = J_data(f, g) + λ ‖Lf‖²_{L2(R^d)}
J_data(f, g): arbitrary, but depends on the input data g[k] and the samples {f(k)}_{k∈Z} only

Theorem. If φ_L is optimal with respect to L and a solution exists, then the optimum
reconstruction over ALL continuously-defined functions f is such that
min_f J(f, g) = min_{f∈V(φ_L)} J(f, g).
Hence, there is an optimal solution of the form Σ_{k∈Z} c[k] φ_L(x − k) that can be found
by DISCRETE optimization.

Note: similar optimality results apply for the non-ideal sampling problem
[Ramani-U., IEEE-IP, 2008]

Splines and total variation

Variational formulation with TV-type regularization
f̃ = arg min_{f∈L2(R)} J(f, g)
J(f, g) = Σ_{k∈Z} |g[k] − f(k)|²  (data-fidelity term)  +  λ ∫_R |Dⁿ{f}(x)| dx  (= λ TV{D^{n−1}f})

Theorem: The above optimization problem admits a solution that is a non-uniform
spline of degree n − 1 with adaptive knots.
[Mammen, van de Geer, Annals of Statistics, 1997]

More complex algorithm (current topic of research)

Part 2: From smoothness to sparsity

Choice of regularization functional
- Aim: penalize non-smooth (or highly oscillatory) solutions
- Limitation of quadratic regularization: over-penalizes sharp signal transitions

Signal domain
- ‖Lf‖₂² (Sobolev-type norm)
- ‖Lf‖₁, e.g., ‖Df‖₁ = TV{f} (total variation)

Wavelet domain
- ‖Wf‖₁ ↔ f ∈ B¹₁(L₁(R)) (Besov norm)
- ‖Wf‖₀: sparsity index (non-convex) → compressive sensing theory

[Figure: potential functions Φ(u) = |u|^p, from p = 2 down to p → 0]

SAMPLING AND SPARSITY

- Wavelets yield sparse representations
- Theory of compressive sensing
- Wavelet-regularized solution of general linear inverse problems
- Biomedical imaging examples
  - 3D deconvolution
  - Parallel MRI

Wavelet bases of L2
Family of wavelet templates (basis functions):
ψ_{i,k}(x) = 2^{−i/2} ψ((x − 2^i k) / 2^i)

Orthogonal wavelet basis
⟨ψ_{i,k}, ψ_{j,l}⟩ = δ_{i−j, k−l}  ⇔  W^{−1} = W^T
Analysis: w_i[k] = ⟨f, ψ_{i,k}⟩  (wavelet coefficients)
Reconstruction: ∀f(x) ∈ L2(R), f(x) = Σ_{i∈Z} Σ_{k∈Z} w_i[k] ψ_{i,k}(x)

Vector/matrix notation
Discrete signal: f = (…, c[0], c[1], c[2], …)
Wavelet coefficients: w = (…, w₁[0], w₁[1], …, w₂[0], …)
Analysis formula: w = W^T f
Synthesis formula: f = Ww = Σ_k w_k ψ_k

Wavelets yield sparse decompositions

[Figure: wavelet transform of an image → discarding the small coefficients → inverse wavelet transform yields a visually indistinguishable reconstruction]
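The sparsifying effect is easy to reproduce with the simplest orthonormal wavelet basis, the Haar system. A minimal numpy sketch of a multi-level analysis/synthesis pair (the signal length must be divisible by 2^levels):

```python
import numpy as np

def haar_analysis(x, levels):
    """Orthonormal Haar transform: returns [d_1, ..., d_levels, approx]."""
    a, coeffs = np.asarray(x, float), []
    for _ in range(levels):
        # averages and differences of adjacent pairs, orthonormally scaled
        a, d = (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)
        coeffs.append(d)
    return coeffs + [a]

def haar_synthesis(coeffs):
    """Exact inverse of haar_analysis (the basis is orthonormal)."""
    a = coeffs[-1]
    for d in reversed(coeffs[:-1]):
        out = np.empty(2 * len(a))
        out[0::2], out[1::2] = (a + d) / np.sqrt(2), (a - d) / np.sqrt(2)
        a = out
    return a
```

For a piecewise-constant signal whose jumps are aligned with the dyadic grid, nearly all detail coefficients vanish exactly, so discarding the small coefficients leaves the reconstruction essentially untouched.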

Theory of compressive sensing

Generalized sampling setting (after discretization)
Linear inverse problem: u = Hf + n
Sparse representation of signal: f = W^T v with ‖v‖₀ = K ≪ N_v
N_u × N_v system matrix: A = HW^T

Formulation of the ill-posed recovery problem when 2K < N_u ≪ N_v:
(P0) min_v ‖u − Av‖₂²  subject to  ‖v‖₀ ≤ K

Theoretical result
Under suitable conditions on A (e.g., restricted isometry), the solution is unique
and the recovery problem (P0) is equivalent to:
(P1) min_v ‖u − Av‖₂²  subject to  ‖v‖₁ ≤ C₁
[Donoho et al., 2005; Candès-Tao, 2006, ...]

Sparsity and ℓ1-minimization

Prototypical inverse problem
min_v ‖u − Av‖₂² + λ‖v‖₂²  ⇔  min_v ‖u − Av‖₂² subject to ‖v‖₂ = C₂
min_v ‖u − Av‖₂² + λ‖v‖₁  ⇔  min_v ‖u − Av‖₂² subject to ‖v‖₁ = C₁

Geometric interpretation: the solution has the smallest weighted ℓ2-distance to ũ = A^{−1}u
ℓ2-ball: |v₁|² + |v₂|² = constant
ℓ1-ball: |v₁| + |v₂| = constant
Elliptical norm: ‖u − Av‖₂² = (v − ũ)^T A^T A (v − ũ)  with  ũ = A^{−1}u
The ℓ1-ball typically touches the elliptical level sets at a corner, which favors sparse solutions.
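The contrast is most transparent in the scalar case (A = I), where both problems decouple and have closed-form solutions: the quadratic penalty rescales every component, while the ℓ1 penalty applies a soft threshold and produces exact zeros. A minimal numpy sketch:

```python
import numpy as np

def ridge(u, lam):
    """argmin_v (u - v)^2 + lam * v^2: uniform shrinkage, never exactly zero."""
    return u / (1.0 + lam)

def soft(u, lam):
    """argmin_v (u - v)^2 + lam * |v|: soft threshold at lam/2, exact zeros."""
    return np.sign(u) * np.maximum(np.abs(u) - lam / 2.0, 0.0)
```

For u = (2.0, 0.3, −0.1) and λ = 1, ridge returns (1.0, 0.15, −0.05) whereas soft returns (1.5, 0, 0): the small components are zeroed out, which is the sparsity-promoting corner effect pictured above.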

Solving general linear inverse problems

Space-domain measurement model
g = Hf + n
H: system matrix (image formation); n: additive noise component

Wavelet-regularized signal recovery
Wavelet expansion of signal: f = Ww
Data term: ‖g − Hf‖₂² = ‖g − HWw‖₂²
Wavelet-domain sparsity constraint: ‖w‖₁ ≤ C₁

Convex optimization problem
w̃ = arg min_w ‖g − Aw‖₂² + λ‖w‖₁  with  A = HW
or
f̃ = arg min_f ‖g − Hf‖₂² + λ‖W^T f‖₁

Alternating minimization: ISTA

Convex cost functional: J(f) = ‖g − Hf‖₂² + λ‖W^T f‖₁

Special cases
- Classical least squares (λ = 0): f̃ = (H^T H)^{−1} H^T g
- Landweber algorithm: f^{n+1} = f^n + γ H^T (g − Hf^n)  (steepest descent)
- Pure denoising (H = I): f̃ = W T_λ{W^T g}  (Chambolle et al., IEEE-IP 1998)

Iterative Shrinkage-Thresholding Algorithm (ISTA)  (Figueiredo, Nowak, IEEE-IP 2003)
1. Initialization (n ← 0): f⁰ = g
2. Landweber update: z = f^n + γ H^T (g − Hf^n)
3. Wavelet denoising: w = W^T z, w̃ = T_λ{w}  (soft threshold)
4. Signal update: f^{n+1} = Ww̃; repeat from Step 2 until convergence

Proof of convergence: (Daubechies, Defrise, De Mol, 2004)
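Steps 1-4 fit in a few lines of numpy. The sketch below takes W = I for readability (i.e., it solves min_f ‖g − Hf‖₂² + λ‖f‖₁); with an orthonormal W one would wrap the threshold in the analysis/synthesis pair exactly as in Step 3:

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(H, g, lam, n_iter=200):
    """ISTA for min_f ||g - H f||_2^2 + lam * ||f||_1 (identity wavelet transform)."""
    gamma = 1.0 / np.linalg.norm(H, 2) ** 2     # step size from the spectral norm of H
    f = np.zeros(H.shape[1])
    for _ in range(n_iter):
        z = f + gamma * H.T @ (g - H @ f)       # Landweber (gradient) update
        f = soft_threshold(z, gamma * lam / 2)  # shrinkage/denoising step
    return f
```

For H = I the iteration converges in one step to the soft-thresholded data, recovering the pure-denoising special case above.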

Fast multilevel wavelet-regularized deconvolution

Key features of the multilevel wavelet deconvolution algorithm (ML-ISTA)
- Acceleration by one order of magnitude with respect to the state-of-the-art algorithm (ISTA), via a multigrid iteration strategy
- Applicable in 2D or 3D: first wavelet attempt at the deconvolution of 3D fluorescence micrographs
- Works for any wavelet basis
- Typically outperforms the oracle Wiener solution (best linear algorithm)

[Figure: SNR improvement (SNRI, dB) versus computation time (sec) for ML-ISTA, ISTA, FTL and TL, with Shannon wavelets and 9/7 wavelets]

(Vonesch-Unser, IEEE-IP, 2009)

Deconvolution of 3D fluorescence micrographs

[Figure: widefield micrograph | ML-ISTA, 5 iterations | ISTA, 5 iterations]

384×288×32 stack (maximum-intensity projections); sample: fibroblast cells;
staining: actin filaments in green (Phalloidin-Alexa488), vesicles and nucleus membrane in red (DiI);
objective: 63× plan-apochromat, 1.4 NA, oil immersion;
diffraction-limited PSF model; initialization: measured data.

3D fluorescence microscopy experiment

[Figure: input data (open pinhole) | ML-ISTA, 15 iterations | ISTA, 15 iterations | confocal reference]

Maximum-intensity projections of 512×352×96 image stacks;
Zeiss LSM 510 confocal microscope with a 63× oil-immersion objective;
C. elegans embryo labeled with Hoechst, Alexa488, Alexa568;
each channel processed separately; computed PSF based on a diffraction-limited model;
separable orthonormalized linear-spline/Haar basis.

Preliminary results with parallel MRI
(M. Guerquin-Kern, BIG)

Simulated parallel MRI experiment
- Shepp-Logan brain phantom (space domain)
- 4 coils, undersampled spiral acquisition, 15 dB noise

[Figure: backprojection | L2 regularization (CG) | ℓ1 wavelet regularization]

NCCBI collaboration with K. Prüssmann, ETHZ

Fresh try at the ISMRM reconstruction challenge

[Figure: L2 regularization (Laplacian) | ℓ1 wavelet regularization]

Sampling-related problems and formulations

| Formulation | Ideal sampling | Generalized sampling | Linear inverse problems |
| --- | --- | --- | --- |
| Variational | Smoothing spline: optimal discretization and solution; direct numerical solution (digital filtering) | Optimal discretization and solution | Numerical, matrix-form solution; CG (iterative) |
| MMSE (Gaussian MAP) | Hybrid Wiener filter | Gaussian MAP | Gaussian MAP |
| TV | Nonuniform spline (optimal solution space) | Iterative TV deconvolution | Iterative TV reconstruction |
| Sparsity (ℓ1-norm) | Exact solution (for ortho basis): soft-threshold | Multi-level, iterated threshold | Numerical optimization (iterated thresholding) |

Level of complexity increases from ideal sampling to linear inverse problems.

CONCLUSION

Generalized sampling
- Unifying Hilbert-space formulation: Riesz basis, etc.
- Approximation point of view: projection operators
- Increased flexibility; closer to real-world systems
- Generality: non-ideal sampling, interpolation, etc.

Regularized sampling
- Regularization theory: smoothing splines
- Stochastic formulation: hybrid form of the Wiener filter
- Non-linear techniques (e.g., TV)

Quest for the best representation space
- Optimal choice determined by the regularization operator L
- Spline-like representation; compactly-supported basis functions
- Not bandlimited!

CONCLUSION (Cont'd)

Sampling with sparsity constraints
- Requires a sparse signal representation (wavelets)
- Theory of compressed sensing
- Qualitatively equivalent to non-quadratic regularization (e.g., TV)
- Challenge: Can we re-engineer the acquisition process in order to sample with fewer measurements?

Further research issues
- Fast algorithms for ℓ1-constrained signal reconstruction
- CS: beyond toy problems; real-world applications of compressed-sensing theory
- Strengthening the link with spline theory
- Better sparsifying transforms for signals and images: tailored basis functions, rotation invariance, ...

Acknowledgments
Many thanks to
- EPFL's Biomedical Imaging Group
- Prof. Thierry Blu
- Prof. Akram Aldroubi
- Dr. Philippe Thévenaz
- Dr. Sathish Ramani
- Dr. Cédric Vonesch
- Prof. D. Van De Ville
- Prof. Yonina Eldar
- + many other researchers and graduate students

Preprints and demos at: http://bigwww.epfl.ch/

Bibliography
Sampling theory and splines
M. Unser, A. Aldroubi, "A General Sampling Theory for Nonideal Acquisition Devices," IEEE Trans. Signal Processing, vol. 42, no. 11, pp. 2915-2925, 1994.
M. Unser, "Sampling - 50 Years After Shannon," Proc. IEEE, vol. 88, no. 4, pp. 569-587, 2000.
M. Unser, "Splines: A Perfect Fit for Signal and Image Processing," IEEE Signal Processing Magazine, vol. 16, no. 6, pp. 22-38, 1999.

Regularized sampling
M. Unser, T. Blu, "Generalized Smoothing Splines and the Optimal Discretization of the Wiener Filter," IEEE Trans. Signal Processing, vol. 53, no. 6, pp. 2146-2159, 2005.
Y.C. Eldar, M. Unser, "Nonideal Sampling and Interpolation from Noisy Observations in Shift-Invariant Spaces," IEEE Trans. Signal Processing, vol. 54, no. 7, pp. 2636-2651, 2006.
T. Blu, M. Unser, "Self-Similarity: Part II - Optimal Estimation of Fractal Processes," IEEE Trans. Signal Processing, vol. 55, no. 4, pp. 1364-1378, 2007.
S. Ramani, D. Van De Ville, T. Blu, M. Unser, "Nonideal Sampling and Regularization Theory," IEEE Trans. Signal Processing, vol. 56, no. 3, pp. 1055-1070, March 2008.
E. Mammen, S. van de Geer, "Locally adaptive regression splines," Annals of Statistics, vol. 25, no. 1, pp. 387-413, 1997.

Bibliography (Cont'd)
Compressive sampling
E. Candès, J. Romberg, T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Information Theory, vol. 52, no. 2, pp. 489-509, 2006.
D. Donoho, "Compressed sensing," IEEE Trans. Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
E. Candès, M. Wakin, "An introduction to compressive sampling," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21-30, 2008.
A.M. Bruckstein, D.L. Donoho, M. Elad, "From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images," SIAM Review, vol. 51, no. 1, pp. 34-81, 2009.

ℓ1-regularized signal reconstruction
L. Rudin, S.J. Osher, E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, vol. 60, pp. 259-268, 1992.
A. Chambolle, R.A. DeVore, N.-Y. Lee, B.J. Lucier, "Nonlinear Wavelet Image Processing: Variational Problems, Compression, and Noise Removal Through Wavelet Shrinkage," IEEE Trans. Image Processing, vol. 7, no. 3, pp. 319-335, April 1998.
M. Figueiredo, R. Nowak, "An EM algorithm for wavelet-based image restoration," IEEE Trans. Image Processing, vol. 12, no. 8, pp. 906-916, Aug. 2003.
I. Daubechies, M. Defrise, C. De Mol, "An Iterative Thresholding Algorithm for Linear Inverse Problems with a Sparsity Constraint," Comm. Pure and Applied Mathematics, vol. 57, no. 11, pp. 1413-1457, August 2004.
C. Vonesch, M. Unser, "A Fast Multilevel Algorithm for Wavelet-Regularized Image Restoration," IEEE Trans. Image Processing, vol. 18, no. 3, pp. 509-523, 2009.
