
Harmonic Analysis

Sarah Constantin March 27, 2012

Lecture 1

Statements: Rademacher's Theorem, 1935. If F : R^d → R^k is a Lipschitz map,

sup_{x ≠ y} ||F(x) − F(y)|| / ||x − y|| < ∞,

then F is differentiable almost everywhere. On the line this is actually proved in every measure theory course, though it's never stated as such. The proof in R^n is a little bit harder, though still only a few lines long. What wasn't around in 1935 was Calderón–Zygmund theory, which is a major focus of this course. The other motivation for this course is the question: does there exist a converse? The final ingredient in the converse is a result of Marianna Csörnyei and Peter Jones for d ≥ 1: if E is a set of d-dimensional Lebesgue measure zero, then there exists a Lipschitz map F from R^d to itself that is nondifferentiable at every point of E.

In one dimension it's actually simple: we want F : R → R nowhere differentiable on E, with m_1(E) = 0. Let

F(x) = ∫_0^x f(t) dt.

(Note that differentiability a.e. of such an F is a consequence of the corresponding theorem for functions of bounded variation, so the statement covers Lipschitz functions.) Here's how you build f. You have your set E, and you cover it with intervals {I_j^k}, an infinite sequence of coverings, one for each k, so that Σ_j |I_j^k| < 10^{−k}; we do it so that each set of intervals is contained in the previous one. Now build f to be +1 on each interval I_j^1, then −1 on each I_j^2 ⊂ I_j^1, and so on: +1, −1, +1, −1 with each stage of fineness. Then F is nowhere differentiable on E, because you can't take limits: there exist infinitely many scales on which the difference quotient (F(x) − F(y))/(x − y) changes by 2.

Immediately, this proof doesn't work in R^2; you can't do it with a simple covering argument. There's a strange connection to combinatorics here. Erdős–Szekeres Theorem: given a sequence x_1, ..., x_N of real numbers, there exists a subsequence of length at least N^{1/2} that is either strictly increasing or strictly decreasing. Take this in 2 dimensions, where our sequence of points is viewed as a graph: the claim becomes that there is a Lipschitz graph Γ hitting at least N^{1/2} of the points. This is easy to prove in 2d, and nobody knows how to prove it in more dimensions. Theorem: suppose |E| > 0 in R^d; then there exists a 1-Lipschitz graph Γ of allowed type (with constant M) such that:
H^1(E ∩ Γ) ≥ C(d, M) |E|^{1/d},

where H^1 is one-dimensional Hausdorff measure. This is a measure-theoretic Erdős–Szekeres theorem, and we don't know how to prove it. The localized version: take E inside a cube of volume 1 in R^d. Other topics will include wavelet-style bases and Hardy spaces. What are Hardy spaces, and why do we care? H^2 is very important in signal processing; Hardy must be rolling in his grave, but we applied folks use this stuff extensively. H^p(D) is the classical Hardy space, and we need a definition of H^p for ALL positive values of p. Take F holomorphic on the unit disc and set
||F||_{H^p} = sup_{0<r<1} ( (1/2π) ∫_0^{2π} |F(re^{iθ})|^p dθ )^{1/p}.

The Hardy space H^p is the set of all holomorphic functions on the disc for which this norm is finite; for p = 2 it turns out to be a Hilbert space. Theorem: for boundary data F(e^{iθ}) ∈ L^p(S^1), the harmonic extension F(z) = P_z * F(e^{iθ}) belongs to H^p.

Real H^p spaces for p ≥ 1: F(e^{iθ}) = Σ a_n e^{inθ}. Recommendation: Dym–McKean, Fourier Series and Integrals. Theorem of Hardy and Littlewood: for 0 < p < ∞, an analytic function F is in H^p if and only if F*(e^{iθ}) ∈ L^p(S^1, dθ), where F* is the nontangential maximal function. What is this maximal function? Take a cone Γ_α with its vertex at e^{iθ} and aperture α, pointing into the unit disc; this is the Privalov "ice cream cone." Set F*(e^{iθ}) = sup_{z ∈ Γ_α} |F(z)|. For p < 1, F*(e^{iθ}) ∈ L^p(S^1), but the boundary values are a distribution, NOT A FUNCTION! In the future we'll be connecting geometry, holomorphy, and the Fourier transform. We'll also be dealing with singular integrals, and the spaces H^1 and BMO.
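The Erdős–Szekeres bound quoted earlier in this lecture can be checked numerically. A minimal sketch (the function name `longest_monotone` is illustrative, not from any library): an O(n^2) dynamic program finds the longest strictly monotone subsequence, and we verify its length squared is at least N.

```python
import random

def longest_monotone(xs):
    """Length of the longest strictly increasing or strictly
    decreasing subsequence of xs, by O(n^2) dynamic programming."""
    n = len(xs)
    inc = [1] * n  # inc[i]: longest increasing subsequence ending at i
    dec = [1] * n  # dec[i]: longest decreasing subsequence ending at i
    for i in range(n):
        for j in range(i):
            if xs[j] < xs[i]:
                inc[i] = max(inc[i], inc[j] + 1)
            if xs[j] > xs[i]:
                dec[i] = max(dec[i], dec[j] + 1)
    return max(max(inc), max(dec))

random.seed(0)
N = 100
xs = random.sample(range(10 * N), N)   # N distinct reals
# Erdos-Szekeres: monotone subsequence of length >= sqrt(N)
assert longest_monotone(xs) ** 2 >= N
```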

Lecture 2

Basics: ||f||_{L^p(dμ)} = ( ∫ |f(x)|^p dμ(x) )^{1/p}, with the distribution-function ("layer cake") formula

∫ |f(x)|^p dμ(x) = p ∫_0^∞ λ^{p−1} μ({x : |f(x)| > λ}) dλ.

Indeed, if f = χ_A is the characteristic function of A, then ∫ |f|^p dμ = μ(A), and

p ∫_0^1 λ^{p−1} μ(A) dλ = μ(A);

then for a general f we write f = Σ_j α_j χ_{A_j} and let simple functions g increase to f.

Abstract corollary: interpolation of Banach spaces. Examples: L^p, H^p, BMO. Here's a model case: T : L^1 → L^1 and T : L^∞ → L^∞, with ||Tf||_1 ≤ A||f||_1 and ||Tf||_∞ ≤ B||f||_∞. This implies that for 1 < p < ∞, T maps L^p to itself boundedly:

||Tf||_p ≤ A^{1−θ} B^θ ||f||_p, where θ is defined by 1/p = (1 − θ)/1 + θ/∞ = 1 − θ.

This is the Riesz–Thorin Theorem.

Real interpolation uses a weaker hypothesis. We say that F is in weak L^1 if there exists a constant C such that for all λ > 0 the measure of {x : |F(x)| > λ} is at most C/λ. L^1 is contained in weak L^1 by Chebyshev; the weak L^1 "norm" is the smallest such C. The real method of interpolation is due to Marcinkiewicz. Example: T : L^1 → weak L^1 and T : L^∞ → L^∞, with constants A and B respectively. This implies T : L^p → L^p for 1 < p < ∞, with ||T||_{p,p} ≤ C(A, B, p). Proof sketch: fix 1 < p < ∞ and integrate

p ∫_0^∞ λ^{p−1} |{x : |Tf(x)| > λ}| dλ,

splitting f at each height λ into a good part and a bad part, f = g + b, with g = f·χ_{|f|≤λ} bounded and b = f·χ_{|f|>λ} ∈ L^1. The model operator is the Hilbert transform

Hf(x) = (1/π) p.v. ∫_R f(t)/(x − t) dt.

H : L^2 → L^2 is an isometry. It turns out to be easy to get p between 1 and 2 this way, and then duality gives p between 2 and infinity. Now we get to introduce BMO.

The norm is given as

||φ||_* = sup_{x,r} ( (1/|B(x,r)|) ∫_{B(x,r)} |φ(y) − φ_{B(x,r)}|^p dy )^{1/p}.

Now, a little dyadic analysis.


D_n is the nth dyadic generation in R; denote its intervals by I_j. Each dyadic interval in D_n is a disjoint union of two dyadic intervals in D_{n+1}: each I ∈ D_n has two children. This is, in fact, exactly how option pricing works.

Define h_I, for I ∈ D_n, as 2^{n/2} on the left half of the interval and −2^{n/2} on the right half: the Haar wavelet. If I, J are dyadic, then exactly one of the following holds: I = J, I ⊊ J, J ⊊ I, or I ∩ J = ∅. When you have described L^2([0,1]) you have described all the probability measures.
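The Haar system just described can be written down directly. A small sketch (dyadic interval I = [k·2^{−n}, (k+1)·2^{−n}), with h_I = +2^{n/2} on the left half and −2^{n/2} on the right), plus a discrete check of orthonormality in L^2([0,1]):

```python
def haar(n, k):
    """Haar function h_I for I = [k/2^n, (k+1)/2^n) on [0, 1):
    +2^(n/2) on the left half of I, -2^(n/2) on the right half,
    0 outside I."""
    left, mid, right = k / 2**n, (k + 0.5) / 2**n, (k + 1) / 2**n
    amp = 2 ** (n / 2)
    def h(x):
        if left <= x < mid:
            return amp
        if mid <= x < right:
            return -amp
        return 0.0
    return h

def inner(f, g, m=2**12):
    # Riemann sum for <f, g> on [0,1); exact for step functions
    # constant on cells of length 1/m.
    return sum(f((i + 0.5) / m) * g((i + 0.5) / m) for i in range(m)) / m

h1, h2 = haar(2, 1), haar(2, 2)
assert abs(inner(h1, h1) - 1.0) < 1e-9            # normalized
assert abs(inner(h1, h2)) < 1e-9                  # disjoint: orthogonal
assert abs(inner(haar(0, 0), haar(1, 0))) < 1e-9  # nested: orthogonal
```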

Lecture 3

Language: last time, we took a function f ∈ L^1([0,1]) or a measure μ ∈ M([0,1]). Formally, let F_0 = E[F] = ∫_0^1 F(x) dx = E_0(F). Then we can write

F = F_0 + Σ_{I ⊂ [0,1]} <F, h_I> h_I = F_0 + Σ_{n=1}^∞ Σ_{ℓ(I)=2^{−(n−1)}} <F, h_I> h_I.

We can now write this as

F = F_0 + Σ_{n=1}^∞ Δ_n(F)(x), with E_n(F)(x) = F_0 + Σ_{j=1}^n Δ_j(x).

Write F_n = E_n(F); then E_n(F_n) = F_n.

Moreover E_n(F_{n+k}) = F_n: here {F_n} is a dyadic martingale. In general, an L^1 bounded martingale {F_n} corresponds to a probability-type measure on [0,1]. Abstractly: sigma algebras A_n ⊂ A_{n+1} ⊂ ..., with F_n measurable with respect to A_n and E(F_{n+k} | A_n) = F_n. Example of a theorem: the Martingale Convergence Theorem, here just for dyadic martingales. Definition: {F_n} is an L^p bounded martingale, 1 ≤ p ≤ ∞, if there exists A such that ||F_n||_{L^p} ≤ A for all n. Martingale Convergence Theorem: if {F_n} is an L^1 bounded martingale, then F_n(x) → F(x) a.e. (dx), there is an almost everywhere limit, and there exists a measure μ on [0,1] whose total mass is bounded by A, ||μ||_M ≤ A, with F_n = E_n(μ). Conversely, for μ ∈ M, F_n = E_n(μ) is L^1 bounded. Definition: F*(x) = sup_n |F_n(x)|. This corresponds to the Hardy–Littlewood maximal function, which for an analyst is

F*(x) = sup_{x ∈ I} (1/|I|) ∫_I |F| dy,

with F* : L^p → L^p for p > 1, and F ∈ L^1 implies F* = MF ∈ weak L^1. The same concept works for F_n:

F*(x) = sup_n |F_n(x)| = sup_{x ∈ I} | (1/|I|) ∫_I F dy |,

because F_n = E_n(F) = (1/|I|) ∫_I F dy + Σ_{k>n} Δ_k(x), and all the succeeding terms cancel on averaging.
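The dyadic conditional expectations E_n can be computed by block averaging. A sketch (sampling f on a grid of 2^N points, so that E_n averages blocks of 2^{N−n} samples; the helper name `E` is ours), illustrating the martingale identities E_n(F_{n+k}) = F_n and E_n(F_n) = F_n:

```python
import math

N = 10                       # grid of 2^N sample points on [0,1)
grid = [(i + 0.5) / 2**N for i in range(2**N)]
f = [math.sin(6 * x) + x for x in grid]   # any f in L^1([0,1])

def E(n, values):
    """Dyadic conditional expectation E_n: average over each
    dyadic interval of length 2^-n (blocks of 2^(N-n) samples)."""
    block = 2 ** (N - n)
    out = []
    for b in range(2 ** n):
        avg = sum(values[b * block:(b + 1) * block]) / block
        out.extend([avg] * block)
    return out

F3, F5 = E(3, f), E(5, f)
# tower property of the martingale: E_3(F_5) = F_3
assert all(abs(a - b) < 1e-12 for a, b in zip(E(3, F5), F3))
# E_5(F_5) = F_5: F_5 is already constant on intervals of length 2^-5
assert all(abs(a - b) < 1e-12 for a, b in zip(E(5, F5), F5))
```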

More generally: the Whitney decomposition of any open set. Write R^n \ K = ∪ Q_j, where the Q_j are dyadic cubes with

c_1 dist(Q_j, K) ≤ ℓ(Q_j) ≤ c_2 dist(Q_j, K).

We need another important idea, called a stopping time. Given any martingale {F_n}, we define a stopping time τ(x) taking values in the natural numbers or infinity, subject to two rules: 1. if τ(x) = n, the stopped value is F_{τ(x)}(x) = F_n(x); 2. if τ(x) = n and x ∈ I with ℓ(I) = 2^{−n}, then τ(y) = n for every y ∈ I. The point is that the stopping decision is fixed on the whole interval.

Let G_n(x) = F_n(x) if n < τ(x), and G_n(x) = F_m(x) if n ≥ τ(x) = m. Then {G_n} is again a martingale. Indeed, G_0 = F_0, G_1 = F_1, and (if, say, we stop at time 1 on [0,1/2]) G_2 = F_1 on [0,1/2] and F_2 on [1/2,1]. We would like to say that at the stopping time |F_{τ(x)}(x)| ≤ cλ. So stop at the first time |F_n| exceeds λ, and call the stopped martingale G_n. Normalizing,

∫_0^1 |G_n| ≤ 1.

The martingale property is checked by induction on n. Fix the collection of intervals I_j, ℓ(I_j) = 2^{−n}, on which we stop at time n. Then

| ∫_{I_j} G_∞ | = | ∫_{I_j} G_n | = | ∫_{I_j} F_n |,

and the left-hand side is at least (λ/2)|I_j|, which controls Σ_j |I_j|.

Proof to be continued.
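The stopping-time construction above can be simulated. A sketch (hypothetical helper names, building on the block-averaging picture of E_n): stop on a dyadic interval the first time the martingale exceeds a level λ, and check that the stopped process keeps the martingale mean F_0 at every time.

```python
import math

N = 10
grid = [(i + 0.5) / 2**N for i in range(2**N)]
f = [math.exp(-8 * x) * 10 for x in grid]    # a spiky positive function

def En(n, values):
    # dyadic conditional expectation by block averaging
    block = 2 ** (N - n)
    out = []
    for b in range(2 ** n):
        avg = sum(values[b * block:(b + 1) * block]) / block
        out.extend([avg] * block)
    return out

lam = 2.0
Fs = [En(n, f) for n in range(N + 1)]

# tau[i] = first n with F_n(x_i) > lam (or N if we never stop);
# tau is automatically constant on the dyadic interval where it fires.
tau = [next((n for n in range(N + 1) if Fs[n][i] > lam), N)
       for i in range(2 ** N)]

def G(n):
    # stopped martingale G_n(x) = F_{min(n, tau(x))}(x)
    return [Fs[min(n, tau[i])][i] for i in range(2 ** N)]

mean = lambda v: sum(v) / len(v)
F0 = mean(f)
# optional stopping: the stopped process has the same mean at every time
assert all(abs(mean(G(n)) - F0) < 1e-9 for n in range(N + 1))
```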

Lecture 4

Here we introduce the BMO space. If we rescale a function f(x) to f(ax), the L^2 norm changes; but the L^∞ norm can't possibly change, and the same is true for BMO. These two spaces are scale free. For today we look at R, but higher dimensions are just the same. φ ∈ BMO if φ ∈ L^1_loc and the BMO norm

||φ||_* = sup_{I ⊂ R} (1/|I|) ∫_I |φ(x) − φ_I| dx, where φ_I = (1/|I|) ∫_I φ(y) dy is the mean value,

is finite. John–Nirenberg Theorem: there exist C_1, c_2 such that for all I,

(1/|I|) |{x ∈ I : |φ(x) − φ_I| > λ}| ≤ C_1 e^{−c_2 λ / ||φ||_BMO}.

For example, log(1/|x|) ∈ BMO, but sgn(x) log(1/|x|) ∉ BMO. We give a martingale proof. Exercises: notice the statement is a scale-free statement, and ||φ(ax)||_BMO = ||φ(x)||_BMO.

It's sufficient to take I = [0,1], because the space is also translation invariant. A dyadic interval lies either to the left or the right of zero, so if φ ∈ BMO(R) then φ ∈ BMO_dyadic(R). So we turn φ into a dyadic martingale {φ_n}, where φ_n(x) = E_n(φ)(x). Subtract the mean, so φ_0 = ∫_0^1 φ(x) dx = 0. We have to prove

|{x ∈ [0,1] : φ(x) > λ}| ≤ c_1 e^{−c_2 λ}.

We do this with a stopping time argument. Claim: |Δ_1| ≤ 1. Proof: Δ_1(x) = φ_1(x) − φ_0, and

|Δ_1(x)| ≤ 2 ∫_0^1 |φ(y) − φ_0| dy ≤ 2||φ||_BMO;

the same computation bounds every increment, so after normalizing we may assume |φ_{n+1}(x) − φ_n(x)| ≤ 1. Let τ(x) be the smallest integer in {n : |φ_n(x)| ≥ 2}; this is a consistent stopping time, and {φ_n} is a dyadic martingale. Look at {x : τ(x) < ∞} = ∪ I_j^1, the union of the dyadic intervals where we stop. Visually: you have the usual picture, the square divided into dyadic parts; Δ_1 is the second layer, Δ_2 the third, and so on. The support of Δ_n(x) is the nth generation, and the lengths of the intervals are 2^{−n}. If τ(x) = n, we hadn't stopped yet at n − 1, which implies |φ_{n−1}(x)| < 2, and so

|φ_n(x)| ≤ |φ_{n−1}(x)| + |φ_n(x) − φ_{n−1}(x)| ≤ 2 + 1 = 3.

So on each stopping interval I_j,

2|I_j| ≤ ∫_{I_j} |φ_τ(x)| dx and |φ_τ| ≤ 3 on I_j.

If |φ_n(x)| < 2 for all n, then |φ(x)| ≤ 2 almost everywhere at such x; and summing over the stopping intervals, the probability that x ∈ [0,1] lies in the first generation is less than 1/2.

Idea: renormalize! Everything is scale free. G_1 is the first generation of stopping intervals, G_2 the second generation, and so on; on each I_j ∈ G_1 the construction restarts, so G_1 relative to I_j looks like G_2 relative to [0,1]. So you iterate: when you first stop you've stopped at height less than 3, the next time at less than 6, then 9, then 12; meanwhile you have geometric decay of the measure, the kth generation having measure at most 2^{−k}. Linear growth of height against geometric decay of measure gives the exponential bound. Two ways to measure: let J be a dyadic interval. If φ ∈ BMO, the Haar coefficients a_I = <φ, h_I> satisfy the packing (Carleson-type) condition

(1/|J|) Σ_{I ⊂ J} <φ, h_I>^2 ≤ ||φ||^2_{BMO,L^2}.
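The John–Nirenberg decay can be seen concretely for the model BMO function φ(x) = log(1/x) on [0,1]. A sketch (Monte Carlo estimate of the distribution function on I = [0,1], where φ_I = ∫_0^1 log(1/x) dx = 1, compared with the closed-form value):

```python
import math, random

phi = lambda x: math.log(1 / x)

random.seed(1)
xs = [random.random() for _ in range(200_000)]
phi_I = 1.0   # mean of log(1/x) over [0,1]

for lam in (1.0, 2.0, 4.0):
    # |{x in [0,1] : phi(x) - phi_I > lam}|, estimated by sampling
    measure = sum(1 for x in xs if phi(x) - phi_I > lam) / len(xs)
    # exact value: {log(1/x) > lam + 1} = (0, e^{-lam-1})
    exact = math.exp(-(lam + phi_I))
    # exponential (John-Nirenberg-type) decay, matching the exact value
    assert abs(measure - exact) < 5e-3
```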

Lecture 5

Recall what we know about BMO:

||φ||_BMO = sup_Q ( (1/|Q|) ∫_Q |φ − φ_Q|^p )^{1/p},

where φ_Q is the average over Q. Any p ≥ 1 works, and you get an equivalent norm. This is a Banach space modulo constants. Recall the John–Nirenberg Theorem: fixing p, there exist constants c_1, c_2 depending on the dimension such that for any Q,

(1/|Q|) |{x ∈ Q : |φ(x) − φ_Q| > λ}| ≤ c_1 e^{−c_2 λ / ||φ||}.

This is a concentration inequality: the size of the set where φ is a certain distance away from its average is exponentially small in the distance. Strömberg's Lemma: there exists A = A(d) such that if for every Q there exists a constant c_Q with (1/|Q|) |{x ∈ Q : |φ(x) − c_Q| ≥ B}| ≤ A, then φ ∈ BMO and ||φ||_BMO ≤ C(d)B. Let's do an example: φ = log(1/|x|) χ_{x>0}. This is NOT in BMO. Proof: take a symmetric interval around the origin; the mean oscillation diverges. But it satisfies the displayed condition with A = 1/2, B = 1. If you take A smaller than 1/2, you can't have such counterexamples in one dimension; in every dimension there is a critical constant like that. Other places with BMO:

Problem: let H be the Hilbert transform, Hf = f * (1/πx), defined as

Hf(x) = lim_{ε→0} (1/π) ∫_{|x−y|>ε} f(y)/(x − y) dy.

Marcel Riesz: ||Hf||_{L^p} ≤ C_p ||f||_{L^p} for 1 < p < ∞. This is the beginning of Calderón–Zygmund theory. Natural question: let μ ≥ 0 be a positive measure on R. When do

∫ |Hf(x)|^p dμ(x) ≤ c ∫ |f(x)|^p dμ(x) and ∫ |Mf(x)|^p dμ(x) ≤ c ∫ |f(x)|^p dμ(x)

hold? Answer: BOTH HOLD SIMULTANEOUSLY, if and only if the measure dμ is locally L^1, of the form w(x)dx (nothing supported on a set of Lebesgue measure 0), and w is an A_p weight:

||w||_{A_p} = sup_Q ( (1/|Q|) ∫_Q w(x) dx ) ( (1/|Q|) ∫_Q w^{−1/(p−1)} )^{p−1} < ∞.

Exercises: 1. Check w(x) = |x|^α: there is a range of p = p(α) with w ∈ A_p. 2. w ∈ A_p implies w ∈ A_q for q > p. Helson–Szegő theorem: ∫|Hf(x)|^2 dμ(x) ≤ c ∫|f(x)|^2 dμ(x) if and only if dμ = w(x)dx with log w of the special form u + Hv, u, v bounded. Can you do it constructively? This is an open problem. Theorem of Fefferman: φ ∈ BMO if and only if φ = u + Hv with ||u||_∞ < ∞, ||v||_∞ ≤ C||φ||_BMO; conversely ||u + Hv||_BMO ≤ c(||u||_∞ + ||v||_∞). Another class: A_1, with

||w||_{A_1} = sup_Q ( (1/|Q|) ∫_Q w ) ( ess inf_{x∈Q} w(x) )^{−1}.

Theorem (Coifman–Rochberg): w ∈ A_1 if and only if w = e^u (Mf(x))^{1−ε}

for some ε > 0, a bounded u, and an f ∈ L^1_loc with Mf finite at one point of R^n. This is easy to prove, and there's a reason why you would want it. Suppose f = χ_E and you want to prove a very delicate statement about the geometry of E. A measurable set can be extremely complicated, and the characteristic function of a set is not smooth. But! If I take the Hardy–Littlewood maximal function, (Mχ_E(x))^{1−ε} is in A_1 and it is smoother. Let's see how. The function (Mχ_E)^{1−ε} has a logarithm defined everywhere, as long as E has positive measure, and ||log(Mf)||_BMO ≤ C_0, an absolute constant. We'll need this later to express the idea of tangent planes to sets. If you take Mχ_E, it equals 1 on E, and its supremum is attained on E; quantitatively

1 ≥ Mχ_E(x) = sup_{x ∈ I} (1/|I|) ∫_I χ_E dx = sup_{x ∈ I} |E ∩ I| / |I| ≥ c_E / (1 + |x|),

taking an interval I that reaches from x into E.

You don't even need a function for Coifman–Rochberg: a sigma-finite measure on Euclidean space whose maximal function is finite at one point gives the same theorem. Now let's do a martingale proof of Hardy–Littlewood. Take f ∈ L^1([0,1]) with ||f||_{L^1} = 1 and let F_n = E_n(|f|), a dyadic martingale. Martingale convergence theorem: F_n(x) → F(x) a.e. and ||F_n − F||_{L^1} → 0. Fix λ > 0. Stopping time: τ(x) is the smallest integer n such that F_n(x) > λ. Then the following holds: F_{τ(x)}(x) ≤ 2λ.

Proof: if τ(x) = ∞ then F_n(x) ≤ λ for all n. If τ(x) = n, then F_{n−1}(x) ≤ λ because we didn't stop at n − 1. Let I be the interval containing x with ℓ(I) = 2^{−(n−1)}; it's divided into a left interval L and a right interval R. If x is in the left interval, then

F_n(x) = (1/|L|) ∫_L F = 2 (1/|I|) ∫_L F ≤ 2 (1/|I|) ∫_I F = 2 F_{n−1}(x) ≤ 2λ.

But a stopped martingale is still a martingale: F_{n∧τ} is a martingale, so

∫_0^1 F_{n∧τ}(x) dx = ∫_0^1 F = F_0.

Let ||f||_{L^1} = 1. Then |{x : M_d F(x) > 2λ}| ≤ 2/λ. For that set is contained in {x : τ(x) < ∞}, and if you stop, you stop when the average is ≥ λ; because the function is positive, Chebyshev applied to the stopped martingale gives the bound. Next time: the general Hardy–Littlewood maximal theorem and then A_1. Besicovitch Covering Lemma: for each x ∈ E there is an interval I_x = (x − δ_x, x + δ_x) depending on x; then there exists a subfamily {I_j} with E ⊂ ∪_j I_j and Σ_j χ_{I_j}(y) ≤ 3 for every y ∈ R. The proof is a greedy algorithm: take the biggest interval you can find at each step, subject to the conditions of the theorem.

Lecture 6

Theorem. If w ∈ A_p, i.e. w is an A_p weight, then w = u v^{1−p} for some u, v ∈ A_1. Recall

||w||_{A_p} = sup_Q ( (1/|Q|) ∫_Q w ) ( (1/|Q|) ∫_Q w^{−1/(p−1)} )^{p−1},

||w||_{A_1} = sup_Q ( (1/|Q|) ∫_Q w ) ( inf_{x∈Q} w(x) )^{−1}.

The A_1 condition implies that for a.e. x ∈ Q,

w(x) ≥ c (1/|Q|) ∫_Q w.

Intuitively, if w ∈ A_1 then w only goes up: there's a constant it doesn't go below, times its average. In fact you can remove the "almost everywhere."

Interesting fact (a reverse Hölder property): w ∈ A_1 if and only if there exists ε > 0 such that w^{1+ε} ∈ A_1. Application: w ∈ A_1 implies there exists ε = ε(||w||_{A_1}) such that w^{1+ε} ∈ A_1. Now look at the Hardy–Littlewood maximal function of w^{1+ε}, namely sup_{Q∋x} (1/|Q|) ∫_Q w^{1+ε}; it is bounded by ||w^{1+ε}||_{A_1} times w(x)^{1+ε}, so (1/|Q|) ∫_Q w ≤ C(||w||_{A_1}) w(x) for x ∈ Q. General theorem: if there exists one x with Mf(x) < ∞, then Mf(x) < ∞ a.e. (this is an exercise). Theorem (Coifman and Rochberg): if there exists an x with Mf(x) < ∞, then for any 0 < ε < 1, (Mf(x))^{1−ε} ∈ A_1.

First we'll prove this on [0,1] where all intervals I are dyadic. Take F ∈ L^1, F ≥ 0, ∫_0^1 F = 1, and the associated martingale F_n. Let τ(x) = inf{n : F_n(x) ≥ 2}: the first time F_n beats 2. When we stop, F_{τ(x)}(x) ≤ 4, because the function is positive, so on the right half or the left half the average could not be more than twice the parent average. If τ(x) = n, then F_{n−1}(x) < 2. We get a collection of stopping intervals {I_k^1}, each of some length 2^{−n}, with 2 ≤ (1/|I_k^1|) ∫_{I_k^1} F ≤ 4; by Chebyshev, the sum of their lengths is at most 1/2.

As per usual we draw the picture where we cut the square in half and half again; each level is an F_n. Suppose both halves at level F_1 average less than 2. The average could be ≥ 2 on the first quarter at level F_2; then it has to be < 2 on the other 3 quarters. If it's ≥ 2 on one of the eighths at level F_3 in the right-hand half, the other eighths have to be < 2, and so on. The first subintervals where the average exceeds 2 are I_1^1, I_2^1, etc. Then inside each of these I's we restart the process, except instead of looking for averages bigger than 2, we look for bigger than 4. These intervals are the I_j^2, the second stage of bigness; for the same reason (otherwise they would have stopped earlier), their averages are at most twice the threshold, and by Chebyshev once again their total length is less than 1/4. We repeat, getting generations {I_j^m} with

Σ_j |I_j^m| ≤ 2^{−m}, and F_n(x) stays below the mth threshold for x outside ∪_j I_j^m.
j

What does it mean? The set {x : F*(x) ≥ 4^m} is contained in ∪_j I_j^m, up to measure zero, and therefore

|{x : F*(x) ≥ 4^m}| ≤ | ∪_j I_j^m | ≤ 2^{−m}.

Now we can use this for the Hardy–Littlewood maximal function: if MF(x) > 4^m, you must be in a set whose measure is bounded by 2^{−m}. With ∫_0^1 F = 1 = e^0,

|{x ∈ [0,1] : log MF(x) ≥ m log 4}| = |{x : MF(x) ≥ 4^m}| ≤ 2^{−m},

so

0 ≤ ∫_0^1 log MF(x) dx ≤ C + log ∫_0^1 F.

Exercise: for every dyadic interval I there exists a constant A_I such that (1/|I|) ∫_I |log MF − A_I| dx ≤ C_0. Hint:

A_I = log sup_{I ⊂ J} (1/|J|) ∫_J |F|

is a candidate; it is the localized version of the Hardy–Littlewood maximal function, and therefore comparable to MF(x) on I. So the log of the Hardy–Littlewood maximal function has mean oscillation bounded by a universal constant, which doesn't even depend on any L^p behavior. This is a weird claim. Recall: if for every Q there exists a number c(Q) such that (1/|Q|) ∫_Q |φ − c(Q)| ≤ 1, then (1/|Q|) ∫_Q |φ − φ_Q| ≤ C_0. Now, for every interval, we found a constant so that (1/|I|) ∫_I |log MF(x) − A_I| dx ≤ c_0. So the log of the maximal function is in BMO; one step further:

||log MF||_BMO ≤ C_0.

We'll come back to the proof next time, but we'll use the previous dyadic argument to prove (MF)^{1−ε} ∈ A_1. Fix a dyadic interval I; the claim is

(1/|I|) ∫_I (MF)^{1−ε} ≤ C_ε (MF(x))^{1−ε} for x ∈ I.

Proof: by Hardy–Littlewood we have the weak-type inequality |{x ∈ [0,1] : MF(x) ≥ λ}| ≤ c_0 ||F||_{L^1} / λ, since F ∈ L^1 on [0,1]. Recall the nice formula

∫_0^1 G(x)^p dx = p ∫_0^∞ λ^{p−1} |{x ∈ [0,1] : G(x) > λ}| dλ.

If G = MF, set p = 1 − ε: the weak-type bound c_0/λ makes the integral

(1 − ε) ∫_0^∞ λ^{−ε} min(1, c_0/λ) dλ

converge, with a bound of the order c_0^{1−ε}/ε. Repeat on any interval, and put ||F||_{L^1(I)} in place of 1; then it all goes away.

INTERLUDE: SCALE FREE BANACH SPACES. X is a Banach space whose norm satisfies ||F(x)|| = ||F(ax + b)||: the norm is invariant under dilation and translation. If you rotate or translate a function in L^2, its norm doesn't change; but if you dilate, the norm changes, in L^2 or L^1. But L^∞ is scale free. Also BMO. Also something called the Bloch space (Bloch murdered his brother): F holomorphic (or ΔF = 0) on the upper half plane with |∇F(z)| ≤ c y^{−1}.

Lecture 7

Adriano Garsia's book covers some of this stuff: H^p spaces and BMO in the martingale setting, especially dyadic. Dyadic setting: {F_n} is an L^p bounded dyadic martingale, 1 ≤ p ≤ ∞; equivalently F_n = E_n(f) for some f ∈ L^p. Let

F*(x) = sup_n |F_n(x)|,

the Doob maximal function, which is effectively the Hardy–Littlewood maximal function. Theorem: ||F*||_{L^p} ≤ C_p ||f||_{L^p} for 1 < p ≤ ∞. Note the weak-type inequality proven earlier: |{x : F*(x) > λ}| ≤ 2||μ||_M / λ. Notice f ∈ L^∞ gives |E_n(f)| ≤ ||f||_∞, hence |F*(x)| ≤ ||f||_∞, so that extreme is trivial. Fix f ∈ L^p, 1 < p < ∞. By real interpolation between the weak-type L^1 bound and the trivial L^∞ bound we get ||F*||_{L^p} ≤ C_p ||f||_p.

Now, let's look at locally integrable functions on R^d:

F*(x) = sup_{r>0} (1/|B(x,r)|) ∫_{B(x,r)} |F(y)| dy.

Then we have the same theorem: |{F*(x) > λ}| ≤ c_d ||F||_{L^1} / λ, and for p > 1, ||F*||_{L^p} ≤ c_{p,d} ||F||_{L^p}. Proof: trivially, F* is bounded by the L^∞ norm, so we just need the weak L^1 statement, and the same interpolation does the rest. Besicovitch Covering Lemma: let {B(x, r_x) : x ∈ A} be a set of balls centered at the points of A, with all r_x < R < +∞. Then there exists a countable subcollection {B(x_j, r_{x_j}) : x_j ∈ A} covering A with

Σ_j χ_{B(x_j, r_{x_j})}(y) ≤ c_d,

where c_d refers to the dimension. The corollary is the weak type inequality (or, technically, the weak-type (1,1) inequality). Let Ω_λ = {x : F*(x) > λ}, an open set. Besicovitch gives balls B_{x_j} covering Ω_λ with bounded overlap, and λ|B_{x_j}| ≤ ∫_{B_{x_j}} |F| for each; summing,

|Ω_λ| ≤ Σ_j |B_{x_j}| ≤ (1/λ) Σ_j ∫_{B_{x_j}} |F| ≤ (c_d/λ) ||F||_{L^1}.

The Besicovitch proof is a greedy algorithm. Step 1: let x_1 be such that r_{x_1} ≥ (9/10) sup{r_y : y ∈ A}. Step 2: r_{x_2} ≥ (9/10) sup{r_y : y ∈ A, y ∉ B_{x_1}}, and so on: r_{x_n} ≥ (9/10) sup{r_y : y ∈ A, y ∉ ∪_{j<n} B_{x_j}}.

We need to check two things: 1. all of A is covered; 2. the overlap Σ χ_{B_j} ≤ C_1. For the overlap: if j > i then r_i ≥ (9/10) r_j, since x_j was still available when x_i was chosen. The centers are separated by design: j > i and a_j ∉ B_i give

|a_i − a_j| > r_i = r_i/3 + 2r_i/3 ≥ r_i/3 + (2/3)(9/10) r_j > r_i/3 + r_j/3,

so the balls shrunk by a factor 3 are disjoint, which bounds the overlap. Proof unfinished: LOOKUP!

Lecture 8

Today: maximal functions F*, operators on Hilbert space, singular integral operators on L^p, etc. The crucial estimate is for boundary value problems. You have the Dirichlet problem on R^2_+: the data is F ∈ L^p on the boundary, the solution satisfies ΔF = 0, and you want to find the solution with that boundary data.

P(x) = (1/π) 1/(1 + x^2)

is the Poisson kernel; its integral over the line is one. The standard way to make an approximation to the identity: P_t(x) = (1/t) P(x/t), which lives approximately between −t and +t. Then

F(z) = F(x + iy) = ∫_R F(s) P_y(s − x) ds, with sup_y ∫_R |F(x + iy)|^p dx comparable to ||F||_p^p.

Define the nontangential maximal function

F*(x) = sup_{z ∈ Γ(x,α)} |F(z)|,

where Γ(x,α) is the cone with vertex at x and aperture α. Then ||F*||_{L^p} ≤ c_p ||F||_{L^p} for p > 1, and for p = 1 the weak type bound: |{x ∈ R : F*(x) ≥ λ}| ≤ C ||F||_{L^1} / λ.

Let φ ≥ 0 be radial and decreasing in r, with integral 1 on R. We claim

sup_t |F * φ_t(x)| ≤ c MF(x).

Proof: we may as well assume F ≥ 0. This is called the Birthday Cake: dominate φ_t by a weighted stack of normalized indicators of intervals,

φ_t ≤ Σ_k w_k (1/|I_k|) χ_{I_k}, with Σ_k w_k ≤ c,

decomposing the graph of φ_t into dyadic layers (literally, a stack), so that

F * φ_t(x) ≤ Σ_k w_k (1/|I_k|) ∫_{I_k} F ≤ c MF(x).

It turns out that the Hilbert transform

Hf(x) = (1/π) p.v. ∫ f(x − s)/s ds

maps L^2 → L^2 as an isometry, and is of weak type (1,1). By Marcinkiewicz interpolation it is bounded on L^p for 1 < p ≤ 2; for 2 < p < ∞ you can just use L^p–L^q duality and the isometry. So really the only steps we need are the weak type and the L^2 isometry. Use the Fourier transform, if you've got one: ||F||_{L^2} = ||F̂||_{L^2} and

(F * G)^(ξ) = F̂(ξ) Ĝ(ξ). On an abstract Hilbert space H: suppose

T = Σ_k T_k, with ||T_k||_{H→H} ≤ A for every k, and T_j* T_k = 0, T_j T_k* = 0 for j ≠ k

(orthogonality). Then

<TF, TF> = Σ_{j,k} <T_j F, T_k F> = Σ_k <T_k F, T_k F> = Σ_k ||T_k F||^2 ≤ A^2 ||F||^2,

since by orthogonality the T_k act on orthogonal pieces of H; so ||T||_{H→H} ≤ A.

Third method (Cotlar): suppose you don't want to use the Fourier transform. You don't get pure orthogonality, but maybe you get almost orthogonality. Suppose T = Σ T_k with

||T_k||_{H→H} ≤ A and, for all j, Σ_k ||T_j* T_k||^{1/2} ≤ B, Σ_k ||T_j T_k*||^{1/2} ≤ B.

Then ||T|| ≤ C(A, B). We don't give the proof, but it has to do with the spectral radius. There's a nice application, though. Take the hyperbola 1/x, and draw bump functions ψ_n essentially supported where |x| ≈ 2^n, sitting under the graph (below 1/x for positive x, above it for negative x), with

|ψ_n(x)| ≤ c 2^{−n}, ∫ ψ_n = 0, and 1/x = Σ_n ψ_n(x).

Set T_n F = F * ψ_n. Then Cotlar's hypotheses hold: by the cancellation and the Minkowski integral inequality, the interaction of ψ_j and ψ_k decays like c 2^{−|k−j|}.

This is related to a general method, Schur's Lemma. Look at an infinite matrix M = (m_{jk}). Suppose sup_j Σ_k |m_{jk}| ≤ A and sup_k Σ_j |m_{jk}| ≤ B. Then ||M||_{ℓ^2 → ℓ^2} ≤ A^{1/2} B^{1/2}. This is something like Cotlar.
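Schur's lemma can be checked numerically. A sketch with NumPy: for a random matrix, bound the ℓ^2 operator norm (largest singular value) by the square root of the product of the maximal row sum and maximal column sum.

```python
import numpy as np

rng = np.random.default_rng(0)
# sparse random matrix with nonnegative entries
M = rng.random((50, 50)) * (rng.random((50, 50)) < 0.2)

A = np.abs(M).sum(axis=1).max()   # sup_j  sum_k |m_jk|  (row sums)
B = np.abs(M).sum(axis=0).max()   # sup_k  sum_j |m_jk|  (column sums)
op_norm = np.linalg.norm(M, 2)    # largest singular value = l2 -> l2 norm

# Schur's lemma: ||M||_{2,2} <= sqrt(A * B)
assert op_norm <= np.sqrt(A * B) + 1e-9
```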

Lecture 9

Wiener Covering Lemma: {B(x_α, r_α)} is an indexed set of open balls, of any cardinality, and E = ∪_α B_α. Suppose there exists R such that r_α ≤ R for every ball. THEN there exists a subcollection {B_j} such that B_j ∩ B_k = ∅ when j ≠ k and E ⊂ ∪_j B(x_j, 3r_j): they don't touch each other, but they cover when you blow up by three. This was for ergodic theory originally; see Rudin for a proof. Just another greedy algorithm: take (nearly) the largest radius r_1, then pick the largest remaining ball among those disjoint from the first, and see that a discarded ball B, with r ≤ r_1, satisfies B ⊂ B(x_1, 3r_1) since 3r_1 ≥ r_1 + 2r. So you can continue, and it's an exercise to see that you pick up everything: start with the compact case, then expand to everything else. Recall the Hilbert transform is

Hf(x) = (1/π) p.v. ∫ f(y)/(x − y) dy.

The discrete Hilbert transform

H_d f(n) = Σ_{m ≠ n} f(m)/(m − n), H_d : ℓ^2 → ℓ^2,

is a bounded operator; the proof of this is actually elementary. For the formal proof that the Hilbert transform takes L^2 to L^2, it is enough to work with smooth F, say |F(x) − F(y)| ≤ c|x − y|. Take the truncated integral

∫_{ε<|y−x|<A} F(y)/(x − y) dy.

Because the kernel is odd and F(x) comes outside the integral,

∫_{ε<|y−x|<A} F(x)/(x − y) dy = 0,

so

| ∫_{ε<|y−x|<A} (F(y) − F(x))/(x − y) dy | ≤ ∫_{ε<|y−x|<A} |F(y) − F(x)| / |y − x| dy ≤ 2cA,

and the limit as ε → 0 exists. Since F is assumed to be in the Schwartz class, the tail is also finite:

∫_{|x−y|>1} |F(y)| / |y − x| dy < ∞,

so

HF(x) = (1/π) lim_{ε→0, A→∞} ∫_{ε<|y−x|<A} F(y)/(x − y) dy

exists. Take the Fourier transform:

(F * G)^(ξ) = F̂(ξ) Ĝ(ξ), (F * 1/x)^(ξ) = F̂(ξ) (1/x)^(ξ).

Regularizing, 1/(x + iε) = K_ε(x), and

lim_{ε→0} K̂_ε(ξ) = −iπ sgn(ξ) (up to the chosen normalization),

so the symbol of H (with the 1/π in front) is −i sgn(ξ). So if F is Schwartz,

||HF||_{L^2} = ||F̂(ξ) (−i sgn ξ)||_{L^2} (by Plancherel) = ||F̂||_{L^2} = ||F||_{L^2}.

So the Hilbert transform is an isometry.
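The multiplier computation can be tried with the discrete Fourier transform. A sketch (the periodic analogue, not the line; odd length avoids the Nyquist mode): apply the symbol −i·sgn(ξ) and check the isometry on mean-zero data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1023                           # odd length: no Nyquist frequency
f = rng.standard_normal(n)
f -= f.mean()                      # remove the zero mode

fhat = np.fft.fft(f)
freqs = np.fft.fftfreq(n)          # frequencies in (-1/2, 1/2)
multiplier = -1j * np.sign(freqs)  # the Hilbert transform symbol -i sgn(xi)
Hf = np.fft.ifft(multiplier * fhat).real

# |multiplier| = 1 off the zero mode, so H is an L^2 isometry
assert abs(np.linalg.norm(Hf) - np.linalg.norm(f)) < 1e-8
# and H^2 = -I on mean-zero functions
HHf = np.fft.ifft(multiplier * np.fft.fft(Hf)).real
assert np.allclose(HHf, -f, atol=1e-8)
```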

The equivalent real transforms in R^d are the Riesz transforms, F ↦ F * K_j where

K_j(x) = c_d x_j / |x|^{d+1};

each is also (a multiple of) an isometry on L^2. BUT. What if the function is L^2 but not Schwartz? We need

lim_{ε→0, A→∞} ∫_{ε<|x−y|<A} F(y)/(x − y) dy

to exist almost everywhere, and for that you have to make a maximal function argument:

H*F(x) = sup_{ε,A} | ∫_{ε<|x−y|<A} F(y)/(x − y) dy |.

Given ||F||_{L^2} = 1, approximate F = Σ_j F_j with F_1 nice and ||F_j|| ≤ 10^{−27j} for j ≥ 2, or anything you want, because smooth functions are dense in L^2; you only have to prove a statement of this type for continuous functions of compact support. Let P ≥ 0 be a positive kernel with ∫_R P = 1: for example the Poisson kernel (1/π) y/(x^2 + y^2), or the heat kernel

k_t(x) = (4πt)^{−1/2} e^{−x^2/4t},

which also integrates to 1. Take F in the Schwartz class and compute k_t * HF(x), which is the same as H(k_t * F)(x). Now let ε > 0 and look at

∫_{|x−y|>ε} F(y)/(x − y) dy.

This is almost equal to F * Q_ε(x), where Q_ε is the continuous kernel obtained by connecting the two halves of 1/(x − y) across the gap; Q_ε has mean value zero, and

|LHS − RHS| ≤ A MF(x).

We know ||HF||_{L^2} = ||F||_{L^2} for F smooth with compact support. Combining,

|H*F(x)| = sup_{ε,A} | ∫_{ε<|y−x|<A} F(y)/(x − y) dy | ≤ |HF(x)| + A_0 MF(x),

both of which are in L^2. On the circle,

H(F)(e^{iθ}) = Σ_n (−i sgn n) F̂(n) e^{inθ}.

We want to look at boundary values. Is ||HF||_{L^p} ≤ c_p ||F||_{L^p}? This was worked out by Marcel Riesz, with holomorphic functions, which doesn't work in higher dimensions. Now we come to Calderón–Zygmund theory. Take a kernel

K(x) = Ω(x/|x|) / |x|^d,

where Ω is a sufficiently smooth function with mean zero on the sphere:

∫_{S^{d−1}} Ω(y) dσ(y) = 0.

Calderón–Zygmund decomposition: F = G + B, where G is good (G ∈ L^∞) and B is bad (B ∈ L^1).

Lecture 11
Last time: H maps L^∞ to BMO. The same argument gives H : BMO → BMO, defined modulo constants by

Tφ(x) = ∫ ( K(x − y) − K(−y) χ_{|y|≥1}(y) ) φ(y) dy

for kernels like K(x) = c x / |x|^{d+1}; the subtracted truncated term K_tr(y) = K(y) χ_{|y|≥1}(y) makes the integral converge at infinity. Example: φ ∈ BMO implies

∫ |φ(y)| / (1 + |y|^{d+1}) dy < ∞.

Proof: |φ_{[−R,R]^d}| ≤ |φ_{[−1,1]^d}| + c ||φ||_BMO log R. Note ||φ(ax + b)||_BMO = ||φ||_BMO, and recall

||u||_BMO = sup_Q (1/|Q|) ∫_Q |u(y) − u_Q| dy,

the supremum of the mean deviations from the average. Also note that

∫_{R≤|y|≤2R} |K(y)| dy

is independent of R, because we have taken K homogeneous of degree −d. Suppose you have a cube Q of size 1, and outside it the concentric cube 3Q of size 3. Write φ = φ_I + φ_O, where φ_I lives inside the little box 3Q and φ_O outside, and normalize φ_{3Q} = 0. For the inside piece, by L^2 boundedness,

∫_Q |K * φ_I|^2 ≤ ∫ |φ_I|^2 ≤ C ||φ||^2_BMO,

because the 3Q-average is 0. For the outside piece: is there some value c_0 such that

∫_Q |K * φ_O(x) − c_0|^2 dx ≤ c ||φ||^2_BMO?

Let c_0 = K * φ_O(c_Q), the value at the center of Q. Then the above is

∫_Q | ∫_{R^d \ 3Q} ( K(x − y) − K(c_Q − y) ) φ_O(y) dy |^2 dx,

and the kernel difference decays one order faster, which is summable against the logarithmic growth estimate above. Recap: when we have a Calderón–Zygmund operator T, it maps L^p to L^p for 1 < p < ∞, it maps L^1 to weak L^1, and it maps L^∞ to BMO. These are kernels bounded by C/|x − y|^n with

|K(x, y) − K(x', y)| ≤ C |x − x'|^γ / ( |x − y| + |x' − y| )^{n+γ}.
What does this mean for detecting the dimension of a set? Form the data matrix A, let M = A*A, and SVD this thing: M = UDV*, ordering the eigenvalues in D in decreasing order. Assume also that the data is normalized so that E(|x_j − x̄|^2) = 1: the data set lives at scale one around the mean value (that's what variance one means). How close is this to being a three-dimensional data set? Globally there's a sharp theorem if we define, for a cube Q_0 containing N points,

β(Q_0)^2 = inf_{P ∈ P_3} (1/N) Σ_j dist(x_j, P)^2:

find the closest 3-plane to the set of points. Normalizing dist(x_j, P)^2 by the side length ℓ(Q_0)^2, you have a scale-invariant object, and the relevant object is a sum over all scales,

Σ_{ℓ(Q_n)=2^{−n}} β(Q_n)^2.

This is a singular-integral quantity; it corresponds to the square function. Let f ∈ L^2(R). Then

∫ |f|^2 = c ∫∫_{R^2_+} |∇F(z)|^2 y dx dy = c ∫∫_{R^2_+} |∂F/∂y(z)|^2 y dx dy,

where F is the harmonic extension of f. This is called a square function identity. On the circle:

∫∫_D |(Σ a_n z^n)'|^2 log(1/|z|) dx dy = c ∫∫_D |∇F(z)|^2 log(1/|z|) dx dy = c ∫_{S^1} |F(e^{iθ}) − F(0)|^2 dθ,

by Green's Theorem or by Plancherel. Plancherel: ΔF = 0 with boundary values F = f and F(z) = P_z * f, so F̂(ξ, y) = P̂_y(ξ) f̂(ξ). Green's Theorem: set U(z) = |F(z)|^2 and set V(z) to be the Green's function, y on R^{d+1}_+ and log(1/|z|) on D. Let's fix a point x and a cone Γ(x) with vertex at x in the upper half plane. The square function of F is

SF(x) = ( ∫∫_{Γ(x)} |∇F(z)|^2 dx dy )^{1/2},

and Fubini implies

∫_R S^2F(x) dx = c ∫_R F^2.

Let {F_n} be a dyadic martingale on [0,1] (I could actually do this on the whole line; there's a natural way). Set Δ_n = F_n − F_{n−1}, so that

F_n = F_0 + Σ_{k=1}^n Δ_k.

The martingale square function is

S(x) = ( Σ_{k=1}^∞ Δ_k^2(x) )^{1/2}.

Lecture 12

Let's say we have the standard dyadic grid on R, and look at BMO_d([0,1]): for all dyadic intervals I ⊂ [0,1],

( (1/|I|) ∫_I |φ − φ_I|^2 )^{1/2} ≤ A.

For example, define φ(x) = 0 on [0,1/2] and φ(x) = log(1/|x − 1/2|) for x ≥ 1/2. Then ||φ||_{BMO_d} = A < ∞, but φ ∉ BMO(R), because if J = [1/2 − δ, 1/2 + δ], then for every constant c,

(1/|J|) ∫_J |φ − c| ≳ log(1/|J|).

So it's not in BMO, but it is in dyadic BMO. For a randomly translated grid, the event that a short dyadic interval J straddles 1/2 (neither J ⊂ [0,1/2] nor J ⊂ [1/2,1]) is a low probability event. Let's do this on the circle, subtracting constants so that ∫_{S^1} φ_t = 0.

Then the average over translated grids,

φ = (1/2π) ∫_0^{2π} φ_t dt, belongs to BMO, with ||φ||_{BMO(S^1)} ≤ C,

where each φ_t = Σ a_{I_t} h_{I_t}(θ) is the Haar expansion in the translated grid: I_t = I + t (mod 2π), the grid rotated by the angle t. The dyadic BMO condition in the rotated grid reads

(1/|J_t|) ∫_{J_t} |φ − φ_{J_t}|^2 = (1/|J_t|) Σ_{I_t ⊂ J_t} |a_{I_t}|^2;

this is the L^2 dyadic BMO norm on the circle. Break φ_t into scales:

φ_t = Σ_{n=0}^∞ Σ_{|I_t|=2π·2^{−n}} a_{I_t} h_{I_t} = Σ_{n=0}^∞ ψ_{n,t}(θ).

This is like a Fourier transform broken up into frequencies, which correspond to the lengths of the intervals. So φ(θ) = E_t(Σ_n ψ_{n,t}(θ)). There are a bunch of intervals at each of the lengths. ψ_{n,t} isn't even a continuous function, but it's almost continuous: the probability that two nearby points are separated by a jump is low (they're likely to be in the same interval, where ψ_{n,t} is locally constant). So that's *almost* like Lipschitz:

P( |ψ_{n,t}(x) − ψ_{n,t}(y)| ≥ A 2^n |x − y| ) << 1 if |x − y| < 2^{−n}, while |ψ_{n,t}| ≤ 1.

So it should be safe to average over these; hitting a discontinuity is a low probability event. Lemma:

||E_t(ψ_{n,t})||_Lip ≤ C_0 2^n.

The proof is the same for all n, and we need only consider |x − y| < 2^{−n}. What is the probability that a random translate I_t of an interval I with ℓ(I) = 2^{−n} separates x and y, i.e. puts x in the left half L and y outside it, or x ∈ R and y ∉ R, etc.? This happens only if h_{I_t}(x) ≠ h_{I_t}(y), which requires an endpoint of a half-interval to fall between x and y; over all random translations, that probability is about ε/2^{−n} for ε = |x − y|. Then

ψ_n(x) − ψ_n(y) = E_t[ (ψ_{n,t}(x) − ψ_{n,t}(y)) χ_{x,y bad} ].

The probability of badness is ≤ 8|x − y|/2^{−n}, so the expectation is ≤ 2·8|x − y|/2^{−n}; hence |E_t ψ_{n,t}(x) − E_t ψ_{n,t}(y)| ≤ c 2^n |x − y|: Lipschitz. Now write ψ_n = Σ_J ψ_J, where J runs over the dyadic intervals of the standard unit grid with ℓ(J) = 2^{−n}, and

ψ_J(θ) = E_t[ a_{I_t} h_{I_t}(θ) χ_{c_{I_t} ∈ J} ].

It's an exercise to show that these are Lipschitz. Even more: the ψ_J are almost orthogonal. First, we'll do this on S^1. Let ψ_m = Σ_{ℓ(J)=2^{−m}} ψ_J and ψ_n = Σ_{ℓ(K)=2^{−n}} ψ_K, with n ≥ m. For individual pieces, using that ψ_J has mean zero and is supported near 2J,

| ∫ ψ_J(θ) ψ_K(θ) dθ | = | ∫_{2J} ψ_J(θ) ( ψ_K(θ) − ψ_K(c_J) ) dθ | ≤ ∫_{2J} |ψ_J(θ)| · |θ − c_J| · ||ψ_K||_Lip dθ ≤ C ℓ(J) ||ψ_K||_Lip |2J|.

Now sum over all J, K: each J hits only a bounded number of K's of a given length, so

| ∫ ψ_n ψ_m | ≤ C 2^{−|m−n|}.

These ψ_J's are becoming wavelets: mean value zero, Lipschitz, almost orthogonal, etc. This is the canonical way to break up a function into bandlimited functions. Any BMO function is a dyadic BMO function. Not all dyadic BMO functions are BMO. But if you average them over translated grids, you do get a BMO function!


Lecture 13

Product Formulas for Measures and Applications to Analysis and Geometry

Fact: μ is a probability measure, μ ∈ Pr([0, 1]) (or on [0, 1]^d), if and only if (and uniquely!)

dμ = Π_I (1 + a_I h_I(x)) dx,

where h_I is the Haar function on the dyadic interval I: -1 on the left half, +1 on the right half, and 0 elsewhere (each half closed on the left and open on the right, unless its right endpoint is 1, in which case it is closed), and a_I ∈ [-1, +1]. Define this infinite product as a limit:

μ_n(x) = Π_{ℓ(I) ≥ 2^{-n}} (1 + a_I h_I(x)).

This converges in the weak-star topology to μ:

∫ g(x) μ_n(x) dx → ∫ g(x) dμ.

This representation is a theorem of Robert Fefferman, Carlos Kenig, and Jill Pipher. We prove it by induction. We have [0, 1], and say μ([0, 1/2)) = μ_-, μ([1/2, 1]) = μ_+, with μ_- + μ_+ = 1. Then μ_1 is 1 - a on [0, 1/2) and 1 + a on [1/2, 1], where a is chosen so that

(1/2)(1 - a) = μ_-,   (1/2)(1 + a) = μ_+,

so that ∫_0^1 μ_1(x) dx = 1. Another way of thinking about this: a records the ratio of the measures of the two halves,

μ(L)/μ(R) = (1 - a)/(1 + a).

That's all it is. μ_n is constant on the dyadic intervals with ℓ(I) = 2^{-n}. Conversely, choosing a sequence of coefficients a_I ∈ [-1, 1] gives you a Borel probability measure.
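The converse direction is easy to simulate. Here is a minimal sketch (my own; the coefficient distribution and grid size are arbitrary choices, not from the lecture) that draws random coefficients a_I ∈ [-1, 1] and builds the density of μ_n on a dyadic grid; the total mass is 1 at every stage:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_product_density(n, rng):
    """Density of mu_n = prod over dyadic I with |I| >= 2^-n of (1 + a_I h_I(x)),
    sampled on a grid of 2^n cells, with each a_I drawn uniformly from [-1, 1]."""
    N = 2 ** n
    mu = np.ones(N)
    for level in range(n):
        size = N // (2 ** level)          # cells per interval at this level
        for k in range(2 ** level):
            a = rng.uniform(-1.0, 1.0)
            mu[k*size : k*size + size//2] *= (1.0 - a)       # h_I = -1 on the left half
            mu[k*size + size//2 : (k+1)*size] *= (1.0 + a)   # h_I = +1 on the right half
    return mu

mu = random_product_density(10, rng)
print(mu.sum() / len(mu))   # total mass: 1 (up to rounding), at every stage
print(mu.max())             # the density can be very spiky
```

Each factor redistributes mass between the two halves of its interval without changing the total, which is exactly the induction in the proof above.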

Burstiness: internet traffic, stock prices, etc.


These measures can be very spiky; you can simulate such random measures. Recall what A_2 is. The A_2 norm is

||w||_{A_2} = sup_I (1/|I| ∫_I w dx)(1/|I| ∫_I w^{-1} dx).
This is always ≥ 1. In fact, if w = e^φ, then by Jensen's inequality

||w||_{A_2} ≥ sup_I (1/|I|) ∫_I e^{φ(x) - φ_I} dx.

If the A_2 norm is 1 + ε, then φ will have small BMO norm. There exists some constant c such that c·w(x) dx is a probability measure on [0, 1]. We can write this as the product Π(1 + a_I h_I(x)), and this looks like the logarithm: the coefficients give approximations to the logarithm of the weight,

Π(1 + a_I h_I) ≈ e^{Σ a_I h_I}.

This is not exactly true, but becomes true if we add in an exponential correction of size e^{O(Σ a_I^2 1_I)}. If all the |a_I| are bounded below, |a_I| ≥ δ > 0, then the measure is singular with respect to Lebesgue measure.

An application: get some call volumes from telephone antennae. For each antenna and each day there is a usage pattern, with spikes and valleys. Expand it into dyadic coefficients, then do a diffusion embedding of the coefficients; they pick two low eigenvalues. Why not just do wavelets? It's insane: wavelet coefficients have nothing to do with the measure on the left and right subintervals, and no measure comes out of them.

Back to differentiability. A Lipschitz function on R^d is differentiable almost everywhere (say F maps R^d to R^m); this is Rademacher's Theorem. The question is, is there a converse? Given a set E of measure 0, is there an F that is not differentiable at any point of E? Let E ⊂ R, |E| = 0, E ⊂ ∪_j I_j, Σ_j |I_j| < ε. Then let

G_1(x) = Σ_j 1_{I_j}(x),   F_1(x) = ∫_0^x G_1(t) dt,

so F_1'(x) = 1 for x ∈ E,

because on every such interval the derivative is constant. Now take sub-intervals that still cover E, and do the same thing with -1; the derivative of the correction is identically -1 on this sub-collection. And so on. Now take a d-dimensional version of this picture and it becomes much more complicated: cover with zig-zaggy strips instead of intervals. The product formula above needs three kinds of Haar functions instead of one: vertical split, horizontal split, and no split. The integral along a horizontal segment is always 1, no matter the height; same with a vertical segment.
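The one-dimensional construction can be sketched numerically. Everything below is a toy instance of the idea, not the lecture's construction: E is just the point x0 = 1/2, the covering radii 10^{-k} are my own choice, G alternates between +1 and -1 on successive rings of nested intervals, F integrates G, and the difference quotients of F at x0 alternate in sign from scale to scale, so F'(x0) cannot exist.

```python
# Toy instance: E = {x0}, covered by nested intervals I_k = (x0 - r_k, x0 + r_k).
x0 = 0.5
radii = [10.0 ** (-k) for k in range(1, 8)]   # r_k = 10^-k

def G(t):
    """+1 on I_1 minus I_2, -1 on I_2 minus I_3, +1 on I_3 minus I_4, ...;
    0 outside the outermost interval I_1."""
    depth = sum(1 for r in radii if abs(t - x0) < r)   # how deeply nested t is
    if depth == 0:
        return 0.0
    return 1.0 if depth % 2 == 1 else -1.0

def F(x):
    """Exact integral of the piecewise-constant G over [0, x]."""
    pts = sorted({0.0, x} | {x0 - r for r in radii} | {x0 + r for r in radii})
    pts = [p for p in pts if 0.0 <= p <= x]
    return sum(G((a + b) / 2) * (b - a) for a, b in zip(pts, pts[1:]))

# Difference quotients of F at x0, measured on the scales r_k: they alternate
# in sign (about +0.82, -0.82, +0.82, ...), so F'(x0) does not exist.
quotients = [(F(x0 + r) - F(x0)) / r for r in radii[:4]]
print([round(q, 3) for q in quotients])
```

In the real construction E has measure zero but can be uncountable; the same alternating-cover idea applies with a countable family of intervals at each stage.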

Lecture 14

In R^d we have 2^d - 1 Haar functions, just as there are vertical and horizontal ones in 2d: divide a cube in half in each dimension and put -1, +1 in each dimension. These are all orthogonal functions. If

Π_Q Π_i (1 + a_{Q,x_i} h_{Q,x_i}(z))

is a measure defined by Haar functions in all the dimensions, we need to set this up so that the product converges. You can define the functions so that the measure is defined everywhere. The integral of the product over a cube Q is 1, just by the nature of Haar functions, since the a_I are between -1 and +1.

Let's go back to dimension 2. Let Q be a dyadic square, Q = I × J, and let L, R denote the two halves of J. The product for a measure with μ ≥ ε^{-1} on E is

μ = Π (1 + 1_Q a_I h_I(x)) · Π (1 + 1_Q a_L h_L(y))(1 + 1_Q a_R h_R(y)) = μ_1 · μ_2.

Look at E_x. If we integrate on a horizontal line L, ∫_L μ_1(x, y) dx = 1, because on L, μ_1 is just a product corresponding to some probability measure. Since μ_1(z) ≥ ε^{-1/2} on E_x, the length of E_x ∩ L is less than ε^{1/2}.

Now suppose we have a Lipschitz curve Γ in the square, Γ = (A(y), y), with |A(y) - A(y')| ≤ |y - y'|, and suppose E_y is some small set that the curve goes through, with m_1(Γ ∩ E_y) < 1.
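The claim that every horizontal line integrates μ_1 to 1 is easy to check numerically. In this sketch (my own; the grid size and random coefficients are illustrative) each dyadic square Q = I × J contributes one x-direction factor (1 + a_Q h_I(x)) that is constant in y on Q:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
N = 2 ** n
mu1 = np.ones((N, N))   # indexed [y, x]

# One x-direction Haar factor (1 + a_Q h_I(x)) per dyadic square Q = I x J.
for level in range(n):
    s = N // 2 ** level                  # side of the squares at this level
    for i in range(2 ** level):          # column index of the square
        for j in range(2 ** level):      # row index of the square
            a = rng.uniform(-1.0, 1.0)
            rows = slice(j*s, (j+1)*s)
            mu1[rows, i*s : i*s + s//2] *= (1.0 - a)        # h_I = -1 on left half
            mu1[rows, i*s + s//2 : (i+1)*s] *= (1.0 + a)    # h_I = +1 on right half

row_means = mu1.mean(axis=1)
print(np.allclose(row_means, 1.0))   # every horizontal line integrates to 1
```

Restricted to one row, the construction is exactly the one-dimensional product formula, so each row is the density of a probability measure; that is the content of the ∫_L μ_1 dx = 1 claim.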


If this happens, E_y is covered by a small collection of Lipschitz strips with Σ_j width(Strip_j) << 1. I can now build a Lipschitz function

f(z) = Σ_j 1_{S_j}(z),

and then define

F(x, y) = ∫_0^y f(x, s) ds.

Then ∂F/∂y = 1 on the strips and 0 off them, and

|F(x, s_1) - F(x, s_2)| ≤ |s_1 - s_2|,

so F is Lipschitz in the vertical direction, trivially. In fact F is Lipschitz in both variables: |F(z_1) - F(z_2)| ≤ M ||z_1 - z_2||. I can define a partial order on points: for z_1, z_2 ∈ E, say z_1 < z_2 if z_2 is within the cone around z_1 of width δ, where δ is the Lipschitz constant of the graph Γ. So in each strip, we can just pick a minimal point. We need a small number of Lipschitz curves to cover all the z_j ∈ Γ. Take all the minimal elements x_j and let F_1 be those points. Step 2: repeat on {z_j : z_j ∉ F_1} and get Γ_2. Step 3: repeat on {z_j : z_j ∉ F_1 ∪ F_2}. These sets are all disjoint. Suppose there were M + 1 stages and some x_{j_0} ∉ F_1 ∪ F_2 ∪ ... ∪ F_M. Then there exists x_{j_M} ∈ F_M with x_{j_M} < x_{j_0}, then x_{j_{M-1}} ∈ F_{M-1} with x_{j_{M-1}} < x_{j_M}, and so on; this produces a chain x_{j_1} < x_{j_2} < x_{j_3} < ..., which is a contradiction. So you only need M Lipschitz curves.

We have to integrate, unfortunately, over all Lipschitz curves, and that's an infinite-dimensional family. However, it's rather easy to show that you can perturb x and y a little bit so that they won't be equal pointwise but are approximately the same, and when you integrate over a line in a small cone, the integral is small. The set of directions is finite-dimensional, but all Lipschitz curves is a lot more. Instead, we do much more exotic things than Haar functions.

Next week: no class.

