
EE263

S. Lall

2011.05.16.01

Homework 6
Due Thursday 5/19

1. Image reconstruction from line integrals.


In this problem we explore a simple version of a tomography problem. We consider a
square region, which we divide into an n × n array of square pixels, as shown below.

[Figure: the square region divided into an n × n array of pixels, indexed column first: x_1, . . . , x_n down the first column, x_{n+1}, . . . , x_{2n} down the second, and so on up to x_{n^2}.]
The pixels are indexed column first, by a single index i ranging from 1 to n^2, as shown
above. We are interested in some physical property such as density (say) which varies
over the region. To simplify things, we'll assume that the density is constant inside each
pixel, and we denote by x_i the density in pixel i, i = 1, . . . , n^2. Thus, x ∈ R^{n^2} is a
vector that describes the density across the rectangular array of pixels. The problem is
to estimate the vector of densities x, from a set of sensor measurements that we now
describe. Each sensor measurement is a line integral of the density over a line L. In
addition, each measurement is corrupted by a (small) noise term. In other words, the
sensor measurement for line L is given by
\sum_{i=1}^{n^2} l_i x_i + v,

where l_i is the length of the intersection of line L with pixel i (or zero if they don't
intersect), and v is a (small) measurement noise. This is illustrated below for a problem
with n = 3. In this example, we have l_1 = l_6 = l_8 = l_9 = 0.
[Figure: a 3 × 3 example showing line L crossing the pixel grid, with intersection lengths l_2, l_3, l_4, l_5, l_7 marked in pixels x_2, x_3, x_4, x_5, x_7.]
Now suppose we have N line integral measurements, associated with lines L_1, . . . , L_N.
From these measurements, we want to estimate the vector of densities x. The lines are
characterized by the intersection lengths

l_{ij}, \qquad i = 1, \ldots, n^2, \qquad j = 1, \ldots, N,


where l_{ij} gives the length of the intersection of line L_j with pixel i. Then, the whole set
of measurements forms a vector y ∈ R^N whose elements are given by

y_j = \sum_{i=1}^{n^2} l_{ij} x_i + v_j, \qquad j = 1, \ldots, N.

And now the problem: you will reconstruct the pixel densities x from the line integral
measurements y. The class webpage contains the M-file tomodata.m, which you should
download and run in Matlab. It creates the following variables:
N, the number of measurements (N),
n_pixels, the side length in pixels of the square region (n),
y, a vector with the line integrals y_j, j = 1, . . . , N,
lines_d, a vector containing the displacements d_j, j = 1, . . . , N (the distance of each line from the
center of the region, in pixel lengths), and
lines_theta, a vector containing the angles θ_j, j = 1, . . . , N, of each line.
The file tmeasure.m, on the same webpage, shows how the measurements were computed,
in case you're curious. You should take a look, but you don't need to understand it to
solve the problem. We also provide the function line_pixel_length.m on the webpage,
which you do need to use in order to solve the problem. This function computes the pixel
intersection lengths for a given line. That is, given d_j and θ_j (and the side length n),
line_pixel_length.m returns an n × n matrix whose i, j-th element corresponds to the
intersection length for pixel i, j in the image. Use this information to find x, and display
it as an image (of n by n pixels). You'll know you have it right when the image of x
forms a familiar acronym... Matlab hints: here are a few functions that you'll find useful
to display an image (a short usage sketch follows the list):
A=reshape(v,n,m), converts the vector v (which must have n*m elements) into an
n × m matrix (the first column of A is the first n elements of v, etc.),
imagesc(A), displays the matrix A as an image, scaled so that its lowest value is
black and its highest value is white,
colormap gray, changes Matlab's image display mode to grayscale (you'll want to
do this to view the pixel patch),
axis image, redefines the axes of a plot to make the pixels square.
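Putting these together, a minimal display sketch (v, n and m below are placeholders for whatever vector and dimensions you are displaying):

v = rand(6*4,1);          % hypothetical n*m-element vector
n = 6; m = 4;
A = reshape(v, n, m);     % first n elements of v become column 1, and so on
imagesc(A)                % show A as a scaled image
colormap gray             % grayscale display
axis image                % square pixels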
Note: While irrelevant to your solution, this is actually a simple version of tomography,
best known for its application in medical imaging as the CAT scan. If an x-ray gets
attenuated at rate x_i in pixel i (a little piece of a cross-section of your body), the j-th
measurement is

z_j = \prod_{i=1}^{n^2} e^{-x_i l_{ij}},

with the l_{ij} as before. Now define y_j = -\log z_j, and we get

y_j = \sum_{i=1}^{n^2} x_i l_{ij}.

Solution.
The first thing to do is to restate the problem in the familiar form y = Ax + v. Here,
y ∈ R^N is the measurement (given), x ∈ R^{n^2} is the physical quantity we are interested
in, A ∈ R^{N×n^2} is the relation between them, and v ∈ R^N is the noise, or measurement


error (unknown, and we'll not worry about it). So we need to find the elements of A . . .
how do we do that?
Comparing

y = Ax, \qquad \text{i.e.,} \qquad y_j = \sum_{i=1}^{n^2} A_{ji} x_i,

with our model

y_j = \sum_{i=1}^{n^2} l_{ij} x_i + v_j,

it is clear that A_{ji} = l_{ij}, j = 1, . . . , N, i = 1, . . . , n^2. We have thus determined a standard


linear model that we want to invert to find x. On running tomodata.m, we find that
n = 8 and N = 121, so N > n^2, i.e., we have more rows than columns (a skinny
matrix). If A is full rank the problem is overdetermined. We can find a unique (but not
exact) solution x_ls, the least-squares solution that minimizes ‖Ax − y‖, which, due to
its noise-reducing properties, provides a good estimate of x (it is the best linear unbiased
estimate). So here's what we do: we construct A element by element by finding the
length of the intersection of each line with each pixel. That's done using the function
line_pixel_length.m provided. A turns out to be full rank (rank(A) returns 64), so we
can compute a unique x_ls. We go ahead and solve the least-squares problem, and then
display the result.
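For reference, since A is skinny and full rank, this least-squares estimate has the usual closed form (the backslash operator in the code below computes the same thing):

x_{\rm ls} = \arg\min_x \|Ax - y\|^2 = (A^T A)^{-1} A^T y.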
Here is a translation of the above paragraph into Matlab code:

% Load the problem data.
tomodata
% Create the matrix A by computing its elements row by row.
A = zeros(N, n_pixels^2);
for i = 1:N
  % Compute the intersection lengths of line i with all pixels.
  l = line_pixel_length(lines_d(i), lines_theta(i), n_pixels);
  A(i,:) = l(:)';    % stack those lengths (column first) into the ith row of A
end
% Check the rank of A.
r = rank(A)          % returns n_pixels^2, i.e., the number of columns
% Solve the least-squares problem: choose x to minimize ||y - A*x||.
x_ls = A\y;
% Reorder the elements of x_ls into a matrix X_ls, for display.
X_ls = reshape(x_ls, n_pixels, n_pixels);
% Display the reconstructed grayscale image.
figure(1)
colormap gray
imagesc(X_ls)
axis image
And here's the end result, the reconstructed image:


[Figure: the reconstructed grayscale image of the pixel densities x.]

2. Quadratic extrapolation of a time series, using least-squares fit.


This is an extension of problem ??, which concerns quadratic extrapolation of a time
series. We are given a series z up to time t. We extrapolate, or predict, z(t + 1) based
on a least-squares fit of a quadratic function to the previous ten elements of the series,
z(t), z(t − 1), . . . , z(t − 9). We'll denote the predicted value of z(t + 1) by ẑ(t + 1). More
precisely, to find ẑ(t + 1), we find the quadratic function f(τ) = a_2 τ^2 + a_1 τ + a_0 for which

\sum_{\tau = t-9}^{t} \big(z(\tau) - f(\tau)\big)^2

is minimized. The extrapolated value is then given by ẑ(t + 1) = f(t + 1).


Show that

\hat z(t+1) = c \begin{bmatrix} z(t) \\ z(t-1) \\ \vdots \\ z(t-9) \end{bmatrix},

where c ∈ R^{1×10} does not depend on t. Find c explicitly.

Solution.
We have

f(n+k) = a_2 (n+k)^2 + a_1 (n+k) + a_0
       = a_2 k^2 + (2 n a_2 + a_1) k + (n^2 a_2 + n a_1 + a_0)
       = u_2(n) k^2 + u_1(n) k + u_0(n),

by letting

u_2(n) = a_2,
u_1(n) = 2 n a_2 + a_1,
u_0(n) = n^2 a_2 + n a_1 + a_0

(we'll omit the index n and simply write u_2, u_1, and u_0). Then,

\begin{bmatrix} f(n) \\ f(n-1) \\ \vdots \\ f(n-9) \end{bmatrix}
= \underbrace{\begin{bmatrix} 0^2 & 0 & 1 \\ (-1)^2 & -1 & 1 \\ \vdots & \vdots & \vdots \\ (-9)^2 & -9 & 1 \end{bmatrix}}_{A}
\begin{bmatrix} u_2 \\ u_1 \\ u_0 \end{bmatrix}.


Let u(n) = u = [u_2 \; u_1 \; u_0]^T, and let x = (z(n), z(n-1), \ldots, z(n-9)). Then

\min_u \sum_{k=-9}^{0} \big(z(n+k) - f(n+k)\big)^2 = \min_u \|x - Au\|^2.

This is a least-squares problem, since A is skinny and full rank. The optimal solution is
u_{\rm opt} = (A^T A)^{-1} A^T x. Therefore,
\hat z(n+1) = f(n+1) = a_{2,\rm opt}(n+1)^2 + a_{1,\rm opt}(n+1) + a_{0,\rm opt}
            = a_{2,\rm opt} + (2 n a_{2,\rm opt} + a_{1,\rm opt}) + (n^2 a_{2,\rm opt} + n a_{1,\rm opt} + a_{0,\rm opt})
            = u_{2,\rm opt}(n) + u_{1,\rm opt}(n) + u_{0,\rm opt}(n)
            = [1 \; 1 \; 1]\, u_{\rm opt} = [1 \; 1 \; 1](A^T A)^{-1} A^T x.

Therefore c = [1 \; 1 \; 1](A^T A)^{-1} A^T. So c is indeed independent of n, and is equal to:
c = [\,0.900 \;\; 0.500 \;\; 0.183 \;\; -0.050 \;\; -0.200 \;\; -0.267 \;\; -0.250 \;\; -0.150 \;\; 0.033 \;\; 0.300\,].

Matlab code for this problem:

a = (0:-1:-9)';
A = [a.^2 a.^1 a.^0];
c = (A*inv(A'*A)*[1; 1; 1])'
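As a quick sanity check (not part of the original solution; the series and time below are arbitrary choices for illustration), applying c to ten consecutive samples of an exact quadratic should reproduce the next sample exactly:

a = (0:-1:-9)';
A = [a.^2 a.^1 a.^0];
c = (A*inv(A'*A)*[1; 1; 1])';    % same c as above
z = @(tau) 3*tau.^2 - 2*tau + 7; % hypothetical quadratic series
t = 25;                          % arbitrary current time
zhat = c*z(t:-1:t-9)';           % predicted z(t+1)
err = zhat - z(t+1)              % zero up to roundoff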

3. Estimation with sensor offset and drift.


We consider the usual estimation setup:

y_i = a_i^T x + v_i, \qquad i = 1, \ldots, m,

where
y_i is the ith (scalar) measurement,
x ∈ R^n is the vector of parameters we wish to estimate from the measurements,
v_i is the sensor or measurement error of the ith measurement.
In this problem we assume the measurements y_i are taken at times evenly spaced, T
seconds apart, starting at time t = T. Thus, y_i, the ith measurement, is taken at time
t = iT. (This isn't really material; it just makes the interpretation simpler.) You can
assume that m ≥ n and the measurement matrix

A = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix}

is full rank (i.e., has rank n). Usually we assume (often implicitly) that the measurement
errors v_i are random, unpredictable, small, and centered around zero. (You don't need
to worry about how to make this idea precise.) In such cases, least-squares estimation of
x works well. In some cases, however, the measurement error includes some predictable
terms. For example, each sensor measurement might include a (common) offset or bias,
as well as a term that grows linearly with time (called a drift). We model this situation
as

v_i = \alpha + \beta i T + w_i,

where α is the sensor bias (which is unknown but the same for all sensor measurements),
β is the drift term (again the same for all measurements), and w_i is the part of the sensor
error that is unpredictable, small, and centered around 0. If we knew the offset α and


the drift term β, we could just subtract the predictable part of the sensor signal, i.e.,
α + β iT, from the sensor signal. But we're interested in the case where we don't know
the offset α or the drift coefficient β. Show how to use least-squares to simultaneously
estimate the parameter vector x ∈ R^n, the offset α ∈ R, and the drift coefficient β ∈ R.
Clearly explain your method. If your method always works, say so. Otherwise describe
the conditions (on the matrix A) that must hold for your method to work, and give a
simple example where the conditions don't hold.
Solution.
Substituting the expression for the noise into the measurement equation gives

y_i = a_i^T x + \alpha + \beta i T + w_i, \qquad i = 1, \ldots, m.

In matrix form we can write this as

\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}
= \underbrace{\begin{bmatrix} a_1^T & 1 & T \\ a_2^T & 1 & 2T \\ \vdots & \vdots & \vdots \\ a_m^T & 1 & mT \end{bmatrix}}_{\tilde A}
\underbrace{\begin{bmatrix} x \\ \alpha \\ \beta \end{bmatrix}}_{\tilde x}
+ \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{bmatrix}.
If Ã is skinny (m ≥ n + 2) and full rank, least-squares can be used to estimate x, α, and
β. In that case,

\tilde x_{\rm ls} = \begin{bmatrix} x_{\rm ls} \\ \alpha_{\rm ls} \\ \beta_{\rm ls} \end{bmatrix} = (\tilde A^T \tilde A)^{-1} \tilde A^T y.

The requirement that Ã be skinny (or at least, square) makes perfect sense: you can't
extract n + 2 parameters (i.e., x, α, β) from fewer than n + 2 measurements. Even if A
is skinny and full rank, Ã may not be. For example, with

A = \begin{bmatrix} 2 & 0 \\ 2 & 1 \\ 2 & 0 \\ 2 & 1 \end{bmatrix}

we have

\tilde A = \begin{bmatrix} 2 & 0 & 1 & T \\ 2 & 1 & 1 & 2T \\ 2 & 0 & 1 & 3T \\ 2 & 1 & 1 & 4T \end{bmatrix},

which is not full rank. In this example we can understand exactly what happened. The
first column of A, which tells us how the sensors respond to the first component of x,
has exactly the same form as an offset, so it is not possible to separate the offset from
the signal induced by x_1. In the general case, Ã is not full rank only if some linear
combination of the sensor signals looks exactly like an offset or drift, or some linear
combination of offset and drift. Some people asserted that Ã is full rank if the vector
of ones and the vector (T, 2T, . . . , mT) are not in the span of the columns of A. This is
false. Several people went into the case when Ã is not full rank in great detail, suggesting
regularization and other things. We weren't really expecting you to go into such detail
about this case. Most people who went down this route, however, failed to mention the
most important thing about what happens. When Ã is not full rank, you cannot separate
the offset and drift parameters from the parameter x by any means at all. Regularization
means that the numerics will work, but the result will be quite meaningless. Some people
pointed out that the normal equations always have a solution, even when Ã is not full
rank. Again, this is true, but the most important thing here is that even if you solve the
normal equations, the results are meaningless, since you cannot separate out the offset


and drift terms from the parameter x. Accounting for known noise characteristics (like
offset and drift) can greatly improve estimation accuracy. The following Matlab code
shows an example.
T = 60;          % time between samples
alpha = 3;       % sensor offset
beta = 0.05;     % sensor drift coefficient
num_x = 3;
num_y = 8;
A = [  1   4   0;
       2   0   1;
      -2  -2   3;
      -1   1  -4;
      -3   1   1;
       0  -2   2;
       3   2   3;
       0  -4  -6 ];   % matrix whose rows are a_i^T
x = [-8; 20; 5];
y = A*x;
% add offset, drift, and noise: y_i = a_i^T x + alpha + beta*T*i + w_i,
% with w_i a (Gaussian, random) noise
for i = 1:num_y
  y(i) = y(i) + alpha + beta*T*i + randn;
end
% least-squares estimate that ignores the offset and drift
x_ls = A\y;
% least-squares estimate using the augmented model
for i = 1:num_y
  last_col(i,1) = T*i;
end
A_tilde = [A ones(num_y,1) last_col];
x_ls_with_noise_model = A_tilde\y;
norm(x - x_ls)
norm(x - x_ls_with_noise_model(1:3))
Additional comments. Many people correctly stated that Ã needed to be full rank and
then presented a condition they claimed was equivalent. Unfortunately, many of these
statements were incorrect. The most common error was to claim that if neither of the
two column vectors that were appended to A in creating Ã was in range(A), then Ã was
full rank. As a counter-example, take

A = \begin{bmatrix} 2 \\ 3 \\ \vdots \\ m+1 \end{bmatrix}
\qquad \text{and} \qquad
\tilde A = \begin{bmatrix} 2 & 1 & 1 \\ 3 & 1 & 2 \\ \vdots & \vdots & \vdots \\ m+1 & 1 & m \end{bmatrix}
\quad \text{(here with } T = 1\text{)}.
Since the first column of Ã is the sum of the last two columns, Ã has rank 2, not 3.
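A quick numerical check of this counter-example (a sketch; m = 5 and T = 1 are arbitrary choices for illustration):

m = 5;
A = (2:m+1)';                     % single sensor column
A_tilde = [A ones(m,1) (1:m)'];
rank(A_tilde)                     % returns 2, not 3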

4. Optimal control of unit mass.


In this problem you will use Matlab to solve an optimal control problem for a force acting
on a unit mass. Consider a unit mass at position p(t) with velocity ṗ(t), subjected to
force f(t), where f(t) = x_i for i − 1 < t ≤ i, for i = 1, . . . , 10.


(a) Assume the mass has zero initial position and velocity: p(0) = ṗ(0) = 0. Find x
that minimizes

\int_0^{10} f(t)^2 \, dt

subject to the following specifications: p(10) = 1, ṗ(10) = 0, and p(5) = 0. Plot the
optimal force f, and the resulting p and ṗ. Make sure the specifications are satisfied.
Give a short intuitive explanation for what you see.
(b) Assume the mass has initial position p(0) = 0 and velocity ṗ(0) = 1. Our goal is to
bring the mass near or to the origin at t = 10, at or near rest, i.e., we want

J_1 = p(10)^2 + \dot p(10)^2

small, while keeping

J_2 = \int_0^{10} f(t)^2 \, dt

small, or at least not too large. Plot the optimal trade-off curve between J_2 and J_1.
Check that the end points make sense to you. Hint: the parameter µ has to cover
a very large range, so it usually works better in practice to give it a logarithmic
spacing, using, e.g., logspace in Matlab. You don't need more than 50 or so points
on the trade-off curve.
Your solution to this problem should consist of a clear written narrative that explains
what you are doing, and gives formulas symbolically; the Matlab source code you devise
to find the numerical answers, along with comments explaining it all; and the final plots
produced by Matlab.
Solution.
(a) First note that \int_0^{10} f(t)^2 \, dt is nothing but ‖x‖^2, because

\int_0^{10} f(t)^2 \, dt = \int_0^1 x_1^2 \, dt + \int_1^2 x_2^2 \, dt + \cdots + \int_9^{10} x_{10}^2 \, dt
= x_1^2 + x_2^2 + \cdots + x_{10}^2 = \|x\|^2.

The constraints p(10) = 1, ṗ(10) = 0, and p(5) = 0 can be expressed as linear
constraints on the force program x. From homework 1 we know that


\begin{bmatrix} p(10) \\ \dot p(10) \\ p(5) \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
= \begin{bmatrix} 9.5 & 8.5 & 7.5 & 6.5 & 5.5 & 4.5 & 3.5 & 2.5 & 1.5 & 0.5 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 4.5 & 3.5 & 2.5 & 1.5 & 0.5 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} x. \qquad (1)
Therefore the optimal x is given by the minimizer of ‖x‖ subject to (1). In other
words, the optimal x is the minimum-norm solution x = x_ln of (1). x can be
calculated using Matlab as follows:
>> a1=linspace(9.5,0.5,10);
>> a2=ones(1,10);
>> a3=[linspace(4.5,0.5,5), 0 0 0 0 0];
>> A=[a1;a2;a3]
A =
    9.5000    8.5000    7.5000    6.5000    5.5000    4.5000    3.5000    2.5000    1.5000    0.5000
    1.0000    1.0000    1.0000    1.0000    1.0000    1.0000    1.0000    1.0000    1.0000    1.0000
    4.5000    3.5000    2.5000    1.5000    0.5000         0         0         0         0         0
>> y=[1;0;0]
y =
     1
     0
     0
>> x=pinv(A)*y
x =
   -0.0455
   -0.0076
    0.0303
    0.0682
    0.1061
    0.0939
    0.0318
   -0.0303
   -0.0924
   -0.1545
>> A*x
ans =
    1.0000
    0.0000
   -0.0000
>> norm(x)
ans =
    0.2492
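For reference, pinv(A)*y returns the minimum-norm solution here; for a fat, full-rank A this is

x_{\rm ln} = A^T (A A^T)^{-1} y,

which is also consistent with the limit derived in problem 6 below.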

Note that between times t = i and t = i + 1 for i = 0, . . . , 9 the force f is piecewise
constant, the velocity ṗ is piecewise linear, and the position p is piecewise quadratic.
The following Matlab commands plot f, ṗ, and p.

>> figure
>> subplot(3,1,1)
>> stairs([x;0])
>> grid on
>> xlabel('t')
>> ylabel('f(t)')
>> T1=toeplitz([ones(10,1)],[1,zeros(1,9)])
T1 =
     1     0     0     0     0     0     0     0     0     0
     1     1     0     0     0     0     0     0     0     0
     1     1     1     0     0     0     0     0     0     0
     1     1     1     1     0     0     0     0     0     0
     1     1     1     1     1     0     0     0     0     0
     1     1     1     1     1     1     0     0     0     0
     1     1     1     1     1     1     1     0     0     0
     1     1     1     1     1     1     1     1     0     0
     1     1     1     1     1     1     1     1     1     0
     1     1     1     1     1     1     1     1     1     1
>> p_dot=T1*x
p_dot =
   -0.0455
   -0.0530
   -0.0227
    0.0455
    0.1515
    0.2455
    0.2773
    0.2470
    0.1545
    0.0000
>> subplot(3,1,2)
>> plot(linspace(0,10,11),[0;p_dot])
>> grid on
>> xlabel('t')
>> ylabel('p_dot(t)')
>> T2=toeplitz(linspace(0.5,9.5,10),[0.5,zeros(1,9)])
T2 =
    0.5000         0         0         0         0         0         0         0         0         0
    1.5000    0.5000         0         0         0         0         0         0         0         0
    2.5000    1.5000    0.5000         0         0         0         0         0         0         0
    3.5000    2.5000    1.5000    0.5000         0         0         0         0         0         0
    4.5000    3.5000    2.5000    1.5000    0.5000         0         0         0         0         0
    5.5000    4.5000    3.5000    2.5000    1.5000    0.5000         0         0         0         0
    6.5000    5.5000    4.5000    3.5000    2.5000    1.5000    0.5000         0         0         0
    7.5000    6.5000    5.5000    4.5000    3.5000    2.5000    1.5000    0.5000         0         0
    8.5000    7.5000    6.5000    5.5000    4.5000    3.5000    2.5000    1.5000    0.5000         0
    9.5000    8.5000    7.5000    6.5000    5.5000    4.5000    3.5000    2.5000    1.5000    0.5000
>> p=T2*x
p =
   -0.0227
   -0.0720
   -0.1098
   -0.0985
   -0.0000
    0.1985
    0.4598
    0.7220
    0.9227
    1.0000
>> subplot(3,1,3)
>> plot(linspace(0,10,11),[0;p])
>> grid on
>> xlabel('t')
>> ylabel('p(t)')


[Figure: three stacked plots of the optimal force f(t), the resulting velocity ṗ(t), and the resulting position p(t), versus t ∈ [0, 10].]

Note that the plot for p(t) is not quite right, because the plot command in Matlab
draws p(t) as piecewise linear rather than piecewise quadratic. It is interesting that
the optimal force is such that p(t) < 0 for 0 < t < 5, which means that the mass
doesn't stay at position zero for 0 < t < 5. It backs up a little bit, so that the second
time it reaches position zero it has positive velocity.
(b) In this case there are two competing objectives that we want to keep small. J_2 = ‖x‖^2,
as was shown above. To express J_1 we rewrite the motion equations, this time for
p(0) = 0 and ṗ(0) = 1. It is easy to verify that

\begin{bmatrix} p(10) \\ \dot p(10) \end{bmatrix}
= \underbrace{\begin{bmatrix} 9.5 & 8.5 & 7.5 & 6.5 & 5.5 & 4.5 & 3.5 & 2.5 & 1.5 & 0.5 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}}_{A} x
+ \begin{bmatrix} 10 \\ 1 \end{bmatrix}. \qquad (2)

Therefore, to minimize J_1 we have to minimize ‖Ax − y‖^2, where A is the matrix in (2)
and y = (−10, −1). So, all we need is to calculate the minimizer x of the expression
J_1 + µJ_2 = ‖Ax − y‖^2 + µ‖x‖^2 for different values of µ. As was proven in the lecture
notes,

x = (A^T A + \mu I)^{-1} A^T y.

We use Matlab to calculate x and the resulting pairs (J_1, J_2), and plot the trade-off
curve. A sample Matlab script is given below:
clear; clf;
N = 50;
a1 = linspace(9.5,0.5,10); a2 = ones(1,10); A = [a1; a2];
y = [-10; -1];
F = eye(10);
mu = logspace(-2, 4, N);
for i=1:N
  f = inv(A'*A + mu(i)*F'*F)*A'*y;
  J_1(i) = (norm(A*f - y))^2;
  J_2(i) = (norm(f))^2;
end
plot(J_2,J_1); title('Optimal Tradeoff Curve');
xlabel('J_2 = ||x||^2'); ylabel('J_1 = p(10)^2 + p_dot(10)^2');
The resulting tradeoff curve is shown below:


[Figure: the optimal trade-off curve, J_1 = p(10)^2 + ṗ(10)^2 versus J_2 = ‖x‖^2.]


We can now use the curve to decide which x we are going to use. The choice will
depend on the importance we decide to attach to each of the two objectives. A
final comment: note that there is a value of J_2 beyond which J_1 equals 0. The
corresponding x is the minimum-norm solution, because it gives the smallest
J_2 = ‖x‖^2 among all x with Ax = y. On the other hand, we cannot drive the system
to or near the desired state without spending some energy, so J_1 is large for small
values of J_2. For J_2 = 0 (i.e., x = 0), J_1 is equal to ‖(10, 1)‖^2 = 101.
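A quick check of these end points (a sketch, not part of the original solution):

A = [linspace(9.5,0.5,10); ones(1,10)];
y = [-10; -1];
J1_zero_force = norm(y)^2     % J1 when x = 0 (so J2 = 0): returns 101
x_mn = pinv(A)*y;             % minimum-norm x achieving J1 = 0
J2_at_J1_zero = norm(x_mn)^2  % smallest J2 for which J1 = 0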

5. Phased-array antenna weight design.

We consider the phased-array antenna system shown below.

[Figure: a linear array of n antenna elements with spacing d, and an incident plane wave arriving at angle θ.]
The array consists of n individual antennas (called antenna elements) spaced on a line,
with spacing d between elements. A sinusoidal plane wave, with wavelength λ and angle
of arrival θ, impinges on the array, which yields the output e^{2πj(k−1)(d/λ) cos θ} (which is
a complex number) from the kth element. (We've chosen the phase center as element 1,
i.e., the output of element 1 does not depend on the incidence angle θ.) A (complex)
linear combination of these outputs is formed, and called the combined array output,

y(\theta) = \sum_{k=1}^{n} w_k e^{2\pi j (k-1)(d/\lambda)\cos\theta}.

The complex numbers w_1, . . . , w_n, which are the coefficients of the linear combination,
are called the antenna weights. We can choose, i.e., design, the weights. The combined
array output depends on the angle of arrival θ of the wave. The function |y(θ)|, for 0° ≤ θ ≤


180°, is called the antenna array gain pattern. By choosing the weights w_1, . . . , w_n
intelligently, we can shape the gain pattern to satisfy some specifications. As a simple
example, if we choose the weights as w_1 = 1, w_2 = · · · = w_n = 0, then we get a
uniform or omnidirectional gain pattern; |y(θ)| = 1 for all θ. In this problem, we want a
gain pattern that is one in a given (target) direction θ_target, but small at other angles.
Such a pattern would receive a signal coming from the direction θ_target, and attenuate
signals (e.g., jammers or multipath reflections) coming from other directions. Here's the
problem. You will design the weights for an array with n = 20 elements, and a spacing
d = 0.4λ (which is a typical value). We want y(70°) = 1, and we want |y(θ)| small
for 0° ≤ θ ≤ 60° and 80° ≤ θ ≤ 180°. In other words, we want the antenna array to
be relatively insensitive to plane waves arriving from angles more than 10° away from
the target direction. (In the language of antenna arrays, we want a beamwidth of 20°
around a target direction of 70°.) To solve this problem, you will first discretize the angles
between 0° and 180° in 1° increments. Thus y ∈ C^{180} will be a (complex) vector, with
y_k equal to y(k°), i.e., y(kπ/180), for k = 1, . . . , 180. You are to choose w ∈ C^{20} that
minimizes

\sum_{k=1}^{60} |y_k|^2 + \sum_{k=80}^{180} |y_k|^2

subject to the constraint y_{70} = 1. As usual, you must explain how you solve the problem.
Give the weights you find, and also a plot of the antenna array response, i.e., |y_k|, versus
k (which, hopefully, will achieve the desired goal of being relatively insensitive to plane
waves arriving at angles more than 10° from θ = 70°). Hints:
You'll probably want to rewrite the problem as one involving real variables (i.e., the
real and imaginary parts of the antenna weights), and real matrices. You can then
rewrite your solution in a more compact formula that uses complex matrices and
vectors (if you like).
Very important: in Matlab, the prime (') is actually the Hermitian conjugate operator.
In other words, if A is a complex matrix or vector, A' gives the conjugate
transpose, or Hermitian conjugate, of A.
Although we don't require you to, you might find it fun to also plot your antenna
gain pattern on a polar plot, which allows you to easily visualize the pattern. In
Matlab, this is done using the polar command.
Solution.
The first thing to do is to realize that y ∈ C^{180} is a linear function of w ∈ C^{20}:

y_p = \sum_{k=1}^{n} w_k e^{2\pi j (k-1)(d/\lambda)\cos(p\pi/180)}.

We can write this as y = Aw, where A ∈ C^{180×20} and A_{pk} = e^{2\pi j (k-1)(d/\lambda)\cos(p\pi/180)}.
The mathematical problem is to choose w ∈ C^{20} that minimizes

\sum_{k=1}^{60} |y_k|^2 + \sum_{k=80}^{180} |y_k|^2

subject to the constraint y_{70} = 1. It turns out there are several ways to solve this problem,
and even a few ways that, although technically not exactly correct, get the solution with
reasonable engineering precision. We haven't studied the exact problem of minimizing
‖Fw‖^2 subject to an equality constraint like c^*w = 1 (even though it's not that hard to
do, and it's pretty close to some others we have studied . . . ). The solution is given by

w = \frac{1}{c^* (F^* F)^{-1} c} \, (F^* F)^{-1} c.
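For completeness, here is a sketch of the Lagrange-multiplier derivation (written for the real case; the complex case is analogous, with transposes replaced by conjugate transposes). To minimize ‖Fw‖^2 subject to c^T w = 1, form

L(w, \lambda) = \|Fw\|^2 + \lambda (1 - c^T w), \qquad \nabla_w L = 2 F^T F w - \lambda c = 0 \;\Rightarrow\; w = \tfrac{\lambda}{2} (F^T F)^{-1} c,

and the constraint c^T w = 1 then fixes \lambda/2 = 1/\big(c^T (F^T F)^{-1} c\big), which gives the formula above.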


To use this result to solve our problem, we take F to be the submatrix of A consisting of
the first through 60th rows, and also the 80th through 180th rows. (Thus, Fw gives the
vector of combined outputs at the angles where we want the array to be insensitive.) For
c we take the conjugate of the 70th row of A. Then the formula above gives the optimal
w. The formula can be derived using Lagrange multipliers, for example, but we certainly
didn't expect you to solve this new type of least-squares problem this way. Let's solve the
problem following the hint. (Of course, in the end, we'll get the same solution described
above!) The equality constraint y_{70} = 1 can be expressed as

y_{70} = w_1 + \sum_{k=2}^{n} w_k e^{2\pi j (k-1)(d/\lambda)\cos(70\pi/180)} = 1,

using the fact that the first column of A consists of ones (since k − 1 is zero, the exponential
becomes 1). Therefore the constraint (called a target normalization constraint
in antenna array design) can be expressed as

w_1 = 1 - \sum_{k=2}^{n} w_k e^{2\pi j (k-1)(d/\lambda)\cos(70\pi/180)}.

This means we can select any values of w_2, . . . , w_{20} we like, and then determine w_1 using
the equation above, to make sure we satisfy the target normalization constraint. Let's
define our free variable x ∈ C^{19} as x = (w_2, . . . , w_{20}), and express the relation between x
and w as w = Gx + h. (Actually, the matrix G and vector h have lots of structure, but it
doesn't matter to us.) Now let Ã ∈ C^{161×20} be the submatrix of A consisting of rows
1 to 60, and also rows 80 to 180. The problem is now to choose x to minimize

\|\tilde A (Gx + h)\|^2 = \|(\tilde A G)x + (\tilde A h)\|^2.

This is a complex least-squares problem, with solution

x = -\big((\tilde A G)^* (\tilde A G)\big)^{-1} (\tilde A G)^* (\tilde A h)

(which agrees with what we have above, although it's a lot of trouble to show). A sample
Matlab m-file solving the problem by this approach is given below. Now let's describe
another method some people used. They formed a matrix B ∈ C^{162×20}, whose first row
is the 70th row of A, and whose other rows are the first through 60th, and 80th through
180th, rows of A. The goal, then, is to have Bw ≈ y_des = (1, 0, . . . , 0), i.e., y_{70} ≈ 1
and y_i ≈ 0 for i = 1, . . . , 60, 80, . . . , 180. They then applied complex least squares to
obtain the solution

w_{\rm ls} = (B^* B)^{-1} B^* y_{\rm des}.

It turns out this solution is pretty close to the correct solution, but not exact. One reason
is that we specified y_{70} = 1, not y_{70} ≈ 1. Even though the result happened to be pretty
close to the exact answer, we deducted some points for this. We deducted very few if the
solution was honest, and stated that this least-squares problem is not the same as the
one we were asked to solve, but not a bad approximation. If there was no explanation
that you are solving a problem that is not the same as the one you're supposed to solve,
then we deducted more points. The issue here is conceptual; when you solve this least-squares
problem you don't have y_{70} = 1, which we required. Instead, you only have
y_{70} ≈ 1. Finally, some people did this. They computed the solution w_ls via least-squares
(as above), and then normalized to make sure that y_{70} = 1, by taking w = w_ls/y_{70}.
In fact, this yields the correct answer, but it's not easy to show. Here are the optimal
antenna weights (obtained by any of the correct methods):
w =
   0.0259 + 0.0010i
  -0.0087 - 0.0234i
   0.0249 - 0.0237i
  -0.0825 - 0.0225i
   0.0191 + 0.0204i
  -0.0994 + 0.0551i
   0.1260 + 0.0480i
  -0.0391 + 0.0324i
   0.1767 - 0.0685i
  -0.1103 - 0.0424i
   0.0647 - 0.0988i
  -0.1834 + 0.0477i
   0.0507 + 0.0034i
  -0.0741 + 0.1126i
   0.1129 - 0.0134i
  -0.0036 + 0.0277i
   0.0538 - 0.0665i
  -0.0341 - 0.0046i
  -0.0066 - 0.0240i
  -0.0205 + 0.0159i
The following Matlab code obtains these weights using the approach suggested in the
hints:

lambda=1;         %% wavelength
d=0.4;            %% element separation
numElem=20;       %% number of antenna elements
theta=(1:1:180);
Phase=2*pi*d/lambda*cos(theta/180*pi);
PhaseMat=((0:(numElem-1))'*Phase)';    % 180 x 20 matrix of phases
A=exp(j*PhaseMat);
h=[1;zeros(numElem-1,1)];
G=[-exp(j*2*pi*d/lambda*cos(70/180*pi)*(1:(numElem-1)));
   eye(numElem-1)];
Atilde=A([1:60 80:180],:);
xls=-inv((Atilde*G)'*(Atilde*G))*(Atilde*G)'*(Atilde*h);
w=G*xls+h
y=A*w;
figure(1);
plot(theta, abs(y))
grid on; axis([0 180 0 1.1]);
xlabel('theta'); ylabel('|y|');
print -deps antenna_pat
figure(2);
polar(theta*pi/180, abs(y)');
print -deps antenna_pat_pol
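As a quick check (a sketch, run after the script above), the target normalization constraint should hold and the stopband response should be small:

y(70)                          % should equal 1 (up to roundoff)
max(abs(y([1:60 80:180])))     % largest sidelobe level over the stopband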
The resulting pattern is shown in the two plots below:


[Figures: the gain pattern |y(θ)| versus θ (rectangular plot), and the same pattern on a polar plot.]

The plots show that we do pretty well; the array is indeed not too sensitive to plane
waves arriving from directions outside the interval 70° ± 10°. Phased-array antennas were
once used only for military purposes (hence the unfriendly terminology), but now they
are widely used for socially positive purposes, e.g., commercial aircraft, cellular telephone
systems, etc. Note that all that was important in this problem is linearity of y as a
function of w. This allows the same method to be used for a great variety of other
antenna array weight design problems. Examples of extensions include:


arrays of arbitrary geometry, e.g., circular,
electromagnetic interaction between antenna elements (which is complicated but linear),
multi-frequency or broadband arrays, possibly with weights that vary with frequency,
other types of arrays, e.g., acoustic or sonar.
The very same method can be used to synthesize transmitting arrays as well as receiving
arrays.

6. Trade-off formulae
Consider a multi-objective least-squares problem, where we would also like to keep the
norm of the solution small. In the notation of lecture 6, we have the objective function

J_1 + \mu J_2 = \|Ax - y\|^2 + \mu \|x\|^2.

Thus, we have F = I and g = 0. As shown in the lecture notes, the optimal solution to
this problem is given as

x = (A^T A + \mu I)^{-1} A^T y.

In this problem you will show that, for any matrix A and any positive number µ, the
matrices A^T A + µI and AA^T + µI are both invertible, and

(A^T A + \mu I)^{-1} A^T = A^T (A A^T + \mu I)^{-1}.
(a) Let's first show that A^T A + µI is invertible, assuming µ > 0. (The same argument,
with A^T substituted for A, will show that AA^T + µI is invertible.) Suppose that
(A^T A + µI)z = 0. Multiply on the left by z^T, and argue that z = 0. This is what
we needed to show. (Your job is to fill in all details of the argument.)
(b) Now let's establish the identity above. First, explain why

A^T (A A^T + \mu I) = (A^T A + \mu I) A^T

holds. Then multiply on the left by (A^T A + µI)^{-1}, and on the right by (AA^T + µI)^{-1}.
(These inverses exist, by part (a).)
(c) Now assume that A is fat and full rank. Show that as µ tends to zero from above
(i.e., µ is positive) we have

(A^T A + \mu I)^{-1} A^T \;\to\; A^T (A A^T)^{-1}.
Solution.
(a) Suppose, for some µ > 0, the matrix A^T A + µI is not invertible. Then there is a
nonzero vector in the nullspace of the matrix, i.e., for some z ≠ 0, (A^T A + µI)z = 0.
Therefore

z^T (A^T A + \mu I) z = \|Az\|^2 + \mu \|z\|^2 = 0.

This implies, since µ > 0, that z = 0, which is a contradiction. Thus the matrix
A^T A + µI is invertible. Similarly the matrix AA^T + µI, for µ > 0, is invertible.
(b) Since matrix multiplication is associative, we have

A^T (A A^T + \mu I) = (A^T A + \mu I) A^T.

Multiplying on the left by (A^T A + µI)^{-1}, and on the right by (AA^T + µI)^{-1}, we get

(A^T A + \mu I)^{-1} A^T = A^T (A A^T + \mu I)^{-1}.

(c) Suppose A is fat and full rank, so that AA^T is invertible. As µ tends to zero from
above we have

\lim_{\mu \to 0^+} (A^T A + \mu I)^{-1} A^T = \lim_{\mu \to 0^+} A^T (A A^T + \mu I)^{-1} = A^T (A A^T)^{-1},

since the matrix inverse is a continuous function of its (invertible) argument.
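A quick numerical check of the identity and the limit (a sketch; the matrix and the values of µ are arbitrary):

A = randn(3,7);                  % a fat matrix, full rank with probability one
mu = 0.1;
M1 = inv(A'*A + mu*eye(7))*A';
M2 = A'*inv(A*A' + mu*eye(3));
norm(M1 - M2)                    % roundoff-level, verifying the identity in (b)
mu = 1e-8;
norm(inv(A'*A + mu*eye(7))*A' - A'*inv(A*A'))   % small, illustrating the limit in (c)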
