
Distributed Transmission Based on Inference on Graphs for Cooperative Base Stations

Burak ŞEKERLİSOY
Supervisor: Asst. Professor Defne AKTAŞ
1. Introduction

Recent studies show that multiple-input multiple-output (MIMO) schemes substantially increase channel capacity: for the same total transmit power, using multiple antennas at the transmitter and/or the receiver increases the capacity.

[Figure: transmit signals x1, …, xt pass through channel gains h11, …, hrt and are received as y1, …, yr with additive noise n1, …, nr]

Figure 1 – MIMO Channel

y = Hx + n (1)

In this research, the communication between the mobile stations (MSs) and the base stations (BSs) is modeled as a MIMO system, and the downlink of this system (from BS to MS) is studied.
In the downlink scenario, a base station transmits to multiple users simultaneously, so the signals intended for the other users appear as inter-user interference at each user. Multi-user detection could mitigate this interference at the receiver side, but it is too costly to implement in the receivers; in this work it is therefore assumed that the users do not cooperate in the downlink. Instead, the interference is handled at the transmitter side by intelligently designing the transmitted vector given the channel state information (CSI), either through beamforming or through dirty paper codes.
One way of dealing with inter-user interference is to multiply the data vector by the inverse of the channel matrix before transmission. This simple technique is known as channel inversion. It has been shown [1] that channel inversion does not achieve linear capacity growth with the number of antennas. The reason is that, under a power constraint, inverting the channel matrix may require a large normalization factor, which reduces the SNR dramatically. Regularized channel inversion [1] is a modified version of channel inversion in which the inverse of the channel matrix is regularized, so the interference is not totally suppressed. This modification achieves throughput that grows linearly with the number of antennas, but a gap to capacity remains. Besides linear processing for the multi-user MIMO downlink, there are also nonlinear dirty paper coding techniques, based on the "writing on dirty paper" concept of Costa [8]. Costa's work shows that the capacity of a channel in which the transmitter knows the interfering signal is the same as if there were no interference. It is hard to develop a practical dirty paper coding technique for this problem. The simplest approach is to set the transmitted signal to the desired signal minus the interference, but this increases the transmission power. In [1], the transmitted vector is constructed using a modulo function at the transmitter and the receiver, and the system operates close to capacity. Our work is based on this technique, and we will evaluate the performance of our algorithms against the results it achieves.
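
The two linear precoders discussed above can be written compactly as follows. This is a sketch in commonly used notation rather than a formula taken from this report: u is the data vector, s the transmitted vector, and the symbols γ (power normalization) and α (regularization constant) are assumptions introduced here, with unit total transmit power assumed.

```latex
% Channel inversion: invert the channel, then scale to meet the power constraint
s_{\mathrm{CI}} = \frac{1}{\sqrt{\gamma}}\, H^{-1} u ,
\qquad \gamma = \| H^{-1} u \|^{2}

% Regularized channel inversion: alpha > 0 leaves some residual interference
s_{\mathrm{RCI}} = \frac{1}{\sqrt{\gamma}}\, H^{*} \left( H H^{*} + \alpha I \right)^{-1} u ,
\qquad \gamma = \| H^{*} \left( H H^{*} + \alpha I \right)^{-1} u \|^{2}
```

With plain inversion the receivers observe u/√γ plus noise, so a poorly conditioned H (large γ) directly lowers the received SNR; for an invertible H, setting α = 0 in the second expression recovers the first.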

2. Brief Explanation of the Research


The downlink algorithms are designed so that most of the processing is done at the BSs. Assuming that the channel is known to the transmitter (BS), the signals transmitted from the BSs are modified so that the MSs can decode them easily. The method that we initially considered is the regularized perturbation technique [1]. If the transmitter knows the channel, one natural approach is to multiply the transmitted symbol vector by the inverse of the channel matrix. However, the transmit power is limited, and the large singular values of the channel inverse prevent us from applying the inverse directly. The proposed remedy is to make sure that the transmitted data does not lie along the directions associated with the large singular values of the inverse of the channel matrix: the data is perturbed so that the data vector is approximately orthogonal to the right singular vectors associated with those large singular values. In other words, from the data vector $u$ we form $\tilde{u}$ such that $s = H^{-1}\tilde{u}$ has a smaller norm than $H^{-1}u$ and $\tilde{u}$ can still be decoded at the receivers. At this point an idea derived from Tomlinson-Harashima (TH) precoding [6],[7] is used, and each element of $u$ is perturbed by an integer offset. In this case

$$\tilde{u} = u + \tau l \tag{2}$$

where $\tau$ is a positive real number and $l$ is a complex vector whose real and imaginary parts are integers. Here, the aim is to find the $l$ and $\tau$ that minimize the norm of $s = H^{-1}\tilde{u}$. The choice of the vector $l$ can be written as

$$l = \arg\min_{l'} \, (u + \tau l')^{*} (HH^{*})^{-1} (u + \tau l') \tag{3}$$

The problem in (3) is an integer lattice least squares problem, and the sphere decoder algorithm [2],[3],[4] can be used to solve it. This algorithm gives the minimum distance solution in a lattice, i.e., it finds the $\hat{x}$ in a finite or infinite lattice that minimizes $\|y - Hx\|^{2}$.
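
The perturbation in (2) and (3) can be illustrated with a small real-valued example. The sketch below is not the sphere encoder of [1]: it assumes a 2×2 real channel, a 4-PAM alphabet {-3, -1, 1, 3} with τ = 8, no power normalization, and it replaces the lattice search by a brute-force search over a few integer offsets. The helper names (`inv2`, `perturb`, `mod_tau`) are hypothetical.

```python
import itertools
import math

def inv2(H):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = H
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sq_norm(v):
    return sum(x * x for x in v)

def perturb(H, u, tau, radius=2):
    """Brute-force stand-in for (3): try every integer vector l with entries
    in [-radius, radius] and keep the one minimising ||H^-1 (u + tau*l)||^2."""
    Hinv = inv2(H)
    best = None
    for l in itertools.product(range(-radius, radius + 1), repeat=len(u)):
        u_tilde = [ui + tau * li for ui, li in zip(u, l)]
        s = matvec(Hinv, u_tilde)
        cost = sq_norm(s)
        if best is None or cost < best[0]:
            best = (cost, list(l), s)
    return best

def mod_tau(y, tau):
    """Receiver-side modulo: fold each entry into [-tau/2, tau/2)."""
    return [yi - tau * math.floor(yi / tau + 0.5) for yi in y]

H = [[1.0, 0.9], [0.9, 1.0]]   # nearly singular, so H^-1 has one large singular value
u = [3.0, -3.0]                # data vector lying along that unfavourable direction
tau = 8.0                      # assumed modulo base for 4-PAM with spacing 2

plain_cost = sq_norm(matvec(inv2(H), u))   # transmit energy without perturbation
cost, l, s = perturb(H, u, tau)            # transmit energy with perturbation
received = matvec(H, s)                    # noiseless channel: y = H s = u + tau*l
recovered = mod_tau(received, tau)         # the modulo removes tau*l again
print(plain_cost, cost, l, recovered)
```

In this example the unperturbed vector needs far more transmit energy than the perturbed one, while the receiver still recovers u with a simple modulo operation and without knowing l.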

In this work we consider the scenario where the BSs cooperate, which enables us to solve the problem on factor graphs using distributed algorithms. The task then reduces to solving the following problem in a distributed manner.
Problem:

$$\hat{x} = \arg\min_{x \in X^{n}} \|y - Hx\|^{2}$$

where
X : finite dimensional constellation
x : n×1 vector, $x \in X^{n}$
y : m×1 vector, $y \in \mathbb{R}^{m}$
H : m×n matrix, $H \in \mathbb{R}^{m \times n}$
Proposed Solution:
The problem can be formulated as a MAP problem with a jointly Gaussian distribution:

$$\hat{x} = \arg\max_{x \in X^{n}} p(x|y)$$

$$\hat{x}_i = \arg\max_{x_i \in X} p(x_i|y)$$

$$p(x|y) \propto \exp\left(-\frac{1}{2\sigma^{2}} \|y - Hx\|^{2}\right)$$

$$p(x|y) \propto \exp\left(-\frac{1}{2\sigma^{2}} \left(x^{T}H^{T}Hx - 2y^{T}Hx\right)\right)$$

where the term $y^{T}y$ has been dropped because it does not depend on $x$. Let $b_i \triangleq y^{T}h_i$ and $A_{ij} \triangleq h_i^{T}h_j$, where $h_i$ is the $i$th column of $H$. Then one can write $b = y^{T}H$ and $A = H^{T}H$. The matrix $A$ is symmetric, and we can write

$$p(x|y) \propto \exp\left(-\frac{1}{2\sigma^{2}} \left(\sum_{i=1}^{n} \left(A_{ii}x_i^{2} - 2b_i x_i\right) + \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} A_{ij}x_i x_j\right)\right) \tag{4}$$

After obtaining equation (4), the next step is to run the sum-product algorithm on the corresponding factor graph to get the vector $x$ that maximizes the a posteriori probability.
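
The expansion behind (4) can be checked numerically. The following sketch uses a small hypothetical real-valued H and y (not taken from the simulations in this report) and verifies that the squared residual equals the quadratic form in A and b plus the x-independent term yᵀy; the function names are illustrative.

```python
import itertools

def gram_terms(H, y):
    """Return A = H^T H and b = y^T H as plain lists."""
    cols = [list(col) for col in zip(*H)]          # cols[i] is the column h_i
    A = [[sum(p * q for p, q in zip(hi, hj)) for hj in cols] for hi in cols]
    b = [sum(p * q for p, q in zip(y, hi)) for hi in cols]
    return A, b

def quadratic_cost(A, b, x):
    """Bracketed term in the exponent of (4), without the factor -1/(2 sigma^2)."""
    n = len(x)
    single = sum(A[i][i] * x[i] ** 2 - 2 * b[i] * x[i] for i in range(n))
    cross = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n) if j != i)
    return single + cross

def residual_cost(H, y, x):
    """||y - Hx||^2 computed directly from the definition."""
    return sum((yi - sum(h * xj for h, xj in zip(row, x))) ** 2
               for yi, row in zip(y, H))

H = [[1.0, 0.5], [-0.5, 2.0], [0.25, 1.0]]   # hypothetical 3x2 channel
y = [0.7, -1.2, 0.4]                         # hypothetical observation
A, b = gram_terms(H, y)
yTy = sum(v * v for v in y)

# Largest mismatch between the two cost expressions over all x in {-1, +1}^2.
max_gap = max(abs(residual_cost(H, y, x) - (yTy + quadratic_cost(A, b, x)))
              for x in itertools.product([-1.0, 1.0], repeat=2))
print(max_gap)
```

Because yᵀy is the same for every candidate x, minimizing the quadratic form alone selects the same vector as minimizing the residual.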

3. Brief Explanation of the Sum-Product Algorithm on Factor Graphs
Let $x_1, x_2, \ldots, x_n$ be a collection of variables, where $x_i$ is an element of some alphabet $A_i$. Let $g(x_1, x_2, \ldots, x_n)$ be a function of these variables. Associated with every such function there are $n$ marginal functions $g_i(x_i)$, and we write

$$g_i(x_i) = \sum_{\sim\{x_i\}} g(x_1, x_2, \ldots, x_n) \tag{5}$$

which says that the $i$th marginal function associated with $g(x_1, x_2, \ldots, x_n)$ is the summary for $x_i$ of $g$. Instead of indicating the variables being summed over, we indicate the ones not being summed over: "summary for $x_i$" means that the sum is taken over all variables except $x_i$. By definition, a factor graph is a bipartite graph that expresses the factorization of the function; it consists of a variable node for each variable and a factor node for each local function, with an edge between a variable node and a factor node whenever the variable is an argument of that local function [5].

[Figure: variable nodes x1, x2, x3, x4, x5 connected to factor nodes fA, fB, fC, fD, fE]

Figure 2 – Sample Factor Graph


For instance, the factor graph in Figure 2 expresses the function

$$g(x_1, x_2, x_3, x_4, x_5) = f_A(x_1)\, f_B(x_2)\, f_C(x_1, x_2, x_3)\, f_D(x_3, x_4)\, f_E(x_3, x_5)$$

The sum-product rule runs on this graph and computes, for example, $g_1(x_1)$ using only message passing between neighboring nodes and local processing of the messages at the nodes. The algorithm maps naturally onto a multiprocessor system: each node can be associated with a processor, and the nodes can then work in parallel.
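
To make the message-passing idea concrete, the sketch below computes the marginal g1(x1) for a graph with the structure of Figure 2 in two ways: by the brute-force summary in (5), and by passing messages from the leaves towards x1. The binary alphabet and the particular local functions are hypothetical choices made only for illustration.

```python
import itertools

ALPHABET = (0, 1)  # assumed binary alphabet for every variable

# Hypothetical local functions with the argument structure of Figure 2.
def f_A(x1): return 1 + x1
def f_B(x2): return 2 + x2
def f_C(x1, x2, x3): return 1 + x1 * x2 + 2 * x3
def f_D(x3, x4): return 1 + x3 + x4
def f_E(x3, x5): return 3 - x3 * x5

def g(x1, x2, x3, x4, x5):
    return f_A(x1) * f_B(x2) * f_C(x1, x2, x3) * f_D(x3, x4) * f_E(x3, x5)

def marginal_brute_force(x1):
    """Equation (5): sum g over every variable except x1."""
    return sum(g(x1, *rest) for rest in itertools.product(ALPHABET, repeat=4))

def marginal_message_passing(x1):
    """Same marginal, computed by passing messages from the leaves towards x1."""
    # Factors f_D and f_E summarise x4 and x5 into messages about x3.
    m_D_to_x3 = {x3: sum(f_D(x3, x4) for x4 in ALPHABET) for x3 in ALPHABET}
    m_E_to_x3 = {x3: sum(f_E(x3, x5) for x5 in ALPHABET) for x3 in ALPHABET}
    # Variable x3 multiplies its incoming messages and forwards the product to f_C.
    m_x3_to_C = {x3: m_D_to_x3[x3] * m_E_to_x3[x3] for x3 in ALPHABET}
    # Variable x2 forwards the message of its leaf factor f_B.
    m_x2_to_C = {x2: f_B(x2) for x2 in ALPHABET}
    # Factor f_C summarises x2 and x3 into a message about x1.
    m_C_to_x1 = sum(f_C(x1, x2, x3) * m_x2_to_C[x2] * m_x3_to_C[x3]
                    for x2 in ALPHABET for x3 in ALPHABET)
    # The marginal is the product of all messages arriving at x1.
    return f_A(x1) * m_C_to_x1

for x1 in ALPHABET:
    print(x1, marginal_brute_force(x1), marginal_message_passing(x1))
```

The brute-force version touches every one of the 2^4 configurations of the other variables for each value of x1, whereas the message-passing version only performs small local sums; on a loop-free graph such as this one the two agree exactly.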

In order to use the sum-product algorithm in this work, we start from our global function, which is the function in equation (4), and make some modifications. Since we are looking for the argument $x$ that maximizes the global function, we do not need to calculate the marginal functions themselves. Instead, we only look for the $x$ that maximizes the function in (4) or, equivalently, minimizes the negative logarithm of the global function. Taking $-\log(\cdot)$ turns the products into sums and the maximization into a minimization, so the algorithm that runs on the factor graph becomes the min-sum algorithm.
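
The passage from (4) to the pairwise form used in the following equations relies only on the symmetry of A and on the monotonicity of the logarithm. Written out as a short worked derivation (C denotes a constant that does not depend on x and is introduced here only for illustration):

```latex
\begin{aligned}
\sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} A_{ij} x_i x_j
  &= \sum_{i<j} A_{ij} x_i x_j + \sum_{i>j} A_{ij} x_i x_j \\
  &= \sum_{i<j} \left( A_{ij} + A_{ji} \right) x_i x_j
   = 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} A_{ij} x_i x_j , \\
-\log p(x|y)
  &= \frac{1}{2\sigma^{2}} \left( \sum_{i=1}^{n} \left( A_{ii} x_i^{2} - 2 b_i x_i \right)
     + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} A_{ij} x_i x_j \right) + C .
\end{aligned}
```

Since the factor 1/(2σ²) is positive and C is independent of x, neither affects the minimizing argument, which is why they can be dropped from the objective.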

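$$\hat{x} = \arg\max_{x \in X^{n}} p(x|y)$$

$$p(x|y) \propto \exp\left(-\frac{1}{2\sigma^{2}} \left(\sum_{i=1}^{n} \left(A_{ii}x_i^{2} - 2b_i x_i\right) + 2\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} A_{ij}x_i x_j\right)\right)$$

then one can write

$$\hat{x} = \arg\min_{x \in X^{n}} -\log(p(x|y)) = \arg\min_{x \in X^{n}} \left(\sum_{i=1}^{n} \left(A_{ii}x_i^{2} - 2b_i x_i\right) + 2\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} A_{ij}x_i x_j\right) \tag{6}$$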

As we see in equation (6), our aim is to minimize a sum of functions, which is why the algorithm is called min-sum. Having converted the problem to this form, the remaining part is to construct the factor graph and run the min-sum algorithm on it. We notice that the graph is not loop-free, which means that a message sent from one node to another will affect itself in future iterations. This creates a convergence problem, and convergence will be one of the most important issues to be analyzed in this research. In addition, the convergence of the min-sum algorithm depends on how we partition the global function into local functions. Different partitionings of the function in equation (6) give different results; the related work is described in Section 5.

4. Objectives of the Research


First of all, our aim is to develop a practical, distributed, low-complexity and high-performance algorithm. It will be hard to satisfy all of these criteria at once, so we proceed step by step. In the first phase of the research, complexity is not a high priority. In this phase, our main focus is to find an algorithm that achieves the optimal performance, where by performance we mean the error performance of the algorithm. In subsequent phases, we will avoid designs that are impractical or have high complexity compared to a centralized algorithm.
The following issues will be considered while developing the algorithm:
• Error Performance
• Convergence
• Complexity
• Practicality
• Behavior of the algorithm against malfunctioning of the system
By the last item, we mean that the response of the algorithm will be studied when, for instance, a message cannot pass between two nodes because of a malfunction in the system. Complexity and practicality are two very important issues since they determine the applicability of the algorithm in real life, and it may be possible to trade performance against them in the next phases of the design. We will therefore study the trade-off between performance and complexity.

5. Work Done
Throughout our studies, the channel matrix H is modeled as a complex random matrix whose entries are i.i.d. Gaussian with zero mean and variance 1/2 for each of the real and imaginary parts. The channel is modeled as a block fading channel, which means that the channel matrix H is fixed for a predetermined number of symbol periods. Furthermore, the channel is assumed to be slowly varying, and the channel state information is perfectly known at the transmitter.
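
The channel model described above can be sketched as follows. This is an illustrative Python generator, not the MATLAB code used for the simulations; the function names, the seed and the number of blocks are assumptions.

```python
import math
import random

def random_channel(n_rx, n_tx, rng):
    """n_rx x n_tx matrix of i.i.d. complex Gaussian entries; the real and
    imaginary parts each have zero mean and variance 1/2."""
    sigma = math.sqrt(0.5)
    return [[complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
             for _ in range(n_tx)] for _ in range(n_rx)]

def block_fading(n_rx, n_tx, block_len, n_blocks, seed=0):
    """Yield (symbol_index, H); H is redrawn only at the start of each block."""
    rng = random.Random(seed)
    for blk in range(n_blocks):
        H = random_channel(n_rx, n_tx, rng)
        for t in range(block_len):
            yield blk * block_len + t, H

# 4x4 channel held fixed for T = 100 symbols, 100 independent blocks (assumed count).
channels = [H for _, H in block_fading(4, 4, block_len=100, n_blocks=100)]
one_per_block = channels[::100]
avg_power = (sum(abs(h) ** 2 for H in one_per_block for row in H for h in row)
             / (len(one_per_block) * 16))
print(len(channels), avg_power)
```

With variance 1/2 per real dimension, each entry has unit average power, so the empirical average printed above should be close to 1.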
We began our study by implementing the sphere decoder [4] in a centralized manner. We then proceeded with a centralized implementation of the vector-perturbation technique [1], in which the sphere decoder is utilized. After verifying the error performance of our centralized vector perturbation implementation, we moved on to the min-sum algorithm, and we are still working on designing an efficient distributed algorithm.

5.1 – Sphere Decoder


The aim of the sphere decoder is to perform ML decoding for lattices, which can have high dimensions. To develop a lattice decoder, the algorithm labeled Algorithm 2 in [4] is implemented and simulated in MATLAB.
In the simulation, the transmitted symbols are drawn from a 64-QAM constellation, and the numbers of receive and transmit antennas are both equal to 4. The channel matrix H is fixed for T=100 symbols and then changes randomly. The signal-to-noise ratio (SNR) is swept from 5 dB to 35 dB in 5 dB steps. We ran the simulations for 10000 channel realizations, transmitting 100 blocks for each realization.
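
For readers unfamiliar with the idea, a minimal sphere-decoder-style search is sketched below. It is not Algorithm 2 of [4]: it assumes a real-valued channel with full column rank and a finite 4-PAM alphabet, uses a plain Gram-Schmidt QR factorization, and performs a depth-first search that prunes any branch whose partial distance already exceeds the best full-length candidate found so far. All names and numbers are illustrative.

```python
import itertools
import math

def qr_gram_schmidt(H):
    """Return Q (m x n, orthonormal columns) and R (n x n, upper triangular)."""
    m, n = len(H), len(H[0])
    Q = [[0.0] * n for _ in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [H[i][j] for i in range(m)]
        for k in range(j):
            R[k][j] = sum(Q[i][k] * v[i] for i in range(m))
            v = [v[i] - R[k][j] * Q[i][k] for i in range(m)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        for i in range(m):
            Q[i][j] = v[i] / R[j][j]
    return Q, R

def sphere_decode(H, y, alphabet):
    """Depth-first search for the x in alphabet^n minimising ||y - Hx||^2."""
    m, n = len(H), len(H[0])
    Q, R = qr_gram_schmidt(H)
    z = [sum(Q[i][j] * y[i] for i in range(m)) for j in range(n)]   # z = Q^T y
    best = {"cost": math.inf, "x": None}
    x = [0.0] * n

    def search(level, partial):
        if level < 0:                        # all symbols fixed: new best candidate
            best["cost"], best["x"] = partial, list(x)
            return
        for s in alphabet:
            x[level] = s
            r = z[level] - sum(R[level][k] * x[k] for k in range(level, n))
            cost = partial + r * r
            if cost < best["cost"]:          # prune branches outside the current sphere
                search(level - 1, cost)

    search(n - 1, 0.0)                       # start from the last row of R
    return best["x"]

def brute_force(H, y, alphabet):
    """Exhaustive ML search, used only to check the result."""
    def cost(x):
        return sum((yi - sum(h * xj for h, xj in zip(row, x))) ** 2
                   for yi, row in zip(y, H))
    return list(min(itertools.product(alphabet, repeat=len(H[0])), key=cost))

PAM4 = (-3.0, -1.0, 1.0, 3.0)
H = [[1.0, 0.2, 0.0], [0.0, 1.0, 0.2], [0.2, 0.0, 1.0]]   # hypothetical channel
y = [0.5, -2.3, 3.1]           # H applied to [1, -3, 3], plus a small perturbation
x_hat = sphere_decode(H, y, PAM4)
print(x_hat, brute_force(H, y, PAM4))
```

The triangular structure of R is what allows the distance to be accumulated one symbol at a time, starting from the last component, so that whole subtrees can be discarded early.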
The resulting bit error rate of the simulations for the 64-QAM signal constellation is given in Figure 4.

[Figure: BER vs. Eb/No with best curve fit; vertical axis: BER (logarithmic scale); horizontal axis: Eb/No (dB); curves: Empirical BER, Exp Plus Const Fit]

Figure 4 – Bit Error Rate for Lattice Decoder


The numerical results are verified against the results presented in [4].

5.2 – Vector Perturbation Technique – Sphere Encoder


The aim of the sphere encoder is to perform lattice encoding at the transmitter and thereby decrease the receiver complexity. To develop a lattice encoder, the algorithm proposed in [1] is implemented and simulated in MATLAB. In the simulation, the transmitted symbols are drawn from a 16-QAM constellation, and the numbers of receive and transmit antennas are both equal to 4. The channel matrix H is fixed for T=100 symbols and then changes randomly. The signal-to-noise ratio (SNR) is swept from 7.5 dB to 25 dB in 2.5 dB steps. We ran the simulations for 10000 channel realizations, transmitting 10 blocks for each realization.

The resulting bit error rate of the simulations for the 16-QAM signal constellation is given in Figure 5.

[Figure: BER vs. Eb/No with best curve fit; vertical axis: BER (logarithmic scale); horizontal axis: Eb/No (dB); curves: Empirical BER, Exp Plus Const Fit]

Figure 5 – Bit Error Rate for Lattice Encoder

The numerical results are verified against the results presented in [1].

5.3 – Min-Sum on Factor Graph


For this part, we are currently trying two different factor graph formulations, obtained by partitioning the global function into local functions in two alternative ways.

5.3.1 – Min-Sum Version1


First, we take the global function and write it as:

$$\hat{x} = \arg\min_{x \in X^{n}} \left(\sum_{i=1}^{n} \left(A_{ii}x_i^{2} - 2b_i x_i\right) + 2\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} A_{ij}x_i x_j\right) \tag{7}$$

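Then for this version we have the following nodes:

variable nodes : $x_1, x_2, \ldots, x_n$ (number of variable nodes = n)
factor nodes : $f_1, f_2, \ldots, f_n$ and $g_{ij}$ for $i = 1, \ldots, n-1$, $j = i+1, \ldots, n$ (number of factor nodes = n(n+1)/2)

The graph is shown in Figure 6.

[Figure: each factor node f_i is attached to its variable node x_i; the pairwise factor nodes g12, g23, …, g1n, …, g(n-1)n are attached to the corresponding pairs of variable nodes]

Figure 6 – Factor Graph for Version 1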

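The local functions are

$$f_i(x_i) = A_{ii}x_i^{2} - 2b_i x_i$$

$$g_{ij}(x_i, x_j) = 2A_{ij}x_i x_j, \qquad i = 1, \ldots, n-1, \quad j = i+1, \ldots, n$$

variable node → factor node

$$\mu_{x_i \to g_{ij}}(x_i) = \mu_{f_i \to x_i}(x_i) + \sum_{k=1, k \neq j}^{i-1} \mu_{g_{ki} \to x_i}(x_i) + \sum_{k=i+1, k \neq j}^{n} \mu_{g_{ik} \to x_i}(x_i) + c$$

factor node → variable node

$$\mu_{f_i \to x_i}(x_i) = f_i(x_i)$$

Let $k = \min(i,j)$ and $l = \max(i,j)$. Then

$$\mu_{g_{kl} \to x_i}(x_i) = \min_{x_j} \left[ g_{kl}(x_i, x_j) + \mu_{x_j \to g_{kl}}(x_j) \right] + d$$

Initialization

$$\mu_{x_i \to g_{ij}}(x_i) = 0, \qquad \mu_{g_{ij} \to x_i}(x_i) = 0 \tag{8}$$

Termination

$$\mu_i(x_i) = \mu_{f_i \to x_i}(x_i) + \sum_{k=1}^{i-1} \mu_{g_{ki} \to x_i}(x_i) + \sum_{k=i+1}^{n} \mu_{g_{ik} \to x_i}(x_i)$$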
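
The update rules above can be sketched in a few lines. The following is an illustrative Python version, not the MATLAB implementation evaluated in this report. Several choices are assumptions: a flooding schedule in which all messages are updated in every iteration, constants c and d chosen so that the smallest entry of every message is zero, a final decision taken as the minimizer of μ_i(x_i), and a small weakly coupled example for which the loopy iterations are known to agree with the exhaustive search. On general loopy graphs no such agreement is guaranteed.

```python
import itertools

def min_sum_v1(A, b, alphabet, iters=10):
    n = len(b)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    f = [{x: A[i][i] * x * x - 2 * b[i] * x for x in alphabet} for i in range(n)]
    # Factor-to-variable messages mu_{g_kl -> x_i}, initialised to zero.
    g_to_x = {(p, i): {x: 0.0 for x in alphabet} for p in pairs for i in p}
    for _ in range(iters):
        # Variable -> factor: f_i plus all pairwise messages except the one
        # coming from the factor being addressed.
        x_to_g = {}
        for p in pairs:
            for i in p:
                msg = {x: f[i][x] + sum(g_to_x[(q, i)][x]
                                        for q in pairs if i in q and q != p)
                       for x in alphabet}
                c = min(msg.values())            # assumed normalisation constant
                x_to_g[(p, i)] = {x: v - c for x, v in msg.items()}
        # Factor -> variable: minimise over the other variable of the pair.
        new = {}
        for (k, l) in pairs:
            for i, j in ((k, l), (l, k)):
                msg = {xi: min(2 * A[k][l] * xi * xj + x_to_g[((k, l), j)][xj]
                               for xj in alphabet)
                       for xi in alphabet}
                d = min(msg.values())            # assumed normalisation constant
                new[((k, l), i)] = {x: v - d for x, v in msg.items()}
        g_to_x = new
    # Termination: add up all messages arriving at x_i and pick the minimiser.
    x_hat = []
    for i in range(n):
        belief = {x: f[i][x] + sum(g_to_x[(q, i)][x] for q in pairs if i in q)
                  for x in alphabet}
        x_hat.append(min(belief, key=belief.get))
    return x_hat

def brute_force(A, b, alphabet):
    """Exhaustive minimisation of the global function, for comparison only."""
    n = len(b)
    def cost(x):
        return (sum(A[i][i] * x[i] ** 2 - 2 * b[i] * x[i] for i in range(n))
                + 2 * sum(A[i][j] * x[i] * x[j]
                          for i in range(n) for j in range(i + 1, n)))
    return list(min(itertools.product(alphabet, repeat=n), key=cost))

H = [[1.0, 0.2, 0.0], [0.0, 1.0, 0.2], [0.2, 0.0, 1.0]]   # hypothetical channel
y = [0.8, -0.8, 1.2]                                      # noiseless H applied to [1, -1, 1]
cols = list(zip(*H))
A = [[sum(p * q for p, q in zip(hi, hj)) for hj in cols] for hi in cols]
b = [sum(p * q for p, q in zip(y, hi)) for hi in cols]
x_hat = min_sum_v1(A, b, (-1.0, 1.0))
print(x_hat, brute_force(A, b, (-1.0, 1.0)))
```

In this example the off-diagonal entries of A are small compared with the entries of b, so the pairwise messages cannot overturn the local decisions; with stronger coupling the loops in the graph can prevent the messages from settling.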
[Figure: BER curves on a logarithmic scale for min-sum version 1 and the sphere decoder]

Figure 7 – BER plot for min-sum version 1


Figure 7 shows that this algorithm has a convergence problem. From the BER values at high SNR in Figure 7, we conclude that approximately 1 in 100 symbols is affected by the lack of convergence. We also verified the convergence problem by debugging the code. It is a result of the loopy structure of the factor graph. We therefore look for alternative partitionings of the global function into local functions.
5.3.2 – Min-Sum Version2
In this version, we provide an alternative partitioning of the global function into local functions, described as follows:

$$\begin{aligned}
\hat{x} &= \arg\min_{x \in X^{n}} \left(\sum_{i=1}^{n} \left(A_{ii}x_i^{2} - 2b_i x_i\right) + 2\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} A_{ij}x_i x_j\right) \\
&= \arg\min_{x \in X^{n}} \left(\sum_{i=1}^{n} \left(A_{ii}x_i^{2} - 2b_i x_i\right) + 2\sum_{i=1}^{n-1} x_i \sum_{j=i+1}^{n} A_{ij}x_j\right)
\end{aligned} \tag{9}$$

then we select the factor node operations as:

$$f_i(x_i) = A_{ii}x_i^{2} - 2b_i x_i$$

$$g_i(x_i, x_j) = 2x_i \sum_{j=i+1}^{n} A_{ij}x_j, \qquad i = 1, \ldots, n-1$$

(number of factor nodes = 2n-1)

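[Figure: factor nodes f1, f2, f3, f4, … attached to variable nodes x1, x2, x3, x4, …; factor nodes g1, g2, …, gn-1 attached to the variable nodes]

Figure 8 – Factor Graph for Version 2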

In this configuration of the factor graph, our aim is to improve the convergence of the
previous algorithm and make the messages more independent from each other. Since this
problem is modeled on a loopy graph, the partitioning of the global function plays an
important role in the convergence of the algorithm.

variable node → factor node

$$\mu_{x_i \to g_j}(x_i) = \mu_{f_i \to x_i}(x_i) + \sum_{k=1, k \neq j}^{i} \mu_{g_k \to x_i}(x_i) + c$$

factor node → variable node

$$\mu_{f_i \to x_i}(x_i) = f_i(x_i)$$

$$\mu_{g_j \to x_i}(x_i) = \min_{x_j} \left[ g_j(x_i, x_j) + \sum_{k=j, k \neq i}^{n} \mu_{x_k \to g_j}(x_k) \right] + d$$

Initialization

$$\mu_{x_i \to g_j}(x_i) = 0, \qquad \mu_{g_j \to x_i}(x_i) = 0$$

Termination

$$\mu_i(x_i) = \mu_{f_i \to x_i}(x_i) + \sum_{k=1}^{i} \mu_{g_k \to x_i}(x_i)$$

This part is still in progress: the algorithm has been coded and is being debugged. The following table summarizes our min-sum work.

| Min-Sum Version | Partitioning of the global function | # of variable nodes | # of factor nodes |
|---|---|---|---|
| Version-1 | $\arg\min_{x \in X^{n}} \left(\sum_{i=1}^{n} (A_{ii}x_i^{2} - 2b_i x_i) + 2\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} A_{ij}x_i x_j\right)$ | n | n(n+1)/2 |
| Version-2 | $\arg\min_{x \in X^{n}} \left(\sum_{i=1}^{n} (A_{ii}x_i^{2} - 2b_i x_i) + 2\sum_{i=1}^{n-1} x_i \sum_{j=i+1}^{n} A_{ij}x_j\right)$ | n | 2n-1 |

Table 1 – Summary of the min-sum work
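
The node counts in the table, and the fact that both partitionings describe the same global function, can be checked with a short script. This is an illustrative sketch: the node labels, the integer-valued A and b, and the 4-PAM alphabet are hypothetical.

```python
import itertools

def version1_factors(n):
    """Version-1 factor nodes: f_i for each variable, g_ij for each pair i < j."""
    f = [("f", i) for i in range(1, n + 1)]
    g = [("g", i, j) for i in range(1, n) for j in range(i + 1, n + 1)]
    return f + g

def version2_factors(n):
    """Version-2 factor nodes: f_i for each variable, g_i for i = 1..n-1."""
    f = [("f", i) for i in range(1, n + 1)]
    g = [("g", i) for i in range(1, n)]
    return f + g

def cost_v1(A, b, x):
    """Sum of the Version-1 local functions."""
    n = len(x)
    return (sum(A[i][i] * x[i] ** 2 - 2 * b[i] * x[i] for i in range(n))
            + sum(2 * A[i][j] * x[i] * x[j]
                  for i in range(n - 1) for j in range(i + 1, n)))

def cost_v2(A, b, x):
    """Sum of the Version-2 local functions."""
    n = len(x)
    return (sum(A[i][i] * x[i] ** 2 - 2 * b[i] * x[i] for i in range(n))
            + sum(2 * x[i] * sum(A[i][j] * x[j] for j in range(i + 1, n))
                  for i in range(n - 1)))

for n in range(2, 7):
    print(n, len(version1_factors(n)), len(version2_factors(n)))

A = [[4, 1, -2], [1, 3, 1], [-2, 1, 5]]   # hypothetical symmetric integer matrix
b = [1, -2, 3]
same = all(cost_v1(A, b, x) == cost_v2(A, b, x)
           for x in itertools.product((-3, -1, 1, 3), repeat=3))
print(same)
```

Version-2 trades the many two-variable factors of Version-1 for fewer factors that each touch more variables, which changes the loop structure of the graph without changing the function being minimized.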
6. Future Work
The first step of the future work is to develop an algorithm that has satisfactory error performance and as few convergence problems as possible. At first we will not take the practicality and complexity of the algorithm into account, but in subsequent phases practicality, complexity and the effects of real-life problems will be considered.
For now, we are investigating how to implement the distributed algorithm on a factor graph as a min-sum algorithm. In this phase, we are working with finite constellations. However, the problem in this research involves an infinite lattice, and the sphere decoder performs satisfactorily on this infinite lattice. Our aim will be to achieve performance similar to that of the centralized sphere decoder. After analyzing the error performance and the convergence issues of the algorithm, we will work on practicality and complexity, and we may need to sacrifice performance in order to get an algorithm applicable to real life. For instance, the sensitivity of the algorithm to CSI feedback, synchronization and packet loss between BSs will be analyzed.
Bibliography
[1] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, “A vector-perturbation technique for near-capacity multiantenna multiuser communication – Part II: Perturbation,” IEEE Trans. Commun., vol. 53, no. 3, Mar. 2005.
[2] U. Fincke and M. Pohst, “Improved methods for calculating vectors of short length in a lattice, including a complexity analysis,” Math. Comput., vol. 44, pp. 463–471, Apr. 1985.
[3] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger, “Closest point search in lattices,” IEEE Trans. Inf. Theory, vol. 48, pp. 2201–2214, Aug. 2002.
[4] M. O. Damen, H. El Gamal, and G. Caire, “On maximum-likelihood detection and the search for the closest lattice point,” IEEE Trans. Inf. Theory, vol. 49, pp. 2389–2402, Oct. 2003.
[5] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, Feb. 2001.
[6] M. Tomlinson, “New automatic equaliser employing modulo arithmetic,” Electron. Lett., vol. 7, pp. 138–139, Mar. 1971.
[7] H. Harashima and H. Miyakawa, “Matched-transmission technique for channels with intersymbol interference,” IEEE Trans. Commun., vol. COM-20, pp. 774–780, Aug. 1972.
[8] M. Costa, “Writing on dirty paper,” IEEE Trans. Inf. Theory, vol. 29, pp. 439–441, May 1983.
