For t = Σ_{i=0}^{d−1} 2^i + 1 to Σ_{i=0}^{d} 2^i
    Let ct = ∅. If Γ(t) ≥ T_{k−1},
        If t mod 2 = 0 and Γ(t + 1) < T_{k−1}, then ct = TravDep(Γ, t + 1, T_k);
        Else if t mod 2 = 1 and Γ(t − 1) < T_{k−1}, then ct = TravDep(Γ, t − 1, T_k).
    code = {code, ct}.
d = d − 1.
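The traversal above can be sketched in a few lines. The following is a minimal illustration (not the authors' implementation): leaves hold coefficient magnitudes, each parent stores the maximum of its two children, and a TravDep-style depth-first pass emits a 1 and recurses while the subtree maximum reaches the threshold, and a single 0 otherwise; sign and refinement bits are omitted for brevity.

```python
# Minimal sketch of the binary tree used by BTC/BTCA: gamma is a 1-indexed
# heap-like array; leaves occupy indices n .. 2n-1 and each parent holds the
# maximum magnitude of its two children.

def build_tree(magnitudes):
    """Return gamma where gamma[t] is the max magnitude in the subtree at t."""
    n = len(magnitudes)               # assumed to be a power of two
    gamma = [0] * (2 * n)
    gamma[n:2 * n] = magnitudes       # leaves
    for t in range(n - 1, 0, -1):     # each parent costs one comparison
        gamma[t] = max(gamma[2 * t], gamma[2 * t + 1])
    return gamma

def trav_dep(gamma, t, threshold, n):
    """Depth-first significance pass: emit 1 and descend while the subtree
    maximum reaches the threshold, else emit a single 0 (signs omitted)."""
    if gamma[t] < threshold:
        return [0]
    if t >= n:                        # leaf: a significant coefficient
        return [1]
    return ([1] + trav_dep(gamma, 2 * t, threshold, n)
                + trav_dep(gamma, 2 * t + 1, threshold, n))

gamma = build_tree([7, 1, 0, 2])
bits = trav_dep(gamma, 1, 4, 4)       # threshold 4, four leaves
```

Here an entire insignificant subtree costs one bit, which is the source of the coding efficiency discussed above.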
Here, we give a toy example. Suppose there is a 2^2 × 2^2 code block as in Fig. 7(a). We can construct the binary tree Γ(t) for 1 ≤ t < 2^5 as in Fig. 7(b). With T_0 = 4, code = TravDep(Γ, 1, T_0). Because Γ(1) = 7 ≥ T_0 and t = 1 < 4 × 4, code = {1, cl, cr} for cl = TravDep(Γ, 2, T_0) and cr = TravDep(Γ, 2 × 1 + 1, T_0).
We can get TravDep(Γ, 2, T_0) = {1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0}, where the 5th and the 11th bits are the signs of the significant coefficients 7 and 5, respectively. Similarly, TravDep(Γ, 3, T_0) = {1, 0, 0, 0, 1}, where the last bit is the sign of the significant coefficient 4. Hence, we get the code {1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1} with T_0 = 4.
With T_1 = 2, code = TravLev(T_1). When d = 5, we first scan the nodes Γ(17), Γ(22), Γ(30), and get the code {1, 1, 1, 1, 1, 0}. When d = 4, we scan the nodes Γ(9), Γ(10), Γ(14), and get the code {1, 0, 0, 0, 1, 0, 1}. When d = 3, we scan the node Γ(6), and get the code {0}. Hence, we get the code {1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0} with T_1 = 2.
After the function TravLev(T_1) is finished, a refinement pass is performed in order to refine all the coefficients found to be significant. For the given threshold, this pass resolves the remaining uncertainty about each significant magnitude so as to minimize the quantization error. That is, a coefficient in the upper half of the uncertainty interval is coded with 1; otherwise, it is coded with 0. For the example, there are three previously significant coefficients 7, 5, and 4, the refinement code is {1, 0, 0}, and it can be decoded to 7, 5, and 5. Finally, we get the resulting code 1111110010001000111111010001010100 for the BTCA method, and the decoded block is shown in Fig. 7(c).
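The refinement rule above amounts to midpoint reconstruction of the half-interval selected by each bit. A small sketch (illustrative, not the authors' code) reproduces the decoded values of the toy example:

```python
# A coefficient found significant at threshold T lies in [T, 2T); refinement
# bit 1 selects the upper half [3T/2, 2T) and bit 0 the lower half [T, 3T/2),
# and the decoder reconstructs at the midpoint of the chosen half.

def refine(threshold, bit):
    """Decode one refinement bit to the midpoint of the chosen half-interval."""
    low, high = threshold, 2 * threshold
    mid = (low + high) // 2
    return (mid + high) // 2 if bit else (low + mid) // 2

# Toy example: T0 = 4, refinement code {1, 0, 0} for coefficients 7, 5, 4.
decoded = [refine(4, b) for b in (1, 0, 0)]
```

With T_0 = 4, the intervals are [6, 8) and [4, 6), so the code {1, 0, 0} decodes to 7, 5, and 5, as stated in the text.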
The adaptive scanning order scans the coefficients closest to the previously significant coefficients first, then the next closest, and so on. Because edges usually lie around the significant coefficients and the proposed method encodes these regions preferentially, the quality of the edges of the decoded image can be improved at a specified bit rate. In other words, the adaptive scanning order serves as a means to adaptively adjust to the edges, particularly to horizontal or vertical edges.
The closer the coefficients are to the previously significant coefficients, the more likely they are to be significant. Hence, when using the function TravLev(T_k), the resulting code is usually more efficient at higher tree levels, such as d = D or d = D − 1, than at lower tree levels, such as d = 3 or d = 2. Moreover, the code of the refinement pass is usually more efficient than that of the lower tree levels, too. Thus, the refinement pass can be performed when d = D − 3 in the function TravLev(T_k) to get better performance. For example, if the size of a block to be coded is 2^6 × 2^6, then the tree depth is D = 13, and the refinement pass is performed when d = 10.
C. Scan-Based Processing
If we directly use the binary tree to encode a large image, i.e., if an entire wavelet image is regarded as one code block, then a lot of memory is needed to store the binary tree. As an alternative, we can divide the entire wavelet image into several code blocks, encode each of them with the proposed method, and combine the results with a rate-distortion optimization algorithm [9]. For each code block and each bit plane, the points at the end of the refinement pass and of each adaptive traversing level can be taken as the primitive truncation points. Because of the adaptive scanning order, the distortion rates of these truncation points are almost monotonically decreasing, so more finely embedded bit streams than those of EBCOT can be obtained. To decrease the computational cost of the rate-distortion optimization to a certain extent, we can use
3742 IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 50, NO. 10, OCTOBER 2012
Fig. 7. Toy example. In (b), the numbers in the circles are the values of the tree nodes, and the numbers near the circles are the indexes of the nodes of the binary
tree. The gray nodes are the previous signicant nodes whose magnitudes are greater than 4. (a) A code block. (b) The binary tree of (a). (c) The decoded block.
the memory-efficient progressive rate-distortion optimization algorithm [25] or the greedy heap-based rate-control algorithm [26] instead of the PCRD algorithm used in EBCOT [9]. Finally, we can get a truncation point in each code block at a specified bit rate. Hence, only a small number of final truncation points, equal to the number of code blocks, need to be stored. Combined with rate-distortion optimization, this scheme gets better performance (gains of about 0.2 dB) and requires less memory than BTCA, and it has a rich set of features like JPEG2000, such as quality, position, and resolution scalability [9].
However, the rate-distortion optimization is expensive, because there could be many unnecessary coding units, and the distortion rates must be calculated to minimize distortion for a target bit rate. To decrease the computational cost, we propose a scan-based method based on BTCA without rate-distortion optimization, which also has quality, position, and resolution scalability. This method is called BTCA-S.
Since remote sensing data are often captured incrementally by sensors in a push-broom fashion and are quite large, a scan-based approach is also very desirable for handling the data. In [14], the authors present a scan-based method that enables the use of JPEG2000 with incrementally acquired data. CCSDS-IDC [16] also recommends a stripe-based method. BTCA-S uses the same line-based wavelet transform and scan elements as [14] and [16], but its coding algorithm is different. Complex context-based entropy coding and rate-distortion optimization are used in [14], which are too expensive for a recommended standard for space missions [16]. The CCSDS-IDC recommendation has been designed for real remote sensing scenarios and provides an excellent performance-complexity tradeoff, but it only has quality scalability, and the number of DWT levels is fixed at three. In contrast, BTCA-S can provide quality, position, and resolution scalability without any entropy coding or rate-distortion optimization, and it can use any number of DWT levels.
1) Creating Scan Elements: In [27], a line-based wavelet transform is proposed. For a given filter length and level of decomposition, the memory requirement of the wavelet transform in this approach depends only on the width of the image, rather than on the total size as in a traditional row-column filtering implementation, yet it yields the same results as the traditional one. The approach is based on the following ideas. Each time we receive a line of the original image, we perform horizontal filtering and store the data in a circular buffer. After we have received enough lines, we can perform vertical filtering. The next level of decomposition follows the same principle and is carried out when there are enough rows and columns. Let 2F + 1 be the maximum length of the filters.
Fig. 8. Relationship between scan elements (right) and resolution subbands (left) for a two-level wavelet transform of a single component. Regions in different subbands shaded with the same color constitute a scan element. That is, wavelet transform data can be rearranged to form scan elements.
The buffer size for filtering is B_f = L(2F + 1), and the buffer size for synchronization is B_s = (2^L − L − 1)F. The total circular buffer size needed for L levels of wavelet decomposition is B_f + B_s = L(2F + 1) + (2^L − L − 1)F rows [27]. For example, when we use the CDF 9/7 filters with L = 3 levels of decomposition, the total buffer size is 3 × 9 + (2^3 − 3 − 1) × 4 = 43 rows. If there are many rows in the original image, then much memory can be saved for the wavelet transform.
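The buffer-size formulas quoted from [27] can be written out directly; a small sketch (with the CDF 9/7 case from the text as a check):

```python
# Circular-buffer size for a line-based wavelet transform [27]:
# B_f = L(2F + 1) rows for filtering and B_s = (2**L - L - 1) * F rows for
# synchronization, with L decomposition levels and maximum filter length 2F + 1.

def buffer_rows(levels, half_length):
    """Total circular-buffer rows B_f + B_s for the line-based transform."""
    b_f = levels * (2 * half_length + 1)
    b_s = (2 ** levels - levels - 1) * half_length
    return b_f + b_s

rows = buffer_rows(3, 4)   # CDF 9/7 filters (F = 4), three levels
```

This reproduces the 43 rows computed above; the buffer grows with 2^L, which is why only a few decomposition levels are used on board.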
The line-based wavelet transform is performed on the available data, and subsets of wavelet coefficients are collected into scan elements. Each scan element consists of a certain number of rows from each resolution subband, nominally corresponding to a stripe of the image in the spatial domain. For example, Fig. 8 shows two scan elements of a two-level wavelet transform for an image of size 16 × 16. Each scan element contains the wavelet coefficients of four image rows, namely, 48 coefficients from the three subbands of the highest resolution, 12 coefficients from the subbands of the second highest resolution, and 4 coefficients from the lowest frequency subband. In Fig. 8, regions in different subbands shaded with the same color correspond to the same region of the image in the spatial domain and constitute one scan element. The formation of scan elements using small amounts of data from different subbands is possible because of the use of the incremental wavelet transform. This concept of scan elements can be extended to any number of resolution levels.
If there are too many rows in a scan element, then much memory is needed. However, if there are too few rows, then the performance decreases noticeably. Hence, the proposed method trades off the memory requirement against performance by setting the number of rows in each scan element to 32, except that the last scan element may contain fewer than 32 rows. In fact, if there are 64 rows in each scan element, the mean PSNR is only about
HUANG AND DAI: NEW ON-BOARD IMAGE CODEC BASED ON BINARY TREE 3743
0.02 dB away from that obtained when each scan element contains 32 rows.
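The per-level coefficient counts of a scan element can be sketched as follows (an illustration under the layout described above, not the authors' code): level k of an L-level transform contributes three subbands with 1/2^k of the element's rows and of the image width each, and the LL subband adds one more block at the coarsest scale.

```python
# Coefficients contributed by each resolution level to one scan element of
# `rows` image rows over an image of width `width`, for an L-level transform.

def scan_element_counts(rows, width, levels):
    """Return counts per level, highest resolution first, LL subband last."""
    counts = [3 * (rows // 2 ** k) * (width // 2 ** k)
              for k in range(1, levels + 1)]
    counts.append((rows // 2 ** levels) * (width // 2 ** levels))  # LL subband
    return counts

counts = scan_element_counts(4, 16, 2)   # the Fig. 8 example: 4 rows, 16 wide
```

For the Fig. 8 example this gives 48, 12, and 4 coefficients, matching the text, and the counts sum to the 4 × 16 pixels of the stripe.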
2) Encoding Scan Elements: As soon as a scan element is available, we can encode it with BTCA, which provides quality scalability because BTCA is an embedded coding algorithm. In order to get position and resolution scalability, we can further divide the scan element into several code blocks and encode them with BTCA. More precisely, we can take each subband of a scan element as a code block. This is based on the following observations. A scan element only contains 32 rows of wavelet coefficients, so the subbands in a scan element are flat rectangles. For example, when an image of size 500 × 500 is decomposed by a three-level wavelet transform and a scan element of size 32 × 500 is available, the sizes of the subbands of each resolution in the scan element are 16 × 200, 8 × 100, and 4 × 50, respectively. If we use a fixed code block size such as 32 × 32, then a code block may contain coefficients from more than one subband, which decreases the coding efficiency and forfeits resolution scalability. Hence, we take each subband of a scan element as a code block. If the subbands are too wide, for example, wider than 512, then they can be divided into two code blocks, which does not raise the PSNR according to our experiments but provides finer position scalability.
To allocate the bit rates for the code blocks of a scan element adaptively, we propose a scanning order across the binary trees as follows. Suppose the size of a scan element is 2^M × 2^N from a two-level wavelet transform, in which there are seven subbands, four of size 2^{M−2} × 2^{N−2} and three of size 2^{M−1} × 2^{N−1}. We construct a binary tree for each subband, namely, Γ_1, Γ_2, Γ_3, Γ_4, Γ_5, Γ_6, Γ_7 for the seven subbands, with levels D_1 = D_2 = D_3 = D_4 = M + N − 3 and D_max = D_5 = D_6 = D_7 = M + N − 1, respectively.
Then, we can traverse the binary trees with the function TravLevs.

Function code = TravLevs(T_k)
Let D = D_max. Repeat the following steps while D > 1.
    For j = 1 to 7
        If j ≤ 4, then d = D − 2, else d = D.
        If d ≤ 0, continue.
        For t = Σ_{i=0}^{d−1} 2^i + 1 to Σ_{i=0}^{d} 2^i
            Let ct = ∅. If Γ_j(t) ≥ T_{k−1},
                If t mod 2 = 0 and Γ_j(t + 1) < T_{k−1}, then ct = TravDep(Γ_j, t + 1, T_k);
                Else if t mod 2 = 1 and Γ_j(t − 1) < T_{k−1}, then ct = TravDep(Γ_j, t − 1, T_k).
            code = {code, ct}.
    D = D − 1.
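The outer loops of TravLevs interleave the seven trees so that the four shallower trees (depth D_max − 2) are always visited two levels behind the three deeper ones. The following sketch (an illustration of the scheduling only, assuming the "If j ≤ 4" branch sets d = D − 2 and otherwise d = D) lists the (subband, level) pairs in visiting order:

```python
# Level scheduling of TravLevs for a two-level transform: subbands j = 1..4
# have tree depth D_max - 2 and are traversed at level D - 2, while subbands
# j = 5..7 have depth D_max and are traversed at level D; levels d <= 0 are
# skipped.

def travlevs_schedule(d_max):
    """Yield (subband j, tree level d) in the order TravLevs visits them."""
    order = []
    D = d_max
    while D > 1:
        for j in range(1, 8):
            d = D - 2 if j <= 4 else D
            if d <= 0:
                continue
            order.append((j, d))
        D -= 1
    return order

sched = travlevs_schedule(5)   # e.g., D_max = 5
```

All trees therefore reach their bottom levels at the same stage, which is what makes the coding efficiency of the interleaved blocks comparable at each step.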
The function TravLevs is similar to the function TravLev. The difference is that the former traverses multiple trees by levels at the same time, because the coding efficiency of BTCA for different code blocks is very similar at the same bit plane and the same level of the binary tree. The function TravLevs is easy to extend to higher levels of wavelet transform, in which there are more subbands.
Because the function TravLevs is an embedded coding process, it can be stopped at any CR. When each specified CR (quality layer) is reached, we record the length of the bits for each code block in the header of the coding stream. For example, suppose there are Q quality layers and C code blocks. Let l_c^q denote the length of the bits for the cth code block and the qth quality layer. Then, the QC numbers l_c^q (c = 1, 2, . . . , C; q = 1, 2, . . . , Q) need to be recorded. Consequently, in the final stream, the bit stream for each quality layer and each code block is preceded by a header indicating its length. Then, when we transmit or decode the coding stream, we can select a certain quality layer and some resolution or position bit streams without needing to decode the whole encoded stream, so as to get scalability of quality, resolution, and position.
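The role of the recorded lengths l_c^q can be illustrated with a small sketch (the stream layout here is assumed for illustration, not the paper's exact format): if the segments are concatenated layer by layer and block by block, the length table alone locates any (quality layer, code block) segment without decoding anything else.

```python
# Given per-layer, per-block segment lengths L[q][c], compute byte offsets so
# that any (q, c) segment can be extracted directly from the stream.

def segment_offsets(lengths):
    """Map (q, c) -> (start, end) offsets in a stream laid out layer by layer."""
    offsets, pos = {}, 0
    for q, layer in enumerate(lengths):
        for c, n in enumerate(layer):
            offsets[(q, c)] = (pos, pos + n)
            pos += n
    return offsets

# Q = 2 quality layers, C = 3 code blocks with hypothetical segment lengths.
offs = segment_offsets([[5, 3, 4], [2, 6, 1]])
```

Selecting a resolution or position then amounts to keeping only the segments of the corresponding code blocks, which is the scalability mechanism described above.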
3) Rate Allocation for Scan Elements: When multiple scan elements are desired, a method of allocating bit rates to each scan element is required. The simplest method is to encode each scan element at a fixed rate proportional to the number of rows in it. However, this simple method has some disadvantages. When the image content varies considerably between different areas, the reconstructed image of the simple method may vary in quality from top to bottom. If one wants to optimize overall image quality subject to an overall rate constraint, a strategy to optimize the allocation of compressed bytes to different scan elements could be employed. Such an implementation, however, might have high complexity [16]. In fact, there is almost no difference (only about 0.03 dB in the experiment in [14]) between the overall PSNR of images reconstructed with the fixed-rate method and with the overall rate-distortion optimization method when the size of the scan elements is not very small, and there is a fixed-rate option in the CCSDS-IDC recommendation [16]. Hence, we propose a modified fixed-rate method for rate allocation as follows.
First, we encode each scan element with the fixed-rate method using the function TravLevs. When the qth specified CR is reached in a scan element, the length of the bits for the cth code block in the sth scan element is recorded as l_{s,c}^q. At this time, the first several code blocks in the scan element have been encoded by the function TravLevs to the same level D of the binary trees, and the others have been encoded to level D + 1. We continue encoding these code blocks in the scan element until the while loop with the level D is finished, hence a better quality is attained, and the length of the bits for the cth code block, namely, l_{s,c}^{q,e}, is recorded. We can also record the length of the bits for the first several code blocks as l_{s,c}^{q,b} at the beginning of the while loop with the level D, which represents a worse quality. The corresponding distortion with the level D at the bit plane T_k is marked as D_{s,c}^q = T_k · D_max + D.
Finally, we get the candidate truncation points l_{s,c}^q, l_{s,c}^{q,b}, and l_{s,c}^{q,e} and the corresponding distortion D_{s,c}^q. At the qth CR, the l_{s,c}^{q,b} bits for each code block of all scan elements are selected first. Then, we can select more bits for each code block, namely, l_{s,c}^q or l_{s,c}^{q,e} instead of l_{s,c}^{q,b}, in descending order of D_{s,c}^q, until the qth CR is reached. It should be pointed out that there is only a small number of truncation points to select from, so this step costs little, but it can balance the quality difference between different scan elements.
For example, Fig. 9 shows the candidate truncation points for the code blocks. It lists three code blocks, namely, the (c − 1)th, cth, and (c + 1)th code blocks, in the sth scan element. The gray nodes are the previously significant nodes, and the nodes marked with the numbers 1, 2, . . . , 12 are the nodes traversed by levels.
Fig. 9. Candidate truncation points for the code blocks. There are three code blocks, namely, the (c − 1)th, cth, and (c + 1)th code blocks, in the sth scan element. The gray nodes are the previously significant nodes, and the nodes marked with the numbers 1, 2, . . . , 12 are the nodes traversed by levels. Their scanning order is labeled with the numbers. Suppose the algorithm reaches the qth compression ratio when coding the node marked with 9; then the truncation points for the three code blocks are attained as in the figure.
Their scanning order is labeled with the numbers. Suppose the algorithm reaches the qth CR when coding the node marked with 9; then the truncation points for the three code blocks are attained as in the figure.
III. COMPLEXITY ANALYSIS
A. Time Complexity
In this subsection, we analyze the time complexity of our method and compare it with those of EZBC and SPIHT. In fact, zerotree encoders such as SPIHT usually trade memory for computation by precomputing and storing the maximum magnitudes of all possible descendant and grand-descendant sets [6], [7], [28]; they also need to compare coefficients with each other, and the number of comparisons is similar to that of constructing the binary tree in our method. Details of the calculation are given as follows.
Suppose an entire wavelet image is regarded as a code block of size 2^N × 2^N. There are 2^N × 2^N = 2^{2N} nodes at the bottom of the binary tree. When an upper level node of the tree is constructed, one comparison of its two children is needed. Let D_B = log_2(2^{2N}) + 1 = 2N + 1 stand for the depth of the tree. Hence, the total number of comparisons for constructing the tree is

C_B = 2^{2N} (1/2 + 1/2^2 + ··· + 1/2^{D_B − 1}) = 2^{2N} (1 − 1/2^{2N}) = 2^{2N} − 1.
Similarly, when a node of the quadtree of EZBC is constructed, three comparisons are needed to find the maximum of its four children. Let D_E = log_4(2^{2N}) + 1 = N + 1 stand for the depth of the quadtree. The total number of comparisons for constructing the quadtree is

C_E = 3 · 2^{2N} (1/4 + 1/4^2 + ··· + 1/4^{D_E − 1}) = 3 · 2^{2N} · (1/4)(1 − (1/4)^N) / (1 − 1/4) = 2^{2N} − 1.

We can see that C_B = C_E.
To find the maximum magnitude of all possible descendants in SPIHT, we need to construct a quadtree in each subband first. There are 2^{2N}/4 coefficients in a highest resolution subband, and the number of comparisons for constructing the quadtree in that subband is 2^{2N}/4 − 1. Let D_S = log_4(2^{2N}) + 1 = N + 1 stand for the wavelet decomposition level, so the total number of comparisons is

C_S = 3 [(2^{2N}/4 − 1) + (2^{2N}/4^2 − 1) + ··· + (2^{2N}/4^{D_S − 1} − 1)] = 2^{2N} − 1 − 3N.
In addition, the corresponding nodes in the quadtrees of different subbands at different resolutions need to be compared. Hence, the total number of comparisons for finding the maximum magnitude of all possible descendants is more than C_S. Thus, C_B is nearly equal to C_S. In fact, C_S is also the number of comparisons for finding the maximum magnitude of all coefficients, which is required by all wavelet image compression algorithms.
Including all the wavelet coefficients, the number of nodes in the binary tree is N_B = 2^{2N} + 2^{2N} − 1, and those in EZBC and SPIHT are N_E = 2^{2N} + (2^{2N} − 1)/3 and N_S = 2^{2N} + (2^{2N}/4) + (2^{2N}/4^2), respectively. N_B is greater than N_E and N_S, and more time is needed to traverse all the nodes; in practice, however, the coding process usually does not need to traverse all the nodes. When we get one bit from a significance test in BTC, EZBC, or SPIHT, one comparison of a value with the threshold is needed, and the other computation is proportional to it. Hence, the time costs of BTC, EZBC, and SPIHT are similar at the same bit rates when the maximum magnitudes of all possible sets are stored. The method that codes more significant coefficients gets better performance.
The additional computation of BTCA is to find the adaptive scanning order. BTCA performs the binary tree coding level by level from the bottom to the top and does not increase the bit stream compared with BTC. It adds the test of the statement "If Γ_j(t) ≥ T_{k−1}" for each node in the binary tree, but this statement is very simple and does not require any other operation such as sorting. Hence, the adaptive scanning order costs only a little time. The scan-based processing does not need any extra computation. On the contrary, because it greatly reduces the memory requirement, it can save the time spent on cache exchange.
The step of entropy coding usually accounts for most of the coding/decoding time. For example, in [7], when arithmetic coding is used in a C language implementation of SPIHT, the CPU time for encoding an image at 0.5 bpp is 0.33 s, while it is 0.14 s without arithmetic coding. The rate-distortion optimization is very expensive, too, because there could be many unnecessary coding units, and the distortion rates of all truncation points must be stored and sorted, which is a major component that dictates the performance [25]. Hence, methods using complex context-based arithmetic coding and rate-distortion optimization, such as EBCOT [9], JPEG2000 [11], and HIC [21], would be too complex to become a recommended standard for space missions.
If EBCOT does not use arithmetic coding, its performance decreases a lot. Except for the refinement code and the sign code, some of the original binary codes of EBCOT represent the significance of four coefficients, and most codes represent the significance of a single coefficient. Only a few codes represent the significance of the coefficients of a subblock of size 16 × 16. Thus, there are many 0s in the original binary codes of EBCOT, and arithmetic coding is needed to raise the performance. However, the original binary codes of SPIHT can represent the significance of all the coefficients in a zerotree, and the original binary codes of the binary tree coding can represent the significance of all the coefficients in a subtree, so the performance of SPIHT and the proposed method is better than that of EBCOT without arithmetic coding. According to our experiment, the PSNR of EBCOT decreases by more than 1 dB without arithmetic coding.
B. Memory Requirement
When traditional row-column filtering is implemented for the wavelet transform, memory of the same size as the whole image is needed. For example, if an image of size 2048 × 2048 is to be encoded and each wavelet coefficient needs 4 bytes of storage, then the traditional method needs 2048 × 2048 × 4 = 16 Mbytes of memory, while only (B_f + B_s) × 2048 × 4 = 0.3359 Mbytes are needed for a three-level line-based wavelet transform, where B_f = 27 and B_s = 16. When a scan element contains the coefficients of 32 rows for the proposed method and scan-based JPEG2000, or CCSDS-IDC contains 2048 blocks in a segment, 32 × 2048 × 4 = 0.25 Mbytes of memory is needed, including the buffer for synchronization B_s, which is 16 × 2048 × 4 = 0.125 Mbytes. In this case, the total buffer size for storing the coefficients of the scan-based method is 0.3359 + 0.25 − 0.125 ≈ 0.46 Mbytes.
In addition, some extra memory is needed, such as the memory for recording significant/insignificant coefficients and sets, which is related to the CR. For example, when using SPIHT to encode an image of size 2048 × 2048 at 1 bpp, there are about 650 000 significant coefficients, 540 000 insignificant coefficients, and 300 000 insignificant sets. If a linked list is used to store them, then each entry needs 8 bytes. Hence, the memory for recording significant/insignificant coefficients and sets is about 11 Mbytes, and the total memory for SPIHT is about 16 + 11 = 27 Mbytes. In the proposed method, the number of nodes in the binary trees is twice the number of coefficients in the scan element, but the nodes of the bottom level of the tree are equal to the coefficients and need not be stored. Thus, the additional memory requirement is equal to the size of the scan element, namely, 0.25 Mbytes for the binary tree in the above case, and it is not related to the CR. Hence, the total memory requirement for the proposed method is 0.46 + 0.25 = 0.71 Mbytes, except for
TABLE II
MEMORY REQUIREMENT (M bytes) OF SPIHT, SCAN-BASED
CCSDS-IDC, SCAN-BASED JPEG2000, AND THE
PROPOSED METHODS AT 1 bpp
some minor variables. Table II shows a comparison of the memory requirements of the traditional method, the proposed methods, scan-based CCSDS-IDC, and scan-based JPEG2000 at 1 bpp. They are evaluated with images of two sizes, i.e., 2048 × 2048 and 1024 × 1024. From Table II, we can find that the memory requirements of the scan-based methods, namely, BTCA-S3, BTCA-S4, scan-based CCSDS, and scan-based JPEG2000, are very similar and much less than that of a frame-based method like SPIHT. The memory requirement of the proposed methods is only a little more than that of scan-based JPEG2000.
IV. EXPERIMENTAL RESULTS
To evaluate the performance of the proposed method for remote sensing images, experiments are conducted on the CCSDS reference test image set [29]. The image set includes a variety of space imaging instrument data such as solar, stellar, planetary, earth observation, optical, radar, and meteorological images, as shown in Fig. 10. We compare different methods by PSNR (peak signal-to-noise ratio, dB) at different CRs (bpp), which are expressed by the following relations [31]:
PSNR(dB) = 10 log_{10} [(2^b − 1)^2 / MSE]
MSE = Σ_{i,j} (x_{ij} − x̂_{ij})^2 / (Row × Col)
CR(bpp) = (number of coded bits) / (Row × Col)

where x_{ij} and x̂_{ij} denote the original and reconstructed pixels, respectively, and b denotes the bit depth of the image, which is of size Row × Col.
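The metrics above can be transcribed directly (a plain sketch of the definitions, with pixels given as flat lists for brevity):

```python
# PSNR(dB) = 10 log10((2**b - 1)**2 / MSE), with MSE the mean squared error
# between original and reconstructed pixels of a b-bit image.

import math

def psnr(original, reconstructed, bit_depth):
    """PSNR in dB between two equal-length pixel sequences."""
    n = len(original)
    mse = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / n
    return 10 * math.log10((2 ** bit_depth - 1) ** 2 / mse)

# Worst case for 8-bit pixels: every pixel off by 255 gives MSE = 255**2,
# hence PSNR = 0 dB.
value = psnr([255, 255], [0, 0], 8)
```

Note that the (2^b − 1)^2 numerator is what makes PSNR comparable across the 8-, 10-, 12-, and 16-bit images of the test set.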
A. Comparison of the Proposed Methods and Other
Algorithms Without Entropy Coding
To compare our methods with other algorithms without entropy coding, all the 8-bit-depth images of size 1024 × 1024 and of size 512 × 512 from the CCSDS reference test image set are used. In addition, these images are all decomposed by five levels of 9/7-tap biorthogonal wavelet filters [30] with symmetric extension at the image boundary. The PSNRs at different CRs of the proposed method were compared with those of algorithms without arithmetic coding, such as SPIHT [7], SPECK [8], and EZBC [19]. We also give the results of BTC and BTCA to show the gradual improvement of our proposed method.
Table III lists the experimental results of the proposed method and other algorithms without arithmetic coding. The results are
Fig. 10. Part of the remote sensing images used in simulation, with different sizes and bit depths. (a) Coastal-b3 (8-bit depth, 1024 × 1024). (b) Europa3 (8-bit depth, 600 × 557). (c) Marstest (8-bit depth, 512 × 512). (d) Lunar (8-bit depth, 512 × 512). (e) Spot-la-panchr (8-bit depth, 1000 × 1000). (f) Ice-2kb1 (10-bit depth, 2048 × 2048). (g) Pleiades-portdebouc-b3 (12-bit depth, 320 × 1376).
evaluated at five bit rates, namely, 1, 0.5, 0.25, 0.125, and 0.0625 bpp. From the results, we can see that the average PSNR of BTC is better than those of SPIHT, SPECK, and EZBC at the five bit rates for all the listed images: it is higher by an average of about 0.97 dB than SPIHT, 0.65 dB than SPECK, and 0.13 dB than EZBC, which shows that the proposed binary tree provides a more efficient image representation than SPIHT, SPECK, and EZBC. The performance of SPIHT is poor at lower bit rates because there are too many insignificant coefficients in the initial list when using five levels of wavelet transform.
We can also find that the PSNR of BTCA is better than that of BTC, by up to 0.49 dB and by an average of 0.14 dB, which shows that the adaptive scanning order achieves better performance. In fact, the PSNR of BTCA is almost the best of all the methods for all the listed test images at different bit rates.
When we use BTCA in scan-based mode, each subband of a scan element only contains a few rows. If the coefficients in a large square of the original wavelet domain are insignificant, this cannot be represented by BTCA in scan-based mode. Moreover, there is no rate-distortion optimization in BTCA-S. Hence, the performance of BTCA-S is worse than that of BTCA. However, BTCA-S needs much less memory than BTCA, and the former has quality, position, and resolution scalability, while the latter only has quality scalability.
B. Comparison of the Proposed Method, Scan-Based
CCSDS-IDC, and Scan-Based JPEG2000
To compare our method with scan-based CCSDS-IDC and scan-based JPEG2000, all the images from the CCSDS reference test image set are used. These images include all the 8-bit-depth images listed in Table IV, and all the 10-bit-depth images such as ice-2kb1, ice-2kb4, india-2kb1, india-2kb4, ocean-2kb1, and ocean-2kb4 of size 2048 × 2048, landesV-G7-10b of size 2381 × 454, and marseille-G6-10b of size 1856 × 528. There are also eight images with 10-bit depth, six images with 12-bit depth, and two images with 16-bit depth [29]. In addition, these images are all decomposed by the float 9/7-tap biorthogonal wavelet filters for lossy compression. When lossless compression is desired, the integer 5/3 DWT is used for the proposed method and JPEG2000, while the integer 9/7 DWT is used for CCSDS-IDC. BTCA-S4 uses a four-level DWT, and the other algorithms use a three-level DWT. The results of scan-based CCSDS-IDC and scan-based JPEG2000 are taken from Annex C of the CCSDS-IDC Green Book [17].
When the coefficients are almost all significant, the coding efficiency of BTC is not good. However, the coefficients in the lowest frequency subband are precisely almost all significant, and there are a lot of such coefficients when using only three or four levels of wavelet transform. Thus, the proposed method
TABLE III
PSNR (dB) FOR THE PROPOSED METHOD AND OTHER ALGORITHMS
WITHOUT ENTROPY CODING WITH FIVE LEVELS
OF WAVELET TRANSFORM
directly encodes the coefficients in the lowest frequency subband in this case; namely, when a coefficient is insignificant, a 0 is output; otherwise, a 1 and its sign are output.
TABLE IV
PSNR (dB) FOR SCAN-BASED CCSDS-IDC, SCAN-BASED
JPEG2000, AND THE PROPOSED METHOD
USING 8-BIT-DEPTH IMAGES
Because all the coefficients in the lowest frequency subband are usually positive, we can also omit the sign coding in this case.
TABLE V
AVERAGE PSNR (dB) FOR SCAN-BASED CCSDS-IDC, SCAN-BASED
JPEG2000, AND THE PROPOSED METHOD USING
IMAGES WITH DIFFERENT BIT DEPTH
TABLE VI
PERFORMANCE (BITS/PIXEL) OF LOSSLESS COMPRESSION FOR
SCAN-BASED CCSDS-IDC, SCAN-BASED JPEG2000,
AND THE PROPOSED METHOD
Table IV lists the experimental results of scan-based CCSDS-IDC, scan-based JPEG2000, and the proposed method using all the 8-bit-depth images. From the results, we can see that the average PSNR of BTCA-S3 is better than that of CCSDS-IDC at 1 bpp and 0.5 bpp. Because BTCA-S directly encodes the lowest resolution subband, its average PSNR is less than that of CCSDS-IDC at lower bit rates with three levels of wavelet transform. When using four levels of wavelet transform, the performance of BTCA-S is better than that of CCSDS-IDC at three bit rates and better than that of JPEG2000 in scan-based mode at 0.5 bpp and 0.25 bpp. At 1 bpp, the PSNR of JPEG2000 in scan-based mode is higher than that of BTCA-S4 by an average of only 0.04 dB over all 8-bit-depth images and is less than that of BTCA-S4 for the images of size 1024 × 1024. Table V lists the comparison of the average PSNR using images with 10-, 12-, and 16-bit depth, and a similar result is attained.
When lossless compression is desired, every bit plane needs to be encoded. However, the performance of the proposed method is not good at the last bit plane, so the proposed method directly encodes the insignificant coefficients at this time (i.e., when T_k = 0). Table VI lists the performance (bits/pixel) of lossless compression for scan-based CCSDS-IDC, scan-based JPEG2000, and the proposed method. From the results, we can see that the average bpp of BTCA-S3 is similar to that of CCSDS-IDC and only a little worse than that of JPEG2000.
C. Comparison of Visual Effects
Fig. 11 shows a visual comparison between BTCA-S and
scan-based CCSDS with the coastal-b3 image at 0.25 bpp,
and Fig. 12 shows the comparison with the lunar image at 0.5 bpp.
The recovered portion of the image has a size of 200 ×
100 pixels, while the original image size is 1024 × 1024 for
coastal-b3 and 512 × 512 for lunar, respectively. Because
the magnitudes of the wavelet coefficients around the edges
are often significant, BTCA can encode these coefficients
before other coefficients are scanned. When the algorithm stops at
a specified bit rate, it has encoded more significant coefficients
on the edges, and thus it improves the quality of the decoded
image. We can see that the image reconstructed by BTCA-S is
clearer than that by CCSDS in the rectangles.
D. Comparison of Encoding Time
Table VII shows a comparison of the CPU encoding times (s)
of CCSDS-IDC, SPIHT without arithmetic coding, and the
proposed method, all implemented in the C language. The results
are evaluated with three images of different sizes at three bit
rates. The program of SPIHT without arithmetic coding comes
from its official website [32], and the program of CCSDS can
be downloaded from [33], which is introduced in Annex B of
the CCSDS-IDC Green Book [17]. From the results, we can see
that the proposed method is only a little slower than SPIHT
without arithmetic coding but is faster than CCSDS with
entropy coding.
V. CONCLUSION AND DISCUSSION
In this paper, we proposed an on-board remote sensing image
codec based on a binary tree with an adaptive scanning order in
scan-based mode. We perform the line-based wavelet transform
on the available data, and subsets of wavelet coefficients are col-
lected into several scan elements. Then, we construct a binary
tree for each code block of a scan element. In each code block
and each bit plane, we traverse each level of the binary tree with
an adaptive scanning order. Experimental results show that the
proposed method significantly improves PSNR compared
with SPIHT without arithmetic coding and scan-based CCSDS-
IDC and is similar to scan-based JPEG2000. Our method is also
very fast. Being less complex, it is fully implementable in
either hardware or software. Hence, the proposed method is
well suited for on-board remote sensing image compression.
Because multicomponent images are very common in re-
mote sensing applications, to attain component scalability,
CCSDS-IDC encodes each spectral band separately, while our
proposal is able to encode all the spectral bands together as
follows. We can create scan elements for each spectral band
as in Section II-C1 and encode them with BTCA as in Section II-C2.
Then, we can allocate the bit rates of all scan elements from
all spectral bands with the method of Section II-C3. In this way,
random access to each band is still guaranteed for the proposed
method.
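As a purely hypothetical illustration of such joint allocation (the paper's actual scheme is the one of Section II-C3, which is not reproduced here), a simple proportional rule would split a total bit budget across the scan elements of all bands according to an estimated per-element coding cost:

```python
def allocate_rates(element_costs, total_bits):
    # Hypothetical proportional allocation: each scan element receives a
    # share of the total bit budget proportional to its estimated cost
    # (e.g., the sum of absolute wavelet coefficients in the element).
    total = float(sum(element_costs))
    return [total_bits * c / total for c in element_costs]

# Two bands with two scan elements each; the costs are illustrative numbers.
rates = allocate_rates([400, 100, 300, 200], total_bits=10000)
print(rates)  # [4000.0, 1000.0, 3000.0, 2000.0]
```

Because each element keeps its own budget, any single band can still be decoded independently, which is what preserves random access per band.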
In order to improve the coding performance, a common
strategy for hyperspectral images is to first decorrelate the image
in the spectral domain [3], [4], [10], [14], [34]-[36], e.g., with
a spectral DWT or principal component analysis. However, it
remains a problem for the proposed method to allocate bit rates
across the transformed bands in a low-complexity, low-memory,
and efficient way, which is our future work.
HUANG AND DAI: NEW ON-BOARD IMAGE CODEC BASED ON BINARY TREE 3749
Fig. 11. Comparison of BTCA-S and CCSDS with the coastal-b3 image at 0.25 bpp. The PSNRs of the reconstructed images are 39.68 dB and 39.13 dB,
respectively. There are richer lines in the top rectangle of the image reconstructed by BTCA-S, and the lines in the bottom rectangle are clearer with BTCA-S.
Because BTCA can preferentially encode the neighbors of the previously significant coefficients, where the edges are located, the quality of the decoded image on
the edges is higher for BTCA-S. (a) Original image. (b) Image reconstructed by BTCA-S. (c) Image reconstructed by CCSDS.
Fig. 12. Comparison of BTCA-S and CCSDS with the lunar image at 0.5 bpp. The PSNRs of the reconstructed images are 30.92 dB and 31.52 dB, respectively.
The boundaries of the objects in the rectangles of the image reconstructed by BTCA-S are clearer and more continuous. (a) Original image. (b) Reconstructed image
with BTCA-S. (c) Reconstructed image with CCSDS.
TABLE VII
CPU ENCODING TIMES (s) OF CCSDS-IDC, SPIHT WITHOUT
ARITHMETIC CODING, AND THE PROPOSED METHOD
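The PCA variant of the spectral decorrelation discussed above can be sketched as follows (a minimal NumPy illustration assuming a cube laid out as (bands, rows, cols); this is not an on-board-ready implementation):

```python
import numpy as np

def spectral_pca(cube):
    # Decorrelate a hyperspectral cube of shape (bands, rows, cols)
    # along the spectral axis via principal component analysis.
    b, h, w = cube.shape
    x = cube.reshape(b, -1).astype(np.float64)   # one row per band
    mean = x.mean(axis=1, keepdims=True)
    xc = x - mean
    cov = xc @ xc.T / xc.shape[1]                # band-by-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # strongest component first
    basis = eigvecs[:, order]
    pcs = basis.T @ xc                           # decorrelated "bands"
    return pcs.reshape(b, h, w), mean, basis

# Small synthetic cube: 4 bands of 8x8 pixels.
rng = np.random.default_rng(0)
cube = rng.normal(size=(4, 8, 8))
pcs, mean, basis = spectral_pca(cube)
# After PCA, the band covariance of the output is numerically diagonal.
out_cov = pcs.reshape(4, -1) @ pcs.reshape(4, -1).T / (8 * 8)
```

Each decorrelated "band" could then be fed to the 2-D coder as before; the open question raised here is precisely how to split bits among these transformed bands cheaply.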
ACKNOWLEDGMENT
The authors would like to thank the associate editor and
the three anonymous reviewers for thoughtful comments and
insightful suggestions which have brought improvements to this
manuscript.
REFERENCES
[1] D. Chaudhuri and A. Samal, "An automatic bridge detection technique for multispectral images," IEEE Trans. Geosci. Remote Sens., vol. 46, no. 9, pp. 2720-2727, Sep. 2008.
[2] B. Sirmacek and C. Unsalan, "Urban-area and building detection using SIFT keypoints and graph theory," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 4, pp. 1156-1167, Apr. 2009.
[3] Q. Du and J. E. Fowler, "Hyperspectral image compression using JPEG2000 and principal component analysis," IEEE Geosci. Remote Sens. Lett., vol. 4, no. 2, pp. 201-205, Apr. 2007.
[4] B. Penna, T. Tillo, E. Magli, and G. Olmo, "Transform coding techniques for lossy hyperspectral data compression," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 5, pp. 1408-1421, May 2007.
[5] P. Hou, M. Petrou, C. I. Underwood, and A. Hojjatoleslami, "Improving JPEG performance in conjunction with cloud editing for remote sensing applications," IEEE Trans. Geosci. Remote Sens., vol. 38, no. 1, pp. 515-524, Jan. 2000.
[6] J. M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3445-3462, Dec. 1993.
[7] A. Said and W. A. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees," IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 3, pp. 243-250, Jun. 1996.
[8] W. A. Pearlman, A. Islam, N. Nagaraj, and A. Said, "Efficient low-complexity image coding with a set-partitioning embedded block coder," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 11, pp. 1219-1235, Nov. 2004.
[9] D. Taubman, "High performance scalable image compression with EBCOT," IEEE Trans. Image Process., vol. 9, no. 7, pp. 1158-1170, Jul. 2000.
[10] X. Tang and W. A. Pearlman, "Three-dimensional wavelet-based compression of hyperspectral images," in Hyperspectral Data Compression. New York: Springer-Verlag, 2006, pp. 273-308.
[11] JPEG2000 Image Coding System, ISO/IEC Std. 15444-1, 2000.
[12] G. Yu, T. Vladimirova, and M. N. Sweeting, "Image compression systems on board satellites," Acta Astronaut., vol. 64, no. 9/10, pp. 988-1005, May/Jun. 2009.
[13] B. Li, R. Yang, and H. X. Jiang, "Remote-sensing image compression using two-dimensional oriented wavelet transform," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 1, pp. 236-250, Jan. 2011.
[14] P. Kulkarni, A. Bilgin, M. W. Marcellin, J. C. Dagher, J. H. Kasner, T. J. Flohr, and J. C. Rountree, "Compression of Earth science data with JPEG2000," in Hyperspectral Data Compression. New York: Springer-Verlag, 2006, pp. 347-378.
[15] Consult. Comm. Space Data Syst. [Online]. Available: http://www.ccsds.org
[16] CCSDS 122.0-B-1, Image Data Compression, Nov. 2005. [Online]. Available: http://public.ccsds.org/publications/archive/122x0b1c3.pdf
[17] CCSDS 120.1-G-1, Image Data Compression, Jun. 2007. [Online]. Available: http://public.ccsds.org/publications/archive/120x1g1e1.pdf
[18] F. García-Vílchez and J. Serra-Sagristà, "Extending the CCSDS recommendation for image data compression for remote sensing scenarios," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 10, pp. 3431-3445, Oct. 2009.
[19] S. T. Hsiang and J. W. Woods, "Embedded image coding using zeroblocks of subband/wavelet coefficients and context modeling," in Proc. Data Compress. Conf., Washington, DC, 2001, pp. 83-92.
[20] H. F. Ates and M. T. Orchard, "Spherical coding algorithm for wavelet image compression," IEEE Trans. Image Process., vol. 18, no. 5, pp. 1015-1024, May 2009.
[21] H. F. Ates and E. Tamer, "Hierarchical quantization indexing for wavelet and wavelet packet image coding," Signal Process., Image Commun., vol. 25, no. 2, pp. 111-120, Feb. 2010.
[22] A. Abrardo, M. Barni, E. Magli, and F. Nencini, "Error-resilient and low-complexity on-board lossless compression of hyperspectral images by means of distributed source coding," IEEE Trans. Geosci. Remote Sens., vol. 48, no. 4, pp. 1892-1904, Apr. 2010.
[23] J. Oliver and M. P. Malumbres, "Low-complexity multiresolution image compression using wavelet lower trees," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 11, pp. 1437-1444, Nov. 2006.
[24] G. Melnikov and A. K. Katsaggelos, "A jointly optimal fractal/DCT compression scheme," IEEE Trans. Multimedia, vol. 4, no. 4, pp. 413-422, Dec. 2002.
[25] T. Kim, H. M. Kim, P. S. Tsai, and T. Acharya, "Memory efficient progressive rate-distortion algorithm for JPEG 2000," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 1, pp. 181-187, Jan. 2005.
[26] W. Yu, F. Sun, and J. E. Fritts, "Efficient rate control for JPEG-2000," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 5, pp. 577-589, 2006.
[27] C. Chrysafis and A. Ortega, "Line-based, reduced memory, wavelet image compression," IEEE Trans. Image Process., vol. 9, no. 3, pp. 378-389, Mar. 2000.
[28] F. W. Wheeler and W. A. Pearlman, "SPIHT image compression without lists," in Proc. ICASSP, Istanbul, Turkey, 2000, pp. 2047-2050.
[29] Consult. Comm. Space Data Syst., CCSDS Image Test. [Online]. Available: http://cwe.ccsds.org/sls/docs/sls-dc/
[30] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, "Image coding using wavelet transform," IEEE Trans. Image Process., vol. 1, no. 2, pp. 205-220, Apr. 1992.
[31] D. J. Granrath, "The role of human visual models in image processing," Proc. IEEE, vol. 69, no. 5, pp. 552-561, May 1981.
[32] Center Image Process. Res. [Online]. Available: http://ipl.rpi.edu/
[33] Univ. Nebraska-Lincoln. [Online]. Available: http://hyperspectral.unl.edu/
[34] J. Fowler and J. T. Rucker, "3-D wavelet-based compression of hyperspectral imagery," in Hyperspectral Data Exploitation: Theory and Applications. Hoboken, NJ: Wiley, 2007, pp. 379-407.
[35] E. Magli, "Multiband lossless compression of hyperspectral images," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 4, pp. 1168-1178, Apr. 2009.
[36] H. Wang, S. D. Babacan, and K. Sayood, "Lossless hyperspectral-image compression using context-based conditional average," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 12, pp. 4187-4193, Dec. 2007.
Ke-Kun Huang received the B.S. and M.S. degrees
from Sun Yat-Sen University, Guangzhou, China, in
2002 and 2005, respectively.
He is currently with the Department of Mathemat-
ics, JiaYing University, Meizhou, China. His research
interests include image processing and face
recognition.
Dao-Qing Dai (M'09) received the B.Sc. degree
from Hunan Normal University, Changsha, China, in
1983, the M.Sc. degree from Sun Yat-Sen University,
Guangzhou, China, in 1986, and the Ph.D. degree
from Wuhan University, Wuhan, China, in 1990, all
in mathematics.
From 1998 to 1999, he was an Alexander von
Humboldt Research Fellow with the Free University
of Berlin, Berlin, Germany. He is currently a Professor
and Associate Dean of the Faculty of Mathematics and
Computing, Sun Yat-Sen University, Guangzhou. He
is an author or coauthor of over 100 refereed technical papers. His current
research interests include image processing, wavelet analysis, face recognition,
and bioinformatics.
Dr. Dai was the recipient of the Outstanding Research Achievements in
Mathematics Award from the International Society for Analysis, Applications,
and Computation, Fukuoka, Japan, in 1999. He served as a Program Cochair
of Sinobiometrics in 2004 and as a program committee member for several
international conferences.