IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 50, NO. 10, OCTOBER 2012 3737


A New On-Board Image Codec Based on Binary Tree
With Adaptive Scanning Order in Scan-Based Mode
Ke-Kun Huang and Dao-Qing Dai, Member, IEEE
Abstract—Remote sensing images offer a large amount of information but require on-board compression because of the storage and transmission constraints of on-board equipment. JPEG2000 is too complex to become a recommended standard for such missions, and CCSDS-IDC fixes most of the parameters and only provides quality scalability. In this paper, we present a new, low-complexity, low-memory, and efficient embedded wavelet image codec for on-board compression. First, we propose the binary tree as a novel and robust way of coding remote sensing images in the wavelet domain. Second, we develop an adaptive scanning order that traverses the binary tree level by level from the bottom to the top, so that better performance and visual quality are attained. Last, the proposed method is processed in a scan-based mode, which significantly reduces the memory requirement. The proposed method is very fast because it does not use any entropy coding or rate-distortion optimization, while it still provides quality, position, and resolution scalability. Being less complex, it is very easy to implement in hardware and very suitable for on-board compression. Experimental results show that the proposed method significantly improves the peak signal-to-noise ratio compared with SPIHT without arithmetic coding and with scan-based CCSDS-IDC, and performs similarly to scan-based JPEG2000.

Index Terms—Adaptive scanning order, binary tree coding, on-board application, remote sensing image compression, scan-based mode.
I. INTRODUCTION

THE application of remote sensing images has become very widespread, covering environment monitoring, geology detection, urban planning, etc. [1]–[3]. Along with the increasing demand for remote sensing data, sensor technology has been developed to improve the spatial and spectral resolution, which significantly increases the image data volume and requires ever more storage space and transmission capability. On the other hand, the storage and channel bandwidth of on-board equipment are limited. Therefore, high-fidelity and high-speed compression technology is highly desired to alleviate these contradictions.
Most coding systems are based on a transform stage [4], such as the Karhunen–Loève transform, the discrete cosine transform [5], and the wavelet transform. The discrete wavelet transform (DWT) provides an efficient way to compress images by separating an image into several scales and exploiting the correlations across scales as well as within each scale. Many well-known coding systems use the wavelet transform, such as EZW [6], SPIHT [7], SPECK [8], and EBCOT [9], and they have been applied to hyperspectral data in 3-D mode [10]. The JPEG2000 standard [11], using EBCOT as its kernel, is also a wavelet-based coding system and is usually taken as the reference technique. In practice, JPEG2000 has been employed in some space missions [12]. In [13], the wavelet transform in JPEG2000 is replaced by a 2-D oriented wavelet transform, and it outperforms JPEG2000 for remote sensing images. Since remote sensing data are often captured incrementally by sensors in a push-broom fashion and are quite large, a scan-based approach is also very desirable for handling the data. In [14], the authors present a scan-based method that enables the use of JPEG2000 with incrementally acquired data.

Manuscript received August 23, 2011; revised January 1, 2012 and January 24, 2012; accepted February 2, 2012. Date of publication March 16, 2012; date of current version September 21, 2012. This project was supported in part by the National Science Foundation of China (90920007, 11171354).
K.-K. Huang is with the Department of Mathematics, JiaYing University, Meizhou 514015, China (e-mail: kkcocoon@163.com).
D.-Q. Dai is with the Center for Computer Vision and the Department of Mathematics, Sun Yat-Sen University, Guangzhou 510275, China (e-mail: stsddq@mail.sysu.edu.cn).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2012.2187340
However, some components of JPEG2000 that help to provide high compression performance (context modeling, arithmetic coding, Lagrangian rate-distortion optimization) also have high implementation complexity. This limits the suitability of JPEG2000 for airborne or satellite-borne missions with high data-throughput rates and limited capacity for acquisition, storage, and transmission.
Since 1982, the Consultative Committee for Space Data Systems (CCSDS) [15] has been working toward the development of space data handling standards. In November 2005, CCSDS published a recommended standard, the CCSDS-IDC Blue Book [16], for a data compression algorithm based on the wavelet transform. In June 2007, the Green Book 120.1-G-1 [17], a tutorial on the IDC Recommendation, was published. The recommendation specifically targets use aboard spacecraft and focuses more on compression and less on options for handling and distributing compressed data. A careful tradeoff has been performed between compression performance and complexity, which makes it very suitable for on-board image compression. To date, more than 300 space missions have elected to fly with CCSDS protocols and realized the benefits: reduced cost, risk, and development time.
However, CCSDS-IDC only has quality scalability and does not allow interactive decoding. It fixes most of the parameters, limiting the choices of the end user; most significantly, the number of DWT levels is set to three. In [18], some prominent extensions to CCSDS-IDC have been presented, and compression performance is notably improved with respect to the recommendation for a large variety of remote sensing images.
In this paper, we present a new, low-complexity, low-memory, and efficient embedded wavelet image codec for on-board compression. The proposed method is based on a binary tree representation, which is related to EZBC and HIC.

TABLE I
FEATURE COMPARISON OF SPIHT, EZBC, SCAN-BASED CCSDS, SCAN-BASED JPEG2000, AND THE PROPOSED METHOD

EZBC is presented in [19]. In this scheme, a quadtree representation of wavelet coefficients is established for individual subbands, and the context models are carefully designed for coding quadtree nodes at different tree levels and subbands. In [20], the spherical representation (SPHE) has been introduced, which uses local energy as a direct measure to differentiate between parts of the wavelet subband and to decide how to allocate the available bitrate. In [21], a hierarchical classification (HIC) map is defined in each wavelet subband, which describes the quantized data through a series of index classes and extends SPHE.
EZBC and HIC build good models to match the true distribution of the wavelet-domain information. The coefficients are classified or partitioned into significant and insignificant sets so that large sets of insignificant coefficients are coded at low cost. The disadvantage of these coders is that they too have high computational complexity, particularly when using entropy coding. However, without entropy coding, the performance of these methods is not very good. Moreover, EZBC uses a quadtree representation, which is not fine enough, and HIC cannot generate embedded code streams. These methods are therefore not suitable for remote sensing image compression, because it is an on-board, real-time process and must be practical in design, particularly regarding computational complexity.
The proposed method builds a binary tree model similar to EZBC and HIC but is designed for low complexity and low memory. In addition, the proposed method uses an adaptive scanning order so that better performance and visual quality are attained. Furthermore, the proposed method is processed in a scan-based mode, which significantly reduces the memory requirement. The proposed method is very fast because it does not use any entropy coding or rate-distortion (R-D) optimization, while it has quality, position, and resolution scalability. The proposed method, based on binary tree coding with adaptive scanning order in scan-based mode, is called BTCA-S.

There are some other low-complexity algorithms based on the wavelet transform, such as the distributed source coding (DSC) [22] and lower tree wavelet (LTW) [23] encoders. However, DSC is only for lossless compression, and LTW is not a bit-plane encoder, so it does not provide quality, position, or resolution scalability and is hard to rate-control. Both also use entropy coding.

In summary, Table I shows the feature comparison among SPIHT, EZBC, scan-based CCSDS, scan-based JPEG2000, and the proposed method.
The remainder of the paper is organized as follows. In Section II, we present the proposed method based on the binary tree with adaptive scanning order in scan-based mode. In Section III, the time and space complexity of the proposed method is analyzed. The experimental results are given in Section IV. Finally, the conclusion is provided in Section V.

Fig. 1. Remote sensing image and a subband of its wavelet transform. In (b), the white positions are where the significant coefficients are located. It can be seen that there are large contiguous areas where the insignificant coefficients are located. (a) A remote sensing image: coastal-b1. (b) A subband of the wavelet transform of coastal-b1.
II. PROPOSED METHOD

In this section, we first propose a binary tree coding algorithm, called BTC, for each code block of a wavelet subband. Based on BTC, the adaptive scanning order is introduced, which is called BTCA. Then, BTCA is processed in scan-based mode, and the final proposed method is called BTCA-S.
A. Binary Tree Coding

The energy in wavelet subbands is usually clustered, which motivates the use of spatially varying models in coding the wavelet information. A subband of the wavelet transform of a remote sensing image is shown in Fig. 1(b). The white positions are where the coefficients significant with respect to the threshold 8 are located, and the others are insignificant coefficients. It can be seen that there are large contiguous areas where the insignificant coefficients are located. Hence, we can divide the subband into several subblocks of the same size, such as 8 × 8 or 16 × 16, and then check whether each contains significant coefficients. If a subblock does not contain significant coefficients, then we can use a single binary bit to encode the entire block. If there are many such subblocks, then a high compression ratio (CR) is attained. However, the choice of subblock size is a problem. If the size is too big, then the proportion of insignificant coefficients in significant blocks increases, and the coding performance is reduced. If the size is too small, then there are more contiguous insignificant subblocks, and the CR cannot be further improved.
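The subblock significance test described above can be sketched as follows. This is a minimal illustration of the idea, not the paper's implementation; the function name and the use of NumPy are our own:

```python
import numpy as np

def insignificant_fraction(subband, block_size, threshold):
    """Fraction of block_size-by-block_size subblocks in which every
    coefficient magnitude is below the threshold; each such subblock
    could be coded with a single '0' bit."""
    h, w = subband.shape
    total = insig = 0
    for r in range(0, h, block_size):
        for c in range(0, w, block_size):
            total += 1
            block = subband[r:r + block_size, c:c + block_size]
            if np.abs(block).max() < threshold:
                insig += 1
    return insig / total
```

For a 4 × 4 subband with a single significant coefficient, a 2 × 2 subblock size leaves three of the four subblocks insignificant, so the function returns 0.75; a 4 × 4 subblock size returns 0, illustrating the size tradeoff described above.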
To solve this problem, EZBC [19] adopts a quadtree splitting method. Each quadtree node splits into four insignificant descendant nodes at the next lower level once it tests as significant against the threshold of the current bit-plane coding pass. This scheme in EZBC is not fine enough. Suppose there is a code block as in Fig. 2. With quadtree coding in EZBC, we first encode '1' because the 4 × 4 block contains significant coefficients. Then, the four 2 × 2 subblocks are coded. We also encode '1' because the upper left 2 × 2 subblock contains significant coefficients, and then the four coefficients are encoded with '0100'. Finally, the resulting code is '1101001010000', with a total of 13 bits. However, we can get better performance using the binary tree coding method of this paper.
Fig. 2. Simple block to be encoded. Almost all coefficients are insignificant except two.

Fig. 3. Linear indexing within a code block. The 2-D data are transformed into 1-D using the Morton scanning order.

We provide the notation for the binary tree as follows. Suppose there is a code block B of a wavelet image with size 2^N × 2^N. Let B(r, c), 0 ≤ r, c < 2^N, denote the coefficients of the code block B. To facilitate the description, the 2-D data are transformed into 1-D using the Morton scanning order [28]. We can represent the row index in binary, r = [r_{N−1} r_{N−2} . . . r_1 r_0], and do the same for the column index, c = [c_{N−1} c_{N−2} . . . c_1 c_0]. Let p = [r_{N−1} c_{N−1} r_{N−2} c_{N−2} . . . r_1 c_1 r_0 c_0] and V(p) = B(r, c); then the linear index p ranges from 0 to 2^N × 2^N − 1. Thus, V can represent B with a single index instead of two. Fig. 3 shows an example of this indexing scheme.
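The bit interleaving that produces the linear index p can be written compactly; the following is a small illustrative sketch (the function name is our own):

```python
def morton_index(r, c, n):
    """Interleave the n row bits and n column bits into the linear
    index p = [r_{n-1} c_{n-1} ... r_0 c_0]: row bit i goes to bit
    position 2i + 1 of p, and column bit i goes to position 2i."""
    p = 0
    for i in range(n):
        p |= ((r >> i) & 1) << (2 * i + 1)  # row bit
        p |= ((c >> i) & 1) << (2 * i)      # column bit
    return p
```

In a 4 × 4 block (n = 2), B(1, 0) maps to p = 2 and B(2, 1) maps to p = 9, so the four coefficients of each 2 × 2 quadrant occupy consecutive linear indexes, which is what makes square insignificant regions collapse into single tree nodes.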
From the bottom to the top, we construct the binary tree whose nodes are Γ(t) for 1 ≤ t < 2S, where S = 2^N × 2^N. The bottom level of the binary tree consists of all coefficients in Morton scanning order:

Γ(t) = |V(t − S)| for S ≤ t < 2S.

The upper levels of the tree are defined iteratively as follows:

Γ(t) = max{Γ(2t), Γ(2t + 1)} for 1 ≤ t < S.

Γ(2t) and Γ(2t + 1) are called each other's brother, and they are called the children of Γ(t); Γ(t) is called the parent of Γ(2t) and Γ(2t + 1). The tree depth is D = N + N + 1. For example, a code block and its binary tree are shown in the left and right of Fig. 4, respectively. The bottom level of the binary tree consists of four nodes: Γ(4) = |V(4 − 2 × 2)| = 1, Γ(5) = 0, Γ(6) = 2, and Γ(7) = 4. The upper level consists of two nodes: Γ(2) = max{Γ(4), Γ(5)} = 1 and Γ(3) = 4. The top level consists of one node: Γ(1) = max{Γ(2), Γ(3)} = 4.
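This construction amounts to filling a max-heap over the Morton-ordered magnitudes; a minimal sketch follows (the array layout and function name are ours):

```python
def build_binary_tree(v):
    """Build the binary-tree array for a block whose S coefficients
    (list v, in Morton order) form the leaves: gamma[S + i] = |v[i]|,
    and every inner node is the max of its two children, so gamma[1]
    holds the largest magnitude in the block."""
    s = len(v)
    gamma = [0] * (2 * s)          # gamma[0] unused; nodes are 1..2S-1
    for i, x in enumerate(v):
        gamma[s + i] = abs(x)
    for t in range(s - 1, 0, -1):  # fill the upper levels bottom-up
        gamma[t] = max(gamma[2 * t], gamma[2 * t + 1])
    return gamma
```

For the 2 × 2 block of Fig. 4 with leaf magnitudes 1, 0, 2, 4, this yields Γ(2) = 1, Γ(3) = 4, and Γ(1) = 4, matching the example in the text.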
When a binary tree has been constructed for a code block, we can traverse the tree by depth as follows: for each bit plane, going from the top to the bottom of the subtree, if a tree node is insignificant, then it is coded by '0'; otherwise, it is coded by '1', and the process is recursively applied to the two children. If the process reaches the bottom of the tree and the corresponding coefficient is significant, then the sign of the coefficient is coded. In this way, we can quickly zoom in to high-energy areas, and regions of all-zero pixels can be compactly represented. There are two kinds of codes that can be neglected.

Fig. 4. Code block (left) and its binary tree (right). The bottom level of the binary tree consists of all coefficients in Morton scanning order. The value of every other node is the maximum of its two children.

If a node has been coded as significant at a larger threshold, then we know this node must be significant. For example, the root of the tree in Fig. 4 is equal to 4; certainly, it is larger than 2.

If a node has a significant parent and the brother of the node has just been coded as insignificant, then we know this node must be significant. For example, the root of the tree in Fig. 4 is equal to 4, and its left child is equal to 1, which is less than 4. Hence, the right child must be equal to 4.
The detailed steps of the proposed method can be described by a function TravDep as follows, where t is the index of a node of the binary tree and T_k is a threshold with T_0 = 2^⌊log2(Γ(1))⌋ and T_k = T_0/2^k. The function traverses the subtree whose root is the node Γ(t) depth first.

Function code = TravDep(Γ, t, T_k)
  If Γ(t) has been coded as significant at a larger threshold, namely, Γ(t) ≥ T_{k−1},
    If t < S,
      cl = TravDep(Γ, 2t, T_k);
      cr = TravDep(Γ, 2t + 1, T_k);
      code = {cl, cr}.
    Else
      code = sign(V(t − S)).
  Else if Γ(t) has a significant parent and the brother of Γ(t) has just been coded as insignificant, namely, t > 1, t mod 2 = 1, and Γ(t − 1) < T_k,
    If t < S,
      cl = TravDep(Γ, 2t, T_k);
      cr = TravDep(Γ, 2t + 1, T_k);
      code = {cl, cr}.
    Else
      code = sign(V(t − S)).
  Else if Γ(t) ≥ T_k,
    If t < S,
      cl = TravDep(Γ, 2t, T_k);
      cr = TravDep(Γ, 2t + 1, T_k);
      code = {1, cl, cr}.
    Else
      code = {1, sign(V(t − S))}.
  Else
    code = 0.
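A direct transcription of TravDep into Python may help make the three branches concrete. This is a sketch, not the authors' code: the sign convention ('1' for non-negative) and the handling of the previous threshold are our assumptions, since the pseudocode leaves them open:

```python
def trav_dep(gamma, v, t, tk, tk_prev=None):
    """Depth-first significance pass over the subtree rooted at node t.
    gamma: tree array of length 2S (max of children at inner nodes),
    v: coefficients in Morton order, tk: current threshold,
    tk_prev: previous larger threshold (None on the first pass).
    Returns the emitted bit string."""
    s = len(gamma) // 2
    # Case 1: already coded significant at a larger threshold.
    known_sig = tk_prev is not None and gamma[t] >= tk_prev
    # Case 2: parent significant and brother just coded insignificant,
    # so this node's significance is implied and its '1' is omitted.
    implied_sig = (t > 1 and t % 2 == 1
                   and gamma[t // 2] >= tk and gamma[t - 1] < tk)
    if known_sig or implied_sig:
        if t < s:
            return (trav_dep(gamma, v, 2 * t, tk, tk_prev)
                    + trav_dep(gamma, v, 2 * t + 1, tk, tk_prev))
        return '1' if v[t - s] >= 0 else '0'   # sign bit only
    if gamma[t] >= tk:                          # newly significant
        if t < s:
            return ('1' + trav_dep(gamma, v, 2 * t, tk, tk_prev)
                    + trav_dep(gamma, v, 2 * t + 1, tk, tk_prev))
        return '1' + ('1' if v[t - s] >= 0 else '0')
    return '0'                                  # insignificant node
```

For the Fig. 4 block with leaves 1, 0, 2, 4 (tree array [_, 4, 1, 4, 1, 0, 2, 4]) and T_0 = 4, the pass emits '1' at the root, '0' for the left subtree, then descends the implied-significant right subtree to the leaf 4, producing '1001' including the sign bit.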
Fig. 5. Comparison of the histograms of two kinds of coefficients (excluding significant coefficients). About 40% of the coefficient magnitudes in (a) are greater than 4, while about 13% are in (b). Hence, we can scan the neighborhood of the previous significant coefficients first. (a) The significant-like neighborhood. (b) All the coefficients.
The function TravDep is described above in recursive form. To accelerate it, it can be converted into a nonrecursive function using a stack. We can regard an entire wavelet image as one code block and encode it with the function TravDep(Γ, 1, T_k) at each bit plane. This method is called BTC. The binary tree can be represented by an array, which is a very simple structure and is thus easy to store in restricted scenarios.

The 2-D data can also be transformed into 1-D using the raster scanning order (line-by-line scanning) instead of the Morton scanning order. However, this decreases the peak signal-to-noise ratio (PSNR) a little, by about 0.25 dB on average, because many insignificant square blocks of the wavelet domain can be represented by nodes of the binary tree constructed with the Morton scanning order, while the nodes of the binary tree constructed with the raster scanning order can only represent insignificant linear blocks, which occur less often. The Morton scanning can also be replaced by a Hilbert (fractal-like) scanning [24], with which almost the same performance is attained.
Using the BTC algorithm to encode the block in Fig. 2, we get the resulting code '11110011000' without sign coding, with a total of 11 bits, which is more efficient than EZBC.
When the size of the code block is arbitrary, such as W × H, which is not equal to 2^N × 2^N, we can deal with it as follows. First, we linearly index the code block in Morton scanning order into V(p), 0 ≤ p < S_0, where S_0 = W × H. Let D = ⌈log2(S_0)⌉ + 1 and S = 2^{D−1}; then we construct the binary tree whose nodes are Γ(t) for 1 ≤ t < 2S as above, where Γ(t) = |V(t − S)| for S ≤ t < S + S_0 and Γ(t) = 0 for S + S_0 ≤ t < 2S. In order to avoid unnecessary code for the S − S_0 coefficients padded with zero value, we calculate the number of invalid nodes at each level of the tree, v(d) for 1 ≤ d ≤ D, where v(D) = S − S_0 and v(d) = ⌊v(d + 1)/2⌋ for 1 ≤ d ≤ D − 1. Then, the index set U of the invalid nodes that the function TravDep may visit can be obtained as follows.

U = ∅.
For d = 1 to D
  If mod(v(d), 2) = 1,
    U = U ∪ {2^d − v(d)}.

Finally, when Γ(t) ≥ T_k, the function TravDep needs to be modified as follows:

Else if Γ(t) ≥ T_k,
  While t < S and 2t + 1 ∈ U,
    t = 2t.
  If t < S,
    cl = TravDep(Γ, 2t, T_k);
    cr = TravDep(Γ, 2t + 1, T_k);
    code = {1, cl, cr}.
  Else
    code = {1, sign(V(t − S))}.

For example, suppose there is a code block of size 1 × 5 indexed by V(0) = 0, V(1) = 1, V(2) = 0, V(3) = 0, and V(4) = 1, where S_0 = 5, D = 4, and S = 8. The numbers of invalid nodes at each level are v(4) = 3, v(3) = 1, v(2) = 0, and v(1) = 0, and the set of possibly visited invalid nodes is U = {7, 13}. Using the modified function TravDep to encode the binary tree, we get the resulting code '111001' without sign coding.
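The computation of v(d) and U can be sketched directly (the function name is ours):

```python
import math

def invalid_node_set(s0):
    """Set U of invalid node indexes that TravDep might visit when
    s0 coefficients are zero-padded up to S = 2**(D - 1) leaves,
    following v(D) = S - s0 and v(d) = floor(v(d + 1) / 2)."""
    depth = math.ceil(math.log2(s0)) + 1   # tree depth D
    s = 2 ** (depth - 1)                   # number of leaves S
    v = {depth: s - s0}                    # invalid nodes per level
    for d in range(depth - 1, 0, -1):
        v[d] = v[d + 1] // 2
    return {2 ** d - v[d] for d in range(1, depth + 1) if v[d] % 2 == 1}
```

For the 1 × 5 example, invalid_node_set(5) gives {7, 13}, matching the text, and a block whose length is already a power of two yields the empty set.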
B. Adaptive Scanning Order

The wavelet coefficients on edges are often significant. Moreover, because the brightness around the edges of a natural image usually changes gradually, the magnitudes of the wavelet coefficients around the edges are often large too. In Fig. 1(b), for example, the coefficients whose magnitudes are greater than 4 but less than 8 are located at the gray positions, and the remaining black positions are where the coefficients whose magnitudes are less than 4 are located. It can be seen that the gray coefficients tend to lie around the white coefficients.

To facilitate the description, the four coefficients to the left, right, top, and bottom of a coefficient are called the neighborhood of the coefficient, and the neighborhood of the previous significant coefficients is called the significant-like neighborhood.

Excluding the previous significant coefficients, whose magnitudes are greater than 8, we can obtain the histogram of the significant-like neighborhood as in Fig. 5(a). The number on the horizontal axis is the magnitude of the coefficients, and the number on the vertical axis is the number of coefficients in the corresponding interval. It can be seen that about 40% of the coefficient magnitudes are greater than 4, and 70% of them are greater than 2. This histogram is very different from the histogram of all the coefficients in Fig. 5(b), where only about 13% of the coefficient magnitudes are greater than 4 and 26% are greater than 2.

Fig. 6. Adaptive scanning order in the binary tree. The gray nodes are the previous significant nodes. Going from the bottom to the top of the tree, if a node is insignificant and its brother is significant, then the subtree whose root is that node is traversed first.
According to the above analysis, we can scan the significant-like neighborhood before other regions are scanned. When the algorithm stops at a certain point, it has encoded more significant coefficients, and thus it improves the quality of the decoded image, particularly on the edges. In the binary tree, we can design an adaptive scanning order as follows: going from the bottom to the top of the tree, if a node is insignificant and its brother is significant with respect to the larger threshold, then the subtree whose root is that node is traversed first. For example, suppose there is a tree as in Fig. 6. The gray nodes are previous significant nodes, and the scanning order is labeled by the numbers in the white nodes. At the bottom level, the nodes labeled '1' and '2' are encoded first, because their brothers are already significant, while the nodes labeled '4', '5', '7', and '8' are not encoded at this time. At the next upper level, the subtree whose root is the node labeled '3' is encoded by the function TravDep, so the nodes labeled '4' and '5' may be encoded by the function TravDep if the node labeled '3' is significant.

Combining this with the binary tree coding algorithm, the detailed steps of the adaptive scanning order for binary tree coding at each bit plane can be described by a function TravLev as follows. This method is called BTCA. Before traversing the tree by levels with this function, we should traverse the tree by depth with T_0, namely, code = TravDep(Γ, 1, T_0), so that there are previous significant nodes for the smaller thresholds T_k, k > 0.
Function code = TravLev(T_k)
  From the bottom to the top of the tree, we find the brothers of the previous significant nodes to traverse. Namely, let d = D and repeat the following steps while d > 1.
    For t = 2^{d−1} to 2^d − 1
      Let ct = ∅. If Γ(t) ≥ T_{k−1},
        If t mod 2 = 0 and Γ(t + 1) < T_{k−1}, then ct = TravDep(Γ, t + 1, T_k);
        Else if t mod 2 = 1 and Γ(t − 1) < T_{k−1}, then ct = TravDep(Γ, t − 1, T_k).
      code = {code, ct}.
    d = d − 1.
Here, we give a toy example. Suppose there is a code block as in Fig. 7(a). We can construct the binary tree Γ(t) for 1 ≤ t < 2 × 2^2 × 2^2 as in Fig. 7(b). With T_0 = 4, code = TravDep(Γ, 1, T_0). Because Γ(1) = 7 ≥ T_0 and t = 1 < 4 × 4, code = {1, cl, cr} for cl = TravDep(Γ, 2, T_0) and cr = TravDep(Γ, 3, T_0). We get TravDep(Γ, 2, T_0) = {1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0}, in which the 5th and the 11th bits are the signs of the significant coefficients 7 and −5, respectively. Similarly, TravDep(Γ, 3, T_0) = {1, 0, 0, 0, 1}, in which the last bit is the sign of the significant coefficient 4. Hence, we get the code {1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1} with T_0 = 4.

With T_1 = 2, code = TravLev(T_1). When d = 5, we scan the nodes Γ(17), Γ(22), and Γ(30) first and get the code {1, 1, 1, 1, 1, 0}. When d = 4, we scan the nodes Γ(9), Γ(10), and Γ(14) and get the code {1, 0, 0, 0, 1, 0, 1}. When d = 3, we scan the node Γ(6) and get the code {0}. Hence, we get the code {1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0} with T_1 = 2.
After the function TravLev(T_1) is finished, a refinement pass is performed in order to refine all the coefficients found to be significant. This pass decides whether each such coefficient lies in the lower or upper half of the uncertainty interval for the given threshold, so as to minimize the quantization error. That is, a coefficient in the upper half of the uncertainty interval is coded with '1'; otherwise, it is coded with '0'. In the example, there are three previous significant coefficients, 7, −5, and 4; the refinement code is {1, 0, 0}, and they are decoded to 7, −5, and 5. Finally, we get the resulting code '1111110010001000111111010001010100' for the BTCA method, and the decoded block is shown in Fig. 7(c).
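The refinement decision and the decoder's midpoint reconstruction can be sketched as follows (the helper names are ours):

```python
def refine_bit(mag, lo, hi):
    """Encoder side: '1' if the magnitude lies in the upper half of
    the current uncertainty interval [lo, hi)."""
    return '1' if mag >= (lo + hi) // 2 else '0'

def reconstruct(lo, hi, bit):
    """Decoder side: keep the half of the interval selected by the
    refinement bit and reconstruct at its midpoint."""
    mid = (lo + hi) // 2
    lo, hi = (mid, hi) if bit == '1' else (lo, mid)
    return (lo + hi) // 2
```

With the interval [4, 8) of the example, magnitudes 7, 5, and 4 yield the bits 1, 0, 0 and are reconstructed as 7, 5, and 5, exactly as in the text.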
The adaptive scanning order scans the coefficients closest to the previous significant coefficients first, then the next closest, and so on. Because edges usually lie around the significant coefficients and the proposed method preferentially encodes these regions, the quality of the edges of the decoded image can be improved at a specified bit rate. In other words, the adaptive scanning order serves as a means to adapt to the edges, particularly horizontal or vertical edges.

The closer the coefficients are to the previous significant coefficients, the more likely they are to be significant. Hence, when using the function TravLev(T_k), the resulting code is usually more efficient at higher tree levels, such as d = D or d = D − 1, than at lower tree levels, such as d = 3 or d = 2. Moreover, the code of the refinement pass is usually more efficient than that of the lower tree levels too. Thus, the refinement pass can be performed when d = D − 3 in the function TravLev(T_k) to get better performance. For example, if the size of a block to be coded is 2^6 × 2^6, then the tree depth is D = 13, and the refinement pass is performed when d = 10.
Fig. 7. Toy example. In (b), the numbers in the circles are the values of the tree nodes, and the numbers near the circles are the indexes of the nodes of the binary tree. The gray nodes are the previous significant nodes, whose magnitudes are greater than 4. (a) A code block. (b) The binary tree of (a). (c) The decoded block.

C. Scan-Based Processing

If we directly use the binary tree to encode a large image, namely, if an entire wavelet image is regarded as one code block, then a lot of memory is needed to store the binary tree. As an alternative, we can divide the entire wavelet image into several code blocks, encode them separately with the proposed method, and combine this with a rate-distortion optimization algorithm [9]. For each code block and each bit plane, the points at the end of the refinement pass and of each adaptive traversing level can be taken as the primitive truncation points. Because of the adaptive scanning order, the distortion rates of these truncation points are almost monotonically decreasing, so more finely embedded bit streams than those of EBCOT can be obtained. To decrease the computational cost of the rate-distortion optimization to a certain extent, we can use the memory-efficient progressive rate-distortion optimization algorithm [25] or the greedy heap-based rate-control algorithm [26] instead of the PCRD algorithm used in EBCOT [9]. Finally, we obtain one truncation point in each code block at a specified bit rate. Hence, only a small number of final truncation points need to be stored, equal to the number of code blocks. Combined with the rate-distortion optimization, this variant gets better performance (gains of about 0.2 dB) and requires less memory than BTCA, and it has a rich set of features like JPEG2000, such as quality, position, and resolution scalability [9].

However, the rate-distortion optimization is expensive, because there can be many unnecessary coding units, and the distortion rates must be calculated to minimize the distortion for a target bit rate. To decrease the computational cost, we propose a scan-based method based on BTCA without rate-distortion optimization, which also has quality, position, and resolution scalability. This method is called BTCA-S.
Since remote sensing data are often captured incrementally by sensors in a push-broom fashion and are quite large, a scan-based approach is also very desirable for handling the data. In [14], the authors present a scan-based method that enables the use of JPEG2000 with incrementally acquired data. CCSDS-IDC [16] also recommends a stripe-based method. BTCA-S uses the same line-based wavelet transform and scan elements as [14] and [16], but its coding algorithm is different. The complex context-based entropy coding and rate-distortion optimization used in [14] are too expensive to become a recommended standard for space missions [16]. The CCSDS-IDC recommendation has been designed for real remote sensing scenarios and provides an excellent tradeoff between performance and complexity, but it only has quality scalability, and the number of DWT levels is fixed at three. In contrast, BTCA-S provides quality, position, and resolution scalability without any entropy coding or rate-distortion optimization and can use any number of DWT levels.

1) Creating Scan Elements: In [27], a line-based wavelet transform is proposed. For a given filter length and level of decomposition, the memory requirement of the wavelet transform in this approach depends only on the width of the image, rather than on the total size as in a traditional row-column filtering implementation, yet it yields the same results as the traditional one. The approach is based on the following ideas. Each time we receive a line of the original image, we perform horizontal filtering and store the data in a circular buffer. After we have received enough lines, we can perform vertical filtering. The next level of decomposition is based on the same principle and is carried out whenever there are enough rows and columns. Let 2F + 1 be the maximum length of the filters.
Fig. 8. Relationship between scan elements (right) and resolution subbands (left) for a two-level wavelet transform of a single component. Regions in different subbands shaded with the same color constitute a scan element. That is, the wavelet transform data can be rearranged to form scan elements.

The buffer size for filtering is B_f = L(2F + 1), and the buffer size for synchronization is B_s = (2^L − L − 1)F. The total circular buffer size needed for L levels of wavelet decomposition is B_f + B_s = L(2F + 1) + (2^L − L − 1)F rows [27]. For example, when we use the CDF 9/7 filters with L = 3 levels of decomposition, the total buffer size is 3 × 9 + (2^3 − 3 − 1) × 4 = 43 rows. If there are many rows in the original image, then this saves much memory for the wavelet transform.
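The buffer-size formula is easy to check numerically (the function name is ours):

```python
def circular_buffer_rows(filter_len, levels):
    """Total circular-buffer rows for an L-level line-based DWT [27]:
    B_f + B_s = L(2F + 1) + (2**L - L - 1)F, where 2F + 1 is the
    maximum filter length."""
    f = (filter_len - 1) // 2            # F from filter length 2F + 1
    b_filter = levels * filter_len       # B_f = L(2F + 1)
    b_sync = (2 ** levels - levels - 1) * f  # B_s = (2^L - L - 1)F
    return b_filter + b_sync
```

Here circular_buffer_rows(9, 3) returns 43, reproducing the CDF 9/7, three-level example in the text; with a single level, only the L(2F + 1) filtering rows remain.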
The line-based wavelet transform is performed on the available data, and subsets of wavelet coefficients are collected into scan elements. Each scan element consists of data from a certain number of rows of each resolution subband, nominally corresponding to a stripe of the image in the spatial domain. For example, Fig. 8 shows two scan elements for a two-level wavelet transform of an image of size 16 × 16. Each scan element contains the wavelet coefficients of four image rows, namely, 48 coefficients from the three subbands of the highest resolution, 12 coefficients from the subbands of the second highest resolution, and 4 coefficients from the lowest frequency subband. In Fig. 8, regions in different subbands shaded with the same color correspond to the same region of the image in the spatial domain and constitute a scan element. The formation of scan elements from small amounts of data from different subbands is possible because of the use of the incremental wavelet transform. This concept of scan elements can be extended to any number of resolution levels.

If there are too many rows in a scan element, then much memory is needed. However, if there are too few rows, then the performance decreases obviously. Hence, the proposed method trades off the memory requirement against performance by setting the number of rows in each scan element to 32, except that the last scan element may contain fewer than 32 rows. In fact, if there are 64 rows in each scan element, the mean PSNR is only about 0.02 dB away from what is obtained when each scan element contains 32 rows.
2) Encoding Scan Element: As soon as a scan element is available, we can encode it by BTCA, which has quality scalability because BTCA is an embedded coding algorithm. In order to obtain position and resolution scalability, we can further divide the scan element into several code blocks and encode them with BTCA. More precisely, we can take each subband of a scan element as a code block, based on the following ideas. The scan element contains only 32 rows of wavelet coefficients, so the subbands in a scan element are flat rectangles. For example, when an image of size 500 × 500 is decomposed by a three-level wavelet transform and a scan element of size 32 × 500 is available, the sizes of the subbands of each resolution in the scan element are 16 × 200, 8 × 100, and 4 × 50, respectively. If we use a fixed code block size such as 32 × 32, a code block may contain coefficients from more than one subband, which decreases the coding efficiency and forfeits resolution scalability. Hence, we take each subband of a scan element as a code block. If a subband is too wide, for example, wider than 512, it can be divided into two code blocks, which cannot raise the PSNR according to the experiment but provides finer position scalability.
To allocate the bit rates for the code blocks of a scan element adaptively, we propose a scanning order across the binary trees as follows. Suppose the size of a scan element is 2^M × 2^N after a two-level wavelet transform, in which there are seven subbands: four of size 2^(M-2) × 2^(N-2) and three of size 2^(M-1) × 2^(N-1). We construct a binary tree for each subband, namely, B_1, B_2, ..., B_7 for the seven subbands, with levels D_1 = D_2 = D_3 = D_4 = M + N - 3 and D_max = D_5 = D_6 = D_7 = M + N - 1, respectively.
Then, we can traverse the binary trees with the function TravLevs.

Function code = TravLevs(T_k)
  Let D = D_max. Repeat the following steps while D > 1.
    For j = 1 to 7
      Let d = D. If j <= 4, then d = D - 2.
      If d <= 0, continue.
      For t = Σ_{i=0}^{d-1} 2^i + 1 to Σ_{i=0}^{d} 2^i
        Let ct = ∅. If B_j(t) >= T_{k-1},
          If t mod 2 = 0 and B_j(t + 1) < T_{k-1}, then
            ct = TravDep(B_j, t + 1, T_k);
          Else if t mod 2 = 1 and B_j(t - 1) < T_{k-1}, then
            ct = TravDep(B_j, t - 1, T_k).
        code = (code, ct).
    D = D - 1.
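A minimal sketch of the underlying binary tree may help make the indices concrete: in a 1-indexed heap layout, level d of the tree occupies exactly the node range 2^d to 2^(d+1) - 1 used by the bounds of the t loop above. The helper names are ours, and only the tree construction is shown, not the full traversal:

```python
def build_binary_tree(coeffs):
    """Binary max-magnitude tree in a 1-indexed heap layout.

    Leaves (the bottom level) hold coefficient magnitudes; every internal
    node holds the maximum of its two children, so node 1 gives the
    maximum magnitude of the whole code block. len(coeffs) must be a
    power of two.
    """
    n = len(coeffs)
    tree = [0] * (2 * n)
    for i, c in enumerate(coeffs):
        tree[n + i] = abs(c)
    for i in range(n - 1, 0, -1):                # one comparison per node
        tree[i] = max(tree[2 * i], tree[2 * i + 1])
    return tree

def level_range(d):
    """Node indices of tree level d (root = level 0): 2^d .. 2^(d+1) - 1."""
    return 2 ** d, 2 ** (d + 1) - 1

tree = build_binary_tree([3, -7, 0, 2, 5, -1, 4, 6])
print(tree[1], level_range(2))  # 7 (4, 7)
```

The sibling tests in TravLevs (t + 1 for even t, t - 1 for odd t) work because the two children of node i sit at the adjacent indices 2i and 2i + 1 in this layout.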
The function TravLevs is similar to the function TravLev. The difference is that the former traverses multiple trees level by level at the same time, because the coding efficiency of BTCA for different code blocks is very similar at the same bit plane and the same level of the binary tree. The function TravLevs is easy to extend to higher levels of wavelet transform, in which there are more subbands.
Because the function TravLevs is an embedded coding process, it can be stopped at any CR. When each specified CR (quality layer) is reached, we record the length of the bits for each code block in the header of the coding stream. For example, suppose there are Q quality layers and C code blocks. Let l_c^q denote the length of the bits for the cth code block at the qth quality layer. Then, the Q·C numbers l_c^q (c = 1, 2, ..., C; q = 1, 2, ..., Q) need to be recorded. Consequently, in the final stream, the bit stream for each quality layer and each code block is preceded by a header indicating its length. Then, when we transmit or decode the coding stream, we can select a certain quality layer and some resolution or position bit streams without decoding the whole encoded stream, so as to obtain scalability of quality, resolution, and position.
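A toy sketch of how such a length table permits selective extraction follows. The layer-major stream layout used here is an illustrative assumption, as the text above only specifies that each segment is preceded by its length:

```python
def extract(stream, lengths, q, c):
    """Pull code block c, up to quality layer q, out of one byte stream.

    `lengths[qi][ci]` is the recorded segment length l_c^q. The stream
    is assumed laid out layer-major: all blocks of layer 1, then all
    blocks of layer 2, and so on.
    """
    out, pos = b"", 0
    for qi, layer in enumerate(lengths):
        for ci, n in enumerate(layer):
            if qi <= q and ci == c:
                out += stream[pos:pos + n]
            pos += n
    return out

lengths = [[2, 3], [1, 2]]       # Q = 2 layers, C = 2 code blocks
stream = b"AABBBCDD"             # segments: AA | BBB | C | DD
print(extract(stream, lengths, 1, 0))  # b'AAC'
```

Decoding touches only the length table and the selected segments, which is what makes quality, resolution, and position scalability cheap.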
3) Rate Allocation for Scan Elements: When multiple scan elements are desired, a method of allocating bit rates to each scan element is required. The simplest method is to encode each scan element at a fixed rate proportional to the number of rows in it. However, this simple method has some disadvantages. When the image content varies considerably between different areas, the image reconstructed by the simple method may vary in quality from top to bottom. If one wants to optimize overall image quality subject to an overall rate constraint, a strategy that optimizes the allocation of compressed bytes to different scan elements could be employed. Such an implementation, however, might have high complexity [16]. In fact, there is almost no difference (only about 0.03 dB in the experiment in [14]) between the overall PSNR of images reconstructed by the fixed-rate method and that of the overall rate-distortion optimization method when the size of the scan elements is not very small, and there is a fixed-rate option in the CCSDS-IDC recommendation [16]. Hence, we propose a modified fixed-rate method for rate allocation as follows.
First, we encode each scan element by the fixed-rate method with the function TravLevs. When the qth specified CR is reached in a scan element, the length of the bits for the cth code block in the sth scan element is recorded as l_{s,c}^q. At this moment, the first several code blocks in the scan element have been encoded by the function TravLevs with the same level D of the binary trees, and the others with level D + 1. We continue encoding these code blocks until the while loop with level D is finished, so that a better quality is attained, and the length of the bits for the cth code block, namely, l_{s,c}^{q,e}, is recorded. We also record the length of the bits for the first several code blocks at the beginning of the while loop with level D as l_{s,c}^{q,b}, which represents a worse quality. The corresponding distortion with level D at bit plane T_k is denoted by D_{s,c}^q = T_k · D_max + D.
Finally, we obtain the candidate truncation points l_{s,c}^{q,b}, l_{s,c}^q, l_{s,c}^{q,e} and the corresponding distortion D_{s,c}^q. At the qth CR, the l_{s,c}^{q,b} bits of each code block of all scan elements are selected first. Then, we select more bits for each code block, namely, l_{s,c}^q or l_{s,c}^{q,e} instead of l_{s,c}^{q,b}, in descending order of D_{s,c}^q, until the qth CR is reached. It should be pointed out that only a small number of truncation points need to be considered, so the cost is small, yet it balances the quality difference between different scan elements.
For example, Fig. 9 shows the candidate truncation points for the code blocks. It lists three code blocks, namely, the (c - 1)th, cth, and (c + 1)th code blocks, in the sth scan element. The gray nodes are the previously significant nodes, and the nodes marked with the numbers 1, 2, ..., 12 are the nodes traversed by levels.
Fig. 9. Candidate truncation points for the code blocks. There are three code blocks, namely, the (c - 1)th, cth, and (c + 1)th code blocks, in the sth scan element. The gray nodes are the previously significant nodes, and the nodes marked with the numbers 1, 2, ..., 12 are the nodes traversed by levels. Their scanning order is labeled with the numbers. Suppose the algorithm reaches the qth compression ratio when coding the node marked with 9; then the truncation points for the three code blocks are attained as in the figure.
Their scanning order is labeled with the numbers. Suppose the algorithm reaches the qth CR when coding the node marked with 9; then the truncation points for the three code blocks are attained as in the figure.
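The selection procedure of Section II-C3 can be sketched as a simple greedy loop (a simplified sketch under our own naming, not the paper's implementation):

```python
def allocate(points, budget):
    """Greedy selection among the candidate truncation points.

    `points` maps a code block id to (l_b, l_mid, l_e, distortion),
    i.e., the three candidate lengths and the distortion measure
    D = T_k * D_max + D_level described above. Every block first gets
    its l_b bits; blocks are then upgraded toward l_e in descending
    order of distortion while the budget allows.
    """
    chosen = {b: p[0] for b, p in points.items()}
    spent = sum(chosen.values())
    for b in sorted(points, key=lambda b: -points[b][3]):
        for cand in (points[b][1], points[b][2]):  # try l_mid, then l_e
            extra = cand - chosen[b]
            if spent + extra <= budget:
                chosen[b], spent = cand, spent + extra
    return chosen, spent

points = {"A": (4, 6, 9, 5.0), "B": (3, 5, 7, 2.0)}
print(allocate(points, 14))  # ({'A': 9, 'B': 5}, 14)
```

Block A has the larger distortion, so it is upgraded all the way to l_e before block B receives any extra bits; B then stops at l_mid because l_e would exceed the budget.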
III. COMPLEXITY ANALYSIS
A. Time Complexity
In this subsection, we analyze the time complexity of our method and compare it with those of EZBC and SPIHT. In fact, zerotree encoders such as SPIHT usually trade memory for computation by precomputing and storing the maximum magnitudes of all possible descendant and grand-descendant sets [6], [7], [28], and they also need to compare coefficients with each other; the number of comparisons is similar to that of constructing the binary tree in our method. Details of the calculation are given as follows.
Suppose an entire wavelet image is regarded as a code block of size 2^N × 2^N. There are 2^N × 2^N = 2^{2N} nodes at the bottom of the binary tree. When an upper-level node of the tree is constructed, it needs one comparison of its two children. Let D_B = log_2(2^{2N}) + 1 = 2N + 1 stand for the depth of the tree. Hence, the total number of comparisons for constructing the tree is

C_B = 2^{2N} (1/2 + 1/2^2 + ... + 1/2^{D_B - 1}) = 2^{2N} (1 - 1/2^{2N}) = 2^{2N} - 1.
Similarly, when a node of the quadtree of EZBC is constructed, it needs three comparisons to find the maximum of its four children. Let D_E = log_4(2^{2N}) + 1 = N + 1 stand for the depth of the quadtree. The total number of comparisons for constructing the quadtree is

C_E = 3 · 2^{2N} (1/4 + 1/4^2 + ... + 1/4^{D_E - 1}) = 3 · 2^{2N} · (1/4)(1 - (1/4)^N) / (1 - 1/4) = 2^{2N} - 1.

We can see that C_B = C_E.
To find the maximum magnitude of all possible descendants in SPIHT, we need to construct a quadtree in each subband first. There are 2^{2N}/4 coefficients in a highest-resolution subband, and the number of comparisons for constructing the quadtree in that subband is 2^{2N}/4 - 1. Let D_S = log_4(2^{2N}) + 1 = N + 1 stand for the wavelet decomposition level, so the total number of comparisons is

C_S = 3 [(2^{2N}/4 - 1) + (2^{2N}/4^2 - 1) + ... + (2^{2N}/4^{D_S - 1} - 1)] = 2^{2N} - 1 - 3N.
In addition, SPIHT needs to compare the corresponding nodes in the quadtrees of different subbands at different resolutions. Hence, the total number of comparisons for finding the maximum magnitudes of all possible descendant sets is more than C_S. Thus, C_B is nearly equal to C_S. In fact, C_S is also the number of comparisons for finding the maximum magnitude of all coefficients, which is required by all wavelet image compression algorithms.
Including all the wavelet coefficients, the number of nodes in the binary tree is N_B = 2^{2N} + 2^{2N} - 1, and those in EZBC and SPIHT are N_E = 2^{2N} + (2^{2N} - 1)/3 and N_S = 2^{2N} + 2^{2N}/4 + 2^{2N}/4^2, respectively. N_B is larger than N_E and N_S, and traversing all the nodes takes more time; in practice, however, the coding process usually does not need to traverse all the nodes. When we obtain a bit from a significance test in BTC, EZBC, or SPIHT, it needs one comparison of a value with the threshold, and the other computation is proportional to it. Hence, the time costs of BTC, EZBC, and SPIHT are similar at the same bit rates when the maximum magnitudes of all possible sets are stored. The method that codes more significant coefficients obtains better performance.
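The three node counts can be evaluated directly (the function name is ours):

```python
def node_counts(N):
    """Node counts N_B, N_E, N_S for a 2^N x 2^N code block, as above."""
    leaves = 2 ** (2 * N)
    n_b = leaves + leaves - 1                    # binary tree
    n_e = leaves + (leaves - 1) // 3             # EZBC quadtree
    n_s = leaves + leaves // 4 + leaves // 4**2  # SPIHT subband quadtrees
    return n_b, n_e, n_s

# For a 32 x 32 block (N = 5), the binary tree has the most nodes.
print(node_counts(5))  # (2047, 1365, 1344)
```

The ordering N_B > N_E > N_S holds for every N, consistent with the discussion above.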
The additional computation of BTCA is to find the adaptive scanning order. BTCA performs the binary tree coding level by level from the bottom to the top and does not increase the bitstream compared with BTC. It adds the test of the statement "If B_j(t) >= T_{k-1}" for each node in the binary tree, but this statement is very simple and does not require any other operation such as sorting. Hence, the adaptive scanning order costs only a little extra time. The scan-based process does not need any extra computation. On the contrary, because this process greatly reduces the memory requirement, it can save the time spent on cache exchange.
The step of entropy coding usually accounts for most of the coding/decoding time. For example, in [7], with a C language implementation of SPIHT, the CPU time for encoding an image at 0.5 bpp is 0.33 s with arithmetic coding, while it is 0.14 s without arithmetic coding. Rate-distortion optimization is very expensive, too, because there may be many unnecessary coding units, and the distortion rates of all truncation points must be stored and sorted, which is a major component that dictates the performance [25]. Hence, methods using complex context-based arithmetic coding and rate-distortion optimization, such as EBCOT [9], JPEG2000 [11], and HIC [21], would be too complex to become a recommended standard for space missions.
If EBCOT does not use arithmetic coding, its performance decreases considerably. Apart from the refinement and sign codes, some of the original binary codes of EBCOT represent the significance of four coefficients, and most represent the significance of a single coefficient; only a few represent the significance of the coefficients of a subblock of size 16 × 16. Thus, there are many 0s in the original binary codes of EBCOT, and arithmetic coding is needed to raise the performance. In contrast, the original binary codes of SPIHT can represent the significance of all the coefficients in a zerotree, and the original binary codes of the binary tree coding can represent the significance of all the coefficients in a subtree, so the performance of SPIHT and the proposed method is better than that of EBCOT without arithmetic coding. According to our experiment, the PSNR of EBCOT decreases by more than 1 dB without arithmetic coding.
B. Memory Requirement
When traditional row-column filtering is used for the wavelet transform, it needs memory as large as the whole image. For example, if an image of size 2048 × 2048 is to be encoded and each wavelet coefficient needs 4 bytes of storage, the traditional method needs 2048 × 2048 × 4 = 16 Mbytes of memory, while only (B_f + B_s) × 2048 × 4 = 0.3359 Mbytes is needed for a three-level line-based wavelet transform, where B_f = 27 and B_s = 16. When a scan element contains the coefficients of 32 rows for the proposed method and scan-based JPEG2000, or a CCSDS-IDC segment contains 2048 blocks, it needs 32 × 2048 × 4 = 0.25 Mbytes of memory, including the synchronization buffer B_s of 16 × 2048 × 4 = 0.125 Mbytes. In this case, the total buffer size for storing the coefficients of the scan-based method is 0.3359 + 0.25 - 0.125 ≈ 0.46 Mbytes.
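These figures follow directly from the buffer sizes derived earlier; a short sketch reproduces them (1 Mbyte = 2^20 bytes assumed):

```python
# Memory for a 2048 x 2048 image with 4-byte wavelet coefficients.
ROWS = COLS = 2048
BYTES = 4
B_F, B_S = 27, 16       # line-buffer rows for the 3-level CDF 9/7 DWT
MB = 1024 * 1024

frame = ROWS * COLS * BYTES / MB         # whole-image transform: 16.0
line = (B_F + B_S) * COLS * BYTES / MB   # line-based transform: 0.3359
elem = 32 * COLS * BYTES / MB            # one 32-row scan element: 0.25
sync = B_S * COLS * BYTES / MB           # part of `elem` already in `line`
total = line + elem - sync               # coefficient buffers: ~0.46
print(frame, round(line, 4), elem, round(total, 4))
# 16.0 0.3359 0.25 0.4609
```

The synchronization buffer is subtracted once because it is counted both in the line-based transform buffer and in the scan element.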
In addition, some extra memory is needed, such as the memory for recording significant/insignificant coefficients and sets, which is related to the CR. For example, when SPIHT encodes an image of size 2048 × 2048 at 1 bpp, there are about 650,000 significant coefficients, 540,000 insignificant coefficients, and 300,000 insignificant sets. If a linked list is used to store them, each entry needs 8 bytes. Hence, the memory for recording significant/insignificant coefficients and sets is about 11 Mbytes, and the total memory for SPIHT is about 16 + 11 = 27 Mbytes.
In the proposed method, the number of nodes in the binary trees is twice the number of coefficients in the scan element, but the nodes at the bottom level of the tree are equal to the coefficients and need not be stored. Thus, the additional memory requirement equals the size of the scan element, namely, 0.25 Mbytes for the binary trees in the above case, and it is not related to the CR. Hence, the total memory requirement for the proposed method is 0.46 + 0.25 = 0.71 Mbytes, apart from some minor variables.
TABLE II
MEMORY REQUIREMENT (Mbytes) OF SPIHT, SCAN-BASED CCSDS-IDC, SCAN-BASED JPEG2000, AND THE PROPOSED METHODS AT 1 bpp
Table II compares the memory requirements of the traditional methods and the proposed methods, scan-based CCSDS-IDC, and scan-based JPEG2000 at 1 bpp. They are evaluated with two images of sizes 2048 × 2048 and 1024 × 1024. From Table II, we find that the memory requirements of the scan-based methods, namely, BTCA-S3, BTCA-S4, scan-based CCSDS, and scan-based JPEG2000, are very similar and much less than that of a frame-based method like SPIHT. The memory requirement of the proposed methods is only a little more than that of scan-based JPEG2000.
IV. EXPERIMENTAL RESULTS
To evaluate the performance of the proposed method for remote sensing images, experiments are conducted on the CCSDS reference test image set [29]. The image set includes a variety of space imaging instrument data such as solar, stellar, planetary, earth observation, optical, radar, and meteorological images, as shown in Fig. 10. We compare the different methods by PSNR (peak signal-to-noise ratio, dB) at different CRs (bpp), which are given by the following relations [31]:

PSNR(dB) = 10 log_10 [(2^b - 1)^2 / MSE]
MSE = Σ_{i,j} (x_{ij} - x̂_{ij})^2 / (Row × Col)
CR(bpp) = (number of coded bits) / (Row × Col)

where x_{ij} and x̂_{ij} denote the original and reconstructed pixels, respectively, and b denotes the bit depth of the image, which is of size Row × Col.
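These relations are straightforward to compute; a small sketch with images given as flat pixel lists (our own helper names):

```python
import math

def psnr(orig, recon, bit_depth):
    """PSNR (dB) and MSE per the relations above."""
    mse = sum((x - y) ** 2 for x, y in zip(orig, recon)) / len(orig)
    return 10 * math.log10((2 ** bit_depth - 1) ** 2 / mse), mse

def cr(num_coded_bits, rows, cols):
    """Compression ratio in bits per pixel."""
    return num_coded_bits / (rows * cols)

db, mse = psnr([0, 128, 255, 64], [1, 128, 250, 64], 8)
print(mse, round(db, 2))       # 6.5 40.0
print(cr(524288, 1024, 1024))  # 0.5
```

Note that the peak term uses (2^b - 1)^2, so the same MSE yields a higher PSNR for deeper bit depths, which matters when comparing the 8-, 10-, 12-, and 16-bit test images.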
A. Comparison of the Proposed Methods and Other Algorithms Without Entropy Coding
To compare our methods with other algorithms without entropy coding, all the 8-bit-depth images of size 1024 × 1024 and of size 512 × 512 from the CCSDS reference test image set are used. These images are all decomposed by five levels of 9/7-tap biorthogonal wavelet filters [30] with symmetric extension at the image boundary. The PSNRs at different CRs of the proposed method are compared with those of algorithms without arithmetic coding, such as SPIHT [7], SPECK [8], and EZBC [19]. We also give the results of BTC and BTCA to show the gradual improvement of our proposed method.
Table III lists the experimental results of the proposed method and other algorithms without arithmetic coding. The results are evaluated at five bit rates, namely, 1, 0.5, 0.25, 0.125, and 0.0625 bpp.
Fig. 10. Part of the remote sensing images used in simulation, with different sizes and bit depths. (a) Coastal-b3 (8-bit depth, 1024 × 1024). (b) Europa3 (8-bit depth, 600 × 557). (c) Marstest (8-bit depth, 512 × 512). (d) Lunar (8-bit depth, 512 × 512). (e) Spot-la-panchr (8-bit depth, 1000 × 1000). (f) Ice-2kb1 (10-bit depth, 2048 × 2048). (g) Pleiades-portdebouc-b3 (12-bit depth, 320 × 1376).
From the results, we can see that the average PSNR of BTC is better than those of SPIHT, SPECK, and EZBC at the five bit rates for all the listed images: it improves on SPIHT by about 0.97 dB on average, on SPECK by 0.65 dB, and on EZBC by 0.13 dB, which shows that the proposed binary tree provides a more efficient image representation than SPIHT, SPECK, and EZBC. The performance of SPIHT is poor at lower bit rates because there are too many insignificant coefficients in the initial list when five levels of wavelet transform are used.
We can also find that the PSNR of BTCA is better than that of BTC, by up to 0.49 dB and by an average of 0.14 dB, which shows that the adaptive scanning order yields better performance. In fact, the PSNR of BTCA is almost the best of all the methods for all the listed test images at the different bit rates.
When we use BTCA in scan-based mode, each subband of a scan element contains only a few rows. If the coefficients in a large square of the original wavelet domain are insignificant, this cannot be represented by BTCA in scan-based mode. Moreover, there is no rate-distortion optimization in BTCA-S. Hence, the performance of BTCA-S is worse than that of BTCA. However, BTCA-S needs much less memory than BTCA, and the former has quality, position, and resolution scalability, while the latter has only quality scalability.
B. Comparison of the Proposed Method, Scan-Based CCSDS-IDC, and Scan-Based JPEG2000
To compare our method with scan-based CCSDS-IDC and scan-based JPEG2000, all the images from the CCSDS reference test image set are used. These include all the 8-bit-depth images listed in Table IV and all the 10-bit-depth images, such as ice-2kb1, ice-2kb4, india-2kb1, india-2kb4, ocean-2kb1, and ocean-2kb4 of size 2048 × 2048, landesV-G7-10b of size 2381 × 454, and marseille-G6-10b of size 1856 × 528. In all, there are eight images with 10-bit depth, six images with 12-bit depth, and two images with 16-bit depth [29]. These images are all decomposed by the float 9/7-tap biorthogonal wavelet filters for lossy compression. When lossless compression is desired, the integer 5/3 DWT is used for the proposed method and JPEG2000, while the integer 9/7 DWT is used for CCSDS-IDC. BTCA-S4 uses a four-level DWT, and the other algorithms use a three-level DWT. The results of scan-based CCSDS-IDC and scan-based JPEG2000 are taken from Annex C of the CCSDS-IDC Green Book [17].
When the coefficients are almost all significant, the coding efficiency of BTC is not good. However, the coefficients in the lowest frequency subband are almost all significant, and there are many such coefficients when only three or four levels of wavelet transform are used. Thus, in this case, the proposed method directly encodes the coefficients in the lowest frequency subband, namely, when a coefficient is insignificant, 0 is output; otherwise, 1 and its sign are output. Because the coefficients in the lowest frequency subband are usually all positive, we can also omit the sign coding in this case.
TABLE III
PSNR (dB) FOR THE PROPOSED METHOD AND OTHER ALGORITHMS WITHOUT ENTROPY CODING WITH FIVE LEVELS OF WAVELET TRANSFORM
TABLE IV
PSNR (dB) FOR SCAN-BASED CCSDS-IDC, SCAN-BASED JPEG2000, AND THE PROPOSED METHOD USING 8-BIT-DEPTH IMAGES
TABLE V
AVERAGE PSNR (dB) FOR SCAN-BASED CCSDS-IDC, SCAN-BASED
JPEG2000, AND THE PROPOSED METHOD USING
IMAGES WITH DIFFERENT BIT DEPTH
TABLE VI
PERFORMANCE (BITS/PIXEL) OF LOSSLESS COMPRESSION FOR
SCAN-BASED CCSDS-IDC, SCAN-BASED JPEG2000,
AND THE PROPOSED METHOD
Table IV lists the experimental results of scan-based CCSDS-IDC, scan-based JPEG2000, and the proposed method using all the 8-bit-depth images. From the results, we can see that the average PSNR of BTCA-S3 is better than that of CCSDS-IDC at 1 bpp and 0.5 bpp. Because BTCA-S directly encodes the lowest resolution subband, its average PSNR is less than that of CCSDS-IDC at the lower bit rates with three levels of wavelet transform. With four levels of wavelet transform, the performance of BTCA-S is better than that of CCSDS-IDC at the three bit rates and better than that of scan-based JPEG2000 at 0.5 bpp and 0.25 bpp. At 1 bpp, the PSNR of scan-based JPEG2000 exceeds that of BTCA-S4 by only 0.04 dB on average over all the 8-bit-depth images and is less than that of BTCA-S4 on the images of size 1024 × 1024. Table V compares the average PSNR using images with 10-, 12-, and 16-bit depth, and a similar result is attained.
When lossless compression is desired, every bit plane needs to be encoded. However, the performance of the proposed method is not good at the last bit plane, so the proposed method directly encodes the insignificant coefficients at this point (i.e., when T_k = 0). Table VI lists the performance (bits/pixel) of lossless compression for scan-based CCSDS-IDC, scan-based JPEG2000, and the proposed method. From the results, we can see that the average bpp of BTCA-S3 is similar to that of CCSDS-IDC and only a little worse than that of JPEG2000.
C. Comparison of Visual Effects
Fig. 11 shows a visual comparison between BTCA-S and scan-based CCSDS on the coastal-b3 image at 0.25 bpp, and Fig. 12 is the comparison on the lunar image at 0.5 bpp. The recovered portion of each image has a size of 200 × 100 pixels, while the original image size is 1024 × 1024 for coastal-b3 and 512 × 512 for lunar, respectively. The magnitudes of the wavelet coefficients around edges are often significant, and BTCA can encode these coefficients before other coefficients are scanned. When the algorithm stops at a specified bit rate, it has encoded more significant coefficients on the edges, which improves the quality of the decoded image. We can see that the images reconstructed by BTCA-S are clearer than those of CCSDS in the marked rectangles.
D. Comparison of Encoding Time
Table VII compares the CPU encoding times (s) of CCSDS-IDC, SPIHT without arithmetic coding, and the proposed method in C language implementations. The results are evaluated with three images of different sizes at three bit rates. The program of SPIHT without arithmetic coding comes from its official website [32], and the program of CCSDS can be downloaded from [33], which is introduced in Annex B of the CCSDS-IDC Green Book [17]. From the results, we can see that the proposed method is only a little slower than SPIHT without arithmetic coding but faster than CCSDS with entropy coding.
V. CONCLUSION AND DISCUSSION
In this paper, we proposed an on-board remote sensing image codec based on a binary tree with an adaptive scanning order in scan-based mode. We perform the line-based wavelet transform on the available data, and subsets of wavelet coefficients are collected into several scan elements. Then, we construct a binary tree for each code block of a scan element. In each code block and each bit plane, we traverse each level of the binary tree with an adaptive scanning order. Experimental results show that the proposed method significantly improves PSNR compared with SPIHT without arithmetic coding and scan-based CCSDS-IDC, and is similar to scan-based JPEG2000. Our method is very fast and, being less complex, is fully implementable in either hardware or software. Hence, the proposed method is well suited to on-board remote sensing image compression.
Because multicomponent images are very common in remote sensing applications, CCSDS-IDC encodes each spectral band separately to attain component scalability, while our proposal is able to encode all the spectral bands together as follows. We can create scan elements for each spectral band as in Section II-C1 and encode them with BTCA as in Section II-C2. Then, we can allocate the bit rates of all scan elements from all spectral bands with the method of Section II-C3. As a result, random access to each band is still guaranteed for the proposed method.
In order to improve the coding performance, a common strategy for hyperspectral images is to first decorrelate the image in the spectral domain [3], [4], [10], [14], [34]-[36], for example, with a DWT or principal component analysis. However, it remains a problem for the proposed method to allocate bit rates across the transformed bands in a low-complexity, low-memory, and efficient way, which is our future work.
Fig. 11. Comparison of BTCA-S and CCSDS on the coastal-b3 image at 0.25 bpp. The PSNRs of the reconstructed images are 39.68 dB and 39.13 dB, respectively. There are richer lines in the top rectangle of the image reconstructed by BTCA-S, and the lines in the bottom rectangle are clearer with BTCA-S. Because BTCA can preferentially encode the neighbors of the previously significant coefficients, where the edges are located, the quality of the decoded image on the edges is better for BTCA-S. (a) Original image. (b) Image reconstructed by BTCA-S. (c) Image reconstructed by CCSDS.
Fig. 12. Comparison of BTCA-S and CCSDS on the lunar image at 0.5 bpp. The PSNRs of the reconstructed images are 30.92 dB and 31.52 dB, respectively. The boundaries of the objects in the rectangles of the image reconstructed by BTCA-S are clearer and more continuous. (a) Original image. (b) Reconstructed image with BTCA-S. (c) Reconstructed image with CCSDS.
TABLE VII
CPU ENCODING TIMES (s) OF CCSDS-IDC, SPIHT WITHOUT ARITHMETIC CODING, AND THE PROPOSED METHOD
ACKNOWLEDGMENT
The authors would like to thank the associate editor and
the three anonymous reviewers for thoughtful comments and
insightful suggestions which have brought improvements to this
manuscript.
REFERENCES
[1] D. Chaudhuri and A. Samal, An automatic bridge detection technique for
multispectral images, IEEE Trans. Geosci. Remote Sens., vol. 46, no. 9,
pp. 27202727, Sep. 2008.
[2] B. Sirmacek and C. Unsalan, Urban-area and building detection using
SIFT keypoints and graph theory, IEEE Trans. Geosci. Remote Sens.,
vol. 47, no. 4, pp. 11561167, Apr. 2009.
[3] Q. Du and J. E. Fowler, Hyperspectral image compression
using JPEG2000 and principal component analysis, IEEE
Geosci. Remote Sens. Lett., vol. 4, no. 2, pp. 201205,
Apr. 2007.
3750 IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 50, NO. 10, OCTOBER 2012
[4] B. Penna, T. Tillo, E. Magli, and G. Olmo, Transform coding techniques
for lossy hyperspectral data compression, IEEE Trans. Geosci. Remote
Sens., vol. 45, no. 5, pp. 14081421, May 2007.
[5] P. Hou, M. Petrou, C. I. Underwood, and A. Hojjatoleslami, Improving
JPEG performance in conjunction with cloud editing for remote sensing
applications, IEEE Trans. Geosci. Remote Sens., vol. 38, no. 1, pp. 515
524, Jan. 2000.
[6] J. M. Shapiro, Embedded image coding using zerotrees of wavelet co-
efcients, IEEE Trans. Signal Process., vol. 41, no. 12, pp. 34453462,
Dec. 1993.
[7] A. Said and W. A. Pearlman, A new, fast, and efcient image codec based
on set partitioning in hierarchical trees, IEEE Trans. Circuits Syst. Video
Technol., vol. 6, no. 3, pp. 243250, Jun. 1996.
[8] W. A. Pearlman, A. Islam, N. Nagaraj, and A. Said, Efcient low-
complexity image coding with a set-partitioning embedded block coder,
IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 11, pp. 12191235,
Nov. 2004.
[9] D. Taubman, High performance scalable image compression with
EBCOT, IEEE Trans. Image Process., vol. 9, no. 7, pp. 11581170,
Jul. 2000.
[10] X. Tang and W. A. Pearlman, Three-dimensional wavelet-based com-
pression of hyperspectral images, in Hyperspectral Data Compression.
New York: Springer-Verlag, 2006, pp. 273308.
[11] JPEG2000 Image Coding System, ISO/IEC Std. 15 444-1, 2000.
[12] G. Yu, T. Vladimirova, and M. N. Sweeting, Image compression systems
on board satellites, Acta Astronaut., vol. 64, no. 9/10, pp. 9881005,
May/Jun. 2009.
[13] B. Li, R. Yang, and H. X. Jiang, Remote-sensing image compression
using two-dimensional oriented wavelet transform, IEEE Trans. Geosci.
Remote Sens., vol. 49, no. 1, pp. 236250, Jan. 2011.
[14] P. Kulkarni, A. Bilgin, M. W. Marcellin, J. C. Dagher, J. H. Kasner,
T. J. Flohr, and J. C. Rountree, Compression of earth science data with
JPEG2000, in Hyperspectral Data Compression. New York: Springer-
Verlag, 2006, pp. 347378.
[15] Consult. Comm. Space Data Syst. [Online]. Available: http://www.
ccsds.org
[16] CCSDS122.0-B-1, Image Data Compression, Nov. 2005. [Online]. Avail-
able: http://public.ccsds.org/publications/archive/122x0b1c3.pdf
[17] CCSDS120.1-G-1, Image Data Compression, Jun. 2007. [Online]. Avail-
able: http://public.ccsds.org/publications/archive/120x1g1e1.pdf
[18] F. Garca-Vlchez and J. Serra-Sagrist, Extending the CCSDS recom-
mendation for image data compression for remote sensing scenarios,
IEEE Trans. Geosci. Remote Sens., vol. 47, no. 10, pp. 34313445,
Oct. 2009.
[19] S. T. Hsiang and J. W. Woods, Embedded image coding using zeroblock
of subband/wavelet coefcients and context modeling, in Proc. Data
Compress. Conf., Washington, DC, 2001, pp. 8392.
[20] H. F. Ates and M. T. Orchard, Spherical coding algorithm for wavelet im-
age compression, IEEE Trans. Image Process., vol. 18, no. 5, pp. 1015
1024, May 2009.
[21] H. F. Ates and E. Tamer, Hierarchical quantization indexing for wavelet
and wavelet packet image coding, Signal Process., Image Commun.,
vol. 25, no. 2, pp. 111120, Feb. 2010.
[22] A. Abrardo, M. Barni, E. Magli, and F. Nencini, Error-resilient and low-
complexity on-board lossless compression of hyperspectral images by
means of distributed source coding, IEEE Trans. Geosci. Remote Sens.,
vol. 48, no. 4, pp. 18921904, Apr. 2010.
[23] J. Oliver and M. P. Malumbres, Low-complexity multiresolution image
compression using wavelet lower trees, IEEE Trans. Circuits Syst. Video
Technol., vol. 16, no. 11, pp. 14371444, Nov. 2006.
[24] G. Melnikov and A. K. Katsaggelos, A jointly optimal fractal/DCT com-
pression scheme, IEEE Trans. Multimedia, vol. 4, no. 4, pp. 413422,
Dec. 2002.
[25] T. Kim, H. M. Kim, P. S. Tsai, and T. Acharya, Memory efcient pro-
gressive rate-distortion algorithm for JPEG 2000, IEEE Trans. Circuits
Syst. Video Technol., vol. 15, no. 1, pp. 181187, Jan. 2005.
[26] W. Yu, F. Sun, and J. E. Fritts, Efcient rate control for JPEG-2000,
IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 5, pp. 577589,
Jan. 2006.
[27] C. Chrysas and A. Ortega, Line-based, reduced memory, wavelet image
compression, IEEE Trans. Image Process., vol. 9, no. 3, pp. 378389,
Mar. 2000.
[28] F. W. Wheeler and W. A. Pearlman, SPIHT image compression without
lists, in Proc. ICASSP, Istanbul, Turkey, 2000, pp. 20472050.
[29] Consult. Comm. Space Data Syst., CCSDS Image Test. [Online]. Avail-
able: http://cwe.ccsds.org/sls/docs/sls-dc/
[30] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, "Image coding using wavelet transform," IEEE Trans. Image Process., vol. 1, no. 2, pp. 205–220, Apr. 1992.
[31] D. J. Granrath, "The role of human visual models in image processing," Proc. IEEE, vol. 69, no. 5, pp. 552–561, May 1981.
[32] Center Image Process. Res. [Online]. Available: http://ipl.rpi.edu/
[33] Univ. Nebraska-Lincoln. [Online]. Available: http://hyperspectral.
unl.edu/
[34] J. Fowler and J. T. Rucker, "3-D wavelet-based compression of hyperspectral imagery," in Hyperspectral Data Exploitation: Theory and Applications. Hoboken, NJ: Wiley, 2007, pp. 379–407.
[35] E. Magli, "Multiband lossless compression of hyperspectral images," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 4, pp. 1168–1178, Apr. 2009.
[36] H. Wang, S. D. Babacan, and K. Sayood, "Lossless hyperspectral-image compression using context-based conditional average," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 12, pp. 4187–4193, Dec. 2007.
Ke-Kun Huang received the B.S. and M.S. degrees from Sun Yat-Sen University, Guangzhou, China, in 2002 and 2005, respectively.
He is currently with the Department of Mathematics, JiaYing University, Meizhou, China. His research interests include image processing and face recognition.
Dao-Qing Dai (M'09) received the B.Sc. degree
from Hunan Normal University, Changsha, China, in
1983, the M.Sc. degree from Sun Yat-Sen University,
Guangzhou, China, in 1986, and the Ph.D. degree
from Wuhan University, Wuhan, China, in 1990, all
in mathematics.
From 1998 to 1999, he was an Alexander von
Humboldt Research Fellow with Free University,
Berlin, Germany. He is currently a Professor and
Associate Dean of the Faculty of Mathematics and
Computing, Sun Yat-Sen University, Guangzhou. He
is an author or coauthor of over 100 refereed technical papers. His current
research interests include image processing, wavelet analysis, face recognition,
and bioinformatics.
Dr. Dai was the recipient of the Outstanding Research Achievements in Mathematics Award from the International Society for Analysis, Applications, and Computation, Fukuoka, Japan, in 1999. He served as a Program Cochair of Sinobiometrics in 2004 and as a program committee member for several international conferences.