IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING
Abstract: We present in this work a first performance assessment of the Parallel Small BAseline Subset (P-SBAS) algorithm,
for the generation of Differential Synthetic Aperture Radar (SAR)
Interferometry (DInSAR) deformation maps and time series,
which has been migrated to a Cloud Computing (CC) environment. In particular, we investigate the scalable performances
of the P-SBAS algorithm by processing a selected ENVISAT
ASAR image time series, which we use as a benchmark, and by
exploiting the Amazon Web Services (AWS) CC platform. The
presented analysis shows a very good match between the theoretical and experimental P-SBAS performances achieved within
the CC environment. Moreover, the obtained results demonstrate
that the implemented P-SBAS Cloud migration is able to process
ENVISAT SAR image time series in short times (less than 7 h)
and at low costs (about USD 200). The P-SBAS Cloud scalable
performances are also compared to those achieved by exploiting
an in-house High Performance Computing (HPC) cluster, showing that nearly no overhead is introduced by the presented Cloud
solution. As a further outcome, the performed analysis allows us
to identify the major bottlenecks that can hamper the P-SBAS
performances within a CC environment, with a view to processing very large SAR data flows such as those coming from the
existing COSMO-SkyMed or the upcoming SENTINEL-1 constellation. This work represents a relevant step toward the challenging
Earth Observation scenario focused on the joint exploitation of
advanced DInSAR techniques and CC environments for the massive processing of Big SAR Data.
Index Terms: Big data, Cloud Computing (CC), Differential
Synthetic Aperture Radar (SAR) Interferometry (DInSAR), Earth
surface deformation, Parallel Small BAseline Subset (P-SBAS).
Manuscript received November 06, 2014; revised February 13, 2015;
accepted April 13, 2015. This work was supported in part by the Italian
Ministry of University and Research (MIUR) under the project Progetto
Bandiera RITMARE, and in part by the Italian Civil Defence Department
(DPC) of the Prime Minister's Office. This work has been carried out
through the I-AMICA (Infrastructure of High Technology for Environmental
and Climate Monitoring, PONa3_00363) project of Structural improvement
financed under the National Operational Programme (NOP) for Research and
Competitiveness 2007-2013, cofunded with European Regional Development
Fund (ERDF) and National resources. The ENVISAT SAR data have been provided by the European Space Agency through the Virtual Archive 4. The DEM
of the investigated zone was acquired through the SRTM archive.
I. Zinno, S. Elefante, C. De Luca, M. Manunta, R. Lanari, and F. Casu
are with the Istituto per il Rilevamento Elettromagnetico dell'Ambiente
(IREA), Consiglio Nazionale delle Ricerche, Napoli 80124, Italy (e-mail:
casu.f@irea.cnr.it).
C. De Luca is also with the Department of Electrical Engineering and
Information Technology, University of Naples Federico II, Napoli, Italy.
L. Mossucca and O. Terzo are with the Advanced Computing and
Electromagnetics (ACE), Istituto Superiore Mario Boella, Torino 10138, Italy.
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSTARS.2015.2426054
I. INTRODUCTION
1939-1404 © 2015 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution
requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
The paper is organized as follows. In Section II, a concise description of the P-SBAS processing chain is provided,
which aims at recalling the main processing steps. Section III
describes how the entire P-SBAS processing chain has been
migrated to the selected CC environment. Section IV is dedicated to the experimental framework that includes the scalable
performance study, which has been performed by exploiting both a dedicated cluster and the AWS Cloud, as well
as the analysis of the processing times and costs. Finally,
Section V presents some concluding remarks and outlines further developments.
II. P-SBAS DESCRIPTION
This section focuses on providing a concise but informative description of the P-SBAS processing chain which has
already been thoroughly discussed in [14]. It aims at recalling the P-SBAS major processing steps in terms of main tasks,
implemented procedures and computational challenges that will
be addressed by showing CPU usage, RAM occupation, and
Input/Output (I/O) transfer requirements.
The P-SBAS solution has been designed by carefully taking into account several different conceptual aspects, such
as data dependencies, task partitioning, inherent granularity,
scheduling policy, and load unbalancing, in order to optimize the
usage of CPU, RAM, and I/O resources. Moreover, the heterogeneous nature of the computational algorithms that
make up the SBAS processing chain has driven the
P-SBAS design toward parallelization strategies that depend on the algorithmic structure of
each specific processing step [14], [18]. Furthermore,
P-SBAS has been designed to
take advantage of both multinode and multicore architectures;
therefore, two parallelization levels have been employed:
process and thread. The former considers a coarse/medium
granularity-based approach and has been mainly applied to the
whole processing chain, while the latter relies on a fine-grained
parallelization. This fine-grained strategy has been implemented both for the
most computing-intensive operations, to optimize CPU usage
through multithreaded programming, and for highly
RAM-demanding algorithms, to reduce memory occupation by
applying a data parallelism strategy [14].
The block diagram of the P-SBAS processing chain is shown
in Fig. 1; in this scheme, blue blocks
represent the jobs that are executed in parallel by running simultaneously on different nodes, while black blocks represent
steps that are intrinsically sequential. Moreover, dashed-line
blocks denote the steps that are implemented with multithreading.
Table I briefly summarizes the main characteristics of each step of Fig. 1
in terms of CPU and RAM usage as well as I/O operations.
In the following, a conceptual description of the P-SBAS
processing chain will be briefly addressed. It is worth noting
that such a processing chain has been designed to analyze the
majority of SAR data available through the different spaceborne
systems (ERS-1/2, ENVISAT, COSMO-SkyMed, TerraSAR-X,
ALOS-1/2, and RADARSAT-1/2). Moreover, it is also robust
with respect to possible system failures (e.g., ERS-2 gyroscope
ZINNO et al.: FIRST ASSESSMENT OF P-SBAS DINSAR ALGORITHM PERFORMANCES
Fig. 1. P-SBAS workflow. Black and blue blocks represent sequential and parallel (from a process-level perspective) processing steps, respectively. Dashed line
blocks represent multithreading programmed processing steps.
TABLE I
P-SBAS ALGORITHM RESOURCE REQUIREMENTS
Reference values for a standard dataset (64 ENVISAT SAR images, see Section IV). CPU: medium (<100%),
high (100%-300%), very high (>300%); RAM: low (<1 GB), medium (1-10 GB), high (>10 GB); I/O: medium
(<10 GB), high (tens of GB), very high (hundreds of GB).
Fig. 2. Cloud architecture for P-SBAS analysis constituted of several WNs which include a MN and a common storage volume in a RAID 0 configuration. Each
component is located in the Amazon Web Services Cloud.
As mentioned in the previous section, the P-SBAS rationale requires that, in many steps of the chain, the
outputs generated by previous steps be processed jointly.
Hence, a common storage is required and therefore a network
file system (NFS) has been adopted [31].
Moreover, the majority of the P-SBAS algorithms are developed in the Exelis Interactive Data Language (IDL) [32]: a
programming language that is widely used by the scientists who
develop algorithms for SAR and DInSAR data processing [14],
[33]. IDL is commercial software and, therefore, each VM
running the application requires a license. Hence, an interconnection layer between the IDL License Server (located at the
site of CNR-IREA institution) and VMs (located on Cloud) has
been implemented. This layer allows us to connect end-points
and satisfies the firewall policies adopted by the CNR-IREA
institution and the Cloud provider.
The used CC platform is hosted in the Amazon Elastic
Compute Cloud (EC2), an infrastructure as a service (IaaS) that
is part of AWS Cloud; EC2 has been chosen because it is currently a feature-rich, stable and commercial public Cloud [20].
Moreover, a web service through which users can configure and
instantiate a VM image is also available. AWS adopts a virtualization technology that permits flexible configuration of VM
instances, allowing users to fully set up features such as the
number of CPU cores, the processor type, the memory, the
I/O performance, etc. In addition, the operating system and the
software that runs on the VM can be customized by the user.
In particular, a Virtual Private Cloud (VPC) has been implemented, i.e., a logically isolated section of the AWS Cloud in which
resources can be launched in a completely defined virtual network. The easy customization of the network configuration
allows users to fully control the virtual networking environment, through IP address range selection, subnet creation,
TABLE II
CNR-IREA CLUSTER CONFIGURATION
TABLE III
AWS INSTANCE TYPE CONFIGURATION
$$S_N = \frac{T_1}{T_N}. \tag{1}$$
$$\text{efficiency} = \frac{S_N}{N} \tag{2}$$
Fig. 3. Mean deformation velocity map of the Napoli Bay area, generated via the P-SBAS processing on AWS Cloud. The displacement time series
of two pixels, located in the area of maximum deformation of Campi Flegrei and on the Vomero residential hill, are also shown.
$$S_N = \frac{1}{f_s + \dfrac{1 - f_s}{N}}, \qquad 0 \le f_s \le 1 \tag{3}$$
where $f_s$ is the fraction of the parallel program that is executed sequentially (sequential fraction) [35]. It is also worth mentioning that the formulation (3) of Amdahl's law does not take
into account either the load unbalancing or the data transfer
overhead.
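As an illustration of how the bound in (3) behaves, the following minimal Python sketch evaluates Amdahl's law for a given sequential fraction. The function names are hypothetical and not part of the P-SBAS code; the value 0.09 used in the usage loop is the sequential fraction reported in the experimental results of this paper.

```python
def amdahl_speedup(n_nodes: int, f_s: float) -> float:
    """Upper bound on the speedup with n_nodes nodes, following (3)."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be >= 1")
    if not 0.0 <= f_s <= 1.0:
        raise ValueError("f_s must lie in [0, 1]")
    return 1.0 / (f_s + (1.0 - f_s) / n_nodes)


def asymptotic_speedup(f_s: float) -> float:
    """Limit of (3) as the number of nodes grows without bound (needs f_s > 0)."""
    if not 0.0 < f_s <= 1.0:
        raise ValueError("f_s must lie in (0, 1]")
    return 1.0 / f_s


if __name__ == "__main__":
    # Node counts used in the scalability tests of this paper.
    for n in (1, 2, 4, 8, 16):
        print(n, round(amdahl_speedup(n, 0.09), 2))
```

Since (3) ignores load unbalancing and data transfer overhead, a measured speedup is expected to lie at or below these values.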
The experimental analysis has been performed by processing an interferometric dataset acquired over the Napoli bay,
a volcanic and densely urbanized area in Southern Italy that
includes the active caldera of Campi Flegrei, the Vesuvius volcano, and the city of Napoli. In particular, we consider the
overall time series of ENVISAT acquisitions collected by the
ASAR sensor from ascending orbits, which is composed of 64
SAR images. This dataset, which spans the 2003-2010 time interval and covers an area of about 100 × 100 km² that corresponds
to an ENVISAT frame, is often used as a benchmark dataset
for DInSAR analyses [8], [14], [15], [18]. The selected dataset
has been processed by using the P-SBAS algorithm to generate the DInSAR products. In particular, in Fig. 3, the obtained
mean deformation velocity map is shown. This map has been
geocoded and afterward superimposed on a multilook SAR
image of the investigated area. The estimated mean deformation
velocity has been computed only in coherent areas; accordingly, areas in which the measurement accuracy is affected by
decorrelation noise have been excluded from the false-color
map. In particular, it is worth noting that in Fig. 3, a significant deformation pattern corresponding to the area of the
Campi Flegrei caldera is clearly shown. Moreover, the computation of the temporal evolution of the detected deformation
has also been carried out for each coherent point of the scene.
For instance, the displacement time series
of two specific points (the first located in the
area of maximum deformation of the Campi Flegrei caldera and
the second on the Vomero residential hill, which shows a slow
downward movement) are plotted in the insets of Fig. 3. These
results are in accordance with ground truth measurements
[8], [10].
C. Experimental Results
As previously mentioned, a scalability analysis with respect
to the number of exploited computing nodes has been carried out both on the CNR-IREA cluster and on AWS Cloud.
Concerning the test performed at the CNR-IREA premises, the
speedup as a function of the number of engaged nodes
is shown in Fig. 4. The plot reports the ideal linear speedup (blue/diamonds), Amdahl's law (red/squares),
and the experimental results achieved on the above-mentioned
cluster (green/triangles). Amdahl's law has been evaluated by computing the sequential fraction of the processing as the
ratio between the sum of the elapsed times of the P-SBAS sequential parts and the total processing time on a single computing
node; it turned out to be approximately 9% ($f_s = 0.09$). It
is evident from Fig. 4 that the achieved speedup is definitely
cluster and Cloud cases are different; in particular, the processors of the Cloud nodes are slightly more powerful than those
of the cluster nodes (see Tables II and III). As Fig. 5 clearly
shows, also in this case the speedup behavior is very close to
Amdahl's law and begins to diverge from it as the number of nodes approaches 16.
Table IV reports the Amdahl's law and actual speedup values
evaluated in both the CNR-IREA cluster and AWS Cloud cases.
Furthermore, the percentage deviations between the
theoretical (Amdahl's law) and experimental behavior are provided, quantitatively confirming the good match achieved by the
P-SBAS performance on both the cluster and the Cloud.
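The two quantities used in this comparison, the sequential fraction and the percentage deviation, can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, the normalization of the deviation by the theoretical value is an assumption (the exact definition adopted for Table IV is not restated here), and the step times in the usage lines are invented placeholders, not measured values.

```python
def sequential_fraction(sequential_step_times, total_time_single_node):
    """Ratio between the summed elapsed times of the sequential steps
    and the total processing time on a single node."""
    if total_time_single_node <= 0:
        raise ValueError("total_time_single_node must be positive")
    return sum(sequential_step_times) / total_time_single_node


def percentage_deviation(theoretical, experimental):
    """Deviation of the measured speedup from the theoretical one, in percent.
    Assumption: the deviation is normalized by the theoretical value."""
    if theoretical <= 0:
        raise ValueError("theoretical speedup must be positive")
    return 100.0 * (theoretical - experimental) / theoretical


if __name__ == "__main__":
    # Hypothetical step times in hours, chosen only so that the result is 0.09.
    print(sequential_fraction([2.0, 1.0, 0.6], 40.0))
    # Hypothetical speedup pair, not taken from Table IV.
    print(percentage_deviation(8.0, 6.0))
```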
To give an idea of the time and the economic cost that P-SBAS required to complete the processing,
Table V reports the elapsed times of the P-SBAS
tests performed on both the CNR-IREA cluster and the AWS Cloud
by exploiting 1, 2, 4, 8, and 16 nodes, together with
the corresponding costs of the AWS usage. Note that
such costs include both those of the exploited EC2
instances and those of the selected storage volumes.
On the contrary, the cost of the IDL licenses has not been
included because we used those available on the CNR-IREA
server at no additional expense. By exploiting the AWS
Cloud, the P-SBAS processing time dropped from 41 h to less
than 7 h when moving from 1 to 16 nodes, with a cost of USD
113 and USD 213, respectively. Table V shows that the P-SBAS parallel performances on the Cloud are satisfactory, as they
are fully comparable with those achieved on the dedicated
cluster.
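The figures quoted above can be turned into a rough speedup and cost estimate. The sketch below is only indicative: because the 16-node time is given as "less than 7 h", using 7 h yields a lower bound on the speedup and efficiency, not the values of Table V. The variable and function names are hypothetical.

```python
T1_HOURS = 41.0            # elapsed time on 1 AWS node, as quoted in the text
T16_HOURS_MAX = 7.0        # upper bound on the 16-node time ("less than 7 h")
COST_1_NODE_USD = 113.0    # AWS cost of the 1-node run
COST_16_NODES_USD = 213.0  # AWS cost of the 16-node run
N_NODES = 16


def speedup(t_single, t_parallel):
    """Speedup as the ratio of single-node to N-node elapsed time."""
    return t_single / t_parallel


def efficiency(s_n, n_nodes):
    """Efficiency as speedup divided by the number of nodes."""
    return s_n / n_nodes


# Lower bounds, because the true 16-node time is below 7 h.
speedup_min = speedup(T1_HOURS, T16_HOURS_MAX)
efficiency_min = efficiency(speedup_min, N_NODES)
# Relative cost increase paid for the shorter processing time.
cost_ratio = COST_16_NODES_USD / COST_1_NODE_USD

if __name__ == "__main__":
    print(f"speedup >= {speedup_min:.2f}")
    print(f"efficiency >= {efficiency_min:.2f}")
    print(f"cost ratio = {cost_ratio:.2f}")
```

Under these figures, the 16-node run costs less than twice the single-node run while finishing almost six times faster.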
The elapsed times and associated costs shown in Table V,
which are related to a maximum of 16 nodes, are
adequate for the processing of ENVISAT ASAR datasets on
the Cloud. However, when dealing
with archives bigger than the ENVISAT ones, as in the case of COSMO-SkyMed (CSK)
and Sentinel-1 data, a larger number of
nodes is expected to be needed in order to keep the processing time of the
same order of magnitude. In this case, according to the performance behavior represented by the speedup curves of Fig. 5, the
discrepancy between the actual and the Amdahl's law speedup
curves is expected to increase, thus significantly lowering the
efficiency.
To identify the performance bottleneck that arises
when the number of parallel processes increases, some useful
metrics provided by the AWS CloudWatch monitoring system [20],
such as the read/write bandwidth and the average queue
length, have been analyzed in detail.
It turned out that the loss of efficiency observed in the
16-node processing is ascribable to the I/O workload, which
increases with the number of parallel processes
that concurrently read from or write to the common storage volume. Hence, the factor that mainly limits the P-SBAS scalable
performance in our case is the storage volume
access bandwidth, which is smaller than the network bandwidth
(256 MB/s vs. 10 Gb/s).
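Since the two bandwidths are quoted in different units, a short conversion makes the gap explicit. The sketch below assumes decimal units (1 Gb = 1000 Mb, 1 byte = 8 bits) and, purely for illustration, an even split of the storage bandwidth among concurrently accessing nodes; both are assumptions, and the function names are hypothetical.

```python
STORAGE_MB_S = 256.0    # storage volume access bandwidth, as quoted in the text
NETWORK_GBIT_S = 10.0   # network bandwidth, as quoted in the text


def gbit_per_s_to_mbyte_per_s(gbit_per_s):
    """Convert Gb/s to MB/s, assuming decimal prefixes and 8 bits per byte."""
    return gbit_per_s * 1000.0 / 8.0


def per_node_share(total_mb_s, n_nodes):
    """Bandwidth available to each node if the volume is shared evenly
    (an idealization; real access patterns are not uniform)."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be >= 1")
    return total_mb_s / n_nodes


network_mb_s = gbit_per_s_to_mbyte_per_s(NETWORK_GBIT_S)
network_to_storage_ratio = network_mb_s / STORAGE_MB_S

if __name__ == "__main__":
    print(f"network: {network_mb_s:.0f} MB/s")
    print(f"network/storage ratio: {network_to_storage_ratio:.2f}")
    for n in (1, 2, 4, 8, 16):
        print(n, per_node_share(STORAGE_MB_S, n), "MB/s per node")
```

Under these assumptions the network link is almost five times wider than the storage volume, which is consistent with the storage access being the limiting factor.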
In the following, the performances of two steps of the P-SBAS algorithm, the image coregistration and the spatial phase
unwrapping (blocks C and I of Fig. 1), which are characterized
by very different I/O workloads, are thoroughly analyzed. To
this aim, we considered two metrics: the read/write bandwidth
TABLE IV
VALUES OF THE AMDAHL'S LAW AND EXPERIMENTAL SPEEDUP ON BOTH CNR-IREA CLUSTER AND AWS CLOUD;
THEIR PERCENTAGE DEVIATION IS ALSO SHOWN
TABLE V
P-SBAS PROCESSING TIMES AND COSTS
*Note that the reported costs include both the instances and the provisioned IOPS (SSD) volume
usage.
The critical value of the average queue length for the employed
storage configuration is around 40 counts; this number depends on the I/O capacity of the selected EBS volume
[20]. Once again, eight nodes are already enough to reach the
maximum allowed latency threshold and a greater number of
employed nodes would not bring a significant reduction in the
elapsed time for this step.
On the contrary, the phase unwrapping step, even if very
burdensome from a computational viewpoint, is lighter in
terms of I/O operations. This step presents satisfactory
scalable performances, as shown in Table VII, with a 70% efficiency at 16 nodes, and it would therefore
scale further if a greater number of nodes were used. Indeed, in
this case, both the read/write bandwidth and the queue length
values are clearly below the saturation threshold.
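As a quick worked check, inverting the efficiency definition (efficiency equal to speedup divided by the number of nodes) gives the step-level speedup implied by the quoted efficiency at 16 nodes. The 70% figure is rounded, so the result is approximate:

```latex
S_{16} = \text{efficiency} \times N \approx 0.70 \times 16 = 11.2
```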
It is worth noting that the scalable performances of the P-SBAS processing chain in the presented Cloud configuration
can be further improved by increasing the storage volume
access bandwidth, which can be done by configuring a RAID 0 striping with a
greater number of provisioned IOPS volumes (up to 12).
This would allow us to exploit a larger number of nodes without saturating the storage volume access bandwidth, as long
as it is supported by an adequate network bandwidth, with the
performance limit given by Amdahl's law.
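The interplay between striping and network bandwidth described above can be sketched with an idealized model in which RAID 0 throughput grows linearly with the number of volumes until the network link saturates. This is a simplifying assumption, not a measured behavior; the function name is hypothetical, and the per-volume throughput in the usage lines is an invented placeholder, not the throughput of the volumes used in this work.

```python
MAX_VOLUMES = 12  # upper limit on striped volumes mentioned in the text


def raid0_effective_bandwidth(n_volumes, per_volume_mb_s, network_mb_s):
    """Idealized RAID 0 throughput: linear in the number of volumes,
    capped by the network bandwidth."""
    if not 1 <= n_volumes <= MAX_VOLUMES:
        raise ValueError(f"n_volumes must be between 1 and {MAX_VOLUMES}")
    return min(n_volumes * per_volume_mb_s, network_mb_s)


if __name__ == "__main__":
    NETWORK_MB_S = 1250.0     # 10 Gb/s expressed in MB/s (decimal units)
    PER_VOLUME_MB_S = 100.0   # hypothetical placeholder value
    for n in (1, 4, 8, 12):
        print(n, raid0_effective_bandwidth(n, PER_VOLUME_MB_S, NETWORK_MB_S))
```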
TABLE VI
IMAGE COREGISTRATION STEP: EFFICIENCY AND I/O METRICS (RETRIEVED FROM THE CLOUDWATCH MONITORING SYSTEM)
AS A FUNCTION OF THE NUMBER OF EXPLOITED NODES ON AWS CLOUD
TABLE VII
PHASE UNWRAPPING STEP: EFFICIENCY AND I/O METRICS (RETRIEVED FROM THE CLOUDWATCH MONITORING SYSTEM)
AS A FUNCTION OF THE NUMBER OF EXPLOITED NODES ON AWS CLOUD
This work represents a relevant step toward the challenging EO scenario focused on the joint exploitation of advanced
DInSAR techniques and CC platforms for the massive processing of Big SAR Data. This will give the opportunity
of generating value-added interferometric products on a very
large scale and in short times, thus broadening the path to a
comprehensive understanding of Earth's surface deformation
dynamics.
ACKNOWLEDGMENT
The authors would like to thank S. Guarino, F. Parisi, and
M.C. Rasulo for their technical support.
REFERENCES
[1] E. Sansosti, F. Casu, M. Manzo, and R. Lanari, "Space-borne radar interferometry techniques for the generation of deformation time series: An
advanced tool for Earth's surface displacement analysis," Geophys. Res.
Lett., vol. 37, 2010, doi: 10.1029/2010GL044379.
[2] P. Berardino, G. Fornaro, R. Lanari, and E. Sansosti, "A new algorithm
for surface deformation monitoring based on small baseline differential
SAR interferograms," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 11,
pp. 2375-2383, Nov. 2002.
[3] F. Casu, M. Manzo, and R. Lanari, "A quantitative assessment of the
SBAS algorithm performance for surface deformation retrieval from
DInSAR data," Remote Sens. Environ., vol. 102, pp. 195-210, 2006.
[4] A. Manconi et al., "On the effects of 3-D mechanical heterogeneities
at Campi Flegrei caldera, southern Italy," J. Geophys. Res. Solid Earth,
vol. 115, 2010, doi: 10.1029/2009JB007099.
[5] R. Lanari, F. Casu, M. Manzo, and P. Lundgren, "Application of the
SBAS-DInSAR technique to fault creep: A case study of the Hayward
fault, California," Remote Sens. Environ., vol. 109, pp. 20-28, 2007.
[6] F. Calò et al., "Enhanced landslide investigations through advanced
DInSAR techniques: The Ivancich case study, Assisi, Italy," Remote Sens.
Environ., vol. 142, pp. 69-82, 2014.
[7] R. Lanari, P. Lundgren, M. Manzo, and F. Casu, "Satellite radar
interferometry time series analysis of surface deformation for Los
Angeles, California," Geophys. Res. Lett., vol. 31, 2004, doi:
10.1029/2004GL021294.
[8] M. Bonano, M. Manunta, A. Pepe, L. Paglia, and R. Lanari, "From previous C-band to new X-band SAR systems: Assessment of the DInSAR
mapping improvement for deformation time-series retrieval in urban
areas," IEEE Trans. Geosci. Remote Sens., vol. 51, no. 4, pp. 1973-1984,
Apr. 2013.
[9] A. Pepe, E. Sansosti, P. Berardino, and R. Lanari, "On the generation of ERS/ENVISAT DInSAR time-series via the SBAS technique,"
IEEE Geosci. Remote Sens. Lett., vol. 2, no. 3, pp. 265-269, Jul.
2005.
[10] M. Bonano, M. Manunta, M. Marsella, and R. Lanari, "Long-term
ERS/ENVISAT deformation time-series generation at full spatial resolution via the extended SBAS technique," Int. J. Remote Sens., vol. 33,
pp. 4756-4783, 2012.
[11] S. Salvi et al., "The Sentinel-1 mission for the improvement of the scientific understanding and the operational monitoring of the seismic cycle,"
Remote Sens. Environ., vol. 120, pp. 164-174, May 2012.
[12] A. Rucci, A. Ferretti, A. M. Guarnieri, and F. Rocca, "Sentinel 1 SAR
interferometry applications: The outlook for sub millimeter measurements," Remote Sens. Environ., vol. 120, pp. 156-163, May 2012.
[13] F. De Zan and A. Monti Guarnieri, "TOPSAR: Terrain observation by
progressive scans," IEEE Trans. Geosci. Remote Sens., vol. 44, no. 9,
pp. 2352-2360, Sep. 2006.
[14] F. Casu et al., "SBAS-DInSAR parallel processing for deformation time-series computation," IEEE J. Sel. Topics Appl. Earth Observ. Remote
Sens., vol. 7, no. 8, pp. 3285-3296, Aug. 2014.
[15] P. Imperatore et al., "Scalable performance analysis of the parallel SBAS-DInSAR algorithm," in Proc. IEEE Int. Geosci. Remote Sens. Symp.
(IGARSS'14), Quebec City, Canada, pp. 350-353, 2014.
[16] S. Elefante et al., "G-POD implementation of the P-SBAS DInSAR algorithm to process big volumes of SAR data," in Proc. Conf. Big Data from
Space (BiDS'14), 2014.