
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (SUTC 2008)

11-13 June 2008
Taichung, Taiwan


Editors
Mukesh Singhal, University of Kentucky, USA
Giovanna Di Marzo Serugendo, University of London, UK
Jeffrey J. P. Tsai, University of Illinois, Chicago, USA
Wang-Chien Lee, Pennsylvania State University, USA
Kay Römer, ETH Zurich, Switzerland
Yu-Chee Tseng, National Chiao-Tung University, Taiwan
Han C. W. Hsiao, Asia University, Taiwan

Sponsors
IEEE Computer Society
National Science Council, Taiwan, ROC
Academia Sinica, Taiwan, ROC
Institute for Information Industry, Taiwan, ROC
Asia University, Taiwan, ROC
IEEE Computer Society Conference Publications Operations Committee
CPOC Chair
Chita R. Das
Professor, Penn State University
Board Members
Mike Hinchey, Director, Software Engineering Lab, NASA Goddard
Paolo Montuschi, Professor, Politecnico di Torino
Jeffrey Voas, Director, Systems Assurance Technologies, SAIC
Suzanne A. Wagner, Manager, Conference Business Operations
Wenping Wang, Associate Professor, University of Hong Kong
IEEE Computer Society Executive Staff
Angela Burgess, Executive Director
Alicia Stickley, Senior Manager, Publishing Services
Thomas Baldwin, Senior Manager, Meetings & Conferences
IEEE Computer Society Publications
The world-renowned IEEE Computer Society publishes, promotes, and distributes a wide variety of authoritative
computer science and engineering texts. These books are available from most retail outlets. Visit the CS Store at
http://www.computer.org/portal/site/store/index.jsp for a list of products.

IEEE Computer Society Conference Publishing Services (CPS)


The IEEE Computer Society produces conference publications for more than 250 acclaimed international
conferences each year in a variety of formats, including books, CD-ROMs, USB Drives, and on-line publications.
For information about the IEEE Computer Society’s Conference Publishing Services (CPS), please e-mail:
cps@computer.org or telephone +1-714-821-8380. Fax +1-714-761-1784. Additional information about Conference
Publishing Services (CPS) can be accessed from our web site at: http://www.computer.org/cps
IEEE Computer Society / Wiley Partnership
The IEEE Computer Society and Wiley partnership allows the CS Press Authored Book program to produce a
number of exciting new titles in areas of computer science and engineering with a special focus on software
engineering. IEEE Computer Society members continue to receive a 15% discount on these titles when purchased
through Wiley or at: http://wiley.com/ieeecs. To submit questions about the program or send proposals, please e-
mail jwilson@computer.org or telephone +1-714-816-2112. Additional information regarding the Computer
Society’s authored book program can also be accessed from our web site at:
http://www.computer.org/portal/pages/ieeecs/publications/books/about.html
Revised: 21 January 2008

CPS Online is our innovative online collaborative conference publishing system designed to speed the delivery of
price quotations and provide conferences with real-time access to all of a project's publication materials during
production, including the final papers. The CPS Online workspace gives a conference the opportunity to upload
files through any Web browser, check status and scheduling on their project, make changes to the Table of Contents
and Front Matter, approve editorial changes and proofs, and communicate with their CPS editor through discussion
forums, chat tools, commenting tools and e-mail.
The following is the URL link to the CPS Online Publishing Inquiry Form:
http://www.ieeeconfpublishing.org/cpir/inquiry/cps_inquiry.html
Proceedings

SUTC 2008

Table of Contents

Foreword ...........................................................................................................................................................xiii
Welcome from the Conference Program Co-Chairs ....................................................xv
Committees ....................................................................................................................................... xvi

Keynotes

Cyber-Physical Systems: A New Frontier .....................................................................................................................1


Lui Sha, Sathish Gopalakrishnan, Xue Liu, and Qixin Wang

Security Enforcement Model for Distributed Usage Control ......................................................................................10


Xinwen Zhang, Jean-Pierre Seifert, and Ravi Sandhu

China’s National Research Project on Wireless Sensor Networks ..............................................................................19


Lionel M. Ni

Mobile and Wireless Mesh/Ad Hoc Networks

Strong QoS and Collision Control in WLAN Mesh and Ubiquitous Networks ..........................................................20
Chi-Hsiang Yeh and Richard Wu

Construct Small Worlds in Wireless Networks Using Data Mules .............................................................................28


Chang-Jie Jiang, Chien Chen, Je-Wei Chang, Rong-Hong Jan, and
Tsun Chieh Chiang

Load Awareness Multi-channel MAC Protocol Design for Ad Hoc Networks...........................................................36


Chih-Min Chao and Kuo-Hsiang Lu

A Reactive Local Positioning System for Ad Hoc Networks ......................................................................................44
Sungil Kim, Yoo Chul Chung, Yangwoo Ko, and Dongman Lee

Security, Privacy, and Trust

A Novel Distributed Authentication Framework for Single Sign-On Services ...........................................................52


Kaleb Brasee, S. Kami Makki, and Sherali Zeadally

A Study on Digital Audio Watermarking Internet Applications .................................................................................59


Yiju Wu and Shigeru Shimamoto

An Enhanced Trust Model Based on Reputation for P2P Networks ...........................................................................67


Xu Wu, Jingsha He, and Fei Xu

On the Security of the Full-Band Image Watermark for Copyright Protection ...........................................................74
Chu-Hsing Lin, Jung-Chun Liu, and Pei-Chen Han

Ubiquitous and Wireless Security

Using Body Sensor Networks for Increased Safety in Bomb Disposal Missions........................................................81
John Kemp, Elena I. Gaura, James Brusey, and C. Douglas Thake

A Cloaking Algorithm Based on Spatial Networks for Location Privacy ...................................................................90


Po-Yi Li, Wen-Chih Peng, Tsung-Wei Wang, Wei-Shinn Ku, Jianliang Xu,
and J. A. Hamilton Jr.

Controlled Disclosure of Context Information across Ubiquitous Computing Domains.............................................98


Cristian Hesselman, Henk Eertink, Martin Wibbels, Kamran Sheikh, and
Andrew Tokmakoff

Efficient Proxy Signatures for Ubiquitous Computing..............................................................................................106


Santosh Chandrasekhar, Saikat Chakrabarti, and Mukesh Singhal

Deployment and Coverage of Wireless Sensor Networks

Sub-optimal Step-by-Step Node Deployment Algorithm for User Localization
in Wireless Sensor Networks.....................................................................................................................114
Yuh-Ren Tsai and Yuan-Jiun Tsai

Neighborhood-Aware Density Control in Wireless Sensor Networks.......................................................................122


Mu-Huan Chiang and Gregory T. Byrd

Two-Way Beacon Scheduling in ZigBee Tree-Based Wireless Sensor Networks ....................................................130


Lun-Wu Yeh, Meng-Shiuan Pan, and Yu-Chee Tseng

Applications and Protocols of Wireless Sensor Networks

Algorithms and Methods beyond the IEEE 802.15.4 Standard for a Wireless
Home Network Design and Implementation .............................................................................................................138
M. A. Lopez-Gomez, A. Florez-Lara, J. M. Jimenez-Plaza, and
J. C. Tejero-Calado

Service-Oriented Design Methodology for Wireless Sensor Networks:
A View through Case Studies....................................................................................................................146
Elena Meshkova, Janne Riihijärvi, Frank Oldewurtel, Christine Jardak,
and Petri Mähönen

Kuka: An Architecture for Associating an Augmented Artefact with Its User
Using Wearable Sensors ............................................................................................................................154
Kaori Fujinami and Susanna Pirttikangas

Generating a Tailored Middleware for Wireless Sensor Network Applications........................................................162


Christian Buckl, Stephan Sommer, Andreas Scholz, Alois Knoll,
and Alfons Kemper

Energy Efficient Object Tracking in Sensor Networks by Mining Temporal
Moving Patterns.........................................................................................................................................170
Vincent S. Tseng, Kawuu W. Lin, and Ming-Hua Hsieh

Network Resource Optimization and Management

A Space-Time Network Optimization Model for Traffic Coordination and
Its Evaluation.............................................................................................................................................177
Nirav Shah, Subodha Kumar, Farokh Bastani, and I-Ling Yen

Fair Broadcasting Schedules on Dependent Data in Wireless Environments............................................................185


Ming-Te Shih and Chuan-Ming Liu

Hovering Information—Self-Organising Information that Finds Its Own Storage ...................................................193


Alfredo A. Villalba Castro, Giovanna Di Marzo Serugendo, and Dimitri Konstantas

Pervasive and Power-Aware Computing and Communications

EvAnT: Analysis and Checking of Event Traces for Wireless Sensor Networks......................................................201
Matthias Woehrle, Christian Plessl, Roman Lim, Jan Beutel, and Lothar Thiele

Power-Aware Real-Time Scheduling upon Identical Multiprocessor Platforms.......................................................209


Vincent Nélis, Joël Goossens, Raymond Devillers, Dragomir Milojevic,
and Nicolas Navet

Finding Similar Answers in Data-Centric Sensor Networks .....................................................................................217


I-Fang Su, Yu-Chi Chung, and Chiang Lee

Energy-Efficient Real-Time Co-scheduling of Multimedia DSP Jobs ......................................................................225


Chien-Wei Chen, Chuan-Yue Yang, Tei-Wei Kuo, and Ming-Wei Chang

Special Session

An Automated Bacterial Colony Counting System ...................................................................................................233


Chengcui Zhang, Wei-Bang Chen, Wen-Lin Liu, and Chi-Bang Chen

LOFT: Low-Overhead Freshness Transmission in Sensor Networks........................................................................241


Chin-Tser Huang

Inter-domain Authentication for Seamless Roaming in Heterogeneous
Wireless Networks.....................................................................................................................................249
Summit R. Tuladhar, Carlos E. Caicedo, and James B. D. Joshi

Structural Videotext Regions Completion with Temporal-Spatial Consistency........................................................256


Tsung-Han Tsai and Chih-Lun Fang

Effective Feature Space Reduction with Imbalanced Data for Semantic
Concept Detection .....................................................................................................................................262
Lin Lin, Guy Ravitz, Mei-Ling Shyu, and Shu-Ching Chen

Industry Panel

Wireless Sensor Network Industrial View? What Will Be the Killer Apps
for Wireless Sensor Network? ...................................................................................................................................270
Ming-Whei Feng

Sensor Network, Where Is It Going to Be? ...............................................................................................................271


Polly Huang

Industry Track

A Framework of Machine Learning Based Intrusion Detection for Wireless
Sensor Networks........................................................................................................................................272
Zhenwei Yu and Jeffrey J. P. Tsai

Mobile Intelligence for Delay Tolerant Logistics and Supply Chain Management...................................................280
Tianle Zhang, Zongwei Luo, Edward C. Wong, C. J. Tan, and Feng Zhou

Towards Scalable Deployment of Sensors and Actuators for Industrial Applications ..............................................285
Han Chen, Paul Chou, and Hao Yang

An Environment Sensor Fusion Application on Smart Building Skins .....................................................................291


Kun-Cheng Tsai, Jing-Tian Sung, and Ming-Hui Jin

Automated Management of Assets Based on RFID ..................................................................................................296


Shengguang Meng, Dickson K.W. Chiu, Liu Wenyin, and Xuxiang Chen

ZigBee Source Route Technology in Home Application ..........................................................................................302


Yao-Ting Wu

Two Practical Considerations of Beacon Deployment for Ultrasound-Based
Indoor Localization Systems .....................................................................................................................................306
Chun-Chieh Hsiao and Polly Huang

A Measurement-Based Method for Improving Data Center Energy Efficiency........................................................312


Hendrik F. Hamann

Solution Templates Tool for Enterprise Business Applications Integration..............................................................314


Shiwa S. Fu, Jeaha Yang, Jim Laredo, Ying Huang, Henry Chang,
Santhosh Kumaran, Jen-Yao Chung, and Yury Kosov

Workshop on Ad Hoc and Ubiquitous Computing

A Fuzzy-Based Transport Protocol for Mobile Ad Hoc Networks............................................................................320


Neng-Chung Wang, Yung-Fa Huang, and Wei-Lun Liu

Region-Based Sensor Selection for Wireless Sensor Networks ................................................................................326


Yoshiyuki Nakamura, Kenji Tei, Yoshiaki Fukazawa, and Shinichi Honiden

CRT-MAC: A Power-Saving Multicast Protocol in the Asynchronous
Ad Hoc Networks ......................................................................................................................................332
Yu-Chen Kuo and Chih-Nung Chen

Adaptive Bandwidth Management and Reservation Scheme in Heterogeneous
Wireless Networks.....................................................................................................................................338
I-Shyan Hwang, Bor-Jiunn Hwang, Ling-Feng Ku, and Pen-Ming Chang

WAP: Wormhole Attack Prevention Algorithm in Mobile Ad Hoc Networks .........................................................343


Sun Choi, Doo-young Kim, Do-hyeon Lee, and Jae-il Jung

Performance of a Hierarchical Cluster-Based Wireless Sensor Network ..................................................................349


Yung-Fa Huang, Neng-Chung Wang, and Ming-Che Chen

Ad Hoc Collaborative Filtering for Mobile Networks...............................................................................................355


Patrick Gratz, Adrian Andronache, and Steffen Rothkugel

A Reconfigurable Distributed Broker Infrastructure for Publish Subscribe
Based MANET ..........................................................................................................................................361
Mayank Pandey and B. D. Chaudhary

A Hilbert Curve-Based Distributed Index for Window Queries in Wireless
Data Broadcast Systems ............................................................................................................................367
Jun-Hong Shen and Ye-In Chang

An Efficient Quorum-Based Fault-Tolerant Approach for Mobility Agents
in Wireless Mobile Networks ....................................................................................................................373
Yeong-Sheng Chen, Chien-Hsun Chen, and Hua-Yin Fang

PNECOS: A Peer-to-Peer Network Coding Streaming System ................................................................................379


Tein-Yaw Chung, Chih-Cheng Wang, Yung-Mu Chen, and Yang-Hui Chang

Workshop on Ambient Semantic Computing

A Common Concept Description of Natural Language Texts as the Foundation
of Semantic Computing on the Web..........................................................................................................385
Mitsuru Ishizuka

BioSemantic System: Applications of Structured Natural Language to Biological
and Biochemical Research.........................................................................................................................386
David Hecht, Rouh-Mei Hu, Rong-Ming Chen, Jong-Waye Ou,
Chao-Yen Hsu, Haitao Gong, Ka-Lok Ng, Han C. W. Hsiao,
Jeffrey J. P. Tsai, and Phillip C.-Y. Sheu

Comparing the Conceptual Graphs Extracted from Patent Claims............................................................................394


Shih-Yao Yang and Von-Wun Soo

Semantic Enforcement of Privacy Protection Policies via the Combination
of Ontologies and Rules ............................................................................................................................400
Yuh-Jong Hu, Hong-Yi Guo, and Guang-De Lin

Development of an Integrated Platform for Social Collaborations Based on
Semantic Peer Network .............................................................................................................................408
Ching-Long Yeh, Yun-Maw Cheng, and Li-Chieh Chen

A Survey of State of the Art Biomedical Text Mining Techniques
for Semantic Analysis................................................................................................................................410
Hong-Jie Dai, Chi-Hsin Huang, Jaimie Yi-Wen Lin, Pei-Hsuan Chou,
Richard Tzong-Han Tsai, and Wen-Lian Hsu

Adaptive Automatic Segmentation of HEp-2 Cells in Indirect Immunofluorescence
Images .......................................................................................................................................................418
Yu-Len Huang, Yu-Lang Jao, Tsu-Yi Hsieh, and Chia-Wei Chung

Outline Detection for the HEp-2 Cell in Indirect Immunofluorescence Images
Using Watershed Segmentation.................................................................................................................423
Yu-Len Huang, Chia-Wei Chung, Tsu-Yi Hsieh, and Yu-Lang Jao

A Multi-layered Approach to the Polysemy Problems in a Chinese to
Taiwanese System .....................................................................................................................................428
Yih-Jeng Lin, Ming-Shing Yu, Chin-Yu Lin, and Yuan-Tsun Lin

Extracting Alternative Splicing Information from Captions and Abstracts
Using Natural Language Processing..........................................................................................................436
Chia Yang Cheng, F. R. Hsu, and Chuan Yi Tang

Workshop on Embedded Processors, Sensors, and Actuators

New Frontiers of Microcontroller Education: Introducing SiLabs ToolStick
University Daughter Card..........................................................................................................................439
Gourab Sen Gupta and Chew Moi-Tin

An Embedded Computing Platform for Robot ..........................................................................................................445


Ching-Han Chen and Sz-Ting Liou

Actuation Design of Two-Dimensional Self-Reconfigurable Robots........................................................................451
Ming-Chiuan Shiu, Hou-Tsan Lee, Feng-Li Lian, and Li-Chen Fu

Fabrication of Microfluidic Pump Using Conducting Polymer Actuator ..................................................................457


Jung Ho Kim, King Tong Lau, Dermot Diamond

Calibration and Data Integration of Multiple Optical Flow Sensors for Mobile
Robot Localization ....................................................................................................................................................464
Jwu-Sheng Hu, Yung-Jung Chang, and Yu-Lun Hsu

A Combined Platform of Wireless Sensors and Actuators Based on
Embedded Controller.................................................................................................................................470
S. C. Mukhopadhyay, G. Sen Gupta, and R. Y. M. Huang

A Novel Multiphysics Sensoring Method Based on Thermal and EC Techniques
and Its Application for Crack Inspection...................................................................................................475
Cheng-Chi Tai and Yen-Lin Pan

Training Data Compression Algorithms and Reliability in Large Wireless
Sensor Networks........................................................................................................................................480
Vasanth Iyer, Garimella Rammurthy, and M. B. Srinivas

Workshop on Intelligent Multimedia Processing
for Ubiquitous Applications

Video Summarization Based on Semantic Feature Analysis and User Preference....................................................486


Wen-Nung Lie and Kuo-Chiang Hsu

Intelligent Multimedia Recommender by Integrating Annotation and
Association Mining....................................................................................................................................492
Vincent S. Tseng, Ja-Hwung Su, Bo-Wen Wang, Chin-Yuan Hsiao,
Jay Huang, and Hsin-Ho Yeh

Acoustic and Phoneme Modeling Based on Confusion Matrix for Ubiquitous
Mixed-Language Speech Recognition.......................................................................................................500
Po-Yi Shih, Jhing-Fa Wang, Hsiao-Ping Lee, Hung-Jen Kai, Hung-Tzu Kao,
and Yuan-Ning Lin

Speech Watermarking Based on Wavelet Transform and BCH Coding ...................................................................507


Shi-Huang Chen, Shih-Yin Yu, and Chung-Hsien Chang

Workshop on Mobile, Ubiquitous and Classroom Technology
Enhanced Learning

QoS-Based Learning Services Composition for Ubiquitous Learning ......................................................................513


Fu-Ming Huang, Ci-Wei Lan, and Stephen J. H. Yang

Collaborative Annotation Creation and Access in a Multimodal Environment
with Heterogeneous Devices for Decision Support and for Experience Sharing.......................................519
Charles Robert

A Computer-Assisted Approach for Designing Context-Aware Ubiquitous
Learning Activities ....................................................................................................................................................524
Tzu-Chi Yang, Fan-Ray Kuo, Gwo-Jen Hwang, and Hui-Chun Chu

Usability Comparison of Pen-Based Input for Young Children on Mobile Devices.................................................531


Chih-Kai Chang

Evaluation of the Learning of Scientific English in Podcasting PCs, MP3s,
and MP4s Scenarios...................................................................................................................................537
Siew-Rong Wu

The Study of Using Sure Stream to Construct Ubiquitous Learning Environment ...................................................543
Koun-Tem Sun and Hsin-Te Chan

Workshop on Ubiquitous Service Computing

Interest-Based Peer Selection in P2P Network..........................................................................................................549


Harry Chiou, Addison Su, and Stephen Yang

Free-Form Annotation Tool for Collaboration ..........................................................................................................555


Han-Zhen Wu, Stephen J. H. Yang, and Yu-Sheng Su

Hands-On Training for Chemistry Laboratory in a Ubiquitous Computing
Environment ..............................................................................................................................................561
Mune-Aki Sakamoto and Masakatsu Matsuishi

A Progress Report and a Proposal: Interactivity in Ubiquitous Learning
Enhanced by Virtual Tutors in E-learning Contents..................................................................................564
Toshiyuki Yamamoto and Ryo Miyashita

Collaborative Interpretative Service Assisted Design System Based on Hierarchical
Case Based Approach................................................................................................................................569
Huan-Yu Lin, Shian-Shyong Tseng, Jui-Feng Weng, and Jun-Ming Su

Author Index .................................................................................................................................................577

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Cyber-Physical Systems: A New Frontier


Lui Sha (1), Sathish Gopalakrishnan (2), Xue Liu (3), and Qixin Wang (1)
(1) University of Illinois at Urbana-Champaign, (2) University of British Columbia, (3) McGill University
lrs@cs.uiuc.edu; sathish@ece.ubc.ca; xueliu@cs.mcgill.ca; qwang4@uiuc.edu

Abstract: The report of the President's Council of Advisors on Science and Technology (PCAST) has placed CPS on the top of the priority list for federal research investment [6]. This article first reviews some of the challenges and promises of CPS, followed by an articulation of some specific challenges and promises that are more closely related to the Sensor Networks, Ubiquitous and Trustworthy Computing Conference.

1. Introduction

The Internet has made the world "flat" by transcending space. We can now interact with people and get useful information around the globe in a fraction of a second. The Internet has transformed how we conduct research, studies, business, services, and entertainment. However, there is still a serious gap between the cyber world, where information is exchanged and transformed, and the physical world in which we live. The emerging cyber-physical systems shall enable a modern grand vision for societal-level services that transcend space and time at scales never possible before.

Two of the greatest challenges of our time are global warming coupled with energy shortage, and the rapid aging of a significant fraction of the world's population with the related chronic diseases that threaten to bankrupt healthcare services, such as Medicare, or to dramatically cut back medical benefits.

During the meeting of the World Business Council for Sustainable Development in Beijing on March 29, 2006, George David¹ noted: "More than 90 percent of the energy coming out of the ground is wasted and doesn't end as useful. This is the measure of what's in front of us and why we should be excited." Buildings and transportation are sectors with heavy energy consumption. During the NSF CDI Symposium (September 5-6, 2007) at RPI, Clas A. Jacobson² noted that green buildings hold great promise. Energy used in lighting and cooling buildings is estimated at 3.3 trillion kWh. Technologically, it is possible to reach the state of Net Zero Energy Buildings, where the 60-70% efficiency gains required to reduce demand are achieved and the balance is supplied by renewables. However, to reach the goal of net zero energy buildings, we must tightly integrate the cyber world and the physical world. He noted that in the past the science of computation has systematically abstracted away the physical world and vice versa. It is time to construct a Hybrid Systems Science that is simultaneously computational and physical, providing us with a unified framework for robust design flow with multi-scale dynamics and with integrated wired and wireless networking for managing the flows of mass, energy, and information in a coherent way.

According to the Department of Energy, the transportation share of the United States' energy use reached 28.4% in 2006, which is the highest share recorded since 1970³. In the United States, passenger and cargo airline operations alone required 19.6 billion gallons of jet fuel in 2006. According to Time⁴, 88% of all trips in the U.S. are by car. Work-related needs, including the daily work commute and business travel, are a significant fraction of the transportation cost.

¹ Chairman and CEO of United Technology Research Center
² Chief Scientist, Control, United Technology Research Center
³ http://cta.ornl.gov/data/new_for_edition26.shtml
⁴ www.time.com/time/specials/2007/environment/article/0,28804,1602354_1603074_1603122,00.html

978-0-7695-3158-8/08 $25.00 © 2008 IEEE    DOI 10.1109/SUTC.2008.85
Telepresence research seeks to make all interactions seem local rather than remote. It is one of the three grand challenges in multimedia research⁵ to make interactions with remote people and environments nearly the same as interactions with local people and environments. Integrating wired and wireless networks with real-time, interactive, immersive three-dimensional environments and tele-operation can minimize work-related travel.

The rapidly aging population with age-related chronic diseases is another formidable societal challenge. It is alarming to note that the growth of per-capita health cost has been increasing near exponentially with an increase in the age of the population.

According to the CDC⁶, more than 90 million Americans live with chronic illnesses.

• Chronic diseases account for 70% of all deaths in the United States.
• The medical care costs of people with chronic diseases account for more than 75% of the nation's $1.4 trillion medical care costs.
• Chronic diseases account for one-third of the years of potential life lost before age 65.

Advanced biotechnology holds great promise to improve the health of an aging population. For example, stem-cell biotechnology holds the promise of treatment for many age-related diseases. According to NIH⁷, "stem cells, directed to differentiate into specific cell types, offer the possibility of a renewable source of replacement cells and tissues to treat diseases including Parkinson's and Alzheimer's diseases, spinal cord injury, stroke, burns, heart disease, diabetes, osteoarthritis, and rheumatoid arthritis." In addition, "Human stem cells could also be used to test new drugs. For example, new medications could be tested for safety on differentiated cells generated from human pluripotent cell lines."

However, much of this potential is not tapped, largely due to lack of sufficient knowledge of the complex and dynamic stem-cell microenvironment, also known as the niche. There is a need to mimic niche conditions precisely in artificial environments to correctly regulate stem cells ex vivo. Indeed, the sensing and the control of the stem cell microenvironment are at the frontier of stem cell research.

⁵ http://delivery.acm.org/10.1145/1050000/1047938/p3-rowe.pdf?key1=1047938&key2=8175939811&coll=GUIDE&dl=GUIDE&CFID=15151515&CFTOKEN=6184618
⁶ http://www.cdc.gov/nccdphp/overview.htm#2
⁷ http://stemcells.nih.gov/info/basics/basics6.asp
⁸ Professor, ECSE & Biomedical Engineering, RPI and Associate Director, NSF ERC Center for Subsurface Sensing & Imaging Systems
According to Badri Roysam⁸, the stem cell niche has a complex multi-cellular architecture that has many parameters, including multiple cell types related by lineage, preferred spatial locations and orientations of cells relative to blood vessels, soluble factors, insoluble factors related to the extra-cellular matrix, bio-electrical factors, biomechanical factors, and geometrical factors. The combinatorial space of parameter optimization and niche environment control is a grand challenge in embedded sensing and actuation.

A closely related problem is providing care to the elderly population without sending them to expensive nursing homes. In the United States alone, the number of people over age 65 is expected to reach 70 million by 2030, doubling from 35 million in 2000. Expenditure in the United States for health care will grow to 15.9% of the GDP ($2.6 trillion) by 2010. Unless the cost of health care for the elderly can be significantly reduced, financially stressed Social Security and Medicare/Medicaid systems will likely lead to burdensome tax increases and/or benefit reductions. A major cost is the loss of the ability to remain in the home because of the need for greater health care supervision.

One crucial factor contributing to the loss of independence and the resulting institutionalization is the need for assistance in physical mobility. Another key factor is cognitive impairment that requires daily supervision of medication and health-condition monitoring. When future CPS infrastructure supports tele-presence, persons with one or more minor mobility impairments can regain their freedom of movement at home. In addition, physiological parameters critical to the medical maintenance of health can be monitored remotely. When the elderly can maintain their independent living without loss of privacy, a major financial saving in senior care will result. Furthermore, the elderly will be much happier living independently at home while staying in contact with their social networks.

As observed by SUTC 2008, "rapid research and technological advances in wireless communications and increasing availability of sensors, actuators, and mobile devices have created an exciting new ubiquitous computing paradigm that facilitates computing and communication services all the time, everywhere. This emerging paradigm is changing the way we live and work today. Via a ubiquitous infrastructure consisting of a variety of global and localized networks, users, sensors, devices, systems and applications may seamlessly interact with each other and even the physical world in unprecedented ways. To realize this continually evolving ubiquitous computing paradigm, trustworthy computing that delivers secure, private, and reliable computing and communication services play an essential role."

Clearly, SUTC research has a key role to play in the emerging cyber-physical system of systems. A variety of questions need to be answered, at different layers of the architecture and from different aspects of systems design, to trigger and to ease the integration of the physical and cyber worlds.

2. The Challenges of Cyber-Physical System Research

2.1. Real-time System Abstractions

Future distributed sensors, actuators, and mobile devices with both deterministic and stochastic data traffic require a new paradigm for real-time resource management that goes far beyond traditional methods. The interconnection topology of mobile devices is dynamic, and the system infrastructure can also be dynamically reconfigured in order to contain system disruptions or optimize system performance. There is a need for novel distributed real-time computing and real-time group communication methods for dynamic topology control in wireless CPS systems with mobile components. Understanding and eventually controlling the impact of reconfigurable topologies on real-time performance, safety, security, and robustness will have tremendous impact on distributed CPS system architecture design and control.
Existing hardware design and programming abstractions for computing are largely built on the premise that the principal task of a computer is data transformation. Yet cyber-physical systems are real-time systems. This requires a critical re-examination of existing hardware and software architectures that have been built over the last several decades. There are foundational opportunities that have the potential of defining the landscape of computation in the cyber-physical world. When computation interacts with the physical world, we need to explicitly deal with events distributed in space and time. Timing and spatial information need to be explicitly captured into programming models. Other physical and logical properties such as physical laws, safety, or power constraints, resources, robustness, and security characteristics should be captured in a composable manner in programming abstractions. Such programming abstractions may necessitate a dramatic rethinking of the traditional split between programming languages and operating systems. Similar changes are required at the software/hardware level given performance, flexibility, and power tradeoffs.

We also need strong real-time concurrent programming abstractions. Such abstractions should be built upon a model of simultaneity: bands in which the system delays are much smaller than the time constant of the physical phenomenon of interest. The programming abstractions that are needed should also capture the ability of software artifacts to execute at multiple capability levels. This is motivated by the need for software components to migrate within a cyber-physical system and execute on devices with different capabilities. Software designers should be capable of expressing the deprecated functionality of a software component when it executes on a device with limited resources.
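To make the notion of multiple capability levels concrete, the following is a minimal, hypothetical Python sketch (ours, not the paper's); the names CapabilityLevel, Device, and select_level are invented for illustration. A migratable component declares several levels with their execution-time and memory demands, and the hosting device selects the richest level it can actually sustain, degrading gracefully rather than refusing to run.

```python
from dataclasses import dataclass

@dataclass
class CapabilityLevel:
    name: str        # e.g. "full", "reduced", "minimal"
    wcet_ms: int     # worst-case execution time per period at this level
    memory_kb: int   # memory footprint required on the hosting device

@dataclass
class Device:
    cpu_budget_ms: int   # CPU time the device can grant per period
    memory_kb: int

def select_level(levels, device):
    """Pick the richest level the hosting device can sustain, so a migrating
    component degrades gracefully instead of failing outright."""
    for level in sorted(levels, key=lambda l: l.wcet_ms, reverse=True):
        if level.wcet_ms <= device.cpu_budget_ms and level.memory_kb <= device.memory_kb:
            return level
    raise RuntimeError("device cannot host even the minimal capability level")

tracker = [
    CapabilityLevel("full", wcet_ms=4, memory_kb=512),
    CapabilityLevel("reduced", wcet_ms=2, memory_kb=128),
    CapabilityLevel("minimal", wcet_ms=1, memory_kb=32),
]
print(select_level(tracker, Device(cpu_budget_ms=3, memory_kb=256)).name)  # -> reduced
```

A real abstraction would also expose the simultaneity band (period and deadline) and be enforced by the runtime rather than by convention; this sketch only illustrates the graceful-degradation idea.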
The programming abstractions that we envision will need support at the middleware and operating system layers for:

• Real-time event triggers,
• Consistent views of distributed states in real-time within the sphere of influence (this challenge is especially great in mobile devices),
• Topology control and "dynamic real-time groups" in the form of packaged service classes of bounded delay, jitter and loss under precisely specified conditions,
• An interface to access the same type of controls regardless of the underlying network technology.

2.2. Robustness, Safety and Security of Cyber-Physical Systems

Uncertainty in the environment, security attacks, and errors in physical devices and in wireless communication pose a critical challenge to ensuring overall system robustness, security and safety. Unfortunately, this is also one of the least understood challenges in cyber-physical systems. There is a clear intellectual opportunity in laying the scientific foundations for robustness, security and safety of cyber-physical systems in general and of SUTC systems in particular. An immediate aim should be to establish prototypical SUTC model challenge problems and a set of useful and coherent metrics that capture uncertainty, errors, faults, failures and security attacks.

We have long accepted that perfect physical devices are rare. A perfect example of this approach is the design of reliable communication protocols that use an inherently error-prone medium, whether wired or wireless. This prudence has not been applied to other software engineering processes. We have depended, more often than not, on the correctness of the results from our microprocessors and other hardware elements. While we have successfully masked many hardware failures using a combination of innovative circuit design, redundancy and replay, we have largely regarded most other errors as either transient – caused by bit flips – or permanent. Transient errors can be ameliorated by re-execution, and permanent failures require migrating tasks to fault-free hardware.
Sub-micron scaling of semiconductor devices and device density, however, will present us with hardware that is more error-prone, and errors are likely to be neither transient nor permanent. Intermittent errors – that last several milliseconds to a few seconds – may not be uncommon in future generation chip multiprocessors [9]. To tolerate intermittent failures, we will likely need to apply algorithms that do not rely on the accuracy of one computation. Ideas concerning imprecise computations [11] will gain more relevance; developing algorithms using those principles will be extremely valuable on the road to robust systems.
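As a concrete illustration of the imprecise-computation idea (in the spirit of [11], but a hypothetical sketch rather than code from that work): the mandatory part below always yields a usable coarse result, and the optional part refines it only while a time budget remains, so a late or interrupted run still returns something meaningful. The function name imprecise_mean and its parameters are illustrative.

```python
import time

def imprecise_mean(samples, deadline, chunk=64):
    """Imprecise-computation pattern: a mandatory part always yields a usable
    (if rough) result; an optional part refines it only while time remains."""
    # Mandatory part: coarse estimate from a small prefix of the data.
    n = min(chunk, len(samples))
    total, used = sum(samples[:n]), n

    # Optional part: keep folding in more samples until the deadline passes.
    while used < len(samples) and time.monotonic() < deadline:
        nxt = samples[used:used + chunk]
        total += sum(nxt)
        used += len(nxt)

    return total / used, used  # approximate result plus how much data was used

readings = [0.1 * i for i in range(200_000)]
estimate, used = imprecise_mean(readings, deadline=time.monotonic() + 0.002)
print(f"estimate={estimate:.2f} from {used}/{len(readings)} samples")
```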
These trends will, however, make our current efforts to build perfect software more difficult. Indeed, there has been great advancement in automated theorem proving and model checking in recent years. However, it is important to remember that cyber-physical systems are real-time systems and the complexity of verifying temporal logic specifications is exponential. That is, like its physical counterpart, a perfect software component is also rare and will remain that way. This has profound implications. We need to invent a cyber-physical system architecture in which the safety critical services of large and complex CPS can be guaranteed by a small subset of modules and their interactions; the design of this subset will have to be formally specified and verified. Their assumptions about the physical environments should be fully tested, and furthermore, we need to develop advanced and integrated static analysis and testing technologies to ensure that 1) the software code is compliant with the design, and 2) the assumptions regarding the external environment are sound. Finally, cyber-physical systems are deeply embedded and they will evolve in place. The verification and validation of a cyber-physical system is not a one-time event; it should be a life-cycle process that produces an explicit body of evidence for the certification of safety critical services.

Safety critical services apart, we still have the great challenge of how to handle known and unknown residual errors, and security gaps in many useful, but not safety critical, cyber-physical components that have not been fully verified and validated. The broadcast nature of a wireless network and interference make these challenges more serious. In physical systems, it is the theory of feedback control that provides the very foundation to achieve robustness and stability despite uncertainty in the environment and errors in sensing and control. The current open-loop architecture in software systems may allow a minor error to cascade into system failure. The loops must be closed across both the cyber world and the physical world. The system must have the capability to effectively counteract uncertainties, faults, failures and attacks. The recent development of formal specification based automatic generation of system behavior monitoring, the steering of computation trajectories, and the use of analytically redundant modules based on different principles, while still in its infancy, is an encouraging development.

Safety, robustness and security of the composed CPS also require explicit and machine checkable assumptions regarding external environments; formally specified and verifiable reduced-complexity critical services and reduced-complexity interaction involving safety critical and non-safety critical components; and analytically redundant sensing and control subsystems based on different physical principles and/or algorithms so as to avoid common mode failures due to faults or attacks. We also need theory and tools to design and ensure well-formed dependency relations between components with different criticality as they share resources and interact. Stable and robust upgrading of running systems will be another critical aspect of cyber-physical systems, especially in large critical infrastructure systems that cannot be shut down or are too expensive to shut down.

2.3. System QoS Composition Challenge
CPS systems are distributed and hybrid real-time dynamic systems, with many loops of different degrees of application criticality operating at different time and space scales. Compositional system modeling, analysis, synthesis and integration for such systems are at the frontier of research. The "science" of system composition has clearly emerged as one of the grand themes driving many of our research questions in networking and distributed systems. By system composition we mean that the QoS properties and functional correctness of the system can be derived from the architectural structure, subsystem interaction protocols, and the local QoS properties and functional properties of the various constituent components.

A framework for system composition should highlight the manner of the composition of components and the methods to derive QoS metrics for a composite system.

Current compositional frameworks have been developed with limited heterogeneity, such as real-time resource management, automata and differential equations. The new theory of system composition must provide a comprehensive treatment of system integration concerns. Each component should provide a system composition interface that specifies not only its input and output but also relevant QoS properties and constraints. In electronic subsystems, the properties of a circuit depend not only on component properties but also on how they are connected together. Likewise, a CPS system's properties will depend on both component properties and the structure of the system architecture.
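The toy Python sketch below (our illustration, not a formalism from the paper) shows one way a component's composition interface could expose QoS bounds beyond inputs and outputs, and how the end-to-end bounds of a series connection might be derived from them; the additive delay/jitter and multiplicative availability rules hold only under the stated series-connection and failure-independence assumptions.

```python
from dataclasses import dataclass

@dataclass
class QoSInterface:
    """What a component might expose for composition, beyond inputs/outputs."""
    delay_ms: float      # worst-case processing/forwarding delay bound
    jitter_ms: float     # bound on delay variation
    availability: float  # probability the component is operational

def compose_series(stages):
    """Derive the QoS of components connected in series: worst-case delays and
    jitter bounds add; availabilities multiply (assuming independent failures)."""
    delay = sum(s.delay_ms for s in stages)
    jitter = sum(s.jitter_ms for s in stages)
    availability = 1.0
    for s in stages:
        availability *= s.availability
    return QoSInterface(delay, jitter, availability)

sensor = QoSInterface(delay_ms=5.0, jitter_ms=1.0, availability=0.999)
radio = QoSInterface(delay_ms=20.0, jitter_ms=8.0, availability=0.99)
fusion = QoSInterface(delay_ms=10.0, jitter_ms=2.0, availability=0.9999)
print(compose_series([sensor, radio, fusion]))
```

A full composition theory would, as argued below, also have to handle probabilistic requirements and interactions between protocols, not just a single pipeline.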
The framework for capturing subsystem requirements needs to be powerful enough to describe both deterministic requirements and probabilistic requirements. Not all subsystems in a CPS will be hard real-time systems, and methods need to evolve to capture different types of requirements, to synthesize compositional requirements, and to determine the feasibility of meeting those requirements.

Large CPS systems will have many different QoS properties encompassing stability, robustness, schedulability, and security, each of which will employ a different set of protocols and will need to be analyzed using a different theory. It is important to note that these protocols may not be orthogonal and, sometimes, could have pathological interactions; for example, the well-known problem of unbounded priority inversion when we use synchronization protocols and real-time priority assignments as is. There are also numerous reports from the field about the adverse interactions between certain security, real-time and fault tolerant protocols. Thus, the theory of system composition must address not only the composability at each QoS dimension but also the question of how the protocols interact.

2.4. Systems Engineering Research

From a systems engineering perspective, we need a scientific methodology to iteratively build both the system structure model and the system behavior model. We need to develop the analytical capability to map behavior onto structure and vice versa so that we can identify what aspects of the required behavior will be performed by which specific parts of the structure. We need techniques to perform quantitative trade-off analysis that takes into account the available technology and the constraints on the cyber components, from the physical components, and from human operators. In the development of CPS systems, constraints imposed by the physical sciences (such as physics, chemistry, materials science) will need to interact with constraints on the computing artifacts (such as computational complexity, robustness, safety and security). To make fundamental progress, we need a combination of model-based system and software design and integration technologies, and deep analysis of the underlying abstractions and their interactions.

With this scientific and engineering framework, we must be able to judiciously choose the location, computing and communication capabilities, as well as energy reserves, of network nodes in order to handle the required data flows efficiently. We should be able to alter the topological characteristics of the network by changing transmission power, medium access control, and communication protocols.
Our challenge is to formulate a new calculus that merges time-triggered and event-driven systems. We need it to be applicable to hierarchies that involve dynamics at drastically different time scales, from months to microseconds, and geographic scope from on-chip to the world. This is a grand challenge.

2.5. Trust in Cyber-Physical Systems

Users of cyber-physical systems will need to place a high level of trust in the operation of the systems. Trust is a combination of many characteristics, mainly reliability, safety, security, privacy and usability.

System models and abstractions (described earlier) have to incorporate fault models and recovery policies that reflect the scale, lifetime, distributed control and replaceability/reparability of components. Safety, as we have mentioned in brief in prior sections, requires attention in a larger context as well. The ubiquitous use of CPS applications should not limit the availability of alternative systems – social and technological – that can handle large-scale failures. We would need to create guards that ensure that the automation does not increase hazards when compared to the non-automated system. The extra emphasis on safety is to highlight the need for tools that provide support for visualizing and analyzing a cyber-physical system within the broader context of other social and cyber-physical systems. Two cyber-physical systems may never interact directly but may be coupled by human behavior, and we need to understand the nature of such interactions and reason about safety as a global, not local, property.

An increased dependence on cyber-physical systems will lead to the collection of a vast amount of human-centric data at various scales. Although cyber-physical systems greatly enrich our life qualities and experiences, they also bring about privacy and security concerns [10]. For example, in applications such as assisted living [5] and wireless medical device networks (Section 3), private personal data and medical data should be protected with different levels of information disclosure to different roles (health care providers, medical team, relatives, or assisted persons). Unauthorized access to private information can have serious consequences. We need an analytical foundation and the associated engineering framework to address privacy protection in cyber-physical systems. A combination of mechanisms for auditing and regulating access to information, for preserving the privacy of individuals while exporting aggregate statistics, and legal procedures for the enforcement of privacy protection would need to evolve to make cyber-physical systems acceptable to a large population.

All systems become usable when complexity that does not need to be exposed to users is kept hidden, and when unavoidable complexity is exposed to users according to cohesive, conceptual models that maximize system predictability, support users' abilities to generalize about such behavior, and minimize corner cases. Usability poses a variety of problems involving human cognition, computer-human interaction and interface design.

3. Medical Device Network: An Example Cyber-Physical System

An example of how to design a medical device network may give us a better understanding of the aforementioned challenges. As noted in the report of the NSF High-Confidence Medical Device Software and Systems (HCMDSS) workshop [7]: "Advances in computing, networking, sensing, and medical-device technology are enabling the dramatic proliferation of diagnostic and therapeutic devices. Those devices range from advanced imaging machines to minimally invasive surgical techniques, from camera-pills to doctor-on-a-chip, from computerized insulin pumps to implantable heart devices.
Although advances in standalone diagnostic and treatment systems have been accelerating steadily, the lack of proper integration and interoperation of those systems produces systemic inefficiencies in health care delivery. This inflates costs and contributes to avoidable medical errors that degrade patient care. The use of software that controls medical devices to overcome these problems is inevitable and will ensure safe advances in health care delivery. The crucial issue, however, is the cost-effective development and production of reliable and safe medical-device software and systems."

The next generation medical system is envisioned as a ubiquitous system of wired and wireless networked medical devices and medical information systems for secured, reliable, privacy-preserving health care. It will be a networked system that improves the quality of life. For example, during a surgical operation, context information such as sensitivity to certain drugs will be automatically routed to relevant devices (such as infusion pumps) to support personalized care and safety management. A patient's reactions – changes in vital signs – to medication and surgical procedures will be correlated with streams of imaging data; streams will be selected and displayed, in the appropriate format and in real time, to medical personnel according to their needs, e.g., surgeons, nurses, anesthetists and anesthesiologists. During particularly difficult stages of a rare surgical operation, an expert surgeon can remotely carry out key steps using remote displays and robot-assisted surgical machines, sparing the surgeon the need to fly across the country to perform, say, a fifteen-minute procedure. Furthermore, data recording will be integrated with storage management such that surgeons can review operations and key findings for longitudinal studies of the efficacy of drugs and operational procedures.

While networked medical devices hold many promises, they also raise many challenges. First, from operating rooms to enterprise systems, different devices and subnets have different levels of clinical criticality. Data streams with different time sensitivities and criticality levels may share many hardware and software infrastructure resources. How to maintain safety in an integrated system is a major challenge that consists of many research issues. Indeed, many medical devices are safety critical and must be certified. Thus, it is important to develop a standards-based, certifiable wired and wireless networked medical device infrastructure to lower the cost of development, approval, and deployment of new technologies/devices. The development of technologies that can formally specify both the application context and the device behaviors is a major challenge for the vision of certifiable plug-and-play medical devices in the future.

Second, most monitoring devices are being moved from wired networks to wireless networks. How do we provide on-demand, reliable, real-time streaming of critical medical information in a wireless network? This is another hard problem. For example, when an EKG device detects potentially dangerous and abnormal heartbeats, it is of critical importance to ensure that not only the warning but also the real-time EKG streams are reliably displayed at nursing stations. Furthermore, reliable on-demand real-time streaming must coexist with other wireless devices. For example, in an intensive care unit, we have 802.11 wireless networks, cellular phones, wireless PDAs, RFID, two-way radios and other RF-emitting devices. This necessitates a network infrastructure to reliably integrate myriad wireless devices and to let them coexist safely, reliably and efficiently. To address these concerns, the FDA has issued an official guideline for medical wireless network development [3].

To design an integrated wired and wireless medical device network, we face all the aforementioned QoS composition challenges. For example, how does one monitor and enforce safe, secure, reliable and real-time sharing of various resources, in particular the wireless spectrum? How does one balance the resources dedicated to reliability, real-time performance and the need for coexistence? What is the programming paradigm and system composition architecture to support safe and secured medical device plug and play [8]?
NSF workshops on Cyber-Physical Systems [1][2].
The authors thank all the workshop participants
for their insightful contributions.

References

[1] Real-time GENI report.


http://www.geni.net/GDD/GDD-06-32.pdf
[2] NSF Workshops on Cyber Physical Systems.
http://varma.ece.cmu.edu/cps/
[3] FDA, Draft Guidance for Industry and FDA Staff
– Radio-Frequency Wireless Technology in Medical
Devices, Jan. 2007.
http://www.fda.gov:80/cdrh/osel/guidance/1618.html
[4] Mu Sun, Qixin Wang, and Lui Sha, "Building
Reliable MD PnP Systems", Proceedings of the Joint
Workshop on High Confidence Medical Devices,
Software, and Systems and Medical Device Plug-and-
Play Interoperability, Jun. 2007.
[5] Qixin Wang, et al., “I-Living: An open system
architecture for assisted living,” Proceedings of the
IEEE International Conference on Systems, Man and
Cybernetics, Oct. 2006, pp. 4268-4275.
[6] http://ostp.gov/pdf/nitrd_review.pdf
[7] Insup Lee, et al., High-confidence medical device
software and systems.
http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/2/3
3950/01620992.pdf
[8] http://www.mdpnp.org/Home_Page.html
[9] Phillip M. Wells, Koushik Chakraborty, and
Gurindar S. Sohi, “Adapting to intermittent faults in
future multicore systems,” Proceedings of the
International Conference on Parallel Architectures and
Compilation Techniques, Sept. 2007.
[10] Jaideep Vaidya and Chris Clifton. “Privacy-
preserving data mining”, IEEE Security & Privacy
Magazine, Vol. 2, No. 6, Nov.-Dec. 2004, pp. 19 – 26.
[11] Jane W.-S. Liu, et al., “Imprecise computations”,
Proceedings of the IEEE, Vol. 82, No. 1, Jan. 1994, pp.
83 – 94.

9
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Security Enforcement Model for Distributed Usage Control

Xinwen Zhang and Jean-Pierre Seifert Ravi Sandhu


Samsung Information Systems America Institute for Cyber Security
San Jose, California, USA University of Texas at San Antonio
{xinwen.z, j.seifert}@samsung.com ravi.sandhu@utsa.edu

Abstract For example, in trust management [4, 9, 14, 19] systems, a


user presents a set of attributes or credentials and another
Recently proposed usage control concept and models ex- subject (e.g., a resource or service provider) can determine
tend traditional access control models with features for con- the permissions of the user based on the presented creden-
temporary distributed computing systems, including con- tials. In this problem, objects are typically protected in a
tinuous access control in dynamic computing environments centralized server. The second problem focuses on continu-
where subject attributes and system states can be changed. ous access control to an object after it is distributed to other
Particularly, this is very useful in specifying security re- (decentralized) locations or platforms, which is referred as
quirements to control the usage of an object after it is re- the usage control problem proposed by researchers in liter-
leased into a distributed environment, which is regarded as atures [15, 21, 22, 27].
one of the fundamental security issues in many distributed Although there is no precise definition in the literature,
systems. However, the enabling technology for usage con- the main goal of usage control is to enable continuous ac-
trol is a challenging problem and the space has not been cess control after an object is released to a different control
fully explored yet. In this paper we identify the general re- domain from its owner or provider, especially in highly dis-
quirements of a trusted usage control enforcement in het- tributed and heterogeneous environments. Typically, a us-
erogeneous computing environments, and then propose a age control policy is defined for a target object by its stake-
general platform architecture and enforcement mechanism holder, which specifies the conditions that accesses to the
by following these requirements. According to our usage object on a target platform can be allowed 1 . A stakeholder
control requirements, we augment the traditional SELinux can be the owner of a target object, or a service provider
MAC enforcement mechanism by considering subject/object that is delegated by the object owner to protect the object.
integrity and environmental information. The result shows An object in usage control can be static data, various types
that our framework is effective in practice and can be seen of messages, or user or subject attribute or even a creden-
as a general solution for usage control in distributed and tial. Thus, this makes the problem pervasive in many dis-
pervasive computing environments with widely deployed tributed computing applications such as healthcare informa-
trusted computing technologies on various computing de- tion systems, Web Services, and identity management sys-
vices. tems. Different from other distributed access control prob-
lems such as trust management, in usage control, an object
is located out of the controlling domain of a policy stake-
1 Introduction holder such that (1) there are many aspects of access control
decisions other than subject identities and attributes, and (2)
an object stakeholder needs high assurance on the enforce-
The traditional access control problem [10, 13, 18] is
ment of the policy.
considered in closed environments, where identities of sub-
jects and objects can be fully authenticated, and enforce- As Figure 1 shows, an object and its usage control pol-
ment mechanisms are trusted by system administrators 1 Some literatures present another way of usage control, which focuses
which define access control policies. However, with in- on confining the usage purposes of an object, instead of security sensitive
creasing distributed and decentralized computing systems, purposes like integrity and confidentiality. The typical example on this
more computing cycles and data are processed on leaf kind of usage control system is digital rights management (DRM), which
controls a user’s use of an object based on payment information, e.g., play
nodes. This leads to two distinct access control problem and copy. However, we do not distinguish usage rights and other security
spaces. The first one focuses on the reasoning of autho- sensitive rights on an object, thus this kind of usage control is a subset of
rizations with subject attributes from different authorities. the problem in this paper.

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 10


DOI 10.1109/SUTC.2008.79
icy are distributed from a data provider to a target platform. atively easy to achieve at least in closed systems. Note
The policy is enforced in the platform to control access to that in trust management systems, policy enforcement is
the object within a trusted subsystem. Typically, an access still within the stakeholder’s control domain. However,
control decision is determined according to pre-defined fac- as objects or services are deployed to different domains
tors specified in a policy, which, logically, can be defined from their stakeholders, a mandatory requirement for us-
based upon the information of the subject and the object age control is the trustworthy enforcement of security poli-
of an access request, where the subject is an active entity cies by the reference monitor. Here, through trustworthi-
trying to perform actions on the passive entity object. In ness, a stakeholder needs to ensure that (1) all factors for
closed access control systems such as in a local platform, usage control decisions can be obtained and their informa-
policies are defined based upon the identities of subjects tion (e.g., attribute values or environmental conditions) are
and objects. In traditional distributed access control sys- authentic, (2) correct decisions are made based on these fac-
tems such as trust management, policies are defined based tors, (3) the reference monitor enforces access control deci-
on attributes or credentials that are certified by external au- sions correctly, and (4) all accesses to a target object on a
thorities. However, in usage control, access control policies target platform have to go through the reference monitor.
can be defined by very general attributes of subjects and Overall, by a “trusted subsystem” we mean that it is ex-
objects, such as application-specific attributes and temporal pected to behave in a “good” manner and this manner can
status. Furthermore, as an object can be located on plat- be verified by the policy stakeholder.
form in a heterogeneous environment such as a mobile de- These requirements represent the essential security con-
vice, environmental restrictions and system conditions are siderations in many distributed computing environments.
mandatory decision factors in many applications, such as For example, in our ongoing project, a healthcare service
location-based service and time-limited access. An ongo- provider (e.g., a hospital) provides healthcare data to autho-
ing access should be terminated if these environmental or rized service requestors (e.g., a physician). A physician or a
system conditions change which violate policies. For ex- nurse uses a desktop in a clinic to retrieve healthcare data of
ample, a mobile application might require that a service a patient from the service provider. To preserve the integrity
can be used only if a mobile device is in a particular loca- and privacy of the healthcare data, the service provider typ-
tion, which itself is activated by a user through the service ically requires that the data released to the client machine
agent deployed on the mobile device. Simply relying on are correctly processed by authorized users only through
traditional access control mechanisms on a target platform the trusted healthcare application, and are not eavesdropped
cannot satisfy these requirements since the decision factors by any other hidden processes concurrently running on the
(i.e., subject and object attributes) of these approaches are client platform and during data transfers. For another ex-
mostly static and pre-defined and cannot fit a dynamic com- ample, for a data or service provider to deploy their value-
puting environment. added services on a mobile device, the runtime environment
of the mobile device has to be trusted and satisfy some se-
Data/service provider Target platform
(usage control policy stakeholder) (usage control enforcement) curity requirements. For example, location-based services
have been widely deployed and a user is allowed to use ser-
Object Object
vice only within a particular scope of a physical location.
Previous work on usage control focus on high level pol-
Trusted icy specifications and conceptual architectures [15, 21, 22,
subsystem

Policy
27], while enabling mechanisms are mainly relied on digi-
Usage
Control
Policies
MAC
Policies
tal rights management (DRM) approaches. However, DRM
mechanisms cannot support general attributes and trusted
enforcement in ubiquitous environments. Most importantly,
Figure 1. Distributed usage control. A DRM approaches cannot provide an overall solution for us-
trusted subsystem in client platform en- age control in open and general-purpose target platforms,
sures the enforcement of usage control since they usually rely on software-enabled payment-based
policies. enforcement in relatively closed environments, e.g., through
a media player by connecting to a dedicated license server.
As usage control is naturally distributed, another chal- Another intuitive solution is to use cryptography algorithm.
lenge to enforce usage control policies is the trustworthy For example, a stakeholder can encrypt a target object such
of the security enforcement mechanism. Typically, an ac- that it only can be decrypted on a target platform with a par-
cess control decision is made and enforced by a reference ticular application. Fundamentally, this has the same prob-
monitor, which has the requirements of being tamper-proof, lems as the DRM approach, since a typical DRM scheme re-
always-invoked, and small enough [5, 11] — which is rel- lies on encryption/decryption with a unique key shared be-

11
tween a client and content server [1, 2]. Particularly, cryp- out of the domain of a stakeholder such that high assur-
tography alone cannot protect the key during the runtime on ance of policy enforcement is desired. However, as usage
a target platform such as to build a trusted subsystem [20]. control is such pervasive that it can happen in open and
For example, malicious software can easily steal a secret by general-purpose platforms, a “usable security” mechanism
exploring some vulnerability of the protection system, ei- is strongly desired to satisfy also the cost-effective objec-
ther when the secret is loaded in some memory location, or tive. For example, leveraging a local host access control
when the secret is stored locally. mechanism to enforce usage control policy is very desirable
In this paper we make the first step towards a general if the mechanism can be trusted to do the “right” thing. That
framework for a trusted usage control enforcement in ubiq- is, the goal of pervasive usage control is not to provide a per-
uitous computing environments. We start with an analysis fect solution for security but just to be “good-enough” [26].
of the security requirements in the usage control problem. Requirement 2: Need a comprehensive policy model. Tra-
Within these high level requirements, we identify manda- ditional security systems distinguish policy and mech-
tory components to build a trusted subsystem in a general anism [17]. However, early policies such as Bell-
client platform for usage control policy enforcement. This LaPadula [7] and Biba [8] are too restrictive for convenient
subsystem needs a strong protected environment such that use within applications. They support simple policies such
security mechanisms in the client platform should respect as one-way information flow but provide insufficient and
its autonomy. Specifically, this trusted subsystem should inflexible support for general data and application integrity.
have the final and complete control over its resources (e.g., Typically, usage control considers many constraints or con-
an object downloaded from a remote stakeholder), and the ditional restrictions such as time and location as aforemen-
security mechanisms of the local platform cannot compro- tioned. Traditional policy models cannot support these and
mise or bypass this control. Also, information flow between usage control needs a comprehensive policy model to sup-
this subsystem and any others on the target platform has port the variants of such additional security requirements.
to be controlled, if allowed. For these purposes, we claim
that mandatory access control (MAC) is necessary. Fur- Requirement 3: Need MAC mechanism for trusted subsys-
thermore, to achieve the assurance of policy enforcement tem on a target platform. In discretionary access control
as aforementioned, the integrity of the subsystem has to be (DAC), a root-privileged subject has the capability to violate
verifiable by the policy stakeholder. Thus, mechanisms like the security configuration of the whole system such that the
integrity measurement, storage, and verification are needed subsystem can be compromised either by a malicious user
on such a target platform. or software (e.g., a virus or Trojan horse). As evidenced by
As one of the main contributions of this work, we con- many security attacks, a virus or worm can obtain the root
sider the integrity of a subsystem in access control mecha- permission of a system by exploring some vulnerabilities,
nisms. With this, not only traditional subject and object at- e.g., with buffer-over-flow attacks. Thus, mandatory access
tributes are considered in access control decisions, but also control (MAC) mechanism is needed. For example, with
the integrity of subjects and objects, and any other support- SELinux, one can label the applications and all resources of
ing components in a trusted subsystem. The overall goal of a subsystem with a particular domain and define policies to
our approach is to build a “virtually closed” and trusted sub- control the interactions between this domain and others for
system for remote usage control policy enforcement. Our isolation and information flow control purposes.
work presented in this paper can be regarded as the en- Requirement 4: Need a policy transformation mechanism
forcement model of usage control in PEI security frame- from high level usage control policies to concrete MAC poli-
work [27]. cies. Typically, a stakeholder’s policy is specified in differ-
The present paper is organized as follows. Section 2 ent formats and semantics from those of the MAC policies
presents the principles to build distributed usage control on a target platform. For example, a stakeholder can be
systems. We describe our general platform architecture to implemented as a Web Service, where a security policy is
build a trusted subsystem in Section 3. Our prototype sys- specified in XACML. This policy has to be transformed to a
tem is presented in Section 4. Related work is presented in concrete policy that is enforced on a target platform, which
Section 5. We eventually conclude this paper in Section 6. follows its local MAC model. Thus, an efficient and conve-
nient policy transformation mechanism is needed such that
2 System Design Principles security properties are preserved during a transformation,
i.e., the allowable permissions and information flows are the
In order to enforce usage control in a trustworthy man- same in the policies before and after a transformation.
ner, we have identified a set of general design principles. Requirement 5: Need security mechanism on operating sys-
Requirement 1: Need high assured but usable security tem (OS) level. There has been long discussion that appli-
mechanism. Typically in usage control, objects are located cation level security alone cannot provide high assurance of

12
a system. For example, many web service based security downloading, a stakeholder wants to verify that the request
protocols such as WS-Security [3] are built on credentials comes from an authentic subject on a target platform and the
(public key certificates) between service providers and con- subject does have the permission to obtain an object. After
sumers. However, credential management and private key downloading, the subject needs to verify the integrity and
protection are critical problems for general-purpose com- authenticity of an object and its policy. Secondly, during the
puting platforms. Without OS level security mechanism, runtime of processing a target object, a trusted subsystem
message-based security mechanisms cannot guarantee end- ensures that usage control policies are enforced implying
to-end security – integrity, confidentiality, and privacy can that only authorized processes and users can access the data,
be compromised on OS level by software-based attacks and interactions between running processes are controlled
(virus, spyware, worms, etc). such that correct information flows are preserved. Typi-
Requirement 6: Build trust chain for policy enforcement cally, authorized accesses from “known good” processes
from system boot to application execution. The high as- and users ensure the confidentiality and privacy of protected
surance of a subsystem in a remote computing platform objects, and information flow control ensures the integrity
should origin from a root-of-trust, and then is extended to of the data.
other system components upon which the policy enforce-
ment mechanism is built. Typically, a MAC mechanism is 3.1 Trusted Subsystem Architecture
implemented in the kernel of the OS on a platform. Thus,
a trusted subsystem should include a trusted kernel while Our trusted subsystem includes a root-of-trust, trust
any other components booted before the kernel, such as the chain, and a policy transformation and enforcement mech-
BIOS and the boot loader also need to be strictly trusted. anism, and also a runtime integrity measurement mecha-
To obtain the trust of the MAC mechanism in a subsystem, nism. Figure 2 shows our target platform architecture to en-
any other supporting components should also be trusted, force usage control policies. The hardware layer includes a
including policy transformation and management, subject Trusted Platform Module (TPM), a Core Root of Trust Mea-
and object attribute acquisition, and the reference monitor surement (CRTM), and other devices. The TPM and the
itself. The fundamental goal of this trust chain is to achieve CRTM provide the hardware-based root-of-trust. Similar to
a trusted runtime environment for object access where the trusted or authenticated boot [6, 12, 25], the booting compo-
integrity of all related parts can be verified by a stakeholder. nents of the platform, including BIOS, boot loader, and OS
Requirement 7: Build trusted subsystem with minimum kernel, are measured and their integrity values are stored in
trusted computing base. Related to the above requirement, particular Platform Configuration Registers (PCRs) of the
to build a practical and usable trusted subsystem, a min- TPM. Specifically, according to the TCG specification [28],
imum trusted computing base (TCB) is desired. A TCB the CRTM is the first component to run when the platform
includes all the components in the trust chain for policy en- boots. It measures the integrity of the BIOS before the
forcement during runtime. A larger number of components BIOS starts, which in turn, measures the boot loader and
in this chain results in higher costs both on system develop- hereafter the kernel and kernel modules, recursively. Along
ment and verification since each trusted component requires this booting and measurement sequence, particular PCR(s)
a detailed verification of the software implementation. are extended with the measured values, and the result is de-
noted as P CRboot . The TPM guarantees that P CRboot is
As policy model and formal specifications have been ex-
reset once the platform re-boots.
tensively studied in previous work, in this paper we focus on
the policy enforcement issue of usage control. We propose Upon a user’s request on the target platform, a client ap-
a general platform architecture by following these princi- plication (e.g., a healthcare client software) is invoked to
ples. We have developed a prototype with emerging trusted communicate with a data owner/provider to obtain an ob-
computing technologies including a hardware-based root- ject. At that same time, a policy can be downloaded by
of-trust. We leverage the MAC mechanism in SELinux for the client application from a stakeholder, which can be the
policy enforcement. Due to space limit we ignore the policy same as the data provider or different. For example, a data
transformation mechanism in this paper. provider can delegate its policy specifications to a security
service provider, which is the policy stakeholder when an
object is downloaded and processed on a client platform.
3 Platform Architecture When a usage control policy (e.g., an XACML policy
file) is downloaded from its stakeholder, it is transformed
A trusted subsystem is the foundation to enforce usage by the policy transformation service to a MAC policy such
control policies. We define two phases for this purpose. that they can be enforced by the reference monitor. The
First, a prerequisite for usage control is secure object and client application is the target process that can manipulate
policy download. This requirement is two-fold. Before the object and is to be protected by MAC policies. Also,

13
MAC policies should include rules to control accesses to the • The reference monitor is measured after the kernel is
object from other applications and any configurations for booted.
the client application and the overall security system (e.g.,
local security policy management). • The client application, object, and its configurations
are measured right before the client application is in-
voked.

• The integrity of the usage control policy, policy trans-


Configurations Object
Usage Control Policy formation service, and the sensor are measured when
(e.g., XACML Policy )
they are invoked and just before their execution.

Client Application
Policy
Transformation
• MAC policies are measured when they are loaded, ei-
Service ther, when the platform boots or during runtime (i.e.,
loaded by the policy transformation service).
Integrity
Verification Sensor
Service • Any other applications or services that need to com-
municate or collaborate with the client application are
Integrity
Reference
measured before they are invoked.
Measurement MAC
Monitor
Service Policies
In general, in order to only allow accesses to target ob-
Kernel
jects from authorized applications, and control information
flow between this applications and others, IMS should mea-
CRTM TPM Device Device
Hardware sure not only the policy enforcement services such as policy
transformation and platform sensor, but also all other appli-
Figure 2. Platform architecture for usage con- cations that are allowed to interact with the sensitive client
trol policy enforcement. applications running on the same platform.
As part of the policy enforcement, the integrity verifi-
As aforementioned, usage control policies typically in- cation service (IVS) verifies corresponding integrity val-
clude environmental authorization factors such as time and ues measured by the IMS and generates inputs to the ref-
location. A sensor is the component that reports these en- erence monitor. As a typical example, the client applica-
vironmental information and thus can be considered by the tion can only access the target object when its “current” in-
reference monitor. For example, in a mobile application tegrity corresponds a known good value, where the current
where a service can only be accessed in a particular loca- integrity is the one measured by the IMS.
tion, the sensor reports the physical (e.g., through a cellular Note that although we use data objects (e.g., files)
network provider or GPS) or logical (e.g., through a Wi-Fi through this paper, our usage control mechanism is applica-
access point) location of the device, such as home, office, ble to other types of objects such as messages and streams.
airport, etc. The essential requirement for the object is that its authen-
In the kernel level of the platform, the reference moni- ticity and integrity can be verified after downloaded, such
tor captures an access attempt to the object and queries the that, as an input for the application on a client platform, the
MAC policies before allows the access. A fundamental re- initial state of the platform can be trusted.
quirement for the reference monitor is that it has to capture
all kinds of access attempts, from the storage space of the 3.2 Secure Object and Policy Download
local file system to the memory space of the object. Also,
the reference monitor controls the interactions between the Before any access, the target platform obtains the ob-
client application and others, locally and remotely, and es- ject and the XACML policy file from the policy stakeholder
pecially according to the loaded MAC policies. through authentication and attestation protocols. Without
The integrity measurement service (IMS) is a manda- loss of generality, we assume that the access request is gen-
tory component in a trusted subsystem, which starts right erated by the client application from the target platform (cf.
after the kernel is booted. The main function of the IMS Figure 2). Policies can be defined in the target platform to
is to measure other runtime components which consist of confine that only the dedicated client application (e.g., the
the TCB to enforce usage control policies. All measured healthcare client software or a mobile service agent) can
events and the integrity values are stored in a measurement generate an access request to the stakeholder.
list and the corresponding PCRs are extended accordingly. Upon receiving the request, the stakeholder generates an
Particularly: attestation challenge to the target platform. An attestation

14
agent on the target platform collects a set of integrity val- checked by the reference monitor based upon the au-
ues measured by the IMS, signs them with an attestation thentication of the user to the system, for example, the
identity key (AIK) of the TPM, and sends them back to the role and necessary credentials of the user. When the
stakeholder. The integrity included in this response consists client application is invoked by the user, it is measured
of P CRboot and all mandatory components for the runtime by IMS before loaded to memory.
policy enforcement, including the reference monitor, policy
transformation service, IVS, sensor, and system configura- • During runtime, if the client application generates ac-
tions. After positively verifying these integrity values, the cess requests to the target object, the measured in-
stakeholder decides that the data can be released, and the tegrity of the application is evaluated and verified by
corresponding usage control policy can be generated. The the IVS and the result is considered by the reference
integrity of the policy to confine which application can gen- monitor.
erate access request to the stakeholder can also be measured
and attested. • The sensor service monitors the environmental infor-
Note that although we assume each object is associated mation of the computing device (e.g., location) and
with a usage control policy logically, a policy can also be provides these also to the reference monitor for pol-
associated with an application or the type of objects or ser- icy evaluation when an access happens on the platform
vices. For example, the healthcare application aforemen- with regarding to the policies. Whenever there is a
tioned can use the same policy for all patient records of a change of any information specified in the policies, the
particular type of diseases. new information is reported by the sensor service thus
invokes the re-evaluating of the ongoing access.
3.3 Runtime Policy Enforcement
With these mandatory components, we show that a general
usage control policy can be enforced on a target platform
When a usage control policy is received, it is transformed
with verifiable trustworthiness.
to local MAC policies on a local platform. The policies
specify the following factors for secure information pro-
cessing during runtime. Firstly, the MAC policies should 4 Implementation and Evaluation
confine the users that can initialize applications to access
certain target objects. For example, a patient’s healthcare 4.1 Conditional SELinux Policy
information only can be manipulated by a registered nurse,
which is represented by a digital credential. Secondly, MAC
As aforementioned, we have developed a policy trans-
policies should also specify the integrity of the client plat-
formation mechanism to transfer high level policies, spec-
form and the target application which allow an object ac-
ified by a stakeholder in XACML format, to low level
cess. The integrity information, typically, provides the as-
policies specified by the MAC policy model inside the
surance that the object is correctly processed, and there is
target platform. In our project, we use SELinux as
no illegal information leakage. Thirdly but not lastly, MAC
our reference monitor, and usage control policies are re-
policies should consider the environmental restrictions by
alized via the conditional policies of SELinux. Fig-
which an object can be accessed, e.g., the location of a mo-
ure 3 shows an example policy for a medical applica-
bile platform to access a service.
tion. The policy is formed as an SELinux loadable
We now explain how these factors are considered in our
policy module, where types and allow rules are de-
trusted subsystem during runtime. With a TPM-enabled
fined within this module. The permission statement of
platform, all booting components are measured and their
medicalApplication t is made conditional on the
respective integrities are stored in the TPM, including the
basis of application integrity and platform location. The
IMS in the kernel. When the kernel boots, the integrity
boolean variable integrity and accessLocation
of our reference monitor and other user space services are
correspond to those in the constrain expression. The
measured. Consider that an object and its usage control
overall semantics of this policy module is that, if predi-
policy has been downloaded by a client platform and their
cate (integrity && accessLocation) is true, the
integrity have been verified after downloading. The usage
permissions are allowed by SELinux; otherwise, only
control policy is then transformed to a set of MAC policy
getattr is allowed, e.g., the file can be listed in the di-
rules that can be enforced by the reference monitor. These
rectory but cannot be opened.
MAC rules specify all the security requirements including
Note that the policy in Figure 3 shows only a simple
user, integrity, and environmental restrictions.
example of how to integrating integrity information with
• When a user logins or attempts to access the object by read file permission in SELinux. Other permissions, such
invoking the client application, the user attributes are as the executing of the medical application or its writing to

15
the object during runtime, or the integrity of other compo- policies are changed. For example, in SELinux, file access
nents in the platform, can be integrated with similar mech- permission is checked on every read and write to a file, even
anism. Typically, the usage control policy defined by its when the file has been already opened. With this, if the se-
stakeholder determines these configurations in SELinux. curity context of the file or the accessing subject changes,
the access is revoked on the next read or write attempt 2 .
With this feature, many flexible and dynamic security re-
quirements can be supported, such as runtime integrity ver-
ification and location-based authorizations.

4.2.1 Integrity Measurement


Integrity measurement service is implemented by re-using
some codes from IBM Integrity Measurement Architec-
ture (IMA) [25]. Specifically, we re-use the ima main.c
and drivers/char/tpm.c in IMA. The measure-
ment function is called within the SELinux hook function
Figure 3. Example SELinux loadable policy file mmap(). Whenever, a new file is loaded into mem-
module with conditional policy. ory, the file mmap is called for the verification of addi-
tional permissions by SELinux.
4.2 Policy Enforcement Architecture An independent configuration file called
/sys/kernel/measure is maintained to indicate
Figure 4.2 shows the overall architecture of our proto- the files that are needed to be protected. Each entry in this
type. The left part of the diagram is the usual SELinux file contains the file name and its absolute path. The mea-
access permission check module. On the right part, there surement function searches the configuration file for the
is a set of services running on the user space, including the desired entry – passed as a parameter to file mmap().
policy transformation, integrity verification, and sensors. In If the file passed as a parameter to the file mmap() is
the kernel space, the SELinux filesystem provides interfaces found in /sys/kernel/measure, the same procedure
to allow user space services to set boolean values for con- is followed as done by the IMA for augmenting the hash to
ditional policies. When an XACML policy is transformed, the kernel list. These measurements are then protected by
a set of files are created in this filesystem and default val- the TPM.
ues are set. The boolean values are updated by the user
space services, e.g., with the result of integrity verification 4.2.2 Integrity and Location verification
or location change. Whenever there is an access request, the
SELinux security server obtains the current boolean values The Integrity and location verification is done by two
(if specified in the policy) and makes a decision. separate daemons. The integrity verification service
(IVS) daemon is responsible for monitoring the integrity
XACML
Policies
of all those files listed in /sys/kernel/measure.
Client Application
Each listed file has a corresponding SELinux boolean
Policy
Transformation
Integrity
Verification
Sensor
Service
variable which is stored in a corresponding file inside
User Space Service
the SELinux filesystem. For example, in our healthcare
Kernel Space
System Calls scenario, the file /usr/share/medicalPolicy is
SELinux Filesystem
listed in /sys/kernel/measure and the boolean
Linux DAC Check variable file /selinux/booleans/user share
Allow or deny? Access
medicalPolicyIntegrity corresponds to it. The
LSM Hooks Security Server
Vector
Cache
(Binary Policies and Decision Logic) absolute path of a target file is used for the boolean variable
filename to avoid conflicts between boolean variables
Access Operations
SElinux LSM Module
augmented by different loadable policy modules.
The integrity verification service daemon retrieves the
kernel list and the /sys/kernel/measure file period-
Figure 4. Usage control platform architecture
ically. More than one entry for an integrity protected file
via SELinux and conditional policies.
2 However, SELinux does not support access revocation to memory-
An important feature of SELinux is that an ongoing ac- mapped objects, e.g., memory-mapped file data and inter-process commu-
cess can be revoked when the security context or related nication messages.

16
in the measured kernel list designates that the file is com- and Sandhu [21]. However, it is basically a centralized ap-
promised. Based on the measurement list integrity ver- proach and there is no concrete implementation mechanism
ification, this daemon transits the corresponding boolean existing for it yet.
variables accordingly. For example, in our scenario, the Attestation-based remote access control [24] is proposed
IVS sets the usr share medicalPolicyIntegrity based upon the IBM integrity measurement architecture
to false if the /usr/share/medicalPolicy has more (IMA). Similar to usage control, an enterprise server de-
than one entry in the kernel list. ploys security policies on a client platform which are en-
Similarly, the sensor daemon monitors the location of the forced based on the integrity status of the client platform.
device, e.g., through the IP address or Wi-Fi access point of However, there are significant differences between this and
the device. If the device location changes, the daemon re- our work. First, the objective of attestation-based remote
evaluates the location and based on a set of locations, sets access control is to filter the traffic origins from a client to
the corresponding SELinux boolean variable. a server while target objects (i.e., enterprise resources) are
still located on centralized server. In usage control, the fun-
4.3 Evaluation damental goal is to enable continuous control after an object
is distributed. Second, based on the limitation of IMA, the
policy enforcement in this work needs to verify all com-
Our implementation leverages IMA for integrity mea- ponents loaded in a platform after booting, such that it is
surement. Therefore, on the kernel level, our system has the not practical to deploy it in very open and heterogeneous
similar performance as IMA [25]. Our experiments show environments [23, 16]. Most importantly, our approach
that it takes 5765μs to measure a medical record file and ex- integrates IMA with SELinux with an augmented policy
tend the PCR with the corresponding measurements. Like model. Moreover, we leverage the loadable policy mod-
in IMA, this overhead is due to opening the configuration ule of SELinux so that we can build a relatively “closed”
file, writing the measurement request and closing the con- trusted subsystem by defining SELinux policies according
figuration file. Further, the size of the files in our healthcare to usage control requirements on remote platforms.
scenario is small, therefore, fingerprinting the files does not
poses significant performance concerns for our implemen-
tation. We have taken these results on a desktop with 2.4
6 Conclusions and Future Work
GHz processor and 1 GB of RAM.
Usage control focuses on the problem of enforcing se-
On the user space, our system includes the extra integrity
curity policies on a remote client platform with high as-
verification step to consider integrity in access control de-
surance and verifiable trust. In this paper we present gen-
cisions. We measure the time to make an integrity verifi-
eral security requirements for usage control and propose a
cation and set the value for the boolean variable inside the
general framework for this problem. The main idea of our
SELinux filesystem, and the time to evaluate a single ac-
approach is to build a trusted subsystem on an open plat-
cess control decision with SELinux, respectively. The data
form such that a policy stakeholder can deploy sensitive
shows that it takes 95μs overall for integrity verification,
data and services on this subsystem. We propose an archi-
transiting the corresponding SELinux boolean variables and
tecture with a hardware-based TPM as the root-of-trust and
accessing the protected file. Whereas, without integrity ver-
consider integrity measurement/verification and other envi-
ification, it is just 88μs on average. This shows that the dif-
ronmental restrictions in our MAC policy model. We have
ference is not so much from the performance point of view.
also implemented a real prototype system by integrating in-
Note that the overhead of integrity verification is indepen-
tegrity measurement and SELinux. By leveraging the latest
dent from the measured file size by IMS.
SELinux conditional policies and loadable policy modules,
our approach enables verifiable assurance to build a rela-
5 Related Work tively “closed” trusted subsystem for usage control.

A distributed usage control policy language and its en- References


forcement requirements are presented in [15, 22]. Similar
to our objective, their work targets on control over data after [1] Fairplay. http://en.wikipedia.org/wiki/FairPlay.
its release to third parties. However, the significant differ-
[2] Windows media digital rights management (DRM).
ence between this and our work is that our work relies on
http://www.microsoft.com/windows/windowsmedia
the underlying trusted subsystem of a platform, where the
/drm/default.aspx.
root-of-trust is built by a hardware TPM and extended to
all mandatory components for policy enforcement. A com- [3] Web Services Security: SOAP Message Security 1.1.
prehensive usage control policy model is proposed by Park OASIS Web Service Security TC, 2004.

17
[4] M. Abadi, M. Burrows, and B. Lampson. A calcu- [16] T. Jaeger, R. Sailer, and U. Shankar. PRIMA: Policy-
lus for access control in distributed systems. ACM reduced integrity measurement architecture. In Pro-
Transactions on Programming Languages and Sys- ceedings of the 11th ACM Symposium on Access Con-
tems, 15(4):706–734, 1993. trol Models and Technologies, pages 19–28, June
2006.
[5] J. P. Anderson. Computer security technology
planning study volume II, ESD-TR-73-51, vol. [17] B. Lampson. Computer security in the real world.
II, electronic systems division, air force systems IEEE Computer, (6):37–46, June 2004.
command, hanscom field, bedford, MA 01730. [18] B.W. Lampson. Protection. In 5th Princeton Sympo-
http://csrc.nist.gov/publications/history/ande72.pdf, sium on Information Science and Systems, pages 437–
Oct. 1972. 443, 1971. Reprinted in ACM Operating Systems Re-
view 8(1):18–24, 1974.
[6] W. A. Arbaugh, D. J. Farber, and J. M. Smith. A se-
cure and reliable bootstrap architecture. In Proc. of [19] N. Li, J. C. Mitchell, and W. H. Winsborough. Design
IEEE Conference on Security and Privacy, pages 65– of a role-based trust-management framework. In Proc.
71, 1997. of IEEE Symposium on Security and Privacy, pages
114–130, 2002.
[7] D. E. Bell and L. J. LaPadula. Secure computer sys-
tems: Mathematical foundations and model. Mitre [20] P. Loscocco, S. Smalley, P. Muckelbauer, R. Taylor,
Corp. Report No.M74-244, Bedford, Mass., 1975. J. Turner, and J. Farrell. The inevitability of failure:
The flawed assumption of computer security in mod-
[8] K. J. Biba. Integrity consideration for secure com- ern computing environments. In Proceedings of the
puter system. Technical report, Mitre Corp. Report National Information Systems Security Conference,
TR-3153, Bedford, Mass., 1977. October 1998.

[9] Matt Blaze, Joan Feigenbaum, and Jack Lacy. Decen- [21] J. Park and R. Sandhu. The UCONabc usage control
tralized trust management. In Proceedings of IEEE model. ACM Transactions on Information and Sys-
Symposium on Security and Privacy, pages 164–173, tems Security, 7(1):128–174, February 2004.
Oakland, CA, May 1996. [22] A. Pretschner, M. Hilty, and D. Basin. Distributed us-
age control. Communications of the ACM, (9):39–44,
[10] D. E. Denning. A lattice model of secure information 2006.
flow. Communications of the ACM, 19(5), May 1976.
[23] J. F. Reid and W. J. Caelli. Drm, trusted computing
[11] Department of Defense National Computer Security and operating system architecture. In Australasian In-
Center. Department of Defense Trusted Computer formation Security Workshop, 2005.
Systems Evaluation Criteria, December 1985. DoD
[24] R. Sailer, T. Jaeger, X. Zhang, and L. van Doorn.
5200.28-STD.
Attestation-based policy enforcement for remote ac-
[12] J. Dyer, M. Lindemann, R. Perez, R. Sailer, L. van cess. In Proceedings of ACM Conference on Computer
Doorn, S. W. Smith, and S. Weingart. Building the ibm and Communication Security, 2004.
4758 secure coprocessor. IEEE Computer, (10):57– [25] R. Sailer, X. Zhang, T. Jaeger, and L. van Doorn.
66, 2001. Design and implementation of a TCG-based integrity
measurement architecture. In USENIX Security Sym-
[13] M. H. Harrison, W. L. Ruzzo, and J. D. Ullman. Pro-
posium, pages 223–238, 2004.
tection in operating systems. Communication of ACM,
19(8), 1976. [26] R. Sandhu. Good-enough security: Toward a prag-
matic business-driven discipline. IEEE Internet Com-
[14] A. Herzberg, Y. Mass, J. Mihaeli, D. Naor, and puting, (1):66–68, 2003.
Y. Ravid. Access control meets public key infras-
tructure, or: assigning roles to strangers. In Proc. of [27] R. Sandhu, K. Ranganathan, and X. Zhang. Secure
IEEE Symposium on Security and Privacy, pages 2– information sharing enabled by trusted computing and
14, 2000. PEI models. In Proc. of ACM Symposium on Informa-
tion, Computer, and Communication Security, 2006.
[15] M. Hilty, D. Basin, and A. Pretschner. On obligations. [28] TCG TPM. Main part 1 design principles specification
In Proc. of 10th European Symp. on Research in Com- version 1.2, https://www.trustedcomputinggroup.org.
puter Security, September 2005.

18
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

China’s National Research Project on Wireless Sensor Networks


Lionel M. Ni
Chair Professor and Head
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology

Abstract

This talk will give an overview of the 5-year National Basic Research Program of China (also
known as the 973 Program) on Wireless Sensor Networks that was launched in September 2006
and sponsored by the Ministry of Science and Technology. This national research project
involves researchers from many major universities in China and Hong Kong, and aims to tackle
fundamental research issues rising in three major application domains: coal mine surveillance,
water pollution monitoring, and traffic monitoring and control. The distinctive feature of this
project is that it will present a systematic study of wireless sensor networks, from node platform
development, core protocol design and system solution development to critical problems. This
talk will address the research challenges, current progress, and future plans.

Biography

Lionel M. Ni is Chair Professor and Head of the Computer Science and Engineering Department
at the Hong Kong University of Science and Technology. He also serves as Chief Scientist of the
National Basic Research Program of China (973 Program), Director of HKUST China Ministry
of Education/Microsoft Research Asia IT Key Lab, and Director of HKUST Digital Life
Research Center. Among his many honorary and adjunct positions, he is an honorary President
of the South China Institute of Software Engineering at Guangzhou University, China, a
Distinguished Professor of Shanghai Jiao Tong University, and an honorary Chair Professor of
National Tsinghua University (Hsinchu). Dr. Ni earned his Ph.D. degree in Electrical and
Computer Engineering at Purdue University in 1980. A fellow of IEEE, Dr. Ni has chaired many
professional conferences and served on the editorial board of many journals. He has directly
supervised 34 Ph.D. students, won five best paper awards, and the 1994 Michigan State
University Distinguished Faculty Award. He is a co-author of three books: "Interconnection
Networks: An Engineering Approach" (Morgan Kaufmann 2002), “Smart Phone and Next
Generation Mobile Computing” (Morgan Kaufmann 2006), and “Professional Smartphone
Programming” (Wrox 2007).

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 19


DOI 10.1109/SUTC.2008.23
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing
Strong QoS and Collision Control in WLAN Mesh and
Ubiquitous Networks
Chi-Hsiang Yeh and Richard Wu
Abstract— Quality of service (QoS) provisioning is critical to real-time pected to go through its second letter ballot soon. The IEEE
applications such as VOIP and IPTV, but has not been resolved in multihop 802.16j task group is extending WiMAX to WMAN mesh, with
networks such as mesh and ad hoc networks. For multiple access in ubiqui-
tous networks, energy efficiency is of the single most important design goal, the draft standard D1.0 [6] just completed in August 2007. The
and can be supported through power save and power control techniques. In above are the main issues that are currently attracting tremen-
this paper, we propose a class of MAC protocols based on binary countdown
for strong differentiation capability and energy efficiency. A unique advan- dous efforts and resources from the wireless communications
tage for this class of protocols is that the collision rate can be significantly industry.
reduced or virtually eliminated. This collision rate issue is often ignored in
QoS MAC and sensor MAC protocol designs, but is in fact indispensable
for QoS and energy-efficient MAC since service quality will be otherwise
considerably degraded due to exponentially increased backoff delay, and Although the MAC protocol of IEEE 802.11 [3] works rea-
energy will be wasted unnecessarily. sonably well in current single-hop WLANs, several problems
will arise when they are applied to multihop networks. A main
I. I NTRODUCTION reason is that it relies on mandatory carrier sensing to avoid col-
The holy grail for medium access control is freedom from lisions, but such a transmitter-centric paradigm is vulnerable to
collision, while the holy grail for priority-based QoS is abso- the hidden terminal problem, and is inefficient in terms of spatial
lute differentiation, where higher-priority packets can be trans- reuse especially when power controlled. Also, they have an op-
mitted without being affected by lower-priority packets. Fair- tional RTS/CTS mechanism that may mitigate the hidden termi-
ness can then be provisioned among packets of the same traffic nal problem, but such control messages are collision prone and
category, and lower-priority traffic categories may be allowed are thus unreliable for this purpose. Moreover, in many mul-
some bandwidth to avoid starvation. IEEE 802.11 [3] is the de tihop networking environments, the interference range is con-
facto standards for medium access control (MAC) in wireless siderably larger than the communication range (for data packets
LANs. It performs reasonably in single-hop scenarios, though and control messages), where the RTS/CTS mechanism is not
collision rate can be high in a dense area with many competing effective in protecting receptions. These problems result in high
WiFi devices. Moreover, PCF is the mechanism that supports collision rates in multihop networks, considerably degrading the
QoS in the IEEE 802.11 standard, but it is never implemented throughput, energy efficiency, QoS provisioning, and fairness.
in commercial product. IEEE 802.11e [3] is an amendment that
enhances the prioritization capability of IEEE 802.11 through
The goal for this research is to develop medium access
EDCA, and also has an improvement over PCF. However, IEEE
techniques that can achieve the preceding objectives including
802.11e has been standardized for a few years and yet the indus-
strong QoS capability, high throughput, as well as power control
try is very slow in adopting this amendment. The IEEE 802.11
and multihop networking supports. To achieve high throughput
working group recently has several initiatives concerning QoS
in wireless networks, collision rate should be very small, while
in WLANs, including the QoS Enhancement (QSE), and Video
communication overheads (e.g., RTS/CTS dialogues) and chan-
Transport Stream (VTS) Study Groups. QoS is still an important
nel idleness (e.g., due to backoff) must be both relatively small
issue that requires further research and development for WLAN
as compared to data packet durations. Low collision rate is also
protocols.
essential to QoS provisioning in any network employing expo-
Another noticeable trend in wireless networking is that mul-
nential backoff or a similar strategy. Therefore, our proposed
tihop networking is becoming the norm for future networking
techniques have to overcome the collision and hidden terminal
environments. This is evidenced by the strong supports from
problems in multihop networking environments, and consider-
industry for the IEEE 802.11s and IEEE 802.16j task groups,
ably reduce the communication overheads/idleness introduced
which are standardizing new MAC protocols for multihop net-
by current RTS/CTS dialogues and backoff strategies. Our pro-
working in WLANs and WMANs, respectively. In particular,
posed solution to the preceding contradicting requirements is to
the IEEE 802.11s task group is standardizing the MAC, rout-
employ detached dual binary countdown (DDBC), a subclass
ing, and security aspects for WLAN mesh, with the newest draft
of dual prohibition multiple access (DPMA) that replaces the
standard D1.10 [5] just completed in March 2008, and is ex-
functionality of RTS/CTS dialogues with prohibiting signals.
Prof. Chi-Hsiang Yeh is with the Dept. of Electrical and Computer Engineer- The resultant protocol inherits important advantages from bi-
ing, Queen’s University, Kingston, Ontario, K7L 3N6, Canada. Phone: +1 613-
533-6368, Fax: +1 613-533-6615, E-mail: yeh@ee.queensu.ca . Richard Wu is
nary countdown including collision freedom/controllability, pri-
with the Nortel Networks, Ottawa, Ontario, K2H 8E9, Canada. oritization capability, and elimination of hidden terminals.

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 20


DOI 10.1109/SUTC.2008.92
Fig. 1. The prohibition stage for dual binary countdown. (Figure: interwoven receiver prohibiting slots and sender prohibiting slots.)

II. ISSUES AND THE PROPOSED SOLUTIONS

Binary countdown is a competition-based technique that enables only one winner to access the network, leading to its desirable collision freedom property. In [8], Tanenbaum states that "Binary countdown is an example of a simple, elegant, and efficient protocol that is waiting to be rediscovered." Binary countdown was designed for single-hop networks where everybody hears everybody else. However, this is not the case for multihop wireless networks such as mesh networks, and when binary countdown is directly applied to such networking environments, various problems are caused. We have pioneered the application of binary countdown to multihop networks [13], [10], [11], [12], and several other research groups have also adopted similar strategies [9]. However, absolute differentiation, among other issues, has not been achieved yet. In this section, we point out some issues and our solutions to them.

A. Dual Binary Countdown (DBC) for Strong Collision Control

Conventional binary countdown is transmitter-centric. Similar to CSMA, such a paradigm suffers from the hidden terminal problem, leading to collisions, and in turn causing problems in QoS provisioning and wasted energy. Also, if binary countdown is directly applied to multihop wireless networks as in some previous protocols [13], [9], a potential transmitter will have to block a large area to protect its receiver, leading to a severe case of the exposed terminal problem that will make the performance prohibitively inefficient.

To resolve these theoretical and practical problems, our solution employs the dual binary countdown technique as proposed in [10]. Transmitters and receivers send their prohibiting signals in different but adjacent prohibition slots. As shown in Fig. 1, the slots for transmitters and receivers are interwoven for the entire "competition round" (also called a "contention round"). A transmitter slot and a receiver slot corresponding to the same bit/digit are next to each other. Since both the transmitter and the receiver in a pair participate in the binary countdown competition, the dual binary countdown signals can replace the functionality of the RTS and CTS messages in IEEE 802.11. Also, contrary to conventional binary countdown or CSMA/IC [13], where only transmitters participate in the competition, dual binary countdown is no longer transmitter-centric, so the hidden terminal problem naturally does not exist. Since a transmitter or receiver only prohibits nodes as necessary, the exposed terminal problem does not exist either. As a result, the hidden and exposed terminal problems can be solved without relying on collision-prone RTS/CTS messages, leading to freedom from collision. By appropriately selecting the threshold for being prohibited from transmitting/receiving, power control can be effectively supported, which is not possible with a naive application of binary countdown that is transmitter-centric.

Another important practical issue with naive transmitter-centric implementations of binary countdown is that the power for prohibiting signals has to be considerably higher when the path loss is high. The reason is that they have to be sent over a distance at least equal to the distance between the transmitter and its receiver, plus the maximum interference distance, so that all potential interfering transmitters can be reached. This considerably limits its practicability since the required power may exceed the maximum power level allowed by the FCC when the path loss is high. Another practical issue is that for different channel propagation environments or models, the required power levels can be considerably different for a certain distance, and are difficult to predict in reality.

In dual binary countdown, however, a transmitter (or receiver) only needs to send prohibiting signals far enough to reach nearby nodes that may be interfered by (or interfere with) its data packet transmission (reception). This way the required power level is guaranteed not to exceed the FCC regulation since it is no larger than the transmitter's own transmission power or a nearby node's transmission power. Such required power is independent of the channel propagation environments or models, since interference also attenuates faster when the path loss is higher (and thus the required range for prohibiting signals also becomes smaller). The required power levels for prohibiting signals in dual binary countdown are thus predictable and reasonably low.

In what follows we use an example to illustrate the differences. Assume that the transmitter-receiver distance is 1, while the maximum interference distance is 2. When the path loss factor is 2, the power level required in transmitter-centric binary countdown is higher by a factor of 9/4 when compared to dual binary countdown. When the path loss factor is 4, however, the power level required in transmitter-centric binary countdown is higher by a factor of 81/16 when compared to dual binary countdown.
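The two ratios can be checked with a short calculation. The expression below is our restatement of the example, assuming that the power required for a prohibiting signal to reach distance d scales as d^alpha, that the transmitter-centric scheme must cover the transmitter-receiver distance plus the maximum interference distance, and that the dual scheme only needs to cover the maximum interference distance:

\[
  \frac{P_{\text{tx-centric}}}{P_{\text{dual}}}
  = \left(\frac{d_{tr} + d_I}{d_I}\right)^{\alpha}
  = \left(\frac{1+2}{2}\right)^{\alpha}
  = \begin{cases} 9/4, & \alpha = 2,\\ 81/16, & \alpha = 4, \end{cases}
\]

which reproduces the factors quoted above for path loss factors of 2 and 4.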

B. Detached Binary Countdown for Strong Differentiation

Another issue with a naive application of binary countdown to multihop wireless networks is that it can achieve absolute differentiation only when all nodes are synchronized to compete at the same time, and all data packets are of equal length and synchronized to transmit and terminate at the same time. If one of the preceding conditions is not satisfied, higher-priority packets may be blocked by nearby lower-priority packets, and the problem has not been solved in the literature thus far.

Our solution is to employ the novel detached binary countdown technique, where the binary countdown competition is applied before its associated data packet transmission/reception with a lead time. Different traffic categories have different ranges for their lead times, and higher-priority traffic categories use larger lead times. Conventional wisdom is that such lead times or gaps will lead to higher delay, and thus should be kept smaller for higher-priority packets. However, when traffic is not light, the delay in a wireless network is mainly contributed by queueing delays (i.e., the time wasted waiting in buffers). Thus, we look at this problem from a different angle, explained as follows. For two packets to be transmitted at the same time, if we allow a larger lead time for the binary countdown competition of one of them, that packet will have the privilege to reserve the channel earlier. So, contrary to intuition, this policy can effectively prioritize between different traffic categories. As can be shown rigorously through a mathematical proof, absolute differentiation can be achieved by the proposed technique when the ranges for the gaps differ by at least the maximum data packet length for different traffic categories. This is of significant theoretical importance and is the first of its kind for distributed medium access with variable packet length.

C. Shrinking Contention Windows for Idleness Reduction

Finally, IEEE 802.11 employs backoff to resolve collisions and maintain system stability. These issues are of practical importance, so such a backoff mechanism cannot simply be removed. However, it considerably increases the idle time of the wireless channel, leading to inefficient utilization of the scarce wireless resources, in addition to unpredictable delays. Our proposed binary countdown solution can eliminate concurrent competitors and thus can naturally replace the functionality of the backoff mechanism, or at least reduce the required contention window (CW) sizes for backoff without compromising system stability or collision rates. This can considerably reduce channel idle times and thus lead to higher channel utilization and throughput.

In the following sections, we present a detailed implementation of the proposed solutions, to be referred to as dual binary countdown multiple access, and then evaluate its performance and compare it to CSMA/CA.

III. RELATED WORK: DBC

In this section we present details for DBC [10], a subclass of DPMA based on dual binary countdown without detached binary countdown. In Section IV, we will extend DBC to DDBC by incorporating the detached binary countdown technique into the protocol design.

In time-division DPMA (TD-DPMA), transmitters and receivers send their prohibiting signals in different but adjacent prohibition slots, while in frequency-division DPMA (FD-DPMA), transmitters and receivers send their prohibiting signals in different control channels. Following the prohibition stage are declaration slots. As shown in Fig. 1, the slots for transmitters and receivers are interwoven for the entire competition round. A transmitter slot and a receiver slot corresponding to the same bit/digit (for either prohibition or declaration) must be next to each other, but their order can be interchanged throughout a competition round. However, the order must be known and followed by all DPMA devices. As an example, transmitters can have earlier slots for bits/digits at odd positions, but later slots for bits/digits at even positions.

During the prohibition stage, transmitters are prohibited by nearby receivers with higher competition numbers (CNs) through their prohibition signals, while receivers are prohibited by transmitters with higher CNs through their prohibition signals. Note that for transmitters, the thresholds for being prohibited are determined by their transmission power levels, while for receivers, the thresholds for being prohibited are determined by their tolerance to interference. Note also that the safe margin (as a component of the threshold) can change from slot to slot to improve the performance. Since transmitters do not prohibit other intended transmitters and receivers do not prohibit other intended receivers, the exposed terminal problem does not exist in DPMA. Moreover, different from CSMA, where on-going transmissions block nearby intended transmitters from transmitter to transmitter, on-going receptions in DPMA block nearby intended transmitters from receiver to transmitter directly, and on-going transmissions discourage nearby intended receivers from transmitter to receiver directly. As a result, obstructions or hidden terminals (for CSMA systems) will not cause collisions in DPMA as they do in CSMA.
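The prohibition test just described reduces to two power comparisons. The fragment below is a minimal illustration of that test and is not code from the paper; representing the thresholds and sensed signals as dB values, and treating the safe margin as a simple additive term, are our assumptions:

def tx_prohibited(sensed_rx_prohibit_dbm, tx_threshold_dbm, safe_margin_db=0.0):
    """Transmitter side: defer when a receiver's prohibiting signal exceeds
    the transmitter's threshold (tied, per the text, to its own transmission
    power), tightened by the per-slot safe margin."""
    return sensed_rx_prohibit_dbm > tx_threshold_dbm - safe_margin_db

def rx_prohibited(sensed_tx_prohibit_dbm, interference_tolerance_dbm, safe_margin_db=0.0):
    """Receiver side: defer when a transmitter's prohibiting signal exceeds
    what the receiver could tolerate as interference during its reception."""
    return sensed_tx_prohibit_dbm > interference_tolerance_dbm - safe_margin_db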

For a node to determine whether its partner is surviving, a transmitter checks whether a prohibition signal corresponding to the same bit/digit value (with a reasonable power level) exists in the receiver prohibition slots; similarly, a receiver senses prohibition signals in the transmitter prohibition slots to make the judgment. If a node finds that the partner in its transmitter/receiver pair has failed the competition, it will withdraw from the competition immediately. Similarly, if a node cannot sense any signal in the declaration slot that may be sent by its partner, it also withdraws from the competition.

If signals are detected in the declaration slots, it implies that certain nearby nodes were successful in the current competition round. If the received power levels are sufficiently high to prohibit the transmission or reception of a competing node, that node will wait for a default or specified packet duration minus the length of a competition round before it becomes eligible to enter a new competition again. However, if the node can transmit data packets at lower power or receive data with higher interference tolerance (e.g., by using a larger spreading factor or transmitting a different data packet out of order), it may restart earlier. In what follows, we present a simple example for implementing DPMA.

Fig. 2. The timing diagram for the control channel of DPMA. (Figure: a contention round on the control channel, consisting of prohibition slots followed by a declaration slot.)

A. Synchronized Dual Binary Countdown

To simplify the description, we consider a synchronized TD-DPMA based on binary countdown. In this simple version, to be referred to as DBC, all competing nodes are synchronized and start the competition at the same time. The time axis of the control channel is partitioned into "contention rounds" of equal length. Each contention round starts with the prohibition slots, followed by a declaration slot. The contention round is used for resolving the access contention among wireless stations. At the beginning, a pair that has data packets to transmit and receive schedules one or several mutually agreed times to participate in the competition. The prohibiting signals are transmitted in a narrow PHY channel dedicated for control, while data packets are transmitted in a separate PHY channel.

A competing transmitter senses the status of all receiver slots, but does not sense any transmitter slots; a competing receiver senses the status of all transmitter slots, but does not sense any receiver slots. In bit-slot i, i = 1, 2, 3, ..., k, of the competition, only nodes that survive all of the first i - 1 bit-slots participate. Such a surviving node whose ith bit of its CN is 1 transmits a prohibiting signal in the appropriate bit-slot (i.e., either the transmitter bit-slot or the receiver bit-slot) at an appropriate power level (so that it reaches most/all nodes within its prohibitive range). A surviving transmitter whose ith bit is 0 keeps silent and senses whether there is any prohibiting signal during receiver bit-slot i whose received power level is above its threshold. If so, it loses the competition; otherwise, it survives and remains in the competition. A surviving transmitter whose ith bit is 1 senses whether there is any prohibiting signal during receiver bit-slot i that may be sent by its receiver. If so, it survives and remains in the competition; otherwise, it loses the competition. Similarly, a surviving receiver whose ith bit is 0 keeps silent and senses whether there is any prohibiting signal during transmitter bit-slot i whose power is above its threshold. If so, it loses the competition; otherwise, it survives and remains in the competition. A surviving receiver whose ith bit is 1 senses whether there is any prohibiting signal during transmitter bit-slot i that may be sent by its transmitter. If so, it survives and remains in the competition; otherwise, it loses the competition. If a node survives all k bit-slots, it becomes a candidate for transmission or reception. It then announces that it has won the competition by transmitting prohibiting signals in the declaration slot. If a node detects the signals sent by its partner, it will transmit or receive accordingly; otherwise, it loses the competition.
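The bit-slot rules above can be restated compactly. The sketch below is our paraphrase of that procedure; the signalling and sensing callbacks stand in for the physical-layer operations described in the text and are not functions defined by the paper.

def dbc_compete(cn_bits, role, send_prohibit, sense_partner, sense_above_threshold):
    """cn_bits: bits of the competition number (CN), most significant first.
    role: 'tx' or 'rx'. A transmitter only listens to receiver bit-slots,
    and a receiver only listens to transmitter bit-slots (dual prohibition)."""
    partner_slots = 'rx' if role == 'tx' else 'tx'
    for i, bit in enumerate(cn_bits):
        if bit == 1:
            send_prohibit(role, i)                    # prohibit lower-CN competitors
            if not sense_partner(partner_slots, i):
                return False                          # partner already lost: withdraw
        else:
            if sense_above_threshold(partner_slots, i):
                return False                          # a higher-CN pair is present: defer
    return True   # survived all k bit-slots: announce in the declaration slot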

B. Dual Position-based Prohibition (DPP)

To mitigate the additive prohibiting signal strength problem, we can use larger competition slot durations, especially for the receiver slots and the first few transmitter slots. To better utilize the extended slot durations, however, it is desirable to employ the dual position-based prohibition mechanism. This can be done by separating the prohibition slots for transmitters and receivers in a way similar to the description in the preceding subsection (except that we are no longer using the binary on/off prohibiting mechanism). Or, equivalently, we substitute each bit-slot in DBC with a position-based "digit-slot", and follow rules similar in spirit to those of the simple version in the preceding subsection. Similar to the way DBC is extended to DDBC, we can also extend dual position-based prohibition to detached dual position-based prohibition (DDPP) by incorporating detached position-based prohibition.

We can use position-based dual prohibition for the entire prohibition stage if so desired. When the digit-slot is sufficiently large and the number of competitors is sufficiently small, we can also use a single digit-slot-pair for the prohibition stage, followed by a declaration slot-pair and possibly an additional yield period (with sensitive carrier sensing) before transmitting or receiving the data packet. An accompanying mechanism for reducing/controlling the number of concurrent competitors is presented in the following subsection.

C. Backoff Control in Combination with Binary Countdown

Backoff control is employed in DPMA as a means to conduct flow control. Backoff control algorithms/schemes in previous protocols, as well as their variants, may be adapted. This way the number of concurrent competing pairs can be considerably reduced, so that the prohibition slots in DPMA do not need to be too expensive or even prohibitively large just to mitigate the additive prohibiting signal strength problem. Moreover, the number of concurrent competing pairs can be dynamically controlled according to the traffic conditions, so that the number of slots and their durations can remain fixed in DPMA without compromising the performance.

Different from IEEE 802.11/11e, where backoff control is the only mechanism to reduce the attempt rate, DPMA employs position-based dual prohibition to further reduce the number of competing nodes. This way the parameters as well as the algorithms/schemes employed for backoff control in DPMA do not need to be as conservative as in previous MAC protocols. In other words, backoff control is only utilized to reduce the typical number of competitors (e.g., to a constant number considerably greater than 1), and then the first or first few prohibition slot-pairs will effectively eliminate most of the remaining competitors to prevent collision. This way the additive prohibiting signal strength problem can be resolved without noticeable idle times on the radio channel.

As a comparison, when backoff control is the only or main mechanism to avoid collisions (which would otherwise be frequently caused by concurrent attempts), the radio channel may stay idle for a non-negligible portion of time, so that radio resources are not efficiently utilized. Thus, backoff control combined with position-based dual prohibition as in DPMA can considerably increase the radio efficiency.

IV. DETACHED DUAL BINARY COUNTDOWN (DDBC)

In DDBC, the transmitter and receiver in a pair both participate in the binary countdown competition. They are allowed to compete before they actually transmit or receive their data packets, and the lead time is upper-bounded by RLT_TC for traffic class TC. To achieve absolute differentiation capability, RLT_i has to be greater than RLT_{i+1} by at least a maximum data packet length, where traffic class i has higher priority than traffic class i + 1 by one level. The duration of a data packet is typically larger than that of a binary competition round. If a pair lose their competition, they can continue to compete in a following round until success. Note that the number of contending pairs can naturally be reduced to one (for one collision domain) during binary countdown, thus the backoff mechanism as in IEEE 802.11 is not required. However, for lower-priority traffic categories, backoff with contention windows smaller than those defined in IEEE 802.11 may still be employed to reduce the number of contending pairs so as to lower the accumulated prohibiting signal strength.
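The differentiation requirement just stated can be written as a single inequality; this is only a restatement of the sentence above, with L_max denoting the maximum data packet length:

\[
  \mathrm{RLT}_i \;\geq\; \mathrm{RLT}_{i+1} + L_{\max},
\]

where traffic class i has priority one level higher than traffic class i + 1.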

The winner pair will announce their success in the declaration slots following the binary countdown competition round. In these slots, the starting time of the reserved transmission or reception is indicated by transmitting a prohibiting signal in the corresponding slots. For example, a simple implementation is to use the first slot to indicate a transmission/reception time 8 units later, the second slot to indicate a transmission/reception time 7 units later, and so on.

In DDBC, transmitters and receivers conduct their binary countdown by sending their prohibiting signals in different but adjacent prohibition slots, in a separate control channel. Following the prohibition stage are declaration slots. As shown in Fig. 1, the slots for transmitters and receivers are interwoven for the entire competition round. A transmitter slot and a receiver slot corresponding to the same bit/digit (for either prohibition or declaration) must be next to each other.

During the prohibition stage, transmitters are prohibited by nearby receivers with higher competition numbers (CNs) through their prohibition signals, while receivers are prohibited by transmitters with higher CNs through their prohibition signals, where a CN is a binary number shared by a transmitter-receiver pair. Note that for transmitters, the thresholds for being prohibited are determined by their transmission power levels, while for receivers, the thresholds for being prohibited are determined by their tolerance to interference. Since transmitters do not prohibit other intended transmitters and receivers do not prohibit other intended receivers, the exposed terminal problem does not exist in DDBC. Moreover, different from CSMA, where on-going transmissions block nearby intended transmitters from transmitter to transmitter, on-going receptions in DDBC block nearby intended transmitters from receiver to transmitter directly, and on-going transmissions discourage nearby intended receivers from transmitter to receiver directly. As a result, obstructions or hidden terminals (for CSMA systems) will not cause collisions in DDBC as they do in CSMA. Thus, the hidden terminal problem can be resolved in DDBC without relying on a hidden terminal detection mechanism, complex group competition, or other operations as in previous MAC protocols.

For a node to determine whether its partner is surviving, a transmitter checks whether a prohibition signal corresponding to the same bit/digit value (with a reasonable power level) exists in the receiver prohibition slots; similarly, a receiver senses prohibition signals in the transmitter prohibition slots to make the judgment. If a node finds that the partner in its transmitter/receiver pair has failed the competition, it will withdraw from the competition immediately. Similarly, if a node cannot sense any signal in the declaration slot that may be sent by its partner, it also withdraws from the competition. If signals are detected in the declaration slots, it implies that certain nearby nodes were successful in the current competition round. If the received power levels are sufficiently high to prohibit the transmission or reception of a competing node, that node will wait for a default or specified packet duration minus the length of a competition round before it becomes eligible to enter a new competition again. In what follows, we present a simple example for implementing the dual binary countdown mechanism in DDBC.

To simplify the description, we consider a synchronized dual-channel DDBC. In this simple version, all competing nodes are synchronized and start the competition at the same time. The prohibiting signals are transmitted in a narrow PHY channel dedicated for control, while data packets are transmitted in a separate PHY channel. Each intended transmitter/receiver pair coordinates in advance to decide the round(s) of competition to participate in, and the competition number (CN) to use. Such coordination mechanisms are not difficult to design and are omitted in this paper.

A competing transmitter senses the status of all receiver slots, but does not sense any transmitter slots; a competing receiver senses the status of all transmitter slots, but does not sense any receiver slots. In bit-slot i, i = 1, 2, 3, ..., k, of the competition, only nodes that survive all of the first i - 1 bit-slots participate. Such a surviving node whose ith bit of its CN is 1 transmits a prohibiting signal in the appropriate bit-slot (i.e., either the transmitter bit-slot or the receiver bit-slot) at an appropriate power level (so that it reaches most/all nodes within its prohibitive range). A surviving transmitter whose ith bit is 0 keeps silent and senses whether there is any prohibiting signal during receiver bit-slot i whose received power level is above its threshold. If so, it loses the competition; otherwise, it survives and remains in the competition. A surviving transmitter whose ith bit is 1 senses whether there is any prohibiting signal during receiver bit-slot i that may be sent by its receiver. If so, it survives and remains in the competition; otherwise, it loses the competition. Similarly, a surviving receiver whose ith bit is 0 keeps silent and senses whether there is any prohibiting signal during transmitter bit-slot i whose power is above its threshold. If so, it loses the competition; otherwise, it survives and remains in the competition. A surviving receiver whose ith bit is 1 senses whether there is any prohibiting signal during transmitter bit-slot i that may be sent by its transmitter. If so, it survives and remains in the competition; otherwise, it loses the competition. If a node survives all k bit-slots, it becomes a winner eligible for transmission or reception.

V. PERFORMANCE EVALUATION

In this section we compare the performance of DPMA based on DBC with that of CSMA/CA by changing various system and protocol parameter values.

In the simulation model, we assume that each node has the same maximum transmission power, and all the nodes are randomly distributed within a 30 x 30 grid. The total number of nodes is 90. The data channel bandwidth is set to 54 Mbps for both CSMA/CA and DPMA. Data packets have a fixed length of 2K bytes. When the signal-to-noise ratio (SNR) is lower than 4, a reception is considered collided. Signals transmitted by all wireless stations in the network are taken into account when calculating the SNR. In Table 1, we summarize the parameter values used in the simulation model.

Table 1. Summary of parameter values in the simulation model.

  Parameter                             Nominal Value
  Simulation area                       30 x 30 grid units
  Maximum data transmission range       10 grid units
  Number of nodes                       90
  Data packet size                      2K bytes
  Channel bandwidth (Data/Control)      54 Mbps / 1 Mbps
  Minimum SNR                           4

We assume that the wireless stations are stationary throughout the simulation. Each wireless station is a Poisson source with a data packet arrival rate. There are four separate queues for different traffic classes at each wireless station. In the simulation we assume the arrival rates for all traffic classes are the same. The aggregated arrival rate is the sum of the arrival rates of the four queues of all wireless stations in the network.

In Figs. 3 and 4, we compare CSMA/CA with DPMA with different lengths for the competition numbers (CNs). "ID 2-3-4" denotes 2 bits for the priority number part, 3 bits for the random number part, and 4 bits for the ID part. The prohibition slot duration is equal to 0.1 us. We change the number of bits for each part to compare the performance among the DPMA protocols and CSMA/CA.

Fig. 3. Comparisons of throughput for CSMA/CA and DPMA with different CN lengths. (Figure: throughput (packet/sec) versus arrival rate (packet/sec) for ID 2-3-4, ID 2-4-8, ID 2-5-10, and CSMA/CA.)

Fig. 4. Comparisons of delay for CSMA/CA and DPMA with different CN lengths. (Figure: average delay (sec) versus arrival rate (packet/sec) for the same protocols.)

Figures 5 and 6 compare DPMA with different ID lengths to optimize its parameters. The numbers of bits for CNs with lengths 7, 8, 11, 14, 17, and 20 are (2,2,3), (2,2,4), (2,3,6), (2,4,8), (2,5,10), and (2,6,12) for the (priority, random, ID) parts. The 14-bit ID results in better performance relative to the other options. We can see that CSMA/CA can achieve higher throughput than DPMA if the latter does not choose its CN length properly.
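As an illustration of how a CN of the form priority | random | station ID might be assembled, the sketch below follows the 2-3-4 field widths of the "ID 2-3-4" configuration above; the exact bit layout is our assumption, since the paper does not specify an encoding.

def make_cn(priority, rand_part, station_id, widths=(2, 3, 4)):
    """Concatenate priority, random, and station-ID fields into one
    competition number (a higher value wins the binary countdown)."""
    p_bits, r_bits, i_bits = widths
    assert priority < 2 ** p_bits and rand_part < 2 ** r_bits and station_id < 2 ** i_bits
    return (priority << (r_bits + i_bits)) | (rand_part << i_bits) | station_id

# Example: priority 3, random value 5, station ID 9 -> a 9-bit CN.
cn = make_cn(3, 5, 9)

Placing the priority field in the most significant bits is what makes higher-priority pairs win the countdown, and the station-ID field keeps CNs locally unique.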

Fig. 5. Maximum achievable throughput for different CN lengths. (Figure: maximum throughput (packet/sec) for ID lengths of 7, 11, 14, 17, and 20 bits, and for CSMA/CA.)

Fig. 6. Operating delay for moderate arrival rates for different CN lengths. (Figure: delay (sec) for ID lengths of 7, 11, 14, 17, and 20 bits, and for CSMA/CA.)

Figures 7 and 8 evaluate the differentiation capability of DPMA with four priority classes, where priority 4 has the highest priority. The length of the prohibition signal is equal to 0.1 us. The CN used is (2,3,4): 2 bits for the priority, 3 bits for random numbers, and 4 bits for station IDs. We assume no differentiation capability in the CSMA/CA protocol, as in DCF.

Fig. 7. Throughput differentiation in DDBC. (Figure: average throughput (packet/sec) versus arrival rate for priority classes PRI-1 through PRI-4 and for CSMA/CA.)

Fig. 8. Delay differentiation in DDBC. (Figure: average delay (sec) versus arrival rate for priority classes PRI-1 through PRI-4 and for CSMA/CA.)

From Fig. 7, we find that the system throughput is similar for the four priorities when the traffic load is low. When the traffic load becomes higher, we can see that the throughput of lower-priority packets begins to decrease, while the throughput of the highest-priority packets still continues to increase without being affected. Figure 8 shows that all four priority classes have similar average delay when the traffic load is low. When the traffic load gets higher, the highest-priority packets maintain a very small average delay while the average delays of lower-priority packets increase quickly. These results demonstrate the strong differentiation capability of DPMA.

In the following figures, we compare CSMA/CA with DPMA with different system or protocol parameter values. In Figs. 9 and 10, we compare CSMA/CA with DPMA protocols that use different CN lengths. We use a larger prohibition slot duration of 1 us, with which the length of the CN impacts the network performance more. We can see that when the prohibition slot duration is not negligible, the advantage of DPMA is reduced when the CN length is large.

In Figs. 11 and 12, we further consider even larger prohibition slot durations, and find that the performance of DPMA may degrade to a degree worse than CSMA/CA when the prohibition slot duration is 2.5 us.

VI. CONCLUSIONS

In this paper, we proposed detached dual binary countdown (DDBC) for multihop wireless networks. DDBC does not require RTS/CTS or other control messages, and thus does not suffer from the collision problems of RTS/CTS messages as in IEEE 802.11 and MACAW [1]. DDBC can resolve the hidden and exposed terminal problems without relying on busytone (which suffers from the "self interference problem" [12]) or complex mechanisms. The significance of DDBC is that it is the very first solution reported in the literature thus far that can achieve absolute differentiation in distributed multihop wireless networks with variable packet lengths. It also leads to freedom from collision when the competition numbers (CNs) used are locally unique.

Fig. 9. Comparisons of throughput for CSMA/CA and DPMA with different CN lengths. (Figure: throughput (packet/sec) versus arrival rate for DPMA 2-2-3, DPMA 2-4-8, DPMA 2-5-10, and CSMA/CA.)

Fig. 10. Comparisons of delay for CSMA/CA and DPMA with different CN lengths. (Figure: delay (msec) versus arrival rate for the same protocols.)

Fig. 11. Comparisons of throughput for CSMA/CA and DPMA with different prohibition slot durations. (Figure: throughput (packet/sec) versus arrival rate for DPMA with SL = 1, 1.5, and 2.5, and for CSMA/CA.)

Fig. 12. Comparisons of delay for CSMA/CA and DPMA with different prohibition slot durations. (Figure: delay (msec) versus arrival rate for the same settings.)

REFERENCES

[1] Bharghavan, V., A. Demers, S. Shenker, and L. Zhang, "MACAW: A media access protocol for wireless LAN's," Proc. of ACM SIGCOMM'94, 1994, pp. 212-225.
[2] Haas, Z.J. and J. Deng, "Dual busy tone multiple access (DBTMA) - performance evaluation," Proc. IEEE Vehicular Technology Conf., 1999, pp. 314-319.
[3] ISO/IEC 8802-11:1999(E) ANSI/IEEE Std 802.11, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications, 1st edition, 1999.
[4] IEEE 802.11 WG, Draft Supplement to STANDARD FOR Telecommunications and Information Exchange Between Systems LAN/MAN Specific Requirements - Part 11: Wireless Access Control (MAC) and Physical Layer (PHY) specifications: Medium Access Control (MAC) Enhancements for Quality of Service (QoS), IEEE 802.11e/D8.0, 2004.
[5] IEEE P802.11s/D1.10, Draft STANDARD for Information Technology - Telecommunications and information exchange between systems - Local and metropolitan area networks - Specific requirements - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications, Amendment: Mesh Networking, Mar. 2008.
[6] IEEE P802.16j/D1, Draft Standard for Local and Metropolitan Area Networks - Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems, Multihop Relay Specification, Aug. 2007.
[7] Sobrinho, J.L. and A.S. Krishnakumar, "Quality-of-Service in ad hoc carrier sense multiple access networks," IEEE J. Selected Areas in Communications, Vol. 17, No. 8, Aug. 1999, pp. 1353-1368.
[8] Tanenbaum, A.S., Computer Networks, 4th Edition, Prentice Hall, N.J., 2003.
[9] Wu, H., A. Utgikar, and N.-F. Tzeng, "SYN-MAC: A distributed medium access control protocol for synchronized wireless networks," Mobile Networks and Applications, Springer Science and Business Media B.V., Vol. 10, No. 5, Oct. 2005, pp. 627-637.
[10] Yeh, C.-H., "A collision-controlled MAC protocol for mobile ad hoc networks and multihop wireless LANs," Proc. IEEE Globecom'04, Dec. 2004.
[11] Yeh, C.-H., "RPMA: receiver prohibition multiple access for collision-controlled wireless mesh networks and mobile ad hoc networks," Proc. IEEE Globecom'06, Nov./Dec. 2006.
[12] Yeh, C.-H., "New busytone solutions to medium access control in wireless mesh, ad hoc, and sensor networks," Proc. IEEE ICC'07, May, 2006.
[13] You, T., C.-H. Yeh, and H. Hassanein, "CSMA/IC: A new class of collision-free MAC protocols for ad hoc wireless networks," Proc. IEEE Int'l Symp. Computer Communications, Jun./Jul. 2003, pp. 843-848.

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Construct Small Worlds in Wireless Networks Using Data Mules

Chang-Jie Jiang, Chien Chen, Je-Wei Chang and Rong-Hong Jan
Department of Computer Science
National Chiao Tung University
Hsinchu, Taiwan, R.O.C.
{gis93598, cchen}@cis.nctu.edu.tw; {changzw, rhjan}@cs.nctu.edu.tw

Tsun Chieh Chiang
Information & Communications Research Laboratories,
Industrial Technology Research Institute,
Hsinchu, Taiwan, R.O.C.
tcchiang@itri.org.tw
Abstract—A small world phenomenon has been discovered in a wide range of disciplines, such as physics, biology, social science, information systems, computer networks, etc. In a wireless network, the small world phenomenon is used in the development of novel routing strategies. However, past studies made use of wired lines as shortcuts to construct a small world in a wireless network. First, it is difficult to determine the length of these wired shortcuts in advance. Second, wired lines are unsuitable in certain circumstances such as rural areas, battlefields, etc.; they are more costly to deploy and more vulnerable to unexpected damage. Finally, in wireless networks that lack central infrastructure, such as mobile ad hoc networks (MANET) and wireless vehicular networks, fixed-length wired-line shortcuts cannot be employed. This study proposes a new method to construct a small world in a wireless network. Instead of deploying wired lines as shortcuts, variable-length shortcuts are constructed by using mobile router nodes called data mules. Data mules move data between nodes which do not have a direct wireless communication link. These data mules imitate the shortcuts in a small world. The small world phenomenon in connected and disconnected wireless networks containing various numbers of data mules is then discussed. Finally, the small world phenomenon is considered in wireless sensor networks.

Index Terms—Small world phenomenon, wireless networks, wireless sensor networks (WSN), data mule.

I. INTRODUCTION

During the past few years, the small world phenomenon has been a hot topic. The small world phenomenon unfolded how each individual in the real world has relationships to other people. Milgram [5] performed a series of mail delivery experiments in 1967. Given the receiver's information, such as address, career, or race, each individual who received a letter forwarded it to a friend or relative who presumably was more likely to know the receiver. In this experiment, Milgram found that there was an average of about six intermediate deliveries before the final receiver was reached. This work first quantified the famous notion of the so-called "six degrees of separation" between any two individuals on earth.

Then in 1998 Watts and Strogatz [6] proposed a network model, which shows that rewiring a few links in a regular ring lattice graph (the destinations of the rewired links are randomly selected from the graph) can decrease the average path length between any two nodes in the graph considerably while still maintaining a high degree of clustering within the neighboring nodes [4, 6]. They called these rewired links shortcuts. This network model clearly explained why the small world phenomenon exists in our daily life.

However, in the real world, with only local information available, it is impossible to find the shortest path using these randomly-constructed shortcuts. Data relayed in such a network travels either hop by hop or randomly. Kleinberg identified this problem of how to find shortest paths in a decentralized way for the small world network with only local information in a study published in 2000 [11].

Some previous studies on wireless networks also make use of the small world phenomenon to improve network performance. In 2003, Helmy [10] discussed the influence of shortcuts in wireless networks. He shows that adding a few shortcut links with a fixed length between 0.3 and 0.4 of the diameter of the network topology can reduce the average path length of wireless networks considerably. This study obtained results similar to previous works on the small world phenomenon. However, how to construct a small world in a wireless network became a challenge. Most past studies adopted wired links as shortcuts [29, 30]. However, wired shortcuts have some problems, such as exposure to the unsuspected danger of natural or man-made disasters such as flood, fire, road construction, development, etc. Another problem regards the fixed shortcut length proposed by Kleinberg and Helmy. Adding fixed-length shortcuts only fits well with a fixed network topology. However, it is difficult for a mobile wireless network, such as a mobile ad hoc network or a vehicular network, to construct fixed-length shortcuts due to the dynamically changing topology. It is even more difficult to make the mobile nodes be attached to the wired shortcuts.

Given the reasons above, this study proposes a new method of constructing a small world in a wireless network. Unlike the traditional method of constructing a small world in a wireless network with fixed-length wired-line shortcuts in advance, mobile nodes called data mules or data ferries are used to imitate shortcuts. Numerous wireless network applications today already have mobile nodes. For example, Fig. 1 shows a wireless network comprised of soldiers and tanks in a battlefield, where the tanks have higher speed than the soldiers and thus can be utilized as data mules to forward data. Another example is in urban vehicle networks, as in Fig. 2. In such a network each vehicle is mobile and can be utilized to forward data to the next vehicle that is closer to the destination. The distance traveled by the mobile node is the shortcut length. Precisely, the position where data gets on the mule is the beginning of the shortcut, and the position where data gets off the mule is the end of the shortcut.
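To make the notion of a data-mule shortcut concrete, the small fragment below records a shortcut by its boarding and alighting positions; measuring its length as the straight-line distance between the two points is our simplifying assumption (the paper defines the length as the distance traveled by the mule), and nothing here is code from the paper.

from dataclasses import dataclass
from math import dist

@dataclass
class MuleShortcut:
    board_pos: tuple    # where the data gets on the mule (start of the shortcut)
    alight_pos: tuple   # where the data gets off the mule (end of the shortcut)

    def length(self):
        # Straight-line approximation of the distance the mule carries the data.
        return dist(self.board_pos, self.alight_pos)

# Example: data boards at (0, 0) and alights at (3, 4) -> shortcut length 5.0
print(MuleShortcut((0, 0), (3, 4)).length())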

The data mule shortcuts differ from the traditional shortcuts. The traditional shortcuts cannot move and must have a fixed length because they must be constructed in advance. Thus all of the data must cooperate with the shortcuts. However, data mule shortcuts are generated when data needs to be forwarded, and the shortcut length is determined by data demand. Even though data mule and traditional shortcuts have totally different natures, the simulation results presented in this paper show that a small number of data mules can significantly reduce the average path length, which is compatible with the finding of the small world phenomenon.

Figure 1. A mobile ad hoc network in a battlefield.

Figure 2. A vehicle network in a city; all of the cars have wireless communication devices.

After deciding to use data mule shortcuts, it is necessary to determine the forwarding model to be used for the data mules. Forwarding models of the data mules can be divided into two categories: active and passive forwarding. Active forwarding is used to actively control data mule motion and can be operated in applications where the data destination is known, such as a sink node in a sensor network, as we will discuss later. On the contrary, in networks that already have mobile nodes, such as those in Figs. 1 and 2, or in applications that have no fixed destination, passive forwarding is more appropriate.

To realize the small world phenomenon in wireless networks with data mules, this study divides wireless networks into connected and disconnected networks based on topology type. In connected wireless networks, all nodes are connected and data mules imitate shortcuts. Adding different numbers of data mules to the network can demonstrate the small world phenomenon. In disconnected wireless networks, some of the nodes are disconnected, such as in sparse mobile ad hoc networks or delay tolerant networks (DTN). Previous studies on small worlds in wireless networks assume that the network is connected and calculate the average path length and clustering coefficient of all nodes in a topology to determine the small world phenomenon. However, in disconnected networks, the path length of disconnected node pairs cannot be calculated. Watts in [28] proposed a model called the α model to explain the small world phenomenon in a disconnected network. We utilize data mules as shortcuts to forward data between any disconnected node pairs. We claim a similar small world phenomenon can be observed.

Finally, this study discusses the data mule shortcuts in wireless sensor networks (WSN). In [29, 30], the authors used wired lines to construct a small world in a WSN. However, numerous WSN applications, such as remote surveillance systems or target tracking with mobile sensor nodes, might not be able to connect the network with wires. Adding wires is also not economical for short-term applications. Judging from the above, it is clear that in many applications data mule shortcuts are a better option than wired shortcuts. However, a WSN has sink nodes that all data must be forwarded to. In this situation, the active forwarding model is adopted on the data mules.

The remainder of this paper is organized as follows. Section II introduces some related works on the small world phenomenon, small worlds in wireless networks, and data mules. Section III proposes how to construct a small world in a wireless network using data mules as shortcuts. Subsequently, Section IV presents the simulation results. A final conclusion is given in Section V.

II. RELATED WORKS

This section introduces the concept of the small world phenomenon. Then we present previous studies on small worlds in wireless networks and data mules.

A. Small World Phenomenon

The small world phenomenon comes from the observation that individuals are often linked by a short chain of acquaintances. The same phenomenon was also found to exist on today's Internet and the World Wide Web (WWW) in [7, 8, 9]. Watts & Strogatz conducted a set of re-wiring experiments on regular graphs in [4, 6] and observed that by rewiring a few random links in a regular graph, the average path length was reduced significantly (approaching that of random graphs) while still maintaining a high degree of clustering within the neighboring nodes (since a node's neighbors are also neighbors of each other, they form a cluster). This class of graphs was termed small world graphs, a label that emphasizes the importance of random links acting as shortcuts that reduce the average graph path length.

However, how do we find the shortest path in such a network model? If the real world were the same as the network model proposed by Watts and Strogatz, the mail delivery experiment of Milgram would not succeed. In this mail delivery experiment, mail forwarding is determined by the intermediate deliverers' personal information such as race, career, age, and other characteristics. For this reason, data forwarding in the real world is done in a decentralized way, not in a centralized way like the model in [4, 6].
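The rewiring effect described above is easy to reproduce numerically. The short sketch below is our illustration, not part of the paper, and assumes the networkx library is available; it compares the average path length and clustering coefficient of a regular ring lattice with those of a slightly rewired Watts-Strogatz graph.

import networkx as nx

n, k = 200, 6          # 200 nodes, each joined to its 6 nearest ring neighbours
for p in (0.0, 0.05):  # p = 0: regular ring lattice; p = 0.05: a few rewired shortcuts
    G = nx.connected_watts_strogatz_graph(n, k, p, seed=1)
    print(p,
          round(nx.average_shortest_path_length(G), 2),  # drops sharply once p > 0
          round(nx.average_clustering(G), 2))            # stays high for small p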
Kleinberg tried to solve the decentralized problem in [11]. In contrast to Watts and Strogatz's model, Kleinberg's model assumes that the graph of the network is a directed 2-dimensional lattice graph, not an undirected 1-dimensional ring lattice, and that each node obtains shortcuts by adding links to the network, not by rewiring existing ones. The shortcut added in the network is determined via the following procedure. Node u is a sender in the network and has a link to endpoint v with probability proportional to [d(u, v)]^{-r}, where d(u, v) denotes the distance between node u and node v and r denotes a tunable parameter. To obtain a probability distribution, Kleinberg divided this quantity by the appropriate normalizing constant \sum_{v} [d(u, v)]^{-r}. He also proposes a greedy routing algorithm to forward data: each data forwarder u forwards data to the neighbor node v that is closest to the destination among all of the neighbor nodes of node u (the neighbor nodes of node u are the endpoints of the original links and the shortcut links of node u). Kleinberg found that different probability distributions for adding shortcuts (different values of r) affect the data delivery time significantly. Each different distribution causes a different expected length of the shortcut. Kleinberg found that an appropriate shortcut length lets network nodes find the best next hop based on local information (each node maintains the positions of its neighbor nodes) without losing the benefits associated with shortcuts. If shortcuts are too long, the source-initiated algorithm may be unable to identify the shortest path. Moreover, if the shortcuts are too short, the advantage of the small world phenomenon would not occur. Kleinberg shows that the minimum delivery time in the 2-dimensional lattice network occurs when r = 2. Kleinberg also extended the result to the general case of r equal to k in the k-dimensional lattice network. That is, networks for which only partial knowledge is available can use an appropriate length of shortcuts to effectively reduce the average path length.

B. Small Worlds in Wireless Networks

Wireless networks, such as ad hoc or sensor networks, are spatial graphs in which links are determined by radio connectivity. Such networks usually do not have long-length connections (shortcuts) due to the limitation of the radio range. Hence, they do not show the small world phenomenon, even though such networks have high clustering. Therefore, their average hop distance is much larger than the average hop distance of wired networks.

Helmy [10] first put the concept of the small world on wireless networks. He added a few links, which may represent physical wires, among random nodes to construct the shortcuts in wireless networks and calculated the average path length. The experimental results are the same as in the small world model: the clustering coefficient remains almost constant and the average path length is reduced significantly by the addition of a few shortcuts.

Simultaneously, Helmy experimented with the addition of fixed-length shortcuts to minimize the shortcut construction cost in wireless networks. The experimental results show that the shortcut length only needs to be a small fraction (25% ~ 40%) of the network diameter. Moreover, a longer shortcut (>40%) does not help to reduce the average path length. The experiment demonstrated that it is not necessary to spend extra cost to increase the shortcut length. The results of Helmy in [10] and Kleinberg in [11] show that, regardless of partial or full knowledge, shortcuts in networks must be within a certain range (the network model in [10] has full knowledge, in which each node knows all of the information in the network, and [11] has partial knowledge, in which each node only knows its neighbors' information).

C. Data Mules

Data mules (i.e., data ferries) have been proposed as a strategy for providing connectivity in networks such as DTNs, vehicular networks, etc., where a set of nodes called mules are responsible for carrying data for all network nodes. The operating modes of data mules can be divided into two categories: active and passive forwarding. In active forwarding, the direction of a data mule is determined by the destination of the data on the mule or by a fixed track mapped out in advance. In [21], a subset of mobile nodes is moved in a coordinated way. Those mobile nodes always remain pair-wise adjacent and move in a snake-like way. The snake's moving direction is determined by the snake's head, which follows a random walk. The snake-like sequence is responsible for relaying data. Moreover, [22] proposed a method that guarantees to minimize the message transmission time. In this approach, mobile hosts actively modify their trajectories to transmit messages. In [23], non-randomness is introduced to mobile node movement. Here, non-randomness means that the movement of the data mules is tracked by a fixed tracker such as a satellite.

In contrast, passive forwarding does not control the direction of the mobile nodes in a network. Data forwarding via mobile nodes in the passive mode resembles hitching a ride. Passive forwarding is termed opportunistic forwarding and can be divided into three categories: No, All, and Selective forwarding [3]. No forwarding is similar to the strategy used in the Data Mule project [24]. Mobile nodes accept data from a data source and cache it in their buffer. The mobile nodes continue to carry this data until they receive a RESPONSE from the destination, at which point they deliver the data and remove it from their cache. In this method, mobile nodes do not communicate with one another. No forwarding uses a minimum amount of system-wide buffer space, since only one copy of each packet propagates throughout the system. All forwarding [25] represents the opposite extreme to No forwarding. In this case, a mobile node unconditionally exchanges data with all of the other mobile nodes it meets. Whenever a mobile node hears a RESPONSE from another mobile node, it forwards all of the data in its buffer to the neighboring mobile nodes. The redundancy of data provided by this strategy helps maximize the success rate of packet delivery and reduces the overall delay. However, this method incurs an extremely high overhead in terms of the number of messages exchanged and buffer space. Selective forwarding is a form of greedy, geographically-based routing. A mobile node forwards the data to the corresponding neighbor only if the position of the neighbor is closer to the destination than its own current position.
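The selective-forwarding rule above amounts to a single geometric comparison. The fragment below is our minimal illustration of that test; the 2-D coordinates are an assumption, and nothing here is taken from the cited works.

from math import dist

def should_hand_over(my_pos, neighbor_pos, dest_pos):
    """Selective (greedy, geographic) forwarding: pass the data to a
    neighbor only if it is strictly closer to the destination than we are."""
    return dist(neighbor_pos, dest_pos) < dist(my_pos, dest_pos)

# Example: the neighbor at (4, 4) is closer to the destination (9, 9)
# than the current carrier at (1, 1), so the data is handed over.
print(should_hand_over((1, 1), (4, 4), (9, 9)))   # True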
III. DATA MULE IN WIRELESS NETWORKS

In this section, we first propose to construct a small world in a wireless network using data mules. We argue how data mules can avoid the defects of using wired lines or multi-hop paths as shortcuts. Then the proposed operation models of the data mules and network nodes are introduced. The small world phenomena in connected and disconnected wireless networks with data mules are then discussed, respectively. Furthermore, the small world phenomenon in wireless sensor networks using data mules is demonstrated.

A. Construct Small Worlds using Data Mules

In this subsection let us consider how to establish shortcuts in wireless networks. There are two existing methods for constructing small worlds in a wireless network: physical and logical. The physical way deploys wired lines as the shortcuts in the wireless network [12] or increases the radio range of the wireless devices to create long-length shortcuts. In the logical way, long-length shortcuts are frequently imitated by multi-hop paths [13, 14]. However, increasing the radio range in a common-channel network may negatively impact the spectrum utilization, and using multi-hop paths to simulate a long-length shortcut does not actually decrease the path length.

The most widespread method of constructing a small world in wireless networks is to deploy wired lines as shortcuts [12]. Because the wired lines must be deployed, the shortcuts have to be added in advance. However, shortcuts in a topology are used by all nodes, and a too-short or too-long shortcut length may not decrease the average path length [10, 11]. Therefore the shortcuts must be maintained within a fixed range of lengths according to the size of the network topology. For this reason, networks require a predetermined infrastructure to establish shortcuts. Even more problematic, the wireless nodes which are attached to the wired shortcuts need to be stationary. However, the key benefit of wireless networks is mobility. The topology of a mobile wireless network is dynamic. It is impossible to construct fixed-length wired shortcuts in a dynamic environment. Therefore, it is necessary to find another way to construct shortcuts in order to adapt to node mobility and topology changes.

Rather than creating physical wires, this study attempts to use mobile nodes (data mules) to carry data to construct small worlds. Numerous applications, such as those in Figs. 1 and 2, already have mobile nodes, and these can be utilized as shortcuts. This study assumes that all data mules in wireless networks have storage and mobility. These data mules are used to simulate shortcuts. Data is loaded onto the mule at the source and unloaded at the destination. The path between these two locations is the shortcut. In this study, we adopt passive forwarding for the data mules because most applications already have mobile nodes that travel randomly (i.e., vehicular networks or ad hoc networks). Unknown data demand and unpredictable data generation are other reasons for this decision. Thus, passive forwarding is sometimes more appropriate than active forwarding. However, active forwarding is adopted in certain situations, such as in a WSN with stationary sensor nodes and fixed sink nodes, and this is discussed in subsection D.

Figure 3. Data mule decides which data should get on the mule.

Although both data mules and traditional shortcuts have the same objective, they are totally different in nature. Traditional wired-type shortcuts are fixed in terms of their positions in the network, while data mule shortcuts are generated when data delivery demands them and are not fixed in terms of positions. Thus the length of traditional shortcuts must be measured as discussed in [10, 11]; however, the length of a data mule shortcut is determined by the demands of the data on the network. Even though data mule shortcuts and traditional shortcuts are totally different in nature, we show that the deployment of a small number of data mules can significantly decrease the average path length.

B. Operation of Data Mules and Network Nodes

This section discusses the operating details of data mules and network nodes. We adopt selective passive forwarding as our operation model for data mules. The operation of the data mules can be discussed in two aspects (a code sketch of these rules is given after the list):
1. The time when the data is loaded onto the data mule: Data can only be loaded onto the data mule when the direction of the data mule coincides with the location of the data destination. As Fig. 3 shows, the direction of data mule M is toward the colored region. In this case the data on sender S0 chose data mule M to get on because data mule M is heading toward D0.
2. The time when the data is unloaded from the data mule: There are three situations in which data on a mule must get off. First, when the data has already reached its destination, the data should get off the mule and be transmitted to the destination node; second, when the data has reached the location at which the remaining distance to its destination is minimal, and the data mule is moving away from the destination of the data, then the data should get off the mule; third, when the data mule turns, we should check whether the location of the data destination is still on the trajectory of the data mule. If the new direction of the mule is opposite to the destination of the data, then the data should get off to one of the nearest neighbor nodes.
31
the destination and change to follow a clockwise direction, as
shown in path 2. By repeating this procedure, data in connected
networks can find a path to the destination
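The greedy forwarding step described above can be made concrete with a small sketch. The helper below is illustrative only and is not part of the paper; it assumes each node knows its neighbors' positions (a common assumption in position-based routing) and returns the neighbor closest to the destination, or None when greedy forwarding is stuck at a local optimum and a recovery mode such as the right-hand rule would have to take over. The function and variable names are hypothetical.

import math

def distance(a, b):
    """Euclidean distance between two (x, y) positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_next_hop(current_pos, neighbor_positions, dest_pos):
    """Pick the neighbor that is strictly closer to the destination.

    Returns the chosen neighbor id, or None if no neighbor improves on the
    current node's distance (the local-optimum case, where recovery mode
    would be entered).
    """
    best_id, best_dist = None, distance(current_pos, dest_pos)
    for node_id, pos in neighbor_positions.items():
        d = distance(pos, dest_pos)
        if d < best_dist:
            best_id, best_dist = node_id, d
    return best_id

# Example: a node at (0, 0) forwarding toward a destination at (10, 0).
neighbors = {"n1": (3, 4), "n2": (5, -1), "n3": (-2, 0)}
print(greedy_next_hop((0, 0), neighbors, (10, 0)))  # -> "n2"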
Based on the above operations, this study proposes an algorithm that operates on both normal network nodes and data mules. Normal nodes execute the greedy forwarding algorithm and solve the local-optimum problem by entering recovery mode (in recovery mode, data transmission complies with the right-hand rule). Data on a normal network node attempts to get onto a neighboring mule heading towards the destination of that data; if the data cannot identify any neighboring data mule, it moves forward one hop via greedy forwarding. Data on a mule gets off upon arrival at the best location or when the mule changes direction. Data that gets off a mule selects another mule heading towards the direction of its destination; if no mule is available, the data gets off at a normal node.

C. Small World in Connected Wireless Network
In connected wireless networks, all node pairs are connected. Consequently, all data reach their destinations within a certain number of hops. Thus, we can calculate the average path length by the algorithm mentioned in Section B.

D. Small World in Disconnected Wireless Network
To date, all research on small worlds in wireless networks has assumed that the network topology is connected. This assumption is necessary for calculating the average path length. However, certain types of wireless networks are disconnected, such as delay tolerant networks (DTN). A DTN is an emerging form of network that experiences frequent and long-duration partitions; an end-to-end path may not exist between some or all nodes in a DTN. A data mule is proposed for connecting these disconnected nodes in [29]; however, small worlds in such disconnected wireless networks have not been discussed yet.

To resolve the problem of disconnected networks, let us consider the concept proposed by Watts in 1999 [28]. Watts proposed the first model for experimenting with the small world phenomenon in disconnected networks and named it the α-model. α is a value defined from 0 to ∞, which enables the model to generate graphs ranging from highly ordered to highly random. As Fig. 5 shows, α = 0 represents an extremely ordered graph in which each node is connected only to its neighbor nodes, while α = ∞ represents an extremely random graph in which each node connects randomly with other network nodes. Both extreme cases have a low average path length: in the extremely ordered graph the nodes connect only to their neighbors, and in randomly generated graphs the average path length has been shown to be short. However, when α = C most nodes are connected and the average path length increases significantly. Interestingly, Watts found that if a few more shortcuts are added to the network with α = C, the average path length drops drastically while high clustering is maintained. Fig. 6, in which the x axis is the value of α and the y axis is the average path length, demonstrates the small world phenomenon.

Figure 4. Right-hand rule to solve the local-optimum problem in a wireless network with an unclosed face.
Figure 5. Different graphs with different α.
Figure 6. Average path length vs. α.

Comparing the graphs of the α-model with wireless network topologies, a sparse network resembles the graph with α = 0, while a sparse network containing many data mules resembles the case α = ∞. As shown in Fig. 7, each circular part represents a connected area, while the red lines represent the data mule paths. This study attempts to find a proper number of data mules corresponding to the value of α. The simulation results are presented in the following section.

E. Small World in Wireless Sensor Network
In this section we apply the small world phenomenon to wireless sensor networks (WSN). Wireless sensor networking is a rapidly growing discipline, with new technologies emerging and new applications under development. In addition to providing light and temperature measurements, wireless sensor nodes have applications such as security surveillance, environmental monitoring, and wildlife watching. A WSN contains at least one sink node that collects all of the sensing information in the WSN (all of the sensing data must be delivered to the sink nodes). Nodes in a WSN are usually deployed across a wide area, as shown in Figs. 1 and 2, and reducing the average path length is important when large areas are involved. In [30], small worlds were constructed to decrease the average path length by using wired shortcuts; however, the use of wired lines is associated with several problems. 1. Wired lines face unpredictable risks, and deploying wired lines over a wide area involves unknown dangers; for example, natural disasters and man-made construction may break the lines. 2. For some applications, such as remote surveillance systems or target tracking using mobile sensor nodes, it may not be possible to deploy a network with wires. 3. For other short-term applications, adding wires is also unsuitable: wire deployment is time-consuming and becomes uneconomical after the end of the application. 4. Some applications have mobile sink nodes; since wired lines are deployed in advance together with the sink node, a mobile sink node may render the wired lines meaningless.
Figure 7. Disconnected wireless networks correspond to the α-model.
Figure 8. Path of data mules with different locations of sinks: (a) center-8, (b) center-16, (c) corner-8, (d) corner-16.

5. Certain applications already have mobile nodes (data mules). For example, eco-monitoring sensor systems are located in forest areas, and the sensor nodes are deployed randomly; rangers or hikers can act as data mules to collect the data from the sensor nodes and forward it to the base station (sink node). Fig. 1 shows another application involving mobile nodes; in this sensor system, tanks and vehicles, which move faster than foot soldiers, are used as data mules to forward data.

For the above reasons, data mules are more suitable than wired lines in a WSN. However, data forwarding in a WSN differs from that in typical wireless networks: the data is forwarded only to the sink nodes. In this situation it is more appropriate to control the motion of the data mules to relay data; therefore, active forwarding is adopted in the WSN. Fig. 8 shows the data mule paths deployed for different sink locations. In (a) and (b) the sink node is located at the center of the topology, while in (c) and (d) it is located at the corner of the topology. The topologies employ different numbers of lines: (a) and (c) deploy eight lines, while (b) and (d) deploy sixteen lines. Each line has at least one data mule shuttling back and forth between the sink node and the edge of the topology. Data mules collect data and forward them to the sink nodes. Different numbers of mules are deployed to study their influence on the average path length. The detailed simulation results are shown in the following section.

IV. SIMULATION RESULTS
This section is divided into three subsections. Subsection A illustrates the simulation results of small worlds in connected wireless networks with data mules. Subsection B shows the simulation results of the small world phenomenon in disconnected wireless networks with data mules. Finally, subsection C presents the simulation results of using data mules with the active forwarding mode in wireless sensor networks.

A. Small World in Connected Wireless Network
This investigation uses ns2 as the simulation tool. The size of the network topology is 800 m x 800 m. The network contains 300 nodes, and the radio range of each node is 70 m. Nodes in the network repeatedly generate data, and the destinations of the data are randomly chosen. We use the routing algorithm described in Section III. Various numbers of data mules are added to the networks, ranging from 10% to 100% of the total nodes (since the topology contains 300 nodes, the number of data mules ranges from 30 to 300).

This study uses the number of hops from sender to destination to measure path length. As shown in Fig. 9, the x axis represents the number of data mules added to the network and the y axis represents the ratio L(p)/L(0). L(p) denotes the average path length after adding p*(number of nodes in the network) data mules. For example, in a topology containing 300 nodes, L(0.1) is the average path length of the topology with 0.1*300 = 30 data mules. Likewise, L(0) is the average path length of the network without data mules. Fig. 9 shows that adding a small number of mules (10%) can reduce the average path length by nearly 50%: a network with 300 nodes needs only 30 mules for a 50% reduction in average path length. However, continuing to add more mules improves the average path length only slowly. For example, if the number of mules increases to 100% (300 nodes and 300 mules in the network simultaneously), the average path length decreases by only about 70%; the gain improves by just 20 percentage points even though ten times as many data mules are added. The results indicate that the addition of a small number of mules can significantly reduce the average path length, in line with the findings of [6] and [10]. We conclude that data mules can emulate the shortcuts in wireless networks without losing the small world phenomenon.

The following simulation measures the delay caused by data mules. Similar to Fig. 9, in Fig. 10 the x-axis represents the number of data mules added to the network and the y-axis is the ratio D(p)/D(1). D(p) denotes the average delay of the data communication for a network with p*(number of nodes in the network) mules. Moreover, D(1) is the average delay for the network in which the number of data mules equals the number of nodes (300 in this case). As shown in Fig. 10, when the number of mules decreases from 300 to 0, the delay also decreases; moreover, when the number of mules is about 30 (p = 0.1), the delay is reduced by 50%. Fig. 11 combines Fig. 9 and Fig. 10, and determines the optimum number of mules, about 10% of the total network nodes, required to achieve the best tradeoff between average path length and delay.
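The L(p)/L(0) and D(p)/D(1) ratios used above are simple averages over the delivered source-destination pairs. The following small Python sketch is illustrative only (the function and variable names are not from the paper); it shows one way such ratios could be computed from per-packet hop counts and delays collected in a simulation run.

def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def path_length_ratio(hops_with_mules, hops_without_mules):
    """L(p)/L(0): average hop count with p*N mules over the baseline without mules."""
    return average(hops_with_mules) / average(hops_without_mules)

def delay_ratio(delays_with_p_mules, delays_with_all_mules):
    """D(p)/D(1): average delay with p*N mules over the delay when every node is a mule."""
    return average(delays_with_p_mules) / average(delays_with_all_mules)

# Example with made-up hop counts and delays (seconds) for a handful of packets:
print(path_length_ratio([4, 5, 3, 6], [9, 10, 8, 11]))   # roughly 0.47
print(delay_ratio([1.2, 0.9, 1.5], [2.5, 2.2, 2.8]))     # roughly 0.48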

Figure 9. Average path length decreases drastically when a small number of mules is added to connected wireless networks.
Figure 10. Average delay increases drastically when a small number of mules is added; more mules cause more delay.
Figure 11. Average path length and average delay as the number of data mules increases.
Figure 12. Length of shortcuts vs. average path length and delay.
After establishing the tradeoff between path length and delay, this study discusses the appropriate length of the shortcuts emulated by data mules. In Fig. 12, the x-axis indicates the shortcut length emulated by the traveling distance of a data mule, expressed as a fraction of the network diameter, and the y-axis is L(0.1)/L(0) for the average hop distance and for the delay. The shortcut length influences network performance: an average shortcut length of less than 0.2*(network diameter) causes the average path length ratio to exceed 0.6, whereas when the shortcut length is greater than 0.2*(network diameter), the ratio can be reduced to 0.53. Similar to [10], the result in Fig. 12 shows that the shortcut length must exceed at least 20% of the network diameter. Fig. 12 also shows that increasing the shortcut length beyond 0.3*(network diameter) does not further reduce the average path length, because delivery of data over such long distances is rarely necessary [11] in small world networks.

B. Small World in Disconnected Wireless Network
The previous simulations used a connected network with a radio range of 70 m. In this simulation, the radio range of each node is reduced to 60 m or 50 m to create a disconnected network. Fig. 13 and Fig. 14 illustrate the data delivery ratio versus various numbers of mules, while Fig. 15 and Fig. 16 show the average path length versus various numbers of mules. On the x axes of Figs. 13-16, p denotes the fraction of data mules added to the network; the y axes of Figs. 13 and 14 show the data delivery ratio, and Figs. 15 and 16 plot L(p)/L(0.0333).

Fig. 13 and Fig. 14 show that, for both radio ranges, the addition of a small number of mules to a disconnected network can significantly increase the data delivery ratio. The results show that data mules are a practical way to increase the data delivery ratio in disconnected networks such as DTNs, vehicular networks, etc.

Figure 13. Number of mules vs. data delivery ratio when the radio range is 60 m.
Figure 14. Number of mules vs. data delivery ratio when the radio range is 50 m.

Furthermore, Fig. 15 and Fig. 16 demonstrate the small world phenomenon in disconnected wireless networks. The figures show that a network without data mules is initially disconnected, and thus the average path length is quite short. When p = 0.0333, most disconnected parts become connected and the average path length increases significantly, because network data can now be delivered to the destination via data mules but the number of data mules is still insufficient to form the small world phenomenon. However, if more data mules are added to the network, the average path length decreases noticeably because enough data mules are available to forward data to the destinations, acting as the shortcuts of a small world network. This result resembles the small world phenomenon described in [28].

C. Small World in Wireless Sensor Network
This section simulates four kinds of topology. As shown in Fig. 8, the sink node at the center/corner has 8/16 lines. Fig. 17 shows the simulation results for these topologies. The x-axis shows the number of mules on each line, while the y-axis is L(n)/L(0), where n is the number of mules on each line.

Notably, the average path length decreases with an increasing number of data mules. However, a small number of mules already decreases the path length significantly, and further increasing the number of mules does not achieve much additional reduction. These results support the findings of the previous simulations on the number of mules. Additionally, Fig. 17 illustrates that performance is better with 16 lines than with 8 lines, because the former can cover more area than the latter.

V. CONCLUSION
Recently, the small world phenomenon has been a hot research topic in various studies. Those studies discussed the relationships connecting individuals in the real world, and some studies on wireless networks also used the small world phenomenon to enhance network performance. However, previous studies added the shortcuts in advance, because the length of the shortcuts must lie within a range that fits all individuals in the network; a fixed shortcut suits only a fixed topology. In a mobile network it is impossible to construct fixed-length shortcuts owing to the lack of fixed stations for establishing them. From the points of view above, this study used mobile nodes, called data mules, to construct shortcuts randomly. The use of data mules allows shortcuts to be built easily and economically. The simulation results demonstrate that such shortcuts can easily reduce the average path length without losing the small world property.
Figure 15. Number of mules vs. average path length when the radio range is 60 m.
Figure 16. Number of mules vs. average path length when the radio range is 50 m.
Figure 17. Average path length vs. number of mules on each line (center-16, center-8, corner-16, corner-8).

Furthermore, previous research assumed that wireless networks are connected and calculated the average path length and clustering coefficient over all nodes in a topology to confirm the small world phenomenon. However, in disconnected networks it is impossible to estimate the path length of disconnected node pairs. Consequently, this study utilizes the concept of the α-model [28] proposed by Watts in 1999 to solve the problem of disconnected networks. Simulation results demonstrate that the small world phenomenon also exists in disconnected wireless networks when a small number of data mules is added.

Finally, this study discusses the small world phenomenon in wireless sensor networks containing data mules. Past studies constructed small worlds in WSNs by using fixed wires; however, data mule shortcuts are superior to fixed wires in most applications. This study therefore actively controls the motion of the data mules to fit the sink nodes. The results show that adding a small number of mules reduces the average path length significantly, while adding larger numbers of data mules cannot effectively reduce the average path length further.

REFERENCES
[1] M. Mauve, A. Widmer, H. Hartenstein, "A survey on position-based routing in mobile ad hoc networks," IEEE Network Magazine, vol. 15, no. 6, pp. 30-39, Nov. 2001.
[2] R. Shah, S. Roy, S. Jain, W. Brunette, "Data MULEs: modeling a three-tier architecture for sparse sensor networks," Proc. IEEE SNPA Workshop, May 2003.
[3] J. LeBrun, C.-N. Chuah, D. Ghosal, "Knowledge Based Opportunistic Forwarding in Vehicular Wireless Ad Hoc Networks," Proc. IEEE Vehicular Technology Conference (VTC), Spring 2005.
[4] D. J. Watts, "Small Worlds: The Dynamics of Networks between Order and Randomness," Princeton University Press, 1999.
[5] S. Milgram, "The small world problem," Psychology Today, pp. 60-67, 1967.
[6] D. Watts, S. Strogatz, "Collective dynamics of 'small-world' networks," Nature, vol. 393, pp. 440-442, 1998.
[7] L. Adamic, "The Small World Web," Proc. ECDL, pp. 443-452, 1999.
[8] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, J. Wiener, "Graph structures in the web," Computer Networks, June 2000.
[9] R. Albert, H. Jeong, A. Barabasi, "Diameter of the world wide web," Nature, 1999.
[10] A. Helmy, "Small worlds in wireless networks," IEEE Communications Letters, vol. 7, no. 10, pp. 490-492, Oct. 2003.
[11] J. Kleinberg, "The Small-World Phenomenon: An Algorithmic Perspective," Proc. ACM Symposium on Theory of Computing, pp. 163-170, 2000.
[12] G. Sharma, R. Mazumdar, "Hybrid Sensor Networks: A Small World," Proc. ACM MOBIHOC, Mar. 2005.
[13] A. Helmy, "Mobility-Assisted Resolution of Queries in Large-Scale Mobile Sensor Networks (MARQ)," Computer Networks Journal, Elsevier Science, Special Issue on Wireless Sensor Networks, Aug. 2003.
[14] A. Helmy, "Contact Based Architecture for Resource Discovery (CARD) in Large Scale MANets," Proc. IPDPS, Apr. 2003.
[15] C. Perkins and P. Bhagwat, "Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers," Proc. SIGCOMM '94 Conference on Communications Architectures, pp. 234-244, Aug. 1994.
[16] P. Jacquet et al., "Optimized Link State Routing Protocol," Internet draft, draft-ietf-manet-olsr-04.txt, work in progress, Sept. 2001.
[17] B. Bellur, R. Ogier, and F. Templin, "Topology Broadcast Based on Reverse-Path Forwarding (TBRPF)," Internet draft, draft-ietf-manet-tbrpf-01.txt, work in progress, Mar. 2001.
[18] H. Takagi and L. Kleinrock, "Optimal Transmission Ranges for Randomly Distributed Packet Radio Terminals," IEEE Trans. Commun., vol. 32, no. 3, pp. 246-257, Mar. 1984.
[19] B. Karp and H. T. Kung, "Greedy Perimeter Stateless Routing for Wireless Networks," Proc. 6th Annual ACM/IEEE Int'l Conf. Mobile Comp. Net., pp. 243-254, Aug. 2000.
[20] G. Toussaint, "The Relative Neighborhood Graph of a Finite Planar Set," Pattern Recognition, vol. 12, no. 4, pp. 261-268, 1980.
[21] I. Chatzigiannakis, S. Nikoletseas, and P. Spirakis, "Analysis and Experimental Evaluation of an Innovative and Efficient Routing Approach for Ad-hoc Mobile Networks," Proc. 4th Annual Workshop on Algorithmic Engineering, 2000.
[22] Q. Li, D. Rus, "Sending messages to mobile users in disconnected ad-hoc wireless networks," Proc. International Conference on Mobile Computing and Networking, pp. 44-55, 2000.
[23] W. Zhao, M. Ammar, E. Zegura, "A message ferrying approach for data delivery in sparse mobile ad hoc networks," Proc. 5th ACM International Symposium on Mobile Ad Hoc Networking and Computing, pp. 187-198, 2004.
[24] A. Vahdat and D. Becker, "Epidemic routing for partially connected ad hoc networks," Tech. Rep. CS-200006, Department of Computer Science, Duke University, Durham, Apr. 2000.
[25] C. Chuah, D. Ghosal, H. Chang, J. Anda, and M. Zhang, "Enabling energy demand response with vehicular mesh networks," Proc. IEEE Conference on Mobile and Wireless Communication Networks, 2004.
[26] Z.-D. Chen, H.-T. Kung, D. Vlah, "Ad hoc relay wireless networks over moving vehicles on highways," Proc. MobiHoc, Oct. 2001.
[27] D. J. Watts, "Networks, Dynamics, and the Small-World Phenomenon," American Journal of Sociology, vol. 105, no. 2, pp. 493-527, 1999.
[28] D. J. Watts, "Six Degrees: The Science of a Connected Age," W. W. Norton & Company, reprint edition, Feb. 2004.
[29] R. Chitradurga and A. Helmy, "Analysis of wired shortcut in wireless sensor networks," Proc. IEEE/ACM International Conference on Pervasive Services, July 2004.
[30] G. Sharma and R. Mazumdar, "Hybrid sensor networks: A small world," Proc. ACM MOBIHOC, Mar. 2005.

Load Awareness Multi-Channel MAC Protocol Design for Ad Hoc Networks

Chih-Min Chao and Kuo-Hsiang Lu


Department of Computer Science and Engineering
National Taiwan Ocean University, 20224, Taiwan
Email: {cmchao,m95570023}@ntou.edu.tw

Abstract

Exploiting multiple channels improves the system performance in a heavily loaded wireless ad hoc network. Many multi-channel MAC protocols have been proposed recently. The number of channels being used is fixed for these protocols, which limits their flexibility. As a result, these protocols may suffer from either longer delay in a lightly loaded network or higher collision probability in a heavily loaded one. In this paper, we propose a Load Awareness Multi-channel MAC (LAMM) protocol which dynamically adjusts the number of channels being used according to the network load. Our scheme is able to utilize multiple channels to increase throughput when the network is heavily loaded. It can also switch back to a single channel to reduce transmission delay when the network is lightly loaded. Simulation results verify that our LAMM significantly improves the network performance.

Keywords: Ad hoc networks, multi-channel, medium access control.

1 Introduction

A wireless ad hoc network consists of a cluster of mobile hosts without any pre-designed infrastructure of base stations. The IEEE 802.11 standard is usually adopted as the access scheme for ad hoc networks. The MAC protocol of the IEEE 802.11 standard is designed for accessing a single channel. It is well known that utilizing multiple channels can increase spatial reuse and hence enhance network throughput. Thus, it is desirable for a user to utilize multiple channels to communicate with others, and several multi-channel MAC protocols have been proposed recently. When a user is involved in a transmission in a multi-channel network, the most important question is: "which channel should be used for the transmission?" This is referred to as the channel allocation problem. In the literature, solutions to this problem can be categorized according to the following two factors:

• Single- or multiple-transceiver: whether a user can access one channel at a time or multiple channels simultaneously.

• With/without dedicated control channel: whether the control messages are transmitted on a dedicated control channel or not.

Based on these two factors, we classify existing multi-channel protocols in Table 1.

Table 1. Classification of multi-channel MAC protocols
                     | with dedicated control channel | without dedicated control channel
multiple-transceiver | [4, 5, 10, 11]                 | [3, 9]
single-transceiver   | [6, 12]                        | [2, 7, 8], ours

Equipped with multiple transceivers, nodes can access multiple channels simultaneously. Several solutions belong to the multiple-transceiver, with dedicated control channel category [4, 5, 10, 11]. In this category, a user utilizes at least two transceivers, one for exchanging control messages and the other for transferring data. The transceivers for data transfer are capable of dynamically switching among different channels. Some solutions [3, 9] utilize multiple transceivers but avoid using a dedicated control channel; these schemes adopt the concept of the ATIM window in IEEE 802.11 power saving mode (PSM).
The main concern with using multiple transceivers is increased hardware cost. Thus, many proposals adopt only one transceiver to reduce hardware cost. Some of them [6, 12] use a single transceiver and employ a dedicated control channel to fulfill channel negotiation. The weakness of having a dedicated control channel is that it may become a bottleneck if the network is heavily loaded. The other solutions avoid using a dedicated control channel [2, 7, 8]. Some of them [7, 8] adopt the concept of the ATIM window in IEEE 802.11 PSM to complete channel allocation. The ATIM window can be considered as a common control period during which users switch to a predefined channel to exchange control messages. Thus, the ATIM window may still be a bottleneck when the network is heavily loaded. Moreover, since data channel switching is applied every beacon interval, these protocols also suffer from the head-of-line (HOL) problem (we describe this problem in Section 2). Another solution that does not apply a dedicated control channel is SSCH [2]. Rather than negotiating a channel with its communication partner, a user running SSCH switches channels in a scheduled manner. Two users can talk to each other only when they switch to the same channel at the same time. The downside of SSCH is that it may incur longer delay when the network is lightly loaded and may incur the HOL problem when heavily loaded. A comprehensive review of existing solutions is given in Section 2.

All the above-mentioned multi-channel protocols operate in an environment where the number of channels is fixed. They may suffer from either longer delay or higher collision probability when the network load changes. In this paper, we propose a Load Awareness Multi-channel MAC protocol (LAMM) which belongs to the single-transceiver, without dedicated control channel class. Our LAMM is able to dynamically adjust the number of channels being used to reduce transmission delay and to alleviate the HOL problem. To achieve this goal, we also adopt the ATIM window concept of IEEE 802.11 PSM and exploit asynchronous channels. When the network load is light, only one channel is used by all users. When the channel is saturated, another channel is added into operation to share both control and data packets. Such a design avoids the bottleneck of having a common control period.

The rest of the paper is organized as follows. We review existing multi-channel MAC protocols in Section 2. Section 3 presents our scheme in detail. Simulation results are given in Section 4. Finally, we conclude the paper in Section 5.

2 Related Work

In the multiple-transceiver, with dedicated control channel class, one transceiver is dedicated to control message exchange and the other transceivers can switch among n data channels. Since control and data packets are transmitted on different channels with different transceivers, users can receive both kinds of packets at the same time. A user that has completed channel negotiation on the control channel can switch a data transceiver to the reserved data channel for data transmission. In DCA [10], each host has two transceivers, one for control and the other for data transmission. Since the control transceiver listens all the time, each node knows the complete channel usage information and thus can select a free channel whenever needed. Each node running MTMAC [11] maintains a Channel Allocation Vector (CAV) which contains the usage statuses of all data channels. Similar to the concept of the Network Allocation Vector in the IEEE 802.11 MAC mechanism, the CAV indicates the transmission time reserved for each data channel. With this information, channel allocation can be accomplished in the following way. A sender first transmits an RTS packet which includes its CAV. The receiver replies with a CTS packet which contains a channel that is available in both its CAV and the CAV included in the RTS packet. After recognizing the CTS packet, the sender sends a SACK packet to confirm the selected channel. The CTS and SACK packets enable nodes that overhear them to update their own CAVs. In PCAM [5], each node requires three transceivers. Two transceivers, the primary and the secondary, are used for data transmission, and the third transceiver is used for broadcasting the channel setting message. The channel for a node's primary transceiver is assigned in advance and is announced through the third transceiver. The selection of the channel for actual data transfer is receiver-based: when a node A initiates a communication to another node B, the channel being used is the one assigned to the primary transceiver of node B. That is, if the channel is different from A's primary channel, node A will tune its secondary transceiver to that channel for the communication.

In the multiple-transceiver, without dedicated control channel class, the concept of the ATIM window in IEEE 802.11 PSM is adopted by several proposals. In PSM-MMAC [9], each node has one default transceiver serving for traffic indication and channel negotiation. During the ATIM window, each node switches its default transceiver to the default channel to exchange transmission intentions through ATIM/ATIM-ACK packets. Also, the source-destination pairs select channels according to the quality and traffic load of each channel. After the ATIM window, all channels, including the default channel, can serve to exchange data on all the transceivers. Another algorithm, Novel, which adopts the same ATIM window concept, is proposed in [3]. A node running Novel has two transceivers, a control one and a data one. Both transceivers can be used for negotiation during the ATIM window: the data transceiver tunes to the default channel and the control transceiver tunes to the control channel. During the non-ATIM period, data transmission can be conveyed on all channels, including the control channel. In these two multiple-transceiver categories, each node is equipped with multiple transceivers, which is undesirable due to the higher hardware cost.

To reduce hardware cost, several proposals utilize only one transceiver to solve the channel allocation problem. Some of them adopt a dedicated control channel [6, 12]. In AMCP [6], one control channel and n data channels are assumed. Each node locally maintains an n-entry channel table to keep track of the usage of the data channels. Initially, all nodes stay on the control channel. When a node A has a packet pending for transmission to node B, node A sends an RTS packet which contains its preferred channel, say channel x, to be used for the upcoming data transfer. Node B checks channel x in its channel table and replies with a Confirming CTS if it is available. Then, both nodes switch to the scheduled channel for data transmission. If channel x is not available for node B, a Rejecting CTS which contains a list of node B's available data channels is replied. In that case, another round of channel selection and negotiation will be triggered. After data transmission, both nodes switch back to the control channel and set all data channels except x to be unavailable for a certain period of time. Such settings can avoid the multi-channel hidden terminal problem. Similar to AMCP, LBM [12] utilizes one control channel and n data channels. LBM tries to balance the load among channels while making the channel allocation: nodes running LBM use the channel that is available and has the lowest utilization ratio for data transmission. An unsatisfactory feature of using a dedicated control channel is that it becomes a bottleneck which limits the overall utilization.

Solutions in the single-transceiver, without dedicated control channel class try to mitigate the bottleneck problem [2, 7, 8]. In SSCH [2] each node hops between channels using its own channel hopping sequence. The channel hopping sequences are designed such that nodes will overlap with each other at least once in a cycle. The dwelling time for each hop is set to 10 ms. Within this period, IEEE 802.11 DCF is adopted as the MAC mechanism. SSCH also enables a node to change its channel hopping sequence to match that of its receiver. SSCH avoids the control channel bottleneck by sharing the control overhead among all channels. However, since channel switching happens every 10 ms, the capacity of a channel can be fully utilized only if there are enough packets to be delivered to the nodes residing on the same channel. Moreover, the average delay for a successful packet transmission is increased, since nodes generally take a longer time before they hop to the same channel.

Some other solutions adopt the concept of the ATIM window in IEEE 802.11 PSM. In MMAC [7], all nodes initially listen on the default channel during the ATIM window. When a node A intends to communicate with another node B, node A sends an ATIM packet to node B on the default channel. This packet includes a Preferable Channel List (PCL) which presents node A's channel usage information. After receiving the ATIM packet, node B chooses a channel that is free in both node A's PCL and its own. Node B then notifies node A of the selected channel through an ATIM-ACK packet, and node A confirms this channel by replying with an ATIM-RES packet. At the end of the ATIM window, both nodes switch to the negotiated channel to fulfill the data transmission. Although there is no dedicated control channel in MMAC, the ATIM window can be considered as a common control period which still produces a bottleneck. Moreover, since the sender contends for a channel for the first packet in its queue, MMAC suffers from throughput degradation if there is not a sufficient number of packets to the same receiver. This is referred to as the HOL problem.
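As an illustration of the PCL-style negotiation described above, the small sketch below picks the lowest-numbered channel that is marked free in both the sender's and the receiver's channel lists. It is a simplified illustration, not the MMAC or MTMAC implementation; the data structures and names are hypothetical.

def pick_common_free_channel(sender_free, receiver_free):
    """Return the lowest-numbered channel free at both nodes, or None.

    sender_free / receiver_free: sets of channel ids each node currently
    considers available (playing the role of a PCL or CAV snapshot).
    """
    common = sender_free & receiver_free
    return min(common) if common else None

# Example: the sender sees channels 1 and 2 free, the receiver sees 2 and 3 free.
print(pick_common_free_channel({1, 2}, {2, 3}))  # -> 2
print(pick_common_free_channel({1}, {3}))        # -> None (negotiation must retry)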
In SLKB [8], each beacon interval is divided into three subintervals and three asynchronous channels are used. Figure 1 shows the channel usage of SLKB. ATIM and ATIM-ACK packets exchanged in the ATIM window indicate the transmission pairs in the data subframe. Nodes that have completed their tasks in the first subinterval on channel 0 will switch to channel 1 (the first neighbor channel of channel 0); nodes that have completed their tasks in the second subinterval on channel 0 will switch to channel 2 (the second neighbor channel of channel 0) for possibly more transmissions. Similarly, nodes that have completed their transmissions on channel 1 will switch to channel 2 or channel 0 to contend for further data transmission. Using asynchronous channels increases utilization when the traffic load is high. However, in a lightly loaded network, nodes may perform many unnecessary switches, since the original channel may not be fully utilized. Such unnecessary switching may increase the delay to transmit a packet, since the sender and the receiver may become partitioned. Moreover, if some nodes switch to another channel and have no packet pending for each other, they are unable to switch back to the original channel, since only nodes that have successfully completed a data transmission are allowed to switch.

Figure 1. Channel usage in SLKB.

The inefficiencies of the existing protocols mentioned above motivate this work. We aim to adjust the channel usage according to the network traffic load. Meanwhile, we try to avoid the control channel bottleneck by distributing the control messages over asynchronous channels.

3 Proposed Load-Awareness Multichannel MAC (LAMM) Protocol

The protocol we propose belongs to the single-transceiver, without dedicated control channel class. The structure of LAMM is similar to that of IEEE 802.11 PSM. Time is divided into a series of beacon intervals. At the beginning of each interval there exists a small window called the ATIM window, and all nodes should keep awake during the ATIM window. Fig. 2 illustrates the operation of IEEE 802.11 PSM. If node A has traffic for node B, it sends an ATIM packet to B during the ATIM window. Node B replies with an ATIM-ACK to A in response to this ATIM packet. After a successful ATIM/ATIM-ACK dialog, both nodes A and B stay awake during the data window for the data transmission. A node that does not receive an ATIM can enter doze mode at the end of the ATIM window.

Figure 2. Operation of IEEE 802.11 PSM.

The LAMM protocol concentrates on solving the channel allocation problem; nodes will not go to sleep in the data window. Similar to IEEE 802.11 PSM, the ATIM and ATIM-ACK packets are used for the purpose of packet transmission announcement. In LAMM, the ATIM and ATIM-ACK packets include the number of pending packets for the destination. This information implies the maximum achievable throughput, which is crucial for LAMM to determine whether a new channel should be added into operation or not.

We define a node A to be inactive if one of the following conditions holds:

• Node A has transmitted an ATIM packet to its destination but does not receive an ATIM-ACK packet.

• Node A has ATIM packets to send but fails to deliver one for the whole ATIM window.

• Node A has no ATIM packet to send and does not receive any ATIM packet during the ATIM window.

To facilitate the operation of LAMM, each node maintains a variable CUi to estimate the usage of channel i. In LAMM, we use the total number of packets pending to be sent on channel i during the data window as the measure of CUi. Since the ATIM and ATIM-ACK packets contain the number of pending packets for the destination, each node is able to count the number of packets pending for the data window. At the beginning of each beacon interval, CUi is set to zero. For the sender and the receiver of a particular ATIM/ATIM-ACK transmission, their CUi is increased by the number of pending packets contained in the ATIM/ATIM-ACK packets. If a node hears an ATIM or an ATIM-ACK packet on channel i during the ATIM window, it adds the number of pending packets to its CUi.
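The following sketch illustrates, in simplified form, the per-beacon bookkeeping just described: classifying a node as inactive and accumulating the channel usage estimate CUi from the ATIM/ATIM-ACK packets seen during the ATIM window. It is an illustration under the assumptions above, not the authors' implementation; the function and field names are hypothetical.

from collections import defaultdict

def is_inactive(sent_atim, got_atim_ack, had_atim_to_send, received_any_atim):
    """Return True if any of LAMM's three inactivity conditions holds."""
    if sent_atim and not got_atim_ack:
        return True                      # ATIM sent but never acknowledged
    if had_atim_to_send and not sent_atim:
        return True                      # could not deliver an ATIM in the whole window
    if not had_atim_to_send and not received_any_atim:
        return True                      # nothing to send and nothing received
    return False

def channel_usage(observed_packets):
    """Accumulate CU_i from ATIM/ATIM-ACK packets seen on each channel.

    observed_packets: iterable of (channel_id, pending_packet_count) pairs,
    one per ATIM or ATIM-ACK packet sent, received, or overheard this window.
    """
    cu = defaultdict(int)                # CU_i starts at zero each beacon interval
    for channel_id, pending in observed_packets:
        cu[channel_id] += pending
    return dict(cu)

# Example: three announcements heard on channel 0, one on channel 1.
print(channel_usage([(0, 12), (0, 7), (0, 5), (1, 3)]))  # {0: 24, 1: 3}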
Another parameter needed in LAMM is CUmax, which indicates the maximum number of packets that can be sent during a data window. The value of CUmax depends on system parameters such as the transmission rate, the data window size in a beacon interval, and the packet size. For example, if a network has a transmission rate of 2 Mbps with an 80 ms data window in a 100 ms beacon interval and a packet size of 512 bytes, then CUmax is 40 ((2 Mbps x 0.08 s) / (512 x 8 bits) ≈ 40).

The main idea of our LAMM is to use a smaller number of channels when the system is lightly loaded and more channels when it is heavily loaded. In a lightly loaded network, since there is no need to utilize multiple channels, all the nodes should remain on the same channel to avoid the waiting time for nodes to hop to the same channel; such a mechanism reduces the delay of successful transmissions. When the traffic load increases, it is reasonable to utilize multiple channels to increase spatial reuse. To fulfill this idea, each node examines its CUi for the channel i currently being used to determine whether adding or removing a channel is necessary. When the utilization of the channel(s) currently being used is higher than a predefined threshold CUthrd, another asynchronous channel is created to share both control and data packets. When the utilization of some of the channels being used drops below CUthrd, the users residing on these channels should switch back to the default channel.

Now we present our LAMM protocol. Initially, all the nodes reside on the default channel (channel 0). During the ATIM window, nodes update their CU0 according to the transmitted/received/overheard ATIM and ATIM-ACK packets. At the end of the ATIM window, nodes that have completed their ATIM and ATIM-ACK dialog proceed to data transfer. For the inactive nodes, a decision is made on whether they should switch to another asynchronous channel. These inactive nodes switch to the next channel if the estimated utilization of the channel currently being used (CU0) is larger than or equal to CUthrd. The next channel of channel i is channel i + 1 (mod n), where n is the total number of channels that can be used in the network. If the estimated utilization of the channel currently being used (CU0) is less than CUthrd, these inactive nodes stay on the current channel. To remove a channel: when the estimated utilization of a channel i being used (except the default channel) is lower than CUthrd (CUi < CUthrd) for k successive times, the nodes residing on this channel should switch back to the default channel at the next beacon interval. To avoid a ping-pong effect, we set the value of the parameter k to two.

An example of the operation of LAMM is illustrated in Fig. 3. We assume that there is enough traffic to saturate the channel being used, and we focus on three pairs of transmissions among others: nodes A to B, C to D, and E to F. In the beginning, all nodes reside on the default channel (channel 0). Assume that, among the three pairs, only nodes A and B have successfully completed their ATIM/ATIM-ACK exchange on channel 0. At the end of the ATIM window on channel 0, after confirming that CU0 ≥ CUthrd, nodes C, D, E, and F will switch to channel 1 for possible transmission. Similarly, supposing that nodes C and D have successfully exchanged ATIM/ATIM-ACK on channel 1, nodes E and F will switch to channel 2 at the end of that ATIM window if CU1 ≥ CUthrd.

Figure 3. Operation of LAMM.

It should be noted that we do not solve the missing receiver problem [6]. The missing receiver problem may have a negative impact on throughput, especially when the network traffic load is light. This problem happens when control packets sent on a certain channel fail to reach an intended receiver because that node is not on the same channel. For example, in Fig. 3, node C may be the transmitter of node D and the receiver of node A at the same time. The missing receiver problem may exist in LAMM between nodes A and C if one node is active and the other is inactive and switches to the next channel. We do not provide solutions to this problem in this paper. However, LAMM alleviates this problem to some extent, since a channel switch happens only when the network is heavily loaded.
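The channel add/remove rule described above can be summarized in a few lines of code. The sketch below is illustrative only (simplified, with hypothetical names); it computes CUmax from the system parameters given above and decides, for an inactive node, whether to move to the next channel or return to the default channel.

def cu_max(rate_bps, data_window_s, packet_bytes):
    """Maximum number of packets that fit in one data window."""
    return int(rate_bps * data_window_s // (packet_bytes * 8))

def next_channel_for_inactive(current, cu, cu_thrd, below_thrd_count, n_channels, k=2):
    """Decide where an inactive node should be in the next beacon interval.

    cu: estimated utilization CU_i of the current channel.
    below_thrd_count: number of successive intervals CU_i stayed below CU_thrd.
    Returns the channel to use next.
    """
    if cu >= cu_thrd:
        return (current + 1) % n_channels      # current channel saturated: advance
    if current != 0 and below_thrd_count >= k:
        return 0                               # under-used non-default channel: fall back
    return current                             # otherwise stay put

print(cu_max(2_000_000, 0.08, 512))            # -> 39 (about 40, as in the example above)
print(next_channel_for_inactive(0, cu=45, cu_thrd=40, below_thrd_count=0, n_channels=3))  # -> 1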
4 Performance Evaluation

We evaluate the performance of our protocol using the ns-2 simulator (version 2.28) [1]. We compare LAMM with MMAC and SLKB, since they utilize the same ATIM window concept. We use the following three metrics to evaluate these protocols: aggregate throughput, packet delivery ratio, and average packet delay. In the simulations, a total of 25 nodes run in a single-hop environment. The source-destination pairs are selected randomly, which means a node may be a sender and/or a receiver in the same beacon interval. Each node generates constant-bit-rate (CBR) traffic with a packet size of 512 bytes. The transmission rate of a single channel is 2 Mbps. The beacon interval is set to 100 ms, of which the ATIM window occupies the first 20 ms. Each data point in the graphs is an average of 20 runs, each simulating 40 seconds. We assume that a total of three channels can be utilized. The other MAC layer parameters are given in Table 2.

Table 2. MAC layer parameters
Parameter          | Value
SIFS               | 10 µs
DIFS               | 50 µs
CWmin              | 31
CWmax              | 1023
ATIM Retry Limit   | 3
Short Retry Limit  | 7
Long Retry Limit   | 4
Preamble Length    | 144 bits
PLCP Length        | 48 bits

In our simulations, we first need to determine the best CUthrd. We have tested four different CUthrd values, and the results (aggregate throughput) are shown in Fig. 4. Obviously, the best performance occurs when CUthrd is equal to CUmax. When lower CUthrd values are used, inactive nodes may frequently switch between channels; this lowers the probability that nodes reside on the same channel, and thus lower throughput is produced. This experiment verifies that a new channel should be created only when the capacity currently being used is saturated. In the following, CUthrd is set to 40.

Figure 4. Aggregate throughput for different CUthrd values.
Figure 5. Aggregate throughput at different packet interarrival times.

The aggregate throughput of the different protocols at different network loads is shown in Fig. 5. It is clear that our LAMM outperforms the other two protocols at all interarrival times. The MMAC protocol suffers from the HOL problem in a lightly loaded environment and incurs a bottleneck (the common control period) when the network is heavily loaded. In SLKB, only the nodes that have successfully finished their data transfer can switch to the next channel; the other nodes (what we defined as inactive nodes) stay on the same channel. Such a mechanism produces unnecessary switching at light load; it also fails to distribute nodes properly at heavy load, since the inactive nodes can do nothing but stay on the original channel. Our LAMM avoids the problems of MMAC and SLKB and thus achieves better throughput.
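For reference, the three metrics used in this section could be computed from per-packet simulation records roughly as follows. This is an illustrative Python sketch, not the authors' ns-2 post-processing; the record layout and names are hypothetical, and 512-byte packets are assumed as in the setup above.

def evaluate(records, duration_s):
    """Compute aggregate throughput (Mbps), delivery ratio (%) and average delay (s).

    records: list of dicts with keys 'sent' (bool), 'delivered' (bool) and,
    for delivered packets, 'delay' in seconds; duration_s: simulated time.
    """
    sent = sum(1 for r in records if r['sent'])
    delivered = [r for r in records if r['delivered']]
    throughput_mbps = len(delivered) * 512 * 8 / duration_s / 1e6
    delivery_ratio = 100.0 * len(delivered) / sent if sent else 0.0
    avg_delay = sum(r['delay'] for r in delivered) / len(delivered) if delivered else 0.0
    return throughput_mbps, delivery_ratio, avg_delay

# Example with three packets, two of them delivered:
logs = [{'sent': True, 'delivered': True, 'delay': 0.02},
        {'sent': True, 'delivered': True, 'delay': 0.05},
        {'sent': True, 'delivered': False}]
print(evaluate(logs, duration_s=1.0))  # -> (0.008192, 66.66..., 0.035)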
Figure 6. Packet delivery ratio at different packet interarrival times.
Figure 7. Average packet delay at different packet interarrival times.

The packet delivery ratios of the different protocols at different traffic loads can be found in Fig. 6. When the packet interarrival time is less than 0.4 second, the delivery ratio of LAMM is close to 100%, because a sender can always find its receiver since all nodes stay on the same channel. The ratios for SLKB and MMAC at the same loads are around 60% and 50%, respectively. As the traffic load becomes heavier, the ratio of all protocols drops significantly; among them, LAMM still performs the best.

Fig. 7 shows the average packet delay as the network load increases. Again, our LAMM incurs the least delay, followed by SLKB and MMAC. The reduced delay is a direct consequence of the load-awareness channel allocation mechanism. By utilizing only one channel when the network load is low, LAMM can reduce the packet delay since the waiting time for a transmission pair to switch to the same channel is removed. As the network load increases, inactive nodes may move to other channels and the time for the sender and the receiver to reside on the same channel increases accordingly. The increased delay of MMAC at heavy load is caused by the HOL problem; for SLKB, the enlarged delay results from the improper allocation of the inactive nodes.

5 Conclusions

In this paper, we propose a multi-channel MAC protocol, LAMM, that dynamically adjusts the number of channels being utilized according to the traffic load. With such a load-aware channel allocation principle, LAMM reduces the packet delay when the network is lightly loaded and increases spatial reuse when it is heavily loaded. The simulation results verify that our LAMM performs well and is a flexible multi-channel protocol for ad hoc networks.

Acknowledgements

This research is supported by the National Science Council, ROC, under grant NSC96-2221-E-019-015-MY2.

References

[1] The Network Simulator - ns-2. http://www.isi.edu/nsnam/ns/.

[2] P. Bahl, R. Chandra, and J. Dunagan. SSCH: Slotted Seeded Channel Hopping for Capacity Improvement in IEEE 802.11 Ad-Hoc Wireless Networks. In Proceedings of ACM MobiCom, pp. 216-230, September 2004.

[3] S.-C. Lo and C.-W. Tseng. A Novel Multi-Channel MAC Protocol for Wireless Ad Hoc Networks. In Proceedings of IEEE VTC, pp. 46-50, April 2007.

[4] R. Maheshwari, H. Gupta, and S. R. Das. Multichannel MAC Protocols for Wireless Networks. In Proceedings of IEEE SECON, pp. 393-401, September 2006.

[5] J. S. Pathmasuntharam, A. Das, and A. K. Gupta. Primary Channel Assignment Based MAC (PCAM) - A Multi-Channel MAC Protocol for Multi-Hop Wireless Networks. In Proceedings of IEEE WCNC, pp. 1110-1115, Vol. 2, March 2004.
[6] J. Shi, T. Salonidis, and E. W. Knightly. Starvation Mitigation Through Multi-Channel Coordination in CSMA Multi-hop Wireless Networks. In Proceedings of ACM MobiHoc, pp. 214-225, May 2006.

[7] J. So and N. Vaidya. Multi-Channel MAC for Ad Hoc Networks: Handling Multi-Channel Hidden Terminals Using A Single Transceiver. In Proceedings of ACM MobiHoc, pp. 222-233, May 2004.

[8] C. Son, N.-H. Lee, B. Kim, and S. Bahk. MAC Protocol using Asynchronous Multi-Channels in Ad Hoc Networks. In Proceedings of IEEE WCNC, pp. 401-405, March 2007.

[9] J. Wang, Y. Fang, and D. Wu. A Power-Saving Multi-radio Multi-channel MAC Protocol for Wireless Local Area Networks. In Proceedings of IEEE INFOCOM, pp. 1-12, April 2006.

[10] S.-L. Wu, C.-Y. Lin, Y.-C. Tseng, and J.-P. Sheu. A New Multi-Channel MAC Protocol with On-Demand Channel Assignment for Multi-Hop Mobile Ad Hoc Networks. In Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks, pp. 232-237, December 2000.

[11] C. Xu, Y. Xu, J. Gao, and J. Zhou. Multi-Transceiver based MAC Protocol for IEEE 802.11 Ad Hoc Networks. In Proceedings of IEEE WCNM, pp. 804-807, September 2005.

[12] X. Zheng, L. Ge, and W. Guo. A Load-balanced MAC Protocol for Multi-channel Ad-hoc Networks. In Proceedings of IEEE ITST, pp. 642-645, June 2006.

A Reactive Local Positioning System for Ad Hoc Networks*

Sungil Kim, Yoo Chul Chung, Yangwoo Ko, and Dongman Lee
Information and Communications University
{adreamer, chungyc, newcat, dlee}@icu.ac.kr

Abstract

Existing ad hoc location systems measure locations proactively by continuous measurements and communication among all nodes in an ad hoc network. However, in certain location-based applications only a few nodes need the location of a few other nodes, in which case the proactive approach results in an unnecessarily large overhead. We propose a scheme for a reactive ad hoc location system, which avoids the overhead of a proactive scheme by limiting multi-lateration and coordination of coordinate systems to nodes centered around the route between the source node requesting the location and the target node whose location is being requested. Simulations show that our scheme reduces overheads substantially compared to an existing proactive scheme.

1. Introduction

Location is an important context in ubiquitous computing environments. As the technology for mobile devices and ad hoc networks advances, spontaneous interactions among users on a mobile ad hoc network are becoming of high interest. Such an ad hoc ubiquitous computing environment usually does not have GPS support or the support of any other infrastructure providing location information. In such cases we need to use the relative positioning of nodes in an ad hoc network, which we call ad hoc positioning. These positions are obtained using self-configuring schemes where nodes determine relative distances to their neighbors and converge to a consistent coordinate assignment via cooperation between the nodes [1]. This is useful enough for applications that need to find a nearest object, calculate distances to objects of interest, or reveal relative positions between nodes in the neighborhood.

* This research has been supported by the Ubiquitous Computing and Network Project sponsored by the Ministry of Information and Communication 21st Century Frontier R&D Program in Korea.

Existing ad hoc positioning algorithms can be categorized into two types: anchor-based algorithms and anchor-free ones. The anchor-based algorithms start from a small set of nodes, called anchor nodes, which already know their locations. The rest of the nodes calculate their locations by communicating with the anchor nodes, usually using multi-lateration or triangulation. Hop-TERRAIN [2] provides a scheme for performing tri-lateration when the anchor nodes are not one-hop neighbors of the node. However, the anchor-based schemes require a non-negligible number of anchor nodes in the network. Under certain situations, deploying a sufficient number of anchor nodes may not be feasible, or they may simply not be available.

The anchor-free algorithms assume that there are no anchor nodes. In some schemes, a few coordinator nodes create a coordinate system among themselves, which the rest of the nodes use as a shared coordinate system. However, movement of the coordinator nodes causes changes to the coordinate system and thus gives rise to coordinate inconsistency. The Self-Positioning Algorithm [3] minimizes coordinate inconsistency by periodically broadcasting messages for maintenance of the shared coordinate system and minimizing the movement of the coordinate system. The existing schemes require continuous location message exchanges between nodes for maintaining a shared coordinate system. In many practical situations, some of which are described in [4], such as finding someone or something, proximity-based notification, and resource tracking, applications neither use locations all the time, nor do they need to know the location of every single node in the network.

In this paper, we propose a reactive ad hoc positioning algorithm, which minimizes message exchanges for obtaining the location of a given node. Each node creates a local coordinate system consisting of its neighbors on demand. When a location request is made, the requesting node broadcasts the request, which is received by its neighbors. Then the node that is next on the route towards the target node calculates its relative position with respect to the requesting node.

This procedure continues until a neighbor node of the target is reached. The neighbor node knows the coordinate of the target node according to its local coordinate system and how to transform it to the requester's coordinate system. The neighbor node reports to the requesting node the coordinate of the target node as expressed in the requester's coordinate system. By avoiding periodic location message exchanges for coordinate consistency and reusing local coordinate systems, the proposed scheme can further reduce network overhead.

Since a location is calculated relative to a requesting node, the proposed scheme does not need to maintain a shared coordinate system, and neither does it have to consider the movement of the shared coordinate system. Simulations show that the proposed scheme provides reasonable accuracy with significantly less network overhead compared with an existing proactive scheme, the Self-Positioning Algorithm [3].

The remaining sections are organized as follows. Section 2 introduces and analyzes related work, and Section 3 describes what we considered in the design of the proposed scheme. Our proposed scheme is described in Section 4, and an evaluation of the proposed scheme based on simulations is described in Section 5. Section 6 concludes with a discussion about our proposed scheme.

2. Related work

Several ad hoc positioning systems that do not require any infrastructure have been described in the literature. Relate [5] is a relative positioning system using an ultrasonic dongle. By arranging the ultrasound sensors in three directions, a node can measure the angular positions of adjacent nodes as well as their distances. Therefore the node can calculate the positions of the neighbor nodes. Relate can provide locations and also the orientation of the physical device, but it uses special hardware and does not scale well with network size due to the use of a time-division scheme.

Hop-TERRAIN [2] is an anchor-based distributed positioning algorithm for ad hoc networks. In the start-up phase, nodes calculate their initial positions by using the hop count to the anchor nodes. If a node can find out the hop count to some anchor node, then it uses the average hop distance to compute the approximate distance to the anchor node. These distances are used to perform triangulation, which estimates the node's position. After the start-up phase, the nodes perform refinement by triangulating between neighbors. Only a few anchor nodes are required in this scheme, but a large number of anchor nodes is required for it to be reasonably accurate, and neither does the scheme consider mobility.

Efrat et al. [6] proposed a force-directed positioning system which calculates positions in an anchor-free situation. They tried to solve the problems with previous schemes, which do not work well with large networks and yield poor results with noisy data. The force-directed algorithm defines an attractive force function for adjacent nodes and a repulsive force function for non-adjacent nodes, and changes the layout of the nodes to a low-energy state. This scheme is able to make a consistent coordinate assignment for the nodes, but continuous communication among the nodes is required.

The Self-Positioning Algorithm [3] is a distributed, anchor-free positioning algorithm that uses the distances between the nodes to build a relative coordinate system. Each node creates a local coordinate system using itself as the origin. After creating local coordinate systems, these local coordinates are integrated into a shared coordinate system based on a group of nodes called the location reference group. Because movement of nodes causes changes in coordinates, the center of the location reference group is used as the center of the coordinate system, and the direction of the coordinate system is the average direction of the local coordinate systems of the location reference group. This scheme could minimize coordinate movement caused by node mobility and maintain a shared coordinate system.

As far as we know, previous anchor-free ad hoc location systems require periodic beaconing between all nodes for maintaining a shared coordinate system, and if the nodes can move, additional work is required to minimize the movement of the coordinate system. Certain location-based applications do not require location information at all times, with only a few nodes being interested in the location of a few other nodes, in which case periodic beaconing among all nodes is unnecessary. We propose a reactive local positioning system on ad hoc networks which avoids such waste.

3. Design considerations

We assume a scenario where a user is looking for his friend in a big hall or a large open space with many people. In this case, the user is only interested in the position of his friend and not the others. While the user can use schemes like those from [3] and [6] when there are no anchor nodes in the environment, these schemes require that all nodes in the area join in the positioning task and do periodic broadcasts, which imposes a large overhead on the network. We want to limit the number of nodes participating in the
positioning task and reduce the communication overheads between nodes with our proposed scheme. We considered the following when designing our scheme.

First, only a limited number of nodes should be part of the positioning task. We take advantage of the network route from the requester node to the requested node. The nodes that take part in measuring the coordinate of the requested node in the local coordinate system of the requester node are limited to those in the vicinity of the network route.

Second, the communication cost for the localization should be reduced. Our target scenario only requires locations occasionally, so we design the scheme to work reactively. This avoids the periodic beaconing by all nodes as would be required in proactive schemes, which results in smaller network overheads. We also limit the rate at which broadcasts requesting distance measurements are made.

4. Reactive ad hoc positioning

4.1. Overview

Unlike previous work that depends on periodic beaconing, the proposed scheme initiates node positioning only when a user requests the location of a target node. The procedures for creating a local coordinate system centered on the node itself, using distances between the neighbors, are based on those described in [3]. When a user requests the location of a target node, the request is eventually sent to a neighbor of the target node, and the position of the target node is transformed along the path back to the requester node. The requester node would then have the position of the target node in its local coordinate system.

The assumptions for our proposed scheme are similar to those in [3] with a few additions, which are described as follows:
- An infrastructure-less network of mobile and wireless devices
- All wireless networking devices have the same technical characteristics
- Wireless links between the nodes are bidirectional
- All nodes move at human walking speed
- Node density is high enough so that there are at least two nodes within one hop
- A route to the target node is known in advance
- Each device includes a sensor which can measure the distances to neighboring nodes

4.2. Basic procedures

This section gives a brief description of the basic procedures used in our proposed scheme. The first procedure is the creation of a local coordinate system using two neighbors. The second is the transformation procedure between two different local coordinate systems. Our proposed scheme is built on top of these two basic procedures, which are the same as those described in [3].

4.2.1. Local coordinate system creation. Each node maintains a local coordinate system with itself as the origin.

Figure 1 Local coordinate system creation

When node i determines the distances between any two nodes p and q, node i can create a local coordinate system as in Figure 1. The coordinates of node i are defined as (0, 0), and the x axis is oriented so that one node (p) lies on its positive component. The coordinates of other nodes are then defined in terms of this local coordinate system. Equation 1 shows how the coordinates of i, p and q are computed, which determines the coordinate system.

Equation 1:
$i_x = 0, \quad i_y = 0$
$p_x = d_{ip}, \quad p_y = 0$
$q_x = d_{iq}\cos\gamma, \quad q_y = d_{iq}\sin\gamma$
$\gamma = \cos^{-1}\left(\frac{d_{iq}^2 + d_{ip}^2 - d_{pq}^2}{2\,d_{iq}\,d_{ip}}\right)$

where $\gamma$ is the angle $\angle(p, i, q)$ and $d_{ab}$ is the distance between nodes a and b.

The coordinates of other nodes can be computed by triangulation using these three known positions. Equation 2 shows how the coordinates of node j are computed.

Equation 2:
$j_x = d_{ij}\cos\alpha_j$
if $\beta_j = \alpha_j - \gamma$: $j_y = d_{ij}\sin\alpha_j$; otherwise: $j_y = -d_{ij}\sin\alpha_j$

In our proposed scheme, nodes create a local coordinate system for computing the location of neighbors.
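As a concrete illustration of Equations 1 and 2, the following minimal Python sketch builds a local coordinate system from pairwise distance measurements using the law of cosines. The function names and the numerical tolerance are ours, not the paper's, and the clamping is only a guard against the up-to-5% sensor error simulated later in Section 5.

import math

def angle_from_distances(d_ab, d_ac, d_bc):
    # Angle at vertex a of the triangle (a, b, c), by the law of cosines.
    # The clamp guards against small triangle-inequality violations caused
    # by noisy distance measurements.
    x = (d_ab ** 2 + d_ac ** 2 - d_bc ** 2) / (2 * d_ab * d_ac)
    return math.acos(max(-1.0, min(1.0, x)))

def build_local_coords(d_ip, d_iq, d_pq):
    # Equation 1: i at the origin, p on the positive x axis, q placed by gamma.
    gamma = angle_from_distances(d_ip, d_iq, d_pq)
    i = (0.0, 0.0)
    p = (d_ip, 0.0)
    q = (d_iq * math.cos(gamma), d_iq * math.sin(gamma))
    return i, p, q, gamma

def place_node_j(d_ij, d_pj, d_qj, d_ip, d_iq, gamma, tol=1e-6):
    # Equation 2: alpha_j = angle(p, i, j), beta_j = angle(q, i, j); the sign
    # of j_y depends on whether beta_j equals alpha_j - gamma (a tolerance is
    # needed once the distances are noisy).
    alpha_j = angle_from_distances(d_ip, d_ij, d_pj)
    beta_j = angle_from_distances(d_iq, d_ij, d_qj)
    j_x = d_ij * math.cos(alpha_j)
    if math.isclose(beta_j, alpha_j - gamma, abs_tol=tol):
        j_y = d_ij * math.sin(alpha_j)
    else:
        j_y = -d_ij * math.sin(alpha_j)
    return (j_x, j_y)

if __name__ == "__main__":
    # p and q are both 10 m from i and 10 m apart; node j actually sits at (5, -5).
    _, p, q, gamma = build_local_coords(10.0, 10.0, 10.0)
    j = place_node_j(50 ** 0.5, 50 ** 0.5, 13.660, 10.0, 10.0, gamma)
    # j is recovered as approximately (5.0, -5.0) in i's local coordinate system.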
4.2.2. Transformation between coordinate systems. While the coordinates of neighbors can be directly measured and computed in terms of a local coordinate system, the coordinates of other nodes must be obtained by transforming their coordinates from other local coordinate systems. Transforming coordinates from one local coordinate system to a neighboring local coordinate system is possible by using a common neighbor.

Figure 2 Rotating coordinate system of k

If node i wants to use coordinates from the local coordinate system of k, then it should rotate the coordinates of k and translate them to i. Node j, which is a common neighbor of i and k, is used to check whether mirroring is required.

Equation 3:
if $\alpha_j - \alpha_k < \pi$ and $\beta_j - \beta_i > \pi$, or $\alpha_j - \alpha_k > \pi$ and $\beta_j - \beta_i < \pi$:
  correction angle $= \beta_i - \alpha_k + \pi$; mirroring is not necessary
if $\alpha_j - \alpha_k < \pi$ and $\beta_j - \beta_i < \pi$, or $\alpha_j - \alpha_k > \pi$ and $\beta_j - \beta_i > \pi$:
  correction angle $= \beta_i + \alpha_k$; mirroring is necessary

Equation 3 shows how to detect whether mirroring is required and how to determine the correction angle. Coordinates based on the local coordinate system of k can be rotated by the correction angle and displaced by the coordinates of k in the local coordinate system of i, which results in coordinates based on the local coordinate system of i.
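The correction-angle test of Equation 3 and the rotate-and-displace step described above can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the alpha and beta angles of the common neighbor are assumed to be measured in the two local coordinate systems as in [3], mirroring is applied here as a reflection of the y axis, and k_pos_in_i denotes k's origin expressed in i's local coordinate system.

import math

def correction_and_mirroring(alpha_j, alpha_k, beta_j, beta_i):
    # Equation 3: decide the correction angle and whether k's coordinate
    # system has to be mirrored, using the common neighbor j.
    d_alpha = alpha_j - alpha_k
    d_beta = beta_j - beta_i
    if (d_alpha < math.pi and d_beta > math.pi) or \
       (d_alpha > math.pi and d_beta < math.pi):
        return beta_i - alpha_k + math.pi, False   # no mirroring
    return beta_i + alpha_k, True                  # mirroring required

def k_coords_to_i(point_in_k, k_pos_in_i, correction_angle, mirror):
    # Rotate a coordinate taken from k's local system by the correction angle
    # (reflecting the y axis first when mirroring is required) and displace it
    # by k's coordinates in i's local system, as described in Section 4.2.2.
    x, y = point_in_k
    if mirror:
        y = -y
    c, s = math.cos(correction_angle), math.sin(correction_angle)
    kx, ky = k_pos_in_i
    return (x * c - y * s + kx, x * s + y * c + ky)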
4.3. Algorithm

Figure 3 Overall view of our scheme

The goals of our proposed scheme are to start the positioning reactively and to have only a limited number of nodes communicate with each other for positioning. To achieve these goals, we use the location request message from a requester node to a target node as the trigger for the positioning. Figure 3 illustrates how the scheme is intended to work. These are the basic steps:

1. The location request is broadcast.
2. When a node receives a location request, it measures the distances to its neighbors and broadcasts the distances.
3. Receivers of the broadcast distances compute the local coordinates of their neighbors. If the receiver is the next hop on the route to the target node, it repeats step 1.
4. When a node neighboring the target node obtains the coordinates of the target in step 3, it sends them towards the requesting node.
5. While sending back the coordinates, each node on the route transforms them according to its local coordinate system.
6. The requesting node receives the coordinates and transforms them to its local coordinate system.

The distances between neighbors can be reused for computing the position of other nodes, so a node suppresses measurement and broadcast of distances if it has sent the information recently. This rate suppression is also applied to the broadcast of computed positions, and each node caches the computed positions for further use.

4.4. Protocol design

Table 1 Messages for the proposed scheme
  Messages       Description
  LOCATION_REQ   A message requesting the location of a given node.
  LOCATION_REP   A reply message for computed locations.
  DISTANCE_REP   A reply message for distance information.

The three messages used in the proposed scheme are shown in Table 1.

Figure 4 Actions after receiving a request message. T is the target, S is the broadcasting node, and both A and B are nodes on the route.

A LOCATION_REQ message is broadcast when the location of a specific node is requested. This message includes the target node address, the destination node address and the requester node address. The target node is the node whose location is requested. The destination node address indicates the node on the next hop on the route from the requester node to the target node, and is used for sending the LOCATION_REQ message along the network route. It is obtained from the routing table in the network stack. The requester node address stores the requester of the target location. This information is required for sending back the requested location.

When a node receives a LOCATION_REQ message, it should determine the coordinates of neighboring nodes as in Figure 4. All nodes receiving a LOCATION_REQ message should measure the distances to their neighbors and broadcast DISTANCE_REP messages, which include the distances between themselves and their neighbors. This distance information is used by other nodes to create their own local coordinate systems, so the source node and the node on the next hop can also create their own local coordinate systems.

If a receiving node of a LOCATION_REQ message is also the node on the next hop on the route, it also checks whether the position of the requested node is known. If not, it stores the requester node address and the target node address in a requested location list, and broadcasts a LOCATION_REQ message, which is received by the next node on the route. By repeating this, a LOCATION_REQ will be forwarded to the node which has the location of the requested address, while the nodes on the route determine coordinates for their neighbors.

If a node on the route has the location of the target node, then it broadcasts a LOCATION_REP message, which includes coordinates for a subset of known nodes in its coordinate system. When a node receives a LOCATION_REP message, it will have enough information to transform coordinates from neighboring local coordinate systems to its own local coordinate system by comparing the subset of coordinates included in the message with those in its own local coordinate system.

When a node receives a LOCATION_REP message, it transforms the coordinates of the target node to those in its local coordinate system and sends another LOCATION_REP message, with the new coordinates and the coordinates of a subset of its own neighbors, toward the next node on the route back to the requesting node. Eventually the requester node will receive a LOCATION_REP from which it can compute the location of the target node in its own local coordinate system.

A couple of techniques are used in our proposed scheme to improve the performance. The first one is DISTANCE_REP rate limitation, where if a DISTANCE_REP has been sent recently, a node suppresses measurement and broadcast of distances. The next one is location caching, where a computed or received location is cached on a node for further use.
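Before the pseudo-code of Table 2 below, the following sketch shows one possible representation of the three messages of Table 1 together with the two optimizations just described (DISTANCE_REP rate limitation and location caching). The class and field names are illustrative assumptions; the 3-second minimum broadcast interval and 5-second cache lifetime are the values used in the simulations of Section 5.

import time
from dataclasses import dataclass

@dataclass
class LocationReq:            # Table 1: request for the location of a node
    target: str               # node whose location is requested
    dest: str                 # next hop on the route (from the routing table)
    requester: str            # node that issued the request

@dataclass
class LocationRep:            # Table 1: reply carrying computed locations
    src: str
    coords: dict              # subset of known coordinates in src's local system

@dataclass
class DistanceRep:            # Table 1: reply carrying measured distances
    src: str
    distances: dict           # neighbor id -> measured distance

class RateLimiter:
    # Suppress a broadcast if the same kind of message was sent within
    # min_interval seconds (3 s in the simulations of Section 5).
    def __init__(self, min_interval=3.0):
        self.min_interval = min_interval
        self.last_sent = {}

    def allow(self, kind):
        now = time.monotonic()
        if now - self.last_sent.get(kind, float("-inf")) >= self.min_interval:
            self.last_sent[kind] = now
            return True
        return False

class LocationCache:
    # Cache computed or received locations for `lifetime` seconds
    # (5 s in the simulations of Section 5).
    def __init__(self, lifetime=5.0):
        self.lifetime = lifetime
        self.entries = {}

    def put(self, node_id, coords):
        self.entries[node_id] = (coords, time.monotonic())

    def get(self, node_id):
        entry = self.entries.get(node_id)
        if entry and time.monotonic() - entry[1] < self.lifetime:
            return entry[0]
        return None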
Table 2 Pseudo-code for the proposed scheme

recv (packet) {
  switch(packet.cmd){
    case LOCATION_REQ:
      if(Not broadcast recently){
        dRep = new DISTANCE_REP
        dRep.src = selfID
        dRep.dest = ANY
        dRep.data = distanceList
        send dRep after random delay
      }
      if(packet.dest == selfID){
        if(Location of target is not known){
          if(isMyNeighbor(packet.target)){
            packet.dest = ANY
          } else {
            packet.dest = next hop to target
          }
          send packet
          requestList.insert(packet.src, packet.target)
        }
      }
    case LOCATION_REP:
      updateLocationList(packet)
      for each item in requestList{
        if(Requested location is available
           && Not broadcast recently){
          lRep = new LOCATION_REP
          lRep.src = selfID
          lRep.dest = ANY
          lRep.data = locationList
          send lRep
          requestList.delete(item)
        }
      }
    case DISTANCE_REP:
      updateNeighborsDistanceList(packet)
      if(distanceList.num() > 3){
        createLocalCoordinateSystem()
        for each item in requestList{
          if(Requested location is available
             && Not broadcast recently){
            lRep = new LOCATION_REP
            lRep.src = selfID
            lRep.dest = ANY
            lRep.data = (locationList, distanceList)
            send lRep
            requestList.delete(item)
          }
        }
      }
  }
}

Table 2 shows the pseudo-code for our proposed scheme.

5. Evaluation

In this section, we compare the proposed scheme with the Self-Positioning Algorithm [3] by simulation and evaluate the effectiveness of our scheme.

5.1. Simulation environment

The proposed scheme is simulated with the NS-2 simulator [7], and Table 3 describes the parameters used in the simulation. We implement and simulate the Self-Positioning Algorithm with the same settings for comparison with the proposed scheme.

Table 3 Settings for the simulation
  Parameter                    Value
  Size of area                 1000 x 1000 m
  Number of nodes              100
  Duration of simulation       1000 seconds
  Wireless model               802.11
  Transmission range           250 m
  Mobility pattern             Random
  Maximum node speed           1~10 m/s (10 sec pause)
  Minimum broadcast interval   3 sec
  Sensor error                 Maximum +/- 5%

In our simulation, each node has an application running on it, and the application requests the location of a randomly selected node. Location requests are made at fixed intervals and the target node is changed every time. The number of messages sent for positioning is used to evaluate the network overhead. We also compare the difference between the real distance and the calculated distance to a target node, and measure the time required for obtaining a location.

Unless otherwise mentioned, nodes move at 5 m/s with 10 seconds of pause time, the cache lifetime is 5 seconds, and 100 nodes periodically request a location at 30 second intervals. Sensor errors are simulated by adding a uniformly distributed random error that is at most 5% of the distance. AODV [8] is the underlying routing protocol.

5.2. Network overhead

Figure 5 shows the number of messages generated by our proposed scheme. The number of request messages obviously has a close relation with the interval of location requests. Since all nodes on a route between the requester and the target node should send a request message, the number of messages increases rapidly as the number of requests does. We can also see that the number of location reply messages increases as the number of requests increases (even though it is a very small amount), but the growth rate
of the number of distance replies is much smaller, resulting in a convex shape. This is because the distance replies can be suppressed more easily than location replies. A distance reply is not broadcast if it was recently broadcast, while a location reply is not broadcast if it and the targeted location were not recently broadcast.

Figure 5 Number of messages in the proposed scheme (Number of requester nodes = 100)

Figure 6 shows a comparison of the number of messages between SPA and the proposed scheme. We set the beaconing interval of SPA to 3 seconds, which is the same as the location request interval. SPA works proactively, so its number of messages is not affected by changing the number of requesters. Under these circumstances the number of messages required by our scheme is smaller than that of SPA when the ratio of requesters is smaller than 80%.

Figure 6 Number of messages for SPA and the proposed scheme (Location request interval = 3 sec.)

5.3. Accuracy

Figure 7 Distance errors in SPA and the proposed scheme

Figure 7 shows the errors between the real distances and the distances computed from the coordinates derived by SPA and by our scheme. Even though the sensor has a maximum error of 5%, the result of our scheme shows less than 10% error at human walking speeds. Without a cache, the errors are reduced by about 2%, which is due to the fact that the distances used when computing coordinates are less out of date. However, not using a cache increases latency, as seen in Section 5.4.

The error in SPA is significantly larger than in our proposed scheme, especially when mobility is higher. We believe that this is because of the difficulty of maintaining a shared coordinate system when nodes move faster.

5.4. Latency

Figure 8 Required time for receiving a requested position (Number of requester nodes = 30)

Figure 8 shows a comparison between the average latencies for receiving a requested location when caching is used and when it is not used. We can see that caching decreases latency significantly.

Figure 8 also shows that the latency is lower when location requests are more frequent. One reason for this is that when packets are dropped during a location
request, it is more likely that we can take advantage of packets sent during other location requests when location requests are more frequent. The other reason, which only applies to caching, is that frequent location requests can take advantage of the cache to reduce latency.

6. Conclusion

The goal of this paper is to develop a reactive ad hoc location system which decreases the communication overhead between nodes by limiting the nodes that take part in positioning and by avoiding periodic broadcasts by nodes. Using local coordinate systems, each node can compute the coordinates of its neighbors in its own local coordinate system, and these coordinates can be transformed between the local coordinate systems of nearby nodes. Based on these two basic procedures, our proposed scheme can obtain the coordinate of a target node in the local coordinate system of the requester node.

The proposed scheme does not need beaconing, and communication only occurs among a limited set of nodes. Also, by computing a position only when it is requested, the scheme does not have to maintain a shared coordinate system, which would require extensive communication and computation between nodes. As a result, the proposed scheme provides locations of nodes with less communication overhead when only a few nodes occasionally need the locations of a few other nodes.

Currently, our scheme assumes that routing is already done, but for the case where routes are not known, integrating routing and positioning could be an interesting way to further reduce overheads. Developing a cross-layer scheme for localization is our next step. Also, disconnections and node failures can be common in a MANET environment, so the scheme could be improved to make it robust to these cases. We are working on a prototype implementation to make our proposed scheme work in realistic settings.

7. References

[1] N. B. Priyantha, H. Balakrishnan, E. Demaine, and S. Teller, "Anchor-Free Distributed Localization in Sensor Networks", Technical Report TR-892, MIT LCS, Apr. 2003.

[2] C. Savarese, J. Rabaey, and K. Langendoen, "Robust Positioning Algorithms for Distributed Ad-Hoc Wireless Sensor Networks", in Proceedings of the General Track: 2002 USENIX Annual Technical Conference, Monterey, CA, June 2002.

[3] S. Capkun, M. Hamdi, and J.-P. Hubaux, "GPS-free positioning in mobile ad-hoc networks", in Proceedings of the Hawaii International Conference on System Sciences, January 2001.

[4] S. Steiniger, M. Neun, and A. Edwardes, "Foundations of Location-Based Services", CartouCHe - Lecture Notes on LBS, v. 1.0.

[5] M. Hazas, C. Kray, H. Gellersen, H. Agbota, G. Kortuem, and A. Krohn, "A relative positioning system for co-located mobile devices", in Proceedings of MobiSys 2005, Seattle, USA, June 2005, pp. 177-190.

[6] A. Efrat, D. Forrester, A. Iyer, and S. G. Kobourov, "Force-Directed Approaches to Sensor Localization", 8th Workshop on Algorithm Engineering and Experiments (ALENEX), 2006, pp. 108-118.

[7] The ns-2 simulator, http://www.isi.edu/nsnam/ns

[8] C. E. Perkins and E. M. Royer, "Ad hoc On-Demand Distance Vector (AODV) Routing", RFC 3561, July 2003.

[9] Global Positioning System, November 2006, http://www.gps.gov
A Novel Distributed Authentication Framework for Single Sign-On Services

Kaleb Brasee, S. Kami Makki
Department of Electrical Engineering and Computer Science
University of Toledo
Toledo, Ohio
{kbrasee, kmakki}@eng.utoledo.edu

Sherali Zeadally
Department of Computer Science and Information Technology
University of the District of Columbia
Washington, DC
szeadally@udc.edu

Abstract

In this paper we present a novel single sign-on scheme known as Secure Distributed Single Sign-On (SeDSSO). SeDSSO provides secure fault-tolerant authentication using threshold key encryption with a distributed authentication service. The authentication service consists of n total authentication servers utilizing a (t, n) threshold encryption scheme, where t distinct server-signed messages are required to generate a message signed by the service. SeDSSO provides secure portable identities by defining a two-factor identity that uses both a username/password and a unique USB device. The combination of a distributed authentication service and two-factor identities allows SeDSSO to securely authenticate users in any environment.

Keywords

Single sign-on; Two-factor authentication; Computer security; Distributed systems; SeDSSO.

1. Introduction

As the number of personal Internet-site accounts grows, organizing and remembering confidential identity information becomes more difficult for the individual. It is often impossible to use the same information on every site. Common usernames may already be taken, and sites frequently impose unique requirements for passwords (e.g., the password must consist of both lowercase and uppercase letters or it must contain a digit). In an RSA Security survey, more than 30% of users reported needing between 6 and 12 different passwords for their business-related logins, and almost 25% said that they needed to remember 13 or more passwords [1]. When people cannot remember all of their information and are forced to physically record it, the secrecy of their identity is jeopardized. The Computer Emergency Response Team (CERT) reported that 80% of the security breaches they examine are related to passwords [2].

Single sign-on (SSO) allows users to verify their identity on a central system and gain access to many different resources that trust the central system. A widely-used Internet SSO system could help people protect their identity secrets by replacing many site-specific logins with a single SSO login. This would make it possible for the average user to choose secure identity information and remember it without writing it down. Correspondingly, this system would reduce the need for insecure transmission of logins through email when users forget their information. Various SSO architectures have been proposed and implemented over the past decade, but none have been used significantly on large-scale public Internet domains.

This paper describes the design of a distributed SSO scheme that offers improvements over existing SSO systems. Because many users and sites rely on the SSO central authentication service, it must offer fail-safe authentication that remains available and secure through partial hardware and software failures. A robust system must also provide a way for users to safely sign on from any location, including potentially insecure computers found in places such as Internet cafés and public libraries. Our scheme is called Secure Distributed Single Sign-On (SeDSSO) and it provides SSO services with a fail-safe distributed authentication system and secure two-factor user identities.

In section 2 of this paper we discuss the components that make up a complete SeDSSO system, including the distributed authentication service. Section 3 details SeDSSO user identities and the two-factor authentication scheme. Section 4 describes the processing and communication protocol used to perform SeDSSO operations. Section 5 discusses our SeDSSO prototype and the results of the performed tests, and section 6 presents our conclusions.

2. SeDSSO Components

SeDSSO consists of three different components: service providers, users, and the authentication servers.

2.1. Authentication Service

Authentication servers are individual systems that work together to vouch for the identity of users. Because they authenticate users and store information about each SeDSSO user and service provider, these servers must be high-performance, high-availability systems that can perform many intensive data storage, computation, and network I/O tasks simultaneously. Collectively, the group of authentication servers is referred to as the authentication service.

SeDSSO implements threshold encryption by deploying n authentication servers and generating one public and private key for the entire authentication service. This key generation takes place on the certificate authority (CA) server. Generating the authentication service key is the CA server's only task; it is never connected to a network, and it is physically guarded. These steps are required to ensure the security of the authentication service's public and private keys and thereby maximize trust in the service. The CA server splits the private key into n partial keys with a (t, n) threshold scheme, and one partial key is given to each authentication server. Since no authentication server possesses the entire private key, at least t servers must sign an identical message in order to act as the authentication service. Every user and service provider in the SeDSSO system is given access to the authentication service public key, making it possible to verify messages signed by the authentication service private key.

Individual authentication servers each possess a self-generated public and private key to use for server-to-server communications, similar to the intranet keys found in a distributed certification system known as COCA [3]. It is only necessary that the authentication servers know these keys, and they do not need to be distributed to users and service providers. The presence of these keys facilitates secure communication within the authentication service.
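The (t, n) splitting performed by the CA server can be illustrated with Shamir secret sharing over a prime field; the Python sketch below uses toy parameters and Python 3.8+. It only demonstrates the threshold idea: in SeDSSO the service private key is never reconstructed on any server; instead, t servers produce partial signatures that are combined (the prototype uses the ThreshSig library for this, as described in Section 5).

import random

PRIME = 2**127 - 1   # toy field modulus; real deployments use far larger parameters

def split_secret(secret, t, n, prime=PRIME):
    # Create n shares such that any t of them reconstruct the secret
    # (degree t-1 polynomial with the secret as constant term).
    coeffs = [secret] + [random.randrange(prime) for _ in range(t - 1)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % prime
        return acc
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares, prime=PRIME):
    # Lagrange interpolation at x = 0 over the prime field.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret

if __name__ == "__main__":
    key = 123456789                      # stand-in for the service private key
    shares = split_secret(key, t=2, n=3) # the (t=2, n=3) setting tested in Section 5
    assert reconstruct(shares[:2]) == key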
2.2. Service Providers

Service providers offer some type of service to users through the provider's web site. The service provider can be a business web site or a personally-owned site and can offer any combination of free or payment-based services. The only requirement is that the service provider has the need to identify individual users. Joining SeDSSO allows the provider to offer personalized services to users without having to invest in standalone authentication software and hardware, because the authentication service performs this function for all service providers.

Although the authentication service centralizes authentication for the entire SeDSSO system, its functionality does not extend into specific service provider requirements. Service providers must store all site-specific user data on their own servers, and can do this in any way they choose. As long as the stored user data is related to SeDSSO user identifiers, the service provider will be able to recall the data for a user as soon as the sign-on procedure is completed.

2.3. Users

The user account is an individual's representation on the SeDSSO service. A user's identity is represented by a username and password as well as a public and private key. The user creates all of these values, but the username must be verified by the authentication service to ensure that it has not been previously chosen. The username (and the corresponding username hash) is the information by which service providers, authentication servers, and other users identify an individual. It is possible for a person to separate their identity by possessing multiple accounts, although the need to remember too many usernames and passwords negates one of the major benefits that an SSO identity provides.

3. SeDSSO User Identity System

SeDSSO represents users with a two-factor identity consisting of their username/password as well as information stored on a specialized USB device (which at this point is only a functional design). The username and password is the factor that they know, and the information on the USB device is the factor that they have. Possession of both factors is required for a user to successfully authenticate with the SeDSSO system. The advantage of this system is that a coordinated effort is required to steal a user's identity, and classic one-factor attacks are insufficient. Keystroke logging software cannot access the USB device information, and the theft and examination of the USB device does not reveal the corresponding username and password. Existing distributed SSO systems such as CorSSO [7] and ThresPassport [8] have used only one authentication factor for users.

3.1. USB Identity Device (USBID)

The SeDSSO USB identity device (USBID) is a specialized device that combines a built-in processor with flash memory and communicates with a computer through the USB interface. All of the hardware is housed in a casing the size of a normal USB flash drive. The USBID is responsible for storing the public and private keys for one or more users, as well as the secret counter values that allow users to gain authorization with service providers. This device must be accessible by the client software every time SeDSSO account creation or authentication is requested. A similar USB-interface computation device with specialized hardware was proposed in [4], but it was designed for electronic payment instead of SSO identity proof.

The USBID processor generates the user's private and public key when the account is created and is responsible for performing all operations that require the use of identity factors, such as signing a message with the private key. This makes it unnecessary to pass the user's private key to the computer, where it could be observed by a program designed to retrieve this information.

When a communication message needs to be signed during the authentication process, the client software passes the message to the USBID processor. The processor retrieves the encrypted private key from memory and decrypts it with the password. Once the private key has been decrypted it is used to sign the message, and the signed message is returned to the client software on the user's system.

The USBID memory is standard flash memory. However, unlike common flash drives, the USBID does not allow access to the memory through a computer file system. Only the USBID processor can access this memory.

3.2. Counter System

The counter system is part of the "have" factor in SeDSSO's two-factor authentication scheme. To make authentication unfeasible without the USBID, a pseudo-random number generator seed is created when the user first contacts a service provider. It is stored on the service provider's system and the user's USBID. The USBID uses the seed to generate a number during the authentication process, and this number is sent to the service provider. If the service provider generates the same number, then the user's possession of the seed (and therefore possession of the USBID) has been proven. This counter system provides an effective additional layer of security. Even if t authentication servers are hacked so that a malicious party can gain authentication as a user without the USBID, the service provider requires the counter value on the USBID independently.

4. SeDSSO Processes

SeDSSO consists of a number of communication processes which are responsible for setting up new systems and authenticating users to allow sign-on.

4.1. Setup Processes

The first process in this section describes the steps necessary to generate a session key and set up a secure symmetric-encryption connection. This session key is generated at the beginning of every communication process between two existing SeDSSO parties. Additionally, this section discusses the processes for adding new authentication servers, service providers, and users.

4.1.1. Session Key Generation. Because public/private key pairs place strict length limitations on the encrypted payload and require far more CPU effort than symmetric keys, they are only used at the beginning of a session. Once it has been verified that both communicating parties know the private key corresponding to their claimed identity, a symmetric session key SK is created and used for the remainder of the communication. In the following protocol, C is the connecting system, R is the receiving system, and their public keys are KC and KR respectively.

1. C → R: < nonceC, [KC] > KR
2. R → C: < nonceC XOR 00…0001, nonceR, SK > KC
3. C verifies that the first parameter in the above message is its nonce with the last bit flipped.
4. C → R: < nonceR XOR 00…0001 > SK
5. R verifies that the parameter in the above message is its generated nonce with the last bit flipped.
6. R → C: < "success" > SK
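The nonce-flip handshake of Section 4.1.1 is sketched below in Python with the public-key and symmetric encryption layers abstracted away, since the paper does not name concrete ciphers; only the message flow, the session key SK, and the last-bit-flip checks are taken from the protocol. Class and field names are illustrative.

import os

def flip_last_bit(nonce: bytes) -> bytes:
    # XOR with 00...0001, i.e. flip the least significant bit of the nonce.
    return nonce[:-1] + bytes([nonce[-1] ^ 0x01])

class Connector:
    # C side of the handshake (message flow only; encryption under K_R,
    # K_C and SK is assumed to happen in the transport layer).
    def start(self):
        self.nonce_c = os.urandom(16)
        return {"nonce_c": self.nonce_c, "pubkey": "K_C"}            # step 1, under K_R

    def on_reply(self, msg):
        # Step 3: the first parameter must be our nonce with the last bit flipped.
        if msg["nonce_c_flipped"] != flip_last_bit(self.nonce_c):
            raise ValueError("handshake failed: nonce check")
        self.session_key = msg["session_key"]
        return {"nonce_r_flipped": flip_last_bit(msg["nonce_r"])}    # step 4, under SK

class Receiver:
    # R side of the handshake.
    def on_start(self, msg):
        self.nonce_r = os.urandom(16)
        self.session_key = os.urandom(32)                            # SK for the session
        return {"nonce_c_flipped": flip_last_bit(msg["nonce_c"]),
                "nonce_r": self.nonce_r,
                "session_key": self.session_key}                     # step 2, under K_C

    def on_confirm(self, msg):
        # Step 5: verify our nonce came back with the last bit flipped.
        if msg["nonce_r_flipped"] != flip_last_bit(self.nonce_r):
            raise ValueError("handshake failed: nonce check")
        return {"status": "success"}                                 # step 6, under SK

if __name__ == "__main__":
    c, r = Connector(), Receiver()
    m = r.on_confirm(c.on_reply(r.on_start(c.start())))
    assert m["status"] == "success" and c.session_key == r.session_key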
4.1.2. Adding an Authentication Server. Because this process occurs very infrequently, must be highly secure, and consists of extra-network steps, it is not implemented using a communication protocol.

The authentication server parameters are as follows: AID is the authentication server ID and KA/kA are the authentication server's individual public/private keys. In addition, the authentication service has a single public key KAS and a corresponding private key kAS. No server has possession of the entire service private key, but each possesses a distinct partial private key kpAS. When t distinct kpAS keys are used to create t encryptions of the same message, the encryptions can be combined to form one message encrypted with the private key kAS.

4.1.3. Adding a Service Provider. The service provider uses extra-network communication to add itself to a single authentication server Ac. Ac then uses the following protocol to add the service provider to every other authentication server.

The service provider data is referred to as follows: SID is the service provider ID and KS/kS is the service provider's public/private key.

4.1.4. Adding a User. User data collection and generation take place in the initialization functions when the user software is executed. This inputs and generates all data necessary to begin the user addition process.

The user data is referred to as follows: UID is the unique user ID, UP is a hash of the username and password combined, KU/kU is the user's public/private key combination, and INV is the account invalidation code.

The addition process begins after data collection has taken place on the user's computer. The user enters the username and password. UID is calculated by hashing the username and UP is calculated by hashing the username and password combination. A secure pseudorandom number and computing environment data are used to seed the generator for KU, kU and INV.

1. U → random auth. server A: "CREATE_USER", 1, < UID, UP, KU, hash(INV) > SK
2. A verifies that the UID is not already claimed by another user account.
3. A establishes a secure session key connection with all other authentication servers A1…An. Each connection uses an independent SK.
4. A → A1…An: "ADD_USER", 1, < UID, UP, KU, hash(INV) > SK
5. A1…An decrypt and analyze the message.
   a. Invalid message: "ADD_USER", 2, < "general_failure" > SK.
   b. Username already exists: "ADD_USER", 2, < "uid_failure" > SK.
   c. Correct message: save the message, send "ADD_USER", 2, < "success" > SK to A.
6. A receives messages from A1…An and tallies their responses.
   a. "success" < t or "uid_failure" > 0: send "ADD_USER", 3, < "discard" > SK.
   b. "success" >= t: add the user and send "ADD_USER", 3, < "add" > SK to A1…An.
7. A1…An receive the "ADD_USER", 3 message from A and either add or discard the user.
8. A sends a message to U describing the results of the user creation process.
9. U receives the message from A and reports the status to the user.

4.2. Authentication Processes

The processes for user authentication are defined in this section. Authentication requires the user to communicate with the authentication service to obtain an identity voucher. This voucher must contain a fresh nonce that the service provider sent to the user and must be signed with the authentication service private key. Every authentication process between a user and service provider relies on this voucher.

4.2.1. User Authentication Voucher Generation. Before a user can access a service provider, that user must receive a message signed by the authentication system that vouches for their identity. This message contains the user ID and service provider ID, the username/password hash, and a nonce created by the service provider to eliminate the possibility of replay attacks.

1. U → random auth. server A: "AUTHENTICATE_USER", 1, < UID, UP, nonce > SK.
2. AuthSet = random set of t-1 authentication servers.
3. ContactedSet = AuthSet + A.
4. A establishes a secure session key connection with each server in AuthSet. Each connection uses an independent SK.
5. A → Ax, ∀ Ax ∈ AuthSet: "AUTHENTICATION_CHECK", 1, < UID, UP, nonce > SK.
6. A examines the information it received from U.
   a. UID exists and UP is correct: add A's response to the success set.
   b. UID does not exist or UP is incorrect: add A's response to the failure set.
7. Servers in AuthSet decrypt and analyze the message and return one of the following messages to A.
   a. Invalid message, UID not found, or UP incorrect: "AUTHENTICATION_CHECK", 2, < "failure" > SK.
   b. UID exists and UP correct: "AUTHENTICATION_CHECK", 2, < < UID, nonce > kpAS > SK.
8. A receives all responses from the AuthSet servers and adds each to the successes or failures response set.
9. If any responses are present in the failure set:
   a. A selects a random server and adds it to ContactedSet.
   b. A sends the message from step 5 to the random server, receives the response, and updates the response set.
   c. Repeat until successes >= t, time runs out, or no more authentication servers can be contacted.
10. A analyzes the responses and sends a message to U.
   a. successes >= t: threshold combine t responses, send: "AUTHENTICATE_USER", 2, < "success", < UID, nonce > kAS > SK.
   b. successes < t: send: "AUTHENTICATE_USER", 2, < "failure" > SK.
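To connect Sections 4.1.4 and 4.2.1, the Python sketch below derives UID and UP by hashing (SHA-256 is an assumption; the paper only states that they are hashes) and shows the < UID, nonce > voucher that the service signs with kAS and that the provider later checks against the nonce it issued. The sign/verify callables are abstract placeholders standing in for the threshold-combined signature operations.

import hashlib

def uid_of(username: str) -> str:
    # UID: hash of the username (the concrete hash function is an assumption).
    return hashlib.sha256(username.encode()).hexdigest()

def up_of(username: str, password: str) -> str:
    # UP: hash of the username and password combined.
    return hashlib.sha256((username + password).encode()).hexdigest()

def make_voucher(uid: str, nonce_s: bytes, sign_with_k_as):
    # Output of Section 4.2.1, step 10a: < UID, nonce > signed with the
    # service private key kAS (sign_with_k_as wraps the combined signature).
    payload = uid.encode() + nonce_s
    return {"uid": uid, "nonce": nonce_s, "sig": sign_with_k_as(payload)}

def provider_accepts(voucher, expected_nonce: bytes, verify_with_K_AS) -> bool:
    # Service provider check (Section 4.2.2, step 6): verify the service
    # signature with KAS and make sure the nonce is the one issued earlier.
    payload = voucher["uid"].encode() + voucher["nonce"]
    return (voucher["nonce"] == expected_nonce
            and verify_with_K_AS(payload, voucher["sig"]))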
4.2.2. Initial User Sign-On to a Service Provider. The sign-on procedure describes the steps necessary for a SeDSSO user with an existing account to gain access to a service provider. The following process describes a user's first access to a service provider.

1. U establishes a secure session key connection with S.
2. U → S: "USER_SIGN_ON", 1, < UID > SK.
3. S → U: "USER_SIGN_ON", 2, < nonceS > SK.
4. U performs the authentication message request procedure using UID and nonceS.
   a. Success: U receives the message < "success", < UID, nonceS > kAS > SK.
   b. Failure: U receives the message < "failure" > SK, aborts the sign-on procedure, and sends to S: "USER_SIGN_ON", 3, < "failure" > SK.
5. U → S: "USER_SIGN_ON", 3, < "success", < UID, nonceS > kAS > SK.
6. S decrypts the message with the session key, then with KAS, and analyzes the message.
   a. Invalid message, UID incorrect, nonceS != step 2 nonce, or if U has signed on to S previously: end the process and send "USER_SIGN_ON", 4, < "failure" > SK to U.
7. S generates a pseudo-random seedUS, depthUS = 1, and max_depthUS = the two least significant digits of seedUS + 1.
8. S → U: "USER_SIGN_ON", 4, < "success", seedUS > SK. S grants an access session to U.
9. U stores seedUS and initializes depthUS and max_depthUS as S did. The user is now granted a session to the service provider.

4.2.3. Subsequent User Sign-On to a Service Provider. The following procedure is performed when a user attempts to sign on to a service provider that they have already successfully logged onto in the past. Since steps 1 through 4 in this process are identical to the initial sign-on process, they are omitted from the following protocol definition.

5. U generates the next counter value counterUS.
6. U → S: "USER_SIGN_ON", 3, < "success", < UID, nonceS > kAS, counterUS > SK.
7. S decrypts the message with SK, then with KAS, and analyzes the message.
   a. Invalid message, UID incorrect, nonceS != step 2 nonce, or if the user has never signed on to S: end the process and send "USER_SIGN_ON", 4, < "failure" > SK to U.
8. S calculates counterUS.
   a. Counters match: send "USER_SIGN_ON", 4, < "success" > SK to U. S grants an access session to U and increments depthUS by 1.
   b. Counters don't match: send "USER_SIGN_ON", 4, < "failure" > SK to U. S does not grant access to U and does not increment depthUS.
9. U receives the message from S and decrypts the contents with the session key.
   a. success: U increments depthUS by 1.
   b. failure: the error is reported to the user; depthUS is not incremented.

When max_depthUS successful logins have been performed, depthUS equals max_depthUS and both U and S must generate new seedUS, depthUS and max_depthUS values. This is done by using the last used counter value as the new seedUS, setting depthUS back to 1, and calculating a new max_depthUS by creating a number from the last two digits of seedUS and adding 1. U and S perform this counter update without indicating in a message that the change is being performed.
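The counter system of Sections 3.2 and 4.2.3 can be sketched as follows in Python. The concrete PRNG (random.Random) and the 32-bit counter width are assumptions; the paper only requires that U and S derive identical values from the shared seedUS, that max_depthUS be the last two decimal digits of the seed plus one, and that both sides silently reseed from the last used counter value.

import random

class CounterState:
    # Shared counter state kept by both the USBID and the service provider;
    # both sides start from the same seed_US handed out at initial sign-on.
    def __init__(self, seed):
        self._reseed(seed)

    def _reseed(self, seed):
        self.seed = seed
        self.depth = 1
        self.max_depth = seed % 100 + 1        # last two digits of the seed, plus 1
        self.prng = random.Random(seed)
        self.last_counter = None

    def next_counter(self):
        # Counter value sent by U (step 5) and recomputed by S (step 8).
        self.last_counter = self.prng.randrange(2**32)
        return self.last_counter

    def on_success(self):
        # Steps 8a/9a: both sides increment depth on a successful login; when
        # depth reaches max_depth they silently reseed from the last counter.
        self.depth += 1
        if self.depth >= self.max_depth:
            self._reseed(self.last_counter)

if __name__ == "__main__":
    user, provider = CounterState(987654321), CounterState(987654321)
    for _ in range(5):
        assert provider.next_counter() == user.next_counter()
        user.on_success(); provider.on_success()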
5. SeDSSO Implementation

A project simulating the operation of each SeDSSO component has been created to test the performance and correct operation of a working SeDSSO system. The project is programmed in Java and compiled with the Java SE 6 platform, using the ThreshSig library [5] created by Stephen Weis for threshold cryptography.

5.1. Test Environment and Simplifications

Individual components of the SeDSSO simulation were executed on separate 1 GHz SPARC III Sun Blade workstations running the Solaris 10 operating system with 1 GB RAM on a 100 Mbps network.

While a simulation of each type of system was created for the test environment, several aspects were simplified. Each test was performed in isolation, making the tested operation the only SeDSSO operation being performed at that time. Also, as noted before, the USBID has only been functionally specified and is therefore simulated in software.

5.2. Test Results

The following test results were calculated by averaging the results of 50 individual tests. Prior to the measurements, the tested operation was run once to make sure that the Java virtual machine had performed all of the necessary compilations.

5.2.1. User Account Creation. Figure 1 presents the time that it takes to create a SeDSSO user account. The minimum number of servers required to use the authentication system private key was set to 2 (t = 2). When all authentication servers are available, the average account creation time is .6842 seconds, and with only two servers working that time decreased to .6173 seconds.

User account creation tests were also run on a SeDSSO system with n = 9 and t = 5, and authentication service sets of 9, 7, and 5 working servers were tested. With all servers working the average completion time is .7139 seconds, decreasing to .6815 seconds when 7 servers are available and further decreasing to .6766 seconds with 5 servers functioning.

If fewer than n authentication servers are running, then unavailability is encountered at two points in the account creation process. When the user randomly selects an initial authentication server to contact, there is a chance that an unavailable server will be contacted. One or more additional random attempts will be necessary to find an authentication server that is available. Once a connection with a working authentication server has been established, that authentication server will encounter the unavailable server or servers as it attempts to connect to all other authentication servers.

Figure 1 - User account creation times in seconds

The account creation time decreases as the number of unavailable servers increases because detecting unavailability is faster than the account creation process. Unavailability is detected in several milliseconds, but the communication between two working systems can take several tenths of a second (although the multi-threaded authentication server implementation minimizes the delay by allowing multiple connections to progress simultaneously). While the account creation times appear better when fewer authentication servers are working, a realistic SeDSSO authentication service would need to pass the newly-created user to the unavailable servers when they resume availability. In that case, the additional overhead would make the total performance requirement of unavailable servers more costly than when all authentication servers are working.

5.2.2. User Sign-On. The times measured for user sign-on tests with n = 3 and t = 2 are shown in figure 2. Three available authentication servers yield an average sign-on time of 1.7438 seconds. If one of the servers is disabled the time rises to 2.0891 seconds.

User sign-on was also tested with n = 9, t = 5, and 9, 7, and 5 authentication servers available. When all servers are available the average sign-on time is 1.8328 seconds. With only 7 servers available the time increases to 2.3167 seconds, and 5 available servers raise the sign-on time to 2.4883 seconds.

Figure 2 - User sign-on times in seconds

Unlike account creation, the sign-on process does not need to attempt a connection with all n authentication servers. Once the first server is contacted, that server only needs to receive signatures from t-1 different servers in order to sign the user identity voucher with the authentication service private key. The contacted authentication server chooses the set of t-1 servers at random and attempts to create connections with all of them simultaneously. The process is designed this way to minimize the load on the authentication service and improve sign-on times.

When the entire authentication service is available, all of the initial random server connections are successful and the voucher is created in the fastest time possible. This is verified by the times for 3 and 9 servers available in figure 3 and figure 4 respectively. As the number of available servers declines, it becomes more likely that the user needs to contact multiple authentication servers until it discovers a working server. Additionally, the contacted server may encounter connection errors with other servers and thus need to attempt new connections to collect t signatures. While communication with a newly-contacted server consumes the same amount of processing time as the initial connections, the new connections begin at a delayed time and subsequently increase the total length of the single sign-on process.

5.3. Security Analysis

In the past, SSO systems have experienced vulnerabilities to two major security attacks. A man-in-the-middle attack occurs when an eavesdropper intercepts messages between two parties to change them without either party knowing that such an attack has taken place. Given the distributed flow of Internet
traffic, it is possible for an eavesdropper with access to a routing device to observe raw communication messages in any protocol. These attacks have taken place on systems with various security protocols, including some that rely on public-key cryptography.

SeDSSO is immune to man-in-the-middle attacks. In order for a man-in-the-middle attack to work against SeDSSO's public-key authentication system, the eavesdropper needs to replace the real key pairs with counterfeit key pairs and assume that the communicating systems will still operate given these replacements. The public keys belonging to individual SeDSSO authentication servers and the public key for the entire authentication service are widely distributed, and a root certificate authority vouches for their authenticity. The public key for the certificate authority can be embedded directly in the USBID as well as in the service provider software, so that user and service provider systems can make sure that an authentication public key is correct.

Trojan horse attacks are more subversive because they take direct control of the user's system. The Trojan program runs in the background and waits until a connection has been established. It then sends requests over this connection to perform malicious activities with the user's identity. The communication protocol and server architecture of an authentication system would be unable to prevent this, no matter how secure it is. Protection must be implemented directly in the client-side software or hardware. Although a simulation of SeDSSO has been programmed, the full client software is not yet developed. Consequently, testing to gauge SeDSSO's Trojan attack resistance cannot be performed at this time.

Several security approaches may allow SeDSSO and other two-factor authentication schemes to effectively resist Trojan attacks. Client software that makes it impossible for a SeDSSO connection to be established without forced user interaction could alert the user to a Trojan operating in the background, but it is difficult to guarantee that this interaction cannot be bypassed in some way. The new initiative known as trusted computing may also be able to defend against these attacks by limiting the ability of other programs to interact with the user's session. However, at this time the future of trusted computing is unclear and the potential advantages and disadvantages are still being discussed [6].

6. Conclusion

In this paper we presented SeDSSO, a secure and fail-safe Internet authentication SSO architecture. Threshold encryption and a distributed authentication service allow SeDSSO to eliminate authentication as a central point of failure. Although the existing single sign-on systems CorSSO and ThresPassport rely on distributed authentication with threshold encryption, SeDSSO improves on their security and usability by implementing a two-factor authentication scheme consisting of a username/password combination and the USBID.

A protocol describing the interaction between SeDSSO users, service providers, and the authentication service has been developed. Our simulation implements every function of this protocol and yields consistently correct operations with favorable performance measurements. The simulation also demonstrates the advantages of distributed authentication. Even with t-1 authentication servers disabled (almost half of the authentication service), all functions are still available and in most cases the system suffers only a minor performance penalty.

As more people use more Internet sites, they need a way to replace many identities with one easy-to-use, highly secure entity that can be used anywhere without fear of identity theft. SeDSSO was designed to fulfill this need, and initial tests show the potential of our solution. However, more work must be done to test SeDSSO in an environment that realistically simulates the stress that a high-volume Internet authentication service would need to endure.

7. References

[1] RSA Security, "The 2nd Annual RSA Security Password Management Survey," August 2006.

[2] DigitalPersona, Inc., "Solving the Weakest Link: Password Security", http://www.digitalpersona.com/resources/downloads/Weakest_Link_wp_0205.pdf.

[3] L. Zhou, F. Schneider, and R. van Renesse, "COCA: A Secure Distributed On-line Certification Authority," ACM Transactions on Computer Systems, vol. 20, no. 4, pp. 329-368, November 2002.

[4] M. Ghosh and S. Makki, "A Secure Framework for Electronic Payment System," Proceedings of the International Conference on Internet Computing, Las Vegas, Nevada, USA, June 21-24, 2004.

[5] ThreshSig: Java Threshold Signature Package, created by Stephen A. Weis, http://threshsig.sourceforge.net/.

[6] S. Schoen, "Trusted Computing - Promise and Risk", Electronic Frontier Foundation (EFF), www.eff.org/Infrastructure/trusted_computing/20031001_tc.pdf.

[7] W. Josephson, E. Sirer, and F. Schneider, "Peer-to-Peer Authentication with a Distributed Single Sign-On Service," 3rd Int. Workshop on Peer-to-Peer Systems (IPTPS'04), San Diego, USA, February 2004.

[8] T. Chen, B. Zhu, S. Li, and X. Cheng, "ThresPassport - A Distributed Single Sign-On Service," International Conference on Intelligent Computing (ICIC) 2005, Hefei, China, August 2005.
Efficient Proxy Signatures for Ubiquitous Computing

Santosh Chandrasekhar, Saikat Chakrabarti and Mukesh Singhal
Laboratory for Advanced Networking, Department of Computer Science, University of Kentucky, Lexington, KY 40506
Email: {san, saikat}@netlab.uky.edu, singhal@cs.uky.edu

Abstract—Proxy signatures have been extensively used to solve authentication issues in mobile agent applications and authorization problems in distributed systems. However, conventional proxy signatures use traditional public key cryptosystems and are quite heavyweight. Thus, a direct application of these traditional signatures faces significant performance challenges when applied to resource constrained ubiquitous computing environments. In this paper, we introduce the use of an efficient cryptographic primitive from linear feedback shift register (LFSR) sequences to build lightweight proxy signatures, suitable for resource constrained devices. We present a novel third-order LFSR sequence-based, 2-party signature scheme, SCLFSR, following a well-known Schnorr signature scheme. Using SCLFSR, we construct an efficient proxy signature, PCLFSR, which can serve as a protocol building block for performance sensitive ubiquitous computing applications. The scheme, PCLFSR, is also the first construction of a proxy signature using primitives from LFSR sequences. We perform extensive theoretical analysis including correctness and security of PCLFSR and also present a performance (computation and communication costs, storage overhead) comparison of the proposed scheme with well-known traditional constructions.

Index Terms—Proxy signature, mobile agents, ubiquitous systems, Schnorr signature, provable security, LFSR sequence, cubic LFSR-based cryptosystems.

I. INTRODUCTION

There are an umpteen number of miniature devices, PDAs, mobile phones, motes, and other resource constrained devices plugged into the modern Internet. Ubiquitous computing seems to be the order of the day and is becoming more and more prevalent in the modern Internet. However, this mushrooming of new services harbingers new security threats to the consumers using the cloud of the Internet. Recently, there

check the validity of the proxy signature as well as the delegator's agreement on the signed message.

The process of proxy delegation has been classified into three broad categories, namely, full delegation, partial delegation, and delegation-by-warrant or certificate. Full delegation is a rather intuitive solution in which the delegator securely transfers its secret key to the proxy. This requires absolute trust in the proxy and a secure channel, and provides unrestricted signing rights to the proxy. In a partial delegation scheme, the delegator also uses a secure channel to transfer a delegation key derived from its secret key to the proxy. The proxy then derives its proxy signing key pair from the delegation key. However, the proxy still maintains unrestricted signing capabilities. The need for a secure channel and absolute trust in the proxy are impractical restrictions, and are eliminated in the delegation-by-warrant scheme: the delegator generates a warrant and a signature on the warrant, called a certificate, and sends the (warrant, certificate) pair to the proxy. The proxy generates signatures, using its own private key, on messages conforming to the warrant, and includes the (warrant, certificate) pair in the resulting signatures.

A. Key Contributions

Conventional proxy signatures use traditional public key cryptosystems, and a direct application of such mechanisms would invariably face performance challenges in resource constrained ubiquitous computing environments. To address this issue, we propose the use of a rather new cryptographic primitive derived from linear feedback shift register (LFSR) sequences to construct proxy signatures.
is a big movement to re-design the Internet, in an attempt to First, we present a novel cubic (third-order) LFSR-based 2-
make it robust, secure and incrementally deployable. In this party signature scheme SCLFSR by transforming the provably
paper, we focus on a subset of this big problem of securing secure Schnorr signature scheme [1]. Then, we construct the
the Internet: how can we secure systems where nomadic clients proposed proxy signature scheme, PCLFSR, using the 2-
need to search for special services or products, negotiate with party signature scheme. We provide an extensive theoretical
potential business entities and perform remote operations on analysis, including correctness, security and performance, of
behalf of some other client? Proxy signatures have found the proposed proxy signature scheme.
extensive use in such scenarios.
Proxy delegation is a process by which an entity, the Reasons to Use LFSR Sequence-based Constructions:
delegator, transfers its signing rights and capabilities to another A substantial portion of public key cryptography, for exam-
entity, the proxy. Following delegation, the proxy can generate ple, the Diffie-Hellman key agreement and the Digital Signa-
signatures on behalf of the delegator. The messages signed ture Standard [2], is based on the discrete logarithm assump-
by the proxy conform to a set of business policies, typically tion on an underlying finite field Fq . However, the chosen field
embodied in a warrant, agreed in advance by the delegator and sizes, q, must be sufficiently large to withstand the existing
the proxy. Any entity wanting to verify a proxy signature must attacks — algorithms to solve the discrete logarithm problem.

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 106


DOI 10.1109/SUTC.2008.64
LFSR-based public key cryptosystems [3], [4], [5] use reduced secure variant of the Kim et al.’s scheme [8]. Lee et al. [16]
representations of finite field elements. This enables us to analyzed the necessity of secure channels in proxy signature
represent finite field elements, say, in the extension field Fqn , schemes like [14], [6], [7], [8]. Zhang et al. [17] proposed
by the corresponding minimal polynomials whose co-efficients the first proxy signature scheme based on bilinear pairings in
are chosen from the base field Fq . The security of LFSR- elliptic and hyper-elliptic curves.
based public key cryptosystems is based on the difficulty of B. LFSR-PKCs and Applications
solving the discrete logarithm problem in the extension field
Fqn . However, all computations, involving sequence terms, The idea of developing a public key cryptosystem based on
needed for the protocol are performed in the base field Fq . LFSR sequences was first proposed by Niederreiter, who also
This leads to substantial savings, both in communication and presented the encryption and decryption of messages based on
computational overhead, for a desired security level. For ex- nth order LFSR sequences [5]. Gong et al. [3], [18] proposed
ample, 170-bits of the XTR (a phonetic acronym for Efficient the GH (Gong-Harn) public key cryptosystem based on third-
and Compact Subgroup Trace Representation) PKC, based on order LFSR sequences and developed an ElGamal-like digital
third order LFSR sequences, gives security equivalent to 1024- signature scheme based on the GH-PKC. Lenstra et al. [4]
bits of cryptosystems using traditional representation of finite proposed the XTR (Efficient and Compact Subgroup Trace
fields [4]. Representation) public key cryptosystem based on cubic LFSR
The performance enhancement resulting from such a trans- sequences over specialized finite fields. Giuliani et al. [19]
formation is shown in Section V-C. We regard 1024-bit presented a signature scheme using the trace discrete logarithm
security corresponding to traditional finite fields as the current problem and proved the equivalence between several LFSR-
standard and thus, use cubic-LFSR sequences in our protocol based computational problems and corresponding finite field
construction. counterparts.
LFSR sequence-based PKCs have found recent applica-
The rest of the paper is organized as follows. We discuss
tions in several efficient and scalable network authentication
related work in Section II. We also discuss the cryptographic
protocols. Chakrabarti et al. proposed first constructions of
preliminaries of cubic LFSR sequences and related public
multisignature schemes based on cubic LFSR sequences to
key cryptosystems in Section III. We present proposed cubic
authenticate route discovery in the dynamic source routing
LFSR-based proxy signature scheme in Section IV. We per-
protocol in mobile ad hoc networks [20], and to authenticate
form a theoretical analysis (correctness, security, and perfor-
feedback in multicast applications [21]. Chakrabarti et al. [22]
mance) of the proposed proxy signature scheme in Section V.
also presented an efficient blind signature scheme, based on
Section VI concludes the paper.
LFSR sequences which could potentially serve as a protocol
II. R ELATED W ORK building block for privacy-preserving accountability systems,
where an authority needs to vouch for the legitimacy of a
We divide the related work into two sub-areas, proxy sig-
message but there is also a need to keep the ownership of the
natures, and LFSR sequence-based public key cryptosystems
message secret from the authority. Li et al. [23] presented
and applications.
LFSR-based signatures with message recovery and Tan et
A. Proxy signatures al. [24] used cubic LFSRs to develop two signature schemes
equivalent to the Schnorr signature and signed ElGamal en-
The concept of a proxy signature was introduced by
cryption schemes.
Mambo et al. [6]. Mambo et al. classified proxy delega-
tion into partial delegation, full delegation and delegation III. C RYPTOGRAPHIC P RELIMINARIES OF C UBIC LFSR
by warrant, presented possible constructions for proxy signa- S EQUENCES AND R ELATED PKC S
tures, and provided an informal security analysis. Petersen et A sequence of elements {sk } = s0 , s1 , . . . over the finite
al. [7] proposed a novel proxy signature construction using a field Fq is called a 3rd order homogeneous linear recurring
Schnorr [1]-type 3-pass weak blind signature scheme. Kim et sequence in Fq if for all k ≥ 0:
al. [8] introduced the notion of partial delegation by warrant
and proposed a Schnorr-based scheme, where the signing sk+3 = c0 sk+2 + c1 sk+1 + c2 sk (1)
capabilities of the proxy were restricted by the warrant. where, c0 , c1 , c2 ∈ Fq and sk denotes the kth term of the
Researchers have focused on developing new proxy sig- sequence {sk }. Such sequences can be efficiently generated
natures by enhancing the security or efficiency of previous by a special kind of electronic switching circuit, called LFSR.
schemes [9], [10], [11], [12], [13]. Following the footsteps Consider the following monic irreducible polynomial over Fq :
of Mambo et al. [6], the first improvement to the notion
of security for proxy signature schemes was made by Lee f (x) = x3 − ax2 + bx − 1; a, b ∈ Fq (2)
et al. [14]. Lee et al. classified proxy signatures into strong
The sequence {sk } is said to be a cubic-LFSR sequence
and weak signatures based on informal security requirements
generated by f (x) if we have c0 = a, c1 = b and c2 = 1
and presented a strong proxy signature scheme following the
in Equation 1, i.e., for all k ≥ 0:
construction of [8]. Boldyreva et al. [15] formalized security
notions for proxy signature schemes and presented a provably sk+3 = ask+2 − bsk+1 + sk

107
The polynomial f (x) is called the characteristic polynomial IV. T HE P ROPOSED P ROXY S IGNATURE S CHEME U SING
of the sequence {sk } if, given a root α of f (x), for all k ≥ LFSR S EQUENCES
2
0, we have sk = αk + αkq + αkq , where α ∈ Fq3 . The In this section, we present a novel third-order linear feed-
sequence {sk } is called the third-order characteristic sequence back shift register (LFSR) sequence-based, 2-party signature
generated by f (x) (or by α). The initial state (kth state denoted scheme, SCLFSR. Using SCLFSR we construct an efficient
as s̄k = {sk , sk+1 , sk+2 }) of the characteristic sequence of proxy signature PCLFSR which can serve as a protocol
f (x) is given by s̄0 = {3, a, a2 − 2b} [19]. building block for secure ubiquitous computing applications.
Recently, two PKCs, namely, GH-PKC [3] and XTR-
A. Basic Idea
PKC [4] were proposed based on cubic LFSR sequences [25].
In cubic LFSR-based PKCs [3], [4], elements in Fq3 are rep- A proxy signature scheme is divided into five phases,
resented by their corresponding minimal polynomials whose namely, initialization, key generation, proxy delegation, proxy
coefficients are chosen from Fq . However, the security of cubic signature generation and proxy signature verification. During
LFSR-based PKCs is based on the difficulty of solving the the initialization phase, all entities choose and agree upon
discrete logarithm problem in Fq3 . This leads to substantial common system public parameters. During the key generation
savings, both in communication and computational overhead, phase all entities generate their (public, private) key pairs.
for a desired security level. In particular, 170-bits of XTR- An entity (the delegator) wanting to delegate signing rights
PKC gives security equivalent to 1024-bits of cryptosystems to another entity (the proxy), does the following:
using traditional representation of finite fields [4]. The XTR 1) Creates an appropriate warrant. The delegator and the
cryptosystem is constructed by choosing: proxy agree upon warrant specifications by out of band
mechanisms, perhaps using and abiding by formal busi-
1) p, a large prime of the order of 170 bits. Set q = p2 .
ness policies. The warrant specifies the proxy’s signing
2) Q, a large prime factor of p2 − p + 1 of the order of 160 rights and (possibly) identifies the delegator and the
bits. proxy.
3) Characteristic polynomial f (x) = x3 − ax2 + ap x − 1 2) Generates a signature on the hash of warrant. In the
with period Q by randomly choosing a ∈ Fq and using proposed proxy signature scheme, we use a Schnorr-
standard irreducibility testing algorithms. like [1] signing equation. The delegator’s signature on
Let fk (x) denote the minimal polynomial of αk where α ∈ the warrant prevents misuse of signing rights by corrup-
Fq3 is a root of f (x). It can be shown that the polynomial tion of the warrant by a malicious delegator, proxy or
fk (x) can be represented as [3], [4], [19]: fk (x) = x3 − any adversary.
sk x2 + spk x − 1 in the XTR-PKC. Thus, the polynomial fk 3) Sends the (warrant, signature) pair to the proxy. Our
(we drop the indeterminate x for simplicity of notation) can scheme does not require a secure channel between
be represented by sk ∈ Fq in XTR. The sequence terms are delegator and proxy (a restriction faced by several other
computed using the following two sequence operations [18]: schemes [16]).
OP1 : Given an integer k and fe , compute the (ke)th state After receiving the (warrant, signature) pair, the proxy checks
of the LFSR, s̄ke ; the warrant specifications. If the warrant is bogus, the proxy
OP2 : Given s̄k and s̄e (both integers k, e can be unknown), aborts; otherwise, it verifies the delegator’s signature on the
compute the (k + e)th state of the LFSR, s̄k+e . warrant. If verification is successful, the proxy generates
its proxy signing key using its own long-term private key
These sequence operations have been efficiently implemented and delegator’s signature on the warrant, and computes the
in hardware [26]. We use the sequence operations to cre- corresponding proxy verification key. In our scheme, the proxy
ate/manipulate sequence terms in the proposed multisignature signing key is computed as a solution to a Schnorr-like
scheme. signing equation involving the delegator’s signature, warrant,
In cubic LFSR-based PKCs, an entity randomly chooses and proxy’s long-term private key (cf. Section IV-C). The
a long-term private key SK = x in Z∗Q and computes the Schnorr-like equation is designed to allow implicit verification
long-term public key PK = s̄x = {sx , sx+1 , sx+2 } using of the proxy’s ownership of the verification key (guarantees
the sequence operation OP1 (x, f ). Algorithms for sequence strong identifiability) and, simultaneously, allow verification of
term computations use the following commutative law [3] for the delegator’s signature on the warrant. This completes the
characteristic sequences: for all integers r and e, the rth term of proxy delegation phase.
the characteristic sequence generated by the polynomial fe (x) The proxy uses the proxy signing key to generate a signature
th
equals the (re) term of the characteristic sequence generated on a message, on behalf of the delegator. The proxy tags the
by the polynomial f (x), i.e., sr (fe ) = sre (f ) = se (fr ). signature along with the delegator’s signature—our scheme
We construct our proxy signature scheme using the XTR- requires only the ephemeral public key of the delegator to
PKC for simplicity, although the proposed scheme can be be included in the proxy signature—warrant and message to
seamlessly built using the GH-PKC and also extended to collectively generate the proxy signature.
PKCs based on higher order LFSR sequences, with minor Any entity that wants to verify the proxy signature does the
modifications. following:

108
1) Checks whether the message conforms to the warrant. LFSR sequences. Let an entity D act as the delegator, wanting
If not, the proxy aborts. to delegate signing rights to another entity P, the proxy.
2) Computes the proxy verification key using the warrant, The proposed proxy signature scheme, PCLFSR, consists
delegator’s signature and the long term public keys of five phases, namely, initialization, key generation, proxy
of proxy and delegator. This step implicitly identifies delegation, proxy signature generation and proxy signature
the proxy and simultaneously verifies the delegator’s verification. The phases are executed as follows:
attestation on the warrant (guarantees verifiability). 1) Initialization (PCLFSR.Init).: Similar to the SCLFSR
3) Checks that the proxy indeed stamped the message with scheme, during the initialization phase, all entities choose
the proxy signing key, by verifying the proxy’s signature and agree on the system public parameters, params =
on the message (contained in the proxy signature) under p, Q, f (x), H.
the proxy verification key computed in the previous step. 2) Key Generation (PCLFSR.KeyGen).: During the key gen-
If verification fails, the verifier aborts. eration phase, the delegator D and the proxy P generate
Subsequent verification of proxy signatures, exchanged by the their long-term (private, public) key pairs, (x, X = s̄x )
same (proxy, verifier) pair, gets more efficient: the verifier and (y, Y = s̄y ), respectively, following the key generation
computes the proxy verification key only once per delegation. algorithm of the underlying PKC as described in Section III.
Next, we present a novel 2-party signature construction 3) Proxy Delegation.: During the proxy delegation phase,
using primitives from cubic LFSR sequences. D interacts with P to delegate its signing rights to P. D
restricts the signing capabilities of P by specifying a warrant
B. Two Party Signature Scheme Using LFSR Sequences
(an agreement between D and P) to which all messages that
The cubic LFRS-based signature scheme, SCLFSR, is built P signs must conform.
using Schnorr-like [1] signing equation.
The scheme SCLFSR consists of four phases: initialization, PCLFSR.Delegate: The delegator D delegates a proxy P as
key generation, signature generation and signature verification. follows:
Entities S and V participate in the SCLFSR scheme, where 1) Generate a warrant w that specifies restrictions on the
S is the signer and V is the verifier. During the initialization messages the proxy signer is allowed to sign and the
phase, both S and V choose and agree on the system public identities of delegator and proxy.
parameters: params = p, Q, f (x), H, where p, Q and f (x) 2) Compute a signature σ = t, fk , using private key
are as described in the previous section and H : {0, 1}∗ × x, on the warrant w following the signature generation
Z∗Q → Z∗Q is a cryptographic hash function. procedure of SCLFSR.
In the key generation phase, the signer S randomly chooses 3) Send signature, σ = t, fk  and warrant, w to P.
a long-term private key SK = x in Z∗Q and computes the
long term public key PK = s̄x = {sx , sx+1 , sx+2 } ←
PCLFSR.Accept: After receiving the delegation parameters
OP1 (x, f (x)). Fig. 1 describes the signing and verification
from D, the proxy P performs the following operations:
procedures of SCLFSR
1) Check whether warrant w conforms to the agreement
Signature Generation Signature Verification with D. If check fails, abort.
2) Verify D’s signature σ = t, fk  on warrant w, under
1) Randomly choose ephemeral 1) Compute A = f(t) ←
private key k ∈R Z∗Q and OP1 (t, f ). public key X, following the signature verification pro-
compute ephemeral public key 2) Compute cedure of SCLFSR. If verification fails, abort.
fk ← OP1 (k, f ). Denote r = B `= f(k+xh) ← ´ 3) Compute proxy signing key xP as: xP ≡ t+yh mod Q.
sk mod Q as an integer. OP2 fk , OP1 (h, s̄x ) .
2) Compute hash of message h = 3) Accept the signature if
Compute the corresponding proxy verification key as
H(m, r); Solve for t in the A = B, else reject the XP = s̄xP ← OP1 (x, f ).
following equation: t ≡ k + xh signature.
mod Q. ` ´
4) Proxy Signature Generation (PCLFSR.PSigGen).: Dur-
3) Send the signature σ = fk , t ing the proxy signature generation phase, the proxy P uses
and the message m to entity V . its proxy signing key xP to generate signatures on behalf of
D on messages that comply to the agreement between D and
P. Given a message m, P generates a proxy signature σP as
Fig. 1. The SCLFSR Signature Scheme.
follows:
Next, we present the proposed proxy signature scheme that 1) Check whether the message m conforms to restrictions
uses the above algorithm as a building block. specified in the warrant w. If check fails, abort.
2) Compute a signature tP , fkP  using proxy signing key
C. The Proposed Proxy Signature Scheme PCLFSR xP on the message m following the signature generation
The proposed proxy signature scheme, PCLFSR is a cubic procedure of SCLFSR.
LFSR-based transformation of the Schnorr-based proxy signa- 3) The proxy signature σP is the tuple tP , fkP , w, m, fk .
ture scheme proposed by Kim et al. [8] using primitives from Send σP to verifier V.

109
5) Proxy Signature Verification (PCLFSR.PSigVer).: Given congruence tP ≡ kP + xP hP mod Q (established by the
a proxy signature σP , V can verify the delegation agreement, signing equation) we know,
identify the proxy and verify the proxy’s signature on message
A = ftP = f(kP +xP hP ) = B
m conforming to warrant w as follows:
1) Check whether the message m conforms to warrant This concludes our proof.
specifications w. If check fails, abort.
B. Security Analysis
2) Compute h = H(w, r). Compute the proxy
verification key as XP  = s̄h(x+y)+k ← The security of the proposed proxy signature scheme
   PCLFSR is based on the difficulty of solving the discrete
OP2 fk , OP1 h, OP2 (X, Y ) .
logarithm (DL) problem in Fq3 or based on the difficulty of
Note: this step is done only once per delegation for all solving the trace discrete logarithm (Tr-DL) problem in Fq [3],
signatures exchanged between the same (proxy, verifier) [4], [18], [19]. Informally, the trace function T r : Fq3 → Fq
pair. 2
is given as T r(α) = α + αq + αq . The Tr-DL problem and
3) Verify P’s signature (generated on behalf of D) tP , fkP 
assumption can be defined as follows:
on message m, under proxy verification key XP , follow-
ing the signature verification procedure of SCLFSR. If Definition 1: Let α be a generator of the multiplicative
verification fails, abort. group (Fq3 )∗ , where q is a large prime or a power of a large
In the following section, we analyze the correctness and prime. The Tr-DL Problem in Fq can be defined as follows:
security of the proposed cubic LFSR-based proxy signature Given (q, α ∈ (Fq3 )∗ , β ∈ Fq ), find an index k such that
scheme and compare its performance with well-known con- β = T r(αk ) or determine that there is no such index.
struction of proxy signatures. Let A be a probabilistic polynomial time (PPT) algorithm
that runs in time t and solves the Tr-DL problem with
V. T HEORETICAL A NALYSIS probability at least . Define the advantage of the (t, ) Tr-
We present a detailed theoretical analysis of correctness, DL solver A as: AdvTArDL = P r[A(q, α, β) = k | α ∈R
security and performance of the proposed cubic LFSR-based Fq3 , k ∈R ZQ , β = T r(αk )]. The probability is over the
proxy signature scheme. random choices of α, k and the random bits of A.
Tr-DL Assumption: The finite field Fq satisfies the Tr-DL
A. Correctness Assumption if AdvTArDL (λ) is a negligible function.
A cubic LFSR-based proxy signature scheme constructed Lemma 1 (Giuliani et al. [19]): The Tr-DL Problem is
following the procedures in Section IV-C is correct if an equivalent to the DL problem.
arbitrary proxy signature, σP = tP , fkP , w, m, fk , generated We prove that the SCLFSR scheme is provably secure in the
by a proxy P on behalf of D, passes the proxy signature random oracle model (ROM) [27] under the DL assumption
verification procedure PCLFSR.PSigVer, using the long term by showing the equivalence of the 2-party scheme SCLFSR
public keys of D and P, and proxy verification key of P with the Schnorr signature scheme [1] and using the fact that
provided: (1) All entities choose and agree upon the system the DL problem reduces to forging a Schnorr signature [28].
public parameters params = p, Q, f (x); (2) D and P A total break of SCLFSR occurs if an adversary manages to
honestly execute key generation algorithm PCLFSR.KeyGen to do the following: given the public key s̄x of any entity E, the
generate their key pairs (x, X) and (y, Y ), respectively; (3) D adversary is able to compute the corresponding private key x.
honestly executes the delegation algorithm CLFSR.Delegate In such a case, any entity’s signature can be forged. However
to create a (warrant, signature) pair (w, σ), and (4) P honestly given s̄x , finding x is equivalent to solving the DL problem in
executes the acceptance algorithm CLFSR.Accept to compute the extension field Fq3 [18]. Using the following lemma we
the proxy (signing, verification) key pair (xP , XP ). show that, assuming a total break has not occurred, the 2-party
Proposition 1: The proposed cubic LFSR-based proxy sig- scheme SCLFSR is provably secure in the ROM under the DL
nature scheme, PCLFSR, follows the correctness property. assumption.
Proof Sketch: We first show that during verification of Lemma 2: The 2-party signature scheme SCLFSR is equiv-
a proxy signature, σP = tP , fkP , w, m, fk  on message m the alent to the well-known Schnorr [1] signature scheme.
verification key computed by V as XP = s̄h(x+y)+k , where Proof Sketch: Recall that the signing equation in the
h = H(w, r), equals the proxy verification key computed by Schnorr [1] scheme is: t ≡ k − xh mod Q, where p (1024-
P during proxy delegation as follows: bits) and Q (160-bits) are two large primes with Q|(p − 1),
α ∈ Z∗p is a generator of the cyclic subgroup of order Q,
XP = s̄xP = s̄t+yh = s̄k+xh+yh = s̄h(x+y)+k
h = H(m, r) is the hash of the message, t is the signature on
What remains to be shown is that the signature tP , fkP  on h, x ∈R Z∗Q is the private key, αx is the public key, k ∈R Z∗Q
hP = H(m, rP ), where kP is the ephemeral private key chosen is the ephemeral private key and r = αk is the ephemeral
by P during signature generation and rP = skP mod Q public key.
denoted as an integer, passes the verification under the proxy [=⇒] Given a valid SCLFSR signature σ = fk , t on
verification key XP . This is straightforward, since given the hashed message h = H(m, r) under the public key s̄x , we

110
know that ft = f(x+kh) . Let α ∈ Fq3 be a root of the proxy signature in a Boldyreva-type (transformed) PCLFSR
characteristic polynomial f (x). By the definition of fk (x) (cf. scheme in the ROM.
2
Section III), the roots of ft) are αt , αtq , αtq ∈ Fq3 , the roots Proof Sketch: The proof follows from Lemmas 3 and 4.
2
of f(x+kh) are α(x+kh) , α(x+kh)q , α(x+kh)q ∈ Fq3 . Also, we
know from the signing equation of SCLFSR that t ≡ x + kh Next, we present a detailed performance comparison of the
mod Q. Thus, the root αt of ft is equivalent to the root proposed proxy signature scheme PCLFSR with some well-
α(x+kh) of f(x+kh) in Fq . We now have αt = α(x+kh) with known constructions of proxy signatures.
t ≡ x + kh mod Q, which is the Schnorr scheme. Thus, the
C. Performance Analysis
SCLFSR reduces to the Schnorr scheme.
[⇐=] Given a valid Schnorr signature t = r, t on hashed Table I shows direct comparisons of the proposed proxy
message h = H(m, r) under the public key αx , we know that signature scheme, PCLFSR, with those developed by Mambo
2 2
αt = α(x+kh) . Also, αtq = α(x+kh)q and αtq = α(x+kh)q et al. (MUO) [6], Kim et al. (KPW) [8], Lee et al. (LKK) [14],
2 Petersen et al. (PH) [7], Huang et al. (HSMW) [30] and Zhang
where, q = p and p is a 170-bit prime. We know:
et al. (ZNS) [17]. The scheme by Boldyreva et al. [15] is a
ft = T r(αt ) provable secure variant of the KPW scheme and both schemes
2
= αt + αtq + αtq have the same performance. For the sake of uniformity in
f(x+kh) = T r(α(x+kh) ) comparison, we consider a security benchmark of 1024 bits
2 — the system public parameters of MUO, KPW, LKK, and PH
= α(x+kh) + α(x+kh)q + α(x+kh)q are given by the tuple params = p, q, α, where p and q are
Thus, ft = f(x+kh) with t ≡ x + kh mod Q, which is 1024 and 160 bit primes, respectively, and α is an element of
the SCLFSR scheme. Thus, the Schnorr scheme reduces to order q in Z∗p .
SCLFSR scheme. Note that given α ∈ Fp6 , where p is a 170-bit prime, com-
Hence, the equivalence. puting αk for any integer k requires approximately 23.4 log2 Q
Lemma 3: The 2-party signature scheme SCLFSR is prov- multiplications in Fp , where Q, the order of α, is a 160-
ably secure in the ROM under the DL assumption. bit prime [4]. However, computing the kth sequence term
sk = T r(αk ) given f (represented by T r(α)) using sequence
Proof Sketch: We know that the well-known DL problem
operation OP1 takes only 8 log2 (k mod Q) multiplications in
reduces to forging a Schnorr signature [28]. The rest of the
Fp which is approximately three times faster than computing
proof follows from Lemma 2.
αk , given α [4]. Thus, the computational cost for one OP1
Our proxy signature scheme is a transformation of the proxy
operation in the proposed PCLFSR scheme is roughly equiva-
signature scheme by Kim et al. [8] using primitives from cubic
lent to 0.33 modular exponentiations in the MUO, KPW, LKK
LFSR sequences. Boldyreva et al. [15] presented a provably
and PH schemes. Furthermore, these LFSR-based sequence
secure variant of Kim et al.’s scheme [8]. Boldyreva et al.
operations have been efficiently implemented in hardware [26],
showed that the forging a proxy signature in their transformed
[31]. The HSMW and ZNS schemes use considerably more ex-
scheme reduces to forging a Schnorr signature in the ROM.
pensive bilinear pairing operations in the delegation and proxy
The proposed scheme PCLFSR can also be transformed into
signature verification phases. To the best of our knowledge, the
a provably secure proxy signature scheme in the ROM by
best known result for computing a single Tate pairing equals
directly applying the modifications as proposed by Boldyreva
approximately 11110 multiplications in Fq , where q is a 171-
et al. [15].
bit prime (for security benchmark of 1024-bits) [32].
Lemma 4: Forging a proxy signature in a Boldyreva-type
For schemes, MUO, KPW, LKK and PH, the size of the
(transformed) PCLFSR scheme, reduces to forging a signature
long-term public key, excluding shared components (primes p
in the 2-party scheme SCLFSR in the ROM under the DL
and q) equals 2048 bits. The pairing-based schemes, HSMW
assumption.
and ZNS, use public keys of size 1532-bits. The proposed
Proof Sketch: The proposed scheme PCLFSR can be
scheme, PCLFSR, achieves the least storage overhead: the size
directly transformed following the modifications as proposed
of the long term public key and hash key combined equals
by Boldyreva et al. [15]. All notations being the same, the two
1020 bits.
modifications to PCLFSR are as follows [15]:
• The warrant w is hashed as h = G(0||X||Y ||w, r), where VI. C ONCLUSIONS
G : {0, 1}∗ × Z∗Q → Z∗Q is a cryptographic hash function The problem of authenticating mobile agents and other
associated with D. resource constrained devices in the realm of ubiquitous com-
• The message m is hashed as hP = puting is inherently more complex than in conventional envi-
H(0||M ||X||Y ||w||r, rP ), where H : {0, 1}∗×Z∗Q → Z∗Q ronments. The use of conventional cryptography is expensive
is a cryptographic hash function associated with P. and does not offer attractive solutions.
The remainder of the proof follows the proof by Boldyreva We presented the first LFSR-based proxy signature scheme
et al. [15] and is omitted due to space constraints. to solve the authentication problem in systems where no-
Theorem 1: The discrete log problem reduces to forging a madic agents need to search for special services or products

111
MUO KPW LKK PH HSMW ZNS PCLFSR
Delegation 3e 4e 4e 4e 3s + 2p 2s + 2p 4OP1
Proxy signature generation 1e 1e 1e 1e 5s 2s 1OP1
Proxy signature verification 5e 3e (2e)† 3e 3e 5p 1s + 2p 3OP1 (2OP1 )†
Public key size (bits) 2048 1532‡ 1532 1020
Proxy signature size (bits) 480 160 840
Secure Channel Y N Y N N N N

TABLE I
P ERFORMANCE C OMPARISON WITH SECURITY BENCHMARK OF 1024- BITS . e: MODULAR EXPONENTIATION , s: SCALAR MULTIPLICATIONS , p: PAIRING
COMPUTATION ; 1OP1 ≈ 0.33e; †: SUBSEQUENT VERIFICATION OF PROXY SIGNATURES , BETWEEN ANY PAIR OF PROXY AND VERIFIER , ‡: SYSTEM
PARAMETERS REQUIRE UPTO 10KB OF ADDITIONAL STORAGE [29]

across networks. LFSR-based public key cryptosystems use [10] A. K. Awasthi and S. Lal, “Proxy blind signature scheme,” Transactions
reduced representations of finite field elements. This leads to on Cryptology., vol. 2, no. 1, pp. 5–1, 2005.
[11] S. Lal and A. K. Awasthi, “A scheme for obtaining a warrant message
substantial savings, both in communication and computational from the digital proxy signatures.” Cryptology ePrint Archive, Report
overhead, for a desired security level. Our proxy signature 2003/073, 2003, http://eprint.iacr.org/2003/073.
scheme is constructed using a cubic LFSR-based, Schnorr- [12] T. Okamoto, M. Tada, and E. Okamoto, “Extended proxy signatures for
smart cards.” in Proceedings of ISW, Second International Workshop on
type 2-party signature scheme, and uses comparatively fast Information Security, ser. LNCS, M. Mambo and Y. Zheng, Eds., vol.
LFSR operations to achieve the best computational efficiency 1729. Springer, 1999, pp. 247–258.
and least storage overhead among the existing protocols. The [13] H.-M. Sun and B.-T. Hsieh, “Remarks on two nonrepudiable proxy
signature schemes,” in Proceedings of 9th National Conference on
scheme is provably secure in the random oracle model under Information Security, 1999, pp. 241–246.
the discrete logarithm assumption. [14] B. Lee, H. Kim, and K. Kim, “Strong proxy signature and its ap-
Although the proposed schemes are instantiated using XTR- plication,” in Proceedings of SCIS, Symposium on Cryptography and
Information Security Osio, Japan, vol. 2/2, 2001, pp. 603–608.
PKC for simplicity, they can be seamlessly built using the [15] A. Boldyreva, A. Palacio, and B. Warinschi, “Secure proxy signature
GH-PKC and also extended to PKCs based on higher order schemes for delegation of signing rights,” Cryptology ePrint Archive,
LFSR sequences, with minor modifications, depending on the Report 2003/096, 2003, http://eprint.iacr.org/2003/096.
[16] J.-Y. Lee, J. H. Cheon, and S. Kim, “An analysis of proxy signatures: Is a
desired security level. secure channel necessary?” in Proceedings of CT-RSA, Cryptographers’
We believe that our results have not complete solved the Track at the RSA Conference, ser. LNCS, M. Joye, Ed., vol. 2612.
hard problem of authentication in ubiquitous systems, but Springer, 2003, pp. 68–79.
[17] F. Zhang, R. Safavi-Naini, and W. Susilo, “An efficient signature scheme
our on-going work involving LFSR-sequence based proxy from bilinear pairings and its applications.” in Proceedings of PKC,
signatures have significant potential to form crucial building 7th International Workshop on Theory and Practice in Public Key
blocks for securing performance sensitive ubiquitous systems. Cryptography, ser. LNCS, F. Bao, R. H. Deng, and J. Zhou, Eds., vol.
2947. Springer, 2004, pp. 277–290.
R EFERENCES [18] G. Gong, L. Harn, and H. Wu, “The GH public-key cryptosystem,” in
Proceedings of SAC, Eighth Annual International Workshop on Selected
[1] C.-P. Schnorr, “Efficient signature generation by smart cards.” Journal Areas in Cryptography, Revised Papers, ser. LNCS, S. Vaudenay and
of Cryptology, vol. 4, no. 3, pp. 161–174, 1991. A. M. Youssef, Eds., vol. 2259. Springer, 2001, pp. 284–300.
[2] FIPS, “Digital signature standard (DSS),” National Institute for Stan- [19] K. J. Giuliani and G. Gong, “New LFSR-based cryptosystems and
dards and Technology, pp. ii + 74, January 2000. the trace discrete log problem (trace-DLP),” in Proceedings of SETA,
[3] G. Gong and L. Harn, “Public-key cryptosystems based on cubic finite Third International Conference on Sequences and Their Applications,
field extensions,” IEEE Transactions on Information Theory, vol. 45, ser. LNCS, T. Helleseth, D. V. Sarwate, H.-Y. Song, and K. Yang, Eds.,
no. 7, pp. 2601–2605, 1999. vol. 3486. Springer, 2004, pp. 298–312.
[4] A. K. Lenstra and E. R. Verheul, “The XTR Public Key System,” in [20] S. Chakrabarti, S. Chandrasekhar, M. Singhal, and K. L. Calvert,
Proceedings of CRYPTO, Twentieth Annual International Cryptology “Authenticating DSR using a novel multisignature scheme based on
Conference, ser. LNCS, M. Bellare, Ed., vol. 1880. Springer, 2000, cubic LFSR sequences,” in Proceedings of ESAS, The Fourth European
pp. 1–19. Workshop on Security and Privacy in Ad hoc and Sensor Networks,
[5] H. Niederreiter, “A public-key cryptosystem based on shift register ser. LNCS, F. Stajano, C. Meadows, and S. Capkun, Eds., vol. 4572.
sequences,” in Proceedings of EUROCRYPT, Workshop on the Theory Springer, 2007, pp. 156–171.
and Application of Cryptographic Techniques, ser. LNCS, F. Pichler, [21] ——, “Authenticating feedback in multicast applications using a novel
Ed., vol. 219. Springer, 1986, pp. 35–39. multisignature scheme based on cubic LFSR sequences,” AINAW, 21st
[6] M. Mambo, K. Usuda, and E. Okamoto, “Proxy signatures for delegating International Conference on Advanced Information Networking and
signing operation.” in Proceedings of CCS, Third ACM Conference on Applications Workshops, vol. 1, pp. 607–613, 2007.
Computer and Communications Security. ACM Press, 1996, pp. 48–57. [22] S. Chakrabarti, S. Chandrasekhar, K. L. Calvert, and M. Singhal,
[7] H. Petersen and P. Horster, “Self-certified keys – concepts and ap- “Efficient blind signatures for accountability,” in Proceedings of NPSec,
plications,” in Proceedings of CMS, 3rd International Conference on The Third Workshop on Secure Network Protocols, Beijing, China,
Communications and Multimedia Security. Chapman & Hall, 1997, October 2007.
pp. 102–116. [23] X. Li, D. Zheng, and K. Chen, “LFSR-based signatures with message
[8] S. Kim, S. Park, and D. Won, “Proxy signatures, revisited,” in Pro- recovery.” International Journal of Network Security, vol. 4, no. 3, pp.
ceedings of ICICS, First International Conference on Information and 266–270, 2007.
Communication Security, ser. LNCS, Y. Han, T. Okamoto, and S. Qing, [24] C. H. Tan, X. Yi, and C. K. Siew, “Signature schemes based on 3rd order
Eds., vol. 1334. Springer, 1997, pp. 223–232. shift registers.” in Proceedings of ACISP, Sixth Australasian Conference
[9] H.-M. Sun, “An efficient nonrepudiable threshold proxy signature on Information Security and Privacy, ser. LNCS, V. Varadharajan and
scheme with known signers.” Computer Communications, vol. 22, no. 8, Y. Mu, Eds., vol. 2119. Springer, 2001, pp. 445–459.
pp. 717–722, 1999. [25] S. W. Golomb, Shift Register Sequences. Holden-Day, 1967.

112
[26] E. Peeters, M. Neve, and M. Ciet, “XTR implementation on reconfig-
urable hardware.” in Proceedings of CHES, Sixth International Work-
shop on Cryptographic Hardware and Embedded Systems, ser. LNCS,
M. Joye and J.-J. Quisquater, Eds., vol. 3156. Springer, 2004, pp.
386–399.
[27] M. Bellare and P. Rogaway, “Random oracles are practical: a paradigm
for designing efficient protocols,” in Proceedings of CCS, First ACM
conference on Computer and communications security. ACM Press,
1993, pp. 62–73.
[28] N. Koblitz and A. Menezes, “Another look at ”provable security”,”
Journal of Cryptology, vol. 20, no. 1, pp. 3–37, 2007.
[29] S. Lu, R. Ostrovsky, A. Sahai, H. Shacham, and B. Waters, “Sequential
aggregate signatures and multisignatures without random oracles,” in
Proceedings of EUROCRYPT, 25th Annual International Conference on
the Theory and Applications of Cryptographic Techniques, ser. LNCS,
S. Vaudenay, Ed., vol. 4004. Springer, 2006, pp. 465–485.
[30] X. Huang, W. Susilo, Y. Mu, and W. Wu, “Proxy signature without
random oracles,” in Proceedings of MSN, Second International Con-
ference on Mobile Ad-hoc and Sensor Networks, ser. LNCS, J. Cao,
I. Stojmenovic, X. Jia, and S. K. Das, Eds., vol. 4325. Springer, 2006,
pp. 473–484.
[31] M. Neve, E. Peeters, G. M. de Dormale, and J.-J. Quisquater, “Faster
and smaller hardware implementation of XTR,” in Advanced Signal
Processing Algorithms, Architectures, and Implementations XVI, F. T.
Luk, Ed., 2006.
[32] P. S. L. M. Barreto, B. Lynn, and M. Scott, “On the selection of pairing-
friendly groups.” in Proceedings of SAC, Tenth Annual International
Workshop on Selected Areas in Cryptography, Revised Papers, ser.
LNCS, M. Matsui and R. J. Zuccherato, Eds., vol. 3006. Springer,
2003, pp. 17–25.

113
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

A study on Digital Audio Watermarking Internet Applications


Yiju Wu, Shigeru SHIMAMOTO
Graduate school of Global Information and Telecommunication Studies, Waseda
University. Bldg.29-7, 1-3-10 Nishi-Waseda, Shinjuku-Ku, Tokyo,Japan

Abstract attached files, the watermarking technology is cost-


This paper aims to explore the feasibility of saving when it is applied to communication medium.
embedding Cellophane’s QR Code in sound Therefore, the research tried to explore the driving
watermarking and conducting simultaneous linkage of patterns of the hiding and reading of Internet linking
hidden information through the principles of data through hiding wide-used QR code by sound
psychological and physiological models. In the watermarking. The sound watermarking technology in
research, we tried to realize the theory through Visual the research was designed according to the
Basic and observed that different type of music would psychological and physiological models of human
indirectly affect the identifiably of sound watermarking. auditory sense. Firstly, we divided the audio signals in
time domain and calculated the frequency spectrum of
1. The Rationale of Sound Watermarking signals at each section-dividing point. Furthered to
divide the frequency spectrums into sub-areas and
In recent years, the development of algorisms of calculated the masking curves in each sub-areas and
digital audio watermarking could be divided into two compared them with the power spectrum of audio
major streams: time field algorism time domain signals. Then embedded the watermarking into the
algorism and transform domain algorism. In the low and middle level of transform coefficients of sub-
development of two technologies, the transfer field waves in audio signals to enable digital watermarking
method could embed the data that needs to hide into to have better detect ability.
the sensitive areas of original audio files. It could also Sound signals could be classified as a continuous
perfectly take advantage of the characteristics of data stream that it is impossible to test the signals of
human auditory sense in hiding information to avoid each section in signal test and designing algorism
the problem of inferior audio quality in hiding picked out by watermarking. So we need to conduct
watermarking data. Therefore, the transform domain further researches in present stage of technology if we
watermarking algorism technology has become one of want sound watermarking to have capability to retrieve
main stream technologies in the continuous data, combine hidden information and link Internet
development of various sound watermarking simultaneously. In the research, we tried to use wavelet
technologies. [1]The figure 1 shows the Technologies transform method to divide a signal into five sub-signal
of Sound Watermarking areas and found out the areas that the energy in sub-
area suddenly rose. The same method was applied to
the picking-out and retrieving of watermarking. After
picking out information, we run programs to drive
embedded information to link a specific webpage.

2.Watermarking Pattern Based on


Psychological and Physiological Models
Masking effect is a common psychological-acoustic
phenomenon that is determined by the distinguishing
mechanism of sound frequency by human ears. Within
Figure 1 The Current Technology of Sound a certain range of frequency, if a strong and a weak
Watermarking [2-12] sound signals that have a certain difference of energy
exist at the same time, the weak one will not be
Due to the characteristics of watermarking that it detected, that is called strong sound masking
could keep the size of original file after being added phenomenon. Taking advantage of the characteristic of

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 59


DOI 10.1109/SUTC.2008.6
sound, we could embed watermarking that could not be ⎧ 17 Δ z − 0.4 x ( z i ) + 11,−3 ≤ Δ z < −1
detected easily. ⎪ (0.4 x( z ) + 6) Δ + 11,−3 ≤ Δ < 0
⎪ z z
Yi ( z i , z j ) = ⎨
The calculation steps of the value of masking − 17 Δ z ,0 ≤ Δ z < 1

domain could be divided into 1) calculate power ⎪⎩− 17 Δ z + 0.15(Δ z − 1) x ( z i ),1 ≤ Δ z < 8
spectrum of original signals; 2) identify tonal
(4)
component and non-tonal component; 3) calculate the
masking value of a single masker; 4) calculate masking
Among them, X ( zi ) is the sound pressure level of
values of all maskers; and 5) confirm the minimum
masking value. masker that its critical frequency is z i , Δz i is the
The purpose of calculating the power spectrum of distance of the masker and the masked, Δz = z j − z i ,
the original signals is to divide sound signals the masking domain values of the masker with critical
s(i) into several sections. Assume the length of each frequency domain z i generated at frequency domain
section is N=512, sampling frequency is 44.1kHz zi T (z , z ) T (z , z )
are 1 i j or 2 i j .
that is divided into 24 critical frequencies. The The pseudo-tone masker is:
calculation formula of signal power intensity x(k ) is: :
T1 ( zi , z j ) = X ( zi ) + Y1 ( zi ) + Y f ( zi , z j )
(5)
N −1 2
1
X (k ) = 10 lg ∑ h(i) s (i ) exp(− j 2 ∏ ki / N )
N i =0 (1) The pseudo-narrow band noise masker is:

T2 ( z i , z j ) = X ( z i ) + Y2 ( z i ) + Y f ( z i , z j )
Among them, h(i) is Hanning window function that (6)
is used for reducing boundary effects. Map signals
from frequency domain to critical frequency domain. Among them, Y1 , Y2 are masking coefficients of
Critical frequency is a frequency domain psychology
or sound value measurement that reflects the pseudo-tone and pseudo-narrow noise respectively. Y f
frequency selectivity of human ears that could is the masking function. With the increasing difference
identify the minimum frequency width of masked of critical frequency Δ z , the masking effect will be
sound signals. reducing. So when Δ z < −3 Bark or Δ z ≥ 8 Bark, we do
Because the tone of masker will affect masking
domain value, the identification of tonal component not consider masking. At this time, we assume T1 ( z i , z j )
and non-tonal component needs to confirm the and T2 ( z i , z j ) is − ∞ .
characteristics of sound signals according to In calculating all masking values, because of the
distribution of short-time power spectrum of signals to superposition of masking, all masking values Ta ( z j ) at
identify pseudo-tone signals and pseudo-narrow band zj
noise signals. is the sum of the safety domain values T0 ( z j ) at the
point plus masking domain values generated by all
The masking domain value in calculations is
determined by sound pressure levels, self-masking tonal and non-tonal components at the point.
The sum is:
levels and masking functions. The self-masking levels
of pseudo-tone signals and pseudo-narrow band signals Ta ( z j ) = 10 lg ∑10 1 + ∑10 + ∑10
T ( z i , z j ) / 10 T2 ( z i , z j ) / 10 Ta ( z i , z j ) / 10

are different, their formulas are: i i i (7


)
Y1 ( Z i ) = −6.025 − 0.275 zi
(2) Based on the all masking values, the minimum
Y2 ( Zi ) = −2.025 − 0.175zi masking domain values can be figured out by dividing
all frequencies into 24 equal-frequency sub-
(3) frequencies. The minimum masking domain value is:
Among them, Zi is critical frequency domain.
Tmin ( n) = min(Ta ( z j )), z j ∈ n
The masking functions of pseudo-tone signals and (8)
pseudo-narrow band noise signals Y f (Z i, Z j ) are same; Combining characteristics of auditory sense of
human ears, apply the calculation method of auditory
sense domain values to find out candidate term that
could be embedded watermarking. As long as the
power frequency spectrum of sound signals is lower

60
than masking value, the corresponding frequency K=
N
quantile of power frequency spectrum will be masked. M (12)
And satisfy:
3. Proposal of Research
K ≤ Ld 1 (13)
The watermarking embedding method adopted in the
Embed K values of m sections (0<m<M) by order
research is to add simultaneous signals for test before
the position of embedded watermarking. The detail of into the sequence d1 of wavelet transform of sound
the method is that embed three equal-amplitude signal in corresponding the m section.
sampling points before the starting point that a
watermarking is embedded. Due to the small [ ]
⎧d i 1 + αω n (i − s0 ) , S 0 ≤ i < S 0 + k
d1' (i ) = ⎨ 1
amplitude, human ears may not detect the changes if
not listen carefully. ⎩ d1 (i), others,0 ≤ S 0 < Lm − k (14)
Adopting the idea of the dissolution of sound
watermarking signals introduced by D. Kirovski, et al., Among them, α is a private double factor, After
we divided the original sound effect signals that adjusting its values, we could ensure that the
needed to add into M sections, and conducted wavelet watermarking have enough strength while it is not
transform at L level in each section. Keep the detailed detectable for auditory sense. S 0 is the starting point of
quantile of preceding L-1 level and approximate
embedding watermarking in the sequence d1 .
quantile of small waves at L level unchanged.
In embedding watermarking, use the minimum
Embedded watermarking in the detailed quantile di at
masking domain value Tim (n) and private double factor
L level and assume the length of di is Ld i . α to modulate the amplitude of watermarking w(n) and
Because the image of QR code could be considered
modulate embedding strength to make the
as one kind of two-dimension image, a binary
watermarking more undetectable for auditory sense.
sequence QS is obtained:
After completing the embedding watermarking of
psychological and physiological model, we could
Qs = {Qsi ,0 ≤ i < n}, Qsi ∈ {− 1,1} ''
(9) obtain numerical sequence d tα .
''
Use d i , the L level approximate quantile and
Among them, Qsi represents the point in QR preceding L-1 detailed quantile to conduct inverse
Code image, 1 or -1 represents black or wavelet transform and obtain the signal of the m
white of the point. section after embedding the watermarking. Repeat
Use a N-length pseudorandom sequence P to modulate above-mentioned steps and will obtain signals of the M
S, and a sequence W is obtained: section that can be combined as signals after
embedding watermarking.
p = {pi ,0 ≤ i < N }, Pi ∈ {− 1,1} The essential part of the conception of the proposal
(10)
is using psychological and physiological models and
W = {wi ,0 ≤ i < N }, wi ∈ {− 1,1} simultaneous signals to bury QR Code images in sound
(11) files. After extracting QR Code, the sound
watermarking will link web pages according to the
w = s i ⋅ p i ,0 ≤ i < N hidden text information in QR Code. The integral
Among them i
W is the watermarking needing to be embedded. The conception of the research is as figure 2.
pseudorandom sequence P in formula (10) is very
important to the safety of algorism. Due to the
pseudorandom of P, W also has the characteristics of
pseudorandom that the attacker will be very difficult to
detect the watermarking and disrupt them without
understanding P.
In order to ensure the high quality of sound
signals after embedding watermarking and
enhance the stability of integral algorism, we divided
the watermarking W into M equal –length sections and
the length of each section is K.

61
Figure2 The Proposal of Enable Sound Repeating above-mentioned steps for all signals of
Watermarking to Link Internet by Embedding the M section, we could obtain watermarking signal w
'

QR Code '
of N-length signal. Use sequence P to adjust w and
'

In order to verify simultaneous watermarking in obtain the sequence QS :


experiments, we used Visual Basic in the research to QSi' = P(i) ⋅ w'i ,0 ≤ i < N
design software. The primary structure of software (16)
'
(ADW 2007) includes embedding QR Code data, From QS , we could obtain QR Code image hidden in
identifying watermarking data, retrieving digital watermarking.
watermarking data, feedback of error messages, QR Mi Mi

information driving. The over-all operation system is ∑∑ w(i, j )w (i, j )


'

as figure 3. ρ ( w, w) = i =1 i =1
Mi Mi Mi Mi

∑∑ w (i, j ) ∑∑ w
2 −2
(i, j )
i =1 i =1 i =1 i =1 (17)

Formula (17) is the definition of focused evaluation


of watermarking itself. The value of ρ is between 0
and 1. The test mistaken or test missing of
simultaneous signals will affect greatly whole sound
watermarking. The definition of test mistaken is testing
out a simultaneous signal in a sound signal that is not
embedded digital watermarking. The test mistake will
lead to the random data behind the position being
mistaken as watermarking data and obtain inaccurate
or meaningless QR Code image. So-called test missing
is not testing out an equal and simultaneous signal in
data that is embedded watermarking. The test missing
will lead to the missing of some data and the QR Code
read out will be incomplete or even unreadable. The
test mistaken of simultaneous signal has relation to
Figure3 The Implementation of Proposed ADW only cycle ρ of the M sequence and domain value T.
2007 Operation System After calculation, we could obtain the test mistaken
rate of simultaneous signal as follows:
Regarding the test-impending sound signals, firstly
we have to find out the starting point for embedding p
1
watermarking. Beginning from the starting end of
signals, we have to test by order the amplitude of each
H0 = p
• ∑ C pk
2 k = p −e
node. If the amplitudes of linked starting points of
(18)
three linked nodes are close to the amplitudes of
simultaneous signals in watermarking test, then we k

could confirm that the next node is the starting point Among them, e=(P-T)/2, C p is number of
for watermarking embedding. combinations.
In a condition of set BER, we could obtain test
Repeat divide of original sound and use di and S0 , missing rate of simultaneous signal as follows:
W m' p
we could obtain the watermarking signals of the
M section from the following formula:
H1 = ∑C k
p • ( BEF ) k • (1 − BEF ) p − k
k = e +1 (19)

⎧ d1' (i + S 0 ) After obtaining QR Code image, we have to


⎪ + 1.1 − ≥0
⎪ d1 (i + S 0 ) conduct the interpretation of QR Code. In VB program,
Wm' = ⎨ after the entire QR Code going through interpretation
⎪ − 1, others
⎪⎩ 0≤i<k program, the interpreted data will be driven through
(15) the Sell function in VB program :

Sell (“Start to interpret positions of text file in data”)

62
Figure 5 The Shape of Wave after Embedding
The data could link the target website in the Watermarking
information hidden by QR Code. The major program Compare the watermarking before and after
code of QR Code cannot be reveal because it is the embedding, we could find that there is no significant
property of PSYTEC (Japan). difference between them in the shape of waves. For
verifying the robustness of test algorism, we conduct
sound watermarking test according to Step 2001 rules
4.The Results of Experiment and (as tables 1) that the operators currently use for the test
Evaluation of watermarking software.
Under the same model, general environmental
The research will evaluate from two aspects: the noise and shouting from animals require relatively low
objective function test of digital sound watermarking quality of sound watermarking. The reason is that there
itself and subjective evaluation test of software. In the is indirect correlation with the anticipation of sound by
experiment of the objective function test of digital human psychology. So the same sound watermarking
sound watermarking itself, we firstly select a QR Code will have different identification ratios after the sound
that is converted into two-dimension image model with watermarking is embedded. From the following data,
size 60 x50. We adopted the sound of 16 bit 44.1kHz we could find that different from general watermarking
(CD format) as the original signal of digital that is emphasize only on the technological aspect,
watermarking. Its shape of wave is shown as figure 4. after the embedding of the digital sound watermarking,
the result of the test of the sound effect shows that
under concentration on listening to sound, most of
human beings will unconsciously generate anticipated
emotions to the sound, particularly, the insight will
increase when they listen to a much familiar sound.
The anticipated emotion will also affect sound
watermarking in its capability to avert keen perception.
Therefore, after tests of different categories of music,
we found that future digital watermarking has to
modify slightly for different categories of music in the
Figure 4 The Shape of Wave before Embedding development of technology to address the effects of the
Watermarking human inherent anticipated emotions to some specific
sounds on the detect ability of watermarking.
The shape of wave when QR Code is considered as
watermarking data and after smoothly embedding.(as Table 1 The Step 2001 Specification List for the
figure 5) Test of Digital Watermarking Robustness

63
Test Items Content of Handling Co Noise Low- Re- Not
D/A, A/D Digital Signal →Analogue Signal mpr PassFilt Sampli Attacke
Transform →Digital Signal essi er ng d
Frequency Stereo Sound (2 Channels ) → on
Number Single Ear(1 Channel) Wavelet ρ =0 ρ =0. ρ =0.83 ρ =0.86 ρ =1.0
Transform Transfor .647 7614 52 32
Downward 44.1kHz/16bit/2ch m 1
Sampling →16kHz/16bit/2ch Psycholo ρ =0 ρ =0. ρ =0.93 ρ =0.91 ρ =1.0
Amplitude 44.1kHz/16bit/2ch→44.1kHz/8bit/2 gical and .841 8232 05 17
Compression ch Physiolo 0
Time and Bit Time Compression/Extension : - gical
Compression 10% & +10% Models
and Extension Bit Moving Compression/Extension:
-10% & +10% From the above table, we could learn that under the
Linear Data MP3:128kbps,96kbps,64kbps(Mono same length of the original sound signal and the same
Compression ) level of effects of noise (by noise rate) on original
MPEG2 AAC:128kbps,96kbps medium signal, using psychological and physiological
ATRAC(MD): Version 4.5 models to design will have better undetectability of
ATRAC 3: 132kbps,105kbps sound.
Real Audio:128kbps,64kbps In the experiment of software subjective test, we
Window Media adopted ABX subjective test method (as figure 6) and
Audio:128kbps,64kbps practical testing out watermarking. In ABX subjective
Non-Linear FM,AM,PCM test, the test people broadcast and arrange orderly the
Data original sound signal and embedded watermarking
Compression sound signal as following figure. The test people firstly
Frequency FM,AM,PCM broadcast original sound effects (A) for 40 seconds
Response before broadcast embedded watermarking sound
Transform effects (B) for another 40 seconds. After repeating the
Noise Test White Noise: S/M:-40dB steps once, randomly select sound (A) or (B) and
broadcast.
In implementing test, we retrieved part of sub-items of After experiencing sound test for 160 seconds, the
Step 2001 and conducted test: 1) linear MP3 testees have to answer if the last sound (X) is orginal
compression/ decompression, compression rate is 5:1; effect (A) or embedded watermarking sound effects
2) add Gauss white noise, average value is 0, and (B).
mean-square error is 0.01; 3)low pass filter: use
Chebychev low filter that the length is 9 and cut-off
frequency is 5kHz; 4) re-sampling, conduct downward
sampling of from 44.1kHz to 22.05kHz before conduct
interpolation of from 22.05kHz to 44.1kHz. Under the
condition of the same length of original sound signal
and of the same noise rate after embedding
watermarking, we also conducted experiment and Figure 6 The Procedure of ABX Sound
comparisons of two groups: 1) singly adopted wavelet Watermarking Test
transform; 2) adopted the method mentioned in this
article. Meanwhile, we found the situations of above- The scoring standard of ABX sound effects test is
mentioned two methods under attack. The results of shown as following table. If the testees successfully
comparison are shown as table 2. identify X that is embedded into watermarking sound
Table 2 The Comparisons of Traditional Wavelet effect in the process of test, the undetectability of
Transform and Psychological and Physiological entire watermarking will be in doubt and the scores
Model will be reduced in calculating success rates of
identifying entire sound watermarking.
Table 2 Evaluation List for ABX Test

64
Daily Life %
X Sound Talks 19 Middle to High
Judgment Original 19 81
Add Watermarking %
by Listener Sound POP Music 33 67 33 High
to Sound (B)
(A) %
Original Sound Right Wrong Answer Classic 35 65 35 High
(A) Answer Music %
Add Wrong Detected, Reduce 1 Total 116 384 23. -
Watermarking Answer Score 2%
to Sound (B)
Table 4 The Judgment List for Determining the
Level of Step 2001 Sound Watermarking Software
Item Perfect Practic Impracticab
In the research, we took 100 samples for ability ility Level
identification test. The average successful Level
identification rate was 23%. Among them, the lowest Test of Robustness More More Less than
rate is for identifying the sound effects of classic (21 Items) than than 50% are
90% are 50% Qualified.
music. According to the feedbacks of the qualifie are
questionnaires by most of the surveyed, the classic d* qualifie
music has more connectivity of melody that will make ( * : d. *
listener have more melody anticipation in music itself general
and thus the listener will require higher quality of sound)
sound. Test of Sound Quality Less Less >30%
(Detected) than 10% than
Under the same model, general environmental noise 30%
and shouting from animals require relatively low Other Extraction <10 10~3 >30
quality of sound watermarking. The reason is that there Tests time Seconds 0 seconds
is indirect correlation with the anticipation of sound by second
s
human psychology. So the same sound watermarking
Reliability No No Yes
will have different identification ratios after the sound False False
watermarking is embedded. From the following data, Positive Positiv
we could find that different from general watermarking e
that is emphasize only on the technological aspect, Simultaneo At the Has a Inability to
us same little do
after the embedding of the digital sound watermarking, Operation speed lag simultaneo
the result of the test of the sound effect shows that time usly
under concentration on listening to sound, most of Linear Data Able Partiall Unable to
human beings will unconsciously generate anticipated Compressi to resist y able resist
on to
emotions to the sound, particularly, the insight will resist
increase when they listen to a much familiar sound.
The anticipated emotion will also affect sound The limit value of data storage capacity of QR Code
watermarking in its capability to avert keen perception. is around 1000 words. But under the combination of
Therefore, after tests of different categories of music, sound watermarking and QR Code, in theory, it could
we found that future digital watermarking has to also store as many as around 1000 words. However,
modify slightly for different categories of music in the this research focuses on the implementation of using
development of technology to address the effects of the QR Code as watermarking data to link web pages.
human inherent anticipated emotions to some specific Therefore, the precision level of integral images is not
sounds on the detectability of watermarking. (as Table complete as existing QR Code. So we reached the level
3) that embedded only 10 words in the research and
Table 3 Different Categories of Music Correspond Internet linkage driving has to rely on built-in VB Sell
to the Test Out of Digital Watermarking Function to address linkage driving that is not tested in
Being the other systems than Window XP.
Categories Not
Succe
of Iden Requirement of According to the standards of Step 2001 sound
ssfully Rate
Embedded tifie Sound Quality watermarking test and summing up the statistics of
Identif
Sound d questionnaires filled out by users, we could induct that
ied
Shouting of 13 87 13 the method introduced by the research would meet the
Low to Middle requirements of practicability level of Step 2001(as
Animals %
Noise in 16 84 16 Low to Middle table 4). But there is still room for the improvement of

65
the enhancement of linear compression and the Decomposition,IEIEC Trans.Fundamentals,(S1745-1337),2004,87-
capability of simultaneous operation. A(7):1647-1650
[9]Wu S-Q,Huang J-W,Huang D-R,”Efficiently self-synchroized
5.Conclusion audio watermarking for assured audio data transmission,IEEE
In the research, we adopted wavelet transform trans”,2005,51(1):69-76.
combining the masking effects of auditory sense [10]Swanson M,,Zhu B,Tewfik A,el al. “Robust audio watermarking
system of human ears to conduct the design of sound using perceptual masking”,signal processing,1998,66(3):337~335.
watermarking. Taking advantage of perception [11]Cvejic N, Keskinarkaus A, Seppane T.”Audio Watermarking
redundance of original sound signal to control the using m-sequence and temporal masking”,New
quality of watermarking and reduce the chances of York:IEEE,Workshops on Applications of signal processing to audio
detectability by human ears. However, the existing and acoustics,New Paltz,2001.227~230.
algorism has lag time in corresponding simultaneous [12]Lee S K, Ho Y S,”Digital audio watermarking in the cepstrum
operation. So the algorism has room for improvement. domain”,IEEE Trans,2001,46(3):744~750
In the research, we proposed an idea that using QR
Code to hide information about website and strengthen
the application of sound digital watermarking. The idea Wu,Yi-ju (Member)
will become a new direction in developing cost-saving Mr. Wu earned his master degree in GITS
communication model in future. Same sound from Waseda University, Japan. He
watermarking having different results of sound test and worked with CPT Company and Photonics
how to enhance the level of precise identification of Industry and Technology Development Association ,
QR Code in sound watermarking may become new an affiliated organization of National Science Council,
research directions for follow-up researcher in future. Taiwan, as a researcher. He is currently a technology
commentator of Weekly Display and China Times and
student of Doctoral Program , Waseda University. He
References specializes in multi-media communications and FPD
[1]F.Jordan,M.Kutter,T.Ebrahimi,“Proposal of a watermarking technology. He is also an official member of IEICE,
technique for hiding/retrieving data in compressed”,MPEG97,1997. IEEE, ITE, IETC and IPS.
[2] Naoshisa komatsu,”Digital watermarking technology”,NII,2004
[3] D Kirovski, H Malvar. Spread Spectrum Watermarking of Audio Shigeru Shimamoto (Member)
Signals. IEEE Trans. Signal Process,2003,51:1045~1053 Professor Shimamoto earned his Doctor
[4] OONO,”Digital Watermarking and Contents degree in Tohoku University, Japn. He
protection”,OHMSHA,pp.140~181,Fab,2001. entered into NEC in 1987. Now he is a
[5]Cox I J,Matt L Miller.”The first 50 years of electronic professor in GITS of Waseda University,
watermarking”,Journal of Applied Siginal Processing,2002,126~132. Deputy Editor-in-Chief of Editing Committee of
[6]Kim H J.”Audio Watermarking Techniques”,Pacific Rim IEICE. Professor Shimamoto specializes in multi-
Workshop on Digital Steganography,Kyushu Institude of media wireless communications, optical
Technology,Kitakyushu,Japan,July3~4,2003. communications. He is an official member of IEEE
[7]Arnold M,”Audio Watermarking:Buying information in the and IEICE. Writings: Wireless Communication
data”,Dr Dobbs’s Journal,2001,11(1):21~28 Technology, Digital Broadcast, etc.
[8]Byeong-Seob KO,Ryouichi Nishimura,Yoiti Suzuki,”Robust
Watermarking Based on Time-spread Echo Method with Subband

66
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

An Enhanced Trust Model Based on Reputation for P2P Networks

Xu Wu Jingsha He Fei Xu
College of Computer Science School of Software College of Computer Science
and Technology Engineering and Technology
Beijing University of Beijing University of Beijing University of
Technology Technology Technology
Beijing 100022, China Beijing 100022, China Beijing 100022, China
zhixinxuxu@emails.bjut.edu.cn jhe@bjut.edu.cn Xf8878xf@emails.bjut.edu.cn

Abstract people. Some traditional security techniques, such as


identity-based authorization in peer-to-peer networks,
Trust plays an important role in peer-to-peer (P2P) can be used as protection means against known
networks for enabling peers to share resources and malicious peers. However, they cannot deal with
services. It is necessary to build an effective security unknown peers and can hardly manage an exceedingly
trust model to solve the trust problem during large number of users and/or sizable resources.
interactions between peers. In this paper, we propose Mechanisms for trust and reputation can be used to
an enhanced trust model based on reputation. We help peers distinguish good partners from bad ones.
apply SupP2Prep [1] in our trust model, which is a In this case it is essential to build a reliable and
protocol for reputation management via polling in P2P secure trust model based on reputation to determine
networks, but with three adjustments. Most of the how much one peer in the network should trust another
existing trust models only consider the poll problem peer to whom it is not connected. With the help of trust
from the perspective of trust. Here, we consider voting model, trustworthy peers can be identified and trust
for peers from the perspectives of both trust and relationship established between peers. Based on an
distrust. Our work appears to be the first attempt to established trust relationship, malicious peers are
incorporate distrust in the polling algorithm. The prevented from interactions, which can greatly enforce
proposed model is shown to be robust in the presence the security of interactions.
of attackers through simulation. In the current literature, many trust models based on
reputation have been proposed for P2P networks, such
1. Introduction as EigenTrust [2] and a reputation-based trust model
proposed by Xiong and Liu [3] and so on. Yet, little
has been done to show how trust and distrust can be
In a peer-to-peer (P2P) network, there is no central incorporated into a trust model to yield an outcome that
authority or infrastructure that could coordinate the is beneficial to trust computation. In fact, even a small
behavior of the peers. A peer can act both as a server amount of information about distrust can tangibly help
and a client since it can provide services to other peers judge about a peer’s trust [9].
as well as request services from other peers. Any peer We propose in this paper an enhanced trust model
can arbitrarily join or leave the network at any time based on reputation in which we incorporate trust and
and each peer itself is responsible for making local distrust into a single model by the polling algorithm. In
autonomous decisions based on information received our trust model, a peer’s trustworthiness is defined by
from other peers in the network. Therefore, an open an evaluation of the peer in terms of the level of
P2P network is highly dynamic and autonomous. reputation it receives in interacting with other peers in
Established based on such an environment, the past. Protocol SupP2Prep [1] for reputation
interactions usually happen between stranger peers. management via polling in P2P networks is used in our
Thus, it has brought about a series of security problems model, but with two adjustments.
including sending unreal interacting information, First, peers in a P2P network are divided into
posing other peers to provide bad services, and so on, different groups based on their interests and only the
though P2P networks bring more conveniences to members that belong to the same group are permitted

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 67


DOI 10.1109/SUTC.2008.28
to vote. Being recognized as context-dependent makes differentiates between two types of trust, trust in the
trust related to the roles of peers (for example, a host’s capability to provide the service and trust in the
different degree of trust for a peer acting as a doctor host’s reliability in providing recommendations.
rather than acting as a book seller). A peer may belong Donato, Paniccia and Selis [5] proposed new
to different groups based on its different interests. All metrics for reputation management in P2P networks.
member peers of a group not only have the same The work combines these metrics with the original
interest, but also share a common way of thinking. EigenTrust approach. The main contribution is that of
Before computing the trust value, a peer must join a introducing a number of new attack models not
group based on its own interest. The proposed trust addressed before and a new metric called dishonesty.
model is different from SupP2Prep in which peers Tran and Cohen [6] proposed a reputation oriented
aren’t divided into different groups based on their reinforcement learning algorithm for buying agents in
interests. The another advantage is that the polling electronic market environments, taking into account
processes can save more bandwidth and processing the fact that the quality of a good offered by different
power than those in SupP2Prep since the requesting selling agents may not be the same and that a selling
peer need only broadcast messages to all other peers. agent may alter the quality of its goods.
Second, distrust is taken into account. Our model A reputation-based management protocol [7] is
considers voting for peers from the perspectives of proposed by Selcuk et al in which the reliability of
both trust and distrust, which is inspired by [9]. peers is calculated through the outcomes of past
This paper is organized as follows. Section 2 interactions and is saved in trust vectors.
describes related work. Section 3 presents the proposed SupP2PRep [1] is a protocol of reputation
trust model. Section 4 contains experimental study. management developed on the top of Gnutella 0.6 in
Finally, we conclude this paper in Section 5. which reputation management is based on a distributed
polling protocol and the requesting peer broadcasts
2. Related work messages to all other peers.
Dewan stressed in [8] the need to introduce external
In a P2P network, trust and trust relationship have motivation for peers to cooperate and be trustworthy
been the subject of earlier research and repuation-based and recommended the use of “digital reputations”
systems have been used to establish trust among peers which represents the online transaction history of a
where parties with no prior knowledge of each other host.
use feedbacks from other peers to assess the Guha et al. [9] proposed a framework for trust and
trustworthiness of the peers in the network. The distrust propagation on networks. The main
existing trust and reputation mechanisms provide the contribution is to address the conceptual and
foundation for the proposed research. computational difficulties to propagate both trust and
EigenTrust [2] model is designed for the reputation distrust.
management of P2P systems. The global reputation of Runfang Zhou and Kai Hwang [10] proposed a
peer i is marked by the local trust values assigned to power-law distribution in user feedbacks and a
peer i by other peers, which reflects the experience of computational model, i.e., PowerTrust, to leverage the
other peers with it. The core of the model is that a power-law feedback characteristics. The paper used a
special normalization process where the trust rating trust overlay network (TON) to model the trust
held by a peer is normalized to have their sum equal to relationships among peers. PowerTrust can greatly
1. The shortcoming is that the normalization could improves global reputation accuracy and aggregation
cause the loss of important trust information. speed.
A reputation based trust model [3] proposed by Surprisingly, most previous work has failed to
Xiong and Liu is developed for P2P e-commerce consider the importance of distrust in the calculating
communities. Five important trust factors are combined process of the trust values.
to define a general trust metric which is used to
compute trustworthiness of peers and to address the 3. The trust model
fake or misleading feedbacks. The metrics can also
adapt to different communities and situations. However, In this section, we first present the definition and
the model doesn’t consider the influence of a history type of trust and introduce the seven trust factors. We
record to trust since trust always deteriorates with time. then present the details about how to evaluate the
A Bayesian network-based trust model [4] proposed trustworthiness of peers by the seven trust factors.
by Wang and Vassileva uses reputation built on
recommendations in P2P networks. The work

68
3.1 The definition and type of trust important factor that should be considered in the trust
model. For peers without any interacting history, most
previous trust models often define a default level of
Trust has been examined in many contexts,
trust. But if it is set too low, it would make it more
including sociology, social psychology, economics and
difficult for a peer to show trustworthiness through its
marketing. Each context has a unique perspective on
actions. If it is set very high, there may be a need to
the notion of trust. In this paper, we define trust as an
expectation about the behaviors of what an individual, limit the possibility for peers to Āstart overā by
say A, expects another individual, say B, to perform in re-registration after misbehaving. In our trust model,
a given context. the introduction of the size of interactions effectively
Our trust model has two types of trust: direct trust solves the trust problem of peers without any
and recommendation trust. Direct trust is the trust of a interacting history. The details will be described in 3.3.
peer on another based on the direct interacting Time: The influence of an interacting history record
experience and is used to evaluate trustworthiness to trust always decays with time. The more recent
when a peer has enough interacting experience with interactions have more influence on trust evaluation of
another peer. On the other hand, recommendation trust a peer. For instance, if peer X has interacted with peer
is used when a peer has little interacting experience Y for a long time, the change of trust degree influenced
with another one. Recommendation trust is the trust of by the interaction three years earlier is weaker than that
a peer on another one based on direct trust and other of today. In our trust model, we introduce time factor
peers’ recommendation. In our model, peers’ to reflect this decay, that is, the most recent interaction
recommendation is received through a polling protocol. usually has the biggest time factor.
Vote accuracy: We use a distributed polling
3.2 The trust factors algorithm [1] to collect peer’s reputation information in
our model. Vote accuracy factor reflects the accuracy
In order to effectively evaluate the trustworthiness that a peer votes for other peers. For example, if a peer
of peers and to address various malicious behaviors in correctly votes for other peers, it will have a high vote
a P2P network, we design and develop an enhanced accuracy factor. The purpose of introducing this factor
trust model based on reputation. Seven trust factors are is to encourage peers to vote actively and correctly in
identified in evaluating trustworthiness of peers. our model. As if their suggestions are more worthy of
Satisfaction or dissatisfaction degree in belief, people are always honest in the real society.
interactions: When a peer finishes an interaction with Punishment function: The measurement of trust is
another peer, the peer will evaluate its behavior in the the accumulation of the effects of interactions, both
interaction. The result of evaluation is described using positive and negative. We not only consider the decay
satisfaction or dissatisfaction degree which is in the of influence of the interaction experiences with time,
range (-1, 1). Satisfaction and dissatisfaction degrees but also punish malicious actions. Punishment should
express how well and how poor this peer has be involved by decreasing its trust degree according to
performed in the interaction, respectively. Satisfaction the amount of malicious behaviors. Therefore we
or dissatisfaction degree can encourage interacting introduce the punishment factor in our model to be
sides to behave well during interactions. However, it is used to fight against subtle malicious attacks. For
sufficient to measure a peer’s trustworthiness without instance, if a peer increases its trustworthiness through
taking into account the number of interactions. well-behaving in small-size interactions and tries to
Number of interactions: Some peers have a higher make a profit by misbehaving in large-size interactions,
interaction frequency than some other peers due to a the peer would need more successful small-size
skewed interaction distribution. A peer will be more interactions to offset the loss of its trust degree.
familiar with other peers by increasing the number of Risk: Every peer has its own security defense
interactions. This factor is related to the interaction ability which is reflected by risk factor, such as the
authority which is described in 3.3. If a peer wants to ability to detect vulnerabilities, the ability to address
get more high interaction authority, it needs to do more any viruses and to defend against intrusions.
successful interactions with other peers.
Size of interactions: Size has different meanings in 3.3 The computational model
different P2P environments. For example, in a P2P file 
sharing network, the size of interaction expresses the Consider the situation where peer X wants to
file size shared in each interaction, while in a P2P interact with peer Y in order to accomplish a certain
business community, it shows the sums of money task. Peer X won’t interact unless it is sure that peer Y
involved in each interaction. Size of interactions is an is trustworthy. In order to find out whether peer Y is

69
trustworthy, peer X calculates a trust value for peer Y. By introducing the notion of level, new peers are
There are two ways in which to calculate trust value: given the chances for interactions, which solves the
direct and recommendation. When peer X has enough trust problem when no prior interaction history exists,
interaction experience with peer Y, peer X uses direct an issue that has not been addressed in many models.
trust to calculate the trust value for peer Y. On the At the same time, it prevents new peers from cheating
other hand, when peer X doesn’t have enough big time by having interaction chances. The direct trust
interaction experience with peer Y, peer X uses value Tx ( y ) is defined as:
recommendation trust to calculate the trust value for N ( y)
S ( x, y ) M ( x, y ) Z 1
peer Y. In our paper, an interaction experience Tx ( y) D ¦ (  pen(i) )  E Risk ( y)
N ( y) 1  e n
threshold is predefines based on the size of interactions i 0

in a P2P network. For example, the threshold is lower (1)


when downloading a 1M confidential file than a 200M where D and E and are weighting factors that satisfies
confidential file. If peer X’s interaction experience the condition D  E 1 . N ( y ) denotes the total
with peer Y exceeds this predefined threshold, peer X
number of interactions that peer X has performed with
chooses direct trust to calculate the trust value for peer
Y, otherwise, it chooses recommendation trust. peer Y and S ( x, y ) denotes the peer X’s satisfaction
degree of interaction in its ith interaction with peer Y
3.3.1. Direct trust value. Direct trust is denoted which is in the range of (-1, 1). M ( x, y ) is the ratio
between the size of the ith interaction and the average
as D (Tx ( y ), S ) . Where Tx ( y ) is the direct trust size of interactions which reflects the importance of the
ith interaction among all the interactions that peer X
value that peer X calculates for peer Y. S expresses has performed with peer Y. Therefore,
peer Y’s level of size of interaction which is granted by mi
M ( x, y ) where mi is the size of the ith
peer X. The level of size of interaction in a P2P mv
business community is shown in Table 1. interaction and mv is the average size of all
interactions. We use Z to denote the time factor. Thus,
Level Bottom limit Top limit 1
Z u (ti , tnow ) , Z  (0,1) (2)
1 m0 m1 tnow  ti
where ti is the time when the ith interaction occurs
2 m1 m2
and tnow is the current time. pen(i ) denotes the
…. …. …. punishment function and

n mn 1 mn 1, if the ith interaction fails (3)


pen(i ) =
Table 1. The level of size of interaction 0, if the ith interaction succeeds
In Table 1, m0 d m1 d m2 d  d mn , where 1
is the acceleration factor where n denotes
m denotes the size of interaction in a P2P business 1  e n
community, e.g., $100, $1000, $10000, etc. The level the number of failures. It can make trust value drop fast
of size of interaction satisfies the following rules. when an interaction fails. As this factor increases
(1) The lowest level is given to a new peer that with n , it helps avoid heavy penalty simply because of
doesn’t have any interaction history.
(2) A certain level is updated if the number of a few unintentional cheats. Finally, Risk ( y ) is used to
successful interactions reaches the predefined express the risk factor.
number in the level. The predefined number is
decided by the peer itself. The lower the current 3.3.2. Recommendation trust value. When two peers
level is, the more the number of successful have litter interaction experience, other peers’
interactions it needs. recommendation is needed for trust establishment.
(3) The predefined successful interaction number in a Recommendation trust is the trust of a peer on another
certain level is increased if interactions fail due to one based on direct trust and other peers’
malicious activities. recommendation. Recommendation trust is calculated

70
based on an enhanced polling protocol to be described and T ' the vote result from the perspective of distrust.
below. T is given as
Let we assume that peer Y requests an interaction
with peer X and the size of the interaction is Q . First, N ( w)

peer X computes peer Y’s direct trust denoted ¦ R(w) u p


i 1
as D (Tx ( y ), S ) . T (6)
N ( w)
(1) If Q d S and Tx ( y ) reaches a certain value where N ( w) denotes the total number of votes and
(which is set by peer X), peer X considers peer Y R( w) denotes peer w ’s vote accuracy factor which is
to be trustworthy. It will then decide to interact
with peer Y. in the range of (0, 1). p is related to DTw ( y ) such
(2) If Q d S but Tx ( y ) fails to reach a certain that if DTw ( y ) ! 0 , p 1 , else p 0 .
value, peer X chooses to join a group based on its
interest and requests all other members of the 4. Experimental study
group to cast a vote for peer Y from the
perspective of trust and distrust in the level of Q . Experiments have been carried out to study the
For any new peer without any interaction history, effectiveness and the benefits of our proposed model.
its trust value would be 0 and would be granted In a real environment, there may exist some vicious
the lowest level of the size of interaction. Without attacks including malicious recommendations or
voting, it will be permitted to interact at the lowest cheating in the accumulation of trust in small-size
level. interactions. In addition, it should solve the trust
(3) If Q t S but Tx ( y ) fails to reach a certain problem when there is no interaction history or little
trust value.
value, peer X immediately refuses to interact with
peer Y. 4.1. Malicious recommendation
(4) If Q t S and Tx ( y ) reaches a very high value, 
peer X chooses to join a group based on its The first experiment results show that our proposed
interest and then requests all other members of the trust model can effectively prevent malicious peers
group to cast a vote for peer Y from the from providing malicious recommendations. In the
perspective of trust and distrust at the level of Q . recommendation process, a group of malicious peers
can collaborate and send incorrect recommendation
Second, after the other peers receive the poll request
results. For example, when peer X requests other peers
message, they will decide whether to cast the vote
to cast a vote for peer Y, if all peers are honest, peer
based on the following formula. Let E denotes a voting Y’s recommendation trust value calculated based on
peer, then their vote result is 0.3. However, peer Y may
N ( y)
S (e, y ) M (e, y ) Z 1 (4)
DT ( y )
e ¦ (
Ne ( y)
 pen(i )
1  e n
) collaborate with all or part of these peers to improve its
i 1
recommendation trust value by sending the incorrect
where DTe ( y ) is the poll value of e in Y. vote results to peer X so that peer X has the wrong
judgment. It is an act of cheating on behalf of peer Y.
N ( y) denotes the total number of interactions E Our proposed model can effectively solve the problem.
has conducted with Y at level Q . S (e, y ) denotes In the experiment, we set the total peers to 120
peer E ’s satisfaction degree in the ith interaction among which 20 are malicious peers. Figure 1 shows
with y from the perspective of trust or the the simulation result in which the broken line denotes
dissatisfaction degree from the perspective of distrust. the recommendation trust value Tm that includes
malicious peers’ recommendations and the solid line
M (e, y ) is the ratio between the size of the ith
denotes the real recommendation trust value Tr that
interaction and the average size of interactions. doesn’t include any malicious recommendations. We
pen(i ) is the punishment function. can see that Tm fluctuates around Tr but the scale of
Lastly, X calculates the recommendation trust the fluctuation is very small. So we conclude that the
which is given as: voting of the malicious peers has little effect on the
RTx ( y ) (T  T ') (5) real recommendation trust in our trust model. One of
the main reasons is the introduction of the vote
T is the vote result from the perspective of trust accuracy factor which makes the honest peers’

71
recommendation to take a higher weight in the trust degree.
calculation of the recommendation trust. 0.9

0.85

0.9 0.8

0.8 0.75

0.7
0.7

0.6
0.65
0.5
0
30 60 90 120 150 180 210 240 270 300
0.4

0.3

0.2 Figure 2. The relationship between the direct trust


0.1 value and the number of interactions
0
3 5 7 9 11 13 15 17 19 21

4.3. The trust problem for a new peer


Figure 1. The relationship between the wrong polling 
number and the recommendation trust For peers without any interacting history, most
previous trust models often define a default level of
trust. The problem is that if this level is set too low, it
4.2. Cheating in the accumulation of trust may be difficult for the new peer to prove its
 trustworthiness through actions. If this level is set too
In our trust model direct trust has the property of high, there may be a need to limit the possibility that
rising slowly and dropping fast. The introduction of peers would have to “start over” by re-registration after
level of size of interaction limits interactions to certain misbehaving. In our trust, the introduction of the size
levels. Consequently, attack is prevented when of interaction effectively solves the trust problem of
malicious peers cheat in large interactions through peers without any interacting history.
improving trust value using many small interactions. We assume that peer X is a new peer which has no
Our second experiment verifies the effectiveness of the interaction experience with other peers. If it wants to
proposed model by simulating the relationship between interact with peer Y, in our trust model, its trust value
the direct trust and the interaction number of malicious is 0 and it would be granted the lowest level of size of
peers. In the experiment, we assume that a peer’s vote interaction. Without any voting, peer X would be
accuracy factor is 1 and we conducted 150 interactions. immediately permitted to interact at the lowest level.
If there is no fraud, the satisfaction degree of Therefore our trust effectively solves the trust problem
interaction is 0.9 as shown in Figure 2. We can see for a new peer. As the new peer is granted the lowest
when the peer has a behavior of fraud in the 101st level of size of interaction, it is prohibited from
interaction, its direct trust value would drop to 0.85. cheating in larger size interactions by giving
The peer has 50 small interactions from the 101th interaction chances.
interaction to 151th interaction, but there is a
difference between the current direct trust value and 4.4. The influence of the time factor
the previous 101th one. In the 151th interaction, the 
peer has another malicious behavior resulting in its The third experiment shows that our proposed
direct trust value dropping faster. When it has one model can truly reflect the influence of interaction
more cheating behavior, its trust value continues rapid history to trust which always decays with time, as
dropping. All together, though the peer has shown in Figure 3. Many previous trust models failed
successfully conducted 100 small-size interactions, the to consider the decay influence, so the calculation of
direct trust value could not recover the original trust trust values is not very accurate. As the result, trust
value. The main reason is that after one malicious values are usually computed higher than the real ones
behavior, a peer needs to successfully conduct many in those trust models. Our proposed model can well
more honest interactions to make up for the loss of solve this problem.
trust value. If it conducts small-size interactions, it
needs to have more interactions to offset the loss of its

72
1
References
0.9

0.8
[1] S. Chhabra , E. Damiani, S. Paraboschi and P. Samarati,
“A Protocol for Reputation Management in Super-Peer
0.7 Networks”, in Proc. 15th International Workshop on
Database and Expert Systems Applications, Zaragoza, Spain,
0.6
August 2004, pp. 973-983.
0.5 [2] S.D. Kamvar, M.T. Schlosser and H.G. Molina, “The
EigenTrust Algorithm for Reputation Management in P2P
0.4
Networks”, in Proc. 12th International Conference on Word
0.3 Wide Web, Budapest, Bulgaria, May 2003, pp. 640-651.
[3] L. Xiong and L. Liu, “A Reputation-Based Trust Model
0.2 for Peer-to-Peer eCommerce Communities”, in Proc. IEEE
0.1
International Conference on E-Commerce, San Diego, CA,
July 2003, pp. 228-229.
0 [4] Y. Wang and J. Vassileva, “Trust and Reputation Model
in Peer-to-Peer Networks,” in Proc. 3th International
Figure 3. The relationship between time and trust value Conference on Peer-to-Peer Computing, Sweden, September
2003, pp. 150-157.
[5] D. Donato, M. Paniccia, M. Selis, C. Castillo, G. Cortese
5. Conclusion and future work and S. Leonardi, “New Metrics for Reputation Management
in P2P Networks”, in Proc. 3rd International Workshop on
Adversarial Information Retrieval on the Web, Canada, May
In this paper, we proposed an enhanced trust model
2007, pp. 65-72.
based on reputation. Seven trust factors were identified [6] T. Tran and R. Cohen, “Modeling Reputation in Agent
in the evaluation of trustworthiness of peers. The Based Marketplaces to Improve the Performance of Buying
trustworthiness is evaluated based on the direct Agents”, in Proc. 9th International Conference on User
interaction experience and other peers’ Modeling, Pennsylvania, June 2003, pp. 273-282.
recommendations. We applied SupP2Prep [1] in our [7] A. Selcuk, E. Uzun and M. Pariente, “A
trust model, which is a protocol for reputation Reputation-Based Trust Management System for P2P
management via polling in P2P networks, but with Networks”, in Proc. IEEE International Symposium on
three adjustments. Most of the existing trust models Cluster Computing and the Grid, Chicago, IL, April 2004, pp.
251-258.
only consider the poll problem from the perspective of
[8] L. Mui, M. Mohtashemi and A. Halberstadt, “A
trust. In this paper, we consider voting for peers from Computational Model of Trust and Reputation Agents:
the perspectives of both trust distrust. Our work Evolutionary Games and Social Networks,” PhD Thesis,
appears to be the first to incorporate distrust in the Massachusetts Institute of Technology, 2002.
polling algorithm. The proposed trust model provides [9] R. Guha, R. Kumar, P. Raghavan and A. Tomkins,
an effective scheme to prevent vicious attacks and “Propagation of Trust and Distrust”, in Proc. 13th
solves the trust problem of a new peer. In the future we International Conference on Word Wide Web, New York,
will further explore new mechanisms to make our trust NY, May 2004, pp. 403-412.
model more robust against malicious behaviors. [10] R. Zhou and K. Hwang, "PowerTrust: A Robust and
Scalable Reputation System for Trusted P2P Computing,"
IEEE Transactions on Parallel and Distributed Systems,
Vol.18, No.5, May 2007.

73
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Sub-optimal Step-by-step Node Deployment Algorithm for User Localization


in Wireless Sensor Networks

Yuh-Ren Tsai and Yuan-Jiun Tsai


Institute of Communications Engineering,
National Tsing Hua University
Hsinchu 30013, Taiwan
Email: yrtsai@ee.nthu.edu.tw

Abstract Several methods have been proposed for the im-


plementation of localization applications in WSNs,
User/object localization is one of the promising ap- including coverage-based methods [7]-[9] and meas-
plications for WSNs. So far, there is no flexible node urement-based methods [1]-[6]. In general, measure-
deployment algorithm targeting on optimizing the lo- ment-based methods are widely applied and can
calization performance. To facilitate node deployment achieve a better location estimation performance. In a
for localization applications in WSNs, we propose, network, multiple sensor nodes, referred to as the ref-
based on a universal performance evaluation metric, a erence nodes, are deployed within the desired area for
low-complexity, step-by-step node deployment algo- measuring a specific signal coming from the target.
rithm which provides sub-optimal solutions feasible for Then the measured data is sent to a localization center
large-scale WSNs. This proposed node deployment which is assumed to possess the location information
algorithm has the computational complexity linearly of all reference nodes. By applying a specific localiza-
proportional to the number of available nodes, and is tion algorithm and the location information of the ref-
flexible for different system scenarios, including the erence nodes, the localization center can estimate the
cases with a non-homogeneous user distribution and location of the target. The measures used for localiza-
with an irregular sensing area. The performance of tion may be based on time-of-arrival (TOA), received
our proposed algorithm is compared with some other signal strength (RSS), angle of arrival (AOA), and so
available benchmarks. It is found that the proposed on [4]. Different types of signals induce different con-
deployment algorithm can provide flexible network sequences of hardware cost, complexity and localiza-
topologies with very good location estimation per- tion accuracy. In this work, we focus only on the RSS-
formance. based localization applications, since it is the most
general and easy way for implementation.
1. Introduction The performance of location estimation completely
relies on the applied localization algorithm and the
Sensing coverage Wireless sensor networks (WSNs) network topology, i.e. the deployment of the reference
have been widely applied to the applications such as sensor nodes. Well designed node deployment can
environmental surveillance and remote parameters provide good measurement information, and therefore
detection. User/object localization is one of the prom- benefit the localization performance. For localization
ising applications for WSNs [1]-[9], especially when applications, determination of the optimal network
the acquired data is location dependent or the service topology is one of the key issues, especially for the
requires user location information. Moreover, accurate scenarios with a non-homogeneous user distribution
location information can benefit many other applica- and/or an irregular desired area. Exhaustive search is
tions, such as route selection, traffic monitoring, and definitely a possible way to determine the optimal net-
so on [10]-[11]. work topology. However, this approach is time con-
suming with the computational complexity exponen-
This work was supported in part by the National Science Council,
tially proportional to the number of available nodes,
Taiwan, R.O.C., under Grant NSC 96-2752-E-007-003-PAE. and thus is not feasible when the number of possible

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 114


DOI 10.1109/SUTC.2008.33
network topologies is very large, i.e., it is not feasible WSNs since they can be easily implemented with a
for large-scale WSNs. Furthermore, the scenarios with low cost. The RSS-based methods strongly depend on
a non-homogeneous user distribution and/or an irregu- the statistics of the radio propagation environment,
lar desired area will further increase the difficulty of which can be characterized by the propagation path
search for a good network topology. loss model. The received signal power (in dB units)
Some works have focused on the issue of evaluating can be expressed as [14]
the location estimation performance of a specific net-
work topology [12]. So far, however, there is no flexi- P(d ) = P0 − 10n log10 (d d 0 ) + χ , (1)
ble node deployment algorithm targeting on optimizing
the localization estimation performance. To facilitate where d denotes the distance between the target source
node deployment for localization applications in and a sensor node; P0 (in dB units) is the average re-
WSNs, we propose a low-complexity, step-by-step
node deployment algorithm which provides sub- ceived signal power at a reference distance d 0 ; n de-
optimal solutions feasible for large-scale WSNs. In notes the signal power decay exponent; and χ (in dB
considering generality of our proposed algorithm, a units) is introduced to represent the shadowing effects
universal performance evaluation metric, which is in- in the propagation path, assumed to be a Gaussian ran-
dependent of the applied location estimation algo- dom variable (RV) with a zero mean and a variance
rithms, is essential. The Cramer-Rao lower bound σ 2 . It is noted that different propagation paths may
(CRLB) is a powerful tool for the performance evalua- experience different values of shadowing loss. The
tion of an estimator [13]. Moreover, the CRLB can be shadowing effects may cause a large variation in the
easily determined and has been shown to be a good received signal [15], and thus degrade the estimation
measure of the localization performance [1]-[2]. Hence, accuracy.
we apply the CRLB as the performance metric to de-
velop our node deployment algorithm. This proposed 2.2. Performance Evaluation Metric – CRLB
node deployment algorithm has the computational
complexity linearly proportional to the number of In this work, we apply the CRLB as the perform-
available reference nodes, and is feasible for different ance evaluation metric for the design of the proposed
system scenarios, including the cases with a non- node deployment algorithm. This metric is independent
homogeneous user distribution and with an irregular of the applied estimation algorithm, and thus is feasible
sensing area. for different kinds of estimation algorithms. Consider-
The remainder of this paper is organized as follows. ing the RSS-based methods, the corresponding Fisher
Section II illustrates the system and the applied per- information matrix (FIM) for a target located at loca-
formance evaluation metric. Section III proposes the tion k can be represented as [1]
step-by-step sub-optimal node deployment algorithm
for localization applications in WSNs. Section IV pro-
⎡ F ( k ) FXY (k ) ⎤ ,
vides the performance comparison between our pro- F ( k ) = ⎢ XX ⎥
(2)
posed algorithm and other available benchmarks. Fi- ⎣ FXY (k ) FYY (k ) ⎦
nally, the conclusion is drawn in Section V.
where
2. Preliminaries
⎧ Δxik2
2.1. System Model ⎪ FXX (k ) = β ∑
i∈H ( k ) ( Δxik + Δyik )
2 2 2

⎪⎪ Δxik Δyik ; (3)
⎨ FXY (k ) = β ∑
It is assumed that there are N reference sensor nodes
i∈H ( k ) ( Δxik + Δyik )
deployed in a desired area A. The targets may appear 2 2 2

within the desired area with a particular location distri- ⎪ Δyik2
bution f(k) at location k. For localization applications, ⎪ FYY (k ) = β ∑
⎪⎩ i∈H ( k ) ( Δxik + Δyik )
2 2 2
all sensor nodes can simultaneously detect/measure the
radio signal emitted by a specific target, and then send
the information to a central controller to jointly deter-
β = (10n log(10)σ ) is an
2
environment-dependent
mine an estimated location for this target. The applied
location estimation algorithm may be arrival time constant; Δxik and Δyik are the distance in the X-
based or received signal strength (RSS) based [1]. The coordinate and the distance in the Y-coordinate, re-
RSS-based localization methods are generally used in spectively, between location k and reference node i;

115
and H(k) is the set of reference nodes that the meas- formance is strongly impacted by the network topology,
urement associated with location k exists. Then, taking especially when the user distribution is non-uniform or
the inverse of FIM and under the constraint of the non- the target area is irregular. Exhaustive search for the
cooperative case, we have the lower bound of the optimal network topology is time consuming and is not
mean-square error (MSE) at location k expressed as [1] achievable when the node number is very large. Alter-
natively, we propose a low-complexity, step-by-step
FXX (k ) + FYY ( k ) deployment algorithm which provides suboptimal solu-
MSE (k ) ≥
FXX (k ) FYY ( k ) − ( FXY (k )) 2 tions flexible for different system scenarios.
The desired area A is divided into multiple uniform
1
∑ (Δxik2 + Δyik2 )
i∈ H ( k )
(4) grids, and all reference nodes are assumed to be de-
ployed on grids. Initially, a primary network topology
≥ .
(Δxik Δy jk − Δyik Δx jk ) 2 with a limited node number N ini is deployed. It is
β ∑ ∑
i∈H ( k ) j ≠i (Δxik2 + Δyik2 ) 2 (Δx 2jk + Δy 2jk ) 2 noted that the primary network topology must be set to
j∈H ( k )
avoid the ill-condition, which implies that the network
topology forms only one-dimensional space in the de-
By averaging the user location distribution over the sired area. If the ill-condition occurs, the CRLB of
entire area, we have the average CRLB of the entire MSE does not exist at some locations and the estima-
system as tion performance of the entire system cannot be evalu-
ated. One efficient way to avoid the ill-condition is to
v∫ A
MSE (k ) × f (k ) , (5) initiate a primary network topology in which each lo-
cation in the desired area can be accessible by at least
three reference nodes, or equivalently the signal emit-
where f(k) is the user distribution at location k.
ted from a location can be received by at least three
Some facts can be observed from (3)-(5). Adding
reference nodes. According to the current network
a new reference node in H(k) can improve the estima-
topology, the average CRLB of the entire system cor-
tion accuracy at location k. This improvement is in-
responding to each possible deployed location of the
versely proportional to the distance between location k
next reference node can be calculated based on (5).
and this new reference node, and also depends on the
Then the next reference node is deployed at the loca-
relative angles among this new node and the other de-
tion having the minimum average CRLB of the entire
ployed reference nodes. Considering overall estimation
system. Subsequently, each reference node can be de-
performance in the entire area, deploying widely
ployed similarly based on this step-by-step process.
spreading nodes can achieve a better performance than
The proposed step-by-step node deployment algorithm
deploying gathering nodes for uniform user distribu-
is shown as follows.
tion. Furthermore, the CRLB corresponding to a net-
work topology is associated to the mean-square error
of the location estimation, and thus can be used as an
evaluation metric for node deployment in WSNs. % Proposed Deployment Algorithm %

3. Step-by-step Suboptimal Deployment Set a primary network topology with the reference
node number N ini that avoids any ill-conditions in
Algorithm
the desired area;
In this work we propose a flexible step-by-step reference node number = N ini + 1 ;
node deployment algorithm suitable for different area for (reference node number <= number of total ref-
shapes and different user distributions. This algorithm erence nodes N)
can achieve near optimal node deployment with lim- {
ited computational complexity. Based on (5), search for the average CRLB for
each possible deployed location;
3.1. Node Deployment Algorithm Deploy a reference node at the location with the
minimum average CRLB;
The proposed node deployment algorithm is based reference node number + 1;
on minimizing the location estimation CRLB of the }
entire system, and is independent of the applied local- End
ization algorithm. No matter what kind of localization
algorithms is applied, the average localization per-

116
100 100 8
11 3
18
90 23 90 13 20
20 16 12
80 24 80
6 13 9
8 2 5
70 15 70

60 21 60 25
24
Y-axis

Y-axis
50 4 1 5 50 1 19 4 16
18 17
40 40 23
7
30 9 25 30
17 10 14
3 6
20 20
7
19 11 14
22 22 21
10 12 10
2 15
0 0 10
0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100
X-axis X-axis
Figure 1. Sub-optimal network topologies for a Figure 3. Sub-optimal network topologies for an
homogeneous user distribution (Scenario 1). irregular area (Scenario 3).

It is noted that the proposed node deployment algorithm can be applied to a system with any desired area shape and any user distribution.

3.2. Experimental Results

Based on the proposed algorithm, three scenarios are investigated. The path loss exponent of radio propagation is assumed to be n = 2 and the shadowing standard deviation is assumed to be σ = 4 dB. In Scenario 1, the desired area is a square with 100 units in length on each side, and is divided into 100×100 = 10000 grids. The user distribution is assumed to be homogeneous, i.e. with the same distribution of the desired users over the entire area. The primary network topology contains node 1, node 2 and node 3. By applying the proposed algorithm, the sub-optimal node topologies are shown in Fig. 1 for node numbers from 4 to 25. For example, if the number of available nodes is 10, then location 1 to location 10 form the suggested node topology. It is found that the nodes are almost uniformly distributed over the entire area for different node numbers.

In Scenario 2, the desired area is the same as that in Scenario 1, except that the user distribution is non-homogeneous. The distribution of the desired users is assumed to be a 2-dimensional bivariate Gaussian with the mean at the center of the area and the variance in each dimension equal to 250 unit-square. By applying the proposed algorithm, the sub-optimal node topologies are shown in Fig. 2 for node numbers from 4 to 25. It is found that the nodes are now more concentrated near the center of the desired area.

Figure 2. Sub-optimal network topologies for a non-homogeneous user distribution (Scenario 2).

In Scenario 3, the user distribution is assumed to be homogeneous but the desired area is an irregular area as shown in Fig. 3. The shadowed area is not part of the desired area, and detection across the shadowed area is assumed to be unavailable. The primary network topology contains node 1 to node 6. By applying the proposed algorithm, the sub-optimal node topologies can be obtained for node numbers from 7 to 25. It is found that the nodes are almost uniformly distributed within the desired area.

The proposed algorithm is not only suitable for homogeneous and regular environments but also for non-homogeneous and irregular environments. Moreover, the computational complexity is linearly proportional to the node number, not exponentially proportional to it. Thus massive deployment of reference nodes becomes feasible.
4. Performance Comparison

4.1. Comparison with the optimal network topology

In the previous sections we have shown the simplicity and flexibility of our proposed node deployment algorithm. To justify the optimality of the proposed algorithm, we need to compare our results with some available benchmarks. The most reliable benchmark is the optimal network topology obtained by exhaustive search, which minimizes the average CRLB for a given number of reference nodes. Since the complexity grows dramatically when the node number or the grid number becomes large, only scenarios with a small node number and a small grid number can be evaluated in this way. Fig. 4 shows the performance comparison between the proposed algorithm and the optimal network topology in a homogeneous environment (Scenario 1) with 10×10 = 100 grids and node numbers from 3 to 5. The performance metric is defined as the average CRLB normalized to the area of the desired region. It is found that the performance of the proposed algorithm is slightly worse than the best available performance obtained via the optimal topology. However, the performance of the proposed algorithm converges to the best performance quickly, and the loss in performance is less than 5% for only five nodes in a homogeneous environment. This implies that our proposed algorithm can provide a sub-optimal topology with performance quite close to the best achievable performance.

Figure 4. Performance comparison between the optimal network topology and the proposed algorithm in a homogeneous environment (Scenario 1).

4.2. Comparison with Monte-Carlo simulation

Since the investigation by exhaustive search is limited to scenarios with a small node number, we further investigate another benchmark for scenarios with a large node number. Monte-Carlo simulation is a well-known test bench for performance evaluation, and it is also flexible for the different scenarios. In the simulation, we randomly generate 100000 random network topologies for each node number. The network topology with the best performance, i.e. the one that minimizes the average CRLB, is then chosen as the benchmark for comparison. Fig. 5 shows the performance comparison between the proposed algorithm and the Monte-Carlo simulation results in a homogeneous environment (Scenario 1). When the node number is small, searching over 100000 random network topologies can achieve a very good performance. However, when the node number is large enough, searching over 100000 random network topologies is no longer sufficient and the performance becomes worse than that obtained by our proposed algorithm.
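The Monte-Carlo benchmark can be sketched as follows. This is not the authors' code: it reuses the same assumed helpers (average_crlb, candidate_locations, grid, user_pdf) introduced in the earlier deployment sketch, all of which are hypothetical names.

    # Illustrative sketch of the Monte-Carlo benchmark: draw many random
    # topologies of a given size and keep the one with the smallest average CRLB.
    import random

    def monte_carlo_benchmark(node_number, candidate_locations, average_crlb,
                              grid, user_pdf, trials=100000):
        best_topology, best_crlb = None, float("inf")
        for _ in range(trials):
            topology = random.sample(candidate_locations, node_number)
            crlb = average_crlb(topology, grid, user_pdf)
            if crlb < best_crlb:
                best_topology, best_crlb = topology, crlb
        return best_topology, best_crlb

As the node number grows, the space of possible topologies grows combinatorially, which is why a fixed budget of 100000 random draws eventually falls behind the greedy algorithm.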

Figure 5. Performance comparison between Monte-Carlo simulation and the proposed algorithm in a homogeneous environment (Scenario 1).

4.3. Comparison with uniform-grid deployment
Conceptually, uniform-grid node deployment may be a good strategy for an unlimited desired area with a fixed reference node density, in order to achieve the best estimation performance in a homogeneous environment. Therefore, uniform-grid node deployment can be used as a theoretical benchmark for performance comparison. However, if the desired area is limited, uniform-grid deployment will lead to a lower local node density in the border area, and thus a worse estimation performance is expected for users in the border area. Fig. 6 shows the performance comparison between the proposed algorithm and the uniform-grid deployment in a homogeneous environment (Scenario 1). For uniform-grid deployment, the node number N must be a square number, i.e. 4, 9, 16, etc. It is found that the uniform-grid deployment provides a worse performance when the node number N is small, due to the performance degradation in the border area. On the other hand, when the node number N is large enough, the uniform-grid deployment tends to be the optimal deployment strategy for homogeneous environments, and the proposed algorithm achieves this performance bound. It must be noted that the uniform-grid deployment is not feasible for non-homogeneous and irregular environments.

Figure 6. Performance comparison between the uniform-grid deployment and the proposed algorithm in a homogeneous environment (Scenario 1).

4.4. Simulation results for maximum likelihood estimator

The proposed node deployment algorithm is based on the CRLB performance metric. To further investigate the optimality of the proposed algorithm, we apply a specific localization algorithm, the maximum likelihood (ML) estimator, to evaluate via simulation the estimation performance of the network topology obtained by our proposed algorithm. The estimation performance is represented as the MSE normalized to the area of the desired region. Fig. 7 shows the ML estimation performance of the proposed network topology and the optimal network topology in a homogeneous environment (Scenario 1) with 10×10 = 100 grids. The estimation performance of our proposed network topology is only slightly worse than that of the optimal network topology. However, the estimation performance is almost the same for the scenario with node number N = 5, implying the optimality of our proposed algorithm.

Figure 7. Performance comparison between the optimal network topology and the proposed algorithm based on the ML estimation in a homogeneous environment.

Fig. 8 shows the ML estimation performance of the proposed network topology and the uniform-grid deployment topology in a homogeneous environment (Scenario 1) with 100×100 = 10000 grids. The estimation performance of our proposed network topology is even slightly better than that obtained via the uniform-grid deployment.

We also investigate the ML estimation performance in a non-homogeneous environment. Fig. 9 shows the ML estimation performance of the proposed network topology and the uniform-grid deployment topology in a non-homogeneous environment (Scenario 2) with 100×100 = 10000 grids. The user distribution f(k) is the same as that addressed in Fig. 2. It is found that the estimation performance of the uniform-grid deployment topology is severely degraded due to the non-uniform user distribution. On the other hand, our proposed network topology still performs well in a non-homogeneous environment.
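The ML-based evaluation of a topology can be sketched as follows, under the propagation model stated in Section 3.2 (path-loss exponent n = 2, log-normal shadowing with σ = 4 dB). This is not the authors' code; the received-power reference level and helper names are assumptions made only for illustration.

    # Illustrative sketch of grid-based ML location estimation and the
    # area-normalised MSE used as the evaluation metric in Sec. 4.4.
    import math, random

    def predicted_rss(node, point, n=2.0):
        d = max(math.dist(node, point), 1e-3)
        return -10.0 * n * math.log10(d)      # received power relative to unit distance (assumed)

    def ml_estimate(measurements, nodes, grid_points):
        def cost(p):                          # least squares = ML under Gaussian (dB) shadowing
            return sum((m - predicted_rss(nd, p)) ** 2 for m, nd in zip(measurements, nodes))
        return min(grid_points, key=cost)

    def normalized_mse(nodes, grid_points, area, sigma=4.0, trials=1000):
        err = 0.0
        for _ in range(trials):
            true_p = random.choice(grid_points)
            meas = [predicted_rss(nd, true_p) + random.gauss(0, sigma) for nd in nodes]
            est = ml_estimate(meas, nodes, grid_points)
            err += (est[0] - true_p[0]) ** 2 + (est[1] - true_p[1]) ** 2
        return err / trials / area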

Figure 8. Performance comparison between the uniform-grid deployment topology and the proposed algorithm based on the ML estimation in a homogeneous environment.

Figure 9. Performance comparison between the uniform-grid deployment topology and the proposed algorithm based on the ML estimation in a non-homogeneous environment.

5. Conclusion

In this work, we proposed a step-by-step node deployment algorithm which provides sub-optimal solutions feasible for large-scale WSNs. The deployment algorithm has a computational complexity that is only linearly proportional to the number of available nodes, and it is flexible enough to handle scenarios with a non-homogeneous user distribution and with an irregular sensing area. The performance comparison between our proposed algorithm and other available benchmarks shows that the sub-optimal network topologies produced by our algorithm can achieve a performance very close to the optimal performance with low complexity. Simulation results based on the maximum likelihood estimator were also provided to evaluate the performance of the proposed algorithm. According to these results, it was found that our algorithm performs very well for different scenarios.


On the Security of the Full-Band Image Watermark for Copyright Protection

Chu-Hsing Lin, Jung-Chun Liu, and Pei-Chen Han
Department of Computer Science and Information Engineering
Tunghai University, 181 Section 3, Taichung-kang Road, Taichung, Taiwan
chlin@thu.edu.tw, jcliu@thu.edu.tw and g95280010@thu.edu.tw

Abstract - Digital watermarks have been embedded invisibly in digital media to protect the copyrights of legal owners. The embedded watermarks can be extracted to indicate ownership of the originals. In this paper, a robust Full-Band Image Watermark method is investigated by applying various attacks to the watermarked image. Image attacks have evolved so much that they are not just destructive, that is, degrading the visual quality of the image, but sometimes constructive, that is, altering the original work in creative ways to make it look like a new piece of creation. We launched many destructive and constructive attacks on images embedded with our watermarking method to test its security. The experimental results show that it is very robust against image attacks.

Keywords and phrases: Full-band image watermark, constructive attacks, destructive attacks, copyright protection, image embedding, image extraction.

I. INTRODUCTION

In the digital era, processing and disseminating digital information and media have become very fast and convenient. Piracy of digital creations without appropriate permission from their rightful owners not only deprives the rights of creators but also harms innovation. To address this problem, digital watermarking has been offered for copyright protection [23].

An effective watermarking scheme embeds watermarks invisibly in the original cover image and is robust against image processing attacks [3, 11]. New techniques of image attacks evolve along with the development of image processing tools, and they present a great menace to the robustness of watermarking. The Full-Band Image Watermarking (FBIW) scheme proposed in [9] is robust against most geometric and non-geometric attacks. In this paper, we investigate the security of the FBIW scheme by launching a variety of attacks, some of them only slightly modifying the image, some of them distorting the image badly, and some of them manipulating the image in creative ways to make it a new piece of artwork by pirating the intellectual property of the legal owner. The experimental results show that the FBIW scheme is robust against all of the above-mentioned attacks.

The rest of this paper is organized as follows. In Section 2, the background of related techniques is briefly reviewed. In Section 3, the multi-scale FBIW scheme is described. Experimental results are shown in Section 4. This paper is concluded in Section 5.

II. PRELIMINARIES

The FBIW watermarking scheme combines the Distributed Discrete Wavelet Transformation (DDWT) watermarking scheme and the Singular Value Decomposition (SVD) watermarking scheme.

A. Distributed Discrete Wavelet Transformation

Based on the Discrete Wavelet Transform [1, 6, 12, 17, 25], Lin et al. proposed a Distributed Discrete Wavelet Transformation (DDWT) watermarking scheme [2] in 2006. The DDWT watermark scheme distributes hidden information in the spatial domain, so it is effective against malicious cropping attacks.

The multi-scale DDWT consists of horizontal and vertical processes as follows:

The horizontal process:

1. Separate the original image along the horizontal direction into two equal blocks.

2. Add and subtract corresponding pixels of the two sub-blocks, then replace the pixels of the left sub-block with the result of the addition and the pixels of the right sub-block with the result of the subtraction. Denote the processed left sub-block as L and the right sub-block as H.

The vertical process:

3. Separate the horizontally processed image along the vertical direction into two equal blocks.

4. Add and subtract corresponding pixels of the two sub-blocks and replace the pixels of the upper sub-block with the result of the addition and the pixels of the lower sub-block with the result of the subtraction. Thus, we generate four sub-blocks and denote them LL1, HL1, LH1 and HH1, which are the four bands of the 1-scale DDWT. Repeat the above horizontal and vertical processes on LL1 to obtain the four bands of the 2-scale DDWT, and so on.
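The 1-scale DDWT described above can be sketched as follows. This is not the authors' code: it assumes a numpy array img of even height and width, and the mapping of the four quadrants onto the band names follows the layout suggested by Fig. 1, which is treated here as an assumption.

    # Illustrative sketch of the 1-scale DDWT (sum/difference of image halves).
    import numpy as np

    def ddwt_1scale(img):
        h, w = img.shape
        left, right = img[:, : w // 2].astype(int), img[:, w // 2 :].astype(int)
        horizontal = np.hstack([left + right, left - right])      # L | H
        upper, lower = horizontal[: h // 2, :], horizontal[h // 2 :, :]
        vertical = np.vstack([upper + lower, upper - lower])
        ll1 = vertical[: h // 2, : w // 2]      # band naming per Fig. 1 (assumed layout)
        hl1 = vertical[: h // 2, w // 2 :]
        lh1 = vertical[h // 2 :, : w // 2]
        hh1 = vertical[h // 2 :, w // 2 :]
        return ll1, hl1, lh1, hh1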
Fig. 1 shows the 1-scale DDWT. After applying the horizontal process to the original image S, the sub-bands L and H are obtained, and after applying the vertical process to L and H, the four sub-bands LL1, HL1, LH1 and HH1 are obtained. The result of the 3-scale DDWT is shown in Fig. 2.

Figure 1. Block diagram of the 1-scale DDWT.

Figure 2. The result of the 3-scale DDWT.

B. Singular Value Decomposition

SVD was invented by Beltrami in 1873 to solve the square matrix problem. Gene Golub proposed an algorithm that made the computation of the SVD feasible in 1970. Many researchers have since applied SVD to image compression [7, 8, 10, 18, 19, 24], watermarking [4, 5, 13, 22, 26] and other signal processing fields [14, 15, 16, 20, 21].

SVD is a technique to unitarily diagonalize normal matrices using a basis of eigenvectors. An image can be seen as a matrix composed of non-negative values. For an image matrix A \in R^{M \times N}, where R denotes the real numbers and M \ge N,

    A = U \Sigma V^T = \sum_{i=1}^{m} \sigma_i u_i v_i^T                      (1)

where U_{M \times M} and V_{N \times N} are both orthogonal matrices, \Sigma_{M \times N} is a diagonal matrix, and m = \min\{M, N\}. The scalars \sigma_1, \sigma_2, \ldots, \sigma_m are the singular values of A. The vector u_i is the i-th column vector of the matrix U, and the vector v_i is the i-th column vector of the matrix V.

III. FULL BAND IMAGE WATERMARK METHOD

The FBIW watermarking scheme consists of the watermark embedding process and the watermark extracting process, as follows.

B.1 Embedding algorithm

Step 1  Input the original image X (M×M) and the watermark W (N×N).

Step 2  Perform the K-scale DDWT transform on X to obtain X′, where K is the number of scales.

(Steps 3 to 6 embed the watermark in the HL and LH sub-bands utilizing the SVD method.)

Step 3  Set the initial values of the stego-image in the frequency domain, Y′, to be equal to X′, and apply SVD to the sub-bands HL and LH of the last scale:

    X'_{HL} = U_{X'}^{HL} \Sigma_{X'}^{HL} (V_{X'}^{HL})^T ,    X'_{LH} = U_{X'}^{LH} \Sigma_{X'}^{LH} (V_{X'}^{LH})^T        (2)

where X′HL and X′LH represent X′ in the sub-bands HL and LH, and the diagonal elements (\sigma_{X'i}^{HL} and \sigma_{X'i}^{LH}) of \Sigma_{X'}^{HL} and \Sigma_{X'}^{LH} are the singular values of the sub-bands HL and LH.

Step 4  Apply SVD to the watermark:

    W = U_W \Sigma_W V_W^T                                                     (3)

where the diagonal elements (\sigma_{Wi}) of \Sigma_W are the singular values of the watermark, and

    \sigma_W = [\sigma_{W1}, \sigma_{W2}, \ldots, \sigma_{WN}] ,    \sigma_{W1} \ge \sigma_{W2} \ge \cdots \ge \sigma_{WN} \ge 0.

Step 5  Process the singular values of X′ in the frequency domain with the singular values of the watermark:

    \sigma_{Y'i}^{HL} = \sigma_{X'i}^{HL} + \alpha \sigma_{Wi} ,    \sigma_{Y'i}^{LH} = \sigma_{X'i}^{LH} + \alpha \sigma_{Wi}        (4)

where \alpha is a scaling factor and \sigma_{Y'} are the singular values of the singular matrix \Sigma_{Y'}.

Step 6  Obtain Y′HL and Y′LH, embedded with watermarks, on the sub-bands HL and LH:

    Y'_{HL} = U_{X'}^{HL} \Sigma_{Y}^{HL} (V_{X'}^{HL})^T ,    Y'_{LH} = U_{X'}^{LH} \Sigma_{Y}^{LH} (V_{X'}^{LH})^T        (5)
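Steps 3 to 6 above amount to additively embedding the watermark's singular values into the singular values of a sub-band. The following minimal sketch, not the authors' code, assumes numpy arrays subband (an HL or LH band of X′) and wm (the watermark), and a scaling factor alpha.

    # Illustrative sketch of the SVD-based embedding of Eqs. (2)-(5).
    import numpy as np

    def embed_in_subband(subband, wm, alpha):
        U, s_x, Vt = np.linalg.svd(subband, full_matrices=False)   # Eq. (2)
        s_w = np.linalg.svd(wm, compute_uv=False)                   # Eq. (3)
        s_y = s_x.copy()
        n = min(len(s_x), len(s_w))
        s_y[:n] = s_x[:n] + alpha * s_w[:n]                          # Eq. (4)
        return U @ np.diag(s_y) @ Vt                                 # Eq. (5)

The choice of alpha trades off watermark robustness against visibility, which is why the stego-image fidelity is later checked with the PSNR.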
Step 7  Take Y′HL and Y′LH of the last scale of Y′ and perform the inverse DDWT to obtain the spatial domain image YHLLH that has been embedded with watermarks in the sub-bands HL and LH.

(Step 8 embeds the watermark in the sub-bands LL and HH utilizing the DDWT method.)

Step 8  Take the Y′ data in the sub-bands LL and HH of the last scale and embed the watermark information according to the following formula:

    if W_{ij} = 0 then Y'^{LL}_{ij} = Y'^{LL}_{ij} + \alpha (2^K)^2
    if W_{ij} = 1 then Y'^{HH}_{ij} = Y'^{HH}_{ij} + \alpha (2^K)^2            (6)

Step 9  Apply the inverse DDWT to Y′ to produce the stego-image Y, which has been embedded with watermark information on the four sub-bands of the last scale. Subtract YHLLH from Y to obtain YDiff, which gives the difference of the pixel values of YHLLH and Y in the spatial domain.

B.2 Extracting algorithm

(Steps 1 to 2 extract the watermark from the sub-bands LL and HH.)

Step 1  Input the stego-image Y, the original image X, the spatial domain data YHLLH, and the watermark W.

Step 2  Subtract YHLLH from Y to obtain YLLHH, and apply formula (7) to YLLHH to extract the embedded watermark WLLHH:

    W^{LLHH}_{ij} = 0  if Y^{LLHH}_{ij} < 0,    W^{LLHH}_{ij} = 1  otherwise        (7)

(Steps 3 to 6 extract the watermark from the sub-bands HL and LH.)

Step 3  Subtract YDiff from Y to obtain F, and then apply the multi-scale DDWT to F to obtain F′.

Step 4  Apply SVD to F′ on the sub-bands HL and LH of the last scale:

    F'_{HL} = U_{F'}^{HL} \Sigma_{F'}^{HL} (V_{F'}^{HL})^T ,    F'_{LH} = U_{F'}^{LH} \Sigma_{F'}^{LH} (V_{F'}^{LH})^T        (8)

where F′HL and F′LH represent F′ in the sub-bands HL and LH of the last scale, and the diagonal elements (\sigma_{F'}^{HL} and \sigma_{F'}^{LH}) of \Sigma_{F'}^{HL} and \Sigma_{F'}^{LH} are the singular values of F′HL and F′LH.

Step 5  Extract the singular values of the watermarks by processing the diagonal elements of \Sigma_{F'}^{HL} with \Sigma_{X'}^{HL} and of \Sigma_{F'}^{LH} with \Sigma_{X'}^{LH}, respectively:

    \sigma_{Wi}^{HL} = (\sigma_{F'i}^{HL} - \sigma_{X'i}^{HL}) / \alpha_i ,    \sigma_{Wi}^{LH} = (\sigma_{F'i}^{LH} - \sigma_{X'i}^{LH}) / \alpha_i        (9)

where i = 1, 2, ..., N.

Step 6  Obtain the two watermarks embedded in the sub-bands HL and LH by the following equations:

    W^{HL} = U_W^{HL} \Sigma_W^{HL} V_W^T ,    W^{LH} = U_W^{LH} \Sigma_W^{LH} V_W^T        (10)

IV. EXPERIMENTAL RESULTS

The original cover image Lena (512 × 512) is shown in Fig. 3(a), and the watermark (64 × 64) in Fig. 3(b). The watermark, "Tunghai University", is a masterpiece of calligraphy by the famous modern Chinese calligrapher Yu You-ren (1879-1964). We embedded watermarks in the full band of the cover image after performing the 3-scale DDWT.

Figure 3. (a) The original cover image of Lena (512×512). (b) The watermark (64×64).

To evaluate the robustness of watermarks, the Pearson correlation coefficient was used:

    Corr(W, W') = \frac{\sum_{i=0}^{n-1}\sum_{j=0}^{n-1} (W_{(i,j)} - \bar{W})(W'_{(i,j)} - \bar{W}')}{\sqrt{\sum_{i=0}^{n-1}\sum_{j=0}^{n-1} (W_{(i,j)} - \bar{W})^2 \, \sum_{i=0}^{n-1}\sum_{j=0}^{n-1} (W'_{(i,j)} - \bar{W}')^2}}        (11)

where \bar{W} and \bar{W}', the average pixel values of the original watermark and the extracted watermark, are defined as follows:

    \bar{W} = \frac{1}{n \times n} \sum_{i=0}^{n-1}\sum_{j=0}^{n-1} W_{(i,j)} ,    \bar{W}' = \frac{1}{n \times n} \sum_{i=0}^{n-1}\sum_{j=0}^{n-1} W'_{(i,j)}        (12)

The watermarked image, or stego-image, is somewhat different from the cover image. To evaluate the fidelity of the stego-image, the peak signal-to-noise ratio (PSNR) was calculated as follows:

    PSNR = 10 \log_{10} \left( \frac{255^2}{MSE} \right)        (13)

where the mean square error (MSE) of the cover image (m×m) and the stego-image (m×m) is

    MSE = \frac{1}{m^2} \sum_{i=0}^{m-1}\sum_{j=0}^{m-1} (\alpha_{ij} - \beta_{ij})^2        (14)

where \alpha_{ij} is the pixel value of the cover image and \beta_{ij} is the pixel value of the stego-image. The typical PSNR value for a lossy image is between 30 and 50 dB; the higher, the better.
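Both evaluation metrics, Eqs. (11)-(14), can be computed directly from the pixel arrays. The sketch below is not the authors' code; it assumes the inputs are greyscale images or watermark bitmaps stored as numpy arrays of identical shape.

    # Illustrative sketch of the correlation and PSNR metrics.
    import numpy as np

    def pearson_corr(w, w_prime):
        """Eqs. (11)-(12): normalised correlation between original and extracted watermark."""
        dw = w - w.mean()
        dwp = w_prime - w_prime.mean()
        return float((dw * dwp).sum() / np.sqrt((dw ** 2).sum() * (dwp ** 2).sum()))

    def psnr(cover, stego):
        """Eqs. (13)-(14): peak signal-to-noise ratio of the stego-image in dB."""
        mse = float(((cover.astype(float) - stego.astype(float)) ** 2).mean())
        return 10.0 * np.log10(255.0 ** 2 / mse)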
The stego-image of Lena is shown in Fig. 4(a), and its PSNR value is 39.2793. From the stego-image, we extracted the watermarks embedded in the four sub-bands of the 3-scale DDWT. The watermarks embedded by using the DDWT algorithm on the LL and HH sub-bands are retrieved and shown in Fig. 4(b). The watermarks embedded by using the SVD algorithm on the sub-bands HL and LH are retrieved and shown in Fig. 4(c) and Fig. 4(d).

Figure 4. (a) The stego-image of Lena (PSNR = 39.2793). (b) Extracted watermarks embedded in sub-bands LL and HH. (c) The extracted watermark embedded in sub-band HL. (d) The extracted watermark embedded in sub-band LH.

We launched varieties of attacks on the stego-image to investigate the robustness of the FBIW scheme. For the sake of space, we list only part of the experimental results, with the most common attacks, in Table 1.

Column 2 shows the sabotaged stego-images. Some of them are slightly modified, such as the ones processed by the Gaussian noise, Gaussian blur, sharpening, colored pen, warm, equalization, Gamma, resizing, and mosaic attacks. Some of them are more apparently modified, such as the ones suffering from the contrast adjustment, histogram equalization, rotation, texture filter, and watercolor attacks. The rest of them are much distorted, such as the ones damaged by the cropping, ripple, whirlpool, crystal and glass, blast, invert, tile, and zoom blur attacks.

Column 3 shows the extracted watermarks and Column 4 shows their correlation coefficients and the bands they are extracted from. Since we have embedded watermarks in the four bands of the 3-scale DDWT, we can extract all of them from the attacked stego-image and use the best one for copyright protection. Only the best watermark of each attacking test is tabulated in Column 3.

The visual quality of the extracted watermarks and their corresponding correlation coefficients indicate the robustness of the FBIW watermarking scheme. It is very robust against many attacks, such as the Gaussian noise, sharpening, histogram equalization, rotation, cropping, warm, ripple, whirlpool, crystal and glass, blast, watercolor, colored pen, mosaic, invert, equalization, Gamma, zoom blur, and resizing attacks. It also shows good robustness against other attacks.

Image processing tools can be used not only to attack the watermarking information but also to reprocess the image in creative ways. Table 2 shows reprocessed images of Lena, but all of them have been disguised so much that one can hardly associate them with Lena at first glance. To one's surprise, the extracted watermarks are still very clear. The experimental results show that the FBIW watermarking scheme is robust against creative and multiple image attacks, including the puzzle, the kaleidoscope, the kaleidoscope plus tile, the kaleidoscope plus puzzle, and the kaleidoscope plus tile and puzzle attacks.
TABLE 1. ATTACKS, ATTACKED IMAGES, EXTRACTED WATERMARKS AND CORRELATION COEFFICIENTS
[The attacked stego-images and best extracted watermark images of the original table are omitted here.]

Testing attack (parameter)            Corr(W, W')    Embedded band
Gaussian Noise (5)                    0.9930         HL
Contrast adjustment (80)              0.8830         LH
Gaussian Blur (1)                     0.8721         LH
Sharpen (80, 30)                      0.9919         LH
Histogram equalization (Auto)         0.9970         HL
Tile (50)                             0.8986         HL
Rotation (45)                         1.0000         HL and LH
Invert                                1.0000         HL, LH and LLHH
Cropping (95%)                        1.0000         LLHH
Equalize                              0.9889         LH
Texture Filter (Effect: Embossed)     0.6324         HL
Gamma (0.5)                           0.9644         LH
Warm (Red: 2)                         0.9998         LH
Zoom Blur (Zoom In: 50)               1.0000         LLHH
Ripple                                1.0000         LLHH
Resize (256)                          0.9681         LH
Whirlpool                             1.0000         LLHH
Crystal and Glass                     1.0000         LLHH
Blast (Lift: 60)                      0.9712         HL
Watercolor (Little: 80)               0.9986         HL
Colored Pen (5)                       0.9965         HL
Mosaic (2)                            0.9761         LH

TABLE 2. CREATIVE ATTACKS, ATTACKED IMAGES, EXTRACTED WATERMARKS AND CORRELATION COEFFICIENTS
[The attacked stego-images and best extracted watermark images of the original table are omitted here.]

Testing attack (parameter)                 Corr(W, W')    Embedded band
Puzzle (50)                                0.9075         LH
Kaleidoscope Effect                        1.0000         LLHH
Kaleidoscope + Tile: 50                    0.8863         LH
Kaleidoscope + Puzzle: 50                  0.6970         HL
Kaleidoscope + Tile: 50 + Puzzle: 50       0.9151         LH
Kaleidoscope + Puzzle: 50 + Tile: 50       0.9279         LH
V. CONCLUSIONS

An effective digital watermarking scheme needs to be invisible as well as robust. The FBIW scheme has been shown to be very effective; that is, it yields a high PSNR value for the stego-image and is robust against common geometric and non-geometric attacks.

New image attacks come along with new and efficient image processing tools. To evaluate the security of the FBIW scheme against new attacks, we tested the stego-image with a wide range of attacks, destructive or creative, single or multiple. Experimental results show that FBIW is not only very robust against most image attacks, such as the rotation, cropping, ripple, and whirlpool attacks, but also very robust against creative and multiple image attacks, such as the kaleidoscope plus tile, the kaleidoscope plus puzzle, and the kaleidoscope plus tile and puzzle attacks. The FBIW scheme combines the merits of the DDWT and SVD watermarking techniques and is shown to be very secure against image attacks.

ACKNOWLEDGEMENT

This work was supported in part by the Taiwan Information Security Center, National Science Council, under grants NSC-95-2218-E-001-001, NSC-95-2218-E-011-015, iCAST NSC96-3114-P-001-002-Y and NSC95-2221-E-029-020-MY3.

REFERENCES

[1] A. Munteanu, J. Cornelis, G. Van der Auwera and P. Cristea, "Wavelet Image Compression - The Quadtree Coding Approach," IEEE Transactions on Information Technology in Biomedicine, Vol. 3, Sept. 1999, pp. 176-185.
[2] C. H. Lin, J. S. Jen and L. C. Kuo, "Distributed Discrete Wavelet Transformation for Copyright Protection," 7th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2006), April 19-21, 2006, pp. 53-56.
[3] C. Y. Lin, M. Wu, J. A. Bloom, I. J. Cox, M. L. Miller, and Y. M. Lui, "Rotation, Scale and Translation Resilient Watermarking for Images," IEEE Transactions on Image Processing, vol. 10, no. 5, 2001.
[4] D. V. S. Chandra, "Digital Image Watermarking Using Singular Value Decomposition," The 45th Midwest Symposium on Circuits and Systems (MWSCAS-2002), Vol. 3, Aug. 2002, pp. III-264 - III-267.
[5] Cheng-qun Yin, Li Li, An-qiang Lv and Li Qu, "Color Image Watermarking Algorithm Based on DWT-SVD," IEEE International Conference on Automation and Logistics, Aug. 2007, pp. 2607-2611.
[6] M. Craizer, E. A. B. D. Silva and E. G. Ramos, "Convergent Algorithms for Successive Approximation Vector Quantization with Applications to Wavelet Image Compression," IEE Proceedings - Vision, Image and Signal Processing, Vol. 146, No. 3, Jun. 1999, pp. 159-164.
[7] D. P. O'Leary and S. Peleg, "Digital Image Compression by Outer Product Expansion," IEEE Transactions on Communications, March 1983, pp. 441-444.
[8] H. C. Andrews and C. L. Patterson, "Singular Value Decomposition (SVD) Image Coding," IEEE Transactions on Communications, April 1976, pp. 425-432.
[9] J. C. Liu, C. H. Lin, and L. C. Kuo, "A Robust Full-Band Image Watermarking Scheme," 10th IEEE Singapore International Conference on Communication Systems (ICCS 2006), Oct. 2006, pp. 1-5.
[10] J. F. Yang and C. L. Lu, "Combined Techniques of Singular Value Decomposition and Vector Quantization," IEEE Transactions on Image Processing, August 1995, pp. 1141-1146.
[11] J. Lee and C. S. Won, "A Watermarking Sequence Using Parities of Error Control Coding for Image Authentication and Correction," IEEE Transactions on Consumer Electronics, vol. 46, no. 2, pp. 313-317, 2000.
[12] J. M. Shapiro, "Embedded Image Coding Using Zerotrees of Wavelet Coefficients," IEEE Transactions on Signal Processing, Vol. 41, Dec. 1993, pp. 3445-3463.
[13] Jieh-Ming Shieh, Der-Chyuan Lou and Ming-Chang Chang, "A Semi-Blind Digital Watermarking Scheme Based on Singular Value Decomposition," Computer Standards & Interfaces, Vol. 28, April 2006, pp. 428-440.
[14] K. Konstantinides and G. S. Yovanof, "Application of SVD-Based Spatial Filtering to Video Sequences," IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 4, Detroit, MI, May 9-12, 1995, pp. 2193-2196.
[15] K. Konstantinides and G. S. Yovanof, "Improved Compression Performance Using SVD-Based Filters for Still Images," SPIE Proceedings, Vol. 2418, San Jose, CA, February 7-8, 1995, pp. 100-106.
[16] K. Konstantinides, B. Natarajan and G. S. Yovanof, "Noise Estimation and Filtering Using Block-Based Singular Value Decomposition," IEEE Transactions on Image Processing, March 1997, pp. 479-483.
[17] M. Antonini, M. Barlaud, P. Mathieu and I. Daubechies, "Image Coding Using Wavelet Transform," IEEE Transactions on Image Processing, Vol. 1, No. 2, April 1992, pp. 205-220.
[18] N. Garguir, "Comparative Performance of SVD and Adaptive Cosine Transform in Coding Images," IEEE Transactions on Communications, August 1979, pp. 1230-1234.
[19] P. Waldemar and T. A. Ramstad, "Hybrid KLT-SVD Image Compression," IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 4, Munich, Germany, April 21-24, 1997, pp. 2713-2716.
[20] R. Karkarala and P. O. Ogunbona, "Signal Analysis Using a Multiresolution Form of the Singular Value Decomposition," IEEE Transactions on Image Processing, May 2001, pp. 724-735.
[21] Ruizhen Liu and Tieniu Tan, "An SVD-Based Watermarking Scheme for Protecting Rightful Ownership," IEEE Transactions on Multimedia, Vol. 4, Issue 1, March 2002, pp. 121-128.
[22] R. Rykaczewski, "An SVD-Based Watermarking Scheme for Protecting Rightful Ownership," IEEE Transactions on Multimedia, Vol. 9, Feb. 2007, pp. 421-423.
[23] S. D. Lin and C. F. Chen, "A Robust DCT-Based Watermarking for Copyright Protection," IEEE Transactions on Consumer Electronics, Vol. 46, No. 3, August 2000, pp. 415-421.
[24] S. O. Aase, J. H. Husoy and P. Waldemar, "A Critique of SVD-Based Image Coding Systems," IEEE International Symposium on Circuits and Systems VLSI, Vol. 4, Orlando, FL, May 1999, pp. 13-16.
[25] S. G. Mallat, "A Theory for Multiresolution Signal Decomposition: The Wavelet Representation," IEEE Transactions on PAMI, Vol. 11, No. 7, July 1989, pp. 674-693.
[26] Yu-Ping Hu and Zhi-Guang Chen, "An SVD-Based Watermarking Method for Image Authentication," IEEE International Conference on Machine Learning and Cybernetics, Vol. 3, Aug. 2007, pp. 1723-1728.


Using Body Sensor Networks for Increased Safety in Bomb Disposal Missions

John Kemp, Elena I. Gaura, James Brusey and C. Douglas Thake
Cogent Computing Applied Research Centre and Sports Sciences Department,
Coventry University, Priory St, Coventry, UK CV1 5FB

Abstract

Bomb disposal manned missions are inherently safety-critical. Wireless Sensor Network (WSN) technology potentially offers an opportunity to increase the safety of the operatives involved in such missions through detailed physiological parameter monitoring and fusion of "health" information.

Wearing heavy armour during bomb disposal manned missions may have side effects that, due to the reduction of the body's ability to regulate core temperature in the enclosed environment of the suit, lead to uncompensable heat stress and thus impair the technician's physical or mental ability. Experimental trials have shown no obvious relationship between the temperature of any single skin site and the core temperature, nor between single-point temperature and subjective thermal sensation (usually associated with comfort). Also, core temperature alone may not yield indicators of danger sufficiently early.

This paper proposes to integrate a body network of temperature sensors into the bomb disposal suit. The paper describes an in-network sensor data fusion and modelling approach that estimates the overall thermal sensation for the suit wearer, in real time, based on the multi-point temperature data. The case is made for performing the modelling in-network on the basis of reducing communications with the remote mission control point and to better support actuation of an in-suit cooling system. It is also argued that thermal sensation indicators are more useful to present at an on-line remote monitoring station than individual temperatures. The appropriateness of the developed Body Sensor Network (BSN) application is supported by experimental validation.

1 Introduction

A range of body sensor network (BSN) systems have been proposed in the literature for monitoring the human body for timely detection of health-related problems. BSN developers have targeted a variety of environments, including emergency response [5, 18], hospital [4, 11] and physiotherapy environments [6]. Added to these, several other general purpose body monitoring devices [10, 1, 7] have also been proposed. A common element of much of the work on BSNs is the focus on integrating accurate single-point physiological parameter measurements (such as heart rate or blood oxygenation) into larger monitoring and assessment systems. The challenges hence identified by the BSN community are largely to do with secure communication of the sensed parameters to remote units, portability of the measurement systems, devising the supporting infrastructure for concurrently monitoring a number of subjects and, in some measure, miniaturisation of the supporting sensing and communication platforms (this will be further supported in section 2).

In contrast, the work proposed here explores the potential benefits of deployed BSNs, in terms of:

• providing detailed physiological measurement, hence providing better insight into what is happening to the human body when exposed to uncomfortable and potentially hazardous environments (such as heavy protective armour) and extreme environmental conditions,

• supporting on-line and real-time extraction of accurate human thermal sensation estimates based on multiple sensor measurements,

• reporting of useful information rather than data to a remote station, thus enabling rapid assessment of hazardous situations,

• allowing the provision of thermal remedial measures through control and actuation of systems commonly integrated with armoured suits.

The appropriateness and usefulness of deployed BSNs catering for the above four points is demonstrated here through a motivating application described below.
Figure 1. Explosive Ordnance Disposal (EOD) Suit

Bomb disposal missions provide armour designers, disposal technicians, and mission controllers with a number of challenges, due to the extreme conditions and strain generated by both wearing the armour and the typical bomb disposal sites and scenarios. A typical bomb disposal mission will initially involve investigating the site using a remote controlled robot and, if possible, disarming the bomb remotely. Sometimes, however, it is necessary for a human bomb disposal expert to disarm the device. For this, the expert will put on a protective suit and helmet (as shown in figure 1), pick up a tool box of equipment, and walk the 100 or so metres to the site. It may be necessary to climb stairs, crawl through passageways, or even lie down in order to reach the bomb's location.

One of the UK manufacturers of such suits, having identified the problem of the suit wearer becoming uncomfortably hot and, in the worst case, suffering heat stress, has attempted to address it by installing an in-suit cooling system. The system is based on a dry-ice pack and a fan that cycles air through the pack and blows cooled air onto the wearer's back and into the helmet. The cooling system has a variable control, thus both allowing the airflow to be adjusted for comfort and allowing the life of the batteries that power the fan to be extended, as they would otherwise only provide sufficient power for part of the mission. Whilst theoretically the cooling could alleviate the heat stress in some measure, mission trials have shown both:

• the inefficient use of the battery power (hence inefficient cooling provision and limited remedial effects) by mission technicians, and

• the need for remote monitoring of the mission technicians.

The authors have developed a prototype system which both satisfies the need for remote monitoring and allows for future integration of a cooling automation component to ensure effective, need-based cooling.

This paper presents the authors' work towards the software support and communications aspects of the prototype system, particularly the inference of the thermal sensation of the wearer of the protective suit and the communication of the sensation levels to a remote monitoring station.

The paper is structured as follows: Section 2 reviews related work. This is followed by a discussion of the key aspects of the protective suit and its effect on the wearer. The design of the proposed system is described in section 4, along with the model used to estimate thermal sensation from multi-site, sensed temperature (section 5). Experimental results are presented in section 6, which is then followed by the conclusions.

2 Related work

With respect to the instrumentation design and implementation, the work reported in this paper is most closely aligned with the field of Body Sensor Networks. This is a sub-area of Wireless Sensor Networks that makes use of a combination of wireless and miniaturised sensor technologies to monitor the human body. The scope of most present BSN approaches is patient care. Such systems are either designed to focus on capturing the evolution of particular physiological parameters and ensuring that alarms are generated when parameters stray outside a safe range [8], or aimed at providing general monitoring solutions for patient status within a hospital or similar environment [4]. In comparison, the work presented here is concerned with increased safety and comfort of human subjects in constrained environments through integrating sensing, actuation, and autonomous decision making. In this context, wireless sensor technology is used as an enabler for the necessary detailed measurement of physiological parameters.

The authors' work shares some of the design space of BSNs in terms of the type of physiological parameters sensed and the wearability requirements of the implemented system. On the other hand, given that the application is within the safety-critical domain, the work here also shares some common characteristics with the area of instrumenting and monitoring first responders. In this section, sample applications of BSNs are reviewed together with their supporting architectures.

2.1 Emergency response and disaster management

The best-fit example of a commercial product designed for the purpose of monitoring personnel carrying out missions in dangerous environments is the VivoResponder by VivoMetrics [18]. VivoResponder is based upon an earlier product called the LifeShirt and is aimed at personnel engaged in firefighting and hazardous materials training or emergency response, industrial clean-ups using protective
gear, and biohazard-related occupational work. The VivoResponder is supplied in three parts: a lightweight, machine-washable chest strap with embedded sensors; a data receiver; and VivoCommand software for monitoring and data analysis. The sensors embedded in the chest strap monitor the subject's breathing rate, heart rate, activity level, posture, and single-point skin temperature.

Monitoring of the subject's breathing is performed using a method called inductive plethysmography, where breathing patterns are monitored by passing a low-voltage electrical current through a series of contact points around the subject's ribcage and abdomen. Monitoring of the subject's heart rate is performed via an ECG.

The VivoCommand software, provided with the device, displays the gathered data from the chest strap in real time on a remote PC. The parameters are updated every second along with 30-second average trends. The parameters are displayed with colour coding intended to allow quick assessment of the status of up to 25 monitored personnel simultaneously. Baseline readings can be set individually per monitored person.

Another system for first responders is the patient monitoring system presented by [5], which was developed as part of the CodeBlue project [15]. Unlike the VivoResponder, this system is designed for monitoring patients at an emergency scene, and it provides the facility to monitor a patient's vital signs and location, as well as medical record storage and triage status tracking. Several additional devices were added to the Mica2 platform which supports this application: location sensors, a pulse oximeter, a blood pressure sensor, and an electronic triage tag. The electronic triage tag replaces the paper equivalent commonly in use. (A paper triage tag is also provided as backup if the electronic tag fails.) The mote continuously transmits patient information to a tablet device which the first responder carries in a weatherproof casing. Mote packages are distributed to patients as required once the scene is reached.

2.2 Continuous monitoring solutions for patient care

The CodeBlue project [15, 11] aims to provide an architecture and system implementation for continuously monitoring patients in a hospital environment. Two of the devices produced during the course of this project were a mote-based EKG and a pulse oximeter, with the goal of integrating them into one device. Fulford-Jones et al. [4] present the EKG unit, which is built onto a Mica2 mote. It is designed for continuously monitoring patients in a hospital intensive care unit. Standard "portable" EKG systems require power from an electrical outlet and are moved around on a cart which must be taken with the patient, whereas this system aims to be lightweight and unobtrusive. The EKG data is collected by the mote and transmitted to a monitoring device such as a PC or PDA. The pulse oximeter is based on the same hardware platform and aims to provide similar benefits in terms of portability.

Working towards similar monitoring aims as the above, Jovanov et al. [6] present a sensor node named ActiS that is designed to be used as part of a wireless body area network. This node incorporates a bio-amplifier and two accelerometers, allowing the monitoring of heart activity as well as the position and activity of body segments. The main focus is the node's use for monitoring the activity of physiotherapy patients outside of the laboratory. The proposed system speeds up the set-up process compared to its classical monitoring counterpart and has the additional advantage that it may be left attached to a patient for a prolonged period (meaning that the set-up phase is not necessary before every physiotherapy session).

2.3 Body Sensor Networks – Platforms

BSN-based systems are often more constrained than ordinary embedded systems. These constraints are mainly in terms of power, size and weight. Power is restricted because mains AC power is not available. Furthermore, size and weight restrictions limit the battery supplies that can be used. Size and weight must be limited because large and heavy devices would be cumbersome, uncomfortable, and, in applications such as the one described here, an unnecessary distraction.

In response to the above, some of the BSN systems designed and implemented by research groups integrate within the nodes an appropriate central processing unit, memory and radio transceiver as a single custom chip. An example here is the MITes platform (for monitoring movement of human subjects) developed by Tapia et al. [16], which is based around the Nordic VLSI Semiconductors nRF24E1 chip. This chip integrates a radio transceiver and an Intel 8051 based processor core that runs at 16MHz and provides a nine-channel 12-bit ADC and various other interfaces, such as SPI (serial peripheral interface) and GPIO (general purpose I/O). This approach is efficient in terms of size and weight, due to the integration of several functions onto one chip, but it has limited generality, as it cannot easily be adapted with new components (such as a different radio device).

Another, more popular design option is to use off-the-shelf components. There is a trade-off made between processing and storage capabilities and the size and power consumption of the devices. This means that the devices selected would likely be considered severely under-powered in other systems (often including 16- or even 8-bit processors) and have small amounts of memory (in the order of tens or hundreds of kilobytes). For instance, the Texas
Instruments MSP430F149 micro-controller has been used for several systems, including those developed by Lo and Yang [10] and Jovanov et al. [7]. This is a 16-bit processor running at 8MHz, incorporating 60KB of flash memory and 2KB of RAM, and it provides interfacing opportunities via 48 GPIO lines and a 12-bit ADC. The system developed by Lo and Yang used ECG sensors, accelerometers, and a temperature sensor to monitor patient health. The system developed by Jovanov et al. was used for monitoring the elderly and those undergoing physiotherapy.

Other systems expand upon commercial devices such as the Mica2 and MicaZ motes developed at the University of California, Berkeley, or Intel's Imote platform. This approach often has a disadvantage in that the basic platform is generic, and it may not directly provide the facilities required for the specific BSN project. Such commercial platforms are also often larger and heavier than custom-developed platforms, as they are required to be general purpose in order to achieve any commercial success. The MicaZ mote uses the Atmega128L, an 8-bit processor running at 8MHz and featuring 128KB of flash memory, to which an additional 512KB is added externally on the mote itself. A 10-bit ADC, UART and I2C bus are also available. Gao et al. [5] developed a system based around this mote, adding various sensors and supporting devices to allow patient tagging and monitoring in an emergency response environment. Walker et al. [19] present a blood pressure monitoring system based on the MicaZ platform. In that work, a commercial blood pressure monitoring device is connected to the MicaZ via a serial interface.

The system proposed in this paper uses off-the-shelf components, although integration into custom chips is foreseen as an avenue to be explored in the future.

3 Suit environment

The combination of elevated metabolic heat production M and restricted avenues for body heat loss (convection C, conduction K, radiation R and evaporation E) when wearing necessarily heavy and bulky protective clothing has a negative effect on the heat balance of the body and results in heat storage. This is a situation where the thermoregulatory system is unable to defend against increases in core body temperature¹. This condition of uncompensable heat stress (UHS) is associated with significant physical and psychological impairment [2], therefore placing the individual at an increased risk of making an avoidable error and jeopardising the mission. Furthermore, as well as on the micro-climate within the EOD suit, the rate of heat storage will also depend on the ambient conditions and is likely to increase during operations in hot compared to temperate environments.

¹ The balance between heat gain and heat loss is represented by the heat balance equation S = M − (±W) ± (R + C) ± K − E, where S is the rate of body heat storage, M is the rate of metabolic heat production and W is the mechanical work [13].

Approaches to attenuate heat strain have the potential to reduce physiological stress and increase safe operating time. Recent developments in this area include the integration of cooling devices and altered equipment configurations. Clearly, knowledge of the differences between the physiological and thermal responses of the operatives across a range of conditions is essential to inform the requirements of an "active" system to optimise the microclimate between the skin and protective clothing, to facilitate heat transfer and maintain body temperature. Laboratory-based activity simulation protocols have recently been developed to assess the impact of such innovations on UHS [9, 17].

This paper uses a modified version of Zhang's comfort model [22] to estimate thermal sensation from temperature data and compares it to that actually reported by trials participants [17, Section 6]. In brief, participants undertook up to four 16:30 (min:sec) activity cycles consisting of treadmill walking (4 km/h, 3 min), unloading and loading weights from a kit bag (≈ 2 kg each, 2 min), crawling and searching activity (2 min), arm cranking (unloaded, 3 min) and seated physical rest (5 min), interspersed with 30 sec intervals between the first three activities. Heart rate (HR; Polar Vantage), rectal temperature (Tcore) and skin temperatures (Ts; arm, chest, thigh and calf [12]) were monitored throughout. Thermal sensation, reported on a 0 to 8 scale [21] that incorporates verbal anchors from unbearably cold (0) to comfortable (4) to unbearably hot (8), was sought at specified intervals. Wearing an EOD suit dramatically increased physiological strain, as indicated by elevated heart rate (up to 60 bpm more than without the suit), gradual increases in core and mean skin temperatures (up to 2°C more than without the suit) and thermal sensation in all four participants. Such responses are likely to have a negative impact on performance. Continuous monitoring is essential, since the rate of rise in core body temperature can abruptly increase when mean skin temperature reaches a similar level. Furthermore, unpublished data from our laboratory demonstrate that wearing a phase change cooling vest under the EOD suit results in a reduction in chest temperature (≈ 3°C) and an elevation in upper arm temperature (≈ 0.5°C) compared to not wearing a cooling vest. Such data highlight differences between body segments and support the rationale for multi-point temperature sensing to be used when using thermal information to estimate the thermal well-being of operatives in protective clothing.

4 System Design and Implementation

A prototype sensing system is in development to provide a greater data gathering capability than that offered by currently available monitoring systems. This system is designed following the architecture shown in figure 2.
Cooling A Processing Node
sense System
B C
model D
Actuation Node
environment transmit E
decide
F

visualise Processing Node


act
Subject with Sensors
mission plan
changes

Figure 2. Conceptual design of prototype


system
Remote Monitoring Point

Figure 3. Prototype system hardware compo-


The system is designed following the architecture shown in figure 2. The environment within the suit is sensed in terms of temperature; sensed data is integrated into a model representing the thermal state of the wearer; a decision is made about how to adjust the cooling system based on the thermal state; finally, the determined action is transmitted to the fan speed controller. In addition to this basic architecture, the system also transmits inferred state values for the purpose of remote, on-line visualisation of the thermal state of the wearer. In summary, the prototype system can be seen as being composed of two control loops: one giving rapid feedback to autonomously adjust cooling; the other, which is the object of discussion here, transmits the thermal sensation information to the remote monitoring point.

The prototype components supporting this functionality are presented in figure 3. The processing nodes, actuation nodes, and remote monitoring point form a wireless network. Each processing node is wired to several sensor packages via an I2C bus. (Although it would be possible to integrate all sensor packages used in this prototype into a single processing / actuation node, using separate processing nodes allows the helmet, jacket, and trousers to be kept separate with no wires running between them. This is essential for ensuring that the product remains easy to use and transparent to the wearer.)

The sensor packages are attached to the body following the standard positioning for skin sensors used by Thake and Price [17], which is a subset of the locations described by Shanks [14]. The skin sites used were: A – neck, B – chest, C – bicep, D – abdomen, E – thigh, F – lateral calf muscle, as indicated in figure 3.

The Gumstix Connex 400xm-bt board was selected as the main processing and communication platform (supporting the processing and actuation nodes). The Connex includes an Intel XScale PXA255 400MHz processor, 16MB of flash memory, 64MB of RAM, a Bluetooth controller and antenna (enabling all communications), and 60-pin and 92-pin connectors for expansion boards. There are no on-board sensors provided. The sensor packages connect to the Connex board via an expansion board that was designed in-house. As shown in figure 3, three Connex boards are used; two as processing nodes and one as an actuation node.

4.1 Remote monitoring loop

Remote monitoring is shown conceptually as a feedback loop (in figure 2) that transmits the modelling results (thermal sensation) to a remote monitoring station and displays the information (thermal sensation level), thus allowing human-in-the-loop feedback. The data and information flow for this process (as implemented in the current prototype) is illustrated in figure 4. The first phase is to smooth the raw sensed data from all skin sites using Kalman filtering [20] and to collate all into a skin temperature vector. A thermal sensation model is applied to the resulting vector, which yields an estimate of the thermal sensation for the current point in time. The next phase is to transmit this to the remote station. (Optionally, the skin temperature vector can also be transmitted.)
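As an illustration of the first phase (smoothing and collation), the following Python sketch applies a scalar Kalman filter, in the spirit of Welch and Bishop [20], to each skin site and collates the outputs into a skin temperature vector. It is a minimal sketch, not the prototype's code; the noise parameters and example readings are assumed values.

    # Minimal sketch (not the authors' code): smooth each skin-temperature
    # channel with a scalar Kalman filter and collate the filtered values
    # into a skin temperature vector. Noise values are illustrative only.
    SITES = ["neck", "chest", "bicep", "abdomen", "thigh", "calf"]  # A-F in figure 3

    class ScalarKalman:
        """1-D constant-level Kalman filter (see Welch and Bishop [20])."""
        def __init__(self, x0, p0=1.0, q=0.01, r=0.25):
            self.x, self.p, self.q, self.r = x0, p0, q, r

        def update(self, z):
            # Predict: the state is assumed constant, so only the variance grows.
            self.p += self.q
            # Correct with the new raw reading z.
            k = self.p / (self.p + self.r)          # Kalman gain
            self.x += k * (z - self.x)
            self.p *= (1.0 - k)
            return self.x

    def collate(filters, raw_readings):
        """Return the filtered skin temperature vector for one sample."""
        return [filters[s].update(raw_readings[s]) for s in SITES]

    # Example: one set of raw readings (degrees C, illustrative values).
    filters = {s: ScalarKalman(x0=34.0) for s in SITES}
    skin_vector = collate(filters, {s: 34.0 + i * 0.1 for i, s in enumerate(SITES)})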

Figure 4. Information flow for remote monitoring: temperature sensors → raw data → Kalman filter and collate → filtered data → thermal sensation model → thermal sensation information → buffering and transmission → visualisation at the remote station.

Figure 5. A snapshot of the remote monitoring visualisation component.
Due to the possibility of a radio jammer being used, and to compensate for other factors causing communications link failures (such as obstructions and out-of-range mobility), it is necessary to first buffer the information and then, when it is sensed that communications are available, to transmit all buffered information. Given that only the information is being buffered, rather than whole ready-to-transmit packets, this approach saves memory and avoids dropped packets due to overflowing communication buffers. The last phase is the information arrival at the remote monitoring station and its conversion to visual form.
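The buffering step can be pictured with the short Python sketch below. It is not the prototype implementation; the bounded buffer size and the link_available()/send() hooks are assumptions standing in for the Bluetooth link status and transmission routine.

    # Minimal sketch of the buffer-then-transmit step described above.
    # Only the inferred information (timestamped thermal sensation values,
    # optionally the skin temperature vector) is buffered; packets are built
    # just before sending. link_available() and send() are hypothetical hooks.
    from collections import deque

    class InfoBuffer:
        def __init__(self, max_items=1024):
            self.items = deque(maxlen=max_items)   # bounded memory on the node

        def record(self, timestamp, sensation, skin_vector=None):
            self.items.append((timestamp, sensation, skin_vector))

        def flush(self, link_available, send):
            # Transmit everything that has accumulated, oldest first, but only
            # while the radio link is reported as usable.
            while self.items and link_available():
                send(self.items.popleft())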
A snapshot of the remote monitoring visualisation component is shown in figure 5. The main information display panel (in this case using illustrative data) includes a 3D figure showing the interpolated temperature distribution across the subject's skin, the current average skin temperature, and the current thermal sensation level. Other panels show the location and status of the sensors and the history of the incoming data.

5 In-Network Thermal Sensation Modelling

In this section, a case is established for performing thermal sensation modelling "in-network" and communicating this global "well-being" parameter to mission control. The argument is made from two perspectives: first, a networking hardware perspective, and second, an information benefit one.

Thermal sensation estimates can be transmitted in fewer bits than a detailed thermal profile from a large number of sensors. Also, a sensation estimate removes unnecessary contextual information such as the number of sensors, the position of sensors and whether redundant sensors have been used. Due to the nature of embedded, low-power devices, reducing the number of bits that need to be transmitted extends their lifetime.

Furthermore, communication with the base station may be intermittent due to radio jammers or other factors as noted previously. Even though buffering may help, the effective bandwidth will be considerably less than that available under optimal conditions. For this reason, it makes sense to perform modelling on-board and to transmit thermal sensation estimates.

Effective visualisation systems need to assist the user with interpreting the data. It has been the authors' experience that it is difficult for a user to assess thermal comfort by looking at individual skin temperatures. Furthermore, skin temperatures tend to change slowly and overall trends are difficult to assess. It has been noticed, for example, that the skin temperature of one body segment may be rising while the temperature of another is falling, whilst the overall thermal sensation follows yet another trend.

A final advantage of modelling in-suit is that more effective and efficient cooling might be achieved by using thermal sensation, rather than single-point temperature measurements, as the basis for controlling the cooling system.

The thermal sensation model used in the current prototype system is described below.

5.1 Thermal sensation model

Several models of human thermal sensation exist. Examples are the PMV-PPD and SET* [3] models. In this work, a model provided by Zhang [22] has been evaluated in detail. Zhang's model was chosen as it has been well researched and validated with a large number of human subject trials. This model takes skin temperature (and optionally core temperature) readings from a subject as input and provides as output an estimation of thermal sensation, both per body segment and globally. Thermal sensation is given in the range [−4, 4], with −4 being very cold and 4 being very hot. (Note that a bias of −4 has been applied to the trials scale described in section 3 to unify the self-assessed and modelled thermal sensation. This unified scale is used throughout the remainder of this paper.) The model accounts for both static and dynamic temperature conditions. In Zhang's work, thermal sensation levels are then used to calculate the thermal comfort level, which is not discussed here.

The main part of the model is a logistic function based on two main parameters:

• the difference between the local skin temperature and its "set" point (the point at which the local sensation is neutral)

• the difference between the overall skin temperature and the overall set point

The local thermal sensation for segment i is defined as

  L_i = C1±_i (T_s,i − T_s,i,set) + K1_i [(T_s,i − T_s,i,set) − (T̄_s − T̄_s,set)]

  S_i = 4 [2 / (1 + e^(−L_i)) − 1] + C2±_i (dT_s,i/dt) + C3±_i (dT_core/dt)
where S_i is the local thermal sensation for segment i, T_s,i is the skin temperature at i, T̄_s is the mean skin temperature, and T̄_s,set is the set point for the mean skin temperature. A constant C1±_i, which is different for each body segment, defines how large a change in sensation results from a change in temperature, while a constant K1_i, which is also different for each body part, determines the contribution of the overall thermal state to the sensation of the segment in question. Constants C2± and C3± control the contribution of the rate of change of local skin and core temperatures to the local sensation. Note that the model defines slightly different values for the ± constants depending on whether the associated multiplicand is positive or negative.

Estimated overall thermal sensation S is the weighted sum of the estimated local sensations, S = Σ_{i∈B} w_i S_i, where B is the set of body segments and the weights are normalised such that Σ_{i∈B} w_i = 1.

It should be noted that in order to apply Zhang's model of thermal sensation to the protective suit environment, the set of body segments in the model was reduced to just four skin sites, with corresponding changes in the associated weighting.
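To make the structure of the model concrete, the following Python sketch evaluates the two equations above for a reduced segment set. It is illustrative only: the per-segment constants (C1±, K1, C2±, C3±), set points and weights come from Zhang's thesis [22] and are not reproduced in this paper, so the numbers below are placeholders, not the values used in the prototype.

    import math

    # Placeholder constants: per segment, (+, -) variants for C1, C2, C3,
    # plus K1 and a local set point in degrees C. NOT Zhang's actual values.
    CONSTS = {
        "chest": {"C1": (0.5, 0.6), "K1": 0.1, "C2": (20.0, 30.0), "C3": (0.0, 0.0), "set": 34.5},
        "neck":  {"C1": (0.5, 0.6), "K1": 0.1, "C2": (20.0, 30.0), "C3": (0.0, 0.0), "set": 34.5},
    }
    MEAN_SET = 34.0                          # overall set point (placeholder)
    WEIGHTS = {"chest": 0.6, "neck": 0.4}    # normalised over the reduced segment set

    def _pm(pair, x):
        """Pick the '+' or '-' variant of a constant from the sign of x."""
        return pair[0] if x >= 0 else pair[1]

    def local_sensation(seg, t_skin, mean_skin, dt_skin, dt_core):
        c = CONSTS[seg]
        d_local = t_skin - c["set"]
        d_mean = mean_skin - MEAN_SET
        L = _pm(c["C1"], d_local) * d_local + c["K1"] * (d_local - d_mean)
        static = 4.0 * (2.0 / (1.0 + math.exp(-L)) - 1.0)        # bounded in [-4, 4]
        dynamic = _pm(c["C2"], dt_skin) * dt_skin + _pm(c["C3"], dt_core) * dt_core
        return static + dynamic

    def overall_sensation(t_skin, dt_skin, dt_core=0.0):
        # Weighted sum of local sensations; simple (unweighted) mean skin
        # temperature is used here purely for illustration.
        mean_skin = sum(t_skin.values()) / len(t_skin)
        return sum(WEIGHTS[s] * local_sensation(s, t_skin[s], mean_skin, dt_skin[s], dt_core)
                   for s in WEIGHTS)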

Figure 6. Self-assessed overall thermal sensation versus estimated for subject 3 with no suit.

6 Results

Sample results showing a comparison of self-assessed (or subjective) thermal sensation and that estimated by the model are shown in figures 6 and 7, with the former showing a trial without the EOD suit and the latter showing a trial with the suit. For all body components (and overall), the self-assessed values with no EOD suit are always lower than those estimated by the model. However, when the suit is worn, the model mostly, if not always, underestimates.

Self-assessed thermal sensation does not necessarily follow the same trend as the temperature of any particular skin site. As shown in figure 8, a dramatic increase in thermal sensation occurs between 30 and 40 minutes, corresponding to the subject feeling much hotter than previously. At the same time, the chest temperature (which would be the only site conventionally monitored) has actually decreased by about one degree. Just after this, the self-assessed sensation drops to 2 while the temperature increases. In conclusion, the results both confirm the need for detailed measurement of temperature at multiple skin sites and confirm that Zhang's model is a good starting point towards estimating the thermal sensation of subjects wearing an EOD suit.

In trials with other subjects, who had previously worn the EOD suit fewer times than subject 3, large changes in skin temperature were observed during the activity regime. Subject 3, however, showed much more stable temperatures.
Figure 7. Self-assessed overall thermal sensation versus estimated for subject 3 with the full protective suit and no cooling.

Figure 8. Self-assessed thermal sensation compared with chest skin temperature for subject 1.

7 Conclusions and future work

It is clear that wearing heavy armour during bomb disposal missions may induce uncompensable heat stress due to the enclosed nature of bomb disposal (EOD) suits. The work presented here has positively assessed the need for detailed measurement of skin temperature, the applicability of body sensor network technology to this application domain, and the suitability of modelling thermal sensation based on skin temperature. The approach taken by the authors exploited in-network information extraction and communication of real-time thermal sensation to mission control to facilitate both high-yield and timely remedial actions. This work has the potential to provide real improvement to both the working conditions of EOD technicians and greater levels of safety.

As it stands, the model used is not a perfect fit for this application. Some possible reasons are:

• experimentation has shown that there is a tendency for discomfort to grow with time when wearing the suit, possibly due to the subject becoming tired, thus affecting their subjective assessment;

• the model was not specifically designed for estimating thermal sensation while wearing this type of protective clothing;

• thermal sensation is a subjective measure, which may lead to variations in the reporting between subjects.

In future work, it is planned to develop a new or revised model that better accounts for the environmental factors of the EOD suit. Also, it is envisaged to use thermal sensation information as a control parameter for autonomous activation of an in-suit cooling system. Extensive experimentation and trials are planned for January through to March 2008.

References

[1] H. H. Asada, P. A. Shaltis, A. Reisner, S. Rhee, and R. C. Hutchinson. Mobile monitoring with wearable photoplethysmographic biosensors. IEEE Engineering in Medicine and Biology Magazine, 22(3):28–40, 2003.
[2] S. S. Cheung, T. M. McLellan, and S. Tenaglia. The thermophysiology of uncompensable heat stress: Physiological manipulations and individual characteristics. Sports Med., 29:329–359, 2000.
[3] Energy Systems Research Unit, University of Strathclyde. Thermal comfort models. Available online http://www.esru.strath.ac.uk/Reference/concepts/thermal_comfort.htm; accessed 26-11-2007.
[4] T. R. Fulford-Jones, G.-Y. Wei, and M. Welsh. A portable, low-power, wireless two-lead EKG system. In Proceedings of the 26th IEEE EMBS Annual International Conference, San Francisco, September 2004.
[5] T. Gao, D. Greenspan, M. Welsh, R. R. Juang, and A. Alm. Vital signs monitoring and patient tracking over a wireless network. In 27th Annual International Conference of the IEEE EMBS, pages 102–105, Shanghai, September 2005.
[6] E. Jovanov, A. Milenkovic, C. Otto, and P. C. de Groen. A wireless body area network of intelligent motion sensors for computer assisted physical rehabilitation. Journal of NeuroEngineering and Rehabilitation, 2(6), March 2005.
[7] E. Jovanov, D. Raskovic, J. Price, J. Chapman, A. Moore, and A. Krishnamurthy. Patient monitoring using personal area networks of wireless intelligent sensors. Biomedical Sciences Instrumentation, 37:373–378, 2001.
[8] S. L. Keoh, N. Dulay, E. Lupu, K. Twidle, A. E. Schaeffer-Filho, M. Sloman, S. Heeps, S. Strowes, and J. Sventek. Self managed cell: A middleware for managing body sensor networks. In Proceedings of the 4th International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (Mobiquitous), Philadelphia, USA, August 2007.
[9] J. Kistemaker, E. den Hartog, and C. L. Koerhuis. Thermal strain and cooling during work in a bomb disposal (EOD) suit. In Personal Armour Systems Symposium, pages 20–22, Royal Armouries Museum, Leeds, United Kingdom, September 2006. TNO Defence, Safety and Security, Soesterberg, The Netherlands.
[10] B. Lo and G.-Z. Yang. Architecture for body sensor networks. In Perspective in Pervasive Computing, pages 23–28, October 2005.
[11] D. Malan, T. Fulford-Jones, M. Welsh, and S. Moulton. CodeBlue: An ad hoc sensor network infrastructure for emergency medical care. In International Workshop on Wearable and Implantable Body Sensor Networks, April 2004.
[12] N. L. Ramanathan. A new weighting system for mean surface temperature of the human body. J. App. Physiol., 19:531–533, 1964.
[13] M. N. Sawka and A. J. Young. ACSM's Advanced Exercise Physiology, chapter Physiological systems and their responses to conditions of heat and cold. Lippincott Williams and Wilkins, Baltimore, USA, 2006.
[14] C. A. Shanks. Mean skin temperature during anaesthesia: An assessment of formulae in the supine surgical patient. British Journal of Anaesthesia, 47(8):871–876, 1975.
[15] V. Shnayder, B.-R. Chen, K. Lorincz, T. R. F. Fulford-Jones, and M. Welsh. Sensor networks for medical care. Technical report, Division of Engineering and Applied Sciences, Harvard University, 2005.
[16] E. M. Tapia, N. Marmasse, S. S. Intille, and K. Larson. MITes: Wireless portable sensors for studying behavior. In Proceedings of Extended Abstracts UbiComp 2004: Ubiquitous Computing, 2004.
[17] C. D. Thake and M. J. Price. Reducing uncompensable heat stress in a bomb disposal (EOD) suit: A laboratory based assessment. In Proceedings of the 12th International Conference on Environmental Ergonomics (ICEE 2007), Piran, Slovenia, 2007. ISBN 978-961-90545-1-2.
[18] Vivometrics. Vivometrics: Better results through non-invasive monitoring. Available online http://www.vivometrics.com; accessed 16-11-2007.
[19] W. Walker, T. Polk, A. Hande, and D. Bhatia. Remote blood pressure monitoring using a wireless sensor network. In 6th Annual IEEE Emerging Information Technology Conference, Dallas, Texas, August 2006.
[20] G. Welch and G. Bishop. An introduction to the Kalman filter. Technical report, University of North Carolina at Chapel Hill, 1995. Available online http://www.cs.unc.edu/~welch/kalman; accessed 3-12-2007.
[21] A. Young, M. Sawka, Y. Epstein, B. Decristofano, and K. Pandolf. Cooling different body surfaces during upper and lower body exercise. J. App. Physiol., 63:1218–1223, 1987.
[22] H. Zhang. Human Thermal Sensation and Comfort in Transient and Non-Uniform Thermal Environments. PhD thesis, University of California, Berkeley, 2003.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Neighborhood-Aware Density Control in


Wireless Sensor Networks
Mu-Huan Chiang, Gregory T. Byrd
Center for Efficient, Secure and Reliable Computing (CESR)
North Carolina State University
Raleigh, NC 27695, USA
{mchiang, gbyrd}@ncsu.edu

Abstract—In dense wireless sensor networks, density control is an important technique for prolonging the network's lifetime. However, due to the intrinsic many-to-one communication pattern of sensor networks, nodes close to the sink tend to deplete their energy faster than other nodes. This unbalanced energy usage among nodes significantly reduces the network lifetime. In this paper, we propose neighborhood-aware density control (NADC) to alleviate this undesired effect by reducing unnecessary overhearing along routing paths. In NADC, nodes observe their neighborhoods and dynamically adapt their participation in the multihop network topology. Since the neighborhood information can be easily observed through the overheard information, the density in different regions can be adaptively adjusted in a totally distributed manner. Simulation experiments demonstrate that NADC alleviates the extremely unbalanced workload and extends the effective network lifetime without significant increase in data delivery latency.

1. INTRODUCTION

Wireless sensor networks (WSNs) have emerged as a promising solution for various applications such as climate monitoring, tactical surveillance, and vehicle tracking. Energy consumption is the most important factor in determining the lifetime of a sensor network, because sensor nodes usually have very limited energy sources. Optimizing the energy usage in sensor networks is difficult because it involves not only reduction of energy consumption but also prolonging the lifetime of the whole network.

Various energy-efficient paradigms and strategies have been devised to collect and route packets towards the sink, trying to maximize the lifetime of sensor nodes while maintaining system performance and operational fidelity. However, due to the intrinsic many-to-one communication pattern of sensor networks, the nodes closer to the sink have to forward more packets than the ones at the periphery of the network. Therefore, nodes around the sink would deplete their energy faster, leading to an energy hole around the sink. This energy hole problem occurs regardless of the routing strategies and MAC protocols, and may severely reduce the network lifetime.

To alleviate this undesirable effect, a mechanism to balance the energy usage among sensor nodes is necessary. Data aggregation [11] has been proved to be an efficient approach to decrease the in-network traffic, and hence reduce the forwarding workload of sensor nodes. However, the energy consumption on overhearing still results in an obvious imbalance in energy usage. In this paper, we propose a mechanism to alleviate the unbalanced energy usage by reducing unnecessary overhearing along routing paths. Our approach extends the concept of density control mechanisms. Instead of achieving a uniform density over the network, we further exploit the local information to reduce the density along routing paths, in order to reduce the unnecessary overhearing energy consumption.

Some related works are reviewed in Section 2. Section 3 describes our proposed approach, followed by some implementation issues in Section 4. We present the evaluation results in Section 5, and conclude in Section 6.

2. RELATED WORK

A. MAC Layer Overhearing Avoidance

Overhearing is one of the most important components of energy waste in sensor networks. It occurs when a node receives packets that are destined to other nodes. Various approaches in the medium-access control (MAC) layer have been proposed to reduce the overhearing energy consumption.

The MAC protocols in sensor networks can be categorized into synchronized and asynchronous approaches. Synchronized protocols, such as S-MAC [29] and T-MAC [23], use the RTS-CTS mechanism for overhearing avoidance. On the other hand, asynchronous protocols, such as B-MAC [17] and WiseMAC [7], rely on low power listening, also called preamble sampling, while communicating packets. These protocols suffer from the overhearing problem where receivers who are not the target of the sender also wake up during the long preamble and have to stay awake until the end of the preamble (or the packet). With the support of some packetizing radios, X-MAC [3] reduces this unnecessary overhearing by employing a series of strobed preambles, each containing



the address of the target receiver. Non-target receivers can quickly go back to sleep without receiving the whole preamble or the packet.

While studies have demonstrated the energy-saving capabilities of these approaches, the MAC layer approaches reduce the energy usage uniformly over the network. The unbalanced energy usage among sensor nodes is not considered. Thus upper layer approaches can be further employed to mitigate the unbalanced energy usage.

B. Density Control

Radios consume energy not only when sending and receiving, but also when listening or idle. The great energy consumption associated with idle time and overhearing suggests that energy optimizations must turn off the radio, not simply reduce packet transmission and reception [5]. While energy-efficient MAC protocols turn off the radio whenever possible, application-level information can be further used to turn off redundant nodes (for longer time periods than MAC layer approaches) for further energy reduction.

Sleep-based topology control, also referred to as density control, is an approach that maintains the density of working sensors at a desirable level. Specifically, density control ensures only a subset of sensor nodes operate in the active mode, while meeting the sensing coverage or connectivity requirements.

Various methods have been proposed for density control. GAF [27] and OGDC [19] use location information to achieve uniform density over the network. In AFECA [26] and PEAS [28], each node switches between sleeping and active based on the number of active neighbors. ASCENT [5] further takes the link status into consideration, and makes the state transitioning based on the number of active neighbors and link quality measurements. SPAN [6] adaptively elects coordinators from all nodes to form a forwarding backbone in the network. In STEM [18], the link between nodes is constructed by having each node periodically turn on its radio to detect if someone else wants to communicate with it. STEM also proposes using different frequency bands for the wakeup protocol and the data transfer to avoid interference.

C. Lifetime-Aware Data Gathering

While reducing the nodes' energy consumption has become a major concern, it is also important to maintain a balance of energy usage in the network so that certain nodes do not die much earlier than others, resulting in the energy hole problem. Some algorithms have been devised to solve the unbalanced energy usage among sensors in order to prolong the network lifetime.

Optimal sensor management [16] is formulated as a generalized maximum flow graph problem with additional constraints to maximize the lifetime of an application. The maximum lifetime data gathering problem [10] presents polynomial-time algorithms to solve the data gathering problem, given the locations of the sensors and the sink and the available energy at each sensor. Olariu et al. also propose an iterative process to avoid the creation of energy holes around the sink [15].

Although these algorithms solve the problem by pre-determining the routing topology based on the anticipated traffic within the network, the topology or traffic information of the whole network is usually required, and the sensors have to perform complex computations to decide the work schedule or transmission power. These requirements are impractical in a sensor network environment.

D. Location-Varying Deployment

Some mechanisms have been proposed to cope with the unbalanced energy usage problem while deploying the sensor nodes. Location-varying node density [12], [25] is a method that deploys more sensor nodes around the sink to provide more redundancy. Deploying assisting gateways or special relaying nodes in the network is also proposed to provide better connectivity and facilitate scalability [1], [30]. Sichitiu et al. [21] propose to use multiple levels of batteries. Nodes closer to the sink are equipped with batteries of large capacity, and other nodes have battery capacity inversely proportional to the distance from the sink.

3. NEIGHBORHOOD-AWARE DENSITY CONTROL

While density control mechanisms show the capability to successfully reduce the unnecessary energy consumption in WSNs, the unbalanced energy usage still negatively impacts the network lifetime. Although some centralized algorithms and sensor deployment methods have shown their capability of mitigating the problem, these approaches may not fit various sensor network environments.

Therefore, we propose neighborhood-aware density control (NADC), which requires only local information and simple computation, to alleviate the unbalanced energy usage among sensor nodes. In this section, we introduce the system assumptions used throughout the work, describe the uniform density control mechanism, and then present the algorithm of NADC. For illustration purposes, we use a simple density control method based on the number of active neighboring nodes. The implementation of NADC with other density control methods will be addressed in future work.

A. System Assumptions

NADC is based on the assumption that most of the time the network is only sensing the environment, waiting for an event to happen. One example of such an application is a sensor network designed to track animals in a forest.

These networks have to remain operational for months or years, but send out data only on the occurrence of an animal passing by. Although it is desirable that the transfer state be energy-efficient, it may be more important that the monitoring state be low-power, as the network resides in this state for most of the time. This observation holds true for many other applications as well.

We assume that the sensors in the network are homogeneous; that is, all sensors have the same computation capability and radio range, and are powered by batteries with the same available energy. We also assume that the power consumption of sensor nodes is dominated by communication costs, as opposed to sensing and processing costs. This assumption is reasonable for many types of sensor nodes that require very little energy, such as pressure and temperature sensors.

B. Uniform Density Control

Many density control methods have been proposed to achieve a uniform density over the network [27], [26], [28]. Thus we use a simple uniform density control (UDC) mechanism as our baseline to evaluate the performance of NADC. Similar to AFECA and PEAS, UDC uses the number of active neighbors as the criterion for node state transitioning. In UDC, nodes are in one of three states: sleep, discovery, and active. A state diagram is shown in Fig. 1. Time is divided into epochs (see Section 4.3), and the state transitioning occurs at the end of epochs.

Figure 1. State transitioning diagram of uniform density control. Nb represents the number of neighbors, and Th represents the threshold.

Initially nodes start out in the sleep state. When sleeping, the radio is off, not consuming power. In this state they keep their radio turned off for time Tsleep, then transition to discovery. If a node has data to send while sleeping, it transitions to active and starts sending the data. (Although the radio is off, sensors and other parts of the node are still on.)

When in the discovery state, a node turns on its radio and listens for messages for an epoch. It also performs neighbor discovery by maintaining the number of active neighbors, which is the number of neighboring nodes it hears in this epoch. At the end of the epoch, the node transitions to active if it gets a routing message and participates in the route, or if it decides to send data. Otherwise it uses the number of active neighbors, Nb, and a predefined threshold, Th, as the transitioning criterion: it transitions to active if Nb < Th, otherwise it returns to sleep.

When in the active state, a node does the data sending/receiving as well as neighbor discovery. If the node has no data to send in this epoch, it sends out a beacon packet as a notification of its existence. At the end of the epoch, it makes the state transitioning decision by following the same rules as in the discovery state.

In this simple UDC mechanism, all nodes except source and forwarding nodes perform density control based on a predefined threshold value. The threshold value is chosen so that there is a high probability that the active nodes form a connected graph, so that multihop forwarding works. If the density of active nodes cannot provide sufficient connectivity in the network, the data delivery latency (i.e. the time required to relay an event packet to the sink) will increase dramatically. Since a node does not know whether it is required to be active in order to maintain good connectivity in the network, to be conservative the threshold value tends to be set high to keep a large number of active nodes. That means nodes are active even when they could be asleep, which results in unnecessary energy consumption.

C. Neighborhood Type

In many event-based sensor network applications, such as habitat monitoring or intrusion detection systems, source nodes tend to be located in the same area where events occur. As shown in Fig. 2, we can categorize the sensor field into three different types: hot area, midstream area, and silent area. A node's neighborhood type is identified according to the following rules:

• A node is in a hot area if it is a source or at least one of its neighbors is a source. A node is referred to as a source if it detects events and generates data packets.

• A node is in a midstream area if none of its neighbors is a source and at least one of its neighbors or itself is a forwarding node. A forwarding node doesn't generate data packets but forwards packets along the routing path.

• A node is in a silent area if all its neighbors are idle nodes (i.e. none of its neighbors is a source or forwarding node).

By embedding the node type information in the transmitted packets, each node can keep track of the node type of its neighbors, and easily identify its neighborhood type. Algorithm 1 shows the process of neighborhood observation.

D. Neighborhood-Aware Density Control

The objective of NADC is to avoid unnecessary overhearing by reducing the active node density along the routing paths (i.e. hot areas and midstream areas) while keeping the data delivery latency in a reasonable range. In UDC, since all nodes other than source and

forwarding nodes perform density control based on the threshold value, if different threshold values are used in different areas, the corresponding node density can be controlled in different levels.

The mechanism of NADC is very similar to UDC. However, instead of achieving a uniform node density throughout the network, NADC uses the neighborhood type to determine the threshold value. Since there are packets being transmitted in hot areas and midstream areas, the node density in these areas can be reduced by using a smaller threshold value in order to avoid unnecessary overhearing. The node density in silent areas is maintained in the normal level by using the original threshold value to provide good connectivity in the network. Algorithm 2 shows the state transitioning procedure of NADC.

Figure 2. Different neighborhood types in event-based sensor networks. The outermost rectangular area represents the whole sensor field.

Table 1. NADC node states
State      Radio  Sensor  Communication task
Sleep      off    on      —
Discovery  on     on      neighbor discovery, neighborhood observation, data receiving
Active     on     on      neighbor discovery, neighborhood observation, data sending/receiving

Algorithm 1 Neighborhood observation of NADC
function PacketReceived()
  switch ReceivedPacket.NodeType
    case Src:
      Node.NeighborhoodType = HOTAREA
    case Fwd:
      if Node.NeighborhoodType != HOTAREA then
        Node.NeighborhoodType = MIDSTREAMAREA
      end if
    case Idle:
      if Node.NeighborhoodType != HOTAREA and Node.NeighborhoodType != MIDSTREAMAREA then
        Node.NeighborhoodType = SILENTAREA
      end if
  end switch
end function

Algorithm 2 State transitioning procedure of NADC
switch NeighborhoodType
  case hot area:
    Th = THRESHOLD1
  case midstream area:
    Th = THRESHOLD2
  case silent area:
    Th = THRESHOLD3
end switch
if Node.type == SRC or Node.type == FWD then
  Node.state = ACTIVE
else
  switch Node.state
    case SLEEP:
      if sleep_time >= Tsleep then
        Node.state = DISCOVERY
      end if
    case DISCOVERY:
      if NumberOfActiveNeighbors >= Th then
        Node.state = SLEEP
      else
        Node.state = ACTIVE
      end if
    case ACTIVE:
      if NumberOfActiveNeighbors >= Th then
        Node.state = SLEEP
      end if
  end switch
end if
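For readers who prefer executable code, the following Python sketch combines Algorithms 1 and 2 in one place. It is illustrative only (the prototype targets TinyOS/B-MAC, not Python); the threshold values shown follow the NADC setting later used in Section 5 and are otherwise assumptions.

    # Executable rendering of Algorithms 1 and 2 (illustrative only).
    SLEEP, DISCOVERY, ACTIVE = "sleep", "discovery", "active"
    HOT, MIDSTREAM, SILENT = "hot", "midstream", "silent"
    THRESHOLDS = {HOT: 2, MIDSTREAM: 2, SILENT: 12}   # NADC setting of Section 5

    class Node:
        def __init__(self):
            self.state = SLEEP
            self.node_type = "idle"          # "src", "fwd" or "idle"
            self.neighborhood = SILENT
            self.active_neighbors = 0
            self.sleep_epochs = 0

        def observe(self, overheard_type):
            # Algorithm 1: classify the neighborhood from overheard node types.
            if overheard_type == "src":
                self.neighborhood = HOT
            elif overheard_type == "fwd" and self.neighborhood != HOT:
                self.neighborhood = MIDSTREAM

        def end_of_epoch(self, t_sleep):
            # Algorithm 2: state transition at the end of an epoch.
            th = THRESHOLDS[self.neighborhood]
            if self.node_type in ("src", "fwd"):
                self.state = ACTIVE
            elif self.state == SLEEP:
                self.sleep_epochs += 1
                if self.sleep_epochs >= t_sleep:
                    self.state, self.sleep_epochs = DISCOVERY, 0
            elif self.active_neighbors >= th:       # discovery or active
                self.state = SLEEP
            elif self.state == DISCOVERY:
                self.state = ACTIVE                  # below threshold: stay on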
Initially nodes start out in the sleep state, where a random timer is used to set the length of sleep in a predefined range to avoid synchronization. When a node wakes up, it transitions to the discovery state, and performs neighbor discovery and neighborhood observation for an epoch. At the end of the epoch, the node obtains the number of active neighboring nodes and its neighborhood type from the overheard information, and then transits to either the active state or the sleep state according to the NADC state transitioning procedure. If a node has data to send or receives packets that need to be forwarded, it transitions to the active state in the next epoch to do the communication tasks, regardless of its current state. When a node is in the active state, it performs neighbor discovery and neighborhood observation as in the discovery state, and makes the state transitioning decision at the end of each epoch. Table 1 lists the three node states of NADC.

E. Tuning NADC

NADC leaves choices of some parameters to the application, including thresholds and sleep time. In our design, we have three threshold values, Th_h, Th_m, and Th_s, corresponding to the threshold in hot, midstream, and silent areas, respectively. When there is no event occurring, the node density of the network is determined by Th_s; therefore, Th_s should be set to an appropriate value to maintain the required connectivity for future event delivery. If there is not sufficient connectivity maintained when events occur, data packets will have to be re-routed several times to reach the sink, resulting in long delivery latency. After events occur, Th_h and Th_m determine the node density in hot areas and midstream areas. Since the unnecessary overhearing is proportional to the node density in these areas, Th_h and Th_m should be set smaller to avoid overhearing energy consumption.¹

When a node transitions to the sleep state, it sets the sleep time T_sleep to determine how many epochs it will sleep. Thus the range of T_sleep affects the speed at which NADC adapts to topology changes (due to unstable links, new source nodes, or dead nodes). A short sleep time allows nodes to wake up frequently and rapidly adapt to environment changes, while a longer sleep time saves more energy and suits relatively stable networks. A node can dynamically adjust its range of sleep time according to the stability of the network.

¹ Here we assume that nodes have their sensors on while sleeping, thus low density doesn't lead to low sensing coverage or observation fidelity. In systems in which nodes turn off sensors while sleeping, Th_h can be set to a higher value to keep more active nodes near the event sources for higher observation fidelity.

4. IMPLEMENTATION ISSUES

A. Acquisition of Neighbor Information

In NADC, we exploit the overhearing effect as an approach for neighbor discovery and neighborhood observation. Overhearing is one of the most important components of energy waste in sensor networks, and therefore overhearing avoidance has been an important issue in the design of MAC protocols. Here we first describe how NADC works on B-MAC [17], the default MAC protocol in TinyOS [8], then discuss how neighbor information can be acquired if another overhearing-avoiding MAC protocol is used.

In B-MAC, nodes have asynchronous duty cycles, and each packet is transmitted with a long preamble so that the receiver is guaranteed to wake up during the preamble transmission time and remain awake to receive the data. Since all the neighbors of the sender will receive the packet, the neighbor information can be easily acquired by embedding the helpful information (i.e. node type) in the packet. When a node hears a packet not addressed to itself, it can retrieve the helpful information before dropping the packet.

Synchronized MAC protocols, such as S-MAC [29] and T-MAC [23], use RTS-CTS to avoid overhearing. Although the data packet is received only by the intended recipient, the RTS and CTS are still received by all the neighbors. So the helpful information can be embedded in the RTS or CTS to facilitate the neighbor information acquisition. For asynchronous MAC protocols, X-MAC [3] exploits a shortened preamble to reduce the overhearing waste on long preambles and data packets. The shortened preamble, however, is still detected by all the neighboring nodes. Therefore, the helpful information can still be put in the preamble and be acquired by the neighbors.
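A minimal Python sketch of this acquisition step is given below. It is not the TinyOS implementation; the header layout and the handler signature are assumptions used only to illustrate how the node type travels in every packet and is harvested from overheard traffic.

    # Sketch of neighbor-information acquisition via overhearing. The
    # sender's node type rides in every packet header; a node that overhears
    # a packet not addressed to it records the sender before dropping the
    # payload. Header layout and handler are hypothetical.
    from dataclasses import dataclass

    @dataclass
    class PacketHeader:
        src: int          # sender id
        dst: int          # intended receiver id
        node_type: str    # "src", "fwd" or "idle" -- the "helpful information"

    def on_radio_receive(my_id, header, neighbor_table, observe):
        # Always harvest neighbor information, even from overheard traffic.
        neighbor_table[header.src] = header.node_type
        observe(header.node_type)              # feeds Algorithm 1
        if header.dst != my_id:
            return None                        # overheard: drop the payload
        return header                          # addressed to us: keep processing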
B. Routing in Dynamic Topology

As in query processing systems [14], the operation of the network consists of two phases: query dissemination and result collection. An aggregation tree is constructed during the query dissemination phase by flooding a query message from the sink to the network. If a node is asleep during the query dissemination and wakes up afterwards, it can get the query via query sharing [14]: when a node hears a result packet for a query that it is not yet running, it asks the sender of that data for a copy of the query, and begins running it.

In the result collection phase, each node sends the packets to the sink along the aggregation tree. With density control mechanisms, however, nodes turn on and off, and the parent node may not be awake to receive the packet. We modified MintRoute [24], the built-in routing protocol in TinyOS, to cope with such dynamic topology change. Each node maintains a neighbor table, and dynamically chooses its parent before transmitting packets. If no potential parents are available, the node temporarily holds the packet until a new parent is available.
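The parent selection and packet-holding behaviour can be sketched as follows. This is illustrative only; the prototype modifies MintRoute [24], and the neighbor-table fields (awake flag, depth) and the send() hook used here are assumptions.

    # Sketch of dynamic parent choice before transmission (illustrative;
    # not the modified MintRoute code). Each neighbor-table entry is assumed
    # to carry {"id", "awake", "depth"}.
    def choose_parent(neighbor_table, my_depth):
        # Prefer an awake neighbor that is closer to the sink than we are.
        candidates = [n for n in neighbor_table.values()
                      if n["awake"] and n["depth"] < my_depth]
        return min(candidates, key=lambda n: n["depth"]) if candidates else None

    def forward(packet, neighbor_table, my_depth, pending, send):
        parent = choose_parent(neighbor_table, my_depth)
        if parent is None:
            pending.append(packet)        # hold until a new parent shows up
        else:
            send(packet, parent["id"])    # hypothetical radio send routine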
C. Pipelined Aggregation

We use pipelined aggregation [13], [14] to perform the in-network aggregation for result collection. In this approach, time is divided into intervals, which we call epochs. In each epoch, every node listens for packets from its children, computes the aggregate and sends it to its parent. The aggregate consists of the combination of its own local sensor readings with any child values it received in the previous epoch. Here we only consider aggregates which result in the same packet size, such as MAX, MIN, and SUM.
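A compact sketch of one pipelined-aggregation step, under the same size-preserving assumption, is shown below (illustrative; not the TinyDB/TinyOS code).

    # One epoch of pipelined aggregation: combine the node's own reading
    # with the child values received in the previous epoch into a single
    # value of unchanged size, then hand it to the parent.
    AGGREGATES = {"MAX": max, "MIN": min, "SUM": sum}

    def epoch_step(local_reading, child_values_prev_epoch, agg="MAX"):
        return AGGREGATES[agg]([local_reading] + list(child_values_prev_epoch))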
5. EVALUATION RESULTS

A. Simulation Settings

To reduce unnecessary overhearing along routing paths, NADC tries to reduce the density of active nodes in hot areas and midstream areas, while keeping the density at the normal level in silent areas in order to maintain good connectivity. To evaluate the performance of NADC, we compare NADC with three different UDC settings:

• UDC-high: Uniform density control with high active node density (Th_h = Th_m = Th_s = 12).

• UDC-medium: Uniform density control with medium active node density (Th_h = Th_m = Th_s = 6).

• UDC-low: Uniform density control with low active node density (Th_h = Th_m = Th_s = 2).

• NADC: Neighborhood-aware density control (Th_h = Th_m = 2, Th_s = 12).²

In each epoch (20 sec), if a node detects an event, it generates a data packet and forwards it to the sink. Data aggregation is performed in each node to combine multiple received packets into a single packet on their way to the sink. We also use implicit acknowledgment [4] to maintain the reliability of packet delivery. For simplicity, we use a fixed range of sleep time (3–6 epochs) whenever a node transitions into the sleep state, while leaving adaptive sleep time for future work.

We use Prowler [22], a probabilistic wireless network simulator, as the simulation tool. The network settings for the simulation are shown in Table 2. In the network, a 320 × 320 field is evenly divided into 1024 grids, and two nodes are placed at random positions within each grid. The sink is placed at the center of the field. We use B-MAC [17] (with the default configuration in TinyOS) as the MAC protocol, with low-power listening enabled. The communication range is set to 22 units, which guarantees that each node is able to communicate with at least one node in its neighboring grids.

² We reduce Th_h and Th_m aggressively for overhearing avoidance. Depending on different system assumptions, different threshold values may be used (see Section 3.5).
Table 2. Network setting
TOPOLOGY
  field range: 320×320
  number of grids: 1024
  nodes per grid: 2
  node distribution: grid-based + random displacement
RADIO MODEL
  max radio range: 22
  link failure rate: 0.05
  MAC protocol: B-MAC
ENERGY MODEL
  radio Tx cost: 1.2 per packet
  radio Rx cost (including overhearing): 1 per packet
  energy budget: 50 per node

Figure 3. Distribution of nodes with different energy consumption.

Figure 4. The timing of event delivery from 20 randomly chosen source nodes in the network. Each source sends one event at time 0.

Table 3. Statistics of the nodes' energy usage in Fig. 3
Energy usage        UDC-high  UDC-medium  UDC-low  NADC
Total               3531      2696        2984     2456
Average             1.72      1.31        1.46     1.2
Standard deviation  3.04      1.89        2.06     1.73
Maximum             33        15          15       13

The number of packets sent/received/overheard is used as the metric for workload comparison, and the corresponding energy usage is calculated using the Mica2 power model measured by Shnayder et al. [20], where the ratio of energy consumption for packet sending and receiving is 1.2:1 with 0 dBm transmission power.
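The energy accounting implied by this setting can be written in a few lines of Python; the sketch below simply applies the Tx/Rx costs and the per-node budget from Table 2 and is not part of the Prowler simulator itself.

    # Per-node energy accounting used for the workload comparison: packet
    # counts are turned into energy units with the 1.2 : 1 Tx/Rx ratio of
    # Table 2; a node is dead once its 50-unit budget is exhausted.
    TX_COST, RX_COST, BUDGET = 1.2, 1.0, 50.0

    def energy_used(sent, received, overheard):
        # Overheard packets are still fully received by the radio.
        return TX_COST * sent + RX_COST * (received + overheard)

    def is_dead(sent, received, overheard):
        return energy_used(sent, received, overheard) >= BUDGET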
B. Unbalanced Energy Consumption

To illustrate the unbalanced workload among sensor nodes, we randomly choose 20 source nodes, which generate event packets simultaneously for one epoch, and measure the energy consumption of individual sensor nodes. Fig. 3 shows the number of nodes with respect to different amounts of energy consumption. The corresponding statistics are shown in Table 3. The result shows that UDC-high has the most unbalanced energy usage among sensor nodes, where a few nodes consume much more energy than others. NADC not only reduces the total energy consumption by avoiding unnecessary overhearing, but also alleviates the unbalanced energy usage. We can see in the result that NADC has the smallest standard deviation of the nodes' energy usage; that means NADC has the most balanced energy usage among sensor nodes compared to the other three schemes.

Although UDC-low maintains the lowest active node density compared to the other schemes, which implies less overhearing waste, it still has higher total energy consumption compared to NADC and UDC-medium. The cause is the bad connectivity between source nodes and the sink, which results in more re-routed packets and re-transmissions. The phenomenon can be observed in Fig. 4, which shows the timing of event delivery to the sink. UDC-low takes a much longer time for the sink to receive all the events because of the bad connectivity in the network. However, the delivery latency relates not only to the active node density, but also to the distance from the source nodes to the sink and the mobility model of the sensing targets. The issue is further discussed in the next section.

C. Data Delivery Latency

The data delivery latency is dominated by the connectivity from the source nodes to the sink. If the density of active nodes is not high enough to maintain the connectivity, the forwarding nodes have to re-route the packets or even temporarily hold the event data until a new route is available.

To further explore the impact of different density control settings on the delivery latency, we explicitly set the source nodes to be a fixed number of hops away from the sink, so that no extra source nodes will show up along the routing paths and increase the active node density. That means the active node density along routing paths will be determined only by the NADC threshold values. In this set of simulations, 10 source nodes are randomly chosen in level 4, level 8, and level 12 respectively, and the source nodes generate event packets simultaneously for 6 consecutive epochs. We measure the time from the generation of the first event to the successful delivery of all the events as an indication of delivery latency.

Fig. 5 shows the measured results with different density control settings. When the source nodes are in level 4, which is relatively close to the sink, the difference among the settings is not obvious. However, when the source nodes are farther away from the sink, the connectivity from the source nodes to the sink has a decisive impact on the delivery latency. When the source nodes are in level 12, the latency of UDC-low is more than twice that of UDC-high. NADC, on the other hand, has far less increase in latency compared to UDC-low due to its capability of maintaining better connectivity from the source nodes to the sink.
Maximum 33 15 15 13 far less increase in latency compared to UDC-low due to

Figure 5. The difference in the timing of event delivery from source nodes different numbers of hops away from the sink. The result shows the average of five simulation runs, and the error bar represents the variation.

D. System Performance and Network Lifetime

To evaluate the performance of NADC on the network lifetime, we simulate an application that keeps track of moving targets in the network. Various definitions of network lifetime have been proposed in the literature [2]. However, a formal definition is not straightforward and may depend on the application scenario in which the network is used. Here we use the following four metrics as indications of network lifetime: a) number of alive neighbors of the sink, b) number of delivered events to the sink, c) total number of dead nodes, and d) total number of detected events in the network. While the number of alive neighbors of the sink and the number of delivered events can be directly obtained from the sink, the total number of dead nodes and detected events require knowledge of the whole network.

In the simulation, we have 20 moving targets in the network. The targets are randomly placed, and the Random Waypoint model [9] is used to model the target movement. The sensors close to the targets (i.e. within the sensing range of 5 units) will detect the event and become source nodes. Source nodes generate event packets in each epoch. Event packets are routed on the aggregation tree towards the sink, and aggregation is performed on routing paths.

Fig. 6 shows the energy hole problem due to unbalanced energy usage by dividing the nodes into four classes based on the nodes' location. Among the four settings, UDC-high suffers from the worst unbalanced energy usage among nodes in different levels, which results in a high percentage of dead nodes near the sink. Since there are more active nodes along the routing paths, the large amount of overhearing waste soon depletes the limited energy budget of nodes close to the sink. While there are less than 5% of dead nodes in the periphery of the network (i.e. levels higher than 7), more than 80% of the nodes have died in the central area (i.e. within the first two levels close to the sink). When there are not enough alive nodes in the inner levels of the network to maintain the routing paths to the sink, the curves representing the inner levels stop rising because few event packets can be propagated to the central area.

The unbalanced energy usage can be alleviated if the overhearing waste can be reduced in the areas with high communication activity. This can be achieved by either uniformly reducing the density in the network (as UDC-low does) or selectively reducing the density along the routing paths (as NADC does). We show in Fig. 6 that NADC and UDC-low successfully extend the lifetime of the nodes close to the sink by reducing the unnecessary overhearing in the central area of the network.

The lifetime indications of the simulation are shown in Fig. 7. As expected, the dead nodes soon form a bottleneck of communication to the sink. Although the total number of detected events keeps increasing, the delivery rate decreases dramatically due to the energy hole problem. However, the energy hole problem may not necessarily first occur at the nodes closest to the sink. Some nodes farther away may deplete their energy sooner and affect the event delivery, and the sink may stop receiving event packets even when it still has alive neighbors.

Compared with the three UDC settings, NADC has a longer network lifetime than UDC-high and UDC-medium, and achieves a similar network lifetime to UDC-low. However, UDC-low achieves lower energy consumption at the cost of a significant increase in data delivery latency, while NADC has a much smaller increase in latency.

6. CONCLUSION

In this paper, we propose neighborhood-aware density control to alleviate the unbalanced energy usage by reducing the unnecessary overhearing along routing paths. In NADC, nodes observe their neighborhoods and dynamically adapt their participation in the multihop network topology. Since the neighborhood information can be easily observed through the overheard information, the density in different regions can be adaptively adjusted in a totally distributed manner. By reducing the node density near the routing paths while keeping the nodes involved in packet generation or forwarding in the active state, the overhearing waste can be reduced without a dramatic increase in data delivery latency.

We ran simulations with four different density control settings: UDC-high, UDC-medium, UDC-low, and NADC. Although UDC-high has the lowest delivery latency, the high density of active nodes results in a large amount of unnecessary overhearing along the routing paths and quickly forms an energy hole near the sink. UDC-low alleviates the extremely unbalanced energy usage problem, but incurs a significant increase in latency. The adaptability of NADC to adjust the node density in different regions successfully alleviates the unbalanced energy usage with a small increase in delivery latency.

Figure 6. Percentage of dead nodes in areas in different levels of the network. The level represents the number of hops away from the sink. The parenthesized number represents the number of nodes in the specified levels.

Figure 7. Four different lifetime indications of the simulation: (a) number of alive sink neighbors, (b) total number of events received by the sink, (c) total number of dead nodes, and (d) total number of detected events.
REFERENCES

[1] M. Ahmed, S. Dao, and R. Katz. Positioning range extension gateways in mobile ad hoc wireless networks to improve connectivity and throughput. In Proc. of IEEE Military Communications Conf. (MilCom), 2001.
[2] Douglas M. Blough and Paolo Santi. Investigating upper bounds on network lifetime extension for cell-based energy conservation techniques in stationary ad hoc networks. In Proc. of the 8th Annual Intl. Conf. on Mobile Computing and Networking (MobiCom), pages 183–192, 2002.
[3] M. Buettner, G. Yee, E. Anderson, and R. Han. X-MAC: A short preamble MAC protocol for duty-cycled wireless networks. In Proc. of the 4th Intl. Conf. on Embedded Networked Sensor Systems (SenSys), pages 307–320, 2006.
[4] Qing Cao, Tian He, Lei Fang, Tarek Abdelzaher, John Stankovic, and Sang Son. Efficiency centric communication model for wireless sensor networks. In Proc. of the 25th Conf. of the IEEE Communication Society (INFOCOM), 2006.
[5] Alberto Cerpa and Deborah Estrin. ASCENT: Adaptive Self-Configuring sEnsor Networks Topologies. IEEE Trans. on Mobile Computing, 3(3):272–285, 2004.
[6] Benjie Chen, Kyle Jamieson, Hari Balakrishnan, and Robert Morris. Span: An energy-efficient coordination algorithm for topology maintenance in ad hoc wireless networks. Wireless Networks, 8(5):481–494, 2002.
[7] Christian C. Enz, Amre El-Hoiydi, Jean-Dominique Decotignie, and Vincent Peiris. WiseNET: An ultralow-power wireless sensor network solution. Computer, 37(8):62–70, 2004.
[8] Jason Hill, Robert Szewczyk, Alec Woo, Seth Hollar, David Culler, and Kristofer Pister. System architecture directions for networked sensors. In Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2000.
[9] David B. Johnson and David A. Maltz. Dynamic source routing in ad hoc wireless networks. In Imielinski and Korth, editors, Mobile Computing, volume 353. 1996.
[10] Konstantinos Kalpakis, Koustuv Dasgupta, and Parag Namjoshi. Efficient algorithms for maximum lifetime data gathering and aggregation in wireless sensor networks. Comput. Networks, 42(6):697–716, 2003.
[11] Bhaskar Krishnamachari, Deborah Estrin, and Stephen B. Wicker. The impact of data aggregation in wireless sensor networks. In Proc. of the 22nd Intl. Conf. on Distributed Computing Systems (ICDCSW), pages 575–578, 2002.
[12] Sze-Chu Liu. A lifetime-extending deployment strategy for multi-hop wireless sensor networks. In Proc. of the 4th Annual Communication Networks and Services Research Conf. (CNSR), pages 53–60, 2006.
[13] Samuel Madden, Robert Szewczyk, Michael J. Franklin, and David Culler. Supporting aggregate queries over ad-hoc wireless sensor networks. In Proc. of the 4th IEEE Workshop on Mobile Computing Systems and Applications (WMCSA), page 49, 2002.
[14] Samuel R. Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong. TinyDB: An acquisitional query processing system for sensor networks. ACM Trans. on Database Systems, 30(1):122–173, 2005.
[15] Stephan Olariu and Ivan Stojmenovic. Design guidelines for maximizing lifetime and avoiding energy holes in sensor networks with uniform distribution and uniform reporting. In Proc. of the 25th Conf. of the IEEE Communication Society (INFOCOM), 2006.
[16] M. Perillo and W. Heinzelman. Optimal sensor management under energy and reliability constraints. In Proc. of the IEEE Wireless Communications and Networking Conf., 2003.
[17] Joseph Polastre, Jason Hill, and David Culler. Versatile low power media access for wireless sensor networks. In Proc. of the 2nd Intl. Conf. on Embedded Networked Sensor Systems (SenSys), pages 95–107, 2004.
[18] Curt Schurgers, Vlasios Tsiatsis, Saurabh Ganeriwal, and Mani Srivastava. Optimizing sensor networks in the energy-latency-density design space. IEEE Trans. on Mobile Computing, 1(1):70–80, 2002.
[19] Yi Shang and Hongchi Shi. Coverage and energy tradeoff in density control on sensor networks. In Proc. of the 11th Intl. Conf. on Parallel and Distributed Systems (ICPADS), pages 564–570, 2005.
[20] Victor Shnayder, Mark Hempstead, Bor-rong Chen, Geoff Werner Allen, and Matt Welsh. Simulating the power consumption of large-scale sensor network applications. In Proc. of the 2nd Intl. Conf. on Embedded Networked Sensor Systems (SenSys), pages 188–200, 2004.
[21] Mihail L. Sichitiu and Rudra Dutta. Benefits of multiple battery levels for the lifetime of large wireless sensor networks. In NETWORKING, pages 1440–1444, 2005.
[22] G. Simon, P. Volgyesi, M. Maroti, and A. Ledeczi. Simulation-based optimization of communication protocols for large-scale wireless sensor networks. In IEEE Aerospace Conf., 2003.
[23] Tijs van Dam and Koen Langendoen. An adaptive energy-efficient MAC protocol for wireless sensor networks. In Proc. of the 1st Intl. Conf. on Embedded Networked Sensor Systems (SenSys), pages 171–180, 2003.
[24] Alec Woo, Terence Tong, and David Culler. Taming the underlying challenges of reliable multihop routing in sensor networks. In Intl. Conf. on Embedded Networked Sensor Systems (SenSys), pages 14–27, 2003.
[25] Xiaobing Wu, Guihai Chen, and Sajal K. Das. On the energy hole problem of nonuniform node distribution in wireless sensor networks. In IEEE Intl. Conf. on Mobile Ad Hoc and Sensor Systems (MASS), pages 180–187, 2006.
[26] Ya Xu, John Heidemann, and Deborah Estrin. Adaptive energy-conserving routing for multihop ad hoc networks. Research Report 527, USC/Information Sciences Institute, 2000.
[27] Ya Xu, John Heidemann, and Deborah Estrin. Geography-informed energy conservation for ad hoc routing. In Proc. of the 7th Annual Intl. Conf. on Mobile Computing and Networking (MobiCom), pages 70–84, 2001.
[28] Fan Ye, Gary Zhong, Jesse Cheng, Songwu Lu, and Lixia Zhang. PEAS: A robust energy conserving protocol for long-lived sensor networks. In Proc. of the 23rd Intl. Conf. on Distributed Computing Systems (ICDCS), page 28, 2003.
[29] Wei Ye, John Heidemann, and Deborah Estrin. An energy-efficient MAC protocol for wireless sensor networks. In Proc. of the 21st Conf. of the IEEE Communication Society (INFOCOM), 2002.
[30] Zhenqiang Ye, Srikanth V. Krishnamurthy, and Satish K. Tripathi. A framework for reliable routing in mobile ad hoc networks. In Proc. of the 22nd Conf. of the IEEE Communication Society (INFOCOM), 2003.
129
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

A Cloaking Algorithm based on Spatial Networks for Location Privacy

Po-Yi Li†, Wen-Chih Peng†, Tsung-Wei Wang†, Wei-Shinn Ku§, Jianliang Xu‡, J. A. Hamilton, Jr.§

† Dept. of Computer Science, National Chiao Tung University, Taiwan
Email: {leeboy, wcpeng}@cs.nctu.edu.tw
§ Dept. of Computer Science and Software Engineering, Auburn University, Auburn, USA
Email: {weishinn, hamilton}@eng.auburn.edu
‡ Dept. of Computer Science, Hong Kong Baptist University, Hong Kong
Email: xujl@comp.hkbu.edu.hk

Abstract

Most research efforts on location privacy have elaborated on k-anonymity. The general architecture for implementing k-anonymity is that one trusted server (referred to as the location anonymizer) is responsible for cloaking at least k users' locations in order to protect location privacy. A location anonymizer generates cloaked regions in which there are at least k users for query processing. Prior works only explore grid-shaped cloaked regions. However, grid-shaped cloaked regions result in a considerable amount of query results, thereby increasing the overhead of filtering unwanted query results. In this paper, we propose a cloaking algorithm in which cloaked regions are generated according to the features of spatial networks. By exploring the features of spatial networks, the cloaked regions are very efficient for reducing query results and improving the cache utilization of mobile devices. Furthermore, an index structure for spatial networks is built and, in light of the proposed index structure, we develop a Spatial-Temporal Connective Cloaking algorithm (abbreviated as STCC). A simulator is implemented and extensive experiments are conducted. Experimental results show that our proposed algorithm outperforms prior cloaking algorithms in terms of the candidate query results and the cache utilization.

1 Introduction

Location-based services (LBSs) have emerged as one of the killer applications for mobile computing and wireless data services. These LBSs are critical to public safety, transportation, emergency response, and disaster management, while providing great market value to companies and industries. Due to the unrestricted mobility of users in mobile computing environments, users are often interested in acquiring information or services related to their current locations. Consequently, a large number of queries, along with user location information, are submitted to LBS servers. Examples of such queries include finding the nearest restaurants to a user (k nearest neighbor query) and finding ATMs within 500 meters of a user's current location (range query). While LBSs have been shown to be valuable to users' daily life, they also expose extraordinary threats to user privacy. If not well protected, the location information of users may be misused by untrustworthy service providers or stolen by hackers. Once the location information is exposed, adversaries may utilize it to invade user privacy. Obviously, it is important to protect location privacy.

Recently, the problem of location privacy preservation has attracted growing interest, and most research efforts have elaborated on k-anonymity [3, 6, 11]. In k-anonymity, users submit their queries to the LBSs via a trusted server (which is different from the LBS servers). This trusted anonymizer transforms the exact locations of a number of users into a cloaked spatial area in accordance with privacy requirements set by users in order to obtain data or services from LBSs. Upon receiving the LBS query with a cloaked region, the LBS server evaluates and returns a result superset (referred to as a candidate query result) containing the query results for all location points in the cloaked region. From the candidate query result, mobile users further determine the actual query result according to the true location information at the mobile devices. In addition, candidate query results are cached for further data access.

Note that when cached candidate results are able to satisfy consecutive queries, users no longer need to issue queries, which reduces the location privacy threat of revealing user locations. However, the cache in mobile devices has a limited storage size. More candidate query results incur more cache replacements. As such, the cache hit probability will be reduced. Consequently, it is important to retrieve candidate query results that are likely to be accessed in the future and that will not frequently incur cache replacements due to the limited cache size.

Figure 1. Examples of cloaked regions: (a) an example of a spatial network (with junctions, mobile users, target objects, intersection points and the query point), (b) a grid-based cloaked region, (c) a spatial network-based cloaked region.

Prior work in [7] proposed a framework for location services in which a free space is divided into a number of grid cells. A cloaked region then consists of grid cells in which the total number of users is at least k. Hence, the rectangular cloaked spatial area results in a larger candidate size. The problem we study can be better understood by the illustrative example in Figure 1(a), where a spatial network is modeled as a graph with each vertex being a junction and the edges between two junctions being roads. Assume that k is set to 4 for k-anonymity and a KNN spatial query is issued. Figure 1(b) shows the cloaked region derived. It can be computed that the size of the candidate query results is 5 (i.e., O1, O2, O3, O4 and O5). The cloaked spatial areas derived do not take the features of spatial networks and the spatial-temporal moving behavior into consideration. The spatial-temporal moving behavior refers to the feature that the consecutive movements of users are not too far away. In this paper, we argue that the cloaked spatial area should be a cloaked segment set in which there are at least k users along the road segments. In Figure 1(c), the cloaked region is viewed as a set of road segments (i.e., (n4, n10) and (n9, n10) in this example). It can be verified that the size of the candidate query results is 2 (i.e., O4 and O7). Hence, the candidate query size is smaller. Furthermore, since the moving behavior of a user has a spatial-temporal feature, which refers to the feature that a user is likely to move along nearby road segments, a cloaked segment set that exploits connective road segments is able to increase the cache hit ratio. This is because candidate query result sets are small, and thus likely to fit in the limited cache of mobile devices, and these candidate query results are very likely to be used in the near future. Therefore, in this paper, by exploring the features of spatial networks and the spatial-temporal moving behaviors for cloaking, cloaked segment sets are able to reduce the size of candidate query results and improve the cache hit ratios, thereby reducing the number of queries. Once the number of queries is reduced, the probability of revealing location information along with LBS queries is thus decreased.

Consequently, in this paper, we propose a spatial network-based cloaking algorithm to derive cloaked segment sets. In traditional cloaking algorithms for k-anonymity, users are required to set their privacy profiles. A privacy profile is basically a pair (k, Amin), where k is the number of users required for k-anonymity and Amin is the minimal area required for cloaked regions. Since the cloaked regions derived in this paper are sets of road segments, Amin is only appropriate for cloaked regions in free space. Thus, we define a new privacy profile in which the features of spatial networks (i.e., the number of road segments and the total length of road segments) are considered. As such, a user privacy profile is represented as (k, Lmin) or (k, Nmin), where k is the minimal number of users in a cloaked region and Lmin (respectively, Nmin) is the minimal total length (respectively, the minimal number) of road segments required. According to the user privacy profile, we propose a Spatial-Temporal Connective Cloaking algorithm (abbreviated as STCC) to derive a cloaked region that contains a set of road segments in which there are at least k users and the number of road segments or the total length of road segments satisfies Nmin or Lmin. In order to quickly extract road segments for the cloaked region, a hierarchical index structure is developed. Through the index structure built, algorithm STCC is able to efficiently derive cloaked segment sets. Extensive experiments are conducted, and the experimental results show that the cloaked segment sets derived fully capture the spatial-temporal feature of moving behaviors, thereby not only protecting location privacy but also reducing the candidate query size.

We mention in passing that the authors in [13] explored the concept of k-anonymity for data privacy. By exploiting k-anonymity, the authors in [3] proposed spatial-temporal cloaking to protect user location privacy.

Moreover, the authors in [2] proposed the CliqueCloak algorithm to support a varied k-anonymity requirement for each user. In their work, the authors construct a clique graph in which some users can share the same cloaked region. However, these works mainly focus on designing the location anonymizer rather than on query processing. Therefore, the authors in [7] proposed a framework that includes two main components, a location anonymizer and a query processor. In particular, the location anonymizer constructs a pyramid structure to index cloaked regions at different granularities. On the other hand, the query processor is used to obtain candidate query results according to the cloaked region. However, both the cloaked regions and the query processing are in free space. Prior works neither take the features of spatial networks into consideration nor utilize the spatial-temporal feature of moving behavior for deriving cloaked regions, let alone develop an index structure for a spatial network-based location anonymizer. These features differentiate our work from others.

The rest of the paper is organized as follows: Preliminaries are given in Section 2. In Section 3, a spatial network-based algorithm is presented. Section 4 is devoted to experimental results. This paper concludes with Section 5.

2 Preliminaries

Figure 2. The system architecture: the mobile user sends (1) the user's profile, location and query to the location anonymizer, which forwards (2) the cloaked road segments and the query to the privacy-protected query processor of the location-based service provider; (3) candidate answers are returned to the location anonymizer and (4) passed back to the mobile user.

Figure 2 depicts the system architecture, where there are two components in this system (i.e., a location anonymizer and a privacy-protected query processor). Mobile users are able to set up their location privacy profiles and register with a location anonymizer. Once a user issues a location-dependent query, the location of the user is sent along with the LBS query to the location anonymizer. Then, the location anonymizer blurs the true location into a cloaked region and forwards the LBS query with the cloaked region. The privacy-protected query processor is responsible for performing privacy-protected queries. Upon receiving the LBS query with a cloaked region, the privacy-protected query processor evaluates and returns a result superset (referred to as a candidate query result) containing the query results for all location points in the cloaked region. From the candidate query result, mobile users further determine the actual query result according to the true location information at the mobile devices. As pointed out earlier, prior works transform the exact locations of a number of users into a cloaked spatial area in accordance with the privacy requirements set by users. The privacy requirement is represented as (k, Amin), where k is the minimal number of users needed in the cloaked spatial area and Amin is the minimal acceptable region size of the cloaked spatial area. Note that in prior works the cloaked spatial area consists of grid cells and users move in free space. However, in reality, users are restricted to move on predefined roads, which are viewed as a spatial network. Without loss of generality, a spatial network is usually modeled as a graph G=(V, E), where a vertex denotes a road junction, an edge denotes the road segment between two junctions, and the weight of an edge is the length of this road segment. Thus, in our paper, we argue that the cloaked spatial area should be a set of road segments instead of grid cells. Furthermore, the moving behaviors of users have spatial-temporal locality, which refers to the feature that consecutive locations of mobile users are not far away. Hence, once the cloaked spatial area is a set of road segments near the true location, the cloaked spatial area is able to fully capture the spatial-temporal locality feature in the candidate query result. Thus, mobile users are likely to find the query results in the cache of their mobile devices. As such, the number of queries will be reduced, thereby avoiding location exposure.

Privacy profile: When the cloaked spatial area is a set of road segments, a user privacy profile is defined as (k, Lmin) (respectively, (k, Nmin)), where there are at least another k − 1 users in the cloaked spatial area and Lmin (respectively, Nmin) indicates the minimum acceptable total length of road segments (respectively, total number of road segments) in the cloaked spatial area. Both Lmin and Nmin capture the features of spatial networks in user privacy profiles. Users are able to select one or both for their location privacy requirement. Note that Lmin and Nmin are particularly useful in dense areas where even a large k cannot achieve the user's privacy requirements. With a larger Lmin, it is harder to identify the exact location of a user on the road segments. Moreover, a larger Nmin yields more road segments in a cloaked spatial area, in which case the exact road segment that a user is on is harder to determine.

The objective of our work: In this paper, given a spatial network, denoted as G=(V, E), and a user privacy profile (i.e., (k, Lmin) or (k, Nmin)), we intend to derive a cloaked segment set that satisfies the user privacy profile, and the cloaking algorithm should achieve two requirements. 1.) Accuracy: the cloaked segment set, represented as R, should be a set of road segments in which there are at least k users, and those road segments should be as close to the user privacy profile (i.e., (k, Lmin) or (k, Nmin)) as possible.

2.) Efficiency: the cloaking algorithm should efficiently derive the cloaked segment set based on the user privacy profile, because a spatial network is usually large-scale and mobile users move dynamically in the spatial network.

3 The Spatial-Temporal Connective Location Anonymizer

In order to efficiently derive the cloaked segment set, we first develop an index structure for a spatial network. In light of the index structure, we propose a spatial network-based cloaking algorithm.

3.1 Index Structure of a Spatial Network

Similar to the work in [12], given a spatial network, each vertex maintains an adjacency list in which each data node contains the adjacent vertex, the length of the corresponding road segment and the number of users along this road segment. To efficiently derive a cloaked segment set fulfilling the user privacy profile, we propose a hierarchical structure that decomposes the spatial network into Lh levels, where each level contains a varying number of blocks consisting of sets of road segments. Clearly, the root of the hierarchical data structure has only one block that covers the whole set of road segments. Since a user privacy profile can involve two features of a spatial network, we build the corresponding index structure for each feature. In the following, the index data structure for the length of road segments is described. The index structure for the number of road segments is built in a similar way. To facilitate the presentation of this paper, the jth block in level i is denoted as Bi,j. Blocks contain a set of pointers to the lower-level blocks and the total number of users within the set of road segments in the lower-level blocks. For the blocks in the same level, the total lengths of road segments should be as close as possible. In other words, the variance of the blocks in terms of the total length of road segments is small, and thus it is possible to obtain a set of road segments whose total length is as close to Lmin as possible.

An index structure is built in a bottom-up manner. The blocks at level 0 are the original road segments of the given spatial network. Then, adjacent road segments are put into one block at level 1. As pointed out earlier, each block at the same level should have approximately the same total length of road segments. We will describe the criterion for putting two adjacent road segments into a block later. Once the blocks at level 1 are generated, blocks at level 1 are formed into blocks at level 2. Following the same operation, one can recursively merge lower-level blocks into higher-level blocks until one block covers all road segments of the given spatial network.

Given a spatial network, the total length of road segments is denoted as Total_length and the total number of road segments is expressed by Total_rs. The total length of road segments in block Bh,i is denoted as Length_Bh,i. We intend to make the total lengths of road segments of the blocks at the same level as close as possible. Thus, we first derive the average total length for each block and intend to minimize the variance of the blocks in terms of total length of road segments. The expected total length for each block at level i is denoted as δi. In particular, for level 1, adjacent road segments are merged to form a block. Hence, the maximal number of blocks at level 1 is Total_rs/2 if two adjacent road segments are put into one block. Thus, δ1 is formulated as δ1 = Total_length / (Total_rs/2). Once δ1 is determined, adjacent road segments are put into one block if their total length is smaller than δ1. Then, each time, we consider adding one more road segment into the block. If a road segment is included and the total length of road segments in the block becomes larger than δ1 + ε, this road segment is removed. Here ε is an acceptable tolerance value for the case in which one more road segment is put into the block while the total length exceeds δ1. For blocks at a higher level (e.g., i), we merge adjacent blocks at the lower level (e.g., i−1 in this example). Denote the number of blocks at level i−1 as NblockL_{i−1}. We then have the expected total length δi = Total_length / NblockL_{i−1}. Adjacent blocks are merged together if the difference between their total length of road segments and δi is within ε. As in generating the blocks at level 1, each time one adjacent block at the lower level is included into a higher-level block if the total length of the higher-level block is smaller than δi. Following the same principle as for deriving the blocks at level 1, once the total length of the higher-level block becomes larger than δi + ε after including the newly added block, this newly added block is removed. In order to build the index structures, we adopt a bottom-up approach that first derives the lower-level blocks and iteratively generates higher-level blocks by merging adjacent blocks until the whole set of road segments is covered by one block. An example of a spatial network is shown in Figure 3, and assume that ε is set to 2. In the beginning, we should calculate δ1. Since the total length of road segments in Figure 3 is 86 and the number of road segments is 14, we have δ1 = 86/(14/2) = 86/7 = 12.3. Consider road junction n10 as an example, where two adjacent road segments can be put into one block since the difference between their total length and δ1 is smaller than ε (i.e., 14 − 12.3 = 1.7 ≤ 2). Note that no more adjacent road segments are included since the total length of this block is already larger than δ1 + ε. For road junction n9, the adjacent road segments (n9, n3), (n9, n11) and (n9, n6) are in the same block since their total length is smaller than δ1 + ε (i.e., 7 + 3 + 3 = 13 < 12.3 + 2 = 14.3). In our example, this procedure continues until there exists only one block that covers the whole spatial network.
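To make the level-1 rule concrete, the following Python snippet is a minimal sketch of the block-building idea just described, not the authors' implementation. It assumes the spatial network is given as a list of road segments, each carrying its endpoints, its length and its user count, computes δ1 = Total_length/(Total_rs/2), and greedily merges adjacent segments while the running total stays below δ1 + ε; the function name build_level1_blocks is hypothetical.

```python
from collections import defaultdict

def build_level1_blocks(edges, epsilon=2.0):
    """Greedy level-1 block formation sketch.

    edges: list of (u, v, length, users) tuples describing road segments.
    Returns a list of blocks; each block is a list of segments whose total
    length stays below delta_1 + epsilon, mirroring the rule in the text.
    """
    total_length = sum(length for _, _, length, _ in edges)
    total_rs = len(edges)
    delta_1 = total_length / (total_rs / 2.0)   # expected length per level-1 block

    # Adjacency of segments sharing a junction, so only *adjacent* segments merge.
    by_node = defaultdict(list)
    for seg in edges:
        by_node[seg[0]].append(seg)
        by_node[seg[1]].append(seg)

    assigned, blocks = set(), []
    for seg in edges:
        if seg in assigned:
            continue
        block, length = [seg], seg[2]
        assigned.add(seg)
        frontier = [seg]
        # Grow the block with unassigned adjacent segments within the tolerance.
        while frontier and length < delta_1 + epsilon:
            cur = frontier.pop()
            for nb in by_node[cur[0]] + by_node[cur[1]]:
                if nb in assigned or length + nb[2] > delta_1 + epsilon:
                    continue
                block.append(nb)
                assigned.add(nb)
                length += nb[2]
                frontier.append(nb)
        blocks.append(block)
    return blocks

# Toy usage with hypothetical segments (u, v, length, users):
segments = [("n9", "n10", 7, 2), ("n4", "n10", 7, 1), ("n9", "n3", 7, 0),
            ("n9", "n11", 3, 1), ("n9", "n6", 3, 0), ("n6", "n12", 10, 2)]
for i, blk in enumerate(build_level1_blocks(segments), 1):
    print("B1,%d" % i, [(u, v) for u, v, _, _ in blk])
```

The same skeleton would serve for the Nmin index by counting segments instead of summing their lengths.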

In Figure 3, NblockL1 = 7. Furthermore, we have δ2 = 86/⌈7/2⌉ = 86/4 = 21.5. With δ2, we can decide whether two adjacent blocks should be put into a higher-level block or not. For example, consider the adjacent blocks that contain n10 and n9. Since the total length of these two blocks at level 1 is larger than δ2 + ε (i.e., 21.5 + 2 = 23.5), these two adjacent blocks cannot form a block at level 2. Hence, only adjacent blocks whose total length is smaller than δ2 + ε are merged into one higher-level block.

Algorithm 1 Build Index_length Algorithm
Input: A spatial network graph, G(V, E)
Output: An index structure, Index_length, for Lmin
1: Total_length = the total length of road segments in G(V, E)
2: Total_rs = the total number of road segments in G(V, E)
3: NB_{i−1} is the number of blocks at level i−1; NB_0 = Total_rs/2; i = 1
4: while no single block covers E in G=(V, E) do
5:   if i = 1 then
6:     NB_{i−1} = Total_rs/2
7:   δ_i = Total_length / NB_{i−1}
8:   if i = 1 then
9:     for each vertex with its adjacent road segments do
10:      length = 0
11:      j = 1
12:      while length is smaller than δ_i + ε do
13:        include one adjacent road segment into B_{i,j}
14:        length += the length of the selected road segment
15:      j++
16:    NB_1 = j
17:  else
18:    for each block not yet marked do
19:      length = the total length of this block
20:      j = 1
21:      while length is smaller than δ_i + ε do
22:        include one adjacent block into B_{i,j} and mark this adjacent block
23:        length += the length of road segments in the selected block
24:      j++
25:    NB_i = j
26:  i++

Following the above procedure, we can derive the index structure shown in Figure 3. Note that the index structure for the number of road segments is built in the same way except that the criterion is set to the number of road segments instead of the total length of road segments.

Figure 3. An example of a hierarchical structure: level 0 holds the road segments of the example spatial network (each adjacency-list node keeps the adjacent node, the number of users, the length of the road segments and the number of road segments), level 1 groups them into blocks B1,1 to B1,7, level 2 into B2,1 to B2,4, level 3 into B3,1 and B3,2, and level 4 into the single root block B4,1.

3.2 The Spatial-Temporal Connective Cloaking Algorithm

Assume that the user privacy profile is (k, Lmin) and the location of the user is represented as (x, y). When a user issues a query along with his location to the location anonymizer, the location anonymizer first determines the road segment that the user is currently on. Then, the index structure Index_length is used to locate the block at level 1 containing this road segment. If the block found already satisfies the user privacy profile (i.e., the total number of users is larger than k and the total length of road segments is larger than Lmin), the set of road segments in this block is used as the cloaked spatial area. However, if the user privacy profile is not satisfied, the neighboring blocks at the same level are checked first, where two blocks are identified as neighboring blocks if they have the same higher-level block. If the combination of neighboring blocks has at least k users and the sum of their total lengths is larger than Lmin, the union of the road segments covered by these blocks is used as the cloaked spatial area. On the other hand, if none of the neighboring blocks can be combined with the current block, we recursively exploit the higher-level block and perform the possible combination of higher-level neighboring blocks until the user privacy requirement is fulfilled. An example is shown in Figure 4(a), where the user privacy profile is k = 3 and Lmin = 25 and the query point is on the road segment (n9, n10). Then, the corresponding block at level 1 is obtained via the index structure Index_length.
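The bottom-up expansion just described (and formalized in Algorithm 2 below) can be sketched in a few lines of Python. This is only an illustrative reading of the procedure, under the assumption that each index node exposes its parent, its children, its user count and its total segment length; the class and function names (Block, stcc_cloak) are hypothetical.

```python
class Block:
    """Hypothetical node of the hierarchical index (Index_length)."""
    def __init__(self, segments, length, users, parent=None):
        self.segments = list(segments)   # road segments covered by this block
        self.length = length             # total length of those segments
        self.users = users               # number of users on those segments
        self.parent = parent
        self.children = []

def stcc_cloak(level1_block, k, l_min):
    """Grow a cloaked segment set until it holds >= k users and >= l_min length."""
    cloaked = set(level1_block.segments)
    users, length = level1_block.users, level1_block.length
    current = level1_block
    while users < k or length < l_min:
        parent = current.parent
        if parent is None:                       # root reached: whole network cloaked
            break
        # Neighboring blocks are siblings under the same higher-level block.
        neighbors = [b for b in parent.children
                     if b is not current and not set(b.segments) <= cloaked]
        if neighbors:
            pick = neighbors[0]                  # add any unused sibling
            cloaked.update(pick.segments)
            users += pick.users
            length += pick.length
        else:                                    # siblings exhausted: climb one level
            current = parent
    return cloaked

# Toy demo: a root block with two level-1 children, mirroring the running example.
root = Block(segments=[], length=0, users=0)
b1 = Block([("n9", "n10"), ("n4", "n10")], length=14, users=4, parent=root)
b2 = Block([("n9", "n3"), ("n9", "n11"), ("n9", "n6")], length=13, users=1, parent=root)
root.children, root.segments = [b1, b2], b1.segments + b2.segments
root.length, root.users = 27, 5
print(stcc_cloak(b1, k=3, l_min=25))
```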

Algorithm 2 Spatial-Temporal Connective Cloaking Algorithm
Input: User's profile (k, Lmin), user location (x, y) and an index structure Index_length
Output: A set of road segments for cloaking, denoted as CR
1: Find the road segment, (ni, nj), that contains (x, y)
2: Find the block at level 1, denoted as Bx,y, that includes (ni, nj)
3: CR = the set of road segments in Bx,y
4: current block = Bx,y
5: while the user privacy profile is not satisfied do
6:   if there exist neighboring blocks of the current block then
7:     select one neighboring block of the current block
8:     include the road segments of the selected block in CR
9:   else
10:    find the parent node of the current block in Index_length
11:    select one neighboring block of the parent node of the current block
12:    include the road segments of the selected block in CR
13:    current block = parent node

Figure 4. Bottom-up cloaking of the user's location: (a) cloaking the user's location in L0, (b) cloaking the user's location in L1, (c) cloaking the user's location in L2.

Since the corresponding block only has two road segments (i.e., (n9, n10) and (n4, n10)) and their total length is smaller than 25, the total-length requirement of the user privacy profile is not met even though there are already enough users. Thus, we need to seek neighboring blocks to fulfil the privacy requirement Lmin. The neighboring block contains three road segments, and thus the total length of these two blocks is larger than Lmin. Consequently, the cloaked spatial area consists of the road segments {(n9, n3), (n9, n8), (n9, n11), (n10, n4), (n10, n9)}. Note that if the user privacy profile is set to (k, Nmin), we can use the same concept to generate a cloaked spatial area. The cloaking algorithm shown above can be slightly modified by checking Nmin instead of Lmin. Also, the index structure Index_Num_segment is used instead.

4 Performance Study

In this section, we evaluate the performance of our proposed algorithm. For comparison purposes, we also implement a traditional grid-based cloaking algorithm (denoted as the Grid-based scheme) in which the spatial network is divided into grids and, following the prior work, a pyramid data structure is implemented. When a user issues a query, we first find the grid that this user is in and then extract all road segments within this grid. In all of our experiments, we use the Network-based Generator of Moving Objects [1] to generate moving objects. Oldenburg's road map is used in our experiments. The generator outputs a set of moving objects that move on the road network of the given map. We place 5000 mobile users on the spatial network, and they update their locations once per time stamp. Next, target objects are randomly distributed on the spatial network. Moreover, we randomly choose one mobile user, and the query type is a continuous KNN query that persists for 30 time stamps in the simulator. The default number of target objects is 3000. The performance metrics are the cache hit ratio and the candidate answer size, which is the number of objects in the candidate query result. Note that in our simulation model we set Lmin = 10000 and Nmin = 50. We analyzed the road map data and found that when a cloaked spatial area satisfies Lmin, the average number of road segments is 50. Moreover, the average cloaked spatial area is 100*100. Thus, for a fair comparison of our proposed algorithm and the traditional Amin, we set Nmin = 50 and Amin = 100*100. Our cloaking algorithm is abbreviated as STCC, and the parameter ε used for building the index structures is set to 50. We randomly choose one mobile user as the query point, who issues a K-nearest-neighbor query (KNN query) that persists for 30 time stamps, and there are 3000 target objects in this spatial network.

4.1 The Impact of k for k-Anonymity

First, we investigate the impact of k for k-anonymity. Without loss of generality, a user issues a 1NN query, the moving speed is set to 1/50, and the cache size of mobile devices is 100. Figure 5(a) shows the candidate answer size with various k.
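The cache hit ratio used above as a performance metric can be measured with a small helper like the one below. This is only a sketch of the bookkeeping, assuming an LRU cache of candidate objects whose capacity plays the role of the cache size parameter; it is not the simulator used in the experiments.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-size LRU cache of candidate result objects (sketch)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def lookup(self, key):
        if key in self.items:
            self.items.move_to_end(key)          # refresh recency on a hit
            return True
        return False

    def insert(self, key):
        self.items[key] = True
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)       # evict the least recently used object

def cache_hit_ratio(query_results, capacity=100):
    """query_results: iterable of candidate-object id lists, one per query."""
    cache, hits, total = LRUCache(capacity), 0, 0
    for candidates in query_results:
        for obj in candidates:
            total += 1
            if cache.lookup(obj):
                hits += 1
            else:
                cache.insert(obj)
    return hits / total if total else 0.0

# Example: three consecutive candidate sets from nearby cloaked segment sets.
print(cache_hit_ratio([["o4", "o7"], ["o4", "o7", "o2"], ["o7", "o2"]], capacity=100))
```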

Figure 5. Performance comparisons with various k: (a) candidate answer size, (b) cache hit ratio.

Figure 6. Performance comparisons with KNN queries: (a) candidate answer size, (b) cache hit ratio.

Figure 7. Performance study with various moving speeds and cache sizes: (a) effect of speed, (b) effect of cache size.

As can be seen in Figure 5(a), with a larger k, more users should be included in the cloaked spatial area to meet the privacy requirement. Thus, more road segments are in the cloaked spatial area, increasing the candidate answer size. However, the candidate answer size of STCC is smaller than that of the Grid-based scheme. This is because STCC considers the spatial-temporal locality of moving behaviors for cloaking. By exploring the spatial-temporal locality of moving behaviors and the features of spatial networks, STCC is able to increase the cache hit ratio. Figure 5(b) shows the cache hit ratios with various k. It can be seen in Figure 5(b) that STCC outperforms the Grid-based scheme in terms of cache hit ratios. As k increases, the cache hit ratios of STCC and the traditional cloaked region increase from k = 10 to k = 30. However, the cache hit ratios decrease from k = 40 to k = 100, since the candidate answer size is larger and mobile devices are not able to store such a large number of objects. Even so, the cache hit rate of STCC is significantly larger than that of the Grid-based scheme.

4.2 The Impact of KNN Queries

Now, we conduct experiments varying the value of K for KNN queries. The user privacy profile for k-anonymity is set to 30. The default privacy requirements are Lmin = 10000, Nmin = 50 and Amin = 100*100 for fairness. In addition, the cache size is set to 100. The candidate answer size with various K for KNN queries is shown in Figure 6(a). It can be seen that with a larger value of K for KNN queries, the candidate answer size increases. Note that the candidate answer size of STCC is still much smaller than that of the Grid-based scheme. Since STCC uses a set of road segments for cloaking, the road segments that are near the mobile user form the cloaked spatial area. Furthermore, because the cloaked spatial area in STCC is a set of road segments along which users are likely to move, the cache hit ratios of STCC are much higher than those of the Grid-based scheme, as shown in Figure 6(b).

4.2.1 The Impact of Moving Speeds and Cache Sizes

Now, the impact of moving speeds is evaluated. The moving parameter ranges from 1/200 to 1/10, where a smaller value of the moving parameter means a slower moving speed. The user privacy profile for k-anonymity is set to 30. The default privacy requirements are Lmin = 10000, Nmin = 50 and Amin = 100*100. Figure 7(a) shows the cache hit ratios with various moving speeds. Note that with a faster moving speed, the mobile user is very likely to move out of the cloaked spatial area, thereby reducing the cache hit ratio. It can be seen in Figure 7(a) that STCC still performs better than the Grid-based scheme in terms of cache hit ratios. Clearly, the cache size of mobile devices also has an influence on cache hit ratios. Figure 7(b) shows the experimental result obtained by varying the cache size. By increasing the cache size, the cache hit rate increases because more candidate objects can be stored in the cache of mobile devices. In particular, for a smaller cache size, which is the common case for mobile devices, the cache hit rate of STCC is better than that of the Grid-based scheme, showing the advantage of exploring the spatial-temporal feature of moving behaviors in STCC.

5 Conclusion

In this paper, we proposed a cloaking algorithm in which cloaked regions are generated according to the features of spatial networks.

By exploring the features of spatial networks, the cloaked regions are very efficient for reducing query results and improving the cache utilization of mobile devices. Explicitly, mobile users can set their privacy profile as (k, Lmin) or (k, Nmin). Given a user privacy profile, we propose an index structure to efficiently derive the cloaked segment set. Note that the two hierarchical index structures are able to obtain cloaked segment sets that are very close to the user privacy requirements. Based on the index structures, we propose algorithm STCC to quickly blur the true user location into an acceptable cloaked segment set. We experimentally evaluated our proposed algorithm, and the experimental results show that the cloaked segment sets derived fully capture the spatial-temporal feature of moving behaviors, thereby not only protecting location privacy but also reducing the candidate query size.

Acknowledgement

W. C. Peng was supported in part by the Taiwan MoE ATU Program and by the National Science Council, Project No. NSC 95-2211-E-009-61-MY3, Taiwan, Republic of China. Jianliang Xu's work was supported in part by the Research Grants Council of Hong Kong under Grant No. HKBU211206. This research has been funded in part by NSF grant DUE 0621307.

References

[1] T. Brinkhoff. A Framework for Generating Network-Based Moving Objects. GeoInformatica, 6(2):153–180, 2002.
[2] B. Gedik and L. Liu. "Location Privacy in Mobile Systems: A Personalized Anonymization Model". In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS), Columbus, OH, USA, 2005.
[3] M. Gruteser and D. Grunwald. "Anonymous Usage of Location-Based Services Through Spatial and Temporal Cloaking". In Proceedings of the First ACM/USENIX International Conference on Mobile Systems, Applications, and Services (MobiSys), San Francisco, CA, USA, 2003.
[4] H. Hu, J. Xu, and D. L. Lee. "A Generic Framework for Monitoring Continuous Spatial Queries over Moving Objects". In Proceedings of the 2005 ACM International Conference on Management of Data (SIGMOD), Baltimore, Maryland, USA, 2005.
[5] C. S. Jensen, D. Lin, B. C. Ooi, and R. Zhang. "Effective Density Queries on Continuously Moving Objects". In Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE), Atlanta, GA, USA, 2006.
[6] M. F. Mokbel. "Towards Privacy-Aware Location-Based Database Servers". In Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE) Workshops, Atlanta, Georgia, USA, 2006.
[7] M. F. Mokbel, C.-Y. Chow, and W. G. Aref. "The New Casper: Query Processing for Location Services without Compromising Privacy". In Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB), Seoul, Korea, 2006.
[8] M. F. Mokbel, X. Xiong, and W. G. Aref. "SINA: Scalable Incremental Processing of Continuous Queries in Spatio-temporal Databases". In Proceedings of the 2004 ACM International Conference on Management of Data (SIGMOD), Paris, France, 2004.
[9] K. Mouratidis, M. Hadjieleftheriou, and D. Papadias. "Conceptual Partitioning: An Efficient Method for Continuous Nearest Neighbor Monitoring". In Proceedings of the 2005 ACM International Conference on Management of Data (SIGMOD), Baltimore, Maryland, USA, 2005.
[10] K. Mouratidis, M. L. Yiu, D. Papadias, and N. Mamoulis. "Continuous Nearest Neighbor Monitoring in Road Networks". In Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB), Seoul, Korea, 2006.
[11] B. N. Schilit, J. I. Hong, and M. Gruteser. Wireless Location Privacy Protection. IEEE Computer, 36(12):135–137, 2003.
[12] S. Shekhar and D.-R. Liu. CCAM: A Connectivity-Clustered Access Method for Networks and Network Computations. IEEE Transactions on Knowledge and Data Engineering, 9(1):102–119, 1997.
[13] L. Sweeney. k-Anonymity: A Model for Protecting Privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5):557–570, 2002.
[14] T. Xia and D. Zhang. "Continuous Reverse Nearest Neighbor Monitoring". In Proceedings of the 22nd International Conference on Data Engineering (ICDE), Atlanta, GA, USA, 2006.
[15] M. L. Yiu, N. Mamoulis, and D. Papadias. Aggregate Nearest Neighbor Queries in Road Networks. IEEE Transactions on Knowledge and Data Engineering, 17(6):820–833, 2005.


Controlled Disclosure of Context Information across Ubiquitous Computing Domains

Cristian Hesselman, Henk Eertink, Martin Wibbels
Telematica Instituut, The Netherlands
first.lastName@telin.nl

Kamran Sheikh
University of Twente, The Netherlands
k.sheikh@ewi.utwente.nl

Andrew Tokmakoff
Philips Research, The Netherlands
andrew.tokmakoff@philips.com

Abstract

One of the challenges in ubiquitous computing is that of mobility, which typically requires interaction between intelligent environments in different domains of administration. We present a highly distributed and collaborative system that enables context-aware applications to obtain context information about mobile entities (users or devices) independent of the domain that produces this information. The added value of the system is that it enriches the amount of available context information about these entities in a way that is transparent for applications. In addition, the system shares context information across domains in a controlled manner by taking privacy policies into account, both of the mobile entity as well as of the domains it visits. We discuss the system's architecture, its implementation, and the way we deployed it.

1. Introduction

A critical component in ubiquitous computing is a system that enables applications to obtain (enriched) context information (e.g., [6],[7],[15]) about "entities", typically users, places, and devices. Examples of context information include the activity, mood, and heart rate of a user, as well as a device's system-level state (e.g., CPU usage).

A known problem of most existing systems is that their operation is constrained to a single administrative domain [1]. As a result, applications cannot get context information generated by sensors in remote domains. For example, when a user is in a hotel that tracks the location of its guests, this information will be unavailable to that user's personal applications as well as to applications of others (e.g., the user's buddies). In such cases, these applications will at best only be able to obtain context information provided by the sensors of the user's personal device (e.g., GSM cell IDs).

In this paper, we address this problem and present a novel context management system that (1) enables applications to obtain context information on roaming entities from remote domains and (2) at the same time enforces privacy policies regarding the release of context information, both of the roaming entity as well as of the remote domain. The latter is a crucial requirement for the deployment of ubiquitous computing systems in any realistic scenario [5].

Our work focuses on interactions between relatively small domains (e.g., homes and office environments) that are not necessarily federated through roaming agreements. This is unlike the more telecom-oriented approach discussed in [18]. The work described in this paper is an extension and elaboration of [11].

In this paper we first illustrate our ambition with a mobility scenario (Section 2). We then discuss the architecture of our system, first concentrating on intra-domain aspects (Section 3) and then on inter-domain context sharing (Section 4). Next, we present the implementation of our system and describe how we deployed it (Section 5). Finally, we compare our work to existing systems and present our conclusions and future work in Sections 6 and 7.

2. Scenario

Figure 1 shows a scenario in which a mobile user visits a remote domain. The goal of the scenario is to introduce the roles of the involved domains, a typical context-aware application, and the policies that control the exchange of context information.

Alice@CompanyA is visiting John in his office @CompanyB. The third person in the scenario is Bob, a co-worker of Alice. He is in his office at Company A and is unaware of Alice's appointment with John. Building B is able to detect information about users, objects and places within building B.

This information is made available via local services, which we call context sources. In the example of Figure 1, building B contains a location context source that tracks the location of the people in building B (e.g., using Bluetooth or WLAN sensors) and an activity context service that determines what activity they are currently engaged in. The activity source may infer the activity of a person by combining information from several other sources, for instance by combining Alice's location and the appointments in John's electronic calendar.

Figure 1. Scenario: Bob's telephony client in Company A (Alice's home domain) obtains context about Alice, who is roaming in building B of Company B (the foreign domain); the foreign domain hosts a location source, an activity source and John's e-calendar, and both Alice's and Company B's privacy policies apply.

Now suppose Bob opens his context-aware telephony application to call Alice and invite her to lunch. In the scenario, the application indicates that Alice is in a meeting until 11am, which is information that originates from the activity source in building B (foreign domain). This context source knows that Alice is in a meeting with John at Company B until 11am, but only sends part of this information to Bob as a result of two policies. First, Company B's privacy policies say that it only releases activity information to the outside world when the information does not pertain to employees of Company B. Hence, Alice's activity information will not include a reference to John. Second, Alice's policies could indicate that co-workers like Bob should not see her current location. As a result, the information about Alice's current activities will be downgraded again before being passed to Bob's telephony application, i.e. to "Alice is in a meeting until 11am".

The main question we address in this paper is how we can share context information like Alice's activity across administrative domains, while at the same time enforcing the privacy policies of the roaming entity and those of the visited (or "foreign") domains.

3. Intra-domain Context Management

We designed a distributed system called the Context Management Framework (CMF), which consists of several services that together collect, share, interpret, and manage context information about entities. The two key services of the CMF are context sources (Section 3.1) and context agents (Section 3.2). In Section 4, we consider how the CMF shares context information across domains. Context sources and agents have been reported on before in [11] and [15].

3.1 Context Sources

A context source is a service that computes or senses context information and makes this available to applications through a well-defined interface, both synchronously and asynchronously. Context sources may wrap physical sensors (e.g., a Bluetooth dongle or a temperature sensor) or entire sensor networks. They can also aggregate context information from other sources or reason about it to infer new context information. Context sources use a context ontology to describe the types of context information they can provide and the quality levels they support (our quality of context mechanism is outlined in [20]).

The example of Figure 1 contains three context sources operated by the foreign domain (Company B): a location context source, a context source that provides calendar information and a context source that infers activity information from the other two.

Context sources have a trust relationship with the other context sources in their domain (e.g., by imprinting with a shared secret, or using certificates). Context sinks use a stateful discovery mechanism to find context sources, the core component of which is a so-called "context broker". There is at least one context broker per domain, but usually all devices that support the CMF run a context broker, and the context brokers exchange information within their domain using a stateless discovery protocol (our current implementation supports SLP [10] and WS-Discovery [19]). We refer to [15] for more details on the interaction between context sources and context brokers.

3.2 Context Agents

A context agent is a discoverable service that is bound to a specific entity (e.g., a user such as Alice) and provides a one-stop shop for applications to obtain context information about that entity. As we will see in Section 4, this includes information gathered by context sources in foreign domains.

In general, a context agent keeps track of the set of context sources that can currently provide context information about the entity that the agent represents (the agent's set of registered context sources).
A domain will therefore publish a context agent for each entity (e.g., devices, users, and places) about which it wants to provide context information.

In our work, each context agent has an identity of the form entityID@domainID (e.g., alice@companyA). Applications such as Bob's telephony application locate a context agent by resolving its identity to a network address, for instance using a name resolution infrastructure. The CMF supports both DNS and SIP based name resolution.

Since a context agent is the single point of access for context information about a particular entity, it is also a natural place for enforcing that entity's privacy policies (cf. [6]). Privacy policies are rules that specify who is allowed to get what context information, in what quality, in what situation. To enforce such policies, the CMF follows the Policy Core Information Model [16], which consists of Policy Decision Points (PDPs) and Policy Enforcement Points (PEPs). A PDP is responsible for evaluating policies (privacy policies in our case), while a PEP requests a policy decision from a PDP and enforces the decision that the PDP makes.

The PEPs in our architecture are context agents and proxy context sources. A context agent controls which types of context information an application is allowed to get, while a proxy context source subsequently ensures that the application gets this information at the appropriate Quality of Context (QoC) [2][20]. To accomplish this, a proxy context source wraps the source that provides the "real" context information. Our privacy policies determine the maximum QoC that a proxy passes to the requester of certain context information (e.g., Bob in Figure 1). Examples of QoC indicators are precision [20], freshness and probability of correctness.

We assume that context agents and proxy context sources have their own logical PDP, but we do not make any assumptions on how they should be realized (e.g., as truly separate PDPs or as a single domain-wide PDP for multiple agents and sources).

A PDP communicates the result of its policy evaluations to a PEP in the form of policy decisions. This includes an "obligations" part [8], which specifies the maximum level of QoC. A policy decision depends on three inputs: (1) the identity of the party requesting the context information (e.g., bob@companyA), (2) the identity of the entity whose context information the requestor wants to obtain (e.g., alice@companyA), and (3) the type of context information being requested (e.g., activity information).

Each domain contains a Context Agent Manager (CAM), which is responsible for creating, configuring, and destroying context agents in that domain. A CAM configures a new context agent with the privacy policies of the entity that the agent represents and obtains these policies from a policy repository in its local domain. A CAM also registers a set of 'well-known' context sources with each new agent. These are sources in the CAM's local domain, typically building-specific aggregators of context information. At a later stage, a context agent may use locally available context brokers (Section 3.1) to dynamically locate additional context sources in the CAM's local domain. A CAM must be discoverable through service discovery protocols like SLP or WS-Discovery.

4. Inter-domain Context Sharing

For inter-domain context sharing, we focus on "friendly" foreign domains that (1) allow visitors to use the foreign domain's network and (2) are willing to share at least part of the information they collect about visitors with the outside world.

Our prime mechanism for supporting inter-domain context sharing is the so-called "temporary context agent" (TCA). A TCA represents an entity when it resides in a foreign domain and forms the entry point for getting context information from the foreign domain about that entity. A TCA links to the entity's context agent in the home domain to make the foreign context information available to applications, and it enforces the foreign domain's privacy policies (with the assistance of proxy context sources in the foreign domain). The CAM of the foreign domain manages the life-cycle of TCAs, which are soft-state components. The latter means that a CAM will destroy TCAs automatically if they are not kept alive. This frees up resources and therefore benefits the scalability of the entire system. It may also help foreign domains in deleting information they have collected about visiting entities, which may be required from a (legal) privacy perspective. As the foreign domain is not likely to know the identity of its visitors, we instantiate one or more new TCAs for each visitor. No authentication is required. In practice, a TCA will be created for each network interface belonging to the user that obtains access to networking resources at the visited domain.

In this section we describe two protocols: one for setting up a secure association between a TCA and the context agent in the home domain (Section 4.1) and the protocol that subsequently makes the context information from the foreign domain available to applications through the context agent (Section 4.2).
4.1 Establishing Secure Associations

To enable applications to obtain context information from a foreign domain, we need to establish a connection between a TCA and the context agent of the corresponding entity in the home domain. In our system, it is the mobile user's personal device that controls this process. The reason is that it typically knows when it is in a foreign domain, plus it has a secure association with its own context agent. A TCA will be created by the foreign (visited) domain for all network interfaces of the user's personal devices that are connected to access points in the foreign environment, for instance the user's WiFi MAC address or Bluetooth ID. This therefore does not require any explicit tagging by the foreign host, and it allows the guest to use pre-configured home credentials on his personal device to link the foreign TCA to its own context agent.

To establish these secure connections, we introduce two additional components: Authentication and Authorization Services and Context Agent Discovery Clients.

An Authentication and Authorization Service (AAS) is responsible for authenticating and authorizing visitors for particular services in the domain for which the AAS is authoritative. The AAS grants these rights by means of access tokens. AAS clients (in our scenario: the personal device of the visitor) need to present these tokens to the services in the AAS' domain in order for those services to execute the client's requests. AAS clients discover the AAS of a certain domain by means of a generic service discovery protocol like SLP or WS-Discovery. Our AAS service is configured to provide zero-authentication guest certificates for nodes that have obtained network access, like our visitors. Local users will typically be granted more permissions. This means that cross-domain authentication mechanisms are not needed.

The Context Agent Discovery Client (CADC) controls the establishment of secure associations between a context agent and a TCA. A CADC runs on a personal mobile device (e.g., on Alice's smart phone) and goes through the following four steps:
1) Upon entering a foreign domain, the CADC obtains an IP address and discovers the foreign domain's AAS and its CAM;
2) The CADC obtains an access token from the AAS and requests the CAM in the foreign domain to create a TCA for the roaming entity. The request message includes the token the CADC received from the AAS. The CAM checks with its local AAS whether the token is valid. If it is, it retrieves the privacy policies it requires from the policy repository, creates the requested TCA, and configures the TCA with the foreign domain's privacy policies;
3) The CADC obtains a reference to its TCA from the CAM and registers it with the entity's context agent in the home domain. The registration message includes the CADC's access token in the home domain (which was issued previously by the AAS in the home domain) as well as the access token that the CADC received from the AAS in the foreign domain. The CADC encrypts all messages between the user's personal device and the home context agent (we use SSL for this purpose).
4) The context agent in the home domain establishes a secure association with the TCA, presenting the credentials the CADC received from the AAS in the foreign domain (token B in Figure 2).

Figure 2 illustrates this using the example of Figure 1. An alternative approach would be that the TCA registers itself with the entity's agent in the home domain. The disadvantage is that this would require the CADC to expose the entity's identity to the foreign domain, which may not be desirable from a privacy perspective.

The CADC is also responsible for keeping the association between its context agent and the TCAs alive. It accomplishes this by regularly refreshing (1) the access token and the TCA it requested in the foreign domain and (2) the registration of the TCA with the entity's context agent. We expect these refresh intervals to be rather large (currently set to 30 minutes), which minimizes the impact on the mobile device (e.g., in terms of used battery power). An alternative is that the entity's context agent refreshes the association with a TCA by regularly polling that TCA.

The association between a context agent and a TCA disappears automatically if the TCA disappears.

Figure 2. Establishing a link with a TCA.
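The four CADC steps can be read as a simple token-passing routine. The sketch below only mirrors the message flow described above; the service objects and method names (issue_token, create_tca, register_tca, link) are stand-ins, not the actual CMF interfaces, and the stub classes exist only to make the example self-contained.

```python
class StubAAS:
    def issue_token(self, interface_id):
        return {"subject": interface_id, "scope": "guest"}

class StubCAM:
    def create_tca(self, token):
        return {"tca": "tca-1", "admitted_with": token}

class StubHomeAgent:
    def register_tca(self, tca_ref, home_token, foreign_token):
        self.tca, self.tokens = tca_ref, (home_token, foreign_token)
    def link(self, tca_ref, credentials):
        self.linked = (tca_ref["tca"], credentials)

class StubCADC:
    network_interface = "wifi-aa:bb:cc"

def establish_association(cadc, foreign_aas, foreign_cam, home_agent, home_token):
    """Mirror of the four steps in Section 4.1 (illustrative only)."""
    # Step 1 (discovery of the foreign AAS and CAM, e.g. via SLP) is assumed done.
    foreign_token = foreign_aas.issue_token(cadc.network_interface)      # step 2
    tca_ref = foreign_cam.create_tca(token=foreign_token)                # step 2
    home_agent.register_tca(tca_ref, home_token=home_token,              # step 3
                            foreign_token=foreign_token)
    home_agent.link(tca_ref, credentials=foreign_token)                  # step 4
    return tca_ref

establish_association(StubCADC(), StubAAS(), StubCAM(), StubHomeAgent(),
                      home_token={"subject": "alice@companyA"})
```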

The CADC may, however, also explicitly remove such an association by deregistering the TCA at the context agent. The corresponding message contains the access token for the home domain as well as the one from the foreign domain.

4.2 Obtaining Context Information

After having established a secure association, CMF applications such as Bob's telephony client can obtain (foreign) context information from a context agent. These interactions follow a three-step procedure, which is similar to typical discovery protocols:
1) Locate an entity's context agent. A client application uses a name resolution infrastructure to map the identity of an entity onto the network address of the corresponding context agent (see Section 3.2);
2) Query the context agent. A client queries a context agent by sending a query message to it. The message contains the type of context information the client is looking for, the identity of the requesting entity (e.g., Bob@CompanyA), and a token that represents that requestor's rights (e.g., using SAML [17]). The context agent validates the request at its PDP. If the subject of the query (e.g., Alice) is in a foreign domain and if the PDP indicates that the client is allowed to get the requested type of context information, the context agent forwards the query to the TCA in the foreign domain using the secure connection of Section 4.1. The TCA then creates a proxy context source in the foreign domain, links it to the foreign context source, configures the proxy with the foreign domain's privacy policies, and returns the reference to the proxy to the context agent in the home domain. This agent creates another proxy in the home domain and configures it with the privacy policies of the roaming entity. It then links the proxy to its peer in the foreign domain, and passes the reference to the proxy in the home domain to the client application. If the entity is in its home domain, the context agent will not link to a TCA and the proxies in the home domain will link to context sources in that same domain.
3) Query the proxy context source. A client uses the reference it received from the context agent in the home domain to query the proxy context source and obtain the actual context information. The proxy forwards the query to its peer in the foreign domain, which gets the actual context information from the context source for which it acts as a proxy and returns the context information back to the client along the same path. The proxy context source in the foreign domain interacts with its local PDP to enforce the privacy policies of the foreign domain, which may for instance result in the proxy lowering the quality of the context information. The proxy in the home domain also interacts with its local PDP and may modify the context information it receives from the proxy in the foreign domain once more before passing it to the application. The obligation part of the PDPs' policy decisions specifies the maximum QoC.

Figure 3 illustrates this behavior for Bob's client and Alice's context agent (the example in Section 2). To simplify the figure, we did not detail the interactions with the PDPs of Alice and the foreign domain. The token that Bob's client application passed to Alice's context agent (tokenA) proves Bob's authenticity and is typically issued by a trusted third party. Observe that token B is a result of the establishment of the secure connection with the TCA (see Section 4.1).
The usage of two sets of privacy policies (one of the
Alice’s home domain foreign domain (Company B)
roaming entity and one from the foreign domain) may
C(Bob) CA(Alice) PCS1(CS(activity)) TCA(Alice) PCS2(CS(activity)) CS(activity)
result in certain context information never reaching the
query(activity, tokenA)
Get policy
context-aware application. This for instance happens
decision from
Alice’s PDP query(activity, tokenB)
when the policies of the visiting domain lower the QoC
Get policy
decision from
below the level required by the policies of the roaming
B’s PDP
user and is therefore not returned to the requesting
application. As the policy rules will not be exchanged
PCS2(CS(activity)) C

PCS1(CS(activity)) C
(privacy), this is impossible to prevent.
getContext(activity, tokenA) getContext(activity, tokenB) If the foreign domain’s security policies do not
Get policy
decision from allow access to the TCA, our system also supports a
B’s PDP getContext(activity)

activity information
fall-back scenario where the CADC entity instantiates
activity information E a context source proxy on the personal device that acts
Get policy
decision from
as a second bridge. This does, of course, consume
Alice’s PDP

E
additional resources on the personal device.
activity information
This system also assumes that the quality of the
C = create a proxy context source E = enforce policies (obligations) information provided by the TCA is sufficient. This
Figure 3. Inter-domain context sharing could very well be spoofed by un-compliant intelligent
environments. This is another level of trust-

102
management that is not covered by our system,
although it is possible to switch the ‘automatic context
sensing’ feature off.

5. Implementation and Deployment


We implemented the CMF as Java components. They
run in a distributed run-time environment that allows
them to abstract away from the underlying operating
system (Windows 2000/XP, Windows Mobile, Linux), foreign context
information
RPC mechanisms (web services, XML-RPC), local
service discovery infrastructures (WS-Discovery,
centralized registry), and global name resolution
schemes (SIP, DNS). Only a few context sources are
operating system or hardware dependent because they Figure 5. Screenshot of Buddy Spotter
acquire context in an OS/hardware-specific way. application.
Examples include keyboard activity, schedule
information and system power status. domain) and in the houses of three different employees
Context sources use an OWL ontology to describe (foreign domains). The PCs in the three houses are
the types of information they produce. We use Jena equipped with Bluetooth dongles and contain a context
[13] to generate and parse the RDF and to process source that uses the Bluetooth RF signal to determine
SPARQL sent to context sources and context agents. if an employee is in his house or not. The PC in the
We use the OASIS standard XACML (eXtensible home domain hosts three context agents, one for each
Access Control Markup Language) [8] to encode the of the three employees. The PCs in the foreign
privacy policies that our system uses and use the PDP domains run a TCA when the employee is in his house
implementation provided by SUN to evaluate those and feed the extra context information (“at home” or
policies. We extended the PDP implementation so that “away”) back to the employee’s context agent in the
policies can specify a maximum allowed QoC, in home domain.
particular regarding the precision of activity We make the additional context information
information. If a certain request triggers the evaluation available through an instant messaging-like application
of multiple policies, the XACML PDP chooses the called the Buddy Spotter. The Buddy Spotter is
highest allowable QoC as the result. More details can accessible from a wall-mounted display in our office’s
be found in coffee corner and keeps track of the locations of
Figure 4 illustrates how we deployed the CMF. We several employees, including the three with the PCs in
put a CMF-enabled PC in our office (the home their houses. For these employees, the Buddy Spotter
also indicates if they are at home, which is information
it obtains from the three context agents running on the
office PC. Without our system, this information would
have been unavailable without inter-domain context
sharing. Figure 5 shows a screen shot of the Buddy
Spotter application and marks the foreign location
information (“living room”) available on user Martin
Wibbels.

6. Related Work
The work that comes closest to ours is the Vade
system of José et al. [12]. Their context managers are
very similar to our context agents and also allow
applications to obtain foreign context information. The
main difference is that our system explicitly supports
configurable privacy policies, also for foreign domains.
Another difference is that José et al. seem to rely on
Figure 4. Inter-domain deployment of the CMF. telecom operators to track a device’s location and use

103
that information to determine whether a mobile device Like our system, the ACAI system of Khedr and
is in a particular foreign domain or not. Although they Karmouch [14] also enables applications to obtain
also identify our mobile-controlled approach, they have foreign context information. The main difference with
not detailed it. our work is that they do not consider the privacy
Several other projects, like e.g. in-Context, have policies of roaming entities and foreign domains. Their
developed a system that essentially has a per-group architecture furthermore assumes that each domain
server (e.g. [[8]]. This is fundamentally different from runs an agent platform. In our architecture, we do not
our approach, in which each entity has its own service, make this assumption and allow each domain to use its
and sharing and privacy-management does not require own technology for intra-domain discovery and
a centralized service at all. communications. Finally, the ACAI system uses SIP
Project DAIDALOS has developed a system in for inter-domain communications and for location
which foreign domains push the context information updates, which is one of the protocols we could have
they collect about visiting users to so-called “home used to realize these interactions in our system
managers” [18][21]. A home manager is a database (between a CADC and a context agent as well as
(that resides in a user’s home domain) with context between a context agent and its TCA).
information about a particular user, including foreign Chin et al. [4] and Gu et al. [9] enable context-
context information. The similarity with our approach aware applications to get foreign context information
is that context agents also provide a single point of by means of a self-organizing peer-to-peer overlay that
contact for context information. The main difference is interconnects the context brokers (called “discovery
that DAIDALOS targets an environment that consists gateways”) of different administrative domains (which
of large trusted domains that are federated through they call “spaces”). The difference with our work is
service level agreements (cf. traditional telco that they rely on the peer-to-peer infrastructure to
operators). As a result, DAIDALOS users inform a discover context sources in foreign domains and do not
foreign domain of the URL of their home manager, leverage the current location (domain) of a roaming
which the foreign domain subsequently uses to push entity. There also seems to be no explicit mechanism to
context information to the home manager. In our determine in which domain a roaming entity currently
system, visiting users typically deal with untrusted resides, which makes it unclear how they associate
foreign domains and therefore usually do not publish foreign context sources with that entity. They also not
their identity. As a result, it is the user’s personal consider privacy aspects.
device that establishes a link with the foreign domain
and our home domains pull for foreign context 7. Conclusions
information. Another important difference is that the
DAIDALOS system does not consider the privacy In order for the vision of ubiquitous, context-aware
policies of users and foreign domains. computing to become a reality, we will need to be able
While not the main contribution of their paper, to share context information across different domains
Chen, Finin, and Joshi [3] outline how their COBRA of administration. This will enable context-aware
system could be used to provide foreign context applications to use context sources in foreign domains
information to applications. Unlike in our system, they to obtain additional context information about roaming
make foreign domains responsible for enforcing the entities such as users and devices, which will
privacy policies of visiting users. The advantage of ultimately improve these applications’ operation. A
their approach is that it takes the home domain out of crucial requirement, however, is that the underlying
the loop for policy enforcement, which reduces the system enforces the privacy policies of mobile users
load on the home domain and possibly also the number and foreign domains, which determine who is allowed
of traversed network hops. The downside is that to get what context information under which
roaming users have to (1) inform every foreign domain circumstances.
they visit of their privacy policies, and (2) that they The key contribution of our system is that it realizes
need to trust these domains to correctly enforce those both of these requirements. In contrast to existing
policies and (3) that applications will need to rebind to systems, we do not require pre-established security
the foreign sources of information. This is not relations between domains, but dynamically establish
necessary in our approach, but malicious foreign these relations based on the physical presence of a
domains can still manipulate the context information roaming entity in a foreign domain. Based on current
they collect about a visiting entity. Another limitation literature, we believe this is novel work. Another
of the COBRA system is that they do not consider the benefit of our work is that applications that use our
privacy policies of foreign domains, which we do. system will only need to interact with a single context

104
agent, independent of the domain that provides the [8] Dorn, C. Schall, D. Dustdar, S., “Granular Context in
contextual information about that specific entity. We Collaborative Mobile Environments”, Springer LNCS
have implemented all concepts described in this paper 4278, pp 1904-1913, Springer Verlag, 2006
[9] T. Gu, E. Tan, H. Keng Pung, D. Zhang “A Peer-to-Peer
and also deployed the CMF on a small scale.
Architecture for Context Lookup”, 2nd International
Future work includes the validation and usability Conference on Mobile and Ubiquitous Systems
check of the policy language, in particular regarding its (MobiQuitous’05), San Diego, California, July 2005
expressiveness, the types of rules it should support, [10] E. Guttman, C. Perkins, J. Veizades, M. Day, “Service
automation of context-information adaptation (due to Location Protocol, Version 2”, RFC 2608, June 1999
policy enforcement) and the empowerment how users [11] C. Hesselman, H. Eertink, and M. Wibbels, “Privacy-
can be empowered to easily manage their privacy aware Context Discovery for Next Generation Mobile
policies. We will also investigate privacy policies that Services”, 3rd SAINT2007 Workshop on Next
depend on context information and will use the system Generation Service Platforms for Future Mobile
Systems (SPMS 2007), Hiroshima, Japan, January 2007
for different applications (healthcare domain,
[12] R. José, F. Meneses, A. Moreira, “Integrated Context
professional services) in order to validate the Management for Multi-domain Pervasive
application programming model of the CMF. Environments”, First International Workshop on
Managing Context Information in Mobile and Pervasive
Acknowledgments. This work has been conducted Environments (MCMP-05), Ayia Napa, Cyprus, May
within the projects Freeband AWARENESS (co- 2005
sponsored by the Dutch government under contract [13] HP Labs, “Jena – A Semantic Web Framework for
BSIK 03025) and IST Amigo (partially funded by the Java”, http://jena.sourceforge.net/, October 2005
European Commission under contract IST 004182). [14] M. Khedr and A. Karmouch “'ACAI: Agent-Based
Context-aware Infrastructure for Spontaneous
Remco Poortinga and Niels Snoeck contributed to the
Applications”, Journal of Network & Computer
topics presented in this paper. Maarten Wegdam Applications, Volume 28, Issue 1, pp. 19-44, 2005
reviewed the draft version of this paper. [15] H. van Kranenburg, M. S. Bargh, S. Iacob, A.
Peddemors, “A Context Management Framework for
8. References Supporting Context-Aware Distributed Applications”,
IEEE Communications Magazine, August 2006, pp. 67-
[1] M. Blackstock, R. Lea, and C. Krasic, “Toward Wide 74.
Area Interaction with Ubiquitous Computing [16] B. Moore, et al., IETF RFC3060 Policy Core
Environments”, 1st European Conference on Smart Information Model—Version 1 Specification, February
Sensing and Context, the Netherlands, October 2006 2001.
[2] T. Buchholz, A. Küpper, and M. Schiffers, “Quality of [17] Ragouzis, et al. (eds.), Security Assertion Markup
context: What it is and why we need it”, Workshop of Language (SAML) V2.0 Technical Overview,
the HP OpenView University Association (HPOVUA http://www.oasis-open.org/committees/documents.php?
2003), Geneva, 2003 wg_abbrev=security, Oct 2006.
[3] H. Chen, T. Finin, and A. Joshi, “Using OWL in a [18] I. Roussaki, M. Strimpakou, C. Pils, N. Kalatzis, M.
Pervasive Computing Broker”, Workshop on Ontologies Neubauer, C. Hauser, and M. Anagnostou, “Privacy-
in Open Agent Systems (AAMAS 2003), July 2003 Aware Modelling and Distribution of Context
[4] C.-Y. Chin, D. Zhang, M. Gurusamy, “Orion: P2P- Information in Pervasive Service Provision”, IEEE
based Inter-Space Context Discovery Platform”, 2nd International Conference on Pervasive Services (ICPS
Annual International Conference on Mobile and 2006), Lyon, France 2006, pp. 150-160
Ubiquitous Systems (MobiQuitous’05), San Diego, [19] J. Schlimmer (ed), “Web Services Dynamic Discovery
USA, July 2005 (WS-Discovery)”, Microsoft Corporation, 2005,
[5] N. Davies and H.-W. Gellersen, “Beyond Prototypes: http://msdn2.microsoft.com/en-us/library/
Challenges in Deploying Ubiquitous Systems”, IEEE bb706924.aspx
Pervasive Computing, pp. 26-35, January 2002 [20] K. Sheikh, M. Wegdam, M. Sinderen, “Quality-of-
[6] P. Debaty and D. Caswell, “Uniform Web presence Context and its use for Protecting Privacy in Context
architecture for people, places, and things”, IEEE Aware Systems”, Journal of Software (JSW), Vol.3,
Personal Communications, Volume 8, Issue 4, Aug Issue 3, pp 83-93, March 2008.
2001, pp. 46-51 [21] M. Strimpakou, I. Roussaki, C. Pils, M.Angermann, P.
[7] A. Dey, D. Salber, and G. Abowd, “A Conceptual Robertson, M. Anagnostou, “Context Modelling and
Framework and a Toolkit for Supporting the Rapid Management in Ambient-aware Pervasive
Prototyping of Context-Aware Applications”, Special Environments”, International Workshaop on Location-
issue on context-aware computing; Human-Computer and Context-Awareness (LoCa 2005), Munich,
Interaction (HCI) Journal, Volume 16 (2-4), 2001, pp. Germany, 2005
97-166

105
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Two-Way Beacon Scheduling in ZigBee Tree-Based Wireless Sensor Networks

Lun-Wu Yeh† , Meng-Shiuan Pan† , and Yu-Chee Tseng†‡



Department of Computer Science
National Chiao Tung University, Taiwan
Department of Information and Computer Engineering

Chung-Yuan Christian University, Taiwan
Email: {lwyeh, mspan, yctseng}@cs.nctu.edu.tw

Abstract efficient and low-latency scheduling designs. Reference


[11] aims to minimize the end-to-end communication de-
Broadcast and convergecast are two fundamental opera- lay while provides energy-efficient sleep mode for sensor
tions in wireless sensor networks. Although previous works nodes. Reference [9] analyzes different wake-up schemes
have addressed energy-efficient and low-latency schedul- on energy consumption and upstream/downstream delays.
ing, these works either consider only one-way (broadcast or It is suggested that a node should wake up twice per cycle
convergecast) communication or are not compliant to Zig- to support two-way transmission. The results in [9][11] are
Bee/IEEE 802.15.4 standards. Motivated by these observa- not compliant to the ZigBee/IEEE 802.15.4 standards. In
tions, this work defines a two-way beacon scheduling prob- [14], the authors propose convergecast algorithms which are
lem for ZigBee tree-based networks. We propose a schedul- compliant to ZigBee/IEEE 802.15.4. We propose an obser-
ing algorithm that can reduce indirect interference neigh- vation to reduce some indirect interferences in that network.
bors and achieve low-latency broadcast and convergecast. It can decrease the transmission delay. Also, the broadcast
Simulation results are presented. issue is not addressed.
Our goal is to design an efficient beacon scheduling so-
lution for ZigBee/IEEE 802.15.4 tree-based networks to
support both broadcast and convergecast with low laten-
1. Introduction
cies in both directions. Fig. 1(a) shows the problem sce-
nario. The network contains one sink (ZigBee coordina-
The rapid progress of wireless communication and em- tor), some full-function devices (ZigBee routers), and some
bedded micro-sensing MEMS technologies has made wire- reduced-function devices (ZigBee end devices). Each Zig-
less sensor networks (WSNs) possible. A WSN normally Bee router is responsible for collecting sensory data from
consists of many inexpensive wireless nodes, each capable end devices and relaying them to the sink. According to the
of collecting, processing, and storing environmental infor- IEEE 802.15.4 specification, a router can announce a bea-
mation, and communicating with neighboring nodes. Many con to start a superframe. Each superframe consists of an
WSN applications have been developed, such as emergency active portion followed by an inactive portion. On receiv-
guiding [6][12], object tracking [10], and environmental ing its parent router’s beacon, a child device needs to wake
monitoring [3][13]. up for an active portion and can communicate with its par-
Recently, several WSN platforms have been developed, ent. However, to avoid collision with its neighbors, a router
such as MICAz [2], Tmote [4], and Dust Network [1]. should shift its active portion by a certain amount. Fig. 1(b)
To ensure interoperability of different platforms, the Zig- shows a possible allocation of active portions for the sink
Bee/IEEE 802.15.4 [7][16] standards are proposed, which c, routers RA , RB , and RC . In this example, the sensory
define physical, MAC, and network layers for low-rate, low- data reported from ED1 and ED2 can reach to the sink in
power wireless communications. one superframe. However, in the reverse direction, when
Sending commands to several nodes from the sink node the sink needs to send a packet to ED1 , the latency will be
(broadcast) or reporting sensory data to the sink node up to 3 superframes. The transmission delay can not be neg-
(convergecast) are two fundamental operations in WSNs. ligible when the network is run under a low duty cycle. For
Broadcast and convergecast are inverse operation of each example, in 2.4 GHz PHY, with 3.13% duty cycle, a super-
other. Some previous works [9][11][14] propose energy- frame can be up to 251.658 seconds (with an active portion

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 130


DOI 10.1109/SUTC.2008.51
Sink Schedule of
sink Ξ Ξ Ξ
c
Schedule of RA Ξ Ξ Ξ
ZigBee
RA
router
Schedule of RB
Ξ Ξ Ξ
Report from Message to ED2
ZigBee
end device ED2 Ξ Ξ Ξ
RB RC
Schedule of RC
Interference Report from ED1 Message to ED1
neighbor
ED2 ED1 m-th superframe (m+1)-th superframe (m+2)-th superframe
Active portion led Active period to communicate
(a) (b) by a beacon period with parent's active portion

Schedule of cd cu Ξ cd cu Ξ
Upstream active portion
Ru sink Message to Message to
of a ZigBee router R ED1 and ED2 ED1 and ED2
d
cd R A R uA c u Ξ c d R dA R uA cu Ξ
Downstream active portion Schedule of RA
d
R of a ZigBee router R
R dA R dB R uB R uA Ξ R dA R dB R uB R uA Ξ
Schedule of RB
u A ZigBee router R listens to its
R parent's upstream active portion Message to ED2 Report from ED2 Message to ED2 Report from ED2
R dA R dC R Cu R uA Ξ R dA R dC R Cu R uA Ξ
Schedule of RC
d A ZigBee router R listens to its
R parent's downstream active portion Message to ED1 Report from ED1 Message to ED1 Report from ED1
m-th superframe (m+1)-th superframe
(c)

Figure 1. (a) An example of ZigBee/IEEE 802.15.4 tree-based network. (b) An example of two-way
transmission under the original ZigBee/IEEE 802.15.4 superframe structure. (c) An example of two-
way transmission under the proposed modified ZigBee/IEEE 802.15.4 superframe structure.

of 7.88 seconds). briefly introduces ZigBee/IEEE 802.15.4 standards and our


The above observation leads to the problem of designing proposed superframe structure. Section 3 formally defines
two-way beacon scheduling for ZigBee tree-based networks the two-way beacon scheduling (TBS) problem. Section 4
such that the latencies of both broadcast and convergecast presents our observations and algorithm for the TBS prob-
are as low as possible. We propose to modify the original lem. Simulation results are given in Section 5. Finally, Sec-
superframe structure of IEEE 802.15.4 to allow each router tion 6 concludes this paper.
to broadcast beacons twice in a superframe. This requires
two active portions per superframe. One beacon is for the
2. Preliminaries
upstream (convergecast) direction and the other is for the
downstream (broadcast) direction. Fig. 1(c) shows an allo-
cation of active portions in the previous example. As can 2.1. Overview of IEEE 802.15.4 and ZigBee
be seen, the transmission delays of downstream messages
from the sink and the upstream reports from end devices IEEE 802.15.4 [7] specifies the physical and data link
ED1 and ED2 can be limited to one superframe. protocols for low-rate wireless personal area networks (LR-
Apparently, different assignment of active portions will WPAN). In the physical layer, there are three frequency
incur different transmission delays. We propose an schedul- bands with 27 radio channels. Channel 0 ranges from 868.0
ing algorithm for low-delay, two-way (broadcast and con- MHz to 868.6 MHz, which provides a data rate of 20 kbps.
vergecast) beacon scheduling. We show that assigning up- Channels 1 to 10 work from 902.0 MHz to 928.0 MHz and
stream and downstream active portions simultaneously can each channel provides a data rate of 40 kbps. Channels 11
achieve lower delays than doing this separately. In addi- to 26 are located from 2.4 GHz to 2.4835 GHz, each with a
tion, to further relieve the interference among neighboring data rate of 250 kbps.
routers, we propose a mechanism to reconnect end devices IEEE 802.15.4 devices are expected to have limited
to different parent routers to reduce some indirect interfer- power, but need to operate for a longer period of time.
ences. Simulation results show that the proposed algorithm Therefore, energy conservation is a critical issue. Devices
can indeed achieve good performance. are classified as full function devices (FFDs) and reduced
The rest of this paper is organized as follows. Section 2 function devices (RFDs). IEEE 802.15.4 supports star and

131
Received Beacon Transmitted Beacon Received Beacon Transmitted beacon Transmitted beacon
for upstream data for downstream data
Incoming (received)
Outgoing Received beacon Received beacon Received beacon
Inactive (transmitted) Inactive for upstream data for downstream data for upstream data
active portion
active portion

SD (Active ) SD (Active )
StartTime > SD SD = aBaseSuperframeDuration × 2SO symbols RbUp Inactive TbUp Inactive RbDn Inactive TbDn

BI = aBaseSuperframeDuration × 2BO symbols


m-th superframe (Beacon Interval)

Figure 2. The relationship between incoming


and outgoing active portions. Figure 3. The proposed superframe structure
for the TBS problem

peer-to-peer topologies. In each PAN, one device is desig- In a beacon-enabled star network, a device only needs to
nated as the coordinator, which is responsible for maintain- be active for 2−(BO−SO) portion of the time. Changing the
ing the network. A FFD has the capability of becoming a value of (BO − SO) allows us to adjust the on-duty time
coordinator or associating with an existing coordinator. A of devices. However, for a beacon-enabled tree network,
RFD can only associate with a coordinator. routers have to choose different times to start their active
The ZigBee alliance [5] defines the communication pro- portions to avoid collision. Once the value of (BO − SO)
tocols above IEEE 802.15.4. In [16], star, tree, and mesh is decided, each router can choose from 2BO−SO slots as its
topologies are supported. A ZigBee coordinator is respon- active portion. In the revised version of IEEE 802.15.4 [8],
sible for initializing, maintaining, and controlling the net- a router needs to select one active portion as its outgoing ac-
work. In a star network, devices must directly connect to tive portions, and based on the active portion selected by its
the coordinator. For tree and mesh networks, devices can parent, it also selects the same active portion as its incoming
communicate with each other in a multihop fashion. The active portions (refer to Fig. 2). In an outgoing/incoming
network backbone is formed by one ZigBee coordinator and active portions, a router is expected to transmit/receive a
multiple ZigBee routers (which must be 802.15.4 FFDs). beacon to/from its child routers/parent router. When choos-
RFDs can only join the network as end devices by associ- ing a slot, neighboring routers’ active portions (i.e., outgo-
ating with the ZigBee coordinator or ZigBee routers. In a ing active portion) should be shifted away from each other
tree network, the coordinator and routers can announce bea- to avoid interference. However, the specification does not
cons. However, in a mesh network, regular beacons are not clearly define how to choose the locations of routers’ active
allowed. Beacons are an important mechanism to support portions.
power management. Therefore, the tree topology is pre- In this work, we consider two types of interference be-
ferred, especially when energy saving is a desirable feature. tween routers. Two routers have direct interference if they
The ZigBee coordinator defines the superframe structure can hear each others’ beacons. Two routers have indirect
of a network. As shown in Fig. 2, the structure of super- interference if they have at least one common neighbor
frames is controlled by two parameters: beacon order (BO) which has communication activities with one of these two
and superframe order (SO), which decide the lengths of a routers. Both interferences should be avoided when choos-
superframe and its active potion, respectively. For a beacon- ing routers’ active portions.
enabled network, the setting of BO and SO should satisfy
the relationship 0 ≤ SO ≤ BO ≤ 14. (A non-beacon- 2.2. The Superframe Structure for Two-way
enabled network should set BO = SO = 15 to indicate Beacon Scheduling
that superframes do not exist.) Each active portion consists
of 16 equal-length slots, which can be further partitioned We propose a new superframe structure to support quick
into a contention access period (CAP) and a contention broadcast and convergecast. In IEEE 802.15.4, each router
free period (CFP). The CAP may contain the first i slots, should wake up in two slots (outgoing and incoming active
and the CFP contains the rest of the 16 − i slots, where portions). In this work, we propose that each router should
1 ≤ i ≤ 16. Slotted CSMA/CA is used in CAP. FFDs wake up in four slots. These four slots are denoted as TbUp,
which require fixed transmission rates can ask for guarantee TbDn, RbUp, and RbDn, as shown in Fig. 3. In TbUp/TbDn
time slots (GTSs) from the coordinator. A CFP can support slots, a node will transmit beacons to its children for receiv-
multiple GTSs, and each GTS may contain multiple slots. ing upstream/transmitting downstream data from/to them.
Note that only the coordinator can allocate GTSs. After the In RbUp/RbDn slots, a node will receive beacons from its
active portion, devices can go to sleep to save energy. parent for transmitting upstream/receiving downstream data

132
m-th superframe (m+1)-th superframe (m+2)-th superframe
Ru RbUp slot of router R
Sink Slot number 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
k=8
Rd RbDn slot of router R c
Sink c (5,0) td tu td tu td tu
u dd(RA,RB)=3
R TbUp slot of router R RA
Router RA (1,3) t d R uA R dA tu t d R uA R dA tu t d R uA R dA tu
Rd TbDn slot of router R RB du(RB,RA)=7

Router RB (2,6) R uA R uB R dA R dB R uA R uB R dA R dB R uA R uB R dA R dB
ZigBee router RC
Router RC (7,0) R dC R uB R dB R Cu R dC R uB R dB R Cu R dC R uB R dB R Cu
ZigBee end device
RD

Interference neighbor Router RD (4,3) R dC R dD R uD R Cu R dC R dD R uD R Cu R dC R dD R uD R Cu


ED
dd(RA,RB)=3 dd(RB,RC)=3 dd(RC,RD)=2 dd(RD,ED)=3

Dd(RD) = 11

du(RD,RC)=3 du(RC,RB)=3 du(RB,RA)=7 du(RA,c)=4


u
D (RD) = 17

Figure 4. An example of two-way transmission delay.

to/from its parent. To support this modification, we can use slots su (par(i)) and sd (par(i)), respectively, where par(i)
the reserved field in the beacon frame to announce the po- is the parent of i in tree T . By interference free, it implies
sitions of these extra slots. Except these changes, all other that su (i) = su (j), su (i) = sd (j), sd (i) = su (j), and
operations follow the original IEEE 802.15.4 specification. sd (i) = sd (j) if edge (i, j) ∈ EI .
Hence, the new design is backward compatible with the Our goal is to find a slot assignment such that the broad-
original specification. cast and convergecast latencies are as low as possible. Let
(i, j) be a link in T such that i is the parent of j. The up-
3. The Two-Way Beacon Scheduling Problem stream delay du (j, i) from j to i is the number of slots from
j receiving a convergecast packet until it is forwarded to i,
We model the network as a graph G = (V, E), where i.e.,
V contains all routers and the coordinator c and E con-
du (j, i) = (su (i) − su (j)) mod k.
tains all symmetric communication links between nodes in
V . The coordinator and routers follow the superframe struc- Similarly, the downstream delay dd (i, j) from i to j is the
ture mentioned in Section 2.2. The coordinator serves as the number of slots from i receiving a broadcast packet until it
sink. End devices are not included in G but will be associ- is forwarded to j, i.e.,
ated to nodes in G. From G, we assume that a ZigBee tree 
T = (VT , ET ) has been constructed, where VT = V and d 0, if i is sink node
d (i, j) =
ET ⊆ E. Also, we can construct from G an interference (sd (i) − sd (par(i))) mod k, otherwise.
graph GI = (V, EI ), where edge (x, y) ∈ EI if there are
Note that dd (i, j) is in fact independent on j’s selection
direct/indirect interferences between x and y. With param-
because the transmission time of i is purely dependent on i’s
eters BO and SO, there are k = 2BO−SO active portions
selection. Also, the latency is 0 when i is the sink because
(slots) in a superframe. Motivated by Brook’s theorem [15],
the broadcast is initiated by i itself.
which proves that n colors are sufficient to color any graph
For any node i, the upstream delay from i to sink c, de-
with a maximum degree of n, we assume that k ≥ 2 × DI ,
noted by Du (i) is the sum of the per-hop upstream delays
where DI is the maximum degree of GI . The factor 2 is
of the path form i to c on tree T . Similarly, for any node i,
required here because each node will need up to 2 colors
the downstream delay from sink c to i, denoted by Dd (i) is
(slots).
the sum of per-hop downstream delays of the path from c to
To solve the broadcast and convergecast problems, each
i. The overall delay incurred by T is defined as
router i ∈ V needs to decide two interference-free slots
su (i) and sd (i) among [0, k − 1] for its slots TbUp and L(T ) = max{ max Du (i), max Dd (j)}.
TbDn, respectively. Its slots RbU p and RbDn will be the ∀i∈VT ∀j∈VT

133
140 Transmission range
SA1
SA2
120 c c
(3,4) (2,5)
R1 R1
100 R2 R2
(2,5) (1,6)

(1,6) (1,6)
80 ED1 ED1
L(T)

ED2 ED2
60 R3 R3
End Potential (0,7) (0,7)

40 device parent routers k=8


ED1 R1, R2, R3, R4
Router
20 ED2 R1, R2, R3
End device
0 (a) (b)
500 600 700 800 900 1000 1100 1200 1300 1400 1500
Number of ZigBee routers

Figure 6. (a) Original slot assignment. (b) The


Figure 5. Simulation results on Observation slot assignment after reconnecting ED1 and
1. ED2 to R3 .

Definition 1 Given a graph G = (V, E), G’s interference indirect interference relation between Ri and Rj may be re-
graph GI = (V, EI ), a ZigBee tree T = (VT , ET ), and moved.
k available slots, two-way beacon scheduling (TBS) prob-
The common end devices of routers Ri and Rj must be
lem is to find an interference-free slot assignment su (i) and
located in the overlapping area of the transmission ranges
sd (i) for each i ∈ V such that network latency L(T ) is of Ri and Rj . Fig. 6(a) shows an example where R1 and
minimized.
R2 have indirect interference because of the existence of
end devices ED1 and ED2 . Given the slot assignment in
Fig. 4 shows an example, the per-hop upstream delay
the parentheses, the latency L(T ) is 3. If ED1 and ED2
du (RB , RA ) from router RB to router RA is 7 and the
are associated to c or R3 , R1 and R2 will have no indirect
per-hop downstream delay dd (RA , RB ) from router RA to
interference. Then, as shown in Fig. 6(b), R1 and R3 can
router RB is 3. The upstream delay from RD to c is 17
use the same slot pair (1, 6) to achieve a lower L(T ) of 2.
and the downstream delay from c to RD is 11. The overall
Observation 3: When selecting slots, a router with more
latency L(T ) of this example is 11.
interference neighbors should select its slots earlier.
A node with higher interference relation has less choices,
4. The Proposed Scheme so it should pick its slots earlier. This leads to Observation
3.
In this section, we propose a centralized slot assignment Below, we propose a centralized slot assignment algo-
algorithm for the TBS problem. We first present some ob- rithm for the TBS problem. We traverse routers in a bottom-
servations. up fashion according to their depths in T . For those vertices
Observation 1: Assigning upstream and downstream in depth d, we first sort them according to their degrees in
slots simultaneously can achieve lower delay than assign- GI in a descending order. Then we sequentially traverse
ing upstream and downstream slots separately. these vertices in that order. For each vertex v being vis-
This observation is supported by simulations. The first ited, run the following two procedures: FindUpSlot(v) and
slot assignment scheme (SA1) examines nodes of T in a FindDnSlot(v).
bottom-up manner. For each node being visited, it will FindUpSlot(v)
greedily pick a slot for its TbUp and a slot for its TbDn
such that the per-hop latencies are smallest. This is repeated 1. For each v, a tentative variable tu (v) will be computed,
until the sink node is reached. The second slot assignment from which the final slot su (v) will be determined.
scheme (SA2) also visits nodes of T to the sink node in a
bottom-up manner, but only assigns slots for TbUp. Then it (a) If v is a leaf node, we set tu (v) = 0.
visits T in a top-down manner and assigns slots for TbDn. (b) If v is an non-leaf node, we set tu (v) =
Our simulation result is in Fig. 5, which shows that SA1 can max{su (v  )|v  is a child of v} + 1.
achieve lower latency than SA2 in most cases.
Observation 2: Suppose that routers Ri and Rj have in- 2. Let N (v) be the set of nodes that have direct or indi-
direct interference because they have some common end de- rect interference with v and have received slots, i.e.,
vices. If some end devices can be reconnected, then the N (v) = {v  |(v  , v) ∈ EI and su (v  ) = N U LL}.

134
3. We will check if temp = tu (v) mod k is a feasible slot v. If w already has its slots su (w) and sd (w), then
for v by examining the slots used by nodes in N (v). we have to make sure this does not cause new inter-
There are two cases: ference. Otherwise, x can be associated w since the
interference, if any, can be resolved later on.
(a) If there exists a v  ∈ N (v) such that v  and v
have direct interference and su (v  ) = temp or 2. If for each common neighbor x of v  and v, the above
sd (v  ) = temp, then the slot temp is not feasible step 1 allows us to reassociate x to another router to
for v. remove the interference, then a positive response will
(b) If there exists a v  ∈ N (v) such that v  and v be replied by this procedure; otherwise , a negative re-
have indirect interference and su (v  ) = temp sponse will be replied.
or sd (v  ) = temp, then we will call procedure
RemoveIndInt(v ,v) to see if such interference Note that each end device maintains a potential parent
can be removed. If so, temp is a feasible slot routers which contains routers that can connect the end de-
for v. Otherwise, temp is not feasible for v. vice. Fig. 7(a) shows an example of this algorithm. We
choose tentative slots (1, 15) for router R6 . Routers R6 and
If the above examination determines that temp is a fea- R8 have common end devices ED1 and ED2 . After per-
sible slot for v, then we assign su (v) = temp. other- forming procedure RemoveIndInt(v ,v), ED1 reconnects to
wise, we increase tu (v) by one and repeat step 3 again. other router. R6 and R8 are removed from potential parent
routers of ED1 and ED2 . R6 can use slots (1, 15), as in
FindDnSlot(v)
Fig. 7(b). Fig. 7(c) shows the result of the slot assignment
1. For each v, a tentative variable td (v) will be computed, algorithm, where ED2 and ED3 also change their parent
from which the final slot sd (v) will be determined. routers. The result L(T ) is 5.
The computational complexity of this algorithm is an-
(a) If v is a leaf node, we set td (v) = k − 1. alyzed below. Before three procedures, the complexity of
(b) If v is an non-leaf node, we set td (v) = sorting is O(DT log DT ), where DT is the maximum de-
min{sd (v  )|v  is a child of v} − 1. gree of T . In RemoveIndInt(v , v), the time complexity
is O(DI + DE ), where DE is the maximum number of
2. Let N (v) be the set of nodes that have direct or indi-
common end devices. In FindUpSlot(v), the computational
rect interference with v and have received slots, i.e.,
cost of computing tentative slot in step 1 is O(DT ). In step
N (v) = {v  |(v  , v) ∈ EI and sd (v  ) = N U LL}.
2 and 3 of FindUpSlot(v), the time complexity for check-
3. We will check if temp = td (v) mod k is a feasible slot ing interference and call procedure RemoveIndInt(v , v) is
for v by examining the slots used by nodes in N (v). O(DI (DI + DE )). The time complexity of FindDnSlot(v)
There are two cases: is the same as FindUpSlot(v). And FindUpSlot(v) and
FindDnSlot(v) will be executed |V | times. Hence, the
(a) If there exists a v  ∈ N (v) such that v  and v overall time complexity is O(|V |DT log DT + |V |DI2 +
have direct interference and su (v  ) = temp or |V |DI DE ).
sd (v  ) = temp, then the slot temp is not feasible
for v.
5. Simulation Results
(b) If there exists a v  ∈ N (v) such that v  and v
have indirect interference and su (v  ) = temp
or sd (v  ) = temp, then we will call procedure In this section, we use simulation programs to evalu-
RemoveIndInt(v ,v) to see if such interference ate the proposed algorithm. In order to observe the effects
can be removed. If so, temp is a feasible slot of procedure RemoveIndInt(v ,v), we compare our scheme,
for v. Otherwise, temp is not feasible for v. denoted as SA, against a reduced version of SA, which does
not reconnect end devices, denoted as SA-NR. Besides, we
If the above examination determines that temp is a fea- compare the SA against a greedy slot assignment algorithm,
sible slot for v, then we assign sd (v) = temp. Other- denoted as GSA, greedily chooses TbUp slots in a bottom-
wise, we decrease td (v) by one and repeat step 3 again. up manner to minimize per hop latency of upstream, and
then chooses TbDn slots in a top-down manner to minimize
RemoveIndInt(v  ,v)
per hop latency of downstream. In our simulations, routers
1. Indirect interference appears when there is a common and end devices are randomly distributed in a N × N region
neighbor x of v  and v. If x is a router, it can be ig- and a sink node is placed in the center of the network.
nored. If x is an end device, then we may try to asso- Fig. 8(a) shows the effects on the different network size.
ciated x with another router, say w, other than v  and The number of routers and end devices are set to (N/10)2

135
R1 Interference R1 R1
neighbor (5,11)

R2 End device R2 R2
R3 R3 R3
(1,15) (4,12)
Router

ED3 k = 17 ED3 ED3

R5 (2,14) R7 R5 (2,14) R7 R5 (2,14) R7


R4 R4 R4
R6 R6 (0,16) R6
(1,15) (1,15) (3,13)
ED2 ED2 ED2

R8 (1,15) ED1 R8 (1,15) ED1 R8 (1,15) ED1


(0,16) (0,16) (0,16)

R9 R9 R9
End device Potential parent routers End device Potential parent routers End device Potential parent routers
ED1 R5, R6, R7, R8, R9 ED1 R5, R7, R9 ED1 R7
ED2 R5, R6, R7, R8, R9 ED2 R5, R7, R9 ED2 R7
ED3 R2, R5, R7, R8 ED3 R2, R5, R7, R8 ED3 R5, R7
(a) (b) (c)

Figure 7. (a) In the process of the slot assignment algorithm. (b) After procedure RemoveIndInt(v  ,v),
router R6 and R8 can use the same slots. (c) The result of the slot assignment algorithm.

and (N/10)2 × 3, respectively, and set k = 128. SA outper- 6. Conclusions


forms others. From the result, we can see that the procedure
RemoveIndInt(v ,v) can effectively reduce L(T ). GSA has In this paper, we modify the original superframe struc-
the worst performance. This result corresponds to the Ob- ture and define a two-way beacon scheduling (TBS) prob-
servation 1 on Section 4. lem for supporting two-way transmission in ZigBee tree-
based networks. We propose a centralized algorithm to
Next, we simulate a 300m × 300m network, place 900 solve the TBS problem. Simulation results indicate that the
routers and 2700 end devices, and set k = 128. Fig. 8(b) proposed scheme can effectively reduce the number of in-
shows that the result when we vary the transmission range. terference neighbors and thus decrease the network latency.
Because a larger transmission range implies higher interfer- In the future, we would design a distributed version of slot
ence neighbors. The latency L(T ) of three algorithms are assignment algorithm to support two-way transmissions.
increasing with transmission range. Due to the procedure
RemoveIndInt(v ,v) can effectively reduce indirect interfer-
7. Acknowledgements
ences, the latency L(T ) of SA slightly increases.

Y.-C. Tseng’s research is co-sponsored by Taiwan


With a network size of 200m × 200m and a router trans- MoE ATU Program, by NSC grants 93-2752-E-007-001-
mission range of 20m, and k = 128, we vary the number PAE, 95-2221-E-009-058-MY3, 95-2221-E-009-060-MY3,
of routers in the network. As Fig. 8(c) shows, when there 96-2219-E-009-007, 96-2218-E-009-004, 96-2622-E-009-
are more and more routers, the number of interferences will 004-CC3, and 96-2219-E-007-008, by Realtek Semicon-
increase. The latency L(T ) of GSA and SA-NR markedly ductor Corp., by MOEA under grant number 94-EC-17-A-
increase, but SA does not. This is also because the pro- 04-S1-044, by ITRI, Taiwan, by Microsoft Corp., and by
cedure RemoveIndInt(v ,v) can effectively reduce indirect Intel Corp.
interferences. In Fig. 8(d), we fix the number of routers
and end devices to 900 and 2700, respectively, and vary
routers’ duty cycle. Note that a lower duty cycle means a References
larger number of available slots. In GSA and SA-NR, when
available slots are k = 26 , routers use up a round of slots [1] Dust network inc. http://www.dustnetworks.
quickly. So, these L(T ) is larger than others. Interestingly, com/flash-index.shtml.
when available slots are enough, the latency L(T ) of these [2] Micaz mote. http://www.xbow.com/Products/
algorithms are independent of the number of slots. productdetails.aspx?sid=164.

136
90
GSA GSA
SA-NR 120 SA-NR
80 SA SA
70 100

60
80
50

L(T)

L(T)
40 60

30
40
20
20
10

0 0
1202 1402 1602 1802 2002 2202 2402 2602 2802 17 18 19 20 21 22 23 24 25 26
2 Transmission range (m)
Network size (m )

(a) (b)
140
GSA GSA
SA-NR 120 SA-NR
120 SA SA

100
100

80
80
L(T)

L(T)
60
60

40 40

20 20

0 0
500 600 700 800 900 1000 1100 1200 1300 1400 1500 26 27 28 29 210 211 212
Number of ZigBee routers Number of available slots

(c) (d)

Figure 8. Simulation results on the network latencies under different configurations.

[3] Terrestrial ecology observing systems. http: [10] C.-Y. Lin, W.-C. Peng, and Y.-C. Tseng. Efficient in-
//research.cens.ucla.edu/areas/2007/ network moving object tracking in wireless sensor networks.
Terrestrial/. IEEE Trans. on Mobile Computing, 5(8):1044–56, 2006.
[4] Tmote sky. http://www.moteiv.com/products/ [11] G. Lu, N. Sadagopan, B. Krishnamachari, and A. Goel. De-
tmotesky.php. lay efficient sleep scheduling in wireless sensor networks. In
Proc. of IEEE INFOCOM, 2005.
[5] ZigBee alliance. http://www.zigbee.org/. [12] M.-S. Pan, C.-H. Tsai, and Y.-C. Tseng. Emergency guiding
[6] M. A. Batalin, G. S. Sukhatme, and M. Hattig. Mobile robot and monitoring applications in indoor 3d environments by
navigation using a sensor network. In IEEE International wireless sensor networks. International Journal of Sensor
Conference on Robotics and Automation, 2004. Networks (IJSNet), 1(1/2):2–10, 2006.
[13] R. Szewczyk, A. Mainwaring, J. Polastre, J. Anderson, and
[7] IEEE standard for information technology - telecommunica-
D. Culler. An analysis of a large scale habitat monitoring
tions and information exchange between systems - local and
application. In Proc. of ACM Int’l Conference on Embedded
metropolitan area networks specific requirements part 15.4:
Networked Sensor Systems (SenSys), 2004.
wireless medium access control (MAC) and physical layer
[14] Y.-C. Tseng and M.-S. Pan. Quick convergecast in zig-
(PHY) specifications for low-rate wireless personal area net-
bee beacon-enabled tree-based wireless sensor networks. In
works (LR-WPANs), 2003.
Computer Communications, to appear.
[8] IEEE standard for information technology - telecommunica- [15] D. B. West. Introduction to Graph Theory. Prentice Hall,
tions and information exchange between systems - local and 2001.
metropolitan area networks specific requirements part 15.4: [16] ZigBee specification version 2006, ZigBee document
wireless medium access control (MAC) and physical layer 064112, 2006.
(PHY) specifications for low-rate wireless personal area net-
works (LR-WPANs)(revision of IEEE Std 802.15.4-2003),
2006.
[9] A. Keshavarzian, L. V. H. Lee, K. C. D. Lal, and B. Srini-
vasan. Wakeup scheduling in wireless sensor networks. In
Proc. of ACM Int’l Symposium on Mobile Ad Hoc Network-
ing and Computing (MobiHoc), 2006.

137
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Algorithms and Methods beyond the IEEE 802.15.4 Standard for a Wireless
Home Network Design and Implementation

M. A. Lopez-Gomez A. Florez-Lara J. M. Jimenez-Plaza J.C. Tejero-Calado


1
R&D Dept, AIRZONE R&D Dept., AIRZONE R&D Dept., AIRZONE University of Malaga
mlopez@airzone.es aflorez@airzone.es jmjimenez@airzone.es jctejero@uma.es

Abstract
The aim of this paper is to describe techniques and scenarios, such as home, industrial and building
algorithms that have been used to design and automation, biomedical monitoring, energy-efficiency
implement a commercial wireless home network control, etc.
(WHN). Although IEEE 802.15.4 supposes an efficient This paper is focussed on a WHN composed of
solution for the lower layers of the communication sensors and actuators controlled by a central device.
stack, several key issues are out of its scope. Topology As occurs with other applications, the upper layers of
control (TC) of the network, address management, and Zigbee suppose a high complexity which is not
multi-hop synchronization are some of these issues, required by this application. Because of this, an
which are covered in this paper. The WHN has been energy-aware network simpler than Zigbee, and also
designed taking in mind low complexity, low based on IEEE 802.15.4, has been designed and
consumption, and low end-to-end latency. These implemented. This paper describes how some issues,
objectives have been reached thanks to a multi-hop which are out of the scope of the standard, have been
beaconed structure, a distributed address solved. Although the taken solutions have been
management, and a routing algorithm based on masks. particularized for this WHN, they can also be suitable
for other ones.
IEEE 802.15.4 is a flexible protocol. The personal
1. Introduction area network (PAN) can operate with several networks
topologies depending on application, such as star or
Home automation, or domotics, has been a target peer-to-peer. A TC algorithm is not intended to be
application of the wireless sensor networks (WSN) defined by the standard, though a description of how a
from their beginnings. Due to the effort carried out by PAN can be established is included. The standard also
the IEEE Task Group (TG4) [1] and the Zigbee defines two addressing spaces within the PAN, as well
Alliance [2], two standards that support the as how they have to be used, although the management
communication needs of the WHN are available: IEEE of addresses is not treated. For a star network, a
802.15.4 [3] and Zigbee [4]. reasonable position is considering that the central node
IEEE 802.15.4 specifies the medium access control generates and distributes the network addresses. In tree
(MAC) layer and the physical (PHY) layer for low-rate or cluster tree networks, the problem is more complex.
wireless personal area networks (WPAN). Among the Synchronization is other key factor for multi-hop
objectives of this standard, the following ones can be network that the standard does not deal with it. It is
highlighted: low-cost devices, reasonable battery proposed the use of an attribute, which determines the
lifetime, easy network deployment, and reliable data offset between the beacon received from the
transfer. While IEEE 802.15.4 deals with lower layers, coordinator and the beacon transmitted by the own
Zigbee defines both the upper layers and the profiles of node. But it is not considered a beacon scheduling and
devices. Since the Zigbee Alliance groups some management across the PAN.
enterprises of different sectors, the standard has to In the remainder of the paper, solutions to these
provide communication facilities to a wide number of issues are exposed. Related work is shown in the next
section. The model of the considered WHN is then
1
Currently, M. A. Lopez-Gomez works as Systems Eng. in the R&D described. In section 4, the TC algorithms followed by
department of SEPSA (SPAIN); e-mail: mlopez@sepsa.es the different devices of the network are detailed. The

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 138


DOI 10.1109/SUTC.2008.70
The addressing scheme and the routing algorithm are then described. After that, the beacon scheduling method that has been designed and implemented is presented. In section 7, the obtained results are discussed. Finally, conclusions are drawn.

2. Related work

There are some key open issues in IEEE 802.15.4 WHNs concerning the network topology and device communication. The position occupied by a device can influence network latency and overall consumption. These aspects are examined in different papers about these networks. The standard does not make final decisions on issues related to network construction or on how the different nodes communicate depending on the topology the network adopts. Zigbee specifies routing methods, which can be difficult to implement due to the limited memory capacity of the devices or to saturation of the network.
In relation to device communication, a multi-hop network communication mechanism based on beacons is proposed by You-min et al. [5]. However, beacon transmission is not guaranteed if the medium is busy, which leads to asynchronous data traffic and makes it difficult to characterize data latency through the network. Multi-hop communication based on time-offset beacons is suggested by the IEEE 802.15 WPAN Task Group 4b [6]. Koubâa et al. [7] adopt this communication method and establish a temporal network schedule depending on the requirements of each device. Nevertheless, that work does not go deeply into the methods adopted for device association or into how each device obtains its beacon offset, both of which are necessary for network communication. The offset-beacon method permits relatively constant bit rates and bounded latency by means of a good time schedule.
Cheng et al. [8] and Kim et al. [9] focus on analytical 802.15.4 network solutions, comparing the throughput obtained for different network configurations and trying to find the best solution for bandwidth and latency requirements. These analyses make it possible to find the most efficient network configuration as well as the largest consumption reduction. However, the analysis of these networks starts from a star configuration in which the coordinator communicates with the rest of the devices by means of the superframe structure. According to [3], mesh networks can also be established, and their global efficiency can differ from that of pure star networks.
TC is an important topic in WHNs because it can affect the latency of data transmission and the consumption of the network devices. Ma et al. [10] establish routing schedules based on received power, on the number of hops between nodes, or on a combination of the two. However, node discovery adds complexity to the network: it carries no useful data packets but requires heavy control traffic that tends to saturate the network, because it has to be permanently running.
In this paper, the formation of a mesh or tree network is described by means of device associations, based on time-offset beacon displacements and using address masks as the end-to-end communication method.

3. Network model

The designed and implemented WHN is composed of devices with three differentiated application roles. The first kind of device is the central-control device (CCD), which collects all the network information. It can also act as a gateway to other protocols and services. The second one is the actuator device (AD), while the third one is the sensor device (SD). All of these devices are battery supplied, with the exception of the CCD. Since it can act as a gateway to other protocols, with higher consumption requirements, it has to be line powered. Regarding ADs and SDs, the following distinction can be established. ADs must be supplied by high-capacity batteries since they are normally connected to motor drivers, solenoid valves, etc., and the consumption of these subsystems determines the whole consumption of the ADs. On the other hand, in SDs the battery lifetime is conditioned by the consumption of the communication subsystem, since it is the most significant component. For this reason, it is more important to minimize communication consumption in SDs than in ADs.
The above-mentioned consumption requirements translate into the following assignment of IEEE 802.15.4 device types to application roles. As SDs have the hardest consumption requirements in the network, they act as IEEE 802.15.4 reduced-function devices (RFD), that is, end devices. On the other hand, ADs may act as full-function devices (FFD), because their consumption requirements are less restrictive thanks to the use of high-capacity batteries. There is only one CCD in the network, so the role of PAN coordinator can be assigned to it.
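The role-to-device-type mapping just described is small enough to capture directly in code. The sketch below (plain C, our own illustration rather than the authors' firmware; the enum and function names are invented for this example) encodes the rule that SDs run as RFD end devices, ADs as FFDs, and the single line-powered CCD as the FFD acting as PAN coordinator.

```c
#include <stdio.h>

/* Application roles of the WHN (section 3) and IEEE 802.15.4 device types.
 * Names are illustrative; they do not come from the paper's implementation. */
typedef enum { ROLE_CCD, ROLE_AD, ROLE_SD } app_role_t;
typedef enum { DEV_RFD, DEV_FFD, DEV_FFD_PAN_COORDINATOR } dev_type_t;

static dev_type_t device_type_for_role(app_role_t role)
{
    switch (role) {
    case ROLE_SD:  return DEV_RFD;                 /* battery-critical end device */
    case ROLE_AD:  return DEV_FFD;                 /* high-capacity battery       */
    case ROLE_CCD: return DEV_FFD_PAN_COORDINATOR; /* line powered, unique        */
    }
    return DEV_RFD;                                /* not reached                 */
}

int main(void)
{
    printf("SD -> %d, AD -> %d, CCD -> %d\n",
           device_type_for_role(ROLE_SD),
           device_type_for_role(ROLE_AD),
           device_type_for_role(ROLE_CCD));
    return 0;
}
```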
Low computational complexity of the protocol is one of the objectives of the proposed WHN. Thus, a dynamic tree topology, shown in Figure 1, has been chosen. Each node can only communicate with its parent and child nodes, not with nodes at the same level. End devices have a better communication-consumption behaviour because their transceiver is active for less time in transmission and reception than that of intermediary nodes. Communication is possible thanks to beacon synchronization and to both the direct and the indirect data-transfer models described in the standard. A beacon-enabled network has been chosen: it enables the CCD to collect the information from the SDs at limited-duration intervals, to update the AD states, and to achieve a low-activity duty cycle.

Figure 1. Example of tree network topology

4. TC algorithm

In order to obtain an easy installation procedure for the network, a dynamic TC algorithm that does not require human intervention has been designed and implemented. Devices do not have to be connected in a fixed order; instead, a self-organized method is proposed. The network also evolves autonomously against changes that can affect its topology, such as node failures, node movements, etc. This is useful, for instance, in the operation phase, where the maintenance time of the WHN has to be minimized. Next, the network formation and how the network reacts to changes are described.

Figure 3. AD flow chart

4.1. Network establishment

Figure 2 shows the flow chart of the code executed by a CCD when it is powered on. It has to choose a channel before starting to transmit beacons. The decision criterion is to select the channel nearest to channel 11 that is not occupied by another PAN and has the lowest detected energy.

Figure 2. Flow chart of a CCD

Since the service area of the network is normally wider than the coverage area of the CCD, two WHNs could operate on the same channel. A distributed algorithm has been designed in order to avoid choosing a channel occupied by another PAN. When either ADs or SDs detect more than one PAN on the same channel at association time, they report this circumstance to the CCD. The CCD that receives this conflict notification marks the channel as occupied, applies the decision criterion to its free-channel list once more and chooses a new free operation channel, and every device already associated to the network goes through the resynchronization stage described in section 4.2.
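As an illustration of the channel-establishment logic above, the sketch below (C99, a minimal sketch under our own assumptions rather than the CCD firmware) scans the sixteen 2.4 GHz channels, discards those occupied by another PAN, and picks the lowest-energy free channel closest to channel 11; how exactly the two criteria are combined is our interpretation of the stated decision criterion. The same selection is simply re-run after a conflict notification marks the current channel as occupied.

```c
#include <stdint.h>
#include <stdbool.h>

#define FIRST_CHANNEL 11      /* IEEE 802.15.4 2.4 GHz channels are 11..26 */
#define NUM_CHANNELS  16

/* One entry per channel, filled from an energy-detection and passive scan. */
typedef struct {
    bool    occupied;   /* another PAN was heard on this channel        */
    uint8_t energy;     /* energy-detection level reported by the radio */
} chan_info_t;

/* Pick the free channel with the lowest detected energy; among equals, the
 * lowest-numbered channel (the one nearest to channel 11) wins.
 * Returns 0 if every channel is occupied. */
uint8_t select_operation_channel(const chan_info_t scan[NUM_CHANNELS])
{
    int min_energy = -1;
    for (int i = 0; i < NUM_CHANNELS; i++)
        if (!scan[i].occupied &&
            (min_energy < 0 || scan[i].energy < (uint8_t)min_energy))
            min_energy = scan[i].energy;
    if (min_energy < 0)
        return 0;
    for (int i = 0; i < NUM_CHANNELS; i++)
        if (!scan[i].occupied && scan[i].energy == (uint8_t)min_energy)
            return (uint8_t)(FIRST_CHANNEL + i);
    return 0;
}

/* On a channel-conflict notification the CCD marks its current channel as
 * occupied and re-applies the same decision criterion to the remaining ones. */
uint8_t handle_channel_conflict(chan_info_t scan[NUM_CHANNELS],
                                uint8_t current_channel)
{
    scan[current_channel - FIRST_CHANNEL].occupied = true;
    return select_operation_channel(scan);
}
```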
The TC algorithm permits associating new devices only during a period of time called the time-limited window of association (TLWA). The TLWA is initiated when a switch is pressed on the CCD. After that, network construction is permitted, that is, the association-permit bit of the beacon is activated at the MAC level. This human intervention has been imposed in order to avoid undesired associations between nodes of different networks that are being installed at the same time. Because of this, the TLWA also has to be minimized.
The flow chart of the code followed by ADs is shown in Figure 3. A device that is not associated to the network remains in a scan-sleep loop with a 50 per cent duty cycle until beacons are detected. In this phase, only beacons with the association-permit bit active are considered. If the scan result list contains more than one coordinator that allows association, the CCD or the coordinator nearest to it is chosen. That is, if the CCD is in the list it is always selected; if it is not, the AD with the lowest level is. For ADs within the same level, the selection criterion of highest link quality indicator (LQI) is followed. Next, the AD starts the network-insertion process (Figure 4).
The first step followed by a device that wants to be inserted into the network is the MAC-layer association with the chosen coordinator. A short MAC address is assigned to it, so that it can send a network-insertion message to the CCD. When the CCD confirms the insertion, the AD can request a beacon time offset (BTO). Using the BTO, the AD calculates the relative time at which it has to transmit its own beacon. Then, it falls into the main loop of the application.
The AD copies the association-permit bit from its parent's beacon. Thereby, the CCD is the only device that defines this bit, and the network growth is delimited by the TLWA.
The process of association and network insertion of an SD is similar to the AD process. The only difference is that the SD does not require a BTO, because it has the role of an end device.

Figure 4. Network-insertion process

4.2. Re-establishment of the network topology

The network topology can suffer changes along its life cycle: the devices' batteries can be exhausted across the network in a non-uniform way, devices can suffer damage or failures, or they can simply be transported from one place to another during the installation phase. In order to provide robustness against this kind of changes, an algorithm for devices in the orphan state has been designed.
As described in the standard, when a device misses more than aMaxBeaconLost beacons, it enters the orphan state. At this moment, a procedure to leave the orphan state is initiated. This procedure is divided into three different phases: resynchronization, current-channel search, and all-band search.
First, a MAC-layer resynchronization with the coordinator node is attempted in the way indicated in the standard. This is a phase with a high power consumption, because the receiver has to be enabled all the time. In order to reduce this power consumption, the number of retries is limited. Once the maximum number of retries is reached, the device enters the second phase.
In the current-channel search, the device stays in a scan-sleep loop with a 1 per cent duty cycle. During the activity period, the device performs a passive scan on the network operation channel. If any scan result is obtained and it belongs to the network, the node breaks the loop. When the old coordinator is present in the scan result list, resynchronization is attempted. Otherwise, an association process with a new coordinator and the reinsertion into the network are undertaken. The process is similar to the one carried out when the device is first inserted into the network, except that association is probably not allowed at that moment. Because of this, the process is preceded by automatically sending a TLWA-start request message to the CCD, with no human intervention.
The third phase starts when the number of scan retries on the same channel expires. Unlike the process followed in the second phase, the 16 channels are now scanned every time the device wakes up.
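The coordinator-selection rule used at association time, and reused when an orphaned device re-inserts itself, can be sketched as follows (C99; the struct layout and field names are illustrative assumptions, since the paper states only the rule, not its data structures).

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* One passive-scan result; the field names are illustrative, not the paper's. */
typedef struct {
    uint16_t short_addr;    /* coordinator short address (the CCD is 0x0000)   */
    uint8_t  level;         /* tree depth, implicit in the address (section 5) */
    uint8_t  lqi;           /* link quality indicator of the received beacon   */
    bool     assoc_permit;  /* association-permit bit copied from the beacon   */
} scan_entry_t;

/* Only coordinators that allow association are considered; the CCD is always
 * preferred; otherwise the lowest level wins, and ties are broken by the
 * highest LQI. Returns NULL if no candidate qualifies. */
const scan_entry_t *choose_coordinator(const scan_entry_t *list, size_t n)
{
    const scan_entry_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        const scan_entry_t *c = &list[i];
        if (!c->assoc_permit)
            continue;
        if (c->short_addr == 0x0000)            /* the CCD is always selected */
            return c;
        if (best == NULL ||
            c->level < best->level ||
            (c->level == best->level && c->lqi > best->lqi))
            best = c;
    }
    return best;
}
```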
Figure 5. Example of IP address notation

5. Address management and routing

A distributed and hierarchical address scheme has been designed. It is based on the use of prefixes and level masks. For a more intuitive understanding, an IP-like notation can be introduced, where groups of bits are separated by dots (Figure 5). Each group of bits is associated with a hierarchical level of the topological tree: the first one is related to level 1, the second one to level 2, and so on. The prefix of an address is defined as the 16-bit value in which the first n-1 groups are equal to those of the address and the remaining groups are set to zero, n being the level of the node.
Figure 6 shows a tree and the address assignment performed for its eleven nodes, distributed in five levels. The CCD is the only node with a static, prefixed short address (0.0.0.0). The others have a short address assigned by their parent. Since SDs are always end devices, the ADs and the CCD are the only devices that manage addresses in the network.

Figure 6. Example of address assignment

The CCD and the ADs have a pool of addresses, which are implicitly assigned during the MAC-layer association. The address of a node is generated using a prefix equal to the address of its parent node. The assigned address only differs in the n-th group of bits, the other groups being zero. For instance, the level-2 node with address 1.1.0.0 has received the first address of the pool of its parent node, which has address 1.0.0.0. All the children of this node have the same prefix, 1.0.0.0, and different second groups of bits: 1.1.0.0 and 1.2.0.0. As can be noticed, since these nodes are of level 2, the last two groups of bits are zero. Thus, network information is implicit in the addresses: the level of a node is the number of most-significant groups of bits different from zero. Since neither previous information transfers nor level-information publication are needed, this is an efficient way to choose the lowest-level coordinator.
With this addressing scheme, a mask-oriented routing algorithm that does not use routing tables can be implemented, which involves more relaxed computational requirements. When a node receives a MAC-layer packet, the intended destination of the network message that it carries can be the node itself, a descendant of the node, or any other node. In the first case, the packet has to be processed by the node. In the second case, it has to be relayed to the next-hop node using an indirect data transaction. In the last case, the packet has to be forwarded to the parent node through a direct data transfer. So, the routing problem can be formulated as the resolution of three variables: routing decision (RD), next-hop address (NH), and transfer mode (TM). RD is a Boolean variable which takes the value true when the incoming packet has to be forwarded; if the packet has to be processed, it takes the value false.

Figure 7. Routing algorithm

Figure 8. Packet routing across the network

The n-level mask (LM) concept can be defined as the 16-bit value in which the first n groups of bits are ones and the rest are zeros. Figure 7 shows a pseudo-code in which the values of the variables RD, NH and TM are obtained in an efficient way. ADN is the address of the node that makes the routing decision, and ADS and ADD are the source and destination addresses, respectively. Figure 8 describes the routing of a packet sent by the node 1.1.1.0 to the node 2.1.0.0.
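To make the addressing and routing rules concrete, here is a compact C sketch of the level masks and of the RD/NH/TM decision. It is our reconstruction, not the pseudo-code of Figure 7; in particular, the group widths (6 + 4 + 3 + 3 bits) are only inferred from the hexadecimal examples given in section 7 (0x0400 = 1.0.0.0, 0x0840 = 2.1.0.0, 0x0488 = 1.2.1.0) and may differ from the actual layout of Figure 5.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* A 16-bit short address split into four hierarchical groups. The group
 * widths below are inferred from the paper's hexadecimal examples and are
 * therefore an assumption, not the documented layout. */
#define MAX_LEVEL 4
static const uint8_t GROUP_WIDTH[MAX_LEVEL] = { 6, 4, 3, 3 };

static uint16_t group_mask(int level)          /* mask covering group <level>  */
{
    int shift = 16;
    for (int i = 0; i < level; i++)
        shift -= GROUP_WIDTH[i];
    return (uint16_t)(((1u << GROUP_WIDTH[level - 1]) - 1u) << shift);
}

static uint16_t level_mask(int n)              /* LM(n): first n groups set    */
{
    uint16_t m = 0;
    for (int i = 1; i <= n; i++)
        m |= group_mask(i);
    return m;
}

static int node_level(uint16_t addr)           /* deepest non-zero group       */
{
    int level = 0;
    for (int i = 1; i <= MAX_LEVEL; i++)
        if (addr & group_mask(i))
            level = i;
    return level;
}

typedef enum { TM_DIRECT, TM_INDIRECT } transfer_mode_t;

typedef struct {
    bool            rd;  /* true: forward, false: process locally (RD)         */
    uint16_t        nh;  /* next-hop short address (NH)                        */
    transfer_mode_t tm;  /* direct (towards parent) / indirect (to child) (TM) */
} routing_t;

/* Mask-oriented routing decision for a node with address adn that has just
 * received a packet whose destination is add, following the three cases in
 * the text. End devices (level 4) never hit the descendant branch. */
static routing_t route(uint16_t adn, uint16_t add)
{
    routing_t r = { .rd = true };
    int n = node_level(adn);

    if (add == adn) {                                  /* 1: for this node     */
        r.rd = false;
        r.nh = adn;
        r.tm = TM_DIRECT;
    } else if ((add & level_mask(n)) == adn) {         /* 2: for a descendant  */
        r.nh = add & level_mask(n + 1);                /* child on the path    */
        r.tm = TM_INDIRECT;
    } else {                                           /* 3: anywhere else     */
        r.nh = adn & level_mask(n - 1);                /* parent node          */
        r.tm = TM_DIRECT;
    }
    return r;
}

int main(void)                                         /* Figure 8 example     */
{
    routing_t r = route(0x0488 /* 1.2.1.0 */, 0x0840 /* 2.1.0.0 */);
    printf("forward=%d next-hop=0x%04X mode=%s\n",
           r.rd, (unsigned)r.nh, r.tm == TM_DIRECT ? "direct" : "indirect");
    return 0;
}
```

Running the example reproduces the first hop of Figure 8: node 1.2.1.0 forwards the packet to its parent 1.2.0.0 with a direct transfer, and the packet later descends from the CCD to 2.1.0.0 through indirect transfers.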
6. Time division offsets distribution

Among the solutions proposed by the IEEE 802.15 Task Group 4b for beacon scheduling across a multi-hop network, the approach based on time division has been selected. The main reason is that it is the only way to avoid time overlapping at any point of the network.
The network has been designed to achieve a balance between energy saving and end-to-end delay. Because of this, a configuration with Beacon Order (BO) = 8 and Superframe Order (SO) = 1 has been chosen. These values imply that beacons are transmitted every Beacon Interval (BI) = 3932.16 ms, and that the activity period of the superframe is 15.36 ms. If a guard time equal to the superframe duration is considered, up to 63 beacons can be scheduled between two consecutive beacons of the PAN coordinator (Figure 9). As the base beacon offset (BBO) is equal to twice the superframe duration, the BTOs are also multiples of the superframe duration, which involves low computational complexity.

Figure 9. Beacon scheduling by time division

The values given to the BBO, BO and SO attributes enable up to 63 ADs to act simultaneously as coordinators without time overlapping of their activity periods. The beacon scheduling is centrally managed by the CCD. The BTOs of the beacons, as illustrated in Figure 9, are numbered with an index (BTOI) from 0 to 63. The relative BTOI (BTOI_REL) between a node with BTOI k and a node with BTOI j can be defined as j minus k. In this manner, the relative BTO (BTO_REL) can be expressed as BTOI_REL multiplied by the BBO. A child node can have a BTOI lower than the BTOI of its parent, that is, the beacons transmitted by a node can have a negative shift with respect to the beacon of its parent. Taking this into consideration, BTOI_REL can take negative values between -63 and 63.
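The slot arithmetic can be checked with a few lines of C. The constants below follow the standard's formulas (aBaseSuperframeDuration = 960 symbols, 16 µs per symbol), which for BO = 8 give the 3932.16 ms beacon interval quoted above and, with a BBO of twice the superframe duration, 64 beacon slots per interval (the PAN coordinator plus up to 63 ADs). This is an illustrative sketch, not the authors' scheduler.

```c
#include <stdint.h>
#include <stdio.h>

/* IEEE 802.15.4 timing constants for the 2.4 GHz PHY. */
#define SYMBOL_US                 16u
#define A_BASE_SUPERFRAME_SYMBOLS 960u
#define BO                        8
#define SO                        1

/* Beacon interval and superframe duration in microseconds, per the standard's
 * formulas BI = aBaseSuperframeDuration * 2^BO and SD = ... * 2^SO. */
#define BI_US  ((uint32_t)A_BASE_SUPERFRAME_SYMBOLS * SYMBOL_US * (1u << BO))
#define SD_US  ((uint32_t)A_BASE_SUPERFRAME_SYMBOLS * SYMBOL_US * (1u << SO))
#define BBO_US (2u * SD_US)        /* base beacon offset = superframe + guard */

/* Relative beacon time offset between a node with index j and its parent with
 * index k, as defined in the text: BTOI_REL = j - k, BTO_REL = BTOI_REL * BBO.
 * The result is signed because a child may transmit before its parent. */
static int32_t bto_rel_us(int btoi_assigned_j, int btoi_coord_k)
{
    return (int32_t)(btoi_assigned_j - btoi_coord_k) * (int32_t)BBO_US;
}

int main(void)
{
    printf("BI = %lu us, slots per BI = %lu\n",
           (unsigned long)BI_US, (unsigned long)(BI_US / BBO_US));
    /* e.g. a node assigned BTOI 5 whose parent uses BTOI 2 transmits its own
     * beacon 3 * BBO after hearing the parent's beacon (values illustrative) */
    printf("BTO_REL = %ld us\n", (long)bto_rel_us(5, 2));
    return 0;
}
```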
When an AD is inserted into the network, as described in 4.1, it sends a BTO request message. Once this message is received, the CCD checks whether there is a free BTOI and, in that case, allocates it to the new AD. Then, a response message with the assigned BTOI is generated. This message has basically two fields: the CCD fills the first one with its own BTOI (BTOI_COORD), and the second one is filled with the assigned BTOI (BTOI_ASSIG). With the aim of updating the BTOI_COORD field at each hop, the message has to be processed by all the intermediate nodes. When the requesting node receives the response message, it obtains BTOI_ASSIG and BTOI_COORD, and can thereby find its BTO_REL with respect to its coordinator. In this way, it can schedule its beacon transmission using the beacon reception from its parent as a reference. Figure 10 shows an example of BTOI assignment.

Figure 10. Example of BTO assignment

7. Results

In order to verify the correct implementation of the designed algorithms and methods, some illustrative scenarios have been recreated and the radio traffic has been captured. The implementation has been carried out on a proprietary platform based on the CC2420 transceiver and the MSP430 microcontroller from Texas Instruments. In order to visualize and save the IEEE 802.15.4 frames, both the Packet Sniffer software tool and the CC2420DK have been used.

7.1. Network-topology construction and address assignment

Figure 11 shows how an AD tries to associate with a CCD that transmits beacons with source address 0x0000. The AD starts the MAC-layer association process. In this process, the CCD assigns the first short address of its pool to the AD, that is, 0x0400 (1.0.0.0). Next, the network-insertion message is not sent; instead, a channel-conflict notification message is. This is because the AD has detected more than one PAN transmitting beacons on the same channel. When the CCD processes this command, it stops the beacon transmission and starts the procedure to select a new operation channel described in 4.1. This case can be produced by a geographical distribution of nodes like the one represented in Figure 12.

Figure 11. Channel conflict notification

Figure 12. Channel selection conflict
The end of the network-insertion process, on a new channel, can be observed in Figure 13. It has to be noted that the AD does not have to initiate the MAC association process with the CCD again. It only has to resynchronize with the CCD using the orphan algorithm described in 4.2 and, finally, send the network-insertion message.

Figure 13. Association in the new channel

7.2. Packet routing

The routing example included in section 5 (Figure 7) has been reproduced, and the results are shown in Figure 14. The addresses of node A (1.2.1.0) and node B (2.1.0.0) in hexadecimal format are 0x0488 and 0x0840, respectively. It can be observed how the data packet is transmitted from node A to the CCD using direct transfers, and from the CCD to node B through indirect transfers.

Figure 14. Packet routing

7.3. BTO assignment

In this case, the insertion of a new AD has been performed in a network composed of a CCD and two ADs. It corresponds to the example shown in Figure 10. The packet interchange and the offsets between beacons can be observed in Figure 15.

Figure 15. Beacon request

8. Conclusions

In this paper, the algorithms and methods used in the development of a real and commercial WHN, and which are out of the scope of IEEE 802.15.4, have been presented. In addition, some packet captures have been included in order to verify the correct operation of the network.
A TC algorithm has been designed and implemented. It provides the means to reach a star or tree topology, and a network with the lowest number of hops is always obtained, reducing both the end-to-end delay and the consumption associated with data transmission. The resulting network is self-organized and dynamic: it does not require human operation during installation or when changes occur, such as drops or movements of nodes, thus minimizing the maintenance effort.
The addressing scheme and the routing algorithm are other key issues that have been solved. Regarding address management, a distributed technique based on prefixes has been proposed. Some intrinsic information about the node is carried in its short address, such as its depth in the tree, its address pool and the address of its parent. In that way, a scanning node can know which of the scanned coordinators is the closest to the CCD. In addition, the assignment of the short address during the MAC-layer association implies the assignment of the address pool of the node, and no extra data interchange is needed. Each node gives addresses to its children, whose prefixes are equal to its own address. As a result, a hierarchical tree of addresses is obtained and, therefore, a routing algorithm without tables, which routes the packets across the network using level masks, can be used. Because of these advantages, the proposed addressing scheme is efficient in terms of energy and computational complexity.
With the aim of synchronizing the nodes across the tree, the time-division approach given by the IEEE 802.15 Task Group 4b has been followed. The time offset of the transmitted beacons is centrally managed by the PAN coordinator. Up to 63 nodes can send beacons without the possibility of time overlapping and with a low duty cycle. In this way, a wide area is covered and a good balance between consumption and information delay is obtained.

9. References

[1] IEEE 802.15 WPAN Task Group 4 (TG4), http://www.ieee802.org/15/pub/TG4.html
[2] Zigbee Alliance, http://www.zigbee.org
[3] IEEE 802.15.4, "Part 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (LR-WPANs)", IEEE Standard for Information Technology, Revision 2006.
[4] Zigbee Alliance, "ZigBee Specification". [Online]. Available: http://www.zigbee.org
[5] Z. You-min, S. Mao-heng, and R. Peng, "An Enhanced Scheme for the IEEE 802.15.4 Multi-hop Network", IEEE, 2006.
[6] IEEE 802.15 WPAN Task Group 4b (TG4b), http://grouper.ieee.org/groups/802/15/pub/TG4b.html
[7] A. Koubâa, A. Cunha, and M. Alves, "A Time Division Beacon Scheduling Mechanism for IEEE 802.15.4/Zigbee Cluster-Tree Wireless Sensor Networks", 19th Euromicro Conference on Real-Time Systems (ECRTS'07).
[8] L. Cheng and A. G. Bourgeois, "Energy Efficiency of Different Data Transmission Methods in IEEE 802.15.4: Study and Improvement", IEEE, 2007.
[9] H. S. Kim, J.-H. Song, and S. Lee, "Energy-Efficient Traffic Scheduling in IEEE 802.15.4 for Home Automation Networks", IEEE, 2007.
[10] J. Ma, M. Gao, Q. Zhang, and L. M. Ni, "Energy-Efficient Localized Topology Control Algorithms in IEEE 802.15.4-Based Sensor Networks", IEEE, 9 Jan. 2007.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Service-Oriented Design Methodology for Wireless Sensor Networks: A View through Case Studies
Elena Meshkova, Janne Riihijärvi, Frank Oldewurtel, Christine Jardak and Petri Mähönen
Department of Wireless Networks, RWTH Aachen University
Kackertstrasse 9, D-52072 Aachen, Germany
Email: {eme, jar, fol, cja, pma}@mobnets.rwth-aachen.de

Abstract— In this paper we discuss a design methodology based on the service-oriented architecture and agile development principles for wireless embedded and sensor networks (WSNs). This methodology suits particularly well for streamlining and partially automating the design and implementation of complex WSNs. We report results from selected case studies to test the applicability of service-oriented architectures for embedded software. We evaluate the proposed design methodology by studying cases that include the development of three different services for wireless sensor networks that can work together as a part of a complete solution. We specifically comment on the trade-offs that a developer might face while designing and implementing systems. We follow the “best practices” of the software design methodology and adapt them to the development of both the sensor network services and the sensor networks themselves. The design and implementation cycle includes three stages: the overall solution and architecture design, the protocol and application design, and finally, the implementation. These stages are iterated throughout the lifetime of the project. During the design we consider both the abstract definition of user requirements and targeted functionality, and their mapping to the real hardware and software.

I. INTRODUCTION

Wireless sensor networks (WSNs) lie at the crossroads of software, network and embedded engineering. This field requires developers knowledgeable in all of these areas, so that the maximal benefit can be gained from the well-established domains. Otherwise WSNs will inevitably suffer from the danger of re-inventing things that are already well known and widely used in other areas. However, most of the “best practices” cannot be directly re-used in WSNs and have to be adapted to the specifics of this domain. In this paper we suggest a design methodology for wireless sensor networks that is based on the concepts of service-oriented architecture (SOA), agile development methodology and “best practices” of network development in the WSN, MANET and peer-to-peer areas. The use of this methodology is illustrated on several use cases. Additionally, we suggest a framework for partial automation of WSN design that is based on the suggested methodology and uses, among others, Semantic Web instruments.
Originally, the term service-oriented architecture refers to a logical set that consists of several large software components that together perform a certain task or service [1], [2]. SOA is a particularly popular paradigm in the community of web software developers; for example, Web Services [1] utilize this architecture. WSNs can be viewed in part as a reduced copy of the Internet, where different nodes or groups of nodes provide different services to the end user. Ideally, nodes in WSNs self-organize to provide the required functionality. Web services aim to achieve exactly the same goal on the Internet scale. Therefore technologies and concepts popular for the Internet can be tried out in the WSN domain. SOA is one of them. However, this architectural principle cannot be used directly: it has to be adapted to additionally include not only large complex services, but also simple ones, like data storage, routing or sensor readings. To a certain extent the terms component-based and service-based architecture can be used interchangeably in WSNs. One can distinguish between them by defining that the first term refers more to the actual implementation while the second term is used for logical constructions.
WSNs have a large range of applications, starting from personal area (PAN) and home environment networks and extending to habitat monitoring (see Table I). These highly diverse scenarios impose different requirements on WSNs and lead to distinct design and implementation decisions. Wireless embedded devices further possess limited energy, memory and processing resources¹. The major limitation for WSNs is power. In many scenarios sensor nodes are deployed without a constant power supply and have to operate on one set of batteries for a considerable amount of time. Therefore, all the factors that contribute to power usage have to be minimized. Among them are communication over the network, reading/writing to/from the EEPROM, usage of power-hungry peripherals and excessive computations [3]. This leads to a unique protocol stack and application design for each particular WSN usage scenario.

¹ For example, Berkeley TelosB sensor nodes, or motes, on which we implement our designs have only 48 KB of ROM, 10 KB of RAM, a 16-bit microcontroller and, by default, are powered by two 1.5 Volt batteries.

Usually WSNs also have to be optimized against a number of other parameters besides the network lifetime. Among them are the maximum allowable data propagation time, the fault-resilience requirements, the overall network cost, the complexity and flexibility of the system, code maintainability, re-usability and comprehensiveness. Depending on the application scenario, different parameters of the WSN have to be considered (see Table II). Improvement of one parameter often leads to downgrading of others.
The balancing of the major WSN parameters is not a trivial task and leads to an iterative design process. This is one of the peculiarities of WSN development, along with the fact that the development teams are usually small and highly qualified. The requirements of the project can also change from the initial ones during the project lifetime. This makes the use of agile design technologies [4] appropriate for WSN development. This class of methodologies is primarily characterized by a short and iterative development cycle, a highly motivated and qualified team, and constant face-to-face interaction with the customer, as well as between the team members. The use of this methodology allows easy adaptation of the project to changing circumstances and user requirements.

TABLE I. Some of the application scenarios for WSNs. Scenarios compared: PAN, home network, structure monitoring, disaster monitoring and habitat monitoring; comparison criteria: scale, external networks, mobility, fault-tolerance, QoS and required lifetime.

TABLE II. Common WSN parameters. Main: cost, lifetime, max. extra delay, QoS, fault-resilience, services. Network: mobility, fault rate, size, diameter, topology, node distribution, area, network graph. Service: datarate, functionality, QoS. Hardware: chip, platform, memory, radio, interface, power, sensors. Software: service, OS/VM, RAM + ROM, hardware.

We split our further discussion into five sections. In Section II we briefly describe the major characteristics of WSN design and discuss the applicability of SOA and agile techniques to these systems. In Section III we present the basic parameters that can be used to characterize WSNs. The overall design process, its challenges and iterations are given in Section IV. In Section V we provide three case studies in the areas of service discovery, in-network processing and fault-resilience to ground our theoretical discussion. The possibilities for the partial automation of the WSN design, the corresponding challenges and gains are discussed in Section VI. Finally, the conclusions are drawn in Section VII.

II. CHARACTERISTICS OF WSN DESIGN. SOA AND AGILE METHODOLOGY APPLICABILITY.

Judging from our experience, from discussions with colleagues and from the research literature, we have produced a list of characteristics typical for WSN design, some of which come from the software development domain:
• Minimalism is essential, as a simple and well-defined system usually works the best.
• The requirements of a WSN project can change during the development of the project. The refinement of the requirements can be initiated by both users and developers.
• Precise solutions are preferable to generalized solutions in WSNs, as they allow to optimize the performance of the system, which is important in a resource-constrained environment.
• The development of the WSN is an iterative process.
• Rapid prototyping is essential for wireless embedded networks.
• Re-use of already developed components is desirable, which in turn requires good documentation, careful planning and identification of common parts in the project.
• Before the development of a new system, the deficiencies of older similar ones are to be identified, as well as the reusable parts.
• The design plans for the WSN include the hardware, network topology, operating system, programming and security issues.
• WSN solutions are widely developed using prototypes. The results of the prototyping, as well as of the following deployment and testing, should be exhaustively evaluated and recorded to enable experience sharing and component re-use.
• WSN systems are typically developed by a small number (up to 10-15) of highly qualified developers and researchers. The team members typically work in close collaboration and can contribute to each other's work. Therefore the project development steps have to be logical and obvious to each member of the development team in order to increase the efficiency of the work and inspire teamwork.
Most of the features of this list map well to the agile software design methodology [4], and therefore it can be chosen as one of the bases on which our methodology can reside. Service-oriented architecture also suits particularly well for WSNs, as the development of the whole network can directly be mapped to services, simple or complex. For example, the network itself provides several composite high-level services such as area monitoring or manipulation of actuators. Each node also offers complex services like data forwarding or sensor readings. Each node can also be represented as a collection of services that interact with each other. The model-driven architecture (MDA) [5] also seems to be a promising candidate to be used for WSN systems, as there the process of system development is very well defined and recommended design tools are specified. However, this architecture is more structured and heavy than the SOA and agile methodology combination, and we feel that it needs to be adapted to be used for the typically lightweight and rapidly changing WSN projects. We hope to reuse or adapt some of the core MDA features, like meta-data component specification, in our future research on design in WSNs.
We propose an abstract service model for an individual node (Figure 1) that incorporates the following modules: Application, Service Discovery, Transport, Routing, MAC, Physical,
Application fault-resilience and services. The overall cost of the network

In-Network Processing
Service discovery
and data gathering consists of hardware, development and deployment costs. The

Middleware
Transport deployment costs grow larger when the individual placement
Routing of specific nodes is required. It is much cheaper to deploy
MAC a homogeneous network that self-organizes depending on the
Physical surrounding conditions. However, this will in turn increase the
development costs. The decision between the static and the
dynamic network configuration is also an important part of an
Fig. 1. Basic WSN node structure architectural design. From one side a careful study of the target
scenario, extensive design efforts and test implementations
allow the creation of a static WSN network that requires
In-network processing and Middleware service classes. Each minimal interference after the deployment. The development
of the above modules can either contain one simple service like of a dynamic network is a more difficult task, but ideally it
a routing protocol or be composed of several complex ones. has to be performed only once. After that the same design
For example, the in-network processing block can include data can be employed for different scenarios and later the network
aggregation and data recovery modules. The service discovery can self-adjust depending on the surrounding. The examples
module might not be required on the mote, if its functionality of dynamic network configuration mechanisms that enable
is already incorporated into the middleware component. The network self-organization are virtual machines, middleware
service-oriented architecture not only suits well for wireless- frameworks and code updates from EEPROM [6].
sensor networks, it also maps well to the agile design method- The network lifetime is primary determined by power con-
ology too, as it allows to realize the iterative principle of sumption of the individual nodes which are mostly battery
the agile design with regular deliveries of small portions of powered and therefore have a limited lifetime. The network
working code, namely individual services. lifetime can be increased, for example, by putting a part of
computational burden to the gateway or user devices. Care
III. WSN PARAMETERS AND T RADE - OFFS
has to be taken to ensure that the gain from the processing of
Wireless sensor networks are relatively small systems that information on a gateway is higher than the communicational
have a limited amount of expected functionality, namely gath- costs of transporting the data to this device. Typically this
ering of the senor reading and enabling actuators in response approach is applicable for small diameter networks.
to sensor readings. WSNs can also be used as a transition Maximum additional delay and fault-resilience are two
network to propagate third-party data. These networks are main quality-of-service (QoS) parameters of wireless sensor
rarely directly connected to the end-user. More often they networks. Maximum additional delay specifies the maximum
are parts of a larger heterogeneous distributed system, which allowable propagation delay between the gateway and the
are the “users” for the WSNs. All of the above leads to the farthest node from it. Fault-resilience of the network, i.e.
possibility to formalize the requirements to these systems, as tolerance of the WSN to node failures, outdated and imprecise
they, though widely diverse, are used for a limited number of information, as well as data losses and injection of malicious
purposes. Additionally, these requirements usually come from data, can be improved by employing special fault-resilience
the professionals who are generally more precise in formulat- mechanisms, such as data back-ups or security add-ons. This
ing their needs than the normal end-users. The formalization approach leads to additional deployment costs and can reduce
of the WSN description does not only allow an easier and the network lifetime. The introduction of back-up nodes in the
therefore faster development of WSN systems, but also enables network leads to the increase in fault-resilience of the network
a partial automation of this process. and at the same boosts up the hardware and deployment costs.
We suggest a basic set of parameters that need to be Finally, the services parameter contains a list of functionalities
specified virtually for any wireless sensor network (Table II). that the user expects from the network.
The parameters can be divided into several groups. Main
parameters characterize overall system performance. These B. Network Parameters
are the most important parameters for the user. Network Network parameters allow to describe the sensor network
parameters describe the wireless network and behavior of the where the application is to be deployed. These parameters in-
nodes in it. Service parameters define the behavior of each clude information on the expected mobility, fault rate, minimal
realized service as well as its inputs and outputs. Hardware required bandwidth, number of nodes in the network, average
and software parameters describe the devices that can be used network diameter, information on the network symmetry and
to create a WSN; they are used during the implementation heterogeneity. Some of these parameters, like expected band-
stage of the project. width are direct user inputs. Some other are generated and
adapted during the project deployment and directly influence
A. Main Parameters the main parameters that have to be optimized. The user is
We have identified five basic parameters which are used for free to fix some of them and leave the rest to the developers.
the overall WSN description. They are cost, lifetime, delay, For example, the user might want to fix the positions of

some nodes and therefore indirectly influence the network Begin

size, diameter and heterogeneity parameters. In general the


Get initial parameters from the
network is modeled and described by the graph. However, if user. (Main, Service and parts
we want to partially automate the WSN generation process first of the Network, HW and SW
parameters) No Redefine
we have to estimate the generalized parameters of the network, individual services on a
node?
like network symmetry, maximum diameter and density. Later,
Check if there exists a network
when we will have a limited pool of options, more detailed service and corresponding
Yes
software and hardware that
graph level representation of the network can be considered satisfies the input parameters. Indentify the problematic layer
and evaluated. However, if the WSN design is done manually Search the existing solutions
Specify the required Service
these processes are typically merged. parameters for the module
Yes Define additional hardware
C. Service Parameters The service is found
parameters
Define additional software
An abstract description of services is a key element of No
parameters

the SOA-based WSN system design, as it allows to create


No Create specific layer service
non-implementation specific design solutions important in Redefine that satisfies the input
a network service? parameters.
the initial project development stages. Additionally, abstract
Create new service and
service descriptions help to formalize the developed WSN, Yes SW module

which is important if we want to create heterogeneous systems Suggest network topology

and automate the wireless sensor network design process. A Define additional hardware Test the service
parameters
service description includes the list of provided functionalities Define additional software
and required services. For example, the data collection service paramters
No Any Yes
other problematic
requires an underlying data delivery mechanism and provides services?
Check if there exists the node
the functionality of data gathering. Each service is also char- service and corresponding
software that satisfies the input
acterized by the expected influence on the node lifetime and parameters.
overhead data rate, i.e. the amount of data per time unit Search the existing solutions
generated extra depending on the given data rate from the
services that utilize this service. For composite distributed No
The service is found
services QoS parameters are also important.
Yes
D. Software and Hardware Parameters
Test the network solution
The software and hardware parameters become important
in this development stage. The hardware parameters include
Deploy the network
hardware platform specification (memory, chip, interfaces,
radio, power resources) and sensor platform descriptions. The
End
software parameters include the required OS/Virtual machine,
needed amount of memory (RAM/ROM), a list of modules
that the considered software module requires to operate and Fig. 2. Generalized WSN design process.
requirements to the hardware, like a radio or a specific sensor
support.
parameters and concepts derived in the previous two steps
IV. D ESIGN P ROCESS into the implementation on the real devices. The flowchart
As we have proposed earlier, the WSN design process corresponding to these three design steps in terms of SOA is
should be agile, however it should have some structure. We given in Figure 2.
have chosen to loosely follow the standard waterfall model After gathering and analysis of the user requirements at
[7]. The waterfall model includes eight stages: gathering first the existing solutions can be checked for the re-use, i.e.
of the requirements, their analysis, the design of the solu- if the existing solutions fit in the user constrains and while
tion, development of the software architecture, development providing the desirable services. If no appropriate solution is
of the code, testing, deployment and post implementation. found a new one has to be created.
For wireless sensor networks most of these stage are also
valid. However, the solution design stage has to be altered A. Design of the Solution
to include the network design parameters and the following In the solution design stage the estimation needs to be
choice the network architecture. The development of the made if the parameter constrains specified by the user can be
software architecture stage in WSNs is changed to the design achieved in principle and what corresponding network archi-
of the individual network protocol stacks and applications tecture should be used. The choice of the network architecture
that provide the services desired by the user. The stage of is based on the expected network topology and the behavior of
the code development deals with the mapping of the abstract the nodes (mobility, fault ratio). Additionally, judging from the

user requirements, the priority between the main parameters of a network topology is desirable or use a dynamic role
should be estimated, as in WSNs we always face the trade- assignment mechanism, for example the one proposed in [9].
off between cost vs. network lifetime vs. QoS vs. functionality The major trade-off at this design stage is simplicity vs. effi-
of services. For example, the increase in the network lifetime ciency vs. flexibility. By simplicity we understand the architec-
can be achieved by attaching a larger battery to the sensor tural simplicity of an application, as well as the simplicity of
node, i.e. increasing its cost or decreasing QoS or functionality the protocol message flow and on-node processing. Efficiency
constraints and therefore allowing inaccurate or out-of-date refers to ability of the specific application or protocol to fulfill
data to be reported. the goal. Flexibility reflects the re-usability of the design for
For small-scale networks the cost and fault-resilience often other projects and also to the ability of the WSN to adapt to
come in the first place. The health-monitoring system is one the changing environment.
example. Here the network life-time is not usually the major The network lifetime can be improved by reducing the
issue and in PANs it is relatively easy to recharge the devices. network traffic, which requires the increase of the network
However, the fault of a single device might lead to heavy con- protocol efficiency. Techniques that help to reduce the network
sequences as the change in the patient’s state can be missed. traffic include, but are not limited to caching, piggybacking and
Cost is an issue in non-critical applications such as smart home in-network processing. The use of each of these techniques
and sport training scenarios. The plain architecture, where all involves different trade-offs. Caching is used to both increase
the nodes have the same basic functionality, is well suited to the fault-resilience of the network and decrease the network
such scenarios. Additionally, small scale networks can benefit traffic. Caching is employed, for example, in routing and
the most from the back-end devices that connect WSNs to service discovery. When applying caching one should maintain
external networks. The small network diameter allows these a careful balance between the amount of messages injected
networks to put computational load on the back-end devices, into the network used to update the cache and the age of
such as gateways, at a low communication cost. the cache, i.e. the probability to have stale information there.
For medium and large networks more complex role specific This balance is application specific as, for example, in fairly
architectures suit the best, as it was already demonstrated by static networks routing information can be updated much more
the Internet community [8]. By role-specific architecture we rarely than in highly mobile networks. Piggybacking allows
mean any network configuration where nodes play different to attach additional data to bypassing network packets, thus
roles. In this category, among others, fall the client-server and potentially decreasing the number of transmitted packets. The
cluster-based architectures. The nodes in a WSN generally trade-off here is between the increase in the packet size and,
perform either the basic role, i.e. they gather sensor data, therefore the decrease in the probability of a packet delivery,
or the processing role. The processing role can include a and the number of messages generated. It is well known that
variety of functionalities like data aggregation, data back the larger the packet transmitted over wireless network is the
up, cluster-head operation and other in-network processing more chances that it will be corrupted. However, it is cheaper
activities. The use of roles for large networks can significantly in terms of power to transmit the data in a large packet rather
boost the network lifetime keeping the overall cost of the than in several small packets [10]. This is strictly true only if
network relatively low. However, introducing a hierarchy into we do not consider possible retransmission.
the network can lead to longer network response times or The messaging trade-off involves two different data propa-
affect the network stability, as the failure of a cluster-head gation techniques: reactive vs. proactive. If the data is prop-
can cause the whole surrounding network to malfunction. A agated reactively through the network it is sent after the user
hierarchical architecture is also not suitable for application request. The proactive data propagation is initiated by the data
scenarios with medium or high mobility due to the increased accumulating device, in form of an advertisement or an event
cost of maintaining the hierarchy. notification. Proactive data transmissions are used, for exam-
ple, in case of emergency notifications. They are also used
B. Protocol Stack and Application Design as supporting messaging in, for example, routing protocols
After the general decisions have been made concerning the needed to obtain the addresses of neighboring nodes. Reactive
solution architecture, protocols and applications residing on data propagation is typically cheaper in terms messaging as
the individual nodes are to be designed. Stack formulation packets are generated only when it is necessary.
is as well as the system design in general, is an iterative Most of the supporting functionality of WSNs can be
process, first we define the top services, than go to the lower defined as in-network processing that was already briefly
level services. The shared, cross layer-services are defined as discussed in the previous section. The classic examples of
they are required by the hierarchical services, or at the end to in-network processing are data aggregation, cluster-head func-
adjust the main parameters. For example, distributed coding tionality, data storage and compression. In-network processing
saves some energy at the expense of additional complexity, i.e. can be activated on some or all of the network nodes. This al-
development cost. Additionally, if we want to further optimize lows to save the individual power and computational resources
the network performance by assigning different roles to nodes, and sometimes leads to better routing. This functionality can
like a data aggregator or a cluster head, we should either therefore increase the network lifetime, the efficiency of the
assign these roles statically for which a detailed knowledge network, and its flexibility. However, the mismanaging of in-

Fig. 3. Mapping services to components (service, main, network, hardware and software parameters are mapped onto concrete software modules and components).

TABLE III. Performance characteristics for Distributed Source Coding.
Component | RAM [bytes] | ROM [bytes] | Energy, R=0.36 [µJ] | Energy, R=0.72 [µJ]
Encoding  | 330         | 34          | 0.82                | 0.53
Decoding  | 562         | 558         | 27.61               | 4.35
Tracking  | 620         | 26          | 237.57              | 237.57

is fundamentally based on the Slepian-Wolf theorem [12]. The


network processing might lead to the faster power-losses by
basic idea behind it is the compression of multiple correlated
the hosting nodes and cause additional messaging.
sensor readings from sensor nodes in vicinity. These sensor
C. Implementation Considerations nodes do not communicate with each other and send their
In the implementation phase of WSN development the compressed readings to a central sensor node which performs
abstract description of services have to be mapped into the real the joint decoding [13], [14]. Distributed Source Coding
code running under a certain operating system on the specific has two operation modes in time-varying environments: the
device (Figure 3). The developer has to find a compromise entropy tracking phase and the compression phase. In the
between complexity, comprehensiveness and efficiency of the tracking phase the conditional entropy of the sensor readings
code2 . The complexity of the code leads to poor code re- is periodically estimated and then used for determining the
usability and maintainability. There is not much that can be code rate. In the compression phase, whereas the source nodes
done about complexity. Certain technologies can contribute to compress their readings at the assigned rates, we save energy
the code comprehensiveness. These are the component-base and capacity due to reduced packet sizes. Based on the results
approach, the use of suitable programming abstractions and obtained from our experiments, the energy consumed for the
the clear naming of software primitives [11]. transmission of 1 bit is about E = 180 nJ and the energy
The size of the memory footprint of the code and the and memory consumed due to our Distributed Source Coding
processing burden contribute heavily to the efficiency of a scheme is listed in Table III. Depending on the underlying
WSN application. One more trade-off worth mentioning in this correlation structure significant net energy savings can be
section is the trade-off between implementation complexity achieved. The low memory consumption of the corresponding
and protocol efficiency. There exist several possibilities to software components makes this approach also applicable for
make the network communication more efficient at the cost resource-poor environments.
of additional on-node processing. Data compression and en- The prolonged network lifetime is achieved at the cost of the
coding as well as cross-layer optimization are such methods. increase in the systems complexity (as the nodes are required
to play specific roles) and by moving the computational
V. C ASE STUDIES complexity and load from source nodes toward gateways.
In this section we provide several case studies based on Very low additional deployment costs are expected since
our own research experience that show the applicability of the the software components can easily be loaded to hardware
proposed design methodology for development of a complete platforms, especially in case of a code propagation mechanism
WSN system, as well as individual services. With these being used. There is no need to later on change the software
examples we aim at highlighting the complex tradeoffs one components in case the set of suitable codes is identified
has to consider in the overall systems design. Only with a during the application design phase. Finally, the system can
flexible design methodology such as outlined above can all run unattended since the entropy tracking algorithm analyses
these tradeoffs and their impact on the overall application the environment in terms of varying correlations and thus
performance be estimated accurately. Distributed Source Coding can not be observed from the user
perspective.
A. Distributed Source Coding as In-Network Processing
Distribute Source Coding belongs to the class of the in- B. DISC as a Mechanism to Increase Fault-Resilience of a
network processing service. It is one of the techniques that WSN
allows to extend the network lifetime of WSNs, especially DISC [15] is a distributed data storage and collection
in the case of high spatial node densities. This high den- mechanism for WSNs which protects the information from
sity induces spatial correlations between the measurements malicious destruction of parts of the network and therefore
of individual sensor nodes. A direct consequence of these increases the fault-resilience of the system. The system is
correlations is that the sensor readings between neighboring designed for medium and large-scale networks and requires a
sensor nodes are highly redundant. The Distributed Source hierarchical architecture from the WSN. Nodes are arbitrarily
Coding approach seeks to exploit the spatial correlations and located in a monitored area divided into identified regions
2 In this paper we consider the TinyOS operating system for our experi- by the mean of existing ID configuration protocol. In each
ments, though most of the conclusions drawn are OS-independent. region a node is dynamically elected as a cluster head using

TABLE IV. Memory footprint of the service discovery protocols.
Resources                               | nanoSLP simplified | nanoSLP full | nanoSQL
RAM [Bytes]                             | 320                | 411          | 655
ROM [Bytes]                             | 1653               | 2505         | 4512
Lines of code (protocol + data storage) | 700 + 700          | 950 + 700    | 3400 + 1300

from every three traces, which leads to approximately 9%


increase in the false-positive probability of locating false data
in the WSN.
Fig. 4. The fraction of recovered information per cluster in case of malicious
destruction of nodes.
C. Protocol Design and Implementational Trade-offs on Ex-
ample of Service Discovery Protocols
a low-energy cluster formation algorithm such as PANEL We illustrate the protocol design and implementation trade-
[16]. The network life-time is divided into the time periods, offs using the example of two service discovery (SD) proto-
epochs, and the data collected in one cluster is stored in the cols: nanoSLP [18] and nanoSQL. These general purpose dis-
nodes of a randomly chosen neighboring cluster serving as covery protocols differ in their expressive power and therefore
a backup cluster. In order to keep track of the stored data, are efficient to different range of applications. The nanoSLP
each cluster head maintains traces identifying the stored data. protocol is an adapted version of the Service Location Protocol
A trace is a single-element Bloom filter [17] computed from [19] for WSNs and is capable only of data discovery, not data
the time epoch of the data aggregation, the region identifier management. Therefore it can only be used for SD purposes.
of the aggregator and the type of data. When there is need to The nanoSQL protocol uses a simplified version of SQL and
backtrack the information from a physically destroyed region allows both sophisticated data discovery and data management.
within a particular period of time, the correspondent traces The range of possible application for this protocol are not
are created, combined into Bloom filters and sent in a request limited to SD. The nanoSQL can also be used for cross-
message. Using geographical routing this request is routed layer optimization and as a part of a middleware. Both
to the neighboring regions of the destroyed area where the protocols function in a peer-to-peer fashion and, in principle,
requested data is stored. can the flexibility to be used on any network architecture.
The fault-tolerance provided by DISC comes at the price The protocols support proactive and reactive messaging and
of higher complexity and a slight decrease of the network require an information storage service where the sensor nodes
lifetime. The complexity of the WSN increases due to the use can register their services and data.
of a cluster-based architecture and the need for services such In nanoSLP each service is characterized by its name and
as geographical routing and cluster head election mechanism. attributes. We have realized two versions of the protocol.
Since our network follows a hierarchical architecture, cluster The first protocol version allows users to conduct complex
heads execute the heavy work of computing the collected data searches using multiple AND, OR and comparison operators
and sending it to be stored. We therefore try to preserve a and discover separately both services and attributes. The sec-
homogeneous level of energy consumption for all the nodes by ond version is simplified further and is limited only to a single
using a dynamic election of cluster heads. In comparison with comparison operator and service searches. The comparison of
mechanisms that store data locally on the node itself, DISC the footprints of both protocol versions given in Figure IV
offers a higher resiliency for the data at a price of a slight shows that there is 56% increase in the ROM consumption
increase in energy consumption due to the remote storing. and approximately 30% in the RAM size. The complexity of
The DISC is a very flexible mechanism that allows to the code also increased significantly (by nearly 30%) due to
balance the amount of recovered data and the amount of more sofisticated parsing needed. The reader should note that
resources spent. The amount of recovered information depends the memory footprint of the protocols include both the data
on the number of neighboring clusters that serve as backup storage and the protocol modules, as we have used different
clusters for one region. The DISC’s efficiency grows with the data storages in case of the SLP- and SQL- based parsers.
density of clusters. Figure 4 shows the amount of recovered The nanoSQL protocol is a new multipurpose protocol
information depending on the fraction of failed nodes in the capable of both sophisticated service discovery and data
cluster in the case of malicious destruction of nodes. The usage management. The implemented support for both the service
of memory and processing resources can be adjusted depend- discovery and data gathering functionalities allows to save
ing on the tolerable probability of locating false data, that network traffic, via use of piggybacking. The SQL syntax
comes due to the natural false positive probability in Bloom was already used in WSNs before (see, for example, [20],
filters. For example, in order to decrease the consumption of [21]) however the implemented parsers did not allow as
RAM by a factor of three, we logically form one Bloom filter high flexibility as ours. The nanoSQL is capable of forming

152
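To make the trace mechanism concrete, the sketch below shows one way a trace over (epoch, region identifier, data type) could be kept as a small Bloom filter and how OR-merging several traces into one stored filter trades RAM for a higher false-positive probability, which is exactly the effect quantified above. Filter width, hash count and the mixing constant are illustrative assumptions, not values taken from DISC [15].

    import java.util.BitSet;

    /** Minimal sketch of a DISC-style trace; parameters are illustrative only. */
    final class Trace {
        private static final int BITS = 64;     // assumed filter width
        private static final int HASHES = 3;    // assumed number of hash functions
        private final BitSet bits = new BitSet(BITS);

        /** Build the trace of one aggregation from epoch, region id and data type. */
        static Trace of(int epoch, int regionId, int dataType) {
            Trace t = new Trace();
            long key = ((long) epoch << 32) ^ ((long) regionId << 16) ^ dataType;
            for (int i = 0; i < HASHES; i++) {
                // simple double hashing; a real node would use a cheaper scheme
                int h = (int) Math.floorMod(key * 0x9E3779B97F4A7C15L + i, (long) BITS);
                t.bits.set(h);
            }
            return t;
        }

        /** OR-merge: storing one filter for several traces saves RAM but raises
         *  the false-positive probability of locating false data. */
        Trace merge(Trace other) {
            Trace merged = new Trace();
            merged.bits.or(this.bits);
            merged.bits.or(other.bits);
            return merged;
        }

        /** Membership test used when a backup cluster answers a request message. */
        boolean mightContain(int epoch, int regionId, int dataType) {
            BitSet probe = (BitSet) of(epoch, regionId, dataType).bits.clone();
            probe.andNot(this.bits);
            return probe.isEmpty();   // all probe bits present => possible match
        }
    }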
TABLE IV. MEMORY FOOTPRINT OF THE SERVICE DISCOVERY PROTOCOLS
    Resources                               | nanoSLP simplified | nanoSLP full | nanoSQL
    RAM [Bytes]                             | 320                | 411          | 655
    ROM [Bytes]                             | 1653               | 2505         | 4512
    Lines of code (protocol + data storage) | 700 + 700          | 950 + 700    | 3400 + 1300

C. Protocol Design and Implementation Trade-offs on the Example of Service Discovery Protocols

We illustrate the protocol design and implementation trade-offs using the example of two service discovery (SD) protocols: nanoSLP [18] and nanoSQL. These general-purpose discovery protocols differ in their expressive power and are therefore efficient for different ranges of applications. The nanoSLP protocol is an adapted version of the Service Location Protocol [19] for WSNs and is capable only of data discovery, not data management. Therefore it can only be used for SD purposes. The nanoSQL protocol uses a simplified version of SQL and allows both sophisticated data discovery and data management. The range of possible applications for this protocol is not limited to SD: nanoSQL can also be used for cross-layer optimization and as part of a middleware. Both protocols function in a peer-to-peer fashion and, in principle, are flexible enough to be used on any network architecture. The protocols support proactive and reactive messaging and require an information storage service where the sensor nodes can register their services and data.

In nanoSLP each service is characterized by its name and attributes. We have realized two versions of the protocol. The first protocol version allows users to conduct complex searches using multiple AND, OR and comparison operators and to discover services and attributes separately. The second version is simplified further and is limited to a single comparison operator and service searches only. The comparison of the footprints of both protocol versions given in Table IV shows that there is a 56% increase in the ROM consumption and approximately 30% in the RAM size. The complexity of the code also increased significantly (by nearly 30%) due to the more sophisticated parsing needed. The reader should note that the memory footprints of the protocols include both the data storage and the protocol modules, as we have used different data storages for the SLP- and SQL-based parsers.

The nanoSQL protocol is a new multipurpose protocol capable of both sophisticated service discovery and data management. The implemented support for both service discovery and data gathering functionality allows network traffic to be saved via the use of piggybacking. The SQL syntax has been used in WSNs before (see, for example, [20], [21]); however, the implemented parsers did not allow as high flexibility as ours. nanoSQL is capable of forming complex expressions and nested SQL statements, and provides built-in aggregation functions and support for a wide range of data types. The expressiveness of nanoSQL comes at the price of an increase in the ROM footprint (80%) and a tripled code size. The growing functionality of the SD protocol and the corresponding parser leads not only to an increase of the memory footprint and code size, it also automatically leads to more complex component interfaces. For the nanoSQL module we have more than fifteen interfaces, which certainly makes it more difficult to use this component.
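Both protocols are described as requiring a node-local information storage service in which sensor nodes register their services and data. As a purely illustrative sketch (the interface and method names are our assumptions, not the nanoSLP or nanoSQL APIs), such a registry could expose roughly the following operations, with the two protocols differing mainly in how expressive the lookup side is:

    import java.util.List;
    import java.util.Map;

    /** Hypothetical node-local registry that both SD protocols could sit on top of. */
    interface InformationStorage {
        /** A node registers a service (e.g. "temperature") with its attributes. */
        void register(String serviceName, Map<String, String> attributes);

        /** nanoSLP-style lookup: match a service name against a single attribute filter. */
        List<String> findByAttribute(String serviceName, String attribute, String value);

        /** nanoSQL-style lookup: evaluate a simplified declarative query string. */
        List<Map<String, String>> query(String simplifiedSql);
    }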
VI. DESIGN AUTOMATION POSSIBILITIES

The design process of WSNs can be partially automated due to their relative simplicity. The development of such a tool is a subject of our on-going work. The proposed SOA-based agile WSN methodology is well suited for at least partial automation using the Semantic Web [22] approach developed for web service composition. However, substantial differences exist in the process of how computers and humans reach a satisfactory design decision. The computer tends to make decisions by iterating through a number of possible solutions, where the number of possibilities searched is regulated by pre-defined rules. Humans take shortcuts using their experience and intuition. For example, a part of the application development process is the mapping of the abstract requirements and descriptions to real-world software and hardware. At this stage human-dependent parameters such as developer experience with one or another software and hardware platform influence the decision. People tend to develop in those environments they are familiar with, as an additional learning curve is a time- and resource-consuming process.

We are currently developing an expert system that will allow both complete WSN systems and individual services to be designed, and that will automate the documentation and feedback gathering process for different components. As a first step we are implementing an ontology-based knowledge base that stores information about different WSN services and can estimate their compatibility and suggest possible component wiring. On the network side we are using a middleware with SD capabilities, inspired by SQL syntax, to gather the required parameters about the functioning of the individual services as well as about the performance of the overall system.

VII. CONCLUSIONS AND FUTURE WORK

Constructing a WSN that fulfills a particular task, as well as designing and implementing the corresponding applications and protocols, is not a trivial task. A developer has numerous decisions to take at the architectural, protocol, application design and implementation levels. In this paper we have proposed a design methodology that relies on service-oriented architecture and agile methodology and adapts these techniques for WSN use. We have provided case studies that showed the applicability of this methodology to various application scenarios. Additionally, we have proposed a list of parameters with which most WSNs can be described and discussed the major trade-offs WSNs face. We have also developed a basic abstract SOA-based model of an individual node. Our next goal is to develop a framework that will assist developers in designing and implementing wireless embedded and sensor networks.

ACKNOWLEDGMENT

We thank DFG (Deutsche Forschungsgemeinschaft) and RWTH Aachen University for partial financial support through the UMIC excellence cluster. The work was also supported by the European Union (COMANCHE project). We would also like to thank K. Rerkrai, E. Osipov and M. Popa for fruitful discussions.

REFERENCES

[1] C. Schroth and T. Janner, “Web 2.0 and SOA: Converging concepts enabling the internet of services,” IT Professional, vol. 9, no. 3, pp. 36–41, 2007.
[2] M. P. Papazoglou, “Service-oriented computing: Concepts, characteristics and directions,” in Proc. of WISE, Washington, DC, USA, 2003.
[3] J. Polastre, R. Szewczyk, and D. Culler, “Telos: enabling ultra-low power wireless research,” in Proc. of IPSN, NJ, USA, 2005, p. 48.
[4] R. Martin, Agile Software Development: Principles, Patterns, and Practices. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2003.
[5] J. Miller et al., “Model Driven Architecture (MDA),” Object Management Group, Draft Specification ormsc/2001-07-01, July 2001.
[6] S. Brown, “Updating software in wireless sensor networks: A survey,” Dept. of Computer Science, National Univ. of Ireland, Maynooth, Tech. Rep., 2006.
[7] I. Sommerville, Software Engineering. Addison Wesley, 2006.
[8] Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, and S. Shenker, “Making gnutella-like P2P systems scalable,” in Proc. of SIGCOMM ’03, NY, USA: ACM Press, 2003, pp. 407–418.
[9] C. Frank and K. Römer, “Algorithms for generic role assignment in wireless sensor networks,” in Proc. of SENSYS, San Diego, California, USA, November 2005.
[10] V. Raghunathan and C. Srivastava, “Energy-aware wireless microsensor networks,” IEEE Signal Processing Magazine, vol. 19, no. 2, pp. 40–50, 2002.
[11] D. M. Jones, “The New C Standard (sentence 787),” http://www.coding-guidelines.com/cbook/sent787.pdf [last visited 20.09.2007], 2005.
[12] D. Slepian and J. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. on Information Theory, vol. 19, no. 4, pp. 471–480, 1973.
[13] S. S. Pradhan and K. Ramchandran, “Distributed source coding using syndromes (DISCUS): design and construction,” IEEE Trans. on Information Theory, vol. 49, no. 3, pp. 626–643, 2003.
[14] J. Chou, D. Petrovic, and K. Ramachandran, “A distributed and adaptive signal processing approach to reducing energy consumption in sensor networks,” in Proc. of INFOCOM ’03, San Francisco, USA, March 2003.
[15] C. Jardak et al., “Distributed Information Storage and Collection in WSNs,” in Proc. of MASS, Pisa, Italy, October 2007.
[16] L. Buttyan and P. Schaffer, “PANEL: Position-based aggregator node election in wireless sensor networks,” in Proc. of MASS, Pisa, Italy, October 2007.
[17] B. H. Bloom, “Space/time trade-offs in hash coding with allowable errors,” Communications of the ACM, vol. 13, no. 7, pp. 422–426, 1970.
[18] Z. Shelby et al., “NanoIP: The Zen of Embedded Networking,” in Proc. of ICC ’03, Seattle, Washington, USA, May 2003.
[19] E. Guttman et al., “RFC 2608: SLPv2: A service location protocol,” June 1999.
[20] K. Rerkrai et al., “Unified Link-Layer API Enabling Portable Protocols and Applications for Wireless Sensor Networks,” in Proc. of ICC, Glasgow, UK, June 2007.
[21] S. Madden et al., “TinyDB: an acquisitional query processing system for sensor networks,” ACM Transactions on Database Systems, vol. 30, no. 1, pp. 122–173, 2005.
[22] T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Scientific American, vol. 284, no. 5, pp. 28–37, 2001.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Kuka: An Architecture for Associating an Augmented Artefact with its User using Wearable Sensors

Kaori Fujinami§    Susanna Pirttikangas§§

§ Department of Computer, Information and Communication Sciences, Tokyo University of Agriculture and Technology
fujinami@cc.tuat.ac.jp

§§ Department of Electrical and Information Engineering, University of Oulu
msp@ee.oulu.fi

Abstract

In this paper, we present an architecture for associating a person and an artefact (a daily object) utilized by him/her in an ad-hoc and extensible manner. The proximity of the acceleration signal patterns from wearable sensors on different parts of the body and sensor-augmented artefacts is utilized for making an association. The proposed architecture also provides a filtering facility to reduce the possibility of mis-association between an artefact and a person who is using a different artefact in a similar way but just exists in close vicinity. With the association information, a system can provide a more reliable context-aware service, as the particular role of an artefact and the proprietary nature of such an accessory give additional knowledge to the system. A distributed architecture and a correlation coefficient based approach are realized. We describe the design, a prototype implementation and a basic experiment.

1. Introduction

In a ubiquitous computing environment, the notion of context-awareness plays a key role, where a person's contextual information is utilized as a filtering parameter to narrow the information overflow. We have proposed the notion of sentient artefact as a method for implicit and natural context extraction [4]. A daily object like a chair, an alarm clock, etc., is developed to assist a specific task, which means the state-of-use can be a clue to infer the user's activity related to the task. Additionally, an object that is not shared with others (e.g. a toothbrush) can be utilized as a trigger for a personalized service using the owner information. In the future, a wrist watch, a necklace, shoes and clothes will be augmented with sensors so that the user's activity can be monitored continuously and implicitly. We believe that the collaboration between an artefact and wearable sensors can finally lead to a conclusion like: "The user of this vacuum cleaner is the wearer of the wrist watch." This can be defined as the association of an object with its owner. Association allows a system to recognize a more complex context that consists of temporally distributed utilization of objects. For example, we utilize a pot, a cup, a spoon, a teabag, etc., to make tea. One can use a cup to have a coffee, or just drink water. This indicates that the more the number of objects in use increases, the more concrete and complex the context gets. A system can utilize the association to connect the events of using the objects. In other words, the information about the user of an object can be the key to distinguishing many events of usage from each other. Furthermore, by association, one can avoid intentional or unintentional misusage of an object that is assumed to be unshareable with others.

In this paper, we propose an architecture named "Kuka"¹. Kuka incorporates wearable sensors attached to a user's body to associate the wearer's identity with an artefact that is utilized at the time. The key technical concepts are 1) the loose coupling of an artefact usage and an activity related to the artefact, and 2) the utilization of the proximity of the signal patterns from sensors on the target artefact and a specific part of the body. The former is achieved by defining a rule that associates an artefact and specific activities recognized by body-worn sensors, which allows the system to be extensible in the type of an identifiable artefact.

¹ "Kuka" means "Who" in Finnish.

The latter provides robustness in a situation where many people are engaged in similar activities. Also, Kuka requires neither a centralized host nor an infrastructure in a space, which allows an application to be used outdoors as well as indoors. We have already investigated the basic framework of this combination [6, 5], where we conducted a simulation-based experiment. This paper shows a concrete architecture and prototyping based on the previous work.

The rest of the paper is organized as follows: the background is described in section 2. The Kuka architecture design and implementation are presented in sections 3 and 4, respectively. The performance of the system is evaluated in section 5. Finally, section 6 concludes the paper with future work.

2. Background

User identification and authentication are associating procedures where the identified person is logically connected to a service provided by a system containing the identification component. No matter how the identification is achieved, the pre-registration of a person's information, e.g. the fingerprint, is required. Also, the explicit way to initiate the identification process loses the advantage of the artefact-based implicit context recognition.

The concept of connecting two things by the proximity of their movement patterns is similar to that of Smart-Its Friends [7]. However, Kuka provides a generic architecture and working environment for various types of artefacts. That project also presents the notion of grouping by context proximity and shows a physical access control using a door handle and a wrist [1], which can be an application of Kuka.

Radio frequency identification (RFID) technology can also be utilized for making an association. In the guide project at Intel Research Seattle [10], a person wears an RF tag reader. The utilization of an object is approximated by the detection of a tag on the object. This is very simple; however, it is difficult to distinguish the situation where a person just touches the object from actual utilization. Also, the chance of association is lost by a failure of the detection. These issues arise because the approach is based on a single point of utilization, i.e. touching. Moreover, there is a trade-off in the range of the tag detection. Short range detection using a passive tag can reduce the chance of detecting an incorrect object that is not utilized but just exists in the range, and offers low cost and maintenance-free operation on the tag side. However, it leads to the mis-detection of a tag that an active tag solves. Similarly to this approach, radio signal strength can be utilized to measure the proximity of two devices [3]. However, the signal strength is easily affected by the orientation of the body and the placement of devices. Therefore, the proximity of the signals of sensors within a certain period of time provides more robust results.

3. Kuka System Design

The goal of Kuka is to associate a person with an artefact that he/she is really involved with, in an improvised and extensible manner. We assume that a person wears one or more acceleration sensors (wearable sensors) and that an artefact is also augmented with an accelerometer. A small terminal like a PDA or a mobile phone is carried by the person so that data from the wearable sensors are collected and processed. Generally, a wearable sensor provides little information about the user's context [8], and the raw data needs to be aggregated on the terminal to extract higher level contexts.

3.1. Requirements

The requirements for the architectural design are as follows.

1. Improvised association: To initiate a service spontaneously even for the first time, the user needs to be associated with a specific artefact in an improvised manner. Traditional identification (or association, in this context) methods require pre-registration of some features. For example, in fingerprint recognition, the fingerprints are recorded and stored in advance, and additionally, the user is requested to put a finger on the sensor consciously.

2. Infrastructure independency: The most prominent advantage of the wearable computing paradigm is that a user can be free from a specific place and can move anywhere with continuous service provision. So, the processing should not be done, for example, on a server in a specific location. In other words, a sensor on an artefact and a processing terminal of a user need to communicate with each other directly, and the association is made without a central coordinator.

3. Scalability in the type of an artefact: A person uses various types of artefacts. From the application developer's point of view, the association algorithm should have minimum dependency on the type of an artefact. This means a developer should only have to adjust some parameters for each artefact.

4. Timeliness: In our experience, a user expects an artefact to act as a kind of switch to an action (imagine pressing the remote control buttons of a TV set), which means he/she can wait only a few seconds for the service. So, the association should be made quickly when the result is utilized as the trigger of a service.
3.2. Design Decisions

For the first requirement, we utilize the proximity (correlation) of the signal patterns of sensors since, as described in section 2, direct identification by biometrics requires explicit involvement by the user. The utilization of proximity is a relative approach, which means the wearer that has the most proximate signal pattern, e.g. movement, to an artefact is selected as the user. Therefore, a proximity metric needs to be calculated from the signal patterns, and then it is evaluated over all possible users (terminals) in a dedicated space. This approach also satisfies the third requirement in the sense that the calculation and evaluation (comparison) components are common. As we describe in section 4.4, only a few parameters need to be specified for each type of artefact.

The communication between a person's terminal and an artefact, and also between two terminals, is realized on a broadcast basis to meet the second requirement. Furthermore, it is useless to run the process for all artefacts in the dedicated area; rather, it should only work for a very few sets of artefacts. So, we introduce a selective activation of the identification process called an association condition.

3.3. Correlation Coefficient

To realize the association in an improvised and extensible way, a metric that represents the proximity of the acceleration signal patterns is calculated on the user's side (terminal), and then it is evaluated to find the most proximate pair within all the terminals in a dedicated space. As the metric, we have selected the correlation coefficient (Formula (1), below), since it is also lightweight enough for timely association.

    r_{12} = \frac{\sum (d_1 - \bar{d}_1)\,(d_2 - \bar{d}_2)}{\sqrt{\sum (d_1 - \bar{d}_1)^2 \cdot \sum (d_2 - \bar{d}_2)^2}}    (1)

Here, d_1 and d_2 represent the data sets obtained (from the wearables and the artefacts, respectively) that are to be associated within a specific period of time (i.e. window). The overline \bar{d}_i denotes the mean value in the window, and the sum (\sum) is calculated over a window.

A correlation coefficient is a metric of the proximity of signal patterns in the time domain. Features in the frequency domain have also been utilized to characterize human activities [2]. However, it is insufficient to utilize simple features like the sum of the squared discrete FFT component magnitudes of the signal, since they do not represent the difference in the phases of the signals on the object and the body side. Imagine two persons riding bicycles with wheels of the same size at the same speed: the same frequency component distributions can be seen. Another interesting but more complex approach working in the frequency domain is [9]. It is based on a coherence function which represents the correlation of the power spectrum between two signals. In [9], a system can determine whether two devices are carried by the same person with 100% accuracy. However, it requires a sliding window of 8 seconds of data, which is too long to initiate a service on the detection of the association.

We have tested a variety of correlation coefficient metrics [5]; of these, the maximum value among all possible pairs of correlation coefficients provided the best performance and is utilized in the later experiments. We assume that 3-axis accelerometers are attached on both the artefact side and the user's body, where multiple nodes (N) can be on the body. So, the maximum number of pairs is 9N (= 3 axes x 3 axes x N nodes). To reduce unnecessary calculation, N should be as small as possible. We can selectively specify the node(s) on the body that relate to the utilization of an artefact, e.g. a node on a leg for a node on a pedal of a bike.

We also studied the dependency on the window size for the proximity evaluation. Basically, the performance of correct association increases when the size of the window gets larger. However, the "saturation" point in the window size-performance graph varies among artefacts, which means the window size is a system parameter. Please refer to [5] for more detail about these characteristics.
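As implemented on the terminal, Formula (1) reduces to a few lines of Java. The sketch below (class and method names are ours, not the Kuka code) computes the coefficient for one pair of windows and then takes the maximum absolute value over all 3 x 3 x N axis pairs, which is the quantity the terminals later exchange and compare; using the absolute value also covers negatively correlated axis pairs, as discussed in section 3.4.

    import java.util.List;

    /** Minimal sketch of Formula (1) and of the selection over all 9N axis pairs. */
    final class ProximityMetric {

        /** Pearson correlation coefficient of two equally sized windows d1, d2. */
        static double correlation(double[] d1, double[] d2) {
            int n = d1.length;
            double m1 = 0, m2 = 0;
            for (int i = 0; i < n; i++) { m1 += d1[i]; m2 += d2[i]; }
            m1 /= n; m2 /= n;
            double num = 0, s1 = 0, s2 = 0;
            for (int i = 0; i < n; i++) {
                num += (d1[i] - m1) * (d2[i] - m2);
                s1  += (d1[i] - m1) * (d1[i] - m1);
                s2  += (d2[i] - m2) * (d2[i] - m2);
            }
            return num / Math.sqrt(s1 * s2);
        }

        /** Maximum absolute correlation over all artefact-axis x body-axis pairs
         *  (3 x 3 x N), the value a terminal reports for comparison. */
        static double maxAbsCorrelation(double[][] artefactAxes, List<double[][]> bodyNodes) {
            double best = 0;
            for (double[][] node : bodyNodes)
                for (double[] a : artefactAxes)
                    for (double[] b : node)
                        best = Math.max(best, Math.abs(correlation(a, b)));
            return best;
        }
    }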
3.4. Association Procedure

Figure 1 shows the six-step procedure for association. We have presented our basic algorithm of association with a simulation study in [6]. In this paper, we extend the procedure and present the architectural design based on the requirements mentioned in section 3.1.

Figure 1. Six Step Association Procedure (the artefact announces "I am a cleaner. Who is using me?" and streams raw data on the shared medium; the terminals check the association condition, the qualifying ones calculate correlation metrics such as Corr(a) = 0.9 and Corr(c) = 0.4 and exchange them, the terminal with the highest value concludes "I am the user!", and finally the artefact is told to stop sending raw data.)

The procedure starts with an event from an artefact that indicates someone is using it (Step 1). Driven by this event, all the possible users' activities are checked to see whether they satisfy an association condition (Step 2). This is done on the terminal side, and the user's activity is continuously recognized using the wearable sensors. An association condition is a rule that represents the possible activities during the utilization of a specific type of artefact. Table 1 shows example conditions. Here, the chair's state-of-use occupied is associated with the user activity sitting, and the whiteboard cleaner's state-of-use moving with the user activities walking and cleaning. A condition is defined by an application developer who builds an application running on Kuka. A methodology for investigating appropriate conditions for each type of artefact is outside the focus of this paper. The type of the artefact that generates the event and its state-of-use are utilized as the key to specify the required activity to satisfy.

At the same time as the event notification, the artefact starts sending raw data for the correlation checking. However, only those terminals that satisfy the association condition accept the data and start calculating the metric in a data window of appropriate size (Step 3). This means that unnecessary calculation can be avoided on the terminal of a person who is not engaged in an activity specified in the association condition. However, a wildcard "*" is also supported in the right column of the association condition table, since some artefacts seem not to be involved in any particular activity; for example, we drink a cup of coffee while standing, sitting, walking, etc. In this case, Step 2 is skipped. After the calculation, the terminals distribute the metric to select the maximum one on each side, where the selection is done after a specific period (Step 4). This period is necessary to wait for the metrics from other terminals. Here, we utilize the absolute maximum value to handle negative correlation. Then, the terminal that selects its own metric as the maximum is qualified as the user (Step 5). Once the user is determined, the person is assumed to be the user until another event indicating the end of the usage arrives. Then, the terminal of the user sends a message to stop the raw data transmission for the reduction of computation, communication and power consumption (Step 6).

Table 1. Example Association Conditions
    Artefact | State-of-use      | Activity
    Chair    | Occupied          | sitting
    Cleaner  | Vertically Moving | walking, cleaning
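The Step 2 check that such a table drives can be sketched as a simple lookup. In the Java sketch below, the rule keys, the wildcard handling and the class name are illustrative assumptions rather than the actual Kuka rule format.

    import java.util.Map;
    import java.util.Set;

    /** Illustrative sketch of the Step-2 association condition check. */
    final class AssociationConditionChecker {
        /** key: artefactType + "." + stateOfUse event, value: allowed user activities. */
        private final Map<String, Set<String>> conditions = Map.of(
                "chair.occupied", Set.of("sitting"),
                "cleaner.vertical_movement", Set.of("walking", "cleaning"),
                "cup.moving", Set.of("*"));          // wildcard: Step 2 is skipped

        /** Returns true if this terminal should accept raw data and start Step 3. */
        boolean satisfied(String artefactType, String stateOfUse, String currentActivity) {
            Set<String> allowed = conditions.get(artefactType + "." + stateOfUse);
            if (allowed == null) return false;       // unknown artefact event
            return allowed.contains("*") || allowed.contains(currentActivity);
        }
    }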

3.5. Components on Personal Terminal

As a summary of the architectural design, we show the functionality on the terminal side in Figure 2. It consists of four components: Activity Recognizer, Artefact-Person Associator, Data Buffer, and Broadcast Client/Server. Activities of the wearer of the sensors are recognized by the Activity Recognizer at a certain interval. This is utilized when an association condition is evaluated. The Artefact-Person Associator component is the main function of the Kuka architecture; it listens for an event from a sensor-augmented artefact. The event indicates the start/finish of the usage of an artefact. Furthermore, the component performs the actual proximity evaluation process. The Data Buffer component stores data from the wearable sensors for a specific period of time (window) to calculate both the features for activity recognition and the correlation coefficient. Finally, the Broadcast Client/Server component deals with broadcast-based communication with the other entities, i.e. an artefact and other terminals.

Figure 2. Relevant Components on a Terminal (the personal terminal hosts the Activity Recognizer, Data Buffer, Artefact-Person Associator and Broadcast Client/Server, and communicates with the sensor-augmented artefact and the wearable sensor nodes.)

4. Prototyping

Figure 3 illustrates the prototyping architecture.
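Before the hardware details, the following is a minimal Java sketch of how the Broadcast Client/Server could be realized with UDP multicasting over the wireless LAN, which is the transport the prototype uses (section 4.2); the group address, port and string payload format are our assumptions, not the Kuka implementation.

    import java.net.DatagramPacket;
    import java.net.InetAddress;
    import java.net.MulticastSocket;
    import java.nio.charset.StandardCharsets;

    /** Minimal sketch of a Broadcast Client/Server on top of UDP multicast. */
    final class BroadcastClientServer {
        private static final String GROUP = "230.0.0.1";   // assumed multicast group
        private static final int PORT = 4446;              // assumed port

        /** Send one message (e.g. an event or a correlation metric) to all peers. */
        static void broadcast(String message) throws Exception {
            try (MulticastSocket socket = new MulticastSocket()) {
                byte[] payload = message.getBytes(StandardCharsets.UTF_8);
                socket.send(new DatagramPacket(payload, payload.length,
                        InetAddress.getByName(GROUP), PORT));
            }
        }

        /** Block until one message from another terminal or artefact arrives. */
        static String receiveOne() throws Exception {
            try (MulticastSocket socket = new MulticastSocket(PORT)) {
                socket.joinGroup(InetAddress.getByName(GROUP));
                byte[] buffer = new byte[1024];
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet);
                return new String(packet.getData(), 0, packet.getLength(),
                        StandardCharsets.UTF_8);
            }
        }
    }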
Current Implementation
3-axis
4.1. Wireless Sensor Nodes accelerometer

Cookie

For the activity recognition, as well as the utilization


Vacuum

Artefact Proxy
of sensor augmented artefacts, we used our own sensor Cleaner
node named Cookie [11]. Cookie is a 50ecent-sized Bluetooth Wireless LAN
(IEEE802.11g)
Personal
Terminal
wireless sensor node that is extensible for more than Whiteboard
JRE
Cleaner
nine types of sensors, and communicates with a device Bluetooth
Cookie
capable of Bluetooth v1.1 Serial Port Profile. Raw data
is acquired on the node, where the accelerometer data is
sampled 64 times at 200 kHz, averaged for smoothing, Figure 3. Prototyping Architecture
and transmitted to the personal terminal at every 100
msec at 9600 bps. Various sensors can be attached
to Cookie. In this prototyping, a 3-axes accelerometer In [11], we attached three Cookie nodes on both
was utilized to measure the movement. wrists and right thigh according to [2]. Moreover, we
attached a node to a necklace, because of its good dis-
4.2. Implementation crimination performance within sitting activities. We
modeled various everyday activities into two different
levels of abstraction: 1) 9 activities (abstract model)
We have implemented a prototype of Kuka. Here,
and 2) 17 activities (detailed model). Table 2 shows the
Cookies were attached on a person’s body. Artefacts
17 activities. Nine activities are established by combin-
were also augmented with Cookies to detect their state-
ing similar ones, as follows (1) and (2) are clean, (3,4)
of-use, however, they can be augmented with any sen-
and (5) are stand, (6)–(9) are sit, and (10) and (11) are
sors and implemented with any languages if they can
stairs. In the abstract model, the activity drink (17)
communicate with a user terminal. For simplicity, cer-
was removed.
tain acceleration variance of an artefact is defined as
We tested a multi-layer perceptron (MLP) with var-
a piece of “artefact used”. By applying 2-out-of-3 rule
ious input features and data window sizes to find the
for the three axes accelerometer that exceed a thresh-
best structure and a suitable combination of features
old, the final “state-of-use” is answered. The threshold
that would give a fast and accurate classification re-
of the variance was determined heuristically. However,
sult for the user. The input features selected with
the detection algorithm should be optimized for each
forward-backward feature search were the averages and
artefact at the time of manufacturing, e.g. on/off or
the standard deviations (SD) of the three axes accelera-
specific movement pattern recognition.
tion signals within a time window of seven samples (0.7
As assumed in section 3.2, the communication
seconds) from four sensor nodes. Besides the correla-
among sensor augmented artefacts and personal ter-
tion metric, the components of Fast Fourier Transform
minals is realized by a short range wireless medium to
(FFT) (the most useful features for activity recognition
reduce mis-association. Bluetooth is one of such a short
according to the literature) were not utilized, because
range communication medium. But, it is basically for
of the delay as reported in [2] (6.7 sec).
point-to-point communication. So, in the prototyping,
The recognition performance against 9 and 17 activ-
we have utilized UDP multicasting over IEEE 802.11g
ity models is 90.97% and 85.13%, respectively, which
based wireless LAN. Therefore, a protocol converting
is obtained 4-fold cross validation with our collected
component Artefact Proxy was required. However, it
data. We will compare the association performance in
can be removed if a wireless sensing systems with short
terms of the number of classified activities, i.e. level of
range and 1-to-N basis communication is available. All
abstraction in the basic activities.
the components on the terminal were written in Java
(J2SE1.4.2 09).
4.4. Kuka Configuration Parameters
4.3. Activity Recognition Kuka’s behavior is controlled by a set of parame-
ters as shown in Table 3. They are described in an
The activities of the wearer of the sensors are rec- XML-encoded file. The major two parts are 1) activity
ognized at certain time intervals. We made a data col- recognition and 2) proximity evaluation. For the con-
lection experiment to develop the activity recognition figuration of 1), an implementation of an activity recog-
model [11]. The requirements of the model were low nition model is given as a fully qualified class name, e.g.
computation cost and high accuracy. kuka.NineClassifier. The recognition features are spec-

158
Table 2. Recognized Activities Table 3. Kuka Configuration Parameters
(1) Clean Whiteboard (2) Vacuum Clean Name Example
(3,4) Elevator Up/Down (5) Stand Still Activity Recognition Class kuka.NineClassifier
(6) Sit/Watch TV (7) Sit/Relax Recognition Features wrist right.3X.average,
(8) Sit/Read Newspaper (9) Type KB thigh right.3X.sd
(10,11) Stairs Up/Down (12) Walk Recognition Window Size 7
(13) Lie Down (14) Brush Teeth Recognition Interval 1000 msec
(15) Run (16) Bicycle Association Condition wb.vertical movement
(17) Drink =walk,clean
Time to Start Comparison 1000 msec
Samples for Corr. Check 20
Required Nodes wrist right,thigh right
ified in a way that represents the position on the body,
an axis of accelerometer, and the name of the feature,
e.g. wrist right.3X.average. We assume an appropriate
vocabulary is defnied. The size of the data buffer (size fact. In case T2, two subjects utilize different artefacts,
of window) and the interval of activity recognition are i.e. either a whiteboard cleaner or a vacuum cleaner,
also given. These four parameters allow Kuka to be whereas in T3 two subjects utilize the same type of
independent of a recognition algorithm, to be scalable artefact, i.e. whiteboard cleaners.
in the number of activities and to be flexible in the Four subjects participated to the test ( for all, right
recognition features without changing any line of code. wrist is the dominant wrist.). We utilized the two (9
For proximity evaluation, the association condition, and 17) activity recognition model described in section
the time to start comparison of received metrics, the 4.3, which we call abstract and detailed, respectively.
data window size for calculating correlation coefficient, In the former case, both cleaning a whiteboard and
and the node for the calculation are specified for each vacuum cleaning are categorized into one class, clean.
type of an artefact. The right hand side of the condi- The name of activity contained in each activity appears
tion consists of the type of an artefact and the name of on the right side of the association condition. Also, a
event that represents “starting”, while the left side con- wildcard was applied in the condition for comparison in
tains basic activities that the user might be engaging the performance. The conditions were defined for each
with the artefact. In the table, the condition indicates artefact. Based on our previous study [5], 20 samples
that the vertical movement that represents the usage of and the right wrist were utilized for the calculation
a whiteboard cleaner (“wb”) should be associated with of the correlation coefficient of a whiteboard cleaner,
walk or clean activity. As described in section 3.4, a and 30 samples and both wrists and the right thigh
specific period of time is required to start comparing were for a vacuum cleaner. We ran two instances of
proximity metrics from other terminals. The last two Kuka system and one Artefact Proxy on a laptop PC
parameters in Table 3 allow flexible configuration for (Macintosh Powerbook G4, OS: MacOSX 10.4.5, CPU:
each type of artefact since, as can be seen in section 3.3, 1GHz, Memory: 1.25GB, JRE: J2SE1.4.2 09).
the size of a window and a sensor node that contribute
to obtain high correlation varies among artefacts.
Table 4. Test Cases
Case Subject-1 Subject-2
5. Evaluation T1 Whiteboard Cleaner —
5.1. Experimental Setups Vacuum Cleaner —
T2 Whiteboard Cleaner Vacuum Cleaner
We have selected a whiteboard cleaner and a vac- Vacuum Cleaner Whiteboard Cleaner
uum cleaner as association targets. These artefacts are T3 Whiteboard Cleaner-1 Whiteboard Cleaner-2
utilized in a cleaning task. So, the association allows
a system to say what the user is cleaning, i.e. white-
board or floor. Table 4 lists the test cases. Also, Figure
4 shows the scenes and sensor nodes (Cookies) attach- 5.2. Association Performance
ment. The Cookie nodes were attached to both wrists,
the right thigh, and a necklace. Here, T1 is a simple The percentage of successful association for the case
test case, where only one subject utilizes a single arte- T2 and T3 were 67.3% (N=49) and 60.9% (N=23),

159
T1: Single Person T2: Two Persons, Different Types

Wrist (right)
Necklace

Wrist (left) Thigh (right)

T3: Two Persons, Same Type

Figure 4. Test Cases and Sensor Attachment

Figure 5. The Relationship among the Levels


of Abstraction for Basic Activities
respectively. The result indicates 32.7% of T2 were
misidentified to the other subject. (Obviously, in T1,
the result was 100%.) The major reason for the mis-
association is an accidental high correlation of an arte-
fact usage with the person’s body movement. However,
cluded into the association condition, which can com-
in those cases, the correlation values are lower than
plement the performance of the activity recognition
that of the successful cases, for example 0.48. So, we
model. Moreover, in case of a vacuum cleaner, four
need some rejection criteria for such a low but maxi-
times of whiteboard clean were utilized because of the
mum correlation among participants. In this case, if
wildcard (in turn, two times of vacuum clean for a
there is no person that has higher correlation than the
whiteboard cleaner). Therefore, if the 9 activities
rejection threshold, the checking process will be done
model had been applied, they would have been included
from Step 2 in Figure 1. Furthermore, a short range
in the top part of the graph.
communication would allow the system to reduce the
mis-association. We consider 1 meter would be enough The wrong activity recognitions were “rescued” by
regarding the usage of an artefact. the wildcard and the abstract classifier. We consider
the wildcard should not be utilized frequently since
5.3. Association Condition only the correlation check process is to be applied and
thus, unnecessary checking provides the chance of “mis-
We analyzed the activities that had satisfied the con- correlation”. Instead, appropriate levels and required
dition for the successful cases. Figure 5 shows the re- set of basic activities need to be defined.
sults. The bottom part of the graph indicates the cases As described above, the unification of two detailed
where a wildcard was applied, while the top part is ei- contexts, whiteboard clean and vacuum clean, into one,
ther abstract activity clean, detailed activities white- clean, allowed a recognition model to utilize 9 activ-
board clean or vacuum clean. In the cases where the ities model. This leads to higher accuracy , as can
wildcards were applied, the activity of the user was be seen in section 4.3, in the activity recognition part.
not correctly recognized. In the figure, the breakdown The proximity evaluation by correlation coefficient can
of activities to which the wildcards were applied are then filter out the confused activities in the clean ac-
shown. For example, 32 out-of 35 successful cases were tivity. Furthermore, an activity recognition model is
associated by clean, and the rest three cases were by not affected by an activity that is done with an arte-
others, in this case, three times of run activity. fact, which means the model is not necessary to be
This figure suggests that the whiteboard cleaner is reconstructed as the need for such a complex activ-
relatively easy to associate with clean or clean white- ity. Thus, the combination of an artefact and activity-
board activity, and on the other hand, the vacuum derived contexts by an association condition allows a
cleaner needs more activities than clean and vacuum system to extract more concrete and complex context
clean. The confused activities run and walk can be in- in an extensible manner.

160
5.4. Processing Speed the core logic while adapting to heterogeneous require-
The processing speed from the notification of the ments.
state-of-use of artefacts, i.e. whiteboard cleaner and
References
a vacuum cleaner, to completing association were ap-
proximately 3.1 sec (2.1 sec for the window making)
[1] S. Antifakos, B. Schiele, and L. E. Holmquist. Group-
and 3.9 sec (2.8 sec), respectively. Here, the difference ing Mechanisms for Smart Objects Based On Im-
is due to the size of the window (20 for a whiteboard plicit Interaction and Context Proximity. In Adjunct
cleaner and 30 for a vacuum cleaner, and a sample Proc. of the 5th International Conference on Ubiqui-
comes every 100 msec.). The activity recognition time tous Computing (Ubicomp2003), pages 207–208, 2003.
is less than 15 msec. The rest of the time (approx. 1 [2] L. Bao and S. S. Intille. Activity recognition from user-
sec) is required for the terminal to be notified by other annotated acceleration data. In Proc. of the 2nd Inter-
neighborhood terminals and to select the most corre- national Conference on Pervasive Computing (Perva-
lated one as described in section 3.4. We consider that sive 2004), volume LNCS 3001, pages 1–17. Springer-
the current processing time is not too much delay for Verlag, 2004.
[3] W. Brunette, C. Hartung, B. Nordstrom, and G. Bor-
a user. In our earlier experience, the users expect a
riello. Proximity Interactions between Wireless Sen-
sentient artefact to act as a switch like a TV remote, sors and their Application. In Proc. of the Sec-
where the TV screen appears in a few seconds. ond ACM International Workshop on Wireless Sen-
sor Networks and Applications (WSNA 2003), pages
30–37, 2003.
6. Conclusions and Future Work [4] K. Fujinami and T. Nakajima. Sentient Artefact: Ac-
In this paper, we have proposed an architecture quiring User’s Context Through Daily Objects. In
Proc. of the 2nd International Symposium on Ubiq-
“Kuka” to associate a person with an artefact. The
uitous Intelligence and Smart Worlds (UISW2005),
proximity of the signal patterns of readings from a
LNCS 3823, pages 335–344, 2005.
sensor augmented artefact, sentient artefact, and body [5] K. Fujinami and S. Pirttikangas. A Study on a Corre-
worn sensors were utilized for the association. Impro- lation Coefficient to associate an Object with its User.
vised association, infrastructure independency, scala- In Proc. of the 3rd IET International Conference on
bility in the type of an artefact, and timeliness are Intelligent Environment (IE07), pages 288–295, 2007.
taken into account in the Kuka system design. A dis- [6] K. Fujinami, S. Pirttikangas, and T. Nakajima. Who
tributed architecture and a correlation coefficient based opened the door?: Towards the implicit user identifi-
approach are realized and implemented. cation for sentient artefacts. In Adjunct Proc. of the
4th International Conference on Pervasive Computing
The prototype implementation and the evaluation
(Pervasive2006), pages 107–111, 2006.
gave us future work. A short range communication [7] L. E. Holmquist, F. Mattern, B. Schiele, P. Alahuhta,
system needs to be adopted to assess the association M. Beigl, and H.-W. Gellersen. Smart-Its Friends: A
performance in a more realistic environment as well as Technique for Users to Easily Establish Connections
the improvement of the broadcast-based communica- between Smart Artefacts. Lecture Notes in Computer
tion. Also, evaluation on the scalability in the number Science (Ubicomp2001), 2201:116–22, 2001.
of users and artefacts needs to be done. Besides that, [8] K. V. Laerhoven, A. Schmidt, and H.-W. Gellersen.
the applicability in heterogeneous sensor situation will Multi-Sensor Context Aware Clothing. In Proc. of the
be examined. The proximity in the proposed system 6th International Symposium on Wearable Computers
does not limit to the movement of an artefact and a (ISWC’02), pages 49–56, 2002.
[9] J. Lester, B. Hannaford, and G. Borriello. “Are
part of the body, i.e. acceleration. You With Me?”–Using Accelerometers to Determine
Furthermore, more complex and practical cases will if Two Devices are Carried by the Same Person.
be considered. First of all, dynamic thresholding of the In Proc. Int. Conf. Pervasive Computing (Pervasive
correlation metric is needed to handle a multiple users 2004), pages 33–50, 2004.
situation. We are planning to apply some clustering al- [10] M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Pat-
gorithms to find “user group” and the other. Also, we terson, D. Fox, H. Kautz, and D. Hähnel. Inferring
will take into account the situation where one user is Activities from Interactions with Objects. IEEE Per-
switched to another while an artefact is utilized. Con- vasive Computing, 3:50–57, 2004.
[11] S. Pirttikangas, K. Fujinami, and T. Nakajima. Fea-
tinuous association checking by the six step procedure
ture Selection and Activity Recognition from Wear-
is cost ineffective. So, an on-demand checking mech- able Sensors. In 2006 International Symposium
anism should be introduced. We consider that these on Ubiquitous Computing Systems (UCS2006), pages
cases are application (or artefact) dependent, and thus 516–527, 2006.
an appropriate application framework could simplify

161
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Generating a Tailored Middleware for Wireless Sensor Network Applications

Christian Buckl, Stephan Sommer, Andreas Scholz, Alois Knoll, Alfons Kemper
Department of Informatics
Technische Universität München
Garching b. München, Germany
{buckl,sommerst,scholza,knoll,kemper}@in.tum.de

Abstract the developer. But due to the limited resources available


on wireless sensor nodes, a generic middleware providing a
Wireless sensor networks are characterized by resource container for all applications in the sensor network is quite
constraints. Therefore, today’s sensor networks are imple- impossible. Although, one can currently observe the advent
mented from scratch emphasizing code efficiency. This de- of new, more powerful nodes such as iMote2 [3] that enable
velopment strategy leads to relatively complex code and bad the use of a generic middleware, e.g. the .net Framework
code reusability in further projects. To improve reusability [17], these nodes are too expensive and require too much
and development efficiency, it is state-of-the-art in the de- power to be used in many sensor network applications.
velopment of standard information systems to divide appli- Resource constraints such as available main memory or lim-
cations into at least two parts, the application-logic, pro- ited power supply force the developer to implement these
viding all the functions to solve a given problem and a services, typically provided by middleware, manually and
reusable distributed middleware providing a container for tailored for a specific hardware and application. Therefore,
the application. After developing the middleware once, developers with expert platform and low-level program-
the developer of further projects need to focus only on ming knowledge are required. This application-specific de-
the application-logic. Thereby, the development times can velopment of standard functionality is a very slow and com-
be reduced considerably. However, a generic middleware plex process which impacts the development time of the
layer replacing code implemented from scratch is not prac- whole project and reduces the code reusability. In addition,
ticable in sensor networks due to resource constraints. the resulting code mixes typically aspects concerning appli-
Within this paper, we will present a model-driven approach cation and system logic. Because of this, minor changes
in combination with a template-based code generator to get in application-logic may lead to vast changes of the whole
the best of both development strategies. This approach en- system; code reuse is quite impossible.
ables us to generate a tailored middleware for our appli- A solution of this problem is an application and platform
cation including interface-stubs for the application-logic. specific middleware with defined interfaces for applica-
In contrast to other component-based approaches, the tem- tions. The separation of system and application logic helps
plates can be adopted easily to fulfill specific platform to split the software development into different parts. Fig-
needs. We will demonstrate the practicability of this ap- ure 1 shows a logical view of such a component/container
proach by implementing the control of a model railway. infrastructure [21]. A middleware/container offers differ-
ent services, like component discovery and inter-component
communication, to application components denoted by an
1 Introduction A in the figure. Hardware related software components,
denoted by an H, realize the hardware access (sensing or
Developing wireless sensor network applications re- actuating) and hide implementation details. An application
quires a different approach than developing standard infor- domain expert is able to develop the application-logic with-
mation systems. Many problems such as mobility, limited out considering each platform detail like communication or
resources and unreliable communication links, must be con- sensor-access. On the other hand, developers with low-level
sidered by the developer. In most other domains, these programming skills and expert hardware knowledge can de-
problems are solved by underlying abstraction layers. In velop the services provided by the middleware and used by
standard information systems, these abstraction layers are the application-logic. However, it is important to guarantee
realized by a middleware that offers high-level services to that the container realizes only the functionality required by

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 162


DOI 10.1109/SUTC.2008.57
computational power and memory capabilities. Therefore,
the nodes can take over different roles within the whole
system. Resource-constrained nodes can be used to per-
form simple interactions with the environment like sens-
ing or actuating. More powerful nodes can control the
whole network, optimize the data flow and trigger appli-
cation changes.
The middleware forms a container that allows an easy com-
bination of the components realizing the application func-
Figure 1. Component/Container Infrastruc- tionality. Regarding these components, we distinguish two
ture kinds of components. An Application Component realizes
a control function of the application. The functionality can
be implemented independently of the underlying hardware.
the application and implicates no or only minimal overhead Therefore, these components can be placed within the dis-
in comparison to the manual implementation. tributed system according to some performance criteria. In
In this paper, we will present a tool-supported approach contrast, a Hardware Interaction Component realizes the
that realizes such a tailored middleware. To increase de- hardware access, e.g. sensing or actuating, and must be im-
velopment efficiency, the manual tailoring of its compo- plemented hardware dependent.
nents is replaced by a model-driven development approach. The middleware realizes the interaction of these compo-
Thus, the approach combines the advantages of component- nents and can be seen as intelligent glue code. In contrast
based and model-based development, as discussed in [19]. to the operating systems such as TinyOS [18] or SOS [16],
The domain expert creates a model of the intended appli- which are very often considered to be middleware them-
cation based on a meta-model describing the relevant fea- selves, the presented approach has to be seen at a higher
tures of sensor network applications. Based on the model, a level. In particular, it offers services related to the distrib-
template-based code generator produces the middleware in- uted execution of sensor applications such as routing, node
cluding interfaces for the application-logic. Templates can failure management and quality of service. It consists of
be seen as highly adaptable components. Each template can several components as depicted in Figure 2. The commu-
solve a particular aspect of the middleware or can be used to nication between the different nodes is handled by the Net-
construct the middleware out of other templates. Using this work Service. In this service, all supported communication-
approach, we can create a tailored middleware providing media, e.g. ZigBee or serial communication, of a single
exactly the features required by the application. Therefore, node are implemented. Details about the network protocol
we get a good tradeoff between resource-use, code size and and routing are hidden by this service.
development time. Received messages are forwarded to the Broker Service.
The paper starts with a discussion of the components of a This service handles all communication at the level of ap-
middleware that are useful for wireless sensor networks in plication logic. This comprises the local communication
Section 2. In Section 3, we describe the models used for between different application components and / or hardware
the code generation. The code generation technique is ex- services, as well as the transmission of local results to exter-
plained in Section 4. Section 5 discusses our first prototype. nal components. Every time a new message is received by
In Section 6, the experiences with the tools are described in the network layer or sent by a local component, the Broker
the context of a simple application. The related work is dis- Service determines the set of target components. If a target
cussed in Section 7. Finally, Section 8 concludes the paper component is located on a different node, the Broker Ser-
and points out possibilities for future work. vice sends a message, including the target-node id, to the
Network Service.
The configuration of the Broker Service is handled by the
2 Middleware Application Management. While the Network and Broker
Service must be implemented on each node, the Applica-
Within this section, we will discuss the tasks and related tion Management may be realized only on more powerful
components of the middleware. The general architecture is nodes. The Application Management is responsible for the
depicted in Figure 2. Similar to CORBA [14], we provide configuration of the whole application and tries to optimize
well defined interfaces for the application components to this configuration according to different criteria, like QoS
access the container-services. But in contrast to CORBA, or maximum life time. The Application Management is
our container can be tailored for a specific application and supported by a Component Management that manages all
hardware. available application components, as well as a Node Man-
We expect the system to be heterogeneous concerning the

163
Figure 2. Middleware Services
The task of the Node Management comprises, amongst others, the discovery of new nodes and the monitoring (e.g. battery status) of connected nodes. The Component Management can, similar to the Application Management, be placed only on a subset of the available nodes.

After this short overview of all used middleware components, we will discuss some of them in more detail.

2.1 Node Management

The first middleware component we will discuss in more detail is the Node Management. This distributed service is used to collect status information and capabilities of all nodes in the sensor network. The capabilities of the network comprise the available sensors and actuators, the provided communication media, as well as processing power and storage capabilities. In addition, run-time data like battery status, free memory or hardware failures must be monitored. This information can be used to optimize the configuration of the application. Furthermore, the status information can be used for maintenance to identify nodes with heavy load or low energy resources at an early stage and to make arrangements to replace these nodes or their batteries. To gather all this information, it is essential for the whole system that each node announces its presence and keeps its state up to date. A node failure can be detected and reported by neighbor nodes due to the fact that communication to a lost node is not possible anymore. For resource-constrained nodes in the network, a passive version of the node management is sufficient. It is passive in the sense that such nodes provide information about the hosting node but do not collect information about other nodes. More powerful nodes execute active versions of the node management that gather the forwarded information and report changes to the Application Management.

2.2 Component Management

The Component Management provides information about all components available for the entire sensor network. We differentiate between Application Components and Hardware Interaction Components. Hardware Interaction Components are offered on each node with dedicated hardware devices. In contrast, it is possible to locate Application Components on an arbitrary node in the network. To acquire an optimal service placement in the sensor network, the Application Management service needs in-depth knowledge of all interfaces, the provided functionality and the resource requirements (memory consumption, required processor time) of each component. This information is stored, maintained and provided by the Component Management. Different application components may realize a similar functionality. Based on the description of these components, the Application Manager can choose an adequate component based on the available devices and QoS constraints.

2.3 Application Management

This middleware component handles the configuration of the application. The configuration depends on the set of available nodes and their status, the set of software components, the topology and QoS requirements.
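The kind of information the Node Management and Component Management are described as maintaining can be pictured with the following illustrative sketch. The field names, units and the candidate_nodes helper are our own assumptions and are not part of the paper's data model.

# Illustrative sketch (not the paper's data model) of the information the
# Node Management and Component Management maintain.
from dataclasses import dataclass, field
from typing import List

@dataclass
class NodeStatus:
    node_id: int
    sensors: List[str]            # e.g. ["brightness", "temperature"]
    actuators: List[str]
    media: List[str]              # supported communication media
    cpu_power: float              # processing power (abstract units)
    memory_total: int             # bytes
    battery_level: float = 1.0    # run-time data, kept up to date
    memory_free: int = 0
    failed: bool = False

@dataclass
class ComponentDescription:
    name: str
    interfaces: List[str]         # provided in/out ports
    memory_needed: int            # resource requirements
    cpu_needed: float
    required_devices: List[str] = field(default_factory=list)

def candidate_nodes(comp: ComponentDescription, nodes: List[NodeStatus]):
    """Nodes on which the Application Management could place the component."""
    return [n for n in nodes
            if not n.failed
            and n.memory_free >= comp.memory_needed
            and all(d in n.sensors + n.actuators for d in comp.required_devices)]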
Application components can be placed intelligently within the distributed system to minimize network load or to balance the load on the different processors. If, for example, an average value out of a set of redundant sensor results is used at a remote controller, the application component computing this average value should be placed close to the sensors. A new configuration can be obtained by moving the affected software components and updating the routing. The latter is done by reconfiguring the Broker.

2.4 Broker

The component realizing the Broker must be implemented as a local service on each node. The task of this component is to realize the routing at the level of application logic. The routing table of the broker is maintained by the Application Management to guarantee an optimal routing. All messages consumed and/or produced on a specific node need to pass the broker. It is the task of the broker to decide to which components on which nodes the message will be forwarded. In contrast to messages for local services, the messages for non-local services need to be sent over the network. The message, including routing information such as the receiver and the security and reliability requirements, is sent to the Network Service for further processing.

2.5 Network Service

The Network Service is used to communicate with other nodes in the sensor network regardless of the concrete communication medium. In order to get better efficiency and less overhead, we adapt the capabilities of this service while generating the middleware. With adequate hardware and application knowledge, we can decide which communication media are available on a specific platform and which of them are actually needed or used for the specific application. This leads to a quite optimal performance without the need for any manual adaptations. The Network Service implements the end-to-end routing by forwarding the message to the next neighbor on the route and applying the appropriate communication protocol, such as ZigBee. To achieve secure communication, message de- and encryption can be activated in this service to transparently get a secure communication layer for message transport. For better efficiency and because of the resource constraints, only critical messages are encrypted. Which messages are assumed to be critical can be derived from the application model that is described in the next section.

Figure 3. Component Model

3 Domain Specific Language

This section gives an overview of our modeling language used for automatic code generation. To allow an extensive code generation, the modeling language must be generative and descriptive [10]. The models must have explicit execution semantics, hardware characteristics need to be specified, and the interfaces of the components must be described in an unambiguous way. Especially the first requirement excludes the use of standard modeling languages. The widely used Unified Modeling Language (UML), for example, lacks the precision and rigor needed for code generation [9]. Therefore, we decided to design our own domain-specific language that is optimized for use in our specific scenario. Thus, it is possible to create a very simple but powerful modeling language.

Since it is necessary to describe different aspects of the system, we decided to use several sub-models. The hardware model describes the properties of the hardware, the component model describes the interfaces and parameters of the application components, and the application model is used for the specification of the concrete component interplay. All the meta-models are specified using the Eclipse Modeling Framework (EMF, http://www.eclipse.org/modeling/emf/). Several plug-ins for Eclipse are available to specify the models. In the following subsections, we summarize the used models. Due to space limitations, we restrict the description to the characteristics that are necessary for the code generation.

3.1 Hardware Model

The hardware model is used to specify the properties of the used hardware. The main idea behind this model is to adapt the code generation to the specific platform. In addition, the model is used for optimization issues. Within the model, platform specialists can describe essential hardware features like computing power, available memory and supported communication protocols.

3.2 Component Model

The component developer can specify the component interfaces within this model. Each interface is described as a set of in and out ports. Each in and out port can consist of different parameters/variables.
The description of the in and out ports is used for the interaction between the different components. We use an event-based push model for component interaction, similar to data flow diagrams. The activation of an out port is realized by sending a message. This message contains elements for each parameter of the individual out port. The arrival of a message at a specific component triggers the activation of the corresponding in port. Figure 3 shows a simplified version of our meta-model.

3.3 Application Model

The interaction between the different components is specified within the application model. This step comprises a simple wiring of out ports to in ports. Depending on the middleware used, the user also has to specify the mapping of the application components to a specific node or state criteria used for run-time optimization.

4 Code Generation

One key requirement was the application-specific tailoring of the middleware. We are using a template-based code generator [1] to satisfy this requirement. Templates are highly adaptable components. This offers not only the possibility to adjust some parameters of the template, but also to generate strongly application-dependent components of the middleware, like the routing table of the broker. Templates can be used to solve certain aspects of the run-time system, or to combine the results of different templates to form the middleware. Most templates are platform dependent in the sense that they offer a solution only for a certain combination of hardware and operating system. Therefore, the correct selection of adequate templates is also necessary.

Instead of implementing our own code generator, we are using an existing code generation framework called openArchitectureWare (http://www.openarchitectureware.org/) [20]. OpenArchitectureWare provides for these problems a special template language called XPand. XPand offers the statements DEFINE to declare a new code generation function and EXPAND to call other generation functions during the code generation. OpenArchitectureWare also allows polymorphism as one element to select adequate templates.

To specify the control flow of the code generation, the commands FOR/FOREACH and IF/ELSE can be used. The FOREACH statement is used to generate code for each object of a certain type that is declared within the model. Finally, the commands FILE and ENDFILE allow the management of the generated files. The code generation process is then rather simple: the adaptation of the templates to the model is performed using a technique similar to preprocessor macros. Text sequences between the different XPand commands are directly copied to the generated files, and variables allow access to objects and their attributes.

Figure 4 shows a simple template that illustrates the basic concept. The template realizes the generation of links between the components on one node and its Broker in TinyOS 1.x. The required information can be retrieved from the model. The generated code is depicted in Figure 5.

«FOREACH app.componentInstance AS ci»«IF ci.node==n»
Main.StdControl -> «ci.name»C.StdControl;
BrokerC.«ci.name» -> «ci.name»C;
«ENDIF-»
«ENDFOREACH-»

Figure 4. Template

Main.StdControl -> OnOffLEDC.StdControl;
BrokerC.OnOffLED -> OnOffLEDC;
Main.StdControl -> LightClapServiceC.StdControl;
BrokerC.LightClapService -> LightClapServiceC;
Main.StdControl -> SoundSensorC.StdControl;
BrokerC.SoundSensor -> SoundSensorC;

Figure 5. Generated Code

5 First Prototype

Within a first prototype, we have implemented the main features of the approach discussed before. Using our model-driven development tool, the developer can specify the components of the application. The tool supports the automatic generation of an optimized middleware and integrates the application components. The current prototype is based on a static setting of the individual nodes with all nodes in 1-hop distance. The main task of the middleware is to realize the interaction between the different components on one node and between local and remote components. Therefore, only the Broker and the Network Service are necessary within the middleware.

The logical connection for a simple example is depicted in Figure 6. The application is executed on two nodes. A Hardware Interaction Component reads some value from the environment and sends the result to an Application Component. This component computes a control function and sends the result to a second Hardware Interaction Component that outputs the data. The single components interact only with the Broker. The task of the Broker is to forward the event to the relevant components.
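The effect of the template in Figure 4 can be reproduced with a few lines of ordinary Python, which may help readers unfamiliar with XPand. The sketch below is not openArchitectureWare and the model encoding is an assumption; it merely mimics the FOREACH/IF expansion over a toy application model and prints wiring statements analogous to Figure 5.

# Toy re-implementation of what the XPand template in Figure 4 does:
# iterate over the component instances of the application model and emit
# TinyOS 1.x wiring statements for every instance placed on node n.
# (Plain-Python sketch, not openArchitectureWare/XPand.)

application_model = {
    "componentInstance": [
        {"name": "OnOffLED",         "node": 1},
        {"name": "LightClapService", "node": 1},
        {"name": "SoundSensor",      "node": 1},
        {"name": "Display",          "node": 2},   # filtered out for node 1
    ]
}

def expand_broker_wiring(app, n):
    """Equivalent of FOREACH componentInstance ... IF ci.node == n."""
    lines = []
    for ci in app["componentInstance"]:
        if ci["node"] == n:
            lines.append(f"Main.StdControl -> {ci['name']}C.StdControl;")
            lines.append(f"BrokerC.{ci['name']} -> {ci['name']}C;")
    return "\n".join(lines)

print(expand_broker_wiring(application_model, n=1))
# Produces output analogous to Figure 5.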
Figure 6. Example Application

We have implemented components to generate this middleware for the versions 1.1 and 2.0 of TinyOS. In addition, we also implemented components for Windows hosts that allow the easy implementation of graphical user interfaces to enable the interaction of the user with the sensor network. We use ZigBee for node-to-node, RS232 for node-to-host and UDP/IP for host-to-host communication. The physical connection is abstracted by the Network Service.

6 Application Example and Evaluation

The approach and the developed tools were evaluated in the context of an example application realizing the control of a model railway, see Figure 7. For this application, we use MICAz [5] sensor nodes from Crossbow. Several Hardware Interaction Components were implemented to access the different available sensors: brightness, temperature, humidity and volume sensors. In addition, we also implemented a Hardware Interaction Component to enable the easy use of the MDA300 [4] data acquisition board of Crossbow that includes ADC and digital in- and output. These components are of course independent of the concrete application we had in mind. They can be used in completely different scenarios.

In addition, we implemented different application components to calculate the speed and acceleration of the trains. As input, we used the data from the ADC and digital IO channels of the data acquisition board connected to different hardware devices (hall sensor, acceleration sensor). These components were implemented independently of the used platform.

Using these different components, we could implement the complete application. For example, we monitored the brightness to control the lights of the trains for driving in the tunnels and during night. We also allowed the measurement of the train velocity. To demonstrate the interaction between the user and the sensor network, we implemented a signal-horn application. The user can use a graphical user interface running on a Windows PC to control the horn of the trains. The components realizing the interaction with the sensor network were similarly generated by our tool.

6.1 Evaluation

Several criteria can be used to evaluate our approach. We chose to compare our approach with a standard development process regarding the development time, the flexibility, the code size and the code maintainability. Two teams developed the same application. The first team implemented the application from scratch, while the second team used our code generator but had to implement all the components (Hardware Interaction Components and Application Components) and templates for the middleware themselves.

Both teams needed approximately the same time to develop the application. Not surprisingly, the first team could implement the first prototype earlier, since the second team had to implement the middleware and templates first. However, this initial effort was compensated during the development cycle. One reason for this surprising result was that, by defining a meta-model and a middleware architecture, the implementation of the templates and components was straightforward. Of course, the advantages of our approach will be much more significant when the templates and components are reused in further development processes.

Regarding the code size, we expected some overhead of the generated code due to the middleware approach. Nevertheless, the code size of both systems was comparable. The code developed manually had 310 LOC and used 12 kB of flash memory; the generated code had 400 LOC and used 13 kB of flash memory. The reason was that some functionality was repeatedly implemented in the different modules within the first solution. Due to the strict separation of concerns, this problem was avoided in the generated code. Also the maintainability of the generated code was much better. Due to the better design and the documentation, including the models, the readability of the code was significantly improved.

Regarding the flexibility, we could observe the forecasted flexibility. To support the MDA300 evaluation board, we had to switch to version 1.1 of TinyOS due to the unavailability of suited hardware drivers. As a consequence, great parts of the code of the first development process had to be reimplemented due to the mixture of application and system logic. In contrast, in the model-driven approach only the templates had to be adapted to the new operating system, while the application components could be used unchanged. In summary, we could show that our approach has significant advantages. Especially when using the approach in the development of several applications, the development times can be reduced due to template reuse.
Figure 7. Application Example: Model Railway

Figure 8. Evaluation Results

7 Related Work

Different research teams have recently addressed the discussed issue by using macro-programming languages, middleware and component-based approaches for sensor networks [6, 15].

CORBA [14] is a widely used middleware standard, but the implementations are typically too resource consuming to be used in the context of sensor networks. The standards Minimum CORBA [13] and Real-Time CORBA [12] define a smaller subset to minimize these constraints. Nevertheless, with a footprint of about 100 kB, the use of CORBA is not feasible for wireless sensor nodes. The .NET Micro Framework [17] is, with a footprint of about 300 kB, in the same order of magnitude.

The OASiS framework [11] aims at developing a framework that allows designing service-oriented sensor network applications. The design of applications is driven by an object-centric view, i.e., applications are designed in relation to a monitored object. This eases the development of monitoring or tracking applications, which require services to "follow" an observed object through the network. In contrast to our approach, OASiS does not provide automatic code generation.

The RUNES [2] middleware provides a component-oriented programming platform for sensor network applications. The encapsulation in components with well-defined interfaces allows applications to be dynamically reconfigured based on environmental changes, thus providing context-aware adaptations. However, the design and composition of the individual components is still the task of an expert and cannot be done by the end-user himself. In our approach, the adaptation of the individual components is automated by the code generator.

Reusability is addressed in different standards for sensor networks. For home automation, the Konnex (KNX) [8] standard is used to ensure the interoperability of different devices. However, this standard specifies the hardware platforms that are allowed to be used and is therefore not extensible. Furthermore, the standard does not address issues like automatic code generation or tool support during component development.

Industrial-process measurement and control systems can be implemented according to the IEC 61499 [7] standard. Different function blocks can be defined, and a graphical user interface for application development is supported. Similar to our approach, the standard uses an event-based push model. However, the standard only addresses the system and application description and does not standardize the binary representation or the concrete application interfaces. Automatic code generation is not supported.

8 Conclusion

In this paper, we proposed an approach using domain specific languages and template-based code generators to generate an application-specific middleware and to increase reusability.

For the domain specific language, we are using a component-based approach. The sensor network application is interpreted as a set of components that interact via an event-based push model.
Hardware Interaction Components are used to access hardware devices and hide low-level implementation details. Application components implement the aspects of the application functionality and can be implemented platform independently. To form a concrete application, the interaction between the Hardware Interaction Components and the Application Components must be specified in a model-based tool. The interoperability in heterogeneous systems with different nodes and also operating systems is realized by a tailored middleware.

The transformation of the model into executable code and the generation of the middleware are realized by our template-based code generator. Template-based code generators are designed to support easy extension. Therefore, new templates can be easily added to support further platforms. In addition, the middleware can be augmented with new features using this extensibility.

The approach was tested in the context of a model railway. The implementation was done by two teams: one using standard methods, the other using the suggested approach. We could show the advantages of the domain-specific approach, especially regarding the flexibility in relation to the used hardware and operating system as well as the code maintainability. The development times were similar. The main reason here was the non-existence of the required templates for the middleware. We expect a significant acceleration of the development times when using our approach in several projects due to the reuse of templates and components. To prove this assumption, we will conduct a case study in the future. This case study will also focus on additional evaluation criteria such as power consumption and required execution times.

Since we have just started this work, there are a lot of features we have in mind but could not yet implement. In addition to the Broker and Network Service components, we have to implement the other components mentioned in Section 2. In addition, dynamic reconfiguration can be used to cope with node failures at run-time. The sensor network can detect such failures and then reconfigure the network, e.g. by installing affected components on fault-free nodes.

Furthermore, we want to implement Quality of Service (QoS) methods into the network. Examples are the services that deliver the velocity and the acceleration of our model trains. To monitor the velocity, the user would typically select the sensor measuring the velocity. If this sensor fails, the sensor measuring the acceleration could be used as a backup, but with a lower service quality due to measuring imprecision.

References

[1] F. J. Budinsky, M. A. Finnie, J. M. Vlissides, and P. S. Yu. Automatic code generation from design patterns. IBM Systems Journal, 35(2):151-171, 1996.
[2] P. Costa, G. Coulson, C. Mascolo, G. P. Picco, and S. Zachariadis. The RUNES Middleware: A Reconfigurable Component-based Approach to Networked Embedded Systems. In Proc. of the 16th Annual IEEE Intl. Symposium on Personal Indoor and Mobile Radio Communications (PIMRC'05), 2005.
[3] Crossbow Technology, Inc. Crossbow Imote2.Builder.
[4] Crossbow Technology, Inc. MDA300 data acquisition board.
[5] Crossbow Technology, Inc. MICAz wireless measurement system.
[6] S. Hadim and N. Mohamed. Middleware challenges and approaches for wireless sensor networks. IEEE Distributed Systems Online, 7(3), 2006.
[7] International Electrotechnical Commission. IEC 61499: Function blocks.
[8] International Organization for Standardization. ISO/IEC 14543-3: Information technology - Home Electronic Systems (HES) Architecture - Part 3: Communication Layers and Initiation.
[9] I. Johnson, C. Snook, A. Edmunds, and M. Butler. Rigorous development of reusable, domain-specific components, for complex applications. In CSDUML'04 - 3rd International Workshop on Critical Systems Development with UML, 2004.
[10] G. Karsai, J. Sztipanovits, A. Ledeczi, and T. Bapty. Model-integrated development of embedded software. Proceedings of the IEEE, 91(1):145-164, 2003.
[11] M. Kushwaha, I. Amundson, X. Koutsoukos, S. Neema, and J. Sztipanovits. OASiS: A Programming Framework for Service-Oriented Sensor Networks. In International Conference on Communication System Software and Middleware (COMSWARE 2007), 2007.
[12] Object Management Group. Real-Time CORBA specification, Jan 2005.
[13] Object Management Group. CORBA for embedded specification, version 1.0 beta 1, Aug 2006.
[14] Object Management Group. Common Object Request Broker Architecture (CORBA) specification, version 3.1, Jan 2008.
[15] A. Rezgui and M. Eltoweissy. Service-oriented sensor-actuator networks: Promises, challenges, and the road ahead. Computer Communications, 30(13):2627-2648, 2007.
[16] SOS. https://projects.nesl.ucla.edu/public/sos-2x/doc/.
[17] D. Thompson and C. Miller. Introducing the .NET Micro Framework, 2007.
[18] TinyOS. http://www.tinyos.net/.
[19] M. Torngren, D. Chen, and I. Crnkovic. Component-based vs. model-based development: A comparison in the context of vehicular embedded systems. In EUROMICRO '05: Proceedings of the 31st EUROMICRO Conference on Software Engineering and Advanced Applications, pages 432-441, Washington, DC, USA, 2005. IEEE Computer Society.
[20] M. Voelter, C. Salzmann, and M. Kircher. Model Driven Software Development in the Context of Embedded Component Infrastructures, pages 143-163. 2005.
[21] M. Voelter, A. Schmid, and E. Wolff. Server Component Patterns: Component Infrastructures Illustrated with EJB. John Wiley & Sons, Inc., New York, NY, USA, 2002.
Energy Efficient Object Tracking in Sensor Networks by Mining Temporal Moving Patterns

Vincent S. Tseng (1), Kawuu W. Lin (2) and Ming-Hua Hsieh (1)

(1) Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C.
(2) Department of Computer Science and Information Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan, R.O.C.
E-mail: tsengsm@mail.ncku.edu.tw

Abstract

Object tracking sensor networks (OTSNs) have received extensive attention from researchers in recent years due to their wide range of applications. One important research issue in OTSNs is the energy saving strategy, considering the limited power of sensor nodes. Past studies on energy saving in OTSNs usually considered the movement behavior of objects to be random. However, in some real applications, the object movement behavior often carries certain patterns instead of being completely random. In this paper, we propose an efficient data mining algorithm named TMP-Mine with a special data structure named TMP-Tree for efficiently discovering the temporal movement patterns of objects in sensor networks. Moreover, we propose novel location prediction strategies that employ the discovered temporal movement patterns so as to reduce the prediction errors for energy saving. Through empirical evaluation on simulated data, TMP-Mine and the proposed prediction strategies are shown to deliver excellent performance in terms of scalability, accuracy and energy efficiency.

Keywords: Location prediction; Temporal movement patterns; Object tracking; Sensor networks; Data mining.

1. Introduction

Energy-efficient tracking of objects in sensor networks has attracted extensive attention recently. With the capability of widespread surveillance, sensor networks are applied to a lot of applications, such as environmental data collection and object tracking. Effectively modeling the behavior patterns of objects in the sensor networks can benefit energy conservation a lot. However, the intrinsic limitations such as power constraints, synchronization, deployment, and data routing bring numerous research challenges [17]. In this paper, we focus on the problem of energy saving in object tracking sensor networks (OTSNs).

In an OTSN, each sensor node is composed of sensing, data processing, and communication components [8]. Cooperation is an important issue in OTSNs. For example, using a velocity-based strategy to track moving objects requires a velocity sensing component, which is an energy-expensive device and not necessary equipment for all sensor nodes. Therefore, one of our research goals is to propose energy efficient strategies by using intelligent software mechanisms without adding energy-expensive components.

A number of past studies explored the energy saving issue in terms of hardware design. For instance, the optimization problem of the communication cost by inactivating the RF radios of idle sensor nodes was widely discussed [3]. However, these studies did not consider the energy saving issues for these components [15], although the sensing and computing components consume relatively less energy than radios [8]. Several researchers tried to save energy through software approaches like the scheduling of sensors. One of the novel ideas is to put a sensor node into sleeping mode
when there are no objects in its coverage region, and a sensor node is activated again when an object enters its sensing region. Based on this idea, the studies on energy saving in OTSNs can be further divided into two categories: non-prediction based tracking and prediction based tracking. The intuitive non-prediction based tracking method is to periodically turn the sensor nodes off and only activate them when it is time to monitor their sensing regions. The prediction based methods use information about a moving object, like velocity or moving direction, to predict the next location the object might visit.

In this paper, we propose two prediction-based strategies for predicting the location of a missing object in OTSNs by utilizing discovered temporal movement patterns (TMPs). The first prediction strategy, named PTMP, is capable of making predictions by employing TMPs with no need to detect the object velocity. Hence, it can be applied to sensor networks with low-end sensor nodes. The second strategy, namely PES+PTMP, is a hybrid approach integrating the PTMP method with a popular velocity-based strategy named PES [15]. This integrated strategy can further enhance the energy efficiency if the sensor nodes carry velocity detection capability. Through empirical evaluation under various simulation conditions, TMP-Mine and the proposed prediction strategies are shown to deliver excellent performance in terms of scalability, accuracy and energy efficiency.

The rest of this paper is structured as follows. We briefly review the related work in Section 2. In Section 3, we describe the problem definition, and the data mining algorithm, namely TMP-Mine [11], is presented in Section 4. Section 5 gives a detailed description of the prediction strategies. The empirical evaluation for the performance study is presented in Section 6. The conclusions are given in Section 7.

2. Related Work

For energy saving policies in sensor networks, a number of past studies tried to solve this issue from the aspect of hardware design. For instance, the optimization problem of the communication cost by inactivating the RF radios of idle sensor nodes was widely discussed [3]. There are also a lot of research efforts in energy efficient media access control (MAC) [18]. Several researchers tried to save energy through the software design approach. In [4], the authors developed tree structures for efficient object tracking by considering the physical network structure. In [15], Xu et al. proposed a network model in which a sensor node is activated only if there exist some objects in its coverage region. The work presented in this paper is a simplified study based on [14].

For the research on behavior mining, numerous studies have been done on mining users' behavior patterns, like association rules or sequential patterns, in the WWW [7] and transactional databases [1]. In [7], the authors proposed a method named WAP-Mine for fast discovery of web access patterns from web logs by using a tree-based data structure without candidate generation. Previous studies on the mining of temporal databases include [1], [9]. In [1], the authors proposed a method for mining transactions to discover time-ordered patterns named sequential patterns. In [9], a method using a sliding window to restrict the time gap between sets of transactions in mining sequence patterns was proposed. In the category of mobility mining, most of the existing research focused only on the analysis of user movement behavior [17]. To discover patterns from two-dimensional mobility data, the problem of mining location-associated service patterns was first studied by Tseng et al. [13]. A novel method for discovering users' sequential movement patterns associated with requested services in mobile web systems was also proposed by Tseng et al. [12].

In the area of behavior prediction, some researchers proposed variations of Markov models, such as the Dependency Graph (DG) [5], Prediction-by-Partial-Match (PPM) [6] and the N-gram model [10], for predicting user behavior in the WWW. Basically, these methods employ the last N page views of the user to predict the next view by using the patterns discovered. Yang et al. [16] studied association-rule based sequential classifiers and systematically considered features of association rules such as order, adjacency, and recency to construct prediction models from web logs.

3. Problem Statement

In this section, we first state the problem. Afterwards, we describe the network environment and the behavior issues of moving objects. The performance metrics are also described at the end of this section.

In this work, we adopt a network model for OTSNs as proposed in [15], in which a sensor node is activated only if there is an object in its coverage/sensing region. Moreover, we assume the movement log of objects is collectable [11] and the trajectory of each object is represented in the form S = <(l1, t1) (l2, t2) ... (ln, tn)>, where li represents the sensor node location at time ti. The log is considered a valuable resource since it contains the habitual patterns of objects. The targeted problem is two-fold: 1) efficient discovery of temporal movement patterns (TMPs) for objects, and
2) location prediction by utilizing TMPs for energy saving.

To solve the problem described above, we shall discover TMPs of the form P = <(l1, i1, l2, i2, ..., ir-1, lr)>, where ik semantically represents the time interval between two traversed locations. Moreover, we shall generate temporal movement rules (TMRs) of the form Rt = <(l1, i1, l2, i2, ..., lm-1, im-1)> → <(lm)> for incorporation into the location prediction mechanisms so as to achieve low energy and a low missing rate in the OTSNs.

Note that we assume the behavior of moving objects is often based on certain underlying events instead of being completely random [11], [12], [13], [17]. An event is a stream of locations with time intervals. Note that the characteristics of events in OTSNs include not only locations but also time intervals. The network model for the movement behavior of objects will be given in detail in Section 6.1.

In solving the targeted problem, some important performance metrics should be considered. In this work, we adopt two popular metrics named Total Energy Consumed (TEC) [15] and missing rate [15]. TEC indicates the total energy consumed by sensor nodes in the OTSN during the data mining and object tracking phases. Missing rate denotes the number of erroneous predictions in a specified time period in ratio to the total number of movements of objects.

4. Data Mining Algorithm: TMP-Mine

In order to discover temporal movement patterns efficiently, we utilize TMP-Mine [11]. The main advantages of TMP-Mine are 1) it constructs a variation of a prefix tree called TMP-Tree to aggregate the temporal moving patterns in memory in a compact form so that the mining of frequent patterns can be done efficiently, and 2) it can manipulate the two-dimensional moving patterns, including location and time attributes, simultaneously. It recursively constructs the TMP-Trees and mines the trees until a termination condition is met. Below, we describe how the TMRs are generated.

For a discovered TMP, Pt = <(l1, i1, l2, i2, ..., lm)>, the form of the corresponding TMR Rt and the definitions of confidence conf(Pt) and strength strength(Pt) are given as:

Rt = <(l1, i1, l2, i2, ..., lm-1, im-1)> → <(lm)>   (1)

conf(Pt) = sup(<(l1, i1, l2, i2, ..., lm)>) / sup(<(l1, i1, l2, i2, ..., lm-1, im-1)>) × 100%   (2)

We term the last location of the antecedent, namely lm-1, the LLocation. Besides, in order to reveal the strength of each rule, each rule is ranked by the following formula that considers both support and confidence:

strength(Rt) = sup(Pt) × conf(Pt)   (3)

Since a large number of rules could be generated, most traditional data mining methods need a function utilizing hashing tables [10] or hashing trees [1] to accelerate rule access. However, we do not need any accelerating function for accessing the rules, and the rules will be deployed over the network based on the location-related criterion. Therefore, dispatching TMRs to sensors by the LLocation of each TMR requires only one scan over the physical rule repository. Take the antecedent <(l1, i1, l2, i2, ..., lm-1, im-1)> of a TMR for example. Since the LLocation of the antecedent is lm-1, the sensors to load this rule are those within the neighboring radius of lm-1. Considering that a rule that has been dispatched will not be used again in the future, no accelerating function is needed in our application.

In Section 6, we will show through experimental results that ranking rules by strength instead of support or confidence can save more energy. Moreover, if two or more rules have the same strength value, the rule with the larger confidence will be given higher priority over other rules.

5. Proposed Prediction Strategies

In this section, we describe how the discovered TMPs and TMRs are applied to predicting the location of each missing object. The TMRs generated as described in Section 4 are deployed over the sensor network by loading the location-related TMRs into the corresponding nodes. We propose two prediction strategies, namely PTMP and PES+PTMP, for achieving the prediction tasks. PTMP is a non-velocity based prediction strategy that exploits the TMRs to predict the location of the missing object, while PES+PTMP is a hybrid strategy that incorporates the well-known velocity-based strategy named PES with PTMP, using both the information of the detected velocity and the TMRs.

In an OTSN, a location prediction requires two message transmissions in order to know whether the missing object is recovered. The additional communication cost introduced by the message transmission is the energy consumed by the transmission and receiving operations between the radio components of two sensor nodes and the activation power of the two nodes. Our pattern-based prediction strategies use the ranked TMRs one at a time in predicting the location. For simplicity and generality, the real-time constraint for prediction is represented by the number of predictions, or TOP-N predictions, in this paper. Hence, a tight real-time constraint corresponds to a low TOP-N value, and a loose constraint corresponds to a higher value.
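The rule ranking just described (formulas (2) and (3), with ties broken by confidence) can be illustrated with a short Python sketch. The TMR data structure and the example support values below are our own assumptions for illustration; only the formulas themselves come from the paper.

# Sketch of how TMRs could be ranked by strength (formulas (2) and (3)).
from dataclasses import dataclass
from typing import Tuple

@dataclass
class TMR:
    antecedent: Tuple          # (l1, i1, l2, i2, ..., l_{m-1}, i_{m-1})
    consequent: str            # l_m
    sup_pattern: float         # sup(<(l1, i1, ..., lm)>)
    sup_antecedent: float      # sup(<(l1, i1, ..., l_{m-1}, i_{m-1})>)

    @property
    def confidence(self):      # conf(Pt), formula (2)
        return self.sup_pattern / self.sup_antecedent * 100.0

    @property
    def strength(self):        # strength = sup x conf, formula (3)
        return self.sup_pattern * self.confidence

    @property
    def llocation(self):       # last location of the antecedent
        return self.antecedent[-2]

rules = [
    TMR(("A", 2, "B", 3), "C", sup_pattern=0.20, sup_antecedent=0.25),
    TMR(("A", 2, "B", 3), "D", sup_pattern=0.04, sup_antecedent=0.25),
]
# Rank by strength; ties broken by the larger confidence, as described above.
ranked = sorted(rules, key=lambda r: (r.strength, r.confidence), reverse=True)
for r in ranked:
    print(r.consequent, round(r.strength, 2), round(r.confidence, 1))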
Figure 1 shows the PTMP algorithm for recovering missing objects. The N-gram method is used to induce the most likely location an object will visit next based on its previous movement behaviour. Two variants, namely standard N-gram and N+-gram, are considered here. The algorithm begins with composing the antecedent for prediction (line 1). The initial antecedent is obtained by concatenating the location, arrival time and leaving time of the object. Then, either N+-gram or N-gram (as shown in Figure 3) is invoked to recover the missing object (lines 3 to 7). If the object can be recovered by the assigned N-gram method, the OTSN continues to track the object (lines 8 to 10). Otherwise, we subtract the number of error predictions from the specified TOP-N value, namely α (line 11). At the end of each round, the algorithm checks the value of α. If α is greater than 0, it means that there is still remaining time for more predictions. Hence, we extend the neighboring radius in each round for obtaining more TMRs. The new antecedent is obtained by removing the LLocation of the bvr and summing up the last two interval values (line 13). The purpose of regenerating the new antecedent is to seek more TMRs for prediction. Take the antecedent <(l1, i1, l2, i2, ..., lm-1, im-1)> for example. Suppose that the object is currently at location lm-1 and there are no more TMRs for prediction; the antecedent will then be modified to <(l1, i1, l2, i2, ..., lm-2, im-2+im-1)>, where the LLocation is removed and the last two intervals are summed for seeking more predictions. Note that the flooding method will be invoked to recover the object if the location of the object cannot be predicted or no more predictions are allowed (line 15).

Input: N-gram value n, N-gram method Nm, TOP-N constraint α, Neighboring radius r, and Ranking method R
Output: return whether the object can be found by PTMP
Method: PTMP (n, Nm, α, r, R)
1.  bvr ← Object's historical movement behavior
2.  FOR i=1 to r
3.    IF Nm = N+-gram method
4.      call PTMP-N+-gram (bvr, n, R, α)
5.    ELSE IF Nm = N-gram method
6.      call PTMP-N-gram (bvr, n, R, α)
7.    ENDIF
8.    IF (the object is recovered)
9.      RETURN true
10.   ENDIF
11.   subtract the number of predictions from α
12.   IF (α > 0)
13.     bvr ← remove the LLocation of bvr and sum the last interval values to form the new antecedent for prediction
14.   ELSE
15.     invoke the flooding method to recover the object, and RETURN false
16.   ENDIF
17. ENDFOR

Figure 1. PTMP algorithm.

Figure 2 shows the hybrid prediction algorithm named PES+PTMP for recovering objects. The input parameters are the same as those of PTMP, and it works as follows. It first uses the latest detected velocity of the object to predict its current location (line 1). If the object cannot be recovered by the velocity-based prediction, the algorithm will invoke PTMP to recover the missing object. Here the TOP-N value is reduced by 1 due to the error prediction that has been made by PES (line 5).

Input: N-gram value n, N-gram method Nm, TOP-N constraint α, Neighboring radius r, and Ranking method R
Output: return whether the object can be predicted by PES+PTMP
Method: PES+PTMP (n, Nm, α, r, R)
1. use the latest detected velocity of the object to predict its current location
2. IF (the object is found)
3.   RETURN true
4. ELSE
5.   call PTMP (n, Nm, α-1, r, R)
6. ENDIF

Figure 2. PES+PTMP algorithm.

Figure 3-(a) gives the PTMP-N-gram prediction algorithm. In the beginning, we extract the last n LIPairs from the movement behavior to form the antecedent for prediction (line 1). By using the antecedent, we obtain a consequent set called the prediction set from the TMRs, where the predicted locations are ranked by the specified rule ranking method such as support, confidence or strength (line 2). After the prediction set is obtained, the corresponding sensor nodes will be activated one by one by the original node that lost the object in order to recover the object (line 4). Finally, the algorithm returns whether the object is found by PTMP-N-gram or not.

Figure 3-(b) gives the PTMP-N+-gram prediction algorithm. The spirit of this algorithm is that predicting with a longer antecedent often produces higher precision than with a shorter one [10], [12]. However, the applicability decreases with increasing antecedent length [10], [12]. Therefore, the algorithm starts with a high N-gram value and decreases the N-gram value after each round of the PTMP-N-gram method (line 2). The activated node must report back to the original node whether the missing object is found or not (line 3). The algorithm terminates only if the object is found or the number of predictions exceeds the specified value (line 4).
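Condensed into plain Python, the control flow of Figures 1 and 2 can be sketched as follows. The helper callables (rules_near, activate_and_check, predict_by_velocity, flood) are placeholders for OTSN operations and are our own assumptions; this is a sketch of the described strategy, not the authors' implementation.

# Condensed sketch of the control flow of Figures 1 and 2.

def ptmp(behavior, top_n, radius, rules_near, activate_and_check, flood):
    """Pattern-based recovery: try ranked TMRs, widening the neighborhood."""
    antecedent = list(behavior)                      # alternating location/interval values
    for r in range(1, radius + 1):
        for rule in rules_near(antecedent, r)[:top_n]:
            top_n -= 1                               # each probe costs one prediction
            if activate_and_check(rule.consequent):  # activate the predicted sensor
                return True
        if top_n <= 0:
            return flood()                           # fall back to flooding (line 15)
        # drop the LLocation and merge the last two intervals (line 13)
        if len(antecedent) >= 4:
            antecedent = antecedent[:-3] + [antecedent[-3] + antecedent[-1]]
    return flood()

def pes_plus_ptmp(behavior, top_n, radius, predict_by_velocity, **ptmp_helpers):
    """Hybrid strategy: one velocity-based guess first, then PTMP."""
    if predict_by_velocity():                        # PES-style prediction (line 1)
        return True
    return ptmp(behavior, top_n - 1, radius, **ptmp_helpers)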
Input: Object's historical movement behavior bvr, N-gram value n, and Ranking method R
Output: whether the object can be found by PTMP-N-gram or not
Method: PTMP-N-gram (bvr, n, R, α)
1. atc ← the last n location-interval pairs from bvr
2. csq ← get the consequent by the antecedent atc from the TMRs ranked by R
3. FOR j=1 to min(α, |csq|)
4.   activate sensor sj and check whether the object is in sj
5.   if the object is found, RETURN true
6. ENDFOR
7. RETURN false

(a)

Input: Object's historical movement behavior bvr, N-gram value n, Ranking method R, and TOP-N constraint α
Output: whether the object can be found by PTMP-N+-gram or not
Method: PTMP-N+-gram (bvr, n, R, α)
1. FOR i = n down to 1
2.   call PTMP-N-gram (bvr, i, R, α); subtract the number of predictions from α
3.   if the object is found, RETURN true
4.   if α <= 0, RETURN false
5. ENDFOR
6. RETURN false

(b)

Figure 3. (a) PTMP-N-gram algorithm. (b) PTMP-N+-gram algorithm.

6. Experimental Evaluation

In this section, we evaluate the proposed prediction strategies by measuring the TEC and missing rate under different time constraints. To select the best ranking method for TMRs, we measured the missing rate obtained by applying support, confidence, or strength to rank the TMRs. Moreover, an evaluation of the variations of PTMP is also discussed. In the object tracking experiments, 80% of the simulated data are used for training to obtain the TMRs, and the remaining 20% are taken as the testing set for object tracking.

6.1 Experimental Setup

To evaluate the performance of the proposed methods, we implemented a simulator that generates the workload data of an OTSN. The details of the simulation model are described in Section 6.1.1.

6.1.1 Simulation Model

In the base experimental model, the network is modelled as a mesh network with size |W| = 20*20, and there are N (defaulted as 10,000) objects in this network. Initially, each object arrives at the network on an arbitrary outer sensor node deployed outside of the sensor network at some time. We assume that the behavior of moving objects in the OTSNs is event-driven instead of completely random. Hence, we use two parameters, le and Pe, to model the average event length and the event probability, respectively. The length of each event is modelled by a Poisson distribution with mean le (defaulted as 4). The event probability indicates the probability for an object to adhere to a certain event, and it is modelled by a Normal distribution with mean Pe (defaulted as 0.6). The events of a node are structured by a tree, in which the fan-out of each node is modelled by a Normal distribution with mean F (defaulted as 2). Each object in the network may move by adhering to a certain event or randomly. When an object is in random movement, it will move back with probability Pb (defaulted as 0.1) or randomly move to other nodes in the hexagon network structure with probability Pn = (1-Pb)/(6-1). The node staying time is modelled by an Exponential distribution with mean I (defaulted as 4). The tracking time for each object is set as 120 seconds. We assume the sensing coverage range is 15 m and the average object velocity is set as 15 m/s. For communications between the sensor nodes and the base stations, we utilize a well-known routing algorithm named shortest path multi-hop as used in [15]. We adopted the Rockwell WINS node [19] as our basis in simulating the energy consumption. More detailed power analyses of WINS nodes can be found in [8], [13], [19]. The default value settings for the parameters reflect a reasonable and compact environment for OTSNs and mobile systems as in related studies [2], [4], [15].

6.2 Performance of Prediction Strategies

In the following series of experiments, we measure the TEC and the missing rate of the proposed prediction strategies. TEC indicates the total energy consumed by the OTSN in tracking all objects, and the missing rate is the ratio of the erroneous predictions to the total number of movements of objects within a specified deadline. The goal of the prediction strategies is to track the moving objects with low TEC and a low missing rate. Throughout the performance study on the prediction strategies, we use 80% of the simulated data as the training set to obtain TMRs, and the remaining 20% as the testing set for object tracking.
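One possible reading of how the Section 6.1.1 parameters could be drawn is sketched below. This is our own interpretation, not the authors' simulator: the standard deviation of the Normal event probability and the exact sampling details are assumptions (the paper only gives the means).

# Rough sketch of the parameter draws implied by Section 6.1.1 (our reading).
import math, random

LE, PE, PB, STAY_MEAN = 4, 0.6, 0.1, 4            # defaults from the paper
NEIGHBORS = 6                                     # hexagon network structure
PN = (1 - PB) / (NEIGHBORS - 1)                   # prob. of each other neighbor

def poisson(mean):
    """Knuth's algorithm; used here for the event length (mean le = 4)."""
    l, k, p = math.exp(-mean), 0, 1.0
    while p > l:
        k += 1
        p *= random.random()
    return k - 1

def draw_step():
    """One movement decision of an object that is not following an event."""
    if random.random() < PB:
        return "move back to the previous node"
    return "move to one of the other %d neighbors (prob. %.3f each)" % (NEIGHBORS - 1, PN)

event_length = poisson(LE)                                 # length of a new event
follows_event = random.random() < random.gauss(PE, 0.1)    # Normal(Pe); std dev assumed
staying_time = random.expovariate(1 / STAY_MEAN)           # node staying time, mean 4
print(event_length, follows_event, round(staying_time, 1), draw_step())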
6.2.1 Selection of Ranking Method

Figure 4 shows the impact on the missing rate when the TMRs are ranked by strength, support and confidence, with the training data occupying 80% of the dataset. It is clear that the strength-ranking approach delivers the overall lowest missing rate among the three ranking methods. Moreover, it is observed that the confidence-ranking method has the worst performance in missing rate, since this kind of ranking might recommend a rule with high confidence but very low support. The strength-ranking method considers both the support and the confidence of a rule and is demonstrated to have the best performance in terms of missing rate.

Figure 4. Missing rate for using strength, support, and confidence to rank the TMRs (missing rate vs. TOP-N).

6.2.2 Performance of Variations of PTMP

Figure 5 shows the performance of PTMP-N-gram and PTMP-N+-gram in terms of TEC and missing rate with TOP-N varied from 1 to 7. As shown in Figure 5, the TEC of PTMP with 1-gram (denoted as PTMP-1-gram) and of PTMP-3+-gram decreases greatly with the increase of TOP-N. Comparatively, the TEC for PTMP-2-gram and PTMP-3-gram decreases much more slowly. This phenomenon can be explained by investigating the number of generated TMRs. In our experiments, it is observed that the average number of TMRs stored in each sensor node with length greater than or equal to 2 is about 3.56 on average, which is much less than that with length equal to 1 (about 7.70). Therefore, PTMP-2-gram and PTMP-3-gram will often invoke the flooding recovery for the missing objects due to the few available TMRs.

The reason why the TEC of PTMP-1-gram decreases greatly with the increase of the TOP-N value is that more TMRs are used for prediction. Note that the number of sensor nodes activated by the flooding method is (6×1 + 6×2 + ... + 6×m) = 6×(m + m²)/2, where the value 6 is the number of neighboring sensors in the hexagon network structure and m is the distance (in number of sensors) between the missing object and the original sensor node.

Figure 5. TEC for PTMP-N-gram and PTMP-N+-gram with TOP-N value varied (total energy consumption (J) vs. TOP-N).

6.2.3 Comparisons of Different Prediction Methods

This experiment investigates the performance of different prediction methods in terms of TEC, i.e., the efficiency in energy saving. Four kinds of prediction methods are compared, namely Continuous Monitoring (CM) [15], PES [15], PTMP and PES+PTMP. Here, PES+PTMP is a hybrid method integrating the PES(Destination, Instant) method with PTMP. The reason we choose PES(Destination, Instant) for integration is described below. Through our experiments we found PES(Destination, Instant) to be the most energy efficient method proposed in [15]. This is because we use the parameters Pb and Pn in our model to simulate the activities of objects and do not follow the assumption that the object highly intends to move straight forward [15], which benefits the other PES variations. We observed that the accuracy of the other variations is not absolutely higher than that of PES(Destination, Instant), but the energy penalty is much higher because more than one node will be activated for searching the missing object.

Figure 6. The TEC with support threshold varied for CM, PTMP and PES+PTMP (total energy consumption (J) vs. support threshold (%)).
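As a quick sanity check of the flooding cost quoted in Section 6.2.2, the snippet below sums the ring sizes 6·1 + 6·2 + ... + 6·m of a hexagonal grid and verifies that the total matches the closed form 6×(m + m²)/2. The function name is our own.

# Number of sensors activated by the flooding method, m hops around the node
# that lost the object (hexagonal grid: 6*ring nodes per ring).
def flooded_nodes(m: int) -> int:
    return sum(6 * ring for ring in range(1, m + 1))   # 6*1 + 6*2 + ... + 6*m

for m in (1, 2, 3, 5):
    assert flooded_nodes(m) == 6 * (m + m * m) // 2    # matches the closed form
    print(m, flooded_nodes(m))                         # e.g. m=3 -> 36 activated nodes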
be activated for searching the missing object. Figure 6 [5] V. Padmanabhan, J. Mogul, Using Predictive Prefetching
shows the experimental results. Note that CM, to Improve World Wide Web Latency, ACM Computer
PES(X=0.1) and PES(X=0.5) are not influenced by Communication Review 26(3), 1996.
varied support threshold, and the TEC results for them [6] T. Palpanas, A. Mendelzon, Web Prefetching Using
Partial Match Prediction, in: Proc. of the 4th Web Caching
are shown as PES(X=0.1) > PES(X=0.5) > CM. We Workshop, 1999.
then explain the phenomenon by the following [7] J. Pei, J. Han, B. Mortazavi-Asl, H. Zhu, Mining Access
observations. If an object changes its velocity or Patterns Efficiently from Web Logs, in: Proc. of the 4th
moving direction when the corresponding sensor node Pacific Asia Conf. on Knowledge Discovery and Data
is in sleeping mode, PES(X=0.1) incurs a higher probability of missing the object than PES(X=0.5).

7. Conclusions

In this paper, we propose a pattern-based prediction strategy named PTMP and a hybrid strategy named PES+PTMP that integrates the PES method with PTMP. The pure pattern-based prediction strategy works without detecting the object velocity; hence, it can be applied to sensor networks with low-end sensor nodes. The hybrid strategy, which exploits both the object velocity and the movement patterns, was shown to outperform PTMP and PES in terms of energy consumption in an OTSN. Therefore, the hybrid strategy serves as an excellent mechanism for OTSNs in which the sensors are equipped with velocity-detection ability. To adapt to the limited storage and weak computation ability of sensor nodes, a rule dispatching mechanism is also devised that complies with a location-based criterion. The experimental evaluation shows that ranking rules by the strength criterion delivers better results in terms of TEC and missing rate than ranking by confidence or support.

Acknowledgement

This research was supported by the Ministry of Economic Affairs, Taiwan, R.O.C., under grant no. 95-EC-17-A-02-51-024, and by the National Science Council, Taiwan, R.O.C., under grant no. NSC 96-2221-E-006-143-MY3.
A Space-Time Network Optimization Model for Traffic Coordination and its Evaluation

Nirav Shah, Subodha Kumar, Farokh Bastani, and I-Ling Yen
Nirav Shah is with the Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083-0688 (e-mail: nirav@utdallas.edu). Subodha Kumar is with the Information Systems and Operations Management Department, Michael G. Foster School of Business, University of Washington, Seattle, WA 98195-3200 (e-mail: subodha@u.washington.edu). Farokh Bastani is with the Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083-0688 (e-mail: bastani@utdallas.edu). I-Ling Yen is with the Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083-0688 (e-mail: ilyen@utdallas.edu).

Abstract

In transportation systems, the existing infrastructure can potentially be used more efficiently by deploying intelligent net-centric solutions that coordinate vehicles and traffic signals in real-time. For capacity planning and assessing the cost/benefit tradeoffs of intelligent net-centric coordination infrastructures, it is essential to determine the performance of optimal solutions, i.e., the best possible traffic flow that can be achieved. Given the scale and complexity of transportation systems, it may not be feasible to actually achieve these optimal performances in practice. However, if the results show that substantial improvements are possible by simply using the current physical roadway infrastructures more effectively, then one can justify the cost of deploying intelligent vehicle/traffic-light coordination systems.

In this paper, we demonstrate these concepts through a case study of scheduling vehicles on a grid of intersecting roads. We develop heuristic algorithms and an optimization model using the space-time network for this problem, and compare them. Moreover, we also compare the space-time network modeling technique with the integer programming optimization approach and show that the former is better for modeling traffic coordination systems.

1. Introduction

Modern transportation systems depend heavily on computing and communication devices for monitoring and controlling the system. Some examples of such transportation systems include scheduling autonomous guided vehicles (in factories or in container ports [Soh1996]), railway traffic controllers [Mat2002], air traffic controllers, and road traffic management systems [Feb2003] [Fer2001] [Var1993]. Such systems are large-scale and complex in nature. Moreover, such systems, being safety-critical physical systems, have hard real-time processing requirements. At the same time, it is essential for such transportation systems to be efficient, i.e., to coordinate vehicles such that their travel times are minimized. This is a necessary requirement because increased travel times have several adverse socio-economic and environmental impacts. For example, if the traffic light controllers at the intersections are not adaptive to incoming traffic, the phase switching sequence they compute would make the vehicles wait longer at the intersections. This can eventually result in traffic congestion.

Since transportation systems are very large and complex, it is often difficult to achieve optimal performance over their entire range of operating conditions. Instead, heuristic approaches are most commonly used to find a scheduling solution. Since efficiency is an important requirement for transportation systems, it is imperative to evaluate the deployed heuristic solution.

While there are numerous approaches available to evaluate a deployed solution, comparing its performance with that of an optimal solution is an attractive alternative because it represents the best possible performance of the system. Integer programming techniques are usually used to compute the optimal performance. However, with integer programming, only a very small vehicle scheduling problem can be optimally analyzed [Sha2007], because the underlying scheduling problem is strongly NP-hard. In this paper, we formulate the problem as a space-time network, which is a special type of multi-commodity flow network [Ahu1993]. This approach is interesting for two reasons: the transportation system can be readily modeled as a flow network, and flow networks usually reduce the complexity of the problem significantly. Therefore, we are able to analyze the optimal performance for larger systems than can be analyzed using integer programming formulations [Sha2007].

We demonstrate the effectiveness of the multi-commodity space-time network flow model using a case study of a traffic coordination system that schedules vehicles on a grid of intersecting roads with the objective of minimizing the average transit time of the vehicles. The result of this model is then used to evaluate the performance of two heuristic algorithms.
The rest of the paper is organized as follows: Section 2 reviews the related works in the area of traffic management and modeling and analysis techniques for computing optimal performance. Section 3 defines the traffic scenario addressed in this paper, while Section 4 presents the optimization model based on the space-time network approach. In Section 5, we present two heuristic algorithms. In Section 6, we compare these heuristics with the optimal results. Finally, Section 7 summarizes the paper and outlines some future research directions.

2. Related Works

Many researchers have studied various problems in road traffic management systems, railway traffic control systems, air-traffic control systems, and coordination of autonomous guided vehicles. Most existing works in intelligent transportation systems address methods that improve infrastructure utilization and reduce travel times. To augment the existing work, our focus is not to develop a particular method for traffic coordination but, instead, to develop an optimization model using multicommodity network flow, called the "Space-Time Network". This model computes the minimum possible travel time for the vehicles, and provides a benchmark for evaluating potential traffic coordination strategies.

Varaiya [Var1993] and Sotelo, et al. [Sot2000] demonstrate methods that increase the utilization of road infrastructures by reducing inter-vehicle gaps. They achieve this by grouping a few vehicles into a "platoon," and show that the inter-vehicle distance between the vehicles in a platoon is significantly reduced. Porche, et al. [Por1996] [Por1997] develop a traffic light control algorithm that is adaptive to the queue lengths at the intersections. Febbraro, et al. [Feb2003] propose a method that minimizes the travel time for emergency vehicles while reducing disruptions to regular traffic. Ferreira, et al. [Fer2001] present an algorithm to control intersection phase timing based on feedback from neighboring intersections and traffic queue lengths. Mamei, et al. [Mam2003] demonstrate a decentralized traffic coordination scheme in which vehicles find the path of least congestion to their final destinations through decentralized coordination. Giridhar and Kumar [Gir2006] propose a method to schedule automated vehicles where a scheduler provides timed trajectories to the vehicles such that deadlocks and collisions do not occur. Matsumoto, et al. [Mat2002] propose a method to improve railroad infrastructure utilization using an autonomous decentralized train control system. Similarly, Soh, et al. [Soh1996] describe a decentralized method to dynamically route autonomous guided vehicles.

The multicommodity network flow model has been widely used for modeling transportation systems, as shown by Ahuja [Ahu1993]. Ahuja, et al. [Ahu2005] use space-time networks for assigning locomotives to trains. Vaidyanathan [Vai2007] uses a multicommodity network flow for scheduling crews for railroads. Agarwal [Aga2007] solves the ship scheduling and cargo scheduling problems using multicommodity network flows. Helme [Hel1992] proposes a method to reduce air traffic delays, and Bertsimas [Ber2000] developed an approach to dynamically route aircraft under changing weather conditions; both of these works used multicommodity network flows for computing the solution. Similarly, Zawack [Zaw1987] used a space-time network to model road traffic congestion.

3. The Problem

We study the problem of scheduling vehicles on a grid of intersecting roads. The grid is formed by V bi-directional north-south and H bi-directional east-west roads, having a total of H×V intersections. For simplicity, we assume that each road has only one lane per direction. We assume that the arrival process of the vehicles has a Poisson distribution. Moreover, we assume that each vehicle has sufficient intelligence to maintain a minimum safe distance relative to other vehicles on the same road, to prevent collisions in the same direction. For simplicity, we restrict the movement of the vehicles such that each vehicle travels on the same road throughout the grid and exits the grid without turning. In addition, each vehicle adheres to the speed limit laws. The objective is to find the performance of the optimal schedule for each vehicle at each intersection in the grid such that the average transit time (the total time that the vehicles take to travel through the grid) is minimized.

Theorem: The problem of scheduling vehicles on the grid of intersecting roads is strongly NP-hard.

To prove this theorem, we construct an instance of the vehicle scheduling problem and show that it is equivalent to a known NP-hard problem, namely, the problem of scheduling non-preemptive jobs on a processor when the jobs have arbitrary release times. The details of the proof are given in Shah, et al. [Sha2007].
4. Optimization Model

4.1. The Space-Time Network for the problem

The objective of the vehicle scheduling model is to find an optimal scheduling sequence of vehicles such that the average transit time of all the vehicles is minimized. We formulate this problem as a multicommodity network flow problem with side constraints, called the "intersection space-time network." In the intersection space-time network, each node represents both a time and a location. Moreover, the projection of any arc on the horizontal axis (the distance axis) is the distance the vehicles cover by traveling on that arc. The projection of an arc on the vertical axis (the time axis) represents the propagation delay on that arc. We illustrate this formulation for a road with two intersections for one direction of flow in Fig. 1.

In this network, there are two types of nodes for each intersection for each direction of flow. These nodes are called the arrival nodes and the departure nodes, respectively. Each arrival node represents an arrival event (arrival of one or more vehicles) at that intersection at the specified time for the given flow. Similarly, each departure node represents the departure event (crossing the intersection) at that time for the given flow.

This model has three types of arcs, namely, the road arcs, the crossing arcs, and the wait arcs. Each road arc represents the journey of the vehicles between consecutive intersections within a given time interval for a given flow. Each wait arc represents the vehicles waiting at the intersection for a given flow. The crossing arcs model the behavior of vehicles crossing the intersection.

The vehicles arrive at an arrival node either by a road arc from a different intersection, or through a wait arc from the previous arrival node of the same intersection. (The latter represents the vehicles waiting to cross the intersection.) If there are vehicles at an arrival node, one of the following events occurs:

1) One vehicle crosses the intersection by traveling over the crossing arc. All the remaining vehicles wait, i.e., they reach the next arrival node by traveling through a wait arc.

2) None of the vehicles cross the intersection if a vehicle from a conflicting flow is crossing the intersection. In this case, all the vehicles wait, i.e., they reach the next arrival node by traveling through a wait arc.

Each waiting vehicle joins a subsequent wait arc until it can eventually be scheduled at the intersection.

4.2. Formulation

We formulate the vehicle scheduling problem as a flow of different vehicles on various arcs of a space-time network. As explained earlier, the vehicles on the road arcs are traveling between adjacent intersections. The vehicles on the wait arcs are waiting to cross the intersection. Finally, the vehicles on the crossing arcs are crossing the intersection.

[Fig. 1. Space-Time network for two intersections for a single direction of flow]
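To make the structure described in Section 4.1 concrete, the following sketch builds the node and arc sets for a single road with two intersections and one direction of flow, mirroring Fig. 1. This is not the authors' code; the number of arrival events and the delay values are illustrative assumptions.

```python
# Illustrative sketch (assumptions: 4 arrival events per intersection and made-up delays).
# Nodes are ("A", j, k) = j-th arrival node and ("D", j, k) = j-th departure node
# of intersection k; arcs follow the road/wait/crossing structure of Section 4.1.
from collections import namedtuple

Arc = namedtuple("Arc", "kind tail head delay")

def build_network(num_events=4, t_road=20, t_cross=2, t_wait=2):
    arcs = []
    for k in (1, 2):                                   # two intersections
        for j in range(1, num_events + 1):
            arr, dep = ("A", j, k), ("D", j, k)
            # crossing arc: j-th arrival node to j-th departure node of the same intersection
            arcs.append(Arc("crossing", arr, dep, t_cross))
            if j < num_events:
                # wait arc: j-th arrival node to (j+1)-th arrival node of the same intersection
                arcs.append(Arc("wait", arr, ("A", j + 1, k), t_wait))
    for j in range(1, num_events + 1):
        # road arc: j-th departure node of intersection 1 to j-th arrival node of intersection 2
        arcs.append(Arc("road", ("D", j, 1), ("A", j, 2), t_road))
    return arcs

if __name__ == "__main__":
    for arc in build_network():
        print(arc)
```

The arc list produced this way is the raw input that the flow formulation of Section 4.2 operates on.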
In the real world, during the go (green) phase at an intersection, it is desirable to permit more than one vehicle to cross the intersection if they are traveling in the same direction, following one another, and maintaining some safe distance between each other. Moreover, as long as the vehicle flow from one direction (say north-south) is crossing the intersection, vehicles from another conflicting direction (say east-west) cannot be permitted to cross the intersection. The east-west flow can only be permitted after the last vehicle from the north-south flow has crossed the intersection. Thus, there is some delay in phase switching, which we refer to as the "switching delay." When vehicles from the same direction are scheduled consecutively at an intersection, we need to ensure that these vehicles maintain some fixed inter-vehicle gap. We model this gap by introducing a small delay while scheduling vehicles that are traveling in the same direction. We select the length of the wait arcs suitably to model this delay.

To formulate this problem, we use the following notations:

(i) x_i: The flow (total number of vehicles) on arc i.
(ii) t_i: The propagation delay on arc i.
(iii) r_jkl: The jth road arc between intersections (k-1) and k for flow along direction l. This arc connects the jth departure node of the (k-1)th intersection to the jth arrival node of the kth intersection, for the flow in direction l.
(iv) w_jkl: The jth wait arc along direction l at intersection k. This arc connects the jth arrival node to the (j+1)th arrival node at intersection k, for the flow in direction l.
(v) c_jkl: The jth crossing arc at intersection k for direction l. It connects the jth arrival node of intersection k (in direction l) to the jth departure node of intersection k (in direction l).
(vi) s: Switching delay between phase transitions.
(vii) CrossingArcs: The set of crossing arcs for the entire network.
(viii) AllArcs: The set of all the arcs in the network.
(ix) Intersections: The set of all the intersections in the network.
(x) Directions: The set of directions of flows in the network.
(xi) ArrivalNodes[k,l]: The set of arrival nodes at intersection k for flow in direction l.
(xii) ConflictingDirections: The set of all direction pairs (l1, l2) where traffic from directions l1 and l2 cannot be permitted to cross the intersection concurrently.

The objective of the formulation is to minimize the total transit time of the vehicles. Transit time is the sum of the travel times between intersections (time taken on road arcs), the crossing time at the intersections (travel time on crossing arcs), and the wait time at the intersections (the time spent waiting on the wait arcs). Each vehicle spends its time in the system either traveling over the road arcs and the crossing arcs, or waiting at the intersections by joining the wait arcs. When vehicles travel at the speed limits, the transit time of the vehicles can be improved only by minimizing the delay experienced by vehicles while waiting at the intersections, i.e., by minimizing the number of wait arcs that a vehicle joins. Hence, by minimizing the flow on the wait arcs, the transit time of the vehicles can be minimized. In the objective function, we minimize the total flow on all arcs, which in turn minimizes the total flow on the wait arcs. The objective function can be expressed as follows:

Minimize $z = \sum_{i \in AllArcs} t_i x_i$

subject to:

$x_i \le 1, \quad \forall i \in CrossingArcs$  (1)

$x_{r_{jkl}} + x_{w_{(j-1)kl}} = x_{c_{jkl}} + x_{w_{jkl}}, \quad \forall j \in ArrivalNodes[k,l], \forall k \in Intersections, \forall l \in Directions$  (2)

$x_{c_{jkl}} = x_{r_{j(k+1)l}}, \quad \forall j \in ArrivalNodes[k,l], \forall k \in Intersections, \forall l \in Directions$  (3)

$x_{c_{jkl}} + x_{c_{pkq}} \le 1, \quad \forall j \in ArrivalNodes[k,l], \forall k \in Intersections, \forall l, \forall q \in ConflictingDirections, \forall p = j, j+1, j+2, \dots, j + (s / t_{w_{jkl}}), \ s \ge 1$  (4)

Constraint set (1) ensures that at most one vehicle can travel on a given crossing arc of an intersection. Consecutive vehicles in the same direction must maintain a minimum required safe distance; constraint set (1), together with the length of the wait arcs, helps us model this behavior. Constraint sets (2) and (3) are flow balance equations. Constraint set (2) enforces all incoming vehicles at an arrival node (either from a road arc or from a wait arc) to leave (either to the crossing arc or to the next wait arc). Similarly, the flow balance constraint (3) guarantees the flow conservation for the departure nodes.
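As a rough illustration of how the objective and constraint sets (1)-(3) can be written down as a mathematical program, the sketch below uses the open-source PuLP modeling library (an assumption; the paper's own experiments use CPLEX). The arc names, delays, and the single flow-balance constraint are toy placeholders standing in for the full network.

```python
# Illustrative sketch, not the authors' model file. Assumes the PuLP library;
# "r1"/"c1"/"w1"/"r2" are placeholder arcs with made-up delays.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum

t = {"r1": 20, "c1": 2, "w1": 2, "r2": 20}        # propagation delays t_i
crossing_arcs = ["c1"]
all_arcs = list(t)

# per-arc flow variables x_i (vehicle counts)
x = {a: LpVariable(f"x_{a}", lowBound=0, cat="Integer") for a in all_arcs}

prob = LpProblem("intersection_space_time_network", LpMinimize)
prob += lpSum(t[a] * x[a] for a in all_arcs)      # objective: minimize total weighted flow

for c in crossing_arcs:                           # constraint set (1): one vehicle per crossing arc
    prob += x[c] <= 1

# constraint set (2), shown for a single arrival node: incoming road flow
# (plus the previous wait arc, omitted here for the first event) equals
# crossing flow plus the next wait arc
prob += x["r1"] == x["c1"] + x["w1"]

# constraint set (3): whatever crosses the intersection feeds the next road arc
prob += x["c1"] == x["r2"]

# prob.solve()  # would invoke PuLP's bundled solver on the toy model
```

The conflicting-direction constraints of set (4) would be added in the same way, one inequality per pair of conflicting crossing arcs within the switching-delay window.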

The departure nodes could be omitted by merging the crossing arcs and the road arcs, such that the delay on the resulting arc is the sum of the travel time required to cross the intersection and the time to travel between these intersections. However, we retain the departure nodes in the model for clarity. As explained earlier, as long as vehicles from one direction are crossing the intersection, the vehicles from another flow in its conflicting direction cannot start at the intersection until the last vehicle of the previous flow has completely crossed the intersection. This behavior is modeled by constraint set (4).

The purpose of this optimization model is not to develop a specific scheduling sequence of vehicles at the intersections, but to compute the best possible travel times for the vehicles for the given scenario, and to serve as a benchmark to evaluate potential heuristic algorithms.

5. Heuristic Algorithms

We use the performance of the optimal scheduling method as a benchmark for evaluating two heuristic algorithms, namely, the "fixed platoon size, variable period algorithm" and the "fixed period, variable platoon size algorithm" [Sha2007]. We assume that we know the average traffic arrival rate on every road in the grid and apply clock-driven scheduling methods [Liu2000] to generate the virtual platoons. Each vehicle must join the next available virtual platoon to travel through the grid.

In the fixed platoon size, variable period algorithm, we vary the period of the platoons, but fix the length of the platoons. To ensure schedulability over all the intersections, we select an intersection having the maximum arrival rates for the north-south and the east-west roads. Without loss of generality, we assume that the arrival rate on the north-south road is k times the arrival rate on the east-west road, i.e., $P_{EW} = k \times P_{NS}$, where $P_{EW}$ denotes the period of platoons on the east-west road and $P_{NS}$ represents the period of platoons on the north-south road. If e is the time each platoon needs to cross the intersection, for a feasible solution to exist, the following condition should be satisfied:

$e / P_{NS} + e / P_{EW} \le 1$.

Thus, we obtain

$P_{NS} \ge e(1 + k) / k$  and  $P_{EW} \ge e(1 + k)$.

We select appropriate values for $P_{NS}$ and $P_{EW}$ and use these values for all the other intersections to create a schedule.

In the fixed period, variable platoon size algorithm, we fix the period for all the platoons to be the hyperperiod H of the system, and vary their lengths (execution times) depending on the arrival rates. Thus, for a given intersection, the crossing time (execution time) for the north-south platoon $E_{NS}$ and the crossing time for the east-west platoon $E_{EW}$ can be calculated as:

$E_{NS} = H \cdot A_{NS} / (A_{NS} + A_{EW})$  and  $E_{EW} = H \cdot A_{EW} / (A_{EW} + A_{NS})$.

Here, $A_{NS}$ and $A_{EW}$ represent the arrival rates on the north-south and the east-west roads, respectively. To ensure that the schedule is feasible over all the intersections, we select the execution time of the platoons based on the following conditions. If r is a north-south road and j is an intersection on r, then the execution time of the platoons on road r, $E_r$, is:

$E_r = \min(E_{NS}(j)), \ \forall j \in r, \ r \in R_{NS}$.

Similarly, for an east-west road q having intersection l,

$E_q = \min(E_{EW}(l)), \ \forall l \in q, \ q \in R_{EW}$.

Here, $R_{NS}$ and $R_{EW}$ represent the set of north-south and the set of east-west roads, respectively. From the execution time of the platoons and the hyperperiod H, we can obtain a schedule for the intersections.

These heuristics can be implemented in a variety of ways depending on the degree of automation. In a conventional system, traffic lights at an intersection perform phase switching depending on the schedule generated by the heuristics. In this case, the vehicles start from or stop at the intersection based on the current state of the traffic lights.

In contrast, in a fully automated net-centric system, each intersection could be equipped with a road-side intersection controller and the vehicles could be equipped with real-time communication and processing capabilities. Every intersection controller can broadcast the schedule in real-time. Based on the broadcast from the intersection controller, each approaching vehicle can compute the time instance at which it can cross that intersection. In addition, to make the journey smoother, the vehicles could attempt to reduce or eliminate stopping at the intersections. To achieve this, each vehicle could autonomously compute its travel speed such that it would reach the intersection just when it is allowed to start from that intersection.
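The platoon parameters used by the two heuristics follow directly from the formulas above. The sketch below is a minimal illustration, not the authors' implementation; the crossing time e, hyperperiod H, ratio k, and arrival rates in the example call are made-up values.

```python
# Illustrative sketch of the platoon-parameter calculations in Section 5.
# All numeric inputs in the example are assumptions for demonstration only.

def fixed_platoon_size(e, k):
    """Fixed platoon size, variable period: smallest feasible periods when the
    north-south arrival rate is k times the east-west rate (P_EW = k * P_NS)."""
    p_ns = e * (1 + k) / k
    p_ew = e * (1 + k)
    assert e / p_ns + e / p_ew <= 1 + 1e-9      # feasibility condition
    return p_ns, p_ew

def fixed_period(H, a_ns, a_ew):
    """Fixed period, variable platoon size: split the hyperperiod H between the
    two flows in proportion to their arrival rates (E_NS, E_EW)."""
    return H * a_ns / (a_ns + a_ew), H * a_ew / (a_ns + a_ew)

def road_execution_time(H, rates_at_intersections):
    """Per-road platoon length E_r: the minimum over the road's intersections,
    so the schedule remains feasible everywhere."""
    return min(H * a_ns / (a_ns + a_ew) for a_ns, a_ew in rates_at_intersections)

if __name__ == "__main__":
    print(fixed_platoon_size(e=4.0, k=2.0))            # platoon periods (s)
    print(fixed_period(H=60.0, a_ns=0.6, a_ew=0.3))    # platoon lengths (s)
    print(road_execution_time(60.0, [(0.6, 0.3), (0.4, 0.5)]))
```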

6. Experimental Study

In this section, we compare the performance of the heuristic algorithms with the optimal solution. We obtained the optimal solution by solving the space-time model (presented in Section 4) using CPLEX version 8.1.0. Moreover, we also compare two different optimization methods, namely, an integer programming approach to find the optimal solution, as explained in [Sha2007], and an optimization method using the space-time model.

[Fig. 2. Comparison of algorithms with the optimal solution]

[Fig. 3. Comparison of the results by different optimization methods]

[Fig. 4. Performance of fixed period variable length algorithm for different grid sizes]

We restricted the analysis to a 2×2 grid of roads having 4 intersections because the computation time increases exponentially for the optimization methods, given the strongly NP-hard nature of this problem. However, the heuristic algorithms are capable of finding solutions for larger grid sizes. For simplicity, we assumed a perfect grid. For the evaluation of the algorithms, we fixed the arrival rates on one north-south and one east-west road to 0.3 vehicles/sec and increased the arrival rates on the other roads. We allowed the heuristic algorithms to compute the average transit time of the vehicles over a period of about 2 logical hours, covering several thousand vehicles. However, for the optimization model (using the space-time network), we computed average transit times for a smaller data set, having approximately 70 to 200 vehicles, because of the NP-hardness of the problem. Fig. 2 shows the comparison of the heuristics to the optimal solution. It is evident that the heuristics are comparable in performance to the optimal solution for low to moderate arrival rates. The fixed period, variable platoon length algorithm performs better than the fixed platoon length, variable period algorithm at higher arrival rates, as shown in Fig. 2. However, there is still some potential for improvement, as shown by the value of the optimal solution. Since the fixed period, variable platoon length algorithm is better than the fixed platoon, variable period algorithm, we study the performance of this algorithm for different grid sizes and present the results in Fig. 4. It is evident that the point at which this algorithm degenerates (i.e., the transit time increases rapidly) is independent of the grid size.
Fig. 3 shows the comparison of the results obtained by the different optimization methods. We compare the optimal solution obtained by solving the space-time network model presented in this paper to the optimal solution presented in [Sha2007]. When the integer programming model was used to compute the optimal solution, it was not feasible to find an accurate solution for arrival rates greater than 2 vehicles/sec. Moreover, the optimal solution could only be calculated over a very small dataset, having just 14 vehicles in the grid. More details are given by Shah, et al. in [Sha2007]. However, by computing the optimal solution using the space-time network model, we could find accurate optimal solutions for arrival rates greater than 2 vehicles/sec. In addition, we could successfully compute the optimal solution for larger datasets, ranging from 70 to 200 vehicles in the grid, when we used the space-time network model. Thus, this experiment demonstrates the effectiveness of developing an optimization model using a space-time network as compared to the integer programming model.

7. Summary and Future Research

In this paper, we demonstrate the necessity of developing optimization models to find the performance of optimal solutions for transportation problems. We use a case study of scheduling vehicles on a grid of intersections to explain this concept. We develop an optimization model using a space-time network flow model. We compare the performance of two heuristic algorithms with the optimal performance obtained by solving the space-time network model, and show that these heuristics are comparable in performance to the optimal solution in the low to moderate traffic arrival rate range. Moreover, since optimization models can be developed using different methods, we compare the model developed using a space-time network with that developed using the integer programming approach proposed by Shah, et al. [Sha2007]. We conclude that the space-time network flow is a superior approach compared to integer programming for the problem studied because it can handle larger traffic rates as well as more complex roadway systems.

As future research, we plan to develop scheduling methods that can handle turning movements at intersections and also handle scheduling of emergency vehicles. We also plan to extend the algorithms to be adaptive to dynamic fluctuations in traffic.

8. References

[Aga2007] R. Agarwal and O. Ergun, "Ship Scheduling and Network Design for Cargo Routing in Liner Shipping," Transportation Science, forthcoming.

[Ahu1993] R. Ahuja, T. Magnanti and J. Orlin, Network Flows: Theory, Algorithms and Applications, Prentice Hall, 1993.

[Ahu2005] R. Ahuja, J. Liu, J. Orlin, D. Sharma and L. Shughart, "Solving Real-Life Locomotive Scheduling Problems," Transportation Science, Nov 2005, Vol. 39, No. 4, pp. 503-517.

[Ber2000] D. Bertsimas and S. Patterson, "The Traffic Flow Management Rerouting Problem in Air Traffic Control: A Dynamic Network Flow Approach," Transportation Science, Aug 2000, Vol. 34, Issue 3, pp. 239-255.

[Feb2003] A. Febbraro, D. Giglio and N. Sacco, "On Controlling Privileged Vehicles by Means of Coordinated Traffic Lights," Proceedings of the Intelligent Transportation Systems Conference, IEEE, Oct 2003, Vol. 2, pp. 1318-1323.

[Fer2001] E. Ferreira, E. Subrahmanian and D. Manstetten, "Intelligent Agents in Decentralized Traffic Control," Proceedings of the Intelligent Transportation Systems Conference, IEEE, Aug 2001, pp. 705-709.

[Gir2006] A. Giridhar and P. Kumar, "Scheduling Automated Traffic on a Network of Roads," IEEE Transactions on Vehicular Technology, Sep 2006, Vol. 55, No. 5, pp. 1467-1474.

[Hel1992] M. Helme, "Reducing Air Traffic Delay in Space-Time Network," International Conference on Systems, Man and Cybernetics, IEEE, Oct 1992, Vol. 1, pp. 236-242.

[Liu2000] J. Liu, Real-Time Systems, Prentice Hall, 2000.

[Mam2003] M. Mamei, F. Zambonelli and L. Leonardi, "Distributed Motion Coordination with Co-fields: A Case Study in Urban Traffic Management," Proceedings of the Sixth International Symposium on Autonomous Decentralized Systems, IEEE, Apr 2003, pp. 63-70.

[Mat2002] M. Matsumoto, M. Sato, S. Kitamura, T. Shigeta and N. Amiya, "Development of Autonomous Decentralized ATC System," Proceedings of the Second International Workshop on Autonomous Decentralized Systems, Nov 2002, pp. 310-315.

[Por1996] I. Porche, M. Sampath, R. Sengupta, Y. Chen and S. Lafortune, "A Decentralized Scheme for Real-Time Optimization of Traffic Signals," Proceedings of the International Conference on Control Applications, IEEE, Sep 1996, pp. 582-589.

[Por1997] I. Porche and S. Lafortune, "Dynamic Traffic Control: Decentralized and Coordinated Methods," Proceedings of the Conference on Intelligent Transportation Systems, IEEE, Nov 1997, pp. 930-935.

[Sha2007] N. Shah, S. Kumar, F. Bastani and I. Yen, "An Optimization Model for Rigorously Assessing Efficient Heuristics for Traffic Coordination at Intersections," Proceedings of the 10th International IEEE Conference on Intelligent Transportation Systems, IEEE, Sep 2007, pp. 12-17.
[Soh1996] J. Soh, W. Hsu, S. Huang and A. Ong, "Decentralized Routing Algorithms for Automated Guided Vehicles," Proceedings of the 1996 ACM Symposium on Applied Computing, ACM, Feb 1996, pp. 473-479.

[Sot2000] J. Sotelo Jr, D. Vilela and M. B. Leonel, "Automated Flexible Transportation System - AFTS: A New Concept for Urban Mass Transportation," Proceedings of the IEEE Intelligent Transportation Systems, IEEE, Oct 2000, pp. 107-112.

[Vai2007] B. Vaidyanathan, K. Jha and R. Ahuja, "Multicommodity Network Flow Approach to the Railroad Crew Scheduling Problem," IBM Journal of Research and Development, May/Jul 2007, Vol. 51, No. 3/4, pp. 325-344.

[Var1993] P. Varaiya, "Smart Cars on Smart Roads - Problems of Control," IEEE Transactions on Automatic Control, IEEE, Feb 1993, Vol. 38, No. 2, pp. 195-207.

[Zaw1987] D. Zawack and G. Thompson, "A Dynamic Space-Time Network Flow Model for City Traffic Congestion," Transportation Science, Aug 1987, Vol. 21, No. 3, pp. 153-162.
Fair Broadcasting Schedules on Dependent Data in Wireless Environments

Ming-Te Shih and Chuan-Ming Liu
Department of Computer Science and Information Engineering
National Taipei University of Technology
Taipei 106, TAIWAN
mtshih@cht.com.tw, cmliu@ntut.edu.tw

Abstract

In wireless mobile services, data broadcasting provides an effective method to disseminate information to mobile clients. In some applications, the access pattern of all data items can be represented by a weighted DAG. In this paper, we explore how to generate the data broadcast schedule for dependent data, which can be modeled as a weighted DAG, to serve all the mobile clients effectively and fairly. The resulting data broadcast schedules must keep the original dependency and minimize the variance of latency. We prove that it is NP-complete to find such an optimal solution. Due to the NP-completeness, we provide two heuristics to solve the problem. In addition, we discuss the upper and lower bounds on the variance of latency for the problem.

Keywords: Data Broadcasting, Scheduling, Latency, DAGs, NP-completeness.

1. Introduction

Data broadcasting provides an efficient way to disseminate data in wireless environments and has attracted much research attention in recent years [1, 6, 7, 8, 12, 14, 15, 5, 19, 25]. Using data broadcasting, servers can provide public information, such as news services, traffic information, weather reports, and stock-price information, to a large number of mobile clients in a wireless environment. Mobile clients access the information by tuning into the broadcast and listening for the expected information. In order to allow the mobile clients to access the information via the broadcast efficiently, how the servers schedule the data on the broadcast channel becomes important.

To evaluate broadcast schedules, one usually measures the latency and tuning time experienced by the mobile clients. The latency is the time interval from sending a request to the time when the requested information is received. The tuning time measures the actual time a client spends listening to the broadcast. In this paper, we consider that there are access relations among the data and hence all the mobile clients will start to receive their requested data at the beginning of the broadcast cycle. Since no index is considered in the broadcast, we focus on the latency. Our objective is to provide broadcast schedules which make every client experience a similar latency. This differentiates our work from the others, where the major objective focuses on minimizing the average latency to achieve a good quality of service [4, 20].

Most of the previous works on the broadcast scheduling problem assumed that the requested data are independent [13, 25]. In some applications, the requested data are dependent [2, 12, 14]. For instance, consider the scenario where clients request Web pages which contain different components, such as audio clips or images [12, 17]. Figure 1 demonstrates an example showing the access pattern among the data. The different Web pages start with a main page (S) and then may include different components: three articles (a1, a2, and a3), two images (j1 and j2), two video clips (m1 and m2), and two audio clips (p1 and p2). The edges with arrows show the access order and pattern, and the number on each of them indicates the probability of accessing the next component. Other applications include providing stock information and the broadcast data in the on-demand broadcasting model [21].

In this paper, we consider that the data to be broadcast can be represented by a weighted directed acyclic graph (DAG) [9, 14, 22, 24] according to their access frequencies [1, 5, 23] and access pattern [2, 14]. Our objective is to have a wireless data broadcast schedule that allows all the mobile clients to wait for their requests for an equal time interval. We measure the fairness of the broadcast using the variance of latency. On the other hand, due to the data dependency, the resulting broadcast should retain the topological order of the input DAG. A topological order of a DAG is an ordering of the vertices where the precedent vertices in the DAG are broadcast earlier than the following vertices. A broadcast schedule generated without the topological ordering may result in a longer latency.
[Figure 1. An example of the access patterns of all the components in different possible Web pages, where the vertices are the components, the edges indicate the relations between vertices, and the number on each edge presents the probability of accessing the next component.]

However, to have a data broadcast schedule that retains the topological ordering and has a minimized variance of latency is intractable. We first define this problem formally in Section 3 and show the NP-completeness of this problem in Section 4, after giving some background and related works in Section 2. Because of the intractability of the problem, two heuristics are proposed to generate the broadcast schedules in Section 5. In the proposed heuristics, we use the V-shape placement proposed in [18] for minimizing the variance of job completion time when scheduling jobs with different processing times. In Section 6, we discuss the upper and lower bounds on the variance of latency for the problem. Section 7 concludes this paper.

2. Preliminaries

As mentioned, one objective of data broadcast scheduling is to minimize the access latency. Allowing a mobile client to start receiving the data in the middle of a broadcast cycle can reduce the access latency. Thus, many papers considered minimizing the relative distance between the positions of two related data items in order to minimize the average access latency [2, 3, 14, 16]. In particular, Chehadeh et al. [3] assumed that the relation among the data to be accessed can be represented by a directed acyclic graph (DAG) and presented several heuristics according to the following three criteria: (1) linear ordering (i.e., topological ordering), (2) minimum linear distance between any two related objects, and (3) more availability for popular objects. A broadcast schedule with minimized average latency is good on average. However, in some cases, some clients still need to wait for a long time to get the results. In order to have a fair latency for each client, we consider a generated broadcast schedule for a weighted DAG which is not only a topological ordering but also minimizes the variance of latency.

A DAG G = (V, E) is a directed graph that has no cycle, where V is the vertex set and E is the edge set consisting of ordered pairs of vertices. An element (u, v) ∈ E, u, v ∈ V and u ≠ v, is a directed edge from u to v. If (u, v) is an edge in G, we say that vertex u is adjacent to vertex v and vertex v is adjacent from vertex u. We also call the edge (u, v) the in-edge of v and the out-edge of u. The in-degree of a vertex is the number of its in-edges and the out-degree of a vertex is the number of its out-edges. We define the maximum in-degree of a DAG as d_i and the maximum out-degree of a DAG as d_o. Among the vertices in V, a source is a vertex which has no in-edge and a sink is a vertex which has no out-edge. If there is a path P(u, v) from u to v, we say that v is reachable from u via P(u, v). For a vertex v, we define PRE(v) = {u : (u, v) ∈ E, u ∈ V}, which is the set of all the vertices adjacent to v, and ADJ(v) = {w : (v, w) ∈ E, w ∈ V}, which is the set of all the vertices adjacent from v, respectively.

[Figure 2. A weighted DAG of 7 vertices; the weight on an edge represents the access probability after a vertex has been accessed.]
Suppose that, for each edge (u, v), there is a weight P_uv ≤ 1. For any edge (u, v), the direction from vertex u to vertex v represents that u is accessed before v, and the weight P_uv ≤ 1 on (u, v) stands for the probability of accessing vertex v after vertex u has been accessed. The summation of all weights on the edges from a certain vertex is not necessarily equal to 1, since the receiving process may stop at that vertex according to the access patterns. As shown in Figure 2, after vertex s has been accessed, the chances to access vertices a, b, and c are P_sa = 0.2, P_sb = 0.1, and P_sc = 0.5, respectively. So, the probability of stopping at s is 1 - (P_sa + P_sb + P_sc) = 0.2.

To minimize the variance of the latency, we refer to the mechanism of V-shape placement, proposed in [18], to generate the broadcast schedule. According to V-shape placement, if the jobs are ordered by non-increasing completion time, the optimal schedule can be achieved by placing the jobs with longer completion times at both ends of the schedule and the jobs with shorter completion times in the middle of the schedule.

3. Problems

Suppose that the access pattern of the data to be broadcast is modeled as an edge-weighted DAG G = (V, E). We assume that there is only one source s in the DAG G. If the DAG G has multiple sources, we add a pseudo-vertex s which has a directed edge to each source with equal weight. For each vertex v except s in G, there is at least one path from s to v. Suppose P(s, v) is one of the paths from s to v. We define the probability of accessing vertex v via path P(s, v) as $\prod_{(i,j) \in P(s,v)} p_{ij}$. Let $P_v$ be the set of all the paths from the source s to vertex v. The probability of accessing v from s hence is

$\gamma(v) = \sum_{P(s,v) \in P_v} \Big( \prod_{(i,j) \in P(s,v)} p_{ij} \Big)$  (1)

Consider the vertex m in the DAG of Figure 2: γ(m) = 0.2 × 0.8 × 0.9 + 0.2 × 0.7 + 0.1 × 0.4 = 0.288. After normalizing the values by $\beta(v) = \gamma(v) / \sum_{v \in V} \gamma(v)$, one can easily show that β(·) is a probability mass function (p.m.f.). We denote the broadcast schedule as a mapping f from the vertices in the input DAG to the time slots in the broadcast. In other words, f : V → {1, 2, ..., |V|}.

According to the definition of f, f is a random variable. We hence can derive the average E[f(v)] and variance Var[f(v)] of the latency as follows, respectively:

$E[f(v)] = \sum_{v \in V} f(v) \cdot \beta(v)$  (2)

$Var[f(v)] = \sum_{v \in V} (f(v) - E[f(v)])^2 \cdot \beta(v)$  (3)

Recall that the data in the input weighted DAG are dependent. A broadcast schedule generated without considering the access frequencies and the dependency may result in an unfair waiting time and/or a longer latency. Therefore, the generated broadcast schedules should achieve the following two objectives:

1. the topological ordering is preserved;
2. the variance of latency is minimized.

We refer to such a problem as the Topologically Ordered Broadcast Schedule with Fairness (TOBSF) problem, and the definition is given below.

Definition 1. (TOBSF Problem)
Suppose all the notations are defined as above. The Topologically Ordered Broadcast Schedule with Fairness (TOBSF) problem is to find a 1-to-1 mapping f : V → {1, 2, ..., |V|} such that
(1) if (u, v) ∈ E, then f(u) < f(v), and
(2) $Var[f(v)] = \sum_{v \in V} (f(v) - E[f(v)])^2 \cdot \beta(v)$ is minimized.

4. NP-Completeness

In this section, we show that the TOBSF problem is NP-complete. To show this, we first give the corresponding decision problem of the TOBSF problem. We refer to this decision problem as the Topologically Ordered Broadcast Schedule with a Given Variance of Latency (TOBSG) problem.

Definition 2. (TOBSG Problem)
Instance: A DAG G = (V, E), an integer K, and a weight p_uv ≤ 1 for each edge (u, v) ∈ E.
Question: Does there exist a 1-to-1 mapping f : V → {1, 2, ..., |V|} such that
(1) if (u, v) ∈ E, then f(u) < f(v), and
(2) $Var[f(v)] = \sum_{v \in V} (f(v) - E[f(v)])^2 \cdot \beta(v) \le K$.

We first claim that the TOBSG problem is in NP. This can be verified easily since, given a 1-to-1 mapping f and an integer K, it takes linear time to determine whether f(u) < f(v) for every (u, v) ∈ E and whether $\sum_{v \in V} (f(v) - E[f(v)])^2 \cdot \beta(v) \le K$. Then, we show that TOBSG is NP-hard by reducing the General Optimal Linear Arrangement (GOLA) problem to the TOBSG problem. The GOLA problem is NP-hard and is a variation of the Simple Optimal Linear Arrangement (SOLA) problem [11]. The definitions are given below.
Definition 3. (SOLA problem)
Instance: A DAG G = (V, E), |V| = n, and an integer K.
Question: Does there exist a 1-to-1 mapping g : V → {1, 2, ..., n} such that
(1) if (u, v) ∈ E, then g(u) < g(v), and
(2) $\sum_{(u,v) \in E} [g(v) - g(u)] \le K$.

Definition 4. (GOLA problem)
Instance: A DAG G' = (V', E'), |V'| = n, and an integer K'.
Question: Does there exist a 1-to-1 mapping g : V' → {p_1, p_2, ..., p_n}, where 0 < p_1 < p_2 < ··· < p_n and p_1, p_2, ..., p_n are positive integers, such that
(1) if (u, v) ∈ E', then g(u) < g(v), and
(2) $\sum_{(u,v) \in E'} [g(v) - g(u)] \le K'$.

Now we show the reduction. Suppose that I' is an instance of the GOLA problem. An instance I of the TOBSG problem can be constructed from I' as follows.

1. Let V = V' and E = E'.

2. The value of K can be derived from the following relation between K and K'. Suppose the 1-to-1 mapping g exists in I'. We consider $K = Var[f(v)] = \sum_{v \in V} (f(v) - E[f(v)])^2 \cdot \beta(v) = E[f^2(v)] - (E[f(v)])^2$. If Var[f(v)] is a constant, then E[f(v)] is a constant. Let E[f(v)] = m. If g is a permutation of f^2, then for each vertex v in V there exists a vertex u in V' such that f^2(v) = g(u); that is, g : V' → {1, 4, ..., n^2}. Then we get $K + m^2 = \sum_{v \in V} \beta(v) \cdot f^2(v) = \sum_{u \in V'} \beta(u) \cdot g(u) = \sum_{u \in V'} \beta(u) \cdot \{[g(u) - g(a_l)] + [g(a_l) - g(a_{l-1})] + \dots + [g(a_1) - g(s)] + g(s)\}$, where $a_1, a_2, \dots, a_l$ are the vertices along the longest path from s to u. Since g(s) = 1 and $\sum_{u \in V'} \{[g(u) - g(a_l)] + [g(a_l) - g(a_{l-1})] + \dots + [g(a_1) - g(s)] + g(s)\} \le \sum_{(u,v) \in E'} [g(v) - g(u)] \le K'$, we obtain $K + m^2 \le \sum_{u \in V'} \beta(u) \cdot \{[\sum_{(i,j) \in E'} (g(j) - g(i))] + 1\}$ and hence $K + m^2 \le \sum_{u \in V'} \beta(u) \cdot \{K' + 1\} = \{K' + 1\} \cdot \sum_{u \in V'} \beta(u) = K' + 1$.

Given any instance I' of the GOLA problem, we hence can construct an instance I of the TOBSG problem in O(|V| + |E|) time. Furthermore, using the relation between K and K', one can easily show that there is a solution for an instance I' of the GOLA problem if and only if there is a solution for the instance I of the TOBSG problem. We therefore have the following conclusion.

Theorem 1. The TOBSG problem is NP-complete; hence, the TOBSF problem is also NP-complete.

5. Heuristics

In the previous section, we have shown that the TOBSF problem is NP-complete. In this section, we introduce two heuristics for generating the broadcast schedules using V-shape placement. One generates the broadcast schedule level by level according to a BFS traversal on the DAG, and the other does it according to a topological sort. The resulting broadcast schedules retain the topological order and minimize the variance of latency.

5.1 LOAP Algorithm

We first introduce the LOAP (Level-Orientation with Access Probability) algorithm. The LOAP algorithm first uses a Breadth First Search (BFS) traversal together with equation (1) to calculate the level and the access probability of each vertex in V. In order to minimize the variance of latency, we sort the vertices on the same level by their γ values. Then, on each level, we place the vertices into the channel by V-shape placement according to the γ values. Finally, from the lower levels to the higher levels, we concatenate the vertices together to get the data broadcast schedule. Concatenating the vertices from lower level to higher level maintains the topological ordering. Putting them in V-shape using their γ values minimizes the variance of latency. Figure 3 gives a high-level description of the LOAP algorithm.

Input: A weighted DAG, G = (V, E), with source s.
Output: A topological-ordering broadcast schedule with minimum variance of latency.
1. Use BFS to compute the level and γ value of each vertex.
2. Group the vertices into groups using levels and, on each level, sort the vertices in decreasing order by γ.
3. For each level, put vertices on the head side and the tail side alternately from the sorted sequence in Step 2.
4. Concatenate the arranged vertices level by level.

Figure 3. The LOAP algorithm.
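The following sketch illustrates the LOAP idea of Figure 3 in code. It is not the authors' implementation: the DAG representation (a dict of edge probabilities), the helper names, and the use of a Kahn-style topological pass (so that equation (1) sums the probability of every path, which is a slight strengthening of the plain BFS in Figure 3) are all assumptions made for this example.

```python
# Illustrative sketch of LOAP (Figure 3); not the paper's code.
from collections import defaultdict, deque

def levels_and_gamma(dag, source="s"):
    """dag: {u: {v: p_uv}}. Returns the level of each vertex and gamma(v) of eq. (1)."""
    level, gamma = {source: 1}, defaultdict(float)
    gamma[source] = 1.0
    indeg = defaultdict(int)
    for u in dag:
        for v in dag[u]:
            indeg[v] += 1
    queue = deque([source])              # topological (Kahn-style) traversal
    while queue:
        u = queue.popleft()
        for v, p in dag.get(u, {}).items():
            gamma[v] += gamma[u] * p                      # sum of path products
            level[v] = max(level.get(v, 0), level[u] + 1) # take the highest level
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return level, gamma

def v_shape(items):
    """Given items sorted by decreasing key, place the larger ones at both ends."""
    head, tail = [], []
    for i, x in enumerate(items):
        (head if i % 2 == 0 else tail).append(x)
    return head + tail[::-1]

def loap(dag, source="s"):
    level, gamma = levels_and_gamma(dag, source)
    by_level = defaultdict(list)
    for v, l in level.items():
        by_level[l].append(v)
    schedule = []
    for l in sorted(by_level):                            # lower levels first
        schedule += v_shape(sorted(by_level[l], key=lambda v: -gamma[v]))
    return schedule
```

On the example of Section 5.1.1, this arrangement reproduces the level-3 placement e, f, d, g and the overall schedule s, b, a, c, e, f, d, g, i, j, h, k, l, m reported in the text.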

5.1.1 An Example

[Figure 4. A weighted DAG of 14 vertices, each edge having an access probability.]

We use the example in Figure 4 to demonstrate how the LOAP algorithm works. In Step 1, we use a BFS traversal to find the level and compute the γ value of each vertex. We start from the source vertex s and define its level as one. Any vertex adjacent from s is at level two, and so on. If a vertex is adjacent from two or more vertices with different levels, we take the highest level. During the traversal, we can also compute the γ value of each vertex. For example, when determining the level of vertex f, we can also compute γ(f) as γ(f) = γ(a)·P_af + γ(c)·P_cf = 0.1 · 0.2 + 0.3 · 0.2 = 0.08. After the grouping in Step 2, we arrange the vertices in V-shape by their γ values on each level in Step 3. For instance, on level 3, the arrangement for the vertices e, g, f, d is e, f, d, g. In the last step, we concatenate the vertices together from the lower levels to the higher levels to derive the resulting schedule: s, b, a, c, e, f, d, g, i, j, h, k, l, m. The complete result is shown in Figure 5.

[Figure 5. The result of applying the LOAP algorithm on the DAG in Figure 4.]

5.1.2 Correctness and Time Complexity

Because LOAP arranges the vertices level by level, the topological ordering is followed. Besides, since all the vertices in every level are arranged in V-shape by their γ values, the variance of latency is minimized. As for the time complexity, we assume that there are n vertices and e edges in the DAG. In the first step, algorithm LOAP finds the level and computes the γ value of each vertex in a BFS fashion. This costs O(n + e) time. Suppose that, after the first step, there are k levels and each level has at most m vertices. In the second step, algorithm LOAP sorts the vertices at each level in O(km lg m) = O(n lg n) time. Last, algorithm LOAP takes linear time to concatenate all the levels together. Hence, algorithm LOAP needs O(n lg n + e) time in total.

5.2 TSAP Algorithm

We now introduce another algorithm for the TOBSF problem. This algorithm, the Topological Sorting with Access Probability in V-shape (TSAP) algorithm, first arranges all the vertices according to the V-shape placement using the γ values and then arranges the vertices based on topological sort. When there are two or more vertices to be selected at some position i, we select the vertex whose γ value is closest to the γ value of the ith vertex in V-shape order. We therefore define the following notation. For a vertex i_v at a considered position i, let the ith vertex in V-shape order be i_o; we define dist(i_v, i_o) = |γ(i_v) - γ(i_o)|. Hence, if there are two vertices to be selected at position i, say i_u and i_w, and dist(i_u, i_o) < dist(i_w, i_o), we select vertex i_u. The detailed algorithm is presented in Figure 6.

Input: A weighted DAG, G = (V, E), with source s.
Output: A topological-ordering broadcast schedule with minimized variance of latency.
1. Compute the γ value of each vertex in BFS fashion.
2. Sort the vertices by γ values in non-increasing order.
3. Arrange the vertices using their γ values according to the V-shape placement.
4. Do Topological Sort as follows (let i be the current considered position):
   (4.1) Insert s into a minimum heap H with the dist value of s as the key.
   (4.2) i = 1.
   (4.3) While H ≠ ∅:
       (4.3.1) Extract the vertex v from H.
       (4.3.2) Assign v to the current position i.
       (4.3.3) Remove vertex v and its out-edges.
       (4.3.4) Insert the new sources into H.
       (4.3.5) i = i + 1.

Figure 6. The TSAP algorithm.
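The sketch below mirrors the steps of Figure 6. It is an illustration rather than the authors' code, and it reuses the hypothetical helpers levels_and_gamma() and v_shape() from the LOAP sketch above; as in Figure 6, the heap key of a vertex is fixed when the vertex becomes a source.

```python
# Illustrative sketch of TSAP (Figure 6); not the paper's code.
import heapq

def tsap(dag, source="s"):
    _, gamma = levels_and_gamma(dag, source)
    # V-shape order over all vertices, sorted by non-increasing gamma
    target = v_shape(sorted(gamma, key=lambda v: -gamma[v]))
    indeg = {v: 0 for v in gamma}
    for u in dag:
        for v in dag[u]:
            indeg[v] += 1
    schedule, i = [], 0
    heap = [(0.0, source)]                       # (dist to target gamma, vertex)
    while heap:
        _, u = heapq.heappop(heap)
        schedule.append(u)                       # assign u to the current position
        i += 1
        for v in dag.get(u, {}):                 # remove u's out-edges
            indeg[v] -= 1
            if indeg[v] == 0:                    # v becomes a source
                want = gamma[target[i]] if i < len(target) else 0.0
                heapq.heappush(heap, (abs(gamma[v] - want), v))
    return schedule
```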

5.2.1 An Example

We use the same example of Figure 4 to show how the TSAP algorithm works. In the first step, as the LOAP algorithm does, we compute the γ value of each vertex in BFS fashion. In Step 2, we sort the vertices by γ values in non-increasing order. Then, the vertices are arranged according to the V-shape placement using the γ values in Step 3. The two vertices with the largest γ values are placed at the head and tail positions of the placement, respectively. The following vertices are then placed in a similar manner in the order of their γ values. The order in the placement is listed in Figure 7. In the last step, Step 4, we use Topological Sort to finish the scheduling. If there are two or more vertices to be selected at some position i, we select the vertex whose γ value is closest to the γ value of the ith vertex in V-shape order. The complete result is shown in Figure 8.

[Figure 8. The resulting broadcast schedule when applying algorithm TSAP; the final schedule is s, b, c, g, a, f, d, j, l, m, e, i, k, h.]

5.2.2 Correctness and Time Complexity

The resulting broadcast schedule generated by algorithm TSAP follows the topological ordering because we arrange the vertices by topological sorting. It also minimizes the variance of latency, since, during the sorting, we choose the candidate whose γ value is closest to the γ value of the vertex at the corresponding position in V-shape order. Assume that there are n vertices and e edges in the DAG. In the first step, we compute the γ value of each vertex in BFS fashion, which costs O(n + e) time. In the second step, we sort all the vertices, which needs O(n lg n) time. In the third step, we arrange the n vertices with their γ values, which requires O(n) time. In the last step, we use the topological sort and choose the vertex whose γ value is closest to the γ value in V-shape ordering; this takes O(n + e) time. In total, algorithm TSAP takes O(n lg n + e) time.

6. Bounds on the Variance of Latency

In this section, we discuss the bounds on the variance of latency for the TOBSF problem. We first explore the relation between data broadcasting and the Poisson distribution by modeling the data broadcast as a binomial random experiment. The bounds on the variance can then be derived by referring to the bounds on the average latency [20] and the property that the mean and the variance are equal for the Poisson distribution.

6.1 Modeling Data Broadcasting

A binomial random variable X is defined as

$\Pr(X = x) = \frac{n!}{x!\,(n-x)!} \cdot \Big(\frac{\lambda}{n}\Big)^x \cdot \Big(1 - \frac{\lambda}{n}\Big)^{n-x}$,  (4)

where n is the total number of Bernoulli trials, x is the number of successes, and p = λ/n and q = 1 - p are the probabilities of success and failure in every Bernoulli trial, respectively. When n approaches infinity, the binomial random variable becomes the Poisson random variable [10].

We observe that data broadcasting is similar to the binomial random experiment as n approaches infinity. The observations are:

1. The n data items in the broadcast can be the total number of trials.
2. The number of required data items received, x, is the number of successes.
3. The probability of success, p = λ/n, in every Bernoulli trial can be the probability of successfully receiving the required data items.

In the binomial random experiment, all the trials are independent and have the same probability of success. In wireless data broadcast, when n approaches infinity, the probability of successfully receiving the required data items will be equal for all clients. Therefore, data broadcasting can be modeled as the binomial random experiment when n approaches infinity.

Furthermore, when n approaches infinity, a binomial random variable becomes a Poisson random variable. So, the probability of receiving the required data items follows the Poisson distribution, which is

$\beta(v) = \lim_{n \to \infty} \Pr(X = x) = \frac{\lambda^x \cdot e^{-\lambda}}{x!}$  (5)
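As a quick numeric illustration of the limit behind equation (5), the snippet below compares the Binomial(n, λ/n) probability with the Poisson(λ) probability as n grows; the values of λ and x are arbitrary example choices, not taken from the paper.

```python
# Illustrative check (not from the paper): Binomial(n, lambda/n) -> Poisson(lambda) as n grows.
from math import comb, exp, factorial

def binom_pmf(n, lam, x):
    p = lam / n
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(lam, x):
    return lam**x * exp(-lam) / factorial(x)

lam, x = 3.0, 2
for n in (10, 100, 1000, 10000):
    print(n, round(binom_pmf(n, lam, x), 6), round(poisson_pmf(lam, x), 6))
```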

Vertices s b e k c i g h a j l f m d
γ Values 1.0 0.5 0.45 0.4176 0.3 0.214 0.18 0.18 0.1 0.09 0.081 0.08 0.0648 0.05
V-shape 1 14 2 13 3 12 4 11 5 10 6 9 7 8
Order

Figure 7. The access probability (γ value) and the V-shape order for each vertex of the DAG in Figure 4
when applying algorithm TSAP.

6.2 Bounds on the Variance

According to Theorem 2, we can derive the lower and upper bounds on the variance of latency for the TOBSF problem using the bounds on the average latency derived in [20]. A lower bound LB_e for the average latency is

    LB_e = (n(n + 1) / 2) · β_min,    (6)

where β_min = min_{v∈V} {β(v)}. The upper bound is shown in the following lemma.

Lemma 3. An upper bound UB_e for the average latency can be derived as

    UB_e = [ n²(n + 1)(2n + 1) / 6 ]^{1/2}.    (7)

Proof. The proof follows from the Cauchy-Schwarz inequality and the fact that β(v_i) ≤ 1 for all v_i ∈ V:

    ∑_{v_i∈V} f(v_i) · β(v_i) ≤ [ (∑_{v_i∈V} f(v_i)²) · (∑_{v_i∈V} β(v_i)²) ]^{1/2}
                              ≤ [ (∑_{1≤i≤n} i²) · (∑_{1≤i≤n} 1²) ]^{1/2}
                              ≤ [ (n(n + 1)(2n + 1) / 6) · n ]^{1/2} = UB_e.
When the total number of vertices, n, in the weighted DAG approaches infinity, we can use the bounds for the average latency to derive the lower bound (LB_v) and the upper bound (UB_v) for the variance of latency. We further define the factor ρ = LB/UB as an indicator on the bounds for our problem. The following theorems present the upper and lower bounds as well as the factor for the variance of latency.

Theorem 4. The TOBSF problem has a lower bound on the variance of latency

    LB_v = (n(n + 1) / 2) · β_min,

where β_min = min_{v∈V} {β(v)}, as n → ∞.

Theorem 5. The TOBSF problem has an upper bound on the variance of latency

    UB_v = [ n²(n + 1)(2n + 1) / 6 ]^{1/2},

as n → ∞.

Theorem 6. The factor of the ratio on the bounds of the TOBSF problem is

    ρ = LB/UB = (3/2) · β_min.

7. Conclusions

In this paper, we discuss how to arrange dependent data to generate a wireless data broadcast schedule with fair latency. We define such a problem as the TOBSF problem, in which the data dependency and access probability are modeled as a weighted DAG. We show that the TOBSF problem is NP-complete by a reduction from the GOLA problem. We further provide two heuristics, LOAP and TSAP, based on the V-shape placement, to generate broadcast schedules with fairness. These two algorithms cost O(n lg n + e) time. With such broadcast schedules, every mobile client will experience a similar latency to receive the result.

Besides, we derive the factor of the bounds on the variance of latency for the TOBSF problem, which is ρ = LB/UB = (3/2) · β_min when the number of data items approaches infinity. In the future, we will perform simulation work and analyze the differences between the heuristics. Some related issues can also be studied further, such as wireless data broadcast scheduling with multiple channels, with variable-length data items, and with mixed index and data.
References

[1] S. Acharya, M. Franklin, and S. Zdonik. Balancing push and pull for data broadcast. In Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, pages 183–194, 1997.
[2] A. Bar-Noy, J. Naor, and B. Schieber. Pushing dependent data in clients-providers-servers systems. Wireless Networks, 9(5):421–430, 2003.
[3] Y. C. Chehadeh, A. R. Hurson, and M. Kavehrad. Object organization on a single broadcast channel in the mobile computing environment. Multimedia Tools and Applications, 9(1):69–94, 1999.
[4] A. Eden, B. Joh, and T. Mudge. Web latency reduction via client-side prefetching. In Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software, pages 193–200, 2000.
[5] G. Lee, S.-C. Lo, and A. Chen. Data allocation on wireless broadcast channels for efficient query processing. IEEE Transactions on Computers, 51(10), 2002.
[6] S. Hambrusch, C.-M. Liu, W. G. Aref, and S. Prabhakar. Efficient query execution on broadcasted index tree structures. Data and Knowledge Engineering, 60(3):511–529, 2007.
[7] S. Hambrusch, C.-M. Liu, and S. Prabhakar. Broadcasting and querying multi-dimensional index trees in a multi-channel environment. Information Systems, 31(8):870–886, 2006.
[8] S. E. Hambrusch, C.-M. Liu, W. G. Aref, and S. Prabhakar. Query processing in broadcasted spatial index trees. Lecture Notes in Computer Science, 2121:502–510, 2001.
[9] L. He, S. Jarvis, D. Spooner, and G. Nudd. Performance evaluation of scheduling applications with dag topologies on multiclusters with independent local schedulers. In Proceedings of the 20th International Parallel and Distributed Processing Symposium, page 8, 2006.
[10] R. V. Hogg and E. A. Tanis. The Poisson distribution. In Probability and Statistical Inference, pages 131–133. Macmillan Publishing Company, 1989.
[11] E. Horowitz, S. Sahni, and S. Rajasekaran. NP-hard and NP-complete problems. In Computer Algorithms, page 553. W. H. Freeman and Company, 1998.
[12] J.-L. Huang, M.-S. Chen, and W.-C. Peng. Broadcasting dependent data for ordered queries without replication in a multi-channel mobile environment. In Proceedings of the 19th International Conference on Data Engineering, pages 692–694, 2003.
[13] H.-P. Hung and M.-S. Chen. On exploring channel allocation in the diverse data broadcasting environment. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS'05), pages 729–738, 2005.
[14] A. Hurson and Y. Jiao. Data broadcasting in a mobile environment. In Wireless Information Highway, chapter 4, pages 96–154. IRM Press, 2004.
[15] T. Imieliński, S. Viswanathan, and B. R. Badrinath. Data on air: Organization and access. IEEE Transactions on Knowledge and Data Engineering, 9(3):353–372, 1997.
[16] G. Lee, S.-C. Lo, and A. Chen. Data allocation on wireless broadcast channels for efficient query processing. IEEE Transactions on Computers, 51(10), 2002.
[17] Y. Li, Z. Gong, and K. Qi. Detecting the content related parts of web pages. In Proceedings of the 2005 International Conference on Services Systems and Services Management, pages 1071–1074, 2005.
[18] K. H. Lio and R. C. Chen. On scheduling to minimize the variance of job completion time, 1999.
[19] C.-M. Liu and S.-Y. Fu. Effective protocols for kNN search on broadcast multi-dimensional index trees. Information Systems, 33(1):18–35, 2008.
[20] C.-M. Liu and K.-F. Lin. Disseminating dependent data in wireless broadcast environments. Distributed and Parallel Databases, (1):1–25, 2007.
[21] C.-M. Liu, L.-C. Wang, L. Chen, and C.-J. Chang. On-demand data disseminating with considering channel interference for efficient shortest-route service on intelligent transportation system. In Proceedings of the 2004 IEEE International Conference on Networking, Sensing and Control, pages 701–706, 2004.
[22] G. Malewicz, A. Rosenberg, and M. Yurkewych. On scheduling complex dags for internet-based computing. In Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE International, pages 66–66. IEEE, 2005.
[23] S. Hameed and N. H. Vaidya. Efficient algorithms for scheduling data broadcast. ACM/Baltzer Journal of Wireless Networks, 5(3):183–193, 1999.
[24] J. Shen, I. Nikolaidis, and J. Harms. A dag-based approach to wireless scheduling. In Proceedings of the 2005 IEEE International Conference on Communications, page 66, 2005.
[25] B. Zheng, X. Wu, X. Jin, and D. Lee. TOSA: a near-optimal scheduling algorithm for multi-channel data broadcast. In Proceedings of the 6th International Conference on Mobile Data Management, pages 29–37, 2005.

Hovering Information - Self-Organising Information that Finds its Own Storage

Alfredo A. Villalba Castro
Centre Universitaire d'Informatique, University of Geneva, Switzerland
alfredo.villalba@cui.unige.ch

Giovanna Di Marzo Serugendo
School of Computer Science and Information Systems, Birkbeck, University of London, UK
dimarzo@dcs.bbk.ac.uk

Dimitri Konstantas
Centre Universitaire d'Informatique, University of Geneva, Switzerland
dimitri.konstantas@cui.unige.ch

Abstract

A piece of hovering information is geo-localized information residing in a highly dynamic environment such as a mobile ad hoc network. This information is attached to a geographical point, called the anchor location, and to its vicinity area, called the anchor area. A piece of hovering information is responsible for keeping itself alive, available and accessible to other devices within its anchor area. Hovering information uses mechanisms such as active hopping, replication and dissemination among mobile nodes to satisfy these requirements. It does not rely on any central server. This paper presents the hovering information concept and discusses results of simulations performed for two algorithms aiming to ensure the availability of a piece of hovering information at its anchor area.

1 Introduction

Hovering information [7] is a concept characterising self-organising information responsible for finding its own storage on top of a highly dynamic set of mobile devices. The main requirement of a single piece of hovering information is to keep itself stored at some specified location, which we call the anchor location, despite the unreliability of the device on which it is stored. Whenever the mobile device on which the hovering information is currently stored leaves the area around the specified storage location, the information has to hop - "hover" - to another device.

Current approaches in this area (cf. Section 6) either define a virtual structured overlay network on top of this environment, offering a stable virtual infrastructure, or propose a system-based approach offering services such as information dissemination and storage. In these approaches, the mobile nodes decide when and to whom the information is to be sent. Here we take the opposite view: it is the information that decides upon its own storage and dissemination. This opens up possibilities not available to traditional MANET services, such as different pieces of hovering information all moving towards the same location and (re-)constructing there a coherent larger piece of information for a user, e.g. TV or video streaming on mobile phones.

Hovering information is self-organised, user-defined information which does not need a central server to exist. Individual pieces of hovering information each use local information, such as the direction, position, power and storage capabilities of nearby mobile devices, in order to select the next appropriate location. Hovering information benefits from the storage space and communication capacities of the underlying mobile devices. It does not reside on a centralized server and is not bound to any mobile operator.

This paper presents the hovering information concept as well as a preliminary algorithm allowing single pieces of hovering information to get attracted to their respective anchor locations. A complete formal description of the hovering information model is given in [6].

Section 2 discusses potential applications of this concept. Section 3 presents the hovering information concept. Section 4 discusses the Attractor Point algorithm that we have designed, where the information is "attracted" by the anchor location, and a general Broadcast-based algorithm we implemented in order to allow comparisons. Section 5 reports on simulation results related to availability and additional metrics such as the number of messages exchanged or pieces of hovering information replicated. Finally, Section 6 compares our approach to related work, and Section 7 discusses some future work.

2 Applications

When deployed over mobile devices, hovering information is an infrastructure-free service that supports a large range of applications. Among others we can cite: urban security - users (citizens, policemen, security) post and retrieve comments or warnings related to dangers in their urban environment; self-generative art - users of a learning art experience centre provide collective inputs that are self-assembled into a piece of art (painting, music, etc.) generated by a computer according to some rules; intravehicular networks - drivers insert tags into the environment related to road conditions or accidents; emergency scenarios - emergency crews use hovering information to locate survivors or coordinate their work. More generally, hovering information is a technical way to support stigmergy-based applications. Stigmergy is an indirect communication mechanism among individual components of a self-organising system, where communication occurs through modifications brought to the local environment. The use of ant pheromones is a well-known example of stigmergy. Users communicating by placing hovering information at a geo-referenced position, which is later retrieved by other users, are another example of stigmergy. The hovering information concept, using an infrastructure-free storage medium, naturally supports stigmergy-based applications that need to be deployed in an ad hoc manner (e.g. unmanned vehicles or robots).

3 Hovering Information Concept

3.1 Mobile Nodes and Hovering Information

Mobile nodes represent the storage and motion media exploited by pieces of hovering information. A mobile node n is defined as a tuple

    n = (id, loc, speed, dir, r_comm),

where id is its mobile node identifier, loc is its current location (a geographic location), speed is its current speed in m/s, dir is its current direction of movement (a geographic vector) and r_comm is its wireless communication range in meters.

A piece of hovering information is a piece of data whose main goal is to remain stored in an area centred at a specific location called the anchor location, and having a radius called the anchor radius. A piece of hovering information h is defined as a tuple

    h = (id, a, r, n, data, policies, size),

where id is its hovering information identifier, a is its anchor location (a geographic coordinate), r is its anchor radius in meters, n is the mobile node where h is currently hosted (the hosting node), data is the data carried by h, policies are the hovering policies of h and size is the size of h in bytes. The hovering policies state how and when a piece of hovering information has to hover.

We consider that identifiers of pieces of hovering information are unique, but replicas (carrying the same data and anchor information) are allowed on different mobile nodes. We also consider that there is only one instance of a given piece of hovering information in a node n; any other replica resides in another node.

Figure 1 shows a piece of hovering information (blue hexagon) and two mobile nodes (yellow circles). One of them hosts the hovering information, whose anchor location, radius and area are also represented (blue circle). The anchor area is the disc whose center is the anchor location and whose radius is the anchor radius. The communication range of the second mobile node is also shown.

[Figure 1. Mobile Nodes and Hovering Information]

A hovering information system is composed of mobile nodes and pieces of hovering information. A hovering information system at time t is a snapshot (at time t) of the status of the system. We denote by N_t the set of mobile nodes at time t. Mobile nodes can change location, new mobile nodes can join the system and others can leave. New pieces of hovering information can appear (with new identifiers), replicas may appear or disappear (same identifiers but located on other nodes), and hovering information may disappear or change node.

Figure 2 shows two different pieces of hovering information, h1 (blue) and h2 (green), each having a different anchor location and area. Three replicas of h1 are currently located in its anchor area (on three different mobile nodes n2, n3 and n4), while two replicas of h2 are present in the anchor area of h2 (on nodes n2 and n5). It may happen that a mobile node hosts replicas of different pieces of hovering information, as is the case in the figure for the mobile node
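The two tuples translate directly into simple records. A minimal Python sketch of the model (field names follow the tuples above; the concrete types are assumptions made for illustration):

from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

Location = Tuple[float, float]  # assumed (x, y) geographic coordinate

@dataclass
class MobileNode:
    id: str
    loc: Location               # current location
    speed: float                # m/s
    dir: Tuple[float, float]    # direction of movement (geographic vector)
    r_comm: float               # wireless communication range in meters

@dataclass
class HoveringInformation:
    id: str
    a: Location                 # anchor location
    r: float                    # anchor radius in meters
    n: MobileNode               # hosting node
    data: Any                   # carried data
    policies: Dict[str, Any] = field(default_factory=dict)  # hovering policies
    size: int = 0               # size in bytes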
n2, which is at the intersection of the two anchor areas. The arrows also represent the communication range possibilities among the nodes.

[Figure 2. Hovering Information System at time t]

3.2 Properties - Requirements

Survivability. A piece of hovering information h is alive at some time t if there is at least one node hosting a replica of this information. The survivability over a period of time is defined as the ratio between the amount of time during which the hovering information has been alive and the overall duration of the observation. The survivability of h between time t_c (the creation time of the piece of hovering information) and time t is given by

    SV_H(h, t) = (1 / (t − t_c)) ∑_{τ=t_c}^{t} sv_H(h, τ),

where sv_H(h, τ) takes value 1 or 0 depending on whether h is alive or not at time τ.

Availability. A piece of hovering information h is available at some time t if there is at least one node in its anchor area hosting a replica of this information. The availability of a piece of hovering information over a period of time is defined as the ratio between the amount of time during which this information has been available and the overall time. The availability of h between time t_c and time t is given by

    AV_H(h, t) = (1 / (t − t_c)) ∑_{τ=t_c}^{t} av_H(h, τ),

where av_H(h, τ) takes value 1 or 0 depending on whether h is available or not at time τ.

Accessibility. A piece of hovering information is accessible by a node n at some time t if the node is able to get this information; in other words, if there exists a node m in the communication range of the interested node n which contains a replica of the piece of hovering information. The accessibility of a piece of hovering information h is the ratio between the area covered by the hovering information's replicas and its anchor area. The accessibility of h between time t_c and time t is given by

    AC_H(h, t) = (1 / (t − t_c)) ∑_{τ=t_c}^{t} ac_H(h, τ),

where ac_H(h, τ) is the ratio between the area covered by the hovering information's replicas and its anchor area. The interested reader can refer to [6] for a full set of definitions.

Note that an available piece of hovering information is not necessarily accessible and, vice versa, an accessible piece of hovering information is not necessarily available. Figure 3 shows different cases of survivability, availability and accessibility. In Figure 3(a), hovering information h (blue) is not available, since it is not physically present in the anchor area; however, it survives as there is a node hosting it. In Figure 3(b), hovering information h is now available as it is within its anchor area, but it is not accessible from node n1 because of the scope of the communication range. Finally, in Figure 3(c), hovering information h survives, and is available and accessible from node n1.

[Figure 3. Survivability, Availability and Accessibility]
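All three properties are time averages of indicator (0/1) observations, so they are straightforward to compute from a simulation log. A small sketch, assuming per-tick boolean observations are already available (the tick length and values are hypothetical):

def time_average(indicators):
    """Average of 0/1 observations taken at regular intervals between t_c and t."""
    return sum(indicators) / len(indicators) if indicators else 0.0

# Hypothetical per-second observations for one piece of hovering information.
alive_per_tick     = [1, 1, 1, 1, 0, 1, 1, 1]   # sv_H(h, tau)
available_per_tick = [1, 1, 0, 0, 0, 1, 1, 1]   # av_H(h, tau)

survivability = time_average(alive_per_tick)       # SV_H(h, t)
availability  = time_average(available_per_tick)   # AV_H(h, t)
print(survivability, availability)                  # 0.875 0.625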
4 Algorithms for Hovering Information

4.1 Assumptions

We make the following assumptions in order to keep the problem simple while focusing on measuring availability and resource consumption. Unlimited memory: all mobile nodes have an unlimited amount of memory, able to store any number of hovering information replicas. The proposed algorithms do not take into account the remaining memory space or the size of the hovering information. Unlimited energy: all mobile nodes have an unlimited amount of energy. The proposed algorithms do not consider failures of nodes or the impossibility of sending messages because of a low level of energy. Instantaneous processing: the processing time of the algorithms on a mobile node is zero. We do not consider performance problems related to overloaded processors or execution time. In-built geo-localization service: mobile nodes have an in-built geo-localization service, such as GPS, which provides the current position. We assume that this information is available to pieces of hovering information. Neighbour discovery service: mobile nodes are able to get a list of their current neighbouring nodes at any time. This list contains the position, speed, and direction of the nodes. As for the other two services, this information is available to pieces of hovering information.

4.2 Safe, Risk and Relevant Areas

In this paper we consider that all pieces of hovering information have the same hovering policies: active replication and hovering in order to stay in the anchor area (for availability and accessibility reasons), hovering and caching when too far from the anchor area (survivability), and cleaning when too far from the anchor area to be meaningful (i.e. disappearance). The decision on whether to replicate itself or to hover depends on the current position of the mobile node on which the hovering information is currently stored. Therefore, we distinguish three different areas: the safe area, the risk area and the relevant area.

A piece of hovering information located in the safe area can safely stay on the current mobile node, provided the conditions on the node permit this (power, memory, etc.). This area is defined as the disc having the anchor location as its centre and the safe radius (r_safe) as its radius.

A piece of hovering information located in the risk area should actively seek a new location on a mobile node going in the direction of the safe area. It is in this area that the hovering information actively replicates itself in order to survive and stay available in the vicinity of the anchor location. This area is defined as the ring centred at the anchor location and bounded by the safe and risk radii (r_risk).

The relevant area limits the scope of survivability of a piece of hovering information. This area is defined as the disc whose centre is the anchor location and whose radius is the relevant radius (r_rele).

The irrelevant area is all the area outside the relevant area. A piece of hovering information located in the irrelevant area can disappear; it is relieved from its survivability goals.

Figure 4 depicts the different types of radii and areas discussed above, centred at a specific anchor location a. The smallest disc represents the safe area, the blue area is the anchor area, the ring limited by the risk radius and the safe radius is the risk area, and finally the larger disc is the relevant area.

[Figure 4. Radii and Areas]

The values of these different radii may differ for each piece of hovering information and are typically stored in the Policies field of the hovering information. In the following algorithms we consider that all pieces of hovering information have the same relevant, risk and safe radii.
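The area in which a replica currently lies is fully determined by its distance to the anchor location and the three radii. A minimal classification sketch (the function and its arguments are illustrative, not part of the original system):

import math

def classify_area(pos, anchor, r_safe, r_risk, r_rele):
    """Return which area a replica at `pos` falls into, relative to `anchor`."""
    dist = math.dist(pos, anchor)
    if dist < r_safe:
        return "safe"        # stay on the current node
    if dist <= r_risk:
        return "risk"        # actively replicate towards the anchor
    if dist <= r_rele:
        return "relevant"    # keep cached, may still come back
    return "irrelevant"      # may be cleaned

# Example with the radii used later in the simulations (30 m / 70 m / 200 m).
print(classify_area((60.0, 0.0), (0.0, 0.0), r_safe=30, r_risk=70, r_rele=200))  # risk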
4.3 Replication

We describe two algorithms simulating two variants of replication policies: the Attractor Point and the Broadcast-based algorithms. Both algorithms are triggered periodically, every T_R (replication time) seconds, and only replicas of h that are in the risk area are replicated onto some neighbouring nodes (nodes in communication range), which are selected according to the replication algorithm.

4.3.1 Attractor Point Algorithm

The anchor location of a piece of hovering information constantly acts as an attractor point for that piece of hovering information and all its replicas. Replicas tend to stay as close as possible to their anchor area by replicating from one mobile node to the other.

Algorithm 1 Attractor Point Replication Algorithm
 1: procedure REPLICATION
 2:   pos ← NODE-POSITION
 3:   N ← NODE-NEIGHBOURS
 4:   P ← NEIGHBOURS-POSITION(N)
 5:   for all replica ∈ REPLICAS do
 6:     anchor ← ANCHOR-LOCATION(replica)
 7:     dist ← DISTANCE(pos, anchor)
 8:     if (dist ≥ r_safe) and (dist ≤ r_risk) then
 9:       D ← DISTANCE(P, anchor)
10:       M ← SELECT-KR-CLOSESTS(N, D, k_R)
11:       MULTICAST(replica, M)
12:     end if
13:   end for
14: end procedure

Periodically, and for each mobile node (see Algorithm 1), the position of the mobile node is retrieved (line 2) together with the list and positions of all mobile nodes in communication range (lines 3 and 4). Hovering information replicas verify whether they are in the risk area and need to be replicated (line 8). The number of target nodes composing the multicast group is defined by the constant k_R (replication factor). The distance between each mobile node in range and the anchor location is computed (line 9). The k_R mobile nodes with the shortest distance are chosen as the target nodes for the multicast (line 10). A piece of hovering information in the risk area then multicasts itself to the k_R mobile nodes in communication range that are closest to its anchor location (line 11). Figure 5 illustrates the behaviour of the Attractor Point algorithm. Consider a piece of hovering information h in the risk area. It replicates itself onto the nodes in communication range that are the closest to its anchor location. For a replication factor k_R = 2, nodes n2 and n3 receive a replica, while all the other nodes in range do not receive any replica.

[Figure 5. Attractor Point Algorithm]

4.3.2 Broadcast-based Algorithm

The Broadcast-based algorithm (see Algorithm 2) is triggered periodically (every T_R) for each mobile node. After checking the position of the mobile node (line 2), pieces of hovering information located in the risk area (line 6) are replicated and broadcast onto all the nodes in communication range (line 7). We expect this algorithm to have the best performance in terms of availability but the worst in terms of network and memory resource consumption.

Algorithm 2 Broadcast-based Replication Algorithm
 1: procedure REPLICATION
 2:   pos ← NODE-POSITION
 3:   for all replica ∈ REPLICAS do
 4:     anchor ← ANCHOR-LOCATION(replica)
 5:     dist ← DISTANCE(pos, anchor)
 6:     if (dist ≥ r_safe) and (dist ≤ r_risk) then
 7:       BROADCAST(replica)
 8:     end if
 9:   end for
10: end procedure

Figure 6 illustrates the behaviour of the Broadcast-based algorithm. Consider the piece of hovering information h in the risk area: it replicates itself onto all the nodes in communication range, nodes n1 to n5 (blue nodes).

[Figure 6. Broadcast-based Algorithm]
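The core of the Attractor Point policy is the neighbour selection of lines 9-10 in Algorithm 1. A compact Python sketch of that step (the neighbour list format is a hypothetical stand-in for what the neighbour discovery service returns):

import math

def select_kr_closest(neighbours, anchor, k_r):
    """Pick the k_r neighbouring nodes whose positions are closest to the anchor.

    `neighbours` is assumed to be a list of (node_id, position) pairs.
    """
    ranked = sorted(neighbours, key=lambda n: math.dist(n[1], anchor))
    return [node_id for node_id, _ in ranked[:k_r]]

neighbours = [("n1", (90, 10)), ("n2", (20, 5)), ("n3", (35, 0)), ("n4", (80, 80))]
print(select_kr_closest(neighbours, anchor=(0.0, 0.0), k_r=2))  # ['n2', 'n3']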

4.4 Caching and Cleaning Modules

Each node is assumed to have an unlimited amount of memory. Therefore, when replicas are sent from one node to another, they are simply stored in the node's memory. However, if a node receives two or more replicas of the same piece of hovering information h, the first replica to arrive is stored in memory and any subsequent one is ignored. Therefore, at most one replica of each piece of hovering information is present on a given node n.

Periodically - every T_C seconds - and for each node, replicas that are too far from their anchor location, i.e. those in the irrelevant area, are removed. Although the amount of memory is unlimited and replicas could stay forever in the nodes' memory, we remove the replicas that are too far away from their anchor location; this represents the case where a replica considers itself too far from the anchor area and unable to come back. This also avoids the situation where all nodes end up holding a replica.
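Together, these two modules amount to a de-duplicating replica store plus a periodic sweep. A minimal sketch under the same assumptions (class, method and field names are illustrative):

import math

class ReplicaStore:
    """Per-node replica storage with first-wins de-duplication and cleaning."""

    def __init__(self):
        self.replicas = {}                # hovering information id -> replica

    def receive(self, replica):
        # Keep only the first replica of a given piece of hovering information.
        self.replicas.setdefault(replica["id"], replica)

    def clean(self, node_pos, r_rele=200):
        # Every T_C seconds: drop replicas whose anchor is now in the irrelevant area.
        self.replicas = {
            hid: rep for hid, rep in self.replicas.items()
            if math.dist(node_pos, rep["anchor"]) <= r_rele
        }

store = ReplicaStore()
store.receive({"id": "h1", "anchor": (0.0, 0.0)})
store.receive({"id": "h1", "anchor": (0.0, 0.0)})   # duplicate, ignored
store.clean(node_pos=(250.0, 0.0))                   # h1 is irrelevant from here
print(store.replicas)                                 # {}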
4.5 Metrics

In order to evaluate and compare the above algorithms, the following values have been measured.

Message complexity. The message complexity at a given time t is the number of messages sent between time 0 and time t by all nodes n of the system (N_t):

    MSGS(t) = ∑_{τ=0}^{t} ∑_{n∈N_τ} msgs_n(τ),

where msgs_n(τ) represents the number of messages sent at time τ by node n.

Replication complexity. The replication complexity measures, for a given piece of hovering information h, the maximum number of replicas having existed in the whole system at the same time:

    REP_h(t) = max_{t_c ≤ τ ≤ t} ( ∑_{n∈N_τ} mem_n(τ) ),

where mem_n(τ) is 0 if there is no replica of h on n, and 1 if there is a replica.

Concentration. The concentration of a given piece of hovering information h is defined as the ratio between the number of replicas of h present in the anchor area and the total number of replicas of this hovering information in the whole environment.
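Computed over a simulation trace, these metrics reduce to simple aggregations. A sketch assuming the simulator exposes, per time step, the messages sent per node and the set of nodes holding a replica (all inputs below are hypothetical):

def message_complexity(msgs_per_node_per_tick):
    """MSGS(t): total messages sent by all nodes from time 0 to t."""
    return sum(sum(per_node.values()) for per_node in msgs_per_node_per_tick)

def replication_complexity(replica_holders_per_tick):
    """REP_h(t): maximum number of simultaneous replicas of h over the run."""
    return max((len(holders) for holders in replica_holders_per_tick), default=0)

def concentration(replicas_in_anchor, replicas_total):
    """Share of the replicas of h currently lying inside the anchor area."""
    return replicas_in_anchor / replicas_total if replicas_total else 0.0

# Hypothetical three-tick trace.
msgs = [{"n1": 2, "n2": 0}, {"n1": 1, "n2": 3}, {"n1": 0, "n2": 1}]
holders = [{"n1"}, {"n1", "n2"}, {"n2"}]
print(message_complexity(msgs), replication_complexity(holders), concentration(1, 2))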

5 Evaluation

We evaluated the behaviour of the two algorithms described above under different scenarios by varying the number of nodes. In these experiments we considered only one piece of hovering information. For this given piece of hovering information h, we measured the availability of h, the corresponding message complexity, the corresponding replication complexity and the concentration of h.

We performed simulations using the OMNeT++ network simulator (distribution 3.3) and its Mobility Framework 2.0p2 (mobility module) to simulate nodes having a simplified WiFi-enabled communication interface (not dealing with channel interferences).

5.1 Simulation Settings and Scenarios

The generic scenario consists of a surface of 500 m x 500 m with mobile nodes moving around following a Random Way Point mobility model with a speed varying from 1 m/s to 10 m/s without pause time. In this kind of mobility model, a node moves along a straight line, with speed and direction changing randomly at some random time intervals. Table 1 summarises the values used for the generic scenario.

Blackboard                      500 m x 500 m
Mobility model                  Random Way Point
Node speed                      1 m/s to 10 m/s
Communication range (r_comm)    121 m
Replication time (T_R)          10 s
Cleaning time (T_C)             60 s
Replication factor (k_R)        1, 2, 4 and 8
Anchor radius (r)               50 m
Safe radius (r_safe)            30 m
Risk radius (r_risk)            70 m
Relevant radius (r_rele)        200 m

Table 1. Simulation settings

Based on this generic scenario, we defined 10 specific scenarios with a varying number of nodes: from 20 to 200 nodes, increasing the number of nodes by 20 each time. We performed 20 runs for each scenario; one run lasts 3,600 simulated seconds. All the results presented here are the average of the 20 runs for each scenario, and the error bars represent a 95% confidence interval. We investigated four variants of the Attractor Point algorithm with four different replication factors, namely 1, 2, 4 and 8. All the simulations ran on a Linux cluster of 32 computation nodes (Sun V60x dual Intel Xeon 2.8 GHz, 2 GB RAM).

5.2 Results

Availability. Figure 7 shows the average availability over the 20 runs. As expected, the Broadcast-based algorithm outperforms the Attractor Point algorithm, which tends to behave like the former as the replication factor (k_R) increases, since the Broadcast-based algorithm is a particular case of the Attractor Point algorithm when k_R is large enough. For the Broadcast-based and the Attractor Point (with k_R greater than 4) algorithms, we observe that 80% availability can be expected as soon as the number of mobile nodes in the environment reaches 100 nodes. This represents a density of 3.1 nodes per anchor area. The maximum availability value, nearly 96%, is reached by the Broadcast-based and the Attractor Point (with k_R = 8) algorithms when the population of mobile nodes is 200, while the Attractor Point (with k_R = 4) reaches nearly 93% availability for 200 nodes.

Message complexity. Figure 8 shows the average number of messages sent. As expected, the Broadcast-based algorithm sends a higher number of messages than the Attractor Point algorithm. This phenomenon is amplified when the number of nodes increases. In the worst case (200 nodes), the number of messages sent, on average, by the Broadcast-based algorithm is nine times higher than the number of messages sent when the Attractor Point algorithm is used with k_R = 1.
[Figure 7. Availability - average availability versus number of nodes (20 to 200) for the Broadcast-based algorithm and the Attractor Point algorithm with k_R = 1, 2, 4 and 8.]

[Figure 8. Messages Complexity - average number of sent messages versus number of nodes (20 to 200) for the Broadcast-based algorithm and the Attractor Point algorithm with k_R = 1, 2, 4 and 8.]
This message complexity will be a decisive factor when applying the algorithm in a network dealing with interferences.

Replication complexity. Figure 9 shows the average maximum number of replicas of a single piece of hovering information having existed at the same time. Again, we observe that the Broadcast-based algorithm creates more replicas than the Attractor Point algorithm. The curves for replication complexity are very similar to those for message complexity (see Figure 8). This is explained by the fact that the number of sent messages is directly proportional to the number of existing replicas, since each replica can potentially send messages (replicate itself again).

Concentration. Figure 10 shows the concentration rate. We observe that the concentration rate is above 7% for the Attractor Point algorithm and increases with the number of nodes up to 17% (depending on k_R). On the other hand, a maximal concentration rate of 7% is reached by the Broadcast-based algorithm. The Attractor Point algorithm concentrates 2 to 3 times more replicas than the Broadcast-based algorithm (depending on k_R and the number of nodes considered).

At the time of writing, additional simulations are running. They aim at computing the accessibility of hovering information under different scenarios.

6 Related Works

The Virtual Infrastructure project [2, 3] defines virtual (fixed) nodes implemented on top of a MANET. This project proposes the notion of atomic memory cells, implemented on top of a MANET, which ensure their persistency by replicating their state on neighbouring mobile devices. This notion has been extended to the idea of virtual mobile nodes, which are state machines having a fixed location or a well-defined trajectory. On top of this virtual infrastructure it should become easier to define distributed algorithms such as routing or leader election.

GeOpps [4] proposes a geographical opportunistic routing algorithm over VANETs (Vehicular Ad Hoc Networks). The algorithm selects appropriate cars for routing some information from a point A to a point B. The choice of the next hop (i.e. the next car) is based on the distance between that car's trajectory and the final destination of the information to route. This work focuses on routing information to some geographical location; it does not consider the issue of keeping this information alive at the destination, while this is the main characteristic of hovering information.

The work proposed in [5] aims to disseminate traffic information in a network composed of infostations and cars. The system follows the publish/subscribe paradigm. Once a publisher creates some information, a replica is created and propagated all around the area where the information is relevant. While the idea is quite similar to that of hovering information, keeping information alive in its relevant area, this study does not consider the problem of having a limited amount of memory shared by many pieces of information, nor the problem of fragmentation of information. It also takes the view of the cars as the main active entities, and not the opposite view, where it is the information that decides where to go.

The Ad-Loc project [1] proposes an annotation location-aware infrastructure-free system. Notes stick to an area of relevance which can grow depending on the location of interested nodes. Information is periodically broadcast to neighbouring nodes. Nodes are the active entities exchanging information. The size of the area of relevance grows as
[Figure 9. Replication Complexity - average maximum number of replicas versus number of nodes (20 to 200) for the Broadcast-based algorithm and the Attractor Point algorithm with k_R = 1, 2, 4 and 8.]

[Figure 10. Concentration - average concentration versus number of nodes (20 to 200) for the Broadcast-based algorithm and the Attractor Point algorithm with k_R = 1, 2, 4 and 8.]
necessary in order to accommodate the needs of users potentially far from the central location. The information then eventually becomes available everywhere.

7 Conclusion

In this paper we discussed the notion of hovering information, and defined and simulated the Attractor Point algorithm, which intends to keep the information alive and available in its anchor area. This algorithm multicasts hovering information replicas to the nodes that are closest to the anchor location. The performance of this algorithm has been compared to that of a Broadcast-based version. The results show that the Broadcast-based algorithm outperforms the Attractor Point algorithm in terms of availability, but only by a very small factor. The proposed Attractor Point algorithm is much less bandwidth- and memory-greedy than the Broadcast-based algorithm and achieves higher levels of concentration of data in the anchor area.

Considering that these results constitute a proof of concept of the hovering information paradigm, future work will concentrate on relaxing the assumption of unlimited memory and on considering not only one piece of hovering information but multiple distinct pieces all hovering in the same environment. We also intend to take into account the speed and direction of the nodes when choosing the nodes that will host replicas. We have tested the Attractor Point algorithm under a Random Way Point mobility model and under ideal wireless conditions. This is not characteristic of real-world behaviour. We will apply the Attractor Point algorithm to scenarios following real mobility patterns (e.g. crowd mobility patterns in a shopping mall or traffic mobility patterns in a city) with real wireless conditions (e.g. channel interferences or physical obstacles).

References

[1] D. J. Corbet and D. Cutting. Ad-Loc: Location-based infrastructure-free annotation. In ICMU 2006, London, England, Oct. 2006.
[2] S. Dolev, S. Gilbert, L. Lahiani, N. A. Lynch, and T. Nolte. Timed virtual stationary automata for mobile networks. In OPODIS, pages 130–145, 2005.
[3] S. Dolev, S. Gilbert, E. Schiller, A. A. Shvartsman, and J. Welch. Autonomous virtual mobile nodes. In DIALM-POMC '05: Proceedings of the 2005 Joint Workshop on Foundations of Mobile Computing, pages 62–69, New York, NY, USA, 2005. ACM Press.
[4] I. Leontiadis and C. Mascolo. GeOpps: Opportunistic geographical routing for vehicular networks. In Proceedings of the IEEE Workshop on Autonomic and Opportunistic Communications (colocated with WOWMOM07), Helsinki, Finland, June 2007. IEEE Press.
[5] I. Leontiadis and C. Mascolo. Opportunistic spatio-temporal dissemination system for vehicular networks. In MobiOpp '07: Proceedings of the 1st International MobiSys Workshop on Mobile Opportunistic Networking, pages 39–46, New York, NY, USA, 2007. ACM Press.
[6] A. Villalba Castro, G. Di Marzo Serugendo, and D. Konstantas. Hovering information - self-organising information that finds its own storage. Technical Report BBKCS-07-07, School of Computer Science and Information Systems, Birkbeck, University of London, Nov 2007.
[7] A. Villalba Castro and D. Konstantas. Towards hovering information. In Proceedings of the First European Conference on Smart Sensing and Context (EuroSSC 2006), pages 161–166, 2006.

EvAnT: Analysis and Checking of event traces for Wireless Sensor Networks

Matthias Woehrle#, Christian Plessl*, Roman Lim#, Jan Beutel#, Lothar Thiele#
# ETH Zurich, Computer Engineering and Networks Lab
8092 Zurich, Switzerland
matthias.woehrle@tik.ee.ethz.ch

∗ University of Paderborn, Paderborn Center for Parallel Computing


33102 Paderborn, Germany

Abstract

Testing and verification methodologies for Wireless Sensor Network (WSN) systems in pre-deployment are vital for a successful deployment. Increased visibility of the internal state of a WSN application is established by instrumenting the application for logging execution traces at runtime. While the interpretation of the event traces is application-specific, a common method for analysis can be devised. This method should allow for a concise formulation of explorative queries to determine the occurrence and the cause of functional or performance problems.

The contribution of this paper is an event analysis methodology that is implemented in the EvAnT framework. EvAnT allows for specifying queries that are executed on the collected traces. EvAnT is specifically tailored to WSN testing and debugging. We demonstrate the applicability of EvAnT by a case study in a building monitoring project.

I. Introduction

Wireless sensor networks are wireless networked embedded sensing systems that allow monitoring an environment in previously unprecedented ways. The close and continuous observation of a phenomenon with these tiny sensing devices allows for many novel and widely differing application areas, such as fire detection alarm systems or monitoring of the environment or of structures such as a building.

However, numerous failed or under-performing deployments of WSNs have shown that their design is extremely intricate. Causes for these failures differ: some can be attributed to the embedded nature of WSNs and the impact of the environment on the system, some can be attributed to software failures and simplified assumptions in protocol design, while the causes for other system failures remain unknown [1].

Deployments in remote and harsh environments such as a volcano [2] prohibit debugging at the installation site or extensive field tests [3], necessitating an error-free system at deployment time. Systematic WSN verification is indispensable to arrive at a correct system. Testing in pre-deployment allows for exposing errors in the system that may cause expensive re-deployments [3] or non-performing systems [4].

Pre-deployment testing and debugging has the major advantage that execution information may be increased by instrumenting individual sensor nodes. Using test platforms such as a testbed or a simulator, which allow for collecting the instrumentation data, a rich set of data may be used for failure detection and debugging. Accuracy, the scale of the network and the intrusiveness of the monitoring can be customized to the individual application and test requirements. Data collected in one or many test runs can be used to analyze functional and non-functional (performance) properties in detail.

Analysis on an application level is straightforward. Consider the example of packet yield in the predominant data collection applications in WSNs: count the transmissions on each node and the received packets on the sink node, and compute the ratio of arrived versus sent packets. This allows for checking, on an application level, whether the system performs satisfactorily. However, it does not help when trying to determine the cause of unsatisfactory behavior. More information about the execution is collected by additional instrumentation of the whole software stack. As an example, the computation of transmission paths of packets can reveal where packets are actually lost. The
transmission path must be reconstructed by instrumenting the network layer to extract the forwarding information of the nodes in the routing tree. Having determined the nodes causing the error, an analysis of these nodes can reveal the cause of the problem: could it be collisions due to hidden-terminal effects? Are these nodes having synchronization issues, or are they rebooting? These are some simple examples of the type of queries an event analysis framework must support.

For such an in-depth analysis, and for long test executions over multiple hours or even days allowing for statistically significant conclusions, the amount of logged monitoring data is substantial. Referring to our case study, even a well-focussed instrumentation renders megabytes of log files for just a day. The computation of routing paths in the presence of loops or retransmissions is not easily determined with simple scripting. Further analysis has to take place with a suitable framework allowing to formulate expressive queries for the transmission paths, loop detection or correlation of events for failure debugging.

This paper provides the following contributions to alleviate this situation:
• We formulate the problem of analyzing WSN systems based on events collected in execution traces,
• We present operators for performing queries on the event trace and providing assertions on the query results for checking,
• We demonstrate the feasibility and applicability of EvAnT, our Event Analysis framework for Testing.

Section II formulates the semantics of events and the event trace collected during test execution. Section III discusses the peculiarities of event analysis for Wireless Sensor Network testing and debugging. Section IV outlines our event analysis framework and discusses the implementation of its major components. In Section V, we present a case study based on the analysis of the Harvester, a data collection application used for building monitoring. Using EvAnT we were able to rapidly derive significant conclusions from application trace data, solely by implementing an application-specific event trace parser and by formulating the verification queries by the means provided by EvAnT. Section VI discusses related work. The work concludes with a summary and outlook.

II. Event trace formulation

WSNs are wireless distributed embedded systems comprised of a large number of loosely coupled sensor nodes. For testing of a WSN, individual sensor nodes are instrumented with test monitors [5] to extract information about the execution of the software on the distributed system, i.e. the sensor nodes.

[Fig. 1. Logical view of distributed system monitoring to a unified event set: events (e1, e5, e23, ...) are collected on the sensor nodes n1, n2, n3, n4, ... by the event monitoring, forming local event traces E1, E2, E3, E4, ..., which are combined into the global event trace E = E1 ∪ E2 ∪ ...]

A. Event specification

Test monitors collect events on the sensor nodes n_1, ..., n_k. Each event e is an n-tuple of key-value pairs. Events may be test- or application-specific. Minimally, an event includes a node identifier and a type identifier. Thus an event has the following format:

    e = (node : node_id, type : type_id, key_1 : value_1, ...)

No further requirements on the content of events are defined. The events collected on the instrumented sensor nodes n_1, ..., n_k form sets E_1, ..., E_k. The trace is the event set E, which is the union of the distributedly collected event sets.¹ Figure 1 illustrates the process of collecting events from the distributed targets. There is no requirement on the collection process; rather, the event analysis relies on a complete view of the system including events from any node participating in the event analysis. The node identifier may be added by the test infrastructure or by the distributed node itself. Test infrastructures as described in Sec. III-A are responsible for a reliable transport of local traces, rendering the event trace an accurate reflection of the actual execution of the distributed system.

¹ We denote sets with upper-case letters, while events are denoted with lower-case letters.

The semantics of the events are defined in the analysis framework. The event analysis may work with partially ordered events based on node identifiers and local timestamps, or with a total order based on the local order and the event propagation sequence. Thus, order metrics or time are optional and do not have any syntactic restrictions for order implications.
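In a dynamic language this event format maps naturally onto a dictionary of key-value pairs. An illustrative sketch (every key other than node and type is a hypothetical application-specific field):

# One event as an n-tuple of key-value pairs; only `node` and `type` are required.
event = {
    "node": 17,           # node identifier added by the node or the test infrastructure
    "type": "send",       # type identifier
    "seqnr": 42,          # application-specific: sequence number of the packet
    "dest": 3,            # application-specific: next hop
    "timestamp": 10.523,  # optional local timestamp
}

# A local event trace E_i is a collection of such events; the global trace E
# is their union over all instrumented nodes.
E1 = [event]
E2 = [{"node": 3, "type": "receive", "seqnr": 42, "src": 17}]
E = E1 + E2
print(len(E))  # 2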
B. Event sets and abstract events

Events in the trace represent monitoring information as atomic actions. When processing events, and for behavioral abstractions, events are joined into compound objects. This may be a set of events, such as the set of reboot events that have occurred. Alternatively, events can also be joined into a single event with a new representation of its constituent primitive events, e.g. for routing analysis, where individual
send and receive events form a comprehensive routing path event. Nevertheless, such abstract events have different semantics, since they no longer represent atomic actions.

C. Temporal properties of events

Events may include temporal information. For individual events, a single timestamp suffices to describe the atomic action. An event set or abstract event A cannot be represented with a single timestamp, since it is non-atomic. Basten et al. [6] define timestamps for abstract events: informally, they define a function T+ as the largest timestamp of any constituent primitive event a ∈ A, and a function T− as the smallest timestamp of any constituent primitive event a ∈ A.

In contrast to debugging tools requiring timestamps for their analysis, our event analysis has no implicit semantics for temporal event information. Rather, the event analysis user explicitly introduces temporal semantics. This allows for simple queries which do not require any temporal information, as the analyses in the case study indicate. However, timestamps may be utilized to formulate temporal queries on the event set.
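The T+ and T− timestamps of an abstract event are simply the extrema over its constituent primitive events. A minimal sketch, reusing the dictionary-based events from above (the timestamp key is an assumption):

def t_plus(abstract_event):
    """Largest timestamp of any constituent primitive event."""
    return max(e["timestamp"] for e in abstract_event)

def t_minus(abstract_event):
    """Smallest timestamp of any constituent primitive event."""
    return min(e["timestamp"] for e in abstract_event)

# Hypothetical abstract event: a joined send/receive pair.
path_event = [
    {"node": 17, "type": "send", "timestamp": 10.523},
    {"node": 3, "type": "receive", "timestamp": 10.611},
]
print(t_minus(path_event), t_plus(path_event))  # 10.523 10.611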
III. Event Analysis

A detailed analysis of a long event trace in large-scale distributed systems is tedious and error-prone. An event analysis framework alleviates this problem by allowing a systematic approach: it allows specifying behavioral aspects of a system on a higher level by formulating queries on event sets.

As depicted in Figure 2, abstracted event sets are deduced by iteratively processing the input of the event analysis, i.e. the atomic events of the trace set E. The analysis is based on selection predicates on event tuple keys and values and on transformation functions on the selected events (cf. Sec. III-B). The analysis computes sets of processed events.

As Fig. 2 illustrates, the outputs of the analysis are abstract event sets, which provide extracted behavioral information such as the set of failed routing paths, or a check of sets against golden results to assert a satisfactory execution, e.g. that no reboots have occurred. Figure 2 also shows that processed event sets may provide debugging or performance information, such as the average hop count of routed packets or the time difference distribution from acknowledged sends to the corresponding receive events.

[Fig. 2. Event Analysis: from trace to behavior or assertions. The input event trace E is partitioned by type into subsets for reboots (R) and send/receive communication events (C); further processing (adding a hop-count key, fixed point iteration over acknowledged send-receive pairs SR) yields routing paths (P), routing paths to the sink (PS) and failing paths (FP), with outputs such as a cardinality assertion, the time difference distribution and the average hop-count.]

A. Testing and the implications for event analysis

Testing is highly application-specific. In order to avoid system perturbation, monitoring is restricted to the smallest set of test monitors required to deduce the test-specific information.

One reason is that access to a WSN node is bandwidth-limited. Low-level interfaces such as JTAG and UART may be used to access internal state, but only provide limited access. LEDs only allow for visual inspection of code properties, rendering them a very primitive debugging help. Instrumentation with test monitors [5] needs meticulous care to avoid perturbation of the system execution. On mote platforms sharing a bus between the serial interface and the radio, such as the Tmote Sky, monitor output may considerably interfere with the communication stack. Thus, tests may also be divided to only focus on specific aspects of a test execution. Furthermore, long test runs necessitate off-system logging, since memory is considerably limited on the sensor nodes.

Current wireless sensor network test platforms, simulators such as TOSSIM [7] or testbeds such as MoteLab [8] or the Deployment Support Network (DSN) [9], feature different logging mechanisms and formats. However, all of them typically present test results on a central test host. The test data collected by the test monitors on the distributed sensor nodes are centrally available for off-line test analysis.

Rather than imposing a structure on the monitoring and
its formats, which would allow for more automation of the analysis, EvAnT imposes only minimal requirements on instrumentation and test platforms. The core of EvAnT, namely its operators, is independent of the monitoring and trace format and is thus reusable across projects and platforms. By merely adding a simple event parser, EvAnT is usable for any project. Even heterogeneous logging is supported, as described in Sec. V.

EvAnT provides its users flexible yet powerful operators to describe test-, application- and test-platform-specific analysis queries and checks. Thus, the analysis framework is readily usable for any WSN project. Moreover, off-line analysis does not require any optimization, but can rely on a powerful analysis host.

B. Operators

EvAnT uses events and event sets as primitives for the analysis of a system execution. Thus, EvAnT offers the set operations of union, intersection and relative complement. Additionally, EvAnT offers four novel operators (cf. Fig. 3) especially tailored to formulating queries on WSN event sets.

• Partition Operator:
    R_i = {s | s ∈ S ∧ ϕ_i(s)}, i ∈ N    (1)

• Set Transformator:
    R = {e | e ∈ f(A) ∧ A ⊆ S ∧ ϕ(A)}    (2)

• Set Processor:
    R = {e | e ∈ f(s) ∧ s ∈ S}    (3)

• Fixed Point Processor:
    R^0 = S
    R^i = {e | e ∈ f(A) ∧ A ⊆ R^{i−1} ∧ ϕ(A)} ∪ {e | e ∈ A ∧ A ⊆ R^{i−1} ∧ ¬ϕ(A)}    (4)

As Fig. 3 indicates, S is the set used as the input for the operators, and R is the result set, i.e. the output, of a given operator; s denotes an event in the base set S. The basis for the operators are predicates ϕ and transformation functions f : 2^S → 2^S.

A predicate ϕ is defined on one or multiple events in the selection process. The predicate uses relations on the values of specified event keys. Relations differ based on the purpose of the selection and the available trace information. The Partition Operator makes use of predicates on a single event, ϕ(s). The Set Transformator and the Fixed Point Processor use predicates on multiple events, ϕ(A). A transformation function f(A) is used either to add information to events or to merge events into a compound event object.

In the following, the four operators in EvAnT are described:

1) Partitioning: The partition operator allows partitioning a set of events into subsets based on the values of event keys. Partitioning is performed on a single set S. The partitioning operator returns multiple sets R_i, each containing the events that satisfy a given predicate ϕ_i(s).

An example of partitioning into disjunct event sets based on the values of the type key is depicted in Fig. 2. As shown, partitioning may also be used for filtering specific event types, as shown for the reboot events.

2) Set Transformator: The set transformator allows selecting a subset A from a base set by using a predicate ϕ(A). Selected events are processed based on a transformation function, e.g. to join multiple events into a single compound event. Processed events are added to the result set R. Processing is performed on a single set. An extension to use the operator on multiple sets is to use the union of the sets as the base set and describe a predicate that discriminates individual events based on their origin set.

An example is the set SR in Fig. 2, which selects corresponding send and receive events and joins these into compound transmission events with an additional key-value pair describing the time difference between the sending and the reception. This allows for a computation of the time difference distribution as shown.

3) Set Processor: Set processors are available for processing single events. Each event s ∈ S is selected and processed by a transformation function f(s) on the event. This allows for adding key-value pairs to events or for computations on event values.

As depicted in Fig. 2, to determine the hop-count of a routing path and compute the average, communication events need an added hop-count key, which is incremented in the fixed point operator's transformation function when a send and a receive event are joined.

4) Fixed Point Processor: The Fixed Point Processor computes the least fixed point of a given function on event sets and produces a new set. Selection and processing of events in a single iteration is performed as in the Set Transformator. The sets R^i are computed iteratively until a fixed point is determined, i.e. R^k = R^{k−1}. All events that are not selected by the predicate are maintained in each iteration.

An example is the computation of the routing paths: starting from the initial transmission on the origin node, each path is traversed by joining each receive and send message into a compound transmission path event until the packet is received at the sink or the path fails.
information to events or to merge events into a compound
event object. The event analysis framework is used to extract a behav-
In the following, the four operators in EvAnT are ioral description from the event trace E. This abstraction
described: process allows for two different goals: the analysis for

204
Base Set Base Set
Base Set S Base Set S S
S
Partitioning based on predicate

Fixed Point
Result Set Result Set Iteration
Subset R1 Subset Rk Result Set
R R R

4) Fixed Point
1) Partition Operator 2) Set Transformator 3) Set Processor Processor

Fig. 3. The four operators in EvAnT

a better comprehension of actual system behavior and the checking of execution properties, e.g., for regression testing.

1) Queries: Queries are used to extract behavioral aspects of the raw event set. Thus queries may be used for debugging, i.e., to expose the cause of a failure of the WSN system. Additionally, queries and processing of the event set allow for performance evaluation. Multiple test runs with different parameter sets are easily comparable by a common analysis. Query results may be used to check for correct behavior, e.g., by comparing against a golden result set.

2) Checks: Assertions [10] allow for checking execution properties and presenting assertion misses. Assertions may have different severity levels, as known from hardware design. An exemplary use is to assert a warning if packets are lost in an individual run, but only assert it as an error if packet loss occurs in more than 1 out of 100 test executions.
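As a concrete illustration of such severity-graded checks, the following minimal sketch evaluates a packet-loss assertion over a collection of test runs. The run-result structure and the use of plain Python lists to collect warnings and errors are assumptions for illustration; they only mirror the idea described above.

# Hypothetical check: warn on packet loss in a single run, flag an error only
# if packet loss shows up in more than 1 out of 100 test executions.
def check_packet_loss(runs, warn_log, error_log):
    """`runs` is a list of per-run query results, each with 'sent' and 'received' counts."""
    lossy_runs = 0
    for run_id, result in enumerate(runs):
        lost = result["sent"] - result["received"]
        if lost > 0:
            lossy_runs += 1
            warn_log.append(f"run {run_id}: {lost} packets lost")            # severity: warning
    if lossy_runs / max(len(runs), 1) > 1 / 100:
        error_log.append(f"packet loss in {lossy_runs}/{len(runs)} runs")    # severity: error

# Example usage with two synthetic runs:
warnings, errors = [], []
check_packet_loss([{"sent": 100, "received": 100}, {"sent": 100, "received": 97}],
                  warnings, errors)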
IV. EvAnT

EvAnT is a framework to analyze the system behavior of WSNs by means of monitor events collected during execution of the WSN. EvAnT is generic and thus can be used for different testing platforms and programming languages. It supports heterogeneous system devices and test platforms, since it only operates on the trace. It is based on the described operators on event sets. The semantics of events is specified in EvAnT. Causality or other order relations are provided by the analysis query rather than implied by syntactical requirements.

EvAnT is implemented as a set of Python classes for event and event set handling, as well as the overall EvAnT framework. The implementation in Python allows for easy extension with additional operators targeted at specific analyses. An analysis is started by creating an input parser for the specific test platform or monitoring data format. EvAnT already provides support for DSN database access and EvAnT-format ASCII log files. This allows for reading monitoring data and creating event sets. Further processing is performed by the implementations of the event analysis operators (cf. Sec. III-B).
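The input-parser step can be pictured with a short sketch: it reads one ASCII log line per monitored event and turns it into a key-value event object for the analysis. The line format and the helper names below are assumptions for illustration; they do not reproduce the actual EvAnT or DSN log formats.

# Hypothetical parser for an ASCII event log with lines such as
#   "1203357002.153 17 received dest=3 origin=21 seqNo=42"
# (timestamp, node id, event type, followed by key=value pairs).
def parse_line(line):
    timestamp, nodeid, typeid, *pairs = line.split()
    attrs = {"timestamp": float(timestamp), "nodeid": int(nodeid), "typeid": typeid}
    for pair in pairs:
        key, value = pair.split("=", 1)
        attrs[key] = int(value) if value.lstrip("-").isdigit() else value
    return attrs                      # one event as a plain key-value mapping

def parse_log(path):
    """Create the initial event set from a log file."""
    with open(path) as log:
        return [parse_line(line) for line in log if line.strip()]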
Predicates are implemented as Python lambda functions returning a Boolean value. Transformation functions are formulated by constructing a new event out of the constituent key-value pairs of the selected event pairs or by introducing novel keys and values. They may also simply return the selected events. Listing 1 displays the predicate and the transformation for determining the packet yield for communication sets partitioned by origin and sequence number.

Selection predicates and transformation functions are inputs to the EvAnT operators. The operators iterate over the event sets, evaluate the predicate on events and transform selected events accordingly.

The partition function returns an associative array of the partitioned event sets, addressable via their value or the index of the value interval. For enumerable values of a key, a partitioning is performed with an implicit predicate based on the discrete values, rendering disjoint sets.² For keys with continuous values and possibly infinite partitions, value intervals defined as Boolean lambda functions may be used to divide the value range into partitions, allowing for events in multiple sets. Set Transformators and the Fixed Point Processor currently support predicates and transformation functions on two events. This suffices for covering typical WSN cases such as the routing path computation as discussed. Both operators return a new event set.
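The two partitioning modes just described can be sketched as follows. The interval-based variant accepts Boolean lambda functions as interval definitions, so an event may fall into several partitions. The function names are hypothetical and only illustrate the behavior described in the text.

# Partitioning by discrete key values: returns a dict (associative array)
# mapping each observed value to a disjoint event set; events lacking the key
# are ignored, mirroring footnote 2.
def partition_discrete(events, key):
    partitions = {}
    for e in events:
        if key in e:
            partitions.setdefault(e[key], []).append(e)
    return partitions

# Partitioning a continuous-valued key by user-defined intervals, given as
# Boolean lambdas; an event may satisfy several intervals and thus appear in
# more than one partition.
def partition_intervals(events, key, intervals):
    partitions = {i: [] for i in range(len(intervals))}
    for e in events:
        for i, in_interval in enumerate(intervals):
            if key in e and in_interval(e[key]):
                partitions[i].append(e)
    return partitions

# Example: split measurements into "small" and "large" time differences.
events = [{"timediff": 0.004}, {"timediff": 0.02}]
by_delta = partition_intervals(events, "timediff",
                               [lambda v: v < 0.01, lambda v: v >= 0.01])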
Also provided are arithmetic functions on event sets, returning the maximum and minimum of the numeric values of a key, as well as operators for determining the mean and standard deviation of all values. These allow for simple computations on a given set, as used in the drift measurement analysis in Sec. V-C for each node-neighbor pair set.

² Events not featuring the event key(s) are ignored.
type       key0          key1      key2        key3
received   dest addr     origin    seqNo
senddone   dest addr     origin    seqNo       ack
drift      neighbor ID   interval  abs. drift  drift in interv.

Fig. 4. Harvester-specific event tuple keys collected on each node.

Fig. 5. Acknowledged and total number of sent packets for selected nodes.

V. Case Study
We evaluate EvAnT by applying it to Harvester, which is a typical WSN application running on TinyOS 2. Harvester collects temperature data on remote nodes to monitor the heat flows in an office building. The temperature readings are forwarded to base stations that act as gateways to the sensor network. The routing protocol is based on the TinyOS Collection Tree Protocol. To achieve a long system lifetime, Harvester minimizes the power consumption by augmenting the routing protocol with a custom low power listening (LPL) stack. The LPL stack minimizes the power consumption by estimating wake-up times, resulting in a link-based asymmetric synchronization similar to the WiseMAC protocol [11]. The sink nodes do not have power constraints because they are powered from attached PCs. Hence, for improving the bandwidth of the gateways, LPL is disabled on sink nodes.

In the case study, Harvester is executed on Tmote Sky sensor nodes that are connected to the DSN, a testbed comprised of distributed sensor and observer node pairs and a wireless backbone channel of the observer nodes [9]. Data is forwarded to a single gateway node. We perform heterogeneous logging: the sink node directly attaches to a PC via the serial interface, while the logs of all remote nodes are collected via the DSN.

Figure 4 shows the event types that are produced by the instrumented Harvester application. Receive and send event types are extracted from the TinyOS 2 event handlers for a path and packet yield analysis. Measurement events allow for drift analysis as described in Sec. V-C. In the following analysis, we are interested in the performance of Harvester concerning its packet yield while collecting the temperature data and concerning its efficiency, which heavily determines the system lifetime. The events collected for this case study are rather generic. Thus, our case study is representative for a large class of WSN operating systems, sensor nodes and testing platforms.

A. Routing Analysis

To investigate the performance of Harvester concerning packet yield, one possibility is an analysis of the routing paths. The actual routing paths can be reconstructed with the Fixed Point Processor: in each iteration, send and receive events that correspond to each other are aggregated into compound transmission events.

For our analysis, we used EvAnT to determine the packet yield using the Set Transformator. The implementation of this computation in EvAnT is shown in Listing 1. While the Set Transformator could operate directly on the complete set of events, this operation takes a prohibitively long time due to the large number of events. However, we know that all packets related to one temperature reading share the same sequence number and origin address. Thus we can first partition the initial event set based on the source address and the sequence number. The Set Transformator can then be used on each of the resulting sets separately, which significantly accelerates the computation. This methodology can also be adopted when using the computation-intensive Fixed Point Processor.
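The per-packet traversal that the Fixed Point Processor performs can be pictured more directly with a small sketch that walks along already-joined transmission events. The 'sender' and 'receiver' field names of these compound events are assumptions for illustration, not the EvAnT schema.

# Sketch of following one packet's route through already-joined transmission
# events (send/receive pairs) until the sink is reached or the path fails.
def reconstruct_path(transmissions, origin, seq_no, sink):
    """Walk the chain of transmissions of one reading; stop at the sink or when
    no further forwarding transmission is found (path failure)."""
    hops = [t for t in transmissions
            if t["origin"] == origin and t["seqNo"] == seq_no]
    path, current = [origin], origin
    while current != sink:
        forward = [t for t in hops if t["sender"] == current]
        if not forward or forward[0]["receiver"] in path:   # broken or cyclic path
            break
        current = forward[0]["receiver"]
        path.append(current)
    return path

# Example: a two-hop route from node 21 via node 17 to sink 3.
print(reconstruct_path(
    [{"origin": 21, "seqNo": 1, "sender": 21, "receiver": 17},
     {"origin": 21, "seqNo": 1, "sender": 17, "receiver": 3}],
    origin=21, seq_no=1, sink=3))   # -> [21, 17, 3]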
We performed multiple experiments on 17 nodes, each running for 3 hours, showing that the average packet yield for each node is higher than 78%, with nodes in the one-hop neighborhood having an average packet yield higher than 90%. A more detailed analysis showed the interesting result that, for the different tests, on average each temperature reading required between 2.6 and 5.2 packets to be transmitted. For a network with a small number of hops, this number of sent packets would not be expected to differ so considerably. This hints at a considerable message loss along the routes, which is counteracted by frequent retransmissions. While the impact on the packet yield is still tolerable for Harvester, the retransmissions are very expensive in terms of energy. Nevertheless, both results hint at a deeper analysis of the underlying MAC layer. The analysis of one 17-node experiment takes approximately one minute on a MacBook (2 GHz Intel Core 2 Duo / 1.5 GB RAM) running OS X 10.5.1 and Python 2.5.1.

B. MAC Analysis

The MAC analysis evaluates the efficacy of the link layer and the low power listening protocol. Figure 5 shows
# Selection predicate matches an acknowledged sent packet with the received packet whose receiver nodeid equals the sink address
pairing = lambda x, y: x.typeid == 'received' and y.typeid == 'senddone' and x.nodeid == sink
# Transformation function generates send-receive pairs, updating the current node id to the receiver's node id
pair_tf = lambda x, y: event(nodeid=y.nodeid, origin=x.origin, seqNo=x.seqNo, typeid='pair')

# 2-staged partitioning performs an implicit equivalence selection predicate and improves EvAnT runtime
partitionOriginSet = tempset.partition('origin')
for origin in partitionOriginSet.keys():
    partitionSequenceSet = partitionOriginSet[origin].partition('seqNo')
    for sequenceNumber in partitionSequenceSet.keys():
        yield_set = partitionSequenceSet[sequenceNumber].set_transformator(pairing, pair_tf)

Listing 1. Using EvAnT's Partition and Set Transformator to determine the packet yield
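What Listing 1 leaves implicit is the final step of turning the transformed sets into a yield figure. The following is a minimal sketch, assuming that each per-(origin, seqNo) partition yields at least one 'pair' event when the reading reached the sink; this counting convention and the function name are assumptions, not part of EvAnT.

# Hypothetical aggregation: a temperature reading counts as delivered if its
# (origin, seqNo) partition produced at least one sink-side 'pair' event.
def packet_yield(pair_sets_by_origin_seq):
    """pair_sets_by_origin_seq maps (origin, seqNo) -> result of set_transformator."""
    total = len(pair_sets_by_origin_seq)
    delivered = sum(1 for pairs in pair_sets_by_origin_seq.values() if len(pairs) > 0)
    return delivered / total if total else 0.0

# Example: three readings, two of which reached the sink.
print(packet_yield({(21, 1): [{"typeid": "pair"}], (21, 2): [], (17, 1): [{"typeid": "pair"}]}))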

ID   29             40             41             42
29   N/A            -0.964 (5539)  0.932 (3268)   0 (0)
40   0.949 (3967)   N/A            0.095 (4020)   -1.676 (4234)
41   0.444 (3610)   0.812 (3026)   N/A            -1.012 (5156)
42   0 (0)          -0.201 (2609)  0.140 (2671)   N/A

TABLE I. Drift in scaled ppm (number of measurements) for selected nodes
results from a 12-hour test run for a selected part of the network. The sink, which is not shown in the figure, is positioned to the left of the nodes. Each link is annotated with the acknowledged and total number of packets sent along this link. In an ideal setting (perfect synchronization and no interference), each packet should be sent only once and should be acknowledged immediately. The discrepancy between the total number of packets and the number of acknowledgments indicates that the synchronization of sender and receiver does not work properly³. Hence, an in-depth analysis of the wakeup time estimation and the time drift detection was performed.

³ While interference problems are also possible, previous measurements on the testbed and current test results indicate that this is considerably less likely.
C. Drift Analysis for Wakeup Time Estimation

We determine the drift measurement data from measurement packets, which are exchanged by neighboring nodes to determine the clock drift. In a scenario with perfect synchronization, each node pair (n1, n2) measures the drift of the local timebase to the timebase of its neighbor consistently: if node n1 determines the drift of node n2 as +τ, node n2 will compute a drift of −τ. Thus, in a perfectly synchronized scenario, the computed drifts add up to 0. We ran this test twice for 24 consecutive hours and accumulated the results in a single event set. Joining analysis results, i.e., abstract event sets, is easily implemented in EvAnT. We collected a total of 494,316 measurement events. Table I shows that certain links are synchronized accurately (e.g., 29 and 40), while some links are totally off (40 and 42).
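This pairwise symmetry check can be written down directly with the partition and statistics facilities described earlier. The simplified field names ('nodeid', 'neighbor', 'drift') stand in for the drift tuple keys of Fig. 4, and the helper itself is only an illustrative sketch, not the analysis code used for Table I.

from statistics import mean

# Illustrative check: for every node pair, the mean drift measured by one node
# should roughly cancel the mean drift measured back by its neighbor.
def drift_symmetry(drift_events, tolerance_ppm=0.5):
    """drift_events: dicts with 'nodeid', 'neighbor' and 'drift' (scaled ppm)."""
    per_link = {}
    for e in drift_events:
        per_link.setdefault((e["nodeid"], e["neighbor"]), []).append(e["drift"])
    report = {}
    for (a, b), drifts in per_link.items():
        back = per_link.get((b, a))
        if back and a < b:                        # visit each unordered pair once
            residual = mean(drifts) + mean(back)  # ~0 for a well-synchronized link
            report[(a, b)] = (residual, abs(residual) <= tolerance_ppm)
    return report

# Example with two symmetric measurements between nodes 29 and 40:
print(drift_symmetry([{"nodeid": 29, "neighbor": 40, "drift": -0.96},
                      {"nodeid": 40, "neighbor": 29, "drift": 0.95}]))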
Hence, we have determined in our analysis that, while the overall packet yield is tolerable, the implementation of the custom LPL stack needs improvement concerning the drift measurement and interpretation. One of the problems can be attributed to a timestamping problem of the CC2420 driver, as discussed on the TinyOS mailing list [12], where corrupted or stale timestamps might be applied to a packet.

VI. Related work

There is no previous work concerning event analysis for WSN testing. However, there exists previous work in the related fields of test platforms, WSN health monitoring, passive monitoring frameworks and on-line data mining.

Test platforms for WSNs include simulators such as TOSSIM [7] or EmTos in EmStar [13] and testbeds such as MoteLab [8] or the DSN [9]. EmSim [13] allows for heterogeneous testing with simulation and actual hardware. Test platforms provide input data for EvAnT, which benefits from an improved view of the system.

Yang et al. have shown the benefits of a wireless source-level debugger for WSNs [14]. This is complementary to our work, since we target analysis for testing and for debugging on a macroscopic level, while Clairvoyant targets debugging of microscopic effects such as race conditions or stack overflows.

Network health monitoring tools ([15], [16]) allow for detecting and debugging failures by providing and communicating additional state information. They extend the communication protocol to provide collaborative, automatic maintenance and recovery of WSNs after deployment. The main target is to provide on-line health monitoring for typical data collection applications in an energy-efficient manner by providing common failure indications. Our approach is orthogonal, since EvAnT is targeted at pre-deployment and provides an analysis framework for testing various WSN applications on different test platforms.

Passive inspection of distributed wireless systems as
proposed for WSNs ([17], [18]) relies on traffic snooping and automatic analysis of collected information. Algorithms are focussed on detecting indicators on the basis of an incomplete view. Ringwald et al. [17] present typical indicators for WSNs focussed on passive inspection, which may be adapted for usage in EvAnT. Maifi et al. [18] use machine learning techniques and provide predicates for anomalous behavior, which could also be used in EvAnT. EvAnT rather relies on a comprehensive view and allows for specifying test- and application-specific aspects.

Data mining techniques may be used for WSNs, e.g., to perform an online analysis of collected data for traffic reduction [19]. However, on-line analysis of WSNs is restricted to spatially and temporally local predicates and incurs a considerable overhead if used for testing on the sensor node.

Differing from all the previous approaches, EvAnT provides the benefit of being application and platform independent and thus readily applicable to any project.

VII. Summary and outlook

We have formulated the problem of analyzing Wireless Sensor Network systems based on events collected in traces. We defined novel and powerful operators on event sets for performing expressive queries on the event trace. We also showed the usage of assertions on the query results for checking test executions. We presented EvAnT, our event analysis framework for WSN systems. EvAnT allows for analyzing arbitrary event traces from a system execution by merely providing an input parser or using one of the supported event formats. Thus, EvAnT is generally applicable for different WSN and test platforms. EvAnT can be used for macroscopic debugging by defining queries on the event sets or for testing by formulating assertions on the query results. A case study showed the application of EvAnT to Harvester, a typical application. With an explorative, iterative analysis, EvAnT allowed us to reveal that the problem of Harvester lies in the drift measurements.

In the future, we plan to apply the framework to more projects. The ease of use of EvAnT can be extended with an instrumentation approach that automatically extracts the event format at compile time and provides an input parser. EvAnT is available to other researchers in the WSN community.

VIII. Acknowledgements

The work presented here was supported by the National Competence Center in Research on Mobile Information and Communication Systems (NCCR-MICS), a center supported by the Swiss National Science Foundation under grant number 5005-67322.

References

[1] J. I. Choi, J. W. Lee, M. Wachs, and P. Levis, "Opening the sensornet black box," SIGBED Rev., vol. 4, no. 3, pp. 13-18, 2007.
[2] G. Werner-Allen et al., "Monitoring volcanic eruptions with a wireless sensor network," in Proc. 2nd European Workshop on Sensor Networks (EWSN 2005), 2005, pp. 108-120.
[3] V. Turau, M. Witt, and M. Venzke, "Field trials with wireless sensor networks: Issues and remedies," in Proc. Int'l Multi-Conference on Computing in the Global Information Technology (ICCGI '06), 2006, p. 86.
[4] K. Langendoen, A. Baggio, and O. Visser, "Murphy loves potatoes: Experiences from a pilot sensor network deployment in precision agriculture," in Proc. 20th Int'l Parallel and Distributed Processing Symposium (IPDPS 2006), 2006, pp. 8-15.
[5] M. Woehrle, C. Plessl, J. Beutel, and L. Thiele, "Increasing the reliability of wireless sensor networks with a distributed testing framework," in Proc. 4th IEEE Workshop on Embedded Networked Sensors (EmNetS-IV), 2007.
[6] T. Basten et al., "Vector time and causality among abstract events in distributed computations," Distrib. Comput., vol. 11, no. 1, pp. 21-39, 1997.
[7] P. Levis, N. Lee, M. Welsh, and D. Culler, "TOSSIM: Accurate and scalable simulation of entire TinyOS applications," in Proc. 1st ACM Conf. Embedded Networked Sensor Systems (SenSys 2003), 2003, pp. 126-137.
[8] G. Werner-Allen, P. Swieskowski, and M. Welsh, "MoteLab: A wireless sensor network testbed," in Proc. 4th Int'l Conf. Information Processing in Sensor Networks (IPSN '05), 2005, pp. 483-488.
[9] M. Dyer et al., "Deployment support network - a toolkit for the development of WSNs," in Proc. 4th European Workshop on Sensor Networks (EWSN 2007), 2007, pp. 195-211.
[10] D. S. Rosenblum, "Towards a method of programming with assertions," in ICSE '92: Proc. 14th International Conference on Software Engineering, New York, NY, USA, 1992, pp. 92-104.
[11] A. El-Hoiydi and J. Decotignie, "WiseMAC: An ultra low power MAC protocol for multi-hop wireless sensor networks," in Proc. 1st Int'l Workshop on Algorithmic Aspects of Wireless Sensor Networks (ALGOSENSORS 2004), 2004, pp. 18-31.
[12] D. Moss et al., "Bug in cc2420 timestamp," https://www.millennium.berkeley.edu/pipermail/tinyos-help/2007-October/028901.html, October 2007.
[13] L. Girod et al., "EmStar: A software environment for developing and deploying heterogeneous sensor-actuator networks," ACM Trans. Sen. Netw., vol. 3, no. 3, p. 13, 2007.
[14] J. Yang, M. L. Soffa, L. Selavo, and K. Whitehouse, "Clairvoyant: A comprehensive source-level debugger for wireless sensor networks," in Proc. 5th ACM Conf. Embedded Networked Sensor Systems (SenSys 2007), 2007.
[15] S. Rost and H. Balakrishnan, "Memento: A health monitoring system for wireless sensor networks," in Proc. 3rd IEEE Communications Society Conf. Sensor, Mesh and Ad Hoc Communications and Networks (IEEE SECON 2006), 2006.
[16] N. Ramanathan et al., "Sympathy for the sensor network debugger," in Proc. 3rd ACM Conf. Embedded Networked Sensor Systems (SenSys 2005), 2005, pp. 255-267.
[17] M. Ringwald, K. Römer, and A. Vitaletti, "SNIF: Sensor network inspection framework," Department of Computer Science, ETH Zurich, Technical Report 535, Oct. 2006.
[18] M. Maifi et al., "SNTS: Sensor network troubleshooting suite," in Distributed Computing in Sensor Systems, vol. 4549, Springer, 2007, pp. 142-157.
[19] K. Römer, "Discovery of frequent distributed event patterns in sensor networks," in Proc. European Workshop on Wireless Sensor Networks (EWSN 2008), Bologna, Italy, Jan. 2008, pp. 106-124.
Power-Aware Real-Time Scheduling upon Identical Multiprocessor Platforms

Vincent Nélis∗, Joël Goossens, Raymond Devillers, Dragomir Milojevic


Université Libre de Bruxelles (U.L.B.)
CP 212, 50 Av. F. D. Roosevelt,
1050 Brussels, Belgium
{vnelis, joel.goossens, rdevil, dragomir.milojevic}@ulb.ac.be
Nicolas Navet
LORIA - Equipe TRIO
Campus Scientifique - B.P. 239
54506 Vandoeuvre-lès-Nancy, France
nicolas.navet@loria.fr

Abstract

In this paper, we address the power-aware scheduling of sporadic constrained-deadline hard real-time tasks using dynamic voltage scaling upon multiprocessor platforms. We propose two distinct algorithms. Our first algorithm is an off-line speed determination mechanism which provides an identical speed for each processor. That speed guarantees that all deadlines are met if the jobs are scheduled using EDF. The second algorithm is an on-line and adaptive speed adjustment mechanism which reduces the energy consumption while the system is running.

∗ Supported by the Belgian National Science Foundation (FNRS) under a FRIA grant.

1 Introduction

1.1 Context of the study

Some important applications impose temporal constraints on the response time while running on systems with limited power resources (such as real-time communication in satellites). As a result, the research community has investigated low-power system design during the past 15 years. Actually, the dynamic voltage scaling (DVS) framework became a major concern for power-aware computer systems. This framework consists in minimizing the system energy consumption by adjusting the working voltage and frequency of the CPU. For real-time systems, this DVS framework focuses on minimizing the energy consumption while respecting all the timing constraints.

Many power-constrained embedded systems are built upon multiprocessor platforms because of high computational requirements and because multiprocessing often significantly simplifies the design. As pointed out in [4], another advantage is that multiprocessor systems are more energy efficient than equally powerful uniprocessor platforms, because raising the frequency of a single processor results in a multiplicative increase of the consumption while adding processors leads to an additive increase.

1.2 Problem definition

In the following, we consider the problem of minimizing the energy consumption needed for executing a set of sporadic constrained-deadline real-time tasks scheduled upon a fixed number of identical processors. The scheduling is preemptive and uses the global EDF policy [15]. "Global" scheduling algorithms, on the contrary to partitioned algorithms, allow different instances of the same task (also called jobs or processes) to be executed upon different processors. Each process can start its execution on any processor and may migrate at run-time from one processor to another if it gets meanwhile preempted by smaller-deadline processes.

We first tackle the problem of choosing the smallest (or nearly so) processor frequency for the set of CPUs, such that all deadlines will be met. The procedure is performed off-line (i.e., before the system starts its execution) and provides a static result in the sense that the computed speed does not change over time. Such a static solution is sufficient to significantly reduce the energy consumption; however, due to the discrepancy between Worst-Case Execution Time (WCET) and Actual-Case Execution Time (ACET) [11], it
usually leads to pessimistic results. In a second step, we “One Task Extension” (OTE). We proved that our on-line
thus propose an on-line scheme that takes advantage of un- proposal does not jeopardize the system feasibility.
used CPU slots to further reduce the energy consumption.
Organization of the paper. The document is organized
1.3 Previous work as follows: in Section 2, we introduce our model of compu-
tation, in particular our task model; in Section 3, we present
There is a large number of researches about uniprocessor our off-line processor speed determination; in Section 4, we
energy-aware scheduling but much less for the multiproces- present our on-line speed reduction technique; in Section 5,
sor case, where low-power scheduling problems are often we present our experimental results; in Section 6, we con-
NP-hard when the actual applicative constraints are taken sider our future works and in Section 7, we conclude.
into account (see [7] for a starting point). Among the
most interesting studies, one can cite [14] where the au-
2 Model of computation
thors provide power-aware scheduling algorithms for bag-
of-tasks applications with deadline constraints on DVS-
enabled cluster systems. A study particularly relevant to 2.1 Application model
the DVS framework is [6] which targets energy-efficient
scheduling of periodic real-time tasks over multiple DVS We consider in this paper the scheduling of sporadic
processors with the considerations of power consumption constrained-deadline tasks, i.e., systems where each task
due to leakage current (i.e. the static part of the energy dis- τi = (Ci , Di , Ti ) is characterized by three parameters –
sipation). In [8], the authors propose a set of multiprocessor a worst-case execution requirement (WCET) denoted Ci ,
energy-efficient task scheduling algorithms with different a minimal inter-arrival delay Ti and a deadline Di ≤ Ti
task remapping and slack reclaiming schemes, where tasks – with the interpretation that the task generates successive
have the same arrival time and share a common deadline. A jobs τi,j (with j = 1, 2, . . . , ∞) arriving at times ei,j such
large number of such “slack reclaiming” approaches have that ei,j+1 − ei,j ≥ Ti , each such job has an execution re-
been developed over the years for the uniprocessor case. quirement of at most Ci execution units, and must be com-
Among those, some strategies dynamically collect the un- pleted by its deadline noted Di,j = ei,j + Di . We therefore
used computation times at the end of each job and share it assume that the worst-case execution time is always lower
among the remaining active jobs. Examples of algorithms than the deadline, i.e. Ci ≤ Di . We assume that preemp-
following this “reclaiming” approach, include the ones pro- tion is allowed – an executing job may be interrupted, and
posed in [19, 16, 21, 3]. Some reclaiming algorithms even its execution resumed later (may be upon another proces-
anticipate the early completion of tasks for further reduc- sor), with no loss or penalty. Let τ = {τ1 , τ2 , . . . , τn } de-
ing the CPU speed [16, 3], some having different levels of notes a sporadic task system. For each task τi , we define
“aggressiveness” [3]. its density λi as the ratio of its execution requirement to
def
its deadline: λi = Ci /Di . Since Ci ≤ Di we have that
1.4 Contribution of the paper λi ≤ 1. We also define the total density λsum (τ ) of spo-
def Pn
radic task system τ as λsum (τ ) = i=1 λi , and its max-
Unlike the work considered in [4], we study the case def
imal density as λmax (τ ) = maxτi ∈τ λi . Without loss of
where the number of processors is already fixed. This generality, we assume in the remainder of the paper that
constraint can be imposed by the availability of hardware λ1 ≥ λ2 ≥ . . . ≥ λn , and consequently λmax (τ ) = λ1 .
components, by design considerations not related to power-
consumption. Notice that in practical situations, the task
2.2 Platform model
characteristics are unknown at (hardware) design time.
The first contribution of this paper, is based on [13], and
provides a technique which determines the minimum off- In our platform model, a processor can dynamically
line processor speed for the fixed and identical multiproces- adapt its working frequency in some continuous range
sor platform using EDF. [fmin , fmax ]. The case where the number of frequencies
The second, and the main contribution of this document, is finite can be addressed as in [12]. In the remainder of
is a slack reclaiming algorithm which is, to the best of our this paper, we denote by s(t) the processor speed at any
knowledge, the first of its kind for the global preemptive time-instant t. The processor speed s(t) is defined as the
scheduling problem of distinct-deadlines tasks on multipro- ratio of its current functioning frequency (say f (t)) over
def
cessor platforms. This contribution can be considered as an the maximal frequency fmax , i.e.: s(t) = ffmax (t)
, with
extension to the multiprocessor case of a previous proposal fmin ≤ f (t) ≤ fmax . Notice that the processor speed al-
of Shin and Shoi in [19], which is usually referred to as ways lies between ffmax
min
and 1, whatever the values of fmin

and fmax , and to each speed corresponds exactly one fre- platform with m processors running at speed s if:
quency. λsum (τ ) − λmax (τ )
We consider in this document multiprocessor platforms s ≥ λmax (τ ) + (1)
m
composed of a known and fixed number m of identical pro-
cessors {P1 , P2 , . . . , Pm } upon which a set of real-time Notice that, from the expression (1) (which is a sufficient
tasks is scheduled. The working power of each processor condition), s is always greater or equal to λmax (τ ), which is
may be characterized by its speed (or computing capac- a necessarily condition to ensure the system schedulability,
ity) s – with the interpretation that a job that executes on whatever the scheduling algorithm.
a processor of speed s for R time units completes s × R
3.2 Algorithm EDF(k)
units of execution. The minimal and maximal admissible
speed of all processors are identical and are denoted by
def def
Following an idea from [13], but adapted to our off-
smin = ffmaxmin
> 0 and smax = ffmax max
= 1, respectively. line speed determination where the number of processors is
Since we assume that the range of available frequencies is fixed, we shall present an improvement on the speed needed
continuous between fmin and fmax , the speed of the proces- in order to schedule sporadic task sets.
sors can take any real value between smin and smax at every
instant. Notice that the task computing requirements (Ci ’s) Algorithm EDF(k) (Goossens, Funk and Baruah [13]):
are defined for the maximal speed smax . Assuming that the task indexes are sorted by non-increasing
In Section 3 we assume that all the processors share a order of task densities and 1 ≤ k ≤ m, EDF(k) assigns
common speed which is fixed before the system starts its priorities to jobs of tasks in τ according to the following
execution. This speed does not change during the schedul- rules:
ing and thus, we will use the notation s instead of s(t) to
simplify the presentation. Then, we study the case in Sec- For all i < k, taui jobs are assigned the highest priority
tion 4 where each processor may run at a different speed (ties are broken arbitrarily).
and may change it at any time during the scheduling. In our For all i ≥ k, τi jobs are assigned priorities according to
work, speed assignments are determined at job-level: volt- EDF (ties are again broken arbitrarily).
age/speed changes only occur at job dispatching instants.
That is, once a job is assigned to a CPU, the CPU speed is That is, Algorithm EDF(k) assigns the highest prior-
fixed until the job is preempted or completed. ity to jobs generated by the (k − 1) tasks in τ that have
highest densities, and assigns priorities according to dead-
lines to jobs generated by all other tasks in τ (thus, “pure”
3 Off-line speed determination EDF is EDF(1) ). We show in the following that we get
another lower-bound for the speed s when using EDF(k)
3.1 Introduction instead of EDF, and this bound is always lower than (or
equal to) the one provided by Expression (1). But first,
Off-line processor speed determination is the process of we introduce the notation τ (i) to refer to the task system
determining, during the design of the real-time application, composed of the (n − i + 1) minimum-density tasks in
the lowest processor speed s in order to schedule the spo- def
τ : τ (i) = {τi , τi+1 , . . . , τn }; (according to this notation,
radic task set τ upon an identical multiprocessor platform τ ≡ τ (1) ).
with m processors running at speed s. In this Section, we
consider the case where, at any instant, all processors must Theorem 2. Any sporadic constrained-deadline task sys-
be running at the same speed noted s. We shall use the fol- tem τ is EDF(k) -schedulable upon an identical multipro-
lowing result: cessor platform with m processors at speed sk if sk ≥
(τ (k+1) )
max{λ1 , λk + λsum
m−k+1 }
Theorem 1 (Bertogna, Cirinei and Lipari [5]). Any spo-
Corollary 2. A sporadic constrained-deadline task system
radic constrained-deadline task system τ satisfying
τ is schedulable upon m processors at speed sol by EDF(`) ,
λsum (τ ) ≤ m − (m − 1) · λmax (τ ) with
def m λsum (τ (k+1) )
sol = max{λ1 , min{λk + }} (2)
is schedulable by the EDF algorithm upon a platform with k=1 m−k+1
m identical processors. and ` is the parameter minimizing the speed sol of sk .
Then, we get the following sufficient feasibility condition: Proof. The proof is a direct consequence of Theorem 2.
Corollary 1. A sporadic constrained-deadline task sys- It may be seen that this expression always yields a better
tem τ is EDF-schedulable upon an identical multiprocessor bound than Inequality (1).

3.3 Implementation 4.2 Notations

A more detailed description of our off-line speed deter- We denote by t the current time in the schedule and by
mination mechanism is given by Algorithm 1. Let sol de- Bi (t) the last release time of τi before or at time t, with
note the returned speed, defined by Expression (2). Before Bi (0) initially set to −Ti (see Equation 3 to understand this
applying this algorithm, we assume that the number of pro- initialization). During the scheduling, Bi (t) is updated at
cessors is sufficient to schedule the system τ at the maximal each time t a job is released by τi . The ready queue, de-
speed. Consequently, the speed sol is initially set to smax noted by ready-Q, holds all the pending jobs (i.e. ready to
(line 3). Then, the algorithm searches the minimal speed by be executed but waiting for a CPU) sorted according to the
sweeping the value of k between 1 and m (line 4 to line 13). EDF(k) rule, where ties are broken according to an arbitrary
Finally, in order that EDF(k) assigns the highest priorities rule; recall that using EDF(k) , the priorities of the jobs are
to the (k − 1) tasks that have highest densities, we set the constant. In the following, si denotes the processor speed
deadline of these tasks to −∞ (line 14). for the job τi,j at time t. We shall use the following func-
tions.
Algorithm 1: Off-line speed determination The function Ai (t, t0 ) indicates if the sporadic task τi
Input: τ , m, smax , smin may generate a job at time t0 ≥ t. Since Ti denotes the
Output: sol
minimal inter-arrival delay between job releases of the spo-
1 begin
2 kopt := 1; radic task τi , we get:
3 sol := smax ;
1 if t0 ≥ Bi (t) + Ti

4 slimit := max{smin , λ1 } ; 0 def
5 for (k := 1 ; k ≤ m and sol > slimit ; k := k + 1) do Ai (t, t ) = (3)
0 otherwise
λ (τ (k+1) )
6 s := max{λ1 , λk + sum m−k+1
};
7 if (s < sol ) then Notice that Bi (0) is initially set to −Ti in order to have
8 sol := s ; Ai (0, 0) = 1 since our task model considers that each task
9 kopt := k ;
10 if (sol < slimit ) then sol := slimit ;
may release its first job at time t = 0.
n o Then, the function PotActi (t, t0 ) (for Potentially Active
11 foreach τi ∈ τ1 , ..., τkopt −1 do Di := −∞ ; at time t0 ) indicates if τi has an active job at time t which
12 return (sol ) ; may still be active at time t0 . This function returns 1 only if
13 end τi is active at time t and if t0 is not larger than the deadline
of this job:
s

 1 if ωi i (t) > 0 and
4 Multiprocessor One Task Extension 0 def
PotActi (t, t ) = t ≤ t0 < Bi (t) + Di
0 otherwise

4.1 Introduction
where ωisi (t) denotes the remaining worst-case execution
In this section, we consider the case where processors requirement of the last released job of τi if executed at speed
still share the same minimal and maximal speeds smin and si (if a job is done, its ω is set to zero, even if the WCET is
smax , but each one may run at its own execution speed dur- not exhausted).
ing the scheduling. We assume that, when a processor is Theorem 3. The function
idle, its execution speed is always fixed to the minimal com-
def X X
mon speed smin . We propose a low-complexity on-line al- Π(τu,v , t, t0 ) = m − PotActi (t, t0 ) − Ai (t, t0 )
τi ∈τ
gorithm that aims to further reduce the speeds of the CPUs τi ∈τ \{τu }

by performing “local” adjustments, when it is safe to reduce if non-negative, provides a lower bound of the number of
the speed below sol defined by Equation (2). available CPUs at time t0 ≥ t, when ignoring the schedule
We term our technique MOTE for Multiprocessor One of the current job of τu (if any).
Task Extension, since it is a multiprocessor version of the
technique proposed in [19] and usually referred to as OTE. Corollary 3. At each time t where a job τu,v is allocated
The idea is the following: the speed of a CPU can safely be to CPU P` , the earliest future time instant in the schedule
reduced below the speed sol during the execution of a job such that P` may be required by another job (possibly from
if the reduced speed does not change anything with respect the same task) is given by:
to the schedule of the subsequent jobs scheduled on that
min{t0 ≥ t | Π(τu,v , t, t0 ) ≤ 0} if m ≤ n

CPU. More precisely, subsequent jobs will not be delayed tnext =
by more (nor less) higher-priority workload than with sol . +∞ otherwise

4.3 MOTE scheme τ2,1 will be completed by their deadline. Consequently,
when ignoring the schedule of τ3,1 , we see that tnext is the
EDF(k) is a job-level fixed-priority consequently a job earliest time instant (after the time t) such that all processors
executed on a CPU can only be preempted upon its comple- may be required. Indeed, tnext is the earliest time instant af-
tion or the release of a (higher priority) job. In our scheme, ter time t such that Π(τ3,1 , t, tnext ) = 3 − 0 − 3 = 0.
the speed reduction of a job is decided when the job is al- Since tnext is the earliest time instant (after the cur-
located to a CPU, for the first time or when it resumes after rent time t) such that P3 may be required by another job
being preempted. Upon its release, a job is inserted into the than τ3,1 (assuming that all the other active jobs are sched-
ready-Q if it cannot receive a processor (i.e. all processors uled on other processors), one can conclude that P3 will
are used and the job is of lower priority). We do not make only execute the job τ3,1 between time instants t and tnext .
any assumptions on the CPU allocation rule when several That is, we proved that P3 can modify its working speed
CPUs are available for a single job. For instance, free CPUs in such a way that τ3,1 completes in the worst-case at time
can be granted according to the rule “smaller CPU index min{D3,1 , tnext } (or earlier if smin imposes it).
first.”
Since we consider multiprocessor platforms, we know Principle: Our on-line power-aware algorithm deals with
that we have to be very careful to any change in the origi- a priority rule that assigns a constant priority to each job. In
nal schedule because of scheduling anomalies. We say that this work, these priorities are determined by the algorithm
a scheduling algorithm suffers from anomalies if a change EDF(k) . Our power-aware algorithm is only applied when
which is intuitively positive in a schedulable system can a job τi,j is to be allocated to a CPU P` at time t during the
turn it unschedulable. An “intuitively positive change” is scheduling, which corresponds to its arrival or to the com-
a change which seems to help the scheduling, like reducing pletion of a higher priority job. At this time, our method
the density of a task (by increasing its period or reducing its determines the earliest time instant tnext such that P` may
execution requirement) or advancing the start-time of a job; be needed by another job. The function Π(τi,j , t, t0 ) (based
this can also be an increase of the number of processors on the deadlines of the jobs currently executing) is used to
on the platform. Unfortunately, multiprocessor platforms sweep the task set (with a running time linear in the num-
are subject to scheduling anomalies [2]. For that reason, ber of tasks). Notice that the function Π(τi,j , t, t0 ) could be
our on-line low-power mechanism only focuses on the last evaluated only at the deadline-times of the jobs currently un-
allocated-job and avoids to change the schedule of the other der execution and at the next (possible) arrival-time of every
jobs. task (since between these instants, the function Π(τi,j , t, t0 )
t tnext is constant). It follows from Corollary 3 that P` will not
A1,2
execute another job than τi,j until the time instant tnext .
P1 τ1,1
l ? The speed for τi,j can be safely reduced in such a way that
D1,1 it completes at time min{Di,j , tnext } (if the corresponding
A2,2
P2 τ2,1 speed is lower than the current one). Obviously, the work-
l ?
D2,1 ing speed of a processor can never be reduced under smin .
A3,2
P3 τ3,1
l ?
D3,1
- Algorithm 2: Determination of tnext
t0 Input: t, τi
Figure 1. Illustration of a 3-task system. Output: tnext
begin
na := number of active tasks at time t ;
Figure 1 illustrates the main idea of our on-line algo- L := set of the next deadline and possible arrival-time of
rithm when 3 tasks are scheduled upon 3 processors at speed each task, sorted by increasing order of the occurring time ;
tnext := t;
sol . This example shows a schedule where t is the cur- Π := m − (na − 1);
rent time, τ1,1 , τ2,1 and τ3,1 are the active jobs at time t while (Π > 0 and L 6= φ) do
(the ready-queue is empty since there are only three tasks in e ← L.top();
tnext := e.occurring time ;
the system) and plain circles and vertical arrows represent if (e.task 6= τi ) and (e.type == deadline) then
the deadlines and the (earliest) arrival times (since tasks are Π := Π + 1;
sporadic) of each task, respectively. Suppose that τ1,1 and else if (e.type == arrival) then Π := Π − 1;
τ1,2 are allocated to P1 and P2 . Before allocating τ3,1 to the L.pop() ;
return tnext ;
processor P3 , we see that P3 cannot be required by another end
job than τ3,1 until time tnext . Indeed, τ1,2 and τ2,2 could be
assigned (if they arrive at time A1,2 and A2,2 ) to the CPUs
P1 and P2 since the system feasibility ensures that τ1,1 and Let si denote the processor speed of the active job τi,j .

Algorithm 3: Speed-allocation to τi,j at time t the decision to reduce or not the CPU speed for a job τi,j is
Input: τi,j taken: when it is allocated to an available CPU (upon its re-
Output: φ lease, or when it is waiting for an available processor at the
begin head of the ready-Q and a job terminates its execution). A
// Initialization step
if (τi,j is allocated for the first time) then detailed description of the applied procedure at any alloca-
if (i < k) then si := λi ; tion time is given in Algorithm 3. Algorithm 2 shows how
λsum (τ (k+1) )
else si := λk + m−k+1
; to compute tnext with a linearithmic (also called quasilin-
// MOTE step ear) worst-case computing complexity O(n · log(n)), where
if (m ≤ n) then tnext := Call Algorithm2(t, τi ) ; n is the number of tasks.
else tnext := ∞ ;
if (tnext > t) then It worth noting that the MOTE step (see Algorithm 3) is
s
ω i (t)·s applied at most once to each job (and only if i > k); indeed,
si := min{si , min{Di ,t i }−t } ;
i,j next a job whose speed has been changed by this step will not be
if (si < smin ) then si := smin ;
τi,j is allocated to any available CPUs ; preempted in the future and thus will not be (re-)stored in
The speed of the designated CPU is fixed to si ; the ready-Q before its end of execution. However, when the
else No speed reduction can occur. The EDF(k) rule speed of a job (with a normal priority) is initialized but not
applies; τi,j either preempts the lowest priority job modified by the MOTE step at its arrival, it can possibly be
currently under execution or is allocated to any available reduced by the MOTE step in the future, if the job is at the
CPU, and the processor speed is fixed to si . ;
end head of the ready-Q and another job completes its execu-
tion. Section 5 shows that the MOTE algorithm indeed sig-
nificantly improves the energy consumption of a real-time
sporadic system.
This speed si is initialized when τi,j is released. In a sim-
ple version of the MOTE technique, the execution speed of
every released job is initially set to sol , since we assume
5 Experiments
that the priorities are assigned by EDF(k) and we proved
that the system feasibility is guarantee when it is scheduled
by EDF(k) at speed sol (Theorem 2). However, we adopt 5.1 Introduction
here another initialization step in order to profit from the
individual speed of each processor. In this “optimized” ini-
tialization step, two cases may arise at the arrival of the job In our simulations, we have scheduled periodic
τi,j : constrained-deadline systems (i.e., Ti is here the exact inter-
arrival delay for each task τi ). The energy consumption
1. if τi ∈ (τ \ τ (k) ) (the set of the (k − 1) tasks with of each generated system is computed by simulating the
highest densities), si is fixed to λi . three methods described in this paper during one hyper-
λsum (τ (k+1) ) period (i.e. the least common multiple of the task peri-
2. if τi ∈ τ (k) , si is fixed to λk + m−k+1 . ods); indeed, the authors of [9] show that, for the specific
We proved that all deadlines are met when the system is case of synchronous periodic task systems, the schedule
scheduled while using this rule. Then, when the job τi,j is to repeats from the origin with a period equals to the hyper-
be allocated to a CPU during the scheduling, we determine period. The three methods are: the off-line speed reduc-
the earliest time instant tnext such that Π(τi,j , t, tnext ) ≤ 0 tion for EDF (Equation (1)), the off-line speed reduction
and if tnext > t, one has: for EDF(k) (Equation (2)) and the MOTE algorithm (com-
bined with EDF(k) ). The energy consumptions generated
ωisi (t) · si
 
by these three methods are compared with the consumption
si := min si , (4)
min {Di,j , tnext } − t by the Smax method (i.e. all jobs are executed at the maxi-
mal processors speed smax ), while using different processor
We proved also that the system feasibility is not jeopardized models. During our simulations, about 5000 constrained-
by this speed modification. deadline systems were generated and simulated; with the
number of tasks n in [5, 40] (with density below 1 and
4.4 Implementation λsum (τ ) between 1 and 10). During each simulation, the
ACET of each job was generated using a pseudo-random
Before the system starts its execution, our algorithm generator. We made many graphics from our results, but
computes the speed sol by determining the optimal value of they are omitted here due to space limitation. To ensure
k thanks to Equation (2) (see Algorithm 1). Then, while the that the number m of processors is sufficient to schedule
system is running, there is only one kind of situation where the generated systems at speed smax , m is determined by

the following Equation (from [13]): results with the StrongARM SA-1100 processor
Method name Power saving over Smax Standard deviation
  
λsum (τ ) − λmax (τ ) offline EDF 4.33 % 3.34
m := min n, offline EDF(k) 27.12 % 10.24
1 − λmax (τ ) MOTE 44.74 % 8.82

5.2 Processor models results with the Crusoe processor


Method name Power saving over Smax Standard deviation
In our experiments, we used two realistic processor mod- offline EDF 0.62 % 0.76
offline EDF(k) 5.91 % 4.38
els. These models, noted P1 and P2 in the following, are
MOTE 23.3 % 7.55
derived from the processor Crusoe TM5400 from Trans-
meta and the processor StrongARM SA-1100 from Intel, Table 2. Simulation results.
respectively. In these two processor models, the voltage can
only vary in a limited range. Moreover, only a fixed num-
ber of functioning frequencies/voltages are available. For is, a speed reduction in the StrongARM implies a more sig-
that reason, we use the available processor speed immedi- nificant reduction of the system energy consumption. This
ately above the desired one, if the latter is not available. reduction is therefore even more significant when we use
Note that the use of the two adjacent frequencies to the re- the standard dynamic consumption model where the power
quested frequency is more efficient from an energy point of consumption function is modeled as a constant plus a cubic
view (see, for instance, [12]). Table 1 (adopted from [17] function (or at least a quadratic function) of the speed [22].
and [20]) summarizes the relationship between frequency, However, our results for this theoretical case are omitted
voltage, power consumption and the corresponding speed due to the space limitation.
for the Transmeta TM5400 (P1) and the StrongARM SA- According to [18], the Crusoe processor performs a
1100 (P2). speed transition less than 20 µs. This time overhead is
negligible for most real-time systems, since the order of
CPU Freq. (MHz) Volt. (V) Power (%) Speed magnitude of the task characteristics is about few millisec-
700 1.65 100 1 onds. With the Strong ARM SA-1100 processor, Pouwelse
600 1.60 80.59 0.857
et al. [17] report that a voltage/speed change can be per-
P1 500 1.50 59.03 0.714
400 1.40 41.14 0.571 formed in less than 140 µs. If this may not be considered as
300 1.25 24.60 0.429 negligible, since we have at most two speed transitions for
200 1.10 12.70 0.286 each job (one initially and one for a MOTE step), the “volt-
206 1.50 100 1 age change overheads” can be incorporated into the worst-
195 1.42 78.9 0.947
180 1.30 63.2 0.874
case execution requirement.
165 1.20 50.0 0.801
150 1.15 39.9 0.728 6 Future works
P2 135 1.10 33.6 0.655
120 1.08 33.0 0.583
105 0.95 19.8 0.510 Currently this work addresses the impact of the proposed
90 0.90 15.0 0.437 scheduling algorithms only on the dynamic power compo-
75 0.82 11.8 0.364
60 0.80 9.44 0.291
nent of the overall microprocessor power dissipation. Pro-
posed methods do not take into account the power dissi-
Table 1. Processors characteristics. pated to hold the circuit state and/or power dissipation due
to the imperfections of the physical implementation (static
Tables 2 provides the average consumption profit gen- power dissipation component). However it is a very well
erated by each method (expressed in percent), compared to known fact that for integrated circuits manufactured with
the consumption using the Smax method over the entire sim- technologies below 130 nm, and especially with current
ulation. 90 nm and 65 nm technologies, the static power dissipa-
tion component becomes very important and comparable
5.3 Observations to the dynamic power dissipation [10]. A significant re-
search effort has been provided, and is still deployed on
We observe a large variation in the power saving of our the static power dissipation reduction techniques. Proposed
algorithms when they are simulated upon the Crusoe pro- methods target not only low-level, hardware actions (such
cessor and upon the StrongARM SA-1100. This variation is as clock gating) but also higher-level (operating system)
due to the difference in the shape of their consumption func- actions forcing the processor to enter one of the multiple
tion: the consumption function of the StrongARM proces- low-power dissipation modes for better trade-off between
sor has a higher curvature than the Crusoe processor. That power saving and wake-up time (see [1] as an example).

The problem of the increased static power dissipation of the and Real-Time Computing Systems and Applications, pages
sub-micron technologies is the main motivation for our fu- 28–38. IEEE Computer Society, August 2007.
ture work, in which we will extend the existing controllable [8] J.-J. Chen, C.-Y. Yang, and T.-W. Kuo. Slack reclamation for
parameters of our scheduling algorithms (voltage and fre- real-time task scheduling over dynamic voltage scaling mul-
tiprocessors. In IEEE International Conference on Sensor
quency) with a processor switch-off parameter.
Networks, Ubiquitous, and Trustworthy Computing (SUTC),
Taichung, Taiwan, June 2006.
7 Conclusion [9] L. Cucu and J. Goossens. Feasibility intervals for multi-
processor fixed-priority scheduling of arbitrary deadline pe-
In this paper, we proposed two approaches which reduce riodic systems. In Design Automation and Test in Europe,
pages 1635–1640. IEEE Computer Society, 2007.
the energy consumption for real-time systems implemented [10] N. Ekekwe and R. Etienne-Cummings. Power dissipa-
upon multiprocessor platforms. The first one is an adap- tion sources and possible control techniques in ultra deep
tation of the first proposal “Global EDF”, called EDF(k) , submicron cmos technologies. Microelectronics Journal,
which allows a lower computing speed of the processors 37(9):851–860, September 2006.
than EDF. The second proposal (called MOTE) is an on- [11] R. Ernst and W. Ye. Embedded program timing analysis
line low-power algorithm which takes into account the “un- based on path clustering and architecture classification. In
used” CPU times to adjust the processor speeds while the Proceedings of the IEEE/ACM international conference on
system is running. We show in our experiments that this Computer-aided design, pages 598–604, California, United
on-line technique can significantly improve the processors States, 1997. IEEE Computer Society.
[12] B. Gaujal, N. Navet, and C. Walsh. Shortest path algorithms
energy consumption (up to 45% for the Intel StrongARM for real-time scheduling of fifo tasks with optimal energy
SA-1100). Moreover, our MOTE technique can incorpo- use. In ACM Transactions on Embedded Computing Sys-
rate the speed/voltage change overheads by simply adding tems, volume 4, pages 907–933, November 2005.
the speed transition time of the processors to the worst- [13] J. Goossens, S. Funk, and S. Baruah. Priority-driven
case workload of each task. Our two methods address spo- scheduling of periodic task systems on uniform multipro-
radic constrained-deadline real-time systems. This model cessors. Real Time Systems, 25:187–205, 2003.
includes the most popular one: the sporadic and implicit- [14] K. Kyong Hoon, B. Rajkumar, and K. Jong. Power aware
scheduling of bag-of-tasks applications with deadline con-
deadline task systems. The complexity of each decision (at
straints on dvs-enabled clusters. In Seventh IEEE Interna-
any job allocation-time) is linear in the number of ready
tional Symposium on Cluster Computing and the Grid, 2007.
jobs in the system. This low-complexity makes the MOTE CCGRID 2007, pages 541–548, May 2007.
strategy a very mighty technique. [15] C. Liu and J. Layland. Scheduling algorithms for multipro-
gramming in hard real-time environment. In Journal of the
References ACM (JACM), pages 46–61, february 1973.
[16] P. Pillai and K. Shin. Real-time dynamic voltage scaling
for low powered embedded systems. Operating Systems Re-
[1] Intel® pxa27x processor family optimization guide. view, 35:89–102, October 2001.
[2] B. Andersson. Static-priority scheduling on multiproces- [17] J. Pouwelse, K. Langendoen, and H. Sips. Dynamic voltage
sors. PhD thesis, Chalmers Univerosty of Technology, 2003. scaling on a low-power microprocessor. In Proceedings of
[3] R. Aydin, R. Melhem, D. Moss, and P. Mejia-Alvarez. the 7th annual international conference on Mobile comput-
Power-aware scheduling for periodic real-time tasks. IEEE ing and networking, pages 251–259, 2001.
Transactions on Computers, 53(5):584–600, 2004. [18] G. Quan and H. Xiaobo. Energy efficient fixed-priority
[4] S. Baruah and J. Anderson. Energy-aware implementation scheduling for real-time systems on variable voltage pro-
of hard-real-time systems upon multiprocessor platform. In cessors. In Proceedings of the 38th conference on Design
Proceedings of the ISCA 16th International Conference on automation, pages 828–833, 2001.
Parallel and Distributed Computing Systems, pages 430– [19] Y. Shin and K. Choi. Power conscious fixed priority schedul-
435, August 2003. ing for hard real-time systems. In Design Automation Con-
[5] M. Bertogna, M. Cirinei, and G. Lipari. Improved schedu- ference, pages 134–139, 1999.
lability analysis of EDF on multiprocessor platforms. In [20] A. Sinha and A. P. Chandrakasan. Jouletrack: a web based
ECRTS’ 05: Proceedings of the 17th Euromicro Conference tool for software energy profiling. In Proceedings of the 38th
on Real-Time Systems, 2005. conference on Design automation, pages 220–225, 2001.
[6] J.-J. Chen, H.-R. Hsu, and T.-W. Kuo. Leakage-aware [21] F. Zhang and S. Chanson. Processor voltage scheduling for
energy-efficient scheduling of real-time tasks in multipro- real-time tasks with non-preemptible sections. In 23th Real-
cessor systems. In 12th IEEE Real-Time and Embedded Time Systems Symposium, pages 235–245, 2002.
Technology and Applications Symposium, pages 408–417, [22] D. Zhu. Reliability-aware dynamic energy management in
2006. dependable embedded real-time systems. In Proceedings
[7] J.-J. Chen and T.-W. Kuo. Energy-efficient scheduling for of the 12th IEEE Real-Time and Embedded Technology and
real-time systems on dynamic voltage scaling (DVS) plat- Applications Symposium, 2006., pages 397–407, April 2006.
forms. In 13th IEEE International Conference on Embedded

216
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Finding Similar Answers in Data-Centric Sensor Networks ∗

I-Fang Su1† Yu-Chi Chung2 Chiang Lee 3


1,3
Department of Computer Science and Information Engineering
National Cheng-Kung University, Tainan, Taiwan, R.O.C.
2
Department of Computer Science and Information Engineering
Chang Jung Christian University, Kway Jen, Tainan, Taiwan, R.O.C.
1
emily@dblab.csie.ncku.edu.tw 2 justim@mail.cjcu.edu.tw 3 leec@mail.ncku.edu.tw

Abstract uously at every moment in time. Instead, sensors of-


ten detect the environment periodically. When a user
Intensive study has been dedicated to wireless sensor requests for data at a time point besides these detecting
networks and their applications in the last few years. time instants, the sensor will return an estimated value
However, similarity search problem in sensor network to the user. It is also possible that due to the impreci-
environments seems to have not attracted the deserved sion of sensor hardware, different sensors may generate
attention. In fact, sensor detected data are very likely slightly different sensor readings even though they are
imprecise due to the simplified hardware of the sensor in exactly the same environment. Similar or the near-
itself and various environmental factors. Hence, queries est data in these cases is important in answering a user
requesting for similar result should be an often scenario query.
and an important problem to resolve. In this paper, we Processing a similarity search query in a sensor net-
propose a Similarity Search Algorithm (SSA) for effi- work environment is not easy, mainly because sensors
ciently processing similarity search queries. We first are driven by low power batteries. The constraint on
present a data-centric storage structure based on the energy consumption has to be seriously considered. The
concept of Hilbert curve. Then, we propose an algo- past techniques for processing data of sensors mainly fo-
rithm designed for efficiently probing the most similar cused on two types of queries, point queries and range
data item for the sensor network. The performance study queries [6, 7]. A point query means to find results from
reveals that this mechanism is highly efficient and sig- sensors that own a value exactly matches the given value
nificantly outperforms other approaches in processing of the query. A range query is to retrieve results from
similarity search queries. sensors that have the values falling in the given range of
the query. While executing a point query in the sensor
network, the sensors only return those data that exactly
1. Introduction match the given query. Utilizing a point query process-
ing technique to process a similarity search requires that
A similarity search query is to find an object or a set the user issues multiple point query of similar conditions
of objects from the database that is/are similar to a given so as to retrieve similar data. However, processing mul-
query object. This problem has been studied in many tiple queries in this case causes a rapid energy consump-
database applications such as data mining, information tion to the sensors. Using a range query processing tech-
retrieval, image and video databases [1, 2], as well as in nique to process a similarity search, on the other hand,
distributed applications, such as Peer to Peer [3, 4] and faces two major problems. First, redundant results might
web services [5]. be transmitted to the query node. For example, assume
We find that the similarity search problem is also that a user is looking for a place where the temperature
highly required in sensor network environments. One is closest to 30o C if there is not a place at 30o C. The
of the main reasons is that due to the limited resources user issues a range query such as finding the tempera-
(e.g., battery power and network bandwidth), it is infea- ture between 25 o C and 35o C. Assume that there are six
sible for the sensors to monitor the environment contin- sensors detecting their temperature, 25o C, 27o C, 29o C,
32o C, 33o C and 35o C, falling in the range of the given
∗ This work is supported by National Science Council of Taiwan
query, those data will all be returned to the query node.
(R.O.C.) under Grants NSC95-2221-E-006-208.
† I-Fang Su is also a lecturer at the Department of Information Man- However, only the temperature 29o C is the true answer
agement Fortune Institute of Technology, No.1-10, Nwongchang Rd., to the query, and it is the only one needed to be trans-
Lyouciyou Village, Daliao, Kaohsiung County 831, Taiwan, R.O.C. mitted. Using this range query processing method, how-

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 217


DOI 10.1109/SUTC.2008.26
ever, extra five tuples are transmitted to the query node work environments.
which wastes the sensor’s energy. The second problem
is that it might not be easy for a user to specify an appro- 2. The data mapping is based on the concept of
priate range in a similarity search query. The reason is Hilbert curve, which is simple and easy to im-
that if the given range is too small, there may not be any plement. The indexing node to which a detected
result qualifying for the condition. If the given range is data item should be mapped can be determined dis-
wide, there may be too many qualifying results returned, tributedly by each sensor, which avoids centralized
which again wastes sensor energy. Therefore, the past data dispatching to indexing nodes.
point query and the range query processing techniques
3. The whole processing is in-network. The number
are improper for processing similarity search queries in
of involved indexing nodes in processing a simi-
sensor networks.
larity search query is only a few, which avoids the
A major challenge in processing a similarity search
need of transmission of local similar data from all
query in a sensor network is that each sensor is only a
sensors and therefore dramatically simplifies the
minirepository of an entire distributed sensor database.
task and reduces the energy consumption of sen-
Each sensor only has the knowledge of its local data, but
sors.
has no global knowledge of the entire sensor database.
Hence, while processing a similarity search query, each As a preliminary study of the problem, this paper
sensor does not know whether its local similar data is mainly focuses on processing similarity search queries
the globally most similar data and has to transmit its lo- for one-dimensional data which means that a query is
cal similar data to somewhere (e.g., the query node) for specified only for one type of events. We leave the
further verification. This causes a serious waste of sen- multi-dimensional part of the work as our future work.
sor energy for data transmission and data forwarding. The subsequent content of this paper is organized as fol-
Similarity search is referred to as a nearest neighbor lows. A representative previous work on data-centric
search in some environments [8, 9, 10]. However, the storage that can possibly be applied to processing simi-
proposed algorithms for the nearest neighbor search in larity search query is surveyed and discussed in Section
sensor network, such as Peer-Tree [11], KPT [12], are 2. Section 3 presents the proposed algorithm for a simi-
designed for a different goal. Those algorithms pro- larity search. The simulation results are given in Section
vide solutions to return the geographically nearest sen- 4. Finally, we state our conclusions and our future work
sor nodes to a desired query point, which is quite dif- in Section 5.
ferent from our focus of finding the data nearest to the
given query point. Since a data appearing location is
unpredictable, determining the nearest data in a sensor 2. Related Work
network is more complicated than finding the geograph-
ically nearest sensor node. To the best of our knowledge, Data-Centric Storage
In this paper, we propose a Similarity Search Algo- in Sensornets with a Geographic Hash Table (GHT) [6]
rithm (SSA) to overcome the above problems in pro- may be the most representative approach among the past
cessing similarity search queries. We choose a group DCS-based research that is applicable to processing sim-
of sensors, which are named the indexing nodes, to store ilarity search in a distributed sensor network. However,
data based on the Data-Centric Storage (DCS) concept there exist some obstacles for such schemes to process
[6]. These indexing nodes are so chosen from the entire a similarity search query. We illustrate them in the fol-
sensor network that they actually form a Hilbert curve lowing.
[13] in the network. The adjacent indexing nodes along Essentially, GHT hashes the type of an event into ge-
the Hilbert curve have data of similar values. Hence, ographic coordinates and stores the detected data of this
searching similar data in this arrangement becomes very event type at the sensor node geographically nearest the
easy. In this paper, we will discuss how this scheme is hashed coordinates of the event type. A sensor which
realized in a sensor network environment and how deep is responsible for storing the mapped data is named a
(i.e., how many levels) the Hilbert curve should be im- storage node, which is just like an indexing node in our
plemented. Our performance study indicates that the design. When a query is issued to a sensor, the sensor
proposed method provides a significantly lower query also hashes the event type of the query to a geographic
processing cost than a previous method while process- coordinates, and forwards this query to the storage node
ing a similarity search query. Another elegant feature is nearest the coordinates. The data of this storage node,
that this method is scalable with respect to the number which matches the query will be retrieved and sent to
of queries and the amount of detected data items. the query node. For instance, the sensors are responsi-
The main contributions of this paper are as follows. ble for detecting two event types, temperature and hu-
midity. Hence, two sensors are used for storing the ob-
1. This work is the first one to provide an algorithm served temperature and humidity, respectively, in GHT.
for searching similar data in wireless sensor net-

218
(0,100) (100,100)
3. Design of a Data-Centric Storage System
Supporting Similarity Search
In this section, a data allocation method is first illus-
trated in Section 3.1, and a data insertion mechanism is
designed for storing observed data into an indexing node
in Section 3.2. A search mechanism is proposed in Sec-
tion 3.3 which efficiently finds the answer for a given
(0,0) (100,0)
query. The symbols used in this section are summarized
in Table 1.
root node: (3,3)
level 1 mirror nodes: (53,3) (3,53) (53,53)
3.1. Data Allocation
level 2 mirror nodes: (28,3) (3,28) (28,28) (78,3) (53,28) (78,28)
(3,78) (28,53) (28,78) (78,53) (53,78) (78,78)
A space-filling curve is a thread that goes through all
the points in the space while visiting each point only one
Figure 1. Example of structured replica-
time, and imposing a linear order of points in the multi-
tion.
dimensional space. The Hilbert curve manifests superior
data clustering properties when compared with the other
space-filling curves [13, 14]. Thus, we adopt the Hilbert
curve to design the structure of indexing nodes in a sen-
As one type of events is mapped to one storage node, sor network.
the workload of this node can be very heavy as all The Hilbert curve is mathematically defined by a
queries asking for this type of events will be processed mapping of the unit interval [0, 1] in one dimension to
in this storage node. It easily makes this storage node a bounded region of a higher dimension space which is
a hot spot and depletes its energy much sooner than the called a Hilbert space. According to the number of level
other nodes. To alleviate the problem, the GHT team , the network is divided recursively into 4 square cells.
proposed the structured replication GHT (SR-GHT) [6] Determination of  is dependent on the data generation
to dispatch detected data to multiple storage nodes. The rate. The more generated data, the higher  is required.
replicated storage nodes are named the mirror nodes. A (The determination of  is to be discussed shortly.) Each
node that detects a data will store the data at either the square cell is a quadrant and there is a point P at
hashed node or the mirror node, depending on which the center of each quadrant. The Hilbert curve passes
one is closer to the location that the data is detected. For through each P of all quadrants [13, 15]. For exam-
example, Figure 1 shows a hierarchy of up to level=2 ple, in Figure 2 the sensor network is divided into four
structured replication. Each black dot, which is named a quadrants where =1 and each quadrant has one point Pi
root node, is the original storage node of GHT. The gray in the middle. These four points, P0 , P1 , P2 , P3 , of the
and white dots represent the mirror nodes of level=1 and four quadrants are strung in a linear order P0 , P1 , P2 , P3 .
level=2 respectively. The number of the replication level When  increases to 2, the network will be divided into
depends on the data generation frequency. It increases 16 quadrants as shown in the figure and the point of each
while data is frequently generated. If an event is detected quadrant is linked in the same manner as that in level 1
at (100,100) in Figure 1 for example, it will be stored at Hilbert space.
the mirror node at the upper-right cell as it is the closest
mirror node. Thus, SR-GHT reduces the cost for trans-
mitting the detected data and uplifts the availability of
P1 P2
sensors as it spreads the workload over multiple nodes.

This storage mechanism is however very inefficient


in processing similarity search queries. In processing P3 P2
a similarity search query, SR-GHT has to forward the 0
P0 P3
query to all mirror nodes as each of them has some por-
P0 P1
tion of the entire data and we do not know which one has
0 1 0 1
the most similar data. Also, all these mirror nodes have (a) Level 1 (b) Level 2
to participate in processing this query and send back the
result to the query node for finding the most similar data. Figure 2. Hilbert curve for level 1 and level
As a result, the processing cost as well as the communi- 2.
cation cost are both dramatically high. A new method
is required for processing similarity search queries effi-
ciently. We assume that sensors are uniformly deployed in the

219
Table 1. Summary of symbols and defini-
tions
Symbols Definitions
 Number of level in Hilbert curve
P Center point of a quadrant
SID Sensor number ID
IID Indexing node number ID
n Number of indexing nodes
A Total memory space for storing data (a) Hilbert curve (b) Sensor network
z Memory size of each sensor node
R Entire data range of an event
RL Lower bound of data range R Figure 3. Mapping Hilbert curve onto a
RU Upper bound of data range R sensor network.
r Sub-range of data range R, where r=R/n
IID
Vmin Minimum existing value of indexing node IID
IID
Vmax Maxmum existing value of indexing node IID
Vq The search value of a query z, the number of indexing nodes n should be n≥A/z.
IT The indexing node that query is forwarded to Since the number of indexing nodes is four to the num-
I
Vs ID Local similar data of indexing node IID
I ber of levels (n=4 ), the number of levels  equals to
RLID Lower bound of data range in IID
I log4 n≥log4 A/z.
RUID Upper bound of data range in IID
Mx,y Midpoint of two values x and y
3.2. Data Insertion Mechanism

network. All the sensors are homogeneous and each of Let the entire data range R of an event be [lower
them has a unique sensor ID, SID , and is aware of its bound RL , upper bound RU ]. For instance, the detected
own geographic location. We call each quadrant of a range for temperature in a sensor is usually in the range
Hilbert space a cell in the network and the number of of -40oC to 60o C in a wild area, and humidity is within
cells is equal to 4 . We choose one sensor that is closest the range 0% to 100%. If the number of indexing nodes
to the middle of a cell in the network as point P of each is n, which are I0 ,I1 ...In−1 . We equally divide R into n
quadrant in the Hilbert space. These sensors are respon- sub-ranges, each being equal to r. That is, n·r=R. The
sible for storing detected data and named the indexing sub-range of data for which the indexing node IID is re-
IID IID
nodes (IID ). The number of indexing nodes is equal to sponsible is defined as [RL , RU ), which is equal to
the number of cells, i.e., (4 ). [RL +(IID -1)·r, RL +IID ·r). For example, Figure 4(a)
Figure 3 shows an example of a Hilbert space of =1 shows a sensor network of partition level =1. There are
mapped upon a sensor network. Figure 3(b) is the sen- four indexing nodes in the sensor network, which are I0 ,
sor network corresponding to the Hilbert space in Fig- I1 , I2 , I3 . Assume that the value range of an event is
ure 3(a). Each black dot in Figure 3(b) represents a sen- [0, 1]. We equally split the range [0, 1] into four sub-
sor in the network, and each white dot (I0 , I1 , I2 , I3 ) is ranges [0, 0.25), [0.25, 0.5), [0.5, 0.75) and [0.75, 1],
a chosen sensor that is closest to the center of a cell and corresponding to the indexing nodes I0 , I1 , I2 , and I3 ,
is regarded as the indexing node of this cell. These four respectively. If =2 as shown in Figure 4(b), the num-
nodes compose a Hilbert curve of level=1 in the sensor ber of indexing nodes increases to sixteen, i.e., I0 , I1 , ...
network. Each of these indexing nodes then broadcasts and I15 . Hence, the sub-range of the first indexing node
a message to its neighbors to let its one-hop neighbors I0 is [0, 0.0625), that of the second indexing node I1 is
know that it is the indexing node of the cell. When a [0.0625, 0.125), and so on.
future message is sent to the center of this cell, these When a data is detected by a sensor, this sensor sends
neighbor nodes will direct the message to this indexing the data to the indexing node that is assigned a sub-range
IID
node. covering the detected data. Two parameters (Vmin ,
IID
Determination of a proper number of levels is a prob- Vmax ) are used here to record the minimum and the
lem that should be explained. Too many indexing nodes maximum existing values of an indexing node. These
(i.e., building a Hilbert curve of too many levels) will two parameters are initially set to 0. When a data is
degrade the efficiency of query processing. Too fewer stored in an indexing node, these two values are updated
indexing nodes, on the other hand, may not provide accordingly. For example in Figure 4(a), if a sensor de-
enough storage space for detected data items. We de- tects a data whose value is 0.3, the data will be sent to
termine the number of levels in this way. Assume the I1 because it belongs to the sub-range of I1 . And if 0.3
I1
detected data expire after a period of time so that they is the only data in I1 , the two parameters will be (Vmin ,
I1
do not need to be kept forever. The total memory space Vmax ) = (0.3, 0.3), as the minimum and the maximum
for storing data is A. If the memory size of each sensor is existing values are both 0.3.

220
I5
3.3.2. Query Probing Phase for Similar-
I6 I9 I10
[0.25, 0.5) [0.5, 0.75) [0.3125, 0.9375) ity Search Queries
I1 I2
I7 I8
I4 I11 Let the local similar data of indexing node IID be
[0.25, 0.3125)
VsIID . So the local similar data of target node IT is VsIT .
tȷ=0.3 tȷ=0.3
I3 I2 I12 We have the following possible cases.
I13
[0.1875 , 0.25) [0.125, 0.1875 )
I0 I3 0
[0, 0.25) [0.75, 1]
• Case 1 IT is nonempty (i.e., has local data) and
I0
[0, 0.0625)
I1
[0.0625, 0.125)
I14 I15
[0.9375, 1]
VsIT is the most similar local data in IT .
(a) Level 1 (b) Level 2 – Sub-Case 1 If VsIT is larger than Vq , then all
data in IT +1 must be even greater than Vq (be-
Figure 4. Hilbert curve for level 1 and level cause they are greater than VsIT ). But in IT −1
I
2. there may be a Vs T −1 which is closer to Vq .
– Sub-Case 2 If VsIT is smaller than Vq , then
all data in IT −1 must not be more similar to
Vq than VsIT is. But in IT +1 there may be a
3.3. Similarity Search Mechanism I
Vs T +1 which is closer to Vq .

• Case 2 IT is empty (i.e., no data is stored in IT ).


The data insertion mechanism presented in the previ- Vq has to be sent to both neighbors (i.e., IT −1 and
ous subsection allows a query to be easily processed af- IT +1 ) of IT to find the most similar data.
ter locating the indexing node that needs to be searched.
This, in effect, embeds a filtering mechanism that ele- We proposed three operations, backward probing, for-
gantly sifts out unqualifying data. The similarity search ward probing, and bi-directional probing, to deal with
mechanism consists of two phases, the similarity search the above cases in the probing phase. The backward
query resolving phase and the query probing phase. The probing and forward probing are designed for Sub-Case
query resolving phase determines an indexing node that 1 and Sub-Case 2 respectively, and the bi-directional
is most likely to provide an answer for the query. If the probing is for Case 2.
answer does not exactly match the given query, the query
probing phase is initiated for finding possible answers Backward Probing
from other indexing nodes. This type of probing is needed when VsIT is greater than
Vq . Since the greatest data in IT −1 is smaller than the
IT IT
lower bound RL of IT , the Vq , VsIT and RL can be
IT
3.3.1. Query Resolving Phase for Simi- used to determine whether Vs is the answer of Vq . If
IT
larity Search Queries VsIT is closer to Vq than RL is, VsIT is definitely the
answer of Vq . Otherwise, there may be proper answer in
IT −1 . More precisely, we can rephrase the above in the
When a similarity query is issued, the sensor that following. We denote the middle of two values x and
receives this query locates the indexing node whose y as the midpoint of x and y, Mx,y . Let x be VsIT and
data sub-range covers the given value of this query y be the lower bound RL IT
of IT . Hence, Mx,y is equal
by executing a locate() function, which is defined as IT IT
to (Vs +RL )/2. If Vq is greater than Mx,y , then none
follows.
of the data items in IT +1 will be closer to Vq than VsIT
is. So, VsIT is the answer. On the other hand, if Vq is
locate(Vq ) = locate the indexing node IID such smaller than Mx,y , then there may be more similar data
that RL +(IID -1)·r≤Vq <RL +IID ·r, where Vq is the in IT −1 . Hence, IT −1 has to be visited. In this case, the
search value given by the query. IT −1
greatest value Vmax of IT −1 should be sent to IT . The
IT −1
value of Vmax of VsIT that is closer to Vq is returned as
The query is then forwarded to the located indexing the answer. If there is no data in IT −1 , then VsIT is the
node IID to retrieve data. We call this indexing node the answer.
Target Node IT . IT compares Vq with its local data. If
a result that exactly matches Vq is found, the query exe- Forward Probing
cution is finished and the data is forwarded to the query This case is an opposite case to the previous one. It
node. The probing phase is unnecessary in this case. is needed when VsIT is smaller than Vq . In this case,
Otherwise, the probing phase is initiated for efficiently only the data that is greater than VsIT can be a more
IT
determining which adjacent indexing nodes should be similar data. Therefore, Vq , VsIT and RU are used to
accessed to find the most similar data. determine whether Vs is the answer of Vq . If VsIT is
IT

221
IT
closer to Vq than RU is, VsIT is the answer of the query. Query Value, Vq Similar Data,VsI ID Midpoint, M x , y
Upper Bound of IID , RUI I ID
Max. Observed Data of IID, Vmax
Otherwise, IT +1 might have a data item that is a more
ID

Lower Bound of IID ,RLI ID I ID


Min. Observed Data of IID, Vmin
proper answer. We also use Mx,y to illustrate whether
IT +1 should be visited. Let x be VsIT , and y be the RUI 2 75% RUI 2 75% RUI 2 75%
IT IT
upper bound RU of IT . Mx,y is equal to (VsIT +RU )/2.
If Vq is smaller than Mx,y , then none of the data items I2 I2 I2
in IT +1 will be closer to Vq than VsIT is. So, VsIT is the I1
I2
Vmin I2
Vmin
R RUI1 RUI1
answer. On the other hand, if Vq is greater than Mx,y , U 50%
48%
50%
47%
50%

then there may be more similar data in IT +1 . Hence, I1 I1 40%


IT +1 36.5% I1
IT +1 has to be visited. The largest data Vmin should 28% 30%
IT +1
be sent to IT . The value of Vmin or VsIT that is closer RLI1 25%
I0
RLI1 25% RLI1 25%
I0
Vmax Vmax
to Vq is returned as the answer. If there is no data in
IT +1 , then VsIT is the answer. I0 I0 I0
I0
R L 0% RLI0 0% RLI0 0%
Bi-directional Probing (a) (b) (c)
If IT has no data, then IT has to probe both IT −1
and IT +1 to find the most similar data. IT initiates Figure 5. Three operations of the probing
both the backward probing and the forward probing phase.
T −1 T +1
processes to find Vmax and Vmin , respectively,
and return the one that is closer to Vq as the result.
If IT −1 or IT +1 again contains no data, the pro- I2
and Vmin from I0 and I2 , respectively. Finally, I1 re-
cess will continue until the termination condition in turns the most similar data to the query node.
the backward or the forward probing process is satisfied.
4. Simulation Results
Example
We use Figure 5 to illustrate how the probing phase
In this section we verify the effectiveness of our
works. Assume that sensors are used for detecting hu-
work, the proposed Similarity Search Algorithm (SSA),
midity of the environment, and the indexing nodes are
by comparing it against SR-GHT in processing similar-
I0 , I1 , and I2 . The assigned data sub-ranges of I0 , I1
ity search queries. Since the communication cost is the
and I2 are [0%, 25%), [25%, 50%) and [50%, 75%), re-
main part of energy consumption of sensors, we use the
spectively. In Figure 5(a), if a query is issued to find the
number of exchanged messages as the comparison met-
humidity that is either 28% or the one that is closest to
rics.
28%, the query will be forwarded to I1 as the sub-range
of I1 covers Vq = 28%. That is, I1 is IT in this case. As- 4.1. Performance Model
sume that the local data of I1 that is closest to Vq is 48%.
That is, VsI1 is 48%. However, as VsI1 , which is 48%, is As the comparison system is SR-GHT, we use the
greater than Vq , which is 28%, I0 might have data closer same settings as that in SR-GHT. The sensors are uni-
to Vq than VsI1 is. Since VsI1 is 48% and the lower bound formly placed in the entire field. The number of sen-
of I1 is 25%, Mx,y is (48%+25%)/2 = 36.5%. As Vq is sors varies from 103 up to 105 and the radio range of
smaller than Mx,y , I1 should forward the query and VsI1 each sensor is equal to 40 meters. The node density
I0
to I0 to compare Vmax with the VsI1 . If Vmax
I0
is closer to of the sensor network is equal to 1 node per 256 m2
Vq thanVsI1 is, then Vmax
I0
is the answer and is returned to and the number of levels of the Hilbert curve varies in
the query node. Otherwise VsI1 is the most similar data the performance from one to four. We assume that sen-
of all. This implements a backward probing process in sors in this sensor network only detect data of one event
locating the answer. type. According to the analysis of SR-GHT, DCS sys-
Figure 5(b) shows another case that requires a for- tem performs well when the frequency of data genera-
ward probing. Let Vq = 47%. Again, let VsI1 be 30% as tion is higher than the frequency of query issued. For
shown in Figure 5(b). As Vq >VsI1 (i.e.,47%>30%) and fairness in comparison, the ratio of query issuing fre-
the midpoint is 40%, I2 might contain more similar data quency to data generation frequency in this simulation
than VsI1 is. Therefore, I1 forwards the query and VsI1 1
is 10 . In the experiments, each sensor on average gener-
I2 I2
to I2 to compare Vmin with VsI1 . If Vmin is closer to Vq ates ten data items and one query. The value of each data
I2
than Vs is, then Vmin is the answer. Otherwise VsI1 is
I1
item and he query value within each range are uniformly
the most similar data. distributed in the range [0, 1].
If the target node I1 has no data in its memory as The performance metrics employed in the simula-
shown in Figure 5(c), I1 has to implement a backward tions is the number of exchanged messages, which in-
I0
probing and a forward probing process to retrieve Vmax clude the data insertion cost and the query processing

222
(a) A sensor network with level 1. (b) A sensor network with level 2.

(c) A sensor network with level 3. (d) A sensor network with level 4.

Figure 6. Total cost of SSA and SR-GHT in different network sizes.

cost. For data insertion cost, we record the number of 2. The data insertion cost of SSA is higher than that
exchanged messages required for each data item that is in SR-GHT. That is because the detected data is
detected in one sensor and forwarded to the correspond- stored locally in SR-GHT, but is forwarded to an
ing storage node. For query processing cost, we record assigned indexing node in SSA and the assigned
the number of exchanged messages required for process- indexing node may be far away from the detecting
ing a similarity search query, which includes forwarding sensor. Though the data insertion cost of SR-GHT
the query to the corresponding storage node, executing performs well in insertion cost, SSA outperforms
the similarity search mechanism, and forwarding the re- SR-GHT in total cost.
sults to the query node.
3. While the number of level increases, the above two
4.2. Network size observations remain the same. The query process-
ing cost of SR-GHT increases exponentially with
We first compare the performance of processing sim- the increase of number of levels. However, the
ilarity search queries of SSA with SR-GHT under dif- query processing cost of SSA remains quite stable
ferent size of network. The SSA and SR-GHT are pro- when the number of levels increases.
cessed in the network that size varies from 103 sensors
to 105 sensors. The density of the network remains at 1
node per 256 m2 . Each sensor on average generates ten
detected data items when it issues one similarity search
query. Figure 6 gives the cost per sensor for level=1 up
to level=4 in the sensor network.
In the following, we list the interesting observations
found in Figure 6 .
1. The performance of query cost of SSA outperforms
SR-GHT in all scales of the network size. The
reason is that drastically fewer storage nodes are
visited while processing a similarity search query
in SSA, whereas, all the storage nodes have to be
visited in SR-GHT. As a result, the query cost of Figure 7. Comparison on the number of
SR-GHT increases significantly with the expansion levels in a large sensor network.
of the network size and the increase of the Hilbert
curve level.

223
4.3. Partition level Currently, we are extending the capability of this de-
sign to dealing with multi-dimensional similarity search
Also, we observe the performances of SSA and SR- queries. Indexing multi-dimensional data is difficult as
GHT under different partition level of sensor network. it requires an intelligent mapping to a two-dimensional
In Figure 7, we compare their average query processing sensor network so as to maintain the adjacency in the
cost, average data insertion cost, and average total cost two dimensional space. An efficient query processing
for the number of sensor nodes being equal to 1000. technique that works for multi-dimensional data is also
under designed.
1. Insertion cost and query cost: The insertion cost
of SR-GHT drops as the number of levels increases. References
The reason is that the observed data are stored
to the nearest mirror node in SR-GHT. When the [1] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang,
number of levels increases, the number of mirror B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele,
nodes greatly increases so that on average a de- and P. Yanker, “Query by image and video content: the qbic
system,” IEEE Computer, vol. 28, no. 9, pp. 23–32, September
tected data item becomes closer to a mirror node 1995.
to be stored there. The query cost of SR-GHT [2] S. Cost and S. Salzberg, “A weighted nearest neighbor algorithm
increases however much more significantly as the for learning with symbolic features,” Machine Learning, vol. 10,
number of levels increases, because all data in the no. 1, pp. 57–78, January 1993.
[3] B. I., K. S.R., and P. S., “Similarity searching in peer-to-peer
mirror nodes need to be sent to the query node. databases,” in Proceedings of 25th International Conference on
Distributed Computing Systems (ICDC05), Columbus, Ohio,
2. Total cost: The total cost of the SR-GHT method USA, June 2005, pp. 329–338.
is higher than the cost of SSA for all levels of the [4] P. Kalnis, W. S. Ng, B. C. Ooi, and K.-L. Tan, “Similarity queries
in peer-to-peer networks,” Information Systems Journal, vol. 31,
Hilbert curve. The higher the number of levels, the
no. 1, pp. 57–72, March 2006.
greater the difference between the two algorithms. [5] X. Dong, A. Halevy, J. Madhavan, E. Nemes, and J. Zhang,
Notice that the vertical scale in Figure 7 is mea- “Similarity search for web services,” in Proceedings of 30th
sured as the logarithm. Hence, when level=4, the VLDB Conference., Toroto, Canada, September 2004, pp. 372–
383.
SSA method is more than one order of magnitude [6] S. Ratnasamy, B. Karp, S. Shenker, D. Estrin, R. Govindan,
better than SR-GHT. L. Yin, and F. Yu, “Data-centric storage in sensornets with ght,
a geographic hash table,” Mobile Networks and Applications,
3. Sensitivity to the number of levels: Each cost vol. 8, no. 4, pp. 427–442, August 2003.
component of SSA remains about the same for all [7] X. Li, Y. J. Kim, R. Govindan, and W. Hong, “Multi-dimensional
range queries in sensor networks,” in Proceedings of the 1st in-
levels of Hilbert curve. Hence, the SSA method is ternational conference on Embedded networked sensor systems,
insensitive to the number of levels. The number of Los Angels, CA, November 2003, pp. 63–75.
levels actually represents the amount of generated [8] B. S., B. C., B. B., K. D. A., and K. H.-P, “Fast parallel similarity
data. When the amount of generated data increases, search in multimedia databases,” in Proceedings of ACM SIG-
MOD International Conference on Management of Data (SIG-
a higher level of Hilbert curve has to be used to MOD’97), Tucson, AZ, January 1997, pp. 1–12.
contain so much data. This means the SSA method [9] R. Weber, H. Schek, and S. Blott., “A quantitative qnalysis and
has a very nice feature that it is insensitive to the performance study for similarity search methods in high dimen-
amount of data generated. Hence, the method is sional spaces,” in Proceedings of the 24th International Confer-
ence on Very Large Data Bases (VLDB), New York City, New
perfectly scalable in terms of data size. York, August 1998, pp. 194–205.
[10] C. Doulkeridis, A. Vlachou, Y. Kotidis, and M. Vazirgiannis,
“Peer-to-peer similarity search in metric spaces,” in Proceedings
5. Conclusion of the 33th International Conference on Very Large Data Bases
(VLDB), Vienna, Austria, September 2007, pp. 986–997.
In this paper, we proposed the design and implemen- [11] M. Demirbas and H. Ferhatosmanoglu, “Peer-to-peer spatial
queries in sensor networks,” in Proceedings of the 3rd IEEE In-
tation of an algorithm for processing similarity search
ternational Conference on Peer-to-Peer Computing, Linkping,
queries in sensor networks. Our design applies the con- Sweden, September 2003, pp. 32–34.
cept of Hilbert curve to sensor networks such that se- [12] J. Winter and W.-C. Lee, “Kpt: a dynamic knn query processing
mantically related data are mapped to adjacent index- algorithm for location-aware sensor networks,” in Proceedings
of the 1st International Workshop on Data Management for Sen-
ing nodes. A similarity search algorithm was proposed sor Networks (DMSN), New York, NY, USA, August 2004, pp.
for efficiently processing similarity search queries. Such 119–124.
a query can be directly routed to an indexing node to [13] J. Griffiths, “An algorithm for displaying a class of space-filling
find the matching result or the one that is closest to the curves,” Software-Practice and Experience, vol. 16, no. 5, pp.
403–411, May 1986.
given query. The major advantage of this design is that it [14] N. Wirth, Algorithms and Data Structures. Englewood Cliff,
drastically reduces the communication cost for process- NJ: Prentice-Hall Inc., 1986.
ing similarity search queries. Our performance study [15] D.Hilbert., “Ueber stetige abbildung einer linie auf ein flashen-
showed that this design exhibits a superior performance stuck.” Mathematishe Annalen, vol. 38, pp. 459–460, 1891.
in terms of energy consumption in the sensor networks.

224
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Energy-Efficient Real-Time Co-Scheduling of Multimedia DSP Jobs ∗

Chien-Wei Chen, Chuan-Yue Yang, Tei-Wei Kuo Ming-Wei Chang


Department of Computer Science SoC Technology Center
and Information Engineering Industrial Technology Research Institute
Graduate Institute of Networking and Multimedia
National Taiwan University, Taiwan. Taiwan
Email: {r94057, r92032, ktw}@csie.ntu.edu.tw Email: MingWeiChang@itri.org.tw

Abstract In order to resolve the computing demands of multime-


dia applications, many embedded systems deploy one or
While DSP’s are now widely adopted in many embedded more digital signal processor (DSP), beside the micro pro-
systems in the cost minimization and the resolving of com- cessing unit (MPU). Such a strategy does help in reducing
puting needs of various multimedia applications, little work the cost and resolving the computing demands, in general,
is done for energy-efficient real-time job scheduling over but introduces more complexity in energy-efficient designs.
DSP’s. As motivated by the needs, a set of sliding-window- Since the workloads of many multimedia applications might
based algorithms are proposed. A sequence of time points come in burst or fluctuate violently, the performance of the
and their corresponding processor speeds is generated to MPU and/or the DSP should be adjusted accordingly so as
run jobs of a periodic task on the DSP, such as that for the to meet the needs in performance and energy consumption.
decoding of an H.264 stream. An online DVS scheme for The dynamic voltage scaling (DVS) technology is devel-
energy minimization with constrained buffer size consider- oped recently to let the MPU or the DSP to adjust its supply
ation is proposed, and the capability of the scheme is eval- voltage and clock frequency dynamically according to the
uated by a series of experiments over real and synthesized computing demand. Well-known example MPU’s are Intel
traces. It was shown that the energy saving can be up to StrongARM SA1100 and Intel XScale [20]. In recent years,
44%, and prediction errors were not significant enough to example dual-core platforms, such as the DaVinciTMfrom
result in more than 3% in deadline missing. TI [14] and the PACTM from ITRI [17], are also developed to
support DVS over DSP’s. Note that the power consumption
Keywords: Energy-Efficient Scheduling, Preemption function of a modern CMOS chip can usually be modeled
Control, Real-Time Systems, System-Wide Energy Effi- as a nondecreasing convex function of its clock frequency.
ciency. As a result, the decreasing ratio of the power consumption
is larger than that of the performance when the clock fre-
1 Introduction quency is lowered [7, 22].
In the past decades, energy-efficient task scheduling with
The strong demands in computing power from various various deadline constraints has received a lot of attention.
multimedia applications has provided a great driving force Many studies have been done for uniprocessor schedul-
for powerful platforms. The demands not just aim at desk- ing [2, 5, 9, 10, 15, 16, 18, 24]. Various heuristics were also
top computers but also result in quick emerging of many proposed for energy consumption minimization under dif-
embedded system platforms. Related example products are ferent task models in multiprocessor environments [1, 4, 8,
smart phones and personal multimedia players. Different 12, 13, 19, 23, 27, 28]. In particular, Yao, et al. proposed
from desktop computers and servers, the design problems an off-line optimal energy-efficient scheduling algorithm of
of embedded systems are further complicated by their re- independent tasks. An online competitive scheduling algo-
strictive resource supports and limited energy supply (due rithm for aperiodic jobs was also presented [24]. When
to battery-driven designs). How to have a good compromise the unpredictability of task execution cycles was consid-
between the system performance and the energy consump- ered, slack reclamation methods are proposed for energy-
tion is a very difficult problem. efficient real-time task scheduling in uniprocessor and mul-
∗ Support in parts by research grants from Excellent Research Projects tiprocessor environments [3, 11]. Researchers also explored
of National Taiwan University, 95R0062-AE00-07 and Industrial Technol- quality-of-service in a soft real-time fashion for energy sav-
ogy Research Institute of Taiwan, FY96-S5-10. ing [25, 26].

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 225


DOI 10.1109/SUTC.2008.54
Distinct from the past work, we are interested in energy- co-scheduling over a DSP. While DSP, such as TMS320C
efficient real-time job scheduling over a DSP, where a and TMS320D, are in general used to run jobs of specific
MPU and a DSP work together in energy-efficient real-time computing demands, the DSP workload is usually submit-
scheduling. We focus on the scheduling on the DSP because ted from MPU. It is more often than not that the meta in-
of the great variation of the workloads on the DSP. Take the formation of the pending DSP workload can be obtained
decoding of an H.264 video stream as an example. Jobs are before jobs of the workload are submitted from the MPU to
issued by the MPU to execute on the DSP such that a se- the DSP, e.g., the prediction of IDCT, intra-frame, and inter-
quence of jobs that correspond to the decoding workloads frame decoding time for jobs over the DSP based on the
of B, P , and I frames is issued in a periodic fashion. Jobs entropy decoding information over the MPU. To be more
can be pending on the DSP but must be completed within specific, energy-efficient real-time job co-scheduling over
their deadlines so as to avoid problems in the playing of the a DSP is to schedule jobs on the DSP so as to meet their
H.264 video stream. deadlines and to minimize the energy consumption of the
We are interested in the scheduling of jobs of a single pe- DSP, based on given job characteristics (from the MPU).
riodic task, such as that of PMP1 -like products, where jobs Note that the DSP workload consists of a set of pending
are submitted to the DSP every P time units. Note that even jobs submitted from the MPU in a dynamic way. Because
if an embedded system, such as a smart phone, has many of the buffer size limitation and application characteristics,
tasks, it is still reasonable to optimize the energy consump- the number of pending jobs at any time point is bounded by
tion of the execution of some video-stream-playing task for a fixed integer.
a non-trivial interval of time. In this work, we assume that With the advance in VLSI circuit designs, many DSP’s
the required DSP cycles of each job is known when it is now support dynamic voltage scaling (DVS) such that it is
submitted to the DSP. The prediction of the DSP cycles possible to explore the tradeoff between performance and
of jobs can be done by some online prediction methods, energy consumption. Similar to many MPU’s, the power
such as those for MPEG decoding at the frame level [6] consumption of the dynamic voltage scaling part of a DSP
or at the macro-block level [21]. The objective in energy- can be defined as a function p(s) of a given speed s [7, 22]:
efficient real-time job scheduling on the DSP is to derive a 2
speed schedule in executing jobs on the DSP such that all p(s) = Cef Vdd s, (1)
jobs meet their deadlines, and the energy consumption is 2
where s = k (VddV−V t)
, and Cef , Vt , Vdd , and k denote the
minimized. In other words, a sequence of time points and dd
effective switch capacitance, the threshold voltage, the sup-
their corresponding processor speeds should be determined
ply voltage, and a hardware-design-specific constant, re-
to run the jobs on the DSP during the runtime. In this pa-
spectively (Vdd ≥ Vt ≥ 0, k > 0, and Cef > 0). The
per, sliding-window-based online algorithms are proposed
DSP’s under considerations in this study are assumed to
in energy-efficient real-time job scheduling over a DSP. The
support a set of discrete processor speeds, denoted as a vec-
capability of the proposed algorithms are evaluated by a
tor A = {s1 , s2 , ..., sm }. Each speed si ∈ A corresponds to
series of experiments over real and synthesized traces. It
an operating mode of the DSP, and its power consumption
was shown that roughly 44% energy reduction was possible
is denoted as p(si ). The power consumption function p() of
for many cases, and prediction errors were not significant
a DSP is usually provided in a tabular form in its specifica-
enough to result in more than 3% in deadline missing.
tion.
The rest of this paper is organized as follows: In Sec-
The co-scheduling of multimedia jobs over a DSP with
tion 2, the task model and the problem definitions are pro-
DVS (or referred to as a DVS DSP) must meet the appli-
vided. Section 3 presents our voltage scaling algorithms.
cation performance requirements, such as a good quality in
Section 4 summarizes the experimental results. Section 5 is
H.264 stream playing. As motivated by PMP-like products,
the conclusion.
we considered the minimization of the energy consumption
in running jobs of a single periodic task over the DVS DSP,
2 Problem Formation provided that the job characteristics are known when they
are submitted to the DSP. A popular example application is
2.1 System Model the minimization of the energy consumption in the decoding
of a H.264 stream over a DVS DSP.
This work is motivated by the designs of personal Let (J1 , J2 , ..., JN ) denote the set of jobs submitted to
multimedia players (PMP) and related products, where the DSP, where jobs are belonging to one periodic task, and
energy-efficient real-time scheduling of multimedia jobs are jobs are indexed by their occurrence times. Let ci denote
needed. Instead of joint scheduling considerations of a mi- the required DSP cycles in executing job Ji 2 . The period
cro processing unit (MPU) and a digital signal processor of the task is denoted P such that a job is submitted to the
(DSP), we focus our study on energy-efficient real-time job DSP every P time units. Each job is also associated with a
1 Personal Multimedia Player 2 In practice, we could predict ci s with the methods presented in [6, 21].

226
schedule Φ = (S, T) such that the total energy consump-
R10 D10
tion E(Φ) is minimized, where there are at most B pending
B=6
W orkload

jobs on the DSP, and every job meets its deadline.


Sometimes scaling the supply voltage will affect the op-
eration of some components such as the pipeline in a pro-
c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 cessor and result in some error. To avoid this kind of error
0 P 2P 3P 4P 5P 6P 7P 8P 9P 10P in some platforms, we need to force each job to execute at
T ime
one speed. Therefore, we add this constraint to define the
Figure 1: The Job Executions over the DSP. ECMRSP problem. The difference between the ECMRSP
and ECMASP problems is on when we can adjust the DSP
speed. Note that the two problems are both online schedul-
deadline di , that is a multiple of P . Take H.264 decoding ing problems.
as an example: P = 1/F , where the frame rate is F frames
Definition 2 ECMRSP (Energy Consumption Minimiza-
per second (fps). Let B denote the maximum number of
tion at Restricted Scaling Points) problem:
jobs that are submitted and pending at the DSP, where B is
Given a task of N jobs, i.e., (J1 , J2 , ..., JN ), and a con-
restricted by the buffer size of the DSP and the application
stant B, the ECMRSP problem is to determine a legal speed
characteristics. Since jobs are submitted to the DSP in the
schedule Φ = (S, T) such that the total energy consump-
increasing order of their indices, it is required that at most
tion E(Φ) is minimized, where there are at most B pending
B consecutive jobs are pending at the DSP. As shown in
jobs on the DSP, every job meets its deadline, and each job
Figure 1, there are 6 pending jobs over the DSP, where there
only executes at one DSP speed.
are 10 jobs totally and B = 6. The deadline of the 10th job
is 10P , and we can start its processing at any time no earlier
than 4P and when all of its preceding jobs complete. 3 Online Voltage-Scaling Algorithms for
Speed Schedules
2.2 Problem Statement
3.1 An ECMASP Solution - SW-ASP
Since jobs over DSP execute one by one based on their
submission order, the problem in energy-efficient real-time The purpose of this section is to propose an online
job co-scheduling over a DSP is on the determination of voltage-scaling algorithm for the ECMASP problem, re-
the DSP speed in the job executions. The objective is to ferred to as the sliding-window-based algorithm for DVS
minimize the energy consumption without any violation of at arbitrary scaling points (SW-ASP). The idea is to look
the job deadlines. for the critical speed in executing each pending job on the
Let T = (t1 , t2 , ..., tn ) denote a sequence of time points DSP, where the critical speed of a pending job is the lowest
in voltage scaling between the time interval [t1 , tn ], and speed to execute the job and all of its preceeding pending
S = (ŝ(t1 ), ŝ(t2 ), ..., ŝ(tn )) denote the sequence of the cor- jobs without missing their deadlines.
responding DSP speeds in the interval. Φ = (S, T) is thus Since one job is submitted every P time units, let i ×
called a speed schedule for the given time interval. The en- P denote the current time, and e denote the index of the
ergy consumption of Φ can be derived as follows: currently executing job, where there are N jobs considered
in the scheduling. A sliding window of w jobs is assumed to
n−1
X review pending jobs to determine their critical speeds, and
E(Φ) = p(ŝ(ti )) · (ti+1 − ti ). (2) let B denote the maximum number of pending jobs on the
i=1
DSP. Algorithm 1 derives a legal speed schedule as follows:
A legal speed schedule Φ = (S, T) is one in which ev- Initially, the derived speed schedule is reset as null (Step
ery speed ŝ(ti ) is belonging to the set A of available speeds 1). Since we are supposed to derive a critical speed for each
of the DSP. In this paper, we are interested in legal speed time instance (i × P ), each iteration in the for-loop between
schedules only. Given any DSP under considerations, let Steps 3 and 13 derives a critical speed for the corresponding
A and p() denote the set of available speeds and the power time moment. However, when a derived critical speed is not
consumption function, respectively. The problem in energy- belonging to the set of available speeds of the DSP, some
efficient real-time job co-scheduling over a DSP can be de- interpolation must be done (Steps 5-9). Index o is used to
fined as follows: keep track of the speed and its corresponding time for such a
purpose. In each iteration of the for-loop, the corresponding
Definition 1 ECMASP (Energy Consumption Minimiza- critical speed is derived based on the following formula:
tion at Arbitrary Scaling Points) problem:   X
Given a task of N jobs, i.e., (J1 , J2 , ..., JN ), and a con- Lj
s̃i = max , where Lj = re + cl . (3)
stant B, the ECMASP problem is to determine a legal speed e≤j≤e+w−1 dj
e+1≤l≤j

227
Job J0 J1 J2 J3 J4 J5 J6 J7
Algorithm 1: SW-ASP
ci 0 2426 991 857 690 588 1671 525
Input: A set of jobs (J1 , J2 , ..., JN ) and the buffer di P 2P 3P 4P 5P 6P 7P 8P
Algorithm 1: SW-ASP
Input: … size B
Output: A legal speed schedule Φ = (S, T)
1  Initialize S and T as empty sequences;
2  o ← 1;
3  foreach scheduling point t̃_i = (i − 1) × P for i = 1, ..., N do
4    Derive the critical speed s̃_i = max_{e ≤ j ≤ min(i−1+B, N)} L_j / d_j, where L_j = r_e + Σ_{e+1 ≤ l ≤ j} c_l;
5    if s̃_i ∉ A then
6      Do the interpolation of s̃_i by a higher speed s̃_i^H and a lower speed s̃_i^L if needed based on Equation 5, where s̃_i^H and s̃_i^L ∈ A;
7      Add t_o = t̃_i and t_{o+1} = t̃_i + sw_i into T;
8      Add ŝ(t_o) = s̃_i^H and ŝ(t_{o+1}) = s̃_i^L into S;
9      o ← o + 2;
10   else
11     Add t_o = t̃_i into T;
12     Add ŝ(t_o) = s̃_i into S;
13     o ← o + 1;

Table 1: An Example Job Workload
Job   J8    J9    J10   J11    J12   J13   J14   J15
c_i   485   518   485   1534   467   501   514   470
d_i   9P    10P   11P   12P    13P   14P   15P   16P

[Figure 2: Legal Schedules Derived by Algorithm SW-ASP — (a) the derived legal schedule without interpolation; (b) the legal schedule after interpolation. Speed levels: s1 = 1520, s2 = 1013, s3 = 760, s4 = 0 cycles per period; time axis: 0 to 16P.]

Note that L_j = r_e + Σ_{e+1 ≤ l ≤ j} c_l denotes the total DSP cycles of pending jobs that must be completed by d_j, where J_e denotes the currently executing job, and r_e is its remaining DSP cycles. When the DSP is in the sleep mode at the corresponding moment, J_e denotes the next job in the buffer, and r_e = c_e. The setting of s̃_i is to derive the lowest speed that meets the deadline of every pending job in the sliding window. The size w of the sliding window at the time instance (i × P) can be derived by the following formula:

    w = B + min(i − 1, N − B) − (e − 1) = min(i − 1 + B, N) − e + 1.    (4)

This is because there are at most B pending jobs at the time instance (i × P), and e is the index of the currently executing job. Based on the above formulas, the critical speed at the time instance (i × P) is derived as shown at Step 4. When the derived critical speed s̃_i does not belong to the set A of available speeds of the DSP, some interpolation must be done (Steps 5-9): we interpolate the speed s̃_i by two available processor speeds s̃_i^H, s̃_i^L ∈ A, where s̃_i^H and s̃_i^L are the nearest available processor speeds that are higher and lower than s̃_i, respectively. A time value sw_i must be found to satisfy the following equation:

    s̃_i · P = s̃_i^H · sw_i + s̃_i^L · (P − sw_i)    (5)

A new speed and its corresponding time point are then added into the legal schedule. The idea is to have the DSP operate at the higher speed s̃_i^H for sw_i time units and then at the lower speed s̃_i^L for the rest of the period P. The setting of the speeds and their corresponding time points is as shown at Steps 6-8.

Algorithm SW-ASP can be better illustrated by an example. Consider the job workload shown in Table 1, where c_i is in terms of cycles. Suppose that the relative deadline of each pending job is 2P, where the i-th pending job is issued by the MPU at the time instance ((i − 1) × P). Suppose that the maximum number B of pending jobs on the DSP is 5. The critical speed s̃_i of each time point ((i − 1) × P) is as shown in Figure 2a. Suppose that the set of available speeds is (1520, 1013, 760, 0) cycles per period. The speeds and their corresponding time points after interpolation are as shown in Figure 2b.
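The per-scheduling-point computation of SW-ASP can be sketched in a few lines of Python. This is only an illustrative reading of Section 3.1, not the authors' implementation: all names are hypothetical, and the deadlines are expressed as time remaining from the current scheduling point, which is one way to read the L_j/d_j ratio in Step 4.

```python
# Illustrative sketch of the SW-ASP per-scheduling-point computation (Section 3.1).
import bisect

def critical_speed(remaining_cycles, pending_cycles, deadlines):
    """remaining_cycles: r_e of the executing job J_e;
    pending_cycles: c_l of the buffered jobs J_{e+1} .. J_{min(i-1+B, N)};
    deadlines: time left until d_e, d_{e+1}, ..., in the same order."""
    load, speed = remaining_cycles, 0.0
    for extra, d in zip([0] + list(pending_cycles), deadlines):
        load += extra                        # L_j = r_e + c_{e+1} + ... + c_j
        speed = max(speed, load / d)         # lowest speed meeting deadline d_j
    return speed

def split_period(s_tilde, available_speeds, period):
    """Interpolate s_tilde between the two nearest available speeds (Equation 5).
    Assumes s_tilde lies between the lowest and highest available speeds."""
    speeds = sorted(available_speeds)
    if s_tilde in speeds:
        return s_tilde, s_tilde, period      # run the whole period at one speed
    k = bisect.bisect_left(speeds, s_tilde)
    s_hi, s_lo = speeds[k], speeds[k - 1]    # nearest higher / lower available speeds
    sw = period * (s_tilde - s_lo) / (s_hi - s_lo)  # from s̃·P = s_hi·sw + s_lo·(P − sw)
    return s_hi, s_lo, sw
```

Solving Equation 5 this way gives the per-period split between two neighbouring available speeds of the kind depicted in Figure 2b.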
3.2 An ECMRSP Solution - SW-RSP

This section proposes a voltage-scaling algorithm for the ECMRSP problem. The proposed algorithm is referred to as the sliding-window-based algorithm for DVS at restricted scaling points (SW-RSP). The basic idea behind Algorithm SW-RSP is the same as that of Algorithm SW-ASP described in Section 3.1. The main difference between Algorithm SW-RSP and Algorithm SW-ASP lies in the setting of the scheduling points. Unlike the ECMASP problem, each job in the ECMRSP problem can only be executed at one DSP speed. In other words, once a job starts its execution at a DSP speed, it uses that speed to complete all of its execution. The scheduling points are therefore set as the time points right before the beginning of each job execution. Let i be the index of the first pending job of the sliding window of size w at the current scheduling point, i.e., the time right before the execution of J_i. Algorithm 2 derives a legal speed schedule for a task of N jobs under the constraint that the maximum number of pending jobs on the DSP is B.

Algorithm 2: SW-RSP
Input: Set of jobs (J_1, J_2, ..., J_N)
Output: A legal speed schedule Φ = (S, T)
1  Initialize S and T as empty sequences;
2  foreach Job J_i do
3    Derive the critical speed s̃_i = max_{i ≤ j ≤ min(⌊t̃_i/P⌋+B, N)} L_j / d_j, where L_j = Σ_{i ≤ l ≤ j} c_l;
4    Set t̃_i ← com_{i−1};
5    Add t_i = t̃_i into T;
6    if s̃_i ∉ A then
7      Round s̃_i up to the nearest higher available processor speed s̃_i^H;
8      Add ŝ(t_i) = s̃_i^H into S;
9    else
10     Add ŝ(t_i) = s̃_i into S;

The derived speed schedule is reset as null initially, similar to Algorithm SW-ASP (Step 1). For each job J_i, a critical speed s̃_i is derived just before the execution of J_i (Steps 2-10). s̃_i is derived based on the following formula (Step 3), where the window size at the current time is w:

    s̃_i = max_{i ≤ j ≤ i+w−1} L_j / d_j, where L_j = Σ_{i ≤ l ≤ j} c_l.    (6)

Similar to the derivation of w in Section 3.1, the window size w at the corresponding scheduling point t̃_i is derived as follows:

    w = B + min(⌊t̃_i/P⌋, N − B) − (i − 1) = min(⌊t̃_i/P⌋ + B, N) − i + 1.    (7)

Note that at time t̃_i, ⌊t̃_i/P⌋ new jobs have been issued from the MPU.

Different from Algorithm SW-ASP, Algorithm SW-RSP does not do interpolation. Instead, the derived critical speed of a job is rounded up to the nearest available processor speed s̃_i^H that is no less than the derived critical speed (Steps 6-8). Although the setting of a higher processor speed seems not very energy-efficient, the early completion of the job leaves more slack for the executions of the subsequent jobs. For simplicity in the presentation, com_i denotes the completion time of Job J_i, and let com_1 = 0.

Algorithm SW-RSP can be illustrated with the same example, as shown in Table 1. The resulting schedule is presented in Figure 3. If the energy consumption of executing a job at the speed s1, i.e., 1520 cycles per period, for one period P is taken as 1, the resulting energy consumptions of the speed schedules derived by Algorithm SW-ASP (shown in Figure 2b) and Algorithm SW-RSP (shown in Figure 3) are 5.498956 and 5.657138, respectively.

[Figure 3: An example of Algorithm SW-RSP — speed levels s1 = 1520, s2 = 1013, s3 = 760, s4 = 0 cycles per period; time axis: 0 to 16P.]
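A compact sketch of the SW-RSP loop, under the same illustrative assumptions as the earlier snippet (hypothetical names; absolute deadlines, with the required speed computed from the time left until each deadline), might look as follows. It is a sketch of the idea in Section 3.2, not the authors' code.

```python
# Illustrative sketch of Algorithm SW-RSP (Section 3.2).
def sw_rsp(jobs, available_speeds, period, buffer_size):
    """jobs: list of (cycles, absolute_deadline); returns [(start_time, speed), ...].
    Assumes the workload is feasible at the highest available speed."""
    speeds = sorted(s for s in available_speeds if s > 0)
    schedule, t = [], 0.0                       # t plays the role of com_{i-1}
    for i in range(len(jobs)):
        t = max(t, i * period)                  # J_i is issued by the MPU at i·P
        # Sliding window: jobs i .. min(floor(t/P) + B, N) - 1 (0-based indices).
        last = min(int(t // period) + buffer_size, len(jobs))
        load, s_crit = 0.0, 0.0
        for cycles, deadline in jobs[i:last]:
            load += cycles                      # L_j
            s_crit = max(s_crit, load / (deadline - t))
        # Round up to the nearest available speed no less than the critical speed.
        speed = next(s for s in speeds if s >= s_crit)
        schedule.append((t, speed))
        t += jobs[i][0] / speed * period        # completion time com_i at this speed
    return schedule
```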
4 Experiments

4.1 Environment Setup

The purpose of this section is to evaluate the proposed voltage-scaling scheme by both simulations and a real case study. The simulation part was based on the parameters of the PAC DSP. The case study was done by measuring system executions over a Texas Instruments DaVinci DVEVM (Digital Video Evaluation Module) development board. The speed modes available on the simulation platform, i.e., the PAC DSP, and the experimental platform, i.e., DaVinci, are listed in Table 2 and Table 3, respectively. Note that the last speed modes, i.e., the mode s4 in Table 2 and the mode s9 in Table 3, are the sleep mode.

Table 2: The Characteristics of the PAC DSP
Speed mode        s1    s2    s3    s4
Frequency (MHz)   456   304   228   0
Power (%)         100   46    28    ≤0.01

Table 3: The Characteristics of the Texas Instruments DaVinci DVEVM
Speed mode        s1    s2    s3    s4    s5    s6    s7    s8    s9
Frequency (MHz)   594   567   540   513   486   459   432   405   0

Two sets of video bitstreams were adopted for testing: one was for simulations, and the other was for the case study. The first test set consisted of bitstreams that are commonly used in the evaluation of MPEG compression algorithms, such as akiyo, coastguard, foreman, etc. These bitstreams were compressed in the MPEG-2 format with a resolution of 176 × 144 (qcif) pixels and consisted of roughly 250 to 300 frames. We decoded this set of bitstreams and measured their DSP execution cycles on the PAC platform. The simulations of the proposed scheme were then conducted according to the measured traces of the DSP execution cycles. The energy consumptions in the simulations were calculated based on the power consumption characteristics of the PAC DSP, as specified in Table 2. The second test set for the case study had long H.264 baseline videos and a larger resolution (720 × 480). We implemented the proposed scheme in the H.264 video decoder for DaVinci and measured the power consumption with an Agilent 34970A data acquisition/switch unit. The second test set is summarized in Table 4. In order to measure the performance of the algorithms, the normalized energy consumption was adopted as the performance metric for the minimization of the energy consumption; it was defined as the total energy consumption of the schedule derived by an evaluated algorithm divided by that without any voltage scaling. Deadline missing served as another performance metric, and it was used to measure the quality of the decoded video in the evaluation.

4.2 Simulation Results

Figure 4 shows the energy consumption in decoding the MPEG-2 bitstreams on the PAC DSP, where the prediction of decoding cycles was always 100% accurate. The Lower Bound denotes the lower bound on the energy consumption derived from the execution of all jobs at a common speed s̃ = Σ_{i=1}^{N} c_i / d_N.³ SW-ASP and SW-RSP indicate the energy consumption in decoding under Algorithm SW-ASP and Algorithm SW-RSP in the simulation, respectively. Each result was normalized to the energy consumption in decoding at the highest speed s1, as shown in Table 2. As shown in the figure, the proposed algorithms resulted in about 44% energy saving.

Figure 5 shows the energy consumption of the algorithms under consideration without perfect cycle prediction in decoding. The cycle prediction method FRAME TYPE proposed in [6] was adopted in the simulation experiments. The method utilizes the decoding information provided by the MPU, i.e., the past history of the frame execution cycles of different frame types such as I, P, and B. The method predicts the execution cycles of incoming video frames based on the average cycles of frames of the same type in the frame execution history. The method was adopted in the simulation experiments because of its good performance. As shown in Figure 5, the energy saving with the FRAME TYPE cycle prediction method was close to the corresponding one with perfect prediction. Moreover, since the prediction method was not perfect, some decoding jobs would miss their deadlines. As shown in Figure 6, the deadline miss rate was no more than 3% in the decoding. With such a low miss rate, we surmised that the dropping of frames would not cause trouble to the watching of the video streams.

³ If s̃ is not an available speed, we interpolate s̃ by executing at s̃^H for (s̃ − s̃^L)/(s̃^H − s̃^L) · d_N time units and then executing at s̃^L till d_N, where s̃^H and s̃^L are the higher and lower available speeds nearest to s̃, respectively.

[Figure 4: The Energy Consumption in Decoding with Perfect Cycle Prediction — normalized energy consumption (%) of Lower Bound, SW-ASP, and SW-RSP for the video bitstreams coastguard, news, tempete, singer, container, akyio, and foreman.]

[Figure 5: The Energy Consumption in Decoding with Imperfect Cycle Prediction — normalized energy consumption (%) of Lower Bound, SW-ASP, and SW-RSP for the same video bitstreams.]

4.3 Experimental Results - A Case Study

The proposed algorithms were also implemented and evaluated over a realistic platform, i.e., a Texas Instruments DaVinci DVEVM, with real traces. The cycle prediction method FRAME TYPE was adopted as in the simulation experiments. Figure 7 shows the energy consumption of Algorithm SW-RSP, compared to that under NO DVFS (i.e., decoding at the highest speed).
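As a rough illustration of the FRAME TYPE predictor described above (per-type running averages of past decoding cycles, as proposed in [6]), the following sketch uses hypothetical names and is not the implementation used in the paper.

```python
# Hedged sketch of a FRAME_TYPE-style cycle predictor: the predicted cost of the
# next frame is the running average of past decoding cycles of the same type.
from collections import defaultdict

class FrameTypePredictor:
    def __init__(self):
        self._totals = defaultdict(int)   # frame type ('I', 'P', 'B') -> total cycles seen
        self._counts = defaultdict(int)   # frame type -> number of frames seen

    def predict(self, frame_type, default=0):
        """Average cycles of previously decoded frames of this type."""
        n = self._counts[frame_type]
        return self._totals[frame_type] / n if n else default

    def update(self, frame_type, measured_cycles):
        """Record the actual decoding cycles once the frame has been decoded."""
        self._totals[frame_type] += measured_cycles
        self._counts[frame_type] += 1
```

The predicted cycles would then play the role of the c_i values fed to SW-ASP or SW-RSP in place of the measured traces.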
Algorithm SW-RSP achieved about 18% energy saving, which was less than the corresponding result in the simulation experiments over the PAC DSP. One of the major reasons was the limitation of the experimental platform DaVinci, which did not have good speed flexibility compared to the PAC DSP. Even though DaVinci offered up to 8 discrete speeds, its lowest speed was still not as low as that of the PAC DSP. Because of its design limitation, DaVinci was less power-efficient than the PAC DSP, as the PAC DSP was designed later. Another two reasons for the smaller energy saving in the case study were the power domain design of DaVinci and the limitation of the experiment instruments. Note that the DSP, MPU, DMA, and system buses of DaVinci were all on the same power domain, and all of them contributed to the measured amount of energy consumption in the case study. That evened out the energy saving achieved by the proposed algorithm in energy-efficient job scheduling over the DSP. However, we must point out that 18% is already a reasonably decent result in energy-efficient DSP scheduling.

Table 4: The Characteristics of the H.264 Video Bitstreams
                     2D Animation   3D Animation   Action      News Conference
Number of frames     17985          19182          19182       1751
Resolution           720 × 480      720 × 480      720 × 480   720 × 480
FPS                  29.97          29.97          23.98       25
Bitrate (kbps)       754.99         1237.58        1798.87     631.27
PSNR                 42.44          41.85          40.04       42.49
Scene changes        some           many           many        nearly none
Camera motion        a little       a little       radical     nearly none
Spatial complexity   low            medium         highest     low
Figure movements     medium         high           highest     lowest

[Figure 6: The Deadline Miss Rate with Imperfect Cycle Prediction — deadline missing (%) of SW-ASP and SW-RSP, up to about 3%, for the video bitstreams coastguard, news, tempete, singer, container, akyio, and foreman.]

[Figure 7: The Energy Consumption of Algorithms in the Case Study — energy consumption (J) under NO DVFS and SW-RSP for the 2D Animation, 3D Animation, Action, and News Conference bitstreams, with SW-RSP annotated at ratios between 82.0% and 83.0% of NO DVFS.]

5 Conclusion and Future Work

This paper is motivated by the need for energy-efficient scheduling of multimedia workloads over a DSP. A set of sliding-window-based algorithms is proposed to derive time points for the speed adjustment of a DSP and their corresponding speeds. We derive proper speed adjustment time points and their corresponding processor speeds to run the jobs of a periodic task on the DSP, such as those for the decoding of an H.264 stream. The capability of the proposed algorithms is evaluated by a series of experiments over real and synthesized traces. The proposed algorithms can save 44% of the energy consumption in many cases. Prediction errors also did not result in significant deadline missing or frame dropping.

For future research, we will further investigate the job partitioning problem between the MPU and the DSP. We shall explore the balance between the energy consumption and the performance. As multiple-core systems become more and more popular, research on energy-efficient real-time job scheduling among multiple DSPs will become even more critical in the near future.
References

[1] T. A. Alenawy and H. Aydin. Energy-aware task allocation for rate monotonic scheduling. In Proceedings of the 11th IEEE Real-time and Embedded Technology and Applications Symposium (RTAS'05), pages 213–223, 2005.
[2] H. Aydin, R. Melhem, D. Mossé, and P. Mejía-Alvarez. Determining optimal processor speeds for periodic real-time tasks with different power characteristics. In Proceedings of the IEEE EuroMicro Conference on Real-Time Systems, pages 225–232, 2001.
[3] H. Aydin, R. Melhem, D. Mossé, and P. Mejía-Alvarez. Dynamic and aggressive scheduling techniques for power-aware real-time systems. In Proceedings of the 22nd IEEE Real-Time Systems Symposium, pages 95–105, 2001.
[4] H. Aydin and Q. Yang. Energy-aware partitioning for multiprocessor real-time systems. In Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS), pages 113–121, 2003.
[5] N. Bansal, T. Kimbrel, and K. Pruhs. Dynamic speed scaling to manage energy and temperature. In Proceedings of the Symposium on Foundations of Computer Science, pages 520–529, 2004.
[6] A. C. Bavier, A. B. Montz, and L. L. Peterson. Predicting MPEG execution times. In SIGMETRICS '98/PERFORMANCE '98: Proceedings of the 1998 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, pages 131–140, New York, NY, USA, 1998. ACM Press.
[7] A. Chandrakasan, S. Sheng, and R. Broderson. Lower-power CMOS digital design. IEEE Journal of Solid-State Circuit, 27(4):473–484, 1992.
[8] J.-J. Chen, H.-R. Hsu, K.-H. Chuang, C.-L. Yang, A.-C. Pang, and T.-W. Kuo. Multiprocessor energy-efficient scheduling with task migration considerations. In EuroMicro Conference on Real-Time Systems (ECRTS'04), pages 101–108, 2004.
[9] J.-J. Chen and T.-W. Kuo. Voltage-scaling scheduling for periodic real-time tasks in reward maximization. In the 26th IEEE Real-Time Systems Symposium (RTSS), pages 345–355, 2005.
[10] J.-J. Chen, T.-W. Kuo, and H.-I. Lu. Power-saving scheduling for weakly dynamic voltage scaling devices. In Workshop on Algorithms and Data Structures (WADS), pages 338–349, 2005.
[11] J.-J. Chen, C.-Y. Yang, and T.-W. Kuo. Slack reclamation for real-time task scheduling over dynamic voltage scaling multiprocessors. In SUTC '06: Proceedings of the IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing - Vol 1 (SUTC'06), pages 358–367, 2006.
[12] F. Gruian. System-level design methods for low-energy architectures containing variable voltage processors. In Power-Aware Computing Systems, pages 1–12, 2000.
[13] F. Gruian and K. Kuchcinski. Lenes: Task scheduling for low energy systems using variable supply voltage processors. In Proceedings of the Asia South Pacific Design Automation Conference, pages 449–455, 2001.
[14] Texas Instruments. TMS320DM6446 DaVinci Digital Media System-on-Chip, 2007.
[15] S. Irani, S. Shukla, and R. Gupta. Algorithms for power savings. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 37–46, 2003.
[16] T. Ishihara and H. Yasuura. Voltage scheduling problems for dynamically variable voltage processors. In Proceedings of the International Symposium on Low Power Electronics and Design, pages 197–202, 1998.
[17] J.-M. Lu, H.-L. Wu, T.-M. Chiang, and W.-F. Chen. High performance and low-power dual-core SoC platform for portable multimedia applications. ITRI SoC Technical Journal, May issue:36–45, 2005.
[18] P. Mejía-Alvarez, E. Levner, and D. Mossé. Adaptive scheduling server for power-aware real-time tasks. ACM Transactions on Embedded Computing Systems, 3(2):284–306, 2004.
[19] R. Mishra, N. Rastogi, D. Zhu, D. Mossé, and R. Melhem. Energy aware scheduling for distributed real-time systems. In International Parallel and Distributed Processing Symposium, page 21, 2003.
[20] Intel XScale, 2003. http://developer.intel.com/design/xscale/.
[21] Y. Tan, P. Malani, Q. Qiu, and Q. Wu. Workload prediction and dynamic voltage scaling for MPEG decoding. In ASP-DAC '06: Proceedings of the 2006 Conference on Asia South Pacific Design Automation, pages 911–916, New York, NY, USA, 2006. ACM Press.
[22] M. Weiser, B. Welch, A. Demers, and S. Shenker. Scheduling for reduced CPU energy. In Proceedings of the Symposium on Operating Systems Design and Implementation, pages 13–23, 1994.
[23] C.-Y. Yang, J.-J. Chen, and T.-W. Kuo. An approximation algorithm for energy-efficient scheduling on a chip multiprocessor. In Proceedings of the 8th Conference of Design, Automation, and Test in Europe (DATE), pages 468–473, 2005.
[24] F. Yao, A. Demers, and S. Shenker. A scheduling model for reduced CPU energy. In FOCS '95: Proceedings of the 36th Annual Symposium on Foundations of Computer Science (FOCS'95), page 374, Washington, DC, USA, 1995. IEEE Computer Society.
[25] S. M. Yardi, M. S. Hsiao, T. L. Martin, and D. S. Ha. Quality-driven proactive computation elimination for power-aware multimedia processing. In DATE '05: Proceedings of the Conference on Design, Automation and Test in Europe, pages 340–345, Washington, DC, USA, 2005. IEEE Computer Society.
[26] W. Yuan and K. Nahrstedt. Energy-efficient soft real-time CPU scheduling for mobile multimedia systems. In SOSP '03: Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, pages 149–163, New York, NY, USA, 2003. ACM Press.
[27] Y. Zhang, X. Hu, and D. Z. Chen. Task scheduling and voltage selection for energy minimization. In Annual ACM/IEEE Design Automation Conference, pages 183–188, 2002.
[28] D. Zhu, R. Melhem, and B. Childers. Scheduling with dynamic voltage/speed adjustment using slack reclamation in multi-processor real-time systems. In Proceedings of the IEEE 22nd Real-Time Systems Symposium, pages 84–94, 2001.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Industry Panel 2

Sensor Network, where is it going to be?

Polly Huang
National Taiwan University

This is a sequel panel to “Sensor network: Is it for real or just for research?” held 2 years back in SUTC 2006. The sensor network community has worked hard trying to find the rightful places to make sensor networks useful since then. We have seen, over the last couple years, teams reporting experimental use of sensor networks deep in the mountains, up in the air, down underwater, sitting on the roads, deployed in the hospitals, and, the least expected but surprisingly plausible, worn by the dancers in performance art theaters. We all deserve a warm 'well done' given the effort.

It is perhaps time to ask ourselves again. Are these applications for real or just for everyone's amusement? Invited panelists are all experts and pioneers who have reported experimental uses of sensor networks in the real world. They will share with us openly the most difficult part of their work and discuss, hopefully also openly, whether they think the sensor network approach is (1) feasible, (2) economical, (3) useful, and, to the extreme, (4) commercializable (i.e., profitable).



DOI 10.1109/SUTC.2008.94
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

An Automated Bacterial Colony Counting System

Chengcui Zhang1, Wei-Bang Chen1, Wen-Lin Liu2, and Chi-Bang Chen3


1 Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL 35294, USA
2 Department of Accounting and Information Systems, University of Alabama at Birmingham, Birmingham, AL 35294, USA
3 Department of Transportation Management, Tamkang University, Taiwan, R.O.C.
{wbc0522, zhang}@cis.uab.edu

Abstract

Bacterial colony enumeration is an essential tool for many widely used biomedical assays. However, bacterial colony enumeration is a low-throughput, time-consuming and labor-intensive process since there might exist hundreds or thousands of colonies on a Petri dish, and the counting process is often manually performed by well-trained technicians. In this paper, we introduce a fully automatic yet cost-effective bacterial colony counter. Our proposed method can recognize chromatic and achromatic images and thus can deal with both color and clear medium. In addition, our proposed method can also accept general digital camera images as its input. The whole process includes detecting dish/plate regions, identifying colonies, separating aggregated colonies, and finally reporting consistent and accurate counting results. Our proposed counter has a promising performance in terms of both precision and recall, and is flexible and efficient in terms of labor- and time-savings.

Keywords: Biomedical image mining, colony counting.

1. Introduction

In biomedical research and clinical diagnosis, there is a great need to quantify the amount of bacteria in the samples. To analyze the result of a bacterial culture, bacterial colony enumeration is used to count the number of viable bacteria as colonies. This type of assay is performed by pouring a liquefied sample containing microbes onto agar plates and incubating the surviving microbes as the seeds that grow into colonies (a.k.a. colony forming units - CFU) on the plates. The evaluation is done by examining the survival rate of microbes in a sample. These assays are also widely used in biomedical examinations, food and drug safety tests, environmental monitoring, and public health [1].

However, bacterial colony enumeration is a low-throughput, time-consuming and labor-intensive process since there might exist hundreds or thousands of colonies on a Petri dish, and the counting process is often manually performed by well-trained technicians. Manual counting is an error-prone process since the results tend to involve subjective interpretation and mostly rely on persistent practice, especially when a vast number of colonies appear on the plate [2]. Thus, having consistent criteria is very important.

To reduce the operator's workload and to provide consistent and accurate results, colony counting devices have been developed and commercialized in the market [3]. We reviewed the counters available on the market and classified them into two categories. The first kind is the automatic digital counter, widely used in most laboratories. However, these are not truly automatic since they still require technicians to use a probe to identify each colony so that the sensor system can sense and register each count. The second type is the semi-automatic or automatic counter, which is often very expensive. These high-priced devices often come with their own image capture hardware for acquiring high quality images to optimize the counter's efficiency and performance. However, the affordability of this kind of equipment is still a non-trivial issue for most laboratories due to the high price of such equipment in the market. Some laboratories that need to perform a huge amount of enumeration tasks may require more than one high-throughput counter to fit their needs.



DOI 10.1109/SUTC.2008.50
Thus, colony enumeration devices pose a significant budgetary challenge to many laboratories [5].

In addition, some automatic counters accurately detect colonies by growing bacteria on special growth medium which contains fluorogenic substrates [8]. Bacteria metabolize the substrates and then produce a fluorescent product for detection. These systems are extremely sensitive and are good for detecting microcolonies. However, the fluorogenic substrates used in the medium are costly, and the fluorescence can only be detected by using a sensitive instrument. Besides, some automatic counters [4] still require users to manually specify the plate/dish area and provide parameters prior to the actual enumeration process. Some may need operators to adjust the threshold values in order to handle dishes/plates/medium that differ from their default settings. In such cases, human operators are heavily involved in the operation, and it is thus not efficient for high-throughput processing of plates/dishes.

Further, laboratories need to use various types of dishes and plates in examinations. However, most of the commercial counters are designed for measuring 60-150mm Petri dishes and thus lack the flexibility to accommodate plates with different sizes and shapes.

In addition to the above problems, some counters use only binary images for detecting colonies. Plenty of important characteristics of the colony, such as color, are lost although they could be used to identify the genus of the bacteria.

To address the above problems, our goal in this study is to design and implement an inexpensive, software-centered system for detecting bacterial colonies in a fully automatic manner. Thus, more time and money can be allocated to other priorities in those laboratories.

Nowadays, image acquiring devices such as digital cameras and flatbed scanners have become more popular and affordable. Hence, it motivates us to use these devices to obtain cost-effective yet high-quality images for colony counting.

In automating the bacteria colony counting process, one of the challenges is that the colors of bacteria colonies and culture medium vary. This is because different strains of bacteria may require different nutrients, and these ingredients give the culture medium various colors. In addition, bacteria grown on different kinds of culture medium may appear in different colors. Hence, while some bacteria colony images contain abundant color information, some do not. For example, Mutans Streptococci appears as black colonies on the blue Mitis-Salivarius agar, and Escherichia Coli appears as white colonies on the clear/transparent LB agar. Based on our experience in using existing software for colony counting, there is no single best algorithm that can satisfy the needs of different types of medium. Hence, we believe it is more appropriate to process images that do or do not carry color information separately.

Although human operators can easily recognize colonies on medium after some training, computers can hardly “see” these colonies without any prior knowledge. This motivates us to design a three-step approach for simulating a human's recognition behavior. When a human operator examines a bacteria colony image, he gradually identifies objects from the image. First, the dish/plate region, the largest object in the image, is identified. Second, within the dish/plate region, the human starts to find colonies based on some criteria such as color and shape. An illustration of this hierarchy is shown in Figure 1. If colonies are clustered together, the operator will try to separate the clustered colonies based on his best visual judgment. Once all colonies have been identified, the operator counts the total number of colonies.

In this paper, the proposed framework differentiates the processing of colony images with rich color information from that of those with little color information. It first determines whether or not an image carries color information. Then, it locates the dish/plate area. In the third step, possible colony candidates are identified, which are subject to a further statistical test in order to identify the 'true' colonies. Clustered colonies are further separated by the Watershed algorithm [7]. Finally, the number of remaining segments is reported back as the total colony count.

In the remainder of this paper, we introduce the system details in Section 2. Section 3 demonstrates experiment results. Section 4 concludes the paper.

[Figure 1. The hierarchical structure of objects in bacteria colony images]
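The three-step framework just described (image-type check, dish/plate localization, colony detection with cluster splitting, then counting) can be summarized as a small driver sketch. Every name below is a hypothetical placeholder for the corresponding step described in Section 2, not an API from the paper; the concrete detectors are passed in as callables because they are only described conceptually here.

```python
# High-level sketch of the proposed counting flow; all names are illustrative.
def count_colonies(image, *, is_chromatic, find_plate, find_colonies_color,
                   find_colonies_clear, split_clusters):
    plate = find_plate(image)                     # locate the dish/plate region
    if is_chromatic(image):                       # color medium vs. clear medium
        candidates = find_colonies_color(image, plate)
    else:
        candidates = find_colonies_clear(image, plate)
    colonies = split_clusters(candidates)         # watershed-style separation
    return len(colonies)                          # remaining segments = total count
```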
2. The proposed bacterial colony counter

The overview of the proposed system is illustrated in Figure 2. First, for an input image, we examine its chromatic/color components, and a proper processing method is selected depending on what type of image (chromatic/achromatic) it is. In the next step, we gradually extract objects from the image step-by-step based on their hierarchical structure, as illustrated in Figure 1.

[Figure 2. The overview of the proposed system — flowchart: Color Feature Detector → color feature detected? → (Yes) segmentation with color feature / (No) segmentation without color feature, each consisting of a Dish/Plate Region Detector followed by a Colony Detector → Colony Enumeration.]

Once all the colonies in the image are identified, we check the morphology of each segment. This is necessary because some colonies may aggregate together to form a large cluster. Hence, to obtain an accurate colony count, those clustered colonies need to be separated. For this purpose, we adopt the Watershed algorithm [7] to detect and segment those plausible colony clusters. Once all the colonies on the dish/plate have been identified and isolated, we simply count the number of detected segments and use it as the total count of bacteria colonies.

2.1. Color feature detection

As mentioned in Section 1, some bacteria colony images may contain abundant color information. For those colored images, we propose a color feature based method to detect foreground objects in the target region (region-of-interest). On the other hand, those images with very little color information (almost no hue) shall be dealt with in a different way.

To choose a proper method for different types of images, we first need to determine whether the imported image is chromatic or achromatic. The check is achieved by examining the standard deviation of the mean values of the color channels R, G, and B. This is based on the fact that if the RGB values of a pixel are close to each other, it is most likely a gray pixel, and vice versa. Thus, a small standard deviation indicates low hue or a lack of chromatic components. The smaller the standard deviation is, the higher the possibility that the image is achromatic (e.g., those colony images with clear medium and white colonies). It is at this point that chromatic images (e.g., Mutans Streptococci appearing as black colonies on the blue Mitis-Salivarius agar) are distinguished from achromatic images (e.g., Escherichia Coli appearing as white colonies on the clear LB agar), which will be dealt with separately.

In view of the different characteristics of achromatic and chromatic images, we develop different methods for these two types of images in the subsequent image segmentation step.

2.2. Image segmentation

The core of the proposed bacterial colony counter is image segmentation. The goal of segmentation is to distinguish foreground objects from the background. For this purpose, there are two popular choices of techniques, namely thresholding techniques and clustering techniques. The thresholding techniques use a global threshold value to separate foreground and background, and the clustering techniques partition objects based on their inter- and intra-class similarities. The thresholding techniques are quite straightforward and efficient, but are not stable when dealing with images containing more than two classes. According to our preliminary experiments, the performance of multi-class clustering methods, which are more complicated and time-consuming, is generally worse than that of the thresholding techniques in terms of robustness, explainability, and projectibility. In this paper, we propose a new thresholding-based technique for bacterial colony image segmentation.

To do segmentation with thresholding techniques, we have to solve the problem that the target region contains more than two classes. The natural hierarchical structure of the objects in colony images (as shown in Figure 1) indicates that we may be able to separate them gradually in a progressive way.

Our target, at the first level, is the entire image. The foreground object is the dish/plate region, and the background is the area surrounding the dish/plate region. After the dish/plate area is separated, we move to the second level, in which the foreground objects are colonies and the background includes medium and other artifacts within the dish/plate region.
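A minimal sketch of the chromatic/achromatic check in Section 2.1 is given below, under one plausible reading of the description (spread between the per-channel means, with a per-pixel variant noted in a comment). The threshold value and all names are assumptions for illustration, not taken from the paper.

```python
# Hedged sketch of the color-feature check (Section 2.1): gray pixels have nearly
# equal R, G, B values, so a small spread across channels suggests an achromatic image.
import numpy as np

def is_chromatic(rgb_image: np.ndarray, threshold: float = 10.0) -> bool:
    """rgb_image: H x W x 3 array; `threshold` is an illustrative value."""
    pixels = rgb_image.reshape(-1, 3).astype(float)
    channel_means = pixels.mean(axis=0)        # mean R, mean G, mean B over the image
    spread = channel_means.std()               # small spread -> likely achromatic
    # (A per-pixel variant would average pixels.std(axis=1) instead.)
    return float(spread) > threshold
```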
As aforementioned, we propose different approaches to deal with images with chromatic and achromatic medium separately in the subsequent processes. It is much easier to perform segmentation on chromatic images than on achromatic images since they contain more color information. In the following discussions, we first introduce how to perform segmentation on chromatic images, and then the approach for achromatic images.

2.3. Segmentation on chromatic images

In this step, our goal is to identify the dish/plate region in an image and then recognize the colonies in the detected dish/plate region. The motivation is to reduce the operator's workload by eliminating the process of manually specifying the target dish/plate region. To distinguish the dish/plate region from the background, we first perform contrast limited adaptive histogram equalization (CLAHE) on the converted grayscale image, which operates on small regions called tiles rather than on the entire image [6]. Each tile's contrast is enhanced, and the neighboring tiles are then combined using bilinear interpolation to eliminate artificially induced boundaries.

Then we apply Otsu's method [9] on the contrast-enhanced image to identify the dish/plate region as the target region. For some target regions detected this way, there may be small holes inside, so we fill in the holes by adopting a morphology-based method and consolidate the target regions.

Sometimes, this method can also detect some smaller objects that are not part of the target dish/plate region. We assume the target region should occupy the majority (and central) part of the image; thus there is an extra step in our algorithm which is designed to remove those isolated small objects. A few detected target regions of dishes/plates, after applying the above steps, are shown together with their original images in Figure 3.

The results show that the automatic dish/plate region detection algorithm is effective regardless of the size and shape of the dish/plate. After the dish/plate region has been extracted, we can apply the segmentation again on the detected dish/plate region.

The second step is to isolate colonies on the dish/plate, identify clustered colonies, and separate aggregated bacteria colonies for subsequent colony enumeration. In addition to using Otsu's method [9] to separate colonies and medium, we also adopt color similarity in the HSV (Hue-Saturation-Value) color space to assist the colony boundary detection [10]. This is necessary because a simple global threshold cannot correctly identify all colonies due to the existence of artifacts such as scratches, dust, markers, bubbles, reflections, and dents in the image. The calculation of color similarity in the HSV color space is shown in Equation 1:

    CS_ij = 1 − (1/√5) · √((x_j − x_i)² + (y_j − y_i)² + (z_j − z_i)²),
    x_i = S_i · cos(H_i · 2π), y_i = S_i · sin(H_i · 2π), z_i = V_i    (1)

where CS_ij is the color similarity of two pixels i and j, and H, S, V are the hue, saturation, and value of a pixel in the HSV color space.

This is based on the assumption that pixels inside a segment, no matter whether it is a colony segment or a medium segment, have higher similarity values with their neighboring pixels, and pixels along a segment boundary have lower similarity values with their neighbors. We calculate the color similarity values between a pixel and its eight neighbors, and use the minimum similarity value to represent the maximum color difference with its neighbors. Thus, pixels in the same segment have higher minimum similarity values. On the contrary, pixels on the boundary of a segment have lower values. After the calculation, the boundaries are more evident, and the minimum color similarity values form a matrix that can be treated as a grayscale image. Thus, we can adopt the Otsu's method used in the dish/plate region detection stage to further distinguish the background (medium areas) from the foreground objects (candidate colonies).

[Figure 3. Segmentation results for detecting dish/plate regions. Raw images (left column); Otsu's method (middle column); proposed method (right column).]
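The color-similarity map of Section 2.3 — Equation 1 evaluated against each pixel's eight neighbours, keeping the minimum — can be sketched as follows. This is an illustrative reconstruction (the √5 normalization follows Equation 1 as written above), not the authors' code, and the names are hypothetical.

```python
# Hedged sketch of the HSV color-similarity map (Section 2.3, Equation 1).
# hsv is an H x W x 3 float array with H, S, V each scaled to [0, 1].
import numpy as np

def min_neighbor_similarity(hsv: np.ndarray) -> np.ndarray:
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    # Map each pixel into the (x, y, z) cylinder used by Equation 1.
    pts = np.stack([s * np.cos(h * 2 * np.pi),
                    s * np.sin(h * 2 * np.pi),
                    v], axis=-1)
    sim_min = np.ones(hsv.shape[:2])
    # Compare every pixel with its eight neighbours via array shifts.
    # (Border pixels wrap around here; a real implementation would pad instead.)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(pts, dy, axis=0), dx, axis=1)
            dist = np.linalg.norm(pts - shifted, axis=-1)
            sim = 1.0 - dist / np.sqrt(5.0)          # Equation 1
            sim_min = np.minimum(sim_min, sim)
    return sim_min   # low values mark likely boundaries; threshold e.g. with Otsu
```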
Ideally, an isolated foreground object from the previous step corresponds to one colony. However, such an object may correspond to more than one colony because several colonies may cluster together. There is a need to split them in order to get the correct colony counts. To separate the connected colonies, we consider the intensity gradient image as a topological surface [7]; thus the watershed algorithm can be applied to divide clustered colonies in the image, just as water floods a topographical surface. To illustrate the concept, we demonstrate the application of the watershed algorithm in Figure 4.

[Figure 4. The concept of the watershed algorithm]

After applying the watershed algorithm, almost all clustered colony segments can be separated and identified and are ready for the colony enumeration.

2.4. Segmentation on achromatic images

The most challenging part of this research is to deal with achromatic images. Most of the existing colony counters have disappointing performance in handling achromatic images due to the low contrast between colonies and medium. Besides, the background artifacts look very similar to colonies in the clear agar, making it more difficult to discriminate the background artifacts from real colonies in the dish/plate.

To handle achromatic images, our method is also based on the hierarchical structure of objects mentioned above. In the first step, the dish/plate region can be detected by using the same approach as described in Section 2.3. In the colony detection stage, we develop a different method to alleviate the low-contrast and artifact problems. We also apply Otsu's method to isolate colonies. However, Otsu's method is much less accurate on achromatic images due to the presence of artifacts. An additional noise removal step is therefore performed for those achromatic images.

The color similarity described in Section 2.3 cannot be applied since achromatic images lack color information. In this paper, we propose a new statistical approach to detect and remove those artifacts and preserve only the colonies.

Our proposed statistical approach includes two steps. The first step is to remove large-size artifacts. We collect the sizes of all objects detected by Otsu's method from the dish/plate region and generate a frequency distribution of the logarithm of those size values. Colonies of similar size should occupy the high-frequency segment of this distribution, and the frequencies of those very large artifacts should be very low. By this assumption, we can remove those large objects. The second step is to remove small artifacts which are very similar to the colonies in the dish/plate. In this step, area size is not a good determinant since the area size range of those small artifacts is about the same as that of colonies. Instead, we consider the intensity distribution of the dish/plate region as a two-peak distribution which consists of the distribution of medium pixels (background) and the distribution of colony pixels. Those small artifacts belong to the background distribution; however, they overlap with the colony distribution. Therefore, we assume that colonies should have significantly different intensity values than their surrounding background, and it is highly possible that those small artifacts have intensity values similar to their surrounding pixels. Based on this assumption, we examine each small object, including colonies, by hypothesis testing. In the hypothesis testing, we use the mean of the surrounding pixel values as the null hypothesis and test whether the mean of the object pixel values differs significantly from it, at α = 0.01.

After excluding most of the artifacts, we apply the watershed algorithm to separate clustered colonies as described in Section 2.3.

2.5. Colony enumeration

After all colonies have been properly separated and identified, the final step is to acquire the total number of viable colonies by adding up the number of the objects that have been identified as colonies.
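Sections 2.3-2.5 describe splitting clustered blobs with the watershed transform and counting the remaining segments. A hedged sketch using off-the-shelf SciPy/scikit-image routines is shown below; the marker rule (thresholding the distance transform at half its maximum) is an assumption for illustration, not the paper's exact seeding.

```python
# Minimal sketch of the cluster-splitting and counting step (Sections 2.3-2.5),
# assuming `binary` is the foreground mask produced by the earlier stages.
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def count_colonies(binary: np.ndarray) -> int:
    # Distance transform: pixels deep inside a blob get large values.
    distance = ndi.distance_transform_edt(binary)
    # Seed one marker per blob "core" (a per-blob rule would be used in practice).
    cores = distance > 0.5 * distance.max()
    markers, _ = ndi.label(cores)
    # Flood the negated distance map so touching colonies are split apart.
    labels = watershed(-distance, markers, mask=binary)
    return int(labels.max())
```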
3. Experimental results

In our experiments, we use four different digital cameras as the image acquiring devices to obtain dish/plate images for bacterial colony detection. The four digital cameras include a Nikon D50 Digital SLR Camera (6.0-megapixel) with a resolution of 3008 × 2000, a Canon PowerShot A95 Camera (5.0-megapixel) with a resolution of 2592 × 1944, a Sanyo DSC-J1 Camera (3.2-megapixel) with a resolution of 1600 × 1200, and an Asus P525 PDA cell phone built-in camera (2.0-megapixel) with a resolution of 1600 × 1200.

Additionally, Petri dishes with two different types of medium and bacteria strains are used in our experiments. The first type of images is obtained from the Department of Pediatric Dentistry at the University of Alabama at Birmingham. This type of plate contains blue Mitis-Salivarius agar, which is used for isolating Mutans Streptococci. These acid-producing bacteria attack tooth enamel minerals and cause dental caries. The second type of plate is obtained from the Division of Nephrology, Department of Medicine, University of Alabama at Birmingham. This type of plate contains the clear LB agar which is widely used in laboratories for Escherichia Coli culture.

3.1. Dish/Plate detection

In this experiment, we compare the proposed dish/plate detection algorithm with Otsu's method. Some sample segmentation results are demonstrated in Figure 3. In addition, we also evaluate the performance of the proposed dish/plate detection algorithm and Otsu's method by applying both methods on 100 chromatic and achromatic images. The satisfaction rates for the proposed method and Otsu's method are 96% and 38%, respectively. For the 25 chromatic images, the satisfaction rates for the proposed method and Otsu's method are 92% and 0%, respectively. For the 75 achromatic images, the satisfaction rates for the proposed method and Otsu's method are 97% and 50%, respectively. It is obvious that the proposed method outperforms Otsu's method in dish/plate region detection.

3.2. Colony detection

Since the characteristics of the chromatic and achromatic images are quite different, it is more appropriate to discuss the counter performance on them separately. In the experiments, we compared the proposed counter (P.C.) with the Clono-Counter (C.C.) [4] reported by Niyazi in 2007, and the automatic counter (A.C.) proposed in our previous study [11]. For chromatic images, the precision values of the A.C. and C.C. methods are 0.97±0.03 and 0.52±0.19, respectively; their recall values are 0.96±0.04 and 0.99±0.01, respectively; their F-measure values are 0.96±0.01 and 0.67±0.18, respectively. The precision, recall, and F-measure values of the proposed counter (P.C.) are about the same as those of the A.C. method on chromatic images.

To evaluate the robustness of the proposed counter (P.C.) on achromatic images, we conduct the following two experiments and compare the performance of P.C. with that of A.C. and C.C.

In the first experiment, we test the proposed counter (P.C.) on 24 achromatic images (9 images with good quality and 15 images with poor quality). The performance of the P.C., A.C., and C.C. methods on good/poor quality images is summarized in Table 1. From Table 1, we can observe that the P.C. significantly outperforms the A.C. and C.C. methods. The average overall precision, recall, and F-measure values of the P.C. method are 0.61±0.29, 0.94±0.06, and 0.69±0.20, while the corresponding values of A.C. and C.C. are (0.44±0.24, 0.68±0.24, 0.44±0.13) and (0.00±0.00, 0.00±0.00, 0.00±0.00), respectively.

In our second experiment, we further apply the proposed method on 15 different achromatic images taken from the same dish, but with different background surfaces, zooms, and lighting conditions. We measure the precision, recall, and F-measure of the proposed counter. The average precision, recall, and F-measure on the 15 achromatic images are 0.93±0.11, 0.87±0.04, and 0.90±0.07, respectively. The results of the consistency analysis show the proposed system is quite consistent.

Table 1. Performance comparison on achromatic images
Image Condition      Method   Precision     Recall        F-measure
Good Quality (9)     P.C.     0.94 ± 0.07   0.88 ± 0.02   0.90 ± 0.03
                     A.C.     0.71 ± 0.06   0.42 ± 0.16   0.52 ± 0.12
                     C.C.     0.00 ± 0.00   0.00 ± 0.00   0.00 ± 0.00
Poor Quality (15)    P.C.     0.41 ± 0.16   0.98 ± 0.04   0.56 ± 0.13
                     A.C.     0.27 ± 0.12   0.84 ± 0.07   0.40 ± 0.12
                     C.C.     0.00 ± 0.00   0.00 ± 0.00   0.00 ± 0.00
Overall              P.C.     0.61 ± 0.29   0.94 ± 0.06   0.69 ± 0.20
                     A.C.     0.44 ± 0.24   0.68 ± 0.24   0.44 ± 0.13
                     C.C.     0.00 ± 0.00   0.00 ± 0.00   0.00 ± 0.00
P.C.: the proposed counter; A.C.: automatic counter [11]; C.C.: Clono-Counter [4].
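For reference, the precision, recall, and F-measure values reported in Table 1 follow the standard definitions computed from per-image counts, as in the generic sketch below (the paper does not spell out its matching criterion between detected and true colonies, so that part is left to the caller).

```python
# Standard precision / recall / F-measure from matched colony counts.
def detection_scores(true_positives: int, false_positives: int, false_negatives: int):
    # The "or 1" guards avoid division by zero when nothing is detected or present.
    precision = true_positives / ((true_positives + false_positives) or 1)
    recall = true_positives / ((true_positives + false_negatives) or 1)
    f_measure = 2 * precision * recall / ((precision + recall) or 1)  # harmonic mean
    return precision, recall, f_measure
```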
3.3. Splitting clustered colonies

In the process of detecting colonies, there exist some clustered colonies that need to be further divided into separate colonies. As mentioned earlier, we adopt the Watershed algorithm to deal with this problem and found it useful in separating connected colonies in our experimental results. We give an example of the splitting result of the Watershed algorithm in Figure 5.

In our experiment, we checked the performance of the Watershed algorithm on 19 segments with clustered colonies which actually contain 98 colonies. After applying the watershed algorithm, we obtain 96 colonies. Only 2 overlapped colonies are missed in the splitting process.

[Figure 5. Clustered colonies split by the watershed algorithm]

It is worth noting that the Watershed algorithm is an integral part of the proposed system, in which each step contributes to the better performance of the following steps.

4. Discussions, Conclusions, and Future Work

In this paper, we introduce a robust and effective automatic bacterial colony counter with the ability to recognize chromatic and achromatic images, detect the dish/plate regions, isolate colonies on the dish/plate, and, further, separate the clustered colonies for accurate counting of colonies. The proposed counter has the following contributions.

First, our proposed method can handle various kinds of dishes/plates, including circular and rectangular shaped dishes/plates. Second, it can accept general digital camera images as its input. The third contribution is that our proposed method can recognize chromatic and achromatic images and deal with both color and clear medium. The most challenging part of this study is to handle clear medium images, since colonies look very similar to the background. There also exists a lot of noise on the plate, such as bubbles, small scratches, and small markers. Some round-shaped small objects are very similar to the colonies, and sometimes it is hard to distinguish them from real colonies even by trained human eyes. This makes the colony isolation task extremely difficult. In this paper, we address those challenges and demonstrate a reasonable performance on both color and clear medium images.

The above features also make our proposed method very flexible and attractive to laboratories. In addition, our proposed counter operates automatically without any human intervention, and the performance is quite promising for both color and clear medium.

In future work, we plan to detect and distinguish different species of bacteria on a single colony dish/plate. Ultimately, our goal is to accurately classify different kinds of bacterial colonies and produce the correct count for each class, which could greatly benefit clinical studies.

5. Acknowledgement

This research of Dr. Zhang is supported in part by NSF DBI-0649894.

6. References

[1] X. Liu, S. Wang, L. Sendi, and M. J. Caulfield, “High-throughput imaging of bacterial colonies grown on filter plates with application to serum bactericidal assays,” Journal of Immunological Methods, vol. 292, pp. 187-193, 2004.

[2] C. W. Chang, Y. H. Hwang, S. A. Grinshpun, J. M. Macher, and K. Willeke, “Evaluation of Counting Error Due to Colony Masking in Bioaerosol Sampling,” Applied and Environmental Microbiology, vol. 60, pp. 3732-3738, 1994.

[3] J. Dahle, M. Kakar, H. B. Steen, and O. Kaalhus, “Automated counting of mammalian cell colonies by means of a flat bed scanner and image processing,” Cytometry A, vol. 60, pp. 182-188, 2004.

[4] M. Niyazi, I. Niyazi, and C. Belka, “Counting colonies of clonogenic assays by using densitometric software,” Radiation Oncology, vol. 2, pp. 4, 2007.

[5] M. Putman, R. Burton, and M. H. Nahm, “Simplified method to automatically count bacterial colony forming unit,” J Immunol Methods, vol. 302, pp. 99-102, 2005.

[6] K. Zuiderveld, “Contrast limited adaptive histogram equalization,” in Graphics Gems IV, Academic Press Professional, Inc., pp. 474-485, 1994.
[7] L. Vincent and P. Soille, “Watersheds in Digital Spaces: An Efficient Algorithm Based on Immersion Simulations,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, pp. 583-598, 1991.

[8] http://www.colifast.no/

[9] N. Otsu, “A Threshold Selection Method from Gray-Level Histograms,” IEEE Trans. on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, 1979.

[10] G. A. F. Seber, “Multivariate Observations”, Wiley, 1984.

[11] C. Zhang and W.-B. Chen, “An Effective and Robust Method for Automatic Bacterial Colony Enumeration,” in Proc. of the IEEE Intl. Workshop on Semantic Computing and Multimedia Systems, in conjunction with the 2007 Intl. Conf. on Semantic Computing, pp. 581-588, September 17-19, Irvine, CA, USA, 2007.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

LOFT: Low-Overhead Freshness Transmission in Sensor Networks

Chin-Tser Huang
Department of Computer Science and Engineering
University of South Carolina
huangct@engr.sc.edu

Abstract

Sequence numbers have been used by a variety of network protocols as freshness identifiers to achieve reliable transmission and provide protection against replay attacks. The number of bits allocated for a sequence number should not be too small, in order to avoid frequent wraparounds and replay attacks. However, in a sensor network consisting of resource-constrained sensor nodes, transmitting a full sequence number along with a message represents considerable overhead that is desirable to avoid. In this paper, we propose LOFT, a protocol that lowers the overhead of transmitting freshness identifiers along with messages. In LOFT, the sender and receiver still maintain a sequence number of the same number of bits, but when the sender sends a new message, it only transmits a less significant portion of the sequence number bits in order to save energy. All the bits of the sequence number are involved in the computation of a message authentication code, so that the receiver can verify the freshness of the message. Moreover, LOFT makes use of a Bloom filter to mitigate the overhead of freshness checks caused by DoS attacks. We use simulations to analyze how much of a portion of the sequence number should be transmitted to achieve the best performance in terms of efficiency and effectiveness.

1. Introduction

A typical sensor network consists of a number of sensor nodes and a base station. Each sensor node is a small, cheap device that is programmed to collect certain types of data, for example temperature, humidity, and light. The base station is an aggregation point for collected data and is viewed as the human interface into the network. The sensor nodes are typically constrained by limited battery power, small memory, low computational ability, and limited transmission range. On the other hand, the base station is assumed to have greater memory and processing power, and does not have the constraints of the sensor nodes.

With their low cost and scalability, sensor networks have found a variety of applications. For instance, in the military, such networks are used in target tracking, perimeter monitoring, and battlefield surveys. Commercial applications of these networks include inventory control and building systems monitoring. However, the constraints on the sensor nodes, plus the open nature of communications in a wireless ad hoc sensor network, make securing networks of this type a challenging task.

One of the issues that need to be taken care of in securing a sensor network is the threat of replay attacks. A replay attack is an attack in which an adversary inserts into the channel, from the source to the destination in a communication, one or more copies of messages that were sent before by the source ([4] and [14]). If the destination cannot distinguish replayed messages from normal messages when it is under replay attack, the destination may end up authenticating the adversary as the source, or making incorrect decisions based on the content of the replayed message. In wireless sensor networks, an adversary can replay authentication messages such that it is authenticated as a legitimate neighboring node or outside user [15, 17], replay routing information messages to cause routing loops or increase end-to-end delay [13, 19], and replay regular messages to consume sensor nodes' resources [8] or disrupt data gathering and diffusion procedures.

To counter replay attacks, every message needs to carry some freshness proof that can be verified by its receiver. One common technique for providing freshness proof is using sequence numbers. In this technique, the sender attaches a monotonically increasing counter value to every message it transmits.
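A minimal sketch of this counter-based freshness check (whose bookkeeping is spelled out in the next paragraphs) might look as follows; the names are illustrative and no particular sensor-network stack is implied.

```python
# Minimal sketch of sequence-number-based freshness checking as described above.
class FreshnessSender:
    def __init__(self):
        self.last_used = -1            # remember the last used number

    def next_sequence_number(self) -> int:
        self.last_used += 1            # monotonically increasing counter
        return self.last_used

class FreshnessReceiver:
    def __init__(self):
        self.largest_seen = -1         # remember the largest received number

    def accept(self, seq: int) -> bool:
        if seq > self.largest_seen:    # fresh: larger than anything seen before
            self.largest_seen = seq
            return True
        return False                   # equal or smaller: treat as a replay
```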



DOI 10.1109/SUTC.2008.38
received number. The sender increments the counter by
1 after sending a message. The receiver compares the 2. Related Works
sequence number of received message with the largest
received number. If the sequence number of received One widely known countermeasure against replay
message is larger than the largest received number, the attacks is the anti-replay window protocol in IPsec, the
receiver accepts the message and updates largest standard protocol suite for adding security features to
received number. Otherwise, the receiver regards the the IP layer in the Internet ([5], [6], and [7]). This
message as a replayed one and discards it. protocol can provide anti-replay service by including a
However, using sequence numbers as the vehicle of
sequence number in each IPsec message and using a
freshness transmission in sensor networks may appear
sliding window. According to IPsec, a unidirectional
to be too much an overhead for sensor nodes with
security association can be established between any
respect to their transmission and energy constraints. In
order for the sequence number approach to be two computers in their networks: one computer is the
effective, the range of valid sequence numbers cannot source of the association and the other is the
be too small, otherwise all the valid sequence numbers destination. On the source end, the source keeps a
can be used up quickly and a wraparound is often counter for the sequence numbers used for sending
needed, opening a convenient venue for replay attacks. messages and the sequence numbers are always
For example, IPsec ([5], [6], and [7]) allocates 32 bits monotonic. When a security association is established,
to the sequence number field, so the range of valid the counter is initialized to zero. Every time the source
sequence numbers is from 0 to 232-1. This range of sends a message to the destination, the source includes
numbers can last for more than 1.3 years before all the in the message the current value of the sequence
numbers are used once if on average 100 messages are number counter, and increments the counter by one so
sent per second. However, 32 bits are equal to 4 bytes, that the used sequence number will not be reused again.
which represents a substantial overhead for a typical On the destination end, the destination uses a sliding
sensor network message, such as the messages of a window to determine whether a received message is a
Mica2 sensor whose maximum size is 35 bytes. normal message or a replayed message. If the sequence
In this paper, we propose a new approach that number of the received message is less than the number
transmits partial freshness identifier along with each represented by the left edge of the window, then the
message such that only low overhead is incurred. The
message is regarded as a replayed message and is
sender and receiver still maintain a sequence number of
discarded by the destination. If the sequence number of
the same number of bits (say 32 bits), but when the
the received message falls inside the window, the
sender sends a new message, it only transmits a less
significant portion of the sequence number bits in order destination can determine whether the message is a
to save energy. All the bits of the sequence number are replayed message or not by checking the information
involved in the computation of a message kept in the window. If the sequence number of the
authentication code, so that the receiver can verify the received message is larger than the number represented
freshness of the message with a quick hash by the right edge of the window, the message is
computation. We will show through analysis and regarded as a fresh message and the window is shifted
simulation that this approach is effective and easy to to the right, making this received sequence number the
deploy, and provide analysis on how much a portion of new right edge of the window. (Note that for a message
the sequence number should be transmitted for and its sequence number to be accepted, the message
performance optimization in terms of efficiency and also needs to pass integrity check in the destination.)
effectiveness. However, one major limitation of applying the anti-
The remainder of this paper is organized as follows. replay window protocol to sensor networks lies in its
In Section 2, we discuss previous works that are related overhead. In order not to use up all the valid sequence
to anti-replay in general networks and in sensor numbers too quickly, IPsec defines the sequence
networks. In Section 3, we present the protocol of our number to be 32 bits long. In order to address the
Low-Overhead Freshness Transmission (LOFT) common message loss and message reorder phenomena
approach. In Section 4, we give an analysis and in the Internet, IPsec stipulates that the window size be
simulation result of our approach. In Section 5, we
at least 64 bits. Both numbers spell too much an
consider some implementation issues about our
overhead for a usual sensor node.
approach. Finally, we conclude our presentation and
Perrig et al. propose a Sensor Network Encryption
discuss future works in Section 6.
Protocol (SNEP) that is aimed to provide data

confidentiality, data authentication, integrity, and can retry the verification with subsequent values seqr +
freshness [11]. In SNEP, data confidentiality is 1, seqr + 2, seqr + 3, and so on, until a match is found
achieved by encrypting the data preceded with a or reaching seqr + l, where l is a predefined threshold
sequence number. The purpose of including of the selected by observing the length of the longest run of
sequence number in the encryption is to prevent an consecutive message losses that ever occurs in the past.
adversary from inferring the plaintext of encrypted If the sequence number inconsistency is due to message
messages if the adversary knows plaintext-ciphertext loss and the length of this message loss streak is less
pairs encrypted with the same key. However, if the than or equal to l, then the verification will succeed and
sequence number is transmitted along with each the receiver will set the sequence number it maintains
message, then more energy will be consumed. As to the value that passes the verification. However,
shown in Fig. 1(b), SNEP avoids this transmission although this solution can solve the problem caused by
overhead by requiring the sender and receiver to message loss, it cannot solve the problem caused by the
remember the same sequence number1. The sender DoS attack. In the case of a DoS attack, the receiver
attaches to the message a message authentication code will waste its energy to make l verifications without a
(MAC) that is calculated over the concatenation of the success.
encrypted data and the sequence number maintained by The second solution, which is also proposed in [11],
the sender, denoted seqs. When the receiver receives a suggests that if the verification of MAC fails, then the
message from the sender, the receiver can verify the sender and the receiver decide that they are out of
message’s integrity and freshness by calculating a synchronization and use a counter exchange protocol to
MAC of the message on its own, using the sequence resynchronize their sequence numbers. However, there
number it maintains, denoted seqr . Provided that the are two problems with the second solution. First,
sender and receiver are synchronized on their sequence executing the counter exchange protocol will incur
number, the result calculated by the receiver should be extra message transmission overhead and can open a
consistent with the MAC attached to the message. Each potential venue for an adversary to disrupt the
synchronization between the sender and the receiver.
message from a sender A to a receiver B can be
Second, the adversary can still send fake or replayed
specified as follows:
messages to the receiver, causing unnecessary
A → B: data, MAC_MKAB(seqs || data) ² resynchronizations between the sender and the receiver
and eventually leading to a DoS attack.
where MKAB is the MAC key shared between A and B.
In [11], the authors point out that there are two 3. The LOFT Protocol
problems with the approach of SNEP. First, if one or
more consecutive messages get lost, then the next We design a novel Low-Overhead Freshness
message received by the receiver carries a sequence Transmission (LOFT) protocol to address the anti-
number that is inconsistent with the one maintained by replay requirement and avoid the problems associated
the receiver. Second, there is a potential denial-of- with the aforementioned solution. In our solution, the
service (DoS) attack in which an adversary keeps sender and the receiver still maintain a sequence
sending fake (or replayed) messages to the receiver. number, but unlike SNEP, which totally refrains from
Because in both cases the MAC carried in the received transmitting any sequence number in clear in the
message will not pass the verification, the two cases are message, our solution transmits only a less significant
indistinguishable to the receiver. portion of the sequence number bits in order to provide
There are two possible solutions to the above both energy saving and freshness proof.
problem. In the first solution, when the verification of a
message’s MAC fails with seqr, where seqr is the 3.1 Transmission of Less Significant Portion of
sequence number expected by the receiver, the receiver Sequence Number
1
In the original paper the authors propose that two counters are
shared by the two parties (one for each direction of communication).
Assume the sender and the receiver are synchronized
For simplicity, we consider only one direction; the other direction on a sequence number of n bits long. We denote the
can be inferred similarly. sequence number stored at the sender and the receiver
2
In SNEP, the authors use encryption to provide confidentiality,
but it is an add-on feature. For the sake of simplicity, here we
as seqs and seqr respectively. In the LOFT protocol, as
remove the encryption notation without loss of generality and focus illustrated in Fig. 1(c), the sender will send only m less
on the freshness transmission.

significant bits (denoted as seqs,0..m-1), where m < n, of the sequence number along with each message. This way, the sender saves the transmission of the n-m more significant bits seqs,m..n-1 per message. The sender will calculate a hash value over the concatenation of the message header, all the n bits of the sequence number seqs, and the message payload, and attach the hash value to the message. Each message from a sender A to a receiver B can be specified as follows:

A → B: seqs,0..m-1, data, MAC_MKAB(seqs || data)

When receiving a message, the receiver will calculate a hash value over the concatenation of the message header, all n bits of the expected sequence number (the concatenation of the m less significant bits received from the message, seqs,0..m-1, and the n-m more significant bits the receiver is expecting, seqr,m..n-1), and the message payload. The receiver compares the hash value it calculates with the hash value attached to the message. If the two values are equal, the receiver accepts the message. Moreover, the receiver updates seqr to reflect the newly received sequence number. Otherwise, the receiver regards the message as "susceptible", and further checking is needed to determine the integrity and freshness of the packet.

[Figure 1 depicts the message layout (header, sequence number, payload) under the three schemes: (a) seqs is transmitted in full; (b) seqs is not transmitted and the receiver uses seqr to verify freshness; (c) only seqs,0..m-1 is transmitted, seqs,m..n-1 is not, and the receiver uses seqr,m..n-1 || seqs,0..m-1 to verify freshness.]
Fig. 1 Different ways of freshness transmission: (a) the trivial solution in which the whole sequence number is transmitted, (b) the method used in SNEP, in which the whole sequence number is not transmitted, and (c) the method used by LOFT, in which a less significant portion of the sequence number is transmitted.

We discuss how the two aforementioned problems that are faced by the SNEP approach are overcome by our LOFT approach. First, if the received message carries an inconsistent sequence number due to one or more consecutive message losses, the receiver does not need to retry with the subsequent sequence numbers one by one as in the case of the SNEP approach. Instead, the receiver will first compute a MAC using the concatenation of the m less significant bits of the sequence number, seqs,0..m-1, received in the message and the n-m more significant bits seqr,m..n-1 stored by the receiver. If the computed MAC is the same as the received MAC, the receiver will regard the message as intact and fresh and accept it. Otherwise, the receiver will increment the number represented by seqr,m..n-1 by one, and compute a MAC using the concatenation of seqs,0..m-1 and seqr,m..n-1 + 1. If the result still does not match the received MAC, the receiver will retry with the concatenation of seqs,0..m-1 and seqr,m..n-1 + 2, and will continue until a match is found. Second, if the received inconsistent sequence number is due to a DoS attack in which an adversary keeps sending fake (or replayed) messages to the receiver, the receiver will also retry with the MAC computed using the concatenation of seqs,0..m-1 and seqr,m..n-1 + i one by one, since the second case is indistinguishable from the first case. However, no match will be found because the message is either fake or replayed. Therefore, a threshold th needs to be defined such that after trying th concatenations of sequence numbers, the receiver will decide that the message is illegitimate and stop the freshness verification. It can be seen that th should be defined as l / 2^m, because when seqr,m..n-1 is incremented by 1, the value of the concatenation of seqs,0..m-1 and seqr,m..n-1 is increased by 2^m.
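To make the message format and verification procedure above concrete, the following is a minimal sketch (not the authors' implementation) of the sender and receiver logic, assuming n = 32-bit sequence numbers held in a Java long and an opaque shared-key MAC; the class name, method names, and the toy mac() helper are illustrative assumptions.

```java
// Illustrative sketch of LOFT freshness transmission (a sketch, not the paper's code).
// Assumptions: n = 32-bit sequence numbers, m low-order bits sent in clear,
// mac() stands in for the shared-key MAC over header || full sequence number || payload.
final class LoftEndpoint {
    private final int m;      // number of less significant bits transmitted in clear
    private final int l;      // longest tolerated run of consecutive message losses
    private long seq = 0;     // next sequence number to send / next expected (one instance per role)

    LoftEndpoint(int m, int l) { this.m = m; this.l = l; }

    /** Sender side: emit only the m low-order bits plus a MAC computed over the full sequence number. */
    long[] send(byte[] header, byte[] payload, byte[] key) {
        long lowBits = seq & ((1L << m) - 1);
        long tag = mac(key, header, seq, payload);
        seq = (seq + 1) & 0xFFFFFFFFL;            // 32-bit wraparound
        return new long[] { lowBits, tag };
    }

    /** Receiver side: try the expected high part and up to th = ceil(l / 2^m) increments of it. */
    boolean verify(byte[] header, byte[] payload, long lowBits, long tag, byte[] key) {
        long high = seq >>> m;                    // the n-m more significant bits the receiver expects
        int th = (l + (1 << m) - 1) >> m;         // threshold on candidate concatenations
        for (int i = 0; i <= th; i++) {
            long candidate = (((high + i) << m) | lowBits) & 0xFFFFFFFFL;
            if (candidate >= seq && mac(key, header, candidate, payload) == tag) {
                seq = (candidate + 1) & 0xFFFFFFFFL;   // resynchronize on the accepted value
                return true;                           // message is fresh and intact
            }
        }
        return false;                             // fake or replayed: give up after th checks
    }

    // Placeholder MAC for the sketch; a real node would use its shared-key MAC primitive.
    private static long mac(byte[] key, byte[] header, long fullSeq, byte[] payload) {
        long h = 1125899906842597L;
        for (byte b : key) h = 31 * h + b;
        for (byte b : header) h = 31 * h + b;
        h = 31 * h + fullSeq;
        for (byte b : payload) h = 31 * h + b;
        return h;
    }
}
```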


3.2 Application of Bloom Filter
During some period of time, many messages may be lost in transit due to environmental causes or attacks. Under such conditions, the receiver may have to check multiple sequence numbers in order to verify the freshness of a received message, because many legitimate messages in between may have been lost, and several sequence numbers within a reasonable range have the same m less significant bits. If the received message is legitimate and the number of consecutive lost messages is no greater than l, then the receiver should find a match within l / 2^m checks. On the other hand, in a replay attack, an adversary replays a lot of old messages to the receiver. Since the receiver
cannot distinguish these replayed messages from the case of message loss, the receiver will also perform multiple freshness checks. The difference is that the receiver will never find a match after performing l / 2^m checks in vain. In order to mitigate the burden of extra freshness checks while still enjoying the savings from reduced freshness transmission, we propose to use a Bloom filter to address this problem.

A Bloom filter is a space-efficient data structure designed to represent the membership of a set of data items. We follow [2, 9] to give a brief overview of the definition and mechanism of a Bloom filter. A Bloom filter representing a set S = {e1, e2, ..., ew} of w elements is described by a bit vector of v bits, initially all set to 0. Associated with a Bloom filter is a set of k independent hash functions h1, ..., hk, each assumed to map each possible item in the universe uniformly to the range {0, ..., v − 1}. Note that these hash functions are not required to be of cryptographic strength, so their computation is usually very efficient. To represent the membership of an element e ∈ S, the bits hi(e), for 1 ≤ i ≤ k, out of the v bits are set to 1. Note that a bit can be set to 1 multiple times, but it will remain 1 after the first time it is set, and once a bit is set to 1 it will not be reset to 0. To check whether an item x is in S, just check whether all bits hi(x), 1 ≤ i ≤ k, are set to 1. If not, then x is apparently not a member of S. If all bits hi(x) are set to 1, it can be assumed that x is in S, with a possible false positive, in which item x is indeed not in S but all hi(x) are set to 1 thanks to other data items. The probability of false positives is determined by the relationship between v, w, and k. By selecting appropriate values for v, w, and k, the probability of a false positive can be made sufficiently small so that it is acceptable to the application in question. In particular, it is shown in [9] that the minimal false positive rate occurs when k = (ln 2) · (v/w).
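As an illustration of the structure just described, the following is a small sketch of a v-bit Bloom filter with k hash functions, instantiated with the parameter values suggested later in Section 5 (v = 32, w = 8, k = 3); the seeded hash shown is only a stand-in for the cheap hash functions collected in [10], and the class is not part of the paper.

```java
// Illustrative v-bit Bloom filter with k hash functions (sketch only).
final class BloomFilter {
    private final int v;      // number of bits in the filter (v <= 32 here, so one int suffices)
    private final int k;      // number of independent hash functions
    private int bits = 0;     // the bit vector, initially all zeros

    BloomFilter(int v, int k) { this.v = v; this.k = k; }

    /** Insert: set the k bits h_i(e) to 1; bits stay 1 until the whole filter is cleared. */
    void add(byte[] element) {
        for (int i = 0; i < k; i++) bits |= 1 << index(element, i);
    }

    /** Membership test: definitely not present if any h_i(x) bit is 0; otherwise "probably present". */
    boolean mightContain(byte[] element) {
        for (int i = 0; i < k; i++) {
            if ((bits & (1 << index(element, i))) == 0) return false;
        }
        return true;   // possible false positive
    }

    /** Reset all bits, as the LOFT sender does after every w messages. */
    void clear() { bits = 0; }

    // Simple seeded hash mapped into {0, ..., v-1}; a deployment would use cheap
    // non-cryptographic hashes such as the RS, PJW, and DJB functions of [10].
    private int index(byte[] element, int seed) {
        int h = 0x9E3779B9 * (seed + 1);
        for (byte b : element) h = h * 31 + b;
        return Math.floorMod(h, v);
    }
}

// Example parameters from Section 5: new BloomFilter(32, 3), one filter per 8 messages.
```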
A Bloom filter can be applied in LOFT as follows. On the sender side, a Bloom filter can be used to represent every w messages. After w messages have been sent, the sender will reset all the bits in the Bloom filter and use it to remember the following w messages. When there are not many fake or replayed messages in the sensor network, it is not beneficial to enable the Bloom filter because of its own computation and transmission overhead. When the receiver has received too many messages that require multiple freshness checks but end up being discarded because of no match, the receiver can request the sender to enable the Bloom filter and to send the Bloom filter it maintains every w messages. Before the receiver checks the freshness of a message, it will first use the received Bloom filter to test whether the message is one of the last w messages sent by the sender. If the result is no, the receiver can discard the message right away. If the result is yes, the receiver will proceed to perform the freshness check because there is still a possibility of a false positive. In this way, the receiver can quickly verify whether a received message is a legitimate message before proceeding to perform the freshness check and determine the associated full sequence number of the message. The application of a Bloom filter in LOFT is illustrated in Fig. 2.

[Figure 2 shows a Bloom filter bit vector: messages whose membership test fails are illegitimate, while messages that pass the test are likely legitimate.]
Fig. 2 Bloom filter applied in LOFT.

The request for sending the Bloom filter and the Bloom filter itself can be piggybacked on normal messages, so that no extra messages need to be sent and their integrity is also protected. Moreover, it is easy to select appropriate values for v, w, and k so that the savings from transmitting fewer sequence number bits are retained. Note also that loss of a Bloom filter may result in unnecessary freshness checks on a susceptible packet, but does not prevent correct communication between the sender and the receiver.

4. Simulation and Analysis

In this section, we analyze how to optimize the number of transmitted less significant bits m in terms of transmission overhead and freshness check overhead.

It has been discussed in [3, 16, 18] that there is a tradeoff between transmission cost and computation cost. In sensor networks, transmission operations have been shown to consume much more energy than computation operations do. (The modeling of transmission energy consumption per bit of a communication link can be found in [12].) Therefore, it is our objective to reduce the amount of transmission at the expense of some extra computation.

To show the effectiveness and efficiency of LOFT, we use a Java program to simulate the transmission of messages between a sender and a receiver. In order to
compare the performance of different freshness computation of 3 hash values as equivalent to one
transmission schemes, we inject cases of message loss freshness check, because the three hash functions used
and replay attacks into the simulated message are computationally much cheaper than a MAC
sequences. function.
The simulation is set up as follows. In each It is straightforward to analyze the transmission
simulation, a sequence of 10,000 messages is overhead in our simulation. The variations of LOFT
transmitted from the sender to the receiver. The respectively transmit 2 bits, 3 bits, and 4 bits of the
sequence number used in the simulation is 32 bits long. sequence number along with each message. These
In the simulated SNEP, no sequence number is translate to 2500 bytes, 3750 bytes, and 5000 bytes in
transmitted along with each message. In the simulated total when a sequence of 10000 messages is
LOFT, 2, 3, and 4 less significant bits respectively are transmitted. In contrast, SNEP does not transmit any
transmitted with each message. With a probability of byte of sequence number. Nevertheless, these numbers
0.1, 0.2, 0.3 respectively, one or more (consecutive) still represent a great reduction in freshness
messages can be lost in transit. Longer runs of transmission overhead compared to 40,000 bytes in the
consecutive message losses occur with smaller case when the full sequence number is transmitted
probability. We set the longest run of consecutive along with each message.
message losses to be 16, 32, and 64 respectively in Next, we consider the freshness check overhead in
different simulations. To illustrate what this setting our simulation. Fig. 3 shows the comparison of the total
means, in case the longest run of consecutive message number of extra freshness checks when the sequence
losses is 64, then when message loss or replay attack only contains message losses (up to 64 consecutive
occurs, the SNEP approach has to check up to 64 message losses), without any occurrence of replay
sequence numbers, the LOFT approach with 2 less attacks. From the figure we can see that when SNEP is
significant bits transmitted has to check up to 16 used, no bit of the sequence number needs to be
different concatenations, LOFT with 3 less significant transmitted, but much more extra freshness checks need
bits transmitted has to check up to 8 different to be performed due to message loss. LOFT with 2 less
concatenations, and LOFT with 4 less significant bits significant bits transmitted incurs some freshness check
has to check up to 4 different concatenations. This overhead. When 3 or 4 less significant bits are
setting allows us to compare the tolerance of different transmitted, the freshness check overhead is almost
schemes against message loss. For the replay attacks, negligible.
we assume that a replayed message is injected into the Fig. 4 shows the comparison of the total number of
sequence with a probability of 0.4, 0.5. 0.6 extra freshness checks when the sequence contains
respectively. These probabilities are set to be higher occurrences of both message loss (with a probability of
than the probabilities of message loss for two reasons. 0.1 and up to 64 consecutive message losses) and
First, in a replay attack, in particular a DoS-oriented replay attacks. In this figure we only show the
attack, the attacker usually wants to inject a lot of comparison between variations of LOFT because
replayed messages. Second, we want to test the SNEP requires much more extra freshness checks
performance of the Bloom filter-based scheme, which under replay attacks and will make the comparison
should only be enabled when the replay attack level is between variations of LOFT less clear. From this figure
high. The difference between the case of message loss we can see that when the Bloom filter-based scheme is
and the case of replay attack is that in a replay attack enabled and used with the variation in which 3 less
no match will ever be found, therefore the receiver has significant bits of sequence number is transmitted, the
to perform the maximum number of freshness checks number of extra freshness checks is largely reduced. In
required by each scheme. For the Bloom filter, we particular, when the probability of replayed messages is
simulate a 32-bit filter to represent every 8 messages, 0.6, the Bloom filter-based scheme will make less extra
and use 3 hash functions, namely RS hash function, freshness checks than the variation when 4 less
PJW hash function, and DJB hash function obtained significant bits of sequence number is transmitted.
from [10]. In each run of simulation we count the total However, the tradeoff for getting this reduction in extra
number of extra freshness checks performed by the freshness checks is the overhead of transmitting the
receiver on the sequence of messages (which means we Bloom filter.
do not count the freshness check on a legitimate
message). When the Bloom filter is used, we model the

[Figure 3 plots the number of extra freshness checks (y-axis, 0 to 10,000) against the message loss probability (x-axis: 0.1, 0.2, 0.3) for SNEP, LOFT w/ 2 bits, LOFT w/ 3 bits, and LOFT w/ 4 bits.]
Fig. 3 Comparison of total number of extra freshness checks when the message sequence contains message losses with probability of 0.1, 0.2, and 0.3 respectively.

[Figure 4 plots the number of extra freshness checks (y-axis, 0 to 100,000) against the message replay probability (x-axis: 0.4, 0.5, 0.6) for LOFT w/ 2 bits, LOFT w/ 3 bits, LOFT w/ 4 bits, and LOFT w/ 3 bits & BF.]
Fig. 4 Comparison of total number of extra freshness checks when the message sequence contains replayed messages with probability of 0.4, 0.5, and 0.6 respectively.

5. Implementation Issues

The LOFT protocol we presented above is easy to implement and deploy. According to the assumption, LOFT only requires the two communicating sensor nodes to share a MAC key, a sequence number, and the hash functions used for the Bloom filter.

As in other schemes, the MAC key and the initial sequence number can be securely exchanged between the two communicating sensor nodes using an encryption key configured before deployment. For the size of the sequence number, we suggest using 32 bits for the full sequence number and transmitting the 3 less significant bits of the full sequence number along with each message for freshness check purposes.

For the Bloom filter, we think it is reasonable to use a 32-bit wide Bloom filter to represent the membership of every 8 messages using 3 hash functions. The implementations of many hash functions are available in [10]. According to [9], the false positive rate can be estimated at (1/2)^3 = 0.125.

6. Concluding Remarks

In this paper, we presented a novel approach to achieve low-overhead freshness transmission in wireless sensor networks. Our solution lowers the overhead of freshness transmission by transmitting along with each message only a less significant portion of a sequence number shared by both communicating parties. Since the whole sequence number is involved in the calculation of a MAC, the freshness of the received message can be verified, thus protecting the communication from replay attacks. We also proposed to make use of a Bloom filter to mitigate the overhead of freshness checks caused by DoS attacks. Through analysis and simulation, we show that although SNEP can achieve the best reduction in freshness transmission overhead, LOFT is more tolerant to message loss and replay attacks and still achieves a large reduction in freshness transmission overhead. Thus, we believe LOFT is a viable and reliable solution to the problem of freshness transmission in sensor networks.

In the future, we would like to design variations of the LOFT protocol that can be used between sensor nodes that are multiple hops away and among a group of sensor nodes. The solution in this paper focuses on the communication between adjacent sensor nodes, where message reordering should not occur; however, in the case of multi-hop communication, message reordering becomes an issue to consider [1]. Moreover, we will consider how some partial order transactions in sensor networks can be exploited to further lower the burden of freshness transmission.

7. References

[1] J. C. R. Bennett, C. Partridge, and N. Shectman, "Packet Reordering is Not Pathological Network Behavior", IEEE/ACM Transactions on Networking, Vol. 7, No. 6, pp. 789-798, Dec. 1999.
[2] L. Fan, P. Cao, J. Almeida, A. Broder, "Summary cache: A scalable wide-area Web cache sharing protocol," Proceedings of SIGCOMM'98, 1998.
[3] L. Ferrigno, S. Marano, V. Paciello, A. Pietrosanto, "Balancing computational and transmission power consumption in wireless image sensor networks," Proceedings of the 2005 IEEE International Conference on Virtual Environments, Human-Computer Interfaces and Measurement Systems (VECIMS 2005), 2005.
[4] L. Gong, "Variations on the Themes of Message Freshness and Replay", Proceedings of the Computer Security Foundations Workshop VI, pp. 131-136, Jun. 1993.
[5] S. Kent and R. Atkinson, "Security Architecture for the Internet Protocol", RFC 2401, November 1998.
[6] S. Kent and R. Atkinson, "IP Authentication Header", RFC 2402, November 1998.
[7] S. Kent and R. Atkinson, "IP Encapsulating Security Payload (ESP)", RFC 2406, November 1998.
[8] J. McCune, E. Shi, A. Perrig, M. Reiter, "Detection of Denial-of-Message Attacks on Sensor Network Broadcasts," Proceedings of the 2005 IEEE Symposium on Security and Privacy, 2005.
[9] M. Mitzenmacher, "Compressed Bloom Filters," IEEE/ACM Transactions on Networking, Vol. 10, No. 5, Oct. 2002.
[10] A. Partow, General Purpose Hash Function Algorithms, http://www.partow.net
[11] A. Perrig, R. Szewczyk, J. D. Tygar, V. Wen, and D. E. Culler, "SPINS: Security Protocols for Sensor Networks", Wireless Networks, No. 8, pp. 521-534, 2002.
[12] J. G. Proakis, Digital Communications, 4th Ed. New York: McGraw-Hill, 2001.
[13] T. Roosta, S. Shieh, S. Sastry, "Taxonomy of Security Attacks in Sensor Networks and Countermeasures," Proceedings of the First IEEE International Conference on System Integration and Reliability Improvements, December 2006.
[14] P. Syverson, "A Taxonomy of Replay Attacks", Proceedings of the Computer Security Foundations Workshop VII, pp. 187-191, Jun. 1994.
[15] H.-R. Tseng, R.-H. Jan, W. Yang, "An Improved Dynamic User Authentication Scheme for Wireless Sensor Networks," Proceedings of IEEE Global Communications Conference (GLOBECOM 2007), November 2007.
[16] I. Wirjawan, J. Koshy, R. Pandey, and Y. Ramin, "Balancing Computation and Communication Costs: The Case for Hybrid Execution in Sensor Networks," Proceedings of IEEE SECON'06, 2006.
[17] K. H.-M. Wong, Y. Zheng, J. Cao, S. Wang, "A Dynamic User Authentication Scheme for Wireless Sensor Networks," Proceedings of IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing - Vol. 1 (SUTC'06), 2006.
[18] Y. Xi, L. Schwiebert, W. Shi, "Preserving Source Location Privacy in Monitoring-Based Wireless Sensor Networks," Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006.
[19] T. Zia, A. Zomaya, "Security Issues in Wireless Sensor Networks," Proceedings of the International Conference on Systems and Networks Communication (ICSNC'06), 2006.


A Framework of Machine Learning Based Intrusion Detection for Wireless Sensor Networks

Zhenwei Yu
World Evolved Services, LLC, New York

Jeffrey J.P. Tsai
Department of Computer Science, University of Illinois at Chicago
tsai@cs.uic.edu

Abstract data freshness and authenticated broadcast for sensor


network [1]. LEAP (Localized Encryption and
Some security protocols or mechanisms have been Authentication Protocol), is designed to support
designed for wireless sensor networks (WSNs). in-network processing bases on the different security
However, an intrusion detection system (IDS) should requirements for different types of messages exchange
always be deployed on security critical applications to [2]. INSENS is an intrusion tolerant routing protocol
defense in depth. Due to the resource constraints, the for wireless sensor networks [3]. A lightweight
intrusion detection system for traditional network security protocol relying solely on broadcasts of
cannot be used directly in WSNs. Several schemes end-to-end encrypted packets was reported in [4].
have been proposed to detect intrusions in wireless However, a sensor network, as a complicated system,
sensor networks. But most of them aim on some there are always some vulnerabilities to be attacked.
specific attacks (e.g. selective forwarding) or attacks Moreover, WSNs may be deployed in hostile
on particular layers, such as media access layer or environments such as battlefields, where sensor nodes
routing layer. In this paper, we present a framework of are susceptible to physical capture. Security sensitive
machine learning based intrusion detection system for information (e.g. shared key) might be exposed by
wireless sensor networks. Our system will not be compromised nodes. A subvert node might be rejoined
limited on particular attacks, while machine learning the sensor network to perform further attacks. So a
algorithm helps to build detection model from training particular intrusion detection system for the sensor
data automatically, which will save human labor from network is desirable for those security critical
writing signature of attacks or specifying the normal applications.
behavior of a sensor node. Several schemes have been proposed to detect
intrusions in wireless sensor networks [5]-[13].
1. Introduction However, most of them aim on some specific attacks
(e.g. selective forwarding [6],[7]) or attacks on
A wireless sensor network (WSN) consists of a particular layers, such as routing layer [9] or media
large set of tiny sensor nodes. Sensor nodes can access layer [10]. In this paper, we present a
perform sensing, data processing and communicating framework of machine learning based intrusion
but with limited power, computational capacities, detection system for wireless sensor networks. Our
small memory size and low bandwidth. The senor system will not be limited to particular attacks, while
nodes in WSNs are usually static after deployment, machine learning algorithm helps to build detection
and communicate mainly through broadcast instead of model from training data automatically, which will
point-to-point communication. Sensor networks have save human labor from creating signature of attacks or
been used in a variety of domains, such as military specifying the normal behavior of a sensor node.
sensing in battlefield, perimeter defense on critical The second section will give the overview of the
area such as airport, intrusion detection for traditional challenges on intrusion detection in WSNs. Then our
communication network, disasters monitoring, home framework of IDS will be discussed in section 3. The
healthcare and so on. Obviously, some applications are related works are reviewed on section 4. The paper
security critical (e.g. military sensing in battlefield), ends with conclusion and future work in section 5.
which attract many researchers’ attention to secure a
sensor network. Some security protocols or 2. Challenges on Intrusion Detection in
mechanisms have been designed for sensor networks. WSNs
For example, SPINS (Sensor Protocol for Information
via Negotiation), a set of protocols, provides secure To understand the challenges of intrusion
data confidentiality, two-party data authentication, and detection in WSNs better, we will give the overview

of a sensor node first. A well-known sensor node is the WSNs may be deployed in hostile environments
MICA2/MICAz series by Crossbow. The processor in such as battlefields, where sensor nodes are
these nodes is an 8 MHz 8-bit Atmel ATMEGA128L susceptible to physical capture. Security information
CPU. It has only 128 kB of instruction memory, 4 kB (e.g. shared key) might be exposed by compromised
of RAM for data, and 512 kB of flash memory [14]. nodes. The development of tamper-proof nodes is one
The sensor node is usually powered by 2 AA batteries. possible approach to security in hostile environment,
MICA2 series sensor nodes feature a multi-channel but the complicated hardware and high cost keep it
radio delivering up to 38.4 kbps data rate with a range away from WSN applications. An intrusion detection
of up to 1000 feet [15]. MICAz series sensor nodes system for WSNs has to be aware of physical attacks
offer 250Kpbs data rate with a range of up to 100 and can not trust any node.
meters [16]. Security properties or the challenges of
the wireless sensor networks have been reported in 3. Our Framework of Machine Learning
varied literatures [17]-[20]. In the following, we based ID for WSNs
summarize the challenges on designing IDS in WSN
from the limited resources of the sensor nodes, the 3.1. Architecture
wireless communication, the dynamic topology of the
network and the hostile working environment. For the traditional wired network, four
Sensor network nodes usually have severely architecture of intrusion detection system were
constraints in computational power, memory size, and studied. Centralized network intrusion detection
energy as we see on MICA2/MICAz series nodes. systems are characterized by distributed audit
With those limited resources, some effective security collection and centralized analysis. A hierarchical
defense techniques for traditional LAN/WAN/Internet NIDS has some intermediate components between the
are no longer suitable for wireless sensor network. For collection and analysis components to form a tree
example, asymmetric cryptography is often too structure hierarchy. The intermediate components
expensive for many WSN applications. Intrusion aggregate, abstract and reduce the collected data and
detection, as another layer of security, plays a more output the results to analysis. A netted architecture
important role to secure wireless sensor networks. permits information flow from any node to any other
However, the low computational power and the node. The collection, aggregation and analysis
insufficient available memory pose big challenges to components are combined into a single component
the design of an intrusion detection system for WSNs: that is residing on every monitored system. In a
the intrusion detection components should optimize mobile agent based intrusion detection system, all of
resource consumption, and it might sacrifice its collection, aggregation and analysis components are
performance to fit the resource constraints. Another wrapped by mobile agent. The code can be migrated to
challenge is only limited log/audit data could be used a destination instead of passing massive audit data to
for intrusion detection due to low available storage. reduce the network traffic. Although a centralized
Sensor nodes use wireless communication in detection algorithm was proposed to detect
WSNs. Any information over the radio can be sinkhole/selective forwarding attack in wireless sensor
intercepted and the private or sensitive information networks in [6]. However, the centralized architecture
could be captured by a passive attacker. An aggressive is not suitable for an intrusion detection system to
attacker can easily inject malicious messages into the detect as many types of attacks as possible, because
wireless network to perform varied attacks. Unlike the low data rate of wireless communication and
wireless local area networks (LANs), whose available limited energy of the sensor nodes couldn’t afford to
bandwidths could be 54Mbps, the data rate for WSN is pass the massive audit data to a base station to be
likely far less than 1Mbps. The low bandwidth analyzed. In other hand, the codes in the sensor nodes
prevents some analysis on suspicious data being are usually written in its ROM before the sensor
executed promptly in the powerful remote base network is deployed. There is no feasible solution to
station. In other hand, communication is a very support transmitting and executing code dynamically
energy-hungry task in sensor node, transmitting with which is required by the mobile agent based
maximal power could consume about 3~4 times power architecture. A hierarchical architecture of IDS was
as processor does in active mode. Most of suggested in [5], where the local agent monitors the
communication ability should be reserved for target node local activity to detect intrusions and the global
sensed information. Only limited amount of security agent monitors the packets sent by its all neighbors to
related data could be sent to powerful base station for detect attacks. The local agents run on every sensor
further comprehensive analysis to detect intrusions. nodes while the global agents run on selected nodes,
Knowledge of the network is very useful such as the cluster head in applications deploying
information to detect intrusions. In a wireless sensor hierarchical routing protocols, or some watchdog
network, the topology of the network is usually not a nodes. A decentralized high-level rule-based IDS
priori. Even after the deployment, the network is model was proposed in [11]. Like in the netted
always evolving due to frequent failure of sensor architecture, all IDS functions, from data acquiescing
nodes, new added sensor nodes. It could be a big to analyzing, are implemented in monitor nodes.
challenge to build a base profile in such a dynamic However, unlike in the netted architecture, only
network for an intrusion detection system. selected sensor nodes act as monitor nodes and only
intrusion alerts are sent to base station.

[Figure 1 shows the system block diagram: each sensor node hosts an ID agent containing a PIDC and an LIDC, while the base station hosts the IDS console and the model tuner.]
Figure 1. System block diagram

In our intrusion detection system, the architecture is similar to the netted architecture, where every sensor node will be equipped with an intrusion detection agent (IDA), but no cooperation exists between two IDAs since no node can be trusted. Like the attacks against traditional wired networks, the attacks in a wireless sensor network could be inside attacks or outside attacks. The outside attacks could come from more powerful adversary nodes such as laptops, while the inside attacks might be launched by compromised sensor nodes that have legitimate access to the
sensor network. Sensor networks are
application-oriented, the codes in the sensor nodes are 3.2. Audit data for LIDC
written in its ROM before the sensor network is
deployed. An adversary can physically capture a In the domain of intrusion detection for traditional
sensor node from a sensor network and reprogram it wired networks, comprehensive research works have
with extracted security sensitive data (such as id, key) been done on the audit data for intrusion detection.
and malicious codes. The subverted node could join Many system features were identified to be useful for
the senor network to attack the sensor network further intrusion detection. For example, 41 features were
as a compromised node. Unlike the traditional wired conducted on the network connection in the
network, where HIDS (Host based Intrusion Detection KDDCup’99 Intrusion Detection dataset [21].
System) can analyze the host features to detect However, due to the resource constraints of a sensor
whether the host is compromised or misused. We can node, there are only few features that could be used to
not expect to design a similar intrusion detection detection intrusion. Moreover, a sensor network will
component to report that its host node is compromised have its own application requirement and employ only
or misused because all original codes (including the necessary protocols. Thus not all of features
intrusion detection codes) in its ROM could be erased identified below could be available for one particular
or modified in such a compromised node. However, sensor network application.
we could design a similar Local Intrusion Detection The sensing component, processor, radio and
Component (LIDC) to analyze local features to detect energy provider are the core parts of a sensor node.
whether its host node is suffering attacks from other Beside the CPU usage, memory usage and changes on
malicious nodes. storage which have been identified to detect attacks in
One of the goals of intrusion detection is to stop HIDS for traditional wired networks, more features
any ongoing attacks if it is possible. Wireless sensor can be identified related to communication, energy
networks mainly rely on wireless broadcast and sensing to detect intrusion in sensor nodes.
communication with certain effective range, thus it is
possible to locate the inside intruder (subverted node) 3.2.1. Packet collision ratio. Packet collision occurs
and isolate it from the sensor network. To locate the when two or more neighbor sensor nodes try to send
subverted node, the intrusion detection system must packet at the same time through the shared
monitor some suspect nodes and identify subverted communication channel. Collided packets have to
nodes by monitoring communication activities of been discarded and retransmitted and waste the
neighbor nodes, which is the task of Packet based constrained energy. Collisions are handled by MAC
Intrusion Detection Component (PIDC) in our IDS. (Media Access Control) protocol. Scheduled protocols
However, the density of nodes in wireless sensor (such as TDMA-based LEACH [22] ) are collision
networks is usually high. Many WSNs related research free protocols and all transmissions are scheduled on
work were based on the network where each node had different time/frequency slots. However, adversary
eight or more neighbor nodes. If a PIDC has to nodes could break the schedule intended to attack the
monitor communication activities of its all neighbor sensor network. Contention based protocols (such as
nodes, it will cost too much its precious energy. Based CSMA based protocol [23] ) allocate the shared
on the analysis of the LIDC, the IPDC could monitor channel on-demand and employ some mechanism to
only one or couple of its neighbor nodes which could avoid the collision but accept some level collisions. A
be particular suspect nodes. The PIDCs in its good MAC protocol should archive relative low
watchdog nodes could cooperate together to identify collision rate when the sensor network works
the real subverted node. normally, thus abnormal high collision ratio indicates
The intrusion alerts are sent to the base station, the existence of adversary.
where the user may be able to verify some possible
intrusion. For some false alerts, the base station could 3.2.2. Packet delivery waiting time. In contention
do some tuning of the intrusion detection model to based MAC protocols, a packet will be buffered to
reduce further false alerts, and pass the tuning result to wait for the shared channel. The fairness of accessing
sensor nodes. The system block diagram of our shared channel of the MAC protocol will ensure the
proposed intrusion detection is shown Figure 1. waiting time of a packet in a reasonable level. The
statistic of waiting time could be used to detect some
attacks against the fairness of MAC protocol.
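As a concrete illustration of how an LIDC might accumulate the audit features just described (Sections 3.2.1 and 3.2.2), the sketch below keeps per-window counters for packet collisions and channel-access waiting time; all names and the windowing scheme are hypothetical, not taken from the paper.

```java
// Hypothetical per-window audit record for the Local Intrusion Detection Component (LIDC).
final class LidcAuditWindow {
    private long packetsSent;           // packets this node attempted to transmit in the window
    private long collisions;            // transmissions that collided and had to be retried
    private long totalWaitingMicros;    // time packets spent buffered waiting for the shared channel

    void recordTransmission(boolean collided, long waitingMicros) {
        packetsSent++;
        if (collided) collisions++;
        totalWaitingMicros += waitingMicros;
    }

    /** Abnormally high collision ratios can indicate an adversary disrupting the MAC schedule. */
    double collisionRatio() {
        return packetsSent == 0 ? 0.0 : (double) collisions / packetsSent;
    }

    /** Unusually long average waits can flag attacks on the fairness of the MAC protocol. */
    double averageWaitingMicros() {
        return packetsSent == 0 ? 0.0 : (double) totalWaitingMicros / packetsSent;
    }

    /** Start a new monitoring window. */
    void reset() { packetsSent = 0; collisions = 0; totalWaitingMicros = 0; }
}
```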

applications have different sensing reading report
3.2.3. RTS packets rate. To avoid packet collision, requirement. Some applications require each sensor
contention based MAC protocols adopt RTS/CTS node report its reading periodically. In these
mechanism. When the channel is idle, the sender is applications, if a sensor node couldn’t report its
required to send RTS (Request-to-Send) packet to the sensing reading following the desired interval, the
receiver, and the receiver acknowledges the CTS sensing component could be under attacks. Some other
(Clear-to-Send) packet. The sender starts to send its applications require each sensor node reports its
data after receives the CTS packet from the receiver. reading as the answer of the query from the base
This RTS/CTS mechanism could be attacked by station. Subverted node could query the sensing
sending lots of RTS packets to gain unfair channel or reading more frequently to exhaust the energy of
to exhaust the receiver’s energy. victim nodes.
3.2.4. Neighbor count. Sensor nodes have limited 3.3. Packet features for PIDC
radio transmission range, while the sensor network
could be very large. A large sensor network usually Packet based Intrusion Detection Component
employs a multi-hop routing protocol to (PIDC) analyzes the packets from a suspect node to
communication. Each sensor node maintains a know whether the suspect node is attacking the host
neighbor table to record its neighbor information (e.g. node. The following identified features are calculated
nodes id, link cost etc) to build its routing. Unlike the on the packets from the same sender (a suspect node).
mobile ad-hoc network (MANET), most of nodes in
sensor network are supposed to be stationary. The 3.3.1. Distribution of packet type. There are several
neighbor table should be stable in relative short period, packets to be transmitted over the air in the wireless
although new nodes could be added and existed node sensor network, such as sensing data, route update,
could be removed since fault or energy exhaust over a query/command from the base station, HELLO
long time. The change of its neighbor count could be packets. But the main purpose of a sensor network is
used to detect some attacks. For example, The Sybil to sense certain interesting information, thus the main
attack [24] is where a malicious node illegitimately part of the packets should be sensing data.
claims multiple identities and works as if it were many
nodes. 3.3.2. Packet received signal strength. In wireless
transmission, the sender radiates electromagnetic
3.2.5. Routing cost. In wireless sensor networks, a energy into the air through its antenna and the receiver
multi-hop routing protocol maintains a route table in picks up the electromagnetic wave from its
every node to route its packets. The route table mainly surrounding air through its antenna. The received
records the next node of the path from one node to signal strength (RSS) measures the energy of the
base station and its cost, such as hop count or latency. electromagnetic wave. To receive a packet correctly,
Attacks against routing protocol (such as the received signal strength must be greater than a
sinkhole/wormhole) usually broadcast fake routing threshold known as receiver sensitivity. The received
information to attract more packets to route to its signal strength gradually decreases as the distance
node. Monitoring the routing cost and analyzing its between the sender and receiver increases. The
change could be used to detect those attacks. distance between the sender and receiver can be
estimated according to the RSS and propagation
3.2.6. Power consumption rate. Sensor nodes have model. The estimated distance could be used to detect
constrained power. The components of the sensor the attacker with much powerful radio (such as laptop)
node, including processing unit, sensing unit and compared to the radio of the sensor node. In the other
radio, are designed to be powered off to save the hand, the received signal strength should decrease as
energy if it is possible. The node spends most of time the system runs since the energy of the sensor node
in sleep mode to extend the node life. Some proposed will be consumed. If the received signal strength
energy-aware routing protocols (e.g. SPIN) have increases, it is possible that the node identification was
access to the current energy level of the node and stolen by a powerful malicious node.
adapt the protocol it is running based on how much
energy is remaining. Some DOS (Denial of Service) 3.3.3. Sensing data arrival rate. There are two types
attacks aim at the limit power of the sensor nodes. For of sensor network applications according to how the
example, an intruder interferes the transmission to sensor nodes are driven to sense data. In the first
increase the collision ratio or send RTS packets flood applications, the sensor nodes are driven by some
to exhaust the victim’s energy. The power particular events. In this type of applications, the
consumption ratio could be monitored to detect such sensing data will arrive without any pattern. However,
attacks. Usually sensor node (e.g. MICA2/MICAz in the second type of applications, the sensor nodes
nodes) has its own resource manager which keeps sense the data every preset interval, i.e., driven by the
track of resource consumption including the power time. In those applications, either missing an expected
consumption. sensing data or receiving unexpected sensing data
identifies some abnormality of the target node.
3.2.7. Sensing reading report rate. Sensing is one of
the main functions of sensor nodes. Different

3.3.4. Sensing reading value changing ratio. Sensor magnitude of PC represents the confidence of the
networks are mainly used to monitor some prediction.
environment parameter, such as temperate, sound, Since the detection model consists of multiple
wind speed and so on. Some parameters will change binary classifiers, a final arbiter is needed to pick one
within a certain range in a short time. For those of the prediction results from those binary classifiers
applications, if the sensing reading value changes as its final prediction. The prediction confidence ratio
beyond the normal range, there may be some (PCR) based arbitral strategy [26] could be used in the
abnormality. final arbiter in our intrusion detection system for
wireless sensor network, because the computation
3.3.5. RTS packets rate. This feature is calculated on required by this arbitral strategy is very light and meet
the packets sent from the particular sender, the the constrained computational power of sensor nodes.
suspicious node. The PCR is defined by:
PCR = PC / MAX{ PC_1 , PC_2 , … , PC_m }        (2)
3.3.6. Packet drop ratio. We have stated that a large
sensor network usually employs a multi-hop routing Where PC stands for prediction confidence on a
protocol to communication since sensor nodes have data record in test dataset while PC i stands for the
very limited radio transmission range. Most sensor prediction confidence on the ith data record in the
nodes also work as a route to forward its received training dataset with total m records. The prediction
packets. A subverted node could attack this confidence ratio based final arbitral strategy can be
forwarding function by dropping packets or selectively expressed as follow:
forwarding some packets. To calculate this drop ratio,
the host node must know the received packets and the i = { j | PCR j = MAX {PCR1 , PCR2 , …, PCRn }} (3)
forwarded packets of the suspicious node.
Where PCRj is prediction confidence ratio and
3.3.7. Packet retransmission rate. A packet could be computed by Equation (2), and the i is the index of the
retransmitted when the previous transmission is failed binary classifier whose prediction result is selected to
due to conflict. However, such retransmission be the final prediction result. We had built a detection
mechanism could be attacked. A subverted node could model for traditional network and evaluated it on
retransmit a packet multiple times to exhaust the KDDCup’99 intrusion detection dataset, which was
energy of the receiver or try to alter the aggregation constructed from the raw TCP data for a wired
value. Abnormal retransmission rate can be used to local-area network (LAN) simulating a typical U.S.
detect intrusion. Air Force LAN. The performance of the detection
model on the test dataset was better than the winner of
3.4. Detection model and optimization the KDDCup’99 classifier contest [26].
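To illustrate how the prediction confidence ratio of Equations (2) and (3) could drive the final arbiter, consider the following sketch; the class layout is an assumption, and taking the magnitude of PC when forming the ratio is a simplification made only for this illustration.

```java
// Illustrative PCR-based final arbiter over multiple binary classifiers (Equations (2) and (3)).
final class PcrArbiter {

    /** One classifier's output on a test record plus the largest confidence it produced in training. */
    static final class ClassifierOutput {
        final double predictionConfidence;      // PC: signed sum of the covering rules' confidences
        final double maxTrainingConfidence;     // MAX{PC_1, ..., PC_m} observed on the training set (> 0 assumed)

        ClassifierOutput(double pc, double maxTrainPc) {
            this.predictionConfidence = pc;
            this.maxTrainingConfidence = maxTrainPc;
        }

        /** Equation (2): PCR = PC / MAX{PC_1, ..., PC_m}; magnitude used here as a simplification. */
        double pcr() {
            return Math.abs(predictionConfidence) / maxTrainingConfidence;
        }
    }

    /** Equation (3): pick the classifier with the largest PCR; its prediction becomes the final one. */
    static int selectFinalPrediction(ClassifierOutput[] outputs) {
        int best = 0;
        for (int j = 1; j < outputs.length; j++) {
            if (outputs[j].pcr() > outputs[best].pcr()) best = j;
        }
        return best;
    }
}
```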
However, the relationships among rules are not
We would like to apply a machine learning explored and rules in the model are disjunctive in
algorithm called SLIPPER [25] to build the detection default. Therefore, at least one condition in every rule
model. The model will consist of multiple binary which has multiple conditions has to be evaluated on
classifiers, which includes a set of rules. SLIPPER is a every data to make the final prediction. In wireless
confidence-rated boosting algorithm, and each rule sensor networks, the CPU has limited computational
learned from its training dataset might not have very power and the sensor node has constrained energy, so
high prediction accuracy on new data. However, the it is desired to optimize the rule evaluation procedure
predictions based on the entire set of rules are to reduce unnecessary computation. Rules are in
expected to be highly true. A rule R in binary classifier IF-THEN form. Most of rules have one or more
is forced to abstain on all data records not covered by conjunctive conditions (see rule examples in Figure 2).
R, and predicts with the same confidence CR on every Each condition consists of a feature name, an operator
data record x covered by R. The confidence CR was and a reference value. For example, “Service = telnet”,
calculated when the rule was built in the training “SourceBytes <= 147”. Some conditions in different
phase. A default rule which covers all data has rules could have same features. To optimize the
negative confidence, while all other rules have detection model, we will explore the relationship
positive confidence. The binary prediction engine is among those conditions with same features while
same as the final hypothesis in SLIPPER [25], which ignore any possible relationship among different
is: features since we assume the features are independent.
Among those conditions with the same features, we
H(x) = sign( Σ_{Rt : x ∈ Rt} C_Rt )        (1)
realize that two kinds of relationships could be used to
optimize the rule evaluation procedure. The first
In other words, the final hypothesis sums up the relationship is mutually exclusive relationship among
confidence values of all rules that cover the data and conditions such as “Service = login” and “Service =
the sign of the sum represents the predicted class label. ftpdata” where these two conditions couldn’t be true at
However, our binary prediction engines will output a the same time. The second relationship is implicit
signed sum of the confidence values of all rules that relationship between conditions such as “Duration >=
cover the data (not just the sign). We refer this 134” and “Duration >= 67” where the former implies
signed sum to prediction confidence (PC). The the latter. For conditions with mutually exclusive
relationship, at most only one condition could be true,

R3, ). The structure shown in Figure 3 can be stored in text mode as:

( ( , NumFiles >= 1 , R3, ), Service = ftpdata, ( ( , Duration >= 134 , ( , DstHostErrRate <= 0, R4, ), ), DstBytes <= 5, R2, ), ( , Service = login , ( , Duration >= 134, (R1, DstHostErrRate <= 0, R4, ), ( , Duration >= 67, R1, ) ), ( , Duration >= 134 , ( , DstHostErrRate <= 0, R4, ), ))).

while all other conditions must be false. So when these conditions are evaluated one by one, as long as the true condition is evaluated, the evaluations on the remaining conditions with the mutually exclusive relationship can be skipped. These conditions with the mutually exclusive relationship could be ordered further by their likelihood of being true if such information is available. For example, the feature "Service" is expected more likely to be "ftpdata" than to be "login", so the condition "Service = ftpdata" should be evaluated earlier than the condition "Service = login". For
conditions with implicit relationship, the implied
condition (“Duration >= 67”) should be evaluated only
if the implying condition (“Duration >= 134”) is 3.5. Model tuning
evaluated to be false.
An intrusion detection system will alert possible
intrusions; however, the alert could be false. False alerts are also anticipated in our system. We have developed a model tuning algorithm [27], which can tune the model (each rule's associated confidence) automatically to improve its performance on future data. The tuning algorithm utilizes a property of the binary classifier that only rules that cover a data record contribute to the final prediction on this data record.

R1: IF Service = login, Duration >= 67.
R2: IF Service = ftpdata, DstBytes <= 5.
R3: IF NumFiles >= 1.
R4: IF Duration >= 134, DstHostErrRate <= 0.
Figure 2. A rule set with four rules
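As a sketch of how a binary classifier built from confidence-rated rules such as those in Figure 2 could produce the signed confidence sum of Equation (1), consider the following illustrative code; the rule encoding and class names are hypothetical, not part of SLIPPER or the paper.

```java
// Illustrative confidence-rated rule and binary prediction engine (sketch only).
final class ConfidenceRatedRule {
    final java.util.Map<String, java.util.function.Predicate<String>> conditions; // feature -> test
    final double confidence;   // C_R learned in training; the default rule carries a negative confidence

    ConfidenceRatedRule(java.util.Map<String, java.util.function.Predicate<String>> conditions,
                        double confidence) {
        this.conditions = conditions;
        this.confidence = confidence;
    }

    /** A rule abstains on records it does not cover; it covers a record only if every condition holds. */
    boolean covers(java.util.Map<String, String> record) {
        for (java.util.Map.Entry<String, java.util.function.Predicate<String>> entry : conditions.entrySet()) {
            String value = record.get(entry.getKey());
            if (value == null || !entry.getValue().test(value)) return false;
        }
        return true;
    }
}

final class BinaryPredictionEngine {
    private final java.util.List<ConfidenceRatedRule> rules;   // includes the default rule

    BinaryPredictionEngine(java.util.List<ConfidenceRatedRule> rules) { this.rules = rules; }

    /** Prediction confidence PC: the signed sum of the confidences of all covering rules (Equation (1)). */
    double predictionConfidence(java.util.Map<String, String> record) {
        double sum = 0.0;
        for (ConfidenceRatedRule rule : rules) {
            if (rule.covers(record)) sum += rule.confidence;
        }
        return sum;   // the sign gives the predicted class label; the magnitude is the confidence
    }
}
```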
Our tuning algorithm changes the associated
confidence values to adjust the contribution of each
To utilize these relationships to optimize the rule to the binary prediction. Consequentially, tuning
condition evaluation, we organize conditions in all ensures that if a data record is covered by a rule in the
rules into a tree structure. Each node consists of a original model, then it will be covered by this rule also
condition to be evaluated and three child trees. The in the tuned model and vice versa.
left child tree (true child) will be evaluated when its When a binary classifier is used to predict a new
condition is true, while the right child (false child) tree data record, two different types of false predictions
will be evaluated when its condition is false. Of may be generated according to the sum of confidence
course, there are some rules such that none of its values of all rules that cover this data record. When
conditions has mutually exclusive or implicit the sum is positive, the binary classifier predicts the
relationship with any condition in other rules. The data record to be in the positive class. If this prediction
middle tree (unconditional child) is built on all is false, it is treated as a false positive prediction.
conditions from those rules, which will be evaluated When the sum is negative, the binary classifier
before its condition is evaluated. For example, we can predicts the data record to be in the negative class. If
organize the four rules listed in Figure 2 into a tree this prediction is false, it is considered a false negative
structure shown in Figure 3. prediction. Obviously, when the classifier makes a
false positive (FP) prediction, the confidence values of
those positive rules involved should be decreased to
avoid the false positive prediction made by these rules
on subsequent data. When the classifier makes a false
negative (FN) prediction, the confidence values of the
positive rules involved should be increased to avoid
false negative predictions made by these rules on
successive data. Formally,
 p ⋅ C R if rule R   → FP
contributes to

C R′ =  (4)
 q ⋅ C R if rule R    → FN
contributes to

Where constrains p < 1 and q > 1 ensure that a


positive rule always has a positive confidence.
Because p < 1, q > 1, and the confidence value of the
Figure 3. Rule evaluation tree default rule is unchanged, trivially there exists a
number n, such that after updating the confidence
values n times, the sign of the sum of the confidence
Each node in the tree structure can be described in values of all rules (both positive rules and the default
text mode by a quad (unconditional child, node rule) will be changed. That means the tuned classifier
condition, true child, false child). For example, the could make a true prediction on the data where the
node “NumFiles >= 1” is saved at (,NumFiles>=1, original classifier made a false prediction. Our

277
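The sketch below is our own illustration (not the paper's code) of one possible encoding of the quad nodes described above, together with the confidence update of Equation (4); the class and function names and the values of p and q are assumptions.

```python
class Node:
    """One quad node: (unconditional child, condition, true child, false child)."""
    def __init__(self, condition, uncond=None, true_child=None, false_child=None, rule=None):
        self.condition = condition      # callable record -> bool, or None at a leaf
        self.uncond = uncond            # evaluated before this node's condition
        self.true_child = true_child    # followed when the condition is true
        self.false_child = false_child  # followed when the condition is false
        self.rule = rule                # rule id satisfied when this point is reached

def evaluate(node, record, fired):
    """Collect the ids of rules whose conditions are all satisfied by the record."""
    if node is None:
        return
    if node.rule is not None:
        fired.add(node.rule)
    if node.uncond is not None:
        evaluate(node.uncond, record, fired)
    if node.condition is not None:
        branch = node.true_child if node.condition(record) else node.false_child
        evaluate(branch, record, fired)

def tune(confidences, fired, error, p=0.8, q=1.25):
    """Eq. (4): shrink (p < 1) the confidences of fired positive rules after a
    false positive, grow them (q > 1) after a false negative."""
    factor = p if error == "FP" else q
    for rule_id in fired:
        confidences[rule_id] *= factor
    return confidences

# Tiny usage with a one-condition branch for rule R3 (NumFiles >= 1).
tree = Node(lambda r: r["NumFiles"] >= 1, true_child=Node(None, rule="R3"))
fired = set()
evaluate(tree, {"NumFiles": 2}, fired)
print(fired)                            # {'R3'}
print(tune({"R3": 1.0}, fired, "FP"))   # {'R3': 0.8}
```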
Our experiments showed that the system could achieve sensor network in [9]. A total of nine traffic-related features
about 20% improvement with quick tunings while based on AODV (Ad hoc On-demand Distance Vector
only 1.3% of the false predictions were used to tune [28]) routing protocol were identified to describe the
the model [27]. conditions of the traffic flow through the node. Three
Due to the limited computation power of the non-traffic related features were selected to monitor
sensor node, the model tuning function is separated changes of the path to the base station. The proposed
from the detection agent logically and physically. The system adopted a fixed-width clustering algorithm,
model tuner with the system console is resident in the which had been applied for anomaly detection in IP
base station. To tune the model of a particular sensor network.
node, the system must keep a copy of the detection model Attacks against the MAC protocol in wireless
for every intrusion detection agent. Only the tuned sensor networks were studied and classified into
results will be delivered from the base station to save collision attack, unfairness attack and exhaustion
communication cost. attack in [10]. Three statistics, collision ratio, packet
waiting time and RTS packet ratio were identified as
4. Related Work intrusion indicators respectively. The probability of
particular attacks was calculated by a soft decision
The general guidelines for IDS in sensor networks function along with an overall probability of attacks
were discussed in [5]. A hierarchical architecture of related to packet successful delivery ratio.
IDS was suggested, where the local agent monitors the A decentralized high-level rule-based IDS model
node's local activity to detect intrusions and the global was proposed in [11]. All IDS functions, from data
agent monitors the packets sent by all its neighbors to acquisition to analysis, are implemented in monitor
detect attacks. nodes. Only intrusion alerts are sent to the base
A centralized detection algorithm against station. Seven high level rules (interval rule,
sinkhole/selective forwarding attack was proposed in retransmission rule, integrity rule, delay rule,
[6]. The base station identifies a list of suspicious repetition rule, radio transmission range rule and
nodes by detecting data inconsistency using a statistical jamming rule [11]) were defined to detect intrusions.
method. Then the base station can estimate the attack This IDS performs analysis on data message listened
area where the sinkhole node locates. A requested to by the monitor node that is not addressed to it and
network data flow message will be sent by the base message collision when the monitor node tries to send
station to the nodes in attack area with the suspicious a message. After messages are collected in
node IDs. All suspicious nodes will reply to this request promiscuous mode and the important information is
with their network flow information including ID, filtered and stored, a subsequent rule-matching procedure
next-hop ID and cost. The network flow information is executed on every message. The order of rules
can be represented by a directional edge from source depends on the message type. When a rule fires on a
ID to its next-hop ID in a base station. The base message, the rule-matching procedure will stop and
station will infer the routing pattern by constructing a message, the rule-matching procedure will stop and
tree using these directed edges. An area invaded by a the message will be discarded to save the storage
sinkhole attack exhibits a special routing pattern where space. Instead of reporting an alarm on attack, a
all network traffic flows toward the same destination, message. An attack is alerted only if the counting
the root in the tree of network flow, which is failure number is greater than an expected value by the
compromised by the intruder. monitor node during the analysis of messages
A specification-based network intrusion detection transmitted on its neighborhood in a round. This
system principally against the sinkhole/selective expected number is calculated dynamically by the
forwarding attack was presented in [7]. A rule specifies node in its neighborhood.
that a normal node should forward the packets at a rate node in its neighborhood.
over a threshold. Otherwise, the node could be The intrusion detection problem in WSN was
abnormal. For a link A->B (Node A sends packets to formulated as a non-cooperative two-player
Node B), Node A and the watchdog nodes of link nonzero-sum game between the intrusion detection
A->B monitor the behavior of node B and make the system and the attacker in [12], [13]. The basis is that
decision cooperatively through a majority-vote policy. in non-cooperative games there exist sets of optimal
To identify an intruder impersonating a legitimate strategies (so-called Nash equilibrium) used by the
neighbor, a low-complexity anomaly detection players in a game such that no player can benefit by
algorithm was proposed in [8]. A sensor node recorded unilaterally changing his or her strategy if the
the arrival time and received power of each incoming strategies of the other players remain unchanged. The
packet for last N packets from each neighbor. A relationship between an attacker and the IDS is
simple dynamic statistical model (the min and max of non-cooperative in nature because no outside authority
received power, the packet arrival rate on last N could assure any agreement between an attacker and
packets and on last N2 packets) was built. The simple the IDS. The proposed IDS is able to monitor all
statistical model was used to detect any abnormality sensor nodes, but due to system limitations it can only
by monitoring received packet power level and packet protect one sensor node at each time slot, and based on
arrival rates from a neighbor node. a game theoretic framework it will choose such a
An unsupervised anomaly detection technique sensor node (called cluster head) for protection.
was proposed to detect routing attacks in wireless

5. Conclusion and Future Work

In this paper, we presented a framework of a machine learning based intrusion detection for wireless sensor networks. In our system, each sensor node will be equipped with a detection agent. The detection agent will analyze the local data and packet data from suspicious nodes to identify an intruder. When the user finds a false alert, the system can automatically tune the model to improve its performance on future data. In the future, we plan to set up an experiment environment to test our framework.

6. References

[1] A. Perrig, et al., "SPINS: Security Protocols for Sensor Networks", Wireless Networks, 8(5):521-534, Sep. 2002.
[2] S. Zhu, S. Setia, and S. Jajodia, "LEAP: Efficient Security Mechanisms for Large-scale Distributed Sensor Networks", Proc. of the 10th ACM Conference on Computer and Communications Security (CCS '03), Oct. 2003.
[3] J. Deng, R. Han, and S. Mishra, "A Performance Evaluation of Intrusion-tolerant Routing in Wireless Sensor Networks", Proc. of the 2nd Int. IEEE Workshop on Information Processing in Sensor Networks (IPSN'03), Apr. 2003.
[4] J. Undercoffer, et al., "Security for Sensor Networks", CADIP Research Symposium, 2002.
[5] R. Roman, J. Zhou and J. Lopez, "Applying Intrusion Detection Systems to Wireless Sensor Networks", Proc. of the 3rd IEEE Consumer Communications and Networking Conference (CCNC 2006), Jan. 2006, Vol. 1, pp. 640-644.
[6] E. Ngai, J. Liu, and M. Lyu, "On the Intruder Detection for Sinkhole Attack in Wireless Sensor Networks", IEEE International Conference on Communications (ICC'06), Istanbul, Turkey, June 11-15, 2006.
[7] K. Ioannis, T. Dimitriou and F. Freiling, "Towards Intrusion Detection in Wireless Sensor Networks", 13th European Wireless Conference, Paris, France, April 2007.
[8] I. Onat and A. Miri, "An Intrusion Detection System for Wireless Sensor Networks", IEEE International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob'2005), Vol. 3, Aug. 2005, pp. 253-259.
[9] C. E. Loo, M. Y. Ng, C. Leckie and M. Palaniswami, "Intrusion Detection for Routing Attacks in Sensor Networks", International Journal of Distributed Sensor Networks, Vol. 2, No. 4, October-December 2006, pp. 313-332.
[10] Q. Ren and Q. Liang, "Secure Media Access Control (MAC) in Wireless Sensor Networks: Intrusion Detections and Countermeasures", 15th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC 2004), Sept. 2004, Vol. 4, pp. 3025-3029.
[11] A.P. Silva, et al., "Decentralized Intrusion Detection in Wireless Sensor Networks", Proc. of the 1st ACM Int. Workshop on Quality of Service and Security in Wireless and Mobile Networks, 16-23, Oct. 2005.
[12] A. Agah, S.K. Das, and K. Basu, "A Non-cooperative Game Approach for Intrusion Detection in Sensor Networks", IEEE Vehicular Technology Conference (VTC), Fall 2004.
[13] A. Agah, et al., "Intrusion Detection in Sensor Networks: A Non-cooperative Game Approach", 3rd IEEE Int. Symp. on Network Computing and Applications (NCA'04), 343-346, 2004.
[14] Online ATMEGA128L datasheet, http://www.atmel.com/dyn/resources/prod_documents/doc2467.pdf, Jun. 2007.
[15] Online MICA2 datasheet, http://www.xbow.com/Products/Product_pdf_files/Wireless_pdf/MICA2_Datasheet.pdf, Jun. 2007.
[16] Online MICAz datasheet, http://www.xbow.com/Products/Product_pdf_files/Wireless_pdf/MICAz_Datasheet.pdf, Jun. 2007.
[17] C.Y. Chong and S.P. Kumar, "Sensor Networks: Evolution, Opportunities and Challenges", Proc. of the IEEE, 91(8):1247-1256, Aug. 2003.
[18] E. Shi and A. Perrig, "Designing Secure Sensor Networks", IEEE Wireless Communications, 11(6):38-43, Dec. 2004.
[19] I. F. Akyildiz, et al., "A Survey on Sensor Networks", IEEE Communications Magazine, 40(8):102-114, Aug. 2002.
[20] M. Tubaishat and S. Madria, "Sensor Networks: An Overview", IEEE Potentials, 22(2):20-23, Apr. 2003.
[21] http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.
[22] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, "Energy-efficient Communication Protocols for Wireless Microsensor Networks", Proc. of the Hawaii International Conference on Systems Sciences, Jan. 2000.
[23] A. Woo and D. Culler, "A Transmission Control Scheme for Media Access in Sensor Networks", Proc. of the ACM/IEEE International Conference on Mobile Computing and Networking, Rome, Italy, July 2001, pp. 221-235.
[24] J.R. Douceur, "The Sybil Attack", Proc. of the 1st Int. Workshop on Peer-to-peer Systems (IPTPS '02), Mar. 2002.
[25] W. Cohen and Y. Singer, "A Simple, Fast, and Effective Rule Learner", Proc. of the 16th National Conference on Artificial Intelligence and 11th Conference on Innovative Applications of Artificial Intelligence, Orlando, Florida, pp. 335-342, July 1999.
[26] Z. Yu and J. Tsai, "An Efficient Intrusion Detection System Using a Boosting Based Learning Algorithm", Int'l Journal of Computer Applications in Technology (IJCAT), Vol. 27, No. 4, pp. 223-231, 2006.
[27] Z. Yu, J. Tsai and T. Weigert, "An Automatically Tuning Intrusion Detection System", IEEE Transactions on Systems, Man, and Cybernetics, Part B, Vol. 37, No. 2, pp. 373-384, April 2007.
[28] C. Perkins and E. Royer, "Ad-hoc On-Demand Distance Vector Routing", Proc. of the 2nd Workshop on Mobile Computing Systems and Applications (WMCSA'99), February 1999, pp. 90-100.


Inter-Domain Authentication for Seamless Roaming in


Heterogeneous Wireless Networks

Summit R. Tuladhar, Carlos E. Caicedo, and James B. D. Joshi


School of Information Sciences
University of Pittsburgh
Pittsburgh, PA 15213, USA
srt17@pitt.edu, cec15@pitt.edu, jjoshi@mail.sis.pitt.edu

Abstract For a MN to seamlessly roam across wireless net-


works, any active connection must not be broken and
The inter-operation among heterogeneous wireless the handoff time should be minimum. Service continu-
networks is crucial to support ubiquitous mobility and ity in handoffs is achieved by mobility management
seamless roaming. Handoffs across wireless networks protocols like Mobile IP [14] and Mobile IPv6 [10], in
in separate administrative domains should ensure un- which the MN is able to use its home address in a for-
interrupted service and authenticity of the entities in- eign network. Another crucial issue for seamless roam-
volved. However, the re-authentication of a mobile ing is the handoff delay that occurs as the MN moves
node (MN) during a handoff across administrative from one network to another. For a smoother transi-
domains typically involves several round trips to the tion, a minimum handoff delay is desired.
home domain which produce long latencies [9, 12]. A A key reason for a longer handoff delay in the ex-
new authentication protocol based on the use of proof isting solutions for roaming across administrative do-
tokens is presented in this paper which allows a MN to mains is the delay introduced by the authentication
authenticate in a foreign domain without requiring process. Currently employed authentication protocols
communication with its home domain. Moreover, the require the active participation of the home domain.
foreign domain is only required to have a trust rela- For example, while roaming into a foreign GSM net-
tionship with any one of MN’s previously visited do- work, a challenge response mechanism is carried out
mains, and not necessarily with the home domain. Our between the mobile device and the authentication cen-
objective is to provide an improved authentication ap- ter at its home network [5]. Eliminating round-trip la-
proach for ubiquitous mobility and seamless roaming tencies to the home network can improve the speed
that does not compromise the security of the handoff with which the handoff takes place in a secure manner.
process while providing low delays. This research work has been motivated by such a need
for improved authentication approach that does not
compromise the security of the handoff process while
eliminating latencies due to the participation of the
1. Introduction home network.
Mobile wireless users are in a constant quest for
1.1. Inter-Domain Trust
higher speed and ubiquitous coverage. However, data
rate and coverage are complementary to each other [7].
Trust is an integral component required for coop-
As a mobile node (MN) moves in an environment that
eration between wireless networks. Conventionally, for
supports universal wireless access through heterogene-
mobile users to be able to roam into foreign networks,
ous wireless networks of different technologies oper-
the foreign network and the mobile user’s home net-
ated by multiple service providers, it must be able to
work must trust each other and have a roaming agree-
seamlessly roam from one network to another in a se-
ment established beforehand. However, to achieve the
cure manner. In particular, the solutions that support
vision of a ubiquitous wireless network with global
such seamless roaming should ensure the authenticity
coverage involving a mixture of large and small net-
of the entities as it connects to new domains.
work operators and heterogeneous access technologies
will require procedures for dynamically establishing

trust and roaming relationships. In particular, we want sends an ‘accept’ or a ‘reject’ message back. The en-
interoperation between wireless domains which do not hanced challenge response mechanism only requires a
have direct roaming agreements. single round trip to the home domain.
Two wireless domains are said to be directly-
connected if they have direct trust and roaming agree-
ments and they are said to be well-connected if there is
at least one trust-path between them. Thus, the trust
path can be used to establish roaming between domains
without direct trust and roaming agreement.
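As a rough illustration of the well-connected notion, the following sketch checks whether a trust path exists between two domains over direct roaming agreements; the domain names and the agreement graph are hypothetical, not taken from any deployment.

```python
from collections import deque

def well_connected(agreements, src, dst):
    """Breadth-first search over direct roaming agreements: True when at
    least one trust path links the two domains."""
    seen, queue = {src}, deque([src])
    while queue:
        domain = queue.popleft()
        if domain == dst:
            return True
        for nxt in agreements.get(domain, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

agreements = {"home.example": ["operatorA.example"],
              "operatorA.example": ["home.example", "operatorB.example"],
              "operatorB.example": ["operatorA.example"]}
print(well_connected(agreements, "home.example", "operatorB.example"))  # True
```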

1.2. Inter-Domain Authentication

Consider a MN trying to connect to a foreign wire-


less network which has a roaming agreement with its
home network. The authenticator in the foreign net-
work requests the identity of the MN, and the MN pre- Figure 2. Enhanced Challenge-Response
sents its Network Access Identifier (NAI) [1], which based Inter-Domain Authentication
has a form of user@domain. The Foreign Authentica-
tion, Authorization, and Accounting (AAAF) server The two methods shown above are simplified just
looks at the domain part of the NAI and sees that the to illustrate the number of round trips that may be re-
MN does not belong to its administrative domain. It quired for an authentication to be performed by the
then checks if it has roaming agreements with MN’s home domain. In reality, depending on the technology
home domain, and if it does, it sends a message to the used, the number of round trips required might be
AAA at the MN’s home domain (AAAH). more. Various EAP methods could be used for authen-
For a shared secret based authentication, AAAH tication. For example, EAP-SIM for GSM, EAP-AKA
performs a challenge-response authentication and for UMTS, PEAP, LEAP, EAP-TLS, or other flavors
sends an ‘accept’ or a ‘reject’ message as an outcome of EAP might be used for WLANs. More round trips to
of the authentication. the AAAH will be required for setting up the session
keys causing a large delay for the MN to get access to
the network.
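A minimal sketch of the AAAF decision just described, assuming the NAI has the user@domain form; the function and domain names are illustrative and not part of any AAA implementation.

```python
def route_authentication(nai, local_domain, roaming_partners):
    """Parse the NAI, authenticate locally for home users, otherwise forward
    to the AAAH only if a roaming agreement exists."""
    user, _, domain = nai.partition("@")
    if domain == local_domain:
        return "authenticate locally"
    if domain in roaming_partners:
        return f"forward challenge/response for {user} to AAAH of {domain}"
    return "reject: no roaming agreement with home domain"

print(route_authentication("alice@home.example", "foreign.example", {"home.example"}))
```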

2. Related Work

2.1. Certificate Based Authentication

Certificate based authentication can be used to ver-


ify the claimed identities of two previously unknown
entities. The two parties do not need to exchange se-
crets in advance and no prior trust relationship is re-
quired, making it more suitable for roaming scenarios
Figure 1. Challenge response based inter-
in heterogeneous wireless networks. However, the use
domain authentication
of certificates for authentication requires a Public Key
Infrastructure (PKI) with a common root Certificate
As shown in Figure 1, this method consists of two Authority (CA). Previous proposals using certificate
round-trips to the home domain. Since the home do- based authentication in wireless networks [4,15] as-
main could be situated across the globe, it is desirable sume that both MN and AAAF carry with it the self-
to minimize the number of round trips to it. signed certificate of the root CA, which is used as a
An enhanced challenge response based inter- trust anchor to establish new trust between the MN and
domain authentication method similar to the work of AAAF.
Laurent-Maknavicius and Dupont [12], is shown in Certificates can also be used to establish trust rela-
Figure 2, where the challenge is generated locally at tionship between two wireless domains by cross-
the authenticator, and the {NAI || Challenge certification. In such a case, each domain has its own
|| Response} triplet is carried to the AAAH which root CA which issues certificates to all the mobile cli-
checks the response for that particular challenge and ents and other trustworthy domains with which it has

roaming relationships. Thus, if a domain has roaming list from the home CA to verify the authenticity of the
agreements with ‘m’ other domains, the AAA server of certificates.
the domain carries ‘m’ corresponding certificates. The In a different approach, as suggested in [4], the MN
certificate issued by a partnering domain is known as a could delegate the validation of the foreign domain’s
Roaming-Certificate. certificate to a trusted third party. The MN only needs
In a similar work by Long, Wu, & Irwin [13], pub- to be sure of the revocation status of the trusted third
lic-key certificates are used for localized authentication party.
without connecting to the home network of the MN as However, if there is no direct trust relationship be-
the certificate proves the identity of the MN. As shown tween the home domain and the foreign domain, both
in Figure 3, the AAAF stores certificates issued by the MN and the foreign domain require a chain of certifi-
CA of every domain it has roaming relationships with. cates with a common root CA as the trust anchor be-
A MN belonging to the home domain carries with it a tween them. The construction of this certificate path
certificate issued by its home domain’s CA. It uses this requires certificate retrieval from several CAs until a
certificate, H<<MN>> to authenticate at foreign do- trust anchor is reached, which causes long delays in the
mains that have trust relation with its home domain. authentication process.
For mutual authentication, AAAF also sends the
roaming-certificate H<<F>> to MN to prove its iden- 2.2. Media-Independent Pre-Authentication
tity, which the MN verifies using the public key of its (MPA)
home domain.
Figure 3. Certificate based authentication
(The diagram shows the home domain H with its AAAH server and CA, the foreign domain F with its AAAF server, the MN, and the certificates H<<MN>> and H<<F>>.)

MPA [6] is a pre-configuration and pre-authentication scheme that is executed by a MN to a target network before the actual handoff takes place. It can be used to enhance the performance of existing mobility protocols by proactively performing layer 3 and layer 4 associations and bindings before the actual handoff takes place, thereby saving time for these operations that usually only take place after the layer 2 association.
MPA is successful only when the MN can detect deteriorating signal strength and then have enough time to discover and select candidate networks to connect to, and initiate pre-authentication and pre-configuration procedures with the candidate network. However, in a highly mobile scenario, pre-authentication steps might be broken abruptly, which is undesirable.

2.3. Shadow Registration
There are a few difficulties associated with public In the Shadow Registration method [11], a security
key certificates regarding certificate validation. Both
association is established between a MN and every
MN and AAAF must validate each other’s certificates
neighboring wireless network’s AAA servers before
during mutual authentication. This involves verifying
the MN handoffs to one of the regions. This procedure
the CA’s signature on the certificates and checking operates like the shadow as one walks, thus the name –
their revocation status. If the MN’s and foreign do-
shadow registration. The registration will already be
main’s certificates are signed by the CA of MN’s home
completed when the MN moves to a particular cell, and
domain, the signature can be verified using the home
the only necessary AAA operations that are required
domain’s public key. However, the retrieval of the re- will be processed locally in the new domain without
vocation status requires communicating with the home
communicating with the MN’s home domain.
domain’s CA. Also, MN does not have network access
With a similar concept, Han et al. [8] have pro-
during the authentication phase to retrieve a revocation
posed Region-based Shadow Registration (RSR) which
list. Therefore, the AAAF and the MN can carry out tries to increase the efficiency of Shadow Registration
the authentication protocol without checking the revo- by performing a Shadow Registration only when the
cation list. Once MN is authenticated and gets network MN moves to a section with high probability of hand-
access, both MN and AAAF can retrieve the revocation off.

2.4. Optimistic Access cated in the previously connected wireless domain.
Thus, in cases where the newly connected domain has
Aura and Roe [3] have proposed the Optimistic trust relations with one of the MN’s previously con-
Access scheme of network access control to minimize nected domain but not with its home domain, the MN
authentication delay. Instead of executing a stronger may utilize proof tokens to authenticate itself. If the
higher-delay authentication mechanism during the newly connected domain has direct trust relations with
handoff process, the MN is granted optimistic access to its home domain, it can use the certificate issued to it
the new network. The strong authentication is delayed by its home domain’s CA.
until the handoff is actually completed. In the proposed method, accounting messages are
When the MN handoffs to a new network, a faster sent to the home domain only after the authentication
but weaker authentication takes place, and after it is process is complete. The MN thus gets a quicker net-
successful the MN is authorized for an optimistic ac- work access with local authentication, which provides
cess to the new network. When the layer 2 handoff better seamlessness than existing methods.
process is complete, the MN must be involved in a new The fields of a proof token resemble the fields of an
stronger authentication to continue using the resources X.509 certificate but the interpretations are modified as
of the new network. shown in Table 1. The proof token is signed by the
The weaker authentication mechanism does not re- issuer using its private key for integrity protection.
quire any communication with the home network of
the MN, thus making the optimistic access a fast au- Table 1. The Proof Token
thentication mechanism. However, security can be eas-
ily compromised with optimistic access, and it might Version Number
be suitable only for private networks where users are
Serial Number
more trustworthy. For less secure applications, opti- Uniquely assigned by the Issuer
mistic access is not recommended as it creates a win-
dow of opportunity for malicious users to try to exploit Signature Algorithm
vulnerabilities. Thus, authentication using optimistic Signature Algorithm ID used by the Issuer
access is a tradeoff between security and performance. Issuer
The Distinguished Name of the Issuing Domain
3. Proof Token Based Authentication
Not Valid Before
Date & Time of Authentication
A proof token binds a subject’s identity with a pub-
lic key as a digital certificate does. Additionally, a Not Valid After
proof token also proves the fact that the subject was Date & Time after which the token is not valid
successfully authenticated at the issuer’s domain at the Subject’s Name
time it was issued. The Distinguished Name of MN
The proof token method of authentication is an al-
Subject’s Public Key Algorithm
ternative to other authentication methods which in- The algorithm used by MN’s public key
volves the participation of the home domain. In this
method, whenever a MN successfully authenticates in Subject Public Key
a wireless domain using any of the authentication me- The public key of MN
thods, it requests and obtains a proof token from the Signature
AAA server in that domain, which proves the fact that Signature over the entire message signed by the issuer
the MN was successfully authenticated in that domain
at that time. Similarly, the MN carries with it proof The proposed EAP method for the Token Based au-
tokens issued by all the visited domains in a structure thentication, called the EAP-Token method is essen-
called the ‘token store’, which also contains the corre- tially based on the industry standard EAP-TLS method
sponding certificates of the issuing domains, and a list [2]. It differs from EAP-TLS in that instead of the MN
of distinguished names of roaming partners of the issu- presenting a fixed X.509 certificate issued by a root
ing domains. CA, it presents a proof token issued by a foreign do-
Once the MN moves to another wireless network in main it has recently visited and with which the current
a different domain, it has to re-authenticate with the domain also has roaming relations. Another dif-
new domain’s AAA server. To decrease the re- fering point is that the AAA server carries with it a
authentication delay, the MN may present a proof to- number of roaming-certificates instead of a single cer-
ken proving the fact that it was successfully authenti- tificate issued by a root CA. The roaming-certificates

are issued by other wireless domains with which it has the figure as it operates in the EAP pass-through
roaming relations. The roaming-certificate binds mode.
the AAA server’s identity with its public key. The Similar to EAP-TLS, the MN initially sends a Cli-
AAA server of each domain carries with it ‘m’ number entHello message identifying the protocol version,
of cross-certified certificates from its ‘m’ roaming cryptographic algorithms to use and a random number
partner domains. The MN will use this public key to to use as a nonce. The MN next sends a list of domain
authenticate the AAA server and to encrypt the pre- names which it has visited recently, and for which it
master secret. possesses a proof token in a DomainList message.
For authentication of the MN, we require a com- The DomainList message includes the X.500 Dis-
mon certificate authority as a trust anchor between the tinguished Name (DN) of its home domain and a
MN and the visited domain. A mechanism is thus re- sorted list of DNs of visited domains. The list is sorted
quired to find a common domain between all domains so that the AAA Server may use the roaming relation-
the MN has visited and obtained a proof token from, ship of the domain which is closest to MN’s home do-
and all the domains the currently visited domain has main in the trust path as shown in Figure 5.
roaming relationships with.
To find out which proof token to use, the MN sends
a list of all visited domain’s Distinguished Names in a
message called DomainList after sending the initial
ClientHello message. The AAA server chooses a
common domain between MN’s visited domain list
and its roaming partner domain list, and sends the cor-
responding roaming-certificate. The rest of the mes-
sage exchange is same as EAP-TLS. Figure 5. MN moves through visited domains

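The following sketch (ours, not a wire format) illustrates the token-store lookup and the common-domain selection described above; the ProofToken layout merely mirrors the fields of Table 1, and all names are assumptions.

```python
from typing import NamedTuple

class ProofToken(NamedTuple):
    issuer: str             # DN of the issuing (visited) domain
    subject: str            # DN of the MN
    subject_public_key: bytes
    not_valid_before: str   # date/time of the authentication at the issuer
    not_valid_after: str
    signature: bytes        # issuer's signature over the other fields

def select_common_domain(domain_list, roaming_partners):
    """Pick the first DN in the MN's DomainList (ordered toward the home
    domain) that is also a roaming partner of the current AAA server."""
    for dn in domain_list:
        if dn in roaming_partners:
            return dn
    return None

token_store = {"o=VisitedNet1": ProofToken("o=VisitedNet1", "cn=MN", b"pk",
                                           "2008-06-11T09:00", "2008-06-12T09:00", b"sig")}
common = select_common_domain(["o=HomeNet", "o=VisitedNet1"], {"o=VisitedNet1"})
print(common, token_store.get(common))
```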
The AAA server first responds with a ServerHello


message as in standard TLS. Then, it checks the list of
DNs serially as it appears in the DomainList message
beginning with the home domain of the MN, and se-
lects the closest trust path to authenticate the MN.
Once a common domain is found, it sends the corre-
sponding roaming-certificate matching the common
domain.
In the following ClientTokenRequest message, the
AAA server requests the MN for the token issued by
the common domain between them. The AAA server
then sends a ServerHelloDone message to tell the MN
that it has finished and is awaiting a response.
After receiving the ServerHelloDone message, the
MN validates the roaming-certificate presented by the
AAA server. The MN first extracts the issuer name of
the roaming-certificate and retrieves the corresponding
domain’s public certificate from its token store. The
MN can now validate the roaming-certificate using the
public key of the common domain.
After validating the roaming certificate, it responds
to the AAA server by sending the proof token issued
by the common domain in a ClientToken message.
The MN then sends the ClientKeyExchange mes-
Figure 4. EAP-Token Method sage which contains a randomly generated Pre-Master
Key (PMK) encrypted with the public key of the AAA
The exchange of EAP-Token messages between the server. Using the PMK, the ServerRandom and the
MN and the AAA server in a foreign domain is illus- ClientRandom number from the hello messages, both
trated in Figure 4. The authenticator is not shown in parties compute the Master Key locally using the same
pseudo random function as negotiated in the Server-

Hello and ClientHello messages. If the AAA server is 4. Comparison
able to decrypt the PMK and complete the protocol, the
MN is assured about the authenticity of the AAA serv- In the proposed token based approach for inter-
er. domain authentication, the EAP-TLS method has been
The remaining messages exchanged are similar to extended to use tokens instead of certificates and a
EAP-TLS. The MN uses its private key to sign a hash token selection and a roaming-certificate selection me-
of all the messages exchanged up to this point and chanism has been added as described in the previous
sends it in a CertificateVerify message. The AAA serv- section. Exploiting trust relationships between various
er can verify the signature using the public key of the domains in a heterogeneous wireless network with the
MN as specified in the token. This proves the authen- help of tokens and roaming-certificates, mutual authen-
ticity of the MN. tication is performed between the MN and the AAAF
After sending the CertificateVerify message, the server without contacting the home domain.
MN sends the ChangeCipherSpec message which noti- The proof token based mechanism can be compared
fies the AAA server that all the messages that follow with the simple certificate based authentication as pro-
the ClientFinished message will be encrypted using the posed by Long, et. al. [13], MPA [6], Shadow Regis-
keys and algorithms just negotiated. Following this tration [11], and Optimistic Access [3]. In Table 2, the
message, the ClientFinished message is sent to verify comparison is shown in terms of the use of public key
the success of the key exchange and the authentication vs. secret key, mutual authentication support, privacy
processes. support, non-repudiation support, and the inter-domain
The AAA server then sends the final response to trust required. Shadow Registration and MPA do not
the MN with the ChangeCipherSpec and the Server- specify whether public key or secret key cryp-
Finished message. The ChangeCipherSpec message tography should be used, and either one can be used.
notifies the MN that the AAA server will begin en-
crypting messages with the keys just negotiated, and the ServerFinished message again verifies the success of the key exchange and the authentication processes.
Thus, after completing the EAP-Token method, both the MN and the AAA server authenticate each other's identity and obtain master keys to derive further transient keys for data encryption and integrity protection. Since the EAP-Token method is based on EAP-TLS, with the only difference being the use of tokens and roaming-certificates instead of PKI certificates, our proposed protocol has all the security features of EAP-TLS.

Table 2. Protocol comparison: Security

Protocol            | Public vs. Secret Key | Mutual Authentication | Privacy | Non-Repudiation | Inter-Domain Trust
Shadow Registration | Not Defined           | No                    | No      | No              | Full
MPA                 | Not Defined           | Yes                   | No      | No              | Not required
Optimistic Access   | Secret Key            | Yes                   | Yes     | No              | Full
Long, et al.        | Public Key            | Yes                   | Yes     | Yes             | Full
Proof-Token         | Public Key            | Yes                   | Yes     | Yes             | Well-Connected
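To summarize the exchange, the sketch below lists the EAP-Token message sequence described in this section; it is only an outline for illustration, performs no real TLS processing, and the message names follow the text rather than a wire format.

```python
# Schematic EAP-Token flow between the MN and the AAA server (illustrative only).
EAP_TOKEN_FLOW = [
    ("MN",  "ClientHello",        "protocol version, cipher suites, ClientRandom"),
    ("MN",  "DomainList",         "home DN plus sorted DNs of visited domains"),
    ("AAA", "ServerHello",        "chosen parameters, ServerRandom"),
    ("AAA", "RoamingCertificate", "certificate issued by the common domain"),
    ("AAA", "ClientTokenRequest", "request the proof token of the common domain"),
    ("AAA", "ServerHelloDone",    "server finished its first flight"),
    ("MN",  "ClientToken",        "proof token issued by the common domain"),
    ("MN",  "ClientKeyExchange",  "PMK encrypted with the AAA server's public key"),
    ("MN",  "CertificateVerify",  "signature over the handshake with the MN's key"),
    ("MN",  "ChangeCipherSpec + ClientFinished", "switch to the negotiated keys"),
    ("AAA", "ChangeCipherSpec + ServerFinished", "confirm keys and authentication"),
]

for sender, message, purpose in EAP_TOKEN_FLOW:
    print(f"{sender:>3} -> {message:<40} {purpose}")
```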
The use of EAP-Token can be beneficial for mobile For MPA to work, the current domain is not re-
nodes in future heterogeneous wireless networks to quired to have a trust relationship with the future do-
achieve fast re-authentication when roaming from one main. Only the MN needs to have trust relation with
domain to another. With increasing number of wireless the domain it is trying to connect to.
networks globally, it is infeasible to have one-to-one For the proof token method, the various domains
trust between them. However, if wireless networks are are required to have a well connected trust relation-
well-connected, an international traveler who roams ships. The various domains are not required to have a
across political and network boundaries can first con- one-to-one trust, but, the degree of separation from one
nect to a major operator in a foreign country which has domain to the other should be minimal. The essence of
roaming agreements with his home network and then well-connected domains is that if a MN has proof to-
get a proof token to establish a trust path to connect to kens of a few domains that it has visited recently, it can
other wireless networks which have roaming agree- use the proof tokens to authenticate in most of the oth-
ments with that major network operator. Once the trav- er domains it wants to visit.
eler hops around a few wireless networks, he should be In Table 3, a comparison is made with respect to
able to connect to most of the wireless networks in that the handoff type: intra or inter-domain, proactive or
country. reactive, fast, smooth, or seamless and the number of
required roundtrips to the home domain.

Table 3. Protocol comparison: Mobility

Protocol            | Intra-/Inter-Domain Handoff | Proactive or Reactive | Fast, Smooth, Seamless | Roundtrips to Home Domain during Handover
Shadow Registration | Both                        | Proactive             | Seamless               | 1
MPA                 | Both                        | Proactive             | Seamless               | 1
Optimistic Access   | Both                        | Reactive              | Not Defined            | 0
Long, et al.        | Both                        | Reactive              | Seamless               | 0
Proof-Token         | Both                        | Reactive              | Seamless               | 0

6. References

[1] Aboba, B., & Beadles, M. (1999). The Network Access Identifier. IETF RFC 2486.
[2] Aboba, B., & Simon, D. (October 1999). PPP EAP TLS Authentication Protocol. IETF RFC 2716.
[3] Aura, T., & Roe, M. (2005). Reducing Delay in Wireless Networks. SECURECOMM.
[4] Bayarou, K. (2004). Towards Certificate-based


authentication for future mobile communications.
Wireless Personal Communications, 283-301.
Shadow Registration and MPA are both proactive
authentication methods in which the authentication is [5] Beller, M. J., Chang, L. F., & Yacobi, Y. (1993).
performed before layer 2 handoff. Whereas, in a reac- Privacy and Authentication on a Portable
Communication System. IEEE Journal Selected Areas in
tive handoff, authentication is performed after the MN Communications, 821-829.
moves and connects to the new network. Similarly, a
fast handoff primarily aims to minimize the handoff [6] Dutta, A. (July 2007). A Framework of Media-
latency, whereas a smooth handoff tries to minimize Independent Pre-Authentication (MPA) for Inter-
the packet loss during handoff. On the other hand, in a domain Handover Optimization. IETF draft, draft-ohba-
mobopts-mpa-framework-05.txt.
seamless handoff there is no noticeable change in the
quality of service that the user finds during the handoff. Networks. University of Illinois Urbana-Champaign.
From these two tables, we see that the proof token Networks. University of Illinois Urbana-Champaign.
based authentication mechanism performs better than [8] Han, S. B. (2005). Efficient Mobility Management for
other protocols as it supports mutual authentication, Multimedia Service in Wireless IP Networks. 4th
privacy, and non-repudiation, and it does not require Annual ACIS Int'l Conf. Computer and Information
roundtrips to the home-domain during authentication. Science (ICIS '05), 447-454.
[9] Hess, A., & Schafer, G. (2002). Performance
5. Conclusions Evaluation of AAA/Mobile IP Authentication. PGTS '02.
[10] Johnson, D., Perkins, C., & Arkko, J. (June 2004).
In this work, we first defined the problem of seam- Mobility Support in IPv6. IETF RFC 3775.
less mobility in heterogeneous wireless environments
to achieve ubiquitous connectivity. Authentication de- [11] Kwon, T., Gerla, M., & Das, S. (2002). Mobility
lay was identified as a major cause for high latency. Management for VoIP: Mobile IP vs. SIP. IEEE
With conventional authentication mechanism involving Wireless Communication Magazine, 66-75.
symmetric key cryptography, the home domain must [12] Laurent-Maknavicius, M., & Dupont, F. (April 2002).
participate in a number of round trip message ex- Inter-Domain Security for Mobile IPv6. 2nd European
changes. For the case of global mobility, the home Conference on Universal Multiservice Networks
domain might be across the globe behind high latency (ECUMN'02). Colmar, France.
communication links. To eliminate the service disrup- [13] Long, M., Wu, C. H., & Irwin, J. D. (October 2004).
tion, the use of public key cryptography without the Localised authentication for inter-network roaming
use of expensive PKI components was proposed. across wireless LANs. IEE Proceedings
In the proposed architecture, certificate-like proof Communications, 496-500.
tokens are used to complete an EAP-Token authentica- [14] Perkins, C. (August 2002). IP Mobility Support for
tion method. The EAP-Token method is defined by IPv4. IETF RFC 3344.
modifying some of the protocol details of EAP-TLS.
The changes and their purpose were highlighted and [15] Meyer, U., J. Cordasco, & S. Wetzel (2005). An
Approach to Enhance Inter-Provider Roaming Through
the EAP-Token protocol was also compared analyti-
Secret-Sharing and its Application to WLANs, in
cally with other related methods of authentication. WMASH'05. Cologne, Germany.


Structural Videotext Regions Completion with Temporal-Spatial Consistency

Tsung-Han Tsai Chih-Lun Fang


Dept. of Electrical Engineering,
National Central University, Taoyuan County 320, Taiwan
han@ee.ncu.edu.tw allen@dsp.ee.ncu.edu.tw

Abstract After recognizing the regions of text, the methods based on


some information can be used to remove the videotext and
With the rapid speed of spreading video content, some complete the original region. If there is an unexpected
videotext exists in the video content. Usually some videotext change in scene, this completed region is not accepted by
is advertisement and is not needed by some viewers. human vision. Thus smoothness in the spatial and temporal
Most video completion methods deal with the object domain is sensitive to human vision and how to achieve
removing or broken regions occurring when the object is temporal-spatial consistency is an important issue when
undesired or original video is broken. But few methods can completing the video. There are several existing methods to
handle the completion of the original region occupied by complete the original region occupied by videotext. J.B.
motion videotext well due to the complex background of Kim [1] employed the information of neighboring blocks of
motion videotext. Most existing algorithms take a lot of the text region and a Genetic algorithm to recover the original
computation time for video completion. However, the regions. Byung Tae Chun [2] proposed the recovery method
completed region should be generated as soon as possible based on the block matching algorithm and camera motion.
based on the real-time consideration. In this paper, we That means one uses spatial information and the other uses
propose a fast and regular structural videotext region the temporal information. The performance of these
completion method to recover the original structural region methods is not well enough in fast moving and structural
occupied by the scrolling videotexts. We utilize the content in the temporal domain. Besides, our previous
characteristic of the canny edge map of frames to form videotext removal approach [3] based on error-concealment
structural edge in the occupied region with temporal is suitable for video sequences without structural objects.
consistency. The spatial texture region completion is carried Furthermore, the purpose of video completion is to
out by our proposed video gradient interpolation approach. complete the region occupied by some objects. Thus if we
We have completed the whole system and the experimental treat the videotext region as the removal region, we can use
results show that all of the horizontal and vertical scrolling some video completion methods to reveal the original
videotexts can be completed well. region. Although inpainting techniques proposed in [4], [5],
[6] have the goal to reveal the original region, these methods
Keywords do not consider the temporal information and only apply in
Video completion, edge completion, spatial-temporal consistency, spatial domain. Except inpainting, there are a lot of
videotext removal researches about video completion to remove the videotext
regions. The first approach [7] is based on the spatial-
1. Introduction temporal block searching. They use a 3D region with the
size of 5 x 5 x 5 to obtain a most similar block searching the
Nowadays, there are a lot of video content obtained whole neighboring frames by some criterion. The criterion
from TV broadcasting, Internet and wireless network. Video is a vector containing red, green, blue and location
content, especially the news program contains some textual components. After eliminating the objects, the removal
information. Although these videotexts provide a lot of regions could be filled by the similar block. The second
information, some text is the advertisement or useless approach [8] is based on the 3D tensor voting method. This
information disturbing the audience to view the video. That approach applied 3D tense voting to smooth the motion
is to say it needs a completion method to remove the trajectory and found the closest intensity pixels. After
videotext and recover the original video content for some finding the closest pixel, morphing technique is applied to
audience. Besides, there are a lot of structural objects in fill the removal region. Although some well completed
video sequences. It needs an approach to deal with this case. regions are produced by these two methods, the drawback of
The videotext completion has two main steps: videotext this approach is that they take a lot of time to complete this
regions localization and videotext region filling. The work. The third approach is based on the object-background
videotext regions localizing step is to localize the region of inpainting method [9]. This approach completes video
videotext and form a region containing the whole videotext.

region in two different ways. The video sequences are firstly consistency between completed frames. Thus, the videotext
separated into object and background. The object is region could be removed well with high consistency. The
recovered by the pixels of neighboring blocks with a well completed video sequences will be accepted by human
confidence value. And the background is recovered by vision.
texture synthesis. These approaches are robust for
Traditionally, it needs a lot of computation to achieve
completing small broken regions of video but the size of
spatial-temporal consistency in frames. Besides, the
videotext region is larger in the frame.
complexity of video completion is proportioned to the
There are a lot of researches about video completion,
number of reference frames. The complexity and
but few of them addressed on the consideration of structural
computation are reduced by only considering the edge
regions and the consistency of structure in temporal-spatial
information. From the analysis above, the consistency and
domain when recovering the original region. Consequently
reduction of complexity could be beneficial from edge
it needs a more powerful video completion method for the
information in a word.
purpose of videotext recovery. Concerning the speed of
recovering, consistency and structural regions in video
content, we apply a video completion method based on the
edge detection and gradient interpolation in temporal-spatial 3. The Proposed Structural Videotext Region
domain for recovering. Because the scrolling videotexts are Completion
at different positions, our automatic recovery method is
based on the user’s selection. That means users can erase the 3.1 Overview of the Proposed Algorithm
videotext at will. This paper is organized as follows. In In the modern television programs, there are more and
Section 2, the definition of video region completion is more scrolling videotexts which are the advertisement
described. In Section 3, the proposed algorithm is discussed. captions or the redundant information for audiences. If the
In Section 4, experiment evaluations are performed to show scrolling videotexts can be automatically erased, the
the efficiency and effectiveness of the approach. Some audiences are not be disturbed by the videotexts. Thus the
discussions are presented in Section 5. Finally, Section 6 completing process is exploited and we found that recovery
concludes the paper. of edge dominates the consistency of completion as
discussed in Section 2. For this purpose, we design a
2. Videotext Region Completion as a Spatial- restoration of regions method shown in Fig. 1. In the first
Temporal consistency step, we use the previous proposed videotext detection
method [10] to initialize and localize the videotext region.
As for videotext region completion, the most important In the second step, a canny edge detector is employed for
concept is that the completed region should have edge detection. After edge detection, the edge and the
consistency with other region no matter in space or in time. remaining regions are processed by different methods. The
The definition of consistency is that the objects in frames pair of edge is obtained by comparing the closest edges in
have to move smoothly and the change of scene has to be the next step. After obtaining the closest edges, the edge
smooth. Thus, the videotext region completion could be could be connected by temporal interpolation. Then the
treated as the problem of spatial-temporal consistency. The remaining spatial region is filled by our proposed gradient
existing video completion methods apply some approaches priority interpolation firstly and gradient completion later.
to achieve the consistency. In [7], they apply block After each pixel on the base frame is filled, the original
matching algorithm to achieve this issue. It may trap in local videotext region is recovered. The next frame is processed
optimum sometimes. In [8], they apply 3D tense voting to following the iteration described above.
achieve consistency. In our view, edges play an important
role in computer vision especially in structural regions. If
edges could be completed well, the remaining region could
be completed by some low-complexity algorithm. That
means edge is a critical factor for temporal consistency, and
the remaining region is critical for spatial consistency. Thus
the consistency of frames could be expressed as

    C = E * S                                                (1)

Fig. 1. Proposed videotext region restoration flowchart.

where C is the level of consistency, E is the edge and S
is the spatial region. The problem solving of spatial-
temporal consistency could be done by this equation. If the 3.2 Recognition and Completion
edges between frames could be completed consistency and
the completed spatial region has high relation with the Since our goal is to detect the scrolling text string in
neighboring regions, it would have a high level of the video frames and recover the original regions, we use

the previous proposed method to detect the possible text scale changes, the angle is the same. The search range is 40
region. This method is based on the contrast of the text and x 40 and the size of block is 5 x 5.
uses horizontal and vertical projection to localize the
possible text regions. After recognition of the videotext, the
regions have to be completed in the next step. Considering
to the edges in structural videos, we complete them in
advance to achieve the consistency. There are several edge
detection methods producing different types of edge, such as
sobel edge detector, canny edge detector and Laplacian of
Gaussian Edge Detection. When connecting the edge in
frames, only the thin edges with one pixel could be Fig. 3. The lost structural region in video.
connected well. In our evaluation, canny edge detector is
easy to achieve consistency due to its thin characteristic.
After obtaining the most similar pixel in reference
The comparison of Canny edge detector and Sobel edge
frame, we can illustrate the spline by temporal interpolation.
detector is shown in Fig. 2. In Fig. 2, the edge obtained from
The concept of generating the lost edge is shown in Fig. 4.
Canny edge detector represents the actual detail edges. The
In Fig. 4, the edge in the center frame is broken by the
edges obtained from Sobel edge detector is double hardly to
videotext represented as blank regions. The lost edge is
connect. Thus, the canny edge detector is applied in our
connected by this edge-pair comparison and completion
proposed video completion method.
algorithm with the two edges map in the reference frames.

(a) Canny edge detector (b) Sobel edge detector


Fig. 2. The comparison of Canny edge and Sobel edge.
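As a sketch of the edge-detection step only, assuming OpenCV is available, the snippet below extracts a thin Canny edge map from the detected text region; the thresholds and region coordinates are placeholders, not values from the paper.

```python
import cv2

def edge_map(frame_bgr, text_region, low=50, high=150):
    """Return a thin (one-pixel-wide) Canny edge map of the detected text region."""
    x, y, w, h = text_region
    gray = cv2.cvtColor(frame_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    return cv2.Canny(gray, low, high)

# edges = edge_map(frame, (100, 400, 520, 60))   # usage on one decoded video frame
```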

After obtaining the edge, the lost edge shown in Fig. 3 Fig. 4. The concept of generating lost structural region in video.
After obtaining the most similar pixel in the reference frame, we can recover the spline by temporal interpolation. The concept of generating the lost edge is shown in Fig. 4: the edge in the center frame is broken by the videotext, represented as blank regions. The lost edge is connected by this edge-pair comparison and completion algorithm using the two edge maps in the reference frames.

Fig. 4. The concept of generating the lost structural region in video.

3.3 Region based gradient interpolation

After obtaining the edges, the remaining spatial region needs to be recovered. Our algorithm uses the neighboring information to fill the lost region. The reasons for not using information from other frames are as follows. First, every frame differs slightly, so copying information from another frame directly is not visually acceptable. Second, the pixels in the lost region should be consistent with the neighboring region in the same frame. Third, filling the remaining regions from other frames requires more computation and memory access than filling from the same frame. Based on these considerations, the neighboring pixels in the same frame are used to fill the region around the completed edges.

The remaining region in Fig. 3, excluding the edges, is a texture region without structural information. Although several existing methods have been proposed to fill lost texture regions, their complexity is very high. In our view, the gradient information plays an important role in textural regions and must be kept consistent with the neighboring region. Besides the gradient, the order in which blocks are filled is also critical for successful inpainting techniques. Some researchers determine this order by the ratio of known pixels, but in our experience the gradient in the neighborhood of the unknown pixels is also important to this order, especially in texture regions. The order of completion can be expressed as

Pr(Psi) = [G(rho), G(rho')],  (3)

where rho and rho' are neighboring pixels and G is the gradient. The gradient is normalized into 8 modes based on the characteristics of texture regions: if successive neighboring points have the same gradient mode, those points receive high priority and are completed first. With this rule, the gradient of the pixels in the lost region resembles that of the known pixels.
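As a rough illustration of Eq. (3), the sketch below (our reading, not the authors' code) quantizes the gradient direction of the known pixels around the fill front into 8 modes and gives higher priority to front pixels whose known neighbors agree on a single mode; the neighborhood size and the tie-breaking are assumptions.

```python
import numpy as np

def gradient_modes(gx, gy, n_modes=8):
    """Quantize gradient direction into n_modes bins (0..n_modes-1)."""
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    return np.floor(ang / (2 * np.pi / n_modes)).astype(int)

def fill_priority(mask, gx, gy, n_modes=8):
    """Priority of each unknown pixel on the fill front.

    mask is True for known pixels; a front pixel whose known neighbours share
    one gradient mode gets priority equal to the size of the agreeing set.
    """
    modes = gradient_modes(gx, gy, n_modes)
    h, w = mask.shape
    pri = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if mask[y, x]:
                continue  # already known
            nb = [(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dy, dx) != (0, 0) and mask[y + dy, x + dx]]
            if not nb:
                continue  # interior of the hole; handled after the front shrinks
            counts = np.bincount([modes[p] for p in nb], minlength=n_modes)
            # neighbours sharing the same gradient mode -> higher priority
            pri[y, x] = counts.max()
    return pri
```

Pixels would then be filled in descending priority order, recomputing the front as the hole shrinks.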
After deciding the order of videotext completion, the pixels of the lost regions are filled with gradient consistency. The gradient consistency is inspired by bi-cubic interpolation, in which outside pixels are completed from inside pixels for zooming applications. In bi-cubic interpolation, the outside 16 points are obtained from inside points. Based on this observation, the regions occupied by videotext can be completed if the occupied regions are treated as regions after zooming. However, applying as many points as bi-cubic interpolation does may lose a lot of information in video completion. Thus, 4 inside points are used to complete 12 outside points in our method. The neighboring pixels are filled in 8 directions, as shown in Fig. 5. The red pixel is the interest point, whose gradient direction determines which 4 points are used, and the yellow pixels are the known pixels used for patching. The outside points are completed along the gradient direction of the interest point and obtained from the known pixels. For example, if the gradient direction of the interest point is upper-right, the upper-right region containing unknown pixels is completed from the region centered on the upper-right yellow pixel.

Fig. 5. The direction of gradient filling. The light gray region is the unknown region and the remaining region is the known region.
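The directional filling step can be sketched as follows; this is only our interpretation of Fig. 5, and the simple averaging used to spread the 4 known points over the 12 unknown border pixels is an assumption, since the exact interpolation weights are not given.

```python
import numpy as np

# 8 quantized gradient directions (mode index -> unit step), e.g. mode 3 = upper-right
DIRS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def fill_from_direction(img, mask, y, x, mode):
    """Fill the unknown pixels of the 4x4 patch anchored at interest point (y, x).

    The 2x2 block of known pixels lies one step along DIRS[mode]; its 4 values
    are spread over the 12 unknown border pixels of the patch (bounds checks
    omitted for brevity).
    """
    dy, dx = DIRS[mode]
    ky, kx = y + dy, x + dx                  # corner of the known 2x2 block
    fill_value = img[ky:ky + 2, kx:kx + 2].mean()   # spread of the 4 inside points
    for py in range(y - 1, y + 3):
        for px in range(x - 1, x + 3):
            if not mask[py, px]:             # unknown pixel of the 4x4 patch
                img[py, px] = fill_value
                mask[py, px] = True
    return img
```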
The lost region can thus be completed with gradient consistency, and the text regions of the whole sequence can be recovered after being processed by our algorithm. The experimental results are shown in Section 4.

4. Experimental Results

The proposed algorithm is implemented on a PC platform, and the development environment is Microsoft Visual Studio C++. All test video sequences are 720 x 480 at 30 fps in the YUV color space. The video sequences are captured from the cable broadcast system; we acquired sequences from nine different TV programs. To recover the original regions occupied by scrolling videotext in a video sequence, we perform the detection and recovery process at two frames per second to ensure that the videotext overlaps between processed frames. All operations are performed in the YUV color space. The outputs are recovered sequences without videotext.

General image processing is usually evaluated with the PSNR, but that measure is not appropriate here because there is no true original video content to compare against. Thus, several simulation results and comparisons are shown below to evaluate the completion performance. From our simulated video sequences, we choose three kinds of video content to evaluate our algorithm: news programs, drama, and sports video. The detailed video database is listed in Table 1. The result of completed videotext regions is shown in Fig. 6: the top row shows the original successive frames, whose structure moves across these three frames; the middle row shows the frames in which text is detected and removed; and the bottom row shows the completed videotext region. The regions have spatial and temporal consistency and have been reconstructed by our proposed approach.

For the structural regions, some completions are shown in Fig. 7. The performance of edge completion is shown in Fig. 7(b): the top row is the edge of clothes and the bottom row is the baseball field. The result of region completion after applying our approach is shown in Fig. 7(c). Fig. 8 shows the result of textural region completion, where some regions without edges are simulated to obtain the completed result. Based on the considerations described in Sections 2 and 3, the results show that the edges can be connected well and the textural regions can also be completed well. Fig. 9 shows the completion of a whole frame; our algorithm applies not only to videotext removal but also to logo removal.

To further demonstrate the performance of our proposed algorithm, we also compare it with the method proposed in [3], which used stroke characteristics for videotext removal. The comparison is shown in Fig. 10. The structure is reconstructed better than in [3], which means our approach has better performance, especially in structural regions. Because our proposed algorithm is based on edge detection and the gradient method, it reaches better performance than other video completion algorithms.

Table 1. Database of the test sequences.
Video type         Length  Number of scrolling videotexts  Frames with scrolling videotext
Daai (Drama)       1237    1                               945
SET-TV (Drama)     1008    1                               802
TVBS-G (News)      281     1                               281
CtiTV (News)       1015    2                               922
SET-NEWS           1358    1                               933
TVBS (News)        320     2                               302
ETTVS (News)       394     1                               302
USTV (News)        610     1                               610
Videoland (Sport)  204     1                               196

Fig. 6. The result of the completed videotext region.

Fig. 7. Structural region completion: (a) removal region; (b) edge completion; (c) region completion.

Fig. 8. Texture region completion: (a), (b), (c).

Fig. 9. The result of whole frame completion: (a) original frame; (b) completed frame.

Fig. 10. The comparison with other methods: (a) result in [3]; (b) our result.
5. Discussion

Based on our experimental results, the recovery rate is determined by the resolution of the videotext and the extraction performance. For vertical scrolling videotext, the background is often more complex than for horizontal videotext; moreover, the font size of horizontal videotext is often smaller than that of vertical videotext. The proposed gradient-based completion algorithm can enhance the restoration performance for regions of vertical videotext.
In the experimental results, discrete videotext regions can be recovered more accurately because the ratio of neighboring (known) regions to the lost region is larger than for a whole region. A whole videotext region, on the other hand, has only small neighboring regions, and its performance is worse than that of the discrete regions.

A larger videotext region in the video often decreases the completion performance. After completing the texture region, it sometimes lacks consistency with the neighboring regions, as shown in Fig. 8(b), because there is no edge at the boundary of the hair. If the edge detector were improved to detect more accurate edges, the situation in Fig. 8(b) could also be completed well.

For the structural region, the structure can be completed from the reference frame. If there is no similar edge in the previous or next frame, more reference frames are compared; the maximum number of reference frames is 10. Besides, videos sometimes contain a static logo. The removal of a static logo with a static background is more complex because there is no temporal information to maintain consistency for completion. Although our experimental results only consider scrolling videotext, static videotext lying on the boundary of the video frame is also recovered by our proposed algorithm and can be removed well, as shown in Fig. 9.
6. Conclusions

In this paper, we propose an algorithm for motion videotext region completion with temporal and spatial consistency in digital video. It utilizes our previously proposed videotext recognition method to localize the text region. The edge information is used to decide which completion method is employed, owing to its importance for temporal consistency. We propose an edge-pair comparison method for connecting edges: the interest pixels are tracked to find the corresponding edges in the reference frames, and the lost structural region is then completed by temporal interpolation. The remaining textural region is filled from the neighboring pixels based on the direction of the gradient and the order of confidence. We choose versatile video contents to recover the original images occupied by the text region. From the simulation results, the proposed completion algorithm achieves higher performance than an existing video completion algorithm. Whether the videotext moves vertically or horizontally, and whether it covers structural or texture regions, our results show good performance in the subjective evaluation.

References
[1] J. B. Kim, H. J. Kim, and S. Wachenfeld, "Restoration of regions occluded by a caption in TV scene," in Proc. IEEE TENCON Conf., 2003.
[2] B. T. Chun, Y. Bae, and T.-Y. Kim, "A method for original image recovery for caption areas in video," in Proc. IEEE Conf. on Systems, Man, and Cybernetics, 1999.
[3] T. H. Tsai, C. L. Fang, and H. Y. Lin, "Progressive videotext regions inpainting based on edge detection and statistic method," in Proc. Int'l MultiConference of Engineers and Computer Scientists, 2006.
[4] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, "Image inpainting," in Proc. ACM SIGGRAPH, New Orleans, LA, July 2000.
[5] A. Criminisi, P. Pérez, and K. Toyama, "Region filling and object removal by exemplar-based image inpainting," IEEE Trans. Image Processing, vol. 13, no. 9, pp. 1200-1212, Sep. 2004.
[6] M. M. Oliveira, B. Bowen, R. McKenna, and Y.-S. Chang, "Fast digital image inpainting," in Proc. Conf. on Visualization, Imaging and Image Processing (VIIP 2001), 2001, pp. 261-266.
[7] Y. Wexler, E. Shechtman, and M. Irani, "Space-time video completion," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 2004, pp. 120-127.
[8] J. Jia, Y.-W. Tai, T.-P. Wu, and C.-K. Tang, "Video repairing under variable illumination using cyclic motions," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 5, pp. 832-839, May 2006.
[9] K. A. Patwardhan, G. Sapiro, and M. Bertalmio, "Video inpainting under constrained camera motion," IEEE Trans. Image Processing, vol. 16, no. 2, pp. 545-553, Feb. 2007.
[10] T. H. Tsai, Y. C. Chen, and C. L. Fang, "A comprehensive motion videotext detection, localization and extraction method," in Proc. Int'l Conf. on Communications, Circuits and Systems, 2006.

Mobile Intelligence for Delay Tolerant Logistics and Supply Chain Management

Tianle Zhang 1, Zongwei Luo 2, Edward C. Wong 2, CJ Tan 2, Feng Zhou 1
1 Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, School of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
tlezhang@bupt.edu.cn
2 E-Business Technology Institute, the University of Hong Kong, Pokfulam, Hong Kong
zwluo@eti.hkuhk

Abstract

In this paper, we present Mobile Intelligence (MI) to develop a delay tolerant RFID network for enabling delay tolerant logistics and supply chain applications. It provides intelligent and ubiquitous information access, relay, search and delivery over Mobile Relay Networks (MRN) - a critical and ubiquitous infrastructure with convergence of RFID, wireless and sensor networks, and IP networks. It leverages mobile nodes to "bridge the network gap" towards enabling pervasive delay tolerant logistics and supply chain management.

Keywords: Mobile Intelligence, Mobile Relay, Delay Tolerance, Logistics and Supply Chain Management

1. Introduction

Advanced technologies are in great demand to modernize logistics and supply chain management for better supply chain visibility, planning, and operations. Radio Frequency Identification (RFID) offers unique wireless object identification and track-and-trace capabilities, enabling real-time supply chain information visibility. The benefits of RFID would be more prominent if it could penetrate all possible logistics and supply chains. Meanwhile, the pursuit of low cost has limited RFID's working zone to a range of meters. These working zones can be integrated via interconnected RFID networks connected to an Internet-type infrastructure for global logistics and supply chain management.

In logistics and supply chain management there are typically three flows: the physical goods flow, the cash flow, and the information flow. Since information flows faster than physical goods, organizations can exploit this advance information visibility for better planning and operations. The inherent delay tolerance in logistics and supply chain operations makes a delay tolerant RFID network a good technology enabler.

In the remainder of this paper, we overview the motivation of this work in Section 2, describe potential applications of mobile relay in logistics and supply chain management in Section 3, and discuss mobile intelligence in Section 4. Section 5 provides the performance evaluation, and Section 6 concludes the paper.

2. Motivation

RFID is a means of contact-less item identification without line-of-sight, through electromagnetic transmission from a tag reader to an RF compatible integrated circuit (tag). RFID is capable of sensing the existence and location of network node neighbors in the RFID network. It can pinpoint a neighbor so that most modules can be turned off, eliminating idle listening and querying. RFID can also unicast (broadcast) to one (all) of a node's neighbors within the detected range, and it can read a specific subset of nodes. Many of these capabilities have been reported, especially in the context of applying RFID in Wireless Sensor Networks (WSN). For example, STEM [1] uses a second paging channel to wake up nodes, while [2-4] apply RFID to activate the sleeping neighbors in the WSN. This is partly due to the fact that embedded RFID transceivers can operate at an energy consumption about three orders of magnitude lower than typical commercial radios in WSN.

However, reliable communication in delay tolerant networks [5-8], which are subject to unpredictable connectivity due to limited energy sources, has long been recognized as a difficult task. This is also true for delay tolerant RFID networks: there is always a trade-off between reliability and energy consumption. Most work on delay tolerant networks assumes the constant availability of connectivity with no sleep latency, which may not hold in real-world sensor networks. [9] sets up a two-state Markov process model and computes the probability of a node being available for at least k slots in N consecutive slots to evaluate the responsiveness of the network. In this paper, we discuss mobile intelligence for the MRN, connecting RFID, sensors, wireless networks and IP networks to develop delay tolerant RFID networks.

3. Potential applications

In logistics and supply chain management, a relay-based delay tolerant RFID network could be a good candidate for several logistics and supply chain operations.

3.1 Object sensing in large scale dense stacking environments

The working range of RFID is limited, and RFID reading is often blocked by barriers between the tagged objects and the RFID reader. For example, in a container depot, many containers are stacked while waiting to be packed for shipping. A container may hold a variety of content that can block an RFID read if it sits between a target container and the RFID reader. This problem could be solved if a container could relay the read to the target and relay the read result back to the RFID reader.

3.2 Localization for 3D visualization

In cargo packing processes, a 3D visualization of the current packing status and the optimized 3D packing could provide a straightforward tool for directing the packing process. Packing 3D visualization needs to know the current location of the shipping items (e.g., containers). If containers can relay information via delay tolerant RFID networks, the relative container locations can be identified during the information relay.

3.3 Relay for localization processing

When containers are shipped on the open ocean, the only communication channels for reporting monitoring status are satellite or other long-range communication tools. Some containers are equipped with Global Positioning Systems (GPS), so their positions can be monitored, and companies could request monitoring of each such container. Instead of computing the locations one by one for each GPS fix, the delay tolerant RFID network could relay and locally compute all the locations by building up the relay network and then relying on a single GPS position to yield all the required results.

3.4 Mobile relay for remote monitoring

Similar to container monitoring on the open ocean, remote monitoring of objects such as bridges is another potential application of delay tolerant information relay. For example, the sudden collapse of a city highway bridge in Minneapolis, Minnesota has directed considerable attention towards bridge health monitoring. Current data acquisition practice often relies either on on-site result examination or on communication tools such as satellite or telephone lines to transmit the results back. With GPRS or 3G technology, wireless transmission is also considered; however, if a large bulk of data has to be transmitted via GPRS or 3G, the cost tends to be quite high, and the need to change the current monitoring practice to support such data transfers has to be studied. Thus, one possible solution is to leverage a service truck passing over the bridge: the results are transmitted to the truck, which then relays them back.

4. Mobile intelligence

Mobile intelligence (MI) is developed for constructing delay tolerant RFID networks via information relay. It further provides intelligent information services through intelligent and ubiquitous information access, relay, search and delivery. Relying on timely information acquisition, relay and delivery over Mobile Relay Networks (MRN) - a critical and ubiquitous infrastructure with convergence of RFID, WSN, and Wi-Fi - MI provides seamless, dynamically created communication and interconnections. It leverages mobile nodes [10-12] to "bridge the gap" by serving as relays for the delivery of mobile queries with or without direct network access. The mobile relays of the MRN work in an energy-aware manner: they execute sleep-wakeup scheduling with the aid of low-power RFID wakeup means.

MI can also make use of the queuing information in supply chain and logistics systems to fetch data sources through an RFID enabled sensor network. The feasibility of this scheme relies on how well the MRN network protocol performs and how worthwhile the increased sophistication of the techniques is. To evaluate the responsiveness and performance of the protocol under unreliable and intermittent connectivity, the availability should be computed to estimate the QoS of the network.

4.1. MI system architecture
MI has three main layers: the Mobile information access layer, the Query Information Service (QIS) layer, and the Mobile relay layer. The Mobile information access layer provides multiple service access points for subscribers of other applications. The QIS layer consists of information processing servers, a Queuing query server, and an AAA (Authentication, Authorization, and Accounting) server. The Mobile relay layer consists of RFID networks, devices in home networks, sensor networks, Wi-Fi, the Internet, etc. The MRN provides message relay and seamless handover among heterogeneous networks. The architecture of the system is shown in Fig. 1.

Fig.1 Layered architecture

4.2 Service work flow

For the QIS service, the work flow diagram is shown in Fig. 2. The RFID_proxy server forwards requests to the corresponding server, for example the RFID database server, after authentication by the AAA security server. The RFID database server maintains the queuing status of the service site via RFID middleware. The dynamic situation is monitored by the RFID network deployed at the service site. When the response is echoed back, the users can be informed of the queue number or the recommended scheduling.

Fig.2 Service workflow

4.3 MRN routing protocols

In the current coverage of typical network infrastructure there are a great number of connection partitions, depending on the network connection status. In the MRN, mobile relay nodes fill the gaps between these partitions. Cluster-sequence routing, as shown in Fig. 3, is adopted [13].

Fig.3 MRN protocols

5. Performance evaluation

The feasibility of our work relies on how well the MRN performs and how worthwhile the sophistication of our techniques is. To evaluate the responsiveness and performance of the MRN, we model the mobility of the relay nodes in the MRN as transitions in a Markov chain. The durations of the ON phase (wake/appear) and the OFF phase (sleep/disappear) follow exponential distributions with parameters lambda and mu, respectively. As a result of the Markov property, the duration of the OFF state is exponentially distributed with mean E[T_OFF] = 1/mu, and the duration of the ON state is exponentially distributed with mean E[T_ON] = 1/lambda.
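For intuition, the following short Python sketch (ours, not part of the paper) simulates the exponential ON/OFF process of a relay node and checks that the empirical fraction of ON time approaches (1/lambda) / (1/mu + 1/lambda); the rate values are arbitrary.

```python
import random

def simulate_on_off(lam, mu, horizon=1e5, seed=0):
    """Alternate exponentially distributed ON (rate lam) and OFF (rate mu) phases."""
    rng = random.Random(seed)
    t, on_time, state_on = 0.0, 0.0, True
    while t < horizon:
        dur = rng.expovariate(lam if state_on else mu)   # E[ON] = 1/lam, E[OFF] = 1/mu
        dur = min(dur, horizon - t)
        if state_on:
            on_time += dur
        t += dur
        state_on = not state_on
    return on_time / horizon

lam, mu = 2.0, 3.0                                  # arbitrary example rates
empirical = simulate_on_off(lam, mu)
theoretical = (1 / lam) / (1 / mu + 1 / lam)        # duty cycle of the Markov model
print(f"empirical ON fraction {empirical:.3f} vs duty cycle {theoretical:.3f}")
```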

According to the process of the MRN protocol, we obtain the packet delivery rate of the MRN as derived below. Fig. 4 shows the enhanced packet delivery ratio versus the failure ratio in comparison with other schemes such as typical full-IP routing; the theoretical result affirms the feasibility of providing available communication using mobile relay.

Fig.4 Packet delivery ratio vs. failure ratio (calculated and simulated curves: Full_IP_Calc, Full_IP_Simu, QIS_MRN_Calc, QIS_MRN_Sim)

An inhibition mobility model is developed in which a node can enter or leave a certain unit section (or partition) randomly, and the entry and leave events are independent of each other. Let the events follow Poisson processes with parameters mu_m and lambda_m, let the surface area of the roaming field be S, and let the surface area of a unit section be S_h. The total number of divided sections is W = [S / S_h] and the number of nodes is M; all unit sections are assumed to have the same surface area. The event of interest can be seen as the node holding the packet entering a unit section and waiting for another node to arrive, i.e., waiting for other nodes to enter the section of the node holding the packet. The entry and leave events then follow Poisson processes with parameters mu_m' = mu_m / (W * C_n^2) and lambda_m' = lambda_m / (W * C_n^2).

The probability of successful N-hop packet forwarding within the delay T is then

P_Succ = Sum_{i=0}^{n} [ C(N, i) * rho^i / (1 + rho)^n ] * [ 1 - Sum_{k=0}^{m} ((T*mu_m)^k / k!) * e^(-T*mu_m) ]^(N-i-1),

where rho = mu_m / lambda_m.
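As a sanity check on our reading of the reconstructed expression above, the sketch below evaluates P_Succ numerically; the parameter values (N, n, m, T, mu_m, lambda_m) are arbitrary and the grouping of terms follows our reconstruction, so treat it as illustrative only.

```python
import math

def p_succ(N, n, m, T, mu_m, lam_m):
    """Evaluate the reconstructed P_Succ expression for the MRN."""
    rho = mu_m / lam_m
    # probability of more than m arrivals within T (complement of the Poisson sum)
    p_miss = 1.0 - sum((T * mu_m) ** k / math.factorial(k) * math.exp(-T * mu_m)
                       for k in range(m + 1))
    total = 0.0
    for i in range(n + 1):
        total += (math.comb(N, i) * rho ** i / (1 + rho) ** n) * p_miss ** (N - i - 1)
    return total

print(p_succ(N=8, n=5, m=3, T=2.0, mu_m=1.5, lam_m=1.0))
```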
When the packet arrives at the k-th hop, the route node computes the tentative urgency of the forwarding to adjust the activity of the succeeding nodes. Suppose the packet arrives at the k-th hop with n - k hops to go and consumed time T_k. The packet delivery ratio should be no less than P_e (e.g., 80%), and the duty cycle is designated beforehand (e.g., 50%, 40%, 30%). The duty cycle is defined as

Duty cycle = (1/lambda) / (1/mu + 1/lambda).

The result of this computation is shown in Fig. 5. The delta delay is defined as the error of the consumed delay relative to the average expected delay of the k-hop forwarding:

delta delay = (Sum_{i=1}^{k} t_i - kT/N) / (kT/N),

where t_i is the delay per hop, T is the end-to-end tolerable delay, and N is the number of hops.

Fig.5 Self-adaptive sleep scheduling: self-adaptation of the sleep ratio at the k-th hop (sleep ratio vs. delta delay) for duty cycles of 50%, 40%, and 30%
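A small helper expressing these two quantities in code may make the adaptation rule easier to follow; it is a sketch under our reading of the definitions, and the example per-hop delays are made up.

```python
def duty_cycle(lam, mu):
    """Fraction of time a relay node is awake: (1/lam) / (1/mu + 1/lam)."""
    return (1 / lam) / (1 / mu + 1 / lam)

def delta_delay(per_hop_delays, T, N):
    """Relative error of the consumed delay w.r.t. the expected delay of k hops."""
    k = len(per_hop_delays)
    expected = k * T / N                     # average expected delay for k hops
    return (sum(per_hop_delays) - expected) / expected

# example: 3 hops consumed out of N = 8, end-to-end tolerant delay T = 4.0
print(duty_cycle(lam=2.0, mu=3.0))           # 0.6
print(delta_delay([0.4, 0.7, 0.6], T=4.0, N=8))
```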
6. Summary and conclusion

In this paper, we present Mobile Intelligence (MI) to develop delay tolerant RFID networks for enabling several logistics and supply chain operations. MI provides intelligent and ubiquitous information access, relay, search and delivery over Mobile Relay Networks (MRN) - a critical and ubiquitous infrastructure with convergence of RFID, wireless and sensor networks, and IP networks. It leverages mobile nodes to "bridge the network gap" towards enabling delay tolerant logistics and supply chain management. The performance of the MRN for information relay efficiency via mobile nodes is studied, and the results validate the feasibility of our MI and MRN approach. Future work in this MI research includes mobility pattern and impact studies.

References
[1] C. Schurgers, V. Tsiatsis, S. Ganeriwal, and M. Srivastava, Topology Management for Sensor Networks: Exploiting Latency and Density, in Proc. ACM MobiHoc Conf., June 2002.
[2] M. Nosovic and T. Todd, Low power rendezvous and RFID wakeup for embedded wireless networks, in Annual IEEE Computer Communications Workshop, 2000.
[3] P. Skraba, H. Aghajan, and A. Bahai, RFID Wake-up in Event Driven Sensor Networks, Technical report, U.C. Berkeley, 2001.
[4] A. G. Ruzzelli, R. Jurdak, and G. M. P. O'Hare, On the RFID Wake-up Impulse for Multi-hop Sensor Networks, in Proc. of the 1st ACM Workshop on Convergence of RFID and Wireless Sensor Networks and their Applications, at the Fifth ACM Conference on Embedded Networked Sensor Systems, Sydney, Australia, November 2007.
[5] V. Cerf et al., Delay-Tolerant Networking Architecture, http://www.ietf.org/rfc/rfc4838.txt, Draft 1.0, Oct. 2003.
[6] W. Wang, V. Srinivasan, and M. Motani, Adaptive Contact Probing Mechanisms for Delay Tolerant Applications, in Proc. ACM MobiCom, September 9-14, 2007, Montreal, Quebec, Canada.
[7] G. Karlsson, V. Lenders, and M. May, Delay-Tolerant Broadcasting, IEEE Transactions on Broadcasting, vol. 53, no. 1, pp. 369-381, March 2007.
K. Fall, A Delay-Tolerant Network Architecture for Challenged Internets, in Proc. ACM SIGCOMM, August 25-29, 2003, Karlsruhe, Germany.
[8] S. Burleigh et al., Delay-Tolerant Networking: An Approach to Interplanetary Internet, IEEE Communications Magazine, June 2003.
[9] O. Ocakoglu and O. Ercetin, Energy Efficient Random Sleep-Awake Schedule Design, IEEE Communications Letters, vol. 10, no. 7, pp. 528-530, 2006.
[10] M. Grossglauser and D. Tse, Mobility increases the capacity of ad hoc wireless networks, IEEE/ACM Transactions on Networking, 2002.
[11] A. Chaintreau, P. Hui, J. Crowcroft, C. Diot, R. Gass, and J. Scott, Impact of human mobility on the design of opportunistic forwarding algorithms, in Proc. IEEE INFOCOM, 2006.
[12] J. Su, A. Chin, A. Popivanova, A. Goel, and E. de Lara, User mobility for opportunistic ad-hoc networking, in Proc. IEEE WMCSA, 2004.
[13] T. Zhang, Z. Li, and M. Liu, Routing in Partially Connected Wireless Network, Journal of System Simulation, vol. 18, no. 10, pp. 2972-2975, 2006.

Effective Feature Space Reduction with Imbalanced Data for Semantic Concept Detection

Lin Lin, Guy Ravitz, Mei-Ling Shyu
Department of Electrical and Computer Engineering
University of Miami
Coral Gables, FL 33124, USA
l.lin2@umiami.edu, {ravitz,shyu}@miami.edu

Shu-Ching Chen
School of Computing and Information Sciences
Florida International University
Miami, FL 33199, USA
chens@cs.fiu.edu

Abstract

Semantic understanding of multimedia content has become a very popular research topic in recent years. Semantic concept detection algorithms face many challenges such as the semantic gap and imbalanced data, among others. In this paper, we propose a novel algorithm using multiple correspondence analysis (MCA) to discover the correlation between features and classes in order to reduce the feature space and to bridge the semantic gap. Moreover, the proposed algorithm is able to explore the correlation between items (i.e., feature-value pairs generated for each of the features) and classes, which expands its ability to handle imbalanced data sets. To evaluate the proposed algorithm, we compare its performance on semantic concept detection with several existing feature selection methods under various well-known classifiers, using some of the concepts and benchmark data available from the TRECVID project. The results demonstrate that our proposed algorithm achieves promising performance, and it performs significantly better than those feature selection methods on the imbalanced data sets.

1. Introduction

Multimedia retrieval has a long history, and many approaches have been developed to manage and query diverse data types in computer systems [17]. Semantic understanding of multimedia content is the final frontier in multimedia information retrieval, and one of its fundamental challenges is semantic concept detection. The desired concepts to be detected could be the existence of an entity such as faces, trees, etc., or a more descriptive meaning such as weather, sports, and more. Content-based concept detection applications use low-level features, such as visual features, text-based features, audio features, motion features, and other metadata to determine the semantic meaning of the multimedia data. As previously mentioned, two issues, namely the semantic gap and imbalanced data, have been identified as the main obstacles that any system faces when attempting to understand the semantics of multimedia content. Recently, many researchers have directed their efforts towards the development of machine learning algorithms with the capability to bridge the so-called semantic gap. Schneiderman and Kanade [16] proposed a system for component-based face detection using statistics of parts. A framework for a multimodal video event detection system which combined the analysis of both speech recognition and video annotations was developed in [3]. Chen et al. [4] proposed a framework using both multimodal analysis and temporal analysis to offer strong generality and extensibility with the capability of exploring representative event patterns. In [5], the authors proposed a user-centered semantic event retrieval framework which incorporated the Hierarchical Markov Model Mediator mechanism.

In theory, any supervised learning algorithm could be used for semantic concept detection. However, that is under the assumption that the distribution of positive and negative data is balanced in the training data. In fact, this may not always be true in real multimedia databases, which usually include only a small collection of positive instances for some semantic concepts. When the data set is imbalanced, many machine learning algorithms have problems, and the prediction performance can decrease significantly [11]. The two most common sampling schemes currently used to adapt machine learning algorithms to imbalanced data sets are over-sampling and under-sampling. The first adapts the algorithm to the imbalanced data by duplicating the positive data, increasing the frequency of the positive class in the training set, while the second does so by discarding some negative data, thereby balancing the frequency of the positive and negative classes in the training set.

Some existing solutions that have been proposed to handle the imbalanced data set problem are the analysis of the relationship between the class distribution of a fixed-size training set and performance [18], and an approach combining different expressions of the resampling method [8]. Finally, in our previous work [13], we proposed a pre-filtering architecture to prune the negative instances using association rule mining.

Feature selection is one of the most frequently used techniques in data pre-processing to remove redundant, irrelevant, and noisy data. By reducing the feature space, the efficiency, accuracy, and comprehensibility of the algorithm can be improved. Ideally, feature selection makes it possible for the system to choose, from the original feature set, the feature subset that best represents the target semantic concepts. The performance of a feature subset is measured by an evaluation criterion selected according to the evaluation model in use. The three main evaluation models used for feature selection are the filter model [9][22], the wrapper model [7][12], and the hybrid model [6][21]. The filter model uses independent evaluation functions, the wrapper model uses the performance of one pre-determined algorithm as the dependent evaluation criterion, and the hybrid model takes advantage of both by using different evaluation criteria in different search stages. Both supervised feature selection algorithms (i.e., feature selection used for classification with labeled data) and unsupervised feature selection algorithms (i.e., feature selection used for clustering with unlabeled data) have been developed [14].

In this paper, we propose a novel feature selection algorithm using Multiple Correspondence Analysis (MCA) [15] to evaluate the extracted low-level features. Using the best feature subset captured by MCA, we compare the semantic concept detection performance of the proposed framework with that of several other existing feature selection algorithms, using some of the concepts and benchmark data from TRECVID 2007 [1] under various well-known classifiers, such as the Decision Tree (C4.5), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Adaptive Boosting (AdaBoost), and Naive Bayes (Bayes). Overall, the proposed framework performs better than the other feature selection methods over all five classifiers, and performs significantly better with imbalanced data sets.

This paper is organized as follows. In Section 2, the different technologies of filter-model-based feature selection and classification methods are introduced. Section 3 presents the details of the proposed feature selection algorithm. Section 4 discusses our experimental results, and the conclusion is provided in Section 5.

2 Relevant Technologies

In this section, we introduce the algorithms that we used in the performance comparison. There are many feature space reduction algorithms and classification algorithms available in the literature; here, several of the most popular algorithms are selected.

2.1 Feature Space Reduction Algorithms

As previously mentioned, there are supervised and unsupervised feature selection methods for reducing high-dimensional data sets. The supervised filter-based feature selection algorithms can be separated into several categories from different points of view. For comparison purposes, we select the following algorithms, which are available in WEKA [2].

• Correlation-based Feature Selection (CFS): CFS searches feature subsets according to the degree of redundancy among the features. The evaluator aims to find the subsets of features that are individually highly correlated with the class but have low inter-correlation. The subset evaluators use a numeric measure, such as conditional entropy, to guide the search iteratively and add features that have the highest correlation with the class.

• Statistics-based Feature Selection: Information Gain (IG) and Chi-Square measures are examples in this category. The Information Gain measure evaluates features by computing their information gain with respect to the class. The Chi-square measure evaluates features by ranking the chi-square statistic of each feature with respect to the class.

• Instance-based Feature Selection: Relief is an instance-based method that evaluates each feature by its ability to distinguish the neighboring instances. It randomly samples the instances and checks the instances of the same and different classes that are near each other. An exponential function governs how rapidly the weights degrade with the distance.

• Transformation-based Feature Selection: Principal Components Analysis (PCA), for example, transforms the set of features into the eigenvector space. Since each eigenvalue gives the variance along its axis, we can use a special coordinate system that depends on the cloud of points, with a certain variance in each direction. All the components could be used as new features, but the first few account for most of the variance in the data set.
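These WEKA evaluators have rough analogues in other toolkits; as an illustration only (this is not the authors' setup), the snippet below ranks features by chi-square and by mutual information (an information-gain analogue) with scikit-learn and applies PCA, on synthetic, imbalanced data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, mutual_info_classif
from sklearn.decomposition import PCA

# synthetic stand-in for 28 low-level audiovisual features with a rare positive class
X, y = make_classification(n_samples=300, n_features=28, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)

# chi-square requires non-negative inputs, so shift the features first
X_pos = X - X.min(axis=0)
chi_scores, _ = chi2(X_pos, y)                           # statistics-based ranking
ig_scores = mutual_info_classif(X, y, random_state=0)    # information-gain analogue

print("top-10 by chi-square:", np.argsort(chi_scores)[::-1][:10])
print("top-10 by mutual information:", np.argsort(ig_scores)[::-1][:10])

# transformation-based reduction: keep only the first few principal components
X_pca = PCA(n_components=5).fit_transform(X)
print("PCA-reduced shape:", X_pca.shape)
```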
The details of these functions available in WEKA can be found in [19]. An important step in using a feature selection method is to set up the stopping criterion, which determines when the feature selection algorithm stops and concludes that the subset found at that point is the best feature subset. Some of the stopping criteria adopted by the feature selection methods in the literature are (i) complete search; (ii) a threshold, such as a minimum number of features; (iii) subsequent addition, such as in CFS; and (iv) the classification error rate [14].

2.2 Classification Algorithms

There are several categories of classifiers, including trees, functions, Bayesian classifiers, lazy classifiers, rule-based classifiers, and meta-learning algorithms [19]. The most popular classification algorithms in data mining, as voted in [20], are C4.5 (trees), Support Vector Machine (functions), Naive Bayesian (Bayesian), K-Nearest Neighbor (lazy), Apriori (rules), and Adaptive Boosting (meta-learning). Based on this fact, we chose to use the aforementioned classification algorithms in our experiments, with the exception of association-rule-based classification, as WEKA does not include an implementation for that classifier. The following are the definitions of the classifiers used in our experiments, taken from [10]:

• Decision Tree (C4.5). The C4.5 decision tree learner builds a tree structure in which each non-leaf node represents a test on a feature, each branch denotes an outcome of the test, and each leaf node represents a class label. The Decision Tree classifier became popular because constructing a decision tree classifier does not require any domain knowledge, and the acquired knowledge in tree form is easy to understand. In addition, the classification step of decision tree induction is simple and fast. C4.5 uses the information gain ratio as its feature selection measure. Besides the splitting criterion, another interesting challenge in building a decision tree is to overcome over-fitting of the data; to achieve that, C4.5 uses a method called pessimistic pruning.

• Support Vector Machine (SVM). The Support Vector Machine is built on the structural risk minimization principle and seeks a decision surface that separates the data points into two classes with a maximal margin between them. The choice of a proper kernel function is the main challenge when using an SVM; it can take different forms, such as the Radial Basis Function (RBF) kernel and the polynomial kernel. The advantage of the SVM is its capability of learning in sparse, high-dimensional spaces with very few training examples, by minimizing a bound on the empirical error and the complexity of the classifier at the same time. WEKA uses the Sequential Minimal Optimization (SMO) algorithm for SVM.

• Naive Bayesian (Bayes). The Bayesian classifier is a statistical classifier that can predict the probability that a given instance belongs to a particular class. The probabilistic Naive Bayes classifier is based on Bayes's rule and assumes that, given the class, the features are independent, which is called class conditional independence. In theory, Bayesian classifiers have the minimum error rate compared with all other classifiers; however, this is not always the case in practice, because of the previously mentioned assumption. Even so, the Naive Bayesian classifier has exhibited high accuracy and high speed when applied to large databases.

• K-Nearest Neighbor (KNN). The K-Nearest Neighbor algorithm is based on the assumption that instances that are closer to each other generally belong to the same class; thus KNN is an instance-based learner. The testing sample is labeled according to the class of its first K nearest neighbors. The weight is converted from the distance between the test instance and its predictive neighbors in the training instances. As new training instances are added, the oldest ones are removed to keep the number of training instances at the size of K. The most common metric for computing the distance is the Euclidean distance. For nominal data, the distance between instances according to a particular feature is 0 if their values are the same and 1 otherwise.

• Adaptive Boosting (AdaBoost). Boosting is a general strategy to improve the accuracy of classifiers. In boosting, weights are assigned to the training instances and a series of classifiers is iteratively learned. WEKA includes the AdaBoost.M1 method. One advantage of AdaBoost is that it is fast; it can be accelerated by specifying a threshold for weight pruning.

3 The Proposed Framework

The algorithm proposed in this paper achieves the goal of reducing the feature space of a semantic concept detection system by applying Multiple Correspondence Analysis (MCA) to multimedia feature data.
3.1 Multiple Correspondence Analysis (MCA)

Multiple correspondence analysis (MCA) extends standard Correspondence Analysis (CA) by providing the ability to analyze tables containing some measure of correspondence between the rows and columns with more than two variables [15]. In its basic format, a multimedia database stores features (attributes) and class labels for several instances such as frames, shots, or scenes. If we consider the instances as the rows of the MCA table and the features (attributes) and class labels as its columns, then MCA captures the correspondence between the features and the classes, which can help narrow the semantic gap between the low-level features and the concepts (class labels) in a multimedia database.

MCA is used to analyze a set of observations described by a set of nominal variables, where each nominal variable comprises several levels. In general, the features extracted from multimedia streams are numerical in nature. Therefore, in order to properly use MCA, the extracted quantitative features should be quantized into bins in some manner. Assuming there are I rows and K columns in the MCA table, the nominal features will have J_k levels, and the total number of items (bins) will be J. Therefore, if there are I data instances in a multimedia database, each characterized by a set of low-level features, then after discretization (i.e., converting the numerical features into nominal ones) there will be K nominal features (including the classes), and each feature will have J_k items (feature-value pairs).

Next, MCA scans the discretized data to generate the indicator matrix, a binary representation of the different categorical values. Each column of this matrix represents a level (item) generated during the data discretization process, while each row represents an instance. The indicator matrix marks the appearance of an item with the value 1. For a specific instance, only one level (item) can be present for each feature, and therefore each feature can only have one value of 1 in the indicator matrix for each instance. Standard CA analyzes the indicator matrix, while MCA calculates the inner product of the indicator matrix, which generates the Burt matrix Y = X^T X, and uses it for the subsequent analysis. The size of the indicator matrix is I x J, and the size of the Burt matrix is J x J.

Now, let the grand total of the Burt matrix be N and the probability matrix be Z = Y/N. The vector of column totals of Z is a 1 x J mass matrix M, and D = diag(M). Furthermore, let Delta be the diagonal matrix of the singular values, the columns of P be the left singular vectors (gene coefficient vectors), and the rows of Q^T be the right singular vectors (expression level vectors) in the singular value decomposition (SVD) theorem. MCA provides the principal components using SVD as follows:

D^(-1/2) (Z - M M^T) (D^T)^(-1/2) = P Delta Q^T.  (1)

Finally, the multimedia data can be projected into a new space using the first and second principal components discovered via Equation 1. The weight of the correlation between the different items and the different classes can be used as an indication of the similarity between them. Such similarity can be calculated as the inner product of each item and each class, i.e., the cosine of the angle between them. Since the angle between an item and a class ranges from 0 to 180 degrees, and the cosine function decreases from 1 to -1 over that range, the more highly correlated item-class pairs are those that project into the new space with a smaller angle between them.
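The following Python sketch walks through this chain (indicator matrix, Burt matrix, SVD, item-class angles) on a toy nominal table. It is our illustration rather than the authors' implementation: the scaling used for the principal coordinates follows one common MCA convention, and the conventions in the paper may differ in detail.

```python
import numpy as np

def mca_item_angles(nominal, class_col):
    """Indicator -> Burt -> SVD (Eq. 1) -> angles between feature items and class items.

    nominal: 2-D array of strings (instances x nominal features); class_col: class labels.
    Returns {(feature index, item value, class value): angle in degrees}.
    """
    table = np.column_stack([nominal, class_col])
    cols, names = [], []
    for j in range(table.shape[1]):                 # one binary column per (column, level)
        for level in np.unique(table[:, j]):
            cols.append((table[:, j] == level).astype(float))
            names.append((j, level))
    X = np.column_stack(cols)                       # indicator matrix (I x J)

    Y = X.T @ X                                     # Burt matrix (J x J)
    Z = Y / Y.sum()                                 # probability matrix
    M = Z.sum(axis=0)                               # column masses
    D_inv_sqrt = np.diag(1.0 / np.sqrt(M))
    S = D_inv_sqrt @ (Z - np.outer(M, M)) @ D_inv_sqrt   # matrix decomposed in Eq. (1)
    P, delta, Qt = np.linalg.svd(S)
    coords = D_inv_sqrt @ P[:, :2] * delta[:2]      # first two principal axes

    k_class = table.shape[1] - 1
    angles = {}
    for i, (j, level) in enumerate(names):
        if j == k_class:
            continue
        for c, (jc, cls) in enumerate(names):
            if jc != k_class:
                continue
            cos = coords[i] @ coords[c] / (np.linalg.norm(coords[i]) * np.linalg.norm(coords[c]))
            angles[(j, level, cls)] = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return angles

# toy usage with two discretized features and a binary concept label
data = np.array([["low", "a"], ["high", "b"], ["low", "a"], ["high", "a"]])
labels = np.array(["pos", "neg", "pos", "neg"])
print(mca_item_angles(data, labels))
```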
3.2 MCA-based Feature Selection

The proposed framework consists of several stages, as shown in Figure 1.

Figure 1. The Proposed Framework.

First, the audiovisual low-level features are extracted from the data. A total of 28 different features are extracted, including 11 visual features, 16 audio features, and 1 feature that represents the length of the shot.
The normalization process is performed right after the features have been extracted. Next, we discretize the data in order to be able to properly use MCA, since all the features are numerical and MCA requires nominal input. We discretize the training data set first, and then use the same partitions to discretize the testing data set. Each interval of the discretization is considered an item.

Following that, MCA is applied to the discretized training data set, and the angles between each item and each class are computed. As mentioned before, the angle between an item and a class quantifies the correlation between them, and therefore we use the angle as the stopping criterion for the proposed feature selection algorithm. One possible threshold condition would be to keep those items whose angle values are smaller than 90 degrees, but this may not be a good choice. In order to determine the proper angle threshold, the angles generated by MCA for each concept are sorted in ascending order. We use the first big gap in the distribution of the sorted angles before 90 degrees as the lower boundary, and 90 degrees as the upper boundary; the average of the angles falling into this range is used as the threshold value. Based on this threshold, the items whose angle values are smaller than the threshold are kept. This automatic procedure gives our proposed framework the capability of identifying different angle thresholds for the positive and negative classes, so we can always obtain enough of the best selected items to identify the positive class in the data set.
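The paper describes this thresholding rule qualitatively; the sketch below is one way to operationalize it, and taking the largest gap below 90 degrees as "the first big gap" is our assumption.

```python
import numpy as np

def angle_threshold(angles):
    """Average of the angles between the first big gap below 90 degrees and 90 degrees."""
    a = np.sort(np.asarray(angles, dtype=float))
    below = a[a < 90.0]
    if below.size < 2:
        return 90.0
    gaps = np.diff(below)
    lower = below[np.argmax(gaps) + 1]           # angle just after the largest gap
    window = a[(a >= lower) & (a <= 90.0)]
    return float(window.mean()) if window.size else 90.0

# made-up item angles for one concept
angles = [12.0, 14.5, 15.0, 40.0, 62.0, 63.5, 71.0, 88.0, 95.0, 120.0]
thr = angle_threshold(angles)
kept = [ang for ang in angles if ang < thr]      # items kept for this class
print(thr, kept)
```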
After the different items generated by the discretization stage are evaluated, the best features are selected: a feature is kept when most of its items are kept. Finally, the classification of the selected semantic concepts is conducted using the aforementioned five well-known classifiers.

4 Experiments and Results

To evaluate the proposed framework, we used the news broadcasts and movies provided by TRECVID [1]. Using 23 videos as our testbed, we chose five concepts, namely vegetation, sky, outdoor, crowd, and face. These concepts were taken from the list of concepts provided for the TRECVID 2007 high-level feature extraction task. We chose these concepts because (i) there are sufficient instances to build useful training and testing data sets for them, and (ii) they represent both balanced and imbalanced data sets, which allows us to demonstrate the robustness of our framework. The concept names and their corresponding definitions from [1] are as follows; the ratio of the number of positive instances to the number of negative instances for each concept is listed in Table 1.

• Vegetation: Shots depicting natural or artificial greenery, vegetation, woods, etc.
• Sky: Shots depicting sky.
• Crowd: Shots depicting a crowd.
• Outdoor: Shots of outdoor locations.
• Face: Shots depicting a face.

Table 1. Positive (P) to Negative (N) instance ratio per investigated concept.
Concept Name   P/N Ratio
Vegetation     0.12
Sky            0.14
Crowd          0.28
Outdoor        0.51
Face           0.96

We evaluated our system using the precision (Pre), recall (Rec), and F1-score (F1) metrics under a 3-fold cross validation approach, i.e., three different random sets of training and testing data were constructed for each concept. To show the efficiency of our proposed framework, we compared its performance to that of four different feature selection algorithms using five different classifiers available in WEKA [2]. The average precision, recall, and F1-score values of all the feature selection methods for the aforementioned concepts are shown in Table 2 through Table 6. These tables can be read as follows: columns 3 to 7 represent the different feature selection algorithms used, namely Correlation-based feature selection (CFS), Information gain (IG), Relief (RE), Principal components analysis (PCA), and Multiple correspondence analysis (MCA). The rows represent the different classification algorithms used: Decision tree (DT), Support vector machine (SVM), Naive Bayesian (NB), K-nearest neighbor (KNN), and Adaptive boosting (AB).
We observe that SVM always yields zero precision and recall for the concepts with extremely imbalanced data, namely vegetation (ratio = 0.12) and sky (ratio = 0.14). This is because when the class distribution is too skewed, SVM generates a trivial model by predicting everything as the majority class, i.e., the negative class. In the case of the Information Gain, Relief, and PCA methods, WEKA produced a ranked list of the features without performing the actual feature selection, so we had to select the best stopping criteria. After an extensive empirical study, we set the stopping criteria for these three methods as follows: we calculated the average score of each of the previously mentioned ranked lists and used this average value as a threshold for selecting the features, i.e., the features with a score higher than the average value were selected as the best subset of features produced by these three methods.
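This average-score cut-off is simple enough to state in a few lines of code; the feature names and scores below are made up for illustration.

```python
def select_by_average_score(ranked):
    """Keep the features whose score exceeds the mean score of the ranked list."""
    scores = [s for _, s in ranked]
    threshold = sum(scores) / len(scores)
    return [name for name, s in ranked if s > threshold]

# e.g. an information-gain style ranking (hypothetical feature names and scores)
ranking = [("color_hist", 0.41), ("edge_energy", 0.32), ("audio_rms", 0.10),
           ("shot_length", 0.05), ("pitch_var", 0.02)]
print(select_by_average_score(ranking))   # ['color_hist', 'edge_energy']
```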
As can be observed in Tables 2 through 6, our proposed
framework achieves promising results compared to all the
other feature selection methods over all classifiers, espe- CFS IG RE PCA MCA
cially in the cases of the imbalanced data sets (i.e., vege- Pre 0.60 0.80 0.62 0.37 0.57
tation, sky, and crowd concepts). We can further observe DT Rec 0.19 0.10 0.19 0.16 0.23
that the recall values and the F1-scores for the proposed F1 0.28 0.15 0.28 0.22 0.32
framework are always higher over all the classifiers. This Pre 0.82 0.79 0.79 0.33 0.79
encouraging observation demonstrates the fact that the pro- SVM Rec 0.06 0.08 0.08 0.01 0.08
posed framework has the ability to help the classifiers to de- F1 0.11 0.15 0.15 0.02 0.15
tect more positive instances in the testing data set without Pre 0.49 0.49 0.48 0.48 0.46
misclassifying too many negative instances by identifying NB Rec 0.40 0.31 0.34 0.17 0.41
the best feature subset for each of the investigated concepts. F1 0.44 0.37 0.39 0.25 0.44
In addition, the proposed framework was able to reduce the Pre 0.46 0.60 0.55 0.43 0.48
feature space by approximately 50% for all the investigated KNN Rec 0.29 0.20 0.23 0.35 0.31
concepts in the experiments, which is considered a signif- F1 0.36 0.29 0.30 0.37 0.38
icant feature space reduction. This demonstrates that the Pre 0.63 0.76 0.77 0.66 0.64
proposed framework can better represent the semantic con- AB Rec 0.12 0.10 0.04 0.05 0.14
cepts using the reduced feature set. F1 0.19 0.16 0.07 0.09 0.22

Table 4. Average precision (Pre), recall (Rec)


5 Conclusions and F1-score (F1) for “crowd” over five clas-
sifiers
5 Conclusions

In this paper, a correlation-based and transformation-based feature selection framework using MCA is proposed to handle multimedia semantic understanding related problems such as high dimensionality, the semantic gap, and imbalanced data in a multimedia database. The TRECVID 2007 benchmark data is used to evaluate the concept detection performance of our proposed framework compared with several widely used feature selection schemes under several well-known classifiers. We utilize the functionality of MCA to measure the correlation between extracted low-level audiovisual features and classes to infer the high-level concepts (semantics). The experimental results show that our proposed framework performs well in improving the detection of the high-level concepts, namely vegetation, outdoor, sky, crowd, and face. Furthermore, the results demonstrate the superiority of the proposed framework over the other feature selection methods in the case of imbalanced data over all five classifiers. The proposed feature selection framework proves to play a major role in assisting semantic concept detection systems to better understand the semantic meaning of multimedia data under real-world constraints such as imbalanced data sets.

6 Acknowledgement

For Shu-Ching Chen, this research was supported in part by NSF HRD-0317692 and the Florida Hurricane Alliance Research Program sponsored by the National Oceanic and Atmospheric Administration.

References

[1] Guidelines for the TRECVID 2007 Evaluation. http://www-nlpir.nist.gov/projects/tv2007/tv2007.html.
[2] WEKA. http://www.cs.waikato.ac.nz/ml/weka/.
[3] A. Amir, S. Basu, G. Iyengar, C.-Y. Lin, M. Naphade, J. R. Smith, S. Srinivasan, and B. Tseng. A multimodal system for the retrieval of semantic video events. Computer Vision and Image Understanding Archive, 56(2):216–236, November 2004.
[4] M. Chen, S.-C. Chen, M.-L. Shyu, and K. Wickramaratna. Semantic event detection via temporal analysis and multimodal data mining. IEEE Signal Processing Magazine, Special Issue on Semantic Retrieval of Multimedia, 23(2):38–46, March 2006.
[5] S.-C. Chen, N. Zhao, and M.-L. Shyu. Modeling semantic concepts and user preferences in content-based video retrieval. International Journal of Semantic Computing, 1(3):377–402, September 2007.
[6] S. Das. Filters, wrappers and a boosting-based hybrid for feature selection. Proc. International Conference on Machine Learning, pages 74–81, 2001.
[7] J. G. Dy and C. E. Brodley. Feature subset selection and order identification for unsupervised learning. Proc. International Conference on Machine Learning, pages 247–254, 2000.
[8] A. Estabrooks, T. Jo, and N. Japkowicz. A multiple resampling method for learning from imbalanced data sets. Computational Intelligence, 20(1):18–36, February 2004.

[9] M. A. Hall. Correlation-based feature selection for discrete
and numeric class machine learning. Proc. International
Conference of Machine Learning, pages 359–366, 2000.
[10] J. Han and M. Kamber. Data Mining: Concepts and Tech-
niques, 2nd Edition. Morgan Kaufmann, 2006.
[11] N. Japkowicz and S. Stephen. The class imbalance problem:
A systematic study. Intelligent Data Analysis, 6(5):429–449,
2002.
[12] Y. Kim, W. Street, and F. Menczer. Feature selection for un-
supervised learning via evolutionary search. ACM SIGKDD
International Conference of Knowledge Discovery and Data
Mining, pages 365–369, 2000.
[13] L. Lin, G. Ravitz, M.-L. Shyu, and S.-C. Chen. Video se-
mantic concept discovery using multimodal-based associa-
tion classification. IEEE International Conference on Mul-
timedia and Expo, pages 859–862, July 2007.
[14] H. Liu and L. Yu. Toward integrating feature selection al-
gorithms for classification and clustering. IEEE Trans. on
Knowledge and Data Engineering, 17(4):491–502, April
2005.
[15] N. J. Salkind, editor. Encyclopedia of Measurement and Statis-
tics. Sage Publications, Thousand Oaks, CA, 2007.
[16] H. Schneiderman and T. Kanade. Object detection using the
statistics of parts. International Journal of Computer Vision,
56(3):151–177, February 2004.
[17] M.-L. Shyu, S.-C. Chen, Q. Sun, and H. Yu. Overview and
future trends of multimedia research of content access and
distribution. International Journal of Semantic Computing,
1(1):29–66, March 2007.
[18] G. M. Weiss and F. Provost. Learning when training data
are costly: The effect of class distribution on tree induction.
Journal of Artificial Intelligence Research, 19:315–354, Oc-
tober 2003.
[19] I. H. Witten and E. Frank. Data Mining: Practical Machine
Learning Tools and Techniques, 2nd Edition. Morgan Kauf-
mann, 2005.
[20] X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Mo-
toda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z.-H. Zhou,
M. Steinbach, D. J. Hand, and D. Steinberg. Top 10 algo-
rithms in data mining. Knowledge and Information Systems,
pages 1–37, December 2007.
[21] E. Xing, M. Jordan, and R. Karp. Feature selection for high-
dimensional genomic microarray data. Proc. International
Conference of Machine Learning, pages 601–608, 2001.
[22] L. Yu and H. Liu. Feature selection for high-dimensional
data: A fast correlation-based filter solution. Proc. Inter-
national Conference of Machine Learning, pages 856–863,
2003.


Industry Panel 1

Wireless Sensor Networks: An Industry View

What will be the killer applications for wireless sensor networks?

Ming-Whei Feng
Institute for Information Industry

One of the key technologies associated with ubiquitous network society services is the wireless sensor network (WSN). WSN technology incorporates multiple technology components such as wireless communication, sensors, and embedded systems. WSN was cited by the MIT Technology Review as one of the technologies that will have a giant impact on future lifestyles and various industrial segments. For WSN applications, the sensor device is key, since sensors need to be deployed to collect environmental data for further analysis. According to an analysis report by ON World, the CAGR for sensor nodes deployed worldwide will be 216%, and the revenue of sensor chips will reach US$12.1 billion in 2010. WSN has therefore become one of the most important and promising technologies.

Wireless sensor networks enable connectivity and intelligence for sensor applications that will provide advanced monitoring, automation, and control solutions for a range of industries. There is a nearly unlimited number of WSN markets, each with different technology considerations such as frequencies, sampling rates, topologies, sensors to use, etc. In this panel session, it is our honor to invite experts in WSN applications and industries to discuss (1) what will be the killer applications for wireless sensor networks, and (2) their associated technology requirements.


Towards Scalable Deployment of


Sensors and Actuators for Industrial Applications
Han Chen, Paul Chou, Hao Yang
IBM Thomas J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532 USA
{chenhan, pchou, haoyang}@us.ibm.com

Abstract

While electronic sensors and actuators have a long history of being used for a broad variety of industrial applications, these applications tend to be self-contained, for example, to control a process or to automate an operation. The recent technology advancements in ubiquitous connectivity and distributed computing have afforded improved business performance or novel business models by narrowing the gap between the real-world conditions and the state of business information systems, with sensors and actuators seamlessly integrated in critical business processes.

This paper first describes three industry examples to motivate and discuss the key challenges for the ubiquitous applications of sensors and actuators. In particular, it argues for the need of an architecture that promotes component assembly for deployment scalability and supports application-specific QoS requirements such as timeliness, reliability and security. The paper also describes a prototype solution to illustrate our approach and progress.

1 Background

The role of sensors and actuators for industry applications traditionally has been confined to the controlling of equipment or automating of repetitive tasks. As a result of the continuous improvements over their capabilities and costs, in conjunction with the advent of ubiquitous connectivity and distributed computing, a new role has emerged for sensors and actuators as the interface between business information systems and the physical world. By tightly connecting to the physical world through sensors and actuators, business information systems that previously depended on potentially stale and frequently inaccurate data to make business decisions can now continuously monitor the real-world conditions and respond to changes in a timely and optimal way. For example, much of the recent excitement around RFID/EPC is about enabling real-time visibility to the supply, demand, and inventory levels for optimizing supply chain operations [1]. We believe that the potential value that could be gained from business optimization using sensors and actuators is immense and is yet to be fully exploited.

This paper first examines three emerging industry applications to illustrate the benefits of sensor and actuator integrated business information systems. The examples also help draw some of the key technical challenges related to the cost, function and quality of service requirements that need to be addressed through the different phases of application deployment. It then describes an architectural approach and a working prototype to address those challenges.

2 Use cases

Deploying new sensor and actuator applications involves significant risk and cost. It is common and prudent to adopt a progressive deployment strategy that involves multiple phases, starting with technology validation and proof-of-concept prototyping, followed by limited function and scale pilots, and then large-scale roll-outs. The emerging applications that we choose as the motivational examples are from the Pharmaceutical, Energy and Utilities, and Railroad industries. These applications are currently at different levels of maturity: the RFID-enabled Pharmaceutical Track and Trace is on the verge of industry-wide roll-out to meet regulative mandates; its technical risk has largely been addressed and its cost versus benefit is better understood. The intelligent energy grid is in the pilot deployment phase, with the continuously evolving technology being investigated and understood in small or medium-scale settings. The next generation WSN-based Railroad Equipment Tracking and Monitoring is still at the stage of validation; many technical challenges remain and cost versus benefit still needs to be further analyzed.

2.1 Pharmaceutical Track and Trace

Radio frequency identification (RFID) has gathered much attention recently and is arguably one of the most widely deployed nascent sensor technologies. Although conceptually similar to a barcode, in the sense that an RFID tag carries a unique number (or string) for the underlying item that it identifies, RFID sets itself apart from traditional optical barcode technology by offering some unique advantages. First, barring any electromagnetic shielding effects caused by metal or liquid, there is generally no line-of-sight requirement for reading RFID tags. Second, media access control algorithms are used in readers to allow multiple tags to be present in the field simultaneously, eliminating the need for serialized item-by-item scans. Third, RFID tags can carry a potentially large number of bits with on-chip memory and can be enhanced with other sensor modalities, such as temperature and pressure, to further enhance their functionality. Finally, newer generations of RFID tags are re-writable, which allows additional information to be passed along through the product lifecycle.

One of the most important applications of RFID is the track and trace of pharmaceutical products throughout their supply chains. Besides the usual benefits associated with RFID-enabled supply chain management, such as better inventory visibility, reduced out-of-stock, and so on, pharmaceutical track and trace addresses two even more pressing issues in this industry: drug counterfeiting and diversion. The World Health Organization (WHO) estimates that between 10 and 30 percent of the medicines on sale in developing countries can be counterfeit, and that medicines purchased over the Internet from sites that do not reveal the business' actual physical address are counterfeit in over 50 percent of cases [2]. This leads to billions in lost revenue every year for the industry. More importantly, fake drugs can kill, as many recent headlines have made us all aware. Many governments, including the EU, Japan and several states in the US, have proposed or adopted pedigree laws or regulations to address the counterfeit problem. For example, the California law requires that each prescription drug must be serialized by the manufacturer at its smallest packing unit. Every transaction starting from the manufacturer must be appended to an electronic pedigree, and wholesalers and retail pharmacies cannot take possession of a product or provide it to others without validating the chain of custody.

The problem of drug diversion is a result of the multi-tiered pricing structure employed by the pharmaceutical industry. For the exact same product, a lower price is used for less affluent markets while a much higher price is targeted at communities which can afford to pay more. The diversion of drugs destined for low-price markets and their sale in high-price markets by downstream distributors, along with other forms of theft, costs the industry an estimated $40 billion per year.

The RFID-enabled Pharmaceutical Track and Trace application addresses both problems. It allows serialized, RFID-tagged product items to be quickly identified at each transaction. The serial identification is used to look up and update an electronic pedigree system that is built on a standard-based information service [3]. It provides product traceability for diversion detection and also enables product tracking for recalls.
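As a rough illustration of the look-up-and-update flow described above (this is not the EPCglobal information-service API itself; the class and method names below are hypothetical), a pedigree record can be modeled as an append-only list of custody events keyed by the item's serial identifier:

# Illustrative sketch of the track-and-trace flow: a serialized RFID read is used
# to look up and extend an electronic pedigree record. All names here
# (Pedigree, PedigreeStore, record_transaction) are assumptions for illustration.
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class Pedigree:
    serial: str                        # serialized item identifier (e.g., an EPC)
    transactions: List[dict] = field(default_factory=list)

class PedigreeStore:
    def __init__(self):
        self._records = {}

    def record_transaction(self, serial: str, party: str, event: str) -> Pedigree:
        """Append a custody event to the item's pedigree, creating it if new."""
        ped = self._records.setdefault(serial, Pedigree(serial))
        ped.transactions.append(
            {"party": party, "event": event, "time": datetime.utcnow().isoformat()}
        )
        return ped

store = PedigreeStore()
store.record_transaction("urn:epc:id:sgtin:0614141.107346.2018", "Manufacturer", "commission")
store.record_transaction("urn:epc:id:sgtin:0614141.107346.2018", "Wholesaler A", "receive")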
2.2 Intelligent Energy Grid

The energy industry, under the pressure of unprecedented growth of electricity demand, is currently facing challenges for improving the cost-effectiveness and reliability of the energy grid. Traditional approaches for maintaining the power supply infrastructure treat load as a purely passive element and provide enough capacity to meet peak demand plus a reserve margin to protect against outage. To overcome the low utilization of system assets while maintaining the desired level of reliability, the intelligent grid technology [4] is expected to bring real-time demand information into the control and decision systems, which can make the energy grid inherently more efficient, stable and adaptive.

An energy grid consists of different portions such as transmission and distribution grids. The real-time demand information can be collected from intelligent meters deployed on the edge of the distribution grid (e.g., homes and offices). On the other hand, there are many substations within the transmission grids that provide a critical aggregated view of the energy capacity and demand, as well as the status of the infrastructure for fault detection and maintenance purposes. Today the energy companies have already deployed many sensors in the grid to collect system utilization and health data; however, these sensors are often treated as independent data sources, and there is no integrated system that can combine them, in a real-time manner, to support business applications such as grid planning, fault detection and risk prediction.

In addition to collecting real-time energy demand and grid utilization information, the intelligent grid can further enable control and feedback functionalities that allow the grid to interact with the end users and impact their energy usage profile. One example of such control feedback is load control through adaptive energy pricing, in which the grid periodically publishes the latest energy prices based on its current utilization.

The users can specify their response strategies with respect to the tradeoff between comfort level and cost. These strategies are implemented on home controllers that can adjust the heaters, air conditioners, refrigerators and other consumer products according to the latest energy prices. As such, the energy demand, for the first time, becomes not only visible but also controllable to the grid in a real-time or near-real-time manner.
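A minimal sketch of such a response strategy, assuming a hypothetical price schedule and set-points (none of these values come from the paper), might look as follows:

# Sketch of the load-control idea above: a home controller relaxes its cooling
# set-point as the published energy price rises. Price tiers and set-points are
# made-up illustrations of one user's comfort/cost trade-off.
def cooling_setpoint(price_per_kwh: float, comfort_c: float = 24.5) -> float:
    """Return a thermostat set-point (deg C) given the latest published price."""
    if price_per_kwh < 0.10:        # cheap energy: hold the comfort temperature
        return comfort_c
    if price_per_kwh < 0.25:        # moderate price: allow a small drift
        return comfort_c + 1.0
    return comfort_c + 2.5          # peak price: trade comfort for cost

# The grid periodically publishes a price; the controller reacts to each update.
for published_price in (0.08, 0.18, 0.32):
    print(published_price, "->", cooling_setpoint(published_price), "deg C")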
While the intelligent grid holds great promise in the new services and capabilities that it can enable, it also requires great deployment and integration efforts across the industry. The vast number of sensors, actuators and controller devices at different levels, distributed over wide areas and administered by different parties, must be integrated through a unified architecture, where availability, reliability, scalability and efficiency are of paramount importance in the system design.

2.3 Railroad Equipment Tracking and Monitoring

Rolling stock, including locomotives and railcars, represents a significant portion of the capital assets invested by the railroad industry and their customers. They form trains as the operating unit in rail transportation, taking goods or passengers from location to location. A manifest freight train in North America, for example, may comprise up to 150 freight cars from different customers to be delivered to different destinations. At every stop of a route, cars may be dropped off or added to the train. Maintaining accurate and up-to-date information about the train consist is an essential requirement for railroad operations. Such information may be used for correcting and preventing operational mistakes, logistic planning and optimization, billing and other financial settlements.

The North American railroad industry adopted an RFID-based Automatic Equipment Identification (AEI) system in the early 1990's to identify and track railroad equipment while en route. By 2000, over 95% of the railcars in operation had been tagged with a standard-based UHF RFID transponder on each side of a car, and more than 3000 RFID readers had been deployed across North America. As a train passes by a trackside reader, the unique identification numbers of the locomotives and railcars are automatically captured by the reader and transmitted to a central server to infer the train's consist, which comprises the information about the relative position and the type of the cars making up the train. To improve safety and prevent derailment, the railroad companies are deploying additional trackside sensors such as hot box detectors (for hot bearings) and wheel impact load detectors (for deformed wheels). A train is stopped as soon as a defective bearing or wheel is detected, and is not allowed to continue the journey until the problem is fixed.

The AEI and sensor infrastructure has delivered significant benefits to railroad operations; however, the trackside-based approach has a major drawback: the cost of installing sensors along the railroad network (e.g., over 140K miles in the US) amounts to significant capital expenditures. As a result, the read points are sparse; it may take too long from the time a problem occurs until the train reaches a sensor along the route that can detect it.

An alternative to the trackside-based sensor infrastructure is to develop an on-board sensor infrastructure that enables continuous monitoring of the cars and delivers near real-time information to the train operator and the relevant stakeholders. Given that most railcars are unwired and un-powered, recent advancements in wireless sensor networks (WSN) [5] may provide a technology base for this approach: placing motes on railcars to form a train network using a mesh-networking protocol specially tuned to support railroad applications. The train network dynamically updates itself as cars join and leave the train. Readings from sensors attached to the motes are reported based on a pre-determined schedule, or out-of-band for urgent alerts. A gateway located in the locomotive bridges the train network with the enterprise system, and it also hosts on-board applications.

The WSN-based approach faces domain-specific challenges arising from the fact that a train may extend over a mile long and travel at a speed exceeding 40 miles per hour. Any battery deployed cannot be replaced outside of the normal maintenance schedule, whose period is over five years. Reliable and energy-efficient wireless protocols remain a research focus for this application.
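The reporting policy sketched below illustrates the schedule-plus-alert behavior just described; the threshold, report interval, and send() stub are assumptions made for illustration rather than any railroad WSN standard.

# Sketch of an on-board mote policy: report on a fixed schedule, but push an
# out-of-band alert as soon as a bearing-temperature threshold is crossed.
import time

REPORT_INTERVAL_S = 300        # scheduled report period (assumed)
BEARING_ALARM_C = 90.0         # bearing-temperature alert threshold (assumed)

def send(gateway, message):
    # Stand-in for the mesh-network send towards the locomotive gateway.
    print(f"-> {gateway}: {message}")

def monitor(read_bearing_temp, cycles=10, gateway="locomotive-gateway"):
    last_report = 0.0
    for _ in range(cycles):
        temp = read_bearing_temp()
        now = time.monotonic()
        if temp >= BEARING_ALARM_C:                    # urgent: out-of-band alert
            send(gateway, {"type": "alert", "bearing_temp_c": temp})
        elif now - last_report >= REPORT_INTERVAL_S:   # routine: scheduled report
            send(gateway, {"type": "report", "bearing_temp_c": temp})
            last_report = now
        time.sleep(1.0)

monitor(lambda: 65.0, cycles=3)    # healthy bearing: at most one scheduled report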
3 Challenges

It is interesting to draw the following observations from the three application examples described in the previous section.

• In all three examples, sensors and actuators are deployed at geographically dispersed locations where operations take place. They are connected to the enterprise information infrastructure using communication networks with different QoS characteristics in terms of bandwidth, latency, reliability, resilience, and so on.

• In all three examples, the success of the applications depends on industry-wide cooperation that typically involves setting industry-specific technology and operational standards.

Technology standards promote an ecosystem to provide building-block technology elements and compete in function and price, while operational standards enable interoperability and information sharing among the stakeholders.

• The overall cost of an application, which includes the costs of infrastructure and software development and deployment, is typically the biggest impediment to its broad adoption, as some of the benefits may take a long time to realize, especially for the ones that require inter- or intra-industry collaboration. Minimizing the cost in every step of a phased deployment would help reduce the investment risk for the participating parties and accelerate the adoption.

[Figure 1. Technology adoption cycle and SOA: the Model, Assemble, Deploy, Manage cycle is exercised at the proof-of-concept, pilot, and full-deployment stages.]

Service-Oriented Architecture (SOA) has been successfully applied in the business process domain to create composite applications rapidly by orchestrating multiple service components using a high-level, declarative language such as the Business Process Execution Language (BPEL). While this is a rather developer-centric view, a broader definition of SOA, such as that promoted by the IT industry, encompasses the entire life cycle of a solution. It can be summarized into four distinct phases: modeling, assembly, deployment, and management. In the modeling phase, analysts create a high-level model that is implementation independent. During the assembly stage, developers convert the conceptual model into implementations by using existing components or creating new components. In the deployment phase, the created software artifacts are executed on appropriate runtime stacks. The behavior and performance characteristics are then monitored and managed in the management stage. Throughout the SOA life cycle, software engineering tools typically are used to automate many tasks.

During a phased technology adoption, the SOA life cycle is usually exercised multiple times, for example, from a proof-of-concept to a limited pilot to full deployment, as illustrated in Figure 1.

Although, as an architectural style, SOA applies to sensor and actuator applications, several characteristics of these applications present unique challenges.

First, sensors usually generate events asynchronously; the events can represent conditions of the physical world or situations in a computing system. A modeling language designed for event-driven applications will be much better suited for sensor and actuator applications than a flow-based model such as BPEL.

Second, sensor and actuator applications are usually distributed, sometimes over a wide geographical area. Traditional cluster-based runtime approaches found in data centers may not perform well. This calls for a truly distributed runtime platform.

Third, an immediate corollary of a distributed runtime is that non-functional aspects of the runtime, such as node capacity, link bandwidth, latency, reliability, security, and so on, may vary greatly spatially and temporally. Thus, the proper modeling of the Quality of Service (QoS) requirements of applications and the management of the QoS capabilities of the infrastructure are paramount to the overall success of any such sensor and actuator applications.

4 Approach and Prototypes

Inspired by the spirit of SOA, we have designed an architecture for managing the life cycle of distributed, event-driven applications, which integrate sensors and actuators into business processes. As shown in Figure 2, our architecture consists of two layers, namely application modeling and assembly and runtime virtualization, which enable the separation and orchestration of solution development and deployment activities.

[Figure 2. High-level architecture: an Application Modeling and Assembly Layer (modules and components, containment hierarchies, requirements) bound to a Runtime Virtualization Layer (nodes, communication links, domain hierarchies, capabilities) across the model, assemble, deploy, and manage phases of the life cycle.]

The top layer focuses on developing applications in a platform-independent and reusable manner, while the bottom layer provides runtime support for the applications on a specific deployment infrastructure. During the deployment phase, these two layers are bound together through matching of the requirement and capability specifications.

In what follows, we will describe the research prototypes that we have developed for each layer in an effort towards realizing the grand vision of this architecture.

4.1 DRIVE for solution life cycle management

DRIVE, or Distributed Responsive Infrastructure Virtualization Environment [6], is designed to manage the SOA life cycle of distributed sensor and actuator applications. The current version of DRIVE is limited to the functional aspects of an overall application. DRIVE includes an Eclipse-based model-driven development tool and a distributed runtime platform that executes the solution artifacts created by the tools.

[Figure 3. Modeling and assembly of sensor and actuator applications using DRIVE: an Event Processing Network of connected Event Processing Agents (programming-in-the-large), agent behavior specified in EventScript (programming-in-the-small), and SOA composition of solution modules through DRIVE component binding.]

DRIVE uses the Event Processing Network (EPN) to model a distributed sensor and actuator application, as shown in Figure 3. An EPN consists of multiple Event Processing Agents (EPA) that are connected via wires. This concept is widely used in the event processing community [7]. An EPA in DRIVE is characterized by its interface, which is defined by input ports, output ports, and parameters. An EPN can also have its own interface definition, thus allowing recursive and hierarchical composition of agents to form an arbitrarily complex network of processing logic.

An agent can be used to model a sensing device using output ports, an actuating device using input ports, or a controller device or processing logic with both input and output ports. Developers can use either Java or EventScript, an event pattern matching language, to specify the behavior of an agent [8].

DRIVE further extends the traditional EPN programming model by incorporating SOA concepts. When defining an EPN, the constituent agents are specified using only their interface definitions. The actual implementations of the agents are specified using bindings. The binding information is captured in a configurable model file instead of being hard coded in the EPN implementation. Thus, composite applications can be easily assembled by binding EPNs or EPAs together.

DRIVE further insulates application logic from the infrastructure by allowing EPN models to be defined independently of the actual network topology. An application can be deployed on a single computing node or a network of distributed nodes with no modification to the code. The mapping from agents to runtime nodes is done at deployment time. DRIVE provides a runtime container that makes this deployment flexibility possible. The DRIVE container is implemented on top of an OSGi environment [9]. It manages the life cycle of the agent code on the target runtime. Most importantly, the container handles the event transport from producer (for example, an output port of a source agent) to consumer (for example, an input port of a target agent) transparently, whether the source and target agents are co-located on the same runtime or distributed across the network.

Thus, DRIVE presents a distributed computing infrastructure as a single sensor and actuator runtime platform and allows applications to be deployed and managed in this virtualized environment.
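The following Python sketch illustrates the port-and-wire structure of an EPN in a language-neutral way; it is not DRIVE's actual Java or EventScript API, and all class and method names here are assumptions.

# Illustrative model of the EPN concepts above: agents with named input and
# output ports, wired together into a small processing network.
from collections import defaultdict

class EventProcessingAgent:
    def __init__(self, name):
        self.name = name
        self._wires = defaultdict(list)     # output port -> [(agent, input port)]

    def wire(self, out_port, target, in_port):
        self._wires[out_port].append((target, in_port))

    def emit(self, out_port, event):
        for target, in_port in self._wires[out_port]:
            target.on_event(in_port, event)

    def on_event(self, in_port, event):     # processing logic; override per agent
        pass

class ThresholdFilter(EventProcessingAgent):
    def on_event(self, in_port, event):
        if event["value"] > 30.0:           # forward only "interesting" readings
            self.emit("alerts", event)

class Printer(EventProcessingAgent):
    def on_event(self, in_port, event):
        print(f"{self.name} received {event} on {in_port}")

sensor, flt, sink = EventProcessingAgent("sensor"), ThresholdFilter("filter"), Printer("sink")
sensor.wire("readings", flt, "in")
flt.wire("alerts", sink, "in")
sensor.emit("readings", {"value": 31.5})    # flows sensor -> filter -> sink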
4.2 HARMONY for QoS management

HARMONY [10] is a technology we have developed to manage networked systems of sensors, actuators and processing nodes in a runtime environment for distributed, event-driven applications. In particular, HARMONY provides a virtualization layer that allows the applications to monitor and manage the system QoS performance along multiple dimensions, such as delay, throughput, and reliability.

In HARMONY, the available nodes in the system are organized into multiple domains, and each domain corresponds to one virtual "clique" where an instance of the DRIVE applications can be deployed. Note that there could be multiple instances of applications (e.g., local RFID tracking) running in different localities, and these instances collectively realize applications with larger scope (e.g., global pharmaceutical tracking).
As such, these domains can inherit a hierarchical structure from the application decomposition and deployment.

Within each HARMONY domain, the nodes form an overlay network on top of the physical connectivity such as wireless or wired communication channels. The advantage of such an overlay approach is that it enables the applications to configure their end-to-end paths to achieve desired QoS properties and adapt to occasional failures without any direct control over the underlying network infrastructure. As such, the messages sent from one node to another are routed in the overlay network through QoS-aware routing protocols, which seek to find the best currently available paths that satisfy the application's QoS requirements. On the other hand, the nodes continuously monitor the link quality between each other and update their (overlay) routing tables accordingly. These monitoring results are also exposed to the applications through perceived end-to-end QoS performance, and the applications may adapt their behavior to the current network conditions.
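A simplified illustration of such QoS-aware path selection, assuming a small hand-made overlay with per-link delay and reliability estimates (HARMONY's actual routing protocol is not shown here), is:

# Pick, among candidate overlay paths, one that satisfies a delay bound and has
# the highest estimated end-to-end reliability. Graph and metrics are illustrative.
links = {("A", "B"): {"delay_ms": 20, "reliability": 0.99},
         ("B", "D"): {"delay_ms": 30, "reliability": 0.95},
         ("A", "C"): {"delay_ms": 50, "reliability": 0.999},
         ("C", "D"): {"delay_ms": 40, "reliability": 0.999}}

def path_metrics(path):
    delay, rel = 0.0, 1.0
    for a, b in zip(path, path[1:]):
        link = links.get((a, b)) or links[(b, a)]
        delay += link["delay_ms"]
        rel *= link["reliability"]
    return delay, rel

def best_path(paths, max_delay_ms):
    scored = [(p, *path_metrics(p)) for p in paths]
    feasible = [(p, d, r) for p, d, r in scored if d <= max_delay_ms]
    return max(feasible, key=lambda x: x[2], default=None)   # most reliable

candidates = [("A", "B", "D"), ("A", "C", "D")]
print(best_path(candidates, max_delay_ms=80))   # ('A','B','D'), 50 ms, ~0.94 reliability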
In addition to managing the QoS of communication between distributed endpoints, another important task of HARMONY is to discover changes to the domain structure such as node join and leave. In each domain, there is a domain controller that serves as the initial contact for any newly joined node. On the other hand, node departure or failure is detected through neighbor monitoring and then reported to the domain controller. The domain controller maintains a directory service by which the applications can find the currently available nodes and their associated properties. Such information can be used to enable automatic deployment as well as re-configuration of DRIVE applications in a dynamic environment.

5 Conclusions

Large-scale sensor and actuator applications have started to emerge, enabling businesses to address far-reaching problems and to better respond to real-world conditions. This paper described three industry examples to illustrate the opportunities and challenges, in particular from the perspectives of addressing deployment scalability regarding overall cost, function, and QoS. DRIVE and HARMONY together represent an implementation of an architecture that is designed for large-scale sensor and actuator application deployments. The architecture is based on SOA; it supports the separation of functional specification and QoS management, and also supports hierarchical composition of functions and capabilities. We are evaluating the benefits of this architectural approach and validating the value proposition of DRIVE and HARMONY.

6 Acknowledgement

This paper is built on the work of many IBM colleagues. We would like to thank Francis Parr for providing the architecture leadership, Ron Ambrosio and Mark Yao for the insights on the Intelligent Grid, Johnathan Reason and John Dorn for the WSN-based Railroads application, John Del Pizzo for the Pharmaceutical Track and Trace, and many others who contributed to the design and implementation of the architecture and solution. The DRIVE project was partially supported by the IT839 project from the Institute of Information Technology Assessment and the Ministry of Information and Communication of the Republic of Korea.

7 References

[1] D. Delen, B. Hardgrave, and R. Sharda, "RFID for Better Supply-Chain Management through Enhanced Information," Information Technology Research Institute, Sam M. Walton College of Business, University of Arkansas, January 2007.
[2] International Medical Products Anti-Counterfeit Taskforce (IMPACT), "Counterfeit Medicines: an update on estimates," 15 November 2006.
[3] EPC Information Services Specification, www.epcglobalinc.org, EPCglobal, 2007.
[4] R. Anderson, P. Chu, R. Oligney, and R. Smalley, "Smart Electric Grid of the Future: The Distributed Store-Gen Test Bed," white paper, 2004.
[5] K. Romer and F. Mattern, "The Design Space of Wireless Sensor Networks," IEEE Wireless Communications, vol. 11, no. 6, Dec 2004, pp. 54-61.
[6] H. Chen, P. Chou, N. Cohen, S. Duri, and C. Jung, "DRIVE: A tool for developing, deploying, and managing distributed sensor and actuator applications," to appear in IBM Systems Journal, Vol. 47, No. 2, May 2008.
[7] D. Luckham, The Power of Events. Boston: Addison-Wesley, 2002.
[8] N. H. Cohen and K. Kalleberg, "EventScript: an event-processing language based on regular expressions with actions," to appear in ACM SIGPLAN/SIGBED 2008 Conference on Languages, Compilers, and Tools for Embedded Systems, Tucson, Arizona, June 2008.
[9] OSGi Alliance, "About the OSGi Service Platform," technical whitepaper, www.osgi.org, June 2007.
[10] P. Dube, N. Halim, K. Karenos, M. Kim, Z. Liu, S. Parthasarathy, D. Pendarakis and H. Yang, "Harmony: Holistic Messaging Middleware for Event-Driven Systems," to appear in IBM Systems Journal, Vol. 47, No. 2, May 2008.


An Environment Sensor Fusion Application on Smart


Building Skins
Kun-Cheng Tsai, Jing-Tian Sung, Ming-Hui Jin
Institute for Information Industry, Taiwan, R.O.C.
garytsai@iii.org.tw, today@iii.org.tw, jinmh@iii.org.tw

Abstract- This study proposes a smart control algorithm to naturally adjust the thermal quality of the environment according to the interior and exterior environmental factors and the behavior of the inhabitants. To design an appropriate smart control algorithm for smart buildings in subtropical zones, this study carefully investigated the thermal effect of smart-skin facades of some office buildings in Taiwan to learn the correlation between the thermal parameters and the requirements of the inhabitants. To evaluate and further improve the proposed algorithm, an experimental building with two experimental zones was constructed. The experimental results show that the proposed algorithm significantly improves thermal comfort and reduces power consumption by coordinating the smart-skin equipment and the air conditioners.

I. INTRODUCTION

Due to the demand for better quality residences and lower power consumption, considerable attention has been paid in recent years to the construction of intelligent buildings [1,2]. These buildings are equipped with the latest scientific technologies and therefore provide smarter and more environment-friendly residences. Several studies have proposed the use of automatic control systems, signal transmission technology, and building superintendence systems to functionally implement intelligent buildings. In this study, wireless sensor networks (WSN) are used in the construction of an intelligent building system to implement these proposals. A WSN is composed of many sensor nodes and a few coordinator nodes [3,4]. The wireless sensors are distributed in the environment to collect information and communicate the data, making WSN convenient for providing ubiquitous services. With the progress of very large scale integrated circuits (VLSI), low-cost and versatile sensor nodes can be realized. For the intelligent building, the sensors can be likened to the skin of the building and are used to actively sense the environment. The information collected from the sensor networks is used to make alterations to the intelligent building system. Intelligent buildings are equipped with sensor networks that are capable of regulating power consumption, interior air temperature, and communication of information, in addition to providing renewable power and enhancing home security, in order to realize a healthy and comfortable environment. Hence, the use of sensor networks is an important criterion when designing intelligent living spaces.

The organization of this paper is as follows. In the next section, the definition of thermal comfort is introduced. The control algorithm for an intelligent building equipped with sensor networks is presented in Section III. The experiment and performance analysis are given in Section IV. Finally, Section V concludes this study and outlines future work.

II. RELATED WORKS

The basic requirement of an intelligent building is to maintain a comfortable environment. This study uses a thermal comfort index called the predicted mean vote (PMV), which is defined in ISO 7730, as an evaluative factor [5].

Thermal Comfort Index, PMV

According to the definition in ISO 7730, in order to achieve a satisfactory thermal environment, it is necessary to ensure that the human body and the environmental systems are in thermodynamic equilibrium. The method proposed in this study is based on two factors, namely, the PMV and the Predicted Percentage of Dissatisfied (PPD) indices. The method proposed by Fanger, which is based on the thermal comfort index, provides an equation with the six factors that affect an individual's thermal comfort [8]. The PMV is defined as

PMV = (0.303 e^{-0.036 M} + 0.028) \{ (M - W) - 3.05 \times 10^{-3} [5733 - 6.99 (M - W) - P_a] - 0.42 [(M - W) - 58.15] - 1.7 \times 10^{-5} M (5867 - P_a) - 0.0014 M (34 - t_a) - 3.96 \times 10^{-8} f_{cl} [(t_{cl} + 273)^4 - (t_r + 273)^4] - f_{cl} h_c (t_{cl} - t_a) \},   (1)

where

f_{cl} = \begin{cases} 1.00 + 1.290 I_{cl}, & I_{cl} \le 0.078, \\ 1.05 + 0.645 I_{cl}, & I_{cl} > 0.078, \end{cases}   (2)
t_{cl} = 35.7 - 0.028 (M - W) - I_{cl} \{ 3.96 \times 10^{-8} f_{cl} [(t_{cl} + 273)^4 - (t_r + 273)^4] + f_{cl} h_c (t_{cl} - t_a) \},   (3)

h_c = \begin{cases} 2.38 (t_{cl} - t_a)^{0.25}, & 2.38 (t_{cl} - t_a)^{0.25} > 12.1 \sqrt{V_a}, \\ 12.1 \sqrt{V_a}, & 2.38 (t_{cl} - t_a)^{0.25} < 12.1 \sqrt{V_a}, \end{cases}   (4)

and the variables are defined as follows:

M: metabolic rate (W/m^2);
W: external work (W/m^2);
P_a: partial water vapor pressure (Pa);
t_a: air temperature (°C);
t_r: mean radiant temperature (°C);
f_{cl}: ratio of man's surface area while clothed to man's surface area while nude;
I_{cl}: thermal resistance of clothing (m^2·°C/W);
t_{cl}: surface temperature of clothing (°C);
h_c: convective heat transfer coefficient (W/m^2·°C);
V_a: relative air velocity (m/s).

In addition, the PPD is expressed as follows:

PPD = 100 - 95 \exp(-0.03353\, PMV^4 - 0.2179\, PMV^2).   (5)
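The following Python sketch transcribes Eqs. (1)-(5) directly, using a simple fixed-point iteration to resolve the mutual dependence of t_cl and h_c; the example input values at the bottom are illustrative only and are not measurements from this study.

# Sketch of Eqs. (1)-(5): compute PMV and PPD from the six comfort factors.
import math

def pmv_ppd(M, W, pa, ta, tr, va, Icl):
    fcl = 1.00 + 1.290 * Icl if Icl <= 0.078 else 1.05 + 0.645 * Icl   # Eq. (2)
    tcl = ta                                   # initial guess for clothing surface temp
    hc = 12.1 * math.sqrt(va)
    for _ in range(100):                       # fixed-point iteration for Eqs. (3)-(4)
        hc = max(2.38 * abs(tcl - ta) ** 0.25, 12.1 * math.sqrt(va))    # Eq. (4)
        tcl_new = (35.7 - 0.028 * (M - W)
                   - Icl * (3.96e-8 * fcl * ((tcl + 273) ** 4 - (tr + 273) ** 4)
                            + fcl * hc * (tcl - ta)))                   # Eq. (3)
        if abs(tcl_new - tcl) < 1e-4:
            tcl = tcl_new
            break
        tcl = 0.5 * (tcl + tcl_new)            # damped update for stability
    load = (M - W
            - 3.05e-3 * (5733 - 6.99 * (M - W) - pa)
            - 0.42 * (M - W - 58.15)
            - 1.7e-5 * M * (5867 - pa)
            - 0.0014 * M * (34 - ta)
            - 3.96e-8 * fcl * ((tcl + 273) ** 4 - (tr + 273) ** 4)
            - fcl * hc * (tcl - ta))
    pmv = (0.303 * math.exp(-0.036 * M) + 0.028) * load                 # Eq. (1)
    ppd = 100 - 95 * math.exp(-0.03353 * pmv ** 4 - 0.2179 * pmv ** 2)  # Eq. (5)
    return pmv, ppd

# Example: light office activity, light clothing, 26 degC air and radiant temperature.
print(pmv_ppd(M=70, W=0, pa=1500, ta=26.0, tr=26.0, va=0.15, Icl=0.078))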
technologies, there is an increase in the power consumption
and a need for greater security [6,7,8].
There are seven levels of the PMV. The range of the PMV is
from –3 to +3, which represents cold and hot, respectively. The smart skin is used to sense a change in the environment
When the PMV is equal to zero, it implies a moderate and is composed of a sensor, computer, and actuator. The
temperature and a comfortable state. The relationship between main functions of the outer building are performed relative to
the value of the PMV and its meaning is shown in Table 1. its interior and exterior. The main function of the outer
building is to provide cover and air interchange. The outer
Table 1. building provides the functions of permeability, conveyance,
The value of PMV and its corresponding meaning interdicted reflection, absorbability, and maintainability,
Value of PMV Meaning
which are related to the surrounding natural environment. In
this study, we implement an outer building having these
+3 Hot
+2 Warm
desired characteristics by using the building equipped with
+1 slight warm
sensor networks.
0 Neutral
-1 slight cool
-2 Cool III. INTELLIGENT BUILDING WITH SENSOR NETWORKS
-2 Cold
This method is based on ISO 7730. In this method, we have
proposed the use of the PMV to improve thermal comfort and
Moreover, ISO 7730 suggests that the value of the PMV to achieve the goal of reduced power consumption. We have
should lie in the range of –0.5 to 0.5. When the value of the utilized the optimal control mechanism to control the basic
PMV is within this range, over 90% of people are satisfied equipment in the building based on the PMV values.
with the state of the indoor thermal environmental conditions. Equipment such as sun visors and systems that ensure
sufficient ambient light and ventilation in the building are
In accordance with the above mentioned suggestions, the used to improve thermal comfort and reduce power
American Society of Heating, Refrigerating, and Air- consumption. This control mechanism depends on parameters
conditioning Engineers (ASHRAE) has proposed a definition such as air temperature, mean radiant temperature, humidity,
air flow, activity level, and clothing. We built a practical

We built a practical smart-skin building to test this mechanism.

PMV and Environment

Information regarding the indoor condition of the building can be obtained by detecting the air temperature and humidity. The value of air flow is obtained from the data measured using practical equipment. In the absence of sunlight, the mean radiant temperature indoors is close to the mean air temperature. The values of activity level and clothing are obtained from the previously determined values of reference [8].

In this study, we applied the results obtained in [7]. A wireless sensor that can sense indoor environmental conditions and analyze the outdoor environment has been developed. Before the installation of the wireless sensors, we employ a wireless network planning tool to map the locations of the sensors. In addition, an RF site survey is also used to obtain better locations for installation. In the control platform, the network monitor ensures that each sensor node is in working condition.

For the installation of wireless sensors, it is necessary to consider the coverage and connectivity of the sensor networks. Furthermore, the relation between the locations of the sensor nodes and the character of the environment is also considered. The space is divided into different zones. In each zone, sensors are installed on the inner and outer walls to detect changes in the environment. The building uses a technology that controls the macroclimate environmental factors to reduce the power consumption without loss of comfort.

Intelligent Building and Control Algorithm

In order to accurately evaluate the quality of the environment and the efficiency of power consumption, the control method that refers only to the temperature is replaced with one that refers to the PMV factors. The natural air-regenerating device, air current velocity equipment, heat insulation device, air-conditioning equipment, and wet air-conditioning equipment are regarded as effective control modules that can be considered as PMV factors.

The parameters of the smart skin are set according to the requirements of basic illumination and the activity level of the people. For instance, when people are working indoors, the parameters of the smart skin are set based on the following principles.

1. The uniformity of illumination is greater than or equal to 0.7, where uniformity of illumination is the ratio of the minimum degree of illumination to the average degree of illumination.
2. The desired illumination is within 800 lux to 1200 lux.
3. Keeping in mind that people talk while they are working, the activity level is set to 1.4.

The equipment is installed in accordance with three principles, stated as follows. The upper section of the smart skin is responsible for illumination and ventilation. The middle section is designed as a sunshade and for penetrating vision. The design of the lower section focuses on ventilation and heat insulation.

Furthermore, we propose an algorithm to control the intelligent building system. This algorithm uses the PMV and the energy efficiency ratio (EER) to evaluate the efficiency of the equipment. The efficiency of improved comfort for a piece of equipment can be expressed as

\rho_{EP} = \frac{\Delta PMV}{\Delta p},   (6)

where \Delta PMV is the variation of the PMV and \Delta p is the power consumption of the equipment.

The smart-skin system of the outer building operates using the environmental information collected by the sensor networks. In order to improve the environmental comfort factors, the algorithm is described as follows (a control-loop sketch is given after the list):

1. Equations (1) and (4) are used to evaluate the PMV and PPD, thereby obtaining information about the environmental factors influencing the indoor and outdoor surroundings.
2. The influence of the environment is considered when the equipment is being operated. In addition, \rho_{EP} of each piece of equipment is also evaluated.
3. The equipment with the best performance, namely, the greatest value of \rho_{EP}, is chosen to start before the other equipment.
4. After the equipment is switched on, the sensor networks constantly check the environmental conditions to determine whether or not comfortable interior conditions have been achieved. If the PMV is out of the comfortable range, go to step 3 until the PMV satisfies the demand of comfort. Otherwise, the equipment is turned off.
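A minimal sketch of this loop, with hypothetical equipment names and \rho_{EP} values (in practice these would be measured as in Eq. (6)), is given below.

# Sketch of steps 1-4: evaluate PMV, rank idle equipment by comfort-per-watt
# (rho_EP), switch on the best device, and repeat until PMV is in the comfort band.
COMFORT_BAND = (-0.5, 0.5)          # ISO 7730 recommended PMV range

def control_step(measure_pmv, equipment):
    """equipment: dict name -> {"rho_ep": expected dPMV per watt, "on": bool}"""
    pmv = measure_pmv()
    if COMFORT_BAND[0] <= pmv <= COMFORT_BAND[1]:
        for dev in equipment.values():      # comfortable: switch everything off
            dev["on"] = False
        return pmv
    # choose the idle device with the largest comfort improvement per watt
    candidates = [(name, dev) for name, dev in equipment.items() if not dev["on"]]
    if candidates:
        name, dev = max(candidates, key=lambda kv: kv[1]["rho_ep"])
        dev["on"] = True
    return pmv

equipment = {"vent_shutter": {"rho_ep": 0.020, "on": False},
             "extractor_fan": {"rho_ep": 0.008, "on": False},
             "air_conditioner": {"rho_ep": 0.002, "on": False}}
control_step(lambda: 1.3, equipment)     # too warm: vent_shutter is switched on first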
IV. EXPERIMENT

In this study, the performance of the system is evaluated by building a practical house in the north of Taiwan. In the experimental house, the equipment installed in the smart-skin facades includes windows to bring in air and light, a shutter with double glass, a vent hole with a shutter, and an extractor fan. The house is shown in Figure 1. The interior of the house is separated into two areas, experimental and contrastive. Illuminating equipment, air conditioners, and dehumidifiers with identical specifications are installed in both areas. The two areas, including their equipment, are independent of each other. These two areas are compared based on their practical performances.

[Figure 1. The appearance of the practical intelligent building.]

The procedure for the experiments is described as follows:

1. First, the number of people indoors and the characteristics of the area are defined.
2. Next, a comparison of the performance of the equipment and the environmental conditions is carried out.
3. (a) In the experimental area, an intelligent area with air conditioners is set up. (b) In the contrastive area, traditional air conditioners are set up.
4. Information about the quality of the environment and power consumption is collected and analyzed.

In the experimental area, an intelligent area with air conditioners and sensor networks is set up. In the contrastive area, traditional air conditioners are used. Figures 2 and 3 depict the experimental results of the thermal comfort index PMV and the degree of expected discomfort in the two areas, respectively. In the contrastive area, the traditional air conditioners improve the indoor comfort but they cannot adapt to a change in the environment. On the other hand, the intelligent area automatically adapts to a change in the environment by instantly calculating the environmental factors and the desires of the inhabitants. Figure 3 shows that the smart skin combined with air conditioners keeps the degree of thermal discomfort (PPD) stable at 10%. Thus, it can be seen that the control method satisfies human demand and achieves a stable condition.

[Figure 2. Comparison of PMV between outdoors, the experimental area and the contrastive area (PMV versus time, with the request of the residents shown for reference).]

[Figure 3. Comparison of PPD between outdoors, the experimental area and the contrastive area (PPD in % versus time).]

The power consumption of the building is shown in Figure 4. Although the operating equipment satisfies the demands of the people, the power consumption of the intelligent building is far lower than that of traditional systems. Moreover, within the duration in which the degree of comfort is achieved, the intelligent area consistently consumes less power.

Figure 5 presents the analysis of the cumulative power consumption. The total power consumed by the intelligent area is far less than the power consumed by the area using traditional air conditioners. Therefore, a building with sensor networks and intelligent control provides a better performance with respect to comfort and power consumption than traditional buildings.

[Figure 4. Comparison of power consumption (kW) between the experimental and contrastive areas over time.]

[Figure 5. Comparison of cumulative power consumption (kWh) in the experimental and contrastive areas over time.]

V. CONCLUSION

In this study, we provide an algorithm to coordinate smart-skin equipment and air conditioners in order to improve thermal comfort. In addition, our proposed power consumption scheme ensures a better performance than the traditional methods.

The main distinguishing factor between our method and the traditional methods is that we have utilized the PMV model in our algorithm to coordinate the air conditioners and the devices in the smart skin to conduct light, provide shelter from sunlight, ensure heat insulation and ventilation, and control temperature. Further, the coordinated equipment is operated on the basis of the thermal comfort indices defined in ISO 7730 and the principles of power consumption to obtain a better performance.

VI. REFERENCES

[1] S. Sharples, V. Callaghan and G. Clarke, "A multi-agent architecture for intelligent building sensing and control," International Sensor Review Journal, pp. 1-8, May 1999.
[2] H. Hagras, V. Callaghan, M. Colley and G. Clarke, "A hierarchical fuzzy-genetic multi-agent architecture for intelligent buildings online learning, adaptation and control," Information Sciences (Elsevier), vol. 150, pp. 33-57, March 2003.
[3] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "A survey on wireless sensor networks," IEEE Communications Magazine, vol. 40, 2002, pp. 102-114.
[4] D. Culler, D. Estrin, and M. Srivastava, "Overview of sensor networks," IEEE Computer, vol. 37, 2004, pp. 41-49.
[5] ISO, International Standard 7730, Moderate thermal environments - Determination of the PMV and PPD indices and specification of the conditions of thermal comfort, 2nd ed. Geneva: International Standards Organization, 1994.
[6] T.-P. Ku, "A study on the improvement of exterior wall construction by the double walls system," thesis, National Cheng Kung University, 2003.
[7] The Institute for Information Industry, III ZigBee Advanced Platform (iZAP), http://zigbee.iii.org.tw.
[8] P. O. Fanger, "Improvement of human comfort and resulting effects on working capacity," Biometeorology (II): 31-41, 1972.


Automated Management of Assets based on RFID


Shengguang Meng (1), Dickson K.W. Chiu (2), Liu Wenyin (1), Xuxiang Chen (3)
(1) Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave., KLN, Hong Kong
(2) Senior Member, IEEE, Dickson Computer Systems, 7 Victory Avenue, Kowloon, Hong Kong
(3) Zhuhai Branch, China Mobile Group, Zhuhai, Guang Dong Province, PR China
Email: shenmeng@cityu.edu.hk, dicksonchiu@ieee.org, csliuwy@cityu.edu.hk

Abstract

With the maturing of Radio Frequency Identification (RFID) technology, automated management of assets (particularly mobile ones) in an enterprise using RFID becomes practical. This paper proposes an RFID-based Asset Management System (RAMS) and details how to maintain the whole life-cycle of assets from their acquisition, transfer, maintenance, and audit taking, to retirement. The proposed system also displays spatial information of assets on an electronic geographical map using WebGIS technology. Further, asset security is improved by automated notification of asset movement and malfunction alarms through SMS. Preliminary experiments show that the proposed system can improve the availability of assets and reduce the asset management cost dramatically.

1. Introduction

Asset management has long been considered an important issue in enterprise organizations. However, automated asset management has still been inadequately supported by existing information systems. Assets are often not managed individually. Information about location, status, and usage is often inaccurate or lacking. This may cause delays in industrial operations, inefficient use or excess inventory of costly assets, and may even lead to asset damage or loss.

In order to solve these problems, we propose an RFID-based Asset Management System (RAMS), augmented with Web-based Geographical Information System (WebGIS) and Short Message System (SMS) technologies. RFID is used to automate the management of assets that are to be controlled in real time, including their acquisition, transfer, audit taking, maintenance, and retirement. SMS is used for automated transfer of asset movement information in a distributed environment where the deployment of a cable network is inappropriate. The employment of the WebGIS technique facilitates the dynamic visual representation of the spatial information of the asset distribution on an electronic map. We carry out preliminary experiments based on a real application of enterprise asset management over ten months to investigate its efficiency and feasibility. The results of the experiments show good user experience and a significant decrease in the cost of asset management.

The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 introduces the architecture of our RAMS, which combines RFID with WebGIS and SMS technologies, while Section 4 details the functions for automated asset management. Preliminary experiments with a real application are presented in Section 5 for evaluation of the system. We conclude our work in Section 6.

2. Related Work

RFID is a technology that provides non-contact, non-line-of-sight wireless data communication between devices in a system. RFID has been widely used in the fields of manufacturing, traffic, logistics, etc. In particular, several pieces of work have applied RFID techniques to asset management. Mark et al. [1] propose an RFID-based tagging and tracking method that supports total asset visibility for the military. Barber and Tsibertzopoulos [2] investigate the features of the new Ultra-high Frequency (UHF) RFID standard and their implications for asset management. Of these features, tag memory, security techniques, operational modes, communication methods, global applicability, and other expected implementation-specific improvements are covered. The improvements that the features bring to asset management are also mentioned. Hakim et al. [3] employ a passive RFID mechanism in Hartford Hospital, Connecticut, for asset tracking. The technique is employed to monitor Telemetry Transmitters (TT) when they pass through certain key spots in the hospital in order to prevent asset loss.

WebGIS is a new technology which combines the Internet and GIS. End users can search and analyze GIS data intuitively on the Internet using browsers. The advantages of WebGIS are visual data that supports decision making for users, the coexistence of words and images, and visualization.
The adoption of WebGIS provides a new perspective on the management method of asset management systems. For example, Luo et al. [4] propose a framework to provide a new model for WebGIS services in a network environment. This framework consists of three layers: a user layer, an application layer and a WebGIS service layer. The user layer supports users in accessing WebGIS services via different network environments. The application layer represents WebGIS applications constructed by different organizations or companies, which integrate GIS services to solve domain-oriented problems. The WebGIS service layer provides different GIS services with a multi-agent system, in which WebGIS services are implemented as agents to match the needs of different users and applications.
SMS is now very popular and has been employed in many fields, such as remote monitoring, data collection, and wireless alarms, due to its low cost, high speed, and high reliability. For example, Kwon et al. [5] develop a blood glucose management system using the Internet and SMS as a new tool for communication between health care providers and patients.
Existing RFID-based asset management or tracking systems transfer data using the Internet or a local area network. However, those systems are usually unsuitable for assets with the following characteristics: (a) distributed in different locations, (b) located in a limited area, (c) no network environment is available, or it is costly to establish one. For example, among the assets in the base stations of mobile communication companies, some are located on top of buildings, while others are located on mountainsides and mountaintops. Hence, we also propose the use of WebGIS and SMS technologies for such situations.

3. System Design and Implementation

The proposed asset management method is mainly for enterprises with extensive and dispersed distributions of assets, such as in the electricity and communications industries. Figure 1 shows the architecture of our RAMS. Figure 2 summarizes its modular composition, which consists of four main subsystems: the Asset Automatic-Identification Subsystem, the Application Server, the Database Server, and the Asset Automatic Management Subsystem.

Asset Identification Subsystem (AIS) - This subsystem consists of three basic components: a RFID reader, an antenna that is installed at the entrance of the base station, and RFID tags that are attached to the surface of the assets. The RFID reader emits radio-frequency signals using the antenna for communication with the RFID-enabled tags. A GSM module is integrated into the RFID reader for sending and receiving SMS. When assets enter or leave the base station, the RFID reader can read their data from the tags attached to them. The asset data is sent to an application server in real time by SMS. The communication between the AIS and the application server ends when a confirmation SMS is received from the application server.

Figure 1. Architecture of the RAMS

Figure 2. Modules of the RAMS

Application Server - This subsystem consists of a RFID component, an SMS service module, and a MapXtreme module. The RFID component is designed and implemented based on a middleware methodology. The program interfaces provided by the RFID component define a high-level application environment with good stability: the application software requires no modification when the RFID component is updated with new hardware or system software. The RFID component mainly accomplishes dynamic data exchange between the application server and the RFID readers. RFID readers submit asset data to the RFID component via GSM modems when assets leave or enter a base station. The RFID component returns a confirmation to the RFID reader after finishing the analysis and warehousing of the relevant data.
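To make the reader-to-server exchange above concrete, the following minimal Java sketch shows one way the RFID component could parse an asset-event SMS and return a confirmation. The message format, class names and in-memory store are illustrative assumptions, not the actual RAMS interfaces.

    import java.util.HashMap;
    import java.util.Map;

    /**
     * Minimal sketch of the RFID component's message handling, assuming a
     * hypothetical SMS payload of the form "TAG=<tagId>;EVT=<ENTER|LEAVE>".
     * The sender's phone number (from the reader's SIM card) identifies the reader.
     */
    public class RfidComponentSketch {

        /** One asset-movement event reported by a reader over SMS. */
        static class AssetEvent {
            final String readerPhone;   // identifies the base station's reader
            final String tagId;         // RFID tag attached to the asset
            final String eventType;     // ENTER or LEAVE
            final long timestamp;
            AssetEvent(String readerPhone, String tagId, String eventType, long timestamp) {
                this.readerPhone = readerPhone; this.tagId = tagId;
                this.eventType = eventType; this.timestamp = timestamp;
            }
        }

        /** Stands in for the database; maps a tag id to its last known event. */
        private final Map<String, AssetEvent> lastEventByTag = new HashMap<>();

        /** Parses an incoming SMS, stores the event, and returns the confirmation text. */
        public String handleIncomingSms(String senderPhone, String body) {
            String tagId = null, eventType = null;
            for (String field : body.split(";")) {
                String[] kv = field.split("=", 2);
                if (kv.length != 2) continue;
                if (kv[0].equals("TAG")) tagId = kv[1];
                if (kv[0].equals("EVT")) eventType = kv[1];
            }
            if (tagId == null || eventType == null) {
                return "NACK";                        // malformed message: ask the reader to resend
            }
            AssetEvent event = new AssetEvent(senderPhone, tagId, eventType,
                    System.currentTimeMillis());
            lastEventByTag.put(tagId, event);         // "warehousing" of the data
            return "ACK " + tagId;                    // confirmation SMS sent back to the reader
        }

        public static void main(String[] args) {
            RfidComponentSketch component = new RfidComponentSketch();
            System.out.println(component.handleIncomingSms("+8613800000001", "TAG=A0017;EVT=LEAVE"));
        }
    }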

The SMS service module includes the following tial to manage the attribute data and spatial data in
functions: (a) receiving SMS messages from RFID the system.
readers in base stations, (b) sending confirmation Asset Management Subsystem (AMS) - The
messages to RFID readers, (c) sending SMS mes- AMS works in a browser/server (B/S) mode. The
sages to RFID readers of base stations to inspect main modules of this subsystem are introduced as
their working status, and (d) analyzing from which follows.
RFID reader an incoming SMS comes. In our Basic Information Management - This module
RAMS, a Subscriber Identity Module (SIM) card is takes the charge of maintaining the entire database,
applied to the GSM module, which is integrated into such as asset data, base station data, SMS data, and
a RFID reader that holds personal identity informa- so on. System administrators can add or modify asset
tion, cell phone number, phone book, text messages, information, such as asset name, asset number, asset
and other relevant data. The SMS source can be type, manufacturer, production time, specification,
identified according to the cell phone number of the purchase time. The relations between RFID tags and
SIM card. In this way, the RFID readers can be man- asset entities are established in this module.
aged individually. The content of the message is Automated Assets Management - This module util-
converted to an appropriate format for storage in the izes RFID technology and wireless data transfer
database. technology to implement asset data automated gath-
The MapXtreme module employs the WebGIS ering and processing so that the asset management
software from MapInfo MapXtreme to provide a can be automated. The detailed implementation is
visual representation of the relevant spatial informa- further discussed in the next subsection.
tion of the assets (http://www.mapinfo.com). When a WebGIS Module - This module utilizes Web-based
user query of spatial data is sent to the Web server in GIS technology to support the integration with geo-
the system from the user’s browser, the Web server
graphic information. Users can search and display
transfers the related parameters to the MapXtreme
spatial information of assets on the electronic geo-
module. MapXtreme queries the corresponding map
graphical map, such as basic information, location
data from the GIS database and the asset data from
information, statistic information, usage status, trans-
the basic database. The result is then returned to the
Web server and finally rendered on the user’s ferring record, maintenance record, and malfunction
browser. information. The system adopts MapInfo MapX-
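The spatial-query flow just described (browser request, parameter hand-off to the map module, a combined lookup against the GIS and basic databases, and a rendered result) can be sketched as follows. The interfaces below are hypothetical stand-ins, not MapXtreme APIs.

    import java.util.List;
    import java.util.Map;

    /**
     * Sketch of the spatial-query flow: the web server passes request parameters
     * to a map module, which joins map data from the GIS database with asset
     * records from the basic database and returns the result for rendering.
     * All types here are illustrative stand-ins, not MapXtreme classes.
     */
    public class SpatialQuerySketch {

        interface GisDatabase   { byte[] mapTile(String regionId); }
        interface BasicDatabase { List<String> assetsInRegion(String regionId); }

        static class MapModule {
            private final GisDatabase gis;
            private final BasicDatabase basic;
            MapModule(GisDatabase gis, BasicDatabase basic) { this.gis = gis; this.basic = basic; }

            /** Combines the map layer and the asset layer for one region. */
            Map<String, Object> query(String regionId) {
                byte[] tile = gis.mapTile(regionId);                   // spatial data
                List<String> assets = basic.assetsInRegion(regionId);  // attribute data
                return Map.of("tile", tile, "assets", assets);         // returned to the web server
            }
        }

        public static void main(String[] args) {
            MapModule module = new MapModule(
                    regionId -> new byte[0],                           // stub GIS database
                    regionId -> List.of("A0017", "A0042"));            // stub basic database
            System.out.println(module.query("base-station-12").get("assets"));
        }
    }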
Database Management System (DMS) - All treme to implement the WebGIS functions. MapX-
data that are used in the RAMS are stored in a rela- treme can read and display map data of GIS data-
tional database, including basic data, spatial data, bases. On the client-side, it supports many functions
and attribute data so that data storage and manage- on the electronic geographical map, such as map
ment is separated from the application logic. We search, map visualization, layer control, zoom in/out,
separate the data into two databases, namely the Ba- roam, creation of the special subject map, etc.
sic Database and the GIS Database. The Basic Data- SMS Notifications and Alarms Module - The func-
base stores the information about base stations, as- tions of this module can be customized in the AMS.
sets, and SMS messages. The GIS Database is a For example, the system can be configured so that a
dataset to describe the geographical characteristics of SMS is sent to the administrators of a base station
an area that is to be controlled by the system. Be- whenever an asset moves in or out of it. It is also
sides regular attribute data of geographical factors, it possible for the system to send an alarm SMS to the
also has a large volume of spatial data which de- corresponding maintainers of a base station to repair
scribes the spatial distribution of geographical fac- it as soon as possible when a RFID reader or a GSM
tors. So, such information is separated from the Ba- module is not usable.
sic Database. Other modules include user management module,
There is an indivisible relation between the attrib- operation log management module, statistics report
ute data and the spatial data. The MapXtreme mod- module, print module and so on.
ule can adopt either file strategy or spatial data strat-
egy to manage GIS data. We adopt the spatial data
strategy to manage the GIS data because this enables 4. Automated Asset Management Functions
integrated management of spatial data and attribute
data. This can take advantage of commercial data-
4.1. Asset Acquisition Process (AAP)
base systems to implement distributed structures and
support multi-user queries. We adopt the OracleSpa- The Asset Acquisition Process (AAP) adds an as-
set into the RAMS at a base station. We combine

RFID technology with WebGIS technology in this 2) The AMS records the location and the time of the
process. The steps of the AAP are described as fol- movement of the transferred asset and changes the
lows. asset state to a transit state. It then sends a SMS to
1) The information of a new asset is entered into notify the administrators of A.
database with the basic Info Management Mod- 3) When the asset arrives at B, the RFID reader at B
ule. The status of asset is marked as newcomer. reads the asset data from its tag. The information
2) When the new asset is moved into base station A, of the incoming asset is then uploaded to the ap-
the RFID reader there reads the data from the tag plication server through the GSM module.
attached on the asset. The new information is 4) The AMS stores the location and time of the
then uploaded to the application server through movement of the incoming asset and changes the
the GSM module. The application server then in- asset state back to a normal state.
serts the new asset data into the database system. 5) The AMS records all the data of the incoming
3) The AMS stores the location and coming time of asset to the base station B and finalizes the asset
the new asset automatically and changes the asset transfer accordingly.
state to normal. 6) After finishing an asset transfer, the administra-
tors of both station A and B will get SMS mes-
sages about the asset transfer.
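Underlying the acquisition and transfer steps above, and the maintenance and retirement processes in Sections 4.3 and 4.4, is a small set of asset states (newcomer, normal, transit, maintenance, retired). A minimal Java sketch of that state machine is given below; the transition table and class names are illustrative assumptions rather than the actual RAMS implementation.

    import java.util.EnumMap;
    import java.util.EnumSet;
    import java.util.Map;
    import java.util.Set;

    /**
     * Sketch of the asset states named in Sections 4.1-4.4 and the transitions
     * the processes perform. The transition table is an assumption for illustration.
     */
    public class AssetLifecycleSketch {

        enum State { NEWCOMER, NORMAL, TRANSIT, MAINTENANCE, RETIRED }

        private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
        static {
            ALLOWED.put(State.NEWCOMER, EnumSet.of(State.NORMAL));                 // AAP: new asset arrives
            ALLOWED.put(State.NORMAL, EnumSet.of(State.TRANSIT));                  // ATP, AMP, ARP all start here
            ALLOWED.put(State.TRANSIT, EnumSet.of(State.NORMAL, State.MAINTENANCE, State.RETIRED));
            ALLOWED.put(State.MAINTENANCE, EnumSet.of(State.NORMAL));              // repaired asset returns to its station
            ALLOWED.put(State.RETIRED, EnumSet.noneOf(State.class));               // terminal state
        }

        /** Returns the new state, or throws if the requested change is not allowed. */
        static State transition(State current, State next) {
            if (!ALLOWED.getOrDefault(current, EnumSet.noneOf(State.class)).contains(next)) {
                throw new IllegalStateException(current + " -> " + next + " is not a valid change");
            }
            return next;
        }

        public static void main(String[] args) {
            State s = State.NEWCOMER;
            s = transition(s, State.NORMAL);    // asset arrives at base station A
            s = transition(s, State.TRANSIT);   // asset leaves A for B
            s = transition(s, State.NORMAL);    // asset arrives at B
            System.out.println("final state: " + s);
        }
    }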

Figure 3. WebGIS Interface of Asset Acquisition


The asset acquisition information can also be Figure 4. WebGIS Interface of Asset Transfer
shown on the WebGIS interface. As shown in Figure In Figure 4, a big blue point icon represents a base
3, a big blue point icon represents a base station. station. Users can see the information window of this
Users can see the information window of this base base station by clicking on it. The window shows
station by clicking on it. The window shows basic basic data and statistic data of the base station, such
and statistical data of the base station, such as longi- as longitude (LON), latitude (LAT), amount of trans-
tude (LON), latitude (LAT), amount of new assets on ferred assets on a particular day, and so on. A big
a particular day, and so on. A big blue point with a blue point with an outgoing edge in the figure indi-
plus sign edge in the figure indicates that new assets cates that new assets have left this base station re-
have been added into this base station recently. Users cently. Users can get the information of the trans-
can get the information of the incoming asset by ferred asset by clicking on this icon. A big blue point
clicking on this icon. After clicking, the icon will be with an incoming edge indicates that assets have
hidden in the map until new assets are added into the entered this base station recently. Users can get the
base station in the future. information of the incoming asset by clicking on this
4.2. Asset Transfer Process (ATP) icon. After clicking, the icon will be hidden in the
map until new transfers occur in the future, or if any
The Asset Transfer Process (ATP) keep track of other information is changed in this station.
an asset when it is transferred from a base station to
another. We combine RFID technology with 4.3. Asset Maintenance Process (AMP)
WebGIS and SMS technology in this process. The The Asset Maintenance Process (AMP) keeps
steps of the ATP are described as follows. track of an asset when it is sent for maintenance. The
1) Before an asset is transferred from base station A steps of the AMP are described as follows.
to B, the RFID reader at base station A reads the 1) When an abnormal asset is moved out of base
asset data from the tag attached on the asset. Such station A for maintenance, the RFID reader there
information is then uploaded to the application reads the asset data from the tag attached on the
server through the GSM module.

asset. Such information is then uploaded to the ble 1 lists the different auditing status of assets. Fig-
application server through the GSM module. ure 5 shows a screen shot of the auditing result.
2) The application server stores the removal time of
the abnormal asset and changes the asset state to Table 1. Auditing Status of Assets
a transit state. Auditing Status Description
3) When the abnormal asset arrives at the mainte- Normal Asset exists in correct base sta-
nance station, the RFID reader there reads the as- tion
set data from its tag. Such information is then up- Losing Asset does not exist in any base
loaded to the application server through the GSM station
module. Surplus Asset exists in database, but
4) The application server stores the time of the in- cannot been audited.
coming asset and changes the asset state to a Mistake Asset exists in incorrect base
maintenance state. station
5) After the asset is repaired and returned to the
base station A, the RFID reader there reads the
asset data from the tag attached on the asset.
Such information is then uploaded to the applica-
tion server through the GSM module.
6) The application server changes the asset state
back to a normal state and finalizes the mainte-
nance record in the database.
4.4. Asset Retirement Process (ARP)
The Asset Retirement Process (ARP) records re-
tired asset when the asset becomes no longer useful. Figure 5. Auditing Result of Assets
The steps of the ARP are described as follows.
1) When an asset is moved out of base station A 5. Experiment and Evaluation
because of retirement, the RFID reader there
reads the asset data from the tag attached on the We collaborate with a subsidiary company of the
asset. Such information then uploaded to the ap- China Mobile Communications Corporation for the
implementation requirements on a real application of
plication server through the GSM module。
asset management in mobile base stations. A proto-
2) The application server changes the asset state to a type of the RAMS has been developed based on the
transit state. J2EE architecture, with which we carry out the
3) When the retired asset arrives at the retirement evaluation experiments. The system has been de-
station, the RFID reader there reads the asset data ployed to 30 base stations of the company distributed
from its tag. Such information is then uploaded to in different districts of a city.
the application server through the GSM module Only a browser is needed to be installed on a cli-
to confirm the retirement. ent-side machine. The application server operates
4) The application server changes the asset state to a under the operating system of Windows 2003 Server
retired state and creates a retired record of asset with MapXtreme for Java installed for supporting
in database. WebGIS functions. Tomcat 5.0 and Sun JDK-
1.5.0.07 is used for supporting JSP programs. An
4.5. Asset Auditing Process (AAuP) Oracle9i database is employed to manage basic and
The RAMS schedules the Asset Auditing Process GIS data.
(AAuP) according to the auditing plan pre-defined The asset information of the 30 base stations is
by the administrators. Auditing data such as tag, lo- first entered into the database system. Then, the
cation, amount, etc., are collected with RFID hand- automated asset identification system starts to gather
held readers according to the auditing schedule. Then the asset data of the 30 base stations, such as the as-
the auditing data are uploaded to the application set movement information, the working status of
server through the GSM module. The AAuP then RFID readers and GSM modules, and so on. Those
compares the auditing data with asset data in data- data are submitted to the application server using
base. The AAuP records the auditing status of the SMS. The application server processes the SMS
assets and produces the auditing analysis report. Ta- messages and updates the database system accord-
ingly. Those data can be queried and represented on
the WebGIS interface. The system can also send

notification SMS messages to the administrators of statistics based on spatial distribution and other re-
the base station about the movement and status of the lated information of the assets using electronic geo-
assets. Figure 6 shows a partial list of alarm and noti- graphical map. It greatly enhances the system ma-
fication SMS messages of the system. neuverability and user experience. (c) Our RAMS
utilizes SMS technology to inform administrators of
base stations about asset transfer and other alerts in
real time. The monitoring of assets becomes highly
effective and efficient.
Future enhancements of the system include the
following. The current speed of sending SMS is
somewhat slow. It sometimes takes about four min-
utes to receive a SMS message. We are investigating
the possibilities of improving the speed of wireless
data transfer so that the system processing speed can
be improved. In addition, we are working on an addi-
Figure 6. Alarm SMS Message List tional subsystem to enhance the system security.

The experiment results show that the process time Acknowledgement


of asset movements is usually less than 4 seconds The work described in this paper was fully supported
and all data are updated correctly in the entire proc- by a grant from the Research Grants Council of the
ess. Compared with traditional asset management Hong Kong SAR, China [Project No. CityU
methods, our system not only enhances user experi- 117907], the China Semantic Grid Research Plan
ence by using WebGIS and SMS technologies, but
(National Grand Fundamental Research 973 Pro-
also greatly save the cost of asset management. The
gram, Project no. 2003CB317002).
system has been up running for ten months at the
time of writing. It benefits the mobile communica- References
tion company in the following aspects: (a) reducing
[1] Buckner, M., Crutcher, R., Moore, M., Whitus, B.:
the cost of purchasing repetitive assets by 42%; (b)
Miclog rfid tag program enables total asset visibility.
reducing asset audit cost by 75%; (c) reducing asset Proc. MILCOM 2002, vol.2, pp. 422-1426, 7-10 Oct.
lost rate by 45%; (d) increasing the asset usage rate 2002
by 35 %; (e) prolonging the average asset usage life [2] Barber, G., Tsibertzopoulos, E.: An analysis of using
by 10%. epcglobal class-1 generation-2 rfid technology for
wireless asset management. Proc. IEEE Military Com-
6. Conclusion and Future Work munications Conference, 2005 (MILCOM 2005), vol.
1, pp. 245-251, 17-20 Oct. 2005
In this paper, we propose a novel system for asset [3] Hakim, H., Renouf, R., Enderle, J.: Passive rfid asset
management in enterprises using RFID, WebGIS, monitoring system in hospital environments. Proc.
and SMS technologies. The main merits of the 32nd Annual Northeast Bioengineering Conference,
RAMS are as follows. (a) The proposed method can pp. 217-218, 2006
maintain the whole life-cycle of assets from their
acquisition, transfer, maintenance, retirement, audit framework for multi-user/application oriented webgis
services. Proc.. International Conference on Computer
taking by integrating RFID technology with GSM
Networks and Mobile Computing, pp. 151-156, 2001
wireless module. Manual intervention has been much [5] Kwon, HS., et al.: Development of web-based diabetic
reduced, avoiding data distortion problems due to patient management system using short message ser-
factitious factors during collecting and inputting the vice (SMS). Diabetes Res Clin Pract. 66(suppl
asset data. (b) Our RAMS utilizes Web-based GIS 1):S133–7, Dec. 2004
technology to enable users to search and view the

ZigBee source route technology in home application

Yao-Ting Wu
Networks and Multimedia Institute, Institute for Information Industry
7FL., No.133, Sec. 4, Minsheng E. Rd., Taipei, 105, Taiwan, R.O.C.
solarph@nmi.iii.org.tw

Abstract

ZigBee is a new short-range wireless technology. ZigBee devices have parent-child relationships, and these devices can construct a large mesh network. Most devices in a ZigBee network can relay packets, repair routes, etc. If a ZigBee device wants to send packets to another device, the source node needs to find a route to the destination. In an application where data are sent to the concentrator from the sensing nodes and the concentrator tends to control the sensing nodes, the concentrator can record the path from each sensing node to itself and send data to the sensing node directly along the recorded path. The relaying nodes forward packets using the recorded node information.
In this paper, we discuss the use of the source route technology to find the best route and to send packets directly according to the recorded node information. In a large mesh network, finding a good route is very important: it can reduce the probability of packet collision and data loss. This technology is suitable for applications in which the user monitors the sensing area at the concentrator and sends control packets from the concentrator to the sensing nodes.

I. Introduction

ZigBee is a very low-cost, very low-power wireless communication standard defined by the ZigBee Alliance. It is based on IEEE 802.15.4, features low power consumption, supports mesh networking, and is a flourishing technology in wireless sensor networks. ZigBee uses DSSS (Direct Sequence Spread Spectrum) technology and works in the 868 MHz (Europe), 915 MHz (North America and Australia) and 2.4 GHz (available worldwide) ISM bands with data rates of up to 20 kbps, 40 kbps and 250 kbps respectively. ZigBee technology is used in many application areas, such as consumer electronics, home and building automation, industrial control, sensor networks, home care, etc. [1] [2].
The ZigBee stack architecture has three main layers, depicted in Figure 1: the medium access control (MAC) layer, the network (NWK) layer, and the application support sub-layer (APS). The MAC layer is defined by the IEEE 802.15.4-2003 standard; the NWK and APS layers are defined by the ZigBee Alliance. ZigBee supports several network topologies, such as star, tree, and mesh networks.
Source route is one of the ZigBee-specific technologies. In a large network, if a device wants to send packets to another, it needs to find a route, and this action may make the network noisy. Source routing is used to reduce the noise in the network. In a home application, we have one

concentrator, maybe a PC or a control panel, to monitor all the sensing areas at home. If some abnormal situation happens, the user or the PC can take appropriate actions. Take the following example: at the concentrator, we can set a comfortable temperature for the living room. The temperature sensor in the living room gathers the room temperature and sends the detected value to the concentrator. If the temperature is higher than the set value, the concentrator sends control packets to the sensing node to control the air conditioning and bring the temperature back to the set value. In the past, if the temperature sensor sent data to the concentrator, it needed to find a route to the concentrator, and if the concentrator sent control packets to the sensing node, it needed to find a route as well. In this application, finding a route makes the network noisy, and it may cause data loss or packet collisions. Using source route technology, we only need to find a route from the sensing nodes to the concentrator, and do not need to find a route from the concentrator to the sensing node. This reduces the noise in the network.

Figure 1. ZigBee stack architecture (PHY, MAC, NWK, security services, APS, application framework, and ZigBee device object (ZDO)).

II. Method

A. Many-to-one route discovery

Before the sensing nodes send packets to the concentrator, we need to find routes to the concentrator. In order to find new routes to the concentrator, the concentrator should send a broadcast packet, a route request, to let all sensing nodes know where the concentrator is and to create a route. When the sensing nodes receive the route request, they should relay this packet by broadcast and create a routing table, depicted in Table 1, and a route discovery table, depicted in Table 2. The route discovery table records the route request ID, source address, sender address, and cost. The route request ID is taken from the route request packet, the source address is set to the concentrator, the sender address is set to the address of the node which sent the route request, and the cost is computed according to the link quality. The routing table records the destination address, the status, and the next-hop address. The destination address is set to the concentrator, the next-hop address is set to the sender address of the route discovery table, and if a route has been found, the status is set to success. If a node receives the route request and the corresponding route discovery table entry already exists, the node compares the cost in the table; if the received cost is less than the existing one, the sender address and cost in the route discovery table are replaced by the new values, and the next-hop address in the routing table is changed accordingly.

Table 1. Routing Table Entry
Field Name              Size
Destination address     2 octets
Status                  3 bits
No route cache          1 bit
Many-to-one             1 bit
Route record required   1 bit
Group ID flag           1 bit
Next-hop address        2 octets

Table 2. Route Discovery Table Entry
Field Name          Size
Route request ID    1 octet
Source address      2 octets
Sender address      2 octets
Forward cost        1 octet
Residual cost       1 octet
Expiration time     2 octets
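A minimal sketch of how a node could process a many-to-one route request against these two tables is given below (plain Java for readability; the class names and the single accumulated-cost value are simplifying assumptions, not ZigBee stack code).

    import java.util.HashMap;
    import java.util.Map;

    /**
     * Sketch of many-to-one route request processing at a sensing node.
     * The node keeps one route discovery entry per request id and a routing table
     * entry whose next hop always points back toward the concentrator.
     */
    public class RouteRequestSketch {

        static class DiscoveryEntry { int senderAddr; int cost; }
        static class RoutingEntry   { int destAddr; int nextHop; boolean success; }

        private final Map<Integer, DiscoveryEntry> discoveryTable = new HashMap<>(); // key: request id
        private final Map<Integer, RoutingEntry>   routingTable   = new HashMap<>(); // key: destination

        /**
         * Called when a route request broadcast arrives.
         * Returns true if the request improved the route and should be rebroadcast.
         */
        boolean onRouteRequest(int requestId, int concentratorAddr, int senderAddr, int cost) {
            DiscoveryEntry entry = discoveryTable.get(requestId);
            if (entry != null && cost >= entry.cost) {
                return false;                       // existing route is at least as good
            }
            if (entry == null) {
                entry = new DiscoveryEntry();
                discoveryTable.put(requestId, entry);
            }
            entry.senderAddr = senderAddr;          // neighbour we heard the cheaper request from
            entry.cost = cost;

            RoutingEntry route = routingTable.computeIfAbsent(concentratorAddr, k -> new RoutingEntry());
            route.destAddr = concentratorAddr;
            route.nextHop = senderAddr;             // packets to the concentrator go via this neighbour
            route.success = true;
            return true;                            // rebroadcast so nodes further away learn the route
        }

        public static void main(String[] args) {
            RouteRequestSketch node = new RouteRequestSketch();
            System.out.println(node.onRouteRequest(1, 0x0000, 0x1A2B, 7)); // first copy: accept
            System.out.println(node.onRouteRequest(1, 0x0000, 0x3C4D, 9)); // worse cost: ignore
        }
    }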

B. Route record

When a sensing node sends a packet to the concentrator, the packet is forwarded following the next-hop address field in the routing table, and the relaying nodes put their own addresses into the packet. After the concentrator receives the packet, all relaying nodes are known and the information is stored in a route record table, depicted in Table 3, in the concentrator; the addresses are listed in order from the concentrator to the sensing node.
Before the concentrator sends control packets, it should search the route record table first. If the sensing node is listed in the route record table, the concentrator does not need to find a route to the sensing node; it only needs to send the control packets following the relay list of the route record table. The relay list is placed in the source route packet payload, depicted in Figure 2. When a relaying node receives the source route packet, it compares its own address with the addresses in the relay list of the packet. If its own address is found in the relay list, it updates the relay index field; this field is initialized by the concentrator to one less than the relay count of the packet and is decremented by 1 by each relaying node. If its own address is not found in the relay list, the node discards the packet and takes no further action. Only if the controlled node is not recorded in the table does the concentrator have to find a route to the sensing node.

Table 3. Route Record Table Entry
Field Name        Size
Network address   2 octets
Relay Count       1 octet
Path              Variable

Figure 2. Source route packet format: ZigBee header, relay count (1 octet), relay index (1 octet), relay list (variable length).
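The relay behaviour described above can be summarised in a short sketch (plain Java, with an assumed in-memory packet representation and a list ordering chosen for the example; the real ZigBee frame layout in Figure 2 differs in detail).

    import java.util.List;

    /**
     * Sketch of source-route relaying. The concentrator builds the relay list and
     * sets relayIndex to relayCount - 1; each relay that finds its own address at
     * the indexed position decrements the index and forwards toward the next hop.
     */
    public class SourceRouteRelaySketch {

        static class SourceRoutePacket {
            final List<Integer> relayList;   // addresses taken from the route record table
            int relayIndex;                  // position of the relay expected to handle the packet next
            SourceRoutePacket(List<Integer> relayList) {
                this.relayList = relayList;
                this.relayIndex = relayList.size() - 1;   // initialised by the concentrator
            }
        }

        /**
         * Processing at one relaying node.
         * Returns the address to forward to, or -1 if the packet must be dropped.
         */
        static int relay(int myAddress, int destination, SourceRoutePacket pkt) {
            if (pkt.relayIndex < 0 || pkt.relayList.get(pkt.relayIndex) != myAddress) {
                return -1;                                // not addressed to us: discard, no further action
            }
            pkt.relayIndex--;                             // hand over to the next relay in the list
            return (pkt.relayIndex >= 0) ? pkt.relayList.get(pkt.relayIndex) : destination;
        }

        public static void main(String[] args) {
            // For this sketch, index relayCount-1 holds the first relay after the
            // concentrator and index 0 holds the relay next to the sensing node.
            SourceRoutePacket pkt = new SourceRoutePacket(List.of(0x0003, 0x0002, 0x0001));
            int next = relay(0x0001, 0x00AA, pkt);        // first relay forwards toward 0x0002
            System.out.println("forward to 0x" + Integer.toHexString(next));
        }
    }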


III. Result and conclusion

In most applications, we have one concentrator and many sensing nodes. The user can monitor the sensing area and control it from the concentrator side. In the past, much time was spent finding routes, and this made the network noisy. Using source route technology, we only need to perform route discovery once and record the route information in the concentrator. In this way, a route can be found easily, and there is no need to search for routes frequently. This is useful for reducing network noise and the probability of data loss and packet collisions.

References
[1] D. Culler, D. Estrin, and M. Srivastava, "Overview of Sensor Networks," Computer, August 2004.

[2] S. Fukunaga, T. Tagawa, K. Fukui, K. Tanimoto, and H. Kanno, "Development of Ubiquitous Sensor Network," Oki Technical Review, Issue 200, Vol. 71, No. 4, October 2004.
[3] ZigBee Alliance, http://www.zigbee.org/.
[4] IEEE 802.15.4: "IEEE 802.15 WPAN™ Task Group 4 (TG4)," http://www.ieee802.org.
[5] http://zigbee.iii.org.tw.

Two Practical Considerations of Beacon Deployment for


Ultrasound-Based Indoor Localization Systems
Chun-Chieh Hsiao*ᵃ,ᵈ and Polly Huangᵃ,ᵇ,ᶜ
ᵃ Department of Electrical Engineering
ᵇ Graduate Institute of Communication Engineering
ᶜ Graduate Institute of Networking and Multimedia
National Taiwan University (NTU), Taipei 10617, Taiwan, R.O.C.
ᵈ Department of Computer Information and Network Engineering
Lunghwa University of Science and Technology (LHU), Taoyuan 33306, Taiwan, R.O.C.
*d94921013@ntu.edu.tw

Abstract environment. There are two common assumptions


when beacon deployment algorithms are devised: (1) a
In this paper, two practical considerations of beacon spherical radio model for the sensing range and (2) free
deployment for ultrasound-based indoor localization space for the deployment environment.
systems are presented. In an indoor environment, Replacing the spherical model with a realistic radio
beacons are deployed incrementally on the ceiling in model for the ultrasound sensor and adopting a
order to localize the listeners in between the ceiling heuristic based on [4], we find the deployment plan
and the floor. We first propose a water-drop shaped turns out very different in both the number of beacon
radio model for the beacon to replace commonly required, as well as the placement, in a case study
assumed spherical radio model in order to provide true where room BL-621 in our department building is
coverage of the listeners. Obstacles in the indoor measured. .
environment are then considered to take into account Furthermore, we take into consideration the
the line-of-sight restrictions and thus to enable furniture and other obstacles in BL-621. The
practical beacon deployment. Although when taking deployment algorithm is extended and the resulting
into these two considerations, the number of deployed deployment plan is even more different. Depending on
beacons required tends to be high, it would otherwise the height of the tracking target, the number of beacon
be impossible to provide true coverage of the listeners required to cover the room decreases with the height.
in the indoor environment utilizing ultrasound-based In particular, the amount decreases from 157 to 67 as
localization. the height increases from 0 cm to 150 cm.
This paper is organized as follows. Section 2
1. Introduction describes the base-line algorithm used for beacon
deployment. The extension of the algorithm by
Indoor localization has become more important as utilizing the realistic ultrasound model is demonstrated
many applications such as asset tracking, virtual in Section 3. Section 4 describes how the algorithm is
presence, and smart space [1] [2] [3] emerge. There are further extended to take into account the obstacles in
two types of radios commonly used for localization in the environment. Finally, in Section 5, we conclude
indoor environments, namely radio frequency (RF) and our work.
ultrasound. Though RF is the more popular media for
localization in indoor environments, its localization 2. Ultrasound-based Indoor Localization
precision is usually poor. Ultrasound can provide
higher localization precision. It, however, suffers from 2.1 Localization Using Trilateration
the line-of-sight restrictions. Beacon placement thus
becomes challenging for ultrasound-based indoor In this paper, sensor nodes with beacons in an
localization in environments with various obstacles indoor environment are deployed on the ceiling to
such as desks and chairs. provide accurate 3D positions of the sensor nodes
Another essential problem in beacon placement is (listeners) that can move around in the indoor
coverage, i.e., localization of every target in the environment in between the floor and the ceiling. A

beacon on the ceiling can simultaneously transmit RF and ultrasound signals to the listener as in [5]. Since the transmitted RF signal propagates at the speed of light in air (≈ 3×10⁸ m/s), it arrives almost immediately at the listener. The transmitted ultrasound signal arrives at the listener much later, since it has to propagate in the air at the speed of sound (≈ 346 m/s at a temperature of 25°C in dry air). The difference between the arrival times of the transmitted RF and ultrasound signals is thus approximately equal to the propagation time of the transmitted ultrasound signal. The listener can then derive its distance from the beacon by multiplying the ultrasound propagation time by the speed of sound. Since the location of the beacon is known when the beacon is deployed, the listener must be located on the surface of a sphere centered at the beacon with a radius equal to the derived distance from the beacon to the listener. In 2D, distances from three beacons with known locations are needed to derive the listener's location, as shown in Figure 1. The process in which the location of a listener is determined by utilizing the beacons' known locations and the distances from the beacons to the listener is called trilateration.

Figure 1. Demonstration of trilateration.

In 3D, the actual position of the listener can be derived with distances from at least four beacons. However, since the listener can only be located below the ceiling, distances from three beacons are sufficient to calculate the listener's location.

2.2 Heuristic for Beacon Deployment

In this paper, our goal is to deploy the least number of beacons needed to locate the listeners, so as to achieve minimal system cost. For the deployment of beacons, we used an algorithm similar to the one used in [4]. The pseudo code of our algorithm is shown in Figure 2.

Algorithm 1. Beacon Deployment Algorithm
    for each candidate beacon
        calculate its initial Np
    set the coverage of each potential listener to 0
    put all candidate beacons in a list Lcb
    until the coverage of all potential listeners is at least 3
        sort Lcb by Np in non-increasing order
        select and delete the beacon at the head of the sorted Lcb
        update the coverage of the potential listeners that are covered by the selected beacon
        update the Np values in Lcb
    output the deployed positions of the selected beacons

Figure 2. Beacon deployment algorithm.

In the beacon deployment algorithm, the beacons are deployed progressively to provide at least 3-coverage of each listener. By 3-coverage of a listener, we mean that the listener is within radio range of three beacons, or in other words, the listener is covered by three beacons. In the literature [4] [6], a spherical model is usually used to represent the radio range of a radio transmitter (a beacon in our case). All the candidate beacons are assumed to be located at the grid points of a predefined grid, and all the potential listeners are assumed to be located at the grid points of another predefined grid. For each candidate beacon, its initial popularity Np, the number of potential listeners that are within the candidate beacon's radio range and are not yet 3-covered, is first calculated. Then, in each step the candidate beacon with the greatest Np is selected. The deployment (selection) process proceeds progressively and terminates when all listeners are at least 3-covered.
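A compact illustration of the greedy selection loop in Algorithm 1 is sketched below, using a plain spherical in-range test as a stand-in for the radio model; the grids, the 5-m range and the class names are illustrative assumptions.

    import java.util.ArrayList;
    import java.util.List;

    /**
     * Sketch of the greedy beacon-selection loop of Algorithm 1.
     * Beacons are chosen until every listener is covered by at least three of them.
     */
    public class BeaconDeploymentSketch {

        static class Point {
            final double x, y, z;
            Point(double x, double y, double z) { this.x = x; this.y = y; this.z = z; }
            double dist(Point o) {
                return Math.sqrt((x-o.x)*(x-o.x) + (y-o.y)*(y-o.y) + (z-o.z)*(z-o.z));
            }
        }

        /** Stand-in radio model: spherical range; a water-drop model would replace this test. */
        static boolean covers(Point beacon, Point listener, double range) {
            return beacon.dist(listener) <= range;
        }

        static List<Point> deploy(List<Point> candidates, List<Point> listeners, double range) {
            int[] coverage = new int[listeners.size()];           // chosen beacons covering each listener
            List<Point> chosen = new ArrayList<>();
            List<Point> remaining = new ArrayList<>(candidates);

            while (!allCovered(coverage) && !remaining.isEmpty()) {
                Point best = null;
                int bestNp = -1;
                for (Point b : remaining) {                       // Np = listeners in range, not yet 3-covered
                    int np = 0;
                    for (int i = 0; i < listeners.size(); i++) {
                        if (coverage[i] < 3 && covers(b, listeners.get(i), range)) np++;
                    }
                    if (np > bestNp) { bestNp = np; best = b; }
                }
                if (bestNp <= 0) break;                           // no remaining candidate helps
                remaining.remove(best);
                chosen.add(best);
                for (int i = 0; i < listeners.size(); i++) {      // update coverage counts
                    if (covers(best, listeners.get(i), range)) coverage[i]++;
                }
            }
            return chosen;
        }

        static boolean allCovered(int[] coverage) {
            for (int c : coverage) if (c < 3) return false;
            return true;
        }

        public static void main(String[] args) {
            // Tiny example: 2 m x 2 m room, listeners on the floor, candidates on the ceiling.
            List<Point> listeners = new ArrayList<>();
            List<Point> candidates = new ArrayList<>();
            for (double x = 0; x <= 2.0; x += 0.5)
                for (double y = 0; y <= 2.0; y += 0.5) {
                    listeners.add(new Point(x, y, 0.0));
                    candidates.add(new Point(x, y, 2.54));
                }
            System.out.println("beacons chosen: " + deploy(candidates, listeners, 5.0).size());
        }
    }

Replacing covers() with a water-drop-shaped range test gives the modified algorithm discussed in Section 3.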

Figure 3. A photo of our lab BL-621 in the NTU BL Building.

2.3 Case Study

As an example, we take one of our labs, BL-621, shown in Figure 3, as the test environment. The dimensions of this lab are 648 cm × 569 cm × 254 cm. In this example, the potential listeners are assumed to be located on a 10-cm grid on the floor and the candidate beacons on a 30×30 grid on the

ceiling. With a spherical model of 5-m radio range, the is thus proposed in the next section to overcome this
deployed beacons would be as shown in Figure 4. problem.

Figure 4. Only three beacons are needed for our lab


using a spherical model with radio range of 5 meters.

Figure 7. Directional model of ultrasound transmission.

Figure 8. 20 beacons are deployed in three different


Figure 5. The uncovered listeners are shown as green, runs using modified beacon deployment algorithm with
yellow and red respectively when they are covered by water-drop model to provide 3-coverage for each
only 2, 1, and 0 beacons respectively. listener.
3. Water-Drop Radiation Model

3.1 Field Measurement of Water-Drop


Radiation Model

In the field measurement, initially an ultrasound


beacon and an ultrasound listener are placed facing
each other with a distance D as shown in Figure 6 (a)
to check whether the listener can derive the distance D.
The listener is then moved such that the distance
Figure 6. (a) Initial placement of an ultrasound beacon between the beacon and the listener remains the same
and an ultrasound listener facing each other with a however the angle between the normal direction of the
distance D (b) Move the ultrasound listener such that PCB and the line connecting the beacon and the
the distance remains D however the angle between the listener is Θ as shown in Figure 6 (b). The same check
normal direction of the PCB and the line connecting the as in (a) is performed to see whether the listener can
beacon and the listener is Θ (c) The equivalent of (b) derive the distance D. For the ease of measurement, the
with rotation only. configuration of the listener and the beacon in Figure 6
(c) is used instead of (b). In Figure 6 (c), both the
However after deploying the beacons in the lab, we beacon and the listener only need to rotate with an
find that many listeners cannot be localized because angle Θ and do not have to move any more. By the
they are actually not covered by at least three beacons. results of our field measurement, we have observed
The uncovered listeners are as shown in Figure 5. The that the ultrasound transmission is highly directional as
percentage of total covered listeners is only about 40%. shown in Figure 7. Only when the listener is inside the
The reason why around 60% of listeners are not water-drop shaped area enclosed by the solid line can
covered is that the spherical model we use is too the listener derive distance measurement value from
idealistic. The radio range of the ultrasound beacons is the beacon.
rather far from being spherical. A more realistic model

3.2 Case Study Revisited

Due to the directional model of ultrasound


transmission, it is thus necessary to modify the
radiation model used in our beacon deployment
algorithm to utilize water-drop shaped model instead
of spherical model. Algorithm 1 in section 2.1 is
modified accordingly. In particular, for the calculation
and the update of popularity Np, we substitute the
spherical model with the water-drop model based on
our measurement. By applying the modified beacon
deployment algorithm in the same test environment as Figure 9. Number of deployed beacons using beacon
deployment algorithms with water-drop model and
in section 2.1, the number of beacons required
sphere model versus size of spaces with squared area.
increases to 20, as opposed to the 3 beacons derived
using the naïve spherical model. Three sampled beacon
deployments with 20 beacons using realistic water-
drop shaped model are as shown in Figure 8. In all
three deployments, every listener can be covered by at
least three beacons and can thus be localized. When a
more realistic radiation is considered, a significantly
higher number of beacons are needed to provide true
full coverage of localization to the listeners. In
addition, when we compare the deployments in Figure
8 and Figure 4, there is a substantial amount of
beacons needed to cover the corners and borders in the
space.

3.3 Scaling Property Figure 10. Ultrasound signal blocked by a piece of


furniture. Both listeners A and B cannot receive
We examine further with the beacon deployment ultrasound signal from the beacon due to line-of-sight
constraint.
required for spaces with different sizes. This will
allow us to observe how the beacon requirement scales 4. Obstacles
to the size of the deployment space. Considering
square-shape space, we start from a space with 5 4.1 Problems with Obstacles
meters in width, 5 meters in length, and 2.54 meters in
height. The area of the space is then increased from 25 When beacon deployment is derived using the
square meters incrementally to 250 square meters. The algorithm in section 3 with water-drop model, another
deployment results are as shown in Figure 9 in which problem emerges when the beacons are deployed in a
results of the original beacon deployment algorithm are room with furniture such as desks and chairs inside.
included for comparison. As can be observed in Figure Since the ultrasound wave from a beacon can be
9, using either the spherical or the water-drop model, blocked by the furniture (obstacles) due to line-of-sight
the number of beacons required increases as the size of constraints as shown in Figure 10, some listeners can
the deployment space. However, the growth rate in the no longer receive the ultrasonic signal from this beacon
water-drop case is significantly higher than that in the even though the listeners are within radio range of this
spherical case. This demonstrates the limitation of the beacon. If too many ultrasonic signals are blocked by
spherical model. Deployment algorithms assuming the the obstacles in the environment, it is easy for a
spherical model not only under-estimate the number of listener to receive less than 3 ultrasonic signals such
beacons required in practice. The discrepancy grows that this listener can no longer be localizable.
larger as the size of the deployment space.
The squared area in the smallest space is of 25 m2 4.2 Solutions
and is formed by a 5 m by 5 m square. The squared
area in the largest space is of 250 m2 and is formed by To overcome the problem of ultrasonic signals
a 15.8 m by 15.8 m square. being blocked by the obstacles in the environment, the
obstacles in the environment should be taken into
consideration in the deployment planning process, i.e.

the coverage area should be calculated with the
obstacles in place when selecting the candidate
beacons. To take into account the effect of the
obstacles, a 3D modeling tool, Autodesk 3ds Max, is
used to construct the 3D model of the indoor
environment along with the obstacles. Take one of our
labs in NTU, BL-621 as shown in Figure 3, as an
example; its constructed 3D model is as shown in
Figure 11.

Figure 12. Beacon deployment with consideration of


Figure 11. 3D model of one of our labs in NTU, BL-621 obstacles and with listeners at different height: (a) 157
constructed using Autodesk 3ds Max. beacons for listeners at height of 0 cm (on the floor); (b)
131 beacons for listeners at height of 45 cm (around
After the 3D model of the indoor environment with knees); (c) 121 beacons for listeners at height of 100
cm (around waist); (d) 67 beacons for listeners at height
obstacles is built, it is used to calculate the list of
of 150 cm (around shoulder).
listeners that are actually covered by each candidate
beacon. The list of listeners visible to a candidate
beacon is referred to as the beacon’s covering list
(Lcov). The Lcov’s are used to calculate Np in the beacon
deployment algorithm.
To obtain Lcov’s, all the listeners within the radio
range of each candidate beacon are first included in the Figure 13. (a) Furniture in the center area only (b)
Furniture in the surrounding area only.
candidate beacon’s Lcov. Then a line is drawn between
the beacon and each of the listeners in the beacon’s
Lcov’s to test whether the line intersects any obstacle in 4.3 Case Study Revisited
the indoor environment. If the line intersects with any
of the obstacles, the ultrasound signals transmitted by The results of beacon deployment considering
the beacon will be blocked by that obstacle and the obstacles in the same indoor environment used in
listener will no longer be able to receive the ultrasound section 2 and 3, namely BL-621, and considering
signals. The corresponding listener will be thus listeners at different height are as shown in Figure 12.
excluded from the candidate beacon’s Lcov. After all the As can be seen in Figure 12, the number of beacons is
listeners are tested, only the listeners remain in the Lcov significantly larger than the one in Figure 8 in order to
will receive the ultrasound signals from the candidate compensate the ultrasound signals that are blocked by
beacon and can thus derive the distance from the the obstacles. The number of deployed beacons for
beacon. A further check is necessary to see whether listeners at height of 0 cm, 45 cm, 100 cm, and 150 cm
each of all listeners appears in at least three Lcov’s. If are 127, 111, 101, and 47 more than the one needed for
any listener fails to appear in at least three Lcov’s, it the same environment without obstacles in Figure 8
means that this listener cannot receive signals from at respectively. The percentages of the potential listeners
least three candidate beacons and thus cannot be at height of 0 cm, 45 cm, 100 cm, and 150 cm that are
localized. This listener will then be marked and deleted not covered by the corresponding deployed beacons are
from all the Lcov’s. After the preprocessing stage, the 6%, 26%, 3% and 3% respectively. The reason of the
selection of the candidate beacons can proceed as huge number of uncovered listeners at height of 45 cm
before until all the unmarked listeners can receive is that when the listener is right under the tables at that
signals from at least three beacons. height like the listener A in Figure 10 it would be very
difficult if not impossible for any beacon’s ultrasound
signal to reach that listener. Consequently, that listener
is unlikely to be localized since it cannot receive
ultrasound signals from more than three beacons.

space. Taking into consideration of the obstacles, we
find that (1) many more beacons are needed to localize
the listeners since the ultrasound signals can be
blocked by the obstacles in the environment, (2) the
requirement of beacons grows about inversely
proportionally with the height of the listeners since
when the listener is higher the less effect the obstacles
will have to block the ultrasound signal from beacons,
and (3) the placement of the obstacles have significant
Figure 14. Beacon deployment when furniture is
present in (a) the center area (b) the surrounding area.
impact on the beacon deployment especially when the
listener is lower. The cost of deploying for a smart,
interactive room is approximately US$ 7,000 in order
to locate listeners at height of 150 cm, around the
height of one’s shoulder, in our lab BL-621 with full
set of furniture when an about US$ 100 sensor node is
used as a beacon or a listener.

References
[1] S-W Lee, S-Y Cheng, Jane Y-J Hsu, Polly Huang,
C-W You, “Emergency Care Management with
Location-Aware Services,” In Proceedings of
Figure 15. Number of deployed beacons for different Pervasive Health Conference and Workshops,
sets of furniture. 2006, Nov. 29, 2006-Dec. 1, 2006, pp.1–6.
[2] N Talukder, , S I Ahamed, R M Abid, “Smart
4.4 Effect of Furniture Distribution Tracker: Light Weight Infrastructure-less Assets
Tracking solution for Ubiquitous Computing
Two more indoor environments with different Environment,” In Proceedings of Fourth Annual
setting of furniture as shown in Figure 13 are examined International Conference on Mobile and
further to observe how the location of the furniture in Ubiquitous Systems: Networking & Services, 2007
the environment affects the beacon deployment plan. (MobiQuitous 2007), 6-10 Aug. 2007, pp. 1–8.
The deployments for environments with two different [3] J Koch, J Wettach, E Bloch, K Berns, “Indoor
furniture settings when the listeners are at the height of Localization of Humans, Objects, and mobile
0 cm are shown in Figure 14. As can be seen in Figure Robots with RFID Infrastructure,” In Proceedings
14, when only the centered/surrounding furniture is of 7th International Conference on Hybrid
present, the beacons tend to be deployed toward the Intelligent Systems, 2007 (HIS 2007), 17-19 Sept.
surrounding/center to avoid the obstacles. The numbers 2007, pp. 271 – 276.
of deployed beacons for different sets of furniture [4] R Huang, , G V Zaruba, “Beacon Deployment for
including full set of furniture are as shown in Figure Sensor Network Localization,” In Proceedings of
15. As can be observed in Figure 15, the obstacles in IEEE Wireless Communications and Networking
the center area affect more than the ones in the Conference (WCNC 2007), March 2007, pp.3188-
surrounding area especially when the height of the 3193.
listeners is low. [5] N. Priyantha, A. Chakraborty, and H. Balakrishnan,
“The cricket location support system,” In
5. Conclusions Proceedings of the Sixth Annual ACM/IEEE
International Conference on Mobile Computing
Radio model and environmental obstacles play a and Networking (MobiCom 2000), pp. 32-43,
critical role in determining how many and where to Boston, Massachusetts, USA, August 6-11 2000
place the beacons for localization. Using a realistic [6] J Deng, Y.S. Han, P-N Chen, and P.K. Varshney,
radio model, we discover that (1) more beacons are “Optimal Transmission Range for Wireless Ad Hoc
needed to provide full coverage of the listeners since Networks Based on Energy Efficiency,” IEEE
the covering space of an ultrasound transmitter is Transactions on Communications, Volume 55,
smaller than a perfect sphere and (2) the number of Issue 9, pp. 1772 – 1782, Sept. 2007.
required beacons grows linearly with the size of the

A measurement-based method for improving data center energy efficiency

Hendrik F. Hamann
IBM T.J Watson Research Center
hendrikh@us.ibm.com

Abstract

In this paper we discuss a novel sensing and measurement technology which allows visualizing the 3D temperature and heat distributions in a data center or other complex thermal system. Specifically, we leverage the high spatial resolution of IBM's Mobile Measurement Technology (MMT). We show that substantial improvements in energy efficiency can be obtained by the MMT method.

1. Introduction

The increase in the power consumption of IT equipment is imposing significant strains on data centers (DCs) and the supporting infrastructure [1]. Today, DC managers are struggling to balance maintaining the inlet temperatures for each piece of equipment (within the required specifications) against saving energy as well as the capital cost associated with cooling. Consequently, a much better understanding and management of the thermal and energy distribution within the DC is essential, in particular considering that up to 50% of the total DC energy consumption can be governed by the cooling.
Today, extensive finite element models are deployed to understand hotspots and cooling solutions in data centers. These models solve a complicated set of coupled partial differential equations (i.e., the Navier-Stokes and heat conduction/convection equations) using computational fluid dynamics (CFD) techniques. While it is very difficult to accurately represent the physical dimensions and conditions of the DC, such computations are often difficult and lengthy due to the multi-dimensionality of the modeling problem with its various length and time scales. Furthermore, current model predictions have not been sufficiently validated [x]. Ironically, and despite the importance of DC thermal management, there has been a complete lack of detailed, experimental data, which would readily identify current problems and help to mitigate the cooling challenges of today's DCs.

2. Mobile Measurement Technology

In order to address these shortcomings and the paucity of detailed temperature data in DCs, a simple experimental technique was recently introduced, which allows for fast 3D mapping throughout a large-scale computing system [2]. As illustrated in Fig.1, a sensor network is mounted at regular distances on a mobile measurement cart, which is readily moved through an actual data center while temperature data is being logged from all the sensors as a function of the X, Y and Z coordinates. The sensor distance is 8'' in the lateral direction and 12'' in the z-direction. We show here a "T"-shaped cart which allows for thermal measurements above the racks. The cart design is modular and can be adjusted to different heights and shapes. By repeating the measurements throughout the DC and combining them with the cart's position and orientation, we can digitize the three-dimensional DC space.
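As a simple illustration of this digitization step, the sketch below bins the logged (x, y, z, temperature) samples onto a regular grid; the cell size and class names are assumptions for the example, not the MMT implementation.

    import java.util.ArrayList;
    import java.util.List;

    /**
     * Sketch: average the cart's temperature samples into a regular 3D grid so the
     * data center volume can be visualised as a temperature field.
     */
    public class ThermalGridSketch {

        static class Sample {
            final double x, y, z, tempC;
            Sample(double x, double y, double z, double tempC) {
                this.x = x; this.y = y; this.z = z; this.tempC = tempC;
            }
        }

        static double[][][] gridAverage(List<Sample> samples, double cell, int nx, int ny, int nz) {
            double[][][] sum = new double[nx][ny][nz];
            int[][][] count = new int[nx][ny][nz];
            for (Sample s : samples) {
                int i = Math.min(nx - 1, (int) (s.x / cell));
                int j = Math.min(ny - 1, (int) (s.y / cell));
                int k = Math.min(nz - 1, (int) (s.z / cell));
                sum[i][j][k] += s.tempC;
                count[i][j][k]++;
            }
            for (int i = 0; i < nx; i++)
                for (int j = 0; j < ny; j++)
                    for (int k = 0; k < nz; k++)
                        sum[i][j][k] = count[i][j][k] == 0 ? Double.NaN
                                                           : sum[i][j][k] / count[i][j][k];
            return sum;   // NaN marks cells the cart never visited
        }

        public static void main(String[] args) {
            List<Sample> samples = new ArrayList<>();
            samples.add(new Sample(0.1, 0.1, 0.3, 18.5));   // cold-aisle inlet
            samples.add(new Sample(0.1, 0.1, 1.7, 27.0));   // warmer air near the top of the rack
            double[][][] grid = gridAverage(samples, 0.2032, 5, 5, 10);  // roughly 8-inch cells
            System.out.println(grid[0][0][1]);
        }
    }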
Fig.1: Illustration of the MMT in a raised floor data center.

For example, we can construct a 3-D temperature or heat map of an actual data center. Fig.2 shows

For example, we can construct a 3-D temperature or heat map of an actual data center. Fig. 2 shows exemplary data from a raised-floor DC, which was scanned using the MMT.

Fig. 2: 3D thermal distribution of a cold aisle, which was scanned using the MMT.

Specifically, in Fig. 2 we show results from a "cold aisle", with the server inlets being fed cold air from the perforated tiles. The backside of the servers is referred to as the "hot aisle", where the heated air from the servers is exhausted. As is evident from Fig. 2, not all servers within the racks are supplied with the same inlet temperature. In fact, a closer look reveals that a few servers suffer from so-called hotspots. These hotspots are often responsible for major energy inefficiencies, because the (energy-costly) solution to such hotspots is to compensate for them by choosing a lower chiller set point, which drives up the energy costs for the DC disproportionately. It has been shown that it is useful to distinguish between temperature variations between racks (inter-rack or horizontal hotspots) and across the same rack (intra-rack or vertical hotspots) [2].

Fig. 3: 2D MMT-measured thermal distribution of a DC at 5.5 feet height.

In the following, these hotspot effects are quantified. In Fig. 3 we show an MMT-measured 2D temperature distribution at a height of 5.5 feet for twelve server racks (grey boxes). Each rack contains 12 servers (144 servers in total). In Fig. 4 we have plotted the inlet temperatures of these 12 racks from Fig. 3 as a function of height. For reference, we show the server rack next to the plot (drawn to scale). It is evident from Fig. 4 that there are enormous temperature variations across a rack, starting at 15 °C for servers at the bottom of the rack and reaching up to 35 °C for servers at 5.5 feet height. Clearly, a variable-frequency fan within the server rack or better air flow provisioning could readily prevent these hotspots, improving the energy efficiency of the DC. Fig. 4 also shows large inter-rack hotspots. For example, racks #1, 6, 7 and 12 show much larger temperature variations than some of the other server racks. Evidently, these racks are located at the end of the aisle and are thus more prone to recirculation effects.

Fig. 4: MMT-measured inlet temperatures as a function of height for 12 server racks (see Fig. 3).

3. Conclusion

In summary, we have presented a simple experimental methodology for rapid 3D thermal mapping of large data centers. This new technique not only allows for benchmarking current data center models but also enables the rapid diagnosis of existing cooling problems within the data center. In this case study we showed hotspots due to intermixing between cold and hot air, which suggests that the current cooling scheme can be significantly improved.

Acknowledgement: HFH thanks the whole IBM MMT team for contributions.

4. References

[1] J. G. Koomey, "Estimating Total Power Consumption by Servers in the US and the World," report by the Lawrence Berkeley National Laboratory, February 2007.

[2] H. F. Hamann, J. Lacey, M. O'Boyle, R. Schmidt, and M. Iyengar, "Rapid 3D Thermal Imaging of Large-Scale Computing Facilities," IEEE CPMT, vol. xx, no. x, March 2008 (accepted); see also US patent application US20070032979A1.
Solution Templates Tool for Enterprise Business Applications Integration

Shiwa S. Fu, Jeaha Yang, Jim Laredo, Ying Huang, Henry Chang, Santhosh Kumaran, and Jen-Yao Chung
IBM T. J. Watson Research Center, Yorktown Heights, NY, U.S.A.
Email: {shiwa, jeaha, laredoj, Ying_Huang, hychang, sbk, jychung}@us.ibm.com

Yury Kosov
IBM Software Group, Burlingame, CA, U.S.A.
Email: kosov@us.ibm.com
Abstract

Today's enterprises must meet rapidly changing market demands while collaborating with a much broader range of trading partners to maximize supply chain efficiency and improve services to customers and their profit margin. To accomplish these business goals, their business application systems, such as enterprise ERP, must integrate and adapt rapidly to meet these business demands and collaborate with other business systems without major reimplementation effort. Our proposed approach addresses these problems by using a model-based approach of creating reusable solution templates that consist of off-the-shelf artifacts to be quickly assembled into a solution.

In this paper, we introduce a Solution Template Tool that simplifies the life cycle for creating an integration solution, through the flexible design and customization of solution templates and an interactive environment driven by wizards. By providing levels of abstraction, the proposed tool allows users to compose templates of a platform independent model without worrying about implementation details. The tool transforms the composed templates into a platform specific IT execution model and a deployable solution, thus easing the solution integration lifecycle. A solution template for UCCnet illustrates our study.

Index Terms: Solution Template, UCCnet, RFID, Platform Independent Model, Platform Specific Model.

1. Introduction

Today's enterprises must meet rapidly changing market demands while collaborating with a much broader range of trading partners to maximize supply chain efficiency, improving service to customers and their profit margin. To accomplish these business goals, their enterprise business application systems, such as ERP, must integrate together and adapt very rapidly to meet these business demands and collaborate with other business systems without major reimplementation effort. To seamlessly integrate these disparate systems, there is a need for an intermediate or middleware layer. Enterprise Application Integration (EAI) [1] emerged to fill this market niche. Though EAI systems generally provide many packaged off-the-shelf components that can shorten the middleware development cycle and ease subsequent maintenance efforts, expensive and time-consuming IT programming effort is still required to develop and maintain an EAI solution.

The Model-Driven Architecture has been advocated by the Object Management Group (OMG) [2]. The objective of the Model-Driven Architecture is to move the focus from programming to solution modeling. Our proposed tool employs a model-driven approach toward EAI solutions at a higher level of abstraction of solution composition, closer to the problem domain. The tool exhibits the following features: (1) levels of abstraction, (2) separation of concerns, and (3) reusable assets. By providing levels of human-friendly abstraction of composition and a development wizard, users can focus on developing the higher-level Platform Independent Model (PIM) while the tool transforms the PIM to a Platform Specific Model (PSM) that can be used to generate code. These abstractions also facilitate the participation of users with different skills at each stage, creating a separation of concerns that ties each skill to a different set of activities that are clearly defined and isolated from each other. To facilitate reusability we propose the use of Solution Templates [3]. The Solution Template, stored as an asset, is designed with customizable points to increase its flexibility by rapidly adapting to changes, and it acts as the unifying artifact
throughout the life cycle of the solution, accumulating the useful knowledge that is gathered from the development phase through the deployment phase. The tool is adapted as a WBI SE Tech Preview and is available from the IBM Developer Works website [4].

This paper is organized as follows. Section 2 describes the features of the Solution Template Tool, including the different roles in solution composition. To illustrate an application business integration scenario, we use UCCnet as an example of solution composition in Section 3. The composition of a Solution Template and the transformation of the PIM into the WBI SE runtime environment are introduced in Section 4. Section 5 presents the model transformation functions. Finally, the paper is summarized in Section 6.

2. Solution Template Tool

The Solution Template Tool is developed as an Eclipse-based plug-in for IBM WebSphere Business Integration Server Express (WBI SE) [5]. It consists of two major parts, the PIM Composer and the Reusable Asset Repository, as shown in Figure 1. The PIM Composer helps users to model and customize the Solution Template, navigate through the WBI SE tools, and realize the solution template in the WBI SE runtime. The Reusable Asset Repository stores solution templates and artifacts as reusable assets to ease future solution composition.

Figure 1. Proposed Solution Template Tool.

The tool maintains traceability between the solution template level artifacts and the platform dependent components in the runtime. For example, any property or interface change in the WBI SE components will trigger the corresponding artifacts in the Solution Template model to propagate the changes. Each artifact in the Solution Template will launch the associated WBI native editor to modify the corresponding component, which will then update the WBI runtime when realized.

Several users may participate in the creation of an integration solution at different stages, and each user's responsibility requires a different set of skills and has a specific mission in the process. There are three roles available in the tool: Template Creator, Solution Creator, and Component Builder. The Template Creator is usually a Subject Matter Expert with enough technical skill to abstract a given solution and create a corresponding template in a way that it can be used multiple times. The Template Creator also determines variability points and expresses them in the template. The Solution Creator makes use of templates created by the Template Creator. The Solution Creator is able to identify the right template candidate and may be able to perform the configuration tasks. He may also do some additional composition to complete the solution. The Component Builder is the most IT-skilled user of the three, understands the underlying technology, and is able to create new reusable components based on requirements. He is able to create flexible components with their customization points exposed as Points of Variability, to be described in Section 4.1.

The tool provides an interactive wizard (the PIM Composer) for the Template Creator to create a new template, or to locate and import a reusable template from the asset library. The available business description and requirements are gathered and matched against the Solution Template information to find the closest Solution Template to be used. Importing a reusable template is simpler, and thus reduces development time, compared to designing from scratch. The wizard also accelerates the learning curve by identifying dependencies and guiding the Template Creator to locate them. The tool follows a top-down design approach in which the Template Creator can create a high level design with the artifacts and then refine each artifact further.
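To make the notion of a template with customization points more concrete, the sketch below shows one possible in-memory representation; the class and field names are illustrative assumptions and do not reflect the tool's actual data model.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class PointOfVariability:
        name: str                     # e.g. "New Item Publication Request"
        condition: str = "optional"   # "mandatory", "optional", or "conditional"
        rule: str = ""                # rule expression evaluated at configuration time

    @dataclass
    class SolutionArtifact:
        name: str                     # a Collaboration or Connector
        kind: str                     # "collaboration" or "connector"
        ports: List[str] = field(default_factory=list)
        variability: List[PointOfVariability] = field(default_factory=list)

    @dataclass
    class SolutionTemplate:
        name: str
        artifacts: Dict[str, SolutionArtifact] = field(default_factory=dict)
        wires: List[tuple] = field(default_factory=list)   # (source port, target port)

        def add_artifact(self, artifact: SolutionArtifact) -> None:
            self.artifacts[artifact.name] = artifact

        def wire(self, source: str, target: str) -> None:
            # record a connection between two artifact ports
            self.wires.append((source, target))

In such a picture, a Template Creator would populate the structure, while a Solution Creator would only configure the points of variability.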
3. Case Study

The Solution Template Tool is developed with WBI SE as the initial target platform to demonstrate the solution lifecycle and reuse capabilities via integration with the asset library, which contains the reusable Solution Templates and artifacts. The prototype demonstration uses an example in which an SMB supplier publishes its product description and order information in the UCCnet catalog, as shown in Figure 2. UCCnet [6, 7] is a subsidiary of the Uniform Code Council (UCC) that uses industry standards to synchronize item information between trading partners. It is a third-party external exchange that provides product registry services to enable synchronization of item and location information, to reduce mis-shipments and returns, and to shorten the setup time for new products. It targets high-volume, low-margin, inefficient industries such as retail and grocery. While the EPC Network [8] offers dynamic product information specific to an individual item, such as expiration dates and shipping details held on the Radio Frequency Identification (RFID) [9] tag, UCCnet contains static attributes common to all products.

The solution (UCCnet_ItemSync) integrates the supplier's backend system (SAP) with the UCCnet hub (AS2 Server) for registration and validation of data. The product order information (via Item Data) from the supplier triggers the registration (new item publication request and registry catalogue item registration) of the product information in the UCCnet catalogue. The rest of the paper will reference this example when describing features and functionality.

Figure 2. UCCnet Item Synchronization.

In our example, two Connectors (for the SAP and AS2 servers) and one Collaboration (the business process to publish product ordering information) are the Solution Artifacts used in the template. A Solution Artifact exposes a web-service-like interface. For the Collaboration, a port in the Solution Templates Tool corresponds to the port in its corresponding WBI SE collaboration, and an operation corresponds to the verb supported by that port. For a Connector, a port in the Solution Templates Tool corresponds to a supported message format, and an operation corresponds to the verb supported by that message format. The Template Creator can simply wire them together according to the business logic. To ensure the message formats of both artifacts are interchangeable, the WBI map editor will be launched to mediate the conflict between two message formats during wiring.

4. Solution Composition and Transformation

A Solution Template is a reusable asset at a higher level of abstraction that can be transformed into a solution (or application) in a service oriented architecture environment. We use it for creating e-business integration solutions to reduce the complexity and the cost of creating the integration solution by providing more repeatable, cumulative and transferable knowledge obtained during the life cycle of the integration solution.

4.1. Solution Composition

A solution can be composed by capturing the reusable components (other templates or Solution Artifacts) and defining the relationships amongst its components according to the business requirements. Solution Artifacts can be service containers, usually representing process flows (Collaborations), adapters (Connectors), screen views, and other elements that can be reused [3]. These components can be created new or imported along with their definitions, platform requirements and performance characteristics. Components are then wired according to their business logic. Once created, a component becomes a reusable business service (template) that publishes or operates on business data. As shown in Figure 3, the UCCnet item synchronization artifacts illustrated in Figure 2 are composed as a Solution Template using the PIM Composer of the proposed Solution Template Tool.

Figure 3. Template Composition Using PIM Composer.
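As an illustration only, the UCCnet_ItemSync composition described above could be recorded as a set of artifacts and wires as follows; the artifact, port and verb names are assumed, not the identifiers used by the tool.

    # Illustrative wiring of the UCCnet_ItemSync template (all names are assumptions).
    artifacts = {
        "SAPConnector":    {"kind": "connector",     "ports": {"ItemData": ["Create"]}},
        "AS2Connector":    {"kind": "connector",     "ports": {"Publication": ["Create"]}},
        "UCCnet_ItemSync": {"kind": "collaboration", "ports": {"Receive": ["Create"],
                                                               "Publish": ["Create"]}},
    }

    # Each wire connects one (artifact, port) pair to another; the map-editor step
    # described above would be attached to a wire whose message formats differ.
    wires = [
        (("SAPConnector", "ItemData"), ("UCCnet_ItemSync", "Receive")),
        (("UCCnet_ItemSync", "Publish"), ("AS2Connector", "Publication")),
    ]

    def validate(wires, artifacts):
        # every endpoint must name an existing artifact and port
        for (src, dst) in wires:
            for art, port in (src, dst):
                assert art in artifacts and port in artifacts[art]["ports"], (art, port)

    validate(wires, artifacts)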
To address the customization aspect, the notion of points of variability is introduced. A given artifact may offer customization points, or points of variability, that identify options for that artifact. As an example, shown in Figure 4, the process flow of UCCnet_ItemSync defined in a collaboration artifact could have two points of variability: "New Item Publication Request" and "Registry Catalogue Item Registration". Similarly, in a connector artifact, inbound/outbound message formats, business rules, functional options, properties, and so on could be defined. The tool detects whether a variability point is available and presents the Solution Creator with options to choose amongst; furthermore, should a mismatch occur, the tool offers options to resolve the incompatibility. When appropriate, an interaction is initiated by the tool to move the integration process ahead and minimize any second-guessing by the user.

Figure 4. Point of Variability in a Process Flow Model.

Figure 5. Points of Variability for Connector.

The Template Creator can define the points of variability to be consumed by the Solution Creator. For example, the Template Creator can add flexibility to this template for other suppliers who want to publish their products into the UCCnet catalogue. The Connector supports a point of variability by providing choices of connectors under the same component, a generic connector container. More connector implementations can thus be added for different suppliers under a generic connector container (e.g., an ERP connector), as shown in Figure 5, allowing more options (SAP or JDEdwards) later when configuring the template. Only one connector can be selected to be active, and the active one is the one considered when doing the composition with other components.

Collaborations offer points of variability for their properties and interfaces. The tool supports three types of conditions to govern a point of variability: mandatory, optional and conditional, as illustrated in Figure 6. Some interfaces of the collaboration are mandatory and need to be present when the template is configured. Some interfaces are optional and can be enabled or disabled at configuration time at the discretion of the Solution Creator. Some interfaces will be automatically enabled if the rule associated with the point of variability is met; otherwise, they will be disabled.

Figure 6. Points of Variability for Collaboration.
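A minimal sketch of how these three governing conditions could be evaluated at configuration time is given below; the function signature, the rule callback, and the property name are assumptions for illustration.

    from typing import Callable, Dict

    def interface_enabled(condition: str,
                          user_choice: bool,
                          rule: Callable[[Dict[str, str]], bool],
                          properties: Dict[str, str]) -> bool:
        # condition is "mandatory", "optional", or "conditional", as described above
        if condition == "mandatory":
            return True                # always present when the template is configured
        if condition == "optional":
            return user_choice         # enabled/disabled at the Solution Creator's discretion
        if condition == "conditional":
            return rule(properties)    # enabled only when the associated rule is met
        raise ValueError("unknown condition: " + condition)

    # Example: a conditional interface tied to a template property (property name assumed).
    enabled = interface_enabled(
        "conditional",
        user_choice=False,
        rule=lambda props: props.get("RETAIL_PARTNER") == "Target",
        properties={"RETAIL_PARTNER": "Target"},
    )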
In our example, we can define another point of variability for the interfaces of an artifact. The suppliers can have many different business partners to whom they sell their products, and the supplier's message formats should be compatible with the business partner's. Therefore the Template Creator can add new collaborations to validate the supplier's message formats against the business partner's message formats. If the Template Creator adds validation logic for Wal-Mart and Target stores, the Solution Creator can select the appropriate business partner by choosing the corresponding interfaces of that business partner. For example, the value of an artifact property such as RETAIL_PARTNER can be used to enable the collaboration's operation for the Wal-Mart or Target store as the operation of choice, as shown in Figure 6.

A rule can also constrain the value of a property, and the default value can be set to a literal or a reference to another property. In most cases rules can be used to define the point of variability. A very simple rule set has been defined to create this notion. Associated with the artifact or template properties, these rules can be used to perform configuration and determine the behavior of the solution. Artifact properties, treated as string values, can be used in the rule expression, where two types of rules are supported. The first type is Boolean expressions, whose operators are <, >, !=, ==, ||, &&, ( and ); the second type is a choice selection driven by the value of a property. For example, if RETAIL_PARTNER == Target, then the Retrieve operation from the "ValidateTarget" collaboration is used; otherwise, the Retrieve operation from the "Wal-Mart" collaboration is used.

4.2. Solution Transformation

Solutions are composed in a platform independent fashion, regardless of their ultimate execution environment, and become reusable templates for creating other solutions. Solution templates, once created, can be published in the form of a jar file to the file system for later reuse, sharing, or upload to the asset library.

Figure 7. WBI SE Runtime View (realized assets of the UCCnet scenario in WBI Express).

Once the template is configured per the user requirements for a certain platform, it becomes a Solution Instance, as illustrated in Figure 7, where one Collaboration and two Connectors are instantiated. The Solution Creator needs to register and generate all the necessary elements on a specific platform to ensure that the solution can be executed on that platform. Once the physical topology is determined, the template can be augmented to provide a deployment layout that matches the solution requirements. This further enriches the solution with component specific information to fine tune each component in terms of configuration settings and performance parameters. As the template is enriched during deployment, the added information can be provided to the deployed solution management phase and allows the management tool to provide a complete view of the solution. Solution upgrades can reuse this information, expediting the process of deploying a new version. Furthermore, the collected information can be applied (or reused) to different configurations with similar requirements. The template becomes the unifying artifact throughout the life cycle of the integration solution.

In our example, we used WBI SE as our platform, and all artifacts (Collaborations and Connectors) and their related assets (Business Objects and Maps) are registered under a new or existing Integration Component Library (ICL) in WBI SE. The links determine the Collaboration Objects to be created, and the necessary compilations are performed. At this point the ICL is ready to be configured into a User Project and ready for deployment to test the integration solution.

5. Model Transformation Functions

Figure 8. Model Transformation Functions.

Our tool seamlessly integrates the Platform Independent Model (PIM) and Platform Specific Model (PSM) workspaces in the same Eclipse-based environment. It provides an integrated working environment for solution composition, solution artifact realization, and runtime collaboration execution. The transformation functions between the PIM and the PSM, shown in Figure 8, consist of three major parts: Component Abstraction, Component Realization, and Component Instance Traceability, as described in the following three subsections.
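The three transformation functions of Figure 8 can be mirrored in a small sketch; the mapping-table format and the function signatures are assumptions chosen for illustration and are not the tool's actual interfaces.

    # MP: mapping table for a particular platform specific model (PSM) P.
    # It relates PSM component types to PIM artifact types (entries are illustrative).
    MP = {"Collaboration": "ProcessArtifact", "Connector": "AdapterArtifact"}

    def abstract(component: dict, mapping: dict) -> dict:
        # Component Abstraction, A = f(C, MP): keep only the elements needed in the PIM.
        return {"type": mapping[component["type"]],
                "name": component["name"],
                "ports": component.get("ports", [])}

    def realize(delta_artifact: dict, existing: dict, mapping: dict) -> dict:
        # Component Realization, Cr = f^-1(dA, (C|Cr), MP): apply changed PIM elements to a
        # PSM component, reusing the existing runtime component when one is available.
        reverse = {v: k for k, v in mapping.items()}
        component = dict(existing) if existing else {"type": reverse[delta_artifact["type"]]}
        component.update({k: v for k, v in delta_artifact.items() if k != "type"})
        return component

    def trace_uri(component: dict, mapping: dict) -> str:
        # Component Instance Traceability, A_URI = f_URI(Cr, MP): a stable identifier that
        # lets the tool find the PIM artifact for a given runtime component instance.
        return "pim://{}/{}".format(mapping[component["type"]], component["name"])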
5.1. Component Abstraction

The objective of the model-driven approach is to move the focus from programming to solution modeling. A PIM artifact is a high level abstraction of the corresponding components in the PSM. To abstract PSM components into a PIM artifact, only the required elements of the PSM components are extracted. The transformation function ƒ, which converts a PSM component C into the corresponding PIM artifact A, is expressed as A = ƒ(C, MP), where MP is the mapping table for a particular PSM P. The transformation is based upon the mapping table, which defines the relationships between PSM components and PIM artifacts. Ideally, the same transformation function can be applied to a different PSM by changing the mapping table.

In a runtime environment, as shown in Figure 8, the transformation function is expressed as A = ƒ(∆Cr, MP), where only the changed elements (∆) of the runtime PSM component (Cr) are converted. Our tool keeps track of instances of components in a runtime environment. This is made possible by the Component Instance Traceability function described in the next subsection.

5.2. Component Instance Traceability

The Component Instance Traceability function uses URIs to dynamically trace the component instances in both the PIM and the PSM. Given a particular PSM component (Cr) and the corresponding mapping table (MP) of a particular PSM platform, the transformation function (ƒURI) generates and traces the corresponding URI value (AURI) of the artifact in the PIM space, i.e., AURI = ƒURI(Cr, MP). Conversely, given a PIM artifact (A) and the mapping table (MP) of a particular PSM platform, the reverse transformation function (ƒURI-1) generates and traces the corresponding URI value (Cr URI) of a component in that particular PSM space, i.e., Cr URI = ƒURI-1(A, MP).

5.3. Component Realization

The Component Realization transformation is expressed as Cr = ƒ-1(∆A, (C|Cr), MP). The transformation function (ƒ-1), the reverse of the component abstraction function, dynamically transforms the changed elements (∆) of the PIM artifact into the corresponding components of a particular PSM. Similar to the Component Abstraction function, it requires a mapping table (MP). The difference is that it selects an existing PSM component if one exists in the runtime PSM; otherwise, a PSM component template is selected.

6. Conclusions

In this paper, we presented a Solution Template Tool for enterprise application integration that simplifies the life cycle of an integration solution through the flexible design and customization of Solution Templates and an interactive environment driven by wizards. The tool provides levels of abstraction, separation of concerns, and reusable assets to ease the task of solution composition. WBI Express SE was chosen as the specific platform, and the model transformation functions were introduced. The tool is adapted as a WBI SE Tech Preview and is available from the IBM Developer Works website [4].

We also described the implementation of UCCnet as a Solution Template: how it is composed as a platform independent model, how the points of variability are defined, and its transformation to the platform specific model for instantiation and deployment in WBI SE.

References

[1] J. Lee, K. Siau, and S. Hong, "Enterprise Integration with ERP and EAI," Communications of the ACM, pp. 54-60, Feb. 2003.
[2] Joaquin Miller and Jishnu Mukerji, "MDA Guide Version 1.0.1," OMG/2003-06-01. http://www.omg.org/docs/omg/03-06-01.pdf
[3] Ying Huang, Kumar Bhaskaran, and Santhosh Kumaran, "Platform-Independent Model Templates for Business Process Integration and Management Solutions," 2003 IEEE International Conference on Information Reuse and Integration (IRI), pp. 617-622, Oct. 2003.
[4] "Solution Templates Tool for WebSphere Business Integration Server Express technology preview," Dec. 2006. http://www.ibm.com/developerworks/websphere/downloads/simpleWBISE.html
[5] Bill Moore, Knut Inge Buset, Dennis Higgins, Wagner Palmieri, Mads Pedersen, Joerg Wende, and Alison Wong, "WebSphere Business Integration Server Express: The Express Route to Business Integration," IBM Redbooks, ISBN 0738492035, Feb. 2005. http://www.redbooks.ibm.com/abstracts/SG246353.html?Open
[6] UCCnet. http://www.uccnet.org
[7] Lee Gavin, Geert Van de Putte, Suresh Addala, Rajaraman Hariharan, Daniel Max Kestin, Pushkar Suri, and Kiran Tatineni, "Implementing WebSphere Business Integration Express for Item Synchronization," IBM Redbooks, ISBN 0738498084, May 2004. http://www.redbooks.ibm.com/abstracts/SG246083.html?Open
[8] EPCglobal Inc. http://www.epcglobalinc.org
[9] "A Guide to Understanding RFID," RFID Journal. http://www.rfidjournal.com/article/gettingstarted/
A Fuzzy-Based Transport Protocol for Mobile Ad Hoc Networks

Neng-Chung Wang (1), Yung-Fa Huang (2), and Wei-Lun Liu (2)

(1) Department of Computer Science and Information Engineering, National United University, Miao-Li 360, Taiwan, R.O.C. Email: ncwang@nuu.edu.tw
(2) Graduate Institute of Networking and Communication Engineering, Chaoyang University of Technology, Taichung 413, Taiwan, R.O.C. Email: yfahuang@mail.cyut.edu.tw
Abstract

A mobile ad hoc network (MANET) is a dynamically reconfigurable wireless network that does not have a fixed infrastructure. Because this kind of network uses wireless connections for communication, traditional transport protocols for wired networks are not suitable for MANETs. In this paper, we propose an improved transport protocol (ITP) to enhance the efficiency of the transport protocol in MANETs. In ITP, we use a rate-based transmission scheme with fuzzy logic control to tune the proper data rate. ITP gets the MAC layer information and uses a fuzzy logic controller to estimate the appropriate data rate for transmission. ITP then uses a feedback scheme to adjust the data flow upon receiving a feedback packet. The difference between traditional TCP and ITP is that ITP adjusts its transmission rate using the received packets instead of acknowledgments or lost packets. Simulation results show that the proposed ITP outperforms TCP and ATP.

Keywords: Fuzzy logic control, data rate, mobile ad hoc networks, rate adaptation, TCP.

1. Introduction

A mobile ad hoc network (MANET) is a dynamically reconfigurable wireless network that does not have a fixed infrastructure. Because this kind of network uses wireless connections for communication, the traditional transport protocol for wired networks is not suitable for MANETs. The characteristics of MANETs cause the bandwidth to be more limited than the bandwidth in traditional networks.

Several studies have focused on the transport layer in MANETs. There are several TCP versions for traditional computer networks, and we know that transport protocol performance affects an entire network. What we would like to do is to develop an appropriate transport protocol for MANETs based on the ad hoc network environment.

In this paper, we propose an improved transport protocol (ITP) for MANETs. We present the details of the ITP algorithm and show through simulation results that it significantly outperforms both TCP and the ad hoc transport protocol (ATP) [6].

There have been many previous studies by researchers on transport protocols for MANETs. These studies approach transport protocol problems in MANETs in two ways. The first approach treats the problem as one of link or route failure due to mobility [3, 5]. When traditional TCP encounters a link failure, it cannot make the right decision, leading to other problems. TCP usually discovers the problem after packet loss triggers a timeout or when an acknowledgment (ACK) is not received. The problem may be caused by congestion or by link failure, but TCP does not know which when it is encountered the first time, so it sends out more packets. If the network is in a state of congestion, the throughput of the network will be decreased. The second approach avoids congestion and uses congestion control [2, 4]. In this approach, because the bandwidth is low in MANETs, congestion avoidance becomes an important issue, and with it the throughput of the network can be increased considerably. The ATP protocol utilizes the weight of the network to adjust its flow and avoid congestion.

The rest of the paper is organized as follows. Section 2 describes the related work. The proposed scheme is given in Section 3. We present the simulation results in Section 4. Finally, Section 5 draws the conclusions.
2. Related Work

In this section, we introduce the traditional transport protocol, TCP, which is widely used in current computer networks and provides reliable transmission using packet acknowledgments. Fuzzy concepts are described in this section as well. In addition, we also describe the concepts of ATP [6].

2.1. Transmission Control Protocol (TCP)

The transmission control protocol (TCP) is one of the most popular and widely used end-to-end protocols for the Internet today. Many studies on TCP performance have been carried out [1, 8]. Unlike routing, where packets are relayed hop-by-hop toward their destination, TCP provides reliable end-to-end transmission of transport-level segments from source to receiver. Transport segments arrive in sequence and lost segments are recovered. TCP therefore provides flow control and congestion control functions, in addition to reliable transmission. Reliability in transmission involves the use of some form of handshake between the source and destination. Sequence numbers can also be used to ensure in-sequence delivery of segments and to help identify lost or corrupted segments. Retransmission can be used to resend lost or corrupted segments; hence, a retransmission timer is needed to determine when to initiate a resend. For TCP, an adaptive retransmission mechanism is employed to accommodate the varying delays encountered in the Internet environment. The timeout parameter is adjusted accordingly by monitoring the delay experienced on each connection.

2.2. Fuzzy Logic Control

Fuzzy logic starts with and builds on a set of user-supplied human language rules. A fuzzy system converts these rules to their mathematical equivalents. This simplifies the job of the system designer and the computer, and results in much more accurate representations of the way systems behave in the real world. Many products use fuzzy logic for control, including for the Internet [7].

Additional benefits of fuzzy logic include its simplicity and its flexibility. Fuzzy logic can handle problems with imprecise and incomplete data, and it can model nonlinear functions of arbitrary complexity. Fuzzy logic models, called fuzzy inference systems, consist of a number of conditional "if-then" rules. For the designer who understands the system, these rules are easy to write, and as many rules as necessary can be supplied to describe the system adequately (although typically only a moderate number of rules are needed).

In fuzzy logic, unlike in standard conditional logic, the truth of any statement is a matter of degree. How cold is it? How high should we set the heat? We are familiar with inference rules of the form p → q. With fuzzy logic, it is possible to say (0.5 · p) → (0.5 · q). For example, for the rule "if the weather is cold then the heat is on", both variables, cold and on, map to ranges of values. Fuzzy inference systems rely on membership functions to explain to the computer how to calculate the correct value between 0 and 1. The degree to which any fuzzy statement is true is denoted by a value between 0 and 1.

Not only do the rule-based approach and the flexible membership function scheme make fuzzy systems straightforward to create, but they also simplify the design of systems and ensure that the system can easily be updated and maintained over time.

2.3. Ad Hoc Transport Protocol (ATP)

Sundaresan et al. [6] proposed a reliable transport protocol for ad hoc networks, called ATP. ATP is a transport protocol developed for MANETs, which differ from traditional networks, and it provides reliable end-to-end transmission. Because of the difference in environment between traditional networks and MANETs, TCP faces many problems when used in MANETs. Several researchers have devoted time to solving the problems of mobility and limited bandwidth which TCP faces. ATP is a new protocol with many characteristics that differ from the TCP protocol, but ATP is reliable. ATP needs to coordinate with other layers, such as the MAC and routing layers, to function. ATP uses rate-based transmission and selective ACKs (SACK) for reliability. In coordinating with other layers, the MAC layer gathers information which is used to adjust the data rate and to detect link failure to prevent packet loss. The routing layer sends out notifications of link failure and passes on the information for the calculated data rate. The traditional TCP protocol does not have this function.

ATP uses a rate-based model for transmission, which is different from the window-based (sliding window) transmission of TCP. The operation model is discussed in TCP flow control. ATP adopts rate-based transmission. The rate-based transmission of ATP uses an exponentially weighted moving average (EWMA) to obtain the data rate. The equation for the EWMA is shown in Eq. (1); the EWMA uses recursion to calculate the average data rate, and the parameter α is a constant.

Davg = α × Davg + (1 − α) × Dcur    (1)
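As a small illustration of Eq. (1), the sketch below applies the recursive EWMA update to a stream of rate samples; the sample values and the choice of α = 0.75 are arbitrary assumptions.

    def ewma_rate(samples, alpha=0.75):
        # Eq. (1): Davg = alpha * Davg + (1 - alpha) * Dcur, applied recursively.
        d_avg = samples[0]
        for d_cur in samples[1:]:
            d_avg = alpha * d_avg + (1.0 - alpha) * d_cur
        return d_avg

    # Example with made-up rate samples (e.g., in packets per second).
    print(ewma_rate([120.0, 95.0, 110.0, 80.0]))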
3. Improved Transport Protocol (ITP)

In this section, we describe the proposed improved transport protocol (ITP). The overall ITP operation model is introduced first. When the ITP protocol starts, the source node sends a probe packet to gather information about the entire route. This packet traces all the nodes of the route; this tracing procedure is called the probe state. During the probe state, the probe packet is relayed to each intermediate node and obtains the state of each node on the route. The node states include the transmission delay and the queuing delay. After calculating the transmission delay and the queuing delay, an intermediate node can sum up both delay times. The delay time and the transmission rate are inversely proportional, and we can use this relation to estimate the transmission rate. The estimation scheme is based on fuzzy logic control, which will be discussed later. After the probe state, ITP triggers the calculating scheme periodically with the sent packets.

3.1. Layer Coordination

The difference between ITP and traditional TCP is that ITP needs the coordination of the MAC layer. In the traditional TCP protocol, different layers have different control issues. For instance, the media access control layer uses the RTS/CTS and ACK mechanisms to provide transmission for each mobile node, and the routing layer enables the sender to find the destination node. In the ITP protocol, we use the MAC layer to gather additional information, as discussed in the last subsection. The queuing delay and the transmission delay are measured in the MAC layer. Using the sum of the two parameters, we can find the data rate, which is the inverse of the sum of the delays. The data propagation model is shown in Fig. 1. As the probe packet passes through the current node, the MAC layer calculates the data rate.

Fig. 1. Data propagation model.

3.2. Intermediate Node

ITP relies on the intermediate nodes on the route that a connection traverses to provide rate feedback information. Intermediate nodes in the network maintain the sum of the queuing delay (Qt) and the transmission delay (Tt) experienced by packets traversing through them. The queuing delay is the time spent by a packet from the time it is inserted into the queue to the time it reaches the head of the queue to be dequeued by the MAC layer. The transmission delay is measured as the time spent by the packet in the MAC layer from the time it is dequeued by the MAC layer to the time it is transmitted successfully through the channel.

3.3. Data Rate Estimation with Fuzzy Logic Control

As mentioned before, once an intermediate node obtains the data rate, it should pass it through the fuzzy logic controller and then piggyback it on the packet to the next hop. When the rate passes through the fuzzy controller, the fuzzy logic controller combines the current data rate and the last data rate to obtain an appropriate rate for transmission.

In the proposed ITP, we use the EWMA with fuzzy logic control to obtain the data rate. The equation for the EWMA with fuzzy logic control is shown in Eq. (2), where the parameter f is a variable.

Davg = (1 − f) × Davg + f × Dcur    (2)

In the following, we introduce how to use the fuzzy logic controller to get an appropriate data rate. The basic logic of the fuzzy logic controller is that if the data rate is low, the effect is larger, and if the data rate is high, the effect is smaller. We first set up the fuzzification rule base and then compute the appropriate rate using the rule base. The fuzzification membership function of the perceived data rate and the fuzzification membership function of the signal power are shown in Figs. 2 and 3, respectively.

Fig. 2. Fuzzification membership function of perceived data rate.
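A triangular membership function is one common way to realize such fuzzification; the sketch below is a generic illustration with assumed break points and is not the exact membership function of Fig. 2.

    def triangular(x, left, peak, right):
        # Degree of membership in a triangular fuzzy set defined by (left, peak, right).
        if x <= left or x >= right:
            return 0.0
        if x <= peak:
            return (x - left) / (peak - left)
        return (right - x) / (right - peak)

    # Fuzzify a normalized perceived data rate (0..1) into the five linguistic terms
    # used below (break points are assumptions).
    def fuzzify_rate(rate):
        return {
            "VL":  triangular(rate, -0.25, 0.0, 0.25),
            "Low": triangular(rate, 0.0, 0.25, 0.5),
            "Med": triangular(rate, 0.25, 0.5, 0.75),
            "Hi":  triangular(rate, 0.5, 0.75, 1.0),
            "VH":  triangular(rate, 0.75, 1.0, 1.25),
        }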
Fig. 3. Fuzzification membership function of signal power.

The inputs to the fuzzy logic supervisory controller are the perceived data rate D and the signal power SP. Each input is fuzzified into five linguistic terms: VL (very low), Low (low), Med (medium), Hi (high), and VH (very high). The defuzzification function f includes five classes: ES (extremely small), S (small), M (medium), L (large), and EL (extremely large). The inference engine works as follows:

Rule 1. If the rate D is VL, then f is EL.
Rule 2. If the rate D is Low, then f is L.
Rule 3. If the rate D is Med, then f is M.
Rule 4. If the rate D is Hi, then f is S.
Rule 5. If the rate D is VH, then f is ES.
Rule 6. If the SP is VL, then f is EL.
Rule 7. If the SP is Low, then f is L.
Rule 8. If the SP is Med, then f is M.
Rule 9. If the SP is Hi, then f is S.
Rule 10. If the SP is VH, then f is ES.

Fig. 4 shows the defuzzification functions for f. The defuzzification method uses the center-of-maximum (CoM), where xi is the input, p is the number of rules, and f is the output. The CoM is shown in Eq. (3).

f = [ Σ(i=1..p) xi × µ(xi) ] / [ Σ(i=1..p) µ(xi) ]    (3)

Fig. 4. Defuzzification function for f.
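Combining a rule base of this form with the center-of-maximum defuzzification of Eq. (3) can be sketched as follows; the representative class values and the simple averaging of the two inputs are assumptions for illustration.

    # Assumed representative values for the output classes ES, S, M, L, EL.
    CLASS_VALUE = {"ES": 0.1, "S": 0.3, "M": 0.5, "L": 0.7, "EL": 0.9}
    # Rules 1-10 map each input term to an output class.
    RATE_RULES = {"VL": "EL", "Low": "L", "Med": "M", "Hi": "S", "VH": "ES"}
    SP_RULES = dict(RATE_RULES)   # the signal-power rules use the same mapping

    def center_of_maximum(memberships, rules):
        # Eq. (3): f = sum(x_i * mu(x_i)) / sum(mu(x_i)), taken over the fired rules.
        num = sum(CLASS_VALUE[rules[term]] * mu for term, mu in memberships.items())
        den = sum(memberships.values())
        return num / den if den > 0 else 0.5   # neutral fallback weight (assumption)

    # Example: degrees of membership for D and SP (made-up fuzzification results).
    d_memberships = {"Low": 0.6, "Med": 0.4}
    sp_memberships = {"Med": 0.8, "Hi": 0.2}
    f_rate = center_of_maximum(d_memberships, RATE_RULES)
    f_sp = center_of_maximum(sp_memberships, SP_RULES)
    f = (f_rate + f_sp) / 2.0   # combining the two inputs by averaging is an assumption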
3.4. ITP Receiver

The ITP receiver provides periodic feedback to the sender to assist with its reliability and flow control mechanisms. In addition, it confirms the rate feedback information provided by the intermediate nodes and sends it back to the sender. The receiver runs an epoch timer of period E in order to send the feedback periodically. Note that the period E should be larger than the round-trip time of a connection, but at the same time it must be small enough to track the dynamics of the path characteristics. There are two aspects of feedback; we introduce both in the following sections.

3.4.1. Rate Feedback

After the probe state, which was discussed earlier, every incoming packet belongs to a flow. The receiver performs an exponential averaging of the Davg value, which is specified in the packet, using Eq. (2). As the message passes through all the intermediate nodes, the Davg value is averaged one node at a time with fuzzy logic control. After each epoch timer expires, the receiver provides the Davg value at the time that the feedback is sent. This feedback appears in the Davg field of the packet.

3.4.2. Reliability Feedback

The ITP receiver uses selective ACKs to provide reliability, which means that lost packets are identified in the data stream. ITP provides a larger number of SACK blocks than TCP-SACK does, because the feedback is not provided for every incoming data packet but rather once per period of time. TCP-SACK uses only three blocks per ACK, while ITP uses 20 SACK blocks, and this number is adjustable. The len field is dedicated to the total number of blocks.

3.5. ITP Sender

The ITP sender, like that of TCP, consists of several parts. Specifically, the ITP sender includes two functionalities: the probe state and reliability. In the rest of this section, we elaborate on the different components.

3.5.1. Probe State

During connection initiation, or when a sender recovers from a timeout, the slow start of TCP takes a few round trip times to discover the available bandwidth, and during slow start TCP wastes a lot of time. Due to frequent path failures and the resulting timeouts in MANETs, a TCP connection can end up spending a considerable portion of its lifetime in the slow-start phase, thus degrading network utilization.

ITP instead uses a probe packet to probe the available network bandwidth within one round trip time during the probe state. ITP performs this operation both during connection initiation and when the network path changes. The probe packets sent out during the probe state elicit rate feedback from the receiver. The function of the probe state is straightforward when a path change occurs: when a new path is used, the sender is not aware of the available bandwidth of the path, so performing the rate estimation again allows the sender to operate at the true available bandwidth.

3.5.2. Reliability

ITP adopts SACK for reliability. For each epoch time, ITP sends a probe packet and the receiver sends the SACK feedback, which is larger than that of TCP-SACK. The estimated data rate D is piggybacked on the SACK packets. Through the SACK packets, ITP provides reliability in that it resends the lost packets, whose sequence numbers are recorded in the SACK blocks. The SACK format is discussed later. With this mechanism, ITP enables the receiver to get all the data which may have been lost during transmission.

3.6. Congestion Control

After the probe state, ITP enters the congestion control state, in which ITP mainly prevents congestion from occurring. The mechanism is invoked when the feedback packet is received. Once the source node receives the feedback, it determines how to control the flow. If the received data rate D is higher than the previous data rate D, the source node increases the sending rate; if the received data rate D is lower than the previous data rate D, the source node decreases the sending rate. This mechanism prevents congestion. In addition, ITP determines the lost packets to be retransmitted.
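The feedback-driven rate adjustment described above can be summarized by a small decision routine such as the one below; the 10% step size and the argument names are assumptions.

    def adjust_sending_rate(current_rate, previous_feedback_rate, new_feedback_rate, step=0.10):
        # Increase the sending rate when the fed-back rate has grown, decrease it when it has shrunk.
        if new_feedback_rate > previous_feedback_rate:
            return current_rate * (1.0 + step)
        if new_feedback_rate < previous_feedback_rate:
            return current_rate * (1.0 - step)
        return current_rate   # unchanged feedback: keep the current rate

    # Example: feedback reports a lower data rate, so the sender backs off.
    print(adjust_sending_rate(200.0, previous_feedback_rate=150.0, new_feedback_rate=130.0))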
4. Simulation Results

We conducted a performance evaluation using NS2 (Network Simulator 2), the most popular network simulator. The mobility model used for topology generation was the random waypoint model. The simulation environment was a 1000 m × 1000 m network grid of 100 nodes. The mobility speeds ranged from 3 m/s to 21 m/s, the signal power range was 150 m, and the number of flows was 5. The application layer generated FTP flows, the routing layer used the DSR protocol, and 802.11b was used as the MAC layer protocol. Each simulation result is an average of 100 different runs, each conducted for 100 seconds.

In the simulations, we used TCP New Reno and ATP as the comparison transport protocols. The simulations measured the time spent in slow/quick start and the throughput versus mobility. The time spent in slow/quick start was defined as the time from flow start to the time the flow became stable; throughput was defined as the average bytes per second.

Fig. 5 shows the time spent in slow/quick start versus mobility speed for 5 flows. From Fig. 5, we can see that the time ITP uses is less than that of TCP across mobility speeds from 1 m/s to 20 m/s. The figure reflects the average number of times connections enter slow start during a 100-second simulation period for different rates of mobility and different loads, and the average time the connections spend in slow start during the 100-second simulation. It can be observed that TCP connections spend a considerable amount of time in the slow start phase, with the proportion of time going above 50 percent for the higher loads.

We can conclude that, in every case, the time our proposed ITP spends is less than that of TCP. We follow the operation of the ATP quick start, so the time ITP spends is the same as that of ATP. If the mobility speed is higher, the time increases. At the same mobility, TCP spends more time in slow start than ITP does; ITP utilizes the network resources more efficiently than TCP.

Fig. 5. Time spent in slow/quick start for 5 flows (time in seconds vs. mobility speed in m/s; ITP and ATP vs. TCP).

Fig. 6 shows the throughput under various mobility speeds with 5 flows. As expected, the throughput decreased as the mobile nodes became more mobile. The reason is that there were more chances for routes to break when the speed of the mobile nodes was faster, so the number of packet losses increased. Because the proposed ITP estimates each node's condition for transmission, its throughput performance is better than that of ATP and TCP. Even with increased mobility, ITP can still provide better performance than TCP New Reno and ATP.

Fig. 6. Throughput vs. mobility for 5 flows (throughput in Kbps vs. mobility speed in m/s; ITP, ATP, TCP).

5. Conclusions

In this paper, we proposed an improved transport protocol for MANETs. The proposed scheme uses a fuzzy logic control mechanism to determine an appropriate data rate for sending packets. By using fuzzy logic control, we can derive a proper parameter f from an indefinite situation. This scheme depends on the variable parameter f, which is used to tune the data rate Davg. After Davg is received, the sender can determine the proper data rate for transmission. Simulation results show that the proposed ITP performs better than TCP and ATP.

Acknowledgments

This work was supported by the National Science Council of the Republic of China under grants NSC-94-2213-E-324-025 and NSC-95-2221-E-239-052.

References

[1] V. Anantharaman, S.-J. Park, K. Sundaresan, and R. Sivakumar, "TCP Performance over Mobile Ad Hoc Networks: A Quantitative Study," Wireless Communication and Mobile Computing Journal, Special Issue on Performance Evaluation of Wireless Networks, Vol. 4, pp. 203-222, March 2004.
[2] L. S. Brakmo and S. W. O'Malley, "TCP Vegas: New Techniques for Congestion Detection and Avoidance," Proceedings of ACM SIGCOMM, pp. 24-35, October 1994.
[3] K. Chandran, S. Raghunathan, S. Venkatesan, and R. Prakash, "A Feedback-Based Scheme for Improving TCP Performance in Ad Hoc Wireless Networks," IEEE Personal Communications, Vol. 8, No. 1, pp. 34-39, February 2001.
[4] K. Chen, Y. Xue, and K. Nahrstedt, "On Setting TCP's Congestion Window Limit in Mobile Ad Hoc Networks," Proceedings of the IEEE International Conference on Communications, Anchorage, Vol. 2, pp. 1080-1084, May 2003.
[5] J. Liu and S. Singh, "ATCP: TCP for Mobile Ad Hoc Networks," IEEE Journal on Selected Areas in Communications, Vol. 19, No. 7, pp. 1300-1315, July 2001.
[6] K. Sundaresan, V. Anantharaman, H.-Y. Hsieh, and R. Sivakumar, "ATP: A Reliable Transport Protocol for Ad Hoc Networks," IEEE Transactions on Mobile Computing, Vol. 4, Issue 6, pp. 588-603, November 2005.
[7] W. Xu, A. G. Qureshi, and K. W. Sarkies, "Novel TCP Congestion Control Schemes and Their Performance Evaluation," IEE Proceedings on Communications, Vol. 149, Issue 4, pp. 217-222, August 2002.
[8] X. Yu, "Improving TCP Performance over Mobile Ad Hoc Networks by Exploiting Cross-Layer Information Awareness," Proceedings of Mobile Computing and Networking, pp. 231-244, September 2004.
Region-based Sensor Selection for Wireless Sensor Networks

Yoshiyuki NAKAMURA, Kenji TEI, Yoshiaki FUKAZAWA, and Shinichi HONIDEN


Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan
National Institute of Informatics
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan
The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
nakayosi, tei0820, fukazawa@fuka.info.waseda.ac.jp, honiden@nii.ac.jp

Abstract

In a sensor network, a technique that limits the number of sensors used for observation is effective for reducing the energy consumption of each sensor. To limit the number of sensors without sacrificing observation accuracy, an appropriate sensor combination must be selected by evaluating the observation effectiveness of various combinations. However, the computational workload for evaluating all the sensor combinations is quite large. We can define a parameter related to the optimal size of a region around an observation target by making a trade-off between accuracy and the computational workload. In region-based sensor selection, a combination of sensors is selected from those near the observation target. Accuracy is better in a larger region with a lot of sensors, but the computational workload is heavier; in contrast, a smaller region with fewer sensors has poorer accuracy but a lighter workload. The size of the region controls the trade-off between accuracy and the computational workload. We define a parameter related to the optimal size of a region and use it to dynamically adjust the region's size. Our simulations confirmed that region-based sensor selection reduces the computational workload and improves accuracy in comparison to existing techniques.

I. Introduction

The main purpose of a wireless sensor network is to measure a target's state (e.g., its temperature, sound, and light). Since noise is included in the observations made by a sensor, the results always include some errors. Therefore, to improve accuracy, the results from two or more sensors should be integrated. In general, the accuracy of an observation improves with the number of observing sensors. Sensors, however, have severe energy consumption limitations. The lifetime of each sensor is shortened when it is used too often, and this in turn shortens the network lifetime. Thus, a technique that minimizes the number of sensors used for observations would reduce the total energy consumption. However, when only the number of sensors is reduced, the number of observation errors increases. To minimize the number of sensors without sacrificing observation accuracy, it is therefore necessary to select an appropriate combination of sensors by evaluating their observation effectiveness. However, the computational workload would be huge if the evaluation of every combination of sensors were necessary, and it is unrealistic to evaluate all the possible sensor combinations because the computational resources of a sensor (CPU and memory) are usually very restricted. Therefore, in this paper we propose a sensor selection technique that we call region-based sensor selection, which takes the trade-off between accuracy and the computational workload into account. In general, a sensor that is near the target is the most effective choice for making an observation, because observation noise increases with the distance from the target. In region-based sensor selection, a combination of sensors is selected near an observation target: only the sensors within a region around the observation target are considered candidate sensors. The size of the region helps to control the trade-off between accuracy and the computational workload. When the region is large and a lot of sensors exist within it, accuracy increases but so does the computational workload. When the region is small, the computational workload decreases but so does accuracy. We define a parameter related to the optimal size of a region in this paper, which we then use to dynamically
adjust a region size.      

 


The remainder of this paper is organized as follows. Section II describes the spread-based heuristics. Section III describes region-based sensor selection in more detail. Section IV presents our simulation results. Section V describes related work on sensor selection, and Section VI concludes this paper.

II. Spread-based heuristics

This section describes the spread-based heuristics [1] and its shortcomings. Section II-A describes the evaluation metrics, Section II-B describes the method used to reduce the computational workload when evaluating these metrics, and Section II-C describes the shortcomings of the spread-based heuristics.

A. Evaluation metrics

The metrics proposed in reference [1] are simple and can be evaluated without complex calculation. Thus, they are suitable for computation by a sensor. The spread-based heuristics comprise three metrics for selecting sensors: collinearity, spread, and proximity. These metrics are used to evaluate a target's position obtained by a three-point measurement, and they select a combination of sensors that forms a shape that is nearly an equilateral triangle centered on the target's position. The evaluation formulas are defined as follows:

  Collinearity:  Σ_{i=1..n} ( y_i − (a·x_i + b) )²    (1)

  Spread:        Σ_{i=1..n} ( θ_i − 2π/n )²            (2)

  Proximity:     Σ_{i=1..n} d_i                          (3)

In these formulas, n is the number of selected sensors and (x_i, y_i) are the coordinates of sensor i. a is the slope and b is the y-intercept of a straight line fitted through all of the selected sensors. Therefore, the value of formula (1) is small if the sensors are almost collinear. In formula (2), θ_i is the angle subtended at the target by two adjacent sensors. When n is 3, formula (2) evaluates the difference between the angle of an equilateral triangle and the actual angle. Therefore, the value of formula (2) gets smaller as the sensors get closer to forming an equilateral triangle. d_i in formula (3) is the distance between sensor i and the target. Therefore, as this distance shortens, the value of formula (3) gets smaller. These three evaluation metrics are used to select the combination of sensors.

B. Reduction of the computational workload

The metrics proposed in the spread-based heuristics can be calculated without complex processing. However, the computational workload becomes huge if they are evaluated for all sensor combinations. The spread-based heuristics reduces the computational workload by decreasing the number of newly selected sensors. For each selection, only one sensor that seems to be ineffective for the observation is eliminated from the combination, and a new one is selected from the other sensors in the network. Let the total number of sensors be N and the number of selected sensors be m. Then the computational workload of evaluating every combination is proportional to C(N, m), the number of sensor combinations. With the technique mentioned in [1], the computational workload can be reduced to O(N).

C. Shortcomings of the spread-based heuristics

The sensors that remain in the combination were selected according to a past position of the target. Therefore, accuracy might decrease if the remaining sensors are ineffective. If the target's speed is high, this shortcoming is significant: the sensors that remain in the combination become far from the target, and the error included in the observation result increases.

Fig. 1. Overview of spread-based heuristics (sensors s1-s5, the target position T(k), and the predicted position T'(k+1)).

An example of such a situation is shown in Fig. 1, where a target's position T(k) is observed by three sensors at time k, and the sensors chosen to observe T(k+1) are selected based on the predicted target's position T'(k+1). With the technique mentioned in [1], only one sensor is removed from the combination and one new sensor is then selected; the resulting combination is used to observe the target's position T(k+1). However, in this case, a different combination of sensors is more suitable according to the metrics described in Section II-A. Therefore, the appropriate sensors are not selected and the accuracy of the target's position estimate decreases. This problem occurs because the sensors that remain in the combination were not selected according to the present target's position.
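To make the selection metrics of Section II-A concrete, the following is a minimal Python sketch that scores one candidate combination of sensors. It assumes the reconstructed forms of formulas (1)-(3) given above (least-squares residual for collinearity, squared deviation of the subtended angles from 2π/n for spread, and summed distances for proximity); the function name and the way the three scores are returned are illustrative, not taken from [1].

import math

def score_combination(sensors, target):
    # sensors: list of (x, y) tuples; target: (x, y).
    # Returns (collinearity, spread, proximity) per the reconstructed formulas (1)-(3).
    n = len(sensors)
    xs = [s[0] for s in sensors]
    ys = [s[1] for s in sensors]

    # Formula (1): residual of a least-squares line y = a*x + b through the sensors.
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0          # slope (vertical lines not handled in this sketch)
    b = mean_y - a * mean_x                    # y-intercept
    collinearity = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

    # Formula (2): deviation of the angles subtended at the target from 2*pi/n.
    angles = sorted(math.atan2(y - target[1], x - target[0]) for x, y in zip(xs, ys))
    incident = [(angles[(i + 1) % n] - angles[i]) % (2 * math.pi) for i in range(n)]
    spread = sum((theta - 2 * math.pi / n) ** 2 for theta in incident)

    # Formula (3): total distance from the sensors to the target.
    proximity = sum(math.hypot(x - target[0], y - target[1]) for x, y in zip(xs, ys))

    return collinearity, spread, proximity

In the spread-based heuristics these scores are used to rank combinations; region-based selection (Section III) reuses the same metrics, but only over the candidates inside the region.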
III. Region-based sensor selection

Region-based sensor selection is a sensor selection technique that makes a trade-off between accuracy and the computational workload. In this scheme, the whole combination of sensors is selected anew for each selection, so the increase in the amount of processing becomes a problem. Therefore, the sensors chosen to observe a target are selected from candidate sensors near the observation target. The candidate sensors are limited to a circular area called a region. The computational workload of the sensor selection is substantially reduced by evaluating the effectiveness of only the candidate sensors. Moreover, energy consumption is reduced by limiting the number of sensors involved in the sensor selection processing. Section III-A describes the procedure of region-based sensor selection. Section III-B describes the method of dynamically adjusting the region size.

A. Procedure of region-based sensor selection

Fig. 2. Overview of region-based sensor selection (the region is a circle of radius r_{k+1} centered on the predicted target position T'(k+1); sensors s1-s5 and the target position T(k) are shown).

Fig. 2 is an overview of region-based sensor selection. As can be seen, the region is a circle with its center at the predicted target's position T'(k+1), and its radius is r_{k+1}. The size of the region controls the trade-off between accuracy and the computational workload. The region size is dynamically adjusted by adjusting r_{k+1}; the adjustment method is described in Section III-B.

The behaviors of the sensors are shown in Fig. 3. Each sensor can take a leader role or a regular sensor role. At time k, the leader sensor selects the other necessary sensors, estimates the target's position T(k), and predicts the target's position T'(k+1). Since the leader sensor always collects the positions of the candidate sensors, it does not need to know all the sensor positions. This makes it easy to account for breakdowns and for the addition of sensors to the network. Moreover, there are no special sensors, so sensors can easily exchange roles. In our simulations, the leader sensor at time k+1 is one of the sensors selected at time k.

Fig. 3. Flow of sensor selection (the leader sensor broadcasts the estimated target position and the region radius; sensors within the region reply with their positions; the leader broadcasts the selected sensor IDs; the selected sensors start sensing and return their results, from which the leader estimates the target position).

The first thing the leader sensor does is to broadcast the estimated target's position and the region's radius. Each sensor replies to the leader sensor with its own position if it is within the region. If it is not within the region, a sensor returns to stand-by mode without replying. After a fixed period, the leader sensor evaluates the combinations of the responding sensors and selects a combination to observe the target. After selecting the sensors, the leader sensor broadcasts the selected sensors' IDs. The selected sensors observe the target and send their observation results back to the leader sensor. The leader sensor estimates the target's position from the observation results and predicts the target's next position. This sensor selection procedure is then repeated.

Since our technique selects sensors from around the target's position, metrics that evaluate each sensor's position relative to the target's position are suitable. For this research, we used the evaluation metrics proposed by the spread-based heuristics (Section II). Let the set of sensor combinations within a region be S. A threshold, set to the average value of one of the evaluation metrics over S, is first used to prune S: the combinations that do not satisfy the threshold are removed. Then the combination with the smallest value of the selection metric is chosen for the observation. The selected sensors form a shape similar to an equilateral triangle centered on the target's position.
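The leader-side flow just described can be summarized in a short sketch. The following Python outline is illustrative only: the radio calls broadcast() and collect_replies() stand in for the actual messaging, score_combination() is the sketch from Section II, and the exact thresholding rule (which metric is pruned and which is minimized) is an assumption, since the paper does not spell it out here.

from itertools import combinations
import math

def leader_select(predicted_pos, radius, broadcast, collect_replies, k=3):
    # One round of region-based selection at the leader (illustrative sketch).
    # 1. Announce the predicted target position and the region radius.
    broadcast({"pos": predicted_pos, "radius": radius})

    # 2. Gather replies; only sensors inside the region answer with their position.
    candidates = collect_replies(timeout=1.0)          # list of (sensor_id, (x, y))
    in_region = [(sid, p) for sid, p in candidates
                 if math.dist(p, predicted_pos) <= radius]

    # 3. Score every k-sensor combination with the Section II metrics, prune by the
    #    average of the first metric, and keep the combination with the smallest spread.
    scored = [(combo, score_combination([p for _, p in combo], predicted_pos))
              for combo in combinations(in_region, k)]
    if not scored:
        return []                                      # not enough candidates in the region
    avg_collinearity = sum(s[0] for _, s in scored) / len(scored)
    best, best_score = None, float("inf")
    for combo, (collin, spread, prox) in scored:
        if collin < avg_collinearity:                  # nearly collinear combinations are pruned
            continue
        if spread < best_score:
            best, best_score = combo, spread
    chosen = best if best is not None else scored[0][0]
    return [sid for sid, _ in chosen]

The returned IDs would then be broadcast so the selected sensors start sensing, as in Fig. 3.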
B. Dynamic adjustment of the region size

An appropriate region size is decided by simulation. TOSSIM [2] was used as the simulator. Simulations were carried out for 140 sensors randomly placed within the simulation area. The number of targets is one, and the target's movement model was a straight line or a circle. The region radius r_{k+1} was varied over the range shown in Figs. 4 and 5 (up to 80 m), and the
target's speed took on values of 1, 5, 10, 15 and 20 m/s, and we took the averages from 100 simulations. A movement history was needed to predict the moving target's position. Therefore, it was assumed that it was possible to accurately pursue the target and to maintain a correct movement history. The AVERAGE heuristic proposed in reference [3] was used to predict the target's position. In this heuristic, it is assumed that the distance and direction of the target's next movement equal the mean values of its past movements.

Fig. 4. Straight line: average error (m) versus region radius (m) for target speeds v = 1, 5, 10, 15, 20.

Fig. 5. Circle: average error (m) versus region radius (m) for target speeds v = 1, 5, 10, 15, 20.

The results from the simulation are shown in Figs. 4 and 5, which show the results for the straight-line and circular target movement models, respectively. In these cases, the error increases not only when the region is small but also when the region is too large. Because our technique draws its candidates from the region, the sensors are in effect first screened by their distance from the target. When the region is large, the sensors are then selected mainly by evaluating their arrangement (collinearity and spread), so the error included in the observation results increases because a sensor far from the target may be selected. Therefore, the region also needs to be kept small, which is why an appropriate size exists.

Figs. 4 and 5 also show that the target's speed influences the appropriate region size. If the target is moving fast, even a small prediction error becomes significant, and the observation error increases when selecting sensors from a small region. Therefore, the appropriate region size increases as the target's speed increases. However, if the region size is decided only on the basis of the target's speed, it becomes very small when the target moves slowly. In that case, there are not enough sensors within the region and an appropriate arrangement of the necessary sensors cannot be found. Therefore, a minimum size is needed, and it is decided by the number of sensors within the region. However, sensors are sometimes added to or deleted from a wireless sensor network, so the sensor density is not constant and the density of sensors at time k+1 is unknown. Since the target's position at time k+1 will be close to the one at time k, the densities of the sensors nearest the target will be nearly the same at these two times. Thus, letting the sensor density of the region at time k+1 be ρ_{k+1}, we can approximate ρ_{k+1} by using ρ_k, i.e., ρ_{k+1} ≈ ρ_k. The simulation results indicated that the appropriate region size is related to the target's speed and the sensor density. Thus, we calculate the region radius r_{k+1} as

  r_{k+1} = α · v_{k+1} + β / √ρ_{k+1}    (4)

where v_{k+1} is the target's speed at time k+1, and α and β are constants. v_{k+1} can be calculated from the target's movement history, and ρ_{k+1} can be approximated by ρ_k. Three or more sensors are needed within the region so that our technique can evaluate the position by a three-point measurement; the values of α and β were chosen accordingly from Figs. 4 and 5. We can therefore calculate the region radius as

  r_{k+1} = α · v_{k+1} + β / √ρ_k    (5)
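The adjustment of Section III-B can be sketched in a few lines of Python. The functional form of (4)/(5) and the default constants below are assumptions for illustration (the paper derives α and β from Figs. 4 and 5 but the extracted values are not available), and the AVERAGE-style predictor simply averages past displacements as described above.

import math

def predict_next(history):
    # AVERAGE-style prediction: the next displacement is the mean of past displacements.
    # history: list of (x, y) target positions; assumes at least two past positions.
    steps = list(zip(history[:-1], history[1:]))
    dx = sum(b[0] - a[0] for a, b in steps) / len(steps)
    dy = sum(b[1] - a[1] for a, b in steps) / len(steps)
    last = history[-1]
    predicted = (last[0] + dx, last[1] + dy)   # T'(k+1)
    v_next = math.hypot(dx, dy)                # estimated speed per time step
    return predicted, v_next

def region_radius(v_next, density, alpha=1.0, beta=1.0, min_sensors=3):
    # Region radius per the reconstructed formula (4)/(5): r = alpha*v + beta/sqrt(rho).
    # alpha and beta are placeholder constants; density approximates rho_{k+1} by rho_k.
    r = alpha * v_next + beta / math.sqrt(density)
    # Keep the region large enough to expect at least min_sensors candidates
    # under a uniform density: pi * r^2 * rho >= min_sensors.
    r_min = math.sqrt(min_sensors / (math.pi * density))
    return max(r, r_min)

Given the estimated speed and the sensor density observed around the target at time k, the leader would recompute r_{k+1} in this way before each broadcast.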
IV. Performance Evaluation

In this section, the accuracy, computational workload, and energy consumption of region-based sensor selection are evaluated. The simulation setting and the prediction technique are the same as those described in Section III-B. Given these settings, we evaluated the estimation error when the total number of sensors N and the region radius r are varied. N took on values from 80 to 200, and we averaged the values from 100 simulations.

The error between the actual target's position and the estimated target's position was used as the evaluation metric for accuracy. The processing time of the sensor selection was used as the evaluation metric for the computational workload. PowerTOSSIM [4] was used as the simulator for the energy consumption evaluation. It is difficult to measure the exact processing time in a simulator; therefore, we used a real sensor, a MICAz [5] made by Crossbow, to measure the processing time.

TABLE I. Energy consumption
  v (m/s)              1     5     10    15    20
  spread-based [mA]    297   301   287   308   303
  region-based [mA]    110   139   186   239   285

Table I compares the energy consumption of region-based sensor selection and the spread-based heuristics [1] for the processing shown in Fig. 3. The energy consumption is influenced by the number of sensors involved in the sensor selection. Since region-based sensor selection changes the region size to match the target's speed, the target's speed was varied while N was fixed at 140. The spread-based heuristics has almost constant energy consumption because it does not take the target's speed into account. In contrast, region-based sensor selection reduced the energy consumption by 35 % compared with the spread-based heuristics for v = 10. This is because only a few sensors are involved in the sensor selection processing, even though the computational workload of each selection is larger.

TABLE II. Processing time
  N                       100    120    140    160
  select 3 sensors [s]    4.80   10.70  20.21  36.52
  region-based [s]        0.84   1.21   1.35   1.72
  spread-based [s]        0.81   0.85   0.90   0.96

Table II shows the processing times, including that of a technique that selects from all of the sensors. Region-based sensor selection needs only about 9 % of the processing time of the technique that selects from all sensors. However, it takes longer than the spread-based heuristics. This is because the computational workload of region-based selection is O(N³), whereas that of the spread-based heuristics is O(N). However, because region-based selection reduces the number of candidate sensors, the increase in processing time is small. The combination of sensors can be selected before the target actually moves, because the sensors are selected on the basis of the predicted target's position. Therefore, the influence of the processing time is small when the target is observed at intervals greater than the processing time. The processing time might also influence the energy consumption; however, Table I shows that region-based selection reduces the energy consumption, because it uses only a few sensors for the sensor selection processing.

Fig. 6. Estimation error (v = 10): average error (m) versus the number of sensors N (80 to 200) for region-based and spread-based selection.

Fig. 7. Estimation error (N = 140): average error (m) versus the target's speed v (m/s, 0 to 25) for region-based and spread-based selection.

Figs. 6 and 7 compare the errors of region-based sensor selection and the spread-based heuristics [1]. Fig. 6 shows the relationship between N and the average error when the target's speed is 10, and Fig. 7 shows the relationship between v and the average error when the number of sensors is 140. In Fig. 6 we can see that region-based sensor selection reduces the error by 45 % compared with the spread-based heuristics. This shows that it is able to select an effective combination for the target observation. The error of the spread-based heuristics cannot be decreased even when there are many sensors, because it reselects only one sensor at a time. Similarly, the error of the spread-based heuristics increases as the target's speed increases. Since our technique selects all the sensors again according to the target's predicted destination, the increase in error due to high target speed can be suppressed. However, the error increases when the target moves at a slow speed, because the error of the positional prediction may increase. Region-based sensor selection becomes more effective as the target moves faster: since the spread-based selection replaces only one sensor, the sensors that remain in its combination may quickly become distant from the target, and the error included in the observation result increases.

The simulation shows that region-based sensor selection improves accuracy and reduces energy consumption. Since our technique changes the region size in response to changes in the target's speed, it becomes more effective as the target moves faster.
V. Related works

Previous works [6], [7] proposed evaluation metrics for sensor selection. The evaluation metrics proposed in [7] involve a large amount of computation and are not suitable for calculation by a sensor. References [8], [9], [10], [11], and [12] studied other sensor selection problems. References [9] and [11] assumed that a sensor is a directional sensor, such as a camera, and these approaches are not suitable for sensors that observe at a distance from the target. References [8] and [12] assumed that a sensor has the ability to move, and these approaches are not suitable for common sensors that have no mobility. Reference [10] took into consideration the loss of information due to communication delays; a sensor was selected based on the present target's position. However, because the actual target's position is uncertain, it is necessary to predict the target's position from past movements. References [3], [13], and [14] proposed heuristics to predict a target's position. The AVERAGE heuristics proposed in references [3] and [13] were used for the simulations in Section IV. Reference [15] proposed another approach to reducing the energy consumption of a sensor: because energy consumption is largest when data are transmitted, it was reduced by changing the radio strength according to the distance from the receiver.

VI. Conclusion and future work

We proposed region-based sensor selection, which takes into account the trade-off between accuracy and the computational workload. Region-based sensor selection selects an appropriate combination of sensors for a target observation. The computational workload is reduced by selecting only the candidate sensors that are within the region around the target. We showed that the number of sensors within the region and the target's speed are related to the trade-off between accuracy and the computational workload, and we therefore proposed a method to dynamically adjust the region. The simulation results showed that region-based sensor selection reduces the computational workload and energy consumption in comparison with the spread-based heuristics, and that its accuracy improves even more when the target moves slowly. In region-based sensor selection, sensors are selected by referring to the predicted target's position. Therefore, if the predicted target's position differs from the actual target's position, the selected sensors will not be effective enough for an observation. The AVERAGE heuristics predicts the target's position from the target's past movement; therefore, when a target moves at random and its velocity changes greatly, region-based sensor selection will be ineffective. Finding a way to reduce the influence of the prediction accuracy is left as future work. One approach is to make multiple predictions and to select the best sensor combination. In that case, it will be necessary to reconsider the trade-off among observation accuracy, the computational workload, and the power consumption.

References

[1] V. Sadaphal and B. Jain, "Spread-based heuristic for sensor selection in sensor networks," in COMSWARE. IEEE, 2006.
[2] P. Levis, N. Lee, M. Welsh, and D. Culler, "TOSSIM: accurate and scalable simulation of entire TinyOS applications," in SenSys '03: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems. New York, NY, USA: ACM Press, 2003, pp. 126-137.
[3] Y. Xu, J. Winter, and W. C. Lee, "Dual prediction-based reporting for object tracking sensor networks," in MobiQuitous, 2004, pp. 154-163.
[4] V. Shnayder, M. Hempstead, B. Chen, G. W. Allen, and M. Welsh, "Simulating the power consumption of large-scale sensor network applications," in SenSys '04: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems. New York, NY, USA: ACM Press, 2004, pp. 188-200.
[5] J. Hill and D. Culler, "A wireless embedded sensor architecture for system level optimization," UC Berkeley Technical Report, January 2001.
[6] F. Bian, D. Kempe, and R. Govindan, "Utility based sensor selection," in IPSN '06: Proceedings of the Fifth International Conference on Information Processing in Sensor Networks. New York, NY, USA: ACM Press, 2006, pp. 11-18.
[7] H. Wang, K. Yao, G. Pottie, and D. Estrin, "Entropy-based sensor selection heuristic for target localization," in IPSN '04: Proceedings of the Third International Symposium on Information Processing in Sensor Networks. New York, NY, USA: ACM Press, 2004, pp. 36-45.
[8] A. Verma, H. Sawant, and J. Tan, "Selection and navigation of mobile sensor nodes using a sensor network," in PERCOM '05: Proceedings of the Third IEEE International Conference on Pervasive Computing and Communications. Washington, DC, USA: IEEE Computer Society, 2005, pp. 41-50.
[9] P. V. Pahalawatta, T. N. Pappas, and A. K. Katsaggelos, "Optimal sensor selection for video-based target tracking in a wireless sensor network," in ICIP, 2004, pp. 3073-3076.
[10] S. Kagami and M. Ishikawa, "A sensor selection method considering communication delays," The Transactions of the Institute of Electronics, Information and Communication Engineers A, vol. 88, no. 5, pp. 577-587, 2005. [Online]. Available: http://ci.nii.ac.jp/naid/110003314052/en/
[11] V. Isler and R. Bajcsy, "The sensor selection problem for bounded uncertainty sensing models," in IPSN '05: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks. Piscataway, NJ, USA: IEEE Press, 2005, p. 20.
[12] Y. Mostofi, T. Chung, R. Murray, and J. Burdick, "Communication and sensing trade-offs in decentralized mobile sensor networks: a cross-layer design approach," in IPSN '05: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks. Piscataway, NJ, USA: IEEE Press, 2005, p. 16.
[13] Y. Xu, J. Winter, and W. C. Lee, "Prediction-based strategies for energy saving in object tracking sensor networks," in MDM, 2004, p. 346.
[14] Y. Xu and W. C. Lee, "On localized prediction for power efficient object tracking in sensor networks," in ICDCSW '03: Proceedings of the 23rd International Conference on Distributed Computing Systems. Washington, DC, USA: IEEE Computer Society, 2003, p. 434.
[15] J. P. Wagner and R. Cristescu, "Power control for target tracking in sensor networks," in Conference on Information Sciences and Systems, 2005.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

CRT-MAC: A Power-Saving Multicast Protocol in the Asynchronous Ad hoc Networks

Yu-Chen Kuo, Department of Computer and Information Science, Soochow University, Taipei, Taiwan, R.O.C. (yckuo@cis.scu.edu.tw)
Chih-Nung Chen, Department of Computer and Information Science, Soochow University, Taipei, Taiwan, R.O.C. (demon.uogo@msa.hinet.net)
Abstract the lifetime of battery, such as self-charging [2] and


The asynchronous PS (Power-Saving) protocol was power management [3]. In the self-charging methods,
designed to synchronize two wireless hosts in the asyn- they can produce energy by using solar power energy
chronous ad hoc network such that those two wireless or chemical reaction effect etc.. On the other hand, the
hosts can transmit an unicast message even though power management is to effectively utilize energy and
they have asynchronous wakeup frequencies. However, try to reduce unnecessary energy consumption. To re-
for transmitting a multicast message to more than one duce the usage of radio activity in the wireless hosts is
receiver, the protocol could not guarantee that all re- a good choice for the power management. Thus, the
ceivers can wake up simultaneously and then receive wireless host may adjust its wakeup frequency to re-
the multicast message. In this paper, we propose a new duce the usage of radio activity according to its rest of
asynchronous PS protocol, named CRT-MAC PS pro- energy.
tocol, for the multicast transmission in the asynchro- IEEE 802.11 [3] has defined its power-saving (PS)
nous ad hoc networks. We define the difference m-pair mode for single-hop (fully connected) ad hoc network
property to guarantee that m different wakeup frequen- based on periodical transmissions of beacons. When
cies have the intersection. Thus, m wireless hosts with the protocol is applied to a multi-hop ad hoc network,
m different wakeup frequencies satisfying the difference the protocol may encounter the clock-drifting problem
m-pair property could wake up simultaneously. The [4, 5]. The wireless hosts with the clock-drifting prob-
CRT-MAC PS protocol utilizes the concept of Chinese lem may not be able to wake up simultaneously due to
Remainder Theorem to generate m wakeup frequencies their asynchronous clocks. Therefore, many research-
which satisfy the difference m-pair property. When m ers had proposed the PS protocols for asynchronous ad
wireless hosts use the wakeup frequency generated by hoc networks [5, 6, 7] to lead wireless hosts to wake up
the Chinese Remainder Theorem to transmit the multi- simultaneously and then transmit the unicast messages.
cast message, they will wake up simultaneously and A kind of asynchronous PS protocols based on quorum
receive the multicast message. systems [8] had been proposed and applied to arrange
wakeup frequencies of wireless hosts [6]. However, not
all quorum systems are suitable to arrange wakeup
1. Introduction frequency for the asynchronous PS protocol. Only the
quorum systems satisfying the rotation closure prop-
In recent years, due to the advances in wireless erty [5] are suitable to arrange wakeup frequency. The
communication technologies, the wireless network is cyclic quorum systems [9] are constructed by the con-
applied in more and more applications. The ad hoc cept of the difference pair [10]. The difference pair can
network [1], which consists of a number of wireless guarantee that two quorums still have the intersection
hosts, is a kind of wireless networks without infra- after the quorums drift. Any pair of quorums in the cy-
structure. The wireless hosts in ad hoc networks may clic quorum system is a difference pair. Hence, the cy-
communicate with each other in a multi-hop matter clic quorum system satisfies the rotation closure prop-
without using the base station. These wireless hosts erty. Thus, applying the cyclic quorum system to ar-
rely on the batteries to provide the power supplies. range wakeup frequencies of wireless hosts will guar-
Thus, the power-saving become a critical issue to ex- antee that any pair of wireless hosts can wake up si-
tend the lifetime of wireless hosts. One of the methods multaneously even though their wakeup frequencies
for extending the lifetime of wireless hosts is to extend drift. In an ad hoc network, for transmitting a multicast

(and broadcast) message [11] to more than one wireless
host is an unavoidable type of service that provides
such as clock synchronization, neighbor discovery and
network management etc.. However, the quorum-based
asynchronous PS protocols can guarantee that any pair
of wireless hosts wakes up simultaneously only. Thus,
when for transmitting a multicast message to more than
one receiver, the PS protocol could not guarantee that
all receivers can wake up at the same time. Therefore,
the transmitter may need to retransmit the multicast Figure 1. IEEE 802.11 PS mode in ad hoc networks.
message to guarantee the reliability of multicast until protocol for wireless hosts is important. The IEEE
all receivers are awake. Nevertheless, the retransmis- 802.11 standard [3] supports two power modes in
sion will cause a lot number of packets in the network power management of the MAC layer: active mode
and multicast inconsistency may occur. It will increase (AM) and power-saving (PS). In the AM mode, the ra-
the energy consumption of the transmitter and the us- dio activities of the wireless hosts always keeps in the
age of the bandwidth. The busy waiting approach is a monitoring state for receiving or transmitting the mes-
simple approach that can lead m wireless hosts to wake sages. Hence, the wireless hosts can immediately
up simultaneously and transmit the multicast message communicate with other wireless hosts. However, the
once, but it may make the receiver to wait the multicast energy consumption of the wireless hosts in the moni-
message for a long time because the wakeup frequen- toring state is almost the same with that in the receiv-
cies of other wireless hosts may be unpredictable. It ing state [13, 14], the wireless hosts will consume a lot
will increase the energy consumption of the receivers of energy in the monitoring state. In the PS mode, it
for idle monitoring. aims to reduce idle monitoring of the wireless hosts.
In this paper, we propose a PS multicast protocol in Fig. 1 illustrates the IEEE 802.11 PS mode. The wire-
the asynchronous ad hoc networks, named CRT-MAC less hosts divide the time axis into equal-length beacon
PS protocol. In our protocol, the receivers will reduce intervals. Each of the beacon intervals starts with a
its energy consumption of idle monitoring for waiting short interval and the wireless hosts remains in wakeup
the multicast message. We extend the difference pair to state during the short interval. The short interval is
define the difference m-pair property. The difference called the ATIM (Announcement Traffic Indication
m-pair property can guarantee that m different wakeup Map) window. In the ATIM windows, all wireless hosts
frequencies have the intersection even wakeup fre- in PS mode will wake up simultaneously and transmit
quencies drift with different rotation volumes. The or receive the control messages. In the rest time of the
CRT-MAC PS protocol utilizes the concept of Chinese beacon interval, the wireless hosts will remain in
Remainder Theorem [12] to generate m wakeup fre- wakeup state or enter doze state according to whether
quencies which satisfy the difference m-pair property. they receive the ATIM frames or not. In IEEE 802.11,
Hence, when m wireless hosts request to transmit the it guarantees that all wireless hosts in PS mode can
multicast message, each wireless host will choose a wake up simultaneously in their ATIM windows be-
new wakeup frequency generated by the Chinese Re- cause all wireless hosts utilize the TSF (Time Synchro-
mainder Theorem respectively. Eventually, these m nization Function) to synchronize their clock cycles.
wireless hosts will wake up simultaneously and receive However, it is a difficult job to synchronize global
the multicast message. clock in the multi-hop ad hoc network due to commu-
The rest of this paper is organized as follows. In nication delays and temporary sub-networks partition.
Section 2, we discuss some related power-saving pro- Especially when the network scale is large, global
tocols in ad hoc network. In Section 3, we define the clock synchronization is very costly or even infeasible
difference m-pair property and propose the CRT-MAC [6]. Hence, the wireless hosts in the multi-hop ad hoc
protocol. The simulation results and compare are pre- network may have the clock-drifting problem and may
sented in Section 4. Conclusions are in Section 5. not be synchronous in their ATIM windows.

2. Related works and problem statement 2.2. A quorum-based asynchronous PS protocol

2.1. IEEE 802.11 PS mode in ad hoc networks Recently many researchers had proposed the PS
protocols for asynchronous environment to increase the
The battery of a wireless host can provide only lim- survivability of asynchronous ad hoc networks [5, 6, 7].
ited energy. Thus, the design of an energy-efficient A kind of asynchronous PS protocols based on quorum
systems had been applied greatly in arranging wakeup

frequency. They utilize the intersection property of quorum systems to guarantee that any pair of wireless hosts can wake up simultaneously to transmit the unicast messages. The quorum system is defined as follows.

Definition 1 A quorum system C under U={0,1, ..., N-1} is a collection of non-empty subsets of U, called quorums, which satisfies the intersection property: ∀Qs, Qt ∈ C: Qs ∩ Qt ≠ ∅.

For example, C′={{0,1},{0,2},{0,3},{1,2,3}} is a quorum system under U′={0,1,2,3}. The wireless host in the quorum-based asynchronous PS protocol divides the time axis into equal-length beacon intervals. Each wireless host chooses a quorum arbitrarily and takes this quorum as its wakeup frequency of beacon intervals. By the intersection property, any pair of wireless hosts will wake up simultaneously at the intersected beacon interval if their clocks are synchronous. However, the clock-drifting problem may occur on the wireless hosts due to their asynchronous clocks. Two wireless hosts with the clock-drifting problem cannot wake up simultaneously even though their quorums have an intersection. Hence, not all quorum systems are suitable for arranging the wakeup frequency of an asynchronous PS protocol. In [5], J.R. Jiang et al. showed that only the quorum systems satisfying the rotation closure property can tolerate the clock-drifting problem. The rotation closure property is defined as follows.

Definition 2 Given a positive integer i and a quorum Q in a quorum system C under U={0,1, ..., N-1}, we define rotate(Q, i)={(j + i) mod N | j∈Q}.

Definition 3 A quorum system C under U={0,1, ..., N-1} is said to have the rotation closure property if ∀Qs, Qt ∈ C, i∈{0,1, ..., N-1}: Qs ∩ rotate(Qt, i) ≠ ∅.

C={{0,1,2,4},{1,2,3,5},{2,3,4,6},{3,4,5,7},{4,5,6,0},{5,6,7,1},{6,7,0,2},{7,0,1,3}} is a quorum system under U={0,1,2,3,4,5,6,7}. Two wireless hosts '1' and '2' choose {0,1,2,4} and {3,4,5,7} as their wakeup frequencies of beacon intervals, respectively. As shown in the example in Fig. 2, when the clock of wireless host '2' drifts by a volume of 2, {0,1,2,4} ∩ rotate({3,4,5,7},2) is still non-empty. Thus, these two wireless hosts can still wake up simultaneously at the same beacon interval even though their clocks drift. It is not hard to verify that Qs ∩ rotate(Qt, i) ≠ ∅ for any pair of quorums in the quorum system C, so the rotation closure property holds.

Figure 2. Quorum-based asynchronous PS protocol (beacon-interval schedules of Host1: 0 1 2 3 4 5 6 7 and Host2: 6 7 0 1 2 3 4 5, whose clock drifts by two intervals).

2.3. Difference pair

The difference pair is based on the notion of cyclic difference set in combinatorial theory [6]. Let U={0,1, ..., N-1} and let Q be a subset of U. The cyclic difference set of Q is defined as Q(i)={(q+i) mod N | q∈Q}, where i is a positive integer. Thus the cyclic difference set Q(i) denotes the drifting of Q with a volume i. A difference pair consists of two subsets of U whose cyclic difference sets always share a common value. The difference pair is formally defined as follows.

Definition 4 Let C = {c1, c2, ..., ck} and D = {d1, d2, ..., dl} be two subsets of U={0,1, ..., N-1}. The ordered pair (C, D) is called an (N, k, l)-difference pair if ∀δ∈U, there exists at least one pair (ci, dj), ci∈C, dj∈D, such that ci = dj + δ (mod N).

For example, consider the two subsets C={0,1,2,4} and D={3,4,5,7} under U={0,1,2,3,4,5,6,7}. The ordered pair (C, D) is an (8,4,4)-difference pair because 4 = 4 + 0 (mod 8), 4 = 3 + 1 (mod 8), 1 = 7 + 2 (mod 8), 0 = 5 + 3 (mod 8), 1 = 5 + 4 (mod 8), 0 = 3 + 5 (mod 8), 2 = 4 + 6 (mod 8), 2 = 3 + 7 (mod 8).

For a difference pair (C, D), some value dj in D that drifts by δ volumes will still equal some value ci in C; that is, C ∩ D(δ) ≠ ∅, ∀δ∈U. Therefore, when two quorums form a difference pair, these two quorums still have an intersection under different rotation volumes.

The cyclic quorum systems utilize the concept of the difference pair to construct quorum systems such that any pair of quorums in the cyclic quorum system is a difference pair. Thus, the cyclic quorum systems obviously satisfy the rotation closure property.
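The rotation and difference-pair properties of Definitions 2-4 are easy to check by brute force for small universes. The following minimal Python sketch (the function names are ours, not the paper's) verifies the (8,4,4)-difference pair example above.

from itertools import product  # not needed here, but handy for multi-drift checks

def rotate(Q, i, N):
    # rotate(Q, i) = {(j + i) mod N | j in Q}  (Definition 2)
    return {(j + i) % N for j in Q}

def is_difference_pair(C, D, N):
    # Definition 4: for every drift delta there exist ci in C, dj in D with
    # ci = dj + delta (mod N), i.e. C and rotate(D, delta) intersect.
    return all(C & rotate(D, delta, N) for delta in range(N))

C, D, N = {0, 1, 2, 4}, {3, 4, 5, 7}, 8
print(is_difference_pair(C, D, N))   # True: the pair tolerates any clock drift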
2.4. Problem statement

The quorum-based asynchronous PS protocols cannot guarantee that more than one receiver is awake when the transmitter requests to transmit a multicast message. Therefore, the transmitter may adopt the busy waiting approach: it notifies the receivers to keep awake one by one and then transmits the multicast message when all receivers are awake. However, a wireless host may adjust its wakeup frequency according to its remaining energy in order to extend its lifetime. Thus, the transmitter cannot predict when the receivers will wake up. Hence, the receivers that wake up early will enter the busy waiting state and consume a lot of energy in idle monitoring while waiting for the multicast message until the last receiver wakes up. In this paper, we try to design a PS multicast protocol, named the CRT-MAC PS protocol, for asynchronous ad hoc networks to reduce the energy consumed while waiting for multicast messages. When the transmitter requests to transmit a multicast message, each receiver chooses a new wakeup frequency according to the notification sequence, such that the earlier-notified receivers wake up less frequently and the later-notified receivers wake up more frequently. Eventually, the m wireless hosts wake up simultaneously to receive the multicast message, saving the energy consumed while waiting for it.

3. The CRT-MAC PS Protocol

3.1. Difference m-pair

The difference m-pair property is an extension of the difference pair. It is formally defined as follows.

Definition 5 Let Q1, Q2, ..., Qm be m subsets of U={0,1, ..., N-1}. The ordered m-pair (Q1, Q2, ..., Qm) is called an (N, k1, k2, ..., km)-difference m-pair if ∀δ1, δ2, ..., δm ∈U, there exist x1∈Q1, x2∈Q2, ..., xm∈Qm such that y = xi + δi (mod N) for 1 ≤ i ≤ m, where ki=|Qi| and y∈U.

By Definition 5, we observe that if Q1, Q2, ..., Qm are an (N, k1, k2, ..., km)-difference m-pair, they have at least one common element after different rotation volumes δ1, δ2, ..., δm are applied to them; that is, Q1(δ1)∩Q2(δ2)∩ ... ∩Qm(δm) ≠ ∅. In the following, an (N, k1, k2, ..., km)-difference m-pair may simply be called a difference m-pair when no confusion can arise.

For example, consider the three subsets Q1={0,5,10,15,20,25}, Q2={0,3,6,9,12,15,18,21,24,27} and Q3={0,2,4,6,8,10,12,14,16,18,20,22,24,26,28} under U={0,1, ..., 29}, where |Q1|=6, |Q2|=10, |Q3|=15 and N=30. When δ1=0, δ2=1 and δ3=8, we can find 10∈Q1, 9∈Q2, 2∈Q3 such that y = 10 = 10 + 0 (mod 30) = 9 + 1 (mod 30) = 2 + 8 (mod 30). It is not hard to verify that ∀δ1, δ2, δ3∈U, there exist x1∈Q1, x2∈Q2, x3∈Q3 such that y = xi + δi (mod 30) for 1 ≤ i ≤ 3, y∈U. Thus, the ordered 3-pair (Q1, Q2, Q3) is a (30,6,10,15)-difference 3-pair. Hence, for any rotation volumes δ1, δ2 and δ3 among Q1, Q2 and Q3, Q1(δ1)∩Q2(δ2)∩Q3(δ3) ≠ ∅. This guarantees that m wireless hosts with different rotation volumes can wake up simultaneously when they take m subsets satisfying the difference m-pair property as their wakeup frequencies. The following subsection introduces how to generate m subsets satisfying the difference m-pair property by using the Chinese Remainder Theorem.

3.2. Chinese Remainder Theorem

The Chinese Remainder Theorem was posed in the Sun Tzu Suan-Ching as the problem of "certain things whose number is unknown." It can be formally described as follows [12].

Theorem 1 Let p1, p2, ..., pm be m positive integers which are pairwise relatively prime, i.e. gcd(pi, pj)=1 when i≠j. Let P=p1×p2× ... ×pm and let r1, r2, ..., rm be m integers. Then the system of linear congruences
  I ≡ r1 (mod p1) ≡ r2 (mod p2) ≡ ... ≡ rm (mod pm)
has a common solution I to all of the congruences, and any two solutions are congruent to one another modulo P. Furthermore, there exists exactly one solution I between 0 and P-1.

In the Chinese Remainder Theorem, a common solution I can be generated from the multiples of each pt plus rt, for 1 ≤ t ≤ m. When the remainders r1, r2, ..., rm differ, the common solution I between 0 and P-1 also differs. Hence, we utilize the concept of the Chinese Remainder Theorem to generate m subsets and guarantee that these m subsets have an intersection after different rotation volumes. Let p1, p2, ..., pm be m positive integers which are pairwise relatively prime and P=p1×p2× ... ×pm. Then Qt={pt·k | 0 ≤ k ≤ P/pt − 1}, for 1 ≤ t ≤ m, are m subsets under U={0,1, ..., P-1}. By the Chinese Remainder Theorem, ∃I∈U, I∈Q1(r1)∩Q2(r2)∩ ... ∩Qm(rm) ≠ ∅, ∀rt∈U, for 1 ≤ t ≤ m. Hence the ordered m-pair (Q1, Q2, ..., Qm) is a difference m-pair.

For example, consider the three pairwise relatively prime integers 5, 3 and 2. We can construct three subsets Q1={0,5,10,15,20,25}, Q2={0,3,6,9,12,15,18,21,24,27} and Q3={0,2,4,6,8,10,12,14,16,18,20,22,24,26,28} under U={0,1, ..., 29}. When the remainders are r1=0, r2=1 and r3=0, Q1(0)={0,5,10,15,20,25}, Q2(1)={1,4,7,10,13,16,19,22,25,28} and Q3(0)={0,2,4,6,8,10,12,14,16,18,20,22,24,26,28} have the common solution 10. It is not hard to verify that ∀r1, r2, r3∈U, these three subsets have an intersection. Thus, the ordered 3-pair (Q1, Q2, Q3) is a difference 3-pair.
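As a concrete illustration of the construction in Section 3.2, the sketch below generates the wakeup sets Qt from pairwise relatively prime integers and checks by brute force that they intersect under arbitrary drifts (the difference m-pair property). It is a stand-alone Python sketch; the function names are ours.

from math import gcd
from functools import reduce
from itertools import product

def crt_wakeup_sets(primes):
    # Q_t = {p_t * k | 0 <= k <= P/p_t - 1} under U = {0, ..., P-1}  (Section 3.2).
    # primes must be pairwise relatively prime; host t wakes up in the beacon
    # intervals listed in Q_t.
    assert all(gcd(a, b) == 1 for i, a in enumerate(primes) for b in primes[i + 1:])
    P = reduce(lambda x, y: x * y, primes, 1)
    return P, [set(range(0, P, p)) for p in primes]

def is_difference_m_pair(P, sets):
    # Brute-force check of Definition 5 for small P: every combination of drifts
    # leaves a common wakeup interval.
    for drifts in product(range(P), repeat=len(sets)):
        shifted = ({(x + d) % P for x in Q} for Q, d in zip(sets, drifts))
        if not set.intersection(*shifted):
            return False
    return True

P, (Q1, Q2, Q3) = crt_wakeup_sets([5, 3, 2])
print(sorted(Q1))                              # [0, 5, 10, 15, 20, 25]
print(is_difference_m_pair(P, [Q1, Q2, Q3]))   # True (all 30^3 drift combinations)

In CRT-MAC, the transmitter hands pt to receiver t in notification order with the largest value first, so the earliest-notified host gets the sparsest wakeup set and wakes up least often.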
3.3. CRT-MAC PS protocol

In asynchronous PS protocols, the wireless host divides the time axis into equal-length beacon intervals.
It is more difficult to synchronize m wireless hosts than R1 0 1 2 3 4 5 6 7 8 R1 0 1 2 3 4 5 6 7 8 9 10
two wireless hosts. The busy waiting approach is a R2 0 1 2 3 4 5 6 7 R2 0 1 2 3 4 5 6 7 8 9
R3 0 R3 0 1 2
simple approach that can lead m wireless hosts to wake
(a) (b)
up simultaneously. In this approach, the transmitter no-
Figure 3. Examples of Busy waiting approach and
tifies the receivers to keep awake one by one and then
CRT-MAC PS protocol for the multicast message.
transmit the multicast message after all receivers are
awake. The receivers notified early will enter the busy Procedure CRT-MAC_ transmitter (Int m){
waiting state and consume a lot of energy in idle Set p1 , p2 , …, pm are pairwise relatively prime descending;
monitoring for waiting the multicast message. In the For( Receiver_i=0; Receiver_ i <m; Receiver_ i ++) do{
Listen Receiver_ i wakeup;
worst case for waiting the multicast message, m–1 Send CRT-MAC_ receiver( pReceiver_i +1);
wireless hosts have entered the busy waiting state but }
the transmitter still has to wait for the last wireless host Wait every Receiver_i wakeup;
to wake up and then starts to transmit the multicast Send MulticastMessage;
}
messages. Fig. 3(a) shows a simple example of the
busy waiting approach. The transmitter transmits the Figure 4. The procedure of CRT-MAC transmitter.
multicast message to the receivers ‘1’, ‘2’ and ‘3’. The Procedure CRT-MAC_ receiver (Int p){
receivers ‘1’ and ‘2’ have already entered the busy Clock_temp = 0;
waiting state in the beginning, whereas the receivers Do{
If (Clock_temp mod p == 0) Then
‘3’ is notified to enter the busy waiting state after the Listen a Beacon Interval;
eighth beacon interval passed. Thus, the receivers ‘1’ …..;
Else
and ‘2’ consume a lot of energy in idle monitoring for Sleep a Beacon Interval;
waiting the multicast message. End
} Loop (Clock_temp++)
In our CRT-MAC PS protocol, the transmitter noti- }
fies m receivers to wake up one by one and dispatches Figure. 5. The procedure of CRT-MAC receiver.
the pairwise relatively primes p1 , p2 , …, pm to m re-
ceivers as their wakeup frequencies respectively ac- 9,12,15,18,21,24,27} and Q3 ={0,2,4,6,8,10,12,14,16,
cording to the notification sequence, where the sizes of 18,20,22,24,26,28} under U={0,1, …, 29}. According
p1 , p2 , …, pm are descending. Then each receiver t to their new wakeup frequencies and the rotation vol-
generates its new wakeup frequency Qt by using Chi- umes δ1 =0, δ2 =1 and δ3 =8, these three receivers will
nese Remainder Theorem according to its assigned wake up simultaneously at the beacon interval 10 = 10
prime and takes Qt as its new wakeup frequency. Due + 0 = 9 + 1 = 2 + 8, where 10∈Q1 , 9∈Q2 , 2∈Q3 and
to the descending property of p1 , p2 , …, pm , the fre- 10∈Q1 (δ1 )∩Q2 (δ2 )∩Q3 (δ3 ) ≠ ∅. Then, the transmitter
quencies of Q1 , Q2 , …, Qm are ascending such that the starts to transmit the multicast message to them.
earlier wakeup receivers wake up with less frequency Fig. 4 illustrates the procedure of the transmitter
and the later wakeup receivers wake up with more fre- when the transmitter wants to transmit the multicast
quency. Thus, the receivers can reduce the energy con- message. Fig. 5 illustrates the procedure of the receiver
sumption in idle monitoring for waiting the multicast when the receiver receives the notification of choosing
message. We let δ1 , δ2 , ..., δm to be the rotation vol- a new wakeup frequency.
umes for m receivers respectively. According to Chi-
nese Remainder Theorem, these m receivers will wake 4. Simulation setting and results
up simultaneously at the beacon interval I∈Q1 (δ1 )∩
In the simulations, we adopt the QualNet simulator
Q2 (δ2 )∩ …∩Qm (δm ) ≠ ∅. Hence, after all receivers [16] to simulate our CRT-MAC PS protocol. We mod-
are awake, the transmitter starts to transmit the multi- ify the ad hoc module of MAC-layer in the QualNet to
cast message to them. fit our CRT-MAC PS protocol. We assume that the
Fig. 3(b) shows an example of CRT-MAC. When transmission radius is 250 meters and the transmission
the transmitter wants to transmit the multicast message rate is 2M bits/sec. The MAC-layer of wireless hosts
to three receivers ‘1’, ‘2’ and ‘3’ and notifies them se- basically follows the IEEE 802.11 standard, except the
quentially, the transmitter dispatches p1 =5, p2 =3 and power management. Table 1 summarizes the power
p3 =2 to three receivers ‘1’, ‘2’ and ‘3’ as their wakeup model parameters used in our simulations, which are
frequency respectively according to the notification obtained from real experiments using Lucent Wave-
sequence. These three receivers generate their new LAN cards [13]. Transmitting or receiving a broadcast
wakeup frequencies Q1 ={0,5,10,15,20,25}, Q2 ={0,3,6, packet of L bytes has a cost Pbase+Pbyte× L, where Pbyte
is the energy consumption per byte. We set three

Table 1. tee that m different wakeup frequencies have the inter-
Energy consumption parameters in the simulation. section even though they drift with different rotation
Transmit rate:2Mbps volumes. When m wireless hosts are required to trans-
Measured Power consumed
Power Supply:4.74 V mit the multicast messages, each wireless hosts will
Sleep Mode 14 mA 27 uW/ms choose a new wakeup frequency generated by the con-
Idle Mode 178 mA 843 uW/ms
Broadcast Receive 204 mA
cept of Chinese Remainder Theorem. The new wakeup
56+0.5×L uW/packet
Broadcast Transmit 280 mA 266+1.9×L uW/packet
frequency satisfies the difference m-pair property.
Eventually, these m wireless hosts will wake up simul-
taneously and receive the multicast message. Finally,
we simulate our CRT-MAC PS protocol to evaluate its
energy consumption and compare with the busy wait-
ing approach. The simulation results have confirmed
that the energy consumption of our PS protocol is less
(a) (b) than that of the busy waiting approach.
Figure 6. (a) Energy consumption of vary number
of multicast packets. (b) Energy consumption of the 6. References
CRT-MAC PS protocol and busy waiting approach.
[1] C.E. Perkins, “Ad Hoc Networking,” Addison Wesley,
wireless hosts in the simulation and transmit the mul- 2001.
ticast message to them. In the first simulation, we [2] X. Jiang, J. Polastre, and D. Culler, “Perpetual environ-
compare the energy consumption of vary number of mentally powered sensor networks,” IEEE SPOTS, 2005.
multicast packets. Fig. 6(a) shows the comparison of [3] IEEE Std 802.11-1999, “Wireless LAN Medium Access
the energy consumption of transmitting multicast mes- Control (MAC) and Physical Layer (PHY) specifications,”
sages and retransmitting multicast messages. We can IEEE, 1999.
observe that the multicast messages need to be re- [4] C.M. Chao and J.P. Sheu, “An Adaptive Quorum-Based
transmitted while the receivers are in PS mode. Hence, Energy Conserving Protocol for IEEE 802.11 Ad Hoc Net-
works,” IEEE Transactions On Mobile Computing, 2006.
the energy of the transmitter will be rapidly consumed
[5] J.R. Jiang, Y.C. Tseng, C.S. Hsu, and T.H. Lai, “Quo-
with the increasing of retransmission. rum-Based Asynchronous Power-Saving Protocols for IEEE
In order to reduce the possibility of retransmitting 802.11 Ad Hoc Networks,” ACM Journal on Mobile Net-
the multicast messages, the transmitter needs to wait works and Applications, 2005.
for all receivers to be awake in both the busy waiting [6] Y.C. Tseng, C.S. Hsu, and T.Y. Hsieh, “Power-Saving
approach and the CRT-MAC PS protocol and then Protocols for IEEE 802.11-Based Multi-Hop Ad Hoc Net-
transmit the multicast messages once. However, wait- works,” IEEE INFOCOM, 2002.
ing for other receivers to wake up will increase energy [7] P. Hurni, T. Braun, and L.M. Feeney, “Simulation and
consumption of idle monitoring. Hence, in the second Evaluation of Unsynchronized Power Saving Mechanisms in
Wireless Ad Hoc Networks,” WWIC, pp. 311-324, 2006.
simulation, we evaluate the energy consumption of the
[8] M. Maekawa, “A √N Algorithm for Mutual Exclusion in
busy waiting approach and that of the CRT-MAC PS Decentralized Systems,” ACM Trans. Comput. Syst., 1985.
protocol. Fig. 6(b) shows that the energy consumption [9] W.S. Luk and T.T. Huang, “Two New Quorum Based
of the CRT-MAC PS protocol is less than that of the Algorithms for Distributed Mutual Exclusion,” Proc. Int’l
busy waiting approach before the transmitter starts to Conf. Distributed Computing Systems, pp.100-106, 1997.
transmit the multicast message. The reason is that in [10] C.M. Lin, G.M. Chiu, and C.H. Cho, “A New Quo-
CRT-MAC protocol the receiver notified to wake up rum-Based Scheme for Managing Replicated Data in Dis-
will choose a new wakeup frequency and wakeup or tributed Systems,” IEEE Trans. Computers, vol.51, 2002.
doze according to its new wakeup frequency, whereas [11] T. You, H.S Hassanein, and C.-H. Yeh, “SeMAC: Ro-
in the busy waiting approach the receiver notified to bust broadcast MAC Protocol for multi-hop,” IPCCC, 2006.
[12] C.H. Wu, J.H. Hong, and C.W. Wu, “RSA Cryptosystem
wake up need to keep awake until the multicast trans- Design Based on the Chinese Remainder Theorem,” Pro-
mission is finished. Hence, the CRT-MAC PS protocol ceedings of the ASP-DAC, pp. 391-395, 2001.
consumes less energy in waiting the multicast message. [13] L.M. Feeney and M. Nilsson, “Investigating the Energy
Consumption of Wireless Network Interface in an Ad Hoc
5. Conclusion Networking Environment,” IEEE INFOCOM, 2001.
[14] J. So and N.H. Vaidya, “A Multi-Channel MAC Proto-
In this paper, we propose a new asynchronous PS col for Ad Hoc Wireless Networks,” ACM Mobihoc, 2004.
[15] M. Hall Jr., Combinatorial Theory. John Wiley & Sons,
protocol, named CRT-MAC PS protocol, for the mul- 1986.
ticast transmission in the asynchronous ad hoc network. [16] QualNet, http://www.scalable-networks.com/.
We introduce the difference m-pair property to guaran-

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Adaptive Bandwidth Management and Reservation Scheme in


Heterogeneous Wireless Networks
I-Shyan Hwang, *Bor-Jiunn Hwang, Ling-Feng Ku, Pen-Ming Chang
Department of Computer Engineering and Science
Yuan-Ze University, Chung-Li, Taiwan, 32026
*Department of Computer and Communication Engineering
Ming-Chuan University, Tao-Yuan, Taiwan, 33348
E-mail: ishwang@saturn.yzu.edu.tw, bjhwang@mcu.edu.tw, s939408@mail.yzu.edu.tw, s966041@mail.yzu.edu.tw

Abstract In order to support multiple types of service for delivering high speed Internet access to businesses,
with different QoS requirements in heterogeneous wireless homes and hot spots.
networks, efficient resource management, call admission The systems in heterogeneous wireless networks are
control strategies, and mobility management are important able to maintain the delivered QoS to different users at the
issues. In this paper, we propose Bandwidth Management target level with the combination of call admission control
Strategy 1 (BMS1), Bandwidth Management Strategy 2 and resource management techniques. In this paper, two
(BMS2) and reservation scheme with Fuzzy controller for bandwidth management and reservation schemes
real-time services. Simulation result shows that the Bandwidth Management Strategy 1, Bandwidth
proposed methods balance resource utilization outperforms Management Strategy 2 are proposed to decrease call
the previous work and traditional CAC by improving the reject probability for better resource utilization. The
call reject probability (CRP). bandwidth management strategies include admission
Keywords: Heterogeneous wireless network, QoS, CAC, control, resource reservation mechanism for real-time
CRP. services. And in real-time services, some traffic has bursty
features, it is hard to reserve appropriate bandwidth for
such traffic, so we propose Fuzzy controller to adjust
1. Introduction bandwidth of real-time service adaptively and enhance
The future wireless network trend is to connect to the resource reservation mechanism [4,5].
network anywhere and anytime. Thus, the integration of This paper is organized as follows. Section 2 introduces
different wireless data networks such as WiFi, WiMAX the related work. Section 3 depicted the system model and
and UTRAN [1] to become a multi-tier heterogeneous the call management flow. The proposed BMS algorithms
wireless network is a more and more popular issue. and Fuzzy controller are described in section 4. In section
Heterogeneous wireless network has the feature that all 5, the simulation model is proposed and simulation results
users are able to switch different access technologies are evaluated and compared. Conclusion is given in section
according to their demands. How to provide such a 6.
multiple types of applications with different QoS
requirements, the efficient resource management and call 2. Related Work
admission control strategies play an important role [2,3].
In nowadays, the WiFi network can provide Lots of call admission control (CAC) and resource
communication coverage in a close area. WiFi connect to reservation schemes designed for different choices and
the Internet only in a limited range with an access point approaches have been proposed [6,7]. But those CAC
(AP). The limited coverage range of WLAN makes it scheme are static method. Then, some adaptive method in
difficult to fulfill the future wireless network trend to dealing with resource reservation schemes, CAC and
connect to the network anywhere and anytime. Unlike bandwidth control mechanisms have been proposed to
802.11, WiMAX has been designed specifically for cope with the complex wireless network [8,9]. A dynamic
deployment in outdoor environments. WiMAX, also known admission control for WiMAX networks and a handoff
as IEEE 802.16 offers a high speed data, voice and video algorithm for the hybrid network or for WiFi and WiMAX
services with a bandwidth of up to 72Mbps over a range of are proposed in [10,11]. Besides, how many neighboring
30 miles. It offers a non-line-of-sight (NLOS) range of 4 cells and which cells will make resource reservation are
miles and supports a point-to-multipoint distribution. two important issues [12]. The resource reservation in
Therefore, a natural trend of combining WiMAX with WiFi neighboring cells can provide real-time traffic with more
will need to create a system, a complete wireless solution stationary call for handoff call. In [13], it proposed a

resource reservation scheme with reserved resource in all
neighboring cells which introduced the impact on system Table1. 802.16 and 802.11e service type mapping
performance. However, the behavior of the mobile users` 802.16 802.1p 802.11e
handoff to each neighboring cells is haphazard. Therefore Service type user_priority acronym for traffic type
our proposed management scheme will propose excessive UGS 6 VO
ERT-VR 6 VO
cells reserve resource, system will cast away some resource
RT-VR 5 VI
and call rejecting probability will decrease.
NRT-VR 1 BK
BE 0 BE
3. System Model
The heterogeneous network architecture, where a 4. Bandwidth management Strategy
macrocell (WiMAX cell) is an overlay of several Two bandwidth management strategies are proposed in
microcells (WiFi cells). The mobile users can access the this paper and the system capacities considered in the
resource of WiFi or WiMAX. Both the WiFi and WiMAX
system are active capacity and passive capacity [9]. The
cells contain new calls, vertical handoff calls and
active capacity represents the capacity occupied by the on-
horizontal handoff calls. New calls event means if the
going calls in the cell, and the passive capacity is reserved
mobile users request new call, the system start the
capacity for each cell reserves bandwidth for calls in the
bandwidth management strategy to evaluate sufficient
bandwidth for new call; in handoff calls if the mobile user neighboring cell in case handoff occurs. The system
discovers the received signal strength indicator (RSSI) downlink residual capacity can be obtained
from target cell is weaker than neighboring cell, it notifies by C down _ resi =C down −C down _ a − C down _ p . Similarly,
the system to start the bandwidth management strategy to the system uplink residual capacity can be obtained by
evaluate sufficient bandwidth for handoff call. The call
C_up_resi = C_up − C_up_a − C_up_p.
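As a small illustration of this capacity bookkeeping, the following Python sketch (the names are ours, not the paper's) computes the residual capacities and performs the basic admission check that BMS1 applies to a UGS/ERT-VR new call; the numeric values are illustrative only.

def residual(total, active, passive):
    # Residual capacity = total - active (on-going calls) - passive (reserved for handoff).
    return total - active - passive

def admit_real_time_new_call(req_down, req_up, cell):
    # BMS1-style check for a UGS/ERT-VR new call (sketch): accept only if both the
    # downlink and uplink residual capacities cover the desired bandwidth.
    c_down_resi = residual(cell["C_down"], cell["C_down_a"], cell["C_down_p"])
    c_up_resi = residual(cell["C_up"], cell["C_up_a"], cell["C_up_p"])
    return c_down_resi >= req_down and c_up_resi >= req_up

cell = {"C_down": 72.0, "C_down_a": 40.0, "C_down_p": 10.0,
        "C_up": 72.0, "C_up_a": 30.0, "C_up_p": 10.0}   # Mbps, illustrative numbers only
print(admit_real_time_new_call(2.0, 1.0, cell))          # True

A full BMS1 decision would additionally require the resource reservation in the neighboring cells to succeed, as shown later in Fig. 2.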
admission control procedure is depicted in Fig. 1. Besides,
when the vertical handoff call is initiated, there exists a
traffic mapping mechanism between WiFi 802.11e and 4.1 BMS1
WiMAX network. Table 1 shows the traffic mapping.
The algorithm WiMAX new calls for resource
After classification, bandwidth management scheme will
reservation shown in Fig. 2, if it is the UGS or ERT-VR
decide whether the connection is accepted or dropped.
traffic, the strategy first verifies that the WiMAX cell has
enough capacity ( C down _ resi and C up _ resi ) to support the
traffic QoS requirement. If not, the new connection is
blocked; otherwise, the new call is accepted and allocated
with desired amount of bandwidth. In the case that the
system has enough capacity but reservation fails in one of
six neighboring cells, the new call is also blocked, and so
does in RT-VR traffic. For the RT-VR traffic, since this
kind of VBR traffic exhibits highly bursty and stationary
properties, the effective bandwidth allocation must be
designed to handle the worst-case input scenario in order to
avoid excessive delay or even the call is dropped. This
implies that the system must support the minimum required
bandwidth in order to guarantee the maximum tolerable
end-to-end delay. If the desired amount of bandwidth can
be provided and succeed in the resource reservation, the
new call is accepted and allocated with desired amount of
bandwidth. At the same time, a fuzzy bandwidth controller
adjusts the allocated bandwidth to the target RT-VR traffic
based on the system state: residual bandwidth and dropping
probability. As for the NRT-VR traffic, it is accepted as
long as residual capacity is greater than or equal minimum
amount of resource requested in the target cell. For the BE
Figure 1 Call admission control procedure. traffic, it is accepted as long as there is available resource
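As an illustration only (not part of the original paper), the following Python sketch shows how the residual-capacity test of Section 4 and the UGS/ERT-VR branch of BMS1 could be expressed; the class and function names are hypothetical.

  from dataclasses import dataclass

  @dataclass
  class Capacity:
      total: float    # C (e.g., in kbps)
      active: float   # capacity occupied by on-going calls (C_a)
      passive: float  # capacity reserved for handoff calls (C_p)

      @property
      def residual(self) -> float:
          # C_resi = C - C_a - C_p, as defined in Section 4
          return self.total - self.active - self.passive

  def admit_ugs_ertvr(down, up, want_down, want_up, reservation_ok):
      """Illustrative BMS1-style check for a UGS/ERT-VR new call: accept only if
      both residual capacities cover the desired bandwidth and the reservation
      in the neighboring cells succeeded; otherwise the call is blocked."""
      if down.residual >= want_down and up.residual >= want_up and reservation_ok:
          down.active += want_down   # allocate the desired amount of bandwidth
          up.active += want_up
          return True
      return False

  # Example: 10 Mbps downlink with 6 Mbps active and 2 Mbps passive capacity.
  dl = Capacity(total=10_000, active=6_000, passive=2_000)
  ul = Capacity(total=5_000, active=3_000, passive=1_000)
  print(admit_ugs_ertvr(dl, ul, want_down=1_000, want_up=500, reservation_ok=True))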

The real-time traffic types in the WiMAX new call procedure, the WiMAX horizontal handoff call procedure, the WiFi horizontal handoff call procedure and the WiFi vertical handoff call procedure all need to reserve resource, which results in a higher admission control threshold for real-time traffic. On the contrary, our proposed strategy adopts a lower admission control threshold for non-real-time traffic, and this may introduce a higher blocking or dropping probability for real-time traffic. To avoid this, a balance mechanism is proposed in the next section.

For a WiMAX new call
IF UGS or ERT-VR traffic type
  IF Cdown_resi ≥ desired amount of downlink bandwidth
     AND Cup_resi ≥ desired amount of uplink bandwidth
    Allocate desired amount of bandwidth
    Resource Reservation
  ELSE
    Call blocked
IF RT-VR traffic type
  IF Cdown_resi ≥ desired amount of downlink bandwidth
     AND Cup_resi ≥ desired amount of uplink bandwidth
    Fuzzy Bandwidth controller
    Resource Reservation
  ELSE
    Call blocked
IF NRT-VR traffic type
  IF Cdown_resi ≥ minimum amount of downlink bandwidth
     AND Cup_resi ≥ minimum amount of uplink bandwidth
    Balance mechanism
  ELSE
    Call blocked
IF BE traffic type
  IF Cdown_resi ≥ 0 AND Cup_resi ≥ 0
    Balance mechanism
  ELSE
    Call blocked

Figure 2 BMS 1 for WiMAX new calls.

4.2 Balance Mechanism

The CRP is the sum of the call dropping probability (CDP) and the call blocking probability (CBP). Our proposed strategy adopts a higher admission control threshold for real-time traffic. In order to avoid higher CBP and CDP of real-time traffic as a result of this higher threshold, the resource reservation balance mechanism records the CBP and CDP of the real-time handoff calls that are blocked or dropped because of a failed resource reservation, and the threshold is then updated accordingly.

4.3 BMS2

Figure 3 shows the BMS2 algorithm for WiMAX new calls with sharing. In a WiMAX cell, if a new call of a real-time traffic type arrives and the target cell or the neighboring cells do not have enough capacity to support the traffic QoS requirements, the target cell is allowed to "share" some of the bandwidth from the passive capacity (the capacity reserved for handoff calls) to improve the call blocking probability performance.

For a real-time new call, BMS 2 works as follows. First, BMS 2 verifies whether Cdown_resi + Cdown_s and Cup_resi + Cup_s are sufficient for the call. If they can support the traffic QoS requirement and the resource reservation succeeds, the new call is accepted and bandwidth is reserved in ε neighboring cells. If either the reservation fails or the available capacity is not enough for the call, the call is blocked. A non-real-time new call is blocked as long as there is not enough residual capacity available for this call.

For a WiMAX new call
IF UGS or ERT-VR traffic type
  IF Cdown_resi ≥ desired amount of downlink bandwidth
     AND Cup_resi ≥ desired amount of uplink bandwidth
    Allocate desired amount of bandwidth
    Resource Reservation
  IF There is enough passive resource capacity to be shared
    Fuzzy Ratio controller
  ELSE
    Call blocked
IF RT-VR traffic type
  IF Cdown_resi ≥ desired amount of downlink bandwidth
     AND Cup_resi ≥ desired amount of uplink bandwidth
    Fuzzy Bandwidth controller
    Resource Reservation
  ELSE
    Call blocked
IF NRT-VR traffic type
  IF Cdown_resi ≥ minimum amount of downlink bandwidth
     AND Cup_resi ≥ minimum amount of uplink bandwidth
    Allocate desired amount of bandwidth
  ELSE
    Call blocked
IF BE traffic type
  IF Cdown_resi ≥ 0
     AND Cup_resi ≥ 0
    Balance mechanism
  ELSE
    Call blocked

Figure 3 BMS 2 algorithm.
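To make the sharing step of BMS2 concrete, the sketch below is an illustrative Python rendering (not the authors' implementation); the shared-capacity variables follow the Cdown_s/Cup_s notation of this section, while the function name and the example numbers are assumptions.

  def bms2_admit_realtime(c_down_resi, c_up_resi, c_down_s, c_up_s,
                          want_down, want_up, reservation_ok):
      """Hypothetical BMS2 check for a real-time new call: the call is admitted
      if the residual capacity plus the shareable part of the passive (reserved)
      capacity covers the request on both links and the neighboring-cell
      reservation succeeds."""
      enough_down = c_down_resi + c_down_s >= want_down
      enough_up = c_up_resi + c_up_s >= want_up
      if enough_down and enough_up and reservation_ok:
          return "accepted"   # the fuzzy ratio controller would then tune the shared part
      return "blocked"

  # A call that does not fit in the residual capacity alone but fits once
  # part of the passive capacity is shared:
  print(bms2_admit_realtime(c_down_resi=300, c_up_resi=200,
                            c_down_s=500, c_up_s=400,
                            want_down=600, want_up=300,
                            reservation_ok=True))   # -> accepted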
4.4 Fuzzy Bandwidth Controller

The status of the network changes from time to time; therefore, a static resource reservation mechanism cannot adjust accordingly. Thus, the fuzzy controller within the resource reservation mechanism alters the parameters dynamically and adaptively. For example, a real-time service such as variable bit rate (VBR) traffic exhibits highly bursty and non-stationary properties, and inefficient resource allocation may lead to under-utilization of network resources or excessive traffic delay. To prevent more bandwidth being allocated than needed, a fuzzy bandwidth controller is introduced in the bandwidth management scheme to adaptively adjust the amount of bandwidth allocated to new and handoff calls based on the current network conditions.

The fuzzy bandwidth controller adjusts the bandwidth allocated to the target RT-VR traffic based on the system state, e.g. the residual bandwidth and the call dropping probability. The system residual bandwidth can be derived by recalling the bandwidth management strategy, and every time a handoff call is accepted or dropped the system call dropping probability PD is updated. When PD reaches or exceeds the target dropping probability PD_tar, the system calculates the dropping probability change ∆PD = |PD − PD_tar| and the current residual capacity Cresi, and the fuzzy bandwidth controller is initiated.

Fuzzification is the process that translates the real-number inputs of each feedback into linguistic terms. Fig. 5 shows the membership functions for the dropping probability change ∆PD; three linguistic terms {low, medium, high} are defined, each with a corresponding membership function.

Figure 5 Membership function for ∆PD.
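The fuzzification step can be pictured with ordinary triangular membership functions. The Python sketch below is a generic illustration only; the breakpoints 0.01/0.02/0.04 are arbitrary placeholders and do not come from the paper (the actual membership functions are those of Fig. 5).

  def triangular(x, a, b, c):
      """Triangular membership function with feet at a and c and peak at b."""
      if x <= a or x >= c:
          return 0.0
      if x <= b:
          return (x - a) / (b - a)
      return (c - x) / (c - b)

  def fuzzify_delta_pd(delta_pd):
      """Map the dropping-probability change to membership degrees in
      {low, medium, high}. Breakpoints are illustrative placeholders."""
      return {
          "low":    triangular(delta_pd, -0.01, 0.0, 0.02),
          "medium": triangular(delta_pd, 0.0, 0.02, 0.04),
          "high":   min(1.0, max(0.0, (delta_pd - 0.02) / 0.02)),  # saturating ramp
      }

  print(fuzzify_delta_pd(0.015))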

4.5 Fuzzy Ratio Controller

BMS2 adopts a resource-sharing method to decrease the CRP. For the ratio of the total WiMAX uplink bandwidth to the WiMAX uplink passive bandwidth, three linguistic terms U^m_Cγ1 = {low, medium, high} are defined, as in the fuzzy bandwidth controller. After the linguistic terms are generated through the membership functions in the Fuzzifier, the Inference Engine performs the logic inference according to the Fuzzy Rule Base. In our case, the Fuzzy Rule Base is expressed in the following format:

Rule i:
  IF Cγ1 is U^m_Cγ1 and PD is U^m_PD
  THEN set the ρ for τ, m = 1, 2, 3.

5. Simulation Results

Six types of calls are considered in the simulation: WiMAX new calls, WiFi new calls, WiMAX horizontal handoff calls, WiFi horizontal handoff calls, WiMAX vertical handoff calls, and WiFi vertical handoff calls. Traffic is generated following a Poisson distribution with given average arrival rates. The call holding durations for data, voice, and video traffic are exponentially distributed. The value of the target dropping probability PD_tar is set to 0.02 [8]. The threshold of the dwelling time of a mobile user staying in a cell is estimated periodically by using the dwelling time statistics that are updated.

Figure 6 (a) Call reject probability of real-time traffic. (b) Call reject probability of non-real-time traffic.
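The traffic model described in Section 5 (Poisson call arrivals with exponentially distributed holding times) can be reproduced in a few lines; in the sketch below the 180 s mean voice-call holding time is taken from the discussion later in this section, while the arrival rate and function name are arbitrary assumptions.

  import random

  def generate_calls(arrival_rate, mean_holding, duration, rng=random.Random(1)):
      """Generate (arrival_time, holding_time) pairs for a Poisson arrival
      process with exponential holding times (illustrative parameters)."""
      t, calls = 0.0, []
      while True:
          t += rng.expovariate(arrival_rate)   # Poisson process: exponential inter-arrivals
          if t > duration:
              return calls
          calls.append((t, rng.expovariate(1.0 / mean_holding)))

  voice_calls = generate_calls(arrival_rate=0.5, mean_holding=180.0, duration=3600.0)
  print(len(voice_calls), voice_calls[0])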

The CRP is evaluated for each strategy; the CRP is the sum of the CBP and the CDP. The proposed BMS1 and BMS2 are compared with the two resource management methods in [13], called Com_RMS 1 and Com_RMS 2, respectively. First, the CRP of real-time traffic of the proposed BMS1 and BMS2 is smaller than that of Com_RMS 1 and Com_RMS 2. There are two main reasons: 1. BMS1 and BMS2 adopt a fuzzy bandwidth controller for VBR traffic resource reservation; besides, newly arriving calls are allowed to share capacity in BMS2, which improves the performance and thus reduces the CBP. 2. Because some of the passive capacity is "shared" by the real-time new calls, less reserved capacity is available for real-time handoff calls, which reduces the CDP. Figure 6(a) shows that the improvements of BMS1 and BMS2 over Com_RMS1 and Com_RMS2 are 34% and 46%, respectively.

However, for non-real-time calls, BMS1 and BMS2 encounter a worse CRP than Com_RMS1 and Com_RMS 2. The reason is that the CBP increases rapidly as the arrival rate of real-time traffic increases. Since the call duration of a real-time call is relatively long (180 secs for voice calls, 360 secs for video calls) compared to non-real-time calls, more capacity is occupied by the real-time calls (64 kbps~384 kbps for video calls). As the arrival rate increases, more real-time calls are accepted into the system, leading to an increase of the non-real-time CBP. The reason for the CDP behavior is the same as for the CBP. Figure 6(b) shows the CRP vs. non-real-time calls.

6. Conclusion

The main benefit of combining different wireless access technologies in a multi-tier topology to form a heterogeneous wireless network is that the system load can be shared among the different wireless access technologies, so that system performance can be improved, for example by decreasing the rejecting probability and increasing the system throughput. In this paper, two bandwidth management strategies with a resource reservation mechanism are proposed for heterogeneous wireless networks. Simulation results show that the proposed BMS1 and BMS2 indeed improve the CRP of real-time calls owing to the fuzzy controllers. However, the CRP of the proposed BMS1 and BMS2 for non-real-time calls is higher than that of Com_RMS 1 and Com_RMS 2. Future work includes how to avoid starvation of non-real-time services when the system is heavily loaded, and how to dynamically adjust the passive capacity that can be borrowed from neighboring cells.

References
[1] IEEE 802.16e-2005, "Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems – Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands," Feb. 2006.
[2] N. Nasser, A. Hasswa and H. Hassanein, "Handoffs in fourth generation heterogeneous networks," IEEE Communication Magazine, vol. 44, issue 10, Oct. 2006, pp. 96-103.
[3] S. Xu and B. Xu, "A fair admission control scheme for multimedia wireless network," International Conference on Wireless Communications, Networking and Mobile Computing, vol. 2, Sept. 2005, pp. 859-862.
[4] I.S. Hwang, S.N. Lee and I.C. Chang, "Performance Assessment of Fuzzy Logic Control Routing Algorithm with Different Wavelength Assignments in DWDM Networks," Journal of Information Science and Engineering, vol. 22, no. 2, Mar. 2006, pp. 461-473.
[5] I.S. Hwang, I.F. Huang and S.C. Yu, "Dynamic Fuzzy Controlled RWA Algorithm for IP/GMPLS over WDM Networks," Journal of Computer Science and Technology, vol. 20, no. 5, Sept. 2005, pp. 717-727.
[6] M.H. Ahmed, "Call admission control in wireless communications: a comprehensive survey," IEEE Communication Surveys, vol. 7, no. 1, First Quarter 2005, pp. 50-69.
[7] D. Niyato and E. Hossain, "Call admission control for QoS provisioning in 4G wireless networks: issues and approaches," Special Issue of IEEE Network on 4G Network Technologies for Mobile Telecommunications, vol. 19, no. 5, Sept.-Oct. 2005, pp. 5-11.
[8] B.J. Hwang, J.S. Wu and Y.C. Nieh, "Improving the performance in a multimedia CDMA cellular system with resource reservation," IEICE Trans. on Communications, vol. E84-B, no. 4, Apr. 2001, pp. 727-738.
[9] P. Siripongwutikorn, S. Banerjee and D. Tipper, "A survey of adaptive bandwidth control algorithms," IEEE Communication Surveys, vol. 5, no. 1, Third Quarter 2003, pp. 14-26.
[10] W. Li and X. Chao, "Call Admission Control for an Adaptive Heterogeneous Multimedia Mobile Network," IEEE Transactions on Wireless Communications, vol. 6, no. 2, Feb. 2007, pp. 515-525.
[11] J. Nie, J.C. Wen, Q. Dong and Z. Zhou, "A seamless handoff in IEEE 802.16a and IEEE 802.11n hybrid networks," International Conference on Communication, Circuits and Systems, Hong Kong, May 27-30, 2005, vol. 1, pp. 383-387.
[12] J. Ni, H. K. Tsang, S. Tatikonda, and B. Bensaou, "Optimal and Structured Call Admission Control Policies for Resource-Sharing Systems," IEEE Transactions on Communications, vol. 55, no. 1, Jan. 2007, pp. 158-170.
[13] I.S. Hwang, B.J. Hwang and L.F. Ku, "Adaptive resource management in two-tier wireless networks," International Computer Symposium, Taipei, Taiwan, vol. 2, Dec. 4-6, 2006, pp. 634-639.

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

WAP: Wormhole Attack Prevention Algorithm in Mobile Ad Hoc Networks

Sun Choi, Doo-young Kim, Do-hyeon Lee, Jae-il Jung


Division of Electrical and Computer Engineering, Hanyang University
17 Haengdang-dong, Sungdong-gu, Seoul, 133-791, Korea
{sun0467,dykim,dohyeon}@mnlab.hanyang.ac.kr, jijung@hanyang.ac.kr

Abstract

In wireless ad hoc networks, nodes cooperate to forward packets for each other to communicate beyond their transmission range. Therefore, networks are vulnerable to wormhole attacks launched through compromised nodes, because malicious nodes can easily participate in the networks. In wormhole attacks, one malicious node tunnels packets from its location to the other malicious node. Such wormhole attacks result in a false route with a smaller hop count. If the source node chooses this fake route, the malicious nodes have the option of delivering the packets or dropping them. It is difficult to detect wormhole attacks because malicious nodes impersonate legitimate nodes. Previous algorithms for detecting a wormhole require special hardware or tight time synchronization. In this paper, we develop an effective method called Wormhole Attack Prevention (WAP) that does not use specialized hardware. The WAP not only detects the fake route but also adopts preventive measures to keep wormhole nodes from reappearing during the route discovery phase. Simulation results show that wormholes can be detected and isolated within the route discovery phase.

1. Introduction

A Mobile Ad-hoc NETwork (MANET) comprises nodes that are organized and maintained in a distributed manner without a fixed infrastructure. These nodes, such as laptop computers, PDAs and wireless phones, have a limited transmission range. Hence, each node has the ability to communicate directly with another node and forward messages to neighbors until the messages arrive at the destination nodes. Since the transmission between two nodes has to rely on relay nodes, many routing protocols [11, 12, 13, 14] have been proposed for ad hoc networks. However, most of them assume that other nodes are trustable and hence do not consider security and attack issues. This provides many opportunities for attackers to break the network. Moreover, the open nature of wireless communication channels, the lack of infrastructure, rapid deployment practices, and the hostile environments in which they may be deployed make them vulnerable to a wide range of security attacks described in [1, 2, 3, 4, 5, 6]. However, these attacks are performed by a single malicious node. Many solutions proposed to counter single-node attacks in [10, 15, 16] cannot defend against attacks that are executed by colluding malicious nodes, such as the wormhole attack, whose damage is more extensive than that of single-node attacks.

In this paper, we focus on an attack launched by a pair of colluding attackers: the wormhole attack. Two malicious nodes that are separated by a large distance of several hops build a direct link called a tunnel and communicate with each other through the tunnel. The tunnel can be established in many different ways, for example, through an out-of-band channel, packet encapsulation, or high-powered transmission. The route via the wormhole tunnel is attractive to legitimate nodes because it generally provides a smaller number of hops and less latency than normal multi-hop routes. The attackers can also launch attacks without revealing their identities. The wormhole attack is still possible even if the adversary does not access the contents of the packets. Therefore, it can be difficult to detect wormhole attacks since the contents of the packets are not modified.

In order to detect these attacks, some mechanisms have been proposed [3, 4, 7, 8, 9]. However, most of these mechanisms require specialized devices that can provide the location of the nodes or tight time synchronization. Moreover, they focus only on the method of detecting the wormhole route. In this paper, we propose an efficient algorithm based on the Dynamic Source Routing (DSR) protocol [12, 13]. The advantage of this algorithm is that it does not require location information or time synchronization.

This paper is organized as follows. Section 2 presents related works on the detection of wormhole attacks. In Section 3, we describe the wormhole detection and prevention algorithm in detail. Simulation results and analysis are presented in Section 4. Finally, the conclusion is provided in Section 5.

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 343


DOI 10.1109/SUTC.2008.49
2. Related Works

Packet Leash [4] is an approach in which some information is added to restrict the maximum transmission distance of a packet. There are two types of packet leashes: the geographic leash and the temporal leash. In the geographic leash, when a node A sends a packet to another node B, the node must include its location information and sending time in the packet; B can then estimate the distance between them. The geographic leash computes an upper bound on the distance, whereas the temporal leash ensures that a packet has an upper bound on its lifetime. In temporal leashes, all nodes must have tight time synchronization. The maximum difference between any two nodes' clocks is bounded by ∆, and this value should be known to all the nodes. By using the metrics mentioned above, each node checks the expiration time in the packet and determines whether or not a wormhole attack has occurred. If the packet receiving time exceeds the expiration time, the packet is discarded.

Unlike Packet Leash, Capkun et al. [7] presented SECTOR, which does not require any clock synchronization or location information, by using Mutual Authentication with Distance-Bounding (MAD). Node A estimates the distance to another node B in its transmission range by sending it a one-bit challenge, which B responds to instantaneously. By using the time of flight, A detects whether or not B is a neighbor. However, this approach uses special hardware that can respond to a one-bit challenge without any delay, as Packet Leash does.

In order to avoid the problem of using special hardware, a Round Trip Time (RTT) mechanism [5] was proposed by Jane Zhen and Sampalli. The RTT is the time that extends from the Route Request (RREQ) message sending time of a node A to the Route Reply (RREP) message receiving time from a node B. A calculates the RTT between itself and all its neighbors. Because the RTT between two fake neighbors is higher than that between two real neighbors, node A can identify both the fake and the real neighbors. In this mechanism, each node calculates the RTT between itself and all its neighbors. This mechanism does not require any special hardware and is easy to implement; however, it cannot detect exposed attacks, in which no fake neighbors are created.

The Delay per Hop Indicator (DelPHI) [13], proposed by Hon Sun Chiu and King-Shan Lui, can detect both hidden and exposed wormhole attacks. In DelPHI, attempts are made to find every available disjoint route between a sender and a receiver. Then, the delay time and length of each route are calculated, and the average delay time per hop along each route is computed. These values are used to identify wormholes. The route containing a wormhole link will have a greater Delay per Hop (DPH) value. This mechanism can detect both types of wormhole attack; however, it cannot pinpoint the location of a wormhole. Moreover, because the lengths of the routes are changed by every node, including wormhole nodes, wormhole nodes can change the route length in a certain manner so that they cannot be detected.

3. WAP (Wormhole Attack Prevention)

In this section, we describe a method for preventing wormhole attacks called Wormhole Attack Prevention (WAP). All nodes monitor their neighbors' behavior when they send RREQ messages to the destination, using a special list called the Neighbor List. When a source node receives some RREP messages, it can detect a route under wormhole attack among the routes. Once a wormhole node is detected, the source node records it in the Wormhole Node List. Even though malicious nodes have been excluded from routing in the past, the nodes have a chance to attack once more. Therefore, we store the information about wormhole nodes at the source node to prevent them from taking part in routing again. Moreover, the WAP has the ability to detect both hidden and exposed attacks without special hardware.

3.1. Assumption

At the link layer, we assume that a node can always monitor ongoing transmissions even if the node itself is not the intended receiver. This typically requires the network interface to stay in the promiscuous reception mode during all transmissions, which is less energy efficient than listening only to packets directed to oneself. We also assume that radio links are bi-directional; that is, if a node A is in the transmission range of some node B, then B is in the transmission range of A. We further assume that the transmission range of a wormhole node is similar to that of a normal node, because a more powerful transceiver is easy to detect.

3.2. Neighbor Node Monitoring

Neighbor Node Monitoring is used to detect neighbors that are not within the maximum transmission range but pretend to be neighbors. In order to reduce the network overhead caused by additional packets, this mechanism is performed during the route discovery process. Originally, an intermediate node which has a route to the destination can send a RREP to the source. However, our mechanism does not support this DSR optimization, because it performs end-to-end signature authentication of routing packets and verification of whether a node is authorized to send a RREP packet. Therefore, an intermediate node cannot reply from its cache.

Figure 1 shows an example of the secure neighbor monitoring. Node A sends a RREQ, which starts a wormhole prevention timer (WPT).

When node B receives the RREQ, B must broadcast it to its neighbors because B is not the destination. A can check whether the RREQ arrives within the timer. If A receives the message after the timer expires, it suspects B or one of B's next nodes to be wormhole nodes.

Figure 1. Example of Neighbor Node Monitoring: (a) neighbor node monitoring of legitimate nodes; (b) neighbor node monitoring of wormhole nodes.

Once a malicious node overhears a RREQ, it can claim to be the neighbor of another node that is actually not within transmission range. For this reason, two nodes may believe that the other is their neighbor, while the wormhole node does not want to expose itself. In order to prevent this problem, nodes monitor the malicious behavior of their neighbors and record it in their own neighbor node table.

3.2.1 Neighbor Node Table

Each node maintains a neighbor node table that contains the RREQ sequence number, the neighbor node ID, the sending and receiving times of the RREQ, and a count. By using this table, all nodes monitor the activities of the neighbors in their table and check for malicious behavior of the neighbors. All the fields of the neighbor node table are initially set to zero. Table 1 shows an example of the neighbor node table.

If any node sends a RREQ, it records the RREQ sequence number and the sending time of the RREQ. Then, on overhearing a RREQ from any node, it records the address of the neighbor node and the time when it receives the packet. If the node receives the RREQ after the timer count, called the WPT, it considers the neighbor node sending the RREQ to be a node affected by wormhole nodes, and the count value in its table is increased by 1. The count value must not exceed the previously configured threshold; if the count value exceeds the threshold, the node cannot engage in the network. This method ensures that wormhole nodes are avoided in all future data connections.

Table 1. Neighbor Node Table

  RREQ seq #   Neighbor Node ID   Sending Time   Receiving Time   Count
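A minimal data structure for the neighbor node table can be sketched as follows, assuming the fields listed in Table 1 (RREQ sequence number, neighbor node ID, sending time, receiving time, count); the class and method names, the example timer value and the default threshold are illustrative assumptions, not part of the WAP specification.

  from dataclasses import dataclass

  @dataclass
  class NeighborEntry:
      rreq_seq: int
      neighbor_id: str
      sending_time: float = 0.0
      receiving_time: float = 0.0
      count: int = 0            # suspected late re-transmissions overheard so far

  class NeighborTable:
      """Per-node table used to monitor overheard RREQ re-transmissions."""
      def __init__(self, wpt, threshold=3):
          self.wpt = wpt                  # wormhole prevention timer value (seconds)
          self.threshold = threshold      # max count before a neighbor is excluded
          self.entries = {}
          self.sent = (None, 0.0)         # (seq, sending time) of our own last RREQ

      def record_send(self, seq, now):
          self.sent = (seq, now)

      def overhear(self, seq, neighbor, now):
          """Record an overheard RREQ; return False once the neighbor exceeds the threshold."""
          sent_seq, sent_time = self.sent
          entry = self.entries.setdefault((seq, neighbor),
                                          NeighborEntry(seq, neighbor, sent_time))
          entry.receiving_time = now
          if seq == sent_seq and (now - sent_time) > self.wpt:
              entry.count += 1            # the RREQ came back after the WPT expired
          return entry.count <= self.threshold

  table = NeighborTable(wpt=0.002)
  table.record_send(seq=7, now=0.000)
  print(table.overhear(seq=7, neighbor="B", now=0.005))   # late, but still below the threshold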
3.2.2 Wormhole Prevention Timer

We detect wormholes by using a special timer. To use this timer, the nodes do not require clock synchronization (except the source node). As soon as a node sends a RREQ packet, it must set the WPT and wait, after sending the RREQ packet, until it overhears its neighbor's re-transmission. The WPT considers the maximum amount of time required for a packet to travel from a node to a neighbor node and back. If the WPT is too small, legitimate nodes can be excluded; on the other hand, if it is too large, it is difficult to detect wormhole attacks.

Two formulas are considered, depending on whether or not the nodes have mobility. If the nodes are fixed, like sensor nodes, the WPT is estimated by

  WPT = 2 × Transmission Range (TR) / Vp    (1)

Here, TR denotes the distance that a packet can travel and Vp denotes the propagation speed of a packet. It is assumed that the maximum propagation speed of the radio signal is the speed of light and that the delay for sending and receiving packets is negligible.

On the other hand, if the nodes have mobility with an average velocity of Vn, the distance that a packet can travel may be different. The maximum transmission distance of a packet is calculated by

  Radius = Vn × (2 × TR / Vp) = 2 × Vn × TR / Vp    (2)

Consequently, when the network is formed in a mobile environment, the WPT of the nodes is given by

  WPT = 2 × Vn × TR / (Vp)²    (3)

By using formulas (1) and (3), when a node overhears its neighbor node's re-transmission, it checks whether the packet arrived before the WPT expired. If a hidden wormhole attack is launched, the packet transmission time between two fake neighbor nodes may be longer than the normal transmission time of one hop. Therefore, we can detect a route through a wormhole tunnel.

3.3. Wormhole Route Detection

We detect exposed wormhole nodes when a source node selects one route among all the routes collected from the RREP packets within the RREP waiting timer. In the DSR protocol, the route selection without any wormhole attack is simple: the source node selects the smallest hop count route among all the received routes.

Unfortunately, the smallest hop count route may contain wormhole nodes. Hidden attacks can be detected by neighbor node monitoring. However, if wormhole nodes are exposed and act like legitimate nodes, it is difficult to detect a wormhole route by using only the neighbor node monitoring mechanism.

Therefore, nodes must check a RREP packet on receiving it from neighbor nodes. When a wormhole node sends a RREP to indicate that a colluding node is its neighbor, the normal neighbor nodes of the wormhole node examine whether they have a corresponding RREQ packet previously received from that node in their table. For example, in Figure 2, suppose a source node S broadcasts a RREQ at time Ta and then receives a RREP at time Tb; the source node can calculate the time delay per hop in the route by using the hop count field in the RREP. The formula is given by

  Delay per hop = (Tb − Ta) / Hop count ≤ WPT    (4)

As specified above, the maximum amount of time required for a packet to travel a one-hop distance is WPT / 2. Therefore, the delay per hop value must not exceed the estimated WPT.

Figure 2. Time Delay of Route Discovery.

In a normal situation such as Figure 2(b), a smaller hop count provides a smaller time delay. This can be explained by the fact that a shorter route should have a smaller round trip time. Hence, the delay per hop values of normal routes should be similar.
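Putting formulas (1), (3) and (4) together, the per-route check can be sketched as follows; this is an illustration only, and the function names and the numbers in the usage example are assumptions rather than values from the paper.

  SPEED_OF_LIGHT = 3.0e8   # m/s, assumed propagation speed Vp

  def wpt_static(tr, vp=SPEED_OF_LIGHT):
      """Eq. (1): WPT for fixed nodes, 2 * TR / Vp."""
      return 2.0 * tr / vp

  def wpt_mobile(tr, vn, vp=SPEED_OF_LIGHT):
      """Eq. (3): WPT when nodes move with average velocity Vn."""
      return 2.0 * vn * tr / (vp ** 2)

  def route_suspicious(t_rreq_sent, t_rrep_received, hop_count, wpt):
      """Eq. (4): flag the route if its delay per hop exceeds the WPT."""
      delay_per_hop = (t_rrep_received - t_rreq_sent) / hop_count
      return delay_per_hop > wpt

  wpt = wpt_static(tr=250.0)                                   # 250 m transmission range
  print(route_suspicious(0.0, 2.0e-5, hop_count=4, wpt=wpt))   # True: likely a wormhole route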
3.3.1 Wormhole Node List

When a node detects exposed wormhole nodes during route discovery, it must keep a wormhole node list, which is indexed by the wormhole node and the colluding node. For example, if a node overhearing a RREP discovers that the previous node is a wormhole node, it places the previous node and the next node from the RREP packet in the node's blacklist. A node must broadcast the information about the wormhole nodes in its blacklist. Each time nodes receive such messages, they update their wormhole node list and record the information. After the wormhole nodes are specified in the list, any packet from the nodes in the wormhole list is discarded.

4. Simulation

4.1. Simulation Environment

We have implemented the wormhole attack and our proposed algorithms in QualNet [17]. For our simulations, we use a CBR (Constant Bit Rate) application, UDP/IP, IEEE 802.11b MAC, and a physical channel based on a statistical propagation model. The simulated network consists of 50 randomly allocated wireless nodes in a 1500 by 1500 square meter flat space. The node transmission range is 250 meters. The random waypoint model [18] is used for scenarios with node mobility. The minimum speed for the simulations is 0 m/s while the maximum speed is 10 m/s. The selected pause time is 30 seconds. A traffic generator was developed to simulate constant bit rate sources.

The size of the data payload is 512 bytes. Five data sessions with randomly selected sources and destinations are simulated. Each source transmits data packets at the rate of 4 packets/s. The duration of the simulations is 900 seconds.

4.2. Simulation Results

The network throughput is measured for the basic DSR routing protocol and for DSR with the WAP method. The speed of the nodes is varied to compare the results. Figure 3 shows the network throughput of both techniques for different node speeds. Even if there are no wormhole nodes, the network throughput diminishes for both DSR and the WAP method as the node speed increases, because the network generally becomes more fragile at higher node speeds. However, the network throughput of the basic DSR protocol decreases dramatically when there are wormhole nodes in the network. For example, the throughput value is 74.7% when the basic DSR is used and the nodes move at a speed of 10 m/s, whereas the throughput value is 88.9% when the WAP is used under a wormhole attack. This proves that the network throughput of the WAP algorithm exceeds that of the basic DSR protocol.

Figure 3. Effect of Wormhole Attack on Network Throughput.

We also examine the capability of wormhole detection and isolation with the WAP method. Generally, in the basic DSR protocol, each node does not check a RREQ packet overheard from its neighbor nodes; therefore, the fraction of packets sent through the wormhole tunnel is high. In contrast, each node that uses the neighbor node table and the wormhole node list takes the information about the subsequent node into account before forwarding a packet. Therefore, packets sent through a wormhole tunnel are mostly dropped, preventing them from arriving at the destination. Figure 4 shows the fraction of packets sent over wormhole routes in the basic DSR and in the modified DSR with the WAP algorithm at varying speeds.

Figure 4. Fraction of Packets Sent Through Wormhole.

5. Conclusion

With developments in computing environments, services based on ad hoc networks have increased. However, wireless ad hoc networks are vulnerable to various attacks due to the physical characteristics of both the environment and the nodes. A wormhole attack is such an attack; it is executed by two malicious nodes and causes serious damage to networks and nodes.

The detection of wormholes in ad hoc networks is still considered to be a challenging task. In order to protect networks from wormholes, previous solutions require specialized hardware. Thus, in this paper, we propose an algorithm to detect wormholes without any special hardware. We achieve this through the use of the neighbor node monitoring method at each node and the wormhole route detection method at the source node on the selected route. Our mechanism is implemented based on the DSR protocol and is proven to be capable through simulation results. In future studies, we plan to study false-positive problems with regard to the detection of wormholes and a mechanism to solve such problems. Moreover, we plan to apply the WAP algorithm to other on-demand routing protocols.

Acknowledgement

This research was supported by the Ministry of Knowledge Economy, Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement) (IITA-2008-C1090-0801-0016).

References
[1] L. Buttyán and J.-P. Hubaux. Report on a working session on security in wireless ad hoc networks. ACM SIGMOBILE Mobile Computing and Communications Review, 7(1):74-94, Jan 2003.
[2] H. Yang, H. Luo, F. Ye, S. Lu, and L. Zhang. Security in mobile ad hoc networks: challenges and solutions. IEEE Wireless Communications, 11(1):38-47, Feb 2004.
[3] L. Hu and D. Evans. Using directional antennas to prevent wormhole attacks. In Network and Distributed System Security Symposium (NDSS). The Internet Society, Feb 2004.
[4] Y.-C. Hu, A. Perrig, and D. B. Johnson. Packet leashes: A defense against wormhole attacks in wireless networks. IEEE INFOCOM, Mar 2003.
[5] J. Zhen and S. Srinivas. Preventing replay attacks for secure routing in ad hoc networks. In ADHOC-NOW, LNCS 2865, pages 140-150, 2003.
[6] Y.-C. Hu, A. Perrig, and D. B. Johnson. Rushing attacks and defense in wireless ad hoc network routing protocols. In W. D. Maughan and A. Perrig, editors, ACM Workshop on Wireless Security (WiSe), pages 30-40, Sep 2003.
[7] S. Capkun, L. Buttyán, and J.-P. Hubaux. SECTOR: secure tracking of node encounters in multi-hop wireless networks. In ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN), pages 21-32, Oct 2003.
[8] I. Khalil, S. Bagchi, and N. B. Shroff. LITEWORP: A lightweight countermeasure for the wormhole attack in multihop wireless networks. In Dependable Systems and Networks (DSN), pages 612-621, Jun 2005.
[9] I. Khalil, S. Bagchi, and N. B. Shroff. MOBIWORP: Mitigation of the wormhole attack in mobile multihop wireless networks. Securecomm and Workshops 2006, pages 1-12, Aug 2006.
[10] L. Tamilselvan and D. V. Sankaranarayanan. Prevention of impersonation attack in wireless mobile ad hoc networks. International Journal of Computer Science and Network Security (IJCSNS), 7(3):118-123, Mar 2007.
[11] C. E. Perkins, E. M. Belding-Royer, and S. R. Das. Ad hoc on-demand distance vector (AODV) routing. RFC 3561, The Internet Engineering Task Force, Network Working Group, Jul 2003. http://www.ietf.org/rfc/rfc3561.txt.
[12] D. B. Johnson and D. A. Maltz. Dynamic source routing in ad hoc wireless networks. In Imielinski and Korth, editors, Mobile Computing, volume 353, pages 153-181. Kluwer Academic Publishers, 1996.
[13] D. A. Maltz, D. B. Johnson, and Y. Hu. The dynamic source routing protocol (DSR) for mobile ad hoc networks for IPv4. RFC 4728, The Internet Engineering Task Force, Network Working Group, Feb 2007. http://www.ietf.org/rfc/rfc4728.txt.
[14] R. V. Boppana and S. P. Konduru. An adaptive distance vector routing algorithm for mobile, ad hoc networks. In IEEE Computer and Communications Societies (INFOCOM 2001), pages 1753-1762, 2001.
[15] P. Papadimitratos and Z. J. Haas. Secure routing for mobile ad hoc networks. In Proceedings of SCS Communication Networks and Distributed Systems Modeling and Simulation Conference (CNDS 2002), Jan 2002.
[16] Y.-C. Hu, D. B. Johnson, and A. Perrig. SEAD: Secure efficient distance vector routing for mobile wireless ad hoc networks. In IEEE Workshop on Mobile Computing Systems and Applications (WMCSA), pages 3-13. IEEE Computer Society, Dec 2002.
[17] Scalable Network Technologies (SNT). QualNet. http://www.qualnet.com/.
[18] J. Broch, D. A. Maltz, D. B. Johnson, Y.-C. Hu, and J. G. Jetcheva. A performance comparison of multi-hop wireless ad hoc network routing protocols. In ACM/IEEE International Conference on Mobile Computing and Networking (MOBICOM), pages 85-97, Oct 1998.

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Performance of a Hierarchical Cluster-Based Wireless Sensor Network

Yung-Fa Huang1 , * , Neng-Chung Wang2 and Ming-Che Chen1


1 Graduate Institute of Networking and Communication Engineering
Chaoyang University of Technology
No. 168 Jifong E. Rd., Wufong, Taichung County 41349, Taiwan
* Email: yfahuang@mail.cyut.edu.tw
2 Department of Computer Science and Information Engineering
National United University, Taiwan
Email: ncwang@nuu.edu.tw

Abstract

This paper proposes a clustering scheme to improve energy efficiency for cluster-based wireless sensor networks (WSNs). In order to reduce the energy dissipation of transmitting sensing data at each sensor, the proposed fixed algorithm uniformly divides the sensing area into clusters in which the cluster head is deployed at the center of the cluster area. Moreover, to improve the energy efficiency within the clusters obtained by the fixed clustering, the cluster head is elected by the LEACH scheme. Simulation results show that the proposed low-energy fixed clustering (LEFC) definitely reduces the energy consumption of the sensors and outperforms LEACH with more than 60% longer network lifetime.

1. Introduction

Recently, the rapidly developed technologies of micro-electro-mechanical systems and telecommunication batteries have given small sensors the capabilities of wireless communication and data processing [1]. These small sensors can be used for surveillance and control in a given environment. In particular, a wireless sensor network (WSN) may be located in a region that people cannot easily reach and where it is difficult to recharge the device energy. Therefore, the energy efficiency of sensor networks is an important research topic, and the lifetime of a WSN can be considered its most significant performance measure [2]. Moreover, there are two main issues in prolonging the lifetime. One is to minimize the energy dissipation of all energy-constrained nodes. The other is to balance the energy dissipation of all nodes [2].

The energy in a WSN is mainly consumed by direct data transmission [2]. Firstly, each sensor collects data and delivers the data directly to the base station, called the "sink". In this mode, a sensor will quickly exhaust its energy if it is far from the base station. Thus, this kind of transmission scheme is not suitable for a large area [2]. Secondly, to enable communication between sensors not within each other's communication range, a common multi-hop routing protocol is applied in ad hoc wireless sensor communication networks [3]-[5]. In this scheme, several multi-hop paths exist to provide the network connectivity. Each path in the configuration has one link head that collects data from the sensors.

Every sensor node in the WSN sends both its own sensing data and the data received from previous nodes to its closer node. Then, the destination node delivers the data collected along the path to the base station. The nodes closer to the base station need more energy to send data because the scheme uses hierarchical transmission. However, due to the high complexity of the routing protocols and the likely heavy load on the relaying nodes, this scheme is not suitable for highly dense WSNs.

The third scheme is the cluster-based one, in which close sensors belong to their own clusters. One of the sensors in each cluster, called the "cluster head (CH)", is responsible for delivering data back to the base station. In this scheme, the CH performs data compression and sends the data back to the base station. Thus, the lifetime of a CH may be shorter than that of the other sensors [6]-[7]. Therefore, for WSNs with a large number of energy-constrained sensors, it is very important to design an algorithm that organizes the sensors into clusters so as to minimize the energy used to communicate information from all nodes to the base station.

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 349


DOI 10.1109/SUTC.2008.37
Moreover, energy efficiency and lifetime prolonging are the most important topics in WSNs. As we know, both energy efficiency and energy balancing are the two main issues for prolonging the lifetime of WSNs. The structure of low-energy adaptive clustering hierarchy (LEACH) proposed in [8]-[10] is one of the initial algorithms to balance the energy for cluster-based WSNs. In LEACH, the CHs are elected by uniform random numbers so that the nodes uniformly take turns being the CH; the energy consumption of all nodes is then nearly balanced in a round [8]. However, due to the random clustering, a cluster area may be too large, so that the CH consumes a large amount of energy and the energy dissipation becomes inefficient. Therefore, in this paper a fixed-area clustering method is proposed to improve the energy efficiency.

In previous work, the proposed fixed clustering algorithms (FCA) can give clusters of uniform area to improve the energy efficiency of the sensing nodes in the clusters. Therefore, we combine LEACH and the FCA to propose the low-energy fixed clustering (LEFC) scheme to improve energy efficiency and thus prolong the network lifetime.

In the following section, the network models of the WSN are described. Then, the proposed LEFC scheme is described. In Section 4, simulation results show a comparison of the energy efficiency of the clustering algorithms. Finally, some conclusions are given in the final section.

2. Network Models

In practice, the geometry of a WSN is non-regular. However, the square is a basic area from which non-regular areas can be composed. Thus, for simplification, in this paper we adopt a square area with side length D. The network parameters are described in Table 1. The sensor area with uniformly distributed CHs is shown in Figure 1. In Figure 1, the symbol "•" represents the location of a CH, whereas the symbol "o" represents the location of a sensing node. When the cluster areas are randomly distributed, the energy efficiency of the sensor nodes in data transmission is poor [3]. Therefore, the FCA in previous work [12] is proposed to divide the sensor area into clusters and to deploy the CHs uniformly over the network area. Based on the configuration of the square area, the sensors are supposed to be spread out uniformly over the whole area. The data from each cluster will be collected by the CH, and these data will be sent back to the base station located at the point (0, -B).

Table 1 The network parameter descriptions for cluster-based WSNs

  Parameters   Descriptions
  q            Number of clusters
  D            Length of the sensing square area
  Q            Number of sensor nodes
  α            Radio path loss exponent
  B            The nearest distance between the base station and the sensing area
  ηi           Data fusion factor

A. LEACH

In cluster-based WSNs, how to select the CHs and how to cluster the sensing area are the main procedures. A data collection procedure is called a round. After the setup state, LEACH finishes the two steps of CH election and clustering of the area, as shown in Figure 1. In the CH election procedure for the rth round, the nodes whose random number, U[0,1], is lower than the threshold

  Tn(r) = p / (1 − p × (r mod (1/p))),  n ∈ G
  Tn(r) = 0,                            n ∉ G        (1)

will be elected as CHs, where p is the expected probability for the CH election, p = q/Q, and G is the set of nodes that have not yet been elected CH in the recent (r mod (1/p)) rounds, n = 1, 2, ..., Q, and Q is the total number of nodes in the WSN. After the CHs are selected, the clustering is performed by broadcasting an advertisement message in which the CH ID is included. Then, the nodes communicate with the nearest CH by the CSMA/CA protocol and send their sensing data to the CH. Thus, the clustering procedure is finished. An example for five clusters is shown in Figure 1.

After the clustering procedure, the network is in the steady state, in which all nodes are in the sleep state except the communicating nodes. Then, after the data aggregation at the CHs, the CHs send the aggregated data to the base station [11]. Then a round is completed.
area. The data from each cluster will be collected by

Figure 1 An example of topology in LEACH with q=5 [8].

B. Low Energy Fixed Clustering (LEFC)

For simplification, the sensing area is set to a square with side length equal to 50 m, as shown in Figure 2. To make the sensing nodes energy efficient, a fixed clustering method is proposed to normalize the clustering regions. A CH is elected in each clustering area. In order to divide the area into clusters of uniform size, we propose the FCA described as follows.

Class A. The number of clusters equals p×p, that is, the numbers of clusters in a row and in a column are the same. For example, Figure 2(a) shows the clustered sensor region when the number of clusters is equal to 9. In Figure 2(a), the maximum sending distance from a sensor node is 25√2/⌊p⌋, where ⌊ ⌋ is the floor function and p = √q is the square decision factor.

Class B. When the numbers of clusters are 1×2, 2×3, 3×4, 4×5, ..., M×(M+1), M∈N, the clustering area is divided by Class B. Figure 2(b) shows an example of the sensor region of the fixed cluster network when the number of clusters is equal to 12.

Class C. When the number of clusters does not fit Class A or B, the clustering algorithm is classified as Class C [12]. Figure 2(c) shows an example of Class C of the fixed cluster network when the number of clusters is equal to 11.

Figure 2. The energy efficient fixed clustering examples for three categories: (a) q=9, (b) q=12, and (c) q=11.

After the clustering, the number of sensor nodes in each cluster area is almost the same, as in the example with q=4 shown in Figure 3. In Figure 3, the different symbols denote the nodes within different clusters. The nodes in each cluster then take turns as the CH randomly, as in LEACH. After Q/q rounds, each node has been the CH once, balancing the energy dissipation and further prolonging the network lifetime.
and to further prolong the network lifetime.

In wireless communication, the channel model is given by

  Pr = c × Pt / d^α,    (2)

where Pr and Pt are the received power at the receiver and the transmitted power at the transmitter, respectively, c is the propagation coefficient, and α is the path loss exponent, 2 ≤ α < 6. For a free-space area, the path loss exponent is set to α = 2. The locations of the nodes are assumed to be known to the base station by GPS.

At the MAC layer, the sensing nodes are assumed to know their CH through centralized base station broadcasting. Based on the configuration of the square area, Figure 3 shows the environment investigated in this paper. In Figure 3, the total of Q sensors is supposed to be spread out uniformly over the whole area, which is divided into q clusters. The data from each cluster will be collected by the CH, and these data will be sent back to the base station located at the point (0, -B).

To evaluate the lifetime of the network, one round is defined as a cycle in which the base station receives data from the sensor nodes. One round contains the time from the data being collected at a sensor to the corresponding CH and the time from the CH to the base station.

Thus, the total energy of the network in one round can be expressed by

  ET = Σ_{i=1}^{q} ηi · Ech,i · (Q/q) + Σ_{j=1}^{Q−q} En,j,    (3)

where ηi is a data compressing factor for the ith cluster with 0 < ηi ≤ 1, and Ech,i and En,j are the transmission energies of one packet for the ith CH and the jth normal sensor, respectively. Moreover, the dissipated energy of the nodes depends on the path loss.

Figure 3. An example of EEFCH with q=4.

3. Energy Efficiency Analysis

In this paper, we assume that the sensor nodes are uniformly distributed in the area of the cluster. Therefore, the power dissipation of a CH to relay the information of the cluster in one round can be obtained by

  Ech,i = ηi · el · Wi · (Q/q),    (4)

where el is the energy dissipation for sending one packet per square meter, and the energy dissipation due to the path loss over the distance between the ith CH and the base station is expressed by

  Wi = E[di^α / c] = E[di²] = E[xi² + (yi + B)²],    (5)

where α = 2, c = 1, and (xi, yi) are the coordinates of the center of the ith cluster area. Moreover, the energy dissipation for a sensor node to transmit one packet in a clustering area can be obtained by

  En,j = el · Zj,    (6)

where Zj = dj² is the random variable of the square of the distance between the jth normal sensor node and the CH. Thus, the expected power dissipation for a sensor node to transmit one packet to the CH in a rectangular clustering area can be obtained by [3]

  E[Z] = ∫_0^{L1·L2/2} z ( 2/L1 + 2/L2 − 2z/(L1·L2) ) dz,    (7)

where L1 and L2 are the width and length of the rectangular cluster area.

In LEACH, the CH is selected randomly, as shown in Figure 1. Therefore, the energy dissipation of each cluster in transmitting one packet is expressed by

  E[Z] = E[x² + (y + B)²] = 5D²/12 + B·D + B²,    (8)

where D is the length of the square and B is the distance between the sensing field and the base station. Therefore, according to the number of clusters, we can choose the suitable algorithm to cluster the area into equal clusters.
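Equation (8) can be sanity-checked by Monte Carlo sampling. The sketch below assumes the sensing square spans x ∈ [−D/2, D/2] and y ∈ [0, D] with the base station at (0, −B); this is one parameterization consistent with the 5D²/12 + BD + B² form, since the paper does not state the coordinate ranges explicitly.

  import random

  def mean_sq_distance_to_bs(d=50.0, b=10.0, samples=200_000, rng=random.Random(42)):
      """Monte Carlo estimate of E[x^2 + (y + B)^2] for a uniformly placed node."""
      acc = 0.0
      for _ in range(samples):
          x = rng.uniform(-d / 2.0, d / 2.0)
          y = rng.uniform(0.0, d)
          acc += x * x + (y + b) * (y + b)
      return acc / samples

  d, b = 50.0, 10.0
  closed_form = 5.0 * d * d / 12.0 + b * d + b * b   # Eq. (8)
  print(round(mean_sq_distance_to_bs(d, b), 1), closed_form)   # the two values agree closely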

4. Simulation Results

In order to verify and compare the energy efficiency of the proposed FCA, a simulation study is presented. In the simulation, we assume that the energy dissipation for sending one packet by each sensor is el = 5 × 10⁻⁷ Joule (J)/m². In our simulation, the total number of sensor nodes is one hundred, Q = 100; the number of normal sensor nodes is therefore 100 − q. The length of the sensing square area is set to D = 50 meters. To be general, the worst case in data fusion, with data compressing factors ηi = 1 for all clusters, is used in the simulations. To perform fixed clustering, in the node deployments we assume that the sensor nodes are scattered uniformly over the sensing field.

To verify the energy efficiency of the proposed LEFC, we performed computer simulations in MATLAB. In our simulation, el = 5×10⁻⁷ (J/m²) for α = 2. The initial energy in the battery of each node is 5 J. At first, the simulation parameters are set to q = 5, D = 50 m, α = 2, and B = 10 m. In order to illustrate the energy efficiency of the clustering method, we compare a direct transmission scheme with the proposed clustering scheme. In the direct scheme, denoted by Direct, each node transmits the data obtained by sensing directly to the base station, assuming that the amplifier of each node can broadcast with sufficient power to reach the base station.

From Figure 4, the simulation results show that the proposed LEFC evidently prolongs the network lifetime. In the direct scheme, the nodes far away from the base station suffer an extremely short lifetime. Even though the nearby nodes can last a long time, the network suffers coverage problems. The LEACH scheme balances the energy consumption of the CH role, but not the energy efficiency of the normal sensing nodes. The proposed LEFC not only achieves the balancing of energy consumption in performing the CH role, but also improves the energy efficiency of the sensing nodes. Therefore, it is easily observed that the proposed LEFC outperforms the LEACH and Direct schemes.

Figure 4. Comparison of network lifetime for the proposed LEFC, Direct and LEACH.

To depict the advantage of the proposed LEFC, the network lifetimes of the routing schemes with living rates (LR) of 70% and 50% are compared in Table 2. The LR is the percentage of surviving nodes in the WSN. From Table 2, it is obvious that the proposed LEFC largely outperforms both LEACH and Direct for both the 70% and 50% living rates. Moreover, to compare the real energy efficiency, Table 3 shows a comparison of the total transmitted packets of the three schemes for the WSN. From Table 3, it is observed that the proposed LEFC can transmit 65% and 27% more packets than the LEACH and Direct schemes, respectively.

Table 2. The comparison of lifetime (LT) with LR=70% and 50% for Direct, LEACH and LEFC.

            LR
  LT        70%     50%
  Direct    2226    3471
  LEACH     3508    4245
  LEFC      5771    7537

Table 3. The comparison of total transmission packets for Direct, LEACH and LEFC.

            Total Transmission (packets)
  Direct    581617
  LEACH     450229
  LEFC      742919

To find the most energy-efficient number of CHs and to compare the energy efficiency of the LEACH and LEFC schemes, we simulate the network energy consumption in one round, as shown in Figure 5. From Figure 5, it is observed that the highest energy efficiency is obtained with a number of clusters 6 ≤ q ≤ 10 for both LEACH and LEFC.

Moreover, at the highest energy efficiency of q=8, the energy consumed by LEFC in one round is less than half of that of LEACH.

Figure 5. The comparison of network energy consumption in a round between LEFC and LEACH.

5. Conclusion

In this paper, an energy-efficient clustering algorithm is proposed to prolong the lifetime of cluster-based WSNs. The proposed LEFC gives cluster areas of uniform size for the WSN and saves the energy dissipation of the normal sensor nodes in the clusters. Simulation results show that the LEFC can efficiently cluster the sensing nodes to minimize the energy dissipation and thus outperforms LEACH with a 60% longer lifetime for the WSN.

6. Acknowledgements

This work was funded in part by the National Science Council, Taiwan, Republic of China, under Grant NSC 94-2213-E-324-029 for Y.-F. Huang.

7. References
[1] D. Culler, D. Estrin, M. Srivastava, "Guest editors' introduction: overview of sensor networks," IEEE Computer, vol. 37, issue 8, pp. 41-49, Aug. 2004.
[2] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, "Wireless sensor network: a survey," Computer Networks, vol. 38, pp. 393-422, 2002.
[3] E. J. Duarte-Melo and M. Liu, "Analysis of energy consumption and lifetime of heterogeneous wireless sensor networks," Proceedings of the Global Telecommunication Conference, pp. 21-25, Nov. 2002.
[4] M. Chatterjee, Sajal K. Das, and D. Turgut, "A weighted clustering algorithm for mobile ad hoc networks," Proceedings of the 7th International Conference on High Performance Computing, pp. 511-521, Dec. 2000.
[5] R. C. Shah and J. M. Rabaey, "Energy aware routing for low energy ad hoc sensor networks," Proceedings of the IEEE Wireless Communications and Networking Conference, pp. 17-21, 2002.
[6] C. Schurgers and M.B. Srivastava, "Energy efficient routing in wireless sensor networks," Proceedings of the IEEE Military Communications Conference for Network-Centric Operations, vol. 1, pp. 357-361, 28-31 Oct. 2001.
[7] B. Huang, F. Hao, H. Zhu, Y. Tanabe, and T. Baba, "Low-energy static clustering scheme for wireless sensor network," Proceedings of the International Conference on Wireless Communications, pp. 1-4, 22-24 Sept. 2006.
[8] W. Heinzelman, "Application-specific protocol architectures for wireless networks," Ph.D. thesis, Massachusetts Institute of Technology, 2000.
[9] S. D. Muruganathan, D. C. F. Ma, R. I. Bhasin, and A. O. Fapojuwo, "A centralized energy-efficient routing protocol for wireless sensor networks," IEEE Communication Magazine, vol. 43, no. 3, pp. 8-13, March 2005.
[10] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan, "Energy-efficient communication protocols for wireless microsensor networks (LEACH)," Proceedings of the 33rd Hawaii International Conference on Systems Science, vol. 8, pp. 3005-3014, Jan. 4-7, 2000.
[11] K. Sohrabi, J. Gao, V. Ailawadhi, G.J. Pottie, "Protocols for self-organization of a wireless sensor network," IEEE Personal Communications, vol. 7, pp. 16-27, Oct. 2000.
[12] Y.-F. Huang, W.-H. Luo, J. Sum, L.-H. Chang, C.-W. Chang and R.-C. Chen, "Lifetime performance of an energy efficient clustering algorithm for cluster-based wireless sensor networks," LNCS 4743, Springer-Verlag Berlin Heidelberg, pp. 455-464, Aug. 2007.

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

BIOSEMANTIC SYSTEM: APPLICATIONS OF STRUCTURED NATURAL LANGUAGE


TO BIOLOGICAL AND BIOCHEMICAL RESEARCH

David Hecht2,5, Rouh-Mei Hu2, Rong-Ming Chen3, Jong-Waye Ou2, Chao-Yen Hsu2, Haitao Gong1, Ka-Lok Ng2, Han C.W. Hsiao2, Jeffrey J.P. Tsai2,4, and Phillip C-Y Sheu1,2

communicate with the system in natural language and a


ABSTRACT workflow could be automatically generated and distributed
into appropriate tools.
Recent advances and new technologies in biological and
medical research have resulted in a rapid accumulation of Biologists and medical researchers should be allowed to
enormous amounts and types of data and data analysis tools. concentrate on their research and not the job of interfacing
Although most of these databases and tools are available disparate systems and data sets. Usability is critical to the
through the internet and are easily accessible for users, they future of bioinformatics tools. Increased usability has been
are highly heterogeneous making it difficult to integrate into linked to decreased training costs and time, as well as to
efficient workflows. In this paper, we present a natural- improving human performance and productivity, ensuring
language, object-based computing system, BioSemantic better quality of work, and minimizing the risk of user error
System (BSS), for seamlessly integrating these diverse [1].
bioinformatics databases and tools into efficient workflows
that will increase the productivity of end-user researchers.
2 RELATED WORK
Below, we present vocabulary, including nouns, verbs and Natural language interface to database (NLIDB) [2] has
adjectives as well as several examples of applications to been a popular field of study in the past. Many commercial
biological and biomedical research problems. and educational NLIDBs became available in the 80’s [3]
[4]. These applications incorporated new concepts in
computational linguistics together with some AI capabilities.
1. INTRODUCTION Some of the recent NLIDB work includes Hermes [5], Edite
Current bioinformatics databases and tools are highly [6], and PRECISE [7]. Most of these provide an
heterogeneous in the following aspects: 1). the input and intermediate language between the natural language input
output formats of different tools are generally restricted to and a formal query language (i.e., SQL). This intermediate
fixed formats which are different; 2). databases were language expresses the meaning of the query in terms of
constructed on different systems or platforms in different high-level concepts that are independent of database
formats (schema); and 3). the terminologies, such as gene structures [6].
name, gene ID or accession number, are heterogeneous. It is
Despite these developments, graphical and form-based
desired that different databases and analysis tools be
interfaces have continued to dominate, partly because most
normalized, integrated and encompassed with a semantic
existing NLIDB systems can only express what SQL can
interface so that users of biological data and tools could
which is significantly restricted. In addition, SQL can
___________________________ become quite awkward when aggregations are involved, and
usually its grammar is not at all close to natural language.
1. Department of EECS, University of California, Irvine, CA
This hinders correct translation from an intermediate
92697
language to SQL.
2. Department of Bioinformatics, Asia University, Wufeng,
Taichung 41354, Taiwan It has been well recognized that relational databases are
3. Department of Computer Science and Information Engineering, restricted in supporting complex applications such as
National University of Tainan, Taiwan biology and medicine. An object-oriented database has the
4. Department of Computer Science, University of Illinois, advantage that an investigator can specify conditions for
Chicago, IL 60607 objects to be retrieved at any level without mapping such
5. Department of Chemistry, Southwestern College, Chula Vista, conditions to tables. For example, an investigator can
CA, 91910 email: dhecht@swccd.edu simply post a query like “How many AT8 positive tangles

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 386


DOI 10.1109/SUTC.2008.34
For example, an investigator can simply post a query like "How many AT8 positive tangles in sample #172 are there in entorhinal cortex?"; and "How many of these appear to have DNA damage?" To our knowledge, NLIDB technologies for object-oriented databases are yet to be developed.

3 BIOSEMANTIC SYSTEM ARCHITECTURE

BioSemantic System (BSS), based on SemanticObjects™ (SO), aims to provide an integrated framework for biomedical knowledge retrieval, management, capture, sharing, discovery, delivery and presentation. SemanticObjects™ is a development environment that builds an object-relational layer on top of relational data sources and helps designers build a global schema to capture the semantics of compound objects. A solution developed in SO is extensible and user programmable based on Structured Natural Language (SNL).

We envision the BioSemantic System being used as follows. Users will define the problem by composing an SNL query. The SNL query will be parsed into one of two types of query depending on the Semantic Object type (Figure 1).

Figure 1. BioSemantic System (BSS) Architecture.

If the SO has to be retrieved over the internet from web services such as the National Center for Biotechnology Information (NCBI), the SO will be composed to create the web services query by the web services client, which communicates through an interface with the web services, e.g., the Entrez database system at NCBI. The BSS translates the query into a standard set of input parameters for various software components to search for and retrieve the requested data from the web services. In the case of NCBI this can be quite complex, as there are at least twenty-three databases containing a wide variety of biomedical data. These include: nucleotide and protein sequences; gene records; three-dimensional molecular structures; microarray data and images; biological activity and assay data; as well as biomedical literature.

If the information is stored in a local resource, the semantic query will be generated by the BSS through the Semantic Object layer. The SNL query will be parsed into a set of SO queries to compose the semantic object.

After all necessary information has been collected, the workflow will start to process the semantic object by filtering, joining and binding to produce the knowledge data.

4 BIOSEMANTIC SYSTEM DATA STRUCTURE AND CLASSES

SemanticObjects™ (SO) is a database development environment that supports complex applications on top of relational databases (Figure 2). Objects are defined in a global schema and wrapped by Java classes. Data are stored in different data sources and manipulated by SemanticObjects™ transparently, without depending on other data sources. The global schema is mapped to local data sources by the Mapper module of SO. Using SO's Objects Designer, the user can declare object classes and define their operations and behaviors by adding verb and adjective methods. The actual objects' data are stored in the underlying relational databases. SemanticObjects also has a structured natural language parser, which allows the user to compose queries in structured natural language, a subset of natural language, using WebTools.

Figure 2. SemanticObjects™ Architecture. Users and Designers interface with the WebTools and Object Designer modules, respectively, which feed into the Mapper module. Queries are then sent out to the diverse underlying databases.
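To make the Objects Designer idea above concrete — object classes whose "verb" methods perform operations and whose "adjective" methods act as query refinements, backed by relational data — here is a minimal, hedged sketch in Python. The class, field names and methods are illustrative stand-ins (they roughly follow the gene attributes listed in Section 4.1 below), not the actual SemanticObjects™ API.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Gene:
    """Illustrative semantic object; fields mirror the Gene Class attributes described in Section 4.1."""
    gene_id: str
    source_db: str
    accession: str
    species: str
    sequence: str
    proteins: List[str] = field(default_factory=list)

    # "Verb" method: an operation the object supports (hypothetical stub).
    def find_homologous(self, database: str) -> List[str]:
        # In a real deployment this would be routed to a BLAST-style tool; here it only illustrates the shape.
        return []

    # "Adjective" method: a predicate used to refine queries (hypothetical).
    def is_from_species(self, species: str) -> bool:
        return self.species.lower() == species.lower()
```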
Three major classes currently exist in the BSS. These classes are meant to be representative and are by no means comprehensive.

4.1 GENE CLASS

The Gene Class is perhaps the central cornerstone of the BioSemantic vocabulary. This class (with its associated subclasses) stores all the general information of a gene, such as id, source database and accession number, species, sequence, as well as the proteins it codes for.

4.2 PROTEIN CLASS

The Protein Class is also a central part of the BioSemantic vocabulary. This class (with its associated subclasses) stores all the general information of a protein, such as id, source database and accession number, species, sequence, as well as the associated genes.

4.3 LITERATURE CLASS

The Literature Class (with its associated subclasses) stores all the relevant information for reference material, including: author(s), titles, journal, dates, and pages.

5 BIOSEMANTIC SYSTEM QUERY COMMANDS

Once a framework for the data and class structures has been defined, it is possible to put together semantic statements and apply them to real-life research problems. Operators or verbs need to be defined. For purposes of the examples presented in this paper, these will be limited to existing bioinformatics software and tools. It is anticipated that new tools and operators will be developed as a natural expansion of the BioSemantic System.

Presented below are several examples chosen to demonstrate the wide applicability of the BioSemantic System.

5.1 LITERATURE SEARCH:

Common to every biological and biomedical research problem is the need to find the relevant literature and prior art. As is too often the case, these searches result in too many hits that are not exactly what is required.

Several typical BioSemantic queries are presented below:

FIND RELEVANT LITERATURE FOR [INPUT]

FIND EXISTING PATENTS FOR [INPUT] \\ IDENTIFY IF THE RESEARCH TOPIC IS NOVEL INTELLECTUAL PROPERTY OR NOT

FIND ALL RELEVANT LITERATURE FOR [INPUT] \\ INPUT IS A STRING

The BioSemantic objects are "literature" and "patents", which fall in the Literature object class. "Relevant" and "existing" are adjectives and are used to refine the query. Other adjectives can readily be defined as needed.

"Find" is the BioSemantic operator in these statements. For the literature searches this is PubMed at NCBI [8]. For the patent searches it is the United States Patent and Trademark Office [9].

5.2 BLAST PROBLEM:

One of the most common tasks in biological research today is that of identifying genes and proteins related or similar to a particular sequence. This is often performed with BLAST (NCBI) [10].

Some typical queries are presented below:

FIND NUCLEOTIDE SEQUENCE FROM SPECIES HOMOLOGOUS TO [INPUT] \\ INPUT IS A NUCLEOTIDE SEQUENCE (BLASTN) OR PEPTIDE SEQUENCE (TBLASTN)

FIND PROTEIN HOMOLOGOUS TO [INPUT] \\ INPUT IS A NUCLEOTIDE SEQUENCE (BLASTX) OR PEPTIDE SEQUENCE (BLASTP)

FIND SNPS SIMILAR TO [INPUT] \\ INPUT IS A NUCLEOTIDE SEQUENCE, ACCESSION OR GI

In these examples the objects are "nucleotide sequence", "protein" and "SNP". These fall in the Protein and Gene classes. "Species" is an attribute of these classes and is used to refine the queries. The adjectives are "homologous" and "similar". These can be further refined depending upon the tool(s) and databases used in the queries. "Find" is the operator or verb and refers to BLAST.
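As a minimal illustration of how the comments in these queries map input types onto BLAST flavours, the sketch below picks a BLAST program from the query object and the type of [INPUT]. The helper names and the crude input-type heuristic are hypothetical; the actual BSS translation layer is not described at this level of detail in the paper.

```python
def looks_like_nucleotide(seq: str) -> bool:
    # Crude heuristic for illustration only: nucleotide sequences use A, C, G, T, U, N.
    return set(seq.upper()) <= set("ACGTUN")

def choose_blast_program(target_object: str, query_seq: str) -> str:
    """Mirror the query comments above: object + input type select blastn/tblastn/blastx/blastp."""
    is_nuc = looks_like_nucleotide(query_seq)
    if target_object == "nucleotide sequence":
        return "blastn" if is_nuc else "tblastn"
    if target_object == "protein":
        return "blastx" if is_nuc else "blastp"
    raise ValueError(f"no BLAST mapping for object {target_object!r}")

# Example: FIND PROTEIN HOMOLOGOUS TO [INPUT] with a peptide input resolves to blastp.
print(choose_blast_program("protein", "MKTAYIAKQR"))               # -> blastp
print(choose_blast_program("nucleotide sequence", "ATGGCGTTAACC"))  # -> blastn
```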
5.3 ALIGNMENT PROBLEM:

Another common problem is that of aligning multiple sequences of nucleic acids and/or proteins. The goal is to see which regions are conserved and which are different. This problem is made difficult by the fact that there can be intervening sequences of varying length that play little or no functional/structural role.

Some typical queries are presented here:

ALIGN [INPUT] \\ WHERE INPUT IS A SERIES OF PROTEIN OR NUCLEOTIDE SEQUENCES

STRUCTURALLY ALIGN [INPUT] \\ WHERE INPUT IS A SERIES OF EITHER PROTEINS, AMINO ACID OR NUCLEIC ACID SEQUENCES

FIND CONSENSUS SEQUENCE FROM [INPUT] \\ WHERE INPUT IS A SERIES OF EITHER PROTEINS, AMINO ACID OR NUCLEIC ACID SEQUENCES

MATCH CONSENSUS [INPUT] ON A SEQUENCE [INPUT]

MATCH CONSENSUS [INPUT] ON A GENOME [INPUT]

FIND CONSENSUS FROM MULTIPLE SEQUENCE \\ PATTERN DISCOVERY

MATCH CONSENSUS ON A GENOME \\ GENOME-SCALE PATTERN MATCHING

The objects in these queries are "sequences", which fall in the Gene and Protein classes. "Consensus" is used as an adjective. The verbs are "Align", "Structurally Align", "Find" and "Match". These use the following tools: Align at NCBI [11]; CLUSTALW at the European Bioinformatics Institute [12]; Multalin [13]; Weblogo [14]; STRAP [15]; and various Regulatory Sequence Analysis Tools [16].
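A hedged sketch of what FIND CONSENSUS SEQUENCE FROM [INPUT] computes once the sequences are already aligned: the most frequent symbol in each alignment column (a deliberately naive majority-rule consensus, with 'N' marking ties). This only illustrates the concept; it is not the algorithm used by the tools cited above.

```python
from collections import Counter

def majority_consensus(aligned: list[str], tie_char: str = "N") -> str:
    """Column-wise majority consensus of equal-length, pre-aligned sequences."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("expects one or more sequences of identical (aligned) length")
    consensus = []
    for column in zip(*aligned):
        counts = Counter(column).most_common()
        # Mark ambiguous columns where the top two symbols are equally frequent.
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            consensus.append(tie_char)
        else:
            consensus.append(counts[0][0])
    return "".join(consensus)

# Example with three toy nucleotide sequences of the same aligned length.
print(majority_consensus(["ACGTAC", "ACGAAC", "ACGTTC"]))  # -> ACGTAC
```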
5.4 PREDICT PROTEIN FAMILIES, DOMAINS AND FUNCTIONS

Prediction of the protein family, domain and function is another important set of tasks often performed.

Sample queries are presented below:

PREDICT PROTEIN [FAMILY] GIVEN A SEQUENCE [INPUT] \\ E.G. WHERE INPUT IS A SEQUENCE OF AMINO ACIDS

PREDICT PROTEIN [DOMAIN] GIVEN A SEQUENCE [INPUT] \\ E.G. WHERE INPUT IS A SEQUENCE OF AMINO ACIDS OR NUCLEOTIDE SEQUENCES

PREDICT PROTEIN [FUNCTION] GIVEN A SEQUENCE [INPUT] \\ E.G. WHERE INPUT IS A GENE OR PROTEIN NAME

PREDICT GENE [FUNCTION] GIVEN A SEQUENCE [INPUT] \\ E.G. WHERE INPUT IS A SEQUENCE OF AMINO ACIDS

All objects are part of the Protein class. The operator or verb is "Predict" and the tools include: Pfam HMM search (scans a sequence against the Pfam protein families database) [17][18]; SMART [19]; NCBI Conserved Domains Search [20][21]; and Gene Ontology [22].

5.5 PROTEIN STRUCTURE PROBLEMS:

Ultimately, as genes are expressed as proteins, it is their three-dimensional structures that determine their function(s) and activity. Four levels of protein structure problems are presented below: primary, secondary, tertiary and quaternary.

5.5.1 PRIMARY STRUCTURE ANALYSIS PROBLEMS

Primary structure analyses of proteins depend solely on the sequence of the amino acids (and the underlying gene sequences). These analyses often include the BLAST, Alignment, and Prediction of Protein Families, Domains and Functions problems presented previously. Additional information (derived from protein primary structure) is often required.

Some typical queries include:

PREDICT THE PI OF [INPUT] \\ INPUT IS AN AMINO ACID SEQUENCE; CALCULATE THE ISOELECTRIC POINT

PREDICT THE MOLECULAR WEIGHT OF [INPUT] \\ INPUT IS AN AMINO ACID SEQUENCE

PREDICT THE LOCATION OF LEUCINE ZIPPERS IN [INPUT] \\ INPUT IS AN AMINO ACID SEQUENCE

PREDICT THE LOCATION OF COIL REGIONS IN [INPUT] \\ INPUT IS AN AMINO ACID SEQUENCE

The objects are all part of the Protein class and the verb "Predict" uses the following tools: Compute pI/Mw [23] (computes the theoretical isoelectric point (pI) and molecular weight (Mw) from a UniProt Knowledgebase entry or for a user sequence); 2ZIP [24] (predicts leucine zippers); Coils [25] (predicts coiled-coil regions in proteins); and Multicoil [26] (predicts two- and three-stranded coiled coils).

5.5.2 SECONDARY STRUCTURE PREDICTION PROBLEMS:

The secondary structures of proteins and peptides are based on the intra- and inter-chain hydrogen-bonding interactions of the amino acids. The major classes of secondary structure include alpha helices, beta sheets, turns, and random coil regions.

The following query predicts the presence of alpha helices, beta sheets, turns, and coils based on the amino acid sequence:

PREDICT THE SECONDARY STRUCTURE OF [INPUT] \\ INPUT IS AN AMINO ACID SEQUENCE

The tool used in this query is APSSP (Advanced Protein Secondary Structure Prediction Server) [27].

5.5.3 TERTIARY STRUCTURE AND FOLDING PROBLEM (FOR RNA AND PROTEINS):

Ultimately it is the tertiary 3D structure of a protein (as well as of RNA) that determines function. Presented below are two sets of queries: the first for retrieving solved protein structures and structurally aligning them, the second for predicting 3D structures of both proteins and RNA molecules.

FIND THE 3D STRUCTURE OF [INPUT] \\ INPUT IS NUCLEIC ACID OR AMINO ACID SEQUENCE

STRUCTURALLY ALIGN [INPUT] \\ INPUT IS NUCLEIC ACID OR AMINO ACID SEQUENCE

PREDICT THE 2D STRUCTURE OF [INPUT] \\ INPUT IS NUCLEIC ACID OR AMINO ACID SEQUENCE

PREDICT THE 3D STRUCTURE OF [INPUT] \\ INPUT IS NUCLEIC ACID OR AMINO ACID SEQUENCE

The operator or verb in the first query is "find" and the tool is the PDB database [28]. For the second query, the operator or verb is "align" and the tool is STRAP [29] (a structural alignment program for proteins). For the last two queries, the operator or verb is "predict" and the tools used include: MFOLD [30] (for RNA 2D/3D structure prediction); and SWISS-MODEL [31] (an automated knowledge-based protein modeling server).

5.5.4 QUATERNARY STRUCTURE AND PROTEIN-PROTEIN INTERACTIONS:

Quaternary structure refers to proteins that form dimers, tetramers or higher-order macromolecular assemblies. Systems Biology is a relatively new area of research concerned with mapping out and modeling these complex regulatory protein-protein interactions. Some representative BioSemantic queries have already been presented above in Section 5.3. Additional queries and modeling of pathways are currently in development.

6. A WORKFLOW FOR THE IDENTIFICATION OF LIVER CANCER BIOMARKERS

In the previous section, individual queries were performed using BioSemantic statements. In practice, biological and biomedical research and development is much more complicated. There is in fact a complex workflow, consisting of multiple steps requiring different sets of data/attributes to be passed from step to step.

In this section, we model a representative workflow in order to demonstrate the broader applicability of the BioSemantic System.

6.1 WORKFLOW OVERVIEW

The goal of this workflow is to identify and validate genes that can act as biomarkers and/or potential drug targets for cancer. These biomarkers can be used for diagnosis or for monitoring endpoints to evaluate the efficacy of a therapy.

The workflow for this demo will consist of 5 sets of semantic queries to 5 different underlying databases, both public and non-public domain. These include: a database of patient information and demographics; microarray gene expression data; quantitative Real-Time Polymerase Chain Reaction (RT-PCR) data; analyses of tumor biopsy images; and queries to public domain databases at the NCBI as discussed above.

After each semantic query, it is envisioned that there will be an interactive decision-making tool to allow the user to make appropriate selections for the next semantic query. Please note that the data presented and the semantic queries (and their results) are meant to be representative and for illustration purposes only.

A working demonstration of this workflow is currently in development.

6.2 PATIENT SELECTION AND DEMOGRAPHICS

The very first step in this analysis is to select the patient pool to study. This will require access to patient medical records and databases and will require appropriate safeguards for confidentiality.

For purposes of this workflow we are interested in identifying genes that can act as biomarkers and/or drug targets for HCC (a form of liver cancer).

The very first BioSemantic query is:

Select patients having HCC from the patient database.

This results in 48 patients. All other patients having other diseases are excluded.

Additionally, queries and patient selection can be made on patient demographic information that includes sex, age, other diagnoses, risk factors, and tumor characteristics, among others. The patients selected for this study will then have new biopsies taken, or have new analyses performed on biopsies already taken, from which microarray gene expression experiments will be performed.

6.3 MICROARRAY GENE EXPRESSION ANALYSIS

The next step in this workflow is to data-mine and analyze the microarray data in order to identify genes that are over- or under-expressed. The microarray experiments will have already been conducted on normal (control) as well as tumor tissues from liver biopsies obtained from these 48 patients. These data are normalized and processed "off-line" to generate log2 ratios of expression of mRNAs from tumor vs normal tissues. The data were loaded into a proprietary tool for data-mining and report generation.

Again, for purposes of this demonstration, data for five genes have been loaded into the microarray tool. One of these is "negative" (i.e. having no significant difference between expression levels in tumor vs normal tissues) in order to illustrate how decisions can be made to select or reject genes. These 5 genes are: HPRT1 (hypoxanthine phosphoribosyl transferase), which is a "housekeeping gene" and will be negative for this analysis; CCND1 (Cyclin D1; PRAD1: parathyroid adenomatosis 1), which will be positive for under-expression; CDK4 (Cyclin-dependent kinase 4), which will be positive for over-expression; PTP4A1 (protein tyrosine phosphatase type IVA, member 1), which will be positive for under-expression; and THY1 (Thy-1 cell surface antigen), which will be positive for over-expression.
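The preprocessing described above reduces each gene to a log2(tumor/normal) expression ratio. The toy snippet below only illustrates that transformation and how the expected direction of change can be read off the sign; the intensity values are made up, and the gene list simply reuses the five genes named above.

```python
import math

# Made-up, already-normalized intensities for the five genes discussed above.
tumor  = {"HPRT1": 1020.0, "CCND1": 310.0, "CDK4": 2400.0, "PTP4A1": 450.0,  "THY1": 2950.0}
normal = {"HPRT1": 1000.0, "CCND1": 880.0, "CDK4": 800.0,  "PTP4A1": 1300.0, "THY1": 900.0}

for gene in tumor:
    log2_ratio = math.log2(tumor[gene] / normal[gene])
    direction = "over-expressed" if log2_ratio > 0 else "under-expressed"
    print(f"{gene}: log2 ratio = {log2_ratio:+.2f} ({direction})")
```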
BioSemantic queries are sent to the microarray analysis tool, which will perform the analyses requested and pass back the output. These queries include:

IDENTIFY GENES THAT ARE SIGNIFICANTLY OVER OR UNDER EXPRESSED.

ANALYZE EXPRESSION LEVELS FOR EACH GENE.

CREATE A HISTOGRAM FOR EACH GENE FOR VISUAL INSPECTION.

Table 1 presents the output of the analysis tool for the first two queries, which consists of p-values from a t-test of tumor vs normal expression levels for each gene (alpha = 0.05). From this analysis, HPRT1 is determined to be not significantly different from control, whereas the other four genes were.

Table 1. Gene expression t-test p-values (alpha = 0.05).

  CDK4       PTP4A1     THY1       CCND1      HPRT1
  2.70E-07   1.12E-02   6.64E-16   3.75E-03   1.00
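The p-values in Table 1 come from per-gene t-tests of tumor vs. normal expression levels at alpha = 0.05. A minimal sketch of that comparison is shown below using SciPy's independent two-sample t-test; the expression arrays are placeholder values, not the study data.

```python
from scipy.stats import ttest_ind

ALPHA = 0.05

# Placeholder expression measurements (one value per patient sample).
expression = {
    "CDK4":  {"tumor": [7.9, 8.2, 8.4, 8.1], "normal": [6.0, 6.2, 5.9, 6.1]},
    "HPRT1": {"tumor": [5.1, 5.0, 4.9, 5.2], "normal": [5.0, 5.1, 5.0, 4.9]},
}

for gene, groups in expression.items():
    t_stat, p_value = ttest_ind(groups["tumor"], groups["normal"])
    verdict = "significant" if p_value < ALPHA else "not significant"
    print(f"{gene}: p = {p_value:.3g} ({verdict})")
```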
The output from the third query results in a series of histograms. These can be used by the researcher for visual inspection to confirm the statistical analyses. Two representative histograms (Figures 3 and 4) illustrate a negative result (HPRT1) and a positive result (CCND1).

Figure 3. Histogram of HPRT1 expression levels.

Figure 4. Histogram of CCND1 expression levels.

At this point, an interactive step is performed where the researcher will select the 4 positive genes and reject the negative one for further testing.

6.4 REAL-TIME POLYMERASE CHAIN REACTION DATA ANALYSIS

For the four genes selected from the microarray analysis, confirmation studies are performed using Real-Time Polymerase Chain Reaction data from both tumor and normal tissue samples from each patient. Depending upon where these studies are performed, the data may reside in a relational database such as Oracle, or simply in a spreadsheet or text document residing on a server or a shared drive.

The values represent quantified mRNA levels in tumor and normal cells. A significant difference (as determined by a t-test, α = 0.05) will confirm the conclusion from the microarray data, while an insignificant difference contradicts the conclusion.

The BioSemantic queries are:

IDENTIFY GENES THAT ARE SIGNIFICANTLY OVER OR UNDER EXPRESSED.

ANALYZE EXPRESSION LEVELS FOR EACH GENE.

The output from these queries is presented below in Table 2. From this analysis, CCND1 is determined to be not significantly different from control, whereas the other three genes were.

Table 2. RT-PCR t-test p-values (alpha = 0.05).

  CDK4    PTP4A1   CCND1   THY1
  0.040   0.006    0.051   1.50E-03

6.5 IMMUNOHISTOCHEMISTRY ANALYSIS

The next step in the workflow involves analyses of immunohistochemistry images corresponding to these three genes in the 48 patients. These experiments are complementary to the RT-PCR studies, as they measure the relative expression levels of the gene products in the cells themselves. This data will consist of the image itself and various annotations and conclusions which, as was the case for the RT-PCR data, may reside in various locations, from large relational databases to shared drives or servers.

The image analysis tool is still in development, but we present some representative queries and outputs for illustrative purposes. The queries will include:

WHERE IS THE BOUNDARY BETWEEN NORMAL AND CANCER CELLS?

DESCRIBE THE DISTRIBUTIONS OF NUCLEI IN NORMAL VS. CANCER CELLS.

IS THERE A 'SIGNIFICANT' DIFFERENCE IN PROTEIN EXPRESSION BETWEEN THE NORMAL AND CANCER CELLS?

'Typical' images (for THY1) are shown in Figure 5, illustrating the differential distributions of nuclei and THY1 protein expression in normal vs cancerous cells.

Figure 5. A) Original image for THY1. B) Image processed to determine distributions of nuclei. C) Distributions of protein revealed by the antibody.

The output from the BioSemantic queries is presented in Tables 3 and 4 below. These data confirm the analysis from the RT-PCR data, confirming the significance of CDK4, PTP4A1 and THY1 as well as the insignificance of CCND1.

Table 3. Proportion of cancerous tissue area.

  CDK4   PTP4A1   CCND1   THY1
  61%    34%      37%     70%

Table 4. Percent differential protein expression and conclusion.

  CDK4              PTP4A1             CCND1           THY1
  178.1%            57.7%              104%            180.8%
  Over Expression   Under Expression   No Difference   Over Expression

6.6 PUBLIC DOMAIN DATABASE QUERIES

The next step in the workflow is to retrieve all known, published and public domain information available for the three genes identified, in order to allow for hypothesis generation. The set of BioSemantic queries generated will include many (if not most) of the queries presented in Section 5.

6.7 SUMMARY REPORT

At the very end of the workflow, an automated report will be generated to facilitate reporting and publication of the analysis.

7. SERVICE DISCOVERY AND SYNTHESIS

Many of the bioinformatics problems addressed in BioSemantic are combinatorial problems. If an algorithm is readily available, a description of the problem it solves needs to be matched by a query so that the algorithm can be used. In BioSemantic, the "capability" of a service is described in the Semantic Capability Description Language (SCDL), expressed in the following form:

SELECT outputs (O1,…,Om), aggregated-outputs (f1 A1,…, fd Ad)
FROM inputs (I1,…,Im), range variables (R1,…,Rn), other variables (S1,…,Sk)
WHERE p(inputs, outputs, other variables)
GROUP BY (H1,…,Hj)

Note that an SCDL expression may be executable, but executing it directly is unrealistic; the language is used for the purpose of service search/synthesis only. By comparing the capability of a service (in SCDL) and a query in SNL (which can be converted into SCDL), a match may be determined. This kind of mapping is called a "one-to-one" mapping. In BioSemantic, we also include a matching mechanism that maps a query onto multiple services (one-to-many mapping).

Some examples of SCDL in the context of BioSemantic are:

Example 1 (the DNA alignment problem): given a set of DNA sequences, find subsequence pairs that match.

SELECT s
FROM int n (input), for i in (1..n) {nucleotide_sequence(input) qi, Q = Q + qi}, £Q×Q s
WHERE (s1 in Q) AND (s2 in Q) AND not-equal(s1, s2) AND match(s1, s2)

where £qi designates all possible subsequences of qi; £Q×Q designates all possible subsequence pairs of nucleotide sequences that may be derived from Q; and s1 and s2 refer to the first and second elements of a pair s, respectively. Q is initialized to empty by default.
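Read operationally, the SCDL expression in Example 1 enumerates pairs of distinct subsequences drawn from the input set Q and keeps the pairs for which match() holds. The sketch below spells that out in plain Python, with exact string equality standing in for match() and pairs restricted to subsequences from different input sequences; it is meant only to make the declarative semantics concrete, not to reflect how a real alignment service would be implemented (the search space grows quickly with sequence length).

```python
from itertools import combinations

def subsequences(seq: str, min_len: int = 3):
    """All contiguous subsequences of seq with at least min_len characters."""
    for start in range(len(seq)):
        for end in range(start + min_len, len(seq) + 1):
            yield seq[start:end]

def matching_pairs(Q: list[str], min_len: int = 3):
    """Triples (i, j, s): subsequence s occurs in both input sequences i and j (exact match)."""
    pools = [set(subsequences(q, min_len)) for q in Q]
    pairs = set()
    for (i, a), (j, b) in combinations(enumerate(pools), 2):
        for s in a & b:          # exact equality stands in for match(s1, s2)
            pairs.add((i, j, s))
    return pairs

# Toy input: two short DNA sequences sharing the subsequence "GATTA".
print(matching_pairs(["ACGATTACA", "TTGATTAGG"], min_len=5))
```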
Example 2 (the blastN problem): find regions of local similarity between sequences by comparing nucleotide or protein sequences to sequence databases, and calculate the statistical significance of matches.

SELECT s1, s2
FROM nucleotide_sequence(input) qi, £(qi, w) s1, nucleotide_sequence(input) db, £(db, w) s2, float(input) threshold
WHERE HSP(s1, s2) > threshold

where £(qi, w) designates all possible subsequences of qi of length w; HSP (high scoring pair) computes the score, which is compared with the predefined threshold to decide whether the pair counts as a BLAST hit.

8. CONCLUSIONS

In this paper we introduce the natural-language, object-based computing system, BioSemantic System. We present some sample vocabulary, including nouns, verbs and adjectives, as well as several examples of applications to current and relevant biological and biomedical research problems. A sample workflow with a working demonstration is also presented, demonstrating how diverse bioinformatics databases and tools can be seamlessly integrated. We are in the process of expanding the scope of the classes and queries to address additional "real world" workflows. Ultimately, it is anticipated that this will result in enhanced productivity of end-user biological and biomedical researchers.

ACKNOWLEDGMENTS

This work is supported by a grant from the National Science Council, Taiwan, ROC (96-2221-E-468-011-MY3). The views, opinions and/or findings contained in this report are those of the authors and should not be construed as an official National Science Council position, policy or decision unless so designated by other documentation.

REFERENCES

[1.] D. Mayhew, Usability Engineering Lifecycle, Morgan Kaufmann, San Francisco, 1999.

[2.] I. Androutsopoulos, G.D. Ritchie, P. Thanisch, "Natural Language Interfaces to Databases – An Introduction," Journal of Natural Language Engineering, vol. 1, no. 1, pp. 29-81, 1995.

[3.] B.H. Thompson, F.B. Thompson, "Introducing ASK, A Simple Knowledgeable System," Proceedings of the 1st Conference on Applied Natural Language Processing, pp. 17-24, 1983.

[4.] D. Warren, F. Pereira, "An Efficient Easily Adaptable System for Interpreting Natural Language Queries," Computational Linguistics, vol. 8, no. 3-4, pp. 110-122, 1982.

[5.] C.B. Rivera, N. Cercone, "Hermes: Grammar and Lexicon," University of Regina Technical Report CS-98-02, 1998.

[6.] P.P. Filipe, N.J. Mamede, "Databases and Natural Language Interfaces," Jornadas Ingeniería de Software y Bases de Datos (JISBD 2000), pp. 321-332, 2000.

[7.] A.-M. Popescu, O. Etzioni, H. Kautz, "Towards a Theory of Natural Language Interfaces to Databases," Intelligent User Interfaces, pp. 149-157, 2003.

[8.] http://www.ncbi.nlm.nih.gov/

[9.] http://www.uspto.gov/

[10.] http://www.ncbi.nlm.nih.gov/blast/Blast.cgi

[11.] http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi

[12.] http://www.ebi.ac.uk/Tools/clustalw/

[13.] http://bioinfo.genopole-toulouse.prd.fr/multalin/multalin.html

[14.] http://weblogo.berkeley.edu/logo.cgi

[15.] http://www.charite.de/bioinf/strap/

[16.] http://rsat.ulb.ac.be/rsat/

[17.] http://www.sanger.ac.uk/Software/Pfam/search.shtml

[18.] http://pfam.wustl.edu/hmmsearch.shtml

[19.] http://smart.embl-heidelberg.de/

[20.] http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi

[21.] http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml

[22.] http://www.geneontology.org/

[23.] http://www.expasy.org/tools/pi_tool.html

[24.] http://2zip.molgen.mpg.de/index.html

[25.] http://www.ch.embnet.org/software/COILS_form.html

[26.] http://groups.csail.mit.edu/cb/multicoil/cgi-bin/multicoil.cgi

[27.] http://imtech.res.in/raghava/apssp/

[28.] http://www.pdb.org/pdb/home/home.do

[29.] http://www.charite.de/bioinf/strap/

[30.] http://bioweb.pasteur.fr/seqanal/interfaces/mfold-simple.html

[31.] http://www.charite.de/bioinf/strap/
Ad Hoc Collaborative Filtering for Mobile Networks

Patrick Gratz, Adrian Andronache, Steffen Rothkugel
University of Luxembourg
patrick.gratz@uni.lu, adrian.andronache@uni.lu, steffen.rothkugel@uni.lu

Abstract

This work introduces a recommender system designed to augment information discovery and dissemination in mobile networks. We use the system to optimize the podcast providing mechanism of the HyCast application. The presented recommender system is based on collaborative filtering and provides two different algorithms to determine similar network neighbors, which are used to incrementally build up a locally stored model for the final prediction calculation.

1. Introduction

Due to the huge amount of available information in today's society, it becomes more and more difficult for consumers to find the most useful information. Recommender systems using collaborative filtering (CF) are a popular technique for reducing such information overload and finding useful information on the Internet. For this purpose, a collaborative filtering system gives recommendations or predictions on items for a user based on the opinions of other like-minded users [1]. The opinions on such items can be obtained explicitly from users, by giving a rating on an item, or implicitly, by counting clicks, viewing time, and the like.

Considering the increase in popularity of mobile devices in the form of mobile phones, smart phones, PDAs and Tablet PCs, the same problem of information overload emerges. These devices are becoming faster in processing, gain more memory capacity, are able to execute more and more powerful applications, and are at the same time equipped with various wireless and/or cellular communication capabilities. However, different from traditional static devices, such mobile devices often have no always-available cellular connection to the Internet. Furthermore, compared to the often available wireless communication capabilities, such a cellular connection causes more cost and provides less bandwidth.

In this work we introduce an ad hoc collaborative filtering system designed to augment the podcast providing mechanism of the mobile hybrid network application HyCast [2].

The remainder of this paper is organized as follows. Related work is presented in Section 2. Section 3 introduces our approach of an ad hoc recommender system for mobile devices, together with two algorithms for similar neighbor determination in such networks. In Section 4 we evaluate our system and compare the two introduced algorithms with respect to precision and message complexity. Finally, the paper concludes with a preview of future work.

2. Related work

Recommender systems using collaborative filtering are one of the most successful recommendation techniques [3], [4]. In this section we introduce some existing research work on collaborative filtering in mobile environments.

In [5], an incremental collaborative filtering algorithm for applications where users are occasionally connected to a central server is introduced. The general idea is to store a subset of selected user profiles, together with a ranked list of predictions. When the user is in offline mode, a service on the local device can still recommend items based on the predictions made the last time the user was connected. Each time the user supplies new ratings, the list of predictions is recomputed, even if the user is not connected to the server. In the case that a user encounters another user, the authors suggest that they exchange their profiles and recalculate their prediction lists. The past influence of the other user should be removed from all predictions and the new influences should be added. However, this case is not evaluated or considered any further in the paper and is left as future work.

A further portable recommender system, along with five peer-to-peer (P2P) architectures for finding neighbors, is presented in [6].
The authors of [6] introduce a new collaborative filtering algorithm called PocketLens that can run on connected servers, on usually connected workstations, or on occasionally connected portable devices. The presented algorithm is a variant of the item-item algorithm introduced in [1], with modifications for a peer-to-peer environment. To reach the goal of portability, a local similarity model is created for the user. Thereby, the algorithm only needs access to the ratings of the owner and one other user at a time. In this manner, the model is created incrementally in a distributed fashion.

In [7], an approach for building a scalable recommendation system for mobile commerce using P2P systems is considered. The main idea of the proposed approach is to transform the problem of finding recommendations using collaborative filtering into a search problem in scalable P2P systems like Freenet or Gnutella. Thereby, a query (a vector with votes on products) is broadcast from the querying node to all neighbor peers. When a peer receives a query, it calculates the proximity with other cached queries. If the proximity is higher than a threshold, the cached voting vector is sent back; otherwise the query is broadcast further. For sparse voting vectors the authors propose a binary interpolative compression algorithm. Furthermore, to improve the performance and quality of recommendations, they propose an approach for clustering similar peers.

An approach to collaborative filtering in a mobile tourist information system for visitors of a festival, based on spatio-temporal proximity in social contexts, is proposed in [8]. This approach is based on the idea that users who go to the same place at the same time tend to have similar tastes. In order to keep track of the visited places, each user is equipped with a portable computer coupled with a GPS unit. Furthermore, a central server provides a database with information about all the events, restaurants, venues and bars at the festival [9]. The proposed approach uses a user-based CF technique and calculates similar users via a spatio-temporal proximity measure, i.e. two users are considered as similar if they consume the same items simultaneously. The subsequent exchange of rating information between such similar users is done via an ad-hoc peer-to-peer interaction. However, the defined similarity measure has one drawback: users consuming the same periodic event at different times still share interests, but are not considered as similar. In future work, the authors intend to investigate how their CF approach can be extended in order to exchange ratings between users in spatial but not temporal proximity. Furthermore, they want to evaluate the introduced CF system at the Edinburgh Fringe festival.

Jacobsson et al. introduce in [10] an approach for a mobile recommender system where media can find people rather than the other way around. Here, media files are autonomous, rule-following agents capable of building their own identities from interactions with other agents and users. The general idea is that the interaction of large ensembles of such agents, distributed over mobile devices in social networks, can give rise to collaborative filtering-like behavior.

3. Mobile ad-hoc recommender system

In order to get new, unknown podcasts, the HyCast application provides an ad hoc network podcast discovery mechanism. This mechanism provides a list of all available podcasts in the neighborhood. However, with an increasing amount of available podcasts, this mechanism entails quite inefficient advertising. Thus, it becomes very hard for the user to decide which podcasts to subscribe to. For this purpose, a recommender system based on collaborative filtering that recommends only the most interesting items can help to overcome this problem of information overload.

In this section we introduce such a recommender system for the "Big City Life" scenario [2], which allows the user to get recommendations about local offers based on the taste of other like-minded users in nearby mobile environments. The ability to get recommendations from the local ad-hoc network has the advantage that the system can take into account new information without first having to connect to a central repository on the Internet, and can thus provide updated recommendations even if the user has no cellular connection available. Furthermore, all ratings made by a user on the way, e.g. for the menu of the day in a restaurant, will instantly have an impact on the calculated recommendations for like-minded users in the local vicinity.

The recommender process can be roughly divided into three phases: determine the similar neighborhood, update the recommender model, and calculate a prediction. Before proceeding with a description of our algorithm, we first define a suitable similarity measure between two users.
3.1. User similarity

In order to deliver good recommendations, a typical CF system depends on a critical mass of users with commonly rated items. However, in our application scenario it is very likely that a tourist who visits a certain city for the first time has no commonly rated podcasts with users in his nearby environment. Nevertheless, this does not mean that the tourist has no similar taste to other users in the local neighborhood: the tourist may have rated different podcasts that are similar in content to those in the nearby neighborhood. For this purpose we calculate the similarity between two users based on an approach proposed by Pazzani in [11], called collaboration via content. The idea behind this approach is to exploit a content-based profile for each user in order to calculate the similarity between two users via their content-based profiles instead of their commonly rated items. In the context of the HyCast application, such a content-based profile is represented by a list of weighted keywords. For this purpose we presume that each podcast feed contains a set of keywords describing its content. Given the corresponding rating values for all podcasts rated by a specified user, together with the appropriate set of keywords describing these podcasts, a weight for each keyword k can be calculated as follows:

    w_{u,k} = \frac{\sum_{p \in P_u(k)} r_{u,p}}{\mathrm{count}(k)}

where P_u(k) is the set of podcasts rated by user u whose keyword sets contain k, r_{u,p} is the rating of user u for podcast p, and the function count(k) returns the number of podcasts containing keyword k. Thus, for each user a vector of weighted keywords w_u = (w_{u,1}, ..., w_{u,n}) can be calculated that represents his preferences. Given these content-based profiles, we can define the similarity between two users u, v via the cosine between the corresponding weighted keyword vectors:

    \mathrm{sim}(u,v) = \frac{\sum_{k} w_{u,k} \, w_{v,k}}{\sqrt{\sum_{k} w_{u,k}^2} \, \sqrt{\sum_{k} w_{v,k}^2}}

In order to provide recommendations to the active user, the system starts with a search for other like-minded users in the mobile ad-hoc network. In the following sections we introduce two different algorithms that determine such a set of similar neighbors. Due to the fact that the mobility in such a network leads to high dynamics and unreliable connections, both algorithms are designed to use short communication paths and to avoid additional communication overhead. For this purpose both algorithms use only local information instead of any routing protocols for the communication.
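A compact sketch of the profile-similarity computation defined in Section 3.1: keyword weights from rated podcasts, then the cosine similarity of the resulting profiles. Names and toy data are illustrative; the per-keyword count here is taken over the user's rated podcasts, standing in for count(k) in the reconstructed weight formula above.

```python
from math import sqrt

def keyword_weights(ratings: dict, keywords: dict) -> dict:
    """ratings: podcast -> rating given by one user; keywords: podcast -> set of keywords."""
    weights, counts = {}, {}
    for podcast, rating in ratings.items():
        for k in keywords[podcast]:
            weights[k] = weights.get(k, 0.0) + rating
            counts[k] = counts.get(k, 0) + 1      # rated podcasts carrying k, stands in for count(k)
    return {k: weights[k] / counts[k] for k in weights}

def cosine_similarity(wu: dict, wv: dict) -> float:
    common = set(wu) & set(wv)
    dot = sum(wu[k] * wv[k] for k in common)
    norm_u = sqrt(sum(x * x for x in wu.values()))
    norm_v = sqrt(sum(x * x for x in wv.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy profiles: two users who rated different podcasts but share keyword interests.
kw = {"p1": {"travel", "food"}, "p2": {"food"}, "p3": {"travel", "music"}}
u = keyword_weights({"p1": 5, "p2": 3}, kw)
v = keyword_weights({"p3": 4}, kw)
print(cosine_similarity(u, v))
```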
3.2. Hierarchical Cluster-based Neighborhood Resolution (HCNR)

Our first algorithm uses a weighted cluster topology generated by the Weighted Application Aware Clustering Algorithm (WACA) presented in [12]. WACA creates clusters in a hierarchical fashion. For this purpose, each device elects exactly one device as its cluster-head, i.e. the neighbor with the highest weight. This cluster-head also investigates its one-hop neighborhood, similarly electing the device with the highest weight as its cluster-head. This process terminates when a device elects itself as its own cluster-head, due to having the highest weight among all its neighbors. We call all intermediary devices along such cluster-head chains sub-heads. Each device on top of a chain is called a full cluster-head, or, in short, just cluster-head. Hence, in each network partition, multiple cluster-heads might coexist.

Based on such a topology, our algorithm works as follows. At first, in order to determine a set of similar neighbors, each slave sends its own profile to the currently elected cluster-head. To keep the message complexity low, each device maintains a list of the cluster-heads that have already received the current profile. As soon as the own profile changes, this list is cleared.

After receiving the profiles from its slaves, the cluster-head and all sub-heads calculate a similarity matrix from the received profiles. Subsequent to this calculation, each sub-head sends the calculated similarity matrix and the list of received profiles to the cluster-head. The cluster-head stores this profile list together with the corresponding similarity values and calculates the similarity values for all missing pairs in order to complete the similarity matrix. Figure 1 shows how the algorithm uses the cluster topology to exchange the information.

Figure 1. HCNR similarity calculation and similar neighbor discovery.

Thus the matrix on the cluster-head stores the similarities between all users connected to the current cluster. Finally, in order to determine the similar neighborhood for all slaves that are not directly connected with the cluster-head, a copy of this similarity matrix is replicated to each sub-head in the cluster. The cluster-heads and the sub-heads provide a list of similar neighbors to each slave in the current cluster.

3.3. Weighted Neighborhood Resolution (WNR)

The second algorithm is based on a simple peer-to-peer communication pattern. However, in contrast to our first algorithm, each device calculates only similarities between the active user and other users in range. In order to decide on which device the corresponding similarity should be calculated, each device computes a weight and includes this weight in its beacon. Thus, when a device detects a new neighbor with a higher weight than its own, it sends its profile to this neighbor. The receiving device then calculates the corresponding similarity value. If this value is higher than a given threshold, the device is considered as similar and is stored in the list of similar devices. The own profile, together with the computed similarity value, is sent back. Figure 2 shows a WNR information exchange example.

Figure 2. WNR similar neighbor discovery.

3.4. Recommender model and prediction calculation

Each time the system detects one or more new similar neighbors, it updates the local recommender model. This model aggregates the ratings of the k most similar neighbors whose similarity value is higher than a given threshold. A similar neighbor is represented as a pair (u, sim(a, u)) for the active user a, and a vote as a triple (u, i, r_{u,i}), where r_{u,i} denotes u's rating for item i. The implementation of our recommender model maintains the similarity values of the most similar neighbors together with their ratings via a tuple set S and a triple set E. Furthermore, each entry in S can be considered as a directed weighted edge between the active user and a similar neighbor, while each entry in E can be seen as a directed weighted edge between a similar user and a podcast. Consequently, each set can be represented via a corresponding graph. The whole set of votes stored in E defines a directed bipartite graph G_E = (N ∪ I, E), where N is the set of nodes representing similar users, I is the set of item nodes, and E is the set of directed weighted edges from user nodes to item nodes. The tuple set S defines a complete bipartite graph G_S = ({a} ∪ N, S) that represents the similarities between the active user a and his most similar neighbors.

Given the above described model, we can calculate a prediction for an item i for the active user a by computing the average of the rating values contained in E, each weighted by the corresponding similarity value in S:

    p_{a,i} = \frac{\sum_{(u,i,r_{u,i}) \in E} \mathrm{sim}(a,u) \, r_{u,i}}{\#\{(u,i,r_{u,i}) \in E\}}
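A small sketch of the model and the prediction rule just described: S holds (neighbor, similarity) pairs, E holds (neighbor, item, rating) votes, and the prediction for an item averages the similarity-weighted ratings, following the reconstructed formula above. Data structures and names are illustrative, not taken from the paper's implementation.

```python
from typing import Optional

def predict(item: str, S: dict, E: list) -> Optional[float]:
    """S: neighbor -> similarity to the active user; E: list of (neighbor, item, rating) votes."""
    votes = [(u, r) for (u, i, r) in E if i == item and u in S]
    if not votes:
        return None                      # no similar neighbor has rated this item
    weighted = sum(S[u] * r for u, r in votes)
    return weighted / len(votes)         # average of similarity-weighted ratings

# Toy model: two similar neighbors and their votes on podcasts.
S = {"u1": 0.9, "u2": 0.6}
E = [("u1", "podcast_a", 5), ("u2", "podcast_a", 3), ("u2", "podcast_b", 4)]
print(predict("podcast_a", S, E))   # (0.9*5 + 0.6*3) / 2 = 3.15
print(predict("podcast_b", S, E))   # 0.6*4 / 1 = 2.4
```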
4. Experiments and results

In order to check the proper operation of the above-introduced ad hoc recommender system, we implemented it on top of the JANE simulator [13] and performed several experiments comparing the precision of the calculated predictions, as well as the number of messages and Kbytes sent, using HCNR and WNR for the neighborhood determination.

4.1. Dataset

Due to the lack of a podcast rating database, we used the MovieLens Data Set [14], which consists of 100,000 ratings for 1682 movies by 943 users, where each user has rated at least 20 movies. Additionally, each movie is also mapped onto different genres. Thus, we can consider each movie as a podcast and the corresponding genres as keywords.

For our experiments, we generated 5 different training sets containing 15 votes that are considered as the observed votes for each user, and 5 different test sets containing the remaining votes. After each simulation run we compared the predicted votes with the corresponding votes in the test set and calculated the Mean Absolute Error (MAE), which has been used to measure prediction performance in several cases [4], [15], [16]. If a predicted item did not have an adequate entry in the test set, it was eliminated from the evaluation. Note that we used the MAE only to compare how accurately our algorithms predict a randomly selected item, rather than to evaluate the user experience of the generated recommendations.
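The evaluation metric itself is straightforward; the sketch below computes the MAE over predicted vs. held-out test votes, dropping predictions that have no matching test entry, as described above. It is a generic illustration of the metric, not the simulation harness used in the paper.

```python
def mean_absolute_error(predictions: dict, test_votes: dict) -> float:
    """predictions / test_votes: (user, item) -> rating; items missing from the test set are skipped."""
    diffs = [abs(pred - test_votes[key])
             for key, pred in predictions.items()
             if key in test_votes]
    if not diffs:
        raise ValueError("no overlapping (user, item) pairs to evaluate")
    return sum(diffs) / len(diffs)

predicted = {("u1", "m1"): 4.2, ("u1", "m2"): 3.0, ("u2", "m1"): 2.5}
test      = {("u1", "m1"): 5,   ("u2", "m1"): 2}          # ("u1", "m2") is eliminated
print(mean_absolute_error(predicted, test))               # (0.8 + 0.5) / 2 = 0.65
```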
4.2. Environmental settings

For each experiment we chose a 150 x 150 and a 300 x 300 unit square with 50, 60, 70, 80, 90 and 100 mobile devices, where the transmission range of each device was 50 units. Furthermore, we used random waypoint as the underlying mobility model, where the devices move between positions provided by a random position generator. For each device, the moving speed was randomly varied for every new position between 0.9 and 5.4 km/h.

In an initialization phase, we selected 15 votes for each device in order to calculate an initial user profile. Afterwards, the devices exchanged these profiles in order to determine the k most similar neighbors. For all experiments we simulated with 5 different training sets and 50 different topologies per training set. All results are shown with 95% confidence intervals.

4.3. Results

Figure 3 shows the measured MAE after 5 minutes of simulation. For each run we chose k = 10 and a similarity threshold of 0.5 for each device. As Figure 3 shows, HCNR provides a 2%-4% better average precision and a slightly smaller variation in the calculated predictions than WNR. After decreasing the simulation area to 150 x 150 units while keeping the same number of devices with the same transmission range—which leads to an increased connectivity—this difference in the measured precision decreases to distinctly under 1%. This result was expected because, if WNR is used, the quality of each device's similar neighborhood, and therefore the precision of the calculated predictions, directly depends on the number of discovered one-hop neighbors.

Figure 3. Mean absolute error of calculated predictions.

In a second experiment we measured the number of unicasts and the number of Kbytes sent when using HCNR and WNR, in order to compare their message complexity. As Figure 4 shows, HCNR needs distinctly fewer unicasts than WNR and scales much better with the number of devices, particularly in dense network settings. However, as Figure 5 shows, the total number of Kbytes sent, and therefore the message size, is to a large extent bigger when using HCNR instead of WNR. Because in WNR a message contains at most the own profile together with one similarity value between the own profile and a similar neighbor, the total number of Kbytes sent is very small compared to HCNR. In HCNR, where certain elected devices gather and provide similarity information about the entire cluster, a large part of the messages contain a list of profiles and similarities, which results in a distinctly bigger bandwidth usage.

Figure 4. Number of messages sent.

Figure 5. Bandwidth usage.

5. Conclusion and future work

In this work we presented a collaborative filtering based recommender system for mobile ad hoc networks, to augment the HyCast application and overcome the potential problem of information overload. For this purpose we defined a suitable similarity measure and introduced two algorithms to determine a set of similar neighbors. Furthermore, we presented a local model for calculating predictions which is incrementally built up on the way. In order to check the introduced system for proper operation, it has been implemented and tested on top of the JANE simulator by performing several experiments. As the results of these experiments show, HCNR provides a slightly better precision and scales distinctly better concerning the number of sent unicasts if we increase the network connectivity. However, WNR has the advantage that the used bandwidth is very small compared to HCNR. In WNR, each message contains at most one profile together with one similarity value between the corresponding neighbors, while a large part of the messages sent by HCNR contain similarity and profile information of several devices. A further difference is that in WNR each device only calculates and stores similarity information that is useful for its own interest, whereas in HCNR we assume an altruistic behavior of the cluster-heads and the sub-heads.

Thus, in future work additional research has to be done in order to decide which algorithm should be used in which case. Furthermore, we intend to extend our ad hoc approach by using available backbones in order to exchange similarity information between different partitions. Additionally, regarding the "Big City Life" scenario—where tourists visiting different cities are primarily interested in the local offers from the nearby environment—we plan an extension using additional context information, e.g. position information, to further improve our system.

6. References

[1] B. Sarwar, et al., "Item-based Collaborative Filtering Recommendation Algorithms," in Proceedings of the 10th International World Wide Web Conference (WWW10), Hong Kong, 2001.
[2] A. Andronache, M. R. Brust, and S. Rothkugel, "HyCast – Podcasts Discovery in Mobile Network," in Proceedings of the 3rd ACM Workshop on Wireless Multimedia Networking and Performance Modelling (WMuNeP'07), Chania, Crete Island, Greece, 2007, pp. 27-34.
[3] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, "GroupLens: An open architecture for collaborative filtering of netnews," in Proceedings of the ACM 1994 Conference on Computer Supported Cooperative Work, Chapel Hill, North Carolina, 1994, pp. 175-186.
[4] U. Shardanand and P. Maes, "Social information filtering: Algorithms for automating 'word of mouth'," in Proceedings of the ACM CHI'95 Conference on Human Factors in Computing Systems, vol. 1, 1995, pp. 210-217.
[5] R. Cöster and M. Svensson, "Incremental Collaborative Filtering for Mobile Devices," in Proceedings of the 2005 ACM Symposium on Applied Computing (SAC'05), Santa Fe, New Mexico, USA, 2005, pp. 1102-1106.
[6] B. N. Miller, J. A. Konstan, and J. Riedl, "PocketLens: Toward a Personal Recommender System," ACM Transactions on Information Systems, vol. 22, no. 3, July 2004, pp. 437-476.
[7] A. Tveit, "Peer-to-peer based Recommendations for Mobile Commerce," in Proceedings of the 1st International Workshop on Mobile Commerce (WMC'01), Rome, Italy, 2001, pp. 26-29.
[8] A. de Spindler, M. C. Norrie, M. Grossniklaus, and B. Signer, "Spatio-Temporal Proximity as a Basis for Collaborative Filtering in Mobile Environments," in Workshop on Ubiquitous Mobile Information and Collaboration Systems (UMICS 2006), Luxembourg, Grand Duchy of Luxembourg, 2006.
[9] R. Belotti, C. Decurtins, M. C. Norrie, B. Signer, and L. Vukelja, "Experimental Platform for Mobile Information Systems," in Proceedings of the 11th Annual International Conference on Mobile Computing and Networking (MobiCom 2005), Cologne, Germany, 2005, pp. 258-269.
[10] M. Jacobsson, M. Rost, and L. E. Holmquist, "When Media Gets Wise: Collaborative Filtering with Mobile Media Agents," in Proceedings of the 11th International Conference on Intelligent User Interfaces (IUI'06), Sydney, Australia, 2006, pp. 291-293.
[11] M. J. Pazzani, "A Framework for Collaborative, Content-Based and Demographic Filtering," Artificial Intelligence Review, Dec. 1999, pp. 393-408.
[12] A. Andronache, M. R. Brust, and S. Rothkugel, "Multimedia Content Distribution in Hybrid Wireless Networks using Weighted Clustering," in Proceedings of the 2nd ACM Workshop on Wireless Multimedia Networking and Performance Modeling (WMuNeP'06), Torremolinos, Spain, 2006, pp. 1-10.
[13] D. Görgen, H. Frey, and C. Hiedels, "JANE – The Java Ad Hoc Network Development Environment," in 40th Annual Simulation Symposium (ANSS'07), 2007, pp. 163-176.
[14] MovieLens Data Set. Available at: http://www.grouplens.org/node/73.
[15] J. S. Breese, D. Heckerman, and C. Kadie, "Empirical analysis of predictive algorithms for collaborative filtering," in Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI-98), 1998, pp. 43-52.
[16] J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl, "An algorithmic framework for performing collaborative filtering," in Proceedings of the 22nd International Conference on Research and Development in Information Retrieval (SIGIR'99), 1999, pp. 230-237.
A Reconfigurable Distributed Broker Infrastructure for Publish Subscribe based MANET

Mayank Pandey and B. D. Chaudhary
Motilal Nehru National Institute of Technology, Allahabad, India
mayankpandey@mnnit.ac.in, bdc@mnnit.ac.in

Abstract

The publish-subscribe communication paradigm is an attractive alternative for MANET because of its anonymous, asynchronous and decoupled behavior. A broker is a dominant component of this paradigm, and its design is non-trivial in MANET due to the unpredictable mobility of nodes. In this paper, we present an algorithm to establish a reconfigurable, distributed network of brokers, facilitating guaranteed communication between mobile publishers and subscribers. Our approach dynamically elects a set of brokers such that every non-broker node has a broker in its range. These brokers subsequently establish connections with their neighbor brokers, ensuring connectivity among them. Simulation results indicate that for a given node density the number of brokers in the network becomes stable after initial fluctuation. Further, the number of brokers decreases with increasing node density. The results also indicate that the number of brokers remains stable up to a certain mobility speed and tends to fluctuate only at very high speeds.

1. Introduction

A publish-subscribe communication system provides anonymity and decoupling in time and flow between communicating partners. It consists of three roles: publishers, subscribers and brokers. The broker, the most important component, is responsible for dispatching published events to interested subscribers. Several publish/subscribe systems [1, 2, 3, 4] have been proposed for large-scale infrastructure networks with either centralized or distributed brokers. This architecture of brokers has been extended to wireless LAN-like settings (with fixed wireless access points acting as brokers) to deal with the mobility of

The architecture proposed for fixed infrastructure based networks cannot be extended directly to Mobile Ad-hoc Networks (MANET) due to frequent disconnections and the high degree of unpredictability of the motion and location of mobile nodes. In these mobile ad-hoc environments, designing a static and distributed broker network is difficult because it is not possible to assign the broker role to any node permanently.

In the past few years, some efforts [11, 12, 13, 16] have been made to extend the publish/subscribe paradigm to MANET, considering every node of the network as a broker. These proposals can be broadly classified into two categories. In the first category, an overlay network of brokers is maintained at the application layer to route messages between publishers and subscribers. This results in a large overhead, because the structure of the overlay keeps changing frequently due to the mobility of nodes. Further, after a few reconfigurations, there is a mismatch between the logical and physical topology of the overlay. In the second category, gossip or epidemic approaches are used to route events by exploiting the broadcast nature of the underlying wireless MAC layer. Since gossip-based approaches are probabilistic, it is difficult to provide guaranteed delivery of messages. Further, there is additional overhead due to the delivery of events to uninterested subscribers.

We take a different approach from the above two. Our approach elects broker nodes dynamically, ensuring that every non-broker node can find a broker node in its 1-hop MAC range. For this dynamic election of brokers we have utilized the basic beaconing mechanism of the 802.11 MAC protocol. We have proposed strategies to effectively route messages among these elected brokers, ensuring high percentages of message delivery. The remainder of this paper is organized as follows. We present the system model and broker
publishers and subscribers. dissemination. Simulation environment and results are

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 361


DOI 10.1109/SUTC.2008.30
detailed in section 5. A brief survey of related works infinity. An initial node can also set itself as a broker if
appears in section 6, followed by conclusion in section its timer t-rand expires or it wants to publish or
7. subscribe before the expiration of timer t-rand. This
role change is indicated in the next hello-broker
2. Broker selection Algorithm messages, to its 1-hop neighbors. There can be a
possibility where two initial nodes, simultaneously
Our proposed approach for broker selection is want to publish /subscribe or their t-rand timers are
inspired by ad-hoc clustering algorithm [5, 6, 7] and it expiring at the same time. In such situation, both can
ensures that for any publisher/subscriber, the broker is elect themselves as broker. Contention between the
always available in 1-hop range and no two brokers are two is solved by node ID. The node with lower ID
in the range of each other. Our model assumes N value wins the race. After some finite time, nodes are
mobile nodes, moving randomly in a predefined area. in two roles; broker or non-broker and they keep on
Each node is using omni directional antenna of equal exchanging hello-broker message to reflect any
power and is capable of becoming publisher, changes in topology. The pseudo-code of this
subscriber or broker. Further, our model assumes that algorithm is given in Figure. 1
this ad-hoc network is connected and has no partitions. This broker selection algorithm ensures that a
We have used following variables and data structures broker is always available with in 1-hop range of every
in our algorithm: publisher or subscriber and no two brokers are in the
-role (i): the role currently played by any node i. range of each other.
-ID (i): Identification number of any node i
Broker Selection Algorithm ( )
-t-rand (i): Random timer value at any node i.
when node i powers on
-NIDB: neighbor information data base maintained at
set role(i) = initial
every node, detailing information about 1-hop
broadcasts hello broker messages
neighbors of this node.
set t-rand value
-hello-broker message: Sent by every node
start receiving broadcast messages
periodically in 1-hop neighborhood.
if (receives broadcast from a broker node j)
The role currently played by any node i is stored at
set role(i) = non-broker
each node in variable role (i). It can take one of the
set t-rand = infinity
three possible values; broker, non-broker or initial.
set broker(i) = node j
The broker value represents that the node is serving as
else if ( t-rand expires && role (i) = initial )
broker for some nodes. The initial value indicates that
set role(i) = broker
the node is not a broker and it does not have a broker
indicate this role change in next broadcast
in its range where as the non-broker value tells that the
else if ( role (i) = initial && node i wants to
node is not a broker but it has a broker in its range to
pubs/subs )
interact with. There is a variable ID (i) at every node
set role(i) = broker
which stores identification number of the node i. Timer
indicate this role change in next broadcast.
variable t-rand (i) stores random timer value at any
node i. NIDB is a neighbor information data base Figure.1
maintained at every node, detailing information about
1-hop neighbors of this node. This table contains ID of
3. Construction and maintenance of broker
neighbors, role of the neighbors and broker node ID of
neighbors. The beacon hello-broker is broadcasted by network
every node periodically in 1-hop neighborhood. This
message contains the following information of node; Other property of broker selection algorithm is that
node ID, role and broker ID for this node. two broker nodes are always either two or three hops
The algorithm for broker selection runs at every away. As the nodes are exchanging hello-broker
node. When any node powers on, it sets its role as message periodically the minimum hop number
initial and the sends hello-broker messages between two broker nodes can be found by inspecting
periodically to its 1-hop neighbors, it also starts its NIDB of non-broker nodes. As already mentioned,
random timer t-rand and builds its NIDB. An initial there can be two cases:
node, after receiving a hello-broker message from a 1) Two brokers are separated by one non-broker as
broker node, changes its initial role to non-broker, sets depicted in Figure.2. In this case, broker 1 and broker 2
its broker node and resets the value of t-rand to communicate through non-broker. Non-broker gets
hello-broker message from both broker 1 and broker 2.

362
Non-broker selects the one which is having lower ID brokers (having higher ID) changes its role from
as its broker and also concludes that there is a path broker to non-broker.
between broker 1 and broker 2 through itself. This
information is passed to both broker 1 and broker 2. 4. Message dissemination among brokers
hello hello Every selected broker-node maintains a
subscription table. This table contains the following
Broker 1 Non-Broker Broker 2 fields: {subscriber node ID, subscription filter}.
Figure.2 Subscriber’s messages are stored in this table. Due to
2) Two brokers are separated by two non-broker nodes the mobility of subscriber, it is possible that its
as depicted in Figure.3. In this case broker 1 and subscriptions are stored in one broker and it is
broker 2 communicate through non-broker 1 and non- currently attached to any other broker. This creates a
broker 2. Non-broker 1 and non-broker 2 both problem in notifying the subscriber when event match
exchange hello-broker messages and gather the is there. To solve this problem, subscriber maintains a
knowledge that broker 1 and broker 2, are the brokers list of its subscription messages. When subscriber
for non-broker 1 and non-broker 2 respectively. They moves in to range of another broker, it again sends
also conclude that path from broker 1 to broker 2 is subscription messages to new broker using this list.
through non-broker 1 and non-broker 2. This The entries related to subscriber are purged from
information is passed to both broker 1 and broker 2. subscription table of previous broker. The movement
of broker node does not pose a big problem. As soon
hello hello hello
as broker node moves it flushes its subscription table.
This broker moves near to another broker and if, it is
selected as a broker for that area as discussed in
Broker 1 Non-Broker 1 Non-Broker 2 Broker 2
previous section, then the old broker transfers its
subscription table to this new broker. Other wise new
Figure.3 broker changes its role from broker to non-broker. This
In this way every broker node gets the ID, path and scheme ensures that during reconfiguration, subscriber
distance in hop number for all neighbor brokers. More is always attached to that broker where its
specifically, every broker gets a small subset view of subscriptions are stored. These subscriptions are never
complete broker network. This view always gets propagated beyond the broker receiving them, instead
updated as nodes continuously exchange hello-broker we use publication forwarding scheme discussed next
messages. In our approach, published messages are forwarded
Due to mobility of nodes, structure of broker by a broker to all other broker along the broker
network gets disturbed. In Figure.1 suppose non- network. Every broker node maintains a publication
broker nodes moves away from its current location, table to store published messages. The fields of this
then broker 1 and broker 2 will get disconnected if this table are: {publisher ID, published message}. A
was the only non-broker node between them (contrary publisher sends its message to broker which propagate
to our assumption as we have assumed a connected ad- this message to all other brokers in network. Each
hoc network), then in this situation brokers have to published message is uniquely identified by the 2-tuple
wait for some time for any other non-broker node to {publisher ID, sequence number}. This 2-tuple
move between them and to make them connected information is attached with the published message by
again. If there are more than one non-broker nodes the publisher itself. Each broker knows the routes to its
between broker 1 and broker 2 then the movement of neighbor brokers as mentioned in section 3. The
non-broker node will not affect the connectivity of published message is forwarded to the neighbor broker
brokers. They get connected through other non- and stored in its publication table. This message is
broker node. forwarded to its neighbors and so on. To avoid looping
The movement of broker node is little bit trickier. and duplicate message forwarding, a list of broker IDs
Broker movement leaves behind some nodes having no to whom the message is forwarded is kept with the
broker in their range. This causes change in their role published message. Suppose in Figure.4, where an
from non-broker to initial. All these initial nodes again example broker network is depicted (only broker nodes
enter in broker selection phase and elect a new broker are shown), broker node-1 initiates publication
for them. This broker node can move in the range of forwarding. At this instant, published message is
any other broker node. In this situation, one of the having node 1, 2, 7 and 4 in its forwarded list as 2, 7

363
and 4 are neighbor brokers and node 1 is initiator.
When this message reaches at node 2, it checks its Parameter Value
forwarded list and forward this only to those neighbors Number of nodes 100, 50 and 80
which are not there in this list and also append those Area 2000×2000 m2
neighbor IDs in that list. The list will now contain Minimum speed 0 m/s
1,2,4,7 and 3 when message reaches at node 3. This Maximum speed 10 m/s, 20 m/s,50 m/s
process continues till all the broker nodes are there in Hello packet period 0.2 sec
the forwarded list. This strategy ensures that messages Number of publisher 25% of the total nodes
are not forwarded in loops and avoids duplicate No. of subscribers 75% of the total nodes
message reception to some extent. Duplicate message Publishing rate 1 message/ 3 second
are detected by 2-tuple unique identifier attached with t-rand period Random between 1-5
published message, and they are discarded after seconds
reception. In this way every published message is
received and stored by all brokers in network. Table.1
We performed a simulation experiment to evaluate
6 performance our broker selection algorithm with
respect to variation in node density of ad-hoc network.
For this experiment mobility speed was fixed in the
2
3 range from 0 to 10 m/s. Figure.5 contains three graphs
for three different node densities (100, 80 and 50
nodes). Horizontal axis of graph represents the time
7 elapsed in minutes where as the vertical axis represents
1 5 the number of elected brokers. As can be seen from the
graph, the number of brokers for 100 nodes is less than
that for 80 and 50 nodes. This numbers of brokers, for
different node density is stable too. In other words, the
4 number of brokers for a given node density is varying
in very small range (6 to 8 for 100 nodes).

Figure.4

5. Simulation Results
We have used J-Sim [8] for the simulation purpose.
It is an open-source, component-based network
simulation environment, developed in Java. It provides
No. of brokers

a platform-independent, extensible, and reusable


environment. It is a dual-language simulation
environment, in which classes are written in java and
network topologies are built in Tcl.
We modified the source code of ‘Mac_802_11.java’ to
simulate our idea. A new hello-broker packet is
created which is sent by every node periodically. For
the simulation, 100 nodes are distributed randomly in
an area of 2000×2000 m2. Random-waypoint mobility
model and a free space propagation model are used.
The free space model basically represents the Figure.5
communication range as a circle around the presents the result of our second
Figure.6
transmitter. If a receiver is within the circle, it receives experiment which was performed to evaluate the effect
all packets. Each simulation was run for 120-200 of mobility speed on number of elected brokers. The
simulated minutes and results are averaged over 10 three graphs contained in this figure are for mobility
runs of every simulation. The default simulation speed of 10 m/s, 20 m/s and 50 m/s. The number of
settings are given in Table.1. brokers for mobility speed of 10 m/s and 20 m/s is

364
relatively stable as compared to the mobility speed of approach. It is evident from the curves that our scheme
50 m/s. It implies that as the speed goes beyond certain outperforms the pure gossip approach in the context of
threshold limits, number of elected brokers become message delivery.
unstable.
Broker selection 4. Related work
14

12
In this section we summarize some of the most
relevant work in the area of publish/subscribe systems
10 for mobile ad hoc networks. The related research
No. of brokers

efforts can be divided in two approaches; deterministic


8 approaches, based on overlay structures and
6
probabilistic approaches, based on gossip or flooding.
In [9] and [10] algorithms for building and maintaining
4 a tree based event routing structure for a MANET on
10 m/s
20 m/s
the top of a transport protocol are presented. Further,
2 authors have considered every node as a broker and
50 m/s

0
then examined the effects of broker mobility on the
0 50 100 150 network performance. In [9], an algorithm for restoring
Tim e elapsed (in m inutes) the event routing tables is presented, after a
disconnection in a tree topology of overlay network. It
Figure.6
is difficult to achieve robustness by this approach since
a single link failure partitions the tree. In [10], a
distributed protocol to construct optimized
publish/subscribe trees in ad-hoc wireless networks is
% of message delivery

presented. In this protocol, many publish/subscribe


trees which are rooted at a publisher node, are
maintained for different publication patterns. This
approach assumes a relatively stable environment with
occasional reconfigurations followed by periods of
stability. Again, even for moderate continuous
mobility, this approach is not suitable due to high
overhead involved in maintenance of trees. One more
problem related with overlay based approaches is the
mismatch between the assumed static topology of
Figure.7 overlay and the dynamic physical topology of
To have a comparative evaluation of our algorithm, MANET. This mismatch results in inefficient event
we have also simulated a variant of gossip protocol delivery. Recently, some probabilistic approaches [11],
which represents a simplest strategy for [12], [13] are presented. [11] [12] are structure less
publish/subscribe message dissemination. In this approaches and they do not maintain any deterministic
variant protocol, a broker is running on every node and topology over MANET. Event routing in these
delivers the published message to its 1-hop neighbor approaches is based on gossip or flooding among the
with a forwarding probability p (0p1). We have nodes where each node acts as a broker. These
taken p=1 when a node is playing the broker role as approaches are unable to provide guaranteed delivery
well as the publisher role for a certain message. of messages. Further, due to redundant forwarding of
Otherwise, the message is re-forwarded by receiving same messages these may lead to broadcast storm
brokers with p=0.5. In our simulation, published event problem. In [13], a special type of informed gossip is
and subscriptions are represented as randomly used where each node works as a broker and nodes
generated 5 digit numbers. A published message autonomously decide about forwarding messages,
matches the subscription if it contains the number based on its estimated distance from the closest node
specified by subscription. Results of this experiment interested in the content of the message.
are shown in Figure.7. Y-axis in this figure represents We have taken a mid-way approach between
the percentage of message delivery and X-axis deterministic and probabilistic approaches. We
represents the time elapsed in minutes. The lighter dynamically elect a subset of nodes in role of brokers
curve is for variant protocol and darker curve is for our

365
and then use controlled flooding of published [9] G. P. Picco, G. Cugola, and A. L. Murphy. Efficient
messages for event dissemination among them. In our content-based event dispatching in presence of toplogical
approach, overlay among elected brokers is not reconfiguration. In International Conference on Distributed
maintained. Broker nodes are just aware of their Computing Systems (ICDCS), 2003
[10] Y. Huang and H. Garcia-Molina, “Publish/subscribe tree
neighbor brokers and they do not have a complete construction in wireless ad-hoc networks,” In Proc. of the 4th
view of broker topology. A different form of content- Int. Conf. on Mobile Data Management (MDM), 2003.
based publish-subscribe is proposed in [14], where the [11] P. Costa, M. Migliavacca, G. P. Picco, and G. Cugola,
authors describe mechanisms, to reconfigure an “Epidemic algorithms for reliable content-based publish-
overlay network according to the changes in the subscribe: An evaluation”, In Proc. of the 24th Int. Conf. on
physical topology and to the current brokers’ load. Distributed Computing Systems (ICDCS), 2004.
Unlike our solution, each broker must be provided [12] P. Costa and G. P. Picco. Semi-probabilistic content
with a global view of the system and for this a GSR based publishsubscribe. In Proc. of the 25th Int. Conf. on
(global state routing) protocol is used. Further, the Distributed Computing Systems (ICDCS), 2005.
[13] R. Baldoni, R. Beraldi, G. Cugola, M. Migliavacca, and
authors do not provide any mechanism for selection of
L. Querzoni, “Structure-less content-based routing in mobile
brokers; they simply assume the existence of some ad hoc networks,” In Proc. of the IEEE Int. Conf. on
broker nodes in MANET which may lead to a situation Pervasive Services, 2005
where non-broker nodes can not find any nearby [14] Y. Chen and K. Schwan, “Opportunistic overlays:
broker to interact with. Efficient content delivery in mobile ad hoc networks”, In
Proc. of the 5th Int. Middleware Conf., 2005.
5. Conclusion [15]G. Cugola, D. Frey, A. Murphy, and G. P. Picco,
“Minimizing the reconfiguration overhead in content-based
publish-subscribe”, In Proc. of the 19th ACM Symp. on
We have presented an algorithm for dynamic Applied Computing (SAC), 2004.
selection of brokers, and formation of their network for [16] G. Cugola and G.P. Picco, “REDS: A reconfigurable
the successful dissemination of published and dispatching system”, In Proc. of the 6th Int. Workshop on
subscribed messages. Our proposed scheme combines Software Engineering and Middleware (SEM), 2006.
the benefits of deterministic approaches, based on [17] P. Eugster, P. Felber, R. Guerraoui, and A.-M.
overlay structures and probabilistic approaches based Kermarrec, “The many faces of publish/subscribe,” ACM
on gossip or flooding. Simulation results show that our Computing Surveys, 2(35), June 2003.
approach can establish a broker network with fairly [18] Y. Huang and H. Garcia-Molina, “Publish/subscribe in a
mobile environment,” In Proc. of the 2nd ACM Int.
less number of brokers and provide efficient message
Workshop on Data engineering for Wireless and Mobile
delivery. access (MOBIDE), 2001.

6. References

[1] Castro, M., Druschel, P., Kermarrec, A., Rowston, A.:


Scribe: A large-scale and decentralized application level
multicast infrastructure. IEEE Journal on Selected Areas in
Communications 20 (October 2002)
[2] Carzaniga, A., Rosenblum, D., Wolf, A.: Design and
Evaluation of a Wide-Area Notification Service. ACM
Transactions on Computer Systems 3 (Aug 2001) 332–383
[3]Gryphon: http://www.research.ibm.com/gryphon/
[4]SIENA :http://www.cs.colorado.edu/users/carzanig/siena/
[5] M. Gerla and J.T.-C.Tsai, “Multicluster, mobile,
multimedia radio network”, ACM/Baltzer Journal of
Wireless Networks, Vol. 1, No. 3, 1995, pp. 244-265.
[6] C.R. Lin and M. Gerla, “Adaptive Clustering for Mobile
Wireless Networks”, IEEE Journal on Selected Areas in
Communications, Vol. 15, No. 7, Sep 1997, pp. 1265-1275.
[7] Chatterjee, M., Das, S., Turgut, D, “WCA: A weighted
clustering algorithm for mobile ad hoc networks”, Journal of
Cluster Computing (Special Issue on Mobile Ad hoc
Networks) 5 (2002) 193–204.
[8] J-Sim web page: http://www.j-sim.org/.

366
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Comparing the Conceptual Graphs Extracted from Patent Claims

Shih-Yao Yang and Von-Wun Soo


Department of Computer Science, National Tsing Hua University, HsinChu 300, Taiwan, ROC
yao@cs.nthu.edu.tw, soo@cs.nthu.edu.tw

Abstract triple mapping. For the Ambiguous Mapping, the


A patent claim defines the protection of the relaxation labeling may allow a triple to match with
invention. It is usually very time consuming and more than one triple at the convergence so that the
laborious to manually conduct analysis on patents in unique matching fails. In this paper, we modify the
any domain of interest. A maximal common edge support function of relaxation labeling that can filter
subgraph (MCES) is a subgraph consisting of the out the noisy evidence and recursively run the
largest number of edges common between two graphs relaxation procedure by anchoring triples with
G and G’. This paper automatically compares ambiguous mapping.
conceptual graphs, extracted from patent claims by an
NLP parser, using anchored relaxation labeling. 2. Conceptual Graphs

1. Introduction A conceptual graph is a kind of formal knowledge


representation in terms of semantic networks and
Graph matching has been an important technique existential graphs [6]. “In a conceptual graph, concept
applied in many research communities such as nodes can represent entities, attributes, states, and
ontology matching [2], computer vision [3], molecular events, while relation nodes can show how the
structures identification [7], question answering [10], concepts are interconnected” [9]. A conceptual graph
and XML schema matching [11]. If objects are is related to a background support in terms of specific
represented in graphs, then similarity comparison domain ontology. This background support can be
between the graphs is equivalent to the similarity formalized as:
comparison between objects. We consider conceptual 1. Tc, is a concept hierarchy that has a partial
graphs [9] since they are general knowledge ordering operator ≤ over concepts. For concept
representations that provide more expressiveness than labels A, B, and C in the ontology, it has the
traditional modeling languages [5]. A conceptual graph property: if A ≤ B and B ≤ C, then A ≤ C.
can also be converted to first-order-logic formula [1] in 2. Tr, is a relational hierarchy that is disjoined from
a way that further semantic inferences can be carried the concept hierarchy. Each relation links with 2
out. or more concepts.
In this paper, we propose an anchored relaxation 3. Ts, is a set of star graphs. Each star graph
labeling method, we call it ARL hereafter, for finding consists of a relation and a set of concepts to
maximal common edge subgraph. A maximal common which the relation can link. The number of
edge subgraph is a subgraph consisting of the largest concepts a relation r can link to is the degree of r.
number of edges common in both G and G’. The idea 4. M, is a set of individual markers for concept
of relaxation labeling [8] for graph matching is to vertices that refer to specific entities and 1
match two triples between two graphs according to generic marker “*” refers to an unspecified
their corresponding neighbor coherency. A triple is an entity.
ordered 3-tuple elements (vi, el, vj) where el is the edge
label linking vi with vj. But there are two problems in A conceptual graph G = (R, C, U, lab) is a bipartite
relaxation labeling for graph matching: (1) Noisy graph where R represents the relation vertices (r-
Evidence; and (2) Ambiguous Mapping. If a triple t vertices) and C represents concept vertices (c-vertices).
connects with many noisy neighbors, then it can A concept vertex is typically denoted as a rectangle
receive erroneous evidence so as to result in wrong while a relation is denoted as a circle. For instance, the
concept “A cat is on the mat” is represented by a

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 394


DOI 10.1109/SUTC.2008.87
conceptual graph with a relational vertex “On” that Applying the above definitions to the relaxation
relates two concept vertices “Cat” and “Mat” labeling, we obtain a formula (3.2).
respectively (Figure 1). Notably, U is the set of edges PASj (λl )q AS j (λl )
and for each r ∈ R, the set of edges adjacent to r is PS +1
Aj (λl ) = m
(3.2)
totally ordered. If c = Gi(R), c is said to be the ith
neighbor of r. Each vertex has a label defined in the ∑P
k =1
S
Ai (λk )q (λk )
S
Ai

mapping lab. If r ∈ R, then lab(r) ∈ Tr; if c ∈ C, then


The PA (λl ) is the probability of the object Ai
S +1
lab(c) ∈ Tc × M ∪ {*}. j

being labeled as λl after iteration (S+1) that is based


P S (λ )
on the probability A j k and the support qAi (λl ) ,
Figure 1. An Example of Conceptual Graph S

3. Relaxation Labeling for Conceptual the evidence it has received from the labels of the
neighboring objects at iteration S. The denominator is
Graph Matching a normalization factor. The relaxation labeling formula
iteratively compute the support value of a label for
3.1. Relaxation Labeling each object and update the probability of each object
until it converges where all the probabilities are
Relaxation labeling has been applied to image stabilized.
processing and computer vision that has been
described in details in [8]. The major idea of the 3.2. Conceptual Graph Matching
method is that a label of an image object is influenced
by the features of the object’s neighborhood in the Definition 1. (Triple) Given a conceptual graph G, a
image. The features are usually points, edges and triple is an ordered 3-tuple elements (r-vertex, el, c-
surfaces in image processing. It is a technique that vertex) where el is the edge label linking r-vertex with
involves the propagation of local information gradually c-vertex.
via an iterative process in order to reach a global
coherent interpretation The purpose of relaxation To apply the relaxation labeling to maximal common
labeling in image processing problem is to assign a edge conceptual subgraph matching from G to G’, the
label to each image object so that it leads to an triples of G are considered as a set of objects while
appropriate image interpretation in the sense that the those of G’ are considered as a set of labels. For each
total evidence is compatible with all contextual
information conveyed. In a labeling problem, one is triple ti ∈ G, the relaxation labeling process attempts
given: to find a triple t j
' ∈ G’ that can have the maximal final
- A set of objects A = { A1 , A2 ,..., An };
'
- A set of labels for each object = { λ1 , λ2 ,..., λ m }; posterior probability while the number of t j can be
- A set of neighbor relations ov er the objects; >1.
- A set of constraint relations over labels for each
pair (or n-tuples) of neighboring objects.
t 'j = arg max Pt
final '
(tk ) (3.3)
' ' i
t k ∈G
Definition 2. (Isolated Triple) In a maximal common
In relaxation labeling, a compatibility coefficient
edge conceptual subgraph matching between G and G’,
r ( Ai = λl , A j = λk ) represents that contextual information
for each triple ti ∈ G, if ∀t 'j ∈ G ' → Pt final (t 'j ) = 0 ,
conveyed by assigning label λk to object A j and i

then ti is called an Isolated Triple.


assigning label λl to object Ai . Let the probability
that vertex A j has label λ k at iteration S be PAS (λk ) . Definition 3. (Mapping Set) In a maximal common
j

Then the support for labeling λl to vertex Ai by edge conceptual subgraph matching between G and G’,
vertex A j at iteration S can be defined as for each triple ti ∈ G, if ti is not an Isolated Triple,
final
m then arg max Pti (t k' ) the set of is the Mapping
q (λl ) = ∑r( Ai = λl , Aj = λk )P (λk )
S
Ai
S
Aj (3.1)
' '
tk ∈G
k =1
Set of ti .

395
The concept compatibility coefficient pij (Ta , Tb' )
Definition 4. (Decidable Triple vs. Ambiguous Triple)
In a maximal common edge conceptual subgraph represents how well the Ta matches with Tb as shown
matching between G and G’, for each triple ti ∈ G, if in formula 3.5.
ti is not an Isolated Triple and the Mapping Set of ti ⎧1, E quality (Ta , Tb' )
p ij (Ta , Tb' ) = ⎨ (3.5)
consists of only one single element, then ti is a ⎩0, otherwise
3.4. The conceptual graph matching Algorithm
Decidable Triple, otherwise it is an Ambiguous Triple
that have more than one possible mapping in G’.
Definition 7. (Anchored Pair) In the relaxation
labeling procedure for maximal common edge
Definition 5. (Neighbor of Triples) For each triple
conceptual subgraph matching between G and G’, if
Ta(vr,el,vc), the Neighbor of Triples of Ta is the set of
NTS={(va,e,vb) ∈ Triple | va=vr or vh=vc} ∀ti ∈ G, ∀t 'j ∈ G ' , ( ti , t 'j ) is an Anchored Pair

⇒ Pt ( t j ) = 1 and Pt ( t q ) = Pt ( t j ) = 0
s ' s ' s '
i i p
Definition 6. (Equality of two Triples) In a maximal
for ∀ p ≠ i , ∀ q ≠ j , S ≥ 0
common edge conceptual subgraph matching between
G and G’, if Triple1( vi - elip - v p ) ∈ G and The anchored pair allows the algorithm to anchor a
Triple2( v - el
'
j
'
jq
'
-v )
q ∈ G’, Triple1 and Triple2 are particular pair ( ti , t 'j ) of triple mapping by fixing their
mapping probability at one at every iteration. Figure 1
equal ⇔
' '
lab( vi ) = lab( v j ), elip = el jq , and
shows an algorithm of conceptual graph matching
' between G and G’. The initial prior probabilities for
lab( v p ) = lab( vq ).
each pair of triples Pt ( 0 ) (t `j ) is 1 if Equality( ti , t 'j ) is
i

3.3. Noise Evidence vs. Support Function true (line 5). The relaxation labeling cycle (lines 7–15)
are repeated until the difference between prior
In the graph matching, the evidence for two Triples probability Pt ( s −1) (t `j ) and posterior probability
i

ti and t 'j , where ti ∈ G and t 'j ∈ G’, are the set of Pti( s ) (t `j ) becomes smaller than a certain threshold. If
their Neighbor of Triples, NTSi and NTS’j, there is no Ambiguous Triple at the convergence of
respectively. For each Ta ∈ NTSi, there may be more the relaxation labeling, the algorithm adds ( ti , t 'j ) to
than one Tb' ∈ NTS 'j that can make Ta and Tb’ equal
the triple_mapping for each ti in G where
and vice versa. For two equal triples, they may connect
different number of Neighbor of Triples. To resolve t = arg max
'
j
s '
Pt i (t ) (lines 24–27). If there are more
k
' '
t k ∈G
the Noisy Evidence, we modify the support function so
that it only receives the evidence that has the maximal than one Ambiguous Triple at the convergence step of
influence from all Equality pairs in NTS. The support relaxation labeling, the system will select an
function for the pair of triples of ti and t `j receives Ambiguous Triple and iteratively calls the anchored
relaxation labeling procedure before deciding each ( ti ,
the evidence that has the maximal influence from all
Equality pairs of NTS is shown in formula 3.4. t 'j ) as an Anchored Pair where t 'j ∈ MappingSet( ti )
Notable, pt( S ) (t q` ) is the matching probability between (lines 17–21). After the exit of the relaxation labeling
p

` cycle, the algorithm adds the solutions of the current


t p and t in the iteration S.
q triple_mapping to MCES_Solutions and reserves the
solution with maximal number of edge (lines 29–30),
qt(iS ) (t `j ) = ∑ arg max p ij (Ta , Tb' ) * pt(pS ) (t q` ) where MCES_Solutions stores the updated maximal
∀ distinct Ta x Tb' ∈ Ta x Tb'
common edge subgraph.
NTSti x NTS '
tj
and Equality (Ta , Tb )

(3.4)

1. AnchoredPairs = Φ ;
2. MCES_Solutions = Φ ;

396
3. ARL(G, G’, AnchoredPairs, MCES_Solutions)
4. {
5. InitialPriorProbability();//initial prior probabilities of Pt ( 0 ) (t `j )
i

6. S= 1; // S is the iteration
7. while(true){
8. // update the matching probability between triples
9. for (each triple ti in G)
10. for (each triple tj’ G’)
11. Evaluate Pt ( S ) (t `j ) ;
i

12. if (stop_condition() == true)


13. break;
14. S++;
15. }
16. if (#AmbiguousTriple>0){
17. ta = select an Ambiguous Triple;
18. for each tb’ ∈ MappingSet(ta){
19. add (ta, tb’) to AnchoredPairs;
20. Call ARL(G, G’, AnchoredPairs, MCES_Solutions);
21 remove (ta, tb’) from AnchoredPairs;
22. }
23. }else{
24. triple_mapping = Φ ;
25. for (each triple ti in CG){
26. t 'j = arg max Pt is (tk' ) ;
t k ∈G
' '

27. add (ti, tj’) to triple_mapping;


28. }
29. Add the solutions of current triple_mapping to MCES_Solution and
30. reserve the solutions with maximal number of edges;
31.
32. }

Figure 2. An example of graph matching with more than one solution

Figure 2 is an example of graph matching from G to iteratively re-run the relaxation labeling again by
G’ with more than one solutions. There are 4 mapping adding (f-1-g,i-1-h) and (f-1-g,m-1-n) to
solutions, {(b-1-a, b-2-c) to (m-1-n, m-2-l)}, {(f-1-g, f- AnchoredPairs where “i-1-h” and “m-1-n” are found in
2-e) to (i-1-h, i-2-j)}, {(b-1-a, b-2-c) to (i-1-h, i-2-j)} the MappingSet(f-1-g) as shown in Figure 5 and
and {(f-1-g, f-2-e) to (m-1-n, m-2-l)}. Figure 3 is the Figure 6.
prior probabilities for the vertices of G and G’. After
the first convergence of relaxation labeling as shown in
Figure 4, 4 triples (b-1-a), (b-2-c), (f-1-g), and (f-2-e)
are Ambiguous Triples, the algorithm selects the
Ambiguous Triples “f-1-g” as an anchoring triple and

397
Figure 3. Initial prior probabilities convergence again, the algorithm finds two solutions,
{(b-1-a, b-2-c) to (i-1-h, i-2-j)} and {(f-1-g, f-2-e) to
(m-1-n, m-2-l)}

4. Compare patent claims based on


extracted conceptual graphs
In this section, we apply the proposed method to
Figure 4. After the convergence of relaxation labeling,
compare conceptual graphs extracted from patent
4 triples (b-1-a), (b-2-c), (f-1-g), and (f-2-e) are
claims. The method of extracting patent claims is
Ambiguous Triples
based on the [12] that extracting a conceptual graph
from a patent claim based on the dependency tree
generated from the Stanford NLP Parser [4]. Table 1
shows the dependency relations between a head and its
dependent and their corresponding conceptual graph in
terms of their dependency labels, “nsubj”, “dobj”,
“advmod” and “amod”. For the dependency label
“advmod”, because a “verb” may have more than one
Figure 5. Add (f-1-g, i-1-h) to AnchoredPairs and re- manner, the system creates a concept “MannerHead”
run the relaxation labeling procedure. After the to link the manners of a verb.
convergence again, the algorithm finds two solutions, As in Figure 7, we illustrate part of a maximal
{(b-1-a, b-2-c) to (m-1-n, m-2-l)} and {(f-1-g, f-2-e) to common edge subgraph between US patents 6045439
(i-1-h, i-2-j)} and 5893796 in the CMP (chemical mechanism
polishing) domain. All programs were implemented on
java platform and run on an Intel Xeon 3.2G, equipped
with 1 G of RAM. There are total 67 and 69 vertices
respectively in their corresponding conceptual graphs.
The number of maximal common vertices between
them is 57. The execution time is 23,344 msec (10-3sec)
and the number of backtracking is 38.
Figure 6. Add (f-1-g, m-1-n) to AnchoredPairs and re-
run the relaxation labeling procedure. After the
Table 1. The relation between a dependent and its head and their corresponding conceptual graph
Dependency The relation between a dependent and its head Conceptual graph
nsubj A subject depends on a verb and the conceptual class is
(relation, concept)

dobj A direct object depends on a verb and the conceptual


class is (relation, concept)
amod An adjective depends on a noun, the conceptual class is
(concept_1, concept_2). The system creates a
Auxiliary relation “HasAtt” to link the two concepts.
Advmod 1. An adverb depends on an adjective and the
conceptual class is (concept_1, concept_2). The
system creates a Auxiliary relation “HasAtt” to
link the two concepts.
2. An adverb depends on a verb and the conceptual
class is (concept, relation). The system creates a
concept “MannerHead” to link the relation “Verb”
and creates a Auxiliary relation “HasAtt” to link
“MannerHead” with the concept “Adverb”.

398
Figure 7. Part of maximal common edge subgraph between patent 6045439 and 5893796
5. Conclusion and Future Work [3] H. Kalviainen, E. Oja, “Comparisons of Attributed Graph
Matching Algorithms for Computer Vision”, STeP-90
Finnish Artificial Intelligence Symposium, University of Oulu,
In this paper, we have proposed a maximal common 1990, pp. 354-368.
edge subgraph algorithm based on the relaxation [4] M. Marneffe, B. MacCartney, amd D.C. Manning,
labeling. There are two problems of traditional “Generating Typed De-pendency Parses from Phrase
relaxation labeling: (1) Ambiguity Mapping; and (2) Structure Parses”, Inter-national Conference On Lerc
Noisy Evidence. To resolve the Ambiguity Mapping, Language Resources and Evaluation, 2006
at each final convergence of relaxation procedure, the [5] G. Mineau, R. Missaoui, R. Godinx, “Conceptual
algorithm iteratively anchors two triples and modeling for data and knowledge management”, Data &
recursively run relaxation procedure. To resolve the Knowledge Engineering, 33, 2000, pp. 137 – 168.
Noisy Evidence, we modified the support function in [6] C.S. Peirce, Reasoning and the Logic of Things. The
Cambridge Conferences Lectures of 1898. Ed. by K. L.
relaxation labeling so that it allowed only receiving the
Kremer, Harvard Univ. Press, Cambridge, 1992
maximal influence evidence for each set of same [7] J. Raymond, P. Willen, “Maximum common subgraph
evidence. However, it may also increase the elements isomorphism algorithms for the matching of chemical
in the Mapping Set of an Ambiguous Triple. Although structures”, Journal of Computer-Aided Molecular Design,
we have applied to conceptual graphs, the anchored 16, 2002, pp. 521-533.
relaxation labeling framework is a general one and can [8] R. Rosenfeld, R. Hummel, S. Zucker, “Scene labeling by
be applied to other types of graph matching as well. In relaxation operations”, IEEE Trans. Systems Man Cybernet,
the future, we will utilize the proposed method to 6, 1976, pp. 420–433.
facilitate patent processing such as patent retrieval, [9] J.F. Sowa, Conceptual Structures: Information
Processing in Mind and Machine, Addison-Wesley, 1984.
patent cluster, patent comparison and patent
[10] D. Williams, J. Huan, W. Wang, “Graph Database
summarization to help more profound judgment of Indexing Using Structured Graph Decomposition”,
patent infringement. Proceedings of the 23rd IEEE International Conference on
Data Engineering, 2007.
10. References [11] S. Yi, B. Huang, W.T. Chan, “XML Application
Schema Matching Using Similarity Measure and Relaxation
[1] G. Amati, I. Ounis, “Conceptual Graphs and First Order Labeling”, Information Sciences, 169, 2005, pp. 27 – 46.
Logic”, The Computer Journal, 43, 2000, pp.1-12. [12] S.Y. Yang and V.W. Soo, “Extract Conceptual Graphs
[2] F. Giunchiglia, M. Yatskevich, P. Shvaiko, “Semantic from Plain Texts in Patent Claims”, The 6th Mexican
Matching: Algorithms and Implementation”, Journal on International Conference on Artificial Intelligence,
Data Semantics, 2007, pp. 1-38. November 4-10, 2007 (Post Presentation).

399
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

A Hilbert Curve-Based Distributed Index for Window Queries in Wireless Data


Broadcast Systems

Jun-Hong Shen and Ye-In Chang


Dept. of Computer Science and Engineering
National Sun Yat-Sen University
Kaohsiung, Taiwan, R.O.C.
shenjh@cse.nsysu.edu.tw changyi@cse.nsysu.edu.tw

Abstract environment. Moreover, because of the power constraint of


mobile devices, an important challenge is to provide effi-
Location-dependent spatial query in the wireless envi- cient indexing and searching mechanisms for energy effi-
ronment is that mobile users query the spatial objects de- cient querying of LDSQs [3].
pendent on their current location. The window query is one LDSQs include window queries, nearest-neighbor (NN)
of the essential spatial queries, which finds spatial objects queries, and -nearest-neighbor ( NN) queries. Window
located within a given window. In this paper, we propose queries find data items that are located within a given win-
a Hilbert curve-based distributed index for window queries dow, which is a rectangle in a 2-dimensional space [8]. NN
in the wireless data broadcast systems. Our proposed al- queries return only one data item in the spatial space closest
gorithm allocates spatial objects in the Hilbert-curve order to a given query point. NN queries return data items in
to preserve the spatial locality. Moreover, to quickly an- the spatial space closest to a given query point [8]. Among
swer window queries, our proposed algorithm utilizes the them, the window query, one of the essential spatial queries,
neighbor-link index, which has knowledge about neighbor is very useful for spatial selection. In this paper, we focus
objects, to return the answered objects. From our experi- on the window query.
mental study, we have shown that our proposed algorithm For a file being broadcast on a channel, the following two
outperforms the distributed spatial index. parameters are of concern [6]: (1) Access time: The average
time elapsed from the moment a client wants a record iden-
Keywords: Location-dependent spatial query, power tified by a primary key, to the point when the required record
constraint, space-filling curve, spatial index, wireless data is downloaded by the client. (2) Tuning time: The amount
broadcast. of time spent by a client listening to the channel. This will
determine the power consumed by the client to retrieve the
1. Introduction required data. Since battery power is a scarce resource in
mobile devices, it is crucial for saving energy consumption
of the devices during the query process. Therefore, in this
Recently, location-dependent spatial query (LDSQ) on
paper, the main concern is to reduce the tuning time.
the wireless data broadcast is a new concerned issue in the
wireless environment. LDSQ in the wireless environment In the literature, there has been much work providing in-
is that mobile users query the spatial objects dependent on dex structures to support the efficient access on LDSQs on
their current location. Examples of LDSQs include query- the wireless data broadcast. In [9], Zheng et al. proposed
ing local traffic reports and the nearest restaurants with re- the grid-partition index to support NN queries. The studies
spect to user’s current location [7]. Because of its high scal- in [3, 7] are specified for NN queries. The Hilbert curve
ability, wireless data broadcast is an efficient way to dis- index [8] provides an index structure to support window
seminate data to a large number of mobile users. There- queries and NN queries. In [5], Lee and Zheng proposed
fore, wireless data broadcast is particularly suitable for pro- the distributed spatial index (DSI) to improve the perfor-
viding spatial objects for a tremendous number of mobile mance of the Hilbert curve index.
users. Since mobile users may move (mobility), many exist- Among the above work, DSI [5] can provide a good per-
ing techniques for processing spatial objects and queries in formance for window queries on the wireless data broad-
the tradition spatial databases may not fit with the wireless cast. However, this work does not utilize the property that

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 367


DOI 10.1109/SUTC.2008.13
5 6 9 10
the answered objects for window queries may be the neigh- 1 2
bor of each other to further reduce the tuning time. There- 4 7 8 11
fore, in this paper, we propose a Hilbert curve-based dis- 3 2 13 12
tributed index (HCDI) to support window queries by us-
0 3 0 1 14 15
ing the mentioned property. In our experimental results, we
have shown that our proposed algorithm can perform better (a) (b)
than the distributed spatial index. 21 22 25 26 37 38 41 42
The rest of this paper is organized as follows. In Sec- 20 23 24 27 36 39 40 43
tion 2, we give a brief description of the Hilbert curve. In
19 18 29 28 35 34 45 44
Section 3, we present our proposed Hilbert curve-based dis-
tributed index. In Section 4, we study the performance of 16 17 30 31 32 33 46 47
the proposed algorithm, and make a comparison with the
15 12 11 10 53 52 51 48
distributed spatial index by simulation. Finally, a conclu-
sion is presented in Section 5. 14 13 8 9 54 55 50 49

1 2 7 6 57 56 61 62
2. Background 0 3 4 5 58 59 60 63
(c)
In the wireless environments, a broadcast cycle consists
of a collection of data items, which are cyclically broadcast Figure 1. The Hilbert curve: (a) order 1; (b)
on the wireless channel. Mobile clients listen to the wireless order 2; (c) order 3.
channel to retrieve the data item of interest. In this section,
we briefly describe the Hilbert curve, one kind of space-
filling curves.
A space-filling curve is a continuous path which passes feasible [1]. These assumptions include:
through every point in a multi-dimensional space once to
1. Spatial objects are represented as a point in a two-
form a one-one correspondence between the coordinates of
dimensional space.
the points and the one-dimensional sequence numbers of
the points on the curve [4]. Some examples of space-filling 2. Spatial objects appear once in a broadcast cycle, i.e.,
curves are the Peano curve, the RBG curve and the Hilbert the uniform broadcast.
curve. Among them, the Hilbert curve can preserve the spa-
tial locality of points. The spatial locality means that points 3. A bucket is a logical transmission unit on a broadcast
that are close to each other in a multi-dimensional space are channel. Index tuples can be put into an index bucket
remained to close to each other in a one-dimensional space. and a spatial object can be put into a data bucket. The
Figure 1-(a) shows the Hilbert curve of order 1. The curve size of a data bucket is far larger than that of an index
can keep growing recursively by following the same rota- bucket, and the size of a data bucket is a multiple of
tion and reflection pattern at each point of the basic curve. that of an index bucket.
Figures 1-(b) and 1-(c) show the Hilbert curves of orders
2 and 3, respectively. In this paper, to preserve the spatial 4. Clients make no use of their upstream communications
locality, we allocate spatial objects to the broadcast channel capability.
in the ascending order of the Hilbert curve. 5. When a client switches to the public channel, it can
retrieve buckets immediately.
3. The Hilbert Curve-Based Distributed Index
6. The server broadcasts buckets over a reliably single
To provide an efficient way to process window queries in channel.
the wireless broadcast environments, we propose a Hilbert
curve-based distributed index (HCDI). In this section, we    
  
first state the assumptions of our proposed algorithm and
then present our proposed algorithm. For a window query of spatial objects, the answered ob-
jects may be the neighbors of each other. To provide an ef-
 
 ficient way to support the window query, we propose neigh-
bor links to guide clients to receive related objects. The
This paper focuses on the wireless environment. Some policy for adding neighbor links to the base unit is that
assumptions should be restricted in order to make our work the base unit has neighbor links pointing to the neighbor

368
units of the northern, southern, eastern, western, northeast-
ern, southeastern, northwestern and southwestern directions
that have the Hilbert-curve value greater than its one. The
efficient way to find these neighbor units of the current base
unit in the Hilbert curve can be found in [2]. In [2], Chen 21 22 25 26 37 38 41 42
and Chang presented a method to find the sequence num-
20 23 24 27 36 39 40 43
bers of the neighboring blocks next to the current base unit
based on its bit shuffling property in the Peano curve, and 19 18 29 28 35 34 45 44
the transformation rules between the Peano curve and the 16 17 30 31 32 33 46 47
Hilbert curve. That is, the sequence numbers of the neigh-
15 12 11 10 53 52 51 48
boring blocks next to the current base unit can be easily
found in the Peano curve, and then transformed to the ones 14 13 8 9 54 55 50 49
in the Hilbert curve. 1 2 7 6 57 56 61 62
0 3 4 5 58 59 60 63
   

 
 
: An object inside : No object inside
The proposed algorithm for efficiently processing win-
dow queries is proceeded as follows. Figure 2. A Hilbert curve of order 3
1. Allocate spatial objects in the ascending order of the
Hilbert curve of order .

2. Allocate one index bucket before each data bucket.


Each index bucket contains the neighbor-link index
and the local index. Let the objects covered by the
same Hilbert-curve value of order   as a group.
The allocation of the index bucket is processed as fol-

;;
;
lows. 21 22 25 26 37 38 41 42
20 23 24 27 36 39 40 43
5 6 9 10
(a) Add neighbor links (the neighbor-link index) to 19 18 29 28 35 34 45 44
the corresponding index buckets from base units 16 17 30 31 32 33 46 47
4 7 8 11
(blocks) of order   to those of order 1. 15 12 11 10 53 52 51 48
14 13 8 9 54 55 50 49
3 2 13 12
(b) Check if the index tuples of the neighbor-link in- 1 2 7 6 57 56 61 62
dex in the index buckets have the same pointer 0 3 4 5 58 59 60 63
0 1 14 15
(offset). If yes, remove the index tuples with the
short range. (a) (b)
(c) Add the local index, which has information about
the objects in the same group, to the correspond-
ing index buckets.
1 2
Now, we use an example to illustrate our proposed al-
gorithm. Figure 2 shows an example of the Hilbert curve
of order 3, where the gray block contains a spatial object
0 3
inside and the white one contains no object. In Step 1, we
allocate spatial objects to the one-dimensional space in the (c)
order of the Hilbert curve of order   .
Next, in Step 2, we allocate one index bucket before each Figure 3. Block levels: (a) order 1; (b) order 2;
data bucket. The objects covered by the same Hilbert-curve (c) order 3.
value of order     are considered as a group.
Block  of order   ,          , is covered
in block  of order . Take objects 8, 9 and 11 shown
in Figure 3-(a) for example. Their Hilbert-curve values are
covered by block      
   of order

369
Then, in Step 2-(a), to provide an efficient way to support the window query, neighbor links (the neighbor-link index) are allocated to the corresponding index buckets for each object in the group, from base blocks of order k-1 to those of order 1. Consider the group containing objects 8, 9 and 11 shown in Figure 3-(a) for example. The neighbor links of their corresponding block 2 of order 2 shown in Figure 3-(b) point to blocks 4, 7, and 8 of order 2, which contain objects inside. Moreover, the neighbor links of their corresponding block 0 (= ⌊2/4⌋) of order 1 shown in Figure 3-(c) point to blocks 1, 2 and 3 of order 1, which contain objects inside. The neighbor links of the corresponding blocks of order 1 and order 2 for the objects shown in Figure 2 are listed in Table 1. Note that in Table 1, the range in parentheses following each neighboring block indicates the Hilbert-curve values of order 3 of the objects covered by that neighboring block. In Figure 3, we can observe that blocks 3, 5, 9, 11, 13 and 14 of order 2 contain no object inside; therefore, Table 1 has no information about these blocks.

Table 1. Neighbor links

  Order   Start Block   Neighboring Blocks (Range)
  1       0             1 ([17,31]), 2 ([32,40]), 3 ([51,61])
  1       1             2 ([32,40]), 3 ([51,61])
  1       2             3 ([51,61])
  1       3             -
  2       1             2 ([8,11])
  2       2             4 ([17,17]), 7 ([28,31]), 8 ([32,32])
  2       4             6 ([27,27]), 7 ([28,31])
  2       6             7 ([28,31]), 8 ([32,32])
  2       7             8 ([32,32])
  2       8             10 ([40,40]), 12 ([51,51])
  2       10            -
  2       12            15 ([61,61])
  2       15            -
  * "-": no neighbor link

The index tuples of the index buckets are shown in Figure 4. Take index bucket I1 for object 8 shown in Figure 4 for example. From Figures 3-(a) and 3-(b), we get that object 8 is covered in block 2 of order 2. Moreover, in Table 1, we get the neighboring blocks of block 2 of order 2 and their corresponding ranges, 4 ([17,17]), 7 ([28,31]), and 8 ([32,32]). Therefore, in index bucket I1, the neighbor-link index has these three index tuples pointing to the index buckets for the first objects in these ranges, i.e., ([17,17], I4), ([28,31], I6), and ([32,32], I9), as shown in Figure 4. (Note that since the size of a data bucket is a multiple of that of an index bucket, it is easy to convert the pointer of an index tuple to an offset to the corresponding bucket.)

[Figure 4. The allocation of the Hilbert curve-based distributed index: index buckets I0-I12, each followed by a data bucket (numbered 1-13) holding one object; each index bucket carries its neighbor-link index and local index tuples.]
Furthermore, from Figures 3-(a), 3-(b) and 3-(c), we get that object 8 is contained in block 0 of order 1. In Table 1, we get the neighboring blocks of block 0 of order 1 and their corresponding ranges, 1 ([17,31]), 2 ([32,40]), 3 ([51,61]). As a result, in index bucket I1, the neighbor-link index has these three index tuples, ([17,31], I4), ([32,40], I9), and ([51,61], I11), shown in Figure 4.

In Step 2-(b), to save the space in the index bucket, we check if the index tuples of the neighbor-link index in the index buckets have the same pointer (offset). For the index tuples with the same pointer, since the index tuple with the wide range can contain more information about the neighboring objects than that with the short one, the latter one is removed from the index bucket. Take index tuples ([17,17], I4) and ([17,31], I4) in index bucket I1 for example. Range [17,31] of index tuple ([17,31], I4) is wider than range [17,17] of index tuple ([17,17], I4). Therefore, index tuple ([17,17], I4) is removed from index bucket I1. Moreover, index tuple ([32,32], I9) is removed from index bucket I1 for the same reason. The removed index tuples are the ones with a strikethrough line shown in Figure 4.

In Step 2-(c), we add the local index, which has information about the objects in the same group, to the corresponding index buckets. Take index bucket I1 for object 8 shown in Figure 4 for example. Since objects 8, 9, and 11 are in the same group, the local index in index bucket I1 has three index tuples pointing to the data buckets containing these objects, i.e., ([8,8], 2), ([9,9], 3), and ([11,11], 4), shown in Figure 4. Note that in Figure 4, when index bucket I2 is broadcast, bucket number 2 containing object 8 has already been broadcast; therefore, index tuple ([8,8], 2') in index bucket I2 points to bucket number 2 in the next cycle. The index tuples of the local index in the other index buckets are constructed in the same manner.

Each index bucket has an index tuple pointing to the beginning of the next cycle to retrieve objects that have already been broadcast. Each data bucket has an offset to direct clients to the nearest index bucket to start receiving the related buckets.
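Putting Steps 2-(a)-(c) together, one index bucket could be assembled as in the sketch below. This is our own illustration under stated assumptions, not the authors' code: `neighbors[(order, block)]` is a Table 1-style list of (range, first object) pairs, `index_bucket_of` and `data_bucket_of` map an object to the index bucket broadcast just before it and to its data bucket, and `group[obj]` lists the members of the object's group.

```python
def build_index_bucket(obj, k, neighbors, index_bucket_of, data_bucket_of, group):
    # Step 2-(a): one neighbor-link tuple per neighboring block, from the base
    # block of order k-1 down to order 1; each tuple points to the index bucket
    # of the first object in the neighboring block's range.
    tuples = []
    block = obj
    for order in range(k - 1, 0, -1):
        block //= 4                                   # covering block, one order coarser
        for rng, first_obj in neighbors.get((order, block), []):
            tuples.append((rng, index_bucket_of[first_obj]))
    # Step 2-(b): among tuples sharing the same pointer, keep only the widest range.
    widest = {}
    for rng, ptr in tuples:
        kept = widest.get(ptr)
        if kept is None or (rng[1] - rng[0]) > (kept[1] - kept[0]):
            widest[ptr] = rng
    neighbor_link_index = sorted((rng, ptr) for ptr, rng in widest.items())
    # Step 2-(c): the local index points at the data buckets of the objects in
    # the same group (those covered by the same order-(k-1) block).
    local_index = [((o, o), data_bucket_of[o]) for o in group[obj]]
    return neighbor_link_index, local_index
```

Fed with the Table 1 entries for block 2 of order 2 and block 0 of order 1, this yields exactly the pruned neighbor-link tuples kept in index bucket I1 of Figure 4.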
To process a window query with our proposed algorithm, all the segments along the Hilbert curve that are intersected with a given query window should be found [5]. The access protocol for window queries proceeds as follows.

1. Tune in to the broadcast channel to receive the current bucket to get the offsets to the nearest index bucket and the beginning of the next cycle, and then go into the doze mode.

2. Tune in to receive the bucket, a data bucket or an index bucket. If an index bucket is received, index tuples should be examined.

    (a) Check index tuples of the neighbor-link index, which have the shortest range covering one of the intersected segments, to get the offsets to the related index buckets.

    (b) Check index tuples of the local index to get the offsets to the related data buckets.

3. Determine the nearest offset of the related bucket to tune in, and then go into the doze mode to save power consumption.

4. Repeat Step 2 to Step 3 until all the intersected segments are checked.

Take the query window (the dash-line box) in Figure 2 for example. This query window can be divided into three segments covered with the Hilbert curve. Assume that the client first tunes in to the channel at the beginning of an index bucket in Figure 4. After checking this bucket, the client gets the offsets to the index buckets which have information about the intersected segments. Next, the client tunes in at the nearest related index bucket. By examining the neighbor-link index in this bucket, the client finds the tuple that has the shortest range covering one of the segments; since the range covering this segment in the previously recorded index bucket is longer than that in the newly found one, the offset to the former is replaced by that to the latter. At the same time, from the local index of this index bucket, the client gets the offset to object 11, which lies in one of the segments. After receiving object 11, the client reaches the next related index bucket. From the local index of this bucket, the client gets the offset to receive object 31, which lies in another segment. After that, the client reaches object 32 through index bucket I9. The client finally examines the index bucket for the last intersected segment and learns that there is no object in that segment. Up to this point, all the intersected segments have been checked and the query processing is terminated. The answered objects for this query are objects 11, 31, and 32.
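To make Step 2 concrete, here is a minimal sketch (our own illustration with invented names; "covering" is read loosely as overlapping) of how a client could derive the next offsets to tune in to from one received index bucket. The example data reproduce index bucket I1 of Figure 4 after pruning.

```python
def offsets_to_tune(neighbor_link, local_index, segments):
    """neighbor_link: list of ((lo, hi), index_bucket_offset) tuples.
    local_index:  list of ((lo, hi), data_bucket_offset) tuples.
    segments:     Hilbert-curve ranges intersected by the query window.
    Returns the offsets of the buckets worth tuning in to next."""
    offsets = set()
    for s_lo, s_hi in segments:
        # Step 2-(a): among neighbor-link tuples overlapping the segment,
        # follow the one with the shortest range.
        overlapping = [((lo, hi), off) for (lo, hi), off in neighbor_link
                       if lo <= s_hi and s_lo <= hi]
        if overlapping:
            offsets.add(min(overlapping, key=lambda t: t[0][1] - t[0][0])[1])
        # Step 2-(b): local-index tuples whose object lies inside the segment
        # point directly at the data buckets to receive.
        for (lo, hi), off in local_index:
            if s_lo <= lo and hi <= s_hi:
                offsets.add(off)
    return offsets

# Index bucket I1 of Figure 4 (after pruning), queried with a segment covering
# Hilbert values [28, 31]: the client is directed to index bucket I6.
i1_neighbor = [((17, 31), "I4"), ((28, 31), "I6"), ((32, 40), "I9"), ((51, 61), "I11")]
i1_local = [((8, 8), "bucket 2"), ((9, 9), "bucket 3"), ((11, 11), "bucket 4")]
print(offsets_to_tune(i1_neighbor, i1_local, [(28, 31)]))   # {'I6'}
```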

4. Performance

In this section, we study the performance of the proposed algorithm. We compare our proposed algorithm with the distributed spatial index, DSI [5]. DSI divides the spatial objects, which are allocated in the Hilbert-curve order, into frames. Each frame has an index table that maintains information about spatial objects which are exponentially away from the current frame.
In our simulation model, two integer numbers of one (1) byte are used to represent the two coordinates in a two-dimensional space, so that an integer number of 2 bytes is used to represent a Hilbert-curve value. Each spatial object occupies a fixed number of bytes. The search region of a window query is controlled by WinSideRatio, the ratio of the side length of a window query to that of the search space; for example, given WinSideRatio = 0.10 and a search-space side length of 256, the search region of a window query is a 26 (≈ 256 × 0.10) × 26 square. In our simulation, 10,000 points are uniformly generated in a square Euclidean space [5], and 10,000 queries are randomly issued. Therefore, our experimental results are the average of 10,000 queries. The average tuning time is measured in terms of bytes.

For the distributed spatial index (DSI), we set the number of objects in a frame and the exponential base for index tuples to 8 and 2, respectively. In this performance evaluation, we vary WinSideRatio from 0.10 to 0.30. Figure 5 shows the average tuning time. As the value of WinSideRatio increases, the average tuning time of both algorithms increases: a larger WinSideRatio means a larger search region, so more objects in the search region have to be examined, which lengthens the average tuning time. In Figure 5, we can also observe that the average tuning time of our proposed algorithm is shorter than that of DSI; our proposed algorithm has an average improvement of 36% on the average tuning time over DSI. This is because our proposed HCDI utilizes the neighbor-link index to reduce the number of tune-in buckets, resulting in the reduction of the tuning time.

[Figure 5. The average tuning time (in bytes) of HCDI and DSI as WinSideRatio varies from 0.10 to 0.30.]

5. Conclusions

In this paper, we have proposed a Hilbert curve-based distributed index with neighbor links for window queries over a single wireless broadcast channel. For a window query of spatial objects, the answered objects may be neighbors of each other. Our proposed algorithm utilizes neighbor links, which point to the neighbor objects of the current object, to efficiently process the window query. From our simulation results, we have shown that the proposed algorithm needs a shorter average tuning time than the distributed spatial index.

Acknowledgement

This research was supported in part by the National Science Council of the Republic of China under Grant No. NSC-95-2221-E-110-079-MY2 and by the "Aim for Top University Plan" project of NSYSU and the Ministry of Education, Taiwan.

References

[1] Y. I. Chang and C. N. Yang. A Complementary Approach to Data Broadcasting in Mobile Information Systems. Data and Knowledge Eng., 40(2):181-194, Feb. 2002.
[2] H. L. Chen and Y. I. Chang. Neighbor Finding Based on Space Filling Curves. Information Systems, 30(3):205-226, May 2005.
[3] B. Gedik, A. Singh, and L. Liu. Energy Efficient Exact kNN Search in Wireless Broadcast Environments. In Proc. of the 12th ACM Int. Workshop on Geographic Info. Systems, pages 137-146, 2004.
[4] J. K. Lawder and P. J. H. King. Querying Multi-Dimensional Data Indexed Using the Hilbert Space-Filling Curve. ACM SIGMOD Record, 30(1):19-24, March 2001.
[5] W. C. Lee and B. Zheng. DSI: A Fully Distributed Spatial Index for Location-Based Wireless Broadcast Services. In Proc. of the 25th IEEE Int. Conf. on Distributed Computing Systems, pages 349-358, 2005.
[6] J. H. Shen and Y. I. Chang. A Skewed Distributed Indexing for Skewed Access Patterns on the Wireless Broadcast. The Journal of Systems and Software, 80(5):711-723, May 2007.
[7] B. Zheng, W. C. Lee, and D. L. Lee. Search K Nearest Neighbors on Air. In Proc. of the 4th Int. Conf. on Mobile Data Management, pages 181-195, 2003.
[8] B. Zheng, W. C. Lee, and D. L. Lee. Spatial Queries in Wireless Broadcast Systems. Wireless Networks, 10(6):723-736, Nov. 2004.
[9] B. Zheng, J. Xu, W. C. Lee, and D. L. Lee. Grid-Partition Index: A Hybrid Approach to Nearest-Neighbor Queries in Wireless Location-Based Services. The VLDB Journal, 15(1):21-39, Jan. 2006.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

An Efficient Quorum-based Fault-Tolerant Approach for Mobility Agents in Wireless Mobile Networks

Yeong-Sheng Chen, Chien-Hsun Chen, Hua-Yin Fang
Department of Computer Science, National Taipei University of Education
yschen@tea.ntue.edu.tw

Abstract

In this paper, we propose an efficient quorum-based mechanism for the fault-tolerance problem in the Mobile IP protocol. In a network with N HAs, all the HAs are grouped into N quorums with a logical circular structure. Every HA has ⌈√N⌉ backup quorums and divides the network that it manages into ⌈√N⌉ equal parts. According to the home address of the MN, each HA finds out the backup quorum for the network segment and stores the mobility bindings of the MN in ⌈√N⌉ - 1 backup HAs. When a HA crashes, the system selects the HA with minimum load from each backup quorum to take over the bindings of the faulty HA. In comparison with previous related works, our experimental results show that the proposed fault-tolerant protocol has many advantages: better system load balance, less latency for the registration process, fewer system resource requirements, and no extra hardware cost.

Keywords: Mobile IP, Fault-Tolerant, Quorum, Mobility Agent.

1. Introduction

The Mobile IP protocol [1, 2] is a routing protocol proposed by the Internet Engineering Task Force (IETF) to support IP mobility. In the architecture of Mobile IP, there are two kinds of Mobility Agents (MAs): Home Agents (HAs) and Foreign Agents (FAs). The purpose of the MAs is to provide services without changing the IP address. A MN registers with its home agent when it first arrives at a Radio Access Network (RAN). When the Correspondent Node (CN) sends packets to the MN, these packets will be delivered to the MN's home network by using the traditional IP routing protocol. Nevertheless, when the MN moves from the home network to a foreign network, it will update its location by registering with its HA with a new temporary IP address called the Care-of Address (CoA), offered by the FA in the foreign network. The packets will be tunneled by the HA to the FA and then forwarded to the MN. Thus, a mobile IP network can provide portable devices with roaming capability and enables mobile users to maintain continuous data connectivity without interruption while changing locations.

The traditional architecture of a single MA cannot offer fault-tolerance services. Usually, a single MA may not be able to afford the services for a large number of MNs, and if a MA fails, all the MNs served by it cannot communicate with other CNs. To cope with these problems, multiple MAs are necessary. However, in a multiple-MA environment, data will be stored in distributed storage in the different MAs. The distributed data processing will cause some problems, such as registration delay, fault tolerance, load balance, etc. To offer solutions to these problems, many fault-tolerant approaches [3, 5, 7, 8, 11] have been proposed. The following section briefly reviews the related approaches.

2. Background and Related Works

2.1. System Architecture

The system model considered in this paper consists of two major components: the radio access network (RAN) and the core network. The RAN provides the transmission with the far network or the equipment across the air interface. Through the RAN, the MNs acquire radio resources for executing wireless data sessions. The core network provides packet-switching services and contains the HAs, the FAs and intermediate routers. With Mobile IP functionality, the HA and the FA offer the wireless data sessions described above, and the intermediate routers help the MAs forward the packets. Besides, there is an interconnection network between the RANs and the FAs. The MNs or the users send data requests to the core network through the interconnection network; on the other hand, the core network dispatches data reply packets to the MNs or the users through the interconnection network, too.

2.2. Existing Fault Tolerance Approaches

A mechanism with redundant MAs was presented by Ghosh and Varghese [11]. We call it "FTMIPP" for short. In that approach, redundant MAs back up the mobility bindings of the MNs, which are served by all MAs. When a MA receives the registration request message from the MN, it keeps a record of the message

and forwards the message to all the other MAs. At the each quorum (Qi, 1 ≤ i ≤ N) contains ⌈√N⌉ different
same time, before sending a registration reply message,
HAs and every HA belongs to ⌈√N⌉ different quorums.
the MA has to acquire every backup reply message from
Note that the intersection of any two quorums is
all other MAs. Although this scheme is able to reduce
nonempty.
the load of a single MA and tolerate the failure, it results
For example, consider a network environment with
in long registration time for the MNs and has low
8 (N = 8) HAs, which are numbered from 1 to 8 in order
resource utilization since all MAs have to store and
and are divided into groups according to the proposed
maintain the mobility binding for every MN.
scheme above. The construction of the N-ring system
Ahn and Hwang [8] proposed a scheme with a
with 8 HAs is shown as Figure 1. Due to N=8, there are
stable storage in each MA. They allocate a stable storage
8 HAs divided into 8 quorums. Each quorum (Qi,
for each MA and store the mobility bindings in the stable
storage of the MA. In case that a MA crashes, other 1 ≤ i ≤ 8) has ⌈√8⌉ = 3 HAs and every HA exactly
failure-free MA, which is in charge of the bindings of belongs to ⌈√8⌉ = 3 different quorums, such as HA1 ∈
the faulty MA, can acquire the bindings from the stable {Q1, Q8, Q7}. Then, we have the CQ-set = {Q1, Q2, Q3,
storage in the failed MA and also take over the MNs Q4, Q5, Q6, Q7, Q8}, where Q1 = {HA1, HA2, HA3}, Q2 =
recorded in these bindings. The scheme is able to shorten {HA2, HA3, HA4}, Q3 = {HA3, HA4, HA5}, and so on.
the registration time. However, a stable storage does not The CQ-set with 8 HAs is shown in Table 1.
seem to exist in the real world.
Lin and Arul [7] proposed a method with integrated
OA&M (Operation Administration and Maintenance)
[13] functions to detect the states of all devices in the
network. When OA&M discovers a failed MA, it will
select other failure-free MAs as backup members and
then pick a manager from these backup members. The
manager searches for the bindings of the MNs which are
influenced by the failed HA in each FA and requests
other backup members to maintain these bindings.
Although the MAs do not have to backup the bindings
beforehand, the scheme has to spend extra time on
searching and delivering the bindings and have extra
hardware cost of the management equipments.
Figure 1. The N-ring system with 8 HAs
3. The Proposed Approach
We propose an efficient cyclic-quorum-based
Table 1. An example of the CQ-set with 8 HAs
fault-tolerant protocol with distributed binding
management mechanism to solve the problems
mentioned above in Mobile IP networks.

3.1. Cyclic Quorum Scheme

Without loss of generality, we assume that there


are N HAs in a mobile network. We group the HAs into
a number of quorums. Our mechanism is based on a
circular numbering system with N HAs, which are
organized as a logical circle. Each HA in the system is
assigned a distinct number between 1 to N. The following definitions and theorems are used to
facilitate the description of our proposed mechanism.
Definition 1. A set system [9] C = {Q1, Q2,…, Qn}, 1 ≤ n,
is a collection of nonempty subsets Qi ⊆ U (1 ≤ i ≤ n) of a Definition 4. Primary Home Agent: A home agent that
finite universe U. takes charge of the binding of the registered MN is
called the primary home agent of the MN.
Definition 2. Each element Qi of C in Definition 1 is
called a quorum. For a particular MN with its primary home agent
HAk (1 ≤ k ≤ N), we further have the following
Definition 3. A cyclic quorum set CQ-set = {Qi | Qi = definitions.
{HAi, HA((i+1) mod N ), HA((i+2) mod N ), …, HA((i+d-1) mod N )},
1 ≤ i ≤ N}, where d = ⌈√N⌉; i and N are integers.
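A minimal sketch of this construction (our own code with invented names; for N = 8 it reproduces the CQ-set of the example above, and the second helper anticipates the backup quorum array BQk of Definition 7 below):

```python
from math import ceil, sqrt

def cyclic_quorums(n):
    d = ceil(sqrt(n))                      # each quorum has d = ceil(sqrt(N)) HAs
    # HAs are numbered 1..N on a logical ring; Qi starts at HAi.
    return {i: [((i - 1 + j) % n) + 1 for j in range(d)] for i in range(1, n + 1)}

def backup_quorum_array(k, n):
    """Definition 7: BQk = [Qk, Q((k-1) mod N), ..., Q((k-d+1) mod N)]."""
    d = ceil(sqrt(n))
    return [((k - 1 - j) % n) + 1 for j in range(d)]

quorums = cyclic_quorums(8)
print(quorums[1])                 # [1, 2, 3]  -> Q1 = {HA1, HA2, HA3}
print(quorums[8])                 # [8, 1, 2]  -> Q8 = {HA8, HA1, HA2}
print(backup_quorum_array(1, 8))  # [1, 8, 7]  -> BQ1 = [Q1, Q8, Q7]
```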
consists of HAk is called the backup quorum of the MN
Let the N HAs be divided into N quorums
according to Definition 3. Obviously, in this CQ-set,

Definition 6. Backup Quorum Set: The set that comprises all the ⌈√N⌉ backup quorums of the MN is called the backup quorum set of the MN or the backup quorum set of HAk.
Qk except for HAk, and the bindings of MNs in the second segment are backed up by the HAs in Q((k-1) mod N) except for HAk, …, and so forth.
Consider the above example again. In the network
with 8 HAs, for the primary home agent HA1, the backup
quorum set is {Q1, Q8, Q7}, where Q1 = {HA1, HA2,
Definition 7. Backup Quorum Array: The array that
HA3}, Q8 = {HA8, HA1, HA2}, Q7 = {HA7, HA8, HA1}.
contains all the ⌈√N⌉ backup quorums of HAk is called Assume that the address of the network segment that is
backup quorum array of HAk and is denoted as managed by HA1 is A.B.C.DX1~A.B.C.DX4. Then, it is
BQk[0..d-1] = [Qk, Q((k-1) mod N),…, Q((k-d+1) mod N)], d = ⌈√N⌉. divided into ⌈√8⌉ = 3 segments and the three segments
are A.B.C.DX1~A.B.C.DX2, A.B.C.DX2+1~A.B.C.DX3,
Theorem 1. For a primary home agent HAk (1 ≤ k ≤ N), and A.B.C.DX3+1~A.B.C.DX4 respectively. For each MN
its backup quorum set is {Qk, Q((k-1) mod N),…, Q((k-d+1) mod managed by HA1, if the MN's home address belongs to
A.B.C.DX1~A.B.C.DX2 segment, the MN's bindings will
N)}, d = ⌈√N⌉, and there are exactly d backup quorums in be stored and backuped in all the HAs of the backup
the backup quorum set. be stored and backuped in all the HAs of the backup
quorum BQ1[0] = Q1. And, if the MN’s home address
Definition 8. Backup Home Agent (Backup HA), belongs to A.B.C.DX1+1~A.B.C.DX3 segment, the MN’s
Backup Home Agent Set (Backup HA Set): bindings will be stored and backuped in all the HAs of
For a mobile node MN with its primary home agent HAk the backup quorum BQ1[1] = Q8. Further, if the MN’s
( 1 d k d N ), every HA (except the primary HAk) in the home address belongs to A.B.C.DX3+1~A.B.C.DX4
segment, the MN’s bindings will be stored and backuped
backup quorum of HAk is called the backup HA of the
in all the HAs of the backup quorum BQ1[2] = Q7.
MN or the backup HA of HAk. The set includes all
backup HAs in the backup quorum set of HAk is called
the backup HA set of HAk. 3.3. Maintenance and Backup of Bindings

Definition 9. Backup Home Agent Array: There are N In the conventional Mobile IP protocol, the mobility
binding consists of a MN’s home address, its CoA and
HAs in the mobile network. For each HAk (1 ≤ k ≤ N), the
registration lifetime. In our cyclic quorum-based
array contains the array of all backup HAs of the HAk
mechanism, the binding also includes the HA's IP address
called backup HA array of the primary HA and presents
and the chosen backup quorum. That is, the binding
as BHAk[0..m-1] = [HA((k-d+1) mod N), HA((k-d+2) mod N),…,
contains home address, CoA, lifetime, HA’s IP address
HA((k-1) mod N), HA((k+1) mod N), HA((k+2) mod N),…, HA((k+d-1)
and the backup quorum that the HA selects. We call this
mod N)], where m = 2(⌈√N⌉ - 1). binding as "extended binding".
The registration and backup management
Theorem 2. For a primary home agent HAk ( 1 d k d N ), mechanism is described as follows. In a mobile network
the backup HA set is {HA((k-d+1) mod N), HA((k-d+2) mod N), …, with N HAs, when a MNi ( 1 d i d N ) roams to a foreign
HA((k-1) mod N), HA((k+1) mod N), HA((k+2) mod N),…, HA((k+d-1) network, the FA is going to allocate a CoA for it.
mod N)}, and there are exactly 2( «
ª N º -1) backup HAs in
» Supposing that the MNi’s HA is HAk ( 1 d k d N ), the
the backup HA set. MNi will send a registration request with its CoA to HAk.
According to the backup mechanism based on network
For example, in the mobile network with 8 HAs, for segments above, HAk determines that the the MNi
the MNs that register to the home agent HA1, the backup belongs to which network segment (assume the jth
quorums of HA1 are Q1, Q8, and Q7. That is, the backup segment) in accordance with the MNi’s home address
quorum set is {Q1, Q8, Q7}, where Q1 = {HA1, HA2, and then HAk takes BQk [j-1] as the backup quorum of
HA3}, Q8 = {HA8, HA1, HA2}, and Q7 = {HA7, HA8, this MN. After that, HAk delivers a backup registration
HA1}. According to Theorem 2, the backup HA set is request to all active backup HAs in the quorum BQk [j-1]
(except for HAk) and the request message contains the
{HA2, HA3, HA7, HA8} and there are 2( ª« 8 º» -1) = 4 HAs extended binding of the MNi. When a backup HA
in this set. accepts the request message, it will backup the extended
binding of the MNi and transmits a response message
3.2. Backup Quorum Management back to HAk. In case that HAk acquires all reply
Our proposed backup quorum management messages (at most ⌈√N⌉ - 1) from all the active backup
mechanism is based on network segmentation. In a HAs of the quorum BQk[j-1], HAk will dispatch a
mobile network with N HAs, the network that is registration reply message with successful authentication
managed by the primary HAk (1 ≤ k ≤ N) is divided into to the MNi. Finally, the processes of the registration and
backup are finished.
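The segment-to-quorum mapping and the backup fan-out just described can be sketched as follows (our own illustration; the address range, helper names and values are invented, and the quorum construction follows the earlier sketch):

```python
from math import ceil, sqrt

def backup_quorum_for(mn_home_addr, hak_range, k, n, quorums):
    """hak_range = (low, high) addresses managed by HAk (as integers);
    quorums maps quorum number -> list of HA numbers."""
    d = ceil(sqrt(n))
    low, high = hak_range
    seg_len = (high - low + 1) / d
    j = min(int((mn_home_addr - low) // seg_len), d - 1)   # 0-based segment index
    bq = ((k - 1 - j) % n) + 1                              # BQk[j] = Q((k-j) mod N)
    backup_has = [ha for ha in quorums[bq] if ha != k]      # primary HA excluded
    return bq, backup_has

# Toy run in the spirit of the 8-HA example: a home address in the first third
# of HA1's range is backed up by the HAs of Q1 other than HA1 itself.
quorums = {i: [((i - 1 + j) % 8) + 1 for j in range(3)] for i in range(1, 9)}
print(backup_quorum_for(10, (0, 299), k=1, n=8, quorums=quorums))  # (1, [2, 3])
```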
d = ⌈√N⌉ network segments. The first segment is
backuped with BQk [0] = Qk, the second segment is with
BQk[1] = Q((k-1) mod N),…, and the dth segment is with Consider the previously described example again.
BQk[d-1] = Q((k-d+1) mod N) in order. That is, the bindings of In a mobile network with 8 HAs, when a MN1 roams to a
the MNs in the first segment are backuped by the HAs in

foreign network, the FA will allocate a CoA to the MN1 ª º
and the MN1 will send a registration request message (BQi[0], BQi[1], …,BQi[d-1], where d= « N » -1) which
with its CoA to HA1 after obtaining the CoA. Then HA1 belongs to HAj in the backup bindings of the failed HAi.
determines that MN1 belongs to which network segment When HAk acquires the load messages from all active
in accordance with the MN1’s home address. If the MN1 backup HAs of the faulty HAi, HAk will execute the
belongs to the third segment, HA1 will assign the takeover procedure and select the HA with minimum
BQ1[3-1] = BQ1[2] as the backup quorum of the MN. load (that is, number of binding maintenance) from each
Afterward, the procedure of the registration and backup backup quorums (BQi[0], BQi[1], …,BQi[d-1], where
is started. This is, HA1 will deliver the backup request d= ª« N º» -1) of the failed HAi in order.
message with the MN1’s extended binding to all active In other words, HAk that is with the minimum
backup HAs in the backup quorum BQ1[2]. Every active number in the backup HA set is responsible for selecting
backup HA should finish the backup procedure and one HA with the minimum load (that is, minimum
transmits a backup reply message back to HA1. When number of binding maintenance) to take over the backup
HA1 receives all backup reply messages from all the binding of the faulty HAi from each backup quorum of
active backup HAs, HA1 will dispatch a registration the faulty HAi in order. And then, HAk broadcasts the
reply message to MN1. After finishing the registration check result to all backup HAs except for the faulty HAi.
and backup process, HA1 will maintain the current After the takeover process, the load of the chosen HA
extended binding for MN1 and other active backup HAs becomes the sum of the minimum load and the backup
in the BQ1[2] will also store the extended binding. Table number of the bindings in the backup quorum of the
2 shows the record of the extended binding in the failed HA. Finally, the chosen HA begins to take care of
primary home agent HA1 and backup HAs of BQ1[2]. the failed HAi’s bindings and resumes the operation
transparently. For maintaining the synchronization of
load information sent by each HA, when all backup HAs
Table 2. Bindings in HA1 and in the Backup HAs
deliver the load messages to the HA with the minimum
number in the backup HA set, they will enter a critical
section and do not take any registration at the same time.
However, they leave the critical section and accept the
registration till they obtain the messages with check
result from the HA with the minimum number. After
getting the check result message, supposing that HAk is
the takeover home agent in a backup quorum of the
faulty HA, HAk will take charge of all extended bindings
of the backup quorum and serve the MNs in the extended
binding. On the other hand, if a HA is not the takeover
home agent in any quorum, it will just record the
takeover message and leave the critical section and
resume accepting new registration of MNs.
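The takeover selection just described can be sketched as follows (our own code; the loads and backup counts in the demo are made up, and only the selection logic follows the text: for each backup quorum of the failed HA, pick the surviving member with the minimum load and add that quorum's backup bindings to it):

```python
from math import ceil, sqrt

def takeover_plan(failed_ha, n, quorums, loads, backup_counts):
    """loads[ha]: bindings each HA currently maintains;
    backup_counts[q]: backup bindings of the failed HA stored in quorum q;
    quorums[q]: the HAs of quorum q."""
    d = ceil(sqrt(n))
    bq_of_failed = [((failed_ha - 1 - j) % n) + 1 for j in range(d)]   # BQi[0..d-1]
    plan, new_loads = {}, dict(loads)
    for q in bq_of_failed:                                  # processed in order
        candidates = [ha for ha in quorums[q] if ha != failed_ha]
        chosen = min(candidates, key=lambda ha: new_loads[ha])   # minimum load
        plan[q] = chosen
        new_loads[chosen] += backup_counts[q]   # it now also serves those bindings
    return plan, new_loads

# Toy run for the 8-HA setting: if HA1 fails, one takeover HA is chosen from
# each of its backup quorums Q1, Q8 and Q7.
quorums = {i: [((i - 1 + j) % 8) + 1 for j in range(3)] for i in range(1, 9)}
loads = {ha: ha % 4 for ha in range(1, 9)}          # made-up current loads
backup_counts = {1: 2, 8: 1, 7: 3}                  # made-up backup binding counts
print(takeover_plan(1, 8, quorums, loads, backup_counts)[0])  # {1: 2, 8: 8, 7: 8}
```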
For example, in a mobile network with 8 HAs, as
3.5. Failure Detection and Takeover shown in Table 3, the mobility bindings of HA1 are
recorded by all the backup HAs in the BHA set of HA1.
The failure of a HA can be detected by using agent When HA1 fails, the backup HAs (HA2, HA3, HA7, HA8)
advertisements. In Mobile IP protocol, all MAs will send in the BHA set of HA1 know that HA2 is the HA with the
the agent advertisement [1, 2] messages periodically to minimum number and they (HA3, HA7, HA8) will deliver
advertise their existence and service for the MNs on any the binding load (see Table 4) to HA2. After receiving
attached network. With the agent advertisements, we can the binding load messages from all the backup HAs, HA2
check if a MA operates or not. A HA has N-1 timers for starts the takeover of the check process in each backup
the other N-1 HAs except itself. When a timer runs out quorum. In Table 5(a), HA2 has minimum load (number
of time, it means that the correspondent HA fails since of binding maintenance) in Q1. Hence, HA2 will take
its agent advertisement has not been received after a
certain period of time. become 3 (the sum of maintenance number and backup
number in Q1) as shown in Table 5(b). Similarly, the
Provided that a HAi crashes, other active HAj in the takeovers in Q8 and Q7 are HA8 and HA7 as shown in
same network will discover the failure of HAi because of Tables 6 and 7, respectively. Finally, after the check
not acquiring an agent advertisement message from HAi process, HA2 will send the takeover reply message (see
after a period of time. According to Theorem 2, HAj is Table 8) to each backup HA in the BHA set. The backup
able to figure out the backup HA set from the faulty HAi HAs receive the takeover reply message and then leave
and finds out the HA with the minimum number in this the critical section to accept the registration or take over
set, say HAk. If k is not equal to j ( k z j ), HAj will the bindings. The extended bindings in the backup HA of
transmit a load message to HAk. The load message is HA1 after the takeover are as shown in Table 9.
taken down the amount of all the MNs served by faulty
HAi at present and the quantities of the backup quorum

Table 3. Bindings in the backup HAs of HA1 Table 8. The takeover reply message from HA2

Table 9. Bindings in the backup HAs of HA1 after the


takeover

Table 4. Binding loads of backup HAs of HA1

Table 5. The loads of HAs in Q1


4. Simulation Results and Comparisons
In this section, we describe the simulation results
of our proposed method. We analyze the overhead with
different number of MNs, different number of HAs, and
registration delay with different mobility rates. In our
experiment, the registration delay includes registration
Table 6. The loads of HAs in Q8 time and backup time. We compare our proposed
mechanism with that proposed by Ghosh and Varghese
[11], which is called “FTMIPP” for short. And, our
proposed cyclic-quorum-based fault-tolerant protocol is
called “CQFTP” for short.
In FTMIPP, the bindings of the MNs are stored in
the primary HA and also backed up in all the other N - 1 HAs.
In other words, all N - 1 HAs have to back up the
Table 7. The loads of HAs in Q7
bindings. On the other hand, in our proposed mechanism,
the bindings are stored in some backup HAs of one
quorum, that is, ⌈√N⌉ - 1 backup HAs. Hence, in our
cyclic-quorum-based protocol, the number of the backup
bindings in each backup HA is less than that in FTMIPP.
The comparison is shown in Figure 2.

References
[1] C. E. Perkins, “IP Mobility Support,” IETF RFC
2002, Oct. 1996.
[2] C. E. Perkins, “IP Mobility Support for IPv4,” IETF
RFC 3220, Jan. 2002.
[3] C. Graff, M. Bereschinsky, M. Patel, and L. F. Chang,
“Application of Mobile IP to Tactical Mobile
Internetworking,” Military Communication Conference,
Vol. 2, pp. 409-414, 1998.
[4] C. M. Lin, G. M. Chiu, and C. H. Cho, “A new
quorum-based scheme for managing replicated data in
Figure 2. Bindings per backup HA
distributed systems,” IEEE Transactions on Computers,
Vol. 51, pp. 1442-1447, 2002.
As shown in Figure 3, in case that the number of [5] H. Ahn and C. S. Hwang, “Low-Cost
HAs is 2, the FTMIPP and CQFTP almost spend the Fault-Tolerance for Mobile Nodes in Mobile IP Based
same time. In such a case, the same number of messages Systems,” 15th International Parallel and Distributed
is sent in two protocols. However, when the number of Processing Symposiums, pp. 508-513, 2001.
HAs is more than 4, in FTMIPP, it spends much time [6] I. H. Bae, “A quorum-based dynamic location
than our CQFTP. The reason is that the number of management method for mobile computings,” in
backup HAs is N-1 in FTMIPP and ª« N º»  1 in CQFTP. Proceedings of the 6th International Conference on
Real-Time Computing Systems and Applications
(RTCSA), 1999, pp. 398-401.
[7] J. W. Lin and J. Arul, “An Efficient Fault-Tolerant
Approach for Mobile IP in Wireless Systems,” IEEE
Transactions on Mobile Computing, VOL. 2, NO. 3,
Jul.-Sep. 2003.
[8] J. H. Ahn and C. S. Hwang, “Efficient Fault-Tolerant
Protocol for Mobility Agents in Mobile IP,” Proceedings
of the 15th International Parallel and Distributed
Processing Symposiums, pp. 1273 -1280, Apr. 2001.
[9] M. J. Yang, Y. M. Yeh, and Y. M. Chang, “Legion
Structure for Quorum-Based Location Management in
Figure 3. Average registration delay
Mobile Computing,” Journal of Information Science and
Engineering, Vol. 20, pp. 191-202, 2004.
As shown in Figure 4, we simulate the registration [10] M. Naor and A. Wool, “Access control and
delay of total 100 MNs with different mobility rates in signatures via quorum secret sharing,” IEEE
the network with 20 HAs. It shows that our proposed Transactions on Parallel and Distributed Systems, Vol.
protocol has smooth and smaller registration delay than 9, pp. 909-922, 1998.
FTMIPP. [11] R. Ghosh and G. Varghese, “Fault-Tolerant Mobile
IP,” Technical Report WUCS-98-11, Washington Univ.,
Apr. 1998.
[12] R. Jimenez-Peris, M. Patino-Martinez, G. Alonso,
and B. Kernme, “How to Select a Replication Protocol
According to Scalability, Availability and
Communication Overhead,” in Proceedings of 20th
IEEE Symposium on Reliable Distributed Systems, pp.
24-33, 2001
Figure 4. Registration delay with different mobility [13] R. Mistry, P. Savill, and A. Tofanelli, “OA&M for
rates Full Services Access Networks,” IEEE Communications
Magazine, pp. 70-77, Mar. 1997.
5. Conclusions [14] Y. Mun, Y. Kim, Y. J. Kim, and G. Hwang, “IP
Mobility Support over Wireless ATM,” IEEE
In this paper, we propose a cyclic-quorum-based International Conference on Communications, pp.
fault-tolerant protocol in the Mobile IP network with 319-323, 1999.
redundant HAs. Simulation results show that our [15] Y. T. Wu, Y. J. Chang, S. H. Yuan, and H. K.
proposed mechanism has many merits: it does not need Chang, “A New Quorum-Based Replica Control
the extra hardware cost; it reduces the number of the Protocol,” in Proceedings. Pacific Rim International
backup bindings by using the small quorum size; it Symposium on Fault-tolerant System, pp. 116-121, Dec.
balances the load of the takeover process; it has low 1997.
registration overhead.

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Semantic Enforcement of Privacy Protection Policies via the Combination of Ontologies and Rules

Yuh-Jong Hu, Hong-Yi Guo, and Guang-De Lin
Department of Computer Science, National Chengchi University
Wen-Shan District, Taipei, Taiwan
{hu,g9607,g9610}@cs.nccu.edu.tw

Abstract However, this access control scenario cannot be easily


enforced and implemented on the open Web where there are
We propose that the semantic formal model for P3P and so many websites within it for users to randomly surf and
EPAL-based privacy protection policies can be enforced search for their intended information [21]. In fact, it is still a
and expressed as a variety of ontologies and rules (ontolo- big challenge to deal with the design and implementation of
gies+rules) combinations, such as DLP, SWRL, AL-log, DL- access control (or more specific privacy protection) policies
log, DL+log, and MKNF, etc. Based on P3P and EPAL’s and language on the open Web [26] [27].
original expressions and their dictionaries, several ontolo- It is impossible to compel a user to disclose his/her own
gies+rules semantic enforcement of privacy protection poli- profile information unless a particular website has enough
cies will be proposed in this study that can be compared incentives for the user to disclose his/her personal profile
with existing others. Furthermore, we express privacy pro- information. Furthermore, the disclosed user profiles might
tection management policies as a set of ontology statements, not be truly trusted and authenticated information because
rules, and facts for both information disclosure and rights the user is afraid of his/her personal digital traces might be
delegation using one of the above ontologies+rules combi- collected and analyzed later on for possible personal pri-
nations for two specific use case scenarios. When verify- vacy invasion. The reason a user does not intend to disclose
ing P3P/EPAL formal semantics, we exploit which ontolo- his/her personal profile for a website is that he/she is un-
gies+rules combination will be a feasible information dis- aware how the collected personal profile and digitally traced
closure control scenario under certain conditions. We hope information will be used. Even the website provides explicit
that this study might shed some light on the study of future usage statements that claims it will comply with existing le-
general information disclosure and rights delegation con- gal privacy protection regulations. It is still very difficult
trolled on the open Web environment. for a user to justify whether the usage of collected profile
data and digital traces are honestly compliant with the pri-
vacy protection statements indicated on that particular web-
1. Introduction site [2].
The Platform for Privacy Preferences (P3P) is a pri-
When we consider the information disclosure problem, vacy markup language for a web server to easily annotate
it is highly relevant to the privacy protection issue on the a server’s intentions on his collection of selective personal
Web because both of them have to achieve the objective of information usage options. Thus the P3P enables website to
information disclosure at the right time for the right per- express its privacy practices in a standard format that can be
sons (or agents) with the right purposes [10]. People al- easily and automatically retrieved and interpreted by user
ways enforce very strict information access control policies agents. P3P user agents will allow users to be informed
in the centralized system where all of the users are already of site practices (in both machine- and human-readable for-
registered with their true identities and profile information. mats) and to automate decision-making based on these prac-
Once a user’s account is granted for accessing the system tices when appropriate [7]. On the other hand, using a P3P
resources, he/she should show his/her own pre-authorized Preference Exchange Language (APPEL), a user can ex-
user name and password to execute the intended software press his/her preferences in a set of preference-rules (called
or to access sensitive information within this system. a ruleset), which can then be used by his/her user agent

to make automated or semi-automated decisions regarding significant. Anderson argued that enterprises should choose
the acceptability of machine-readable privacy policies from XACML as a privacy policy language because the function-
P3P enabled Web sites [8]. ality of XACML 2.0 is a superset of EPAL 1.2 [1]. In order
Unfortunately, XML-based markup languages, such as to ascertain all of information disclosure actions will abide
P3P and APPEL, do not have the expressive power to model by its privacy protection regulations; current P3P/EPAL pri-
and enforce the semantics of a person’s privacy protec- vacy protection mechanisms were implemented and embed-
tion intentions from both client and server sides [17] [29]. ded into the relational database, such as Oracle Virtual Pri-
Therefore, we cannot simply use XML-based P3P to spec- vate Database (VPD) [17]. But it is not easy to exchange
ify our privacy protection policies and ensure that these and share personal data and digital traces from heteroge-
policies can be automatically verified to comply with the neous data sources under VPD architecture. Certainly it
underlying legal regulations from a semantics unambiguous is not easy to exercise the auditing and policy compliance
perspective. Obviously, we need a higher formal semantics checking based on the current P3P/EPAL design and imple-
layer laid on the P3P/APPEL to ensure all of the semantic mentation mechanism [2]. We need to have a more gen-
clearness for policy compliance checking, which has been a eral and powerful semantic representation and enforcement
very important research area of trust management for policy of privacy protection framework to deal with the possible
making [4]. challenges that cannot be resolved by P3P/EPAL alone.
If a website is trustworthy, we might allow it to freely Previous semantic policy languages for security or pri-
collect our personal profiles and digitally traced informa- vacy control were proposed either using description logic
tion because we have confidence that this website will abide ontologies or logic program rules alone. For example,
by the legalized information sharing and disclosure poli- KAoS, Rei, and Ponder policy languages were based on
cies under the law per se to respect our intentions and op- ontology representation and reasoning only, so they were
tions on the data usage. However, the problem is whether pretty limited with respect to their policy representation and
a trusted website can really be aware of the usage options enforcement [14] [25]. Similarly, the rule-based policy
and purposes for the collected information coming from a representation faced the same limitations on policy expres-
tremendous amount of different users on their selective op- sion and enforcement [3] [6] [26]. In general, these policy
tions of personal profiles and digital traces. Another issue is languages have less expressive power compared with our
whether we allow our agents to enforce the privacy protec- ontologies+rules combination for policy representation and
tion principles unambiguously without our direct interven- enforcement.
tion. If possible, we might need to delegate privacy protec-
tion compliance checking services to one of the trustworthy 3 Privacy Protection on the Web
web site’s agents to ensure our benefits.
The original idea of WWW (or Web) is to promote the ef-
2 Related Studies fective sharing and exchange of information among agents
and people. After several years of development, the pri-
The P3P/APPEL privacy protection mechanisms were mary objective of this concept has been achieved to some
proposed to enable easy collection of data user (or data con- extent. However, we expect information collectors provide
sumer) and date subject’s (or data owner) usage purposes the information disclosure option services for us as respect-
and conditions under a client server model [7] [8]. How- ing our basic human rights while they are collecting and
ever, P3P can not support any semantics level representa- sharing our personal profiles and digitally traced informa-
tion and enforcement of privacy protection policies because tion among themselves. In fact, this privacy protection con-
P3P/APPEL expressions were based on XML syntax only. sideration has become one of the most important emerging
Similarly, the E-P3P (or later EPAL) was based on previ- research issues in Web development.
ous Flexible Authorization Framework (FAF) [28] [13] that
was proposed to express and enforce the enterprise’s privacy 3.1 Privacy Protection on Web 1.0 and
protection policies on the Web [15] [16]. The EPAL was Web 2.0
using a logic program (LP) model to indicate the data usage
purposes for a particular role under certain conditions. But The Web 1.0 and Web 2.0 privacy protection problem is
the semantic representation and enforcement from the logic the primary focus for most of the current privacy protection
program model are still far from satisfactory from semantic languages, such as P3P/EPAL and XACML [1] [7] [15].
representation and enforcement viewpoints. However, the systems enforced with these privacy protec-
While EPAL and XACML are proposed as privacy pol- tion languages cannot deal with the digital trace protection
icy languages, they are very similar in both structure and in and disclosure issues because users’ digital traces are usu-
concept but the differences between these two languages are ally stored as unstructured text-based weblog files. Further-

more, to achieve information sharing and exchange objec- mantic web layer cake 1 . There are several possible hy-
tives on the current Web 1.0 and Web 2.0, we need to collect brid ontologies+rule combinations, such as AL-log, DL-
most of this information from multiple relational databases log, and DL+log, to consider as a policy language for the
in the deep web. Therefore, the information disclosure poli- representation and enforcement of privacy protection poli-
cies and mechanisms for satisfying privacy protection prin- cies [9] [23] [24]. Under hybrid ontologies+rules com-
ciples are usually embedded into the relational database bination, some of the terms in privacy protection policies
systems. Certainly it is not easy to have either informa- will not be explicitly declared or defined in ontologies but
tion sharing or information disclosure actions across multi- they will be declared as predicates in each rule. Therefore
ple relational databases from these heterogeneous database the knowledge flow between ontologies and rules might be
schema. bi-directional to re-enforce ontologies and rules expressive
power of each other. At this moment, it is unclear which
3.2 Privacy Protection on Web 3.0 homogeneous/hybrid ontologies+rules combinations to use
as an ideal representation and enforcement of privacy pro-
If we successfully migrate from Web 1.0 and Web 2.0 tection system policies. This still needs further study.
to Web 3.0 (Semantic Web), then all of users’ profiles and Another issue is that most of the current privacy pro-
digitally traced information will be annotated via ontology- tection systems with their policies can be expressed and
based markup language, such as RDF(S) or OWL. We do enforced only as positive permission but no negative per-
not know whether this semantic web evolution process will mission (or deny) on the rule’s conclusion for each infor-
be a benefit or a detriment on the realizing of privacy protec- mation disclosure request. Similarly, people do not allow
tion. From the pro side, because all of the information will weak (or strong) negation premises on each privacy pro-
be modeled and marked up by a well-defined semantic web tection policy. All of these constraints are due to the lack
ontology structure we can easily apply similar techniques of negation as failure (NAF) assumptions for ontologies
for the expression and enforcement of privacy protection that certainly restrict wide information dissemination and
policies. On the con side, if the privacy protection sys- disclosure capacity on privacy protection. In fact, how to
tem was not perfectly designed and implemented, certainly merge open world assumption (OWA) from the ontology
this system would be much easier for any privacy violators side with closed world assumption (CWA) from the rule
to challenge our protection policies by using pre-existing side on the ontologies+rules integration is also one of the
highly semantically connected information to inference (or emerging critical research issues when we combine ontolo-
reason) where are the possible weak links to attack. This is gies with rules together. Furthermore, we might face on-
a two-edged sword scenario when we introduce the seman- tologies merging and rule composition challenges when we
tic techniques into both information modeling and privacy integrate the information cross heterogeneous multiple do-
protection policy on Web 3.0. mains that might induce a dilemma for a global inconsistent
ontologies+rules protection polices from each collected lo-
4 Ontologies+Rules for Privacy Protection cal consistent ontologies+rules protection policies [5].
Policy
4.1 Ontologies for Privacy Protection
Policies
We usually classify ontologies+rules combination as two
approaches: homogeneous integration and hybrid combina-
We proposed three types of ontology in the DL+log-
tion [18]. In a homogeneous integration, ontologies will be
based ontologies+rules combination for the semantic en-
the main body of concept for information structure, where
forcement of privacy protection policies, e.g., data user on-
DLP is the most restricted one for this approach [11]. All
tology, data type ontology, and purpose ontology. More de-
of the major terms and representations for privacy protec-
tailed structures with their associated class and property hi-
tion will be declared and defined in ontologies and later
erarchies are shown as the followings:
move to the rules for further inferencing processes, such
as SWRL [12]. So the knowledge flow is uni-directional in 1. The structure of data user ontologies for both class
a homogeneous integration of ontologies+rules. Here rules and property hierarchies are proposed to categorize the
can be regarded as an added-on component to the ontologies type of users and with their memberships correspond-
component to enhance/extend the expression limitations of ing to an organization (see Figure 1).
ontologies.
In a hybrid combination, the ontologies module is rep- 2. The data type ontologies to describe both the hierar-
resented as OWL or RDF(S) and it sits side by side with chies of class and property for personal profiles and
the rules module represented as RIF to enforce the knowl- 1 See W3C Semantic Web Activity for the latest ”layercake” diagram at

edge representation and integration on the well-known se- http://www.w3.org/2001/sw/.

Figure 1. A data user hierarchy to classify the
data user class hierarchy
Figure 2. A class hierarchy classification for
both personal profiles and digital traces
digital traces can be shown as Figure 2.

3. The purpose ontology to describe the intention of data


in, sender’s/receivers’ email address(es) for each incom-
user to use a particular type of data can be shown as
ing/outgoing email, the titles and contents for each thread
Figure 3.
of all associated emails, etc. Of course, the mail server G
does provide opt-in and opt-out mechanisms for the user to
4.2 Two Scenarios for Privacy Protection decide whether his public (or private) profile and digitally
of Mail Servers traced information can be (not-)disclosed under certain cir-
cumstances for some roles to achieve a specific purpose.
A privacy protection scenario for three email users Please propose DL + log-based privacy protection poli-
(Alice, Bob, and Charlie) in a mail server G to enforce cies that can explicitly specify ontologies and rules to sat-
privacy protection policies under a specific purpose from isfy the weak DL-safeness conditions to have the semantic
different organization domain is shown as follows: enforcement of privacy protection objective via the combi-
nation of ontologies and rules [23]. The knowledge bases
G company is a well-known mail server portal that pro- of ontologies and rules for two use case scenarios can be
vides email sending, receiving, and storing management shown as: 2 .
services for its registered users. In order to apply for an A 5-tuple term (user(s), type(s), purpose(s), right(s),
email account from this portal, each new user has to explic- condition(s)) is a fact shown as the P3P XML-based rep-
itly fill in his own office profile information to this portal, resentation from data owner specified options on the data
including name, office phone number, office address, and usage’s for data user(s), where user(s) ∈ data user ontol-
working organization, etc. Furthermore, for the purposes ogy, type(s) ∈ data type ontology, purpose(s) ∈ purpose
of providing the user’s personal email search and retrieval ontology; right(s) ∈ (read,write,display,disclose,..), and
or for the management of a mail server’s own business ser- condition(s) ∈ (date,time,counter,..). Once this 5-tuple
vices, dynamically generated users’ digitally traced infor- term was collected from data owner, it will be extracted
mation will be online extracted, (un)disclosed, and even and decomposed as several legal predicates that fitted into
archived in this portal during email sending and receiving the grounding facts for the ontologies module and the
activities.
2 In
the following rules and facts, each term shown as capital letters
The possible online digitally traced information ex- comes from ontologies while each term shown as little letters is defined
tracted, (not-)disclosed, and archived from this mail as Datalog predicates. This is the feature of a hybrid ontologies+rules
server portal are IP address for each time the user signs combination.

Figure 3. A purpose ontology for the classifi-
cation of different data usage purposes

rules module to semantically enforce the privacy protection


policies with respect to each data user’s request. Figure 4. A recipient B’s email address can-
not be disclosed to C ∈ CP under all data
• Use case one scenario: There are two organizations usage purposes
that share users’ public profiles and digitally traced in-
formation from this mail server portal: one is a sub-
sidiary department SD of this mail server and the other ORGANIZATION
domain range
is a cooperative partner CP of this mail server. The MAIL TRACE ←− HAS M AIL T RACE −→
privacy protection policies to enforce the information EMAIL
disclosure requests from the members of these two or- EMAIL v ∃ HAS MAIL TRACE ONLINE− .O EMAIL SENDER
ganizations will be quite different from service pur- EMAIL v ∀ HAS MAIL TRACE ONLINE.O EMAIL RECEIVER
poses or user roles perspective. Now a user Alice ∈ DATA AUDIT ANNOUN. v AUDIT ANNOUN.
SD is going to send a data auditing announcement
email ∈ DAT A AU DIT AN N OU N. to both a user Ontologies Module’s Facts:
Bob ∈ SD and a user Charlie ∈ CP . Under com- ORGANIZATION(G)
pany SD internal regulation, anyone sends an email to HAS SUBSIDIARY(G, J-Corp.)
a mailing list with multiple recipients, where email re- HAS COOPERATIVE(G, Q-Corp.)
cipients ∈ SD cannot disclose his/her email address to IS STAFF OF(Alice, J-Corp.)
those people not ∈ SD domain under any purposes. IS STAFF OF(Bob, J-Corp.)
Therefore, the email recipient Charlie ∈ CP can- IS STAFF OF(Charlie, Q-Corp.)
not explicitly see the email address of the recipient HAS EMAIL ADDRESS(Alice,Alice@gmail.com)
Bob ∈ SD in his receiving email address header(see HAS EMAIL ADDRESS(Bob,Bob@yahoo.com.tw)
Figure 4). HAS EMAIL ADDRESS(Charlie,Charlie@hotmail.com)
Let Γ = (Λ, ∆) be the two components of knowledge O EMAIL SENDER(Alice@gmail.com),
representation from ontologies Λ module and rules ∆ O EMAIL RECEIVER(Bob@yahoo.com.tw)
module: O EMAIL RECEIVER(Charlie@hotmail.com)
HAS MAIL TRACE ONLINE(Alice@gmail.com,
– Λ = ontology about information disclosure for Bob@yahoo.com.tw)
this use case one scenario: HAS MAIL TRACE ONLINE(Alice@gmail.com,
Ontologies Module’s Axioms: Charlie@hotmail.com)
COMPANY v PRIVATE
PRIVATE v ORGANIZATION – ∆ = Rules about information disclosure for this
OWNER v PERSON use case one scenario:
domain range
COMPANY ←− HAS COOP ERAT IV E −→
COMPANY Rules Module’s Rules:
domain range
COMPANY ←− HAS SU BSIDIARY −→ cando(?c,?b-email, display) ⇐=
COMPANY opt-in(?b,?b-email,?p)), data-user(?c),
HAS COOPERATIVE ≡ HAS COOPERATIVE − data-owner(?b),
domain range
PERSON ←− IS ST AF F OF −→ HAS EMAIL ADDRESS(?b,?b-email). ← (a1)

cando(?c, ?b-email, nill) ⇐
  opt-out(?b, ?b-email, ?p), data-user(?c), data-owner(?b),
  HAS_EMAIL_ADDRESS(?b, ?b-email).    (a2)

opt-in(?b, ?b-email, ?p) ⇐
  IS_STAFF_OF(?b, ?c1), IS_STAFF_OF(?c, ?c2),
  HAS_SUBSIDIARY(?c1, ?c2),
  HAS_MAIL_TRACE_ONLINE(?a-email, ?c-email),
  O_EMAIL_SENDER(?a-email), O_EMAIL_RECEIVER(?c-email),
  data-owner(?b), data-user(?c), purpose(?p),
  data-type(?b-email).    (a3)

opt-out(?b, ?b-email, ?p) ⇐
  IS_STAFF_OF(?b, ?c1), IS_STAFF_OF(?c, ?c2),
  HAS_COOPERATIVE(?c1, ?c2),
  HAS_MAIL_TRACE_ONLINE(?a-email, ?c-email),
  O_EMAIL_SENDER(?a-email), O_EMAIL_RECEIVER(?c-email),
  data-owner(?b), data-user(?c), purpose(?p),
  data-type(?b-email).    (a4)

Rules Module's Facts:
data-user(Bob), data-owner(Bob),
data-user(Charlie), data-owner(Charlie),
purpose(data-auditing),
data-type(Bob@yahoo.com.tw),
data-type(Charlie@hotmail.com),
opt-in(c, Charlie@hotmail.com, data-auditing),
cando(Bob, Charlie@hotmail.com, display),
cando(Charlie, Bob@yahoo.com.tw, nill),
opt-out(b, Bob@yahoo.com.tw, data-auditing)

From Bob's side, a mail server G first grounds rule (a4) and derives opt-out(b, Bob@yahoo.com.tw, data-auditing) as a conclusion. This opt-out(..) fact becomes one of the conditions of rule (a2) once Charlie activates his email receiving action from mail server G to read this particular email from Alice@gmail.com. The recipient email address Bob@yahoo.com.tw will therefore not be displayed, because of the conclusion cando(Charlie, Bob@yahoo.com.tw, nill) derived from rule (a2) with the nill access right.

From Charlie's side, mail server G has no constraints from Charlie to enforce associated privacy protection policies, so Bob can see Charlie as one of the mailing list recipients, with Charlie@hotmail.com, in the email message he receives (see Figure 4). Rule (a3) satisfies weak DL-safeness but does not satisfy the DL-safeness condition, because the variables ?c1 and ?c2 in the DL predicate IS_STAFF_OF do not occur in any Datalog predicate.
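To make the grounding order concrete, the following minimal Python sketch replays the two derivation steps of use case one. It hard-codes simplified versions of rules (a4) and (a2) over a subset of the facts above (the DL reasoning and several rule conditions, such as the online mail trace, are omitted), so it is an illustration of the derivation order, not a DL+log reasoner; the predicate and constant names follow the paper.

```python
# Minimal grounding of rules (a4) and (a2) for use case one.
facts = {
    ("IS_STAFF_OF", "Bob", "J-Corp."),
    ("IS_STAFF_OF", "Charlie", "Q-Corp."),
    ("HAS_COOPERATIVE", "J-Corp.", "Q-Corp."),
    ("HAS_EMAIL_ADDRESS", "Bob", "Bob@yahoo.com.tw"),
    ("data-owner", "Bob"), ("data-user", "Charlie"),
    ("purpose", "data-auditing"), ("data-type", "Bob@yahoo.com.tw"),
}

def match(fs, pred):
    """All facts in fs whose predicate symbol is pred."""
    return [f for f in fs if f[0] == pred]

def rule_a4(fs):
    """opt-out(?b, ?b-email, ?p): b and c are staff of cooperative companies."""
    out = set()
    for _, b, c1 in match(fs, "IS_STAFF_OF"):
        for _, c, c2 in match(fs, "IS_STAFF_OF"):
            if (("HAS_COOPERATIVE", c1, c2) in fs
                    and ("data-owner", b) in fs and ("data-user", c) in fs):
                for _, owner, b_email in match(fs, "HAS_EMAIL_ADDRESS"):
                    if owner == b and ("data-type", b_email) in fs:
                        for _, p in match(fs, "purpose"):
                            out.add(("opt-out", b, b_email, p))
    return out

def rule_a2(fs):
    """cando(?c, ?b-email, nill): the opted-out address gets a nill right."""
    out = set()
    for _, b, b_email, _p in match(fs, "opt-out"):
        for _, c in match(fs, "data-user"):
            if ("data-owner", b) in fs and ("HAS_EMAIL_ADDRESS", b, b_email) in fs:
                out.add(("cando", c, b_email, "nill"))
    return out

facts |= rule_a4(facts)   # derives opt-out(Bob, Bob@yahoo.com.tw, data-auditing)
facts |= rule_a2(facts)   # derives cando(Charlie, Bob@yahoo.com.tw, nill)
print(("cando", "Charlie", "Bob@yahoo.com.tw", "nill") in facts)   # True
```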
• Use case two scenario: The auditing officer Bob serves in one of the government auditing agencies, the Internal Revenue Service (IRS), where IRS ∈ GOV_AGENCY ⊑ PUBLIC. Bob is going to enforce a routine auditing check on a company M ∈ COMPANY ⊑ PRIVATE through its representative Charlie. An auditing announcement officer Alice from the IRS is going to send an email to a representative employee Charlie ∈ M and other company representatives to notify them of the account-auditing schedule. Under the government's auditing regulations, the real acting auditor Bob, as one of the mailing list recipients serving in the IRS, cannot disclose his email address in this account-auditing notification email. Therefore, a chief privacy officer (CPO) ∈ IRS has to opt out the acting auditor recipient Bob's email address to comply with the regulations while Alice is sending the account-auditing notification message (see Figure 5).

Figure 5. A recipient Bob's email address Bob@government.gov cannot be disclosed to Charlie under auditing regulations for the purpose of delivering an auditing notification email to Charlie.

The ontologies module and the rules module for this use case two scenario are very similar to those specified in the use case one scenario, except that the conditions of rule (a3) and rule (a4) are no longer expressed with the binary ontology predicates HAS_SUBSIDIARY(..) and HAS_COOPERATIVE(..); instead, they are replaced with the unary ontology predicates IRS(?c1) and IRS(?c2) to ascertain that the data owner b will opt-in(..) his email address to a data user c who also serves in the IRS. Otherwise, the data owner b will opt-out(..) his email address to a data user c who is not an IRS employee.

5 Discussion

5.1 Which Ontologies+Rules Combination?

A variety of ontologies-and-rules (ontologies+rules) combinations have been proposed over the past few years, such as DLP, SWRL, AL-log, DL-log, DL+log, and MKNF [11] [12] [9] [23] [24] [19]. We subjectively choose DL+log as the ontologies+rules combination for the two use case scenarios of privacy protection, because DL+log constitutes the most powerful decidable combination of Description Logic (DL) ontologies and disjunctive Datalog rules with a weak DL-safeness rule condition [23].

The DL-safeness condition can be expressed as follows: every variable occurring in an atom with a DL predicate must occur in an atom with a Datalog predicate in the body of the rule [24]. In other words, the DL-safeness condition ensures that each rule variable occurs in one of the Datalog predicates. In the DL+log ontologies+rules combination, DL-safeness can be weakened to weak DL-safeness without losing the nice decidable computational properties: only the head variables of a Datalog rule are subject to the DL-safeness condition [23], i.e., every head variable of a Datalog rule must appear in at least one atom with a Datalog predicate.

SWRL (Semantic Web Rule Language) used to be a semantic web language for the combination of ontologies+rules [12], but the complexity of query reasoning over SWRL-based ontologies+rules is undecidable, which prevents people from using this combination without hesitation. Decidability of reasoning is a crucial issue when we combine DL-based knowledge bases (KBs) and Datalog rules [20] [22]. The loose integration between a DL-based KB and a rule-based KB, with weak DL-safeness conditions applied to all of the rules in a privacy protection policy set, guarantees that the semantic enforcement of privacy protection policies is a decidable decision process.
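The following sketch, written against the informal definitions quoted above, checks the two safeness conditions for rule (a3). The rule/atom encoding and the set of DL predicates are assumptions made for the example, not part of DL+log itself.

```python
# Check DL-safeness and weak DL-safeness of a rule, where a rule is a head
# atom plus a list of body atoms and each atom is a (predicate, args) pair.
def variables(atom):
    return {a for a in atom[1] if a.startswith("?")}

def datalog_vars(body, dl_predicates):
    """Variables that occur in at least one non-DL (Datalog) body atom."""
    non_dl = [variables(a) for a in body if a[0] not in dl_predicates]
    return set().union(*non_dl)

def is_dl_safe(head, body, dl_predicates):
    """Every variable of the rule occurs in some Datalog atom of the body."""
    all_vars = variables(head).union(*(variables(a) for a in body))
    return all_vars <= datalog_vars(body, dl_predicates)

def is_weakly_dl_safe(head, body, dl_predicates):
    """Only the head variables must occur in some Datalog atom of the body."""
    return variables(head) <= datalog_vars(body, dl_predicates)

dl_preds = {"IS_STAFF_OF", "HAS_SUBSIDIARY", "HAS_MAIL_TRACE_ONLINE",
            "O_EMAIL_SENDER", "O_EMAIL_RECEIVER"}

# Rule (a3): ?c1 and ?c2 appear only in DL atoms, so the rule is weakly
# DL-safe but not DL-safe, exactly as observed for the use case above.
head = ("opt-in", ("?b", "?b-email", "?p"))
body = [("IS_STAFF_OF", ("?b", "?c1")), ("IS_STAFF_OF", ("?c", "?c2")),
        ("HAS_SUBSIDIARY", ("?c1", "?c2")),
        ("HAS_MAIL_TRACE_ONLINE", ("?a-email", "?c-email")),
        ("O_EMAIL_SENDER", ("?a-email",)), ("O_EMAIL_RECEIVER", ("?c-email",)),
        ("data-owner", ("?b",)), ("data-user", ("?c",)),
        ("purpose", ("?p",)), ("data-type", ("?b-email",))]

print(is_dl_safe(head, body, dl_preds))         # False
print(is_weakly_dl_safe(head, body, dl_preds))  # True
```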
5.2 Privacy Protection Language and Policy

A privacy protection language and policy was proposed by Karjoth et al. as an extension of the Flexible Authorization Framework (FAF) with grantors and obligations [15]. In this extended FAF approach, a privacy control language includes user consent, obligations, and distributed administration. There are several issues they did not exploit in their approach:

1. They did not explicitly separate the ontologies module and the rules module in their policy specification, so the rules enforcing privacy protection policies have to be classified into four categories: direct authorization rules, derived authorization rules, decision rules, and integrity rules. In our approach, the decisions for the derived authorization rules can be enforced directly from the reasoning of ontologies using class and subclass subsumption relationships; the rules module then only has to deal with the final permission of information disclosure.

2. They did not really demonstrate how to achieve the authorization decision for private information disclosure using the combination of a hierarchy of groups, data objects, and purposes. We explicitly show that this authorization decision can be obtained by ontology merging techniques over our three ontologies, i.e., the data subject ontology, the object ontology, and the purpose ontology.

3. They did not model and enforce the disclosure of data between enterprises, i.e., exporting and importing data with their associated privacy policy from/into a system. We are aware that this problem can be solved by using ontology merging and rule composition techniques across multiple domains [5].

4. Finally, they did not consider profile information disclosure or digitally traced information disclosure, which will be an emerging research area for Web 2.0 and Web 3.0 privacy protection. In our mail server use case above, we demonstrated how the personal profile information disclosure opt-in/opt-out selection influences the later disclosure of digitally traced information.

6. Conclusion and Future Prospects

There are several challenges for us in elaborating the semantic web core technologies for modeling privacy protection policy representation and enforcement. At this moment, we are not quite sure which ontologies+rules combination will be the most appropriate one under certain information usage purposes and conditions [9] [19] [23] [24]. In summary, we express and enforce all profile information and digital traces with associated disclosure policies using a specific ontologies+rules combination on Web 3.0, e.g., DL+log. This information modeling structure and access mechanism will be quite different from Web 1.0 and Web 2.0, where profile information is defined as relational database tables in the deep web, and digital traces recording each user's surfing activities are defined and collected as unstructured weblogs. In the Web 3.0 information cyberspace, we may find that all personal profile information, as well as the associated digital traces, is modeled as an ontologies+rules combination with semantic query as the only feasible access mechanism; then the challenge of semantic representation and enforcement of privacy protection policies just begins.

References

[1] A. H. Anderson. A comparison of two privacy policy languages: EPAL and XACML. In Proceedings of the 3rd ACM Workshop on Secure Web Services (SWS'06), pages 53–60. ACM, 2006.
[2] A. I. Antón et al. A roadmap for comprehensive online privacy policy management. Comm. of the ACM, 50(7):109–116, July 2007.
[3] G. Antoniou et al. Rule-based policy specification. In T. Yu and S. Jajodia, editors, Secure Data Management in Decentralized Systems, pages 169–216. Springer, 2007.
[4] M. Blaze, J. Feigenbaum, and M. Strauss. Compliance checking in the PolicyMaker trust management system. In Proc. of Financial Cryptography, LNCS 1465, pages 254–274. Springer, 1998.
[5] P. A. Bonatti, S. D. C. di Vimercati, and P. Samarati. An algebra for composing access control policies. ACM Trans. on Information and System Security, 5(1):1–35, February 2002.
[6] P. A. Bonatti et al. Semantic web policies - a discussion of requirements and research issues. In 3rd European Semantic Web Conference (ESWC 2006), Budva, Montenegro, June 2006.
[7] L. Cranor et al. The Platform for Privacy Preferences 1.0 (P3P 1.0) specification, 2002. http://www.w3.org/P3P/.
[8] L. Cranor, M. Langheinrich, and M. Marchiori. A P3P Preference Exchange Language 1.0 (APPEL 1.0), 2002. http://www.w3.org/TR/P3P-preferences/.
[9] F. M. Donini et al. AL-log: Integrating Datalog and description logics. Journal of Intelligent Information Systems, 10(3):227–252, 1998.
[10] S. Fischer-Hübner. IT-Security and Privacy - Design and Use of Privacy-Enhancing Security Mechanisms. LNCS 1958. Springer, 2001.
[11] B. N. Grosof et al. Description logic programs: Combining logic programs with description logic. In World Wide Web 2003, pages 48–65, Budapest, Hungary, 2003.
[12] I. Horrocks et al. SWRL: A semantic web rule language combining OWL and RuleML, 2004. http://www.w3.org/Submission/SWRL/.
[13] S. Jajodia et al. Flexible support for multiple access control policies. ACM Trans. on Database Systems, 26(2):214–260, June 2001.
[14] L. Kagal, T. Finin, and A. Joshi. A policy based approach to security for the semantic web. In International Semantic Web Conference (ISWC) 2003, LNCS 2870, pages 402–418, 2003.
[15] G. Karjoth and M. Schunter. A privacy policy model for enterprises. In 15th IEEE Computer Security Foundations Workshop (CSFW). IEEE, June 2002.
[16] G. Karjoth, M. Schunter, and M. Waidner. Platform for Enterprise Privacy Practices: Privacy-enabled management of customer data. In 2nd Workshop on Privacy Enhancing Technologies (PET), LNCS. Springer, 2002.
[17] N. Li, T. Yu, and A. I. Antón. A semantics-based approach to privacy languages. Computer Systems and Engineering (CSSE), 21(5), Sep. 2006.
[18] J. Maluszynski. Hybrid integration of rules and DL-based ontologies. In J. Maluszynski, editor, Combining Rules and Ontologies: A Survey, pages 55–72. EU FP6 Network of Excellence (NoE) REWERSE, Feb. 2005.
[19] B. Motik et al. Can OWL and logic programming live together happily ever after? In 5th International Semantic Web Conference (ISWC) 2006, LNCS 4273, Athens, GA, USA, Nov. 2006.
[20] B. Motik, U. Sattler, and R. Studer. Query answering for OWL-DL with rules. In 3rd International Semantic Web Conference (ISWC) 2004, LNCS 3298, pages 549–563. Springer, 2004.
[21] J. Park and R. T. Sandhu. The UCON_ABC usage control model. ACM Trans. on Information and System Security, 7(1):128–174, 2004.
[22] R. Rosati. On the decidability and complexity of integrating ontologies and rules. Web Semantics: Science, Services and Agents on the World Wide Web, 3:61–73, 2005.
[23] R. Rosati. DL+log: Tight integration of description logics and disjunctive Datalog. In Proc. of the 10th International Conference on Principles of Knowledge Representation and Reasoning (KR), 2006.
[24] R. Rosati. Integrating ontologies and rules: Semantic and computational issues. In Reasoning Web 2006, LNCS 4126, pages 128–151, 2006.
[25] G. Tonti et al. Semantic web languages for policy representation and reasoning: A comparison of KAoS, Rei, and Ponder. In 2nd International Semantic Web Conference (ISWC) 2003, LNCS 2870, pages 419–437, 2003.
[26] S. D. C. di Vimercati et al. Access control policies and languages in open environments. In T. Yu and S. Jajodia, editors, Secure Data Management in Decentralized Systems, pages 21–58. Springer, 2007.
[27] D. J. Weitzner et al. Creating a policy-aware web: Discretionary, rule-based access for the world wide web. In E. Ferrari and B. Thuraisingham, editors, Web and Information Security, pages 1–31. Idea Group Inc., 2006.
[28] T. Y. C. Woo and S. S. Lam. Authorizations in distributed systems: A new approach. Journal of Computer Security, 2(2-3):107–136, 1993.
[29] T. Yu, N. Li, and A. I. Antón. A formal semantics for P3P. In ACM Workshop on Secure Web Services, Fairfax, VA, USA, Oct. 2004. http://citeseer.ist.psu.edu/750176.html.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing
PNECOS: A Peer-to-peer Network Coding Streaming System

Tein-Yaw Chung*, Chih-Cheng Wang, Yung-Mu Chen, Yang-Hui Chang

Department of Computer Science and Engineering, Yuan Ze University
No. 135 Yuan-Tung Rd., Chung-Li, Taoyuan, Taiwan 32003
* csdchung@saturn.yzu.edu.tw
Abstract

Multimedia streaming services, such as IPTV and VoD, are among the killer applications in 4G networks. Exploiting the resources of end hosts well to extend the service scalability of a media server is a critical approach in multimedia streaming services. This paper proposes a Peer-to-peer NEtwork COding based Streaming system (PNECOS) to provide an overlay multimedia streaming service. PNECOS employs network coding technology to supply stateless and independently encoded content, to utilize end hosts' buffer space efficiently, and to balance network traffic load and server load. Simulation results show that with network coding, the effective buffer length of peers can be extended by reducing the storage space needed for coded content, and the server bandwidth consumption can be reduced.

1 Introduction

Multimedia streaming services, such as IPTV and VoD, are the killer applications on the Internet today. Continuous media is characterized by a large data rate and delay sensitivity, and continuous media streaming services are often asynchronous multicast in nature, i.e., more than one user may request different parts of the video at the same time. Thus, a multimedia server must disseminate the multimedia content to multiple users simultaneously. However, with limited server capacity and a lack of multicast support, scaling up multimedia streaming services to a large number of users has been a great challenge.

Recently, peer-to-peer based Application-Layered Multicast (ALM) was proposed to implement multicast mechanisms at the application layer by using only end hosts. In P2P-based ALM systems, participants are organized to form an overlay topology for data dissemination. The streaming data is divided into a sequence of segments, which are distributed by the media server to peers with asynchronous demands. Each peer has a finite buffer to cache a few contiguous segments around its play offset. Thus, peers with close play offsets can share their available segments with other peers and reduce the service load of the media server.

A peer-to-peer network exploits the resources of end hosts, such as computing power, storage space and upload bandwidth, and reduces the service load of streaming servers. One of the key elements in improving the service scalability of a P2P streaming system is to exploit the resources of end hosts well. Therefore, how to use the buffer space allocated in each peer efficiently to improve the scalability of a P2P streaming system is an important issue.

Network coding technology can improve throughput and reduce bandwidth consumption under various multicasting environments. Network coding has two features: 1) multicasting neutrality: it raises the throughput and reduces redundant data transfers; 2) stateless storage: it provides a stateless coding mechanism to encode and decode data without considering sequential issues.

This research proposes a Peer-to-peer NEtwork COding based Streaming system called PNECOS. PNECOS uses network coding technology to utilize end hosts' buffer space and reduce the server load. The key improvement of PNECOS is to extend the effect of end hosts' buffer space by means of the stateless storage feature of network coding.

To evaluate the benefit of this combination, extensive simulations are performed. The simulation results show that PNECOS outperforms a traditional multimedia streaming system, oStream [6], under various user arrival rates.

The rest of this article is organized as follows. Section 2 describes related work on peer-to-peer based multimedia streaming systems and the network coding approach. In Section 3, the architecture of PNECOS is presented. Then, Section 4 analyzes the performance of PNECOS. Finally, Section 5 draws conclusions.

2 Related Works

2.1 Multimedia Streaming Systems

In the past, many peer-to-peer based multimedia streaming systems [1–12] have been proposed to provide overlay
multimedia streaming services. These multimedia streaming systems can be classified into three categories: tree-based streaming systems, mesh-based streaming systems, and coding-based streaming systems.

Tree-based streaming systems concatenate all participants to form multicast trees for streaming video and audio data [1–4]. [4] proves the advantage of the buffer capacities of end hosts under application-layered multicasting environments. In [4], every participant in a streaming system has a fixed-size buffer to store already-played streaming data in order to support later incoming participants by sending them data without involving the VOD server.

Mesh-based streaming systems spread peers' links to different individual peers and retrieve streaming data over these links in a parallel downloading fashion [4, 5]. By using the additional links, high availability of the acquired data is expected.

In coding-based streaming systems, codecs are exploited to accelerate the download of streaming data. [7, 8] use erasure codes [9, 10] to achieve efficient bulk data download. Moreover, a novel coding technology, network coding [11, 12], can be regarded as a variant of the generalized Digital Fountain approach [9]. Erasure codes divide data into m blocks and generate m(1 + β) coded blocks on a source peer; receivers must then collect mγ distinct coded blocks to reconstruct the entire data. In network coding, a code can be combined with many other codes, and the sequential order of the received codes is insignificant for recovering the original data.

2.2 Network Coding

Network coding technology was first proposed by Ahlswede et al. [13]. Network coding can save bandwidth and improve throughput utilization. It regards the information flows in a network topology as transformable flows, coded at peer nodes, rather than as individual isolated flows. Later, Li et al. [14, 15] proposed an effective coding scheme based on linear algebra. This linear network coding (LNC) scheme formulates the behavior of a multicast diagram by regarding data chunks as vector-added linear transformations in a topology.

Figure 1. Content delivery with and without network coding.

Figure 1 presents the concept of network coding. In case A, without network coding, the source S needs to send packets a and b twice to complete the transmission, due to the bandwidth constraints. In case B, with network coding, source S sends packets a and b only once, and node Ca combines packets a and b by a network coding scheme. Node Ca then sends packet (a + b) to relay node Rc. Finally, receivers T1 and T2 can recover packets a and b by decoding (a + b) together with a (or b). Observably, the throughput and bandwidth utilization in case B are higher than in case A.
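The case B combination can be made concrete with the usual binary instance of network coding, where the intermediate node XORs the two packets. The sketch below replays the butterfly example of Figure 1 under that assumption; node and packet names follow the figure.

```python
def xor(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets (network coding over GF(2))."""
    return bytes(u ^ v for u, v in zip(x, y))

a, b = b"AAAA", b"BBBB"      # the two packets the source S multicasts

# Case B of Figure 1: node Ca codes a and b into one packet (a + b) and
# forwards it through the relay Rc, so the bottleneck link is used only once.
coded = xor(a, b)

# Receiver T1 gets a directly and the coded packet via Rc; it recovers b.
recovered_b = xor(coded, a)
# Receiver T2 gets b directly and the coded packet via Rc; it recovers a.
recovered_a = xor(coded, b)

assert recovered_a == a and recovered_b == b
```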
3 The Architecture of PNECOS

This section describes the design concept and system architecture of PNECOS. PNECOS is composed of four components: 1) the Isochronic Time Interval Unit (ITIU), a short clip of a media object used for encoding and transfer; 2) the 2-phased ITIU-LNC based circular buffer, which caches the received ITIUs and supports new incoming clients without the VOD server; 3) centralized code distribution, which regulates code dissemination using linear network coding (LNC); and 4) the Media Distribution Tree (MDT), proposed by oStream [6], which constructs streaming trees in a distributed way.

3.1 Basic Concept

In a peer-to-peer streaming system, each end host may become a lightweight proxy that reduces the bandwidth consumption of the VOD server by allocating a buffer space to cache data. When a new client requests streaming data, predecessor end hosts can serve the client as streaming hosts if the requested data exists in their buffer spaces.

Figure 2. The schema of the peer-to-peer streaming framework.

Figure 2 shows a general diagram of a peer-to-peer streaming system. Generally, streaming data is divided into several segments for transfer and playback. Each peer has a finite buffer to cache the segments. In the streaming system, segments arrive at a requesting client continually; in order to cache new segments when the client's buffer is full, the older segments must be dropped. Hence, the data caching time is determined by the buffer length.

In Fig. 2, end host B can acquire data from host A without the VOD server because host A has the desired data. Essentially, the longer the buffer a client has, the higher the probability that it can obtain data from other clients.

3.2 Isochronic Time Interval Unit (ITIU)

In PNECOS, every piece of multimedia data, such as a movie clip or sound data, is divided into data fragments of the same length for network coding. These fragments are named Isochronic Time Interval Units (ITIUs), meaning that their playback times are all the same.

ITIUs can be classified into two types, raw ITIUs and coded ITIUs. When a raw multimedia file is encoded, the file is first divided into several raw ITIUs. Then, each raw ITIU is encoded and divided into ω equal-sized coded ITIUs by linear network coding (LNC). Finally, each coded ITIU is encapsulated as a coded packet.

A coded ITIU includes its own coefficient vectors and a coded content. The coefficient vectors represent the composition of the linear combination, and the coded content is the result of that linear combination. The size of each coefficient vector field is defined as S, and the number of coefficient vectors depends on the dimension ω of the network coding. Usually, a field size of 16 bits for each coefficient vector should be enough in most practical cases [16]. The size of a coded ITIU can be formulated as

  size of a coded ITIU = (size of a raw ITIU) × (1/ω) + ω × S.   (1)

3.3 2-phased ITIU-LNC Based Circular Buffer (2-ILB)

In PNECOS, each client peer uses a circular buffer to cache streaming data. The buffer is called the 2-phased ITIU-LNC based circular Buffer, or 2-ILB for short. Each client allocates a fixed 2-ILB to cache received ITIUs. After the ITIUs are played, the client keeps them in the 2-ILB temporarily; when new incoming clients request those ITIUs, the peer forwards the cached ITIUs to the requesting peers.

The 2-ILB is divided into two areas, a raw ITIU area and a coded ITIU area. In the raw ITIU area, ITIUs are cached in raw format. In the coded ITIU area, ITIUs are cached in their coded format (1/ω of the raw size). We use α to indicate the size of the raw ITIU area and β to indicate the size of the coded ITIU area. The size ratio between α and β is determined by an alpha ratio R, which is expressed as follows:

  R = α / (α + β).   (2)

A raw ITIU stored in the 2-ILB can be encoded using linear network coding procedures and divided into ω coded ITIUs. Then, Γ copies of these coded ITIUs are picked and saved in the 2-ILB; the rest of the coded ITIUs are discarded. According to the stateless-storage feature of network coding, the saved coded ITIUs still retain partial information of the original ITIU. In other words, only part of the space of the ITIU is used for storing coded ITIUs. Compared with a regular circular buffer that does not use network coding, the 2-ILB can keep more information about media objects in a given fixed buffer. That means the 2-ILB virtually extends the effective buffer length of clients.

3.4 Centralized Code Distribution

Each coded ITIU includes coefficient vectors and a coded content. When a peer receives ω coded ITIUs of a raw ITIU whose coefficient vectors are linearly independent, the peer can recover the original ITIU. To avoid a client acquiring linearly dependent coded ITIUs, how to assign coefficient vectors is a critical issue. In PNECOS, we use an ordered prime sequence as identifiers and assign them to every joined end host; each end host can only use its identifier as the coefficient vectors to encode ITIUs. There are at least two advantages to this method. First, linear independence between coded ITIUs is guaranteed if the coded ITIUs come from different participants. Second, receivers can identify the source of a coded ITIU by its coefficient vectors.
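Recovering a raw ITIU from ω linearly independent coded ITIUs amounts to solving a linear system. Since the prime-based coefficient assignment is only outlined above, the sketch below takes explicit example coefficient vectors and exact rational arithmetic as simplifying assumptions; it illustrates LNC decoding in general, not the PNECOS implementation (which would operate over a finite field).

```python
from fractions import Fraction

def decode_itiu(coeff_rows, coded_payloads):
    """Recover the omega raw pieces of an ITIU from omega coded ITIUs.

    coeff_rows[i]    : coefficient vector of the i-th coded ITIU
    coded_payloads[i]: its payload (a list of numbers of equal length)
    Raises ValueError if the coefficient vectors are linearly dependent.
    """
    n = len(coeff_rows)
    m = len(coded_payloads[0])
    # Augmented matrix [A | P] over exact rationals.
    a = [[Fraction(x) for x in coeff_rows[i]] +
         [Fraction(x) for x in coded_payloads[i]] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            raise ValueError("coefficient vectors are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        inv = a[col][col]
        a[col] = [v / inv for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [a[r][k] - factor * a[col][k] for k in range(n + m)]
    return [[int(v) if v.denominator == 1 else v for v in row[n:]] for row in a]

# Two coded ITIUs built from raw pieces p1, p2 with coefficient vectors
# (1, 1) and (1, 2); the decoder recovers p1 and p2 exactly.
p1, p2 = [10, 20, 30], [7, 8, 9]
coded = [[p1[k] + p2[k] for k in range(3)], [p1[k] + 2 * p2[k] for k in range(3)]]
print(decode_itiu([[1, 1], [1, 2]], coded))   # [[10, 20, 30], [7, 8, 9]]
```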
3.5 Media Distribution Tree

In a peer-to-peer streaming system, organizing peers into groups for transferring media data is the key to bootstrapping system performance. In PNECOS, each peer requests a specific offset position of a media object for playback, so PNECOS arranges the peers by their requested offset positions. PNECOS applies the Media Distribution Tree (MDT) from oStream [6] to construct and maintain the overlay network. Before constructing an MDT, a Media Distribution Graph (MDG), which is a directed acyclic weighted graph, must be set up to provide temporal information among the requests.

Figure 3. The diagram of the MDG (a) and the MDT (b).

Figure 3 demonstrates the MDG and the MDT. In Fig. 3, S denotes the VOD server, Rn denotes requests, and n indicates the joining order of the requests. In the MDG, the VOD server is the default server for all requests. Every request peer has edges connected from all reachable predecessors that have cached streaming data capable of supporting this request peer; this graph is called the MDG. As shown in Fig. 3(a), peer R3 can be supported by R1 and S, but R4 is only supported by S. After constructing a minimum spanning tree (MST) from the MDG, an MDT is built and media data can be streamed from the streaming server.

PNECOS can provide raw ITIUs and coded ITIUs to request peers. For a participant, acquiring coded ITIUs means that it must spend more computing power to decode them. Nevertheless, PNECOS may assign a peer to acquire coded ITIUs from other peers because the minimum spanning tree algorithm is used as in oStream [6], even though one or more sources could forward raw ITIUs to this new request peer. This is because, in oStream, the only difference between all qualified parent candidates is their distance to the new request peer.

Therefore, picking a suitable parent from the candidate peers cannot rely only on physical characteristics such as RTT (round-trip time) or hop count; the difference between the raw ITIU buffer and the coded ITIU buffer must also be taken into consideration. Thus, PNECOS extends the distributed algorithms in [6] that maintain the MDT: MDT-Insert and MDT-Delete, which are executed when a new request arrives and when an existing request leaves, respectively.

The difference between the original MDT-Insert/MDT-Delete and the modified ones lies in the selection procedure. In the original MDT-Insert/MDT-Delete, the VOD server and the other candidates are selected based on their link cost. In PNECOS, the VOD server is selected only when the required media data has no sufficient source. The proposed selection procedure is divided into three phases:

1. Pick a parent whose raw ITIU buffer covers the requested range. To avoid the complicated decoding process, the request should connect to a raw ITIU source whenever it can. The reason to exclude the VOD server from this candidate set is to give the request peer a chance to use coded ITIUs from other sources.

2. If no parent is found for raw data download service, pick ω coded parents whose coded ITIU buffers cover the requested range, where ω denotes the dimension used in the network coding. In this phase, the VOD server is included in the candidate selection.

3. Complement the rest of the needed coded ITIUs by using the VOD server. If the request gets only one coded parent, no source peer other than the VOD server can provide the needed coded ITIUs; therefore, the VOD server is assigned as its parent for coded ITIU streaming. If fewer than ω but more than one coded parents are found, the VOD server is also used to complement the rest of the needed coded ITIUs.

Notably, with the modified MDT-Insert and MDT-Delete, PNECOS no longer uses a minimum spanning tree, which causes larger latency in receiving streaming data. However, since appropriate parents or coded parents are allocated to every request, the workload of the VOD server is substantially reduced.
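The three-phase selection procedure can be summarized by the following sketch. The candidate list and the helper predicates in_itiu_range and in_coded_range are hypothetical placeholders for the buffer-range tests described above; the real MDT-Insert runs over the distributed MDG state of [6].

```python
def select_parents(candidates, vod_server, omega, in_itiu_range, in_coded_range):
    """Return (raw_parent, coded_parents) for a new request.

    candidates: predecessor peers (the VOD server is passed separately).
    """
    # Phase 1: prefer a single raw-ITIU parent; the VOD server is excluded so
    # that coded ITIUs cached at other peers get a chance to be used.
    raw_parents = [p for p in candidates if in_itiu_range(p)]
    if raw_parents:
        return raw_parents[0], []

    # Phase 2: otherwise collect up to omega coded parents (VOD server allowed).
    coded_parents = [p for p in candidates + [vod_server] if in_coded_range(p)]
    coded_parents = coded_parents[:omega]

    # Phase 3: if fewer than omega coded parents were found, the VOD server
    # complements the missing coded ITIUs.
    if len(coded_parents) < omega and vod_server not in coded_parents:
        coded_parents.append(vod_server)
    return None, coded_parents

# Toy usage: neither candidate holds the raw range, so for omega = 2 the
# request is served by two coded parents.
print(select_parents(["peerA", "peerB"], "VOD", 2,
                     in_itiu_range=lambda p: False,
                     in_coded_range=lambda p: True))
```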
Table 1. System parameters in the simulation.
  T (60 minutes): the time length of the distributed CBR visual media object.
  W (0.05): the ratio of the buffer length to the length of the media object.
  Γ (1): the number of coded ITIUs reserved after the original ITIU is encoded.
  ω (2, 4): the network coding dimension, which also denotes the number of segments an ITIU is divided into.
  R (0–0.9 at 0.1 intervals): the ratio between the sizes of the streaming buffer areas for caching raw ITIUs and coded ITIUs, respectively; R indicates the proportion of the whole buffer used to cache raw ITIUs.

4 Performance Evaluation

4.1 Simulation Environment

We evaluate the performance of PNECOS in terms of Server Bandwidth Consumption (SBC) by computer simulations. The inter-arrival time of requests for acquiring an entire media object follows a Poisson distribution with arrival rate λ. In all simulations, the case of distributing a single CBR visual media object is considered. The simulated time is 12 hours. The distributed media object is one hour long, and its playback rate is 1 bit/sec [6]. Although this bitrate setting may seem inscrutable, it simplifies the conversion of the simulation data for statistical analysis. We also implement oStream [6] as the reference for comparing the difference between using network coding and not using it. In the simulation, all requests acquire the entire content of the media object. The unit of SBC is the average amount of streamed data per hour, in multiples of the total size of one media object. The system parameters of the simulations can be found in Tab. 1.

Figure 4. Server bandwidth consumption (W = 0.05T, Γ = 1, ω = 2).
Figure 5. Server bandwidth consumption (W = 0.05T, Γ = 1, ω = 4).

4.2 Simulation Results

As shown in Fig. 4 and Fig. 5, the curve of oStream is higher than that of PNECOS because oStream uses the whole buffer to cache raw streaming data. With network coding, the effective buffer length can be extended by reducing the storage space for coded ITIUs, because each ITIU is encoded and divided into ω coded ITIUs and only Γ coded ITIUs are saved, even though each coded ITIU keeps less information than a raw ITIU.

As mentioned in the last section, by using network coding the effective buffer length can be prolonged, and a longer buffer length gives more chances to support new requests without involving the VOD server. The effective buffer length is calculated by the following formula:

  effective buffer length = W × (1 − R) × ω / Γ,   (3)

where W × (1 − R) indicates the physical buffer length used for caching, and ω/Γ is the extension factor for the playback time of the media cached in W × (1 − R). Accordingly, the effective buffer length can be adjusted by R and Γ. With a small R, W × (1 − R) offers a longer physical buffer length for streaming. With a smaller Γ, ω/Γ provides a larger amplification factor that converts the physical buffer length into a longer effective buffer length. Although both parameters serve to prolong the streaming buffer length, the effects brought by R and Γ are different.

In PNECOS, the degree of improvement in SBC is dominated by R, ω and Γ. Γ denotes the number of reserved coded ITIU segments. If Γ equals ω, no storage space is saved, because the coded buffer is fully used to cache the coded ITIUs of a raw ITIU. If Γ is more than one, every predecessor supports a descendant with more than one coded ITIU; in this case, the predecessor needs to avoid linear dependence between the generated coded ITIUs. In our simulation, Γ is always set to one for maximal storage space saving.

The parameter R states the ratio of the buffer area used for storing raw ITIUs. Clearly, if R is small, the available buffer space for coded ITIUs is large. When R is zero, all buffer space is used to store coded ITIUs; the buffer can then hold media data spanning the longest playback time, but all descendants can only receive coded ITIUs from their predecessors. The parameter ω defines the number of coded ITIUs an ITIU is divided into. Increasing ω can effectively extend the playback time of the cached media data. At the same time, a request then also needs additional coded sources to acquire a sufficient number of coded ITIUs for media recovery. However, the increased number of required sources raises the chance of acquiring coded ITIUs from the VOD server, which raises SBC. Thus, there is a tradeoff in selecting ω. The prolonged buffer length can effectively increase the hit rate and helps PNECOS to reduce SBC.
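A quick numeric reading of Eq. (3) with the Table 1 settings (W = 0.05 of a 60-minute object, Γ = 1, ω ∈ {2, 4}) illustrates the interplay of R and ω; note that, as stated, the formula amplifies only the coded area W × (1 − R).

```python
def effective_buffer_length(W, R, omega, gamma=1.0):
    """Eq. (3): only the coded area W*(1-R) is amplified by omega/gamma."""
    return W * (1 - R) * omega / gamma

T = 60.0      # length of the media object in minutes (Table 1)
W = 0.05      # physical buffer as a ratio of T (Table 1)
for omega in (2, 4):
    for R in (0.0, 0.5, 0.9):
        eff = effective_buffer_length(W, R, omega)
        print(f"omega={omega}, R={R:.1f}: effective buffer "
              f"{eff * T:.1f} min vs. physical {W * T:.1f} min")
```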
As demonstrated in Fig. 4 and Fig. 5, SBC becomes much smaller as ω increases. The effect is especially apparent when the whole buffer is used to store coded ITIUs. For each curve in Fig. 4 and Fig. 5, the peak of the curve indicates the maximal volume of SBC. As discovered in oStream [6], these peaks denote the thresholds at which the expected length of the streaming trees starts to exceed the intervals between individual requests, so that requests can be concatenated beyond the corresponding arrival rates. Obviously, in Fig. 4 and Fig. 5, as R decreases, the peak of the curve shifts from right to left. This phenomenon shows that the effective buffer length is indeed prolonged by using network coding.

5 Conclusion

This work proposes a Peer-to-peer NEtwork COding based Streaming architecture (PNECOS). PNECOS exploits the storage feature of network coding to extend the effective buffer length and simplify the data downloading process. In PNECOS, network coding provides a way to recover encoded data regardless of its sequential order and to reduce storage space by blending and distributing the data information over all divided data parts. We compare PNECOS with the oStream system by computer simulation in terms of server bandwidth consumption. Simulation results show that with network coding, the effective buffer length of peers can be extended by reducing the storage space for coded content, and the server bandwidth consumption can be reduced.

Acknowledgment

This paper was sponsored in part by the "Aim for the Top University Plan" of Yuan Ze University and the Ministry of Education, Taiwan, R.O.C., and by the National Science Council, Taiwan, R.O.C. under Contract No. NSC96-2221-E-155-033.

References

[1] Y.-H. Chu, S. G. Rao, and H. Zhang, "A case for end-system multicast," in Proc. ACM SIGMETRICS 00, pp. 1–12, 2000.
[2] P. Francis, S. Ratnasamy, R. Govindan, and C. Alaettinoglu, Yoid project. http://www.icir.org/yoid/
[3] M. Castro, P. Druschel, A.-M. Kermarrec, A. Nandi, A. Rowstron, and A. Singh, "SplitStream: High-bandwidth multicast in cooperative environments," in Proc. ACM SOSP, pp. 298–313, 2003.
[4] P. Rodriguez-Rodriguez and E. Biersack, "Dynamic parallel access to replicated content in the Internet," IEEE/ACM Trans. Netw., vol. 10, no. 4, pp. 455–465, 2002.
[5] J. Byers, M. Luby, and M. Mitzenmacher, "Accessing multiple mirror sites in parallel: Using Tornado codes to speed up downloads," in Proc. IEEE INFOCOM 99, vol. 1, pp. 275–283, 1999.
[6] Y. Cui, B. Li, and K. Nahrstedt, "oStream: Asynchronous streaming multicast in application-layer overlay networks," IEEE J. on Sel. Areas in Commun., vol. 22, no. 1, pp. 91–106, Jan. 2004.
[7] D. Kostic, A. Rodriguez, J. Albrecht, and A. Vahdat, "Bullet: High bandwidth data dissemination using an overlay mesh," in Proc. ACM SOSP 03, pp. 282–297, 2003.
[8] J. Byers, J. Considine, M. Mitzenmacher, and S. Rost, "Informed content delivery across adaptive overlay meshes," IEEE/ACM Trans. Netw., vol. 12, no. 5, pp. 767–780, Oct. 2004.
[9] J. Byers, M. Luby, M. Mitzenmacher, and A. Rege, "A digital fountain approach to reliable distribution of bulk data," in Proc. ACM SIGCOMM 98, pp. 56–67, 1998.
[10] P. Maymounkov and D. Mazières, "Rateless codes and big downloads," in Proc. IPTPS'03, pp. 247–255, 2003.
[11] C. Gkantsidis and P. R. Rodriguez, "Network coding for large scale content distribution," in Proc. IEEE INFOCOM 2005, vol. 4, pp. 2235–2245, 2005.
[12] S. Acedanski, S. Deb, M. Medard, and R. Koetter, "How good is random linear coding based distributed networked storage?," in Proc. NetCod 2005, Italy, Apr. 2005.
[13] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network information flow," IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, July 2000.
[14] S.-Y. R. Li, R. W. Yeung, and N. Cai, "Linear network coding," IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371–381, Feb. 2003.
[15] S.-Y. R. Li, N. Cai, and R. W. Yeung, "On theory of linear network coding," in Proc. IEEE ISIT 05, pp. 273–277, Sept. 2005.
[16] T. Ho, R. Koetter, M. Medard, D. R. Karger, and M. Effros, "The benefits of coding over routing in a randomized setting," in Proc. IEEE ISIT, p. 442, July 2003.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing
A Common Concept Description of Natural Language Texts as the Foundation of Semantic Computing on the Web

Mitsuru Ishizuka
School of Information Science and Technology
The University of Tokyo
ishizuka@i.u-tokyo.ac.jp

Abstract

In order to intelligently process the vast information on the Web, we need to make computers understand the meaning of Web contents and manipulate them taking account of their semantics. Since text is the major medium conveying information, it is natural and reasonable to set it as the immediate target whose meaning the computer should understand, although there are other types of media such as pictures, movies, etc. In this direction, the activity of the Semantic Web is going on. It aims to establish a standardized machine-readable description format for meta-data. However, the meta-data are only fragments of the Web contents.

Unlike the Semantic Web, we aim to describe the conceptual meaning expressed in whole natural language texts with a common format that the computer can understand. We have designed the Concept Description Language (CDL) as a vehicle for this end, and have started its standardization activity in the W3C. There are several levels of meaning of texts, ranging from shallow to deep. While it is still difficult to reach a consensus on how to describe the deep meaning, we think that a certain consensus can be attained on a way of describing the shallow meaning of texts, based on the research results accumulated in the field of natural language processing, such as machine translation, over the last several decades.

In CDL, besides lexicons, 45 relations are pre-defined as being necessary and sufficient for denoting every semantic relation between entities (lexicons in a simple case). These CDL relations can be used universally, whereas the ontologies in the Semantic Web are domain dependent and thus cause some problematic situations.

Current issues of CDL include, among others, an easy semi-automatic way of converting natural language texts into the CDL description, and an effective mechanism for executing semantic retrieval on a CDL database. We believe that CDL will contribute to building a framework for the next-generation Web, which provides the foundation for a variety of semantic computing. CDL may also contribute to overcoming the language barrier among nations.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing
Development of an Integrated Platform for Social Collaborations Based on Semantic Peer Network

Ching-Long Yeh1, Yun-Maw Cheng2 and Li-Chieh Chen3
1,2 Department of Computer Science and Engineering,
3 Department of Industrial Design,
Tatung University, Taipei, 104 Taiwan
1 chingyeh@cse.ttu.edu.tw  2 kevin@ttu.edu.tw  3 lcchen@ttu.edu.tw

Extended Abstract

In the emerging Web 2.0, the popular services are the sharing of contents of various sorts, including, for example, audio (e.g., http://youtube.com/), photos (http://www.flickr.com/), weblogs (https://www.blogger.com/), encyclopedias (http://wikipedia.org/), friends (http://www.myspace.com/), social networks (http://www.facebook.com/), and peer-to-peer file sharing (http://www.bittorrent.com/). Because of information sharing, the flow of content becomes multi-directional, i.e., a user can be not only a consumer but also a provider of content [1]. Adding the newer services to the web produces a more collaborative environment, for example in e-learning [2, 3]. Using the newer services, learners can not only obtain course material from instructors' web sites but also share their opinions with peer learners to form more collaborative work [4]. The current Web 2.0 services, or social software systems, are primarily implemented using current web technology with lightweight programming in XML [5]. From the user's perspective, she can access the services using an ordinary browser without worrying about installing new programs on her machine. However, the social software systems all have their own proprietary databases. Therefore, to access various services, a user has to disperse her profiles and contributed contents all over different sites, having little autonomy, and pay attention to managing them in different formats. Furthermore, the situation gets worse as more services are acquired. Obviously, keeping user-generated contents, including articles and their associated metadata, in the user's own site instead of distributing them over the social systems would make it easier for the user to manage the contents. In the resulting peer-to-peer architecture, each user site can act as content provider and requester as well. The peer-to-peer style of social collaboration would also reduce the effort of building centralized servers, by making use of the storage and computing capabilities of each site. In this paper, we propose to develop an integrated platform for social collaborations that supports users' autonomous management of their contents.

The infrastructure of the platform is a semi-structured peer-to-peer network, consisting of simple nodes that provide common collaborative services for users, and rendezvous nodes that provide additional information aggregation services for node users. Through the collaboration service interface, a user can access the services familiar from social software, including weblogs, wikis, annotation, bookmark sharing, etc. Using the service interface, a user can manage (add, modify, delete, and query) her own contents and metadata, and acquire contents from other nodes. Users register their profiles of interest with the aggregation services in specific rendezvous nodes, and the latter acquire newly created contents from other nodes and disseminate the matched ones to the appropriate destination nodes.

To support the above technical architecture, we design a three-layered system architecture in each peer node. The bottom layer is a peer network providing the connection and message services among peer nodes. The middle layer is a distributed knowledge-based system constructed using Semantic Web technology. The knowledge-based system consists of a knowledge base and service interfaces; the former is built on the Semantic Web languages RDF [6], OWL [7] and SWRL [8], and the latter is a set of management and inference functions over the knowledge base for the collaboration services at the top layer. According to the "Semantic Web Stack" [9, 10], this layer corresponds to the ontology and rule layers in the Semantic Web architecture.
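As an illustration of the middle layer, the sketch below stores a few RDF facts about a shared weblog entry and runs a SPARQL lookup over them, assuming the rdflib library and an invented example vocabulary; it is not the platform's actual API.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/collab#")      # hypothetical vocabulary
g = Graph()

alice = URIRef("http://example.org/users/alice")
post = URIRef("http://example.org/posts/42")

# RDF facts a simple node might keep about one of its resources.
g.add((alice, RDF.type, FOAF.Person))
g.add((alice, FOAF.name, Literal("Alice")))
g.add((post, RDF.type, EX.WeblogEntry))
g.add((post, EX.author, alice))
g.add((post, EX.topic, Literal("e-Learning")))

# Find weblog entries on a given topic together with their authors' names.
query = """
    SELECT ?entry ?name WHERE {
        ?entry a ex:WeblogEntry ;
               ex:topic "e-Learning" ;
               ex:author ?who .
        ?who foaf:name ?name .
    }"""
for entry, name in g.query(query, initNs={"ex": EX, "foaf": FOAF}):
    print(entry, name)
```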
The knowledge base contains RDF facts describing resources either in the local storage or in other peer nodes, and SWRL rules required for expressing the relationships implicit in the resources of application domains. The vocabulary used in the RDF facts is either taken from vocabularies commonly used in collaboration software, for example RSS [11] and FOAF [12], or constructed following certain engineering steps for the various application domains [13, 14]. The former kind of vocabulary is used in the social collaboration services to manage the associated resources, while the latter is used to categorize the resources in application domains.

Using the application interfaces of the management functions in the middle layer, the service programs at the top layer can discover, obtain, and modify the content of the distributed knowledge base in a trustable way. We implement a number of trust management functions in the middle layer. When a user finds a resource on the semantic peer network, she can also query the trust rank of the resource by consulting other node users' opinions of it. Furthermore, the user can decide for herself to what degree she trusts others' opinions about the resource.

In this paper we investigate the benefits of using the collaboration services in two case studies: e-learning and industrial knowledge sharing. In the first case, learners can not only consume course material but also annotate it with comments, post questions, and retrieve answers on any portion of the material while reading it. On the other hand, instructors not only play the role of content providers but also listen to learners' feedback about the course material. Because of the metadata in RDF, users can also benefit from aggregation services.

In the second case, we develop an industrial domain schema that provides vocabulary for users to describe their shared resources in more detail. Conversely, a user can find resources that are close to what she has in mind using the same set of vocabulary. In addition to finding resources, a user can also rank the found resources using the trust management facilities to decide whether to carry out further downloading.

References

[1] T. O'Reilly, "What is Web 2.0: Design Patterns and Business Models for the Next Generation of Software", O'Reilly, 2005. Available at: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
[2] M. Owen, L. Grant, S. Sayers, and K. Facer, "Social Software and Learning", FutureLab, Bristol, UK. Available at: http://www.futurelab.org.uk/research/opening_education/social_software_01.htm
[3] P. Anderson, "What is Web 2.0? Ideas, Technologies and Implications for Education", JISC Technology and Standards Watch, 2007. Available at: http://www.jisc.ac.uk/media/documents/techwatch/tsw0701b.pdf
[4] T. Franklin and M. van Harmelen, "Web 2.0 for Content for Learning and Teaching in Higher Education", JISC Publications, 2007. Available at: http://www.jisc.ac.uk/media/documents/programmes/digitalrepositories/web2-content-learning-and-teaching.pdf
[5] J. J. Garrett, "AJAX: A New Approach to Web Applications", Adaptive Path, 2005. Available at: http://www.adaptivepath.com/ideas/essays/archives/000385.php
[6] D. Beckett (ed.), "RDF/XML Syntax Specification (Revised)", W3C, 2004. Available at: http://www.w3.org/TR/rdf-syntax-grammar/
[7] D. McGuinness and F. van Harmelen, "OWL Web Ontology Language Overview", W3C, 2004. Available at: http://www.w3.org/TR/owl-features/
[8] I. Horrocks, et al., "SWRL: A Semantic Web Rule Language Combining OWL and RuleML", W3C, 2004. Available at: http://www.w3.org/Submission/SWRL/
[9] T. Berners-Lee, "WWW Past & Future", W3C, 2003. Available at: http://www.w3.org/2003/Talks/0922-rsoc-tbl/
[10] I. Horrocks, et al., "Semantic Web Architecture: Stack or Two Towers?", in Principles and Practice of Semantic Web Reasoning, Lecture Notes in Computer Science, Vol. 3703, Springer, 2005, pp. 37–41.
[11] M. Pilgrim, "What Is RSS", O'Reilly, 2002. Available at: http://www.xml.com/pub/a/2002/12/18/dive-into-xml.html
[12] L. Dodds, "An Introduction to FOAF", O'Reilly, 2004. Available at: http://www.xml.com/pub/a/2004/02/04/foaf.html
[13] N. F. Noy and D. L. McGuinness, "Ontology Development 101: A Guide to Creating Your First Ontology", Stanford Knowledge Systems Laboratory Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880, March 2001.
[14] A. Schreiber, et al., Knowledge Engineering and Management: The CommonKADS Methodology, MIT Press, Cambridge, MA, 2002.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing
A Survey of State of the Art Biomedical Text Mining Techniques for Semantic Analysis

Hong-Jie Dai1,3 (hongjie@iis.sinica.edu.tw), Chi-Hsin Huang1 (sinyuhgs@iis.sinica.edu.tw),
Jaimie Yi-Wen Lin1 (jaimie@iis.sinica.edu.tw), Pei-Hsuan Chou1 (onlytaco@gmail.com),
Richard Tzong-Han Tsai2 (thtsai@saturn.yzu.edu.tw), and Wen-Lian Hsu1,3,* Fellow, IEEE (hsu@iis.sinica.edu.tw)

1 Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C.
2 Dept. of Computer Science & Engineering, Yuan Ze Univ., Taoyuan, Taiwan, R.O.C.
3 Dept. of Computer Science, National Tsing-Hua Univ., Hsinchu, Taiwan, R.O.C.
* Corresponding author
Abstract

The abstract is to be in fully-justified italicized text, at the top of the left-hand column as it is here, below the author information. Use the word "Abstract" as the title, in 12-point Times, boldface type, centered relative to the column, initially capitalized. The abstract is to be in 10-point, single-spaced type, and up to 150 words in length. Leave two blank lines after the abstract, then begin the main text.

1. Introduction

In a 2004 survey, Cohen et al. [7] observed that the phenomenal growth of the biomedical literature poses a major problem for biologists. At present, there are approximately seventeen million articles in the MEDLINE/PubMed database. Clearly, applications that could automatically extract useful information from such massive information sources would greatly facilitate biological research.

The past few years have seen a great deal of research activity in the field of biomedical text mining. The BioCreAtIvE (Critical Assessment of Information Extraction systems in Biology) [18] task, first held in 2003 and again in 2006, has provided a standard training/evaluation dataset and well-defined evaluation metrics for biomedical text mining and information extraction. The task has facilitated cooperation and collaboration between research teams from institutions worldwide and provided a forum for biomedical text mining research. For example, in 2004, the National Centre for Text Mining (NaCTeM) [2] was established by the University of Manchester, the University of Liverpool and the University of Salford with the objective of offering high-quality biological and biomedical text mining services. Many other projects have sprung up in the wake of BioCreAtIvE [18]. A survey conducted by Alex et al. [1] of the latest biomedical natural language processing (NLP) technology showed that a maximum reduction of one-third in curation time can be expected, showing that biomedical text mining is a promising field.

In this paper, we present a survey of recent biomedical text mining works published between the end of 2006 and the beginning of 2008. The survey covers 13 openly available named entity recognition (NER) systems, four semantic role labeling (SRL) corpora, two event corpora, and 12 text mining-based web services.

2. Biological Named Entity Recognition

The first step in biological text mining is the identification of biological entities, referred to as the NER task. NER is important because it is a fundamental step for tasks such as information extraction, summarization, and question answering. In the biological realm, the types of named entities (NEs)
are wider in scope than the generic entity types numeric normalized value for them is IL0; hence, the
PERSON, ORGANIZATION and LOCATION unseen surface forms, such as IL4, in the training data
defined in [5]. In 2004, the JNLPBA [20] open have the same representation as forms that are seen.
challenge task for bio-NER simplified the 36 entity AIIAGMT combines the tagging results with forward
classes in the GENIA corpus [21] and used only five and backward parsing to improve its performance [26].
classes, namely protein, DNA, RNA, cell line, and cell In addition to the above online NER services, three
type, to evaluate the performance of the participating downloadable tools, Penn BioTagger [15], GENIA
systems. Unlike the earliest rule-based NER system Tagger [48] and BANNER [28], have been released.
[14], the following four types of classification models Penn BioTagger was trained by using the k-best MIRA
were applied by the participating teams: Support learning algorithm [29] with lexicons and
Vector Machines (SVMs) [16, 34], Hidden Markov automatically derived word clusters. It achieved a final
Models (HMMs) [50], Maximum Entropy Markov Models (MEMMs) [13] and Conditional Random Fields (CRFs) [38]. The most frequently applied models were SVMs. The evaluation results showed that SVMs worked better in combination with other models, while the other three models yielded a reasonable performance in isolation [20]. However, the CRF system proposed by Settles [38] achieved a performance comparable to that of the top-ranked systems [16] with a simple feature set, which suggests that integration of more useful features may further improve NER performance.

In 2006, the second BioCreAtIvE workshop organized a gene mention tagging task [42], which involved 21 teams. In contrast to JNLPBA 2004, half of the teams used CRFs as their machine-learning models, and almost all participating teams used machine-learning-based approaches. This indicates that, since annotated corpora became available, machine-learning approaches have become the mainstream for NER tasks [51]. Specifically, the most popular model is CRFs [9, 27, 30, 44].

One contribution of the second BioCreAtIvE workshop was the launch of the BioCreative MetaServer (BCMS) online web service, which integrates about twenty annotation servers in different countries to provide NER annotation services. Users or computer programs can simply access the service via BCMS's uniform application programming interfaces without considering the fact that the annotation results derive from different annotation servers using a variety of approaches. The scalability (number of participants) of BCMS is its main advantage.

Some participating teams made their gene mention tagging tools openly available [9, 19]. NERBio [9] and AIIAGMT [19], which are both based on CRFs, are easy-to-use online tools for detecting gene and gene product names in free text. NERBio applies the numerical normalization technique [44] to substantially reduce the number of features required for machine-learning training, and also to improve the accuracy of feature weight estimation. Numerical normalization is useful because entity names often occur in a series, such as the gene names IL2, IL3, and so on. The F-measure of 86.28% (ranked 5 in the second BioCreAtIvE workshop). Originally, the GENIA Tagger only output base forms, part-of-speech tags and chunk tags, but the latest version, GENIA Tagger 3.0, also supports NE tags. BANNER is an open-source bio-NER tool that uses CRFs with carefully selected feature sets and the numerical normalization technique [44]. The evaluation results [28] show that BANNER yields a significantly better performance than existing open-source systems, including ABNER [39] and LingPipe [3]. Because BANNER is open-source, it can be re-trained with new NER corpora; hence, researchers who require a baseline system can use it as a benchmark for evaluating new methods they propose for NER tasks.

In contrast, the BioCaster text mining project (http://biocaster.nii.ac.jp), which is dedicated to the detection and tracking of disease outbreaks from Internet news articles, provides a totally different perspective on NER problems. The above NER tasks defined several annotation schemas, such as DNA and RNA, for biomedical text; however, until recently, little work had been done on developing a schema specifically for public health related texts. In 2007, Doan et al. [10] of the BioCaster project developed an annotation schema to fill this research gap. They identified several important concepts that reflect information about infectious diseases, and created guidelines for annotating them as target NE classes. In total, 18 concepts are specified as NE classes, namely PERSON, LOCATION, ORGANIZATION, TIME, DISEASE, CONDITION, OUTBREAK, VIRUS, ANATOMY, PRODUCT, NONHUMAN, DNA, RNA, PROTEIN, CONTROL, BACTERIA, CHEMICAL, and SYMPTOM.
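The numerical normalization idea mentioned above for NERBio and BANNER can be illustrated with a small, self-contained sketch. The helper below is hypothetical (it is not taken from either tool); it simply maps every digit run in a token to one placeholder so that series such as IL2, IL3 and IL100 share a single feature.

```python
import re

def normalize_numbers(token: str) -> str:
    """Map every run of digits in a token to the placeholder 'NUM'.

    Minimal sketch of numerical normalization: IL2, IL3 and IL100 all
    become 'ILNUM', so a tagger trained with this feature sees one
    pattern instead of hundreds of distinct gene-name variants.
    """
    return re.sub(r"\d+", "NUM", token)

if __name__ == "__main__":
    for tok in ["IL2", "IL3", "p53", "CD28", "kinase"]:
        print(tok, "->", normalize_numbers(tok))
    # IL2 -> ILNUM, IL3 -> ILNUM, p53 -> pNUM, CD28 -> CDNUM, kinase -> kinase
```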

Table 1. Openly available NER tools or services
Name Description URL
AbbreviationServer [5] Biomedical abbreviation server http://bionlp.stanford.edu/abbreviation/
AbGene [50] Protein name tagger ftp://ftp.ncbi.nlm.nih.gov/pub/tanabe
ABNER [46] Protein/Gene/DNA/RNA/cell tagger http://pages.cs.wisc.edu/~bsettles/abner/
AIIAGMT [22] Gene and protein name tagger http://140.109.23.113/AIIAGMT/index.html
AliasServer [23] Protein alias handler http://cbi.labri.fr/outils/alias/index.php
BANNER [33] Gene and protein name tagger http://banner.sourceforge.net/
BioCaster [13] Health protection roles tagger http://biocaster.nii.ac.jp/
BCMS Gene and protein name tagger http://bcms.bioinfo.cnio.es
GAPSCORE [6] Protein name tagger http://bionlp.stanford.edu/gapscore
GENIA Tagger [56] Protein/Gene/DNA/RNA/cell tagger http://text0.mib.man.ac.uk/software/geniatagger/
NERBio [12] Gene and protein name tagger http://asqa.iis.sinica.edu.tw/biocreative2/
NLPort Tagger [36] Protein name tagger http://cubic.bioc.columbia.edu/services/NLProt/
Penn BioTagger [18] Gene and protein name tagger http://www.seas.upenn.edu/~strctlrn/BioTagger/BioTagger.html

After defining the 18 categories, Doan et al. trained the BioCaster tagger with the Naïve Bayes classifier [31].

We reviewed nine NER tools described in [25], and summarize all currently available NER tools in Table 1.

3. Biological Relation Corpora

In this section, we shift our focus from the fundamental NER task to the task of extracting verbal information that represents the relations between NEs.

The simplest way to detect the relations between NEs is to collect texts in which they co-occur. In most cases, co-occurrence statistics provide high recall but poor precision; however, they can often be used as a baseline system against which other methods can be compared [16]. Advanced approaches that determine the roles played by NEs can be roughly classified into three categories. (1) Pattern-based methods, which map words, parts-of-speech, or NE sequences into structural information slots according to predefined patterns and matching rules [32-34]. (2) Natural language processing based methods, which may use full parsing or shallow parsing information to extract subject/object information from predefined frames [35, 36]. Huang et al. [37] proposed a hybrid method with both shallow parsing and pattern matching, while a completely different technique that utilizes a Web search engine was proposed by Mukherjea et al. [32]. (3) The semantic role labeling (SRL) technique, which we discuss in detail below.

In SRL, sentences are represented by one or more predicate-argument structures (PASs), also known as propositions [33]. Each PAS is composed of a predicate (e.g., a verb) and several arguments (e.g., noun phrases and adverbial phrases) that have different semantic roles. The roles include main arguments, such as an agent and a patient, as well as adjunct arguments, such as time, manner, and location. In 2004, the PASBio [49] project released a set of PASs for a small set of biomedically relevant verbs. PASBio is specifically designed for annotating molecular events and defining core arguments that are important for completing the meaning of an event.

Because the PASBio project only focused on the creation of a semantic lexicon and annotation guidelines, some researchers have extended it to create useful biomedical applications. For instance, Kogan et al. [23] extended PASBio to build a domain-specific set of PASs for the medical domain, while Shah et al. [40] used PASBio's representation scheme to construct semantic patterns for the LSAT (Literature Support for Alternative Transcripts) database system. In 2006, Shah et al. [41] annotated a small PASBio corpus to build a semantic role labelling system. They showed that a prior binary classification step could constrain the number of predicates, and provided greater insight into the semantic roles of sentence constituents for biomedical event extraction. These successful applications show that the PASBio method and its specific representational schemes are adequate for the general problem of representing molecular biology concepts [8].

In 2006, Chou et al. [6] proposed another realizable approach for constructing a biomedical proposition bank on top of the GENIA Treebank (GTB) [43]. To construct their biomedical proposition bank, they first employed the rich resources of PropBank [33] in the general English domain to build an SRL system [47]. They then used the SRL system to automatically annotate the semantic roles in GTB and construct the biomedical proposition bank called BioProp [6]. The project involved annotating the arguments of 30 frequent biomedical verbs.
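To make the proposition format concrete, the snippet below sketches one possible in-memory representation of a PAS. The role labels (Arg0, Arg1, ArgM-LOC) follow PropBank conventions, and the example sentence is the one discussed later in Section 4; the data structure itself and the field choices are illustrative assumptions, not the actual BioProp or BIOSMILE format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Argument:
    role: str   # e.g. "Arg0" (agent), "Arg1" (theme), "ArgM-LOC" (location)
    text: str   # the text span filling the role

@dataclass
class Proposition:
    predicate: str
    arguments: List[Argument] = field(default_factory=list)

# A hand-made proposition in the spirit of the PAS annotations discussed
# above (the role assignments here are illustrative only).
pas = Proposition(
    predicate="enhanced",
    arguments=[
        Argument("Arg0", "KaiC"),                    # the enhancer
        Argument("Arg1", "KaiA-KaiB interaction"),   # what is enhanced
        Argument("ArgM-LOC", "in vitro and in yeast cells"),
    ],
)

for arg in pas.arguments:
    print(f"{pas.predicate}\t{arg.role}\t{arg.text}")
```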

Table 2. Biological Relation corpora
Name Description URL
BioInfer [42] A biological relationships corpus http://mars.cs.utu.fi/BioInfer/
GENIA event corpus [26] A biological event annotation corpus http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/home/wiki.cgi?page=Event+Annotation
Kogan et al. [27] A medical domain SRL corpus http://ycmi.med.yale.edu/krauthammer/rolelabeling.htm
LSAT [48] Literature Support for Alternative Transcripts http://www.bork.embl.de/LSAT/
PASBio [57] A set of PASs for semantic roles of biomedical verbs http://research.nii.ac.jp/~collier/projects/PASBio/

In contrast to PASBio, BioProp does not place any biomedical constraints on its PASs because of the PropBank standard. Furthermore, BioProp provides complete structures for describing argument modifiers, such as location, manner, timing, and condition. The primary goal of BioProp is to port the proposition bank to the biomedical domain for training a biomedical SRL system called the BIOmedical SeMantIc roLe labEler (BIOSMILE) [45].

In addition to PASBio and BioProp, another corpus called BioInfer was released in 2007 [35]. BioInfer is annotated with syntactic dependencies and NEs as well as their relationships within a complex structure, such as relationships between relationships or relationships involving more than two entities. Ontologies that define the types of entities and relationships annotated in the corpus are also provided. Currently, the corpus contains 1,100 sentences from abstracts of biomedical research articles.

The latest GENIA event corpus [22] was released in January 2008. A new type of annotation, called event annotation, has been added to the corpus. Event annotation belongs to what we call biological annotation. In contrast to linguistic annotation, such as the SRL discussed earlier, biological annotation is performed by biologists, not by linguists. It follows a similar principle to that used in the annotation of BioInfer, i.e., it associates all annotations with actual expressions in the text. The difference between the two types of annotation is that the goal of biological annotation is to identify what kinds of biological information appear in which part of the text, while linguistic annotation focuses on the linguistic properties of texts in the domain. NE annotation in the GENIA event corpus is one example of biological annotation. It identifies text spans in which biological entities, such as proteins, DNA, RNA, and cellular locations, actually appear. This new annotation was made on half of the GENIA corpus [21], consisting of 1,000 Medline abstracts. The GENIA event corpus contains 9,372 sentences in which 36,114 events have been identified.

We summarize the current openly available corpora in Table 2.

4. Biological Web Services

From the large number of publications in the biological text mining area, it is clear that the performance of basic text mining tasks has reached reasonable levels. In the last decade, several advanced biological text-mining services have been developed, and some systems have been applied to real-world curation problems. PreBIND [11], for example, was developed to facilitate the extraction of protein-protein interactions (PPI) and reduces the task duration by 70% [11]. The PRIME [24] database text mining system, on the other hand, extracts interactions between proteins, genes and compounds. A new text mining system, EpiLoc [4], which predicts the subcellular location of proteins, was published at the beginning of 2008. It applies subcellular localization prediction to almost any protein, even in the absence of published data about it.

For article retrieval, biologists are now able to search through a massive volume of online articles. For example, using NCBI PubMed Entrez [37], a user can retrieve articles from a database of over 4,600 biomedical journals published from 1966 to the present; the database is updated daily. BioText [17] provides a new way to access scientific literature by enabling biologists to search and browse the figures and captions in biological articles. However, users of these basic search engines may need to scan or read retrieved articles in more detail to obtain specific information of interest. Needless to say, services that can identify and mark key relations, entities and terms can save biologists a great deal of time.

Several advanced search services have already been developed. For example, BESearch [46] provides biologists with a form-based query interface to obtain the information they need. Meanwhile, the iHOP service [12] retrieves sentences containing specified genes, labels the biomedical entities in the sentences, and provides graphs of the co-occurrences among all entities. iHOP allows researchers to (1) filter and rank retrieved sentences that match the given gene or protein names according to their significance, impact factor, date of publication and syntax; and (2) explore a network of gene and protein interactions by directly

navigating the pool of published scientific literature.
MEDIE, developed by the Tsujii Laboratory, can
identify subject-verb-object (syntactic) relations and
biomedical entities in sentences.
Another novel text mining service called
BIOSMILE Web Search (BWS) was released in
February 2008. BWS has similar features to iHOP and
MEDIE. It can annotate entities as well as a wider
range of relation types (Figure 1). For example, the
sentence “KaiC enhanced KaiA-KaiB interaction in
vitro and in yeast cells,” describes an enhancement
relation. BWS can identify the elements in this relation,
such as the action “enhanced”, the enhancer “KaiC”,
the enhanced “KaiA-KaiB interaction”, and the
location “in vitro and in yeast cells”. Relations are
classified by their main verbs and put in different tabs.
This makes it easy for researchers to browse through
all the relations in an article verb by verb, helps them
locate passages of interest easily, and significantly
speeds up overall comprehension (see Figure 1b).
BWS also provides a search result summary in table
format, showing all the relations found in multiple
articles during one session (see Figure 1c). This is a
convenient function for summarizing several related
papers. Furthermore, for researchers interested in PPI,
BWS classifies articles as PPI-relevant or –irrelevant
[36].
We summarize the current biological web services
in Table 3.

5. Conclusion
As the goals of biomedical information extraction
applications have become more ambitious, the range of
bio-NLP application types has become
correspondingly broader. In this paper, we have summarized state-of-the-art bio-NLP applications ranging from fundamental NER to more complex relation extraction and online integrated text mining services. Needless to say, there are still significant unsolved problems in the field. However, biomedical text mining is an extremely active research area, and the outlook for continued progress is positive.

Figure 1. The features of the BWS search interface. (a) Users can enter either a PMID or keywords. For each abstract, BWS annotates gene or protein names in light blue, and a graduated bar meter indicates the abstract's relevance to PPI. (b) Analysis results are shown in the tab pane with biomedical verbs marked in red. The semantic roles related to a verb are listed on the right-hand side. (c) An analysis summary table that contains all relations in abstracts.
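Most of the gene mention taggers surveyed in Section 2 (and listed in Table 1) cast NER as token-level sequence labelling with B/I/O tags. The snippet below illustrates that encoding only; it is not the feature set or model of any particular system, and the tiny example sentence and its tags are made up for demonstration.

```python
# Hand-labelled example of the B/I/O encoding used by most
# machine-learning gene mention taggers: B = first token of an entity,
# I = continuation of an entity, O = outside any entity.
tokens = ["KaiC", "enhanced", "KaiA", "-", "KaiB", "interaction", "in", "yeast"]
tags   = ["B",    "O",        "B",    "O", "B",    "O",           "O",  "O"]

def decode_entities(tokens, tags):
    """Turn a B/I/O tag sequence back into entity strings."""
    entities, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                entities.append(" ".join(current))
            current = [tok]
        elif tag == "I" and current:
            current.append(tok)
        else:
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities

print(decode_entities(tokens, tags))  # ['KaiC', 'KaiA', 'KaiB']
```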

Table 3. Biological Web Services
Name Description URL
BIOSMILE Web Search Biomedical relation extraction service http://140.138.150.34/BIOSMILER
BioText [20] Scientific literature figures and captions search engine http://biosearch.berkeley.edu/
Chilibot [7] Relationships search engine http://www.chilibot.net/
EpiLoc Subcellular localization prediction system http://epiloc.cs.queensu.ca
iHOP [15] Information on hyperlinked proteins http://www.ihop-net.org/
MEDIE Syntactic relations extraction system http://www-tsujii.is.s.u-tokyo.ac.jp/medie/
KinasePathway database [28] Tool for extraction of protein, gene and compound interactions from text http://kinasedb.ontology.ims.u-tokyo.ac.jp:8081/
PreBIND [14] Classifier of protein interaction documents http://bond.unleashedinformatics.com/
PRIME [29] Tool for extraction of protein, gene and compound interactions from text http://prime.ontology.ims.u-tokyo.ac.jp:8081/
PubMed Entrez [44] Biomedical citation retrieval system http://www.ncbi.nlm.nih.gov/sites/entrez?db=pubmed
Textpresso [39] C. elegans literature information retrieval and extraction tool http://www.textpresso.org/
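As a concrete illustration of the co-occurrence baseline mentioned in Section 3 and of programmatic access to PubMed Entrez (Table 3), the sketch below queries NCBI's public E-utilities interface and counts abstracts in which two names co-occur. The E-utilities endpoint is real, but the query terms, the small retmax limit and the whole counting scheme are illustrative assumptions rather than the method of any system surveyed here.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def pubmed_ids(term: str, retmax: int = 50):
    """Return a set of PubMed IDs matching a query via ESearch."""
    url = (f"{EUTILS}/esearch.fcgi?db=pubmed&retmax={retmax}"
           f"&term={urllib.parse.quote(term)}")
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    return {e.text for e in root.iter("Id")}

# Naive co-occurrence baseline: two entities are treated as "related"
# if they are mentioned in the same retrieved abstracts
# (high recall, low precision, as noted in Section 3).
ids_a = pubmed_ids("KaiA")
ids_b = pubmed_ids("KaiC")
shared = ids_a & ids_b
print(f"{len(shared)} abstracts mention both terms", sorted(shared)[:5])
```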

6. References [9] H.-J. Dai, H.-C. Hung, R. T.-H. Tsai, and W.-
L. Hsu, "IASL Systems in the Gene Mention
[1] B. Alex, C. Grover, B. Haddow, M. Tagging Task and Protein Interaction Article
Kabadjov, E. Klein, M. Matthews, S. Sub-task," in Proceedings of Second
Roebuck, R. Tobin, and X. Wang, BioCreAtIvE Challenge Workshop, 2007.
"ASSISTED CURATION: DOES TEXT [10] S. Doan, A. Kawazoe, and N. Collier, "The
MINING REALLY HELP?," Pac Symp Role of Roles in Classifying Annotated
Biocomput, vol. 556, p. 67, 2008. Biomedical Text," in BioNLP 2007, 2007.
[2] S. Ananiadou, J. Chruszcz, J. Keane, J. [11] I. Donaldson, J. Martin, B. de Bruijn, and C.
McNaught, and P. Watry, "The National Wolting, "PreBIND and Textomy-mining the
Centre for Text Mining: Aims and biomedical literature for proteinprotein,"
Objectives," UKKDD'5, 2007. 2003.
[3] B. Baldwin and B. Carpenter, "LingPipe," in [12] J. M. Fernandez, R. Hoffmann, and A.
http://www.alias-i.com/lingpipe/. Valencia, "iHOP web services," Nucl. Acids
[4] S. Brady and H. Shatkay, "EpiLoc: a Res., vol. 35, 2007.
(working) text-based system for predicting [13] J. Finkel, S. Dingare, H. Nguyen, M. Nissim,
protein subcellular location," Pac Symp C. Manning, and G. Sinclair, "Exploiting
Biocomput, vol. 604, p. 15, 2008. Context for Biomedical Entity Recognition:
[5] N. Chinchor, "MUC-7 named entity task From Syntax to the Web," Joint Workshop on
definition," Proceedings of the 7th Message Natural Language Processing in Biomedicine
Understanding Conference, 1997. and Its Applications at Coling 2004, 2004.
[6] W. C. Chou, R. T. H. Tsai, Y. S. Su, W. Ku, [14] K. Fukuda, A. Tamura, T. Tsunoda, and T.
T. Y. Sung, and W. L. Hsu, "A Semi- Takagi, "Toward information extraction:
Automatic Method for Annotating a identifying protein names from biological
Biomedical Proposition Bank," Proceedings papers," Pacific Symposium on Biocomputing,
of the Workshop on Frontiers in Linguistically pp. 707-718, 1998.
Annotated Corpora, pp. 5-12, 2006. [15] K. Ganchev, K. Crammer, F. Pereira, G.
[7] K. B. Cohen and L. Hunter, "Natural Mann, K. Bellare, A. McCallum, S. Carroll,
Language Processing and Systems Biology," Y. Jin, and P. White, "Penn/UMass/CHOP
Artificial Intelligence Methods and Tools for Biocreative II systems," in Proceedings of
Systems Biology, 2004. Second BioCreAtIvE Challenge Workshop,
[8] K. B. Cohen and L. Hunter, "A critical review 2007.
of PASBio's argument structures for [16] Z. GuoDong, S. Jian, N. Collier, P. Ruch, and
biomedical verbs," BMC Bioinformatics, vol. A. Nazarenko, "Exploring Deep Knowledge
7 Suppl 3, p. S5, 2006. Resources in Biomedical Name Recognition,"
COLING 2004 International Joint workshop

on Natural Language Processing in Sumbitted to Second BioCreAtIvE Challenge
Biomedicine and its Applications Workshop, 2007.
(NLPBA/BioNLP) 2004, pp. 99-102, 2004. [28] R. Leaman and G. Gonzalez, "BANNER: AN
[17] M. A. Hearst, A. Divoli, H. Guturu, A. EXECUTABLE SURVEY OF ADVANCES
Ksikes, P. Nakov, M. A. Wooldridge, and J. IN BIOMEDICAL NAMED ENTITY
Ye, "BioText Search Engine: beyond abstract RECOGNITION," Pac Symp Biocomput, vol.
search," Bioinformatics, vol. 23, p. 2196, 652, p. 63, 2008.
2007. [29] R. McDonald, K. Crammer, and F. Pereira,
[18] L. Hirschman, A. Yeh, C. Blaschke, and A. "Online Large-Margin Training of
Valencia, "Overview of BioCreAtIvE: critical Dependency Parsers," Ann Arbor, vol. 100,
assessment of information extraction for 2005.
biology," feedback, 2005. [30] R. McDonald and F. Pereira, "Identifying
[19] H. S. Huang, Y. S. Lin, K. T. Lin, C. J. Kuo, gene and protein mentions in text using
Y. M. Chang, B. H. Yang, C. N. Hsu, and I. F. conditional random fields.," BMC
Chung, "High-Recall Gene Mention Bioinformatics, vol. 6, p. (Suppl)(1:S6), 2005.
Recognition by Unification of Multiple [31] T. M. Mitchell, Machine Learning: McGraw-
Backward Parsing Models," Proceedings of Hill, 1997.
the Second BioCreative Challenge Evaluation [32] S. Mukherjea and S. Sahay, "DISCOVERING
Workshop, p. 109?11, 2007. BIOMEDICAL RELATIONS UTILIZING
[20] K. Jin-Dong, O. Tomoko, Y. T. Yoshimasa THE WORLD-WIDE WEB," Pac Symp
Tsuruoka, and N. Collier, "Introduction to the Biocomput, vol. 11, pp. 164-75, 2006.
bio-entity recognition task at JNLPBA," [33] M. Palmer, D. Gildea, and P. Kingsbury, "The
Proceedings of the International Workshop on proposition bank: An annotated corpus of
Natural Language Processing in Biomedicine semantic roles," Computational Linguistics,
and its Applications (JNLPBA-04), p. 70?5, vol. 31, pp. 71-106, 2005.
2004. [34] K. M. Park, S. H. Kim, D. G. Lee, and H. C.
[21] J.-D. Kim, T. Ohta, Y. Tateisi, and J. Tsujii, Rim, "Boosting Lexical Knowledge for
"GENIA corpus--a semantically annotated Biomedical Named Entity Recognition,"
corpus for bio-textmining," Bioinformatics, Proceedings of the Joint Workshop on
vol. 19, pp. 180-182, 2003. Natural Language Processing in Biomedicine
[22] J.-D. Kim, T. Ohta, and J. i. Tsujii, "Corpus and its Applications (JNLPBA-2004), p. 75?9,
annotation for mining biomedical events from 2004.
literature," BMC Bioinformatics, vol. 9:10, [35] S. Pyysalo, F. Ginter, J. Heimonen, J. Bjorne,
2008. J. Boberg, J. Jarvinen, and T. Salakoski,
[23] Y. Kogan, N. Collier, S. Pakhomov, and M. "BioInfer: a corpus for information extraction
Krauthammer, "Towards Semantic Role in the biomedical domain," BMC
Labeling & IE in the Medical Literature," Bioinformatics, vol. 8, p. 50, 2007.
AMIA Annual Symposium Proceedings, vol. [36] T. RT-H, H. H-C, D. H-J, and H. W-L,
2005, p. 410, 2005. "Exploiting Likely-Positive and Unlabeled
[24] A. Koike, Y. Niwa, and T. Takagi, Data to Improve the Identification of Protein-
"Automatic extraction of gene/protein Protein Interaction Articles," 6th InCoB -
biological functions from biomedical text," Sixth International Conference on
Bioinformatics, vol. 21, pp. 1227-1236, 2005. Bioinformatics, 2007.
[25] M. Krallinger and A. Valencia, "Text-mining [37] G. D. Schuler, J. A. Epstein, H. Ohkawa, and
and information-retrieval services for J. A. Kans, "Entrez: molecular biology
molecular biology," Genome Biology, 2005. database and retrieval system," Methods
[26] T. Kudo and Y. Matsumoto, "Chunking with Enzymol, vol. 266, pp. 141-62, 1996.
support vector machines," North American [38] B. Settles, "Biomedical Named Entity
Chapter Of The Association For Recognition Using Conditional Random
Computational Linguistics, pp. 1-8, 2001. Fields and Rich Feature Sets," in Proceedings
[27] C. J. Kuo, Y. M. Chang, H. S. Huang, K. T. of the Joint Workshop on Natural Language
Lin, B. H. Yang, Y. S. Lin, C. N. Hsu, and I. Processing in Biomedicine and its
F. Chung, "Rich Feature Set, Unification of Applications (JNLPBA-2004) Geneva,
Bidirectional Parsing and Dictionary Filtering Switzerland, 2004.
for High F-Score Gene Mention Tagging,"

[39] B. Settles, "ABNER: an open source tool for [48] Y. Tsuruoka, Y. Tateishi, J. D. Kim, T. Ohta,
automatically tagging genes, proteins and J. McNaught, S. Ananiadou, and J. Tsujii,
other entity names in text." vol. 21: Oxford "Developing a robust part-of-speech tagger
Univ Press, 2005, pp. 3191-3192. for biomedical text," Lecture notes in
[40] P. K. Shah, L. J. Jensen, S. Boue, and P. Bork, computer science, pp. 382-392, 2005.
"Extraction of transcript diversity from [49] T. Wattarujeekrit, P. K. Shah, and N. Collier,
scientific literature," PLoS Computational "PASBio: predicate-argument structures for
Biology, 2005. event extraction in molecular biology," BMC
[41] P. K. Shah and P. Bork, "LSAT: learning Bioinformatics, vol. 5, p. 155, Oct 19 2004.
about alternative transcripts in MEDLINE," [50] S. Zhao, "Named Entity Recognition in
Bioinformatics, vol. 22, pp. 857-865, 2006. Biomedical Texts using an HMM Model,"
[42] L. Smith, L. K. Tanabe, R. J. n. Ando, C.-J. Proceedings of the COLING 2004
Juo, I.-F. Chung, C.-N. Hsu, Y.-S. Lin, R. International Joint Workshop on Natural
Klinger, C. M. Friedrich, K. Ganchev, M. Language Processing in Biomedicine and its
Torii, H. Liu, B. Haddow, C. A. Struble, R. J. Applications (NLPBA), 2004.
Povinelli, A. Vlachos, W. A. B. Jr., L. Hunter, [51] P. Zweigenbaum, D. Demner-Fushman, H.
B. Carpenter, R. T.-H. Tsai, H.-J. Dai, F. Liu, Yu, and K. B. Cohen, "Frontiers of
Y. Chen, C. Sun, S. Katrenko, P. Adriaans, C. biomedical text mining: current progress,"
Blaschke, R. T. Perez, M. Neves, P. Nakov, Briefings in Bioinformatics, 2007.
A. Divoli, M. Mana, J. Mata-Vazquez, and W.
J. Wilbur., "Overview of BioCreative II Gene
Mention Recognition," Genome Biology,
2007.
[43] Y. Tateisi, A. Yakushiji, T. Ohta, and J.
Tsujii, "Syntax Annotation for the GENIA
corpus," Proc. IJCNLP 2005, Companion
volume, pp. 222–227, 2005.
[44] R. T.-H. Tsai, C.-L. Sung, H.-J. Dai, H.-C.
Hung, T.-Y. Sung, and W.-L. Hsu, "NERBio:
using selected word conjunctions, term
normalization, and global patterns to improve
biomedical named entity recognition," BMC
Bioinformatics, vol. 7 Suppl 5, p. S11, 2006.
[45] R. T.-H. Tsai, W.-C. Chou, Y.-S. Su, Y.-C.
Lin, C.-L. Sung, H.-J. Dai, I. T. Yeh, W. Ku,
T.-Y. Sung, and W.-L. Hsu, "BIOSMILE: A
semantic role labeling system for biomedical
verbs using a maximum-entropy model with
automatically generated template features,"
BMC Bioinformatics, vol. 8, p. 325, 2007.
[46] R. T. H. Tsai, H. J. Dai, H. C. Hung, R. T. K.
Lin, W. C. Chou, Y. S. Su, M. Y. Day, and
W. L. Hsu, "BESearch: A Supervised
Learning Approach to Search for Molecular
Event Participants," Information Reuse and
Integration, 2007. IRI 2007. IEEE
International Conference on, pp. 412-417,
2007.
[47] T. H. Tsai, C. W. Wu, Y. C. Lin, and W. L.
Hsu, "Exploiting Full Parsing Information to
Label Semantic Roles Using an Ensemble of
ME and SVM via Integer Linear
Programming," Proceedings of CoNLL-2005,
2005.


Adaptive Automatic Segmentation of HEp-2 Cells in Indirect Immunofluorescence Images
Yu-Len Huang, PhD*, Yu-Lang Jao*, Tsu-Yi Hsieh, MD†, Chia-Wei Chung*
*Department of Computer Science and Information Engineering
Tunghai University, Taichung, Taiwan

†Division of Allergy, Immunology and Rheumatology, Taichung Veterans General Hospital,
Taichung, Taiwan
ylhuang@thu.edu.tw

Abstract

Indirect immunofluorescence (IIF) with HEp-2 cells is used for the detection of antinuclear autoantibodies (ANA) in systemic autoimmune diseases. An automatic inspection system for ANA testing can be divided into HEp-2 cell detection, fluorescence pattern classification and computer-aided diagnosis phases. This study focused on the first phase, cell detection and localization, and presents an adaptive edge-based segmentation method for automatically detecting the outlines of fluorescence cells in IIF images. The proposed method was evaluated on 2573 cells with six distinct fluorescence patterns from 45 images. The results of computer simulations revealed that the proposed method consistently identified cell outlines that closely match manually sketched ones. Such a method provides robust and fast automatic segmentation of HEp-2 fluorescent patterns in ANA testing.

1. Introduction

Indirect immunofluorescence (IIF) with HEp-2 cells is used for the detection of antinuclear autoantibodies (ANA) in systemic autoimmune diseases [1]. ANA testing makes it possible to screen a broad range of autoantibody entities and to describe them by distinct fluorescence patterns. The fluorescence patterns are usually identified by a physician manually inspecting the slides with the help of a microscope [2]. However, due to the lack of satisfactory automation and the low level of standardization of this inspection [3-4], the procedure still requires a highly specialized and experienced technician or physician to obtain a diagnostic result. For this purpose, automatic inspection of fluorescence patterns in IIF images may assist physicians without relevant experience in making a correct diagnosis. As ANA testing becomes more widespread, a functional automatic inspection system is essential and its clinical application is becoming urgent.

An automatic inspection system for ANA testing can be divided into HEp-2 cell detection, fluorescence pattern classification and computer-aided diagnosis phases. This study focused on the first phase, cell detection and localization. We present an efficient method for automatically detecting cells with fluorescence patterns in IIF images. The preprocessing step of the proposed method reduces noise while preserving the shape and contrast of the cells. An adaptive edge-based segmentation then automatically extracts the outlines of cells in IIF images. The proposed method was evaluated on 2573 cells with six distinct fluorescence patterns from 45 images. The results of computer simulations revealed that the proposed method consistently identified cell outlines that closely match manually sketched ones. Such a method provides robust and fast automatic segmentation of HEp-2 fluorescent patterns in ANA testing, and the proposed automatic segmentation system can save much of the time required to locate fluorescence patterns, with very high stability.

2. Data Acquisition for Autoantibody Fluorescence Patterns

This study used slides of HEp-2 substrate at a serum dilution of 1:80. A physician takes images of the slides with an acquisition unit built around a commonly used fluorescence microscope (Axioskop 2, Carl Zeiss, Jena, Germany) at 40-fold magnification. The immunofluorescence images were taken by an operator with a color digital camera (E-330, Olympus, Tokyo, Japan).

The digitized images had 8-bit photometric resolution for each RGB (Red, Green and Blue) color channel, with a resolution of 3136×2352 pixels. Finally, the images were transferred to a personal computer and stored as *.orf files (raw data format) without compression. The image database, containing 45 samples, was collected from January 2007 to July 2007. Because the original images were too large for the segmentation procedure, this study down-sampled each image to a more manageable resolution of 1024×768 pixels.

To evaluate the proposed system, this study included six different main patterns:
1. Diffuse pattern
2. Peripheral pattern
3. Coarse speckled pattern
4. Fine speckled pattern
5. Discrete speckled pattern
6. Nucleolar pattern

Figure 1 illustrates the distinct autoantibody fluorescence patterns. From the viewpoint of image processing, a fluorescence cell belonging to the diffuse, peripheral, coarse speckled, or fine speckled pattern normally comprises only one connected region. In contrast, the discrete speckled and nucleolar patterns consist of a mass of connected regions and several connected regions, respectively.

Figure 1. The six distinct autoantibody fluorescence patterns: (a) diffuse, (b) peripheral, (c) coarse speckled, (d) fine speckled, (e) discrete speckled, and (f) nucleolar

3. Adaptive Image Segmentation

Several studies have proposed classifying autoantibody fluorescence patterns by using an automatic thresholding method, i.e., Otsu's algorithm [5], to segment the cells. The thresholding method automatically chooses the threshold that minimizes the intra-class variance of the black and white pixels. Due to the variety of ANA patterns, however, Otsu's algorithm always failed to segment cells of the discrete speckled and nucleolar patterns. Figure 2 shows the over-segmentation results obtained using Otsu's algorithm. In order to extract precise cells from an image, the proposed method comprises a simple classification procedure for IIF images to avoid over-segmentation. Firstly, the automatic thresholding algorithm was performed to convert an IIF image to a binary version. The proposed method then counted the number of connected regions in the binary image. This information was used as the input of the IIF image classifier. We classified an IIF image into two cases based on the number of connected regions n in the image:

Type 0 (image with sparse region cells): if the number of connected regions n ≤ T;
Type 1 (image with mass region cells): if the number of connected regions n > T,

where T is a predefined threshold. Based on this simple classification, the proposed method adaptively selected one of two modules with different parameters to segment autoantibody fluorescence cells. Figure 3 presents a flowchart of the proposed method, including the preprocessing and segmentation phases.

For an image with sparse region cells, the original RGB image was first transformed to the HSB (Hue, Saturation and Brightness channels) color space. The proposed method utilized the brightness (gray level) component as the input image for segmenting cells. The anisotropic diffusion filter [6] was applied to enhance cell regions against the background. Edge-based segmentation methods depend on the gradient of an image to determine object boundaries. Such methods are designed to detect discontinuities of image intensity, so edge-based methods perform well when applied to Type 0 IIF images. This study employed a practical method, the Canny edge detector [7], to identify the edge information in an image, and enhanced the detected edges by using the morphological dilation operator [8]. Finally, the morphological erosion and dilation operators were applied alternately to fill the gaps within a cell and to smooth the outline of the segmented cell.
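The Type 0 / Type 1 decision described above is essentially Otsu thresholding followed by counting connected regions. The sketch below shows one way to implement that step with OpenCV; the default threshold value and the function name are illustrative choices, not the authors' code.

```python
import cv2
import numpy as np

def classify_iif_image(gray: np.ndarray, T: int = 300) -> int:
    """Return 0 (sparse region cells) or 1 (mass region cells).

    Sketch of the adaptive classification step: binarize the grayscale
    IIF image with Otsu's method and count connected regions; images
    whose count exceeds the predefined threshold T are treated as Type 1.
    (T = 300 sits inside the 200-400 range reported in Section 4.)
    """
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n_labels, _ = cv2.connectedComponents(binary)
    n_regions = n_labels - 1          # label 0 is the background
    return 0 if n_regions <= T else 1

# Example usage (assumes an 8-bit grayscale IIF image on disk):
# gray = cv2.imread("iif_sample.png", cv2.IMREAD_GRAYSCALE)
# print("image type:", classify_iif_image(gray))
```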

Figure 4 demonstrates the processing results produced by this module of the proposed method.

Figure 2. Over-segmentation by using Otsu's algorithm: (a) segmented result of Fig. 1(e); (b) segmented result of Fig. 1(f)

Figure 3. The flowchart of the proposed adaptive method

Furthermore, for an image with mass region cells, the original RGB image was transformed to the CMY (Cyan, Magenta, and Yellow channels) color space. We found that the intensity dissimilarity between fluorescence cells and the background was largest in the cyan component. Thus the proposed method utilized the cyan component as the input image for segmenting cells in images with mass region cells. The anisotropic diffusion filter was also applied to enhance cell regions against the background. A binary image with the cell regions segmented could then be generated by using the Otsu algorithm. In this module, the morphological opening and closing operators with a larger-sized structuring element were applied to remove regions of unreasonable size and thereby obtain the precise outlines of the cells. Figure 5 shows the processing results of the segmentation module for Type 1 images.

Figure 4. An IIF image (with coarse speckled patterns) processed with the Type 0 segmentation module: (a) original RGB image, (b) transformed brightness image after the anisotropic diffusion filtering, (c) after Canny edge detection, (d) after morphological dilation, (e) after morphological smoothing, and (f) the outline of the segmented cell

4. Results

This study experimented on a total of 2573 autoantibody fluorescence patterns with manually sketched outlines (including 519 diffuse patterns, 482 peripheral patterns, 788 coarse speckled patterns, 634 fine speckled patterns, 64 discrete speckled patterns and 86 nucleolar patterns) from 45 IIF images to test the accuracy of the proposed method. The simulations were run on a single-CPU Intel Pentium IV 3.0 GHz personal computer with the Microsoft Windows XP operating system.
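A compact sketch of the Type 1 (mass region) module described in Section 3 is given below: the cyan component is taken as the complement of the red channel, Otsu thresholding produces a binary cell mask, and opening/closing with a large structuring element cleans it up. The 17×17 kernel echoes the size reported below, but the code as a whole is an illustrative reconstruction, not the authors' implementation (the anisotropic diffusion step is omitted, and an elliptical kernel stands in for the diamond-shaped element).

```python
import cv2
import numpy as np

def segment_mass_region_cells(bgr: np.ndarray) -> np.ndarray:
    """Sketch of the Type 1 segmentation module for IIF images."""
    # Cyan component of the CMY space: C = 255 - R for 8-bit images.
    cyan = 255 - bgr[:, :, 2]                     # OpenCV stores BGR
    # Otsu thresholding separates fluorescent cells from background
    # (invert the mask if the cells come out dark in this channel).
    _, mask = cv2.threshold(cyan, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Opening removes small spurious regions, closing fills small holes;
    # a large (17x17) structuring element merges speckles within a cell.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (17, 17))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    return mask

# Example usage:
# bgr = cv2.imread("iif_type1_sample.png")
# outline_mask = segment_mass_region_cells(bgr)
```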

In this work, Otsu's algorithm was first performed to calculate the number of connected regions, which was used as the input of the IIF image classifier. With the predefined threshold T ranging from 200 to 400, the proposed adaptive segmentation system obtained its highest accuracy and remained stable. Moreover, the morphological dilation operators for images with sparse region cells (Type 0) utilized a 3×3 diamond-shaped structuring element, while the morphological dilation operators for images with mass region cells (Type 1) employed a 17×17 diamond-shaped structuring element.

Figure 5. An IIF image (with discrete speckled patterns) processed with the Type 1 segmentation module: (a) original RGB image, (b) transformed cyan image after the anisotropic diffusion filtering, (c) after automatic thresholding, (d) after morphological closing, (e) after morphological opening, and (f) the outline of the segmented cell

Table 1 lists the segmentation results, which compare the proposed method with Otsu's automatic thresholding combined with morphological operators. The numbers of segmented cells that are divided or mixed can be used to evaluate the accuracy of the segmentation results. However, various cases are indeed mixed; in these cases, serious overlapping can be found between the cells. Apart from this circumstance, the proposed system clearly yielded cell outlines that are similar to those manually sketched. From the segmentation results, only a small number of cases might generate an undesired segmentation.

Table 1. Comparisons for fluorescence patterns segmentation between the proposed method and the Otsu's automatic thresholding with morphological operators

5. Conclusion

This article presented an efficient method for automatically detecting the outlines of fluorescence cells in IIF images. The preprocessing step of the proposed method applied an anisotropic diffusion filter to reduce noise while preserving the shape and contrast of the fluorescence cells. This study classified an IIF image into two cases based on the number of connected regions in the image, and the proposed adaptive segmentation method was used to generate precise outlines of the cells according to the image type. An IIF image database including 2573 cases was used for evaluation. We found that the proposed method determines outlines of fluorescence cells that are very similar to manually sketched ones. The experimental results revealed that the proposed method can practically segment fluorescence cells from IIF images.

6. References

[1] Conrad K, Schoessler W, Hiepe F. Autoantibodies in systemic autoimmune diseases. Lengerich, Berlin, Riga, Rom, Viernheim, Wien, Zagreb: Pabst Science Publishers, 2002.

[2] Conrad K, Humbel RL, Meurer M, Shoenfeld Y, Tan EM. Autoantigens and autoantibodies: diagnostic tools and clues to understanding autoimmunity. Lengerich, Berlin, Riga, Rom, Viernheim, Wien, Zagreb: Pabst Science Publishers, 2000.

[3] Perner P, Perner H, and Müller B, "Mining Knowledge for Hep-2 Cell Image Classification," Journal Artificial Intelligence in Medicine, vol. 26, pp. 161-173, 2002.

[4] Sacka U, Knoechnera S, Warschkaub H, Piglac U, Emmricha F, and Kamprada M, "Computer-assisted classification of HEp-2 immunofluorescence patterns in

autoimmune diagnostics,” Autoimmunity Reviews, vol. 2, pp.
298–304, 2003.

[5] Otsu N, “Threshold Selection Method from Gray-Level


Histograms,” IEEE Transactions on Systems Man and
Cybernetics, vol. 9, no. 1, pp. 62-66, 1979.

[6] Black MJ et al., “Robust anisotropic diffusion,” IEEE


Transactions on Image Processing, vol. 7, no. 3, pp. 421-432,
1998.

[7] Canny J, “A Computational Approach to Edge


Detection,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. PAMI-8, no. 6, pp. 679-698, 1986.

[8] Gonzalez RC, Woods RE. Digital image processing. 2 ed.


Massachusetts: Addison Wesley, 2002.


Outline Detection for the HEp-2 Cell in Indirect Immunofluorescence Images Using Watershed Segmentation
Yu-Len Huang, PhD*, Chia-Wei Chung*, Tsu-Yi Hsieh, MD†, Yu-Lang Jao*
*Department of Computer Science and Information Engineering
Tunghai University, Taichung, Taiwan

†Division of Allergy, Immunology and Rheumatology, Taichung Veterans General Hospital,
Taichung, Taiwan
E-mail: ylhuang@thu.edu.tw

Abstract

An automatic inspection system for antinuclear autoantibodies (ANA) testing can be divided into HEp-2 cell detection, fluorescence pattern classification and computer-aided diagnosis phases. This paper presents a multi-staged segmentation method for automatically detecting the outlines of fluorescence cells in indirect immunofluorescence (IIF) images. A similarity-based watershed algorithm uses markers to prevent over-segmentation. This paper evaluated 2305 autoantibody fluorescence patterns with manually sketched outlines (including 456 diffuse patterns, 417 peripheral patterns, 719 coarse speckled patterns, 55 fine speckled patterns, 517 discrete speckled patterns and 141 nucleolar patterns) from 44 IIF images. The experimental results revealed that the proposed method can practically outline fluorescence cells in IIF images.

1. Introduction

Indirect immunofluorescence (IIF) with HEp-2 cells has been used for the detection of antinuclear autoantibodies (ANA) in systemic autoimmune diseases [1]. ANA testing makes it possible to screen a broad range of autoantibody entities and to describe them by distinct fluorescence patterns. The fluorescence patterns are usually identified by a physician manually inspecting the slides with the help of a microscope [2]. However, due to the lack of satisfactory automation and the low level of standardization of this inspection [3-4], the procedure still requires a highly specialized and experienced technician or physician to obtain a diagnostic result. For this purpose, automatic inspection of fluorescence patterns in IIF images may assist physicians without relevant experience in making a correct diagnosis. As ANA testing becomes more widespread, a functional automatic inspection system is essential and its clinical application is becoming urgent.

An automatic inspection system for ANA testing can be divided into HEp-2 cell detection, fluorescence pattern classification and computer-aided diagnosis phases. This study focused on the first phase, cell detection and localization. We present an efficient method for automatically detecting cells with fluorescence patterns in IIF images. The preprocessing step of the proposed method reduces noise while preserving the shape and contrast of the cells. A two-staged watershed transform then automatically extracts the outlines of cells in IIF images. The watershed transformation, a reliable unsupervised model, has been applied to solve diverse image segmentation problems. The proposed method was evaluated on 2305 cells with six distinct fluorescence patterns from 44 images. The results of computer simulations revealed that the proposed method consistently identified cell outlines that closely match manually sketched ones. Such a method provides robust automatic segmentation of HEp-2 fluorescent patterns in ANA testing, and the proposed automatic segmentation system can save much of the time required to locate fluorescence patterns, with very high stability.

2. Data Acquisition

This study used slides of HEp-2 substrate at a serum dilution of 1:80. A physician takes images of the slides with an acquisition unit built around a commonly used fluorescence microscope (Axioskop 2, Carl Zeiss, Jena, Germany) at 40-fold magnification. The immunofluorescence images were taken by an operator with a color digital camera (E-330, Olympus, Tokyo,

Japan). The digitized images had 8-bit photometric resolution for each RGB (Red, Green and Blue) color channel, with a resolution of 3136×2352 pixels. Finally, the images were transferred to a personal computer and stored as *.orf files (raw data format) without compression. The image database, containing 44 samples, was collected from January 2007 to July 2007. Because the original images were too large for the segmentation procedure, this study down-sampled each image to a more manageable resolution of 1024×768 pixels.

To evaluate the proposed system, this study included six different main ANA patterns: diffuse pattern, peripheral pattern, coarse speckled pattern, fine speckled pattern, discrete speckled pattern, and nucleolar pattern. Figure 1 illustrates the distinct autoantibody fluorescence patterns. From the viewpoint of image processing, a fluorescence cell belonging to the diffuse, peripheral, coarse speckled, or fine speckled pattern normally comprises only one connected region. In contrast, the discrete speckled and nucleolar patterns consist of a mass of connected regions and several connected regions, respectively.

Figure 1. The six distinct autoantibody fluorescence patterns: (a) diffuse, (b) peripheral, (c) coarse speckled, (d) fine speckled, (e) discrete speckled, and (f) nucleolar

3. Two-Staged Watershed Segmentation

One of the most reliable region-based methods for automatic and unsupervised segmentation is the watershed transformation [5]. This technique has been applied successfully to solve a wide range of difficult image segmentation problems. In this study, the IIF images are likewise considered as 3D topographic surfaces: the intensity of a pixel in the image denotes the elevation at the corresponding location. The objective of the watershed transformation is to find the watershed lines in a topographic surface. However, the watershed transformation is sensitive to noise and contrast in the image, and over-segmentation may occur and generate incorrect outlines of fluorescence cells because IIF images include noise and speckles. Thus, preprocessing procedures were required to improve the performance of the watersheds in the proposed method. Besides, due to the variety of ANA patterns, the watershed transformation always failed to segment the cells with discrete speckled and nucleolar patterns. Accordingly, a two-staged watershed segmentation was performed in the proposed method to obtain the precise outlines of the cells. Figure 2 presents a flowchart of the proposed method, including the first and second stage modules.

Figure 2. The flowchart of the proposed two-staged segmentation method
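The marker idea that drives both stages below can be illustrated in a few lines of scikit-image. The library calls (sobel, threshold_otsu, watershed) are real, but the marker rules, the channel choice and all parameter values are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from skimage.filters import sobel, threshold_otsu
from skimage.segmentation import watershed

def marker_watershed(channel: np.ndarray) -> np.ndarray:
    """Flood a gradient 'topographic surface' from coarse markers.

    Treating intensity as elevation, the watershed fills basins of the
    gradient image starting from labelled markers, which is the usual
    way to keep the transform from over-segmenting noisy IIF images.
    """
    elevation = sobel(channel.astype(float))        # gradient surface
    thresh = threshold_otsu(channel)
    markers = np.zeros(channel.shape, dtype=np.int32)
    markers[channel < 0.5 * thresh] = 1             # confident background
    markers[channel > thresh] = 2                   # confident cell interior
    labels = watershed(elevation, markers)
    return labels == 2                              # boolean cell mask

# Example usage (green channel of an RGB IIF image):
# cell_mask = marker_watershed(rgb_image[:, :, 1])
```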

In the first stage segmentation, the green channel of the original RGB image was utilized as the input intensity for segmenting cells. A median filter was applied to reduce noise, and a common contrast adjustment was performed to enhance cell regions against the background. For an image with L intensity levels, the original intensity f(x, y) was transformed to the processed pixel intensity g(x, y) as

g(x, y) = [f(x, y) − fmin] / (fmax − fmin) × (L − 1),   (1)

where fmin and fmax are the minimal and maximal intensities, respectively. After preprocessing, the proposed method employed the watershed transform to segment the cell regions. To reduce over-segmentation, a region merging procedure was utilized to merge small connected regions, and a region elimination procedure removed segmented regions of unreasonable size (smaller than 100 pixels). The first stage module then counted the number of obtained cells (connected regions), RN1, in the image. RN1 was used to determine whether the image required the second stage segmentation or not. If RN1 of an image was larger than a predefined threshold T, the segmentation result of the first stage module was approved and the outlines of the cells were generated. Figure 3 demonstrates the results of the first stage module in the proposed method. On the contrary, if RN1 was smaller than T, the first stage segmentation might yield unsatisfactory cell outlines because most cell regions were removed by the region elimination procedure. This situation was often seen in images with discrete speckled and nucleolar patterns. Thus, the second stage segmentation was necessary for extracting the cells more precisely.

In the second stage segmentation, a method based on the concept of markers was utilized to control over-segmentation. Lotufo and Falcao [6] proposed an algorithm for detecting watershed boundaries based on similarity using markers. For images with discrete speckled and nucleolar patterns, we found that the intensity dissimilarity between fluorescence cells and the background was largest in the cyan component. Thus the second stage module utilized the cyan component as input to avoid over-segmentation. The original RGB image was transformed to the CMY (Cyan, Magenta, and Yellow channels) color space [7]. A marker is defined as a connected component in the image and is typically selected by a set of criteria from the pre-processed cyan component. In this work, Otsu's algorithm [8] was first performed to generate the markers for the watershed segmentation, and the similarity-based watershed algorithm was then performed to control over-segmentation in the images. The second stage module also calculated the number of obtained cells (connected regions), RN2, in the image. The segmentation result of the second stage module was approved, and the outlines of the cells were output, when RN2 was larger than RN1. Figure 4 shows the outlining results obtained using the second stage module.

Figure 3. IIF images segmented by the first stage module: (a) original RGB image with coarse speckled patterns, (b) image with diffuse patterns, (c) segmentation result of (a), and (d) segmentation result of (b)

Furthermore, for an image with mass region cells, the original RGB image was transformed to the CMY (Cyan, Magenta, and Yellow channels) color space. We found that the intensity dissimilarity between fluorescence cells and the background was largest in the cyan component. Thus the proposed method utilized the cyan component as the input image for segmenting cells in images with mass region cells. The anisotropic diffusion filter was also applied to enhance cell regions against the background. A binary image with the cell regions segmented could be generated by using the Otsu algorithm. In this module, the morphological opening and closing operators with a larger-sized structuring element were applied to remove regions of unreasonable size and thereby obtain the precise outlines of the cells. Figure 5 shows the processing results of the segmentation module for Type 1 images.

4. Results

This study experimented on a total of 2305 autoantibody fluorescence patterns with manually sketched outlines (including 456 diffuse patterns, 417 peripheral patterns, 719 coarse speckled patterns, 55 fine speckled patterns, 517 discrete speckled patterns and 141 nucleolar patterns) from 44 IIF images to test the accuracy of the proposed method.
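Equation (1) and the RN1-based decision between the two stages translate almost directly into code. The helpers below are a sketch under the assumptions stated in the text (L intensity levels, regions smaller than 100 pixels discarded, a threshold T in the 20-30 range reported below); the function names and the use of numpy/scipy are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy import ndimage as ndi

def contrast_stretch(f: np.ndarray, L: int = 256) -> np.ndarray:
    """Equation (1): g = (f - fmin) / (fmax - fmin) * (L - 1)."""
    f = f.astype(float)
    fmin, fmax = f.min(), f.max()
    if fmax == fmin:                     # flat image: nothing to stretch
        return np.zeros_like(f)
    return (f - fmin) / (fmax - fmin) * (L - 1)

def needs_second_stage(first_stage_mask: np.ndarray, T: int = 25) -> bool:
    """Count surviving cells (RN1) after removing regions < 100 pixels.

    If RN1 is small, the first-stage watershed probably eliminated most
    cell regions (typical for discrete speckled and nucleolar patterns),
    so the marker-based second stage is required.
    """
    labels, _ = ndi.label(first_stage_mask)
    sizes = np.bincount(labels.ravel())
    keep = np.isin(labels, np.flatnonzero(sizes >= 100)) & (labels > 0)
    rn1 = len(np.unique(labels[keep]))
    return rn1 < T
```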

These simulations were run on a single-CPU Intel Pentium IV 3.0 GHz personal computer with the Microsoft Windows XP operating system.

In this study, the preprocessing procedure utilized a 3×3 median filter to reduce noise. In the first stage segmentation, with the predefined threshold T for RN1 ranging from 20 to 30, the proposed system obtained its highest accuracy and remained stable. The performance measures, i.e., true-positive (TP), false-negative (FN), false-positive (FP) and sensitivity, were used to estimate the performance of the proposed system. The TP value denotes the number of correctly segmented cells; the FN value denotes the number of missed cells; the FP value denotes the number of incorrectly segmented regions without any fluorescence cell; and the sensitivity is defined as TP / (TP+FN). Table 1 lists the segmentation results of the proposed method. From the segmentation results, only a small number of cases might generate an undesired segmentation.

Figure 4. IIF images processed with the second stage segmentation module: (a) original RGB image with discrete speckled patterns, (b) image with nucleolar speckled patterns, (c) the marker of (a), (d) the marker of (b), (e) the cell outlining of (a), and (f) the cell outlining of (b)

Table 1 lists the segmentation results, which compare the proposed method with the Otsu automatic thresholding combined with morphological operators. The numbers of segmented cells that are divided or mixed can be used to evaluate the accuracy of the segmentation results. However, various cases are indeed mixed; in these cases, serious overlapping can be found between the cells. Apart from this circumstance, the proposed system clearly yielded cell outlines that are similar to those manually sketched. From the segmentation results, only a small number of cases might generate an undesired segmentation.

Table 1. Fluorescence patterns segmentation results of the proposed method
Autoantibody fluorescence pattern (number of slices)  TP  FP  FN  Sensitivity (%)
Diffused (10)  449  1  7  98.5%
Peripheral (10)  411  0  6  98.6%
Coarse speckled (10)  681  2  38  94.7%
Fine speckled (1)  55  0  0  100%
Discrete speckled (10)  449  5  68  86.8%
Nucleolar (3)  138  0  3  97.9%
Total  2183  8  122  94.7%
TP = true-positive; FN = false-negative; FP = false-positive; Sensitivity = TP/(TP+FN)

5. Conclusion

This paper presented a multi-staged segmentation method for automatically detecting the outlines of fluorescence cells in IIF images. A preprocessing filter and contrast enhancement were utilized so that the first stage watershed algorithm automatically produces the outline of each cell. In the second stage segmentation, the similarity-based watershed algorithm uses markers to prevent over-segmentation. An IIF image database including 2305 fluorescence cells was used for evaluation. It can be found that the proposed method determines outlines of cells that are very similar to manually sketched ones. The experimental results revealed that the proposed method can practically outline fluorescence cells in IIF images.

6. References

[1] Conrad K, Schoessler W, Hiepe F. Autoantibodies in systemic autoimmune diseases. Lengerich, Berlin, Riga, Rom, Viernheim, Wien, Zagreb: Pabst Science Publishers, 2002.

[2] Conrad K, Humbel RL, Meurer M, Shoenfeld Y, Tan EM. Autoantigens and autoantibodies: diagnostic tools and clues to understanding autoimmunity. Lengerich, Berlin, Riga, Rom, Viernheim, Wien, Zagreb: Pabst Science Publishers, 2000.

[3] Perner P, Perner H, and Müller B, “Mining Knowledge
for Hep-2 Cell Image Classification,” Journal Artificial
Intelligence in Medicine, vol. 26, pp. 161-173, 2002.

[4] Sacka U, Knoechnera S, Warschkaub H, Piglac U,


Emmricha F, and Kamprada M, “Computer-assisted
classification of HEp-2 immunofluorescence patterns in
autoimmune diagnostics,” Autoimmunity Reviews, vol. 2, pp.
298–304, 2003.

[5] Vincent L and Soille P, “Watersheds in Digital Spaces:


An Efficient Algorithm Based on Immersion Simulations,”
IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 13, no. 6, pp. 583-598, 1991.

[6] Lotufo R and Falcao A. The ordered queue and the


optimality of the watershed approaches. In: Goutsias J,
Vincent L, Bloomberg D, editors. Mathematical Morphology
and its Application to Image and Signal Processing.
Dordrecht: Kluwer Academic Publishers, pp. 341-450, 2000.

[7] Gonzalez RC and Woods RE. Digital image processing. 2


ed. Massachusetts: Addison Wesley, 2002.

[8] Otsu N, “Threshold Selection Method from Gray-Level


Histograms,” IEEE Transactions on Systems Man and
Cybernetics, vol. 9, no. 1, pp. 62-66, 1979.


A Multi-Layered Approach to the Polysemy Problems in a Chinese to Taiwanese System
Yih-Jeng Lin*, Ming-Shing Yu**, Chin-Yu Lin**, Yuan-Tsun Lin**


*Department of Information Management, Chien-Kuo Technology University, Chung-Hua, Taiwan.
**Department of Computer Science and Engineering, National Chung-Hsing University, Taichung, Taiwan.
yclin@ctu.edu.tw, msyu@nchu.edu.tw, claire@ctu.edu.tw, uuwb2000@yahoo.com.tw

Abstract

This paper proposes a novel approach to the polysemy problems in a Chinese to Taiwanese TTS system. Polysemy means there are words with more than one meaning or pronunciation, such as "我們" (we), "不" (no), "你" (you), "我" (I), "要" (want), and so on. The correct pronunciation of a word affects the comprehensibility (or clarity) and fluency of Taiwanese speech. We have shown that language models can solve the polysemy problems with outstanding results [12]. In this paper, we use a layered approach to the polysemy problem of the word "不" (no). Results show that the accuracy rate of the proposed layered approach is 5% and 17.69% higher than that of language models and a decision list classifier, respectively.

1. Introduction

Besides Mandarin, Taiwanese is the most widely spoken dialect in Taiwan. According to [13], about 75% of the population in Taiwan speaks Taiwanese. It is government policy to encourage people to learn their mother tongue in school because local languages are a part of local culture.

Researchers such as [1][2][11][14] have achieved outstanding results in developing Mandarin text-to-speech (TTS) systems in the past ten years. Other researchers, such as [8][18][6][7][22], have just begun to develop Taiwanese TTS systems. There are no formal characters for Taiwanese, and Chinese characters are officially used in Taiwan. Consequently, many researchers have focused on Chinese to Taiwanese (C2T) TTS systems. This means that the input of a so-called Taiwanese TTS system is Chinese text. In 1999, Y. C. Yang [18] developed a method based on machine translation to help solve this problem. Since there are differences between Mandarin and Taiwanese, a C2T TTS system should have a text analysis module that can solve the problems specific to Taiwanese. For instance, there is only one pronunciation for "我們" (we) in Chinese, but there are two pronunciations for "我們" (we) in Taiwanese.

In general, a C2T TTS system should contain four basic modules: (1) a text analysis module, (2) a tone sandhi module, (3) a prosody generation module, and (4) a speech synthesis module. A C2T TTS system needs a text analysis module like that in a Mandarin TTS system, and a well-defined bilingual lexicon is needed in this module. We also find that text analysis in a C2T TTS system should have functions beyond those in a Mandarin TTS system, such as phonetic transcription, digit sequence processing [13], and solving the polysemy problem. Among these tasks, the polysemy problem in Taiwanese is a complex and difficult one, and there has been no research on solving it. Polysemy means that a word has two or more meanings and hence may have different pronunciations. For example, the word "他" (he) has two pronunciations in Taiwanese, /i/ and /in/. The first pronunciation, /i/, of "他" means "he", while the second pronunciation, /in/, means "his". The correct pronunciation of a word affects the comprehensibility (or clarity) and fluency of Taiwanese speech. Tone sandhi problems in Taiwanese are also very complex [18]. A C2T TTS system should have a module to decide which syllables should be read together and to decide the correct tone for each syllable. Such work is similar to determining the prosodic words in a sentence in a Mandarin TTS system.

There are many researchers studying C2T TTS systems [8][18][6][7][22]. However, few of them have considered the polysemy problem in a C2T TTS system.
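The language-model approach referred to above and in the abstract (choosing among candidate pronunciations of a polysemous word by scoring each candidate in context) can be sketched as follows. The toy bigram probabilities, the romanized context tokens and the helper functions are all invented for illustration; they are not the models or data of [12].

```python
import math

# Toy bigram log-probabilities over pronunciation-level tokens
# (purely illustrative values; a real system would estimate these
# from a pronunciation-annotated Taiwanese corpus).
BIGRAM_LOGP = {
    ("tua", "ghun"): math.log(0.02),   # "take" + exclusive "we"
    ("tua", "lan"):  math.log(0.10),   # "take" + inclusive "we"
    ("ghun", "khi"): math.log(0.05),
    ("lan", "khi"):  math.log(0.06),
}
FLOOR = math.log(1e-6)                 # back-off for unseen bigrams

def score(sequence):
    """Sum bigram log-probabilities of a pronunciation sequence."""
    return sum(BIGRAM_LOGP.get(pair, FLOOR)
               for pair in zip(sequence, sequence[1:]))

def choose_pronunciation(prev_word, candidates, next_word):
    """Pick the candidate pronunciation with the best LM score."""
    return max(candidates,
               key=lambda c: score([prev_word, c, next_word]))

# "我們" between two context words has two candidate pronunciations:
print(choose_pronunciation("tua", ["ghun", "lan"], "khi"))   # -> 'lan'
```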

We think that solving the polysemy problems in a C2T TTS system is a fundamental task: the correct meaning of the synthesized words cannot be conveyed if this problem is not solved properly.

The remainder of this paper is organized as follows. In Section 2 we describe the polysemy problem in Taiwanese and give several examples to show the importance of solving it in a C2T TTS system. We have already solved the polysemy problems of many words such as "我們" (we), "你" (you), "我" (I), and "他" (he) by using language models [12]. In this paper we focus on the polysemy problem of the word "不" (no), which has six kinds of pronunciation; deciding the correct pronunciation of "不" (no) is more complex. We first use language models to determine the pronunciations of the word "不" (no) in Section 3. A layered approach for determining the proper pronunciation of "不" (no) is proposed in Section 4. In Section 5 we compare our results with those of the decision list classifier used by [19] and [20], which has worked very well for word sense disambiguation problems. The results show that the accuracy of the layered approach is higher than that of the language models and the decision list classifier. Finally, in Section 6, we summarize our major findings and outline our future work.

2. The polysemy problems in Taiwanese

Compared with Chinese, polysemy occurs more frequently and is more complex in Taiwanese. We give some examples below to show the importance of solving the polysemy problem in a C2T TTS system.

The first example concerns some pronouns in Taiwanese, namely "你" (you), "我" (I), and "他" (he). Each of these three pronouns has two pronunciations, and the different pronunciations correspond to different meanings. Ex1 shows the pronunciations of the words "我" (I) and "你" (you) in Taiwanese. The two pronunciations of "我" are /ghua/, meaning "I" or "me", and /ghun/, meaning "my". The two pronunciations of "你" are /li/, meaning "you", and /lin/, meaning "your". If a wrong pronunciation is chosen, the speech will convey the wrong meaning.

(Ex1)『我/ghua/過一會兒會拿幾本有關台語文化的書到你/lin/家給你/li/，你/li/可以不必到我/ghun/家來找我/ghua/拿。』 (I will bring some books about Taiwanese culture to your house for you later; you need not come to my home to get them from me.)

Ex2 shows the two different pronunciations of "他" (he): /i/ with the meaning of "he" or "him", and /in/ with the meaning of "his".

(Ex2)『我看到他/i/拿一盆蘭花回他/in/家給他/in/爸爸。』 (I saw him bring an orchid back to his home for his father.)

The word "我們" (we) can also have two pronunciations with different meanings when used in Taiwanese. The word can be used in two situations: (1) "我們" includes the speaker and the listener(s), and (2) "我們" includes the speaker but does not include the listener(s). Depending on the meaning, there are two pronunciations of the word "we" in Taiwanese, /ghun/ and /lan/. The corresponding Chinese characters for /ghun/ and /lan/ are "阮" and "咱", respectively. The following example helps to illustrate the different meanings of the word "我們"; more examples of these differences are used later in this section.

Assume first that Jeffrey and his younger brother, Jimmy, ask their father to take them to see a movie and then go shopping. Jeffrey can say the following to his father:

(Ex3)『爸爸你要記得帶我們一起去看電影, 我們看完電影後, 再一起去逛街。』 (Daddy, remember to take us to see a movie, and then we go shopping after seeing the movie.)

The pronunciation of the first word "我們" (us) in Ex3 is /ghun/ in Taiwanese, since this "我們" includes Jeffrey and Jimmy but not the listener, Jeffrey's father. The pronunciation of the second word "我們" (we) in Ex3 is /lan/ in Taiwanese, since this "我們" includes both the speaker and the listener.

The pronunciation of the word "我們" in Ex4 is /ghun/ in Taiwanese, since the word "我們" includes Jeffrey and Jimmy and does not include the listener, Jeffrey's father.

(Ex4)『爸爸, 我要和弟弟去看電影, 我們看完電影後, 會一起去逛街。』 (Daddy, I will go to see a movie with my younger brother, and we will go shopping after seeing the movie.)

If a C2T TTS system cannot identify the correct pronunciation of the word "我們", we cannot understand what the synthesized Taiwanese speech means. In a C2T TTS system it is therefore necessary to decide the correct pronunciation of the Chinese word "我們" in order to have a clear understanding of the synthesized Taiwanese speech.
Another complex example is the word "不" (no). "不" has six different pronunciations: /bhuaih/, /bho/, /m/, /bhei/, /mai/, and /but/. Ex5~Ex8 show some examples of the pronunciations of "不".

(Ex5)『一般人並不/bho/容易看出它的重要性。』 (It is not easy for a person to see its importance.)

(Ex6)『不/m/知浪費了多少國家資源。』 (We do not know how many national resources are wasted.)

(Ex7)『讓人聯想不/bhei/到他與機械的關係。』 (It does not bring to mind the relationship between him and machines.)

(Ex8)『華航使用之航空站交通已不/put/如從前方便。』 (The traffic of the airport is not as convenient as before for China Airlines.)

According to the above description of the polysemy problem in Taiwanese, deciding the proper pronunciation of each word is very important in a C2T TTS system. We have obtained outstanding results in solving several polysemy problems by using combined language-model approaches. In this paper we focus on solving the polysemy problem of the word "不".

3. Language models, the first approach

Language models have been applied successfully to polysemy problems in our previous research. We use this approach to see how good the results are in determining the pronunciations of the word "不" (no). First, we describe our experimental data in 3.1.

3.1. Description of experimental data

We use the Academia Sinica Balanced Corpus 3.0 (ASBC 3.0) as our experimental data. ASBC is a well-known Chinese corpus with segmentation information. We select all the data containing the single-character word "不" (no) from ASBC. Note that data with words like "不要", "不行", "不可以", and so on are not included in our experimental data, since these multi-character words containing "不" have only one pronunciation. There are 38,930 samples with the single-character word "不" selected. We determined the pronunciation of each "不" manually.

The ratio of training data to testing data is about 9:1; in other words, there are 35,037 samples for training. Table 1 shows the distribution of the training data.

Table 1. The distribution of the training data.

Kinds of Pronunciation   Samples   Ratio
/bhuaih4/                 1,152     3.3%
/bhei7/                   4,230    12.1%
/bo5/                    14,482    41.3%
/m7/                     11,726    33.5%
/but4/                    3,103     8.9%
/mai3/                      344     0.9%
Total                    35,037     100%

We can see that the pronunciations /bo5/ and /m7/ are the most frequent, while the pronunciation /mai3/ is the least frequent.

3.2. Description of language model approach

Formulas (1) and (2) are our approach to solving the polysemy problem; we call this approach WU. Two kinds of statistical results are compiled: for each training sample, the words with their corresponding frequencies that appear on the left side of the word "不", and the words with their corresponding frequencies that appear on the right side of the word "不". Each punctuation mark is treated as a word.

$$Su_L(p)=\sum_{k}\frac{C(p \,\&\, w_k)\Big/\sum_{i} C(p \,\&\, W_i)}{\sum_{j}\Big(C(p_j \,\&\, w_k)\Big/\sum_{i} C(p_j \,\&\, W_i)\Big)} \qquad (1)$$

$$Su_R(p)=\sum_{k}\frac{C(p \,\&\, w_k)\Big/\sum_{i} C(p \,\&\, W_i)}{\sum_{j}\Big(C(p_j \,\&\, w_k)\Big/\sum_{i} C(p_j \,\&\, W_i)\Big)} \qquad (2)$$

Here $C(p \,\&\, w)$ is the frequency with which pronunciation $p$ co-occurs with word $w$ in the window. $Su_L(p)$ in Formula (1) is the score of pronunciation $p$ according to the $k$ words to the left of the word "不", and $Su_R(p)$ in Formula (2) is the score of pronunciation $p$ according to the $k$ words to the right of the word "不"; $j$ ranges over the $j$ kinds of pronunciation of "不". The predicted pronunciation is the one with the maximum value of $Su_L(p)+Su_R(p)$.
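To make the WU scoring of Formulas (1) and (2) concrete, the following minimal Python sketch computes the left- and right-context scores from co-occurrence counts and picks the pronunciation with the largest combined score. The toy count table, the example words, and the function names are our own illustrative assumptions; they are not taken from the original system, which trains its counts on the ASBC samples.

from collections import Counter

# co_counts[p][w] = C(p & w): how often pronunciation p of "不" co-occurs with
# word w inside the chosen window of the training samples (toy numbers only).
co_counts = {
    "/m7/":    Counter({"知": 40, "，": 15}),
    "/bo5/":   Counter({"並": 30, "容易": 25, "，": 20}),
    "/bhei7/": Counter({"聯想": 10, "到": 12}),
}

def context_score(pron, window_words):
    """Su(p) for one side of "不": Formula (1) for the left window, (2) for the right."""
    totals = {p: sum(c.values()) for p, c in co_counts.items()}    # sum_i C(p & W_i)
    score = 0.0
    for w in window_words:
        num = co_counts[pron][w] / totals[pron]                    # C(p & w_k) / sum_i C(p & W_i)
        den = sum(co_counts[p][w] / totals[p] for p in co_counts)  # same ratio summed over all j
        if den > 0:
            score += num / den
    return score

def predict(left_words, right_words):
    """Pick the pronunciation maximising Su_L(p) + Su_R(p)."""
    return max(co_counts,
               key=lambda p: context_score(p, left_words) + context_score(p, right_words))

# Window size (1, 1) around "不" in a sentence like Ex6 『...不知...』
print(predict(left_words=["，"], right_words=["知"]))   # -> /m7/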
3.3. Results of using WU

We used window sizes (m, n) on either side of "不", where m is the window size in words on the left side and n is the window size in words on the right side. Twenty-five (5 * 5 = 25) different window sizes were applied in the analysis using the WU model. The best result achieved was 69.18%, obtained with a window of 1 word on the left side of "不" and 1 word on the right side of "不". Table 2 shows the detailed accuracy rates for each pronunciation. There are 3,893 samples for outside testing, as mentioned in 3.1.

Table 2. Results of using WU.

Kinds of Pronunciation   # of correct / # of test samples   Accuracies
/bhuaih4/                  67/128                            52.34%
/bhei7/                   318/469                            67.80%
/bo5/                    1006/1610                           62.48%
/m7/                     1079/1303                           82.81%
/but4/                    202/344                            58.72%
/mai3/                     21/39                             53.84%
Total                    2693/3893                           69.18%

4. Layered approach, the second approach

4.1. Proposed layered approach

According to our previous research, deciding the correct pronunciation of a word with a polysemy problem requires considering its neighboring words. We also find some drawbacks in using language models; for example, the distances of the neighboring words are not fixed. Our proposed layered approach is designed to remedy this drawback.

Figure 1 shows the proposed layered approach applied to an input testing sentence. We use Ex 5 to illustrate how the layered approach works.

(Ex 5) 體驗出(VC) 一(Neu) 種(Nf) 說(VB) 不(D) 出來(VB) 的(DE) 感覺(Na)

Ex 5 is a fragment in Chinese with segmentation and part-of-speech (POS) information. The value in each pair of parentheses is the POS of the corresponding word; for example, the POS of "說" is VB. The POS tags are defined in ASBC 3.0. We want to predict the correct pronunciation of the word "不" in Ex 5.

In Figure 1, there are four layers in our layered approach. We denote (w−2, w−1, w0, w+1, w+2) as (種, 說, 不, 出來, 的). The first pattern, (種, 說, 不, 出來, 的), is the input of layer 4. Since no such pattern is found in layer 4, we cannot decide the pronunciation of "不" with this pattern in layer 4. We then use two patterns, (w−2, w−1, w0, w+1) for (種, 說, 不, 出來) and (w−1, w0, w+1, w+2) for (說, 不, 出來, 的), as the input of layer 3. We cannot find any training patterns that tally with these two patterns, so the pronunciation cannot be decided in this layer either.

There are three patterns used in layer 2: (種, 說, 不), (說, 不, 出來), and (不, 出來, 的). We find that the pattern (種, 說, 不) appears in the training data. Its frequencies are 1 for the pronunciation /bhuaih4/, 2 for the pronunciation /bhei7/, and 0 for the other pronunciations. The probabilities for each possible pronunciation of "不" in Ex 5 are therefore 1/3 for /bhuaih4/, 2/3 for /bhei7/, and 0 for the others. We can conclude that the predicted pronunciation of "不" is /bhei7/ in this layer.

Compared with the language models in Section 3, the main advantage of the layered approach is that we can keep up to 5-gram word information from the training data, and there is no data sparseness problem in our approach. In the layered approach we also keep the lower n-gram information: if no pattern is found in a higher layer, we can back off to a lower layer. This is the concept of the back-off strategy.
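The pattern lookup with back-off described above can be sketched in a few lines of Python. The tiny pattern table below only reproduces the Ex 5 walkthrough (layer 2 containing the pattern (種, 說, 不)); in the real system each layer's table is trained from the training samples, and the function and variable names are our own.

# pattern_tables[layer] maps a word pattern containing "不" to its counts per
# pronunciation; layer 4 holds 5-word patterns, layer 3 holds 4-word patterns, and so on.
pattern_tables = {
    4: {},
    3: {},
    2: {("種", "說", "不"): {"/bhuaih4/": 1, "/bhei7/": 2}},
    1: {},
}

def patterns_for_layer(words, layer):
    """All n-grams of the layer's length that contain the target word "不"."""
    n = layer + 1                                  # layer 4 -> 5-grams, ..., layer 1 -> 2-grams
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)
            if "不" in words[i:i + n]]

def predict_layered(words):
    """Try the highest layer first and back off to lower layers when nothing matches."""
    for layer in (4, 3, 2, 1):
        counts = {}
        for pat in patterns_for_layer(words, layer):
            for pron, c in pattern_tables[layer].get(pat, {}).items():
                counts[pron] = counts.get(pron, 0) + c
        if counts:                                 # some pattern of this layer was seen in training
            total = sum(counts.values())
            return max(counts, key=counts.get), {p: c / total for p, c in counts.items()}
    return None, {}

# Ex 5 context around "不": (種, 說, 不, 出來, 的)
print(predict_layered(["種", "說", "不", "出來", "的"]))
# -> ('/bhei7/', {'/bhuaih4/': 0.33..., '/bhei7/': 0.66...})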
4.2. Results of using the layered approach

We use the experimental data mentioned in 3.1. There are 311,180 samples for training, and we use these 311,180 samples to train the four layers. The other 3,893 samples, which are the same as those used in 3.3, are the outside test data.

Table 3 shows the accuracies of using the layered approach based on word patterns. The total result is about 5% higher than that of the language model (73.49% vs. 69.18%).

Table 3. Results of using the layered approach by word pattern.

Kinds of Pronunciation   # of correct / # of test samples   Accuracies
/bhuaih4/                  44/128                            34.38%
/bhei7/                   300/469                            63.96%
/bo5/                    1240/1610                           77.02%
/m7/                     1114/1303                           84.50%
/but4/                    157/344                            45.64%
/mai3/                      6/39                             15.38%
Total                    2861/3893                           73.49%

4.3. Modified layered approach

Theoretically, the priority of a higher layer should be higher than that of a lower layer in our layered approach; that is, if we can determine the pronunciation of the word with the polysemy problem at a higher layer, we need not use the answer of a lower layer. There may be a drawback, however: if the probability of the pronunciation predicted by a lower layer is higher, some correct predictions may be neglected. We therefore use the concept of a confidence measure to measure the confidence of the four layers. The final predicted pronunciation is the one with the highest confidence.

Figure 2 shows the confidence curves of the four layers. These confidence curves are used to measure the confidence of each layer, and we choose the pronunciation predicted by the layer with the highest confidence.

[Figure 2: four panels, (a) the confidence curve of layer 4, (b) the confidence curve of layer 3, (c) the confidence curve of layer 2, and (d) the confidence curve of layer 1.]
Figure 2. The confidence curves of the four layers. The x-axis in each graph is the difference between the scores of the top 1 and top 2 candidates; the y-axis is the confidence.

Table 4 shows the result of using the modified layered approach with the confidence measures. The total results show that the modified layered approach is improved.

Table 4. Results of using the modified layered approach.

Kinds of Pronunciation   # of correct / # of test samples   Accuracies
/bhuaih4/                  43/128                            33.01%
/bhei7/                   306/469                            65.25%
/bo5/                    1262/1610                           78.39%
/m7/                     1125/1303                           86.34%
/but4/                    158/344                            45.93%
/mai3/                      5/39                             12.82%
Total                    2899/3893                           74.47%
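The modified approach can be pictured as follows: every layer makes a prediction, each prediction is given a confidence read off that layer's curve in Figure 2 (a function of the gap between the top-1 and top-2 scores), and the answer of the most confident layer wins. The confidence function in this Python sketch is a stand-in with invented numbers, since the actual curves are estimated from the training data.

def confidence(layer, top1, top2):
    """Stand-in for the confidence curves of Figure 2: confidence grows with the
    difference between the top-1 and top-2 scores; the base levels are invented."""
    base = {4: 0.6, 3: 0.5, 2: 0.4, 1: 0.3}[layer]
    return min(1.0, base + (top1 - top2))

def predict_with_confidence(layer_predictions):
    """layer_predictions: {layer: (pronunciation, top1_score, top2_score)}."""
    best = max(layer_predictions,
               key=lambda l: confidence(l, *layer_predictions[l][1:]))
    return layer_predictions[best][0]

# Layer 2 barely separates its two candidates, layer 1 separates them clearly,
# so the layer-1 answer is preferred even though layer 2 is higher.
print(predict_with_confidence({2: ("/bhei7/", 0.66, 0.33), 1: ("/bo5/", 0.90, 0.05)}))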
5. A comparison with a decision list classifier

As far as we know, no researchers have looked at the polysemy problem in translating from Chinese to Taiwanese in a C2T TTS system except for our proposed language models. In this paper, the problem is treated as one of word sense disambiguation: the main task is to determine the pronunciation of the word "不" in Taiwanese.

A number of papers have looked at disambiguation in recent years [17][15][3][10][4][5][20]. In 1997, Yarowsky built a decision list classifier (DLC) using the local context cues within a 20-word window around the target word. A log-likelihood ratio is generated, which stands for the strength of each clue of the local context, and the decision is made by matching the sorted ratio sequence to decide the sense of a target word. The accuracy reached over 96% on a wide variety of binary decision tasks. The decision list classifier proposed by Yarowsky is among the best methods for solving the problem of word sense disambiguation. We apply the decision list classifier to determining the correct pronunciations of the word "不", and then compare its accuracy with our three approaches. A brief description of the decision list classifier follows.

5.1. A brief introduction of DLC

In a decision list classifier there is a list of rules (or evidences) for a particular word, sorted by the log-likelihood ratio. The list is applied to the word in sequence: the correct meaning is chosen by the model based on the context in which the word is used in the sentence, or by testing the meaning of the words on either side of the target word. If the first evidence is determined to be true, the word or words associated with this meaning are selected for the target word. In contrast, if the evidence is determined to be false, the next rule in the sequence is tried. The procedure continues until all of the rules have been tested or a meaning is found that produces the correct answer. The disambiguating strength of each evidence is measured by the absolute value of the log-likelihood ratio, as follows:

$$\mathrm{Abs}\!\left(\log\frac{P(S_1\mid ev_i)}{P(S_2\mid ev_i)}\right) \qquad (3)$$

where $S_1$ and $S_2$ denote the two senses of the target word, and $ev_i$ denotes evidence $i$. As shown in Table 5, the target word bass has two possible meanings, fish and music.

Table 5. Decision list for disambiguating the meaning of the word bass, where bass1 means fish and bass2 means music.

logL    Evidence (rule)            Sense
10.98   1. fish within window      bass1
10.92   2. striped bass            bass1
 9.70   3. guitar within window    bass2
 9.20   4. bass player             bass2
 9.10   5. piano within window     bass2
 9.01   6. tenor within window     bass2
 8.87   7. sea bass                bass1

Ex 6 shows how the decision list classifier works. The target word bass has two possible meanings depending on the context in which it is used in a sentence: bass can mean a fish, or it can be a musical instrument or music player. The decision list classifier should choose the meaning musician for the target word bass in this sentence. In Table 5, there are three pieces of evidence matching the context of the sentence: first, fish within window; fourth, bass player; and fifth, piano within window. The 4th evidence "bass player" and the 5th evidence "piano within window" would choose the meaning music, while the first evidence "fish within window" selects the meaning fish; fish is chosen because its likelihood ratio, 10.98, is the highest in the sequence of evidence. It is obvious that the meaning fish chosen by the decision list classifier for the target word bass is incorrect. The contextual width of the window employed in the decision list classifier is ±20 words (Yarowsky, 1997).

(Ex 6) "The fish, eaten by the piano player and bass player, is from New Zealand."
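For reference, a compact Python sketch of a Yarowsky-style decision list: every piece of evidence gets the log-likelihood score of Formula (3), the rules are sorted by that score, and the first rule that fires decides the sense. The smoothing constant and the toy rules are assumptions made for the example; the scores loosely follow Table 5.

import math

def log_likelihood(count_s1, count_s2, alpha=0.1):
    """Formula (3): |log(P(S1 | evidence) / P(S2 | evidence))| with add-alpha smoothing."""
    return abs(math.log((count_s1 + alpha) / (count_s2 + alpha)))

# (test on the context words, sense chosen when the test fires, log-likelihood score)
rules = sorted([
    (lambda ctx: "fish" in ctx,                           "bass1 (fish)",  10.98),
    (lambda ctx: "guitar" in ctx,                         "bass2 (music)",  9.70),
    (lambda ctx: ("bass", "player") in zip(ctx, ctx[1:]), "bass2 (music)",  9.20),
    (lambda ctx: "piano" in ctx,                          "bass2 (music)",  9.10),
], key=lambda rule: rule[2], reverse=True)

def classify(context_words):
    """Apply the rules in decreasing order of score; the first matching rule wins."""
    for test, sense, _score in rules:
        if test(context_words):
            return sense
    return None

sentence = "The fish , eaten by the piano player and bass player , is from New Zealand".split()
print(classify(sentence))   # -> 'bass1 (fish)', reproducing the error discussed above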
Mandarin Text-to-Speech System”,
Computational Linguistics and Chinese
5.2. Results of using DLC Language Processing, Vol. 1, No. 1, Aug. 1996,
pp. 87-100.
The contextual width of window is ±20-words. We 3. A. Fuji, K. Inui, T. Tokunaga and H. Tanaka,
use the experimental data mentioned in 3.1. There are “Selective Sampling for Example-Based Word
311,180 samples for training; the other 3,893 samples

433
4. W. A. Gale, "Cognition, Computation and Formal Systems: Some of Tomas Havranek's Interests and Disambiguating Word Senses," Computational Statistics and Data Analysis, Vol. 19, pp. 135-148, 1995.

5. W. A. Gale, K. W. Church, and D. Yarowsky, "A Method for Disambiguating Word Sense in a Large Corpus," Computers and the Humanities, Vol. 26, 1992.

6. C. C. Ho, "A Hybrid Statistical/RNN Approach to Prosody Synthesis for Taiwanese TTS", Master thesis, Department of Communication Engineering, National Chiao Tung University, 2000.

7. J. Y. Huang, "Implementation of Tone Sandhi Rules and Tagger for Taiwanese TTS", Master thesis, Department of Communication Engineering, National Chiao Tung University, 2001.

8. C. H. Hwang, "Text to Pronunciation Conversion in Taiwanese", Master thesis, Institute of Statistics, National Tsing Hua University, 1996.

9. F. L. Hwang, M. S. Yu, and M. J. Wu, "The Improving Techniques for Disambiguating Non-Alphabet Sense Categories", in Proceedings of ROCLING XIII, 2000, pp. 67-86.

10. C. Leacock and M. Chodorow, "Using Corpus Statistics and WordNet Relations for Sense Identification," Computational Linguistics, Vol. 24, No. 1, pp. 147-165, 1999.

11. Y. J. Lin and M. S. Yu, "An Efficient Mandarin Text-to-Speech System on Time Domain", IEICE Transactions on Information and Systems, Vol. E81-D, No. 6, June 1998, pp. 545-555.

12. Y. J. Lin, M. S. Yu, and C. J. Huang, "The Polysemy Problems, An Important Issue in a Chinese to Taiwanese TTS System", to appear in Proceedings of CISP 2008, 2008.

13. M. S. Liang, R. C. Yang, Y. C. Chiang, D. C. Lyu, and R. Y. Lyu, "A Taiwanese Text-to-Speech System with Application to Language Learning", in Proceedings of the IEEE International Conference on Advanced Learning Technologies, 2004.

14. H. M. Lu, "An Implementation and Analysis of Mandarin Speech Synthesis Technologies", M.S. thesis, Institute of Communication Engineering, National Chiao-Tung University, June 2002.

15. H. Schutze, "Ambiguity and Language Learning: Computational and Cognitive Models," Ph.D. thesis, Stanford University, 1995.

16. C. Shih and R. Sproat, "Issues in Text-to-Speech Conversion for Mandarin", Computational Linguistics and Chinese Language Processing, Vol. 1, No. 1, Aug. 1996, pp. 37-86.

17. J. Veronis and N. Ide, "Word Sense Disambiguation with Very Large Neural Networks Extracted from Machine Readable Dictionaries," in Proceedings of COLING-90, 1990.

18. Y. C. Yang, "An Implementation of Taiwanese Text-to-Speech System", Master thesis, Department of Communication Engineering, National Chiao Tung University, 1999.

19. D. Yarowsky, "Decision Lists for Lexical Ambiguity Resolution: Application to Accent Restoration in Spanish and French," in Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, NM, 1994, pp. 88-95.

20. D. Yarowsky, "Homograph Disambiguation in Text-to-Speech Synthesis," in J. van Santen, R. Sproat, J. Olive, and J. Hirschberg (Eds.), Progress in Speech Synthesis, Springer-Verlag, New York, 1997, pp. 159-175.

21. M. S. Yu and F. L. Huang, "Disambiguating the Senses of Non-Text Symbols for Mandarin TTS Systems with a Three-Layer Classifier", Speech Communication, Vol. 39, Issue 3-4, 2003, pp. 191-229.

22. X. R. Zhong, "An Improvement on the Implementation of Taiwanese TTS System", Master thesis, Department of Communication Engineering, National Chiao Tung University, 1999.
Figure 1. An example of applying the layered approach. [Figure: the word patterns around "不" from Ex 5, built from (種, 說, 不, 出來, 的), are matched against the training patterns layer by layer. No pattern is found in layer 4 (the 5-word pattern) or layer 3 (the two 4-word patterns), so all pronunciation counts remain 0 and the search moves down. In layer 2, the pattern (種, 說, 不) is found with frequencies /bhuaih4/ = 1 and /bhei7/ = 2, while (說, 不, 出來) and (不, 出來, 的) are not found, giving the output distribution (1/3, 2/3, 0, 0, 0, 0); layer 1 (the 2-word patterns) is not needed.]
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Extracting Alternative Splicing Information from Captions and Abstracts


Using Natural Language Processing

Chia Yang Cheng1, F.R. Hsu2, and Chuan Yi Tang1


1. Department of Computer Science, National Tsing Hua University, Taiwan.
2. Bioinformatics Research Center, Department of Information Engineering and Computer
Science, Feng Chia University, Taiwan.
E-mail:d938341@oz.nthu.edu.tw, frhsu@fcu.edu.tw, yitang@cs.nthu.edu.tw

Abstract

Alternative splicing mechanisms provide protein diversity for cellular growth and development. In this research, we generate a database that describes alternative splicing in different circumstances, built from captions and abstracts in Open Access journals using natural language processing techniques. We use Medical Subject Headings (MeSH) to tag words and extract the AS mechanism information through the UMLS semantic network. The database contains AS information for genes on tissue specificity, disease relation, developmental stage, functional implications, splicing type, and species.

1. Introduction

Alternative pre-mRNA splicing (AS), the main mechanism for protein diversity, occurs in 30%-60% of human genes [1]. AS is often controlled by developmental or tissue-specific factors, and many human diseases are associated with aberrant splicing [1]. The related scientific literature keeps increasing: currently, there are 12,527 alternative splicing publications in PubMed with MeSH terms [2, 3]. Thus, an automatic information extraction tool for extracting the AS function and structure facts of genes is much needed.

Recently, biomedical language processing (BLP), which transfers information from the mass of biomedical literature by natural language processing (NLP), has become a remarkable trend [4]. BLP has been applied to extract protein-protein interactions, pathway systems, and so on, but not much to mRNA splicing mechanisms [5]. Entity recognition and relation extraction are the two main tasks in BLP [6]. The National Library of Medicine already offers entity and relation data such as MeSH and the UMLS semantic network [3, 7].

MeSH is a controlled vocabulary produced by the National Library of Medicine for indexing health-related information. The UMLS semantic network offers a hierarchical link, the 'isa' link, for the most specific semantic type, and a set of non-hierarchical relationships, which are grouped into five major categories: 'physically related to', 'spatially related to', 'temporally related to', 'functionally related to', and 'conceptually related to'.

Abstracts are readily available and dense in information. Besides abstracts, figure captions also contain much important information: in biomedical papers, the most important results are presented in figures. Many components such as exons, introns, spliceosomes, and splicing factors are involved in the mRNA splicing process, and authors often describe this complicated mechanism with figures and captions. Thus, extracting AS information from captions is as important as extracting it from abstracts.

Based on MeSH, the UMLS semantic network, and the corpus searched by BioText [9], we can extract AS circumstances with NLP techniques. In this research, we generate a database that describes AS in different circumstances, such as tissue specificity, disease relation, developmental stage, functional implications, splicing type, and species, from captions and abstracts in Open Access journals.

2. Data and Methods

The basic research workflow is described below and is presented briefly in Figure 1.
Step 1: collecting the corpus with the BioText search engine
Step 2: parsing related terms from MeSH and UMLS
Step 3: corpus preprocessing
Step 4: tagging terms
Step 5: entity recognition
Step 6: ambiguity filtering
Step 7: circumstance extraction
Figure 1. The workflow to generate the AS circumstance DB.

2.1. Corpus

We collect the corpus with the BioText search engine using the keywords "alternative splicing exon intron NOT self NOT virus". In order to simplify the research, we filter out papers about self-splicing and virus-affected splicing. 404 abstracts and 83 captions are collected.

2.2. Terms

We collect terms from three sources: MeSH, UMLS, and NCBI Gene. MeSH is composed of three data levels: descriptors, concepts, and terms. Terms belonging to the same descriptor and concept are synonyms. The descriptors are organized into 16 categories, such as category A for anatomic terms, category B for organisms, C for diseases, D for drugs and chemicals, etc. Each category is further divided into subcategories. Besides, the Supplementary Concept Records record many substances that are indexed to descriptors.

Although MeSH has its own tree structure, in order to extract relations between terms we choose the UMLS semantic network to replace the MeSH tree. In MeSH, the semantic types are indicated in each concept; therefore, MeSH terms can be linked to UMLS semantic types. Based on the UMLS semantic network, we can generate a term network.

We extract terms from UMLS of the following types: Organism (A1.1.1), Tissue (A1.2.3.2), and Disease (B2.2.1.2.1). Besides, we use the gene names from PubMed Gene and the splicing-related substances from MeSH, which are not classified in detail in UMLS. For example, snRNPs are important molecules in the splicing mechanism and are classified in Ribonucleoproteins, Small Nuclear [D12.776.157.725.500.875]. In total, 24,885 terms are collected.

2.3. Methods

After constructing the corpus and the term data, we start to recognize terms in the corpus. First, we split the corpus into sentences and words. For the captions, we preprocess them to remove image pointers and then split them into sentences. Then we tag the terms in the corpus.

For ambiguous abbreviations, especially gene names, we count the total number of occurrences of each gene and ignore abnormally frequent gene names. For example, one synonym of the gene carboxyl ester lipase (GeneID: 1056) is "CELL", which is easily confused with the structural and functional unit of all organisms. In order to simplify our research, we currently ignore this kind of gene; in the future we will introduce gene name recognition technology to resolve the problem.

After filtering out the conflicting instances, we finally generate a database to present the results. Besides terms, we recognize exon and intron positions with a simple method: if "exon" or "intron" is followed by an Arabic numeral, a Roman numeral, or a numeral with a single letter, such as "exon 1a", we recognize these two words as an exon or intron position.
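The position rule just described can be captured with a small regular expression. The Python pattern below is only a sketch of one way to read "Arabic numerals, Roman numerals or numerals with a single letter"; the exact pattern used in this research is not given in the paper, so the details here are our own assumptions.

import re

# Matches "exon 1a", "exon IIIb", "intron 2", etc.; a bare "3c" that does not
# directly follow "exon"/"intron" is still missed, as noted in Section 4.
POSITION = re.compile(r"\b(exon|intron)s?\s+((?:[0-9]+|[IVXLC]+)[a-z]?)\b",
                      re.IGNORECASE)

def find_positions(sentence):
    """Return (exon|intron, position) pairs recognised in one sentence."""
    return [(m.group(1).lower(), m.group(2)) for m in POSITION.finditer(sentence)]

print(find_positions("Use of exon IIIb produces a receptor, whereas intron 2 and exon 1a do not."))
# -> [('exon', 'IIIb'), ('intron', '2'), ('exon', '1a')]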
Next, we extract the AS circumstance for genes. We take an abstract or a caption as a unit, and at least one gene name has to appear in this unit. Then, according to the UMLS semantic network, we construct the AS circumstance database. In the database, we can query tissue-specific AS or AS genes that occur in a specific species.
3. Results

From our corpus of 404 abstracts and 83 captions, we tag 91 genes in the abstracts and 11 genes in the captions. Only 5 genes are tagged in both abstracts and captions, which means the captions contain information beyond the abstracts. Human and mouse are the most plentiful of the 49 species in 7 categories detected in our research.
In the human cases, 19 tissues are tagged, and nervous and muscle instances are much more numerous than those of other tissues. These results agree with other tissue-specific AS research [1]. As for the AS type, only 11 exon-skipping and 3 mutually exclusive events are detected. There are two reasons why the other AS types are not detected well: (a) the ways of describing an AS type vary, and (b) authors often describe alternative exons by numbers rather than by type.

Let us take the abstract of "Finding signals that regulate alternative splicing in the post-genomic era" [10] (pubid=12429065) as an example. In its figure 1, the authors describe functionally significant examples of different types of alternative splicing (Fig. 2). We tag caption b of that figure as follows:

Alternative exons may be <genetic function> <AS type> mutually exclusive </AS type> </genetic function>, such as <exon>exons IIIb</exon> and IIIc in the fibroblast growth factor receptor 2 (<gene>FGFR-2</gene>). Use of IIIb produces a receptor with high affinity for keratinocyte growth factor (KGF), whereas use of IIIc produces a high-affinity FGF receptor. Loss of the IIIb isoform is thought to be important in <disease> prostate cancer </disease>.

Figure 2. The caption 1-B and figure 1-B of "Finding signals that regulate alternative splicing in the post-genomic era".

Although "mutually exclusive" is not a genetic-function term in UMLS, in AS cases "mutually exclusive" usually indicates "mutually exclusive alternative splicing". Hence, we treat all the splicing types, such as exon skipping, as genetic-function terms.

In the UMLS semantic network, two non-hierarchical relations link "disease" (UI=T140) and "genetic function" (UI=T045): (1) disease affects genetic function, and (2) disease is the result of genetic function. Therefore, we recognize that exon 3b is involved in a mutually exclusive alternative splicing event and is related to prostate cancer.

4. Discussion and Conclusion

Basically, the results are encouraging; the captions provide unique information that the abstracts do not. However, some issues still need deeper discussion. Taking Figure 2 as an example, some information is still lost in this research. First, exon 3c is not recognized because "3c" does not immediately follow "exon". Second, the details of how AS regulates biological function are not extracted: in this research we only learn that FGFR-2 is related to prostate cancer, but we do not get the information that loss of the 3b isoform plays an important role in it. We will use both statistical natural language processing technology and grammar-based technology to improve this. Meanwhile, AS is a dynamic mechanism; besides the circumstance, how to extract the dynamic process is another important direction.

5. References

[1] C. Lee, L. Atanelov, B. Modrek, and Y. Xing, "ASAP: the Alternative Splicing Annotation Project", Nucleic Acids Res., 2003, pp. 101-105.
[2] http://www.ncbi.nlm.nih.gov/PubMed/
[3] http://www.nlm.nih.gov/mesh/
[4] L. Hunter and K. Bretonnel Cohen, "Biomedical Language Processing Perspective: What's Beyond PubMed?", Molecular Cell, 2006, pp. 589-594.
[5] P. K. Shah, L. J. Jensen, S. Boue, and P. Bork, "Extraction of transcript diversity from scientific literature", PLoS Comput Biol., 2005, pp. e10.
[6] O. Bodenreider, Text Mining for Biology and Biomedicine, Artech House, 2006.
[7] http://semanticnetwork.nlm.nih.gov/
[8] William W. Cohen, Richard Wang, and Robert F. Murphy, "Understanding Captions in Biomedical Publications", KDD 2003, pp. 499-504.
[9] M. A. Hearst, A. D., H. Guturu, A. Ksikes, P. Nakov, M. A. Wooldridge, and J. Ye, "BioText Search Engine: beyond abstract search", Bioinformatics, 2007, pp. 2196-2197.
[10] A. N. Ladd and T. A. Cooper, "Finding signals that regulate alternative splicing in the post-genomic era", Genome Biology, 2002, 3: reviews0008.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

New Frontiers of Microcontroller Education: Introducing SiLabs ToolStick


University Daughter Card

Gourab Sen Gupta, Chew Moi-Tin


School of Engineering and Advanced Technology (SEAT)
Massey University, New Zealand
g.sengupta@massey.ac.nz, M.Chew@massey.ac.nz

Abstract

Educational support available is an important criterion for selecting a microcontroller for teaching embedded programming in an undergraduate university course. The paper presents and discusses two things - first, the criteria employed to choose a microcontroller that has been used as the base for teaching the undergraduate course on Design for Computer & Communication Systems at Massey University, New Zealand, and second, the design and development of an 'expansion board' which can be used in conjunction with the microcontroller development board to do meaningful experiments. The expansion board has many peripheral resources. Silicon Labs Inc, USA, has amalgamated the microcontroller development board and the 'expansion board' to create a versatile University Daughter Card which is detailed in the paper. Software implementation of instruments and displays, such as the oscilloscope and LCD, has eliminated the need to use bench-top lab instruments. The introduction of the SiLabs University Daughter Card has helped to substantially increase the standard and completion rates of final-year projects.

1. Introduction

Project-based learning has long been adopted as the educational model by the Institute of Information Sciences and Technology of Massey University, New Zealand. This approach has proved to be more effective than the conventional lecture- and exam-centred method in preparing the students to tackle the challenges of a real-world engineering job. In addition to training the students in how to apply the knowledge gained during the lecture and tutorial components of the course, project work, if properly structured, can help them to be creative, independent, innovative and resourceful in their new job upon graduation [1].

Educationists have always grappled with the choice of microcontroller to use for teaching. Many factors influence the choice, such as easy availability, price, educational support, software tools and the peripheral resources available on the microcontroller. The modern trend is to build mixed-signal microcontrollers which integrate digital and analog resources on the same chip. Also, for meaningful experimental work during project-based learning, a resource-rich development board is quite essential. A common drawback of most microcontroller development systems is that they require traditional laboratory instruments such as an oscilloscope to observe waveform characteristics. This also renders the system 'non-portable' and always requires a laboratory setup. Other display devices, such as an LCD, make the development system not only bulkier but also costlier. An ideal development system should be such that it can be carried around, doesn't require a laboratory setup, is inexpensive and provides enough peripheral resources.

The paper is organised in the following manner: Section 2 discusses the criteria used to select a particular microcontroller for teaching. In Section 3 we present the features of the SiLab C8051F020 microcontroller. Section 4 presents the details of the expansion board while Section 5 details the SiLab ToolStick University Daughter Card. Section 6 introduces the features of the Virtual Display. The paper ends with a discussion in Section 7.

2. Selecting a microcontroller

Choosing the so-called "right" microcontroller for undergraduate course development is always a problem. The following factors should be taken into consideration: popularity, availability, pricing, architecture and features, ease of learning, educational support and tools. These criteria have, time and again, proven to be a good guide for selecting the "right" microcontroller for undergraduate teaching. However, in a broader sense the criteria may need to be looked at critically and scrutinized properly so that the students will get the most out of their learning of the microcontroller and will not be hindered from keeping abreast of new technological trends. The discussion below presents some arguments on the criteria that were used in the microcontroller selection.
2.1 Popularity

The number of microcontroller units shipped out of the manufacturing plant is not an accurate indicator of widespread use in embedded systems, as this is a function of the product's popularity rather than of its components. For example, a microcontroller used in a television set will have a high number of units used, but it may well be the only application this part has!

An indirect measurement that could be adopted to address the above issue is to survey the number of other companies that expand the functionality of the particular microcontroller in their products. The extended features to look at are speed, processing power, memory, port capabilities, and added peripherals like ADC/DAC, comparators and temperature sensors.

2.2 Availability

If there is limited availability of a particular product in the market, either locally or overseas, then from the logistics point of view it is hard to plan for using such a microcontroller in the course. The schedule plan for the laboratories and project work could be disrupted.

2.3 Pricing

With most universities having diminishing government subsidies or grants, educators need to look for products which maximise the educational benefits while minimising the pressure on the budget. The decision needs to be weighed between providing a good grounding of knowledge to the students and spending the university funds wisely.

2.4 Instructions and peripherals

The instructions and operation features of a processor greatly affect the ease of learning of a particular microcontroller. If a microcontroller has a non-orthogonal and irregular instruction set, it may be harder for the students to learn the language.

Besides, the hardware features of some peripherals (in addition to the standard timers and UARTs) within the chip should also be looked at (Table 1).

2.5 Educational Support

Strong product and technical support from the companies is very essential, as it gives an additional edge for selection of the microcontroller device.

2.6 Software Tools

There should be a good provision of a fair range of simulators, monitors, assemblers, debuggers and C compilers. It is imperative that the new development tool supports the standard Windows user interface. Besides the RS232 communication link to the development board, a USB communication link is also essential nowadays.

Table 1. Performance Features

Peripherals                              Features & Performance
ADC and DAC                              Availability and programmable throughput in terms of ksps (kilo samples per second)
Ports                                    The number of ports available
Size                                     Pin count
Max Speed                                Decides the processing power in terms of MIPS (millions of instructions per second)
Memory                                   Sizes, expandability and support for in-system programmable/reprogrammable mode
On-chip debug circuitry                  Superior to emulation systems, which involve an expensive emulator
Temperature sensor, voltage comparators  Additional features that are useful to have

3. Mixed-signal microcontroller C8051F020

Before introducing the new unit, thorough studies based on the above criteria were done on various microcontrollers from major production companies in the market. It was found that while some need additional circuitry to be wired up, others require a separate power supply or an expensive development tool to run. The microcontrollers that come with evaluation boards are usually in a higher price range.

The relatively low-cost, field-programmable, mixed-signal embedded microcontroller SiLab C8051F020 [2], utilizing the CIP-51 core, has become the ultimate choice for the course on Design for Computer and Communication Systems. It met all of the selection criteria discussed above. The microcontroller comes on a compact target board; it is portable, handy, and the students are able to work at home with the unit.

Two important conditions were satisfied by choosing the mixed-signal integrated microcontroller C8051F020: restrictions on the budget expenditure and the need to introduce the new course within a very limited timeframe. The microcontroller is a ready-made tool for teaching undergraduates, with most of the peripherals built in and an easy connection interface to a PC or laptop. Preparation time for tedious lab work like designing the PCB artwork for an interfacing board and PCB fabrication has largely been avoided! Another advantage besides the wise spending is that the quality of the course designed around the mixed-signal board has not been affected; on the contrary, it gives state-of-the-art knowledge to the students.
The C8051F020 microcontroller is a 'big brother' of the ever-popular 8051 chip [3]. Many features and capabilities have been added to the basic 8051 to create a very powerful microcontroller suitable for high-end, high-speed industrial applications as well as for use in numerous electronic products.

4. Expansion Board

Most microcontroller development systems have very limited peripheral resources built on the target board. Usually a couple of switches, a few buttons and occasionally an LCD module are all that one gets. This leaves the onus of expanding the target board on the end user. The SiLab microcontroller board, C8051F020-TB, has only one push-button and an LED. An expansion board was thus developed to provide several peripheral resources. Figure 1 shows the functional block diagram of the expansion board.

[Figure 1: block diagram. Through a 96-pin DIN connector to the SiLab MCU board, port P7 bits 2-0 drive the 3 control lines and P6 the 8 data lines of a 16-character x 2-line liquid crystal display; P5 bits 7-4 drive four LEDs; P5 bits 3-0 read four push-button switches; P4 reads the 8 DIP switches; AIN0.2 reads the potentiometer; AIN0.3 reads the temperature sensor; DAC1 sets the LCD contrast (via JP1); DAC0 goes to test point 2.]
Figure 1. Expansion board functional block diagram

4.1 Peripheral Resources and Connections

The various parts of the expansion board are explained in this section.

LEDs
In most applications, several LEDs are required, often to depict port status and program diagnostics. Thus four LEDs are provided on the board.

Pushbutton Switches
Pushbuttons and toggle switches are required in any microprocessor development system for generating digital input signals. Four pushbuttons are provided on the daughter card.

Toggle Switches
To further increase the capabilities of the daughter card to provide digital inputs, there are eight toggle switches on it. These are in the form of DIP (Dual-In-Line package) micro-switches.

Potentiometer
A potentiometer is used as a voltage divider. It allows the ADC to be used with no danger of the input exceeding the maximum rated voltage. The variable analogue voltage (0 to 3.3V) from the potentiometer is connected to the input of the 12-bit ADC on channel 2.

Temperature Sensor Input
A thermistor can be connected on the daughter card.

Liquid Crystal Display
The LCD provided on the expansion board is a 2-line x 16-character display module built around the Hitachi HD44780 controller. The LCD has a parallel interface and is thus very convenient for connection to the digital I/O port of the C8051F020. The LCD greatly enhances the versatility of the expansion board, since a convenient means of displaying program output is now at the disposal of the user.

Figure 2 shows the expansion board and Figure 3 shows it connected to the C8051F020-TB target board.

Figure 2. Expansion Board

Figure 3. Expansion Board Connected to C8051F020-TB
5. SiLab ToolStick University Daughter Card

Silicon Labs, USA, has developed a C8051F020-based target board specifically aimed at teaching microcontrollers and embedded programming. All the peripheral resources that were on the expansion board, namely four push buttons, eight toggle switches, four LEDs and a potentiometer to generate a variable analog voltage for the ADC input, are all there. In addition, many other I/O pins are available on the board, such as the voltage comparator inputs and DAC outputs. One can also connect a thermistor, for which header pin connections are available. Connections to some digital I/O ports are available through header pin connections. There is an on-board 22.1184 MHz crystal oscillator for the system clock, though one can use the microcontroller's internal oscillator too. All these resources and facilities make the board, called the ToolStick University Daughter Card (DC), very versatile for teaching and learning the C8051F020 microcontroller. It immensely facilitates experiments, which can be done very easily without requiring laboratory instruments. Figure 4 shows the functional block diagram of the University Daughter Card.

[Figure 4: block diagram. The SiLab C8051F020 microcontroller connects to the comparator inputs CP0+, CP0-, CP1+ and CP1-; P5 bits 7-4 drive four LEDs; P5 bits 3-0 read four push-button switches; P4 reads the 8 DIP switches; AIN0.2 reads the potentiometer; AIN0.3 reads the temperature sensor; DAC0 and DAC1 are brought out.]
Figure 4. University Daughter Card functional block diagram

5.1 Additional Peripheral Resources and Connections

As can be seen, there is no LCD on the university daughter card; it has now been implemented in software, which is discussed in a later section of the paper. The additional features are:

I/O Port Connections
Pin header connections are provided on the daughter card for external connections to ports 0 to 2.

Analog Signal Connections
Pin header connections are provided on the daughter card for external connections to the analog comparator inputs and DAC outputs. For various experiments, the DAC outputs or the analog voltages at AIN0.2 (potentiometer) and AIN0.3 (thermistor) can be connected to the comparator inputs.

[Figure 5: photograph of the daughter card with its resources labelled: toggle switches, LEDs, push-button switches, power LED, I/O port connections, analog connections, the C8051F020 microcontroller, the potentiometer, the connection for the thermistor, and the target connection.]
Figure 5. Resources and connections on the ToolStick University Daughter Card

5.2 Connecting the University DC to a PC

The university daughter card is connected to the personal computer using the USB port through the ToolStick Base Adapter. Figure 6 shows the connection diagram. The connection between the daughter card and the base adapter uses the card edge connector.

[Figure 6: connection diagram. On the PC, the Silicon Labs IDE and the ToolStick Terminal communicate over USB with the base adapter's debug logic and UART & GPIO firmware; the base adapter connects through the card edge to the daughter card's MCU, which exposes the debug hardware, UART and GPIO to external hardware.]
Figure 6. Connecting the University Daughter Card to a PC using a ToolStick Base Adapter

The base adaptor provides a USB debug interface to a Windows PC and the firmware for UART serial communication between the PC and the daughter card. Figure 7 shows the ToolStick University Daughter Card physically connected to the ToolStick Base Adapter.
[Figure 7: photograph of the ToolStick Daughter Card plugged into the ToolStick Base Adapter.]
Figure 7. University DC connected to the ToolStick Base Adapter

6. Virtual Display

Silicon Labs' Virtual Display incorporates an LCD, a multi-channel oscilloscope and a Terminal for serial communication. This greatly facilitates the learning of microcontroller embedded programming without having to use laboratory instruments. It also brings down the cost of setting up a laboratory. One can do all the experiments even in the comfort of a home, as long as there is access to a PC.

6.1 Virtual LCD

The virtual LCD software implements a 16x2 character LCD interface. It mimics the behaviour of the industry-standard Hitachi HD44780 controller. The LCD is shown on the PC screen, and any data that needs to be displayed can be sent to this LCD. The programming interface is very simple: only four functions are required to be learnt and invoked to display data on the LCD or read data from the LCD. Figure 8 shows the Virtual LCD displaying the temperature measured using the on-chip temperature sensor.

Figure 8. Virtual LCD display

The prototypes of the write functions are:

void LCD_ControlWrite(BYTE cmdChar);
void LCD_DataWrite (BYTE dataChar);

The prototypes of the read functions, which are rarely used, are:

BYTE LCD_DataRead (void);
BYTE LCD_ControlRead (void);

Two types of information may be written to or read from the LCD: data and control (command) bytes. Control bytes are used to program the LCD features such as cursor off, cursor blinking, display scroll, clear screen etc. Data bytes are the characters that are displayed on the LCD. A short program segment is shown below to initialise the LCD:

void LCD_Init(void)
{
    //-- Display ON, Cursor OFF
    LCD_ControlWrite(0x0C); LCD_Delay();
    //-- Clear LCD
    LCD_ControlWrite(0x01); LCD_Delay();
    //-- Entry mode increment without shift
    LCD_ControlWrite(0x06); LCD_Delay();
}

A routine to display a string is shown below:

void LCD_display(char *str, int length)
{
    int i;
    for (i=0; i<length; i++)
    {
        //-- write one byte of data to LCD
        LCD_DataWrite(*str);
        str++;
    }
}

So, to display any numeric data, it can be converted to a string and passed to the LCD_display function.

6.2 Virtual Oscilloscope

The virtual oscilloscope is a very handy 'software instrument' to have. It has four channels and can be used to display a waveform. If a series of data is generated in a program and needs to be plotted against time, one simply sends the data to the oscilloscope. The oscilloscope interface is shown in Figure 9; it is showing a PWM signal generated using a timer.

Figure 9. Virtual Oscilloscope display
To program the oscilloscope, one needs to master only two functions, the prototypes of which are given below:

void ScopeClearBuffer (BYTE ChannelMask);
void ScopeSampleWrite (BYTE Channel, BYTE sendValMSB, BYTE sendValLSB);

A code segment to display the DAC output is shown below:

WORD sendvalue;

ScopeClearBuffer(0x0F);    //clear channels 0 to 3
sendvalue.i = DAC_count;
//-- write to channel 0
ScopeSampleWrite(0, sendvalue.c[MSB], sendvalue.c[LSB]);

The virtual oscilloscope interface allows the user to change the time base and the vertical resolution. It can also store the displayed data in a file, which can be retrieved and processed separately.

6.3 Virtual Terminal

The virtual terminal module provides the infrastructure to communicate with the embedded program over the RS232 serial communication channel. The received data can be stored in a file, and a complete data file may be sent from the PC to the microcontroller. It supports ASCII and HEX data formats. Figure 10 shows the Terminal interface; it is showing the output of the ADC.

Figure 10. ToolStick Terminal interface for serial communication

Only two functions are required to communicate with the Virtual Terminal program, one to read and one to write. The function prototypes are as follows:

void TerminalWrite (BYTE SendChar);
BYTE TerminalRead (void);

The routine shown below may be called to write a string to the Terminal:

void Terminal_display(char *str, int length)
{
    int i;
    for (i=0; i<length; i++)
    {
        TerminalWrite(*str);
        str++;
    }
}

7. Discussions

This paper first presents the criteria used to select a mixed-signal microcontroller for teaching embedded programming at an undergraduate level. One of the constraints of most microcontroller development boards is the lack of enough peripheral resources. This problem was overcome by designing an 'expansion card' to mate with the SiLabs C8051F020-TB target board. The expansion board had several additional peripheral resources for a learner to use during experimentation. A combination of the target board and the expansion board is now on the market from Silicon Labs and is called the ToolStick University Daughter Card [4]. This development hardware is specifically targeted at university education and interfaces with the PC on the USB port using a base adaptor. Several virtual instruments and displays have been implemented in software, which reduces the cost of setting up a microcontroller training lab. The hardware is small, cheap and thus affordable. This is the new frontier of microcontroller education.

Since the introduction of the C8051F020-based development hardware in the School of Engineering and Advanced Technology (SEAT) of Massey University, New Zealand, the students have made tremendous progress not only in learning embedded programming but also in successfully completing their final-year project work.

8. References

[1] M. T. Chew, "Enhancing Engineering Creativity Through New ID Technology", Journal of Teaching Practice, Singapore Polytechnic, 1998/1999, pp. 51-56.

[2] M. T. Chew and G. Sen Gupta, "Embedded Programming with Field-Programmable Mixed-Signal Microcontrollers", Silicon Laboratories, Austin, TX, USA, 2005, 353 p.

[3] David Calcutt, Fred Cowan, and Hassan Parchizadeh, "8051 Microcontrollers: An Applications-Based Introduction", Boston, Mass., USA: Newnes, 2003.

[4] www.silabs.com/MCUniversity
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Training Data Compression Algorithms and


Reliability in Large Wireless Sensor Networks
Vasanth Iyer, Garimella Rammurthy** and M.B. Srinivas*
*Communications and Networking Research Centre
**Centre for VLSI and Embedded System Technologies
International Institute of Information Technology
Gachibowli, Hyderabad India 500032.
vasanth@research.iiit.ac.in, rammurthy@iiit.ac.in, srinivas@iiit.net

Abstract— With the availability of low-cost sensor nodes, many standards have been developed to integrate and network these nodes into a reliable network that allows many different types of hardware vendors to coexist. Most of these solutions, however, have aimed at industry-specific interoperability, not at the size of the sensor network and the large amount of data collected in the course of its lifetime. In this paper we use well-studied data compression algorithms, which bring down the data redundancy caused by correlated sensor readings, and a probability model to efficiently compress data at the cluster heads. In sensor networks, data reliability goes down as the network's resources deplete, and these networks lack any central synchronization, making it an even more global problem to compare different readings at the central coordinator. The complexity of calibrating each sensor and using an adaptable measured threshold to correct the readings from the sensors is a severe drain on network resources and energy. In this paper we separate the task of comparative global analysis out to a central coordinator and use a reference PMax, a normalized probability of an individual source that reflects the current lifetime reliability of the sensors, calculated at the cluster heads and then compared with the current global reliability index based on the PMax values of all cluster heads. As this implementation does not need any synchronization at the local nodes, it compresses once and stamps locally, without any threshold such as application-specific calibration values, and the summarization can be application independent, making it more of a sensor network reliability index that can be used independently of the actual measured values.

I. INTRODUCTION

The lifetime of a sensor network is typically factored into the resources it is deployed with; since by design it is unattended (i.e., no replacement of batteries), it coexists for many months to some years. The number of sensor nodes typically runs into hundreds to thousands in a large environmental monitoring application. As the number of nodes in such applications is far larger than in typical networks, a clustering algorithm is used in which typically 20%-30% of the nodes aggregate the data of the remaining 70%-80% of the connected nodes. These cluster heads are data concentrators which can be modeled as a CODEC (compressor/decompressor) device. The sensors attached to the nodes typically sense temperature, humidity and light. It is true, however, that the sensor measurements in the operation region are spatially correlated (since many environmental phenomena are), so they tend to be very similar. In the CODEC, a probability model is used which gives the highest probability to the most frequently occurring values reported by the sensors within the same cluster. This allows transmitting peak values with the smallest number of bits, as the underlying compression algorithm assigns the fewest bits to frequently occurring values. This probability distribution is sent with the data values to the central coordinator. So each cluster head has a unique PMax [1], but not all cluster heads have the same measured value.
typically 20%-30% of the nodes aggregate the data of the   
   , (1)
remaining 70%-80% of the connected nodes. These
cluster heads are data concentrators which can be modeled
as a device CODEC, compressor/decompressor. The Where d is the distance to transmit between sensors i to
sensors which are attached to the nodes typically sense sensor j, from this we get the Power rule based on the
temperature, humidity and light. It is true, however, that
distance d of nearest sensor to the farthest away sensor,
the sensor measurements in the operation region are
spatially correlated (since many environmental substituting in the above equation (1) and summing up
phenomena are) they tend to be very similar. In a CODEC the total energy required for all transmissions within one

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 480


DOI 10.1109/SUTC.2008.48
meter, two meters, three meters, four meters and From the equation (7) we can infer that the property is
extending up to d meters to a progressive sequence in scale invariant even with clustering c nodes in a given
equation (2) (as shown in Figure 1). radius k. It is true, however, that the sensor measurements
in the operation region are spatially correlated, to be
efficient in a large sensor network partitioning the network
! "!      #    $  %   (2) into special clusters in done periodically and data needs to
be aggregated locally by fusing all sensor reading at the
To sum up the total energy consumption we can write it cluster head. This data is periodically routed to a central
in the form of Power Law equation (3) coordinator which is a collaborative effort of all the active
nodes in the sensor network.
! "!  &$'%  '  $' % (3)
II. DATA COMPRESSION ALGORITHMS

Substituting d-distance for x and k number of bits A. Probability Model


transmitted, we equate as in equation (4). Most of the compression algorithms use a probability
model based on the entropy of the source. Entropy of
&$%  (  $ % (4) general source is given by

Taking Log both sides of equation (4), -$.%  lim234 65 72 , where

)
*&$%+   )
  )
( (5) B<CD B>CD B2CD
:$;<  =< , ;>  => , … , ;2  =2 %
72   8 8 … 8
Notice that the expression in equation (5) has the ?@A:$;<  =< , ;>  => , … , ;2  =2 %
B<C< B>C< BEC<

And {;< , ;> , … , ;2 F is a sequence of length n from the


form of a linear relationship with slope k, and scaling the
argument induces a linear shift of the function, and leaves
source. In sensor each element in the sequence is
both the form and slope k unchanged. Plotting to the log independent and identically distributed (i.i.d.), then we
scale as shown in Figure 2 we get a long tail showing a can modify the entropy to the first order to equation (8)
few nodes dominate the transmission power compared to
the majority, similar to the Wikipedia reference 80-20 -$.%   ∑ :$;< % log :$;< %. (8)
rule of Power Law [5].
B. Aggregation Model
A. Scale invraiance property in clustering for energy
If the cluster size in n (from the cluster equation (7) in
dissipation in RF based applications.
previous section) then the entropy of data aggregation is

-JKKLMKJNMO   ∑2BC< :$=%?@A> :$=% (9)


Energy Dissipation

In a lossless mode if there are no faults in the sensor


text network then we can show that the highest probability
2
given by PMax is ambiguous if its frequency is P >
otherwise it can be determined by a local function.
Distance
To the right is the long tail
, to the left are the few
that dominate(also known as the80-20 rule).
C. Local Pmax functions

Figure 2. LOG plot showing the energy dissipation versus node 2


distance up to 100 meters. :DJQ  ?@RS? =TT U >
(10)

2
As novel sensor applications are deployed to provide :DJQ  A?@VS? =TT P (11)
>
reliable data over the life-time [3] of the sensor network,
with current routing algorithms [3] which are dependent to
communicate with a central coordinator the instantaneous Where W is total number of sensors placed in a cluster
drain on the sensors are very demanding. A typical 9V head. Here the probability of sampling similar values are
battery communication for an RF sensor to transmit over highly correlated as in the case of environmental sensing
10 meters range will drain out as per the capacity table [4]. the :DJQ X 0.5 then the entropy can be re-calculated as
As shown in the previous equation in logarithmic scale for -K\\O  0.6 ?@A> 0.6  0.4 ?@A> 0.4  0.958 (12)
point to point transmission, we can extend this by per cluster head. For a good distributed clustering
clustering C nodes in the same range as shown in algorithm it uses 20% cluster heads [3] then the total
equation(6). entropy of the network will be 0.958 a 20%  19.16
per round. To further calculate the algorithm efficiency
&$%  (  $ % (6) the most popular being Huffman coding [1] which has a
lower and upper bound for a given Pmax. The Kraft-
McMillan inequality there exist a uniquely decodable code
&$ %  ($  %  ( &$% , &$% (7)

481
with code word {?B F. The average length of the code can A. Localalized Classifier – Fault  0
be upper-bounded by using the right inequality.

g g
1
?JeK  8 :$SB %?B f 8 :$SB % h?@A>  1i
:$SB %
BC< BC<

?JeK  -$.%  1

In fact it can be shown that if Pmax is the largest


probability in the probability model then for Pmax < 0.5,
the upper bound is
Figure 3. Simple classification of faulty sensors
-$.%  :DJQ
In the life-time of sensor networks when it has no faults
While for Pmax < 0.5 then the upper bound is then the case (fault=0) the classifier’s view will be as
shown in Figure 3.This given the training model of the
-$.%  :DJQ  0.086 classifier a good partition of the life-time of the sensor
network. As this information is needed latter when faults
happen in specific areas the cluster head transmit this data
Now to calculate the average numbers needed for both
periodically to the central coordinator so it can send it to
Pmax using Huffman coding, when Pmax > 0.5 then using the host for latter comparisons.
the above equations we get
B. Localalized Classifier – Fault P n
-$.%  :DJQ  0.958  0.6  1.58 bps.

Pmax < 0.5

-$.%  :DJQ  0.086  0.958  0.4  0.086  1.44 bps.

If the symbol distribution is highly skewed then it takes


few extra bits which are certainly the case in sensor
networks with no faults. To find the efficiency of the
coding we use
Figure 4. Cluster level classification of faults.
mnop <.s
jTTR=kWRl   a 100  45% (13)
q$r% t.uv From the figure 4 it is clear that the localized
aggregation function Pmax is effective and the classifier
more than the source entropy using Huffman coding. The rule will be able to differentiate the good and bad readings
data payload which is aggregated for each round with efficiently. As the fault rate increases (fault < n) then we
< have differentiation inside one cluster as the sensor
source entropy ?@A which needs a minimum of 0.78
wxny reading are correlated the cluster head is able to
bits at the cluster heads, actually it represents 1.6 bits still differentiate within the cluster boundaries. The classifier
uses a local rule for this case.
reducing the total number of bits to be transmitted to the
coordinator after coding. C. Localalized Classifier - Fault U n

III. DATA COMPRESSION ALGORITHMS


Sensors networks when deployed has a predictable
energy resource and uses a well distributed routing
algorithm to aggregate its data to periodically send the
sensed data to a central coordinator for further processing
by using minimum resources. The goal of all the
aggregation algorithms it to maximize the network
reliability index which is a global threshold and reflects
the health of the network. In the central coordinator mode
we like to implement a classifier which allows
maximizing on the redundancy of correlated data from Figure 5. Across cluster boundaries classification of faulty sensors.
each node as it learns during the lifetime of the network
and maximizes on the fault by uniformly distributing the
load on the node. This extends the useful lifetime of the Now considering cases (fault > n) it uses sampled values
sensor network by decreasing the number of energy holes from border nodes as well as some distributed nodes to
in the network and corrupting good sensors readings. compare and obtains a new global fault function. It is a

482
significant task as shown in figure 5 to compare and KND1(Pmax) :
correct the values to an expected value. The classifier N+ (Pmax)
uses a Bayesian approach in which it has to maximize on return [Pmax [0:i] + c + Pmax [i+1:] for i in range(n) for
the posterior probability with existing prior probability. c in the sampled live measurement]
The classifier uses Pmax as a reference if it is not able to
resolve then, it takes all the faulty nodes and uses the With the implementation of this classifier we get an fault
highest Pmax of the classified nodes and extracts the value rate of < %10. The input and outputs corrections are
as the best approximation for all the correlated sensors. shown in TABLE I.
We will say that we are trying to find the correction c, out
of all possible corrections, that maximizes the probability
of c given the original measurement M: V. TRAIN FEATURES

z
:DJQz  : { } (14)
|

By Bayes Theorem this is equivalent to equation (15)

|
:DJQz  : { } :$R% (15)
~
Figure 6. Classifying faulty sensors across cluster boundaries
:$R% the probability that a proposed correction c stands using near neighborhood algorithms.
on its own. This is called the correlated cluster model.
|
: { } the probability that M would be measured by itself
TABLE I.
z
CLASSIFIER PERFORMANCE COMPARISON
when the network meant c. This is the error model. Sample Correction Algorithms
:DJQz ,the control mechanism, which says to enumerate Training K-Neighbor K-Neighbor
Text Fault Rate
all feasible values of c, and then choose the one that gives Distance- 1 Distance- 2
the best combined probability score. Simulation
98% 1% 1%
Run-1
Simulation
91% 1% 8%
Run-2
IV. PMAX GLOBAL
In simple cases the sensor network uses local cluster
The off-line process helps to train the model with the head functions to predict the best correction but as the
frequency of Pmax generated by the sensors as in equation fault rate increases which is the case in large sensor
(10) and (11) which are in the fault mode but could have network deployments. The central coordinator needs a
good readings. Below is the pseudo-code for the training way to cross validate the measured data to accept it or to
the features. make possible corrections by using the nearest correction
found by using a global function. As this performed at the
Model = collection.defaultTopologyID() coordinator is not limited to any energy constraints and
can use sophisticated methods such as training and feature
For f in features: classifiers. The main goal of the cross-validation logic as
Model[f] += 1 shown in figure 7 is to compare with live values as the
Return model primary factor, the more connected the network the better

NSENSORS=train(Pmax (file(‘snapshot-
sensor.db’).read())

At this point, NSENSORS [M] holds a count of how


many times the measured value M has been seen. Now
let’s look at the problem of enumerating the possible
correction c of a seen measured value M. It is common to Figure 7. Learning and Training during correction as a Global
talk of k-neighborhood distance between two sensors, this function.
is shown in figure 6, the number of comparisons it would
take to confirm its relative measurement. If Pmax does not
get an instant match then it has to find it in K- is its reliability. This factor would correct most of the
neighborhood distance-2. In the expanded search if found anomalies which could occur due to bad calibration or
then it can approximate to measured value which closely external noise. Even the secondary factor which is to find
matches with an existing sensor of the same Pmax. The an equivalent connected path further away from the
process stops if there was no match for the seen frequency cluster head and use its measured value to correct the
which is termed unrecoverable fault (unknown) as shown currently seen value at the faulty sensor. This correction
in figure 6. The next section deals with such cases using process is used even more as the sensor network becomes
training database. Here is the pseudo code to return all widely faulty where the existing of full clusters are
measured corrections c that are K-neighbor distance away minimum or none.
from sensor with a measure M:

483
VI. MULTI-FEATURE TRAINING TABLE II.
CLASSIFIER PERFORMANCE USING PMAX LOCAL
We like to train the feature vectors in a way that it can
handle most of the aggregation locally and minimally Raw Sensor readings
globally. The two features are PMAX which takes off the Classification Temperature
PMAX PMAXC
redundancy based on value locally equation (10, 11) and Value ()
original data
M 40.00 0.10
0.70
0.60 M 32.00 0.20
0.50
0.40 data1
Y
0.30
data2
M 44.00 0.30
0.20
Data3
0.10
0.00 M 35.00← 0.40 0.40
0.00 20.00 40.00 60.00 80.00
X
(M) 75.00 0.10

Figure 8. Measured values using local PMAX LDA with Aggregated 35.00 0.40
classes €1, €2.

LDA
LDA
110

100 110

90 100

80 90 data1

70 80
f2 data2
60 70
2
f

50 60 threshold
line
40 50
Data3
30 40

30 50 70 90 110 30
f1
30 50 70 90 110
f1

Figure 9. Classifying aggregated values using local PMAX


Figure 11. Classifying local measured data with PMAXR global
LDA with classes €1, €2.
thresholds at the central coordinator.

original data

0.45
0.40
0.35
0.30 TABLE III.
0.25
Y
0.20
data1
CLASSIFIER PERFORMANCE USING PMAX- PMAXR-RELEVANT
0.15 data2
0.10 Data3
0.05
0.00
Raw Sensor readings
0.00 20.00 40.00
X
60.00 80.00 Classification Temperature Value
()
PMAX PMAXC PMAXR

Figure 10. Measured data set across sensor clusters C 57.20 0.10
PMAXR which is based on the relevance of the current C 53.60 0.10
measurement when exceeding a given threshold based on
C 55.40 0.20
the trained data.
Table II shows example local temperature reading in C 75.00 0.30 0.30 75.00
this case the PMAX= 0.40, hence a high measured value of C 55.40 0.20
75.0 can be safely ignored. This can be adapted efficiently
using a linear discriminate analysis (LDA) as shown in M 40.00 0.10
Figure 8, 9. As this needs a lot of computing resources we
M 32.00 0.20
differ such machine learning techniques to be
implemented globally. In table III the trained values are M 44.00 0.30
corrected and are shown in ‘C’ and the new measured data
M 35.00 0.40 0.40
set is shown as ‘M’. Here again the sensor measures a
high value at 75.0 with a corresponding of PMAX= 0.10 the (M) 75.00 0.10
classifier’s other feature vector is matching similar
readings reported earlier or newly available fused values Relevant 75.00 0.40 0.30
from boarder nodes, in this case there is a such high
relevant values in the training data set so comparing the VII. SUMMARY
measured data set the training data set we have Theoretically we show that study of wireless sensor
:DJQz  0.40, :DJQL  0.30 network energy management is a significant part of
The global classifier updates the new measured value solving the reliability issue. Also we show that the energy
over :DJQz , as it is highly a probable event to be reported constraint is network size invariant and converges to an
at the central coordinator. In the faulty case which is given optimal cluster size. To achieve a balance we use
in table IV where shown similar as before compression algorithms to bring down the transmitted bits
:DJQz  0.40, :DJQL  0.30 to known entropy at the cluster head as in equation (12).
The same model serves as the reliability index for a
The high value measured is treated as faulty as in this central classifier which uses multi-feature and brings
training data set it cannot find any matching value.
down the fault rate to a minimum in our case around 10%.
Figure11, shows LDA based Classifier using dot product. This technique can also be shown that in training data, as

484
the data is measured to known source entropy at each REFERENCES
cluster head then the training data set over time for a large [1] Introduction to Data compression by Khalid Sayood.
sensor network has the self mutual-information of all the [2] Pattern Classification by Richard O. Duda, Peter E. Hart, David G.
nodes in the network with respect to the new measured Stork.
values. This helps to further classify the current data set [3] Software Stack Architecture for Self-Organizing Sensor Networks,
by reducing any faults due to measurements locally. The Vasanth Iyer, G.Rama Murthy and M.B. Srinivas- ICST 2007,
classifier is based on the probability model and is Palmerston North New Zealand.
independent of the actual measured value making it [4] Battery drain (http://www.techlib.com/reference/batteries.html).
reliable, resilient and robust. [5] Power Law math (http://en.wikipedia.org/wiki/Power_law).

TABLE IV.
CLASSIFIER PERFORMANCE USING PMAX- PMAXR -FAULTY

Raw Sensor readings


Classification Temperature
PMAX PMAXC PMAXR
Value ()
C 57.20 0.10 0.
C 53.60 0.10

C 55.40 0.20

C 58.00 0.30 0.30 58.00

C 55.40 0.20

M 40.00 0.10
M 32.00 0.20

M 44.00 0.30
M 35.00 0.40 0.40

(M) 75.00 0.10

Fault 35.00 0.40

485
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

An Embedded Computing Platform for Robot

Ching-Han Chen Sz-Ting Liou


Department of Computer Science and Department of Computer Science and
Information Engineering, National Central Information Engineering, National Central
University University
pierre@csie.ncu.edu.tw 955202083@csie.ncu.edu.tw

Abstract and behavior-based robots that bring an upsurge of the


study in robot and evolve into some topics of
As the robotic industry is growing boomingly, the intelligent robot research [1]-[4].
functionalities and system's architecture of robots are The robotic system is growing extensively in recent
more and more complex. The development of robotic years. Many kinds of robot (e.g., Humanoid Robot,
application system becomes a time-consuming and Security Guard Robot, Home Robot, Entertainment
difficult task. In this paper, we propose an embedded Robot, etc.) are manufactured rapidly into the market.
computing platform for intelligent robot, and then The development cycle must be very short, and letting
design a reliable real-time operating system (RTOS) the robot into market on time become available;
on the platform for rapid developing intelligent robotic however, the complexity of applications for robotic
applications. The proposed embedded computing system is increasing day by day. In order to create a
platform includes a reconfigurable 8-bits processor High-Performance and Low-Cost robotic system in fast
core and some robot-dedicated hardware intellectual and flexible way, it is becoming necessary to develop a
property (IP) which can be generated and robotic development platform with hardware and
reconfigured easily. Based on the embedded processor software IP in a hurry. Therefore, in order to
core, a real-time OS, uC/OS-II, is ported to this coordinate different hardware and software for robot
platform. The RTOS is adjusted and optimized due to (especially for intelligent robot), an embedded
the robot-specific requirements and the hardware computing platform plays a very important role in the
resources constrains. Finally, a simple example is development of robotic system.
applied to demonstrate the software/hardware Many researches [4]-[7] indicate that a layered
(SW/HW) co-design flow based on the proposed approach is gradually becoming a trend in the design
platform. of robotic platform. The benefits of this design method
include high-level behavior control, task dispatching
Keywords: Intelligent robot, Embedded Computing and flexible design that can make the control structure
Platform, RTOS, Reconfigurable, SW/HW co-design of robotic platform more clearly and the operation of
robot more efficiently. Consequently, we propose a
1. Introduction layered platform which is composed of application,
operating system, processor and device (from top layer
The research of robotics is originated in 1970’s. The to bottom layer). On the basis of layered approach, we
purpose of robot’s utilization is to replace manpower build an embedded computing platform for robot.
efficiently, and increase the factory’s manufacture Rest of this paper: Section 2 reviews related work
ability. Its purpose was using the efficiency of robot to and Section 3 presents an overall embedded computing
take the place of manpower and increase factory platform for robots. Section 4 demonstrates an
output. With the advancement of science and experimental example based on the proposed platform.
technology, robots have been moving out from Conclusions and future works are summarized in
laboratory and existed in our daily life. Furthermore, Section 5.
researchers, biologist, mechanical engineer and
scientist of robotics, cooperate together to do the
robotic research with the perspective of biomimetic 2. Related work
approach. The research involves creating biomimetic

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 445


DOI 10.1109/SUTC.2008.69
There are many robotic development platforms [8], also tend towards complexity to let the control flow of
[9] can aid the development of robotic system so far. the device more inextricable. Therefore, a development
The followings intend to describe a variety of platform which integrates hardware and software will
platforms which are proposed in industry and become an indispensable consideration in the future.
academia. The platform must be highly scalable to make the
Microsoft Robotics Studio [8] is one of the business development of robot's hardware and software can
software platforms; for instance, it is to supply a have the flexible design advantages which include cost,
software platform that can be used across a wide- performance and time-to-market.
variety of hardware and it is also the first robot-
dedicated software announced by Microsoft. The 3. Embedded computing platform for
Development Environment of Microsoft Robotics robots
Studio includes the following major characteristic: 1.
End-to-End Development Platform. The platform Fig. 1 is the architecture of embedded computing
enables developers to interact with robots using platform which we propose. Refer to the layered
Windows or Web-based interfaces. 2. Lightweight perspective of embedded computing platform in [4]-
services-oriented runtime. The platform offers the [7], we describe our robotic embedded computing
services which is message-based architecture and make platform with platform view, system view and robot
it simple to connect with robot's sensors and actuators view respectively.
by using a Web-browser or Windows-based The platform can separate into four layers in
application. 3. Scalable and extensible platform. The platform view which composed of application,
programming model can be applied for a variety of operating system, processor and device (from top layer
robot hardware platforms. Also, third parties can to bottom layer). The system view includes application
extend the functionality of Microsoft Robotics Studio layer, management layer, computing layer and physical
by providing additional libraries and services. layer separately.
iRobot create [9] is one of the business hardware The robot view divides the platform into three
platforms and originated from the invention of MIT layers because the robot's behavior mode and the
Computer Science and Artificial Intelligence Lab; for driving of device may be altered due to environment
instance, it mainly offers a basic hardware situation. The layers comprise robot's intelligent and
development platform which facilitates developers to behavior decision in the highest level, the motion
program simple operation of the robot without control in the lowest level and transition zone in the
considering the low-level hardware architecture. middle which can configure the control flow of
Moreover, the additional command module which can hardware and software since environment situation
be mounted to the platform and is provided as well. changes.
This optional module fulfils advanced developers to
construe the automatic application of robot and enables
users to stretch the application of robotic functionality
by means of adding or combining sensors, digital
cameras, computers or other electric device.
Besides, the academic community proposed many
layered platform architectures, too. In [4], a layered
behavior planning is established for optimizing robot’s
behavior that helps to modify the behavior model of
intelligent robot in accordance with environmental Fig. 1. Layered architecture of embedded computing
characteristics. In [5]-[7], some group numerous platform for robots
controllers into master/slave control mode and some
divide the system into three layers in roughly which 3.1. Application layer
include application layer, OS layer and physical layer.
These methods are mainly aimed at robotic system and On the embedded computing platform, the
can not only speed up the communication capability of developers can use C/C++, high-level languages, to
internal system, but also hold the property of develop the applications for robot. Also, we now use
reconfiguration and elastic expansion. the off-the-shelf Keil development tools [10] to do the
Along with progress of times, the application fields programming task of compilation and simulation. In
of robot are increasing extensively and the applications addition, OS in next layer will provides Application-
are going to be designed not only for specific purpose programming interface (API) for developing
anymore. At the meanwhile, the robot's behavior will applications and drivers for propelling devices that can

446
help developers to create applications easily and
speedy without considering the hardware construction
in low-level and the driving methods.
In the utilization of embedded system, there is
usually a great amount of input/output (I/O) demand
for communicating with external component to carry
out the application's intention. Accordingly, the main
purpose of API and driver is to encapsulate I/O flow of
the system that helpfully let the developers can
concentrate on application's algorithm developing and
high-level management and decision program’s
designing without worrying about the I/O control flow. Fig. 2. The software part of embedded computing platform
Section 3.2 will discusses API and driver in more for robots
detail.
3.3. Processor
3.2. OS
Base on the interactive requirement of intelligent
By the foundation of robot's attribute, a satisfied OS robot and outer physical environment, robot-dedicated
for robotic purpose needs a well management processor must possess a number of reconfigurable IPs
mechanism to deal with tasks and devices that can which can progress in a fast integrated development.
coordinate various tasks inside the robot to work fine. These IPs should be reusable and thereby depend on
As well as offering real-time kernel to let the robot different circumstance can be increased/decreased or
react quickly and operate smoothly, the OS kernel substituted. We then can base on the actual need to
ought to have IPC methods which are dedicated to the arrange IPs and optimize the hardware design for the
convenient of robot's applications developing. embedded computing platform.
Fig. 2 presents the structure of OS kernel. API On our platform, we implement a reconfigurable
provides the interface between OS and top-level 8051 processor core, MIAT51, which is modified from
applications. Driver provides the interface between OS the open source MC8051 IP-core of Oregano Systems
and low-level devices (e.g. actuator, sensor, etc.). [12]. Because the robotic applications may often
In this paper, we adopt uC/OS-II [11] as an communicate with the interface of external component
implementation example of robotic OS. uC/OS-II is a (e.g. I2C and UART) and the controllers for accessing
RTOS which is open source and widely used especially RAM or flash memory, we can do a flexible
for control system, and it has the advantages of high adjustment for the special function register (SFR) of
performance, small footprint, excellent real-time and the processor that lets the processor can easily map to
scalable. the peripheral interface of new added device. Thus the
In our design, inside OS layer which requires API, control and utilization of the device can be more
system call, kernel and driver. To meet the requirement convenient. Fig. 3 shows the architecture of our
of intelligent robot's behavior control, we rewrite API processor and interface IPs for the platform.
and driver. API and driver mainly encapsulate the I/O
control flow as mentioned in section 3.1. We
implement the more top-level part of the I/O control
flow into API and the more low-level part of the I/O
control flow into driver since API is the top-level
interface, driver is the low-level interface and the
coverage of the I/O control flow includes top layer and
low layer.

Fig. 3. The hardware part of embedded computing platform


for robots

447
3.4. Device behavior module of subsumption system (refer to Fig.
5(b)).
As the design consideration of API and driver
which is mentioned in section 3.2. According to the
improvement of the robot's functionality and the
increasing of the system's complexity, the driving
method of the device becomes more difficult. At the
moment, using software to achieve the device's control
flow is comparatively more complex. Moreover, the
robot may work inefficiently because the waste of CPU
resources and Bus bandwidth. For example, the PWM
signal generation, which is needed for motors,
becomes the most serious problem. Consequently, we
make a parametric PWM generator which is also a
PWM hardware controller [13], [14] and can be
synthesized very fast. This PWM hardware controller,
which makes the robotic system can be controlled
effectively and eases the load of the top-level
application, receives the parameter from applications
and generates the high efficiency PWM signal
automatically. Fig. 5. (a) is the hexapod robot (b) is the subsumption system
Fig. 4 is a basic function block in the PWM
hardware IP. To describe the complex behavior mode In our platform, the high-level intelligent decisive
and control strategy precisely, we will use GRAFCET behavior in subsumption system can be built on the
[15] as a discrete-event behavior modeling tool. We application layer of our platform. And we can complete
follow a set of automatic synthesis rules which is the robot's behavior design by using the mechanisms of
proposed by Chen et al. [16] to synthesize a scheduler and inter-process communication (IPC)
customized PWM hardware IP which can be integrated which are the components in OS and can perform as
into an automatic system of robot easily. the utility of suppressor node and inhibiter node.
According to layer approach design, we may offload
most part of the robot’s complex and repeated software
control flow into hardware portion by using
GRAFCET for modeling and constructing the
hardware IPs which can ease the load of processor for
computing and enhance the whole system’s
performance.
To show the effectiveness of this designed platform,
the PWM hardware IP introduced in section 3.4 can be
used to explain the demonstration. We assume that the
Fig. 4. The basic function block of PWM controller robot involves an 8-stage procedure to complete a
movement, such as a step to go forward/backward,
4. Experimental system implementation in left/right or turn in place etc. However, there are 6 legs
a hexapod robot of the hexapod robot and each leg has 3 motors. In
other words, we need to generate 144 PWM signals to
This experimental system, which is based on the control the robot’s single movement by either software
embedded platform that we propose, implements or hardware. Instead of using software control to
subsumption system [17] on a hexapod robot. generate PWM signals for every movement every time,
Subsumption system, a behavior-based robot we create a motion table on the top of the PWM
programming method, is proposed by Rodney Brooks controller to store the 8-stage procedures’ parameter
in 1986. The suppressor node and the inhibiter node value for each basic movement. Once receiving a
inside the system can facilitate the layered and modular command instruction from the application, the PWM
behavior control design. controller then can determine the movement with the
Fig. 5(a) is a hexapod robot used in this instruction and acquire the relative parameter values
experimental system which includes all kinds of from the motion table to generate corresponding PWM
signals.

448
TABLE I References
Impact of different implementation
[1] Brooks, R. A., “A Robust Layered Control System for a
PWM-signal Software Hardware Mobile Robot”, IEEE Journal of Robotics and
generation control control Automation, March 1986, Vol. 2, No. 1, pp. 14–23.
Instruction(s) to [2] Arkin, R. C., Behavior-Based Robotics, MIT Press,
be sent through 144 1 Cambridge, MA, 1998.
bus [3] Joseph L. Jones, Anita M. Flynn, and Bruce A. Seiger,
Wasting Mobile Robots: Inspiration to Implementation, AK
processor much less Peters, Ltd, 1998.
resource [4] Rainer Bischoff, Volker Graefe, “Learning from Nature
Extra memory to Build Intelligent Autonomous Robots”, Intelligent
space no yes Robots and Systems, 2006 IEEE/RSJ International
requirement Conference on Oct. 2006, pp. 3160–3165.
[5] M. Omar Faruque Sarker, ChangHwan Kim, Seungheon
Baek, Bum-Jae You, “An IEEE-1394 Based Real-time
Even though the PWM controller must requires Robot Control System for Efficient Controlling of
extra memory space made on it for the motion table, Humanoids”, Intelligent Robots and Systems, 2006
TableTABLE I shows that the resource requirement IEEE/RSJ International Conference on Oct. 2006, pp.
impact of hardware control is less than software 1416–1421.
control. And this result also indicates good prospects [6] J. Oh, D. Hanson, W. kim, I. Han, J. Kim, and I. Park,
and development advantages: First, without numerous “Design of Android type Humanoid Robot Albert
iterative software control flow, the saving processor HUBO”, Intelligent Robots and Systems, 2006
resource can devote to some other applications’ IEEE/RSJ International Conference on Oct. 2006, pp.
1428–1433.
algorithm computing for increasing the efficacy of the
[7] Fumio Kanehiro, Yoichi Ishiwata, Hajime Saito,
processor; Secondly, a substantial bandwidth saving on Kazuhiko Akachi, Gou Miyamori, Takakatsu Isozumi,
the bus may apply to transfer the sensing data from Kenji Kaneko, Hirohisa Hirukawa, “Distributed Control
sensors to applications, especially the demanded System of Humanoid Robots based on Real-time
information which is urgent and critical to the system. Ethernet”, Intelligent Robots and Systems, 2006
IEEE/RSJ International Conference on Oct. 2006, pp.
5. Conclusions and future works 2471–2477.
[8] http://msdn2.microsoft.com/zh-tw/robotics/default.aspx
[9] http://www.irobot.com/index.cfm
In this paper, we propose an embedded computing [10] http://www.keil.com/
platform of intelligent robot which is layered [11] http://www.micrium.com/
architecture. And the platform can be used to solve the [12] http://www.oregano.at/index2.htm
fast growing of complex designing problem of the [13] Stefano Galvan, Debora Botturi, Paolo Fiorini, “FPGA-
robotic system. This platform considers the based Controller for Haptic Devices”, Intelligent Robots
functionality requirements and resources constrain of and Systems, 2006 IEEE/RSJ International Conference
generic robotic system and the tasks of robot's control on Oct. 2006, pp. 971–976.
in different abstraction layers. The platform, then, is [14] Narashiman Chakravarthy, Jizhong Xiao, “FPGA-based
Control System for Miniature Robots”, Intelligent
programmed and designed by the principle of
Robots and Systems, 2006 IEEE/RSJ International
hierarchical and modular. Conference on Oct. 2006, pp. 3399–3404.
The applications of robotic system will tend to have [15] R.David, “Grafcet :A powerful tool for specification of
diverse functionality and higher complexity in the logic controllers”, IEEE Trans. on Control Systems
future, hence, in the future work we will continue to Technology, 1995, Vol. 3, No. 3, pp. 253-268.
optimize the RTOS kernel according to the feature of [16] CHEN, Ching-Han; DAI, Jia_Hong; “Design and high-
robot's motion and intend to design an optimum on- level synthesis of discrete-event controller”, National
chip RTOS kernel [18]-[21] which is used to Conference of Automatic Control and Mechtronics
coordinate the operation of hardware and software. System, 2002, vol.1, pp. 610–615.
[17] Brooks, R. A., “How To Build Complete Creatures
Besides, we will also design specific robotic multi-
Rather Than Isolated Cognitive Simulators”,
processor [22], [23] which is depend on the nature of Architectures for Intelligence, K. VanLehn (ed),
robotic system's functionality requirements and attempt Erlbaum, Hillsdale, NJ, Fall 1989, pp. 225–239.
to make use of the architectonic property of multi- [18] H. Walder and M. Platzner, “Reconfigurable Hardware
processor to accelerate the overall robotic system's Operating Systems: From Design Concepts to
operating performance. Realizations”, Proceedings of the 3rd International
Conference on Engineering of Reconfigurable Systems

449
and Architectures (ERSA), CSREA Press, June 2003, pp.
284–287.
[19] C. Steiger, H. Walder, and M. Platzner, “Operating
Systems for Reconfigurable Embedded Platforms:
Online Scheduling of Real-time Tasks”, IEEE
Transaction on Computers, November 2004, vol. 53, no.
11, pp. 1392–1407.
[20] M. Ullmann, M. Hubne, B. Grimm, and J. Becker, “On-
Demand FPGA Run-Time System for Dynamical
Reconfiguration with Adaptive Priorities”, Field
Programmable Logic and Application: 14th International
Conference, FPL, Springer-Verlag Heidelberg, August
2004, pp. 454–463.
[21] Theelen B.D.; Verschueren A.C.; Reyes Suarez V.V.;
Stevens M.P.J.; Nunez A., “A scalable single-chip
multi-processor architecture with on-chip RTOS kernel”,
Journal of Systems Architecture, December 2003, Vol.
49, No. 12 , pp. 619–639.
[22] Becker, J., “Configurable systems-on-chip: challenges
and perspectives for industry and universities”,
International Conference on Engineering of
Reconfigurable Systems and Algorithms (ERSA), 2002,
pp.109–115.
[23] Sun, F., Ravi, S., Raghunathan, A., and Jha, N.K.,
“Custom-instruction synthesis for extensible-processor
platforms”, IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems 23 (2), 2004,
pp. 216–228.

450
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Actuation Design of Two-Dimensional Self-Reconfigurable Robots


Ming-Chiuan Shiu*,***. Hou-Tsan Lee*,****. Feng-Li Lian*. Li-Chen Fu*,**

*Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan


** Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
*** Department of Electrical Engineering, Hsiuping Institute of Technology, Taichung, Taiwan
**** Department of Information Technology, Takming University of Science and Technology, Taipei, Taiwan
( e-mail: e-mail:smc@mail.hit.edu.tw) (e-mail: houtsan@gmail.com) (e-mail: fengli@ntu.edu.tw)(e-mail: lichen@ntu.edu.tw)
Abstract: Self-reconfigurable robots have the ability to change the shape of multiple cooperative modules in
the different working environment. One of the main difficulties in building the self-reconfigurable robots is
the mechanical complexity and necessary number for achieving certain mechanisms. In this paper, a novel
design of a self-reconfigurable robot, called “Octabot”, is described. The Octabot robot is a two-dimensional
self-reconfigurable robot with modules composed of eight e-type electromagnet actuators. The magnetic force
characteristics based on FEM are first analyzed, and mechanical design and system properties are described
in detail. A group of Octabots can be easily expanded to a large scale if needed in any case. Via examining the
basic mechanical functionalities, the Octabot in self-reconfiguration shows its satisfactory performance.

1. INTRODUCTION The research of self-reconfigurable robots is opened from


Fukuda system on the CEBOT [3-6]. According to the 2D
In general, self-reconfigurable systems are formed by a set self-reconfigurable robotic system, there have seven physical
of robotic modules. The modular feature is for the ability to prototypes. All the 2D systems use one or two kinds of
change the shape of multiple cooperated robot modules that modules. To design 2D or 3D system, gravity is the most
can be easily reconfigurable in the different working important factor that must be overcome. For most of the 2D
environment [1,2]. Typical applications of a self- systems, gravity is only helpful when the self-reconfigurable
reconfigurable robot could be working and maintaining in robot and ground rub. But in Hosokawa’s [14] and in Inou’s
dodging barriers and rescuing after the earthquake etc... [12] robot, the system runs in the vertical plane, and thus
However development of such robots including building and gravity must be considered.
controlling is still a significant challenge. Development of
self-reconfigurable robot is our subject to research. Most of existing 2D systems have been built like lattice in
order to reach the mould characteristic and reduce the
In this paper, we present the development of a novel two- complexity reconfiguration. All previously reported of 2D
dimensional self-reconfigurable robot by designing, testing systems, including Murata system [7], Gear-Type Unit of
and producing it forming the basic module. It is equipped Tokashiki et al. [8], Chirikjian system [9], Crystalline [10],
with one on-board processing unit, eight drivers and eight e- Yoshida system [11], Inou’s robot [12], Kirby’s catoms [13]
type electromagnet actuators, called the Octabot. Three and Hosokawa system [14]. Each of these 2D self-
Octabots have already been set up and used for facing reconfigurable robots has only one kind of mechanical
autonomous reconfiguration in 2D. A term of modular robots module design. How to design the connector is an important
should work cooperatively to be able to self-reconfigure and and difficult challenge of the mechanical design. From the
operate autonomously. Furthermore, related control software existing 2D self-reconfigurable systems, different design
algorithm has been designed for coordinating the term of concepts of the connector have been used. In Yoshida system,
modules to change the shape in a distributed fashion. the pin/hold structure is used. Permanent magnets are found
in Murata system and Hosokawa system. In Kirby’s catoms
The contents of this paper are as follows. Related works system, the pairs of electromagnets are used to motion and
are discussed in Section 2. Magnetic fields and force analysis adhesion.
is presented in Section 3. In Section 4, magnetic modelling
and simulation results are described. We describe the design Tables 1-2 compare the existing 2D self-reconfigurable
principle in Section 5. Experimental tests are discussed in robots. Some geometrical properties are presented in Table 1.
Section 6, and conclusion is provided in Section 7. Table 2 summarizes some physical properties.

2. RELATED WORK In this paper the mechanical design of the modular robot,
the Octabot, a lattice homogeneous reconfigurable robot is
The following pioneering works in 2D self-reconfigurable described. The Octabot has some similarities with the Murata
robot inspire the research discussed in this section. Several system. All are homogeneous and can separate their
types of 2D modular robotic systems that can support self- locomotion and reconfiguration stages.
reconfiguration have been proposed [7-14]. These robots
usually comprise on constructing a large number of
independent modules and on controlling them. This paper
only concern about the hardware and shape aspects of self-
reconfigurable robots, not on the control aspects.

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 451


DOI 10.1109/SUTC.2008.21
Table 1. Geometrical properties. The Octabot is included for ∇ × H = J + ( ∂ε 0 E ) ( ∂t ) ≅ J (1)
comparison.
Actuat. where H is the magnetic field intensity, J is the current
Developer Connectors Actuated
DOF density, E is the electric field intensity and ε 0 E is the electric
Fracta 0 6 3
Gear-Type
displacement flux density. The magnetic field intensity is
1 6 0 divided into two parts: a generalized magnetic field vector
Unit
Chirikjian 3 6 3 H g ( x, y , z ) and the gradient of the generalized magnetic
Crystalline 1 4 2
Yoshida 2 4 2 scalar potential −∇Ψ g . This gradient H Ψ is evaluated at the
Inou 2 4 2 integration points using the element shape function as:
Hosokawa 2 4 2
Kirby catoms
Octabot
1
1
24
8
24
8
H Ψ = −∇ ( N )
T
(Ψ )
g (2)

Table 2. Physical properties. The Octabot is included for comparison. where the ∇ is the gradient operator, N is the shape
Weight Dimensions( functions and ω g is the nodal generalized potential vector.
Developer Connector type
(g) cm) From this, the magnetic field intensity is given by
Fracta 1200 Ø12.5 Electro Magnets
Gear-Type
Ø6 Perm. Magnets H = H g + H Ψ = H g − ∇Ψ g (3)
Unit
Chirikjian Mech. Hooks Then the magnetic flux density vector B ( x, y , z ) is related
Crystalline 375 5×5×18 Mech. Lock to H by the constitutive law:
Yoshida 80 4×4×8 Mech. Hooks
Inou 500 8×8×7.5 Mech. Grooves B = µH = B (H) (4)
Perm. Mag. and
Hosokawa
Mech. Arms
where µ is the permeability matrix.
Kirby catoms Ø4.5 Electro Magnets Second, the vector potential results are be discussed. First,
Octabot 1500 Ø13.5 Electro Magnets the magnetic flux density is derived. As the curl of the vector
potential, the magnetic flux density is defined. Using the
element shape functions, this evaluation is performed at the
integration points:
3. ELECTROMAGNETIC FIELDS ANALYSIS AND
FORCE CALCULATION B = µ H = ∇ × N TA A e (5)
A schematic diagram of proposed system is about this where ∇ × is the curl operator, N A is the shape functions
system, there are eight e-type electromagnets. To model this and A e is the nodal magnetic vector potential.
system, the magnet field analyzed becomes a very important
issue. These reasons motivate us to calculate the magnetic Magnetic forces are also been got and are be discussed
field of the e-type electromagnet that used to achieve the below.
objective of the Octabot with motion. Thus, we begin with
the analytic approach by utilizing several useful fundamental
theorems in electromagnetic fields in this section. 3.2 Magnetic Forces
Furthermore, design of such electromagnetic devices with e-
type electromagnets as well as the design of any The force calculation can be classified into two methods.
electromagnetic devices requires the calculation of magnetic The first method contains Maxwell stress. The second is
field. based on the virtual work principle. They are so-called
Maxwell Forces and Virtual work force. They are universal
and can be used to compute the total force on either
3.1 Electromagnetic Field Evaluations ferromagnetic or current-carrying objects. In this paper,
Maxwell forces and Virtual work are discussed.
Magnetic field intensity, magnetic flux density, magnetic
forces and current densities are included in the basic On ferromagnetic regions, the Maxwell stress tensor is
magnetic analysis results [15]. The evaluations have used to determine forces. It provides a convenient way of
somewhat different between these types for magnetic scalar computing forces acting on bodies by evaluating a surface
and vector formulations. First, the scalar magnetic potential integral. This force calculation on surfaces of air material
results are be discussed. From the Electromagnetic Field elements which have a nonzero face loading specified is
Fundamentals, the quasistic laws are obtained from performed [16]. In the following numerically integrated
Maxwell’s equations by neglecting either the magnetic surface integral for the 2-D application, this method uses
induction or the electric displacement current. This reduces extrapolated field values and results:
Maxwell’s equations to:
1  T11 T12   n1 
Magnetquasistatic:
FMX =
µ0 ∫ T
S
21 T22   n 2 
ds (6)

452
where µ0 is the permeability of free space, only one solution for a given simulated excitation. The
1 2 1 2
Virtual work and the Maxwell stress methods give the total
T11 = B 2x − B , T12 = B x B y , T21 = B x B y , T22 = B 2y − B , force acting on a closed surface through an air region.
2 2
n1 is a component of unit normal in x-direction and n 2 is a By using auxiliary software, simulated results are
component of unit normal in y-direction. The 2-D case cans presented in Fig. 2 and Fig. 3. Fig. 2(a) shows the flux lines
extension to 3-D application. between the e-type electromagnet and the keeper. Magnetic
field intensity is shown in Fig. 2(b). Fig. 3 shows the virtual
The virtual work principle is used to calculate work force and Maxwell stress tensor force. Their ordinates
Electromagnetic nodal forces. These are two formulations are magnetic force and cross axle are the distances. Table 3 is
currently used to calculate force. One is the element shape the parameters that were used in the simulation process. From
method that is used to calculate magnetic forces. The other is the simulation results, the relationship between magnetic
the nodal perturbations method that is to calculate force and distance can be understood. Also, the force of the e-
electromagnetic forces. First, the element shape method is type electromagnet can attract each other.
been discussed. The virtual work method is used to calculate
Magnetic forces that are obtained as the derivative of the
energy versus the displacement of the movable part. This
calculation is valid for a layer of air elements surrounding a
movable part [17]. To determine the total force acting on the
body, the forces in the air layer surrounding it can be
summed. The basic equation for an approximate force of an
air material element in the s direction is:

Fs = ∫ BT
Ve
∂H
∂s
dv+ ∫
V e
(∫0
H
B T dH ) ∂∂sdv (7)
Fig. 1. (a) E-type electromagnet CAD (b) E-type electromagnet 2D section
with a keeper

where Fs is force in element in the s direction, ∂H ∂s is


derivative of the magnetic field intensity with respect to
displacements, s measures the virtual translation of the
movable part along a given direction and Ve is a volume of
the element. Second, the nodal perturbation method is been
discussed. Electromagnetic forces are calculated as the
derivatives of the total element coenergy (sum of electrostatic (a) (b)
and magnetic coenergies) with respect to the element nodal Fig.2. (a) 2D flux lines (b) 2D magnetic field intensity
coordinates [18]:
1 ∂ 
Fx =
i
2 ∂xi  ∫V
( d T E + B T H ) dv 
e
(8)

where: Fx is the x-component (y- or z-) of electromagnetic


i

force calculated in node i , xi is the nodal coordinate (x-, y-,


or z-coordinate of node i),v is the volume of the element and
d is the nodal perturbation distance. Nodal electromagnetic
forces are calculated for each node in each element. In an
assembled model the nodal forces are added up from all
adjacent to the node elements.
Fig. 3. Virtual work force and Maxwell stress tensor force of e-type
4. MAGNET MODELING AND SIMULATION electromagnet
RESULTS Table 3.Simulation Parameters

In this paper, 2D model was designed to investigate the E-type core µr 1000
accuracy and nature of the solutions for force prediction. The Keeper µr 1000
e-type electromagnet CAD model is shown in Fig. 1(a). The
Coil µr 1
dimensions of the e-type electromagnet and a keeper are
represented in Fig. 1(b). Because the e-type electromagnet Turns 727
and a keeper are axial symmetric, two regions surrounding Air µr 1
the half e-type electromagnet are used to graduate the mesh
Excitation DC current applied to the coil 1.5 AMP
and ensure that the meshing of the fringing flux is modelled
well. For a 2D model, two simple and effective methods of Model Axial symmetric
calculating forces are the Virtual work and the Maxwell
stress methods [19-22]. These methods have the advantage of

453
5. HARDWARE STRUCTURE

The hardware structure of the Octabot module must support the control of the actuators and a power system. The design objectives of the modular robot are to minimize the number of discrete components, the overall weight, and the power consumption while preserving the self-sufficiency and autonomy of the module.

In the proposed Octabot, each module has a chassis on which eight connecting planes are mounted, as shown in Fig. 4. The self-reconfiguring modular robot is a 2-D unit; it has a special actuation system to attach to and detach from other modules [23].

The e-type electromagnet, shown in Fig. 5(a), is used for the actuation of the modular robot. The main feature of this module is the simplicity of its actuation system. Since the e-type electromagnet is directly mounted on the module, no additional motion system is required. The e-type electromagnet supplies an attractive or repulsive force large enough to align and move the module. Each module has eight sides formed by eight electromagnets and is fully self-contained. Every module has an octagonal shape and is actuated by the mutual attraction of two modules.

Each module also has four steel ball rollers to reduce friction; a steel ball roller is shown in Fig. 5(b). Simplicity is also required in the circuit design. Fig. 6 shows the block diagram of the circuit system. An onboard micro-controller and the e-type electromagnet drivers must be packed into a limited space on the modular robot, so a micro-controller is adopted for this purpose. The program for the micro-controller is downloaded through a burner, which can be removed after downloading. The micro-controller generates the control signals for the e-type electromagnet drivers. A Darlington pair circuit, shown in Fig. 7, is used to drive the e-type electromagnet. In Fig. 7, I is the current passing through the e-type electromagnet, Vm is the control voltage, R is the resistance, and N1 and N2 are the transistors. When the resistance R is fixed, the characteristic curve of this circuit is obtained by changing the input voltage Vs and the control voltage Vin.

In this first prototype, we adopt a centralized control method. Each module is controlled by a host PC; the micro-controller decodes a command from the PC and generates the necessary control signals for the control circuits.

Fig. 4. The Octabot module.
Fig. 5. (a) Electromagnet. (b) A steel ball roller.
Fig. 6. Functional block diagram of a module.
Fig. 7. Darlington pair control circuit.

6. LOCOMOTION

The modular robot can rotate and move via its connection planes. Through the connections between modular robots, an arbitrary modular robot configuration can be realized. At least two modular robots are required to perform the simple motions of rotation and connection. The shape of the modular robot system is changed by means of their rotation and connection, as shown in Fig. 8. As illustrated in Fig. 8, the eight electromagnets mounted on the Octabot module produce two kinds of polarity (N or S poles), rather than using permanent magnets. The magnetic force between modular robots creates a torque that pivots the two Octabots about the connected edge and onto the next edge. The modular robots reach a firm connection through the mutual attraction of the electromagnets. Through the magnetic force between modules, the attraction force allows the robots to connect, disconnect, and rotate in order to change shape and locomote. The basic motion requires at least two modular robots. Because friction is neglected, the center of gravity of the two modular robots does not move. The modular robot system can achieve the goal of reconfiguration through motion; this goal requires the interaction of more than two Octabots.

Three types of Octabots are defined in the reconfiguration procedure. The mover Octabot moves around a pivot Octabot with respect to the rest of the configuration [13]. The other robots connected to the pivot Octabot, the fixers, keep the pivot Octabot fixed during reconfiguration as the mover moves around it.

In the basic reconfiguration procedure, the pivot Octabot and all connected neighbors except the mover Octabot actuate their e-type electromagnets with a connecting force (the blue regions in Fig. 9(a)). Then, the mover and the pivot drive the facing electromagnets to attract each other (the red regions in Fig. 9(b)). From these basic reconfiguration procedures, the Octabot system can change shape to achieve self-reconfiguration.
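To make the sequencing above concrete, the following is a minimal sketch (hypothetical class and method names, not the authors' firmware) of one basic reconfiguration step: the pivot and the fixers keep their connections energized while the mover and the pivot attract each other across the facing electromagnets.

```python
class Octabot:
    """Toy stand-in for one module; a real module drives eight electromagnets."""
    def __init__(self, name):
        self.name = name
        self.faces = {}                               # face index -> polarity ('N', 'S', or None)

    def energize(self, face, polarity):
        self.faces[face] = polarity
        print(f"{self.name}: face {face} -> {polarity}")

def reconfiguration_step(pivot, mover, hold_faces, pivot_face, mover_face):
    # 1) Pivot and fixers hold their existing connections (blue regions, Fig. 9(a)).
    for robot, face in hold_faces:
        robot.energize(face, 'N')
    # 2) Mover and pivot drive the facing coils with opposite poles so that the
    #    magnetic force pivots the mover onto the next edge (red regions, Fig. 9(b)).
    pivot.energize(pivot_face, 'N')
    mover.energize(mover_face, 'S')

m1, m2, m3 = Octabot("M1"), Octabot("M2"), Octabot("M3")
reconfiguration_step(pivot=m2, mover=m1, hold_faces=[(m2, 4), (m3, 0)],
                     pivot_face=1, mover_face=5)
```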
Fig. 8. Concept of rotation and connection: (a) 3D diagram; (b) 2D diagram.
Fig. 9. Reconfiguration procedure.

7. EXPERIMENTS

We built three modules based on the design explained in the previous sections to test the actuators and the module design of the Octabot. Their bodies have the same structure that an Octabot configured as a line would have.

Fig. 10. Experimental setup.

The electromagnet pairs were used to attract each other during the reconfiguration procedure. The hardware specification is given in Table 4.

Table 4. The hardware specification
Size: 135 mm (diameter of the robot)
Weight: 1.5 kg
Electromagnets: 8
Driver circuits: 8
Processor: Basic Stamp II
Power supply: DC 40 V

Fig. 10 shows the experimental setup of these Octabots. The Octabots are free-standing, controlled from a host PC, and powered by an external power supply. The eight electromagnets are actuators driven by eight driver circuits. The control software is composed of two programs. The first provides an interface to a library of functions, serial port and controller drivers, synchronization routines, and a motion scheduler. The second is used to specify the motions, their on/off states, and their durations.

A glass surface is used to reduce the frictional force, and the weights of the three modules are nearly equal. According to the command sequence from the host PC, the modules change their relative connections. Fig. 11 demonstrates the self-reconfiguration process of the actual Octabot system. In this experiment, three modules M1, M2 and M3 are initially arranged in a line, as shown in Fig. 11(a), and are connected to each other. In Fig. 11(b), M2 and M3 remain connected while M1 and M2 attract each other through the magnetic force, and M1 revolves around M2 clockwise. In Fig. 11(c), M1 and M2 are connected while M2 and M3 attract each other through the magnetic force, and M3 revolves around M2 clockwise. Finally, the shape of the robot system returns to a line.

Fig. 12 demonstrates the reconfiguration from a line shape to a triangle shape. In this experiment, the three modules M1, M2 and M3 again start as a line, as shown in Fig. 12(a), and are connected to each other. In Fig. 12(b), M2 and M3 remain connected while M1 and M2 attract each other through the magnetic force, and M1 revolves around M2 counterclockwise. In Fig. 12(c), M1 and M2 are connected while M2 and M3 attract each other through the magnetic force, and M3 revolves around M2 clockwise. Finally, the robot system forms a triangle. These experiments demonstrate repeated rotational motions and show that a single type of component is sufficient for the robots to achieve both movement and connection.
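As an illustration of the centralized control flow described above, here is a minimal sketch (hypothetical command format and mapping, not the authors' software): the host PC sends a short command string and the module decodes it into drive settings for one electromagnet stage.

```python
# Hypothetical single-line command format: "<module>:<face>:<polarity>", e.g. "M1:3:N".
def decode_command(line):
    module, face, polarity = line.strip().split(":")
    if polarity not in ("N", "S", "OFF"):
        raise ValueError(f"unknown polarity {polarity!r}")
    return {"module": module, "face": int(face), "polarity": polarity}

def control_signals(cmd):
    """Map a decoded command to the two inputs of one Darlington driver stage."""
    drive_a = cmd["polarity"] == "N"   # energize the coil in one direction
    drive_b = cmd["polarity"] == "S"   # or in the opposite direction
    return {"face": cmd["face"], "drive_a": drive_a, "drive_b": drive_b}

print(control_signals(decode_command("M1:3:N")))
```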
8. CONCLUSIONS

In this paper we described the detailed design of a new two-dimensional self-reconfigurable robotic system. The magnetic force of the e-type electromagnet was described. To verify the feasibility of the proposed design, experiments with three real physical modular robots are particularly important. The simple structure and reliable operation of the modules enable us to construct a 2-D self-reconfigurable system on a large scale. We have examined the basic design concept and verified reliable self-reconfiguration. The Octabots are a successful application of a self-reconfiguration robot system.

Fig. 11. The experimental result 1: line to line.
Fig. 12. The experimental result 2: line to triangle.

REFERENCES
[1] Østergaard, E. H., K. Kassow, R. Beck and H. H. Lund, "Design of the ATRON Lattice-based Self-reconfigurable Robot," Auton. Robots, vol. 21, pp. 165-183, 2006.
[2] Akiya, K., H. Kurokawa, E. Yoshida, S. Murata, K. Tomita and S. Kokaji, "Automatic Locomotion Design and Experiments for a Modular Robotic System," IEEE/ASME Trans. Mechatronics, vol. 10, pp. 314-325, 2005.
[3] Fukuda, T., M. Buss, H. Hosokai and Y. Kawauchi, "Cell structured robotic system CEBOT (Control, planning and communication methods)," Intelligent Autonomous Systems, vol. 2, pp. 661-671, 1989.
[4] Fukuda, T., S. Nakagawa, Y. Kawauchi and M. Buss, "Structure decision method for self organising robots based on cell structures CEBOT," IEEE International Conference on Robotics and Automation (ICRA), Scottsdale, AZ, USA, vol. 2, pp. 695-700, 1989.
[5] Fukuda, T., Y. Kawauchi and H. Asama, "Analysis and evaluation of cellular robotics (CEBOT) as a distributed intelligent system by communication information amount," Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 827-834, 1990.
[6] Fukuda, T., Y. Kawauchi and H. Asama, "Analysis and evaluation of cellular robotics (CEBOT) as a distributed intelligent system by communication information amount," Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 827-834, 1990.
[7] Murata, S., H. Kurokawa and S. Kokaji, "Self-assembling machine," Proceedings of IEEE Int. Conf. on Robotics & Automation (ICRA'94), San Diego, California, USA, pp. 441-448, 1994.
[8] Tokashiki, H., H. Amagai, S. Endo, K. Yamada and J. Kelly, "Development of a transformable mobile robot composed of homogeneous gear-type units," Proceedings of the 2003 IEEE/RSJ Intl. Conference on Intelligent Robots and Systems (IROS), pp. 1602-1607, 2003.
[9] Chirikjian, G., "Kinematics of a metamorphic robotic system," Proceedings of IEEE Intl. Conf. on Robotics and Automation, pp. 449-455, 1994.
[10] Rus, D. and M. Vona, "Crystalline Robots: Self-reconfiguration with compressible unit modules," Autonomous Robots, vol. 10(1), pp. 107-124, 2001.
[11] Yoshida, E., S. Murata, S. Kokaji, K. Tomita and H. Kurokawa, "Micro self-reconfigurable robotic system using shape memory alloy," Distributed Autonomous Robotic Systems 4, Knoxville, USA, pp. 145-154, 2000.
[12] Inou, N., K. Minami and M. Koseki, "Group robots forming a mechanical structure—Development of slide motion mechanism and estimation of energy consumption of the structural formation," Proceedings of IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), 2003.
[13] Kirby, B., B. Aksak, J. Hoburg, T. Mowry and P. Pillai, "A Modular Robotic System Using Magnetic Force Effectors," Proceedings of the 2007 IEEE/RSJ Intl. Conference on Intelligent Robots and Systems (IROS), 2007.
[14] Hosokawa, K., T. Tsujimori, T. Fujii, H. Kaetsu, H. Asama, Y. Kuroda and I. Endo, "Self-organizing collective robots with morphogenesis in a vertical plane," IEEE International Conference on Robotics and Automation (ICRA), Leuven, Belgium, pp. 2858-2863, 1998.
[15] Haus, H. A. and J. R. Melcher, Electromagnetic Fields and Energy, Prentice-Hall, Englewood Cliffs, New Jersey, 1996.
[16] Moon, F. C., Magneto-Solid Mechanics, John Wiley and Sons, New York, 1984.
[17] Coulomb, J. L. and G. Meunier, "Finite Element Implementation of Virtual Work Principle for Magnetic or Electric Force and Torque Calculation," IEEE Transactions on Magnetics, vol. MAG-20, no. 5, pp. 1894-1896, 1984.
[18] Gyimesi, M., I. Avdeev and D. Ostergaard, "Finite Element Simulation of Micro Electro Mechanical Systems (MEMS) by Strongly Coupled Electro Mechanical Transducers," IEEE Transactions on Magnetics, vol. 40, no. 2, pp. 557-560, 2004.
[19] Edwards, J. D. and E. M. Freeman, MagNet 5.1 User Guide, Infolytica Corporation, 1995.
[20] Carpenter, C. J., "Surface integral methods of calculating forces on magnetized iron parts," Proc. IEE, 107C, pp. 19-28, 1960.
[21] Edwards, J. D., Electrical Machines and Drives, Macmillan, 1992.
[22] ANSYS 9.0A1, Electromagnetic Field Analysis Guide, 001247, 4th Edition, SAS IP, Inc., 1999.
[23] Shiu, M. C., H. T. Lee, F. L. Lian and L. C. Fu, "Magnetic Force Analysis for the Actuation Design of 2D Rotational Modular Robots," 33rd Annual Conference of the IEEE Industrial Electronics Society, Taipei, Taiwan, pp. 2236-2241, Nov. 5-8, 2007.
Video Summarization Based on Semantic Feature Analysis and User Preference

Wen-Nung Lie and Kuo-Chiang Hsu
Department of Electrical Engineering, National Chung Cheng University
160, San-Hsing, Ming-Hsiung, Chia-Yi, 621, Taiwan, ROC.
wnlie@ee.ccu.edu.tw
Abstract

A personalized video summarization system is proposed in this paper. "Personalized" means that each video can be summarized according to the viewer's own preference. To meet the user's preference, semantic features of each frame are detected so that their relevance to the user preference can be determined. Users are also able to set a time or frame-number constraint for the video summary via a friendly interface. To summarize a video efficiently and effectively, a constrained optimization problem (subject to the time constraint and video smoothness) is formulated and solved to determine the non-uniform sampling rates for the shots relevant to the user preferences. Subjective tests show that the quality of the summarized video has a MOS of about 4.0, and the comprehension of the video contents is good.

1. Introduction

Multimedia contents are increasingly generated and spread quickly, and how to manage such a huge amount of multimedia content has become an urgent problem. Because of these requirements, MPEG-7 and MPEG-21 have been proposed as possible solutions for digital content management.

Though multimedia management technology helps people search the contents efficiently, we still confront the problem of how to let users grasp the contents they need quickly. This motivates the development of summarization techniques.

The goal of a video summarization system is to assist people in understanding the content of a video efficiently. At present, some systems select key frames or shots according to features such as image motion, human faces or close-up appearance, and then present them to the viewers via a story board or playback of shots. The traditional features used to select key frames or shots are usually low-level and non-semantic, so the frames or shots thus extracted often cannot match human perception accurately. Additionally, to satisfy most people's needs, users' preferences, which often vary case by case, should also be considered when summarizing the videos.

Conventionally, methods of video summarization can be categorized as follows:
(1) evaluating the importance of a frame according to low-level features, e.g., motions [1, 2]. This kind of method is suitable for most videos since heavy-motion frames often attract the focus.
(2) combining low-level and high-level features to evaluate the importance of a frame or a shot [2, 3].
(3) selecting shots according to the results of shot classification and event detection. This kind of method is often found in the analysis of highlights in sports video (e.g., for soccer games [4]).
(4) determining the importance of a frame based on user preferences [5, 6], which are specified via a querying interface. For this kind of method, semantic event analysis is usually required since low-level features are not sufficient to identify the frames or shots that match the user preferences.

Generally, the presentation of key frames or important shots is vital to a video summary system. The possible ways can be classified into: 1) a digest of key frames [1, 5], 2) a digest of selected shots [4, 5, 6], and 3) a video digest with non-linearly sampled frames [2, 3] (e.g., shots of higher importance are played at a higher sampling rate).

Regardless of the presentation style, viewing comfort and meeting the time constraint also play important roles in the success of the presentation.

It is thus the purpose of this research to establish an efficient video summarization system that creates the summarized video based on automatic extraction of semantic features (e.g., humans, moving objects, explosion, indoor, outdoor, etc.) and specification of personal preferences. Each shot, the basic unit in our summarized video presentation, is assigned semantic tags. Non-linear sampling rates of shots (even 0, which indicates discarding during display) are then optimized so that the shots most relevant to the personal preferences are selected while meeting the time constraint. In this way, the same video is summarized on-line in a different way for different users, and our system is truly "personalized".
2. System framework

Fig. 1 illustrates the framework of the proposed system. It is designed to extract semantic features and construct metadata for each video in an off-line manner, and then to summarize videos according to the user preferences and time constraint entered on-line. First of all, tools for shot segmentation must be available; shot segmentation has been discussed in depth elsewhere and is thus beyond the scope of this paper.

Fig. 1. The framework of the proposed video summarization system.

Table 1 shows the interface for selecting user preferences. The time constraint is specified in terms of the number of frames to be displayed. On the other hand, we allow the users to specify a factor ρ which places a constraint on smoothness when determining the non-uniform sampling rates for the selected shots. A higher value of ρ means more video smoothness.

Table 1. Interface for user preference selection
Humans: number: □ one □ two; □ speaking; □ close-up
Specific event: □ moving object; □ explosion; □ zoom in
Background: □ indoor; □ outdoor
Time constraint: ___ frames
Video playing smoothness factor (ρ): 0 ~ 3

3. Semantic feature analysis

Though the users specify their preferences via semantic features, our system detects semantic features from each segmented shot by processing low-level features and fusing them.

The designed features in our system include zoom detection, caption detection, explosion detection, face detection, moving object detection, and background classification, which when fused constitute the semantics listed in Table 1. For example, the number of humans in a shot and the identification of close-up shots can be derived via face and/or zoom-in detection; shots containing speaking persons can be identified via face detection and caption detection; and indoor or outdoor scenes can be identified via background classification.

Below we outline the techniques for detecting the six features mentioned above. Interested readers are referred to [14].

3.1 Zoom detection

Detecting a camera zoom-in operation [2] is achieved by inspecting motion features in each frame. Nine fixed zooming-center candidates ZC_i (Fig. 2) are designated, and a membership E_i is calculated for each candidate ZC_i:

E_i = \sum_{x=1}^{m} \sum_{y=1}^{n} \cos(\theta_i^{(x,y)}),   (1)

where (x, y) is the coordinate of each macro-block (MB) and \theta_i^{(x,y)} is the angle between the MB-center vector and the motion vector (MV) at that MB. We then choose the candidate having the maximum value of E that is also larger than a threshold. No zoom-in operation is detected if the largest E does not exceed the threshold.

Fig. 2. Zooming center candidates and the θ definition.
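A minimal sketch of the membership computation of Eq. (1) (illustration only; the grid size, candidate positions and threshold below are assumptions, not values from the paper):

```python
import numpy as np

def zoom_membership(mv, centers):
    """mv: (H, W, 2) macro-block motion vectors; centers: list of (cx, cy) candidates.
    Returns E_i of Eq. (1) for each zooming-center candidate."""
    h, w, _ = mv.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    scores = []
    for cx, cy in centers:
        dx, dy = xs - cx, ys - cy                          # MB-center vectors
        dot = dx * mv[..., 0] + dy * mv[..., 1]
        norm = np.hypot(dx, dy) * np.hypot(mv[..., 0], mv[..., 1]) + 1e-9
        scores.append(float(np.sum(dot / norm)))           # sum of cos(theta)
    return scores

def detect_zoom_in(mv, centers, threshold):
    scores = zoom_membership(mv, centers)
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else None      # None: no zoom-in detected

# toy usage: radial outward motion field centred near the middle of a 9x11 MB grid
ys, xs = np.mgrid[0:9, 0:11].astype(float)
mv = np.dstack([xs - 5.0, ys - 4.0])
print(detect_zoom_in(mv, centers=[(5.0, 4.0), (1.0, 1.0)], threshold=10.0))
```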
3.2 Caption detection

Though the status of human speaking can be detected by audio signal analysis (speech detection), we tried a different way, via face and caption detection. To ease caption detection, our system uses prior information about the rough positions and sizes of captions in the display. The procedures of image processing include: 1)
edge map extraction by using Sobel operator and After the above filtering process on D(x,y), we
binarization process, 2) edge histogramming of the compute MapOutput-D as follows:
left/right half images by horizontal projection, 3) ⎧1, if _ D( x, y ) = 4
deriving the possible caption regions by analyzing MapOutput − D ( x, y ) = ⎨ (3)
these two histograms, 4) if the proportion of the ⎩ 0, otherwise
number of edge points in the candidate region is higher
than a threshold, the caption is then identified. Step 3: Luminance regulation [7, 8]
This process is intended to remove smooth back-
ground areas (in terms of local graylevel variance) that
3.3 Explosion detection
are falsely classified as skin-color areas by assuming
The scene of explosion is often the focus of
that the human face will not be so smooth in graylevels.
attention in watching movie videos. For video clips of
explosion, the intensity is substantially raised and the
Step 4: Connected component labeling
number of flame-color pixels increases. Therefore, we
This is a popular procedure to label connected
detect the explosion frames by analyzing intensities
regions and calculate their region sizes. Regions of
and flame colors in each frame. This is achieved by
small size will be discarded. On the other hand,
making statistics (or, training) of flame colors in
regions of too large size will be undergone a re-check
advance. Generally, the flame colors are distributed
on edge response to possibly split the regions.
around golden (237.5 < R < 246.5) ∩ (245 < G < 255)
∩ (182 < B <212) and reddish orange (234.5 < R <
Step 5: Shape parameter estimation
243.5) ∩ (177.5 < G < 255) ∩ (64 < B < 124).
We evaluate the shape of each labeled area by
On detection, the average luminance intensity L(i)
fitting it to an elliptic model. Model parameters [10]
and the histogram H (i ) of flame colors are made for
include: the long axis (a), the short axis (b), orientation
each frame i. Identify the interval from a to b, between of a (θ), and the center of mass (xc, yc).
which the mean and variance of H (i ) and the mean of
L(i) are larger than some determined thresholds. Step 6: Region growing
This step is designed to compensate the darker
3.4 Human face detection part of the human face that is influenced by bad
Our method of face detection [7-8] is based on lighting. We use the labeled area as seeds for region
skin-color detection, followed by region analysis. growing. The growing condition is based on the
Step 1: Skin-color detection similarity of (Cr,Cb) and the examination of ellipse
We combine two detection algorithms based on shape. Unlabeled pixels neighboring to the grew
YCbCr and RGB color spaces. In the YCbCr color region are checked and selected with a cost function
space, the range of skin colors is Cr ∈ [133, 173], based on (Cr,Cb) similarity. After growth, the (Cr,Cb)
Cb ∈ [77, 127]. In the RGB color space, two values and the shape parameters are re-computed.
conditions proposed in [9] are adopted. After skin- Next, compute the mismatch value ε[7] of the
color testing, a binary image can be obtained, denoted estimated shape, as in Eq.(4).
as Mapskin-color(x,y), x=0,1,…,IH-1, y=0,1,…,IW-1, where
IH and IW are image height and width, respectively. ∑ (1 − b( x, y)) + ( ∑) b(x, y )
( x , y )∈E
ε= x , y ∈L − E
(4)
Step 2: Density regulation [7,8] ∑1
( x , y )∈E
In this step, the dilation and erosion operations
based on a density map are applied to eliminate noises ⎧1 , if ( x, y ) ∈ L
where b ( x, y ) = ⎨ .
and get smooth areas. The density map D(x,y) is ⎩0 , otherwise
calculated from the skin-color map Mapskin-color by Eq.(4) represents the sum of the number of “noise”
1 1

D( x, y ) = ∑∑ Map skin −color (2 x + i,2 y + j ) (2) points of a grew region L that are outside the ellipse
i =0 j =0 and the number of “hole” pixels of the ellipse E that do
where D is of dimensions IH/2×IW/2. Obviously, D’s not belong to L (see Fig.3) .
value is from 0 to 4. The density map is processed with Region growing is stopped by examining the
erosion and dilation operations, which modify D(x,y) relationship between current and previous ε. If ε is
by sliding a 3×3 window and checking the numbers of decreased and the estimated elliptic shape is reasonable
D=4, 0<D<4, and D=0 pixels within the window. (e.g., 0.3 < b / a < 0.95 and θ > 25o ), then return to
recalculate the elliptic parameters for next iteration;

488
otherwise, the region growing process is terminated.

Fig. 3. Examination of ellipse shape.

Step 7: Shape evaluation
In this step, we evaluate each labeled area to see whether it is a human face. The following criteria should be satisfied: ε < 0.38, 0.3 < b/a < 0.95, and θ > 25°. We also consider the consistency (in ellipse center and size) and the duration of the detected faces in the time domain for robustness. Fig. 4 demonstrates the results of face detection.

Fig. 4. (left) Original, (middle) after skin-color detection, (right) final results.

3.5 Moving object detection
For moving object detection, the camera motion is first estimated from the MVs of the decoded MPEG frames. Object motion can then be detected after compensating for the camera motion. After estimating the object motions in each frame, the existence of moving objects can be determined and their durations can be derived. The starting and ending frame indices of the detected moving objects are recorded to help the subsequent summarization.

3.6 Background classification
For our system, each frame is classified into an indoor or outdoor scene. The two-stage classification algorithm in [11, 12] is adopted, with a modified training set in the 2nd-stage SVM classifier.
At the 1st stage, each frame is divided into 4x4 blocks, and each block is described with color (a 128-D histogram in HSI color space) and texture (a 20x1 vector representing the means and standard deviations of ten subbands of 3-level wavelet transform coefficients) features to train SVM_color and SVM_texture, respectively. At the second stage, we train the third classifier, SVM_indoor-outdoor, whose input features are the distances of the feature vector of each block to the SVM_color and SVM_texture hyperplanes.

4. Constrained optimization problem

A strategy of optimized non-uniform sampling is adopted to produce the summarized video under the time constraint. Fig. 5 illustrates the block diagram of our proposed summarization system. Our major tasks are to select the video shots which meet the user preference and to determine their sampling rates according to the shot relevance, the time constraint, and the video smoothness.

Fig. 5. Block diagram of the proposed summarization system.

In searching for the optimal subsampling rates, three factors must be considered: shot relevance to the user preference, video smoothness, and shot distortion. Shot distortion results from video subsampling and may cause viewer discomfort.

4.1 Shot relevance

Relevance is identified by determining the presence of certain semantic features within a frame. We also associate a "confidence factor" with each shot that is relevant to a user preference. The confidence represents the proportion of frames whose semantic features meet the user preference.

Let W_ji be the confidence of the j-th preference for the i-th shot and P_j be the probability of the j-th preference. Note that a single shot may be relevant to several preferences simultaneously. Eq. (5) shows the definition of shot relevance (SR) in our system:

SR_i = \sum_{j \in G} W_{ji} \log_2\left(\frac{1}{P_j}\right),   (5)

where G is the set of preferences selected by the user. Obviously, the confidence is weighted by the information quantity. Hence, SR in Eq. (5) is considered as the average preference in a shot. Shots having higher SR values should be played at higher sampling rates (i.e., the original speed).
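A minimal sketch of Eq. (5) (illustrative variable names; the confidences and preference probabilities below are made-up numbers):

```python
import math

def shot_relevance(confidence, preference_prob, selected):
    """Eq. (5): SR_i = sum over selected preferences j of W_ji * log2(1 / P_j)."""
    return sum(confidence[j] * math.log2(1.0 / preference_prob[j]) for j in selected)

# toy example: one shot, two selected preferences with assumed probabilities
W = {"explosion": 0.6, "close_up": 0.3}   # confidences W_ji for this shot
P = {"explosion": 0.1, "close_up": 0.4}   # preference probabilities P_j
print(shot_relevance(W, P, selected=["explosion", "close_up"]))
```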
4.2 Video smoothness

Since the strategy of non-linear (non-uniform) sampling is adopted to cope with the problem of approximating the time budget, we also consider the smoothness between successively played shots to prevent viewer discomfort. In addition, our system includes the shots which are adjacent to the highly relevant shots. This strategy helps to enhance the viewers' comprehension of the video contents.

4.3 Shot distortion

This measures the distortion of the content after frame sampling. The lost (non-sampled) frames are substituted by duplicating the previously sampled frame so that a frame-by-frame comparison can be performed [13]. We adopt the histogram differencing below to measure the distortion between adjacent frames:

HD_i(k) = \frac{1}{N_F} \sum_{j=1}^{N_F} \mathrm{diff}\big(H^i_j, H^i_{k \times \lfloor j/k \rfloor}\big),   (6)

\mathrm{diff}\big(H^i_a, H^i_b\big) = \frac{1}{2 \cdot I_H \cdot I_W} \sum_{m=1}^{128} \big| H^i_a[m] - H^i_b[m] \big|,   (7)

where i is the shot number, k is the sampling rate, j is the frame number, N_F is the total number of frames in a shot, I_H × I_W is the frame dimension, and H is the 128-D histogram in HSI color space, where the hue, saturation, and intensity are divided into 8, 4, and 4 bins, respectively. We also consider the influence of subsampling on the relevance to the user preference. Denote FP_j = [fp_1, fp_2, ..., fp_c]^T as the relevance vector for the j-th frame, where fp_x = 1 (relevant) or 0 (irrelevant). Hence, if the i-th shot is subsampled by k, its relevance distortion can be computed by:

RD_i(k) = \frac{1}{N_F} \sum_{j=1}^{N_F} \mathrm{sim}\big(FP^i_j, FP^i_{k \times \lfloor j/k \rfloor}\big),   (8)

where sim(A, B) = A^T · B. In this paper, we evaluate both the histogram and relevance distortions after subsampling, so the shot distortion is defined as:

D_i(k) = \alpha \times HD_i(k) + (1 - \alpha) \times RD_i(k),   (9)

where 0 ≤ α ≤ 1 is a constant. Obviously, 0 < D_i(k) < 1.
4.4 Optimal video summarization

A content cost function considering the three factors mentioned above is defined as:

C_C(k_1, k_2, ..., k_{N_S}) = \sum_{i=1}^{N_S} SR_i \times D_i(k_i),   (10)

where i is the shot number and k_i is the allocated subsampling rate for the i-th shot. Basically, to conform to the MPEG format, k_i is an integer. If k_i = ∞, the i-th shot is not played and is skipped.

We also consider a temporal cost function to evaluate the resulting video smoothness:

C_T(k_1, k_2, ..., k_{N_S}) = \sum_{i=2}^{N_S} TCF\big(|k_{i-1} - k_i|\big),   (11)

where TCF(x) = 1.25^x − κ_z and κ_z is a constant. Considering shot relevance, video smoothness, and shot distortion together, the whole cost function C_V can be defined as follows (γ is a constant):

C_V\{k\} = C_C\{k\} + \gamma \cdot C_T\{k\}.   (12)

Hence, the video summarization problem can be described as a constrained optimization problem:

\min_{K=\{k_i \mid i=1 \sim N_S\}} \left( \sum_{i=1}^{N_S} \big( SR_i \times D_i(k_i) + \gamma \times TCF(|k_{i-1} - k_i|) \big) \right)   (13)

subject to \sum_{i=1}^{N_S} \frac{N_i}{k_i} = T,   (14)

where i is the shot number, N_i is the total number of frames in the i-th shot, and T is the frame constraint (related to the time constraint). The optimal K* = {k_i* | i = 1, ..., N_S} can be solved via the Lagrangian multiplier method [14]. Note that the constraint that k_i* be an integer makes Eq. (14) hard to satisfy exactly; hence, a tolerance of 1% error is allowed to reach convergence in our iterative process.
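The paper solves Eqs. (13)-(14) with a Lagrangian multiplier iteration; the following is only a simplified brute-force illustration of the same selection problem for a handful of shots (the allowed rates, toy distortions, γ, κ and the tolerance are assumptions, and skipped shots are simply excluded from the temporal term):

```python
from itertools import product

INF = float('inf')
RATES = [1, 2, 4, 8, 12, INF]           # allowed subsampling rates (INF = shot skipped)

def tcf(x, kappa=1.0):
    return 1.25 ** x - kappa             # temporal cost function of Eq. (11)

def summarize(sr, dist, n_frames, T, gamma=0.1, tol=0.01):
    """Exhaustively pick one rate per shot, minimizing Eq. (13) subject to Eq. (14)."""
    best, best_cost = None, INF
    for k in product(RATES, repeat=len(sr)):
        frames = sum(0 if r == INF else n / r for n, r in zip(n_frames, k))
        if abs(frames - T) > tol * T:                    # Eq. (14) within 1% tolerance
            continue
        cost = sum(s * d[r] for s, d, r in zip(sr, dist, k))          # content cost
        cost += gamma * sum(tcf(abs(a - b)) for a, b in zip(k[:-1], k[1:])
                            if a != INF and b != INF)                 # temporal cost
        if cost < best_cost:
            best, best_cost = k, cost
    return best, best_cost

# toy data: 3 shots, distortion value assumed per allowed rate
dist = [{r: (0.0 if r == 1 else 0.9 if r == INF else 0.1 * r) for r in RATES}] * 3
print(summarize(sr=[2.0, 0.5, 1.0], dist=dist, n_frames=[120, 240, 120], T=150))
```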
5. Experimental results and discussions

In order to evaluate the proposed video summarization system, we collected 5 movies and 3 dramas for the experiments. The videos are MPEG-4 encoded with the following parameters: GOP_size = 12 and an IBPBPBPBPBPB structure. In the experiments, the valid k_i*'s include only 1, 2, 4, 8, 12 and ∞ (∞: skipped shot). For an encoding structure of "IBBPBBPBBP", k_i* would belong to {1, 3, 6, 12, ∞}.

To evaluate the quality of the summarized video, subjective tests were performed. We adopted the MOS (Mean Opinion Score) test for scoring. There are 5 testees, and each movie is tested two times by each testee (hence each movie is tested 10 times). According to the results in Table 2, the average score is near 4.0, implying a satisfactory summary. For the accuracy of the summarized interval, each number in parentheses denotes the number of tests that lead to the indicated accuracy. It is seen that the error is about 1% for movies 1 and 4, while the errors for movies 2, 3 and 5 are higher, but still below 8%. One reason for the larger errors is that the total number of frames that meet the user preferences is less than the time constraint (e.g., the -7.2% and -39.8% cases). The other reason is that k_i* is constrained to be an integer (e.g., the +6.6% case).

Table 2. MOS for subjective tests and accuracy of the summarized time interval.
Movie 1: average MOS 3.8; precision in time interval +0.8% (8), -0.6% (2)
Movie 2: average MOS 3.9; precision in time interval +6.6% (9), -0.9% (1)
Movie 3: average MOS 4.16; precision in time interval +3.3% (8), -0.6% (1), -7.2% (1)
Movie 4: average MOS 3; precision in time interval +1.3% (10)
Movie 5: average MOS 3.8; precision in time interval +2.9% (9), -39.8% (1)

Experiments were also conducted with different ρ's for movie 1. For ρ = 0, only the shots of high relevance are played. For ρ = 1, shots adjacent to those of high relevance are also considered in the summarized video; the advantage is that viewers are able to capture more of the video content (e.g., see the evildoer that causes the explosion). With ρ = 2 or 3, more comprehension of the video content can be achieved (due to the inclusion of more shots).

6. Conclusion

A personalized video summarization system is proposed in this paper; that is, the video can be summarized according to each viewer's preference. To fit the user's preference, semantic features of each frame are extracted so that its relevance to the user preference can be determined. Users are also able to set a time or frame-number constraint for the video summary via a friendly interface.

To summarize the video efficiently and effectively, a constrained optimization problem (subject to the time constraint and video smoothness) is solved to determine the non-uniform sampling rates for the shots relevant to the user preferences. Subjective tests show that the detection rate of the semantic features is more than 95%, the quality of the summarized video gains a MOS of about 4.0, and the comprehension of the video contents is good enough.

7. References

[1] Yu-Fei Ma and Hong-Jiang Zhang, "A model of motion attention for video skimming," Proc. of IEEE Int'l Conf. on Image Process., Vol. 1, pp. I-129 - I-132, 2002.
[2] Wen-Nung Lie and Chun-Ming Lai, "News Video Summarization Based On Spatial And Motion Feature Analysis," Proc. of Pacific-Rim Conf. on Multimedia (PCM'04), Tokyo, Japan, Nov. 2004.
[3] Kadir A. Peker and Ajay Divakaran, "An extended framework for adaptive playback-based video summarization," Proc. of SPIE, Vol. 5242, Internet Multimedia Management Systems IV, pp. 26-33, 2003.
[4] Ahmet Ekin, A. Murat Tekalp and Rajiv Mehrotra, "Automatic soccer video analysis and summarization," IEEE Trans. on Image Process., Vol. 12, No. 7, pp. 796-807, 2003.
[5] Pedro Miguel Fonseca and Fernando Pereira, "Automatic video summarization based on MPEG-7 descriptions," Signal Processing: Image Comm., Vol. 19, No. 8, pp. 685-699, Sept. 2004.
[6] Belle L. Tseng, Ching-Yung Lin and John R. Smith, "Using MPEG-7 and MPEG-21 for personalizing video," IEEE Multimedia, Vol. 11, No. 1, pp. 42-52, Jan.-Mar. 2004.
[7] Wei Fang and Li Pengfei, "A novel face segmentation algorithm," Proc. of Int'l Conf. on Info-tech and Info-net (ICII 2001), Vol. 3, pp. 550-556, Nov. 2001.
[8] Douglas Chai and King N. Ngan, "Locating facial region of a head-and-shoulders color image," Proc. of IEEE Int'l Conf. on Automatic Face and Gesture Recognition, pp. 124-149, April 1998.
[9] P. Peer, J. Kovac and F. Solina, "Human skin colour clustering for face detection," EUROCON 2003 - Int'l Conf. on Computer as a Tool, Sep. 2003.
[10] A. K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, 1989.
[11] Martin Szummer and Rosalind W. Picard, "Indoor-outdoor image classification," Proc. of IEEE Int'l Workshop on Content-Based Access of Image and Video Database, pp. 42-51, Jan. 1998.
[12] Navid Serrano, Andreas Savakis and Jiebo Luo, "A computationally efficient approach to indoor/outdoor scene classification," Proc. of Int'l Conf. on Pattern Recognition, Vol. 4, pp. 146-149, 2002.
[13] Zhu Li, Guido M. Schuster, Aggelos K. Katsaggelos and Bhavan Gandhi, "Rate-distortion optimal video summarization: a dynamic programming solution," Proc. of Int'l Conf. on Acoustics, Speech, and Signal Process. (ICASSP '04), Vol. 3, pp. III-457-60, 2004.
[14] Kuo-Chiang Hsu, MPEG-4 Video Summarization Based on Semantic Feature Analysis and User Preference, Master thesis, National Chung Cheng University, Chia-Yi, Taiwan, July 2005.
Fabrication of a microfluidic pump using a conducting polymer actuator

Jung Ho Kim, King Tong Lau, Dermot Diamond*
National Center for Sensor Research, Dublin City University, Dublin 9, Ireland
Corresponding author: dermot.diamond@dcu.ie, kim.lau@dcu.ie
known methods include piezoelectric


ABSTRACT [3], thermopneumatic [4], electrostatic
In this paper, we present a low-power [5] and electromagnetic actuation [6].
microfluidic pump based on polypyrrole- Besides the mechanically driven
actuator. This polymeric microfluidic reciprocating micropump, there are
pump is fabricated with polypyrrole
continuous flow type micro pumps
modified poly dimethylsiloxane (PDMS)
membrane encased into a based on electro-osmois (EO),
polymethylmethacrylate (PMMA) electrohydrodynamic (EHD), and
structure. The pumping action is induced magnetichydrodynamic (MHD). While
by electro-chemically actuating the PPy- there are many micropump designs
PDMS membrane while check valves at available for micro-nano scale liquid
the inlet and outlet control the flow delivery, they generally require high
direction in the respective pump phases. operation voltages and high running
This pump is self-priming. Accurate current and therefore are very power
control of the output flow rate can be hungry [7-12].
obtained by changing the actuation
frequency. With this pump, a maximum
The current work involves the
pumping rate of 52 µL/min was achieved
using an input power of 55mW operated at development of a micropump for small
±1.5 volt. volume fluid delivery. It is small in
Keywords: electrochemical actuator, size, low power and can be easily
microfluidic, micropump, polypyrrole. integrated into microfluidic devices for
e.g. lab-on–a-chip applications. This
design employs polypyrrole (PPy) as
the electrochemical actuator to produce
INTRODUCTION the mechanical movements required
Methods for manipulation of fluid in for the pumping action.
the micro scale are widely used in
areas ranging from chemistry, biology Working Principle
to materials science. For a microfluidic Figure 1 illustrates the typical
analytical system, the main function is configuration of a micropump which is
to provide a manifold of composed of a sealed pump chamber, a
interconnecting channels to control the diaphragm and check valves for
reagent/sample delivery and analyte controlling fluid inlet and outlet.
detection within a single integrated Reciprocating movement of the
platform. The basic building blocks diaphragm generates a two-mode pump
consist of channels in which the fluid cycle that results in a periodic volume
flows, valves which control the change that creates positive and
direction of flow, and one or more negative pressures in the pump
pumps that provide the driving force to chamber. During the prime-mode, a
move the fluid. Numerous microfluidic negative pressure is generated by the
devices such as microfluidic chips, upward movement of the diaphragm to
flow sensors, microvalves and close the outlet check valve and lift the
micropumps have been reported [1, 2]. inlet check valve to suck fluid into the
For driving the liquid movement, well pump chamber. During the following

pump-mode, the downward diaphragm EXPERIMENTAL
movement generates positive pressure
to force the inlet check valve to close Fabrication of pump body
and push the outlet check valve to open The pump body was fabricated using a
and excrete fluid into the outlet. Hence, computer aided micro milling machine.
the actuator diaphragm provides Two PMMA sheets were first cut into
driving force for the fluid movement the desired sizes and micro milled to
and the check valves at the inlet and form two microchannels in the bottom
outlet direct the flow to create a PMMA layer. In the upper layer a
continuous flow in one specific pump chamber (12mm x 12mm x
direction. 1mm) with 2 cavity holes (5mm
diameter, 2.8mm depth) for check
Actuator valves were fabricated. A pressure
Pump chamber
sensitive adhesive (PSA) tape was used
to bond together the top and bottom
layers to form the pump structure.
Inlet check valve Outlet check valve
opens at prime mode opens at pump mode
Fabrication of Check valves and
diaphragm
Figure 1: Structure of a typical
Rapid prototyping was used for the
reciprocating pump with passive check
fabrication of the check valves and
valves and diaphragm.
diaphragm. First, master molds for
various check valve designs were
The diaphragm employed was a
produced from PMMA using computer
conductive polymer membrane/silicon
aided micro milling directly operated
PDMS bilayer membrane which can be
from a CAD file. The check-valves and
electrochemically actuated to provide
diaphragm were molded using a PDMS
the mechanical movements. Inherently
mixture (Sylgard 184, Dow Corning).
conductive polymers such as
A 10:1 mixture of solution A and B
polypyrrole (PPy) undergo reversible
supplied was stirred thoroughly and
volume change during electrochemical
degassed under vacuum until the
redox reactions as a result of
mixture was completely clear and
insertion/removal of either dopant ions
bubble free. The prepared PDMS
or the solvated counterions that are
mixture was poured onto the PMMA
present in the immediate environment
mold and cured at 65 for 4 hours.
to maintain electrical balance [13].
When specific voltages are applied After cooling the products were taken
alternatively across the PPy membrane, out of the mould with great care so as
the membrane deforms to a convex and to prevent damage to the very small
concave form to drive the diaphragm to and thin PDMS devices. Two typical
perform continuous push and pull check valves are shown in Figure 2(b)
actions. This bending type actuator and (c). The side arms stretching out
produces large deformations, but the from the valve body are for fixing onto
generated pressure is relatively small the check valve holder.
compared to a conventional solenoid
pump. For this reason, custom PMMA holders Figure 2(a) for check
designed check valves were fabricated valves were fabricated using the same
that would operate effectively at low micro-milling method such that the
actuation pressures. PDMS valve could sit inside the holder
whereas the side arm could be fixed
onto the groove with silicon glue. The

holder was then placed into the valve solution.
cavity in the upper pump layer.

Integration of the Ppy-PDMS


diaphragm
(c) The most important part of the pump
component is the probably the actuator
diaphragm which is an integrated
PDMS/Ppy bilayer membrane. It was
(b) fabricated by physically attaching each
(a) of the four corners and the centre of the
PPy membrane onto the PDMS
Figure 2: PDMS check valves and holder. membrane using silicon glue such that
The arms of check valve were fixed onto the mechanical movement of the
the holder using silicon glue such that it is diaphragm was driven by the
always at the ‘closed’ position. Pressure electrochemical actuation of the Ppy
change in the pump chamber open the layer.
valve by stretching the side arms.
The Ppy actuator pump assembly
150µm thick PDMS membrane Figure 3 shows the overall schematics
diaphragm was fabricated by casting of the pump design. The integrated
into a glass container with a controlled micropump consisted of two functional
amount of PDMS moulding mixture. PMMA layers. The bottom part was
The membrane was then cut into the incorporated with two PDMS check-
desired sizes for use in the pump valves and two fluidic channels (width
assembly. 0.5mm, length 10mm) for inlet and
outlet; the upper part contained a pump
Preparation of PPy membrane
chamber (12mm x 12mm x 1mm) with
PPy actuator membrane was fabricated a PDMS/PPy diaphragm membrane
electrochemically using a porous attached at the top. Figure 4 shows the
polyvinylidene fluoride membrane photograph of the fabricated
(PVDF, Millipore, 0.45µm pore size, micropump which had final
110µm thick) as the base layer. [14] dimensions of 5.6mm x 16mm x 26mm.
The PVDF membrane was first sputter- Teflon tubing (OD=1.6mm,
coated with about 100 Pt layers on ID=0.8mm) was fixed at the inlet and
both sides. Polymerization of pyrrole outlet.
was carried out in a propylene
carbonate (Aldrich, Dublin) solution Ppy membrane
containing 0.06M pyrrole monomer PDMS diaphragm
(Merck, Dublin) and 0.05M lithium Pump chamber
outlet
Check valve

bis- (trifluoromethanesulfonyl)imide
Top layer
(LiTFSI, Aldrich). Approximately Inlet Holder for
0.5% water was also added to this check valve

solution to aid deposition. The polymer Cavities for check


valve holders
deposition was achieved using a Inlet/outlet

constant current density of 0.3 mA/cm2


channels

for a period of 5 hours at -20 . Prior


Bottom layer
to use, the actuator was cut to the
desired size (10mm x 10mm) and Figure 3: Schematic of the pump assembly.
stored in propylene carbonate-LiTFSI

mechanical force during actuation.
Inlet Anchoring
points The actuator pump produced highest
Outlet
flow rate initially which decreased
Top PMMA
layer PPy gradually to reach equilibrium within
actuator
10 minutes. The exact reason for this
Bottom layer
observation was not clear but this is a
common feature of PPy based
Figure 4: A photograph of the PPy based actuators. It was believed to be caused
microfluidic pump. by internal restructuring of the polymer
matrix that limits the ion transportation
MEASUREMENTS
to result in a reduced bending
To validate the pump performance, the movement. In light of this, it was
inlet of the pump was connected using decided that a 10 minute warm up time
silicone tubing (ID=1.5mm, should be allowed for all future
OD=1.9mm) to a reservoir that pumping experiments.
contained water. The outlet was
connected to a collecting container Check valve and pump performance
situated on a weighing balance to The check valves were configured in
monitor the quantity of water pumped the pump such that it was naturally in
out into the collector. A potentiostat the ‘closed’ position. The pressure
(Solartron, model 1285A) was used to generated by the actuator diaphragm
drive the PPy actuator using a dc caused the elastic side arm(s) to stretch
square-wave voltage of ±1.5V (vs SCE or bend depending on the design so
reference electrode). A typical pump that liquid could go through the gap.
stroke time of 2s involved switching Therefore the nature of the valve
the working electrode voltage to material (e.g. the adhesiveness of the
+1.5V and held for 1 second; then valve material to the contact surface),
switch to -1.5V and held for 1 second. the size of the contact area and the
The pumping volume or flow rate was thickness of the side arm determines
determined by the weight of water the force required for the valve
delivered during the pumping operation. As the PPy membrane can
experiments. only generate small bending force, fine
tuning of these parameters was crucial
RESULT AND DISCUSSION
for the successful operation of the PPy
Ppy actuator diaphragm pump.
The performance of the Ppy membrane Figure 2 (b) and (c) shows two typical
is responsible for the efficiency of the check valve designs used in this study.
actuator pump. In this Ppy/PDMS The first design (Figure 2c) had only
diaphragm, Ppy was grown on each one side arm and would work as a flap
side of the PVDF substrate. Each side type valve. This design (diameter
of the PPy membrane was connected to 2.4mm, height 0.5mm) required
the potentiostat such that one side minimal force to push open and was
worked as the working electrode and expected to give maximum pumping
the other side as the counter electrode. efficiency. However, it did not provide
When electrical voltage was applied, sufficient mechanical stability to stop
one side would swell while the backward flow during delivery
simultaneously the other side would mode.
contract to produce maximum

A second design was a valve with two arms, with a diameter of 2 mm, a height of 0.5 mm, and side arms 1.5 mm long (see Figure 2b).

Figure 5: Measured characteristics of the PPy micropump: (a) pumping volume (µL/stroke) as a function of stroke time; (b) flow rate (µL/min) as a function of stroke time. The error bars represent the standard deviation of 5 repeats in both cases.

This improved design increased the force required to stretch and open the valve, which stopped the back flow from occurring. However, a significant reduction of the pumping volume or flow rate was observed, as expected, since the valve movement was much more restricted.
The plot of stroke time versus volume delivered by the pump, presented in Figure 5(a), showed good linearity (R² = 0.993), which suggests that a consistent pump action was achieved. The data clearly indicate that the amount of deformation of the PPy membrane is proportional to the time for which the driving voltage is applied, and that the pump flow rate can be controlled by varying the stroke time, which sets the switching frequency of the input electrical potentials (i.e., the actuation frequency). Fast switching (a short stroke time) induces less actuation and thus less volume change in the pump chamber, producing a slow flow rate, whereas slow switching induces a large actuation and a large volume change, producing a higher flow rate. Figure 5(b) shows that a limiting flow rate of 52 µL/min was achieved at and above a stroke time of 6 s. Conversely, switching the PPy membrane faster than 1 Hz does not produce a force large enough to deliver sample through the valves, which limits the lowest achievable flow rate. Linear regression from Figure 5(a) yields a flow rate of 18 µL/min for a stroke time of 1 s, which would be approaching the lower limit of this device. The back pressure generated by this actuator pump was determined by the height of water it could pump up a vertical tube and was found to be 11 mbar.
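As a small illustration of the flow-rate control described above, the sketch below converts a per-stroke delivered volume into a flow rate for a given stroke time; the linear-fit coefficients are placeholders, not the values fitted in Figure 5(a).

```python
def pumping_volume_per_stroke(stroke_time_s, slope_uL_per_s=1.0, intercept_uL=0.0):
    """Hypothetical linear model of Figure 5(a): volume delivered per stroke (µL)."""
    return slope_uL_per_s * stroke_time_s + intercept_uL

def flow_rate_uL_per_min(stroke_time_s, **fit):
    """Strokes per minute multiplied by the volume delivered per stroke."""
    return (60.0 / stroke_time_s) * pumping_volume_per_stroke(stroke_time_s, **fit)

for t in (1, 2, 4, 6):
    print(t, "s stroke ->", round(flow_rate_uL_per_min(t), 1), "µL/min")
```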
Power consumption

The operational cost of this device is very low compared to conventional solenoid- or oscillator-based micropumps. Figure 6 shows the current profile obtained during a 2-minute pumping operation from time zero. The power consumption calculated was 55 mW/cm² for a 4 s stroke time operated at ±1.5 V. Since the stroke time is linearly proportional to the degree of PPy membrane actuation (or bending), it is reasonable to deduce that the power consumption is related to the stroke time. Hence, reducing the stroke time (and also the flow rate) would further reduce the power consumption reported above. The energy required to induce the electrochemical reaction of the PPy membrane depends on the amount of PPy present on the membrane. A thick PPy coating consumes more energy but exerts a larger actuation force than a thinner one; hence, to maintain the mechanical properties while achieving low energy consumption, one can reduce the size of the PPy membrane if the demand for a fast flow rate is not paramount.

Figure 6: Measured current profile during pumping. The operating voltage used was ±1.5 V supplied by a potentiostat. The stroke time was 4 s. The power consumption was calculated to be 55.5 mW.

Future development

This micropump is anticipated to be used in microfluidic applications. The aim is to integrate the pump structure into microfluidic chips to form a single platform and thus eliminate separate pumping and tubing connections. This will significantly reduce the complexity of microanalyser design and fabrication. In addition, this device is conceptually very simple and is fabricated using very low-cost and readily available materials. This low-power device is also suitable for field deployment, as it can operate for a long period of time (38 hours of continuous running) from a standard 1.5 V AA battery.

CONCLUSIONS

A self-priming micropump based on a PPy actuator has been fabricated with plastic substrates, without requiring expensive microfabrication equipment or complicated manufacturing procedures. By custom-designing the check valve, actuator diaphragm and pumping chamber, combined with using an in-house synthesized PPy membrane, the actuator pump is capable of producing a maximum flow rate of 52 µL/min and a nominal minimum flow rate of 18 µL/min when operated with a ±1.5 V power supply. The back pressure it generates is 11 mbar. This device has very low power consumption, using only 55 mW per stroke.

ACKNOWLEDGEMENTS

We would like to thank Dr. Roderick Shepherd and Yanzhe Wu for their advice on PPy membrane fabrication. This work was funded by Science Foundation Ireland (Grant no. SFI 03/IN.3/1361), the Biotex Project (FP6-2004-1ST-NMP-2) and the Korea Research Foundation (KRF-2005-214-D00236).

References
[1] P. Woias, Micropumps—Past, Progress and Future Prospects, Sens. Actuators B, 105, 28-38 (2005).
[2] D. J. Laser and J. G. Santiago, A Review of Micropumps, J. Micromech. Microeng., 14, 35-64 (2004).
[3] M. Koch, A. Evans, A. Brunnschweiler, The Dynamic Micropump Driven with a Screen Printed PZT Actuator, J. Micromech. Microeng., 8(2), 119-122 (1998).
[4] A. Wego, L. Pagel, A Self-filling Micropump Based on PCB Technology, Sens. Actuators A, 88, 220-226 (2001).
[5] T. Bourouina, A. Bosseboeuf, J. Grandchamp, Design and Simulation of an Electrostatic Micropump for Drug-delivery Applications, J. Micromech. Microeng., 7, 186-188 (1997).
[6] S. Bohm, W. Olthuis and P. Bergveld, A Plastic Micropump Constructed with Conventional Techniques and Materials, Sens. Actuators A, 77, 223-228 (1999).
[7] J. H. Park, K. Yoshida, S. Yokota, Resonantly driven piezoelectric micropump: fabrication of a micropump having high power density, Mechatronics, 9, 687-702 (1999).
[8] J. Darabi, M. Rada, M. Ohadi, J. Lawler, Design, Fabrication, and Testing of an Electrohydrodynamic Ion-Drag Micropump, J. of Microelectromechanical Systems, 11(6), pp. 684-690 (2002).
[9] J. Munyan, H. Fuentes, M. Draper, R. Kelly, A. Woolley, Electrically actuated, pressure-driven microfluidic pumps, Lab on a Chip, 3, pp. 217-220 (2003).
[10] T. Hensen, K. West, O. Hassager, N. Larsen, An all-polymer micropump based on the conductive polymer poly(3,4-ethylenedioxythiophene) and a polyurethane channel system, J. Micromech. Microeng., 17, pp. 860-866 (2007).
[11] K. Yamato, K. Kaneto, Tubular linear actuators using conducting polymer, polypyrrole, Analytica Chimica Acta, 568, pp. 133-137 (2006).
[12] R. Zengerle, S. Kluge, M. Richter, A Bidirectional Silicon Micropump, in: Proceedings of MEMS '95, Amsterdam, The Netherlands, 29 Jan.-2 Feb. 1995, pp. 19-24.
[13] Y. Wu, D. Zhou, G. M. Spinks, P. C. Innis, W. M. Megill and G. G. Wallace, TITAN: A Conducting Polymer Based Microfluidic Pump, Smart Materials and Structures, 14, 1511-1516 (2005).
[14] S. Hara, T. Zama, W. Takashima, K. Kaneto, "TFSI-doped Polypyrrole Actuator with 26% Strain," J. Materials Chemistry, Vol. 14, pp. 1516-1517 (2004).
Calibration and Data Integration of Multiple Optical Flow Sensors for Mobile
Robot Localization

Jwu-Sheng Hu Yung-Jung Chang Yu-Lun Hsu


Department of Electrical and Control Engineering
National Chiao Tung University
Hsinchu 300, Taiwan, ROC.
jshu@cn.nctu.edu.tw nuo.ece95g@nctu.edu.tw lun.ece96g@nctu.edu.tw

Abstract

This paper proposes a calibration method as well as a computational algorithm to integrate multiple optical flow sensors. Optical flow sensors offer a different kind of odometer as compared with the wheel encoder. Using multiple sensors, it is possible to reduce the effect of measurement uncertainties. Since all sensors are mounted on a rigid body, their measurement data must obey a certain relation. This relation is utilized in this paper and mathematical formulations are developed to realize the computation. It is shown that the calibration procedure can be cast as an optimization problem given enough measurement data. Further, the rigid-body relation is formulated as a null-space constraint using the calibrated parameters. During operation, unreliable sensor measurements can be removed by assessing the error distance to the null space. Simulation results are presented to support the proposed methods.

1. Introduction

An odometer based on wheel encoders is most commonly used in practice because of its simplicity and availability. Recently, localization methods using optical flow sensors (or optical mouse sensors) have been proposed [2]–[9]. Combining the measurement with landmarks to perform self-localization was also reported [7][10]. Compared with an optical encoder, the optical flow sensor measurement is not affected by wheel slippage because it directly senses the movement between the sensor and the sensing surface. Further, the cost of the sensor is very low thanks to its massive application in computer mice.

The principle of the optical mouse is to use a miniaturized CMOS camera to capture consecutive images reflected from the surface through the LED illumination. The camera, LED and associated optical mechanism are specially arranged to ensure a robust measurement [8]. Because the surface has texture variation, it is then possible to detect the motion of the sensor by matching the patterns between consecutive images. However, off-the-shelf sensors only give translation information because rotation is not needed in computer mouse applications. Therefore, at least two optical flow sensors have to be used to detect the complete motion information [3]–[6].

There are many factors that might affect the accuracy of the optical flow measurement. The work in [8] provides a detailed analysis of the possible errors of the optical flow sensor itself, and it is possible to reduce the error by taking the average over an array of sensors. However, taking the average does not consider the differences among the sensors as they might encounter different conditions. For example, an optical flow sensor passing over a hole (i.e., a sudden change of height of the surface) gives an incorrect reading due to being out of focus. Further, to use multiple sensors, there are more issues to be considered. Borenstein and Feng [1] categorized the errors into 1) systematic errors and 2) non-systematic errors. In our case, the causes of the systematic error include imperfect measurements of the position and orientation of the optical flow sensors and variation of resolutions. The causes of the non-systematic error come from the sensor itself, such as the inability to detect the change of a homogeneous surface or a sensor-to-surface distance that is too large [5].

The technical issues mentioned above have never been studied in detail when constructing a sensor module using multiple optical flow sensors. This work proposes a calibration method to deal with the systematic errors as well as a consistency check strategy to reduce the inaccuracy caused by non-systematic errors. Rigorous mathematical formulations and derivations are given to facilitate the design in real
practice. The following section describes the methods of integrating multiple optical flow sensors. In Section 3, the rigid-body constraints and the geometric relations of optical flow sensors are introduced. Section 4 presents the calibration method which optimizes the parameters of the sensors using the formulation in Section 3. In Section 5, the consistency check strategy is developed to choose the reliable sensor measurements during operation. Several simulation results are given in Section 6 to demonstrate the proposed method, and a conclusion is given in Section 7.

2. Position and orientation estimation using multiple sensors

The analysis in this section makes an extension of the work in [5] to multiple sensors. Consider N optical flow sensors, labeled i = 1 to N, mounted on a plane. Each sensor is able to measure a 2-dimensional translation in its own coordinate. In general, sensor coordinates (coordinates defined on the motion detection axes of the optical flow sensor) are not necessarily aligned to each other. Suppose two sensors labeled i and j (Figure 1) are at a distance $D_{ij}$ from each other. The coordinate of sensor i is rotated by the angle $\sigma_{ij}$ relative to the line connecting both sensors (line $O_iO_j$ in Figure 1), while the angle for sensor j is $\sigma_{ji}$. The sign of $\sigma_{ij}$ and $\sigma_{ji}$ is positive if the rotation is counterclockwise (CCW) and negative otherwise.

Figure 1. Geometric relation of two sensors

Considering that the sensor moves along an arc during the sampling interval, the length of the arc is

$l_i = \sqrt{x_i^2 + y_i^2}$   (1)

where $x_i$ and $y_i$ are the measurements of sensor i at each sample instant in the coordinate of sensor i. The motion direction (tangent to the arc) of sensor i is at the angle $\alpha_i$ relative to the sensor coordinate, i.e.

$l_i \cos(\alpha_i) = x_i$ and $l_i \sin(\alpha_i) = y_i$.   (2)

From Figure 1, the angle $\gamma_{ij}$ can be calculated as $\gamma_{ij} = |\alpha_i + \sigma_{ij} - \alpha_j - \sigma_{ji}|$. Denoting the rotational angle as $\Delta\theta_{ij}$, the radius of rotation for sensor i is

$r_i = l_i / \Delta\theta_{ij}$   (3)

and, from the law of cosines, $\Delta\theta_{ij}$ can be calculated as

$\Delta\theta_{ij} = \frac{\sqrt{l_i^2 + l_j^2 - 2\cos(\gamma_{ij})\,l_i l_j}}{D_{ij}} \cdot \mathrm{sign}\big(l_j \sin(\alpha_j + \sigma_{ij}) - l_i \sin(\alpha_i + \sigma_{ij})\big)$   (4)

Define a coordinate $(x', y')$ aligned with the line $O_iO_j$ with its origin located at the mid-point of the line (Figure 2). The new sensor locations can be calculated as

$x_i' = r_i\big(\sin(\Delta\theta_{ij} + \alpha_i + \sigma_{ij}) - \sin(\alpha_i + \sigma_{ij})\big)\,\mathrm{sign}(\Delta\theta_{ij}) + D_{ij}/2$
$y_i' = r_i\big(\cos(\alpha_i + \sigma_{ij}) - \cos(\Delta\theta_{ij} + \alpha_i + \sigma_{ij})\big)\,\mathrm{sign}(\Delta\theta_{ij})$
$x_j' = r_j\big(\sin(\Delta\theta_{ij} + \alpha_j + \sigma_{ji}) - \sin(\alpha_j + \sigma_{ji})\big)\,\mathrm{sign}(\Delta\theta_{ij}) - D_{ij}/2$
$y_j' = r_j\big(\cos(\alpha_j + \sigma_{ji}) - \cos(\Delta\theta_{ij} + \alpha_j + \sigma_{ji})\big)\,\mathrm{sign}(\Delta\theta_{ij})$   (5)

Figure 2. The movement of two sensors

Denoting the center of the line $O_iO_j$ as $o_{ij}$ and its movement as $\Delta o_{ij}$ (see Figure 2), we have

$\Delta o_{ij} = [\Delta x_{ij}' \;\; \Delta y_{ij}']^T$   (6)

where $\Delta x_{ij}' = (x_i' + x_j')/2$ and $\Delta y_{ij}' = (y_i' + y_j')/2$.

Suppose that the center of the robot relative to $o_{ij}$ in the coordinate of Figure 2 is $\mathbf{c}_{ij}'$; the movement of the center, denoted as $\Delta\mathbf{c}_{ij}'$, is

$\Delta\mathbf{c}_{ij}' = (\mathbf{T}(\Delta\theta_{ij}) - \mathbf{I})\,\mathbf{c}_{ij}' + \Delta o_{ij}$   (7)

where I is the identity matrix and $\mathbf{T}(\Delta\theta_{ij})$ is the transformation matrix

$\mathbf{T}(\Delta\theta_{ij}) = \begin{bmatrix} \cos(\Delta\theta_{ij}) & -\sin(\Delta\theta_{ij}) \\ \sin(\Delta\theta_{ij}) & \cos(\Delta\theta_{ij}) \end{bmatrix}$   (8)

Suppose the orientation of the vector $\overrightarrow{O_iO_j}$ relative to the global coordinate is $\beta_{ij}$; the movement represented in the global coordinate (denoted as $\Delta\mathbf{c}_{ij}$) is

$\Delta\mathbf{c}_{ij} = \mathbf{T}(\beta_{ij})\,\Delta\mathbf{c}_{ij}'$   (9)

Therefore, the robot position and orientation relative to the global coordinate computed from the sensor pair i and j at time k+1 are

$\theta(k+1) = \theta(k) + \Delta\theta_{ij}$   (10)

$\mathbf{c}(k+1) = \mathbf{c}(k) + \mathbf{T}(\theta(k))\,\Delta\mathbf{c}_{ij}$   (11)

For N sensors, there will be $C^N_2 = N(N-1)/2$ solutions for the robot position and orientation update. A straightforward way of updating is to compute the mean as

$\theta(k+1) = \theta(k) + \frac{2}{N(N-1)}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\Delta\theta_{ij}$   (12)

$\mathbf{c}(k+1) = \mathbf{c}(k) + \mathbf{T}(\theta(k))\,\frac{2}{N(N-1)}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\Delta\mathbf{c}_{ij}$   (13)
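To make the pair-wise update of Section 2 concrete, the following is a minimal Python sketch of one pose update from a single sensor pair, following Eqs. (1)-(11). The function and variable names are ours, the calibrated quantities ($\sigma_{ij}$, $\sigma_{ji}$, $\beta_{ij}$, $D_{ij}$, $\mathbf{c}_{ij}'$) are taken as inputs, and the small-rotation branch is our reading of the limit of Eqs. (3)-(5); it is an illustration under those assumptions, not the authors' implementation.

```python
import numpy as np

def pair_pose_update(theta, c, meas_i, meas_j, sigma_ij, sigma_ji, beta_ij, D_ij,
                     c_prime=np.zeros(2), eps=1e-9):
    """One robot pose update from a single sensor pair, following Eqs. (1)-(11).

    meas_i, meas_j : (x, y) translations reported by sensors i and j this sample
    sigma_ij, sigma_ji : sensor-frame angles w.r.t. the line O_iO_j (calibrated)
    beta_ij : orientation of the line O_iO_j in the robot frame (calibrated)
    D_ij    : distance between the two sensors
    c_prime : robot center relative to the midpoint o_ij in the primed frame
    theta, c: current global heading and position (c is a length-2 array)
    """
    xi, yi = meas_i
    xj, yj = meas_j
    li, lj = np.hypot(xi, yi), np.hypot(xj, yj)                  # Eq. (1)
    ai, aj = np.arctan2(yi, xi), np.arctan2(yj, xj)              # Eq. (2)
    gamma = abs(ai + sigma_ij - aj - sigma_ji)
    chord = np.sqrt(max(li**2 + lj**2 - 2.0 * li * lj * np.cos(gamma), 0.0))
    s = np.sign(lj * np.sin(aj + sigma_ij) - li * np.sin(ai + sigma_ij))
    dtheta = s * chord / D_ij                                    # Eq. (4)

    def arc_endpoint(l, phi, offset):
        # New sensor location in the primed frame, Eq. (5).  The small-angle
        # branch is the analytic limit of (5), added so the sketch does not
        # divide by a vanishing rotation.
        if abs(dtheta) < eps:
            return np.array([l * np.cos(phi) + offset, l * np.sin(phi)])
        r = l / abs(dtheta)                                      # Eq. (3)
        sg = np.sign(dtheta)
        return np.array([r * (np.sin(dtheta + phi) - np.sin(phi)) * sg + offset,
                         r * (np.cos(phi) - np.cos(dtheta + phi)) * sg])

    pi_new = arc_endpoint(li, ai + sigma_ij, +D_ij / 2.0)
    pj_new = arc_endpoint(lj, aj + sigma_ji, -D_ij / 2.0)
    do = 0.5 * (pi_new + pj_new)                                 # Eq. (6)

    def rot(a):                                                  # Eq. (8)
        return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

    dc_prime = (rot(dtheta) - np.eye(2)) @ c_prime + do          # Eq. (7)
    dc = rot(beta_ij) @ dc_prime                                 # Eq. (9)
    return theta + dtheta, c + rot(theta) @ dc                   # Eqs. (10)-(11)
```

Averaging the returned increments over all N(N-1)/2 pairs then gives the update of Eqs. (12)-(13).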

3. The rigid-body constraints and geometric relations among sensors

Since all sensors are fixed relative to each other, the measurements must obey the rigid-body constraint. This constraint can be used to perform calibration as well as to assess the correctness of the measurements. For rigid-body motion, the constraint between any two sensors, according to Figure 1, is

$l_i\cos(\alpha_i + \sigma_{ij}) = l_j\cos(\alpha_j + \sigma_{ji})$   (14)

or

$l_i\cos(\alpha_i)\cos(\sigma_{ij}) - l_i\sin(\alpha_i)\sin(\sigma_{ij}) = l_j\cos(\alpha_j)\cos(\sigma_{ji}) - l_j\sin(\alpha_j)\sin(\sigma_{ji})$.   (15)

This means that the projections of the sensor measurements onto the joining line in Figure 1 should be the same. For N sensors, there will be N(N-1)/2 constraint equations. Since $l_i\cos(\alpha_i) = x_i$ and $l_i\sin(\alpha_i) = y_i$, where $x_i$ and $y_i$ are the sensor measurements during each sampling interval in the sensor coordinate, the equation becomes

$x_i\cos(\sigma_{ij}) - y_i\sin(\sigma_{ij}) = x_j\cos(\sigma_{ji}) - y_j\sin(\sigma_{ji})$   (16)

and the equation error is defined as

$\varepsilon_{ij} = x_i\cos(\sigma_{ij}) - y_i\sin(\sigma_{ij}) - x_j\cos(\sigma_{ji}) + y_j\sin(\sigma_{ji})$   (17)

The pattern of $\varepsilon_{ij}$ can be used to assess whether the nominal parameters are correct or whether the sensor readings are reliable. Define the error vector $\boldsymbol{\varepsilon}$ as the collection of the N(N-1)/2 errors $\varepsilon_{ij}$,

$\boldsymbol{\varepsilon} = [\varepsilon_{12}\;\varepsilon_{13}\;\cdots\;\varepsilon_{1N}\;\varepsilon_{23}\;\varepsilon_{34}\;\cdots\;\varepsilon_{(N-1)N}]^T = \mathbf{B}\cdot\mathbf{X}$   (18)

where X is the sensor measurement vector $\mathbf{X} = [x_1\;y_1\;x_2\;y_2\;\cdots\;x_N\;y_N]^T$ of dimension 2N×1, and B is the N(N-1)/2 × 2N matrix

$\mathbf{B} = \begin{bmatrix}
\cos(\sigma_{12}) & -\sin(\sigma_{12}) & -\cos(\sigma_{21}) & \sin(\sigma_{21}) & 0 & 0 & \cdots & 0 \\
\cos(\sigma_{13}) & -\sin(\sigma_{13}) & 0 & 0 & -\cos(\sigma_{31}) & \sin(\sigma_{31}) & \cdots & 0 \\
\vdots & & & & & & & \vdots \\
0 & \cdots & 0 & \cos(\sigma_{(N-1)N}) & -\sin(\sigma_{(N-1)N}) & -\cos(\sigma_{N(N-1)}) & \sin(\sigma_{N(N-1)}) &
\end{bmatrix}$   (19)

Moreover, denoting the orientation of sensor i relative to the robot coordinate as $\phi_i$ (and $\phi_j$ for sensor j), the angle $\sigma_{ij}$ can be obtained from $\sigma_{ij} = \phi_i - \beta_{ij}$ and, similarly, $\sigma_{ji} = \phi_j - \beta_{ij}$.

Figure 3. The geometric relations of angles: (a) acute angle case; (b) obtuse angle case

Equation (18) can be used to compute the parameters in B by minimizing $\boldsymbol{\varepsilon}$. B contains the angular parameters of the sensors, i.e., all the $\phi_i$'s and $\beta_{ij}$'s. The number of $\phi_i$'s is N and the number of $\beta_{ij}$'s is N(N-1)/2 (since $\beta_{ji} = \beta_{ij} + \pi$ and there are no $\beta_{ii}$'s). However, all $\phi_i$'s are independent of each other while the $\beta_{ij}$'s are not. In other words, there are relations among the $\beta_{ij}$'s which should be satisfied when performing the minimization to find the parameters. To begin with, define the coordinate of sensor 1 as the robot coordinate and the center of sensor 1 as the robot center, i.e., $\phi_1 = 0$ and the position of sensor 1 is (0,0). Figure 3 shows the relations among sensors 1, i, i+1, j-1 and j. There are two cases when computing $\beta_{ij}$.

In Figure 3, suppose that $\overline{P_{ij}O_j}$ is perpendicular to $\overline{O_1O_i}$ and the point $P_{ij}$ lies on the line $\overline{O_1O_i}$. In the first case (Figure 3(a)), $\angle P_{ij}O_iO_j$ is an acute angle and it is easy to see that $\beta_{ij} = \beta_{1i} + (\pi - \angle P_{ij}O_iO_j)$. Let $\psi_{a,b,c}$ denote the angle $\angle O_aO_bO_c$ and $D_{ij}$ the length of $\overline{O_iO_j}$. We can see that the length of $\overline{P_{ij}O_j}$ is equal to $D_{1j}\sin(\psi_{i,1,j})$ and the length of $\overline{P_{ij}O_i}$ is equal to $D_{1i} - D_{1j}\cos(\psi_{i,1,j})$. As a result, the angle $\angle P_{ij}O_iO_j$ is

$\angle P_{ij}O_iO_j = \arctan\!\left(\frac{D_{1j}\sin(\psi_{i,1,j})}{D_{1i} - D_{1j}\cos(\psi_{i,1,j})}\right)$   (20)

and, according to the law of sines,

$D_{1i} = D_{1j}\,\frac{\sin(\psi_{1,i+1,i})\sin(\psi_{1,i+2,i+1})\cdots\sin(\psi_{1,j,j-1})}{\sin(\psi_{1,i,i+1})\sin(\psi_{1,i+1,i+2})\cdots\sin(\psi_{1,j-1,j})}$   (21)

Hence,

$\angle P_{ij}O_iO_j = \arctan\!\left(\frac{\sin(\psi_{i,1,j})}{\dfrac{\sin(\psi_{1,i+1,i})\sin(\psi_{1,i+2,i+1})\cdots\sin(\psi_{1,j,j-1})}{\sin(\psi_{1,i,i+1})\sin(\psi_{1,i+1,i+2})\cdots\sin(\psi_{1,j-1,j})} - \cos(\psi_{i,1,j})}\right)$   (22)

In the second case (Figure 3(b)), the angle $\angle P_{ij}O_iO_j$ is an obtuse angle and $\beta_{ij} = \beta_{1i} + \angle P_{ij}O_iO_j$. Similarly, we can arrive at the following equation:

$\angle P_{ij}O_iO_j = \arctan\!\left(\frac{\sin(\psi_{i,1,j})}{\cos(\psi_{i,1,j}) - \dfrac{\sin(\psi_{1,i+1,i})\sin(\psi_{1,i+2,i+1})\cdots\sin(\psi_{1,j,j-1})}{\sin(\psi_{1,i,i+1})\sin(\psi_{1,i+1,i+2})\cdots\sin(\psi_{1,j-1,j})}}\right)$   (23)

Further, it is straightforward to show that $\psi_{i,1,j} = \beta_{1j} - \beta_{1i}$, $\psi_{1,i,i+1} = \pi - (\beta_{i,i+1} - \beta_{1i})$, $\psi_{1,i+1,i} = \beta_{i,i+1} - \beta_{1,i+1}$, $\psi_{1,i+1,i+2} = \beta_{i+1,i+2} - \beta_{1,i+1}$, $\psi_{1,i+2,i+1} = \pi - (\beta_{i+1,i+2} - \beta_{1,i+2})$, and so on. Therefore, the angle $\beta_{ij}$ can be rewritten as

$\beta_{ij} = \begin{cases} \beta_{1i} + \pi - \arctan\!\left(\dfrac{\sin(\beta_{1j} - \beta_{1i})}{S_{ij} - \cos(\beta_{1j} - \beta_{1i})}\right), & \text{case (a)} \\[2ex] \beta_{1i} + \arctan\!\left(\dfrac{\sin(\beta_{1j} - \beta_{1i})}{\cos(\beta_{1j} - \beta_{1i}) - S_{ij}}\right), & \text{case (b)} \end{cases}$   (24)

where

$S_{ij} = \frac{\sin(\beta_{i,i+1} - \beta_{1,i+1})\sin(\beta_{i+1,i+2} - \beta_{1,i+2})\cdots\sin(\beta_{j-1,j} - \beta_{1,j})}{\sin(\beta_{i,i+1} - \beta_{1,i})\sin(\beta_{i+1,i+2} - \beta_{1,i+1})\cdots\sin(\beta_{j-1,j} - \beta_{1,j-1})}$

Equation (24) shows the relationship among the $\beta_{ij}$'s, and it is easy to see that the free parameters are the $\beta_{1j}$'s, j = 2 to N, and the $\beta_{i(i+1)}$'s, i = 2 to N-1. All other $\beta_{ij}$'s can be computed from them using (24).

Given the angles $\beta_{ij}$, there are also geometric relations among the $D_{ij}$'s. In fact, there is only one degree of freedom left for the $D_{ij}$'s. To see this, according to the law of sines,

$D_{1j} = D_{12}\,\frac{\sin(\psi_{1,2,j})}{\sin(\psi_{1,j,2})}$   (25)

Then,

$D_{ij} = D_{1j}\,\frac{\sin(\psi_{i,1,j})}{\sin(\psi_{1,i,j})} = D_{12}\,\frac{\sin(\psi_{i,1,j})\sin(\psi_{1,2,j})}{\sin(\psi_{1,i,j})\sin(\psi_{1,j,2})}$   (26)

Equivalently, we can write

$D_{ij} = G_{ij}\cdot D_{12}$   (27)

where

$G_{ij} = \begin{cases} \dfrac{\sin(\beta_{2j} - \beta_{12})}{\sin(\beta_{2j} - \beta_{1j})}, & \text{when } i = 1 \\[2ex] \dfrac{\sin(\beta_{1j} - \beta_{1i})}{\sin(\beta_{ij} - \beta_{1i})}\,\dfrac{\sin(\beta_{2j} - \beta_{12})}{\sin(\beta_{2j} - \beta_{1j})}, & \text{otherwise} \end{cases}$

As a result, the position of each sensor relative to sensor 1 can be derived from $D_{12}$ and the $\beta_{ij}$'s. These geometric relations are fundamental to the calibration and the consistency check algorithms described in the following sections.
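The rigid-body constraint of Eqs. (16)-(19) is easy to evaluate once the angular parameters are known. Below is a small Python sketch (0-based indices, names ours) that assembles B from the $\phi_i$'s and $\beta_{ij}$'s via $\sigma_{ij} = \phi_i - \beta_{ij}$ and evaluates the error vector of Eq. (18); it illustrates the structure of (19) only and is not the authors' code.

```python
import numpy as np
from itertools import combinations

def build_B(phi, beta):
    """Constraint matrix B of Eq. (19).

    phi  : sequence of sensor orientations phi_i in the robot frame (phi[0] = 0)
    beta : dict mapping a sensor pair (i, j), i < j, to the line orientation beta_ij
    Rows run over all sensor pairs (i < j) in lexicographic order, matching the
    N(N-1)/2 errors listed in Eq. (18).  Sensor 1 of the paper is index 0 here.
    """
    N = len(phi)
    pairs = list(combinations(range(N), 2))
    B = np.zeros((len(pairs), 2 * N))
    for row, (i, j) in enumerate(pairs):
        sigma_ij = phi[i] - beta[(i, j)]          # sigma_ij = phi_i - beta_ij
        sigma_ji = phi[j] - beta[(i, j)]          # sigma_ji = phi_j - beta_ij
        B[row, 2 * i:2 * i + 2] = [np.cos(sigma_ij), -np.sin(sigma_ij)]
        B[row, 2 * j:2 * j + 2] = [-np.cos(sigma_ji), np.sin(sigma_ji)]
    return B

# Rigid-body error vector of Eq. (18) for one measurement vector
# X = [x1, y1, x2, y2, ..., xN, yN]:
#   eps = build_B(phi, beta) @ X
```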
4. The calibration method

The objective of calibration is to compute the parameters of the sensor configuration. Using (18) and (24), one can determine the angular parameters (the $\phi_i$'s and $\beta_{ij}$'s) and, subsequently, the distances among sensors can be computed from (27) given one distance measurement between a pair of sensors. In other words, if that distance measurement and all sensor readings are accurate, it is possible to perform self-calibration without using external reference measurements.

Suppose M sets of measurements are collected by moving the sensor module and each set is denoted as the vector

$\mathbf{X}_m = [x_{1,m}\;y_{1,m}\;x_{2,m}\;y_{2,m}\;\cdots\;x_{N,m}\;y_{N,m}]^T,\quad m = 1, 2, \ldots, M$

The trajectory of the sensor motion shall be designed such that the vectors $\mathbf{X}_m$, m = 1 to M, span the remaining subspace (a condition similar to persistent excitation). As described in Section 3, the independent angular parameters are the $\phi_i$'s, i = 2 to N, the $\beta_{1j}$'s, j = 2 to N, and the $\beta_{k(k+1)}$'s, k = 2 to N-1. The total number is (N−1)+(N−1)+(N−2) = 3N−4. Let Z be the vector $\mathbf{Z} = [z_1\;z_2\;\cdots\;z_{3N-4}]^T = [\phi_2\;\cdots\;\phi_N\;\beta_{12}\;\beta_{13}\;\cdots\;\beta_{1N}\;\beta_{23}\;\beta_{34}\;\cdots\;\beta_{(N-1)N}]^T$. The problem of solving for Z can be cast as the following optimization problem:

$\min_{\mathbf{Z}}\;\big(\mathbf{X}_1^T\mathbf{B}^T\mathbf{B}\mathbf{X}_1 + \mathbf{X}_2^T\mathbf{B}^T\mathbf{B}\mathbf{X}_2 + \cdots + \mathbf{X}_M^T\mathbf{B}^T\mathbf{B}\mathbf{X}_M\big)$   (28)

To solve this unconstrained optimization problem, we can directly use the Newton-Raphson method.

As mentioned above, the distances among sensors can be computed from (27) if $D_{12}$ is known. It is also possible to calibrate $D_{12}$ if an external angular measurement is available. To see this, substituting (27) into (4), we have

$\Delta\theta_{ij} = \frac{\sqrt{l_i^2 + l_j^2 - 2\cos(\gamma_{ij})\,l_i l_j}}{D_{12}G_{ij}}\;\mathrm{sign}\big(l_j\sin(\alpha_j + \sigma_{ij}) - l_i\sin(\alpha_i + \sigma_{ij})\big)$

At each sample instant k, define a new variable $u_k$ as

$u_k = \frac{2}{N(N-1)}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left(\frac{\sqrt{l_{i,k}^2 + l_{j,k}^2 - 2\cos(\gamma_{ij})\,l_{i,k}l_{j,k}}}{G_{ij}}\;\mathrm{sign}\big(l_{j,k}\sin(\alpha_{j,k} + \sigma_{ij}) - l_{i,k}\sin(\alpha_{i,k} + \sigma_{ij})\big)\right)$

As a result, the product of $u_k$ and the inverse of $D_{12}$ is equal to the average of the orientation estimates of the sensor pairs. More precisely,

$u_k\cdot D_{12}^{-1} = \frac{2}{N(N-1)}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\Delta\theta_{ij,k} = \Delta\theta_{real,k}$   (29)

where $\Delta\theta_{real,k}$ denotes the real rotation angle at each sample. By collecting B sets of data, we can set up the equation

$\mathbf{u}\cdot D_{12}^{-1} = \boldsymbol{\theta}_{real}$   (30)

where $\boldsymbol{\theta}_{real} = [\Delta\theta_{real,1}\;\Delta\theta_{real,2}\;\cdots\;\Delta\theta_{real,B}]^T$ is the desired rotation vector and $\mathbf{u} = [u_1\;u_2\;\cdots\;u_B]^T$. Finally, the least-squares method can be used to solve this equation as

$D_{12}^{-1} = (\mathbf{u}^T\mathbf{u})^{-1}\mathbf{u}^T\cdot\boldsymbol{\theta}_{real}$   (31)

5. Consistency check strategy

The performance of an optical flow sensor depends on the condition of the sensing surface. A highly reflective surface or a sudden change of height might seriously disturb the sensor measurements. Each pair of sensors can produce an estimate of position and orientation according to (1) to (11). For N sensors, there will be N(N-1)/2 estimates. In order to reduce the uncertainty caused by the non-systematic error, the unreliable sensor measurements shall be removed from the update. The remaining measurements can be used to update the position and orientation of the robot as described previously in (12) and (13).

From (18), if there is no error in the sensor measurements, $\boldsymbol{\varepsilon}$ should be zero. This means that the correct measurement vector X should lie in the null space of the matrix B (denoted as N(B)). Therefore, for any vector X not in N(B), the orthogonal projection of X onto N(B) can be interpreted as the optimal correction of X. Alternatively, the distance of X to N(B) (or the error vector after projection) represents the degree of incorrectness of the measurements. It is then possible to use this distance to assess the reliability of each sensor measurement. Accordingly, there could be different kinds of strategies to assess the reliability. For example, if one of the sensors gives an incorrect reading, we can find it by removing it from X; the remaining sub-vector should then be in the null space. Suppose the total number of unreliable sensors at any time is Q. The measurement vector at time t is denoted as $\mathbf{X}_t$. The procedure for finding these Q sensors each time new data arrive is defined by the following steps:
1) At the beginning, the total number of sensors in $\mathbf{X}_t$ is N.
2) Ignore the measurements of one of these sensors and redefine a measurement vector, $\mathbf{X}_{r,t}$, of the remaining sensors.
3) Find the constraint matrix, $\mathbf{B}_r$, of the remaining sensors and the null space of $\mathbf{B}_r$, $N(\mathbf{B}_r)$.
4) Find the orthogonal projection vector $\hat{\mathbf{X}}_{r,t}$ of $\mathbf{X}_{r,t}$ onto $N(\mathbf{B}_r)$.
5) Calculate the distance from $\mathbf{X}_{r,t}$ to $\hat{\mathbf{X}}_{r,t}$.
6) Repeat steps 2 to 5 until each sensor has been ignored once. Then compare all of the distances collected in step 5 and find the minimum one.
7) Remove the sensor that was ignored corresponding to the minimum distance in step 6. If the total number of removed sensors is equal to Q, then stop; else, go to step 2.
After these steps, we obtain the (N−Q) reliable sensors at time t, and $\mathbf{X}_{r,t}$ can be used to compute the movement according to (1)−(11) and to estimate the overall position and orientation by computing the mean of these movements as in (12) and (13).
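As an illustration of the steps above, the following Python sketch performs the greedy removal of Q suspect sensors using the distance to the null space of the reduced constraint matrix. It reuses the build_B sketch given after Section 3, uses a pseudoinverse-based projection, and takes 0-based sensor indices; it is an assumption-laden illustration, not the authors' implementation.

```python
import numpy as np
from itertools import combinations

def consistency_check(X_t, phi, beta, Q):
    """Greedy removal of Q unreliable sensors (steps 1-7 of Section 5).

    X_t  : numpy measurement vector [x1, y1, ..., xN, yN] at time t
    phi, beta : calibrated angular parameters, as used by build_B() above
    Returns the indices of the sensors that are kept.
    """
    keep = list(range(len(phi)))
    for _ in range(Q):
        best = None
        for cand in keep:
            trial = [s for s in keep if s != cand]
            # Constraint matrix of the remaining sensors only.
            B_r = build_B([phi[s] for s in trial],
                          {(a, b): beta[(trial[a], trial[b])]
                           for a, b in combinations(range(len(trial)), 2)})
            X_r = np.concatenate([X_t[2 * s:2 * s + 2] for s in trial])
            # Orthogonal projection onto N(B_r) is (I - B_r^+ B_r) X_r, so the
            # distance to the null space is the norm of B_r^+ B_r X_r.
            dist = np.linalg.norm(np.linalg.pinv(B_r) @ (B_r @ X_r))
            if best is None or dist < best[0]:
                best = (dist, cand)
        keep.remove(best[1])   # drop the sensor whose removal fits best
    return keep
```

The surviving measurements can then be fed to the pair-wise pose update and averaged as in (12)-(13).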
6. Simulation results

The proposed methods are demonstrated by simulation using Matlab. Eight optical flow sensors are located at the corners of an octagon. The diagonal distance of the octagon is 5 cm, and the relative orientation between two adjoining sensors is 45 degrees. 3% random errors are added to the angles and distances to represent the uncertainty of the hardware installation. Further, the sensor measurements are perturbed by 5% random noise at each sample. The sampling rate is set to 50 samples/sec.

First, 1000 samples of data are simulated to calibrate the parameters using the procedure in Section 4. The sensor module then moves on a semi-circle (radius 100 cm) at a speed of 5 cm/sec; hence there are a total of 3142 samples of new data. The estimated trajectory using the calibrated parameters and the trajectory using the nominal parameters are shown in Figure 4.

Figure 4. The comparison of localization results with and without using the calibration method

Once again, let the robot move on the same path at the same speed, but this time the measurements of sensors 2 and 4 are biased at the midpoint of the semi-circle to represent non-systematic error. As shown in Figure 5, the trajectory which does not implement the consistency check strategy described in Section 5 results in a large error. In contrast, the one using the strategy successfully eliminates the faulty sensors and gives more accurate estimations.

Figure 5. The comparison of localization results with and without using the consistency check strategy

7. Conclusion

In this work, an odometer using multiple optical flow sensors is introduced. Since the relative positions of the sensors are unchanged, their measurements should obey the rigid-body constraint, i.e., the projections of the velocity measurements of a pair of sensors onto the line connecting them should be the same. This relation is used first to calibrate the parameters of the sensor configuration. It is shown that all parameters can be computed from the sensor measurements and the rotation angle of the module. To find the solution given multiple measurements, a Newton-Raphson-like method is used to obtain the optimal results. To filter out incorrect sensor data during operation, the rigid-body constraint is again used to construct the null space to which the sensor data vector should belong. The reliability of the sensor data is determined based on the distance to the null space. Simulations are conducted to support the proposed methods, and the results show the effectiveness of the methods in achieving better accuracy.

8. Acknowledgment

This work was supported in part by the National Science Council of Taiwan, ROC under grant NSC 95-2221-E-009-177 and the Ministry of Economic Affairs under grant 95-EC-17-A-04-S1-054 and grant 96TR04.

9. References

[1] J. Borenstein and L. Feng, "Measurement and Correction of Systematic Odometry Errors in Mobile Robots," IEEE Trans. Robotics and Automation, vol. 12, no. 6, pp. 869-880, Dec. 1996.
[2] D. K. Sorensen, "On-Line Optical Flow Feedback for Mobile Robot Localization / Navigation," Master Thesis, Texas A&M University, May 2003.
[3] S. Lee and J. B. Song, "Robust mobile robot localization using optical flow sensors and encoders," IEEE Int. Conf. Robotics and Automation, pp. 1039-1044, Apr. 2004.
[4] S. Lee, "Mobile Robot Localization using Optical Mice," IEEE Int. Conf. Robotics, Automation and Mechatronics, vol. 2, pp. 1192-1197, Dec. 2004.
[5] A. Bonarini, M. Matteucci, and M. Restelli, "Automatic Error Detection and Reduction for an Odometric Sensor based on Two Optical Mice," IEEE Int. Conf. Robotics and Automation, pp. 1675-1680, Apr. 2005.
[6] A. Bonarini, M. Matteucci, and M. Restelli, "A Kinematic-independent Dead-reckoning Sensor for Indoor Mobile Robotics," IEEE/RSJ Int. Conf. Intelligent Robots and Systems, vol. 4, pp. 3750-3755, 28 Sept.-2 Oct. 2005.
[7] D. Sekimori and F. Miyazaki, "Self-Localization for Indoor Mobile Robots Based on Optical Mouse Sensor Values and Simple Global Camera Information," IEEE Int. Conf. Robotics and Biomimetics, pp. 605-610, 2005.
[8] J. Palacin, I. Valgañon, and R. Pernia, "The optical mouse for indoor mobile robot odometry measurement," Sensors and Actuators A: Physical, vol. 126, no. 1, pp. 141-147, 26 January 2006.
[9] J.-S. Hu, J.-H. Cheng, and Y.-J. Chang, "Spatial Trajectory Tracking Control of Omni-directional Wheeled Robot Using Optical Flow Sensors," 1st IEEE Multi-conference on Systems and Control (MSC), Singapore, October 1-3, 2007.
[10] J.-S. Hu, Y.-J. Chang, W.-H. Liu, and C.-H. Yang, "Localization of an Omni-directional Robot Platform Using Sound Field Characteristics and Optical Flow Sensing," 13th International Conference on Advanced Robotics (ICAR 2007), Jeju, Korea, Aug. 21-24, 2007.

Intelligent Multimedia Recommender by Integrating Annotation and


Association Mining

Vincent S. Tseng, Ja-Hwung Su, Bo-Wen Wang, Chin-Yuan Hsiao, Jay Huang and Hsin-Ho Yeh
Department of Computer Science and Information Engineering
National Cheng Kung University, Tainan, Taiwan, R.O.C.
Email: tsengsm@mail.ncku.edu.tw

Abstract

Making a decision among a set of items from compound and complex information has become a difficult task for common users. Collaborative filtering has been the mainstay of automatically personalized search employed in contemporary recommender systems. Until now, it is still a challenging issue to reduce the gap between user perception and multimedia contents. To bridge users' interests and multimedia items, in this paper we present an intelligent multimedia recommender system that integrates annotation and association mining techniques. In our proposed system, low-level multimedia contents are conceptualized by automated annotation to support rule-based collaborative filtering recommendation. From the discovered relations between user contents and conceptualized multimedia contents, the proposed recommender system can provide a suitable recommendation list to assist users in making a decision among a massive number of items.

1. Introduction

Modern communication technologies ease multimedia data exchange on the Internet. From a massive amount of multimedia data, it is very difficult for a common user to obtain her/his preferred items. The main problem is that the gap between user contents and multimedia contents is so large that very little previous work succeeds in building a great multimedia recommender system. As shown in Figure 1, a traditional recommender system usually employs the collaborative filtering (CF) technique, based on the assumption that people can be grouped to share their valuable purchase information with each other to make a choice about the preferred items, such as movies, books, etc. Unfortunately, these contemporary recommender systems always focus on eliciting the near-expert knowledge to predict the user's preference, but ignore the semantic contents existing in the discovered rules. In fact, from a human perspective, semantic relationships between conceptualized multimedia contents and high-level human concepts are much easier to understand, providing users with good support for identifying their preferences. For instance, the rule "{Young, Female}→{Tom_Cruise}" is more natural than the rule "{Young, Female}→{color1, texture2, shape3}". Therefore, for conceptualizing multimedia contents, some annotation models using machine learning and data mining techniques have been proposed over the past few years.

Figure 1. The framework of the traditional recommender system.

Automated annotation of multimedia contents actually does users a great favor in semantic multimedia retrieval. In addition to multimedia retrieval, semantic annotation can also convey information about what users want. On the basis of this notion, in this paper we implement an intelligent recommender system by integrating annotation and
association mining techniques. On the one hand, integrating visual features, speech features and
several annotation approaches we ever developed are frequent patterns.
used to conceptualize multimedia data, including
music and videos. On the other hand, the efficient 2.2 Recommendation strategy
association mining approach can discover implicit
relations between conceptualized multimedia contents In recent years, many practical recommender
and user’s interest hidden in user’s behaviors. The systems for multimedia data have been developed to
experimental results reveal that the implemented provide users with personalized service, e.g., TiVo [2],
system can really catch user’s intention. MythTV [9] and so on. In these practical systems, the
The remainder of this paper is organized as follows. common characteristic is that the lists of recommended
A review of related work is described in section 2. In items are generated by users’ logs with or without
section 3, we introduce our proposed recommender considering the relationships among users. In addition
system. Conclusions and future work are discussed in to above systems, a lot of solutions for
section 4. recommendation have been widely used to deal with
the recommendation problems. In this field, through
2. Related Work subjective organization, some of model-based CF
algorithms regard recommendation as behavior
In this section, the related past studies can be classification. Nikovski et al. [8] presented an
categorized into two main categories: Multimedia induction of compact decision trees as the optimal
annotation and Recommendation strategy. In the recommendation policy. Breese et al. [4] attempted to
followings, we are going to review and discuss the solve the problems of personalized recommendation by
related work further. Bayesian network. Another kind of recommendation
strategy is memory-based algorithms that find the
2.1 Multimedia annotation nearest neighbors to the active user by their similar
behaviors. Social filtering is a classic user-based
Generally speaking, the goal of multimedia method that estimates the potential groups to share the
annotation is to provide search engines to assist user in preference information with the members in the same
accomplishing the conceptual multimedia retrieval. groups [3][12]. Whatever the previous
Mainly, it takes advantage of low-level multimedia recommendation strategy is, the semantic relations
contents and high-level textual information to predict between users and items are ignored. However, the
the suitable keywords. CRM (Continuous Relevance semantic relations can really assist system
Model), is a classic statistics-based method developed administrators or users in analyzing the preference
by Lavrenko et al. [5][7] for annotating a video. It information.
segments each sequential key-frame into several
rectangle regions and then extracts the related visual 3. Proposed Recommender System
features from these segmented regions. The
annotations of each image are yielded soon after As mentioned in previous sections, the effective
calculating the related probabilities with Gaussian factor for recommender systems lies in the main
Mixture Function. By assuming that regions in an percept: how to establish the semantic relations
image can be described using a small vocabulary of between users and items. In this section, we will
blobs, Jeon et al. [6] proposed probabilistic models describe the proposed approach for integrating the
that allow us to predict the probability of generating a low-level multimedia contents and the high-level
word given the blobs in an image. Adams et al. [1] user’s contents to achieve the proposed recommender
proposed a method integrating Gaussian Mixtures system.
Model and Hidden Markov Model to discover the
concepts of videos. Tseng et al. [17] proposed a novel 3.1 Overview of the proposed system
web image annotation method, namely FMD (Fused
annotation by Mixed model graph and Decision tree), Basically, the proposed system can be viewed as an
which combines visual features and textual expert system. As illustrated in Figure 2, three main
information to conceptualize the web images. components are developed to generate the great
Additionally, for video annotation, Tseng et al. recommendation list for multimedia data. First, low-
[14][15][16] also developed a hybrid method to endow level multimedia contents have to be converted into
the videos with semantic-relevant keywords by high-level concept terms by our proposed self-learning

annotation. Second, by jointing high-level concept some key-objects. The main tasks of this work are to
terms, user profiles and usage logs, mining engine annotate the shots and the referred key-objects. To this
discovers the implicit rules as the knowledge base to end, we adopt different annotation strategies to finish
help recommendation agent infer the personalized the different kinds of annotation tasks.
recommendation list. Finally, users can rate each item
on the recommendation list and store the rating results z Multi-level CBA (Classified By Association)
into the usage database. The detailed descriptions are model
stated in the following subsections.
This method is mainly based on the work in [18]. In
this work, we proposed a new image classification
approach by using multiple level association rules
based on these image objects. The proposed approach
can be decomposed of three phases: (1) Building of
conceptual object hierarchy, (2) Discovery of
classification rules, and (3) Classification and
prediction of images.

Figure 2. The framework of the proposed system.

3.2 Self-learning annotation

To facilitate the knowledge mining and preference


discovery procedures, various learning approaches are
employed to imply good annotation results. Basically,
our proposed annotation tool can handle three kinds of
multimedia data, including images videos and music.
Different kinds of multimedia data are annotated by
different annotation strategies. In the followings, we
Figure 4. Procedure of Multi-level CBA model [18].
will further describe how to annotate different
multimedia data adaptively.
z CRM model (Continuous Relevance Model)

This model is built based on CRM method proposed


in [7]. As depicted in Figure 5, through the
computations of probabilities between the keywords
and the regions, the proper keywords can be derived
by their probabilities. The practical probability
formulation will be given in the followings.

Then we can obtain a probability list containing the
probability of each keyword while annotating an
unknown shot. In these functions, w is the set of all
Figure 3. The self-learning annotation model. candidate keywords, r is the set of all regions, N is the
number of all shots and J is a specific un-annotated
3.2.1 Annotation for images and videos shot.

In this work, as illustrated with Figure 3, a video z ICOA model (Intelligent Concept-Oriented
clip can be decomposed into several shots containing Annotation model)

visual distance between the objects of different
As shown in Figure 6, this work proposes a hierarchies is close enough, for example as shown in
framework to catch the high-level concepts by Figure 7, {r7→r4}, {r6→r3} and {r5→r4}.
performing content-based image understanding
approach. It first segments the shot into several regions
and identifying each segmented region by
classification models, e.g., decision tree, SVM, KNN
and so on.

Figure 7. MMG model [10].

z Multi-Content mining model

In this method, we propose an innovative method


for semantic video annotation through integrated
mining of visual features, speech features and frequent
semantic patterns existing in the video. As shown in
Figure 5. Procedure of CRM model.
Figure 8, the proposed method mainly consists of two
main phases: 1) Construction of four kinds of
predictive annotation models, namely speech-
association, visual-association, visual-sequential and
statistical models from annotated videos, and 2) Fusion
of these models for annotating un-annotated videos
automatically.

Figure 6. Procedure of ICOA model.

z MMG model (Mixed Media Graph)

This notion of this model based on [10]


concentrates on connecting the objects and the
captions via the shots. For example, as illustrated in
Figure 7, (A) stands for two annotated images {I1, I2}
which referred captions are {sea, sun, sky, waves} and
Figure 8. Procedure of Multi-Content mining model [16].
{cat, forest, grass, tiger} with respect to {t1, t2, t3, t4}
and {t5, t6, t7, t8}, and which referred main objects are
3.2.2 Annotation for music
{r1, r2, r3, r4} and {r5, r6, r7}. To provide a sufficient
keyword exploration, a three-layer MMG model, as
In contrast to videos, music contents are simpler to
shown in (B), can be formed as a hierarchical structure
be analyzed. However music is so popular that we
according to the information from (A). Each hierarchy
have to keep an eye on the solution for semantic music
of them is an image containing the referred captions at
annotation. As depicted in Figure 9, music in Mpeg 3
the 1st level and the referred objects at the 3rd level.
is first transformed into MDCT (Modified Discrete
The connection of each hierarchy is determined if the

Cosine Transform) value. According to the Currently we have implemented an annotation tool
transformed features, four semantic feature terms can that integrates various annotation models for images,
be derived by annotation model or our defined videos and music. In Figure 10, the interface can be
functions. divided into several major panels. The operation
procedure starts with opening an un-annotated
multimedia data by panel (H). Next, the multimedia
data is split into some shots and users can manually
annotate the first-k shots by panel (A) and (C).
Afterward the training operation is performed by panel
(E). Thereby, automated annotation terms can be
inferred by panel (E) and the results are shown in
panel (B), (D), (G) and (F). Finally the annotation
terms are stored into the database. The generated
annotation terms are important elements for
rule/knowledge mining. Without these annotation
terms, the multimedia data cannot be semantically
connected with user’s interest.

Figure 9. Procedure of music annotation.

Figure 10. Proposed Annotation tool.

clarify the specific preference shared by the users on


3.3 Mining Engine similar behaviors. A recommendation session can be
viewed as a transaction and a transaction contains the
Because the projection between shots and linguistic annotation terms and special user contents viewed as
terms has been made by annotation models, the items. In this work, we adopted CBW approach [11] to
recommendation results are easier to be explained in mine the rules since CBW is an efficient algorithm for
human sense. The notion of mining engine is used to association mining. Thus, a transformed transaction
discover the multimedia annotation terms frequently can be defined as:
purchased by the specific user groups. That is, the < UserContent, ItemContent >
where UserContent denotes the set of the user contents,
annotation terms and special user contents frequently
ItemContent denotes the set of the item contents including
occurring in a list of transactions, called transaction
annotation terms. Let’s take an example of association
table shown in Table 1, can help recommender system
rule.

{(Gender, Female), (Age, Young), (Occupation, Teacher), (Residence, Taipei)} → {(Category, Love), (Actor, Tom_Cruise)}

says that most female teachers living in Taipei like watching love movies starring Tom_Cruise. This kind of frequent pattern can assist not only the recommender system in eliciting the best recommendation list but also the system administrator in strengthening the personalized service.

Table 1. The transaction table.

Transaction id | Items
1001 | Female, Leo, Young, Teacher, Taipei, Love, Tom_Cruise
1002 | Female, Libra, Young, Teacher, Taipei, Love, Tom_Cruise
1003 | Male, Taurus, Young, Student, Kaohsiung, Love, Meg_Ryan

3.4 Recommendation Agent

After elaborating on self-learning annotation and the mining engine, we now explain how to yield the recommendation list by the rule-matching-based method. In our system, the traditional first-rater problem can be prevented because the preference of an active user who has never visited our system can be predicted by the rules. Otherwise, for system members, the personal log and the generated rules are both considered in reasoning about the preferred multimedia products. As shown in Figure 11, if the recommendation agent cannot find rules whose left-hand patterns match the active user's contents, hot multimedia products are then found by the issuing date and the purchasing frequency. Whatever the case, the top 10 multimedia products are judged finally.

Figure 11. The framework of recommendation agent.

3.4.1 Search Multimedia by Rules

We then continue by describing how to search multimedia products by rules. As shown in Figure 12 and Figure 13, if the cardinality of the right-hand contents is larger than one, the recommendation agent has to look for all of the products containing the right-hand contents, pattern by pattern. Then it generates the intersection set of the found products. Note that the final results are still sorted by the issuing date and the purchasing frequency.

Figure 12. Procedure for searching multimedia by rules.

Figure 13. Procedure for selecting multimedia by right-hand content.

Figure 14. Procedure for searching multimedia by personal log.

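To make the rule-matching flow of Sections 3.4 and 3.4.1 concrete, here is a small Python sketch. The data structures (a set of (attribute, value) pairs for the active user, rules as left-hand/right-hand sets from the mining engine, and a product list carrying an issuing date and a purchase count) are our assumptions, as is the function name; the fallback for users who match no rule simply ranks all products by issuing date and purchasing frequency, as described above.

```python
def recommend(user_content, rules, products, top_k=10):
    """Rule-matching recommendation along the lines of Sections 3.4-3.4.1.

    user_content : set of (attribute, value) pairs describing the active user
    rules        : list of (lhs_set, rhs_set) association rules
    products     : list of dicts with keys 'id', 'contents' (set of item-content
                   terms), 'issued' (comparable date) and 'purchases' (int)
    """
    candidates = set()
    for lhs, rhs in rules:
        if lhs <= user_content:            # rule left-hand side matches the user
            # Look up products for each right-hand content, pattern by pattern,
            # then keep only the intersection of the matched product sets.
            per_pattern = [{p['id'] for p in products if pattern in p['contents']}
                           for pattern in rhs]
            if per_pattern:
                candidates |= set.intersection(*per_pattern)
    if not candidates:
        # No matching rule: fall back to "hot" products ranked by issuing date
        # and purchasing frequency, as the paper describes for unmatched users.
        candidates = {p['id'] for p in products}
    ranked = sorted((p for p in products if p['id'] in candidates),
                    key=lambda p: (p['issued'], p['purchases']), reverse=True)
    return [p['id'] for p in ranked[:top_k]]
```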
Horror, Science-Fiction}, which are ever purchased by
3.4.2 Search Multimedia by Personal Log the active user u. Table 2 shows an transaction
example for u. From the transactions, actor X is the
This operation is mainly aimed at the special most favorite actor for u. Table 3 shows that the
members whose contents cannot match any rule in the transaction table of the favorite actor X for u. As
rule database. As shown in Figure 14, first, all of shown in Table 4, for each category, the amount of
transactions for the special member have to be recommended products can be computed by the
collected. Second, select the most five favorite referred ratio.
multimedia categories by the referred purchase
frequency. Third, find the most favorite actor whose 4. Conclusions
purchase count is the largest. Fourth, combine the most
five favorite multimedia categories and the additional Collaborative-filtering based recommender system
category presented by the most favorite actor into six has been shown that it can provide users with the great
candidate categories. At last, according to the purchase support to make a decision among a huge amount of
ratio of each category, select un-purchased products precuts. However, semantic relations between user’s
for each candidate category. interest and the products are ignored in past studies. In
this paper, we have presented an intelligent
Table 2. The transaction table for the active user. recommender system to bridge user’s interest and the
products by integrating annotation and association
mining techniques. The low-level multimedia contents
can be converted into the linguistic terms and the
discovered relations are indeed very valuable for the
recommendation strategy and the system administrator.
In the future, we will further investigate the
aggregation for other recommendation strategies.

Acknowledgement

Table 3. The transaction table of the favorite actor X for the This research was supported by Ministry of Economic
active user. Affairs, R.O.C., under grant no. 96-EC-17-A-02-51-
024.

References
[1] W. H. Adams, G. Iyengar, C. Y. Lin, M. R. Naphade, C.
Table 4. The resulting table for recommendation. Neti, H. J. Nock, and J. R. Smith. “Semantic Indexing of
Multimedia Content Using Visual, Audio, and Text Cues.”
EURASIP Journal on Applied Signal Proceeding, Vol. 2003,
Issue 2, pp. 170-185, 2003.
[2] Kamal Ali and Wijnand van Stam,”TiVo: Making
Show Recommendations Using a Distributed Collaborative
Filtering Architecture”, in Proceedings of KDD, 2004.
[3] Chumki Basu, Haym Hirsh, and William Cohen.
Recommendation as Classification: Using Social and
Content-Based Information in Recommendation. In
Proceedings of AAAI, 1998.
[4] John S. Breese, David Heckerman, and Carl Kadie.
Empirical Analysis of Predictive Algorithms for
Collaborative Filtering. In Proceedings of the 14th
Conference on Uncertainty in Artificial Intelligence, 1998.
[5] S. L. Feng, R. Manmatha, and V. Lavrenko, “Multiple
Bernoulli Relevance Models for Image and Video
Annotation,” IEEE Computer Society Conference on
Example 1. Suppose that there are 12 products Computer Vision and Pattern Recognition, Vol. 2, pp. 1002-
belonging to 3 categories {Action, Adventure, Love, 1009, 2004.

[6] J. Jeon, V. Lavrenko, and R. Manmatha, “Automatic “Classify By Representative Or Associations (CBROA): A
Image Annotation and Retrieval using Cross-Media Hybrid Approach for Image Classification”, in Proceedings
Relevance Models.” In Proc. of the 26th annual of the Sixth International Workshop on Multimedia Data
international ACM SIGIR conference on Research and Mining (KDD/MDM), Chicago, IL, USA, August 21-24,
development in information retrieval, pp. 119 – 126, 2003. 2005.
[7] V. Lavrenko, S. L. Feng, and R. Manmatha, “Statistical [14] Vincent. S. Tseng, Ja-Hwung Su, and Chih-Jen Chen,
Models for Automatic Video Annotation and Retrieval,” in "Effective Video Annotation by Mining Visual Features and
Proc. of the International Conference on Acoustics, Speech Speech Features", in Proceedings of the third International
and Signal Processing, May 2004. Conference on Intelligent Information Hiding and
[8] Daniel Nikovski and Veselin Kulev. Induction of Multimedia Signal Processing, Kaohsiung, Taiwan,
Compact Decision Trees for Personalized Recommendation. November 26-28, 2007.
In Proceedings of the ACM symposium on Applied [15] Vincent. S. Tseng, Ja-Hwung Su and Jhih-Hong Huang,
computing, 2006. “A Novel Video Annotation Method by Integrating Visual
[9] J.A. Pouwelse, M. van Slobbe, J. Wang, H.J. Sips, Features and Frequent Patterns”, in Proceedings of the
"P2P-based PVR Recommendation using Friends, Taste Seventh International Workshop on Multimedia Data Mining
Buddies and Superpeers", Beyond Personalization 2005, (KDD/MDM), Philadelphia, PA, USA, August 20-23, 2006.
Workshop on the Next Stage of Recommender Systems [16] Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and
Research, San Diego, USA, Jan 2005. Chih-Jen Chen, "Integrated Mining of Visual Features, Speech
[10] J.Y. Pan, H.J. Yang, C. Faloutsos, P. Duygulu, Features and Frequent Patterns for Semantic Video
“Automatic multimedia cross-modal correlation Annotation," IEEE Transactions on Multimedia, vol. 10, no. 1,
discovery.” In Proc. of the tenth ACM SIGKDD 2008.
[17] Vincent. S. Tseng, Ja-Hwung Su, Bo-Wen Wang and
international conference on Knowledge discovery and
Yu-Ming Lin, "Web Image Annotation by Fusing Visual
data mining, August 22-25, 2004, Seattle, WA, USA. Features and Textual Information", in Proceedings of the
[11] Ja-Hwung Su and Wen-Yang Lin, “CBW: An Efficient
22nd ACM Symposium on Applied Computing, Seoul, Korea,
Algorithm for Frequent Itemset Mining,” in Proceedings of
March 11 – 15, 2007.
the 37th Annual Hawaii International Conference on System
[18] Vincent S. Tseng, Ming-Hsiang Wang and Ja-Hwung
Sciences, Hawaii, January 2004.
Su, “A New Method for Image Classification by Using
[12] U. Shardanand and P. Maes. Social Information
Multilevel Association Rules”, in Proceedings of the 1st
Filtering: Algorithms for automating “word of mouth”. In
IEEE International Workshop on Managing Data for
Proceedings of the SIGCHI conference on Human factors in
Emerging Multimedia Applications (EMMA), Tokyo, Japan,
computing systems, Pages 210 - 217, 1995.
April 2005.
[13] Vincent. S. Tseng, Chon-Jei Lee and Ja-Hwung Su,


A COMBINED PLATFORM OF WIRELESS SENSORS AND ACTUATORS


BASED ON EMBEDDED CONTROLLER

S. C. Mukhopadhyay, G. Sen Gupta and R. Y. M. Huang*


Massey University, Palmerston North, New Zealand
* National Cheng-Kung University, Tainan, Taiwan

Abstract: This paper presents a combined platform of wireless sensors and actuators based on an embedded controller. To prove the concept, a fireworks detonation system utilizing MOSFETs to switch high-current bursts has been developed. All required electronics, the enclosure panel and the control software have been fabricated. Commands are intelligently issued using an intuitive user interface. Manual control of channel firing and testing, as well as a scripting system, allow complex and visually stunning firework demonstrations to be put together.

Keywords: Wireless sensors and actuators, MOSFET, Fireworks, Embedded Microcontroller, Detonation, Wireless, Electric-Match.

1 INTRODUCTION

The main idea is to develop a combined platform of wireless sensors and actuators based on an embedded controller. In order to achieve the above objective, a project on wireless fireworks detonation systems for public entertainment has been taken under consideration. The implementation of the complete project matches very well with the broader objective of wireless sensors and actuators.

Large public fireworks displays must be choreographed to a high standard in this day and age in order to meet public expectations. The designer of such a complex display requires the use of a computer-controlled system in order to achieve the demanded level of timing and accuracy. But computer control of firework detonation alone is not enough; in order to reduce the complexity and workload of wiring to each electric match, a wireless system can improve reliability, reduce setup time and cost, all while improving the flexibility of positioning the control system, leading to an increase in the safety of pyrotechnicians.

Such wireless detonation and control systems are already commercially available, but none offer a modern, all-in-one package that includes control and choreography software along with hardware modules that are up to a high standard and designed for use with the latest operating systems. Such systems are also extraordinarily expensive, ranging in price from US$2000 for a very basic system up to several thousand US dollars for a high-end system.

Most of the systems, in the higher price brackets at least, include the ability to script firework detonations, and there are very few software systems on the market that allow you to synchronize the detonations to music and simulate the show as a graphical 3D visualization. Such a system could cost the user upwards of US$10,000 for the hardware and software. This is far out of reach of smaller national and regional firework display operators.

A system is required that offers comparable features at a much reduced cost. The presented system consists of two main aspects: the remote firing module and the PC control and choreography software.

Figure 1: System connectivity diagram

2 OPERATIONAL ENVIRONMENT

Such a system would be used in a very dangerous and hectic environment; the remote firing module will be placed in close proximity to live explosives, and thus should be of rugged design. The potential

for unpredictable weather is also a possibility. The
remote module should be able to operate reliably at
the long ranges (up to 1km) required.

Traditional systems have used very long runs of


cable, up to several hundred meters, for each fire-
work connected. This increases the setup time and Figure 2: A length of electric match
cost significantly, and can also lead to placing tech-
nicians too close to the firework shells when they 2.3 POWER SUPPLY
are being fired because of short lengths of cable
being used in order to reduce the amount of wiring A power supply is required that can deliver high
required. Such trade-offs between safety and setup currents (2 to 3 Amperes) while remaining small
should not be needed. and portable.
Because fireworks are classified as dangerous ex- Valve Regulated Lead-Acid (VRLA, also known as
plosives the safety of all technicians and the public sealed lead-acid) batteries possess these traits, and
in the vicinity of the device and/or fireworks is of are the only practical choice. The two main tech-
the utmost importance. Safety mechanisms both on nologies available are gel-cell and Absorbed Glass
the firing module and in the PC control software Mat (AGM), both technologies are very similar and
must be included in order to avoid the unintentional allow the battery to be stored, used and charged in
detonation of any firework. virtually any position. AGM is the newer technol-
ogy and is rapidly replacing gel-cell batteries in
These large public displays by their very nature many applications because of their better character-
attract very large numbers of people, often over istics and cheaper price. These lead-acid batteries
20,000. With such large crowds of people all in are reasonably compact, high capacity, and are ca-
anticipation of the main fireworks display event, pable of supplying over 100A at 12V, making them
things can become stressful, and the possibility of ideal for our purposes.
human error can come into play; these safety fea-
tures can be crucial to a successful outcome.
2.4 WIRELESS COMMUNICATIONS
HARDWARE DESIGN

OVERVIEW
MaxStream series of RF modems operate at 2.4GHz
Each electric-match that can be fired by the module and provide up to 16km range with a high gain an-
is connected to a “channel” (an individual, control- tenna. Communication with these devices is made
lable electric-match connection). MOSFETs are simple, needing only a serial port on the PC, or can
used to switch current flow through the channels be talked to directly from a microcontroller that
and these are controlled directly by an onboard mi- supports UART.
cro-controller. The system contains an LCD module
for displaying status and diagnostic information, as Security for the communications at this stage con-
well as local safety on/off switches for both firing sists of a ‘hacker’ not having access to the data pro-
and testing functionality for enhanced safety meas- tocols being used, and that a publicly uncommon
ures, again to protect technicians wiring the system. brand/model of RF modem is in use that is some-
The prototype system has 30 channels available per what expensive because if its industrial grade na-
module. ture. In the future it will be necessary to implement
an encryption scheme for all data communicated
with passwords set in the PC control software, and
2.2 ELECTRIC MATCHES entered into the remote module.
An electric match is used to provide a small initial
explosive charge to light a firework shell’s primary 2.5 PROCESSING
fuse. They are typically constructed from lengths of
22-gauge insulated wire joined by a small bridge- A Silicon Labs C8051F020 microcontroller operat-
wire coated in a pyrotechnic mixture that will ignite ing at 22.118450 MHz is the heart of the module,
when heated [1]. The matches are 1.8 to 2 ohms in providing all processing abilities, and interfacing
resistance, and can be ignited by applying a current with the RF module. This microcontroller is ideal
of approximately 1A (depending on model) through for the needs of this project as it has 64 GPIO lines,
the match [2]. These are connected to the module onboard ADCs, UART and several timers.
using spring-loaded speaker terminals for ease of
use.

2.6 SWITCHING DEVICE

Relays are commonly used in similar situations where high-current switching is required. They offer many benefits such as electrical isolation and the ability to switch very large currents with minimal power loss, but they also have several disadvantages, which primarily include needing relatively high currents to do the switching, large size and limited lifespan. Due to their mechanical nature, contact bounce and electrical arcing can reduce the lifespan of relays significantly.

By using power MOSFETs we can switch quite high currents (up to 20A) with very small surface-mount DPAK packages. These are also much cheaper per unit and increase the reliability of operation drastically.

Figure 3: MOSFET switching

Figure 3 shows the timing of an electric-match being fired. The bottom line is the current through the match, measured using a hall-effect current probe with 100mV/A, and the top line is the voltage developed across the match. As can be seen, the electric-match takes approximately 1.5ms to blow with 2.5A passing through it.

The firing MOSFETs stay on for 500ms in order to ensure the connected electric-match is successfully fired, so as can be seen above, once the match has blown and gone open circuit, 12V is seen across the channel terminals with no current being drawn.

2.7 FIRING AND TESTING

To fire the electric matches, several amps of current are applied across the fuse-wire for a short burst of time for each match; this allows several to be fired in parallel. Each electric match connection is controlled by a MOSFET, with another common firing MOSFET to control current flow. Current is limited by a 2 ohm power resistor; this in series with the electric-match will deliver about 3A of current.

Testing is accomplished through a similar process, but limiting the current much further. While current is switched on to the electric match, an ADC reading is taken to measure the voltage developed across the match, and thus the resistance of the match can be derived, giving an indication as to whether the match/channel's connection is good or has failed short circuit or open circuit. This is very important for diagnostic analysis of the show's setup and for ensuring the safety of technicians.

$R_{fuse} = \dfrac{\left(\dfrac{ADC \times 3.3}{2^8}\right) \times R_{test}}{12 - \dfrac{ADC \times 3.3}{2^8}}$   (1)

This is the equation for determining the resistance of the electric-match (fuse), where "ADC" is the measured ADC value. The ADC has 8-bit resolution running on 3.3V, the module voltage is 12V, and Rtest is the testing current-limit resistor.

Figure 4: Firing and testing schematic

2.8 DEVELOPED HARDWARE

Figure 5: Development system hardware: detonator board and microcontroller board

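As a small illustration of the match test, the helper below (the function name and signature are ours) evaluates Equation (1): it converts the 8-bit ADC sample back into the voltage measured across the match and then into the match resistance via the divider formed with the test current-limit resistor; the 2^8 divisor follows from the stated 8-bit, 3.3 V ADC.

```python
def match_resistance(adc_reading, r_test, vcc=3.3, adc_bits=8, v_module=12.0):
    """Electric-match resistance from a test-current ADC sample, per Eq. (1)."""
    v_match = adc_reading * vcc / (2 ** adc_bits)   # voltage across the match
    return v_match * r_test / (v_module - v_match)  # divider with R_test at 12 V

# A value near zero ohms or an implausibly large value would indicate a
# short-circuited or open match/channel connection, which is what the
# pre-show test is looking for.
```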
3 FIRMWARE DESIGN

3.1 ADDRESSING

Each module in the system is assigned a one charac-


ter alphabetical address (i.e. ‘A’ through ‘Z’);
though this could easily be transferred to a 0-255
number system if an increased number of modules
are required.

The MaxStream RF modules used handle all low-


level encoding necessary, and each has its own
unique network address that can be given a subnet
mask. Therefore the RF module connected to the
control computer can be on one subnet, with all the firing modules on another subnet; this way packets sent by firing modules will only be received by the control computer.

Figure 6: Completed firing system board with attached RF module

3.2 COMMUNICATION PACKET STRUCTURE

Every packet sent and received in the system begins


with a four byte command string; this reduces the
processing overhead required by the microcontrol-
ler. This is followed by the address of the module
the packet is intended for, or has originated from.
Any subsequent data specific to the command or
response being sent is then appended to the packet.

Table 1: Packet Structure

Index | Length | Type | Description
0 | 4 | Char | Command ID string
4 | 1 | Char | Module ID
5 | 1 | Byte | Message length
6 | * | Byte | Message data

Figure 7: Underside view of firing system board

3.3 COMMAND AND RESPONSE SET

Commands are sent from the controlling PC to the


remote module, and are responsible for controlling
the core actions of the module, such as testing and
firing channels and requesting status responses from
the module such as safety switch states, number of
channels available on the module etc.
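A minimal sketch of packing and unpacking a packet with the Table 1 layout is shown below. The command string and channel encoding used in the example are illustrative only and are not taken from the paper's actual command set.

```python
def build_packet(command_id: str, module_id: str, payload: bytes = b"") -> bytes:
    """Assemble a packet with the layout of Table 1:
    4-byte command ID string, 1-byte module ID, 1-byte message length, data."""
    assert len(command_id) == 4 and len(module_id) == 1
    return (command_id.encode("ascii") + module_id.encode("ascii")
            + bytes([len(payload)]) + payload)

def parse_packet(raw: bytes):
    """Split a received packet back into (command, module, payload)."""
    command = raw[0:4].decode("ascii")
    module = chr(raw[4])
    length = raw[5]
    return command, module, raw[6:6 + length]

# Hypothetical example: build_packet("FIRE", "A", bytes([3])) would ask
# module 'A' to act on channel 3.
```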

Figure 8: Firing module enclosure and panel

4 CONTROL SOFTWARE

The control software has been written in C# and is split into three sections, each accessible by a tab: manual, scripting and choreography. In this paper

we will focus primarily on the manual and scripting belongs to. These timecodes are then uploaded to
interfaces. the remote module which will start stepping through
and firing them at the precise times when given the
start command.

Scripts can be edited within the application and any


syntax errors or out-of-bounds values will be
brought to the user’s attention during compilation.
The show designer module of the software will al-
low for cues to be synchronized with music and
using advanced particle effects [3] such as trails,
blurs and gravity, simulate a show. Fireworks will
then be assigned to physical channels and the result
saved to a script file.

5 ACKNOWLEDGEMENTS

James Tinsley built an initial detonator board proto-


type last year using a Silicon Labs development
Figure 9: Manual control interface board and relays to prove the concept, which was
quite helpful. Ken Mercer was also helpful in con-
The manual control interface allows the operator to tributing ideas to the project.
view which modules are active, how many channels
each contains and the status of each channel. From 6 REFERENCES
here channels can be individually tested and fired.
Before testing or firing is allowed, the Arm Testing
1 A. Yates, Electrical Ignition,
and Arm Firing buttons must be clicked to access to
http://www.vk2zay.net/article.php/14, ac-
these features. This both enables the controls on-
cessed 10th April 2007.
screen, and sends a signal to the remote firing mod-
ule indicating that firing and/or testing has been
2 Pyromate, Basics of Electrical Firing,
enabled.
http://www.pyromate.com/Basics-of-
Electrical-Firing.htm, accessed 10th April
The firmware on the remote module keeps track of
2007.
these states, and only allows firing or testing when
both the PC and the module’s safety switches are
3 T. Loke, D. Tan, H. Seah and M. Er, “Render-
off.
ing Fireworks Displays,” Computer Graphics
and Applications, Volume 12, pp 33-43, May
Scripting allows the show operator to plan and de-
1992.
sign the show, getting very accurate timing between
firing channels. The scrip file specifies which music
file to play in the background, and has a list of
timecodes for when to fire each channel. An exam-
ple script file is given below:
music
{
“D:/show soundtrack.wav”
}

module A
{
# comments
fire 1, 00:01:000
fire 2, 00:01:500
fire 3, 00:05:200
}

Fire times are specified in the format minutes : sec-


onds : milliseconds along with which channel it

474
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

A Novel Multiphysics Sensoring Method Based on Thermal and EC


Techniques and Its Application for Crack Inspection

Cheng-Chi Tai and Yen-Lin Pan


Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
ctai666@gmail.com, ncku.ylpan@gmail.com

Abstract

Crack inspection is a critical issue in quantitative nondestructive evaluation (NDE). The eddy current (EC) method is effective for detecting surface discontinuities, but the spatial resolution of the conventional EC method is constrained by the size of the EC probe. The photoinductive (PI) imaging method is an NDE technique that combines EC and laser-based thermal imaging methods. It has been clearly demonstrated in practical experiments that the PI method is a high-resolution technique. In this paper, numerical multiphysics simulations of PI imaging have been performed with 2D transient models using the finite element method (FEM) to characterize corner cracks at the edge of a Ti-6Al-4V bolt hole specimen. The FEM simulation results for 0.25-mm, 0.50-mm, and 0.75-mm rectangular notches are shown and discussed. The results show that PI imaging has higher spatial resolution in the area of the defect in 2D models compared with conventional EC images. We demonstrate that the PI method is a novel sensoring method for characterizing the geometric shape of cracks.

1. Introduction

Traditionally, the eddy current (EC) method is the primary nondestructive evaluation (NDE) method used to inspect surface and near-surface cracks in metal. However, one of the main disadvantages of the EC method is its low spatial resolution, which is constrained by the diameter of EC probes. The photoinductive (PI) method is a hybrid NDE technique that combines EC and laser-based thermal wave methods. It is similar to photothermal (PT) imaging, which generates an interaction of thermal waves with the material by a light flux. A hot spot of the laser beam is focused on the specimen surface; the resulting variation of electrical conductivity changes the eddy currents and thus the impedance of the EC sensors. The use of a focused laser beam provides the PI method with a microscopic resolution while using EC pickup sensors.

Moulder et al. [1] showed that this new technique dramatically increases image resolution, and that it can be used to calibrate and characterize EC probes [2-4]. The method experimentally showed the high-resolution capability inherent in this technique by adapting a PI sensor developed for a fiber optic probe to an existing photoacoustic microscope [1]. The same method works equally well to characterize cracks on thick metals. Tai et al. [5] showed from practical experiments that the PI method can be used to reveal the geometrical structure (such as depth and length) of corner cracks on the surface surrounding a bolt hole. They demonstrated that the PI method has the potential to be used in real-world applications.

In this work, we use the finite element method (FEM) to simulate the PI imaging technique for the cases of bolt-hole cracks. Based on the simulation results, we also discuss the effects of crack sizes and compare the PI results with EC images for 0.25-mm, 0.50-mm, and 0.75-mm rectangular notches.

2. The photoinductive imaging method

The PI imaging method is a novel multiphysics sensoring method that combines EC and thermal-wave methods. The physical principles underlying it are illustrated in Figure 1, which shows the coil of an EC probe carrying a current placed in close proximity to the specimen surface. The same coordinate system is used for all experiments in this paper. A focused laser beam generates a localized hot spot on the specimen surface from above. The temperature fluctuation causes a local change in the electrical conductivity, which in turn induces a change in the impedance of the EC probe. The basic theory of the PI technique is introduced in this section.

Figure 1. Inspection geometry of the photoinductive field measurement technique.

The temperature variation affects the electrical properties of the specimen metal, such as its conductivity (σ) and permeability (μ). To simplify the calculations, the perturbed fields can be replaced by the unperturbed fields with good accuracy. In this study a nonmagnetic Ti-6Al-4V metal is used, so the fluctuation in permeability can be ignored. The variation in conductivity (dσ) can be calculated via the thermal fluctuation dT using the following equation:

dσ = (dσ/dT) dT    (1)

The electrical conductivity of the specimen at temperature T is given by the expression:

σ_T = σ_T0 + dσ = 1 / [β_0 (1 + α(T − T_0))]    (2)

where σ_T0 and β_0 are the electrical conductivity and the resistivity of the specimen at temperature T_0, and α is the temperature coefficient of the resistivity. T_0 is the initial temperature, 293 K, and T is the actual temperature in the specimen sub-domain.

The dependent variable in this application mode is the azimuthal component of the magnetic vector potential, A, which conforms to the following relation:

(jωσ − ω²ε) A + ∇ × (μ⁻¹ ∇ × A) = σ V_loop / L    (3)

where ω denotes the angular frequency, σ the conductivity, μ the permeability, ε the permittivity, L the length, and V_loop the voltage applied to the coil. The conductivity outside the coil is zero. According to the constitutive relation (C.R.), the current density (J^e) can be calculated as follows:

J^e = σE = −σ(∇V + ∂A/∂t)    (4)

where E is the electric field intensity. The electric potential (V) is obtained from Faraday's law. The defining equation for the magnetic vector potential A is a direct consequence of the magnetic Gauss' law. The induced current (I) in the coil is calculated by integrating the current density over the cross-sectional area (S) of the coil:

∫_S J^e · ds = I    (5)

In electromagnetic fields, the metal specimen acts as a barrier to the transport of the eddy current, and the penetration of the fields into the metal is given by the skin depth (δ). As discussed in [5], although we can increase the EC skin depth to obtain a better image, lower EC densities will induce weaker PI signals. The EC skin depth strongly influences the impedance of the crack, as shown in Sec. 4 below.

δ = 1 / √(π f μ σ)    (6)

where f is the frequency of the EC probe driven with an ac current source.

For the multiphysics simulation, the PI image involves a complicated interaction of the eddy currents with the temperature field in the material. The main object of nondestructive inspection is to determine the shape of the crack, especially the depth and length information.

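To make Eqs. (2) and (6) concrete, the short Python sketch below evaluates the temperature-dependent conductivity and the resulting skin depth. The material constants are assumed placeholder values for a Ti-6Al-4V-like alloy, not parameters reported in this paper.

import math

# Hypothetical material constants; the paper does not list numerical values,
# so these are placeholders for illustration only.
BETA_0 = 1.7e-6      # resistivity at T0 (ohm*m), assumed
ALPHA = 3.5e-4       # temperature coefficient of resistivity (1/K), assumed
MU = 4e-7 * math.pi  # permeability of a nonmagnetic metal (H/m)
T0 = 293.0           # initial temperature (K)

def conductivity(T: float) -> float:
    """Eq. (2): sigma_T = 1 / [beta_0 * (1 + alpha * (T - T0))]."""
    return 1.0 / (BETA_0 * (1.0 + ALPHA * (T - T0)))

def skin_depth(f: float, sigma: float) -> float:
    """Eq. (6): delta = 1 / sqrt(pi * f * mu * sigma)."""
    return 1.0 / math.sqrt(math.pi * f * MU * sigma)

for T in (293.0, 573.0, 973.0):           # roughly 20, 300 and 700 degrees C
    s = conductivity(T)
    print(f"T={T:6.1f} K  sigma={s:.3e} S/m  "
          f"delta@600kHz={skin_depth(600e3, s) * 1e3:.3f} mm")

Heating the laser spot lowers the local conductivity, which in turn increases the skin depth, consistent with the trade-off between image quality and PI signal strength noted above.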
3. Specimen and simulation

The specimens used for this study are titanium blocks (Ti-6Al-4V) with 6-mm bolt holes. The notches are 0.25 mm, 0.5 mm, and 0.75 mm in both length and depth, and 0.2 mm in width. The copper coil probe (inner diameter = 2.54 mm, outer diameter = 4.1 mm, and length = 0.76 mm) was inserted in the bolt hole with the coil firmly positioned flush with the edge of the bolt hole. The probe was operated at frequencies from 100 kHz to 3 MHz. The temperature produced by the laser beam ranged from 100 °C to 700 °C. Table 1 lists the parameter levels used in the simulations in this study. The simulations were implemented using the COMSOL Multiphysics™ software. In this work, we use the simplified 2D model for comparing the characteristics of the EC imaging method and the PI imaging method, as shown in Figure 2.

Figure 2. The simplified 2D model for the PI and EC imaging methods (the bold dotted lines are the coil moving path and the laser point moving path).

Table 1. Parameter levels used in the simulation
Coil frequency (kHz): 100, 200, 400, 600, 800, 1000, 1500, 2000, 3000
Laser point temperature (°C): 100, 150, 200, 250, 300, 350, 400, 500, 600, 700
Crack length and depth (mm): 0.25, 0.5, 0.75

We designed the coil with a height equal to or greater than the depth of the notch. This was to be sure that the eddy currents surround the whole notch, so that the depth information can be revealed. The distance between the specimen and the coil (diameter = 1 mm, length = 0.76 mm) was 0.1 mm. In the EC scan, performed without the laser point, the coil moves along the x-axis and across the notch on the specimen. In the PI scan, the coil is fixed at the center of the notch and the laser moves in the same direction as in the previous case. A uniform scan plan with closely spaced scan lines was assumed, so that flaw orientation and scan spacing would not affect the outcome. To calculate the impedance (ΔZ = V_loop / I) of the coil in the simulations, the total induced eddy current of the coil is obtained by carrying out sub-domain integration of the total current density over the cross-section of the excited coil.

4. Results and discussion

The simulation results using the PI imaging method and the conventional EC method are presented and compared in this section. The effects of the eddy current were compared by varying the coil excitation frequency from 100 kHz to 3 MHz and the laser beam temperature from 100 °C to 700 °C. Owing to the crack, the eddy current density on the specimen flows around the crack. The temperature fluctuation causes a local change in the electrical conductivity of the specimen and in the current density of the specimen, and thus in the impedance of the coil.

The simulated interaction of the signals with various temperatures and frequencies is presented first. In order to clearly exhibit the crack's effect, the impedance difference between signals with and without the crack is reported. The effect of EC frequency on the EC imaging signal for a 0.5-mm long and 0.2-mm wide notch is shown in Figure 3. The position of the coil is 0.0 mm on the x-axis, with a 0.1 mm lift-off distance on the y-axis. As can be seen from Figure 3, an increase in frequency leads to a decrease in the rising quantity of the coil impedance. Therefore, higher eddy current frequencies give larger signals, and the signals approach saturation when the frequency reaches the MHz range. Although the signal amplitude is increased when higher eddy current frequencies are applied, the skin depth is smaller. For smaller cracks in this titanium alloy, the current frequency should be raised as far as possible for imaging the cracks.

Figure 3. Increasing impedance for different EC frequencies using the EC method, for a 0.5-mm rectangular notch in Ti-6Al-4V.

The effect of temperature on the PI imaging signal for a 0.5-mm long and 0.2-mm wide notch is shown in Figure 4. The position of the coil is the same as above, and the position of the laser point is -0.5 mm on the x-axis and 0.5 mm on the y-axis. When lower laser beam temperatures are applied, the amplitude of the signal is decreased, because reducing the temperature generates a higher current density and deeper penetration at the surface of this specimen.
Figure 4. Increasing impedance for different laser point temperatures using the PI method, for a 0.5-mm rectangular notch in Ti-6Al-4V.

The rising values then decay from 100 °C to 700 °C, as shown in Figure 4. In particular, the signal increases slowly from 500 °C to 700 °C. This may be because the temperature of the specimen is above the aging temperature of Ti-6Al-4V (510 °C). The other possible explanation is that the rising temperature reduces the skin effect, although the skin depth gradually becomes steady.

The comparison of EC impedance difference signals for three rectangular notches of different lengths is shown in Figure 5. The range between the two dashed lines is the crack, 0.2 mm in width. Although the longer crack lengths have a lower positive and a higher negative maximum, the peak-to-peak values of the EC signal for the different lengths appear the same. To discriminate the size, the difference between the positive peak at the crack and the value far from the crack could be used. This difference is bigger for the 0.25-mm crack than for the longer crack lengths. On the other hand, the shape of the crack is more conformable for the 0.25-mm length, and the signal near the crack is smoother than for the other sizes.

Figure 5. Variation of EC signals with the length and depth of the rectangular notch in Ti-6Al-4V. EC frequency, 600 kHz.

Figure 6 shows the effects of crack size on the PI impedance difference signals for transverse scans across three rectangular notches of different lengths. Because the longer crack has a weaker opposing magnetic field intensity at the eddy current coil, the signal level drops as the crack size is increased. The signal for the 0.75-mm notch is more uniformly distributed than the others at the far end. The shape of the smaller crack is more similar to the actual crack. Comparing the flaw impedance measured with the two detection methods for the rectangular notch, the resolution of the PI signal is higher than that of the EC signal, and the PI signal shows a sharper edge than the EC signal.

Figure 6. Variation of PI signals with the length and depth of the rectangular notch in Ti-6Al-4V. EC frequency, 600 kHz; laser temperature, 300 °C.

5. Conclusions

As the numerical simulation results demonstrate, we show the feasibility of the PI imaging method when applied to the detection of corner cracks. The EC frequency and laser beam temperature affect PI signal amplitude, resolution, and image quality. The higher resolution capability of the PI method is clearly visible in the area of the defect in 2D models when compared with the conventional EC method. The EC skin depth and laser temperature predominantly affect the determination of shape and length. Regarding the effects of notch size, at a 600-kHz EC frequency and 300 °C laser temperature the combination of parameters is more suitable for inspecting the smaller crack.

Acknowledgment
This research was supported by a grant from the
National Science Council, Taiwan (NSC 96-2628-E-
006-256-MY3). Also, this work made use of Shared
Facilities supported by the Program of Top 100
Universities Advancement, Ministry of Education,
Taiwan.

References
[1] J. C. Moulder, N. Nakagawa, K. S. No, Y. P. Lee, and J. F.
McClelland, "Photoinductive imaging: a new NDE technique,"
Review of Progress in Quantitative NDE edited by D. O.
Thompson and D. E. Chimenti (Plenum Press, New York), vol.
8A, p. 599, 1989.
[2] N. Nakagawa and J. C. Moulder, "Eddy Current Probe
Calibration via the Photoinductive Effect," NDT and E
International, vol. 28, pp. 245-246, 1995.
[3] J. C. Moulder and N. Nakagawa, "Characterizing the
performance of eddy current probes using photoinductive field-
mapping," Research in Nondestructive Evaluation, vol. 4, pp.
221-236, 1992.
[4] M. S. Hughes, J. C. Moulder, M. W. Kubovich, and B. A. Auld,
"Mapping eddy current probe fields using the photoinductive
effect " NDT & E International vol. 28, p. 251, 1995.
[5] C.-C. Tai and J. C. Moulder, "Bolt-Hole Corner Crack
Inspection Using the Photoinductive Imaging Method " Journal
of Nondestructive Evaluation, vol. 19, pp. 81-93, 2000.
[6] N. Tsopelas and N. J. Siakavellas, "Electromagnetic-thermal
NDT in thin conducting plates," NDT and E International, vol.
39, pp. 391-399, 2006.
[7] F. Yu and P. B. Nagy, "Dynamic Piezoresistivity Calibration
for Eddy Current Nondestructive Residual Stress
Measurements," Journal of Nondestructive Evaluation, vol. 24,
pp. 143-151, 2005.
[8] B. A. Auld, S. R. Jefferies, and J. C. Moulder, "Eddy-current
signal analysis and inversion for semielliptical surface cracks,"
Journal of Nondestructive Evaluation, vol. 7, pp. 79-94, 1988.

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Acoustic and Phoneme Modeling Based on Confusion Matrix for Ubiquitous


Mixed-Language Speech Recognition

Po-Yi Shih, Jhing-Fa Wang, Hsiao-Ping Lee, Hung-Jen Kai, Hung-Tzu Kao, Yuan-Ning Lin
Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C.

Abstract

This work presents a novel approach to acoustic and phoneme modeling for recognizing ubiquitous mixed-language speech. The conventional approach to multilingual speech recognition is to use a multilingual phone set. A confusion matrix combining the acoustics of every pair of phonetic units is built for phonetic unit clustering. In this work, we are interested in speaker-independent voice command recognition. The IPA representation is adopted for phonetic unit modeling. The EAT corpus is applied to construct speaker-independent acoustic models. The experimental results show that the proposed method can achieve 70~80% lexicon recognition accuracy.

1. Introduction

Speech is the most convenient form of human-to-human communication and human-to-machine interaction. As global communication and multiethnic societies grow, the demand for multilingual capability increases. An utterance is sometimes spoken in two or more languages, as in mixed-language speech. In Taiwan, Mandarin is the most important language in Chinese societies; it has been widely studied in the field of speech recognition for decades [8]. Recently, it has received even more attention because of its huge population and the potentially huge market. However, most people in Chinese societies speak at least two languages in their daily lives, i.e., Mandarin and their mother tongues, including English, Taiwanese (Min-nan, Southern Hokkianese), etc. To make future speech interfaces friendlier, bi-lingual or even multi-lingual capability of the speech recognizer is highly desired.

We are interested in speaker-independent voice command recognition, with a vocabulary size of tens to hundreds of words. The recognizer has to operate on typical handheld devices in mobile situations, including recognizing in-vehicle speech recorded with a hands-free microphone. In addition, collecting a large speech database covering all mobile situations for several languages is considered impractical because of cost issues. As a consequence, the system must be trained using a speech database not specific to the application task.

In this work, an acoustic and phoneme modeling approach based on a confusion matrix for a ubiquitous mixed-language speech recognition (MLSR) embedded system is proposed. When using this system, users can give commands to control an electrical device or IA directly via speech. The proposed work will further benefit all other speakers and contribute to progress on next-generation automatic speech recognition. The system can be used easily in different command-based control applications by changing the dictionary description and grammar for each new task.

The rest of the paper is organized as follows. In Section 2, we discuss the corpora and monolingual speech recognition. Section 3 presents the phone set generation schemes for mixed-language speech recognition, followed by the implementation of MLSR using the HTK tool in Section 4. Experimental results are shown in Section 5 and conclusions are given in Section 6.

2. Review of related works

2.1. Training and testing corpora

2.1.1. Corpora of English Across Taiwan (EAT). The EAT project prepared 600 recording sheets. Each sheet had 80 reading sentences, including English long sentences, English short sentences, English words and mixed Chinese-English sentences. Five academic affiliations joined this project. Each affiliation finished 120 reading sheets. Each sheet was used for speech recording individually by English Department people and non-English Department people.

The recordings, using hand-held microphones and wired/wireless telephones, were then carried out. The microphone corpus was recorded as sound files with a 16 kHz sample rate and 16-bit sample resolution. The telephone corpus was recorded as sound files with an 8 kHz sample rate and 16-bit sample resolution. The telephone corpus was divided into 600 copies (English Department + non-English Department) of PSTN corpora and 600 copies (English Department + non-English Department) of GSM corpora.

2.1.2. Corpora of Mandarin speech data Across Taiwan (MAT-400). The MAT (Mandarin speech Across Taiwan) project started in 1995 and was sponsored by the National Science Council, Taiwan, ROC. This three-year project set its goal as generating a speech database of 5000 speakers. Besides, it aimed at setting the standard format for speech data collection [10]. Nine stations for speech data collection were set up around Taiwan. A speaker can input speech data by dialing up one of the speech data collection stations, following the voice instructions, and speaking into the telephone handset.

2.2. Monolingual speech recognition

2.2.1. Chinese speech recognition. Substantial results have been reported showing that the performance of Chinese speech recognition using phonemes as the phone unit can approximate that using Initials and Finals (IFs) [12, 13]. We therefore select a phone set of 37 phonemes and establish the Chinese speech recognition system accordingly; this provides consistency with ASR for most western languages.

2.2.2. English speech recognition. In English LVCSR applications, the phoneme is the most popular subword unit. This is due to its suitable number and the limitation of available training data. Another strong advantage of using phonemes is the ease of creating word lexicons from CMU pronouncing dictionaries. Much work has reported that the phoneme is the most appropriate basic acoustic model unit for English systems [19, 20]. Herein we select 40 phonemes plus a silence as the monophones of our English speech recognition application.

3. Mixed-language phone set construction and acoustic modeling

3.1. Mixed-language acoustic modeling

3.1.1. Knowledge-based methods. The International Phonetic Association (IPA) has defined a phonetic alphabet for describing the sounds used in human speech [16]. This alphabet is designed to provide a consistent means of characterizing the pronunciation of spoken languages globally. The IPA phonetic transcription has been applied, e.g., in many printed dictionaries. The phone inventory defined by the IPA is still found to be subjective, and not all phoneticians agree with all the definitions [13].

The IPA-MAP method provides a consistent definition of a multilingual phone set. Furthermore, if a set of multilingual acoustic models is created according to the IPA alphabet, the recognition system should be potentially language independent. This means that an unseen target language can be recognized if the phone inventory of the source languages is sufficient, i.e. it contains all the phonemes in the target language. The IPA chart is defined for the purposes of the phonetic representation of languages, and is based on articulatory features of the sounds. It may not describe the acoustic features as accurately as is needed for the purposes of speech recognition; e.g., the representation does not cover the allophone variation of one phoneme.

3.1.2. Computational methods. When expert knowledge is not available or, e.g., the number of multilingual phone models is constrained to be very low, the IPA-MAP method may not be useful. In this case, an automatic, i.e. computational, method may provide an alternative approach for defining a set of multilingual phone models [14]. Using the same number of phone models as in the IPA-MAP method, the recognition accuracy of the resulting multilingual ASR system has been observed to be slightly better with the computational method [14]. Conversely, recognition accuracy comparable to the IPA-MAP recognition system can be achieved with fewer phone models when using the computational method [9].

The computational methods used in defining the multilingual phone model set are based on some measure of dissimilarity between two language-dependent phoneme models, namely HMMs. Usually, such a measure is employed, the models are collected into a certain number of clusters, and each cluster is modeled with a common multilingual phone model. The clusters are typically obtained using a bottom-up, i.e. agglomerative, clustering algorithm [14][9].

3.2. Acoustic model measurement based on confusion matrix

3.2.1. Measures based on confusion matrix. In pattern recognition, the classification errors are very often of greatest interest. The classification errors are usually presented in the form of a confusion matrix, such as that shown in Table 1. The matrix is usually obtained using a labeled data set that contains a sufficient number of samples from each class.
The columns of the matrix correspond to the results of classification. The rows correspond to the true pattern classes of the observations. Thus, the element c_ij of a confusion matrix C is the number of classifications of observations of class i as class j. The diagonal elements thus show the number of correctly classified observations for each class, and the off-diagonal elements show the number of misclassifications, i.e. substitution errors.

Table 1. An example of a confusion matrix
                           Classified as
                         'a'    'b'    'c'
Source class   'a'       100      0      0
               'b'         0     67     33
               'c'        10      0     90

Table 1 is an example of a confusion matrix. Observations of class 'a' are classified correctly every time, i.e. there are no confusions of 'a' to 'b' or 'c'. However, the observations of class 'c' are confused with class 'a' ten times out of 100. The values in the table are percentages.

The confusion matrices are usually represented such that the rows of the matrix are normalized by the total number of tokens evaluated. This way the value in the estimated matrix shows the relative number of confusions, i.e. the confusion frequency, which can be given as

ĉ_ij = c_ij / card{o : o ∈ class i},    (1)

where card denotes the number of elements in the set. Once a confusion matrix estimate C is obtained for the set of models, it contains the estimates of how likely it is that a given model is classified as another model.

3.2.2. Phoneme model HMMs estimation from a confusion matrix. One method of converting a confusion matrix into a symmetric similarity matrix is the Houtgast algorithm [1][11][23]. It is given as

s_ij = Σ_k min(c_ik, c_jk),    (2)

where c_ij is an element of the confusion matrix C and s_ij is the similarity between models i and j, i.e. an element of the similarity matrix S. The conversion essentially measures the simultaneous confusability of both models i and j with respect to all models k. If both models are very often confused with a particular model, the minimum function gives a large value. This means that the sum over all the models, i.e. the value s_ij, is large and the models are considered similar. On the other hand, if one of the models is often confused with a model and the other is not confused with that particular model, the output of the minimum function is small. In such a case the sum over all the models is small, and the models are considered dissimilar.

The Houtgast algorithm can be viewed almost as a special case of fuzzy similarity, a concept used in soft computing [19]. The only differences between the Houtgast and this fuzzy similarity relation are that the similarity value is the mean of the summed elements, and the diagonal values of the similarity matrix are equal to one, i.e.

s'_ij = (1/N) Σ_k min(c_ik, c_jk) for i ≠ j, and s'_ii = 1,    (3)

where s'_ij is the fuzzy similarity corresponding to the Houtgast similarity s_ij, and N is the number of models. Therefore, the elements of the similarity matrix S' satisfy 0 ≤ s'_ij ≤ 1, and the maximum similarity, i.e. the value one, is achieved when the models are the same.

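As a small illustration of the Houtgast-style measure of Eq. (2), the Python sketch below builds a similarity matrix from a confusion matrix and applies a threshold test of the kind used for phone mapping in Section 3.2.3. The toy numbers reuse Table 1 and the threshold is an arbitrary illustrative value, not one taken from the paper.

import numpy as np

# Row-normalized confusion matrix from Table 1 (percentages / 100).
C = np.array([
    [1.00, 0.00, 0.00],   # class 'a'
    [0.00, 0.67, 0.33],   # class 'b'
    [0.10, 0.00, 0.90],   # class 'c'
])

def houtgast_similarity(conf: np.ndarray) -> np.ndarray:
    """Eq. (2): s_ij = sum_k min(c_ik, c_jk); larger means more confusable."""
    n = conf.shape[0]
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            sim[i, j] = np.minimum(conf[i], conf[j]).sum()
    return sim

S = houtgast_similarity(C)
print(S)

# Similarity-style stand-in for the mapping rule of Section 3.2.3: two phones
# are merged only if they are confusable enough.
THRESHOLD = 0.5  # arbitrary illustrative value
i, j = 0, 2
print("merge" if S[i, j] >= THRESHOLD else "keep separate")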
3.2.3. Mixed-language phone set. The mixed-language acoustic model for Mandarin and English is built using the following steps:
1. We group the phones into acoustically and phonetically similar clusters, for each language.
2. We develop single-Gaussian acoustic models for all the monophones.
3. For each phone in Mandarin, we calculate the dissimilarity, based on the confusion matrix, to all the phones in the same group for English. If the value is below a threshold, the source phone in Mandarin is mapped to that phone in English. Otherwise, both phones are modeled separately in the bilingual system.
4. For some phones in Mandarin, there may not be a phone in English that belongs to the same cluster. In such cases, the algorithm will not try to map that phone in Mandarin to a phone in English.
5. After obtaining the list of phones for the bilingual system, the lexicon for Mandarin is edited using the rules obtained for the mapping.
Finally, we obtain the mixed-language phone set shown in Table 2.

Table 2. Mixed-language phone set for Mandarin and English

4. Mandarin and English mixed-language speech recognition (MLSR) system

Figure 1. Framework of the MLSR system

The framework of the proposed system is depicted in Figure 1. The details are stated in the following subsections.

4.1. Feature extraction and front-end signal processing

4.1.1. Features and transforms. MFCC features, obtained using HTK [22], were used with 25 ms windows and a 10 ms shift, a pre-emphasis factor of 0.97, a Hamming window, and 39 Mel-scaled feature bands. On this database neither silence removal, cepstral mean subtraction, nor time-difference features increased performance, so these were not used.

4.1.2. Silence removal. In the silence region of speech, we are supposed to get one peak in the middle. But since the poles are placed consecutively without any zeros in between, we may get two peaks or even multiple peaks, which is undesirable. To avoid this problem, the silence segments present in the speech signal should be removed. The decision is made based on knowledge derived from the energy, zero-crossing rate and spectral flatness of a frame. If silence is present in more than two frames of the signal, those particular segments are removed from the original speech signal.

4.1.3. Voice activity detection. The goal of end-point detection (EPD for short) is to identify the important part of audio signals for further processing. Hence EPD is also known as "speech detection" or "voice activity detection" (VAD for short). EPD plays an important role in audio signal processing and recognition.

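As a rough illustration of the frame-level measures used in Sections 4.1.2 and 4.1.3, the sketch below frames an 8 kHz signal into 25 ms windows with a 10 ms shift (the front-end settings of Section 4.1.1) and computes per-frame log energy and zero-crossing rate. The final threshold rule is a simplified stand-in, not the decision logic actually used in the system.

import numpy as np

SR = 8000                    # sample rate of the telephone-band corpora
WIN = int(0.025 * SR)        # 25 ms window -> 200 samples
HOP = int(0.010 * SR)        # 10 ms shift  -> 80 samples
PREEMPH = 0.97               # pre-emphasis factor from Section 4.1.1

def frame_measures(x: np.ndarray) -> np.ndarray:
    """Return per-frame (log energy, zero-crossing rate) for a mono signal."""
    x = np.append(x[0], x[1:] - PREEMPH * x[:-1])      # pre-emphasis
    window = np.hamming(WIN)
    feats = []
    for start in range(0, len(x) - WIN + 1, HOP):
        frame = x[start:start + WIN] * window
        energy = np.log(np.sum(frame ** 2) + 1e-10)
        zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
        feats.append((energy, zcr))
    return np.array(feats)

# Toy signal: silence, then a 400 Hz tone, then silence again.
t = np.arange(SR) / SR
x = np.concatenate([np.zeros(SR // 2),
                    0.5 * np.sin(2 * np.pi * 400 * t),
                    np.zeros(SR // 2)])
m = frame_measures(x)
speech_frames = m[:, 0] > (m[:, 0].max() - 10.0)       # crude energy threshold
print(f"{speech_frames.sum()} of {len(m)} frames flagged as speech")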
4.2. Training phones of the proposed system

The training phase was started by applying the HTK tool HInit to a predefined simple prototype model. This tool uses all the available training data and, utilizing the Viterbi alignment repeatedly, tries to provide initial estimates of the HMM parameters. Later, the HRest tool was used to provide more accurate parameter estimates using the Baum-Welch algorithm. The recognition and results analysis phases were carried out later using the appropriate tools.

Table 3. HMM parameters
# recog. units: 57 (55 phones + silence + short pause)
# HMM states: 3 for phones, 1 for silence, 1 for short pause
# mixtures: 8 for phones, 32 for silence and short pause

4.3. Recognition phase of the proposed system

4.3.1. Tree lexicon (a pronunciation dictionary). The dictionary provides an association between the words used in the task grammar and the acoustic models, which may be composed of sub-word (phonetic, syllabic, etc.) units. Since this project provides a voice-operated interface, the dictionary could have been constructed by hand, but the researcher wanted to try a different method which could also be used to construct a dictionary for a large-vocabulary ASR system. In order to train the HMM network, a large pronunciation dictionary is needed.

4.3.2. The task grammar. The task grammar defines constraints on what the recognizer can expect as input. As the system built provides a voice-operated interface for name recognition, it handles word strings. For the limited scope of this project, only the syllable-based name grammars were needed. The grammar was defined in BN-form as follows: $variable defines a phrase as anything between the subsequent = sign and the semicolon, where | stands for a logical or. Brackets have the usual grouping function and square brackets denote optionality. The toy grammar used was defined in this form.

Table 4. Lexicon file (a pronunciation dictionary)
sil         []
ba          B AA sp
buo         B OW sp
bai         B AY sp
bei         B ei sp
bau         B AW sp
ban         B an sp
born        B @n sp
bang        B a_n_b sp
bng         B @n_b sp
bi          B IY sp
bie         B IY EH sp
…………..
seven       S EH V AH N sp
eleven      IH L EH V AH N sp
samsung     S AE M S AH NG sp
taiwan      T AY W AA N sp
yahoo       Y AA HH UW sp
taipei      T AY P EY sp
kaohsiung   K EY OW S IY AH NG sp
washington  W AA SH IH NG T AH N sp
state       S T EY T sp
SENT-END    [] sil
SENT-START  [] sil

4.3.3. Viterbi beam search. Viterbi search is essentially a dynamic programming algorithm, consisting of traversing a network of HMM states and maintaining the best possible path score at each state in each frame. It is a time-synchronous search algorithm in that it processes all states completely at time t before moving on to time t + 1.

The beam search heuristic reduces the average cost of search by orders of magnitude in medium and large vocabulary systems. The combination of Viterbi decoding with the beam search heuristic is often simply referred to as Viterbi beam search.

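To make the time-synchronous beam search concrete, here is a minimal Python sketch over a toy HMM state graph. The transition scores, the placeholder acoustic score and the beam width are illustrative values only, not parameters of the system described here.

import math

# Toy state graph: state -> list of (next_state, log transition probability).
GRAPH = {
    0: [(0, math.log(0.6)), (1, math.log(0.4))],
    1: [(1, math.log(0.5)), (2, math.log(0.5))],
    2: [(2, math.log(1.0))],
}

def emit_logprob(state: int, obs: float) -> float:
    """Placeholder acoustic score: a crude Gaussian around the state index."""
    return -0.5 * (obs - float(state)) ** 2

def viterbi_beam(observations, beam=5.0):
    """Time-synchronous Viterbi: expand all active states at time t before
    t + 1, pruning hypotheses more than `beam` below the current best."""
    active = {0: 0.0}                       # state -> best log score so far
    for obs in observations:
        nxt = {}
        for state, score in active.items():
            for succ, trans in GRAPH[state]:
                cand = score + trans + emit_logprob(succ, obs)
                if cand > nxt.get(succ, -math.inf):
                    nxt[succ] = cand
        best = max(nxt.values())
        active = {s: v for s, v in nxt.items() if v >= best - beam}  # prune
    return max(active.items(), key=lambda kv: kv[1])

print(viterbi_beam([0.1, 0.8, 1.2, 1.9, 2.1]))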
4.3.4. Recognition process. The recognizer is now complete and its performance can be evaluated. The recognition network and dictionary have already been constructed, and test data has been recorded. Thus, all that is necessary is to run the recognizer. The recognition process can be summarized as in Figure 2.

Figure 2. Recognition process of an input signal (components: input signal (.wav), configuration file, script file, HCOPY, network (.slf), dictionary, HMM list, HMMs (.mmf), HVite, transcription (.mlf)).

An input speech signal is first transformed into a series of "acoustical vectors" (here MFCCs) using the HTK tool HCopy, in the same way as was done with the training data.

The input observation is then processed by a Viterbi algorithm, which matches it against the recognizer's Markov models using the HTK tool HVite. The recognizer is shown in Figure 3.

Figure 3. Recognizer = network + dictionary + HMMs

5. Experimental results

5.1. Experimental setup

Training databases:
EAT: Through manual sifting, 8375 wave files remain, including English long sentences, English short sentences and English words. The corpus contains 19221 words for training. This contributes about 5.33 hours of continuous speech.
MAT-400: In MAT-400, we use the MATDB-4 (1200) and MATDB-5 (400) categories. Through manual sifting, 15400 wave files remain, including words of 2 to 4 syllables and phonetically balanced sentences. The MAT-400 corpus contains 80903 words for training. This contributes about 9.65 hours of continuous speech.
Testing data:
There are one hundred test patterns, containing Chinese movie names, some English words and some mixed Chinese and English sentences.
Testing environment:
The testing environment uses a six-microphone array, as shown in Figure 4.

Figure 4. MLSR system testing environment (speech signal input, channel mixer, Mic 1 to Mic 6).

5.2. Experimental results

Ten people tested this system. Each randomly picked twenty patterns that are mixed Chinese and English sentences. The table below records the speech recognition results.

Table 5. Experiment results
Number of people   Correct (O)   Incorrect (X)   Number of tests   Recognition rate
8 males                127            33               160             79.38%
2 females               24            16                40             60.00%
Total                  151            49               200             75.50%

6. Conclusions and future works

Due to significant differences between Chinese and English, a good enough phone inventory of these two languages has not been reported. To reduce the number of acoustic parameters, mono-phone automatic clustering, tri-phone clustering, and robust model-level, state-level or mixture-level parameter tying will be investigated in this framework.

The experimental results show that the proposed system can achieve 70~80% lexicon recognition accuracy. Experiments also show that the EAT speech corpus can provide an analysis of mixed-language and accented speech uttered by Taiwanese speakers and illustrate a direction for constructing mixed-language accented speech recognition.

This bilingual system is just the very beginning of our multilingual development. In the future, more languages such as Taiwanese will be added to the system.

7. References

[1] Andersen O., Dalsgaard P. and Barry W. On the use of data-driven clustering technique for identification of poly- and mono-phonemes for four European languages. In Proceedings of International Conference on Acoustics, Speech and Signal Processing, volume 1, Adelaide, Australia, Apr. 1994, pp. 121-124.

[2] Chomsky, N. and Halle, M. The Sound Pattern of English. New York: Harper & Row, 1968.

[3] C. Y. Tseng, "A phonetically oriented speech database for Mandarin Chinese," Proc. ICPhS95, Stockholm, 1995, pp. 326-329.

[4] C.-H. Lee, L. Rabiner et al. "Acoustic modeling for large vocabulary speech recognition", Computer Speech and Language, 1990, Vol. 4, pp. 127-165.
[5] C.L. Huang, C-H Wu, "Phone Set Generation Based on Acoustic and Contextual Analysis for Multilingual Speech Recognition", Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C. (2007)

[6] C.L. Huang, C-H Wu, "Generation of Phonetic Units for Mixed-Language Speech Recognition Based on Acoustic and Contextual Analysis", Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C. (2007)

[7] C. Y. Ma, Pascale Fung, "Using English Phoneme Models for Chinese Speech Recognition", The Human Language Technology Center, Department of Electrical and Electronic Engineering, Hong Kong University of Science and Technology (HKUST), Hong Kong.

[8] F. Seide, N. J. C. Wang, 1998. Phonetic modeling in the Philips Chinese continuous-speech recognition system. In Proc.

[9] Harju M., Salmela P., Leppänen J., Viikki O. and Saarinen J. Comparing parameter tying techniques for multilingual acoustic modelling. In Proceedings of the European Conference of Speech Communication and Technology, Aalborg, Denmark, Sept. 2001, pp. 2729-2732.

[10] H. C. Wang, "MAT – A project to collect Mandarin speech data through telephone networks," Computational Linguistics and Chinese Language Processing, 1997, vol. 2, no. 1, pp. 73-90.

[11] Imperl B. and Horvat B. The clustering algorithm for the definition of multilingual set of context dependent speech models. In Proceedings of the European Conference of Speech Communication and Technology, Budapest, Hungary, 1999, pp. 887-890.

[12] J. L. Gauvain, L.F. Lamel, G. Adda, M. Adda-Decker. Speaker Independent Continuous Speech Dictation, Speech Communication, 1994, Vol. 15 (1-2), pp. 21-37.

[13] Köhler J. Comparing three methods to create multilingual phone models for vocabulary independent speech recognition tasks. In Proc. ESCA-NATO Tutorial and Research Workshop: Multi-lingual Interoperability in Speech Technology, Sept. 1999, pp. 79-84.

[14] Köhler J. Multilingual phone models for vocabulary-independent speech recognition tasks. Speech Communication, Aug. 2001, 35(1-2):21-30.

[15] Karjalainen M. Kommunikaatioakustiikka. Technical Report 51, Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, Espoo, Finland, Preprint, In Finnish, 1999.

[16] Ladefoged P., Local J. and Shockey L., editors. Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet. Cambridge University Press, U.K., 1999.

[17] Rabiner L. Fundamentals of Speech Recognition. PTR Prentice-Hall Inc., New Jersey, 1993.

[18] Shengmin Yu, Sheng Hu, Shuwu Zhang, Bo Xu, "Chinese-English Bilingual Speech Recognition", Hi-Tech Innovation Center, Institute of Automation, Chinese Academy of Sciences, Beijing, P. R. China (2003)

[19] Turunen E. Survey of theory and applications of Lukasiewicz-Pavelka fuzzy logic. In di Nola A. and Gerla G., editors, Lectures on Soft Computing and Fuzzy Logic. Advances in Soft Computing, Physica-Verlag, Heidelberg, 2001, pp. 313-337.

[20] Vihola M., Harju M., Salmela P., Suontausta J. and Savela J. Two dissimilarity measures for HMMs and their application in phoneme model clustering. Accepted to Proceedings of International Conference on Acoustics, Speech and Signal Processing, Orlando, USA, 2002.

[21] Y. J. Chen, C-H. Wu et al. Generation of robust phonetic set and decision tree for Mandarin using chi-square testing. Speech Communication, 2002, Vol. 38 (3-4), pp. 349-364.

[22] Young, S. et al. The HTK Book (V3.2), Cambridge University Engineering Dept, 2002.

[23] Zgank A., Imperl B. and Johansen F. Crosslingual speech recognition with multilingual acoustic models based on agglomerative and tree-based triphone clustering. In Proceedings of the European Conference of Speech Communication and Technology, Aalborg, Denmark, Sept. 2001, pp. 2725-2728.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Speech Watermarking Based on Wavelet Transform and BCH Coding

Shi-Huang Chen, Shih-Yin Yu, and Chung-Hsien Chang


Department of Computer Science & Information Engineering,
Shu-Te University, Kaohsiung, 824, Taiwan
E-mail: shchen@mail.stu.edu.tw

Abstract

This paper proposes a novel speech watermarking algorithm using the wavelet transform and BCH coding. The characteristics of the proposed algorithm are as follows: 1) a steadier embedded-code strategy is adopted to resist compression attacks more effectively; 2) the multi-resolution property of the wavelet transform and the BCH error correcting code are used to improve the robustness of the watermark; 3) the watermark is embedded into the low-, middle-, or high-frequency components by an adaptive algorithm according to the human perceptual critical bands; 4) the proposed scheme can extract the watermark without using the original speech signal. Experimental results show that the proposed watermarking algorithm is inaudible and robust against the G.711, G.726, and GSM 6.10 speech coding standards.

1. Introduction

With the rapid development of speech, audio, image, and video compression methods, it is currently not a difficult task to spread digital multimedia over the Internet. This makes the protection of digital intellectual property rights and content authentication a serious problem. Hence the technology of digital watermarking has received a great deal of attention. Generally, digital watermarking techniques are based either on spread spectrum methods or on changing the least significant bits of selected coefficients of a certain signal transform. For speech watermarking, to ensure the embedded watermark is imperceptible, audio masking phenomena are considered together with these conventional techniques. In addition, a speech watermarking system should be robust to various speech compression operations [1-6]. The development of speech watermarking algorithms therefore involves a trade-off among speech fidelity, robustness, and watermark pattern embedding rate specifications.

Speech watermarking techniques usually embed the speech watermark in unnecessary parts of the speech signal, or in auditory regions to which humans are insensitive. Some speech watermarking methods change an interval to embed the watermark [1-6]. However, this kind of method has the drawback of an unavoidable degradation of robustness [2]. In other methods, the watermarks are embedded by the use of counterfeit human speech. Unfortunately, this type of method also has the defect of weak robustness, especially when the counterfeit human speech is destroyed. The distortion of the counterfeit human speech will also lead to damage of the watermark [3, 4].

In this paper, a novel algorithm for speech watermarking is proposed. The main improvement of the proposed algorithm is the robust wavelet-based watermark embedding approach. In other words, the proposed algorithm can embed an error-resistant watermark in a robust sub-band of a given speech signal. The first step of the proposed algorithm is to decompose the input speech into sub-band signals via the wavelet packet transform. Then the proposed algorithm chooses an optimal sub-band in which to embed the error-resistant watermark pattern. Here the error-resistant watermark is generated from the BCH encoding of the original watermark pattern. The optimal sub-band mentioned above means that the embedded watermark in this band would have the least distortion and mean square error (MSE) after speech compression. Finally the watermarked speech can be obtained by the use of the inverse wavelet packet transform. Similarly, in the watermark extracting procedure, the proposed algorithm needs to decompose the watermarked speech into sub-band signals using the corresponding wavelet packet transform. Then one can extract the watermark from the optimal sub-band via BCH decoding. It is worth noting that the proposed algorithm needs neither the original speech nor a training database.

Experimental results show that the proposed algorithm can generate an inaudible watermark for the speech signal and that the watermark is robust against the G.711, G.726, and GSM 6.10 speech compression standards.

This paper is organized as follows. Section 2 describes the wavelet packet as well as the BCH coding process. The watermark embedding and extracting algorithms are given in Sections 3 and 4, respectively. Section 5 gives the experimental results of the proposed algorithm and Section 6 concludes this paper.

2. Wavelet packet and BCH coding

2.1. Wavelet packet

It is well known that the wavelet packet can provide a more detailed decomposition structure for signal analysis. By the use of the wavelet packet, one can analyze the characteristics of both the high-pass and low-pass signals of the original multi-resolution analysis. In addition, the decomposition structure of the wavelet packet can be adapted to match the spectrum of a given signal. It follows from the literature that the wavelet packet has been widely used in various applications [6], including pattern recognition, image/audio compression, etc.

In this paper, the wavelet packet is implemented via a filter bank structure. The basic wavelet decomposition is shown in Figure 1, where cD means details and cA represents approximations. From Figure 1, the input signal is split into individual high-pass (details) and low-pass (approximations) signals.

Figure 1. The basic wavelet decomposition structure (the input S passes through high-pass and low-pass filters, each followed by downsampling by 2, giving the detail cD and approximation cA sub-bands).

In wavelet packet analysis, the decomposition can be applied to both the high-pass and the low-pass branches. Decomposing the signal repeatedly in this way yields a wavelet decomposition tree. An example of a three-level wavelet packet transform is given in Figure 2.

Figure 2. An example of a three-level wavelet packet (the root S is split into A1 and D1, then AA2, DA2, AD2, DD2, and finally AAA3 through DDD3).

One can obtain the original signal from the wavelet sub-band signals via the corresponding inverse wavelet transform. Such an inverse wavelet transform can be constructed from the basic wavelet reconstruction shown in Figure 3.

Figure 3. The basic wavelet reconstruction structure (the cD and cA sub-bands are upsampled by 2, filtered by the high-pass and low-pass reconstruction filters, and summed to recover S).

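For readers who want to experiment, the decomposition described above can be reproduced with the PyWavelets package. This is a minimal sketch on a synthetic signal, not the authors' Matlab implementation; the 32 leaf nodes correspond to the five-level Haar split used later in Section 3.

import numpy as np
import pywt

fs = 8000
t = np.arange(fs) / fs
speech_like = 0.3 * np.sin(2 * np.pi * 300 * t) + 0.1 * np.random.randn(fs)

# Five-level Haar wavelet packet decomposition -> 2**5 = 32 leaf sub-bands,
# ordered from low to high frequency.
wp = pywt.WaveletPacket(data=speech_like, wavelet='haar', maxlevel=5)
leaves = wp.get_level(5, order='freq')
print(len(leaves))                      # 32
print(leaves[0].path, leaves[0].data.shape)

# Perfect reconstruction from the sub-band signals (inverse wavelet packet).
rec = pywt.WaveletPacket(data=None, wavelet='haar', maxlevel=5)
for node in leaves:
    rec[node.path] = node.data
restored = rec.reconstruct(update=False)
print(np.allclose(restored[:len(speech_like)], speech_like))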
2.2. BCH coding

It is well known that the Bose, Chaudhuri, and Hocquenghem (BCH) error correcting codes can correct error bits and provide an algebraic code construction. The BCH code belongs to the linear block codes, with the property that cyclic shifts of a codeword are also codewords. Therefore, the BCH code is also a kind of cyclic code. The main advantage of linear block codes is their implementation simplicity and low computational complexity. Generally, a linear block code maps a fixed number of message symbols to a fixed number of code symbols. The code symbols are usually composed of two parts: the first part contains the original information bits to be transmitted and the second part contains the parity checking bits. A linear block code with length n and k information bits is denoted as an (n, k) code.

A binary BCH code is determined by its generator polynomial g(x) [7]. For a t-error-correcting code, g(x) is the lowest-degree polynomial with α, α², α³, …, α^2t as roots, where α^i is an element of the Galois field GF(2^m). For any positive integers m (m ≥ 3) and t (t < 2^(m-1)), there exists a binary BCH code with block length n and parity check length n − k, where k is the number of information bits. That is, letting the code length be 2^m − 1 with m an integer (3 ≤ m ≤ 9), the generator polynomial can be formed from the corresponding minimal polynomial factors.

As mentioned in [7], BCH codes have an intimate relation between the generator polynomial and the minimal code length for a given minimum distance d_min. In addition, a BCH code satisfies the following relationships:

n = 2^m − 1
n − k ≤ mt
d_min ≥ 2t + 1

For more information about BCH codes, see [7]. In this paper, various BCH codes with different n and k parameters are applied to speech watermarking.

3. Watermark embedding scheme

The watermark embedding scheme of the proposed speech watermarking algorithm contains four major steps: (1) a five-level Haar wavelet packet transform, (2) BCH encoding, (3) best band selection, and (4) the inverse wavelet packet transform. The decomposition tree structure of the five-level wavelet packet transform is given in Figure 4. Since the bandwidth of the human speech signal is generally below 4 kHz, the sampling rate of the input speech signals used in this paper is 8 kHz. In addition, the minimum bandwidth of the human perceptual critical bands is about 100 Hz [8]. Therefore, the five-level Haar wavelet packet transform will cover the perceptual critical bands of human hearing.

Figure 4. The five-level wavelet packet transform (the root S is successively split into A1/D1, AA2 through DD2, AAA3 through DDD3, AAAA4 through DDDD4, and finally the 32 leaf sub-bands numbered 1 to 32).

Figure 5 shows the major steps of the proposed watermark embedding scheme. It follows from Figure 5 that the first step of the proposed watermark embedding scheme is to decompose the input speech into 32 sub-band signals using the five-level Haar wavelet packet transform. Then the watermark pattern, after BCH encoding, is embedded into one of these 32 sub-band signals. The watermark pattern used in this paper is any symbol consisting of 32 bits of information.

The best band selection algorithm chooses an optimal band signal in which to embed the watermark. Here the optimal sub-band means that the embedded watermark in this band would have the least distortion as well as the least MSE after speech compression. This paper applies the G.711, G.726, and GSM 6.10 speech coding standards to evaluate the selection of the best band.

Figure 5. The proposed watermark embedding scheme (input speech → wavelet packet → best band selection → inverse wavelet packet → watermarked speech, with the watermark pattern BCH-encoded before embedding).

Tables 1~7 show the best band distributions under different (n, k) BCH codes with G.711, G.726, and GSM 6.10 speech coding testing. It is worth noting that the watermark pattern will be embedded into the low-, middle-, or high-frequency components by the best band selection algorithm. The selection results accord with the perceptual critical bands of human hearing.

Table 1. Best band distribution for (7, k) BCH code
k = 4: 1, 2, 4, 5, 6, 7, 8, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32

Table 2. Best band distribution for (15, k) BCH code
k = 5: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32
k = 7: 1, 2, 5, 8, 9, 10, 11, 13, 14, 15, 16, 18, 19, 20, 21, 22, 23, 24, 25, 28, 29, 30, 31, 32
k = 11: 1, 5, 6, 7, 9, 10, 14, 15, 16, 18, 19, 24, 25, 26, 30, 31

Table 3. Best band distribution for (31, k) BCH code
k = 6: 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32
k = 11: 1, 5, 6, 8, 9, 10, 11, 12, 14, 15, 17, 18, 19, 20, 21, 22, 23, 25, 26, 29, 30, 32
k = 16: 2, 5, 10, 11, 12, 15, 17, 19, 21, 22, 23, 27, 29, 32
k = 21: 2, 6, 8, 9, 10, 11, 12, 15, 17, 18, 22, 26, 27
k = 26: 1

Table 4. Best band distribution for (63, k) BCH code
k = 7: Any sub-band
k = 10: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32
k = 16: 1, 2, 3, 4, 7, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32
k = 18: 1, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32
k = 24: 1, 5, 7, 9, 10, 12, 15, 16, 17, 18, 22, 25, 28, 32
k = 30: 1, 14, 16, 17, 22, 29, 31
k = 36: 8, 10, 12, 18, 19, 20, 22, 28
k = 39: 21, 22, 31, 32
k = 45: No best sub-band
k = 51: 18
k = 57: No best sub-band

Table 5. Best band distribution for (127, k) BCH code
k = 8: Any sub-band
k = 15: 1, 2, 3, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32
k = 22: 1, 2, 3, 5, 6, 7, 8, 9, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32
k = 29: 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32
k = 36: 1, 10, 11, 14, 17, 18, 19, 20, 24, 26, 29, 32
k = 43: 18
k = 50: 26
k = 57: No best sub-band
k = 64: 8

Table 6. Best band distribution for (255, k) BCH code
k = 9: Any sub-band
k = 13: Any sub-band
k = 21: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, …
k = 29: 1, 2, 3, 6, 8, 9, 11, 16, 17, 18, 19, 20, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32
k = 37: 1, 3, 7, 8, 9, 10, 12, 14, 15, 16, 18, 19, 21, 23, 25, 26, 27, 29, 32
k = 45: 1, 8, 9, 15, 16, 19, 20, 22, 23, 27, 28, 29, 32
k = 47: 3, 5, 7, 9, 15, 16, 19, 20, 23, 32

Table 7. Best band distribution for (511, k) BCH code
k = 10: Any sub-band
k = 19: Any sub-band
k = 28: Any sub-band
k = 31: Any sub-band
k = 40: 1, 2, 3, 4, 6, 8, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32
k = 49: 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 26, 28, 29, 30, 31, 32
k = 58: 1, 3, 6, 8, 10, 11, 12, 15, 16, 17, 18, 19, 20, 23, 24, 25, 26, 28, 29, 30, 31, 32
k = 67: 1, 2, 4, 7, 8, 10, 12, 16, 17, 20, 22, 24, 25, 26, 30, 32
k = 76: 1, 2, 4, 9, 10, 11, 12, 20, 22, 23, 26, 27, 30, 32
k = 85: 1

Figure 6 shows the best band distribution among the 32 sub-band signals. One can see that the highest and lowest frequency bands are usually selected as the best watermark embedding bands under most BCH code settings.

Figure 6. The best band distribution among the 32 sub-band signals

Finally, the watermarked speech can be obtained by the use of the inverse wavelet packet transform.

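A rough sketch of the best band selection idea follows: each candidate sub-band is watermarked, passed through a codec stand-in, and the band with the smallest mean square error is kept. The embedding rule and the compress placeholder are simplifications for illustration; the paper does not specify them in this form, and a real evaluation would use the G.711, G.726 and GSM 6.10 codecs.

import numpy as np

def embed_bits(band: np.ndarray, bits: np.ndarray, strength: float = 0.01) -> np.ndarray:
    """Toy embedding: nudge the first len(bits) coefficients up or down.
    Illustrative stand-in only, not the embedding rule of the paper."""
    out = band.copy()
    out[:len(bits)] += strength * (2 * bits - 1)
    return out

def compress(signal: np.ndarray) -> np.ndarray:
    """Codec stand-in: crude 8-bit requantization just to add coding noise."""
    return np.round(signal * 127.0) / 127.0

def best_band(subbands, coded_watermark: np.ndarray) -> int:
    """Pick the sub-band whose embedded watermark survives compression with
    the least mean square error, per the best band selection idea."""
    errors = []
    for band in subbands:
        marked = embed_bits(band, coded_watermark)
        degraded = compress(marked)
        errors.append(np.mean((degraded - marked) ** 2))
    return int(np.argmin(errors))

rng = np.random.default_rng(0)
subbands = [rng.normal(scale=0.1, size=250) for _ in range(32)]   # 32 sub-bands
codeword = rng.integers(0, 2, size=63)                            # e.g. a BCH(63, k) output
print("selected sub-band:", best_band(subbands, codeword))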
4. Watermark extracting scheme

In the watermark extracting scheme, the first step is to decompose the watermarked speech into 32 sub-band signals by means of a five-level Haar wavelet packet transform. Then, (n, k) BCH decoding is applied to each optimal sub-band signal, where the parameters (n, k) are those specified by the watermark embedding scheme. This BCH decoding operation is repeated until all optimal sub-band signals have been tested or a correct BCH decoding is obtained. Finally, the watermark pattern is extracted from the optimal sub-band signal with correct BCH decoding. The main advantage of the proposed algorithm is that it does not require a match with an uncorrupted original speech signal or a training database. Figure 7 is the block diagram of the watermark extracting scheme.

[Figure 7. The proposed watermark extracting scheme: Watermarked Speech -> Wavelet Packet -> BCH Decoding -> Watermark Pattern Extracting -> Watermark Pattern]
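A minimal sketch of this extraction loop is given below. It assumes PyWavelets for the five-level Haar wavelet packet decomposition and a hypothetical bch_decode(bits, n, k) helper standing in for the (n, k) BCH decoder; the bit-recovery rule shown is only a placeholder, since the paper does not spell out the embedding convention.

import pywt

def extract_watermark(watermarked_speech, best_bands, n, k, bits_per_band):
    """Try each candidate sub-band until one yields a correctly decoded BCH word."""
    wp = pywt.WaveletPacket(data=watermarked_speech, wavelet="haar", maxlevel=5)
    leaves = wp.get_level(5, order="freq")             # the 32 sub-band signals
    for band in best_bands:                            # optimal sub-bands (1-based)
        coeffs = leaves[band - 1].data
        # Placeholder bit-recovery rule: read the sign of each coefficient as one
        # embedded bit; the actual embedding rule would replace this line.
        bits = [1 if c > 0 else 0 for c in coeffs[:bits_per_band]]
        message = bch_decode(bits, n, k)               # hypothetical BCH decoder
        if message is not None:                        # correct decoding found
            return message
    return None                                        # no sub-band decoded correctly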

5. Experimental results

The proposed speech watermarking algorithm is implemented on a Pentium-IV 2.4 GHz PC in the Matlab programming language. All of the speech signals in the test are selected from the "Aurora 2" database, with an 8 kHz sampling rate and 16-bit resolution. In this experiment, there are 100 test speech signals, and the length of the watermark embedding segment is 8000 samples.

Figure 8 illustrates the average PSNR values for the various BCH code settings and different sub-bands. This experimental result is obtained without applying speech compression; in other words, the PSNR listed in Figure 8 is computed from the original speech and the watermarked speech signals. One can find that a lower value of the BCH code parameter n results in a higher PSNR value.

[Figure 8. The average PSNR values of various BCH code settings and different sub-bands]
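For reference, the PSNR between an original and a watermarked 16-bit speech segment can be computed as sketched below. This is the standard definition rather than code from the paper, and it assumes the conventional peak amplitude of 2^15 - 1 for 16-bit samples.

import numpy as np

def psnr_16bit(original, watermarked, peak=2**15 - 1):
    """Peak signal-to-noise ratio (in dB) between two 16-bit speech segments."""
    x = np.asarray(original, dtype=np.float64)
    y = np.asarray(watermarked, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    if mse == 0.0:
        return float("inf")                 # identical signals
    return 10.0 * np.log10(peak ** 2 / mse)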
Table 8 gives the average PSNR values and watermark detection rates for various speech compression standards, including G.711, G.726, and GSM 6.10. The G.711 standard is based on A-law or µ-law non-linear quantization, whereas the G.726 standard is based on adaptive differential pulse code modulation (ADPCM). The core algorithm of GSM 6.10 is regular-pulse-excited long-term prediction (RPE-LTP). The bit-rates of G.711, G.726, and GSM 6.10 used in this paper are 64 kbps, 32 kbps, and 13.3 kbps, respectively. These experimental results show that the proposed watermarking algorithm is inaudible and robust against the G.711, G.726, and GSM 6.10 speech coding standards.

Table 8. The average PSNR values and watermark detection rates for various speech compression standards
                  G.711    G.726    GSM 6.10
PSNR (dB)         48.63    48.33    35.82
Detection rate    100%     100%     96%

6. Conclusions

In this paper, a novel speech watermarking algorithm using a five-level Haar wavelet packet transform and BCH coding is proposed. To improve the robustness of the speech watermark, the proposed algorithm makes use of the wavelet packet transform to select a best sub-band. Then the watermark pattern is encoded with a BCH code and embedded into the selected best sub-band. Experimental results show that the proposed watermarking algorithm is inaudible and robust against the G.711, G.726, and GSM 6.10 speech compression standards. In addition, the watermark can be extracted without using the original speech signal. However, the proposed algorithm does not perform well under the G.728, G.729, and AMR speech compression standards. Further research will modify the algorithm to be more robust against low bit-rate speech compression methods.

QoS-based Learning Services Composition for Ubiquitous Learning


Fu-Ming Huang, Ci-Wei Lan, Stephen J.H. Yang
Dept. of Computer Science and Information Engineering, National Central University, Taiwan
{fmhuang, lancw, jhyang}@csie.ncu.edu.tw

Abstract

With the advance of Internet technologies and handheld computers, many software components have been built for ubiquitous learning activities. Meanwhile, many studies are devoted to promoting universal access to, and wider reusability of, these ubiquitous learning components through Web services. Nevertheless, QoS capability will be a decisive characteristic for distinguishing services with identical functionalities from each other. In this paper, we propose an efficient service selection scheme that helps teachers pick out learning services by considering two different contexts, namely single QoS-based service discovery and QoS-based optimization of service composition. Based on the different concerns, we present the corresponding solutions to demonstrate the evaluation of QoS capability for service selection. The experimental results show not only that our solution is more efficient than the others, but also that it works well for complicated scenarios.

Keywords: QoS, Web Services, Ubiquitous Learning, Composition, Workflow Patterns

1. Introduction

With the evolution of computer technologies, many learning systems have been invented for different pedagogical purposes. These ubiquitous learning systems bring a great number of benefits, such as asynchronous interactions, group collaborations, individualized instruction, distance learning, etc., and all learning materials can be accessed anytime, anywhere in a cost-effective manner. In past years, much research has contributed to promoting the exchange and interoperability of educational materials among different ubiquitous learning systems, including IEEE LTSC's Learning Object Metadata (LOM) and ADL's Sharable Content Object Reference Model (SCORM). Recently, the IMS Global Learning Consortium released the Learning Design specification to support the use of a wide range of pedagogies while enabling the reusability of learning materials in online learning.

The emerging Web Services oriented technologies can provide a flexible, efficient and loosely coupled way to reuse existing learning services. Web Services technology concentrates not only on interoperability but also on how to describe, publish, locate and invoke Web Services. A number of standards and specifications created by industry and academia have contributed to the development of Web Services, such as WSDL [1] and UDDI [2]. Service providers can describe services with WSDL to specify what a service does and how to invoke it. UDDI creates a standard interoperable platform that enables companies and applications to quickly, easily, and dynamically locate Web Services over the Internet. However, these building blocks of the Web Service infrastructure only operate well at the syntactical level. Meanwhile, Quality-of-Service (QoS) concerns are becoming crucial to the global success of the Web services based computing paradigm. Although a number of specifications [3] have been proposed for the aforementioned concerns, there is still no efficient solution to assist service requesters in performing QoS-based service selection, which typically has to deal with the following problems: (1) creation of a QoS model, (2) QoS-based service discovery, and (3) QoS-based optimization of service composition. In case two or more functionally qualified services are available for the tasks in a workflow structure, there will be plenty of combinatorial service sets which deliver the same functionality but differ from each other in QoS performance.

In this paper, we propose an efficient service selection scheme to help service requesters choose Web services by taking a service's non-functional characteristics into consideration. Firstly, we present a QoS model of Web services to illustrate QoS concerns with concrete definitions. Considering the different data types in the model, we apply multiple criteria decision making (MCDM) with the weighted sum model (WSM) to help service requesters evaluate services numerically. Besides, we transform QoS-based optimization of service composition into a mathematical programming problem by deriving the objective function from the constituent workflow patterns, and the problem can be solved efficiently by taking advantage of the problem's structure. Our experimental results have shown that the proposed service selection scheme performs much better than the enumerative method. The remainder of the paper is organized as follows. Related work on QoS-based service selection is reviewed in Section 2 and a QoS model of Web services is presented in Section 3. The proposed QoS-based service selection scheme is specified in Section 4 and the experimental results are presented in Section 5. Finally, concluding remarks are given in Section 6.
2. Related work

In order to improve learning efficiency, many studies have been devoted to developing various mobile and ubiquitous learning systems. Lin and Yeh [4] introduce a Telephone-Supported Multimodal Accessible Web system (TMAW) to increase the degree of accessibility and mobility of Web pages. For context issues, Kim et al. [5] developed important elements of ubiquitous computing to acquire, express and safely use context information, and proposed the Context-Awareness Simulation Toolkit (CAST). In particular, the created context information is reused at the request of an application and put into use for context learning. In [6], an application named "e-examination" was developed for creating and performing computerized tests; Stergiopoulos et al. showed that students who were electronically examined performed better than those conventionally examined. Kekwaletswe considers that the most fundamental facet of learning is the social interaction in which learning is an outcome of individuals sharing experiences [7]; therefore, they proposed a learning environment that provides ubiquitous social presence awareness for the purpose of facilitating interactive consultation as a learner traverses the various learning contexts.

Overall, these studies established a good trend in the development of ubiquitous learning. They combined advanced technologies and suitable pedagogical methods to improve the learning capability of users. Nevertheless, they have not considered the reusability of these learning systems. Fortunately, Web services provide a complete solution for publishing and integrating them in an easy and universally accessible way through the service concept. This study therefore focuses on the efficiency of learning services composition from the QoS perspective.

There are various definitions of Quality-of-Service (QoS), for example, "Quality of Service refers to the probability of the telecommunication network meeting a given traffic contract" [8] and "The degree to which a system, component or process meets customer or user needs or expectations" [9]. Besides, more definitions of quality can be found in [10], which identifies the term from five perspectives: transcendent, product-based, user-based, manufacturing-based and value-based. It is shown that quality is a matter of opinion due to competing views, and there is no universal agreement on what quality exactly means.

For QoS aggregation in service compositions, Cardoso et al. [11] devise a stochastic workflow reduction algorithm to compute the overall QoS performance of service processes. According to different workflow patterns, they define the corresponding QoS aggregation with four attributes: response time, cost, reliability and fidelity. Services that are grouped into workflow patterns can be substituted by a virtual service with aggregative QoS effects, and the QoS performance of a service process can be predicted by performing the substitution repeatedly until the whole process is transformed into a single composite service node. Similar work can be found in [12], where Jaeger et al. also derive QoS aggregation in Web services compositions based on different workflow patterns. In comparison with our research in this paper, their work provides predictive QoS performance of service processes only, and offers no further solution to QoS-aware service discovery and composition.

3. A QoS model of learning services

In terms of functional descriptions, the Web services description language (WSDL) has provided a standard model to specify a service's functionality by separating the abstract representations of the service's input and output messages from the concrete descriptions of the end point's bindings. However, there is at present no standardized description framework specifically designed to cover all aspects of a service's non-functional characteristics. In this section, we synthesize related work in [10], [13], [14] and [15] to present a QoS model of Web services, shown in Table 1, upon which we can discuss our service selection scheme consistently.

Table 1. A QoS model of Web services
Dimension       Attribute       Definition
Performance     Response time   Execution time(S) + Waiting time(S)
Dependability   Reliability     1 − Failure rate(S)
Dependability   Availability    Uptime(S) / (Uptime(S) + Downtime(S))
Cost            Price           Execution fee(S) per request
* S: a service, P: identity of the service provider
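To make the model concrete, the definitions in Table 1 can be expressed directly in code. The sketch below is an illustrative Python rendering (the field names are my own, not from the paper), deriving each attribute from the raw measurements listed in the table.

from dataclasses import dataclass

@dataclass
class ServiceQoS:
    """Raw measurements for one service S, following Table 1."""
    execution_time: float   # seconds
    waiting_time: float     # seconds
    failure_rate: float     # fraction of failed requests, in [0, 1]
    uptime: float           # observed time the service was operational
    downtime: float         # observed time the service was unavailable
    execution_fee: float    # cost per request

    @property
    def response_time(self) -> float:
        return self.execution_time + self.waiting_time

    @property
    def reliability(self) -> float:
        return 1.0 - self.failure_rate

    @property
    def availability(self) -> float:
        return self.uptime / (self.uptime + self.downtime)

    @property
    def price(self) -> float:
        return self.execution_fee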
Response time is an essential temporal concern for Web services, and the evaluation of a service's response time for a request typically comprises the measurement of its execution time and waiting time. Reliability of a Web service refers to its ability to successfully perform the service's functionality for a request. Availability of a Web service is the degree to which the service is operational and accessible when it is required for use. Price is the expense of a service execution for a request.

4. QoS-based learning services composition

Based on the different data types of QoS attributes, we design a service selection scheme that helps service requesters choose services through three stages, as illustrated in Figure 1. For functional and text-based QoS matchmaking, UDDI already provides a systematic way to retrieve qualified services by keyword or directory-based service discovery. Hence, we focus on numeric QoS matchmaking for two different scenarios, namely single QoS-aware service discovery and QoS-aware optimization of service composition. In the former case, the services for a given task are evaluated individually and the one with the highest QoS performance is selected. By contrast, in the latter case the overall workflow structure of a given composition has to be considered in order to find the service set whose aggregative QoS performance is the highest among all combinatorial ones.

[Figure 1. Service selection process]

4.1 Single QoS-aware service discovery

For a given task, many qualified services may be retrieved from a repository through functional and text-based QoS matchmaking. In order to further distinguish the qualified services from each other, we apply multiple criteria decision making (MCDM) with the weighted sum model (WSM) [16] as a uniform evaluation method, which is carried out in two steps.

(1) Normalization of attribute values: In order to prevent inaccurate evaluation due to the various measurement metrics of QoS attributes, the attribute values have to be normalized into the same scale. According to their qualitative properties, numeric QoS attributes can be classified as positive or negative:

q'.value = (q.value − q.min) / (q.max − q.min)  if q.max − q.min ≠ 0;  q'.value = 1  if q.max − q.min = 0      (positive attribute, Eq. 1)

q'.value = (q.max − q.value) / (q.max − q.min)  if q.max − q.min ≠ 0;  q'.value = 1  if q.max − q.min = 0      (negative attribute, Eq. 2)

(2) Weighting and summing of attribute values: Based on the normalization, the QoS performance of each qualified service can be calculated uniformly by summing up the product of each normalized attribute value and the corresponding weight, as shown below. The time complexity of this MCDM with WSM method is linear in the number of qualified services, i.e. O(n) where n is the number of qualified services.

Score(s) = Σ_i q'_i.value × w_i
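As a concrete illustration of the two steps above, the following Python sketch normalizes each attribute with Eq. 1 or Eq. 2 and ranks candidate services by their weighted sum score. The attribute names and weights are example values of my own choosing, not prescribed by the paper.

# Illustrative sketch of Eq. 1 / Eq. 2 normalization and WSM scoring.

# Negative attributes: smaller raw values are better (e.g. response time, price).
NEGATIVE = {"response_time", "price"}

def normalize(values, attr):
    lo, hi = min(values), max(values)
    if hi - lo == 0:
        return [1.0] * len(values)                       # degenerate case in Eq. 1/2
    if attr in NEGATIVE:
        return [(hi - v) / (hi - lo) for v in values]    # Eq. 2
    return [(v - lo) / (hi - lo) for v in values]        # Eq. 1

def rank_services(services, weights):
    """services: list of dicts of raw attribute values; weights: dict of w_i."""
    attrs = list(weights)
    norm = {a: normalize([s[a] for s in services], a) for a in attrs}
    scores = [sum(weights[a] * norm[a][i] for a in attrs)
              for i in range(len(services))]
    return scores.index(max(scores)), scores             # best index and all scores

candidates = [
    {"response_time": 0.8, "reliability": 0.97, "availability": 0.99, "price": 1.2},
    {"response_time": 0.5, "reliability": 0.93, "availability": 0.98, "price": 2.0},
]
best, scores = rank_services(candidates,
                             {"response_time": 0.4, "reliability": 0.3,
                              "availability": 0.2, "price": 0.1})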
4.2 QoS-based optimization of service composition

4.2.1 The aggregative QoS effects of workflow patterns

In [17], a set of workflow patterns is used to check the modeling power of BPEL4WS, XLANG, WSFL, XPDL and four workflow management systems. Based on that comparison, we select some basic patterns to demonstrate how the aggregative effects are derived from a pattern's structure; the derivation can be performed for other, more complicated patterns in the same way. In addition to the descriptive definitions, we also illustrate the formal semantics of the selected workflow patterns with Petri nets [18] to clarify their meaning graphically.

Sequence pattern is an execution model where services are carried out one by one. Due to the sequential execution order, there is no overlap between the two services' execution periods, so the aggregative response time is the sum of the two services'. Because the two services are executed independently, the probability of successfully performing both services is the product of their reliability values, and the aggregative availability is calculated in the same way. Similarly, the total expense for both executions is the sum of the two services' prices.

Parallel split pattern describes a special sequential process where one or more services in the latter part are executed concurrently after the completion of the former service. The response time of the parallel services is the longest duration among their executions, and thus we obtain the aggregative response time by adding the former service's response time to the maximal one in the latter part.
As with the sequence pattern, the aggregative reliability, availability and price are obtained as the product and sum of the constituent services' values, respectively.

Synchronization pattern also depicts a particular sequential process, in which the completion of all parallel services in the former part triggers a service execution in the latter part.

Arbitrary cycles pattern is a kind of sequential process where the completion of a service triggers another round of execution of the same service. Consequently, the aggregative QoS effects depend on the number of repeated executions.

Exclusive choice pattern is similar to the parallel split pattern but differs in that only one alternative is performed after the completion of the former service. Hence, the aggregative QoS effects are represented as multiple sequence patterns.

Synchronizing merge pattern includes two execution models. One is the variant of the sequence pattern with exclusive choice, and the other is the combination of the parallel split pattern and the synchronization pattern.

Multiple merge pattern encompasses three execution models. The first two are the same as in the synchronizing merge pattern, and the third model is the variant of the parallel split pattern without synchronization. Since there is no synchronization in the third model, the terminating service is executed once whenever each former service is done.
4.2.2 A mathematical programming based solution

The problem of finding the service combination whose aggregative QoS performance is the best for a given composition is a kind of combinatorial optimization that can be solved by mathematical programming techniques [19]. In terms of mathematical programming, there are a set of variables, an objective function and a set of constraints. For better illustration, we use an example of ubiquitous learning, shown in Figure 2, to demonstrate how the proposed solution works.

[Figure 2. An example of ubiquitous learning]

Suppose a teacher tries to build the composition for providing a ubiquitous learning service, and there are m qualified services available for each task after functional and text-based QoS matchmaking. In order to find the best service combination, we define the variables s_ij as follows:
s_ij = 1 if the qualified service i is selected for task j; s_ij = 0 otherwise.
In a service combination, exactly one service is selected for each task, and thus we have the following constraint on the previously defined variables:

Σ_{i=1}^{m} s_ij = 1,   1 ≤ j ≤ 7

Besides, we can derive the objective function of s_ij from the aggregative QoS effects of the constituent workflow patterns, as illustrated below. In the expressions, q_rt, q_r, q_a and q_p represent the value of a service's response time, reliability, availability and price, respectively. In order to solve the objective function with ease, we apply a logarithm to transform Q_reliability and Q_availability into a linear form of s_ij, as shown below. As with single QoS-aware service discovery, the attribute values should be normalized into the same scale with Eq. 1 and Eq. 2 correspondingly. As response time is a negative attribute, the max expression in the aggregative response time is converted into a min expression after normalization.
Finally, we can determine the best service combination by finding the maximal value of the refined objective function.
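One way to make this formulation concrete is shown below: a small Python sketch using the PuLP modelling library (my choice of tool, not necessarily the solver used by the authors). It declares the binary selection variables s_ij, the one-service-per-task constraints, and a linear objective built from per-service scores, assuming for simplicity a purely sequential composition so that every aggregated term is a sum over the selected services.

import pulp

def select_services(scores):
    """scores[j][i]: weighted sum of normalized (and, for reliability and
    availability, log-transformed) attribute values of candidate i for task j."""
    tasks = range(len(scores))
    prob = pulp.LpProblem("qos_composition", pulp.LpMaximize)
    s = {(i, j): pulp.LpVariable(f"s_{i}_{j}", cat="Binary")
         for j in tasks for i in range(len(scores[j]))}

    # Objective: overall QoS score of the composition.
    prob += pulp.lpSum(scores[j][i] * s[i, j] for (i, j) in s)

    # Exactly one service is selected for each task.
    for j in tasks:
        prob += pulp.lpSum(s[i, j] for i in range(len(scores[j]))) == 1

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {j: next(i for i in range(len(scores[j]))
                    if pulp.value(s[i, j]) > 0.5) for j in tasks}

# Example: two tasks with three and two candidate services respectively.
chosen = select_services([[0.41, 0.77, 0.63], [0.52, 0.48]])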
5. Experiments on QoS-based optimization of services composition

In order to evaluate the efficiency of the proposed mathematical programming based solution, we performed a series of experiments comparing our solution with the enumeration method on a desktop with a Pentium 4 3.0 GHz CPU and 1 GB of RAM. Based on the travel planning scenario illustrated in Figure 3, we randomly generate a population of 10000 services for each task. For higher accuracy, each experiment is carried out 30 times, and the qualified services are randomly re-drawn from the populations every time.

Suppose there are 6, 7, 8, 9 and 10 qualified services for each task in Figure 3; the corresponding measurements of the enumeration method's time cost are illustrated in Figure 3.

[Figure 3. The performance of the enumeration method with 7 tasks: time cost of 59, 171, 456, 1027, 2182, 4422 and 7844 ms for 6 to 12 qualified services per task]

It is obvious that the time cost grows rapidly when more qualified services are available for each task. Conversely, the time cost of our mathematical programming based solution is always less than 0.01 ms under the same setting. In fact, the time complexity of the enumeration method is proportional to O(m^n), where m is the number of qualified services and n is the number of tasks. Even when there are numerous tasks in a composition, each with many qualified services, the performance of our mathematical programming based solution remains very efficient, as illustrated below.

[Figure 4. The performance of the mathematical programming based solution. Top: time (ms) versus number of qualified services (60 to 100) with the number of tasks fixed at 98 (174, 234, 308, 390 and 493 ms). Bottom: time (ms) versus number of tasks (49 to 98) with the number of qualified services fixed at 100 (240, 273, 307, 339, 374, 409, 444 and 493 ms).]

The top of Figure 4 depicts the performance for a composition with 98 tasks; it shows that increasing the number of available qualified services for each task does not impose a heavy computational burden on our solution. Similarly, the bottom of Figure 4 shows that the performance of our solution is almost linear in the degree of a composition's complexity. Based on the experimental data in [20], their IP with DAG solution needs 1.6 seconds when there are 40 tasks and 40 qualified services for each task. Obviously, our solution outperforms Zeng et al.'s, since it takes only 0.240 seconds when there are 49 tasks and 100 qualified services for each task, as shown in the bottom of Figure 4. Finally, we carried out more experiments testing the efficiency of our solution for very large service compositions, as shown in Table 2. The experimental results prove that our solution is not only better than other related work but also very useful in the presence of a considerable number of tasks and qualified services.
Table 2. Experimental data of the mathematical programming based solution
No. of tasks    No. of qualified services
                60        70        80        90        100
525             919 ms    1247 ms   1629 ms   2055 ms   2564 ms
630             1111 ms   1507 ms   1975 ms   2506 ms   3082 ms
735             1290 ms   1761 ms   2325 ms   2915 ms   3648 ms
840             1483 ms   2064 ms   2667 ms   3304 ms   4064 ms
945             1656 ms   2246 ms   2927 ms   3705 ms   4589 ms
1050            1880 ms   2589 ms   3308 ms   4205 ms   5242 ms

6. Conclusions and future work

The development of QoS-aware Web services is a popular research issue, as it is seen as the foundation of trustworthy service-oriented computing. The promise of providing services with a certain QoS performance will make people more confident in adopting Web services for critical tasks, e.g. business transactions and personal finance. According to the different data types of service attributes, services are evaluated through three stages, and the corresponding solutions have been illustrated. Besides, the experimental results have exhibited the practicability of our mathematical programming based solution and proved that it outperforms other related work as well. In the near future, we will focus on QoS-aware failure recovery by extending the proposed service selection scheme to provide non-interrupted service execution.

7. Acknowledgement

This work is supported by the National Science Council, Taiwan under grants NSC95-2520-S008-006-MY3 and NSC96-2628-S008-008-MY3.

8. References

[1] E. Christensen, F. Curbera, G. Meredith and S. Weerawarana, "Web Service Description Language (WSDL) 1.1," W3C, http://www.w3.org/TR/WSDL, 2001.
[2] UDDI coalition, "Universal Description, Discovery and Integration," OASIS, http://www.uddi.org/, 2000.
[3] Bajaj, S. et al., 2006. Web Services Policy 1.2 - Framework (WS-Policy). W3C Member Submission. Available from: <http://www.w3.org/Submission/WS-Policy/>.
[4] J.C. Lin and Y.M. Yeh, "A Research on Telephone-Supported Multimodal Accessible Website", IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, 2006, pp. 118-123.
[5] I. Kim, H. Park, B.N. Noh, Y.L. Lee, S.Y. Lee, and H.H. Lee, "Design and Implementation of Context-Awareness Simulation Toolkit for Context Learning", IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, 2006, pp. 96-103.
[6] C. Stergiopoulos, P. Tsiakas, D. Triantis, and M. Kaitsa, "Evaluating Electronic Examination Methods Applied to Students of Electronics. Effectiveness and Comparison to the Paper-and-Pencil Method", IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, 2006, pp. 143-151.
[7] R.M. Kekwaletswe and D. Ngambi, "Ubiquitous Social Presence: Context-Awareness in a Mobile Learning Environment", IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, 2006, pp. 90-95.
[8] Wikipedia, 2006. Quality of Service. Available from: <http://en.wikipedia.org/wiki/Quality_of_service>.
[9] Jay, F. and Mayer, R., 1990. IEEE Standard Glossary of Software Engineering Terminology, IEEE Std 610.12-1990.
[10] Garvin, D. A., 1988. Managing Quality: The Strategic and Competitive Edge, Free Press, New York, pp. 49-68.
[11] Cardoso, J., Miller, J., Sheth, A. and Arnold, J., 2002. Modeling Quality of Service for Workflows and Web Service Processes, LSDIS Lab, Computer Science, University of Georgia, Tech. Rep. #02-002.
[12] Jaeger, M.C., Rojec-Goldmann, G. and Muhl, G., 2005. QoS Aggregation in Web Service Compositions, in Proc. of IEEE International Conference on e-Technology, e-Commerce and e-Service (EEE), pp. 181-185.
[13] Menasce, D.A., 2002. QoS Issues in Web Services, IEEE Internet Computing, Vol. 6, Issue 6, pp. 72-75.
[14] Cardoso, J., Miller, J., Sheth, A. and Arnold, J., 2002. Modeling Quality of Service for Workflows and Web Service Processes, LSDIS Lab, Computer Science, University of Georgia, Tech. Rep. #02-002.
[15] O'Sullivan, J., Edmond, D. and Hofstede, A. T., 2002. What's in a Service? Towards Accurate Description of Non-Functional Service Properties, Distributed and Parallel Databases, Kluwer Academic Publishers, Vol. 12, pp. 117-133.
[16] Hwang, C.L. and Yoon, K., 1981. Multiple Attribute Decision Making: Methods and Applications, Springer-Verlag.
[17] Aalst, W. M. P. van der, 2003. Don't Go with the Flow: Web Services Composition Standards Exposed, in S. Staab (ed.) Web Services: Been There, Done That? IEEE Intelligent Systems, Vol. 18, Issue 1, pp. 72-76.
[18] Murata, T., 1989. Petri Nets: Properties, Analysis and Applications, Proc. of the IEEE, Vol. 77, No. 4, pp. 541-580.
OASIS WSS TC, 2006. Web Services Security. OASIS. Available from: <http://www.oasis-open.org/committees/wss/>.
[19] Korte, B. and Vygen, J., 2005. Combinatorial Optimization: Theory and Algorithms, Springer-Verlag.
[20] L. Zeng, B. Benatallah, A.H.H. Ngu, M. Dumas, J. Kalagnanam, and H. Chang, "QoS-Aware Middleware for Web Services Composition", IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, 2006, pp. 311-327.

Collaborative annotation creation and access in a multimodal environment with heterogeneous devices for decision support and for experience sharing

Charles Robert
Laboratoire Lorrain de Recherche en Informatique et ses applications
Campus Scientifique, BP 54506, Vandoeuvre-lès-Nancy, France
Email: charles.robert@robert-scientist.com

Abstract

The importance of annotation as the personal expression of individuals on documents of interest cannot be overemphasized in decision support systems and experience sharing. An annotation model (AMIE: Annotation Model for Information Exchange) proposed integrating the peculiarities of the annotation creator for decision making, and the fact that annotation can assist in decision making was well discussed in AMIE. If annotation is to be used for decision support, the peculiarities of the decision maker must be given proper attention, specifically when collaborative efforts are involved. It is a known fact that more and more decision makers favor divergent means of access to information, including PDAs and other non-conventional devices. The possibility of information access, specifically to annotations, through divergent sources must therefore be considered with substantial attention. This work gives a revised description of the AMIE model in relation to the divergence of information sources. It presents efforts made to make annotation available across divergent devices in a collaborative environment for decision making and for experience sharing. The Microsoft Visual Studio .NET platform was used to experiment with the proposal: a database running on SQL Server CE on the PDA was synchronized with the standard SQL Server to permit divergent information creation and access.

Keywords
Annotation, heterogeneous environment, devices, user parameters, access, creation

1. Introduction

Annotation is defined as a specific expression of the interpretation of a document, by a user of the document, on the document. It is an action as well as the resulting object. The personal expression of users is of interest because an annotation reflects the user's view and not the view of the document creator. Because of the personal traits in annotation, it has been proposed as a useful tool for decision support systems (Robert, 2007). In that same work, an annotation model, AMIE (Annotation Model for Information Exchange), was developed bearing in mind the specificities of document users, who in turn are annotation creators in decision support. The model was based on principles of information systems and decision making.
The model can be summarized as

Δ = Cx · Dt · Ar · (T1 − T2)

Where
Cx is the context of annotation. The context is related to an object, signal and hypotheses. It also has to do with the objective of annotation, the type of collaboration involved, the information research problem, experience sharing and the decisional problem.
Dt is the associated document, described with bibliographic parameters such as title, publication date, authors, etc.
Ar is the document user who is also the annotation creator. He has an identity, preferences, experiences, social values and environments.
(T1 − T2) is the period when the annotation was made.
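As a rough illustration of how the four components of the model can be carried in software (the field names are mine, not from AMIE), they map naturally onto a small data structure:

from dataclasses import dataclass
from datetime import datetime

@dataclass
class Annotation:
    """Illustrative rendering of Delta = Cx . Dt . Ar . (T1 - T2)."""
    context: dict         # Cx: objective, collaboration type, decisional problem, ...
    document: dict        # Dt: title, publication date, authors, ...
    annotator: dict       # Ar: identity, preferences, experiences, social values, ...
    started_at: datetime  # T1
    ended_at: datetime    # T2

    @property
    def period(self):
        """The period over which the annotation was made."""
        return self.ended_at - self.started_at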
Based on these parameters of the model, it was argued that annotation can assist in decision making related to the annotation creator, the document, and the events related to the time of annotation. The model was specifically suited to electronic written annotation. The issue of the technology of annotation was not addressed, because several annotation tools have addressed this problem in isolation from decision making efforts.

2. Related work

An annotation tool, FlexNetDiscuss (or FND) (Chong, 2003), was developed to permit contextualized synchronous and asynchronous discussions around arbitrary HTML documents. The aim was to share learning experiences in an Internet environment. This is interesting work, but it considers neither the specificity of the devices that may be used nor the possibility of using annotation for experience sharing and decision making. In Command and Control (C2) organizations and in other military systems, it was argued that such settings must operate decisively and synchronously in uncertain and dynamic situations (Freeman and Hess, 2003). The argument was that individual team members must collaborate in their application of critical thinking in a process called "collaborative critical thinking". It will be interesting to see whether this kind of process can function without effectively considering the relative place of the participating devices.
Whereas earlier works in annotation (Marshall, 1997) (Neuwirth et al, 1990) (Brush et al, 2002) considered annotation a routine activity for notification, particularly among students in a university library, the attention here goes beyond collaborative activities. When annotation is supported by specific tools and a model, it can assist in decision making, not only about students but also about books (Robert, 2007).
The VesselWorld system (Landsman et al, 2000) is a synchronous communication tool that is made available to participants in different places at the same time for cooperatively solving a problem requiring coordination. It provided tools for helping the players solve the coordination task. Though this is a good attempt at collaborative work, consideration for heterogeneous devices was not apparent.

3. Collaborative annotation for experience sharing

Collaboration for decision making and experience sharing involves divergent sources of information and several users. When annotations are made, they are made not for private use but for a group effort at experience sharing. The work assumed that:
• There are at least two participants in the information collaboration,
• Participants are free to choose a device of their choice at any time, without reference to the device used in a previous collaboration,
• Annotation creation is distributed and collaborative,
• Annotation spans more than one specified period, i.e. annotations are made several times,
• The means of exchange of annotation is electronic.

Several methods of experience sharing based on annotation in a collaborative environment were identified; however, two methods of interest were employed in the study, namely moderated and symbiotic annotation.
Moderated annotation: In this case, an annotation (an expressed personal perception or interpretation) is shared in a moderated way. A moderator is at the centre of coordination, and all the other annotators are linked to the central moderator.
Symbiotic annotation: In this case, it is assumed that all participants in the collaboration have a unique experience to share. No one is rated higher than the other. Shared annotation is based on the divergent experiences of the participants. Since annotation is on one single document, an annotation is considered on the basis of the initiative of one member of the collaboration. There were common grounds between these participants.

4. Methodology for integration and implementation

Several methodologies for information integration are available. It is possible to integrate information using a frame base. There have been reports of indexing divergent information sources for eventual common use. It is also possible to use query reformulation, passed (crawled) into divergent information sources.
In this work, information is made available to the annotation database, and retrieved from it, using a parallel access mode.
In the methodology adopted, three parts can be identified for data integration: the base databases hosted on the participating devices, the dynamic database generated on the server end, and the interface. The base databases are heterogeneous and diverse. Each database was built for a particular purpose and with its own schema. Though the content of different databases may be the same, their representations may differ. Each database is complemented with a dictionary that explains the content of the database.
The second part of our schema is a dynamic database built on a decoder, a harmonizer, a delineator and a context integrator.
The base decoder was meant to analyze the participating databases from the devices: to identify the database type, the database model and its structure. The delineator used the associated data dictionaries to understand the content of the databases.
The context of annotation determined what was passed into the information system. For example, if the context of annotation was academic, the interest might be in the attributes of the generalized (compound) database that have to do with education. The content of each database was made known through the delineator.
The Microsoft Visual Studio .NET platform was used to experiment with the proposal. A database running on SQL Server CE on the PDA was synchronized with the standard SQL Server to permit divergent information creation and access.

[Figure 1: Methodology for data integration. Base databases and their dictionaries feed the base decoder (structure), the delineator (content), the harmonizer and the context integrator, which supply the information system interface.]
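A very small sketch of the idea, with invented mappings standing in for the per-database data dictionaries, is shown below. It only illustrates how a harmonizer could fold records from heterogeneous base databases into one integrated annotation view; it is not the system's actual code.

# Illustrative harmonization of heterogeneous base databases into one view.
# The per-base dictionaries below play the role of the data dictionaries that
# explain each database's content; all names are invented for the sketch.

PDA_DICT = {"note": "annotation_text", "doc": "document", "who": "annotator"}
DESKTOP_DICT = {"comment": "annotation_text", "reference": "document", "user": "annotator"}

def harmonize(record, dictionary):
    """Rename a base-database record into the integrated schema."""
    return {dictionary[field]: value for field, value in record.items()
            if field in dictionary}

def integrate(sources):
    """sources: list of (records, dictionary) pairs, one per participating device."""
    return [harmonize(r, d) for records, d in sources for r in records]

integrated = integrate([
    ([{"note": "check section 2", "doc": "report.pdf", "who": "alice"}], PDA_DICT),
    ([{"comment": "agree", "reference": "report.pdf", "user": "bob"}], DESKTOP_DICT),
])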
The three-tier architecture consisted of three elements as three layers: the presentation layer, the data layer and the application layer.
The presentation layer is related to the display of information associated with the client in our system. This layer permits the user to send a request to a resource situated on a distant web server, www.loria.fr. This layer was presented in the form of a web browser. The client machine on this layer was linked to the web server by a TCP/IP network. The user's request was carried over the Internet connection by the HTTP and GPRS protocols.
The application layer was a functional layer that ensures the processing of the sent request and the corresponding response. For this prototype, Microsoft Visual Studio was used at the back end.
The data layer was linked to a database server. The essence of the database was to store and manage the associated information. Commands sent by user queries are managed using SQL Server at the server end.

[Figure 2: Conception for device integration. Palmtop, computer, data logger, telephone and GPS connect through a device integration framework to the integrated heterogeneous information system and referenced information sources.]

5. Conclusions and perspective

This work examined how heterogeneous devices can participate in annotation creation and access for decision making and experience sharing. The methodology of data integration was based on separating the data structure and data content, combined in a dynamic environment on the web server. Practically, devices are integrated into the system using existing technology proposed by Microsoft.
The report underlines the experimentation of the proposal in the decision making process and experience sharing; its adaptation to learning processes is underway. In this case, three types of learning were identified:
• Cognitive: mental skills (Knowledge)
• Affective: growth in feelings or emotional areas (Attitude)
• Psychomotor: manual or physical skills (Skills)

Annotation was further detailed into six learning levels: (a) Knowledge, (b) Comprehension, (c) Application, (d) Analysis, (e) Synthesis, (f) Evaluation. With this kind of division, it will be possible to effectively apply this work to learning environments.
A formal framework and standard for device integration in collaborative annotation for decision making is being developed. Such a framework will be independent of the technology that may be employed.

6. Reference

[1.] Brush A.J. Bernheim, Bargeron David, Grudin Jonathan, and Gupta Anoop, "Notification for shared annotation of digital documents", Proceedings of CHI 2002, April 20-25, 2002, Minneapolis, Minnesota, USA, 2002
[2.] Chong Ng S. T., "FlexNetDiscuss: A Generalized Contextual Group Communications System to Support Learning Inside and Outside the Classroom", Proceedings of the 3rd IEEE International Conference on Advanced Learning Technologies (ICALT'03), 2003
[3.] Freeman Jared and Hess Kathleen P., C2 Experimentation, C2 Decision making and cognitive analysis, C2 Assessment tools & metrics: Collaborative critical thinking, 8th International Command and Control Research and Technology Symposium, June 17-19, 2003, National Defense University, Washington, DC, 2003
[4.] Keyani Pedram, mRNA: A digital annotation system to facilitate multi-disciplinary group collaboration
[5.] Landsman Seth M., Alterman Richard, Feinman Alex and Introne Joshua, "VesselWorld and ADAPTIVE", R CS-01-213, Brandeis University. Demo given at CSCW 2000, 2000
[6.] Marshall Catherine C., 1997, "Annotation: from paper books to the digital library", Proceedings of ACM DL 97, Philadelphia PA, USA
[7.] Neuwirth Christine M., Kaufer David S., Chandhok Ravinder and Morris James H., "Issues in the Design of Computer Support for Co-authoring and Commenting", Proceedings of CSCW 1990, pages 183-195, 1990
[8.] Oladipupo Oluremi, 2007, Distributed Decision Support System (DDSS) for Nigeria Bottling Company, BSc Computer Science thesis, University of Ibadan
[9.] Robert Charles Abiodun, 2007, L'annotation pour la recherche d'information dans le contexte d'intelligence économique, Thèse doctorat de l'université Nancy 2, Nancy, France, 16 février 2007, 278 pp.

A Computer-Assisted Approach for Designing Context-Aware Ubiquitous Learning Activities

Tzu-Chi Yang(1), Fan-Ray Kuo(2), Gwo-Jen Hwang(2) and Hui-Chun Chu(2)
(1) Department of Information Management, National Chi Nan University
(2) Department of Information and Learning Technology, National University of Tainan
wilsonyang57@gmail.com, {revonkuo; gjhwang; carolchu}@mail.nutn.edu.tw

Abstract

Recent advances in wireless communication and sensor technologies have led the educational conception into a new area, called context-aware ubiquitous learning. This new conception of learning not only offers significant advantages, but also reveals the difficulty of applying it. The major difficulty is owing to the lack of procedural guidance to assist teachers in designing learning activities that bring the new learning conception into full play. To cope with this problem, this study proposes a procedural knowledge acquisition strategy for designing context-aware ubiquitous learning activities. In addition, a practical application is presented to show the effectiveness of the innovative approach.

1. Introduction

In recent years, wireless sensor technology, which can transmit and receive messages via various forms of communication such as coordinates, data, machine codes and temperature, has progressed rapidly. A system equipped with wireless sensor technology is able to collect states (called "context") from the real world; that is, the system can sense student information and environmental information in the real world and then provide personalized services accordingly. Such a feature is often called "context awareness" [9].
More and more studies that take advantage of context awareness features have been reported, especially in the development of ubiquitous learning (u-learning) environments that emphasize the provision of anytime, anywhere learning scenarios. For example, Ogata and Yano developed a system to assist overseas students in Japan to learn the Japanese language [13]; Yang proposed a learning environment, which stores resources through a peer-to-peer (P2P) model, for encouraging learning resource sharing [17].
Although the issues concerning web-based learning and mobile learning have been widely discussed, u-learning with context awareness features is still a novel way of learning in terms of research. Although several papers have presented the structure and curriculums for context-aware or ubiquitous learning, the way to design and create the context-aware u-learning activities is seldom discussed. Past experience shows that developing a learning system that takes personal and environmental contexts into account is time-consuming [7], and planning a u-learning curriculum without any guidance is even more difficult. Therefore, in this paper, an innovative approach, UPAM (U-learning Procedure Acquisition Method), is proposed, which can help domain experts (teachers) design u-learning activities by taking both the real-world and virtual-world environments into consideration.

2. Relevant research

In recent years, researchers of e-learning have noticed the progress of wireless communication and sensor technologies; therefore, the research issues have progressed from e-learning to m-learning (mobile learning) and u-learning. Moreover, several significant characteristics of context-aware u-learning, which make it different from conventional m-learning or broad-sense u-learning, have been discussed, including seamless services, context-aware services, and adaptive services [7].

2.1 Ubiquitous computing technologies

In recent years, a variety of wireless communication and context-aware products have been developed, such as sensors and actuators, RFID (Radio Frequency Identification) tags and cards, wireless communication equipment, mobile phones, PDAs (Personal Digital Assistants), and wearable computers. In this computing environment, anyone can make use of computers that are embedded everywhere in a public environment at any time. A student equipped with a mobile device can connect to any of them and access the network by using wireless communication technologies [19].
Moreover, not only can a student access the network actively, but computers around the student can recognize the student's behaviors and offer various services according to the student's situation, the mobile terminal's facilities, the network bandwidth, and so on [2]. Researchers call such a new computing conception ubiquitous computing (u-computing).
Student assistance via u-computing technologies is realized by providing students with proper decisions or decision alternatives. That is, a system equipped with ubiquitous computing technology supplies students with timely information and relevant services by automatically sensing the students' various context data and smartly generating proper results [14].
Furthermore, several approaches have been proposed for building context ontologies, which can be used to describe the structure of the context hierarchy and the relationships among different contexts [1], [20].

2.2 Context-aware ubiquitous learning

Although u-learning has attracted a lot of attention from researchers all over the world, the criteria for developing a u-learning environment have not been clearly defined. To date, researchers hold different views on the term "u-learning". One view is "anywhere and anytime learning", which is a very broad-sense definition of u-learning. With this definition, any learning environment that allows students to carry mobile learning devices with wireless communication capability is a u-learning environment; that is, such a u-learning scenario is similar to that of the well-known mobile learning, which allows students to access teaching contents via wireless networks at any location, at any time.
A stricter definition of u-learning is "learning with u-computing technology", which emphasizes not only the usage of wireless communications but also sensor technology. The context-aware feature of u-computing environments allows the learning system to better understand the student's behaviors and the timely environmental parameters in the real world, such as the location and behaviors of the student, and the temperature and moisture of the learning environment [21]. Such contexts could be brief or detailed; for example, the location of the student could be described by a zip code or by a physical address. In the following discussions, we shall focus on this definition of u-learning and call it "context-aware u-learning" to distinguish it from the broad-sense definition.
Among the various contexts that can be sensed, researchers have indicated that "time" and "location" may be the most important and fundamental parameters for recognizing and describing a student's context [10]. For example, Ogata and Yano developed a u-learning system, which has been used to guide students to learn Japanese in real-world situations [13]. The system can provide students with appropriate expressions according to different contexts (e.g., occasions or locations) via mobile devices (e.g., PDAs). Rogers [16] integrated the learning experiences of indoor and outdoor activities by observation in the working scene. Students are capable not only of getting data, voice and images from the scene by observation, but also of gathering related information from learning activities via wireless networks. Recently, Joiner, Nethercott, Hull, and Reid [8] presented their studies of applying context-aware devices in education by timely offering vocal statements of activities for students in real conditions.
In the meanwhile, researchers have attempted to find principles and methods for designing u-learning activities. For example, Cheng et al. [3] demonstrated how a u-learning system provides adaptive services via four steps: (1) setting instructional requirements for each of the student's learning actions; (2) detecting the student's behaviors; (3) comparing the requirements with the corresponding learning behaviors; (4) providing personal support to the student.
Such a learning environment basically consists of the following components: (1) a set of sensors that are used to detect personal contexts (e.g., the location and body temperature of the students) and environmental contexts (e.g., the temperature and moisture of the learning environment); (2) a server that records the contexts and provides active and passive support to the students; (3) a mobile learning device for each student, with which the student can receive the support or guidance from the server as well as access information on the Internet; (4) wireless networks that enable the communications among the mobile learning devices, the sensors and the server.
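The adaptive-service cycle summarized above (set requirements, detect behavior, compare, support) can be pictured with a small sketch. The following Python fragment is illustrative only, with invented names for the sensing and messaging hooks, and is not taken from any of the cited systems.

# Illustrative sense-compare-support loop for one context-aware learning step
# (read_context and send_hint are placeholder hooks, not a real API).

def run_learning_step(step, read_context, send_hint, max_rounds=100):
    """step: dict with 'name' and 'requirements' mapping contexts to expected values."""
    for _ in range(max_rounds):
        observed = read_context()                       # personal/environmental contexts
        unmet = {name: expected
                 for name, expected in step["requirements"].items()
                 if observed.get(name) != expected}
        if not unmet:                                   # finishing conditions satisfied
            return observed
        send_hint(step["name"], unmet)                  # personal support for unmet items
    return None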
Another issue raised with context-aware u-learning is the representation of the interactions of all the objects in the learning environment. It is an important and challenging issue for both the system developer and the domain experts to define a u-learning course or activity that takes personal contexts and environmental contexts into consideration [7]. Henricksen and Indulska proposed a Context Modeling Language (CML), which reformulates the modeling concepts as extensions to Object-Role Modeling (ORM) [9]; Yuan and Chen used the Resource Description Framework (RDF) and Resource Description Framework Schema (RDFS) to represent the context information of various learning scenarios [18]. In the meanwhile, an intelligent decision-making strategy and a domain expert system with context-aware capability were also proposed [15]. In the following sections, a U-learning Procedure Acquisition Method (UPAM) is proposed based on the notations of these approaches.

3. U-learning procedure acquisition method (UPAM)

In a context-aware u-learning activity, several kinds of interactions between the students and the real-world contexts need to be taken into account. Usually, the domain expert will conduct several learning activities in a course unit. Each learning activity consists of several phases that guide the students to different learning contexts; furthermore, a phase consists of some relevant steps. Usually a learning activity is allowed to halt between phases; nevertheless, it is better to prevent the learning activity from halting between steps, unless some accident has happened.
Three kinds of learning activities can be conducted in a context-aware u-learning environment. (1) Learning material presentations based on the sequence of the curriculum defined by the domain expert. The learning sequence and context awareness for this kind of curriculum design are not absolutely required; the learning sequence is set to make learning more systematic by guiding students to learn in the real world with the supplemental materials provided by the learning system. In such a learning activity, the learning sequence usually does not affect the learning result. (2) A sequence of learning steps that guides the student to identify or classify a set of real-world objects. This kind of curriculum design not only emphasizes the learning sequence, but also evaluates whether the students understand the concepts to be learned. Therefore, for such a learning activity, the learning sequence needs to be followed more strictly than in the former case. (3) The training of a series of equipment operations as well as the operating procedure. The operating procedure and the operations of the equipment are highly correlated, so the learning sequence of such a learning activity must be entirely followed.
According to the foregoing and to past experience, we propose the U-learning Procedure Acquisition Method, UPAM for short, to assist domain experts in planning context-aware u-learning activities in a more efficient and effective manner. The UPAM algorithm consists of three phases: the first phase is to acquire the context parameters for the learning activities; the structure of the learning activities is defined in the second phase; the content and relevant conditions for each learning step are defined in the third phase. In addition, this approach not only provides a preview service to keep the process manageable for the domain expert, but also shows the progress of content acquisition to the experts. The UPAM algorithm is given as follows (a brief code sketch of Phase 1 appears after the listing):

Phase 1: Define the context-aware parameters to be considered.
1.1: Initially, the Learning Area Set (LAS) is defined as the main activity zone and the Context Parameter Set (CPS) is defined as the environmental parameters of the main activity zone.
1.2: While (there are more sub-activity areas to be identified in the main activity area)
{
    Add the new activity area to LAS and the corresponding environmental parameters to CPS.
}
1.3: For (each sub-activity area i)
{
    For (each device j in sub-activity area i)
    {
        Add device j and the corresponding context parameters to CPS.
    }
}
1.4: Add the personal context parameters of the students to CPS.
1.5: Add the parameters of the personal profile and learning portfolios to CPS.
1.6: Define the possible feedbacks that will be sent to the students via the mobile devices.

Phase 2: Define the structure of the learning activity.
2.1: Set the general attributes for the learning activity.
2.2: Construct the conspectus of the learning activity.
2.3: Preview the conspectus of the learning activity.

Phase 3: Provide content and conditions for each learning step.
3.1: For (each learning phase)
{
    Give the teaching content for each learning step in the phase.
    Define the finishing conditions for each learning step in the phase.
    Confirm the finishing conditions for each learning step in the phase.
    Preview the teaching content and the learning procedure in the phase.
}
3.2: Preview the entire procedure of the curriculum (optional).
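Phase 1 of the algorithm essentially builds two growing collections, the Learning Area Set (LAS) and the Context Parameter Set (CPS). A literal rendering in Python might look like the following sketch; the data shapes are my own choice, as the paper only prescribes the steps.

# Sketch of UPAM Phase 1: collecting the Learning Area Set (LAS) and the
# Context Parameter Set (CPS). Data shapes are illustrative only.

def upam_phase1(main_zone, sub_areas, personal_params, profile_params, feedbacks):
    """
    main_zone:  {"name": ..., "env_params": [...]}
    sub_areas:  list of {"name": ..., "env_params": [...], "devices": [...]},
                each device being {"id": ..., "context_params": [...]}
    """
    las = [main_zone["name"]]                       # step 1.1: main activity zone
    cps = list(main_zone["env_params"])

    for area in sub_areas:                          # step 1.2: sub-activity areas
        las.append(area["name"])
        cps.extend(area["env_params"])

    for area in sub_areas:                          # step 1.3: devices in each area
        for device in area["devices"]:
            cps.append(device["id"])
            cps.extend(device["context_params"])

    cps.extend(personal_params)                     # step 1.4: personal contexts
    cps.extend(profile_params)                      # step 1.5: profile and portfolio
    return las, cps, list(feedbacks)                # step 1.6: possible feedbacks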
In order to clearly present the results acquired by UPAM, five context-aware parameters are used to define the structure of UPAM [6]:

(1) Personal context, including the student's location, arrival time, body temperature, heartbeat rate, blood pressure, etc. (2) Environmental context, including the sensor's ID, location, temperature, humidity, atmospheric gradient, and other environmental context and objects around the sensors. (3) Feedback returned from sensors equipped on the learning device, which includes data from the target object, such as temperature, pH of water, value of air pollution, shape of a tree, and color. (4) Personal information and learning portfolio, including the student's calendar, the starting time of the learning activity, the shortest and longest times to learn the activity, Internet on-line discussion statements, learning place, learning portfolio and sequence, and the restrictions of the learning activity, etc. (5) Environmental information derived from the database, including detailed information about the learning place, such as the sequence of the learning activity, location limitations and management rules, usage records, and the location and equipment in the learning place.

Although several previous studies have mentioned the use of context-aware parameters in learning activities, it is difficult for the domain experts to define the context-aware parameters while designing the learning activities. The learning system developers might be more aware of the context-aware parameters; however, they are not the learning activity designers. Therefore, the most efficient and effective way of defining the parameters is to show the domain experts the candidate parameters and to guide them to make decisions, which is the strategy used in UPAM.

In addition to the definition of context-aware parameters, UPAM guides the domain experts to complete various contents for the learning activities, including (1) context-aware parameters and the architecture of the curriculum; (2) teaching content, procedure, and finishing conditions for each step; and (3) additional information, such as the expected learning path and the estimate of wireless sensor devices to be deployed. The learning path depends on the combination of the curriculum scheme and the completion condition of each step. Moreover, the estimate of wireless sensor devices relies on the integration of the completion conditions for each step and the context-aware parameters. In the next section we present a case study to introduce how UPAM is executed.

4. Development of a computer-assisted u-learning activity design system

To demonstrate the usefulness of our innovative approach, two experts with extensive experience in planning u-learning curriculums were asked to organize the context-aware learning activity of a science course with UPAM. Sixth-grade students are the main learners, and the aim of the curriculum is to understand seasonal changes on the Earth and time differences in different areas of the world.

The learning activity consists of a sequence of learning targets specified and organized by the teachers, such that the students can learn the domain knowledge by observing real-world objects with guidance and support from the digital world. The learning system will actively and timely guide students to operate the equipment, which may improve the students' learning effect.

An illustration and system presentation for UPAM are shown in Figure 1. In Phase 1, UPAM asks the expert to provide context-aware parameters by showing the most frequently used attributes. If none of the default parameters is preferred, the expert can add new parameters to the candidate attribute list.

Figure 1. Context-aware Parameters Acquisition

After defining the context-aware parameters, the expert was asked to organize the learning events of the curriculum and establish dependency relationships among the parameters and the events. In order to manage the sequence of the learning events, it is necessary to set up finishing conditions for each step. During the learning process, the learning system will acquire learning contexts from the sensors or the mobile devices and compare them with the expected
values. For the contexts that cannot be acquired from the sensors, such as the comprehension status of the student, the learning system will try to acquire the contexts by asking questions of the student through the mobile device. That is, the experts need to provide questions and the corresponding answers in advance. Here is a scenario illustrating these steps. First, students are directed to a projection screen; then they need to watch an appointed movie. Ten seconds after they arrive at the appointed place, the system will ask them the question, "What scene do you see in the movie? Option A, option B, option C." The students are allowed to proceed to the next step if they select the correct answer.

If the step cannot be completed after several trials, which implies that the students have encountered a problem in the learning activity, there are three alternative ways to deal with the situation (as shown in Figure 2):
(1) Lead the student to a specific step.
(2) Let the student select one of the following actions: "keep the record and go to the next step", "ask the domain expert for help", or "ask the classmates who have completed the step for help".
(3) Inform the domain expert to terminate the current learning activity.

Figure 2. Setup finishing conditions
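As a purely illustrative sketch of the step logic just described (not the authors' implementation; the function and variable names are hypothetical), a finishing condition with the three fallback alternatives could look like this:

# Hypothetical sketch of one learning step's finishing condition and the three
# fallback alternatives described above; names are illustrative only.
MAX_TRIALS = 3

def run_step(ask, correct_option, fallback, redirect, notify_expert):
    """Repeat the question until it is answered correctly or trials run out."""
    for _ in range(MAX_TRIALS):
        if ask("What scene do you see in the movie? Option A, B or C?") == correct_option:
            return "finished"                      # finishing condition met

    # The step could not be completed: apply the fallback chosen by the designer.
    if fallback == "lead_to_specific_step":        # alternative (1)
        redirect("remedial step")
    elif fallback == "student_chooses_action":     # alternative (2)
        ask("Keep the record and go to the next step, ask the domain expert, "
            "or ask a classmate who has completed the step?")
    else:                                          # alternative (3)
        notify_expert("terminate the current learning activity")
    return "not finished"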

While assisting the domain expert to design learning activities, UPAM outputs related documentation accordingly, including: (1) the context-aware parameters, as shown in Tables 1 and 2. Table 1 depicts the context-aware parameters that the environment needs as well as the relationships among them. Table 2 shows the parameters for describing the context of the student, the basic data and the learning portfolio of the student, and the feedback from the student. (2) The teaching content and procedure of the curriculum, as shown in Figure 3. (3) Additional information, such as the learning sequence and the target locations for sensing students' arrival, as shown in Table 3.

Table 1. Category for Context-aware Parameters (context of the learning environment and environmental data)

Major activity area: Tropic of Cancer Solar Exploration Center (attribute: opening hours)
  Sub-activity area: The Nine Planets Exhibition Chamber (attribute: opening hours)
    Equipment: Projection Screen (attributes: location, administrator)
    Equipment: Globe (attributes: opening hours, location)
  Sub-activity area: World Time clock Exhibition Chamber (attribute: opening hours)
    Equipment: Projection Screen (attributes: opening hours, location, administrator, usage status)
    Equipment: World Time clocks (attribute: location)
  ...

Table 2. Context-aware parameters (context of the student)

Location info.
Student's profile and on-line behaviors: tutor, name, reservation, emergency call, related experience, etc.
Student's feedbacks: multiple-choice Q&A
Table 3. Additional Information

Expected learning sequence of the curriculum:
[The Nine Planets Exhibition Chamber] -> [The Nine Planets Exhibition Chamber - Globe] -> [The Nine Planets Exhibition Chamber - Projection Screen] -> [World Time clock Exhibition Chamber] -> [World Time clock Exhibition Chamber - World Time clocks] -> [World Time clock Exhibition Chamber - Projection Screen] ...

Places where students must arrive:
[Tropic of Cancer Solar Exploration Center]
[The Nine Planets Exhibition Chamber - Globe]
[The Nine Planets Exhibition Chamber - Projection Screen]
[The Nine Planets Exhibition Chamber]
[World Time clock Exhibition Chamber - World Time clocks]
[World Time clock Exhibition Chamber - Projection Screen]
...

Figure 3 shows the UPAM interface for defining the teaching materials, relevant parameters, and finishing condition of each step.

Figure 3. Teaching content and planned procedure

5. Conclusions and future research

In this study, an innovative approach, UPAM, is proposed to acquire the related data for a curriculum; it can assist the domain experts in planning the curriculum in an efficient and effective way with friendly user interfaces. UPAM produces some useful references, such as the context-aware parameters, teaching content, planned procedure, and additional information, while guiding the domain experts to design the learning activities. From the practical application to a science course, it can be seen that UPAM is capable of improving the efficiency and effectiveness of creating context-aware u-learning activities. In the future, more practical experiments will be conducted by applying UPAM to various kinds of science and language courses.

Acknowledgement

This study is supported in part by the National Science Council of the Republic of China under contract numbers NSC 96-2520-S-024-004-MY3 and NSC 95-2520-S-024-003-MY3.

References

[1] Allard Strijker and Betty Collis, "Strategies for Reuse of Learning Objects: Context Dimensions," International Journal on E-Learning, vol. 5, 2006, pp. 89-94.
[2] Cheng, L., and Marsic, I., "Piecewise Network Awareness Service for Wireless/Mobile Pervasive Computing," Mobile Networks and Applications (MONET), vol. 7, no. 4, 2002, pp. 269-278.
[3] Cheng, Z., Sun, S., Kansen, M., Huang, T., and He, A., "A personalized ubiquitous education support environment by comparing learning instructional requirement with learner's behavior," in Proceedings of the IEEE 19th International Conference on Advanced Information Networking and Applications, TamKang University, Taiwan, March 2005, pp. 567-573.
[4] Christos Doulkeridis, Nikos Loutas and Michalis Vazirgiannis, "A System Architecture for Context-Aware Service Discovery," Electronic Notes in Theoretical Computer Science, vol. 146, 2006, pp. 101-116.
[5] G. D. Abowd, "Classroom 2000: An Experiment with the Instrumentation of a Living Educational Environment," IBM Systems Journal, vol. 38, no. 4, 1999, pp. 508-530.
[6] Gwo-Jen Hwang, "Criteria and Strategies of Ubiquitous Learning," Proceedings of the IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (SUTC'06), vol. 2, 2006, pp. 72-77.
[7] Gwo-Jen Hwang, Chin-Chung Tsai and Stephen J.H. Yang, "Criteria, Strategies and Research Issues of Context-Aware Ubiquitous Learning," Educational Technology & Society, to be published.
[8] Joiner, R., Nethercott, J., Hull, R., and Reid, J., "Designing Educational Experiences Using Ubiquitous Technology," Computers in Human Behavior, vol. 22, no. 1, 2006, pp. 67-76.
[9] Karen Henricksen and Jadwiga Indulska, "Developing context-aware pervasive computing applications: Models and approach," Pervasive and Mobile Computing, vol. 2, 2006, pp. 37-64.
[10] Lonsdale, P., Baber, C., Sharples, M., and Arvanitis,
T, “A context awareness architecture for facilitating
mobile learning,” Paper presented at the meeting of
the second European conference on learning with
mobile devices (MLEARN 2003), London, UK, May,
2003.
[11] M. Khedr and A. Karmouch, “ACAI: agent-based
context-aware infrastructure for spontaneous
applications,” Computer Applications, vol.28, 2005,
pp.19–44.
[12] Marek Hatala, Ron Wakkary and Leila Kalantari,
“Rules and ontologies in support of real-time
ubiquitous application,” Web Semantics: Science,
Services and Agents on the World Wide Web , vol. 3,
2005, pp.5–22.
[13] Ogata, H., and Yano, Y., “Context-Aware Support for
Computer-Supported Ubiquitous Learning,” In
proceedings of the 2nd IEEE International Workshop
on Wireless and Mobile Technologies in Education
(WMTE'04), JungLi, Taiwan, March 2004, pp. 27-34.
[14] Ohbyung Kwon, Keedong Yoo and Euiho Suh,
“ubiES: Applying ubiquitous computing technologies
to an expert system for context-aware proactive
services,” Electronic Commerce Research and
Applications, vol.5, 2006, pp.209-219.
[15] Ohbyung Kwon, “The potential roles of context-aware
computing technology in optimization-based
intelligent decision-making,” Expert Systems with
Applications, vol.31, 2006, pp.629-642.
[16] Rogers, Y., Price, S., Randell, C., Fraser, D. S., Weal,
M., and Fitzpatrick, G, “Ubi-learning Integrating
Indoor and Outdoor Learning Experiences,”
Communications of the ACM, vol.48, no.1, pp.55-59,
2005.
[17] S. J. H. Yang, “Context Aware Ubiquitous Learning
Environments for Peer-to-Peer Collaborative
Learning, “ Journal of Educational Technology and
Society, vol. 9, no.1 , 2006, pp.188-201.
[18] Soe-Tsyr Yuan and Fang-Yu Chen, “UbiSrvInt- a
context-aware fault-tolerant approach toward wireless
P2P service provision,” Expert Systems with
Applications, vol. 32, 2007, pp.726–752.
[19] Uemukai, T., Hara, T., and Nishio, S, “A Method for
Selecting Output Data from Ubiquitous Terminals in a
Ubiquitous Computing Environment,” In the
proceedings of the 24th International Conference on
Distributed Computing Systems Workshops
(ICDCSW’04), Tokyo, Japan, March 2004, pp.562-
567.
[20] Wei-Po Lee, “Deploying personalized mobile services
in an agent-based environment,” Expert Systems with
Applications, vol.32, 2007, pp. 1194-1207.
[21] Y. Kawahara, M. Minami, H. Morikawa, and T. Aoyama, "A Real-World Oriented Networking for Ubiquitous Computing Environment," IPSJ SIG Technical Reports (UBI-1-1), vol. 39, April 2003, pp. 1-6.
Usability Comparison of Pen-based Input for Young Children on Mobile Devices

Chih-Kai Chang
National University of Tainan
Department of Information and Learning Technology
chihkai@mail.nutn.edu.tw
Abstract

This research implements a system, primarily focusing on examining young children's capability of pen-based input on mobile devices, to compare the accuracy of pointing-and-clicking and dragging-and-dropping dots with different sizes, locations, and spreads. Although traditional e-learning systems use a mouse as the input device, the trend of mobile learning brings mobile learning devices, such as the tablet PC and the PDA (Personal Digital Assistant), into the classroom. Because the conventional input devices, a mouse and a keyboard, are replaced with a touch screen or pen-based input device, understanding young children's capability on these novel input devices, which were originally designed for adults, is crucial for designing interfaces for kids on mobile devices. Two experiments on point-and-click and drag-and-drop were conducted to compare speed and error rate for children on a tablet PC and a PDA. The results show that children's capability of drag-and-drop using pen-based input is significantly worse than using a mouse. However, most children can precisely point-and-click any object larger than 0.3 centimeter in width and height on both the tablet PC and the PDA. The overall usability of the PDA is better than that of the tablet PC for young children in the experimental settings. Consequently, using pen-based input, even on a small screen, with a proper interface design could be a feasible method for young children in a mobile learning activity.

1. Introduction

Mobile learning is an important research issue in the area of e-learning. Students' capability of fluently and accurately using mobile devices is an essential consideration for teachers in determining whether to use mobile devices in a learning activity such as a field investigation. Although children nowadays are exposed to computers from a very early age, the interface of both software and hardware may not be suitable for kids, especially young children [1, 2, 3]. There has been significant research on children's use of the two mouse interaction styles of drag-and-drop and point-and-click [4, 5]. Some researchers indicate that all K-12 children can use a mouse as a most easy-to-use, fluent, and accurate input device [6, 7]. However, very little research has focused on young children's interactions with pen-based input devices until recently. Consequently, this study investigates young children's usability on a tablet PC and a PDA by implementing a behavior recording system and setting up an experimental scenario.

Well-designed e-learning software can ensure that young children enjoy it by providing a suitable interface, feedback, and content in time. Chambers et al. [8] indicate that children's learning efficiency will be promoted by providing a suitable interface because the cognitive ability and the degree of mind and body development of young children are different from those of adults. Moreover, young children's experience of using computers may be limited. Consequently, the usability of e-learning software is more important for young children than for adults. Unfortunately, the adaptive methods of most e-learning software are implemented by reducing the difficulty of a task without a thorough analysis of young children's behavioral patterns. Although e-learning software can provide clear instructions, can young children carry them out effortlessly? For instance, e-learning software may ask young children to drag something on a tablet PC or PDA, but how long a drag is a suitable design?

Young children's capability of moving and selecting an object is affected by many variables, such as age, computer experience, gender, task, and so on. Previous results on young children's capability of using a mouse reported that the point-and-click interaction style was faster, fewer errors were committed using it, and it was preferred over the drag-and-drop interaction style [9, 10]. We decided to reexamine those issues on
pen-based input of a tablet PC or PDA for three reasons: (1) the collected data can be compared with usability testing results of a mouse for educational software for young children; (2) the research issues in previous studies about using a mouse are too many to be investigated in a single experiment on a tablet PC, so an experimental design similar to a previous study can focus on a limited set of issues; (3) the most basic operations of pen-based input are not effortless for young children because their capability of eye-hand coordination is still developing [11].

For the point-and-click interaction style, the principle is that a larger target yields faster speed and fewer errors. Hence, the target of point-and-click should be large enough for young children. However, a single target on the interface of e-learning software rarely happens; the target is usually accompanied by other objects around it. Thus, young children may not be able to aim at the target or stably control the pen, and they make errors. Recent research investigates how the arrangement of the target and surrounding objects influences the precision of young children's point-and-click with a mouse [12]. This paper also studies that issue (i.e., the arrangement of objects) using the pen-based device. Moreover, this paper compares the precision of point-and-click behaviors on two mobile devices. Consequently, suggestions for designers and instructors of mobile learning software can be given.

For the drag-and-drop interaction style, children should use a pen to press the target and move it to a destination. For young children, drag-and-drop is much more difficult than point-and-click because they should continuously keep stable pressure on the surface of a mobile device. Moreover, mobile devices are usually used in a learning activity on the move, such as museum guiding. In other words, young children could stand and hold the mobile device to drag-and-drop something. Although some studies [13, 14, 15] also reported their observations about young children's drag-and-drop interaction style, this paper concerns the usability differences of mobile devices for young children in a 'real' mobile situation. The word 'real' needs more explanation because previous studies of interaction style usability usually use a standard experimental procedure to measure children's ability in a laboratory. In this study, learners were asked to stand and hold the mobile devices while measuring their usability. Thus, the experimental results will be comparable with a mobile learning activity.

To compare young children's usability of pen-based input on a PDA and a tablet PC, the following questions were investigated. First, how does the screen size of a mobile device affect children's pen operations? Second, how do the mobile devices affect children's usability, including precision, speed, and error rate? Third, how do the spread, direction, and size of the target affect children's usability of pen-based input? To answer those questions, the measuring systems of point-and-click and drag-and-drop were implemented on both the tablet PC and the PDA. After that, an experiment was conducted. Finally, the experimental results were analyzed and reported.

2. System design and implementation

The system is used for young children to learn English vocabulary on both a tablet PC and a PDA. The screen size of the tablet PC is 12.1 inches with a resolution of 1024*768; hence, the width and height of a pixel is 0.024 centimeter. The screen size of the PDA is 3.2 inches with a resolution of 240*320; that is, a pixel on the PDA is 0.031 centimeter. Figure 1 depicts the functional requirements of the usability testing system using a use case diagram of the Unified Modeling Language (UML). The learner at the left part of the diagram faces two testing functions, which are drag-and-drop and point-and-click capability testing. Learners' behaviors on the system are recorded by the behavioral records module at the middle part of Figure 1. There are three functions, illustrated at the right part of Figure 1, that use the behavioral records to analyze learners' reaction time, speed, and proportion of errors. Finally, researchers can use the findings for comparison between different mobile devices.
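The quoted pixel sizes follow from the screen diagonal and resolution; a small illustrative calculation (not from the paper) reproduces the tablet PC figure:

# Illustrative pixel-pitch calculation from screen diagonal and resolution.
# Device parameters are taken from the text above; the code itself is only a sketch.
import math

def pixel_pitch_cm(diagonal_inch, width_px, height_px):
    """Return the size of one (square) pixel in centimeters."""
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_inch * 2.54 / diagonal_px

print(round(pixel_pitch_cm(12.1, 1024, 768), 3))       # tablet PC: ~0.024 cm per pixel
print(round(13 * pixel_pitch_cm(12.1, 1024, 768), 2))  # a 13-pixel dot: ~0.31 cm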
[Figure 1: UML use case diagram of the usability testing system for pen-based input. Actors: Learner and Researcher. Use cases: point-and-click and drag-and-drop capability testing, which include behavioral records; the records feed analyses of reaction time, speed, and accuracy, where accuracy extends to selection errors and dropping errors.]

Figure 1. Use case diagram of the testing system.

Figure 2 shows snapshots of the usability testing system on a tablet PC. The system pops up an English vocabulary word and its Chinese term at the left bottom corner of the screen. Meanwhile, a misspelled English word is presented for correction by selecting or dropping. The left snapshot of Figure 2 illustrates selecting a misspelled character by pointing and clicking the red dot. There are three dots, one red and two white, for each character, and the dots may be placed on the left, right, over, and under side of a character. The system produces 12 (3*2*2) conditions according to the varieties in the position of the red dot (3),
the orientation of the dots (2), and the proximity of the dots (2). Hence, researchers can verify whether the arrangement of dots affects learners' usability testing results of selection or not. The right snapshot of Figure 3 shows tasks of dragging a misspelled letter into a waste bin. The system produces 4 (2*2) conditions according to the distance (i.e., short and long) and the orientation (i.e., vertical and horizontal). Moreover, the total time, reaction time, and movement speed are calculated and recorded for further analysis.
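For illustration only, the 12 selection conditions and 4 dragging conditions are the Cartesian products of the factors just listed; the factor level names below are assumptions rather than the paper's terminology:

# Enumerating the experimental conditions described above.
# Factor level names are illustrative assumptions.
from itertools import product

red_dot_position = ["first", "middle", "last"]       # 3 levels
dot_orientation  = ["horizontal", "vertical"]        # 2 levels
dot_proximity    = ["close", "far"]                  # 2 levels
click_conditions = list(product(red_dot_position, dot_orientation, dot_proximity))
print(len(click_conditions))                          # 12 point-and-click conditions

drag_distance    = ["short", "long"]
drag_orientation = ["vertical", "horizontal"]
drag_conditions  = list(product(drag_distance, drag_orientation))
print(len(drag_conditions))                           # 4 drag-and-drop conditions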

Figure 2. The testing system snapshot on tablet PC.

Figure 3 shows the snapshots of the usability testing system on a PDA. The design and functionality of the system are the same as those of the system on the tablet PC, illustrated in Figure 2. However, the sizes and distances of the targets (i.e., the dots and characters) are different from those on the tablet PC. Although a dot uses 13 pixels on the tablet PC and 10 pixels on the PDA, the size of a dot on both systems is 0.3 centimeter; meanwhile, the size of the trash can on both systems is 1 centimeter in width and 2 centimeters in height. Hence, a dot on the PDA looks relatively larger than a dot on the tablet PC. The distance between dots of the usability testing system is 0.3 centimeter (i.e., 13 pixels) or 0.8 centimeter (i.e., 33 pixels) on the tablet PC. The distance between dots is shorter on the PDA because of the screen size limitation: the distance between dots is 0.2 centimeter (i.e., 7 pixels) or 0.8 centimeter (i.e., 20 pixels). For the drag-and-drop interaction, the distance between a character and the trash can is 17 or 8 centimeters on the tablet PC; meanwhile, the distance between a character and the trash can is 5 or 2 centimeters on the PDA.

Figure 3. The testing system snapshot on PDA.

3. Experimental results

Seventy-seven children from Grade 1 (mean age 7 years) and Grade 2 (mean age 8 years) participated in the experiment measuring usability on a tablet PC. Sixty-two children from Grade 1 and Grade 2 participated in the experiment measuring usability on a PDA. The children had not received formal training in reading English, but some of them had learned how to recognize English letters in kindergarten. Every child was asked to complete the usability test while standing up and holding the tablet PC or PDA, because a mobile learning device for a field investigation learning activity is used in a similar situation. Table 1 presents the experimental results of the reaction time, proportion of selection errors, horizontal accuracy, and vertical accuracy for the point-and-click testing on a tablet PC or PDA. The experimental results on a PDA are listed as the numbers in brackets.

On a tablet PC, children on average need 7.47 s to click the red dot. Although children in Grade 2 clicked faster than Grade 1, the data analysis result, F(1, 75)=0.592, p=0.444>0.05, indicates that there is no significant difference between Grade 1 and Grade 2. Similar to previous studies of mouse input, placing dots close together causes longer reaction times (7.56 s > 7.17 s). However, the data analysis result, F(1, 76)=0.543, p=0.463>0.05, indicates that there is no significant difference between far and close distracting objects. This suggests that children need more time to complete an aiming and clicking task with a pen than with a mouse in the previous study; hence, the time difference between far and close distracting objects becomes insignificant.

On a PDA, children on average need 8.47 s to click the red dot. Children in Grade 2 clicked significantly faster than Grade 1, as indicated by the statistical result F(1,
60)=17.745, p=0.000<0.05. The data analysis result, F(1, 61)=5.921, p=0.018<0.05, indicates that there is a significant difference between far and close distracting objects. To compare the results on both mobile devices, the independent t-test is used to make judgments from small samples. The result, t(61)=-2.738, p=0.008<0.05, shows that the reaction time on a tablet PC is significantly smaller than that on a PDA. Moreover, the proportion of errors on a tablet PC is significantly higher than that on a PDA according to the t-test result t(61)=-2.707, p=0.009<0.05. According to informal interviews, children reported that the thickness of the pen affects their usability: the pen of a PDA is too thin to handle, whereas they are used to the thickness of the pen for a tablet PC.
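The grade and distance comparisons above are one-way ANOVAs and the device comparisons are independent t-tests. A minimal sketch of how such tests can be run, using made-up placeholder data rather than the study's measurements, is:

# Minimal sketch of the statistical tests reported above, with placeholder data.
from scipy import stats

grade1_times = [8.1, 7.5, 9.0, 7.8]      # hypothetical reaction times (s)
grade2_times = [6.9, 7.2, 6.5, 7.4]

# One-way ANOVA for the grade effect (the reported F(1, n) tests).
f_stat, p_val = stats.f_oneway(grade1_times, grade2_times)
print(f"F = {f_stat:.3f}, p = {p_val:.3f}")

# Independent t-test comparing the two devices (tablet PC vs. PDA).
tablet_times = [7.4, 7.6, 7.2, 7.5]
pda_times = [8.3, 8.6, 8.5, 8.4]
t_stat, p_val = stats.ttest_ind(tablet_times, pda_times)
print(f"t = {t_stat:.3f}, p = {p_val:.3f}")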

Table 1. Statistical results of the point-and-click testing on a tablet PC or PDA (PDA values in brackets)

                            Reaction time (s)   Proportion of errors   Horizontal accuracy (pixels)   Vertical accuracy (pixels)
Mean                        7.47 (8.47)         0.58 (1.03)            5.56 (4.60)                    5.72 (4.28)
Grade 1                     7.92 (11.20)        0.70 (1.62)            5.30 (4.97)                    5.73 (4.00)
Grade 2                     7.02 (5.74)         0.46 (0.44)            5.81 (4.22)                    5.71 (4.56)
Far distracting objects     7.17 (7.80)         0.73 (1.14)            5.97 (4.67)                    5.71 (4.24)
Close distracting objects   7.56 (9.14)         0.57 (0.92)            5.13 (5.09)                    5.73 (3.88)
Table 2 presents the experimental results of the reaction time, speed, and proportion of errors for the drag-and-drop testing on a tablet PC or PDA. On a tablet PC, children on average need 11.29 s to drag and drop a misspelled letter into a waste bin. Their average speed of movement is 173.69 pixels per second. The children made an error in 0.39% of all testing items. Children in Grade 2 spent less time completing a task than Grade 1, F(1, 75)=16.201, p=0.000<0.01. Children in Grade 2 also dragged faster than Grade 1, F(1, 75)=4.194, p=0.044<0.05. As expected, dragging objects over long distances needs more time than dragging them over a short distance (12.49 s > 10.19 s); meanwhile, the average moving speed in the long distance tests is faster than in the short distance tests (177.49 > 167.85 pixels/s). Moreover, horizontal movement of a letter generally resulted in faster movement speeds than vertical movement of a letter (182.08 > 163.26 pixels/s).

On a PDA, children on average need 10.70 s to drag and drop a misspelled letter into a waste bin. Although the average reaction time on a PDA is less than that on a tablet PC, the average speed of movement on a PDA is much slower than on a tablet PC, that is, 81.45 pixels per second. Similar to the testing results on a tablet PC, dragging objects over long distances needs more time than dragging them over a short distance (12.59 s > 8.81 s); meanwhile, the average moving speed in the long distance tests is faster than in the short distance tests (85.54 > 77.36 pixels/s). Although children drag faster on a tablet PC than on a PDA in pixels per second, this may be caused by the screen sizes of the two devices. To eliminate the effect of screen size, we transform the speed unit from pixels per second (pixels/s) into percentage of screen size per second (%/s). After the transformation, the results show that children drag faster on a PDA than on a tablet PC. Consequently, this can explain why the average reaction time on a PDA is less than the reaction time on a tablet PC.
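A small illustrative sketch of that unit conversion follows; normalizing by the screen diagonal is an assumption, since the paper does not state which screen dimension is used:

# Normalizing drag speed by screen size so devices with different resolutions
# can be compared; the normalization dimension is an assumption for illustration.
import math

def speed_percent_per_s(speed_px_per_s, width_px, height_px):
    """Convert pixels/s into percent of the screen diagonal per second."""
    return 100.0 * speed_px_per_s / math.hypot(width_px, height_px)

print(round(speed_percent_per_s(173.69, 1024, 768), 1))  # tablet PC: ~13.6 %/s
print(round(speed_percent_per_s(81.45, 240, 320), 1))    # PDA: ~20.4 %/s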

Table 2. Statistical results of the drag-and-drop testing on a tablet PC or PDA (PDA values in brackets)

                  Reaction time (s)   Speed (pixels/s)   Drop errors per trial (%)
Mean              11.29 (10.70)       173.69 (81.45)     0.39 (0.39)
Grade 1           12.90 (13.75)       161.07 (73.99)     0.40 (0.51)
Grade 2           9.67 (7.65)         186.31 (88.91)     0.38 (0.27)
Short distance    10.19 (8.81)        167.85 (77.36)     0.44 (0.31)
Long distance     12.49 (12.59)       177.49 (85.54)     0.53 (0.47)
Horizontal move   11.99 (9.19)        182.08 (83.57)     0.55 (0.33)
Vertical move     10.69 (12.20)       163.26 (79.33)     0.36 (0.45)
4. Discussion and future work

Usability of input methods for learners is a critical issue for an instructor considering adopting a mobile learning device. Therefore, the aim of this study is to investigate young children's capability of using pen-based input on two mobile devices for the mobile learning situation. The experimental results clearly showed that young children can use a pen-like stylus to select and move an object, although the reaction time, accuracy, and speed are not as good as with a mouse. In educational software for mobile learning, the objects may need to be designed larger for clicking and closer together for dragging. In previous research, the operation of click-move-click was reported to be a more appropriate procedure for young children than the operation of drag-and-drop. This study suggests a similar conclusion on both the PDA and the tablet PC, because keeping pressure during a dragging movement on the touch screen of a mobile learning device seems difficult for children. In particular, the physical distance of moving an object with a pen on a tablet PC is much longer than doing so on a PDA or with a mouse.

5. Conclusion

This study investigated young children's capability of pen-based input because the stylus, a pen-like drawing apparatus, is a popular and common input device on a tablet PC. Some pilot classrooms have introduced mobile learning devices, such as PDAs or tablet PCs, to support field investigation activities. In the near future, a personal, portable, wirelessly-networked learning device may be ready to hand for every child. Hence, understanding young children's capability on those novel devices is crucial for creating a "seamless learning" environment, which implies that a student can learn by using the personal device as a mediator whenever they are curious, in a variety of scenarios. This study focused on testing young children's aiming and moving capabilities on a tablet PC while they are standing up and holding it. In conclusion, the experimental results show that young children are clearly capable of using pen-based input for selection and movement on a tablet PC, although their capability of movement with the stylus is significantly worse than with a mouse. Therefore, young children may be more comfortable using a device with an interface properly designed for them than with the "normal" interface, which was originally designed for adults.

6. Acknowledgements

This study was supported by a grant from the National Science Council of Taiwan under contract numbers NSC 95-2520-S-024-007 and NSC 95-2524-S-008-001.

7. References

[1] Smith, D., & Keep, R. (1986). Children's Opinions of Educational Software. Educational Research, 28(2), 83-88.
[2] Strommen, E. F. (1993). Is It Easier to Hop or Walk? Development Issues in Interface Design. Human-Computer Interaction, 8(4), 337-352.
[3] Hourcade, J. P. (2002). It's too small! Implications of children's developing motor skills on graphical user interfaces: Technical Report HCIL-2002.
[4] Inkpen, K. M. (1997). Three Important Research Agendas for Educational Multimedia: Learning, Children, and Gender. AACE World Conference on Educational Multimedia and Hypermedia, 97, 521-526.
[5] Inkpen, K. M. (2001). Drag-and-Drop versus Point-and-Click Mouse Interaction Styles for Children. ACM Transactions on Computer-Human Interaction, 8(1), 1-33.
[6] MacKenzie, I. S., Sellen, A., & Buxton, W. A. S. (1991). A comparison of input devices in element pointing and dragging tasks. Proceedings of the SIGCHI conference on Human factors in computing systems: Reaching through technology, 161-166.
[7] Joiner, R., Messer, D., Light, P., & Littleton, K. (1998). It is best to point for young children: a comparison of children's pointing and dragging. Computers in Human Behavior, 14(3), 513-529.
[8] Chambers, B., Abrami, P. C., McWhaw, K., & Therrien, M. C. (2001). Developing a Computer-Assisted Tutoring Program to Help Children at Risk Learn to Read. Educational Research and Evaluation, 7(2&3), 223-239.
[9] Elliott, D., Hansen, S., Mendoza, J., & Tremblay, L. (2004). Learning to optimize speed, accuracy, and energy expenditure: a framework for understanding speed-accuracy relations in goal-directed aiming. J Mot Behav, 36(3), 339-351.
[10] Grossman, T., Hinckley, K., Baudisch, P., Agrawala, M., & Balakrishnan, R. (2006). Hover widgets: using the tracking state to extend the capabilities of pen-operated devices. Proceedings
of the SIGCHI conference on Human Factors in computing systems, 861-870.
[11] Kuhtz-Buschbeck, J. P., Boczek-Funcke, A., Illert, M., Joehnk, K., & Stolze, H. (1999). Prehension movements and motor development in children. Experimental Brain Research, 128(1), 65-68.
[12] Donker, A., & Reitsma, P. (2007). Young children's ability to use a computer mouse. Computers & Education, 48(4), 602-617.
[13] Ren, X., & Moriya, S. (2000). Improving selection performance on pen-based systems: a study of pen-based interaction for selection tasks. ACM Transactions on Computer-Human Interaction (TOCHI), 7(3), 384-416.
[14] Saund, E., & Lank, E. (2003). Stylus input and editing without prior selection of mode. Proceedings of the 16th annual ACM symposium on User interface software and technology, 213-216.
[15] Djeziri, S., Guerfali, W., Plamondon, R., & Robert, J. M. (2002). Learning handwriting with pen-based systems: computational issues. Pattern Recognition, 35(5), 1049-1057.
Evaluation of the Learning of Scientific English in Podcasting PCs, MP3s, and MP4s Scenarios

Siew-Rong Wu
Center for General Education
National Yang-Ming University
srwu@ym.edu.tw
Abstract

To examine the effects of the use of podcasting in the learning of scientific English in a university setting, this study was conducted in two English scientific listening and speaking classes taught by the researcher-teacher at a medical university in Taipei, TAIWAN, from November to December 2007. Podcasts on science were used as learning materials. Learning outcome assessments include concept-mapping, oral tests, and scientific essay writing. Students' use of tools (PCs, MP3s, and MP4s), their frequency, duration, and location of use, and their motivation and learning attitude were assessed by questionnaires. Experimental results show that, due to the activation of higher-order thinking and cognitive engagement, deep learning was facilitated in the students' learning processes. Besides, a preference for using MP3s for the learning of scientific English and a preference for using MP4s for deep learning of science and scientific English were identified in this study.

1. Introduction

In September 2004, a new word, "podcasting", was invented. It is a combination of "iPod" and "broadcasting". Podcasting refers to the making of audio or video programs and their publication on the Internet so that these programs may be downloaded to personal computers or mobile devices, either for a fee or free of charge. Users can produce their own podcasts for downloading. They can also subscribe to all kinds of podcasts through RSS feeds. The programs can then be updated through iTunes. Users without mobile devices can access podcasts easily from their PCs.

There are three types of podcasts: audio podcasts, enhanced podcasts, and video podcasts. The most popular type is audio podcasts, with a format that can be played on MP3 players. Enhanced podcasts have the function of showing pictures as the audio programs are being played. As for video podcasts, videos can be played through a variety of formats, but the most commonly used format is for MP4 players.

The most fascinating advantage of podcasts is that users can listen to or watch the podcasts whenever and wherever they want, as many times as they want. Podcasting is now widely used in the entertainment industry and in education, such as language learning. For example, some teachers and learners created podcasts to introduce their hometowns to the whole world or to broadcast news to their own communities. They collect every episode and upload it to the podcast web sites. Members from all over the world can download them freely. This kind of interesting podcast certainly fascinates learners all over the world.

The easy access to podcasts on the Internet has provided EFL English learners and instructors great resources of learning materials. This generation of university students grew up with new advances in information technology. They receive all forms of visual stimulation from video games, TV programs, movies, and MTVs. They are greatly influenced by information technologies because these visuals and graphics have become part of their daily lives. They use all kinds of visual representations in various situations, such as iPods and similar MP3/MP4 devices, DVDs, email, MSN, and blogs. All these activities involve computers. In other words, students in this generation are very skillful in using computers and the Internet. The advantage of using podcasts for language learning lies in the availability of MP3 players among students
[2]. They can easily download and subscribe to RSS feeds to receive updates with the podcasts of their target language. iPods can be used for listening, recording, and even oral testing.

However, many students in Taiwan are still not aware of podcasting. Therefore, if we can make good use of all the technologies and resources to attract their attention, we can better motivate them to learn scientific English and increase their participation in learning activities. The use of free science podcasts in learning scientific English should be able to easily broaden students' scientific knowledge and enhance the learning effects.

Assessments of science learning involve all kinds of tools and methods. Among them, concept-mapping [4, 5, 6] has been widely used in education and in many contexts of e-learning, such as in the design of adaptive learning materials and in the construction of e-learning domain knowledge. It has also become a great tool for e-learning researchers' searches of academic papers [1]. Concepts are mental representations. In cognitive science, "deep understanding" refers to how the concepts can be presented in the learners' minds based on their prior knowledge. The connections among concepts result from deep understanding. Compared with the use of visual representations, merely using verbal descriptions can also convey the information, but it would then be more difficult to present higher-order relationships. Therefore, to comprehend information from verbal descriptions, more higher-order cognitive activities in the brain would be needed. Sometimes, critical higher-order relations might be missed in the process [8]. Hence, concept-mapping provides opportunities for learners to thoroughly reflect on and re-organize the new knowledge they have just acquired.

To evaluate students' feelings and attitudes toward podcasting in PC, MP3, and MP4 scenarios in the learning of scientific English in an EFL university setting, a pilot study was conducted in two English scientific listening and speaking classes taught by the researcher-teacher at a medical university in Taipei, TAIWAN, from November 29 to December 20, 2007. These two classes were electives; students from all departments could take them. Participants' learning interests and attitudes before and after these two scientific English podcast projects were examined afterwards.

2. Procedure

By using free podcasts on the Internet as materials for learning, two one-week podcast projects were implemented in the experiments. Fifty students who were taking the researcher-teacher's English scientific listening and speaking classes (Class A and Class B) participated in this study. The tools of learning included PCs, MP3s, and MP4s. Participants signed up for one of these three tools in each of the two podcast projects. Questionnaires were used to assess their learning attitude and learning experiences after each project was completed.

2.1. Procedure of Podcast Project 1

The period of learning in this project was Nov. 29 to Dec. 6, 2007. The whole class was divided into three groups: the PC group (9 students), the MP3 group (7 students), and the MP4 group (7 students). Students were free to sign up for any of these tools until the quota was filled.

After signing up, students could begin to access the 70-second science podcast entitled "Animals laugh when they are tickled." The whole class watched it for the first time together, regardless of their choice of tool group. This podcast was also downloaded to the MP3 and MP4 players in advance. The students in the three groups began to watch the film closely on the tools they chose, at their own speed, and for as many times as they wanted. Participants in the MP3 group were not allowed to watch the video; they could only use the audio recording pre-downloaded to their MP3 players. Meanwhile, they were asked to record the frequency of their use of this podcast, as well as the time and location of use.

Half of the members in each tool group volunteered to do concept-mapping in the learning process and to submit the concept maps at the end of the project. Concept maps could be drawn either by hand or by using free online software at http://ihmc.us. All concept maps were to be saved and named as cmap1, cmap2, etc. To prevent interactions with other variables, students were not allowed to refer to any online or off-line materials in the learning process. Rather, they were asked to focus only on the podcast. Because it was a short film, students were not allowed to collaborate in learning. When this one-week project was completed, individual assessments of concept-mapping and oral summary were conducted on Dec. 6, 2007.

2.2. Procedure of Podcast Project 2

The period of learning in Project 2 was Dec. 13 to Dec. 20, 2007. The whole class was again divided into three tool groups. Every student was required to choose a
learning tool different from what he or she had chosen in Project 1. Each group was then divided into smaller groups for collaborative learning. One thing different from Project 1 was that this time students were free to search for any audio or video podcasts of their group's interests, download them to the MP3 or MP4 players for learning, and then share them with their partners. Students in the PC group would use the traditional way of learning, just using the podcasts on the Internet directly, together with their partners.

In the learning process, half of the members in each group would do concept-mapping and submit the online concept maps they drew. Concept maps were either hand-drawn or drawn by using the online concept-mapping tool at http://ihmc.us. All concept maps were saved. To exclude interaction with other factors, students were not allowed to refer to any on-line or off-line materials. They were asked to focus only on the podcasts they found. Class podcasts were produced after these two projects were completed.

2.3. Learning attitude assessments

At the end of each project, learning attitudes, motivation, and interests in the learning of scientific listening and speaking were assessed by identical questionnaires, as shown in the Appendix. The frequency/time/duration/location of learning, and feelings/difficulties in the learning of science and scientific English from podcasts in the PC, MP3, and MP4 scenarios, were also assessed.

3. Results and Discussion

3.1. Major findings and their implications

The most exciting results in this study are as follows: (a) the deep learning that occurred autonomously in Podcast Project 2; (b) the participants' frequency of tool use, as well as the dramatically different durations of learning with MP4s and MP3s. In the experimental design, to prevent interactions with irrelevant variables, the students were not allowed to refer to any scientific articles or papers. And yet, surprisingly, in the learning processes, students were inspired by the content of the science podcasts. Hence, they autonomously searched for scientific essays or papers that were related to their topics. Their in-depth oral presentations on the learning outcomes were impressive. This phenomenon is proof that podcasting strongly motivated participants' learning. This finding was also pedagogically meaningful. Firstly, in the future, deep learning activities should be implemented into the podcast projects to ensure even better learning outcomes through deep learning. Secondly, materials should be the same for all learners in the experiments; this will enable objective comparisons of learning effects. Podcasts used in the two projects were different: in Podcast Project 1, the single podcast used for all students' learning was the same, while in Podcast Project 2, students were asked to choose much longer podcasts they liked. Some of them did not collaborate with others because they preferred to work alone on the topics they liked, rather than working with partners on topics in which they were less interested. Therefore, for statistical reasons, only the questionnaire assessment results obtained from Podcast Project 1 will be reported here.

Table 1 shows that MP3s were the most frequently used tool for learning by using podcasts. The frequency of the use of MP3s was almost 5 times that of the MP4 and PC groups. However, when the students used MP4s, their average learning duration per use was 7 times that of the MP3 group, and the total time of use was 1.4 times that of the MP3 group. That was exactly an indication of deep learning! In the questionnaires, participants also reported that portability and convenience were the most decisive factors for them when they chose tools to assist their learning of scientific English, because they were very busy with their studies and their time was fragmented.

These results indicate that MP3s were much more frequently used for the learning of scientific listening and speaking because they could be used anywhere and anytime the users wanted. Comparatively, MP4s were less frequently used because, according to the respondents, they could not watch the film as they walked or dined, although it was helpful for increasing their comprehension of the content. The use of PCs was easy and convenient, but the location of use was confined.

However, when it comes to the duration of learning per use, the MP4s tremendously outperformed the MP3s! This apparently shows how MP4s could facilitate deep learning and meanwhile enhance participants' interest in learning scientific listening and speaking, although MP3s were shown to be the most frequently used tool among the three. Besides, MP4s also outperformed the other two tools in participants' total time of tool use.
Table 1. Students' use of the three tools for learning.

Podcast Project 1   Average frequency of tool use   Duration of learning per time (min.)   Total time of tool use (min.)
MP4 (n=11)          4.8                              20.5                                    76.1
MP3 (n=9)           22.4                             3.2                                     54.8
PC (n=15)           4                                13                                      47.1

3.2. Findings about the enhancement of students' motivation and learning attitude

Results of the questionnaire assessments also show that in Podcast Project 2, when participants were free to choose science podcasts on topics of their interest, 90% of the participants in Class A who collaborated in the process of learning felt that their motivation and interest in learning science and scientific English were enhanced. In contrast, only 56.3% of the participants in Class B reported enhancement of motivation and interest through collaboration. This great variation was probably due to the fact that in Class A, 86% of the participants were freshmen, and they had more time for collaborative learning after class. In Class B, only 45% of the participants were freshmen, and the other half was composed of sophomores, juniors, and seniors. They were more mature but much busier with their own studies in different fields. Therefore, it was difficult for them to meet and discuss the learning content after class.

As to whether doing concept-mapping in the learning process could enhance learning motivation and interest, in Project 1, 83.3% of the participants in Class A and 87.5% of the participants in Class B felt so. In Project 2, 88.9% of the participants in Class A and 100% of the participants in Class B also reported the use of concept-mapping to be effective and helpful to their learning of science in the different podcasting scenarios.

This is solid evidence that concept-mapping was beneficial in the podcasting learning of scientific English and was hence worth using in the teaching of scientific English. The reason is simple: in concept-mapping, students would have to understand the content of learning thoroughly and then reorganize it in their minds before they could begin to draw up the concept maps. Much higher-order thinking was displayed in their concept maps and oral presentations. Even if some critical higher-order relations might have been missed, concept-mapping enabled the learners to put it all together [8].

Learning attitude is another important factor in learning. Figure 1 shows the assessment results obtained through the questionnaires.

[Figure 1: two bar charts ("Podcast Project 1" and "Podcast Project 2") showing the percentage of responses for the MP4, MP3, and PC groups, with pre-learning and post-learning attitude rated Good, So-so, or Poor.]

Fig. 1. Questionnaire assessment results of participants' learning attitudes before and after the two podcast projects. In Project 1, the participants were free to sign up for one of the three tool groups: the MP4, MP3, and PC groups. In Project 2, they also signed up for one of the tool groups, but they were required to choose a tool that was different from the one they chose in Project 1.

In Project 1, the learning attitude of the participants in the MP3 group was most greatly enhanced. In Project 2, in general, the learning attitudes of the participants in the MP4 and MP3 groups were tremendously enhanced, especially in the MP4 group, while in the PC group the participants' learning attitude was the least enhanced. These differences indicate that when the content of science learning is easy, as in Project 1, the use of MP3s can best enhance students' learning attitude, although watching the film would be even better, according to the participants. However, when
the content of science learning is more difficult and the length is longer, both MP4s and MP3s can more effectively enhance students' learning attitude.

The learning attitude of the participants in the PC group in both projects, however, shows the least enhancement, despite the fact that in Project 2 the learning attitude of these participants started out as the best. This was probably because participants were asked to choose a different tool from the one they used in Project 1. The participants now in the PC group were those who had signed up for the MP4 and MP3 groups, and they already had exciting and successful learning experiences from Project 1. Likewise, those in the MP4 and MP3 groups were mostly those from the PC group in Project 1, so their learning attitude was not as good as that of the other two groups.

Another interesting finding about the PC group's participants in both projects is that there was not much variation in their learning attitude before and after the projects. This is very different from the MP4 and MP3 groups. Perhaps it was because they use PCs every day; therefore, using the podcasts on the PC is not exciting, and their learning attitude was not enhanced much.

3.3. Findings about students' use of tools

Questionnaire assessments also found dormitories to be the most popular location where all three groups of participants used the MP4s, MP3s, or PCs, especially for those in the MP3 group, as its frequency tremendously outnumbered that of the other two groups. On the train or MRT (Mass Rapid Transit), the MP3s were much more often used for the learning of science in these projects than the MP4s. All participants who signed up for the MP4s were excited and cherished the opportunity of using them for science learning, even though a few of them also admitted that it was a bit embarrassing to use them on the MRT because they might attract other passengers' attention.

4. Conclusion

The findings in this study are significant in that they provide insights for the teaching of scientific English to EFL (English as a Foreign Language) students. The deep learning that occurred in the participants' learning processes [7] was exciting. Apparently, higher-order thinking and cognitive engagement [9] were enhanced by the use of podcasts in the MP4, MP3, and PC scenarios. The major findings about the participants' preference for using MP3s for surface learning of scientific English and their preference for using MP4s for deep learning, as well as the longest duration of tool use, suggest an important pedagogical implication: MP4s are effective tools in the learning of scientific English in podcasting scenarios. Concept-mapping was also proved to be an excellent tool for the consolidation of new knowledge in science [3].

5. Acknowledgements

This study was supported in part by the National Science Council of Taiwan, Republic of China (Grant nos. NSC-96-2524-S-008-002 and NSC-95-2520-S-010-001).

6. References

[1] Chen, N.S., Kinshuk, J., Wei, C.W., & Chen, H.J. (2007). Mining e-Learning Domain Concept Map from Academic Articles. Computers & Education, pp. 694-698.
[2] Godwin-Jones, R. (2005). Emerging Technologies: Skype and Podcasting: Disruptive Technologies for Language Learning. Language Learning and Technology, 5(3), 9-12. Retrieved from: http://llt.msu.edu/vol9num3/emerging/.
[3] Hilbert, T.S. & Renkl, A. (2007). Concept mapping as a follow-up strategy to learning from texts: what characterizes good and poor mappers? Instructional Science, 2007, DOI 10.1007/s11251-007-9022-9.
[4] Novak, J.D., & Gowin, D.B. (1984). Learning how to learn. New York: Cambridge University Press.
[5] Novak, J.D. (1990). Concept mapping: A useful tool for science and education. Journal of Research in Science Teaching, 10, 923-949.
[6] Novak, J.D. (1995). Concept mapping to facilitate teaching and learning. Prospects, XXV(1), 79-85.
[7] Offir, B., Lev, Y., & Bezalel, R. (2007). Surface and deep learning processes in distance education: Synchronous versus asynchronous systems. Computers & Education, doi:10.1016/j.compedu.2007.10.009.
[8] Perini, L. (2005). Visual representation. In S. Sarkar and J. Pfeifer (Eds.), Philosophy of Science: An Encyclopedia, 2, 863-870. London: Routledge.
[9] Stoney, S., & Oliver, R. (1999). Can Higher Order Thinking and Cognitive Engagement Be Enhanced with Multimedia? Interactive Multimedia Electronic Journal of Computer-Enhanced Learning, 1(2).
7. Appendix

Questionnaire for post-project assessment.

Fill out the following questionnaire, post it in our forum entitled "Questionnaire about the podcasting experiences", and submit all versions of your online concept maps on Oct. 10, 2008 when we meet again. In our class assessment, on your concept maps, please specify which are the final concept maps. Your participation in answering this questionnaire will be counted toward your participation score for this course.

Your name: ____________; Department: _____________________; gender: ______.

1. Which group do you belong to? ___ PC group; ___ MP3 group; ___ MP4 group.
Why did you choose to use this tool? (Reasons) _____________________.

2. In the space below, please record the following information about your learning from the podcasts: frequency (= how many times), duration (= for how long each time; the exact period of time for learning), when (at what time, and on which dates) and where (e.g., while you're taking the MRT or while you're waiting in line); and the strategies you used for science learning.

1) (The first time and the details required above)

2) (The second time and the details required above)

3. What strategies did you use in learning in this project? How did they work? Any excitement or difficulties? How did you overcome the difficulties in learning?

4. Your learning attitude in this course (scientific listening and speaking) before we began these podcasting projects was:
_____ Good; _____ So-so; _____ Poor.
Why? ____________________________________________.

Your learning attitude in this course (scientific listening and speaking) after these podcasting projects were completed was:
_____ Good; _____ So-so; _____ Poor.
Why? ____________________________________________.

Did your way of learning in this project, that is, either by using the PC or doing mobile learning, enhance your motivation and interest in learning scientific listening and speaking in English?
______ Yes, because ____________________________________________.
______ No, because ____________________________________________.

5. Feelings about these learning experiences. Difficulties and excitement should be reported as well. Also, please explain your attitude about podcasting. Is it helpful or not, and why? Would you suggest that I implement podcasting by using MP4s and MP3s in this course in the future?
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

The Study of Using Sure Stream to Construct Ubiquitous Learning


Environment

Koun-Tem Sun Hsin-Te Chan


Dept. of Information Science and Learning Technology
National University of Tainan
ktsun@mail.nutn.edu.tw

Abstract

Over recent years, with the rapid development of science and technology, the Internet has become a necessary part of human life and is used for communication, file transmission, remote login, etc. However, most mobile learning platforms on the market face a common problem: the traditional multi-point communication technique is limited by the bandwidth of the network, and therefore users cannot obtain the best learning scenario.
In this study, we adopted the theory of context-aware learning, introduced novel communication techniques, and proposed a technique using the Sure Stream streaming service, with the aim of ensuring the smoothness and stability of media communication. Based on the characteristics of HSDPA mobile communication, we established an environment for Ubiquitous Learning, creating good conditions for e-learning. We also tested various types of wireless communication media on the market and generated the corresponding test reports as reference material for related research.

1. Introduction

Over recent years, the rapid development of wireless mobile technology has created the possibility of mobile learning, which frees the mobile device of time and place limits. With the increase in the number of base stations for 3G and 3.5G cell phones and the widespread use of WiMax (802.16e) and WLAN (802.11a/b/g) wireless networks, the online learning system may, according to the place and attributes of the learner, provide appropriate information actively, making learning theory and ubiquitous learning practical. According to Kinshuk, mobile technology has led to a revolution in long-distance teaching and learning, and learning at any time and any place becomes a reality [1].
The realization of ubiquitous learning provides an online learning mechanism based on context awareness. It is expected that the new communication media can provide users with a new learning technique and philosophy. If ubiquitous learning is used in teaching and learning, the positioning technique of wireless communication can enable the digital learning system to acquire clear knowledge of the learning behavior of the learner. In addition, an effect of "learning at the scene" can also be created, making the learning more interesting and producing better learning results. Mobile devices can also be used to assist teaching and learning [2]. However, for the sake of convenience, a mobile device is generally small and portable; therefore, the monitor is small and the computing ability is not quite satisfactory, bringing trouble to the use of audio-visual teaching materials [3]. In this study, we analyzed the mobile device teaching platforms currently available on the market and found the following disadvantages:
1) The system cannot, based on the bandwidth of the user, choose an audio-visual file of appropriate size.
2) The quality of online audio-visual files is not satisfactory; therefore, the use of certain teaching materials is limited to a region, and such teaching materials cannot be used in a U-learning environment.
Resources are limited; therefore, how to utilize current techniques to improve the load capability of the system, conduct optimized management of the bandwidth, and ensure high-quality transmission of teaching media is very important to the quality of the teaching system. It is also the subject of this study.

2. Research Purposes

In this study, we based our work on the context-aware learning theory and introduced new communication techniques, with the aim of solving problems in media

transmission in the traditional M-Learning provide users with appropriate information. Context-
environment. There are two major studying purposes: aware learning emphasizes that learning contents shall
1) Advance a U-learning environment using the Sure be in the knowledge context, so that the learner can
Stream mechanism, so as to ensure the smoothness interact with new information actively and acquire
and stability of media transmission. useful knowledge. The application range of context
2) Test various types of wireless communication media awareness arising from situation factors varies, and
available on the market (including HSDPA and therefore, the behavior of context awareness is
WLAN), and produce testing reports as reference different. Schiller & Voisardΰ2004α pointed out that
materials for others. the behavior in context awareness is divided into:
1) Active context awarenessΚAs soon as receiving
3. Research Methods situation factors of the user, the system will, based
on these situation factors, adjust services actively;
In this research, we first, through literature review, 2) Passive context awarenessΚThe user actively files
acquired clear knowledge of related learning theories,
requests as to preferred situation factors, and then
network communication techniques, stream techniques
the system will, based on requests of the user,
and etc, and then, on the basis of current environment
provide task information[4].
framework, advanced a technique combining
In the research, we mainly adopted passive context
characteristics of HSDPA mobile communication and
awareness as the basic theory and the basis for system
using Sure Stream intelligent stream technology for
development.
solving problems as to media load met by mobile
device in ubiquitous learning environment. Finally we 4.1.2 ADDIE Teaching Design Mode. In 1975, US
also tested various types of wireless communication Army developed a set of IPISD(Interservice
media available on the current market, and then Procedures for Instructional Systems Development), to
generated testing reports and analyzing results as improve the efficiency of military training [5]. Later, it
reference materials for other researchers. was widely used in the field of digital learning., as
Major studying procedures are shown as follows: shown in Fig. 1. Note that the five procedures are in
1) Literature review circular relationship, not linear relationship. Each
2) System planning procedure is closely related to the next procedure, and
3) System designing and implementation sends feedbacks to the last procedure. The last
4) Implementation results and analysis procedure will reflect the first procedure. This is the
5) Conclusion concept of so-called Congruence (William, 2003).
4. Literature Review
The part of literature review includes four sections.
It is our hope that we may acquire some knowledge of
information related to this study from literature
exploration as the theoretical framework and basis of
this study. In the first section, we conducted analysis
of relevant learning theories; In the second section, we
explored various types of streaming techniques; In the
third section, we reviewed the development of the Sure
Stream technology and summarized its advantages; In Fig 1: ADDIE Teaching Design Mode
the fourth section, we described the grammatical
characteristics of Synchronized Multimedia Integration Limited by manpower and time, we attached
Language (SMIL). importance to procedures such as analysis, design,
implementation and development in the ADDIE mode.
4.1 Learning Theories Required by
Curriculum Establishment 4.2 3G & 3.5G Communication Technology
4.1.1 Context-aware Theory. Schiller & Voisard The so-called The Third Generation Mobile
(2004) pointed out in Location-Based Services that Terminal (hereinafter referred to as 3G) is a concept
the concept of Context Awareness was advanced by advanced by the International Telecom Union. Based
US DOD in 1970. Its major spirit is to correctly on this concept, standards such as IMT-2000and

UMTS were prepared. WCDMA and CDMA2000 fall into the category of 3G, and TD-SCDMA, popular in Mainland China, is also a type of 3G. 3G is capable of transmitting digital data at a high rate and adopts the same packet transmission and IP network addressing techniques as the computer network; its highest transmitting rate reaches 384 Kbps to 2 Mbps.
3.5G (HSDPA) is the high-speed version of the WCDMA 3G technology. HSDPA is the abbreviation of High-Speed Downlink Packet Access, a mobile communication protocol used to improve the downlink connection rate of 3G. This protocol provides packet data services in the downlink of W-CDMA. The transmitting rate may reach 8-10 Mbit/s in a 5 MHz carrier (if MIMO technology is used, it may reach 20 Mbit/s). In its actual realization, Adaptive AMC, MIMO, HARQ, quick dispatch and Fast Cell Selection are adopted [6]. In Taiwan, the highest 3.5G speed rate is 3.6 Mbps.

4.3 Streaming Technology

The so-called streaming technology refers to the audio-visual technology of transmitting multimedia data (audio-visual information, written language, etc.) over current network equipment to a single point or to multiple points. Protocols such as Multicast, Unicast, RTSP, RTP and TCP/IP are adopted. Based on these communication protocols, multimedia data are divided into many small frames and transmitted at an appropriate frame rate per second; the receiving end can start reading as soon as sufficient frames have arrived, so there is no need to receive all the multimedia data first. Figure 2 shows the locations of these communication protocols in the protocol framework.

Fig 2: Multi-media Protocol Stack (Schulzrinne, 2000)

Take the broadcasting of video streaming as an example. The data size temporarily stored in the buffer zone at the user's end varies with each unit time point; therefore, to ensure the smoothness of the broadcast, the data size in the buffer zone must be kept above the minimum value required for broadcasting a frame. The bottom value of the data size at unit time point t is represented by d(t). The total amount of data from the 1st time point to the t-th time point is represented by D(t), as in Formula (4.3.1):

D(t) = \sum_{i=1}^{t} d(i)    (4.3.1)

We can use the capacity of the buffer zone at the user's end and the total number of frames to calculate the upper receiving limit of the buffer zone. The capacity of the buffer zone is represented by b and the total number of frames by N. The receiving upper limit B(t) of the buffer zone is calculated as in Formula (4.3.2):

B(t) = \min\{ D(t-1) + b, \; D(N) \}  for t = 2, ..., N, with B(1) = b and B(0) = 0    (4.3.2)

If the amount of data transmitted at a unit time point (meeting the upper and lower receiving limits of the buffer zone) is a(t), the cumulative amount from the 1st time point to the t-th time point is A(t), as in Formula (4.3.3):

A(t) = \sum_{i=1}^{t} a(i)    (4.3.3)

A smooth streaming schedule is made by keeping the cumulative data transmitted between the upper and lower receiving limits of the buffer zone; whenever A(t) stays between D(t) and B(t), a feasible streaming schedule is generated, as in Formula (4.3.4):

D(t) \le A(t) \le B(t)    (4.3.4)

Based on Formula (4.3.4), we can obtain a smooth streaming schedule as shown in Figure 3.

Fig 3: smooth streaming schedule

B(t) is the upper limit of the buffer zone and D(t) the lower limit. The broken line at the bottom represents an unfeasible streaming schedule, and the first broken line staying between the upper and lower limits is a feasible, smooth streaming schedule. A smooth streaming schedule can solve the problem of insufficient buffer data caused by a very slow streaming transmission speed, or of data loss caused by insufficient buffer space under a very fast streaming transmission speed [7,8].
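As a concrete illustration of Formulas (4.3.1)-(4.3.4), the following is a minimal sketch (our own, not code from the paper) that computes the cumulative playback demand D(t) and the buffer upper limit B(t), and checks whether a given per-time-point transmission schedule a(t) is feasible. The frame sizes and buffer capacity are invented example values.

```python
from itertools import accumulate

def feasible_schedule(frame_sizes, buffer_capacity, schedule):
    """Check D(t) <= A(t) <= B(t) at every unit time point t (Formulas 4.3.1-4.3.4).

    frame_sizes     -- d(1..N): data size needed to play one frame per time point
    buffer_capacity -- b: capacity of the client-side buffer zone
    schedule        -- a(1..N): data size actually transmitted per time point
    """
    N = len(frame_sizes)
    D = list(accumulate(frame_sizes))                  # D(t), Formula (4.3.1)
    # B(1) = b, B(t) = min{D(t-1) + b, D(N)} for t >= 2, Formula (4.3.2)
    B = [buffer_capacity if t == 0 else min(D[t - 1] + buffer_capacity, D[-1])
         for t in range(N)]
    A = list(accumulate(schedule))                     # A(t), Formula (4.3.3)
    return all(D[t] <= A[t] <= B[t] for t in range(N))  # Formula (4.3.4)

# Example: constant-rate transmission of 300 units per time point against variable frame sizes.
frames = [200, 350, 250, 400, 300]
print(feasible_schedule(frames, buffer_capacity=500, schedule=[300] * 5))  # -> True
```

A feasible schedule such as the constant-rate one above corresponds to what Figure 3 depicts as the broken line lying between D(t) and B(t).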
4.4 Sure Stream Technology

Sure Stream refers to the technology of transmitting A/V files by streaming via a Stream Server and then broadcasting the transmitted media using a special compression algorithm, based on the bandwidth at the user's end. With SVT (scalable video technology), even a low-speed device can select the optimum compression code rate to broadcast media data in a smooth fashion, without receiving all of the media data. Its main operating methods are as follows:
1) First, establish a coding framework that allows several streams at different speed rates to be coded at the same time and then incorporates them into one file.
2) Adopt a Client/Server mechanism to detect changes in bandwidth. Considering that software, equipment and data transmission speeds may differ, media data at different speed rates are coded, recorded and then stored in a single file; in other words, an extensible streaming audio-visual file is established. Upon making a request, the client sends its bandwidth capacity to the server, and the server, based on the bandwidth of the client, sends those parts of the Sure Stream file that fit that bandwidth to the client end, as shown in Figure 5.
When the user requests a piece of media content, the system's coding instrument records the media at different speeds, and bandwidth information at the client end is sent to the stream server. Upon receiving the request of the client end, the stream server, based on the part of the package that corresponds to the bandwidth and the highest possible bandwidth, transmits files by streaming. This is the so-called Sure Stream technology.
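The selection step just described can be pictured with the short sketch below; it is our own illustration under assumed bitrates, not RealNetworks' implementation of Sure Stream. Given the variants encoded into a single Sure Stream file and the bandwidth reported by the client, the server picks the highest variant that fits.

```python
def pick_stream_variant(encoded_bitrates_kbps, client_bandwidth_kbps):
    """Return the highest encoded bitrate that does not exceed the client's bandwidth.

    Falls back to the lowest variant if even that exceeds the reported bandwidth,
    so playback can still start (at reduced quality).
    """
    fitting = [r for r in encoded_bitrates_kbps if r <= client_bandwidth_kbps]
    return max(fitting) if fitting else min(encoded_bitrates_kbps)

# Example: one Sure Stream file carrying five variants; a 3.5G client reports ~1150 kbps.
variants = [64, 150, 350, 700, 1200]
print(pick_stream_variant(variants, client_bandwidth_kbps=1150))  # -> 700
```

Because the client/server mechanism keeps detecting bandwidth changes, a selection like this would be repeated whenever the measured bandwidth changes.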

Fig.5: Sure Stream Automated Bandwidth Adjustment (Real, 2008)

5. System Planning

5.1 System Assessment

Network communication techniques such as 3G, 3.5G, WiMax, WLAN, RFID, Bluetooth and IrDA are becoming increasingly mature. However, WiMax base stations have not yet been widely deployed, and the transmitting distance of RFID, IrDA or Bluetooth is still too short (see Table 1: Communication Techniques Comparison). This research therefore focuses on the 3.5G, 3G and WLAN environments.

Table 1: Communication Techniques Comparison

5.2 System Architecture

To apply the Sure Stream technology effectively in the Mobile/WLAN heterogeneous network environment, we adopted a three-tier model to design the system. Its architecture is divided into the LMS Server, the Stream Server and the mobile device platform. The LMS Server (Business Logic Tier) provides users with teaching materials; the Stream Server (Data Service Tier), based on the operations conducted by the front-end user (Presentation Tier) on the mobile device, processes the data transmitted by the LMS Server in an optimum way and sends it back to the mobile device, so that the learner may read and learn, as shown in Figure 6.

Fig.6: System Architecture

6. System Design and Implementation

6.1 System Design
The purpose of this research is to demonstrate When a person is learning in situated environment,
teaching materials on mobile device such as PDA or Mobile Device will send out demand information, and
UMPC (Ultra Mobile PC). And therefore, the system then use RAM sent by the browser to communicate
design architecture is divided into 1. teaching materials with Stream Server based on RTSP protocol, The
development and 2. system programs establishment. automated bandwidth judgment algorithm as shown in
1) Teaching Materials Development: Figure7. At this moment, Stream Server will, based on
Based on the theory of context language learning package signals sent by the Player, transmit data
and by introducing awareness technology, the streams in Multiple Encoded Stream File of Data
learning environment will be provided based on Storage at an optimum speed rate. And RTCP and RTP
location-specific content. The method for providing will provide flow volume control and congestion
teaching materials is online content providing. And control services.
the design of teaching contents is the ADDIE mode
frequently mentioned in Instruction System Design
(ISD).
2) System Programs Establishment:
This system is mainly in ASP. Net program
language and is developed by adopting distributed
structure. Webpage is the major user interface.

6.2 System Implementation


6.2.1. System Flow Design
Fig.7: Automated bandwidth judgment algorithm
There are six major designing procedures:
1) Establishment of U-Learning operating environment
2) Establish HSDPA/WLAN transmitting test platform 7. Implementation Results and Analysis
3) Analysis, design and preparation of teaching contents:
The research focused on the application of Sure
It is planned that the learning environment will be
Stream to Ubiquitous Learning Environment, and
prepared based on location-specific content.
based on context language learning theory, prepared
Teaching materials will be designed based on
twenty-eight learning contexts. Finally, we tested this
various situations. The major researching place is
learning platform through Mobile Device in
southern Taiwan, for there are diversified and rich
Mobile/WLAN environment. Details about related
situations in southern Taiwan. Please see Table 2 for
interfaces and results are shown as follows:
details.
1) Program interface and operation
4) Interface design : Design Mobile Device interface
5) Provide Context-Aware M-Learning environment
6) Environment examination.

Table 2: Learning Place/Context


Learning place – Teaching content
Chih-kan Tower – Snack, history
Science Park, Tainan – Science and technology
Tainan University – History
Tainan National University of the Arts – Scenic spots, arts, literature
Zuoying High Speed Railway Station – Ticketing affairs
Sizihwan Bay, Kaohsiung – Scenic spots
Pingtong Park – Scenic spots, history
Pingtong Night Fair – Snack

2) Realization of Sure Stream mechanism

6.2.1. Design and Application of Sure Stream

3) Ubiquitous Environment
a. Mobile devices: ASUS P750 and R2H UMPC
b. Testing platform: Chunghwa Telecom's 3.5G/3G
c. Testing places: twenty-eight places in Tainan, Kaohsiung, and Pingtong
4) Results Analysis
Results show that, because the telecom companies' 3.5G systems have not been upgraded completely, 3.5G signals were received in only one place during the course of testing. For the 3G system, signals were received in all testing places. Although the theoretical download bandwidth of the HSDPA system can reach 3.6 Mb/s, the measured value was only 1.2 Mb/s. Figures 8 and 9 show the results of applying Sure Stream to the Ubiquitous Learning Environment.

Fig 8: Ubiquitous Environment Bandwidth Testing (measured signal/bandwidth, in Kbps, at each test location)

Fig 9: Sure Stream Automated Bandwidth Judgment (available bandwidth and selected Sure Stream bitrate, in Kbps, at each context-awareness place)

8. Conclusion

The major spirit of "Ubiquitous Learning" is to provide the user with appropriate teaching materials at the right time and place. Based on this spirit, we adopted the technique of automated bandwidth adjustment to ensure the smoothness and stability of media transmission and, based on the context-aware theory, used the ADDIE teaching systematic design mode and mobile communication technology. Implementation results show that, although the widespread use of the 3.5G environment has not yet been realized, a Ubiquitous Learning Environment can still be realized on mobile devices by using Sure Stream, SMIL and other techniques.
Follow-up research shall focus on supplementing teaching materials that satisfy the learning situations, developing better user interfaces, and providing quantified research results on the application of the cooperative learning mechanism, the tracing of the learning process, and the efficiency of the context learning mechanism, with the aim of learning whether learning efficiency under the Ubiquitous Learning Environment has been improved.

9. References

[1] Kinshuk, (2003), Adaptive mobile learning technologies. URL: http://www.globaled.com/articles/Kinshuk2003.pdf. Accessed 20/02/2008.
[2] Chia-Hui Huang, Chiung-Hui Chiu, (2005), "The excellent learning plans: Build high performance integration environment with reality and digital learning", Technique Reports.
[3] Maija Metso, Mikko Löytynoja, Jari Korva, Petri Määttä and Jaakko Sauvola, (2001), "Mobile Multimedia Services – Content Adaptation," 3rd International Conference on Information, Communications and Signal Processing.
[4] Ruei-Ting Feng, (2004), "The study of Implementation and Application for Context awareness Mobile outdoor ecosystem learning system", Unpublished master's thesis, National Taiwan Normal University, Department of Industrial Education.
[5] Dick, W., and Carey, L., (2001), "The Systematic Design of Instruction", 5th Ed., Longman, New York.
[6] Wikipedia, (2008), High Speed Downlink Packet Access, URL: http://en.wikipedia.org/wiki/High-Speed_Downlink_Packet_Access. Accessed 20/02/2008.
[7] J.D. Salehi, A. L. Zhang, J. Kurose, and D. Towsley, (1998), "Supporting Stored Video: Reducing Rate Variability and End-to-End Resource Requirements through Optimal Smoothing," IEEE/ACM Transactions on Networking, vol. 6, no. 4, pp. 397-410.
[8] Yu-Ching Lin, (2003), "A Study on Developing a Streaming Client-Server Architecture for MPEG-4 Video and Audio", Unpublished master's thesis, I-SHOU University.
2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Interest-based Peer Selection in P2P Network


1,* 2 3
Harry Chiou Addison Su Stephen Yang

1, 2, 3
Dept. of Computer Science and Information Engineering, National Central
University, Taiwan
1
oilcat2003@gmail.com ,2addison@csie.ncu.edu.tw, 3jhyang@csie.ncu.edu.tw

Abstract interest first. It is same in P2P architecture. Based on


this characteristic, we suppose that peer has the higher
Currently most distributed peer-to-peer network possibility to answer user's query if the peer has similar
systems send to other peers when users’ query request, interest with user than other peers. Therefore, we
but their search result is a great deal of needless merge user’s interest into P2P architecture and advance
answer as gap between their search goal and their interest-based peer selection mechanism in this paper.
information need. It is because each peers own When searching for something, user can know the peer
different interest domain knowledge. However, we whose interest is most similar to the user's interest by
purpose interest-based peer selection in order to rank the arranged peer list at first time. And transmitting the
best peers under the user context with content retrieval query to the selected peer directly. Because of it,
tasks. Generally speaking, providers may be best peers system will reduce query routing range or broadcast
what user want find. Providers can satisfy users’ the query. In addition, peer has the higher possibility to
information need, if we figure out providers what are answer user's query if the peer has similar interest with
user search goal. user. We transmit query to these selected peers directly
Based on ACM domain knowledge, we propose will increase the precision in theory and reduce the
reference ontology in order to calculate the semantic possibility with user unsatisfied query result then
similarity under each peer preferences, and rank best resend query.
peers. Feedback data is updated ontological user Because JXTA mechanism[1] doesn’t filter peer at
profile by user activities and content retrieval tasks. present. JXTA will store all peer’s advertisement into
Our approach can satisfy users’ information need and intrinsic cache which peers are routing pass through.
reduce their search time-consuming. When user want to query something. JXTA will use
multicast to transmit query to the peers saving in the
Keywords: interest, peer selection, JXTA, ACM, cache first. If these peers cant answer the query, system
similarity will broadcast the query into P2P network. It will
produce two drawback as listed below : 1.There are
1. Introduction many meaninglessness peers information in the cache.
2.Because of the cache saving the meaninglessness
Recently,P2P( peer-to-peer) search becomes the peers information. When query transmit to other peers,
popular and hot issue. P2P search brings many fetching searching result is usually filled with needless
features and support a new mechanism with sharing information. User need to waste additional time to
resource. Unlike Client/Server architecture, P2P filter the search result.
architecture doesn’t need to set up server. All users can In this paper, we advance interest-based peer
share resource with each other directly. It provides a selection mechanism in order to find adaptive peers
superior method to solve server overhead problem. But, and utilize this mechanism to reduce the frequency of
we need to consider other conditions. For example, in broadcast. In other words, it will reduce the query
P2P network how to find right peers, which can answer routing cost. Therefore, our Interest-based peer
user query. selection mechanism has three contribution as listed
How to find right peers? We can observe that in below: 1.Peer’s information stored in cache is more
daily life. When we are looking for information about meaningful. 2.It will reduce the broadcast frequency
something interested. In general, people will ask for and routing cost. 3.It will increase searching precision
the information from their friends who has same by finding out peers whose interest are similar.

We implement the mechanism on the Adutella intuitively and empirically derived, combines the
which is a P2P bibliography resource sharing software shortest path length between two nodes and the depth
developed by our research team. in the taxonomy of the most informative subsumer.

2. Related works B. Information Content Measures


In this category, similarity measures are based on
The peer routing algorithm using in Gnutella is the information content of each concept. Information
called flooding. It will generate a large amount of Content Measures need statistics and probability to
traffic in the peer-to-peer network. If we want to calculate similarity. Therefore Information Content
increase search performance, at first we need to avoid Measures need corpus. For example, Lord et al[6].
using broadcast as possible as we can.
Aiming to the flooding problems, Structured P2P C. Feature-Based Measures
and Decentralized P2P architecture appear in P2P Up to now, the features of the terms in the
technique along with time. Someone put another ontology are not taken into account. However, the
attribute into DHT to record additional information[2]. features of a term contain valuable information
This method is based on the physical topology. But this concerning knowledge about the term. Feature-Based
method will limit by DHT capability and data type, it Measures considers also the features of terms in order
cant record extensive information like semantic to compute similarity between different concept, while
information etc. The other method focus on the it ignores the position of the terms in the taxonomy and
semantic analysis. For example, Peter Haase advance the information content of the term. For example,
semantic topology[3]. Semantic topology is Tversky [7].
independent of physical topology. Every peer will
abstract their expert knowledge from their knowledge D. Hybrid Measures
base. At the same time, user’s query will abstract query The next approaches used to compare two
topic. At last system can find out peer’s whose expert concepts c1 and c2 combine some of the above
knowledge similar to user’s query topic. Then system presented approaches, considering the path connecting
will forward query to the similar peer. It can reduce the two terms in the taxonomy, the IS-A links of the
query routing range and routing time. Next section we terms with their parents in the graph and the features of
will roughly introduce these measurement. the terms. For example, Rodriguez et al.[8].
These measurements are used to calculate words
A. Edge-Counting Measure similarity on corpus (like wordnet) or medical terms
In the first category, Edge-Counting measures similarity on medic ontology ( like MESH ). We want
utilize the numbers of edge between two nodes in the to challenge to apply these measurements on computer
taxonomy to calculate similarity. In general speaking, science ontology. We will use the ACM topic ontology
if the numbers of edge between two nodes are fewer be our domain knowledge. ACM topic ontology is a
then the similarity of two nodes are higher. This kind very convincing paper categorized ontology. Next we
of measurement is using to calculate words similarity need to choose the adaptive approach between these
in the corpus at the earliest. Recently this kind of measurements. In the paper X-similarity : Computing
measurement is also using to calculate similarity of semantic similarity between concepts from different
two ontologies gradually. In this paper, we will use this ontologies[9], author mention that Edge counting and
kind of measurement to calculate interest similarity. information content methods work by exploiting
Therefore, we will discuss this kind of methods more structure information (i.e., position of terms) and
detailedly. First we will introduce the fundamental information content of terms in a hierarchy and are best
method Wu and Palmer [4]. The Wu and Palmer suited for comparing terms from the same ontology.
approach consider about the depth and length of the Feature-Based Measures and Information Content
node in the taxonomy. Next, we will introduce the Measures are suited for comparing terms from the
popular method Li et al.[5]. Li et al. formula is different ontologies. Because we only use one
extending from Wu and Palmer formula. Same as Wu ontology–ACM topic ontology and our research
and Palmer, first we need to find out the most environment didn’t set up corpus. At last we choose
informative subsumer of two comparing nodes. Edge-Counting Measure to calculate similarity. But we
Variable H is meaning depth of the most informative will supply for other approach applying on ACM topic
subsumer in the taxonomy. The algebra alpha and beta ontology and comparison in future research.
will substitution 0.2 and 0.6. The value of alpha and composition.
beta is author’s experiment result. Li et al formula was

3. Method similarity and categorize peers according to the
similarity 2.Peer rank : System will utilize the interest
Figure 2. is a rough description about Adutella score to revise the similarity and rank the peers
application scenario. Adutella is a P2P bibliography according to the revised similarity. Finally, user will
resource sharing software developed by our research get a categorized and ranked peer list. Because of our
team. It is applying in computer science research domain is based on research environment. Therefore,
environments. Adutella support some services about our domain knowledge is using ACM topic ontology.
finding adaptive peers, personalized category and rank, ACM topic ontology is very convincing paper
and collect user’s behavior to produce feedback categorized ontology in computer science. We will use
mechanism. Edge-Counting approach to calculate similarity. We
will discuss about similarity calculation in detail in a
later section.

Figure 3. Interest-based peer selection mechanism


Figure 2. Adutella application scenario
3.2ʳPeer interest profile
When Adutella user want to search some academic
articles. He can utilize personal search function to If we want to find out the interesting resemblance
search articles. If user first time using Adutella, system between pees, first we need to find a way to describe
will ask user to input user’s personal favorite and peer’s interest. Therefore, we use interest profile to
create user’s personal interest profile. The next step record and describe peer’s interest. There are three
system will collect other peer’s interest profile on the terms in the peer interest profile as below :
P2P network and analyze it. Find out interest similar { ACM topic index , interest topic name , interest
peers. Then system will output a categorized and score }
ranked peer list to user. It supports a smart and Interest topic name is used to describe the ACM
convenient information or knowledge provider topic names which peer are interesting. ACM topic
selection mechanism. Finally system will auto record index is used to describe the recorded topic belong to
user’s behavior to produce feedback. Feedback will which ACM category. Interest score describe the
influence the user’s searching result and peer list. interesting degree. For example as below :

Peer A interest profile :


3.1ʳInterest-based peer selection mechanism H.3.3.3 Query formulation (7)
H.3.1.2 Dictionaries (8)
In this section, we will introduce Interest-based Peer B interest profile :
peer selection mechanism basic architecture first. C.2 COMPUTER-COMMUNICATIONNETWORKS (8)
I.5.4 Application (6)
Second, we will explain how to record and describe
peer’s interest. Finally, we will explain how to utilize
There are two peer interest profiles in above
collected information to calculate interesting similarity.
example - peer A and peer B. Peer A’s interest profile
Create a categorized and ranked peers list. Figure 3. is
has two rows. It recorded [ H.3.3.3 Query formulation
the Interest-based peer selection mechanism basic
( 7) ] and [H.3.1.2 Dictionaries (8) ]. For each row is
flowchart. First, system will collect other peer’s
indicate that peer’s interesting ACM topic information.
interest information(peer interest profile). Second,
For example, we can looking at this row[ H.3.3.3
these raw information will deal with two step as below
Query formulation ( 7) ]. It means peer A is interesting
1. Peer categorize : System will calculate peer’s
in topic “Query formulation”. “Query formulation” this

topic is categorized in H.3.3.3 and peer B’s interesting will categorize peer B into group tk. The variable S
degree is 7. We are setting the interest score number means the average similarity between peer A and peer
range into 1 to 10. Peer’s interesting degree is B. The value of S can use to rank peers. In our
accompany the interest score increasing. Therefore, mechanism, we must set up a threshold value. If the
from the above example we can know that peer A is value S of peer B more than threshold. It means peer B
interesting in Query formulation and Dictionaries and is having similar interest with peer A. Peer B’s interest
having high interesting degree with both topics. Peer B profile will store into peer A’s cache. Otherwise, if the
is interesting in Computer-Communicationworks and value S of peer B less than threshold. Peer A will
Application. Although peer B is very interesting in abandon the peer B’s interest profile. The threshold
Computer-Communicationworks. But peer B just value must be determined by user. Peers similar or not
having normal interest degree of Application. are determined by user after all.[10]
If we want to find out the shortest path between two
3.3ʳPeer classification nodes, we must understand the node location in the
taxonomy first. In other words, we need to match peer
After we collect other peer’s interest profiles interest profile into the ACM topic hierarchy. Because
completely, we need to calculate other peer’s similarity we know that peer interest profile recorded the interest
to user. According to this similarity, we can categorize topic index. According to the index, we can easily
peers into different group. What are the groups? The know the topic’s location in the ACM topic hierarchy.
groups are user’s interest ACM topics. We can use For example, we can look at Figure 4. Figure 4. is a
above interest profile example to explain it. Suppose partly ACM topic hierarchy. We notice the peer A’s
peer B’s interest profile collected by peer A. Because interest topic [ H.3.3.3 Query formulation (7) ]. The
peer A’s interest profile record that peer A is index H.3.3.3 include two kind important information.
interesting in Query formulation and Dictionaries. First is the depth of this topic, second is the path from
Therefore, peer B will categorize into group “Query this node to root. Therefore, we can easily know that
formulation” or group “Dictionaries”. If peer B is peer A’s interest topic will map to the red circle. Peer
categorized into group “Query formulation”. It means B’s interest topic will map to the green circle. After
that if peer A want to search some academic articles getting the location of interest topic, we can calculate
about Query formulation, peer A can transmit query to
peer B directly. Because peer B also have interest in
Query formulation. Below is our peer categorization
algorithm:

ACM topics: T = { t1, t2, t3, ..., tn }
Peer interests: I(A) = { t1, t3, t4, ..., tx }; I(B) = { t3, t5, t8, ..., ty }

Max = 0
For each ti in I(A):
    S = 0
    For each tj in I(B):
        Sj = Similarity(ti, tj)
        S = S + Sj
    Si = S / ||I(B)||
    If Si > Max:
        Max = Si
        k = i
Categorize(B) = tk

Figure 4. partly ACM topic hierarchy

the shortest path between nodes (topics). We have three steps, listed below: 1. Find the common parent node. 2. Calculate the distance from the parent node to each of the two nodes. 3. Sum the distances.
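The sketch below is a runnable rendering (ours, with simplifications) of the categorization pseudocode and the three path-finding steps above: the common parent of two ACM topic indices is found by comparing index components from left to right, the distance of each node to that parent is its depth difference, and the two distances are summed. The similarity function here is only a placeholder based on path length; the measure actually used in the paper, the Li et al. formula with a depth term and an interest-score correction, is introduced in the following subsections.

```python
def components(index):
    """Split an ACM topic index such as 'H.3.3.3' into its components."""
    return index.split(".")

def shortest_path(index_a, index_b):
    """Edge-counting distance via the common parent (steps 1-3 above)."""
    a, b = components(index_a), components(index_b)
    common = 0
    for x, y in zip(a, b):              # step 1: longest common prefix = common parent
        if x != y:
            break
        common += 1
    # steps 2-3: distance of each node to the common parent, then sum
    return (len(a) - common) + (len(b) - common)

def similarity(index_a, index_b):
    """Placeholder edge-counting similarity; the paper refines this with Li et al."""
    return 1.0 / (1.0 + shortest_path(index_a, index_b))

def categorize(profile_a, profile_b):
    """Assign peer B to the topic of peer A with the highest average similarity."""
    best_topic, best_score = None, -1.0
    for t_i in profile_a:                                   # topics t_i of peer A
        s_i = sum(similarity(t_i, t_j) for t_j in profile_b) / len(profile_b)
        if s_i > best_score:
            best_topic, best_score = t_i, s_i
    return best_topic, best_score

print(shortest_path("C.2", "I.5.4"))                        # -> 5, as in the worked example
print(categorize(["H.3.3.3", "H.3.1.2"], ["H.3.3.2", "C.2"]))
```

Running it on the worked example discussed next gives a path length of 5 between topics C.2 and I.5.4, matching the text.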
If we want to find the shortest path between topic
The set t describe all ACM topic classification. The C.2 and I.5.4. First, we must find out the common
two sets I(A) and I(B) are describing the interesting parent node of two nodes. Left to right string
topic set of peer A and peer B. Sj is the similarity comparison is a good method to help us find out the
between peer B’s interesting topic j and peer A’s parent node. For example, we notice the term H.3.3.3
interesting topic i. Si is the similarity between peer B’s and H.3.1.2. We can find out different from the fifth
whole interesting topics and peer A’s interesting topic i. character. Therefore we can understand these two
The for loop will find out the max similarity between nodes have the parent node H.3. If comparison strings
peer B and peer A’s interesting topic k. At last system are entire different, it means the parent node is root. By

this method, we can know the parent node of topic C.2 But it doesn’t mean peer A and peer B are interesting
and I.5.4 is root. Second, we need to calculate the in database very much. Maybe peer B just want to
distance from comparison nodes to parent node. Index know the common sense with database, but peer A is
string length also tell us the node depth further we can the database technology fanatic. So we can use interest
calculate the distance by sub the depth. Therefore , we score to separate them. Further we can find out exact
can find out the distance from root to C.2 is 2 and the peer’s interest. The output peer list also will more
distance from root to I.5.4 is 3. At last, we sum the accurate. In below section, we will use the real
distance from root to C.2 and the distance from root to example to apply interest-based peer selection
I.5.4. Finally, We find out the shortest path between mechanism.
topic C.2 and I.5.4 is 5. Although this method is very
Case 1 : 2 peer’s interest are similar, but interest scores are
convenient to find out the shortest path between two
different
nodes. But this method need one precondition – the Peer A : ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ
taxonomy hierarchy must be a tree. In other words, H.3.3.4 Relevance feedback ( 10 )ʳ ʳ ʳ ʳ ʳ
every node in the taxonomy have only one path to H.3.3.2 Information filtering ( 8 )ʳ ʳ ʳ ʳ ʳ
H.3.1 Content Analysis and Indexing (6 )ʳ ʳ
other node. Although we know ACM topic hierarchy is
H.4.1.6 Word processing ( 7 )ʳ ʳ ʳ ʳ ʳ ʳ ʳ
an ontology, we can abandon the additional relation I.5.4.3 Text processing ( 3 )ʳ
and only use the body of ACM topic hierarchy. It is a Peer B :
tree not a ontology. H.3.3.4 Relevance feedback ( 4 )
H.3.3.2 Information filtering ( 2 )
If we only use the path length to determine the
B.3.3.1 Formal models ( 8 )
similarity, it will bring some situations we cant explain. F.3.1.3 Logics of programs ( 8 )
For example, we can notice Figure 4. The shortest path F.3.2.1 Algebraic approaches to semantic ( 7 )
between node H and I is 2. The shortest path between
node H.3.1 and H.3.3 is 2. If we only consider path Case 2 : 2 peer’s interest are similar, and interest scores are
similar
length to determine similarity. We will get a ridiculous Peer A : ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ
result - the similar degree of articles belong to H.3.1 H.3.3.4 Relevance feedback ( 10 )ʳ ʳ ʳ ʳ ʳ
and h.3.3 are equal to the similar degree of articles H.3.3.2 Information filtering ( 8 )ʳ ʳ ʳ ʳ ʳ
H.3.1 Content Analysis and Indexing (6 )ʳ ʳ
belong to I and H. It is not accord with classification
H.4.1.6 Word processing ( 7 )ʳ ʳ ʳ ʳ ʳ ʳ ʳ
principle. It is intuitive that concepts at upper layers of I.5.4.3 Text processing ( 3 )ʳ
the hierarchy have more general semantics and less Peer C :
similarity between them, while concepts at lower H.3.3.4 Relevance feedback ( 10 )
H.3.3.2 Information filtering ( 8 )
layers have more concrete semantics and stronger
B.3.3.1 Formal models ( 8 )
similarity. Therefore, the depth of concept in the F.3.1.3 Logics of programs ( 6 )
hierarchy should be taken into account. We survey lots F.3.2.1 Algebraic approaches to semantic ( 4 )
of methods to solve depth problems. For example, Wu
and Palmer, Li et al., Rodriguez et al., Tversky etc. At We can notice the first case two peer’s interest are
last, we choose Li et al. approach to solve the depth similar, but the interest score are very different . The
problems. We apply Li et al. formula into the above second case, two peer’s interest are similar, and the
example. We will get simLi ( H, I ) = 0.3599 , simLi interest score are similar. If only according Li et al.
( H.3.1 , H.3.3 ) = 0.5588. In other words, Li et al. formula to calculate similarity, we can’t separate the
approach means the similar degree of articles belong to case2 and case3. Therefore, we add a interest revise
H.3.1 and h.3.3 are bigger to the similar degree of parameter ICorrection after the Li et al. formula as below :
articles belong to I and H. Therefore, we will use the Li
et al. formula be similarity function in the peer I (A) = { v1 , v3 , v4 , … , vx }
categorization algorithm. I (B) = { v3 , v5 , v8 , … , vy }
Similarity interest( I(A) , I(B) ) =
3.4ʳPeer ranking For all ti  I(A),For all tj  I(B) Similarity Li at al.( ti , tj )
x ICorrectionʳ
In the section, we introduce that how to utilize if ti = tj
interest score to revise the similarity calculated by Li et ʳ ʳ ʳ ʳ ICorrection = ( vavg / || ( vi –v j ) || + vavg )
al. formula and rank the peer list order. In the before Else
section, we mention that we record interest score in the ICorrection = 1
peer interest profile. Interest score describe the vavg = ( v1 + v3 + v4 + … + vx ) + / | I(A)|
interesting degree. Because we must separate the
interesting degree from different topics or people. For I(A) and I(B) are two interest score sets which we want
example, peer A and peer B are interesting in database. to compare. Similarity Li at al. .( ) function is Li et al.

formula. vi and vj are interest scores of topic ti and tj. system will save the peer information into cache.
The number range of parameter ICorrection is from 0 to Otherwise, system will abandon the peer information.
1. At first we must determine the compared topics After this filter mechanism, user will get an arranged
equal or not. If topic i equals topic j then the parameter peer list. Query can transmit to the selected peers by
ICorrection = ( vavg / || ( vi –v j ) || + vavg ). It means that if user directly. This mechanism can reduce the broadcast
the interest score of topic i and topic j are more frequency and query routing cost.
different. Then the value of ICorrection will fewer and
similarity will decrease. Because the interest degree of 6. Acknowledgement
the topic i (same as topic j) are more different with
peer A and peer B. Otherwise, if the interest score of This work is supported by National Science Council,
topic i and topic j are the same. Then ICorrection = 1. it Taiwan under grants NSC95-2520-S008-006-MY3 and
means the parameter won’t revise the original NSC96-2628-S008-008-MY3
similarity. Because of it, we can utilize different
interest degree to find out the exact peer’s similarity. 7. References
Apply the new formula to above example. Because of
the interest score are very different of peer A and peer [1] D. S. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J.
C. We can separate with case2 and case3 and find that Pruyne, B. Richard, S. Rollins, and Z. Xu, “Peer-to-peer
the peer D is more similar to peer C. computing,” Technical Report HPL-2002-57, HP Lab, 2002.
[2] Q. Gao, Z. Qiu, “An Interest-based P2P RDF Query
4. Discussion Architecture, “CNDS lab, Peking university, Beijing, China,
Proceedings of the First International Conference on
Semantics, Knowledge, and Grid (SKG 2005).
Although we choose Edge-Counting Measure to [3] P. Haase, R. Siebes, F. van Harmelen, “Peer selection in
calculate similarity. But we will supply for other peer-to-peer networks with semantic topologies,”in:
approach applying on ACM topic ontology and International Conference on Semantics of a Networked
comparison in future research. Besides, we will World: Semantics for Grid Databases, June 2004, Paris.
challenge to use complete ACM topic ontology for [4] Z. Wu and M. Palmer, “Verb Semantics and Lexical
domain knowledge not the main body. Selection,” in Proceedings of the 32nd Annual Meeting of the
We will make more case to test interest revised Associations for Computational Linguistics (ACL'94), pp.
parameter ICorrection. In fact, we need to consider the 133-138, Las Cruces, New Mexico, 1994.
[5] Yuhua Li, Zuhair A. Bandar, and David McLean, “An
condition as below : Approach for Measuring Semantic Similarity between Words
Using Multiple Information Sources,” IEEE Transactions on
Peer A : ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ ʳ
Knowledge and Data Engineering, pp. 871-882, July/August
H.3.3.4 Relevance feedback ( 10 )ʳ ʳ ʳ ʳ ʳ
H.3.3.2 Information filtering ( 8 )ʳ ʳ ʳ ʳ ʳ 2003.
H.3.1 Content Analysis and Indexing (6 )ʳ ʳ [6] P.W. Lord, R.D. Stevens, A. Brass, and C.A, “Goble.
H.4.1.6 Word processing ( 7 )ʳ ʳ ʳ ʳ ʳ ʳ ʳ Investigating Semantic Similarity Measures across the Gene
I.5.4.3 Text processing ( 3 )ʳ ʳ ʳ ʳ ʳ ʳ ʳ Ontology: the Relationship between Sequence and
Peer B : Annotation," Bioinformatics, pp. 1275-1283, 2003.
H.3.3.3: Query formulation ( 10 ) [7] A. Tversky. Features of Similarity. Psycological Review,
H.3.3.6: Search process ( 8 ) pp. 327-352, 1977.
H.3.3.7: Selection process (6)
[8] M.A. Rodriguez and M.J. Egenhofer, “Determining
H.5.2.10: Prototyping ( 8)
C.4.6: Reliability, availability, serviceability ( 3 ) Semantic Similarity Among Entity Classes from Different
Ontologies,” IEEE Tramsactions on Knowledge and Data
Engineering, pp. 442-456, March/April 2003.
Although peer A and peer don’t have the exact same [9] A. Hliaoutakis P. Raftopoulou E. G.M. Petrakis, G.
interesting topic. But almost topic are in the category H. Varelas, “X-Similarity: Computing Semantic Similarity
We need to consider about this situation how to between Concepts from Different Ontologies,” Journal of
influence the similarity revising. Digital Information Management (JDIM), 4(4):233{238,
December 2006.
5. Conclusions and future work [10] P. Haase et al, “Bibster - a semantics-based
bibliographic peer-to-peer system,” In F. van Harmelen, S.
McIlraith, and D. Plexousakis, editors, Proceedings of the
This paper advance interest-based peer selection Third International Semantic Web Conference (ISWC2004),
mechanism in order to find out right peers by analyzing LNCS, pp. 122–136, Hiroshima, Japan, 2004. Springer.
user’s interest. System will collect other peer’s interest
information and analyze it. Next system will calculate
the user’s interest similarity with other peers. If system
find out the peer having similar interest with user,

2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing

Free-Form Annotation Tool for Collaboration

Han-Zhen Wu1 Stephen J.H. Yang2 Yu-Sheng Su3


1, 2, 3
Dept. of Computer Science and Information Engineering, National Central University,
Taiwan
han_zhen_wu@yahoo.com.tw1 jhyang@csie.ncu.edu.tw2 addison@csie.ncu.edu.tw3

annotation. Much information is not delivered on


Abstract hardcopy in recently year, but users still created their
annotations on the hardcopy. Because of the gap of
The people often establish taking notes on reading convenient between digital annotation and hardcopy
and browsing activities; hence annotation is being very annotation is mainly reason. Unlike annotate on
important in human life in anytime and anywhere. We hardcopy, we need to solve the problems about
developed a free-form annotation tool for annotation analysis, classification, anchor position and
collaboration that provides a convenient way to create reduction the gap between digital and hardcopy.
annotation easily. Our approach is characterized by We developed a free-form handwritten annotation
two design criteria, including: 1) digital Ink annotation: tool to help user annotating web document. This tool
help users to focus on annotated concept or specified not only can analyze annotation anchor position, keep
sentence on reading and browsing a web page; 2) the writer’s semantic and support user to create
IM-based annotation discussion board: help users to annotation more conveniently, but it also can share
discuss sharing digital Ink annotation based on each annotation, collaborative discussion and annotate web
one’s aspect in an efficient manner. document currently without any restrictions.
This paper is organized as follows: Section 2
Keywords: handwritten, free-form, annotation, anchor, briefly discusses the problems of our related works.
classification, collaboration Section 3 introduces the methods and ideas in our
system. Section 4 discusses the problems and results
1. Introduction for our tool. Section 5 is conclusion and feature work.

More and more documents are digitized then 2. Related work


exchanged on network in recently year. Existing
annotation tools do not replace the same actions on Brower-based annotation tool [1] is the tool
hardcopy. They support the user to annotate digital installed in user’s computer. It is usually annotating
document as hardcopy without print the document out. HTML document web document locally. There is a
The personalized annotation products have ATnotes problem about annotation anchor position. It is the
and Annotater. The XLibris called paper-like tool was most important problem in Brower-based annotation
developed. The Annotea called web-based annotation tool. Brower-based annotation tools have various
system was developed. When many users are working anchoring methods that like keyword matching,
together, the server does not reduce burden efficiency. recording column number or recording the coordinate
Existing annotation systems may be restricted the etc [2, 3]. Using browser-based tool is easily. It is
inconvenience of annotation creation to influence the require lower hardware requirement than server-based
users’ volition. and proxy-based tool for annotation collaboration.
Existing free-form annotation tools create Free-Form annotation tool is important and
annotations via handwritten device. It can help user to convenient. It is very convenient for creating
create and retrieve annotations. The reasons of annotation in most conditions. In order to annotate
developing digital handwritten annotation tool are that digital document as convenient as hardcopy. We need
the annotation sharing, searching and managing on to solve some problems which like analysis and
digital document. Those are better than hardcopy classification annotation, free-form annotation

anchoring, annotation interface and collaborating
discussion. Annotation analysis and classification are 3. Free-form annotation tool
very important problems for annotation system. For
example: 1) Anil K. Jain and Anoop M. Namboodiri In this section, we will present the proposed tool as
use the length and the curvature of ink stroke to shown in figure 1. Before user annotates the web
analysis text and non-text strokes [4]; 2) using the ink document, the web document metadata will be saving
stroke length distribution to analysis; 3) Shilman into document repository when user browses a web
analysis the time and space information of ink strokes document. It will get the web document data when user
[5]. creates annotation. Then it sends annotation to
Annotation anchor is a set of data which indicate recognize annotation type and anchoring module for
annotation position on the web document. It can be analysis its type and anchor position. Finally,
coordinates, text length or some web objects around annotation will be saving into annotation repository.
annotation in the document. Robust [6] define the User can retrieve annotation from annotation
detail anchor methods for text document. Callisto [7] repository if user wants to browse it.
refers his methods for their anchoring method, but
Callisto need the anchoring tool to complete annotation
creation manually. U-annotation [8] uses the Data
Object Model (DOM) [9] to find corresponding
anchoring object. DOM is a standard of web structure
presented by W3C. Our anchor method combines
DOM and DHTML. We use the start and end position
of annotation as anchor information. The advantages
are not only can anchor more precisely but it also can
save the annotation and document separated.
It will be more convenient for all users if
annotation tool provide user-friendly interface. The
Fluid ink [10] thought that users want to annotate as
convenient as traditional pen writing when they are
annotating. So they purposed an interface as button Figure.1 Free-form annotation tool architecture
free. It means that user can operate annotation without
traditional menu bottoms. Xlibris is a browsing 3.1 Web-document repository
annotation hardware device which provides full A4
size browsing and handwritten annotating. It’s When user has created an annotation that the tool
designed for as convenient as paper reading and will give an id number for it and the web document.
writing. It also can focus on annotations browsing. For Then save the data into web document repository. The
instance: Xlibris can centralize the annotations together id relates the document and annotations. User retrieves
reading. annotation and document by this id number when user
How to transmission annotation data efficiently and wants to see annotation.
currently is very important whether discuss digital dc:URL Web document address
document for collaboration or share annotations dc:creator Web document author
through network. We introduce the communication dc:keywords Document keyword
platform that we use that called Jabber/XMPP dc:identifier Document id
communication platform [11]”. It improves the socket
communication and support streaming XML function. Table1. Web document metadata
Furthermore there is other advantage about providing
the service of Peer-to-Peer communication. In order to 3.2 Annotation repository
collaborate annotating a web document for all users
instantly. There is a problem that how we achieve the Annotation repository outlines as follow table2.
synchronous for all participates’ annotations in the Annotation tool will save annotation information into
environment of peer-to-peer network. Because annotation metadata when user is creating annotation.
annotations in the same web document may relate each ks:annoID Annotation id
other by time order. Therefore how to notify the ks:annoDate Annotation created data
created annotation to all annotators, maintain the ks:annotator Annotation author
annotation sequence order and annotation correctness ks:tag Annotation tag
are very important.

ks:annoID          Annotation id
ks:annoDate        Annotation creation date
ks:annotator       Annotation author
ks:tag             Annotation tag
ks:annoColor       Annotation color
ks:annoStyle       Text annotation type (boldface, italic, underline)
ks:annoComment     Annotation content
ks:ink             Free-form annotation address
ks:recording       Voice annotation address
ks:isClassifiedTo  Annotation topic
ks:annoStart       Annotation start anchor position
ks:annoEnd         Annotation end anchor position

Table2. Annotation metadata
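To make the repository schema concrete, the sketch below expresses Tables 1 and 2 as TypeScript record types. Only the dc:/ks: field names come from the tables; the concrete types, the optional markers, and the link field are assumptions for illustration.

```typescript
// Illustrative record types mirroring Tables 1 and 2.
interface WebDocumentMetadata {
  identifier: string;   // dc:identifier - document id relating document and annotations
  url: string;          // dc:URL        - web document address
  creator?: string;     // dc:creator    - web document author
  keywords?: string[];  // dc:keywords   - document keywords
}

interface AnnotationMetadata {
  annoID: string;          // ks:annoID    - annotation id
  annoDate: string;        // ks:annoDate  - creation date
  annotator: string;       // ks:annotator - annotation author
  tag?: string;            // ks:tag
  annoColor?: string;      // ks:annoColor
  annoStyle?: "boldface" | "italic" | "underline"; // ks:annoStyle (text annotations)
  annoComment?: string;    // ks:annoComment - annotation content
  ink?: string;            // ks:ink         - address of the free-form ink data
  recording?: string;      // ks:recording   - address of a voice annotation
  isClassifiedTo?: string; // ks:isClassifiedTo - annotation topic
  annoStart: number;       // ks:annoStart - start anchor position
  annoEnd: number;         // ks:annoEnd   - end anchor position
  documentId: string;      // dc:identifier of the annotated document (assumed link field)
}
```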
3.3 Annotation creation

A web page is composed of frames, and each frame contains many objects. We use the object content as the anchor information for an annotation; the web page can easily be decomposed into objects following the W3C standard, and the information is saved into the web-document repository. When a free-form annotation is created, it is sent to the annotation-type recognition and anchoring mechanism for analysis. Finally, the user decides to save the annotation as a file to complete the annotation creation. As figure 2 shows, we introduce four examples of annotation creation. First, Tom annotates object 1, and John annotates his idea onto Tom's annotation. Second, Mary and Tom annotate object 2 in turn. Third, Tom, John, and Mary annotate the same object 4. Fourth, Tom annotates his comment onto John's and Mary's annotations.

Figure.2 An example of annotation creation

3.4 Recognized annotation type and anchoring mechanism

Annotation analysis is a very important step that affects free-form annotation classification. We combine the layout analysis and classification methods introduced in the related work; we expect that combining these methods improves the analysis correctness.

When the user draws, the tool records the drawing as ink strokes. After the user finishes drawing, the tool sends these ink strokes to analysis and classification. The analysis and classification mechanism contains several parts, as shown in figure 3. The ink strokes are first separated by layout analysis into text and non-text stroke groups, and the annotation type is then recognized for each non-text stroke group. A new annotation type can be added to the classification types if the user defines a new personal annotation type. So that annotations can be retrieved after classification, each annotation is anchored before the user finally saves it.

Figure.3 Recognized annotation type and anchoring mechanism

3.4.1 Annotation layout analysis

First, the ink strokes are divided into text and non-text stroke groups by their length and curvature; we define handwritten text as text strokes and everything else as non-text strokes. The non-text strokes are then grouped in two steps: 1) group the strokes by the time information recorded when each ink stroke is created, and 2) group the strokes by their spatial information. We assume that the user writes each annotation continuously, so it is reasonable to run analysis and classification immediately when each annotation is created.
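A minimal sketch of this two-step layout analysis is shown below in TypeScript. The stroke representation, the length and curvature thresholds, and the time and distance gaps are all assumptions; the paper does not give concrete values.

```typescript
interface Point { x: number; y: number; t: number; }   // t: timestamp in milliseconds
interface Stroke { points: Point[]; }

// Total path length of a stroke.
const strokeLength = (s: Stroke): number =>
  s.points.slice(1).reduce((len, p, i) =>
    len + Math.hypot(p.x - s.points[i].x, p.y - s.points[i].y), 0);

// Simple curvature proxy: path length divided by the endpoint-to-endpoint distance.
const curvature = (s: Stroke): number => {
  const a = s.points[0], b = s.points[s.points.length - 1];
  return strokeLength(s) / (Math.hypot(b.x - a.x, b.y - a.y) || 1);
};

// Text/non-text split by length and curvature (thresholds are assumed).
const isTextStroke = (s: Stroke): boolean =>
  strokeLength(s) < 120 && curvature(s) < 2.5;

// Group the non-text strokes, first by time gap, then by spatial gap.
function groupNonTextStrokes(strokes: Stroke[], maxGapMs = 1500, maxDistPx = 60): Stroke[][] {
  const groups: Stroke[][] = [];
  for (const s of strokes.filter(st => !isTextStroke(st))) {
    const group = groups[groups.length - 1];
    const prev = group ? group[group.length - 1] : undefined;
    const tail = prev ? prev.points[prev.points.length - 1] : undefined;
    const head = s.points[0];
    if (tail && head.t - tail.t < maxGapMs &&
        Math.hypot(head.x - tail.x, head.y - tail.y) < maxDistPx) {
      group!.push(s);          // continues the current annotation
    } else {
      groups.push([s]);        // starts a new annotation group
    }
  }
  return groups;
}
```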

3.4.2 Annotation classification

We recognize the annotation type as highlight, comment, symbol, underline, circle, margin bar, or multiple marks [12], as shown in figure 4.

Figure.4 An example of non-text annotation type

We design detectors to detect the annotation type of each non-text stroke group. Each stroke group is examined by every detector, and each detector returns a similarity value; the annotation type is decided by the detector that gives the highest value. A new personal annotation type can be created as a new detector provided by the user; the new detector is added to the existing detectors and improves the detection ability. The advantage of this design is that when the user encounters a new annotation type, we only need to add a detector, without modifying the program.
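The sketch below illustrates this pluggable-detector idea, reusing the Stroke type from the layout-analysis sketch above. The detector interface, the [0, 1] similarity scale, and the fallback type are assumptions; only the overall scheme (every detector scores a group, the highest score wins, new types are added as new detectors) follows the description above.

```typescript
type AnnotationType = "highlight" | "comment" | "symbol" | "underline"
  | "circle" | "margin bar" | "multiple marks" | string;  // user-defined types allowed

interface Detector {
  readonly type: AnnotationType;
  // Similarity of a non-text stroke group to this annotation type, in [0, 1] (assumed scale).
  similarity(group: Stroke[]): number;
}

class AnnotationClassifier {
  private readonly detectors: Detector[] = [];

  // A new personal annotation type is supported by registering one more detector;
  // nothing else in the program has to change.
  register(detector: Detector): void {
    this.detectors.push(detector);
  }

  // The detector reporting the highest similarity decides the type.
  classify(group: Stroke[]): AnnotationType {
    let bestType: AnnotationType = "comment";   // assumed fallback
    let bestScore = -Infinity;
    for (const d of this.detectors) {
      const score = d.similarity(group);
      if (score > bestScore) { bestScore = score; bestType = d.type; }
    }
    return bestType;
  }
}
```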

3.4.3 Annotation anchoring mechanism

We use the "isRange" containment relation defined by the W3C Range Object Model [13]. With this method we can separate the annotation from the web document, which saves bandwidth and storage when the user shares an annotation without the annotated document; the annotation can later be retrieved according to the anchor information when the user wants to browse it.

Figure 5 illustrates the isRange relation: there are two ranges, called Body and B, and Body isRange B if the range B is contained in the range Body.

Figure.5 Anchoring mechanism

When we select range B as in figure 5, the tool finds the node that completely contains range B (range Body in figure 5). It then compares characters in turn until the first character of range B is located inside range Body (the character length up to figure 5-A gives the start point), and the final character is found in the same way (the character length up to figure 5-C gives the end point). Finally, we store the element, the start point, and the end point as our anchor information.
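As a rough sketch of how such an anchor can be computed with the standard DOM Range API, the function below measures the selected text's start and end character offsets inside the enclosing element. The helper names and the anchor shape are ours; the paper's isRange containment test corresponds here to taking the selection's common ancestor.

```typescript
// Anchor = enclosing element plus start/end character offsets of the selection
// within that element's text content (error handling omitted).
interface RangeAnchor { elementId: string; annoStart: number; annoEnd: number; }

function computeAnchor(selection: Selection): RangeAnchor | null {
  if (selection.rangeCount === 0) return null;
  const range = selection.getRangeAt(0);            // the selected range ("B")
  const container = range.commonAncestorContainer;  // smallest node containing B ("Body")
  const element = container.nodeType === Node.ELEMENT_NODE
    ? (container as Element)
    : container.parentElement!;

  // A range from the start of the enclosing element to the selection start:
  // its text length is the start offset; adding the selection length gives the end.
  const prefix = document.createRange();
  prefix.selectNodeContents(element);
  prefix.setEnd(range.startContainer, range.startOffset);
  const annoStart = prefix.toString().length;
  const annoEnd = annoStart + range.toString().length;

  return { elementId: element.id, annoStart, annoEnd };
}
```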
3.4.3.1 Correcting anchor position

We need to correct the anchor position because the text range is not exactly what we see on the screen: the underlying text range is always larger than it appears (figure 6), and this difference easily anchors the annotation to the wrong range. In addition, we use a handwriting input device to create annotations, so the selection is less precise than a mouse selection.

Figure.6 Text range on (A) hardcopy or screen (B) digital document

Figure 7 shows some examples of circle-type annotations; an annotation may cover several ranges that are not easy to see.

Figure.7 Possible annotation forms

Figure 8 gives another example of anchor correction. In figure 8-B the user's drawing touches three ranges; the range marked in red in figure 8-B is one the user does not want to involve, and figure 8-C shows the result after correction.

Figure.8 Correcting annotation anchor range

3.5 Annotation retrieval and management

Because we use the DOM structure to save annotations separately from the document, browsing annotations becomes easier. The tool retrieves the annotation record from the database when the user wants to browse an annotation; combining the three pieces of anchor information with the document address, the tool can show the annotation on the screen automatically in Internet Explorer.

In off-line mode the user annotates web documents locally, and the annotations are shown on the annotation management interface; the user can retrieve annotations with a simple click on the retrieval button. In the on-line, collaborative mode, the user must request transmission of the annotation data from the other users in order to see their annotations.
We can search for and retrieve the annotations we are interested in when we want to browse another owner's annotations. The user first chooses the other user's ID on the IM platform and then opens the search window, keys in the information to search for, and sends the search request; the tool finds and lists the matching annotations. The user chooses the annotations to browse, and the tool asks the annotation owner to transmit the annotation data. After the transmission completes, the user can browse the annotation.

Figure.9 An example of annotation search and retrieval

3.6 IM-based annotation discussion board

To let users annotate the same web document together, we use the design introduced above for collaboration. The group chat mode of XMPP works like a general on-line chat room: the Jabber server creates a chat room before the discussion starts, all participants' messages are sent to the chat room and broadcast to the others, and all messages are recorded so that new participants can read them.

We use the XMPP group chat mode for our multi-user annotation system. The room owner is responsible for assigning the ordering, which solves the problem of annotation order. Annotation data are recorded as XML and transmitted as a string stream when broadcast to all participants.

In the example of figure 10, the IM server broadcasts the annotation information to every participant after Albert creates an annotation. If Chris wants to read the annotation created by Albert, he simply selects it and browses it, and the tool asks Albert to send the annotation to Chris. When Chris answers the question with a free-form annotation, Albert obtains it in the same way. This achieves the collaboration we intend.

Figure.10 An example of IM-based annotation discussion board
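The fragment below sketches what broadcasting a new annotation to the group chat room could look like. The stanza shape follows XMPP groupchat messages, but the annotation namespace, the field names, and the sendStanza client interface are placeholders for whatever Jabber/XMPP library the tool actually uses.

```typescript
interface XmppClient { sendStanza(xml: string): void; }   // stand-in for a real XMPP client

// Broadcast lightweight annotation metadata to the room; the ink data itself is
// transferred peer-to-peer on demand, as described above.
function broadcastAnnotation(
  client: XmppClient, roomJid: string, fromJid: string,
  anno: { annoID: string; documentId: string; annoStart: number; annoEnd: number },
): void {
  const esc = (s: string) =>
    s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");
  const stanza =
    `<message to="${esc(roomJid)}" from="${esc(fromJid)}" type="groupchat">` +
      `<annotation xmlns="urn:example:annotation" id="${esc(anno.annoID)}" ` +
        `doc="${esc(anno.documentId)}" start="${anno.annoStart}" end="${anno.annoEnd}"/>` +
    `</message>`;
  client.sendStanza(stanza);
}
```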

4. Discussion

We consider annotation analysis and classification the most important part of the system: the following steps are useless if the analysis and classification are wrong at the start. We believe correctness can be improved by combining the methods discussed in the related work, and we let users provide their own training data as new detectors to improve the classification ability. We also plan to add a feedback function that lets the user fix misjudgments, to make the design more complete.

We use the Range Object Model presented by W3C as our anchoring method and record the start and end positions as the anchor information. We believe this anchors annotations to web documents more precisely. The disadvantage is that the method cannot anchor content generated by scripts, because we cannot extract the context information with our method.

For now, our design still needs the user to decide or input some options. Many annotation tools have the same limitation, but we want to reduce any action that disrupts annotating, so that users can annotate a document in a way similar to traditional annotation.

Users in the same group can communicate and discuss through our annotation tool over the IM platform. Each user keeps an annotation database locally and shares annotations with the others through the peer-to-peer network. The advantages are that this avoids the cost of maintaining a server, separates the annotation data from the tool, and improves flexibility; only a small amount of data needs to be transmitted to obtain annotation information from another user, and the user's privacy is maintained.

5. Conclusion

We developed a free-form handwritten annotation tool that supports annotating web documents quickly, analyzes the anchor position, and preserves the original meaning of free-form annotations in web documents. Moreover, users can share annotations, collaborate in discussion, and annotate web documents without undue restrictions.

The advantages of the free-form handwriting annotation tool are that it preserves the personal meaning of annotations, supports quick and habitual writing, and makes annotation sharing and searching convenient.

There may be better methods to explore for the questions we raised earlier, and we have identified further questions that can help reduce the gap between traditional and digital annotation: how to render digital annotations exactly as they would appear on hardcopy, how to make annotating documents more convenient, and how to use the tool on different platforms. These questions, together with the problems introduced above, are our future research targets.

6. Acknowledgement

This work is supported by the National Science Council, Taiwan, under grants NSC95-2520-S008-006-MY3 and NSC96-2628-S008-008-MY3.

7. References

[1] Ng S. T. Chong, "Annotation-based Web Communications Systems: A Review", United Nations University, Tech. Rep. CS-3408, 2003.
[2] S.J.H. Yang, V.M.Z. Du, N.W.Y. Shao, and I. Chen, "Applying Personalized Annotation Mechanism to e-Documentation", Proceedings of the IEEE International Conference on E-Commerce Technology for Dynamic E-Business, 2004, pp. 142-145.
[3] Po-Kuan Chiang and Heng-Li Yang, "Annotation Tool in Electronic Documents", International Conference of Digital Technology and Innovation Management, 2006, pp. 418-427.
[4] A.K. Jain, A.M. Namboodiri, and J. Subrahmonia, "Structure in On-line Documents", Proceedings of the Sixth International Conference on Document Analysis and Recognition, 2001, pp. 844-848.
[5] M. Shilman, Z. Wei, S. Raghupathy, P. Simard, and D. Jones, "Discerning Structure from Freeform Handwritten Notes", Proceedings of the Seventh International Conference on Document Analysis and Recognition, 2003, pp. 60-65.
[6] Thomas A. Phelps and Robert Wilensky, "Robust Intra-document Locations", Proceedings of the 9th International World Wide Web Conference, Computer Networks, 2000, pp. 105-118.
[7] David Bargeron and Tomer Moscovich, "Reflowing Digital Ink Annotations", Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2003, pp. 385-393.
[8] M.A. Chatti, T. Sodhi, M. Specht, R. Klamma, and R. Klemke, "u-Annotate: An Application for User-Driven Freeform Digital Ink Annotation of E-Learning Content", Proceedings of the Sixth International Conference on Advanced Learning Technologies, 2006, pp. 1039-1043.
[9] Philippe Le Hégaret and the W3C DOM Interest Group, "W3C Document Object Model (DOM)", March 2001, http://www.w3.org/DOM/
[10] Robert Zeleznik and Timothy Miller, "Fluid Inking: Augmenting the Medium of Free-form Inking with Gestures", Proceedings of Graphics Interface, 2006, pp. 155-162.
[11] Peter Saint-Andre, "Streaming XML with Jabber/XMPP", IEEE Internet Computing, vol. 9, no. 5, 2005, pp. 82-89.
[12] F.M. Shipman, M.N. Price, C.C. Marshall, G. Golovchinsky, and B.N. Schilit, "Identifying Useful Passages in Documents Based on Annotation Patterns", Proceedings of ECDL, 2003, pp. 101-112.
[13] Peter Sharpe, Vidur Apparao, and Lauren Wood, "Document Object Model Range", November 2000, http://www.w3.org/TR/DOM-Level-2-Traversal-Range/ranges.html


Hands-On Training for Chemistry Laboratory


in a Ubiquitous Computing Environment

Mune-aki SAKAMOTO
Department of Applied Chemistry
College of Bioscience and Chemistry
Kanazawa Institute of Technology, Japan
mune-aki@neptune.kanazawa-it.ac.jp

Masakatsu MATSUISHI
Academic Foundations Programs
Practical Engineering Education Program
Kanazawa Institute of Technology, Japan
matsuishi@neptune.kanazawa-it.ac.jp

Abstract

To build a ubiquitous and interactive practical-learning environment for science laboratories in advanced education, educational materials will consist of a large amount of electronic data. As the information in the materials increases, the volume of data becomes huge and the bandwidth of the network is strained. To advance a practical approach, we have to clarify the conditions under which data size and the educational effect of the materials are compatible. In this paper we shed light on the minimum requirements for e-Learning in a chemistry laboratory. The relationship between degree of understanding and the quality of video materials was investigated with a sample of 1023 first-year university students in 8 classes in Japan. Although the students could discriminate the frame rate of a video material, they recognized handling and peripheral operations in the chemistry laboratory at all rates.

1. Introduction

In recent years, information and communication technologies have evolved and expanded. As ICT progresses, demand for e-Learning, e.g. remote learning, ubiquitous learning, and Web learning, has increased. Web- and video-based educational materials for fundamental subjects, such as mathematics and language learning, are provided and widely used [1, 2]. In contrast, there are few products for developing experimental skills in chemistry, biotechnology, and physics [3], despite heavy demand. Acquiring skills in these experimental fields requires a large amount of information, so the size of the electronic educational materials is much greater than for fundamental subjects. However, the bandwidth of the network, as well as of a ubiquitous network, has a physical limit. To advance a practical approach in the science laboratory, we have to balance data size against the educational effect of the materials. In this paper, we made a short video material for a chemistry laboratory and investigated the relationship between degree of understanding and the quality of the video material.

2. Experimental

2.1. Video materials

The video material was part of a melting point analysis in the chemistry laboratory. A scene from the video material is shown in figure 1.

Figure 1. Scene from video material: A caption in Japanese means "Make sure the reagents are placed at the end of the capillary"

Video sources were recorded with a DV-CAM (Sony), edited and captioned with DV-Raptor (Canopus), and then exported to an uncompressed video file.
The frame rate of the file was adjusted to 10, 15, and 30 fps using AviUtil. The adjusted files were converted to a compressed video format with Windows Media Encoder 9 and provided for the surveys. The properties of the video materials are summarized in table 1.

Table 1. Video properties

frame rate / fps      10       15       30
resolution            640 × 480 (4:3)
colour depth / bit    24
film length / sec     35
format                WMV9 (WMV3)
bit rate / kb/sec     232.6    299.7    1900
file size / MB        1.003    1.297    8.683
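As a quick consistency check on table 1, the file size should be roughly the bit rate times the film length (a back-of-the-envelope sketch; container overhead is ignored, kb is read as kilobits, and the file size column as megabytes):

```typescript
// Expected file size in megabytes from bit rate (kbit/s) and length (s).
const expectedSizeMB = (bitrateKbps: number, lengthSec: number): number =>
  (bitrateKbps * lengthSec) / 8 / 1000;

console.log(expectedSizeMB(232.6, 35).toFixed(2)); // ~1.02 MB vs 1.003 MB in table 1
console.log(expectedSizeMB(299.7, 35).toFixed(2)); // ~1.31 MB vs 1.297 MB
console.log(expectedSizeMB(1900, 35).toFixed(2));  // ~8.31 MB vs 8.683 MB
```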

2.2. Questionnaire survey

A questionnaire survey was carried out in a lecture on fundamental experiments, using computer-scored answer sheets, with first-year university students. The sheet consists of questions with five-grade evaluations: positive, slightly positive, neutral, slightly negative, and negative. Before the questionnaires were filled in, the fps-adjusted video materials were shown to approximately 200 students per classroom on 29-inch CRT monitors; the mean distance from the students was 2 meters, at 45 degrees above eye level. In addition, a few groups of students received an oral explanation while the video was shown, to study the effect of interactive communication.

In the aggregate analysis, sheets containing unanswered questions or with every answer given the same score were omitted. After this post-processing, the answers were classified, counted, and summarized into three grades: positive, neutral, and negative. The numbers of distributed questionnaires and effective answers were 1211 and 1023, respectively.
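A small sketch of this post-processing step (filtering invalid sheets and collapsing the five-grade scale to three) is shown below; the numeric coding of the five grades is an assumption.

```typescript
// 5 = positive, 4 = slightly positive, 3 = neutral, 2 = slightly negative, 1 = negative
type Sheet = (1 | 2 | 3 | 4 | 5 | null)[];   // null = unanswered question

const isValid = (sheet: Sheet): boolean =>
  sheet.every(a => a !== null) &&            // no unanswered questions
  new Set(sheet).size > 1;                   // not every answer given the same score

const toThreeGrade = (a: 1 | 2 | 3 | 4 | 5): "positive" | "neutral" | "negative" =>
  a >= 4 ? "positive" : a === 3 ? "neutral" : "negative";

function summarize(sheets: Sheet[]) {
  const counts = { positive: 0, neutral: 0, negative: 0 };
  for (const sheet of sheets.filter(isValid)) {
    for (const answer of sheet) counts[toThreeGrade(answer as 1 | 2 | 3 | 4 | 5)]++;
  }
  return counts;
}
```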

3. Results and discussions

3.1. Recognition of video quality

To investigate the recognition of video quality, we asked the students "Are you satisfied with the video quality for learning chemistry experiments?".

Figure 2. Recognition of video quality: Positive, neutral and negative answers are indicated as hatched, plain and dotted regions, respectively. With oral explanation, an asterisk is added to the numbers.

As seen in figure 2, half of the students answered positively at 30 fps. At 10 and 15 fps, the positive answers decreased slightly to approximately 40%, but the answers can be regarded as nearly the same. There is also no significant difference between the groups with and without oral explanation. These results indicate that the students discriminate between 30 fps and the other rates, but do not discern the lower frame rates from each other.

3.2. Experimental precautions

In general, students might not anticipate danger in the chemistry laboratory. Heightening safety awareness in the chemistry laboratory encourages students to pay careful attention to health and safety issues in all aspects of their lives as individuals and social beings. To clarify the effect of interactive communication with video materials, we gave an oral explanation of the cautions for handling the apparatus and asked the students "How do you recognize a danger in this procedure?".

Figure 3. Experimental precautions
Figure 3 indicates that there is no difference in the recognition ratio at any frame rate. However, the ratio doubled with oral explanation. The results suggest that cautions in the chemistry laboratory are not identified through the video materials alone.

3.3. Safety handling of instruments

In the chemistry laboratory, students must master the usage and operation of instruments, especially when, for lack of information on the properties of substances, the instruments and substances might be used improperly. Figure 4 shows the safe method for filling a fine capillary with a powder reagent.

Figure 4. Scene from video material: A caption shows "Stick a capillary into a powder of sample"

To estimate the learning effect of the video materials, we asked "How do you recognize handling and operations from the video material?". The results are summarized in figure 5.

Figure 5. Safety handling of instruments

Regardless of frame rate, almost the same answers were obtained at all rates, in contrast to the recognition of quality. In addition, an oral explanation did not make a difference in the results. This shows that the frame rate of the video materials and additional explanations might not enhance the recognition of operations and substances in the chemistry laboratory.

4. Conclusions

We made frame-rate-controlled video materials for the chemistry laboratory and carried out a questionnaire survey on the relationship between degree of understanding and the frame rate of the video materials. From the survey results with over a thousand university students, we analyzed and discussed the quality of the video materials and the effectiveness of oral explanation in the chemistry laboratory. Although the students could discriminate the frame rate of the video materials, they recognized handling and peripheral operations at all rates. Meanwhile, an oral explanation did not enhance recognition of operations, but it did help students identify precautions. In other words, video materials for the chemistry laboratory are effective even at low frame rates and can save network bandwidth.

5. Future prospect

In response to the results of the survey, we plan to develop a distance learning system for general chemistry laboratories between our main campus and a satellite campus. The system will be established and tested over a T1 hookup and a PHS network.

References

[1] Rita Porcelli, Donata Francescato, and Paolo Renzi. Evaluation of the efficacy of collaborative learning in face-to-face and computer-supported university contexts. Computers in Human Behavior, 22:163-176, January 2006.
[2] Meilun Shih, Jui Feng, and Chin-Chung Tsai. Research and trends in the field of e-learning. Computers & Education, in press, 2007.
[3] E. Schaer, C. Roizard, N. Christmann, and A. Lemaitre. Development and utilization of an e-learning course on heat exchangers at ENSIC. Education for Chemical Engineers, 1(1):82-89, 2006.

A Progress Report and a Proposal:
Interactivity in Ubiquitous Learning Enhanced by Virtual Tutors in
e-Learning Contents

Toshiyuki YAMAMOTO, Ph.D.
Department of Media Informatics
Kanazawa Institute of Technology, Japan
caitosh@neptune.kanazawa-it.ac.jp

Ryo MIYASHITA, Graduate Student
Graduate Program in Information and Computer Engineering
Kanazawa Institute of Technology, Japan
miyashita@venus.kanazawa-it.ac.jp

Abstract

It is proposed in this session that the interactivity necessary in ubiquitous learning can be enhanced by a virtual tutor that mediates between the learner and the e-Learning contents. Arousing students' intellectual curiosity and raising their motivation to learn are the two great goals for e-Learning content developers. It is supposed that the interaction in the communication between the teacher and his/her students in a regular classroom can be maintained in e-Learning with the help of the Virtual Tutor, which appears on the learning contents in a timely fashion to give learners advice, encouragement, reminders, and the like. The session includes discussions on developing such a system using the most up-to-date technology, as well as demonstrations of a system that has been experimented with at the Department of Media Informatics. It is shown that the learning environment in which human interactivity is essential can be reconstructed in terms of the proposed Virtual Tutor. This paper presentation includes demonstrations of QED as well as a discussion of the importance of interactivity in ubiquitous e-Learning.

Key Words: Ubiquitous Learning, e-Learning, e-Learning Contents, Interactivity, Virtual Tutor, QED, TVML, Computer Graphics

1. Introduction

The role of interactivity in e-Learning must be identified in order to achieve high learning effectiveness. It cannot be denied that in a regular classroom, the role of interactivity is best described as communication between the teacher and his/her students, which in turn arouses the students' intellectual curiosity and learning motivation. In this session, the authors emphasize that in the e-Learning environment, interaction between the teacher and his/her students can be maintained with the help of new technology, so that students who have not been successful in learning in an e-Learning environment can also achieve some level of learning accomplishment.

The term "interactivity" has been used in a variety of ways in the field of media technology. First of all, the term "interactivity" is redefined to clarify the intended meaning of interactivity in learning.

2. Defining Interactivity

In order to narrow down the intended definition of "interactivity", let us limit ourselves to the school environment where some type of instruction is given between the teacher and his/her students. Also, for the sake of theory building, let us view "interactivity" from the perspective of the learner instead of the producer of instructional materials, whether designers or teachers. Furthermore, in order to limit our field of research, let us define the environment for learning only through the standard networked multimedia computer, which has the capability to generate digitized sounds, display digitized videos, capture images through cameras, and record the learner's voice, as well as the conventional input devices such as a mouse and a keyboard, as commonly observed in computer labs in schools.
Between 1890 and 1920, when the wave of industrialization impacted the field of education, the idea of creating teaching machines was developed so that students could receive uniform instruction and develop the same quality of learning as the output of education. Soon afterwards, however, curriculum developers realized the mistake and shifted attention to the individual progress of students. During the last ten to fifteen years, the use of computers and other technical devices has been promoted as a way to augment learning, and one prospect is that more and more computers will be used in schools for this purpose. Our position here is that instructional materials are for augmenting learning in schools, where teachers play the major role in students' learning.

It should also be kept in mind that the use of computers in schools in the last few decades is still very new and immature. Even in higher education, Geoghegan (1994) estimates that of all the educational technologies implemented, no more than five percent of instructors use computers as anything more than high-tech substitutes for the blackboard and the overhead projector. He claims that identifying and extending creative use of educational courseware should be central to studies dealing with implementation. Geoghegan's point is correct in the sense that teachers are still trying to identify the status and function of computers in classrooms. Therefore, it is helpful to consider the important aspect of "interactivity" in the usage of computers in classrooms.

To refine our working definition of "interactivity", the origin of learning needs to be examined first. Dewey (1913) may be a good starting point. Dewey defines "interest" as an activity engaging a person in a whole-hearted way. Interest does not just sit in the human mind; instead it operates outward in order to link internal power to the outside world. When this process is accomplished, the individual experiences a sense of satisfaction. In other words, interest is a trigger of subject-matter and will produce a certain result. Dewey considers this concept of activity a central principle of education. Since educative interests vary indefinitely with age, with individual native endowments, with prior experience, and with social opportunities, it is not possible to list them all. In other words, "interest" is realized in the form of interactivity in learning.

Dewey categorizes four types of interests: instinctive interest, sensorimotor interest, distinctive intellectual interest, and social interest, which are summarized respectively below.

2.1. Instinctive Interest

Instinctive interest is more or less learning to survive, which humans share with other animals. In addition, humans have interests in learning things other than those related to survival. Dewey calls this inherent instinctive ability of interest "a love of learning." This interest involves physical, mental, and intellectual elements.

Developmentally, children are "interested" in the organic senses such as sight, touch, sound, taste, etc., and in linking their functions with the brain. Children acquire this type of interest before the preschool period. At this stage, children also learn how to coordinate more than one organic sense.

This instinctive interest does not directly concern us because it is acquired before schooling starts.

2.2. Sensorimotor Interest

Sensorimotor interest involves physical activity, where children are interested in command of the sensorimotor apparatus of the body. Children show interest in their command of eye-hand coordination at a higher level, and eventually in control over external objects by means of tools, in other words, control of applying one material to another. Interest in symbol manipulation is another example.

Besides sports that use a ball, such as soccer or tennis, computers are applicable in amplifying this type of interest. Various input systems, such as a mouse, a keyboard, a joystick, a touch pad, and a tablet and pen, encourage and reinforce the development of sensorimotor learning because they require not only eye-hand coordination but also the command of external tools. The processor and output devices, such as the computer screen, give immediate feedback on the activity.

Computers can also provide children with opportunities to enhance symbolic interest, which is part of sensorimotor interest. Incorrect spelling, for example, receives immediate feedback as misspelled words are highlighted. Even more advanced symbolic interest can be enhanced by visualization software such as Mathematica or Maple, which displays mathematical formulas as 3-D images. These interests of children may be satisfied by human teachers, but in a classroom situation where there are often more than 20 students, computers are more efficient in performing symbol manipulation exercises, giving individualized immediate feedback on errors, and providing visualization.
It follows that sensorimotor interest is a significant area where computers can play an active role in education. That is, with computers, a one-on-one tutoring environment can be achieved easily in a classroom.

2.3. Distinctive Intellectual Interest

The distinctive intellectual interest is concerned with the interest in discovering or finding out what happens under given circumstances. This interest is realized in activities such as planning ahead, taking notice of what happens, and relating the process to its result. Because the fundamental principle of science is connected with the relation of cause and effect, this type of interest must be enhanced and amplified in schools once children show it. This interest requires higher-order thinking, primarily because it involves more abstract concepts.

As in the area of sensorimotor interest, computers can be far more effective than human teachers alone in this area. The technology of virtual reality can create situations in which students take individual flight or driving lessons, or even learn how to operate a nuclear power plant or how to conduct a surgical operation. Life-threatening experiences, such as learning to drive on a hazardous road or to fly an aircraft in bad weather, cannot be conducted in a real classroom. Furthermore, dangerous chemistry or physics experiments can also be performed in a virtual reality lab. Students can also learn how the Universe develops and how cancer or AIDS affects the human body, viewed from various angles with computer visualization.

In order to satisfy children's distinctive intellectual interest and provide stimulus in the form of visualization of complex structures or variables, computers can do a better job than human teachers alone. There is no comparison.

2.4. Social Interest

Social interest is defined as interest in persons. This is a strong special interest, in which a child's intense concern with other persons is involved. Children are constantly dependent upon others for support and guidance. Social interest is closely related to our instinctive human nature to pay attention to people and to wish to be intimately bonded to them. Distinctive social instincts such as sympathy, imitation, love, and desire for approval are realized as activities of social interest. This social interest of children is deeply intertwined with that of other children, and with the interests of their teachers and family members, as well as their collective hopes, desires, plans, and experiences.

Due to its nature, social interest, together with these hopes, desires, plans, and experiences, cannot be shared with artificial intelligence; artificial intelligence cannot, like a human, share the cultural heritage and knowledge of a democratic society. Social interest must be fulfilled by human teachers, because only human teachers can share this social interest with children. There is no room for computers to replace human teachers in this respect.

However, it should be noted that the Internet, listservs, forum discussion boards, on-line chat rooms, and email can be good tools to reinforce and satisfy children's social interest. They bring efficiency to communication between children and teachers; here the computer serves as a medium for communication between children and teachers.

3. Proposal

With the advance of technology, social interest, which was once thought to belong solely to human teachers, could be addressed by technologies such as QED or TVML, in which computer graphics figures simulate human teachers. In other words, computer graphics figures perform as teachers in the virtual learning situation. (See Figure 1, Figure 2, and Figure 3.)

Figure 1. Virtual Tutor

QED and TVML are technologies that display a 3-D CG figure on a browser page so that it interacts with a learner, as shown in Figure 2.
TVML is a similar technology in which a content author can choose a character to perform and voice over a browser page according to text-based programming. The sophistication of these technologies is currently being explored, and as greater reactions, inflections, and variable feedback become available, social interest is deepened.

Figure 2. QED with e-Learning Contents

Figure 3. TVML
From: http://www.nhk.or.jp/strl/tvml/japanese/mini/

Edgar Dale claims that after two weeks humans tend to remember 10% of what we read, 20% of what we heard, 30% of what we saw, 50% of what we heard and saw, 70% of what we said, and 90% of what we both said and did. The category of what we both said and did includes doing the real thing as well as simulating the real experience. It follows that experiencing in the virtual world is as effective as experiencing in the real world.

Figure 4: Cone of Learning
From: http://www.cals.ncsu.edu/agexed/sae/ppt1/

Social interest, interest in persons, can be achieved through CG figures on the screen. In this virtual world, children's concern with others is maintained. Children can depend on CG figures on the computer screen for support and guidance as long as the CG figures behave similarly to human teachers. In this environment, the instinctive human nature to pay attention to people and to wish to be intimately bonded with them is also maintained.

As long as children are able to anthropomorphize CG figures and treat them as members of the social world in which they live, social interest can be maintained, and perhaps even deepened.

Figure 5. e-Learning Home Page

4. Conclusion

In this paper, clear boundaries are established between what computers can do in education and what teachers can do in education. Four types of interests according to Dewey are explained: instinctive interest, sensorimotor interest, distinctive intellectual interest, and social interest.

In Dewey's argument, because social interest is rooted deeply in other humans, human-to-human interaction is created only by the human teacher. However, it is proposed here that 3-D CG figures can replace the human teacher as long as the learner personifies the 3-D CG figures and treats them as members of his or her social group. This paper presentation includes demonstrations of QED.

References

Carlson, P. (1999). Virtual Education Manifesto. (Evaluation draft before printing). Hypermedia Solutions Limited.

Dale, E. (1969). Audio-Visual Methods in Teaching. New York: Dryden.

Dewey, J. (1913). Interest and Effort in Education. Boston: Houghton Mifflin Company.

Doll, R. C. (1996). Curriculum Improvement. Boston: Allyn and Bacon.

Flinders, D. J. & Thornton, S. J. (1997). The Curriculum Studies Reader. New York: Routledge.

Geoghegan, W. H. (1994). Stuck at the barricades: Can information technology really enter the mainstream of teaching and learning? In AAHE Bulletin, September (pp. 13-16).

Jacobson, R. (1999). Information Design. Massachusetts Institute of Technology.

Thorndike, E. L. (1913). Educational Psychology. In The psychology of learning (Vol. 2). New York: Teachers College Press.

Tobin, K. and Dawson, G. (1992). Constraints to curriculum reform: Teachers and the myths of schooling. In Educational Technology Research and Development, 40(1), (pp. 81-92).


Collaborative Interpretative Service Assisted Design System Based on


Hierarchical Case Based Approach

Huan-Yu Lin1, Shian-Shyong Tseng*2, Jui-Feng Weng3, Jun-Ming Su4


Department of Computer Science National Chiao Tung University, ROC
Department of Information Science and Applications, Asia University, ROC2
huan.cis89@nctu.edu.tw1, sstseng@cis.nctu.edu.tw2, roy@cis.nctu.edu.tw3,
jmsu@csie.nctu.edu.tw4

Abstract

Museums are important learning environments that help learners learn directly from objects and living things, and their interpretative services play an important role in helping learners learn more about the exhibitions. However, the development of intelligent interpretative services is costly, time consuming, and requires many kinds of domain knowledge. How to provide a platform that helps experts of different domains work collaboratively to reduce the construction cost of designing an intelligent interpretative service for new requirements is therefore an important issue. Thus, we propose a Collaborative Interpretative Service Assisted Design System (CISAD), containing a Requirement Integration Process to assist designers in collaboratively determining the requirements of the new service, and an Intelligent Query Processor to reuse previous successful applications from coarse-grained to fine-grained, improving the reliability and reducing the construction cost of the solution application. Finally, we show an example describing the construction process of an interpretative service for elderly people with CISAD.

1. Introduction

Museums are important learning environments that help learners learn directly from objects and living things, and their interpretative services play an important role in helping learners learn more about the exhibitions. As shown in Figure 1, the mechanisms of interpretative services can be classified into static services and dynamic services according to when the contents to be delivered are determined. Dynamic learning services can be classified into adaptive services or interactive services according to their contents, and adaptive services can be further classified into personalized learning services and context-aware learning services according to the attributes considered for content selection.

Figure 1: The classification of interpretative services performed by the applications (static vs. dynamic, adaptive vs. interactive, personalized vs. context-aware).
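For illustration, the classification of Figure 1 can be written down as a small discriminated union; the type and field names below are ours, not part of the paper.

```typescript
// Static: contents fixed in advance. Dynamic: contents determined at execution time.
type InterpretativeService = StaticService | DynamicService;

interface StaticService { kind: "static"; contents: string[]; }

type DynamicService = AdaptiveService | InteractiveService;

// Adaptive: contents selected from user or context attributes.
type AdaptiveService = PersonalizedService | ContextAwareService;

interface PersonalizedService {
  kind: "personalized";
  userInfo: { profile: string; preference: string };   // the user's information drives selection
}

interface ContextAwareService {
  kind: "context-aware";
  context: { location: string; time: string };         // context information drives selection
}

// Interactive: contents produced through interaction with the visitor.
interface InteractiveService { kind: "interactive"; }
```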
Thus, how to provide a platform to help experts of

978-0-7695-3158-8/08 $25.00 © 2008 IEEE 569


DOI 10.1109/SUTC.2008.76
Thus, how to provide a platform that helps experts of different domains work collaboratively, so as to reduce the construction cost of designing an intelligent interpretative service for new requirements, is an important issue.

In the interpretative service domain, most applications include several basic functions, such as event triggering, data processing, and content presentation. The service-oriented architecture (SOA) is therefore appropriate for reducing cost: interpretative service applications are constructed as a set of web services to facilitate partial application reuse and composition.

To improve the reliability of the developed application, when a new application needs to be constructed, previous successful applications with the same behaviors are reused first; if there are no similar applications, experienced designers usually partially reuse previous applications and compose them into a new one. Therefore, we define hierarchical service structures in a service ontology, which represents different kinds of interpretative service applications containing different kinds of semantic tasks and the related real web services, to support service reuse from coarse-grained to fine-grained. According to the service ontology, successful interpretation applications designed by experts of different domains can be retained in a hierarchical case base. When a new requirement is given, these experts can collaboratively define the required features in the parts they are familiar with, based on the service ontology, and the integrated requirements can be used to automatically generate the solution service composition from the hierarchical case base.

Accordingly, we propose a Collaborative Interpretative Service Assisted Design System (CISAD) containing a hierarchical case base to manage the hierarchical cases. When designers want to design a new application, the Requirement Integration Process lets them collaboratively design the features of the new service, and the Intelligent Query Processor then generates queries at different levels to perform coarse-grained and fine-grained case reuse. If the retrieved service composition satisfies the requirements, the solution case is returned to the designers and, after revision, it can be retained in the hierarchical case base to enrich the cases.

Finally, simulations of different interpretative service scenarios have been carried out, and the results show the feasibility of CISAD and that CISAD can support collaborative design and hierarchical service case reuse.

2. Related Work

2.1. Service Oriented Architecture

In recent years, the Service Oriented Architecture (SOA) has been accepted as a successful solution for reaching system interoperability through popular standards. Standards such as WSDL [4] and SOAP [5] are the most popular web service protocols, providing a standard way to carry structured messages over the HTTP protocol. One of the benefits of SOA is the ability to compose applications, processes, or complex services from less complex services. This service composition activity inspires new designs and architectural styles whose general architectural elements consist of processing components, connectors, and data [6]. Service composition in SOA supports loosely coupled, business-aligned, platform-independent, and network-based services to enable the flexibility and reusability of system components [7]. In [8], the concept of service granularity, which refers to the level of service functionality, is used for reuse: a coarse-grained service may have more business value in meeting the business process, while a fine-grained service may have less business value but be more valuable for system experts. Although SOA provides more reusability through its standard architecture, the trade-off between system constraints and business value remains a difficult issue.

2.2. Interpretative Applications

As shown in Table 1, several learning applications [9-17] can be categorized by service type into static services, context-aware services, personalized services, and interactive services.

Table 1: Different types of interpretative applications

Service Type            Example
Static service          Learning Object repository; Item bank
Context-aware service   Adaptive mobile museum guide [18]; Context-aware tour guide [19]; Conference assistant [20]
Personalized service    Requirement satisfied learning [17]
Interactive service     Japanese polite teaching [9]; Knowledge awareness map [14]; P2P content access and multimedia group discussion [15]
2.3. e-Services of Museum

In research on mobile museum services, many interactive and wireless technologies have been used to provide services such as personalized guiding, guiding by sharing, experience remembering, guiding on demand, and guiding in activities. With the development of computer technology, much research pays more attention to supporting handicapped visitors with new technologies, and service designs that consider children and the elderly [3, 21] also become more and more important. Thus, supporting the design of services for handicapped visitors in public places such as museums is an interesting and challenging issue. We aim to support the design of interpretative services in an SOA environment. However, as mentioned in [22], how to provide case reuse that meets the business process requirements at the application level and the environment constraints at the service unit level is the challenging issue.

3. Hierarchical Case Representation for Museum Interpretative Service

As described above, e-services are growing rapidly in the museum domain. Assume that in the museum, RFID sensors have been deployed in front of the exhibitions, each visitor is given an RFID card, and the service designer wants to develop an adaptive interpretative service for visually handicapped visitors. In the Service Oriented Architecture (SOA), to provide the adaptive service for such visitors, our service scenario is that the sensor service senses the personal profile, the appropriate content is retrieved from the content repository service, and the content is presented by the audio player service, as shown in Figure 2.

Figure 2: The case reuse of the interpretative service (top: a service scenario solution composed of SID1: Sensor service, SID2: Content repository service, and SID3: Audio Player service; bottom: the designer's requirement features form a query to the case base, which suggests cases and the required services for reuse)

However, the designing and programming tasks for museum interpretative services are usually costly and time consuming. Since an interpretative service can be described by cases of service scenarios, such as the flow of the sensor service, content repository service, and audio player service in this example for visually handicapped visitors, the case-based reasoning approach seems suitable for supporting service design by reusing successful cases. As shown in the lower part of Figure 2, when the designer inputs his or her requirements, our idea is to retrieve similar cases and the required services for reuse. However, in the SOA domain, case reuse cannot be done without considering the application requirements and the compatibility of service units. Thus, how to provide a good case representation model to support case-based reasoning is an important issue.

3.4. Hierarchical Case Representation

From our observation, interpretative services in the museum domain can be represented at three granularity levels. The application level consists of use cases with features such as the application type, the user's abilities, the intended content, and environment or hardware device constraints. Thus, the cases of interpretative services can be managed with a three-level Service Ontology. Figure 3 shows an example of an Adaptive Application in the Service Ontology, where an adaptive interpretative service application contains three tasks, a User Info Querying Task, a Content Fetching Task, and a Content Presentation Task, to detect the learner's information, fetch the appropriate contents, and present them to the learner; each task contains a kind of service unit that performs the task. Different kinds of applications may have various kinds of tasks and service units, and a task or a service unit may be used in various kinds of applications or tasks, respectively.

Figure 3: The Service Ontology for museum interpretative service

Based on the Service Ontology, an interpretative service application can be represented as a hierarchical case, where an application contains specific kinds of tasks according to the application type, and each task contains a set of service units representing the real web services.
The semantic meaning of an application is represented as a set of case features, and each of its tasks also has a set of case features, a subset of the application's case features, describing the semantic meaning of that task.

Example 1: The hierarchical case of the adaptive visual-handicap interpretative service
As shown in Figure 4, the adaptive-application hierarchical case of the visual-handicap interpretative service contains three tasks: the first task uses RFID to detect the visual-handicap property from the learner profile, the second task finds the interpretative text in the repository, and the last task transforms the text to speech and presents it through a speaker. The italic text in the figure marks the case features of the tasks and the application.

Figure 4: A hierarchical case of an adaptive interpretive service for visual handicap

In the case, different kinds of case features are defined in different tasks to represent the tasks' semantic meaning. In the adaptive application, the User Info Querying Task contains the features trigger message and user information, the Content Fetching Task contains the features content type and content source, and the Content Presenting Task contains the features presentation media and presentation device; the union of the features of all the tasks contained in an application forms the application's case features.
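A minimal sketch of this hierarchical case structure in TypeScript is given below. The shapes follow the application/task/service-unit levels and the feature lists of Example 1, while the type names, the string-valued features, and the example values are illustrative assumptions.

```typescript
// Feature name/value pairs carry the semantic meaning at each level.
type Features = Record<string, string>;

interface ServiceUnitCase {
  serviceId: string;        // e.g. "SID1: Sensor service" (identifiers are illustrative)
  features: Features;
}

interface TaskCase {
  taskType: string;         // e.g. "User Info Querying Task"
  features: Features;       // a subset of the application's case features
  serviceUnit: ServiceUnitCase;
}

interface ApplicationCase {
  applicationType: string;  // e.g. "Adaptive Application"
  tasks: TaskCase[];
}

// The application's case features are the union of its tasks' features.
const applicationFeatures = (app: ApplicationCase): Features =>
  Object.assign({}, ...app.tasks.map(t => t.features));

// Example 1 rendered as data (the feature values are made up for illustration).
const visualHandicapCase: ApplicationCase = {
  applicationType: "Adaptive Application",
  tasks: [
    { taskType: "User Info Querying Task",
      features: { "trigger message": "RFID sensed", "user information": "visual handicap" },
      serviceUnit: { serviceId: "SID1: Sensor service", features: {} } },
    { taskType: "Content Fetching Task",
      features: { "content type": "interpretative text", "content source": "content repository" },
      serviceUnit: { serviceId: "SID2: Content repository service", features: {} } },
    { taskType: "Content Presenting Task",
      features: { "presentation media": "speech (text-to-speech)", "presentation device": "speaker" },
      serviceUnit: { serviceId: "SID3: Audio player service", features: {} } },
  ],
};
```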

4. Collaborative Interpretative Service Assisted Design System

Interpretative services designed for different domains and environments can be represented as hierarchical cases and retained in the case base. When an application for new requirements is needed, the experts of different domains can use our proposed Collaborative Interpretative Service Assisted Design System (CISAD) to collaboratively determine the application's features and automatically generate the solution application from the retained cases. The CISAD system, shown in Figure 5, consists of the Requirement Integration Process (RIP), which assists the experts in collaboratively designing the features of the required application to generate the query feature list; the Intelligent Query Processor (IQP), which recursively generates queries and directs case retrieval at different levels; and three case retrieval processes, one per level, which find appropriate cases for coarse-grained and fine-grained reuse.

Figure 5: Collaborative Interpretative Service Assisted Design System architecture (the experts feed a query feature list into the Requirement Integration Process; the Intelligent Query Processor issues application-, task- and service-unit-level queries against the hierarchical case base; the solution case is evaluated, revised, and retained)

Algorithm 1 is the meta-algorithm showing the overall process of the CISAD system. When a new application is required, the RIP is fired to assist the experts in determining the application type and the query feature list, and this information is used by the IQP to generate a solution case. If the solution is accepted by the experts, then after any necessary revision it can be applied to the real environment and retained in the case base. If the solution is not accepted, the CISAD process is run again to design new features and generate new solutions.

Algorithm 1: Collaborative Interpretative Service Assisted Design (CISAD)

Definition of symbols:
  Query: the query feature list, determined by the designers.
  RIP: Requirement Integration Process.
  IQP(CF, level, APtype): Intelligent Query Processor.
  CF: requirement features and values.
  level: "Application", "Task" or "Service unit".
  APtype: the application type, such as a personal application.

Output: Solution Case
572
//and the query feature list (Query).

Step 2: Call IQP(Query, "Application", APtype).
//Get the solution case.

Step 3: Experts perform the solution evaluation.
If the solution case satisfies the requirements,
Then
  Perform the solution case revision by the experts.
  Retain the refined solution case in the case base.
  Return the solution case.
Else
  Redo CISAD.
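A compact way to read Algorithm 1 is as a case-based reasoning retrieve/revise/retain loop driven by the experts' acceptance decision. The sketch below (Python; the callables rip, iqp, and the expert-interaction hooks are illustrative assumptions, not part of the paper) captures the control flow only.

def cisad(rip, iqp, experts_accept, experts_revise, case_base):
    # Meta-loop of Algorithm 1: repeat until the experts accept a solution.
    while True:
        # Step 1: the Requirement Integration Process yields the query
        # feature list and the application type.
        query, ap_type = rip()

        # Step 2: the Intelligent Query Processor starts at the application level.
        solution = iqp(query, "Application", ap_type)

        # Step 3: expert evaluation, revision, and retention.
        if solution is not None and experts_accept(solution):
            solution = experts_revise(solution)
            case_base.append(solution)   # retain the refined case
            return solution
        # Otherwise redo CISAD with newly designed features.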
4.1. Requirement Integration Process

In the RIP, shown in Algorithm 2, the experts first determine the goal and type of the required application, such as a personal interpretative service for people with a slight visual handicap. In the Service Ontology, the features used to describe this specific kind of application can be found according to the application type. The tasks of determining the different features are assigned to different experts depending on their expertise, and the determined features are integrated into a query feature list, which can be used by the IQP to automatically generate a solution case.

Algorithm 2: Requirement Integration Process (RIP)

Definition of Symbols:
Query: The query feature list, determined by the designers.
APtype: The application type, such as personal application.

Given: Service Ontology

Step 1: Experts discuss the goal and application type (APtype) of the desired application.

Step 2: According to the Service Ontology, find the case feature types that belong to applications of type APtype.

Step 3: Assign the different feature design tasks to different experts according to their domain expertise.

Step 4: Integrate the designed features into the query feature list (Query).

Step 5: Return Query and APtype.
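As a rough illustration of the RIP (again in Python; the ontology lookup, the expert-assignment policy, and all names here are assumptions rather than the paper's implementation), the process reduces to looking up the feature types for the chosen application type and merging each expert's designed values into one query feature list.

def requirement_integration_process(service_ontology, experts, ap_type):
    # Step 2: feature types that describe this kind of application.
    feature_types = service_ontology.feature_types_for(ap_type)

    # Step 3: dispatch each feature type to an expert whose expertise matches.
    query = {}
    for feature_type in feature_types:
        expert = next(e for e in experts if e.covers(feature_type))
        # Step 4: integrate the value designed by that expert.
        query[feature_type] = expert.design_feature(feature_type)

    # Step 5: the integrated query feature list and the application type.
    return query, ap_type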
4.2. Intelligent Query Processor

At the application level, since there are case features, the similarity among cases can be calculated. However, if there is no case at the application level that can meet the requirements, partial case reuse is needed. Since the hierarchical case base retains the service cases at several levels of granularity, the Intelligent Query Processor (IQP) supports partial case reuse through fine-grained case retrieval by decomposing the query. When the designer inputs the feature values of the requirement, the application level query is processed by the IQP first. Next, if the application level case retrieval fails to satisfy the query, the IQP further decomposes the query into sub-queries at the task level or the service unit level for partial case reuse. The IQP and query generation algorithm is given as follows.

Algorithm 3: Intelligent Query Processor (IQP) (CF, level, APtype)

Definition of Symbols:
CF: Requirement features and values.
APtype: The application type, such as personal application.
level: "Application", "Task", or "Service unit".

Given: Service Ontology
Input: CF, level, APtype
Output: The query result case

Step 1: Case level = "Application":
  Fire the application level retrieval with CF.
  If the result fails to satisfy the query, then call IQP(CF, "Task", APtype).
  Else return the result application.

Step 2: Case level = "Task":
  Generate a task level CF from CF for each task contained in this kind of application, according to APtype in the Service Ontology.
  Fire the task level retrieval with the task level CF.
  Integrate the retrieved tasks into the original application.
  If the service units fail in connection, then call IQP(CF, "Service unit", APtype).
  Else return the result application.

Step 3: Case level = "Service unit":
  Fire the service unit level retrieval within the same task.
  If the result fails to satisfy the query, return "query failed".
  Else integrate the retrieved service unit into the original application and return the result application.
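The recursion in Algorithm 3 can be summarized in a few lines of Python. This is only a sketch under our own assumptions: the per-level retrieval function, the satisfaction test, the query decomposition, the integration step, and the connectivity check are hypothetical stand-ins for the paper's retrieval processes and linking predicate.

def iqp(cf, level, ap_type, retrieve, satisfies, decompose, integrate, units_connect):
    # Algorithm 3 as a recursive descent over the three retrieval levels.
    if level == "Application":
        result = retrieve("Application", cf)
        if result is not None and satisfies(result, cf):
            return result
        # Fall back to partial reuse at the task level.
        return iqp(cf, "Task", ap_type, retrieve, satisfies, decompose, integrate, units_connect)

    if level == "Task":
        result = None
        for task_cf in decompose(cf, ap_type):          # one sub-query per task
            result = integrate(result, retrieve("Task", task_cf))
        if result is not None and units_connect(result):
            return result
        # Some retrieved service units do not connect: go one level deeper.
        return iqp(cf, "Service unit", ap_type, retrieve, satisfies, decompose, integrate, units_connect)

    # level == "Service unit": retrieve a replacement unit within the same task.
    unit = retrieve("Service unit", cf)
    if unit is None or not satisfies(unit, cf):
        return None                                     # "query failed"
    return integrate(None, unit)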
4.3. Hierarchical Case Retrieval

Definition 1: Application Objective Function (AOF)

The application objective function calculates the semantic similarity between the application level query and an application case according to the feature similarity of the query and the case. The application level feature similarity is introduced in the following.

Definition 2: Task Objective Function (TOF)

The task objective function, which is similar to the AOF, calculates the semantic similarity between the task level query and a task case according to the feature similarity.
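The concrete formulas behind Definitions 1 and 2 do not survive in this excerpt; purely as an illustrative assumption (not the authors' definition), an objective function of this kind is commonly written as a weighted sum of per-feature similarities between the query and a case:

\[
\mathrm{AOF}(Q, C_a) \;=\; \sum_{i=1}^{n} w_i \,\mathrm{sim}(q_i, c_i),
\qquad \sum_{i=1}^{n} w_i = 1,\quad \mathrm{sim}(q_i, c_i)\in[0,1],
\]

where \(q_i\) and \(c_i\) are the values of the \(i\)-th case feature in the application level query \(Q\) and in the application case \(C_a\), \(\mathrm{sim}\) is a feature-level similarity measure, and \(w_i\) is the weight of that feature; under the same assumption, the TOF would take the same form over the task level features.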

Definition 3: Linking Predicate Function (LPF)

The linking predicate function checks whether the connected service units are compatible. There are two ways to predicate the connectivity of services: first, if two service units are based on the same data standard and the same middleware standard, the predicate function returns true; second, if the two service units have successful cases in the case base, the predicate function also returns true. Otherwise it returns false.
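Definition 3 translates directly into a small predicate. The sketch below is our own Python rendering, with assumed attribute names for the data and middleware standards and an assumed record of past successful connections; it simply checks the two rules in order.

def linking_predicate(unit_a, unit_b, successful_pairs):
    # Rule 1: same data standard and same middleware standard.
    if (unit_a.data_standard == unit_b.data_standard
            and unit_a.middleware_standard == unit_b.middleware_standard):
        return True
    # Rule 2: the pair already has successful cases in the case base.
    if (unit_a.name, unit_b.name) in successful_pairs:
        return True
    return False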

5. Application: The interpretative service for elder people

As illustrated in Figure 6, three application level service cases are stored in the hierarchical case base. Application 1, designed for elder people, can enlarge the text content on a monitor; Application 2, designed for people with a visual handicap, can transform the web content into an MP3 file; and Application 3, designed for people with a serious visual handicap, can transform the messages into a MIDI file.

Figure 6: Interpretative services for elder people, visual handicap, and serious visual handicap

If the museum needs a new interpretative service for elder people, which can detect the elder people, find contents, and transform them to audio, a software engineer, a hardware engineer, and the museum manager are invited into the RIP to define the query feature list. In the RIP, shown in Figure 7, the experts first determine that the type of the new application is a personal service, and then the tasks of defining the necessary features are dispatched to the different experts according to their expertise. After finishing the tasks, the features are integrated into a query feature list and used to retrieve the solution cases.

Figure 7: In the Requirement Integration Process, three experts collaboratively define the query feature list

Figure 8 shows the hierarchical case retrieval and adaptation, where the query feature list is generated in the RIP. In the application level retrieval, the most similar case is Application 1, but its last task cannot satisfy the requirement. Thus, the task level retrieval is fired, and Task 2-3, included in Application 2, is found to replace Task 1-3 in Application 1. However, Service Unit 2-3 in Task 2-3 cannot pass the linking predicate function in Application 1, so the service unit level retrieval is fired and Service Unit 3-3 is found to replace Service Unit 2-3 in Task 2-3. Finally, after evaluation and revision by the experts, the solution case can be applied to the real environment and retained in the case base.
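The adaptation trace of Figure 8 can be replayed with the hierarchical-case classes sketched earlier: when the best application-level case fails, a task from another case is swapped in, and when that task's service unit fails the linking predicate, only the unit is replaced. The names Task 1-3, Task 2-3, and Service Unit 3-3 follow the example; everything else in this Python sketch is an illustrative assumption.

def adapt(best_app, donor_task, replacement_unit, task_ok, unit_links_ok):
    # Task-level reuse: replace the unsatisfied last task (Task 1-3)
    # with the donor task from another application case (Task 2-3).
    if not task_ok(best_app.tasks[-1]):
        best_app.tasks[-1] = donor_task

    # Service-unit-level reuse: if the donor task's unit (Service Unit 2-3)
    # fails the linking predicate, swap in a compatible unit (Service Unit 3-3).
    if not unit_links_ok(best_app, best_app.tasks[-1].service_units[0]):
        best_app.tasks[-1].service_units[0] = replacement_unit

    return best_app   # solution case, to be evaluated and revised by the experts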
Figure 8: Hierarchical case retrieval and integration

6. Conclusion

In this paper, we propose a collaborative design approach and a hierarchical case reuse approach, based on a hierarchical case representation, to facilitate interpretative service design. The Requirement Integration Process is designed to assist experts of different domains in collaboratively determining the requirements of the new interpretative service, and the Intelligent Query Processor is used to perform the coarse-grained and fine-grained case reuse that generates the solution case from different applications. To ensure the feasibility of the generated service, the application and task objective functions and the linking predicate function are proposed to evaluate the satisfaction of applications, tasks, and service units, respectively. Thus, this approach can assist experts in designing and generating feasible interpretative cases from previously successful applications to satisfy the new requirements.

7. Acknowledgement

This research was partially supported by the National Science Council of the Republic of China under grant numbers NSC95-2520-S009-007-MY3 and NSC95-2520-S009-008-MY3.
Author Index

Andronache, Adrian...............................................355 Chu, Hui-Chun ......................................................524


Bastani, Farokh......................................................177 Chung, Chia-Wei ...........................................418, 423
Beutel, Jan .............................................................201 Chung, Jen-Yao .....................................................314
Brasee, Kaleb...........................................................52 Chung, Tein-Yaw ..................................................379
Brusey, James ..........................................................81 Chung, Yoo Chul.....................................................44
Buckl, Christian .....................................................162 Chung, Yu-Chi ......................................................217
Byrd, Gregory T. ..................................................122 Dai, Hong-Jie.........................................................410
Caicedo, Carlos E. .................................................249 Devillers, Raymond ...............................................209
Castro, Alfredo A. Villalba....................................193 Diamond, Dermot ..................................................457
Chakrabarti, Saikat ................................................106 Di Marzo Serugendo, Giovanna ............................193
Chan, Hsin-Te........................................................543 Eertink, Henk...........................................................98
Chandrasekhar, Santosh.........................................106 Fang, Chih-Lun......................................................256
Chang, Chih-Kai....................................................531 Fang, Hua-Yin .......................................................373
Chang, Chung-Hsien .............................................507 Feng, Ming-Whei ..................................................270
Chang, Henry.........................................................314 Florez-Lara, A. .....................................................138
Chang, Je-Wei .........................................................28 Fu, Li-Chen............................................................451
Chang, Ming-Wei ..................................................225 Fu, Shiwa S............................................................314
Chang, Pen-Ming...................................................338 Fujinami, Kaori......................................................154
Chang, Yang-Hui...................................................379 Fukazawa, Yoshiaki...............................................326
Chang, Ye-In .........................................................367 Gaura, Elena I. ........................................................81
Chang, Yung-Jung .................................................464 Gong, Haitao..........................................................386
Chao, Chih-Min .......................................................36 Goossens, Joël .......................................................209
Chaudhary, B. D. ..................................................361 Gopalakrishnan, Sathish ............................................1
Chen, Chi-Bang .....................................................233 Gratz, Patrick.........................................................355
Chen, Chien .............................................................28 Guo, Hong-YI........................................................400
Chen, Chien-Hsun .................................................373 Gupta, G. Sen ........................................................470
Chen, Chien-Wei ...................................................225 Gupta, Gourab Sen ................................................439
Chen, Chih-Nung...................................................332 Hamann, Hendrik F. ..............................................312
Chen, Ching-Han ...................................................445 Hamilton Jr, J. A. ...................................................90
Chen, Han ..............................................................285 Han, Pei-Chen..........................................................74
Chen, Li-Chieh ......................................................408 He, Jingsha ..............................................................67
Chen, Ming-Che ....................................................349 Hecht, David..........................................................386
Chen, Rong-Ming ..................................................386 Hesselman, Cristian .................................................98
Chen, Shi-Huang ...................................................507 Honiden, Shinichi ..................................................326
Chen, Shu-Ching ...................................................262 Hsiao, Chin-Yuan ..................................................492
Chen, Wei-Bang ....................................................233 Hsiao, Chun-Chieh ................................................306
Chen, Xuxiang.......................................................296 Hsiao, Han C. W. .................................................386
Chen, Yeong-Sheng...............................................373 Hsieh, Ming-Hua ...................................................170
Chen, Yung-Mu .....................................................379 Hsieh, Tsu-Yi ................................................418, 423
Cheng, Chia Yang..................................................436 Hsu, Chao-Yen ......................................................386
Cheng, Yun-Maw ..................................................408 Hsu, F. R. ..............................................................436
Chiang, Mu-Huan ..................................................122 Hsu, Kuo-Chiang ...................................................486
Chiang, Tsun Chieh .................................................28 Hsu, Wen-Lian ......................................................410
Chiou, Harry ..........................................................549 Hsu, Yu-Lun ..........................................................464
Chiu, Dickson K.W. .............................................296 Hu, Jwu-Sheng ......................................................464
Choi, Sun ...............................................................343 Hu, Rouh-Mei........................................................386
Chou, Paul .............................................................285 Hu, Yuh-Jong ........................................................400
Chou, Pei-Hsuan....................................................410 Huang, Chi-Hsin....................................................410

Huang, Chin-Tser ..................................................241 Lin, Huan-Yu.........................................................569
Huang, Fu-Ming ....................................................513 Lin, Jaimie Yi-Wen ...............................................410
Huang, Jay .............................................................492 Lin, Kawuu W. .....................................................170
Huang, Polly ..................................................271, 306 Lin, Lin ..................................................................262
Huang, R. Y. M. ...................................................470 Lin, Yih-Jeng.........................................................428
Huang, Ying ..........................................................314 Lin, Yuan-Ning......................................................500
Huang, Yu-Len ..............................................418, 423 Lin, Yuan-Tsun......................................................428
Huang, Yung-Fa ............................................320, 349 Liou, Sz-Ting.........................................................445
Hwang, Bor-Jiunn..................................................338 Liu, Chuan-Ming ...................................................185
Hwang, Gwo-Jen ...................................................524 Liu, Jung-Chun ........................................................74
Hwang, I-Shyan .....................................................338 Liu, Wei-Lun .........................................................320
Ishizuka, Mitsuru ...................................................385 Liu, Wen-Lin .........................................................233
Iyer, Vasanth..........................................................480 Liu, Xue.....................................................................1
Jan, Rong-Hong .......................................................28 Lopez-Gomez, M. A. ............................................138
Jao, Yu-Lang .................................................418, 423 Lu, Kuo-Hsiang .......................................................36
Jardak, Christine ....................................................146 Luo, Zongwei ........................................................280
Jiang, Chang-Jie ......................................................28 Mähönen, Petri ......................................................146
Jimenez-Plaza, J. M. .............................................138 Makki, S. Kami........................................................52
Jin, Ming-Hui ........................................................291 Matsuishi, Masakatsu ............................................561
Joshi, James B. D. .................................................249 Meng, Shengguang ................................................296
Jung, Jae-il.............................................................343 Meshkova, Elena ...................................................146
Kai, Hung-Jen........................................................500 Milojevic, Dragomir ..............................................209
Kao, Hung-Tzu ......................................................500 Miyashita, Ryo ......................................................564
Kemp, John..............................................................81 Moi-Tin, Chew ......................................................439
Kemper, Alfons .....................................................162 Mukhopadhyay, S. C. ...........................................470
Kim, Doo-young....................................................343 Nakamura, Yoshiyuki............................................326
Kim, Jung Ho.........................................................457 Navet, Nicolas .......................................................209
Kim, Sungil..............................................................44 Nélis, Vincent ........................................................209
Knoll, Alois ...........................................................162 Ng, Ka-Lok............................................................386
Ko, Yangwoo...........................................................44 Ni, Lionel M. ..........................................................19
Konstantas, Dimitri................................................193 Oldewurtel, Frank..................................................146
Kosov, Yury ..........................................................314 Ou, Jong-Waye ......................................................386
Ku, Ling-Feng .......................................................338 Pan, Meng-Shiuan .................................................130
Ku, Wei-Shinn.........................................................90 Pan, Yen-Lin..........................................................475
Kumar, Subodha ....................................................177 Pandey, Mayank ....................................................361
Kumaran, Santhosh................................................314 Peng, Wen-Chih ......................................................90
Kuo, Fan-Ray ........................................................524 Pirttikangas, Susanna.............................................154
Kuo, Tei-Wei.........................................................225 Plessl, Christian .....................................................201
Kuo, Yu-Chen........................................................332 Rammurthy, Garimella ..........................................480
Lan, Ci-Wei ...........................................................513 Ravitz, Guy............................................................262
Laredo, Jim ............................................................314 Riihijärvi, Janne.....................................................146
Lau, King Tong .....................................................457 Robert, Charles ......................................................519
Lee, Chiang............................................................217 Rothkugel, Steffen .................................................355
Lee, Do-hyeon .......................................................343 Sakamoto, Mune-Aki.............................................561
Lee, Dongman .........................................................44 Sandhu, Ravi............................................................10
Lee, Hou-Tsan .......................................................451 Scholz, Andreas .....................................................162
Lee, Hsiao-Ping .....................................................500 Seifert, Jean-Pierre...................................................10
Li, Po-Yi ..................................................................90 Sha, Lui .....................................................................1
Lian, Feng-Li.........................................................451 Shah, Nirav ............................................................177
Lie, Wen-Nung ......................................................486 Sheikh, Kamran .......................................................98
Lim, Roman...........................................................201 Shen, Jun-Hong .....................................................367
Lin, Chin-Yu..........................................................428 Sheu, Phillip C.-Y. ................................................386
Lin, Chu-Hsing ........................................................74 Shih, Ming-Te........................................................185
Lin, Guang-De.......................................................400 Shih, Po-Yi ............................................................500

Shimamoto, Shigeru ................................................59 Wang, Qixin ..............................................................1
Shiu, Ming-Chiuan ................................................451 Wang, Tsung-Wei....................................................90
Shyu, Mei-Ling......................................................262 Weng, Jui-Feng......................................................569
Singhal, Mukesh ....................................................106 Wenyin, Liu...........................................................296
Sommer, Stephan...................................................162 Wibbels, Martin .......................................................98
Soo, Von-Wun.......................................................394 Woehrle, Matthias .................................................201
Srinivas, M. B. ......................................................480 Wong, Edward C. .................................................280
Su, Addison ...........................................................549 Wu, Han-Zhen .......................................................555
Su, I-Fang ..............................................................217 Wu, Richard.............................................................20
Su, Ja-Hwung ........................................................492 Wu, Siew-Rong .....................................................537
Su, Jun-Ming .........................................................569 Wu, Xu ....................................................................67
Su, Yu-Sheng.........................................................555 Wu, Yao-Ting........................................................302
Sun, Koun-Tem .....................................................543 Wu, Yiju ..................................................................59
Sung, Jing-Tian......................................................291 Xu, Fei .....................................................................67
Tai, Cheng-Chi ......................................................475 Xu, Jianliang............................................................90
Tan, C. J. ...............................................................280 Yamamoto, Toshiyuki ...........................................564
Tang, Chuan Yi .....................................................436 Yang, Chuan-Yue ..................................................225
Tei, Kenji...............................................................326 Yang, Hao..............................................................285
Tejero-Calado, J. C. ..............................................138 Yang, Jeaha............................................................314
Thake, C. Douglas ...................................................81 Yang, Shih-Yao .....................................................394
Thiele, Lothar ........................................................201 Yang, Stephen J. H. .............................. 513, 549, 555
Tokmakoff, Andrew ................................................98 Yang, Tzu-Chi .......................................................524
Tsai, Jeffrey J. P. ..........................................272, 386 Yeh, Chi-Hsiang ......................................................20
Tsai, Kun-Cheng....................................................291 Yeh, Ching-Long ...................................................408
Tsai, Richard Tzong-Han.......................................410 Yeh, Hsin-Ho.........................................................492
Tsai, Tsung-Han ....................................................256 Yeh, Lun-Wu.........................................................130
Tsai, Yuan-Jiun......................................................114 Yen, I-Ling ............................................................177
Tsai, Yuh-Ren........................................................114 Yu, Ming-Shing .....................................................428
Tseng, Shian-Shyong.............................................569 Yu, Shih-Yin..........................................................507
Tseng, Vincent S............................................170, 492 Yu, Zhenwei ..........................................................272
Tseng, Yu-Chee .....................................................130 Zeadally, Sherali......................................................52
Tuladhar, Summit R. ............................................249 Zhang, Chengcui....................................................233
Wang, Bo-Wen ......................................................492 Zhang, Tianle.........................................................280
Wang, Chih-Cheng ................................................379 Zhang, Xinwen ........................................................10
Wang, Jhing-Fa......................................................500 Zhou, Feng.............................................................280
Wang, Neng-Chung .......................................320, 349

Proceedings

2008 IEEE International Conference


on Sensor Networks, Ubiquitous,
and Trustworthy Computing

SUTC 2008
11-13 June 2008 • Taichung, Taiwan

Editors
Mukesh Singhal, University of Kentucky, USA
Giovanna Di Marzo Serugendo, University of London, UK
Jeffrey J. P. Tsai, University of Illinois, Chicago, USA
Wang-Chien Lee, Pennsylvania State University, USA
Kay Romer, ETH Zurich, Switzerland
Yu-Chee Tseng, National Chiao-Tung University, Taiwan
Han C. W. Hsiao, Asia University, Taiwan

Sponsors
IEEE Computer Society
National Science Council, Taiwan, ROC
Academia Sinica, Taiwan, ROC
Institute for Information Industry, Taiwan, ROC
Asia University, Taiwan, ROC

Los Alamitos, California


Washington • Tokyo
Copyright © 2008 by The Institute of Electrical and Electronics Engineers, Inc.
All rights reserved.

Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries may photocopy
beyond the limits of US copyright law, for private use of patrons, those articles in this volume that carry a code at
the bottom of the first page, provided that the per-copy fee indicated in the code is paid through the Copyright
Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Other copying, reprint, or republication requests should be addressed to: IEEE Copyrights Manager, IEEE Service
Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331.

The papers in this book comprise the proceedings of the meeting mentioned on the cover and title page. They reflect
the authors’ opinions and, in the interests of timely dissemination, are published as presented and without change.
Their inclusion in this publication does not necessarily constitute endorsement by the editors, the IEEE Computer
Society, or the Institute of Electrical and Electronics Engineers, Inc.

IEEE Computer Society Order Number E3158


BMS Part Number CFP08SUT-CDR
ISBN 978-0-7695-3158-8
Library of Congress Number 2008922598

Additional copies may be ordered from:

IEEE Computer Society Customer Service Center
10662 Los Vaqueros Circle
P.O. Box 3014
Los Alamitos, CA 90720-1314
Tel: + 1 800 272 6657
Fax: + 1 714 821 4641
http://computer.org/cspress
csbooks@computer.org

IEEE Service Center
445 Hoes Lane
P.O. Box 1331
Piscataway, NJ 08855-1331
Tel: + 1 732 981 0060
Fax: + 1 732 981 9667
http://shop.ieee.org/store/
customer-service@ieee.org

IEEE Computer Society Asia/Pacific Office
Watanabe Bldg., 1-4-2 Minami-Aoyama
Minato-ku, Tokyo 107-0062
JAPAN
Tel: + 81 3 3408 3118
Fax: + 81 3 3408 3553
tokyo.ofc@computer.org
Individual paper REPRINTS may be ordered at: reprints@computer.org

Editorial and CD-ROM production by Stephanie Kawada

IEEE Computer Society


Conference Publishing Services (CPS)
http://www.computer.org/cps
Foreword

Welcome to the IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing
(SUTC 2008). SUTC 2008 is an international forum for researchers to exchange information regarding
advancements in the state of the art and practice of sensor networks, ubiquitous and trustworthy computing as well
as to identify the emerging research topics and define the future of sensor networks, ubiquitous and trustworthy
computing. The technical program of SUTC2008 consists of invited talks, paper presentations, and panel
discussions.
SUTC 2008 covers broad and diverse topics, which include
• Applications (novel use cases, deployment experience)
• Algorithms and Protocols (topology, coverage, routing, distributed coordination)
• Data Management and Processing (gathering, storage, fusion, dissemination)
• Deployment, Testing, and Debugging
• Design and Programming Methodologies
• Energy Management
• Embedded Processors, Sensors, and Actuators
• Management Aspects (configuration, adaptation, healing)
• Mobility, Location, and Context
• Modeling and Performance Evaluation (simulation, complexity analysis, user studies)
• Networking Technologies (ad hoc networks, personal area networks)
• Operating Systems and Middleware
• Privacy
• Protocol Design, Modeling, and Implementation Experiences
• QoS Aspects
• Reliability
• Security (authentication, access control, intrusion detection, and tolerance)
• Social Issues
• System and Network Architectures
• Trust (establishment, negotiation, management)
• User Interface Technologies
Many high-quality papers were submitted from four main continents and other areas. To maintain the high
standard of the conference, only 25% of the regular papers were accepted for presentation. To address
the future R&D problems of sensor networks and ubiquitous systems, we have invited six distinguished speakers: Dr.
Jeannette Wing, US National Science Foundation; Dr. Lui Sha, University of Illinois; Dr. S. Sitharama Iyengar,
Louisiana State University; Dr. Ravi Sandhu, University of Texas at San Antonio; Dr. Kane Kim, University of
California, Irvine; and Dr. Lionel Ni, Hong Kong University of Science and Technology. We are greatly honored to
have them present their experience and vision of the future trends in ubiquitous computing. Six workshops, one
industrial track, one special session, and two panels are also offered to debate the most important issues facing the
sensor networks research community.
The success of an international conference such as this depends greatly on the involvement of many individuals.
First of all, we would like to thank the Conference Committee and Program Committee members, especially
Program Co-Chairs, Wang-Chien Lee, Kay Römer, Yu-Chee Tseng; Program Vice Chairs, Pedro Marron, Trent
Jaeger, Andreas Willig, Tei-Wei Kuo, Gerd Kortuem, Yunhao Liu, Kun-Lung Wu, Chien-Chao Tseng; Workshop Co-
Chairs, Stephen Yang, Shu-Ching Chen, Jong Hyuk Park, Raja Jurdak; Industrial Program Co-Chairs, Jen-Yao
Chung, Nageswara Rao, Emile Aarts; Special Track Co-Chairs, Mei-Ling Shyu, Guna Seetharaman, Tzong-Chen
Wu; Finance Chair, Rong-Ming Chen; Publication and Registration Chair, Han C. W. Hsiao; Local Arrangement
Chair, Anthony Y. H. Liao; Web Co-Chairs, Fu-Ming Huang, Shih-Nung Chen; Publicity Co-Chairs, Alan Liu,
Shangping Ren, Sam Michiels; Best Paper Award Committee Co-Chairs, Lionel Ni, Ajay Kshemkalyani, Chung-Ta

King, and staff from Asia University. Finally, we would like to express our special thanks to the Advisory
Committee who provided the invaluable help and guidance necessary to put this conference together.
We also wish to thank Asia University, National Science Council, Institute for Information Industry, and
Academia Sinica for their contribution to the success of this conference.
We hope that you will have a great time at SUTC 2008.

Mukesh Singhal
Giovanna Di Marzo Serugendo
Jeffrey J. P. Tsai
General Co-Chairs

Welcome from
the Conference Program Co-Chairs

This volume contains the proceedings of SUTC 2008, the second IEEE International Conference on Sensor
Networks, Ubiquitous, and Trustworthy Computing. The conference took place in Taichung, Taiwan, 11-13 June
2008. Its objective was to bring together leading researchers from the closely related fields of sensor networks,
ubiquitous computing, and trustworthy computing to present and discuss the latest results in this rapidly developing
area.
A total of 102 papers were submitted to the conference by researchers from all over the world. After a thorough
review by members of the Program Committee and external reviewers, 26 papers were selected for inclusion in the
conference program. The technical program of SUTC 2008 represents a collection of excellent papers on important
aspects of sensor networks, ubiquitous computing, and trustworthy computing—thus bringing together researchers
from these related fields in a unique setting to exchange the latest results. In addition to the full technical papers, the
conference program included six keynote presentations, two panel discussions, an industrial track as well as several
workshops. The technical program was structured into three parallel tracks.
We would like to thank the Program Vice-Chairs Pedro J. Marron (University of Bonn, Germany), Trent Jaeger
(Pennsylvania State University, USA), Andreas Willig (Technical University of Berlin, Germany), Tei-Wei Kuo
(National Taiwan University, Taiwan), Gerd Kortuem (Lancaster University, UK), Yunhao Liu (Hong Kong
University of Science and Technology, Hong Kong), Kun-Lung Wu (IBM Watson Research Lab, USA), and Chien-
Chao Tseng (National Chiao-Tung University, Taiwan) for their tireless efforts which were instrumental in putting
together a strong technical program for the conference. We would also like to thank the members of the Technical
Program Committee for their excellent work in making this conference a success. We are greatly indebted to the
General Co-Chairs Mukesh Singhal (University of Kentucky, USA), Giovanna Di Marzo Serugendo (University of
London, UK), and Jeffrey J. P. Tsai (University of Illinois, Chicago, USA) for their generous help and successful
coordination of the many activities involved in this conference.

Wang-Chien Lee, Pennsylvania State University, USA


Kay Romer, ETH Zurich, Switzerland
Yu-Chee Tseng, National Chiao-Tung University, Taiwan

Committees

Program Committee
Luis Almeida, Universidade de Aveiro, Portugal
Stefan Arbanowski, Fraunhofer FOKUS, Germany
Walid Aref, Purdue University, USA
Sanjoy Baruah, University of North Carolina at Chapel Hill, USA
Vandy Berten, National Taiwan University, Taiwan
Rajendra Boppana, University of Texas, San Antonio, USA
Eric Bouillet, IBM T. J. Watson Research, USA
Giorgio Buttazzo, Scuola Superiore Sant'Anna, Italy
Chih-Ming Chao, National Taiwan Ocean University, Taiwan
Han-Chieh Chao, National Dong Hwa University, Taiwan
Samarjit Chakraborty, National University of Singapore, Singapore
Chien Chen, National Chiao-Tung University, Taiwan
Guihai Chen, Nanjing University, China
Lei Chen, Hong Kong University of Science and Technology, Hong Kong
Shyh-Kwei Chen, IBM Watson Research Center, USA
Ming-Syan Chen, National Taiwan University, Taiwan
Ya-Su Chen, National Taiwan University of Science and Technology, Taiwan
Kuang-Hui Chi, National Yunlin University, Taiwan
Narankar Dulay, Imperial College, UK
Schahram Dustdar, Vienna University of Technology, Austria
Eylem Ekici, Ohio State University, USA
Patrik Floreen, Helsinki Institute for Information Technology, Finland
Andrea Forte, Columbia University, USA
Xiaoming Fu, Georg-August-University of Goettingen, Germany
Vinod Ganapathy, Rutgers University, USA
Bugra Gedik, IBM Watson Research Center, USA
Alain Gefflaut, Microsoft, Germany
Joel Goossens, Brussels University, Belgium
Lin Gu, Google, USA
Tao Gu, Institute For Infocom Research, Singapore
Dimitrios Gunopulos, University of California, Riverside, USA
Joerg Haehner, University of Hannover, Germany
Jinsong Han, Hong Kong University of Science and Technology, Hong Kong
Marcus Handte, University of Bonn, Germany
Takahiro Hara, Osaka University, Japan
Klaus Herrmann, University of Stuttgart, Germany
Jiman Hong, Kwangwoon University, Korea
Seongsoo Hong, Seoul National University, Korea
Vincent Hu, NIST, USA
Kien A. Hua, University of Central Florida, USA
Sajid Hussain, Acadia University, Canada
Salil Kanhere, University of New South Wales, Australia
Apu Kapadia, Dartmouth College, USA
Holger Karl, University of Paderborn, Germany
Abdelmajid Khelil, TU Darmstadt, Germany
Chin-Fu Kuo, National Kaohsiung University, Taiwan

Ricky Yu Kwong Kwok, Hong Kong University, China
Andreas Lachenmann, University of Stuttgart, Germany
Yee Wei Law, University of Melbourne, Australia
Chiang Lee, National Cheng Kung University, Taiwan
Dik Lun Lee, Hong Kong University of Science and Technology, China
Mei Li, Microsoft, USA
Ninghui Li, Purdue University, USA
XiangYang Li, Illinois Institute of Technology
Yingshu Li, Georgia State University, USA
Maria Lijding, University of Twente, Netherlands
Ee-Peng Lim, Nanyang Technological University, Singapore
Ting-Yu Lin, University of Illinois at Urbana-Champaign, USA
Feng Ling, Tsinghua University, China
Chuan-Ming Liu, National Taipei University of Technology, Taiwan
Donggang Liu, University of Texas at Arlington, USA
Li Lu, Hong Kong University of Science and Technology, Hong Kong
Yung-Hsiang Lu, Purdue University, USA
Chung-Horng Lung, Carleton University
Michael R. Lyu, Chinese University of Hong Kong, China
Steve McLaughlin, University of Edinburgh, UK
Nirvana Meratnia, University of Twente, Netherlands
Daniele Miorandi, Create-Net, Italy
Henk Muller, University of Bristol, UK
Nicolas Navet, INRIA, France
Tatsuo Nakajima, Waseda University, Japan
Nidal Nasser, University of Guelph, Canada
Beng Chin Ooi, National University of Singapore, Singapore
Luis Orozco, Universidad de Castilla La Mancha, Spain
Wen-Chih Peng, National Chiao Tung University, Taiwan
Stefan Petters, National ICT Australia Ltd., Australia
Thomas Plagemann, University of Oslo, Norway
Ashutosh Sabharwal, Rice University, USA
Reiner Sailer, IBM Research, USA
George Samaras, University of Cyprus, Cyprus
Pierangela Samarati, University of Milan, Italy
Dae-Hee Seo, Korea Information Security Agency, Korea
Lui Sha, University of Illinois, USA
Zili Shao, Hong Kong Polytechnic University, Hong Kong
Chien-Chung Shen, University of Delaware, USA
Haiying Shen, University of Arkansas, USA
Chi-Sheng Shih, National Taiwan University, Taiwan
Gunter Schafer, Technische Universitat Ilmenau, Germany
Gregor Schiele, University of Mannheim, Germany
Jochen Schiller, FU Berlin, Germany
Loren Schwiebert, Wayne State University, USA
Aruna Seneviratne, University of New South Wales, Australia
Sakir Sezer, Queen’s University Belfast, N. Ireland, UK
Shi-Wu Lo, National Chung Cheng University, Taiwan
Françoise Simonot-Lion, LORIA-INPL, France
Ye-Qiong Song, LORIA-INPL, France
Junehwa Song, KAIST, South Korea
Anna Squicciarini, Purdue University, USA
Avinash Srinivasan, Florida Atlantic University
Oliver Storz, Lancaster University, UK

Kian-Lee Tan, National University of Singapore, Singapore
Xueyan Tang, Nanyang Technological University, Singapore
Bulent Tavli, Tobb University, Turkey
Eduardo Tovar, ISEP-IPP, Portugal
Patrick Traynor, Penn State University, USA
Elisabeth Uhlemann, Halmstad University, Sweden
Athanasios Vasilakos, University of Western Macedonia, Greece
Michail Vlachos, IBM T. J. Watson Research Center, USA
Thiemo Voigt, SICS, USA
Matthias Wagner, DoCoMo Communications Laboratories Europe, Germany
Farn Wang, National Taiwan University, Taiwan
Brent Waters, SRI, USA
Andreas Willig, Technical University of Berlin, Germany
Shih-Lin Wu, Chang Gung University, Taiwan
Bin Xiao, Hong Kong Poly University
Jianliang Xu, Baptist University, Hong Kong
Vivian Xu, Cisco, USA
Chu-Sing Yang, National Cheng Kung University, Taiwan
Chu-Sing Yang, National Taiwan University, Taiwan
Wei-Zu Yang, Asia University, Taiwan
Fan Ye, IBM T. J. Watson Research, USA
Li-Hsing Yen, National University of Kaohsiung, Taiwan
Hee Yong Youn, Sungkyunkwan University, Korea
Jeffrey Xu Yu, Chinese University of Hong Kong, China
Ting Yu, North Carolina State University, USA
Andrea Zanella, University of Padova, Italy
Dimitris Zeinalibour, University of Cyprus
Hongke Zhang, Beijing JiaoTong University, China
Xiaolan Zhang, IBM Research, USA
Xinwen Zhang, Samsung Research, USA
Dakai Zhu, University of Texas at San Antonio, USA
Sencun Zhu, Penn State University, USA
Marco Zuniga, Xerox Research Labs, USA

Organizing Committee

Advisory Committee
C. V. Ramamoorthy (Chair), University of California at Berkeley, USA
Wen-Tsuen Chen, National Tsing Hua University, Taiwan
S. S. Iyengar, Louisiana State University, USA
Kinji Mori, Tokyo Institute of Technology, Japan
Lionel M. Ni, Hong Kong University of Science and Technology, Hong Kong
Makoto Takizawa, Tokyo Denki University, Japan
Benjamin Wah, University of Illinois, Urbana, USA

Steering Committee
Jeffrey J. P. Tsai (Chair), University of Illinois, Chicago, USA
S. S. Iyengar, Louisiana State University, USA
Lionel M. Ni, Hong Kong University of Science and Technology, Hong Kong
Giovanna Di Marzo Serugendo, University of London, UK

General Co-Chairs
Mukesh Singhal, University of Kentucky, USA
Giovanna Di Marzo Serugendo, University of London, UK
Jeffrey J. P. Tsai, University of Illinois, Chicago, USA

Program Co-Chairs
Wang-Chien Lee, Pennsylvania State University, USA
Kay Romer, ETH Zurich, Switzerland
Yu-Chee Tseng, National Chiao-Tung University, Taiwan

Program Vice-Chairs
Sensor Networks Track
Pedro J. Marron, University of Bonn, Germany

Reliable Software Systems Track


Trent Jaeger, Pennsylvania State University, USA

Mobile Computing and Wireless Communication Track


Andreas Willig, Technical University of Berlin, Germany

Embedded Systems Track


Tei-Wei Kuo, National Taiwan University, Taiwan

Ubiquitous Computing Track


Gerd Kortuem, Lancaster University, UK

Trustworthy Computing Track
Yunhao Liu, Hong Kong University of Science and Technology, Hong Kong

Pervasive Services and Data Management Track


Kun-Lung Wu, IBM Watson Research Lab., USA

Wireless Local-, Personal-, and Body-area Networks Track


Chien-Chao Tseng, National Chiao-Tung University, Taiwan

Workshop Co-Chairs
Shu-Ching Chen, Florida International University, USA
Jong Hyuk Park, Kyungnam University, Korea
Raja Jurdak, University College Dublin, Ireland
Stephen J. H. Yang, National Central University, Taiwan

Industrial Program Co-Chairs


Nageswara S. Rao, Oak Ridge National Lab., USA
Jen-Yao Chung, IBM Watson Research Lab., USA
Emile Aarts, Philips Research Lab., Netherlands

Special Track Co-Chairs


Mei-Ling Shyu, University of Miami, USA
Guna S. Seetharaman, Air Force Institute of Technology, USA
Tzong-Chen Wu, National Taiwan University of Science & Technology, Taiwan

Best Paper Award Committee Co-Chairs


Lionel M. Ni, Hong Kong University of Science and Technology, Hong Kong
Ajay Kshemkalyani, University of Illinois, Chicago, USA
Chung-Ta King, National Tsing Hua University, Taiwan

Finance Chair
Rong-Ming Chen, National University of Tainan, Taiwan

Publication and Registration Chair


Han C. W. Hsiao, Asia University, Taiwan

Local Arrangement Chair
Anthony Y. H. Liao, Asia University, Taiwan

Publicity Co-Chairs
Sam Michiels, K.U. Leuven, Belgium
Shangping Ren, Illinois Institute of Technology, USA
Alan Liu, National Chung Cheng University, Taiwan

Web Co-Chairs
Shih-Nung Chen, Asia University, Taiwan
Fu-Ming Huang, National Central University, Taiwan

