Abstract—Surveillance systems provide the capability of collecting authentic and purposeful information and forming appropriate decisions to enhance safety. This paper concisely reviews the historical development and current state of the three generations of contemporary surveillance systems. Recently, in addition to the employment of an incessantly enlarging variety of sensors, the inclination has been to utilize more intelligence and situation-awareness capabilities to assist the human surveillance personnel. The most recent generation is decomposed into multisensor environments, video and audio surveillance, wireless sensor networks, distributed intelligence and awareness, architecture and middleware, and the utilization of mobile robots. The prominent difficulties of contemporary surveillance systems are highlighted. These challenging dilemmas comprise the attainment of real-time distributed architecture, awareness and intelligence, existing difficulties in video surveillance, the utilization of wireless networks, the energy efficiency of remote sensors, the location difficulties of surveillance personnel, and scalability difficulties. The paper concludes with a concise summary and the future of surveillance systems for public safety.

Index Terms—Distributed systems, human safety, surveillance, survey.

Manuscript received August 4, 2009; revised November 16, 2009 and January 28, 2010; accepted January 28, 2010. Date of publication March 1, 2010; date of current version August 18, 2010. This paper was recommended by Associate Editor L. Zhang.

The author is with the VTT Technical Research Centre of Finland, Oulu 90571, Finland (e-mail: tomi.raty@vtt.fi).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMCC.2010.2042446

I. INTRODUCTION

SURVEILLANCE systems enable the remote surveillance of widespread society for public safety and proprietary integrity. This paper reviews the background and the three generations of surveillance systems. The emphasis of this paper is on the third-generation surveillance system (3GSS) and its current and significant difficulties. The 3GSSs use multiple sensors. Domain-specific issues are omitted from this paper, despite being inherent to their own domains. The focus is on generic surveillance, which is applicable to public safety.

Surveillance systems are typically categorized into three distinct generations, of which the 3GSS is the current generation. The essential dilemmas of the 3GSSs are related to the attainment of real-time distributed architecture, awareness and intelligence, existing difficulties in video surveillance, the utilization of wireless networks, the energy efficiency of remote sensors, location difficulties of surveillance personnel, and scalability difficulties. These aspects occurred repeatedly in the literature review. In public safety, real-time distributed architecture is required to transmit sensor data immediately for deduction. Awareness and intelligence are applied to address automatic deduction. Video surveillance is used thoroughly in public safety. The usage of wireless networks is growing in public safety, and it is accompanied by energy efficiency concerns. Surveillance personnel often patrol in surveyed areas, and their precise location must be known to exploit their benefit to the fullest. As surveyed areas become constantly larger and more complex, scalability is a crucial issue in the surveillance of public safety.

Public safety and homeland security are substantial concerns for governments worldwide, which must protect their people and the critical infrastructures that uphold them. Information technology plays a significant role in such initiatives. It can assist in reducing risk and enabling effective responses to disasters of natural or human origin [1].

There is an increasing demand for security in society, which results in a growing need for surveillance activities in many environments. Recent events, including terrorist attacks, have influenced governments to make personal and asset security priorities in their policies. Valera and Velastin [2] state that the demand for remote surveillance relative to safety and security has received significant attention, especially in public places, the remote surveillance of human activities, surveillance in forensic applications, and remote surveillance in military applications. The public can be perceived either as individuals or as a crowd. Valera and Velastin [2] indicate that a future challenge is to develop a wide-area distributed multisensor surveillance system with robust, real-time computer algorithms that are executable with minimal manual reconfiguration for different applications [2].

There is a growing interest in surveillance applications because of the availability of sensors and processors at reasonable costs. There is also an emerging need from the public for improved safety and security in urban environments and for the significant utilization of resources in public infrastructure. This, together with the growing maturity of algorithms and techniques, enables the application of technology in miscellaneous sectors, such as security, transportation, and the automotive industry. The problem of remote surveillance of unattended environments has received particular attention in the past few years [3].

Intelligent remote monitoring systems allow users to survey sites from significant distances. This is especially useful when numerous sites require security surveillance simultaneously. These systems execute rapid and efficient corrective actions immediately once a suspicious activity is detected. An alert system can be used to warn security personnel of impending difficulties, and numerous sites can be monitored simultaneously. This considerably reduces the load of the security personnel [4].

A fundamental goal of surveillance systems is to acquire good coverage of the observed region with as few cameras as possible to keep the costs of the installation and maintenance of cameras, the transmission channels, and the complexity of scene calibration reasonable [5].

In this paper, we first present the background and progression of surveillance systems. This is followed by careful descriptions of the three generations of surveillance systems. Then we present the difficulties of contemporary surveillance systems, which comprise the attainment of real-time distributed architecture, awareness and intelligence, existing difficulties in video surveillance, the utilization of wireless networks, the energy efficiency of remote sensors, location difficulties of surveillance personnel, and scalability difficulties. The paper concludes with a future prospect and a brief summary.

II. HISTORICAL SURVEILLANCE AND SURVEILLANCE SYSTEMS

The stone-age warrior used his eyes and ears from atop a mantle to survey his battle area and to distinguish targets against which he could utilize his primitive weapons. Despite advancements in weaponry to catapults, swords, and shields, the eyes and ears of warriors were still utilized for surveillance. The observation balloon and the telegraph significantly improved range in visibility and information transmission, respectively, but in the twentieth century, the improvements beyond the eyes and ears transformed surveillance into the concept "modern" [6].

Military operations have introduced the importance of the combat surveillance problem. The location of target coordinates and the shifting of one's own troops accordingly require dynamic actions accompanied by decisions. Rapid, complete, and precise information is needed to address this [7]. Information included the detection and approximate location of personnel, concentrations of troops, and the monitoring and storage of position data over time and according to movements [8]. Surveillance information must be delivered to the correct commander when he requires it, and the information must be presented in a meaningful form to address the problem of information processing [7]. The data-collection problem is addressed by the entities that perform the surveillance, e.g., intelligence sources and human surveillance, and transmit it to the command [7].

The fundamental intention of a surveillance system is to acquire information about an aspect of the real world. Military surveillance systems enhance the sensory capabilities of a military commander. Surveillance systems have evolved from simple visual and verbal systems, but the purpose is still the same. Even the most primitive surveillance systems gathered information concerning reality and communicated it to the appropriate users [9].

Generic surveillance is composed of three essential parts: data acquisition, information analysis, and on-field operation. Any surveillance system requires means to monitor the environment and collect data in the form of, e.g., video, still images, or audio. Such data are processed and analyzed by a human, a computer, or a combination of both at a command center. An administrator can decide on performing an on-field operation to put the environment back into a situation considered as normal. On-field control operations are issued by on-field agents, who require effective communication channels to uphold a close interaction with the command center [10].

A surveillance system can be defined as a technological tool that assists humans by offering an extended perception and reasoning capability about situations of interest that occur in the monitored environments. Human perception and reasoning are restricted by the capabilities and limits of the human senses and mind to simultaneously collect, process, and store a limited amount of data [3].

To address this amount of information, aspects such as scalability and usability become very significant. This includes how information needs to be given to the right people at the right time. To tolerate this growing demand, research and development have been executed in commercial and academic environments to discover improvements or new solutions in signal processing, communications, system engineering, and computer vision [2].

III. PROGRESSION OF SURVEILLANCE SYSTEMS

Over the past two decades, surveillance systems have been an area of considerable research. Recently, plenty of research has concentrated on video-based surveillance systems, particularly for public safety and transportation systems [11].

Data are collected by distributed sources and then typically transmitted to a remote control center. The automatic capability to learn and adjust to altering scene conditions and the learning of statistical models of normal event patterns are growing issues in surveillance systems. The learning system offers a mechanism to flag potentially anomalous events through the discovery of the normal patterns of activity and the flagging of the least probable ones. Two substantial restrictions that affect the deployment of these systems in the real world are real-time performance and low cost. Multisensor systems can capitalize on processing either the same type or different types of information collected by sensors, e.g., video cameras and microphones, of the same monitored area. Appropriate processing techniques and new sensors offering real-time information associated with different scene characteristics can assist both in enlarging the monitored environments and in enhancing the performance of alarm detection in regions monitored by multiple sensors [3].

RÄTY: SURVEY ON CONTEMPORARY REMOTE SURVEILLANCE SYSTEMS FOR PUBLIC SAFETY 495

Security surveillance systems are becoming crucial in situations in which personal safety could be compromised as a result of criminal activity. Video cameras are constantly being installed for security reasons in prisons, banks, automatic teller machines, petrol stations, and elevators, which are the most susceptible to criminal activities. Usually, the video camera is connected to a recorder or to a display screen from which security personnel constantly monitor suspicious activities. As security personnel typically monitor multiple locations simultaneously,
this manual task is labor intensive and inefficient. Significant stress may be placed on the security personnel involved [4].

Another technological breakthrough substantial to the development of surveillance systems is the capability of remotely transmitting and reproducing images and video information, e.g., TV broadcasting and the successive use of video signal transmission and display in closed-circuit TV (CCTV) systems. CCTVs that provide data at acceptable quality date back to the 1960s. The availability of CCTVs can be considered the point that made online surveillance feasible, and 1960 can be considered the beginning date of the first-generation surveillance systems [3].

Surveillance systems have developed over three generations [11]. The first generation of surveillance systems (1GSSs) used analogue equipment throughout the complete system [11]. Analogue closed-circuit television (CCTV) cameras captured the observed scene and transmitted the video signals over analogue communication lines to the central back-end systems, which presented and archived the video data [11]. The main challenge in the 1GSS is that it uses analogue techniques for image distribution and storage [2].

The second generation of surveillance systems (2GSSs) uses digital back-end components [11]. They enable real-time automated analysis of the incoming video data [11]. Automated event detection and alarms substantially improve the content of simultaneously monitored data and the quality of the surveillance system [11]. The difficulty in the 2GSS is that it does not support the robust detection and tracking algorithms needed for behavioral analysis [2].

The 3GSSs have finalized the digital transformation. In these systems, the video signal is converted into the digital domain at the cameras, which transmit the video data through a computer network, for instance a local area network. The back-end and transmission systems of a third-generation surveillance system have also improved in functionality [11].

There are immediate needs for automated surveillance systems in commercial and military applications and in law enforcement. Mounting video cameras is inexpensive, but locating available human resources to survey the output is expensive. Despite the usage of surveillance cameras in banks, stores, and parking lots, video data currently are used only retrospectively as a forensic tool, thus losing their primary benefit as an active real-time medium. What is required is continuous 24-h monitoring of surveillance video to alert security officers of a burglary in progress, or of a suspicious individual lingering in a parking lot, while there is still time to prevent the criminal offence [12].

IV. FIRST-GENERATION SURVEILLANCE SYSTEMS

First-generation video surveillance systems (1960–1980) considerably extend human perception capabilities in a spatial sense. The 1GSSs are based on analogue signal and image transmission and processing. In these systems, analogue video data from a collection of cameras, which view remote scenes, present information to the human operators. The main disadvantages of these systems concern the reasonably small attention span of operators, which may result in a significant miss rate of the events of interest. From a communication perspective, these systems suffered from the main difficulties of analogue video communication, e.g., high-bandwidth requirements and poor allocation flexibility [3].

The 1GSS utilizes analogue CCTV systems. The advantage is that they provide good performance in some situations and the technology is mature. The utilization of analogue techniques for image distribution and storage is inefficient. The current 1GSSs examine the usage of digital information against analogue, digital video recording, and CCTV video compression [2].

Computer vision is a significant artificial intelligence (AI) research area. From the 1970s to the 1990s, computer vision proved its practical value in a vast range of application domains, including medical diagnostics, automatic target recognition, and remote sensing [13].

V. SECOND-GENERATION SURVEILLANCE SYSTEMS

In this technological evolution, the 2GSSs (1980–2000) correspond to the maturity phase of the analogue 1GSS. The 2GSSs benefited from the early progression in digital video communications, e.g., digital compression, robust transmission, bandwidth reduction, and processing methods, which assist the human operator by prescreening important visual events [3].

Regarding the 2GSS, automated visual surveillance is achieved through the combination of computer vision technology and CCTV systems. The benefit of the second generation is that the surveillance efficiency of CCTV is enhanced. The difficulties lie within the robust detection and tracking algorithms needed for behavioral analysis. The current research of the 2GSS rests in real-time robust computer vision algorithms, automatic learning of scene variability and patterns of behavior, and bridging the gap between the statistical analyses of a scene and natural language interpretations [2].

The 2GSS research addressed multiple areas with improved results in the real-time analysis and separation of 2-D image sequences, the identification and tracking of multiple objects in complex scenes, human behavior comprehension, and multisensor data fusion. The 2GSS also improved intelligent man–machine interfaces, performance evaluation of video processing algorithms, wireless and wired broadband access networks, signal processing for video compression, and multimedia transmission for video-based surveillance systems [3].

The majority of research efforts during the period of the 2GSSs were devoted to the development of automated real-time event detection techniques for video surveillance. The availability of automated methods would significantly ease the monitoring of large sites with multiple cameras, as automated event detection enables prefiltering and the presentation of the main events [3].

VI. THIRD-GENERATION SURVEILLANCE SYSTEMS

The 3GSSs handle a large number of cameras, a geographical spread of resources, and many monitoring points. From an image processing view, they are based on the distribution of processing capacities over the network and the use of embedded signal-
496 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 40, NO. 5, SEPTEMBER 2010
This information indicates where objects are and what they may be doing as they are observed, and attempts to characterize usual behavior [19].

Research interests have shifted from ordinary static image-based analysis to video-based dynamic monitoring and analysis. Researchers have advanced in addressing the static aspects of illumination, color, background, and perspective. They have advanced in tracking and analyzing shapes related to moving human bodies and moving cameras. They have improved activity analysis and the control of multicamera systems. The research of Trivedi et al. [13] addresses a distributed collection of cameras, which provides wide-area monitoring and scene analysis on several levels of abstraction. Installing multiple sensors introduces new design aspects and challenges. Handoff schemes are needed to pass tracked objects between sensors and clusters, methods are required to specify the best view in a given scene context, and sensor-fusion algorithms capitalize on a given sensor's strengths [13].

Modern visual surveillance systems deploy multicamera clusters operating in real time with embedded adaptive algorithms. These advanced systems need to be operational constantly, and to robustly and reliably detect events of interest in difficult weather conditions. This includes adjusting to natural and artificial changes in the illumination, and withstanding hardware and software system failures [23].

Generally, the initial step for automatic video surveillance is adaptive background subtraction to extract foreground regions from the incoming frames. Object tracking is then executed on the foreground regions. In this case, tracking isolated objects is relatively easy. When multiple tracked objects are placed into groups with miscellaneous complexities of occlusion, tracking each individual object through crowds becomes a challenging task. First, when objects merge into a group, the visual characteristics of each object become unclear and obscure. Objects distant from the camera can be partially or completely occluded by the surrounding objects. Second, the poses and scales of the target objects may change severely when they are in crowds. Third, the motion speed and the direction of the target objects may change essentially during occlusion [24].

Basically, the detection of moving objects is approached through background subtraction, which comprises a model of the background and the detection of moving objects as those that differ from this model. In comparison to other approaches, such as optical flow, this approach is computationally affordable for real-time applications. Its main dilemma is its sensitivity to dynamic scene challenges and the subsequent need for background model adaptation through background maintenance. This type of problem is known to be essential and demanding [25].

Fig. 5. Illustration of images and the output of background subtraction [22].

Fig. 5 illustrates a collection of images from a parking lot and the background subtraction output of these images. Object detection is achieved by constructing a representation of the scene, called a background model, and then locating the differences between the model and each incoming frame. The higher image sequence illustrates the complete scene, and the lower image sequence represents the resulting background subtraction output [22].

Fig. 6. Example of tracklet tracking [26].

Li et al. state that the aim of multitarget tracking is to infer the target trajectories from image observations in a video. This poses a significant challenge in crowded environments, where there are frequent occlusions and multiple targets have a similar appearance and intersecting trajectories. Data-association-based tracking (DAT) links short track fragments, i.e., tracklets, or detection responses into trajectories based on similarity in position, size, and appearance. This enables multitarget tracking from a single camera by progressively associating detection responses into longer track fragments, i.e., tracklets, to resolve target trajectories. Fig. 6 presents an image of tracklet tracking [26].

Human motion tracking based on the input from red–green–blue (RGB) cameras can produce results in indoor scenes with consistent illumination and a steady background [27]. Outdoor scenes with significant background clutter resulting from illumination changes are a challenge for conventional charge-coupled device (CCD) cameras [27]. There have been contributions on pedestrian localization and tracking in visible and infrared videos [28]. Fig. 7 presents a thermal image and a color image of the same scene [28].

A significant problem encountered in numerous surveillance systems is the change in ambient light, particularly in an out-
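The adaptive background subtraction described above (maintain a background model, mark as foreground the pixels that differ from it, and continuously fold unmasked pixels back into the model) can be sketched as follows. This is an illustrative minimal sketch, not the method of [22] or [25]; the function name, the running-average maintenance rule, and the parameter values are assumptions.

```python
import numpy as np

def make_subtractor(first_frame, alpha=0.05, threshold=25.0):
    """Running-average background model with per-frame maintenance.

    alpha     -- background adaptation rate (assumed; tuned per scene)
    threshold -- absolute grey-level difference that marks foreground
    """
    background = first_frame.astype(np.float64)

    def step(frame):
        frame = frame.astype(np.float64)
        # Foreground = pixels that differ from the current background model.
        mask = np.abs(frame - background) > threshold
        # Background maintenance: blend only background pixels into the
        # model, so slow illumination changes are absorbed without
        # absorbing moving objects into the background.
        background[~mask] = (1 - alpha) * background[~mask] + alpha * frame[~mask]
        return mask

    return step
```

Raising `alpha` makes the model absorb illumination changes faster, at the cost of eventually absorbing slow-moving objects, which is precisely the background-maintenance trade-off noted above.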
Fig. 7. (Left) Thermal image and (right) color image of a scene [28].
Fig. 8. Example of a microphone array for measuring the bearing angle [32].
Fig. 14. Detected target tracked and geo-registered on the map [53].
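The data-association-based tracking (DAT) idea attributed to Li et al. above, associating detection responses into tracklets by similarity, can be illustrated with a deliberately simplified sketch. It uses greedy nearest-neighbour gating on position only; the function name, the gating threshold, and the omission of the size and appearance terms are simplifying assumptions, and this is not the method of [26].

```python
import math

def associate(tracklets, detections, max_cost=50.0):
    """Greedily extend tracklets with the nearest compatible detection.

    tracklets  -- list of lists of (x, y) positions (existing fragments)
    detections -- list of (x, y) positions detected in the new frame
    max_cost   -- gating distance; larger gaps start a new tracklet (assumed)
    """
    unused = list(range(len(detections)))
    for track in tracklets:
        if not unused:
            break
        last = track[-1]
        # The cost here is position only; a full DAT cost would also
        # compare size and appearance, as noted in the text.
        best = min(unused, key=lambda j: math.dist(last, detections[j]))
        if math.dist(last, detections[best]) <= max_cost:
            track.append(detections[best])
            unused.remove(best)
    # Detections that matched nothing begin new tracklets.
    tracklets.extend([detections[j]] for j in unused)
    return tracklets
```

In a full DAT pipeline, this frame-level association would be followed by a second stage that links the resulting tracklets across longer temporal gaps into complete trajectories.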
the location difficulties of surveillance personnel, and scalability difficulties.

A. Real-Time Distributed Architecture

It is fundamental to establish a framework or methodology for designing distributed wide-area surveillance systems. This ranges from the generation of requirements to the creation of design paradigms by defining functional and intercommunication models. The future realization of a wide-area distributed intelligent surveillance system should draw on a collection of distinct disciplines. Computer vision, telecommunications, and system engineering are clearly needed [2].

A distributed multiagent approach may provide numerous benefits. First, intelligent cooperation between agents may enable the use of less expensive sensors, and, therefore, a large number of sensors may be deployed over a larger area. Second, robustness is enhanced, because even if some agents fail, others remain to perform the mission. Third, performance is more flexible, as tasks are distributed among groups of agents at miscellaneous locations. For instance, the likelihood of correctly classifying an object or target increases if multiple sensors are concentrated on it from different locations [2].

A video surveillance network is a complicated distributed application and requires sophisticated support from middleware. The role of middleware is primarily to support communication between modules. The nonfunctional requirements for video surveillance networks are best defined in architectural terms and contain scalability (the middleware must offer tools suitable for the scalable re-implementation of these algorithms), availability (the middleware needs to support sufficient fault tolerance to uphold acceptable levels of availability), evolvability (the capacity of the surveillance network to adjust to changes, including changes to the hardware and modifications to the software), integration (the middleware is the intermediary for this type of communication), security (the middleware needs to offer security facilities to address such attacks), and manageability (the network middleware must support the on-demand requirement for manageability) [43].

The systems provide concrete and profitable assistance to forensic investigations, although their potential capabilities are reduced in reality by the limitations of storage capacities, frame skipping, and data compression. Currently, real-time reactivity is insufficient, because human operators cannot handle enormous amounts of surveillance streams [57].

1) Architectural Dilemmas in Video Surveillance: While existing research has addressed multiple issues in the analysis of surveillance video, there has been little work in the area of more efficient information acquisition based on real-time automatic video analysis, such as the automatic acquisition of high-resolution face images. There is a challenge in transmitting information across different scales, and the interpretation of the information becomes essential. Multiscale techniques present a completely novel region of research, including camera control, processing video from moving cameras, resource allocation, and task-based camera management, in addition to challenges in performance modeling and evaluation [41].

The fundamental techniques for interpreting video and extracting information from it have received a substantial amount of attention. The successive set of challenges addresses how to use these techniques to construct large-scale deployable systems. Several challenges of deployment concern the cost minimization of wiring, low-power hardware for battery-operated camera installations, automatic calibration of cameras, automatic fault detection, and the development of system management tools [41].

Improving smart cameras with additional sensors could transform them into a high-performance multisensor system. By combining visual, acoustic, tactile, or location-based information, the smart cameras become more sensitive and can transmit results that are more precise. This makes the results more widely applicable [11].

The usual scenario in an industrial research and development unit developing vision systems is that a customer presents a system specification and its requirements. The engineer then interprets these requirements into a system design and validates that the system design fulfils the user-specified requirements. The accuracy requirements are typically defined in terms of detection and false alarm rates for objects. The computational requirement is commonly specified by the system response time to the presence of an object, e.g., real-time or delayed. The intention of the vision systems engineer is then to exploit these restrictions and design a system that is operational in the sense that it satisfies customer requirements regarding speed, accuracy, and expenses [58].

The essential dilemma is that there is no known systematic way for vision systems engineers to conduct this translation of the system requirements into a detailed design. It is still an art to engineer systems that satisfy application-specific requirements. There are two basic steps in the design process, which are 1) the choice of the system architecture and the modules to achieve the task, and 2) the statistical analysis and validation of the system to check whether it fulfils the user requirements. In real life, the system design and analysis phases usually follow each other in a cycle until the engineer creates a design and a suitable analysis that satisfies the user specifications [58].

Automation of the design process is a research area with multiple open issues, even though there have been some studies in the context of image analysis, e.g., automatic programming. The systems analysis (performance characterization) phase in the context of video processing systems has been an active region of research in recent years. Performance evaluation of image and video analysis components or systems is an active research topic in the vision community [58].

2) Real-Time Data Constraints: Society requires the results of research activities to address new solutions in video surveillance and sensor networks. Security and safety call for new generations of multimedia surveillance systems, in which computers act not only as supporting platforms but as the essential core of a real-time data comprehension process, which is becoming a reality [57].

Most of the new research activities in surveillance are exploring larger dimensions, such as distributed video surveillance
and biometric systems. In vast distributed environments, the exploitation of networks of small cooperative sensors should considerably improve the surveillance capability of high-level sensors, such as cameras [57].

As system size and diversity grow and the complexity consequently increases, the probability of inconsistency, unreliability, and nonresponsiveness grows. The design and implementation of distributed real-time systems present essential challenges to ensure that these complicated systems function as required. To comprehend or implement any complex system, it is necessary to decompose it into component parts and functions. Distributed systems can be considered in terms of independent concurrent activities that need to exchange data without weakening the overall predictability and performance of the system [59].

There are four crucial objectives that design methods for real-time systems should achieve: 1) to structure the system into concurrent tasks, 2) to develop reusable software through information hiding, 3) to determine the behavioral characteristics of the system, and 4) to analyze the performance of the design against the fulfillment of its requirements [59].

The main motivation for the paradigm shift from a central to a distributed control surveillance system is an improvement of the functionality, availability, and autonomy of the surveillance system. These surveillance systems can respond autonomously to changes in the environment of the system and to detected events in the monitored scenes. A static surveillance system configuration is not desirable. The system architecture must support reconfiguration, migration, quality of service, and power adaptation in analysis tasks [11].

Recently, there has been rapid development in advanced surveillance systems to solve a collection of difficulties that vary from people recognition to behavior analysis, with the intention of enhancing security. These challenges have been approached from different perspectives and followed by a vast selection of system architectures. As cheaper and faster computing hardware, accompanied by efficient and versatile sensors, reached the consumer, there was a rapid development of multicamera systems. In spite of their large area coverage, they introduce new dilemmas that must be addressed in the architectural definition [60].

B. Difficulties in Video Surveillance

In realistic surveillance scenarios, it is impossible for a single sensor to view all the areas simultaneously, or to visually track a moving object for a long period. Objects become occluded by buildings and trees, and the sensors themselves have confined fields of view. A promising solution to this difficulty is to use a network of video sensors to cooperatively monitor all the objects within an extended region and seamlessly track individual objects that cannot be viewed continuously by an individual sensor alone. Some of the technical challenges within this method are to 1) actively control sensors to cooperatively track multiple moving objects, 2) fuse information from multiple sensors into scene-level object representations, 3) survey the scene for events and activities that should "trigger" further processing or operator involvement, and 4) offer human users a high-level interface for dynamic scene visualization and system tasking [12].

Intelligent visual surveillance is a vital application area for computer vision. In situations in which networks of hundreds of cameras are used to cover a wide area, the obvious restriction is the ability of the user to manage vast amounts of information. For this reason, automated tools that can generalize activities or track objects are crucial to the operator. The ability to track objects across (spatially separated) camera scenes is the key to the user requirements. Extensive geometric knowledge of the site and camera positions is normally needed. This type of explicit mapping to camera placement is impossible for large installations, because it requires that the operator knows to which camera to switch when an object vanishes [61].

While detecting and tracking objects are crucial capabilities for smart surveillance, from the perspective of a human intelligence analyst, the most critical challenge in video-based surveillance is interpreting the automatic analysis of data into the detection of events of interest and the identification of trends. Contemporary systems have just begun to examine automatic event detection. The key points are video-based detection and tracking, video-based person identification, large-scale surveillance systems, and automatic system calibration [41].

Object tracking is a vital task for many applications in the area of computer vision, particularly those associated with video surveillance. Recently, the research community has concentrated its interests on developing smart applications to enhance event detection capabilities in video surveillance systems. Advanced visual-based surveillance systems need to process videos resulting from multiple cameras to detect the presence of mobile objects in the monitored scene. Every detected object is tracked, and their trajectories are analyzed to deduce their movement in the scene. Finally, at the highest levels of the system, detected objects are recognized and their behavior is analyzed to verify whether the state is normal or potentially dangerous [62].

Motion detection, tracking, behavior comprehension, and personal identification at a distance can be realized by single-camera-based visual surveillance systems. Multiple-camera-based visual surveillance systems can be helpful, because the surveillance region is enlarged and multiple-view information can overcome occlusion. Tracking with a single camera easily creates ambiguity resulting from occlusion or depth (see Fig. 15). This ambiguity may be removed by another view. Visual surveillance using multiple cameras introduces dilemmas, such as camera installation, camera calibration, object matching, automated camera switching, and data fusion [20].

The recognition of human activities in restricted settings, such as airports, parking lots, and banks, is of significant interest in security and automated surveillance systems. Albanese et al. [63] state that science is still far from achieving a systematic solution to this difficulty. The analysis of activities executed by humans in restricted settings is of great importance in applications such as automated security and surveillance systems. There has been essential interest in this area, where the challenge is to automatically recognize the activities occurring in the field of a camera and detect abnormalities [63].
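The camera-handoff difficulty noted above, that the operator must know which camera to switch to when an object vanishes, can be made concrete with a small sketch. The adjacency table below is hypothetical: the camera names and exit zones are invented for illustration, and a real deployment would derive them from the site geometry that [61] says is normally needed.

```python
# Minimal camera-handoff sketch: when a tracked object leaves a camera
# through a known exit zone, look up which neighboring cameras are the
# plausible next views. Camera IDs and zone names are hypothetical.
HANDOFF = {
    ("cam_lobby", "east_door"): ["cam_corridor_1"],
    ("cam_corridor_1", "north_end"): ["cam_stairs", "cam_corridor_2"],
    ("cam_corridor_2", "exit"): [],  # object leaves the surveyed area
}

def next_cameras(camera: str, exit_zone: str) -> list[str]:
    """Return candidate cameras to probe after a target vanishes."""
    return HANDOFF.get((camera, exit_zone), [])

print(next_cameras("cam_lobby", "east_door"))  # ['cam_corridor_1']
print(next_cameras("cam_corridor_2", "exit"))  # []
```

Even this trivial lookup shows why the approach does not scale: the table grows with every camera pair, which is exactly the "impossible for large installations" problem the text describes.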
506 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 40, NO. 5, SEPTEMBER 2010
Recently, there has been an emphasis on the development of wide-area distributed wireless sensor networks with self-organization capabilities to tolerate sensor failures, changing environmental conditions, and distinct environmental sensing applications. Particularly, mobile sensor networks (MSNs) require support from self-configuration mechanisms to guarantee adaptability, scalability, and optimal performance. The best network configuration is typically time varying and context dependent. Mobile sensors can physically change the network topology, responding to events of the environment or to changes in the mission [80].

E. Energy Efficiency of Remote Sensors

With the emergence of high-resolution image sensors, video transmission requires high-bandwidth communication networks. It is predicted that future intelligent video surveillance will require more computing power and higher communication bandwidth than currently available, as a result of higher resolution images, higher frame rates, and increasing numbers of cameras in video surveillance networks. Novel solutions are needed to handle the demanding restrictions of video surveillance systems, both in terms of communication bandwidth and computing power [81].

Intruder detection and data collection are examples of applications envisioned for battery-powered sensor networks. In many of these applications, the detection of a certain triggering event is the initial step executed prior to any other processing. If trigger events occur seldom, sensor nodes will spend a large majority of their lifetime in the detection loop. The efficient use of system resources in detection then plays a key role in the longevity of the sensor nodes. The energy consumption in the system includes transmission energy; the energy required by processing has not been considered directly in the detection problem [82].

It is crucial to note that technology scaling will gradually decrease processing costs while the transmission cost remains constant. With the usage of compression techniques, one can reduce the number of transmitted bits; the transmission cost is decreased at the price of additional computation. This communication-computation tradeoff is the fundamental idea behind low-energy sensor networks. This is a sharp contrast to classical distributed systems, in which the goal is usually maximizing the speed of execution. The most appropriate metric in wireless networks is power. Experimental measurements indicate that the communication cost in wireless ad hoc networks can be two orders of magnitude higher than the computation cost in terms of consumed power [38].

Integrated video systems (IVSs) are based on the recent development of smart cameras. In addition to high demands in computing performance, power awareness is of major importance in IVSs. Power savings may be achieved by graceful degradation of quality of service (QoS). There has been research on the tradeoff between image quality and power consumption; the work mainly concentrates on sophisticated image compression techniques [83].

A sensor surveillance system comprises a set of wireless sensor nodes and a set of targets to be monitored. The wireless sensor nodes collaborate with each other to survey the targets and transmit the sensed data to a base station. The wireless sensor nodes are powered by batteries and have demanding power requirements. The lifetime is the duration until no target can be surveyed by any wireless sensor node, or data can no longer be forwarded for processing, because of a lack of energy in the sensor nodes [84].

A client-side computing device has a crucial influence on the total performance of a surveillance system. The utilization of a cellular phone as a client of a surveillance system is notable because of its portability and omnipresent computing. The integration of video information and sensor networks established the fundamental infrastructure for new generations of multimedia surveillance systems. In this infrastructure, different media streams, such as audio, video, and sensor signals, would provide an automatic analysis of the controlled environment and a real-time interpretation of the scene [85].

F. Dilemmas in Scalability

A scalable system should be able to integrate the sensor data with contextual information and domain knowledge, provided by both the humans and the physical environment, to maintain a coherent picture of the world over time. The performance of the majority of the systems is far from what is required by real-world applications [86].

A large-scale distributed video surveillance system usually comprises many video sources distributed over a vast area, transmitting live video streams to a central location for monitoring and processing. Contemporary advances in video sensors and the increasing availability of networked digital video cameras have allowed the deployment of large-scale surveillance systems over existing IP-network infrastructure. Implementing an intelligent, scalable, and distributed video surveillance system remains a research problem. Researchers have not paid much attention to the scalability of video surveillance systems. They typically utilize a centralized architecture and assume the availability of all the required system resources, such as computational power and network bandwidth [87].

Fig. 18 presents an example of sensor coverage in a large complex [15]. Each sensor and its coverage is drawn and indicated, e.g., B1, C1, C2, and C3 [15].

The integration of heterogeneous digital networks in the same surveillance architecture needs a video encoding and distribution technology capable of adapting to the currently available bandwidth, which may change in time for the same communication channel, and of being robust against transmission errors. The presence of clients with different processing power and display capabilities accessing video information requires a multiscale representation of the signal. The restrictions of surveillance applications regarding delay, security, complexity, and visual quality introduce strict demands on the technology of the video codec. In a large surveillance system, the digital network that enables remote monitoring, storage, control, and analysis is not within a single local area network (LAN).
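The bandwidth adaptation just described can be sketched as a simple ladder-selection rule: serve each client the highest encoding whose bitrate fits the currently measured channel. The resolution/bitrate ladder and the headroom factor below are illustrative assumptions, not values from any cited codec or system.

```python
# Sketch of bandwidth-adaptive stream selection for heterogeneous
# clients: pick the best encoding whose bitrate fits within the measured
# channel capacity. Ladder entries are illustrative placeholders.
LADDER = [  # (label, width, height, bitrate in kb/s), best quality first
    ("1080p", 1920, 1080, 4500),
    ("720p", 1280, 720, 2500),
    ("480p", 854, 480, 1000),
    ("240p", 426, 240, 300),
]

def select_stream(available_kbps: float, headroom: float = 0.8):
    """Return the best ladder entry fitting within a safety headroom."""
    budget = available_kbps * headroom
    for entry in LADDER:
        if entry[3] <= budget:
            return entry
    return LADDER[-1]  # degrade gracefully to the lowest quality

print(select_stream(3500)[0])  # 720p: enough for 720p, not for 1080p
print(select_stream(100)[0])   # 240p: falls back to the lowest rung
```

Re-evaluating this choice periodically, per client and per channel, is one simple way to honor the requirement that the encoding adapt as available bandwidth changes over time.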
It typically represents a collection of interconnected LANs, wired or wireless, with different bandwidths and QoS. Different types of clients connect to these networks, access one or multiple video sources, decode them at the temporal and spatial resolution they require, and provide different functions [88].

QoS is a fundamental concern in distributed IVSs. In video-based surveillance, typical QoS parameters include frame rate, transfer delay, image resolution, and video-compression rate. The surveillance tasks might also provide multiple QoS levels. In addition, the offered QoS levels can change over time due to user instructions or modifications in the monitored environment. Novel IVS systems need to contain dedicated QoS management mechanisms [11].

1) Scalability in Testing: Testing of individual modules is called unit testing. Integration testing comprises rerunning the unit test cases after the system has been completely integrated. For feature testing, which is also called system testing, testers develop test cases based on the requirements of the system and choose adequate test cases according to every expected result. Load testing comprises four subphases: 1) stability testing, 2) stress testing, 3) reliability testing, and 4) performance testing. Stability testing comprises the installation of the software in a field-like environment and the verification of its ability to appropriately address data continuously. Stress testing comprises the verification of the ability of the software to address heavy loads for short periods without crashing. Reliability testing comprises the verification that the software can fulfill reliability requirements. Performance testing comprises the verification that the software can achieve performance requirements [89].

A substantial pitfall in incorporating intelligent functions into real-world systems is the lack of robustness, the inability to test and validate these systems under a variety of use cases, and the lack of quantification of the performance of the system. Additionally, the system should gracefully degrade in performance as the complexity of the data grows. This is a very open research issue that is vital for the deployment of these systems [3].

Fig. 18. Schematic representation of sensor coverage in a large area [15].

G. Location Difficulties

Location techniques have numerous possible applications in wireless communication, surveillance, military equipment, tracking, and safety applications. Sagiraju et al. [56] concentrate on positioning in cellular wireless networks; the results can be applied to other systems. In the GPS, code-modulated signals are transmitted by numerous satellites, which orbit the earth, and are received by GPS receivers to determine the current position. To calculate a position, the receiver must first acquire the satellite signals. Traditionally, GPS receivers have been designed with specific acquisition and tracking modes. After the signal has been acquired, the receiver switches to the tracking mode. If it loses the lock, the acquisition needs to be repeated [56].

The GPS system comprises at least 24 satellites in orbit around the world, with at least four satellites viewable from any point on Earth at a given time. Despite GPS being a sophisticated solution to the location discovery process, it has multiple network dilemmas. First, GPS is expensive both in terms of hardware and power requirements. Second, GPS requires line-of-sight between the receiver and the satellites. It does not function well when obstructions, such as buildings, block the direct "view" of the satellites. Locations can also be calculated by trilateration. For a trilateration to be successful, a node needs to have at least three neighbors that are already aware of their positions [38].

Security personnel review their wireless video systems for critical incident information. Complementary information in the form of maps and live video streaming can assist in locating the problematic zone and in acting quickly and with knowledge of the situation. The need for providing detailed real-time information to the surveillance agents has been identified and is being addressed by the research community [10].

The analysis and fusion of different sensor information requires mapping observations to a common coordinate system to achieve situational awareness and scene comprehension. The availability of mapping capabilities enables critical operational tasks, such as the fusion of multiple target measurements across the network, deduction of the relative size and speed of the target, and the assignment of tasks to Pan, Tilt, Zoom (PTZ) and mobile sensors. This presents the need for an automated and efficient geo-registration mechanism for all sensors. For instance, target observations from multiple sensors may be mapped to a geodetic coordinate system and then displayed on a map-based interface. Fig. 19 illustrates an example of geo-registration in a visual sensor network [90].

H. Challenges in Privacy

Surveillance of events poses ethical problems. For instance, events involving humans and the right to monitor can conflict with the individual privacy rights of the monitored people. These privacy challenges depend heavily on the shared acceptance of the surveillance task as a necessity by the public with respect to a given application [3].
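Returning briefly to the location difficulties above, the trilateration condition, that a node needs at least three neighbors with known positions, can be turned into a worked sketch. Subtracting the distance-circle equations pairwise yields a linear two-by-two system for the unknown position. The anchor coordinates and distances below are invented example values.

```python
# Sketch of 2-D trilateration: a node with three neighbors at known
# positions and measured distances to each solves for its own position.
# Subtracting circle equations pairwise gives a linear 2x2 system.
def trilaterate(anchors, dists):
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = dists
    # Rows of A @ [x, y] = b from (circle1 - circle2) and (circle1 - circle3)
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21  # nonzero when anchors are not collinear
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

# Node truly at (3, 4); three anchors with exact distances to it:
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
dists = [5.0, 65 ** 0.5, 45 ** 0.5]
print(trilaterate(anchors, dists))  # ~ (3.0, 4.0)
```

With noisy distance measurements, more than three anchors and a least-squares solve would be used instead; the three-anchor case above is the minimum the text describes.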
The suitability of homeland security for this role is plagued by questions ranging from dependability to the risks that tech-

in examining these issues by the research laboratories in the last decade. Currently, the focus is on the application of these integrated systems and the supplying of automated solutions to realistic surveillance dilemmas [100].

There has been a dramatic progression in sensing for security applications and in the analysis and processing of sensor data. O'Sullivan and Pless [101] concentrate on two broad applications of sensors for security, which are 1) anomaly detection, and 2) object or pattern recognition [101].

In anomaly detection, the difficulty is to detect activity, behavior, objects, or substances that are atypical. Typical is defined with respect to historical data and is extremely scenario dependent. Algorithms for anomaly detection must adjust to the scenario and be robust to a vast range of possible assumptions. As a result, there is typically no model for an anomaly, and the models for the location and time are derived from observations. Scenarios that need anomaly detection include perimeter, border, or gateway surveillance [101].

In object or pattern recognition, there is typically a model or prior information of the object or pattern, and the intention is to categorize the pattern. The level of categorization, the required system robustness, and the required system efficiency define and restrict the possible models and processing. The usage of biometrics for the recognition of people is a prime example of an application that is evolving rapidly [101].

Gupta et al. [102] propose a leader-follower system, which receives multimodal sensor information from a wide array of sensors, including radars and cameras. In such a system, a fixed wide field of view (FOV) sensor conducts the duties of the leader. The leader directs follower PTZ cameras to zoom in on targets of interest. One of the typical difficulties in a leader-follower system is that the follower camera can only follow the target as long as it remains in the FOV of the leader. Additionally, inaccuracies in the leader-follower calibration may result in imprecise zooming operations [102].

In general, there is plenty of prototypical research, which has transformed into practical solutions. Environments with multiple sensors include solutions in which electronic locks and user identification have been incorporated into doors, both of which can be perceived as individual sensors. The electronic lock indicates its own status, and the user identification device denotes the access rights of the user. This also forms a simple realization of distributed intelligence and awareness, in which each sensor acts independently but a higher level of deduction can be performed based on the individual information of each sensor. Video surveillance has been employed in solutions such as the detection of the direction of movement. Airports have utilized this technology to automatically raise alarms in situations in which a person goes through a passage in the wrong direction. Audio surveillance technology has been adopted in video camera solutions, which direct the cameras to the location of alarming sounds. Within various police forces, mobile robots have been used to remotely survey a potentially hazardous environment and transmit a video feed to the user. Wireless sensor networks can be used to indicate the locations of nomadic guards to the control room within an indoor perimeter. All of these solutions have their own appropriate middleware and architecture, which serve their unique properties and purposes.

There are several major companies that deliver surveillance systems. GE Security offers integrated security management, intrusion and property protection, and video surveillance [103]. ObjectVideo provides intelligent video software for security, public safety, and other applications [104]. IOImage provides video surveillance, real-time detection, and alert and tracking services [105]. RemoteReality offers video surveillance services, including the detection and tracking of objects, in both visible and infrared thermal spectra [106]. Point Grey Research offers digital camera technology for machine vision and computer vision applications [107].

IX. CONCLUSION

This paper presented the contemporary state of modern surveillance systems for public safety, with a special emphasis on the 3GSSs and especially the difficulties of present surveillance systems. The paper briefly reviewed the background and progression of surveillance systems, including a short review of the first and second generations of surveillance systems. The third generation of surveillance systems addresses topics such as multisensor environments, video surveillance, audio surveillance, wireless sensor networks, distributed intelligence and awareness, and architecture and middleware. According to modern science, the current difficulties of surveillance systems for public safety reside in the fields of the attainment of real-time distributed architecture, awareness and intelligence, existing difficulties in video surveillance, the utilization of wireless networks, the energy efficiency of remote sensors, location difficulties of surveillance personnel, and scalability difficulties. A portion of the difficulties are the same as declared in the 3GSSs, but with detailed descriptions of the characteristics of the dilemmas, such as the architectural, visual, and awareness aspects. Other difficulties are completely novel or substantially highlighted, such as surveillance personnel location, the application of wireless networks, energy efficiency, and scalability.

Novel sensors and new requirements will accompany surveillance systems. This places demanding challenges on the architecture and its real-time functionality. There are existing fundamental concepts, such as video and audio surveillance, but there is a lack of their intelligent usage and especially of their seamless interoperability through a united real-time architecture. Contemporary surveillance systems still reside in a state in which individual concepts may achieve functionality in specific cases, but their comprehensive on-site interoperability is yet to be reached. Substantial evidence of a distributed multisensor intelligent surveillance system does not exist. As the size of surveyed complexes and buildings grows, the deployment of wireless sensors and their energy consumption become more notable. Wireless sensors are easy to deploy, and their low-energy consumption is constantly improving. Scalability issues are fundamentally related to the magnitude of the areas under surveillance. The areas that require surveillance are growing, and the complexity of surveillance systems is also expanding. Both pose great challenges to the scalability aspect.
Different sensors provide different information, and their exploitation in intelligent tasks remains a challenge. Sensor data should be decomposed into fundamental blocks, and the intelligent components should have the responsibility of composing the deductions from them. An attempt should be made to construct a multisensor distributed intelligent surveillance system that functions at a relatively high level, capturing alerting situations with a very low false alarm rate. The surveillance personnel are one of the strongest aspects of a surveillance system and should be retained in the system. Despite advancements in intelligence and awareness, the human being will always be a forerunner in adaptability and deductions.

The endless demand for and abundance of surveillance systems for public safety involve multiple issues that still require resolution. Extensive intelligence and automation, accompanied by energy efficiency and scalability in large areas, must be adopted by suppliers to establish surveillance systems for civic and communal public safety.

REFERENCES

[1] M. Reiter and P. Rohatgi, "Homeland security guest editor's introduction," IEEE Internet Comput., vol. 8, no. 6, pp. 16-17, Nov./Dec. 2004, doi: 10.1109/MIC.2004.62.
[2] M. Valera and S. A. Velastin, "Intelligent distributed surveillance systems: A review," IEE Proc.-Vis. Image Signal Process., vol. 152, no. 2, pp. 192-204, Apr. 2005, doi: 10.1049/ip-vis:20041147.
[3] C. S. Regazzoni, V. Ramesh, and G. L. Foresti, "Scanning the issue/technology special issue on video communications, processing, and understanding for third generation surveillance systems," Proc. IEEE, vol. 89, no. 10, pp. 1355-1367, Oct. 2001, doi: 10.1109/5.959335.
[4] A. C. M. Fong and S. C. Hui, "Web-based intelligent surveillance system for detection of criminal activities," Comput. Control Eng. J., vol. 12, no. 6, pp. 263-270, Dec. 2001.
[5] K. Müller, A. Smolic, M. Dröse, P. Voigt, and T. Wiegand, "3-D construction of a dynamic environment with a fully calibrated background for traffic scenes," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 4, pp. 538-549, Apr. 2005, doi: 10.1109/TCSVT.2005.844452.
[6] W. M. Thames, "From eye to electron—Management problems of the combat surveillance research and development field," IRE Trans. Mil. Electron., vol. MIL-4, no. 4, pp. 548-551, Oct. 1960, doi: 10.1109/IRET-MIL.1960.5008288.
[7] H. A. Nye, "The problem of combat surveillance," IRE Trans. Mil. Electron., vol. MIL-4, no. 4, pp. 551-555, Oct. 1960, doi: 10.1109/IRET-MIL.1960.5008289.
[8] A. S. White, "Application of signal corps radar to combat surveillance," IRE Trans. Mil. Electron., vol. MIL-4, no. 4, pp. 561-565, Oct. 1960, doi: 10.1109/IRET-MIL.1960.5008291.
[9] C. E. Wolfe, "Information system displays for aerospace surveillance applications," IEEE Trans. Aerosp., vol. AS-2, no. 2, pp. 204-210, Apr. 1964, doi: 10.1109/TA.1964.4319590.
[10] R. Ott, M. Gutierrez, D. Thalmann, and F. Vexo, "Advanced virtual reality technologies for surveillance and security applications," in Proc. ACM SIGGRAPH Int. Conf. Virtual Real. Continuum Its Appl. (VCRIA), Jun. 2006, pp. 163-170.
[11] M. Bramberger, A. Doblander, A. Maier, B. Rinner, and H. Schwabach, "Distributed embedded smart cameras for surveillance applications," Computer, vol. 39, no. 2, pp. 68-75, Feb. 2006, doi: 10.1109/MC.2006.55.
[12] R. T. Collins, A. J. Lipton, H. Fujiyoshi, and T. Kanade, "Algorithms for cooperative multisensor surveillance," Proc. IEEE, vol. 89, no. 10, pp. 1456-1477, Oct. 2001, doi: 10.1109/5.959341.
[13] M. M. Trivedi, T. L. Gandhi, and K. S. Huang, "Homeland security distributed interactive video arrays for event capture and enhanced situational awareness," IEEE Intell. Syst., vol. 20, no. 5, pp. 58-66, Sep./Oct. 2005, doi: 10.1109/MIS.2005.86.
[14] F. Castanedo, M. A. Patricio, J. Garcia, and J. M. Molina, "Extending surveillance systems capabilities using BDI cooperative sensor agents," in Proc. 4th Int. Workshop Video Surveill. Sens. Netw. (VSSN), Oct. 2006, pp. 131-138.
[15] S. A. Velastin, B. A. Boghossian, B. P. I. Lo, J. Sun, and M. A. Vicencio-Silva, "PRISMATICA: Toward ambient intelligence in public transport environments," IEEE Trans. Syst., Man, Cybern. A, Syst. Hum., vol. 35, no. 1, pp. 164-182, Jan. 2005, doi: 10.1109/TSMCA.2004.838461.
[16] Z. Rasheed, X. Cao, K. Shafique, H. Liu, L. Yu, M. Lee, K. Ramnath, T. Choe, O. Javed, and N. Haering, "Automated visual analysis in large scale sensor networks," in Proc. 2nd ACM/IEEE Int. Conf. Distrib. Smart Cameras (ICDSC), Sep. 2008, pp. 1-10, doi: 10.1109/ICDSC.2008.4635678.
[17] P. K. Atrey and A. El Saddik, "Confidence evolution in multimedia systems," IEEE Trans. Multimedia, vol. 10, no. 7, pp. 1288-1298, Nov. 2008, doi: 10.1109/TMM.2008.2004907.
[18] I. N. Junejo, X. Cao, and H. Foroosh, "Autoconfiguration of a dynamic nonoverlapping camera network," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 4, pp. 803-816, Aug. 2007, doi: 10.1109/TSMCB.2007.895366.
[19] D. Makris and T. Ellis, "Learning semantic sense models from observing activity in visual surveillance," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 35, no. 3, pp. 397-408, Jun. 2005.
[20] W. Hu, T. Tan, L. Wang, and S. Maybank, "A survey on visual surveillance of object motion and behaviors," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 34, no. 3, pp. 334-352, Aug. 2004, doi: 10.1109/TSMCC.2004.829274.
[21] C. Kreucher, K. Kastella, and A. O. Hero III, "Multitarget tracking using the joint multitarget probability density," IEEE Trans. Aerosp. Electron. Syst., vol. 41, no. 4, pp. 1396-1414, Oct. 2005, doi: 10.1109/TAES.2005.1561892.
[22] M. Shah, O. Javed, and K. Shafique, "Automated visual surveillance in realistic scenarios," IEEE Multimedia, vol. 14, no. 1, pp. 30-39, Jan.-Mar. 2007, doi: 10.1109/MMUL.2007.3.
[23] G. L. Foresti, C. Micheloni, L. Snidaro, P. Remagnino, and T. Ellis, "Active video-based surveillance system," IEEE Signal Process. Mag., vol. 22, no. 2, pp. 25-37, Mar. 2005, doi: 10.1109/MSP.2005.1406473.
[24] L. Li, W. Huang, I. Y.-H. Gu, R. Luo, and Q. Tian, "An efficient sequential approach to tracking multiple objects through crowds for real-time intelligent CCTV systems," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 5, pp. 1254-1269, Oct. 2008, doi: 10.1109/TSMCB.2008.927265.
[25] L. Maddalena and A. Petrosino, "A self-organizing approach to background subtraction for visual surveillance applications," IEEE Trans. Image Process., vol. 17, no. 7, pp. 1168-1177, Jul. 2008, doi: 10.1109/TIP.2008.924285.
[26] Y. Li, C. Huang, and R. Nevatia, "Learning to associate: Hybrid boosted multi-target tracker for crowded scene," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2009, pp. 2953-2960, doi: 10.1109/CVPRW.2009.5206735.
[27] A. Leykin, Y. Ran, and R. Hammoud, "Thermal-visible video fusion for moving target tracking and pedestrian classification," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2007, pp. 1-8, doi: 10.1109/CVPR.2007.383444.
[28] A. Leykin and R. Hammoud, "Robust multi-pedestrian tracking in thermal-visible surveillance videos," in Proc. Conf. Comput. Vis. Pattern Recognit. Workshop (CVPRW), Jun. 2006, pp. 136-143, doi: 10.1109/CVPRW.2006.175.
[29] W. K. Wong, P. N. Tan, C. K. Loo, and W. S. Lim, "An effective surveillance system using thermal camera," in Proc. Int. Conf. Signal Acquis. Process. (ICSAP), Apr. 2009, pp. 13-17, doi: 10.1109/ICSAP.2009.12.
[30] D. Istrate, E. Castelli, M. Vacher, L. Besacier, and J. F. Serignat, "Information extraction from sound for medical telemonitoring," IEEE Trans. Inf. Technol. Biomed., vol. 10, no. 2, pp. 264-274, Apr. 2006, doi: 10.1109/TITB.2005.859889.
[31] M. Stanacevic and G. Cauwenberghs, "Micropower gradient flow acoustic localizer," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 10, pp. 2148-2157, Oct. 2005, doi: 10.1109/TCSI.2005.853356.
[32] P. Julian, A. G. Andreou, L. Riddle, S. Shamma, D. H. Goldberg, and G. Cauwenberghs, "A comparative study of sound localization algorithms for energy aware sensor network nodes," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 51, no. 4, pp. 640-648, Apr. 2004, doi: 10.1109/TCSI.2004.826205.
[33] A. F. Smeaton and M. McHugh, "Towards event detection in an audio-based sensor network," in Proc. 3rd Int. Workshop Video Surveill. Sens. Netw. (VSSN), Nov. 2005, pp. 87-94.
[34] J. Chen, Z. Safar, and J. A. Sorensen, "Multimodal wireless networks: Communication and surveillance on the same infrastructure," IEEE
514 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 40, NO. 5, SEPTEMBER 2010
Trans. Inf. Forensics Secur., vol. 2, no. 3, pp. 468–484, Sep. 2007, doi: 10.1109/TIFS.2007.904944.
[35] G. Xing, C. Lu, R. Pless, and Q. Huang, "Impact of sensing coverage on greedy geographic routing algorithms," IEEE Trans. Parallel Distrib. Syst., vol. 17, no. 4, pp. 348–360, Apr. 2006, doi: 10.1109/TPDS.2006.48.
[36] R. R. Brooks, P. Ramanathan, and A. M. Sayeed, "Distributed target classification and tracking in sensor networks," Proc. IEEE, vol. 91, no. 8, pp. 1163–1171, Aug. 2003, doi: 10.1109/JPROC.2003.814923.
[37] A. M. Tabar, A. Keshavarz, and H. Aghajan, "Smart home care network using sensor fusion and distributed vision-based reasoning," in Proc. 4th Int. Workshop Video Surveill. Sens. Netw. (VSSN), Oct. 2006, pp. 145–154.
[38] S. Megerian, F. Koushanfar, M. Potkonjak, and M. B. Srivastava, "Worst and best-case coverage in sensor networks," IEEE Trans. Mobile Comput., vol. 4, no. 1, pp. 84–92, Jan./Feb. 2005, doi: 10.1109/TMC.2005.15.
[39] V. Chandramohan and K. Christensen, "A first look at wired sensor networks for video surveillance systems," in Proc. 27th Annu. IEEE Conf. Local Comput. Netw. (LCN), Nov. 2002, pp. 728–729.
[40] Z. Dimitrijevic, G. Wu, and E. Y. Chang, "SFINX: A multi-sensor fusion and mining system," in Proc. 2003 Joint Conf. Fourth Int. Conf. Inf., Commun. Signal Process., Dec. 2003, vol. 2, pp. 1128–1132, doi: 10.1109/ICICS.2003.1292636.
[41] A. Hampapur, L. Brown, J. Connell, A. Ekin, N. Haas, M. Lu, H. Merkl, S. Pankanti, A. Senior, C.-F. Shu, and Y. L. Tian, "Smart video surveillance: Exploring the concept of multiscale spatiotemporal tracking," IEEE Signal Process. Mag., vol. 22, no. 2, pp. 38–51, Mar. 2005, doi: 10.1109/MSP.2005.1406476.
[42] S. Bandini and F. Sartori, "Improving the effectiveness of monitoring and control systems exploiting knowledge-based approaches," Pers. Ubiquitous Comput., vol. 9, no. 5, pp. 301–311, Sep. 2005, doi: 10.1007/s00779-004-0334-3.
[43] H. Detmold, A. Dick, K. Falkner, D. S. Munro, A. Van Den Hengel, and P. Morrison, "Middleware for video surveillance networks," in Proc. 1st Int. Workshop Middleware Sens. Netw. (MidSens), Nov.–Dec. 2006, pp. 31–36.
[44] R. Seals, "Mobile robotics," Electron. Power, vol. 30, no. 7, pp. 543–546, Jul. 1984, doi: 10.1049/ep.1984.0286.
[45] S. Harmon, "The ground surveillance robot (GSR): An autonomous vehicle designed to transit unknown terrain," IEEE J. Robot. Autom., vol. RA-3, no. 3, pp. 266–279, Jun. 1987, doi: 10.1109/JRA.1987.1087091.
[46] S. Harmon, G. Bianchini, and B. Pinz, "Sensor data fusion through a distributed blackboard," in Proc. IEEE Int. Conf. Robot. Autom., Apr. 1986, pp. 1449–1454.
[47] J. White, H. Harvey, and K. Farnstrom, "Testing of mobile surveillance robot at a nuclear power plant," in Proc. IEEE Int. Conf. Robot. Autom., Mar. 1987, pp. 714–719.
[48] D. Di Paola, D. Naso, A. Milella, G. Cicirelli, and A. Distante, "Multi-sensor surveillance of indoor environments by an autonomous mobile robot," in Proc. 15th Int. Conf. Mechatronics Mach. Vis. Pract. (M2VIP), Dec. 2008, pp. 23–28, doi: 10.1109/MMVIP.2008.474501.
[49] A. Bakhtari, M. D. Naish, M. Eskandari, E. A. Cloft, and B. Benhabib, "Active-vision-based multisensor surveillance—An implementation," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 36, no. 5, pp. 668–680, Sep. 2006, doi: 10.1109/TSMCC.2005.855525.
[50] J. J. Valencia-Jimenez and A. Fernandez-Caballero, "Holonic multi-agent systems to integrate multi-sensor platforms in complex surveillance," in Proc. IEEE Int. Conf. Video Signal Based Surveill. (AVSS), Nov. 2006, p. 49, doi: 10.1109/AVSS.2006.58.
[51] Y.-C. Tseng, Y.-C. Wang, K.-Y. Cheng, and Y.-Y. Hsieh, "iMouse: An integrated mobile surveillance and wireless sensor system," Computer, vol. 40, no. 6, pp. 60–66, Jun. 2007, doi: 10.1109/MC.2007.211.
[52] J. N. K. Liu, M. Wang, and B. Feng, "iBotGuard: An internet-based intelligent robot security system using invariant face recognition against intruder," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 35, no. 1, pp. 97–105, Feb. 2005, doi: 10.1109/TSMCC.2004.840051.
[53] H. Liu, O. Javed, G. Taylor, X. Cao, and N. Haering, "Omni-directional surveillance for unmanned water vehicles," presented at the 8th Int. Workshop Vis. Surveill., Marseilles, France, Oct. 2008.
[54] I. Pavlidis, V. Morellas, P. Tsiamyrtzis, and S. Harp, "Urban surveillance systems: From the laboratory to the commercial world," Proc. IEEE, vol. 89, no. 10, pp. 1478–1497, Oct. 2001, doi: 10.1109/5.959342.
[55] J. Krikke, "Intelligent surveillance empowers security analysts," IEEE Intell. Syst., vol. 21, no. 3, pp. 102–104, May/Jun. 2006.
[56] P. K. Sagiraju, S. Agaian, and D. Akopian, "Reduced complexity acquisition of GPS signals for software embedded applications," IEE Proc.-Radar Sonar Navig., vol. 153, no. 1, pp. 69–78, Feb. 2006, doi: 10.1049/ip-rsn:20050091.
[57] R. Cucchiara, "Multimedia surveillance systems," in Proc. 3rd Int. Workshop Video Surveill. Sens. Netw. (VSSN), Nov. 2005, pp. 3–10.
[58] M. Greiffenhagen, D. Comaniciu, H. Niemann, and V. Ramesh, "Design, analysis, and engineering of video monitoring systems: An approach and a case study," Proc. IEEE, vol. 89, no. 10, pp. 1498–1517, Oct. 2001, doi: 10.1109/5.959343.
[59] M. Valera and S. A. Velastin, "Real-time architecture for a large distributed surveillance system," in Proc. IEE Intell. Surveill. Syst., London, U.K., Feb. 2004, pp. 41–45.
[60] C. Micheloni, L. Snidaro, L. Visentini, and G. L. Foresti, "Sensor bandwidth assignment through video annotation," in Proc. IEEE Int. Conf. Video Signal Based Surveill. (AVSS), Nov. 2006, p. 48, doi: 10.1109/AVSS.2006.102.
[61] R. Bowden and P. KaewTraKulPong, "Towards automated wide area visual surveillance: Tracking objects between spatially-separated, uncalibrated views," IEE Proc.-Vis. Image Signal Process., vol. 152, no. 2, pp. 213–223, Apr. 2005, doi: 10.1049/ip-vis:20041233.
[62] C. Micheloni, G. L. Foresti, and L. Snidaro, "A network of co-operative cameras for visual surveillance," IEE Proc.-Vis. Image Signal Process., vol. 152, no. 2, pp. 205–212, Apr. 2005, doi: 10.1049/ip-vis:20041256.
[63] M. Albanese, R. Chellappa, V. Moscato, A. Picariello, V. S. Subrahmanian, P. Turaga, and O. Udrea, "A constrained probabilistic Petri net framework for human activity detection in video," IEEE Trans. Multimedia, vol. 10, no. 8, pp. 1429–1443, Dec. 2008, doi: 10.1109/TMM.2008.2010417.
[64] L. Yuan, A. Haizhou, T. Tamashita, L. Shihong, and M. Kaware, "Tracking in low frame rate video: A cascade particle filter with discriminative observers of different life spans," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 10, pp. 1728–1740, Oct. 2008, doi: 10.1109/TPAMI.2008.73.
[65] R. Cucchiara, C. Grana, A. Prati, and R. Vezzani, "Computer vision system for in-house video surveillance," IEE Proc.-Vis. Image Signal Process., vol. 152, no. 2, pp. 242–249, Apr. 2005, doi: 10.1049/ip-vis:20041215.
[66] J. A. Besada, J. Garcia, J. Portillo, J. M. Molina, A. Varona, and G. Gonzalez, "Airport surface surveillance based on video images," IEEE Trans. Aerosp. Electron. Syst., vol. 41, no. 3, pp. 1075–1082, Jul. 2005, doi: 10.1109/TAES.2005.1541452.
[67] S. M. Khan and M. Shah, "Tracking multiple occluding people by localizing on multiple scene planes," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 3, pp. 505–519, Mar. 2009, doi: 10.1109/TPAMI.2008.102.
[68] W. Hu, M. Hu, X. Zhou, T. Tan, J. Lou, and S. Maybank, "Principal axis-based correspondence between multiple cameras for people tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 4, pp. 663–671, Apr. 2006, doi: 10.1109/TPAMI.2006.80.
[69] D.-Y. Chen, K. Cannons, H.-R. Tyan, S.-W. Shih, and H.-Y. M. Liao, "Spatiotemporal motion analysis for the detection and classification of moving targets," IEEE Trans. Multimedia, vol. 10, no. 8, pp. 1578–1591, Dec. 2008, doi: 10.1109/TMM.2008.2007289.
[70] F. Yin, D. Makris, and S. A. Velastin, "Time efficient ghost removal for motion detection in visual surveillance systems," Electron. Lett., vol. 44, no. 23, pp. 1351–1353, Nov. 2008, doi: 10.1049/el:20082118.
[71] Y. Wang, D. Bowman, D. Krum, E. Coelho, T. Smith-Jackson, D. Bailey, S. Peck, S. Anand, T. Kennedy, and Y. Abdrazakov, "Effects of video placement and spatial context presentation on path reconstruction tasks with contextualized videos," IEEE Trans. Vis. Comput. Graph., vol. 14, no. 6, pp. 1755–1762, Nov./Dec. 2008, doi: 10.1109/TVCG.2008.126.
[72] W. Hu, D. Xie, Z. Fu, W. Zeng, and S. Maybank, "Semantic-based surveillance video retrieval," IEEE Trans. Image Process., vol. 16, no. 4, pp. 1168–1181, Apr. 2007, doi: 10.1109/TIP.2006.891352.
[73] L. Snidaro, R. Niu, G. L. Foresti, and P. K. Varshney, "Quality-based fusion of multiple video sensors for video surveillance," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 4, pp. 1044–1051, Aug. 2007, doi: 10.1109/TSMCB.2007.895331.
[74] P. K. Atrey, M. S. Kankanhalli, and R. Jain, "Timeline-based information assimilation in multimedia surveillance and monitoring systems," in Proc. 3rd Int. Workshop Video Surveill. Sens. Netw. (VSSN), Nov. 2005, pp. 103–112.
[75] B. Hardian, "Middleware support for transparency and user control in context-aware systems," presented at the 3rd Int. Middleware Doctoral Symp. (MDS), Melbourne, Australia, Nov.–Dec. 2006.
RÄTY: SURVEY ON CONTEMPORARY REMOTE SURVEILLANCE SYSTEMS FOR PUBLIC SAFETY 515
[76] A. Dore, M. Pinasco, and C. S. Regazzoni, "A bio-inspired learning approach for the classification of risk zones in a smart space," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2007, pp. 1–8, doi: 10.1109/CVPR.2007.383440.
[77] E. Blasch and S. Plano, "Proactive decision fusion for site security," in Proc. 8th Int. Conf. Inf. Fusion, Jul. 2005, pp. 1584–1591, doi: 10.1109/ICIF.2005.1592044.
[78] F. Castanedo, M. A. Patricio, J. Garcia, and J. M. Molina, "Robust data fusion in a visual sensor multi-agent architecture," in Proc. 10th Int. Conf. Inf. Fusion, Jul. 2007, pp. 1–7, doi: 10.1109/ICIF.2007.4408121.
[79] Y.-C. Tseng, T.-Y. Lin, Y.-K. Liu, and B.-R. Lin, "Event-driven messaging services over integrated cellular and wireless sensor networks: Prototyping experiences of a visitor system," IEEE J. Sel. Areas Commun., vol. 23, no. 6, pp. 1133–1145, Jun. 2005, doi: 10.1109/JSAC.2005.845623.
[80] J.-S. Lee, "A Petri net design of command filters for semiautonomous mobile sensor networks," IEEE Trans. Ind. Electron., vol. 55, no. 4, pp. 1835–1841, Apr. 2008, doi: 10.1109/TIE.2007.911926.
[81] E. Norouznezhad, A. Bigdeli, A. Postula, and B. C. Lovell, "A high resolution smart camera with GigE vision extension for surveillance applications," in Proc. Second ACM/IEEE Int. Conf. Distrib. Smart Cameras, Sep. 2008, pp. 1–8, doi: 10.1109/ICDSC.2008.4635711.
[82] S. Appadwedula, V. V. Veeravalli, and D. L. Jones, "Energy-efficient detection in sensor networks," IEEE J. Sel. Areas Commun., vol. 23, no. 4, pp. 693–702, Apr. 2005, doi: 10.1109/JSAC.2005.843536.
[83] A. Maier, B. Rinner, W. Schriebl, and H. Schwabach, "Online multi-criterion optimization for dynamic power-aware camera configuration in distributed embedded surveillance clusters," in Proc. 20th Int. Conf. Adv. Inf. Netw. Appl. (AINA), Apr. 2006, pp. 307–312, doi: 10.1109/AINA.2006.250.
[84] H. Liu, X. Jia, P.-J. Wan, C.-W. Yi, S.-K. Makki, and N. Pissinou, "Maximizing lifetime of sensor surveillance systems," IEEE/ACM Trans. Netw., vol. 15, no. 2, pp. 334–345, Apr. 2007, doi: 10.1109/TNET.2007.892883.
[85] Y. Imai, Y. Hori, and S. Masuda, "Development and a brief evaluation of a web-based surveillance system for cellular phones and other mobile computing clients," in Proc. Conf. Hum. Syst. Interact., May 2008, pp. 526–531, doi: 10.1109/HSI.2008.4581494.
[86] V. A. Petrushin, O. Shakil, D. Roqueiro, G. Wei, and A. V. Gershman, "Multiple-sensor indoor surveillance system," in Proc. 3rd Can. Conf. Comput. Robot Vis., Jun. 2006, p. 40, doi: 10.1109/CRV.2006.50.
[87] P. Korshunov and W. T. Ooi, "Critical video quality for distributed automated video surveillance," in Proc. 13th Annu. ACM Int. Conf. Multimedia, Nov. 2005, pp. 151–160.
[88] A. May, J. Teh, P. Hobson, F. Ziliani, and J. Reichel, "Scalable video requirements for surveillance systems," in Proc. IEE Intell. Surveill. Syst., Feb. 2004, pp. 17–20.
[89] A. Avritzer, J. P. Ros, and E. Weyuker, "Reliability testing of rule-based systems," IEEE Softw., vol. 13, no. 5, pp. 76–82, Sep. 1996, doi: 10.1109/52.536461.
[90] K. Shafique, F. Guo, G. Aggarwal, Z. Rasheed, X. Cao, and N. Haering, "Automatic geo-registration and inter-sensor calibration in large sensor networks," in Smart Cameras. New York: Springer-Verlag, 2009, pp. 245–257.
[91] C. Carincotte, X. Desurmont, B. Ravera, F. Bremond, J. Orwell, S. A. Velastin, J. M. Odobez, B. Corbucci, J. Palo, and J. Cernocky, "Toward generic intelligent knowledge extraction from video and audio: The EU-funded CARETAKER project," in Proc. Inst. Eng. Technol. Conf. Crime Secur., Jun. 2006, pp. 470–475.
[92] S. Fleck and W. Strasser, "Smart camera based monitoring system and its application to assisted living," Proc. IEEE, vol. 96, no. 10, pp. 1698–1714, Oct. 2008, doi: 10.1109/JPROC.2008.928765.
[93] M. S. Kankanhalli and Y. Rui, "Application potential of multimedia information retrieval," Proc. IEEE, vol. 96, no. 4, pp. 712–720, Apr. 2008, doi: 10.1109/JPROC.2008.916383.
[94] A. Prati, R. Vezzani, L. Benini, E. Farella, and P. Zappi, "An integrated multi-modal sensor network for video surveillance," in Proc. ACM Int. Workshop Video Surveill. Sens. Netw., Nov. 2005, pp. 95–102.
[95] S. Calderara, R. Cucchiara, and A. Prati, "Multimedia surveillance: Content-based retrieval with multicamera people tracking," in Proc. ACM Int. Workshop Video Surveill. Sens. Netw., Oct. 2006, pp. 95–100.
[96] P. K. Atrey, M. S. Kankanhalli, and R. Jain, "Information assimilation framework for event detection in multimedia surveillance systems," ACM Multimedia Syst. J., vol. 12, no. 3, pp. 239–253, Dec. 2006.
[97] J. Kim, J. Park, K. Lee, K.-H. Baek, and S. Kim, "A portable surveillance camera architecture using one-bit motion detection," IEEE Trans. Consum. Electron., vol. 53, no. 4, pp. 1254–1259, Nov. 2007, doi: 10.1109/TCE.2007.4429209.
[98] L. Havasi, Z. Szlavik, and T. Sziranyi, "Detection of gait characteristics for scene registration in video surveillance system," IEEE Trans. Image Process., vol. 16, no. 2, pp. 503–510, Feb. 2007, doi: 10.1109/TIP.2006.88839.
[99] Y. Huang, X. Ao, Y. Li, and C. Wang, "Multiple biometrics system based on DavinCi platform," in Proc. Int. Symp. Inf. Sci. Eng. (ISISE), Dec. 2008, pp. 88–92, doi: 10.1109/ISISE.2008.163.
[100] L.-Q. Xu, "Issues in video analytics and surveillance systems: Research/prototyping vs. applications/user requirements," in Proc. IEEE Conf. Adv. Video Signal Based Surveill. (AVSS), Sep. 2007, pp. 10–14, doi: 10.1109/AVSS.2007.4425278.
[101] J. A. O'Sullivan and R. Pless, "Advances in security technologies: Imaging, anomaly detection, and target and biometric recognition," in Proc. IEEE/MTT-S Int. Microw. Symp., Jun. 2007, pp. 761–764, doi: 10.1109/MWSYM.2007.380051.
[102] H. Gupta, X. Cao, and N. Haering, "Map-based active leader-follower surveillance system," presented at the Workshop Multi-Camera Multi-Modal Sens. Fusion Algorithms Appl. (M2SFA2), Marseille, France, Oct. 2008.
[103] GE Security website. (2009). [Online]. Available: http://www.gesecurity.com/portal/site/GESecurity
[104] ObjectVideo website. (2009). [Online]. Available: http://www.objectvideo.com/company/
[105] IOImage website. (2009). [Online]. Available: http://www.ioimage.com/
[106] RemoteReality website. (2009). [Online]. Available: http://www.remotereality.com/
[107] PointGrey website. (2009). [Online]. Available: http://www.ptgrey.com/

Tomi D. Räty received the Ph.D. degree in information processing science from the University of Oulu, Oulu, Finland, in 2008.
He is currently a Senior Research Scientist and a Team Leader of the Software Platforms Team at VTT Technical Research Centre of Finland, Oulu. His research interests include surveillance systems, model-based testing, network monitoring, software platforms, and middleware. He is the author or coauthor of more than 20 papers published in various conferences and journals.
Dr. Räty has served as a Reviewer for the IEEE TRANSACTIONS ON MOBILE COMPUTING and for several conferences.