Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems
Abstract: There is a growing interest in IoT-enabled smart buildings. However, the storage and
analysis of large amount of high-speed real-time smart building data is a challenging task. There
are a number of contemporary Big Data management technologies and advanced analytics techniques
that can be used to deal with this challenge. There is a need for an integrated IoT Big Data
Analytics (IBDA) framework to fill the research gap in the Big Data Analytics domain. This paper
presents one such IBDA framework for the storage and analysis of real time data generated from IoT
sensors deployed inside the smart building. The initial version of the IBDA framework has been
developed by using Python and the Big Data Cloudera platform. The applicability of the framework is
demonstrated with the help of a scenario involving the analysis of real-time smart building data
for automatically managing the oxygen level, luminosity and smoke/hazardous gases in different
parts of the smart building. The initial results indicate that the proposed framework is fit for
the purpose and seems useful for IoT-enabled Big Data Analytics for smart buildings. The key
contribution of this paper is the complex integration of Big Data Analytics and IoT for addressing
the large volume and velocity challenge of real-time data in the smart building domain. This
framework will be further evaluated and extended through its implementation in other domains.

Keywords: IoT, Smart Building, Big Data, Cloudera, Digital-Physical Ecosystems, Apache Flume,
Apache Spark, Real Time Data Analytics.

I. INTRODUCTION
With every passing day, an increasing number of Internet of Things (IoT) devices are connected
around the globe. In 2011, CISCO highlighted that the sum of interconnected objects around the
world exceeded the total number of human beings on our planet [1]. This report from CISCO also
highlighted that in 2020, the number of connected objects will reach 50 billion, causing drastic
changes and developments in the digital domain. There are various domains in which IoT is and will
continue to make considerable improvements to human life [2]. One such domain is IoT-enabled smart
buildings or smart cities. IoT sensors inside the buildings continuously monitor the environment
and the data from these IoT sensors can be collected and stored on a server from where it can be
extracted, transformed and analyzed on demand or in real-time [3].

Advanced analytics techniques, such as machine learning algorithms, can be used to analyze the IoT
data for effective and efficient operation of smart buildings. For example, the data generated from
the IoT sensors can be extracted and analyzed in real time or near real time for not only energy
efficiency improvement of smart building but also for the health, safety and comfort of the
residents [4]. Advanced analytics techniques can be integrated with Big Data management
technologies to effectively support smart building analytics and autonomous decision making. Thus,
this research focuses on the challenge of real time analytics of the large amount of high speed
data generated from the smart building IoT devices. It provides an integrated framework of IoT Big
Data Analytics (IBDA). The proposed framework has three major integrated technology components: IoT
Sensors, Big Data Management and Data Analytics. The applicability of the proposed framework is
evaluated by using it to monitor and control the oxygen level, luminosity and smoke/hazardous gases
of the smart building to improve user experience, comfortability, safety and health.

The paper is organized as follows: firstly, it provides the research background. Secondly, it
discusses the research method. Thirdly, it provides the details and applications of the proposed
framework. Fourthly, it discusses the related work and contribution. Finally, it concludes with
future research directions.

II. RESEARCH BACKGROUND
A. IoT
IoT refers to using multiple connected devices via a common network to gather and make use of the
data generated by embedded sensors, actuators and other physical objects [5]. IoT has and will
continue to spread promptly in upcoming years. This emerging technological domain will unleash new
horizons and aspects of the services that will result in an improved quality of life of the
consumers and will also prove fruitful for enterprises in their productivity [6]. For consumers,
IoT has the potential to provide solutions in various sectors including but not limited to health,
security, energy efficiency, user comfortability and many others. On an enterprise level,
is to design a framework. Thus, for this reason, this study embraced a design research (DR)
approach [19]. In the DR approach, a framework is developed and evaluated for a real world or
perceived problem [19]. The DR approach can be best explained in three major steps:

1. Literature Review: Relevant literature and related work is first studied to identify the
   research gap and the research questions. In this research, the literature related to real time
   data analytics frameworks for IoT generated data and the techniques to control smart buildings
   in real time was reviewed.
2. Design: In the second step of the DR, the implementation or design of the research question is
   addressed. In this research we designed the IBDA framework for real time data analytics for
   smart buildings using IoT sensors and Apache Hadoop.
3. Evaluation: In the third and final step of the DR approach, the framework is evaluated using a
   real-world problem. In this research, the developed framework was evaluated for a smart building
   scenario.

The scope of this framework is limited to the data generation, data extraction, data ingestion into
HDFS (Hadoop Distributed File System), data visualization, data analytics and real time control of
the smart building. This research is intended to facilitate the construction and development of
smart buildings for the real time monitoring and control of various facilities of the building
including HVAC (Heating, Ventilation and Air Conditioning), luminosity and other features for user
comfortability.

IV. THE IBDA FRAMEWORK
The proposed IBDA framework has three major integrated technology components: IoT Sensors, Big Data
Management and Analytics. Firstly, instead of using multiple physical IoT oxygen sensors,
luminosity sensors and smoke/hazardous gases detectors, we used fifteen virtual sensors to generate
the data through software. For this purpose, we have implemented Python code to simulate five (IoT)
oxygen sensors, five smoke/hazardous gases detectors and five luminosity sensors inside the smart
building. These sensors are placed at five different locations of the smart building, with each
location having a group of three different sensors.

Secondly, these sensors generate a large amount of data (Big Data), which is then sent to a TCP
(Transmission Control Protocol) port where an Apache Flume agent is already running and listening
to this port. The Apache Flume agent is set up with the IoT generated real-time data streams as the
source and the HDFS as the sink, so it stores the data onto HDFS after receiving it from the TCP
port. We used the Cloudera [17] Big Data platform (Virtual Machine for the Apache Hadoop
environment) for storing the data.

Thirdly, we used PySpark [20] scripts (Analytics) to listen to and analyze, in real time, the data
that is being stored on HDFS. This PySpark code analyzed the oxygen concentration, smoke detector
data and luminosity sensor data coming from the sensors. If the level of oxygen concentration
generated from the five IoT oxygen sensors is within the pre-defined comfortable range for the user
or resident of the smart building, then no action is required. We have chosen a threshold value of
14, which denotes the percentage concentration of oxygen in a particular room or level. If the
level of oxygen concentration is below the user comfortability level, then an oxygen pump, which
for this research is a virtual oxygen pump, is turned ON and remains ON until the level of oxygen
detected by the same sensor rises above the specified threshold level. This is done to restore the
required level of oxygen in the particular room or level where the oxygen sensor read a low value
of oxygen concentration, i.e. below the threshold. If the value of the oxygen concentration is
within the acceptable limit, i.e. above 14, it will print "Oxygen level ok" on the Cloudera
terminal, indicating that no action is required. Otherwise the terminal displays "Oxygen pump X
turned ON", where X represents the pump number. When the pump is turned OFF, the Cloudera terminal
prints "Oxygen pump X turned OFF". If any of the smoke detectors detect smoke or hazardous gases,
the relevant fire alarm is turned ON. This is represented in Cloudera by printing out "Fire alarm X
turned ON", where X represents the room or level where smoke is detected. The smoke alarm is not
turned OFF in the code. It is assumed that the fire alarm will have to be reset manually.
Similarly, if during the data analytics process it is detected that a particular luminosity sensor
reported a low luminosity level, the lights in the room where that particular sensor is installed
are turned ON. The Cloudera terminal will print "Lights in room/level X turned ON", where X
represents the particular room or level of the smart building. When the luminosity level in the
room is above the threshold, the lights in the room are turned OFF. This is denoted in Cloudera by
"Lights in room/level X turned OFF". This work will be further extended in future by using physical
IoT sensors and physical actuators and controls like oxygen pumps and fire alarm or buzzer.

The IBDA framework architecture is shown in Fig. 1. It shows all the steps involved in the
analytics process of the proposed IBDA framework. Fig. 2 shows IoT sensor data management,
analytics and visualization. It also depicts how the building environment is controlled based on
the real time data analytics results. As discussed earlier, the data generated from the virtual
sensors is ingested into HDFS. From HDFS, the data is analyzed in real time using Apache Spark, and
based on the results of the analytics, oxygen pumps, fire alarm and lights can be turned ON or
controlled to keep the users and residents comfortable inside the building.

Fig. 3 shows the flow chart of the IBDA framework analytics process. The IoT sensors generate
oxygen concentration data, smoke detector data and luminosity sensor data from various levels or
rooms of the smart building. This data, after being ingested into Cloudera HDFS, is analyzed using
Apache Spark code. If the level of oxygen or luminosity reported by any of the sensors is below the
acceptable range, the corresponding oxygen pump or lights deployed on that level or in that room
are turned ON respectively to keep the levels of oxygen and luminosity in the acceptable range for
resident comfortability. If the smoke detectors detect smoke or a hazardous gas in the smart
building, the fire alarm in that particular room or level is turned
Fig. 1. IBDA architecture: ingest, store, analyse and actuate.
ON. As discussed earlier, for this research, it is assumed that the fire alarm will need to be
reset manually to turn it OFF.

Fig. 2. IBDA implementation

V. IBDA FRAMEWORK IMPLEMENTATION AND EVALUATION
As discussed earlier, we implemented the IBDA framework using the Cloudera Hadoop distribution. We
used Python code to simulate the virtual sensor data and used PySpark to analyze it in real time.
The framework implementation and evaluation is divided into six steps as explained below:

A. Data Generation
The data from fifteen IoT sensors is generated using Python code to simulate five virtual oxygen
sensors, five virtual smoke detectors and five virtual luminosity sensors. Using the Python code,
the data is sent to a TCP port. It is assumed that these fifteen sensors are deployed at five
different levels or rooms of the smart building whose environment we want to control automatically.
It is also assumed that these fifteen sensors are deployed in groups of three, with each level or
room having a group of one oxygen sensor, one luminosity sensor and one smoke/hazardous gases
detector. Fig. 4 shows the screenshot of the data generation code. For this research, it is assumed
that no two sensors will be generating and sending data at the same time. Only one sensor is
allowed to send the data at a time. The data from each sensor is generated every ten seconds
(defining real time for this project). We used PyCharm to write the data generation code in Python
because of its ease of use.

B. Data Extraction
To extract the data generated from the fifteen virtual sensors as mentioned above, an Apache Flume
agent is set up to listen to the same TCP port onto which all the fifteen sensors are sending the
data. The Apache Flume agent is set up with HDFS as the sink, so it keeps on storing the data onto
the HDFS as soon as it arrives from the virtual sensors. Fig. 5 shows the pseudo code for the data
generation in the top half portion. Fig. 6 shows a Flume agent running and listening to the TCP
port.
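As an illustration of the hand-off between the virtual sensors and the listening agent, the following is a minimal Python sketch of a sender. It is not the authors' code. It assumes the agent accepts newline-terminated text records; the record layout, the function names and the host and port in the usage note are all hypothetical.

```python
import socket


def frame(record):
    """Encode one record as a newline-terminated UTF-8 line (assumed framing)."""
    return (record + "\n").encode("utf-8")


def send_records(conn, records):
    """Write each record to an already connected socket, one line per record."""
    for record in records:
        conn.sendall(frame(record))


def send_to_agent(host, port, records):
    """Open a TCP connection to the listening agent and send the records."""
    with socket.create_connection((host, port)) as conn:
        send_records(conn, records)
```

A call such as send_to_agent("localhost", 44444, ["oxygen,1,13.5"]) would push one reading to an agent listening on that (hypothetical) port. Keeping send_records separate from the connection setup makes the framing easy to check without a running agent.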
Fig. 3. Flowchart of the analytics process
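The decision logic of the analytics process, as described in Section IV, can be sketched in plain Python as follows. This is a hedged illustration rather than the authors' PySpark code. The oxygen threshold of 14 percent and the terminal messages come from the paper; the luminosity threshold, the state container, the function name and the treatment of a reading exactly at the threshold are assumptions.

```python
from dataclasses import dataclass, field

OXYGEN_THRESHOLD = 14.0        # percent oxygen; value stated in the paper
LUMINOSITY_THRESHOLD = 300.0   # hypothetical; the paper gives no value


@dataclass
class BuildingState:
    """Virtual actuators that are currently ON, keyed by room/level number."""
    pumps_on: set = field(default_factory=set)
    alarms_on: set = field(default_factory=set)
    lights_on: set = field(default_factory=set)


def apply_reading(state, sensor_type, location, value):
    """Update the actuators for one reading; return the terminal message or None."""
    if sensor_type == "oxygen":
        # Assumption: a reading exactly at the threshold still counts as low.
        if value <= OXYGEN_THRESHOLD:
            state.pumps_on.add(location)          # pump stays ON while low
            return f"Oxygen pump {location} turned ON"
        if location in state.pumps_on:
            state.pumps_on.discard(location)
            return f"Oxygen pump {location} turned OFF"
        return "Oxygen level ok"
    if sensor_type == "smoke":
        if value:
            state.alarms_on.add(location)         # latched: never cleared in code
            return f"Fire alarm {location} turned ON"
        return None
    if sensor_type == "luminosity":
        if value < LUMINOSITY_THRESHOLD:
            state.lights_on.add(location)
            return f"Lights in room/level {location} turned ON"
        if location in state.lights_on:
            state.lights_on.discard(location)
            return f"Lights in room/level {location} turned OFF"
        return None
    raise ValueError(f"unknown sensor type: {sensor_type}")
```

The fire alarm is deliberately never removed from alarms_on, mirroring the stated assumption that it must be reset manually.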
D. Data Visualization
Once the data is stored in HDFS, it can be visualized using the default visualization tools
available in the Cloudera environment. The virtual sensor data stored in HDFS can be imported into
Hue in the form of tables and analyzed to gain useful insights into the data. Fig. 8 shows a
snapshot of the data visualization step.
Fig. 4. Data Generation
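The code in the Fig. 4 screenshot is not reproduced in the text. As a rough illustration of the data generation step in Section V.A, the following Python sketch simulates fifteen virtual sensors in five groups of three and cycles through them so that only one sensor produces a reading at a time. The record layout ("type,location,value"), the value ranges, the smoke probability and all names are assumptions made for this example.

```python
import itertools
import random

SENSOR_TYPES = ("oxygen", "smoke", "luminosity")
LOCATIONS = range(1, 6)          # five rooms or levels
SENSOR_PERIOD_S = 10             # each sensor reports every ten seconds
# With fifteen sensors taking turns, one send every 10/15 s keeps that period.
SEND_GAP_S = SENSOR_PERIOD_S / 15


def make_reading(sensor_type, location, rng):
    """Return one text record in a hypothetical 'type,location,value' layout."""
    if sensor_type == "oxygen":
        value = round(rng.uniform(10.0, 21.0), 1)    # percent oxygen; assumed range
    elif sensor_type == "smoke":
        value = int(rng.random() < 0.05)             # 1 = smoke detected; assumed rate
    else:
        value = round(rng.uniform(0.0, 1000.0), 1)   # luminosity; assumed range
    return f"{sensor_type},{location},{value}"


def sensor_schedule():
    """Cycle through the fifteen sensors so that only one sends at a time."""
    sensors = [(t, loc) for loc in LOCATIONS for t in SENSOR_TYPES]
    return itertools.cycle(sensors)


def generate(count, seed=0):
    """Produce `count` records in round-robin order from a seeded generator."""
    rng = random.Random(seed)
    schedule = sensor_schedule()
    return [make_reading(*next(schedule), rng) for _ in range(count)]
```

In a live run, a loop would sleep for SEND_GAP_S seconds between records before handing each one to the TCP sender; generate() omits the sleep so that the output can be inspected immediately.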
E. Data Analytics
To analyze the data being generated from the oxygen sensors in real time, this research used
Apache Spark. We used PySpark, which is the Spark Python API (Application Programming Interface),
to analyze the data in real time. This PySpark code continuously monitors the data stored in HDFS
coming from all fifteen sensors, including five oxygen sensors, five smoke sensors and five
luminosity sensors. If the
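One possible way to wire such continuous monitoring is sketched below. It is an assumption, not the authors' script: the paper does not state which Spark API was used, so this example uses the DStream-based textFileStream call of a pyspark.streaming.StreamingContext, which watches a directory for new files. The record layout, the function names and the handler are hypothetical.

```python
def parse_record(line):
    """Split a hypothetical 'type,location,value' line into typed fields."""
    sensor_type, location, value = line.strip().split(",")
    return sensor_type, int(location), float(value)


def attach_monitor(ssc, hdfs_dir, handle):
    """Parse new lines appearing under hdfs_dir and pass each record to handle.

    ssc is expected to be a pyspark.streaming.StreamingContext; it is passed in
    so that this sketch does not need to create a Spark session itself.
    """
    records = ssc.textFileStream(hdfs_dir).map(parse_record)
    # collect() brings each micro-batch to the driver, which is only reasonable
    # for the small batches of this fifteen-sensor scenario.
    records.foreachRDD(lambda rdd: [handle(r) for r in rdd.collect()])
    return records
```

After attach_monitor has been called, the caller would start the context with ssc.start() and ssc.awaitTermination(); the handler is where the threshold checks and the ON/OFF messages would be applied.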
Fig. 8. Data Visualization
Fig. 9. Preparing Apache Spark for data analytics
Fig. 10. Data Analytics and Smart Building Control Simulation

VI. RELATED WORK
There has been a growing interest in IoT, Big Data and Analytics in recent research. For example,
in one study [21], the authors discuss Apache Spark as a computing platform for Big Data analytics.
They present a survey of different techniques that can be used in the data analytics process. In
particular, they propose different Big Data processing techniques that can be utilized for smart
grid data. In a recent study [22], a number of challenges associated with the IoT domain have been
identified. This study highlights the potential uncertainties and risks in this emerging domain. In
order to effectively understand and analyse the data, we need to effectively model the IoT
environment. The modelling aspect of IoT systems has been discussed in [23]. That work proposes the
use of feedback control theory to evaluate the performance of an IoT system. However, it does not
provide a structured framework for the real time analytics and control of IoT devices as discussed
in this paper. A framework for real time semantic annotation of streaming IoT data has been
proposed in [24]. This framework claims to facilitate the transfer of large volumes of IoT
generated data from Smart City applications. This work is a good start towards the development of a
concrete framework that may result in the improvement of the performance of the Smart City.
However, this framework does not offer any help on how the data can be analysed in real time so
that appropriate actions can be taken to control the smart building.

Finally, most recently, an IoT enabled platform [25] has been proposed for Wireless Sensing and
Monitoring of environmental conditions in the context of building automation. This platform
monitors the temperature, relative humidity and light. In the proposed platform, a new hopping
method is used to transmit the data from the sending node to the receiving node. The received data
is then monitored in an Excel sheet. An Android app was also developed to monitor the data. This
work provides a basic framework and does not discuss in detail the real time data analytics for the
IoT-enabled smart environment.

It is clear, based on the background research and recent related work, that there is a need for
integrated frameworks to guide organisations interested in adopting IoT, Big Data Management and
Analytics for real time analytics and effective decision making for the control of their smart
environments such as smart buildings. This paper presented one such framework, here called IBDA,
and its initial evaluation for real time analytics.

VII. CONCLUSIONS AND FUTURE WORK
In this research we have developed and evaluated the IBDA framework for the IoT enabled smart
buildings scenario. It has been demonstrated how smart building components such as the oxygen
pumps, fire alarms and lights can be controlled automatically in real time without any human
intervention. For this research, we used five virtual oxygen sensors, five virtual smoke/hazardous
gas detectors and five luminosity sensors generating the virtual values of oxygen concentration,
smoke levels and luminosity levels
respectively. This was achieved using Python code. After analyzing this data using PySpark, which
is a Spark Python API, five virtual oxygen pumps, five virtual fire alarms and five virtual lights
were implemented. For this research it was assumed that no two sensors would be generating the data
simultaneously. In the second increment of this research and framework, we will use physical IoT
sensors instead of the virtual sensors used in this research. Similarly, we plan to use physical
oxygen pumps, buzzers or fire alarms and lights instead of virtual controls. This idea can also be
extended to investigate the real time energy consumption of the smart building while taking user
comfortability into account at the same time. In considering the luminosity levels, we will also
consider the occupancy of the room to make it more energy efficient. This research can be extended
to other applications besides smart buildings. Some of the applications could be smart cities and
airplanes, for example to monitor and control oxygen levels inside airplanes to maintain comfort,
health and safety. In future, we will present next versions of the integrated IBDA framework to the
community for further feedback and refinement.

REFERENCES
[1] Cisco, "The Internet of Things: How the Next Evolution of the Internet Is Changing Everything,"
2011.
[2] L. Atzori, A. Iera, and G. Morabito, "The internet of things: A survey," Computer Networks,
vol. 54, no. 15, pp. 2787-2805, 2010.
[3] M. Batty, "Smart Cities and Big Data," http://www.spatialcomplexity.info/.
[4] D. Leeds, "THE SOFT GRID 2013-2020: Big Data & Utility Analytics for Smart Grid," GTM Research
Report, 2012.
[5] Ericsson, "More than 50 Billion Connected Devices," 2011.
[6] M. Dohler, "Machine-to-Machine Technologies, Applications & Markets," 27th IEEE International
Conference on Advanced Information Networking and Applications (AINA), 2013.
[7] ITU, "The Internet of Things," 2005.
[8] M. A. Uusitalo, "Global Vision for the Future Wireless World from the WWRF," IEEE Vehicular
Technology Magazine, vol. 1, pp. 4-8, 2006.
[9] D. Zeng, S. Guo, and Z. Cheng, "The Web of Things: A Survey," Journal of Communications, vol. 6,
pp. 424-38, 2011.
[10] Wikipedia, "https://en.wikipedia.org/wiki/History_of_construction".
[11] S. Kejriwal and S. Mahajan, "Smart Buildings: How IoT Technology aims to add value for real
estate companies," Deloitte University Press.
[12] K. Zhou, C. Fu, and S. Yang, "Big data driven smart energy management: From big data to big
insights," Renewable and Sustainable Energy Reviews, vol. 56, pp. 215-225, Apr. 2016.
[13] J. L. Hernández-Ramos, M. V. Moreno, J. B. Bernabé, D. G. Carrillo, and A. F. Skarmeta, "SAFIR:
Secure access framework for IoT-enabled services on smart buildings," Journal of Computer and System
Sciences, vol. 81, pp. 1452-1463, Dec. 2015.
[14] M. M. Rathore, A. Ahmad, A. Paul, and S. Rho, "Urban planning and building smart cities based
on the Internet of Things using Big Data analytics," Computer Networks, vol. 101, pp. 63-80, Jun.
2016.
[15] N. Marz and J. Warren, "Big Data: Principles and best practices of scalable real-time data
systems," Chapter 1, Manning Publications Co.
[16] A. Q. Gill, N. Phennel, D. Lane, and V. L. Phung, "IoT-enabled emergency information supply
chain architecture for elderly people: The Australian context," Information Systems, vol. 58, pp.
75-86, 2016.
[17] Cloudera, "http://www.cloudera.com/"
[18] Apache Spark, "http://spark.apache.org/"
[19] A. Duffy and F. J. O'Donnel, "A Design Research Approach," in Proceedings of the AID98 Workshop
on Research Methods, Lisbon, Portugal, 1998, pp. 20-27.
[20] PySpark, "http://spark.apache.org/docs/latest/api/python/"
[21] R. Shyam, B. Ganesh H.B, S. Kumar S, P. Poornachandran, and K. P. Soman, "Apache Spark a Big
Data Analytics Platform for Smart Grid," Procedia Technology, vol. 21, pp. 171-178, 2015.
[22] A. Magruk, "The Most Important Aspects of Uncertainty in the Internet of Things Field Context
of Smart Buildings," Procedia Engineering, vol. 122, pp. 220-227, 2015.
[23] K. Sato, Y. Kawamoto, H. Nishiyama, N. Kato, and Y. Shimizu, "A modeling technique utilizing
feedback control theory for performance evaluation of IoT system in real-time," in Wireless
Communications & Signal Processing (WCSP), 2015 International Conference on, 2015, pp. 1-5.
[24] S. Kolozali, M. Bermudez-Edo, D. Puschmann, F. Ganz, and P. Barnaghi, "A Knowledge-Based
Approach for Real-Time IoT Data Stream Annotation and Processing," in Internet of Things (iThings),
2014 IEEE International Conference on, and Green Computing and Communications (GreenCom), IEEE and
Cyber, Physical and Social Computing (CPSCom), IEEE, 2014, pp. 215-222.
[25] A. Y Joshi, V. N Patel, J. Shah, and B. Mishra, "Customized IoT Enabled Wireless Sensing and
Monitoring Platform for Smart Buildings," Procedia Technology, vol. 23, pp. 256-263, 2016.