Application of Deep Learning Neural Networks in Big Data Analytics
Luis A. Gutierrez
Abstract
This paper discusses the importance of deep learning techniques in big data analytics. It
examines the differences and similarities between the use of deep learning algorithms in big
data analytics and their use in other fields, such as natural language recognition software and
visual recognition software. This paper also discusses the current and future importance of deep
learning applications in big data analytics. Research shows that the demand for these techniques
has increased at an incredible pace and will continue this trend for the foreseeable future.
Finally, this paper gives insight into some of the methods these deep learning algorithms use to
collect and interpret data.
Introduction
Data science has been around for almost as long as humans have. Historically, data
science simply meant compiling data to produce useful statistics. It is the new definition of data
science that this paper discusses. This “new” definition has only emerged recently and more
specifically describes the profession concerned with making sense of vast amounts of data. To
accommodate the immense increase in data collection in the past few years, data scientists have
developed new techniques using modern technology to solve the challenges that emerge from
having so much data. The implementation of machine learning techniques, more specifically
deep learning neural networks and algorithms, in big data analytics has become essential to the
advancement of data analytics.
In order to understand how deep learning techniques are used in big data analytics, it is important
to look at how they are used in other fields. That, along with understanding how data scientists
use deep learning algorithms to interpret this data, will help clarify the application of these
techniques to big data analytics.
To better understand the application of deep learning in big data analytics, we must
first define what each of those parts means. According to the National Institute of Standards and
Technology USA, “big data is where the data volume, acquisition velocity, or data representation
limits the ability to perform effective analysis using traditional relational approaches or
requires the use of significant scaling (more nodes) for efficient processing” (Rajaraman p.696).
In simpler terms, companies such as Google, Microsoft, Oracle, and Intel define big data as:
“Big data is a term describing the storage and analysis of large and/or complex data sets using a
series of techniques including but not limited to: NoSQL, MapReduce, and machine learning”
(Rajaraman p.697). Deep learning “allows computational models that are composed of multiple
processing layers to learn representations of data with multiple levels of abstraction” (LeCun
p.436). What that means is that deep learning algorithms use many layers of machine “learning”
to break down processes or problems into smaller problems and combine the solutions to those
smaller problems into one big solution. So how do these two fields come together to help us
understand patterns in data? “The general focus of machine learning is the representation of the
input data and generalization of the learnt patterns for the use on future unseen data”
(Najafabadi p.1). Data is fundamental for machine learning algorithms to work. Deep learning
algorithms learn from the data they are given and use that “knowledge” to produce analyses of
the data that humans use. This automates data analysis and makes the process much faster and
more efficient.
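As a loose illustration of what “multiple processing layers” means, the forward pass of a tiny two-layer network can be sketched in a few lines of plain Python. The weights and inputs here are made up purely for illustration; a real network would learn its weights from data.

```python
import math

def relu(x):
    # Rectifier activation: negative values become zero.
    return [max(0.0, v) for v in x]

def layer(inputs, weights, biases):
    # One dense layer: each output is a weighted sum of all inputs plus a bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(x):
    # Layer 1 turns 3 raw features into 2 intermediate features.
    h = relu(layer(x, [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]))
    # Layer 2 turns those intermediate features into a single score.
    out = layer(h, [[1.0, -1.0]], [0.0])
    # Squash the score to a 0..1 value with the logistic function.
    return 1.0 / (1.0 + math.exp(-out[0]))

score = forward([1.0, 2.0, 3.0])
```

Each layer transforms the previous layer's output into a slightly more abstract representation, which is the layered structure the definition above describes.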
How does the use of deep learning in big data analytics compare to how it is used in other fields?
To understand the differences and similarities among deep learning techniques
in different fields, it is important to see specific uses within data science and in other related
fields. So how is deep learning used in big data analytics? Big data biometrics is the science of
identifying individuals based on their physical or behavioral characteristics. For example,
fingerprints, iris, face, voice, hand geometry, and dynamic signatures are the top six biometrics
in the real world (Jaseena p.12). All this biometric information is easily lost or confused.
Internal affairs agencies and banks need access to these biometric authentication systems in
order to do their work effectively. By using biometric big
data technologies, authentication systems can access and manipulate databases with information
on millions of individuals. Deep learning algorithms allow for all this information to be sorted
and organized quickly and efficiently because they learn to predict what new inputs will look
like and anticipate them. Another example of deep learning algorithms used in big data analytics
is how the U.S. Department of Homeland Security uses machine learning and deep learning to
identify patterns in email traffic, cell phone traffic, and other sources to identify future threats,
so that these systems can handle those threats before they materialize.
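The biometric cross-referencing described above can be sketched at its simplest as nearest-neighbor matching of feature vectors against an enrolled database. The names, vectors, and threshold below are entirely hypothetical; real systems use learned feature extractors and far larger databases.

```python
import math

# Hypothetical enrolled templates: identity -> biometric feature vector.
database = {
    "alice": [0.9, 0.1, 0.4],
    "bob":   [0.2, 0.8, 0.7],
}

def distance(a, b):
    # Euclidean distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(probe, threshold=0.5):
    # Return the enrolled identity closest to the probe vector,
    # or None if nothing is close enough to authenticate.
    name, vec = min(database.items(), key=lambda kv: distance(probe, kv[1]))
    return name if distance(probe, vec) <= threshold else None
```

A probe close to an enrolled template is accepted as that identity; one far from every template is rejected, which is the authentication decision described above.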
Deep learning applied to big data analytics is all about making predictions in order to
anticipate outcomes. This is unlike other fields, where deep learning is used more for learning
about data and users and for improving itself for the benefit of the user. One example is natural
language recognition software, also known as speech recognition software. Services such as
Amazon’s Alexa, Apple’s Siri, and Google’s Google Assistant, all use artificial intelligence to
attempt to make lives easier. Though that is easier said than done, these services use
sophisticated machine learning neural networks to respond to your voice commands.
Deep learning algorithms recognize, classify, and categorize patterns from inputs and
cross reference them to the databases behind the systems supporting them. The way the data is
structured and stored is what makes these services so fast and efficient. Therefore, you get an
almost instant response from your Amazon Alexa after asking it how to lose weight after that
enormous Thanksgiving dinner. Visual object recognition software follows similar principles.
The main difference between speech and visual recognition software is that the input data goes
from speech to visual inputs. For services such as Google’s Google Lens service, found in many
mobile devices running Google’s Android operating system, visual recognition software uses
deep learning algorithms to process images. This process can be broken down into three main
steps. The first is that the image recognition software gathers and organizes the data to work
with. It then uses that data to build a model of the image. Finally, it uses the model to recognize
images. While these steps may sound very simple, the calculations and comparisons that happen
in the background are incredibly complex. Finally, another way that deep learning is used that
you see every day but don’t necessarily think about is in targeted advertising. What this means is
that online advertisers use sophisticated learning mechanisms to target their audiences based on
the product or person the advertiser is promoting. To do this, this advertising software uses the
user’s information, such as race, economic status, sex, and age, along with their search history,
to personalize online advertisements. Therefore, you might have found that you
look up a product that you are interested in buying on your phone and the next day when you log
into your computer, you find endless ads on your browser promoting that same or similar
products. Targeted advertisement uses extremely complex algorithms that use the user’s data and
analyze their browsing patterns to recommend and show products that they are most likely to
purchase.
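The three-step image-recognition pipeline described above (gather and organize data, build a model, then recognize new images) can be sketched with a toy nearest-centroid classifier. The “images” here are short made-up pixel lists; a real system would use deep convolutional networks on millions of images.

```python
# Step 1: gather and organize labelled example "images" (flattened pixel lists).
examples = {
    "bright": [[0.9, 0.8, 0.9, 1.0], [1.0, 0.9, 0.8, 0.9]],
    "dark":   [[0.1, 0.0, 0.2, 0.1], [0.0, 0.1, 0.1, 0.2]],
}

# Step 2: build a model -- here, the average (centroid) image per class.
def centroid(images):
    n = len(images)
    return [sum(px) / n for px in zip(*images)]

model = {label: centroid(imgs) for label, imgs in examples.items()}

# Step 3: recognize a new image by finding the closest class centroid.
def recognize(image):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist(image, model[label]))
```

The same three stages appear in real pipelines; only the model-building step (here a simple average) is replaced by far more complex learned representations.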
All these different services and infrastructures, while they serve different purposes, share
one core function: their need to access data and to manipulate that data through deep learning
algorithms. “The term machine learning
refers to the automated detection of meaningful patterns in data” (Jaseena p.14). For any deep
learning algorithm to be able to do its work, there must be a large amount of data from which to
discern patterns and create organizational structures for simplified access later. The quality and
the quantity of the data are both major factors in determining how well the algorithm learns and
how well it predicts. For a biometric scanner to work successfully, it must be able
to access a database containing all the biometric information of all the people to be able to cross
reference the input dataset to the data in the database. For example, the FBI (Federal Bureau of
Investigation) Next Generation Identification has more than 100 million individuals in their
system and India’s Unique Identity biometric system has more than 600 million individuals
(Jaseena p.14). Natural language recognition software and speech recognition software have
access to immense libraries of data. Think about Google and all the data it has available to
provide to all its smart devices, or about Amazon and the vast store of user information it
provides to its voice assistant, Alexa. Apple’s Siri, like its Google and Amazon counterparts,
offers a wide variety of services, including weather reports, reminders, and answers to user
questions. All these services use deep learning algorithms along with all the data they have
access to.
How important is deep learning to the current and future analysis of big data?
As the amount of data keeps getting bigger, deep learning is coming in to play a key role
in providing big data predictive analytics solutions (Chen p.514). According to the National
Security Agency (NSA), the internet is processing 1,826 Petabytes of data per day. That is the
equivalent to almost 2 billion gigabytes of data per day, or the amount of storage found in
roughly 29 million 64 gigabyte iPhones. These machine learning techniques are widely used in
Big Data fields like search engines and medicine. In contrast to other machine learning
techniques, deep learning refers to techniques that use supervised and/or unsupervised strategies
to automatically classify data. The advantage of deep learning over many of the other machine
learning techniques is that it can be successfully applied to large volumes of data. There are
many ways in which deep learning algorithms are applied to the data, and each algorithm and
process approaches the data in different ways. However, they all aim to achieve one thing: to
train these algorithms to become self-sustaining and to make connections and predictions on the
data automatically. The main focus of these deep learning techniques is to do what is impossible
for human beings to do on their own: sort through raw, unsorted, unlabeled data and give back
predictions, which can be used to discover trends in consumer shopping habits, find when the
next outbreak of the flu will happen, and even predict who will be the next president of the
United States. The amount of data and the increasing need for quality data are dictating where
the field of data analytics is headed.
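The storage figures quoted earlier in this section can be sanity-checked with a few lines of arithmetic, assuming decimal units (1 petabyte = 1,000,000 gigabytes):

```python
petabytes_per_day = 1826
gigabytes_per_day = petabytes_per_day * 1_000_000  # 1 PB = 10^6 GB (decimal)
iphones = gigabytes_per_day / 64                   # 64 GB of storage per phone

# gigabytes_per_day is about 1.8 billion, and iphones about 28.5 million,
# consistent with the "almost 2 billion GB" and "roughly 29 million" figures.
```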
If so much data has been accumulated in just the past few years, what kind of changes
will begin to be seen a few years from now? It is expected that by 2020 the amount of
digital information stored across the world will reach 35 trillion gigabytes (Chen p.514). The
future of Big Data lies in these deep learning algorithms. With so much data piling up in servers
all around the world, it will be even more important than it is now to have methods of sorting
through all this data to find meaningful patterns that will help advancements in every field
possible. For example, in 2013, President Barack Obama announced a brain mapping initiative
called BRAIN (Brain Research through Advancing Innovative Neurotechnologies), aiming to
develop techniques to map human brain functions and to treat and cure brain disorders. The
amount of data needed for this initiative will test the limits of technologies for Big Data
collection and analysis (Chen p.521). As time goes on, however, and more data is gathered, the
running-time complexities (the time it takes for the algorithms to perform their operations) will
increase. It will be critical for the analysis of so much data that data scientists focus on adapting
current deep learning techniques. As data grows and the variety of data increases, the algorithms
will have to adapt to new parameters. Just as humans evolve, so does the data that we create.
How are these deep learning techniques developed and applied?
Due to the incredible growth in the field of data analytics, it has been of the utmost
importance that the techniques with which data is analyzed evolve alongside it. Some of the
data processing applications that exist today are analysis, capture, data curation, search,
sharing, storage, transfer, visualization, querying, updating, and information privacy. All these
operations on data take time and space. There is also a great amount of data in existence, and
all that data is incredibly varied. That is why data is classified as big data when it exhibits the
three V’s: volume, variety, and velocity (Hordri p.33). Volume refers to the
requirement that the data be large in quantity. Variety refers to the requirement that the data
come in many forms or formats and can be organized in structured or unstructured ways.
Finally, velocity refers to the speed at which the data can be generated and organized. With new
machine learning techniques, these methods and processes take a fraction of the time and effort
that they took years ago. The first part of data analysis using deep learning techniques is
collecting the data. This is done with several techniques: rule learning, data mining, clustering,
and others.
Once data has been collected using one or a combination of these deep learning
techniques, interpreting the data is the next challenge. As with collecting data, there are various
deep learning strategies used to interpret data: supervised learning, unsupervised learning,
reinforcement learning, and semisupervised learning (Jaseena p.15). These methods interpret
data differently and produce different results; however, they all work to find similarities in
the data, group those similarities, and learn from those similarities for future application. In the
end, these are “learning” algorithms. This notion of learning, which allows these algorithms to
work successfully on their own, is what has created such a huge demand for deep learning
techniques.
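As a minimal unsupervised sketch of “finding similarities in the data and grouping them,” here is a toy one-dimensional version of k-means clustering. The data points and starting centres are invented for illustration; real interpretation pipelines operate on far richer, higher-dimensional features.

```python
def kmeans_1d(values, c1, c2, steps=10):
    # Repeatedly (1) assign each value to the nearer centre and
    # (2) move each centre to the mean of its assigned group.
    for _ in range(steps):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        if g1:
            c1 = sum(g1) / len(g1)
        if g2:
            c2 = sum(g2) / len(g2)
    return sorted([c1, c2])

# Two natural groups hide in this data: values near 1 and values near 9.
centres = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], 0.0, 5.0)
```

No labels are supplied; the algorithm discovers the two groups on its own, which is the defining property of the unsupervised strategies listed above.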
Conclusion
As data evolves and it gets bigger, data scientists must evolve the methods with which this data
is collected and interpreted. Advances in technology will create an even bigger surge in the
amount of data stored in servers around the world. The key to future advancements in technology
lies within all that data. If these discoveries are to be made any time soon, the methods to extract
this information will be more important than ever before. With the help of machine learning
techniques, it will be possible to sort through all this data and find the meaningful pieces to keep
moving forward.
Works Cited
Jaseena, K. U., & Kovoor, B. C. (2018). A survey on deep learning techniques for big data
in biometrics.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521, 436.
Najafabadi, M. M., Villanustre, F., Khoshgoftaar, T. M., Seliya, N., Wald, R., & Muharemagic,
E. (2015). Deep learning applications and challenges in big data analytics. Journal of Big
Data, 2(1), 1.
Chen, X., & Lin, X. (2014). Big data deep learning: Challenges and perspectives. IEEE Access,
2, 514-525.