The Cloud virtualizes & dematerializes Data Centers & democratizes app development

Big Data makes insights real time and also enables Artificial Intelligence that is
revolutionizing automation

Social Media & Digital marketing are changing the equation between customer,
product and seller

Blockchain and IoT are emerging disruptions that will change many industries

---
Pay Per Use - Cloud architectures shift Capex to Opex. You only pay for what you
use. Extra capacity provisioned to handle spikes is charged only while it is in use
and is de-provisioned as soon as traffic reduces.
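
The arithmetic behind pay-per-use can be sketched roughly as follows. The price,
the server counts and the spike duration are hypothetical values chosen only for
illustration; they are not taken from any real cloud provider.

```python
# Hypothetical figures, for illustration only.
HOURS_PER_MONTH = 720
PRICE_CENTS_PER_SERVER_HOUR = 10  # assumed on-demand price, in cents


def fixed_capacity_cost(peak_servers):
    """Monthly cost in cents when peak capacity runs all month."""
    return peak_servers * HOURS_PER_MONTH * PRICE_CENTS_PER_SERVER_HOUR


def pay_per_use_cost(baseline_servers, peak_servers, spike_hours):
    """Monthly cost in cents when extra servers exist only during spikes."""
    baseline = baseline_servers * HOURS_PER_MONTH * PRICE_CENTS_PER_SERVER_HOUR
    extra_servers = peak_servers - baseline_servers
    spike = extra_servers * spike_hours * PRICE_CENTS_PER_SERVER_HOUR
    return baseline + spike


# Ten servers all month versus two servers plus eight extra for 20 spike hours.
always_on = fixed_capacity_cost(10)
elastic = pay_per_use_cost(2, 10, 20)
```

With these assumed numbers the elastic setup costs a fraction of the always-on
one, because the eight extra servers are billed for 20 hours rather than 720.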

https://cloud.google.com/natural-language/

The Cloud dematerializes access to technology and turns Capex into Opex while also
significantly improving productivity

Cloud Use Case 5: Big Data applications

It is tremendously expensive and complex to set up modern day distributed Big Data
platforms like Hadoop and Spark. Cloud platforms offer these as managed services,
in the style of Elastic MapReduce, so that Big Data pilots and projects can be
kicked off quickly.

Hands-on Exercise: Experience the Cloud

Here is a fantastic example of Cloud, Big Data and Machine Learning at work. Head
over to the Google Cloud Natural Language API, copy-paste some text and get back
sentiment analysis, entity and intent extraction. This is offered as an on-demand
API that is powered by massive, distributed Natural Language Processing AI
algorithms that have been trained on massive language datasets stored on Big Data
platforms.

Your Exercise: Type a paragraph of text, perhaps copied from a news article, and
experience these technologies at work

www.ap-institute.com

Data Ingestion is the process of obtaining & importing data for immediate use or
storage. This could happen in real time (Data Streaming) or in batch mode. Some
tools in this space: Apache Sqoop, Flume, Gobblin framework, Kinesis, Splunk.
Lambda architecture can be employed to ingest data streams for real-time viewing
and to process batch data simultaneously in a single architecture.
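
The Lambda architecture idea can be sketched in a few lines: a batch view that is
recomputed from the full event log, a speed view that is updated as events arrive,
and a query that merges the two. This is a toy, in-memory illustration; the class
and method names are made up for this example and do not come from any of the
tools listed above.

```python
from collections import Counter


class LambdaSketch:
    """Toy event counter with a batch layer and a speed layer (hypothetical)."""

    def __init__(self):
        self.master_log = []         # append-only record of every event
        self.batch_view = Counter()  # recomputed periodically from the full log
        self.speed_view = Counter()  # covers events since the last batch run

    def ingest(self, event_key):
        # Streaming path: store the event and update the real-time view.
        self.master_log.append(event_key)
        self.speed_view[event_key] += 1

    def run_batch(self):
        # Batch path: rebuild the view from all data, then reset the speed view.
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, event_key):
        # Serving path: merge the batch result with the recent increments.
        return self.batch_view[event_key] + self.speed_view[event_key]
```

A query returns the same total before and after a batch run; only the layer that
holds the count changes.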

Big Data storage technologies are specially designed to deal with the scale, rapid
retrieval & parallel processing requirements of massive data sets. Hadoop,
Cassandra & Cloudera (also Hadoop based) are popular technologies in this space.

Big Data analysis requires distributed processing capabilities and techniques like
MapReduce. Tools like Apache Spark, Storm & Flink are popular. Splunk is also a
popular platform to analyze large amounts of machine generated data (like logs).
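
To make the MapReduce technique concrete, here is a minimal single-machine sketch
of a word count. It only imitates the map, shuffle and reduce steps in plain
Python; it does not use Hadoop or Spark, and the function names are chosen for
this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # In a real cluster each line could be mapped on a different machine.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big insights"])
```

Because each call to map_phase and reduce_phase depends only on its own input,
the work can be spread across many machines, which is what distributed platforms
do at scale.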

Once Big Data is processed, visualization frameworks such as d3.js, Tableau and
Kibana are popular ways to generate interactive dashboards.

https://informationisbeautiful.net/

Summary: Big Data

Big Data technologies deal with massive, internet scale datasets and are designed
to work with both structured and unstructured data from diverse sources

Distributed and Parallel processing techniques such as MapReduce and open source
platforms such as Hadoop and Apache Spark have made Big Data affordable and
practical for all enterprises

Data Science, the intersection of Statistics, Domain Knowledge and Programming, is
now among the most in-demand skills in the industry

Hands-on Exercise: Visualising Social Networks

Socilab uses your LinkedIn Social Graph and results from social science to map and
analyze your LinkedIn network. As an exercise, go ahead and try it out for your
LinkedIn profile, preferably from your desktop or laptop.

Please note: You will need to authorize this app to access your profile data.
Please read their terms and conditions before proceeding.

Why is AI different?

Applications have always had some form of intelligence, using mechanisms such as
Rule engines, analytics and models. The crucial insight into why AI approaches are
different is the difference between traditional Models and Data-driven Models. AI
algorithms use data to improve their logic, unlike earlier approaches that required
offline programmatic interventions to change algorithms.
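
The contrast between a traditional model and a data-driven model can be sketched
with a toy fraud check. The scenario, the amounts and the function names are
hypothetical and serve only to illustrate the idea.

```python
def rule_based_is_fraud(amount):
    """Traditional model: the threshold is fixed by a programmer."""
    return amount > 1000


def learn_threshold(examples):
    """Data-driven model: pick the threshold that misclassifies the fewest
    labelled examples. Each example is an (amount, is_fraud) pair."""
    candidates = sorted(amount for amount, _ in examples)

    def errors(threshold):
        return sum((amount > threshold) != label for amount, label in examples)

    return min(candidates, key=errors)


# Hypothetical labelled history: fraud shows up at much lower amounts
# than the hand-written rule assumes.
history = [(50, False), (200, False), (400, True), (900, True)]
threshold = learn_threshold(history)


def learned_is_fraud(amount):
    return amount > threshold
```

Changing the behaviour of the first function means editing code. Changing the
behaviour of the second only requires new labelled examples.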

AI is inter-disciplinary by design. It is at the confluence of computer science,
psychology, neurological science, biology, mathematics, sociology and anthropology,
and philosophy.

With the arrival of distributed Big Data processing platforms, the ability to
analyze and infer Natural Language using algorithms that have been trained on an
existing corpus of language samples is now mainstream. We see this everywhere, from
Siri and Alexa to Google Voice Assistant. The earlier approach of encoding grammar
as rules has given way to a more data-driven approach that trains models on
real-life use of language.

Recent AI techniques such as Deep Learning have made it possible to recognize
objects in pictures and videos with accuracy that, on some benchmark tasks, rivals
human vision. These techniques are used in autonomous vehicles and security systems.

mathworks.com/deep learning

From specialised robots, AI has now enabled the first generation of general purpose
robots that can be trained by humans to learn general purpose tasks. Amazon, for
example, uses Kiva robots in its warehouses. These robots aid in item retrieval
before shipping.

AI Lingo

Expert systems: Use algorithms such as Logistic Regression & Support Vector
Machines in situations like flight tracking systems, clinical systems and financial
advisory, and automated stock trading applications.

Natural Language Processing: Algorithms that you can see in day to day applications
like Google or Siri's voice recognition, search engines and sentiment analysis.

Neural Networks - Use complex data structures that resemble neurons in our brains
to build pattern recognition models that learn from input data and adjust
themselves based on errors. Deep Learning is a specific kind of Neural Network that
uses many layers of "neurons" and has recently resulted in huge improvements in
areas such as face and object detection, character and handwriting recognition.
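
Learning from errors can be illustrated with a single artificial neuron, a
perceptron. This is a minimal sketch, far simpler than the multi-layer networks
used in Deep Learning. The task (learning the logical AND of two inputs) and all
names are chosen only for this example.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=10, learning_rate=1):
    """Adjust weights after every wrong prediction (perceptron rule)."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # A zero error leaves the weights unchanged.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Training data for logical AND: output is 1 only when both inputs are 1.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
```

No rule for AND is written anywhere in the code. The weights settle on values
that reproduce it because each mistake nudges them toward the correct answer.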

Robotics - Industrial robots for manufacturing situations, such as moving,
spraying, painting, precision checking, drilling and coating etc.

Hands-on Exercise: Artificial Intelligence

Go try out Google Autodraw and experience the awesome power of visual computing. It
can take your utterly terrible sketches and turn them into beautiful art, exactly
the way you wished you could draw

Summary: Artificial Intelligence

AI is a set of technologies that combine algorithms and training data to enable
complex human-like intelligence in computers.

AI techniques have historically been used in Expert tasks such as Financial
analysis, Scientific analysis and Diagnosis.

The growth of Big Data platforms and cloud based distributed computing technologies
has now made it possible to build more complex, general purpose AI for mundane and
formal tasks.

AI is increasingly everywhere now - in search & recommendation engines, face
recognition, robots & self driven vehicles.

Blockchain is a disruptive new transactional storage mechanism that enables
decentralised trust mechanisms and could transform everything from banking and
healthcare to government systems in the near future

iftf.org/blockchainfuturelab

Blockchain uses a combination of cryptography and consensus building protocols to
enable systems of decentralised trust.
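
The cryptographic half of this can be sketched as a chain of hashed blocks, where
each block stores the hash of the one before it. This is a simplified
illustration only: it leaves out consensus, networking and signatures, and the
block layout and function names are assumptions made for this example.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """SHA-256 over the block contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a block that is linked to the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    })


def is_valid(chain):
    """Check every stored hash and every link between neighbouring blocks."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != recomputed:
            return False
    return True


ledger = []
add_block(ledger, "Alice pays Bob 5")
add_block(ledger, "Bob pays Carol 2")
```

Editing the data of an earlier block changes its hash, so the check fails unless
every later block is rewritten too. Consensus protocols are what make such a
rewrite impractical across many independent participants.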

Public vs Private Blockchains

A public blockchain allows all users to both read from and write to the
transactional record. Bitcoin is an example of a blockchain powered public ledger.
A private blockchain allows only trusted and authenticated users to write or read.
In that case, trust could rest on mechanisms such as legal contracts between
companies.

Blockchain Use Cases:

Smart Contracts - Blockchain can help companies to enforce contracts with customers
and other third parties.

Online Identity Management - Rights management for music/video is an example where
Blockchain can help create scalable, distributed identity resolution as opposed to
risky, centralized trust mechanisms that are vulnerable to hacking.

Medical Records - Blockchain can help create secure, globally available medical
records that cut across healthcare providers.

Asset management - Blockchain can enable better ways to manage asset ownership,
like land records or share trades.

http://dailyblockchain.github.io/

Summary: Unchain the Blockchain

Blockchain is a distributed transactional ledger that was originally created to
enable cryptocurrencies like Bitcoin

Its design enables decentralized trust mechanisms that now see use cases in
financial services, healthcare, education and legal services

IoT

Pervasive computing transforms our world into an always-on, always-monitoring,
smart and adaptive environment

Inflection Point

IoT has gathered momentum recently as a result of the following:

Availability of low cost, high computing power processors and miniaturized
architectures such as Raspberry Pi and Arduino

Low power usage architectures that have enabled ubiquitous deployment of sensor
devices

Scalable cloud based IoT management platforms to analyze sensor data

There are two parallel tracks in IoT technologies

Wearable - Google's Android Wear and a host of fitness, activity and health
trackers are part of this ecosystem

Embedded - Arduino, Raspberry Pi and Intel's Galileo are among the more popular
embedded IoT device architectures

Some of the popular IoT platform providers include GE Predix, RTI, ThingWorx, TCUP.

Some of the top IoT technologies predicted by Gartner are Security, Analytics,
Device Management, Low-power short-range networks, event stream processing etc.

http://www.simpleiothings.com/

Immersive Technology

Immersive Technologies come in 3 flavors:

Virtual Reality (VR) - Head Mounted Devices (HMD) that completely cut the user off
from the real world and present a simulated world as the only perception of the
world. User interactions (Gestures, Speech, Movement) cause a reaction within the
virtual world.

Augmented Reality (AR) - Devices (Mobiles, HMDs) that present both virtual and real
worlds forming a dual perception of the world. For a user, the virtual world does
not recognize the physical, but both are controlled by user interactions.

Mixed Reality (MR) - Devices (Mobiles, HMDs) that present both the worlds such that
the virtual world recognizes the physical and creates a close-to-reality perception
of the world.

VR : pebblestudio.co.uk

The Microsoft Hololens is an HMD that uses specialized hardware that runs on
Windows 10 OS and delivers a virtual world as a Universal Windows Platform (UWP)
App.

More on Microsoft Hololens

Software development for the Microsoft Hololens is done using Visual Studio. It
creates a UWP App on Windows 10 OS that is targeted to the Hololens hardware.
The virtual world can be created either directly using Visual Studio and Microsoft
DirectX programming in C++ OR by using gaming engines such as Unity, Vuforia or
Cocos Engine. Unity is the most popular development platform.
The design of the virtual world is done in Unity using the Unity Editor, which then
exports a VS solution. This solution is compiled to form the UWP App that executes
on the Hololens.
Microsoft provides a Hololens emulator which can be used to create and test programs.
