
Lecture 1a

Subir Varma
} Deep Learning: An important subset of the field of Machine Learning
} What is Machine Learning (ML):
Designing systems that can learn from experience
Use a portion of the data set to train the system
Once trained, the system is able to work effectively even for
inputs that are not part of the training set.
} How is DL different from ML?
DL creates higher level representations of data
Easier to reason with and manipulate higher level
representations
} Data can be represented in different ways, and this has an
enormous influence on the performance of ML/DL algorithms.
Example: Roman Numerals vs Arabic Numerals
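As a minimal illustration (a sketch added here for concreteness, not from the original slides): multiplying two numbers is trivial in the Arabic (positional) representation, but with Roman numerals we first have to map them into a better representation.

# A minimal sketch: the same quantities in two representations. Arithmetic is
# direct with Arabic numerals (integers); with Roman numerals we must first
# convert to the integer representation.
ROMAN_VALUES = {'M': 1000, 'D': 500, 'C': 100, 'L': 50, 'X': 10, 'V': 5, 'I': 1}

def roman_to_int(s):
    """Convert a Roman numeral string to an integer."""
    total = 0
    for i, ch in enumerate(s):
        value = ROMAN_VALUES[ch]
        # Subtractive notation: a smaller symbol before a larger one is negative (e.g. IV = 4)
        if i + 1 < len(s) and value < ROMAN_VALUES[s[i + 1]]:
            total -= value
        else:
            total += value
    return total

print(27 * 14)                                       # 378: direct in the Arabic representation
print(roman_to_int('XXVII') * roman_to_int('XIV'))   # 378: only after mapping to a better representation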
} We would like to map the raw data into some other space in a
way that makes the relationships between different things
more explicit
From Deep Learning by Goodfellow et al.
} Simple Machine Learning is good at doing Linear
Discrimination
} Before the advent of Deep Learning,
Choosing a data representation appropriate for the problem, which could
then be fed into a simple Machine Learning system, was a manual,
time-consuming process
With many problems it was difficult to know what features should be
extracted
} With Deep Learning:
The system discovers the best representation itself, which can then be fed
into a Linear Discriminator. This is called Representation Learning (a
minimal sketch follows below).
Leads to better performance than hand-designed representations, and allows
the system to adapt to newer tasks with minimal human intervention.
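A minimal sketch of this idea, assuming TensorFlow/Keras (the layer sizes and dataset are illustrative choices, not from the slides): the hidden layers learn the representation, and the final softmax layer is just a linear discriminator applied to it.

# A minimal sketch (assumed setup, using TensorFlow/Keras): the hidden Dense
# layers learn a representation of the raw pixels; the final Dense + softmax
# layer is a simple linear discriminator operating on that representation.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # raw pixel input
    tf.keras.layers.Dense(256, activation='relu'),    # learned representation, level 1
    tf.keras.layers.Dense(64, activation='relu'),     # learned representation, level 2
    tf.keras.layers.Dense(10, activation='softmax'),  # linear discriminator on the representation
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# With labeled data (e.g. MNIST digits), training would look like:
# (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
# model.fit(x_train / 255.0, y_train, epochs=5)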
} Image represented as chemicals on a photographic film:
Good for certain operations, such as film development; difficult to
manipulate or transmit the image
} Image represented as digital bits
Makes possible all kinds of image manipulation and compression, and
enables easy image transmission
} Deep Learning: Image represented as the output of a Neural
Network
Enables us to do operations that require a deeper understanding of the
image, such as:
- Detect the main objects in the image and name them
- Provide a verbal description of the image
- Generate similar images
[Figure: a Deep Learning network maps an image of 224x224x3 pixels (about 150K numbers) to a representation of just 4096 numbers that captures semantic information about the image.]
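One way such a representation can be obtained (a sketch assuming TensorFlow/Keras and its pretrained VGG16 weights; the particular network and layer name are illustrative choices, not from the slides): push a 224x224x3 image through a pretrained network and read out the 4096-number activation of a late fully-connected layer.

# A sketch (assuming TensorFlow/Keras with downloadable pretrained VGG16
# weights): map a 224x224x3 image (~150K numbers) to a 4096-number
# representation by reading the activations of the 'fc2' layer.
import numpy as np
import tensorflow as tf

vgg = tf.keras.applications.VGG16(weights='imagenet')               # pretrained classifier
feature_extractor = tf.keras.Model(inputs=vgg.input,
                                   outputs=vgg.get_layer('fc2').output)

image = np.random.rand(1, 224, 224, 3).astype('float32') * 255.0    # stand-in for a real image
image = tf.keras.applications.vgg16.preprocess_input(image)
representation = feature_extractor.predict(image)

print(representation.shape)   # (1, 4096): the compact representation of the image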


[Figure: t-SNE (T-Distributed Stochastic Neighbor Embedding) visualization of learned image representations. From Stanford CS 231n.]


} Deep Learning solves the problem of Representation Learning
by using
1. Compositions
The process of assembling representations from simpler
object representations
2. Hierarchies
The process of building higher-level representations on top of
simpler ones
From Deep Learning by Goodfellow et al.
} In traditional Computer Science, words/documents are
represented using data structures such as arrays, dictionaries, etc.
} These representations are good enough to answer questions
such as the number of times a particular word occurs in the
document
} But what about Higher Level semantic queries, such as:
Translate this book into German
Did the reviewer like this book?
If John liked the Harry Potter books, will he like books by Roald Dahl?
} Representing words as vectors allows us to do the following:
Measure similarity between words using the dot product of
their vectors
Represent a web page using the average of its vectors
Complete analogies such as "London is to England as Paris is to -----"
by doing vector arithmetic (see the toy sketch below).
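A toy sketch of all three uses (the vectors below are made-up numbers for illustration; real word vectors would come from a trained embedding model such as word2vec or GloVe):

# A toy sketch with made-up 3-dimensional word vectors; real embeddings are
# learned from data and have hundreds of dimensions.
import numpy as np

vectors = {
    'london':  np.array([0.9, 0.1, 0.8]),
    'england': np.array([0.8, 0.2, 0.9]),
    'paris':   np.array([0.9, 0.9, 0.1]),
    'france':  np.array([0.8, 1.0, 0.2]),
}

def similarity(u, v):
    """Similarity of two words via the (normalized) dot product of their vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# 1. Similarity between words
print(similarity(vectors['london'], vectors['england']))

# 2. Represent a web page by the average of its word vectors
page_words = ['london', 'england', 'paris']
page_vector = np.mean([vectors[w] for w in page_words], axis=0)

# 3. Analogy by vector arithmetic: London is to England as Paris is to ?
query = vectors['england'] - vectors['london'] + vectors['paris']
answer = max((w for w in vectors if w != 'paris'),
             key=lambda w: similarity(vectors[w], query))
print(answer)   # 'france' with these toy vectors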
http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
} Supervised Learning: Learn from labeled examples of correct output
} Unsupervised Learning: There are no labeled examples; look for
interesting patterns and find representations
} Reinforcement Learning: A type of supervised learning; instead of being
told the correct output, the system is given rewards
[Figures from Stanford CS 231n: image classification datasets - 32x32x3 images; 14M images in 22,000 categories; the ILSVRC challenge uses 1.5M images in 1000 categories, on which human error is about 5.1%.]
Google Translate recently replaced its Bayesian translator with one based on
Deep Learning
Deep Learning has dramatically cut down the error rates in
Speech Recognition
From Deep Visual-Semantic Alignments for Generating Image Descriptions
by Karpathy and Li
[Figure: input is the raw pixels of the game screen; output is joystick game commands.]
From Human-level control through deep reinforcement learning by Mnih et al.
From End to End Learning for Self-Driving Cars by Bojarski et al.
From Stanford CS 231n
} Today Neural Networks can
See
Hear
Translate Languages
Do Optimal Control
} Future
...Think???
} Learning Models: Supervised, Unsupervised,
Reinforcement
} Training DLNs: Gradient Descent, Backprop
} Improving the Training Process and Model
Generalization Ability
} DLN Tools: TensorFlow
} Convolutional Neural Networks (ConvNets)
} Recurrent Neural Networks (RNNs)
} Reinforcement Learning
} Unsupervised Learning
} Lecture 1a - Introduction: Introduction to Deep Learning, discussion of important applications, and a historical
overview of the development of this topic.
} Lecture 1b - Mathematical Preliminaries: A short overview of Probability Theory, Bayes Rule, Maximum
Likelihood Estimation.
} Lecture 2a - Learning Models: Discussion of Supervised Learning, Unsupervised Learning, Linear Learning
Networks (Logistic Regression) and Multi-Layer Deep Learning Networks (DLN).
} Lecture 2b - DLN Training Techniques: Gradient Descent, Batch and Stochastic Gradient Descent, Backprop
(Forward and Backward Passes).
} Lecture 3a - Improving Gradient Descent: Vanishing Gradient Problem, Activation and Loss Functions, Learning
Rate Selection, Momentum based techniques, Weight Initialization, Data Pre-Processing, Batch Normalization.
} Lecture 3b - Improving Model Generalization Ability: The Under-fitting and Over-fitting problems,
Regularization Techniques (L2, L1 and Dropout), Hyper-Parameter Selection (Manual and Automated Tuning).
} Lectures 4a and 4b - Tools and Techniques: Introduction to TensorFlow.
} Lectures 5a, 5b, 6a - Convolutional Neural Networks (ConvNets): History and Architecture of ConvNets,
Convolutions, Pooling and Padding, Sizing ConvNets, Training ConvNets, Small Filters, ConvNet Architectures
(LeNet5, AlexNet, ZFNet, InceptionNet, VGGNet and ResNet), Transfer Learning using ConvNets.
} Lectures 6b, 7a, 7b - Recurrent Neural Networks (RNNs): RNN Architectures (One to One, Many to One, Many to
Many), Difficulties in Training RNNs and how to solve them, LSTMs, GRUs, Language Models, Encoder-Decoder
Systems, Attention Mechanism, Applications to Speech Recognition, Machine Translation and Image Captioning,
Incorporation of Memory into RNNs.
} Lectures 8a, 8b, 9a - Reinforcement Learning: Introduction, Markov Decision Processes, Model based and Model
Free Architectures, Q-Functions, Dynamic Programming, Value and Policy Iteration, Sample Path based Learning
(Monte Carlo Methods, Temporal Difference (TD) Learning), Approximating Q-Functions using DLNs, Policy
Gradient Methods.
} Lectures 9b, 10a, 10b - Unsupervised Learning: Autoencoders, Representation Learning, Generative Adversarial
Networks (GANs).
} Chapter 1 of Deep Learning by Goodfellow, Bengio and
Courville.
http://www.deeplearningbook.org/
} Chapters 1 and 2 of Deep Learning by Das and Varma.
http://srdas.github.io/DLBook/
} Deep Learning by LeCun, Bengio and Hinton, Nature, Vol.
521, pp. 436-444 (May, 2015).
http://pages.cs.wisc.edu/~dyer/cs540/handouts/deep-learning-nature2015.pdf
