What's Next For ML & You: Emily Fox & Carlos Guestrin

What’s next for ML & you

Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
©2015 Emily Fox & Carlos Guestrin Machine Learning Specialization
Deploying an ML service

What is Production?
•  Deployment: serving live predictions
•  Evaluation: measuring quality of deployed models
•  Management: choosing between deployed models
•  Monitoring: tracking model quality & operations
Lifecycle of ML in Production
Deployment Evaluation
Management
Monitoring
The Setup…
Suppose we are building a website with product recommendations, trained using user reviews.
•  34.6M reviews
•  2.4M products
•  6.6M users

Deployment System
•  Batch training
•  Real-time predictions
[Diagram labels: Historical Data, Model, User & session info, Predictions (recommendations), Live Data, Feedback]
What happens after
(initial) deployment

Lifecycle of ML in Production
Deployment Evaluation
Management
Monitoring
After deployment
Evaluation Management Monitoring
Evaluate and track metrics over time

React to feedback from deployed models

Feedback loop for ML in production
[Diagram labels: Historical Data, Model, Predictions, Live Data, Feedback]
Learning new, alternative models
[Diagram labels: Historical Data, Model, Model 2, Predictions, Live Data]

Key questions
•  When to update a model?

•  How to choose between existing models?
•  Answer: continuous evaluation and testing

What is evaluation?
Evaluation = Predictions + Metric
•  What data?
•  Which metric?

Evaluating a recommender
•  Offline evaluation: sum squared error on historical data (used for: when to update model)
•  Online evaluation: user engagement on live data (used for: choosing between models)
[Diagram labels: Historical Data, Model, Predictions, Live Data]
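The offline metric named on this slide, sum squared error, can be sketched in a few lines. This is a minimal illustration only: the function name and the rating values are hypothetical and do not come from the course materials.

```python
def sum_squared_error(actual, predicted):
    """Sum of squared differences between actual and predicted ratings."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Hypothetical held-out ratings and model predictions (illustrative values).
held_out_ratings = [5.0, 3.0, 4.0, 1.0]
predicted_ratings = [4.5, 3.5, 4.0, 2.0]

offline_sse = sum_squared_error(held_out_ratings, predicted_ratings)
print(offline_sse)
```

A lower value means the predictions sit closer to the historical ratings. Online evaluation, by contrast, needs live traffic and a measure of user engagement, so it cannot be computed from a held-out file like this.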
Updating ML models
Why update?
•  Trends and user tastes change over time
•  Model performance drops
When to update?
•  Track statistics of data over time
•  Monitor both offline & online metrics
•  Update when the offline metric diverges from online metrics, or when desired targets are not being met
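One way to turn the update rule above into a check is sketched here. It rests on loud assumptions that are not in the slides: both metrics are already scaled to the same range (say 0 to 1, higher is better), "diverges" means the absolute gap exceeds a chosen threshold, and the names `should_update`, `max_gap`, and `online_target` are hypothetical.

```python
def should_update(offline_score, online_score, online_target, max_gap=0.1):
    """Return True if the metrics diverge or the online target is missed.

    Assumes both scores are on the same scale, with higher meaning better.
    """
    diverged = abs(offline_score - online_score) > max_gap
    below_target = online_score < online_target
    return diverged or below_target


# Illustrative values only.
stable = should_update(0.80, 0.78, online_target=0.70)
diverging = should_update(0.80, 0.55, online_target=0.50)
missing_target = should_update(0.60, 0.58, online_target=0.70)
print(stable, diverging, missing_target)
```

In practice the threshold and the target would come from tracking these statistics over time rather than from fixed constants.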
A/B Testing: Choosing between ML models
•  Group A: Model 1, 2000 visits, 10% CTR
•  Group B: Model 2, 2000 visits, 30% CTR
•  Result: everybody gets Model 2
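The slide's numbers (2000 visits per group, 10% versus 30% CTR) can be checked with a two-proportion z-test. The test itself is not mentioned in the slides; it is one common way to judge whether a CTR gap is larger than chance, shown here as a sketch with a hypothetical helper name.

```python
import math


def two_proportion_z(clicks_a, visits_a, clicks_b, visits_b):
    """z statistic for the difference between two click-through rates."""
    p_a = clicks_a / visits_a
    p_b = clicks_b / visits_b
    # Pooled rate under the null hypothesis that both models perform equally.
    pooled = (clicks_a + clicks_b) / (visits_a + visits_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    return (p_b - p_a) / std_err


visits = 2000
clicks_a = round(0.10 * visits)  # Group A, Model 1
clicks_b = round(0.30 * visits)  # Group B, Model 2

z = two_proportion_z(clicks_a, visits, clicks_b, visits)
print(f"z = {z:.2f}")
```

A z value far above roughly 2 suggests the difference is unlikely to be noise, which is consistent with the slide's conclusion that everybody gets Model 2.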

Other production considerations
•  A/B testing caveats

- Also multi-armed bandits
•  Versioning
•  Provenance
•  Dashboards
•  Reports
•  …

Machine learning challenges

Open challenges:
Model selection
[Figure: a classifier takes user info, purchase history, product info, and other info, and outputs Yes! or No]
[Figure: rating matrix X, rows index movies, columns index users; Xij known for black cells, Xij unknown for white cells; X ≈ parameters of model]
Open challenges:
Feature engineering/representation
1 0 0 0 5 3 0 0 1 0 0 0 0
•  Bag of words raw counts?
•  Normalize?
•  tf-idf? (which version???)
•  Bigrams
•  Trigrams
•  …
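The representation choices listed above can be compared side by side on a toy corpus. This is a sketch under stated assumptions: the two review strings are invented, the helper names are hypothetical, and the tf-idf shown (raw count times log(N/df)) is only one of the many versions the slide alludes to. It also assumes the document being weighted is part of the corpus, so every word has a document frequency of at least 1.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def ngram_counts(tokens, n=1):
    """Raw counts of n-grams: n=1 is bag of words, n=2 bigrams, n=3 trigrams."""
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)


def normalize(counts):
    """Scale counts to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def tf_idf(doc_counts, corpus_counts):
    """One tf-idf variant: raw term count times log(N / document frequency)."""
    n_docs = len(corpus_counts)
    weights = {}
    for word, tf in doc_counts.items():
        df = sum(1 for counts in corpus_counts if word in counts)
        weights[word] = tf * math.log(n_docs / df)
    return weights


docs = ["the food was great great", "the service was slow"]  # invented reviews
unigrams = [ngram_counts(tokenize(d)) for d in docs]
bigrams = ngram_counts(tokenize(docs[0]), n=2)
unit = normalize(unigrams[0])
weights = tf_idf(unigrams[0], unigrams)
print(unigrams[0], bigrams, weights, sep="\n")
```

Words that appear in every document, such as "the", get zero weight under this variant, while "great" is weighted up. Other variants smooth the logarithm or normalize the term count, which is why the choice counts as an open question.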
Open challenges:
Scaling
Data is getting big…

Open challenges:
Scaling
Concurrently, models are getting big…
[Figure labels: multi-channel spatial EEG data (axes: channels, time), AR model, per-channel state sequences, across-channel state, covariance model]
CPUs stopped getting faster…
[Figure: processor speed (GHz, ticks from 0.01 to 10) versus release date (1988 to 2010); recent speeds are labeled "constant"]
ML in the context of parallel architectures
•  GPUs
•  Multicore
•  Clusters
•  Clouds
•  Supercomputers
But scalable ML in these systems is hard, especially in terms of:
1.  Programmability
2.  Data distribution
3.  Failures
What’s ahead in this specialization

2. Regression
Case study: Predicting house prices
Models
•  Linear regression
•  Regularization: Ridge (L2), Lasso (L1)
Including many features:
- Square feet
- # bathrooms
- # bedrooms
- Lot size
- Year built
- …
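2. Regression
Algorithms
•  Gradient descent
•  Coordinate descent
$$
\begin{aligned}
\mathrm{RSS}(w_0, w_1) ={} & (\$_{\text{house 1}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 1}}])^2 \\
 & + (\$_{\text{house 2}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 2}}])^2 \\
 & + (\$_{\text{house 3}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 3}}])^2 \\
 & + \dots \ \text{[include all houses]}
\end{aligned}
$$
[Figure label: ŵ]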
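The RSS formula on this slide and the gradient descent algorithm it lists fit together as sketched here. The data, step size, and iteration count are hypothetical choices made so that the example converges; they are not taken from the course.

```python
def rss(w0, w1, sqft, price):
    """Residual sum of squares for the line price = w0 + w1 * sqft."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(sqft, price))


def gradient_descent(sqft, price, step=0.01, iters=5000):
    """Minimize RSS over (w0, w1) with fixed-step gradient descent."""
    w0, w1 = 0.0, 0.0
    for _ in range(iters):
        residuals = [y - (w0 + w1 * x) for x, y in zip(sqft, price)]
        # Partial derivatives of RSS with respect to w0 and w1.
        grad_w0 = -2 * sum(residuals)
        grad_w1 = -2 * sum(r * x for r, x in zip(residuals, sqft))
        w0 -= step * grad_w0
        w1 -= step * grad_w1
    return w0, w1


# Hypothetical houses: size in thousands of sq.ft., price in $100k,
# generated exactly from price = 1 + 2 * sqft.
sqft = [1.0, 2.0, 3.0]
price = [3.0, 5.0, 7.0]

w0_hat, w1_hat = gradient_descent(sqft, price)
print(w0_hat, w1_hat, rss(w0_hat, w1_hat, sqft, price))
```

The features are kept on a small scale here on purpose. With raw square footage in the thousands, a fixed step of 0.01 would diverge, so real data usually needs rescaling or a smaller step.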
2. Regression
Concepts
•  Loss functions, bias-variance tradeoff, cross-validation, sparsity, overfitting, model selection
[Figure: price ($) versus square feet (sq.ft.)]
3. Classification
Case study: Analyzing sentiment
Models
•  Linear classifiers (logistic regression, SVMs, perceptron)
•  Kernels
•  Decision trees
3. Classification
Algorithms
•  Stochastic gradient descent
•  Boosting
Squeezing last bit of accuracy by blending models

3. Classification
Concepts
•  Decision boundaries, MLE, ensemble methods, random forests, CART, online learning
4. Clustering & Retrieval
Case study: Finding documents
Models
•  Nearest neighbors
•  Clustering, mixtures of Gaussians
•  Latent Dirichlet allocation (LDA)
[Figure: documents grouped by topic: SPORTS, WORLD NEWS, ENTERTAINMENT, SCIENCE]
Algorithms
•  KD-trees, locality-sensitive hashing (LSH)
•  K-means
•  Expectation-maximization (EM)

Concepts
•  Distance metrics, approximation algorithms, hashing, sampling algorithms, scaling up with map-reduce
Example: inner product of two word-count vectors
[1 0 0 0 5 3 0 0 1 0 0 0 0] · [3 0 0 0 2 0 0 1 0 1 0 0 0] = 1*3 + 5*2 = 13
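The arithmetic on this slide (1*3 + 5*2 = 13) is the inner product of two word-count vectors, a basic similarity score for document retrieval. A short sketch reproduces it; the variable names are hypothetical, and the vectors are the two digit strings from the slide.

```python
# Word-count vectors for two documents, one entry per vocabulary word.
doc_a = [1, 0, 0, 0, 5, 3, 0, 0, 1, 0, 0, 0, 0]
doc_b = [3, 0, 0, 0, 2, 0, 0, 1, 0, 1, 0, 0, 0]

# Only positions where both documents use the word contribute to the sum.
similarity = sum(a * b for a, b in zip(doc_a, doc_b))
print(similarity)
```

Only the first and fifth vocabulary words appear in both documents, so only those two products are nonzero.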

5. Recommender Systems & Dimensionality Reduction
Case study: Recommending Products
Models
•  Collaborative filtering
•  Matrix factorization
•  PCA
[Figure: rating matrix X, columns index users; Xij known for black cells; X ≈ parameters of model]
5. Matrix Factorization & Dimensionality Reduction
Case study: Recommending Products
Algorithms
•  Coordinate descent
•  Eigen decomposition
•  SVD
[Figure: rating matrix X, rows index movies, columns index users; Xij known for black cells; form estimates Lu and Rv]
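The figure labels on this slide refer to estimates Lu and Rv. In the usual matrix factorization setup, which is assumed here rather than stated on the slide, the predicted rating for user u and movie v is the inner product of those two factor vectors, and fit is measured only on the known cells. All names and numbers in this sketch are hypothetical.

```python
def predict_rating(l_u, r_v):
    """Predicted rating: inner product of user factors and movie factors."""
    return sum(l * r for l, r in zip(l_u, r_v))


def observed_sse(ratings, user_factors, movie_factors):
    """Sum squared error over known cells only; unknown cells are skipped."""
    return sum(
        (x - predict_rating(user_factors[u], movie_factors[v])) ** 2
        for (u, v), x in ratings.items()
    )


# Hypothetical two-dimensional factors and two known ratings.
user_factors = {"u1": [1.0, 0.5], "u2": [0.0, 2.0]}
movie_factors = {"m1": [4.0, 2.0], "m2": [1.0, 1.0]}
ratings = {("u1", "m1"): 5.0, ("u2", "m2"): 3.0}

prediction = predict_rating(user_factors["u1"], movie_factors["m1"])
error = observed_sse(ratings, user_factors, movie_factors)
print(prediction, error)
```

Coordinate descent, listed above, would alternate between updating the user factors with the movie factors held fixed and the reverse, lowering this error at each step.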
5. Matrix Factorization & Dimensionality Reduction
Case study: Recommending Products
Concepts
•  Matrix completion, eigenvalues, random projections, cold-start problem, diversity, scaling up
[Figure: customers-by-products matrix, columns index users; Xij known for black cells, Xij unknown for white cells]

6. Capstone: Build and deploy an intelligent
application with deep learning
[Diagram: text sentiment analysis, computer vision, recommenders, and deep learning feed into the capstone project: deploy an intelligent web app]
