
What’s next for ML & you


Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
©2015 Emily Fox & Carlos Guestrin Machine Learning Specialization
Deploying an ML service



What is Production?
•  Deployment: Serving live predictions
•  Evaluation: Measuring quality of deployed models
•  Management: Choosing between deployed models
•  Monitoring: Tracking model quality & operations
Lifecycle of ML in Production

[Figure: cycle linking Deployment, Evaluation, Management, and Monitoring]
The Setup…

Suppose we are building a website with product recommendations, trained using user reviews.

•  34.6M reviews
•  2.4M products
•  6.6M users



Deployment System

[Figure: deployment system. Batch training: Historical Data, Model. Real-time predictions: User & session info, Predictions (recommendations), Live Data, Feedback.]
What happens after (initial) deployment


After deployment

Evaluation Management Monitoring

Evaluate and track metrics over time


React to feedback from deployed models



Feedback loop for ML in production

[Figure: feedback loop. Batch training: Historical Data, Model. Real-time predictions: Predictions, Live Data, Feedback.]
Learning new, alternative models

[Figure: Batch training: Historical Data, Model, Model 2. Real-time predictions: Predictions, Live Data.]


Key questions

•  When to update a model?
•  How to choose between existing models?
•  Answer: continuous evaluation and testing


What is evaluation?

Evaluation = Predictions + Metric
•  Predictions: What data?
•  Metric: Which metric?


Evaluating a recommender
•  Offline evaluation: sum squared error of the Model's Predictions on Historical Data. Use: when to update model.
•  Online evaluation: user engagement on Live Data. Use: choosing between models.
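The offline side of this slide can be sketched in a few lines. This is a minimal illustration, assuming ratings are plain numbers; the function name `sum_squared_error` and the sample ratings are invented for the example and do not come from the course material.

```python
def sum_squared_error(actual, predicted):
    """Sum of squared differences between observed and predicted ratings."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Hypothetical held-out ratings and two models' predictions for them.
held_out_ratings = [5.0, 3.0, 4.0, 1.0]
model_1_predictions = [4.0, 3.0, 5.0, 2.0]
model_2_predictions = [5.0, 3.0, 4.0, 3.0]

sse_model_1 = sum_squared_error(held_out_ratings, model_1_predictions)
sse_model_2 = sum_squared_error(held_out_ratings, model_2_predictions)
```

A lower value means the predictions sit closer to the historical ratings. It says nothing yet about user engagement, which is what the online evaluation measures.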
Updating ML models
Why update?
•  Trends and user tastes change over time
•  Model performance drops
When to update?
•  Track statistics of data over time
•  Monitor both offline & online metrics
•  Update when the offline metric diverges from online metrics, or when the model is not achieving desired targets
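One possible way to turn the "when to update" rule into a check is sketched below. Everything here is an assumption made for illustration: both metrics are treated as positive, higher-is-better scores, "divergence" is defined as a gap between their relative changes since the first measurement, and the names `should_update`, `max_gap`, and `online_target` are hypothetical.

```python
def relative_change(history):
    """Relative change of the latest value versus the first (baseline) value."""
    baseline, latest = history[0], history[-1]
    return (latest - baseline) / baseline


def should_update(offline_history, online_history, online_target, max_gap=0.10):
    """Return True if the two metrics diverge or the online target is missed.

    Assumes both histories are non-empty lists of positive scores where
    higher is better, ordered from oldest to newest.
    """
    gap = abs(relative_change(offline_history) - relative_change(online_history))
    diverged = gap > max_gap
    below_target = online_history[-1] < online_target
    return diverged or below_target


# Offline score holds steady while online engagement falls: time to update.
needs_update = should_update([0.80, 0.80, 0.81], [0.30, 0.27, 0.24], online_target=0.20)
```

The threshold `max_gap` is a tuning choice; a real system would pick it from how noisy the tracked statistics are.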
A/B Testing: Choosing between ML models
•  Group A (Model 1): 2000 visits, 10% CTR
•  Group B (Model 2): 2000 visits, 30% CTR
•  Outcome: Everybody gets Model 2
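Before switching everybody to the winning model, it is worth checking that the CTR gap is larger than chance would explain. The sketch below applies a standard two-proportion z-test, which is a technique chosen for this illustration rather than something the slides prescribe. The click counts (200 and 600) are derived from the visits and CTRs above; the function name is hypothetical.

```python
import math


def two_proportion_z_test(clicks_a, visits_a, clicks_b, visits_b):
    """Return (z, two-sided p-value) for the difference in click-through rate."""
    rate_a = clicks_a / visits_a
    rate_b = clicks_b / visits_b
    # Pooled rate under the null hypothesis that both groups share one CTR.
    pooled = (clicks_a + clicks_b) / (visits_a + visits_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Group A: 10% of 2000 visits; Group B: 30% of 2000 visits.
z_score, p_value = two_proportion_z_test(200, 2000, 600, 2000)
```

With a gap this large and 2000 visits per group, the p-value is vanishingly small. Smaller gaps or fewer visits are where the test starts to matter.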


Other production considerations

•  A/B testing caveats


- Also multi-armed bandits
•  Versioning
•  Provenance
•  Dashboards
•  Reports
•  …
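As a rough illustration of the multi-armed bandit idea mentioned above, the sketch below simulates an epsilon-greedy policy: most visits go to the model with the best observed CTR so far, and a small fraction is spent exploring. The policy choice, the simulated "true" CTRs, and all names are assumptions made for this example only.

```python
import random


def epsilon_greedy(true_ctrs, n_visits, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy traffic allocation across models.

    true_ctrs are hypothetical click-through rates used only to simulate
    clicks. Returns (shows, clicks), one entry per model.
    """
    rng = random.Random(seed)
    n_models = len(true_ctrs)
    shows = [0] * n_models
    clicks = [0] * n_models
    for _ in range(n_visits):
        # Explore with probability epsilon, or until every model was shown once.
        explore = rng.random() < epsilon or 0 in shows
        if explore:
            model = rng.randrange(n_models)
        else:
            model = max(range(n_models), key=lambda i: clicks[i] / shows[i])
        shows[model] += 1
        if rng.random() < true_ctrs[model]:
            clicks[model] += 1
    return shows, clicks


shows, clicks = epsilon_greedy([0.10, 0.30], n_visits=4000)
```

Unlike a fixed A/B split, the allocation shifts toward the better-performing model while the experiment is still running.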



Machine learning challenges



Open challenges:
Model selection
[Figure: two candidate models. A Classifier that takes user info, purchase history, product info, and other info and outputs Yes!/No; and a rating matrix X (rows index movies, columns index users; Xij known for black cells, unknown for white cells) ≈ parameters of model.]
Open challenges:
Feature engineering/representation

1   0   0   0   5   3   0   0   1   0   0   0   0  
•  Bag-of-words raw counts?
•  Normalize?
•  tf-idf? (which version???)
•  Bigrams
•  Trigrams
•  …
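The representation choices listed above can be sketched as follows. This is a minimal illustration with whitespace tokenization; the tf-idf variant used (raw count times log(N / document frequency)) is just one of the many versions the slide alludes to, and all function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Raw word counts after lowercasing and splitting on whitespace."""
    return Counter(text.lower().split())


def bigrams(text):
    """Counts of adjacent word pairs."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def tf_idf(documents):
    """One tf-idf variant: raw count * log(N / number of docs containing the word)."""
    counts = [bag_of_words(doc) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter()
    for doc_counts in counts:
        doc_freq.update(doc_counts.keys())
    return [
        {word: tf * math.log(n_docs / doc_freq[word]) for word, tf in doc_counts.items()}
        for doc_counts in counts
    ]


documents = ["the cat sat", "the dog sat down"]
weights = tf_idf(documents)
```

Words that appear in every document (here "the" and "sat") get weight zero under this variant, while words unique to one document keep a positive weight.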
Open challenges:
Scaling
Data is getting big…



Open challenges:
Scaling
Concurrently, models are getting big…
[Figure: multi-channel EEG data (channels over time) with an AR model, per-channel state sequences, an across-channel state, and a spatial covariance model]
CPUs stopped getting faster…
[Figure: processor speed (GHz) versus release date, 1988 to 2010, with a region labeled "constant"]
ML in the context of parallel architectures

•  GPUs
•  Multicore
•  Clusters
•  Clouds
•  Supercomputers

But scalable ML in these systems is hard, especially in terms of:
1.  Programmability
2.  Data distribution
3.  Failures
What’s ahead in this specialization


2. Regression
Case study: Predicting house prices
Models
•  Linear regression
•  Regularization: Ridge (L2), Lasso (L1)

Including many features:
- Square feet
- # bathrooms
- # bedrooms
- Lot size
- Year built
- …
2. Regression
Case study: Predicting house prices
Algorithms
•  Gradient descent
•  Coordinate descent

$$\mathrm{RSS}(w_0, w_1) = (\$_{\text{house 1}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 1}}])^2 + (\$_{\text{house 2}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 2}}])^2 + (\$_{\text{house 3}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 3}}])^2 + \dots \ [\text{include all houses}]$$

[Figure: plot marking ŵ]
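To make the RSS objective concrete, here is a small gradient descent sketch for the one-feature case. The data, step size, and iteration count are made-up values chosen so the example converges; square feet are expressed in thousands and prices in hundreds of thousands of dollars to keep the numbers well scaled.

```python
def rss(w0, w1, sqft, price):
    """Residual sum of squares for the line price = w0 + w1 * sqft."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(sqft, price))


def gradient_descent(sqft, price, step_size=0.01, n_steps=5000):
    """Minimize RSS(w0, w1) by repeatedly stepping against its gradient."""
    w0, w1 = 0.0, 0.0
    for _ in range(n_steps):
        errors = [y - (w0 + w1 * x) for x, y in zip(sqft, price)]
        grad_w0 = -2 * sum(errors)
        grad_w1 = -2 * sum(e * x for e, x in zip(errors, sqft))
        w0 -= step_size * grad_w0
        w1 -= step_size * grad_w1
    return w0, w1


# Hypothetical houses lying exactly on the line price = 1 + 1 * sqft.
sqft = [1.0, 2.0, 3.0]
price = [2.0, 3.0, 4.0]
w0_hat, w1_hat = gradient_descent(sqft, price)
```

Because the toy data lie on a line, the fitted intercept and slope approach 1 and the RSS approaches 0. On unscaled features a fixed step size like this one would usually need adjusting.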
2. Regression
Case study: Predicting house prices
Concepts
•  Loss functions, bias-variance tradeoff, cross-validation, sparsity, overfitting, model selection

[Figure: price ($) versus square feet (sq.ft.)]
3. Classification
Case study: Analyzing sentiment
Models
•  Linear classifiers (logistic regression, SVMs, perceptron)
•  Kernels
•  Decision trees


3. Classification
Case study: Analyzing sentiment

Algorithms
•  Stochastic gradient descent
•  Boosting

Squeezing last bit of accuracy by blending models


3. Classification
Case study: Analyzing sentiment

Concepts
•  Decision boundaries, MLE, ensemble methods, random forests, CART, online learning

[Figure: axis labeled Time]
4. Clustering & Retrieval
Case study: Finding documents
Models
•  Nearest neighbors
•  Clustering, mixtures of Gaussians
•  Latent Dirichlet allocation (LDA)

[Figure: groups of documents labeled SPORTS, WORLD NEWS, ENTERTAINMENT, SCIENCE]
4. Clustering & Retrieval
Case study: Finding documents
Algorithms
•  KD-trees, locality-sensitive hashing (LSH)
•  K-means
•  Expectation-maximization (EM)


4. Clustering & Retrieval
Case study: Finding documents
Concepts
•  Distance metrics, approximation algorithms, hashing, sampling algorithms, scaling up with map-reduce

[1 0 0 0 5 3 0 0 1 0 0 0 0] · [3 0 0 0 2 0 0 1 0 1 0 0 0] = 1*3 + 5*2 = 13
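The worked example on this slide is an inner product between two count vectors. A minimal sketch, assuming each vector holds word counts for one document over a shared vocabulary (the names `doc_1` and `doc_2` are hypothetical):

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


# The two count vectors from the slide.
doc_1 = [1, 0, 0, 0, 5, 3, 0, 0, 1, 0, 0, 0, 0]
doc_2 = [3, 0, 0, 0, 2, 0, 0, 1, 0, 1, 0, 0, 0]

similarity = dot(doc_1, doc_2)  # 1*3 + 5*2 = 13
```

Only positions where both vectors are nonzero contribute, which is why sparse representations make this computation cheap at scale.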


5. Recommender Systems & Dimensionality Reduction
Case study: Recommending Products
Models
•  Collaborative filtering
•  Matrix factorization
•  PCA

[Figure: rating matrix X (rows index movies, columns index users; Xij known for black cells, unknown for white cells) ≈ parameters of model]
5. Matrix Factorization & Dimensionality Reduction
Case study: Recommending Products

Algorithms
•  Coordinate descent
•  Eigen decomposition
•  SVD

[Figure: form estimates Lu and Rv; rating matrix X (rows index movies, columns index users; Xij known for black cells, unknown for white cells)]
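Once estimates Lu and Rv are available, filling in the rating matrix is a set of inner products. The sketch below assumes Lu is a factor vector for user u and Rv a factor vector for movie v, and that a predicted rating is their dot product; the factor values are invented, and how the factors are fitted (for example by coordinate descent) is not shown.

```python
def predict_rating(user_factors, movie_factors):
    """Predicted rating as the inner product of a user and a movie factor vector."""
    return sum(l * r for l, r in zip(user_factors, movie_factors))


def complete_matrix(L, R):
    """Fill in every cell of X: rows index movies, columns index users."""
    return [[predict_rating(L[u], R[v]) for u in range(len(L))] for v in range(len(R))]


# Hypothetical factor estimates: 2 users and 3 movies, 2 factors each.
L = [[1.0, 0.5], [0.0, 2.0]]              # one row per user (Lu)
R = [[2.0, 1.0], [1.0, 1.0], [0.0, 3.0]]  # one row per movie (Rv)

X_hat = complete_matrix(L, R)
```

Cells that were unknown in X receive a prediction in the same way as known cells, which is what makes the factorization usable for recommendations.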
5. Matrix Factorization & Dimensionality Reduction
Case study: Recommending Products

Concepts
•  Matrix completion, eigenvalues, random projections, cold-start problem, diversity, scaling up

[Figure: Customers by Products matrices, shown with the rating matrix X (rows index movies, columns index users; Xij known for black cells, unknown for white cells)]


6. Capstone: Build and deploy an intelligent application with deep learning

[Figure: Capstone project combining text sentiment analysis, computer vision, recommenders, and deep learning to deploy an intelligent web app]
