
Cordelia Schmid

Category recognition – tasks

• Image classification: assigning a class label to the image

Car: present

Cow: present

Bike: not present

Horse: not present

…

• Object localization: define the location and the category

[Figure: image with boxes labelled Car and Cow, giving location and category]

Difficulties: within-object variations

Difficulties: within-class variations

Image classification

• Given: positive training images containing an object class

• Classify: a test image as to whether it contains the object class or not

Bag-of-features – origin: texture recognition

Texture is characterized by the repetition of basic elements, or textons

Julesz, 1981; Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie & Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

The image is represented by a histogram of texton frequencies over a texton dictionary

Bag-of-features – origin: bag of words (text)

• Orderless document representation: frequencies of words from a dictionary

• Classification to determine document categories

Word        Doc 1  Doc 2  Doc 3  Doc 4
Common        2      0      1      3
People        3      0      0      2
Sculpture     0      1      3      0
…             …      …      …      …
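The word-frequency representation above can be sketched in a few lines of Python (a minimal illustration; the vocabulary and document are made up):

```python
import re
from collections import Counter

def bow_vector(document, vocabulary):
    # Tokenize into lowercase words and count occurrences
    tokens = re.findall(r"[a-z]+", document.lower())
    counts = Counter(tokens)
    # Orderless representation: one frequency per dictionary word
    return [counts[w] for w in vocabulary]

vocab = ["common", "people", "sculpture"]
doc = "People admire the sculpture; the sculpture is common. People agree."
print(bow_vector(doc, vocab))  # -> [1, 2, 2]
```

The same counting scheme, applied to quantized image descriptors instead of words, gives the bag-of-features representation used below.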

Bag-of-features for image classification

[Pipeline: Step 1, extract local descriptors; Step 2, build the descriptor-frequency histogram; Step 3, SVM classification]

[Zhang, Marszalek, Lazebnik & Schmid, IJCV’07]

Step 1: feature extraction

• Scale-invariant image regions + SIFT (see previous lecture)
  – Affine-invariant regions give “too” much invariance
  – Rotation invariance is, for many realistic collections, also “too” much invariance

• Dense descriptors
  – Improve results in the context of categories (for most categories)
  – Interest points do not necessarily capture “all” features

• Color-based descriptors

• Shape-based descriptors

Dense features

- Computation of the SIFT descriptor for each grid cell
- Example: horizontal/vertical step size of 3 pixels, scaling factor of 1.2 per level
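A minimal sketch of such a dense multi-scale grid (the base patch size of 16 pixels is an assumption; the step size and scale factor follow the example above):

```python
def dense_grid(width, height, step=3, base_patch=16, scale_factor=1.2, n_levels=3):
    """Enumerate (x, y, patch_size) positions for dense descriptor extraction.
    The patch size grows by `scale_factor` per level; the grid step stays fixed."""
    locations = []
    for level in range(n_levels):
        size = int(round(base_patch * scale_factor ** level))
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                locations.append((x, y, size))
    return locations

# A SIFT descriptor would then be computed on each returned patch
patches = dense_grid(64, 48)
```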


Step 2: Quantization

Clustering the descriptors produces a visual vocabulary

[Examples of visual words for: airplanes, motorbikes, faces, wild cats, leaves, people, bikes]

Step 2: Quantization

• Cluster descriptors
  – K-means
  – Gaussian mixture model

• Assign each descriptor to a cluster (visual word)
  – Hard or soft assignment

K-means clustering

• Minimize the sum of squared Euclidean distances between points xi and their nearest cluster centers

• Algorithm:
  – Randomly initialize K cluster centers
  – Iterate until convergence:
    • Assign each data point to the nearest center
    • Recompute each cluster center as the mean of all points assigned to it
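The two alternating steps above translate directly into NumPy (a minimal sketch; initialization picks random data points as centers):

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """K-means: random init, then alternate (1) assigning each point to the
    nearest center and (2) recomputing each center as the mean of its points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # squared Euclidean distance of every point to every center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # mean of assigned points; keep the old center if a cluster is empty
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

For a visual vocabulary, X would hold the SIFT descriptors of the training images and the returned centers are the visual words.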

Gaussian mixture model (GMM)

• Mixture of Gaussians: weighted sum of Gaussians

p(x) = Σk πk N(x | μk, Σk),  where the mixture weights satisfy Σk πk = 1

Hard or soft assignment

• K-means: hard assignment
  – Assign each descriptor to the closest cluster center
  – Count the number of descriptors assigned to each center

• Gaussian mixture model: soft assignment
  – Estimate the distance to all centers
  – Sum over the number of descriptors

Image representation: histogram of visual-word frequencies over the codewords

• Normalization with the L1/L2 norm
• Fine grained: represents model instances
• Coarse grained: represents object categories
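Building this histogram with hard assignment and L1 normalization can be sketched as (a minimal NumPy illustration with a made-up two-word codebook):

```python
import numpy as np

def bof_histogram(descriptors, codebook, norm="l1"):
    """Hard assignment: each descriptor votes for its nearest codeword;
    the image is the normalized frequency histogram over the codewords."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    if norm == "l1":
        hist /= hist.sum()
    elif norm == "l2":
        hist /= np.linalg.norm(hist)
    return hist

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
descs = np.array([[0.1, 0.0], [9.0, 10.0], [10.0, 9.0], [11.0, 10.0]])
print(bof_histogram(descs, codebook))  # -> [0.25 0.75]
```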


Step 3: Classification

• Learn a decision rule assigning bag-of-features representations of images to different classes

[Figure: decision boundary separating zebra from non-zebra images]

Training data: vectors are histograms, one from each training image (positive and negative)

Train a classifier, e.g. an SVM

Linear classifiers

• Find a linear function (hyperplane) to separate positive and negative examples

xi positive:  xi · w + b ≥ 0
xi negative:  xi · w + b < 0

Which hyperplane is best?

Linear classifiers – margin

[Figure: feature space with axes x1 (roundness) and x2 (color)]

• Generalization is not good in this case: the boundary passes close to the training examples

• Better if a margin is introduced: b/|w|

Nonlinear SVMs

• Datasets that are linearly separable work out great

• If the data are not linearly separable, map them to a higher-dimensional space, e.g. x → (x, x²)

• General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable:

Φ: x → φ(x)

Nonlinear SVMs

• The kernel trick: instead of explicitly computing the transformation φ(x), define a kernel function K such that

K(xi, xj) = φ(xi) · φ(xj)

• This gives a nonlinear decision boundary in the original feature space:

y(x) = Σi αi yi K(xi, x) + b

Kernels for bags of features

• Histogram intersection kernel: I(h1, h2) = Σ_{i=1..N} min(h1(i), h2(i))

• Generalized Gaussian kernel: K(h1, h2) = exp(−(1/A) D(h1, h2))

  – D can be the Euclidean distance → RBF kernel

  – D can be the χ² distance: D(h1, h2) = Σ_{i=1..N} (h1(i) − h2(i))² / (h1(i) + h2(i))
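These kernels are straightforward to compute for normalized histograms (a minimal sketch; the small epsilon guarding against empty bins is our addition):

```python
import numpy as np

def intersection_kernel(h1, h2):
    # I(h1, h2) = sum_i min(h1(i), h2(i))
    return np.minimum(h1, h2).sum()

def chi2_distance(h1, h2, eps=1e-10):
    # D(h1, h2) = sum_i (h1(i) - h2(i))^2 / (h1(i) + h2(i))
    return (((h1 - h2) ** 2) / (h1 + h2 + eps)).sum()

def gaussian_chi2_kernel(h1, h2, A):
    # K = exp(-(1/A) * D); A is commonly set to the mean chi-square
    # distance over the training set
    return np.exp(-chi2_distance(h1, h2) / A)
```

A kernel matrix built this way can be passed to any SVM solver that accepts precomputed kernels.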

Combining features

• SVM with a multi-channel chi-square kernel

Dc(H1, H2) = (1/2) Σ_{i=1..m} (h1i − h2i)² / (h1i + h2i)

• Channel weights can be learned with Multiple Kernel Learning (MKL)

[J. Zhang, M. Marszalek, S. Lazebnik and C. Schmid. Local features and kernels for classification of texture and object categories: a comprehensive study, IJCV 2007]

Combining features

• For linear SVMs
  – Early fusion: concatenation of the descriptors
  – Late fusion: learning weights to combine the classification scores

• In practice late fusion gives better results
  – In particular if different modalities are combined

Multi-class SVMs

• Various direct formulations exist, but they are not widely used in practice. It is more common to obtain multi-class SVMs by combining two-class SVMs in various ways

• One versus rest:
  – Training: learn an SVM for each class versus the others
  – Testing: apply each SVM to the test example and assign it the class of the SVM that returns the highest decision value

• One versus one:
  – Training: learn an SVM for each pair of classes
  – Testing: each learned SVM “votes” for a class to assign to the test example
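The two combination schemes reduce to an argmax over decision values and a vote count, respectively (a minimal sketch; it assumes the per-SVM decision values and pairwise winners have already been computed):

```python
import numpy as np

def ovr_predict(decision_values):
    """One versus rest: row i holds the decision values of the C class-vs-rest
    SVMs on test example i; assign the class with the highest value."""
    return np.argmax(decision_values, axis=1)

def ovo_predict(pair_winners, n_classes):
    """One versus one: `pair_winners` lists the winning class index ("vote")
    of each pairwise SVM; the class with the most votes is assigned."""
    return int(np.bincount(pair_winners, minlength=n_classes).argmax())

print(ovr_predict(np.array([[0.2, -1.0, 0.7], [1.5, 0.1, -0.3]])))  # -> [2 0]
print(ovo_predict([0, 2, 2], 3))  # -> 2
```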

Why does SVM learning work?

Illustration

[Figure: two correctly classified test images (35 and 37) with the learned SVM weights visualized over the image grid]

A linear SVM trained from positive and negative window descriptors: the descriptors with large positive weights lie on the object boundary (= local shape structures common to many training exemplars)

Bag-of-features for image classification

• Excellent results in the presence of background clutter

[Examples of misclassified images]

Bag of visual words summary

• Advantages:
  – largely unaffected by position and orientation of the object in the image
  – fixed-length vector irrespective of the number of detections
  – very successful in classifying images according to the objects they contain

• Disadvantages:
  – no explicit use of the configuration of visual-word positions
  – no model of the object location

Evaluation of image classification

• PASCAL VOC [05-12] datasets
  – Training and test datasets available
  – Used to report state-of-the-art results
  – Collected January 2007 from Flickr
  – 500,000 images downloaded and a random subset selected
  – 20 classes
  – Class labels per image + bounding boxes
  – 5011 training images, 4952 test images

PASCAL 2007 dataset

Evaluation: precision/recall
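Average precision, the usual summary of a precision/recall curve, can be sketched as (a minimal non-interpolated version; the official VOC evaluation uses an interpolated variant):

```python
import numpy as np

def average_precision(scores, labels):
    """Rank examples by decreasing classifier score and average the
    precision measured at each positive example (label 1)."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    labels = np.asarray(labels)[order]
    hits = np.cumsum(labels)
    precision = hits / np.arange(1, len(labels) + 1)
    return float((precision * labels).sum() / labels.sum())

print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))  # -> 0.8333...
```

Mean AP (mAP) over the 20 classes is the number reported in the result slides below.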

Results for PASCAL 2007

• Winner of PASCAL 2007 [Marszalek et al.]: mAP 59.4
  – Combination of several different channels (dense + interest points, SIFT + color descriptors, spatial grids)
  – Non-linear SVM with a Gaussian kernel

• [… et al. 2009]: mAP 62.2
  – Combination of several features
  – Group-based MKL approach

• [Harzallah et al.’09]: mAP 63.5
  – Use detection results to improve classification

Spatial pyramid matching

• Add spatial information to the bag-of-features

• Perform matching in the 2D image space

Related work

Similar approaches:
- Subblock description [Szummer & Picard, 1997]
- SIFT [Lowe, 1999, 2004]
- GIST [Torralba et al., 2003]

Spatial pyramid representation

Locally orderless representation at several levels of spatial resolution

[Figure: levels 0, 1 and 2 of the spatial grid]

Spatial pyramid matching

• Combination of spatial levels with the pyramid match kernel [Grauman & Darrell’05]

• Intersect histograms, giving more weight to finer grids
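A minimal sketch of building the pyramid representation (the weights follow the usual spatial-pyramid scheme of 1/2^L at level 0 and 1/2^(L−l+1) at level l; descriptor positions are assumed to lie in [0, width) × [0, height)):

```python
import numpy as np

def spatial_pyramid(positions, words, n_words, width, height, levels=2):
    """Concatenate visual-word histograms over 1x1, 2x2, 4x4, ... grids,
    weighting finer levels more, as in spatial pyramid matching."""
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level
        weight = 1.0 / 2 ** levels if level == 0 else 1.0 / 2 ** (levels - level + 1)
        for gy in range(cells):
            for gx in range(cells):
                # keep the words whose descriptor falls into this grid cell
                in_cell = [w for (x, y), w in zip(positions, words)
                           if int(x * cells / width) == gx and int(y * cells / height) == gy]
                hist = np.bincount(in_cell, minlength=n_words) if in_cell else np.zeros(n_words)
                feats.append(weight * hist)
    return np.concatenate(feats)
```

The concatenated vector has length n_words * (1 + 4 + 16 + …) and can be compared with the intersection kernel above.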

Scene dataset [Lazebnik et al.’06]

Coast, Forest, Mountain, Open country, Highway, Inside city, Tall building, Street, Store, Industrial, …

4485 images, 15 categories

Scene classification

L        Single-level   Pyramid
0 (1x1)   72.2 ±0.6
1 (2x2)   77.9 ±0.6     79.0 ±0.5
2 (4x4)   79.4 ±0.3     81.1 ±0.3
3 (8x8)   77.2 ±0.4     80.7 ±0.3

Retrieval examples

Category classification – Caltech101

L        Single-level   Pyramid
0 (1x1)   41.2 ±1.2
1 (2x2)   55.9 ±0.9     57.0 ±0.8
2 (4x4)   63.6 ±0.9     64.6 ±0.8
3 (8x8)   60.3 ±0.9     64.6 ±0.7

Evaluation BoF – spatial

Image classification results on the PASCAL’07 train/val set:

spatial layout   mAP
1                0.53
2x2              0.52
3x1              0.52
1,2x2,3x1        0.54

Combination improves average results, i.e., it is appropriate for some classes

Evaluation BoF – spatial

Results for individual categories:

Category      1      3x1
Sheep        0.339  0.256
Bird         0.539  0.484
DiningTable  0.455  0.502
Train        0.724  0.745

Strongly category dependent! Combination helps somewhat

Discussion

• Summary
  – Spatial pyramid representation: appearance of local image patches + coarse global position information
  – Substantial improvement over bag of features
  – Depends on the similarity of image layout

• Recent extensions
  – Flexible, object-centered grid
    • Shape masks [Marszalek’12] => additional annotations
  – Weakly supervised localization of objects [Russakovsky et al.’12]

Recent extensions

[Perronnin et al.’10, Maji and Berg’09, Vedaldi and Zisserman’10]

– Fisher vector [Perronnin & Dance ’07]
– VLAD descriptor [Jegou, Douze, Schmid, Perez ’10]
– Supervector [Zhou et al. ’10]
– Sparse coding [Wang et al. ’10, Boureau et al. ’10]

Fisher vector

Statistical measure of the descriptors of the image w.r.t. the GMM: derivative of the likelihood w.r.t. the GMM parameters

GMM parameters: weight, mean, co-variance (diagonal)

A translated cluster gives a large derivative for this component

Fisher vector

- Only the deviation w.r.t. the mean; dim: K*D [K: number of Gaussians, D: dim of the descriptor]
- The variance does not improve results for comparable vector length
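The K*D mean-deviation Fisher vector above can be sketched as follows (a minimal NumPy version for a diagonal-covariance GMM; the 1/(N·sqrt(πk)) normalization follows the standard formulation):

```python
import numpy as np

def fisher_vector_mean(X, weights, means, sigmas):
    """Fisher vector restricted to the deviation w.r.t. the means (dim K*D).
    Diagonal GMM with weights (K,), means (K, D), standard deviations (K, D)."""
    N, D = X.shape
    K = len(weights)
    # soft assignment (posterior) of each descriptor to each Gaussian
    log_p = np.empty((N, K))
    for k in range(K):
        z = (X - means[k]) / sigmas[k]
        log_p[:, k] = (np.log(weights[k]) - 0.5 * (z ** 2).sum(axis=1)
                       - np.log(sigmas[k]).sum() - 0.5 * D * np.log(2 * np.pi))
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # normalized gradient of the log-likelihood w.r.t. each mean
    fv = np.empty((K, D))
    for k in range(K):
        z = (X - means[k]) / sigmas[k]
        fv[k] = (gamma[:, k, None] * z).sum(axis=0) / (N * np.sqrt(weights[k]))
    return fv.ravel()
```

If all descriptors sit exactly on the Gaussian means, the vector is zero; a translated cluster produces a large entry for that component, as the slide notes.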

Image classification with Fisher vector

• Dense SIFT

• Fisher vector (k=32 to 1024, total dimension from approx. 5000 to 160000)

• Normalization
  – square-rooting
  – L2 normalization
  – [Perronnin’10], [Image categorization using Fisher kernels of non-iid image models, Cinbis, Verbeek, Schmid, CVPR’12]

• Classification approach
  – Linear classifiers
  – One-versus-rest classifier
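The square-rooting + L2 normalization listed above is a one-liner (a minimal sketch; the signed square root handles negative Fisher-vector components):

```python
import numpy as np

def normalize(v):
    """Power ("square-rooting") normalization followed by L2 normalization."""
    v = np.sign(v) * np.sqrt(np.abs(v))
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

print(normalize(np.array([4.0, -1.0])))  # -> [ 0.894... -0.447...]
```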

Image classification with Fisher vector

• Evaluation on PASCAL VOC’07, linear classifiers with
  – Fisher vector
  – Sqrt transformation of the Fisher vector
  – Latent GMM of the Fisher vector

• These models lead to improvement

• State-of-the-art performance obtained with a linear classifier

Evaluation of the image description

Fisher versus BOF vector + linear classifier on PASCAL VOC’07

• Fisher is comparable to BOF + non-linear classifier
• Limited gain due to SPM on PASCAL
• Sqrt helps for Fisher and BOF
• [Chatfield et al. 2011]

Large-scale image classification

• ImageNet has 14M images from 22k classes

• Standard subsets
  – ImageNet Large Scale Visual Recognition Challenge 2010 (ILSVRC)
    • 1000 classes and 1.4M images
  – ImageNet10K dataset
    • 10184 classes and ~9M images

Large-scale image classification

• Classification approach
  – One-versus-rest classifiers
  – Stochastic gradient descent (SGD)
  – At each step choose a sample at random and update the parameters using a sample-wise estimate of the regularized risk

• Data reweighting
  – When some classes are significantly more populated than others, rebalance positive and negative examples
  – Empirical risk with reweighting
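One SGD update on the regularized hinge loss, with a reweighting factor β on the positives, can be sketched as (a minimal illustration of the idea, not the exact scheme of the paper; the data in the usage loop are made up):

```python
import numpy as np

def sgd_step(w, x, y, lam=1e-5, lr=0.01, beta=1.0):
    """One step on a random sample (x, y), y in {+1, -1}.
    `beta` up-weights positives to rebalance under-populated classes."""
    weight = beta if y > 0 else 1.0
    # sample-wise estimate of the regularized risk gradient
    grad = lam * w
    if y * w.dot(x) < 1:            # hinge loss is active
        grad = grad - weight * y * x
    return w - lr * grad

# usage: train one one-versus-rest classifier over random samples
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2, 0.5, (50, 2)), rng.normal(-2, 0.5, (50, 2))])
Y = np.array([1] * 50 + [-1] * 50)
w = np.zeros(2)
for _ in range(20):
    for i in rng.permutation(100):
        w = sgd_step(w, X[i], Y[i])
```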

Importance of re-weighting

[Figure: weighted OVR (solid) compared to unweighted u-OVR (dashed); β negatives per positive, β=1 gives natural rebalancing]

• For very high dimensions, re-weighting has little impact

Impact of the image signature size

• Fisher vector (no SP) for a varying number of Gaussians + different classification methods, ILSVRC 2010

• Performance improves for higher-dimensional vectors

Experimental results

• Features: dense SIFT, reduced to 64 dimensions with PCA

• Fisher vectors
  – 256 Gaussians, using mean and variance
  – Spatial pyramid with 4 regions
  – Approx. 130K dimensions (4x [2x64x256])
  – Normalization: square-rooting and L2 norm

• Bag-of-features
  – 4960 dimensions
  – Normalization: square-rooting and L2 norm

Experimental results for ILSVRC 2010

• Features: dense SIFT, reduced to 64 dimensions with PCA

• 256-Gaussian Fisher vector using mean and variance + SP (3x1) (4x [2x64x256] ~ 130k dim), square-root + L2 norm

• BOF dim=1024 + SP (3x1) (dim 4000), square-root + L2 norm

• Different classification methods

Large-scale experiment on ImageNet10k

• Top-1 accuracy of 16.7 with 130K-dimensional Fisher vectors

• w-OVR > u-OVR

• Improves over the state of the art: 6.4% [Deng et al.] and WAR [Weston et al.]

Large-scale experiment on ImageNet10k

• Illustration of results obtained with w-OVR and 130K-dim Fisher vectors, ImageNet10K top-1 accuracy

Conclusion

• Fisher vectors are well-suited for large-scale dataset classification

• A one-versus-rest strategy is a must for competitive performance

Conclusion

• Code available on-line at http://lear.inrialpes.fr/software

• Future work
  – Beyond a single representation of the entire image
  – Take into account the hierarchical structure
