
Machine Learning with TensorFlow

Max Kleiner 24.09.2018


TensorFlow Agenda
• 4 Cases
  – XOR, IRIS, MNIST, Image Cluster with QT
• Sacred Experiment Log Recorder
• Most experiments score with metrics such as a confusion matrix
• What's behind TF? (pattern recognition)
• Cluster & Classify with different inputs, params, config
• Define label or topic ratings, hyper-parameters
  – assumed/implicit labels, predict versus target
• Conclusions / ML Process Summary

https://www.esecurityplanet.com/views/article.php/1501001/Security-Threat-Correlation-The-Next-Battlefield.htm
What is TensorFlow?
Intro Link to EuroPython

• Probabilistic (NB, MNB, PD, BN, BC, PLSA, LDA, ...)
• Independence assumptions
CASSANDRA System
made
• Stochastic or Distance-based (SVM, matching, VSIM, k-NN, CR, PageRank, Kmeans)
• Features used
• Structures exploited
• Model-based (rules, BC, BN, boosting)
• Social media driven
Baseline Perceptron (MLP) with XOR
C:\maXbox\BASTA2018\xor_perceptron.py

http://playground.tensorflow.org_maXbox1
http://playground.tensorflow.org_maXbox2
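A single perceptron cannot separate XOR, which is why the baseline needs a hidden layer. The contents of xor_perceptron.py are not shown on the slide, so the following is only a minimal sketch with hand-picked (not trained) weights; the names `step` and `xor_mlp` are illustrative assumptions.

```python
def step(z):
    """Heaviside step activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """Two-layer perceptron for XOR; weights are hand-picked, not learned."""
    h_or = step(x1 + x2 - 0.5)        # hidden unit 1 behaves like logical OR
    h_nand = step(-x1 - x2 + 1.5)     # hidden unit 2 behaves like logical NAND
    return step(h_or + h_nand - 1.5)  # output unit: AND of both hidden units


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", xor_mlp(a, b))
```

A trained network would find comparable weights by gradient descent; the fixed values only show that two hidden units are enough.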
https://www.springboard.com/blog/data-mining-python-tutorial/
IRIS Classify Task

https://sebastianraschka.com/images/blog/2015/principal_component_analysis_files/iris.png
IRIS Concept Steps from module import class
from sklearn import datasets, metrics, tree

iris = datasets.load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)    # train on the full data set
y_pred = clf.predict(iris.data)          # predict on the same (training) data

print('Train accuracy_score:', metrics.accuracy_score(iris.target, y_pred))

Demo in VSCode
1. C:\maXbox\softwareschule\MT-HS12-05\mentor_xml\casra2017\crawler\plot_iris_dataset_mx.py

IRIS Decision Tree

IRIS Confusion Matrix
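The confusion matrix itself was an image on this slide. As a rough sketch of what it counts, the helper below builds the matrix by hand with rows as true classes and columns as predicted classes, the same layout that `sklearn.metrics.confusion_matrix` uses. The labels are hypothetical and are not real Iris results.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count samples per (true class, predicted class) pair."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correct predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical predict-versus-target labels for three classes (0, 1, 2).
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2, 1]
cm = confusion_matrix(y_true, y_pred, 3)

if __name__ == "__main__":
    for row in cm:
        print(row)
    print("accuracy:", accuracy(cm))
```

Off-diagonal cells show which classes get mixed up, which a single accuracy number hides.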

from keras.datasets import mnist
The MNIST dataset comprises 70,000 handwritten digit images and their respective labels 0..9.

2. Main: C:\maXbox\mX46210\DataScience\plot_confusion_matrix_vectormachine.py
3. Second C:\maXbox\mX46210\DataScience\confusionlist\mnist_softmax21.py
https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/4_Utils/tensorboard_basic.ipynb

There are 60,000 training images and 10,000 test images, all of
which are 28 pixels by 28 pixels.
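Before training, each 28 x 28 image is usually scaled to the range 0..1 and each label is turned into a one-hot vector, which is the role `to_categorical` plays in the Keras script. The sketch below shows both steps on a synthetic image, so no download is needed; `flatten_and_scale` and `one_hot` are illustrative names, not functions from the slides.

```python
IMG_ROWS, IMG_COLS, NUM_CLASSES = 28, 28, 10


def flatten_and_scale(image):
    """Turn a 28x28 grid of 0..255 pixel values into 784 floats in 0..1."""
    return [pixel / 255.0 for row in image for pixel in row]


def one_hot(label, num_classes=NUM_CLASSES):
    """Encode a digit label 0..9 as a one-hot target vector."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


# Synthetic stand-in for one MNIST image: all black except one white pixel.
image = [[0] * IMG_COLS for _ in range(IMG_ROWS)]
image[0][0] = 255

features = flatten_and_scale(image)
target = one_hot(7)

if __name__ == "__main__":
    print(len(features), features[0], target)
```

A convolutional network keeps the 28 x 28 shape instead of flattening, but the scaling and label encoding stay the same.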
• 76 english, 38 content="voa,
• 36 美国之音 (Voice of America), 74 special hand written
• 44 (Modified National Institute of Standards and Technology database)
• 36 голос (voice), 36 америки (of America)
QT from keras import backend as K

@ex.automain
def define_and_train(batch_size, epochs,
                     convolution_layers, maxpooling_pool_size, maxpooling_dropout,
                     dense_layers, dense_dropout, final_dropout, _run):
    from keras.models import Sequential  # convolution
    from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
    from keras.utils import to_categorical
    from keras.losses import categorical_crossentropy
    from keras.optimizers import Adadelta
    from keras import backend as K
    from keras.callbacks import ModelCheckpoint, Callback

4. C:\maXbox\mX46210\ntwdblib.dll\UnsharpDetector-master\UnsharpDetector-master\inference_gui.py

MongoDB/Sacred module import class
from __future__ import division, print_function, unicode_literals
from sacred import Experiment
from sacred.observers import MongoObserver
from sacred.utils import apply_backspaces_and_linefeeds
import pymongo, pickle, os
import pydot as pdot
import numpy as np
import tensorflow as tf

5. C:\maXbox\mX46210\DataScience\confusionlist\train_convnet_tf.py
MongoDB
• Start the shell process mongod from
• Call from script or start mongod: C:\Program Files\MongoDB\Server\3.6\bin>mongod
MongoDB My Cluster sacred.runs & completed

GEO Cluster Story

• An agent or probe that collects threat data from the security sensor
• Normalization and correlation middleware
• A console and associated database for managing the solution and its alerts
https://www.esecurityplanet.com/views/article.php/1501001/Security-Threat-Correlation-The-Next-Battlefield.htm
Task Manager

Task II

Task III

https://pythonexample.com/search/mnist%20tensorboard%20demo/8
What's behind test ? (backend pattern, crossentropy)
60000/60000 [==============================] - 426s 7ms/step - loss: 0.4982 - acc: 0.8510 - val_loss: 0.0788 - val_acc: 0.9749
Using TensorFlow backend.
INFO - MNIST-Convnet4 - Result: 0.9749
INFO - MNIST-Convnet4 - Completed after 0:07:27
Test loss: 0.0788029053777
Test accuracy: 0.9749

59392/60000 [============================>.] - ETA: 5s - loss: 0.0571 - acc: 0.9829
59520/60000 [============================>.] - ETA: 3s - loss: 0.0572 - acc: 0.9829
59648/60000 [============================>.] - ETA: 2s - loss: 0.0572 - acc: 0.9829
59776/60000 [============================>.] - ETA: 1s - loss: 0.0572 - acc: 0.9829
59904/60000 [============================>.] - ETA: 0s - loss: 0.0573 - acc: 0.9829
60000/60000 [==============================] - 513s 9ms/step - loss: 0.0573 - acc: 0.9829 - val_loss: 0.0312 - val_acc: 0.9891
Using TensorFlow backend.
INFO - MNIST-Convnet4 - Result: 0.9891
INFO - MNIST-Convnet4 - Completed after 0:33:28
Test loss: 0.0311644290059
Test accuracy: 0.9891
What's behind code ? (keras, pymongo, pydot, graphviz)

db = pymongo.MongoClient('mongodb://localhost:27017/').sacred

print(tf.__version__)
os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz2.38/bin/'

from tensorflow.examples.tutorials.mnist import input_data

ex = Experiment("MNIST-Convnet4")
ex.observers.append(MongoObserver.create())
ex.captured_out_filter = apply_backspaces_and_linefeeds

https://www.programcreek.com/python/example/103267/keras.datasets.mnist.load_data
What's behind Python: PIP3 Install
pip3 install sacred
Collecting sacred
Downloading https://files.pythonhosted.org/packages/2d/86/7be3afa4d4c1c0c76a5de03e5ff779797ab2654e377685255c11c13c0ea5/sacred-0.7.3-py2.py3-none-any.whl (82kB)

Collecting pymongo
Downloading https://files.pythonhosted.org/packages/46/39/b9bb7fed3e3a0ea621a1512a938c105cd996320d7d9894d8239ca9093340/pymongo-3.6.1-cp36-cp36m-win_amd64.whl (291kB)
100% |████████████████████████████████| 296kB 728kB/s
Installing collected packages: pymongo
Successfully installed pymongo-3.6.1

https://github.com/pinae/Sacred-MNIST/blob/master/train_convnet.py
Create Questions (Method, Algos, Tools)
Finding the question is often more important than finding the answer
John Tukey

https://www.soovle.com/
https://answerthepublic.com/reports/
Machine Learning Process Chain

• Collab (Set a control thesis, understand the problem, get resources such as Python etc.)
• Collect (Scrapy data, store, data mining, filter data that is inconsistent or incomplete)
• Consolidate or Clean data (normalization and aggregation, PCA data reduction, filters, slice out irrelevant data or character-mapping problems)
• Cluster (kmeans for categories, collocates for N-keywords; unsupervised algorithm)
• Classify (SVM, Sequential, Bayes; supervised)
• Conclude and Control (Predict or report context thesis and drive data to decision)
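The Cluster step names k-means. As a minimal sketch of how that unsupervised step works, the function below runs plain Lloyd iterations on a handful of made-up 2-D points with fixed start centroids; in practice a library implementation such as scikit-learn's `KMeans` would be used, and the `kmeans` helper here is only an illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm on 2-D points with given start centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for x, y in points:
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            clusters[distances.index(min(distances))].append((x, y))
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean_x = sum(p[0] for p in cluster) / len(cluster)
                mean_y = sum(p[1] for p in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])

if __name__ == "__main__":
    print(centroids)
    print(clusters)
```

No labels are involved at any point, which is what separates this step from the supervised Classify step that follows it.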

http://www.softwareschule.ch/examples/machinelearning.jpg
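https://www.kaggle.com/
Similarity of doc a to doc b:
\[
\mathrm{sim}(a, b) = \sum_{j} \frac{v(a, j)}{\sqrt{\sum_{j'} v(a, j')^{2}}} \cdot \frac{v(b, j)}{\sqrt{\sum_{j'} v(b, j')^{2}}} = A' \cdot B'
\]
Here j runs over the words, and A', B' are the length-normalized vectors of doc a and doc b.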
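The similarity on this slide is the cosine similarity of two word vectors. A minimal sketch, assuming v(a, j) is simply the count of word j in doc a (the slide does not say which weighting is used):

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """sim(a, b) for two documents given as whitespace-separated words."""
    va = Counter(doc_a.lower().split())
    vb = Counter(doc_b.lower().split())
    # Numerator: sum over shared words of v(a, j) * v(b, j).
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    # Denominator: product of the two vector lengths.
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for empty documents (an assumption)
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("tensor flow", "tensor board"))
```

Because both vectors are divided by their length, a long and a short document with the same word proportions still score close to 1.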
Double Trouble with ML → Stackexchange, Stackoverflow
THE TEST OVERVIEW
Status        Description
QUEUED        The run was just queued and not run yet
RUNNING       Currently running (but see below)
COMPLETED     Completed successfully
FAILED        The run failed due to an exception
INTERRUPTED   The run was cancelled with a KeyboardInterrupt
TIMED_OUT     The run was aborted using a TimeoutInterrupt
[custom]      A custom py:class: ~sacred.utils.SacredInterrupt occurred

File "C:\Users\max\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics\cluster\unsupervised.py", line 254, in calinski_harabaz_score
    intra_disp += np.sum((cluster_k - mean_k) ** 2)
MemoryError

  No. of URLs removed            76,732,515
+ No. of robots.txt requests      3,675,634
- No. of excluded URLs            3,050,768
= No. of HTTP requests           77,357,381
  HTTP requests not respond       1,763,850
SUMMARY & QUESTIONS
Which Stat / Module / TF Package ?
• Mindtoolset: https://basta.net/speaker/max-kleiner/
• Toolchain: KMeans-PySpark-ElasticSearch-SQLServer-Scrapy-TensorFlow-SVM-RandomForest-Sacred-MongoDB
• https://sacred.readthedocs.io/en/latest/tensorflow.html
• Best book in our opinion: Mastering Machine Learning with Python in Six Steps: A Practical Implementation Guide to Predictive Data Analytics Using Python