
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES
Vijayalakshmi Mahanra Rao,
Yashwant Prasad Singh
Faculty of Computing and Informatics
Multimedia University, Cyberjaya, Malaysia

ABSTRACT
Credit card fraud is a serious and growing problem in the
banking industry. As banks offer more web services, banking
fraud is also on the rise. Banking systems rely on strong
security systems to detect and prevent fraudulent
transactions of any kind. Although eliminating banking fraud
entirely is almost impossible, machine learning techniques
can reduce it and help prevent it. This paper reports
experiments on banking fraud that use ensemble tree learning
techniques and a genetic algorithm to induce an ensemble of
decision trees on bank transaction datasets, with the aim of
identifying and preventing bank fraud. It also evaluates the
effectiveness of the ensemble of decision trees on the
credit card dataset.

MAIN POINTS IN ABSTRACT

Reduce fraud and help prevent it with machine learning techniques
Conduct experiments on banking fraud using ensemble tree learning techniques and a genetic algorithm to induce an ensemble of decision trees
Evaluate the effectiveness of the ensemble of decision trees on the credit card dataset

OUTLINE

Abstract
Main Points in Abstract
Methods
Motivation for Using Genetic Algorithm with Decision Tree Induction Algorithm (C4.5) & AdaBoost.M1
Dataset & Parameters
Experiment & Results
Conclusion
Contact

METHODS

Decision trees: ID3, C4.5
AdaBoost.M1
Genetic Algorithm (GA)
WEKA

MOTIVATION FOR USING GENETIC ALGORITHM

ID3 and C4.5 use a greedy approach to attribute selection
An experiment was conducted to evaluate GA as an approach to attribute selection in place of the greedy approach of ID3 and C4.5
Pruning of the tree is also not required with GA, because the best attributes have already been selected
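
To make the greedy step concrete, here is a minimal sketch of the information gain criterion that ID3 maximizes at each node (C4.5 uses the related gain ratio instead). The toy rows, the attribute names in the comment, and the function names are invented for illustration and are not taken from the experiments.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

def greedy_best_attribute(rows, labels):
    """ID3-style greedy choice: the attribute with the highest gain at this node."""
    return max(range(len(rows[0])),
               key=lambda i: information_gain(rows, labels, i))

# Toy data (hypothetical): each row is (checking_status, housing).
rows = [("low", "own"), ("low", "rent"), ("high", "own"), ("high", "rent")]
labels = ["bad", "bad", "good", "good"]
best = greedy_best_attribute(rows, labels)
```

The choice is greedy because it looks only at the current node: an attribute that is weak alone but useful in combination with others can be missed, which is the gap a population-based search such as GA tries to cover.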

DATASET & PARAMETERS

German Credit Card Application dataset
1000 instances, 20 attributes, with class 1 (good) and class 2 (bad)
Parameters: percentage split 70%, boosting with 100 iterations, GA population size of 50
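
As a rough illustration of the percentage split, the sketch below shuffles the instances and keeps 70% for training and the rest for testing. It assumes the 70% refers to the training portion, as WEKA's percentage split option does; the seed and the helper name `percentage_split` are assumptions made for this example.

```python
import random

def percentage_split(instances, train_fraction=0.7, seed=1):
    """Shuffle a copy of the data and cut it into train and test parts."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)   # seed value is an assumption
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for the 1000 instances: integer ids instead of real records.
instances = list(range(1000))
train, test = percentage_split(instances)
```

With 1000 instances this gives 700 training and 300 test instances.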

EXPERIMENT & RESULTS (1)

Construct a decision tree that is improved by using a Genetic Algorithm (GA) for feature selection
For comparison, the decision tree is induced with both J48 (WEKA's implementation of C4.5) and the ID3 algorithm available in WEKA
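
GA-based feature selection can be sketched as a search over bit masks, one bit per attribute. The code below is a minimal illustration, not the WEKA procedure used in the experiments: the crossover rate, mutation rate, number of generations, and the toy fitness function are all assumptions. In a wrapper setting the fitness would instead be the accuracy of a decision tree trained on the selected attributes.

```python
import random

def run_ga(n_features, fitness, pop_size=50, generations=30,
           crossover_rate=0.6, mutation_rate=0.033, seed=1):
    """Search for a feature bit mask that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]

    def tournament():
        # Binary tournament: pick two masks at random, keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]          # elitism: keep the best mask
        while len(next_population) < pop_size:
            parent_a, parent_b = tournament(), tournament()
            child = parent_a[:]
            if rng.random() < crossover_rate:
                point = rng.randrange(1, n_features)   # one-point crossover
                child = parent_a[:point] + parent_b[point:]
            # Bit-flip mutation, applied independently to each gene.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best

# Hypothetical fitness: pretend attributes 0, 3 and 7 are the useful ones
# and charge a small cost for every selected attribute.
INFORMATIVE = {0, 3, 7}

def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)

selected = run_ga(n_features=20, fitness=toy_fitness)
```

The population size of 50 and the 20 attributes match the parameters listed for the dataset; everything else is chosen only to keep the example small.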

EXPERIMENT & RESULTS (2)


Three forms of experiment were performed:
1) Decision tree without any boosting
2) Decision tree together with AdaBoost.M1
3) Decision tree with feature subset selection (wrapper approach)
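
For the second experiment, the boosting loop of AdaBoost.M1 can be sketched as follows. This is a simplified illustration that assumes one-level decision stumps as the base learner in place of ID3 or C4.5, and a six-point toy dataset invented for the example; the function names are hypothetical.

```python
import math

def train_stump(X, y, weights):
    """Exhaustively pick the one-level tree with the lowest weighted error."""
    classes = sorted(set(y))
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for left in classes:
                for right in classes:
                    err = sum(w for row, label, w in zip(X, y, weights)
                              if (left if row[f] <= t else right) != label)
                    if best is None or err < best[0]:
                        best = (err, f, t, left, right)
    return best  # (weighted error, feature, threshold, left label, right label)

def stump_predict(stump, row):
    _, f, t, left, right = stump
    return left if row[f] <= t else right

def adaboost_m1(X, y, iterations=100):
    """Return a list of (stump, vote weight) pairs."""
    weights = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(iterations):
        stump = train_stump(X, y, weights)
        error = stump[0]
        if error >= 0.5:
            break                       # no better than chance: stop boosting
        if error == 0:
            return [(stump, 1.0)]       # a single stump already fits the data
        beta = error / (1.0 - error)
        # Shrink the weights of correctly classified instances, then renormalize
        # so that misclassified instances gain relative weight.
        weights = [w * beta if stump_predict(stump, row) == label else w
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        ensemble.append((stump, math.log(1.0 / beta)))
    return ensemble

def ensemble_predict(ensemble, row):
    """Weighted vote: each stump votes for its label with weight log(1/beta)."""
    votes = {}
    for stump, alpha in ensemble:
        label = stump_predict(stump, row)
        votes[label] = votes.get(label, 0.0) + alpha
    return max(votes, key=votes.get)

# Toy data (hypothetical): no single stump separates class 1 from class 2.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, 2, 2, 1, 1]
ensemble = adaboost_m1(X, y, iterations=3)
predictions = [ensemble_predict(ensemble, row) for row in X]
```

On this toy data no single stump can label all six points correctly, while the weighted vote of three boosted stumps can, which is the effect the boosted experiment relies on.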

EXPERIMENT & RESULTS (3)

EXPERIMENT & RESULTS (4)

Experimental results show that GA with ID3 or C4.5 performed better than the ID3 or C4.5 classifier alone
C4.5 with AdaBoost.M1 gave higher accuracy than the other configurations

CONCLUSION

The use of decision trees combined with boosting and a genetic algorithm was analyzed
An improvement in classification accuracy was observed when boosting was applied to the decision tree and when GA was combined with the decision tree

REFERENCES

Shen, A., Tong, R., Deng, Y. (2007). Application of Classification Models on Credit Card Fraud Detection. IEEE.
Kohavi, R., John, G.H. (1996). Wrappers for Feature Subset Selection.
Yang, J., Honavar, V. (1997). Feature Subset Selection Using a Genetic Algorithm.
Freund, Y., Schapire, R.E. (1996). Experiments with a New Boosting Algorithm. In: Thirteenth International Conference on Machine Learning, San Francisco, 148-156.

CONTACT

Vijayalakshmi Mahanra Rao
lakshmi.mahanra@gmail.com

Prof. Yashwant Prasad Singh
y.p.singh@mmu.edu.my

Thank you!
Q&A
