This document is a group project report for a course on knowledge acquisition in decision making. It examines a dataset of shoe sales data using three techniques: decision trees, regression, and neural networks. The group aims to determine which technique best predicts whether target sales were achieved. The document describes the shoe data, the problem statement, the research methodology involving data preparation and modeling in Excel and SAS Enterprise Miner, and the results and discussion comparing the predictive performance of the three techniques.
SQIT 3033 KNOWLEDGE ACQUISITION IN DECISION MAKING
PREPARED BY: AZIZAH BINTI AZIZ 211299 SITI ZULAILA BINTI MD ZAINAL ABIDIN 212193 SITI SOLEHAH BINTI ABDUL REJAB 212302
SQIT 3033 KNOWLEDGE ACQUISITION IN DECISION MAKING 2014
Table of Contents
CHAPTER 1 INTRODUCTION
1.1 Data Background
CHAPTER 2 PROBLEM STATEMENT
2.1 The Use of Specific Solution Techniques
2.1.1 Decision Tree
2.1.2 Regression
2.1.3 Neural Network
CHAPTER 3 RESEARCH METHODOLOGY
3.1 Knowledge Discovery in Database (KDD)
3.1.1 Selection
3.1.2 Pre-processing
3.1.3 Transformation
3.1.4 Data Mining
3.1.5 Interpretation & Evaluation Process
CHAPTER 4 RESEARCH SOLUTION
4.1 Steps Involving in EXCEL
4.1.1 Original Data
4.1.2 Editing Data
4.2 Steps Involving in SAS Enterprise Miner
4.2.1 Creating New Folder
4.2.2 Import Data
4.2.3 Nodes for Decision Tree
4.2.3.1 Input Data Source
4.2.3.2 Data Partition
4.2.3.3 Tree Model
4.2.4 Nodes for Neural Network
4.2.5 Nodes for Regression
CHAPTER 5 RESULT AND DISCUSSION
5.1 Test for Misclassification Rate
5.1.1 Using Assessment Node
CHAPTER 6 REFERENCES
CHAPTER 1 INTRODUCTION
1.1 Data Background
Shoes are among the most commonly used items today. They take us everywhere and let us walk wherever we want, and they protect our feet, which are among the hardest-working parts of the body from day to day. Wearing shoes is important for a number of reasons. One is that shoes protect our feet from germs. They also keep our feet warm, clean and safe. Wealthier buyers, however, tend to care most about brand names, which usually means very expensive shoes. Proper walking shoes help maintain correct walking technique and prevent foot injuries.
For our project in Knowledge Acquisition in Decision Making (SQIT3033), we were given a shoes data set to analyse. The data set consists of 395 records with several attributes: region, product, subsidiary, stores, sales, target sales, inventory and returns. In this report, our group focuses on three techniques, namely decision tree, regression and neural network, to examine the shoes data set. We want to know which of these three techniques gives the best model for predicting whether the sales target is achieved. This report therefore describes all the work we have done to examine the shoes data set using SAS Enterprise Miner.
CHAPTER 2 PROBLEM STATEMENT
The guide clearly states that there is no simple rule for determining which class an observation in the shoes data belongs to; for example, there is no rule stating that a number of stores below or above some value means the target sales are achieved or not. We therefore need to categorize the target sales into classes in order to determine whether each record in the shoes data achieves its target.
2.1 The Use of Specific Solution Techniques
2.1.1 Decision Tree
A decision tree is a tree-shaped structure that represents a set of decisions or predictions of data trends. It is suitable for describing a sequence of interrelated decisions or predicting future data trends, and it can classify entities into specific classes based on their features. Each tree consists of three types of nodes: root node, internal node and terminal node (leaf). The topmost node is the root node, and it represents all of the rows in the dataset. Nodes with child nodes are internal nodes, while nodes without child nodes are called terminal nodes or leaves. A common algorithm for building a decision tree selects a subset of instances from the training data to construct an initial tree. The remaining training instances are then used to test the accuracy of the tree. If any instance is incorrectly classified, the instance is added to the current set of training data and the process is repeated. A main goal is to minimize the number of tree levels and tree nodes, thereby maximizing data generalization. Decision trees have been successfully applied to real problems, are easy to understand, and map nicely to a set of production rules.
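To illustrate how a single split in such a tree might be chosen, the following is a minimal Python sketch. It is not the algorithm used by SAS Enterprise Miner; it simply scans candidate thresholds on one numeric input and keeps the one with the fewest misclassified rows. The function name best_split and the toy values are hypothetical.

```python
def best_split(values, labels):
    """Return (threshold, errors) for the rule: predict "YES" when value >= threshold."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_errors = None, len(pairs) + 1
    # Candidate thresholds are midpoints between consecutive distinct values.
    for (left, _), (right, _) in zip(pairs, pairs[1:]):
        if left == right:
            continue
        threshold = (left + right) / 2
        # Count rows where the rule's prediction disagrees with the label.
        errors = sum(
            (value >= threshold) != (label == "YES") for value, label in pairs
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


# Toy data (hypothetical, not taken from the shoes data set).
toy_returns = [100, 500, 900, 3000, 4000, 5200]
toy_target = ["NO", "NO", "NO", "NO", "YES", "YES"]
threshold, errors = best_split(toy_returns, toy_target)
```

A full tree would repeat this search on each resulting subset and over every input variable, and real tools usually rank splits with criteria such as chi-square, entropy or Gini rather than the raw error count used here.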
2.1.2 Regression
The Regression node in Enterprise Miner performs either linear or logistic regression, depending on the measurement level of the target variable. Linear regression is used if the target is an interval variable; the model then predicts the mean of the target variable at the given values of the input variables. Logistic regression is used if the target is a discrete variable; the model then predicts the probability of a particular level (or levels) of the target variable at the given values of the input variables. Because these predictions are probabilities, which are bounded by 0 and 1 and are not linear in this space, they must be transformed in order to be modeled adequately. The most common transformation for a binary target is the logit transformation. Probit and complementary log-log transformations are also available in the Regression node.
There are three variable selection methods available in the Regression node of Enterprise Miner.
Forward selection first selects the best one-variable model. It then selects the best two-variable model among those that contain the first selected variable. This process continues until no additional variable has a p-value less than the specified entry p-value.
Backward selection starts with the full model. Next, the variable that is least significant, given the other variables, is removed from the model. This process continues until all of the remaining variables have a p-value less than the specified stay p-value.
Stepwise selection is a modification of the forward selection method. The difference is that variables already in the model do not necessarily stay there. After each variable is entered into the model, this method looks at all the variables already included in the model and deletes any variable that is not significant at the specified level. The process ends when none of the variables outside the model has a p-value less than the specified entry value and every variable in the model is significant at the specified stay value.
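The logit transformation mentioned above can be sketched in a few lines of Python. This is an illustration of the general idea only, not the Regression node's implementation; the function names and the coefficient values are hypothetical.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to the unbounded log-odds scale."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a log-odds value back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(intercept, coefficients, inputs):
    """Logistic regression prediction: linear on the logit scale."""
    z = intercept + sum(b * x for b, x in zip(coefficients, inputs))
    return inverse_logit(z)


# Hypothetical coefficients and inputs, for illustration only.
p = predict_probability(-1.0, [0.5, 0.25], [2.0, 0.0])
```

Because the linear part can take any real value while the output of inverse_logit always lies between 0 and 1, the model can be fitted on the logit scale and still return valid probabilities.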
2.1.3 Neural Network
An artificial neural network is a network of many simple processors ("units"), each possibly having a small amount of local memory. The units are connected by communication channels ("connections") that usually carry numeric (as opposed to symbolic) data encoded by various means. The units operate only on their local data and on the inputs they receive via the connections. The restriction to local operations is often relaxed during training. More specifically, neural networks are a class of flexible, nonlinear regression models, discriminant models, and data reduction models that are interconnected in a nonlinear dynamic system. Neural networks are useful tools for interrogating increasing volumes of data and for learning from examples to find patterns in data. By detecting complex nonlinear relationships in data, neural networks can help make accurate predictions about real-world problems.
CHAPTER 3 RESEARCH METHODOLOGY
The method used in this research is known as knowledge discovery in databases (KDD). KDD is the non-trivial extraction of implicit, previously unknown and potentially useful information from data. It is used to find hidden information in a database and to convert unknown or hidden patterns into a useful, understandable and informative form. The KDD process contains five steps: selection, pre-processing, transformation, data mining, and interpretation and evaluation.
3.1 Knowledge Discovery in Database (KDD)
3.1.1 Selection
The data needed for the data mining process may be obtained from many different and heterogeneous data sources. The data selection step acquires a data set of the most appropriate size for the KDD process. If the data is too big, a sampling method is used. There are two types of sampling: probability sampling (randomly chosen) and non-probability sampling (expert judgment). The sample size can be decided in the following ways:
(1) Central Limit Theorem: the sample size n must be greater than 30 (n > 30). If n equals the size of the population, the standard error is 0.
(2) Based on the confidence interval and the acceptable error.
(3) Subject to data availability (the more, the merrier).
3.1.2 Pre-processing
Data pre-processing is a data mining technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, and/or lacking in certain behaviours or trends, and is likely to contain many errors. Data pre-processing is a proven method of resolving such issues. Data pre-processing can be categorized into a few methods:
Data cleaning
Data cleaning is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database. Its purpose is to identify incomplete, incorrect, inaccurate or irrelevant parts of the data and then replace, modify, or delete this dirty data. Data cleaning can handle incomplete, noisy and inconsistent data.
o Incomplete data has missing values, which arise from improper data collection methods. During data collection, no value may be recorded for a certain attribute, leaving the data incomplete. To overcome this problem, we can fill in the mean value, estimate the probable value using regression, use a constant value, or ignore the record with the missing value.
o Noisy data contains random error or variance. This happens because of corrupted data transmission or technological limitations. For example, when entering data into software such as SPSS or SAS, we may key in a wrong value, which produces noisy data. To solve this problem, we can use the binning method or outlier removal.
o Inconsistent data contains replicated or redundant records. The remedy is to remove the redundant or replicated data.
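The three data cleaning remedies can be sketched in Python as follows. This is a minimal illustration using only the standard library, with hypothetical function names and toy values; it assumes missing values are represented as None and that smoothing by bin means is the binning variant intended.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Sort the values, cut them into equal-size bins, replace each by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        smoothed.extend([mean(chunk)] * len(chunk))
    return smoothed


def drop_duplicates(rows):
    """Remove repeated rows, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Toy inputs (hypothetical, not taken from the shoes data set).
filled = impute_mean([10, None, 30])
binned = smooth_by_bin_means([4, 8, 15, 21, 21, 24], bin_size=3)
deduped = drop_duplicates([("Asia", 5), ("Asia", 5), ("Canada", 2)])
```

Each function handles one kind of dirty data: impute_mean for incomplete data, smooth_by_bin_means for noisy data, and drop_duplicates for inconsistent (replicated) data.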
Data integration
Data integration involves combining data residing in different sources and providing users with a unified view of these data. Data that comes from different sources follows different naming standards, which causes inconsistencies and redundancies. There are several ways to handle this problem:
o Consolidate the different sources into one repository (using metadata).
o Use correlation analysis (measure the strength of the relationship between different attributes).
Data reduction
Data reduction is the transformation of numerical or alphabetical digital information, derived empirically or experimentally, into a corrected, ordered, and simplified form. The basic concept is the reduction of large amounts of data down to the meaningful parts. It increases efficiency by reducing a huge data set to a smaller representation. Several techniques can be used in data reduction, such as data cube aggregation, dimension reduction, data compression and discretization.
3.1.3 Transformation
The transformation process, also known as data normalization, re-scales the data into a suitable range. This step is important because it can increase the processing speed and reduce the memory allocation. There are several transformation methods:
(1) Decimal scaling
(2) Min-max
(3) Z-score
(4) Logarithmic normalization
We chose min-max normalization for our data. Min-max normalization is a linear transformation of the original input to a newly specified range. The formula used is:
v' = (v - min) / (max - min) * (new_max - new_min) + new_min
where min and max are the smallest and largest original values of the attribute, and new_min and new_max are the bounds of the new range.
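As a minimal sketch of min-max normalization, the Python function below assumes the standard form of the method, in which the smallest original value maps to the lower bound of the new range and the largest to the upper bound. The function name, the default range of 0 to 1 and the sample figures are hypothetical.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # A constant column has no spread, so the rescaling is undefined.
        raise ValueError("min-max normalization needs at least two distinct values")
    scale = (new_max - new_min) / (old_max - old_min)
    return [(v - old_min) * scale + new_min for v in values]


# Hypothetical inventory figures, for illustration only.
scaled = min_max_normalize([200, 400, 1000])
```

With the default range, 200 maps to 0, 1000 maps to 1, and 400 lands a quarter of the way between them.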
3.1.4 Data Mining
Data mining is the use of algorithms to extract information and patterns within the KDD process. This step applies algorithms to the transformed data to generate the desired results. In this project, we use SAS Enterprise Miner to build the models to be compared. In SAS Enterprise Miner, we use the decision tree method.
A decision tree is a tree-shaped structure that represents a set of decisions or predictions of data trends. It is suitable for describing a sequence of interrelated decisions or predicting future data trends, and it can classify entities into specific classes based on their features. Each tree consists of three types of nodes: root node, internal node and terminal node (leaf). The topmost node is the root node, and it represents all of the rows in the dataset. Nodes with child nodes are internal nodes, while nodes without child nodes are called terminal nodes or leaves. A common algorithm for building a decision tree selects a subset of instances from the training data to construct an initial tree. The remaining training instances are then used to test the accuracy of the tree. If any instance is incorrectly classified, the instance is added to the current set of training data and the process is repeated. A main goal is to minimize the number of tree levels and tree nodes, thereby maximizing data generalization. Decision trees have been successfully applied to real problems, are easy to understand, and map nicely to a set of production rules. The resulting trees are usually quite understandable and can easily be used to obtain a better understanding of the phenomenon in question.
Picture 1: Shoes variables
3.1.5 Interpretation & Evaluation Process
Some data mining output is not in a human-understandable format and needs interpretation for better understanding. We therefore convert the output into an easily understood form.
CHAPTER 4 RESEARCH SOLUTION
4.1 Step to import data from EXCEL to SAS
4.1.1 Original Data
Our data is about shoes. The original data consists of 395 records with 8 variables. From the data we would like to know whether the target sales are achieved or not.
Picture 2: Original data
However, not all attributes are valid for the analysis. For instance, the SALES attribute and the target sales attribute refer to the same thing. As the problem is to be solved by classification, the SALES attribute is removed.
4.1.2 Editing Data
The figure above shows the cleaned data that will be imported into SAS. The variables highlighted in red are those related to sales.
Picture 3: Variables that related to target sales
4.2 Steps Involving in SAS Enterprise Miner
Our project is a classification of the SHOES data (sales in Africa, Asia, Canada, Central America/Caribbean, Eastern Europe, Middle East, Pacific, South America, United States, and Western Europe) into YES or NO for achieving a sales target. We therefore used regression, neural network and decision tree models in SAS Enterprise Miner on the given data set. SAS Enterprise Miner helps in developing models quickly by streamlining the data mining process. Below are the steps for developing a model using SAS Enterprise Miner.
Picture 4: The overall scenario
4.2.1 Creating new folder
Create a folder and rename it SQIT 3033. Copy the shoes data into the folder and open SAS. To open Enterprise Miner, choose Solutions > Analysis > Enterprise Miner from the menu bar.
Picture 5: Creating a new folder
To create a new project, choose File > New > Project from the SAS menu bar. Name the project SHOES and click Browse to change the directory to your file location. Click the Create button when you have finished entering all the information. The project is created, and its name appears in the upper left-hand corner of the Diagrams tab of the window that we will refer to as the project navigator. A default diagram labelled Untitled is created underneath it. To rename the diagram, right-click the diagram icon and choose Rename, then type the new name, classification model.
4.2.2 Import Data
Go to File > Import Data to convert the data from Excel to SAS format so that it can be read and analyzed later using SAS Enterprise Miner.
Picture 6: Import Data
The next step is to select the data in Excel format by browsing to the location where it is saved. The figures below show the detailed steps for importing the Excel data into SAS.
Picture 7: Steps on how to import Excel data to SAS
4.2.3 Nodes for Decision Tree 4.2.3.1 Input Data Source
Picture 8: Input data source
Drag the Input Data Source icon to the workspace. Open the icon you just dropped in the diagram workspace by double-clicking it. The Input Data Source window appears. In the Source Data field, click the Select button; a pop-up window appears. From the Library drop-down menu, choose the EMDATA option, and your data appears in the white workspace. Using EMDATA makes sure that our work is saved. Double-click your data set.
Picture 9: Emdata
4.2.3.2 Data Partition
Data partition is important as it helps to keep a subset of the available data out of the analysis for later use in evaluation.
Picture 10: Data partition
The function of this node is to partition the input shoes data set into training, validation and test data sets. The training data set is used for preliminary model fitting. The validation data set is used to monitor and tune the free model parameters during estimation and is also used for model assessment. The test data set is an additional holdout data set that we can use for model assessment. Right-click the Data Partition node and click Open to set the percentage for each partition. We set 70% for training, 0% for validation and 30% for test.
Partition:
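The 70/0/30 partition can be sketched in Python as a reproducible shuffle followed by a cut. This only illustrates the idea and is not how the Data Partition node is implemented; the function name, the seed value and the use of row numbers in place of real records are assumptions.

```python
import random


def partition(rows, test_fraction=0.30, seed=12345):
    """Shuffle the rows reproducibly and split them into (training, test) lists."""
    shuffled = list(rows)
    # A fixed seed makes the same split come out on every run.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 395 placeholder row identifiers standing in for the shoes records.
training, test = partition(range(395))
```

With 395 rows this sketch holds out 118 rows for testing and leaves 277 for training; no validation set is created, matching the 0% setting.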
We also used a Transform Variables node and connected it to the Data Partition node.
Picture 11: Connection for Transform variables
The function of the Transform Variables node is to create new variables or variables that are transformations of existing variables in the data. Transformations are useful when we want to improve the fit of a model to the data. The Transform Variables node also enables us to create interaction variables. Sometimes, input data is more informative on a scale other than the one on which it was originally collected. For example, variable transformations can be used to stabilize variance, remove nonlinearity, improve additivity and counter non-normality. Therefore, for many models, transformations of the input data (either dependent or independent variables) can lead to a better model fit. These transformations can be functions of either a single variable or of more than one variable. In our project, we use the Transform Variables node to make variables better suited for decision tree, logistic regression and neural network models.
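One common transformation of this kind is the logarithm, which the report later applies to Inventory and Returns (Pictures 18 and 19). The Python sketch below shows, under the assumption of strictly positive inputs, how the natural log compresses a long right tail. The function name and the sample values are hypothetical.

```python
import math


def log_transform(values):
    """Apply the natural log to strictly positive values to compress a long right tail."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform requires strictly positive values")
    return [math.log(v) for v in values]


# Hypothetical right-skewed values, for illustration only.
raw = [10, 100, 1000, 100000]
logged = log_transform(raw)

# Ratio of largest to smallest value, before and after the transformation.
spread_before = max(raw) / min(raw)
spread_after = max(logged) / min(logged)
```

In this toy example the largest raw value is 10,000 times the smallest, while after the transformation the ratio is only about 5, so extreme values dominate the model fit far less.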
4.2.3.3 Tree Model
Picture 12: Decision Tree
After that, we chose the Decision Tree node and connected it to the Transform Variables node. The Decision Tree node is used to fit decision tree models to the data. When we run the Decision Tree node in automatic mode, it automatically ranks the input variables based on the strength of their contribution to the tree. This ranking can be used to select variables for use in subsequent modelling. We can override any automatic step with the option to define a splitting rule and prune explicit nodes or sub-trees. Interactive training enables us to explore and evaluate a large set of trees as we develop them.
Picture 13: The process of Decision Tree
Add Decision Tree Nodes
We have three decision trees in our model:
1) A Decision Tree connected from Data Partition, without a connection to Assessment.
2) A Decision Tree connected from Data Partition and to Assessment.
3) A Decision Tree connected from Transform Variables and to Assessment.
*We draw lines to connect these nodes.
Right-click the Decision Tree node and then click Open.
Decision Tree (2 Splits)
Variables:
Basic:
From the result above, we can say that the number of branches from a node is two.
Then, we click Run to view the results.
Picture 14: Result tree 2 Splits
- Based on the training data set, 2 leaves in the tree are selected.
- The misclassification rate in the training data set is 0.0505.
Next, go to View and then choose Tree to view the decision tree results.
Then, right-click the blank space and choose View Competing Splits.
From the table above, we can conclude that the RETURNS variable was used for the first split. The other variables, INVENTORY, STORES, PRODUCT and REGION, are the competing splits for the first node.
Decision Tree (3 Splits)
Variables:
Basic:
From the result above, we can say that the number of branches from a node is four. Then, we click Run to view the results.
Picture 15: Results tree 3 Splits
- Based on the training data set, 3 leaves in the tree are selected.
- The misclassification rate in the training data set is 0.0505.
Next, go to View and then choose Tree to view the decision tree results.
Then, right-click the blank space and choose View Competing Splits.
From the table above, we can conclude that the RETURNS variable was used for the first split. The other variables, INVENTORY, STORES, PRODUCT and REGION, are the competing splits for the first node.
Decision Tree (2 Splits/Maximize Normality) Variables:
Basic:
From the result above, we can say that the number of branches from a node is two. Then, we click Run to view the results.
Picture 16: Results tree 2 Splits/Maximize Normality
- Based on the training data set, 3 leaves in the tree are selected.
- The misclassification rate in the training data set is 0.0505.
Next, go to View and then choose Tree to view the decision tree results. Then, right-click the blank space and choose View Competing Splits.
From the table above, we can conclude that the RETURNS variable was used for the first split. The other variables, INVENTORY, STORES, PRODUCT and REGION, are the competing splits for the first node.
4.2.4 Nodes for Neural Network
The Neural Network node is used to construct, train, and validate multilayer, feed-forward neural networks. We add a Neural Network node and connect it to the Transform Variables node and the Assessment node. By default, the Neural Network node automatically constructs a network that has one hidden layer consisting of three neurons. In general, each input is fully connected to the first hidden layer, each hidden layer is fully connected to the next hidden layer, and the last hidden layer is fully connected to the output. The Neural Network node supports many variations of this general form. In this project, we click Open on the Neural Network node, and these are the settings shown:
Variables:
Basic:
Click the Multilayer Perceptron button and, for Hidden neurons, set the number to 2. There are 2 hidden nodes.
Then, right-click the Neural Network node and click Run. Then click Weights. The weights show the relationships for the two hidden units, H11 and H12.
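To show how such weights turn inputs into a prediction, here is a minimal Python sketch of a forward pass through a network with one hidden layer of two units. The tanh hidden activation and logistic output are assumptions chosen for illustration, and every weight value is hypothetical; none of them come from the fitted model.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer network: tanh hidden units feeding a logistic output."""
    hidden = [
        math.tanh(bias + sum(w * x for w, x in zip(weights, inputs)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    z = output_bias + sum(w * h for w, h in zip(output_weights, hidden))
    return hidden, 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights for two hidden units (named H11 and H12 in the report).
hidden, p_yes = forward(
    inputs=[0.5, -1.0],
    hidden_weights=[[1.0, 0.5], [-2.0, -1.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[1.0, 1.0],
    output_bias=0.0,
)
```

Each row of hidden_weights plays the role of one column of the weights table: it lists how strongly each input feeds one hidden unit.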
Tables:
Misclassification rate in training data set = 0
Misclassification rate in testing data set = 0.0593
Based on the result, we can say that the Neural Network model is not perfect, as it still contains some error. Then, from the weights table, we can make the summary table below to identify the relationship between the variables and H11 and H12.
4.2.5 Nodes for Regression
The Regression node fits both linear and logistic regression models to the data. We add a Regression node and connect it to the Transform Variables node and the Assessment node. The node can use continuous, ordinal, and binary target variables, and both continuous and discrete input variables. It supports the stepwise, forward, and backward selection methods. Right-click the Regression node to open it, and we get this result.
Variables:
Model options:
Selection Method:
Then we run the Regression node and click Statistics to get the table below:
CHAPTER 5 RESULT AND DISCUSSION
5.1 Test for Misclassification Rate
The misclassification rates from each process are noted and compared to find the best method for predicting whether the target sales are achieved. The dependent variable for this analysis is the target sales (yes or no), which is categorical. The analysis also uses the other 7 variables, of which 4 are interval variables. After screening the data, we found no missing values, so we do not need to add the Replacement node.
Continuous variable
Categorical variable
Picture 18: After transformation: Log (Inventory)
Picture 19: After transformation: Log (Returns)
Logistic Regression Model
After running the logistic model, we inspect the T-scores in the figure below. The T-scores are ranked in decreasing order of their absolute values; the higher the absolute value, the more important the variable. The results show that target sales YES and Subsidiary MONTREAL are not very important, whereas Subsidiary GENEVA is the most important model input variable.
Picture 20: Effect T-scores
Picture 21: Lift chart for Regression
For each observation, the logistic regression model predicts the probability that the target sales is yes. The observations are then sorted by predicted probability from highest to lowest. For example, in the figure above, almost 100% of the target sales in the top 10% are yes, and in the top 30%, 100% are yes. The horizontal blue line represents the baseline rate, which is an estimate of the percentage of yes target sales that you would expect in a random sample.
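The quantity plotted in such a chart can be computed by hand. The Python sketch below ranks observations by predicted probability and reports the share of actual YES outcomes in the top fraction; it illustrates the idea only, and the function name, scores and outcomes are hypothetical rather than taken from the model output.

```python
def cumulative_response(probabilities, actual, fraction):
    """Share of actual "YES" rows among the top `fraction` ranked by predicted probability."""
    ranked = sorted(zip(probabilities, actual), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top = ranked[:top_n]
    return sum(1 for _, label in top if label == "YES") / top_n


# Hypothetical scores and outcomes for ten observations.
scores = [0.95, 0.90, 0.85, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10, 0.05]
outcomes = ["YES", "YES", "YES", "NO", "YES", "NO", "NO", "NO", "NO", "NO"]

top_30 = cumulative_response(scores, outcomes, 0.30)   # response in the top 30%
baseline = cumulative_response(scores, outcomes, 1.0)  # overall YES rate
```

Dividing top_30 by baseline gives the lift for that depth: the further the model's curve sits above the baseline, the better it concentrates YES cases at the top of the ranking.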
Misclassification rate training = 0
Misclassification rate testing = 0.14407
From the figure above, in the top 30%, 100% of the target sales are classified in class 3 (yes). The baseline is an estimate of the percentage of class 3 (yes) target sales that you would expect.
Picture 22: Lift chart for Neural Network
Misclassification rate training = 0
Misclassification rate testing = 0.05932
The lift chart compares regression and neural network. Here, the Neural Network curve disappears from the chart, and we conclude that it is the best model.
Picture 23: Lift chart for Regression and Neural Network
Decision Tree
Generally, decision trees are flexible, while regression models are relatively inflexible; for example, you have to add additional terms such as interaction terms or polynomial terms. Decision trees can also deal with missing values without imputation, while a regression model usually requires missing values to be imputed before the model is built. Based on the training dataset below, we can choose any number of leaves because they all have the same lowest misclassification rate, 0.0505. Therefore our accuracy is 1 - 0.0505 = 0.9495. That is, the overall accuracy is about 95%, so some error remains.
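The relation between misclassification rate and accuracy used above can be checked with a short Python sketch. The function name `accuracy_from_misclassification` is a hypothetical helper; only the value 0.0505 comes from the report.

```python
def accuracy_from_misclassification(rate):
    """Accuracy is the complement of the misclassification rate."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("misclassification rate must be between 0 and 1")
    return 1.0 - rate


# 0.0505 is the training misclassification rate reported for the tree.
accuracy = accuracy_from_misclassification(0.0505)
print(f"accuracy = {accuracy:.4f}")
```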
Tree Model for 2 Splits
Based on the decision tree above, there are 2 leaf nodes.
The criteria of the first and second leaf nodes
Based on the above tree, we can conclude that if Returns < 3426.5, the target sales is NO with 93.4%. If Returns >= 3426.5, the target sales is YES with 98.8%.
Returns was used for the first split. Inventory, Stores, Product, and Region were the competing splits for the first split.
Tree Model for 3 Splits
Based on the decision tree above, there are 3 leaf nodes.
The criteria for the first, second and third nodes
Based on the above tree, we can conclude that if Returns < 2162.5, the target sales is NO with 100%. If Returns >= 2162.5 and Returns < 3426.5, the target sales is NO with 63.9%, and if Returns >= 3426.5, the target sales is YES with 98.8%.
Returns was used for the first split. Inventory, Stores, Product, and Region were the competing splits for the first split.
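The three-leaf rules can be written as a small Python function. This is a minimal sketch, not output from SAS Enterprise Miner: the function name `predict_target_sales` is hypothetical, and the middle rule is assumed to cover 2162.5 <= Returns < 3426.5. The thresholds and percentages are the ones reported for the tree.

```python
def predict_target_sales(returns):
    """Return (predicted class, share of that class in the leaf)
    for the three-leaf tree, using the split values from the report."""
    if returns < 2162.5:
        return ("NO", 1.000)
    if returns < 3426.5:   # assumed to mean 2162.5 <= returns < 3426.5
        return ("NO", 0.639)
    return ("YES", 0.988)


# Hypothetical Returns values, one per leaf.
for value in (1500.0, 3000.0, 5000.0):
    label, share = predict_target_sales(value)
    print(f"Returns = {value:>7.1f} -> target sales {label} ({share:.1%})")
```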
Tree Model for 2 Splits/Maximize Normality
Based on the decision tree above, there are 2 leaf nodes.
The criteria of the first and second leaf nodes
Based on the above tree, we can conclude that if the log-transformed Returns < 8.13929, the target sales is NO with 93.4%. If the log-transformed Returns >= 8.13929, the target sales is YES with 98.8%.
Returns (after transformation) was used for the first split. Inventory (after transformation), Stores, Product, and Region were the competing splits for the first split.
Summary: After applying the transformation and viewing the results, the variables used by all three tree models are the same and appear in the same order: Returns first, then Inventory, Stores, Product, and finally Region. The split values for the third model change, but the order of the variables stays the same, because the third tree model uses the Maximize Normality transformation.
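The split value 8.13929 in the transformed tree appears to be the original split value 3426.5 expressed on a log scale. The Python check below assumes that the Log transformation is the natural logarithm, which the report does not state explicitly; the sample Returns values are hypothetical.

```python
import math

raw_threshold = 3426.5                   # split value on the original Returns
log_threshold = math.log(raw_threshold)  # natural log, assumed transformation
print(f"ln({raw_threshold}) = {log_threshold:.5f}")

# Because the logarithm is increasing, a record falls on the same side of
# the split before and after the transformation.
sample_returns = [1200.0, 3000.0, 4000.0, 9000.0]  # hypothetical values
same_side = all(
    (value < raw_threshold) == (math.log(value) < log_threshold)
    for value in sample_returns
)
print("same side of the split after transformation:", same_side)
```

This is consistent with the leaf percentages (93.4% and 98.8%) being identical in the untransformed and transformed two-leaf trees.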
5.1.1 Using Assessment node
Picture 24: Assessment node
By using the Assessment node, we can analyze the results to find the best method or tool. The best method is determined by the misclassification rate of each method: the lower the misclassification rate, the more likely the method is to be selected as the best tool.
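The selection rule described here can be sketched in Python as follows. This is not the Assessment node itself: the helper `misclassification_rate`, the model names `model_a` and `model_b`, and all labels are hypothetical toy values chosen only to show the calculation.

```python
def misclassification_rate(actual, predicted):
    """Fraction of observations whose predicted class differs from the actual class."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


# Hypothetical test-set labels and predictions from two made-up models.
actual = ["YES", "NO", "YES", "YES", "NO", "NO", "YES", "NO"]
predictions = {
    "model_a": ["YES", "NO", "YES", "NO", "NO", "NO", "YES", "NO"],
    "model_b": ["NO", "NO", "YES", "NO", "YES", "NO", "YES", "NO"],
}

rates = {name: misclassification_rate(actual, predicted)
         for name, predicted in predictions.items()}
best_model = min(rates, key=rates.get)   # lowest misclassification rate wins

print(rates)
print("best model:", best_model)
```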
We used three tree models: two with 2-way splits and one with 3-way splits. The number of splits determines how many leaves the resulting tree has. The results for the three tree models are the same. In conclusion, from the table above, the best model is the Neural Network with the Variable Selection tool, which has a misclassification rate of only 1% on the testing data.
Tree (3 split way) 0.0505 0.0593
CHAPTER 6 REFERENCES
1) F. George. (2003). Data mining using SAS applications. Boca Raton: Chapman & Hall/CRC.
2) J. P. Bigus. (1996). Data Mining with Neural Networks: Solving Business Problems from Application Development to Decision Support. New York: McGraw-Hill.
3) M. Craven & J. Shavlik. (1997). Using Neural Networks for Data Mining. Future Generation Computer Systems, 13, pp. 211-229. Retrieved May 30, 2013 from http://pages.cs.wisc.edu/~shavlik/abstracts/craven.fgcs97.abstract.html
4) Noorlin Mohd Ali. (2005). Mining students' data with Holland model using neural network and logistic regression. Sintok: Faculty of Information Technology, Universiti Utara Malaysia.