the predictor (independent) variables (IVs) are of interval or ratio nature.
IMPORTANT
DV: Non-metric (nominal or ordinal scaled) classification/grouping variable
IVs: Metric variables (interval or ratio scaled)
Examples:
The DV is the choice of a brand of PC (A, B or C) and the IVs are ratings of PC attributes on a 7-point scale.
Classification of customers into buyers and non-buyers based on demographic profiles such as age, income and sex, and on factors related to shopping habits.
Families who do or do not go to holiday resorts for vacations as the criterion variable, with income, household size, attitude towards travel, importance attached to family vacations, etc. as predictor variables.
Discriminant Analysis
PURPOSE: to understand segmentation/classification and to predict group membership
INPUT: dependent variable as an indicator of group membership and independent variables as classification criteria
KEY OUTPUT: classification matrix
The objectives of this technique are:
1. Development of a discriminant function, a linear combination of the independent variables that best discriminates between the categories of the dependent variable (groups).
2. Examine whether significant differences exist among the groups in terms of the predictor variables.
3. Determine which predictor variables contribute most to the inter-group differences.
4. Classification of cases into one of the groups based on the values of the predictor variables.
The linear discriminant analysis model, known as the discriminant function, is given by
D (or Y) = b0 + b1X1 + b2X2 + ... + bkXk
where
D = discriminant score
b's = discriminant coefficients
X's = independent variables (k independent variables)
In discriminant analysis a score is assigned to each individual or object. This score forms the basis for classifying the item into the most likely class.
The linear discriminant function in standardised form is given by
D (or Y) = B1X1 + B2X2 + ... + BkXk
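
The scoring step described above can be sketched in a few lines of Python. This is only an illustration: the coefficient values, the case values, the cutoff of 0 and the group labels are hypothetical and are not taken from any fitted model in these notes.

```python
def discriminant_score(b0, coefs, xs):
    """Return D = b0 + b1*X1 + ... + bk*Xk for one case."""
    if len(coefs) != len(xs):
        raise ValueError("need one coefficient per independent variable")
    return b0 + sum(b * x for b, x in zip(coefs, xs))


def classify(score, cutoff, low_group, high_group):
    """Assign the case to high_group when its score exceeds the cutoff."""
    return high_group if score > cutoff else low_group


# Hypothetical coefficients and case values, for illustration only.
b0 = -4.0
coefs = [0.5, 1.0]   # b1, b2
case = [6.0, 2.0]    # X1, X2

score = discriminant_score(b0, coefs, case)   # -4.0 + 3.0 + 2.0
group = classify(score, cutoff=0.0, low_group="non-buyer", high_group="buyer")
print(score, group)
```

With two groups, a single score and a single cutoff are enough to assign each case, which is why one discriminant equation suffices in that situation.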
If the DV has two groups, a single discriminant equation is needed for categorising.
If the DV has three groups, two discriminant equations are needed for categorising.
If the DV has n groups, (n-1) discriminant equations will be required for categorisation.
Examples of applications of DA in business research
1. How do customers who exhibit store loyalty differ from those who do not, in terms of demographic characteristics?
2. Do market segments differ in their media consumption habits?
3. What are the distinguishing characteristics of consumers who prefer to shop on the net?
Important statistics associated with the analysis
1. Discriminant scores (DS)
2. Discriminant function coefficients (B's)
3. Canonical correlation: association between the discriminant scores (DS) and the groups.
4. Centroid: mean value of the DS for a particular group.
5. Classification matrix (confusion matrix or prediction matrix): matrix of correctly classified and misclassified cases.
6. Hit ratio: proportion of correct classifications.
7. Eigenvalue: ratio of between-group to within-group sum of squares. The larger the eigenvalue, the better the function. An eigenvalue greater than 1 means the between-group variation exceeds the within-group variation. When there is only one discriminant function, it accounts for 100% of the explained variance. (The square of the canonical correlation gives the proportion of variation in the dependent variable explained by the model.)
8. Wilks' lambda: indicates the significance of the model. A lower value indicates higher significance. (Wilks' lambda is converted to a
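
As a rough illustration of how several of these statistics relate to one another, the sketch below computes the centroids, eigenvalue, canonical correlation and Wilks' lambda from discriminant scores for the single-function case. The scores are hypothetical values chosen for easy arithmetic. The identities used (squared canonical correlation = eigenvalue / (1 + eigenvalue), Wilks' lambda = 1 / (1 + eigenvalue)) hold when there is one discriminant function.

```python
from math import sqrt


def one_function_stats(scores_by_group):
    """Summarise discriminant scores when there is a single discriminant function."""
    all_scores = [s for group in scores_by_group for s in group]
    grand_mean = sum(all_scores) / len(all_scores)

    # Centroid: mean discriminant score of each group.
    centroids = [sum(group) / len(group) for group in scores_by_group]

    ss_between = sum(
        len(group) * (c - grand_mean) ** 2
        for group, c in zip(scores_by_group, centroids)
    )
    ss_within = sum(
        (s - c) ** 2
        for group, c in zip(scores_by_group, centroids)
        for s in group
    )

    eigenvalue = ss_between / ss_within
    canonical_r = sqrt(eigenvalue / (1 + eigenvalue))
    wilks_lambda = 1 / (1 + eigenvalue)
    return centroids, eigenvalue, canonical_r, wilks_lambda


# Hypothetical discriminant scores for two groups, for illustration only.
scores = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
centroids, eigenvalue, canonical_r, wilks_lambda = one_function_stats(scores)
print(centroids, eigenvalue, canonical_r, wilks_lambda)
```

A large eigenvalue goes with a canonical correlation close to 1 and a Wilks' lambda close to 0, which matches the reading of these statistics given in the list.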
Key Output of Discriminant Analysis: Classification Matrix

Classification matrix

                        True Group
Assigned group     Good Credit    Bad Credit
Good Credit             40            10
Bad Credit              15            35
For the above matrix, the proportion of correct classifications, i.e. the hit rate, is (40+35)/(40+35+10+15) = 75/100 = 75%.
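
The same calculation can be written as a small Python helper. This is a minimal sketch that assumes the classification matrix is square, with rows as assigned groups and columns as true groups, so that correctly classified cases lie on the main diagonal. The function name `hit_ratio` is introduced here for illustration.

```python
def hit_ratio(matrix):
    """Proportion of cases on the main diagonal of a square classification matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Rows: assigned group (good, bad); columns: true group (good, bad).
credit_matrix = [
    [40, 10],
    [15, 35],
]
print(hit_ratio(credit_matrix))  # (40 + 35) / 100 = 0.75
```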
True Group
Good Credit 40 10
Bad Credit 10 30
Application problem
A firm has developed a new industrial process which is a distinct improvement over the existing one. The firm wants to know which industrial units would be interested in buying the process. Units which are early adopters and innovators would go in for the new process. The net profit of industrial units and their membership of trade associations and technical societies are identified as two important determinants. Data on these two variables are available.
Data file: Discrim.Sav
Logistic vs. discriminant
Logistic: when too many assumptions would be needed and there are chances of Type II error and of null hypothesis
Discriminant: all assumptions are met and you need to end the hypothesis