
Improved Governance Through the Application of Advanced Research Techniques:
Data Mining/Predictive Analytics

Nick B. Fontanilla, Ph.D.


UNDECIDED VOTERS ACROSS TIME
[Chart: share of undecided voters, Oct-09 to Feb-10. Data labels: 40.5%, 34.3%, 27.0%, then "10%?"]
Evolution of Research
Web 2.0
User Participation
Role of Social Media in Research
Researcher of the Future
Multi-Modal
• A survey that is administered in multiple
research modes, for example, web-based and
phone-based, web-based and paper-based,
etc.
Benefits of Multi-Modal
• A low cooperation or response rate does more
damage in rendering a survey’s results
questionable than a small sample because
there may be no valid way scientifically of
inferring the characteristics of the population
represented by the non-respondents.

- American Association for Public Opinion Research,
Best Practices for Survey and Public Opinion Research
Why Multi-Mode
• Increased response rates, reduced non-response error
– Multi-phase approaches have proven to be effective
– E.g. media studies with large, burdensome booklets.
• Improved sample coverage
– Hard to reach demographics
– Cell phone only households
• Respondent centric
– Offer participants choices in how and when to complete
– E.g., customer research, where respondents are highly valued.
• Research spanning intercontinental boundaries
– Mode dependent on local infrastructure and customs
• Reducing cost
– Balance increased admin vs. reduced interviewing cost
Traditional
Web 2.0
Business Challenge for Institutions

“The need to provide better campus decision
support systems with an integrated view of data
is critically important to campuses in order to
manage the complexities of our institutions in a
turbulent market environment.”

- Educause

Source: Educause Core Data Service, Fiscal Year 2005 Summary Report, Brian L. Hawkins and Julia A. Rudy, November 2006.
Predictive Analytics
Predictive analysis helps connect data to
effective action by drawing reliable
conclusions about current conditions and
future events.

— Gareth Herschel, Research Director, Gartner Group


What is Data Mining?
“Data mining is the process of discovering
meaningful new correlations, patterns and
trends by sifting through large amounts of
data stored in repositories, and by using
pattern recognition technologies, as well as
statistical and mathematical techniques.”
What is Data Mining?
“Data mining offers firms in many industries the
ability to discover hidden patterns in their data,
patterns that can help them understand
customer behavior and market trends.”
Types of Data Mining
• Supervised Data Mining
– Known outcome
– Example: Graduation Database
• Students who completed their studies vs. those who
dropped out

• Unsupervised Data Mining


– Particular groupings or patterns are unknown
– Example: Student Course Database
• Little is known about which courses are usually taken as a
group
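The course-grouping question can be illustrated with a minimal sketch that assumes nothing more than a list of course sets, one per student. The course codes and records below are hypothetical, and simple pair counting stands in for whatever unsupervised technique an institution would actually use.

```python
from collections import Counter
from itertools import combinations

# Hypothetical enrollment records: one set of course codes per student.
enrollments = [
    {"MATH101", "PHYS101", "CHEM101"},
    {"MATH101", "PHYS101"},
    {"ENGL101", "HIST101"},
    {"MATH101", "PHYS101", "ENGL101"},
    {"ENGL101", "HIST101", "MATH101"},
]


def course_pair_counts(records):
    """Count how many students took each unordered pair of courses together."""
    counts = Counter()
    for courses in records:
        # sorted() gives every pair one canonical order, so (A, B) and (B, A) merge.
        for pair in combinations(sorted(courses), 2):
            counts[pair] += 1
    return counts


pair_counts = course_pair_counts(enrollments)
most_frequent_pair, frequency = pair_counts.most_common(1)[0]
print(most_frequent_pair, frequency)
```

No outcome label is supplied here; the grouping emerges from the records themselves, which is what separates this from the supervised graduation example.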
Data Mining Applications in Higher Education

By Jing Luan, Ph.D.


Vice Chancellor, Educational Services and Planning
San Mateo County Community College District
And Founder, Knowledge Discovery Laboratory

Formerly
Chief Planning and Research Officer, Cabrillo College
Data Mining Applications in Academe

• Who are likely to pursue their application in our university?
• Which subjects are likely taken together by our students?
• What typology are our students classified into?
• Who are likely to shift degree courses?
• Which subject types are associated with certain student types?
Data Mining Applications in Academe
• Who are likely not to graduate on time?
• Who are likely to transfer to another university?
• How would we efficiently allocate time, manpower, and budget?
• Who among our alumni are likely to offer pledges?
• How would we maximize the information from comments / opinions provided by
our students in their evaluation?
Case Study 1:
Creating meaningful learning outcome typologies

• Challenge:
– “What do institutions know about their students?”
– A typical suburban community college with an
enrollment of 15,000 traditionally identifies its
students as:
• “Transfer Oriented”,
• “Vocational Education Directed”, or
• “Basic Skill Upgraders”
– Classifications are based on students’ initial
declarations of educational goals at enrollment.
– Aim: to illustrate further the differences among these
student types.
Case Study 1:
Creating meaningful learning outcome typologies

• Solution:
– Two-Step and K-Means clustering algorithms
– Using the general classification, boundaries among clusters
were unclear and dispersed.
– Possibly, students’ initial declarations of goals did not dictate
their academic behavior.
– Considering educational outcomes and length of study, Two-Step
produced the following clusters, which K-Means validated:
• “Transfers”
• “Vocational Students”
• “Basic Skill Students”
• “Students with Mixed Outcomes”
• “Dropouts”
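For readers unfamiliar with K-Means, the following is a minimal sketch of the algorithm in plain Python. The student records (units completed, terms enrolled), the choice of three clusters, and the starting centroids are all hypothetical and are not taken from the case study; in practice the features would normally be standardized first so that no single scale dominates the distance.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its old centroid
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical records: (units completed, terms enrolled) per student.
students = [(60, 4), (64, 5), (58, 4), (30, 6), (28, 7), (33, 6), (6, 1), (9, 2), (4, 1)]
# Assumed starting centroids: one student picked from each apparent group.
labels, centroids = kmeans(students, [students[0], students[3], students[6]])
print(labels)
```

The resulting labels are only numeric group IDs; names such as “Transfers” or “Dropouts” come from an analyst inspecting each cluster's profile afterwards.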
Case Study 1:
Creating meaningful learning outcome typologies

• Results:
– Improved understanding of student types
• Older students tended to take their time.
• Younger students with more privileged socioeconomic
backgrounds often took high-credit courses and
graduated quickly.

– Helped educators and administrators better meet
the needs of varied student groups
Case Study 2:
Academic Planning and Interventions – Transfer Prediction

• Challenge:
– More than half of community college students identify
transferring to four-year universities as their goal.
– To accurately predict academic outcomes in order to
facilitate timely academic intervention (e.g. student
transfer)

• Solution:
– Neural Network and Rule Induction algorithms using
Supervised Data Mining
– Predictors: Demographics, Courses Taken, Units
Accumulated, Financial Aid
Case Study 2:
Academic Planning and Interventions – Transfer Prediction

• Results:
– Enabled the college to accurately identify good
transfer candidates.
– Model Accuracy:
• Neural Net : 72%
• Rule Induction (C5.0 and C&RT) : 80%
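To make the accuracy figures concrete, here is a minimal sketch of how a supervised model is scored. It assumes a single predictor (units accumulated) and a one-threshold rule, which is far simpler than the neural network, C5.0, or C&RT models named above; all data values are hypothetical. Accuracy is the share of held-out students whose predicted outcome matches the actual one.

```python
def learn_threshold_rule(values, outcomes):
    """Find the cutoff t whose rule 'predict True if value >= t' fits best."""
    best_cutoff, best_correct = None, -1
    for cutoff in sorted(set(values)):
        correct = sum((v >= cutoff) == y for v, y in zip(values, outcomes))
        if correct > best_correct:  # ties keep the lower cutoff
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff


def accuracy(predicted, actual):
    """Share of cases where the prediction matches the actual outcome."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical training data: units accumulated, and whether the student transferred.
train_units = [12, 20, 35, 41, 48, 55, 60, 66]
train_transferred = [False, False, False, True, False, True, True, True]
cutoff = learn_threshold_rule(train_units, train_transferred)

# Hypothetical held-out students, used only to measure accuracy.
test_units = [15, 38, 44, 52, 70]
test_transferred = [False, True, True, False, True]
test_predicted = [u >= cutoff for u in test_units]
test_accuracy = accuracy(test_predicted, test_transferred)
print(cutoff, test_accuracy)
```

Scoring on students the model has not seen is the design choice that matters here: accuracy measured on the training data alone would overstate how well the rule generalizes.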
Case Study 3:
Predicting Alumni Pledges

• Challenge:
– For a typical urban university of 25,000, the
alumni population can be as much as ten times its
enrollment.
– Universities send mailings to alumni on a regular
basis, even when alumni fail to respond.
– Mailing cost > $100K a year.
– Goal: focus mailings on the alumni most likely to
make pledges
Case Study 3:
Predicting Alumni Pledges

• Solution:
– Gain Chart
• Curved line: optimal return
rate (alumni contribution)
• 45-degree line: baseline result
expected if the entire population
received the mailing, untargeted.
• Top 30 percent of the
population → 80 percent of
pledge responses.

• Results:
– The college discovered a way to make its mailing more effective and
increase alumni pledges, while reducing costs.
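The gain chart idea can be sketched in a few lines, assuming each alumnus already has a model score and a known pledge outcome. The scores and outcomes below are hypothetical and do not reproduce the 30 percent / 80 percent result from the case study. The function ranks alumni by score and reports what share of all pledgers falls within the top fraction mailed; the 45-degree baseline is simply the fraction itself.

```python
def cumulative_gains(scores, responded, fractions):
    """For each fraction of the population (ranked by score, highest first),
    return the share of all responders captured within that fraction."""
    ranked = [
        r
        for _, r in sorted(
            zip(scores, responded), key=lambda pair: pair[0], reverse=True
        )
    ]
    total_responders = sum(ranked)
    gains = []
    for fraction in fractions:
        cutoff = round(fraction * len(ranked))  # number of alumni mailed
        gains.append(sum(ranked[:cutoff]) / total_responders)
    return gains


# Hypothetical alumni: model score and whether they pledged (1) or not (0).
scores = [0.40, 0.95, 0.15, 0.85, 0.05, 0.60, 0.90, 0.35, 0.55, 0.20]
pledged = [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]

fractions = [0.1, 0.3, 0.5, 1.0]
gains = cumulative_gains(scores, pledged, fractions)
for fraction, gain in zip(fractions, gains):
    # An untargeted mailing would be expected to capture `fraction` of pledgers.
    print(f"top {fraction:.0%} mailed: {gain:.0%} of pledgers (baseline {fraction:.0%})")
```

The further the gains sit above the baseline at small fractions, the more mailing cost can be cut without losing pledges.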
National Marketing Conference, June 24 and 25, 2010
Strategic Marketing Conference for Students, July 20, 2010
Agora Youth Awards
Agora Conference
Agora Awards
Certified Professional Marketers in Asia
Thank you
