Course Catalogue
Driven on R
Course Content of
Statistical Analytics
1. Random Variables, Probability Distributions
a. Motivate the use of statistical methods for managerial decision
making
b. Discuss the concepts of probability distributions and random
variables
c. Review methods of representing data, pictorially and through
summary statistics
4. Confidence Intervals (I)
a. Introduce the concept of confidence intervals as a way to make
statistical inferences
b. Calculate confidence intervals for the population mean with known and
unknown population variance
7. Hypothesis Tests (II)
a. Conduct two-sided hypothesis tests for population proportion / mean
9. Analysis of Variance
a. Introduce Design of Experiments
b. Conduct one-way Analysis of Variance (ANOVA)
11. Introduction to regression methods; Bivariate data; Scatter plot;
Covariance; Correlation coefficient; Uses and issues; Correlation and causality;
Linear regression; Assumptions.
15. Ridge regression; Dummy variables; Transformations: power transformation,
Box-Cox transformation.
19. Binary response; Linear Probability Model; Advantages and issues; Guidelines
for Linear Regression Modeling
20. Regression Models for count data
a. Generalized Linear Models
b. Binary and multinomial logistic regressions
c. Poisson regression
d. Zero-inflated Poisson regression
e. Negative Binomial regression
22. Survival Analysis
a. Censoring and truncation. Characteristics of survival analysis
data
b. Time-to-event data. Hazard and survival functions
c. Kaplan-Meier estimate of survival function
d. Cox proportional hazards (PH) model, estimation and its
analysis. Extensions
e. Stratified PH; PH with time-varying covariates
f. Parametric survival analysis with standard distributions
g. Accelerated failure time models
23. Design of Experiments
a. Basic concepts: randomization, replication and control
b. Experimental design for testing differences in several means:
Completely randomized and randomized complete block designs.
Cross-over designs
c. Two-level factorial experiments: full and fractional.
Plackett-Burman designs
d. Designs for three or more levels. Taguchi designs. Response
surface designs
e. Case-Control designs for campaign evaluation
f. Designs for conjoint analysis
Course Content of
Forecasting Analytics
1. Introduction to forecasting; Types and methods; Exploring data patterns
4. AR and MA models
Course Content of
Data Mining
Data Mining-1 (Unsupervised Learning)
1. Basic matrix algebra
2. Introduction to data mining
3. Dimension reduction techniques: Principal Component Analysis (PCA)
4. Singular Value Decomposition (SVD)
5. Association rules
6. Sequential pattern mining
7. Recommender systems (collaborative filtering)
8. Network analytics: degree centrality, closeness centrality, etc.
9. Cluster analysis: applications to segmentation and anomaly detection
10. Hierarchical clustering and K-means clustering with various distance
measures and for continuous/categorical variables
Data Mining-2 (Supervised Learning)
11. Overview of machine learning/supervised learning
12. Data exploration methods: Understanding data (distributions, visualizations),
data nuances, data transformations
13. Basic classification algorithms
a. Version spaces and decision tree classifiers
b. K-Nearest Neighbors and Parzen window
c. Bayesian classifiers: naive Bayes and other discriminant classifiers
d. Perceptron and Logistic regression
e. Neural networks
14. Advanced classification algorithms
a. Bayesian Networks
b. Support Vector Machines
15. Model validation and interpretation
16. Multi-class classification problem
17. Bagging (Random Forest) and Boosting (Gradient Boosted Decision Trees)
18. Regression Analysis
19. Recommendation engines
20. Information retrieval
21. Practical tips in modeling: Bias-variance trade-off, feature engineering and
incorporating domain knowledge.
Course Content of
Data Visualization
PART 1: NodeXL focused
1. Three important principles of visualization
2. Lie Factor
3. Using consistent scales
4. Presenting data in context
5. Data-ink ratio
6. Tufte's Graphical Integrity Rules
7. Tufte's Principles for Analytical Design
8. Types of chartjunk & how to avoid them
9. Dashboards: Good, Bad & Ugly
10. Affordance Theory
11. Network theory using NodeXL
a. Degree (In-degree, Out-degree)
b. Centrality (Closeness, Betweenness, Eigenvector)
c. Grouping / Clustering
d. Facebook network hands-on
12. Big data visualization problems
PART 2: Tableau focused
1. Introduction to the various file types
2. How to access help
3. Quick introduction to the user interface in Tableau
4. How to connect to the data sources
5. How to join the various data sources
6. How to create data visualizations using Tableau's "Show Me" feature
7. Reorder & remove visualization fields
8. How to sort & filter data
9. How to create a calculated field
10. How to perform operations using cross-tab
11. Working with workbook data & worksheets
12. How to create a packaged workbook
13. Creating various charts such as
a. Heat map
b. Box and Whisker plot
c. Pareto chart, etc.
14. Creating maps & setting map options
15. Creating dashboards & working with dashboards
Thank You