PG DIPLOMA
IN DATA ANALYTICS
Program Curriculum
Note: This curriculum is subject to change
based on inputs from IIITB and Industry

COURSE | MODULE NAME | SESSION | SESSION NAME

UNDERSTANDING THE EXCEL INTERFACE


SLICING & DICING DATA
BASIC FORMATTING
DATA ANALYSIS IN EXCEL I CONDITIONAL FORMATTING
ADVANCED FORMATTING
PRINTING & PAGE LAYOUT

EXCEL WITH S. ANAND PASSWORD & NAMING FILES


DELIMITED FILES
DISCOVERING SHORTCUTS
INTRODUCTION TO FORMULAE
DATA ANALYSIS IN EXCEL II
COMPLEX FUNCTIONS
CELL REFERENCING & TEXT FUNCTIONS
LOGICAL FORMULAE
ANAND'S ANECDOTES

INSTALLING R
BASIC OPERATIONS IN R
VECTORS, FACTORS, MATRICES, LISTS
R INTRO TO R
LOOPS & CONDITIONAL STATEMENTS
FUNCTIONS
TOOLS & LANGUAGES

DATA FRAMES

INSTALLATION
BASICS
LISTS
DATA STRUCTURES IN PYTHON
TUPLES
DICTIONARIES
SETS
IMPORTING PACKAGES
IF-ELIF-ELSE
PYTHON*
LOOPS & CONDITIONAL STATEMENTS
CONTROL STRUCTURES & FUNCTIONS COMPREHENSIONS
FUNCTIONS
MAP, FILTER & REDUCE
INTRODUCING NUMPY
DATA ANALYSIS USING PANDAS INTRODUCING PANDAS
MERGING, QUERYING & AGGREGATION

BASICS OF SQL
SQL INTRO TO SQL MYSQL FUNCTIONS
SQL WITH R

TABLEAU INTERFACE
TABLEAU CONNECTING TO DATA & BASIC VISUALISATIONS
VISUALISATION WITH TABLEAU
INSIGHTS FROM VISUALISATIONS

DASHBOARD & STORIES

*Optional

INTRO TO DATA ANALYTICS


DATA MINING
JARGON BUSTING MACHINE LEARNING
TYPES OF DATA ANALYTICS
BIG DATA
INTRO TO DATA ANALYTICS BUSINESS UNDERSTANDING
DATA UNDERSTANDING
DATA PREPARATION
ANALYTICAL PROBLEM SOLVING FRAMEWORK
MODEL BUILDING
MODEL EVALUATION
MODEL DEPLOYMENT

STRUCTURED, SEMI-STRUCTURED & UNSTRUCTURED DATA


BUSINESS OBJECTIVES & DATA TYPES
NOMINAL, ORDINAL, INTERVAL & RATIO ATTRIBUTES
INTRO TO DATA MANAGEMENT

DATA DICTIONARY
DATA DICTIONARY & GRANULARITY
DATA GRANULARITY
INCONSISTENT DATA
MISSING DATA
HOMONYMS
BUSINESS & DATA UNDERSTANDING DATA QUALITY ISSUES & CLEANING SYNONYMS
INACCURATE DATA
GENERAL PURPOSE ATTRIBUTES
UNSTRUCTURED DATA
TYPES OF MERGES
MERGING IN R
DATA PREPARATION & MERGES
DATA DEDUPLICATION
BUSINESS RULES

DEFINING DATA WAREHOUSE


INTRO TO DATA WAREHOUSING
DIMENSION MODELLING
STRUCTURE OF DATA WAREHOUSE
DESIGNING A DATA WAREHOUSE
DATA CUBES
ACQUIRING RELEVANT DATA
DATA STORAGE OPERATIONS ON DATA CUBE
SOME MORE TERMINOLOGIES
VISUALISATIONS - THE WORLD OF IMAGERY
INTRO TO DATA VISUALISATION
UNDERSTANDING THE BASIC CHART TYPES
USING BASE PACKAGE
VISUALISATIONS IN R
USING GGPLOT2 PACKAGE

MEAN
MEDIAN
MEASURES OF CENTRAL TENDENCY MODE
TRUNCATED MEAN
GEOMETRIC MEAN
RANGE
VARIANCE
DESCRIPTIVE STATISTICS
STANDARD DEVIATION
SPREAD OF THE DATA
SKEWNESS
KURTOSIS
COEFFICIENT OF VARIATION
COVARIANCE
ASSOCIATION BETWEEN VARIABLES CORRELATION
CORRELATION IS NOT CAUSATION

UNDERSTANDING PROBABILITY
MARGINAL PROBABILITY
BASICS OF PROBABILITY JOINT PROBABILITY
CONDITIONAL PROBABILITY
BAYES THEOREM FOR CONDITIONAL PROBABILITY
BASICS OF DISTRIBUTION - PDF & CDF
INFERENTIAL STATISTICS PROBABILITY DISTRIBUTION DISCRETE PROBABILITY DISTRIBUTIONS
NORMAL DISTRIBUTION
SAMPLING BIASES & SAMPLING TECHNIQUES
SAMPLING DISTRIBUTION
SAMPLING & SAMPLING DISTRIBUTION CENTRAL LIMIT THEOREM
CONFIDENCE INTERVAL
MARGIN OF ERROR
STATISTICS & EDA

BASICS
NULL & ALTERNATE HYPOTHESIS
STANDARDISED SCORE APPROACH
CONCEPTS IN HYPOTHESIS TESTING UNSTANDARDISED TEST SCORE
P-VALUE APPROACH
TYPES OF TESTS
TYPES OF ERRORS
HYPOTHESIS TESTING
1-POPULATION MEAN TEST
2-POPULATION MEAN TEST
SETTING UP HYPOTHESIS TEST
1-POPULATION PROPORTION TEST
2-POPULATION PROPORTION TEST
UNDERSTANDING T-DISTRIBUTION
SETTING UP T-TEST
WHEN NOT TO USE Z-TEST
NON-PARAMETRIC TEST
SETTING UP CHI-SQUARE TEST

INTRODUCTION TO EDA UNDERSTANDING EDA


SUMMARISING ONE VARIABLE
MEASURING THE SPREAD
UNIVARIATE ANALYSIS
OUTLIER DETECTION
EXPLORATORY DATA ANALYSIS
Q-Q PLOT
ANALYSING QUANTITATIVE VARIABLES
MULTIVARIATE ANALYSIS EDA USING GGALLY
ANALYSING CATEGORICAL VARIABLE

INTRO TO SIMPLE LINEAR REGRESSION


SIMPLE LINEAR REGRESSION
LINEAR REGRESSION IN R
LINEAR REGRESSION
INTRO TO MULTIPLE LINEAR REGRESSION
MULTIPLE LINEAR REGRESSION
MLR IN R

CONTINUOUS DATA
K-NN CATEGORICAL DATA
K-NN IN R
NAIVE BAYES WITH 1 FEATURE
SUPERVISED CLASSIFICATION I*
CONDITIONAL INDEPENDENCE
NAIVE BAYES DECIPHERING NAIVE BAYES
NAIVE BAYES WITH CONTINUOUS DATA
NAIVE BAYES IN R

SIGMOID FUNCTION
INTRO TO LOGISTIC REGRESSION
ESTIMATING & INTERPRETING THE COEFFICIENTS
FEATURE SELECTION THROUGH STEPWISE
PREDICTIVE ANALYTICS I

MULTIVARIATE LOGISTIC REGRESSION


SUPERVISED CLASSIFICATION II FEATURE SELECTION - VIF
C-STATISTIC
MODEL EVALUATION KS STATISTIC & ROC CURVE
THRESHOLD SELECTION

WHY USE SVM


CONCEPTS OF HYPERPLANES
SUPERVISED CLASSIFICATION III* HYPERPLANES & SVM MAXIMUM MARGIN CLASSIFIER
SUPPORT VECTOR CLASSIFIER
SVM IN R

UNSUPERVISED LEARNING
INTRO TO CLUSTERING
CUSTOMER SEGMENTATION
STEPS OF THE ALGORITHM
K MEANS ALGORITHM VISUALISING THE K MEANS ALGORITHM
PRACTICAL CONSIDERATIONS IN K MEANS
DATA PREPARATION
MAKING THE CLUSTERS
UNSUPERVISED LEARNING: K MEANS IN R
DECIDING THE OPTIMAL K
CLUSTERING
INTERPRETING THE RESULTS
STEPS OF THE ALGORITHM
HIERARCHICAL CLUSTERING INTERPRETING THE DENDROGRAM
TYPES OF LINKAGES
CONSTRUCTING THE DENDROGRAM
HIERARCHICAL CLUSTERING IN R CUTTING THE DENDROGRAM
INTERPRETING THE DENDROGRAM

*Optional

INTRO TO MODEL SELECTION


MODEL & LEARNING ALGORITHM
PRINCIPLES OF MODEL SELECTION
SIMPLICITY, COMPLEXITY & OVERFITTING
MODEL SELECTION*
BIAS-VARIANCE TRADEOFF
REGULARISATION & HYPERPARAMETERS
MODEL EVALUATION
MODEL EVALUATION & CROSS VALIDATION

UNDERSTANDING THE BASIC FRAMEWORK


GENERALISED LINEAR REGRESSION
GENERALISED LINEAR REGRESSION IN R
ADVANCED REGRESSION
RIDGE & LASSO REGRESSION
REGULARISATION REGRESSION
RIDGE & LASSO IN R

TIME SERIES VS REGRESSION


INTRO TO TIME SERIES
COMPONENTS OF TIME SERIES
UNDERSTANDING STATIONARITY

UNDERSTANDING WHITE NOISE

ACF & PACF PLOTS

AR & MA MODELLING
WORKING WITH STATIONARY TIME SERIES
ARMA MODELLING

TIME SERIES MODEL EVALUATION


TIME SERIES DIFFERENCING
DIFFERENCING VS CLASSICAL DECOMPOSITION

END-TO-END ANALYSIS ADDITIVE VS MULTIPLICATIVE MODEL


TIME SERIES SMOOTHING

MAKING TIME SERIES FORECAST

CONCEPT OF DECISION TREES


INTERPRETING A DECISION TREE

INTRO TO DECISION TREES ADVANTAGES & DISADVANTAGES


DECISION TREE IN R
REGRESSION WITH DECISION TREE
CONCEPT OF HOMOGENEITY
DECISION TREES GINI INDEX
ALGORITHMS FOR DECISION TREE CONSTRUCTION
ENTROPY & INFORMATION GAIN
PREDICTIVE ANALYTICS II


MULTISTAGE PROPERTY & GAIN RATIO
SPLITTING BY VARIANCE
TREE TRUNCATION
TRUNCATION & PRUNING TREE PRUNING
COST CONSIDERATION & MISSING DATA

INSPIRATION FROM HUMAN BRAIN


WORKING OF A NEURON
HYPER PARAMETERS OF NEURAL NETWORKS
STRUCTURE OF NEURAL NETWORKS SIMPLIFYING NEURAL NETWORKS
SPECIFYING THE HYPERPARAMETERS
ACTIVATION FUNCTION
BUILDING A SAMPLE NETWORK ON MNIST DATA
LAYERS IN NEURAL NETWORKS
INFORMATION FLOW IN NEURAL NETWORKS INFORMATION FLOW IN NEURAL NETWORKS
INFORMATION FLOW - IMAGE RECOGNITION
WHAT DOES TRAINING A NETWORK MEAN?
NEURAL NETWORKS*
COMPLEXITY OF THE COST FUNCTION
TRAINING A NEURAL NETWORK
UPDATING THE WEIGHTS & BIASES
UPDATING THE WEIGHTS & BIASES
STOCHASTIC GRADIENT DESCENT
TRAINING IN BATCHES
EXPLORATION & EXPLOITATION
PROBLEMS WITH BACKPROPAGATION
REPRESENTATION LEARNING REPRESENTATION LEARNING
CONVOLUTION OF NEURAL NETWORKS
DEALING WITH SEQUENTIAL DATA
RECURRENT NEURAL NETWORKS
REGULARISATION IN NEURAL NETWORKS

NEURAL NETWORKS IN R DATA UNDERSTANDING & PREPARATION

WHY ENSEMBLES WORK


CREATING AN ENSEMBLE - BAGGING
ENSEMBLES BAGGING & BOOSTING
CREATING AN ENSEMBLE - BOOSTING

ENSEMBLE IN R

MARKET BASKET ANALYSIS


SUPPORT & CONFIDENCE
UNDERSTANDING ITEMSETS
APRIORI PRINCIPLE
GENERATING FREQUENT ITEMSETS
ASSOCIATION RULE MINING* DATA STRUCTURE FOR FREQUENT ITEMSETS
GENERATING RULES FROM FREQUENT ITEMSETS

UNDERSTANDING ASSOCIATION RULES MAXIMAL & CLOSED FREQUENT ITEMSETS


MEASURE OF LIFT

ASSOCIATION RULE MINING IN R

*Optional

WHAT IS BIG DATA?


IS IT THE NEW TERM?
HOW IS IT GENERATED?

INTRODUCTION TO BIG DATA WHO GENERATES IT? THE CONCEPT OF INTERNET


DATA EXPLOSION
INTRO TO BIG DATA WHO USES IT? OR THE APPLICATIONS
WHY IS IT USEFUL IN BUSINESS?

HOW IS IT BEING USED BY BUSINESSES OR INDUSTRIES?

INDUSTRY APPLICATIONS & UTILITY HOW TO PROCESS IT? OR THE PROCESSING PLATFORMS
WHAT ARE THE JOB ROLES FOR BIG DATA ANALYSTS?

HISTORY AND BACKGROUND OF HADOOP

INTRODUCTION TO HADOOP V1 - BASIC COMPONENTS, WORKING
INTRO TO HADOOP
INTRODUCTION TO HADOOP V2 - BASIC COMPONENTS, WORKING
HADOOP DIFFERENCE BETWEEN HADOOP V1 AND V2
INTRODUCTION TO HDFS - COMPONENTS, STORAGE

HDFS COMMANDS DEMONSTRATION


WORKING WITH HDFS
BIG DATA

FAULT TOLERANCE, HIGH AVAILABILITY IN HDFS


CASE STUDY FOR HDFS

INTRODUCTION TO DATA INGESTION AND WHY IS IT NEEDED?

INTRODUCTION TO SQOOP
DATA INGESTION WITH SQOOP
SQOOP IMPORT AND EXPORT
SQOOP COMMANDS DEMO
MANAGING DATA
INTRODUCTION TO HIVE
UNDERSTANDING HIVE COMPONENTS
HADOOP DATABASE - HIVE
USING HIVE COMMANDS ON SAMPLE DATASETS TO DEMONSTRATE MANAGED AND EXTERNAL TABLES

INTRODUCTION TO SPARK - BASICS, ARCHITECTURE, GENERAL WORKING
INTRO TO SPARK
DIFFERENCE BETWEEN HADOOP V2 AND SPARK

WHY IS SPARK NEEDED?


INTRODUCTION TO SPARKR
SPARK DATA ANALYSIS USING SPARKR
USING SPARKR

INTRODUCTION TO SPARKSQL
DATA ANALYSIS USING SQL
USING SPARKSQL

INTRODUCTION TO SPARK MLLIB


MACHINE LEARNING USING SPARK
USING MLLIB FOR DATA MINING

HOW BANKS MAKE MONEY


INTRODUCTION TO BFS INTRODUCTION TO BFS
CUSTOMER LIFECYCLE

ACQUISITION STRATEGIES
ACQUISITION ANALYTICS ACQUISITION ANALYTICS
LAB - BANK MARKETING
BFS

CROSS SELLING AND UPSELLING


ENGAGEMENT ANALYTICS ENGAGEMENT ANALYTICS
RETENTION MANAGEMENT

TYPES OF RISK IN BFS


RISK ANALYTICS RISK ANALYTICS
RISK MITIGATION STRATEGIES

UNDERSTANDING THE BUSINESS OF E-COMMERCE

USE OF ANALYTICS IN INVENTORY MANAGEMENT

USE OF ANALYTICS IN MARKETING

INTRO TO E-COMMERCE INTRO TO E-COMMERCE USE OF ANALYTICS IN SITE OPTIMISATION

USE OF ANALYTICS IN FRAUD DETECTION

USE OF ANALYTICS IN OPTIMISING DELIVERY

USE OF ANALYTICS IN CUSTOMER FEEDBACK ANALYSIS

CONTENT BASED FILTERING


RECOMMENDER SYSTEM
E-COMMERCE

RECOMMENDER SYSTEM COLLABORATIVE FILTERING

RECOMMENDER SYSTEM IN R RECOMMENDER SYSTEM IN R

MARKUP & MARKDOWNS IN PRICE-BASED FILTERING


PRICE OPTIMISATION PRICE OPTIMISATION
4 FORCE MODEL FOR PRICE OPTIMISATION

UNDERSTANDING THE CONCEPT OF MARKET MIX MODELLING

VARIOUS MARKETING LEVERS


MARKET MIX MODELING MARKET MIX MODELING
VARIOUS STATISTICAL MODELS

VISUALISATION & CONCLUSION

A/B TESTING
A/B TESTING A/B TESTING
EXECUTING A/B TEST IN OPTIMISELY

OVERVIEW OF GLOBAL HEALTHCARE MARKET

US HEALTHCARE
OVERVIEW OF HEALTHCARE INDUSTRY
SCOPE OF ANALYTICS

JOB PERSPECTIVES

OVERVIEW OF THE HEALTHCARE COMPONENTS


PATIENT CYCLE
INTRODUCTION TO HEALTHCARE SPACE COMPONENTS OF HEALTHCARE
PBMS

BROAD ANALYTICAL OBJECTIVES

ROLE OF REGULATORY BODIES


REGULATORY BODIES IN HEALTHCARE STRUCTURE OF REGULATORY BODIES IN US

REFORMS IN HEALTHCARE POLICIES


HEALTHCARE ANALYTICS

COST MANAGEMENT

CARE AND HEALTH MANAGEMENT

NETWORK DESIGN

MEDICAL COST AND CARE MANAGEMENT EVIDENCE-BASED CARE


DATA UNDERSTANDING AND ANALYSIS DISEASE MANAGEMENT
UNDERSTANDING PBM DATASETS

PATIENT ADHERENCE

UNDERSTANDING PUBLIC DATASETS


PUBLIC DATASETS IN HEALTHCARE
ANALYSIS OF CMS DATASET

OVERVIEW OF DRUG LIFECYCLE

CLINICAL TRIALS

CLINICAL TRIALS & PHARMA MARKET ACCESS MARKET ACCESS

STAKEHOLDERS' PERSPECTIVE IN MARKET ACCESS

DRUG LIFECYCLE ANALYTICS SCENARIOS IN MARKET ACCESS

BUSINESS PROBLEM

FRAMEWORK TO SOLVE THE BUSINESS PROBLEM


PHARMA MARKET ACCESS DEMONSTRATION
ATTRIBUTES & DATA PULLING

INSIGHT GENERATION
