Training on Demand Forecasting and Integrated

Resource Planning

Day 2 – Session 6b
Data for Demand Forecasting
Different Data Requirements
• Different forecasting approaches and methods have
different data requirements
• Model-based methods often require more data items:
• GDP
• Population (electrified)
• Price of Electricity and price elasticity
• Disposable household income
• Weather Conditions (Temperature, humidity)
• Appliance penetration and saturation
• Month of year
• Type of day
• Time of Day
• Customer mix
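The model-based approach above can be sketched with the simplest possible model: an ordinary least-squares fit relating annual demand to a single driver (GDP). All numbers below are hypothetical, purely for illustration.

```python
# Minimal sketch of a model-based forecast: simple linear regression
# of annual electricity demand on a GDP index (hypothetical data).

def ols_fit(x, y):
    """Closed-form simple linear regression: y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

gdp    = [100, 105, 112, 118, 125]        # GDP index, hypothetical
demand = [50.0, 52.4, 55.9, 58.8, 62.3]   # GWh, hypothetical

a, b = ols_fit(gdp, demand)
forecast = a + b * 130    # demand if the GDP index reaches 130
print(round(forecast, 1))  # → 64.7
```

A real model would add the other drivers listed above (temperature, prices, income, etc.) as further regressors, but the fit-then-extrapolate logic is the same.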

Slide 2
Different Data Requirements
• Time-series-based methods often require only a few data
items:
• Interval meter data
• Interval weather data
• However, the overall volume of these few data items is
many times greater than that of the larger number of
model-based data items

Slide 3
How much data?
• History is fundamental to the accuracy of forecasts
• A longer history covers more of the possible situations/events (extreme
weather conditions, financial crises, socio-economic changes)
• A minimum of 5+ years of historic data is highly desirable
• 15-min interval data at customer, transformer, feeder, substation,
transmission-line and generation levels, for 5+ years, will be ?? TB!
• Data volumes of this scale are already handled by the telecom and financial sectors
• Utilities are waking up to the need for, and the capabilities required to
handle, such large data volumes
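A back-of-the-envelope calculation shows how quickly interval data adds up. The metering-point count and bytes-per-reading below are assumptions for illustration, not figures from the slides:

```python
# Rough volume estimate for 15-minute interval data over 5 years,
# assuming 1 million metering points and 16 bytes per stored reading
# (both hypothetical; leap years ignored).
intervals_per_year = 365 * 24 * 4                 # 35,040 readings/year
readings = intervals_per_year * 5 * 1_000_000     # 5 years, 1e6 points
terabytes = readings * 16 / 1e12                  # decimal terabytes
print(readings, round(terabytes, 2))              # → 175200000000 2.8
```

Every extra measurement level (transformer, feeder, substation, ...) multiplies this figure again.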

Slide 4
Data for Forecasting and Computers
Fundamental Rules of Computing
• Computers can store and process very large amounts of data
• They conduct complex calculations in a fraction of the time a
human would need
• Nonetheless, computers are DUMB machines: they follow
human instructions and directions
• Computers ONLY make decisions based on rules defined by
their programmers (humans)
• Human intelligence is dynamic; computer intelligence is static
(at least for the machines that we commonly use)

Slide 5
Rule of Thumb defining Computers

• Garbage In
• Garbage Out
• Or: computers are GIGO machines
• (The term dates back to the early days of computing, in the 1950s)

Slide 6
Fundamental Rules of Computing
• Computers do not assess the accuracy, usability and
rationale of their outputs
• They produce output by processing tons of data through
immense calculations and complex rules
• If the input data contains errors, the OUTPUT will always
contain errors, irrespective of how accurate and sophisticated
your model is
• Always check the QUALITY of data before starting to
produce/use model output (forecasts)
• Conduct simple SANITY checks first, followed by
advanced QA tests

Slide 7
Assessing Quality of Data – Sanity Check

• Completeness of data: Number of data points against the
expected number (missing or duplicate values)
• Graphical Representations: Look for spikes and dips,
smoothness/irregularities
• Simple Statistical Values: Minimum, Maximum, Mean,
Sum, Standard Deviation, Boxplot (outliers)
• Simple Sanity Check: Based on the above, make simple,
rational judgements about the quality of the data
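The sanity checks above can be sketched in a few lines of standard-library Python. The readings and the expected count are illustrative; `None` marks a missing reading:

```python
# Minimal sanity check on a series of interval readings:
# completeness, simple statistics, and a boxplot-style outlier test.
import statistics

readings = [310.0, 305.5, None, 298.0, 9999.0, 302.5, 300.0, 307.0]
expected_count = len(readings)

values = [v for v in readings if v is not None]

# Completeness: actual number of data points vs expected
missing = expected_count - len(values)

# Simple statistical values
lo, hi = min(values), max(values)
mean = statistics.mean(values)

# Boxplot-style outlier test: outside Q1 - 1.5*IQR or Q3 + 1.5*IQR
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
outliers = [v for v in values
            if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(missing, lo, hi, outliers)   # → 1 298.0 9999.0 [9999.0]
```

Note that a glance at min/max alone already flags the 9999.0 spike here; the boxplot test catches subtler cases.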

Slide 8
Improving Quality of Data
• Data Validation: Assess whether each value is valid
• Data Revision: If a value is not valid, revisit the source and
reconfirm it
• Data Estimation: If a value is certainly not valid, discard it
and ESTIMATE a replacement. Do not use invalid values
• Revise Quality Assessment: Repeat until the data quality is
acceptable
• Use Data: Use the accepted data in models to produce
results
• See examples of actual good- and bad-quality data in Excel
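The validate-then-estimate steps above can be sketched as follows. The validity range and the estimation rule (averaging the nearest valid neighbours) are assumptions for illustration; in practice both would come from the utility's own QA rules.

```python
# Minimal sketch of validate-and-estimate: readings outside a
# plausible range (or missing) are discarded and replaced by the
# average of their nearest valid neighbours (hypothetical range).

def clean_series(readings, low=0.0, high=1000.0):
    # Validation: keep only values inside the plausible range
    valid = [v if (v is not None and low <= v <= high) else None
             for v in readings]
    # Estimation: fill each invalid point from its valid neighbours
    out = list(valid)
    for i, v in enumerate(out):
        if v is None:
            left = next((out[j] for j in range(i - 1, -1, -1)
                         if out[j] is not None), None)
            right = next((valid[j] for j in range(i + 1, len(valid))
                          if valid[j] is not None), None)
            if left is not None and right is not None:
                out[i] = (left + right) / 2
            else:
                out[i] = left if left is not None else right
    return out

raw = [300.0, None, 310.0, 9999.0, 320.0]
print(clean_series(raw))   # → [300.0, 305.0, 310.0, 315.0, 320.0]
```

The key point from the slide survives in code: the invalid 9999.0 is never used; it is discarded and estimated.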

Slide 9