Statistics, Data Analysis, and Decision Modeling, Third Edition James R. Evans
Modern organizations manage by fact for performance evaluation, improvement, and decision making. Some organizations ignore data: They may not fully understand what to measure or how to measure it. They may be reluctant to spend the required time and effort. They may feel they can make decisions by instinct and do not need data. They may fear discovering problems or poor performance that data may uncover.
Information derives from the analysis of data. Analysis refers to extracting larger meaning from data to support evaluation and decision making. Data are also used as key inputs to decision models - logical or mathematical representations of problems or business situations.
Statistics
Statistics - the science of collecting, organizing, analyzing, interpreting, and presenting data for the purpose of gaining insight and making better decisions. Applications abound in all business disciplines, manufacturing and quality control, health care, sports, and daily life.
Statistical Thinking
A philosophy of learning and action for improvement based on three principles: (1) All work occurs in a system of interconnected processes - systematic ways of doing things that achieve desired results. (2) Variation exists in all processes. (3) Variation must be understood and reduced.
Variation
Common causes of variation - complex interactions of variation in materials, tools, machines, operators, and the environment. Individual sources are not easily understood and cannot be controlled. Special causes of variation - variation arising from external sources not inherent in a process. These can be identified and controlled or explained. Many managers do not properly distinguish between these two types of causes and, as a result, often make poor decisions.
Six Sigma - a business process improvement approach that seeks to find and eliminate causes of defects and errors, reduce cycle times and cost of operations, improve productivity, better meet customer expectations, and achieve higher asset utilization and returns on investment in manufacturing and service processes. The name refers to a quality level of at most 3.4 errors or defects per million opportunities.
DMAIC (Define, Measure, Analyze, Improve, and Control) - the Six Sigma problem-solving approach, which uses a wide variety of statistical and process improvement tools. Many companies report positive financial results from Six Sigma initiatives.
Metric - a unit of measurement that provides a way to objectively quantify performance. Metrics may be discrete or continuous. Examples: profit, ROI, market share, customer satisfaction, defects, order accuracy. Measurement - the act of obtaining data. Measure - numerical information that results from measurement.
Defects per unit, errors per opportunity, defects per million opportunities (dpmo)
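The three quality metrics above can be computed from simple counts. The sketch below is illustrative only: the function names and the sample figures (17 defects, 1,000 units, 5 opportunities per unit) are assumptions made for the example, not values from the slides. It uses the usual definition of dpmo as defects divided by total opportunities, scaled to one million.

```python
SIX_SIGMA_DPMO = 3.4  # upper limit quoted for Six Sigma quality


def defects_per_unit(defects: int, units: int) -> float:
    """Average number of defects found per unit inspected."""
    return defects / units


def errors_per_opportunity(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects divided by the total number of chances for a defect."""
    return defects / (units * opportunities_per_unit)


def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    return errors_per_opportunity(defects, units, opportunities_per_unit) * 1_000_000


def meets_six_sigma(dpmo_value: float) -> bool:
    """True when the defect rate is at or below 3.4 dpmo."""
    return dpmo_value <= SIX_SIGMA_DPMO


if __name__ == "__main__":
    # Hypothetical inspection results, for illustration only.
    rate = dpmo(defects=17, units=1000, opportunities_per_unit=5)
    print(f"dpu  = {defects_per_unit(17, 1000):.3f}")
    print(f"dpmo = {rate:.1f}")
    print("meets Six Sigma level:", meets_six_sigma(rate))
```

In this made-up case the process is far from the Six Sigma level, since 17 defects in 5,000 opportunities corresponds to about 3,400 dpmo.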
Financial Perspective - profitability, revenue growth, ROI, EPS, etc. Internal Perspective - quality levels, productivity, process yields, cycle time, cost, etc. Customer Perspective - service levels, satisfaction ratings, repeat business, complaints, etc. Innovation and Learning Perspective - intellectual assets, employee satisfaction, market innovation, training effectiveness, supplier performance, etc.
Lagging measures (outcomes) and leading measures (performance drivers) are linked by statistical relationships. Examples: IBM Rochester - causal relationships between people skills, quality, customer satisfaction, and financial/market share performance. Sears - employee attitudes predict behavior, which predicts customer retention, which predicts financial performance.
Sources of Data
Internal - obtained from company records, databases, etc. External - obtained from published sources, external databases, the Internet. Generated - obtained from surveys, focus groups, etc.
Data Classification
Type of data: Cross-sectional - measurements taken at one time period. Time series - data collected over time. Number of variables: Univariate - data consisting of a single variable to measure some entity. Multivariate - data consisting of two or more variables to measure some entity.
Cross-Sectional, Univariate
Cross-Sectional, Multivariate
Data Classification
Categorical (nominal) data - data sorted into mutually exclusive categories (an observation cannot belong to more than one category). Examples: geographical region, type of employee, gender, state of birth, type of automobile owned. Properties: there are no quantitative relationships among categories; statistics such as averages are usually meaningless.
Data Classification
Ordinal data - data ordered or ranked according to some relationship to one another. Examples: ranking of colas in taste tests, employee performance appraisals, satisfaction survey scales. Properties: categories can be compared with one another; statistics such as averages are usually meaningless because there is no fixed unit of measurement, i.e., differences between ranks are meaningless.
Data Classification
Interval data - data that are ordered and characterized by a specified measure of distance between observations, but with no natural zero. Examples: temperature scales, time, survey scales that are assumed to be interval. Properties: ratios are meaningless (50 degrees is not twice as hot as 25 degrees); differences are meaningful, so statistics such as averages may be compared.
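A short calculation shows why ratios fail on an interval scale while differences survive. The sketch assumes the "50 degrees" and "25 degrees" in the example are Fahrenheit readings (the slide does not say) and re-expresses them in Celsius. The 75-degree reading is an extra value added for the comparison of differences.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Assumed Fahrenheit readings; 75 is added only to compare differences.
low_f, mid_f, high_f = 25.0, 50.0, 75.0
low_c, mid_c, high_c = (fahrenheit_to_celsius(t) for t in (low_f, mid_f, high_f))

# The ratio depends on where the scale puts its arbitrary zero.
ratio_f = mid_f / low_f
ratio_c = mid_c / low_c

# Ratios of differences are the same on both scales.
diff_ratio_f = (mid_f - low_f) / (high_f - mid_f)
diff_ratio_c = (mid_c - low_c) / (high_c - mid_c)

if __name__ == "__main__":
    print(f"ratio in Fahrenheit: {ratio_f:.3f}")
    print(f"ratio in Celsius:    {ratio_c:.3f}")
    print(f"difference ratio (F): {diff_ratio_f:.3f}")
    print(f"difference ratio (C): {diff_ratio_c:.3f}")
```

The same two temperatures give a ratio of 2 in one unit and a negative ratio in the other, so "twice as hot" has no meaning. The two equal 25-degree gaps remain equal after conversion.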
Data Classification
Ratio data - data that have a natural zero. Examples: sales dollars, length, weight, time from start of a process, most business and economic data. Properties: strongest form of measurement; both ratios and differences are meaningful.
Data Classification
Discrete metrics (derived from counting), e.g., number of defects per unit of production, percentage of on-time flight arrivals, number of complaints per customer, percentage of top box responses in a satisfaction survey
Continuous metrics (measured on a continuous scale), e.g., delivery time, number of ounces in a bottle of beer, monthly revenues, diameter of a drilled hole, balance in your checking account, time spent on homework
Population - all items of interest for a particular decision or investigation. Examples: all married drivers in the U.S. over age 25; all individuals who do not own a cell phone. Sample - a subset of a population. Examples: Nielsen samples of TV viewers; accounting department samples of invoices for audits. Samples are used to reduce the costs of data collection and when a full census cannot be taken.
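As a rough illustration of drawing a sample from a population, the sketch below picks invoices for an audit using simple random sampling without replacement. The invoice identifiers, the population size of 500, the sample size of 25, and the fixed seed are all hypothetical. The slides do not prescribe a sampling method.

```python
import random

# Hypothetical population: every invoice issued in the period under audit.
population = [f"INV-{n:04d}" for n in range(1, 501)]


def draw_sample(items: list[str], size: int, seed: int = 42) -> list[str]:
    """Simple random sample without replacement; seeded so it can be reproduced."""
    rng = random.Random(seed)
    return rng.sample(items, k=size)


audit_sample = draw_sample(population, size=25)

if __name__ == "__main__":
    print(f"population size: {len(population)}")
    print(f"sample size:     {len(audit_sample)}")
    print("first five sampled invoices:", audit_sample[:5])
```

Auditing 25 invoices instead of 500 is the cost reduction the slide refers to. What is learned from the 25 is then used to draw conclusions about all 500.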
Definition of a Statistic
A statistic is a summary measure of sample data used to describe a characteristic of a population or to draw inferences about the population. Example: 100 owners of a certain car reported 85 problems in the first 90 days of ownership. The statistic 85 describes the number of problems per 100 cars during the first 90 days of ownership, and suggests that the entire population of owners of these cars experiences an average of 0.85 problems per car.
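The arithmetic behind the car example is a single division, sketched below. The counts of 100 owners and 85 problems come from the slide. The fleet size of 20,000 cars is a hypothetical figure added to show how the sample statistic might be projected onto a population.

```python
def problems_per_car(problems_reported: int, owners_surveyed: int) -> float:
    """Sample statistic: average number of problems per car in the sample."""
    return problems_reported / owners_surveyed


def projected_problems(rate_per_car: float, cars_in_population: int) -> float:
    """Point estimate of total problems if the sample rate holds for the population."""
    return rate_per_car * cars_in_population


rate = problems_per_car(problems_reported=85, owners_surveyed=100)

# Hypothetical fleet size, not taken from the slides.
estimate = projected_problems(rate, cars_in_population=20_000)

if __name__ == "__main__":
    print(f"problems per car (sample): {rate:.2f}")
    print(f"projected problems across the fleet: {estimate:.0f}")
```

The projection is only a point estimate. How far it can be trusted depends on how the 100 owners were chosen.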
Statistical Methodology
Descriptive statistics - collection, organization, and description of data. Statistical inference - drawing conclusions about unknown characteristics of a population based on samples. Predictive statistics - inferring future values based on historical data.
Opening, saving, and printing files; navigation; selecting ranges; inserting/deleting rows and columns; entering and editing text, data, and formulas; formatting data (number, currency, decimal); working with text strings; performing basic arithmetic calculations; formatting text; modifying the appearance of a spreadsheet
Copying Formulas
Select a cell. Choose Edit > Copy (or click the Copy icon or press Ctrl-C). Click on the cell to copy to. Choose Edit > Paste (or click the Paste icon or press Ctrl-V).
Cell References
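Relative addressing: B5, G13. Absolute addressing: $B$5 (column and row both fixed); $G13 and K$11 are mixed references (only the column or only the row is fixed). Change the reference type using the F4 key.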
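The effect of the `$` marker shows up when a formula is copied to another cell: relative parts shift with the move and anchored parts stay put. The sketch below imitates that behavior on a formula string. It is a simplified model, not Excel's actual implementation. The function names are made up, it assumes non-negative offsets, and it would wrongly treat a function name ending in digits (such as LOG10) as a cell reference.

```python
import re

# Optional $, column letters, optional $, row digits.
_CELL_REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def _column_to_number(column: str) -> int:
    """A -> 1, B -> 2, ..., Z -> 26, AA -> 27."""
    number = 0
    for letter in column:
        number = number * 26 + (ord(letter) - ord("A") + 1)
    return number


def _number_to_column(number: int) -> str:
    """Inverse of _column_to_number."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def copy_formula(formula: str, rows_down: int, cols_right: int) -> str:
    """Return the formula as it would read after being copied by the given offset."""

    def shift(match: re.Match) -> str:
        col_anchor, column, row_anchor, row = match.groups()
        if not col_anchor:  # relative column: moves with the copy
            column = _number_to_column(_column_to_number(column) + cols_right)
        if not row_anchor:  # relative row: moves with the copy
            row = str(int(row) + rows_down)
        return f"{col_anchor}{column}{row_anchor}{row}"

    return _CELL_REF.sub(shift, formula)


if __name__ == "__main__":
    original = "=B5*$B$5+$G13+K$11"
    print(copy_formula(original, rows_down=1, cols_right=1))
```

Copying the sample formula one row down and one column to the right changes B5 to C6, leaves $B$5 alone, moves only the row of $G13, and moves only the column of K$11.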
Functions
Range functions: MIN, MAX, SUM, AVERAGE. Logical and lookup functions: AND(condition 1, condition 2, ...), OR(condition 1, condition 2, ...), IF(condition, value if true, value if false), VLOOKUP(value, table range, column number).
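For readers who know a programming language, the functions listed above map onto familiar constructs. The sketch below gives rough Python analogues. The helper names `excel_if` and `vlookup` and the sample data are invented for illustration. The `vlookup` helper performs an exact match only and raises `KeyError` on a miss, whereas Excel's VLOOKUP defaults to an approximate match when its optional fourth argument is omitted and returns #N/A when nothing is found.

```python
from statistics import mean


def excel_if(condition: bool, value_if_true, value_if_false):
    """Analogue of IF(condition, value if true, value if false)."""
    return value_if_true if condition else value_if_false


def vlookup(value, table, column_number: int):
    """Exact-match analogue of VLOOKUP(value, table range, column number).

    Searches the first column of `table` and returns the entry in the
    1-based `column_number` of the first matching row.
    """
    for row in table:
        if row[0] == value:
            return row[column_number - 1]
    raise KeyError(value)


# Hypothetical data, for illustration only.
monthly_sales = [120, 95, 140, 110]
price_table = [
    ("A100", "Widget", 2.50),
    ("B200", "Gadget", 4.00),
]

smallest = min(monthly_sales)    # MIN
largest = max(monthly_sales)     # MAX
total = sum(monthly_sales)       # SUM
average = mean(monthly_sales)    # AVERAGE

both_true = all([total > 400, largest >= 140])    # AND
either_true = any([smallest > 100, total > 400])  # OR
rating = excel_if(both_true, "on target", "below target")
gadget_price = vlookup("B200", price_table, 3)

if __name__ == "__main__":
    print(smallest, largest, total, average)
    print(both_true, either_true, rating, gadget_price)
```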
Paste Function
Easiest way to locate a particular function and identify the correct arguments
Split screen; Paste Special; column and row widths; displaying formulas; filling a range; comment boxes; displaying grid lines and headers for printing
Excel Add-Ins
PHStat Menu
PivotTables
Create custom summaries and charts from data. Requires a database with headers. Select any cell in it and choose PivotTable Report from the Data menu. Follow the wizard steps. Drag and drop data items into or out of any of the fields.
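Conceptually, a PivotTable groups records by a row field and a column field and then summarizes a value field within each cell. The sketch below reproduces that idea with the Python standard library. The field names, the sample records, and the `pivot` function are hypothetical, and this is not how Excel builds the table internally.

```python
from collections import defaultdict

# Hypothetical database: each record is one row under the column headers.
records = [
    {"region": "East", "product": "A", "sales": 100},
    {"region": "East", "product": "B", "sales": 80},
    {"region": "West", "product": "A", "sales": 60},
    {"region": "East", "product": "A", "sales": 150},
]


def pivot(rows, row_field, column_field, value_field, aggregate=sum):
    """Group values by (row_field, column_field) and summarize each cell.

    `aggregate` is any function of a list, e.g. sum, len (count), max, min.
    """
    cells = defaultdict(lambda: defaultdict(list))
    for record in rows:
        cells[record[row_field]][record[column_field]].append(record[value_field])
    return {
        row_key: {col_key: aggregate(values) for col_key, values in columns.items()}
        for row_key, columns in cells.items()
    }


if __name__ == "__main__":
    print("sum:  ", pivot(records, "region", "product", "sales"))
    print("count:", pivot(records, "region", "product", "sales", aggregate=len))
```

Swapping the `aggregate` argument plays the same role as changing the summary statistic of a PivotTable field.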
PivotTable Wizard
PivotTable Structure
PivotTable Examples
To change statistics, right-click inside the table and select Field Properties.