
Introduction Excel Guide 1

The Excel Guide


to accompany

Practical Business Statistics


by Andrew F. Siegel

PREPARED BY ANDREW F. SIEGEL

Copyright © 2003 Andrew F. Siegel


Published by Irwin/McGraw-Hill

Excel is a registered trademark of Microsoft, Inc.


Preface
You may already be familiar with Excel, the versatile spreadsheet program that is used widely in
business and management analysis of nearly everything from accounting and finance to production
and marketing. Much of the success of spreadsheets is due to the complete flexibility you have in
putting text, numbers, and graphics anywhere on your computer screen (and in having formulas
update themselves automatically). In addition, Excel includes a sophisticated set of statistical tools
and is therefore a natural computing environment for a business statistics course.

The purpose of this Excel Guide is to help you learn statistics by working through real-data
examples from the textbook Practical Business Statistics by Andrew F. Siegel. You don't need to
know a lot about computers when you begin. Because the material is presented “from scratch,”
once you launch into Excel, you will be able to get results right away. Just follow along and try
commands like the ones you see presented and discussed here.

This Excel Guide works with Excel only. If you would like to enhance the statistical capabilities
of Excel, we recommend StatPad, an Excel add-in that also provides non-technical
explanations of the results of your statistical analysis. Many statistical methods explained here are
much easier to do in StatPad. When you use StatPad, it seems as though the added conveniences
were built into Excel itself, so there is no need to leave the familiar spreadsheet environment of
Excel. StatPad comes with Practical Business Statistics by Andrew F. Siegel, published by
Irwin/McGraw-Hill.

The Excel Guide begins with an introductory chapter to tell you about Excel and get you up and
running with the basics. After that, the chapters here closely follow the same sequence as the
chapters of the textbook Practical Business Statistics, beginning with the Histograms chapter
(Chapter 3). While this Excel Guide gives enough information for you to see how to use the
computer, you may wish to keep Practical Business Statistics handy for reference and further
details about the examples because some of them are taken directly from the textbook. Once
you've seen how to work the textbook examples, it should be straightforward for you to do
homework and projects.

Each chapter contains discussion, examples, explanations, and the results of actual Excel sessions.
Don’t forget that many Excel data files are available with Practical Business Statistics, so that
there is no need to retype any data from your textbook.

Best wishes to you in this learning adventure!


Introduction and Sample Excel Session
Excel is a powerful computing environment with statistical capabilities. You can type data into the
worksheet, analyze and manipulate the data, and write text to identify and explain it. Summaries,
charts, and detailed calculations are easily done using Excel’s menu commands and functions. In
this introductory chapter we cover some of the basics of Excel with hints and tips including:
entering data, doing arithmetic, using functions, selecting and naming cells, using UNDO,
formatting, working with data files from Practical Business Statistics, sorting, and making a
chart.
If you are new to Excel, remember that the best way to learn is by experimenting. Explore the
menu system and try things out to see how they work. Use Help for guidance. This manual gives
you step-by-step instructions for many tasks. If you are an experienced Excel user, this manual
will show you many ways in which Excel can be used for statistical calculations.

Moving Around and Typing Data into the Worksheet


I enjoy the freedom of working in spreadsheets like Excel. You can click on any cell you want,
type anything you want - text or number or function - and it stays there when you hit the Enter
key. To move around, you can use the mouse or the cursor keys ←, ↑, →, and ↓. Here is a
worksheet that has some text, some numbers, and a formula. Note that you can see what formula
is in the selected cell (in this case, “=C3+C4”) by looking at the Formula Bar near the top.

In this case, the formula starts, as always, with an equals sign “=”. To enter the formula that adds
Jim’s and Adrian’s sales together, you might either type the formula directly and hit Enter, or
construct it by pointing to cells as follows:
1. Select cell C6 by clicking on it or moving to it with the cursor keys
2. Hit the = key
3. Click on Jim’s sales in cell C3


4. Hit the + key
5. Click on Adrian’s sales in cell C4
6. Hit Enter.

Using Formulas to do Basic Arithmetic


Any cell can contain a formula that uses basic arithmetic with numbers and references to numbers
(or formula results) in other cells. Here are some of the rules of basic arithmetic in Excel.
1. Start by selecting the cell where you want the result to go and hitting the = key
2. Use operators “+” for addition, “-” for subtraction, “*” for multiplication, “/” for division,
and “^” for exponentiation (raising to a power). Here are some examples (in column B) with
the formulas written out (in column C). Note that 2 ^ 3 means three 2s multiplied
together, so it’s 2 * 2 * 2 = 8. Note also that the last formula, in cell B7, adds the results of
two other formulas to find 18 as 8 + 10.

3. Rules of arithmetic say that these operations are performed in the following order:
a. Exponentiation “^” is done first
b. Multiplication “*” and division “/” happen next, working from left to right. Thus
2 / 3 * 4 is evaluated as (2 / 3) * 4; to divide by the product, use parentheses, as in 2 / (3 * 4)
c. Addition “+” and subtraction “-” are done last. Thus 6 + 4 * 2 ^ 3 is evaluated as 6 + 4
* 8, which is 6 + 32, which is 38.
d. If you want something to happen first, put it in parentheses. For example, (2 + 3) * 4
makes the addition happen before the multiplication.
e. If you have a minus sign that negates a number instead of subtracting, be careful! Negation
happens even before exponentiation! Thus -2 ^ 4 is evaluated as (-2) ^ 4, which is 16. If you
wanted -(2 ^ 4), you would need to include the parentheses to make the exponentiation happen
first and to get -16 as the answer.
4. Percentages are used as if they were already divided by 100. For example, if you enter a
percent like “20%” directly into a cell, its value is taken to be 0.20. This makes it easy, for
example, to find 20% of a number: you simply multiply the number by 20%.
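
If you would like to cross-check these rules outside Excel, here is a minimal Python sketch (Python is not part of Excel or of this guide's data files). Python writes exponentiation as ** where Excel uses ^, and the number 150 in the last line is just an illustrative value. Watch for one real difference: Python applies ** before a leading minus sign, so Excel's reading of -2 ^ 4 has to be written with explicit parentheses.

```python
# Python stand-ins for the Excel formulas discussed above.
# Python spells exponentiation ** where Excel uses ^.

power = 2 ** 3                      # Excel: =2^3
mixed = 6 + 4 * 2 ** 3              # Excel: =6+4*2^3, that is 6 + 4*8
grouped = (2 + 3) * 4               # Excel: =(2+3)*4
left_to_right = 2 / 3 * 4           # evaluated as (2/3)*4
divide_by_product = 2 / (3 * 4)     # parentheses force the product first

# Excel reads =-2^4 as (-2)^4; Python reads -2 ** 4 as -(2 ** 4).
excel_negation = (-2) ** 4
python_negation = -2 ** 4

# "20%" typed into an Excel cell is the number 0.20 (150 is illustrative).
twenty_percent = 150 * 0.20

print(power, mixed, grouped, excel_negation, python_negation)
```

The point of the two negation lines is that the same keystrokes can mean different things in different programs, so parentheses are the safest habit.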

Using Functions to Compute a Number


Excel has a vast collection of useful functions. One easy way to browse them is to select an empty
cell and choose Insert/Function from the main menu. Here is how it looks if you select the
Statistical Function Category and then the AVERAGE function:

This is a nice way to insert a function into the worksheet because Excel will help you fill in the
details in the correct order, so that you don’t have to memorize what goes where, which is
especially useful with functions that need more than one piece of information. To insert the
AVERAGE function, click OK to see a dialog box like this
that is ready for you to select one or more cells by clicking or dragging the mouse across cells
with the numbers you want to average. You may move this dialog box out of the way by dragging
almost anywhere on it. Here is how it looks after dragging down cells B2 through B7:

When you click OK, the result is placed into the worksheet in the cell that was selected when you
first chose Insert/Function from the menu. Here is the result:

You could achieve exactly the same result by selecting cell B9 and typing “=AVERAGE(B2:B7)”
without the quotation marks and then hitting Enter. Another way to do this is to type
“=AVERAGE(” without the quotation marks, then use the mouse to drag down cells B2 through
B7, then type “)” without the quotation marks and hit Enter.

Selecting a Range of Cells


You will probably want to do many things to cells: put things in them, format them, calculate with
them. The way Excel works, you will need to know how to select cells in order to change them or
use them.
To select a rectangular range of worksheet cells, simply drag the mouse from one corner
diagonally to the opposite corner. The result will look something like this:
Another way to select these cells would be to use the cursor keys to move to one corner, say C70.
Then hold the Shift key while you move right → twice. Then hit the End key (with or without
Shift). Finally, hold the Shift key while you hit the down arrow ↓. When you use the End key,
the next movement (left, right, up, or down) will go to the end of the row or column you are
working in. Holding the Shift key expands the selection.

Naming a Range of Cells


It is much more convenient to refer to a list of data using an Excel Range Name like “Sales”
instead of an Excel address like “D3:D6”. It is a good idea to also have a column heading like
“Sales” in the cell above the data, but this may not be enough. Some versions of Excel will try to
figure out which cells you wish to work with, but the best way to be sure that the name is
associated with the correct data is to explicitly give the range a name.
Here is one way to create an Excel range name for a column of sales numbers with a label at the
top:
1. Begin by selecting the sales numbers (just the numbers, not the label) by dragging the
mouse down the column. It should look something like this:
2. Choose Insert/Name/Define from the main menu system. Because the label is at the top
and you have selected cells below it, Excel knows what you want to do and proposes to
give the range name “Sales” to the data in cells D3 through D6. Here is how it should
look:

You can also use this Define Name dialog box to see what other names are defined and to
check that they refer to the correct worksheet range.
You cannot just choose any name for a range. The first character must be a letter or the
underscore character “_”. The other characters can be letters, numbers, periods, and
underscore characters, but not spaces (use underscores instead). Names cannot be the
same as a cell reference (for example, C16, R3C5, R, and C are not allowed). There is no
distinction between uppercase and lowercase letters, so “Sales”, “sales”, “SALES”, and
“sALeS” all refer to the same worksheet cells.
3. When you choose OK, the range name is assigned. Whenever you select this range, its
name (“Sales”) will appear in the name box near the top left corner of the worksheet, at
the left end of the formula bar. You can select this range quickly by choosing its name in
the name box.
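
The naming rules in step 2 can be summarized as a small check. The following Python sketch is only an illustration of the rules as stated above; the function names is_valid_range_name and same_range_name are made up for this example, the cell-reference patterns are simplified assumptions, and Excel itself may apply further rules not listed here.

```python
import re

# Simplified patterns (assumptions for this sketch, not Excel's own logic).
ALLOWED = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")    # letter or _ first, no spaces
A1_STYLE = re.compile(r"^[A-Za-z]{1,3}[0-9]+$")       # looks like C16
R1C1_STYLE = re.compile(r"^[Rr][0-9]*[Cc][0-9]*$")    # looks like R3C5

def is_valid_range_name(name):
    """Return True if the name passes the rules listed in step 2."""
    if not ALLOWED.match(name):
        return False                 # bad first character, space, or other symbol
    if name.upper() in ("R", "C"):
        return False                 # R and C alone are not allowed
    if A1_STYLE.match(name) or R1C1_STYLE.match(name):
        return False                 # same as a cell reference
    return True

def same_range_name(first, second):
    """Names differing only in upper/lower case refer to the same range."""
    return first.lower() == second.lower()

for candidate in ["Sales", "net_sales", "net sales", "C16", "R3C5", "R", "3rd"]:
    print(candidate, is_valid_range_name(candidate))
```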
The Fill Handle


At the lower right-hand corner of a selection is the fill handle. One nice thing it can do, if
you drag it with the mouse, is extend a sequence automatically:

Another nice thing the fill handle can do is automatically copy a selected cell’s formula down a
column by dragging the fill handle as far as you want. If the cell is next to a column with data in it,
then double-clicking the fill handle will automatically copy the cell’s contents down the column!
Copying and Pasting


To copy and paste text, a number, or a formula, you select the source cell(s), choose Edit/Copy
from the main menu, select a cell at the destination, and choose Edit/Paste.
To move the contents of a cell or cells, select the source cell(s) and choose Edit/Cut, select a cell
at the destination, and choose Edit/Paste (or just hit Enter).
To paste just the numbers but not the formulas, when you paste, choose Edit/PasteSpecial/Values.
If a formula adds the two numbers to its left, then the way it copies depends on how the cell
addresses are specified. With relative addressing the formula changes to reflect its new location.
Suppose that the formula “=A5+B5” is in cell C5. When this formula is copied to another cell, the
resulting formula will change so that it adds the two cells to the left of the destination. For
example, if this formula is copied to cell C6, the formula will change to “=A6+B6”. With absolute
addressing the formula remains the same. If the formula includes dollar signs to read
“=$A$5+$B$5”, then it remains unchanged when it is copied, always adding these two cells.
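
To make the difference between relative and absolute addressing concrete, here is a rough Python sketch that imitates what happens to the cell addresses when a formula is copied. The function copy_formula and its helpers are hypothetical names invented for this illustration; the sketch handles only simple formulas made of cell addresses and operators (a function name such as LOG10 would be mistaken for a cell address), so treat it as a model of the idea, not of Excel itself.

```python
import re

# A cell address: optional $, column letters, optional $, row digits.
CELL_REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")

def column_to_number(letters):
    """Convert column letters to a number: A -> 1, B -> 2, ..., AA -> 27."""
    number = 0
    for ch in letters:
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number

def number_to_column(number):
    """Convert a column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

def copy_formula(formula, rows_down, columns_right):
    """Shift every relative address; leave parts marked with $ unchanged."""
    def shift(match):
        col_fixed, col, row_fixed, row = match.groups()
        if not col_fixed:
            col = number_to_column(column_to_number(col) + columns_right)
        if not row_fixed:
            row = str(int(row) + rows_down)
        return f"{col_fixed}{col}{row_fixed}{row}"
    return CELL_REF.sub(shift, formula)

print(copy_formula("=A5+B5", 1, 0))          # copied from C5 to C6
print(copy_formula("=$A$5+$B$5", 1, 0))      # absolute addresses stay put
```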

Using UNDO
Thank goodness for UNDO! No need to worry if you have just erased your precious data by
accidentally hitting the delete key, so long as you react reasonably quickly. Just choose Edit/Undo
from the main menu, and your valuable data will reappear as if by magic. Excel now has multiple
UNDO levels, so that you can undo more than one action.

Formatting a Range of Cells


To make your worksheet look nice, you will need to format cells. Select the cells first, then use
Format/Cells from the main menu. You will then have control over how numbers appear (number
of decimal places, percentage, dollar signs, dates, etc.), how cells are aligned (left or right, top or
bottom), what font is used (including color, size, underline, italic, bold), how cell borders are
indicated, and what patterns and colors fill the cells.
To show numbers with 2 decimal places and commas for thousands separation, here is the
Format/Cells dialog box with the Number tab chosen. Don’t forget to select the cell(s) first!
To show numbers as percentages with one decimal place, you would use
Working with Data Files from Practical Business Statistics


For use with Excel, each chapter of Practical Business Statistics has its own data file that
includes the data tables from examples and problems. To access it, use File/Open from Excel’s
menu. Each column of numbers is named and ready to use. For example, the data sets from
Chapter 3 are in the file named Chapter03.xls, and the employee database from Appendix A of the
textbook is in the file named EmployeeDatabase.xls. A list of the names used for each individual
data table within a file can be found in the Appendix to this Excel guide.
To work with a column of numbers from a data file, you may use its name in a formula, such as
“=AVERAGE(yield)” to place the average of a column of numbers named “yield” into a cell in
your worksheet. Alternatively, you may drag the mouse down the numbers in the data set to select
them if you wish.

Sorting to Put a Range in Order


When you want to put a column of data in order, smallest to largest or largest to smallest, simply
select your data, then choose Data/Sort from the main menu.
If you have a larger database with more than one variable measured on each elementary unit, be
sure to select the entire data set before sorting. Here is a small database:

To sort it by revenues, you may either start by selecting A6 through C9, or let Excel do it for you
when you choose Data/Sort from the main menu. Here is how it should look as you prepare to
sort by Revenues, with both columns of data selected along with the identifying labels.
When you choose OK, the cities are sorted in order by revenues, and their expenses have
correctly remained associated with them:
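
The reason for selecting the entire data set is that each row must travel as a unit. Here is a small Python sketch of that idea; the city names and numbers are invented for illustration and are not the database shown in this guide.

```python
# Hypothetical records: (city, revenues, expenses). Not the guide's data.
records = [
    ("Seattle", 120, 95),
    ("Boise", 80, 70),
    ("Portland", 150, 110),
    ("Spokane", 60, 65),
]

# Sorting whole rows by the revenues column keeps each city's
# expenses attached to it, like sorting the full selection in Excel.
by_revenue = sorted(records, key=lambda row: row[1])

# Sorting the revenues column on its own would break that link,
# which is the mistake to avoid.
revenues_only = sorted(row[1] for row in records)

for city, revenues, expenses in by_revenue:
    print(city, revenues, expenses)
```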

Making a Chart
Here is how to create a chart in Excel.
1. Select your data, either one column or multiple columns. In some cases you will want to
select the label at the top of the column for Excel to use.

2. Choose Insert/Chart from the main menu or click on the Chart Wizard icon on the
toolbar. The dialog box gives you many chart options:
Of particular interest in statistics are the XY (Scatter) chart, used for bivariate and multivariate
data, and the Line chart, used in time series analysis. Creating a histogram will require some
computation before the chart is created. Details on creating particular types of charts will
be covered as situations arise in this Excel Guide.
3. As you click on Next > to go through the sequence of dialog boxes, you will have the
option to add titles, as well as to add or take away gridlines or legends. If you choose to
put the chart back “As Object in” your worksheet, you will be able to move and size it
near the data it came from.
4. In addition, if you don’t like the gray background in a chart, double-click on it and set the
Patterns in the Area to None. To change the size of the chart, drag a sizing handle (these
appear in the corners and in the middle of the sides when you click just inside the edge of
the chart). To move the chart to a different place in the worksheet, drag just inside the
edge but not on a sizing handle. To add or change titles, right-click just inside the chart,
select Chart Options from the little pop-up menu, and choose the Titles tab. To change the
font size, right-click on the item (a title or an axis) and choose Format from the little pop-
up menu.

Using the Data Analysis ToolPak


Some statistical methods, such as regression and the analysis of variance, can be performed in
Excel by using the “Data Analysis ToolPak” which is part of Excel, but you may need to install it
before you can use it.
To find out if the Data Analysis ToolPak is installed on your system, look under the Tools menu
for Data Analysis. If you cannot find Data Analysis under Excel’s Tools menu, select Add-Ins
from the Tools menu and make sure the Analysis ToolPak is checked. If the Analysis ToolPak was
not installed when Excel was installed on your computer, you may need to install it from the Excel
CD-ROM.

Hints, Tips, and Troubleshooting


Here are some general comments that fall into the categories of hints, tips, and troubleshooting.
Experiment! Explore the menu system. Try things out to see how they work. And check your
work for reasonableness: don’t just believe it has to be correct because you did it on a computer.
Save your work often so that if the computer shuts off unexpectedly you will not be sad. If your
work is important, then keep more than one copy of it in more than one place.
Use the help system to learn more about Excel, either from the menu or by hitting the F1 key.
Personally, I find that Help/ContentsAndIndex/Index from the main menu is the most useful.
Be familiar with the Tools/Options menu choice, which gives you control over many worksheet
features. Here are some highlights:
1. With the View tab of Tools/Options, if something like a formula bar or scroll bar
disappears from your worksheet, you will be able to bring it back. If you want to get rid of
those gridlines, you can.
2. With the Calculation tab of Tools/Options, you can make sure that the worksheet is set to
calculate automatically. If calculation is set to Manual, then you may need to hit the F9
key to see correct up-to-date results.
3. With the Edit tab of Tools/Options, you can control whether the selection moves down, or
some other direction, or stays in the cell when you hit Enter. You can also ask Excel not to
guess what you mean when you start typing, by un-checking the box at “Enable
AutoComplete for cell values”.
4. With the General tab of Tools/Options, you can choose the default font and size.
To widen a column so that you can see all that is in a cell, select the cell and then use
Format/Column/AutoFitSelection from the main menu.
There are additional toolbars. In particular, the Drawing toolbar can be useful for placing arrows
and other drawing objects on the worksheet. To see the available toolbars, right-click in the open
area near the top of the window.
If you are not sure how to get Excel to do something, try right-clicking or double-clicking on the
object. The context-sensitive menu that appears when you right-click with the mouse can be very
helpful, by making suggestions that are appropriate to the object you are interested in. Try this on
a range or on part of a chart when you are not sure what to do. The Esc key makes this pop-up
menu go away if you decide not to use it.
16 Histograms Chapter 3

Histograms (Chapter 3)
Here is how to produce a histogram in Excel by first creating a column of bins to hold the
frequencies, then using Excel’s COUNTIF function to count how many data values fall into each
bin, and finally creating a bar chart of these frequencies with labels and connected bars.
You have two alternatives to these procedures while staying in Excel. First, with StatPad, creating
a histogram is quick and easy. Second, with the data analysis add-in (“Analysis ToolPak”),
creating a histogram requires more steps and the final result (after eliminating gaps between bars)
can be counterintuitive because a data value that falls on a bin boundary may be placed in the bin
to its left, instead of the bin to its right (so that, e.g., 60 would be counted as “50 to 60” instead
of “60 to 70”).

Example: Computer Ownership Rates (Histogram)


Consider the data for rates of computer ownership (Table 3.5.2 of Practical Business Statistics).
Here are the steps involved in creating a histogram:
1. Create a column of bin boundaries, in this case from 30% to 70% by 5% (a reasonable
choice because the data values range from 37.2% to 66.1%). To do this, you might type
“30%” in cell E271, hit Enter, then use Excel’s menu commands Edit/Fill/Series with
Series in Columns, Step value 5% and Stop value 70% as shown here. (Typing “30%” in the
cell is the same as typing “0.30” in the cell and then using Format/Cells/Number to
specify percentage format with two decimal places.)

2. Compute the counted frequencies using the COUNTIF function. Select the cell to the
right of the first bin boundary amount. We want the number of data values from 30% to
35% (remember that 35% is the same as 0.35 in Excel). Since 30% is in cell E271 and
35% is in cell E272, we can use the formula
=COUNTIF(computer_owners,"<"&E272)-COUNTIF(computer_owners,"<"&E271)

which has been carefully crafted in this form so that all counts can be found by copying
down the column, to the next-to-last cell (representing data values from 65% to 70%).
For this formula to work, the column of data must have a name such as
“computer_owners” here (if your data does not yet have a name, then select the numbers
in the data column and use Excel’s menu command Insert/Name/Define to give your data
a name). To copy and paste after typing the formula and hitting Enter, you may use the
menu command Edit/Copy, then select the cells of the column and then use Edit/Paste (or
just double-click the little fill handle at the lower right of the selected cell, then delete the
last one in the column). Here is the result so far:

3. Prepare for charting by selecting the bin boundaries and the counts, INCLUDING THE
BLANK TOP ROW, which will convince Excel to draw the bar chart correctly, using the
bin boundaries as the category axis. Here is how it should look as you select Insert/Chart
from the menu (or click on the Chart Wizard icon on the toolbar):
4. Use the standard Column Chart Type with first Sub-Type:


5. Click on Next > twice, then eliminate some unnecessary features. Delete the legend by
selecting the Legend tab and unselecting the “Show legend” checkbox, and eliminate
gridlines by selecting the Gridlines tab and unselecting anything checked there:

6. Click on Finish to place the chart in the worksheet.


7. Eliminate the gaps between the bars by right-clicking on a bar to bring up a little menu
from which you choose “Format Data Series”

8. Choose the Options tab, then decrease the Gap Width to 0 to make it into a true
histogram:
9. Click OK to complete this task. You now have a histogram in the worksheet!

10. Here are some optional steps. If you don’t like the gray background, double-click on it
and set the Patterns in the Area to None. Similarly, by double-clicking inside a bar, you
may change or eliminate the color. To change the size of the histogram, drag a sizing
handle (these appear in the corners and in the middle of the sides when you click just
inside the edge of the chart). To move the chart to a different place in the worksheet, drag
just inside the edge but not on a sizing handle. To add titles, right-click just inside the
chart, select Chart Options from the little pop-up menu, and choose the Titles tab. To
change the font size, right-click on the item (a title or an axis) and choose Format from the
little pop-up menu. To format the horizontal axis as percent, double click on the axis, then
choose Number and Percent. Here is one possible result:

[Figure: Histogram of Computer Ownership. Vertical axis: Number of States (0 to 20). Horizontal axis: Percent of Households (30% to 65% in steps of 5%).]
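
The counting idea behind the COUNTIF formula in step 2 (count everything below the upper boundary, then subtract everything below the lower boundary) can be tried outside Excel. This is a minimal Python sketch under stated assumptions: the function name bin_counts is invented, and the rates are made-up values in percent, not the data from Table 3.5.2.

```python
def bin_counts(values, boundaries):
    """Count the values falling in each bin from one boundary up to the next."""
    # Like COUNTIF(data, "<"&boundary): how many values lie below each boundary.
    below = [sum(1 for v in values if v < b) for b in boundaries]
    # Like subtracting the two COUNTIF results for neighbouring boundaries.
    return [below[i + 1] - below[i] for i in range(len(below) - 1)]

# Made-up ownership rates in percent; NOT the values from Table 3.5.2.
rates = [37.2, 41.5, 44.0, 48.3, 50.0, 52.7, 55.1, 60.0, 66.1]
boundaries = list(range(30, 75, 5))   # 30, 35, ..., 70

counts = bin_counts(rates, boundaries)
for low, high, count in zip(boundaries, boundaries[1:], counts):
    print(f"{low}% to {high}%: {count}")
```

Because the comparison is strictly "less than", a value sitting exactly on a boundary (60.0 here) is counted in the bin to its right, "60% to 65%", which is the convention this chapter prefers.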

Example: Assets of Commercial Banks (Transformation)


This example shows how you can transform a data set using logarithms. We use the Excel
function =LOG10( ) to find the base 10 logarithm of each data value, but you may use the natural
logarithm (base e) instead, with the =LN( ) function.
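
As a rough illustration of what the transformation does, here is a Python sketch using math.log10 and math.log, which play the roles of LOG10 and LN. The asset values are invented round numbers chosen to make the pattern obvious; they are not the data from Table 3.4.1.

```python
import math

# Invented asset values (in billions), each ten times the one before.
assets = [1.0, 10.0, 100.0, 1000.0]

log_assets = [math.log10(a) for a in assets]   # like =LOG10(cell)
ln_assets = [math.log(a) for a in assets]      # like =LN(cell)

# On the log scale, the widely spread values become evenly spaced.
for a, base10, natural in zip(assets, log_assets, ln_assets):
    print(a, round(base10, 3), round(natural, 3))
```

The two kinds of logarithm differ only by a constant multiple, so either one reduces skewness in the same way.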
Consider the assets, in billions, of commercial banks in the Fortune 1000 (Table 3.4.1 of Practical
Business Statistics). A histogram of these values, found using the methods explained earlier in this
chapter, is very skewed:
To compute the logarithms of the data values, begin by computing the logarithm of the first data
value. To do this, select the cell to its right, then use Excel’s Insert/Function menu command. You
will find the LOG10 function under the Math & Trig category:

Select OK to see the LOG10 dialog box, then click on the first data value (you may need to drag
the dialog box out of the way to see it) to tell Excel which number to take the logarithm of, as
follows (in this case, the number in cell E79, which you specify by clicking on it):
Select OK, then double-click on the fill handle to copy this formula down the column of data,
resulting in a new column containing the logarithms of the data (if you prefer, you may use
Edit/Copy and Edit/Paste instead):
Now give these logarithms a name, for example, logAssets, while they are still selected, by
choosing the Insert/Name/Define menu command and typing the name logAssets:
Now we are ready to construct the histogram of logAssets, using the methods explained earlier in
this chapter, but this time for the logAssets data. Here is the resulting histogram, which is much
less skewed than the original data:
26 Landmark Summaries Chapter 4

Landmark Summaries (Chapter 4)


Excel can quickly compute many statistical summaries and, with some effort, draw the related
graphs. In this chapter we consider the average, median, weighted average, five-number summary,
boxplot, and cumulative distribution function.

Example: How Many Defective Parts? (Average, Median)


This example shows how to use Excel to find the average, median, quartiles, and percentiles.
Consider the data for defective parts (from the example in Chapter 4 of Practical Business
Statistics).
If your data are not yet named, begin by giving a name (such as “Defects” here) to your column of
numbers by highlighting the numbers and then using Excel’s menu command Insert/Name/Define.
Next, select the cell where you want to put the average. You may either
1. type “=AVERAGE(Defects)” directly into the cell and hit Enter
or
2. select Average from the statistical functions listed under the menu command
Insert/Function, hit OK, and then either type “Defects” directly into the dialog box, or
drag the mouse down your column of numbers to tell Excel which data set to use. Then
select OK.
Either way, the result is the same. After selecting another cell to hold the median and repeating
these steps to find the median, the result (average is 5.1, median is 4.5) is as follows:
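
If you want to check an average and a median by another route, Python's standard statistics module offers the same two summaries. The numbers below are a made-up list for illustration only; they are not the Defects data from the textbook.

```python
import statistics

# Made-up counts of defective parts; NOT the textbook's Defects data.
defects = [1, 3, 4, 5, 7, 10]

average = statistics.mean(defects)     # like =AVERAGE(Defects)
median = statistics.median(defects)    # like =MEDIAN(Defects)

print("average:", average)
print("median:", median)
```

With an even number of values, the median is the average of the two middle values (4 and 5 here), just as in Excel.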

Example: Your Grade Point Average (Weighted Average)


This example shows how to compute a weighted average, given two columns of numbers: one
with values and the other with the weights. Consider the data on grades (from the example in
Chapter 4 of Practical Business Statistics). A grade point average is the weighted average grade
where credits define the weights.
Be sure each column of numbers has a name (select the column of numbers and use Excel’s
Insert/Name/Define menu command if needed). The weighted average can then be computed
using the expression “=SUMPRODUCT(Credits, Grade)/SUM(Credits)”. The SUMPRODUCT
function multiplies credits by grade for each course and adds them up, while the SUM function
finds the total credits. Remember always to divide by the sum of the weights (in this example, the
credits). The result here is a grade point average of 3.45:
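
The arithmetic behind SUMPRODUCT divided by SUM can be spelled out in a few lines. In this Python sketch the credits and grades are invented for illustration (they are not the textbook's example, so the answer differs from the 3.45 above).

```python
# Invented course data; NOT the grades from the textbook example.
credits = [5, 3, 2]
grades = [4.0, 3.0, 3.5]

# Like SUMPRODUCT(Credits, Grade): multiply pair by pair, then add up.
sum_product = sum(c * g for c, g in zip(credits, grades))

# Like SUM(Credits): the total of the weights.
total_credits = sum(credits)

grade_point_average = sum_product / total_credits
print(grade_point_average)
```

Here the weighted total is 20 + 9 + 7 = 36 grade points over 10 credits, giving 3.6. A plain average of the three grades would be 3.5, because it ignores the fact that the 4.0 was earned in the largest course.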

Example: How Many Defective Parts? (Quartiles, 5-Number Summary, Percentiles)
To find the quartiles, recall that the rank of the lower quartile is (1 + int[(1 + n)/2])/2. You can find n,
the number of data values, by using Excel’s COUNT function. To convince Excel to find the data
value at this rank (and to average two data values if the rank includes a fractional part), we can
use Excel’s PERCENTILE function, with a few modifications, as shown below. To find the upper
quartile, the formula changes only slightly. You can use these formulas to find the quartiles of any
data set by substituting the data set name in place of “Defects”. Here are the results for the
Defects data:
The 5-number summary consists of the smallest, lower quartile, median, upper quartile, and
largest. You can use Excel’s MIN and MAX functions to find the smallest and largest. Here is the
5-number summary:
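
The rank-based approach can be sketched in Python as follows. This is an illustration under assumptions, not the worksheet formula itself: the function names are invented, the data are made-up numbers, and the upper quartile is taken at the mirror-image rank n + 1 minus the lower quartile's rank, which is assumed here because the text above says only that the formula changes slightly.

```python
def value_at_rank(sorted_values, rank):
    """Data value at a rank (1 = smallest); average two neighbours for a .5 rank."""
    whole = int(rank)
    if rank == whole:
        return sorted_values[whole - 1]
    return (sorted_values[whole - 1] + sorted_values[whole]) / 2

def five_number_summary(values):
    """Smallest, lower quartile, median, upper quartile, largest."""
    data = sorted(values)
    n = len(data)
    median_rank = (1 + n) / 2
    lower_rank = (1 + int((1 + n) / 2)) / 2
    upper_rank = n + 1 - lower_rank          # assumed mirror-image rank
    return (
        data[0],
        value_at_rank(data, lower_rank),
        value_at_rank(data, median_rank),
        value_at_rank(data, upper_rank),
        data[-1],
    )

# Made-up data; NOT the textbook's Defects data.
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```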

To find a percentile when you have the percentage, you may use Excel’s PERCENTILE function,
which needs to know the data set and the percentage. Here is the 85th percentile for the Defects
data:

Given a number (not necessarily a data value, but in the same units as the data values) you may
use Excel’s PERCENTRANK function to find the percentage that tells what percentile it is. This
example shows that 11 is the 94th percentile. That is, about 94% of the data values are smaller than
11. To get the number 0.944 to show as 94.4%, you may select the cell and format it as a
percentage (using the menu command Format/Cells/Number/Percentage).
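
To see how a percentile can land between two data values, here is a Python sketch that assumes PERCENTILE interpolates in a straight line between neighbouring sorted values, with 0% at the smallest value and 100% at the largest. The function name and the scores are invented for illustration; check Excel's own Help for the exact definition it uses.

```python
def percentile_by_interpolation(values, fraction):
    """Value a given fraction (0 to 1) of the way through the sorted data."""
    data = sorted(values)
    position = fraction * (len(data) - 1)    # zero-based, may be fractional
    lower = int(position)
    if lower >= len(data) - 1:
        return data[-1]                      # fraction of 1 gives the largest
    weight = position - lower
    return data[lower] + weight * (data[lower + 1] - data[lower])

# Made-up scores; NOT the textbook's Defects data.
scores = [10, 20, 30, 40, 50]

p85 = percentile_by_interpolation(scores, 0.85)
p50 = percentile_by_interpolation(scores, 0.50)
print(round(p85, 6), p50)
```

With five values, the 85th percentile falls four tenths of the way from 40 to 50, which is why the result is about 44 even though 44 is not one of the data values.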
Example: CEO Compensation (Boxplot)


This example shows how to draw a box plot once you have the 5-number summary. Drawing it
involves a particular arrangement of the five numbers in a table. A simpler alternative is to use
StatPad. Consider a data set of CEO compensation, with five-number summary 100,000,
1,000,000, 1,497,500, 2,101,000, and 7,730,000. Here are the steps involved in creating a box
plot:
1. Arrange the 5-number summary exactly as follows, repeating some summaries and leaving
a space before the median as shown here:

2. To the left of these numbers, type in the numbers 1, 2, 3 in exactly the following sequence.
This will tell Excel how to draw the lines to create the box plot (the number 2 is in the
middle, while 1 will place it to the left and 3 to the right).
3. Select both columns of numbers all the way down (including the blank line) and choose
Insert/Chart from the menu as follows:
4. Choose “XY (Scatter)” as the Chart Type, and choose “Scatter with data points connected
by lines without markers” as the Chart sub-type, as follows:

5. Click Next > twice, then eliminate some unnecessary features. Delete the X Axis by
selecting the Axes tab and unselecting the “Value (X) Axis” checkbox. Delete the legend
by selecting the Legend tab and unselecting the “Show legend” checkbox, and eliminate
gridlines by selecting the Gridlines tab and unselecting anything checked there. You may
also add titles by clicking on the Titles tab:

6. Click on Finish to place the chart in the worksheet. The chart is selected so you see the
sizing handles around it and the data it was made from.

7. Drag the sizing handles to make it larger. In addition, if you don’t like the gray
background, double-click on it and set the Patterns in the Area to None. To move the
chart to a different place in the worksheet, drag just inside the edge but not on a sizing
handle. To add or change titles, right-click just inside the chart, select Chart Options from
the little pop-up menu, and choose the Titles tab. To change the font size, right-click on
the item (a title or an axis) and choose Format from the little pop-up menu. Here is the
result:
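The idea behind steps 1 and 2 (one connected path of points, with a blank row before the median) can be sketched in Python. The exact table from the guide appears only in a screenshot, so the ordering below is an assumption: it is one arrangement that traces the whiskers and the box with a single line and then draws the median separately. The function name `boxplot_path` is hypothetical.

```python
def boxplot_path(minimum, q1, median, q3, maximum):
    """Return (x, y) rows for a connected-line scatter chart.

    x is 1 (left), 2 (middle) or 3 (right); None stands for the blank row
    that breaks the line before the median is drawn.
    """
    return [
        (2, minimum), (2, q1),      # lower whisker, drawn in the middle
        (1, q1), (1, q3),           # bottom-left edge, then left side of the box
        (2, q3), (2, maximum),      # top-left edge, then upper whisker
        (2, q3), (3, q3),           # back down the whisker, then top-right edge
        (3, q1), (2, q1),           # right side of the box, then bottom-right edge
        None,                       # blank row: start a new line segment
        (1, median), (3, median),   # median line across the box
    ]


# Five-number summary of the CEO compensation data quoted above.
rows = boxplot_path(100_000, 1_000_000, 1_497_500, 2_101_000, 7_730_000)
for row in rows:
    print(row)
```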

Example: Defects Data (Cumulative Distribution Function)


This example shows how to draw a cumulative distribution function (CDF), which involves
arranging two copies of the data set together with the percentages in a table. A simpler alternative
is to use StatPad. Consider the data for defective parts in production (from an example in Chapter
4 of Practical Business Statistics). Here are the steps involved in creating the CDF:
1. Select all of the numbers in the data column and get ready to make copies of them using
Edit/Copy from the main menu. One quick way to select the numbers is to click on the
first number, then hit the End key, and then hold the Shift key while you hit the down
arrow ↓ .

2. Click on a wide-open area of the worksheet with room for two columns not touching any
other data in your worksheet. Paste the data once (using Edit/Paste from the main menu),
then select the empty cell under the last data value (one quick way is to hit End, ↓ , and
↓ ) and paste it again. Here is how it looks after pasting once, just before the second
pasting:

3. Now sort this double data set as follows. First, select any single data value within the
column (Excel should sort the entire column). Then choose Data/Sort from Excel’s main
menu and select OK from the dialog box. You will then have two copies, sorted. Here is
the worksheet just before sorting:

4. Create the column of percentages. Place the number 0 in the empty cell just to the right of
the top cell of your sorted double data set by typing 0, Enter. Just below it, type the
formula “=1/COUNT(Defects)” where you would substitute your data set name for
“Defects” here. Just below that, type the = key, click on the cell with the 0 you just
entered, then type “+1/COUNT(Defects)”, substituting your data set name for “Defects”
and hit Enter. Finally, double-click the fill handle to complete the column (or copy this cell
to the cells under it to fill out the column). Here is the result just before double-clicking on
the fill handle - note that the cell P10 is where the zero was entered.

5. Select both columns of numbers and choose Insert/Chart from the menu as follows:

6. Choose “XY (Scatter)” as the Chart Type, and choose “Scatter with data points connected
by lines without markers” as the Chart sub-type, as follows:

7. Click Next > twice, then eliminate some unnecessary features. Delete the legend by
selecting the Legend tab and unselecting the “Show legend” checkbox, and eliminate
gridlines by selecting the Gridlines tab and unselecting anything checked there. You may
also add titles by clicking on the Titles tab:

8. Click on Finish to place the chart in the worksheet. The chart is selected so you see the
sizing handles around it and the data it was made from.

9. Drag the sizing handles to make it larger. Then double-click on the Cumulative Percent
axis (or on any number on this Y axis), select the Number tab, choose Percentage with 0
Decimal places as follows:

10. In addition, if you don’t like the gray background, double-click on it and set the Patterns
in the Area to None. To move the chart to a different place in the worksheet, drag just
inside the edge but not on a sizing handle. To add or change titles, right-click just inside
the chart, select Chart Options from the little pop-up menu, and choose the Titles tab. To
change the font size, right-click on the item (a title or an axis) and choose Format from the
little pop-up menu. Here is the result:
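The table built in steps 1 through 4 (two sorted copies of the data next to a column of percentages that runs 0, 1/n, 1/n, 2/n, 2/n, and so on) can be sketched in Python as follows. This is only an illustration of the arrangement; the function name `cdf_points` and the tiny data set are hypothetical.

```python
def cdf_points(data):
    """Pair each value of the doubled, sorted data with its cumulative percentage."""
    n = len(data)
    doubled = sorted(list(data) + list(data))   # steps 1 to 3: paste twice, then sort
    percents = [0.0, 1.0 / n]                   # step 4: first two cells
    for i in range(2, 2 * n):
        # each later cell is the cell two rows above it, plus 1/n
        percents.append(percents[i - 2] + 1.0 / n)
    return list(zip(doubled, percents))


# Hypothetical data, not the Defects data.
points = cdf_points([3, 1, 2])
for value, percent in points:
    print(value, round(percent, 4))
```

Plotting these points with connected lines gives the staircase shape of the cumulative distribution function: each data value appears twice, once at the bottom and once at the top of its step.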

Variability (Chapter 5)
Excel can quickly compute the basic variability measures. In this chapter we consider the standard
deviation, the range, the coefficient of variation, and the variance.

Example: The Advertising Budget (Standard Deviation, Range, Coefficient of Variation, Variance)

This example shows how to find four measures of variability: the standard deviation, range,
coefficient of variation, and variance. Consider the data for the advertising budget of firms within
an industry group (from the example in Chapter 5 of Practical Business Statistics). For these
formulas to work, the column of data should have a name such as “Budget” here (if your data
does not yet have a name, then select the numbers in the data column and use Excel’s menu
command Insert/Name/Define to give your data a name).
Use Excel’s STDEV function to find the sample standard deviation. To find the range, subtract
the smallest from the largest using Excel’s MIN and MAX functions. To find the coefficient of
variation, recall that we divide the standard deviation (STDEV function) by the average
(AVERAGE function). Finally, to find the variance, use Excel’s VAR function. Here are the
results:

If you need the population standard deviation instead of the sample standard deviation, you may
use the function STDEVP instead of STDEV.
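For comparison, the same measures can be computed with Python's standard `statistics` module, as in this minimal sketch. The budget figures below are hypothetical stand-ins (they are not the textbook data), and the variable names are assumptions chosen for illustration.

```python
import statistics

# Hypothetical advertising budgets, NOT the data from the textbook.
budget = [12.5, 8.0, 15.2, 9.7, 11.1]

std_dev = statistics.stdev(budget)              # sample standard deviation, like STDEV
data_range = max(budget) - min(budget)          # largest minus smallest, like MAX - MIN
coeff_var = std_dev / statistics.mean(budget)   # standard deviation divided by average
variance = statistics.variance(budget)          # sample variance, like VAR
pop_std_dev = statistics.pstdev(budget)         # population version, like STDEVP

print(std_dev, data_range, coeff_var, variance, pop_std_dev)
```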

Probability (Chapter 6)
Most of the probability chapter requires thinking, and perhaps a calculator, to get the answers. Of
course you can use Excel to do your arithmetic for you - just select a cell, hit the = key, type an
expression such as (0.1+0.3)*0.4, and hit Enter to see the answer. Excel can also be used to
demonstrate the law of large numbers, to show you how the (random) relative frequency of an
event becomes closer to the probability as the number of trials grows larger.

Example: The Law of Large Numbers


Suppose an event has probability 0.4. There is nothing random about this number. The
randomness is in whether the event happens or doesn't each time you run the random experiment.
If you run it 10 times, the event might happen exactly 4 times, but it also might happen twice, 6
times, or just once. In this example you will see how the relative frequencies, while being random,
get closer to the probability as n increases.
1. Start with a new worksheet (File/New) and then type “Probability” in cell A2 and 0.4 in
cell B2.
2. With cell B2 still selected, use Insert/Name/Define from the main menu to name it
“Probability”.
3. In cell A8, type the formula “=IF(RAND()<Probability,1,0)” and hit Enter.

4. Hit the F9 key (called the “Recalculation key”). Each time you do, a new random number
RAND() will be compared to the Probability: if it is smaller, then 1 is displayed and the
event “happens”, otherwise you will see 0. Hit F9 over and over to get a sense of how a
random event with probability 0.4 might occur. If you wish, select cell B2 and type in a
different probability number, hit Enter, then recalculate over and over again with F9. Try it
with probability 0.1 and 0.9 and others if you wish.

5. Now select cell A8 and choose Edit/Copy from the main menu. Next, click once with the
mouse on cell A9. To select lots of cells from A9 on down, hold the Shift key while you hit
Pg Dn over and over. When you have selected a few hundred or a few thousand cells,
choose Edit/Paste from the main menu. You now have repeated the random experiment
many times, once in each cell starting with A8:

6. Compute the relative frequencies as follows: enter the formula
“=SUM($A$8:A8)/COUNT($A$8:A8)” into cell B8, being careful about the $ signs
(they keep the start of the range fixed at A8 while the end of the range moves down with
each row), which will help Excel when you copy the formula down the column by
double-clicking on the fill handle, as shown here:

7. Hit the F9 key to see how the relative frequencies might change. Here is one possibility:
note that the relative frequencies are 0 for the first two trials because the event didn’t
happen yet. After 3 trials, the relative frequency is 1 out of 3, or 0.333333. After 4 trials it
drops to 1 out of 4, or 0.25, and so forth:

8. To create a graph, first select the column of relative frequencies. This might be done by
selecting cell B8, hitting End, then holding down Shift while you hit the down arrow ↓.
Then choose Insert/Chart from the main menu and choose a Line Chart with the first
Chart sub-type:

9. Click Next > twice, then delete the legend by selecting the Legend tab and unselecting the
“Show legend” checkbox. You may also add titles by clicking on the Titles tab:

10. Click on Finish to place the chart in the worksheet, and resize it with the sizing handles.
Note how the graph of the relative frequencies hovers fairly near to the probability of 0.4.
Hit the recalculation key (F9) a few times to see how else it might have come out, with
different randomness each time.

11. You can see what relative frequencies look like with different probabilities. Here is how
they might look if you change Probability to 0.9:
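The worksheet above can be mimicked with a short Python simulation. This is a sketch under stated assumptions, not a copy of the spreadsheet: the function name `relative_frequencies` and the `seed` argument (which makes a run repeatable, unlike hitting F9) are hypothetical.

```python
import random


def relative_frequencies(probability, trials, seed=None):
    """Run the random experiment `trials` times and track the running relative frequency."""
    rng = random.Random(seed)
    happened = 0
    frequencies = []
    for n in range(1, trials + 1):
        if rng.random() < probability:    # same test as IF(RAND()<Probability,1,0)
            happened += 1
        frequencies.append(happened / n)  # same idea as SUM($A$8:A8)/COUNT($A$8:A8)
    return frequencies


freqs = relative_frequencies(0.4, 10_000, seed=1)
print(freqs[9], freqs[99], freqs[999], freqs[-1])
```

Printing the relative frequency after 10, 100, 1,000, and 10,000 trials lets you see it settle down near the probability as the number of trials grows.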

Random Variables (Chapter 7)


Excel can be used to perform, or help with, many of the basic calculations involving random
variables. This chapter will cover discrete random variables (mean and standard deviation),
binomial probabilities, normal probabilities, Poisson probabilities, and exponential probabilities.

Example: Profit and Economic Scenarios (Mean and Standard Deviation of a Discrete Distribution)

This example shows how to use Excel to help with the calculation of the mean and standard
deviation of a discrete distribution. Consider the profits example (from the example in Chapter 7
of Practical Business Statistics). For these formulas to work, each column of data should have a
name such as “Profit” and “Probability” here (if your data does not yet have a name, then select
the numbers in a data column and use Excel’s menu command Insert/Name/Define to give your
data a name). The mean, 3.65, is the sum of the products of value times probability, hence the
formula is “=SUMPRODUCT(Profit,Probability)”. Give this cell (which now contains the mean)
the name “Mean”. The standard deviation, 4.40, is the square root (SQRT function) of the sum of
the products of the square of value minus mean times probability, hence the formula is
“=SQRT(SUMPRODUCT((Profit-Mean)^2,Probability))”. These formulas give us 3.65 for the
mean and 4.40 for the standard deviation:
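Here is the same sum-of-products calculation as a minimal Python sketch. The profit values and probabilities below are illustrative figures chosen because they reproduce the mean of 3.65 and the standard deviation of 4.40 quoted above; please check them against the table in the textbook before relying on them.

```python
import math

# Illustrative values (an assumption): they reproduce the quoted 3.65 and 4.40.
profit = [10, 5, 1, -4]
probability = [0.20, 0.40, 0.25, 0.15]

# Mean: sum of value times probability, like SUMPRODUCT(Profit,Probability).
mean = sum(x * p for x, p in zip(profit, probability))

# Standard deviation: square root of the sum of squared deviations times probability.
std_dev = math.sqrt(sum((x - mean) ** 2 * p for x, p in zip(profit, probability)))

print(round(mean, 2), round(std_dev, 2))
```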

Example: How Many of Five Possibilities Will Succeed? (Binomial Probabilities)

This example shows how to find probabilities for the binomial distribution, given the number of
trials n and the probability π for each one. Consider the example in which n = 5 and π = 0.8. That
is, you have 5 independent possibilities and each one has probability 0.8 of success.

To use Excel to compute binomial probabilities, use the formula “=BINOMDIST(a,n,π,FALSE)”
to find the probability P(X=a) of being equal to a, and use the formula
“=BINOMDIST(a,n,π,TRUE)” to find the probability P(X≤a) of being less than or equal to a.
The “FALSE” and “TRUE” in Excel’s binomial distribution formula refer to whether the
probability is cumulative or not.
Here are some results. The probability that exactly 3 succeed is 0.2048, the probability that 3 or
fewer succeed is 0.2627, and the probability that 3 or more succeed is 0.9421 (evaluated as “not 2
or less”):
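These three probabilities can be checked with a few lines of Python. This is a sketch using the standard binomial formula; the function names `binom_pmf` and `binom_cdf` are hypothetical stand-ins for the FALSE and TRUE forms of BINOMDIST.

```python
from math import comb


def binom_pmf(a, n, pi):
    """P(X = a): probability of exactly a successes in n trials."""
    return comb(n, a) * pi ** a * (1 - pi) ** (n - a)


def binom_cdf(a, n, pi):
    """P(X <= a): add up the probabilities of 0, 1, ..., a successes."""
    return sum(binom_pmf(k, n, pi) for k in range(a + 1))


exactly_3 = binom_pmf(3, 5, 0.8)
at_most_3 = binom_cdf(3, 5, 0.8)
at_least_3 = 1 - binom_cdf(2, 5, 0.8)   # "not 2 or less"

print(round(exactly_3, 4), round(at_most_3, 4), round(at_least_3, 4))
```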

Example: Standard Normal Probabilities


This example shows how to find probabilities for the standard normal distribution in Excel. The
standard normal probability table (Chapter 7 of Practical Business Statistics) gives the probability
that a normal distribution with mean 0 and standard deviation 1 will be less than a given value.
For example, the probability that a standard normal is less than 1.38 is 0.9162. These may easily
be found using Excel’s NORMSDIST function as follows:
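If you want to confirm a table value outside Excel, Python's standard `statistics.NormalDist` class gives the same kind of "less than" probability. This is a minimal sketch; the variable name is an assumption.

```python
from statistics import NormalDist

# Probability that a standard normal (mean 0, standard deviation 1) is less than 1.38.
p_less = NormalDist(mu=0, sigma=1).cdf(1.38)
print(round(p_less, 4))
```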

Example: Sales Forecasting (Normal Probabilities)


This example shows how you can solve normal probability problems without standardizing the
numbers. Because you tell Excel the mean and standard deviation, you can ask about probabilities
concerning the original numbers (no need to subtract the mean and divide by the standard
deviation; Excel’s NORMDIST function will do that for you).
Consider the sales forecasting example (from Chapter 7 of Practical Business Statistics). Sales
are forecast as having a mean of $20 million and a standard deviation of $3 million. Here you find
the probability (0.0478) that sales will be less than $15 million, as well as three other probabilities:
To use Excel to compute these probabilities, we use the function
“NORMDIST(value,mean,standardDeviation,TRUE)” to find the probability that a normal
distribution with specified mean and standard deviation is less than some value. There is no need
to standardize because Excel will do this for you as part of the calculation. The first calculation is
straightforward because it is a probability of being less. The second calculation is one minus the
NORMDIST function because it is a probability of being greater. The third calculation is the
difference of two NORMDIST calculations because it is the probability of being between two
values. The fourth calculation is one minus the difference of two NORMDIST calculations
because it is the probability of NOT being between two values. Here are the results:
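The four kinds of calculation described above can be sketched in Python with `statistics.NormalDist`. Only the first probability (sales less than $15 million) uses a value stated in the text; the other cutoffs (24, 18, and 22) are hypothetical values chosen for illustration and may differ from those in the textbook example.

```python
from statistics import NormalDist

# Sales forecast: mean $20 million, standard deviation $3 million.
sales = NormalDist(mu=20, sigma=3)

p_below_15 = sales.cdf(15)                   # probability of being less
p_above_24 = 1 - sales.cdf(24)               # one minus: probability of being greater
p_between = sales.cdf(22) - sales.cdf(18)    # difference: probability of being between
p_not_between = 1 - p_between                # one minus the difference: NOT between

print(round(p_below_15, 4), p_above_24, p_between, p_not_between)
```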

Example: How Many Warranty Returns (Poisson Probabilities)


This example shows how to find probabilities for the Poisson distribution. Consider the warranty
returns example (from Chapter 7 of Practical Business Statistics) where you expect 1.3 of your
products to be returned, on average, each day for warranty repairs. Assuming a Poisson
distribution, the POISSON function can give you either the probability that a particular number
will be returned, or the cumulative probability that a particular number or less will be returned on
a particular day.
Here is how to use Excel’s function “POISSON(value,mean,FALSE)” to find the probability that
a Poisson random variable is exactly equal to some value, and how to use
“POISSON(value,mean,TRUE)” to find the probability that a Poisson random variable is less
than or equal to some value. The terms TRUE and FALSE in the function refer to whether the
probability is cumulative or not. Here are the results:
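A minimal Python sketch of the same two calculations follows. The function names are hypothetical stand-ins for the FALSE and TRUE forms of POISSON, and the particular values asked about (0 returns, and 2 or fewer returns) are chosen for illustration; they may differ from the ones in the worksheet.

```python
from math import exp, factorial


def poisson_pmf(value, mean):
    """Probability that a Poisson random variable is exactly equal to value."""
    return exp(-mean) * mean ** value / factorial(value)


def poisson_cdf(value, mean):
    """Probability that a Poisson random variable is less than or equal to value."""
    return sum(poisson_pmf(k, mean) for k in range(value + 1))


# Warranty returns: 1.3 expected per day.
p_none = poisson_pmf(0, 1.3)
p_two_or_fewer = poisson_cdf(2, 1.3)
print(p_none, p_two_or_fewer)
```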

Example: Customer Arrivals (Exponential Probabilities)


This example shows how to find probabilities for the exponential distribution. Consider the
customer arrivals example (from Chapter 7 of Practical Business Statistics) where customers
arrive independently at a constant mean rate of 40 per hour. The random variable is the waiting
time until the next customer arrives. The mean waiting time is 1.5 minutes, computed as 60
minutes per hour divided by 40 expected arrivals in that time.
Using Excel’s function EXPONDIST(value,1/mean,TRUE), you can find the probability that an
exponential random variable with a given mean is less than or equal to the given value. Note that
Excel’s EXPONDIST function uses 1/mean, not the mean itself. Here are two calculations, the
probability of waiting 5 minutes or less for the next customer, and the probability of waiting 2
minutes or less:
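The exponential calculation is short enough to check by hand or in Python. In this sketch the function name `expon_cdf` is hypothetical; like EXPONDIST, it works with the rate 1/mean internally.

```python
from math import exp


def expon_cdf(value, mean):
    """Probability that an exponential waiting time with the given mean is <= value."""
    rate = 1 / mean                 # EXPONDIST is given 1/mean, not the mean itself
    return 1 - exp(-rate * value)


mean_wait = 60 / 40                 # 1.5 minutes between customers, on average
p_within_5 = expon_cdf(5, mean_wait)
p_within_2 = expon_cdf(2, mean_wait)
print(p_within_5, p_within_2)
```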

Random Sampling (Chapter 8)


Excel can choose a random sample with or without replacement. The standard error of the
average may easily be found using Excel formulas.

Example: Choosing a Random Sample of 3 from a Population of 10


Here is how to use Excel to choose a random sample of size n = 3 from a population of size N =
10 by shuffling the population, using a column of random numbers placed next to the population
listing.
1. Create a column of frame numbers, in this case from 1 to 10. To do this quickly (even for
much larger N), you might type “1” in cell A3, hit Enter, then use Excel’s menu commands
Edit/Fill/Series with Series in Columns, Step value 1 and Stop value 10 as shown here:

2. Insert random numbers by typing “=RAND()” in cell B3, just to the right of the first frame
number, hit Enter, and then copy the result down the column to produce a column of
random numbers (this is quickly done by double-clicking the little fill handle at the lower
right corner of the selected cell B3).
3. To shuffle the population, first select both columns of numbers (the frame numbers and the
random numbers). For a large population, this is easily done by selecting the first frame
number (cell A3 here), holding Shift while you hit the right arrow → , hitting End, and
holding Shift while you hit the down arrow ↓ . Then use Data/Sort from Excel’s main
menu, being sure to sort by the random numbers.

4. After the columns are sorted randomly, you may take the first three frame numbers to
obtain your random sample, which results in selection of items 7, 10, and 2 in this
example.
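The shuffling method in steps 1 through 4 can be sketched in Python: attach a random number to each frame number, sort by the random numbers, and keep the first few. The function name `shuffle_sample` and the `seed` argument (for a repeatable run) are assumptions made for this illustration.

```python
import random


def shuffle_sample(population_size, sample_size, seed=None):
    """Choose a random sample without replacement by sorting on random numbers."""
    rng = random.Random(seed)
    frame = list(range(1, population_size + 1))        # step 1: frame numbers 1..N
    keyed = [(rng.random(), item) for item in frame]   # step 2: a random number beside each
    keyed.sort()                                       # step 3: sort by the random numbers
    return [item for _, item in keyed[:sample_size]]   # step 4: take the first n


sample = shuffle_sample(10, 3, seed=42)
print(sample)
```

Because each run uses different random numbers (unless you fix the seed), the selected items will generally differ from the items 7, 10, and 2 shown in the example.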

Example: Shopping Trips (Standard Error of the Average)


This example shows how to find the standard error of the average for a column of data, once you
have the standard deviation, by dividing it by the square root of n. Consider the shopping trips
example (from Chapter 8 of Practical Business Statistics). Suppose you put the standard
deviation, S = 8.63, into cell A15 and the sample size, n = 200, into cell A16. The standard error
of 0.610 may then be found using the formula “=A15/SQRT(A16)” as follows:

Alternatively, you can compute the standard error all at once with the formula
“=STDEV(rangeName)/SQRT(COUNT(rangeName))”, where “rangeName” is the name of your
data.
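Both versions of the calculation are easy to reproduce in Python, as in this minimal sketch. The function names are hypothetical; the first works from the summaries, and the second works directly from a column of data.

```python
import math
import statistics


def standard_error(std_dev, n):
    """Standard error of the average: standard deviation divided by square root of n."""
    return std_dev / math.sqrt(n)


def standard_error_of(data):
    """Same thing computed all at once, like STDEV(rangeName)/SQRT(COUNT(rangeName))."""
    return statistics.stdev(data) / math.sqrt(len(data))


se_trips = standard_error(8.63, 200)   # shopping trips example: S = 8.63, n = 200
print(round(se_trips, 3))
```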

Confidence Intervals (Chapter 9)


You can use Excel to compute confidence intervals for you, given a sample of data, at any
specified confidence level. Excel will even look up the t table value for you.

Example: Controlling the Average Thickness of Paper (Confidence Interval)

This example shows how to construct confidence intervals for a sample of data. Consider the
example of paper thickness (Table 9.1.2 of Practical Business Statistics).
Here is how to use Excel to find the confidence interval. First, if needed, give the data column a
name (such as “Thickness” here) by selecting the numbers and using Excel’s Insert/Name/Define
menu command. Next, use Excel’s AVERAGE, STDEV, and COUNT functions to compute the
average, the standard deviation, and the sample size respectively and name the cells so they can be
easily used. The 95% confidence interval formula is then computed as average plus or minus t
times the standard error, where we use Excel’s TINV function to find the t value. Excel’s TINV
function is shown using “1−0.95” because it needs “one minus the confidence level” instead of the
confidence level itself. The term n−1 is used because TINV needs the number of degrees of
freedom.

To use a confidence level other than 95%, you need only change the 0.95 in the TINV
function. For example, for a 99% confidence interval, you would use 0.99 in place of 0.95.
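The arithmetic of the two-sided interval can be sketched in Python using the summaries quoted in this guide for the paper thickness data (average 0.0040147, standard deviation 0.0002614, sample size 15). Python's standard library has no counterpart to TINV, so this sketch assumes the two-sided 95% t table value for 14 degrees of freedom, 2.145, typed in by hand.

```python
import math

average, std_dev, n = 0.0040147, 0.0002614, 15

# Assumption: two-sided 95% t table value for n - 1 = 14 degrees of freedom,
# which is the number that TINV(1-0.95, 14) looks up.
t_value = 2.145

standard_error = std_dev / math.sqrt(n)
lower = average - t_value * standard_error
upper = average + t_value * standard_error
print(lower, upper)
```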

Example: Controlling the Average Thickness of Paper (One-sided Confidence Intervals)

Here is how to find a one-sided confidence interval. Consider the example of paper thickness
(Table 9.1.2 of Practical Business Statistics).
In order to find a one-sided 95% confidence interval, the t table value changes to
TINV(2*(1-0.95),n-1), placing all the probability of error on one side because the other side extends
indefinitely without chance of error. To claim that the population mean paper thickness is at least
a certain value, the appropriate calculation is average minus t times standard error (so that the
one-sided interval from here to all higher values includes the average). Using the average value of
0.0040147, the standard deviation of 0.0002614, and the sample size of 15, we have:

To use a confidence level other than 95%, you need only change the 0.95 in the TINV
function. For example, for a one-sided 99% confidence interval, you would use 0.99 in place of 0.95.
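Here is the one-sided calculation as a minimal Python sketch. It assumes the one-sided 95% t table value for 14 degrees of freedom, 1.761, entered by hand (this is the number that TINV(2*(1-0.95),14) looks up); the variable names are hypothetical.

```python
import math

average, std_dev, n = 0.0040147, 0.0002614, 15

# Assumption: one-sided 95% t table value for 14 degrees of freedom.
t_one_sided = 1.761

standard_error = std_dev / math.sqrt(n)

# "At least" claim: average minus t times the standard error.
lower_bound = average - t_one_sided * standard_error
print(lower_bound)
```

Note that the one-sided bound is closer to the average than the lower end of the two-sided interval, because the one-sided t value is smaller.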

Hypothesis Testing (Chapter 10)


Excel can help you perform hypothesis tests for various situations involving population means for
which univariate data are available: one- and two-sided tests, various test levels, and two-sample
problems (both paired and unpaired).
If you are using the confidence interval approach to hypothesis testing (for example, deciding a
two-sided test by seeing whether the reference value is in the interval), please use the confidence
intervals explained earlier for Chapter 9.
Instead of having you specify the test level (for example 5%), Excel can give you the p-value (as
well as the t value and basic summaries). You may then complete the test at any level by
comparing the computed p-value to the test level. For example, if the reported p-value is less than
5%, the test is significant at the 5% level (otherwise it is not significant). You may wish to review
the discussion of p-values in Chapter 10 of Practical Business Statistics.

Example: Controlling Paper Thickness (the t Test: Computing the t Statistic and Finding the p-Value)

This example shows how to test a population mean against a known reference value based on a
random sample from the population. Consider the data on paper thickness (Table 9.1.2 of
Practical Business Statistics), to be tested against the reference value µ0 = 0.00385. If your data
are not yet named, please select your column of numbers and use Excel’s menu command
Insert/Name/Define. To find the t statistic, we subtract the reference value, 0.00385, from the
average and then divide by the standard error (which is standard deviation divided by square root
of n). To find the p-value, we use the Excel formula =TDIST(ABS(t),n-1,2) where t is the
computed t statistic and n is the sample size (the “2” tells Excel to find a 2-sided p-value). Here,
then, are the results of an ordinary two-sided test for this example:
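The t statistic and two-sided p-value can be approximated in Python without any add-on libraries. This sketch works from the rounded summaries quoted in this guide (average 0.0040147, standard deviation 0.0002614, n = 15), so its t statistic differs slightly from the 2.4395561 that Excel computes from the full data. The p-value is found by numerically integrating the t density with Simpson's rule, which is a substitute for Excel's TDIST rather than the method Excel uses; the function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """P(|T| > |t|): one minus twice the area under the density from 0 to |t|."""
    t = abs(t)
    h = t / steps                                  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


average, std_dev, n, reference = 0.0040147, 0.0002614, 15, 0.00385
standard_error = std_dev / math.sqrt(n)
t_stat = (average - reference) / standard_error
p_value = two_sided_p(t_stat, n - 1)               # plays the role of TDIST(ABS(t),n-1,2)
print(t_stat, p_value)
```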

Example: Controlling Paper Thickness (One-sided t Test)


For a one-sided test, the t statistic and sample size n both stay the same as before, but the p-value
must be computed differently. These calculations are different depending on the side being tested.
First, consider the case of a one-sided test to see if the sample average is significantly larger than
the reference value (that is, the research hypothesis claims that the population mean is larger than
the reference value). In this case, the p-value is either =TDIST(ABS(t),n-1,1) or
=1-TDIST(ABS(t),n-1,1), depending on whether t is positive or negative, respectively. Using the t
statistic of 2.4395561 and sample size n = 15 for the paper thickness example, the one-sided p-
value is 0.0143, found as follows:

Next, consider the case of a one-sided test to see if the sample average is significantly smaller than
the reference value (that is, the research hypothesis claims that the population mean is smaller
than the reference value). In this case, the p-value is either =TDIST(ABS(t),n-1,1) or
=1-TDIST(ABS(t),n-1,1), depending on whether t is negative or positive, respectively. Using the t
statistic of 2.4395561 and sample size n = 15 for the paper thickness example, the one-sided p-
value is 0.9857, found as follows:

This example has been used to illustrate the calculations. Note that, in real life, you would not
compute both of these tests (significantly greater, significantly smaller) on the same data set
because you would have to choose the side you wished to test before performing the test.
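The rule for choosing between the two formulas can be written as a small Python function. In this sketch, `tail_prob` stands for the result of TDIST(ABS(t),n-1,1), which is taken here from the value 0.0143 reported above rather than recomputed; the function name and the labels "larger" and "smaller" are hypothetical.

```python
def one_sided_p(t, tail_prob, alternative):
    """One-sided p-value from the one-tail probability beyond |t|.

    alternative is "larger" if the research hypothesis claims the population
    mean is larger than the reference value, or "smaller" if it claims smaller.
    """
    if alternative == "larger":
        return tail_prob if t > 0 else 1 - tail_prob
    if alternative == "smaller":
        return tail_prob if t < 0 else 1 - tail_prob
    raise ValueError("alternative must be 'larger' or 'smaller'")


# Paper thickness example: t = 2.4395561, TDIST(ABS(t),14,1) = 0.0143.
p_larger = one_sided_p(2.4395561, 0.0143, "larger")
p_smaller = one_sided_p(2.4395561, 0.0143, "smaller")
print(p_larger, p_smaller)
```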

Example: Reactions to Advertising (Paired t Test)


This example shows how to perform a paired t test to see whether two paired columns of data are
significantly different or not, on average. This test begins by subtracting the two columns (which
is permitted because the situation is paired) and then testing these differences against the reference
value 0. Consider the data on reactions to advertising (Table 10.6.1 of Practical Business
Statistics).
The differences are calculated by using Excel’s arithmetic formulas. In this case, the formula
=D10-C10 was entered into cell E10 to compute After - Before for the first person. This formula
was then copied down the column (either using copy and paste from the main menu, or simply
double-clicking the little fill handle at the lower right corner of the selected cell E10). The two-
sample paired t test then becomes an ordinary one-sample t test of the differences, using the
reference value 0. The result is p = 0.03, and since p < 0.05, we conclude that there is a significant
difference between the Before and the After scores. Here are the calculations:
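The first part of the paired test (forming the differences and computing the t statistic against the reference value 0) can be sketched in Python. The Before and After scores below are hypothetical, not the data of Table 10.6.1, so the t statistic here will not match the textbook result.

```python
import math
import statistics

# Hypothetical paired scores, NOT the textbook data.
before = [3, 5, 4, 6, 2, 5]
after = [5, 6, 4, 8, 3, 7]

# After - Before for each person, like the formula =D10-C10 copied down the column.
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_diff = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(n)

# Ordinary one-sample t statistic of the differences against the reference value 0.
t_stat = (mean_diff - 0) / standard_error
print(differences, t_stat)
```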

Example: Gender Discrimination and Salaries (Two-Sample Unpaired t Test)

This example shows how to perform a two-sample t test for the small-sample situation (see the
two formulas for the standard error of the difference in Chapter 10 of Practical Business
Statistics).
Consider the data on gender discrimination and salaries (Table 10.6.4 of Practical Business
Statistics). The hypothesis test (to see if the average salaries of men and women are
different from one another) starts with the basic summaries: the average of each group, the
standard deviation of each group, and the sample size of each group. Each of these summaries is
given a name to make it easy to use (by selecting the cell and using the menu command
Insert/Name/Define).

Then you can find the standard error of the average difference, the t statistic, and the p-value from
these summaries. The conclusion is that there is a very highly significant difference between men's
and women's salaries (p < 0.001). Here are the Excel results:
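As a rough guide to the calculation, here is a Python sketch of the standard error of the difference and the t statistic. It assumes that the two formulas referred to above are the usual pooled (small-sample) and unpooled (large-sample) versions; please check them against Chapter 10 of the textbook. The summary values are hypothetical, not those of Table 10.6.4.

```python
import math


def se_difference_small(s1, n1, s2, n2):
    """Pooled standard error of the difference (assumed small-sample formula)."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def se_difference_large(s1, n1, s2, n2):
    """Unpooled standard error of the difference (assumed large-sample formula)."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Hypothetical summaries: average, standard deviation, sample size for each group.
avg_men, s_men, n_men = 52_000, 6_000, 12
avg_women, s_women, n_women = 45_000, 5_000, 10

se = se_difference_small(s_men, n_men, s_women, n_women)
t_stat = (avg_men - avg_women) / se
degrees_of_freedom = n_men + n_women - 2
print(se, t_stat, degrees_of_freedom)
```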

Correlation and Regression (Chapter 11)
Excel provides assorted methods for the analysis of bivariate data: correlation, plotting, and
regression analysis.

Example: Contacts and Sales (Correlation)


This example shows how to find the correlation in Excel by using the CORREL function after
naming your two columns of numbers (for example, by selecting a column of numbers and using
the Insert/Name/Define menu command to name it). Here is how to find the correlation of 0.985
between contacts and sales:
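For readers who want to see what CORREL computes, here is a minimal Python sketch of the correlation coefficient. The function name and the small data set are hypothetical (they are not the contacts and sales data, so the result is not 0.985).

```python
import math


def correlation(xs, ys):
    """Correlation coefficient between two equally long columns of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for illustration only.
contacts = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
r = correlation(contacts, sales)
print(round(r, 4))
```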

Example: Internet Usage Ratings (Plotting the Data)


This example shows how to use Excel to create a scatterplot for a bivariate data set. Consider the
data on Internet usage ratings (Table 11.1.3 of Practical Business Statistics). It is easiest if the
two columns are next to each other, with the X-axis data to the left of the Y-axis data. We will
create a scatterplot of Time (vertical) against Pages (horizontal).
1. Begin by selecting both columns of numbers (with the horizontal X axis data to the left).
2. Choose Insert/Chart from the main menu.

3. Choose XY (Scatter) from the list of chart types, and the first Chart sub-type (“Scatter.
Compares pairs of values”).
4. Continuing with Excel’s steps, you can create a scatterplot as an object in the worksheet.
Here is how the initial dialog box looks after you select the data and begin to insert a
chart, together with the finished chart in the worksheet.

5. In addition, if you don’t like the gray background in the chart, double-click on it and set
the Patterns in the Area to None. To eliminate the legend at the right in the chart, right-click
on it and choose Clear. To eliminate gridlines, right-click on one and choose Clear. To change the size
of the chart, drag a sizing handle (these appear in the corners and in the middle of the
sides when you click just inside the edge of the chart). To move the chart to a different
place in the worksheet, drag just inside the edge but not on a sizing handle. To add or
change titles, right-click just inside the chart, select Chart Options from the little pop-up
menu, and choose the Titles tab. To change the font size, right-click on the item (a title or
an axis) and choose Format from the little pop-up menu. To change the number format of
an axis, double-click on it and select Number. Here is one possible result:

[Chart: scatterplot of Time (vertical axis, 0 to 90) against Pages (horizontal axis, 0 to 200)]

Example: Internet Usage Ratings (Plotting the Least-Squares Line)


Here is how to use Excel to add a least-squares line to a scatterplot. We continue with the
Internet usage ratings data.
Right-click with the mouse on a data point in the chart, then select Add Trendline from the
context-sensitive menu that appears, and finally specify Linear as the Trend/Regression type
before clicking OK. The initial step of right-clicking on a data point is shown below, followed by
the end result after the line has been added.
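The line that Excel adds is the least-squares line, and its intercept and slope can be computed directly. Here is a minimal Python sketch; the function name `least_squares_line` and the Pages and Time values are hypothetical (they are not the data of Table 11.1.3).

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # the line passes through the two averages
    return intercept, slope


# Hypothetical data for illustration only.
pages = [10, 50, 100, 150, 200]
time = [8, 25, 40, 62, 85]
intercept, slope = least_squares_line(pages, time)
print(intercept, slope)
```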

Example: The Stock Market (Regression Analysis)


Here is how to perform regression analysis with Excel, using data from Table 11.1.6 on the daily
percent change in the S&P500 stock market index, trying to predict today’s market movement
from yesterday’s. As an alternative, you may wish to consider using StatPad, which will provide
more explanation of the results and give more output and charting options.
1. First give a name to each column of numbers if needed (for example, by selecting a
column of numbers and using Excel’s Insert/Name/Define menu command).
2. Look under the Tools menu for Data Analysis, and then select Regression. In the resulting
dialog box, you may specify the range name for the Y variable (“Today” in this example)
and for the X variable (“Yesterday”). If you cannot find Data Analysis under Excel’s Tools
menu, select Add-Ins from the Tools menu and make sure the Analysis ToolPak is
checked. If the Analysis ToolPak was not installed when Excel was installed on your
computer, you will need to install it from the Excel CD-ROM.
3. Click “Output Range” in the dialog box and specify where in the worksheet you want the
results to be placed, then click OK. Here is the dialog box and its results, which include
the R² value of 0.0132 (which tells you that only 1.32% of the variation in market
performance can be explained by yesterday’s market change), the standard error of
estimate Se of 0.0087 (which tells you that, after using the predicted value, the actual
performance is typically different by about 0.87 percentage points), as well as the
estimated regression coefficient b = 0.1114 (which tells you that, for each percentage point
of yesterday’s market performance, we expect today’s market performance to be up about
an additional one-tenth of that, on average), its standard error Sb = 0.1522, the t statistic t
= 0.732, and its p-value of 0.468 (which is not significant because p > 0.05, telling you
that there is no significant relationship between yesterday’s and today’s market
performance).

From these results, looking at the last table’s Coefficients and recognizing that “X variable 1”
refers to the X variable “Yesterday”, you can see that the least-squares prediction equation is
Today = 0.000398 + 0.111421 × Yesterday
Because the R2 is 0.0132 or 1.32% (from the first table of Regression Statistics), it is clear that
knowing what the market did yesterday does not help you very much to predict what it
will do today.
To perform the t test, you may look at the t statistic (“t Stat” for “X Variable 1” in the last table) of
0.732 and its p-value of 0.468. Because p > 0.05, the relationship between Yesterday’s and
Today’s stock market movements is not significant.
This is also clear from the 95% confidence interval for the regression coefficient, which extends
from -0.196226 to 0.419068 and includes the reference value 0. These numbers are found in the
last row of the last table under the headings “Lower 95%” and “Upper 95%”.
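If you would like to check the arithmetic behind the regression output outside Excel, here is a minimal Python sketch of the same least-squares calculations. The function name and the short data lists are hypothetical (the S&P500 values from Table 11.1.6 are not reproduced here); only the last two lines use numbers reported above.

```python
import math


def simple_regression(x, y):
    """Least-squares fit of y = a + b*x, with the summary values Excel reports."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared prediction errors around the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_error_estimate = math.sqrt(sse / (n - 2))
    std_error_slope = std_error_estimate / math.sqrt(sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": 1 - sse / syy,
        "std_error_estimate": std_error_estimate,
        "std_error_slope": std_error_slope,
        "t_stat": slope / std_error_slope,
    }


# Hypothetical daily percent changes, for illustration only
yesterday = [0.5, -1.2, 0.3, 0.8, -0.4, 1.1, -0.7, 0.2]
today = [-1.2, 0.3, 0.8, -0.4, 1.1, -0.7, 0.2, 0.6]
fit = simple_regression(yesterday, today)

# Check on the reported output: t is the coefficient divided by its standard error
t_reported = 0.111421 / 0.1522
```

The last line reproduces the t statistic of about 0.732 from the coefficient and standard error shown in the Excel output.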

Multiple Regression (Chapter 12)


Excel’s Tools/Data Analysis commands allow you to perform multiple regression analysis and
correlation analysis of multivariate data. As an alternative, you may wish to consider using
StatPad, which will allow you to pick and choose your X variables even if they are not right next to
one another, and will also explain the results and give more output and charting options.

Example: Magazine Ads (Multiple Regression)


Here is how to perform multiple regression analysis on the magazine ad data from Table 12.1.3 in
Chapter 12 of Practical Business Statistics, to understand how the cost per page of advertising
can be (at least partially) explained by the magazine’s characteristics. Here is the data set we will
be working with (there are 55 magazines in all - this is just the top of the database).

1. Look under the Tools menu for Data Analysis, and then select Regression. If you cannot
find Data Analysis under Excel’s Tools menu, select Add-Ins from the Tools menu and
make sure the Analysis ToolPak is checked. If the Analysis ToolPak was not installed
when Excel was installed on your computer, you will need to install it from the Excel CD-
ROM.
2. In the resulting dialog box, you may specify the range for the Y variable by selecting the
label at the top along with the column of numbers to be predicted by dragging the mouse
down the column starting at the label “Page” in cell D9 down to the Page Cost value for
the last magazine in cell D64. The X variables must be right next to each other, forming a
rectangular range of rows and columns. In this case the X variable range, including labels,
is from E9 (the label “Audience”) to G64 (the Income measure for the last magazine),
selected by dragging the mouse diagonally from one corner to the other. Here is the
resulting dialog box:

What if you do not want to use all of the X variables? For example, to leave one out,
create a copy of the X variables (select them, use Edit/Copy, select a
cell in a different part of the worksheet, then use Edit/Paste), select the column of
data to be omitted, delete it with the Del key (this is why we use a copy!), select the
columns to its right by dragging the mouse diagonally across from one corner to the other,
then use Edit/Cut, move to the empty column, and use Edit/Paste to close the gap. You
now have a copy of the X data that omits the column you are not using.
3. Click “Labels” in this dialog box because we have included labels at the top of the data
columns. This was done to make the results easier to interpret (so that Excel can use the
names of the variables instead of just “X variable 2” for example).
4. Click “Output Range” in this dialog box and specify (by clicking the mouse or typing a cell
address) where in the worksheet you want the results to be placed, then click OK. The
result is not a pretty sight: it still needs to be tidied up, because some cells are hidden
behind others and the numbers are not aligned nicely.

5. Now tidy it up and format the results. If there is more in a cell than you can see, select it
and use the menu command Format/Column/AutoFit Selection in order to make the
column wider so that you can see it all. To control the number of decimal places shown,
select the cell(s), then use Format/Cells, then under the Number tab you might choose
Number and then specify the number of decimal places. The last two columns have been
deleted because they contain no new information (they just repeated the columns before
them). Here are the results after tidying up:

The results in the first table of Regression Statistics include the R2 value of 0.787 (which tells you
that 78.7% of the variation in Page Costs can be explained by the X variables) and the standard
error of estimate Se of 21,578 (which tells you that Page Costs can be predicted to within about
this many dollars).
The ANOVA table includes the F test, whose p-value 3.81619E-17 is very small (the “E-17” tells
you to move the decimal point to the left 17 places, so actually p =
0.0000000000000000381619). In particular, p < 0.001 and the result is very highly significant.
The last table has the Coefficients, including the constant term of 4,042.799 and the regression
coefficients: 3.788 for Audience, -123.634 for Male, and 0.903 for Income. The Standard Error
column shows standard errors for each of these coefficients. Next are their t statistics and p-values
(note that Audience and Income are significant, but Male is not). Finally you have 95% confidence
intervals for the regression coefficients - for example, we are 95% sure that the effect of an
additional dollar of Income is to increase Page Costs somewhere between $0.161 and $1.645, on
average.
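To see how the coefficients combine into a prediction, here is a small Python sketch that applies the fitted equation using the values reported in the output above. The function name and the example magazine are hypothetical, and the inputs are assumed to be in the same units as the columns of Table 12.1.3.

```python
# Coefficients as reported in the last table of the Excel output
INTERCEPT = 4042.799
COEFFICIENTS = {"Audience": 3.788, "Male": -123.634, "Income": 0.903}


def predict_page_cost(audience, male, income):
    """Predicted Page Cost from the fitted multiple regression equation."""
    return (INTERCEPT
            + COEFFICIENTS["Audience"] * audience
            + COEFFICIENTS["Male"] * male
            + COEFFICIENTS["Income"] * income)


# Hypothetical magazine (illustrative values only, not from the data set)
example_cost = predict_page_cost(audience=10000, male=50.0, income=40000)

# Excel's scientific notation reads directly as a number
p_value = float("3.81619E-17")
p_value_written_out = f"{p_value:.22f}"
very_highly_significant = p_value < 0.001
```

Remember that actual Page Costs typically differ from such a prediction by about the standard error of estimate, 21,578.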

Example: Magazine Ads (Correlations)


Here is how to find the correlation matrix of a multivariate data set, giving you the correlation of
each pair of variables.

1. Look under the Tools menu for Data Analysis, and then select Correlation. In the resulting
dialog box, you may select the labels at the top of each column as part of the data range
(which must be data columns arranged right next to each other, forming a rectangular
range of rows and columns). Also click on “Labels in First Row” so that Excel can use the
variable names to help you understand the results. In this case the Input Range is from D9
(the label “Page”) to G64 (the Income measure for the last magazine), selected by
dragging the mouse diagonally from one corner to the other. Here is the resulting dialog
box:

2. Click on OK. You can see, for example, that the correlation between Page Costs and
Audience is the highest, with r = 0.872. The correlation between Audience and Income is
negative, with r = -0.353. Here are the results:
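For comparison, here is a minimal Python sketch of what the Correlation tool computes: the ordinary correlation coefficient for every pair of columns. The function names and the four short columns of data are hypothetical, not the magazine data.

```python
import math


def pearson(x, y):
    """Correlation coefficient r between two equal-length columns."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """columns maps each variable name to its list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical data, for illustration only
data = {
    "Page": [50.0, 72.0, 31.0, 95.0, 60.0],
    "Audience": [9.0, 15.0, 4.0, 21.0, 11.0],
    "Male": [40.0, 55.0, 62.0, 35.0, 48.0],
    "Income": [38.0, 41.0, 52.0, 36.0, 44.0],
}
matrix = correlation_matrix(data)
```

Each variable has correlation 1 with itself, and the matrix is symmetric, which is why Excel shows only the lower triangle.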

Time Series (Chapter 14)


Excel can be used to perform a trend-seasonal analysis of time series data, accomplished by
performing a number of detailed steps one at a time to produce the results. As an alternative, you
may wish to consider using StatPad, which will perform this analysis automatically, with many
output and charting options.

Example: Ford Automotive Sales (Trend-Seasonal Analysis)


This example shows how to perform a trend-seasonal analysis of quarterly data using Excel. This
analysis is built up one basic step at a time by finding the moving average, the ratio-to-moving-
average, each seasonal index, the seasonally adjusted series, and the long-term trend. Consider the
data for Ford Motor Company’s Automotive Sales from Table 14.2.1 of Practical Business
Statistics. Here is the data set (it actually extends through 2000 - this is just the top of the
database).

1. To find the moving average for a quarterly series like this one, remember that it starts with
the third row (so that we can average a full year’s worth of data, with a half-year before
and a half-year after). So we start in the third quarter (cell D6 in this case). Note that if we
go back two quarters and ahead two quarters there are two “Quarter 1” values, so they
must have weight 0.5 each so that quarters 1 through 4 are treated equally. The easiest
way to compute this weighted average is actually to average two overlapping full years’
worth of data: the four quarters of 1994 (cells C4:C7 here) with the full year beginning
one quarter later (cells C5:C8). This is why, in this case, you can use the formula
=AVERAGE(C4:C7,C5:C8)
in cell D6 for the first moving-average value. An easy way to enter the formula is to drag
down each four-quarter range instead of typing in its address. Here is how it looks so far:

If you have a monthly instead of a quarterly time series, then instead of the “quarter”
column with 1, 2, 3, 4, 1, 2, 3, ... you would have a “month” column with 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 1, 2, 3, ... and the moving average would start in the seventh row
instead of the third. The formula for the moving average would again be the average of
two overlapping full years’ worth of data: (1) the first 12 months and (2) the full year
beginning one month later with months 2 through 13. With monthly data the moving
average is also unavailable for the last six months.
2. Double click on the fill handle to copy this formula down the column, then select and
delete the last two entries of this column because the moving average is unavailable for the
last two quarters. (For monthly data, delete the last six entries). Here is the result so far:

3. Find the ratio-to-moving-average, by dividing the Sales value by the Moving Average
value, (in this case, place the formula =C6/D6 in cell E6) then double-click on the fill
handle to copy this formula down the column. Here is the result:

4. The seasonal index can be computed for all quarters, even when the moving average and
ratio are unavailable. The seasonal index for a given quarter (1, 2, 3, or 4) is the average
of all the ratios for that quarter, averaged over all the years that have a ratio for that
quarter. For example, the seasonal index for quarter 1 is the average of the ratio 1.03142
for quarter 1 in 1995 with the ratio 1.000796 for quarter 1 in 1996, and so forth through
quarter 1 of 2000. Here is a fairly easy way to compute the seasonal index column by
using the SUMIF(RANGE,CRITERIA,SUM_RANGE) function to sum the ratios for the
selected quarter, divided by the COUNTIF(RANGE,CRITERIA) function that counts
how many there are.
In this case the formula to put in cell F4 is
=SUMIF($B$6:$B$29,B4,$E$6:$E$29)/COUNTIF($B$6:$B$29,B4)
Note carefully the use of dollar signs in the cell addresses: references with $ will not
change when the formula is copied. The RANGE is $B$6:$B$29 in both functions
(SUMIF and COUNTIF), consisting of those values in the “Quarter” column for which
ratios are available, so that the first two and last two rows are excluded. The CRITERIA
in both functions (SUMIF and COUNTIF) is simply B4, which refers to the Quarter
number, 1, for the first row of data. No dollar signs are used here so that when the
formula is copied, the result will be for the appropriate quarter for that row. The
SUM_RANGE is $E$6:$E$29 for the SUMIF function, telling it to sum up the ratio
values for the specified quarter number, specifying only those rows for which ratios are
available.
After entering this formula into cell F4, drag the fill handle down the entire column (or use
copy and paste) to find all the seasonal values. Note that they repeat exactly from one year
to the next; for example, the quarter 1 seasonal index is always 0.9993252 for all years:

5. The seasonally adjusted values are found by dividing each Sales figure by its Seasonal
Index. In this case, the formula is =C4/F4. Enter the formula into the top cell, then copy
down the column, perhaps by double-clicking on the fill handle:

6. Before you can find the long-term trend, you need a “time period” column consisting of
the numbers 1, 2, 3, ... counting how many time periods have gone by. A quick way to do
this is to start with 1 and 2 in the first two rows (H4 and H5 in this example), select both
cells, then double-click the little fill handle in the lower right corner of the selected cells.

7. Use this column of time periods to predict the seasonally adjusted column (Y) from the
time period (X) using regression analysis. A quick way to do this is with the
FORECAST(X,KNOWN_Y’S,KNOWN_X’S) function, using the first time period value
for X, the entire seasonally adjusted series with absolute $ cell addressing
as the KNOWN_Y’S, and the entire time period column with absolute $ cell
addressing as the KNOWN_X’S. In this case, entering the formula into cell I4 using
Insert/Function from the main menu looks like this (be careful to omit
$ for X but to use $ in the other two ranges):

8. Choose OK to see the resulting long-term trend value in the top cell, then double-click the
fill handle to copy the formula down the column:

9. To extend the trend beyond the series and find the seasonally-adjusted forecast values, the
quickest way is to select the last two rows of the time period and the trend columns (you
need two rows so that Excel will know to keep increasing the time period in the next step)
as follows:

and then to drag the little fill handle at the lower right corner of the selected range
down as many rows as you want. It’s like magic!

10. To prepare to forecast by seasonalizing the trend, you will need to extend the columns for
year, quarter, and seasonal index (columns A, B, and F here). After extending columns A
and B, you may select the last seasonal index (cell F31 here) and drag the fill handle down
to extend it (if Excel has not already done this for you):

11. You are now ready to create the forecast values by multiplying the trend by the seasonal
index. In this example, enter the formula =I4*F4 into cell J4, then double-click the fill
handle (or copy and paste) to complete the forecast column. Congratulations! You are
done with the calculations!
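The whole sequence of steps above can also be sketched in a few lines of Python, which may help you see how the columns depend on one another. This is only an illustration under stated assumptions: the function names are hypothetical, and the short quarterly series is made up (it is not the Ford data from Table 14.2.1).

```python
def centered_moving_average(series, period=4):
    """Average two overlapping full years, like =AVERAGE(C4:C7,C5:C8)."""
    half = period // 2
    result = [None] * len(series)  # unavailable for the first and last half-year
    for i in range(half, len(series) - half):
        first_year = series[i - half:i + half]
        next_year = series[i - half + 1:i + half + 1]
        result[i] = (sum(first_year) + sum(next_year)) / (2 * period)
    return result


def seasonal_indexes(series, seasons, period=4):
    """Average ratio-to-moving-average for each season (SUMIF/COUNTIF)."""
    moving = centered_moving_average(series, period)
    totals, counts = {}, {}
    for value, average, season in zip(series, moving, seasons):
        if average is None:
            continue  # no ratio is available for this row
        totals[season] = totals.get(season, 0.0) + value / average
        counts[season] = counts.get(season, 0) + 1
    return {season: totals[season] / counts[season] for season in totals}


def linear_trend(values):
    """Least-squares line through time periods 1, 2, 3, ... (like FORECAST)."""
    n = len(values)
    periods = range(1, n + 1)
    mean_t = sum(periods) / n
    mean_y = sum(values) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(periods, values))
             / sum((t - mean_t) ** 2 for t in periods))
    return mean_y - slope * mean_t, slope


def trend_seasonal_forecast(series, seasons, future_seasons, period=4):
    """Forecast = extended trend times seasonal index, for the periods ahead."""
    index = seasonal_indexes(series, seasons, period)
    adjusted = [value / index[s] for value, s in zip(series, seasons)]
    intercept, slope = linear_trend(adjusted)
    start = len(series) + 1
    return [(intercept + slope * (start + k)) * index[s]
            for k, s in enumerate(future_seasons)]


# Hypothetical quarterly data: three years with no trend and a fixed pattern
quarters = [1, 2, 3, 4] * 3
sales = [90.0, 110.0, 120.0, 80.0] * 3
forecast = trend_seasonal_forecast(sales, quarters, future_seasons=[1, 2, 3, 4])
```

For a monthly series you would pass period=12 and month numbers 1 through 12 in place of the quarter numbers.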

Example: Ford Automotive Sales (Charting the Series and Forecast)


Here is one way to make a chart of one or more of the columns you have created. In this example
we create a chart of the original series (sales) together with the forecast values.
1. To begin, select the Sales column including the label at the top (so Excel can use this
label), then choose Insert/Chart from the main menu and specify Chart Type as Line and
Chart sub-type as either the first choice, or “Line with markers displayed at each data
value” as specified here:

2. To list the years along the horizontal axis, click Next >, choose the Series tab, click in the
“Category (X) axis labels:” portion of the dialog box and drag with the mouse down the
numbers in the Year column in the spreadsheet (in this example, cells A4:A36, excluding
the label at the top this time). The dialog box now looks like this:

3. To add the forecasts to this chart, click Add, then click in the Values area of the dialog
box, then drag with the mouse down the Forecast values in the worksheet (just the
numbers). Next click in the Name area of the dialog box, then click on the cell with the
label “Forecast” (in cell J3 here). Your dialog box now looks like this:

4. Click Next >, make any changes you like, then click Finish to place the chart into the
worksheet. After resizing the chart and double-clicking on the gray background to make it
white, the chart looks like this:

[Chart: Sales and Forecast plotted as lines, with the years 1994 through 2002 along the horizontal axis and values from 0 to 45 on the vertical axis]

ANOVA (Chapter 15)


Excel can perform one-way and two-way ANOVA. Here is an example of each type of analysis.

Example: Supplier Quality Scores (One-way ANOVA)


This example shows how to perform a basic one-way analysis of variance to test for significant
differences among several individual columns of data. Consider the data on supplier quality (Table
15.1.1 of Practical Business Statistics).
1. Look under the Tools menu for Data Analysis, select “Anova: Single Factor”, and choose
OK. If you cannot find Data Analysis under Excel’s Tools menu, select Add-Ins from the
Tools menu and make sure the Analysis ToolPak is checked. If the Analysis ToolPak was
not installed when Excel was installed on your computer, you will need to install it from
the Excel CD-ROM.

2. In the dialog box that appears, click in “Input Range” and select your data including
labels at the top, being sure to extend down to the last row even if you extend past the
end of some data columns. Excel requires that your variables be next to one another so
that your Input range is a rectangle. Click the check box “Labels in First Row” so that
Excel will recognize the names of the columns. Click the option button to the left of
“Output Range”, click in the box to its right, and then click in a cell in the worksheet
where Excel can put the results. So far, here is how it looks:

3. Click OK to see the results. In this case the p-value of 0.005 tells you that the mean
quality scores of these three suppliers are highly significantly different from one another (p
< 0.01). That is, you may conclude that there are supplier differences. Also shown are the
average quality for each supplier (82.056, 80.667, and 87.684) and each supplier's
variance. You also find the between-sample variability of 269.081 and the within-sample
variability of 45.631 under the MS column of the ANOVA table (MS stands for Mean
Square).
Here are the results, after tidying up by adjusting column widths (try selecting cells that
are not displayed properly, then using Format/Column/AutoFit Selection) and by
formatting most cells to show three decimal places (using Format/Cells, selecting the
Number tab, then using Category Number with 3 decimal places for these cells).

4. To find the suppliers’ standard deviations, you may take the square root of each variance,
using the SQRT function as follows:
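If you want to see where the numbers in the one-way ANOVA table come from, here is a minimal Python sketch of the between-sample and within-sample mean squares, the F statistic, and the standard deviations. The function names and the three small groups are hypothetical (they are not the supplier data); the last line uses the two mean squares reported above.

```python
import math
from statistics import mean, variance


def one_way_anova(groups):
    """Return (between-sample MS, within-sample MS, F statistic)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


def standard_deviations(groups):
    """Square root of each group's variance, as with Excel's SQRT function."""
    return [math.sqrt(variance(g)) for g in groups]


# Hypothetical quality scores for three groups, for illustration only
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ms_between, ms_within, f_stat = one_way_anova(scores)
std_devs = standard_deviations(scores)

# F statistic implied by the mean squares reported in the Excel output
f_reported = 269.081 / 45.631
```

The p-value itself comes from the F distribution, which is not computed in this sketch; Excel reports it in the ANOVA table.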

Example: Production Quality by Shift and Supplier (Two-way ANOVA)


This example shows how you might perform a two-way analysis of variance with interaction term.
Consider the data on production quality according to shift (day, night, and swing) and supplier (A,
B, and C) summarized in Figure 15.4.1, and discussed in Problem 16, with averages listed in Table
15.5.4 of Practical Business Statistics.
1. Look under the Tools menu for Data Analysis, select “Anova: Two-Factor with
Replication”, and choose OK. If you cannot find Data Analysis under Excel’s Tools menu,
select Add-Ins from the Tools menu and make sure the Analysis ToolPak is checked. If the
Analysis ToolPak was not installed when Excel was installed on your computer, you will
need to install it from the Excel CD-ROM.

2. In the dialog box that appears, click in “Input Range” and select your data including
labels at the top and on the sides. Excel requires that the data be arranged in a table as
shown below. In this case there are 5 observations for each combination of shift and
supplier, so the “Rows per sample” is set at 5. Click the option button to the left of
“Output Range”, click in the box to its right, and then click in a cell in the worksheet
where Excel can put the results. So far, here is how it looks:

3. Click OK to see the results, as shown below. First you see summary statistics for each
combination of shift and supplier. For example, the average quality for Shift 1 and Supplier
1 is 77.062, the average for Supplier 1 is 82.417 (to the right in the first table, for Supplier
1, under “Total”), and the average for Shift 1 is 80.076 (below, in the table headed “Total”
under the column headed Shift 1 at the very top).
In the ANOVA table are the results of the hypothesis tests, including a p-value of 0.720 for
testing whether the suppliers have equal means or not, a p-value listed as 0.000 for testing
whether the shifts have equal means or not, and a p-value of 0.014 for the interaction of
shift and supplier.
Here are the results, after tidying up by adjusting column widths (try selecting cells that
are not displayed properly, then using Format/Column/AutoFit Selection) and by
formatting most cells to show three decimal places (using Format/Cells, selecting the
Number tab, then using Category Number with 3 decimal places for these cells).

Nonparametrics (Chapter 16)


Excel has functions that can help you with nonparametric testing based on ranks of the data. In
this chapter we will illustrate the sign test and the two-sample unpaired nonparametric test (the
Mann-Whitney U-test, also called the Wilcoxon rank-sum test).

Example: Local and National Family Income (Sign Test)


The sign test can be used to test whether the median of a random sample differs significantly from
a reference value. Consider the example of local family incomes using data from Table 16.1.2 of
Practical Business Statistics. The question is: Do these incomes differ significantly from the
national median of $27,735?
By using the COUNTIF(RANGE,CRITERIA) function, you can find the modified sample size by
using, for the criteria, the condition that the data values are different from the reference value (to
say “different” in Excel, you use the less-than and greater-than signs like this: “<>
ReferenceValue”). You can also find the number of data values below the reference value by using
the less-than sign “< ReferenceValue” in the criteria.
You may then find the p-value for the test by using the BINOMDIST and the MIN functions. If m
denotes the modified sample size, #Below denotes the number of data values below the reference
value, and θ0 denotes the reference value, then the p-value is
=2*MIN(BINOMDIST(#Below,m,0.5,TRUE),BINOMDIST(m-#Below,m,0.5,TRUE))
Note that BINOMDIST takes the count first and the number of trials (here m) second.
Here is how you could find the modified sample size and the number of data values below the
reference value (m = 25 and #Below = 6) for the income data, together with the p-value. The
result is that the observed median income is significantly different from the reference value.
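As a check on the sign test calculation, here is a minimal Python sketch of the same steps: count the values that differ from the reference value, count those below it, and double the smaller binomial tail. The function names are hypothetical; the counts m = 25 and #Below = 6 are the ones reported above. Capping the result at 1 is an addition that the Excel formula does not include.

```python
from math import comb


def binomial_cdf(k, n, p=0.5):
    """P(X <= k) for a binomial count; what BINOMDIST(k, n, p, TRUE) returns."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def sign_test_p_value(m, below):
    """Two-sided p-value from the modified sample size and the count below."""
    tail = min(binomial_cdf(below, m), binomial_cdf(m - below, m))
    return min(1.0, 2 * tail)  # cap at 1 (an addition to the Excel formula)


def sign_test_counts(data, reference):
    """Like COUNTIF with the criteria "<>reference" and "<reference"."""
    m = sum(1 for value in data if value != reference)
    below = sum(1 for value in data if value < reference)
    return m, below


# Counts reported for the family income example
p_value = sign_test_p_value(m=25, below=6)
significant = p_value < 0.05
```

With these counts the p-value is about 0.015, which agrees with the conclusion that the difference is significant.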

Example: Incomes of Mortgage Applicants (Unpaired Two-Sample Test)


The nonparametric test for two unpaired samples is based on the ranks of the overall data set with
both samples combined. Both the Mann-Whitney U-test and the Wilcoxon rank-sum test give the
same result. Excel can help you perform this procedure.

1. Begin by listing both groups of numbers in a single column, with labels in the column to its
left to identify the group of each number. Then, with any data cell selected, use the menu
command Data/Sort to sort both columns by data value. Here is the Data/Sort dialog box:

2. Now find the rank of each data value, being careful to average any ties. To do this, create
a column (headed “1, 2, 3 ...” below) consisting of the initial ranks (before averaging) of
1, 2, 3, and so forth. Then create a column of ranks with tie averaging by using the formula
=SUMIF(DataRange,DataValue,123Range)/COUNTIF(DataRange,DataValue), being
careful to use absolute $ addressing for DataRange and 123Range but not for DataValue.
Here is the result after copying that formula down the column (for example, by double-
clicking on the fill handle after entering the first formula). Note that the averaged rank of
18.5 is used for both income values of 57,000.

3. To find the average rank for each group, you may again use the SUMIF and COUNTIF
functions, this time as
=SUMIF(GroupLabelRange,"Fixed",RanksRange)/COUNTIF(GroupLabelRange,"Fixed")
for the fixed-rate mortgages, changing "Fixed" to "Variable" for the variable-rate
mortgages. Here are the results:

4. Now find the average difference in ranks by subtracting these average ranks. Find the
standard error from the sample sizes n1 and n2 of the two groups (16 and 14, here) as
(n1+n2)*SQRT((n1+n2+1)/(12*n1*n2)). Divide the
average difference in ranks by the standard error to find the test statistic. Finally, find the
p-value using the function
=2*(1-NORMSDIST(ABS(TestStatistic)))
The results are as follows. Note that these two groups are not significantly different from
one another because p > 0.05.
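The steps above can be sketched in Python as a cross-check on your worksheet. The function names and the two small samples are hypothetical (they are not the mortgage applicant incomes). The standard error uses the large-sample formula (n1+n2)*sqrt((n1+n2+1)/(12*n1*n2)) for a difference of average ranks; treat this as an assumption about what the worksheet computes.

```python
import math


def average_ranks(values):
    """Ranks 1..n in sorted order, with tied values sharing their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        positions = [i + 1 for i, w in enumerate(ordered) if w == v]
        rank_of[v] = sum(positions) / len(positions)  # SUMIF / COUNTIF
    return [rank_of[v] for v in values]


def rank_standard_error(n1, n2):
    """Large-sample standard error of the difference in average ranks."""
    return (n1 + n2) * math.sqrt((n1 + n2 + 1) / (12 * n1 * n2))


def normal_cdf(z):
    """Standard normal cumulative probability, like NORMSDIST(z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def rank_sum_test(group1, group2):
    """Return (average rank 1, average rank 2, test statistic, p-value)."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    avg1 = sum(ranks[:n1]) / n1
    avg2 = sum(ranks[n1:]) / n2
    z = (avg1 - avg2) / rank_standard_error(n1, n2)
    return avg1, avg2, z, 2 * (1 - normal_cdf(abs(z)))


# Standard error for the sample sizes in this example (16 and 14)
se_example = rank_standard_error(16, 14)

# Hypothetical incomes for two small groups, for illustration only
avg1, avg2, z, p_value = rank_sum_test([1, 3, 5], [2, 4, 6])
```

Sorting the data first, as in step 1, is not needed here because the ranks are looked up by value.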

Chi-Squared Analysis (Chapter 17)


Chi-squared analysis is used to test for significance in counted data. You can use Excel to test
whether population percentages are equal to known reference values; this is done by performing
the appropriate steps (e.g., compute the expected counts, etc.). Excel may also be used to
perform a test for independence in a two-way table of counts.

Example: Causes of Quality Problems (Chi-squared Test for Known Percentages)


In this example we solve the problem of testing observed counts against known population
reference percentages by executing the steps in Excel. Consider the data on quality problems in
Tables 17.2.2 and 17.2.3 of Practical Business Statistics.
1. Find the column of expected counts by multiplying each reference percent by the total
observed count. Give a name such as “Observed” to the observed counts, and the name
“Expected” to the expected counts, perhaps by selecting a column of numbers and using
the menu command Insert/Name/Define.
2. To find the chi-squared statistic, since you have named the Observed and Expected
numbers, you may use the following matrix formula:
=SUM((Observed-Expected)^2/Expected)
by typing it in and holding Ctrl and Shift while you hit Enter. Because this is a matrix
formula, it may not give you an answer if you hit only Enter by itself.
3. To find the p-value, you may use the function CHIDIST(ChiSq,DF) evaluated using the
chi-squared statistic and its number of degrees of freedom. Here are the results. Note that
the observed counts do not differ significantly from the reference percentages because p >
0.05.
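The three steps above can be sketched in Python as follows. The function names and the observed counts and reference percentages are hypothetical (they are not the values from Tables 17.2.2 and 17.2.3). Python's standard library has no chi-squared distribution, so the upper-tail probability that CHIDIST returns is approximated here with a series for the incomplete gamma function.

```python
import math


def chi_squared_upper_tail(x, df):
    """Approximate P(chi-squared with df degrees of freedom > x), like CHIDIST."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Series for the lower regularized incomplete gamma function P(a, x/2)
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_squared_known_percentages(observed, reference_percents):
    """Return (expected counts, chi-squared statistic, df, p-value)."""
    total = sum(observed)
    expected = [p / 100.0 * total for p in reference_percents]
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus one
    return expected, chi_sq, df, chi_squared_upper_tail(chi_sq, df)


# Hypothetical counts and reference percentages, for illustration only
expected, chi_sq, df, p_value = chi_squared_known_percentages(
    observed=[30, 50, 20], reference_percents=[25, 50, 25])
```

The sum inside chi_squared_known_percentages is the same calculation as the formula =SUM((Observed-Expected)^2/Expected).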

Example: Is Your Market Segmented? (Chi-squared Test for Independence)


In this example we perform the chi-squared test for independence of two categorical variables
with Excel’s CHITEST function, which gives the p-value directly. Consider the data on market
segmentation (Table 17.3.1 of Practical Business Statistics) which gives the number of consumers
of each type (practical or impulsive) who purchased each type of rowing machine. Remember that
the chi-squared test requires counts, not percentages or averages.
1. Excel can help you compute the p-value of the chi-squared test for independence using the
CHITEST function, but you have to compute the table of expected counts first. In the
example below, to create a formula for expected counts that will copy correctly to fill the
entire table, note the use of “absolute addressing” using dollar signs in the formula
“=B$6*$D3/$D$6” to find the expected 18.93 purchases of basic machines by practical
consumers. This formula can be copied and pasted to fill the table while always taking
column totals from row 6 (hence the reference B$6), always taking the row totals from
column D (hence the reference $D3), and always taking the overall total from cell D6
(hence the reference $D$6).

2. The results are shown below: first the original table of counts, next the table of expected
counts, and finally the CHITEST function, which uses both the original table and the table
of expected counts (but not the totals). The resulting CHITEST p-value is 3.07823E-15,
which represents the very small number 0.00000000000000307823 because the scientific
notation “E-15” tells you to move the decimal point 15 places to the left. Clearly the result
is very highly significant because this p-value is less than 0.001.
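The table of expected counts can also be sketched in Python. Each expected count is the row total times the column total divided by the overall total, which is what the worksheet formula =B$6*$D3/$D$6 computes. The function names and the small table of counts are hypothetical (they are not the rowing machine data), and the p-value is left to CHITEST.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / overall total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    overall = sum(row_totals)
    return [[r * c / overall for c in column_totals] for r in row_totals]


def chi_squared_independence(table):
    """Return (expected counts, chi-squared statistic, degrees of freedom)."""
    expected = expected_counts(table)
    chi_sq = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(table, expected)
                 for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi_sq, df


# Hypothetical table of counts (rows and columns are categories)
counts = [[10, 20], [20, 10]]
expected, chi_sq, df = chi_squared_independence(counts)
```

As in the worksheet, only the counts themselves are passed in; the totals are computed from them rather than included in the table.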

Quality Control (Chapter 18)


You can produce quality control charts in Excel. You arrange the data for the chart in one column,
copy the center line number down another column, and similarly set up one column for each of
the control limits, then create the chart. Here are examples for XBar and R charts. In a similar
way you can also produce a percentage chart. The procedure is simpler if you use StatPad.

Example: Weights of Boxes of Detergent (XBar and R Charts)


Here is how to use Excel to draw an XBar chart for the detergent data from Table 18.3.4 of
Practical Business Statistics. Begin with a column containing a list of the averages (of five
observations each in this case). Immediately to its right, create a column containing the average
XBarBar=16.093 of these averages repeated down the column. Next to it, create a column for the
lower control limit XBarBar-A2*RBar = 15.941 and one for the upper control limit
XBarBar+A2*RBar = 16.245. Now select all four of these columns (just the numbers) and use
Excel’s menu command Insert/Chart (or click on the Chart Wizard icon) to bring up the Chart
Wizard dialog box. Under “Chart type” select Line (with markers displayed at each data value) as
shown. As you work your way through the Chart Wizard (by clicking on Next), pause at the
Chart Options dialog box. Click on the Gridlines tab to specify whether or not you would like
gridlines (which were not used here) and click on the Legend tab to delete the legend (by
unselecting the “Show legend” check box). Click on Finish, and the XBar chart appears.
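The control limits used for the XBar chart here, and for the R chart that follows, can be checked with a short Python sketch. The constants A2 = 0.577, D3 = 0, and D4 = 2.114 are the standard table values for samples of five observations. The function names are hypothetical, and the value RBar = 0.263 is an assumption inferred from the limits quoted in this example (it is not stated directly).

```python
# Standard control chart constants for samples of n = 5 observations
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_chart_limits(xbarbar, rbar):
    """(lower control limit, center line, upper control limit) for the XBar chart."""
    return xbarbar - A2 * rbar, xbarbar, xbarbar + A2 * rbar


def r_chart_limits(rbar):
    """(lower control limit, center line, upper control limit) for the R chart."""
    return D3 * rbar, rbar, D4 * rbar


xbarbar = 16.093   # average of the averages, from the example
rbar = 0.263       # assumption: inferred from the quoted control limits

xbar_lower, xbar_center, xbar_upper = xbar_chart_limits(xbarbar, rbar)
r_lower, r_center, r_upper = r_chart_limits(rbar)
```

In the worksheet, each of these values is repeated down its own column so that the chart shows it as a horizontal line.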

To use Excel to draw an R chart for the detergent data, proceed as for the XBar chart, but use the
range values R for the first column, their average RBar for the second column, and the
appropriate lower and upper control limits D3*RBar = 0 and D4*RBar = 0.556 for the third and
fourth columns. Here is the R chart in Excel:

Appendix: Excel Range Names


For use with Excel, each chapter of Practical Business Statistics has its own file
on the CD-ROM that includes the data tables from examples and problems. To
access it, use File/Open from Excel’s menu. Each column of numbers is named and
ready to use. For example, the data sets from Chapter 3 are in the file named
Chapter03.xls, and the employee database from Appendix A of the textbook is in
the file named EmployeeDatabase.xls.
To work with a column of numbers from a data file, you may use its name in a
formula, such as “=AVERAGE(yield)” to place the average of a column of
numbers named “yield” into a cell in your worksheet. Alternatively, you may drag
the mouse down the numbers in the data set to select them if you wish.
Here are the Excel range names assigned to each individual data table within a file.
Note that spaces are not allowed in Excel names, and that the underline character
(_) is often used instead.
CHAPTER 2, RANGE NAMES.
Characteristics of Food Services Companies.
• Profits
• Return
• Employees
• Revenues
Daily Stock Price Information for Home Depot
• Open
• High
• Low
• HD_Close
• Volume
• HD_Percent_Change
Table 2.6.1. Employment/History Status of Five People.
• Salary
• Experience
Table 2.6.2. Selected Product Output of Production Facilities.
• Employ
Table 2.6.3. Sales and Income.
• Sales
• Income
Table 2.6.4. Selected Customers and Purchases.
• Purchases
Table 2.6.5 Ratings of Cell Phones.
• Price
Table 2.6.6. Comparison of Upright Vacuum Cleaners.
• Vacuum_Price
• Weight
Table 2.6.7. Closing Price and Monthly Change for DJIA Firms.
• Close
• Change
Table 2.6.8. Daily DJTA for September 1998.
• DJIA
• Net_Change
• Percent_Change

CHAPTER 3, RANGE NAMES.

Table 3.2.1. Home Mortgage Rates.
• Interest_Rate

Table 3.2.2. Starting Salaries.
• Salary

Table 3.4.1. Assets of Commercial Banks in the Fortune 1000.
• Assets

Table 3.5.1. Yields of Money Market Funds.
• Taxable_Yield
• Tax_Exempt_Yield

Table 3.5.2. Rates of Computer Ownership.
• Computer_Owners

Table 3.6.1. Changes in Spending on Syndicated TV Advertising.
• Change

Table 3.8.1. The Number of Employees for Food Services Firms.
• Employees

Table 3.9.1. Yields of Municipal Bonds.
• Yield

Table 3.9.2. Market Response to Stock Buy-Backs.
• Price_Change

Table 3.9.3. Active Stock Market Issues.
• Stock_Change

Table 3.9.4. CREF'S Investments.
• Market_Value

Table 3.9.5. Percent Change in Revenues for Scientific, Photo, and Control Equipment Companies in the Fortune 500.
• Revenue_Change

Table 3.9.6. Hospital Charges for Heart Failure and Shock.
• Hospital_Charges

Table 3.9.7. CEO Compensation for Food Processing Firms.
• CEO_Compensation

Table 3.9.8. Market Share for Seattle Radio Stations.
• Listeners

Table 3.9.9. Net Income of Selected Firms.
• Net_Income

Table 3.9.10. Cost of Traditional Funeral Service.
• Funeral_Cost

Table 3.9.11. Special Exemptions to the Tax Code.
• Exemption

Problem 3.21. Defective Motors, Per Batch Of 250.
• Defects

Table 3.9.12. Cost to Rent a Car.
• Rental

Problem 3.23. Interest Rates.
• Rate

Problem 3.24. Market Values.
• Market

Problem 3.25. Executive Salaries.
• Executive_Salary

Problem 3.26. Order Size.
• Order

Problem 3.27. Envelope Prices.
• Envelope_Price

Problem 3.28. Market Share.
• Share

Table 3.9.13. Percentage Change in Dollar Value.
• Dollar_Change

Problem 3.30. Tylenol Prices.
• Tylenol

Table 2.6.7. Closing Price and Monthly Change for DJIA Firms.
• DJIA_Close
• DJIA_Change

Table 2.6.8. Daily DJIA for January 2002.
• DJIA_Net_Change
• DJIA_Percent_Change

Case.
• Material
• Manager
• Inventory

CHAPTER 4, RANGE NAMES.

Example: How Many Defective Parts?
• Defects

Example: Your Grade Point Average.
• Credits
• Grade

Example: The Firm's Cost of Capital
• Market_Value
• Rate_of_Return

Table 4.1.1. Loss at Opening, Crash of 1987.
• Loss

Table 4.2.1. CEO Compensation in Technology.
• Salary

Table 4.2.2. Business Failures by State.
• Failures

Problem 4.1. Cars Requiring Extra Work
• Cars

Table 4.3.1. Last Month's Sales
• Sales

Table 4.3.2. Value Added Tax Rates by Country.
• VAT

Table 4.3.3. Profits for General Merchandisers in the Fortune 500.
• Profits

Problem 4.7. Beta of Stock Portfolio.
• Shares
• Cost_Per_Share
• Beta

Table 4.3.4. State Population and Taxes.
• Population
• State_Taxes

Table 4.3.5. Percent Change in Housing Values over Five Years for U.S. Regions.
• Percent_Change

Table 4.3.6. Revenues for selected Fortune 500 companies.
• Revenues

Table 4.3.7. Percent increases of initial public stock offerings.
• Percent_Increase

Problem 4.16. Paper Mill Problems.
• Problem

Table 4.3.8. Home Mortgage Loan Fees
• Fee

Problem 4.23. Strength of Cotton Yarn.
• Strength

Problem 4.24. Factory Inventory Level.
• Inventory

Problem 4.25. Your Products' Share.
• Share

Problem 4.26. Monthly Sales.
• Monthly_Sales

Table 4.3.9. Changing Value of the Dollar.
• Change

Table 3.9.1. Yields of Municipal Bonds.
• Yield

Table 3.9.2. Market Response to Stock Buy-Backs.
• Price_Change

Table 3.9.4. CREF'S Investments.
• CREF_Value

Table 4.3.10. Length in minutes for selected films from a video library.
• Time

Table 3.9.6. Hospital Charges for Heart Failure and Shock.
• Hospital_Charges

Table 3.9.7. CEO Compensation for Food Processing Firms.
• CEO_Compensation

Table 3.9.10. Cost of Traditional Funeral Service.
• Funeral_Cost

Table 4.3.11. Sales of Some 'Light' Foods.
• Food_Sales

Table 2.6.7. Closing Price and Monthly Change for DJIA Firms.
• DJIA_Close
• DJIA_Change

Table 2.6.8. Daily DJIA for January 2002.
• DJIA_Net_Change
• DJIA_Percent_Change

Case.
• Chairs
• Tables
• Bookshelves
• Cabinets
• Value

CHAPTER 5, RANGE NAMES.

Table 5.1.1. Finding The Deviations From The Average.
• Dart_Returns

Example: The Advertising Budget.
• Budget

Table 5.1.3-5.1.4. Closing Stock Prices and Daily Returns.
• Dow_Jones
• Dow_Jones_Return

Example: S&P 500 Stock Index Volatility.
• Standard_Deviation

Table 5.2.1. Employee Salaries.
• Employee_Salary

Table 5.2.2. Hospital Length of Stay.
• Days

Table 5.5.1. Advertising Accounts in Play.
• Ad_Budget

Table 5.5.2. Performance of Pharmaceutical Firms.
• Stock_Return

Table 5.5.3. Largest Stock Mutual Funds.
• Return_Mutual_Fund
• Assets_Mutual_Fund

Problem 5.6. Number of Executives for Seattle Firms.
• Executives

Table 5.5.4. Weights for Two Samples of Candy Bars.
• Before
• After

Table 5.5.5. Cost due to traffic congestion, per registered vehicle.
• Cost_Traffic

Problem 5.17. Rates of Return.
• ROR

Problem 5.18. Interest Rates
• Rate

Table 5.5.6. Theme Park Admission Prices.
• Admission

Table 5.5.7. Changing Value of the Dollar.
• Change

Problem 5.20. Weights of Sinks.
• Weight

Table 5.5.8. Hotel Room Prices.
• Price

Table 5.5.9. Gifts Returned.
• Returned

Problem 5.23. Airline Ticket Prices
• Ticket_Cost

Problem 5.24. Productivity Measures.
• Productivity

Problem 5.25. Sales.
• Sales

Problem 5.26. Percentage of Gold.
• Gold

Problem 5.27. Return on Equity.
• ROE

Table 4.3.2. Value Added Tax Rates by Country.
• VAT

Table 4.3.10. Length in minutes for selected films from a video library.
• Time

Table 5.5.10. International taxation.
• GDP
• Taxes

Table 3.9.1. Yields of Municipal Bonds.
• Yield

Table 3.9.2. Market Response to Stock Buy-Backs.
• Price_Change

Table 3.9.4. CREF'S Investments.
• Market_Value

Table 3.9.10. Cost of Traditional Funeral Service.
• Funeral_Cost

Problem 5.40. Defective Motors, Per Batch Of 250.
• Defects

Table 4.3.1. Last Month’s Sales.
• Last_Month_Sales

Table 4.3.5. Percent Change in Housing Values over Five Years for U.S. Regions.
• Percent_Change

Table 4.3.8. Home Mortgage Loan Fees.
• Fee

Table 5.5.11. International Bond Mutual Fund Performance.
• Performance_Before
• Performance_After

Table 5.5.12. Age and Cost for Presses.
• Age
• Cost_Presses

Table 2.6.7. Closing Price and Monthly Change for DJIA Firms.
• DJIA_Close
• DJIA_Change

Table 2.6.8. Daily DJIA for January 2002.
• DJIA_Net_Change
• DJIA_Percent_Change

Case.
• Part_Size

CHAPTER 7, RANGE NAMES.

Example. Profit Under Various Economic Scenarios.
• Profit
• Prob_of_Profit

Table 7.6.1. Probability Distribution of Payoff.
• Payoff
• Prob_of_Payoff

Table 7.6.2. Probability Distribution of Downtime.
• Downtime
• Prob_of_Downtime

Table 7.6.3. Probabilities for Qualified Technical Applicants.
• Applicants
• Probability_of_Applicants

Table 7.6.4. Rates of Return and Probabilities.
• ROR
• Prob_of_ROR

Table 7.6.5. Quality Control Problems.
• Prob_of_Rework
• Rework_Cost

Case.
• Oil_Price
• Prob_of_Oil_Price

CHAPTER 8, RANGE NAMES.

Table 8.6.1. Project Analysis.
• Probability
• Profit_or_Loss

Table 8.6.2. Industrial Farm Equipment Firms.
• Profit

Table 8.6.3. Revenue Change for Fortune 500 Soap and Cosmetics Companies.
• Revenue_Change

Table 8.6.4. Economic Forecasts.
• Forecast

Problem 8.31. Recent Billings.
• Billing

Problem 8.33. Quality of Agricultural Produce.
• Quality

CHAPTER 9, RANGE NAMES.

Table 9.1.2. Thickness of Selected Sheets of Paper.
• Thickness

Table 9.1.3. Yearly Percentage of Adults Using the Internet.
• Internet_Usage

Table 9.1.4. Yields of a Chemical Processing Facility.
• Tons

Problem 9.11. Personal Computer Orders.
• Computers

Problem 9.14. Weights of Loaves of Bread.
• Weight

Problem 9.16. Cleaning Cost
• Cleaning_Cost

Table 9.6.1. Prices at SuperMall and elsewhere for various items.
• SuperMall
• Elsewhere
• Savings

Problem 9.27. Daily Changes in S&P 500 Stock Market Index.
• Change

Table 9.6.2. Performance of Recommended Stocks.
• Performance

Problem 9.34. Computer Speed.
• Seconds

Problem 9.35. Economic Viability of Mining Operation.
• ROR

Table 4.3.1. Last Month's Sales.
• Sales

Problem 9.40. Strength of Cotton Yarn.
• Strength

Table 5.5.4. Weights for Two Samples of Candy Bars.
• Before
• After

Problem 9.44. Quality scores for agricultural produce.
• Quality

Problem 9.45. Caffeine in Coffee.
• Caffeine

Case.
• Order_Amount

CHAPTER 10, RANGE NAMES.

Table 10.6.1. Relaxation Scores.
• Before
• After

Table 10.6.4. Salaries Arranged by Gender.
• Women
• Men

Problem 10.8. Inventory Level.
• Inventory

Problem 10.9. Weights of Loaves of Bread.
• Weight_Bread

Table 4.3.7. Percent increases of initial public stock offerings.
• Percent_Increase

Problem 10.21. Weight of Frozen Foods.
• Weight_Food

Problem 10.22. Prices.
• Price

Problem 10.23. Calorie Content.
• Calories

Table 10.7.2. Store Returns.
• Returned

Problem 10.25. Satisfaction Scores.
• Satisfaction

Problem 10.26. Pollutant Levels.
• Pollution

Problem 10.27. Component Weights.
• Weight_Component

Table 10.7.3. Performance of Socially Aware Funds.
• ROR

Table 10.7.4. World Income Funds One-Year Market Return.
• Market_Return

Table 10.7.5. Vocal Stress Level.
• True_Stress
• False_Stress

Table 10.7.6. Wine Tasting Scores.
• Chardonnay
• Sauvignon

Table 10.7.7. Days Until Failure.
• You
• Competitor

Table 10.7.8. Monthly Daycare Rates.
• Laurelhurst
• Other_Areas

Table 10.7.9. New Product Preferences.
• Milwaukee
• Green_Bay

Table 10.7.11. Supplier Quality.
• Custom_Cases
• International_Plastics

Table 5.5.4. Weights for Two Samples of Candy Bars.
• Candy_Before
• Candy_After

Case.
• n
• Avg
• stdDev
• stdErr
• t
• p

CHAPTER 11, RANGE NAMES.

Table 11.1.1. First Quarter Performance.
• Contacts
• Sales_Qtr

Table 11.1.3. Internet Usage Ratings.
• Audience
• Reach
• Pages
• Time

Table 11.1.4. Top Merger & Acquisition Advisers.
• Deals
• Dollars

Table 11.1.5. Mortgage Costs.
• Fee
• Interest

Table 11.1.6. Percent Change in Stock Index.
• Today
• Yesterday

Table 11.1.7. S&P100 Index Call Options.
• Strike_Price
• Call_Price

Table 11.1.8. Temperature and Yield for an Industrial Process.
• Temperature
• Yield_Process

Table 11.1.9. Fiber-Optics Long-Distance Communications.
• Investment
• Miles

Table 11.1.11. U. S. Treasury Bonds.
• Coupon_Rate
• Bid_Price

Table 11.1.12. Weekly Production.
• Number_Produced
• Cost_Weekly

Data for Figure 11.1.18. Restaurant and Food Store Expenditures by State, millions.
• Food_Stores
• Restaurant

Table 11.2.2. Weekly Production (Outlier Omitted).
• Produced
• Cost

Table 11.2.3. Territory and Performance of Salespeople.
• Territory
• Sales_Performance

Table 11.3.1. Printing Presses.
• Age
• Cost_Printing

Table 11.3.2. Airline On-Time Performance.
• Month_On_Time
• Year_On_Time

Table 11.3.3. International Closed-End Bond Funds.
• NAV
• Price_Fund

Table 11.3.4. Business Failures By State.
• Failures
• Population

Table 11.3.5. Daily Stock Changes.
• McDonalds
• Dow_Jones

Table 11.3.6. Expense Ratio and One-Year Rate of Return.
• WR_Expense_Ratio
• WR_Return

Table 11.3.7. Votes for Albert Gore, Jr.
• Nov7
• Certified
• Change

Table 11.3.8. Total U.S. Advertising Spending by Retail Firms.
• Ads2000
• Ads1999

Table 11.3.9. Market Share and 30-Second Advertising Cost.
• Share
• Ad_Cost

Table 11.3.10. Gold Coins.
• Weight
• Price_Gold

Table 11.3.11. Room for Expansion.
• Existing_Units
• Capacity

Table 11.3.12. Gasoline Prices.
• Price_11_30_90
• Price_2_26_91

Table 11.3.13. Salaries and Money Raised Per Capita, Charitable Organizations.
• President
• Money_Raised

Table 11.3.14. Mailing Lists.
• Size
• Sales

Table 11.3.15. Short-Term Bond Funds.
• Maturity
• Return

Table 11.3.16. Production Data.
• Workers
• Production

Table 11.3.17. Biotechnical Stocks.
• EPS
• Price_Biotech

Table 11.3.18. Newspaper Advertising Rates Per Line.
• Circulation
• Open_Line_Rate

Table 11.3.19. Newspaper Ad Rates Adjusted for Readership.
• Circulation2
• Milline_Rate

Table 11.3.20. Defects and Possible Causes.
• Defect_Rate
• Temperature_Variability
• Stoppages

Case.
• Purifier
• Yield_Case

CHAPTER 12, RANGE NAMES.

Table 12.1.3. Advertising Costs, Characteristics of Magazines.
• Page
• Audience
• Male
• Income

Table 12.2.2. Computers and Office Equipment Companies in the Fortune 500.
• Mkt_Val
• Assets
• Employees

Table 12.2.13. Dividends, Sales of Goods.
• Div
• Nondur
• Durable

Table 12.3.3. Temperature and Yield for an Industrial Process.
• Temperature_Process
• Yield

Table 12.4.4. Salary, Experience, and Gender for Employees.
• Salary
• Experience
• Gender

Table 12.5.1. Picasso Paintings.
• Price
• Area
• Year

Table 12.5.4. Computer Response Time, Users, and Load.
• Response_Time
• Users
• Load

Table 12.5.5. Performance of International Stocks.
• US
• Europe
• Pacific_Rim

Table 12.5.7. CEO Salaries, Sales and Return on Equity for Northwest Companies
• NW_Salary
• NW_Sales
• NW_ROE

Table 12.5.8. Brokerage House Asset-Allocation.
• Performance
• Stocks
• Bonds

Table 12.5.11. Staff and Contribution Levels for Charities.
• Staff
• Public
• Govt
• Other

Table 12.5.14. Fiber-Optics Long-Distance Communications.
• Invest
• Miles

Table 12.5.16. Price and Profit in Test Markets.
• Price_Test
• Profit

Table 12.5.18. Interest Rates.
• Fed_Funds
• T_Bills
• T_Bonds

Case.
• Temperature
• Density
• Rate
• AM_PM
• Defect

CHAPTER 13, RANGE NAMES.

Data from Appendix of Report: Quick Pricing Formula.
• Components
• Size
• Cost

CHAPTER 14, RANGE NAMES.

Table 14.1.1. Radio, TV, and Computer Store Sales.
• Radio_TV_Computer

Table 14.1.3-14.1.4. U.S. Retail Sales.
• Unadjusted_Sales
• Seasonally_Adjusted_Sales

Table 14.1.5. U.S. Treasury Bills.
• Yield

Table 14.2.1. Ford Motor Company.
• Automotive_Sales

Table 14.3.1. Civilian Unemployment Rate.
• Unemployment

Table 14.4.1. Quarterly Revenues for Walt Disney Company and Subsidiaries.
• Disney

Table 14.4.2. Quarterly Net Sales for PepsiCo.
• PepsiCo

Table 14.4.3. Quarterly Sales for Deere & Company.
• Deere_Sales

Table 14.4.4. Quarterly Sales for Castle & Cooke, Inc.
• Castle_Cooke_Sales

Table 14.4.5. Quarterly Sales for Nordstrom, Inc.
• Nordstrom_Sales

Table 14.4.6. Quarterly Sales.
• Sales

Table 14.4.10. Interest Rate Forecasts.
• Forecast
• Lower_95
• Upper_95

CHAPTER 15, RANGE NAMES.

Table 15.1.1. Quality Scores for Suppliers' Products.
• Amalgamated
• Bipolar
• Consolidated

Table 15.5.3. Lengths of Telephone Calls.
• Info
• Sales
• Service
• Other

Table 15.5.4. Original Data.
• Quality
• Shift
• Supplier

CHAPTER 16, RANGE NAMES.

Table 16.1.2. Incomes of Sampled Families.
• Income

Table 16.2.2. Level of Creativity.
• Ad_1
• Ad_2

Table 16.3.2. Income of Mortgage Applicants.
• Fixed
• Variable

Table 16.4.1. Building Materials Firm Profits.
• Building_Profit

Table 16.4.2. Aerospace Firm Profits.
• Aerospace_Profit

Table 16.4.3. Relaxation Scores.
• Before
• After

Table 16.4.4. Stress Levels.
• True_Answer
• False_Answer

Table 16.4.5. Gender Salary Data.
• Women
• Men

Table 16.4.6. Reliability of Products Under Abuse.
• Yours
• Competitor

Table 16.4.7. Prescription Drug Prices.
• United_States
• Canada

CHAPTER 17, RANGE NAMES.

Table 17.1.1. Vehicle Desired.
• Vehicle_Count
• Vehicle_Percent

Table 17.1.2. Responses to the Question on GM Cars.
• Boomer
• Nonboom
• Overall

Table 17.2.2. Defective Components.
• Defect_Count

Table 17.3.1. Rowing Machine Purchases.
• Practical
• Impulsive

Table 17.4.1. Vehicle Desired: This week's count and last year's percentage.
• This_Count
• Last_Percent

Table 17.4.2. Incoming Telephone Calls.
• Phone_Count
• Phone_Percent

Table 17.4.3. Survey of Future Business Conditions.
• Managers
• Employees

Table 17.4.4. Survey on the Chances of a Stock Market Crash.
• Stockholders
• Nonstockholders

Table 17.4.5. Order Rates by Region
• East
• West

Table 17.4.6. Status of Mortgage Applications.
• Residential
• Commercial

Table 17.4.7. Household Responses.
• Satisfied
• Dissatisfied

Table 17.4.8. Newsletter Interest Level.
• Customer
• Potential_customer

CHAPTER 18, RANGE NAMES.

Table 18.1.1. Defect Causes, with Frequency of Occurrence.
• Number_Defects

Table 18.3.3. Summaries of Measurements for 8 Samples, n=4.
• Average_Meas
• Range_Meas

Table 18.3.4. Weights of Sampled Boxes of Detergent.
• Average_Detergent
• Range_Detergent

Table 18.4.2. Summaries of Measurements for 12 Samples.
• Defects_of_500

Table 18.4.3. Errors in Batches of n=300 Purchase Orders.
• Defects_of_300

Table 18.5.1. Frequency of Problems in Candy Manufacturing.
• Candy_Problems

Table 18.5.2. Problems in Rebate Processing.
• Rebate_Problems

Table 18.5.3. Thickness of Protective Coating.
• Thick_1
• Thick_2
• Thick_3

Table 18.5.4. Length of Broccoli Trees, n=4.
• Broc_1
• Broc_2
• Broc_3
• Broc_4

Table 18.5.5. Defective Invoices, n=500.
• Errors

Table 18.5.6. High Speed Memory Chips.
• Chip_Number

Table 18.5.7. Baking Oven Temperatures.
• Mon_Avg
• Mon_Range
• Tues_Avg
• Tues_Range
• Wed_Avg
• Wed_Range

APPENDIX A, RANGE NAMES.

Appendix A. Employee Database.
• Salary
• Gender
• Age
• Experience
• Level

APPENDIX B, RANGE NAMES.

Appendix B. Donations Database.
Note: “_D0” indicates 19,011 non-donors, while “_D1” indicates 989 donors, out of 20,000 overall.
• Age
• Age_D0
• Age_D1
• Age55_59
• Age55_59_D0
• Age55_59_D1
• Age60_64
• Age60_64_D0
• Age60_64_D1
• AvgGift
• AvgGift_D0
• AvgGift_D1
• Cars
• Cars_D0
• Cars_D1
• CatalogShopper
• CatalogShopper_D0
• CatalogShopper_D1
• Clerical
• Clerical_D0
• Clerical_D1
• Donation
• Donation_D0
• Donation_D1
• Farmers
• Farmers_D0
• Farmers_D1
• Gifts
• Gifts_D0
• Gifts_D1
• HomePhone
• HomePhone_D0
• HomePhone_D1
• Lifetime
• Lifetime_D0
• Lifetime_D1
• MajorDonor
• MajorDonor_D0
• MajorDonor_D1
• MedHouseInc
• MedHouseInc_D0
• MedHouseInc_D1
• OwnerOccupied
• OwnerOccupied_D0
• OwnerOccupied_D1
• PCOwner
• PCOwner_D0
• PCOwner_D1
• PerCapIncome
• PerCapIncome_D0
• PerCapIncome_D1
• Professional
• Professional_D0
• Professional_D1
• Promotions
• Promotions_D0
• Promotions_D1
• RecentGifts
• RecentGifts_D0
• RecentGifts_D1
• Sales
• Sales_D0
• Sales_D1
• School
• School_D0
• School_D1
• SelfEmployed
• SelfEmployed_D0
• SelfEmployed_D1
• Technical
• Technical_D0
• Technical_D1
• YearsSinceFirst
• YearsSinceFirst_D0
• YearsSinceFirst_D1
• YearsSinceLast
• YearsSinceLast_D0
• YearsSinceLast_D1
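
The “_D0” and “_D1” naming pattern in the Donations Database can be pictured with a short sketch. It assumes that each suffixed range is simply the subset of the full column for non-donors or donors; the guide states only what the suffixes mean, not how the ranges were built. The function split_by_donor and the sample values are hypothetical and are not taken from the database.

```python
def split_by_donor(name, values, donated):
    """Return the full column plus its non-donor (_D0) and donor (_D1) subsets."""
    return {
        name: list(values),
        name + "_D0": [v for v, d in zip(values, donated) if not d],
        name + "_D1": [v for v, d in zip(values, donated) if d],
    }

# Hypothetical values for illustration (not taken from the Donations Database)
ages = [34, 61, 47, 58]
donated = [False, True, False, True]

columns = split_by_donor("Age", ages, donated)
for range_name, column in columns.items():
    print(range_name, column)
```

Under this assumption the two subsets together account for every row of the full column, just as 19,011 non-donors and 989 donors add up to 20,000 overall.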
