Economics of Education Review 20 (2001) 377–388

www.elsevier.com/locate/econedurev

Student performance, attrition, and class size given missing student data

William E. Becker a,b,*, John R. Powers c

a Department of Economics, Indiana University, Bloomington, IN 47405, USA
b School of International Business, University of South Australia, Australia
c US EPA, Mail Code 4301, 1200 Pennsylvania Avenue, NW, Washington, DC 20460, USA

Received 26 August 1999; accepted 9 May 2000

Abstract

Class size is of particular interest to education researchers and administrators because it is one of the few variables that administrators can change from term to term. In studies of class size, however, little if any attention is given to the consequence of missing student records that result from data cleaning done by those collecting the data, student unwillingness to provide data, or students self-selecting out of the study, and the implications of this selection on an appropriate measure of class size. These shortcomings are addressed here: class size and other class-specific variables that may affect student learning of economics are considered along with the hazard of attrition between the pre-course test and the post-course test and students' failure to complete questionnaires about themselves and the courses. Contrary to studies that have used an average or an end-of-term class size measure and find no class-size effect, beginning class size is found to be significant and negatively related to learning of economics, all else equal. In part, this is the result of students in larger classes being significantly more likely than students in smaller classes to withdraw from the course before taking the posttest. © 2001 Elsevier Science Ltd. All rights reserved.

JEL classification: A2; C24

Keywords: Selection; Education; Testing; Class size

1. Introduction

Kennedy and Siegfried (1997), among others, report that characteristics over which instructors or department chairs have control do not significantly affect college student achievement in economics. Of particular interest is class size, because it is a variable that administrators can change from term to term.1 In studies of class size, however, little, if any, attention is given to (1) the consequence of missing student records that result from data cleaning done by those collecting the data, (2) student unwillingness to provide data, (3) students self-selecting

1 Lazear (1999) argues that a school's optimal class size varies directly with the quality of students. Because the negative congestion effect of disruptive students is lower for better students, the better the students, the bigger the optimal class size and the less that class size appears to matter: "in equilibrium, class size matters very little. To the extent that class size matters, it is more likely to matter at lower grade levels than upper grade levels where class size is smaller." (p. 40) However, Lazear does not address how class size is to be measured or the influence of class size on attrition. His analysis does not

* Corresponding author. Tel.: +1-812-855-3577; fax: +1-812-855-3736.
E-mail addresses: beckerw@indiana.edu (W.E. Becker), powers.john@epa.gov (J.R. Powers).

0272-7757/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved.
PII: S0272-7757(00)00060-1
out of the study, resulting in smaller class sizes at the end of the term, and (4) the implications of this selection on an appropriate measure of class size.

In this paper, we address the consequence of these four oversights in an examination of the relationship between class size and student learning. To address the first two data problems, we add observations that were dropped during the early stage of assembling data for the 3rd edition of the Test of Understanding in College Economics. To examine the effects of attrition (course withdrawal) we introduce three measures of class size (beginning, ending, and average) and employ a Heckman (1979) estimation procedure to adjust for self-selection out of the course. Contrary to studies that have used an average or an end-of-term class size measure and find no class-size effect, beginning class size is found to be significant and negatively related to learning, all else equal. In part, this is the result of students in larger classes being significantly more likely than students in smaller classes to withdraw from the course before taking the posttest.

The paper is organized as follows. Following this introduction, Section 2 describes the data used in our analysis. Section 3 explains the typical learning regressions using student self-reported data but with a distinction for beginning, ending, and average class size. Section 4 addresses the importance of including students who did not self-report. Section 5 provides a model of the propensity to persist in the course and take the posttest. Section 6 gives the estimated learning equation with an adjustment for attrition. The conclusion, Section 7, is that beginning class size has a negative effect on student persistence and learning in all cases examined.

2. The data

The data set for this study is an uncensored version of that assembled as part of the norming of the 3rd edition of the Test of Understanding in College Economics (TUCE), which was produced by Saunders (1994) for the National Council on Economic Education (NCEE).2 In addition to administering the TUCE as a pre-course test (pretest) and/or a post-course test (posttest) to their classes, participating instructors were asked to complete an Instructor's Questionnaire (IQ), administer a Student Questionnaire (SQ), and provide other course and student related information. Complying instructors were responsible for all aspects of this data collection and for forwarding results for data assimilation.

The voluntary nature of this process resulted in different numbers of observations for many of the variables, including for those students who completed both the pretest and posttest. Of particular interest in our study is (1) the lack of information provided by students about themselves, (2) the discarding of data on students who did not properly complete the test form, and (3) the attrition that took place between the pretest and posttest that is not reflected in the widely used NCEE's cleaned (i.e., censored) data set.

With the exception of Siegfried and Kennedy (1995) and Kennedy and Siegfried (1997), researchers have overlooked the fact that Saunders removed students who did not answer all questions on each administration of the TUCE, with one and only one answer per question.3 To determine class size, excluded students must be brought back in and counted as part of a class. The number of records in the uncensored microeconomics data set made available to us directly by Saunders shows that 2836 students took the 30-question microeconomics TUCE as either a pretest or posttest in classes for which the posttest score counted as part of students' course grades. Of these 2836 students, 2587 had a pretest score, 2326 had a posttest score, but only 2077 had both pre- and posttest scores.4 (For comparison purposes, Saunders' NCEE cleaned or censored data set for the same sections has 2363 pretest scores, 2275 posttest scores, and 1896 matched pre- and posttest scores.)

Also of interest to us is the number of students who had a pretest score, a posttest score, and filled out a

1 (continued) address the dynamics of class size determination over the term of a course. Similarly, the debate over the effect of class size, summarized by the meta-analyses of Hanushek (1994) and Hedges, Laine, & Greenwald (1994a,b) and reviewed by and contributed to by Krueger (1999), makes no reference to the proper determination of class size.
2 The censored version of the data set is available from the National Council on Economic Education. Our uncensored data set and our LIMDEP programs are available from W. E. Becker upon request.
3 The significance of excluding those who answered fewer than 30 questions on the pretest goes beyond errors in class size determination. For example, knowing the posttest counts in the course grade, a cunning student may reason that improvement might end up counting as well and deliberately leave answers blank on the pretest. The possibility of such data manipulation and resulting sample selection bias has been ignored or unrecognized by researchers.
4 There appears to be agreement among researchers who have used the TUCE data set that student achievement, whether measured by a posttest score or by the change between the pretest score and the posttest score, is influenced positively by the posttest score counting in students' course grades. Thus, researchers are now restricting their analysis to only those students for whom the posttest score affected the course grade. We will not question this practice here as we do not have the resources to assemble the necessary missing data for all of the additional sections in which the posttest score did not influence the students' course grades. For a theoretical rationale on the importance of a test having value to a student see Becker (1982).
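The record counts reported above imply three nested analysis subsets. As a quick plausibility check, the sketch below reproduces those marginal counts with hypothetical indicator flags; the arrangement of the flags is an assumption for illustration, and only the totals come from the paper.

```python
import numpy as np

n = 2836  # TUCE records in the uncensored microeconomics data set

has_pretest = np.zeros(n, dtype=bool)
has_posttest = np.zeros(n, dtype=bool)
has_sq = np.zeros(n, dtype=bool)

# One arbitrary arrangement consistent with the reported margins:
# 2587 pretests, 2326 posttests, 2077 with both, 1427 with both plus SQ.
has_pretest[:2587] = True
has_posttest[:2077] = True   # records holding both a pretest and a posttest
has_posttest[2587:] = True   # 249 posttest-only records, for 2326 in total
has_sq[:1427] = True         # SQ completers within the matched group

subset_pre = has_pretest                     # pretest takers
subset_matched = has_pretest & has_posttest  # matched pre- and posttest
subset_full = subset_matched & has_sq        # matched tests plus SQ data
print(subset_pre.sum(), subset_matched.sum(), subset_full.sum())
```

These three subsets correspond to the three columns of descriptive statistics in Table 1.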
student questionnaire (n=1427). Thus, the three subsets of data that are used in our analysis include (1) the students who had a pretest score (n=2587), (2) the students who had both a pretest score and a posttest score (n=2077), and (3) the students who had both a pretest score and a posttest score, and who also filled out a student questionnaire (n=1427).

The variables for our analysis are listed and defined in Table 1, with sample means and standard deviations (in parentheses) provided for each of the three data set sizes of interest. To simplify both the presentation and interpretation of the variables, they are divided into categories. Student Test Performance Variables include the student's score on the TUCE at the start of the term (STUDENT PRETEST SCORE), at the end of the term (STUDENT POSTTEST SCORE), and the difference between the two scores (STUDENT CHANGE SCORE = Student posttest score minus Student pretest score). Class size variables include the size of the class at the start of the term (INITIAL CLASS SIZE), at the end of the term (TERMINAL CLASS SIZE), and the average of the two sizes (MEAN CLASS SIZE).

To control for factors other than class size, we somewhat arbitrarily selected variables from the TUCE data set that had previously been found to influence learning. For institutional characteristics, the Institutional Control Variables include dummy variables that indicate whether the student's academic institution is a doctoral granting institution (DOCTORAL INSTITUTION), a masters granting institution (MASTERS INSTITUTION), or a bachelors granting institution (BACHELORS INSTITUTION).5

For instructor characteristics, Instructor Control Variables include the instructor's sex (INSTRUCTOR GENDER), whether English was the instructor's native language (INSTRUCTOR'S ENGLISH), and whether the instructor had earned a PhD (INSTRUCTOR PHD).

For student characteristics, Student Control Variables include the student's GPA (STUDENTGPA), whether the student reports taking an economics course in high school (STUDENTHIGH SCHOOL ECON), the student's sex (STUDENTGENDER), whether the student reports having a job (STUDENTJOB), the student's assessment of how interesting the course was (STUDENTINTEREST), the student's assessment of the quality of the textbook (STUDENTTEXTBOOK), the student's assessment of the instructor's English communication ability (STUDENTCOMMUNICATE), and the student's assessment of the instructor's overall effectiveness (STUDENTEFFECTIVENESS OF INSTRUCTOR). In addition, we included Student Dummy Variables to identify students for whom a final exam (posttest) was taken (STUDENTTOOK FINAL) and if the student did not complete an evaluation (NO STUDENT DATA).

3. Regression analysis of test scores using student self-reported data

In keeping with previous educational research, as discussed in Becker (1997), a student's knowledge of economics at the beginning and end of the course is measured by the pretest and posttest scores. Learning of economics is defined by the difference between these two scores.6 This learning (or value added) is assumed to be produced by human capital inputs (e.g., ability or aptitude), utilization measures (e.g., outside job versus outside study), and instructional technology or other environmental considerations (e.g., class size, institution type). In this framework, researchers typically exclude students for whom there is no matching pretest and posttest and for whom information is missing on the input measures, and ignore the bias that may be introduced by this practice. In this section, we first explore within the matched pre- and posttest setting the consequence of excluding students for whom data are missing on potential explanatory variables and later explore the effect of attrition from pretest to posttest.

In the typical analysis, data come from students and

5 The Carnegie Foundation (1994) classification has ten groups: Research I and II, Doctoral I and II, Masters I and II, Baccalaureate I and II, and Associate I and II. The four group classifications used in the TUCE data involved the collapsing of Research and Doctoral institutions into one category, and not using the I and II distinction. Because the school names are unknown and masked by Saunders at the time of data entry and classification, we are not able to unbundle the schools. This is unfortunate because results from the Becker and Watts (1999) survey call into question the appropriateness of condensing these categories for the assessment of teaching and learning activities. For example, 71 of the 120 economics departments at Research I and II universities for whom questionnaires were completed reported an average weight given teaching in tenure and promotions cases of 0.25 (out of 1.00). For the 93 departments at Doctorate I and II institutions, there were only 33 respondents, for whom the average weight given to teaching was 0.40.
6 Becker (1983) identified the estimation bias that results from correlation between the pretest and the error term inherent in a regression of the posttest on the pretest and other covariates. Ten years later, Kennedy (1992) re-established the existence of this bias. Yet, education researchers continue to use the pretest as a regressor in the posttest regressions with no acknowledgment of the problem. In deference to custom, estimates of the posttest on the pretest and other covariates are available on request as appendices to this article. Because our results are robust across specifications, we restrict our discussion in the text to estimation of the change-score model.
Table 1
Variable definitions and descriptive statistics, by data subset

Columns report Mean (St.Dev.) for three data subsets, in order: students with pretest score [n=2587]; students with pretest score and posttest score [n=2077]; students with pretest score, posttest score, and SQ data [n=1427].

Student Test Performance Variables
STUDENT PRETEST SCORE: Student's exam score at start of term 10.60 (4.03) 10.85 (4.07) 11.16 (4.22)
STUDENT POSTTEST SCORE: Student's exam score at end of term n.r. 16.30 (5.88) 16.67 (5.87)
STUDENT CHANGE SCORE: Student posttest score minus pretest score n.r. 5.46 (4.58) 5.52 (4.46)
Class Size Variables
INITIAL CLASS SIZE: Class size at start of term 55.56 (23.78) 55.74 (23.91) 56.48 (24.92)
TERMINAL CLASS SIZE: Class size at end of term 49.97 (22.46) 50.70 (22.48) 51.33 (23.87)
MEAN CLASS SIZE: Mean of Initial and Terminal class size 52.77 (22.80) 53.22 (22.90) 53.90 (24.11)
Institutional Control Variables
DOCTORAL INSTITUTION: If doctoral granting institution then 1, otherwise 0 0.32 (0.47) 0.34 (0.47) 0.28 (0.45)
MASTERS INSTITUTION: If masters granting institution then 1, otherwise 0 0.42 (0.49) 0.41 (0.49) 0.45 (0.50)
BACHELORS INSTITUTION: If bachelors granting institution then 1, otherwise 0 0.14 (0.34) 0.14 (0.35) 0.16 (0.36)
Instructor Control Variables
INSTRUCTOR GENDER: If instructor is male then 1, if female then 2 1.23 (0.42) 1.23 (0.42) 1.21 (0.41)
INSTRUCTOR'S ENGLISH: If instructor's native language is English then 1, otherwise 0 0.92 (0.27) 0.92 (0.27) 0.95 (0.22)
INSTRUCTOR PHD: If instructor has a PhD then 1, otherwise 0 0.69 (0.46) 0.70 (0.46) 0.73 (0.44)
Student Control Variables
STUDENTGPA: Student's self-reported GPA times 100 n/a n/a 286.92 (57.21)
STUDENTHIGH SCHOOL ECON: If student reports taking an economics course in high school then 1, otherwise 0 n/a n/a 0.44 (0.50)
STUDENTGENDER: If student is male then 1, if female then 2 n/a n/a 1.40 (0.49)
STUDENTJOB: If student reports having a job then 1, otherwise 0 n/a n/a 0.50 (0.50)
STUDENTINTEREST: Student's assessment of how interesting the course was (10 to 50) n/a n/a 34.23 (9.18)
STUDENTTEXTBOOK: Student's assessment of the quality of the textbook (10 to 50) n/a n/a 34.77 (8.82)
STUDENTCOMMUNICATION SKILLS OF INSTRUCTOR: Student's assessment of instructor's English communication ability (10 to 50) n/a n/a 45.97 (6.36)
STUDENTEFFECTIVENESS OF INSTRUCTOR: Student's assessment of the instructor's overall effectiveness (10 to 50) n/a n/a 41.13 (7.95)
Student Dummy Variables
STUDENTTOOK FINAL: If student took final exam then 1, otherwise 0 0.80 (0.40) 1.00 (0.00) 1.00 (0.00)
NO STUDENT DATA: If student did not provide data then 1, otherwise 0 0.29 (0.45) 0.16 (0.36) 0.00 (0.00)
instructors as self-reported information.7 Students are then excluded from the analysis if their records are missing values for any variable used as a regressor in the change-score (or learning) regression, even though each student has both pretest and posttest scores used to form the change-score that defines learning. In our uncensored version of the TUCE data set, of the 2587 pretest takers, 2077 took the posttest (2nd column in Table 1), but only 1427 (3rd column in Table 1) had sufficient student data to estimate parameters in the typical regression model of learning.8 In our consideration of the importance of class size, we start with this traditional change-score regression that employs student self-reported data, instructor data and institutional characteristics, as listed in the 3rd column of Table 1.

Our main hypothesis is that the smaller the class in which the student enrolls, all else equal, the more likely the student is to complete the course, take the posttest, and do well on it. In the traditional learning regression (no adjustment for selection, and all else equal), this hypothesis translates into the conjecture that class size is negatively related to the change scores. We advance no prior hypothesis for the influence of the other covariates on learning, although past research, as well as intuition, suggests that aptitude and motivation to learn, as measured by grade point average, should have a positive effect unless students are selectively or incorrectly reporting their grade point averages (Becker, 1997).

As our measure of class size, we could use starting enrollment, final enrollment, or some average enrollment measure. Kennedy and Siegfried (1997) use the average number of pretests and posttests as their measure of class size, for example. Because we believe that learning and the propensities to persist versus withdraw are related, we need to know about a student's decision to stay or drop. Unfortunately, the TUCE data set does not provide direct information on whether a student formally withdrew from the course. We assume, however, that the existence of a pretest but no posttest signals a withdrawal, because the posttest counted in the course grade.9

Note that we are asserting here that the posttest count cannot be used directly as a measure of class size in a study of learning because it incorporates students' decisions to persist to the final exam or withdraw. That is, the sample selection process that led to the taking of a final exam contaminates the posttest count as a measure of class size; any class size calculation that incorporates this final enrollment measure is endogenous. Nevertheless, for comparison purposes, we report regressions employing all three measures of class size as regressors (Table 2): pretest count, posttest count, and the mean.10

In Table 2, the coefficient estimates are reported for a least-squares regression of individual student learning (STUDENT CHANGE SCORE) on the three sets of control variables plus each of the three class size variables. Our population of interest consists of students who were in the class at the start of the course. In the first column of Table 2, class size at the beginning of the term (INITIAL CLASS SIZE) is negatively and significantly related to learning: a one-person increase in initial class size lowers each student's predicted change score by 0.0104, which is significant at the 5 percent Type I error level with a one-tail p-value of 0.036. On the other hand, as others have observed, and as seen in columns two and three, end-of-semester class size (TERMINAL CLASS SIZE) and average class size (MEAN CLASS SIZE) are not significantly related to learning at typical Type I error levels. But unlike beginning class size, these findings are based on a coefficient estimator that is biased because end-of-semester class size is endogenous; it is correlated with the error term and thus so is any average incorporating it. In addition, all three regressions in Table 2 are based on only those students who were in class and provided data on themselves when the student questionnaire was administered.

4. The consequence of missing student data

Results presented in Table 2 require the exclusion of each student who did not complete a student questionnaire (SQ). A student may not have completed a questionnaire because he dropped the course before the questionnaire was administered, was not in class when it was administered, or did not wish to complete it.

To assess the consequence of the missing student data

7 Maxwell and Lopus (1994) identify biases associated with self-reported data. In the TUCE data set the accuracy of the student-reported data has not been checked, although it is known that some students provided information on discussion sections that were not a regularly scheduled part of the course.
8 A student questionnaire was assumed empty or missing if no entry was recorded for at least one of four arbitrarily selected questions (instructor interest, English, overall teaching effectiveness, and textbook quality).
9 A comparison of Becker and Watts (1996) and Siegfried, Saunders, Stinar, & Zhang (1996) survey results suggests that the TUCE data underrepresents large classes. This may be related to the high cost imposed on TUCE-participating instructors for administering, collecting and maintaining records in large classes. Unfortunately, the Becker and Watts (1996) data set also suffers from a low response rate.
10 We do not have beginning enrollments as reported by the registrar; we have only the pretest count of those who took the TUCE at the start of the term in a specific class. As do other researchers, we use this pretest count for our starting class size, but our class size includes all students who returned the pretest regardless of the number of questions answered. Similarly, the total number of posttests taken (regardless of the number of questions answered or the existence or nonexistence of a pretest) determines ending class sizes.
Table 2
OLS estimates of student learning using student-provided data

Dependent Variable: STUDENT CHANGE SCORE

Regression results with class size equal to a
Independent variables: Initial class size / Terminal class size / Mean class size
Coefficient estimate (Standard error)

Constant 0.533 (1.257) 0.133 (1.259) 0.343 (1.259)
INITIAL CLASS SIZE -0.010* (0.006)
TERMINAL CLASS SIZE 0.004 (0.006)
MEAN CLASS SIZE 0.004 (0.006)
DOCTORAL INSTITUTION 1.496** (0.480) 1.214** (0.489) 1.382** (0.485)
MASTERS INSTITUTION 0.209 (0.500) 0.018 (0.497) 0.086 (0.499)
BACHELORS INSTITUTION 1.882** (0.573) 2.068** (0.570) 1.987** (0.572)
INSTRUCTOR GENDER 0.129 (0.296) 0.142 (0.297) 0.124 (0.296)
INSTRUCTOR'S ENGLISH 2.492** (0.598) 2.704** (0.597) 2.599** (0.598)
INSTRUCTOR PHD 0.459 (0.367) 0.120 (0.365) 0.294 (0.367)
STUDENTGPA 0.019** (0.002) 0.019** (0.002) 0.019** (0.002)
STUDENTHS ECON 0.192 (0.229) 0.186 (0.229) 0.190 (0.229)
STUDENTGENDER 0.964** (0.235) 0.990** (0.235) 0.980** (0.235)
STUDENTJOB 0.242 (0.227) 0.231 (0.228) 0.236 (0.228)
STUDENTINTEREST 0.049** (0.014) 0.049** (0.014) 0.049** (0.014)
STUDENTTEXTBOOK 0.028* (0.014) 0.027* (0.014) 0.027* (0.014)
STUDENTCOMMUNICATE 0.011 (0.022) 0.011 (0.022) 0.011 (0.022)
STUDENTEFFECTIVE 0.006 (0.017) 0.007 (0.017) 0.007 (0.017)
R-squared 0.143 0.141 0.141
F 15.65 15.43 15.43
n 1427 1427 1427

a **Significant at 1% Type I error level (one-tail), *Significant at 5% Type I error level (one-tail).
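The structure of the Table 2 change-score regressions can be illustrated on simulated data. Everything in the sketch below (coefficient values, noise level, regressor distributions, the reduced regressor set) is assumed for illustration, not taken from the TUCE data; only the negative class-size effect echoes the reported estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1427  # students with matched tests and SQ data, as in Table 2

# Hypothetical stand-ins for a few of the Table 2 regressors.
initial_class_size = rng.integers(10, 120, n).astype(float)
student_gpa = rng.normal(287.0, 57.0, n)        # self-reported GPA times 100
doctoral = (rng.random(n) < 0.3).astype(float)  # doctoral-institution dummy

# Change scores generated with a small negative class-size effect,
# in the spirit of the -0.010 estimate in Table 2 (all values assumed).
change = (0.5 - 0.010 * initial_class_size + 0.019 * student_gpa
          + 1.5 * doctoral + rng.normal(0.0, 4.0, n))

# OLS of the change score on a constant and the regressors.
X = np.column_stack([np.ones(n), initial_class_size, student_gpa, doctoral])
b, *_ = np.linalg.lstsq(X, change, rcond=None)
print(b[1])  # estimated class-size coefficient
```

With class size generated independently of the error term, OLS recovers the class-size effect; the paper's point is that this independence fails when an end-of-term enrollment count is used instead.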

on estimators in the matched pre- and posttest learning models, consider the expected value of the change score, as calculated from a regression of the difference in posttest score (y1) and pretest score (y0) on the set of full information for each student. Let Ωi be the full information set that should be used to predict the ith student's change score Δyi [=(y1i - y0i)]. Let P(mi=1) be the probability that some of this explanatory information is missing. The desired expected value for the ith student's learning is then

E(Δyi|Ωi) = E(Δyi|Ωci) + P(mi=1)[E(Δyi|Ωmi) - E(Δyi|Ωci)]   (1)

where Ωci is the subset of information available from complete records and Ωmi is the subset of incomplete records.

The expected value of the change score on the left-hand side of Eq. (1) is desired but only the first major term on the right-hand side can be estimated. They are equal only if P(mi=1) is zero or its multiplicative factor within the brackets is zero. In our sample, however, the relative frequency of missing explanatory data is 0.15. Because willingness to complete a survey is likely not a purely random event, E(Δyi|Ωmi) ≠ E(Δyi|Ωci). As an alternative to using the student data provided on the student questionnaires, and to assess the effect in regressions that use this question-specific but sample-reducing data, we specify our measure of learning as a function of whether students completed a SQ but not as a function of anything they may or may not have written.

Our first hypothesis is that a student's failure to complete a questionnaire provides information on the student's performance: all else equal, those who do not complete a SQ learn less than those who do. If this hypothesis is true, then researchers who incorporate the individual student characteristics obtained as part of the TUCE norming are reporting on a conditional event (the learning effect of an identified student characteristic, given the information provided) where the condition is not independent of the effect. That is, the coefficient on the identified student characteristic cannot be interpreted as a simple marginal effect. Incorporating student data in the explanation of learning, where only a subset of students provide these data and the decision to provide these data is related to learning, implies regressor and error term correlation, and therefore biased estimation of the parameters of the learning regression.

Unlike students, the instructors who submitted useable TUCE scores almost universally completed an instructor questionnaire. Data on instructors can thus be employed with no prior hypothesis regarding their influence on students' posttest performance.
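The decomposition in Eq. (1) can be illustrated numerically. In the simulation below, every number is an assumption made for the sketch (the missing rate is only calibrated to land near the 0.15 reported in the text): students with low latent motivation both learn less and are more likely to skip the SQ, so the complete-case mean overstates average learning while the Eq. (1) weighting recovers the overall mean exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2077  # matched pre- and posttest takers

# Assumed mechanism: low-motivation students learn less and are also
# more likely to leave the student questionnaire (SQ) incomplete.
motivation = rng.normal(0.0, 1.0, n)
change_score = 5.5 + 3.0 * motivation + rng.normal(0.0, 2.0, n)
p_missing = 1.0 / (1.0 + np.exp(2.0 + 1.5 * motivation))
missing = rng.random(n) < p_missing

# Eq. (1): E(dy) = E(dy | complete) + P(m) [E(dy | missing) - E(dy | complete)]
e_complete = change_score[~missing].mean()
e_missing = change_score[missing].mean()
p_m = missing.mean()
reconstructed = e_complete + p_m * (e_missing - e_complete)

# Dropping incomplete records estimates only the first term, overstating
# mean learning; the full decomposition recovers the overall sample mean.
print(e_complete, change_score.mean(), reconstructed)
```

The bias of the complete-case mean is exactly the second term of Eq. (1), which vanishes only if no data are missing or if respondents and non-respondents learn the same amount on average.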

To test our hypothesis about the impact of completing a SQ on student performance, we re-ran the OLS regression with INITIAL CLASS SIZE and NO STUDENT DATA along with the institutional and instructor control variables. If test performance is not affected by a student's willingness to participate, as reflected by a completed SQ, the expected value of the NO STUDENT DATA coefficient is zero (that is, not filling out a SQ has no impact on student learning). But, as seen near the bottom of the 1st column of Table 3, the coefficient on the NO STUDENT DATA variable is -0.5063 and significant at the 5 percent Type I error level with a one-tail p-value of 0.0309. This implies that, all else equal, compared to students who provide information about themselves, students who do not provide information about themselves are predicted to score one half of a question lower. The implications are clear: in studies using matched pre- and posttests, conditioning only on those students who provided information about themselves removes students who would otherwise lower predicted learning.

In addition, after controlling for whether students provided information, the coefficient on the initial class size variable is negative (-0.010) and significant at the 5 percent Type I error level with a one-tail p-value of 0.033. Initial class size is an important input to the value-added measure of student learning of economics, even when adjustments are made for missing data.

Although our analysis in this section does not include those who did not take a posttest, an argument can be made that a student who had a matching pretest and posttest might not be willing to provide data on an SQ if he or she anticipates no learning. That is, anticipated or expected learning affects willingness to complete an SQ, making NO STUDENT DATA endogenous in this matched pretest and posttest regression. Such a person would likely drop the course and not appear in a learning regression, as discussed in the next section. However, in order to assess the impact of the class-size effect for a learning regression that includes and excludes the NO STUDENT DATA measure, we provide the results in the second column of Table 3, where NO STUDENT DATA is omitted. In this case, completely removing any reference to student data does not materially change the effect of class size, which continues to be significantly and negatively related to achievement at the 5 percent Type I error level. (Although not reported here for reasons already discussed, if class size is measured by the number taking the posttest, then its effect on the posttest continues to be positive even after taking account of those students for whom there is no self-reported data.)

5. Propensity to take the posttest

Becker (1983) and others have argued for the use of alternatives to standardized multiple-choice tests for assessing the input-output relationship in education. Card and Krueger (1996) propose the use of attrition rates and cite Sander (1993) as an example of a national study that uses attrition rates to assess school districts. We do the same here to assess the importance of class size.

The introduction of attrition rates as an output measure makes clear that terminal class size reflects students' decision to persist or withdraw from the course. As already argued, terminal class size is not exogenous in a pretest to posttest, change-score analysis. Unlike initial

Table 3
OLS estimates of student learning without student-provided data

Dependent variable: STUDENT CHANGE SCORE

Regression results a
Independent variables: With No student data / Without No student data
Coefficient estimate (Standard error)

Constant 6.856** (0.651) 6.797** (0.651)
INITIAL CLASS SIZE -0.010* (0.005) -0.009* (0.005)
DOCTORAL INSTITUTION 1.956** (0.409) 1.819** (0.403)
MASTERS INSTITUTION 0.378 (0.413) 0.471 (0.410)
BACHELORS INSTITUTION 2.210** (0.494) 2.140** (0.493)
INSTRUCTOR GENDER 0.387 (0.250) 0.379 (0.250)
INSTRUCTOR'S ENGLISH 2.750** (0.373) 2.739** (0.373)
INSTRUCTOR PHD 0.647* (0.288) 0.717** (0.286)
NO STUDENT DATA -0.506* (0.271) n/a
R-squared 0.095 0.095
F 27.45 30.84
n 2077 2077

a **Significant at 1% Type I error level (one-tail), *Significant at 5% Type I error level (one-tail).
384 W.E. Becker, J.R. Powers / Economics of Education Review 20 (2001) 377-388

Unlike initial enrollment, which is predetermined once the course is underway, terminal enrollment is influenced by the instructor's actions; it is endogenous.

Using the same explanatory variables introduced in Table 3, we next put forward the following hypotheses about persistence:

All else equal, the larger the initial or beginning class size, the less likely it is for a student in the class to take the posttest.

Thus, we conjecture that the coefficient estimate for INITIAL CLASS SIZE will be negative. Curriculum specialists trace the rationale for this conjecture about persistence to Tinto (1987), although empirical support of the type preferred by economists is lacking.

We also hypothesize that a student's starting knowledge of economics will influence the decision to stay with the course:

All else equal, the lower the student's pretest score, the less likely the student is to take the posttest.

Although our conjecture here is based on intuition more than any one scholar's theory, it seems so obvious that we are surprised that other education researchers employing the standard learning regressions of value added have not incorporated it into their analyses. We, therefore, expect that the coefficient estimate on STUDENT PRETEST SCORE will be positive.

To predict the likelihood of a student taking the posttest, the use of individual student characteristics would be helpful; but, as with the matched pretest and posttest regression, incorporating this student data implies a substantial reduction in the sample size. Unfortunately, the point in the semester at which the student questionnaires were administered is unknown; we only know that students were asked to complete the questionnaires at some point after the administration of the pretest and before the administration of the posttest. We, therefore, conjecture that a student's failure to complete a questionnaire reflects the student's motivations to attend and participate in class, and in turn, his or her propensity to persist to the end. This leads us to the following hypothesis:

All else equal, the student who completes a SQ is more likely to take the posttest.

This conjecture is based on the notion that interested students attend class, which means that the coefficient for NO STUDENT DATA will be negative. Students not present the day the SQ was administered may have missed for other reasons, but we will interpret this as signaling a lack of interest and greater propensity to drop the course. If the student was there the day of the SQ administration and still did not complete it, then lack of interest, and thus greater propensity to withdraw, is really demonstrated. However, as will be addressed, we cannot exclude the possibility that students dropped the class before the SQ was administered, making the NO STUDENT DATA endogenous and an inappropriate explanatory variable for predicting a missing posttest.

Whether the ith student does (T_i = 1) or does not (T_i = 0) take the posttest is observable. We assume, however, that there is an unobservable continuous dependent variable T_i* underlying the student's decision to take the posttest. We call this latent variable the student's propensity to take the posttest. More formally, if T* is the vector of students' propensities to take the posttest, H is the matrix of observed explanatory variables including the pretest, α is the vector of corresponding slope coefficients, and w is the vector of error terms, then the ith student's propensity to take the posttest is given by

T_i* = H_i α + w_i   (2)

where
T_i = 1, if T_i* > 0, and student i takes the posttest, and
T_i = 0, if T_i* ≤ 0, and student i does not take the posttest.

For estimation purposes, the error term w_i is assumed to be a standard normal random variable that is independently and identically distributed with the other error terms in the w vector.

The probit model of the propensity to take the posttest in Eq. (2) was estimated using the maximum likelihood routine in LIMDEP.EXE (Sept. 1999). The estimated probit coefficients are given in Table 4.

Our findings indicate support for all three of the hypotheses in this section. When NO STUDENT DATA is included as a covariate in the regression specification, the coefficient estimate for INITIAL CLASS SIZE is negative (-0.005) and highly significant with a one-tail p-value of 0.006; the coefficient estimate for STUDENT PRETEST SCORE is positive (0.022) and highly significant with a one-tail p-value of 0.01; and the coefficient estimate for NO STUDENT DATA is negative (-1.931) and highly significant with a one-tail p-value that is approximately zero at the level of precision we have here. These results indicate that failing to complete a SQ (NO STUDENT DATA) makes an important contribution to explaining the propensity to take the posttest.

Although the NO STUDENT DATA variable is an important explanatory variable, a comparison of results between the two columns in Table 4 shows that INITIAL CLASS SIZE makes a similar and significant contribution to explaining the propensity to take the posttest regardless of whether NO STUDENT DATA is included in the probit as a regressor. When NO STUDENT DATA is included, the coefficient estimate on INITIAL CLASS SIZE is -0.005, and when NO STUDENT DATA is not included, the coefficient estimate on INITIAL CLASS SIZE is -0.004. In both cases, the coefficient estimates are significant at the one percent Type I error level.

Table 4
Probit model estimates of the propensity to take the posttest

Dependent variable: PROPENSITY TO TAKE THE POSTTEST

Regression results (a)

Independent variables        With No student data    Without No student data
                             Coefficient estimate (Standard error)

Constant                       0.995** (0.243)         0.170   (0.199)
STUDENT PRETEST SCORE          0.022** (0.010)         0.042** (0.008)
INITIAL CLASS SIZE            -0.005** (0.002)        -0.004** (0.002)
DOCTORAL INSTITUTION           0.976** (0.146)         0.412** (0.118)
MASTERS INSTITUTION            0.407** (0.139)         0.091   (0.115)
BACHELORS INSTITUTION          0.521** (0.177)         0.245*  (0.144)
INSTRUCTOR GENDER              0.199*  (0.092)         0.125*  (0.075)
INSTRUCTORS ENGLISH            0.088   (0.134)         0.007   (0.116)
INSTRUCTOR PHD                 0.134   (0.103)         0.155*  (0.086)
NO STUDENT DATA               -1.931** (0.072)         n/a
Chi-square                     922.95                  68.69
n                              2587                    2587

(a) **Significant at 1% Type I error level (one-tail); *Significant at 5% Type I error level (one-tail).

(Note: although not reported here, if the posttest count is used as the measure of class size in the probit model of the propensity to drop, its coefficient is positive but insignificant. This positive effect is to be expected because the posttest count already excludes those who withdrew.)

6. The learning regression with an adjustment for attrition

We have argued that the class-size effect may be related to (1) the propensity to complete a course of study and take the posttest, and (2) the difference between the pretest and posttest scores. Researchers who have studied the effect of class size in a single-equation explanation of student achievement have ignored the contamination caused by attrition. Although we have addressed both outcomes, so far we have treated the two outcomes as separable events.

The effect of student attrition on measured student learning from pretest to posttest, and an adjustment for the resulting bias caused by ignoring students who do not complete the course, can be summarized with a two-equation model formed by the selection Eq. (2) and the ith student's learning:

y_i = X_i β + e_i   (3)

where y = (y1 − y0) is a vector of randomly selected change scores, X is the matrix of explanatory variables, and again the subscript i indicates the ith student's record in the ith row. β is a vector of coefficients corresponding to X. Each of the disturbances in vector e is assumed to be distributed bivariate normal with the corresponding disturbance term in the w vector of the selection Eq. (2). Thus, for the ith student we have

(e_i, w_i) ~ bivariate normal (0, 0, σ_e, 1, ρ)   (4)

and for all perturbations in the two-equation system we have

E(e) = E(w) = 0, E(ee′) = σ_e² I, E(ww′) = I, and E(ew′) = ρσ_e I.   (5)

That is, the disturbances have zero mean, unit variance, and no covariance among students, but there is covariance between selection and the post-test score for a student.

The learning equation specification (which places the pretest on the left-hand side, as opposed to having its coefficient estimated with bias as an explanatory variable on the right-hand side in a post-test regression) ensures the identification of Eq. (3), although both Eqs. (2) and (3) are identified by the difference in functional forms. Estimates of the parameters in Eq. (3) are desired, but the ith change score (y_i) is observed for only the subset of students for whom T_i = 1. The regression for this censored sample of 2077 students is

E(y_i | X_i, T_i = 1) = X_i β + E(e_i | T_i* > 0); i = 1, 2, …, 2077.   (6)

Similar to omitting a relevant variable from a regression, selection bias is a problem because the magnitude of E(e_i | T_i* > 0) varies across individuals and yet is not included in the estimation of Eq. (3) for the 2077 students. To the extent that e_i and w_i (and thus T_i*) are related, the estimators are biased.
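The bias in Eq. (6) is easy to see numerically. Under the distributional assumptions of Eqs. (4) and (5), the learning-equation disturbance has mean zero in the full population but not among posttest takers. The sketch below checks this by simulation; the values of ρ, σ_e, and the selection index are hypothetical choices for illustration, not estimates from the data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
rho, sigma_e = 0.6, 4.0   # hypothetical values for illustration

# Draw (e, w) bivariate normal with corr(e, w) = rho, Var(e) = sigma_e^2,
# and Var(w) = 1, as in Eqs. (4) and (5).
w = rng.normal(0.0, 1.0, n)
e = sigma_e * (rho * w + np.sqrt(1.0 - rho**2) * rng.normal(0.0, 1.0, n))

# Selection: the posttest is taken when the latent propensity H.alpha + w
# crosses zero (a hypothetical standard normal index stands in for H.alpha).
index = rng.normal(0.0, 1.0, n)
taken = index + w > 0

# E(e) = 0 overall, but E(e | T = 1) > 0 when rho > 0, so OLS fit only to
# the censored sample of takers omits a systematic component.
print(round(e.mean(), 3), round(e[taken].mean(), 3))
```

The first printed mean is near zero; the second is well above zero, which is exactly the individual-specific term E(e_i | T_i* > 0) that the censored regression of Eq. (6) ignores.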

The learning regression involving matched pretest and posttest scores can be adjusted for student attrition during the course in several ways. An early Heckman (1979) solution to the sample selection problem is to rewrite the omitted variable component of the regression so that the equation to be estimated is

E(y_i | X_i, T_i = 1) = X_i β + (ρσ_e)λ_i; i = 1, 2, …, 2077   (7)

where λ_i = φ(T_i*)/[1 − Φ(T_i*)], and φ(.) and Φ(.) are the normal density and distribution functions. The inverse Mills ratio (or hazard) λ_i is the standardized mean of the disturbance term w_i for the ith student who took the posttest; it is close to zero only for those well above the T = 1 threshold. The values of λ are generated from the estimated probit Eq. (2). Each student in the learning regression gets a calculated value λ_i, with the vector of these values serving as a shift variable in the learning regression.

The estimates of both ρ and σ_e and all the other coefficients in Eqs. (2) and (3) are obtained simultaneously using the maximum likelihood routine in LIMDEP, and shown in Table 5. Again, because we cannot exclude the possibility that NO STUDENT DATA is endogenous to some degree, we report results with its use (1st column) and without its use (2nd column) as an explanatory variable of selection and learning.

When NO STUDENT DATA is omitted, a highly significant negative class-size (INITIAL CLASS SIZE) effect is evident for learning (-0.014, with a one-tail p-value of 0.007). Likewise, class size (INITIAL CLASS SIZE) is highly significant in selection (-0.004). The highly significant estimate of ρ (0.637) suggests that these two equations should be estimated simultaneously.

When NO STUDENT DATA is included as a regressor, the class-size effect continues to be significant in both selection (-0.005) and learning (-0.010), but the estimate of ρ (0.037) and the estimate of the NO STUDENT DATA coefficient in the learning equation (0.632) are both insignificant. This insignificance of both the NO STUDENT DATA coefficient in the learning equation and ρ may be the result of the endogeneity of NO STUDENT DATA. The important point for our purpose is that when attrition is taken into consideration, the effect of initial class size, on both the propensity to take the final exam and knowledge gained, is negative.

Table 5
Maximum likelihood estimates of the selection and corrected learning equations

Regression results (a)

Independent variables        With No student data    Without No student data
                             Coefficient estimate (Standard error)

Selection (probit) equation. Dependent variable: PROPENSITY TO TAKE THE POSTTEST
Constant                       0.990** (0.240)         0.096   (0.197)
STUDENT PRETEST SCORE          0.023** (0.009)         0.047** (0.0073)
INITIAL CLASS SIZE            -0.005** (0.002)        -0.004** (0.002)
DOCTORAL INSTITUTION           0.972** (0.151)         0.415** (0.120)
MASTERS INSTITUTION            0.404** (0.144)         0.108   (0.115)
BACHELORS INSTITUTION          0.515** (0.191)         0.311*  (0.144)
INSTRUCTOR GENDER              0.199*  (0.091)         0.154*  (0.075)
INSTRUCTORS ENGLISH            0.086   (0.119)         0.007   (0.119)
INSTRUCTOR PHD                 0.132   (0.098)         0.125   (0.087)
NO STUDENT DATA               -1.929** (0.071)         n/a

Corrected learning equation. Dependent variable: STUDENT CHANGE SCORE
Constant                       6.818** (0.724)         5.345** (0.701)
INITIAL CLASS SIZE            -0.010*  (0.006)        -0.014** (0.0056)
DOCTORAL INSTITUTION           1.997** (0.554)         2.464** (0.427)
MASTERS INSTITUTION            0.362   (0.433)         0.299   (0.422)
BACHELORS INSTITUTION          2.232** (0.505)         2.468** (0.495)
INSTRUCTOR GENDER              0.394   (0.253)         0.464*  (0.261)
INSTRUCTORS ENGLISH            2.743** (0.380)         2.647** (0.401)
INSTRUCTOR PHD                 0.642*  (0.290)         0.957** (0.306)
NO STUDENT DATA                0.632   (1.269)         n/a
SIGMA(1)                       4.357** (0.070)         4.758** (0.162)
RHO(1,2)                       0.037   (0.357)         0.637** (0.095)
Log likelihood                -6826.47                -7248.03
n                              2587                    2587

(a) **Significant at 1% Type I error level (one-tail); *Significant at 5% Type I error level (one-tail).

7. Conclusion

Researchers are using the NCEE's 3rd edition of the TUCE data set to explore the teaching and learning of the principles of economics. However, they fail to adequately consider the missing data problems caused by the removal of students who did not answer all 30 questions on the pretest and the posttest, students who did not complete a student questionnaire, and the attrition of students during the course. We have provided an analysis of the different ways in which these missing data might influence student performance.

Our results are not consistent with the standard multiple-choice test findings that class size does not matter in college level economics. We find, regardless of the specification considered, that the initial class-size measure is negatively related to learning, all else equal. This is, at least in part, the result of students in larger classes being more likely than students in smaller classes to withdraw from the course before taking the posttest.

We speculate that other researchers' finding of no class-size effect is based on their conditioning on only those who complete a post-test, with the statistical analysis performed on the restricted sample for which there is a pretest, posttest, and student information; when we include students who did not complete the posttest or provide student information, and control for attrition, we find a statistically significant negative class-size effect. It may be the endogeneity of the end-of-term class-size measure that yields the result that class size is irrelevant.

Our results should be taken as suggestive and tentative. First, they may be unique to the microeconomics courses from which our TUCE data are a sample; the collection of TUCE data was opportunistic and not done in accordance with the tenets of random sampling. Second, we do not have a unique theory as to what it is about initial class size that makes it significant (and end-of-term class size insignificant) in generating student learning. We can speculate that students entering larger classes get less attention from the instructional staff, are less likely to bond with other members of the class, feel disengaged in large lecture halls, and the like. It may be that a classroom is like a public good in which the likelihood of disruptive students conveying negative externalities (a congestion effect) increases with class size. Unfortunately, we do not have the data to assess these alternative theories.

By definition, missing data cannot be retrieved, and we are able only to crudely model the processes that are responsible. Nevertheless, our results are sufficient to cast doubt on studies of class size that ignore the self-selection associated with a student's decision to complete a voluntary questionnaire and persist to the final exam.

Acknowledgements

The authors thank Phillip Saunders for providing the data used in this paper and for input in the earlier stages of this study. William Greene was most helpful in solving problems encountered in moving from the DOS to the Windows version of LIMDEP. Special thanks are given to David Ribar for his attention to detail and constructive suggestions provided in the process of editing this issue of the Economics of Education Review honoring John Riew. The opinions and conclusions expressed are those of the authors and do not reflect the position of their employers or anyone other than the authors.

References

Becker, W. E. (1982). The educational process and student achievement given uncertainty in measurement. American Economic Review, 72, 229-236.
Becker, W. E. (1983). Economic education research: part III, statistical estimation methods. Journal of Economic Education, 14, 4-15.
Becker, W. E. (1997). Teaching economics to undergraduates. Journal of Economic Literature, 35, 1347-1373.
Becker, W. E., & Watts, M. (1996). Chalk and talk: a national survey on teaching undergraduate economics. American Economic Review Proceedings, 86, 448-454.
Becker, W. E., & Watts, M. (1999). How departments of economics evaluate teaching. American Economic Review Proceedings, 89, 344-349.
Card, D., & Krueger, A. B. (1996). The economic return to school quality. In W. E. Becker, & W. J. Baumol, Assessing Educational Practices: The Contribution of Economics (pp. 161-182). Cambridge: MIT Press.
Carnegie Foundation for the Advancement of Teaching (1994). A Classification of Institutions of Higher Education. Princeton, NJ: Princeton University Press.
Hanushek, E. (1994). Money might matter somewhat: a response to Hedges, Laine, and Greenwald. Educational Researcher, May, 5-8.
Heckman, J. (1979). Sample selection bias as a specification error. Econometrica, 47, 153-162.
Hedges, L., Laine, R., & Greenwald, R. (1994a). Does money matter? A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher, April, 5-14.
Hedges, L., Laine, R., & Greenwald, R. (1994b). Does money matter? Money does matter somewhat: a reply to Hanushek. Educational Researcher, May, 9-10.
Kennedy, P. (1992). How much bias from using the pretest as a regressor? Unpublished working paper, Simon Fraser University, Burnaby, BC, Canada.
Kennedy, P., & Siegfried, J. (1997). Class size and achievement in introductory economics: evidence from the TUCE III data. Economics of Education Review, 16, 385-394.
Krueger, A. B. (1999). An economist's view of class size research. Unpublished working paper, Princeton University, Princeton, NJ, December 24.
Lazear, E. (1999). Educational production. NBER Working Paper Series, National Bureau of Economic Research, No. 7349.
Maxwell, N., & Lopus, J. (1994). The Lake Wobegon effect in student self-reported data. American Economic Review Proceedings, 84, 201-205.
Sander, W. (1993). Expenditures and student achievement in Illinois. Journal of Public Economics, 52, 403-416.
Saunders, P. (1994). The TUCE III Data Set: Background information and file codes (documentation, summary tables, and five 3.5-inch double-sided, high density disks in ASCII format). New York: National Council on Economic Education.
Siegfried, J., & Kennedy, P. (1995). Does pedagogy vary with class size in introductory economics? American Economic Review Proceedings, 85, 347-351.
Siegfried, J., Saunders, P., Stinar, E., & Zhang, H. (1996). How is introductory economics taught in America? Economic Inquiry, 34, 182-192.
Tinto, V. (1987). Leaving College: Rethinking the Causes and Cures for Student Attrition. Chicago: University of Chicago Press.
