√2 would not be a valid test of this objective if they had thoroughly rehearsed the proof
in class.
Reliability of assessment
This is the requirement that the outcome of the assessment is consistent for students with the same
ability, whenever the assessment is used, whoever is being assessed, and whoever conducts the assessment.
Again, this is not too difficult to guarantee in mathematics because of the use of detailed marking
schemes. However, as we will see later, even when these are used there can be quite significant variations
between different markers. Any scheme that is sufficiently detailed to always guarantee reliability is almost
certain to be complex and unwieldy, or to render the question so anodyne as to pose little challenge.
5.4.4 Framework for Discussion of Student Assessment
We will need some overall structure for discussing assessment, and in this chapter we will take this
simply as the order in which things usually have to be done in assessment, that is:
setting assessment tasks (Section 5.5)
marking and moderation (Section 5.6)
coursework and feedback to students (Section 5.7).
In this chapter we also look briefly at the issue of evaluation of our teaching (Section 5.8). We need not say
much about this here because it will probably form part of your institutional and departmental training.
However, we can say a few things about the mathematical aspects of such things, which you might find
useful. In particular this involves the way we use the outcomes of assessment.
Essential to developing skills in assessment of mathematics is a detailed study of the actual practice of
setting and marking examination papers. To provide this we include in the appendices copies of papers,
module specifications, mark schemes and sample students' solutions for examinations in engineering
mathematics and abstract algebra. These are used in the following sections to provide nitty-gritty
exposure to the real thing. None of these is intended as good practice to imitate. Rather, regard them as case
studies designed to stimulate your own ideas, and to illustrate how you might discuss such papers in
your own department.
5.5 Setting Assessment Tasks
5.5.1 The need for Learning Objectives
By assessment tasks we mean any sort of assessment - examinations, coursework, projects, computer
aided assessment, etc. Following Principles 3 and 4 we have to design the assessment to measure what the
student is expected to have learnt, as expressed in the aims and objectives of the course, and to take account
of how it has been taught. The following exercise is intended to highlight this aspect of assessment. It is
of course not an exercise in mathematics, but an exercise in teaching, indeed of any subject, which makes
a crucial point that is sometimes overlooked, even in mathematics teaching at advanced levels.
Exercise - sum marking
Students have been asked to evaluate the product 1234 × 34 without a calculator. Mark out of ten
this attempt by one student:
      1 2 3 4
    ×     3 4
    ---------
      4 9 3 6
    3 7 1 2 0
    ---------
    4 2 0 5 6
Compare your mark with those of a few colleagues. Do you all agree? Discuss any differences. Why
are there differences? What is the correct mark?
In fact, without further information about the objectives (which would define the context and purpose)
of the question in the previous exercise there is no correct mark. If the objective is to test whether a
student knows how to multiply such numbers then although the answer is wrong, this student clearly
knows what to do and has simply made a small slip in mental arithmetic in one carry. A high mark would
therefore seem to be in order. But if the objective is to obtain exactly the right answer, with accuracy of
calculation paramount throughout and the method assumed known, then this student has not achieved
the objective and a modest mark would not be unreasonable - some might even give a low mark.
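The location of the slip can be checked mechanically. The short Python sketch below recomputes the rows of the long multiplication and compares them with the student's working as transcribed from the exercise; the function name `partial_products` is purely illustrative. It shows that only the second row is wrong and that the student's final addition is consistent with what they wrote, which is the evidence on which a method-oriented marker would rely.

```python
def partial_products(a, b):
    """Return the long-multiplication rows of a * b, one per digit of b (units digit first)."""
    rows = []
    for place, digit in enumerate(reversed(str(b))):
        rows.append(a * int(digit) * 10 ** place)
    return rows

# The student's working, transcribed from the exercise.
student_rows = [4936, 37120]
student_total = 42056

correct_rows = partial_products(1234, 34)
correct_total = sum(correct_rows)

# Compare each row the student wrote with the correct row.
for row_number, (expected, written) in enumerate(zip(correct_rows, student_rows), start=1):
    status = "ok" if expected == written else f"slip (should be {expected})"
    print(f"row {row_number}: student wrote {written}: {status}")

# Did the student add their own rows correctly?
print("student's addition is consistent:", sum(student_rows) == student_total)
print("correct product:", correct_total)
```

The method and the final addition are sound; the whole error of 100 comes from a single carry in the row for the tens digit.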
The above exercise gets to the core of setting assessment tasks - we have to be quite clear that what
we are asking the student to do will measure what we expect them to have learned. Furthermore, the
student should have been told what we expect of them. All this could simply be summarised under the
requirement that the assessment is fair, and taken very much as common sense, as indeed it was,
say, twenty years ago. However, we are now expected to articulate much more precisely what
it is that is being assessed. This is essential these days because we are now assessing a much wider range
of the population than a decade or two ago. As well as leading to a much more varied background in our
students it also means that with less personal contact with them (modularisation, larger classes, etc.) we
have to rely on assessment results to a much greater extent than in the past.
The common vehicle for specifying what we expect students to learn is the learning objective (Section
2.6). Learning objectives must be defined and published for everyone to see for a given module (see, for
example, Appendices 2 and 6), and then the assessment must measure the attainment of these objectives.
We say the assessment must be linked to (or aligned with) the learning objectives. The main
subject of this section is how to design assessment tasks for various types of objectives. This amounts to the
assessment strategy, in the same way that the teaching and learning strategy links teaching activities to the
learning objectives (Section 2.7). For an in-depth critique of learning objectives in mathematics assessment,
see Niss [59]. As in many books, however, criticism of learning objectives is often based on
intended learning that cannot easily be expressed in behaviourist objectives, neglecting the fact that there
is much intended learning that can and perhaps should be so expressed.
Example - the assessment of proof
This issue about clarity of objectives and its importance for high quality teaching and learning
is illustrated very well by the assessment of proof in mathematics, referred to already in this
book. If we set a question requiring a proof it makes a great difference whether the student has
been forewarned about the possibility of it coming up. If the student feels it might come up
and they rehearse it verbatim in advance then the answer to the question reveals little about
their real abilities in unseen proof - the skills they need may in fact be at the lowest level. On
the other hand, if it is intended that they will be able to handle unseen proofs in examinations
then they must be made aware of that possibility, and must be trained accordingly. A common
type of compromise here is to teach a particular method of proof, such as induction, but in the
examination we might set new examples of this that they have not yet seen. The point is that
we need to be clear about what skills our questions will aim to assess, and we have to inform
our students about this in the learning objectives of the module. A related issue is the extent
to which we expect students to be precise about the statements of theorems. For mathematics
students we might require explicit statements of the conditions of theorems, whereas for say
engineers we might not be too fussy as long as they have the right idea and can use a theorem.
Either way such issues need to be clarified in the learning objectives.
5.5.2 Designing Assessment Tasks for given Learning Objectives
Exercise
You have to write a question to test what first year students have learned about Pythagoras'
theorem. Construct three questions, easy, medium and hard for this purpose. Discuss your
questions with a colleague, particularly focusing on your interpretation of easy, medium and hard in
relation to the skills that the students are expected to develop.
Almost certainly, any two attempts at this exercise by different lecturers will produce very different results.
As in the first exercise in this section, the objectives have not been specified clearly and so
we are all free to interpret the exercise as we wish and set questions accordingly. On the other hand, if
we used the examples of a learning objective and the guidance on MATHKIT given in Section 2.6 then it
is likely that different attempts at the exercise would be more consistent. In practice of course we would
not go into anything like this detail, but instead would use intuitive notions of what we expect from the
students and how we would assess them. The point of the exercise is to emphasize that in setting an assessment
task we need to decide on the balance we want between K, I and T, and design the task accordingly
(or use your own preferred means of categorizing the cognitive skills that you want to assess). As already
touched on above, note that K will usually include a whole range of factual items or results or elementary
techniques, and any practical assessment will only sample this. Also, care is needed in ensuring that the
assessment of I and T is authentic: if either has been seen before to any great extent then it simply
becomes K (a drilled higher order skill becomes a lower order skill). These days some lecturers ensure
this by stating what proportion of a question is bookwork or previously seen.
Contrary to the views expressed by some authors, most cognitive skills can be assessed by appropriately
set unseen examinations. It has to be said that this becomes increasingly difficult as the duration of the
examination shortens. And it is a matter for debate whether this can be done to any significant extent
in the ninety-minute module examination that is becoming common these days. The traditional three-
hour examination seems preferable here in that this is a more appropriate length of time to assess real
cognitive skills. Of course for the so-called transferable skills of communication, time management,
team working, etc, then one needs specialised forms of assessment such as projects or group exercises.
And for practical topics such as numerical methods one might have computer laboratory assessment, but
again this is a specialised area not dealt with here.
Straight K can be easily assessed by short objective questions, even by multiple choice or computer aided
assessment. Of course K can constitute a significant piece of work, albeit routine - for example in inverting
a matrix or finding eigenvalues. So such skills may take up a sizeable proportion of a question. But one
would normally include also an aspect of IT - in the case of matrix theory for example this is often quite
easy to do in the form of unseen proofs, or application of routine techniques to novel situations. Another
way of incorporating assessment of IT is by essay type questions (see example below). Or one might
ask for a particular technique or method to be extended or generalised to an unseen situation. Most IT
questions will be multi-step requiring students to weave ideas together (but again unseen).
An often neglected aspect of assessment of mathematics is the issue of how it is presented. You don't have
to be teaching long to come across the poor level of mathematical presentation and arguments used by
students in their solutions to examination questions. Often this comes about because they simply copy
what they have seen the lecturer do on the board - write out a list of equations (sometimes minus the
equals signs!) with little explanation. As long as they get the answer, they are happy. You may try to
encourage better presentation skills by awarding some marks for this in your mark scheme - but make
sure the students know about this.
Example - the RSA encryption algorithm
The RSA (Rivest, Shamir, Adleman - [75]) encryption algorithm is a public key encryption
technique which relies on the difficulty of prime factorization for its trap-door one-way function. It is at present
one of the most powerful encryption systems known, and since it has relatively modest background
requirements in mathematics it is an excellent motivating example in a first course
in Abstract Algebra (see Appendices 5-8). The problem is that the algorithm itself is fairly
easy to remember with practice, and it is possible for a student to work through numerical
examples in recipe mode, requiring little real understanding of the rationale and theory behind
the method. That is, the actual coding and decoding can be, and frequently is, reduced
to straight K. It relies on three key theorems but even these can be learnt verbatim, reducing
them essentially to K. But the students are required to understand and be able to adapt and
use the algorithm, that is they are expected to develop skills at the IT level. One way that
this can be fostered is to get them to express and describe the algorithm and its underpinning
theory themselves, in their own words. So, they are expected to do this in coursework set, and
the relevant examination questions normally contain an essay component, as in the example
shown in the abstract algebra paper in Appendix 5. As one might expect, most students do
fine on the routine numerical example part, demonstrating good K skills, but struggle on the
descriptive part and usually lose marks on this IT component.
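To see how little the "recipe mode" demands, here is a minimal Python sketch of a numerical RSA example. The primes, exponent and message are toy values chosen purely for illustration (they are not taken from the appendices), and the names `encode` and `decode` are hypothetical. Every step is a routine calculation; nothing in the code explains why decoding recovers the message, which is exactly the part a descriptive question would probe.

```python
from math import gcd

# Toy parameters, assumed for illustration only; real keys use very large primes.
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # Euler's totient of n
e = 17                       # public exponent, must be coprime to phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)          # private exponent: inverse of e modulo phi (needs Python 3.8+)

def encode(m):
    """Encode the integer message m (0 <= m < n) with the public key (n, e)."""
    return pow(m, e, n)

def decode(c):
    """Decode the ciphertext c with the private key (n, d)."""
    return pow(c, d, n)

message = 65
cipher = encode(message)
recovered = decode(cipher)
print("n =", n, "phi =", phi, "d =", d)
print("message", message, "-> cipher", cipher, "-> recovered", recovered)
```

A student can reproduce all of this from memory. Explaining in their own words why `e * d` being congruent to 1 modulo `phi` guarantees that `decode(encode(m))` returns `m` is a different kind of task.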
Example - integration
Integrating by partial fractions is often learnt as a routine K technique, even though it is multi-step.
Indeed it is not unknown for students to do well on this question yet slip up elsewhere on
the paper on questions that utilize the component skills of splitting into partial fractions and
log integration. In the case of integration by partial fractions the process as a whole probably
provides prompts for doing the component techniques. The sort of variation on this problem
that might require IT skills (provided it is unseen) is provided by the integral included in
Question 15 in the engineering mathematics paper of Appendix 1,
\[ I_3 = \int \frac{e^{2x}}{e^{2x} - 4e^{x} + 3}\,dx. \]
This is actually quite subtle at this level - the student does have to link together a couple of
ideas, substitution and partial fractions, they need to know their exponential function very
well and need to be able to transfer the idea of partial fractions to a new environment. Of
course, if they have seen such things before then it immediately becomes K again!
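The chain of ideas can be made explicit with a worked sketch. It assumes the integral is $I_3 = \int e^{2x}/(e^{2x} - 4e^{x} + 3)\,dx$, with a minus sign on the middle term of the denominator; this sign is an assumption, since it is not legible in the source. The substitution $u = e^{x}$ is one natural route, not necessarily the one in the Appendix 1 mark scheme.

```latex
\begin{align*}
u = e^{x},\quad du = e^{x}\,dx
  &\;\Longrightarrow\;
  I_3 = \int \frac{e^{x}\cdot e^{x}\,dx}{e^{2x} - 4e^{x} + 3}
      = \int \frac{u}{u^{2} - 4u + 3}\,du
      = \int \frac{u}{(u-1)(u-3)}\,du, \\[4pt]
\frac{u}{(u-1)(u-3)} &= \frac{A}{u-1} + \frac{B}{u-3}
  \;\Longrightarrow\; u = A(u-3) + B(u-1), \\[4pt]
u = 1:&\quad 1 = -2A \;\Rightarrow\; A = -\tfrac{1}{2},
  \qquad
u = 3:\quad 3 = 2B \;\Rightarrow\; B = \tfrac{3}{2}, \\[4pt]
I_3 &= \tfrac{3}{2}\ln\lvert e^{x} - 3\rvert
      - \tfrac{1}{2}\ln\lvert e^{x} - 1\rvert + C .
\end{align*}
```

Differentiating the result gives $\tfrac{3}{2}\,e^{x}/(e^{x}-3) - \tfrac{1}{2}\,e^{x}/(e^{x}-1) = e^{2x}/\bigl((e^{x}-1)(e^{x}-3)\bigr)$, which confirms the answer. Each individual step is routine; the higher order demand lies in seeing that the substitution turns the problem into a familiar partial fractions integral.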
Exercise - assessing the objective
Think of a topic you teach and formulate a learning objective on some particular aspect - remember,
you need:
an action verb
conditions
standards
some categorisation of the types of cognitive skills involved
Discuss, with a colleague, how you would assess student achievement of the objective by a suitable
task/question.
5.5.3 Matching Assessment to the Student Profile
Just as we need to consider the background of the students when we are actually teaching them, so we
also have to think of this in assessing them. In mathematics there is one example of this issue that particularly
emphasises its importance - namely the service teaching issue. For example there are many
ways that a question in elementary integration might look completely different for a first year engineering
class and for a first year mathematics honours class. Of course the language might look different.
With the mathematicians we may be talking about continuous, differentiable, integrable functions, etc.
For engineers it may be better to avoid such terminology - maybe put them under the "well behaved"
umbrella. But there is more to it than that.
With engineers, for example, one might be more lenient on the level of rigour and precision expected (up
to a point). Questions may be more oriented towards applying techniques, with less proof expected
from engineers. We may specify the method to help them, or we may allow them to use any valid
method if they have shown evidence of learning. Because they may have seen mathematics from a range
of different perspectives (say in their other engineering subjects), we may be more tolerant of slips in
notation, terminology and approach. The point is that with engineers our attitude to their abilities in
mathematics might be something like "This is only a tool to them; has this student shown that (s)he can
use it provided they have the resources, and could in practice iron out the few slips they have made in
their solution?" On the other hand for the mathematicians it might be "This is fundamental material for
them; do they have a thorough understanding and facility in this, able to work quickly and accurately,
adapting as necessary?" The difference between these two approaches will clearly influence the type of
questions set.
Example
In complex analysis we often refer to regions of the form 0 < |z| < , as in say the statement
of Laurent's theorem. This should be fine for the educated mathematics student. But for
engineers for example, when doing complex variable, it might be better to simply refer to it
as a punctured disc. In the heat of an examination a proliferation of symbols in a question
might wrong-foot the engineer, when all we really want to know is whether they can actually
use the relevant result. On the other hand, mathematics students should be fluent in such
things and one would expect them to be comfortable with, even to prefer, a symbolic formalism.
Another area in which assessment may need to be geared to the background of students is if we have a
significant number of foreign students whose first language may not be English. Whilst mathematics is
virtually the only universal language it is also true to say that when a native language is used in mathematics
it is often to convey subtlety which may be lost on non-native speakers. Amongst all the other
things we have to consider in assessment it is very hard to set papers that take such things into account.
Usually it is only after the event, when the scripts come in, or students ask you questions in the examination,
that you realise a question may be ambiguous or even meaningless to a foreign student. When
this happens, you may need to take a lenient view in the marking (although of course with anonymous
marking you won't know who the foreign students are, so you will have to play this by ear).
And, of course, you may be a foreign lecturer. Your whole approach to such things as assessment may
be influenced by your own cultural background, as well as language. Some nationalities provide less
support for their students than might be the norm in the UK. Some may be used to quite severe policies
on marking. So, if you are a foreign lecturer, new to UK teaching, then discuss widely with colleagues to
find out as much as possible about UK conventions and other matters of assessment.
Another way in which you might need to learn about the students is in terms of your expectations of
their abilities. Research on the expectations that lecturers have of their students' abilities has shown
[20] a significant gap between expectations and reality. And that was before the widening participation
agenda reached the levels of some 40% of the eligible age group! In some ways, being younger than
many academic staff the new lecturer will be nearer to their students and perhaps understand better
what their backgrounds might really be. On the other hand the chances are that the new lecturer will
come from an institution with highly qualified entrants where the standards are much higher than for
the students they have to teach. Then older more experienced hands may have a better idea of what the
base-line is. Either way, we need to familiarise ourselves with the real abilities of our students and pitch
our assessment accordingly. This is not lowering standards, but good teaching (Principle 5).
5.5.4 Validity and Reliability of Assessment Tasks
Ensuring that an assessment task measures the skills it is intended to (validity), and that it will do so
consistently in different situations (reliability) is important and difficult. These days we usually have
moderators, checkers, collaborators, etc to help us in setting our examination papers, and this may help
improve validity and reliability. Other practical measures that have been suggested are given below, most
of these being self-evident in the context of mathematics.
Improving validity:
matching assessment to the learning objectives
testing a wider range of objectives
using a number of assessments that can be compared against each other
testing under secure conditions to avoid cheating
improving the reliability of the assessment.
Improving reliability:
setting clear unambiguous questions
ensuring questions are at the right level and standard
imposing realistic time limits
using a rigorous marking scheme with precise criteria
using moderation to check setting and marking
minimizing the choices available on an examination
increasing the range of assessment methods and the duration of the assessment.
5.5.5 Methods of Assessment
As noted earlier, methods of assessment, viewed as measuring the level of attainment of learning objectives
(i.e. what the students have learnt), must be matched to the learning objectives and to the student profile,
and must be practical. Across the range of academic disciplines there is a wide choice of assessment methods
to use, but often practical issues determine the choice. Challis et al [18] describe the many assessment
methods one might use in mathematics, and discuss the functions they may fulfil.
Overwhelmingly, the usual assessment method in mathematics is the unseen time-limited examination.
There is a great deal of criticism of such assessment in the literature, but as mentioned earlier, despite
claims to the contrary, a properly set unseen examination of sufficient duration can assess any cognitive
skill. No one would suggest that Olympiad test papers are in some way deficient in exercising cognitive
skills!
Between 1998 and 2000 the Quality Assurance Agency conducted what turned out to be the last detailed
department by department evaluation of mathematics provision in the UK [27]. This was the most thorough
and detailed investigation of what mathematics departments did for their students in the UK, and
it produced volumes of useful information. One of the aspects studied was student assessment [8], and
in this there were many comments about the range of assessment, probably because this is regarded
as one of the desirable features of good effective assessment in some disciplines. The argument is that
there should be a range of different types of assessment methods, to assess the various types of skills one
wishes to develop. In some subjects this principle is essential and has long been employed. For example,
in a practical science such as Chemistry students have to develop practical skills in the laboratory, and so
they are also assessed in these.
In the past, mathematics has not been seen as a practical subject and has traditionally been examined by
unseen examinations usually lasting three hours (or module examinations of 90 minutes, while International
Mathematical Olympiad examinations, for example, range from three to five hours). But in recent
years, with increased use of computers in core mathematical subjects, and the increasing emphasis on
transferable skills, mathematicians are expected to develop a wider range of skills, hence the call for a
wider range of assessment methods. However, it is not the wider range per se that is necessary, but the
need to match assessment methods to the learning objectives. Thus, if a particular degree programme
is solely devoted to pure mathematics, then it is quite possible that traditional examination assessment
methods are appropriate, although even here such things as coursework and projects are now becoming
commonplace [41]. In what follows we are going to focus mainly on the time-limited unseen examination,
since that is probably what most of us will be involved with.
It is only right that in our discussion of examinations we should be aware of the reservations that some
experts have about the unseen time-limited examination as a method of assessment. The problem is that
it is easy to set bad examinations, and very difficult to set good ones, and this often brings them into
disrepute. While here we take this as an argument for thorough training (one should say education)
in setting examinations, the case for replacing them should also be considered. One of the most scholarly
discussions along these lines is by Heywood ([40], page 274) who is pessimistic about traditional
examinations. He notes that:
other than objective tests and grading systems, examinations have been shown to be relatively
unreliable
apart from errors in scoring, assessors are not always agreed on the purposes of testing or what
they should be testing
question setting is often treated as an art rather than a science
few assessors have any knowledge of the fundamentals of educational measurement
little attention is paid to the way students learn or to how assessment should be designed to enhance
learning
if the processes of examining and assessment are to be improved then more explicit statements of
criteria showing how these should influence learning will have to be made.
All this leads Heywood to propose a move towards outcomes based assessment. Heywood is actually
talking about generic aspects of assessment, and his call for sharper statements of criteria may have less
force in mathematics because of our widespread use of detailed marking schemes. But there is no doubt
that student assessment is an imperfect pseudo-science and much needs to be done to put it on a sound
foundation. This said, we still have to make the best of the current system, and any new lecturer will
have to learn to operate within that.
5.5.6 The Unseen Time Limited Examination
This traditional form of assessment, usually closed book (although there may be formulae sheets, for example),
is easy to present, and each student has exactly the same assessment to complete. The usual
format for the unseen time-limited examination used to be the end of year three-hour paper, but with
modularisation this has sometimes been reduced, for example, to one-and-a-half-hour examinations (or
two-hour, particularly in the final year) for each module. Often such changes have been brought in without
careful thought for their implications. For example, current modular examination schemes do not
always allow time for the assimilation of deep learning or for a student to demonstrate their in-depth
understanding of a topic if the examination immediately follows the module. The assessment of higher
order thinking skills usually requires longer examinations. In this section we are however talking about
the unseen time-limited closed book examination in general, regardless of duration.
In putting together such a paper, we have to think about the number of questions, the choice open to the
student, the duration of the paper and individual questions, etc. There is a wide range of formats - e.g. a
three-hour examination with 5 questions from 8, or a one-and-a-half-hour examination with 3 from 5. Your department may have a
standard format that you can adopt. Usually the examination is a compromise between what is desirable
for enabling the student to demonstrate what they have learnt, and what is practicable with the resources
available. One could for example assess a 100 hour module with a half hour multiple choice computer
marked test. This would be very efficient and save a lot of time, but it would not really give the students a
fair chance to show what they can do. On the other hand we could give each student a six-hour paper and
a one-hour viva. This would probably give every student a more than fair opportunity to demonstrate
their abilities, and would give a very accurate picture of what they know and can do. But it would be
impractical for most departments. The three-hour end of year exam evolved over a long period of such
compromise and had probably got it about right (at least for mathematics). The assumption that two
one-and-a-half-hour examinations are the same as one three-hour examination is however debatable.
One examination paper format sometimes used, particularly when we have a very wide range of student
ability, such as in engineering mathematics classes, is the Section A and B type paper (See Appendix
1). In this, Section A contains a largish number of relatively straightforward mainly K type questions
spreading across the full syllabus, and all these questions have to be attempted. Then Section B contains
a smaller number of longer, harder questions, from which there is some limited choice, and which contain
significant IT components. This gives the weaker students the opportunity to demonstrate some range
of basic understanding across the module, while still providing a challenge for the stronger students.
Once an overall paper format has been decided we have to design the questions to test what we expect
the students to have learnt. This involves a lot of professional judgement and experience. Here we can
only give some rough ideas, and by far your best input on this will be from discussions with as many
colleagues as possible, looking at past papers, or those of other institutions. And of course as a student
yourself you will have seen many examination papers. But, this said, when you actually come to put the
paper together your main thought should be for the students you have actually taught and for whom the
examination is intended. It is not good practice to take questions off the peg, from books or other papers
- they may be used for ideas, but should be rewritten with your students in mind. The questions should
be your questions, for your module, for your students. While you are designing the questions you will
probably have lots of ideas that you can't use all at once - just bank them for future use.
Overall the paper should cover the bulk of the main ideas of the module, and should test a range of
cognitive skills. How you measure this and ensure reasonable coverage is again a matter of judgement.
Using something like the KIT categorisation most questions should have a fair distribution of these types
of skills, and the paper as a whole should be structured to ensure that a high mark is only possible with
considerable evidence of higher order skills. This can be a difficult balancing act.
Example
Suppose you are examining a first year methods course comprising advanced calculus, differential
equations, complex numbers, matrix theory, and vectors. Each of these topics would
have to have a corresponding question or part of a question. Also across the paper as a whole
the student should be called upon to demonstrate each type of skill from KIT. It is well known
that, at least at the elementary level, it is difficult to set high I and T questions in matrix and
vector algebra, and equally difficult to set low I and T questions in calculus. As a consequence,
if the paper is not carefully set it is quite possible for a student to concentrate on say the algebraic
questions and achieve a good mark with relatively little use of higher order skills, and
having done little calculus. One way round this is to mix algebra and calculus in a single
question, but then it is difficult to fit in significant higher order skills in both topics. This difficult
issue is ultimately a matter for your judgement, and whatever departmental mechanisms
there are for supporting you in designing questions.
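One rough way to check a draft paper for this weakness is to tabulate the K, I and T marks in each question and ask how much a student could score from K alone under the most favourable choice of questions. The Python sketch below does this for a "3 from 5" paper. The question names and mark breakdowns are entirely hypothetical, invented for illustration, and the approach is only an aid to judgement, not a substitute for it.

```python
from itertools import combinations

# Hypothetical (K, I, T) mark breakdown for each 20-mark question; not taken from any real paper.
paper = {
    "Q1 matrices": (14, 3, 3),
    "Q2 vectors": (13, 4, 3),
    "Q3 complex numbers": (10, 5, 5),
    "Q4 calculus": (6, 7, 7),
    "Q5 differential equations": (7, 6, 7),
}
QUESTIONS_TO_ANSWER = 3      # a "3 from 5" format
MARKS_PER_QUESTION = 20

def k_marks(choice):
    """Total K marks available in a given choice of questions."""
    return sum(paper[question][0] for question in choice)

# Find the choice of questions that maximises marks obtainable from K alone.
best_choice = max(combinations(paper, QUESTIONS_TO_ANSWER), key=k_marks)
max_k_only = k_marks(best_choice)
total_available = MARKS_PER_QUESTION * QUESTIONS_TO_ANSWER

print("most K-heavy choice:", ", ".join(best_choice))
print(f"marks from K alone: {max_k_only}/{total_available} "
      f"({100 * max_k_only / total_available:.0f}%)")
```

With these invented figures a student who picks the three algebraic questions could reach 37 of 60 marks on K alone, with no calculus at all, which is the kind of outcome the example warns against.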
A few general examples might help. Thus a typical basic knowledge objective question, testing little
more than memory, would have mostly K; a good applied question, testing applications, would have
some K and a lot of T. A good all round question might have a fair spread. A standard type of question
might have say 8/20 K, 6/20 I, and 6/20 T. This is a sort of quantification of what an experienced lecturer
might suggest - "Start the question with some routine material to give all the students a chance, have
some more substantial material of medium difficulty for the good students, and then something hard for
the very best". It is a lecturer's informed professional judgement that must decide such matters and for
example would take into account what (s)he has done in class with the students. Thus, as noted earlier
the proof of the irrationality of
3 subtends an angle of 30