
1. Sets
Set theory is the foundation of probability and statistics, as it is for almost every branch of mathematics.

Sets and subsets


First, a set is simply a collection of objects; the objects are referred to as elements of the set. The statement that
x is an element of set S is written x ∈ S, and the negation that x is not an element of S is written x ∉ S. By
definition, a set is completely determined by its elements; thus sets A and B are equal if they have the same
elements:

A = B if and only if x ∈ A ⇔ x ∈ B


If A and B are sets then A is a subset of B if every element of A is also an element of B:

A ⊆ B if and only if x ∈ A ⟹ x ∈ B


Concepts in set theory are often illustrated with small, schematic sketches known as Venn diagrams, named for
John Venn. The Venn diagram in the picture below illustrates the subset relation.

A ⊆ B
Membership is a primitive, undefined concept in naive set theory, the type we are studying here. However, the
following construction, known as Russell's paradox, after the mathematician and philosopher Bertrand Russell,
shows that we cannot be too cavalier in the construction of sets.
Let R be the set of all sets A such that A ∉ A. Then R ∈ R if and only if R ∉ R.
Proof:
The contradiction follows from the definition of R: if R ∈ R, then by definition, R ∉ R. If R ∉ R, then by
definition, R ∈ R. The net result, of course, is that R is not a well-defined set.
Usually, the sets under discussion in a particular context are all subsets of a well-defined, specified set S, often
called a universal set. The use of a universal set prevents the type of problem that arises in Russell's paradox.
That is, if S is a given set and p(x) is a predicate on S (that is, a valid mathematical statement that is either true
or false for each x ∈ S), then {x ∈ S : p(x)} is a valid subset of S. Defining a set in this way is known as
predicate form. The other basic way to define a set is simply by listing its elements; this method is known as list
form.
In contrast to a universal set, the empty set, denoted ∅, is the set with no elements.

∅ ⊆ A for every set A.


Proof:

∅ ⊆ A means that x ∈ ∅ ⟹ x ∈ A. Since the premise is false, the implication is true.
Suppose that A, B and C are subsets of a set S. The subset relation is a partial order on the collection of subsets
of S. That is,
1. A ⊆ A (the reflexive property).
2. If A ⊆ B and B ⊆ A then A = B (the anti-symmetric property).
3. If A ⊆ B and B ⊆ C then A ⊆ C (the transitive property).
If A ⊆ B and A ≠ B, we sometimes write A ⊂ B, and A is said to be a strict subset of B. If A ⊂ B, then A
is also called a proper subset of B. If S is a set, then the set of all subsets of S is known as the power set of S and is
denoted P(S).
Special Sets
The following special sets are used throughout this text. Defining them will also give us practice using list and
predicate form.

ℝ, the set of real numbers, and the universal set for the other subsets in this list.

ℕ = {0, 1, 2, …}, the set of natural numbers

ℕ₊ = {1, 2, 3, …}, the set of positive integers

ℤ = {…, −2, −1, 0, 1, 2, …}, the set of integers

ℚ = {x ∈ ℝ : x = m/n for some m ∈ ℤ and n ∈ ℕ₊}, the set of rational numbers

𝔸 = {x ∈ ℝ : p(x) = 0 for some polynomial p with integer coefficients}, the set of algebraic
numbers.

Note that ℕ₊ ⊂ ℕ ⊂ ℤ ⊂ ℚ ⊂ 𝔸 ⊂ ℝ. We will also occasionally need the set of complex numbers

ℂ = {x + iy : x, y ∈ ℝ} where i is the imaginary unit.

Set Operations

We are now ready to review the basic operations of set theory. For the following definitions, suppose that A and
B are subsets of a universal set, which we will denote by S.


The union of A and B is the set obtained by combining the elements of A and B:

A ∪ B = {x ∈ S : x ∈ A or x ∈ B}

The intersection of A and B is the set of elements common to both A and B:

A ∩ B = {x ∈ S : x ∈ A and x ∈ B}

If the intersection of sets A and B is empty, then A and B are said to be disjoint. That is, A and B are disjoint
if A ∩ B = ∅.
The set difference of B and A is the set of elements that are in B but not in A:

B ∖ A = {x ∈ S : x ∈ B and x ∉ A}

Sometimes (particularly in older works and particularly when A ⊆ B), the notation B − A is used instead of
B ∖ A. When A ⊆ B, B ∖ A is known as proper set difference.

The complement of A is the set of elements that are not in A:

Aᶜ = {x ∈ S : x ∉ A}

Note that union, intersection, and difference are binary set operations, while complement is a unary set
operation.
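These operations are easy to experiment with on finite sets. As an illustrative sketch (the sets here are arbitrary examples), Python's built-in set type supports all four, with the relative complement S - A playing the role of Aᶜ:

```python
# Illustrative only: Python sets implement union, intersection, difference,
# and (relative) complement directly on finite sets.
S = set(range(10))      # the universal set {0, 1, ..., 9}
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

print(A | B)   # union A ∪ B: {1, 2, 3, 4, 5, 6}
print(A & B)   # intersection A ∩ B: {3, 4}
print(B - A)   # set difference B ∖ A: {5, 6}
print(S - A)   # complement of A relative to S
```

The operators |, &, and - mirror ∪, ∩, and ∖; there is no built-in unary complement, so Aᶜ is computed as S - A.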
In the Venn diagram applet, select each of the following and note the shaded area in the diagram.
1. A
2. B
3. Aᶜ
4. Bᶜ
5. A ∪ B
6. A ∩ B
Basic Rules

In the following problems, A, B, and C are subsets of a universal set S. The proofs are straightforward, and just
use the definitions and basic logic. Try the proofs yourself before reading the ones in the text.

A ∩ B ⊆ A ⊆ A ∪ B.
The identity laws:
1. A ∪ ∅ = A
2. A ∩ S = A
The idempotent laws:
1. A ∪ A = A
2. A ∩ A = A
The complement laws:
1. A ∪ Aᶜ = S
2. A ∩ Aᶜ = ∅
The double complement law: (Aᶜ)ᶜ = A
The commutative laws:
1. A ∪ B = B ∪ A
2. A ∩ B = B ∩ A
Proof:
These results follow from the commutativity of the or and and logical operators.
The associative laws:
1. A ∪ (B ∪ C) = (A ∪ B) ∪ C
2. A ∩ (B ∩ C) = (A ∩ B) ∩ C
Proof:
These results follow from the associativity of the or and and logical operators.

Thus, we can write A ∪ B ∪ C without ambiguity. Note that x is an element of this set if and only if x is an
element of at least one of the three given sets. Similarly, we can write A ∩ B ∩ C without ambiguity. Note that x
is an element of this set if and only if x is an element of all three of the given sets.
The distributive laws:
1. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
2. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
Proof:
1. x ∈ A ∩ (B ∪ C) if and only if x ∈ A and x ∈ B ∪ C if and only if x ∈ A and either x ∈ B or x ∈ C if and
only if x ∈ A and x ∈ B, or, x ∈ A and x ∈ C if and only if x ∈ A ∩ B or x ∈ A ∩ C if and only if
x ∈ (A ∩ B) ∪ (A ∩ C).
2. The proof is exactly the same as (a), but with or and and interchanged.
De Morgan's laws (named after Augustus De Morgan):
1. (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
2. (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
Proof:
1. x ∈ (A ∪ B)ᶜ if and only if x ∉ A ∪ B if and only if x ∉ A and x ∉ B if and only if x ∈ Aᶜ and x ∈ Bᶜ if and
only if x ∈ Aᶜ ∩ Bᶜ
2. x ∈ (A ∩ B)ᶜ if and only if x ∉ A ∩ B if and only if x ∉ A or x ∉ B if and only if x ∈ Aᶜ or x ∈ Bᶜ if and
only if x ∈ Aᶜ ∪ Bᶜ
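Identities like these can also be checked by brute force on a small universal set. The following sketch (an illustration on a made-up three-element set, not a proof) verifies the distributive and De Morgan laws for every choice of subsets:

```python
# Brute-force verification of the distributive and De Morgan laws over
# all subsets of a small universal set S.
from itertools import combinations

S = {0, 1, 2}
subsets = [set(c) for r in range(len(S) + 1) for c in combinations(S, r)]

for A in subsets:
    for B in subsets:
        for C in subsets:
            assert A & (B | C) == (A & B) | (A & C)   # distributive law 1
            assert A | (B & C) == (A | B) & (A | C)   # distributive law 2
        assert S - (A | B) == (S - A) & (S - B)       # De Morgan's law 1
        assert S - (A & B) == (S - A) | (S - B)       # De Morgan's law 2
print("all identities verified")
```

Of course, checking every subset of one small set proves nothing in general; the element-wise proofs above are what establish the laws.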
The following statements are equivalent:
1. A ⊆ B
2. Bᶜ ⊆ Aᶜ
3. A ∪ B = B
4. A ∩ B = A
5. A ∖ B = ∅
Proof:
1. Recall that A ⊆ B means that x ∈ A ⟹ x ∈ B.
2. Bᶜ ⊆ Aᶜ means that x ∉ B ⟹ x ∉ A. This is the contrapositive of (a) and hence is equivalent to (a).
3. If A ⊆ B then clearly A ∪ B = B. Conversely suppose A ∪ B = B. If x ∈ A then x ∈ A ∪ B so x ∈ B.
Hence A ⊆ B.
4. If A ⊆ B then clearly A ∩ B = A. Conversely suppose A ∩ B = A. If x ∈ A then x ∈ A ∩ B and so x ∈ B.
Hence A ⊆ B.
5. Suppose A ⊆ B. If x ∈ A then x ∈ B and so by definition, x ∉ A ∖ B. If x ∉ A then again by definition,
x ∉ A ∖ B. Thus A ∖ B = ∅. Conversely suppose that A ∖ B = ∅. If x ∈ A then x ∉ A ∖ B so x ∈ B. Thus
A ⊆ B.
In addition to the special sets mentioned earlier, we also have

ℝ ∖ ℚ, the set of irrational numbers

ℝ ∖ 𝔸, the set of transcendental numbers

Since ℚ ⊂ 𝔸 ⊂ ℝ it follows that ℝ ∖ 𝔸 ⊂ ℝ ∖ ℚ; that is, every transcendental number is also irrational.
Set difference can be expressed in terms of complement and intersection. All of the other set operations
(complement, union, and intersection) can be expressed in terms of difference.
Results for set difference:
1. B ∖ A = B ∩ Aᶜ
2. Aᶜ = S ∖ A
3. A ∩ B = A ∖ (A ∖ B)
4. A ∪ B = S ∖ {(S ∖ A) ∖ [(S ∖ A) ∖ (S ∖ B)]}
Proof:
1. This is clear from the definition: B ∖ A = B ∩ Aᶜ = {x ∈ S : x ∈ B and x ∉ A}.
2. This follows from (a) with B = S.
3. Using (a), De Morgan's law, and the distributive law, the right side is

A ∩ (A ∩ Bᶜ)ᶜ = A ∩ (Aᶜ ∪ B) = (A ∩ Aᶜ) ∪ (A ∩ B) = ∅ ∪ (A ∩ B) = A ∩ B

4. Using (a), (b), De Morgan's law, and the distributive law, the right side is

[Aᶜ ∩ (Aᶜ ∩ B)ᶜ]ᶜ = A ∪ (Aᶜ ∩ B) = (A ∪ Aᶜ) ∩ (A ∪ B) = S ∩ (A ∪ B) = A ∪ B

So in principle, we could do all of set theory using the one operation of set difference. But as (c) and (d)
suggest, the results would be hideous.

(A ∪ B) ∖ (A ∩ B) = (A ∖ B) ∪ (B ∖ A).
Proof:
A direct proof is simple, but for practice let's give a proof using set algebra, in particular, De Morgan's law and
the distributive law:

(A ∪ B) ∖ (A ∩ B) = (A ∪ B) ∩ (A ∩ B)ᶜ = (A ∪ B) ∩ (Aᶜ ∪ Bᶜ)
= (A ∩ Aᶜ) ∪ (B ∩ Aᶜ) ∪ (A ∩ Bᶜ) ∪ (B ∩ Bᶜ) = (B ∖ A) ∪ (A ∖ B) = (A ∖ B) ∪ (B ∖ A)

The set in the previous result is called the symmetric difference of A and B, and is sometimes denoted A △ B.
The elements of this set belong to one but not both of the given sets. Thus, the symmetric difference
corresponds to exclusive or in the same way that union corresponds to inclusive or. That is, x ∈ A ∪ B if and
only if x ∈ A or x ∈ B (or both); x ∈ A △ B if and only if x ∈ A or x ∈ B, but not both. On the other hand, the
complement of the symmetric difference consists of the elements that belong to both or neither of the given sets:

(A △ B)ᶜ = (A ∩ B) ∪ (Aᶜ ∩ Bᶜ) = (Aᶜ ∪ B) ∩ (Bᶜ ∪ A)
Proof:
Again, a direct proof is simple, but let's give an algebraic proof for practice:

(A △ B)ᶜ = [(A ∪ B) ∩ (A ∩ B)ᶜ]ᶜ = (A ∪ B)ᶜ ∪ (A ∩ B) = (Aᶜ ∩ Bᶜ) ∪ (A ∩ B)
= (Aᶜ ∪ A) ∩ (Aᶜ ∪ B) ∩ (Bᶜ ∪ A) ∩ (Bᶜ ∪ B) = S ∩ (Aᶜ ∪ B) ∩ (Bᶜ ∪ A) ∩ S = (Aᶜ ∪ B) ∩ (Bᶜ ∪ A)
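In Python, for instance, the ^ operator on sets computes the symmetric difference directly; this sketch (with arbitrary example sets) checks it against both characterizations above:

```python
# The ^ operator on Python sets is exactly the symmetric difference.
A = {1, 2, 3}
B = {3, 4}

assert A ^ B == (A | B) - (A & B)   # (A ∪ B) ∖ (A ∩ B)
assert A ^ B == (A - B) | (B - A)   # (A ∖ B) ∪ (B ∖ A)
print(A ^ B)   # {1, 2, 4}
```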
There are 16 different (in general) sets that can be constructed from two given events A and B.
Proof:

S is the union of 4 pairwise disjoint sets: A ∩ B, A ∩ Bᶜ, Aᶜ ∩ B, and Aᶜ ∩ Bᶜ. If A and B are in general position,
these 4 sets are distinct. Every set that can be constructed from A and B is a union of some (perhaps none,
perhaps all) of these 4 sets. There are 2⁴ = 16 sub-collections of the 4 sets.
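The counting argument can be made concrete. In the sketch below (the sets S, A, B are made-up examples in general position), we form the four "atoms" and collect every union of a sub-collection:

```python
# Enumerate the sets constructible from A and B as unions of the four atoms.
from itertools import combinations

S = set(range(6))
A = {0, 1, 2}
B = {2, 3}
atoms = [A & B, A - B, B - A, S - (A | B)]   # pairwise disjoint, cover S

constructible = set()
for r in range(len(atoms) + 1):
    for combo in combinations(atoms, r):
        constructible.add(frozenset().union(*combo))   # union of a sub-collection
print(len(constructible))   # 16
```

Since the atoms here are distinct and nonempty, the 2⁴ sub-collections give 2⁴ = 16 distinct unions.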
Open the Venn diagram applet. This applet lists the 16 sets that can be constructed from given sets A and B
using the set operations.
1. Select each of the four subsets in the proof of the last exercise: A ∩ B, A ∩ Bᶜ, Aᶜ ∩ B, and Aᶜ ∩ Bᶜ. Note
that these are disjoint and their union is S.
2. Select each of the other 12 sets and show how each is a union of some of the sets in (a).

General Operations
The operations of union and intersection can easily be extended to a finite or even an infinite collection of sets.
Definitions
Suppose that 𝒜 is a nonempty collection of subsets of a universal set S. In some cases, the subsets in 𝒜 may be
naturally indexed by a nonempty index set I, so that 𝒜 = {A_i : i ∈ I}. (In a technical sense, any collection of
subsets can be indexed.)
The union of the collection of sets 𝒜 is the set obtained by combining the elements of the sets in 𝒜:

⋃𝒜 = {x ∈ S : x ∈ A for some A ∈ 𝒜}

If 𝒜 = {A_i : i ∈ I}, so that the collection of sets is indexed, then we use the more natural notation:

⋃_{i∈I} A_i = {x ∈ S : x ∈ A_i for some i ∈ I}

The intersection of the collection of sets 𝒜 is the set of elements common to all of the sets in 𝒜:

⋂𝒜 = {x ∈ S : x ∈ A for all A ∈ 𝒜}

If 𝒜 = {A_i : i ∈ I}, so that the collection of sets is indexed, then we use the more natural notation:

⋂_{i∈I} A_i = {x ∈ S : x ∈ A_i for all i ∈ I}


Often the index set is an integer interval of ℕ. In such cases, an even more natural notation is to use the upper
and lower limits of the index set. For example, if the collection is {A_i : i ∈ ℕ₊} then we would write ⋃_{i=1}^∞ A_i
for the union and ⋂_{i=1}^∞ A_i for the intersection. Similarly, if the collection is {A_i : i ∈ {1, 2, …, n}} for some
n ∈ ℕ₊, we would write ⋃_{i=1}^n A_i for the union and ⋂_{i=1}^n A_i for the intersection.
A collection of sets 𝒜 is pairwise disjoint if the intersection of any two distinct sets in the collection is empty:
A ∩ B = ∅ for every A, B ∈ 𝒜 with A ≠ B. The collection of sets 𝒜 is said to partition a set B if the collection
𝒜 is pairwise disjoint and ⋃𝒜 = B.
Partitions are intimately related to equivalence relations.
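For finite indexed collections, these general unions and intersections are easy to compute with argument unpacking; the sketch below (the collection and the set B are made-up examples) also checks the partition property:

```python
# General union of an indexed collection, and a partition check.
from itertools import combinations

S = set(range(6))
A = {0: {0, 1}, 1: {2, 3}, 2: {4, 5}}            # an indexed collection of subsets of S

union = set().union(*A.values())                 # ⋃_{i∈I} A_i
disjoint = all(P & Q == set()
               for P, Q in combinations(A.values(), 2))
assert union == S and disjoint                   # so the collection partitions S

B = {1, 2, 4}
pieces = [Ai & B for Ai in A.values()]           # the collection {A_i ∩ B : i ∈ I}
assert set().union(*pieces) == B                 # ...partitions B
print("partition checks pass")
```

The last two lines anticipate a result proved below: a partition of S induces a partition of any subset B.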
Basic Rules

In the following problems, 𝒜 = {A_i : i ∈ I} is a collection of subsets of a universal set S, indexed by a nonempty
set I, and B is a subset of S.
The general distributive laws:
1. (⋃_{i∈I} A_i) ∩ B = ⋃_{i∈I} (A_i ∩ B)
2. (⋂_{i∈I} A_i) ∪ B = ⋂_{i∈I} (A_i ∪ B)
Restate the laws in the notation where the collection 𝒜 is not indexed.
Proof:
1. x is an element of either set if and only if x ∈ B and x ∈ A_i for some i ∈ I.
2. x is an element of either set if and only if x ∈ B or x ∈ A_i for every i ∈ I.

(⋃𝒜) ∩ B = ⋃{A ∩ B : A ∈ 𝒜}, (⋂𝒜) ∪ B = ⋂{A ∪ B : A ∈ 𝒜}
The general De Morgan's laws:
1. (⋃_{i∈I} A_i)ᶜ = ⋂_{i∈I} A_iᶜ
2. (⋂_{i∈I} A_i)ᶜ = ⋃_{i∈I} A_iᶜ
Restate the laws in the notation where the collection 𝒜 is not indexed.
Proof:
1. x ∈ (⋃_{i∈I} A_i)ᶜ if and only if x ∉ ⋃_{i∈I} A_i if and only if x ∉ A_i for every i ∈ I if and only if x ∈ A_iᶜ for
every i ∈ I if and only if x ∈ ⋂_{i∈I} A_iᶜ.

2. x ∈ (⋂_{i∈I} A_i)ᶜ if and only if x ∉ ⋂_{i∈I} A_i if and only if x ∉ A_i for some i ∈ I if and only if x ∈ A_iᶜ for
some i ∈ I if and only if x ∈ ⋃_{i∈I} A_iᶜ.

(⋃𝒜)ᶜ = ⋂{Aᶜ : A ∈ 𝒜}, (⋂𝒜)ᶜ = ⋃{Aᶜ : A ∈ 𝒜}
Suppose that the collection 𝒜 partitions S. For any subset B, the collection {A ∩ B : A ∈ 𝒜} partitions B.
Proof:
Suppose 𝒜 = {A_i : i ∈ I} where I is an index set. If i, j ∈ I with i ≠ j then
(A_i ∩ B) ∩ (A_j ∩ B) = (A_i ∩ A_j) ∩ B = ∅ ∩ B = ∅, so the collection {A_i ∩ B : i ∈ I} is pairwise disjoint.
Moreover, by the distributive law,

⋃_{i∈I} (A_i ∩ B) = (⋃_{i∈I} A_i) ∩ B = S ∩ B = B

A partition of S induces a partition of B


Suppose that {A_n : n ∈ ℕ₊} is a collection of subsets of a universal set S. Then
1. ⋂_{n=1}^∞ ⋃_{k=n}^∞ A_k = {x ∈ S : x ∈ A_k for infinitely many k ∈ ℕ₊}
2. ⋃_{n=1}^∞ ⋂_{k=n}^∞ A_k = {x ∈ S : x ∈ A_k for all but finitely many k ∈ ℕ₊}
Proof:
1. Note that x ∈ ⋂_{n=1}^∞ ⋃_{k=n}^∞ A_k if and only if for every n ∈ ℕ₊ there exists k ≥ n such that x ∈ A_k. In
turn, this occurs if and only if x ∈ A_k for infinitely many k ∈ ℕ₊.
2. Note that x ∈ ⋃_{n=1}^∞ ⋂_{k=n}^∞ A_k if and only if there exists n ∈ ℕ₊ such that x ∈ A_k for every k ≥ n. In
turn, this occurs if and only if x ∈ A_k for all but finitely many k ∈ ℕ₊.
The sets in the previous result turn out to be important in the study of probability.

Product sets
Definitions
Product sets are sets of sequences. The defining property of a sequence, of course, is that order as well as
membership is important.
Let us start with ordered pairs. In this case, the defining property is that (a, b) = (c, d) if and only if a = c and
b = d. Interestingly, the structure of an ordered pair can be defined just using set theory. The construction in the
result below is due to Kazimierz Kuratowski.
Define (a, b) = {{a}, {a, b}}. This definition captures the defining property of an ordered pair.
Proof:
Suppose that (a, b) = (c, d) so that {{a}, {a, b}} = {{c}, {c, d}}. In the case that a = b, note that (a, b) = {{a}}.
Thus we must have {c} = {c, d} = {a} and hence c = d = a, and in particular, a = c and b = d. In the case that
a ≠ b, we must have {c} = {a} and hence c = a. But we cannot have {c, d} = {a} because then (c, d) = {{a}}
and hence {a, b} = {a}, which would force a = b, a contradiction. Thus we must have {c, d} = {a, b}. Since c = a
and a ≠ b we must have d = b. The converse is trivial: if a = c and b = d then {a} = {c} and {a, b} = {c, d} so
(a, b) = (c, d).
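The construction can even be carried out literally on a computer. In the sketch below, Python frozensets (immutable, hashable sets) play the role of sets, and kpair is a hypothetical helper name, not a standard function:

```python
# Kuratowski's ordered pair (a, b) = {{a}, {a, b}}, rendered with frozensets.
def kpair(a, b):
    return frozenset({frozenset({a}), frozenset({a, b})})

# Equal pairs have equal encodings, and order matters:
assert kpair(1, 2) == kpair(1, 2)
assert kpair(1, 2) != kpair(2, 1)
# In the degenerate case a = b, the pair collapses to {{a}}:
assert kpair(1, 1) == frozenset({frozenset({1})})
print("Kuratowski pairs behave like ordered pairs")
```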
For ordered triples, the defining property is (a, b, c) = (d, e, f) if and only if a = d, b = e, and c = f. Ordered
triples can be defined in terms of ordered pairs, which via the last result, uses only set theory.
Define (a, b, c) = (a, (b, c)). This definition captures the defining property of an ordered triple.
Proof:
Suppose (a, b, c) = (d, e, f). Then (a, (b, c)) = (d, (e, f)). Hence by the definition of an ordered pair, we must
have a = d and (b, c) = (e, f). Using the definition again we have b = e and c = f. Conversely, if a = d, b = e, and
c = f, then (b, c) = (e, f) and hence (a, (b, c)) = (d, (e, f)). Thus (a, b, c) = (d, e, f).


All of this is just to show how complicated structures can be built from simpler ones, and ultimately from set
theory. But enough of that! More generally, two ordered sequences of the same size (finite or infinite) are the
same if and only if their corresponding coordinates agree. Thus for n ∈ ℕ₊, the definition for n-tuples is
(x_1, x_2, …, x_n) = (y_1, y_2, …, y_n) if and only if x_i = y_i for all i ∈ {1, 2, …, n}. For infinite sequences,
(x_1, x_2, …) = (y_1, y_2, …) if and only if x_i = y_i for all i ∈ ℕ₊.
Suppose now that we have a sequence of n sets, (S_1, S_2, …, S_n), where n ∈ ℕ₊. The Cartesian product
of the sets is defined as follows:

S_1 × S_2 × ⋯ × S_n = {(x_1, x_2, …, x_n) : x_i ∈ S_i for i ∈ {1, 2, …, n}}


Cartesian products are named for René Descartes. If S_i = S for each i, then the Cartesian product set can be
written compactly as S^n, a Cartesian power. In particular, recall that ℝ denotes the set of real numbers so that
ℝ^n is n-dimensional Euclidean space, named after Euclid, of course. The elements of {0, 1}^n are called bit
strings of length n. As the name suggests, we sometimes represent elements of this product set as strings rather
than sequences (that is, we omit the parentheses and commas). Since the coordinates just take two values, there
is no risk of confusion.
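For instance, the bit strings of length 3 can be generated with itertools.product, which computes exactly this Cartesian power:

```python
# The Cartesian power {0, 1}^3, written in bit-string notation.
from itertools import product

bits = ["".join(map(str, t)) for t in product([0, 1], repeat=3)]
print(bits)        # ['000', '001', '010', '011', '100', '101', '110', '111']
print(len(bits))   # 2^3 = 8
```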
Suppose that we have an infinite sequence of sets (S_1, S_2, …). The Cartesian product of the sets is defined by

S_1 × S_2 × ⋯ = {(x_1, x_2, …) : x_i ∈ S_i for each i ∈ {1, 2, …}}

When S_i = S for i ∈ ℕ₊, the Cartesian product set is sometimes written as a Cartesian power as S^∞ or as S^{ℕ₊}.
An explanation for the last notation is given in the next section on functions. Also, notation similar to that of
general union and intersection is often used for Cartesian product, with × as the operator. Thus

∏_{i=1}^n S_i = S_1 × S_2 × ⋯ × S_n,  ∏_{i=1}^∞ S_i = S_1 × S_2 × ⋯
Rules for Product Sets
We will now see how the set operations relate to the Cartesian product operation. Suppose that S and T are sets
and that A ⊆ S, B ⊆ S and C ⊆ T, D ⊆ T. The sets in the exercises below are subsets of S × T.
The most important rules that relate Cartesian product with union, intersection, and difference are the
distributive rules:
1. A × (C ∪ D) = (A × C) ∪ (A × D)
2. (A ∪ B) × C = (A × C) ∪ (B × C)
3. A × (C ∩ D) = (A × C) ∩ (A × D)
4. (A ∩ B) × C = (A × C) ∩ (B × C)
5. A × (C ∖ D) = (A × C) ∖ (A × D)
6. (A ∖ B) × C = (A × C) ∖ (B × C)
Proof:
1. (x, y) ∈ A × (C ∪ D) if and only if x ∈ A and y ∈ C ∪ D if and only if x ∈ A and either y ∈ C or y ∈ D if
and only if x ∈ A and y ∈ C, or, x ∈ A and y ∈ D if and only if (x, y) ∈ A × C or (x, y) ∈ A × D if and
only if (x, y) ∈ (A × C) ∪ (A × D).
2. Similar to (a), but with the roles of the coordinates reversed.
3. (x, y) ∈ A × (C ∩ D) if and only if x ∈ A and y ∈ C ∩ D if and only if x ∈ A and y ∈ C and y ∈ D if and
only if (x, y) ∈ A × C and (x, y) ∈ A × D if and only if (x, y) ∈ (A × C) ∩ (A × D).
4. Similar to (c) but with the roles of the coordinates reversed.
5. (x, y) ∈ A × (C ∖ D) if and only if x ∈ A and y ∈ C ∖ D if and only if x ∈ A and y ∈ C and y ∉ D if and
only if (x, y) ∈ A × C and (x, y) ∉ A × D if and only if (x, y) ∈ (A × C) ∖ (A × D).
6. Similar to (e) but with the roles of the coordinates reversed.
In general, the product of unions is larger than the corresponding union of products.

1. (A ∪ B) × (C ∪ D) = (A × C) ∪ (A × D) ∪ (B × C) ∪ (B × D)
2. (A × C) ∪ (B × D) ⊆ (A ∪ B) × (C ∪ D)
Proof:
1. (x, y) ∈ (A ∪ B) × (C ∪ D) if and only if x ∈ A ∪ B and y ∈ C ∪ D if and only if at least one of the
following is true: x ∈ A and y ∈ C; x ∈ A and y ∈ D; x ∈ B and y ∈ C; x ∈ B and y ∈ D if and only if
(x, y) ∈ (A × C) ∪ (A × D) ∪ (B × C) ∪ (B × D)
2. This follows from (a). Note that the left side is a proper subset of the right side if (A ∖ B) × (D ∖ C) or
(B ∖ A) × (C ∖ D) is nonempty.
On the other hand, the product of intersections is the same as the corresponding intersection of products.

(A × C) ∩ (B × D) = (A ∩ B) × (C ∩ D)
Proof:

(x, y) ∈ (A × C) ∩ (B × D) if and only if (x, y) ∈ A × C and (x, y) ∈ B × D if and only if x ∈ A and y ∈ C and
x ∈ B and y ∈ D if and only if x ∈ A ∩ B and y ∈ C ∩ D if and only if (x, y) ∈ (A ∩ B) × (C ∩ D).
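These rules are easy to spot-check on small finite sets. The sketch below (with made-up sets and a hypothetical helper cross) verifies the intersection rule and exhibits a case where the union inclusion is strict:

```python
# Spot-checking the product-set rules: intersection commutes with products,
# while the union of products can be strictly smaller than the product of unions.
from itertools import product

A, B = {1, 2}, {2, 3}
C, D = {5}, {6}

def cross(X, Y):
    """The Cartesian product X × Y as a set of pairs."""
    return set(product(X, Y))

assert cross(A & B, C & D) == cross(A, C) & cross(B, D)
assert cross(A, C) | cross(B, D) < cross(A | B, C | D)   # proper subset here
print("product-set rules verified on this example")
```

For these sets, the pair (1, 6) belongs to (A ∪ B) × (C ∪ D) but to neither A × C nor B × D, which is what makes the inclusion strict.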
In general, the product of differences is smaller than the corresponding difference of products.
1. (A ∖ B) × (C ∖ D) = [(A × C) ∖ (A × D)] ∖ [(B × C) ∖ (B × D)]
2. (A ∖ B) × (C ∖ D) ⊆ (A × C) ∖ (B × D)
Proof:
1. (x, y) ∈ (A ∖ B) × (C ∖ D) if and only if x ∈ A and x ∉ B and y ∈ C and y ∉ D. On the other hand,
(x, y) ∈ [(A × C) ∖ (A × D)] ∖ [(B × C) ∖ (B × D)] if and only if
(x, y) ∈ (A × C) ∖ (A × D) and (x, y) ∉ (B × C) ∖ (B × D). The first statement means that x ∈ A and
y ∈ C and y ∉ D. The second statement is the negation of x ∈ B and y ∈ C and y ∉ D. The two
statements both hold if and only if x ∈ A and x ∉ B and y ∈ C and y ∉ D.
2. As in (a), (x, y) ∈ (A ∖ B) × (C ∖ D) if and only if x ∈ A and x ∉ B and y ∈ C and y ∉ D. This implies
that (x, y) ∈ A × C and (x, y) ∉ B × D.
Projections and Cross Sections
Suppose again that S and T are sets, and that C ⊆ S × T. The cross section of C at x ∈ S is

C_x = {y ∈ T : (x, y) ∈ C}
Similarly, the cross section of C at y ∈ T is

C^y = {x ∈ S : (x, y) ∈ C}

Note that C_x ⊆ T for x ∈ S and C^y ⊆ S for y ∈ T. We define the projections of C onto T and onto S by

C_T = {y ∈ T : (x, y) ∈ C for some x ∈ S},  C_S = {x ∈ S : (x, y) ∈ C for some y ∈ T}

The projections are the unions of the appropriate cross sections.
1. C_T = ⋃_{x∈S} C_x
2. C_S = ⋃_{y∈T} C^y
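Computed directly from the definitions, these identities look as follows on small finite sets (an illustrative sketch; section_at and the sample sets are made up):

```python
# Cross sections and projections of a subset C of S × T.
S = {1, 2, 3}
T = {'a', 'b', 'c'}
C = {(1, 'a'), (1, 'b'), (2, 'b')}       # a subset of S × T

def section_at(C, x):
    """Cross section C_x = {y : (x, y) ∈ C}."""
    return {y for (u, y) in C if u == x}

proj_T = {y for (x, y) in C}             # projection C_T
proj_S = {x for (x, y) in C}             # projection C_S

# The projection onto T is the union of the cross sections:
assert proj_T == set().union(*(section_at(C, x) for x in S))
print(proj_T, proj_S)
```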
Cross sections are preserved under the set operations. We state the result for cross sections at x ∈ S. By
symmetry, of course, analogous results hold for cross sections at y ∈ T.
Suppose that C, D ⊆ S × T. Then for x ∈ S,
1. (C ∪ D)_x = C_x ∪ D_x
2. (C ∩ D)_x = C_x ∩ D_x
3. (C ∖ D)_x = C_x ∖ D_x
Proof:
1. y ∈ (C ∪ D)_x if and only if (x, y) ∈ C ∪ D if and only if (x, y) ∈ C or (x, y) ∈ D if and only if y ∈ C_x or
y ∈ D_x.
2. The proof is just like (a), with and replacing or.
3. The proof is just like (a), with and not replacing or.
For projections, the results are a bit more complicated. We give the results for projections onto T; naturally the
results for projections onto S are analogous.
Suppose again that C, D ⊆ S × T. Then
1. (C ∪ D)_T = C_T ∪ D_T
2. (C ∩ D)_T ⊆ C_T ∩ D_T
3. (C_T)ᶜ ⊆ (Cᶜ)_T
Proof:
1. Suppose that y ∈ (C ∪ D)_T. Then there exists x ∈ S such that (x, y) ∈ C ∪ D. Hence (x, y) ∈ C so y ∈ C_T,
or (x, y) ∈ D so y ∈ D_T. In either case, y ∈ C_T ∪ D_T. Conversely, suppose that y ∈ C_T ∪ D_T. Then y ∈ C_T
or y ∈ D_T. If y ∈ C_T then there exists x ∈ S such that (x, y) ∈ C. But then (x, y) ∈ C ∪ D so y ∈ (C ∪ D)_T.
Similarly if y ∈ D_T then y ∈ (C ∪ D)_T.
2. Suppose that y ∈ (C ∩ D)_T. Then there exists x ∈ S such that (x, y) ∈ C ∩ D. Hence (x, y) ∈ C so y ∈ C_T
and (x, y) ∈ D so y ∈ D_T. Therefore y ∈ C_T ∩ D_T.
3. Suppose that y ∈ (C_T)ᶜ. Then y ∉ C_T, so for every x ∈ S, (x, y) ∉ C. Fix x_0 ∈ S. Then (x_0, y) ∉ C so
(x_0, y) ∈ Cᶜ and therefore y ∈ (Cᶜ)_T.

It's easy to see that equality does not hold in general in parts (b) and (c). In part (b) for example, suppose that
A_1, A_2 ⊆ S are nonempty and disjoint and B ⊆ T is nonempty. Let C = A_1 × B and D = A_2 × B. Then C ∩ D = ∅
so (C ∩ D)_T = ∅. But C_T = D_T = B. In part (c) for example, suppose that A is a nonempty proper subset of S
and B is a nonempty proper subset of T. Let C = A × B. Then C_T = B so (C_T)ᶜ = Bᶜ. On the other hand,
Cᶜ = (Aᶜ × T) ∪ (A × Bᶜ), so (Cᶜ)_T = T.

Computational Exercises
Subsets of ℝ
The universal set is [0, ∞). Let A = [0, 5] and B = (3, 7). Express each of the following in terms of intervals:
1. A ∩ B
2. A ∪ B
3. A ∖ B
4. B ∖ A
5. Aᶜ
Answer:

1. (3, 5]
2. [0, 7)
3. [0, 3]
4. (5, 7)
5. (5, ∞)
The universal set is ℕ. Let A = {n ∈ ℕ : n is even} and let B = {n ∈ ℕ : n ≤ 9}. Give each of the following:
1. A ∩ B in list form
2. A ∪ B in predicate form
3. A ∖ B in list form
4. B ∖ A in list form
5. Aᶜ in predicate form
6. Bᶜ in list form
Answer:
1. {0, 2, 4, 6, 8}
2. {n ∈ ℕ : n is even or n ≤ 9}
3. {10, 12, 14, …}
4. {1, 3, 5, 7, 9}
5. {n ∈ ℕ : n is odd}
6. {10, 11, 12, …}
Coins and Dice
Let S = {1, 2, 3, 4} × {1, 2, 3, 4, 5, 6}. This is the set of outcomes when a 4-sided die and a 6-sided die are tossed.
Further let A = {(x, y) ∈ S : x = 2} and B = {(x, y) ∈ S : x + y = 7}. Give each of the following sets in list form:
1. A
2. B
3. A ∩ B
4. A ∪ B
5. A ∖ B
6. B ∖ A
Answer:
1. {(2,1), (2,2), (2,3), (2,4), (2,5), (2,6)}
2. {(1,6), (2,5), (3,4), (4,3)}
3. {(2,5)}
4. {(2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (1,6), (3,4), (4,3)}
5. {(2,1), (2,2), (2,3), (2,4), (2,6)}
6. {(1,6), (3,4), (4,3)}
Let S = {0, 1}^3. This is the set of outcomes when a coin is tossed 3 times (0 denotes tails and 1 denotes heads).
Further let A = {(x_1, x_2, x_3) ∈ S : x_2 = 1} and B = {(x_1, x_2, x_3) ∈ S : x_1 + x_2 + x_3 = 2}. Give each of the following
sets in list form, using bit-string notation:
1. S
2. A
3. B
4. Aᶜ
5. Bᶜ
6. A ∩ B
7. A ∪ B
8. A ∖ B
9. B ∖ A
Answer:
1. {000, 100, 010, 001, 110, 101, 011, 111}
2. {010, 110, 011, 111}
3. {110, 011, 101}
4. {000, 100, 001, 101}
5. {000, 100, 010, 001, 111}
6. {110, 011}
7. {010, 110, 011, 111, 101}
8. {010, 111}
9. {101}
Let S = {0, 1}^2. This is the set of outcomes when a coin is tossed twice (0 denotes tails and 1 denotes heads).
Give P(S) in list form.
Answer:

{∅, {00}, {01}, {10}, {11}, {00, 01}, {00, 10}, {00, 11}, {01, 10}, {01, 11}, {10, 11}, {00, 01, 10},
{00, 01, 11}, {00, 10, 11}, {01, 10, 11}, {00, 01, 10, 11}}
Cards
A standard card deck can be modeled by the Cartesian product set

D = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, j, q, k} × {♣, ♦, ♥, ♠}
where the first coordinate encodes the denomination or kind (ace, 2-10, jack, queen, king) and where the second
coordinate encodes the suit (clubs, diamonds, hearts, spades). Sometimes we represent a card as a string rather
than an ordered pair (for example q♥). For the problems in this subsection, the card deck D is the universal set.
Let H denote the set of hearts and F the set of face cards. Find each of the following:
1. H ∩ F
2. H ∖ F
3. F ∖ H
4. H ∪ F
Answer:
1. {j♥, q♥, k♥}
2. {1♥, 2♥, 3♥, 4♥, 5♥, 6♥, 7♥, 8♥, 9♥, 10♥}
3. {j♣, q♣, k♣, j♦, q♦, k♦, j♠, q♠, k♠}
4. {1♥, 2♥, 3♥, 4♥, 5♥, 6♥, 7♥, 8♥, 9♥, 10♥, j♣, q♣, k♣, j♦, q♦, k♦, j♥, q♥, k♥, j♠, q♠, k♠}
A bridge hand is a subset of D with 13 cards. Often bridge hands are described by giving the cross sections by
suit. Thus, suppose that N is a bridge hand, held by a player named North, with

N_♣ = {2, 5, q}, N_♦ = {1, 5, 8, q, k}, N_♥ = {8, 10, j, q}, N_♠ = {1}
Find each of the following:
1. The nonempty cross sections of N by denomination.
2. The projection of N onto the set of suits.
3. The projection of N onto the set of denominations.
Answer:
1. N^1 = {♦, ♠}, N^2 = {♣}, N^5 = {♣, ♦}, N^8 = {♦, ♥}, N^10 = {♥}, N^j = {♥}, N^q = {♣, ♦, ♥}, N^k = {♦}
2. {♣, ♦, ♥, ♠}
3. {1, 2, 5, 8, 10, j, q, k}
By contrast, it is usually more useful to describe a poker hand by giving the cross sections by denomination. In
the usual version of draw poker, a hand is a subset of D with 5 cards. Thus, suppose that B is a poker hand,
held by a player named Bill, with

B^1 = {♣, ♠}, B^8 = {♣, ♠}, B^q = {♦}
Find each of the following:
1. The nonempty cross sections of B by suit.
2. The projection of B onto the set of suits.
3. The projection of B onto the set of denominations.
Answer:
1. B_♣ = {1, 8}, B_♦ = {q}, B_♠ = {1, 8}
2. {♣, ♦, ♠}
3. {1, 8, q}
The poker hand in the last exercise is known as a dead man's hand. Legend has it that Wild Bill Hickok held
this hand at the time of his murder in 1876.
General unions and intersections
For the problems in this subsection, the universal set is ℝ.
Let A_n = [0, 1 − 1/n] for n ∈ ℕ₊. Find
1. ⋂_{n=1}^∞ A_n
2. ⋃_{n=1}^∞ A_n
3. ⋂_{n=1}^∞ A_nᶜ
4. ⋃_{n=1}^∞ A_nᶜ
Answer:
1. {0}
2. [0, 1)
3. (−∞, 0) ∪ [1, ∞)
4. ℝ ∖ {0}
Let A_n = (2 − 1/n, 5 + 1/n) for n ∈ ℕ₊. Find
1. ⋂_{n=1}^∞ A_n
2. ⋃_{n=1}^∞ A_n
3. ⋂_{n=1}^∞ A_nᶜ
4. ⋃_{n=1}^∞ A_nᶜ
Answer:
1. [2, 5]
2. (1, 6)
3. (−∞, 1] ∪ [6, ∞)
4. (−∞, 2) ∪ (5, ∞)
Subsets of ℝ²
Let T be the closed triangular region in ℝ² with vertices (0,0), (1,0), and (1,1). Find each of the following:
1. The cross section T_x for x ∈ ℝ
2. The cross section T^y for y ∈ ℝ
3. The projection of T onto the horizontal axis
4. The projection of T onto the vertical axis
Answer:
1. T_x = [0, x] for x ∈ [0, 1], T_x = ∅ otherwise
2. T^y = [y, 1] for y ∈ [0, 1], T^y = ∅ otherwise
3. [0, 1]
4. [0, 1]

2. Functions
Functions play a central role in probability and statistics, as they do in every other branch of mathematics. For
the most part, the proofs in this section are straightforward, so be sure to try them yourself before reading the
ones in the text.

Definitions and Properties


Suppose that S and T are sets. Technically, a function f from S into T is a subset of the product set S × T with
the property that for each element x ∈ S, there exists a unique element y ∈ T such that (x, y) ∈ f. We then write
y = f(x). Less formally, a function f from S into T is a rule (or procedure or algorithm) that assigns to each
x ∈ S a unique element f(x) ∈ T. The definition of a function as a set of ordered pairs is due to Kazimierz
Kuratowski. When f is a function from S into T we write f: S → T.

A function f from S into T


The set S is the domain of f and the set T is the range space or co-domain of f. The actual range of f is the set
of function values:

range(f) = {y ∈ T : y = f(x) for some x ∈ S}

So range(f) ⊆ T. If range(f) = T, then f is said to map S onto T (instead of just into T). Thus, if f maps S
onto T then for each y ∈ T there exists x ∈ S such that f(x) = y. By definition, any function maps its domain
onto its range.
A function f: S → T is said to be one-to-one if distinct elements in the domain are mapped to distinct elements in
the range. That is, if u, v ∈ S and u ≠ v then f(u) ≠ f(v). Equivalently, if f(u) = f(v) then u = v.
Inverse functions
If f is a one-to-one function from S onto T, we can define the inverse of f as the function f⁻¹ from T onto S
given by

f⁻¹(y) = x ⇔ f(x) = y;  x ∈ S, y ∈ T
If you like to think of a function as a set of ordered pairs, then f⁻¹ = {(y, x) ∈ T × S : (x, y) ∈ f}. The fact that f
is one-to-one and onto ensures that f⁻¹ is a valid function from T onto S. Sets S and T are in one-to-one
correspondence if there exists a one-to-one function from S onto T. One-to-one correspondence plays an
essential role in the study of cardinality.
Restrictions
Suppose that f: S → T and that A ⊆ S. The function f_A: A → T defined by the rule f_A(x) = f(x) for x ∈ A is
called, appropriately enough, the restriction of f to A. As a set of ordered pairs, note that

f_A = {(x, y) ∈ f : x ∈ A}.
Composition
Suppose that g: R → S and f: S → T. The composition of f with g is the function f ∘ g: R → T defined by

(f ∘ g)(x) = f(g(x)),  x ∈ R
Composition is associative:
Suppose that h: R → S, g: S → T, and f: T → U. Then

f ∘ (g ∘ h) = (f ∘ g) ∘ h
Proof:
Note that both functions map R into U. Using the definition of composition, the value of both functions at
x ∈ R is f(g(h(x))).
Thus we can write f ∘ g ∘ h without ambiguity. On the other hand, composition is not commutative. Indeed,
depending on the domains and co-domains, f ∘ g might be defined when g ∘ f is not. Even when both are
defined, they may have different domains and co-domains, and so of course cannot be the same function. Even
when both are defined and have the same domains and co-domains, the two compositions will not be the same
in general. Examples of all of these cases are given in the computational exercises below.
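A quick sketch (compose is a hypothetical helper, not a library function, and f, g, h are arbitrary examples) makes both the associativity and the non-commutativity concrete:

```python
# Function composition: associative, but not commutative in general.
def compose(f, g):
    return lambda x: f(g(x))

def f(x): return x + 1
def g(x): return 2 * x
def h(x): return x ** 2

fg = compose(f, g)   # x -> 2x + 1
gf = compose(g, f)   # x -> 2(x + 1)
print(fg(3), gf(3))  # 7 8, so f ∘ g ≠ g ∘ f here

# associativity: f ∘ (g ∘ h) = (f ∘ g) ∘ h
assert compose(f, compose(g, h))(3) == compose(compose(f, g), h)(3)
```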
Suppose that g: R → S and f: S → T.
1. If f and g are one-to-one then f ∘ g is one-to-one.
2. If f and g are onto then f ∘ g is onto.
Proof:
1. Suppose that u, v ∈ R and (f ∘ g)(u) = (f ∘ g)(v). Then f(g(u)) = f(g(v)). Since f is one-to-one,
g(u) = g(v). Since g is one-to-one, u = v.
2. Suppose that z ∈ T. Since f is onto, there exists y ∈ S with f(y) = z. Since g is onto, there exists x ∈ R
with g(x) = y. Then (f ∘ g)(x) = f(g(x)) = f(y) = z.
The identity function on a set S is the function I_S from S onto S defined by I_S(x) = x for x ∈ S.
The identity function acts like an identity with respect to the operation of composition. If f: S → T then
1. f ∘ I_S = f
2. I_T ∘ f = f
Proof:
1. Note that f ∘ I_S: S → T. For x ∈ S, (f ∘ I_S)(x) = f(I_S(x)) = f(x).
2. Note that I_T ∘ f: S → T. For x ∈ S, (I_T ∘ f)(x) = I_T(f(x)) = f(x).

The inverse of a function is really the inverse with respect to composition. Suppose that f is a one-to-one
function from S onto T. Then
1. f⁻¹ ∘ f = I_S
2. f ∘ f⁻¹ = I_T
Proof:
1. Note that f⁻¹ ∘ f: S → S. For x ∈ S, (f⁻¹ ∘ f)(x) = f⁻¹(f(x)) = x.
2. Note that f ∘ f⁻¹: T → T. For y ∈ T, (f ∘ f⁻¹)(y) = f(f⁻¹(y)) = y.
An element x ∈ S^n can be thought of as a function from {1, 2, …, n} into S. Similarly, an element x ∈ S^∞ can
be thought of as a function from ℕ₊ into S. For such a sequence x, of course, we usually write x_i instead of
x(i). More generally, if S and T are sets, then the set of all functions from S into T is denoted by T^S. In
particular, as we noted in the last section, S^∞ is also (and more accurately) written as S^{ℕ₊}.
Suppose that g is a one-to-one function from R onto S and that f is a one-to-one function from S onto T. Then
(f ∘ g)⁻¹ = g⁻¹ ∘ f⁻¹.
Proof:
Note that (f ∘ g)⁻¹: T → R and g⁻¹ ∘ f⁻¹: T → R. For y ∈ T, let x = (f ∘ g)⁻¹(y). Then (f ∘ g)(x) = y so that
f(g(x)) = y and hence g(x) = f⁻¹(y) and finally x = g⁻¹(f⁻¹(y)).


Inverse Images
Suppose that f: S → T. If A ⊆ T, the inverse image of A under f is the subset of S consisting of those elements
that map into A:

f⁻¹(A) = {x ∈ S : f(x) ∈ A}

The inverse image of A under f

Technically, the inverse images define a new function from P(T) into P(S). We use the same notation as for
the inverse function, which is defined when f is one-to-one and onto. These are very different functions, but
usually no confusion results. The following important theorem shows that inverse images preserve all set
operations.
Suppose that f: S → T, and that A ⊆ T and B ⊆ T. Then
1. f⁻¹(A ∪ B) = f⁻¹(A) ∪ f⁻¹(B)
2. f⁻¹(A ∩ B) = f⁻¹(A) ∩ f⁻¹(B)
3. f⁻¹(A ∖ B) = f⁻¹(A) ∖ f⁻¹(B)
4. If A ⊆ B then f⁻¹(A) ⊆ f⁻¹(B)
5. If A and B are disjoint, so are f⁻¹(A) and f⁻¹(B)
Proof:
1. x ∈ f⁻¹(A ∪ B) if and only if f(x) ∈ A ∪ B if and only if f(x) ∈ A or f(x) ∈ B if and only if
x ∈ f⁻¹(A) or x ∈ f⁻¹(B) if and only if x ∈ f⁻¹(A) ∪ f⁻¹(B)
2. The proof is the same as (a), with intersection replacing union and with and replacing or throughout.
3. The proof is the same as (a), with set difference replacing union and with and not replacing or
throughout.
4. Suppose A ⊆ B. If x ∈ f⁻¹(A) then f(x) ∈ A and hence f(x) ∈ B, so x ∈ f⁻¹(B).
5. If A and B are disjoint, then from (b), f⁻¹(A) ∩ f⁻¹(B) = f⁻¹(A ∩ B) = f⁻¹(∅) = ∅.
The result in part (a) holds for arbitrary unions, and the result in part (b) holds for arbitrary intersections. No new
ideas are involved; only the notation is more complicated.
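As an illustrative check (the function and sets here are arbitrary examples), inverse images under f(x) = x² on a small domain preserve union, intersection, and difference even though f is not one-to-one:

```python
# Inverse images preserve the set operations, even for a non-injective f.
S = {-2, -1, 0, 1, 2}

def f(x):
    return x * x

def preimage(A):
    """f⁻¹(A) = {x ∈ S : f(x) ∈ A}."""
    return {x for x in S if f(x) in A}

A, B = {1, 4}, {0, 1}
assert preimage(A | B) == preimage(A) | preimage(B)
assert preimage(A & B) == preimage(A) & preimage(B)
assert preimage(A - B) == preimage(A) - preimage(B)
print(preimage(A))   # {-2, -1, 1, 2}
```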
Suppose that {A_i : i ∈ I} is a collection of subsets of T, where I is a nonempty index set. Then
1. f⁻¹(⋃_{i∈I} A_i) = ⋃_{i∈I} f⁻¹(A_i)
2. f⁻¹(⋂_{i∈I} A_i) = ⋂_{i∈I} f⁻¹(A_i)
Proof:
1. x ∈ f⁻¹(⋃_{i∈I} A_i) if and only if f(x) ∈ ⋃_{i∈I} A_i if and only if f(x) ∈ A_i for some i ∈ I if and only if
x ∈ f⁻¹(A_i) for some i ∈ I if and only if x ∈ ⋃_{i∈I} f⁻¹(A_i)

2. The proof is the same as (a), with intersection replacing union and with for every replacing for some.

Forward Images

Suppose again that f: S → T. If A ⊆ S, the forward image of A under f is the subset of T given by

f(A) = {f(x) : x ∈ A}

The forward image of A under f

Technically, the forward images define a new function from P(S) into P(T), but we use the same symbol f
for this new function as for the underlying function from S into T that we started with. Again, the two functions
are very different, but usually no confusion results.
It might seem that forward images are more natural than inverse images, but in fact, the inverse images are
much more important than the forward ones (at least in probability and measure theory). Fortunately, the inverse
images are nicer as well: unlike the inverse images, the forward images do not preserve all of the set operations.
Suppose that f: S → T, and that A ⊆ S and B ⊆ S. Then
1. f(A ∪ B) = f(A) ∪ f(B).
2. f(A ∩ B) ⊆ f(A) ∩ f(B). Equality holds if f is one-to-one.
3. f(A) ∖ f(B) ⊆ f(A ∖ B). Equality holds if f is one-to-one.
4. If A ⊆ B then f(A) ⊆ f(B).
Proof:
1. Suppose y ∈ f(A ∪ B). Then y = f(x) for some x ∈ A ∪ B. If x ∈ A then y ∈ f(A) and if x ∈ B then
y ∈ f(B). In both cases y ∈ f(A) ∪ f(B). Conversely suppose y ∈ f(A) ∪ f(B). If y ∈ f(A) then
y = f(x) for some x ∈ A. But then x ∈ A ∪ B so y ∈ f(A ∪ B). Similarly, if y ∈ f(B) then y = f(x) for
some x ∈ B. But then x ∈ A ∪ B so y ∈ f(A ∪ B).
2. If y ∈ f(A ∩ B) then y = f(x) for some x ∈ A ∩ B. But then x ∈ A so y ∈ f(A) and x ∈ B so y ∈ f(B)
and hence y ∈ f(A) ∩ f(B). Conversely, suppose that y ∈ f(A) ∩ f(B). Then y ∈ f(A) and y ∈ f(B), so
there exists x ∈ A with f(x) = y and there exists u ∈ B with f(u) = y. At this point, we can go no further.
But if f is one-to-one, then u = x and hence x ∈ A and x ∈ B. Thus x ∈ A ∩ B so y ∈ f(A ∩ B).
3. Suppose y ∈ f(A) ∖ f(B). Then y ∈ f(A) and y ∉ f(B). Hence y = f(x) for some x ∈ A and y ≠ f(u)
for every u ∈ B. Thus, x ∉ B so x ∈ A ∖ B and hence y ∈ f(A ∖ B). Conversely, suppose y ∈ f(A ∖ B).
Then y = f(x) for some x ∈ A ∖ B. Hence x ∈ A so y ∈ f(A). Again, the proof breaks down at this point.
However, if f is one-to-one and f(u) = y for some u ∈ B, then u = x so x ∈ B, a contradiction. Hence
f(u) ≠ y for every u ∈ B so y ∉ f(B). Thus y ∈ f(A) ∖ f(B).
4. Suppose A ⊆ B. If y ∈ f(A) then y = f(x) for some x ∈ A. But then x ∈ B so y ∈ f(B).
The result in part (a) holds for arbitrary unions, and the result in part (b) holds for arbitrary intersections. No new
ideas are involved; only the notation is more complicated.
Suppose that {A_i : i ∈ I} is a collection of subsets of S, where I is a nonempty index set. Then
1. f(⋃_{i∈I} A_i) = ⋃_{i∈I} f(A_i).
2. f(⋂_{i∈I} A_i) ⊆ ⋂_{i∈I} f(A_i). Equality holds if f is one-to-one.
Proof:
1. y ∈ f(⋃_{i∈I} A_i) if and only if y = f(x) for some x ∈ ⋃_{i∈I} A_i, if and only if y = f(x) for some x ∈ A_i and
some i ∈ I, if and only if y ∈ f(A_i) for some i ∈ I, if and only if y ∈ ⋃_{i∈I} f(A_i).
2. If y ∈ f(⋂_{i∈I} A_i) then y = f(x) for some x ∈ ⋂_{i∈I} A_i. Hence x ∈ A_i for every i ∈ I, so y ∈ f(A_i) for
every i ∈ I and thus y ∈ ⋂_{i∈I} f(A_i). Conversely, suppose that y ∈ ⋂_{i∈I} f(A_i). Then y ∈ f(A_i) for every
i ∈ I. Hence for every i ∈ I there exists x_i ∈ A_i with y = f(x_i). If f is one-to-one, then x_i = x_j for all i, j ∈ I.
Call the common value x. Then x ∈ A_i for every i ∈ I, so x ∈ ⋂_{i∈I} A_i and therefore y ∈ f(⋂_{i∈I} A_i).
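The failure of equality in part (b) when f is not one-to-one is easy to exhibit. A quick sketch (the particular sets and f(x) = x² are illustrative choices, not from the text):

```python
# Sketch (illustrative choices, not from the text): with f(x) = x^2, which is
# not one-to-one, forward images preserve unions but not intersections.

def image(f, A):
    """Forward image f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}

f = lambda x: x * x
A, B = {-1, 0}, {0, 1}

# f(A ∪ B) = f(A) ∪ f(B) always holds
assert image(f, A | B) == image(f, A) | image(f, B)

# but f(A ∩ B) can be strictly smaller than f(A) ∩ f(B),
# since -1 in A and 1 in B map to the same value 1:
print(sorted(image(f, A & B)))            # [0]
print(sorted(image(f, A) & image(f, B)))  # [0, 1]
```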
Suppose again that f: S → T. As noted earlier, the forward images of f define a function from P(S) into P(T)
and the inverse images define a function from P(T) into P(S). One might hope that these functions are
inverses of one another, but alas no.
Suppose that f: S → T.
1. A ⊆ f⁻¹[f(A)] for A ⊆ S. Equality holds if f is one-to-one.
2. f[f⁻¹(B)] ⊆ B for B ⊆ T. Equality holds if f is onto.
Proof:
1. If x ∈ A then f(x) ∈ f(A) and hence x ∈ f⁻¹[f(A)]. Conversely suppose that x ∈ f⁻¹[f(A)]. Then
f(x) ∈ f(A), so f(x) = f(u) for some u ∈ A. At this point we can go no further. But if f is one-to-one,
then u = x and hence x ∈ A.
2. Suppose y ∈ f[f⁻¹(B)]. Then y = f(x) for some x ∈ f⁻¹(B). But then y = f(x) ∈ B. Conversely
suppose that f is onto and y ∈ B. There exists x ∈ S with f(x) = y. Hence x ∈ f⁻¹(B) and so
y ∈ f[f⁻¹(B)].
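Part (a) can also be seen concretely: when f is not one-to-one, pulling a forward image back through f⁻¹ can pick up extra points. A sketch with illustrative choices (not from the text):

```python
# Sketch (illustrative choices, not from the text): A is a proper subset of
# f^{-1}(f(A)) when f(x) = x^2 is not one-to-one.
f = lambda x: x * x
S = {-2, -1, 0, 1, 2}
A = {1, 2}

fA = {f(x) for x in A}               # f(A) = {1, 4}
back = {x for x in S if f(x) in fA}  # f^{-1}(f(A)) = {-2, -1, 1, 2}

assert A <= back and A != back       # A is a strict subset of f^{-1}(f(A))
print(sorted(back))                  # [-2, -1, 1, 2]
```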
Spaces of Real Functions
Let S be a set and let V denote the set of all functions from S into R. The usual arithmetic operations on
members of V are defined pointwise. That is, if f ∈ V, g ∈ V, and c ∈ R, then f + g, f − g, fg, and cf are defined
as follows for all x ∈ S:

(f + g)(x) = f(x) + g(x)

(f − g)(x) = f(x) − g(x)

(fg)(x) = f(x) g(x)

(cf)(x) = c f(x)

If g(x) ≠ 0 for every x ∈ S, then f/g, defined by (f/g)(x) = f(x)/g(x) for x ∈ S, is also in V. The zero function
0, of course, is defined by 0(x) = 0 for all x ∈ S. Each of these functions is also in V.
A fact that is very important in probability as well as other branches of analysis is that V, with addition and
scalar multiplication as defined above, is a vector space.

(V, +, ·) is a vector space over R. That is, for all f, g, h ∈ V and a, b ∈ R:
1. f + g = g + f (commutative rule)
2. f + (g + h) = (f + g) + h (associative rule)
3. a(f + g) = af + ag (distributive rule)
4. (a + b)f = af + bf (distributive rule)
5. f + 0 = f (existence of zero vector)
6. f + (−f) = 0 (existence of additive inverses)
7. 1 · f = f (unity rule)

Proof:
Each of these properties follows from the corresponding property in R.
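The pointwise definitions translate directly into code. A minimal sketch (the helper names `add`, `scale`, and the particular functions are illustrative choices, not from the text):

```python
# Sketch (illustrative choices, not from the text): pointwise operations on
# real-valued functions, represented as Python callables on a common domain.

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(c, f):
    """Pointwise scalar multiple: (cf)(x) = c * f(x)."""
    return lambda x: c * f(x)

zero = lambda x: 0.0  # the zero function

f = lambda x: x * x
g = lambda x: 2 * x + 1

h = add(f, scale(3.0, g))        # the function x -> x^2 + 3(2x + 1)
assert h(2) == 4 + 3 * 5         # pointwise evaluation at x = 2
assert add(f, zero)(7) == f(7)   # f + 0 = f, checked pointwise
```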
Indicator Functions
Suppose that A is a subset of a universal set S. The indicator function of A is the function 1_A: S → {0, 1}
defined as follows:

1_A(x) = 1 if x ∈ A, and 1_A(x) = 0 if x ∉ A

Thus, the indicator function of A simply indicates whether or not x ∈ A for each x ∈ S.
Conversely, any function f: S → {0, 1} is the indicator function of the set A = f⁻¹{1} = {x ∈ S : f(x) = 1}.
Thus, there is a one-to-one correspondence between P(S), the power set of S, and the collection of indicator
functions {0, 1}^S. The next exercise shows how the set algebra of subsets corresponds to the arithmetic algebra
of the indicator functions.
Suppose that A and B are subsets of S. Then
1. 1_{A∩B} = 1_A 1_B = min{1_A, 1_B}
2. 1_{A∪B} = 1 − (1 − 1_A)(1 − 1_B) = max{1_A, 1_B}
3. 1_{Aᶜ} = 1 − 1_A
4. 1_{A∖B} = 1_A (1 − 1_B)
5. A ⊆ B if and only if 1_A ≤ 1_B
Proof:
1. Note that both functions on the right just take the values 0 and 1. Moreover,
1_A(x) 1_B(x) = min{1_A(x), 1_B(x)} = 1 if and only if x ∈ A and x ∈ B.
2. Note that both functions on the right just take the values 0 and 1. Moreover,
1 − (1 − 1_A(x))(1 − 1_B(x)) = max{1_A(x), 1_B(x)} = 1 if and only if x ∈ A or x ∈ B.
3. Note that 1 − 1_A just takes the values 0 and 1. Moreover, 1 − 1_A(x) = 1 if and only if x ∉ A.
4. Note that 1_{A∖B} = 1_{A∩Bᶜ} = 1_A 1_{Bᶜ} = 1_A (1 − 1_B) by parts (a) and (c).
5. Since both functions just take the values 0 and 1, note that 1_A ≤ 1_B if and only if 1_A(x) = 1 implies
1_B(x) = 1. But in turn, this is equivalent to A ⊆ B.


The result in part (a) extends to arbitrary intersections and the result in part (b) extends to arbitrary unions.
Suppose that {A_i : i ∈ I} is a collection of subsets of S, where I is a nonempty index set. Then
1. 1_{⋂_{i∈I} A_i} = ∏_{i∈I} 1_{A_i} = min{1_{A_i} : i ∈ I}
2. 1_{⋃_{i∈I} A_i} = 1 − ∏_{i∈I} (1 − 1_{A_i}) = max{1_{A_i} : i ∈ I}
Proof:
In general, a product over an infinite index set may not make sense. But if all of the factors are either 0 or 1, as
they are here, then we can simply define the product to be 1 if all of the factors are 1, and 0 otherwise.
1. The functions in the middle and on the right just take the values 0 and 1. Moreover, both take the value 1
at x ∈ S if and only if x ∈ A_i for every i ∈ I.
2. The functions in the middle and on the right just take the values 0 and 1. Moreover, both take the value 1
at x ∈ S if and only if x ∈ A_i for some i ∈ I.
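The correspondence between set algebra and indicator arithmetic is easy to verify exhaustively on a small universal set. A sketch (the sets chosen here are illustrative, not from the text):

```python
# Sketch (illustrative choices, not from the text): indicator functions as
# 0/1-valued dictionaries, checking the product/min, max, and complement rules.
S = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

ind = lambda E: {x: (1 if x in E else 0) for x in S}
oneA, oneB = ind(A), ind(B)

for x in S:
    # 1_{A∩B} = 1_A 1_B = min{1_A, 1_B}
    assert ind(A & B)[x] == oneA[x] * oneB[x] == min(oneA[x], oneB[x])
    # 1_{A∪B} = 1 - (1 - 1_A)(1 - 1_B) = max{1_A, 1_B}
    assert ind(A | B)[x] == 1 - (1 - oneA[x]) * (1 - oneB[x]) == max(oneA[x], oneB[x])
    # 1_{A^c} = 1 - 1_A
    assert ind(S - A)[x] == 1 - oneA[x]
```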
Functions into Product Spaces
Suppose that S and T are sets and that f: S → Tⁿ where n ∈ {2, 3, …}. For i ∈ {1, 2, …, n}, we define the ith
coordinate function f_i: S → T by the rule that f_i(x) is the ith coordinate of f(x) for each x ∈ S. Conversely, a
sequence of functions (f_1, f_2, …, f_n), each mapping S into T, defines a function f: S → Tⁿ by the rule

f(x) = (f_1(x), f_2(x), …, f_n(x)) for x ∈ S
Operators
Suppose that S is a set. A function f: S → S is sometimes called a unary operator on S. As the name suggests,
f operates on an element x ∈ S to produce a new element f(x) ∈ S. Similarly, a function g: S × S → S is
sometimes called a binary operator on S. As this name suggests, g operates on a pair of elements
(x, y) ∈ S × S to produce a new element g(x, y) ∈ S.

The arithmetic operators are quintessential examples:

minus(x) = −x is a unary operator on R.

sum(x, y) = x + y is a binary operator on R.

product(x, y) = x y is a binary operator on R.

difference(x, y) = x − y is a binary operator on R.

reciprocal(x) = 1/x is a unary operator on R ∖ {0}.

For a fixed universal set S, the set operators on P(S) were studied in the section on Sets:

complement(A) = Aᶜ is a unary operator.

union(A, B) = A ∪ B is a binary operator.

intersect(A, B) = A ∩ B is a binary operator.

difference(A, B) = A ∖ B is a binary operator.

As these examples illustrate, a binary operator is often written as x f y rather than f(x, y). Still, it is useful to
know that operators are simply functions of a special type.
Suppose now that f is a unary operator on a set S, g is a binary operator on S, and that A ⊆ S. Then A is said to
be closed under f if x ∈ A implies f(x) ∈ A. Thus, f restricted to A is a unary operator on A. Similarly, A is
said to be closed under g if (x, y) ∈ A × A implies g(x, y) ∈ A. Thus, g restricted to A × A is a binary operator
on A. Returning to our basic examples, note that

N is closed under sum and product, but not under minus and difference.

Z is closed under sum, product, minus, and difference.

Q is closed under sum, product, minus, and difference. Q ∖ {0} is closed under reciprocal.

Many properties that you are familiar with for special operators (such as the arithmetic and set operators) can
now be formulated generally. Suppose that f and g are binary operators on S. In the following definitions, x, y,
and z are arbitrary elements of S.

f is commutative if f(x, y) = f(y, x), that is, x f y = y f x.

f is associative if f(x, f(y, z)) = f(f(x, y), z), that is, x f (y f z) = (x f y) f z.

g distributes over f (on the left) if g(x, f(y, z)) = f(g(x, y), g(x, z)), that is, x g (y f z) = (x g y) f (x g z).
The Axiom of Choice

Suppose that 𝒮 is a collection of nonempty subsets of a set S. The axiom of choice states that there exists a
function f: 𝒮 → S with the property that f(A) ∈ A for each A ∈ 𝒮. The function f is known as a choice
function.
Stripped of most of the mathematical jargon, the idea is very simple. Since each set A ∈ 𝒮 is nonempty, we can
select an element of A; we will call the element we select f(A) and thus our selections define a function. In
fact, you may wonder why we need an axiom at all. The problem is that we have not given a rule (or procedure
or algorithm) for selecting the elements of the sets in the collection. Indeed, we may not know enough about the
sets in the collection to define a specific rule, so in such a case, the axiom of choice simply guarantees the
existence of a choice function. Some mathematicians, known as constructionists, do not accept the axiom of
choice, and insist on well-defined rules for constructing functions.
A nice consequence of the axiom of choice is a type of duality between one-to-one functions and onto functions.
Suppose that f is a function from a set S onto a set T. Then there exists a one-to-one function g from T into S.
Proof:
For each y ∈ T, the set f⁻¹{y} is nonempty, since f is onto. By the axiom of choice, we can select an element
g(y) from f⁻¹{y} for each y ∈ T. The resulting function g is one-to-one.
Suppose that f is a one-to-one function from a set S into a set T. Then there exists a function g from T onto S.
Proof:
Fix a special element x₀ ∈ S. If y ∈ range(f), there exists a unique x ∈ S with f(x) = y. Define g(y) = x. If
y ∉ range(f), define g(y) = x₀. The function g is onto.
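For finite sets no axiom is needed: a choice function can be given by an explicit rule. The following sketch (sets, map, and the rule "pick the minimum" are illustrative choices, not from the text) carries out the first duality proof, turning an onto map into a one-to-one map in the opposite direction:

```python
# Sketch (illustrative choices, not from the text): from an onto map f: S -> T,
# build a one-to-one map g: T -> S by choosing one preimage per point of T,
# exactly as in the proof above. "min" serves as an explicit choice rule.
S = {1, 2, 3, 4, 5}
T = {"a", "b"}
f = {1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}   # onto T

g = {}
for y in T:
    fiber = {x for x in S if f[x] == y}        # f^{-1}{y}, nonempty since f is onto
    g[y] = min(fiber)                          # the choice function

assert len(set(g.values())) == len(T)          # g is one-to-one
assert all(f[g[y]] == y for y in T)            # g(y) really is a preimage of y
```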

Computational Exercises
Some Elementary Functions
Each of the following rules defines a function from R into R.

f(x) = x²

g(x) = sin(x)

h(x) = ⌊x⌋

u(x) = eˣ / (1 + eˣ)

Find the range of the function and determine if the function is one-to-one in each of the following cases:

1. f
2. g
3. h
4. u
Answer:
1. Range [0, ∞). Not one-to-one.
2. Range [−1, 1]. Not one-to-one.
3. Range Z. Not one-to-one.
4. Range (0, 1). One-to-one.
Find the following inverse images:
1. f⁻¹[4, 9]
2. g⁻¹{0}
3. h⁻¹{2, 3, 4}
Answer:
1. [−3, −2] ∪ [2, 3]
2. {n π : n ∈ Z}
3. [2, 5)
The function u is one-to-one. Find (that is, give the domain and rule for) the inverse function u⁻¹.
Answer:

u⁻¹(p) = ln(p / (1 − p)) for p ∈ (0, 1)

Give the rule and find the range for each of the following functions:
1. f ∘ g
2. g ∘ f
3. h ∘ g ∘ f
Answer:

1. (f ∘ g)(x) = sin²(x). Range [0, 1]
2. (g ∘ f)(x) = sin(x²). Range [−1, 1]
3. (h ∘ g ∘ f)(x) = ⌊sin(x²)⌋. Range {−1, 0, 1}
Note that f ∘ g and g ∘ f are well-defined functions from R into R, but f ∘ g ≠ g ∘ f.
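The inverse formula for u can be checked numerically. A quick sketch (the sample points are illustrative choices, not from the text):

```python
# Sketch (illustrative choices, not from the text): numerically checking that
# u(x) = e^x / (1 + e^x) and u^{-1}(p) = ln(p / (1 - p)) are inverses.
import math

u = lambda x: math.exp(x) / (1 + math.exp(x))
u_inv = lambda p: math.log(p / (1 - p))

for x in [-3.0, -0.5, 0.0, 1.0, 4.0]:
    p = u(x)
    assert 0 < p < 1                      # the range of u is (0, 1)
    assert abs(u_inv(p) - x) < 1e-9       # u^{-1}(u(x)) = x, up to rounding
```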
Dice
Let S = {1, 2, 3, 4, 5, 6}². This is the set of possible outcomes when a pair of standard dice are thrown. Let f, g,
u, and v be the functions from S into Z defined by the following rules:

f(x, y) = x + y

g(x, y) = y − x

u(x, y) = min{x, y}

v(x, y) = max{x, y}

In addition, let F and U be the functions defined by F = (f, g) and U = (u, v).

Find the range of each of the following functions:

1. f
2. g
3. u
4. v
5. U
Answer:

1. {2, 3, 4, …, 12}
2. {−5, −4, …, 4, 5}
3. {1, 2, 3, 4, 5, 6}
4. {1, 2, 3, 4, 5, 6}
5. {(i, j) ∈ {1, 2, 3, 4, 5, 6}² : i ≤ j}

Give each of the following inverse images in list form:

1. f⁻¹{6}
2. u⁻¹{3}
3. v⁻¹{4}
4. U⁻¹{(3, 4)}
Answer:

1. {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}
2. {(3, 3), (3, 4), (4, 3), (3, 5), (5, 3), (3, 6), (6, 3)}
3. {(1, 4), (4, 1), (2, 4), (4, 2), (3, 4), (4, 3), (4, 4)}
4. {(3, 4), (4, 3)}
Find each of the following compositions:
1. f ∘ U
2. g ∘ U
3. u ∘ F
4. v ∘ F
5. F ∘ U
6. U ∘ F
Answer:
1. f ∘ U = f
2. g ∘ U = |g|
3. u ∘ F = g
4. v ∘ F = f
5. F ∘ U = (f, |g|)
6. U ∘ F = (g, f)
Note that while f ∘ U is well-defined, U ∘ f is not. Note also that f ∘ U = f even though U is not the identity
function on S.
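The composition identities can be verified by brute force over all 36 outcomes. A sketch (Python formulation is ours, not from the text):

```python
# Sketch (not from the text): brute-force check of the dice compositions
# f∘U = f and g∘U = |g| over all 36 outcomes in S = {1,...,6}^2.
from itertools import product

S = list(product(range(1, 7), repeat=2))
f = lambda x, y: x + y
g = lambda x, y: y - x
U = lambda x, y: (min(x, y), max(x, y))

assert all(f(*U(x, y)) == f(x, y) for (x, y) in S)        # f ∘ U = f
assert all(g(*U(x, y)) == abs(g(x, y)) for (x, y) in S)   # g ∘ U = |g|
assert {f(x, y) for (x, y) in S} == set(range(2, 13))     # range of f
```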
Bit Strings
Let n ∈ N₊ and let S = {0, 1}ⁿ and T = {0, 1, …, n}. Recall that the elements of S are bit strings of length n,
and could represent the possible outcomes of n tosses of a coin (where 1 means heads and 0 means tails). Let
f: S → T be the function defined by f(x₁, x₂, …, xₙ) = Σᵢ₌₁ⁿ xᵢ. Note that f(x) is just the number of 1s in
the bit string x. Let g: T → S be the function defined by g(k) = xₖ, where xₖ denotes the bit string with k 1s
followed by n − k 0s.
Find each of the following:
1. f ∘ g
2. g ∘ f
Answer:
1. f ∘ g: T → T and (f ∘ g)(k) = k.
2. g ∘ f: S → S and (g ∘ f)(x) = xₖ where k = f(x) = Σᵢ₌₁ⁿ xᵢ. In words, (g ∘ f)(x) is the bit string with the
same number of 1s as x, but rearranged so that all the 1s come first.
In the previous exercise, note that f ∘ g and g ∘ f are both well-defined, but have different domains, and so of
course are not the same. Note also that f ∘ g is the identity function on T, but f is not the inverse of g. Indeed f
is not one-to-one, and so does not have an inverse. However, f restricted to {xₖ : k ∈ T} (the range of g) is
one-to-one and is the inverse of g.
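These maps are small enough to check exhaustively in code. A sketch with n = 4 (the Python encoding of bit strings is an illustrative choice, not from the text):

```python
# Sketch (not from the text): the bit-string maps with n = 4, encoded as Python
# strings. f counts 1s; g(k) is k 1s followed by n - k 0s.
n = 4
S = [format(i, "04b") for i in range(2 ** n)]
f = lambda x: x.count("1")
g = lambda k: "1" * k + "0" * (n - k)

assert all(f(g(k)) == k for k in range(n + 1))   # f ∘ g = identity on T
assert g(f("0101")) == "1100"                    # g ∘ f moves the 1s to the front
assert any(g(f(x)) != x for x in S)              # so g ∘ f is not the identity on S
```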
Let n = 4. Give f⁻¹({k}) in list form for each k ∈ T.
Answer:
1. f⁻¹({0}) = {0000}
2. f⁻¹({1}) = {1000, 0100, 0010, 0001}
3. f⁻¹({2}) = {1100, 1010, 1001, 0110, 0101, 0011}
4. f⁻¹({3}) = {1110, 1101, 1011, 0111}
5. f⁻¹({4}) = {1111}

Again let n = 4. Let A = {1000, 1010} and B = {1000, 1100}. Give each of the following in list form:
1. f(A)
2. f(B)
3. f(A ∩ B)
4. f(A) ∩ f(B)
5. f⁻¹(f(A))
Answer:

1. {1, 2}
2. {1, 2}
3. {1}
4. {1, 2}
5. {1000, 0100, 0010, 0001, 1100, 1010, 1001, 0110, 0101, 0011}
In the previous exercise, note that f(A ∩ B) ≠ f(A) ∩ f(B) and A ⊂ f⁻¹(f(A)).
Indicator Functions
Suppose that A and B are subsets of a universal set S. Express, in terms of 1_A and 1_B, the indicator function of
each of the 14 non-trivial sets that can be constructed from A and B. Use the Venn diagram applet to help.
Answer:
1. 1_A
2. 1_B
3. 1_{Aᶜ} = 1 − 1_A
4. 1_{Bᶜ} = 1 − 1_B
5. 1_{A∩B} = 1_A 1_B
6. 1_{A∪B} = 1_A + 1_B − 1_A 1_B
7. 1_{A∩Bᶜ} = 1_A − 1_A 1_B
8. 1_{B∩Aᶜ} = 1_B − 1_A 1_B
9. 1_{A∪Bᶜ} = 1 − 1_B + 1_A 1_B
10. 1_{B∪Aᶜ} = 1 − 1_A + 1_A 1_B
11. 1_{Aᶜ∩Bᶜ} = 1 − 1_A − 1_B + 1_A 1_B
12. 1_{Aᶜ∪Bᶜ} = 1 − 1_A 1_B
13. 1_{(A∩Bᶜ)∪(B∩Aᶜ)} = 1_A + 1_B − 2 · 1_A 1_B
14. 1_{(A∩B)∪(Aᶜ∩Bᶜ)} = 1 − 1_A − 1_B + 2 · 1_A 1_B
Suppose that A, B, and C are subsets of a universal set S. Give the indicator function of each of the following,
in terms of 1_A, 1_B, and 1_C in sum-product form:
1. D = {x ∈ S : x is an element of exactly one of the given sets}
2. E = {x ∈ S : x is an element of exactly two of the given sets}
Answer:
1. 1_D = 1_A + 1_B + 1_C − 2(1_A 1_B + 1_A 1_C + 1_B 1_C) + 3 · 1_A 1_B 1_C
2. 1_E = 1_A 1_B + 1_A 1_C + 1_B 1_C − 3 · 1_A 1_B 1_C
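The "exactly one" and "exactly two" formulas can be checked exhaustively on a small universal set. A sketch (the particular sets are illustrative choices, not from the text):

```python
# Sketch (illustrative choices, not from the text): checking the sum-product
# formulas for "exactly one" and "exactly two" of three sets.
S = set(range(12))
A, B, C = {0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 6, 7}
one = lambda E, x: 1 if x in E else 0

for x in S:
    a, b, c = one(A, x), one(B, x), one(C, x)
    count = a + b + c
    # 1_D: x belongs to exactly one of the three sets
    assert (1 if count == 1 else 0) == a + b + c - 2 * (a*b + a*c + b*c) + 3 * a*b*c
    # 1_E: x belongs to exactly two of the three sets
    assert (1 if count == 2 else 0) == a*b + a*c + b*c - 3 * a*b*c
```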
Operators
Recall the standard arithmetic operators on R discussed above.
We all know that sum is commutative and associative, and that product distributes over sum.
1. Is difference commutative?
2. Is difference associative?
3. Does product distribute over difference?
4. Does sum distribute over product?
Answer:
1. No. In general, x − y ≠ y − x.
2. No. In general, x − (y − z) ≠ (x − y) − z.
3. Yes. x(y − z) = xy − xz.
4. No. In general, x + yz ≠ (x + y)(x + z).

3. Relations
Definitions
Suppose that S and T are sets. A relation from S to T is a subset of the product set S × T. As the name suggests,
a relation R from S to T is supposed to define a relationship between the elements of S and the elements of
T, and so we usually use the more suggestive notation x R y when (x, y) ∈ R. The domain of R is the set of first
coordinates and the range of R is the set of second coordinates:

domain(R) = {x ∈ S : (x, y) ∈ R for some y ∈ T}
range(R) = {y ∈ T : (x, y) ∈ R for some x ∈ S}

Recall that these are projections; the domain of R is the projection of R onto S and the range of R is the
projection of R onto T. A relation from a set S to itself is said to be a relation on S.
Basic Examples
Suppose that S is a set and recall that P(S) denotes the power set of S, the collection of all subsets of S. The
membership relation ∈ from S to P(S) is perhaps the most important and basic relationship in mathematics.
Indeed, for us, it's a primitive (undefined) relationship: given x and A, we assume that we understand the
meaning of the statement x ∈ A.
Another basic primitive relation is the equality relation = on a given set of objects S. That is, given two objects
x and y, we assume that we understand the meaning of the statement x = y.

Other basic relations that you have seen are
1. The subset relation ⊆ on P(S).
2. The order relation ≤ on R.
These two belong to a special class of relations known as partial orders that we will study in the next section.
Note that a function f from S into T is a special type of relation. To compare the two types of notation (relation
and function), note that x f y means that y = f(x).

Constructions
Set Operations
Since a relation is just a set of ordered pairs, the set operations can be used to build new relations from existing
ones. Specifically, if Q and R are relations from S to T, then so are Q ∪ R, Q ∩ R, and Q ∖ R. If R is a relation
from S to T and Q ⊆ R, then Q is a relation from S to T.

Suppose that Q and R are relations from S to T.
1. x (Q ∪ R) y if and only if x Q y or x R y.
2. x (Q ∩ R) y if and only if x Q y and x R y.
3. x (Q ∖ R) y if and only if x Q y but not x R y.
4. If Q ⊆ R then x Q y implies x R y.
Restrictions
If R is a relation on S and A ⊆ S, then R_A = R ∩ (A × A) is a relation on A, called the restriction of R to A.
Inverse
If R is a relation from S to T, the inverse of R is the relation from T to S defined by

y R⁻¹ x if and only if x R y

Equivalently, R⁻¹ = {(y, x) : (x, y) ∈ R}. Note that any function f from S into T has an inverse relation, but
only when f is one-to-one is the inverse relation also a function (the inverse function).
Composition
Suppose that Q is a relation from S to T and that R is a relation from T to U. The composition Q R is the
relation from S to U defined as follows: for x ∈ S and z ∈ U, x (Q R) z if and only if there exists y ∈ T such
that x Q y and y R z. Note that the notation is inconsistent with the notation used for composition of functions,
essentially because relations are read from left to right, while functions are read from right to left. Hopefully,
the inconsistency will not cause confusion, since we will always use function notation for functions.

Basic Properties
The important classes of relations that we will study in the next couple of sections are characterized by certain
basic properties. Here are the definitions:
Suppose that R is a relation on S.

1. R is reflexive if x R x for all x ∈ S.
2. R is irreflexive if no x ∈ S satisfies x R x.
3. R is symmetric if x R y implies y R x for all x, y ∈ S.
4. R is anti-symmetric if x R y and y R x imply x = y for all x, y ∈ S.
5. R is transitive if x R y and y R z imply x R z for all x, y, z ∈ S.
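For a relation on a finite set, each of these properties can be checked directly from the definition. A sketch (the checker functions and the example relation are our illustrative choices, not from the text):

```python
# Sketch (illustrative choices, not from the text): the five properties checked
# for a relation R given as a set of ordered pairs on a finite set S.

def is_reflexive(R, S):     return all((x, x) in R for x in S)
def is_irreflexive(R, S):   return all((x, x) not in R for x in S)
def is_symmetric(R, S):     return all((y, x) in R for (x, y) in R)
def is_antisymmetric(R, S): return all(x == y for (x, y) in R if (y, x) in R)
def is_transitive(R, S):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

S = {1, 2, 3}
divides = {(x, y) for x in S for y in S if y % x == 0}

assert is_reflexive(divides, S)
assert is_antisymmetric(divides, S)
assert is_transitive(divides, S)
assert not is_symmetric(divides, S)   # 1 divides 2, but 2 does not divide 1
```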
The proofs of the following results are straightforward, so be sure to try them yourself before reading the ones
in the text.
A relation R on S is reflexive if and only if the equality relation = on S is a subset of R.
Proof:
This follows from the definitions. R is reflexive if and only if (x, x) ∈ R for all x ∈ S.
A relation R on S is symmetric if and only if R⁻¹ = R.
Proof:
Suppose that R is symmetric. If (x, y) ∈ R then (y, x) ∈ R and hence (x, y) ∈ R⁻¹. If (x, y) ∈ R⁻¹ then
(y, x) ∈ R and hence (x, y) ∈ R. Thus R = R⁻¹. Conversely, suppose R = R⁻¹. If (x, y) ∈ R then (x, y) ∈ R⁻¹
and hence (y, x) ∈ R.
A relation R on S is transitive if and only if R R ⊆ R.
Proof:
Suppose that R is transitive. If (x, z) ∈ R R then there exists y ∈ S such that (x, y) ∈ R and (y, z) ∈ R. But
then (x, z) ∈ R by transitivity. Hence R R ⊆ R. Conversely, suppose that R R ⊆ R. If (x, y) ∈ R and
(y, z) ∈ R then (x, z) ∈ R R and hence (x, z) ∈ R. Hence R is transitive.

A relation R on S is antisymmetric if and only if R ∩ R⁻¹ is a subset of the equality relation = on S.
Proof:
Restated, this result is that R is antisymmetric if and only if (x, y) ∈ R ∩ R⁻¹ implies x = y. Thus suppose that
R is antisymmetric. If (x, y) ∈ R ∩ R⁻¹ then (x, y) ∈ R and (x, y) ∈ R⁻¹. But then (y, x) ∈ R, so by
antisymmetry, x = y. Conversely suppose that (x, y) ∈ R ∩ R⁻¹ implies x = y. If (x, y) ∈ R and (y, x) ∈ R then
(x, y) ∈ R⁻¹ and hence (x, y) ∈ R ∩ R⁻¹. Thus x = y, so R is antisymmetric.
Suppose that Q and R are relations on S. For each property below, if both Q and R have the property, then so
does Q ∩ R.
1. reflexive
2. symmetric
3. transitive
Proof:
1. Suppose that Q and R are reflexive. Then (x, x) ∈ Q and (x, x) ∈ R for each x ∈ S and hence
(x, x) ∈ Q ∩ R for each x ∈ S. Thus Q ∩ R is reflexive.
2. Suppose that Q and R are symmetric. If (x, y) ∈ Q ∩ R then (x, y) ∈ Q and (x, y) ∈ R. Hence (y, x) ∈ Q
and (y, x) ∈ R, so (y, x) ∈ Q ∩ R. Hence Q ∩ R is symmetric.
3. Suppose that Q and R are transitive. If (x, y) ∈ Q ∩ R and (y, z) ∈ Q ∩ R then (x, y) ∈ Q, (x, y) ∈ R,
(y, z) ∈ Q, and (y, z) ∈ R. Hence (x, z) ∈ Q and (x, z) ∈ R, so (x, z) ∈ Q ∩ R. Hence Q ∩ R is transitive.
Suppose that R is a relation on a set S.
1. Give an explicit definition for the property R is not reflexive.
2. Give an explicit definition for the property R is not irreflexive.
3. Are any of the properties R is reflexive, R is not reflexive, R is irreflexive, R is not irreflexive
equivalent?
Answer:
1. R is not reflexive if and only if there exists x ∈ S such that (x, x) ∉ R.
2. R is not irreflexive if and only if there exists x ∈ S such that (x, x) ∈ R.
3. No.
Suppose that R is a relation on a set S.
1. Give an explicit definition for the property R is not symmetric.
2. Give an explicit definition for the property R is not antisymmetric.
3. Are any of the properties R is symmetric, R is not symmetric, R is antisymmetric, R is not
antisymmetric equivalent?

Answer:
1. R is not symmetric if and only if there exist x, y ∈ S such that (x, y) ∈ R and (y, x) ∉ R.
2. R is not antisymmetric if and only if there exist distinct x, y ∈ S such that (x, y) ∈ R and (y, x) ∈ R.
3. No.

Computational Exercises
Let R be the relation defined on R by x R y if and only if sin(x) = sin(y). Determine if R has each of the
following properties:
1. reflexive
2. symmetric
3. transitive
4. irreflexive
5. antisymmetric
Answer:
1. yes
2. yes
3. yes
4. no
5. no
The relation R in the previous exercise is a member of the important class of relations known as equivalence
relations.
Let R be the relation defined on R by x R y if and only if x² + y² ≤ 1. Determine if R has each of the following
properties:
1. reflexive
2. symmetric
3. transitive
4. irreflexive
5. antisymmetric
Answer:
1. no
2. yes
3. no
4. no
5. no
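The answers for the second relation can be spot-checked on a finite sample of real numbers. A sketch (the sample points are our illustrative choices, not from the text):

```python
# Sketch (illustrative choices, not from the text): spot-checking the relation
# x R y iff x^2 + y^2 <= 1, restricted to a finite sample of real numbers.
pts = [-1.0, -0.5, 0.0, 0.5, 1.0]
R = {(x, y) for x in pts for y in pts if x * x + y * y <= 1}

assert (1.0, 1.0) not in R                 # not reflexive (fails at x = 1)
assert (0.0, 0.0) in R                     # and not irreflexive either
assert all((y, x) in R for (x, y) in R)    # symmetric
# not transitive: 1 R 0 and 0 R 1, but not 1 R 1
assert (1.0, 0.0) in R and (0.0, 1.0) in R and (1.0, 1.0) not in R
# not antisymmetric: 1 R 0 and 0 R 1 with 1 != 0
```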

4. Partial Orders
Basic Theory
Definitions
A relation ⪯ on S that is reflexive, anti-symmetric, and transitive is said to be a partial order on S, and the pair
(S, ⪯) is called a partially ordered set. Thus for all x, y, z ∈ S:

1. x ⪯ x, the reflexive property
2. If x ⪯ y and y ⪯ x then x = y, the antisymmetric property
3. If x ⪯ y and y ⪯ z then x ⪯ z, the transitive property

As the name and notation suggest, a partial order is a type of ordering of the elements of S. Partial orders occur
naturally in many areas of mathematics, including probability. A partial order on a set naturally gives rise to
several other relations on the set.
Suppose that ⪯ is a partial order on a set S. The relations ⪰, ≺, ≻, ⊥, and ∥ are defined as follows:
1. x ⪰ y if and only if y ⪯ x.
2. x ≺ y if and only if x ⪯ y and x ≠ y.
3. x ≻ y if and only if y ≺ x.
4. x ⊥ y if and only if x ⪯ y or y ⪯ x.
5. x ∥ y if and only if neither x ⪯ y nor y ⪯ x.
Note that ⪰ is the inverse of ⪯, and ≻ is the inverse of ≺. Note also that x ⪯ y if and only if either x ≺ y or
x = y, so the relation ≺ completely determines the relation ⪯. Finally, note that x ⊥ y means that x and y are
related in the partial order, while x ∥ y means that x and y are unrelated in the partial order. Thus, the relations
⊥ and ∥ are complements of each other, as sets of ordered pairs. A total or linear order is a partial order in
which there are no unrelated elements.
A partial order ⪯ on S is a total order or linear order if for every x, y ∈ S, either x ⪯ y or y ⪯ x.
Suppose that ⪯₁ and ⪯₂ are partial orders on a set S. Then ⪯₁ is a sub-order of ⪯₂, or equivalently ⪯₂ is an
extension of ⪯₁, if and only if x ⪯₁ y implies x ⪯₂ y for x, y ∈ S.

Thus if ⪯₁ is a sub-order of ⪯₂, then as sets of ordered pairs, ⪯₁ is a subset of ⪯₂. We need one more relation
that arises naturally from a partial order.
Suppose that ⪯ is a partial order on a set S. For x, y ∈ S, y is said to cover x if x ≺ y but no element z ∈ S
satisfies x ≺ z ≺ y.
If S is finite, the covering relation completely determines the partial order, by virtue of the transitive property.
The covering graph or Hasse graph is the directed graph with vertex set S and directed edge set E, where
(x, y) ∈ E if and only if y covers x. Thus, x ≺ y if and only if there is a directed path in the graph from x to y.
Hasse graphs are named for the German mathematician Helmut Hasse. The graphs are often drawn with the
edges directed upward. In this way, the directions can be inferred without having to actually draw arrows.
Basic Examples
Of course, the ordinary order ≤ is a total order on the set of real numbers R. The subset partial order is one of
the most important in probability theory:
Suppose that S is a set. The subset relation ⊆ is a partial order on P(S), the power set of S.
Proof:
We proved this result in the section on sets. To review, recall that for A, B ∈ P(S), A ⊆ B means that x ∈ A
implies x ∈ B. Also A = B means that x ∈ A if and only if x ∈ B. Thus
1. A ⊆ A
2. A ⊆ B and B ⊆ A if and only if A = B
3. A ⊆ B and B ⊆ C imply A ⊆ C
Let ∣ denote the division relation on the set of positive integers N₊. That is, m ∣ n if and only if there exists
k ∈ N₊ such that n = k m. Then

1. ∣ is a partial order on N₊.
2. ∣ is a sub-order of the ordinary order ≤.
Proof:
1. Clearly n ∣ n for n ∈ N₊, since n = 1 · n, so ∣ is reflexive. Suppose m ∣ n and n ∣ m, where m, n ∈ N₊.
Then there exist j, k ∈ N₊ such that n = k m and m = j n. Substituting gives n = j k n, and hence j = k = 1.
Thus m = n, so ∣ is antisymmetric. Finally, suppose m ∣ n and n ∣ p, where m, n, p ∈ N₊. Then there
exist j, k ∈ N₊ such that n = j m and p = k n. Substituting gives p = j k m, so m ∣ p. Thus ∣ is transitive.
2. If m, n ∈ N₊ and m ∣ n, then there exists k ∈ N₊ such that n = k m. Since k ≥ 1, m ≤ n.
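The covering relation for the division order, which gives the edges of its Hasse graph, is easy to compute on a finite initial segment. A sketch (the range {1, …, 12} is an illustrative choice, not from the text):

```python
# Sketch (illustrative choices, not from the text): the division partial order
# on {1, ..., 12} and its covering relation (the edges of the Hasse graph).
N = range(1, 13)
divides = lambda m, n: n % m == 0

# n covers m: m | n strictly, with no k strictly between them in the order
covers = {(m, n) for m in N for n in N
          if m != n and divides(m, n)
          and not any(divides(m, k) and divides(k, n) and k not in (m, n)
                      for k in N)}

assert (3, 6) in covers        # 6 covers 3
assert (2, 12) not in covers   # 2 | 12, but 4 and 6 lie strictly between
assert (1, 7) in covers        # each prime covers 1
```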
Basic Properties
The proofs of the following basic properties are straightforward. Be sure to try them yourself before reading the
ones in the text.
The inverse ⪰ of a partial order ⪯ is also a partial order.
Proof:
Clearly the reflexive, antisymmetric, and transitive properties hold for ⪰.
If ⪯ is a partial order on S and A is a subset of S, then the restriction of ⪯ to A is a partial order on A.
Proof:
The reflexive, antisymmetric, and transitive properties given above hold for all x, y, z ∈ S and hence hold for
all x, y, z ∈ A.
The following theorem characterizes relations that correspond to strict order.
Let S be a set. A relation ⪯ is a partial order on S if and only if the corresponding strict relation ≺ is transitive
and irreflexive.
Proof:
Suppose that ⪯ is a partial order on S. Recall that ≺ is defined by x ≺ y if and only if x ⪯ y and x ≠ y. If x ≺ y
and y ≺ z then x ⪯ y and y ⪯ z, and so x ⪯ z. On the other hand, if x = z then x ⪯ y and y ⪯ x so x = y, a
contradiction. Hence x ≠ z and so x ≺ z. Therefore ≺ is transitive. If x ≺ y then x ≠ y by definition, so ≺ is
irreflexive.
Conversely, suppose that ≺ is a transitive and irreflexive relation on S. Recall that ⪯ is defined by x ⪯ y if and
only if x ≺ y or x = y. By definition then, ⪯ is reflexive: x ⪯ x for every x ∈ S. Next, suppose that x ⪯ y and
y ⪯ x. If x ≺ y and y ≺ x then x ≺ x by the transitive property of ≺. But this is a contradiction by the irreflexive
property, so we must have x = y. Thus ⪯ is antisymmetric. Suppose x ⪯ y and y ⪯ z. There are four cases:
1. If x ≺ y and y ≺ z then x ≺ z by the transitive property of ≺.
2. If x = y and y ≺ z then x ≺ z by substitution.
3. If x ≺ y and y = z then x ≺ z by substitution.
4. If x = y and y = z then x = z by the transitive property of =.
In all cases we have x ⪯ z, so ⪯ is transitive. Hence ⪯ is a partial order on S.
Monotone Sets and Functions
Partial orders form a natural setting for increasing and decreasing sets and functions. Here are the definitions:
Suppose that ⪯ is a partial order on a set S and that A ⊆ S. Then
1. A is increasing if x ∈ A and x ⪯ y imply y ∈ A.
2. A is decreasing if y ∈ A and x ⪯ y imply x ∈ A.
Suppose that S is a set with partial order ⪯_S, T is a set with partial order ⪯_T, and that f: S → T. Then
1. f is increasing if and only if x ⪯_S y implies f(x) ⪯_T f(y).
2. f is decreasing if and only if x ⪯_S y implies f(x) ⪰_T f(y).
3. f is strictly increasing if and only if x ≺_S y implies f(x) ≺_T f(y).
4. f is strictly decreasing if and only if x ≺_S y implies f(x) ≻_T f(y).

Recall the definition of the indicator function 1_A associated with a subset A of a universal set S: for x ∈ S,
1_A(x) = 1 if x ∈ A and 1_A(x) = 0 if x ∉ A.

Suppose that ⪯ is a partial order on a set S and that A ⊆ S. Then
1. A is increasing if and only if 1_A is increasing.
2. A is decreasing if and only if 1_A is decreasing.
Proof:
1. A is increasing if and only if x ∈ A and x ⪯ y imply y ∈ A, if and only if 1_A(x) = 1 and x ⪯ y imply
1_A(y) = 1, if and only if 1_A is increasing.
2. A is decreasing if and only if y ∈ A and x ⪯ y imply x ∈ A, if and only if 1_A(y) = 1 and x ⪯ y imply
1_A(x) = 1, if and only if 1_A is decreasing.
Isomorphism

Two partially ordered sets (S, ⪯_S) and (T, ⪯_T) are said to be isomorphic if there exists a one-to-one function f
from S onto T such that x ⪯_S y if and only if f(x) ⪯_T f(y), for all x, y ∈ S. The function f is an isomorphism.
Generally, a mathematical system often consists of a set and various structures defined in terms of the set, such
as relations, operators, or a collection of subsets. Loosely speaking, two mathematical systems of the same type
are isomorphic if there exists a one-to-one function from one of the sets onto the other that preserves the
structures, and again, the function is called an isomorphism. The basic idea is that isomorphic systems are
mathematically identical, except for superficial matters of appearance. The word isomorphism is from the Greek
and means equal shape.
Suppose that the partially ordered sets (S, ⪯_S) and (T, ⪯_T) are isomorphic, and that f: S → T is an isomorphism.
Then f and f⁻¹ are strictly increasing.
Proof:
We need to show that for x, y ∈ S, x ≺_S y if and only if f(x) ≺_T f(y). If x ≺_S y then by definition, f(x) ⪯_T f(y).
But if f(x) = f(y) then x = y since f is one-to-one. This is a contradiction, so f(x) ≺_T f(y). Similarly, if
f(x) ≺_T f(y) then by definition, x ⪯_S y. But if x = y then f(x) = f(y), a contradiction. Hence x ≺_S y.
In a sense, the subset partial order is universal: every partially ordered set is isomorphic to (𝒮, ⊆) for some
collection of sets 𝒮.
Suppose that ⪯ is a partial order on a set S. Then there exists 𝒮 ⊆ P(S) such that (S, ⪯) is isomorphic to
(𝒮, ⊆).
Proof:
For each x ∈ S, let A_x = {u ∈ S : u ⪯ x}, and then let 𝒮 = {A_x : x ∈ S}, so that 𝒮 ⊆ P(S). We will show that the
function x ↦ A_x from S onto 𝒮 is one-to-one, and satisfies

x ⪯ y if and only if A_x ⊆ A_y

First, suppose that x, y ∈ S and A_x = A_y. Then x ∈ A_x so x ∈ A_y and hence x ⪯ y. Similarly, y ∈ A_y so y ∈ A_x
and hence y ⪯ x. Thus x = y, so the mapping is one-to-one. Next, suppose that x ⪯ y. If u ∈ A_x then u ⪯ x so
u ⪯ y by the transitive property, and hence u ∈ A_y. Thus A_x ⊆ A_y. Conversely, suppose A_x ⊆ A_y. As before,
x ∈ A_x, so x ∈ A_y and hence x ⪯ y.
Extremal Elements
Various types of extremal elements play important roles in partially ordered sets. Here are the definitions:
Suppose that ⪯ is a partial order on a set S and that A ⊆ S.

1. An element a ∈ A is the minimum element of A if and only if a ⪯ x for every x ∈ A.
2. An element a ∈ A is a minimal element of A if and only if no x ∈ A satisfies x ≺ a.
3. An element b ∈ A is the maximum element of A if and only if b ⪰ x for every x ∈ A.
4. An element b ∈ A is a maximal element of A if and only if no x ∈ A satisfies x ≻ b.
In general, a set can have several maximal and minimal elements (or none). On the other hand,
The minimum and maximum elements of A, if they exist, are unique. They are denoted min(A) and max(A),
respectively.
Proof:
Suppose that a, b are minimum elements of A. Since a, b ∈ A we have a ⪯ b and b ⪯ a, so a = b by the
antisymmetric property. The proof for the maximum element is analogous.
Minimal, maximal, minimum, and maximum elements of a set must belong to that set. The following
definitions relate to upper and lower bounds of a set, which do not have to belong to the set.
Suppose again that ⪯ is a partial order on a set S and that A ⊆ S. Then
1. An element u ∈ S is a lower bound for A if and only if u ⪯ x for every x ∈ A.
2. An element v ∈ S is an upper bound for A if and only if v ⪰ x for every x ∈ A.
3. The greatest lower bound or infimum of A, if it exists, is the maximum of the set of lower bounds of A.
4. The least upper bound or supremum of A, if it exists, is the minimum of the set of upper bounds of A.
By the uniqueness result above, the greatest lower bound of A is unique, if it exists. It is denoted glb(A) or
inf(A). Similarly, the least upper bound of A is unique, if it exists, and is denoted lub(A) or sup(A). Note
that every element of S is a lower bound and an upper bound for ∅, since the conditions in the definition hold
vacuously.
The symbols ∧ and ∨ are also used for infimum and supremum, respectively, so ⋀A = inf(A) and
⋁A = sup(A) if they exist. In particular, for x, y ∈ S, operator notation is more commonly used, so
x ∧ y = inf{x, y} and x ∨ y = sup{x, y}, if these elements exist. If x ∧ y and x ∨ y exist for every x, y ∈ S, then
(S, ⪯) is called a lattice.
For the subset partial order, the inf and sup operators correspond to intersection and union, respectively:
Let S be a set and consider the subset partial order ⊆ on P(S), the power set of S. Let 𝒜 be a nonempty
subset of P(S), that is, a nonempty collection of subsets of S. Then
1. inf(𝒜) = ⋂𝒜
2. sup(𝒜) = ⋃𝒜
Proof:
1. First, ⋂𝒜 ⊆ A for every A ∈ 𝒜 and hence ⋂𝒜 is a lower bound of 𝒜. If B is a lower bound of 𝒜 then
B ⊆ A for every A ∈ 𝒜 and hence B ⊆ ⋂𝒜. Therefore ⋂𝒜 is the greatest lower bound.
2. First, A ⊆ ⋃𝒜 for every A ∈ 𝒜 and hence ⋃𝒜 is an upper bound of 𝒜. If B is an upper bound of 𝒜
then A ⊆ B for every A ∈ 𝒜 and hence ⋃𝒜 ⊆ B. Therefore ⋃𝒜 is the least upper bound.
In particular, A ∧ B = A ∩ B and A ∨ B = A ∪ B, so (P(S), ⊆) is a lattice.
Consider the division partial order ∣ on the set of positive integers N+ and let A be a nonempty subset of N+.
1. inf(A) is the greatest common divisor of A, usually denoted gcd(A) in this context.
2. If A is infinite then sup(A) does not exist. If A is finite then sup(A) is the least common multiple of
A, usually denoted lcm(A) in this context.
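To illustrate, the standard library can compute these directly (a hypothetical sketch, not part of the text; `math.lcm` assumes Python 3.9 or later):

```python
import math
from functools import reduce

A = {2, 3, 4, 6, 12}
inf_A = reduce(math.gcd, A)  # greatest common divisor: the infimum under divisibility
sup_A = reduce(math.lcm, A)  # least common multiple: the supremum (A is finite)
print(inf_A, sup_A)  # 1 12
```

Note that inf(A) = 1 even though 1 ∉ A, consistent with the fact that bounds need not belong to the set.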


Suppose that S is a set and that f: S → S. An element z ∈ S is said to be a fixed point of f if f(z) = z.
The following result is a basic fixed point theorem for a partially ordered set. The theorem is important in
the study of cardinality.
Suppose that ≼ is a partial order on a set S with the property that sup(A) exists for every A ⊆ S. If f: S → S is
increasing (that is, x ≼ y implies f(x) ≼ f(y)), then f has a fixed point.
Proof:
Let A = {x ∈ S: x ≼ f(x)} and let z = sup(A). If x ∈ A then x ≼ z so x ≼ f(x) ≼ f(z). Hence f(z) is an upper
bound of A so z ≼ f(z). But then f(z) ≼ f(f(z)) so f(z) ∈ A. Hence f(z) ≼ z. Therefore f(z) = z.
Note that the hypotheses of the theorem require that sup(∅) = min(S) exists. The set A = {x ∈ S: x ≼ f(x)} is
nonempty since min(S) ∈ A.
If ≼ is a total order on a set S with the property that every nonempty subset of S has a minimum element, then
S is said to be well ordered by ≼. One of the most important examples is N+, which is well ordered by the
ordinary order ≤. On the other hand, the well ordering principle, which is equivalent to the axiom of choice,
states that every nonempty set can be well ordered.
Orders on Product Spaces
Suppose that S and T are sets with partial orders ≼_S and ≼_T respectively. Define the relation ≼ on S × T by
(x, y) ≼ (z, w) if and only if x ≼_S z and y ≼_T w.
1. The relation ≼ is a partial order on S × T, called, appropriately enough, the product order.
2. Suppose that (S, ≼_S) = (T, ≼_T). If S has at least 2 elements, then ≼ is not a total order on S².
Proof:
1. If (x, y) ∈ S × T then x ≼_S x and y ≼_T y and therefore (x, y) ≼ (x, y). Hence ≼ is reflexive. Suppose that
(u, v), (x, y) ∈ S × T and that (u, v) ≼ (x, y) and (x, y) ≼ (u, v). Then u ≼_S x and x ≼_S u, and v ≼_T y and
y ≼_T v. Hence u = x and v = y, so (u, v) = (x, y). Hence ≼ is antisymmetric. Finally, suppose that
(u, v), (x, y), (z, w) ∈ S × T, and that (u, v) ≼ (x, y) and (x, y) ≼ (z, w). Then u ≼_S x and x ≼_S z, and
v ≼_T y and y ≼_T w. Hence u ≼_S z and v ≼_T w, and therefore (u, v) ≼ (z, w). Hence ≼ is transitive.
2. Suppose that (S, ≼_S) = (T, ≼_T), and that x, y ∈ S are distinct. If (x, y) ≼ (y, x) then x ≼_S y and y ≼_S x
and hence x = y, a contradiction. Similarly (y, x) ≼ (x, y) leads to the same contradiction x = y. Hence
(x, y) and (y, x) are unrelated in the partial order ≼.
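A quick illustration (a hypothetical Python sketch, not part of the text), using the ordinary order ≤ on each coordinate: componentwise comparison of pairs defines the product order, and pairs such as (1, 4) and (3, 2) are incomparable.

```python
def product_leq(p, q):
    """(x, y) is below (z, w) in the product order iff x <= z and y <= w."""
    return all(a <= b for a, b in zip(p, q))

print(product_leq((1, 2), (3, 4)))  # True
print(product_leq((1, 4), (3, 2)))  # False
print(product_leq((3, 2), (1, 4)))  # False: (1,4) and (3,2) are not comparable
```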

The product order on R². The region shaded red is the set of points ≽ (x, y). The region shaded blue is the set
of points ≼ (x, y). The region shaded white is the set of points that are not comparable with (x, y).
Product order extends in a straightforward way to the Cartesian product of a finite or an infinite sequence of
partially ordered spaces. For example, suppose that Si is a set with partial order ≼_i for each i ∈ {1, 2, …, n},
where n ∈ N+. Let S = S1 × S2 × ⋯ × Sn. The product order ≼ on S is defined as follows: for x = (x1, x2,
…, xn) ∈ S and y = (y1, y2, …, yn) ∈ S, x ≼ y if and only if xi ≼_i yi for each i ∈ {1, 2, …, n}. Recalling that
sequences are just functions defined on a particular type of index set, the product order can be generalized even
further:
Suppose that Si is a set with partial order ≼_i for each i in a nonempty index set I. Let
S = {f: f is a function on I such that f(i) ∈ Si for each i ∈ I}
Define the relation ≼ on S by f ≼ g if and only if f(i) ≼_i g(i) for each i ∈ I. Then ≼ is a partial order on S,
known again as the product order.
Proof:
In spite of the abstraction, the proof is perfectly straightforward.
1. If f ∈ S, then f(i) ≼_i f(i) for every i ∈ I, and hence f ≼ f. Thus ≼ is reflexive.
2. Suppose that f, g ∈ S and that f ≼ g and g ≼ f. Then f(i) ≼_i g(i) and g(i) ≼_i f(i) for each i ∈ I. Hence
f(i) = g(i) for each i ∈ I and so f = g. Thus ≼ is antisymmetric.
3. Suppose that f, g, h ∈ S and that f ≼ g and g ≼ h. Then f(i) ≼_i g(i) and g(i) ≼_i h(i) for each i ∈ I.
Hence f(i) ≼_i h(i) for each i ∈ I, so f ≼ h. Thus ≼ is transitive.
Note that no assumptions are made on the index set I, other than that it be nonempty. In particular, no order is
necessary on I. The next result gives a very different type of order on a product space.
Suppose again that S and T are sets with partial orders ≼_S and ≼_T respectively. Define the relation ≼ on S × T
by (x, y) ≼ (z, w) if and only if either x ≺_S z, or x = z and y ≼_T w.
1. The relation ≼ is a partial order on S × T, called the lexicographic order or dictionary order.
2. If ≼_S and ≼_T are total orders on S and T, respectively, then ≼ is a total order on S × T.
Proof:
1. If (x, y) ∈ S × T, then x = x and y ≼_T y. Hence (x, y) ≼ (x, y) and so ≼ is reflexive. Suppose that
(u, v), (x, y) ∈ S × T, and that (u, v) ≼ (x, y) and (x, y) ≼ (u, v). There are four cases, but only one can
actually happen.
o If u ≺_S x and x ≺_S u then u ≺_S u. This is a contradiction since ≺_S is irreflexive, and so this case
cannot happen.
o If u ≺_S x, x = u, and y ≼_T v, then u ≺_S u. This case also cannot happen.
o If u = x, v ≼_T y, and x ≺_S u, then u ≺_S u. This case also cannot happen.
o If u = x, v ≼_T y, x = u, and y ≼_T v, then u = x and v = y, so (u, v) = (x, y).
Hence ≼ is antisymmetric. Finally, suppose that (u, v), (x, y), (z, w) ∈ S × T and that (u, v) ≼ (x, y) and
(x, y) ≼ (z, w). Again there are four cases:
o If u ≺_S x and x ≺_S z then u ≺_S z, and hence (u, v) ≼ (z, w).
o If u ≺_S x, x = z, and y ≼_T w then u ≺_S z and hence (u, v) ≼ (z, w).
o If u = x, v ≼_T y, and x ≺_S z then u ≺_S z and hence (u, v) ≼ (z, w).
o If u = x, v ≼_T y, x = z, and y ≼_T w, then u = z and v ≼_T w, and hence (u, v) ≼ (z, w).
Therefore ≼ is transitive.
2. Suppose that ≼_S and ≼_T are total orders. Let (u, v), (x, y) ∈ S × T. Then either u ≼_S x or x ≼_S u, and
either v ≼_T y or y ≼_T v. Again there are four cases:
o If u ≺_S x then (u, v) ≼ (x, y).
o If x ≺_S u then (x, y) ≼ (u, v).
o If u = x and v ≼_T y then (u, v) ≼ (x, y).
o If u = x and y ≼_T v then (x, y) ≼ (u, v).
Hence ≼ is a total order.

The lexicographic order on R². The region shaded red is the set of points ≽ (x, y). The region shaded blue is
the set of points ≼ (x, y). The region shaded white is the set of points that are not comparable with (x, y).
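Many programming languages build this order in. In Python, for example, tuples are compared in exactly this first-difference fashion (a hypothetical aside, not part of the text):

```python
# Tuple comparison in Python is lexicographic: the first differing
# coordinate decides, and later coordinates are ignored.
print((1, 9) < (2, 0))   # True: decided by the first coordinates, 1 < 2
print((1, 9) < (1, 10))  # True: first coordinates tie, so 9 < 10 decides
print(sorted([(2, 0), (1, 9), (1, 2)]))  # [(1, 2), (1, 9), (2, 0)]
```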

As with the product order, the lexicographic order can be generalized to a collection of partially ordered spaces.
However, we need the index set to be totally ordered.
Suppose that Si is a set with partial order ≼_i for each i in a nonempty index set I. Suppose also that ≤ is a total
order on I. Let
S = {f: f is a function on I such that f(i) ∈ Si for each i ∈ I}
Define the relation ≺ on S as follows: f ≺ g if and only if there exists j ∈ I such that f(i) = g(i) if i < j and
f(j) ≺_j g(j). Then
1. ≺ is a strict partial order on S, known again as the lexicographic order.
2. If ≼_i is a total order for each i ∈ I, and I is well ordered by ≤, then S is totally ordered by ≺.
Proof:
1. By the result above on strict orders, we need to show that ≺ is irreflexive and transitive. Thus, suppose
that f, g ∈ S and that f ≺ g and g ≺ f. Then there exists j ∈ I such that f(i) = g(i) if i < j and
f(j) ≺_j g(j). Similarly there exists k ∈ I such that g(i) = f(i) if i < k and g(k) ≺_k f(k). Since I is totally
ordered under ≤, either j < k, k < j, or j = k. If j < k then we have g(j) = f(j) and f(j) ≺_j g(j), a
contradiction. If k < j then we have f(k) = g(k) and g(k) ≺_k f(k), another contradiction. If j = k, we
have f(j) ≺_j g(j) and g(j) ≺_j f(j), another contradiction. Hence ≺ is irreflexive. Next, suppose that
f, g, h ∈ S and that f ≺ g and g ≺ h. Then there exists j ∈ I such that f(i) = g(i) if i < j and f(j) ≺_j g(j).
Similarly, there exists k ∈ I such that g(i) = h(i) if i < k and g(k) ≺_k h(k). Again, since I is totally
ordered, either j < k or k < j or j = k. If j < k, then f(i) = g(i) = h(i) if i < j and f(j) ≺_j g(j) = h(j). If
k < j, then f(i) = g(i) = h(i) if i < k and f(k) = g(k) ≺_k h(k). If j = k, then f(i) = g(i) = h(i) if i < j and
f(j) ≺_j g(j) ≺_j h(j). In all cases, f ≺ h so ≺ is transitive.
2. Suppose now that ≼_i is a total order on Si for each i ∈ I and that I is well ordered by ≤. Let f, g ∈ S
with f ≠ g. Let J = {i ∈ I: f(i) ≠ g(i)}. Then J ≠ ∅ by assumption, and hence has a minimum element j.
If i < j then i ∉ J and hence f(i) = g(i). On the other hand, f(j) ≠ g(j) since j ∈ J and therefore, since
≼_j is a total order, we must have either f(j) ≺_j g(j) or g(j) ≺_j f(j). In the first case, f ≺ g and in the
second case g ≺ f. Hence S is totally ordered by ≺.
The term lexicographic comes from the way that we order words alphabetically: We look at the first letter; if
these are different, we know how to order the words. If the first letters are the same, we look at the second
letter; if these are different, we know how to order the words. We continue in this way until we find letters that
are different, and we can order the words. In fact, the lexicographic order is sometimes referred to as the first
difference order. Note also that if Si is a set and ≼_i a total order on Si for i ∈ I, then by the well ordering
principle, there exists a well ordering of I, and hence there exists a lexicographic total order on the product
space S. As a mathematical structure, the lexicographic order is not as obscure as you might think.

(R, ≤) is isomorphic to the lexicographic product of (Z, ≤) with ([0, 1), ≤), where ≤ is the ordinary order for
real numbers.
Proof:
Every x ∈ R can be uniquely expressed in the form x = n + t where n = ⌊x⌋ ∈ Z is the integer part and
t = x − n ∈ [0, 1) is the remainder. Thus x ↦ (n, t) is a one-to-one function from R onto Z × [0, 1). For example,
5.3 maps to (5, 0.3), while −6.7 maps to (−7, 0.3). Suppose that x = m + s and y = n + t, where of course
m, n ∈ Z are the integer parts of x and y, respectively, and s, t ∈ [0, 1) are the corresponding remainders. Then
x < y if and only if m < n, or m = n and s < t. Again, to illustrate with real numbers, we can tell that
5.3 < 7.8 just by comparing the integer parts: 5 < 7; we can ignore the remainders. On the other hand, to see that
6.4 < 6.7 we need to compare the remainders: 0.4 < 0.7, since the integer parts are the same.
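The decomposition in the proof can be sketched in Python (hypothetical code, not part of the text; `decompose` is an illustrative name, and the remainders are subject to floating-point rounding):

```python
import math

def decompose(x):
    """Split x into integer part n = floor(x) and remainder t = x - n in [0, 1)."""
    n = math.floor(x)
    return (n, x - n)

# Comparing reals agrees with comparing (integer part, remainder) pairs;
# Python's tuple order is lexicographic, matching the isomorphism.
print(decompose(5.3))   # (5, 0.3) up to float rounding
print(decompose(-6.7))  # (-7, 0.3) up to float rounding
print((decompose(5.3) < decompose(7.8)) == (5.3 < 7.8))  # True
```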
Limits of Sequences of Real Numbers
Suppose that (a1, a2, …) is a sequence of real numbers.
The sequence inf{an, an+1, …} is increasing in n ∈ N+.
Since the sequence of infimums in the last result is increasing, the limit exists in R ∪ {−∞, ∞}, and is called the
limit inferior of the original sequence:
liminf(n→∞) an = lim(n→∞) inf{an, an+1, …}
The sequence sup{an, an+1, …} is decreasing in n ∈ N+.
Since the sequence of supremums in the last result is decreasing, the limit exists in R ∪ {−∞, ∞}, and is called
the limit superior of the original sequence:
limsup(n→∞) an = lim(n→∞) sup{an, an+1, …}
Note that liminf(n→∞) an ≤ limsup(n→∞) an, and equality holds if and only if lim(n→∞) an exists (and is the
common value).
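On a finite prefix of a sequence, the tail infimums and supremums can be tabulated directly (a hypothetical Python sketch, not part of the text; a finite truncation only suggests the limiting values, it cannot compute them):

```python
def tail_infs(a):
    """inf{a_n, a_{n+1}, ...} restricted to a finite prefix; increasing in n."""
    return [min(a[n:]) for n in range(len(a))]

def tail_sups(a):
    """sup{a_n, a_{n+1}, ...} restricted to a finite prefix; decreasing in n."""
    return [max(a[n:]) for n in range(len(a))]

a = [(-1) ** n + 1 / (n + 1) for n in range(8)]  # oscillates toward -1 and +1
print(tail_infs(a))  # nondecreasing, heading toward liminf = -1
print(tail_sups(a))  # nonincreasing, heading toward limsup = +1
```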

Computational Exercises
Let S = {2, 3, 4, 6, 12}.
1. Sketch the Hasse graph corresponding to the ordinary order ≤ on S.
2. Sketch the Hasse graph corresponding to the division partial order ∣ on S.
Answer:
1. (sketch: a path with edges 2 → 3 → 4 → 6 → 12)
2. (sketch: edges 2 → 4, 2 → 6, 3 → 6, 4 → 12, 6 → 12)
Consider the ordinary order ≤ on the set of real numbers R, and let A = [a, b) where a < b. Find each of the
following that exist:
1. The set of minimal elements of A
2. The set of maximal elements of A
3. min(A)
4. max(A)
5. The set of lower bounds of A
6. The set of upper bounds of A
7. inf(A)
8. sup(A)
Answer:
1. {a}
2. ∅
3. a
4. Does not exist
5. (−∞, a]
6. [b, ∞)
7. a
8. b
Again consider the division partial order ∣ on the set of positive integers N+ and let A = {2, 3, 4, 6, 12}. Find
each of the following that exist:
1. The set of minimal elements of A
2. The set of maximal elements of A
3. min(A)
4. max(A)
5. The set of lower bounds of A
6. The set of upper bounds of A
7. inf(A)
8. sup(A)
Answer:
1. {2, 3}
2. {12}
3. Does not exist
4. 12
5. {1}
6. {12, 24, 36, …}
7. 1
8. 12
Let S = {a, b, c}.
1. Give P(S) in list form.
2. Describe the Hasse graph of (P(S), ⊆).
Answer:
1. P(S) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, S}
2. For A ∈ P(S) and x ∈ S ∖ A, there is a directed edge from A to A ∪ {x}.
Note that the Hasse graph of ⊆ looks the same as the graph of ⊇, except for the labels on the vertices. This
symmetry comes from the complement relationship, which reverses the partial order.
Let S = {a, b, c, d}.
1. Give P(S) in list form.
2. Describe the Hasse graph of (P(S), ⊆).
Answer:
1. P(S) = {∅, {a}, {b}, {c}, {d}, {a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d}, {a, b, c}, {a, b, d}, {a, c, d},
{b, c, d}, S}
2. For A ∈ P(S) and x ∈ S ∖ A, there is a directed edge from A to A ∪ {x}.
Note again that the Hasse graph of ⊆ looks the same as the graph of ⊇, except for the labels on the vertices.
This symmetry comes from the complement relationship.
Suppose that A and B are subsets of a universal set S. Let 𝒜 denote the collection of the 16 subsets of S that
can be constructed from A and B using the set operations. Show that (𝒜, ⊆) is isomorphic to the partially
ordered set in the previous exercise. Use the Venn diagram applet to help.
Proof:
Let a = A ∩ B, b = A ∩ Bᶜ, c = Aᶜ ∩ B, d = Aᶜ ∩ Bᶜ. Our basic assumption is that A and B are in general position,
so that a, b, c, d are distinct and nonempty. Note also that {a, b, c, d} partitions S. Now, map each subset 𝒮 of
{a, b, c, d} to ⋃𝒮. This function is an isomorphism from P({a, b, c, d}) to 𝒜. That is, for subsets 𝒮 and 𝒯 of
{a, b, c, d}, 𝒮 ⊆ 𝒯 if and only if ⋃𝒮 ⊆ ⋃𝒯.

5. Equivalence Relations
Basic Theory
Definitions
A relation ≈ on a nonempty set S that is reflexive, symmetric, and transitive is said to be an equivalence
relation on S. Thus, for all x, y, z ∈ S,
1. x ≈ x, the reflexive property.
2. If x ≈ y then y ≈ x, the symmetric property.
3. If x ≈ y and y ≈ z then x ≈ z, the transitive property.
As the name and notation suggest, an equivalence relation is intended to define a type of equivalence among the
elements of S. Like partial orders, equivalence relations occur naturally in most areas of mathematics, including
probability.
Suppose that ≈ is an equivalence relation on S. The equivalence class of an element x ∈ S is the set of all
elements that are equivalent to x, and is denoted
[x] = {y ∈ S: y ≈ x}
Results
The most important result is that an equivalence relation on a set S defines a partition of S, by means of the
equivalence classes.
Suppose that ≈ is an equivalence relation on a set S.
1. If x ≈ y then [x] = [y].
2. If x ≉ y then [x] ∩ [y] = ∅.
3. The collection of (distinct) equivalence classes is a partition of S into nonempty sets.
Proof:
1. Suppose that x ≈ y. If u ∈ [x] then u ≈ x and hence u ≈ y by the transitive property. Hence u ∈ [y].
Similarly, if u ∈ [y] then u ≈ y. But y ≈ x by the symmetric property, and hence u ≈ x by the transitive
property. Hence u ∈ [x].
2. Suppose that x ≉ y. If u ∈ [x] ∩ [y], then u ∈ [x] and u ∈ [y], so u ≈ x and u ≈ y. But then x ≈ u by the
symmetric property, and then x ≈ y by the transitive property. This is a contradiction, so [x] ∩ [y] = ∅.
3. From (a) and (b), the (distinct) equivalence classes are disjoint. If x ∈ S, then x ≈ x by the reflexive
property, and hence x ∈ [x]. Therefore ⋃{[x]: x ∈ S} = S.
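The partition can be constructed mechanically for a finite set (a hypothetical Python sketch, not part of the text; `equivalence_classes` and `same_parity` are illustrative names). By part (a), membership in a class can be tested against any single representative:

```python
def equivalence_classes(S, equiv):
    """Group the elements of S into the (distinct) equivalence classes."""
    classes = []
    for x in S:
        for c in classes:
            if equiv(x, next(iter(c))):  # one test suffices: classes are blocks
                c.add(x)
                break
        else:
            classes.append({x})
    return classes

same_parity = lambda x, y: (x - y) % 2 == 0
print(equivalence_classes(range(6), same_parity))  # [{0, 2, 4}, {1, 3, 5}]
```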
Sometimes the set of equivalence classes is denoted S/≈. The idea is that the equivalence classes are new
objects obtained by identifying elements in S that are equivalent. Conversely, every partition of a set defines an
equivalence relation on the set.
Suppose that 𝒮 is a collection of nonempty sets that partition a given set S. Define the relation ≈ on S
by x ≈ y if and only if x ∈ A and y ∈ A for some A ∈ 𝒮.
1. ≈ is an equivalence relation.
2. 𝒮 is the set of equivalence classes.
Proof:
1. If x ∈ S, then x ∈ A for some A ∈ 𝒮, since 𝒮 partitions S. Hence x ≈ x, and so the reflexive property holds.
Next, ≈ is trivially symmetric by definition. Finally, suppose that x ≈ y and y ≈ z. Then x, y ∈ A for
some A ∈ 𝒮 and y, z ∈ B for some B ∈ 𝒮. But then y ∈ A ∩ B. The sets in 𝒮 are disjoint, so A = B.
Hence x, z ∈ A, so x ≈ z. Thus ≈ is transitive.
2. If x ∈ S, then x ∈ A for a unique A ∈ 𝒮, and then by definition, [x] = A.

A partition of S. Any two points in the same partition set are equivalent.
Sometimes the equivalence relation ≈ associated with a given partition 𝒮 is denoted S/𝒮. The idea, of course,
is that elements in the same set of the partition are equivalent.
The process of forming a partition from an equivalence relation, and the process of forming an equivalence
relation from a partition, are inverses of each other.
1. If we start with an equivalence relation ≈ on S, form the associated partition, and then construct the
equivalence relation associated with the partition, then we end up with the original equivalence relation.
In modular notation, S/(S/≈) = ≈.
2. If we start with a partition 𝒮 of S, form the associated equivalence relation, and then form the partition
associated with the equivalence relation, then we end up with the original partition. In modular notation,
S/(S/𝒮) = 𝒮.
Suppose that S is a nonempty set. The most basic equivalence relation on S is the equality relation =. In this
case [x] = {x} for each x ∈ S. At the other extreme is the trivial relation ≈ defined by x ≈ y for all x, y ∈ S. In
this case S is the only equivalence class.
Every function f defines an equivalence relation on its domain, known as the equivalence relation associated
with f. Moreover, the equivalence classes have a simple description in terms of the inverse images of f.
Suppose that f: S → T. Define the relation ≈ on S by x ≈ y if and only if f(x) = f(y).
1. The relation ≈ is an equivalence relation on S.
2. The set of equivalence classes is 𝒮 = {f⁻¹{t}: t ∈ range(f)}.
3. The function F: S/≈ → T defined by F([x]) = f(x) is well defined and is one-to-one.
Proof:
1. If x ∈ S then trivially f(x) = f(x), so x ≈ x. Hence ≈ is reflexive. If x ≈ y then f(x) = f(y) so trivially
f(y) = f(x) and hence y ≈ x. Thus ≈ is symmetric. If x ≈ y and y ≈ z then f(x) = f(y) and f(y) = f(z),
so trivially f(x) = f(z) and so x ≈ z. Hence ≈ is transitive.
2. Recall that t ∈ range(f) if and only if f(x) = t for some x ∈ S. Then by definition, with t = f(x),
[x] = f⁻¹{t} = {y ∈ S: f(y) = t} = {y ∈ S: f(y) = f(x)}
3. From (1), [x] = [y] if and only if x ≈ y if and only if f(x) = f(y). This shows both that F is well defined,
and that F is one-to-one.
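For a finite domain, the classes f⁻¹{t} can be collected by grouping elements on their f-values (a hypothetical Python sketch, not part of the text; `fibers` is an illustrative name):

```python
from collections import defaultdict

def fibers(S, f):
    """The classes of x = y iff f(x) = f(y): one class per value t in range(f)."""
    classes = defaultdict(set)
    for x in S:
        classes[f(x)].add(x)
    return dict(classes)

print(fibers(range(-3, 4), lambda x: x * x))
# the fibers of x squared: {-3, 3}, {-2, 2}, {-1, 1}, {0}, keyed by t
```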

The equivalence relation on S associated with f: S → T.
Suppose again that f: S → T.
1. If f is one-to-one then the equivalence relation associated with f is the equality relation, and hence
[x] = {x} for each x ∈ S.
2. If f is a constant function then the equivalence relation associated with f is the trivial relation, and
hence S is the only equivalence class.
Proof:
1. If f is one-to-one, then x ≈ y if and only if f(x) = f(y) if and only if x = y.
2. If f is constant on S then f(x) = f(y) and hence x ≈ y for all x, y ∈ S.
Equivalence relations associated with functions are universal: every equivalence relation is of this form.
Suppose that ≈ is an equivalence relation on a set S. Define f: S → P(S) by f(x) = [x]. Then ≈ is the
equivalence relation associated with f.
Proof:
From (1), x ≈ y if and only if [x] = [y] if and only if f(x) = f(y).
Suppose that ≈ and ≅ are equivalence relations on a set S. Let ∼ denote the intersection of ≈ and ≅ (thought
of as subsets of S × S). Equivalently, x ∼ y if and only if x ≈ y and x ≅ y.
1. ∼ is an equivalence relation on S.
2. [x]∼ = [x]≈ ∩ [x]≅.

Suppose that we have a relation that is reflexive and transitive, but fails to be a partial order because it is not
antisymmetric. The relation and its inverse naturally lead to an equivalence relation, and then in turn, the
original relation defines a true partial order on the equivalence classes. This is a common construction, and the
details are given in the next theorem.
Suppose that ≼ is a relation on a set S that is reflexive and transitive. Define the relation ≈ on S by x ≈ y if and
only if x ≼ y and y ≼ x.
1. ≈ is an equivalence relation on S.
2. If A and B are equivalence classes and x ≼ y for some x ∈ A and y ∈ B, then u ≼ v for all u ∈ A and
v ∈ B.
3. Define the relation ⊑ on the collection of equivalence classes 𝒮 by A ⊑ B if and only if x ≼ y for some
(and hence all) x ∈ A and y ∈ B. Then ⊑ is a partial order on 𝒮.
Proof:
1. If x ∈ S then x ≼ x since ≼ is reflexive. Hence x ≈ x, so ≈ is reflexive. Clearly ≈ is symmetric by the
symmetry of the definition. Suppose that x ≈ y and y ≈ z. Then x ≼ y, y ≼ z, z ≼ y and y ≼ x. Hence x ≼ z
and z ≼ x since ≼ is transitive. Therefore x ≈ z so ≈ is transitive.
2. Suppose that A and B are equivalence classes of ≈ and that x ≼ y for some x ∈ A and y ∈ B. If u ∈ A
and v ∈ B, then x ≈ u and y ≈ v. Therefore u ≼ x and y ≼ v. By transitivity, u ≼ v.
3. Suppose that A ∈ 𝒮. If x, y ∈ A then x ≈ y and hence x ≼ y. Therefore A ⊑ A and so ⊑ is reflexive. Next
suppose that A, B ∈ 𝒮 and that A ⊑ B and B ⊑ A. If x ∈ A and y ∈ B then x ≼ y and y ≼ x. Hence x ≈ y,
so A = B. Therefore ⊑ is antisymmetric. Finally, suppose that A, B, C ∈ 𝒮 and that A ⊑ B and B ⊑ C.
Note that B ≠ ∅, so let y ∈ B. If x ∈ A and z ∈ C then x ≼ y and y ≼ z. Hence x ≼ z and therefore A ⊑ C. So
⊑ is transitive.
A prime example of the construction in the previous theorem occurs when we have a function whose range
space is partially ordered. We can construct a partial order on the equivalence classes in the domain that are
associated with the function.
Suppose that S and T are sets and that ≼_T is a partial order on T. Suppose also that f: S → T. Define the relation
≼_S on S by x ≼_S y if and only if f(x) ≼_T f(y).
1. ≼_S is reflexive and transitive.
2. The equivalence relation on S constructed in the previous theorem is the equivalence relation associated
with f.
3. ≼_S can be extended to a partial order on the equivalence classes corresponding to f.
Proof:
1. If x ∈ S then f(x) ≼_T f(x) since ≼_T is reflexive, and hence x ≼_S x. Thus ≼_S is reflexive. Suppose
that x, y, z ∈ S and that x ≼_S y and y ≼_S z. Then f(x) ≼_T f(y) and f(y) ≼_T f(z). Hence f(x) ≼_T f(z)
since ≼_T is transitive. Thus ≼_S is transitive.
2. For the equivalence relation ≈ on S constructed in the previous theorem, x ≈ y if and only if x ≼_S y and y ≼_S x if and only
if f(x) ≼_T f(y) and f(y) ≼_T f(x) if and only if f(x) = f(y), since ≼_T is antisymmetric. Thus ≈ is the
equivalence relation associated with f.
3. This follows immediately from the previous theorem and parts (a) and (b). If u, v ∈ range(f), then f⁻¹({u}) ≼_S f⁻¹({v})
if and only if u ≼_T v.

Examples and Applications


Simple Functions
Give the equivalence classes explicitly for the functions from R into R defined below:
1. f(x) = x².
2. g(x) = ⌊x⌋.
3. h(x) = sin(x).
Answer:
1. [x] = {x, −x}
2. [x] = [⌊x⌋, ⌊x⌋ + 1)
3. [x] = {x + 2nπ: n ∈ Z} ∪ {(2n + 1)π − x: n ∈ Z}
Calculus
Suppose that I is a fixed interval of R, and that S is the set of differentiable functions from I into R. Consider
the equivalence relation associated with the derivative operator D on S, so that D(f) = f′. For f ∈ S, give a
simple description of [f].
Answer:
[f] = {f + c: c ∈ R}
Congruence
Recall the division relation ∣ from N+ to Z: For d ∈ N+ and n ∈ Z, d ∣ n means that n = kd for some k ∈ Z. In
words, d divides n or equivalently n is a multiple of d. In the previous section, we showed that ∣ is a partial
order on N+.
Now fix d ∈ N+. Define the relation ≡_d on Z by m ≡_d n if and only if d ∣ (n − m). The relation ≡_d is known as
congruence modulo d. Also, let r_d: Z → {0, 1, …, d − 1} be defined so that r_d(n) is the remainder when n is
divided by d. Specifically, recall that by the Euclidean division theorem, named for Euclid of course, n ∈ Z can
be written uniquely in the form n = kd + q where k ∈ Z and q ∈ {0, 1, …, d − 1}, and then r_d(n) = q.
Congruence modulo d:
1. ≡_d is the equivalence relation associated with the function r_d.
2. There are d distinct equivalence classes, given by [q]_d = {q + kd: k ∈ Z} for q ∈ {0, 1, …, d − 1}.
Proof:
1. Recall that for the equivalence relation associated with r_d, integers m and n are equivalent if and only if
r_d(m) = r_d(n). By the division theorem, m = jd + p and n = kd + q, where j, k ∈ Z and p, q ∈ {0, 1, …, d − 1},
and these representations are unique. Thus n − m = (k − j)d + (q − p), and so m ≡_d n if and only if
d ∣ (n − m) if and only if p = q if and only if r_d(m) = r_d(n).
2. Recall that the equivalence classes are r_d⁻¹{q} for q ∈ range(r_d) = {0, 1, …, d − 1}. By the division
theorem, r_d⁻¹{q} = {kd + q: k ∈ Z}.
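A sketch in Python (hypothetical code, not part of the text): the `%` operator returns a remainder in {0, 1, …, d − 1} even for negative n, so it computes r_d directly, and the classes can be sampled on a finite window of integers.

```python
def congruence_classes(d, lo=-20, hi=21):
    """The d classes [q]_d = {q + kd : k in Z}, sampled on a finite window."""
    return {q: [n for n in range(lo, hi) if n % d == q] for q in range(d)}

classes = congruence_classes(4)
print(classes[1])  # the sampled part of [1]_4, e.g. -7, -3, 1, 5, 9, ...
```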
Explicitly give the equivalence classes for ≡_4, congruence mod 4.
Answer:
[0]_4 = {0, 4, 8, 12, …} ∪ {−4, −8, −12, −16, …}
[1]_4 = {1, 5, 9, 13, …} ∪ {−3, −7, −11, −15, …}
[2]_4 = {2, 6, 10, 14, …} ∪ {−2, −6, −10, −14, …}
[3]_4 = {3, 7, 11, 15, …} ∪ {−1, −5, −9, −13, …}

Linear Algebra
Linear algebra provides several examples of important and interesting equivalence relations. To set the stage, let
R^{m×n} denote the set of m × n matrices with real entries, for m, n ∈ N+.
First recall that the following are row operations on a matrix:
1. Multiply a row by a non-zero real number.
2. Interchange two rows.
3. Add a multiple of a row to another row.
Row operations are essential for inverting matrices and solving systems of linear equations. If A ∈ R^{m×n} and
B ∈ R^{m×n}, then A and B are said to be row equivalent if A can be transformed into B by a finite sequence of
row operations.
Row equivalence is an equivalence relation on R^{m×n}.
Proof:
If A ∈ R^{m×n}, then A is row equivalent to itself: we can simply do nothing, or if you prefer, we can multiply the
first row of A by 1. For symmetry, the key is that each row operation can be reversed by another row operation:
multiplying a row by c ≠ 0 is reversed by multiplying the same row of the resulting matrix by 1/c.
Interchanging two rows is reversed by interchanging the same two rows of the resulting matrix. Adding c times
row i to row j is reversed by adding −c times row i to row j in the resulting matrix. Thus, if we can transform
A into B by a finite sequence of row operations, then we can transform B into A by applying the reversed row
operations in the reverse order. Transitivity is clear: if we can transform A into B by a sequence of row
operations and B into C by another sequence of row operations, then we can transform A into C by putting the
two sequences together.
Next recall that n × n matrices A and B are said to be similar if there exists an invertible n × n matrix P such
that P⁻¹AP = B. Similarity is very important in the study of linear transformations, change of basis, and the
theory of eigenvalues and eigenvectors.
Similarity is an equivalence relation on R^{n×n}.
Proof:
If A ∈ R^{n×n} then A = I⁻¹AI, where I is the n × n identity matrix, so A is similar to itself. Suppose that
A, B ∈ R^{n×n} and that A is similar to B, so that B = P⁻¹AP for some invertible P ∈ R^{n×n}. Then
A = PBP⁻¹ = (P⁻¹)⁻¹BP⁻¹, so B is similar to A. Finally, suppose that A, B, C ∈ R^{n×n} and that A is similar
to B and that B is similar to C. Then B = P⁻¹AP and C = Q⁻¹BQ for some invertible P, Q ∈ R^{n×n}. Then
C = Q⁻¹P⁻¹APQ = (PQ)⁻¹A(PQ), so A is similar to C.

Next recall that for A ∈ R^{m×n}, the transpose of A is the matrix Aᵀ ∈ R^{n×m} with the property that the (i, j) entry of
A is the (j, i) entry of Aᵀ, for i ∈ {1, 2, …, m} and j ∈ {1, 2, …, n}. Simply stated, Aᵀ is the matrix whose
rows are the columns of A. For the theorem that follows, we need to remember that (AB)ᵀ = BᵀAᵀ for
A ∈ R^{m×n} and B ∈ R^{n×k}, and (Aᵀ)⁻¹ = (A⁻¹)ᵀ if A ∈ R^{n×n} is invertible.
Matrices A, B ∈ R^{n×n} are said to be congruent (not to be confused with congruent integers above) if there exists
an invertible P ∈ R^{n×n} such that B = PᵀAP. Congruence is important in the study of orthogonal matrices and
change of basis.
Congruence is an equivalence relation on R^{n×n}.
Proof:
If A ∈ R^{n×n} then A = IᵀAI, where again I is the n × n identity matrix, so A is congruent to itself. Suppose that
A, B ∈ R^{n×n} and that A is congruent to B, so that B = PᵀAP for some invertible P ∈ R^{n×n}. Then
A = (Pᵀ)⁻¹BP⁻¹ = (P⁻¹)ᵀBP⁻¹, so B is congruent to A. Finally, suppose that A, B, C ∈ R^{n×n} and that A is congruent
to B and that B is congruent to C. Then B = PᵀAP and C = QᵀBQ for some invertible P, Q ∈ R^{n×n}. Then
C = QᵀPᵀAPQ = (PQ)ᵀA(PQ), so A is congruent to C.
Number Systems
Equivalence relations play an important role in the construction of complex mathematical structures from
simpler ones. Often the objects in the new structure are equivalence classes of objects constructed from the
simpler structures, modulo an equivalence relation that captures the essential properties of the new objects.
The construction of number systems is a prime example of this general idea. The next exercise explores the
construction of rational numbers from integers.
Define a relation ≈ on Z × N+ by (j, k) ≈ (m, n) if and only if jn = km.
1. ≈ is an equivalence relation.
2. Define m/n = [(m, n)], the equivalence class generated by (m, n), for m ∈ Z and n ∈ N+. This definition
captures the essential properties of the rational numbers.
Proof:
1. For (m, n) ∈ Z × N+, mn = nm of course, so (m, n) ≈ (m, n). Hence ≈ is reflexive. If (j, k),
(m, n) ∈ Z × N+ and (j, k) ≈ (m, n), then jn = km so trivially mk = nj, and hence (m, n) ≈ (j, k). Thus
≈ is symmetric. Finally, suppose that (j, k), (m, n), (p, q) ∈ Z × N+ and that (j, k) ≈ (m, n) and
(m, n) ≈ (p, q). Then jn = km and mq = np, so jnq = kmq = knp, and hence jq = kp since n ∈ N+.
Hence (j, k) ≈ (p, q), so ≈ is transitive.
2. Suppose that j/k and m/n are rational numbers in the usual, informal sense, where j, m ∈ Z and k, n ∈ N+.
Then j/k = m/n if and only if jn = km if and only if (j, k) ≈ (m, n), so it makes sense to define m/n as the
equivalence class generated by (m, n). Addition and multiplication are defined in the usual way: if (j, k),
(m, n) ∈ Z × N+ then
j/k + m/n = (jn + mk)/(kn),  (j/k)(m/n) = (jm)/(kn)
The definitions are consistent; that is, they do not depend on the particular representatives of the
equivalence classes.
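The construction can be sketched on representatives (hypothetical Python, not part of the text; `equivalent` and `add` are illustrative names), including a check that addition is consistent across representatives of the same class:

```python
def equivalent(p, q):
    """(j, k) is related to (m, n) iff jn = km, for j, m in Z and k, n in N+."""
    (j, k), (m, n) = p, q
    return j * n == k * m

def add(p, q):
    """j/k + m/n = (jn + mk)/(kn), computed on representatives."""
    (j, k), (m, n) = p, q
    return (j * n + m * k, k * n)

print(equivalent((1, 2), (3, 6)))      # True: 1/2 = 3/6
print(add((1, 2), (1, 3)))             # (5, 6)
print(equivalent(add((1, 2), (1, 3)),  # consistency: another representative
                 add((2, 4), (1, 3)))) # of 1/2 gives an equivalent sum: True
```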

6. Cardinality
Basic Theory
Definitions
Suppose that 𝒮 is a nonempty collection of sets. We define a relation ≈ on 𝒮 by A ≈ B if and only if there
exists a one-to-one function f from A onto B. The relation ≈ is an equivalence relation on 𝒮. That is, for all
A, B, C ∈ 𝒮,
1. A ≈ A, the reflexive property
2. If A ≈ B then B ≈ A, the symmetric property
3. If A ≈ B and B ≈ C then A ≈ C, the transitive property
Proof:
1. The identity function I_A on A, given by I_A(x) = x for x ∈ A, maps A one-to-one onto A. Hence A ≈ A.
2. If A ≈ B then there exists a one-to-one function f from A onto B. But then f⁻¹ is a one-to-one function
from B onto A, so B ≈ A.
3. Suppose that A ≈ B and B ≈ C. Then there exists a one-to-one function f from A onto B and a one-to-one
function g from B onto C. But then g∘f is a one-to-one function from A onto C, so A ≈ C.
A one-to-one function f from A onto B is sometimes called a bijection. Thus if A ≈ B then A and B are in
one-to-one correspondence and are said to have the same cardinality. Thus, the equivalence classes under this
equivalence relation capture the notion of having the same number of elements.
Let N_0 = ∅, and for k ∈ N+, let N_k = {0, 1, …, k − 1}. As always, N = {0, 1, 2, …} is the set of all natural
numbers.
Suppose that A is a set.
1. A is finite if A ≈ N_k for some k ∈ N, in which case k is the cardinality of A, and we write #(A) = k.
2. A is infinite if A is not finite.
3. A is countably infinite if A ≈ N.
4. A is countable if A is finite or countably infinite.
5. A is uncountable if A is not countable.
In part (a), think of N_k as a reference set with k elements; any other set with k elements must be equivalent to
this one. We will study the cardinality of finite sets in the next two sections on Counting Measure and
Combinatorial Structures. In this section, we will concentrate primarily on infinite sets. In parts (c) and (d), a countable
set is one that can be enumerated or counted by putting the elements into one-to-one correspondence with N_k
for some k ∈ N or with all of N. An uncountable set is one that cannot be so counted. Countable sets play a
special role in probability theory, as in many other branches of mathematics. A priori, it's not clear that there are
uncountable sets, but we will soon see examples.
Preliminary Examples
If S is a set, recall that P(S) denotes the power set of S (the set of all subsets of S). If A and B are sets, then
A^B is the set of all functions from B into A. In particular, {0, 1}^S denotes the set of functions from S into
{0, 1}.
If S is a set then P(S) ≈ {0, 1}^S.
Proof:
The mapping that takes a set A ∈ P(S) to its indicator function 1_A ∈ {0, 1}^S is one-to-one and onto.
Specifically, if A, B ∈ P(S) and 1_A = 1_B, then A = B, so the mapping is one-to-one. On the other hand, if
f ∈ {0, 1}^S then f = 1_A where A = {x ∈ S: f(x) = 1}. Hence the mapping is onto.

Next are some examples of countably infinite sets.


The following sets are countably infinite:
1. The set of even natural numbers E = {0, 2, 4, …}
2. The set of integers Z
Proof:
1. The function f: N → E given by f(n) = 2n is one-to-one and onto.
2. The function g: N → Z given by g(n) = n/2 if n is even and g(n) = −(n + 1)/2 if n is odd is one-to-one and
onto.
At one level, it might seem that E has only half as many elements as N, while Z has twice as many elements as
N. But as the previous result shows, that point of view is incorrect: N, E, and Z all have the same cardinality
(and are countably infinite). The next example shows that there are indeed uncountable sets.
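The two bijections in the proof are easy to tabulate (a hypothetical Python sketch, not part of the text):

```python
def to_even(n):
    """f(n) = 2n: a bijection from N onto the even natural numbers."""
    return 2 * n

def to_integer(n):
    """g(n) = n/2 for even n, -(n + 1)/2 for odd n: a bijection from N onto Z."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

print([to_even(n) for n in range(5)])     # [0, 2, 4, 6, 8]
print([to_integer(n) for n in range(7)])  # [0, -1, 1, -2, 2, -3, 3]
```

The second list shows how g sweeps out Z by alternating between nonnegative and negative values.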
If A is a set with at least two elements then S=AN, the set of all functions from N into A, is uncountable.
Proof:
The proof is by contradiction, and uses a nice trick known as the diagonalization method. Suppose that S is
countably infinite (it's clearly not finite), so that the elements of S can be enumerated: S = {f_0, f_1, f_2, …}. Let a
and b denote distinct elements of A and define g: N → A by g(n) = b if f_n(n) = a and g(n) = a if f_n(n) ≠ a.
Note that g ≠ f_n for each n ∈ N, so g ∉ S. This contradicts the fact that S is the set of all functions from N into
A.
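The diagonalization trick can be illustrated concretely with A = {0, 1}. The sketch below uses a finite list of functions as a stand-in for a purported enumeration; the constructed g disagrees with the nth function at position n:

```python
# Sketch of diagonalization with A = {0, 1}: given functions f_0, f_1, ...
# (a finite list here, standing in for an enumeration), build g with
# g(n) != f_n(n) for every n, so g is missed by the list.

def diagonal(fs):
    return [1 - fs[n](n) for n in range(len(fs))]  # flip the diagonal values

fs = [lambda n: 0, lambda n: n % 2, lambda n: 1]
g = diagonal(fs)
print(g)  # [1, 0, 0]: differs from f_n at position n
```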
Subsets of Infinite Sets
Surely a set must be at least as large as any of its subsets, in terms of cardinality. On the other hand, by the
example above, the set of natural numbers N, the set of even natural numbers E, and the set of integers Z all
have exactly the same cardinality, even though E ⊂ N ⊂ Z. In this subsection, we will explore some interesting
and somewhat paradoxical results that relate to subsets of infinite sets. Along the way, we will see that the
countable infinity is the smallest of the infinities.
If S is an infinite set then S has a countably infinite subset.
Proof:

Select a_0 ∈ S. It's possible to do this since S is infinite and therefore nonempty. Inductively, having chosen
{a_0, a_1, …, a_{k−1}} ⊆ S, select a_k ∈ S ∖ {a_0, a_1, …, a_{k−1}}. Again, it's possible to do this since S is not finite.
Manifestly, {a_0, a_1, …} is a countably infinite subset of S.
A set S is infinite if and only if S is equivalent to a proper subset of S.
Proof:
If S is finite, then S is not equivalent to a proper subset by the pigeonhole principle. If S is infinite, then S has a
countably infinite subset {a_0, a_1, a_2, …} by the previous result. Define the function f: S → S by f(a_n) = a_{2n} and
f(x) = x for x ∈ S ∖ {a_0, a_1, a_2, …}. Then f maps S one-to-one onto S ∖ {a_1, a_3, a_5, …}.


When S was infinite in the proof of the previous result, not only did we map S one-to-one onto a proper subset,
we actually threw away a countably infinite subset and still maintained equivalence. Similarly, we can add a
countably infinite set to an infinite set S without changing the cardinality.
If S is an infinite set and B is a countable set, then S ≈ S ∪ B.
Proof:
Consider the most extreme case where B is countably infinite and disjoint from S. Then S has a countably
infinite subset A = {a_0, a_1, a_2, …} by a result above, and B = {b_0, b_1, b_2, …}. Define the function f: S → S ∪ B by
f(a_n) = a_{n/2} if n is even, f(a_n) = b_{(n−1)/2} if n is odd, and f(x) = x if x ∈ S ∖ {a_0, a_1, a_2, …}. Then f maps S
one-to-one onto S ∪ B.
In particular, if S is uncountable and B is countable then S ∪ B and S ∖ B have the same cardinality as S, and in
particular are uncountable.
In terms of the dichotomies finite-infinite and countable-uncountable, a set is indeed at least as large as a subset.
First we need a preliminary result.
If S is countably infinite and A ⊆ S then A is countable.
Proof:
It suffices to show that if A is an infinite subset of S then A is countably infinite. Since S is countably infinite,
it can be enumerated: S = {x_0, x_1, x_2, …}. Let n_i be the ith smallest index such that x_{n_i} ∈ A. Then
A = {x_{n_0}, x_{n_1}, x_{n_2}, …} and hence is countably infinite.


Suppose that A ⊆ B.
1. If B is finite then A is finite.
2. If A is infinite then B is infinite.
3. If B is countable then A is countable.
4. If A is uncountable then B is uncountable.
Proof:
1. This is clear from the definition of a finite set.
2. This is the contrapositive of (a).
3. If A is finite, then A is countable. If A is infinite, then B is infinite by (b) and hence is countably
infinite. But then A is countably infinite by the previous result.
4. This is the contrapositive of (c).
Comparisons by one-to-one and onto functions
We will look deeper at the general question of when one set is at least as big as another, in the sense of
cardinality. Not surprisingly, this will eventually lead to a partial order on the cardinality equivalence classes.
First note that if there exists a function that maps a set A one-to-one into a set B, then in a sense, there is a copy
of A contained in B. Hence B should be at least as large as A.
Suppose that f: A → B is one-to-one.
1. If B is finite then A is finite.
2. If A is infinite then B is infinite.
3. If B is countable then A is countable.
4. If A is uncountable then B is uncountable.
Proof:
Note that f maps A one-to-one onto f(A). Hence A ≈ f(A) and f(A) ⊆ B. The results now follow from the
previous comparison result:
1. If B is finite then f(A) is finite and hence A is finite.
2. If A is infinite then f(A) is infinite and hence B is infinite.
3. If B is countable then f(A) is countable and hence A is countable.
4. If A is uncountable then f(A) is uncountable and hence B is uncountable.

On the other hand, if there exists a function that maps a set A onto a set B, then in a sense, there is a copy of B
contained in A. Hence A should be at least as large as B.
Suppose that f: A → B is onto.
1. If A is finite then B is finite.
2. If B is infinite then A is infinite.
3. If A is countable then B is countable.
4. If B is uncountable then A is uncountable.
Proof:
For each y ∈ B, select a specific x ∈ A with f(x) = y (if you are persnickety, you may need to invoke the axiom
of choice). Let C be the set of chosen points. Then f maps C one-to-one onto B, so C ≈ B and C ⊆ A. The
results now follow from the comparison result above:
1. If A is finite then C is finite and hence B is finite.
2. If B is infinite then C is infinite and hence A is infinite.
3. If A is countable then C is countable and hence B is countable.
4. If B is uncountable then C is uncountable and hence A is uncountable.
The previous exercise also could be proved from the one before, since if there exists a function f mapping A
onto B, then there exists a function g mapping B one-to-one into A. This duality is proven in the discussion of
the axiom of choice. A simple and useful corollary of the previous two theorems is that if B is a given countably
infinite set, then a set A is countable if and only if there exists a one-to-one function f from A into B, if and
only if there exists a function g from B onto A.
If A_i is a countable set for each i in a countable index set I, then ∪_{i∈I} A_i is countable.
Proof:
Consider the most extreme case in which the index set I is countably infinite. Since A_i is countable, there exists
a function f_i that maps N onto A_i for each i ∈ N. Let M = {2^i 3^j : (i, j) ∈ N × N}. Note that the points in M are
distinct, that is, 2^i 3^j ≠ 2^m 3^n if (i, j), (m, n) ∈ N × N and (i, j) ≠ (m, n). Hence M is infinite, and since M ⊆ N,
M is countably infinite. The function f given by f(2^i 3^j) = f_i(j) maps M onto ∪_{i∈I} A_i, and hence this last set
is countable.
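The key fact in the proof, that the map (i, j) ↦ 2^i 3^j embeds N × N into N, follows from uniqueness of prime factorization. A quick computational spot-check (the grid size is arbitrary):

```python
# Sketch: the map (i, j) -> 2**i * 3**j embeds N x N into N; by unique
# prime factorization the codes are distinct, spot-checked on a 20 x 20 grid.

def encode(i, j):
    return 2**i * 3**j

codes = {encode(i, j) for i in range(20) for j in range(20)}
print(len(codes))  # 400: all 400 pairs get distinct codes
```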

If A and B are countable then A × B is countable.


Proof:
There exists a function f that maps N onto A, and there exists a function g that maps N onto B. Again, let
M = {2^i 3^j : (i, j) ∈ N × N} and recall that M is countably infinite. Define h: M → A × B by
h(2^i 3^j) = (f(i), g(j)). Then h maps M onto A × B and hence this last set is countable.
The last result could also be proven from the one before, by noting that
A × B = ∪_{a∈A} ({a} × B)
Both proofs work because the set M is essentially a copy of N × N, embedded inside of N. The last theorem
generalizes to the statement that a finite product of countable sets is still countable. But, from the result above
on sets of functions, a product of infinitely many sets (with at least 2 elements each) will be uncountable.
The set of rational numbers Q is countably infinite.
Proof:
The sets Z and N+ are countably infinite and hence the set Z × N+ is countably infinite. The function
f: Z × N+ → Q given by f(m, n) = m/n is onto.


A real number is algebraic if it is the root of a polynomial function (of degree 1 or more) with integer
coefficients. Rational numbers are algebraic, as are rational roots of rational numbers (when defined).
Moreover, the algebraic numbers are closed under addition, multiplication, and division. A real number is
transcendental if it's not algebraic. The numbers e and π are transcendental, but we don't know very many other
transcendental numbers by name. However, as we will see, most (in the sense of cardinality) real numbers are
transcendental.
The set of algebraic numbers A is countably infinite.
Proof:
Let Z_0 = Z ∖ {0} and let Z_n = Z^n × Z_0 for n ∈ N+. The set Z_n is countably infinite for each n. Let
C = ∪_{n=1}^∞ Z_n. Think of C as the set of coefficients and note that C is countably infinite. Let P denote the set
of polynomials of degree 1 or more, with integer coefficients. The function
(a_0, a_1, …, a_n) ↦ a_0 + a_1 x + ⋯ + a_n x^n maps C onto P, and hence P is countable. For p ∈ P, let A_p denote the set of
roots of p. A polynomial of degree n in P has at most n roots, by the fundamental theorem of algebra, so in
particular A_p is finite for each p ∈ P. Finally, note that A = ∪_{p∈P} A_p and so A is countable. Of course N ⊆ A, so
A is countably infinite.
Now let's look at some uncountable sets.
The interval [0,1) is uncountable.
Proof:
Recall that {0,1}^{N+} is the set of all functions from N+ into {0,1}, which in this case can be thought of as
infinite sequences or bit strings:
{0,1}^{N+} = {x = (x_1, x_2, …) : x_n ∈ {0,1} for all n ∈ N+}
By (4), this set is uncountable. Let N = {x ∈ {0,1}^{N+} : x_n = 1 for all but finitely many n}, the set of bit
strings that eventually terminate in all 1s. Note that N = ∪_{n=1}^∞ N_n where N_n = {x ∈ {0,1}^{N+} : x_k = 1 for
all k ≥ n}. Clearly N_n is finite for all n ∈ N+, so N is countable, and therefore S = {0,1}^{N+} ∖ N is uncountable.
In fact, S ≈ {0,1}^{N+}. The function
x ↦ Σ_{n=1}^∞ x_n / 2^n
maps S one-to-one onto [0,1). In words, every number in [0,1) has a unique binary expansion in the form of a
sequence in S. Hence [0,1) ≈ S and in particular, is uncountable. The reason for eliminating the bit strings that
terminate in 1s is to ensure uniqueness, so that the mapping is one-to-one. The bit string x_1 x_2 ⋯ x_k 0111…
corresponds to the same number in [0,1) as the bit string x_1 x_2 ⋯ x_k 1000….
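The duplication of values for strings ending 0111… and 1000… can be seen numerically on finite truncations; the gap between the two values is 2^(−m) for a length-m truncation, vanishing in the limit. A minimal sketch using exact rational arithmetic:

```python
# Sketch: the map x -> sum of x_n / 2^n on finite bit strings, illustrating
# why strings ending 0111... and 1000... must be identified: their values
# agree in the limit. Fractions give exact arithmetic, avoiding rounding.
from fractions import Fraction

def value(bits):
    return sum(Fraction(b, 2**n) for n, b in enumerate(bits, start=1))

a = [1, 0, 1, 0] + [1] * 20   # .1010 followed by twenty 1s
b = [1, 0, 1, 1] + [0] * 20   # .1011 followed by twenty 0s
print(value(b) - value(a))    # 1/16777216 = 2**-24, shrinking as the tail grows
```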
The following sets have the same cardinality, and in particular all are uncountable:
1. R, the set of real numbers.
2. Any interval I of R, as long as the interval is not empty or a single point.
3. R ∖ Q, the set of irrational numbers.
4. R ∖ A, the set of transcendental numbers.
5. P(N), the power set of N.

Proof:
1. The mapping x ↦ (2x − 1)/[x(1 − x)] maps (0,1) one-to-one onto R so (0,1) ≈ R. But (0,1) = [0,1) ∖ {0}, so
[0,1) ≈ (0,1) ≈ R, and all of these sets are uncountable by the previous result.
2. Suppose a, b ∈ R and a < b. The mapping x ↦ a + (b − a)x maps (0,1) one-to-one onto (a,b) and hence
(a,b) ≈ (0,1) ≈ R. Also, [a,b) = (a,b) ∪ {a}, (a,b] = (a,b) ∪ {b}, and [a,b] = (a,b) ∪ {a,b}, so
(a,b) ≈ [a,b) ≈ (a,b] ≈ [a,b] ≈ R. The function x ↦ e^x maps R one-to-one onto (0,∞), so (0,∞) ≈ R. For
a ∈ R, the function x ↦ a + x maps (0,∞) one-to-one onto (a,∞) and the mapping x ↦ a − x maps
(0,∞) one-to-one onto (−∞,a), so (a,∞) ≈ (−∞,a) ≈ (0,∞) ≈ R. Next, [a,∞) = (a,∞) ∪ {a} and
(−∞,a] = (−∞,a) ∪ {a}, so [a,∞) ≈ (−∞,a] ≈ R.
3. Q is countably infinite, so R ∖ Q ≈ R.
4. Similarly, A is countably infinite, so R ∖ A ≈ R.
5. If S is countably infinite, then by the previous result and (a), P(S) ≈ P(N+) ≈ {0,1}^{N+} ≈ [0,1) ≈ R.
The Cardinality Partial Order
Suppose that S is a nonempty collection of sets. We define the relation ≼ on S by A ≼ B if and only if there
exists a one-to-one function f from A into B, if and only if there exists a function g from B onto A. In light of
the previous subsection, A ≼ B should capture the notion that B is at least as big as A, in the sense of
cardinality.
The relation ≼ is reflexive and transitive.
Proof:
For A ∈ S, the identity function I_A: A → A given by I_A(x) = x is one-to-one (and also onto), so A ≼ A. Suppose
that A, B, C ∈ S and that A ≼ B and B ≼ C. Then there exist one-to-one functions f: A → B and g: B → C. But
then g ∘ f: A → C is one-to-one, so A ≼ C.
Thus, we can use the construction in the section on Equivalence Relations to first define an equivalence
relation on S, and then extend ≼ to a true partial order on the collection of equivalence classes. The only
question that remains is whether the equivalence relation we obtain in this way is the same as the one that we
have been using in our study of cardinality. Rephrased, the question is this: If there exists a one-to-one function
from A into B and a one-to-one function from B into A, does there necessarily exist a one-to-one function from
A onto B? Fortunately, the answer is yes; the result is known as the Schröder-Bernstein Theorem, named for
Ernst Schröder and Felix Bernstein.

If A ≼ B and B ≼ A then A ≈ B.


Proof:
Set inclusion ⊆ is a partial order on P(A) (the power set of A) with the property that every subcollection of
P(A) has a supremum (namely the union of the subcollection). Suppose that f maps A one-to-one into B and
g maps B one-to-one into A. Define the function h: P(A) → P(A) by h(U) = A ∖ g[B ∖ f(U)] for U ⊆ A. Then
h is increasing:
U ⊆ V ⟹ f(U) ⊆ f(V) ⟹ B ∖ f(V) ⊆ B ∖ f(U) ⟹ g[B ∖ f(V)] ⊆ g[B ∖ f(U)] ⟹ A ∖ g[B ∖ f(U)] ⊆ A ∖ g[B ∖ f(V)]
From the fixed point theorem for partially ordered sets, there exists U ⊆ A such that h(U) = U. Hence
U = A ∖ g[B ∖ f(U)] and therefore A ∖ U = g[B ∖ f(U)]. Now define F: A → B by F(x) = f(x) if x ∈ U and
F(x) = g⁻¹(x) if x ∈ A ∖ U.
f maps U one-to-one onto f(U); g maps B ∖ f(U) one-to-one onto A ∖ U
Next we show that F is one-to-one. Suppose that x_1, x_2 ∈ A and F(x_1) = F(x_2). If x_1, x_2 ∈ U then f(x_1) = f(x_2),
so x_1 = x_2 since f is one-to-one. If x_1, x_2 ∈ A ∖ U then g⁻¹(x_1) = g⁻¹(x_2), so x_1 = x_2 since g⁻¹ is one-to-one. If
x_1 ∈ U and x_2 ∈ A ∖ U, then F(x_1) = f(x_1) ∈ f(U) while F(x_2) = g⁻¹(x_2) ∈ B ∖ f(U), so F(x_1) = F(x_2) is
impossible.
Finally we show that F is onto. Let y ∈ B. If y ∈ f(U) then y = f(x) for some x ∈ U so F(x) = y. If y ∈ B ∖ f(U)
then x = g(y) ∈ A ∖ U so F(x) = g⁻¹(x) = y.
We will write A ≺ B if A ≼ B but A is not equivalent to B. That is, there exists a one-to-one function from A
into B, but there does not exist a function from A onto B. Note that ≺ would have its usual meaning if applied
to the equivalence classes. That is, [A] ≺ [B] if and only if [A] ≼ [B] but [A] ≠ [B]. Intuitively, of course, A ≺ B
means that B is strictly larger than A, in the sense of cardinality.

A ≺ B in each of the following cases:
1. A and B are finite and #(A) < #(B).
2. A is finite and B is countably infinite.
3. A is countably infinite and B is uncountable.
We close our discussion with the observation that for any set, there is always a larger set.

If S is a set then S ≺ P(S).


Proof:
First, it's trivial to map S one-to-one into P(S); just map x to {x}. Suppose now that f maps S onto P(S)
and let R = {x ∈ S : x ∉ f(x)}. Since f is onto, there exists t ∈ S such that f(t) = R. But then t ∈ f(t) if and
only if t ∉ f(t), a contradiction.
The proof that a set cannot be mapped onto its power set is similar to the Russell paradox, named for Bertrand
Russell.
The continuum hypothesis is the statement that there is no set whose cardinality is strictly between that of N and
R. The continuum hypothesis actually started out as the continuum conjecture, until it was shown to be
consistent with the standard axioms of set theory (by Kurt Gödel in 1940), and independent of those
axioms (by Paul Cohen in 1963).
If S is uncountable, then there exists A ⊆ S such that A and A^c are uncountable.
Proof:
There is an easy proof, assuming the continuum hypothesis. Under this hypothesis, if S is uncountable then
[0,1) ≼ S. Hence there exists a one-to-one function f: [0,1) → S. Let A = f[0, 1/2). Then A is uncountable, and
since f[1/2, 1) ⊆ A^c, A^c is uncountable.
There is a more complicated proof using just the axiom of choice.

7. Counting Measure
Basic Theory
Suppose that S is a finite set. For AS, the cardinality of A is the number of elements in A, and is denoted

#(A). The function # on P(S) is called counting measure. Counting measure plays a fundamental role in
discrete probability structures, and particularly those that involve sampling from a finite set. The set S is
typically very large, hence efficient counting methods are essential. The first combinatorial problem is attributed
to the Greek mathematician Xenocrates.
In many cases, a set of objects can be counted by establishing a one-to-one correspondence between the given
set and some other set. Naturally, the two sets have the same number of elements, but for some reason, the
second set may be easier to count.
The Addition Rule
The addition rule of combinatorics is simply the additivity axiom of counting measure.
If {A_1, A_2, …, A_n} is a collection of disjoint subsets of S then
#(∪_{i=1}^n A_i) = Σ_{i=1}^n #(A_i)

The addition rule


Simple Counting Rules
The counting rules in this subsection are simple consequences of the addition rule. Be sure to try the proofs
yourself before reading the ones in the text.

#(A^c) = #(S) − #(A). This is the complement rule.


Proof:
Note that A and A^c are disjoint and their union is S. Hence #(A) + #(A^c) = #(S).

The complement rule

#(B ∖ A) = #(B) − #(A ∩ B). This is the difference rule.


Proof:
Note that A ∩ B and B ∖ A are disjoint and their union is B. Hence #(A ∩ B) + #(B ∖ A) = #(B).
If A ⊆ B then #(B ∖ A) = #(B) − #(A).
Proof:
This follows from the difference rule, since A ∩ B = A.
If A ⊆ B then #(A) ≤ #(B).
Proof:
This follows from the previous result: #(B) = #(A) + #(B ∖ A) ≥ #(A).
Thus, # is an increasing function, relative to the subset partial order ⊆ on P(S) and the ordinary order ≤ on N.
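The simple counting rules are easy to verify on a small concrete example (the particular sets below are arbitrary choices for illustration):

```python
# Sketch: the complement, difference, and increasing rules checked on a
# small concrete universal set S.
S = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5, 6}

assert len(S - A) == len(S) - len(A)       # complement rule
assert len(B - A) == len(B) - len(A & B)   # difference rule
assert len(A & B) <= len(B)                # # is increasing
print("counting rules verified")
```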
Inequalities
This subsection gives two inequalities that are useful for obtaining bounds on the number of elements in a set.
The first is Boole's inequality (named after George Boole) which gives an upper bound on the cardinality of a
union.
If {A_1, A_2, …, A_n} is a finite collection of subsets of S then
#(∪_{i=1}^n A_i) ≤ Σ_{i=1}^n #(A_i)
Proof:

Let B_1 = A_1 and B_i = A_i ∖ (A_1 ∪ ⋯ ∪ A_{i−1}) for i ∈ {2, 3, …, n}. Note that {B_1, B_2, …, B_n} is a pairwise disjoint
collection and has the same union as {A_1, A_2, …, A_n}. From the increasing property, #(B_i) ≤ #(A_i) for
each i ∈ {1, 2, …, n}. Hence by the addition rule,
#(∪_{i=1}^n A_i) = #(∪_{i=1}^n B_i) = Σ_{i=1}^n #(B_i) ≤ Σ_{i=1}^n #(A_i)
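The over-counting of overlapping elements can be seen on a small example (the sets below are arbitrary illustrations):

```python
# Sketch: Boole's inequality on concrete subsets; the sum on the right
# over-counts the elements that lie in more than one A_i.
sets = [{1, 2, 3}, {2, 3, 4}, {5}]
union = set().union(*sets)
print(len(union), sum(len(A) for A in sets))  # 5 7
```

The elements 2 and 3 each lie in two of the sets, which accounts for the gap between 5 and 7.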
Intuitively, Boole's inequality holds because parts of the union have been counted more than once in the
expression on the right. The second inequality is Bonferroni's inequality (named after Carlo Bonferroni), which
gives a lower bound on the cardinality of an intersection.
If {A_1, A_2, …, A_n} is a finite collection of subsets of S then
#(∩_{i=1}^n A_i) ≥ #(S) − Σ_{i=1}^n [#(S) − #(A_i)]
Proof:
Using the complement rule, Boole's inequality, and DeMorgan's law,
#(∩_{i=1}^n A_i) = #(S) − #(∪_{i=1}^n A_i^c) ≥ #(S) − Σ_{i=1}^n #(A_i^c) = #(S) − Σ_{i=1}^n [#(S) − #(A_i)]
The Inclusion-Exclusion Formula
The inclusion-exclusion formula gives the cardinality of a union of sets in terms of the cardinality of the various
intersections of the sets. The formula is useful because intersections are often easier to count. We start with the
special cases of two sets and three sets. As usual, we assume that the sets are subsets of a finite universal set S.
If A and B are subsets of S then #(A ∪ B) = #(A) + #(B) − #(A ∩ B).
Proof:
Note first that A ∪ B = A ∪ (B ∖ A) and the latter two sets are disjoint. From the addition rule and the difference
rule,
#(A ∪ B) = #(A) + #(B ∖ A) = #(A) + #(B) − #(A ∩ B)

The inclusion-exclusion theorem for two sets


If A, B, C are subsets of S then #(A ∪ B ∪ C) = #(A) + #(B) + #(C) − #(A ∩ B) − #(A ∩ C) − #(B ∩ C)
+ #(A ∩ B ∩ C).
Proof:
Note that A ∪ B ∪ C = (A ∪ B) ∪ [C ∖ (A ∪ B)] and that A ∪ B and C ∖ (A ∪ B) are disjoint. Using the addition
rule and the difference rule,
#(A ∪ B ∪ C) = #(A ∪ B) + #[C ∖ (A ∪ B)] = #(A ∪ B) + #(C) − #[C ∩ (A ∪ B)] = #(A ∪ B) + #(C)
− #[(A ∩ C) ∪ (B ∩ C)]
Now using the inclusion-exclusion rule for two sets (twice) we have
#(A ∪ B ∪ C) = #(A) + #(B) − #(A ∩ B) + #(C) − #(A ∩ C) − #(B ∩ C) + #(A ∩ B ∩ C)

The inclusion-exclusion theorem for three sets


The inclusion-exclusion rule for two sets and for three sets can be generalized to a union of n sets; the
generalization is known as the (general) inclusion-exclusion formula.
Suppose that {A_i : i ∈ I} is a collection of subsets of S where I is an index set with #(I) = n. Then
#(∪_{i∈I} A_i) = Σ_{k=1}^n (−1)^{k−1} Σ_{J⊆I, #(J)=k} #(∩_{j∈J} A_j)
Proof:
The proof is by induction on n. The formula holds for n = 2 events by (7) and for n = 3 events by (8). Suppose
the formula holds for n ∈ N+, and suppose that {A_1, A_2, …, A_n, A_{n+1}} is a collection of n + 1 subsets of S.
Then
∪_{i=1}^{n+1} A_i = (∪_{i=1}^n A_i) ∪ [A_{n+1} ∖ (∪_{i=1}^n A_i)]
and the two sets connected by the central union are disjoint. Using the addition rule and the difference rule,
#(∪_{i=1}^{n+1} A_i) = #(∪_{i=1}^n A_i) + #(A_{n+1}) − #[A_{n+1} ∩ (∪_{i=1}^n A_i)] = #(∪_{i=1}^n A_i)
+ #(A_{n+1}) − #[∪_{i=1}^n (A_i ∩ A_{n+1})]
By the induction hypothesis, the formula holds for the two unions of n sets in the last expression. The result
then follows by simplification.
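The general formula is mechanical to implement: for each nonempty subset J of indices, add or subtract the cardinality of the corresponding intersection according to the parity of #(J). A minimal sketch, checked against a union counted directly (the example sets are arbitrary):

```python
# Sketch of the general inclusion-exclusion formula: sum, over nonempty
# index sets J, of (-1)^(#(J) - 1) times the cardinality of the
# intersection over J, compared with the union counted directly.
from itertools import combinations
from functools import reduce

def inclusion_exclusion(sets):
    total = 0
    for k in range(1, len(sets) + 1):
        for J in combinations(sets, k):
            total += (-1) ** (k - 1) * len(reduce(set.intersection, J))
    return total

sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5, 6}]
print(inclusion_exclusion(sets), len(set().union(*sets)))  # 6 6
```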
The general Bonferroni inequalities, named again for Carlo Bonferroni, state that if the sum on the right is
truncated after k terms (k < n), then the truncated sum is an upper bound for the cardinality of the union if k is
odd (so that the last term has a positive sign) and is a lower bound for the cardinality of the union if k is even
(so that the last term has a negative sign).
The Multiplication Rule
The multiplication rule of combinatorics is based on the formulation of a procedure (or algorithm) that
generates the objects to be counted. Specifically, suppose that the procedure consists of k steps, performed
sequentially, and that for each j ∈ {1, 2, …, k}, step j can be performed in n_j ways regardless of the choices
made on the previous steps. Then the number of ways to perform the entire algorithm (and hence the number of
objects) is n_1 n_2 ⋯ n_k.
The key to a successful application of the multiplication rule to a counting problem is the clear formulation of
an algorithm that generates the objects being counted, so that each object is generated once and only once. That
is, we must neither over-count nor under-count. It's also important to notice that the set of choices available at
step j may well depend on the previous steps; the assumption is only that the number of choices available does
not depend on the previous steps.
The first two results below give equivalent formulations of the multiplication principle.
Suppose that S is a set of sequences of length k, and that we denote a generic element of S by (x_1, x_2, …, x_k).
Suppose that for each j ∈ {1, 2, …, k}, x_j has n_j different values, regardless of the values of the previous
coordinates. Then #(S) = n_1 n_2 ⋯ n_k.
Proof:
A procedure that generates the sequences in S consists of k steps. Step j is to select the jth coordinate.
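The multiplication rule for sequences is what Python's `itertools.product` computes; a quick sketch with n_1 = 2, n_2 = 3, n_3 = 4 (the coordinate sets are arbitrary):

```python
# Sketch: the multiplication rule via itertools.product; sequences with
# n_1 = 2, n_2 = 3, n_3 = 4 choices per coordinate number 2 * 3 * 4 in all.
from itertools import product

S = list(product(range(2), range(3), range(4)))
print(len(S))  # 24
```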
Suppose that T is an ordered tree with depth k and that each vertex at level i − 1 has n_i children for
i ∈ {1, 2, …, k}. Then the number of endpoints of the tree is n_1 n_2 ⋯ n_k.

Proof:
Each endpoint of the tree is uniquely associated with the path from the root vertex to the endpoint. Each such
path is a sequence of length k, in which there are n_j values for coordinate j for each j ∈ {1, 2, …, k}. Hence the
result follows from the previous result on sequences.
Product Sets
If S_i is a set with n_i elements for i ∈ {1, 2, …, k} then
#(S_1 × S_2 × ⋯ × S_k) = n_1 n_2 ⋯ n_k
Proof:
This is a corollary of the result above on sequences.
If S is a set with n elements, then S^k has n^k elements.
Proof:
This is a corollary of the previous result on product sets.
In the previous result, note that the elements of S^k can be thought of as ordered samples of size k that can be
chosen with replacement from a population of n objects. Elements of {0,1}^n are sometimes called bit strings of
length n. Thus, there are 2^n bit strings of length n.
Functions
The number of functions from a set A of m elements into a set B of n elements is n^m.
Proof:
An algorithm for constructing a function f: A → B is to choose the value of f(x) ∈ B for each x ∈ A. There are
n choices for each of the m elements in the domain.


Recall that the set of functions from a set A into a set B (regardless of whether the sets are finite or infinite) is
denoted B^A. The result in the previous exercise is motivation for this notation. Note also that if S is a set with
n elements, then the elements in the Cartesian power set S^k can be thought of as functions from {1, 2, …, k}
into S. Thus, the result above on Cartesian powers can be thought of as a corollary of the previous result on
functions.
Subsets
If S is a set with n elements then there are 2^n subsets of S.

Proof from the multiplication principle:


An algorithm for constructing A ⊆ S is to decide whether x ∈ A or x ∉ A for each x ∈ S. There are 2 choices
for each of the n elements of S.
Proof using indicator functions:
Recall that there is a one-to-one correspondence between subsets of S and indicator functions on S. An
indicator function is simply a function from S into {0,1}, and the number of such functions is 2^n by the result
above on functions.
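The correspondence between subsets and bit strings makes the 2^n count easy to see computationally: the bits of an integer m in {0, 1, …, 2^n − 1} select the elements of a subset. A minimal sketch:

```python
# Sketch: generating every subset of an n-element set from a bit string
# (here, the binary digits of an integer m), making the 2^n count concrete.
S = ["a", "b", "c"]
subsets = [{S[i] for i in range(len(S)) if (m >> i) & 1}
           for m in range(2 ** len(S))]
print(len(subsets))  # 8
```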
Suppose that {A_1, A_2, …, A_k} is a collection of k subsets of a set S. There are 2^(2^k) different (in general) sets that
can be constructed from the k given sets, using the operations of union, intersection, and complement. These
sets form the algebra generated by the given sets.
Proof:
First note that there are 2^k pairwise disjoint sets of the form B_1 ∩ B_2 ∩ ⋯ ∩ B_k where B_i = A_i or B_i = A_i^c for
each i. Next, note that every set that can be constructed from {A_1, A_2, …, A_k} is a union of some (perhaps all,
perhaps none) of these intersection sets.
Open the Venn diagram applet.
1. Select each of the 4 disjoint sets A ∩ B, A ∩ B^c, A^c ∩ B, A^c ∩ B^c.
2. Select each of the 12 other subsets of S. Note how each is a union of some of the sets in (a).
Suppose that S is a set with n elements and that A is a subset of S with k elements. The number of subsets of S
that contain A is 2^(n−k).
Proof:
Note that a subset B of S that contains A can be written uniquely in the form B = A ∪ C where C ⊆ A^c. A^c
has n − k elements and hence there are 2^(n−k) subsets of A^c.
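The 2^(n−k) count can be confirmed by brute force on a small case (n = 5 and k = 2 are arbitrary choices):

```python
# Sketch: counting the subsets of S that contain a fixed A by brute force,
# matching the 2^(n - k) formula with n = 5 and k = 2.
n = 5
S = set(range(n))
A = {0, 1}
containing = [m for m in range(2**n)
              if A <= {i for i in range(n) if (m >> i) & 1}]
print(len(containing))  # 8 = 2**(5 - 2)
```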

Computational Exercises
A license number consists of two letters (uppercase) followed by five digits. How many different license
numbers are there?
Answer:

67600000

Suppose that a Personal Identification Number (PIN) is a four-symbol code word in which each entry is either a
letter (uppercase) or a digit. How many PINs are there?
Answer:

1679616
In the board game Clue, Mr. Boddy has been murdered. There are 6 suspects, 6 possible weapons, and 9
possible rooms for the murder.
1. The game includes a card for each suspect, each weapon, and each room. How many cards are there?
2. The outcome of the game is a sequence consisting of a suspect, a weapon, and a room (for example,
Colonel Mustard with the knife in the billiard room). How many outcomes are there?
3. Once the three cards that constitute the outcome have been randomly chosen, the remaining cards are
dealt to the players. Suppose that you are dealt 5 cards. In trying to guess the outcome, what hand of
cards would be best?
Answer:
1. 21 cards
2. 324 outcomes
3. The best hand would be the 5 remaining weapons or the 5 remaining suspects.
An experiment consists of rolling a fair die, drawing a card from a standard deck, and tossing a fair coin. How
many outcomes are there?
Answer:

624
A fair die is rolled 5 times and the sequence of scores recorded. How many outcomes are there?
Answer:

7776
Suppose that 10 persons are chosen and their birthdays recorded. How many possible outcomes are there?
Answer:

41969002243198805166015625
In the card game Set, each card has 4 properties: number (one, two, or three), shape (diamond, oval, or
squiggle), color (red, blue, or green), and shading (solid, open, or striped). The deck has one card of each
(number, shape, color, shading) configuration. A set in the game is defined as a set of three cards for which, for
each property, the cards are either all the same or all different.

1. How many cards are in a deck?


2. How many sets are there?
Answer:

1. 81
2. 1080
A coin is tossed 10 times and the sequence of scores recorded. How many sequences are there?
Answer:

1024
A string of lights has 20 bulbs, each of which may be good or defective. How many configurations are there?
Answer:

1048576
The die-coin experiment consists of rolling a die and then tossing a coin the number of times shown on the die.
The sequence of coin results is recorded.
1. How many outcomes are there?
2. How many outcomes are there with all heads?
3. How many outcomes are there with exactly one head?
Answer:

1. 126
2. 6
3. 21
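The die-coin answers above follow from the addition rule applied over the die score: a score of n allows 2^n coin sequences, one of which is all heads, and n of which have exactly one head (choose its position). A minimal sketch of the computation:

```python
# Sketch: the die-coin counts computed directly; a die score of n allows
# 2^n coin sequences, one all-heads sequence, and n one-head sequences.
total = sum(2**n for n in range(1, 7))
all_heads = sum(1 for n in range(1, 7))
one_head = sum(n for n in range(1, 7))
print(total, all_heads, one_head)  # 126 6 21
```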
Run the die-coin experiment 100 times and observe the outcomes.
Suppose that a sandwich at a restaurant consists of bread, meat, cheese, and various toppings. There are 4
different types of bread (select one), 3 different types of meat (select one), 5 different types of cheese (select
one), and 10 different toppings (select any). How many sandwiches are there?
Answer:

61440
Consider a deck of cards as a set D with 52 elements.
1. How many subsets of D are there?

2. How many functions are there from D into the set {1,2,3,4}?
Answer:

1. 4503599627370496
2. 20282409603651670423947251286016
The Galton Board
The Galton Board, named after Francis Galton, is a triangular array of pegs. Galton, apparently too modest to
name the device after himself, called it a quincunx from the Latin word for five twelfths (go figure). The rows
are numbered, from the top down, by (0, 1, …). Row n has n + 1 pegs that are labeled, from left to right, by
(0, 1, …, n). Thus, a peg can be uniquely identified by an ordered pair (n, k) where n is the row number and k
is the peg number in that row.
A ball is dropped onto the top peg (0,0) of the Galton board. In general, when the ball hits peg (n,k), it either
bounces to the left to peg (n+1,k) or to the right to peg (n+1,k+1). The sequence of pegs that the ball hits is
a path in the Galton board.
There is a one-to-one correspondence between each pair of the following three collections:
1. Bit strings of length n
2. Paths in the Galton board from (0,0) to any peg in row n.
3. Subsets of a set with n elements.
Thus, each of these collections has 2^n elements.
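The three-way correspondence is simple to trace in code: bit n equal to 1 means "bounce right at row n" and "include element n in the subset". The sketch below uses the string 0111001010 from the applet exercise:

```python
# Sketch: bit strings <-> Galton-board paths <-> subsets. Bit n = 1 means
# "bounce right at row n" and "include element n in the subset".
bits = [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]   # the string 0111001010

peg = (0, 0)
for b in bits:                          # follow the path down the board
    peg = (peg[0] + 1, peg[1] + b)

subset = {n for n, b in enumerate(bits, start=1) if b == 1}
print(peg, sorted(subset))  # (10, 5) [2, 3, 4, 7, 9]
```

The final peg is (10, 5) because the path takes 10 steps, 5 of which are right-bounces.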
Open the Galton board applet.
1. Move the ball from (0,0) to (10,6) along a path of your choice. Note the corresponding bit string and
subset.
2. Generate the bit string 0111001010. Note the corresponding subset and path.
3. Generate the subset {2,4,5,9,10}. Note the corresponding bit string and path.
4. Generate all paths from (0,0) to (4,2). How many paths are there?
Answer:
4. 6
