Chapter 2

A Short Introduction to Python for Mathematicians

This chapter contains a short introduction to the programming language Python,
which we already used in Chapter 1 to state some algorithms. In this introduction
we will not cover all aspects of Python, but only a subset of its features suitable for
implementing computer algebra algorithms.
We assume that the reader already has some programming experience in an
object-oriented language, such as Java. On the homepage of the Python Software
Foundation, there is a collection of introductory texts about Python, both for
programmers and non-programmers: http://wiki.python.org/moin/BeginnersGuide/.
Among these is a free e-book called Building Skills in Python by S. F. Lott [Lot];
it assumes you have a programming background, and goes into far more detail than
our introduction. It also covers how to install Python.
Finally, we want to mention the Python tutorial. Whenever you want to learn
the basics of a topic (especially the ones not covered here), you can find an easy
introduction there. Two particular topics which are of interest and which we will
not cover here are input and output (including output formatting and writing to
and reading from files) and how to use command line arguments.

2.1 Getting Started


You can get Python from http://www.python.org. In this lecture, all examples
are written using Python 2.7.3. Versions 3.x of Python do exist, but since
SAGE [S+13] uses Python 2.x and the two versions of Python are incompatible, we
decided to stick to Python 2.x. The current release can be downloaded at
http://www.python.org/download/releases/2.7.3/.
There are many more or less fancy IDEs¹ for Python. One quite basic IDE,
which usually comes along with Python, is IDLE. A fancier IDE is Eric. Both
are installed on the institute's computers (for Eric, see Programs / Programming /
Eric python IDE in the menu). You can find an exhaustive list of other editors and
IDEs at http://wiki.python.org/moin/PythonEditors. The following screenshot
shows Eric:

[Screenshot of the Eric IDE]

¹ IDE stands for Integrated Development Environment. An IDE often comes with a text editor
for writing programs, and allows one to run and debug them.


You can also use the command line interface of Python. To execute a Python
script, say one called test.py, type python test.py on the command line. If you
just run python, an interactive Python shell will be started:
    felix@sr1:~$ python test.py
    Hello world!
    felix@sr1:~$ python
    Python 2.7.3 (default, Aug  1 2012, 05:14:39)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print "Hi!"
    Hi!
    >>> exit()
    felix@sr1:~$

The prompt in Python is >>>; anything entered after the prompt will be executed.
You can also run test.py from the Python shell:
    >>> execfile("test.py")
    Hello world!
    >>>

Finally, note that from the command line, you can also run Python commands
directly using piping:
    felix@sr1:~$ echo 'print "Hi!"' | python
    Hi!
    felix@sr1:~$ echo 'for i in xrange(3): print i' | python
    0
    1
    2
    felix@sr1:~$

2.2 Some Basic Python Programs


We have already used a bit of Python code in Chapter 1. Most of it should be
self-explanatory. A simple example was presented in Listing 1.1:

    def gcd(a, b):
        "Compute GCD of its two inputs"
        while b != 0:
            a, b = b, a % b
        return a

This defines a function called gcd() which uses the Euclidean Algorithm to com-
pute a greatest common divisor of its two inputs. In the second line, a string is
given which describes what the function is doing. This string is also known as the
docstring of the function. You can see it when typing help(gcd) at the prompt:
    >>> help(gcd)
    Help on function gcd in module __main__:

    gcd(a, b)
        Compute GCD of its two inputs

The main part of the function is a while loop, which repeats until the condition
b != 0 is violated. Note that != stands for ≠. During each loop iteration, we compute
the pair (b, a mod b) and store it into (a, b). Thus, after the iteration, a contains the
old value of b, and b contains the value of a % b computed from the values of a and b
before that line. After the loop, the current value of a is returned (at this point,
b == 0 since the loop stopped).
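To make the effect of the simultaneous assignment visible, the following sketch records the pair (a, b) after every loop iteration. The helper name gcd_trace is hypothetical and not part of the lecture's code; the block is written so that it runs unchanged under Python 2.7 and Python 3.x.

```python
def gcd_trace(a, b):
    "Run the Euclidean loop and record the pair (a, b) after every iteration"
    steps = [(a, b)]
    while b != 0:
        # The right-hand side is evaluated completely before anything is
        # assigned, so a % b still uses the old values of a and b.
        a, b = b, a % b
        steps.append((a, b))
    return a, steps

d, steps = gcd_trace(12, 18)
print(d)      # 6
print(steps)  # [(12, 18), (18, 12), (12, 6), (6, 0)]
```

The last recorded pair always has second component 0, which is exactly the condition under which the loop stops.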
Note that the block structure in the Python program is implied only by inden-
tation! Correct indentation is very important to Python, and if the indentation is
sloppy, it will generate errors. For example, the following code snippets are invalid:
    def gcd(a, b):
        "Compute GCD of its two inputs"
          while b != 0:
              a, b = b, a % b
          return a

    def gcd(a, b):
        "Compute GCD of its two inputs"
        while b != 0:
            a, b = b, a % b
              return a

These two listings yield IndentationError: unexpected indent.


    def gcd(a, b):
        "Compute GCD of its two inputs"
        while b != 0:
            a, b = b, a % b
          return a

This listing yields IndentationError: unindent does not match any outer indentation
level.
A more complicated code listing in Chapter 1 is Listing 1.3, which implements
the Extended Euclidean Algorithm and uses loop unrolling.
    def gcdex(a, b):
        "Compute extended GCD (with Bézout equation) of its two inputs. Returns the GCD followed by the coefficients of the linear combination."
        ai = b    # ai stands for: a with index i
        aim1 = a  # aim1 stands for: a with index i-1
        # We can accelerate the first step
        if ai != 0:
            q, r = divmod(aim1, ai)  # compute both quotient and remainder
            aim1, ai = ai, r
            bim1, bi = 0, 1   # before: bi = 0, bim1 = 1
            cim1, ci = 1, -q  # before: ci = 1, cim1 = 0
            # Now continue
            while ai != 0:
                q, r = divmod(aim1, ai)  # compute both quotient and remainder
                aim1, ai = ai, r
                bim1, bi = bi, bim1 - q * bi
                cim1, ci = ci, cim1 - q * ci
        else:
            bim1 = 1
            cim1 = 0
        return aim1, bim1, cim1

It demonstrates how to write comments (by starting them with #; they last until
the end of the line) and how to write if statements. Note that for integers, / and %
are defined to round towards $-\infty$;² that is, a / b and a % b are evaluated
as $\lfloor a/b \rfloor$ and $a - b \lfloor a/b \rfloor$. (This implies that a % b is
$\le 0$ if b < 0, and $\ge 0$ if b > 0.) In particular, q, r = a / b, a % b yields a
Euclidean division, since after this, $|r| < |b|$ and $a = qb + r$.
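The rounding behaviour can be checked for all sign combinations with a few lines. This is a minimal sketch; it uses // instead of / so that it behaves identically under Python 2.x and 3.x.

```python
results = []
for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    q, r = a // b, a % b  # floor division and the matching remainder
    # divmod() computes the same pair in a single call.
    assert (q, r) == divmod(a, b)
    # Euclidean division: a = q*b + r with |r| < |b|.
    assert a == q * b + r and abs(r) < abs(b)
    results.append((a, b, q, r))

for row in results:
    print(row)
```

In each printed row (a, b, q, r), the remainder r has the same sign as the divisor b, in accordance with the remark above.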

2.2.1 A Complete Program


To create a complete program which can be executed, one can simply put all required
code into one file: function definitions and statements which are executed when the
program is run. An example is the following:
    def gcd(a, b):
        "Compute GCD of its two inputs"
        while b != 0:
            a, b = b, a % b
        return a

    a, b = 5, 7
    d = gcd(a, b)
    print a, b, d
    a, b = 98450410312301241987513412837123123, 79347851293412371287123183713712471927132
    d = gcd(a, b)
    print a, b, d

Note that one can also have statements before the function definition. However, these
statements are not allowed to use functions defined afterwards. If this program
is stored in a file named gcd.py, one can execute it by typing python gcd.py on the
command line, or by selecting Run Script in the IDE. In Eric, you can do this by
pressing F2 and then pressing Enter in the following dialog. Eric will show the
program's output in the horizontal toolbox at the bottom.

² Note that in Python 3.x, / will yield a floating point number with the correct result (up to
floating point accuracy). To obtain an integer, one has to use //, which always returns
$\lfloor a/b \rfloor$. Note that // is also available in Python 2.x.

If one has two functions which use each other, one can state them in any order;
the above-mentioned restriction that you cannot use functions defined afterwards
only applies to code at the top level, i.e. code without indentation. This program is
valid and prints the string "Some recursive fun!" followed by 64:
    print "Some recursive fun!"

    def A(n):
        return B(n - 1)

    def B(n):
        if n <= 0:
            return 42
        else:
            return A(n) + 1

    print A(23)

2.2.2 Splitting Up Programs into Modules


It is also possible to split up programs into more than one Python file. For this, we
need the notion of modules. Assume we have a file gcd.py, containing:
    # -*- coding: utf-8 -*-

    def simple_gcd(a, b):
        "Compute GCD of its two inputs"
        while b != 0:
            a, b = b, a % b
        return a

    def extended_gcd(a, b):
        "Compute extended GCD (with Bézout equation) of its two inputs. Returns the GCD followed by the coefficients of the linear combination."
        ai = b    # ai stands for: a with index i
        aim1 = a  # aim1 stands for: a with index i-1
        # We can accelerate the first step
        if ai != 0:
            q, r = aim1 / ai, aim1 % ai  # compute both quotient and remainder
            aim1, ai = ai, r
            bim1, bi = 0, 1   # before: bi = 0, bim1 = 1
            cim1, ci = 1, -q  # before: ci = 1, cim1 = 0
            # Now continue
            while ai != 0:
                q, r = aim1 / ai, aim1 % ai  # compute both quotient and remainder
                aim1, ai = ai, r
                bim1, bi = bi, bim1 - q * bi
                cim1, ci = ci, cim1 - q * ci
        else:
            bim1 = 1
            cim1 = 0
        return aim1, bim1, cim1

(The comment in the first line tells Python that we use the UTF-8 encoding. If
this line is omitted, then the é in the docstring of extended_gcd would result in an
error. On the other hand, if we enter programs in the Python interpreter
interface, for example by copy and paste, this will not result in a problem since the
interpreter uses UTF-8 by default.)
Assume that in a second file, we have the following content:
    import gcd

    print gcd.simple_gcd(134983, 789213712)
    print gcd.extended_gcd(134983, 789213712)

The output is
    1
    (1, 340147175, -58177)

Note that import gcd tells Python that we want to use the module gcd, which
Python searches for in the file gcd.py. If the module does not exist, we get an error like
ImportError: No module named this_module_does_not_exist. After importing the
module, all functions defined in gcd.py are available. We can access them by
prefixing them with the name of the module: gcd.simple_gcd accesses the function
simple_gcd in the module gcd.
We can also import certain functions so that we do not have to qualify their
name by specifying the module:
    from gcd import simple_gcd

    print simple_gcd(134983, 789213712)        # ok
    print gcd.simple_gcd(134983, 789213712)    # runtime error!
    print extended_gcd(134983, 789213712)      # runtime error!
    print gcd.extended_gcd(134983, 789213712)  # runtime error!

Finally, we can also include everything from the module gcd by writing from gcd
import *. In that case, using the qualifier gcd results in an error (as above).
Note that in case gcd.py contains top-level code (i.e. code without indentation),
this code will be executed only the first time the import statement is executed; e.g., if
the import statement appears in a loop, the code from the imported module will only
be executed during the first loop iteration.
Also note that it is possible to use the import statement inside a function defi-
nition. The imported symbols will then only be available in this function:
    def f():
        from gcd import simple_gcd, extended_gcd

        print simple_gcd(134983, 789213712)    # ok
        print extended_gcd(134983, 789213712)  # ok

    f()
    print simple_gcd(134983, 789213712)      # runtime error!
    print gcd.simple_gcd(134983, 789213712)  # runtime error!

2.3 Data Types, Expressions and Conversions


2.3.1 Expressions
The following table lists all expression types with their evaluation order. Expression
types with smaller evaluation order will be evaluated first, and expression types with
the same evaluation order will be evaluated from left to right:

Evaluation  Operator            Description
order
0           (...)               expression grouping
            ,                   tuple forming
            [...]               list forming
            {...}               dictionary or set forming
            (see below)         literals (see below)
1           s[i]                indexing
            s[i:j], s[i:j:k]    slicing
            s.attr              attributes
            f(...)              function calls
2           x ** y              exponentiation
3           +x                  unary plus (no change)
            -x                  unary minus (sign change)
            ~x                  bitwise negation
4           x * y               multiplication
            x / y               division³
            x // y              floor division⁴
            x % y               remainder (modulo)⁵
5           x + y               addition
            x - y               subtraction
6           x << y              bit shifting (to the left)
            x >> y              bit shifting (to the right)
7           x & y               bitwise and
8           x ^ y               bitwise exclusive or (XOR)
9           x | y               bitwise or
10          x < y               less than
            x <= y              less than or equal to
            x > y               greater than
            x >= y              greater than or equal to
            x == y              equal value
            x != y              not equal value
            x <> y              not equal value
            x is y              equal object
            x is not y          not equal object
            x in y              element of
            x not in y          not element of
11          not x               boolean negation
12          x and y             boolean and
13          x or y              boolean or
14          lambda a: e         lambda function
15          x = y               assignment⁶
            x *= y              augmented assignment: multiplication
            x /= y              augmented assignment: division
            x //= y             augmented assignment: floor division
            x %= y              augmented assignment: modulo
            x += y              augmented assignment: addition
            x -= y              augmented assignment: subtraction
            x >>= y             augmented assignment: right shift
            x <<= y             augmented assignment: left shift
            x &= y              augmented assignment: and
            x |= y              augmented assignment: or
            x ^= y              augmented assignment: exclusive or (XOR)

³ In Python 3.x, this will be true division, while in Python 2.x, it is either true division or floor
division, depending on the operands' types.
⁴ The result is $\lfloor x/y \rfloor$.
⁵ The result is $x - y \lfloor x/y \rfloor$. Note that for integers, one can compute both x // y and
x % y by using the divmod() function: (x // y, x % y) == divmod(x, y).

Literals Literals can be booleans, integers, floating point numbers, imaginary
floating point numbers, strings, and byte literals (which allow one to embed binary
data). Booleans, integers, floating point numbers and strings will be discussed in
Section 2.3.2. We will not cover byte literals.

Tuples, Lists, Dictionaries, Slicing, ... These topics will be discussed later,
namely in Sections 2.3.3, 2.3.4 and 2.3.5.

Comparisons One important difference to other programming languages is that
Python interprets statements such as a <= b > c != d as mathematicians expect
them to evaluate: namely, this expression is true if and only if $a \le b$, $b > c$, and
$c \ne d$.
Also note the difference between identity comparisons such as is and is not,
and value comparisons such as ==, != and <>. For example, "a"+"b"=="ab" is always
true, while "a"+"b" is "ab" may or may not be true, depending on how the Python
implementation works.⁷
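A chained comparison can be contrasted with a grouped, left-to-right evaluation in a few lines. The variable names below are chosen for illustration only; the sketch runs under Python 2.7 and 3.x.

```python
a, b, c = 1, 3, 2

# A chained comparison is the conjunction of its pairwise comparisons.
chained = a < b > c
explicit = (a < b) and (b > c)

# Grouping changes the meaning: (a < b) is the boolean True, which is
# then compared with c as the number 1.
grouped = (a < b) > c

print(chained)
print(explicit)
print(grouped)
```

Here chained and explicit agree, while grouped compares the number 1 with 2 and therefore yields a different result.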
⁶ Note that chained assignments are possible, i.e. something like a = b = c = d = 0. This is not
possible for augmented assignments; if a statement contains an augmented assignment, it must
contain no other augmented or non-augmented assignment. Also, augmented assignments do not
accept tuples as destinations as in a, b += 1, 2. On the other hand, if a is a tuple, a += 2,3
is valid.
⁷ In the Python implementation I use, I obtain the following results:

    >>> a = "1"; b = "2"; c = "12"
    >>> a + b == c
    True
    >>> a + b is c
    False

If we replace the strings "1", "2" and "12" by the integers 1, 2 and 3, both comparisons yield
True. We will explain why this is the case in more detail in Section 2.4.

Lambda functions Lambda functions allow one to create anonymous functions (i.e.
functions without a name). We will not discuss them further, but refer to other
documentation such as [Lot].

2.3.2 Fundamental Data Types

Python offers several fundamental data types:

1. bool: truth values;

2. int and long: integers;

3. float and complex: floating point numbers;

4. str: texts (or more precisely: sequences of characters).

Before we discuss them in more detail, we want to mention that one can
find out the type of any Python expression by writing type(expression):

    >>> type(15)
    <type 'int'>
    >>> type(2**128)
    <type 'long'>
    >>> type(0.5)
    <type 'float'>
    >>> type(1.5j)
    <type 'complex'>
    >>> type("hello")
    <type 'str'>
    >>> type((1,2,3))
    <type 'tuple'>

Truth Values Variables of type bool can take only the values True and False.
When interpreted as a number, True corresponds to 1 and False corresponds to 0.
If a number is converted to a bool, 0 will be converted to False and anything else
to True.

Integral Data Types Python provides two integral data types, int and long.
The type int represents signed CPU integers, which usually offer $2^{32}$ or $2^{64}$
different values. In contrast, the type long can store integers whose size is bounded
only by the system's memory.
While arithmetic with int is usually faster, computations which exceed the range
of int automatically return values of type long. This conversion has been done
automatically since Python 2.2, and in Python 3.x, there is only one integral type.
(Note that small enough values of type long will not be automatically converted back
to int.)
When entering numbers, one can specify them by giving their decimal digits
(i.e. they are given as reduced 10-adic numbers; see Definition 1.1.1). Leading zeros
should not be used, as they are usually not interpreted as expected.⁸ Thus, one should
not write 091 or 00000091, but 91 instead. A leading zero is only allowed when entering
0 itself.

⁸ We just mention hexadecimal numbers and octal numbers.


Finally, note that negative numbers can only be entered using expressions: -1 is
an expression composed of the unary negation operator and the integer literal 1.
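The remarks on integer literals can be tried out as follows. This is a small sketch that runs under Python 2.7 and 3.x; note that the octal spelling 0o133 is used here only because it is accepted by both versions (an assumption about the reader's interpreter being at least Python 2.6).

```python
big = 2**128              # does not fit into a 64-bit CPU integer
print(big)                # 340282366920938463463374607431768211456

print(0x5B == 91)         # hexadecimal literal
print(0o133 == 91)        # octal literal
print(int("00000091"))    # converting a string ignores leading zeros

# -2**2 is the unary minus applied to 2**2, not the square of a literal -2.
print(-2**2)
print((-2)**2)
```

The last two lines show why it matters that -2 is an expression and not a literal: the exponentiation is carried out before the sign change.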

Floating Point Data Types Floating point numbers of type float are usually
represented by double precision numbers (IEEE 754). A number is a floating point
number as soon as a decimal separator (.) or an exponent (e or E) is written down.
Examples are 0.1, 5e23, 123.4323e-54 and 00001239.0 (here, leading zeros are
allowed). Note that there does not need to be something before the decimal separator,
so .5 is also a valid floating point number.
Python can also handle complex numbers. These are pairs of float numbers
(and thus are not well-suited to store Gaussian integers in Z[i]). One cannot write
complex numbers directly in a Python program, but has to compose them by adding
a real part (a usual float value) to an imaginary part, which is a float value followed
directly by a j. For example, 1+2j and 0.123-2.354j are composed complex numbers.
(Note that since addition has a bigger evaluation order than exponentiation, 1+1j**2
equals $1 + (i^2) = 0$ instead of $(1 + i)^2 = 2i$. Therefore, do not forget parentheses if
you use composed complex numbers in expressions.)
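The following sketch illustrates floating point and complex literals, including the parenthesization pitfall. It runs under Python 2.7 and 3.x; the variable names are illustrative only.

```python
x = 0.1 + 0.2
print(x == 0.3)              # False: these decimals have no exact binary form
print(abs(x - 0.3) < 1e-12)  # True: compare floats with a tolerance instead

print(type(.5) is float)
print(type(5e23) is float)

z = 1 + 2j                   # real part plus imaginary literal
print((z.real, z.imag))

w1 = 1 + 1j**2               # exponentiation first: 1 + (i^2)
w2 = (1 + 1j)**2             # parenthesized: (1 + i)^2
print(w1)
print(w2)
```

Mathematically, w1 is 0 and w2 is 2i; both values are stored as pairs of floats.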

Strings There are different ways to enter strings (which are of type str) in Python:

• short strings can be enclosed by single (') or double (") quotes; both delimiters
  have to be the same, whence 'abcd' and "def" are short strings, while "abcd'
  is not properly delimited; short strings are not allowed to contain line breaks, but
  can contain quotes of the type not used as delimiters;

• long strings can be enclosed by three single quotes (''') or three double quotes
  ("""), with the same rules as above; as opposed to short strings, long strings
  can also contain line breaks and one or two adjacent occurrences of the quotes
  used for delimiting the string.

Inside strings, one can use \ to escape characters: for example, "a\"b" denotes the
string with the content a"b, which contains a double quote. There are also other
escape sequences, such as \n for a newline and \\ for the backslash itself, and escape
sequences to specify unicode characters by hex value or name. We will not go into
detail here but refer to [vRD13, Section 2.4.1]. Strings can also have prefixes, namely
r, R, u, U, which indicate raw and unicode strings. Again, for details, we refer to
[vRD13, Section 2.4.1] as we do not need this.
Strings behave like sequence objects such as tuples (Section 2.3.3) and lists (Sec-
tion 2.3.4). In particular, their length can be obtained by using the len() function,
and two strings can be concatenated to one string using the + operator. We will
discuss indexing and slicing in Section 2.3.4.
There are also certain other operations available for strings; please refer to other
documentation for more details.
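As a brief illustration of the quoting rules and of strings as sequence objects, consider the following sketch (runs under Python 2.7 and 3.x; the variable names are illustrative only):

```python
s1 = 'abcd'    # short string in single quotes
s2 = "it's"    # double quotes may contain a single quote
s3 = "a\"b"    # escaped double quote; the content is a"b
s4 = """first line
second line"""

print(len(s3))                              # 3
print(s1 + s2)                              # abcdit's
print(s4 == "first line\nsecond line")      # True
print(len("\\"))                            # 1
```

The long string s4 contains a real line break, which is the same character as the one produced by the escape sequence \n in a short string.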

2.3.3 Tuples
Tuples are a quick way to put together different values as one value. Tuples can be
generated by using the comma (,) operator: simple examples are a,b and a, and
(3,4,5). (These define tuples of length 2, 1 and 3.) Note that (1) does not yield a
tuple (of length 1), but an integer. That is, tuples can only be formed by using the
comma operator; just adding parentheses does not help. Parentheses only help to
separate tuples from their surroundings, for example inside expressions.
Tuples can also be used as targets during assignment; we already used this
in the GCD algorithm (Listing 1.1). For example, a,b = b,a swaps the values
of a and b. Both the left-hand side of the assignment and the right-hand side of
the assignment contain tuples of length 2. Note that in case of such assignments,
the tuples on both sides must have the same number of elements: a,b = a,b,c and
a,b,c = a,b result in errors. On the other hand, if a is a tuple of length 2, then
b,c = a is valid, since both sides of the assignment are tuples of length 2.
Like strings and lists, tuples are sequence objects. We will discuss what this means
in the next section.
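The rules for forming and unpacking tuples can be sketched as follows (runs under Python 2.7 and 3.x; the variable names are illustrative only):

```python
one = (1)      # parentheses alone do not create a tuple
t1 = (1,)      # the comma does
t2 = 1, 2      # parentheses are optional
print(type(one) is int)
print(len(t1))
print(len(t2))

a, b = 5, 7
a, b = b, a    # swap without a temporary variable
print((a, b))

pair = (3, 4)
x, y = pair    # unpacking an existing tuple of length 2

# Different lengths on both sides raise a ValueError.
try:
    u, v = 1, 2, 3
    unpack_failed = False
except ValueError:
    unpack_failed = True
print(unpack_failed)
```

The swap works because the tuple on the right-hand side is built from the old values before any name on the left-hand side is rebound.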

2.3.4 Lists and Slicing


Lists behave very similarly to tuples (Section 2.3.3). The main difference is that
tuples (as well as strings) are immutable, which means that they cannot be modified
(concatenating two tuples creates a new tuple). List elements can be added, removed
or changed without creating a new list object; therefore, lists are mutable.
Like tuples and strings, lists are sequence objects. This means that they support
the following operations:

Operation     Description
len(a)        Yields the number of elements in the sequence a.
a[i]          Yields the i-th element of the sequence a; note that indices begin with 0,
              whence a sequence of length n has indices $0, \dots, n-1$; also note that
              negative indices can be used to access elements from the back, as $-i$ for
              $n \ge i > 0$ yields access to the element with index $n - i$.
a + b         Yields a sequence of the same type with the values of sequence b appended
              to the end of the values of sequence a; note that a and b must be sequences
              of the same type.
n * a         If n is an integer, creates a sequence where the sequence a is repeated
a * n         n times; in case n <= 0, an empty sequence will be returned.
reversed(a)   If a is a sequence with elements $a_0, \dots, a_{n-1}$, returns an iterator
              which yields the elements in reversed order $a_{n-1}, \dots, a_0$.
sorted(a)     Returns a new list containing the elements of a sorted in ascending order.⁹
x in a        Returns True if and only if the element x appears in the sequence a (for
x not in a    strings, x is actually checked for being a substring); the analogue holds
              for not in.
a.index(x)    Returns the smallest index i such that a[i] == x, or throws an exception
              if it is not found.¹⁰
a.count(x)    Returns the number of indices i such that a[i] == x.
min(a)        Returns the minimal respectively maximal element of the sequence a.
max(a)
a[i:j]        Yields the subsequence with indices $i, i+1, \dots, j-1$.
a[i:j:s]      Yields the subsequence with indices $i, i+s, i+2s, \dots$ as long as
              $i + ks < j$ (respectively $i + ks > j$ if s < 0). This process is called
              slicing.

⁹ First, note that this creates a new list; the sequence a itself is not modified. Second, note that
one can specify how to sort using additional parameters, which we will not describe here in detail.
¹⁰ Using a.index(x,start) one can limit the search to all indices i with i >= start, and with
a.index(x,start,stop) one can limit the search to all indices i with start <= i < stop.

(There are two more sequence types, bytes and bytearray, which we will not need
and thus also not discuss here.)
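A few of the sequence operations from the table, applied to a tuple and to strings, can be sketched as follows (runs under Python 2.7 and 3.x; the tuple of primes is chosen for illustration only):

```python
a = (2, 3, 5, 7, 11, 13)

print(len(a))                   # 6
print(a[-1] == a[len(a) - 1])   # True: index -1 is the last element
print(a[1:4])                   # (3, 5, 7)
print(a[::2])                   # (2, 5, 11)
print(a[4:1:-1])                # (11, 7, 5)
print(a + (17, 19))
print(2 * (0, 1))               # (0, 1, 0, 1)
print(a.index(7))               # 3
print("ell" in "hello")         # True: substring test for strings
print(sorted("hello"))          # a list of characters, not a string
print(list(reversed(a)))
```

Since tuples and strings are immutable, each of these operations leaves a unchanged and returns a new object.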
Moreover, since list is mutable, it supports the following operations:

Operation        Description
a[i] = b         Changes the i-th component of a to the value b.
a[i:j:k] = b     Replaces the selected slice by the elements of b. Here, b must be another
                 sequence (in case k != 1 this is only possible if b has the same length
                 as the selected slice).
a.append(x)      Inserts the single element x at the end of the list.
a.extend(x)      Inserts the elements of the sequence x at the end of the list.
a.insert(i, x)   Inserts element x at position i in the list.
del a[i]         Removes the element at index i from the list. The length of the list
                 will be reduced by one.
a.remove(x)      Finds x in the list and removes the first occurrence. Throws an
                 exception if x cannot be found.
a.pop()          Removes and returns the last value from the list a.
a.pop(i)         Removes and returns the i-th value from the list a.
a.reverse()      Reverses the elements in the list. Does not create a new list, but
                 modifies the current list.
a.sort()         Sorts the elements of the list in ascending order. Does not create a
                 new list, but modifies the current list. (Again, one can change the
                 sorting order by additional arguments.)
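The in-place character of these operations can be observed by keeping a second name for the same list. The following sketch runs under Python 2.7 and 3.x; the comments give the content of the list after each step.

```python
a = [3, 1, 4]
b = a                 # a second name for the same list object

a.append(1)           # [3, 1, 4, 1]
a.extend([5, 9])      # [3, 1, 4, 1, 5, 9]
a.insert(0, 2)        # [2, 3, 1, 4, 1, 5, 9]
del a[1]              # [2, 1, 4, 1, 5, 9]
a.remove(1)           # [2, 4, 1, 5, 9]
last = a.pop()        # returns 9, leaving [2, 4, 1, 5]
a[1:3] = [7, 7, 7]    # [2, 7, 7, 7, 5]
a.sort()              # [2, 5, 7, 7, 7]
a.reverse()           # [7, 7, 7, 5, 2]

print(a)
print(last)
print(b is a)         # True: all changes happened in the same object
```

Note that a slice assignment with step 1 may change the length of the list, as in the step where two elements are replaced by three.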

Efficiency Note that many operations on sequences are somewhat inefficient. As
the content of a sequence comes in no particular order, any operation which has to
look through the elements (for example x in a, a.index(x), a.count(x), max(a), min(a),
a.remove(x) and a.reverse()) is slow, i.e. its running time can only be bounded by
O(n) in terms of the sequence size n. Also, due to the internal representation used for
sequences, operations such as a.insert(i, x), del a[i] and a.pop(i) have complexity O(n).
Moreover, a.append(x) and a.extend(b) have worst-case complexity O(n), while
they are often much faster in practice since additional space is reserved in advance.
Note that concatenating two sequences is also slow in this sense.
In the next section, we will see how some of these operations can be done much
more efficiently if we do not need to know how the elements are stored.

2.3.5 Dictionaries and Sets


Dictionaries and sets are what in computer science is called an associative structure.
(This has nothing to do with associative algebraic structures such as semigroups.)
This means that the indices we use to access data in such structures are no longer the
natural numbers 0 to $n - 1$ (if the object contains n elements), but that we can use
different index sets. An example is a (real-world) dictionary, which associates to a
string (the word we want to look up) another string (a translation, explanation,
etc.). Dictionaries in Python do essentially this, except that the index set need not be
a set of strings (it can be essentially any kind of set of Python objects), and the
results can also be of any type (and not just strings).
A set in Python is an object which can contain other objects, and allows one to
efficiently find out whether it contains an element or not. Here, efficiently means
that this takes time O(1) on average, independently of the number n of elements in the
set, since sets are implemented as hash tables. (This is also the complexity of many
operations for dictionaries.) Moreover, sets allow one to form unions, intersections and
set-theoretic differences efficiently.
The types of dictionaries and sets are dict and set, respectively. Dictionaries
and sets are mutable. Given a set, one can create an immutable version of it which
is of type frozenset. For dictionaries, no immutable type exists.
Note that the indices used for dictionaries and sets must be immutable types.
Thus, we can use integral types, floating point types, strings, tuples, frozen sets, but
not lists, sets and dictionaries.
The most important operations on dict objects are the following:

Operation              Description
{a:x, b:y, ...}        Creates a dictionary which for index a stores value x, for index b
                       stores value y, and so on.
{ }                    Creates an empty dictionary.
dict()
dict(d)                Creates a copy of the dictionary d.
len(d)                 Returns the number of entries in d.
d[i]                   Accesses the value of index i. If the index does not exist in d, an
                       exception is thrown.
d.get(i)               Returns the value of index i, or None if the index does not exist
                       in d.
d[i] = x               Creates or changes the entry with index i to the new value x.
d.setdefault(i, val)   If the index i exists in d, returns its value. If not, creates the
                       index with value val and returns val.
del d[i]               Removes the entry for index i.
d.clear()              Removes all entries.
d.pop(i)               Removes and returns the value for index i. Throws an exception if
                       such an entry does not exist.
d.pop(i, val)          The second variant returns val if the entry does not exist.
i in d                 Tests whether or not the index i exists in d.
d.has_key(i)
i not in d
d.items()              Creates a list of (index, value) tuples.
d.keys()               Creates a list of indices.
d1.update(d2)          Inserts all pairs (index, value) from a dictionary d2 into
                       dictionary d1. Note that the old entries of d1 remain untouched if
                       their indices do not appear in d2.
d1 == d2               Tests whether two dictionaries contain exactly the same
d1 != d2               (index, value) pairs.
d1 <> d2
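The most common dictionary operations can be sketched as follows. To keep the block runnable under both Python 2.7 and 3.x, it avoids d.has_key(i) and <>, which exist only in Python 2.x; the dictionary contents are illustrative only.

```python
d = {"one": 1, "two": 2}
d["three"] = 3                     # create a new entry

print(d["two"])                    # 2
print(d.get("four"))               # None: no exception for a missing index
print(d.setdefault("four", 4))     # 4: the entry is created
print("four" in d)                 # True

removed = d.pop("one")             # removes the entry and returns 1
print(d.pop("missing", 0))         # 0: default value instead of an exception

d.update({"two": 22, "five": 5})   # overwrites "two", adds "five"
print(sorted(d.items()))

# Tuples are immutable and can therefore be used as indices.
sparse = {(0, 0): 1, (2, 1): -3}
print(sparse[(2, 1)])              # -3
```

The call to sorted() is used because the entries of a dictionary come in no particular order.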

One can operate on objects of type set as follows. All operations except the
ones which change the set can also be applied to frozen sets:

Operation                            Description
set(v)                               Creates a (frozen) set from the set, frozen set or
frozenset(v)                         sequence v. In case v is a set, a copy is created.
{ a, b, ... }                        Creates a set with entries a, b, ...
set()                                Creates an empty set.
s1 | s2                              Returns the union of s1 and s2.
s1.union(s2)
s1 & s2                              Returns the intersection of s1 and s2.
s1.intersection(s2)
s1 - s2                              Returns the set-theoretic difference of s1 and s2.
s1.difference(s2)
s1 ^ s2                              Returns the symmetric difference of s1 and s2; i.e.,
s1.symmetric_difference(s2)          returns the set which contains precisely the elements
                                     which are contained in precisely one of s1 and s2.
s1 |= s2                             Forms the union of s1 and s2 and stores the result
s1.update(s2)                        in s1.
s1 &= s2                             Forms the intersection of s1 and s2 and stores the
s1.intersection_update(s2)           result in s1.
s1 -= s2                             Forms the difference of s1 and s2 and stores the
s1.difference_update(s2)             result in s1.
s1 ^= s2                             Forms the symmetric difference of s1 and s2 and
s1.symmetric_difference_update(s2)   stores the result in s1.
s1 < s2                              Tests whether s1 is a strict subset of s2.
s1 <= s2                             Tests whether s1 is a subset of or equal to s2.
s1.issubset(s2)                      Tests whether s1 is a subset of or equal to s2.
s1 > s2                              Tests whether s1 is a strict superset of s2.
s1 >= s2                             Tests whether s1 is a superset of or equal to s2.
s1.issuperset(s2)                    Tests whether s1 is a superset of or equal to s2.
s1 == s2                             Tests whether s1 is equal to s2.
s1 != s2                             Tests whether s1 is different from s2.
s1 <> s2
len(s)                               Returns the cardinality of s.
min(s)                               Returns the minimal respectively maximal element
max(s)                               of s.
s.remove(x)                          Removes the element x from the set s.
s.clear()                            Removes all elements from the set.
s.add(x)                             Adds the element x to the set.
s.pop()                              Takes an arbitrary element of the set, removes it and
                                     returns it. Will throw an exception in case the set
                                     is empty.
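The set operations from the table can be tried out with two small sets. The following sketch runs under Python 2.7 and 3.x; results are printed as sorted lists because the elements of a set come in no particular order.

```python
s1 = set([1, 2, 3, 4])
s2 = set([3, 4, 5])

print(sorted(s1 | s2))        # union: [1, 2, 3, 4, 5]
print(sorted(s1 & s2))        # intersection: [3, 4]
print(sorted(s1 - s2))        # difference: [1, 2]
print(sorted(s1 ^ s2))        # symmetric difference: [1, 2, 5]

print(set([3, 4]) < s2)       # True: strict subset
print(s1 <= s2)               # False

s1 |= s2                      # in-place union; s1.update(s2) does the same
print(s1.issuperset(s2))      # True

# A frozenset is immutable and can therefore serve as a dictionary index.
labels = {frozenset([1, 2]): "a"}
print(labels[frozenset([2, 1])])
```

The last line finds the entry although the elements were given in a different order, since two sets are equal if and only if they contain the same elements.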

For both dictionaries and sets, one can also obtain so-called iterators. We will
not treat them separately, but refer to the treatment of the for loop in Section 2.5.2.

2.3.6 Type Conversions


Type conversions can usually be done in a functional way by writing newtype(value);
as an example, int(5.3) yields 5 of type int, and set([2, 3, 5, 7, 11, 13, 17,
19]) yields a set containing all primes < 20.
There are also automatic type conversions, mostly during arithmetic operations;
such type conversions are also known as coercions. The coercion rules in Python
are quite simple, due to the lack of many different fundamental arithmetic types. If
an arithmetic operation such as +, -, *, /, // or % is applied to two inputs, Python
looks at their types, and takes the maximal type according to the following order:

int < long < float < complex

and converts both operands to this type. The result is then of this type; the only
exception occurs when both operands are of type int and the result cannot be
represented by an int. In this case, the result will be of type long.
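The conversion and coercion rules above can be tried out with a few lines. The following is only a small sketch; the variable names are our own, and the code is written so that it runs under Python 2.7 as well as Python 3 (where the type long no longer exists as a separate type).

```python
# Explicit conversions in the functional form newtype(value).
as_int = int(5.3)                           # truncates towards zero
as_float = float(7)
primes = set([2, 3, 5, 7, 11, 13, 17, 19])  # list converted to a set

# Coercion: mixed operands are converted to the larger type
# in the order int < long < float < complex.
int_plus_float = 1 + 2.0
float_times_complex = 2.5 * (1 + 2j)

print(as_int)
print(type(int_plus_float).__name__)
print(type(float_times_complex).__name__)
```

The last two lines print the names of the result types, which are float and complex.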

2.4 Variables and Names


Python's usage of variables differs slightly from other programming languages. In
most programming languages, variables usually store values. Besides storing values,
variables in many programming languages can also store references or pointers.11 In
Python, there are no variables which store values, but only variables which store
references.
This has interesting consequences, which become apparent when comparing the
operators == and != (respectively <>) to the is and is not operators. The operators
==, != and <> compare the values of the objects the references point to, while is
and is not compare the references themselves, i.e. test whether two references point
to the same object or not.
11 References are essentially pointers, but usually disallow operations which are available
for pointers, such as pointer arithmetic. For the purpose of this section, it suffices to know
what a reference is.

For example, it can be that two strings, which are immutable objects, compare
equal, i.e. have the same value, but are different objects. This occurs frequently
when combining or somehow creating strings. For example, if a="1" and b="2", then
a+b will create a new string with content "12". But if c="12" is another string, it
might be that a+b is not c (as both strings with value "12" are represented at two
different locations in memory), while we have a+b == c (both strings are equal).
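The difference between equal values and identical objects can be made visible with a short experiment. This is a sketch with variable names of our own choosing; it uses lists for the identity test, because for strings the interpreter is free to reuse objects, so the outcome of is on strings is not guaranteed either way. The single-argument print(...) form runs under both Python 2.7 and Python 3.

```python
# Two lists with equal contents, built independently of each other.
x = [1, 2, 3]
y = [1, 2, 3]
z = x                      # z refers to the very same object as x

same_value = (x == y)      # the contents agree
same_object = (x is y)     # two separate list objects
alias = (z is x)           # both names refer to one object

# For strings, equality holds, while identity depends on the interpreter.
a, b, c = "1", "2", "12"
strings_equal = (a + b == c)

print("%s %s %s %s" % (same_value, same_object, alias, strings_equal))
```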
For immutable objects, the behavior of variables is essentially the same as if the
variables stored values. (Except that we have is and is not to distinguish
whether two objects with the same value are the same or not.) But for mutable
objects, such as objects of type list, dict and set, it makes a difference: in the
following simple program, a and b will have the same value, even though we change
only a. On the other hand, c is a true copy of a, so it is not changed.
    a = {1: "hello", 2: "world"}
    b = a
    c = dict(a)
    del a[1]
    a[2] = "paradise"
    print a  # output: {2: 'paradise'}
    print b  # output: {2: 'paradise'}
    print c  # output: {1: 'hello', 2: 'world'}

In this regard, Python is quite similar to Java, except that there are no native
types in Python which behave differently and which are not objects, as is the case
in Java with types such as int, long, float, etc.
If you have ever worked with a programming language which uses pointers, you might
have come across a concept called smart pointer with reference counting. This is
essentially what Python uses. A smart pointer points to two things: first, to some
memory location which it manages, and second, to another memory location which
contains an integer. This integer stores the number of smart pointers which point to
the managed location (reference count). Variables in Python are like smart pointers;
if they fall out of scope, they are destroyed by decreasing the reference count by one.
If it drops to zero, the memory location is freed. If not, nothing happens. This ensures
that memory is not released while it is still used (because something is pointing to
it), but also ensures that memory is freed once nothing refers to it anymore.
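The effect of reference counting can be observed with the standard weakref module, which allows us to watch an object without keeping it alive. The following is a sketch under the assumption that the standard CPython interpreter is used; the class Node is a hypothetical placeholder, and other Python implementations may free the object later.

```python
import weakref

class Node(object):
    """Hypothetical placeholder class; its instances can be watched."""
    pass

a = Node()
b = a                       # second reference to the same object
watcher = weakref.ref(a)    # weak reference: does not count as a reference

del a                       # one reference (b) is left
still_alive = watcher() is not None

del b                       # the reference count drops to zero
freed = watcher() is None   # the object has been released

print("%s %s" % (still_alive, freed))
```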

2.5 Flow Control


The program flow in Python is very similar to the flow in other imperative and
object oriented programming languages, such as Java or C/C++. It also shares
most flow control structures with these languages.

2.5.1 Conditionals
The main conditional statement in Python is the if statement. In fact, it is the
only conditional statement, as there is nothing like switch in Java or C/C++.
The syntax is quite self-explanatory:
    if condition_1:
        code_1
    elif condition_2:
        code_2
    elif condition_3:
        code_3
    else:
        code_4

There is no limit on the number of elif statements; there can also be none. The else
statement is optional, but if it appears, it must appear after all elif statements.
The behavior is simple: first, condition condition_1 is evaluated. If it evaluates
to True, the code block code_1 is executed and the execution continues after the
whole if block. If it evaluates to False, the elif statements are checked one by one.
If the condition for one evaluates to True, the corresponding code block is executed,
and the execution continues after the whole if block. If all conditions evaluate to
False, the else block, if it exists, is executed.
Note that instead of an indented code block, one can also list all commands
behind the colon (:):
    if condition_1: code_1
    elif condition_2: code_2
    elif condition_3: code_3
    else: code_4

One can list several commands by separating them by semicolons (;):


    if n == 1: print 1
    elif n < 10: print 2; print n;
    elif n < 100: print 3; print n / 10; print n % 10;
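As a complete example of an if statement with several elif branches, consider the following sketch. The function digit_class is our own illustration (it is not part of Python), and it is written so that it runs under both Python 2.7 and Python 3.

```python
def digit_class(n):
    """Describe how many decimal digits the non-negative integer n has."""
    if n < 10:
        return "one digit"
    elif n < 100:
        return "two digits"
    elif n < 1000:
        return "three digits"
    else:
        return "four or more digits"

print(digit_class(7))
print(digit_class(42))
print(digit_class(123456))
```

Since the conditions are checked from top to bottom, each elif branch only needs to test the upper bound.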

2.5.2 Loops
Python offers two kinds of loops: while loops and for loops.
The while loops in Python behave as in most other programming languages:
given an expression, a code block will be executed over and over again as long as the
expression evaluates to True:
    while condition:
        code

(As before, the code can also be listed after the colon.)
On the other hand, the for loops behave differently than in Java or C++.12 They
loop over iterable collections, for example tuples, lists, slices, strings, dictionaries,
and sets. (In fact, one can create one's own iterable classes. We will not go into more
detail on this topic; see [Lot, Chapter 19] instead.)
The syntax is quite simple:
    for variable in collection:
        code

The code block code will be executed for every element in the collection collection.
In an iteration, the variable variable will refer to the current element of the
collection. The following simple example:
    P = [2, 3, 5, 7, 11, 13, 17, 19]
    for p in P:
        print p, "is a prime"

will output

    2 is a prime
    3 is a prime
    5 is a prime
    7 is a prime
    11 is a prime
    13 is a prime
    17 is a prime
    19 is a prime

12 In the C++11 standard, for loops as in Python were added.

(Note that one can also use the collection's name as the variable; in that case, the
for loop will still loop over all elements of the collection, but after the loop, the
variable will refer to the last element of the collection. In case no other variable
refers to the collection, it will not be referenced anymore afterwards and will be
freed.)
There exist helper functions called range() and xrange(), which generate lists
respectively iterable objects of integers. While range(y) yields the list [0,1,...,y-1],
the syntax of range(x,y) and range(x,y,d) is similar to slicing: range(x,y) will
return the list [x,x+1,...,y-1], and range(x,y,d) the list [x,x+d,x+2*d,...] with
all entries x + k*d with x + k*d < y (if d > 0) respectively x + k*d > y (if d < 0).
Note that range() creates a real list, so a region in memory is allocated and filled
with the required list of integers:
    >>> x = range(15)
    >>> print x
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
    >>> x = range(99, 11, -5)
    >>> print x
    [99, 94, 89, 84, 79, 74, 69, 64, 59, 54, 49, 44, 39, 34, 29, 24, 19, 14]

Using this, for x in range(15): print x will output all integers 0, 1, . . . , 14.
As mentioned above, range() creates a list in memory with all integers. This is
usually not necessary, and can be annoying if we iterate over large ranges. Here, the
xrange() function is better suited: it will create an iterable object, which requires
minimal memory storage (essentially to store the parameters, i.e. start, stop and
step, and the current index), but yields the same result as range() when used in a
for loop:

    >>> for x in xrange(5):
    ...     print x
    ...
    0
    1
    2
    3
    4

The difference to range() is only visible when outputting the results:


    >>> print range(1, 16, 3)
    [1, 4, 7, 10, 13]
    >>> print xrange(1, 16, 3)
    xrange(1, 16, 3)

If you do not need an explicit list, you should always use xrange() instead of range().
Finally, note that you can always convert an xrange() object to a list by writing
list(xrange(15)): this will yield the list [0, 1, ..., 14].
As in other programming languages, one can use continue and break inside loops.
The continue statement ends the current iteration of the loop and jumps directly to
the next one (or ends the loop if there is no more iteration). The break statement
exits the loop; no more loop iterations are executed. For for loops, the variable
used to store the current element will keep its value after the loop, so one can check
where the loop was ended.
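The following sketch combines range() with continue and break. It is written so that it runs under both Python 2.7 and Python 3; note that in Python 3, range() itself returns a lazy object similar to xrange(), which is why we wrap it in list() where an explicit list is wanted. The variable names are our own.

```python
# range(start, stop, step) follows the same rules as slicing.
countdown = list(range(99, 11, -5))

# Find the largest odd n with n * n <= 200.
largest = None
for n in range(2, 50):
    if n % 2 == 0:
        continue          # skip even numbers and go on with the next n
    if n * n > 200:
        break             # leave the loop; n keeps its current value
    largest = n

print("%d %d" % (largest, n))
```

After the loop, n still holds the value at which break was executed, so we can see where the loop stopped.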

2.6 Exceptions
Python will react to errors or invalid operations by raising/throwing an exception;
this is very similar to the same concept in Java and C++. For example, if you
do x = 1/0, Python will raise a ZeroDivisionError exception. If you do nothing
special, Python will interrupt your program, print out a traceback (which shows
where in your program the exception was raised) and print the exception with an
explanation:
    >>> x = 1/0
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: integer division or modulo by zero

Such exceptions also appear when evaluating the exponential of a too large number:
    >>> import math
    >>> math.exp(2000)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: math range error

Another example is trying to look up something which is not in a dictionary:


    >>> x = {1: "hallo", 2: "world"}
    >>> print x[3]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyError: 3

Exceptions can appear anywhere in your program, and you do not want to always
check beforehand whether an operation will work (sometimes, you cannot really do
that anyway). For this reason, Python allows you to catch exceptions. The following
example shows how that works:
    try:
        print 1/0
    except:
        print "Caught exception!"
    print "... continuing ..."

This will produce the following output:


    Caught exception!
    ... continuing ...

So instead of terminating the program, it will catch any exception raised in the
code block and continue afterwards. One can also be more specific about which
exceptions one wants to catch:
    try:
        print 1/0
        x = {0: "A", 1: "B"}
        print x[2]
        import math
        math.exp(2000)
    except ZeroDivisionError:
        print "Caught division by zero exception!"
    except KeyError, e:
        print "Caught key error exception:", e

The output will be Caught division by zero exception!. If we comment the line
print 1/0 out and run it again, we will obtain Caught key error exception: 2. Note
that by specifying , e after KeyError, we ask Python to store the exception into the
variable e. This allows us to find out more specific information on what went wrong.
Finally, if we also comment out print x[2], i.e. we are left with:
    try:
        # print 1/0
        x = {0: "A", 1: "B"}
        # print x[2]
        import math
        math.exp(2000)
    except ZeroDivisionError:
        print "Caught division by zero exception!"
    except KeyError, e:
        print "Caught key error exception:", e

the output will be:


    Traceback (most recent call last):
      File "test.py", line 6, in <module>
        math.exp(2000)
    OverflowError: math range error

Here, an exception was raised which was not caught, since it did not match any of the
except statements. If we had added a generic except: at the end, we would
have also caught this one (without knowing what kind of exception it is).
Python also allows us to specify a finally: statement, whose code block is always
executed when the try statement is left: if no exception is raised, if an except
statement was executed, and also if no matching except statement was found. For example,
    try:
        # print 1/0
        x = {0: "A", 1: "B"}
        # print x[2]
        import math
        math.exp(2000)
    except ZeroDivisionError:
        print "Caught division by zero exception!"
    except KeyError, e:
        print "Caught key error exception:", e
    finally:
        print ":-)"

will result in the output


    :-)
    Traceback (most recent call last):
      File "test.py", line 6, in <module>
        math.exp(2000)
    OverflowError: math range error

This allows us to do some clean-up, like closing files, before the program is termi-
nated.
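The interplay of try, except and finally can be summarized in one small function. The following is a sketch; the helper attempt() is our own invention. It uses the spelling except KeyError as e, which is accepted by Python 2.6 and later as well as by Python 3, whereas the spelling except KeyError, e used above only works in Python 2.

```python
import math

def attempt(action):
    """Run action() and report what happened as a list of messages."""
    messages = []
    try:
        action()
        messages.append("no exception")
    except ZeroDivisionError:
        messages.append("division by zero")
    except KeyError as e:               # e holds the exception object
        messages.append("missing key %r" % (e.args[0],))
    finally:
        messages.append("clean-up")     # executed in every case
    return messages

print(attempt(lambda: 1 // 0))
print(attempt(lambda: {0: "A", 1: "B"}[2]))
print(attempt(lambda: math.sqrt(16)))
```

An exception which matches none of the except statements, such as the OverflowError raised by math.exp(2000), still leaves attempt() after the finally block has been executed.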
Note that we can also raise exceptions ourselves: we could simply write raise ValueError
and an exception of type ValueError is thrown:
    >>> raise ValueError
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError
We can give it a value by writing raise ValueError, "ding!" or raise ValueError("ding!"):
    >>> raise ValueError, "ding!"
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: ding!
    >>> raise ValueError("ding!")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: ding!
It is also possible to create one's own exceptions by creating a class and deriving it from
Exception; see, for example, [Lot, Section 18.3].
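A minimal sketch of such a user-defined exception could look as follows. Both the exception class NotInvertibleError and the function inverse_mod() are hypothetical names chosen for this illustration; the brute-force search only serves to have a situation in which raising an exception makes sense. The code runs under Python 2.7 and Python 3.

```python
class NotInvertibleError(Exception):
    """Hypothetical exception: raised when no modular inverse exists."""
    pass

def inverse_mod(a, n):
    """Return b with a * b = 1 (mod n), found by brute-force search."""
    for b in range(1, n):
        if (a * b) % n == 1:
            return b
    raise NotInvertibleError("%d is not invertible modulo %d" % (a, n))

try:
    result = inverse_mod(4, 10)
except NotInvertibleError as e:
    result = str(e)                 # the message given to the exception

print(inverse_mod(3, 10))
print(result)
```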

2.7 Functions
Functions can be defined using the def statement. The following defines three func-
tions with zero, one and three arguments:
    def f():
        return 42

    def g(x):
        print x**4

    def h(x, y, z):
        return (x - y) * z

Now f, g and h are variables (!) referring to functions. We can assign them
to other variables, assign new values to them, and execute the functions they are
referring to:
    >>> print f()
    42
    >>> f
    <function f at 0x1fccd70>
    >>> g(2)
    16
    >>> print g(2)
    16
    None
    >>> f = g
    >>> f()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: g() takes exactly 1 argument (0 given)
    >>> f(2)
    16
    >>> g = 5
    >>> f(1)
    1
    >>> g
    5
    >>> f
    <function g at 0x1fccc80>
    >>> g(5)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not callable
    >>> h(1, 2, 3)
    -3
If a function uses the return statement, the value specified after it will be returned.
If Python encounters no return statement before it reaches the end of the function
body, it will return None.13
Note that it is possible to provide default arguments for functions:
    def f(x, y="A", z="B"):
        print x, y, z

    f("x")
    f("x", "y")
    f("x", "y", "z")

will output
    x A B
    x y B
    x y z
Finally, it is possible to define functions with a variable number of arguments.
If the last argument in the function definition is preceded by a *, one can use any
number of arguments at this position (including zero). Inside the function, the
arguments at this position will be available as a tuple:
    def f(x, y, *z):
        print x, y, len(z), z

    f("a", "b")
    f("a", "b", "c")
    f("a", "b", "c", "d")
    f("a", "b", "c", "d", "e")
    f("a")  # results in an exception: too few arguments given

13 The type of None (called NoneType) has precisely one value, None, which can be converted
to bool and then yields False. The value None is often used to indicate that nothing is there.

This will be the output:


    a b 0 ()
    a b 1 ('c',)
    a b 2 ('c', 'd')
    a b 3 ('c', 'd', 'e')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: f() takes at least 2 arguments (1 given)
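Default arguments and a variable number of arguments are handy for mathematical helper functions. The following sketch shows one function of each kind; the names evaluate() and power_sum() are our own, and the code runs under both Python 2.7 and Python 3.

```python
def evaluate(x, *coefficients):
    """Evaluate a polynomial at x by Horner's scheme.

    The coefficients are given from the highest degree down to the
    constant term; inside the function they arrive as a tuple.
    """
    result = 0
    for c in coefficients:
        result = result * x + c
    return result

def power_sum(n, exponent=1):
    """Return 1**exponent + 2**exponent + ... + n**exponent."""
    return sum(k ** exponent for k in range(1, n + 1))

print(evaluate(2, 1, 0, -3))      # x**2 - 3 at x = 2
print(power_sum(4))               # default exponent 1
print(power_sum(4, exponent=2))   # argument given by name
```

As the last call shows, an argument with a default value can also be passed by name.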

2.8 Modules
We have already seen how to use modules in Section 2.2.2 using the import and
from statements. In the following, we will present three modules which are useful
for mathematicians and for algorithm implementers.
Python offers several more modules, which we will ignore for this introduction.
Consult additional documentation for more information.

2.9 The Mathematics Module


The math module can be used with import math. Then, for example, the sin function
can be accessed by math.sin(). If we instead write from math import *, we can
access it directly by writing sin().
The math module defines two constants:

Constant    Description
pi          a float approximation of $\pi \approx 3.1416$
e           a float approximation of $e = \exp(1) \approx 2.7183$

Moreover, it defines the following functions. Note that these functions accept float
variables, but not complex variables:

Function       Description
acos(x)        Evaluates $\arccos x \in [0, \pi]$ for $x \in [-1, 1]$
asin(x)        Evaluates $\arcsin x \in [-\pi/2, \pi/2]$ for $x \in [-1, 1]$
atan(x)        Evaluates $\arctan x \in (-\pi/2, \pi/2)$ for $x \in \mathbb{R}$
atan2(x, y)    Evaluates $\arctan \frac{x}{y} \in [-\pi, \pi]$
ceil(x)        Evaluates $\lceil x \rceil$
cos(x)         Evaluates $\cos x \in [-1, 1]$ for $x \in \mathbb{R}$
cosh(x)        Evaluates $\cosh x \in [1, \infty)$ for $x \in \mathbb{R}$
exp(x)         Evaluates $\exp x \in (0, \infty)$ for $x \in \mathbb{R}$
fabs(x)        Evaluates $|x|$
floor(x)       Evaluates $\lfloor x \rfloor$
fmod(x, y)     Evaluates the remainder of the division of x by y, as computed
               by the platform's C library. The sign handling may differ
               from that of x % y.
frexp(x)       Splits x as $x = m \cdot 2^e$ with $\frac{1}{2} \le |m| < 1$ and $e \in \mathbb{Z}$
hypot(x, y)    Evaluates $\sqrt{x^2 + y^2}$ for $x, y \in \mathbb{R}$
ldexp(x, y)    Forms $x \cdot 2^y$; expects y to be an integer (int or long)
log(x)         Evaluates $\log x$ for $x \in (0, \infty)$
log10(x)       Evaluates $\log_{10} x = \frac{\log x}{\log 10}$ for $x \in (0, \infty)$
modf(x)        Splits x into fractional and integral part;
               result is a tuple of floats (fractional part first)
pow(x, y)      Evaluates $x^y$
sin(x)         Evaluates $\sin x \in [-1, 1]$ for $x \in \mathbb{R}$
sinh(x)        Evaluates $\sinh x \in \mathbb{R}$ for $x \in \mathbb{R}$
sqrt(x)        Evaluates $\sqrt{x} \in [0, \infty)$ for $x \in [0, \infty)$
tan(x)         Evaluates $\tan x \in \mathbb{R}$ for $x \in \mathbb{R}$
tanh(x)        Evaluates $\tanh x \in (-1, 1)$ for $x \in \mathbb{R}$
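Some of the less familiar functions of the math module are easiest to understand by trying them out. The following sketch uses inputs which are exactly representable as floats, so that the results can be compared exactly; the variable names are our own. It runs under both Python 2.7 and Python 3.

```python
import math

mantissa, exponent = math.frexp(8.0)      # 8.0 = 0.5 * 2**4
rebuilt = math.ldexp(mantissa, exponent)  # inverse operation of frexp
fractional, integral = math.modf(3.25)    # fractional part comes first
distance = math.hypot(3.0, 4.0)           # sqrt(3**2 + 4**2)

# fmod and % treat negative arguments differently:
remainder_fmod = math.fmod(-7.0, 2.0)     # sign of the first argument
remainder_mod = -7.0 % 2.0                # sign of the second argument

print("%s %s %s" % (mantissa, exponent, rebuilt))
print("%s %s %s" % (fractional, integral, distance))
print("%s %s" % (remainder_fmod, remainder_mod))
```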

2.10 The Random Number Generation Module


The random module can be used with import random. Note that most computers
are not able to generate "real" random numbers,14 and thus stick to so-called
pseudorandom numbers. The quality of the generated numbers depends on the used
generator and how it is seeded. If two pseudorandom number generators are seeded
in the same way, they produce the same output of random numbers. This has two
main implications:
• the seed should be chosen carefully;
• it allows us to reproduce a very specific run of a probabilistic algorithm.
Pseudorandom number generators (PRNGs for short) can be described in an abstract
sense by a finite set $S$ of states, a finite set of numbers $X$, as well as two functions
$f : S \to S$ and $g : S \to X$. Given a starting state $s_0 \in S$, define the sequence
$(s_n)_{n \in \mathbb{N}}$ by $s_{n+1} := f(s_n)$. This pseudorandom walk can be mapped to $X$ by
applying $g$: if the generator is good, and $s_0$ is not an exceptional starting state, the
sequence $g(s_0), g(s_1), g(s_2), \dots$ should look random.
It is inherent to this process that the sequence $(g(s_n))_{n \in \mathbb{N}}$ is periodic (possibly with
a non-trivial preperiod), as $S$ is finite and we thus have $s_a = s_b$ for some $a < b$, which
then implies $s_{i+k(b-a)} = s_i$ for all $i \ge a$ and $k \in \mathbb{N}$. The pseudorandom number
generator used in Python is the Mersenne Twister [MN98] RNG (random number
generator); it has a quite large period and behaves well under certain statistical tests.
(Note that it is not suited for cryptographic purposes, though.)
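The abstract description can be illustrated with a toy generator on the state set S = {0, ..., 15}. The following sketch is purely illustrative: the two transition functions and the output function are arbitrary choices of ours and have nothing to do with the Mersenne Twister. The function find_cycle() determines the preperiod and the period of the state sequence by remembering when each state was seen first.

```python
def find_cycle(f, s0):
    """Return (preperiod, period) of the sequence s0, f(s0), f(f(s0)), ..."""
    first_seen = {}
    s, index = s0, 0
    while s not in first_seen:
        first_seen[s] = index
        s = f(s)
        index += 1
    preperiod = first_seen[s]            # the index a with s_a = s_b
    return preperiod, index - preperiod  # period is b - a

def lcg_step(s):
    """Transition function f of a small linear congruential generator."""
    return (5 * s + 3) % 16

def squaring_step(s):
    """Another transition function, this time with a preperiod."""
    return (s * s) % 16

def output(s):
    """Output function g: keep the two highest bits of the state."""
    return s // 4

states = [0]
for _ in range(5):
    states.append(lcg_step(states[-1]))
outputs = [output(s) for s in states]

print(find_cycle(lcg_step, 0))
print(find_cycle(squaring_step, 2))
print(outputs)
```

The first generator runs through all 16 states before it repeats, while the second one gets stuck in the state 0 after two steps.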
Using pseudorandom numbers instead of "real" random numbers always bears
the danger of obtaining wrong results, as the generated sequence might be biased
in some (non-trivially detectable) way. When using pseudorandom numbers to
randomize algorithms, one often just assumes that they behave like "real" random
numbers. Technically, this is wrong, but in practice it usually works well.
Let us now list all functions the random module provides, sorted into three
categories: discrete (pseudo-)random number generation, continuous random number
generation (since computers can represent only a finite number of different real
numbers, this is essentially discrete as well), and additional functions to control the
state of the PRNG.

14 It is a philosophical question what "real" random numbers should be and if such numbers
actually exist. If you believe that, knowing the state of every quark in the universe at a certain
point in time, you could simulate the universe deterministically from this point of time on, then
true random numbers cannot be created in this universe. Even a very simple process, such as
throwing a die or flipping a coin, would be totally deterministic if this is the case. Still, it suffices
in practice (and seems to work well) to assume that a thrown die is random enough for all our
purposes.

Discrete Random Number Generation

Function              Description
choice(s)             Returns a uniformly distributed random element
                      of the sequence s. Raises an IndexError exception
                      if s is empty.
getrandbits(k)        Returns a uniformly distributed integer r
                      with 0 <= r < 2**k
randint(x, y)         Returns a uniformly distributed integer r
                      with x <= r <= y
randrange(y)          Equivalent to choice(range(y))
randrange(x, y)       Equivalent to choice(range(x, y))
randrange(x, y, k)    Equivalent to choice(range(x, y, k))
sample(s, k)          Given a sequence s, will select k random elements
                      (without repetitions) from s
shuffle(s)            Given a mutable sequence s, will shuffle s
                      randomly.15

Continuous Random Number Generation

Function                  Description
random()                  Returns a uniformly distributed float in [0, 1)
uniform(x, y)             Returns a uniformly distributed float in [x, y)
triangular(x, y)          Returns a number r in [x, y] distributed with a
                          symmetric triangular distribution with peak
                          at (x + y)/2
triangular(x, y, mode)    Returns a number r in [x, y] distributed with a
                          triangular distribution with peak at mode
betavariate(a, b)         Returns a number according to the Beta distribution
expovariate(l)            Returns a number according to the exponential distribution
gammavariate(a, b)        Returns a number according to the Gamma distribution
gauss(m, s)               Returns a number according to the Gaussian distribution
lognormvariate(m, s)      Returns a number according to the log-normal distribution
normalvariate(m, s)       Returns a number according to the normal distribution
vonmisesvariate(m, k)     Returns a number according to the von Mises distribution
paretovariate(a)          Returns a number according to the Pareto distribution
weibullvariate(a, b)      Returns a number according to the Weibull distribution

15 Note that unless len(s) is very small, every pseudorandom number generator will fail to
generate all possible permutations: the number len(s)! of permutations quickly exceeds the
number of states of the generator.

Consult the Python documentation for more details on the distributions.

Seeding and State Management

Function Description
seed() Seed random number generator by system time or
random source provided by the operating system
seed(x) Seed random number generator by x,
which must be a hashable object
getstate() Returns the state of the random number generator
setstate(s) Sets the state of the random number generator.
Should be a state returned by getstate()
jumpahead() Modifies the state of the random number generator
to a future position

Note that all these functions use a hidden instance of the class random.Random.
One can create one's own instance of that class to obtain a random number generator
which is independent from the default one. There also exist other random
number generator classes which use different methods than the default class (which
uses the Mersenne Twister), namely random.WichmannHill (using the Wichmann-Hill
generator) and random.SystemRandom (using the random number source provided by
the operating system).
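The reproducibility promised by seeding can be checked directly with two independent instances of random.Random. The following is a small sketch; the seed value 12345 is an arbitrary choice of ours. It runs under both Python 2.7 and Python 3, although the concrete numbers produced may differ between Python versions.

```python
import random

# Two independent generators, seeded in the same way.
gen1 = random.Random(12345)
gen2 = random.Random(12345)

rolls1 = [gen1.randint(1, 6) for _ in range(10)]
rolls2 = [gen2.randint(1, 6) for _ in range(10)]
same_rolls = (rolls1 == rolls2)

# Saving and restoring the state replays the same numbers.
state = gen1.getstate()
first = gen1.random()
gen1.setstate(state)
replayed = gen1.random()

print("%s %s" % (same_rolls, first == replayed))
```

Since gen1 and gen2 are separate instances, using them does not disturb the hidden default generator behind random.random() and the other module-level functions.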

2.10.1 The Time-It Module


The timeit module provides a timing facility which allows us to determine how much
time is spent executing a command. We will explain how to use this module by
example.
Assume that we have the following function to find all primes up to x:
    def all_primes_up_to(x):
        "Uses the Sieve of Eratosthenes to find all primes up to x."
        v = (x - 1) * [True]  # one boolean value for every integer from 2 up to x
        P = []  # list of primes found so far
        for i in xrange(x - 1):  # iterate over all entries
            if v[i]:  # is the entry irreducible?
                P.append(i + 2)  # if yes, it is prime
                for j in xrange(2 * i + 2, x - 1, i + 2):
                    v[j] = False  # strike out all non-primes
        return P

If this is in primesieve.py, we can then do from primesieve import all_primes_up_to
and after that use all_primes_up_to(100) to, for example, obtain a list of all primes
up to 100:

    >>> from primesieve import all_primes_up_to
    >>> print all_primes_up_to(100)
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
We can now test how efficient this method is from the command line:
    felix@sr1:$ python -m timeit -s "import primesieve" "primesieve.all_primes_up_to(100)"
    10000 loops, best of 3: 20.4 usec per loop
    felix@sr1:$ python -m timeit -s "import primesieve" "primesieve.all_primes_up_to(100)"
    10000 loops, best of 3: 20.6 usec per loop
    felix@sr1:$ python -m timeit -s "import primesieve" "primesieve.all_primes_up_to(200)"
    10000 loops, best of 3: 39.1 usec per loop
    felix@sr1:$ python -m timeit -s "import primesieve" "primesieve.all_primes_up_to(300)"
    10000 loops, best of 3: 58.7 usec per loop
    felix@sr1:$ python -m timeit -s "import primesieve" "primesieve.all_primes_up_to(1000)"
    10000 loops, best of 3: 193 usec per loop
    felix@sr1:$ python -m timeit -s "import primesieve" "primesieve.all_primes_up_to(2000)"
    1000 loops, best of 3: 384 usec per loop
    felix@sr1:$ python -m timeit -s "import primesieve" "primesieve.all_primes_up_to(4000)"
    1000 loops, best of 3: 769 usec per loop
    felix@sr1:$ python -m timeit -s "import primesieve" "primesieve.all_primes_up_to(10000)"
    1000 loops, best of 3: 1.93 msec per loop
What python -m timeit -s command_1 command_2 does is to first execute command_1,
and then n times execute command_2 m times in a row; it measures how much time the
m executions take in each of the n repetitions, and returns the smallest time divided by
m. The default for n is 3, and for m the command line version of timeit determines
a value so that the total time required for measuring is still acceptable. As you can
see above, m is 10000 for the first five runs, and 1000 for the last three.
Also note that even if we run the same measurement twice, the values might
be different: for primes up to 100, we obtain the two different values 20.4 usec and
20.6 usec.
We can also use timeit from inside a Python program:
    import timeit

    for x in [100, 200, 300, 1000, 2000, 4000, 10000, 100000, 1000000]:
        iterations = 4000000 / x
        timer = timeit.Timer("all_primes_up_to(" + str(x) + ")",
                             "from primesieve import all_primes_up_to")
        time = min(timer.repeat(3, iterations)) / iterations
        print x, "->", time * 1000, "milliseconds (", iterations, "iterations )"

Here, we first loop over different values of x, the integer up to which we want to
find all primes. Then we determine the number of iterations by a crude formula. We
then create a timeit.Timer object and give it the statement to execute followed by
the initialization statement. Then, timer.repeat(3, iterations) executes the statement
iterations times, three times in a row, and returns a list of three elements, where each
element is the total time (in seconds) needed for iterations executions of the command.
We choose the minimum and divide by the number of iterations. Then, we output
x followed by the time per execution of all_primes_up_to(x) in milliseconds, and
finally the number of iterations. The output could be as follows:
    100 -> 0.0202603518963 milliseconds ( 40000 iterations )
    200 -> 0.0395186066628 milliseconds ( 20000 iterations )
    300 -> 0.0593184638004 milliseconds ( 13333 iterations )
    1000 -> 0.194074273109 milliseconds ( 4000 iterations )
    2000 -> 0.385856032372 milliseconds ( 2000 iterations )
    4000 -> 0.777964830399 milliseconds ( 1000 iterations )
    10000 -> 1.95342242718 milliseconds ( 400 iterations )
    100000 -> 19.9860274792 milliseconds ( 40 iterations )
    1000000 -> 228.603720665 milliseconds ( 4 iterations )

We can use this output to see how the Sieve of Eratosthenes scales with the input
parameter x. If we also output time/x/math.log(math.log(x)), we will see that this
seems to be decreasing with x. Therefore, the complexity of the algorithm appears
to be in $O(x \log\log x)$. (Which is actually the case.) If we instead output time/x,
we see that it also decreases in the beginning, but then starts to increase again. (It
also makes sense that the algorithm's complexity cannot be in $o(x)$, since the main
loop already iterates $\Theta(x)$ times.) While this yields no proof that the running time
is indeed $O(x \log\log x)$, it yields empirical evidence that this is the case.
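The two normalized quantities mentioned above can be computed in a self-contained way as follows. This is a sketch with some deviations from the program above which should be kept in mind: the sieve is repeated inside the example (with range() instead of xrange(), so that it also runs under Python 3), the function is passed to timeit.Timer directly as a callable instead of as a string, and only a few small values of x with few iterations are used to keep the running time short. The measured numbers depend on the machine and will fluctuate.

```python
import math
import timeit

def all_primes_up_to(x):
    "Uses the Sieve of Eratosthenes to find all primes up to x."
    v = (x - 1) * [True]          # one entry for every integer from 2 up to x
    P = []
    for i in range(x - 1):
        if v[i]:
            P.append(i + 2)
            for j in range(2 * i + 2, x - 1, i + 2):
                v[j] = False      # strike out the proper multiples of i + 2
    return P

for x in [1000, 10000, 100000]:
    iterations = max(1, 200000 // x)
    timer = timeit.Timer(lambda: all_primes_up_to(x))
    seconds = min(timer.repeat(3, iterations)) / iterations
    print("%d -> time/x = %.3g, time/x/log(log(x)) = %.3g"
          % (x, seconds / x, seconds / x / math.log(math.log(x))))
```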
For some algorithms, the results one can theoretically show differ quite a lot
from the empirical measurements. Therefore, measuring how the algorithms behave
in practice is quite important in these cases.
More details on timeit can be found in the Python documentation.

2.11 Objects and Classes


Python is an object oriented language: everything in Python is an object. This
includes executable code (such as functions), integers, floating point numbers, and
strings. In this regard, Python is different from hybrid languages such as Java and
C++, which both allow object oriented programming, but where not everything is
an object.

2.11.1 Simple Classes


A simple class can be defined as follows:
    class TestClass(object):
        "A simple test class"

        i = 123

        def f(self):
            return "We got " + str(self.i)

We can use the class as follows:

    >>> a = TestClass()
    >>> a.f()
    'We got 123'
    >>> a.i = 234
    >>> a.f()
    'We got 234'
    >>> a
    <__main__.TestClass object at 0x1f64680>
    >>> type(a)
    <class '__main__.TestClass'>

This class has one attribute, i, and one method, f(). Methods can be defined
like usual functions, but always have to accept at least one argument, which will
automatically be the object the method is called on.16 Note that we can also
invoke f() of the object a by writing TestClass.f(a).
Note that Python unfortunately does not have the concept of public and private
attributes and methods (as opposed to Java and C++). Everything is public.
Creating a constructor in Python can be done by creating a method called
__init__():

    class TestClass(object):
        "A simple test class"

        def __init__(self, a):
            self.i = a

        def f(self):
            return "We got " + str(self.i)

With this, we can do:


    >>> a = TestClass(1)
    >>> a.f()
    'We got 1'
    >>> a = TestClass(3)
    >>> a.f()
    'We got 3'
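As a slightly more mathematical example of a class with a constructor, consider the following sketch. The class Interval and its methods are hypothetical names chosen for this illustration; the code runs under both Python 2.7 and Python 3.

```python
class Interval(object):
    "Hypothetical example class: the closed interval [low, high]"

    def __init__(self, low, high):
        # The constructor stores its arguments as attributes of the object.
        self.low = low
        self.high = high

    def length(self):
        return self.high - self.low

    def contains(self, x):
        return self.low <= x <= self.high

unit = Interval(0, 1)
wide = Interval(-5, 5)

print(unit.length())
print(wide.contains(3))
print(Interval.length(wide))    # same as wide.length()
```

Every object has its own attributes low and high, so changing unit does not affect wide.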

2.11.2 Inheritance
If we use the syntax as in Section 2.11.1, the class will be derived from the base class
object. But we can also derive from other classes:

    class Parent(object):
        "A simple test class from which we want to inherit"

    class Sibling(Parent):
        "A simple test class which derives from Parent"

16 If we define a method g() in TestClass taking no argument and try to call it, an exception
will be thrown with message TypeError: g() takes no arguments (1 given). Also, the call
TestClass.g() will fail with the error TypeError: unbound method g() must be called
with TestClass instance as first argument (got nothing instead).

We can test subclass relations as follows:

    >>> a = Parent()
    >>> b = Sibling()
    >>> type(a)
    <class '__main__.Parent'>
    >>> type(b)
    <class '__main__.Sibling'>
    >>> isinstance(a, Parent)
    True
    >>> isinstance(a, Sibling)
    False
    >>> isinstance(b, Parent)
    True
    >>> issubclass(Parent, Sibling)
    False
    >>> issubclass(Sibling, Parent)
    True
For more details, refer to [Lot, Chapter 24]. We will not need this here.
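
Although we will not rely on inheritance later, a small sketch may help to see what deriving from a class provides: the derived class inherits the methods of its base class and may override them. The methods describe() and greet() are hypothetical additions for illustration only; they are not part of the Parent and Sibling classes above.

```python
class Parent(object):
    "A simple test class from which we want to inherit"

    def describe(self):
        # type(self) is the class of the actual object, not necessarily Parent
        return "I am a " + type(self).__name__

    def greet(self):
        return "Hello from Parent"


class Sibling(Parent):
    "A simple test class which derives from Parent"

    def greet(self):
        # overrides Parent.greet for objects of type Sibling
        return "Hello from Sibling"


b = Sibling()
print(b.describe())   # describe() is inherited from Parent
print(b.greet())      # greet() is the overridden version
```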

2.11.3 Defining Operators


It is possible to define operators for a class. This includes almost all operators
listed in Section 2.3.1. For example, if we write a + b in Python, the interpreter
will try to evaluate a.__add__(b). If neither this method nor the fallback
b.__radd__(a) is available, it will raise an exception
(TypeError: unsupported operand type(s) for +).
The following list shows all operators which can be defined, together with the
method Python calls to apply each operator. If two corresponding calls are given,
the second one will be used if the first one is not defined.

Operator        Corresponding Call

-a              a.__neg__()
+a              a.__pos__()
abs(a)          a.__abs__()
~a              a.__invert__()
a + b           a.__add__(b) or b.__radd__(a)
a - b           a.__sub__(b) or b.__rsub__(a)
a * b           a.__mul__(b) or b.__rmul__(a)
a / b           a.__div__(b) or b.__rdiv__(a)
a // b          a.__floordiv__(b) or b.__rfloordiv__(a)
a % b           a.__mod__(b) or b.__rmod__(a)
divmod(a, b)    a.__divmod__(b) or b.__rdivmod__(a)
a ** b          a.__pow__(b) or b.__rpow__(a)
a << b          a.__lshift__(b) or b.__rlshift__(a)
a >> b          a.__rshift__(b) or b.__rrshift__(a)
a & b           a.__and__(b) or b.__rand__(a)
a | b           a.__or__(b) or b.__ror__(a)
a ^ b           a.__xor__(b) or b.__rxor__(a)
a += b          a.__iadd__(b) (should return self)
a -= b          a.__isub__(b)
a *= b          a.__imul__(b)
a /= b          a.__idiv__(b)
a //= b         a.__ifloordiv__(b)
a %= b          a.__imod__(b)
a **= b         a.__ipow__(b)
a <<= b         a.__ilshift__(b)
a >>= b         a.__irshift__(b)
a &= b          a.__iand__(b)
a |= b          a.__ior__(b)
a ^= b          a.__ixor__(b)
int(a)          a.__int__()
long(a)         a.__long__()
float(a)        a.__float__()
complex(a)      a.__complex__()
str(a)          a.__str__()
repr(a)         a.__repr__()
len(a)          a.__len__()
a[k]            a.__getitem__(k)
a[k] = x        a.__setitem__(k, x)
del a[k]        a.__delitem__(k)
x in a          a.__contains__(x)
x not in a      not a.__contains__(x)
a == b          a.__eq__(b)
a != b          a.__ne__(b)
a <> b          a.__ne__(b)
a <= b          a.__le__(b)
a < b           a.__lt__(b)
a >= b          a.__ge__(b)
a > b           a.__gt__(b)

Note that in particular the arithmetic functions can return NotImplemented in case
the arithmetic operation is not implemented or wanted for the combination of input
types. Also note that the augmented operators should return self.
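
To make the interplay of __add__(), __radd__(), __iadd__() and NotImplemented concrete, here is a minimal sketch. The class Mod7 (integers modulo 7) is a hypothetical example invented for this illustration; it only supports addition with other Mod7 objects and with plain integers, and it runs under both Python 2.7 and Python 3.

```python
class Mod7(object):
    "Hypothetical example class: an integer modulo 7"

    def __init__(self, value):
        self.value = value % 7

    def __add__(self, other):
        if isinstance(other, Mod7):
            return Mod7(self.value + other.value)
        if isinstance(other, int):
            return Mod7(self.value + other)
        return NotImplemented   # unsupported type: let Python look further

    def __radd__(self, other):
        # reached for other + self if other.__add__(self) is not usable
        return self.__add__(other)

    def __iadd__(self, other):
        if isinstance(other, Mod7):
            self.value = (self.value + other.value) % 7
        elif isinstance(other, int):
            self.value = (self.value + other) % 7
        else:
            return NotImplemented
        return self             # augmented operators should return self

    def __eq__(self, other):
        return isinstance(other, Mod7) and self.value == other.value

    def __repr__(self):
        return "Mod7(%d)" % self.value


a = Mod7(5)
left = a + 4     # calls a.__add__(4)
right = 4 + a    # (4).__add__(a) fails, so a.__radd__(4) is used
b = a
b += 6           # calls a.__iadd__(6); a is modified in place
print(left)
print(right)
print(a is b)

try:
    a + "text"   # __add__ returns NotImplemented, str has no __radd__
except TypeError:
    print("TypeError raised")
```

Since __iadd__() modifies the object and returns self, the names a and b still refer to the same object after b += 6.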
It is also possible to handle slicing: the index argument k to a.__getitem__(k),
a.__setitem__(k, x) and a.__delitem__(k) can also be a slice object, that is, an
object of type slice. The expressions a[:l], a[k:l] and a[k:l:s] are essentially
equivalent to a[slice(l)], a[slice(k, l)] and a[slice(k, l, s)], respectively.
Given an object s of type slice, one can obtain the concrete indices by calling
s.indices(length), where length is the number of elements of the sequence. It
returns a tuple of three elements (begin, end, stride), where negative values are
converted to valid indices in the range 0, ..., length - 1.
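
The following sketch shows one way a __getitem__() method might distinguish integers from slice objects and use indices() to resolve the slice. The class Squares is a hypothetical read-only sequence invented for this example; the code runs under both Python 2.7 and Python 3.

```python
class Squares(object):
    "Hypothetical read-only sequence of the squares 0, 1, 4, ..., (n-1)**2"

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, k):
        if isinstance(k, slice):
            # resolve omitted and negative values against the length n
            begin, end, stride = k.indices(self.n)
            return [i * i for i in range(begin, end, stride)]
        if k < 0:
            k += self.n          # allow indexing from the end
        if not 0 <= k < self.n:
            raise IndexError("index out of range")
        return k * k


q = Squares(6)
print(slice(-3, None).indices(6))   # (3, 6, 1)
print(q[-1])                        # 25
print(q[1:4])                       # [1, 4, 9]
print(q[::2])                       # [0, 4, 16]
```

Note that the slice case returns a plain list here; whether a slice of a custom container should rather be a new container of the same type is a design decision.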

2.11.4 Example: A Fixed-Dimension Vector Class


Let R be a ring. We want to create a vector class whose objects each hold one
element of R^n, for some n. The vector class should allow adding and subtracting
such vectors, assuming they have the same length, as well as scalar multiplication.
We also want to access the individual components of a vector.
Note that tuples are essentially vectors. Unfortunately, adding tuples means
concatenation, subtraction of tuples is not defined, and multiplying a tuple by an
integer results in the tuple being concatenated with itself repeatedly. This is not
what we want. Therefore, we have to encapsulate a tuple in a new class. In fact, we
will use a list, since a tuple is immutable, but we want the vector to be mutable.
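
The behaviour of tuples described above can be checked directly. The following lines are only a small illustration (the names t, u and componentwise are arbitrary); they run under both Python 2.7 and Python 3.

```python
t = (1, 2, 3)
u = (4, 5, 6)

print(t + u)    # concatenation: (1, 2, 3, 4, 5, 6)
print(2 * t)    # repetition: (1, 2, 3, 1, 2, 3)

try:
    t - u       # subtraction is not defined for tuples
except TypeError as e:
    print("TypeError: " + str(e))

# The componentwise sum has to be spelled out by hand:
componentwise = tuple(x + y for x, y in zip(t, u))
print(componentwise)   # (5, 7, 9)
```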
We assume that the following code is stored in vector.py. We begin with a basic
interface:

Listing 2.1: A Fixed-Dimension Vector Class

    "Defines a fixed-dimensional vector class"

    class Vector(object):
        "A fixed-dimensional vector"

        def __init__(self, dimension):
            """Initializes the vector. If an integer is given, a
            vector with values 0 (of type int) will be created
            in that dimension. If a tuple, list or Vector is
            given, a copy will be created and used as the
            vector."""
            if type(dimension) == int:
                self.data = dimension * [0]
            elif type(dimension) == Vector:
                self.data = list(dimension.data)  # create a copy
            else:
                self.data = list(dimension)

        def __len__(self):
            "Returns the dimension of the vector"
            return len(self.data)

        def __getitem__(self, k):
            "Returns the k-th component of the vector"
            return self.data[k]

        def __setitem__(self, k, x):
            "Sets the k-th component of the vector"
            if type(k) == slice:
                # we want to avoid that the user uses this to
                # change the dimension of the vector
                raise RuntimeError("No slice writing support for Vectors implemented!")
            self.data[k] = x

The constructor accepts three different kinds of data. First, it accepts a dimension
in the form of a non-negative integer (if the integer is negative, it is treated as 0).
Second, if another Vector is given, we make a copy of its internal list representation.
Finally, we try to convert the argument to a list; if the argument is any kind of
sequence (like a tuple or a list), it will be converted into a new list.
We then define len(v), v[k] and v[k] = x for a Vector v, integers k (and also
slices, for reading only) and values x. Next, we want to define addition, subtraction,
unary plus and minus, as well as scalar multiplication:
        def __add__(self, v):
            "Computes the sum of this vector and v"
            if type(v) != Vector:
                return NotImplemented
            if len(self.data) != len(v.data):
                raise RuntimeError("Vectors have different dimensions!")
            res = Vector(self)
            for i in xrange(len(res.data)):
                res[i] += v[i]
            return res

        def __sub__(self, v):
            "Computes the difference of this vector and v"
            if type(v) != Vector:
                return NotImplemented
            if len(self.data) != len(v.data):
                raise RuntimeError("Vectors have different dimensions!")
            res = Vector(self)
            for i in xrange(len(res.data)):
                res[i] -= v[i]
            return res

        def __neg__(self):
            "Computes the negative of the given vector"
            res = Vector(self)
            for i in xrange(len(res.data)):
                res[i] = -res[i]
            return res

        def __pos__(self):
            "Computes the positive of the given vector"
            res = Vector(self)
            for i in xrange(len(res.data)):
                res[i] = +res[i]
            return res

        def __mul__(self, k):
            "Computes the scalar product of this vector and k"
            res = Vector(self)
            for i in xrange(len(res.data)):
                res[i] *= k
            return res

        def __rmul__(self, k):
            "Computes the scalar product of this vector and k"
            res = Vector(self)
            for i in xrange(len(res.data)):
                res[i] *= k
            return res

Note that we have to define scalar multiplication twice, since __mul__() is only
called for v * k. For k * v, the scalar usually knows nothing about vectors, so
k.__mul__(v) returns NotImplemented; Python then tries v.__rmul__(k).
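
This fallback can be observed directly. The sketch below uses a hypothetical class Scaled as a stripped-down stand-in for Vector, so that the example is self-contained; it runs under both Python 2.7 and Python 3. As a variation on the listing above, __rmul__() here simply delegates to __mul__(), which is adequate as long as multiplication in the underlying ring is commutative.

```python
class Scaled(object):
    "Hypothetical stand-in for Vector: wraps a list of numbers"

    def __init__(self, data):
        self.data = list(data)

    def __mul__(self, k):
        # handles v * k
        return Scaled([x * k for x in self.data])

    def __rmul__(self, k):
        # handles k * v, after k.__mul__(v) has returned NotImplemented
        return self.__mul__(k)


v = Scaled([1, 2, 3])
print((5).__mul__(v) is NotImplemented)   # the integer knows nothing about Scaled
print((v * 5).data)
print((5 * v).data)
```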
Finally, we want to add methods which convert the vector to a string, or return a
textual representation from which the vector can be reconstructed:
        def __str__(self):
            "Converts vector to a string"
            s = "["
            for i in xrange(len(self.data)):
                if i != 0:
                    s += ", "
                s += str(self.data[i])
            s += "]"
            return s

        def __repr__(self):
            "Converts vector to a string"
            s = "Vector(["
            for i in xrange(len(self.data)):
                if i != 0:
                    s += ", "
                s += repr(self.data[i])
            s += "])"
            return s
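
The difference between the two methods is easiest to see with components whose str() and repr() differ, such as strings. The following sketch uses a hypothetical class Point instead of Vector to stay self-contained, and builds the strings with join() rather than an explicit loop; it runs under both Python 2.7 and Python 3. Passing the result of repr() to eval() is used here only to demonstrate that the representation is valid Python code.

```python
class Point(object):
    "Hypothetical class, used only to contrast str and repr"

    def __init__(self, data):
        self.data = list(data)

    def __str__(self):
        # readable form, components converted with str()
        return "[" + ", ".join(str(x) for x in self.data) + "]"

    def __repr__(self):
        # form which can be evaluated again, components converted with repr()
        return "Point([" + ", ".join(repr(x) for x in self.data) + "])"


p = Point(["a", 2])
print(str(p))     # [a, 2]
print(repr(p))    # Point(['a', 2])

# Evaluating the representation rebuilds an object with the same components.
q = eval(repr(p), {"Point": Point})
print(q.data == p.data)
```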

Now we can use the vector class as follows:


    >>> from vector import Vector
    >>> v = Vector([1, 2, 3, 4, 5])
    >>> w = Vector([5, 4, 3, 2, 42])
    >>> len(v)
    5
    >>> print v + w, v - w
    [6, 6, 6, 6, 47] [-4, -2, 0, 2, -37]
    >>> print -v, +w
    [-1, -2, -3, -4, -5] [5, 4, 3, 2, 42]
    >>> print 5 * v, w * 5
    [5, 10, 15, 20, 25] [25, 20, 15, 10, 210]
We can also ask for help about the Vector class:
    >>> help(Vector)
    Help on class Vector in module vector:

    class Vector(__builtin__.object)
     |  A fixed-dimensional vector
     |
     |  Methods defined here:
     |
     |  __add__(self, v)
     |      Computes the sum of this vector and v
     |
     |  __getitem__(self, k)
     |      Returns the k-th component of the vector
     |
     |  __init__(self, dimension)
     |      Initializes the vector. If an integer is given, a
     |      vector with values 0 (of type int) will be created
     |      in that dimension. If a tuple, list or Vector is
     |      given, a copy will be created and used as the
     |      vector.
     |
     |  __len__(self)
     |      Returns the dimension of the vector
     |
     |  __mul__(self, k)
     |      Computes the scalar product of this vector and k
     |
     |  __neg__(self)
     |      Computes the negative of the given vector
     |
     |  __pos__(self)
     |      Computes the positive of the given vector
     |
     |  __repr__(self)
     |      Converts vector to a string
     |
     |  __rmul__(self, k)
     |      Computes the scalar product of this vector and k
     |
     |  __setitem__(self, k, x)
     |      Sets the k-th component of the vector
     |
     |  __str__(self)
     |      Converts vector to a string
     |
     |  __sub__(self, v)
     |      Computes the difference of this vector and v
     |
     |  -----------------------------------------------------------
     |  Data descriptors defined here:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)
    (END)