
Stochastic Models

Introduction to R

Walt Pohl
Universität Zürich
Department of Business Administration

February 28, 2013


What is R?

R is a freely available, general-purpose statistical package
developed by a team of volunteers on the Internet.
It is widely used among statisticians, and new
statistical techniques are frequently first implemented in R.
It is less widely used by economists, who tend to prefer
commercial statistical packages or Matlab.

Walt Pohl (UZH QBA) Stochastic Models February 28, 2013 2/1
R versus Excel
R has many more probability and statistical
functions built in or available in free packages.
R is command-driven. You enter a sequence of
commands to manipulate your data.
While everything in Excel is in terms of cells, R has
several different data types: vectors, arrays, and
objects. You can also define your own.
Normally you will create a “.R” command file that
is separate from your data.
Note: Excel also has a separate command language –
VBA.
R versus Matlab

The real target audience for Matlab is engineers.
Matlab has many features useful for engineers but
not useful for us.
The target application for R is statistics. R has
many more statistical functions than Matlab.
Matlab started as a package for manipulating
matrices, and added other features later.
Non-matrix based operations are awkward.
R was designed for general-purpose programming
from the beginning.

R versus Other Statistics Programs

R is free.
R is more command-driven and less GUI-driven.
R is very close to S-Plus.
R supports as broad an array of operations as any
other statistics program.
R’s programming language is better designed than
most of its competitors.
Since different packages are written by different
volunteers, R is not as uniform as some other
systems.

Important URLs

R home page – http://www.r-project.org/
Closest R mirror site – http://stat.ethz.ch/CRAN/
R tutorial – http://cran.r-project.org/doc/manuals/R-intro.html

Monte Carlo Simulation in R
R has many built-in probability distributions. For each
supported distribution XXX, R comes with four functions:
dXXX – density function
pXXX – cumulative distribution function
qXXX – quantile function (inverse of the CDF)
rXXX – random draw
XXX = unif, norm, chisq, t, etc.
Example: For the normal distribution, we have dnorm,
pnorm, qnorm, rnorm.
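
As a rough illustration of how the four function families work together, the sketch below uses the normal distribution to estimate a tail probability by simulation and compares it with the exact value from pnorm. The seed, the sample size, and the threshold 1.96 are arbitrary choices made for this example, not values from the course.

```r
# Illustrative sketch: the d/p/q/r functions for the normal distribution.
set.seed(42)                  # assumed seed, only for reproducibility

dens_at_zero <- dnorm(0)      # density at 0, equal to 1 / sqrt(2 * pi)
cdf_value    <- pnorm(1.96)   # P(Z <= 1.96), about 0.975
quant_value  <- qnorm(0.975)  # inverse of the CDF, about 1.96

# Monte Carlo estimate of P(Z > 1.96) from random draws
draws       <- rnorm(100000)
mc_estimate <- mean(draws > 1.96)

# Exact value for comparison
exact <- 1 - pnorm(1.96)

print(c(monte_carlo = mc_estimate, exact = exact))
```

The same pattern applies to the other distributions by swapping the suffix, for example runif and punif for the uniform distribution.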

Vectors in R

For us, the basic R datatype is a vector of numbers.
The c command creates vectors:
Example: If you type c(1, 3, 4.5), R returns the vector
(1, 3, 4.5).
You can assign vectors to variables, using the <-
operator.
x <- c(1, 3, 4.5)

Vectors in R, cont’d

You can get the value of individual entries by using the []
operator.
x[3] will return 4.5.
You can also get subvectors by using ranges.
x[1:2] will return the vector 1, 3.
The length function allows you to refer to the end in a
range:
x[2:length(x)] will return the vector 3, 4.5.
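
The indexing rules above can be tried out directly. This short sketch recreates the example vector and applies each form of indexing; note that R counts entries starting from 1, not 0. The variable names first_two and from_second are chosen here for illustration.

```r
# Illustrative sketch of vector indexing with the example vector.
x <- c(1, 3, 4.5)

third       <- x[3]            # single entry: 4.5
first_two   <- x[1:2]          # range: 1, 3
from_second <- x[2:length(x)]  # from entry 2 to the end: 3, 4.5

print(third)
print(first_two)
print(from_second)
```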

Operations on Vectors

Where possible, any operation on vectors will be applied
elementwise.
So if x and y are two vectors of the same length, then z = x * y will be the
vector where z[i] = x[i] * y[i].
Likewise, log(x) will be the vector whose i-th entry is
log(x[i]), and so on.
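
A small sketch of elementwise behavior, using two made-up vectors of equal length. The last line shows that a single number is applied to every entry.

```r
# Illustrative sketch: arithmetic and functions act entry by entry.
x <- c(1, 2, 3)
y <- c(10, 20, 30)

z        <- x * y   # 10, 40, 90
logs     <- log(x)  # log(1), log(2), log(3)
doubled  <- 2 * x   # the scalar 2 multiplies each entry: 2, 4, 6

print(z)
print(logs)
print(doubled)
```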

Sample Statistics

R has built-in functions for the usual sample statistics:
mean(x) – Mean of vector x.
var(x) – Sample variance of vector x (denominator n - 1).
sd(x) – Sample standard deviation of vector x.
quantile(x, q) – The q-th quantile of vector x, with q between 0 and 1.
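
To make these concrete, the sketch below applies each function to a small made-up sample. The data values are invented for illustration.

```r
# Illustrative sketch: sample statistics on a small made-up vector.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

m   <- mean(x)           # 5
v   <- var(x)            # 32 / 7, since var divides by n - 1
s   <- sd(x)             # square root of the sample variance
med <- quantile(x, 0.5)  # the median, 4.5

print(c(mean = m, var = v, sd = s))
print(med)
```

Note that quantile returns a named value (here labeled "50%"); wrapping it in unname() drops the label if only the number is needed.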

Reading Data

The easiest way to import data into R is through CSV
files. Excel can export files in this format.
The function read.csv imports a CSV file.
Example: apple <- read.csv("apple.csv") imports the
file named "apple.csv" into the variable apple. The data
is returned in the form of a data frame.

Data frames

A data frame is a named list of vectors of equal length. In the case of
"apple.csv", we get four entries on the list:
DATE – end date of month.
RET – monthly return on Apple stock.
VWRETD – monthly return on the CRSP
value-weighted index.
rf – monthly risk-free rate.
You access a vector by using $. Example: apple$RET.
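
Since the course file itself is not reproduced here, the following sketch writes a tiny stand-in CSV with the same four column names to a temporary file and reads it back. All numbers are invented placeholders, not actual Apple or CRSP data, and the names demo, path, and apple_demo are chosen for this example.

```r
# Illustrative sketch: round trip through a CSV file with made-up data.
demo <- data.frame(
  DATE   = c("2012-01-31", "2012-02-29", "2012-03-30"),
  RET    = c(0.10, -0.05, 0.02),   # placeholder returns, not real data
  VWRETD = c(0.04, -0.02, 0.01),   # placeholder index returns
  rf     = c(0.001, 0.001, 0.001)  # placeholder risk-free rate
)

path <- tempfile(fileext = ".csv")
write.csv(demo, path, row.names = FALSE)

apple_demo <- read.csv(path)     # returns a data frame
col_names  <- names(apple_demo)  # "DATE" "RET" "VWRETD" "rf"
ret        <- apple_demo$RET     # one column as a vector

# Columns are vectors, so elementwise arithmetic works on them
excess <- apple_demo$RET - apple_demo$rf

print(col_names)
print(excess)
```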

Regression

R has a very easy-to-use interface for regression: the lm
function. For example, to fit the CAPM for Apple, we
would use
lm(RET ~ VWRETD, data=apple)
The first argument uses the tilde operator to indicate
that we want to regress RET on VWRETD.
The second argument indicates that the data comes
from the apple data frame.

Regression cont’d

Printing the result of lm by itself shows only the coefficients. To get more
detail, including t-statistics, use
summary(lm(RET ~ VWRETD, data=apple))
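
The sketch below runs the same kind of regression on simulated data, so that it can be executed without the course file. The data-generating values (intercept 0.005, slope 1.2, the noise levels, the seed, and 120 observations) are assumptions made purely for this illustration.

```r
# Illustrative sketch: lm and summary on simulated returns.
set.seed(1)   # assumed seed, only for reproducibility
n <- 120

VWRETD <- rnorm(n, mean = 0.01, sd = 0.04)          # simulated market return
RET    <- 0.005 + 1.2 * VWRETD + rnorm(n, sd = 0.05)  # simulated stock return
sim    <- data.frame(RET = RET, VWRETD = VWRETD)

fit <- lm(RET ~ VWRETD, data = sim)

coefs <- coef(fit)                  # intercept and slope only
tab   <- summary(fit)$coefficients  # estimates, std. errors, t values, p-values

print(coefs)
print(tab)
```

Storing the fitted model in a variable such as fit avoids re-estimating it each time a different summary is needed.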

Built-In Mathematical Functions

R has various built-in mathematical functions:
exp(x) – e^x.
log(x) – natural logarithm, log x. (Use log(x, b) for
the logarithm of x to base b.)
x^y – x raised to the power y.
sqrt(x) – square root of x.
Note these all work on vectors. exp(c(1, 2)) gives you
c(2.718282, 7.389056).

Special Mathematical Values

Floating point supports some special values:
1/0 = Inf.
-1/0 = -Inf.
0/0 = NaN.
Mathematical operations are defined for these special
values. For example, Inf + Inf = Inf, and Inf - Inf = NaN.
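
A short sketch of how these values behave in practice. The helper functions is.infinite and is.nan are the usual way to test for them, because comparing NaN with == does not return TRUE or FALSE.

```r
# Illustrative sketch: special floating-point values.
pos_inf <- 1 / 0    # Inf
neg_inf <- -1 / 0   # -Inf
not_num <- 0 / 0    # NaN

sum_inf  <- Inf + Inf   # Inf
diff_inf <- Inf - Inf   # NaN

print(is.infinite(pos_inf))   # TRUE
print(is.nan(diff_inf))       # TRUE
print(not_num == not_num)     # NA, so use is.nan() instead of ==
```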

Defining Your Own Functions

You can define a function by using R’s function
command:
f <- function(x) x^2
This creates a function that squares its argument, and
assigns it to the variable f. Calling f(2) in R will return 4.
Functions can take vector arguments. So f(c(1, 2)) will
return c(1, 4).
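
The sketch below repeats the squaring function and adds a second, hypothetical function named standardize to show a body with several statements and an argument with a default value. The function standardize is invented for this example; the last expression evaluated in the body is the return value.

```r
# Illustrative sketch: defining and calling functions.
f <- function(x) x^2

f_scalar <- f(2)         # 4
f_vector <- f(c(1, 2))   # 1, 4

# Hypothetical helper: braces group several statements in the body
standardize <- function(x, center = TRUE) {
  if (center) x <- x - mean(x)   # optionally subtract the mean
  x / sd(x)                      # last expression is returned
}

z <- standardize(c(1, 2, 3))     # -1, 0, 1

print(f_scalar)
print(f_vector)
print(z)
```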

Matrices

R also supports matrices. Use matrix(0, nrow=m,
ncol=n) to create an m-by-n matrix of zeros. For example,
g = matrix(0, nrow = 3, ncol = 4)
To access the element in the i-th row and j-th column,
use [] with two numbers. For example,
g[1,2] <- 3
assigns 3 to the element in row 1, column 2.
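
A short sketch of the matrix example, extended slightly to show that leaving one index empty selects a whole row or column. The variable names first_row and second_col are chosen here for illustration.

```r
# Illustrative sketch: creating and indexing a matrix.
g <- matrix(0, nrow = 3, ncol = 4)
g[1, 2] <- 3

size       <- dim(g)   # number of rows and columns: 3, 4
first_row  <- g[1, ]   # row 1 as a vector: 0, 3, 0, 0
second_col <- g[, 2]   # column 2 as a vector: 3, 0, 0

print(g)
print(size)
```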

Logical Operations

R has the following basic logical operations:
==: equal
!=: not equal
<, >: less than, greater than
<=, >=: less than or equal, greater than or equal
They evaluate to TRUE or FALSE.

Logical Operations on Vectors

Logical operations work on vector arguments, and return
a vector of TRUE or FALSE values.
Example: 1:10 > 5.
You can use the functions any or all to see if any or all of
the entries in the vector are TRUE.
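
A brief sketch of the comparison from the example together with any and all. The call to sum is an addition for illustration: it counts the TRUE entries, because TRUE is treated as 1 and FALSE as 0 in arithmetic.

```r
# Illustrative sketch: comparisons on vectors, with any() and all().
v <- 1:10 > 5          # FALSE for 1..5, TRUE for 6..10

some_true <- any(v)    # TRUE: at least one entry exceeds 5
every_true <- all(v)   # FALSE: not every entry exceeds 5
n_true <- sum(v)       # 5 entries are TRUE

print(v)
print(c(any = some_true, all = every_true))
print(n_true)
```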

Control Structures

R supports the standard control structures found in most
programming languages:
Branching: if
Definite iteration: for
Indefinite iteration: while

Control Structures: If

A statement like "if (test) code1 else code2" executes
code1 if the test is true, and code2 if the test is false.
(The "else code2" part can be omitted, in which case nothing
happens when the test is false.)
Example: if (0 == 0) print("is zero") else print("is not zero")
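
For longer branches, braces group the statements. The sketch below wraps an if / else chain in a hypothetical function named classify, invented for this example. In a script, the else should sit on the same line as the closing brace of the preceding branch, as written here.

```r
# Illustrative sketch: if / else if / else with braces.
classify <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

print(classify(2))
print(classify(-1))
print(classify(0))
```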

Control Structures: For

For allows you to do something a fixed number of times.
Example: for (i in 1:10) print(i)

Control Structures: While

While allows you to repeat something for as long as a condition
remains TRUE. (It may run forever if the condition never becomes FALSE.)
Example:
i = 10
while (i > 0) {
  print(i)
  i = i - 1
}
(Notice the use of braces here. This is because the body
of the while loop contains multiple statements.)

Writing Fast R Code

R is faster for vector operations than for loops.
Example: x <- (1:1000)^2
is faster than
x <- rep(0, 1000)  # create a vector of all zeros
for (i in 1:1000) {
  x[i] <- i^2
}
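
One way to check the speed claim on your own machine is system.time, as sketched below. The helper names squares_vec and squares_loop and the size n are assumptions for this example, and the measured times vary by machine and R version, so no particular timing is promised here.

```r
# Illustrative sketch: timing a vectorized computation against a loop.
n <- 100000   # assumed size, large enough for a visible difference

squares_vec <- function(n) (1:n)^2

squares_loop <- function(n) {
  x <- numeric(n)   # preallocate a vector of zeros
  for (i in 1:n) {
    x[i] <- i^2
  }
  x
}

t_vec  <- system.time(a <- squares_vec(n))
t_loop <- system.time(b <- squares_loop(n))

print(t_vec["elapsed"])
print(t_loop["elapsed"])
print(all.equal(a, b))   # both approaches give the same result
```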
