Rmetrics 2009
Outline
iterators
foreach
iterators
An S3 class with tools for iterating over various R data structures:
- Conceptually like while loops
- Defined by a nextElem function
- Like iterators in Java and other languages
Simple Examples
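A minimal sketch of the iterators API, assuming the iterators package is installed; icount() and iter() are standard constructors from that package:

```r
library(iterators)

# Count from 1 to 3; each nextElem() call advances the iterator
it <- icount(3)
nextElem(it)   # [1] 1
nextElem(it)   # [1] 2
nextElem(it)   # [1] 3
# One more call signals the 'StopIteration' condition

# Iterators also wrap existing structures, e.g. the columns of a matrix
m   <- matrix(1:6, nrow = 2)
cit <- iter(m, by = "col")
nextElem(cit)  # the first column: 1 2
```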
Another example
iquery <- function(con, statement, ..., n = 1) {
  rs <- dbSendQuery(con, statement, ...)
  nextElem <- function() {
    d <- fetch(rs, n)
    if (nrow(d) == 0) {
      dbClearResult(rs)
      stop("StopIteration")
    }
    d
  }
  structure(list(nextElem = nextElem), class = c("iquery", "iter"))
}

nextElem.iquery <- function(obj) obj$nextElem()
foreach
- New looping methods for R
- An abstract interface to parallel computing
- Python/Haskell-like list comprehensions
Foreach Syntax
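A sketch of the general form: %do% evaluates the body sequentially, %dopar% hands it to the registered parallel back-end.

```r
library(foreach)

# foreach(var = iterable, .combine = fn) %do%    { body }   sequential
# foreach(var = iterable, .combine = fn) %dopar% { body }   parallel back-end
result <- foreach(i = 1:3, .combine = c) %do% { i^2 }
result   # [1] 1 4 9

# Without .combine, the per-iteration results come back as a list
foreach(i = 1:2) %do% { i }
```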
Example
> foreach (j=1:4) %dopar% { j }
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] 4
Examples
> foreach (j=1:4, .combine=c) %dopar% { j }
[1] 1 2 3 4
> foreach (j=icount(4), .combine="+") %dopar% { j }
[1] 10
Note the difference with sum (1:4): the result is the same, but .combine="+" reduces the results pairwise as they arrive.
Another Example
library (randomForest)
x <- matrix (runif (500), 100)
y <- gl (2, 50)
rf <- foreach (ntree=rep (250, 4), .combine=combine) %dopar%
  randomForest (x, y, ntree=ntree)
%dopar% is a registration API for parallel back-ends:
- doSEQ (the default back-end)
- doMC (multicore package)
- doNWS
- doSNOW
- doRHIPE?
- doRMPI?
- ...
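Registration is one call per back-end. A sketch with doMC (Unix-alikes only; the core count here is illustrative):

```r
library(foreach)
library(doMC)

registerDoMC(cores = 2)   # all later %dopar% loops use this back-end
getDoParName()            # reports the currently registered back-end
getDoParWorkers()         # reports the number of workers

foreach(i = 1:4, .combine = c) %dopar% { i * 10 }   # [1] 10 20 30 40
```

If no back-end is registered, %dopar% falls back to sequential execution with a warning; registerDoSEQ() makes that choice explicit.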
> z <- 2
> f <- function (x) { sqrt (x + z) }
> foreach (j=1:4, .combine=c) %dopar% { f (j) }
[1] 1.732051 2.000000 2.236068 2.449490
List comprehension
> foreach (j=-2:2, .combine=c) %:% when (j>=0) %dopar% sqrt (j)
[1] 0.000000 1.000000 1.414214
Nesting
Foreach loops can be nested. Nesting admits at least two interesting cases:
- Easy loop unrolling
- Easy multi-paradigm parallelism
Loop unrolling
Compare (100 iterations of 5 parallel tasks):

x <- foreach (j=1:100, .combine=sum) %do% {
  foreach (k=1:5, .combine=c) %dopar% { j*k }
}

With an unrolled version (500 parallel tasks):

y <- foreach (j=1:100, .combine=sum) %:%
  foreach (k=1:5, .combine=c) %dopar% { j*k }

The unrolled approach is better load-balanced on a cluster.
Multi-paradigm parallelism
require (doSNOW)
cl <- makeCluster (c (n1, n2, n3, n4))
registerDoSNOW (cl)
foreach (j=<iterator>, .packages="doMC") %dopar% {
  registerDoMC ()
  foreach (k=<iterator>) %dopar% {
    ...
  }
}
simpleRule <- function (z, fast=12, slow=26, signal=9, instr, benchmark) {
  x <- MACD (z, nFast=fast, nSlow=slow, nSig=signal, maType="EMA")
  position <- sign (x[,1] - x[,2])
  s <- xts (position, order.by=index (z))
  return (instr*(s>0) + benchmark*(s<=0))
}
M <- 100
S <- foreach (j=1:(M-1), .combine=rbind, .packages=c ("xts", "TTR")) %dopar% {
  x <- rep (0, M)
  for (k in min ((j+2), M):M) {
    R <- simpleRule (Cl (MSFT), j, k, 9, Ra, Rb)
    Dt <- na.omit (R - Rb)
    x[k] <- mean (Dt) / sd (Dt)
  }
  x
}
Basic idea:
- Profile code with Rprof (profr is a nice wrapper that visualizes the results)
- Examine bottlenecks for apply-like statements and for loops with independent code blocks
- Rewrite for loops without side-effects as required (may require a custom combine function)
- Unlock the namespace, provisionally replace target function(s) and experiment (a nice trick)
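The profiling step above can be sketched with base R's Rprof sampling profiler (the file name and workload here are illustrative); profr wraps and visualizes the same output:

```r
# Start the sampling profiler, run the code under study, then stop it
Rprof ("profile.out")
x <- replicate (50, sum (sort (runif (1e5))))
Rprof (NULL)

# Tabulate where the time went; scan for apply-like calls and hot loops
head (summaryRprof ("profile.out")$by.total)
```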
Example: ipred
# With Rmpi mpi.parSapply:
require (Rmpi)
x <- mapReduce (cyl, mean (mpg), mean (hp), data=mtcars,
                applyfun=mpi.parSapply)

# With foreach:
require (foreach)
fapply <- function (A, B, C) {
  foreach (j=A, .combine=cbind) %dopar% B (j, C)
}
mapReduce (cyl, mean (mpg), mean (hp), data=mtcars, applyfun=fapply)