2014 How To Run PAM Using R in Combination With Coffalyser For P376 Customer Support Material

How to Run PAM analysis using R in combination
with Coffalyser.NET program

To classify a sample as BRCA1-like or non-BRCA1-like, a classifier written in the statistical programming
language R can be used. This classifier was developed using Prediction Analysis for Microarrays (PAM)
(Tibshirani R et al. 2002, PNAS, 99:6567-72).
PAM is an approach to cancer class prediction from gene expression profiling, based on an enhancement of
the simple nearest prototype (centroid) classifier. The class prototypes are shrunk toward the overall
centroid, which yields a classifier that is often more accurate than competing methods. The method of
nearest shrunken centroids identifies subsets of genes that best characterize each class. The technique is
general and can be used in many other classification problems. More information about this method can be
found at:
http://statweb.stanford.edu/~tibs/SAM/
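As a rough illustration of the shrinkage idea, the sketch below applies the soft-thresholding rule from the PAM paper to a few standardized centroid differences. The function name soft_threshold and the example values are made up for illustration; this is not code from Coffalyser.Net or from the pamr package.

```r
# Soft thresholding as used by nearest shrunken centroids:
# each standardized difference d is moved toward zero by delta,
# and differences smaller than delta become exactly zero.
soft_threshold <- function(d, delta) {
  sign(d) * pmax(abs(d) - delta, 0)
}

# Illustrative values only: four probes, shrinkage amount 1.
d <- c(-2.0, -0.3, 0.5, 1.5)
shrunken <- soft_threshold(d, delta = 1)
print(shrunken)  # values: -1, 0, 0, 0.5

# With delta = 0 nothing is shrunk, which corresponds to threshold=0
# in the R code of Step 9.
unshrunken <- soft_threshold(d, delta = 0)
```

Probes whose shrunken difference is zero for every class no longer contribute to the classification, which is how the method selects a subset of probes.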
Contents
How to Run PAM analysis using R in combination with Coffalyser.NET program
Coffalyser.Net - Support
Contact us
Using R scripts for calling your BRCA1ness classification with P376
Step 1: Collect all relevant data / programs
Step 2: Windows regional settings
Step 3: Install Coffalyser.Net and install R
Step 4: Install pamr package in R
Create your training data file
Formats of export files
Step 5. Analyze your training data set
Step 6. Open the experiment explorer and export data in R format
Step 7. Add the classification to the txt training data
Calling your unknown data
Step 8. Analyze your test data
Step 9. Call your data in R
Output file
Coffalyser.Net - Support
Coffalyser.net Home Wordpress

http://coffalyser.wordpress.com/
YouTube (flash instruction videos)

http://www.youtube.com/user/Coffalyser
Registration page, click on login on the left side:

http://www.mlpa.com
Wiki (our old home for support material)

http://wiki.coffalyser.net
Publication with regard to analysis methods (open book)

http://www.intechopen.com/books/modern-approaches-to-quality-control/analysis-of-mlpa-data-using-novelsoftware-coffalyser-net-by-mrc-holland
MRC-Holland Main
http://www.mlpa.com
Download R for your operating system at:

http://cran.r-project.org/bin/windows/base/
NOTE: we only tested version 2.15.1
Contact us
MRC-Holland provides free support to all Coffalyser.Net users.

For general MLPA related questions you can send an email to info@mlpa.com
For Coffalyser.Net related questions you can send an email to support@coffalyser.net
Using R scripts for calling your BRCA1ness classification with P376

Step 1: Collect all relevant data / programs
You need to have:
The latest version of Coffalyser.Net, v.140425.1321 (www.mlpa.com)

The R program (versions 3.1.1, 3.0.3 and 2.15.1 have been tested at MRC-Holland)
(http://cran.r-project.org/bin/windows/base/old/)
The training data in ABIF format (if not provided with this manual, send an email to info@mlpa.com)
The unknown data, in ABIF format, that you are planning to call as BRCA1-like or non-BRCA1-like
Step 2: Windows regional settings

Please note that, in our experience, the method does not work unless your regional settings use a dot as the
decimal separator and a comma as the thousands separator. You can adjust these settings under:
Start Menu > Control Panel > Clock, Language and Region > Region and Language > tab Formats >
Additional Settings > Customize Format
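One way to spot a wrong regional setting after the fact is to look for decimal commas in an exported txt file. The sketch below is a hypothetical helper, not part of Coffalyser.Net or pamr. It assumes the export is tab delimited and that ratio values contain no thousands separators, so any value of the form "0,93" points to a decimal comma.

```r
# Hypothetical helper: return the fields of a tab-delimited file that look
# like numbers written with a decimal comma (for example "0,93").
find_decimal_commas <- function(path) {
  fields <- scan(path, what = "", sep = "\t", quote = "", quiet = TRUE)
  fields <- trimws(fields)
  fields[grepl("^-?[0-9]+,[0-9]+$", fields)]
}

# Usage (the file name is an example only):
# suspicious <- find_decimal_commas("C:/PAM/p376 0911 trainingset.txt")
# length(suspicious) == 0 suggests the export used a decimal dot.
```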
Step 3: Install Coffalyser.Net and install R
Install Coffalyser.Net using the installation manual provided with the setup files. Also install one of the
R versions for Windows mentioned in step 1, following the instructions on the screen.
Step 4: Install pamr package in R
You will need to install the R-package for PAM training and calling. From the menu bar click on Packages
and then select Install package(s).
Next you will need to select a mirror. Select the closest mirror to your location. Now scroll through the list of
packages and select pamr and click on OK.
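The same installation can also be done from the R console instead of the menu. The sketch below wraps it in a small helper so the package is only downloaded when it is missing. The name ensure_package is made up for this example, and it assumes that pamr is available from the CRAN mirror you selected.

```r
# Hypothetical helper: install a package only if it is not yet available,
# then report whether it can be loaded.
ensure_package <- function(pkg, installer = utils::install.packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    installer(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Usage (needs an internet connection the first time):
# if (ensure_package("pamr")) library(pamr)
```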
Create your training data file

Before you can classify your unknown data you will first need to make a training data set using a set of
samples of which the type is already known. MRC-Holland can provide a set of samples in ABIF format that
includes reference samples and test samples. Within the selection of test samples, there are sporadic tumors
and BRCA1ness tumors. These samples were analyzed using the P376 lot B2-0911 MLPA mix and the
fragment products were separated on an ABI-3130 XL genetic analyzer with a GS-500 LIZ size marker.
This training set can also be used when test samples are analysed with P376 lot B3-0414.
Formats of export files
Coffalyser.Net has a special export function that will export files to a format that can directly be accepted by
R. If you do not wish to use the export function, then please consult the manual of R and PAM in order to
create input files in the correct format.
Please note: Both data types (training and Unknown data) need to be normalised in the same way. Our
recommendation is to use Coffalyser.Net to normalise your data. However, if you are using the global
normalisation method described by the NKI (see detailed instructions Lips E. et al. 2011 Breast Cancer Res.
13(5):R107), you need to normalise your unknown data set in the very same way.
Mosaicism: if your tumor samples contain normal cells, better results may be obtained by changing the
arbitrary borders to 0.85-1.2.
Step 5. Analyze your training data set

Open Coffalyser.Net and analyze the training data set provided by MRC-Holland according to the analysis
manual that can be found on the Coffalyser.Net home page. Use the reference samples and No DNA samples
as provided below.
Reference samples (the list is numbered 1 to 16; two sample names are given):
P376-B2-0911-NEW MB-NKI-REF1-CHE-1
P376-B2-0911-NEW MB-NKI-REF15-CHE-11-2
No DNA control samples:

1. P376-B2-0911-NEW MB-NKI-NODNA-CHE-5
2. P376-B2-0911-NEW MB-NKI-NODNA-CHE-10
Note: during the analysis we recommend including only samples that have a 100% score on the FMRS, that
is, samples that show 4 bars for the FMRS score in the comparative analysis. Also please note that if you
change analysis settings, you must apply these changes consistently to both your test samples and the
training data set!
Step 6. Open the experiment explorer and export data in R format

Open the experiment results from the experiment analysis form.
In the Comparative Analysis Experiment Explorer use the key combination Ctrl + Shift + Alt + R. This will
allow you to save the grid data in a specific txt file format that can be used by R.
Do not create the R-script file yet; you will need this option later for your test data.
Step 7. Add the classification to the txt training data

Open the training set data in Excel. In the first row you will see the sample names. You can make the names
easier to recognize by replacing P376-B2-0911-NEW MB-NKI- with nothing (an empty string) in the entire
row. You can do this by selecting the entire row and using the key combination Ctrl + F (Replace tab).
Now select all the columns that contain the reference samples (marked with REF) and remove these
columns from the worksheet. Also remove the columns with the sample names N20120-CHE-10,
N16986-CHE-10, B1022-CHE-6 and C020-CHE-7.
Now we need to add the classification for each sample in row 2, right underneath the sample name. You can
find the classification of all the samples of the training set in the table below. Please be sure to use the exact
classification names for all samples. If you accidentally add even a single extra character, the sample will be
treated as a new group.
Table 1: Classification of all samples in the training data

Sporadic_Like    BRCA1_Like
2058             2131
2124             2165
2134             2224
2151             2254
2169             2312
2175             2355
2182             B1007
2188             B1035
2195             B1045
2204             B1049
2216             B1058
2227             B1061
2232             B1064
2234             B1065
2276             C119
2278             C121
2295             T4147
2298             T6701
2350
C035
C036
C044
C048
C065
C068
C127
C128
C129
C130
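The renaming, column removal and labelling described in this step can also be sketched in R. The helper below is hypothetical and not part of Coffalyser.Net. It assumes that every sample name has the form prefix, sample ID, "-CHE-", run number (as in N20120-CHE-10), and that you supply the classes yourself as a named vector taken from Table 1. Only two entries are shown in the usage example.

```r
# Hypothetical helper: shorten sample names, drop reference and excluded
# samples, and look up the class of each remaining sample.
label_training_samples <- function(sample_names, classes,
                                   prefix = "P376-B2-0911-NEW MB-NKI-",
                                   drop_ids = c("N20120", "N16986", "B1022", "C020")) {
  short <- sub(prefix, "", sample_names, fixed = TRUE)
  id <- sub("-.*$", "", short)            # part before the first "-"
  keep <- !grepl("^REF", id) & !(id %in% drop_ids)
  label <- unname(classes[id[keep]])
  if (anyNA(label)) {
    stop("No class found for: ",
         paste(short[keep][is.na(label)], collapse = ", "))
  }
  data.frame(sample = short[keep], class = label, stringsAsFactors = FALSE)
}

# Usage with two entries from Table 1 (the "-CHE-3" suffixes are invented):
classes <- c("2058" = "Sporadic_Like", "2131" = "BRCA1_Like")
example <- label_training_samples(
  c("P376-B2-0911-NEW MB-NKI-2058-CHE-3",
    "P376-B2-0911-NEW MB-NKI-REF1-CHE-1",
    "P376-B2-0911-NEW MB-NKI-2131-CHE-3"),
  classes
)
```

The function stops with an error when a sample has no entry in the class vector, so a missing or mistyped ID is noticed before training.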
Now save the changes you made to the grid, be sure that you KEEP the format: Text (Tab delimited)!
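Because a single stray character creates a new class, it can help to check the saved file before training. The sketch below is a hypothetical check, not part of Coffalyser.Net. It assumes the layout that pamr.from.excel reads with sample.labels=TRUE: sample names in row 1, class names in row 2, and two identifier columns on the left.

```r
# Hypothetical check: count the class names in row 2 of the training file
# and list the samples whose class is not one of the allowed names.
check_class_row <- function(path,
                            allowed = c("Sporadic_Like", "BRCA1_Like")) {
  rows <- strsplit(readLines(path, n = 2, warn = FALSE), "\t", fixed = TRUE)
  samples <- rows[[1]][-(1:2)]   # skip the two identifier columns
  classes <- rows[[2]][-(1:2)]
  bad <- !(classes %in% allowed)
  list(counts = table(classes), unexpected = samples[bad])
}

# Usage (the file name is an example only):
# check_class_row("C:/PAM/p376 0911 trainingset.txt")
```

An empty "unexpected" entry and exactly two class names in "counts" suggest that the labels were typed consistently.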
Calling your unknown data

Step 8. Analyze your test data
Now analyze your unknown test data and open the Comparative Analysis Experiment Explorer. While on
the tab with the overview, use the keyboard combination "Ctrl + Alt + Shift + R". This will generate a txt file
suitable for importing into R. Save the data in the same folder as your test data!
When asked to create an R-script, answer Yes.
Now you will be asked to select the file that you want to use as training data. Select the training data txt file
to which you have just added the classification and click on Open.
Please note: the R code needed to train the classifier and make the calls is now copied to your clipboard, so
that you do not need to type in the code that directs the program to all the relevant file locations. If you
want to use this option, open the R program directly afterwards and paste the content of your clipboard
into the R console as explained in the next step. The R code is also saved in a txt file at the same location.
Step 9. Call your data in R

Open RGui and paste the content of the clipboard in the R Console. Depending on the locations of the files
your R code will look something like this:
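# Install pamr through the Bioconductor installer script, then load it.
source("http://bioconductor.org/biocLite.R")
biocLite("pamr")
library(pamr)

# Read the classified training set (52 = number of columns in the file) and train.
pamrB1excel <- pamr.from.excel('C:/PAM/p376 0911 trainingset.txt', 52, sample.labels = TRUE, batch.labels = FALSE)
pamr_b1_vs_spor.train <- pamr.train(pamrB1excel)

# Read the test data (27 = number of columns in the file).
pamrB1exceltest <- pamr.from.excel('C:/PAM/p376 0911 GM.txt', 27, sample.labels = TRUE, batch.labels = FALSE)

# Predicted class per sample, tabulated against row 2 of the test file.
test_predict <- pamr.predict(pamr_b1_vs_spor.train, pamrB1exceltest$x, threshold = 0)
table(pamrB1exceltest$y, test_predict)

# Posterior probability of each class per sample.
test_predict <- pamr.predict(pamr_b1_vs_spor.train, pamrB1exceltest$x, threshold = 0, type = "posterior")
test_predict
data.frame(SampleID = pamrB1exceltest$samplelabels, test_predict)

# Write the posterior probabilities to a tab-delimited output file.
write.table(data.frame(sample = pamrB1exceltest$samplelabels, test_predict), sep = "\t", row.names = FALSE, file = 'C:/PAM/OUTPUT FILE.txt')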
---------------------------------------------------------------------------------------------
In case you want to type in the R code yourself, you will need to replace the file locations with the correct
information. Please note: depending on the R version that you installed, the PAM code may work
directly. It is also possible that you receive an error message indicating a missing package.
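If you type the code yourself, the second argument of pamr.from.excel (52 and 27 in the example) also has to match your own files: it is the number of columns in the txt file, including the identifier columns on the left. The sketch below shows one way to obtain that number with base R. The helper name count_columns is made up for this example.

```r
# Hypothetical helper: number of tab-separated columns in a txt file.
# Stops if the rows do not all have the same number of columns.
count_columns <- function(path) {
  n <- count.fields(path, sep = "\t", quote = "", comment.char = "")
  if (length(unique(n)) != 1) {
    stop("Rows have differing numbers of columns")
  }
  n[1]
}

# Usage (the file name is an example only):
# ncols <- count_columns("C:/PAM/p376 0911 GM.txt")
# pamrB1exceltest <- pamr.from.excel("C:/PAM/p376 0911 GM.txt", ncols,
#                                    sample.labels = TRUE, batch.labels = FALSE)
```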
If you receive the error message:

Error: could not find function "pamr.from.excel"
Error: could not find function "pamr.train"
then the pamr package is probably not installed or not loaded. Please see step 4 on how to install the
pamr package.
Output file
Your output will look something like this. Please first run an experiment with samples that are known and
classified to validate that the method works! This output will also be available as a file in the same folder as
all the other data. This file will be called OUTPUT FILE.txt. Calls in this file are shown as posterior
probabilities per class (the R code requests type="posterior"). The cut-off value to classify a sample as
BRCA1-like should be set at 0.5. Below this score, a sample should be classified as non-BRCA1-like
(Lips E. et al. 2011 Breast Cancer Res. 13(5):R107).
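Applying the 0.5 cut-off can be sketched in R as follows. The helper name call_brca1_like is made up for this example. It assumes that the output table has a column named BRCA1_Like, which is the case when that exact class name was used in row 2 of the training file, and that a score of exactly 0.5 counts as BRCA1-like.

```r
# Hypothetical helper: turn the BRCA1_Like score into a call.
call_brca1_like <- function(posterior, cutoff = 0.5, column = "BRCA1_Like") {
  stopifnot(column %in% names(posterior))
  ifelse(posterior[[column]] >= cutoff, "BRCA1-like", "non-BRCA1-like")
}

# Illustrative values only, not real results.
scores <- data.frame(sample = c("A", "B"),
                     BRCA1_Like = c(0.91, 0.12),
                     Sporadic_Like = c(0.09, 0.88))
scores$call <- call_brca1_like(scores)

# Usage with the real output file (path as in Step 9):
# scores <- read.delim("C:/PAM/OUTPUT FILE.txt")
# scores$call <- call_brca1_like(scores)
```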