
GPU Computation in Mathematica 8

Wang Junhong, High Performance Computing


NUS Computer Centre

1. Introduction

With Mathematica, the enormous parallel processing power of Graphics Processing Units
(GPUs) can be harnessed through an integrated, built-in interface. Incorporating GPU
technology into Mathematica allows high-performance solutions to be developed in many
areas, such as financial simulation, image processing, and modelling. GPU program creation
and deployment are fully integrated with Mathematica's high-level development tools,
boosting productivity in moving from prototype to large-scale solution. With the availability
of the recently released GPU cluster, Mathematica users can explore and run Mathematica on
hundreds of GPU cores on the GPU compute nodes using the built-in CUDALink functions.

2. Install Mathematica CUDALink

Mathematica CUDALink needs to be installed in the user's home directory before first use.
To make this easier, the CUDAResources paclet has been downloaded onto the GPU
cluster's login node, Gold-c01. It can be installed with the following steps on Gold-c01:
[gold-c01]$ math
Mathematica 8.0 for Linux x86 (64-bit)
Copyright 1988-2010 Wolfram Research, Inc.
In[1]:= Needs["CUDALink`"]
In[2]:= CUDAResourcesInstall["/usr/CUDALink/CUDAResources-Lin64-8.0.0.8.paclet"]
CUDALink::initf:
Due to changes in the NVIDIA CUDA distribution, a line to load CUDALink and
OpenCLLink from the paclet directory was added to the end of the kernel
initialization file located in
"/home/svu/ccev710/.Mathematica/Kernel/init.m". The original
initialization file has been backed up to
"/home/svu/ccev710/.Mathematica/Kernel/init.m.220.bak".
Out[2]= {Paclet[CUDAResources, 8.0.0.8, <>]}
In[3]:= CUDAQ[]
Out[3]= True
In[4]:= CUDAResourcesInformation[]
Out[4]= {{Name -> CUDAResources, Version -> 8.0.0.8, BuildNumber -> -1,
> QualifiedName -> CUDAResources-Lin64-8.0.0.8,

Once the command CUDAQ[] returns True, you are ready to use Mathematica's
CUDALink for GPU computing.
Please note that the CUDALink installation occupies about 1.3 GB of disk space in your
home directory, so make sure you have sufficient space there before installing.
The default installation path is .Mathematica under your home directory.
[gold-c01]$ cd
[gold-c01]$ cd .Mathematica/
[gold-c01]$ pwd
/home/svu/ccev710/.Mathematica
[gold-c01]$ ls -l
drwx------ 4 ccev710 svuusers 1024 Aug 14 13:33 ApplicationData
drwx------ 2 ccev710 svuusers   80 Aug 14 13:33 Applications
drwx------ 3 ccev710 svuusers   80 Aug 14 13:33 Autoload
drwx------ 3 ccev710 svuusers 1024 Aug 14 15:25 FrontEnd
drwx------ 2 ccev710 svuusers 1024 Aug 14 13:34 Kernel
drwx------ 2 ccev710 svuusers   80 Aug 14 13:33 Licensing
drwx------ 5 ccev710 svuusers 1024 Aug 14 13:33 Paclets
drwx------ 6 ccev710 svuusers 1024 Aug 14 13:33 SystemFiles
[gold-c01]$ du -ms
1306    .

If you don't need to run CUDALink, you may remove the .Mathematica folder to save disk
space in your HPC home directory.
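
The disk-space advice in this section can be checked from the shell. The following is a minimal sketch, not part of the original instructions: the helper name has_free_mb and the 1400 MB threshold (the roughly 1.3 GB reported above plus a small margin) are assumptions chosen for illustration. Note that df reports free space on the filesystem only, so it will not reflect any per-user quota that may apply to the home directory.

```shell
#!/bin/sh
# has_free_mb DIR MB (hypothetical helper): succeeds when the filesystem
# holding DIR has at least MB megabytes available.
has_free_mb() {
    # df -P prints one line per filesystem; -k reports 1024-byte blocks.
    # Field 4 of the second output line is the available space.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 )) ]
}

# Assumed threshold: about 1.3 GB for CUDALink plus a small margin.
if has_free_mb "$HOME" 1400; then
    echo "Enough free space in $HOME to install CUDALink."
else
    echo "Less than 1400 MB free in $HOME; free some space first." >&2
fi

# After installation, report the size of the folder in megabytes.
if [ -d "$HOME/.Mathematica" ]; then
    du -sm "$HOME/.Mathematica"
fi
```

The first check is meant to be run before CUDAResourcesInstall; the du line mirrors the du -ms command shown earlier and prints nothing if the folder does not exist yet.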

3. Use of Mathematica CUDALink

After the first-time installation of CUDALink, you can code with Mathematica's CUDALink
functions to make use of the hundreds of GPU cores on the node. All CUDALink functions
start with the prefix CUDA. You need to load CUDALink before calling any other CUDALink
function.
The acceleration provided by Mathematica's CUDALink on GPU cores is demonstrated here
with a simple matrix multiplication, run on CPU cores and on GPU cores. As shown below,
it takes about 3.8 seconds to multiply a random 6000 x 6000 matrix by itself on CPU cores,
and about 1.9 seconds to perform the same multiplication on GPU cores with CUDADot,
roughly a twofold speedup. When the matrix is first loaded into GPU memory, the reported
time for the multiplication drops to about 0.001 second. Note that this last figure excludes
the time to copy the matrix into GPU memory, and that the result also remains in GPU
memory. This suggests that performance can be improved considerably by keeping data in
GPU memory across GPU computations.

Start Mathematica:
[gold-c01]$ math

Load CUDALink:
In[1]:= Needs["CUDALink`"]
In[2]:= CUDAQ[]
Out[2]= True

Define a {6000, 6000} matrix:
In[3]:= randM = RandomReal[1, {6000, 6000}];

Time of the multiplication on CPU cores:
In[4]:= AbsoluteTiming[randM.randM;]
Out[4]= {3.787636, Null}

Time of the multiplication on GPU cores via the CUDADot function:
In[5]:= AbsoluteTiming[CUDADot[randM, randM];]
Out[5]= {1.884636, Null}

Time of the multiplication on GPU cores via GPU memory and the CUDADot function:
In[6]:= randMG = CUDAMemoryLoad[randM]
Out[6]= CUDAMemory[1220209595, Type -> Double, Dimensions -> {6000, 6000},
In[7]:= AbsoluteTiming[res = CUDADot[randMG, randMG]]
Out[7]= {0.001054, CUDAMemory[758701228, Type -> Double,
>    Dimensions -> {6000, 6000}, ByteCount -> 288000000,

4. How to Move Forward

Mathematica's official webpage provides a detailed user guide, application examples,
CUDALink function references, and relevant tutorials. There are also detailed online
lectures and demonstrations on GPU computation in Mathematica on YouTube. Feel free to
contact ccehpc@nus.edu.sg if you need any help.

CUDALink User Guide.
CUDALink Applications.
White Paper for CUDA Programming within Mathematica.
Applications of GPU Computation in Mathematica (YouTube).
