
CONCURRENCY: PRACTICE AND EXPERIENCE, VOL. 9(11), 1127–1137 (NOVEMBER 1997)

A matrix math library for Java


TABER H. SMITH, AARON E. GOWER AND DUANE S. BONING
MIT Microsystems Technology Laboratories, Bldg. 39, Rm. 328, Cambridge, MA 02139, USA
(e-mail: taber@mit.edu)

SUMMARY
The lack of platform-independent numerical toolsets presents a barrier to the development of
distributed scientific and engineering applications. Unlike self-contained applications, which
can utilize specialized interfaces to numerical algorithms, distributed applications require a
computing environment with cohesive data structures and method interfaces. These features
are essential in providing consistency between independently developed parts of distributed applications. We describe a Java-based framework that provides a set of consistent data structures
and standard interfaces for numerical methods which operate on these data structures. The
data structures we utilize are double precision real and complex matrices in Java. Our method
interfaces are designed to model those of MATLAB. Since many engineering toolsets rely heavily on core numerical linear algebra algorithms, our current work is focused on implementing
a computational foundation of fundamental numerical algorithms operating within our matrix
framework. The matrix framework and numerical algorithm libraries are extremely useful for
a wide range of applications and should prove to be easily extendable for developing various
applications and toolsets beyond their current implementations. © 1997 John Wiley & Sons, Ltd.
Concurrency: Pract. Exper., Vol. 9(11), 1127–1137 (1997)
No. of Figures: 3.

No. of Tables: 2.

No. of References: 5.

1. INTRODUCTION
Current trends in university research are moving toward larger projects involving collaboration between several universities[1]. These changes are initiating the development of tools to aid in collaboration on research projects as well as collaborative software
for implementing research projects. These applications are being developed in a distributed
fashion, with several different groups at various schools working separately on different
projects, which are intended to be able to communicate with one another upon completion.
This framework is highlighted in Figure 1.
One motivation for our work is the need for a common mathematical framework for
these distributed applications. Although each application within a project consists of a
different functionality, they often share a common mathematical base. Therefore, in order
for the distributed applications to effectively share data in a distributed environment within
possibly different applications, common data structures are necessary. Similar to shared data
structures, shared methods (e.g. written within one application for use by other applications)
are also needed with well defined interfaces (e.g. well-defined types for inputs and outputs,
as well as the manner in which methods operate on objects).
Correspondence to: T. H. Smith, Massachusetts Institute of Technology, Microsystems Technology Laboratories, Bldg. 39, Rm. 328, Cambridge, MA 02139, USA. (e-mail: taber@mit.edu)
Contract grant sponsor: DARPA; Contract grant numbers: DABT63-94-C-0055; DABT63-95-C-0088.

CCC 1040-3108/97/111127-11 $17.50
© 1997 John Wiley & Sons, Ltd.

Received April 1997
Revised July 1997


Figure 1. The collaborative semiconductor research vision

A second motivation for the work we present here is derived from commonly used development methodologies and tools. In particular, many university research groups initially
develop scientific applications in MATLAB. The built-in graphics, simplified debugging
and large number of readily available toolboxes make this a useful framework for testing
new theories and ideas. Unfortunately, transitioning much of this research for use in industrial tests requires these initial applications to be converted into stand-alone applications
deployable on a large number of different platforms. In fact, experiments to be performed
in industrial settings can be greatly delayed by slow conversion from the MATLAB environment to a Fortran, C, or Java implementation. Increasing the efficiency of this transition
phase is imperative to the success of: (i) testing research ideas in a realistic setting provided
by industrial interactions with universities; and (ii) proliferating the implementation of new
research methodologies into widespread use.
The current effort to implement platform-independent Java applications presents the
final motivation for our work. For completely new or relatively simple applications, a
clean start with Java is the most efficient approach to creating 100% Java applications.
However, for a large portion of the scientific community this is not a reasonable solution.
Several applications depend heavily on numerical methods which have been researched and developed over many years[2–5].
For example, a current research project at the Massachusetts Institute of Technology
(MIT), Stanford, and several other universities involves the integration of tools and software for application in the semiconductor manufacturing industry. The toolsets are being
implemented in the Java framework and range from distributed design libraries for use in remote design and simulation, to characterization utilities such as a remote microscope, to process and equipment diagnostic routines, and to process control applications, which rely
heavily on well-established numerical methods (e.g. the singular value decomposition and
eigenvalue decomposition). The collaborative approach to research in the semiconductor
arena is shown in Figure 1.
This approach enables many researchers to combine their expertise and resources via
a wide range of Internet applications. Many of these independent groups have built applications utilizing numerical algorithms developed over many years which are widely
available in languages such as C and Fortran. We have found that the conversion of these
complex applications (or toolsets), which depend heavily on reliable numerical algorithms, can also be greatly hindered by the lack of algorithms available in the Java environment. While several other platforms offer excellent resources (e.g. NetLib) for finding well-debugged, reliable and efficient routines, the Java platform lags in this respect.
These three needs have driven the development of: (i) a framework, upon which a large
class of numerical algorithms can be implemented in Java; (ii) the ongoing development
of a library of Java methods, similar to that provided by the MATLAB environment,
which cover the core numerical linear algebra algorithms; and (iii) a utility for converting
MATLAB user scripts and functions to Java equivalents. We have termed our matrix
framework MatrixCafe in the spirit of MATLAB and Java. The Java implementation has
three key goals: (i) adopt the software architectural principle from MATLAB that the
complex matrix is a fundamental data object for unifying the framework; (ii) achieve a
functional equivalence to the base MATLAB layer; and (iii) create a one-to-one mapping
from MATLAB functions to equivalents in the MatrixCafe package to facilitate rapid
conversion of MATLAB user-written scripts and functions to Java.
Section 2 outlines the MatrixCafe framework we have developed. The library we are currently developing is discussed in Section 3, and an example application is given. Section 4
briefly highlights our work on converting MATLAB files to Java. Finally, we summarize
our initial work and discuss directions for future work in Section 5.
2. THE MATRIX MATH FRAMEWORK
The goal of our work is to provide a wide range of numerical algorithms in a cohesive
framework within the Java environment upon which distributed scientific and engineering
applications can be built. Specifically, we aim to address the needs outlined above: the
need for a cohesive environment for distributed computing, the need for an advanced
mathematical library and the need to efficiently transfer MATLAB scripts and functions
into a stand-alone platform independent language such as Java.
Therefore, we have focused on developing a matrix framework upon which these algorithms may be built. Although the Java language provides the ability to generate two-dimensional arrays, very few of the java.lang.Math methods readily apply to such arrays.
Some basic matrix classes in Java have been distributed with varying degrees of applicability, reliability and functionality. Our effort here is to provide a framework upon which a
full set of matrix operations and methodologies may be implemented.
Several applications in the scientific community require complex matrices for a range of
problems, including image processing and linear system analysis. Therefore, we define a
Matrix class, composed of a real matrix and an imaginary matrix. In strictly real applications, the imaginary part is ignored and underlying methods invoke real versions of the
same operations which operate on only the real component of the data.
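As a minimal sketch of this representation (the class and method names below are illustrative, not the actual MatrixCafe API), a complex matrix can be held as two parallel arrays of doubles:

```java
// Illustrative sketch of a complex matrix stored as parallel real and
// imaginary double arrays; names here are hypothetical, not MatrixCafe's.
class ComplexMatrixSketch {
    final double[][] real, imag;

    ComplexMatrixSketch(double[][] real, double[][] imag) {
        this.real = real;
        this.imag = imag;
    }

    // A strictly real matrix simply carries an all-zero imaginary part,
    // as described in the text (Java arrays are zero-initialized).
    static ComplexMatrixSketch fromReal(double[][] data) {
        return new ComplexMatrixSketch(data, new double[data.length][data[0].length]);
    }

    // Element-wise complex addition: (a+bi) + (c+di) = (a+c) + (b+d)i.
    ComplexMatrixSketch plus(ComplexMatrixSketch other) {
        int rows = real.length, cols = real[0].length;
        double[][] r = new double[rows][cols], im = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                r[i][j] = real[i][j] + other.real[i][j];
                im[i][j] = imag[i][j] + other.imag[i][j];
            }
        }
        return new ComplexMatrixSketch(r, im);
    }
}
```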
The Java language inherently provides the ability for multidimensional array processing
beyond two dimensions. However, many of the publicly available routines which currently
exist were developed with the specific application to two-dimensional matrices. Since
our focus here is to provide an environment in which to implement many of the existing
algorithms, we restrict our framework to cover only two-dimensional matrices. Although
this was one major limitation of MATLAB 4, the extension to multiple dimensions requires
a complex rule set for operations and functions. For a large percentage of applications,
this benefit is secondary to the need for an initial implementation of basic numerical linear
algebra routines.
Applications in the scientific community require varying degrees of precision. In addition,
many of the publicly available mathematical routines are written in various precisions to
address different levels of memory and efficiency requirements. However, in order to
provide a common interface to and within our framework, and to focus on providing a
reduced set of algorithms which address the largest number of applications in a reliable and
general fashion, we have chosen to standardize the Matrix class within our framework to be
composed of two two-dimensional arrays of double precision floating point numbers: one
representing the real component and the other representing the imaginary component.
With this, we restrict all the functions within our framework to accept and provide only such
matrices (with a few exceptions for importing and exporting data). In this manner, every
element within our framework is a double precision complex matrix, including scalars,
vectors, integer matrices and all floating point matrices (both real and imaginary). Where
necessary, matrices within a method may be reduced to integer precision for efficiency
or inherent integer requirements (e.g. integer matrices which contain index information in
reference to another matrix). Exceptions are made only for methods which are designed
to create matrices within the framework or to extract them from the framework. The class
definition for our framework is then of the form

public class Matrix {
    private double[][] real, imag;
}
As in the case of MATLAB, these definitions have several implications on performance.
First, the use of double precision values for single precision computations will be slow on
single precision floating-point platforms. Integer calculations will be slowed by the double
precision operations or by conversion to integers for use in integer routines. In addition,
memory requirements are greater for representing integer or single precision numbers as
double precision. Scalars and vectors will suffer a slight performance loss due to being
placed in a two-dimensional array wrapper. Future work will be required to assess the full
implication of these performance issues.
In addition to the Matrix data structure outlined above, we provide exception handling
for a large class of commonly occurring exceptions. We define a MatrixException class
as follows:

public class MatrixException extends Exception
This exception class is currently subclassed into two general exception classes. The first
class provides exception handling for matrix dimension exceptions, which are thrown
by operations that impose constraints on the dimensions of their operands. This class is
designed to catch severe problems within the matrix library methods, as well as in any
toolsets or applications being developed on top of the framework. A typical example
might be non-equivalent inner matrix dimensions in a matrix multiplication. The second
subclass of MatrixException for the Matrix class is ComputationException. The
ComputationException class itself contains several subclasses to handle: convergence
exceptions, to be thrown when iterative algorithms cannot achieve convergence; precision
exceptions, to be thrown when matrix elements fall beneath a precision limit in algorithms
Concurrency: Pract. Exper., Vol. 9, 11271137 (1997)

1997 John Wiley & Sons, Ltd.

1131

A MATRIX MATH LIBRARY

with precision arguments; sparse matrix exceptions, to be thrown by algorithms which


require non-sparse data; and condition exceptions, to be thrown by methods which require
certain matrices to have full rank. It is likely that this basic set of exceptions will be
extended, but the current set provides coverage for most problems occurring in numerical
computations.
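The hierarchy just described can be sketched as follows; only the names MatrixException, MatrixDimensionException and ComputationException appear in the text, and the constructors here are assumed for illustration:

```java
// Sketch of the two-level exception hierarchy described in the text.
// Constructor signatures are assumptions for illustration.
class MatrixException extends Exception {
    MatrixException(String message) { super(message); }
}

// Thrown by operations that impose constraints on operand dimensions,
// e.g. non-matching inner dimensions in a matrix multiplication.
class MatrixDimensionException extends MatrixException {
    MatrixDimensionException(String message) { super(message); }
}

// Parent of the convergence, precision, sparsity and condition
// exceptions enumerated in the text.
class ComputationException extends MatrixException {
    ComputationException(String message) { super(message); }
}
```

Because both subclasses extend MatrixException, a caller can catch all library exceptions with a single handler or discriminate by subclass.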
Finally, our framework defines standards for method inputs, outputs and operands. These
standards are modeled after MATLAB wherever possible. All inputs to MatrixCafe methods
are strictly required to be of type Matrix. Exceptions to this occur only in constructors and
static methods which provide Matrix object creation. For example, a Matrix constructor
exists which has as an input a two-dimensional array of single precision floating point
values. A static method zeros also takes integer inputs in order to create Matrix objects
with all zeros. Method outputs take one of two forms: single output or multiple output.
Single-output methods return exactly one Matrix object. The returned object does not replace any of the input objects (unlike routines from some other libraries). Although this convention can reduce performance, it allows intuitive code to be written. For example, the operation
Y = A(B + C)    (1)

where Y, A, B and C are matrices of appropriate dimension, is written as follows:


Matrix Y, A, B, C;
...
// Code to construct matrices A, B, and C
...
try {
    Y = A.mtimes(B.plus(C));
} catch (MatrixDimensionException e) {
    System.out.println("Fatal Error");
    System.exit(1);
}
Unfortunately, Java does not currently support methods with multiple outputs.
Although most packages overcome this barrier by including outputs as inputs, passed by
reference and modified within the routine, this implementation is not as intuitive as the
multiple output approach supported by MATLAB. Since we prefer to maintain simplicity
in our interfaces, but cannot implement the MATLAB approach directly in Java, we require
all routines with multiple outputs to return an array of Matrix objects. This has several
limitations. First, unnecessary allocation and wrapping of the outputs is required. Second,
this standard creates more work for the package users, since they are now required to know
which methods return arrays of Matrix objects. Finally, there is no method overloading
for functions with multiple outputs. This array of Matrix outputs does, however, provide
coherency within the framework as to which objects are inputs and which are outputs,
improving the ease with which the package may be understood and used. An example of the use of the singular value decomposition (SVD) follows:
Matrix A, S, V, D, Array[];
...
// Code to construct matrix A, and initialize S, V, D, and Array
...
try {
    Array = A.svd();
    S = Array[0];
    V = Array[1];
    D = Array[2];
} catch (ComputationException e) {
    System.out.println("Fatal Error computing SVD.");
    System.exit(1);
}
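The Matrix[] return convention can be illustrated with a self-contained sketch; minMax below is a hypothetical method standing in for library routines such as svd, and it packs its results into a plain double[] for brevity:

```java
// Sketch of the multiple-output convention: a method packs all of its
// results into an array instead of modifying its inputs. minMax is a
// hypothetical stand-in for multi-output routines like svd().
class MultiOutputSketch {
    // Returns {min, max} of the data as a two-element array, mirroring
    // how MatrixCafe methods with multiple outputs return Matrix[].
    static double[] minMax(double[][] data) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        for (double[] row : data) {
            for (double v : row) {
                if (v < min) min = v;
                if (v > max) max = v;
            }
        }
        return new double[] { min, max };
    }
}
```

The caller must know which index holds which output, which is exactly the documentation burden the text notes as a limitation of this approach.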
The purpose of this matrix math framework is to serve as a foundation upon which a
large number of applications and toolboxes may be built. The application hierarchy of our
particular example of collaborative tools for the semiconductor manufacturing project may
be visualized as in Figure 2. Here, widely accessible applications at the surface build upon
high-level toolsets, which in turn are built upon the MatrixCafe framework. We now turn
to describing the current state of such a library.

Figure 2. The application hierarchy of the collaborative semiconductor research resources

3. A BASIC MATRIX MATH LIBRARY IN JAVA
We have begun an effort to assemble a library of numerical algorithms, primarily to aid
in transitioning our research from current implementations in C and our developmental
MATLAB implementations to the more flexible platform-independent Java environment.
3.1. Structure
The matrix math library is built on the Matrix class definition given above, but many of the library functions currently implemented are available only for real matrices. The current goals are to convert existing routines from LAPACK[5] into Java ourselves, or to
utilize Java implementations converted elsewhere. Implementations within the framework will be restricted to these well-tested routines, which have test case examples to verify the numerical stability of the converted routines.
Currently, we have written routines to create Matrix objects within our framework
via constructors based on external data types (e.g. integers, floats and double precision
values including scalars and one- and two-dimensional arrays). We have also implemented
versions of these constructors which take only Matrix objects as inputs. In particular, we
provide constructors which fall under these main categories, whose structure is outlined in Table 1:

1. construct a matrix with elements all having a given value: zero, one or other.
2. construct identity matrices.
3. construct a matrix with random entries.
4. construct column vectors with regularly, linearly and logarithmically spaced entries.
5. construct diagonal matrices from a one-dimensional array of type double.
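As a sketch of how such static factory methods might look (the bodies below are illustrative, operating on plain double arrays rather than the library's Matrix type, though the names zeros, eye and linSpace match those listed in Table 2):

```java
// Illustrative static factories mirroring the constructor categories
// above; bodies are sketches, not the MatrixCafe implementation.
class FactorySketch {
    // All-zero m-by-n matrix (Java arrays are zero-initialized).
    static double[][] zeros(int m, int n) { return new double[m][n]; }

    // m-by-n matrix with ones on the main diagonal.
    static double[][] eye(int m, int n) {
        double[][] a = new double[m][n];
        for (int i = 0; i < Math.min(m, n); i++) a[i][i] = 1.0;
        return a;
    }

    // Vector of k linearly spaced values from lo to hi inclusive.
    static double[] linSpace(double lo, double hi, int k) {
        double[] v = new double[k];
        double step = (k > 1) ? (hi - lo) / (k - 1) : 0.0;
        for (int i = 0; i < k; i++) v[i] = lo + i * step;
        return v;
    }
}
```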

In addition, we provide a large number of methods for matrix manipulation, such as

1. methods to extract elements or submatrices into two-dimensional arrays of doubles
2. methods to extend matrices with additional elements or matrices
3. methods to extract diagonal, and upper and lower triangular portions of matrices
4. methods for flipping, rotating, reshaping and transposing matrices.

We also provide a large range of basic matrix math, including:
1. methods for addition, subtraction, multiplication, Kronecker tensor products and
powers of matrices
2. methods for element-by-element operations for each of these as well as methods for
trigonometric, exponential, logarithmic and complex math functions
3. methods for data analysis: numeric precision reduction (rounding), min, max, mean,
std, median, mode, column-wise sums and products and column-wise cumulative
sums and products
4. methods for relational operations on matrices, including: equality, greaterThan,
lessThan, greaterThanOrEqual, lessThanOrEqual, logical AND, OR, XOR and negation, as well as compound versions of these for vectors and matrices such as ANY
and ALL
5. methods for searching and sorting matrix elements based on these relational operators.
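The column-wise convention used by the data analysis methods (mirroring MATLAB, where a reduction such as mean operates down each column) can be sketched as follows; the method name here is illustrative, not the library's API:

```java
// Sketch of MATLAB-style column-wise reduction: each column of the
// input is reduced to a single value, yielding a row-vector result.
class DataAnalysisSketch {
    // Column-wise means, as MATLAB's mean(A) returns for a matrix A.
    static double[] columnMeans(double[][] a) {
        int rows = a.length, cols = a[0].length;
        double[] means = new double[cols];
        for (int j = 0; j < cols; j++) {
            double sum = 0.0;
            for (int i = 0; i < rows; i++) sum += a[i][j];
            means[j] = sum / rows;
        }
        return means;
    }
}
```

Applying such a reduction twice (first down columns, then across the resulting row vector) yields a grand total or grand mean, a pattern used in the image processing example of Section 3.3.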
Although there is not room here for a complete listing of current methods, an abbreviated one is given in Table 2, and a complete one at the MatrixCafe package's World Wide Web site:
http://www-mtl.mit.edu/~taber/MatrixCafe/MatrixCafe.html.
Current work is aimed at developing the numerical linear algebra portions of the framework by converting routines from LAPACK. The numerical linear algebra methods will grow to include:
1. methods for condition number, matrix and vector norms, matrix rank, determinant,
trace, null spaces and inverses

Table 1. Structure of the Matrix class

The Matrix class:
- Constructors: from a 1D or 2D array of doubles; from integer size specifiers; from data files.
- Static methods: zeros matrix; ones matrix; identity matrix; random matrix; evenly spaced vectors.
- Matrix manipulation methods: basic matrix math operations; basic data analysis methods; manipulations (e.g. submatrix).
- Numerical linear algebra: matrix determinants, inverses, etc.; eigenvalues; matrix decompositions (LU, QR, SVD); linear equation solving.

Table 2. Currently implemented matrixCafe.DoubleMatrix methods

Constructors and static methods:
DoubleMatrix(DataInputStream), DoubleMatrix(double[]), DoubleMatrix(double[][]), DoubleMatrix(int), DoubleMatrix(int, double), DoubleMatrix(int, int), DoubleMatrix(int, int, double), eye(int, int), init(int, int, double), linSpace(double, double, int), regSpace(double, double, int), logSpace(double, double, int), ones(int, int), zeros(int, int), rand(int, int), randn(int, int)

Methods:
abs, acos, acot, acsc, all, and, any, asec, asin, atan, ceil, copy, cos, cot, csc, cumprod, cumsum, diag, divide, exp, find, fix, fliplr, flipud, floor, equals, equalTo, extract, elementsNonZero, getElements, getNumberOfColumns, getNumberOfRows, greaterThan, greaterThanOrEqualTo, isEmpty, isFinite, isInf, isNan, isScalar, isVector, kron, lessThan, lessThanOrEqualTo, log, log10, lu, max, mean, median, min, minus, mpower, mtimes, not, notEqualTo, numCols, numRows, or, plus, power, prod, qr, readMatrix, rem, reshape, round, sec, setElements, sign, sin, size, sort, std, sum, svd, tan, times, toString, trace, transpose, tril, triu, writeMatrix, xor


2. methods for performing orthogonalization, and row echelon reduction, LU, QR and
Cholesky decompositions
3. methods for solving linear equations, non-negative least-squares, pseudoinverses and least-squares solutions
4. methods for generating eigenvalues and eigenvectors, characteristic polynomials,
generalized eigenvalues and the singular value decomposition.
As a rough measure of the magnitude of the project, we have approximately 4800 lines of
Java code as of the beginning of April 1997. We estimate that the base libraries are roughly
25% done. We hope to complete 50% of the project by the end of August 1997.
3.2. Performance
Our focus at this initial stage is not on achieving optimal efficiency, but rather on providing robust and reliable routines. Aside from the computational speed limitations
mentioned above, the interpreted nature of Java will limit the performance of the MatrixCafe
package. However, with the addition of just-in-time (JIT) compilers and Java compiler
optimizations, this performance gap is likely to decrease. Alternatives to using 100% Java
applications include using native methods written in other languages within a wrapper or
remote method invocation with application servers. Native methods are fast and decrease
the effort necessary to convert existing algorithms for use in scientific applications, but
platform dependency restricts their usefulness in distributed computing applications, as well as for more general audiences: cases in which the platform of the end user may not
be known. The use of application servers restricts the availability of libraries to those users
who have dedicated servers, which may be complicated to set up. If application servers are provided for use via the Internet, then users' performance depends on network latency and on the number of servers provided. Neither of these options is appealing for our interests.
We have not yet reached a point where we can properly assess the performance of
our framework or the routines within it. After several numerical linear algebra routines
are added to the package, we will benchmark the performance of these methods with
other implementations. We can then compare and contrast these routines with respect
to the performance issues described above. Based on these tests, we may be forced to
make changes or offer alternatives to our ideal framework outlined above. In particular,
the methods may need to be implemented for all Java types (integers, single precision
floating-point numbers, as well as double precision floating-point numbers) if computation
performance is low, if the numeric accuracy of the double precision representation causes
accuracy problems (particularly for integer computations), or if runtime type-checking causes significant performance problems. The requirement of instantiating new outputs (the non-overwriting of inputs) may also be removed if the performance loss is large.
We foresee good numeric stability within specific routines by virtue of the methods
which are being converted for use in the library. However, the stability of the software
implementation within our framework will be heavily tested through the use of the package
in several applications within the distributed semiconductor manufacturing project outlined
in Figure 1. It is hoped that these applications will also test the extensibility of the classes
for use in specific applications.


Figure 3. Java code for an image processing example using the MatrixCafe libraries

3.3. Example application


The MatrixCafe libraries are in use for a project to benchmark run-by-run process control
algorithms for the semiconductor manufacturing industry. Simulations with the library
have been successfully run and a basic process simulator has been implemented for use in
a distributed benchmarking system.
We now turn to demonstrating how these library functions may be applied to a simple image processing example. Although the application simply finds the grand mean of
the image and subtracts it from the image, it is sufficient to highlight the use of the MatrixCafe package. The purpose of the constructor for the class given in Figure 3 is to
create an ImageApplication object which contains the zero mean version of an image.
The MatrixCafe package is included as a Java package at the beginning of the code. The
DoubleMatrix variables are declared for the resulting imageMatrix class variable and
for the instance variables at the beginning of the constructor. The constructor begins by
calling a method to import the image data into a two-dimensional array of double precision
floating-point numbers (not shown). A DoubleMatrix object is then constructed with this
two-dimensional array. We see the use of the mean method for determining the column-wise
and grand means of the image. Note the intuitive use of multiple calls to the DoubleMatrix
methods. Finally, the subtraction of the grand mean from the imageMatrix variable utilizes
the exception handling provided by the package.
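Since Figure 3 itself is not reproduced here, the following plain-array sketch mirrors the steps the text describes; the actual example uses DoubleMatrix and its mean method, so the class and method names below are illustrative only:

```java
// Plain-array sketch of the zero-mean image computation described for
// Figure 3: compute the grand mean of the image, then subtract it from
// every element. The real example works through the DoubleMatrix API.
class ZeroMeanSketch {
    static double[][] zeroMean(double[][] image) {
        int rows = image.length, cols = image[0].length;
        // Grand mean of all elements (equivalently, the mean of the
        // column-wise means when all columns have equal length).
        double grand = 0.0;
        for (double[] row : image)
            for (double v : row) grand += v;
        grand /= (rows * cols);
        // Subtract the grand mean from every element of the image.
        double[][] out = new double[rows][cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                out[i][j] = image[i][j] - grand;
        return out;
    }
}
```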


4. A MATLAB M-FILE CONVERTOR


The final goal for our work is to improve the speed at which code developed in MATLAB can
be converted for use in stand-alone applications. We have modeled the MatrixCafe methods
such that their inputs and outputs have a one-to-one correspondence with the equivalent
built-in functions of MATLAB. This one-to-one mapping provided by the package will
allow user-written MATLAB scripts and functions to be converted to Java through the
use of the library. We are currently working on a parser which will automatically perform
this conversion. This will enable users to develop their applications in MATLAB with the
supporting built-in graphics and other features, while enabling them to quickly transfer the
results into Java upon completion.
5. CONCLUSIONS AND FUTURE WORK
We have developed a framework which defines a general class of matrices upon which a
large base of applications could be developed. Many basic matrix math operations have
been implemented using this framework, and efforts to complete a numerical linear algebra
library are underway. This framework should prove to be easily extendable to include
libraries for discrete-time signal processing, optimization, statistics and possibly several
other important libraries. In addition, the close structuring with that of MATLAB will
enable users to rapidly transfer new applications to a stand-alone form with the aid of a
parser.
A large amount of work is needed to complete the conversion of LAPACK routines into
the package. In addition, future work is needed to measure the performance of the given
framework as well as its stability and numerical accuracy.
ACKNOWLEDGMENTS
This research has been supported by DARPA under contracts DABT63-94-C-0055 and DABT63-95-C-0088.
REFERENCES
1. P. Losleben and D. Boning, 'A new semiconductor research paradigm using internet collaboration', Int. Conf. on Sim. of Semicond. Processes and Devices (SISPAD 96), Tokyo, September 1996.
2. G. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, MD, 2nd Edn, 1989.
3. J. J. Dongarra and E. Grosse, 'Distribution of mathematical software via electronic mail', Commun. ACM, 30, 403–407 (1987).
4. C. L. Lawson, R. J. Hanson, D. Kincaid and F. T. Krogh, 'Basic linear algebra subprograms for FORTRAN usage', ACM Trans. Math. Softw., 5, 308–323 (1979).
5. E. Anderson, Z. Bai, C. Bischof, J. W. Demmel, J. J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney and D. Sorensen, 'LAPACK: A portable linear algebra library for high-performance computers', Computer Science Dept. Technical Report CS-90-105, University of Tennessee, Knoxville, 1990, LAPACK Working Note 20.
