
Chapter 6

Mathematical Preliminaries
In this section we introduce tensors, although you have surely encountered tensors in your studies, e.g. the
stress, strain and inertia tensors. The presentation uses direct, indicial and matrix notations so that hopefully
you can rely on your experience with matrices to get through this material without too much effort. Some
results are stated without proof; however, references are provided should you desire, and I hope you do, to
obtain a more thorough understanding of the material.

6.1 Vector Spaces


A set V is a real vector space, whose elements are often called vectors, if it is equipped with two operations:
addition, denoted by the plus sign +, that takes any vector pair (a, b) ∈ V × V and generates the sum a + b,
which is also a vector, and scalar multiplication that takes any scalar–vector pair (α, a) ∈ ℝ × V and
generates the scalar multiple α a, which is also a vector, such that the following properties hold for any
vectors a, b, c ∈ V and scalars α, β ∈ ℝ:
V1 : Commutativity with respect to addition, a + b = b + a.
V2 : Associativity with respect to addition, (a + b) + c = a + (b + c).
V3 : Existence of the zero element 0 ∈ V such that a + 0 = a.
V4 : Existence of the negative element −a ∈ V for each a ∈ V such that (−a) + a = 0.
V5 : Distributivity with respect to scalar multiplication, α (β a) = (α β) a.
V6 : Distributivity with respect to scalar addition, (α + β) a = α a + β a.
V7 : Distributivity with respect to vector addition, α (a + b) = α a + α b.
V8 : Existence of the identity element 1 ∈ ℝ such that 1 a = a.

As seen here, we represent scalars as lower case Greek letters and vectors as lower case bold face Latin
letters. And because of your experience with N-dimensional vector arrays we do not mention the obvious
equalities, e.g. 0 a = 0, −a = (−1) a, etc., which can be proved using the above axioms.
We say the set of k ≥ 1 vectors {a1, a2, …, ak} that are elements of the vector space V are linearly
dependent if there exist scalars α1, α2, …, αk that are not all zero such that
0 = Σ_{i=1}^{k} αi ai = α1 a1 + α2 a2 + ⋯ + αk ak    (6.1)

otherwise we say the set is linearly independent. And we say a vector space V is k-dimensional (with k ≥ 1)
if a set of k linearly independent vectors exists, but no set of l > k linearly independent vectors exists. And
finally, a set of k linearly independent vectors in a k-dimensional vector space V forms a basis of V.
Example 6.1. The idea of vector spaces, linear dependence, linear independence and bases is not limited to vectors. Indeed, the set of N-dimensional vector arrays ℝ^N, e.g.

a = [a1, a2, …, aN]^T,

is an N-dimensional vector space, but so too is the set of N × M matrix arrays ℝ^{N×M}, e.g.

A = [ A11  A12  ⋯  A1M ]
    [ A21  A22  ⋯  A2M ]
    [  ⋮    ⋮         ⋮  ]
    [ AN1  AN2  ⋯  ANM ],

which is an NM-dimensional vector space. Possible bases for these two vector spaces with N = M = 2 are

{ [1, 0]^T, [0, 1]^T }   and   { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] },

respectively.
The set of trial functions K(Ω) = {u : Ω ⊂ ℝ → ℝ | u ∈ C¹_p(Ω) and u(x0) = 0}, cf. Eqn. 4.36, is an infinite
dimensional vector space whereas the set K^h(Ω) = {u^h : Ω ⊂ ℝ → ℝ | φi ∈ C¹_p(Ω), Ui ∈ ℝ, u^h(x) = Σ_{i=1}^{M} φi(x) Ui} is
an M-dimensional vector space, cf. Eqn. 5.27.
Note that only in the first case do we refer to the elements as vectors even though matrices and continuous functions
are also elements of vector spaces.

A set E is a Euclidean vector space if it is a three-dimensional vector space equipped with two additional
operations: the inner (dot or scalar) product, denoted by the dot ·, that takes any pair of vectors
a, b ∈ E and generates the real number a · b ∈ ℝ, and the cross product, denoted by the wedge ×, that takes
any pair of vectors a, b ∈ E and generates the vector a × b ∈ E, such that the following properties hold
for any vectors a, b, c ∈ E and scalars α, β ∈ ℝ:
E1 : Symmetry, a · b = b · a.
E2 : Linearity, (α a + β b) · c = α (a · c) + β (b · c).
E3 : Positive definiteness, a · a ≥ 0 and a · a = 0 if and only if a = 0.
E4 : a × b = −b × a.
E5 : Linearity with respect to the cross product, (α a + β b) × c = α (a × c) + β (b × c).
E6 : a · (a × b) = 0.
E7 : (a × b) · (a × b) = (a · a) (b · b) − (a · b)².
Any, not necessarily three dimensional, vector space that also exhibits properties E1 , E2 and E3 is referred
to as an inner product space.
The norm (magnitude, length, or modulus) of a vector a ∈ E is defined as

|a| = (a · a)^{1/2}.    (6.2)

If |a| = 1 then a is a unit vector and if a · b = 0 then a and b are orthogonal.


From property E7 with |a|² = a · a and |b|² = b · b we see that

( |a × b| / (|a| |b|) )² + ( a · b / (|a| |b|) )² = 1    (6.3)

to wit we define the angle θ ∈ [0, π] between the vectors a and b such that

cos θ = a · b / (|a| |b|),
sin θ = |a × b| / (|a| |b|).    (6.4)

Certainly you have seen the result that the area enclosed by the parallelogram defined by the vectors a and b
is given by a = |a| |b| sin θ = |a × b|, cf. Fig. 6.1. And from this observation you are comfortable with the fact
that a × b = 0 if and only if a and b are linearly dependent. To show necessity, i.e. a × b = 0 if a and b are
linearly dependent, let a = α b for some α ≠ 0 and apply property E5 to obtain (a × b) = (α b) × b = α (b × b)
and similarly, with the help of property E4, we have (a × b) = −(b × a) = −α (b × b). Combining these equalities
we see (a × b) = −(a × b) and hence (a × b) = 0. To show sufficiency, i.e. a × b = 0 only if a and b are
linearly dependent, please complete Exer. 6.1.
The scalar triple product [d, a, b] of the vectors a, b, d ∈ E is defined such that

[d, a, b] = d · (a × b).    (6.5)

This can be interpreted as the signed volume of the parallelepiped defined by the vectors a, b and d, cf. Fig.
6.1. Indeed, the signed volume is the product of the base area a = |a × b| and the height h = d · n = cos θ |d|
where n = (1/|a × b|) a × b is the unit vector that is orthogonal to the plane defined by the vectors a and b.
It is not difficult to show that for any α, β ∈ ℝ and any a, b, c, d ∈ E we have

[d, a, b] = [a, b, d] = [b, d, a] = −[d, b, a] = −[a, d, b] = −[b, a, d],
[α a + β b, c, d] = α [a, c, d] + β [b, c, d],
[a, b, c] = 0 if and only if {a, b, c} are linearly dependent.    (6.6)

And thus we see that the volume can be negative depending on how we order the vectors in the scalar
triple product. The equality [d, a, b] = −[d, b, a] follows from the definition of the triple product, i.e. Eqn.
6.5, and properties E2 and E4. The equality [d, a, b] = −[a, d, b] follows from 0 = [a + d, a + d, b] =
[a, a + d, b] + [d, a + d, b] = [a, a, b] + [a, d, b] + [d, a, b] + [d, d, b] = [a, d, b] + [d, a, b], which follows
from properties E6, E2, E5 and E6, respectively. To prove the necessity of the last equality, i.e. [a, b, c] = 0
if {a, b, c} are linearly dependent, note that if {a, b, c} are linearly dependent, then α a + β b + γ c = 0 for
some α, β, γ ∈ ℝ not all zero. Without loss of generality, assume α ≠ 0 so that a = −β/α b − γ/α c and hence
[a, b, c] = [−β/α b − γ/α c, b, c] = −β/α [b, b, c] − γ/α [c, b, c] = −β/α [b, b, c] + γ/α [c, c, b] = 0, which
follows from Eqns. 6.6.2 and 6.6.1 and E6. To show sufficiency see Exer. 6.3.
The basis {e1, e2, e3} for E is orthonormal if

ei · ej = δij,    (6.7)

Figure 6.1: Illustrations of c = a × b and [d, a, b] = d · (a × b).


where

δij = 1 if i = j,   δij = 0 if i ≠ j    (6.8)

is the Kronecker delta. For example e1 · e1 = 1 whereas e1 · e2 = 0, which implies e1 is a unit vector that is
orthogonal to e2. Using the basis allows us to express any a ∈ E as
a = a1 e1 + a2 e2 + a3 e3,    (6.9)

where a1, a2 and a3 are the components of a relative to the basis {e1, e2, e3}, cf. Fig. 6.2. Introducing the
indicial (or Einstein summation) convention we write the above as

a = ai ei,    (6.10)

i.e. it is henceforth understood that when any, so called dummy, subscript appears twice in an expression
then it is to be summed from 1 to 3, e.g. ai ei = Σ_{i=1}^{3} ai ei. The dummy terminology is due to the fact that
the result is independent of the index choice, e.g. ai ei = aj ej = ak ek = ⋯.
Our use of an orthonormal basis allows us to compute the components quite easily, viz.

a · ej = (ai ei) · ej
       = ai (ei · ej)
       = ai δij
       = aj,    (6.11)

where we make use of the fact that ei · ej = δij = 1 only when i = j. And hence we can express any vector
as

a = (a · ei) ei.    (6.12)

It is emphasized that the appearance of the Kronecker delta in a summation generally allows one to eliminate
an index.

Figure 6.2: Illustrations of the righthanded orthonormal basis {e1, e2, e3} and vector components.
To perform our computations we express the vectors by their 3-dimensional array of components, i.e.

a = [a1, a2, a3]^T,    (6.13)

whence

a + b = [a1 + b1, a2 + b2, a3 + b3]^T,    (6.14)
which harkens back to your first vector encounter. And referring to Exam. 6.1 we see that

e1 = [1, 0, 0]^T,   e2 = [0, 1, 0]^T,   e3 = [0, 0, 1]^T.    (6.15)

In general we can show that [e1, e2, e3] = ±1, however herein we limit ourselves to righthanded bases
for which [e1, e2, e3] = 1 and this implies

e1 = e2 × e3,   e2 = e3 × e1,   e3 = e1 × e2,    (6.16)

cf. Exer. 6.6. We will henceforth work with a righthanded, i.e. positively oriented, orthonormal basis, i.e.
the basis you are familiar with, cf. Fig. 6.2.
It is convenient now to introduce the alternator εijk defined such that

εijk =  1 if {i, j, k} = {1, 2, 3}, {2, 3, 1} or {3, 1, 2}, i.e. cyclic permutations of 1, 2, 3
       −1 if {i, j, k} = {1, 3, 2}, {2, 1, 3} or {3, 2, 1}, i.e. non-cyclic permutations of 1, 2, 3
        0 otherwise,    (6.17)

Figure 6.3: Illustrations of the alternator, left: cyclic (εijk = 1) and right: non-cyclic (εijk = −1) permutations.

cf. Fig. 6.3. In this way property E4 and Eqns. 6.5 and 6.16 yield

ej × ek = εmjk em,
[ei, ej, ek] = ei · (ej × ek)
             = ei · (εmjk em)
             = εmjk ei · em
             = εmjk δim
             = εijk    (6.18)

and hence

a × b = (ai ei) × (bj ej)
      = ai bj (ei × ej)
      = εkij ai bj ek
      = | e1  e2  e3 |
        | a1  a2  a3 |
        | b1  b2  b3 |
      = (a2 b3 − a3 b2) e1 + (a3 b1 − a1 b3) e2 + (a1 b2 − a2 b1) e3
      = [ a2 b3 − a3 b2 ]
        [ a3 b1 − a1 b3 ]
        [ a1 b2 − a2 b1 ],    (6.19)

where | · | denotes the determinant. Similarly we have

a · b = (ai ei) · (bj ej)
      = ai bj (ei · ej)
      = ai bj δij
      = ai bi
      = [a1 a2 a3] [b1, b2, b3]^T
      = a1 b1 + a2 b2 + a3 b3,

|a| = (a · a)^{1/2}
    = (ai ai)^{1/2}
    = (a1 a1 + a2 a2 + a3 a3)^{1/2},    (6.20)

which are the usual results. Combining the previous two results gives

[c, a, b] = c · (a × b)
          = (cm em) · (εkij ai bj ek)
          = ai bj cm εkij (em · ek)
          = ai bj cm εkij δkm
          = εkij ai bj ck
          = | a1  a2  a3 |
            | b1  b2  b3 |
            | c1  c2  c3 |
          = (a1 b2 c3 + a2 b3 c1 + a3 b1 c2) − (a1 b3 c2 + a2 b1 c3 + a3 b2 c1).    (6.21)
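To make the indicial manipulations above concrete, here is a minimal numerical sketch of Eqns. 6.19–6.21. It is my addition, not part of the original text; it assumes NumPy and uses arbitrary vector values chosen only for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 2.0])
c = np.array([0.3, -2.0, 1.0])

# Eqn. 6.20: dot product a . b = a_i b_i
print(np.dot(a, b))

# Eqn. 6.19: cross product components (a2 b3 - a3 b2, a3 b1 - a1 b3, a1 b2 - a2 b1)
print(np.cross(a, b))

# Eqn. 6.21: scalar triple product [c, a, b] = c . (a x b) equals the determinant
# with rows a, b, c (using the cyclic permutation [c, a, b] = [a, b, c] of Eqn. 6.6)
print(np.dot(c, np.cross(a, b)), np.linalg.det(np.array([a, b, c])))
```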

6.2 Linear Transformations Tensors


Students sometimes panic when they hear the word tensor. However, a tensor A is nothing more than a
linear transformation, i.e. a linear function, that eats elements from a vector space V and spits out elements
into the same vector space V, i.e. A : V → V. We will encounter strain and stress tensors, with which you are
probably familiar. Perhaps you have also seen tensors in your dynamics class, e.g. the inertia tensor, which
eats the angular velocity vector and spits out the angular momentum vector. We denote tensors by upper
case bold face Latin letters and the set of all tensors, i.e. linear transformations from E to E, as L.
Because A is a linear function we have, for any α, β ∈ ℝ and a, b ∈ V,

A(α a + β b) = α A(a) + β A(b).    (6.22)

As seen above, A behaves like a matrix and for this reason we usually write the value b = A(a) simply
as b = A a. We are getting a bit ahead of ourselves here, but momentarily we show that indeed, for
computational purposes, we may express

A = [ A11  A12  A13 ]
    [ A21  A22  A23 ]
    [ A31  A32  A33 ],    (6.23)

where the Aij are the components of A.
The tensors A, B ∈ L are said to be equal if A a = B a for all vectors a ∈ E. With this we define tensor
addition and scalar multiplication as

(A + B) a = A a + B a,
(α A) a = α (A a)    (6.24)

for every α ∈ ℝ and A, B ∈ L.


The zero tensor O ∈ L is defined such that for every a ∈ E

O a = 0.    (6.25)

And the identity tensor I ∈ L is defined such that

I a = a.    (6.26)

Hopefully these definitions are familiar looking.


Using the above definitions it is not hard to show that the elements of L satisfy the vector space properties
V1 – V8 set forth in Sect. 6.1, i.e. L, the set of all tensors (that map vectors to vectors), is itself a vector
space, cf. Exam. 6.1. Specifically for every α, β ∈ ℝ and A, B, C ∈ L we have

A + B = B + A,
(A + B) + C = A + (B + C),
O + A = A,
(−A) + A = O,
α (β A) = (α β) A,
(α + β) A = α A + β A,
α (A + B) = α A + α B,
1 A = A.    (6.27)

For example, to prove A + B = B + A we resort to the tensor addition definition, property V1, and the tensor
addition definition again, to wit (A + B) c = A c + B c = B c + A c = (B + A) c. We finally apply the tensor equality
definition to (A + B) c = (B + A) c.
You are familiar with composite functions f ∘ g defined such that y = f ∘ g(x) = f(g(x)). A composite
tensor (function) is defined similarly, i.e. A ∘ B such that b = A ∘ B(a) = A(B(a)). However, upon dropping
the parentheses we have b = A ∘ B a and upon dropping the ∘, a practice we continue henceforth, we are left
with b = A B a and for this reason, tensor composition is generally referred to as tensor multiplication. Note
that in general A B ≠ B A. Using the above definitions the obvious identities follow

α (A B) = (α A) B = A (α B),
A (B + C) = A B + A C,
(A + B) C = A C + B C,
A (B C) = (A B) C,
A O = O A = O,
I A = A I = A    (6.28)

for every α ∈ ℝ and A, B, C ∈ L.


6.2.1 Dyadic Product


The dyadic (outer or tensor) product of the vectors a, b ∈ E is the tensor a ⊗ b ∈ L defined such that

(a ⊗ b) c = (c · b) a    (6.29)

for every vector c ∈ E. Using the symmetry of the dot product, placing the scalar (c · b) = (b · c) to the
right of a, and using the vector representation gives (a ⊗ b) c = a (b · c) = {a} ({b}^T {c}) = ({a} {b}^T) {c}, cf.
Eqns. 6.13 and 6.20, i.e.

a ⊗ b = [ a1 ] [ b1  b2  b3 ]
        [ a2 ]
        [ a3 ]
      = [ a1 b1  a1 b2  a1 b3 ]
        [ a2 b1  a2 b2  a2 b3 ]
        [ a3 b1  a3 b2  a3 b3 ].    (6.30)

Again, we are getting a bit ahead of ourselves here.
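A quick numerical sketch of Eqns. 6.29 and 6.30, assuming NumPy (my addition): the dyadic product is the outer product {a}{b}^T, and (a ⊗ b) c returns (c · b) a.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -1.0, 0.5])
c = np.array([0.2, 1.0, -2.0])

ab = np.outer(a, b)          # Eqn. 6.30: (a ⊗ b)_ij = a_i b_j
print(ab @ c)                # (a ⊗ b) c ...
print(np.dot(c, b) * a)      # ... equals (c · b) a, cf. Eqn. 6.29
```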


Using the vector space properties, the definition of Eqn. 6.29 and the definition of tensor multiplication (composition) it follows that for any a, b, c, d, f ∈ E

(a ⊗ b) (c ⊗ d) f = (a ⊗ b) [(c ⊗ d) f]
                  = (a ⊗ b) [(d · f) c]
                  = (f · d) [(a ⊗ b) c]
                  = (f · d) (b · c) a
                  = (b · c) (f · d) a
                  = (b · c) (a ⊗ d) f.

And since the above holds for all f we have

(a ⊗ b) (c ⊗ d) = (b · c) (a ⊗ d).    (6.31)

Arguing in a similar way as above, i.e. by using the definition of Eqn. 6.29 and the arbitrariness of f, it can be
verified that

(α a + β b) ⊗ c = α (a ⊗ c) + β (b ⊗ c),
a ⊗ (α b + β c) = α (a ⊗ b) + β (a ⊗ c),
I = ei ⊗ ei    (6.32)

for any α, β ∈ ℝ.

Referring back to Eqns. 6.15 and 6.30, note that in particular

e1 ⊗ e1 = [1 0 0; 0 0 0; 0 0 0],   e1 ⊗ e2 = [0 1 0; 0 0 0; 0 0 0],   e1 ⊗ e3 = [0 0 1; 0 0 0; 0 0 0],
e2 ⊗ e1 = [0 0 0; 1 0 0; 0 0 0],   e2 ⊗ e2 = [0 0 0; 0 1 0; 0 0 0],   e2 ⊗ e3 = [0 0 0; 0 0 1; 0 0 0],
e3 ⊗ e1 = [0 0 0; 0 0 0; 1 0 0],   e3 ⊗ e2 = [0 0 0; 0 0 0; 0 1 0],   e3 ⊗ e3 = [0 0 0; 0 0 0; 0 0 1]    (6.33)

and hence, upon recalling Exam. 6.1, we suspect that the tensors ei ⊗ ej form a basis on L and this is indeed
the case. To see this more formally consider the operation a = A b for which the following holds

a = ai ei
  = (a · ei) ei
  = (ei · a) ei
  = [ei · (A b)] ei
  = [ei · (A [bj ej])] ei
  = [ei · (A [(b · ej) ej])] ei
  = [ei · (A ej)] (b · ej) ei
  = {[ei · (A ej)] (ei ⊗ ej)} b,    (6.34)

where we made use of Eqns. 6.10, 6.12 and 6.29 and the fact that (b · ej) is a scalar. And hence upon defining
the tensor components via

Aij = ei · (A ej)    (6.35)

we are able to express any tensor as

A = Aij ei ⊗ ej,    (6.36)

i.e. as a matrix of components, cf. Eqn. 6.23. As seen here the basis {e1, e2, e3} on E induces the basis
{e1 ⊗ e1, e2 ⊗ e1, …, e3 ⊗ e3} on L; and as expected L is 9 = 3²-dimensional.
Associating the tensor with the matrix array makes computations easy to perform. For example consider
a = A b
ai ei = [Aij ei ⊗ ej] (bk ek)
      = (Aij bk) (ei ⊗ ej) ek
      = (Aij bk) (ek · ej) ei
      = (Aij bk) δjk ei
      = Aij bj ei,    (6.37)

i.e. ai = Aij bj and hence we have

[ a1 ]   [ A1j bj ]   [ A11  A12  A13 ] [ b1 ]
[ a2 ] = [ A2j bj ] = [ A21  A22  A23 ] [ b2 ].    (6.38)
[ a3 ]   [ A3j bj ]   [ A31  A32  A33 ] [ b3 ]
The other operations involving tensors and vectors are similarly performed, notably

A B = [Aij (ei ⊗ ej)] [Bkl (ek ⊗ el)]
    = Aij Bkl (ei ⊗ ej) (ek ⊗ el)
    = Aij Bkl (ej · ek) (ei ⊗ el)
    = Aij Bkl δjk (ei ⊗ el)
    = Aik Bkl (ei ⊗ el)
    = [ A11  A12  A13 ] [ B11  B12  B13 ]
      [ A21  A22  A23 ] [ B21  B22  B23 ]
      [ A31  A32  A33 ] [ B31  B32  B33 ],    (6.39)

so (A B)ij = Aik Bkj.
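The component formulas Aij = ei · (A ej), ai = Aij bj and (A B)ij = Aik Bkj translate directly into array operations. Below is a small check, my own illustration assuming NumPy, with arbitrary component values.

```python
import numpy as np

e = np.eye(3)                      # rows are the orthonormal basis vectors e1, e2, e3
A = np.array([[1.0, 2.0, 0.0],
              [0.5, -1.0, 3.0],
              [2.0, 0.0, 1.0]])
B = np.random.rand(3, 3)
b = np.array([1.0, -2.0, 0.5])

# Eqn. 6.35: A_ij = e_i . (A e_j), recovered here for (i, j) = (2, 3)
print(e[1] @ (A @ e[2]), A[1, 2])

# Eqn. 6.38: a_i = A_ij b_j is the usual matrix-vector product
print(A @ b)

# Eqn. 6.39: tensor composition (A B)_ij = A_ik B_kj is the matrix product
print(np.allclose(A @ B, np.einsum('ik,kj->ij', A, B)))
```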

6.2.2 Transpose, Symmetric, Skew and Projection Tensors

We relied on the transpose above, but here we take an approach other than your familiar row–column interchange. For every tensor A ∈ L there exists a unique tensor A^T ∈ L called the transpose of A such that for
any a, b ∈ E we have

a · (A b) = (A^T a) · b.    (6.40)

It is not hard to show that for any scalars α, β ∈ ℝ, vectors a, b ∈ E and tensors A, B ∈ L that

(A^T)ij = Aji,
(α A + β B)^T = α A^T + β B^T,
(A^T)^T = A,
(A B)^T = B^T A^T,
(a ⊗ b)^T = b ⊗ a,
A (a ⊗ b) = (A a) ⊗ b,
(a ⊗ b) A = a ⊗ (A^T b).    (6.41)

For example, Eqn. 6.35 and Eqn. 6.40 with a = ei and b = ej combine to give Aij = ei · (A ej) = (A^T ei) · ej =
ej · (A^T ei) = (A^T)ji, i.e. your usual row–column interchange for the transpose holds true for our, perhaps
more abstract, definition. All of the above equalities can be obtained by resorting to similar component
manipulations. Alternatively the direct method may also be used. For example the definitions of tensor
multiplication and transposition, i.e. Eqn. 6.40, give ((A B)^T a) · b = a · ((A B) b) = a · (A (B b)) = (A^T a) ·
(B b) = (B^T (A^T a)) · b = ((B^T A^T) a) · b. Utilizing the arbitrariness of a and b gives us (A B)^T = B^T A^T.
The tensor A ∈ L is symmetric if A^T = A and skew if A^T = −A. And since we have A = 1/2 (A +
A^T) + 1/2 (A − A^T) we see that any tensor can be decomposed into symmetric and skew symmetric parts,
i.e.

A = Sym(A) + Skew(A),    (6.42)

where, obviously,

Sym(A) = 1/2 (A + A^T)
       = 1/2 [ 2 A11        A12 + A21    A13 + A31 ]
             [ A12 + A21    2 A22        A23 + A32 ]
             [ A13 + A31    A23 + A32    2 A33     ],

Skew(A) = 1/2 (A − A^T)
        = 1/2 [ 0              −(A21 − A12)    A13 − A31      ]
              [ A21 − A12      0               −(A32 − A23)   ]
              [ −(A13 − A31)   A32 − A23       0              ]    (6.43)

are the symmetric and skew parts.


In general a tensor has 9 independent components. And as seen above we are consistent with this fact
since the symmetric tensor has 6 independent components and the skew tensor has 3 independent components; three equal zero and three are the negative of the remaining three. And for this reason, for each skew
tensor W there exists a unique axial vector w = Axial(W) defined such that

W a = w × a    (6.44)

for all vectors a ∈ E. Indeed, a simple computation via Eqns. 6.19 and 6.38 shows

W a = [ 0      −W21    W13  ] [ a1 ]
      [ W21    0       −W32 ] [ a2 ]
      [ −W13   W32     0    ] [ a3 ]
    = [ −W21 a2 + W13 a3 ]
      [ W21 a1 − W32 a3  ]
      [ −W13 a1 + W32 a2 ]
    = [ W32 ]   [ a1 ]
      [ W13 ] × [ a2 ]
      [ W21 ]   [ a3 ]
    = Axial(W) × a,    (6.45)

where w = Axial(W) = W32 e1 + W13 e2 + W21 e3. Conversely, for each vector w there exists a skew tensor
W = Skew(w) (cf. footnote 6.1) that satisfies Eqn. 6.44. Indeed,

w × a = [ w2 a3 − w3 a2 ]
        [ w3 a1 − w1 a3 ]
        [ w1 a2 − w2 a1 ]
      = [ 0     −w3    w2  ] [ a1 ]
        [ w3    0      −w1 ] [ a2 ]
        [ −w2   w1     0   ] [ a3 ]
      = Skew(w) a,    (6.46)

where W = Skew(w) = w1 (e3 ⊗ e2 − e2 ⊗ e3) + w2 (e1 ⊗ e3 − e3 ⊗ e1) + w3 (e2 ⊗ e1 − e1 ⊗ e2).
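A small sketch of the Axial/Skew correspondence of Eqns. 6.44–6.46, my addition assuming NumPy: Skew(w) a reproduces w × a, and Axial recovers w from the skew tensor.

```python
import numpy as np

def skew(w):
    # Eqn. 6.46: W = Skew(w) such that W a = w x a
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def axial(W):
    # Eqn. 6.45: w = Axial(W) = W32 e1 + W13 e2 + W21 e3
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

w = np.array([1.0, -2.0, 3.0])
a = np.array([0.5, 4.0, -1.0])
print(skew(w) @ a, np.cross(w, a))   # the two agree
print(axial(skew(w)))                # recovers w
```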


Besides being used to define a basis for L, the tensor product often appears in perpendicular projections,
cf. Sect. 1.2. Consider the vector a and the unit vector e. The part of a that lies along e is given by
(a · e) e = (e ⊗ e) a and the remaining part, which lies in the plane that is perpendicular to e, is given by
a − (a · e) e = (I − e ⊗ e) a, cf. Fig. 6.4. It is seen that a defining property of a perpendicular projection
P is that it is symmetric and idempotent, i.e. P^T = P and P² = P P = P. The latter equality comes from
the observation that, e.g. upon defining the vector a′ = (e ⊗ e) a as the part of a that lies along e, then for all
subsequent projections we have a′ = (e ⊗ e) a′, i.e. a′ is the part of a′ that lies along e.

6.1 Hopefully the duplicate use of the Skew notation will not cause confusion as the arguments are either tensors, cf. Eqn. 6.43, or
vectors, cf. Eqn. 6.46.

Figure 6.4: Illustrations of perpendicular projections.
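The two perpendicular projections e ⊗ e and I − e ⊗ e can be checked numerically; the following sketch, my addition assuming NumPy, verifies symmetry and idempotence, P = P^T and P P = P.

```python
import numpy as np

e = np.array([1.0, 2.0, 2.0])
e = e / np.linalg.norm(e)            # unit vector

P_para = np.outer(e, e)              # part of a along e
P_perp = np.eye(3) - P_para          # part of a perpendicular to e

for P in (P_para, P_perp):
    print(np.allclose(P, P.T), np.allclose(P @ P, P))   # symmetric and idempotent

a = np.array([3.0, -1.0, 0.5])
print(np.allclose(P_para @ a + P_perp @ a, a))           # the two parts recover a
```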

6.2.3 Tensor Invariants, Scalar Product and Norm


The first, second and third principal invariants of a tensor A ∈ L are the scalar valued functions ι1 : L → ℝ,
ι2 : L → ℝ and ι3 : L → ℝ defined such that

ι1(A) [a, b, c] = [A a, b, c] + [a, A b, c] + [a, b, A c],
ι2(A) [a, b, c] = [a, A b, A c] + [A a, b, A c] + [A a, A b, c],
ι3(A) [a, b, c] = [A a, A b, A c]    (6.47)

for every vector a, b, c ∈ E. More typically, the first and third invariants are referred to as the trace, denoted trA, and determinant, denoted detA, respectively. Upon recalling that [a, b, c] is the volume of a
parallelepiped defined by the vectors a, b and c we see that detA [a, b, c] is the volume of the transformed
parallelepiped defined by the vectors A a, A b and A c, cf. Fig. 6.5. We encounter the determinant in the
change of variable theorem where we relate differential volume elements; and we encounter the trace when
we differentiate the determinant.

Figure 6.5: Illustration of detA: Solid and dashed lines show the parallelepiped with volumes [a, b, c] and
[A a, A b, A c] = detA [a, b, c].

Noting that

A e1 = (Aij ei ⊗ ej) e1
     = Aij (e1 · ej) ei
     = Aij δ1j ei
     = Ai1 ei
     = [ A11 ]
       [ A21 ]
       [ A31 ],    (6.48)

so that in general A ej = Aij ei, we have, via Eqns. 6.16 and 6.18,

detA [e1, e2, e3] = [A e1, A e2, A e3]
detA ε123 = [Ai1 ei, Aj2 ej, Ak3 ek]
detA = Ai1 Aj2 Ak3 [ei, ej, ek]
     = Ai1 Aj2 Ak3 εijk
     = (A11 A22 A33 + A21 A32 A13 + A31 A12 A23) − (A11 A32 A23 + A21 A12 A33 + A31 A22 A13)
     = | A11  A12  A13 |
       | A21  A22  A23 |
       | A31  A32  A33 |,    (6.49)

i.e. your usual understanding of the determinant holds true. Likewise for the trace we have

trA [e1, e2, e3] = [A e1, e2, e3] + [e1, A e2, e3] + [e1, e2, A e3]
trA ε123 = [Ai1 ei, e2, e3] + [e1, Ai2 ei, e3] + [e1, e2, Ai3 ei]
trA = Ai1 [ei, e2, e3] + Ai2 [e1, ei, e3] + Ai3 [e1, e2, ei]
    = Ai1 εi23 + Ai2 ε1i3 + Ai3 ε12i
    = A11 + A22 + A33,    (6.50)

i.e. your usual understanding of the trace as the sum of the diagonal elements also holds true. And finally a
lengthy derivation gives

ι2(A) [e1, e2, e3] = [e1, A e2, A e3] + [A e1, e2, A e3] + [A e1, A e2, e3]
ι2(A) ε123 = [e1, Ai2 ei, Aj3 ej] + [Ai1 ei, e2, Aj3 ej] + [Ai1 ei, Aj2 ej, e3]
ι2(A) = Ai2 Aj3 [e1, ei, ej] + Ai1 Aj3 [ei, e2, ej] + Ai1 Aj2 [ei, ej, e3]
      = Ai2 Aj3 ε1ij + Ai1 Aj3 εi2j + Ai1 Aj2 εij3
      = (A22 A33 − A23 A32) + (A11 A33 − A13 A31) + (A11 A22 − A12 A21)
      = 1/2 [A11² + A22² + A33² + 2 A22 A33 + 2 A11 A33 + 2 A11 A22]
        − 1/2 [A11² + A22² + A33² + 2 A23 A32 + 2 A13 A31 + 2 A12 A21]
      = 1/2 (A11 + A22 + A33)² − 1/2 Aij Aji
      = 1/2 [(trA)² − trA²].    (6.51)

It may also be verified that

tr(α A + β B) = α trA + β trB,
trA^T = trA,
tr(a ⊗ b) = a · b,
tr(A B) = tr(B A),
trI = 3,
det(α A) = α³ detA,
detA^T = detA,
det(A B) = detA detB,
detI = 1.    (6.52)
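As a numerical aside (my addition, not in the original text, assuming NumPy), the three principal invariants can be checked against the trace, determinant and eigenvalues; ι2 is computed from Eqn. 6.51 and the eigenvalue expressions anticipate Eqn. 6.80 below.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, -1.0],
              [1.0, 0.0, 1.5]])

i1 = np.trace(A)                                   # Eqn. 6.50
i2 = 0.5 * (np.trace(A)**2 - np.trace(A @ A))      # Eqn. 6.51
i3 = np.linalg.det(A)                              # Eqn. 6.49

lam = np.linalg.eigvals(A)                         # cf. Eqn. 6.80
print(i1, lam.sum().real)
print(i2, (lam[0]*lam[1] + lam[0]*lam[2] + lam[1]*lam[2]).real)
print(i3, np.prod(lam).real)
```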

Similar to the manner in which any tensor can be expressed as the sum of symmetric and skew components we define the deviatoric (traceless) and spherical parts of the tensor A via

Sph(A) = 1/3 (trA) I
       = 1/3 (A11 + A22 + A33) [ 1  0  0 ]
                               [ 0  1  0 ]
                               [ 0  0  1 ],

Dev(A) = A − 1/3 (trA) I
       = [ 2/3 A11 − 1/3 A22 − 1/3 A33    A12                              A13                            ]
         [ A21                            −1/3 A11 + 2/3 A22 − 1/3 A33    A23                            ]
         [ A31                            A32                             −1/3 A11 − 1/3 A22 + 2/3 A33   ]    (6.53)

so that

A = Sph(A) + Dev(A).    (6.54)

It is clearly seen that trDev(A) = 0 and hence the traceless terminology, cf. Exer. 6.17. Moreover, by
recognizing that Dev(A)33 = −(Dev(A)11 + Dev(A)22) we see that Sph(A) and Dev(A) are defined by 1 and
8 independent components, respectively.
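A short numerical sketch, my addition assuming NumPy, of the decomposition A = Sph(A) + Dev(A) of Eqns. 6.53 and 6.54, checking that the deviatoric part is traceless.

```python
import numpy as np

A = np.array([[4.0, 1.0, -2.0],
              [0.5, 2.0, 3.0],
              [1.0, 0.0, -1.0]])

sph = (np.trace(A) / 3.0) * np.eye(3)    # Eqn. 6.53: Sph(A) = (1/3) trA I
dev = A - sph                            # Eqn. 6.53: Dev(A) = A - (1/3) trA I

print(np.isclose(np.trace(dev), 0.0))    # traceless
print(np.allclose(sph + dev, A))         # Eqn. 6.54
```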
If we can define a suitable scalar product on L, then it will be an inner product space. Our dream is
realized by defining the scalar product

A · B = tr(A^T B),    (6.55)

which, upon noting that A^T B = Aki Bkj ei ⊗ ej, cf. Eqns. 6.39 and 6.41, and that tr(ei ⊗ ej) = ei · ej = δij, cf.
Eqns. 6.7 and 6.52, is computed as

A · B = tr[Aki Bkj ei ⊗ ej] = Aki Bkj tr(ei ⊗ ej) = Aki Bkj ei · ej = Aki Bkj δij = Aki Bki,    (6.56)

which is similar to the vector scalar product, cf. Eqn. 6.20. Using the definition of the trace, we can show,
analogously to properties E1, E2 and E3, that the following relations hold

A · B = B · A,
(α A + β B) · C = α (A · C) + β (B · C),
A · A ≥ 0 for all A ∈ L and A · A = 0 if and only if A = O    (6.57)

for all scalars α, β ∈ ℝ and tensors A, B, C ∈ L. For example, A · B = tr(A^T B) = tr((A^T B)^T) =
tr(B^T (A^T)^T) = tr(B^T A) = B · A, which follows from Eqns. 6.41, 6.52 and 6.55.
We define the norm |A| of the tensor A as

|A| = (A · A)^{1/2} = (Aij Aij)^{1/2},    (6.58)

which is analogous to the vector norm, cf. Eqn. 6.20. And we say two tensors A and B are orthogonal if
A · B = 0. Again, these definitions on L are analogous to those on E.
Using Eqns. 6.52 and 6.55 it is also readily verified that

trA = A · I,
A · (B C) = (B^T A) · C = (A C^T) · B,
a · A b = A · (a ⊗ b),
(a ⊗ b) · (c ⊗ d) = (a · c)(b · d),
trA = trSym(A),
S · A = S · Sym(A)    if S ∈ L is symmetric,
W · A = W · Skew(A)   if W ∈ L is skew    (6.59)

for arbitrary tensors A, B and C and arbitrary vectors a, b, c, and d. For example, trA = trA^T = tr(A^T I) =
A · I, which follows from Eqns. 6.28, 6.52 and 6.55. The third line above allows us to express the tensor
components of Eqn. 6.35, i.e. Aij = ei · A ej, as

Aij = A · (ei ⊗ ej),    (6.60)

which is analogous to Eqn. 6.11, i.e. ai = ei · a.

6.2.4 Inverse, Orthogonal, Rotation, Involution and Adjugate Tensors

If detA = 0 we say that A is not invertible, i.e. it is singular; and this is in agreement with your matrix
experience. Additional insight into this claim is gained by noting that

A a = [ A11  A12  A13 ] [ a1 ]
      [ A21  A22  A23 ] [ a2 ]
      [ A31  A32  A33 ] [ a3 ]
    = a1 [ A11 ]  +  a2 [ A12 ]  +  a3 [ A13 ]
         [ A21 ]        [ A22 ]        [ A23 ]
         [ A31 ]        [ A32 ]        [ A33 ]    (6.61)

and hence A a is a linear combination of the columns of the matrix [A]. Now suppose A a = 0 for some
a ≠ 0. This implies the columns of the matrix [A] are linearly dependent and thus by Eqns. 6.6 and 6.21 the
triple product of these three column vectors is zero, i.e. detA = 0. If this is the case, then if A b = c
we also have A (b + a) = A b + A a = c + 0 = c, i.e. A is not one-to-one as both A b and A (b + a) equal c
and hence it has no inverse. We can also show that the condition detA = 0 implies an a ≠ 0 exists such that
A a = 0, cf. Exer. 6.23. Clearly the condition detA = 0 implies A has no inverse.
If detA ≠ 0 then A is invertible and we say A⁻¹ ∈ L is the unique inverse of A ∈ L and it satisfies

A A⁻¹ = A⁻¹ A = I.    (6.62)

Note that by Eqn. 6.52 we see that detA detA⁻¹ = det(A A⁻¹) = detI = 1 and hence

detA⁻¹ = (detA)⁻¹.    (6.63)

It can also be shown that the following identities hold for all invertible A, B ∈ L

(A B)⁻¹ = B⁻¹ A⁻¹,
(A⁻¹)^T = (A^T)⁻¹.    (6.64)

And for this reason we sometimes write A⁻ᵀ = (A⁻¹)^T = (A^T)⁻¹.


A tensor Q ∈ L is orthogonal if it preserves inner products, i.e. if

(Q a) · (Q b) = a · b    (6.65)

for all vectors a, b ∈ E. As seen here and from Eqn. 6.4 the angle between the transformed vectors, e.g.
Q a, is unchanged, as is the length of the transformed vectors since (Q a) · (Q a) = a · a = |a|².
Using the definition of the transpose, cf. Eqn. 6.40, and matrix multiplication we see that (Q a) · (Q b) =
a · Q^T (Q b) = a · (Q^T Q) b = a · b and hence Q^T Q = I. Moreover, applications of Eqn. 6.52 give
1 = detI = det(Q^T Q) = detQ^T detQ = detQ detQ = (detQ)² and hence detQ = ±1. This detQ = ±1
result implies Q is invertible so upon manipulating Q^T Q Q⁻¹ = I Q⁻¹ we find

Q^T = Q⁻¹,    (6.66)

i.e. our definition is consistent with your familiar notion that the inverse of an orthogonal matrix equals its
transpose.

Figure 6.6: Inversion (left) and reflection (right) illustrations.

We encounter orthogonal tensors when we discuss rigid deformation and material symmetry. Notable
orthogonal tensors are the central inversion Q = −I and the reflection about the plane with normal vector e

Re = I − 2 e ⊗ e.    (6.67)

Illustrations of the actions of these tensors on an arbitrary vector a are seen in Fig. 6.6. These are examples
of improper orthogonal tensors as their determinants equal −1. The central inversion and the reflections are
also examples of involutions since they equal their own inverses, e.g. Re Re = I. Physically this is not
surprising since the twice inverted or reflected object returns to its original state, e.g. Re Re a = a, cf. Fig.
6.6.
The set of rotations contains all proper orthogonal tensors, i.e. orthogonal tensors with determinants
equal to +1. If {p, q, r} is an orthonormal (right-handed) basis for E then it is readily verified that the tensor

Rp(θ) = p ⊗ p + (q ⊗ q + r ⊗ r) cos θ − (q ⊗ r − r ⊗ q) sin θ
      = [ 1    0        0      ]
        [ 0    cos θ    −sin θ ]
        [ 0    sin θ    cos θ  ]    (6.68)

is a rotation, where the components of [R] are with respect to the {p, q, r} basis, cf. Fig. 6.7 and Exer. 6.40. We
refer to p as the axis of rotation.
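To illustrate Eqn. 6.68 numerically, the sketch below (my addition, assuming NumPy) builds Rp(θ) from an orthonormal triad {p, q, r} and checks that it is proper orthogonal and leaves the axis p fixed.

```python
import numpy as np

theta = 0.7
p = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
q = np.array([-1.0, 1.0, 0.0]) / np.sqrt(2.0)
r = np.cross(p, q)                                   # right-handed triad {p, q, r}

# Eqn. 6.68: Rp(theta) = p⊗p + (q⊗q + r⊗r) cos(theta) − (q⊗r − r⊗q) sin(theta)
R = (np.outer(p, p)
     + (np.outer(q, q) + np.outer(r, r)) * np.cos(theta)
     - (np.outer(q, r) - np.outer(r, q)) * np.sin(theta))

print(np.allclose(R.T @ R, np.eye(3)))    # orthogonal, cf. Eqn. 6.66
print(np.isclose(np.linalg.det(R), 1.0))  # proper, i.e. a rotation
print(np.allclose(R @ p, p))              # p is the axis of rotation
```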
For every A ∈ L there exists the unique tensor A* ∈ L called the adjugate that satisfies

A* (a × b) = (A a) × (A b)    (6.69)

for all a, b ∈ E. We use the adjugate to relate areas and normal vectors in transformed domains, cf. Fig. 6.8.
Indeed, the product of the area and normal vector of the parallelogram defined by the vectors a and b is a × b
whereas the analogous product defined by the transformed vectors A a and A b is (A a) × (A b) = A* (a × b).
If A is invertible, then upon taking the scalar product of Eqn. 6.69 with c ∈ E and using the trick that

Figure 6.7: Rotation illustration.


I = A A1 we obtain
c A (a b) = c {(A a) (A b)}

= (A A1 c) {(A a) (A b)}

= [A (A1 c), A a, A b]
= detA [(A1 c), a, b]

= detA (A1 c) (a b)

= c detA (A1 )T (a b).

(6.70)

An application of Eqn. 6.64 and the arbitrariness of a b and c imply


A = detA AT .

(6.71)
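The adjugate relations of Eqns. 6.69 and 6.71 can also be checked numerically; the sketch below (my addition, assuming NumPy, with arbitrary data) compares (A a) × (A b) with detA A⁻ᵀ (a × b).

```python
import numpy as np

A = np.array([[2.0, 0.5, 0.0],
              [1.0, 3.0, -1.0],
              [0.0, 1.0, 1.5]])
a = np.array([1.0, -2.0, 0.5])
b = np.array([0.3, 1.0, 2.0])

A_star = np.linalg.det(A) * np.linalg.inv(A).T   # Eqn. 6.71: A* = detA A^{-T}
print(np.allclose(np.cross(A @ a, A @ b), A_star @ np.cross(a, b)))   # Eqn. 6.69
```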

6.2.5 Eigenpairs, Spectral Representation and Polar Decomposition


An eigenpair of a tensor A ∈ L is the nonzero scalar–vector pair (λ, v) that satisfies

A v = λ v.    (6.72)

We call λ the eigenvalue and v the eigenvector. Note the exceptional case here, i.e. the product A v is
parallel to v, i.e. A v ∥ v.
Rearranging Eqn. 6.72 we see that the eigenpair satisfies

(A − λ I) v = 0    (6.73)

Figure 6.8: Illustration of the adjugate. Solid and dashed lines show parallelograms with areas |a × b| and
|(A a) × (A b)| = |A* (a × b)|.


and thus for v ≠ 0 we require the columns of A − λ I to be linearly dependent, i.e. A − λ I is singular, cf.
Eqn. 6.61 and the following discussion. It then follows that the eigenvalues are the roots of the characteristic
equation

det(A − λ I) = 0.    (6.74)

Substituting A − λ I for A in Eqn. 6.47 we see that

det(A − λ I) [a, b, c] = [(A − λ I) a, (A − λ I) b, (A − λ I) c]
 = −λ³ [a, b, c] + λ² ([A a, b, c] + [a, A b, c] + [a, b, A c])
   − λ ([a, A b, A c] + [A a, b, A c] + [A a, A b, c]) + [A a, A b, A c]
 = ( −λ³ + ι1(A) λ² − ι2(A) λ + ι3(A) ) [a, b, c],    (6.75)

i.e. the characteristic equation is also given by

λ³ − ι1(A) λ² + ι2(A) λ − ι3(A) = 0    (6.76)

thus offering additional insight into the invariants.


As seen in Eqn. 6.76, the characteristic equation is an order 3 polynomial and as seen in Eqns. 6.49,
6.50 and 6.51 the invariants are real and hence the three roots, i.e. eigenvalues, of a linear transformation
A ∈ L consist of three real numbers or one real number and a complex conjugate pair. The set of these three
eigenvalues comprises the spectrum of A.
If S is symmetric then the eigenpairs are real, cf. [7]. Moreover, the eigenvectors associated with distinct
eigenvalues are orthogonal. To see this let (λi, vi) and (λj, vj) be two eigenpairs of the symmetric tensor S
such that λi ≠ λj. Then vj · S vi = vj · λi vi and vi · S vj = vi · λj vj. Subtracting these two equalities gives
vj · S vi − vi · S vj = (λi − λj) vj · vi. Now using the definition of the transpose and the symmetry of S we
obtain vj · S vi − vi · S vj = vj · S vi − S^T vi · vj = vj · S vi − S vi · vj = vj · S vi − vj · S vi = 0 and hence
0 = (λi − λj) vj · vi. Finally using the inequality (λi − λj) ≠ 0 we obtain vj · vi = 0.
Now assume that the eigenvalues of the symmetric S are distinct and that the eigenvectors are scaled to
be unit vectors (as both vi and −vi satisfy Eqn. 6.72, cf. Exer. 6.27). In this way, we see that the orthogonal
unit eigenvectors v1, v2, and v3 form an orthonormal basis for E. So, upon using Eqns. 6.32, i.e. I = vi ⊗ vi,
and 6.41, i.e. A (a ⊗ b) = (A a) ⊗ b, we have the spectral decomposition of S, i.e.

S = S I
  = S (vi ⊗ vi)
  = (S vi) ⊗ vi
  = Σ_{i=1}^{3} (λi vi) ⊗ vi
  = Σ_{i=1}^{3} λi (vi ⊗ vi)
  = [ λ1  0   0  ]
    [ 0   λ2  0  ]
    [ 0   0   λ3 ]    (6.77)

where [S] = diag[λ1, λ2, λ3] is expressed relative to the {v1, v2, v3} basis.
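A numerical sketch (my addition, assuming NumPy) of the spectral decomposition Eqn. 6.77 using the symmetric eigensolver; the eigenvectors returned by eigh serve as the orthonormal vi.

```python
import numpy as np

S = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, -1.0],
              [0.0, -1.0, 2.0]])     # symmetric

lam, V = np.linalg.eigh(S)           # eigenvalues lam[i] and unit eigenvectors V[:, i]

# Eqn. 6.77: S = sum_i lam_i v_i ⊗ v_i
S_rebuilt = sum(lam[i] * np.outer(V[:, i], V[:, i]) for i in range(3))
print(np.allclose(S, S_rebuilt))
print(np.allclose(V.T @ V, np.eye(3)))   # the v_i form an orthonormal basis
```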


Similar results hold even if the eigenvalues of the symmetric S are not distinct. Consider the case for
which λ = λ1 = λ2 ≠ λ3. Then using the above arguments we have v1 · v3 = v2 · v3 = 0, i.e. any perpendicular
vectors v1 and v2 that lie in the plane with normal vector v3 can be used in Eqn. 6.77, viz.

S = Σ_{i=1}^{3} λi (vi ⊗ vi)
  = λ (v1 ⊗ v1 + v2 ⊗ v2) + λ3 v3 ⊗ v3
  = λ (I − v3 ⊗ v3) + λ3 v3 ⊗ v3,    (6.78)

where we use Eqn. 6.32 and recognize both I − v3 ⊗ v3 and v3 ⊗ v3 as perpendicular projections, cf. Fig. 6.4.

Figure 6.9: Spectral representation illustration for the λ1 = λ2 ≠ λ3 case.
Lastly, if λ = λ1 = λ2 = λ3 then {v1, v2, v3} can be any orthonormal basis and we have

S = Σ_{i=1}^{3} λi (vi ⊗ vi)
  = λ (v1 ⊗ v1 + v2 ⊗ v2 + v3 ⊗ v3)
  = λ I.    (6.79)

We say a tensor A is positive definite if a · A a > 0 for every a ≠ 0 ∈ E and positive semi-definite if
a · A a ≥ 0 for every a ∈ E. Necessary and sufficient conditions that a symmetric tensor S is positive definite
are that its eigenvalues are positive, i.e. λi > 0 for i = 1, 2, 3 and similarly, S is positive semi-definite if
λi ≥ 0, cf. Sect. 1.6.
Using the spectral representation we may express the invariants of Eqns. 6.47, 6.49, 6.50 and 6.51 as

ι1(A) = trA = λ1 + λ2 + λ3,
ι2(A) = λ2 λ3 + λ1 λ3 + λ1 λ2,
ι3(A) = detA = λ1 λ2 λ3.    (6.80)

If S is positive definite then its eigenvalues are positive and we can take their square roots giving

√S = Σ_{i=1}^{3} √λi (vi ⊗ vi)
   = [ √λ1   0     0   ]
     [ 0     √λ2   0   ]
     [ 0     0     √λ3 ],    (6.81)

which is obviously positive definite symmetric and satisfies (√S)² = √S √S = S. Additionally we can take
the inverses of the eigenvalues to obtain

S⁻¹ = Σ_{i=1}^{3} (1/λi) (vi ⊗ vi)
    = [ 1/λ1   0      0    ]
      [ 0      1/λ2   0    ]
      [ 0      0      1/λ3 ],    (6.82)

which is again positive definite symmetric. It is readily verified that S⁻¹ S = S S⁻¹ = I.


Now we show that any invertible tensor A can be expressed via the (right) polar decomposition

A = R U,    (6.83)

where the tensors R and U are orthogonal and positive definite symmetric, respectively. Indeed, consider
the tensor A^T A, which by Eqn. 6.41 is symmetric. Moreover, by property E3 we see that a · (A^T A) a =
A a · A a ≥ 0 and a · (A^T A) a = 0 only if A a = 0; but A is invertible thusly A a = 0 implies a = 0, cf. the
discussion following Eqn. 6.61. Consequently a · A^T A a > 0 for all a ≠ 0, i.e. A^T A is positive definite
symmetric and hence there exists a positive definite (invertible square root) symmetric tensor U = √(A^T A),
cf. Eqn. 6.81. Next define the tensor R = A U⁻¹ so that R U = (A U⁻¹) U = A, which is Eqn. 6.83. We
must now show R is orthogonal. To these ends note that R^T R = (A U⁻¹)^T (A U⁻¹) = U⁻ᵀ A^T A U⁻¹ =
U⁻¹ U² U⁻¹ = U⁻¹ U U U⁻¹ = I, which implies R is orthogonal.
Using the above argument we can show that V = R U R^T is positive definite symmetric and hence we
also have the left polar decomposition

A = V R.    (6.84)

The fact that these decompositions are unique is verified in [3].
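Following the constructive argument above, here is a minimal sketch (my addition, assuming NumPy) of the right polar decomposition: U is the symmetric square root of A^T A built from its spectral decomposition, and R = A U⁻¹ comes out orthogonal.

```python
import numpy as np

A = np.array([[1.0, 0.5, 0.0],
              [0.2, 2.0, 0.3],
              [0.0, -0.4, 1.5]])          # invertible

C = A.T @ A                               # symmetric positive definite
lam, V = np.linalg.eigh(C)
U = sum(np.sqrt(lam[i]) * np.outer(V[:, i], V[:, i]) for i in range(3))   # U = sqrt(A^T A), cf. Eqn. 6.81
R = A @ np.linalg.inv(U)                  # R = A U^{-1}

print(np.allclose(R.T @ R, np.eye(3)))    # R is orthogonal
print(np.allclose(R @ U, A))              # Eqn. 6.83: A = R U
```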

6.2.6 Fourth-Order Tensors


The aforementioned tensors, i.e. linear functions that eat vectors and spit out vectors, are technically called
second-order tensors (or 2-tensors). For our purposes we treat fourth-order tensors (or 4-tensors) as linear
functions that eat 2-tensors and spit out 2-tensors, e.g. ℂ : L → L such that

ℂ[α A + β B] = α ℂ[A] + β ℂ[B]    (6.85)

for all α, β ∈ ℝ and A, B ∈ L. As seen above, the [·] notation is used to indicate the argument of the
function, e.g. rather than writing B = ℂ(A) we write B = ℂ[A] and this is in contrast to the 2-tensors where
we write b = C a rather than b = C(a). The brackets are required here because in general ℂ[A] B ≠ ℂ[A B];
for 2-tensors, the analogous operation C a b does not appear as the operation a b is not defined. Herein we
denote 4-tensors with upper case blackboard bold Latin letters and the set of all 4-tensors as L4. Why are
we studying 4-tensors? Well, in our subsequent studies we encounter the elasticity 4-tensor, which you may
have seen by the name of the generalized Hooke's law, which maps the strain 2-tensor into the stress 2-tensor.
Not surprisingly, 4-tensors share properties analogous to those of 2-tensors. Indeed, the 4-tensors 𝔸, 𝔹 ∈
L4 are said to be equal if 𝔸[C] = 𝔹[C] for all 2-tensors C ∈ L. The analogy follows by defining 4-tensor
addition and scalar multiplication as

(𝔸 + 𝔹)[C] = 𝔸[C] + 𝔹[C],
(α 𝔸)[C] = α (𝔸[C])    (6.86)

for every scalar α ∈ ℝ, the zero 4-tensor 𝕆 ∈ L4 such that

𝕆[C] = O    (6.87)

and the identity tensor 𝕀 ∈ L4 such that

𝕀[C] = C.    (6.88)

In this way, we see that L4, the set of all 4-tensors (that map 2-tensors to 2-tensors), is a vector space, i.e.
for every α, β ∈ ℝ and 𝔸, 𝔹, ℂ ∈ L4

𝔸 + 𝔹 = 𝔹 + 𝔸,
(𝔸 + 𝔹) + ℂ = 𝔸 + (𝔹 + ℂ),
𝔸 + 𝕆 = 𝔸,
(−𝔸) + 𝔸 = 𝕆,
α (β 𝔸) = (α β) 𝔸,
(α + β) 𝔸 = α 𝔸 + β 𝔸,
α (𝔸 + 𝔹) = α 𝔸 + α 𝔹,
1 𝔸 = 𝔸.    (6.89)

As with 2-tensor composition we denote 4-tensor composition, i.e. multiplication, such that 𝔸 𝔹[C] =
𝔸[𝔹[C]] and note that in general 𝔸 𝔹 ≠ 𝔹 𝔸. Using the above definitions yields the identities

α (𝔸 𝔹) = (α 𝔸) 𝔹 = 𝔸 (α 𝔹),
𝔸 (𝔹 + ℂ) = 𝔸 𝔹 + 𝔸 ℂ,
(𝔸 + 𝔹) ℂ = 𝔸 ℂ + 𝔹 ℂ,
𝔸 (𝔹 ℂ) = (𝔸 𝔹) ℂ,
𝔸 𝕆 = 𝕆 𝔸 = 𝕆,
𝕀 𝔸 = 𝔸 𝕀 = 𝔸    (6.90)

for every α ∈ ℝ and 𝔸, 𝔹, ℂ ∈ L4.


The dyadic product of the 2-tensors A, B ∈ L is the 4-tensor A ⊗ B such that

(A ⊗ B)[C] = (C · B) A    (6.91)

for every C ∈ L, cf. Eqn. 6.29.


In the same way that the basis {e1, e2, e3} on E induces the basis {e1 ⊗ e1, e2 ⊗ e1, …, e3 ⊗ e3} on L, we
use the basis {e1 ⊗ e1, e2 ⊗ e1, …, e3 ⊗ e3} on L to induce a basis on L4. To see this we proceed as in Eqn.
6.34 to obtain, for A = ℂ[B],

A = Aij (ei ⊗ ej)
  = (A · (ei ⊗ ej)) (ei ⊗ ej)
  = ((ei ⊗ ej) · A) (ei ⊗ ej)
  = {(ei ⊗ ej) · ℂ[B]} (ei ⊗ ej)
  = {(ei ⊗ ej) · ℂ[Bkl (ek ⊗ el)]} (ei ⊗ ej)
  = {(ei ⊗ ej) · ℂ[(B · (ek ⊗ el)) (ek ⊗ el)]} (ei ⊗ ej)
  = {(ei ⊗ ej) · ℂ[ek ⊗ el]} (B · (ek ⊗ el)) (ei ⊗ ej)
  = {{(ei ⊗ ej) · ℂ[ek ⊗ el]} (ei ⊗ ej) ⊗ (ek ⊗ el)}[B],    (6.92)

where we made use of Eqns. 6.60, 6.36, 6.59 and 6.91. Thus any 4-tensor can be expressed as

ℂ = Cijkl (ei ⊗ ej) ⊗ (ek ⊗ el),    (6.93)

where

Cijkl = (ei ⊗ ej) · ℂ[ek ⊗ el]    (6.94)

are the components relative to the basis {(e1 ⊗ e1) ⊗ (e1 ⊗ e1), (e2 ⊗ e1) ⊗ (e1 ⊗ e1), …, (e3 ⊗ e3) ⊗ (e3 ⊗ e3)}
and thus we see that L4 is an 81 = 3⁴ dimensional vector space.
Now, as per Eqns. 6.37 and 6.39, it can be verified that

(ℂ[A])ij = Cijkl Akl,
(ℂ 𝔹)ijkl = Cijmn Bmnkl    (6.95)

for any 2-tensor A ∈ L and 4-tensors 𝔹, ℂ ∈ L4.


This 4-tensor dyadic product shares properties of its second-order counterpart, cf. Eqn. 6.32. Namely
for any α, β ∈ ℝ and A, B, C, D ∈ L we have

(A ⊗ B)ijkl = Aij Bkl,
(α A + β B) ⊗ C = α (A ⊗ C) + β (B ⊗ C),
A ⊗ (α B + β C) = α (A ⊗ B) + β (A ⊗ C),
(A ⊗ B) (C ⊗ D) = (B · C) (A ⊗ D),
𝕀 = (ei ⊗ ej) ⊗ (ei ⊗ ej).    (6.96)

Referring to Eqn. 6.40, for every 4-tensor ℂ there exists a unique 4-tensor ℂ^T called the transpose of ℂ
that satisfies

A · ℂ[B] = ℂ^T[A] · B    (6.97)

for all 2-tensors A, B ∈ L. Upon letting A = ei ⊗ ej and B = ek ⊗ el we find via Eqns. 6.57 and 6.94

(ei ⊗ ej) · ℂ[ek ⊗ el] = ℂ^T[ei ⊗ ej] · (ek ⊗ el)
Cijkl = (ek ⊗ el) · ℂ^T[ei ⊗ ej]
      = (C^T)klij,    (6.98)

cf. Eqn. 6.41. Summarizing the above and using similar such arguments it can be verified that

(C^T)ijkl = Cklij,
(α 𝔸 + β 𝔹)^T = α 𝔸^T + β 𝔹^T,
(𝔸^T)^T = 𝔸,
(𝔸 𝔹)^T = 𝔹^T 𝔸^T,
(A ⊗ B)^T = B ⊗ A,
𝔸 (A ⊗ B) = 𝔸[A] ⊗ B,
(A ⊗ B) 𝔸 = A ⊗ 𝔸^T[B]    (6.99)

for all 2-tensors A, B, C, D ∈ L and 4-tensors 𝔸, 𝔹, ℂ ∈ L4, cf. Eqn. 6.41.


For convenience we define the transposition 4-tensor 𝕋 such that

𝕋[A] = A^T    (6.100)

for every A and note that

𝕋 = (ei ⊗ ej) ⊗ (ej ⊗ ei).    (6.101)

Indeed,

𝕋[A] = (ei ⊗ ej) ⊗ (ej ⊗ ei)[A]
     = [(ej ⊗ ei) · A] (ei ⊗ ej)
     = Aji (ei ⊗ ej)
     = A^T,    (6.102)

which follows from Eqns. 6.36, 6.41, 6.60, 6.91 and 6.101.
We say ℂ possesses major symmetry if ℂ^T = ℂ, i.e. Cijkl = Cklij. The major symmetry terminology
arises because a 4-tensor can also exhibit minor symmetries, i.e. ℂ possesses the first minor symmetry if
(ℂ[A])^T = ℂ[A] and the second minor symmetry if ℂ[A^T] = ℂ[A] for all A ∈ L. These three symmetries
respectively imply

ℂ = ℂ^T,    i.e. Cijkl = Cklij,
ℂ = 𝕋 ℂ,    i.e. Cijkl = Cjikl,
ℂ = ℂ 𝕋,    i.e. Cijkl = Cijlk.    (6.103)

The first line follows from Eqn. 6.98 whereas the second and third lines follow from Eqns. 6.95 and 6.100
and the fact that (A^T)ij = Aji, cf. Eqn. 6.41.
Referring to Fig. 6.4 we view a perpendicular projection as a 2-tensor that removes components of
a vector. We have a similar definition for 4-tensor projections and in particular we define four 4-tensor
projections such that they remove the skew, symmetric, deviatoric and spherical parts of any tensor A ∈ L,
cf. Eqns. 6.42, 6.53 and 6.54, i.e.

ℙSym[A] = Sym(A),
ℙSkew[A] = Skew(A),
ℙSph[A] = Sph(A),
ℙDev[A] = Dev(A)    (6.104)

for all 2-tensors A ∈ L and note that

ℙSym = 1/2 (𝕀 + 𝕋),
ℙSkew = 1/2 (𝕀 − 𝕋),
ℙSph = 1/3 I ⊗ I = (1/|I|²) I ⊗ I,
ℙDev = 𝕀 − 1/3 I ⊗ I = 𝕀 − (1/|I|²) I ⊗ I;    (6.105)

note the use of the norm |I|, which makes the scaled identity (1/|I|) I behave like a unit vector e.
We say ℂ ∈ L4 is invertible if there exists a unique ℂ⁻¹ ∈ L4, called the inverse of ℂ, such that

ℂ⁻¹ ℂ = ℂ ℂ⁻¹ = 𝕀.    (6.106)

The conjugation product of the 2-tensors A, B ∈ L is the 4-tensor A ⊠ B such that

(A ⊠ B)[C] = A C B^T    (6.107)

for all 2-tensors C ∈ L. It can be verified for all 2-tensors A, B, C, D ∈ L that

(A ⊠ B)ijkl = Aik Bjl,
(α A + β B) ⊠ C = α (A ⊠ C) + β (B ⊠ C),
A ⊠ (α B + β C) = α (A ⊠ B) + β (A ⊠ C),
(A ⊠ B) (C ⊠ D) = (A C) ⊠ (B D),
𝕀 = I ⊠ I,
(A ⊠ B)^T = A^T ⊠ B^T,
(A ⊠ B) 𝕋 = 𝕋 (B ⊠ A),
(A ⊠ B)⁻¹ = A⁻¹ ⊠ B⁻¹,    (6.108)

where the last equality assumes that A and B are invertible.
Referring to Eqn. 6.65, a 4-tensor ℚ ∈ L4 is orthogonal if it preserves inner products, i.e. if

ℚ[A] · ℚ[B] = A · B    (6.109)

for all 2-tensors A, B ∈ L. Such 4-tensors arise when describing the constitutive response of anisotropic
materials, e.g. fiber reinforced composites. Analogous to Eqn. 6.66, orthogonal 4-tensors satisfy

ℚ^T = ℚ⁻¹.    (6.110)

Referring to Eqns. 6.72 and 6.77 you suspect, I am sure, that if the fourth-order tensor ℂ possesses major
symmetry then it has 9 = 3² eigenpairs of the form (λi, Ei) where the Ei ∈ L are the eigentensors defined
such that

ℂ[Ei] = λi Ei    (6.111)

and that the spectral representation exists such that

ℂ = Σ_{i=1}^{9} λi Ei ⊗ Ei,    (6.112)

where the eigentensors are normalized such that Ei · Ej = δij.


6.2.7 Matrix Representation


Before proceeding we develop a matrix–vector abstraction that can be used to perform 2- and 4-tensor
computations in much the same way that 2-tensor and vector computations are performed in Eqn. 6.38. This
is particularly useful when developing finite element programs.
For a k × m matrix A and an n × l matrix B, the Kronecker product is the k n × m l matrix defined as

A ⊗ B = [ A11 B   A12 B   ⋯   A1m B ]
        [ A21 B   A22 B   ⋯   A2m B ]
        [  ⋮       ⋮             ⋮   ]
        [ Ak1 B   Ak2 B   ⋯   Akm B ].    (6.113)

Now we define the vector representation vec(X) of the m × n matrix X by stacking the n columns of X to
form a single column vector, i.e.

vec(X) = [ X11, X21, X31, …, Xm1, …, X1n, X2n, X3n, …, Xmn ]^T.    (6.114)

In this way it may be verified that

vec(A X B) = (B^T ⊗ A) vec(X).    (6.115)

To simplify 4-tensor computations we introduce the matrix representation mat(ℂ) of the 4-tensor ℂ as the 9 × 9 array

mat(ℂ) = [ C1111  C1121  C1131  C1112  C1122  C1132  C1113  C1123  C1133 ]
         [ C2111  C2121  C2131  C2112  C2122  C2132  C2113  C2123  C2133 ]
         [ C3111  C3121  C3131  C3112  C3122  C3132  C3113  C3123  C3133 ]
         [ C1211  C1221  C1231  C1212  C1222  C1232  C1213  C1223  C1233 ]
         [ C2211  C2221  C2231  C2212  C2222  C2232  C2213  C2223  C2233 ]
         [ C3211  C3221  C3231  C3212  C3222  C3232  C3213  C3223  C3233 ]
         [ C1311  C1321  C1331  C1312  C1322  C1332  C1313  C1323  C1333 ]
         [ C2311  C2321  C2331  C2312  C2322  C2332  C2313  C2323  C2333 ]
         [ C3311  C3321  C3331  C3312  C3322  C3332  C3313  C3323  C3333 ].    (6.116)

Using this matrix construct, it can be verified that the components of the 2-tensor A = ℂ[B] obey

vec(A) = mat(ℂ) vec(B)    (6.117)

for any 2- and 4-tensors B ∈ L and ℂ ∈ L4. It may also be verified that for any 2-tensors A, B ∈ L and any
4-tensors 𝔸, 𝔹 ∈ L4

A · B = vec(A)^T vec(B),
(A ⊗ B)^T = A^T ⊗ B^T,
mat(A ⊗ B) = vec(A) vec(B)^T,
mat(A ⊠ B) = B ⊗ A,
mat(𝔸^T) = mat(𝔸)^T,
mat(𝔸 𝔹) = mat(𝔸) mat(𝔹).    (6.118)
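The matrix–vector abstraction of Eqns. 6.113–6.118 is easy to exercise numerically; the sketch below (my addition, assuming NumPy) uses column-major stacking for vec, consistent with Eqn. 6.114, and checks Eqn. 6.115 together with the conjugation-product identity mat(A ⊠ B) = B ⊗ A.

```python
import numpy as np

def vec(X):
    # Eqn. 6.114: stack the columns of X
    return X.reshape(-1, order='F')

A = np.random.rand(3, 3)
B = np.random.rand(3, 3)
X = np.random.rand(3, 3)

# Eqn. 6.115: vec(A X B) = (B^T ⊗ A) vec(X), with ⊗ the Kronecker product
print(np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X)))

# Conjugation product (A ⊠ B)[X] = A X B^T, so mat(A ⊠ B) = B ⊗ A, cf. Eqn. 6.118
print(np.allclose(vec(A @ X @ B.T), np.kron(B, A) @ vec(X)))
```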

6.3 Set Summary

To make life easier down the road, we list several commonly used sets, cf. Tab. 6.1.

Notation    Set
ℝ           Real numbers
ℝ+          Real numbers greater than 0
E           Vectors
L           2-tensors
LInv        Invertible 2-tensors
L+          Invertible 2-tensors with positive determinant
LSym        Symmetric 2-tensors
LSym+       Invertible positive definite symmetric 2-tensors
LSkew       Skew 2-tensors
LOrth       Orthogonal 2-tensors
LRot        Orthogonal 2-tensors with positive determinant, i.e. rotations
L(A, B)     Linear mappings from A to B
L4          4-tensors

Table 6.1: Set notation.

6.4 Differentiation

Differentiation arises in numerous places in the sequel. For example, we differentiate the displacement to
obtain the strain tensor.
For now consider the function f : X ⊂ ℝ → Y ⊂ ℝ, which maps reals into reals. Then based on your
past experience you know that the derivative f′(x) of the function f at x, if it exists, is defined as

f′(x) = lim_{ε→0} [f(x + ε) − f(x)] / ε    (6.119)

and f is differentiable if f′(x) exists for all x ∈ X. Our use of f(x + ε) for small |ε| implies x + ε ∈ X and hence
x must be an interior point of X. Consequently X must be an open subset of ℝ, e.g. X = (0, 2) ⊂ ℝ. Referring
to your first calculus class you also know that for f(x) = x², f′(x) = 2 x and for g(x) = |x|, g′(0) does not
exist. Perhaps what you have not thought about is the fact that the derivative f′(x) of the function f at x (if

it exists) is unique because it is defined via the limit, which is unique. And this is the reason that g′(0) does
not exist, i.e. while the one sided limits lim_{ε→0+} (g(0 + ε) − g(0))/ε = 1 and lim_{ε→0−} (g(0 + ε) − g(0))/ε = −1
exist, the limit lim_{ε→0} (g(0 + ε) − g(0))/ε does not.
After taking your vector calculus class you know that for a scalar valued function of a vector, i.e. φ :
E → ℝ, the gradient ∇φ(x) of φ at x, if it exists, is defined as the vector

∇φ(x) = [ ∂φ̄/∂x1 (x1, x2, x3) ]
        [ ∂φ̄/∂x2 (x1, x2, x3) ]
        [ ∂φ̄/∂x3 (x1, x2, x3) ]
      = ∂φ̄/∂xi (x1, x2, x3) ei    (6.120)

and this makes sense, i.e. the scalar valued component function φ̄ is differentiated with respect to the 3 vector
components xj. For a vector valued function of a vector, i.e. f : E → E, the derivative Df(x) of f at x, if it
exists, is defined as the 2-tensor (a.k.a. matrix)

Df(x) = [ ∂f̄1/∂x1 (x1, x2, x3)   ∂f̄1/∂x2 (x1, x2, x3)   ∂f̄1/∂x3 (x1, x2, x3) ]
        [ ∂f̄2/∂x1 (x1, x2, x3)   ∂f̄2/∂x2 (x1, x2, x3)   ∂f̄2/∂x3 (x1, x2, x3) ]
        [ ∂f̄3/∂x1 (x1, x2, x3)   ∂f̄3/∂x2 (x1, x2, x3)   ∂f̄3/∂x3 (x1, x2, x3) ]
      = ∂f̄i/∂xj (x1, x2, x3) ei ⊗ ej    (6.121)

and this makes sense, i.e. the 3 component functions f̄i of the vector valued function f are differentiated
with respect to the 3 vector components xj. In the above it is understood that 1) x = xi ei, cf. Eqn. 6.10, 2)
φ(x)|x=xi ei = φ̄(x1, x2, x3) and f(x)|x=xi ei = f̄i(x1, x2, x3) ei are represented by their component functions
φ̄ : ℝ³ → ℝ and f̄i : ℝ³ → ℝ and 3) the partial derivatives are evaluated in the usual manner, i.e. via
Eqn. 6.119 with xi replacing x. Note the care that is taken to distinguish the function from its component
function, e.g. φ̄ from φ. Also note the care that is taken to have consistent variables on each side of the
above equations, namely (x1, x2, x3). And finally note that the components of the vector ∇φ(x) and the
2-tensor Df(x) with respect to the fixed orthonormal basis {e1, e2, e3} are ∂φ̄/∂xi (x1, x2, x3) and ∂f̄i/∂xj (x1, x2, x3),
respectively.
Generalizing your knowledge we define the gradient of a scalar valued function of a 2-tensor, i.e. φ :
L → ℝ, at A, if it exists, as the 2-tensor

∇φ(A) = ∂φ̄/∂Aij (A11, A12, …, A33) ei ⊗ ej
      = [ ∂φ̄/∂A11 (A11, …, A33)   ∂φ̄/∂A12 (A11, …, A33)   ∂φ̄/∂A13 (A11, …, A33) ]
        [ ∂φ̄/∂A21 (A11, …, A33)   ∂φ̄/∂A22 (A11, …, A33)   ∂φ̄/∂A23 (A11, …, A33) ]
        [ ∂φ̄/∂A31 (A11, …, A33)   ∂φ̄/∂A32 (A11, …, A33)   ∂φ̄/∂A33 (A11, …, A33) ].    (6.122)

This makes sense, i.e. the scalar valued component function φ̄ is differentiated with respect to the 3 × 3 tensor
components Aij. And finally, the derivative of a 2-tensor valued function of a 2-tensor, i.e. B : L → L, at A,
if it exists, is the 4-tensor

DB(A) = ∂B̄ij/∂Akl (A11, A12, …, A33) (ei ⊗ ej) ⊗ (ek ⊗ el)    (6.123)

and this makes sense, i.e. the 3 × 3 component functions B̄ij of the tensor valued function B are differentiated
with respect to the 3 × 3 tensor components Akl. Here it is understood that 1) A = Aij ei ⊗ ej, cf. Eqn.
6.36, 2) φ(A)|A=Aij ei⊗ej = φ̄(A11, A12, …, A33) is represented by its component function φ̄ : ℝ⁹ → ℝ and 3)
B(A)|A=Aij ei⊗ej = B̄ij(A11, A12, …, A33) ei ⊗ ej is represented by its component functions B̄ij : ℝ⁹ → ℝ, cf. Eqn.
6.93.
Example 6.2. Determine the derivative of the function f : ℝ → ℝ

f(x) = x².

Using Eqn. 6.119 gives

f′(x) = lim_{ε→0} [(x + ε)² − x²] / ε
      = lim_{ε→0} (2 x + ε)
      = 2 x.

Example 6.3. Determine the derivative of the function φ : E → ℝ such that

φ(x) = x · x = x1² + x2² + x3²,

where the second equality follows from Eqn. 6.20. Using Eqn. 6.120 gives

∇φ(x) = [ 2 x1 ]
        [ 2 x2 ]
        [ 2 x3 ]
      = 2 x,

where the second equality follows from Eqn. 6.13.

Example 6.4. Determine the gradient of the trace, i.e. ι1 : L → ℝ where from Eqn. 6.50

ι1(A) = trA = A11 + A22 + A33.

Application of Eqn. 6.122 yields

∇ι1(A) = [ 1  0  0 ]
         [ 0  1  0 ]
         [ 0  0  1 ]
       = I.

The above differentiation via partial differentiation is absolutely fine, however there are situations where
it is cumbersome. Indeed, as seen in Exams. 6.3 and 6.4 we have to expand the function in terms of its
components and this is not always straightforward. For example consider the determinant function det =
ι3 : L → ℝ, cf. Eqn. 6.49, or the tensor inverse function G : L → L such that G(A) = A⁻¹.
To these ends we revisit the derivative definition of Eqn. 6.119. It generalizes nicely for other choices
of the co-domain Y, e.g. rather than Y ⊂ ℝ we could have Y ⊂ E or Y ⊂ L. However, for other choices of

the domain X this is not the case as the division, e.g. by an increment u ∈ X ⊂ E, in Eqn. 6.119 is ill-defined. To remedy
this problem we modify Eqn. 6.119 so that the derivative of the function f : X ⊂ U → Y ⊂ V at x ∈ X,
if it exists, is the linear operator Df(x) : U → V that eats any u ∈ U and spits out the unique element
Df(x)[u] ∈ V such that

Df(x)[u] = lim_{ε→0} (1/ε) [f(x + ε u) − f(x)] = d/dε f(x + ε u)|_{ε=0} = f′(x; u).    (6.124)

Recalling Eqn. 4.62 we see the appearance of the directional derivative of f at x with respect to u, i.e.
f′(x; u). However, the derivative Df(x) considers all u ∈ U whereas the directional derivative only considers a specific u. Again we emphasize that Df(x) : U → V is a function; its value at u ∈ U is denoted by
Df(x)[u] ∈ V; and because it is a linear function we use the square brackets (as we did with 4-tensors) to
delineate the argument u.
This linear operator derivative definition is consistent with those appearing in Eqns. 6.120, 6.121, 6.122
and 6.123. For example, specializing Eqn. 6.124 for the function f : E → E gives

Df(x)[u] = lim_{ε→0} (1/ε) [f(x + ε u) − f(x)]    (6.125)

that holds for all u ∈ E. Letting u = e1 and using the component functions subsequently gives

Df(x)[e1] = lim_{ε→0} (1/ε) [f(x + ε e1) − f(x)]
          = lim_{ε→0} (1/ε) [f̄i(x1 + ε, x2, x3) ei − f̄i(x1, x2, x3) ei]
          = ∂f̄i/∂x1 (x1, x2, x3) ei    (6.126)

and hence Df(x)[ej] = ∂f̄i/∂xj (x1, x2, x3) ei. Now, Df(x) is a linear operator meaning Df(x)[α u + β v] =
α Df(x)[u] + β Df(x)[v] so in particular upon expressing u = uj ej we have

Df(x)[u] = Df(x)[uj ej]
         = uj Df(x)[ej]
         = (u · ej) ∂f̄i/∂xj (x1, x2, x3) ei
         = ( ∂f̄i/∂xj (x1, x2, x3) ei ⊗ ej ) u,    (6.127)

which follows from the partial derivative Eqn. 6.126, the inner product Eqn. 6.20 and the dyadic product
Eqn. 6.29. Finally, the arbitrariness of u is used to obtain Eqn. 6.121 and whence we have the usual result
that the components of the derivative Df(x) are equal to the partial derivatives of the component functions, i.e.
(Df(x))ij = ∂f̄i/∂xj (x1, x2, x3).
Some remarks concerning the derivative are worth noting.
D1 Be careful with the notation: we denote the (derivative) function of the function f : X ⊂ U → Y ⊂ V
at x ∈ X ⊂ U as Df(x) : U → V and its value at u ∈ U as Df(x)[u] ∈ V.
D2 As an alternative to the Eqn. 6.124 derivative definition, we could have equivalently stated that the
derivative of the function f : X ⊂ U → Y ⊂ V at x ∈ X, if it exists, is the linear operator
Df(x) : U → V that eats any u ∈ U and spits out the unique element Df(x)[u] ∈ V such that

Df(x)[u] = f(x + u) − f(x) + o(|u|),    (6.128)

where the o(|u|) term tends to zero faster than |u|, cf. Eqn. 1.21. And thus we see that the derivative
at x acting on u, i.e. Df(x)[u], approximates the difference f(x + u) − f(x) for small u. Using this
definition requires U to be a normed linear space due to the appearance of |u|. However, we only work
with normed linear spaces, so this condition poses no limitation.

Figure 6.10: Derivative illustration.

D3 Since the derivative function Df(x) is linear we have Df(x)[α1 u1 + α2 u2] = α1 Df(x)[u1] + α2 Df(x)[u2]
≈ f(x + α1 u1 + α2 u2) − f(x) for α1, α2 ∈ ℝ and u1, u2 ∈ U.
D4 If f is a function of a real number, i.e. f : X ⊂ ℝ → Y ⊂ V then we use the ordinary derivative
notation, i.e. f′(x) u = Df(x)[u].
D5 If f is a differentiable real valued function, i.e. f : X ⊂ U → Y ⊂ ℝ then Df(x) is a linear function
that eats u ∈ U and spits out real numbers Df(x)[u] ∈ ℝ and hence we can use the vector inner
product, which is linear via property E2, to express

Df(x)[u] = ∇f(x) · u.    (6.129)

We refer to ∇f(x) as the gradient of f at x. Obviously since u ∈ U we see that ∇f(x) ∈ U, i.e. the
gradient ∇f(x) is an element of the vector space U, cf. Fig. 6.11.
D6 Considering f : X ⊂ U → Y ⊂ V we note that the x + ε u ∈ X requirement does not imply that u ∈ X,
rather it only requires that lim_{ε→0} (x + ε u) is in X, e.g. for x = 1 ∈ X = (0, 2) ⊂ ℝ and u = −1 ∉ X we
have lim_{ε→0} (x + ε u) ∈ X.
D7 Upon similar consideration of f : X ⊂ U → Y ⊂ V, we note that although the value Df(x)[ε u] approximates the difference f(x + ε u) − f(x) and both f(x + ε u), f(x) ∈ Y we need not have Df(x)[ε u]
∈ Y. For example consider the function f : (−1, 1) ⊂ ℝ → (0, π) ⊂ ℝ with f(x) = arccos(x), which
gives, e.g. f′(0.5) = −1/√(1 − (0.5)²) = −1.1547 and Df(0.5)[0.01] = −0.011547 ∉ Y.
D8 The derivative is unique and this follows from its component representation, because the components are unique, cf. e.g. Eqn. 6.120 and Exer. 6.4.

Example 6.5. Here we repeat the results of Exam. 6.2 using Eqn. 6.128. First, it is worth taking a moment to note
what the derivative of f : ℝ → ℝ at x is. With this in mind and referring to the discussion surrounding Eqn. 6.124 we
note that the derivative of f at x will eat elements in ℝ and spit out elements in ℝ and hence we have Df(x) : ℝ → ℝ,
and since this is a linear operator it acts on the increment u via scalar multiplication, i.e. Df(x)[u] = Df(x) u; and
upon referring to ordinary derivative remark D4 above we finally have Df(x)[u] = f′(x) u, which is the usual result.
Now we proceed by applying Eqn. 6.128, viz.

f(x + u) − f(x) = (x + u)² − x² = 2 x u + u² = Df(x)[u] + o(|u|) = f′(x) u + o(|u|),

where we again see that f′(x) = 2 x and note that lim_{u→0} u²/|u| = 0, which justifies the statement that u² = o(|u|), cf.
Eqn. 1.21. Now we have Df(x) : ℝ → ℝ and, for example,

f(2.1) − f(2) = 2.1² − 2² = 0.41 ≈ f′(2) × 0.1 = 4 × 0.1 = 0.4.

Figure 6.11: Gradient illustration ∇φ(x) = 2 x for the function φ : E → ℝ such that φ(x) = x · x of Exams.
6.3 and 6.6.

Example 6.6. Here we repeat the results of Exam. 6.3 using Eqn. 6.124. On this occasion the derivative of φ : E → R
at x eats elements in E and spits out elements in R and hence Dφ(x) : E → R. Moreover, Dφ(x) is a linear operator
that maps to the reals; it can be described by the dot product, cf. remark D5 above. In this way Dφ(x)[u] = ∇φ(x) · u
where the gradient is a vector, i.e. ∇φ(x) ∈ E. Now we proceed by applying Eqns. 6.124 and 6.129, viz.

∇φ(x) · u = lim_{ε→0} (1/ε) [φ(x + ε u) − φ(x)]
          = lim_{ε→0} (1/ε) [(x + ε u) · (x + ε u) − x · x]
          = lim_{ε→0} [2 x · u + ε u · u]
          = 2 x · u

so that again we have ∇φ(x) = 2 x, cf. Fig. 6.11.
With x = e1 + 2 e2 and u = 0.1 e1 + 0.1 e2 + 0.2 e3 we have

φ(x + u) − φ(x) = (1.1 e1 + 2.1 e2 + 0.2 e3) · (1.1 e1 + 2.1 e2 + 0.2 e3) − (e1 + 2 e2) · (e1 + 2 e2) = 0.66
              ≈ ∇φ(x) · u = 2 (e1 + 2 e2) · (0.1 e1 + 0.1 e2 + 0.2 e3) = 0.6.

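The difference-quotient character of Eqns. 6.124 and 6.128 makes results such as the one above easy to check numerically. The following short Python sketch (our own illustration, not part of the text's development; the helper names are ours) approximates ∇φ(x) · u for φ(x) = x · x by a difference quotient and compares it with 2 x · u.

import numpy as np

def phi(x):                      # phi(x) = x . x
    return np.dot(x, x)

def directional_derivative(f, x, u, eps=1e-6):
    # difference quotient of Eqn. 6.124: (f(x + eps*u) - f(x)) / eps
    return (f(x + eps * u) - f(x)) / eps

x = np.array([1.0, 2.0, 0.0])    # x = e1 + 2 e2
u = np.array([0.1, 0.1, 0.2])    # u = 0.1 e1 + 0.1 e2 + 0.2 e3

print(directional_derivative(phi, x, u))   # approximately 0.6
print(np.dot(2 * x, u))                    # exact value grad(phi)(x) . u = 0.6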
Example 6.7. Here we determine the derivatives of the tensor invariants ι_j : L → R of Eqn. 6.47. These derivatives eat
elements in L and spit out elements in R and hence Dι_j(A) : L → R. And as in Exam. 6.6, since the linear operator
Dι_j(A) maps to the reals we describe it via the dot product, i.e. Dι_j(A)[U] = ∇ι_j(A) · U where the gradients are
tensors, i.e. ∇ι_j(A) ∈ L. Now we proceed by applying Eqns. 6.124 and 6.129 to the invariants ι_j : L → R such that

ι1(A) = tr A,
ι2(A) = ½ {(tr A)² − tr A²},
ι3(A) = det A,

cf. Eqn. 6.51.


Using Eqns. 6.59, 6.124 and 6.129 with ι1(A) = tr A = I · A gives

∇ι1(A) · U = lim_{ε→0} (1/ε) [ι1(A + ε U) − ι1(A)]
           = lim_{ε→0} (1/ε) [I · (A + ε U) − I · A]
           = lim_{ε→0} (1/ε) [ε I · U]
           = I · U

whence

∇ι1(A) = I,                (6.130)

which agrees with our Exam. 6.4 result.


The same process with Eqns. 6.51, 6.59, 6.124 and 6.129 gives
∇ι2(A) · U = lim_{ε→0} (1/ε) [ι2(A + ε U) − ι2(A)]
           = lim_{ε→0} (1/ε) (½ {[tr(A + ε U)]² − tr(A + ε U)²} − ½ {(tr A)² − tr A²})
           = lim_{ε→0} (1/ε) (½ {[(A + ε U) · I]² − (A + ε U)ᵀ · (A + ε U)} − ½ {(A · I)² − Aᵀ · A})
           = lim_{ε→0} ((A · I)(I · U) − Aᵀ · U + (ε/2) {(U · I)² − Uᵀ · U})
           = (A · I)(I · U) − Aᵀ · U
           = (tr A I − Aᵀ) · U

whence

∇ι2(A) = tr A I − Aᵀ.                (6.131)
In regard to ι3(A) we use Eqn. 6.47 to obtain for ι3(A) = det A

[ι3(A + ε U) − ι3(A)] [a, b, c]
  = ι3(A + ε U) [a, b, c] − ι3(A) [a, b, c]
  = [(A + ε U) a, (A + ε U) b, (A + ε U) c] − [A a, A b, A c]
  = ε {[U a, A b, A c] + [A a, U b, A c] + [A a, A b, U c]} +
    ε² {[A a, U b, U c] + [U a, A b, U c] + [U a, U b, A c]} +
    ε³ [U a, U b, U c]
  = ε {[U A⁻¹ A a, A b, A c] + [A a, U A⁻¹ A b, A c] + [A a, A b, U A⁻¹ A c]} +
    ε² {[A a, U A⁻¹ A b, U A⁻¹ A c] + [U A⁻¹ A a, A b, U A⁻¹ A c] + [U A⁻¹ A a, U A⁻¹ A b, A c]} +
    ε³ [U a, U b, U c]
  = ε ι1(U A⁻¹) [A a, A b, A c] + ε² ι2(U A⁻¹) [A a, A b, A c] + ε³ ι3(U) [a, b, c]
  = {ε ι1(U A⁻¹) ι3(A) + ε² ι2(U A⁻¹) ι3(A) + ε³ ι3(U)} [a, b, c].

Using the arbitrariness of a, b, and c we have

ι3(A + ε U) − ι3(A) = ε ι1(U A⁻¹) ι3(A) + ε² ι2(U A⁻¹) ι3(A) + ε³ ι3(U)

so that Eqns. 6.59, 6.124 and 6.129 give

∇ι3(A) · U = lim_{ε→0} (1/ε) [ι3(A + ε U) − ι3(A)]
           = ι1(U A⁻¹) ι3(A)
           = ι3(A) A⁻ᵀ · U

whence

∇ι3(A) = det A A⁻ᵀ
       = A∗,                (6.132)

where we use Eqn. 6.71. Note that this result assumes A is invertible, i.e. ι3 : LInv ⊂ L → R \ {0} ⊂ R.
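These gradient formulas are easy to spot check numerically. Below is a small Python sketch (our own illustration, with helper names of our choosing) that compares the difference quotient of ι_j(A + ε U) with ∇ι_j(A) · U for a random A and U.

import numpy as np

def invariants(A):
    # principal invariants: tr A, (1/2)[(tr A)^2 - tr(A^2)], det A
    i1 = np.trace(A)
    i2 = 0.5 * (np.trace(A)**2 - np.trace(A @ A))
    i3 = np.linalg.det(A)
    return i1, i2, i3

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
U = rng.standard_normal((3, 3))
eps = 1e-6

grads = [np.eye(3),                                   # grad iota1 = I            (6.130)
         np.trace(A) * np.eye(3) - A.T,               # grad iota2 = trA I - A^T  (6.131)
         np.linalg.det(A) * np.linalg.inv(A).T]       # grad iota3 = detA A^-T    (6.132)

for j, g in enumerate(grads):
    fd = (invariants(A + eps * U)[j] - invariants(A)[j]) / eps
    print(j + 1, fd, np.sum(g * U))                   # the two columns should agree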

6.4.1 Product rule


You have used the product rule to evaluate derivatives, e.g. for f(x) = g(x) h(x) you compute f′(x) = g′(x) h(x) +
g(x) h′(x) where f, g and h are functions from the reals to the reals, e.g. f : R → R. And you have evaluated
derivatives of more complicated functions, e.g. for the real valued function φ : E → R you evaluated
the gradient ∇φ(x) of Eqn. 6.120. But perhaps you have not evaluated the derivative of products of these
more complicated functions. For example consider φ(x) = g(x) · h(x) where g and h are vector valued functions
of vectors, e.g. g : E → E. To differentiate φ we cannot use the product rule in the naive way, i.e. we do not
have ∇φ(x) = Dg(x) h(x) + g(x) · Dh(x) where Dg(x) and Dh(x) are matrices similar to Df(x) in Eqn. 6.121.
Indeed the inner product in this derivative expression is nonsense since it is between a tensor (matrix) and
a vector.
When deriving the governing equations for linear elasticity we need to differentiate functions that are
products in nature, i.e. f(x) = g(x) ∗ h(x) where the functions f : X ⊂ U → Y, g : X ⊂ U → W and
h : X ⊂ U → Z have the same domain X and respective co-domains Y, W and Z, which are, e.g. subsets
of R, E and/or L. For example, ∗ could represent the scalar multiplication between a real valued function
and either another real valued function, a vector valued function or a tensor valued function. It could also
represent the inner product between a pair of vector valued functions (as just discussed) or tensor valued
functions. The dyadic product between a pair of vector or tensor valued functions is yet another example.
Our task here is to differentiate such functions.
In all cases, the operation ∗ is bilinear meaning that for d, g ∈ W, h, k ∈ Z and reals α, β ∈ R we
have (α d + β g) ∗ h = α (d ∗ h) + β (g ∗ h) and g ∗ (α h + β k) = α (g ∗ h) + β (g ∗ k), cf. Sect. 4.4.2. In
essence, if we fix h ∈ Z then ∗ is a linear operator on W and vice versa. For the cases of the inner and dyadic
products between a pair of vectors this is clearly the case as seen through properties E1 and E2 and Eqn.
6.32, respectively.
To differentiate f we use the product rule, which states that if Dg(x) : U → W and Dh(x) : U → Z
exist, then D f(x) : U → Y exists and satisfies

D f(x)[u] = Dg(x)[u] ∗ h(x) + g(x) ∗ Dh(x)[u].                (6.133)

In the following examples we apply the product rule to obtain results that are required in our subsequent
developments.
Example 6.8. Here we determine the derivative of the function φ : E → R where

φ(x) = g(x) · h(x)

with g : E → E and h : E → E smooth functions. The derivatives of the smooth functions g and h at x linearly map
vectors u ∈ E to the vectors Dg(x)[u] ∈ E and Dh(x)[u] ∈ E and hence Dg(x) ∈ L and Dh(x) ∈ L are 2-tensors to wit
we write Dg(x)[u] = Dg(x) u and Dh(x)[u] = Dh(x) u. On the other hand, Dφ(x) linearly maps vectors u ∈ E to reals
Dφ(x)[u] and hence Dφ(x)[u] = ∇φ(x) · u where ∇φ(x) ∈ E is the gradient vector, cf. Exam. 6.6. Be that as it may,
we now apply Eqns. 6.40, 6.129 and 6.133 to obtain

 Dφ(x)[u] = Dg(x)[u] · h(x) + g(x) · Dh(x)[u],
∇φ(x) · u = u · (Dg(x))ᵀ h(x) + u · (Dh(x))ᵀ g(x)
          = {(Dg(x))ᵀ h(x) + (Dh(x))ᵀ g(x)} · u.

As seen above ∇φ(x) = (Dg(x))ᵀ h(x) + (Dh(x))ᵀ g(x) ∈ E.
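As a sanity check, the result of Exam. 6.8 can be verified with finite differences; the Python sketch below (our own illustration) takes g(x) = B x and h(x) = x, so that Dg = B, Dh = I and ∇φ(x) = Bᵀ x + B x.

import numpy as np

B = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [4.0, 0.0, 1.0]])

def phi(x):                       # phi(x) = g(x) . h(x) with g(x) = B x and h(x) = x
    return np.dot(B @ x, x)

x = np.array([1.0, -2.0, 0.5])
u = np.array([0.3, 0.1, -0.2])
eps = 1e-6

fd = (phi(x + eps * u) - phi(x)) / eps          # difference quotient
grad = B.T @ x + B @ x                          # (Dg)^T h + (Dh)^T g
print(fd, np.dot(grad, u))                      # the two values should agree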

Example 6.9. Here we determine the derivative of the function a : E → E where

a(x) = φ(x) h(x)

with φ : E → R and h : E → E smooth functions. The derivative of φ linearly maps vectors u ∈ E to reals Dφ(x)[u]
and hence we have Dφ(x)[u] = ∇φ(x) · u where ∇φ(x) ∈ E is the gradient vector. On the other hand, the derivatives of
a and h linearly map vectors u ∈ E to vectors Da(x)[u] ∈ E and Dh(x)[u] ∈ E and hence we write Da(x)[u] = Da(x) u
and Dh(x)[u] = Dh(x) u as Da(x) ∈ L and Dh(x) ∈ L are 2-tensors. Forging on, we apply Eqns. 6.29, 6.129 and
6.133 to obtain

Da(x)[u] = Dφ(x)[u] h(x) + φ(x) Dh(x)[u]
Da(x) u  = (∇φ(x) · u) h(x) + φ(x) Dh(x) u
         = (h(x) ⊗ ∇φ(x) + φ(x) Dh(x)) u.

Here we have Da(x) = h(x) ⊗ ∇φ(x) + φ(x) Dh(x), which we recognize as a tensor.


Example 6.10. Determine the derivative of the function F : LInv ⊂ L → LInv ⊂ L where

F(A) = A⁻¹.

Of course the domain of F is restricted to the subspace of invertible tensors. However, the domain of DF(A) is L,
i.e. the set of 2-tensors. Moreover, DF(A) linearly maps 2-tensors U ∈ L to 2-tensors DF(A)[U] ∈ L and hence
DF(A) ∈ L⁴ is a 4-tensor.
To address this problem we define G : LInv ⊂ L → LInv ⊂ L such that G(A) = F(A) A = I. And based on
the above verbiage, we know that DG(A) ∈ L⁴ is a 4-tensor. Moreover, since G(A) = I, a constant, we trivially have
DG(A) = O. However our problem is to find DF(A) to wit we use Eqn. 6.133 to obtain

DG(A)[U] = DF(A)[U] A + F(A) I[U]
    O[U] = DF(A)[U] A + F(A) U
       O = DF(A)[U] A + F(A) U,

where we used the fact that for H : L → L such that H(A) = A we trivially have DH(A) = I ∈ L⁴ and DH(A)[U] =
U. Rearranging the above and using Eqn. 6.107 gives

DF(A)[U] = −F(A) U A⁻¹
         = −A⁻¹ U A⁻¹
         = −(A⁻¹ ⊠ A⁻ᵀ)[U]

or

DF(A) = −(A⁻¹ ⊠ A⁻ᵀ),                (6.134)

which is a fourth-order tensor, i.e. an element of L⁴. This is in agreement with your prior knowledge, i.e. for f : R → R
with f(x) = 1/x we have f′(x) = −1/x² that is defined at all x ∈ R except x = 0. Indeed f is not defined at x = 0, i.e.
at the element of R with no inverse.
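Equation 6.134 is again easy to verify numerically; the Python sketch below (our own check) compares the difference quotient of (A + ε U)⁻¹ with −A⁻¹ U A⁻¹.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)   # well-conditioned, invertible
U = rng.standard_normal((3, 3))
eps = 1e-6

fd = (np.linalg.inv(A + eps * U) - np.linalg.inv(A)) / eps
exact = -np.linalg.inv(A) @ U @ np.linalg.inv(A)    # DF(A)[U] of Eqn. 6.134
print(np.max(np.abs(fd - exact)))                   # should be close to zero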

Example 6.11. In this example we determine the derivative of the adjugate A∗ of A. Noting from Eqn. 6.71 that
A∗ = det A A⁻ᵀ we define the function H : LInv ⊂ L → LInv ⊂ L such that H(A) = A∗. However, to utilize the results
of Exams. 6.7 and 6.10 we define G : LInv ⊂ L → LInv ⊂ L such that G(A) = ι3(A) F(A) so that

H(A) = det A A⁻ᵀ
     = T[det A A⁻¹]
     = T[ι3(A) F(A)]
     = T[G(A)],

where we use Eqns. 6.64, 6.71 and 6.100 and define F : LInv ⊂ L → LInv ⊂ L such that F(A) = A⁻¹. From Exams.
6.7 and 6.10 we have Dι3(A)[U] = ∇ι3(A) · U = det A A⁻ᵀ · U and DF(A)[U] = −(A⁻¹ ⊠ A⁻ᵀ)[U] and hence Eqns.
6.129 and 6.133 give

DG(A)[U] = Dι3(A)[U] F(A) + ι3(A) DF(A)[U]
         = (∇ι3(A) · U) F(A) − ι3(A) (A⁻¹ ⊠ A⁻ᵀ)[U]
         = {F(A) ⊗ ∇ι3(A) − ι3(A) (A⁻¹ ⊠ A⁻ᵀ)}[U]
         = det A (A⁻¹ ⊗ A⁻ᵀ − A⁻¹ ⊠ A⁻ᵀ)[U],

where we used Eqns. 6.55, 6.57, 6.91, 6.104, 6.132 and 6.134. The above gives the fourth order tensor DG(A) =
det A (A⁻¹ ⊗ A⁻ᵀ − A⁻¹ ⊠ A⁻ᵀ) and hence, recalling the composition result A ∘ B[C] = A[B[C]] we have

DH(A) = det A T ∘ (A⁻¹ ⊗ A⁻ᵀ − A⁻¹ ⊠ A⁻ᵀ).                (6.135)
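Applied to an increment U, Eqn. 6.135 reads DH(A)[U] = det A {(A⁻ᵀ · U) A⁻¹ − A⁻¹ U A⁻¹}ᵀ, which the following Python sketch (our own check) compares against a difference quotient of H(A) = det A A⁻ᵀ.

import numpy as np

def adjugate(A):                                   # H(A) = det(A) A^{-T}
    return np.linalg.det(A) * np.linalg.inv(A).T

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
U = rng.standard_normal((3, 3))
eps = 1e-6

Ainv = np.linalg.inv(A)
exact = np.linalg.det(A) * (np.sum(Ainv.T * U) * Ainv - Ainv @ U @ Ainv).T   # Eqn. 6.135 acting on U
fd = (adjugate(A + eps * U) - adjugate(A)) / eps
print(np.max(np.abs(fd - exact)))                  # should be close to zero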

Example 6.12. We continue with the previous Exam. 6.11 and now differentiate the function |A∗ a| with respect to
A; here a is an arbitrary constant vector. Again noting from Eqn. 6.71 that A∗ = det A A⁻ᵀ we define the function
h : LInv ⊂ L → E such that h(A) = A∗ a = H(A) a where we utilize the results of Exam. 6.11 so that

Dh(A)[U] = DH(A)[U] a
         = {det A T ∘ (A⁻¹ ⊗ A⁻ᵀ − A⁻¹ ⊠ A⁻ᵀ)[U]} a.                (6.136)

Next we define the scalar valued function φ : LInv ⊂ L → R such that φ(A) = (h(A) · h(A))^(1/2) = |A∗ a| and use the
elementary rules of differentiation to obtain
∇φ(A) · U = ½ (h(A) · h(A))^(−1/2) 2 h(A) · Dh(A)[U]
          = (1/φ(A)) h(A) · DH(A)[U] a
          = (1/φ(A)) (h(A) ⊗ a) · DH(A)[U]
          = (1/φ(A)) DᵀH(A)[h(A) ⊗ a] · U
          = (1/φ(A)) {det A T ∘ (A⁻¹ ⊗ A⁻ᵀ − A⁻¹ ⊠ A⁻ᵀ)}ᵀ[(A∗ a) ⊗ a] · U
          = (1/φ(A)) det A (A⁻¹ ⊗ A⁻ᵀ − A⁻¹ ⊠ A⁻ᵀ)ᵀ ∘ Tᵀ[(A∗ a) ⊗ a] · U
          = (1/φ(A)) det A (A⁻ᵀ ⊗ A⁻¹ − A⁻ᵀ ⊠ A⁻¹) ∘ T[(A∗ a) ⊗ a] · U
          = (1/φ(A)) det A (A⁻ᵀ ⊗ A⁻¹ − A⁻ᵀ ⊠ A⁻¹)[a ⊗ (A∗ a)] · U
          = (1/φ(A)) det A {(A⁻¹ · (a ⊗ (A∗ a))) A⁻ᵀ − A⁻ᵀ (a ⊗ (A∗ a)) A⁻ᵀ} · U
          = (1/φ(A)) det A {((A⁻¹ A∗ a) · a) I − (A⁻ᵀ a) ⊗ (A∗ a)} A⁻ᵀ · U
          = (1/φ(A)) {((A∗ a) · (A∗ a)) I − (A∗ a) ⊗ (A∗ a)} A⁻ᵀ · U,

where we used Eqns. 6.40, 6.41, 6.59, 6.64, 6.71, 6.86, 6.100, 6.97 and 6.108, amongst others. And hence we have

∇φ(A) = (1/φ(A)) {((A∗ a) · (A∗ a)) I − (A∗ a) ⊗ (A∗ a)} A⁻ᵀ.                (6.137)
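The result of Eqn. 6.137 can be spot checked numerically as well; the Python sketch below (our own check) compares a difference quotient of φ(A) = |A∗ a| with ∇φ(A) · U.

import numpy as np

def phi(A, a):                                     # phi(A) = |A* a| with A* = det(A) A^{-T}
    return np.linalg.norm(np.linalg.det(A) * np.linalg.inv(A).T @ a)

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
U = rng.standard_normal((3, 3))
a = np.array([1.0, -1.0, 2.0])
eps = 1e-6

Astar_a = np.linalg.det(A) * np.linalg.inv(A).T @ a
grad = (np.dot(Astar_a, Astar_a) * np.eye(3) - np.outer(Astar_a, Astar_a)) @ np.linalg.inv(A).T / phi(A, a)
fd = (phi(A + eps * U, a) - phi(A, a)) / eps
print(fd, np.sum(grad * U))                        # the two values should agree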

6.4.2 Chain rule


When deriving the governing equations for linear elasticity we also need to differentiate composite functions.
And you have also used the chain rule to evaluate derivatives, e.g. for h(x) = g ∘ f(x) = g(f(x)) you compute
h′(x) = g′(f(x)) f′(x) where f, g and h are functions from the reals to the reals, e.g. f : R → R. However, as
seen with the product rule, care must be taken when applying the chain rule to more complicated functions.
For example, for ψ(x) = φ(g(x)) where ψ and φ are scalar valued functions of vectors, e.g. φ : E → R, and g
is a vector valued function of vectors, i.e. g : E → E, we cannot merely say that ∇ψ(x) = ∇φ(g(x)) Dg(x),
indeed the vector times the tensor operation is nonsense.
In our chain rule presentation we consider the two functions f : X ⊂ U → Y ⊂ V and g : T ⊂ Y →
W ⊂ Z where T is an open subset of Y and U, V and Z are vector spaces, cf. Fig. 6.12. Using these
functions we define the composite function h : X ⊂ U → W ⊂ Z such that h(x) = g ∘ f(x) = g(f(x)) for all
x ∈ X. Now to the crux of the matter, if D f(x) : U → Y and Dg(y) : Y → Z exist, where y = f(x), then
Dh(x) : U → Z exists and

Dh(x)[u] = Dg(f(x))[D f(x)[u]]                (6.138)

that leads to

Dh(x) = Dg(f(x)) ∘ D f(x).                (6.139)

As seen in Fig. 6.12, we have the equality Dh(x)[u] = Dg(y)[v] where the derivative of g at y = f(x) acts
on the increment v = D f(x)[u].

Figure 6.12: Composite function derivative illustration.
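As with the product rule, Eqn. 6.139 is easy to exercise numerically. The Python sketch below (our own illustration) takes f(x) = B x and g(y) = y · y, so that Dh(x)[u] = ∇g(f(x)) · (B u) = 2 (B x) · (B u), and compares this with a difference quotient of h = g ∘ f.

import numpy as np

B = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 3.0, 1.0]])

f = lambda x: B @ x                 # f : E -> E
g = lambda y: np.dot(y, y)          # g : E -> R
h = lambda x: g(f(x))               # h = g o f

x = np.array([0.5, -1.0, 2.0])
u = np.array([1.0, 0.2, -0.3])
eps = 1e-6

fd = (h(x + eps * u) - h(x)) / eps
chain = np.dot(2.0 * f(x), B @ u)   # Dg(f(x))[Df(x)[u]], Eqn. 6.138
print(fd, chain)                    # the two values should agree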

Example 6.13. We now refer to Eqn. 6.124 in which f : X ⊂ U → Y ⊂ V and define the composite function
g = f ∘ h : R → Y ⊂ V where h : R → X ⊂ U such that h(ε) = x + ε u so that

g(ε) = f ∘ h(ε) = f(h(ε)) = f(x + ε u).

In this way the chain rule gives

Dg(ε)[α] = D f(h(ε))[Dh(ε)[α]]
 g′(ε) α = D f(h(ε))[h′(ε) α]
         = D f(h(ε))[u] α,

where we used the ordinary derivative notation of remark D4, e.g. Dg(ε)[α] = g′(ε) α, the trivial result h′(ε) = u and the
fact that α is a scalar. The arbitrariness of α gives us

g′(ε) = D f(h(ε))[u],

which is identical to Eqn. 6.124 for ε = 0, i.e. g′(0) = f′(x; u). And thus, the directional derivative is obtained via the
chain rule, assuming that D f(x) exists.

Example 6.14. We assume the functions F : L → L and G : L → L are differentiable and thus they have
derivatives that linearly map 2-tensors U ∈ L into 2-tensors DF(A)[U] ∈ L and DG(A)[U] ∈ L making DF(A) ∈ L⁴
and DG(A) ∈ L⁴ 4-tensors. Using these functions we define the composite function H = F ∘ G : L → L, which is
differentiable since F and G are differentiable. The derivative DH(A) ∈ L⁴ is also a 4-tensor as it also linearly maps
2-tensors U ∈ L into 2-tensors DH(A)[U] ∈ L. We evaluate this 4-tensor by appealing to Eqn. 6.138, i.e.

DH(A)[U] = DF(G(A))[DG(A)[U]].

Next we use the composition (multiplication) relation A ∘ B[C] = A[B[C]] and the arbitrariness of U whence

DH(A) = DF(G(A)) ∘ DG(A).

Now we assume the real valued function φ : L → R is differentiable and thus it has a gradient that linearly maps
2-tensors U ∈ L to scalars ∇φ(A) · U ∈ R making ∇φ(A) ∈ L a 2-tensor. From this and the above we define
the real valued composite function ψ = φ ∘ G : L → R, which is differentiable since φ and G are differentiable. The
gradient ∇ψ(A) ∈ L is a 2-tensor that linearly maps 2-tensors U ∈ L into reals ∇ψ(A) · U ∈ R. To evaluate this
gradient we refer to Eqns. 6.97, 6.129 and 6.138 which give

 Dψ(A)[U] = Dφ(G(A))[DG(A)[U]]
∇ψ(A) · U = ∇φ(G(A)) · DG(A)[U]
          = (DG(A))ᵀ[∇φ(G(A))] · U.

The arbitrariness of U implies

∇ψ(A) = (DG(A))ᵀ[∇φ(G(A))].

In regard to Eqns. 5.87 – 5.90 we mention the implicit function theorem; it asserts that if f : X ⊂
U → Y ⊂ U has an invertible derivative D f(x) ∈ L(U, U) at x (which linearly maps elements u ∈ U into
elements of the same space), then f has a smooth inverse function f⁻¹ : Y ⊂ U → X ⊂ U at x. Moreover
we can define the composite functions f⁻¹ ∘ f : X ⊂ U → X ⊂ U and f ∘ f⁻¹ : Y ⊂ U → Y ⊂ U such that
f⁻¹ ∘ f(x) = x and f ∘ f⁻¹(y) = y. Differentiating these composite maps via the chain rule reveals

I = D(f⁻¹ ∘ f)(x) = D f⁻¹(y) ∘ D f(x) |_{y = f(x)},

where I is the identity operator on U. Rearranging the above gives

D f⁻¹(y) |_{y = f(x)} = [D f(x)]⁻¹,                (6.140)

i.e. the derivative of the inverse function D f⁻¹(y) equals the inverse of the derivative [D f(x)]⁻¹ at the corresponding points y = f(x).
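For a concrete check of Eqn. 6.140, the Python sketch below (our own illustration) uses a simple map on R² with a closed-form inverse and compares a finite-difference Jacobian of the inverse with the inverse of the Jacobian.

import numpy as np

f     = lambda x: np.array([x[0] + x[1]**2, x[1]])         # an invertible map on R^2
f_inv = lambda y: np.array([y[0] - y[1]**2, y[1]])          # its closed-form inverse

def jacobian(g, x, eps=1e-6):
    # finite-difference derivative Dg(x) assembled column by column
    n = len(x)
    J = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = 1.0
        J[:, i] = (g(x + eps * e) - g(x)) / eps
    return J

x = np.array([0.7, -1.2])
y = f(x)
print(jacobian(f_inv, y))                  # D f^{-1}(y) ...
print(np.linalg.inv(jacobian(f, x)))       # ... equals [D f(x)]^{-1}, Eqn. 6.140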

6.4.3 Higher order derivatives


At the risk of beating a dead horse, we state again that the derivative of the function f : X ⊂ U → Y ⊂ V
at x, if it exists, is the linear operator D f(x) : U → V that eats elements u ∈ U and spits out elements
D f(x)[u] ∈ V making D f(x) ∈ L(U, V), i.e. a linear transformation from U to V. Now if D f(x) exists
for all x ∈ X then we can view D f as a function mapping elements of X ⊂ U into elements of L(U, V),
i.e. D f : X ⊂ U → L(U, V) (see footnote 6.2). In this case we can think about the derivative D² f(x) of the function D f
evaluated at x. This second derivative of f is defined analogously to the first derivative in Eqn. 6.124, but
now DD f replaces D f, D f replaces f, and L(U, V) replaces V. Indeed, the second derivative of f at x, if
it exists, is defined as

D² f(x)[u] = lim_{ε→0} (1/ε) [D f(x + ε u) − D f(x)] = (d/dε) D f(x + ε u) |_{ε=0} = (D f)′(x; u)                (6.141)

for all u ∈ U where we introduce the notation D² f(x) in place of DD f(x). Such derivatives arise when
we discuss the elasticity tensor, i.e. the second derivative of the strain energy function. Following the
development of Eqn. 6.128 we see that
D² f(x)[u] = D f(x + u) − D f(x) + o(|u|)                (6.142)

for all u ∈ U. Being a derivative, D² f(x) is a linear operator, which eats elements u ∈ U and spits out
elements D² f(x)[u] ∈ L(U, V), which themselves are linear operators, i.e. D² f(x) : U → L(U, V) and
hence for v ∈ U we have (D² f(x)[u])[v] ∈ V, which we express more compactly as

D² f(x)[u, v] = (D² f(x)[u])[v]
              = (DD f(x)[u])[v]                (6.143)
              = D(D f(x)[u])[v],

where the second line follows from the notation D² f(x) = DD f(x) and the third by viewing u as a constant
in the differentiation, e.g. for the smooth function f : R → R we have (d²f/dx²)(x) u = (d/dx)((df/dx)(x) u). Based on this
discussion we can write either D² f(x) : U → L(U, V) or D² f(x) : U × U → V. In the latter interpretation
it can be shown that the map D² f(x) : U × U → V is both bilinear and symmetric in the sense that
D² f(x)[α u + β v, w] = α (D² f(x)[u, w]) + β (D² f(x)[v, w]),
D² f(x)[w, α u + β v] = α (D² f(x)[w, u]) + β (D² f(x)[w, v]),                (6.144)
D² f(x)[u, v] = D² f(x)[v, u],
6.2 Note that this D f : X ⊂ U → L(U, V) is not generally linear, unlike D f(x) : U → V, which by definition is always linear.

where w ∈ U and α, β ∈ R. Of course the second equality follows from the other two. The first bilinearity
equality follows from Eqn. 6.141. To obtain the symmetry equality we use Eqns. 6.128 and 6.142, i.e.
D² f(x)[u, v] = D(D f(x)[u])[v]                (6.145)
              = D f(x + v)[u] − D f(x)[u] + o(|v|)
              = {f(x + v + u) − f(x + v) + o(|u|)} − {f(x + u) − f(x) + o(|u|)} + o(|v|)
              = {f(x + u + v) − f(x + u) + o(|v|)} − {f(x + v) − f(x) + o(|v|)} + o(|u|)
              = D f(x + u)[v] − D f(x)[v] + o(|u|)
              = D(D f(x)[v])[u]
              = D² f(x)[v, u].
Continuing in this manner we can define still higher order derivatives and this progression leads to the
function classification discussion of Sect. 1.5.
Example 6.15. Refer to Exam. 6.7 and determine the second derivatives of the tensor invariants. In all cases the
second derivatives D²ι_j(A) are fourth-order tensors, i.e. D²ι_j(A) ∈ L⁴, which follows from the fact that for the scalar
valued ι_j : L → R we have ∇ι_j : L → L and hence D²ι_j(A) : L → L, i.e. D²ι_j(A) ∈ L⁴. Since ∇ι1(A) = I, a constant, we trivially
have

D²ι1(A) = O.                (6.146)

From Eqn. 6.131, i.e. ∇ι2(A) = ι1(A) I − Aᵀ, the product rule and noting that the derivative of a sum is the sum of the
derivatives we obtain

D²ι2(A)[U] = Dι1(A)[U] I − T[U]
           = (∇ι1(A) · U) I − Uᵀ
           = (I · U) I − Uᵀ
           = (I ⊗ I − T)[U],

where we utilized Eqns. 6.59 and 6.91 and the trivial result that for F : L → L such that F(A) = Aᵀ = T[A] we have
DF(A) = T. Utilizing the arbitrariness of U gives

D²ι2(A) = I ⊗ I − T.                (6.147)

Equations 6.132 and 6.135 render the last result

D²ι3(A) = det A T ∘ (A⁻¹ ⊗ A⁻ᵀ − A⁻¹ ⊠ A⁻ᵀ).                (6.148)
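A quick numerical check of Eqn. 6.147 (our own illustration): applied to U the second derivative should satisfy D²ι2(A)[U] = (tr U) I − Uᵀ, which we compare with a difference quotient of ∇ι2.

import numpy as np

def grad_i2(A):                                  # grad iota2(A) = tr(A) I - A^T, Eqn. 6.131
    return np.trace(A) * np.eye(3) - A.T

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
U = rng.standard_normal((3, 3))
eps = 1e-6

fd = (grad_i2(A + eps * U) - grad_i2(A)) / eps   # difference quotient of the gradient, cf. Eqn. 6.142
exact = np.trace(U) * np.eye(3) - U.T            # (I (x) I - T)[U], Eqn. 6.147
print(np.max(np.abs(fd - exact)))                # should be close to zero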

6.4.4 Partial derivatives


The component results of Eqns. 6.120 – 6.123 utilize partial derivatives. And for functions of two variables
f : R × R → R you know that the partial derivative ∂f/∂x1 (x1, x2) is just the usual derivative obtained by
viewing f as a function of only the first variable x1. And this interpretation is absolutely correct. Herein
we define the partial derivative of the function f : X ⊂ U → Y ⊂ V where X = X1 × X2 × ⋯ × Xn
is the n-fold set product of the open subsets Xi ⊂ Ui, i = 1, 2, …, n. Thus we have f(x) = y ∈ Y where
x = (x1, x2, …, xn) ∈ X with xi ∈ Xi ⊂ Ui. For example the function f : E × R → E eats the vector
scalar pair (a, α) ∈ E × R and spits out the vector f(a, α) ∈ E; in this case X = E × R, x1 = a ∈ X1 = E,
x2 = α ∈ X2 = R and f(a, α) ∈ Y = E.
The i-th partial derivative of f at x, if it exists, is the linear operator Di f(x) : Ui → V that eats
elements ui ∈ Ui and spits out elements Di f(x)[ui] ∈ V such that

Di f(x)[ui] = lim_{ε→0} (1/ε) [f(x1, x2, …, x_{i−1}, xi + ε ui, x_{i+1}, …, xn) − f(x1, x2, …, x_{i−1}, xi, x_{i+1}, …, xn)]
            = (d/dε) f(x1, x2, …, x_{i−1}, xi + ε ui, x_{i+1}, …, xn) |_{ε=0}, or                (6.149)
            = f(x1, x2, …, x_{i−1}, xi + ui, x_{i+1}, …, xn) − f(x1, x2, …, x_{i−1}, xi, x_{i+1}, …, xn) + o(|ui|)

for all ui ∈ Ui. Note the similarity of the above with Eqns. 6.124 and 6.128. We emphasize that these
partial derivatives are not necessarily scalars like those appearing in Eqns. 6.120 – 6.123. Indeed, for our
f : E × R → E example we have D1 f(a, α) that linearly maps vectors u ∈ E into vectors D1 f(a, α)[u] ∈ E
making D1 f(a, α) ∈ L a 2-tensor and D2 f(a, α) that linearly maps scalars u ∈ R into vectors D2 f(a, α)[u] ∈ E
making D2 f(a, α) ∈ E a vector. Of course the components of these partial derivatives can be expressed using
expressions such as those appearing in Eqns. 6.120 – 6.123.
Not surprisingly, if D f(x) exists then

Di f(x)[ui] = D f(x)[(0, 0, …, 0, ui, 0, …, 0)],                (6.150)

where ui appears in the i-th place, and conversely, if all of the partial derivatives exist at x, then

D f(x)[u] = Σ_{i=1}^{n} Di f(x)[ui],                (6.151)

where u = (u1, u2, …, un) ∈ U = U1 × U2 × ⋯ × Un.
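For the f : E × R → E example above, take for instance f(a, α) = α a (a hypothetical choice of ours, purely for illustration); then D1 f(a, α) = α I and D2 f(a, α) = a, and Eqn. 6.151 reassembles the full derivative, as the Python sketch below checks numerically.

import numpy as np

def f(a, alpha):                      # f(a, alpha) = alpha a
    return alpha * a

a = np.array([1.0, 2.0, -1.0]); alpha = 0.7
u = np.array([0.2, -0.1, 0.4]); beta = 0.3
eps = 1e-6

fd = (f(a + eps * u, alpha + eps * beta) - f(a, alpha)) / eps   # D f(a, alpha)[(u, beta)]
partials = alpha * u + beta * a                                  # D1 f [u] + D2 f [beta], Eqn. 6.151
print(fd, partials)                                              # the two vectors should agree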
If Di f(x) exists for all x ∈ X then we can define the second partial derivative Dij f(x) : Uj → L(Ui, V),
which, if it exists, is the linear map from Uj to L(Ui, V) such that

Dij f(x)[uj] = lim_{ε→0} (1/ε) [Di f(x1, x2, …, x_{j−1}, xj + ε uj, x_{j+1}, …, xn) − Di f(x1, x2, …, x_{j−1}, xj, x_{j+1}, …, xn)],
             = (d/dε) Di f(x1, x2, …, x_{j−1}, xj + ε uj, x_{j+1}, …, xn) |_{ε=0}, or                (6.152)
             = Di f(x1, x2, …, x_{j−1}, xj + uj, x_{j+1}, …, xn) − Di f(x1, x2, …, x_{j−1}, xj, x_{j+1}, …, xn) + o(|uj|).

Following the second derivative discussion, we can also treat the second partial derivative as the bilinear
map Dij f(x) : Ui × Uj → V that satisfies

Dij f(x)[u, w] = Dj(Di f(x)[u])[w],
Dij f(x)[α u + β v, w] = α (Dij f(x)[u, w]) + β (Dij f(x)[v, w]),                (6.153)
Dij f(x)[u, α w + β z] = α (Dij f(x)[u, w]) + β (Dij f(x)[u, z]),
Dij f(x)[u, v] = Dji f(x)[v, u]

for every u, v ∈ Ui, w, z ∈ Uj and α, β ∈ R. The first equality defines the bilinear map, the second two are
linearity results and the last is the symmetry, which follows from an argument similar to that of Eqn. 6.145.


• Use the definition of the function f : X ⊂ U → Y ⊂ V, e.g. how it acts on arbitrary vectors etc.
• Note the domain U and codomain V of the linear mapping D f(x) : U → V.
• Use your knowledge of U and V to note what D f(x) ∈ L(U, V) is, e.g. a vector, 2-tensor, etc.
• Use the definition of the derivative, product rule and chain rule to obtain an expression for D f(x)[u].
• Use scalar, dyadic, conjugate products etc. to factor u from your D f(x)[u] expression, and
• Use the arbitrariness of u to define D f(x).

Figure 6.13: Procedure to compute the derivative D f(x) : U → V of the function f : X ⊂ U → Y ⊂ V at
x ∈ X.

6.4.5 Differentiation summary


As seen from the above examples, it is not always easier to evaluate the derivatives via the component route.
Rather, to evaluate the derivative of the function f : X ⊂ U → Y ⊂ V at x ∈ X, i.e. the linear operator
D f(x) : U → V from U to V, it is recommended to follow the steps in Fig. 6.13.

6.5 The Euclidean Point Space


In our analyses we are interested in quantifying the motion of a body B subjected to loads. The body itself
is comprised of particles or points y in a Euclidean point space E, i.e. y ∈ E and B ⊂ E. Much ado can and
probably should be made about the Euclidean point space. However, we can bypass this discussion and cut
to the core of the matter. We define a single rectangular Cartesian coordinate system that consists of the
fixed basis {e1, e2, e3} and the fixed point o that we call the origin. In this way, we identify all points y in our
body via their position vectors x_y ∈ E relative to o and we associate the body B itself with the collection of
points, i.e. position vectors, that comprise the region Ω in which the body resides, i.e. Ω ⊂ E. We assume
that Ω is bounded, open and connected with a piecewise smooth boundary, i.e. surface, ∂Ω which has a unit
outward normal vector n. The collection of points, i.e. position vectors, that form the surface define the set
∂Ω, and the union of all such points, i.e. position vectors, in Ω and ∂Ω define the closure Ω̄ of Ω.
Functions defined on the body, i.e. Ω, are referred to as fields. Examples of such functions are the vector
displacement field u : Ω ⊂ E → E and the tensor stress field T : Ω ⊂ E → L.

6.6 The Localization Theorem


We now formalize the localization argument that we discussed in Eqns. 4.6 – 4.8. If f : R → R is continuous,
then the mean value theorem gives ∫_{x−ε}^{x+ε} f(y) dy = 2 ε f(x̄) where x̄ ∈ [x − ε, x + ε]. Rearranging this result
as f(x̄) = 1/(2 ε) ∫_{x−ε}^{x+ε} f(y) dy and taking the limit as ε → 0 gives f(x) = lim_{ε→0} 1/(2 ε) ∫_{x−ε}^{x+ε} f(y) dy. To
see this note that

lim_{ε→0} | f(x) − 1/(2 ε) ∫_{x−ε}^{x+ε} f(y) dy | = lim_{ε→0} 1/|2 ε| | ∫_{x−ε}^{x+ε} [f(x) − f(y)] dy |
                                          ≤ lim_{ε→0} 1/|2 ε| ∫_{x−ε}^{x+ε} | f(x) − f(y) | dy
                                          ≤ lim_{ε→0} 1/|2 ε| ∫_{x−ε}^{x+ε} max_{y∈[x−ε,x+ε]} | f(x) − f(y) | dy
                                          = lim_{ε→0} max_{y∈[x−ε,x+ε]} | f(x) − f(y) |
                                          = 0,                (6.154)

where the continuity of f is invoked to justify the final limit.


This one-dimensional result generalizes to smooth scalar, vector and tensor fields giving the localization
theorem, e.g. for f : Ω ⊂ E → E we have

f(x) = lim_{ε→0} 1/Vol(N_ε(x)) ∫_{N_ε(x)} f(y) dv_y,                (6.155)

where N_ε(x) ⊂ Ω is the sphere with radius ε and center x, cf. Fig. 1.1, and Vol(N_ε(x)) = ∫_{N_ε(x)} dv is the volume
of the subregion N_ε(x) ⊂ Ω. This theorem is used repeatedly in the sequel. Indeed, if we have ∫_P f(y) dv = 0
for all subregions P ⊂ Ω then we deduce f(x) = 0 for all x ∈ Ω by selecting P = N_ε(x) and taking the
limit as ε → 0.

6.7 The Divergence and Divergence Theorem


The conservation laws we will discuss are most naturally stated in terms of the entire body, e.g. the sum of
the forces acting on a body equals the time rate of change of the linear momentum of that body. While this
proves useful, equally useful are the local conservation equations, which must be satisfied for each location
in the body. To arrive at these local statements, we make extensive use of the localization (cf. Sect. 6.6) and
divergence theorems.
For a smooth vector field f : Ω ⊂ E → E, we define the divergence as

div f(x) = tr Df(x)
         = (Df(x))_ii
         = ∂f_i/∂x_i (x1, x2, x3)
         = ∂f_1/∂x1 (x1, x2, x3) + ∂f_2/∂x2 (x1, x2, x3) + ∂f_3/∂x3 (x1, x2, x3),                (6.156)

where we refer to Eqns. 6.50 and 6.121. As seen above div f is a scalar field, i.e. div f : Ω ⊂ E → R. The
divergence of a smooth tensor field S : Ω ⊂ E → L is defined through the divergence of a vector field, i.e.
it is the vector field div S : Ω ⊂ E → E that satisfies

a · div S(x) = div(Sᵀ(x) a)                (6.157)

for every uniform vector a.
To obtain the component representation of the above we express x = x_i e_i, cf. Eqn. 6.10, and introduce
the component functions S_ij : R³ → R to write S(x) = S_ij(x1, x2, x3) e_i ⊗ e_j. Applications of Eqns. 6.10,
6.37 and 6.41 gives
Sᵀ(x) a = (Sᵀ(x))_ij a_j e_i
        = (S(x))_ji a_j e_i
        = S_ji(x1, x2, x3) a_j e_i                (6.158)

from which we identify (Sᵀ(x) a)_i = S_ki(x1, x2, x3) a_k (where we replaced the dummy index j with k). This
fact combined with Eqn. 6.121 provides
[D(Sᵀ(x) a)] = [ ∂S_k1/∂x1 a_k   ∂S_k1/∂x2 a_k   ∂S_k1/∂x3 a_k ]
               [ ∂S_k2/∂x1 a_k   ∂S_k2/∂x2 a_k   ∂S_k2/∂x3 a_k ]
               [ ∂S_k3/∂x1 a_k   ∂S_k3/∂x2 a_k   ∂S_k3/∂x3 a_k ]

(the arguments (x1, x2, x3) have been dropped in the array for brevity), i.e.

D(Sᵀ(x) a) = ∂S_ki/∂x_j (x1, x2, x3) a_k e_i ⊗ e_j,                (6.159)

where we use the fact that a is uniform. Taking the trace of the above provides the divergence of Sᵀ(x) a,
div(Sᵀ(x) a) = tr D(Sᵀ(x) a)
             = tr [ ∂S_ki/∂x_j (x1, x2, x3) a_k e_i ⊗ e_j ]
             = ∂S_ki/∂x_j (x1, x2, x3) a_k (e_i · e_j)
             = ∂S_ki/∂x_j (x1, x2, x3) a_k δ_ij
             = ∂S_ki/∂x_i (x1, x2, x3) a_k
             = ∂S_ki/∂x_i (x1, x2, x3) (a · e_k)
             = ∂S_ki/∂x_i (x1, x2, x3) e_k · a,                (6.160)

which follows from Eqns. 6.7, 6.11, 6.20, 6.50 and 6.52. Equations 6.157 and 6.160 with the arbitrariness
of a finally gives
of a finally gives
S i j
ei
x j

S 1 j

S 2j j

x j

S 3 j

x
j
S
S 12
11

x1 + x2 +

S 21 S 22
=

x1 + x2 +

S 31 + S 32 +
x1
x2

divS =

299

S 13
x3
S 23
x3
S 33
x3

(6.161)

mathematical preliminaries
where the arguments have been dropped for conciseness.
It can also be verified that

div(φ f) = φ div f + ∇φ · f,        div(Sᵀ f) = f · div S + S · Df,                (6.162)

where φ : Ω ⊂ E → R is a smooth scalar valued function and the argument x ∈ Ω ⊂ E has been suppressed
for conciseness. For example, to verify the second equality we use the above definitions, the product rule
(cf. Eqn. 6.133) and the tensor inner product (cf. Eqn. 6.55), i.e.

div(Sᵀ f) = div(Sᵀ f̄) + div(S̄ᵀ f)
          = f · div S + tr D(S̄ᵀ f)
          = f · div S + tr(S̄ᵀ Df)
          = f · div S + S · Df,                (6.163)

where the over line ( ̄ ) means the function is treated as uniform for the differentiation.
For g : R → R a smooth function you are comfortable with the result that ∫_{x1}^{x2} g′(x) dx = g(x2) − g(x1),
i.e. we transform an integral in the domain (x1, x2) to its boundary {x1, x2}. This divergence theorem result
generalizes to the smooth vector and tensor fields f : Ω ⊂ E → E and S : Ω ⊂ E → L defined over the
region Ω with boundary ∂Ω having outward unit normal vector n, as follows

∫_Ω div f dv = ∫_∂Ω f · n da,        ∫_Ω div S dv = ∫_∂Ω S n da,                (6.164)

where again the argument x ∈ Ω ⊂ E has been suppressed for conciseness. The first result can be found in
any vector calculus book. To obtain the second equality consider ∫_Ω a · div S(x) dv for a an arbitrary uniform
vector and use Eqn. 6.157 and the first equality, cf. Exer. 6.59.
Upon applying the localization theorem, cf. Eqn. 6.155, to the left hand side of Eqn. 6.164 we obtain
the physical interpretation of the divergence fields

div f(x) = lim_{ε→0} 1/Vol(Ω_{ε,x}) ∫_{Ω_{ε,x}} div f dv = lim_{ε→0} 1/Vol(Ω_{ε,x}) ∫_{∂Ω_{ε,x}} f · n da,
div S(x) = lim_{ε→0} 1/Vol(Ω_{ε,x}) ∫_{Ω_{ε,x}} div S dv = lim_{ε→0} 1/Vol(Ω_{ε,x}) ∫_{∂Ω_{ε,x}} S n da,                (6.165)

where Ω_{ε,x} ⊂ Ω is the spherical region centered at x with radius ε, boundary ∂Ω_{ε,x}, and outward unit surface
normal vector n. We see that the divergence is the limiting value of the net flux f · n or S n over the boundary of
the sphere per unit volume. This is seen in Fig. 6.14 for the function f(x) = (e1 · x)³ e1 + (e2 · x)³ e2 = x1³ e1 + x2³ e2
for which div f(x) = 3 (e1 · x)² + 3 (e2 · x)² = 3 (x1² + x2²). At x = 0 the net flux entering the sphere is zero
because div f(0) = 0 whereas for x = e1 + e2 the net flux is leaving the sphere because div f(x) = 6 > 0.
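The flux interpretation of Eqn. 6.165 can be checked by Monte Carlo integration of the boundary flux over a small sphere; the Python sketch below (our own illustration, with approximate output due to sampling noise) does so for the field f(x) = x1³ e1 + x2³ e2 used in Fig. 6.14.

import numpy as np

def divergence_estimate(center, radius=0.1, n_samples=100000, seed=5):
    # estimate div f(center) as (surface area / volume) * average of f . n over a small sphere, cf. Eqn. 6.165
    rng = np.random.default_rng(seed)
    n = rng.standard_normal((n_samples, 3))
    n /= np.linalg.norm(n, axis=1, keepdims=True)             # unit outward normals
    p = center + radius * n                                    # points on the sphere boundary
    f = np.stack([p[:, 0]**3, p[:, 1]**3, np.zeros(n_samples)], axis=1)   # f = x1^3 e1 + x2^3 e2
    return 3.0 / radius * np.mean(np.sum(f * n, axis=1))

print(divergence_estimate(np.array([0.0, 0.0, 0.0])))          # approximately div f(0)       = 0
print(divergence_estimate(np.array([1.0, 1.0, 0.0])))          # approximately div f(e1 + e2) = 6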

6.8 The Group and Invariance


The following concepts come in handy when we discuss material symmetry. Recall that if A and B are
tensors, then their product A B is also a tensor, i.e. A B ∈ L. Using multiplication as our combination it can
be shown that for any orthogonal tensors A, B ∈ LOrth

G1: A B is an orthogonal tensor, i.e. A B ∈ LOrth,
G2: A⁻¹ is an orthogonal tensor, i.e. A⁻¹ ∈ LOrth,

and thus we say the set of orthogonal tensors LOrth forms a group in L. Note that because A and A⁻¹ are in
the group, the identity is necessarily in the group, i.e. I = A A⁻¹ ∈ LOrth. The above two group properties
are referred to as closure and the existence of an inverse element. It is hopefully apparent to you that the set of
0°, 90°, 180°, 270°, … rotations about the e3 axis, i.e. R_{e3}(n π/2) where n is an integer (cf. Eqn. 6.68), is also
a group; it is in fact a subgroup of LOrth, which we denote as G ⊂ LOrth (cf. Exer. 6.64).

Figure 6.14: Divergence illustration: (a) div f(0) = 0 and (b) div f(e1 + e2) = 6. The vectors depict the values
of f(x) = (e1 · x)³ e1 + (e2 · x)³ e2 = x1³ e1 + x2³ e2.

We say the set X ⊂ L is invariant under the group G ⊂ LOrth if Q A Qᵀ ∈ X for all A ∈ X and all Q ∈ G.
A scalar valued function φ : X ⊂ L → R is invariant under G if X is invariant under G and φ(Q A Qᵀ) = φ(A)
for all A ∈ X and all Q ∈ G. Similarly, a tensor valued function T : X ⊂ L → L is invariant under G if X is
invariant under G and T(Q A Qᵀ) = Q T(A) Qᵀ for all A ∈ X and all Q ∈ G. Differentiating these equalities
gives (Q ⊠ Q)ᵀ[∇φ(Q A Qᵀ)] = Qᵀ ∇φ(Q A Qᵀ) Q = ∇φ(A), (Q ⊠ Q)ᵀ ∘ D²φ(Q A Qᵀ) ∘ (Q ⊠ Q) = D²φ(A) and
DT(Q A Qᵀ) ∘ (Q ⊠ Q) = (Q ⊠ Q) ∘ DT(A), i.e. relations for the derivatives of invariant functions, cf. Exer.
6.65.
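The invariants of Exam. 6.7 provide a concrete instance of such invariant scalar functions: ι_j(Q A Qᵀ) = ι_j(A) for every orthogonal Q. The Python sketch below (our own illustration) checks this for the rotation R_{e3}(π/2) of the subgroup mentioned above.

import numpy as np

Q = np.array([[0.0, -1.0, 0.0],     # rotation by pi/2 about e3
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])

def invariants(A):
    return (np.trace(A),
            0.5 * (np.trace(A)**2 - np.trace(A @ A)),
            np.linalg.det(A))

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))
print(invariants(A))
print(invariants(Q @ A @ Q.T))      # identical values: the iota_j are invariant under L_Orth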

6.9 Exercises
6.1. Prove that a × b = 0 only if a, b ∈ E are linearly dependent. Hint: Use property E7 to show that if
a × b = 0 then a · b = ±|a| |b|. Next assume a · b = |a| |b| and verify that | |b| a − |a| b | = 0, which
implies |b| a − |a| b = 0 and hence, a and b are linearly dependent. Now consider the cases when
a · b = −|a| |b|, a = 0 and b = 0.
6.2. Prove Eqn. 6.6.2. Hint: use Eqn. 6.5 and properties E2 .
6.3. Prove Eqn. 6.6.3, i.e. that [a, b, c] = 0 only if a, b and c are linearly dependent. Suppose this is
not the case, i.e. suppose a, b and c are linearly independent, then from Exer. 6.1 and the discussion
following Eqn. 6.4 we know that b × c ≠ 0. Next show that a, b and c are all orthogonal to b × c.
(Hint: use the given equality [a, b, c] = 0, Eqn. 6.5 and properties E4 and E6.) Now since a, b and c
are linearly independent they form a basis for E and hence b × c = α a + β b + γ c for some α, β, γ ∈ R
that are not all zero. Now consider the equality (b × c) · (b × c) = (α a + β b + γ c) · (b × c) and show
that the left and right hand sides are nonzero and zero, respectively. (Hint: use properties E2 and E3
and the above orthogonality results.) This contradiction implies a, b and c are linearly dependent.
6.4. In regard to the equality a = a1 e1 + a2 e2 + a3 e3 of Eqn. 6.9
(a) Prove that this expression exists. Hint: Use the fact that {e1, e2, e3} is a basis for E and hence
what does this imply about the four vectors e1, e2, e3, a ∈ E?

(b) Prove the components a1, a2 and a3 are unique. Hint: express a via the different components ā1,
ā2 and ā3, subtract the two expressions and use the fact that {e1, e2, e3} is a basis for E.
6.5. Evaluate δ_jj.
6.6. Prove e1 = e2 × e3, e2 = e3 × e1, and e3 = e1 × e2 without using Eqn. 6.15.
(a) Express e2 × e3 in terms of its components, cf. Eqn. 6.12.

(b) Next show that e2 × e3 = [e1, e2, e3] e1. Hint: Use Eqns. 6.5 and 6.6.

(c) Now verify the equality |e2 × e3|² = 1. Hint: use property E7 and the fact that {e1, e2, e3} is an
orthonormal basis for E, cf. Eqn. 6.7.

(d) Combine the previous two results to show that [e1, e2, e3] = ±1.

(e) Insert this result into the second to obtain e2 × e3 = ±e1.

(f) Finally restrict yourself to an orthonormal right-handed basis for which [e1, e2, e3] = 1.

(g) Repeat this process to obtain the other equalities.


6.7. Prove Eqn. 6.27.
6.8. Prove Eqn. 6.28.
6.9. Justify the matrix representation of Eqn. 6.30. Hint: evaluate the components
(a ⊗ b)_ij = e_i · (a ⊗ b) e_j.
6.10. Prove Eqn. 6.32.
6.11. Prove that in general A B ≠ B A for A, B ∈ L. Hint: provide a counterexample.
6.12. Prove Eqn. 6.41.
6.13. The tensor P is a perpendicular projection if PT = P and P2 = P, cf. Fig. 6.4.
(a) Verify that 0, I, e ⊗ e and I − e ⊗ e, with e a unit vector, are perpendicular projections.

(b) Illustrate the action of the above four tensors acting on the arbitrary vector a, i.e. illustrate the
vector a and P a.
(c) Continuing along the same vein, illustrate the vector a and P2 a and explain the significance of
the requirement that P2 = P.
(d) Show that all perpendicular projections take the form of the above four tensors. Hint: Represent
the symmetric P via the spectral decomposition theorem and use the fact that P2 = P to show
that eigenvalues of P equal either zero or one.
6.14. Express tr(A B C) using indicial (component) notation for A, B, C L.
6.15. Evaluate the invariants of the tensor A = a ⊗ b where a, b ∈ E.
6.16. Prove Eqn. 6.52.
6.17. Verify the equality trDev(A) = 0, cf. Eqn. 6.53.
6.18. Prove Eqn. 6.57.
6.19. Prove Eqn. 6.59.
6.20. Refer to Eqn. 6.59 and
(a) Prove that if S · T = 0 for every symmetric tensor S then T is a skew tensor.

(b) Prove that if W · T = 0 for every skew symmetric tensor W then T is a symmetric tensor.
6.21. Prove Eqn. 6.64.
(a) Prove (A B)⁻¹ = B⁻¹ A⁻¹. Hint: replace A with A B in Eqn. 6.62 and refer to the definition of tensor
composition.
(b) To prove (A⁻¹)ᵀ = (Aᵀ)⁻¹ you might consider
i. Proving Aᵀ is invertible if A is invertible. Hint: use Eqn. 6.52.
ii. Next using Eqn. 6.62 with both A and Aᵀ to obtain A A⁻¹ = I = (Aᵀ)⁻¹ Aᵀ.
iii. Finally, taking the transpose of I = A A⁻¹ in the above and applying Eqn. 6.41.
6.22. Some authors define an orthogonal tensor as only preserving length, i.e. Eqn. 6.65 is replaced by
(Q a) · (Q a) = a · a. Prove that this condition renders Eqn. 6.65, i.e. (Q a) · (Q b) = a · b. Hint: replace
a with a − b and expand the defining equality (Q (a − b)) · (Q (a − b)) = (a − b) · (a − b).
6.23. Given the tensor A ∈ L prove that there is an a ≠ 0 ∈ E such that A a = 0 if and only if det A = 0.
(a) To prove necessity, i.e. A a = 0 implies det A = 0, see the discussion following Eqn. 6.61.
(b) To prove sufficiency, i.e. det A = 0 implies there exists an a ≠ 0 ∈ E such that A a = 0, use Eqn.
6.47 to show that [A e1, A e2, A e3] = 0 and then apply Eqn. 6.6 to show that A e1, A e2, and A e3
are linearly dependent. Finally use Eqns. 6.1 and 6.22 to define a.
6.24. Verify that the reflection of Eqn. 6.67 is
(a) symmetric,
(b) orthogonal, and
(c) an involution.
6.25. Verify that the tensor R of Eqn. 6.68 is a rotation.
6.26. Prove that if R is a rotation, then (R a) (R b) = R (a b) for arbitrary vectors a and b, i.e. prove that
R = R.
6.27. If (λ, v) is an eigenpair of A prove that (λ, α v) for any (nonzero) scalar α is also an eigenpair, and
hence eigenvectors can always be scaled to be unit vectors.
6.28. The product P⁻¹ A P is a similarity transformation of the tensor A where P is an invertible tensor.
Such operations appear in coordinate transformations, amongst other things.
(a) Verify that the principal invariants of P⁻¹ A P equal those of A and hence the terminology invariants.
(b) Determine the relationship between the eigenpairs of P⁻¹ A P and A.
6.29. Verify Eqn. 6.80.
6.30. Verify that the eigenvalues of a positive definite symmetric tensor S are positive.

2
1 3

6.31. Evaluate the eigenpairs of the tensor A = 3 2 1 .

2 1 2

1 3 3

6.32. Evaluate the right polar decomposition of the tensor A = 4 2 1 .

2 6 1

6.33. Verify Eqn. 6.84.

6.34. Find a real eigenpair of a skew tensor W. Hint: Try w = Axial(W).


6.35. In Eqn. 6.43 we used the components of the skew tensor W to show that it only has three distinct components. To these ends
(a) Consider the arbitrary basis vectors e_i and e_j and justify the equalities e_i · W e_j = −e_j · W e_i and
e_i · W e_i = −e_i · W e_i = 0.

(b) You have just proved the 3 components W_ii (no sum) are zero and that W_ij = −W_ji for i ≠ j
and hence there are only three distinct components, cf. Eqn. 6.45. To find these remaining 3
components first prove that W has a zero eigenvalue, i.e. λ = 0. Hint: combine Eqn. 6.72 with
the above (with the eigenvector p replacing e_i).
(c) Let p be the unit eigenvector corresponding to this zero eigenvalue, i.e. (0, p) is an eigenpair of
W and use it to define the orthonormal basis {p, q, r}.

(d) Use the above results to show p · W q = p · W r = q · W p = r · W p = 0 and r · W q = −q · W r = ω
for some scalar ω.
(e) Finally combine the above results to verify the equality W = ω (r ⊗ q − q ⊗ r), cf. Eqn. 6.45. We
see one obvious component ω and two hidden components, i.e. the components of p. There
are only two components of p because the third component is determined by the |p| = 1 unit
length condition.
6.36. In Eqn. 6.45 we used the components of the skew tensor W to show that it has an associated axial vector
w. To these ends use the results of Exer. 6.35 and let w = ω p and verify that W a = w × a for every
vector a. Hint: Express a via the orthonormal basis {p, q, r} and use Eqns. 6.11, 6.16 and 6.29.
6.37. Prove that v ⊗ u − u ⊗ v is a skew tensor with corresponding axial vector u × v for arbitrary vectors u
and v.
(a) Use Eqn. 6.41 to show (v ⊗ u − u ⊗ v)ᵀ = −(v ⊗ u − u ⊗ v).

(b) Use Eqn. 6.6 to show (v ⊗ u − u ⊗ v)(u × v) = 0 and hence (0, u × v) is an eigenpair of (v ⊗ u − u ⊗ v).
(c) Using the results of Exers. 6.35 and 6.36 you now know α u × v is the axial vector corresponding
to the skew tensor (v ⊗ u − u ⊗ v). To compute the scalar α evaluate v · [(v ⊗ u − u ⊗ v) u] = v · [(α u × v) × u].
Hint: use property E7 and Eqn. 6.29 on the left hand side and Eqns. 6.5 and 6.6 on the right hand
side.
6.38. Refer to Exer. 6.37 and verify the equalities (v ⊗ u − u ⊗ v) w = (u × v) × w = (u · w) v − (v · w) u for
arbitrary vectors u, v and w.
6.39. Find a real eigenpair of the rotation tensor R of Eqn. 6.68. Hint: Try p.
6.40. In Eqn. 6.68 we introduced a particular rotation R = p ⊗ p + (q ⊗ q + r ⊗ r) cos θ − (q ⊗ r − r ⊗ q) sin θ.
Prove that every rotation can be defined as such.
(a) From Qᵀ Q = I derive the equality Qᵀ (Q − I) = −(Q − I)ᵀ.

(b) Take the determinant of the above to prove det(Q − I) = 0.

(c) Use this det(Q − I) = 0 result to explain why λ = 1 is an eigenvalue of Q.

(d) Let p be the eigenvector corresponding to this unit eigenvalue, i.e. (1, p) is an eigenpair of Q
and use it to define the orthonormal basis {p, q, r}. Now prove that
i. p = Q p = Qᵀ p.
ii. p · Q p = 1.
iii. p · Q q = p · Q r = q · Q p = r · Q p = 0.

(e) You now have 5 components of Q, cf. Eqn. 6.35. To find the remaining 4 first prove the following
i. Q q · Q q = Q r · Q r = 1.
ii. Q q · Q r = 0.

(f) Justify the equalities Q q = α q + β r and Q r = γ q + δ r and provide definitions of α, β, γ and
δ. Hint: refer to Eqns. 6.9 and 6.11.
(g) Justify the equalities α² + β² = 1, γ² + δ² = 1, α γ + β δ = 0 and α δ − β γ = 1. Hint:
evaluate the three possible dot products between Q q and Q r and use the equality det Q = 1 =
[Q p, Q q, Q r] = [p, α q + β r, γ q + δ r].
(h) Verify that α = δ = cos θ and β = −γ = sin θ is a solution to the above four equations.

(i) Combine steps f and h to evaluate the remaining 4 components. Hint: cf. Eqns. 6.35, 6.36 and
6.68.

6.41. Verify Eqn. 6.95.


6.42. Verify Eqn. 6.96.
6.43. Verify Eqn. 6.99.
6.44. Verify that Tᵀ = T. Hint: Apply Eqns. 6.52, 6.55, 6.97, and 6.100 and as always, use the arbitrariness
of A, B ∈ L.
6.45. Verify the following
(a) C = PSym C if C possesses the first minor symmetry
(b) C = C PSym if C possesses the second minor symmetry and that
(c) If C possesses the major and either minor symmetry, then it possesses the other minor symmetry.
6.46. How many independent components does C have if it possesses major, first, and second minor symmetries?
6.47. Refer to Eqn. 6.105 and use the Kronecker delta δ_ij to express the components
(a) PSymi jkl
(b) PSkewi jkl
(c) PSphi jkl
(d) PDevi jkl
6.48. Refer to Eqn. 6.105 and evaluate the following
(a) PSym PSkew
(b) PSym + PSkew
(c) PSph PDev
(d) PSph + PDev
(e) PSym PSph
(f) Comment on your findings.
6.49. Verify Eqn. 6.108.
6.50. Refer to Eqn. 6.109 and verify that R = R ⊠ R is orthogonal if R ∈ L is a rotation.
6.51. Verify Eqn. 6.118.
6.52. Use the intermediate result of Exam. 6.7 to verify the equation
det(A + a ⊗ b) = det(A) [1 + b · (A⁻¹ a)].
Hint: Let ε = 1 and U = a ⊗ b, use your Exer. 6.15 results to show ι2((a ⊗ b) A⁻¹) = 0 and det(a ⊗ b) = 0
and apply Eqn. 6.59.
6.53. Evaluate the derivative of the function F : L → L where F(A) = Aᵀ A.
6.54. Evaluate the derivative of the function F : L → L where F(A) = tr(A) A.
6.55. Evaluate the derivative of the function F : L → L where F(A) = tr(A) B Aᵀ where B is a tensor.
6.56. Use the chain rule to evaluate the gradient and derivative of the maps
(a) φ ∘ f : E → R and

(b) g ∘ f : E → E

where φ : E → R, g : E → E, and f : E → E are differentiable functions.


6.57. Evaluate the first and second derivatives of the function φ : E → R such that φ(x) = |x|.
6.58. For the tensor field B : Ω ⊂ E → L and the uniform tensor A ∈ L verify that div(A B(x)) = A div B(x).
Hint: Apply the transpose and divergence definitions of Eqns. 6.40 and 6.157 multiple times.
6.59. Verify Eqn. 6.164.2.
6.60. Verify the equality

w · ∫_∂Ω (x − y) × (T(x) n(x)) da = ∫_Ω ( W · (div T(x) ⊗ (x − y)) + W · T(x) ) dv,

where w = Axial(W) is a uniform vector and T : Ω ⊂ E → L is differentiable. Hint: Refer to Eqns.
6.5, 6.6, 6.40, 6.44, 6.163 and 6.164.
6.61. Equation 6.164 assumes that the fields f and S are smooth everywhere in Ω. Now assume they are
smooth everywhere except across the singular surface Γ with unit normal vector m (cf. Fig. 6.15).
Under these conditions derive the following counterpart to Eqn. 6.164:

∫_Ω div f dv = ∫_∂Ω f · n da − ∫_Γ [[f]] · m da,
∫_Ω div S dv = ∫_∂Ω S n da − ∫_Γ [[S]] m da,                (6.166)

where the jump of the field f is defined

[[f(x)]] = f⁺(x) − f⁻(x)

with

f^±(x) = lim_{ε→0⁺} f(x ± ε m).

The jump [[S(x)]] is defined analogously to [[f(x)]]. Hint: Split the integrals on the left hand side into
two regions Ω⁺ and Ω⁻ and then apply Eqn. 6.164.

Figure 6.15: Surface of discontinuity.
6.62. Verify the group closure property G1 for the set of orthogonal tensors LOrth , cf. Sect. 6.8.
6.63. Show that the set of all rotations LRot is a subgroup of LOrth .
6.64. Show that the set of all rotations R_{e3}(n π/2), where n is an integer, is a subgroup of LOrth.
6.65. Assume that X is invariant under the group G ⊂ LOrth and that φ : X ⊂ L → R and T : X ⊂ L → L
are invariant under G and verify the following derivative relations:
(a) Q ∇φ(Qᵀ A Q) Qᵀ = ∇φ(A).
(b) (Q ⊠ Q) ∘ D²φ(Qᵀ A Q) ∘ (Q ⊠ Q)ᵀ = D²φ(A).
(c) DT(Qᵀ A Q) ∘ (Q ⊠ Q)ᵀ = (Q ⊠ Q)ᵀ ∘ DT(A).
