
Comput. Methods Appl. Mech. Engrg. 192 (2003) 3265–3283
doi:10.1016/S0045-7825(03)00350-5
www.elsevier.com/locate/cma

Artificial neural network as an incremental non-linear constitutive model for a finite element code
M. Lefik a, B.A. Schrefler b,*

a Department of Mechanics of Materials, Technical University of Łódź, Al. Politechniki 6, 93-590 Łódź, Poland
b Department of Structures and Transportation Engineering, University of Padua, Via Marzolo 9, 35131 Padova, Italy

* Corresponding author. Tel.: +39-49-827-5602; fax: +39-49-827-5604. E-mail address: bas@caronte.dic.unipd.it (B.A. Schrefler).

Received 12 May 2003; accepted 12 May 2003

Abstract

A back propagation artificial neural network (BP ANN) is proposed as a tool for numerical modelling of the constitutive behaviour of a physically non-linear body. The training process of the ANN using experimental data is discussed in detail and illustrated with an example. In particular, some difficulties in the constitutive description proposed in consistent, incremental form are discovered and two solutions are proposed to overcome them. Two numerical examples are presented. The first one deals with the modelling of elasto-plastic hysteresis, the second shows the application of ANN to the approximation of biaxial non-linear behaviour.
© 2003 Elsevier B.V. All rights reserved.
Keywords: Artificial neural network; Constitutive modelling; Finite element method

1. Introduction

A classical application of an artificial neural network (ANN) to constitutive modelling of concrete was originally proposed by Ghaboussi et al. in [7]. An improved technique of ANN approximation for the mechanical behaviour of drained and undrained sand is presented in [6]. In the state-of-the-art reviews [10,23–25] the role of neural computing in constitutive modelling is clearly pointed out. A similar approach is employed in [4,18,11,21] and many other papers. The interest of such an application of ANN in the case when the model is built directly from some available experimental data is obvious. In such a case an unknown conventional analytical constitutive description can be directly replaced with a suitably trained ANN. The source of knowledge for the ANN is then not a symbolic formula but a set of experimental data.

The essence of the ANN technique is to construct the mapping that attributes a given set of output vectors to a given set of input vectors.


When applied to the constitutive description, the physical nature of these input–output data is clearly determined by the measured quantities: strains–stresses or displacements–forces. We mention that in the case of substitution of the conventional description with a non-symbolic one, one usually constructs this new approximation by presenting the network with pairs element of the domain–element of the image of the considered, known constitutive operator. The neural "black box" operator, replacing an existing symbolic description, can be simpler in numerical manipulations, even as an element of a FE code, as is shown in [5,15]. A hybrid FE-ANN code is also described in [19,20]. The authors show that the insertion of the constitutive law in the form of a neural operator leads to some qualitative improvements in the application of FE in engineering practice. Namely, the ANN representation can be modified to reduce the error of the FE numerical experiment with respect to the real experiment. Our representation of the constitutive law with ANN is slightly different: it is incremental, while in [19,20] ε–σ functions are directly approximated.

The construction of a non-symbolic description of non-linear constitutive behaviour known from experiment, and the use of this representation in a finite element code, is the subject of the present paper. Earlier, in [5], we successfully incorporated into the HAMTRA FE code an ANN description of a one-dimensional functional dependence of sorption on relative humidity, known from experiment. The present article has been inspired by an engineering analysis of the mechanics of a bundle of super-conducting fibres for fusion devices. The stress–strain relation will be coded with an ANN and then used as part of a FE model.

In the paper we analyse results of a recent experimental investigation of the mechanical properties of a super-conducting cable, performed at the University of Twente and published in [16]. This research revealed a very complex irreversible, non-linear behaviour of the cable. The description of these experimental results in terms of classical rheology has been undertaken in [3] and in [13,14] by ourselves. The symbolic theoretical model is composed of two equally difficult steps: first the rheological scheme must be proposed, then its parameters should be identified to fit the measured data well. This procedure results in rather complicated formulae, the use of which inside a FE model seems to be questionable.

The method we propose in this paper, which employs the ANN technique, does not require any arbitrary choice of the constitutive model. The numerical parameters of the proposed description are easily and automatically defined. As will be shown, it can be incorporated in a very natural manner into any FE code. The presented method is, in our opinion, the shortest way from experimental research to numerical modelling.

In the subsequent sections we analyse in some detail the properties of the approximation of the constitutive law by ANN from the point of view of the application in FE computations. We then explain how the ANN is inserted into a FE code. The paper is completed with two examples of practical applications: the first one is very simple, one-dimensional; its advantage is that it can be compared directly with the experimental data that have been used before to train the network. The second one is two-dimensional and describes the non-linear mechanical behaviour of the same super-conducting coil under tension–compression cycles.

2. Neural network for constitutive modelling

A neural network can be considered as a collection of simple processing units (nodes, artificial neurons) that are mutually interconnected with variable weights. This system of units is organised to transform a set of given input signals into a set of given output signals. The transformation is organised as follows: each node of the network first computes its activation as a weighted sum of the incoming signals. Then the node transforms its activation by the non-linear, usually sigmoid, transfer function (2) and sends it to every connected node. Both the input to the network and the output from the network are suitably defined to possess the needed physical interpretation. In our case this is a sequence of corresponding values of stresses and strains.


A functional dependence between input and output (if it exists) can be approximated by the network with an arbitrary precision. It is proven that an ANN with sigmoid transfer functions can be regarded as a universal approximator of a continuous function of many variables. The proof can be found in [2] and in many other papers quoted there.

The pair: given input–known output (target) forms an input pattern. After each forward transmission of the input signal through the network the transformed signal is compared with the target value and the error is computed. The weights of the connections are modified to reduce the total error between the current network response and the corresponding target. This process is called training or learning. In this paper the ANN is trained by means of the back propagation (BP) algorithm. This is in fact a method of computing the gradient of the square norm of the output error with respect to the weights. According to the maximum descent rule, the individual correction of the strength of connection w_ij is proportional to the ij-th component of the minus gradient. The proportionality factor is called the learning rate. We mention that all effective, iterative algorithms of minimisation can also be used instead, for example the Newton method, the conjugate gradient method and many others. The process of weight correction is continued until the difference between the neural network output and the desired, known output is minimised for the whole set of pairs: given input–known output.

The action of a classical ANN operator on a given input vector i is summarised by:

\Delta\sigma_i = o_i = \sum_r w^3_{ri}\, f\Big(\sum_q w^2_{qr}\, f\Big(\sum_p i_p\, w^1_{pq} + b^1_q\Big) + b^2_r\Big) + b^3_i .   (1)

In the above, parentheses f(·) enclose the argument of the function f, w^i denotes the matrix of adaptable weights of the connections between nodes belonging to the neighbouring i-th and (i+1)-th layers of neurones, and b^i is the vector of bias values attributed to hidden layer no. i. The bipolar sigmoid activation function f_i is attributed to neurone no. i in a hidden or output layer:

f_i(x) = p_i\, \frac{1 - \exp(-k_i x)}{1 + \exp(-k_i x)} .   (2)

The two parameters p and k can also be treated as variables, adaptable during the learning process for each neurone or for some groups of neurones (as described in [12]), but in the present application they are kept constant, as is usual in a typical BP ANN. The presence of the transfer function f assures that the transformation i → o is non-linear. In this paper, neural networks with, at most, two hidden layers are used in the simulations. The scheme of the neural network associated with Eq. (1) is presented in Fig. 1. The interested reader is referred to any of the textbooks [1,8,9,17,22] for details concerning the activity of nets and nodal units.

Fig. 1. Scheme of the neural network with hidden layers. The squares illustrate the action of the transfer function f on its argument (the activation of the neurone). The left-most circles, indicating the input layer, are the most deeply nested in Eq. (1). This chain of transformations interprets the approximation formula (1).

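To make the chain of transformations of Eqs. (1) and (2) concrete, the following minimal sketch (Python/NumPy) evaluates a two-hidden-layer network with the bipolar sigmoid. The array names, the random weights and the 4–5–5–1 architecture are assumptions for illustration only; this is not the authors' code.

```python
import numpy as np

def bipolar_sigmoid(x, p=1.0, k=1.0):
    # Eq. (2): f(x) = p (1 - exp(-k x)) / (1 + exp(-k x))
    return p * (1.0 - np.exp(-k * x)) / (1.0 + np.exp(-k * x))

def ann_forward(i_vec, W1, b1, W2, b2, W3, b3):
    """Evaluate Eq. (1): two hidden layers with bipolar sigmoid, linear output with bias.

    W1, W2, W3 are the three weight matrices, b1, b2, b3 the bias vectors
    (here hypothetical, e.g. obtained from a training run)."""
    a1 = bipolar_sigmoid(W1 @ i_vec + b1)      # first hidden layer
    a2 = bipolar_sigmoid(W2 @ a1 + b2)         # second hidden layer
    return W3 @ a2 + b3                        # output: the stress increment

# Example with random weights for a 4-5-5-1 architecture (input {eps, sigma, eta, d_eps})
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 5)), np.zeros(5)
W3, b3 = rng.normal(size=(1, 5)), np.zeros(1)
d_sigma = ann_forward(np.array([0.01, 0.5, 0.0, 0.001]), W1, b1, W2, b2, W3, b3)
```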

The structure of the input vector i is crucial for the problem and is discussed in Sections 2.1 and 2.2. We note here that the input and output data, from a formal point of view, form an ordered set of scalars but not necessarily a vector or tensor. Because of this we will use the term column matrix for both the input i and the output o. The set of all inputs (or all outputs) forms a rectangular matrix of dimension I × P (O × P), where I and O are the numbers of input and (respectively) output nodes. P is the number of patterns.

The non-symbolic model is constructed as follows: the neural network is first trained to reflect correctly the set of observed, experimental data. In current practice only a part of the available data is used as a training set, while the other part, a test set, is hidden from the network during training. It is used for current testing (during each epoch) of the correctness of the predictions of the network when presented with unknown data. The network's generalisation capability (interpolation between some data sets is a particular case of generalisation) enables us to predict the material behaviour, i.e. to produce the strain–stress graph for an arbitrary sequence of strain values. The network's simulation can be checked against the real experimental results at this step. Such a new set of experimental data is called a verification set. If the network's prediction is satisfactory, the model is ready; if not, the new experimental data can be added to the existing training set and the network should be taught again in the larger experimental context.

Mathematical models describing the relationship between stresses and strains consist of mathematical rules and expressions that explain the observed material behaviour. Such a symbolic description usually becomes very complex when it captures non-linear effects and accounts for different material behaviours in various ranges of stresses or strains. Usually the form of this description is postulated and then checked against a few (but carefully defined) experimental observations. The artificial neural network provides an alternative, non-symbolic approach to this problem. Since the neural operator is defined by learning from known experimental data, this is a knowledge-based rather than a speculative approach.

2.1. Constitutive description by a set of experimental curves in stress space

A constitutive relation can be represented by a set of curves in stress space, obtained (experimentally or numerically) for some given strain paths. An ANN can be trained to reproduce these curves and to interpolate between them. In this sense the ANN acts as the constitutive operator when presented with a vector containing strains and possibly some other state variables at the input. It means that the ANN can be constructed to approximate the family of functions σ_ij(ε) of a tensorial argument ε. Extrapolating the method used in our previous paper [5], the following description of the graphs of the constitutive relationships is possible: at the input layer we present the points defining a reasonably long segment on the graph and the independent variable of any other point on the curve. The value of the function corresponding to this last point is presented at the output of the network. The scheme of such an ANN with the shortest possible segment on the graph can be denoted (for 2D) 15–m–n–3, where the 15 neurones of the input layer take the values ε_i, ε_j, σ(ε_i), σ(ε_j), ε_k, while the three output neurones take the values of σ(ε_k) corresponding to the input value ε_k. Two hidden layers contain respectively m and n neurones. The Latin subscript denotes the number of the experimental point on the curve.

Taking into account the fact that the state of stress at the current point is a function of all components of the stress tensor (treated as state variables), the approximation applicable to a functional dependence should be reformulated. The following input pattern is thus well physically motivated:

\{\varepsilon_i,\ \varepsilon_j,\ \sigma(\varepsilon_i),\ \sigma(\varepsilon_j),\ \varepsilon_k ;\ \sigma(\varepsilon_k)\} .   (3)

The two elements in curly brackets represent the input and output column matrices. When we deal with hysteresis loops, an auxiliary point must be added to the input set to mark the segment of the curve.


Any other point on the curve (for example the j-th in the preceding formula) can take the role of this supplementary element of the input matrix, as can a parameter representing the cumulative work or energy dissipated up to the current point. This choice is discussed in [13].

The presented concept is very natural but, as we have shown in [12], the alternative, incremental description is simpler and more efficient. In this paper we will use exclusively the following scheme of approximation of the constitutive relationships. The input column matrix needed for (1) is always of the form:

i = \{\varepsilon_i,\ \sigma_i,\ \Delta\varepsilon_i\} \quad \text{or} \quad i = \{\varepsilon_i,\ \sigma_i,\ \eta_i,\ \Delta\varepsilon_i\} ,   (4)

where η is a scalar parameter, very important when we deal with irreversible processes; for soils it can take the role of porosity, for instance. The choice and physical interpretation of η is discussed in [13]. The output is always the stress increment Δσ. The pattern sequence for the network we propose in this paper will thus be of the form:

\{\varepsilon_i,\ \sigma_i,\ \eta_i,\ \Delta\varepsilon_i ;\ \Delta\sigma_{ab\,i}\} .   (5)

The operator Δ acting on any measured entity s can be defined as follows:

\forall j > i:\quad \Delta_j s_i = s_j - s_i \quad \text{or} \quad \Delta s_i = s_{i+1} - s_i .   (6)

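As an illustration of the incremental patterns of Eqs. (4)–(6), the sketch below (Python/NumPy) builds the input–target pairs {ε_i, σ_i, η_i, Δε_i ; Δσ_i} from a digitised experimental curve. The function and array names, and the choice of η as the cumulative strain-path length, are assumptions made here for illustration, not taken from the paper.

```python
import numpy as np

def build_incremental_patterns(eps, sig):
    """Build training patterns {eps_i, sig_i, eta_i, d_eps_i ; d_sig_i} (Eqs. (4)-(6)).

    eps, sig : arrays of strain and stress sampled along one experimental curve.
    eta is taken here as the cumulative length of the strain path (an assumed,
    illustrative choice of the scalar history parameter)."""
    d_eps = np.diff(eps)                      # forward increments, Eq. (6)
    d_sig = np.diff(sig)
    eta = np.concatenate(([0.0], np.cumsum(np.abs(d_eps))))
    inputs = np.column_stack([eps[:-1], sig[:-1], eta[:-1], d_eps])
    targets = d_sig.reshape(-1, 1)
    return inputs, targets

# Example: a fictitious digitised loading branch
eps = np.linspace(0.0, 0.07, 50)
sig = np.tanh(20.0 * eps)                    # placeholder "experimental" curve
X, T = build_incremental_patterns(eps, sig)
```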
Since in expression (5) we deal not only with increments but also with values of stress measured exactly at the i-th point of the constitutive curve, the choice between the forward, central or backward increment definition is not trivial. It can influence the process of numerical integration of the stress. The proposed choice seems to be very natural from the physical point of view.

It is to be noted that we can always build three (for the 2D case) independent networks with a single node at the output, instead of one network with the multi-nodal output described by Eq. (3). Each of the single nodes is interpreted as one component of the stress tensor, separately approximated. These three networks (5) are equivalent to the one with a multi-nodal output layer when trained with the same set of patterns. This is preferable especially when one of the tensor components is more difficult to approximate with the same precision than the two others.

One can observe that in the proposed representation of the constitutive law neither a yield surface nor a plastic potential is explicitly defined. However, the stress response to any strain input will never fall outside the admissible domain in stress space, since the network was trained only with admissible graphs. This approach is thus consistent with the traditional one, without defining any of the surfaces in stress space required by the classical elastic–plastic approach.

2.2. Influence of the length of the strain increment on the approximation quality

Expression (1) proposes the ANN as a tool for the approximation of curves resulting from experiment and defining a constitutive relationship. For the use of this approximation in the FE model it is important to have the increments Δε as small as possible, since they will serve for the approximation of a tangent stiffness matrix. Because of this, in formula (5) for each i only the neighbouring j will be used in the data set prepared for the training process. If the experimental data are not dense enough, they can always be smoothed, taking into account each three neighbouring experimental points. The training data can then be extracted from such an artificially modified set of experimental data by interpolation. This procedure has not been applied in the paper.

A much more important problem results from the fact that the strain increment takes the role of an independent variable in the representation (1). From the practice of approximation with ANN it is known that the interpolation is good only for those values of the independent variable that lie inside the segment of this variable used in training. The network prediction will thus be unacceptable both for shorter and for longer strain increments. An example provided below gives numerical evidence of this feature.


Two solutions can be proposed for this problem. The first one assumes that only the "best" increments of strains are used as elements of the input of the trained ANN in recall mode. The choice of the increment used in training as the best one is obvious (but not unique). The results for any shorter increment should thus be obtained by an a posteriori interpolation between results obtained from the ANN presented with the optimal increment. According to the second solution, an artificial subset of data, constructed for Δε = {0, 0, 0}, can be added to the training set. Expression (7₁) says that a zero value of the independent variable gives a zero increment of the dependent variable. Also some interpolated points for increments shorter than those given by the experiment can be used in the teaching phase. In Eq. (7₂) the linear interpolation with a < 1 is considered:

\{\varepsilon_i,\ \sigma_i,\ \eta_i,\ 0 ;\ 0\}, \qquad \{\varepsilon_i,\ \sigma_i,\ \eta_i,\ a\,\Delta_{\text{given}}\varepsilon_i ;\ a\,\Delta_{\text{given}}\sigma_i\} .   (7)

The first method affects the use of the ANN in the recall mode, the second improves the learning phase. We use both solutions throughout the paper.

2.3. Construction of a pattern set for the ANN representation of the constitutive law

It is obvious that the neural-like approximation of the true constitutive law must preserve the objectivity requirement. It means that a rigid rotation superposed on the deformation due to the applied load cannot influence the value of the approximate stress tensor related to the local, material co-ordinate system. It is easy to satisfy this requirement: the network must be presented with true, measured or computed data obtained for the rotated body. This is the unique way to force the neural-like representation to preserve objectivity: since the construction of the ANN is given, only the choice of weights (thus training with correct or "true" examples) can assure the invariance with respect to rigid rotations. In the paper we assume that we deal with infinitesimal transformations, thus the problem of objectivity of the stress increment is trivial. We add, however, a set of artificial data simulating the response of the investigated material when measured in a rotated co-ordinate system. We explicitly assume that we deal with an isotropic response in the cross section of the cable. This is a reasonable assumption for a composite with random spatial ordering of components, as in the studied case. We note that if the response were not isotropic, the information concerning its anisotropy should be given as a part of the experimental results, since our approach is knowledge based!

Let us denote by @ the action on the input data of the neural-like operator N defined by (1):

N\, @\, \{\text{input}\} = \{\text{output}\} .   (8)

In the context of formula (4) we must verify the following condition, imposed by the isotropy of the approximated constitutive law (T denotes transposition of the matrix):

\forall i:\ \text{if}\ N\, @\, \{\varepsilon_i,\ \sigma_i,\ \eta_i,\ \Delta\varepsilon_i\} = \{\Delta\sigma_i\}\ \text{then}\ \forall H:\ H H^T = 1\ \text{we have}\ N\, @\, \{H^T \varepsilon_i H,\ H^T \sigma_i H,\ \eta_i,\ H^T \Delta\varepsilon_i H\} = \{H^T \Delta\sigma_i H\} .   (9)

To satisfy condition (9) we must train the network with some supplementary data: the new subset of patterns is of the form:

\{H^T \varepsilon_i H,\ H^T \sigma(\varepsilon_i) H,\ \eta_i,\ H^T \Delta\varepsilon_i H ;\ H^T \Delta\sigma_{ab\,i} H\}_k .   (10)

The subscript k refers to the pattern obtained from the i-th experimental point by transformation with the k-th rotation matrix H. The total number K of these additional terms in the matrix of patterns depends on the number of trial rotations needed to train the network up to a satisfactory level of tolerance. We recall that the set of data also contains a number of supplementary patterns prescribed by (7).
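A minimal sketch of the augmentation prescribed by Eq. (10) is given below, written in Python/NumPy for the 2D case with in-plane rotations. The helper names and the choice of equally spaced rotation angles are assumptions for illustration only.

```python
import numpy as np

def rotate_tensor(t2x2, H):
    # H^T t H for a 2x2 tensor, as in Eqs. (9)-(10)
    return H.T @ t2x2 @ H

def augment_with_rotations(eps, sig, eta, d_eps, d_sig, n_rot=8):
    """Generate {H^T eps H, H^T sig H, eta, H^T d_eps H ; H^T d_sig H}_k, Eq. (10),
    for n_rot equally spaced in-plane rotations (an assumed sampling of H)."""
    patterns = []
    for k in range(n_rot):
        a = 2.0 * np.pi * k / n_rot
        H = np.array([[np.cos(a), -np.sin(a)],
                      [np.sin(a),  np.cos(a)]])
        patterns.append((rotate_tensor(eps, H), rotate_tensor(sig, H), eta,
                         rotate_tensor(d_eps, H), rotate_tensor(d_sig, H)))
    return patterns

# Example: one experimental point (2x2 symmetric tensors) replicated over 8 rotations
eps = np.diag([0.01, -0.003]); sig = np.diag([5.0, -1.0])
aug = augment_with_rotations(eps, sig, eta=0.0,
                             d_eps=np.diag([1e-3, -3e-4]), d_sig=np.diag([0.4, -0.1]))
```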


2.4. Modified algorithm of training

We propose the following, modified algorithm of supervised training of the ANN:

• Initiation of the weights and activation functions to assure the best linear transformation of i into Δσ (a small illustrative sketch of this initialisation is given after this list). The initial weights minimise the value of the distance

  |X_{bm} Y_{ma} - t_{br}\, i^T_{ra}\, (i_{ar}\, i^T_{ra})^{-1}| ,   (11)

  where r is the number of patterns and a, b, m are the numbers of input, output and hidden nodes. t and i are respectively the target and input vectors, X and Y are the matrices of weights of the synaptic links between the input layer–hidden layer and the hidden layer–output layer.
• Training of the network with only the experimental data (the part of the data resulting from trial rotations for objectivity is not taken into account).
• Expanding the network with two supplementary layers: after the input layer and before the output layer.
• Training of the network with only the part of the data resulting from trial rotations, needed to force the objectivity of the approximation. The weights of the previously trained main hidden layers are kept frozen during this step.
• Final correction of all the weights using the BP (delta-bar-delta) algorithm.

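The first item of this list, the linear initialisation, is made precise by Eqs. (12) and (13) below; as an illustration only (assumed array shapes and names, not the authors' code), it amounts to the following least-squares computation:

```python
import numpy as np

def best_linear_map(i_mat, t_mat):
    # Eq. (12): W = t i^T (i i^T)^{-1}, the best linear map from inputs to targets
    # i_mat: (n_inputs, n_patterns), t_mat: (n_outputs, n_patterns)
    return t_mat @ i_mat.T @ np.linalg.inv(i_mat @ i_mat.T)

def init_one_hidden_layer(i_mat, t_mat, n_hidden, rng=np.random.default_rng(0)):
    """Initialise X, Y so that X Y approximates W of Eq. (12), via Eq. (13).
    Requires the hidden width not to exceed the number of inputs (r <= a in the text)."""
    W = best_linear_map(i_mat, t_mat)
    Y = rng.normal(size=(n_hidden, i_mat.shape[0]))    # random hidden-to-input weights
    X = W @ Y.T @ np.linalg.inv(Y @ Y.T)               # Eq. (13): X = W Y^T (Y Y^T)^{-1}
    return X, Y

# example: 4 inputs, 1 output, 200 patterns, 3 hidden nodes
i_mat = np.random.default_rng(1).normal(size=(4, 200))
t_mat = 0.3 * i_mat[:1] - 0.1 * i_mat[1:2]
X, Y = init_one_hidden_layer(i_mat, t_mat, n_hidden=3)
```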
The introduced method of weight initialisation is not very important, but for some classes of constitutive relationships it results in an important economy in the number of iterations. It can be explained as follows. For the network without hidden layers the best linear approximation of the relation between the input vector i and the target t is assured by a matrix W_ba:

\text{if}\ W = t\, i^T (i\, i^T)^{-1}\ \text{then}\ \|t - W i\| = \text{minimum} .   (12)

For the network with one hidden layer the same relation (the same W) can be decomposed using two matrices of weights X_br, Y_ra such that X Y = W. For r ≤ a and a random choice of Y, the best choice of X is the following:

X = W\, Y^T (Y\, Y^T)^{-1} .   (13)

The same reasoning can be repeated for each supplementary hidden layer. It is seen that the network is constructed starting from the best linear approximation of the relation between the input and the target. For a large class of constitutive relationships this initial guess can be very close to the final approximation.

2.5. Verification of the quality of the approximation

The following criteria are mostly used in practice to estimate the quality of the approximation of a set of data with an ANN:

• Mean square of the error between the output generated by the ANN and the target prescribed in the set of training patterns (RMS). It is computed after each epoch for both the training set and the testing set of patterns.
• Statistical correlation between the output data prescribed in the training patterns and the output generated by the ANN. This is also computed for both test and training sets.

The definition of the RMS error is given by:

\text{RMS} = \sqrt{\frac{1}{P K} \sum_{p,i} (T_{pi} - \text{out}_{pi})^2} .   (14)


In (14) p is the current number of the input pattern, i is the number of the output node, and T_pi is a component of the target vector. P stands for the total number of patterns and K is the total number of output nodes. The set of test patterns is usually a subset of the whole data set. The patterns belonging to the test set are never presented to the network during training. If the structure of the ANN is well designed, the mean square error is as small as possible simultaneously for the training and testing sets, and similar for these two sets. It means that the network is sufficiently rich to reproduce well the given set of patterns, and that the number of nodes (thus of weights of the internodal connections) is not too great, since the generalised results are close to the test ones. Too great a number of neurones causes a pathological compliance of the network, and the interpolation of the testing patterns is then very bad.

In some applications of ANN inside a FE code [5] we realised that even a well designed (according to the above quoted criteria) ANN causes some errors. Analysis of this behaviour shows that another, stronger criterion should be introduced. Let us consider the generation of the learned curve by the ANN via a recurrent procedure in which the output values are used as elements of the input data for the next step:

• For a given initial point (ε_0, σ_0, η_0, Δε_0), the ANN generates the value of Δσ_{ab,0} at the output node.
• The input pattern for step i+1 contains the value σ_ab(ε_{i+1}) = σ_ab(ε_i) + Δσ_{ab,i} (Δσ_{ab,i} computed by the ANN in the previous step); the new value of η is η_{i+1} = η(ε_i + Δε_i).
• The output value Δσ_{ab,i+1} is used next for the generation of the subsequent point on the curve, as well as of the next input to the network.

We say that the ANN verifies the autonomous criterion if the curve generated in this recurrent manner, autonomously by the ANN, lies sufficiently close to the curve described by the set of training and test patterns. All graphs presented below show the autonomous behaviour of trained ANNs for some programme of incrementation of the independent variable. We use a step increment δ different from that used in the learning process.

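The autonomous criterion described above can be phrased as a small driver loop. The sketch below (Python, with a generic `ann` callable standing for the trained incremental operator of Eq. (5) — an assumption, since the paper's own code is not given) regenerates a curve recursively and measures an RMS-type distance in the spirit of Eq. (14) from reference data.

```python
import numpy as np

def autonomous_curve(ann, eps0, sig0, eta0, d_eps_seq):
    """Recursive (autonomous) generation of a stress path, as in Section 2.5.

    ann(eps, sig, eta, d_eps) -> d_sig is the trained incremental operator;
    its own outputs are fed back as part of the next input."""
    eps, sig, eta = eps0, sig0, eta0
    path = [(eps, sig)]
    for d_eps in d_eps_seq:
        d_sig = ann(eps, sig, eta, d_eps)    # network recall
        eps, sig = eps + d_eps, sig + d_sig  # update state with the network's own output
        eta += abs(d_eps)                    # assumed update of the history parameter
        path.append((eps, sig))
    return np.array(path)

def rms_error(generated_sig, target_sig):
    # RMS of Eq. (14), specialised to a single output component
    return np.sqrt(np.mean((np.asarray(generated_sig) - np.asarray(target_sig)) ** 2))

# usage with a toy linear "network" (placeholder for a trained ANN)
toy_ann = lambda eps, sig, eta, d_eps: 100.0 * d_eps
curve = autonomous_curve(toy_ann, 0.0, 0.0, 0.0, np.full(50, 1e-3))
```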
3. Simple examples of approximation of constitutive relationships with ANN

The purpose of this section is to illustrate the following observations:

• A network that is not correctly defined can show a pathological behaviour.
• An ANN trained with input containing patterns constructed according to Eqs. (5), (7) and (10) behaves better than one trained with a training set defined as in (3).
• The network, for a given path of strain increments, generates stress increments that allow us to trace curves very close to those used in training.

The above is illustrated with examples of one-dimensional hysteresis and a two-dimensional proportional loading path. The rheological-type constitutive relationship we are going to approximate by ANN in these examples is taken from [3]. It explains roughly the behaviour of the same super-conducting cable that will be finally analysed in the last section of the present paper. The considered model was built using a number of Prandtl elements, i.e. elastic springs connected in series with a dry friction slider. All Prandtl elements are connected in parallel. It is assumed that the common stiffness of all springs is constant and equal to E. The yield stress s_i E of the i-th dry friction element varies from element to element. It is interpreted as a random variable with a hypothetical probability density function. Thanks to the parallel contribution of different sliders, the very trivial elasto-plastic model represented by a single elastic spring–plastic slider transforms into a rich model that can be used to simulate a complex material response.


Our assumption concerning the probability distribution of s is different from that in [3] and is given by Eq. (15). A wide family of qualitatively different behaviours can be obtained by simply changing a few parameters of the probability density function defined below. This is the principal advantage of this model in the numerical experimentation:

p(s) = \text{Heaviside}(s - m)\, \frac{s - m}{v^2}\, \exp\!\Big(\frac{m - s}{v}\Big) .   (15)

Functions of loading, unloading and reloading are taken directly from [3] and are quoted below. Stresses are calculated according to (16)–(18) for the proposed probability density function (15). In loading, the strain-driven numerical experiment is continued until ε′; in unloading the strain reaches the value ε″:

\sigma_{lo} = E\Big(\varepsilon - \int_0^{\varepsilon} (\varepsilon - s)\, p(s)\, ds\Big) ,   (16)

\sigma_{un} = E\Big(\varepsilon - \int_0^{(\varepsilon' - \varepsilon)/2} (\varepsilon + s)\, p(s)\, ds - \int_{(\varepsilon' - \varepsilon)/2}^{\varepsilon'} (\varepsilon' - s)\, p(s)\, ds\Big) ,   (17)

\sigma_{re} = E\Big(\varepsilon - \int_0^{(\varepsilon - \varepsilon'')/2} (\varepsilon - s)\, p(s)\, ds - \int_{(\varepsilon - \varepsilon'')/2}^{(\varepsilon' - \varepsilon'')/2} (\varepsilon'' + s)\, p(s)\, ds - \int_{(\varepsilon' - \varepsilon'')/2}^{\varepsilon'} (\varepsilon' - s)\, p(s)\, ds\Big) .   (18)

We underline that in the above equations the stress is expressed by the total strain. The split of the total strain into plastic and elastic parts is taken into account during the development of the equations and is hidden from the resulting formulae. For the second example a superposition of two one-dimensional stretchings is taken into account. We consider only the loading phase in this case:

\sigma_{cd} = E_{cdab}\Big(\varepsilon_{ab} - \int_0^{\varepsilon_{ab}} (\varepsilon_{ab} - s)\, p(s; m, n)\, ds\Big) .   (19)

The interaction between the loading functions in the two directions could be accounted for by assuming that the probability density function in one direction is influenced by the strains in the second direction. It is, however, neglected in the present paper. The parameter n is a counterpart of m in the second direction. In Fig. 2a the stress responses are drawn for some proportional strain paths and for intermediary m and n, while in Fig. 2b the limit case for m tending to zero is presented. This last figure is similar to the Saint-Venant yield criterion.
Fig. 2. 2D loading path: some possible stress responses (σ11–σ22 plane) for proportional, strain-driven loading, created with Eq. (19); (a) intermediary m and n, (b) the limit case m → 0.

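Curves such as those of Figs. 2 and 3 can also be generated by a direct, strain-driven simulation of the parallel Prandtl assembly, without evaluating the closed-form integrals (16)–(18). The following sketch (Python/NumPy) is such an illustrative generator, not the authors' implementation; the discretisation of the slider strength density and the reconstructed form of Eq. (15) used in it are assumptions.

```python
import numpy as np

def prandtl_response(eps_path, E=1.0, m=0.01, v=0.02, n_sliders=400):
    """Strain-driven response of springs-with-sliders (Prandtl elements) in parallel.

    The slider yield strains s are sampled from a density of the form (15),
    p(s) ~ Heaviside(s - m) (s - m)/v^2 exp((m - s)/v)  (reconstructed, assumed here)."""
    s = np.linspace(m, m + 10.0 * v, n_sliders)
    w = (s - m) / v**2 * np.exp((m - s) / v)
    w /= w.sum()                              # discrete probability weights

    alpha = np.zeros(n_sliders)               # plastic strain of each Prandtl element
    stresses = []
    for eps in eps_path:
        trial = E * (eps - alpha)             # elastic trial stress in each element
        over = np.abs(trial) - E * s          # overshoot beyond the slider strength E*s
        alpha += np.sign(trial) * np.maximum(over, 0.0) / E   # 1D return mapping
        stresses.append(np.sum(w * E * (eps - alpha)))
    return np.array(stresses)

# loading-unloading-reloading path 0 -> 0.07 -> -0.07 -> 0.07 (as in the 7_-7_7 loops)
path = np.concatenate([np.linspace(0, 0.07, 100),
                       np.linspace(0.07, -0.07, 200),
                       np.linspace(-0.07, 0.07, 200)])
sig = prandtl_response(path)
```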

We would like to stress that the model (19) used in the example is probably not very realistic, but it is very suitable for numerical tests, since many different curves can be generated with it.

3.1. Numerical experiments with hysteresis

The proposed technique of constitutive modelling with ANN will be illustrated with an example of elastic–plastic hysteresis. This task is very specific: first of all, the experimental data are presented in the form of loops, i.e. not one-to-one mappings. Moreover, the whole family of hysteresis loops should be approximated, not one single branch. The ANN must thus interpolate not only between points but also between curves. Since the ANN will be incorporated into a FE code, the approximation must be relatively independent of the step size ‖Δε‖.

During the numerical experiments we tested the capability of learning the one-dimensional hysteresis of the ε–σ graphs for several types of neural networks. The description of a network is always of the form I–m–n–K (numbers of nodes in the layers). The symbols that start with 5 (5 nodes in the input layer) denote the ANN in the non-incremental form (3). Those starting with 3 nodes correspond to the incremental form (5), but without the auxiliary variable η. The symbol I–m–n–K/a means that the network with m and n nodes in the hidden layers was tested with a number of increments a times greater than in the training phase (the strain incremental step is about a times shorter). The description of the graphs in Fig. 3 includes three numbers interpreted as the values of the turning-point strains from formulae (16)–(18), multiplied by 100, according to the scheme: ε′_ε″_final ε. With d we have marked the loops presented to the network in training. The horizontal and vertical axes are strain and normalised stress, respectively.

We performed the following numerical experiments: training of ANNs of the type 5–m–n–1 and 3–m–n–1 with 13 different hysteresis loops. Five of these loops are reproduced in Fig. 3. The other loops are similar but non-symmetric, in the sense that the loading–unloading turning point is different from the unloading–reloading one. For each loop the increment of strains is quite uniform and varies from 0.00035 to 0.0007.

Fig. 3. Examples of training sets for the numerical test (loops d1: 7_−7_7, d2: 6_−6_6, d3: 5_−5_5, d4: 4_−4_4, d5: 3_−3_3). (a) The sampling points are marked only for the two extreme training data sets. (b) Results of the autonomous criterion test: the loops are drawn by the 3–3–3–1 ANN starting from (0, 0) with the given increment of strain.


Two kinds of autonomous criterion tests were then executed. First, the sufficiently trained ANNs were presented with three strain incremental programmes that were never used in training. For the second case of testing, the increment of strain was reduced two, three, five and ten times.

Results of these numerical experiments are illustrated below with some selected graphs. In all figures we collected the reproductions of the curves drawn by the ANN when the network starts from (0, 0) and then progresses by itself until the end of the loading–unloading–reloading loop. A similar, autonomous action will be used inside the FE code to update stresses at the end of each step of a Newton iterative solution (as will be defined in Section 4).

In Fig. 3a and b we see that the simple reproduction of the curves is very satisfactory even for a very small network (only three nodes in the hidden layers). However, when the step size decreases, the quality of the approximation is not good. We observe this in Figs. 4 and 5. Fig. 4a shows that when the test increment decreases, the graphs are attracted to the centre of the loops. This is quite natural, since the network is taught with all the loops, so in the case of unknown presented data it tries to interpolate in the best defined directions.

In Fig. 4 we show the test results for networks that were trained with an incomplete set of learning data. Namely, the training set was dominated by functions shifted to the left, because of a loading–unloading turning point smaller than the unloading–reloading one (data d10–d13 were eliminated intentionally from the data set). We observe that the shape of the graphs is concave in unloading. In Fig. 4a the reaction to this experiment is more distinct than in Fig. 4b. It means that the ANN with the input organised according to (3) (non-incremental) is more sensitive to a change of the training set than the incremental one.

The incremental networks trained with the extended data set (interpolated points on the curve and zero-increment data, according to (7)) are much less sensitive to the reduction of the step size. The agreement is still good for strain increments three and five times shorter than that used in training. Beyond this, however, the degradation is very quick: the 3–3–3–1/10 graphs suddenly bifurcate to a new shape. The graphs reproduced autonomously by the ANN, shown in Fig. 5, were never presented to the network in learning. The numerical experiments suggest that ANNs constructed for the incremental representation of a constitutive law (described by Eq. (5)) behave better than those described by Eq. (3).

Fig. 4. Results of the tests for networks with different input structures, with step sizes shorter than that used in training: (a) direct network (5–5–3–1, also with step reduced 2 and 3 times), (b) incremental network (3–3–3–1, also with step reduced 3 times).


Fig. 5. Results of the tests for curves that have never been presented to the network (3–3–3–1 network with the step reduced 3, 5 and 10 times). The degradation of the approximation starts for an increment ten times smaller than the one used in learning.

3.2. Two-dimensional problem

All observations made for the one-dimensional case are confirmed also for the two-dimensional example. In Fig. 6 the graphs illustrate the interpolation ability of an ANN of the type 5–9–6–2. The network drew curves that have never been used in the learning process. It is to be noted that the structure of such a network is analogous to that of the 3–3–3–1 network and is described in Section 2.2. In the input layer we have five neurones interpreted as follows: σ11, σ22, ‖Δε_cumulative‖, Δε11, Δε22. The two neurones at the output are valued with Δσ11, Δσ22.

Fig. 6. Testing results (σ11–σ22 plane) for the network with the input structure described in Section 2.2. The ANN draws curves that have never been presented to the network (generalisation). Continuous lines denote the true curves, markers the approximation by the ANN.


4. Implementation of a constitutive law represented by ANN inside a FE code

Let us suppose that the basic equation for the standard displacement-based finite element method can be written in the form:

0 = \int_V B_N^T : s\, dV - \int_S N_N^T\, t\, dS - \int_V N^T f\, dV , \qquad \varepsilon(\delta u) = B_N\, \delta u ,   (20)

where V is the volume of the body in the reference configuration and B_N is the matrix that, operating on the vector of admissible variations of the independent variables, gives the strain measure ε(δu) conjugate to the material stress s. In what follows we consider small transformations, thus the infinitesimal strain tensor. The index N denotes that B is constructed on the basis of the approximation of u by a set of interpolation functions N(x) on appropriate finite elements. On the right-hand side of (20), t is a stress vector given on the part S of the boundary, while f are body forces acting on the elementary volume dV. Since the considered material behaviour is non-linear, the Newton algorithm will be applied to solve the system of equations (20). The Jacobian of the left-hand side of (20) can be written as follows:

J = \int_V \big( ds : \varepsilon(\delta u) + s : d\varepsilon(\delta u) \big)\, dV_0 .   (21)

The first term under the integral (21) can be computed using the usual constitutive assumption:

ds = D : d\varepsilon , \qquad \text{where}\quad D_{ij} = \frac{\partial s_i}{\partial \varepsilon_j} .   (22)

We can rewrite the above equations, taking the variation with respect to the independent variables of the problem u and obtaining (by definition) the stiffness matrix K:

K^{\text{main}}_{MN} = \int_{V_0} B_M : D : B_N\, dV_0 .   (23)

The second term in the integral (21) represents the initial stress matrix. Using the assumed representation of the constitutive law by ANN we have, instead of (22):

ds = N_{\delta,\sigma}\, @\, d\varepsilon .   (24)

The index δ denotes that the network quality is best for some given value of the increment δ; σ means that the value of the stress increment is computed at the current value of s = σ. It is clear that we must replace the neural operator in (24) by the matrix D or, simply, construct this matrix using the given representation of the constitutive law. This will be done by trial incrementing of ε. Let us suppose that both tensors ds and dε are represented by column vectors:

d\sigma = \{ds_1,\ ds_2,\ ds_3\} = \{N_{\delta,\sigma}\,@\, d\varepsilon_1,\ N_{\delta,\sigma}\,@\, d\varepsilon_2,\ N_{\delta,\sigma}\,@\, d\varepsilon_3\} , \qquad d\varepsilon^t = \{d\varepsilon_1,\ d\varepsilon_2,\ d\varepsilon_3\} ,   (25)

D = d\sigma\, (d\varepsilon^t)^{-1} .   (26)

The matrix of trial vectors dε^t is always proportional to the strains at the last equilibrated point (σ, ε) of the Newton iteration process (preceding step). The trial vectors cannot be arbitrary, because N_{δ,σ} @ dε ≠ −N_{δ,σ} @ (−dε), and in fact two different tangent stiffness matrices can be defined at any point: one for loading and the other for unloading.


Fig. 7. Newton procedure with the use of the ANN representation of the constitutive law: loading branch, stress updating using self-iteration of the ANN (N @ dε_B), and unloading from point B′ (N @ (−dε_B)). The two branches of the curve, loading and unloading, correspond to positive and negative signs of the strain increment.

It is thus supposed that the loading (or unloading) is continued during the current increment in the Newton iterations. The formulae (25) and (26) are used here instead of computing the derivatives of the neural network with respect to the input values (the method proposed in [20]).

The stress in the second term in the integral (21) is computed using the neural network in the recall mode for a given, constant step dε, until the strain ε of the trial solution at the current step is reached. The ANN acts here in the autonomous activity mode, as defined in Section 2.5 and tested in Section 3. This process corresponds to the classical integration of the incremental constitutive equation for updating σ. It always starts at the last equilibrated point and the increment dε is proportional to the one defined for this step (loading or unloading). This is illustrated in Fig. 7.

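A minimal sketch of Eqs. (25) and (26) — building the tangent matrix D by probing the trained network with trial strain increments, and updating the stress by autonomous recall — is given below (Python/NumPy). Here `ann` is a generic stand-in for the trained operator N_{δ,σ}, and all names are assumptions for illustration, not the authors' code.

```python
import numpy as np

def tangent_matrix(ann, eps, sig, eta, d_eps_trials):
    """Eqs. (25)-(26): D = d_sigma (d_eps^t)^{-1}.

    ann(eps, sig, eta, d_eps) -> d_sig (vectors of stress/strain components);
    d_eps_trials: columns are the trial strain increments; in the paper they are
    proportional to the increment of the last equilibrated step (loading vs unloading)."""
    d_sig_cols = [ann(eps, sig, eta, d_eps_trials[:, k])
                  for k in range(d_eps_trials.shape[1])]
    d_sigma = np.column_stack(d_sig_cols)
    return d_sigma @ np.linalg.inv(d_eps_trials)

def update_stress(ann, eps, sig, eta, d_eps_total, n_sub=10):
    """Stress update by autonomous recall with a constant sub-increment (Section 4)."""
    d_eps = d_eps_total / n_sub
    for _ in range(n_sub):
        sig = sig + ann(eps, sig, eta, d_eps)
        eps = eps + d_eps
        eta = eta + np.linalg.norm(d_eps)    # assumed update of the history parameter
    return sig

# toy usage in 2D with a linear placeholder "network"
toy_ann = lambda eps, sig, eta, d_eps: np.array([[100.0, 10.0], [10.0, 80.0]]) @ d_eps
trials = 1e-3 * np.eye(2)
D = tangent_matrix(toy_ann, np.zeros(2), np.zeros(2), 0.0, trials)
```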
5. ANN-FE hybrid model of a bundle of super-conducting fibres

The use of the ANN representation in the numerical modelling of continua involves two main steps: first the ANN can be defined and successfully trained, then the FE code must be adapted to accept the material data in this form. The correct definition of the neural-like operator is discussed in Section 3. We skip, as too technical, the analysis of the process of optimisation of the ANN's topology for the application presented below for a super-conducting cable. The reader interested in this technique is referred to the textbooks [1,8,9,17,22]. The insertion of the ANN into a standard FE code is described in Section 4. In the present section we describe the approximation with an ANN of the given experimental data and an example of the use of this description in a FE model.

Cable-in-conduit conductors in components of fusion devices, such as for instance toroidal coils for the International Thermonuclear Reactor, may be regarded from a structural point of view as hierarchical composites. They are in fact made up of a large number of small fibres (super-conductors) grouped in clusters, and nested in a copper matrix strand. The strands in turn are grouped in petals and are bound together by an outer steel jacket. Due to the large number of repetitive strands, the whole cable can be considered as a homogeneous body on a macro scale. The constitutive relation for such a homogenised material is very difficult to deduce from the knowledge of the internal structure because of the complex geometry, the unilateral contact between strands and the geometrical non-linearity of its finite displacements.

Recently an experimental analysis, performed at the University of Twente and published in [16], confirmed that the mechanical behaviour of such a structure is very complex. The cable was pressed in the direction of its diameter and the displacement of the upper part of the steel jacket with respect to its bottom part was measured for 38 cycles of loading and unloading.

Fig. 8. ANN's prediction of two initial loops that were not presented to the network during training (ANN of the 4–5–5–1 type, i.e. with the input layer of the form {ε_i, σ_i, η_i, Δε_i}); horizontal axis: pattern sequence; legend: training set targets, training output, test output.

Fig. 9. Illustration of the use of the data collected from the experiment with a single cable to model a quasi continuous behaviour of a whole super-conducting beam.

For cyclic loading, complicated hysteresis loops in the displacement–force plane were measured. We observe a large initial irreversible settlement of the virgin cable followed by non-linear elastic behaviour. This behaviour is reproduced as a background for the results of the FE modelling in Figs. 8, 10 and 11.

Our numerical experiments show that a surprisingly small network learns well the constitutive relation between the applied force and the displacement. The number of neurones in the hidden layers was never higher than 6. The results presented in Fig. 8 are obtained for the network with two hidden layers of five neurones each. The correlation ratio for all tested networks was very good (of the order of 0.99), and the RMS error was very small (of the order of 0.02). The graphs of the reproduction of the test set are very close to the given target. The graph in Fig. 8 shows that the two first cycles are surprisingly well reproduced. These two cycles are qualitatively different from the others and the network had only one (the third) similar set of data at its disposal to learn this. The markers denoting the network's response for input never presented during the learning process are very close to

Fig. 10. Comparison of FE results with the experimental data (stress–strain axes; legend: loading program, experimental loading, experimental data, FE results). (a) Solid dots and the solid line are obtained from the displacement at the end of each finite element, re-scaled to strains to be comparable with the experimental graph. (b) The same data rearranged in the form of ε–σ loops.

the expected output, marked with the dotted line. Cumulated network outputs, i.e. the displacements f(x_i) obtained by summing the computed increments, expressed in μm, are reported along the vertical axis. When the two last cycles were hidden from the network during training, their reproduction by the ANN was even better than in Fig. 8, since these two last cycles are very similar to most of the learned loops. Fig. 8 proves well that the generalisation is correct. It means that the network acts like a model of the constitutive relation rather than as a tool for the storage of the experimental data. The network learns the constitutive law, not the numerical data. The observations made for the one-dimensional case are confirmed also for the two-dimensional example we are going to use inside the FE model in the sequel.

5.1. One-dimensional FE-ANN model

The experimental data concern the mean stress and the corresponding mean strains in the single strand. This is because the measured displacement (co-linear with the force) is apparently a measure of the response of the whole structure. Also in [3] the same interpretation is proposed. The investigated strand is a part of a larger super-conducting structure that can be considered as a homogeneous one due to the huge number of strands in the typical cross section. Without entering into details of the shape of this cross section, we can say that we have a constitutive law which is true in a mean sense for a periodic cell constructing the global super-structure. This is illustrated in Fig. 9.

Fig. 11. Comparison of FE results (2D) with the experimental data (legend: experimental force/FEM load program, experimental horizontal and vertical displacements, FEM vertical and horizontal displacements). (a) Bold dots and the bold line are obtained from the displacement at the end of a finite element, re-scaled to strains to be comparable with the experimental graph. (b) The same data rearranged in the form of ε–σ loops (experimental one-dimensional data vs. 2D FE solution).

We use the ANN representation of the constitutive equation defined above in our own research FE code. The true experiment was displacement driven (a kinematic load was applied to the sample). The numerical experiment we have done was force driven. We have prescribed a few levels of concentrated force acting in the vertical direction, as presented in Fig. 9. The definition of a kinematic loading programme equivalent to the one carried out in the laboratory is difficult because of the rapid drop of force in unloading with a nearly infinitesimal change of displacement. The values of the forces have been chosen on the path known from the experiment. This is illustrated in Fig. 10a. The response of the element should be identical with that observed in the laboratory. The differences observed in both Fig. 10a and b are probably due to the smeared character of the approximation with ANN. The drop of force at the end of each loop provokes some numerical troubles, since an infinitesimal strain increment corresponds to a large decrease of force.

5.2. Two-dimensional FE-ANN solution

An artificial construction of the experimental data has been necessary to perform a 2D numerical experiment. The true data have been completed by adding a hypothetical behaviour in the second, horizontal direction. The purpose of this example is rather to illustrate the performance of the hybrid FE-ANN code than to analyse the structure of the cable. We have assumed that the horizontal displacements of the single strand are identical in their character with the really measured vertical ones. Only the magnitude was scaled by a coefficient −0.35. We assume thus that we deal with a plane state of stress and that the data we have can be interpreted as pairs:


\mathrm{diag}(F_i,\ 0) \rightarrow \sigma , \qquad \mathrm{diag}(d_i,\ -0.35\, d_i) \rightarrow \varepsilon .   (27)

With this interpretation, we can complete the set of input patterns for the ANN according to (9). The finite element model contains a simple square mesh of triangles, ten rows by ten columns. The boundary conditions are the following: vertical displacement constrained at the bottom edge of the square, uniform stress vector at the upper edge. Horizontal displacements are free, except the one at the axis of symmetry. The displacement referred to in Fig. 10 has been measured at the upper corner of the square domain. All observations made for the one-dimensional case are confirmed also for the two-dimensional example. The displacements are, however, much smaller in this case.

6. Conclusions

The following conclusions can be drawn from this paper:

• The presented examples show that the stress paths drawn for a given strain history can be approximated very well by a small neural network. A sufficiently trained ANN can interpolate between learned curves to draw one not presented in training.
• The incremental ANN representation of any constitutive law is always (by construction) consistent (in the sense of the theory of plasticity). This observation concerns both sources of knowledge about the material: real and numerical experiment.
• This representation is automatic in the sense that it does not require any a priori choice or adaptation of an existing constitutive theory for the description of the observed material behaviour.
• Finally, we show that it is possible to incorporate the ANN constitutive description into a finite element code. A realistic FE model can thus be constructed for a material described by an ANN. The examples show that the model is possible even in the case of complicated non-linear, inelastic behaviour.

Acknowledgements

This paper has been partly supported by the CUTTER project GRD1/1999/10330 (Enhanced Design and Production of Wear Resistant Rock Cutting Tools For Construction Machinery) and partly by the FUSION grant RFX FU0S-CT2000-00045 (EFDA/00-S21).

References
[1] H. Abdi, Les Réseaux de Neurones, Presses Universitaires de Grenoble, 1994.
[2] T. Chen, H. Chen, Universal approximation to non-linear operators by neural networks with arbitrary activation functions and its application to dynamical systems, IEEE Trans. Neural Networks 6 (4) (1995) 911–917.
[3] U. Galvanetto, V. Naumov, V. Palmov, B.A. Schrefler, Analysis of the mechanical behaviour of cable-in-conduit superconductors under transverse cyclic loading, Int. J. Comput. Civil Struct. Engrg. 1 (2) (2000) 1–10.
[4] S. Garcia, M.P. Romo, V. Taboada-Urtuzuastegui, Knowledge-based modelling of sand behaviour, Proceedings of ECCOMAS 2000, Barcelona, 2000, pp. 11–14.
[5] D. Gawin, M. Lefik, B.A. Schrefler, ANN approach to sorption hysteresis within a coupled hygro-thermo-mechanical FE analysis, Int. J. Numer. Meth. Engrg. 50 (2001) 299–323.
[6] J. Ghaboussi, D.E. Sidarta, New nested adaptive neural networks (NANN) for constitutive modelling, Comput. Geotech. 22 (1) (1998) 29–52.
[7] J. Ghaboussi, J.H. Garrett, X. Wu, Knowledge-based modelling of material behaviour with neural networks, J. Engrg. Mech. 117 (1991) 132–151.
[8] J. Hertz, A. Krogh, G.R. Palmer, Introduction to the Theory of Neural Computation, Lecture Notes, vol. I, Santa Fe Institute Studies in the Sciences of Complexity, Addison-Wesley, 1991.
[9] Y.H. Hu, J.-N. Hwang (Eds.), Handbook of Neural Network Signal Processing, CRC Press, 2002.
[10] S. Kortesis, P.D. Panagiotopoulos, Neural networks for computing in structural analysis: methods and prospects of applications, Int. J. Numer. Meth. Engrg. 36 (1993) 2305–2318.
[11] M. Lefik, Use of artificial neural network to define a non-linear effective constitutive law for a composite, Proceedings of the 13th Polish Conference on Computer Methods in Mechanics PCCMM'97, 1997, pp. 725–732.
[12] M. Lefik, Modified BP artificial neural network as an incremental non-linear constitutive model, Proceedings of the European Conference on Computational Mechanics, ECCM-2001, 2001, on CD.
[13] M. Lefik, B.A. Schrefler, Artificial neural network for parameter identifications for an elasto-plastic model of super-conducting cable under cyclic loading, Comput. Struct. 80 (22) (2002) 1699–1713.
[14] M. Lefik, B.A. Schrefler, One-dimensional model of cable-in-conduit superconductors under cyclic loading using artificial neural networks, Fusion Engrg. Des. 60 (2) (2002) 105–117.
[15] G. Mucha, Z. Waszczyszyn, Hybrid neural-network/computational program for bending analysis of elastoplastic beams, Proceedings of the XIII Polish Conference on Computer Methods in Mechanics, 1997, pp. 949–956.
[16] N.H. Nijuhuis, W. Noordman, H.H.J. Ten Kate, Mechanical and electrical testing of an ITER CS1 model coil conductor under transverse loading in a cryogenic press, Preliminary Report, University of Twente, 1998.
[17] S. Osowski, Sieci Neuronowe w ujęciu algorytmicznym, Wydawnictwo Naukowo-Techniczne, Warszawa, 1996.
[18] D. Penumadu, R. Zhao, Triaxial compression behaviour of sand and gravel using artificial neural networks (ANN), Comput. Geotech. 24 (1999) 207–230.
[19] H.S. Shin, G.N. Pande, Intelligent finite elements, in: S. Valliappan, N. Khalili (Eds.), Computational Mechanics – New Frontiers for the New Millennium, Elsevier Science, 2001.
[20] H.S. Shin, G.N. Pande, On self-learning finite element codes based on monitored response of structures, Comput. Geotech. 27 (2000) 161–178.
[21] Z. Sikora, R. Ossowski, Y. Ichikawa, K. Tkacz, Neural networks as a tool for constitutive modelling, in: F. Oka, A. Yashima (Eds.), Localization and Bifurcation Theory for Soils and Rocks, Balkema, Rotterdam, 1998.
[22] R. Tadeusiewicz, Sieci Neuronowe, Akademicka Oficyna Wydawnicza, 1993.
[23] Z. Waszczyszyn, Neural networks in plasticity: some new results and prospects of applications, European Congress on Computational Methods in Applied Sciences and Engineering ECCOMAS 2000, 2000, on CD.
[24] Z. Waszczyszyn, Some new results in applications of backpropagation neural networks in structural and civil engineering, in: Advances in Engineering Computational Technology, Civil-Comp Press, Edinburgh, 1998, pp. 173–187.
[25] G. Yagawa, H. Okuda, Neural networks in computational mechanics, Arch. Comput. Meth. Engrg. 3 (4) (1996) 435–512.
