
Volume 2012, Article ID 529176, 20 pages
doi:10.1155/2012/529176

Research Article

System Identification Using Multilayer Differential Neural Networks: A New Result

J. Humberto Pérez-Cruz,1 A. Y. Alanis,1 José de Jesús Rubio,2 and Jaime Pacheco2

1 Centro Universitario de Ciencias Exactas e Ingenierías, Universidad de Guadalajara, Boulevard Marcelino García Barragán No. 1421, 44430 Guadalajara, JAL, Mexico
2 Sección de Estudios de Posgrado e Investigación, ESIME-UA, IPN, Avenida de las Granjas No. 682, 02250 Santa Catarina, NL, Mexico

Correspondence should be addressed to J. Humberto Pérez-Cruz, phhhantom2001@yahoo.com.mx

Received 22 November 2011; Revised 30 January 2012; Accepted 2 February 2012

Academic Editor: Hector Pomares

Copyright © 2012 J. Humberto Pérez-Cruz et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In previous works, a learning law with a dead-zone function was developed for multilayer differential neural networks. This scheme strictly requires a priori knowledge of an upper bound for the unmodeled dynamics. In this paper, the learning law is modified in such a way that this condition is relaxed. By this modification, the tuning process is simpler, and the dead-zone function is no longer required. On the basis of this modification, and by using a Lyapunov-like analysis, a stronger result is demonstrated here: the exponential convergence of the identification error to a bounded zone. Besides, a value for the upper bound of this zone is provided. The workability of this approach is tested by a simulation example.

1. Introduction

During the last four decades, system identification has emerged as a powerful and effective alternative to first-principles modeling [1–4]. By using system identification, a satisfactory mathematical model of a system can be obtained directly from a set of input and output experimental data [5]. Ideally, no a priori knowledge of the system is necessary, since it is considered as a black box. Thus, the time employed to develop such a model is reduced significantly with respect to a first-principles approach. For the linear case, system identification is a well-understood problem and enjoys well-established solutions [6].

However, the nonlinear case is much more challenging. Although some proposals have been presented [7], the class of nonlinear systems considered can be very limited. Due to their capability of handling a more general class of systems, and due to advantages such as not requiring linearity-in-parameters or persistence-of-excitation assumptions [8], artificial neural networks (ANNs) have been extensively used in the identification of nonlinear systems [9–12]. Their success is based on their capability of providing arbitrarily good approximations to any smooth function [13–15], as well as their massive parallelism and very fast adaptability [16, 17].

An artificial neural network can be simply considered as a generic nonlinear mathematical formula whose parameters are adjusted in order to represent the behavior of a static or dynamic system [18]. These parameters are called weights. Generally speaking, ANNs can be classified as feedforward (static) ones, based on the backpropagation technique [19], or as recurrent (dynamic) ones [17]. In the first network type, system dynamics is approximated by a static mapping. These networks have two major disadvantages: a slow learning rate and a high sensitivity to training data. The second approach (recurrent ANNs) incorporates feedback into its structure. Due to this feature, recurrent neural networks can overcome many problems associated with static ANNs, such as the search for global extrema, and consequently have better approximation properties. Depending on their structure, recurrent neural networks can be classified as discrete-time ones or differential ones.

The first deep insight into the identification of dynamic systems based on neural networks was provided by Narendra and Parthasarathy [20]. However, no stability analysis of their neuroidentifier was presented. Hunt et al. [21] called attention to the need to determine the convergence, stability, and robustness of the algorithms based on neural networks for identification and control. This issue was addressed by Polycarpou and Ioannou [16], Rovithakis and Christodoulou [17], Kosmatopoulos et al. [22], and Yu and Poznyak [23]. Given different structures of continuous-time neural networks, the stability of their algorithms could be proven by using Lyapunov-like analysis. All the aforementioned works considered only the case of single-layer networks. However, as is known, this kind of network does not necessarily satisfy the property of universal function approximation [24].

And although the activation functions of single-layer neural networks are selected as a basis set in such a way that this property can be guaranteed, the approximation error can never be made smaller than a lower bound [24]. This drawback can be overcome by using multilayer neural networks. Due to this better capability of function approximation, the multilayer case was considered in [25] for feedforward networks, and for continuous-time recurrent neural networks for the first time in [26] and subsequently in [27]. By using Lyapunov-like analysis and a dead-zone function, boundedness of the identification error could be guaranteed in [26].

The following upper bound for the "average" identification error was reported:

$$\limsup_{T\to\infty}\frac{1}{T}\int_0^T\left[1 - \frac{f_0 + \Upsilon}{\lambda_{\min}\left(P^{-1/2}Q_0P^{-1/2}\right)\left\|P^{1/2}\Delta_t\right\|}\right]_+\Delta_t^T Q_0\Delta_t\,dt \le f_0 + \Upsilon, \quad (1.1)$$

where $\Delta_t$ is the identification error, $Q_0$ is a positive definite matrix, $f_0$ is an upper bound for the modeling error, $\Upsilon$ is an upper bound for a deterministic disturbance, and $[\cdot]_+$ is a dead-zone function defined as

$$[z]_+ = \begin{cases} z & z \ge 0,\\ 0 & z < 0. \end{cases} \quad (1.2)$$

Although in [28] an open-loop analysis based on the passivity method was carried out for a multilayer neural network and certain simplifications were accomplished, the main result about the aforementioned identification error could not be modified. In [29], the application of the multilayer scheme to control was explored. Since previous works [26–29] are based on this "average" identification error, one could wonder about the real utility of this result.

Certainly, boundedness of this kind of error does not guarantee that $\Delta_t$ belongs to $L_2$ or $L_\infty$. Besides, no value for an upper bound of the identification error norm is provided. Likewise, no information about the speed of the convergence process is presented. Another disadvantage of this approach is that the upper bound for the modeling error $f_0$ must be strictly known a priori in order to implement the learning laws for the weight matrices. In order to avoid these drawbacks, in this paper we propose to modify the learning laws employed in [26] in such a way that their implementation no longer requires the knowledge of an upper bound for the modeling error. Besides, on the basis of these new learning laws, a stronger result is guaranteed here: the exponential convergence of the identification error norm to a bounded zone. The workability of the scheme developed in this paper is tested by simulation.

2. Multilayer Neural Identifier

Consider that the nonlinear system to be identified can be represented by

$$\dot{x}_t = f(x_t, u_t, t) + \xi_t, \quad (2.1)$$

where $x_t \in \mathbb{R}^n$ is the measurable state vector for $t \in \mathbb{R}^+ := \{t : t \ge 0\}$, $u_t \in \mathbb{R}^q$ is the control input, $f : \mathbb{R}^n \times \mathbb{R}^q \times \mathbb{R}^+ \to \mathbb{R}^n$ is an unknown nonlinear vector function which represents the nominal dynamics of the system, and $\xi_t \in \mathbb{R}^n$ represents a deterministic disturbance.

$f(x_t, u_t, t)$ represents a very ample class of systems, including affine and nonaffine-in-control nonlinear systems. However, when the control input appears in a nonlinear fashion in the system state equation (2.1), throughout this paper such nonlinearity with respect to the input is assumed known and represented by $\gamma(\cdot) : \mathbb{R}^q \to \mathbb{R}^s$.

Consider the following parallel structure of a multilayer neural network:

$$\frac{d}{dt}\hat{x}_t = A\hat{x}_t + W_{1,t}\sigma(V_{1,t}\hat{x}_t) + W_{2,t}\phi(V_{2,t}\hat{x}_t)\gamma(u_t), \quad (2.2)$$

where $\hat{x}_t \in \mathbb{R}^n$ is the state of the neural network, $u_t \in \mathbb{R}^q$ is the control input, $A \in \mathbb{R}^{n\times n}$ is a Hurwitz matrix which can be specified by the designer, the matrices $W_{1,t} \in \mathbb{R}^{n\times m}$ and $W_{2,t} \in \mathbb{R}^{n\times r}$ are the weights of the output layers, the matrices $V_{1,t} \in \mathbb{R}^{m\times n}$ and $V_{2,t} \in \mathbb{R}^{r\times n}$ are the weights of the hidden layers, and $\sigma(\cdot)$ is the activation vector function with sigmoidal components, that is, $\sigma(\cdot) := (\sigma_1(\cdot), \ldots, \sigma_m(\cdot))^T$,

$$\sigma_j(v) := \frac{a_{\sigma j}}{1 + \exp\left(-\sum_{i=1}^m c_{\sigma j,i}\,v_i\right)} + d_{\sigma j}, \quad \text{for } j = 1, \ldots, m, \quad (2.3)$$

where $a_{\sigma j}$, $c_{\sigma j,i}$, and $d_{\sigma j}$ are positive constants which can be specified by the designer; $\phi(\cdot) : \mathbb{R}^r \to \mathbb{R}^{r\times s}$ is also a sigmoidal function, that is,

$$\phi_{ij}(z) := \frac{a_{\phi ij}}{1 + \exp\left(-\sum_{l=1}^r c_{\phi ij,l}\,z_l\right)} + d_{\phi ij}, \quad \text{for } i = 1, \ldots, r,\ j = 1, \ldots, s, \quad (2.4)$$

where $a_{\phi ij}$, $c_{\phi ij,l}$, and $d_{\phi ij}$ are positive constants which can be specified by the designer, and $\gamma(\cdot) : \mathbb{R}^q \to \mathbb{R}^s$ represents the nonlinearity with respect to the input (if it exists), which is assumed a priori known for the system (2.1). It is important to mention that $m$ and $r$, that is, the number of neurons for $\sigma(\cdot)$ and the number of rows for $\phi(\cdot)$, respectively, can be selected by the designer.

The problem of identifying system (2.1) based on the multilayer differential neural network (2.2) consists of, given the measurable state $x_t$ and the input $u_t$, adjusting online the weights $W_{1,t}$, $W_{2,t}$, $V_{1,t}$, and $V_{2,t}$ by proper learning laws such that the identification error $\Delta_t := \hat{x}_t - x_t$ can be reduced.
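To fix ideas, a minimal numerical sketch of the identifier's right-hand side (2.2) with the sigmoidal components (2.3)-(2.4) follows. All dimensions, the random parameter values, and the choice $\gamma(u) = u$ (with $s = q$) are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

# Illustrative dimensions (assumptions, not from the paper)
n, m, r, s, q = 2, 5, 4, 1, 1

def sigma(v, a, C, d):
    # eq. (2.3): sigma_j(v) = a_j / (1 + exp(-sum_i c_{j,i} v_i)) + d_j
    return a / (1.0 + np.exp(-(C @ v))) + d                  # shape (m,)

def phi(z, a, C, d):
    # eq. (2.4): phi_ij(z) = a_ij / (1 + exp(-sum_l c_{ij,l} z_l)) + d_ij
    pre = np.einsum('ijl,l->ij', C, z)                       # shape (r, s)
    return a / (1.0 + np.exp(-pre)) + d

def identifier_rhs(xhat, u, A, W1, V1, W2, V2, act):
    # eq. (2.2): d/dt xhat = A xhat + W1 sigma(V1 xhat) + W2 phi(V2 xhat) gamma(u)
    gamma_u = u                                              # assume gamma(u) = u
    return (A @ xhat
            + W1 @ sigma(V1 @ xhat, *act['sigma'])
            + W2 @ phi(V2 @ xhat, *act['phi']) @ gamma_u)

rng = np.random.default_rng(0)
# positive activation constants, as required by (2.3)-(2.4)
act = {'sigma': (np.ones(m), rng.uniform(0.1, 1.0, (m, m)), 0.1 * np.ones(m)),
       'phi':   (np.ones((r, s)), rng.uniform(0.1, 1.0, (r, s, r)), 0.1 * np.ones((r, s)))}
A = -3.0 * np.eye(n)                                         # Hurwitz, designer-chosen
W1, V1 = rng.standard_normal((n, m)), rng.standard_normal((m, n))
W2, V2 = rng.standard_normal((n, r)), rng.standard_normal((r, n))
print(identifier_rhs(np.zeros(n), np.ones(s), A, W1, V1, W2, V2, act))
```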

Hereafter, it is considered that the following assumptions are valid.

(A1) System (2.1) satisfies the (uniform on $t$) Lipschitz condition, that is,

$$\|f(x,u,t) - f(z,v,t)\| \le L_1\|x - z\| + L_2\|u - v\|, \quad x, z \in \mathbb{R}^n;\ u, v \in \mathbb{R}^q;\ 0 \le L_1, L_2 < \infty. \quad (2.5)$$

(A2) The differences of the functions $\sigma(\cdot)$ and $\phi(\cdot)$ fulfil the generalized Lipschitz conditions

$$\tilde{\sigma}_t^T\Lambda_1\tilde{\sigma}_t \le \Delta_t^T\Lambda_\sigma\Delta_t, \quad \gamma^T(u_t)\tilde{\phi}_t^T\Lambda_2\tilde{\phi}_t\gamma(u_t) \le \Delta_t^T\Lambda_\phi\Delta_t\|\gamma(u_t)\|^2, \quad (2.6)$$

where

$$\tilde{\sigma}_t := \sigma(V_1^0\hat{x}_t) - \sigma(V_1^0 x_t), \quad \tilde{\phi}_t := \phi(V_2^0\hat{x}_t) - \phi(V_2^0 x_t), \quad (2.7)$$

$\Lambda_1 \in \mathbb{R}^{m\times m}$, $\Lambda_2 \in \mathbb{R}^{r\times r}$, $\Lambda_\sigma \in \mathbb{R}^{n\times n}$, and $\Lambda_\phi \in \mathbb{R}^{n\times n}$ are known positive definite matrices, and $V_1^0 \in \mathbb{R}^{m\times n}$ and $V_2^0 \in \mathbb{R}^{r\times n}$ are constant matrices which can be selected by the designer.

As $\sigma(\cdot)$ and $\phi(\cdot)$ fulfil the Lipschitz conditions, from Lemma A.1 proven in [26] the following is true:

$$\hat{\sigma}_t := \sigma(V_{1,t}\hat{x}_t) - \sigma(V_1^0\hat{x}_t) = D_\sigma\tilde{V}_{1,t}\hat{x}_t + \nu_\sigma, \quad (2.8)$$

$$\hat{\phi}_t\gamma(u_t) := \left[\phi(V_{2,t}\hat{x}_t) - \phi(V_2^0\hat{x}_t)\right]\gamma(u_t) = \sum_{i=1}^s\left(D_{\phi i}\tilde{V}_{2,t}\hat{x}_t + \nu_{\phi i}\right)\gamma_i(u_t), \quad (2.9)$$

where

$$D_\sigma = \left.\frac{\partial\sigma(Y)}{\partial Y}\right|_{Y = V_1^0\hat{x}_t} \in \mathbb{R}^{m\times m}, \quad D_{\phi i} = \left.\frac{\partial\phi_i(Z)}{\partial Z}\right|_{Z = V_2^0\hat{x}_t} \in \mathbb{R}^{r\times r}, \quad (2.10)$$

$\nu_\sigma \in \mathbb{R}^m$ and $\nu_{\phi i} \in \mathbb{R}^r$ are unknown vectors but bounded by $\|\nu_\sigma\|^2_{\Lambda_1} \le l_1\|\tilde{V}_{1,t}\hat{x}_t\|^2_{\Lambda_1}$ and $\|\nu_{\phi i}\|^2_{\Lambda_2} \le l_2\|\tilde{V}_{2,t}\hat{x}_t\|^2_{\Lambda_2}$, respectively; $\tilde{V}_{1,t} := V_{1,t} - V_1^0$ and $\tilde{V}_{2,t} := V_{2,t} - V_2^0$; and $l_1$ and $l_2$ are positive constants which can be defined as $l_1 := 4L^2_{g,1}$ and $l_2 := 4L^2_{g,2}$, where $L_{g,1}$ and $L_{g,2}$ are global Lipschitz constants for $\sigma(\cdot)$ and $\phi_i(\cdot)$, respectively.

(A3) The nonlinear function $\gamma(\cdot)$ is such that $\|\gamma(u_t)\|^2 \le \bar{u}$, where $\bar{u}$ is a known positive constant.

(A4) The unmodeled dynamics $\tilde{f}_t$ is bounded by

$$\|\tilde{f}_t\|^2_{\Lambda_3} \le f_0 + f_1\|x_t\|^2_{\Lambda_3}, \quad (2.11)$$

where $f_0$ and $f_1$ are known positive constants, $\Lambda_3 \in \mathbb{R}^{n\times n}$ is a known positive definite matrix, and $\tilde{f}_t$ can be defined as $\tilde{f}_t := f(x_t,u_t,t) - Ax_t - W_1^0\sigma(V_1^0x_t) - W_2^0\phi(V_2^0x_t)\gamma(u_t)$; $W_1^0 \in \mathbb{R}^{n\times m}$ and $W_2^0 \in \mathbb{R}^{n\times r}$ are constant matrices which can be selected by the designer.

(A5) The deterministic disturbance is bounded, that is, $\|\xi_t\|^2_{\Lambda_4} \le \Upsilon$, where $\Lambda_4$ is a known positive definite matrix.

(A6) The following matrix Riccati equation has a unique, positive definite solution $P$:

$$A^TP + PA + PRP + Q = 0, \quad (2.12)$$

where

$$R = 2W_1^0\Lambda_1^{-1}\left(W_1^0\right)^T + 2W_2^0\Lambda_2^{-1}\left(W_2^0\right)^T + \Lambda_3^{-1} + \Lambda_4^{-1}, \quad Q = \Lambda_\sigma + \Lambda_\phi\bar{u} + Q_0, \quad (2.13)$$

and $Q_0$ is a positive definite matrix which can be selected by the designer.

Remark 2.1. Based on [30, 31], it can be established that the matrix Riccati equation (2.12) has a unique positive definite solution $P$ if the following conditions are satisfied:

(a) the pair $(A, R^{1/2})$ is controllable, and the pair $(Q^{1/2}, A)$ is observable;

(b) the following matrix inequality is fulfilled:

$$\frac{1}{4}\left(A^TR^{-1} - R^{-1}A\right)R\left(A^TR^{-1} - R^{-1}A\right)^T \le A^TR^{-1}A - Q. \quad (2.14)$$

Both conditions can be fulfilled relatively easily if $A$ is selected as a stable diagonal matrix.
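Since (2.12) carries the sign pattern $+PRP$ (rather than the $-PBB^TP$ of the standard LQR Riccati equation), off-the-shelf CARE solvers do not apply directly. A sketch of one way to solve it, via the stable invariant subspace of the associated Hamiltonian matrix, is shown below; the numerical values are hypothetical and only meant to illustrate Remark 2.1:

```python
import numpy as np
from scipy.linalg import schur

def solve_riccati_2_12(A, R, Q):
    """Solve A^T P + P A + P R P + Q = 0, eq. (2.12), through the
    stable invariant subspace of the Hamiltonian H = [[A, R], [-Q, -A^T]]."""
    n = A.shape[0]
    H = np.block([[A, R], [-Q, -A.T]])
    # Ordered real Schur form: left-half-plane eigenvalues first
    T, Z, sdim = schur(H, output='real', sort='lhp')
    assert sdim == n, "stable subspace has wrong dimension; check Remark 2.1"
    U1, U2 = Z[:n, :n], Z[n:, :n]
    P = U2 @ np.linalg.inv(U1)
    return 0.5 * (P + P.T)                             # symmetrize round-off

# Hypothetical data: A stable and diagonal, as Remark 2.1 suggests
A = -3.0 * np.eye(2)
R = 0.5 * np.eye(2)    # stands in for R of eq. (2.13)
Q = 1.0 * np.eye(2)    # stands in for Q of eq. (2.13)
P = solve_riccati_2_12(A, R, Q)
print(np.linalg.eigvalsh(P))                               # should be positive
print(np.linalg.norm(A.T @ P + P @ A + P @ R @ P + Q))     # residual ~ 0
```

For these numbers the solver returns $P \approx 0.17\,I$, the smaller of the two roots of the scalar equation $-6p + 0.5p^2 + 1 = 0$, which is the positive definite stabilizing solution.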

(A7) There exists a bounded control $u_t$ such that the closed-loop system is quadratically stable, that is, there exist a Lyapunov function $V_0 > 0$ and a positive constant $\lambda$ such that

$$\frac{\partial V_0}{\partial x}f(x_t, u_t, t) \le -\lambda\|x_t\|^2. \quad (2.15)$$

Additionally, the inequality $\lambda > f_1\lambda_{\max}(\Lambda_3)$ must be satisfied.

Now, consider the learning law:

$$\begin{aligned}
\dot{W}_{1,t} &= -s_tK_1P\Delta_t\sigma^T(V_{1,t}\hat{x}_t) + s_tK_1P\Delta_t\hat{x}_t^TV_{1,t}^TD_\sigma^T,\\
\dot{W}_{2,t} &= -s_tK_2P\Delta_t\gamma^T(u_t)\phi^T(V_{2,t}\hat{x}_t) + s_tK_2P\Delta_t\hat{x}_t^TV_{2,t}^T\sum_{i=1}^s\gamma_i(u_t)D_{\phi i}^T,\\
\dot{V}_{1,t} &= -s_tK_3D_\sigma^TW_{1,t}^TP\Delta_t\hat{x}_t^T - s_t\frac{l_1}{2}K_3\Lambda_1\tilde{V}_{1,t}\hat{x}_t\hat{x}_t^T,\\
\dot{V}_{2,t} &= -s_tK_4\sum_{i=1}^s\gamma_i(u_t)D_{\phi i}^TW_{2,t}^TP\Delta_t\hat{x}_t^T - s_t\frac{s\,l_2\bar{u}}{2}K_4\Lambda_2\tilde{V}_{2,t}\hat{x}_t\hat{x}_t^T,
\end{aligned} \quad (2.16)$$

where $s$ is the number of columns corresponding to $\phi(\cdot)$; $K_1 \in \mathbb{R}^{n\times n}$, $K_2 \in \mathbb{R}^{n\times n}$, $K_3 \in \mathbb{R}^{m\times m}$, and $K_4 \in \mathbb{R}^{r\times r}$ are positive definite matrices which are selected by the designer; and $s_t$ is a dead-zone function defined as

$$s_t := \left[1 - \frac{\mu}{\left\|P^{1/2}\Delta_t\right\|}\right]_+, \quad [z]_+ = \begin{cases} z & z \ge 0,\\ 0 & z < 0,\end{cases} \quad \mu = \frac{f_0 + \Upsilon}{\lambda_{\min}\left(P^{-1/2}Q_0P^{-1/2}\right)}. \quad (2.17)$$
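As a minimal sketch, the dead-zone factor $s_t$ of (2.17) could be computed as follows, assuming $P$, $Q_0$, $f_0$, and $\Upsilon$ are already available; note that $f_0$ and $\Upsilon$ must be known a priori here, which is exactly the requirement removed in Section 3:

```python
import numpy as np
from scipy.linalg import sqrtm

def dead_zone_factor(Delta, P, Q0, f0, Upsilon):
    """s_t = [1 - mu / ||P^{1/2} Delta_t||]_+, eq. (2.17)."""
    P_half = np.real(sqrtm(P))
    P_half_inv = np.linalg.inv(P_half)
    lam_min = np.linalg.eigvalsh(P_half_inv @ Q0 @ P_half_inv).min()
    mu = (f0 + Upsilon) / lam_min
    nrm = np.linalg.norm(P_half @ Delta)
    # inside the dead zone (nrm <= mu) the factor vanishes and all four
    # laws in (2.16) freeze the weights
    return max(0.0, 1.0 - mu / nrm) if nrm > 0.0 else 0.0
```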

Based on this learning law, the following result was demonstrated in [26].

Theorem 2.2. If the assumptions (A1)–(A7) are satisfied and the weight matrices $W_{1,t}$, $W_{2,t}$, $V_{1,t}$, and $V_{2,t}$ of the neural network (2.2) are adjusted by the learning law (2.16), then

(a) the identification error and the weights are bounded:

$$\Delta_t, W_{1,t}, W_{2,t}, V_{1,t}, V_{2,t} \in L_\infty; \quad (2.18)$$

(b) the identification error $\Delta_t$ satisfies the following tracking performance:

$$\limsup_{T\to\infty}\frac{1}{T}\int_0^T\left[1 - \frac{f_0 + \Upsilon}{\lambda_{\min}\left(P^{-1/2}Q_0P^{-1/2}\right)\left\|P^{1/2}\Delta_t\right\|}\right]_+\Delta_t^TQ_0\Delta_t\,dt \le f_0 + \Upsilon. \quad (2.19)$$

In order to prove this result, the following nonnegative function was utilized:

$$V_t := V_0 + \left[\left\|P^{1/2}\Delta_t\right\| - \mu\right]_+^2 + \mathrm{tr}\left(\tilde{W}_{1,t}^TK_1^{-1}\tilde{W}_{1,t}\right) + \mathrm{tr}\left(\tilde{W}_{2,t}^TK_2^{-1}\tilde{W}_{2,t}\right) + \mathrm{tr}\left(\tilde{V}_{1,t}^TK_3^{-1}\tilde{V}_{1,t}\right) + \mathrm{tr}\left(\tilde{V}_{2,t}^TK_4^{-1}\tilde{V}_{2,t}\right), \quad (2.20)$$

where $\tilde{W}_{1,t} := W_{1,t} - W_1^0$ and $\tilde{W}_{2,t} := W_{2,t} - W_2^0$.


3. Exponential Convergence of the Identification Process

Consider that the assumptions (A1)–(A3) and (A5)-(A6) are still valid, but assumption (A4) is slightly modified as follows.

(B4) In a compact set $\Omega \subset \mathbb{R}^n$, the unmodeled dynamics $\tilde{f}_t$ is bounded by $\|\tilde{f}_t\|^2_{\Lambda_3} \le f_0$, where $f_0$ is a constant not necessarily known a priori.

Remark 3.1. (B4) is a common assumption in the neural network literature [17, 22]. As mentioned in Section 2, $\tilde{f}_t$ is given by $\tilde{f}_t := f(x_t,u_t,t) - Ax_t - W_1^0\sigma(V_1^0x_t) - W_2^0\phi(V_2^0x_t)\gamma(u_t)$. Note that $W_1^0\sigma(V_1^0x_t)$ and $W_2^0\phi(V_2^0x_t)\gamma(u_t)$ are bounded functions because $\sigma(\cdot)$ and $\phi(\cdot)$ are sigmoidal functions. As $x_t$ belongs to $\Omega$, clearly $x_t$ is also bounded. Therefore, assumption (B4) implicitly implies that $f(x_t,u_t,t)$ is a bounded function on the compact set $\Omega \subset \mathbb{R}^n$.

Although assumption (B4) is certainly more restrictive than assumption (A4), from now on assumption (A7) is no longer needed.

In this paper, the following modification of the learning law (2.16) is proposed:

$$\begin{aligned}
\dot{W}_{1,t} &= -2k_1P\Delta_t\sigma^T(V_{1,t}\hat{x}_t) + 2k_1P\Delta_t\hat{x}_t^TV_{1,t}^TD_\sigma^T - \frac{\alpha}{2}\tilde{W}_{1,t},\\
\dot{W}_{2,t} &= -2k_2P\Delta_t\gamma^T(u_t)\phi^T(V_{2,t}\hat{x}_t) + 2k_2P\Delta_t\hat{x}_t^TV_{2,t}^T\sum_{i=1}^s\gamma_i(u_t)D_{\phi i}^T - \frac{\alpha}{2}\tilde{W}_{2,t},\\
\dot{V}_{1,t} &= -2k_3D_\sigma^TW_{1,t}^TP\Delta_t\hat{x}_t^T - k_3l_1\Lambda_1\tilde{V}_{1,t}\hat{x}_t\hat{x}_t^T - \frac{\alpha}{2}\tilde{V}_{1,t},\\
\dot{V}_{2,t} &= -2k_4\sum_{i=1}^s\gamma_i(u_t)D_{\phi i}^TW_{2,t}^TP\Delta_t\hat{x}_t^T - k_4s\,l_2\bar{u}\Lambda_2\tilde{V}_{2,t}\hat{x}_t\hat{x}_t^T - \frac{\alpha}{2}\tilde{V}_{2,t},
\end{aligned} \quad (3.1)$$

where $k_1$, $k_2$, $k_3$, and $k_4$ are positive constants which are selected by the designer; $P$ is the solution of the Riccati equation given by (2.12); $\alpha := \lambda_{\min}(P^{-1/2}Q_0P^{-1/2})$; and $s$ is the number of columns corresponding to $\phi(\cdot)$. By using the constants $k_1$, $k_2$, $k_3$, and $k_4$ in (3.1) instead of the matrices $K_1$, $K_2$, $K_3$, and $K_4$ in (2.16), the tuning process of the neural network (2.2) is simplified. Besides, no dead-zone function is required anymore; a numerical sketch of the resulting identifier is given below. Based on the learning law (3.1), the following result is here established.
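As a purely illustrative sketch (not the paper's simulation example), the identifier (2.2) and the law (3.1) can be integrated with a forward-Euler step as follows. It assumes $s = 1$, a scalar input with $\gamma(u) = u$, and the sign conventions reconstructed in (3.1) above:

```python
import numpy as np

def sig(a, C, d, v):
    # sigmoid vector of (2.3)/(2.4)
    return a / (1.0 + np.exp(-(C @ v))) + d

def slope(a, C, d, v):
    # Jacobian of the sigmoid vector w.r.t. its argument, cf. (2.10)
    p = 1.0 / (1.0 + np.exp(-(C @ v)))
    return (a * p * (1.0 - p))[:, None] * C

def euler_step(state, x, u, dt, p):
    """One Euler step of identifier (2.2) under the learning law (3.1), s = 1."""
    xhat, W1, W2, V1, V2 = state
    A, P, al = p['A'], p['P'], p['alpha']
    k1, k2, k3, k4 = p['k1'], p['k2'], p['k3'], p['k4']
    a1, C1, d1 = p['act1']; a2, C2, d2 = p['act2']
    W10, W20, V10, V20 = p['W10'], p['W20'], p['V10'], p['V20']

    s1 = sig(a1, C1, d1, V1 @ xhat)            # sigma(V1 xhat)
    p1 = sig(a2, C2, d2, V2 @ xhat)            # phi(V2 xhat), single column
    Ds = slope(a1, C1, d1, V10 @ xhat)         # D_sigma at V1^0 xhat
    Dp = slope(a2, C2, d2, V20 @ xhat)         # D_phi at V2^0 xhat
    PD = P @ (xhat - x)                        # P Delta_t, Delta_t = xhat - x
    E = np.outer(PD, xhat)                     # P Delta_t xhat^T

    # eq. (3.1); the -(alpha/2)(. - .0) terms pull the weights toward
    # their nominal values and yield the exponential decay rate alpha
    dW1 = -2*k1*np.outer(PD, s1) + 2*k1*E @ V1.T @ Ds.T - 0.5*al*(W1 - W10)
    dW2 = -2*k2*u*np.outer(PD, p1) + 2*k2*u*E @ V2.T @ Dp.T - 0.5*al*(W2 - W20)
    dV1 = (-2*k3*Ds.T @ W1.T @ E
           - k3*p['l1']*p['Lam1'] @ (V1 - V10) @ np.outer(xhat, xhat)
           - 0.5*al*(V1 - V10))
    dV2 = (-2*k4*u*Dp.T @ W2.T @ E
           - k4*p['l2']*p['ubar']*p['Lam2'] @ (V2 - V20) @ np.outer(xhat, xhat)
           - 0.5*al*(V2 - V20))
    dxhat = A @ xhat + W1 @ s1 + W2 @ p1 * u   # eq. (2.2) with gamma(u) = u

    return (xhat + dt*dxhat, W1 + dt*dW1, W2 + dt*dW2, V1 + dt*dV1, V2 + dt*dV2)
```

All entries of `p` (gains $k_i$, $\Lambda_1$, $\Lambda_2$, $l_1$, $l_2$, $\bar{u}$, $P$ from (2.12), and the nominal weights $W^0$, $V^0$) are designer data; no bound $f_0$ on the unmodeled dynamics appears anywhere, which is the point of the modification.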

Theorem 3.2. If the assumptions (A1)–(A3), (B4), and (A5)-(A6) are satisfied and the weight matrices $W_{1,t}$, $W_{2,t}$, $V_{1,t}$, and $V_{2,t}$ of the neural network (2.2) are adjusted by the learning law (3.1), then

(a) the identification error and the weights are bounded:

$$\Delta_t, W_{1,t}, W_{2,t}, V_{1,t}, V_{2,t} \in L_\infty; \quad (3.2)$$

(b) the norm of the identification error converges exponentially to a bounded region given by

$$\lim_{t\to\infty}\|\hat{x}_t - x_t\| \le \sqrt{\frac{f_0 + \Upsilon}{\alpha\lambda_{\min}(P)}}. \quad (3.3)$$

Proof of Theorem 3.2. Before beginning the analysis, the dynamics of the identification error $\Delta_t$ must be determined. The first derivative of $\Delta_t$ is

$$\dot{\Delta}_t = \frac{d}{dt}\hat{x}_t - \frac{d}{dt}x_t. \quad (3.4)$$

Note that an alternative representation for (2.1) can be calculated as follows:

$$\dot{x}_t = Ax_t + W_1^0\sigma(V_1^0x_t) + W_2^0\phi(V_2^0x_t)\gamma(u_t) + \tilde{f}_t + \xi_t. \quad (3.5)$$

Substituting (2.2) and (3.5) into (3.4) yields

$$\begin{aligned}
\dot{\Delta}_t &= A\hat{x}_t + W_{1,t}\sigma(V_{1,t}\hat{x}_t) + W_{2,t}\phi(V_{2,t}\hat{x}_t)\gamma(u_t) - Ax_t - W_1^0\sigma(V_1^0x_t) - W_2^0\phi(V_2^0x_t)\gamma(u_t) - \tilde{f}_t - \xi_t\\
&= A\Delta_t + W_{1,t}\sigma(V_{1,t}\hat{x}_t) - W_1^0\sigma(V_1^0x_t) + W_{2,t}\phi(V_{2,t}\hat{x}_t)\gamma(u_t) - W_2^0\phi(V_2^0x_t)\gamma(u_t) - \tilde{f}_t - \xi_t.
\end{aligned} \quad (3.6)$$

Subtracting and adding the terms $W_1^0\sigma(V_{1,t}\hat{x}_t)$, $W_1^0\sigma(V_1^0\hat{x}_t)$, $W_2^0\phi(V_{2,t}\hat{x}_t)\gamma(u_t)$, and $W_2^0\phi(V_2^0\hat{x}_t)\gamma(u_t)$, and considering that $\tilde{W}_{1,t} := W_{1,t} - W_1^0$, $\tilde{W}_{2,t} := W_{2,t} - W_2^0$, $\tilde{\sigma}_t := \sigma(V_1^0\hat{x}_t) - \sigma(V_1^0x_t)$, $\tilde{\phi}_t := \phi(V_2^0\hat{x}_t) - \phi(V_2^0x_t)$, $\hat{\sigma}_t := \sigma(V_{1,t}\hat{x}_t) - \sigma(V_1^0\hat{x}_t)$, and $\hat{\phi}_t\gamma(u_t) := [\phi(V_{2,t}\hat{x}_t) - \phi(V_2^0\hat{x}_t)]\gamma(u_t)$, (3.6) can be expressed as

$$\dot{\Delta}_t = A\Delta_t + \tilde{W}_{1,t}\sigma(V_{1,t}\hat{x}_t) + \tilde{W}_{2,t}\phi(V_{2,t}\hat{x}_t)\gamma(u_t) + W_1^0\hat{\sigma}_t + W_2^0\hat{\phi}_t\gamma(u_t) + W_1^0\tilde{\sigma}_t + W_2^0\tilde{\phi}_t\gamma(u_t) - \tilde{f}_t - \xi_t. \quad (3.7)$$

In order to begin the analysis, the following nonnegative function is selected:

$$V_t = \Delta_t^TP\Delta_t + \frac{1}{2k_1}\mathrm{tr}\left(\tilde{W}_{1,t}^T\tilde{W}_{1,t}\right) + \frac{1}{2k_2}\mathrm{tr}\left(\tilde{W}_{2,t}^T\tilde{W}_{2,t}\right) + \frac{1}{2k_3}\mathrm{tr}\left(\tilde{V}_{1,t}^T\tilde{V}_{1,t}\right) + \frac{1}{2k_4}\mathrm{tr}\left(\tilde{V}_{2,t}^T\tilde{V}_{2,t}\right), \quad (3.8)$$

where $P$ is a positive definite solution of the Riccati matrix equation given by (2.12). The first derivative of $V_t$ is

$$\dot{V}_t = \frac{d}{dt}\left(\Delta_t^TP\Delta_t\right) + \frac{d}{dt}\left[\frac{1}{2k_1}\mathrm{tr}\left(\tilde{W}_{1,t}^T\tilde{W}_{1,t}\right)\right] + \frac{d}{dt}\left[\frac{1}{2k_2}\mathrm{tr}\left(\tilde{W}_{2,t}^T\tilde{W}_{2,t}\right)\right] + \frac{d}{dt}\left[\frac{1}{2k_3}\mathrm{tr}\left(\tilde{V}_{1,t}^T\tilde{V}_{1,t}\right)\right] + \frac{d}{dt}\left[\frac{1}{2k_4}\mathrm{tr}\left(\tilde{V}_{2,t}^T\tilde{V}_{2,t}\right)\right]. \quad (3.9)$$

Each term of (3.9) will be calculated separately. For $\frac{d}{dt}(\Delta_t^TP\Delta_t)$,

$$\frac{d}{dt}\left(\Delta_t^TP\Delta_t\right) = 2\Delta_t^TP\dot{\Delta}_t. \quad (3.10)$$

Substituting (3.7) into (3.10) yields

$$\begin{aligned}
\frac{d}{dt}\left(\Delta_t^TP\Delta_t\right) ={}& 2\Delta_t^TPA\Delta_t + 2\Delta_t^TP\tilde{W}_{1,t}\sigma(V_{1,t}\hat{x}_t) + 2\Delta_t^TP\tilde{W}_{2,t}\phi(V_{2,t}\hat{x}_t)\gamma(u_t)\\
&+ 2\Delta_t^TPW_1^0\hat{\sigma}_t + 2\Delta_t^TPW_2^0\hat{\phi}_t\gamma(u_t) + 2\Delta_t^TPW_1^0\tilde{\sigma}_t + 2\Delta_t^TPW_2^0\tilde{\phi}_t\gamma(u_t)\\
&- 2\Delta_t^TP\tilde{f}_t - 2\Delta_t^TP\xi_t.
\end{aligned} \quad (3.11)$$

The terms $2\Delta_t^TPW_1^0\tilde{\sigma}_t$, $2\Delta_t^TPW_2^0\tilde{\phi}_t\gamma(u_t)$, $-2\Delta_t^TP\tilde{f}_t$, and $-2\Delta_t^TP\xi_t$ in (3.11) can be bounded using the following matrix inequality proven in [26]:

$$X^TY + Y^TX \le X^T\Gamma^{-1}X + Y^T\Gamma Y, \quad (3.12)$$

which is valid for any $X, Y \in \mathbb{R}^{n\times k}$ and for any positive definite matrix $0 < \Gamma = \Gamma^T \in \mathbb{R}^{n\times n}$.
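As a quick, purely illustrative sanity check of (3.12) (not part of the proof), one can verify numerically that the gap between the two sides is positive semidefinite for random data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 4, 3
X, Y = rng.standard_normal((n, k)), rng.standard_normal((n, k))
G = rng.standard_normal((n, n))
Gamma = G @ G.T + n * np.eye(n)             # an arbitrary positive definite matrix

lhs = X.T @ Y + Y.T @ X
rhs = X.T @ np.linalg.inv(Gamma) @ X + Y.T @ Gamma @ Y
print(np.linalg.eigvalsh(rhs - lhs).min())  # >= 0 up to round-off
```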

Thus, for $2\Delta_t^TPW_1^0\tilde{\sigma}_t$, and considering assumption (A2),

$$\begin{aligned}
2\Delta_t^TPW_1^0\tilde{\sigma}_t &= \Delta_t^TPW_1^0\tilde{\sigma}_t + \tilde{\sigma}_t^T\left(W_1^0\right)^TP\Delta_t\\
&\le \Delta_t^TPW_1^0\Lambda_1^{-1}\left(W_1^0\right)^TP\Delta_t + \tilde{\sigma}_t^T\Lambda_1\tilde{\sigma}_t\\
&\le \Delta_t^TPW_1^0\Lambda_1^{-1}\left(W_1^0\right)^TP\Delta_t + \Delta_t^T\Lambda_\sigma\Delta_t.
\end{aligned} \quad (3.13)$$

For $2\Delta_t^TPW_2^0\tilde{\phi}_t\gamma(u_t)$, and considering assumptions (A2) and (A3),

$$\begin{aligned}
2\Delta_t^TPW_2^0\tilde{\phi}_t\gamma(u_t) &= \Delta_t^TPW_2^0\tilde{\phi}_t\gamma(u_t) + \gamma^T(u_t)\tilde{\phi}_t^T\left(W_2^0\right)^TP\Delta_t\\
&\le \Delta_t^TPW_2^0\Lambda_2^{-1}\left(W_2^0\right)^TP\Delta_t + \gamma^T(u_t)\tilde{\phi}_t^T\Lambda_2\tilde{\phi}_t\gamma(u_t)\\
&\le \Delta_t^TPW_2^0\Lambda_2^{-1}\left(W_2^0\right)^TP\Delta_t + \bar{u}\Delta_t^T\Lambda_\phi\Delta_t.
\end{aligned} \quad (3.14)$$

By using (3.12) and given assumptions (B4) and (A5), $-2\Delta_t^TP\tilde{f}_t$ and $-2\Delta_t^TP\xi_t$ can be bounded, respectively, by

$$\begin{aligned}
-2\Delta_t^TP\tilde{f}_t &= -\Delta_t^TP\tilde{f}_t - \tilde{f}_t^TP\Delta_t \le \Delta_t^TP\Lambda_3^{-1}P\Delta_t + \tilde{f}_t^T\Lambda_3\tilde{f}_t \le \Delta_t^TP\Lambda_3^{-1}P\Delta_t + f_0,\\
-2\Delta_t^TP\xi_t &= -\Delta_t^TP\xi_t - \xi_t^TP\Delta_t \le \Delta_t^TP\Lambda_4^{-1}P\Delta_t + \xi_t^T\Lambda_4\xi_t \le \Delta_t^TP\Lambda_4^{-1}P\Delta_t + \Upsilon.
\end{aligned} \quad (3.15)$$

Considering (2.8), $2\Delta_t^TPW_1^0\hat{\sigma}_t$ can be developed as

$$2\Delta_t^TPW_1^0\hat{\sigma}_t = 2\Delta_t^TPW_1^0D_\sigma\tilde{V}_{1,t}\hat{x}_t + 2\Delta_t^TPW_1^0\nu_\sigma. \quad (3.16)$$

By simultaneously adding and subtracting the term $2\Delta_t^TPW_{1,t}D_\sigma\tilde{V}_{1,t}\hat{x}_t$ into the right-hand side of (3.16),

$$2\Delta_t^TPW_1^0\hat{\sigma}_t = 2\Delta_t^TPW_{1,t}D_\sigma\tilde{V}_{1,t}\hat{x}_t - 2\Delta_t^TP\tilde{W}_{1,t}D_\sigma\tilde{V}_{1,t}\hat{x}_t + 2\Delta_t^TPW_1^0\nu_\sigma. \quad (3.17)$$

By using (3.12) and considering assumption (A2), the term $2\Delta_t^TPW_1^0\nu_\sigma$ can be bounded as

$$\begin{aligned}
2\Delta_t^TPW_1^0\nu_\sigma &= \Delta_t^TPW_1^0\nu_\sigma + \nu_\sigma^T\left(W_1^0\right)^TP\Delta_t \le \Delta_t^TPW_1^0\Lambda_1^{-1}\left(W_1^0\right)^TP\Delta_t + \nu_\sigma^T\Lambda_1\nu_\sigma\\
&\le \Delta_t^TPW_1^0\Lambda_1^{-1}\left(W_1^0\right)^TP\Delta_t + l_1\left\|\tilde{V}_{1,t}\hat{x}_t\right\|^2_{\Lambda_1}.
\end{aligned} \quad (3.18)$$

Consequently, $2\Delta_t^TPW_1^0\hat{\sigma}_t$ is bounded by

$$2\Delta_t^TPW_1^0\hat{\sigma}_t \le 2\Delta_t^TPW_{1,t}D_\sigma\tilde{V}_{1,t}\hat{x}_t - 2\Delta_t^TP\tilde{W}_{1,t}D_\sigma\tilde{V}_{1,t}\hat{x}_t + \Delta_t^TPW_1^0\Lambda_1^{-1}\left(W_1^0\right)^TP\Delta_t + l_1\left\|\tilde{V}_{1,t}\hat{x}_t\right\|^2_{\Lambda_1}. \quad (3.19)$$

For $2\Delta_t^TPW_2^0\hat{\phi}_t\gamma(u_t)$, and considering (2.9),

$$\begin{aligned}
2\Delta_t^TPW_2^0\hat{\phi}_t\gamma(u_t) &= 2\Delta_t^TPW_2^0\sum_{i=1}^s\left(D_{\phi i}\tilde{V}_{2,t}\hat{x}_t + \nu_{\phi i}\right)\gamma_i(u_t)\\
&= 2\Delta_t^TPW_2^0\sum_{i=1}^sD_{\phi i}\tilde{V}_{2,t}\hat{x}_t\gamma_i(u_t) + 2\Delta_t^TPW_2^0\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t).
\end{aligned} \quad (3.20)$$

Adding and subtracting the term $2\Delta_t^TPW_{2,t}\sum_{i=1}^sD_{\phi i}\tilde{V}_{2,t}\hat{x}_t\gamma_i(u_t)$ into the right-hand side of (3.20),

$$\begin{aligned}
2\Delta_t^TPW_2^0\hat{\phi}_t\gamma(u_t) ={}& 2\Delta_t^TPW_{2,t}\sum_{i=1}^sD_{\phi i}\tilde{V}_{2,t}\hat{x}_t\gamma_i(u_t) - 2\Delta_t^TP\tilde{W}_{2,t}\sum_{i=1}^sD_{\phi i}\tilde{V}_{2,t}\hat{x}_t\gamma_i(u_t)\\
&+ 2\Delta_t^TPW_2^0\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t).
\end{aligned} \quad (3.21)$$

By using (3.12), $2\Delta_t^TPW_2^0\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t)$ can be bounded by

$$\begin{aligned}
2\Delta_t^TPW_2^0\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t) &= \Delta_t^TPW_2^0\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t) + \left(\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t)\right)^T\left(W_2^0\right)^TP\Delta_t\\
&\le \Delta_t^TPW_2^0\Lambda_2^{-1}\left(W_2^0\right)^TP\Delta_t + \left(\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t)\right)^T\Lambda_2\left(\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t)\right),
\end{aligned} \quad (3.22)$$

but considering that

$$\left(\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t)\right)^T\Lambda_2\left(\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t)\right) \le s\sum_{i=1}^s\gamma_i^2(u_t)\,\nu_{\phi i}^T\Lambda_2\nu_{\phi i}, \quad (3.23)$$

and from assumptions (A2) and (A3), the following can be concluded:

$$2\Delta_t^TPW_2^0\sum_{i=1}^s\nu_{\phi i}\gamma_i(u_t) \le \Delta_t^TPW_2^0\Lambda_2^{-1}\left(W_2^0\right)^TP\Delta_t + s\,l_2\bar{u}\left\|\tilde{V}_{2,t}\hat{x}_t\right\|^2_{\Lambda_2}. \quad (3.24)$$

Thus, $2\Delta_t^TPW_2^0\hat{\phi}_t\gamma(u_t)$ is bounded by

$$\begin{aligned}
2\Delta_t^TPW_2^0\hat{\phi}_t\gamma(u_t) \le{}& 2\Delta_t^TPW_{2,t}\sum_{i=1}^sD_{\phi i}\tilde{V}_{2,t}\hat{x}_t\gamma_i(u_t) - 2\Delta_t^TP\tilde{W}_{2,t}\sum_{i=1}^sD_{\phi i}\tilde{V}_{2,t}\hat{x}_t\gamma_i(u_t)\\
&+ \Delta_t^TPW_2^0\Lambda_2^{-1}\left(W_2^0\right)^TP\Delta_t + s\,l_2\bar{u}\left\|\tilde{V}_{2,t}\hat{x}_t\right\|^2_{\Lambda_2}.
\end{aligned} \quad (3.25)$$
