
(1)

On improving accuracy of the error estimates in the conjugate gradient method

Petr Tichý

Faculty of Mathematics and Physics, Charles University

based on joint work with Gérard Meurant and Zdeněk Strakoš

PANM 19

June 24-29, 2018, Hejnice, Czech Republic

(2)

The conjugate gradient method

A is symmetric and positive definite, Ax = b

    input A, b, x_0
    p_0 = r_0 = b − Ax_0
    for k = 1, 2, . . . do
        α_{k−1} = (r_{k−1}^T r_{k−1}) / (p_{k−1}^T A p_{k−1})
        x_k = x_{k−1} + α_{k−1} p_{k−1}
        r_k = r_{k−1} − α_{k−1} A p_{k−1}
        β_k = (r_k^T r_k) / (r_{k−1}^T r_{k−1})
        p_k = r_k + β_k p_{k−1}
    end for

exact arithmetic:

    orthogonality: r_i ⊥ r_j, p_i ⊥_A p_j (i ≠ j)
    optimality of x_k: ‖x − x_k‖_A = min_{y ∈ x_0 + K_k} ‖x − y‖_A
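For later reference, here is a minimal NumPy sketch of this iteration (the function name and the idea of recording the coefficients are mine, not from the slides); the recorded α_k, β_k and ‖r_k‖ are exactly the quantities from which the error estimates below are computed.

```python
import numpy as np

def cg_with_coeffs(A, b, x0, maxit=1000, tol=1e-12):
    # Plain CG for an SPD matrix A, following the iteration above.
    # Also records alpha_k, beta_k and ||r_k||.
    x = np.array(x0, dtype=float)
    r = b - A @ x
    p = r.copy()
    alphas, betas, rnorms = [], [], [np.linalg.norm(r)]
    for _ in range(maxit):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)            # alpha_{k-1}
        x = x + alpha * p                     # x_k
        r_new = r - alpha * Ap                # r_k
        beta = (r_new @ r_new) / (r @ r)      # beta_k
        p = r_new + beta * p                  # p_k
        r = r_new
        alphas.append(alpha)
        betas.append(beta)
        rnorms.append(np.linalg.norm(r))
        if rnorms[-1] <= tol * rnorms[0]:
            break
    return x, np.array(alphas), np.array(betas), np.array(rnorms)
```

With this indexing, alphas[k] = α_k, betas[k] = β_{k+1}, and rnorms[k] = ‖r_k‖.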

(3)

Estimating the A-norm of the error

A brief history

‖x − x_k‖²_A . . . measure of the “goodness” of x_k

[Hestenes, Stiefel 1952]

Gauss quadrature error bounds (Lanczos) → GQL

[Dahlquist, Golub, Nash 1978], [Golub, Meurant 1994]

Estimating errors in CG → CGQL, CGQ

[Golub, Strakoš 1994], [Golub, Meurant 1997], [Meurant, T. 2013]

Why it works in finite precision arithmetic

[Golub, Strakoš 1994], [Strakoš, T. 2002, 2005].


(7)

Quadrature bounds

(8)

Gauss and Gauss-Radau quadrature bounds

Given µ ≤ λ_min, it holds that

    α_k ‖r_k‖² < ‖x − x_k‖²_A < α_k^{(µ)} ‖r_k‖²

where

    α_{k+1}^{(µ)} = (α_k^{(µ)} − α_k) / (µ (α_k^{(µ)} − α_k) + β_{k+1}),    α_0^{(µ)} = 1/µ.

Practically relevant questions:

How to get µ? Quality of the bound? Numerical behavior?
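A short sketch of the recursion as reconstructed above, on top of the hypothetical `cg_with_coeffs` helper from earlier (assuming, as on the slide, that µ ≤ λ_min; recall betas[k] = β_{k+1}):

```python
def gauss_radau_coeffs(alphas, betas, mu):
    # alpha_k^{(mu)} from the recursion above, starting from alpha_0^{(mu)} = 1/mu.
    a_mu = [1.0 / mu]
    for k in range(len(alphas) - 1):
        delta = a_mu[k] - alphas[k]                    # alpha_k^{(mu)} - alpha_k
        a_mu.append(delta / (mu * delta + betas[k]))   # betas[k] = beta_{k+1}
    return a_mu

# Two-sided bracket for ||x - x_k||_A^2:
#   alphas[k] * rnorms[k]**2  <  ||x - x_k||_A^2  <  a_mu[k] * rnorms[k]**2
```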


(10)

Upper bound in exact arithmetic

Gauss-Radau bound, bcsstk01 matrix, n = 48

    µ = λ_min / (1 + 10^{−m}),    m = 2, . . . , 14

[Figure: A-norm of the error and the Gauss-Radau bounds vs. iteration number]

(11)

Upper bound in finite precision arithmetic

Gauss-Radau bound, bcsstk01 matrix, n = 48

    µ = λ_min / (1 + 10^{−m}),    m = 2, . . . , 14

[Figure: A-norm of the error and the Gauss-Radau bounds vs. iteration number]

(12)

Upper bound in finite precision arithmetic

µ > λ_min, bcsstk01 matrix, n = 48

    µ = λ_min / (1 − 10^{−m}),    m = 2 : 2 : 14;    the coefficients α_k^{(µ)} become negative, α_k^{(µ)} < 0

[Figure: A-norm of the error and the Gauss-Radau bounds vs. iteration number]

(13)

An upper bound on the upper bound

(14)

An upper bound on the Gauss-Radau bound, µ ≤ λ_min

[Meurant, T. 2018?]

    ‖x − x_k‖²_A < α_k^{(µ)} ‖r_k‖² < (‖r_k‖²/µ) · (‖r_k‖²/‖p_k‖²)

[Figure: A-norm of the error, the Gauss-Radau bound, and the new bound vs. iteration number]

(15)

The new bound

[Meurant, T. 2018?]

    ‖x − x_k‖²_A < α_k^{(µ)} ‖r_k‖² < (‖r_k‖²/µ) · (‖r_k‖²/‖p_k‖²)

Having µ, we can compute it almost for free.

Monotonically decreasing.

Not sensitive to the choice of µ.

As good as Gauss-Radau in many cases.

It can be used even if µ > λ_min (heuristics).
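In code the new bound is a one-liner on top of quantities CG already has (a sketch; the function name is mine, and to use it with the `cg_with_coeffs` sketch above one would additionally record ‖p_k‖ inside the loop):

```python
def new_upper_bound_sq(rnorm_k, pnorm_k, mu):
    # ||x - x_k||_A^2 < (||r_k||^2 / mu) * (||r_k||^2 / ||p_k||^2)
    return (rnorm_k**2 / mu) * (rnorm_k**2 / pnorm_k**2)
```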

(16)

How to approximate µ?


(17)

The conjugate gradient method

A is symmetric and positive definite, Ax = b

    input A, b, x_0
    p_0 = r_0 = b − Ax_0
    for k = 1, 2, . . . do
        α_{k−1} = (r_{k−1}^T r_{k−1}) / (p_{k−1}^T A p_{k−1})
        x_k = x_{k−1} + α_{k−1} p_{k−1}
        r_k = r_{k−1} − α_{k−1} A p_{k−1}
        β_k = (r_k^T r_k) / (r_{k−1}^T r_{k−1})
        p_k = r_k + β_k p_{k−1}
    end for

The CG coefficients define the upper bidiagonal matrix U_k with diagonal entries 1/√α_0, . . . , 1/√α_{k−1} and superdiagonal entries √(β_1/α_0), . . . , √(β_{k−1}/α_{k−2}), and

    T_k = U_k^T U_k.
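This connection can be checked numerically in a few lines (a sketch, building U_k densely via SVD; the incremental estimate actually used in practice comes on the following slides):

```python
import numpy as np

def ritz_values_from_cg(alphas, betas):
    # Bidiagonal U_k: diagonal 1/sqrt(alpha_j), superdiagonal sqrt(beta_{j+1}/alpha_j).
    k = len(alphas)
    U = np.zeros((k, k))
    for j in range(k):
        U[j, j] = 1.0 / np.sqrt(alphas[j])
        if j + 1 < k:
            U[j, j + 1] = np.sqrt(betas[j] / alphas[j])   # betas[j] = beta_{j+1}
    # T_k = U_k^T U_k, so its eigenvalues are the squared singular values of U_k.
    return np.linalg.svd(U, compute_uv=False) ** 2

# lambda_min(T_k) = ritz_values_from_cg(alphas, betas)[-1]   (= 1/||U_k^{-1}||^2)
```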

(18)

Approximation of λ_min and λ_max in CG

[Diagram: A → T_k → EST — the estimates are computed from the tridiagonal matrix T_k associated with CG]

    T_k = U_k^T U_k,    λ_min(T_k) = 1/‖U_k^{−1}‖²

How to approximate ‖U_k^{−1}‖²?

(19)

Incremental estimation of the smallest Ritz value in CG

U_k is bidiagonal, and U_k^{−1} → U_{k+1}^{−1} by adding one column and one row.

Incremental norm estimation: incrementally improve an approximation of the maximum right singular vector.

[Bischof 1990], [Duff, Vömel 2002], [Duintjer Tebbens, Tůma 2014].


(21)

Approximation of the smallest Ritz value in CG

Having α_k and β_k, update

    σ_k = −√(α_k β_k / α_{k−1}) · (s_{k−1} σ_{k−1} + c_{k−1} τ_{k−1})
    τ_k = (α_k β_k / α_{k−1}) τ_{k−1} + 1
    ω_k² = (ρ_k − τ_k)² + 4σ_k²
    c_k² = (1/2) (1 − (ρ_k − τ_k)/ω_k)
    ρ_{k+1} = ρ_k + ω_k c_k²
    s_k = √(1 − c_k²),    c_k = |c_k| · sign(σ_k)
    µ_{k+1} = ρ_{k+1}^{−1}

Very cheap: no vectors or coefficients need to be stored.

(22)

Approximation of λ_min

bcsstk01, n = 48

[Figure: approximations µ_k of λ_min and the A-norm of the error vs. iteration number]

(23)

Approximation of λ_min

s3dkt3m2, n = 90449, ichol

[Figure: approximations µ_k of λ_min and the A-norm of the error vs. iteration number]

(24)

Bounds summary

bcsstk01, n = 48

    α_k ‖r_k‖² < ‖x − x_k‖²_A ≲ (‖r_k‖²/µ_k) · (‖r_k‖²/‖p_k‖²)

[Figure: A-norm of the error, GQ lower bound, and the upper bounds vs. iteration number]

(25)

Bounds summary

s3dkt3m2, n = 90449, ichol

    α_k ‖r_k‖² < ‖x − x_k‖²_A ≲ (‖r_k‖²/µ_k) · (‖r_k‖²/‖p_k‖²)

[Figure: A-norm of the error, Radau upper bound, GQ lower bound, and the µ_k upper bound vs. iteration number]

(26)

Improving accuracy of the estimates

    ‖x − x_k‖²_A = α_k ‖r_k‖² + ‖x − x_{k+1}‖²_A

[Golub, Strakoš 1994], [Golub, Meurant 1997], [Strakoš, T. 2002]

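Because the identity is exact in exact arithmetic (and holds to high relative accuracy in finite precision), it can be sanity-checked by telescoping it from x_0 down to convergence, reusing the hypothetical `cg_with_coeffs` sketch from above; the SPD test matrix here is mine, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((48, 48)))
A = Q @ np.diag(np.linspace(1e-1, 1e1, 48)) @ Q.T    # SPD test matrix
b = rng.standard_normal(48)

xk, alphas, betas, rnorms = cg_with_coeffs(A, b, np.zeros(48))
x = np.linalg.solve(A, b)

# Telescoping ||x-x_k||_A^2 = alpha_k ||r_k||^2 + ||x-x_{k+1}||_A^2 from k = 0:
lhs = x @ (A @ x)                                    # ||x - x_0||_A^2 with x_0 = 0
rhs = np.sum(alphas * rnorms[:len(alphas)]**2)       # sum of alpha_j ||r_j||^2
print(abs(lhs - rhs) / lhs)                          # ~ machine precision
```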

(27)

Use a delay and bound the (k + d)th error

bcsstk01, n = 48, d = 10

    ‖x − x_k‖²_A = Σ_{j=k}^{k+d−1} α_j ‖r_j‖² + ‖x − x_{k+d}‖²_A

[Figure: A-norm of the error, GQ lower bound, and the upper bounds vs. iteration k]


(29)

Use a delay and bound the (k + d)th error

s3dkt3m2, n = 90449, ichol, d = 100

    ‖x − x_k‖²_A = Σ_{j=k}^{k+d−1} α_j ‖r_j‖² + ‖x − x_{k+d}‖²_A

[Figure: A-norm of the error, GQ lower bound, and the upper bounds vs. iteration k]


(31)

How to choose d adaptively?

    ‖x − x_k‖²_A = Σ_{j=k}^{k+d−1} α_j ‖r_j‖² + ‖x − x_{k+d}‖²_A = ν_{k,d} + ‖x − x_{k+d}‖²_A,

and, dividing by ‖x − x_k‖²_A,

    1 = ν_{k,d} / ‖x − x_k‖²_A + ‖x − x_{k+d}‖²_A / ‖x − x_k‖²_A.

If ‖x − x_{k+d}‖_A / ‖x − x_k‖_A < ε, then

    ν_{k,d}^{1/2} < ‖x − x_k‖_A < ν_{k,d}^{1/2} / √(1 − ε²),    e.g., ε = 0.8.
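The delayed estimate is straightforward to compute from the recorded quantities; a sketch (helper names are mine; the two-sided bracket is valid only when the ratio test above holds with the chosen ε):

```python
import math

def nu(alphas, rnorms, k, d):
    # nu_{k,d} = sum_{j=k}^{k+d-1} alpha_j ||r_j||^2, a lower bound on ||x-x_k||_A^2.
    return sum(alphas[j] * rnorms[j]**2 for j in range(k, k + d))

def bracket(alphas, rnorms, k, d, eps=0.8):
    # If ||x-x_{k+d}||_A < eps * ||x-x_k||_A, then
    #   sqrt(nu_{k,d}) < ||x-x_k||_A < sqrt(nu_{k,d}) / sqrt(1 - eps^2).
    v = nu(alphas, rnorms, k, d)
    return math.sqrt(v), math.sqrt(v / (1.0 - eps**2))
```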


(33)

Pseudo-algorithm: adaptive choice of d

 1: go = 1
 2: d = d + 1
 3: while ((d ≥ 1) and (go)) do
 4:     compute ν_{k−d,d}
 5:     if ‖x − x_k‖_A / ‖x − x_{k−d}‖_A < ε then
 6:         compute lower or upper bounds at iteration k − d
 7:         d = d − 1
 8:     else
 9:         go = 0
10:     end if
11: end while
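A sketch of this loop in Python: the callable `ratio_est` stands for whatever computable estimate of ‖x − x_k‖_A / ‖x − x_{k−d}‖_A one uses (e.g. the µ_k-based test of the next slide), and the `emit` callback and names are mine.

```python
def adaptive_estimates(k, d, nu_fn, ratio_est, eps=0.5, emit=print):
    # One CG iteration's worth of the adaptive-delay logic above.
    d += 1                                     # line 2: d = d + 1
    go = True
    while d >= 1 and go:                       # line 3
        v = nu_fn(k - d, d)                    # line 4: nu_{k-d,d}
        if ratio_est(k, d) < eps:              # line 5: estimated error ratio
            lo = v**0.5                        # lower bound on ||x - x_{k-d}||_A
            up = (v / (1.0 - eps**2))**0.5     # upper bound, line 6
            emit((k - d, lo, up))
            d -= 1                             # line 7
        else:
            go = False                         # line 9
    return d
```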

(34)

Adaptive choice of d

µ_k upper bound approach, bcsstk01

    ‖x − x_{k+d}‖_A / ‖x − x_k‖_A ≲ ( (‖r_{k+d}‖²/µ_{k+d}) · (‖r_{k+d}‖²/‖p_{k+d}‖²) / ν_{k,d} )^{1/2} < ε = 0.5

[Figure: A-norm of the error with adaptive choice of d]


(36)

Adaptive choice of d

µ_k upper bound approach, s3dkt3m2

    ‖x − x_{k+d}‖_A / ‖x − x_k‖_A ≲ ( (‖r_{k+d}‖²/µ_{k+d}) · (‖r_{k+d}‖²/‖p_{k+d}‖²) / ν_{k,d} )^{1/2} < ε = 0.5

[Figure: A-norm of the error with adaptive choice of d]


(38)

Adaptive choice of d

µ_k upper bound approach, s3dkt3m2

[Figure: detail (iterations ≈ 2300–2900) of the A-norm of the error with adaptive choice of d]

(39)

Another approach

superlinear, linear, sublinear CG convergence

Let α > β > γ > 0. Then

    (β − γ)/(α − β) > β/α   ⟺   β/α > γ/β

(both inequalities are equivalent to β² > αγ). Apply this with

    α = ‖x − x_{k−d}‖²_A > β = ‖x − x_k‖²_A > γ = ‖x − x_{k+d}‖²_A.

Then

    ν_{k,d}/ν_{k−d,d} > ‖x − x_k‖²_A / ‖x − x_{k−d}‖²_A   ⟺   ‖x − x_k‖_A / ‖x − x_{k−d}‖_A > ‖x − x_{k+d}‖_A / ‖x − x_k‖_A.

CG convergence: linear → good approximation, superlinear → upper bound, sublinear → lower bound.


(42)

Yet another approach

decrease formula

For ℓ + e < k and d > 0, one can relate ‖x − x_{k+d}‖²_A / ‖x − x_k‖²_A and ‖x − x_{ℓ+e}‖²_A / ‖x − x_ℓ‖²_A. It holds that

    ‖x − x_{k+d}‖²_A / ‖x − x_k‖²_A = 1 − [ Σ_{j=k}^{k+d−1} g_j ] / [ (1/(1 − ‖x − x_{ℓ+e}‖²_A/‖x − x_ℓ‖²_A) − 1) · Σ_{j=ℓ}^{ℓ+e−1} g_j − Σ_{j=ℓ+e}^{k−1} g_j ]

where g_j = α_j ‖r_j‖². This follows from the telescoped identity ‖x − x_m‖²_A = Σ_{j≥m} g_j: the given ratio determines ‖x − x_ℓ‖²_A = Σ_{j=ℓ}^{ℓ+e−1} g_j / (1 − ‖x − x_{ℓ+e}‖²_A/‖x − x_ℓ‖²_A), and the denominator above is then exactly ‖x − x_k‖²_A.

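The identity can be verified numerically: for any positive sequence g_j, define errors by the telescoped sums E_m = Σ_{j≥m} g_j and compare both sides (synthetic data, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.random(40)                        # any positive g_j = alpha_j ||r_j||^2
E = np.cumsum(g[::-1])[::-1]              # E[m] = ||x - x_m||_A^2 = sum_{j>=m} g_j

l, e, k, d = 3, 4, 15, 6                  # indices with l + e < k and d > 0
q = E[l + e] / E[l]                       # the known decrease ratio
S1 = g[l:l + e].sum()                     # sum_{j=l}^{l+e-1} g_j
S2 = g[l + e:k].sum()                     # sum_{j=l+e}^{k-1} g_j
S3 = g[k:k + d].sum()                     # sum_{j=k}^{k+d-1} g_j

predicted = 1.0 - S3 / ((1.0 / (1.0 - q) - 1.0) * S1 - S2)
print(predicted, E[k + d] / E[k])         # agree up to round-off
```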


(44)

Conclusions

A new bound on the A-norm of the error: simple, comparable with the Gauss-Radau bound, and usable even for µ > λ_min (heuristics).

Cheap approximations of the extreme Ritz values can be used in the new “upper” bound (µ ←→ µ_k), and these approximations can also be used to estimate other characteristics.

Strategy for improving accuracy of the error estimates: based on the decrease of ‖x − x_k‖_A in d iterations; three approaches to choosing d adaptively.

Future work: combine the approaches, numerical experiments.


(48)

Related papers

G. H. Golub and Z. Strakoš, Estimates in quadratic formulas, Numer. Algorithms, 8 (1994), pp. 241–268.

G. Meurant and P. Tichý, Practical estimation of the A-norm of the error in CG, to be submitted soon, 2018.

G. Meurant and P. Tichý, On computing quadrature-based bounds for the A-norm of the error in CG, Numer. Algorithms, 62 (2013), pp. 163–191.

Z. Strakoš and P. Tichý, On error estimation in CG and why it works in FP computations, Electron. Trans. Numer. Anal., 13 (2002), pp. 56–80.

Thank you for your attention!
