Volume 2009, Article ID 941936, 20 pages, doi:10.1155/2009/941936

Research Article

Bounds for Tail Probabilities of the Sample Variance

V. Bentkus^1 and M. Van Zuijlen^2

^1 Vilnius Pedagogical University, Studentu 39, LT-08106 Vilnius, Lithuania
^2 IMAPP, Radboud University Nijmegen, P.O. Box 9010, 6500 GL Nijmegen, The Netherlands

Correspondence should be addressed to V. Bentkus, bentkus@ktl.mii.lt

Received 11 February 2009; Accepted 20 June 2009

Recommended by Andrei Volodin

We provide bounds for the tail probabilities of the sample variance. The bounds are expressed in terms of Hoeffding functions and are the sharpest known. They are designed with applications in mind in auditing, as well as in the processing of environmental data.

Copyright © 2009 V. Bentkus and M. Van Zuijlen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction and Results

Let $X, X_1, \dots, X_n$ be a random sample of independent identically distributed observations. Throughout we write

$$\mu=\mathbf{E}X,\qquad \sigma^2=\mathbf{E}(X-\mu)^2,\qquad \omega=\mathbf{E}(X-\mu)^4\tag{1.1}$$

for the mean, the variance, and the fourth central moment of $X$, and assume that $n\ge2$. Some of our results hold only for bounded random variables. In such cases, without loss of generality, we assume that $0\le X\le1$; note that $0\le X\le1$ is a natural condition in audit applications.

The sample variance $\hat\sigma^2$ of the sample $X_1,\dots,X_n$ is defined as

$$\hat\sigma^2=\frac{1}{n-1}\sum_{i=1}^{n}\big(X_i-\bar X\big)^2,\tag{1.2}$$

where $\bar X$ is the sample mean, $n\bar X=X_1+\cdots+X_n$. We can rewrite (1.2) as

$$\hat\sigma^2=\frac{1}{n(n-1)}\sum_{i\ne j,\;1\le i,j\le n}\frac{(X_i-X_j)^2}{2}.\tag{1.3}$$

We are interested in deviations of the statistic $\hat\sigma^2$ from its mean $\sigma^2=\mathbf{E}\hat\sigma^2$, that is, in bounds for the tail probabilities of the statistic $T=\sigma^2-\hat\sigma^2$:

$$P\{T\ge t\}=P\big\{\hat\sigma^2\le\sigma^2-t\big\},\qquad 0\le t\le\sigma^2,\tag{1.4}$$

$$P\{T\le-t\}=P\big\{\hat\sigma^2\ge\sigma^2+t\big\},\qquad t\ge0.\tag{1.5}$$

The paper is organized as follows. In the introduction we give a description of the bounds, some comments, and references. In Section 2 we obtain sharp upper bounds for the fourth moment. In Section 3 we give proofs of all facts and results from the introduction.

If $0\le X\le1$, then the range of interest in (1.5) is $0\le t\le\gamma^2$, where

$$\gamma^2=\begin{cases}\dfrac{1}{4}-\sigma^2+\dfrac{1}{4(n-1)},&\text{if $n$ is even},\\[1ex]\dfrac{1}{4}-\sigma^2+\dfrac{1}{4n},&\text{if $n$ is odd}.\end{cases}\tag{1.6}$$

The restriction $0\le t\le\sigma^2$ on the range of $t$ in (1.4) (resp., $0\le t\le\gamma^2$ in (1.5), in cases where the condition $0\le X\le1$ is fulfilled) is natural. Indeed, $P\{T\ge t\}=0$ for $t>\sigma^2$, due to the obvious inequality $\hat\sigma^2\ge0$. Furthermore, in the case of $0\le X\le1$ we have $P\{T\le-t\}=0$ for $t>\gamma^2$, since $\hat\sigma^2\le\gamma^2+\sigma^2$ (see Proposition 2.3 for a proof of the latter inequality).

The asymptotic (as $n\to\infty$) properties of $T$ (see Section 3 for proofs of (1.7) and (1.8)) can be used to test the quality of bounds for tail probabilities. Under the condition $\mathbf{E}X^4<\infty$ the statistic $T=\sigma^2-\hat\sigma^2$ is asymptotically normal, provided that $X$ is not a Bernoulli random variable symmetric around its mean. Namely, if $\omega>\sigma^4$, then

$$\lim_{n\to\infty}P\Big\{\sqrt{n}\,T\ge y\sqrt{\omega-\sigma^4}\Big\}=1-\Phi(y),\qquad y\in\mathbb{R}.\tag{1.7}$$

If $\omega=\sigma^4$ (which happens if and only if $X$ is a Bernoulli random variable symmetric around its mean), then asymptotically $T$ has a $\chi^2$-type distribution, that is,

$$\lim_{n\to\infty}P\Big\{\frac{nT}{\sigma^2}\ge y\Big\}=P\big\{\eta^2-1\ge y\big\},\qquad y\in\mathbb{R},\tag{1.8}$$

where $\eta$ is a standard normal random variable, and $\Phi(y)=P\{\eta\le y\}$ is the standard normal distribution function.

(3)

Let us recall the already known bounds for the tail probabilities of the sample variance (see (1.19)–(1.21)). We need notation related to certain functions coming back to Hoeffding [1]. Let $0<p\le1$ and $q=1-p$. Write

$$H(x;p)=\Big(1+\frac{qx}{p}\Big)^{-qx-p}(1-x)^{qx-q},\qquad 0\le x\le1.\tag{1.9}$$

For $x\le0$ we define $H(x;p)=1$. For $x>1$ we set $H(x;p)=0$. Note that our notation for the function $H$ is slightly different from the traditional one. Let $\lambda\ge0$. Introduce as well the function

$$\Pi(x;\lambda)=e^{x}\Big(1+\frac{x}{\lambda}\Big)^{-x-\lambda}\quad\text{for }x\ge0,\tag{1.10}$$

and $\Pi(x;\lambda)=1$ for $x\le0$. One can check that

$$H(x;p)\le\Pi\Big(x;\frac{p}{q}\Big).\tag{1.11}$$

All our bounds are expressed in terms of the function $H$. Using (1.11), it is easy to replace them by bounds expressed in terms of the function $\Pi$, and we omit the related formulations.
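Since all bounds below are stated through $H$ and $\Pi$, it may help to see them computed. The following sketch (the function names `hoeffding_H` and `poisson_Pi` are ours, not the paper's) implements (1.9)–(1.10), including the conventions for $x\le0$ and $x>1$, and spot-checks the inequality (1.11) on a grid:

```python
import math

def hoeffding_H(x, p):
    """Hoeffding function H(x; p) of (1.9); H = 1 for x <= 0 and H = 0 for x > 1."""
    q = 1.0 - p
    if x <= 0.0:
        return 1.0
    if x > 1.0:
        return 0.0
    return (1.0 + q * x / p) ** (-q * x - p) * (1.0 - x) ** (q * x - q)

def poisson_Pi(x, lam):
    """The function Pi(x; lambda) of (1.10); Pi = 1 for x <= 0."""
    if x <= 0.0:
        return 1.0
    return math.exp(x) * (1.0 + x / lam) ** (-x - lam)

# Spot-check inequality (1.11): H(x; p) <= Pi(x; p/q) for 0 <= x <= 1.
p = 0.3
q = 1.0 - p
for i in range(1, 100):
    x = i / 100.0
    assert hoeffding_H(x, p) <= poisson_Pi(x, p / q) + 1e-12
```

Both functions are decreasing in $x$ on $[0,1]$, which is what makes them usable as tail bounds.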

Let $0\le p<1$ and $\sigma^2\ge0$ be related by

$$p=\frac{\sigma^2}{1+\sigma^2},\qquad q=\frac{1}{1+\sigma^2},\qquad p+q=1.\tag{1.12}$$

Let $\varepsilon$ be a Bernoulli random variable such that $P\{\varepsilon=-\sigma^2\}=q$ and $P\{\varepsilon=1\}=p$. Then $\mathbf{E}\varepsilon=0$ and $\mathbf{E}\varepsilon^2=\sigma^2$. The function $H$ is related to the generating function (the Laplace transform) of binomial distributions, since

$$H(x;p)=\inf_{h>0}\exp\{-hx\}\,\mathbf{E}\exp\{h\varepsilon\},\tag{1.13}$$

$$H^n(x;p)=\inf_{h>0}\exp\{-hnx\}\,\mathbf{E}\exp\{h(\varepsilon_1+\cdots+\varepsilon_n)\},\tag{1.14}$$

where $\varepsilon_1,\dots,\varepsilon_n$ are independent copies of $\varepsilon$. Note that (1.14) is an obvious corollary of (1.13). We omit the elementary calculations leading to (1.13). In a similar way

$$\Pi(x;\lambda)=\inf_{h>0}\exp\{-hx\}\,\mathbf{E}\exp\{h(\eta-\lambda)\},\tag{1.15}$$

where $\eta$ is a Poisson random variable with parameter $\lambda$.
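The variational characterization (1.13) can be verified numerically: a grid search over $h>0$ of $e^{-hx}\,\mathbf{E}e^{h\varepsilon}$, with $\varepsilon$ as in (1.12), should reproduce the closed form (1.9). A minimal sketch (helper names are ours; the grid search only approximates the infimum):

```python
import math

def H_closed(x, p):
    # Closed form (1.9) on 0 < x < 1.
    q = 1.0 - p
    return (1.0 + q * x / p) ** (-q * x - p) * (1.0 - x) ** (q * x - q)

def H_variational(x, p, h_max=10.0, steps=20000):
    # inf over h > 0 of exp(-h x) * E exp(h * eps), where eps takes the value 1
    # with probability p and -sigma^2 = -p/q with probability q (cf. (1.12), (1.13)).
    q = 1.0 - p
    s2 = p / q
    best = 1.0  # value at h = 0
    for i in range(1, steps + 1):
        h = h_max * i / steps
        val = math.exp(-h * x) * (p * math.exp(h) + q * math.exp(-h * s2))
        best = min(best, val)
    return best

for x in (0.1, 0.3, 0.5, 0.8):
    assert abs(H_closed(x, 0.25) - H_variational(x, 0.25)) < 1e-4
```

The same grid search with a Poisson $\eta$ would reproduce $\Pi$ of (1.15).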

The functions $H$ and $\Pi$ satisfy a kind of Central Limit Theorem. Namely, for given $0<p<1$ and $y\ge0$ we have

$$\lim_{n\to\infty}H^n\Big(yn^{-1/2}\sqrt{p/q};\,p\Big)=\lim_{n\to\infty}\Pi^n\Big(yn^{-1/2}\sqrt{\lambda};\,\lambda\Big)=\exp\Big\{-\frac{y^2}{2}\Big\}\tag{1.16}$$

(we omit the elementary calculations leading to (1.16)). Furthermore, we have [1]

$$H\Big(y\sqrt{p/q};\,p\Big)\le\exp\Big\{-\frac{y^2}{2}\Big\},\qquad \frac12\le p<1,\ y\ge0,\tag{1.17}$$

and we also have [2]

$$H\Big(\frac{yp}{q};\,p\Big)\le\exp\Big\{-\frac{py^2}{2q(y+1)}\Big\},\qquad 0\le p\le\frac12,\ y\ge0.\tag{1.18}$$

Using the introduced notation, we can recall the known results (see [2, Lemma 3.2]). Let $k=\lfloor n/2\rfloor$ be the integer part of $n/2$. Assume that $0\le X\le1$. If $\sigma^2$ is known, then

$$P\{T\ge t\}\le U_0,\qquad U_0:=H^k\Big(\frac{t}{\sigma^2};\,1-2\sigma^2\Big).\tag{1.19}$$

The right-hand side of (1.19) is an increasing function of $\sigma^2\le1/4$ (see Section 3 for a short proof of (1.19) as a corollary of Theorem 1.1). If $\sigma^2$ is unknown but $\mu$ is known, then

$$P\{T\ge t\}\le U_1,\qquad U_1:=H^k\Big(\frac{t}{\mu-\mu^2};\,1-2\mu+2\mu^2\Big).\tag{1.20}$$

Using the obvious estimate $\sigma^2\le\mu(1-\mu)$, the bound (1.20) is implied by (1.19). In cases where both $\mu$ and $\sigma^2$ are unknown, we have

$$P\{T\ge t\}\le U_2,\qquad U_2:=H^k\Big(4t;\,\frac12\Big),\tag{1.21}$$

as follows from (1.19) using the obvious bound $\sigma^2\le1/4$.
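The bounds (1.19)–(1.21) are straightforward to evaluate. A sketch (function names are ours; $H$ is the closed form (1.9)) illustrating that more moment information gives a smaller bound:

```python
def H(x, p):
    q = 1.0 - p
    if x <= 0.0:
        return 1.0
    if x > 1.0:
        return 0.0
    return (1.0 + q * x / p) ** (-q * x - p) * (1.0 - x) ** (q * x - q)

def U0(t, n, var):  # (1.19): sigma^2 known
    return H(t / var, 1.0 - 2.0 * var) ** (n // 2)

def U1(t, n, mu):   # (1.20): only mu known; uses sigma^2 <= mu(1 - mu)
    return U0(t, n, mu - mu * mu)

def U2(t, n):       # (1.21): neither known; uses sigma^2 <= 1/4
    return H(4.0 * t, 0.5) ** (n // 2)

# Example: n = 60 observations, mu = 1/4, sigma^2 = 1/48 (uniform on [0, 1/2]).
n, var, mu, t = 60, 1.0 / 48.0, 0.25, 0.01
u0, u1, u2 = U0(t, n, var), U1(t, n, mu), U2(t, n)
# Monotonicity of (1.19) in sigma^2 <= 1/4 gives u0 <= u1 <= u2.
assert 0.0 < u0 <= u1 <= u2 <= 1.0
```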

Let us note that the known bounds (1.19)–(1.21) are the best possible in the framework of an approach based on an analysis of the variance, on the use of exponential functions, and on an inequality of Hoeffding (see (3.3)), which allows one to reduce the problem to the estimation of tail probabilities for sums of independent random variables. Our improvement is due to a careful analysis of the fourth moment, which turns out to be quite complicated; see Section 2. Briefly, the results of this paper are the following: we prove a general bound involving $\mu$, $\sigma^2$, and the fourth moment $\omega$; this general bound implies all the other bounds, in particular a new precise bound involving $\mu$ and $\sigma^2$; we also provide bounds for the lower tails $P\{T\le-t\}$; and we compare the bounds analytically, mostly for sufficiently large $n$.

From the mathematical point of view the sample variance is one of the simplest nonlinear statistics. Known bounds for tail probabilities are designed with linear statistics in mind, possibly also for dependent observations. See the seminal paper of Hoeffding [1] published in JASA. For further developments see Talagrand [3], Pinelis [4, 5], Bentkus [6, 7], Bentkus et al. [8, 9], and so forth. Our intention is to develop tools useful in the setting of nonlinear statistics, using the sample variance as a test statistic.

Theorem 1.1 extends and improves the known bounds (1.19)–(1.21). We can derive (1.19)–(1.21) from this theorem, since we can estimate the fourth moment $\omega$ via various combinations of $\mu$ and $\sigma^2$ using the boundedness assumption $0\le X\le1$.

Theorem 1.1. Let $k=\lfloor n/2\rfloor$ and $\omega_0\ge0$. If $\mathbf{E}X^4<\infty$ and $\omega\le\omega_0$, then

$$P\{T\ge t\}\le U,\qquad U:=H^k\Big(\frac{t}{\sigma^2};\,p\Big)\tag{1.22}$$

with

$$p=\frac{\sigma^4+\omega_0}{3\sigma^4+\omega_0}=\frac{s^2}{1+s^2},\qquad s^2=\frac{\sigma^4+\omega_0}{2\sigma^4}.\tag{1.23}$$

If $0\le X\le1$ and $\omega\le\omega_0$, then

$$P\{T\le-t\}\le L,\qquad L:=H^k\Big(\frac{2t}{1-2\sigma^2};\,p\Big)\tag{1.24}$$

with

$$p=\frac{2(\omega_0+\sigma^4)}{(1-2\sigma^2)^2+2(\omega_0+\sigma^4)}=\frac{s^2}{1+s^2},\qquad s^2=\frac{2(\omega_0+\sigma^4)}{(1-2\sigma^2)^2}.\tag{1.25}$$

Both bounds $U$ and $L$ are increasing functions of $p$, $\omega_0$, and $s^2$.

Remark 1.2. In order to derive upper confidence bounds we need only estimates of the upper tail $P\{T\ge t\}$ (see [2]). To estimate the upper tail, the condition $\mathbf{E}X^4<\infty$ is sufficient. The lower tail $P\{T\le-t\}$ has a different type of behavior, since to estimate it we indeed need the assumption that $X$ is a bounded random variable.

For $0\le X\le1$, Theorem 1.1 implies the known bounds (1.19)–(1.21) for the upper tail of $T$. It implies as well the bounds (1.26)–(1.29) for the lower tail. The lower tail has a bit more complicated structure; cf. (1.26)–(1.29) with their counterparts (1.19)–(1.21) for the upper tail.

If $\sigma^2$ is known, then

$$P\{T\le-t\}\le L_0,\qquad L_0:=H^k\Big(\frac{2t}{1-2\sigma^2};\,2\sigma^2\Big).\tag{1.26}$$

One can show (we omit the details) that the bound $L_0$ is not an increasing function of $\sigma^2$. A bit rougher inequality,

$$P\{T\le-t\}\le \bar L_0,\qquad \bar L_0:=H^k\Big(2t;\,\frac{2\sigma^2}{1+2\sigma^2}\Big),\tag{1.27}$$

Figure 1: $D=D_1\cup D_2\cup D_3$ (the regions $D_1$, $D_2$, $D_3$ of (1.32) in the $(\mu,\sigma^2)$-plane, $0\le\mu\le1$, $0\le\sigma^2\le1/4$).

has the monotonicity property, since $\bar L_0$ is an increasing function of $\sigma^2$. If $\mu$ is known, then using the obvious inequality $\sigma^2\le\mu(1-\mu)$, the bound (1.27) yields

$$P\{T\le-t\}\le L_1,\qquad L_1:=H^k\Big(2t;\,\frac{2\mu-2\mu^2}{1+2\mu-2\mu^2}\Big).\tag{1.28}$$

If we have no information about $\mu$ and $\sigma^2$, then using $\sigma^2\le1/4$, the bound (1.27) implies

$$P\{T\le-t\}\le L_2,\qquad L_2:=H^k\Big(2t;\,\frac13\Big).\tag{1.29}$$
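The four lower-tail bounds can be compared numerically. A sketch (helper names ours) checking the ordering $L_0\le\bar L_0\le L_1\le L_2$ that follows from (3.28) and from the monotonicity of $\bar L_0$ in $\sigma^2$:

```python
def H(x, p):
    q = 1.0 - p
    if x <= 0.0:
        return 1.0
    if x > 1.0:
        return 0.0
    return (1.0 + q * x / p) ** (-q * x - p) * (1.0 - x) ** (q * x - q)

def L0(t, n, var):     # (1.26): sigma^2 known
    return H(2.0 * t / (1.0 - 2.0 * var), 2.0 * var) ** (n // 2)

def L0bar(t, n, var):  # (1.27): rougher but monotone in sigma^2
    return H(2.0 * t, 2.0 * var / (1.0 + 2.0 * var)) ** (n // 2)

def L1(t, n, mu):      # (1.28): only mu known
    return L0bar(t, n, mu * (1.0 - mu))

def L2(t, n):          # (1.29): no information
    return H(2.0 * t, 1.0 / 3.0) ** (n // 2)

n, var, mu, t = 60, 1.0 / 48.0, 0.25, 0.01
assert L0(t, n, var) <= L0bar(t, n, var) <= L1(t, n, mu) <= L2(t, n)
```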

The bounds above do not cover the situation where both $\mu$ and $\sigma^2$ are known. To formulate a related result we need additional notation. In the case of $0\le X\le1$ we use the notation

$$f_1=(1-\mu)\Big(\frac12-\mu\Big),\qquad f_3=\mu\Big(\mu-\frac12\Big).\tag{1.30}$$

In view of the well-known upper bound $\sigma^2\le\mu(1-\mu)$ for the variance of $0\le X\le1$, we can partition the set

$$D=\Big\{(\mu,\sigma^2)\in\mathbb{R}^2:\ 0\le\mu\le1,\ 0\le\sigma^2\le\mu(1-\mu)\Big\}\tag{1.31}$$

of possible values of $\mu$ and $\sigma^2$ into a union $D=D_1\cup D_2\cup D_3$ of three subsets,

$$D_1=\big\{(\mu,\sigma^2)\in D:\ \sigma^2\le f_1\big\},\qquad D_3=\big\{(\mu,\sigma^2)\in D:\ \sigma^2\le f_3\big\},\tag{1.32}$$

and $D_2=D\setminus(D_1\cup D_3)$; see Figure 1.

Theorem 1.3. Write $k=\lfloor n/2\rfloor$. Assume that $0\le X\le1$.

The upper tail of the statistic $T$ satisfies

$$P\{T\ge t\}\le U_3,\qquad U_3:=H^k\Big(\frac{t}{\sigma^2};\,p_u\Big)\tag{1.33}$$

with $p_u=s^2/(1+s^2)$, where

$$s^2=\begin{cases}\dfrac{\sigma^4+(1-\mu)^4}{2(1-\mu)^2\sigma^2},&\text{if }(\mu,\sigma^2)\in D_1,\\[1.5ex]\dfrac{a+b\sigma^2+4\sigma^4}{8\sigma^4},&\text{if }(\mu,\sigma^2)\in D_2,\\[1.5ex]\dfrac{\sigma^4+\mu^4}{2\mu^2\sigma^2},&\text{if }(\mu,\sigma^2)\in D_3,\end{cases}\tag{1.34}$$

and where one can write

$$a=\mu(1-\mu)(2\mu-1)^2,\qquad b=8\mu^2-8\mu+3.\tag{1.35}$$

The lower tail of $T$ satisfies

$$P\{T\le-t\}\le L_3,\qquad L_3:=H^k\Big(\frac{2t}{1-2\sigma^2};\,p_l\Big)\tag{1.36}$$

with $p_l=s^2/(c^2+s^2)$, where $c=(1-2\sigma^2)/(2\sigma^2)$, and $s^2$ is defined by (1.34).
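A sketch of Theorem 1.3 as an algorithm (names ours): classify $(\mu,\sigma^2)$ into $D_1$, $D_2$, or $D_3$ via (1.30), evaluate $s^2$ by (1.34)–(1.35), and plug into (1.33) and (1.36). The final assertion checks the constant $c_1=42.60\ldots$ of (1.43) for the uniform distribution on $[0,1/2]$:

```python
def H(x, p):
    q = 1.0 - p
    if x <= 0.0:
        return 1.0
    if x > 1.0:
        return 0.0
    return (1.0 + q * x / p) ** (-q * x - p) * (1.0 - x) ** (q * x - q)

def s2_134(mu, var):
    """The quantity s^2 of (1.34)-(1.35), chosen by region D1/D2/D3."""
    f1 = (1.0 - mu) * (0.5 - mu)
    f3 = mu * (mu - 0.5)
    if var <= f1:  # D1
        return (var ** 2 + (1.0 - mu) ** 4) / (2.0 * (1.0 - mu) ** 2 * var)
    if var <= f3:  # D3
        return (var ** 2 + mu ** 4) / (2.0 * mu ** 2 * var)
    a = mu * (1.0 - mu) * (2.0 * mu - 1.0) ** 2
    b = 8.0 * mu ** 2 - 8.0 * mu + 3.0
    return (a + b * var + 4.0 * var ** 2) / (8.0 * var ** 2)  # D2

def U3(t, n, mu, var):  # (1.33)
    s2 = s2_134(mu, var)
    return H(t / var, s2 / (1.0 + s2)) ** (n // 2)

def L3(t, n, mu, var):  # (1.36)
    s2 = s2_134(mu, var)
    c = (1.0 - 2.0 * var) / (2.0 * var)
    return H(2.0 * t / (1.0 - 2.0 * var), s2 / (c * c + s2)) ** (n // 2)

# Uniform on [0, 1]: s^2 = 2, and for uniform on [0, 1/2] the constant
# c = 1 / (4 sigma^4 s^2) of (1.43) is 42.60...
assert abs(s2_134(0.5, 1.0 / 12.0) - 2.0) < 1e-12
var = 1.0 / 48.0
assert abs(1.0 / (4.0 * var ** 2 * s2_134(0.25, var)) - 42.60) < 0.02
```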

The bounds above are obtained using the classical transform $G\mapsto H_G$,

$$H_G(x)=\inf_{h>0}\mathbf{E}\exp\{h(Y-x)\},\tag{1.37}$$

of survival functions $G(x)=P\{Y\ge x\}$ (cf. the definitions (1.13) and (1.14) of the related Hoeffding functions). The bounds expressed in terms of Hoeffding functions have a simple analytical structure and are easily numerically computable.

All our upper and lower bounds satisfy a kind of Central Limit Theorem. Namely, if we consider an upper bound, say $U=U(t)$ (resp., a lower bound $L=L(t)$), as a function of $t$, then there exist limits

$$\lim_{n\to\infty}U\Big(\frac{t}{\sqrt n}\Big)=\exp\{-ct^2\},\qquad \lim_{n\to\infty}L\Big(\frac{t}{\sqrt n}\Big)=\exp\{-dt^2\}\tag{1.38}$$

with some positive $c$ and $d$. The values of $c$ and $d$ can be used to compare the bounds: the larger these constants, the better the bound. To prove (1.38) it suffices to note that, with $k=\lfloor n/2\rfloor$,

$$\lim_{n\to\infty}H^k\Big(\frac{x}{\sqrt n};\,p\Big)=\exp\Big\{-\frac{qx^2}{4p}\Big\}.\tag{1.39}$$

The Central Limit Theorem in the form of (1.7) restricts the range of possible values of $c$ and $d$. Namely, using (1.7) it is easy to see that $c$ and $d$ have to satisfy

$$c,d\le a:=\frac{1}{2(\omega-\sigma^4)}.\tag{1.40}$$

We provide the values of these constants for all our bounds and give their numerical values in the following two cases.

(i) $X$ is a random variable uniformly distributed in the interval $[0,1/2]$. The moments of this random variable satisfy

$$\mu=\frac14,\quad \sigma^2=\frac1{48},\quad (\mu,\sigma^2)\in D_1,\quad \omega=\frac1{1280},\quad a=1440.\tag{1.41}$$

For $\mu,\sigma^2,\omega$ defined by (1.41), the constants $c$ and $d$ are denoted $c_1$, $d_1$.

(ii) $X$ is uniformly distributed in $[0,1]$, and in this case

$$\mu=\frac12,\quad \sigma^2=\frac1{12},\quad (\mu,\sigma^2)\in D_2,\quad \omega=\frac1{80},\quad a=90.\tag{1.42}$$

For the constants $c$ and $d$ with $\mu,\sigma^2,\omega$ defined by (1.42) we write $c_2$, $d_2$. We have

For the constantscanddwithμ, σ2, ωdefined by1.42we give asc2, d2. We have

U2: c4, c14, c24,

U1: c 1

2μ−2μ2

1−2μ2μ2, c1 4.26. . . , c24, U0: c 1

2−4σ4, c125.04. . . , c27.2, U3: c 1

4s2, c1 42.60. . . , c218, 1.43

(9)

U: c 1

40, c1411.42. . . , c225.71. . . , 1.44 L2: d2, d12, d22,

L1: d 1

2μ−2μ2, d12.66. . . , d22, L0: d 1

2, d124, d26, L0: d 1

2−4σ4, d125.04. . . , d27.2, L3: d 1

4s2, d1 42.60. . . , d2 18,

1.45

L: d 1

40

, d1411.42. . . , d225.71. . . , 1.46

while calculating the constants in1.44and 1.46we choose ω0 ω. The quantity s2 in 1.43and1.45is defined by1.34.
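The limit (1.39) and the constants above can be checked numerically; a sketch (names ours), using a large but finite $n$ for the limit and the exact moments (1.41) of the uniform law on $[0,1/2]$:

```python
import math

def H(x, p):
    q = 1.0 - p
    if x <= 0.0:
        return 1.0
    if x > 1.0:
        return 0.0
    return (1.0 + q * x / p) ** (-q * x - p) * (1.0 - x) ** (q * x - q)

# (1.39): H^k(x / sqrt(n); p) -> exp(-q x^2 / (4 p)) with k = floor(n/2).
def limit_lhs(x, p, n):
    return H(x / math.sqrt(n), p) ** (n // 2)

x, p = 0.5, 0.4
target = math.exp(-(1.0 - p) * x * x / (4.0 * p))
assert abs(limit_lhs(x, p, 10 ** 7) - target) < 1e-3

# Case (i): X uniform on [0, 1/2], cf. (1.41) and (1.43).
var = 1.0 / 48.0
c_U0 = 1.0 / (2.0 * var - 4.0 * var * var)        # c for U0
a1 = 1.0 / (2.0 * (1.0 / 1280.0 - var * var))     # a of (1.40)
assert abs(c_U0 - 25.04) < 0.01
assert round(a1) == 1440
```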

Conclusions

Our new bounds provide a substantial improvement of the known bounds. However, from the asymptotic point of view these bounds still seem rather crude. To improve the bounds further, new methods and approaches are needed. Some preliminary computer simulations show that in applications where $n$ is finite and the random variables have small means and variances (as in auditing, where a typical value is $n=60$), the asymptotic behavior says little about the behavior for small $n$. Therefore bounds specially designed to cover the case of finite $n$ have to be developed.

2. Sharp Upper Bounds for the Fourth Moment

Recall that we consider bounded random variables such that $0\le X\le1$, and that we write $\mu=\mathbf{E}X$ and $\sigma^2=\mathbf{E}(X-\mu)^2$. In Lemma 2.1 we provide an optimal upper bound for the fourth moment of $X-\lambda$, given a shift $\lambda\in\mathbb{R}$, a mean $\mu$, and a variance $\sigma^2$. The maximizers of the fourth moment are either Bernoulli or trinomial random variables. It turns out that their distributions, say $\nu$, are of the following three types (i)–(iii):

(i) a two-point distribution such that

$$\nu\{d\}=r,\qquad \nu\{1\}=p,\qquad d=\mu-\frac{\sigma^2}{1-\mu},\tag{2.1}$$

$$r=\frac{(1-\mu)^2}{(1-\mu)^2+\sigma^2},\qquad p=\frac{\sigma^2}{(1-\mu)^2+\sigma^2};\tag{2.2}$$

(ii) a family of three-point distributions depending on $1/4<\lambda<3/4$ such that

$$\nu\{0\}=q,\qquad \nu\{d\}=r,\qquad \nu\{1\}=p,\qquad d=d(\lambda)=2\lambda-\frac12,\tag{2.3}$$

$$q=\frac{\sigma^2-f_1}{d(\lambda)},\qquad r=\frac{\mu(1-\mu)-\sigma^2}{d(\lambda)(1-d(\lambda))},\qquad p=\frac{\sigma^2-f_3}{1-d(\lambda)},\tag{2.4}$$

where we write

$$f_1=(1-\mu)\big(\mu-d(\lambda)\big),\qquad f_3=\mu\big(d(\lambda)-\mu\big);\tag{2.5}$$

notice that (2.4) supplies a three-point probability distribution only in cases where the inequalities $\sigma^2>f_1$ and $\sigma^2>f_3$ hold;

(iii) a two-point distribution such that

$$\nu\{0\}=q,\qquad \nu\{d\}=r,\qquad d=\mu+\frac{\sigma^2}{\mu},\tag{2.6}$$

$$q=\frac{\sigma^2}{\mu^2+\sigma^2},\qquad r=\frac{\mu^2}{\mu^2+\sigma^2}.\tag{2.7}$$

Note that the point $d$ in (2.1)–(2.7) satisfies $0\le d\le1$ and that the probability distribution $\nu$ has mean $\mu$ and variance $\sigma^2$.

Introduce the set

$$D=\Big\{(\mu,\sigma^2)\in\mathbb{R}^2:\ \mu=\mathbf{E}X,\ \sigma^2=\mathbf{E}(X-\mu)^2,\ 0\le X\le1\Big\}.\tag{2.8}$$

Using the well-known bound $\sigma^2\le\mu(1-\mu)$, valid for $0\le X\le1$, it is easy to see that

$$D=\Big\{(\mu,\sigma^2)\in\mathbb{R}^2:\ 0\le\mu\le1,\ 0\le\sigma^2\le\mu(1-\mu)\Big\}.\tag{2.9}$$

Let $\lambda\in\mathbb{R}$. We represent the set $D\subset\mathbb{R}^2$ as a union $D=D_1^\lambda\cup D_2^\lambda\cup D_3^\lambda$ of three subsets, setting

$$D_1^\lambda=\big\{(\mu,\sigma^2)\in D:\ \sigma^2\le f_1\big\},\qquad D_3^\lambda=\big\{(\mu,\sigma^2)\in D:\ \sigma^2\le f_3\big\},\tag{2.10}$$

and $D_2^\lambda=D\setminus(D_1^\lambda\cup D_3^\lambda)$, where $f_1$ and $f_3$ are given in (2.5). Let us mention the following properties of the regions.

(a) If $\lambda\le1/4$, then $D=D_1^\lambda$, since for such $\lambda$ obviously $\mu(1-\mu)\le f_1$ for all $0\le\mu\le1$. The set $D_3^\lambda=\{(0,0)\}$ is a one-point set. The set $D_2^\lambda$ is empty.

(b) If $\lambda\ge3/4$, then $D=D_3^\lambda$, since for such $\lambda$ clearly $\mu(1-\mu)\le f_3$ for all $0\le\mu\le1$. The set $D_1^\lambda=\{(1,0)\}$ is a one-point set. The set $D_2^\lambda$ is empty.

For $1/4<\lambda<3/4$ all three regions $D_1^\lambda$, $D_2^\lambda$, $D_3^\lambda$ are nonempty sets. The sets $D_1^\lambda$ and $D_3^\lambda$ have only one common point $(d(\lambda),0)\in D$, that is, $D_1^\lambda\cap D_3^\lambda=\{(d(\lambda),0)\}$.

Lemma 2.1. Let $\lambda\in\mathbb{R}$. Assume that a random variable $X$ satisfies

$$0\le X\le1,\qquad \mathbf{E}X=\mu,\qquad \mathbf{E}(X-\mu)^2=\sigma^2.\tag{2.11}$$

Then

$$\mathbf{E}(X-\lambda)^4\le\mathbf{E}(\bar X-\lambda)^4\tag{2.12}$$

with a random variable $\bar X$ satisfying (2.11) and defined as follows:

(i) if $(\mu,\sigma^2)\in D_1^\lambda$, then $\bar X$ is a Bernoulli random variable with distribution (2.2);

(ii) if $(\mu,\sigma^2)\in D_2^\lambda$, then $\bar X$ is a trinomial random variable with distribution (2.4);

(iii) if $(\mu,\sigma^2)\in D_3^\lambda$, then $\bar X$ is a Bernoulli random variable with distribution (2.7).

Proof. Writing $Y=X-\lambda$, we have to prove that if

$$-\lambda\le Y\le1-\lambda,\qquad \mathbf{E}Y=\mu-\lambda,\qquad \mathbf{E}(Y-\mathbf{E}Y)^2=\sigma^2,\tag{2.13}$$

then

$$\mathbf{E}Y^4\le\mathbf{E}\bar Y^4\tag{2.14}$$

with $\bar Y=\bar X-\lambda$. Henceforth we write $a=d-\lambda$, so that $\bar Y$ can assume only the values $-\lambda$, $a$, $1-\lambda$, with probabilities $q$, $r$, $p$ defined in (2.2)–(2.7), respectively. The distribution $\mathcal{L}(\bar Y)$ is related to the distribution $\nu=\mathcal{L}(\bar X)$ by $\mathcal{L}(\bar Y)(B)=\nu(B+\lambda)$ for all $B\subset\mathbb{R}$.

Formally, in our proof we do not need the description (2.17) of measures $\nu$ satisfying (2.15). However, the description helps to understand the idea of the proof. Let $a\in\mathbb{R}$ and $\sigma^2\ge0$. Assume that a signed measure $\nu$ of subsets of $\mathbb{R}$ is such that its total variation measure is a discrete measure concentrated in the three-point set $\{-\lambda,a,1-\lambda\}$, and

$$\int_{\mathbb{R}}\nu(dx)=1,\qquad \int_{\mathbb{R}}x\,\nu(dx)=\mu-\lambda,\qquad \int_{\mathbb{R}}(x-\mu+\lambda)^2\,\nu(dx)=\sigma^2.\tag{2.15}$$

Then $\nu$ is a uniquely defined measure such that

$$q:=\nu\{-\lambda\},\qquad r:=\nu\{a\},\qquad p:=\nu\{1-\lambda\}\tag{2.16}$$

satisfy

$$q=\frac{\sigma^2+(a+\lambda-\mu)(1-\mu)}{a+\lambda},\qquad r=\frac{\mu(1-\mu)-\sigma^2}{(a+\lambda)(1-a-\lambda)},\qquad p=\frac{\sigma^2-(a+\lambda-\mu)\mu}{1-a-\lambda}.\tag{2.17}$$

We omit the elementary calculations leading to (2.17); they amount to solving systems of linear equations.

Let $a,b,c\in\mathbb{R}$. Consider the polynomial

$$P(t)=(t-c)(b-t)(t-a)^2=c_0+c_1t+c_2t^2+c_3t^3-t^4,\qquad t\in\mathbb{R}.\tag{2.18}$$

It is easy to check that

$$c_3=0\iff b+c+2a=0.\tag{2.19}$$

The proofs of (i)–(iii) differ only in technical details. In all cases we find $a$, $b$, and $c$ (depending on $\lambda$, $\mu$, and $\sigma^2$) such that the polynomial $P$ defined by (2.18) satisfies $P(t)\ge0$ for $-\lambda\le t\le1-\lambda$, and such that the coefficient $c_3$ in (2.18) vanishes, $c_3=0$. Using $c_3=0$, the inequality $P(t)\ge0$ is equivalent to $t^4\le c_0+c_1t+c_2t^2$, which obviously leads to $\mathbf{E}Y^4\le c_0+c_1(\mu-\lambda)+c_2\sigma^2$. We note that the random variable $\bar Y$ assumes its values in the set

$$\{t:\ P(t)=0\}=\big\{t:\ c_0+c_1t+c_2t^2=t^4\big\}.\tag{2.20}$$

Therefore we have

$$\mathbf{E}Y^4\le c_0+c_1(\mu-\lambda)+c_2\sigma^2=\mathbf{E}\bar Y^4,\tag{2.21}$$

which proves the lemma.

(i) Now $(\mu,\sigma^2)\in D_1^\lambda$. We choose $c=1-\lambda$ and $a=\mu-\lambda-\sigma^2/(1-\mu)$. In order to ensure $c_3=0$ (cf. (2.19)) we have to take

$$b=-c-2a\equiv-2\mu-1+3\lambda+\frac{2\sigma^2}{1-\mu}.\tag{2.22}$$

If $b\le-\lambda$, then $P(t)\ge0$ for all $-\lambda\le t\le1-\lambda$. The inequality $b\le-\lambda$ is equivalent to

$$\sigma^2\le(1-\mu)\Big(\mu-2\lambda+\frac12\Big)=f_1\iff(\mu,\sigma^2)\in D_1^\lambda.\tag{2.23}$$

To complete the proof we note that the random variable $\bar Y=\bar X-\lambda$ with $\bar X$ defined by (2.2) assumes its values in the set $\{a,1-\lambda\}\subset\{t:\ P(t)=0\}$. To find the distribution of $\bar Y$ we use (2.17). Setting $a=\mu-\lambda-\sigma^2/(1-\mu)$ in (2.17), we obtain $q=0$ and $r$, $p$ as in (2.2).

(ii) Now $(\mu,\sigma^2)\in D_2^\lambda$ or, equivalently, $\sigma^2>f_1$ and $\sigma^2>f_3$. Moreover, we can assume that $1/4<\lambda<3/4$, since only for such $\lambda$ is the region $D_2^\lambda$ nonempty. We choose $c=1-\lambda$ and $b=-\lambda$. Then $P(t)\ge0$ for all $-\lambda\le t\le1-\lambda$. In order to ensure $c_3=0$ (cf. (2.19)) we have to take

$$a=-\frac{b+c}{2}\equiv\lambda-\frac12.\tag{2.24}$$

By our construction $\{t:\ P(t)=0\}=\{-\lambda,a,1-\lambda\}$. To find a distribution of $\bar Y$ supported by the set $\{-\lambda,a,1-\lambda\}$ we use (2.17). It follows that $\bar X=\bar Y+\lambda$ has the distribution defined in (2.4).

(iii) We choose $c=-\lambda$ and $a=\mu-\lambda+\sigma^2/\mu$. In order to ensure $c_3=0$ (cf. (2.19)) we have to take

$$b=-c-2a\equiv3\lambda-2\mu-\frac{2\sigma^2}{\mu}.\tag{2.25}$$

If $b\ge1-\lambda$, then $P(t)\ge0$ for all $-\lambda\le t\le1-\lambda$. The inequality $b\ge1-\lambda$ is equivalent to

$$\sigma^2\le\mu\Big(2\lambda-\mu-\frac12\Big)=f_3\iff(\mu,\sigma^2)\in D_3^\lambda.\tag{2.26}$$

To conclude the proof we notice that the random variable $\bar Y=\bar X-\lambda$ with $\bar X$ given by (2.7) assumes values from the set $\{-\lambda,a\}\subset\{t:\ P(t)=0\}$.

To prove Theorems 1.1 and 1.3 we apply Lemma 2.1 with $\lambda=\mu$. We provide the bounds of interest as Corollary 2.2. To prove the corollary it suffices to plug $\lambda=\mu$ into Lemma 2.1 and, using (2.2)–(2.7), to calculate $\mathbf{E}(\bar X-\mu)^4$ explicitly. We omit the related elementary, however cumbersome, calculations. The regions $D_1$, $D_2$, and $D_3$ are defined in (1.32).

Corollary 2.2. Let a random variable $0\le X\le1$ have mean $\mu$ and variance $\sigma^2$. Then

$$\mathbf{E}(X-\mu)^4\le\begin{cases}\dfrac{\sigma^6}{(1-\mu)^2}-\sigma^4+\sigma^2(1-\mu)^2,&\text{if }(\mu,\sigma^2)\in D_1,\\[1.5ex]\mu(1-\mu)\Big(\mu-\dfrac12\Big)^2+\sigma^2\Big(2\mu^2-2\mu+\dfrac34\Big),&\text{if }(\mu,\sigma^2)\in D_2,\\[1.5ex]\dfrac{\sigma^6}{\mu^2}-\sigma^4+\sigma^2\mu^2,&\text{if }(\mu,\sigma^2)\in D_3.\end{cases}\tag{2.27}$$
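Corollary 2.2 can be stress-tested by Monte Carlo over random finitely supported laws on $[0,1]$; in region $D_1$ the two-point law (2.1)–(2.2) should attain the bound exactly. A sketch (function name ours):

```python
import random

def fourth_moment_bound(mu, var):
    """Right-hand side of (2.27), chosen by region D1/D2/D3."""
    f1 = (1.0 - mu) * (0.5 - mu)
    f3 = mu * (mu - 0.5)
    if var <= f1:  # D1
        return var ** 3 / (1.0 - mu) ** 2 - var ** 2 + var * (1.0 - mu) ** 2
    if var <= f3:  # D3
        return var ** 3 / mu ** 2 - var ** 2 + var * mu ** 2
    return mu * (1.0 - mu) * (mu - 0.5) ** 2 + var * (2.0 * mu ** 2 - 2.0 * mu + 0.75)

random.seed(7)
for _ in range(2000):
    # Random four-point distribution on [0, 1].
    xs = [random.random() for _ in range(4)]
    ws = [random.random() for _ in range(4)]
    tot = sum(ws)
    ps = [w / tot for w in ws]
    mu = sum(p * x for p, x in zip(ps, xs))
    var = sum(p * (x - mu) ** 2 for p, x in zip(ps, xs))
    omega = sum(p * (x - mu) ** 4 for p, x in zip(ps, xs))
    assert omega <= fourth_moment_bound(mu, var) + 1e-12

# The extremal two-point law (2.1)-(2.2) attains the D1 bound.
mu, var = 0.25, 1.0 / 48.0
d = mu - var / (1.0 - mu)
r = (1.0 - mu) ** 2 / ((1.0 - mu) ** 2 + var)
p = var / ((1.0 - mu) ** 2 + var)
omega_star = r * (d - mu) ** 4 + p * (1.0 - mu) ** 4
assert abs(omega_star - fourth_moment_bound(mu, var)) < 1e-12
```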

Proposition 2.3. Let $0\le X\le1$. Then, with probability 1, the sample variance satisfies $\hat\sigma^2\le\gamma^2+\sigma^2$, with $\gamma^2$ given by (1.6).

Proof. Using the representation (1.3) of the sample variance as a $U$-statistic, it suffices to show that the function $f:\mathbb{R}^n\to\mathbb{R}$,

$$f(x)=\sum_{i\ne k,\,1\le i,k\le n}(x_i-x_k)^2,\qquad x=(x_1,\dots,x_n)\in\mathbb{R}^n,\tag{2.28}$$

in the domain

$$D=\{x\in\mathbb{R}^n:\ 0\le x_1\le1,\dots,0\le x_n\le1\}\tag{2.29}$$

satisfies $f\le2n(n-1)(\gamma^2+\sigma^2)$. The function $f$ is convex. To see this, it suffices to check that $f$ restricted to straight lines is convex. Any straight line can be represented as $L=\{x+th:\ t\in\mathbb{R}\}$ with some $x,h\in\mathbb{R}^n$. The convexity of $f$ on $L$ is equivalent to the convexity of the function $g(t):=f(x+th)$ of the real variable $t\in\mathbb{R}$. It is clear that the second derivative $g''(t)=2f(h)$ is nonnegative, since $f\ge0$. Thus both $g$ and $f$ are convex.

Since both $f$ and $D$ are convex, the function $f$ attains its maximal value on the boundary of $D$. Moreover, the maximal value of $f$ is attained on the set of extreme points of $D$; in our case this is just the set of vertices of the cube $D$. In other words, the maximal value of $f$ is attained when each of $x_1,\dots,x_n$ is either 0 or 1. Since $f$ is a symmetric function, we can assume that the maximal value of $f$ is attained when $x_1=\cdots=x_m=1$ and $x_{m+1}=\cdots=x_n=0$ with some $m=0,\dots,n$. Using (2.28), the corresponding value of $f$ is $2m(n-m)$. Maximizing with respect to $m$ we get $f\le n^2/2$ if $n$ is even, and $f\le(n^2-1)/2$ if $n$ is odd, which we can rewrite as the desired inequality $f\le2n(n-1)(\gamma^2+\sigma^2)$.
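For small $n$, the proof of Proposition 2.3 can be replayed by brute force: the maximum of the sample variance over all $\{0,1\}$ vertex configurations should equal $\gamma^2+\sigma^2$ from (1.6). A sketch (names ours):

```python
from itertools import product

def sample_var(xs):
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

def gamma2_plus_var(n):
    # gamma^2 + sigma^2 from (1.6): the a.s. upper bound of Proposition 2.3.
    if n % 2 == 0:
        return 0.25 + 1.0 / (4.0 * (n - 1))
    return 0.25 + 1.0 / (4.0 * n)

for n in (4, 5, 6, 7):
    best = max(sample_var(v) for v in product((0.0, 1.0), repeat=n))
    assert abs(best - gamma2_plus_var(n)) < 1e-12
```

The maximum is attained with half of the coordinates equal to 1, matching the value $2m(n-m)$ of the proof at $m=\lfloor n/2\rfloor$.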

3. Proofs

We use the following observation, which in the case of an exponential function goes back to Hoeffding [1, Section 5]. Assume that we can represent a random variable, say $T$, as a weighted mixture of other random variables, say $T_1,\dots,T_m$, so that

$$T=\alpha_1T_1+\cdots+\alpha_mT_m,\qquad \alpha_1,\dots,\alpha_m\ge0,\qquad \alpha_1+\cdots+\alpha_m=1,\tag{3.1}$$

where the $\alpha$'s are nonrandom numbers. Let $f$ be a convex function. Then, using Jensen's inequality $f(T)\le\alpha_1f(T_1)+\cdots+\alpha_mf(T_m)$, we obtain

$$\mathbf{E}f(T)\le\alpha_1\mathbf{E}f(T_1)+\cdots+\alpha_m\mathbf{E}f(T_m).\tag{3.2}$$

Moreover, if the random variables $T_1,\dots,T_m$ are identically distributed, then

$$\mathbf{E}f(T)\le\mathbf{E}f(T_1).\tag{3.3}$$

One can specialize (3.3) for $U$-statistics of the second order. Let $u(x,y)=u(y,x)$ be a symmetric function of its arguments. For an i.i.d. sample $X_1,\dots,X_n$ consider the $U$-statistic

$$U=\frac{1}{n(n-1)}\sum_{i\ne k,\,1\le i,k\le n}u(X_i,X_k).\tag{3.4}$$

Write

$$V=\frac1k\big(u(X_1,X_2)+u(X_3,X_4)+\cdots+u(X_{2k-1},X_{2k})\big),\qquad k=\lfloor n/2\rfloor.\tag{3.5}$$

Then (3.3) yields

$$\mathbf{E}f(U)\le\mathbf{E}f(V)\tag{3.6}$$

for any convex function $f$. To see that (3.6) holds, let $\pi=(\pi_1,\dots,\pi_n)$ be a permutation of $(1,\dots,n)$. Define $V_\pi$ as (3.5), replacing the sample $X_1,\dots,X_n$ by its permutation $X_{\pi_1},\dots,X_{\pi_n}$. Then (see [1, Section 5])

$$U=\frac{1}{n!}\sum_\pi V_\pi,\tag{3.7}$$

which means that $U$ allows a representation of type (3.1) with $m=n!$ and all $V_\pi$ identically distributed, due to our symmetry and i.i.d. assumptions. Thus, (3.3) implies (3.6).
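Hoeffding's mixture inequality (3.6) can be illustrated by simulation with the kernel of (3.8) and the convex function $f(y)=e^{hy}$. The sketch below (names ours; a Monte Carlo approximation, so the comparison is statistical rather than exact) estimates both sides for a uniform sample:

```python
import math
import random

def u_kernel(x, y, var):
    # The kernel u(x, y) = sigma^2 - (x - y)^2 / 2 of (3.8).
    return var - (x - y) ** 2 / 2.0

def U_stat(xs, var):
    n = len(xs)
    s = sum(u_kernel(xs[i], xs[j], var)
            for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 1))

def V_stat(xs, var):
    k = len(xs) // 2
    return sum(u_kernel(xs[2 * i], xs[2 * i + 1], var) for i in range(k)) / k

random.seed(1)
var = 1.0 / 12.0  # variance of the uniform law on [0, 1]
n, reps, h = 11, 20000, 5.0
eU = eV = 0.0
for _ in range(reps):
    xs = [random.random() for _ in range(n)]
    eU += math.exp(h * U_stat(xs, var))
    eV += math.exp(h * V_stat(xs, var))
eU /= reps
eV /= reps
# E f(U) <= E f(V) for the convex f(y) = exp(h*y), cf. (3.6).
assert eU < eV
```

The gap is exactly what the proofs below exploit: $V$ is a coarser average, so its exponential moments dominate those of $U$.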

Using (1.3) we can write

$$T=\frac{1}{n(n-1)}\sum_{i\ne j,\,1\le i,j\le n}u(X_i,X_j)\tag{3.8}$$

with $u(x,y)=\sigma^2-(x-y)^2/2$. By an application of (3.6) we derive

$$\mathbf{E}f(T)\le\mathbf{E}f\Big(\frac{Z_k}{k}\Big),\qquad \mathbf{E}f(-T)\le\mathbf{E}f\Big(-\frac{Z_k}{k}\Big),\qquad k=\lfloor n/2\rfloor,\tag{3.9}$$

for any convex function $f$, where $Z_k=Y_1+\cdots+Y_k$ is a sum of i.i.d. random variables such that

$$Y_1\overset{\mathcal{D}}{=}\sigma^2-\frac{(X_1-X_2)^2}{2}.\tag{3.10}$$

Consider the following three families of functions depending on parameters $t,h\in\mathbb{R}$:

$$f(y)=\frac{(y-h)_+}{t-h},\qquad t\in\mathbb{R},\ h<t,\tag{3.11}$$

$$f(y)=\frac{(y-h)_+^2}{(t-h)^2},\qquad t\in\mathbb{R},\ h<t,\tag{3.12}$$

$$f(y)=\exp\{h(y-t)\},\qquad t\in\mathbb{R},\ h>0.\tag{3.13}$$

Any of the functions $f$ given by (3.11)–(3.13) dominates the indicator function $y\mapsto\mathbb{I}\{y\in[t,\infty)\}$ of the interval $[t,\infty)$. Therefore $P\{T\ge t\}\le\mathbf{E}f(T)$. Combining this inequality with (3.9), we get

$$P\{T\ge t\}\le\inf_h\mathbf{E}f\Big(\frac{Z_k}{k}\Big),\qquad P\{T\le-t\}\le\inf_h\mathbf{E}f\Big(-\frac{Z_k}{k}\Big)\tag{3.14}$$

with $Z_k$ a sum of $k$ i.i.d. random variables specified in (3.10). Depending on the choice of the family of functions $f$ given by (3.11)–(3.13), the inf in (3.14) is taken over $h<t$ or $h>0$, respectively.

Proposition 3.1. One has

$$\mathbf{E}(X_1-X_2)^4=2\omega+6\sigma^4.\tag{3.15}$$

If $0\le X\le1$, then $\omega=\mathbf{E}(X-\mu)^4\le\sigma^2-3\sigma^4$.

Proof. Let us prove (3.15). Using the i.i.d. assumption, we have

$$\mathbf{E}(X_1-X_2)^4=\mathbf{E}\big((X_1-\mu)-(X_2-\mu)\big)^4=2\mathbf{E}(X-\mu)^4-8\mathbf{E}(X_1-\mu)(X_2-\mu)^3+6\mathbf{E}(X_1-\mu)^2(X_2-\mu)^2=2\omega+6\sigma^4,\tag{3.16}$$

since $\mathbf{E}(X_1-\mu)(X_2-\mu)^3=\mathbf{E}(X_1-\mu)\,\mathbf{E}(X_2-\mu)^3=0$.

Let us prove that $\omega\le\sigma^2-3\sigma^4$. If $0\le X\le1$, then $(X_1-X_2)^2\le1$. Using (3.15) we have

$$2\omega+6\sigma^4=\mathbf{E}(X_1-X_2)^4\le\mathbf{E}(X_1-X_2)^2=2\sigma^2,\tag{3.17}$$

which yields the desired bound for $\omega$.

Proposition 3.2. Let $Y$ be a bounded random variable such that $a\le Y\le b$ with some nonrandom $a,b\in\mathbb{R}$. Then for any convex function $g:[a,b]\to\mathbb{R}$ one has

$$\mathbf{E}g(Y)\le\mathbf{E}g(\varepsilon),\tag{3.18}$$

where $\varepsilon$ is a Bernoulli random variable such that $\mathbf{E}Y=\mathbf{E}\varepsilon$ and $P\{\varepsilon=a\}+P\{\varepsilon=b\}=1$.

If $Y\le b$ for some $b>0$, and $\mathbf{E}Y=0$, $\mathbf{E}Y^2\le r^2$, then (3.18) holds with

$$g(y)=(y+h)^2,\quad h\in\mathbb{R},\qquad g(y)=\exp\{hy\},\quad h\ge0,\tag{3.19}$$

and a Bernoulli random variable $\varepsilon$ such that $\mathbf{E}\varepsilon=0$, $\operatorname{var}\varepsilon=r^2$,

$$p_r:=P\{\varepsilon=b\}=\frac{r^2}{b^2+r^2},\qquad q_r:=P\Big\{\varepsilon=-\frac{r^2}{b}\Big\}=\frac{b^2}{b^2+r^2},\tag{3.20}$$

$p_r+q_r=1$.

Proof. See [2, Lemmas 4.3 and 4.4].

Proof of Theorem 1.1. The proof is based on a combination of Hoeffding's observation (3.6) (using the representation (3.8) of $T$ as a $U$-statistic), of Chebyshev's inequality involving exponential functions, and of Proposition 3.2. Let us provide more details. We have to prove (1.22) and (1.24).

Let us prove (1.22). We apply (3.14) with the family (3.13) of exponential functions $f$. We get

$$P\{T\ge t\}\le\inf_{h>0}\exp\{-ht\}\,\mathbf{E}\exp\Big\{\frac{hZ_k}{k}\Big\}.\tag{3.21}$$

By (3.10), the sum $Z_k=Y_1+\cdots+Y_k$ is a sum of $k$ copies of a random variable, say $Y$, such that

$$Y=\sigma^2-\frac{(X_1-X_2)^2}{2}.\tag{3.22}$$

We note that

$$Y\le\sigma^2,\qquad \mathbf{E}Y=0,\qquad \mathbf{E}Y^2=\frac{\omega+\sigma^4}{2}\le\frac{\omega_0+\sigma^4}{2}.\tag{3.23}$$

Indeed, the first two relations in (3.23) are obvious; the third one is implied by $\omega\le\omega_0$,

$$\mathbf{E}Y^2=\frac{\mathbf{E}(X_1-X_2)^4}{4}-\sigma^4,\tag{3.24}$$

and $\mathbf{E}(X_1-X_2)^4=2\omega+6\sigma^4$; see Proposition 3.1.

Let $\mathcal{M}$ stand for the class of random variables $Y$ satisfying (3.23). Taking into account (3.21), to prove (1.22) it suffices to check that

$$J:=\inf_{h>0}\exp\{-ht\}\sup_{Y_1,\dots,Y_k\in\mathcal{M}}\mathbf{E}\exp\Big\{\frac{hZ_k}{k}\Big\}=H^k\Big(\frac{t}{\sigma^2};\,p\Big),\tag{3.25}$$

where $Z_k$ is a sum of $k$ independent copies $Y_1,\dots,Y_k$ of $Y$. It is clear that the left-hand side of (3.25) is an increasing function of $\omega_0$. To prove (3.25), we apply Proposition 3.2. Conditioning $k$ times on all random variables except one, we can replace all random variables $Y_1,\dots,Y_k$ by Bernoulli ones. To find the distribution of the Bernoulli random variables we use (3.23). We get

$$\sup_{Y\in\mathcal{M}}\mathbf{E}\exp\Big\{\frac{hZ_k}{k}\Big\}=\mathbf{E}\exp\Big\{\frac{hS_k}{k}\Big\},\tag{3.26}$$

where $S_k=\varepsilon_1+\cdots+\varepsilon_k$ is a sum of $k$ independent copies of a Bernoulli random variable, say $\varepsilon$, such that $\mathbf{E}\varepsilon=0$ and $P\{\varepsilon=\sigma^2\}=p$ with $p$ as in (1.23), that is, $p=(\sigma^4+\omega_0)/(3\sigma^4+\omega_0)$. Note that in (3.26) we have equality, since $\varepsilon\in\mathcal{M}$.

Using (3.26) we have

$$J=\inf_{h>0}\exp\{-ht\}\,\mathbf{E}\exp\Big\{\frac{hS_k}{k}\Big\}=\inf_{h>0}\Big(\exp\Big\{-\frac{ht}{k}\Big\}\,\mathbf{E}\exp\Big\{\frac{h\varepsilon}{k}\Big\}\Big)^k=\inf_{h>0}\Big(\exp\Big\{-\frac{ht}{\sigma^2}\Big\}\,\mathbf{E}\exp\Big\{\frac{h\varepsilon}{\sigma^2}\Big\}\Big)^k=H^k\Big(\frac{t}{\sigma^2};\,p\Big).\tag{3.27}$$

To see that the third equality in (3.27) holds, it suffices to change the variable $h$ to $kh/\sigma^2$. The fourth equality holds by the definition (1.13) of the Hoeffding function, since $\varepsilon/\sigma^2$ is a Bernoulli random variable with mean zero and such that $P\{\varepsilon/\sigma^2=1\}=p$. The relation (3.27) proves (3.25) and (1.22).

A proof of (1.24) repeats the proof of (1.22), replacing everywhere $T$ and $Y$ by $-T$ and $-Y$, respectively. The inequality $Y\le\sigma^2$ in (3.23) has to be replaced by $-Y\le1/2-\sigma^2$, which holds due to our assumption $0\le X\le1$. Respectively, the probability $p$ is now given by (1.25).

Proof of (1.19). The bound is an obvious corollary of Theorem 1.1, since by Proposition 3.1 we have $\omega\le\sigma^2-3\sigma^4$, and therefore we can choose $\omega_0=\sigma^2-3\sigma^4$. Setting this value of $\omega_0$ into (1.22), we obtain (1.19).

Proof of (1.26) and (1.27). To prove (1.26), we set $\omega_0=\sigma^2-3\sigma^4$ in (1.24). Such a choice of $\omega_0$ is justified in the proof of (1.19).

To prove (1.27) we use (1.26). We have to prove that

$$H\Big(\frac{2t}{1-2\sigma^2};\,2\sigma^2\Big)\le H\Big(2t;\,\frac{2\sigma^2}{1+2\sigma^2}\Big),\tag{3.28}$$

and that the right-hand side of (3.28) is an increasing function of $\sigma^2$. By the definition of the Hoeffding function we have

$$H\Big(\frac{2t}{1-2\sigma^2};\,2\sigma^2\Big)=\inf_{h>0}\exp\Big\{-\frac{2ht}{1-2\sigma^2}\Big\}\,\mathbf{E}\exp\{h\delta\}=\inf_{h>0}\exp\{-2ht\}\,\mathbf{E}\exp\big\{h(1-2\sigma^2)\delta\big\},\tag{3.29}$$

where $\delta$ is a Bernoulli random variable such that $P\{\delta=1\}=2\sigma^2$ and $\mathbf{E}\delta=0$. It is easy to check that $\delta$ assumes as well the value $-2\sigma^2/(1-2\sigma^2)$, with probability $1-2\sigma^2$. Hence $-2\sigma^2/(1-2\sigma^2)\le\delta\le1$. Therefore $-2\sigma^2\le(1-2\sigma^2)\delta\le1-2\sigma^2$, and we can write

$$\mathbf{E}\exp\big\{h(1-2\sigma^2)\delta\big\}\le\sup_{W\in\mathcal{M}}\mathbf{E}\exp\{hW\},\tag{3.30}$$

where $\mathcal{M}$ is the class of random variables $W$ such that $\mathbf{E}W=0$ and $-2\sigma^2\le W\le1$. Combining (3.29) and (3.30) we obtain

$$H\Big(\frac{2t}{1-2\sigma^2};\,2\sigma^2\Big)\le\inf_{h>0}\exp\{-2ht\}\sup_{W\in\mathcal{M}}\mathbf{E}\exp\{hW\}.\tag{3.31}$$

The definition of the latter sup in (3.31) shows that the right-hand side of (3.31) is an increasing function of $\sigma^2$. To conclude the proof of (1.27) we have to check that the right-hand sides of (3.28) and (3.31) are equal. Using (3.18) of Proposition 3.2, we get $\mathbf{E}\exp\{hW\}\le\mathbf{E}\exp\{h\varepsilon\}$, where $\varepsilon$ is a mean zero Bernoulli random variable assuming the values $-2\sigma^2$ and 1 with positive probabilities, so that $P\{\varepsilon=1\}=2\sigma^2/(1+2\sigma^2)$. Since $\varepsilon\in\mathcal{M}$, we have

$$\sup_{W\in\mathcal{M}}\mathbf{E}\exp\{hW\}=\mathbf{E}\exp\{h\varepsilon\}.\tag{3.32}$$

Using the definition of the Hoeffding function, we see that the right-hand sides of (3.28) and (3.31) are equal.

Proof of Theorem 1.3. We use Theorem 1.1. In the bounds of this theorem we substitute the value of $\omega_0$ equal to the right-hand side of (2.27), where a bound of the type $\omega\le\omega_0$ is given. We omit the related elementary analytical manipulations.

Proof of the Asymptotic Relations (1.7) and (1.8). To describe the limiting behavior of $T$ we use Hoeffding's decomposition. We can write

$$\frac{n(n-1)}{2}\,T=(n-1)\sum_{1\le i\le n}u_1(X_i)+\sum_{1\le i<k\le n}u_2(X_i,X_k)\tag{3.33}$$

with kernels $u_1$ and $u_2$ such that

$$u_1(x)=\frac{\sigma^2-(x-\mu)^2}{2},\qquad u_2(x,y)=(x-\mu)(y-\mu).\tag{3.34}$$

To derive (3.33), use the representation (3.8) of $T$ as a $U$-statistic. The kernel functions $u_1$ and $u_2$ are degenerate, that is, $\mathbf{E}u_1(X)=0$ and $\mathbf{E}u_2(X,x)=0$ for all $x\in\mathbb{R}$. Therefore

$$\operatorname{var}\Big(\frac{n(n-1)}{2}\,T\Big)=n(n-1)^2\operatorname{var}u_1+\frac{n(n-1)}{2}\operatorname{var}u_2\tag{3.35}$$

with

$$\operatorname{var}u_1=\frac{\omega-\sigma^4}{4},\qquad \operatorname{var}u_2=\sigma^4,\qquad \omega=\mathbf{E}(X-\mu)^4.\tag{3.36}$$

It follows that in cases where $\omega>\sigma^4$ the statistic $T$ is asymptotically normal:

$$\frac{\sqrt{n}\,T}{\sqrt{\omega-\sigma^4}}\longrightarrow\eta\quad\text{as }n\to\infty,\tag{3.37}$$

where $\eta$ is a standard normal random variable. It is easy to see that $\omega=\sigma^4$ if and only if $X$ is a Bernoulli random variable symmetric around its mean. In this special case we have $u_1(X)\equiv0$, and (3.33) turns into

$$\frac{n(n-1)\,T}{\sigma^2}\overset{\mathcal{D}}{=}(\varepsilon_1+\cdots+\varepsilon_n)^2-n,\tag{3.38}$$

where $\varepsilon_1,\dots,\varepsilon_n$ are i.i.d. Rademacher random variables. It follows that

$$\omega=\sigma^4\;\Longrightarrow\;\frac{(n-1)\,T}{\sigma^2}\longrightarrow\eta^2-1\quad\text{as }n\to\infty,\tag{3.39}$$

which completes the proof of (1.7) and (1.8).

Acknowledgment

Figure 1 was produced by N. Kalosha; the authors thank him for the help. The research was supported by the Lithuanian State Science and Studies Foundation, Grant no. T-15/07.

References

[1] W. Hoeffding, "Probability inequalities for sums of bounded random variables," Journal of the American Statistical Association, vol. 58, pp. 13–30, 1963.

[2] V. Bentkus and M. van Zuijlen, "On conservative confidence intervals," Lithuanian Mathematical Journal, vol. 43, no. 2, pp. 141–160, 2003.

[3] M. Talagrand, "The missing factor in Hoeffding's inequalities," Annales de l'Institut Henri Poincaré B, vol. 31, no. 4, pp. 689–702, 1995.

[4] I. Pinelis, "Optimal tail comparison based on comparison of moments," in High Dimensional Probability (Oberwolfach, 1996), vol. 43 of Progress in Probability, pp. 297–314, Birkhäuser, Basel, Switzerland, 1998.

[5] I. Pinelis, "Fractional sums and integrals of r-concave tails and applications to comparison probability inequalities," in Advances in Stochastic Inequalities (Atlanta, GA, 1997), vol. 234 of Contemporary Mathematics, pp. 149–168, American Mathematical Society, Providence, RI, USA, 1999.

[6] V. Bentkus, "A remark on the inequalities of Bernstein, Prokhorov, Bennett, Hoeffding, and Talagrand," Lithuanian Mathematical Journal, vol. 42, no. 3, pp. 262–269, 2002.

[7] V. Bentkus, "On Hoeffding's inequalities," The Annals of Probability, vol. 32, no. 2, pp. 1650–1673, 2004.

[8] V. Bentkus, G. D. C. Geuze, and M. van Zuijlen, "Trinomial laws dominating conditionally symmetric martingales," Tech. Rep. 0514, Department of Mathematics, Radboud University Nijmegen, 2005.

[9] V. Bentkus, N. Kalosha, and M. van Zuijlen, "On domination of tail probabilities of (super)martingales: explicit bounds," Lithuanian Mathematical Journal, vol. 46, no. 1, pp. 3–54, 2006.
