On the rate of convergence of the laws of Markov chains
associated with orthogonal polynomials
Marc Lindlbauer∗
Institute for Biomathematics and Biometry, GSF – National Research Center for Environment and Health, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
Received 1 November 1997; received in revised form 20 March 1998
Abstract
We investigate random walks $(S_n)_{n\in\mathbb{N}_0}$ on the nonnegative integers arising from isotropic random walks on distance transitive graphs. The laws of these isotropic random walks converge in distribution to the normal distribution, and the transition probabilities of the $S_n$ are closely related to a sequence of Bernstein–Szegő polynomials. We give an explicit representation for these polynomials as a sum of Chebyshev polynomials of the second kind, and using this representation we prove an upper bound for the rate of convergence of the laws of the $S_n$. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Random walks; Orthogonal polynomials; Limit theorems
1. Introduction
The central limit theorem for sums of $\mathbb{R}$-valued independent and identically distributed random variables and the theorem of Berry–Esseen are well known from probability theory. In 1990 Voit [9] proved analogous results in a very general setting for random walks $(S_n)_{n\in\mathbb{N}_0}$ on $\mathbb{N}_0$ whose transition probabilities are closely connected with certain sequences of orthogonal polynomials. Using the methods from [9] we investigate the rate of convergence in the central limit theorem for the $S_n$ in the special case where these random variables arise from isotropic random walks on an infinite distance transitive graph, a class of graphs that generalizes homogeneous trees. Closely related to the structure of infinite distance transitive graphs is a sequence $(P_n)_{n\in\mathbb{N}_0}$ of orthogonal polynomials which turn out to be Bernstein–Szegő polynomials. We give an explicit representation of the $P_n$ as a sum
∗ E-mail: [email protected].
of Chebyshev polynomials of the second kind which enables us to improve the rate of convergence presented in [9]. Furthermore, we compare the rate in this paper with the classical rate from the theorem of Berry–Esseen. The paper is organized as follows: since this paper is based on the setting of [9], we give a short overview of it in Section 2. Section 3 deals with random walks on infinite distance transitive graphs, which are the motivating example for how the theorems are stated. Section 4 introduces some notation which is necessary to state the main results in Section 5. Section 6 presents the proofs for the results in Section 5.
2. Random walks associated with a sequence of orthogonal polynomials
Let $(a_n)_{n\in\mathbb{N}}$, $(b_n)_{n\in\mathbb{N}}$ and $(c_n)_{n\in\mathbb{N}}$ be sequences of real numbers satisfying $a_n, c_n > 0$, $b_n \geq 0$ and $a_n + b_n + c_n = 1$. Furthermore, we assume that $\alpha := \lim_{n\to\infty} a_n$, $\beta := \lim_{n\to\infty} b_n$ and $\gamma := \lim_{n\to\infty} c_n$ exist and that

(i) $0 < \gamma < \alpha < 1$ and

(ii) $\sum_{n=1}^{\infty} n \cdot \max(0,\, b_n - b_{n+1}) < \infty$.
Condition (i) is essential for this paper, since for $\alpha = \gamma$ the asymptotic behaviour of the laws of the $S_n$ is of a very different kind (see, e.g., [4, 10]). Condition (ii) is a technical one that is necessary to obtain the results in [9]. Both conditions are stated for completeness and will implicitly be used in this paper when applying [9].
Using Favard's theorem (see, e.g., [2]) we can define a sequence of orthogonal polynomials by

$$P_0 \equiv 1, \qquad P_1(x) = 2\sqrt{\alpha\gamma}\,x + \beta, \qquad P_1 \cdot P_n = a_n P_{n+1} + b_n P_n + c_n P_{n-1}. \qquad (1)$$
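The recurrence (1) is easy to evaluate numerically. The following minimal Python sketch (our own illustration, not part of the paper; the function name and the sample coefficients are assumptions) computes $P_0,\dots,P_N$ and checks the identity $P_n(x_0) = 1$ at the point $x_0$ introduced in Section 4, which follows from (1) by induction because $a_n + b_n + c_n = 1$.

```python
import numpy as np

def eval_polys(x, N, a_n, b_n, c_n, alpha, beta, gamma):
    """Evaluate P_0,...,P_N at x via the recurrence (1).

    a_n, b_n, c_n: callables n -> coefficient (n >= 1); alpha, beta, gamma
    enter through the normalization P_1(x) = 2*sqrt(alpha*gamma)*x + beta.
    """
    P = [np.ones_like(x), 2.0 * np.sqrt(alpha * gamma) * x + beta]
    for n in range(1, N):
        # P_1 P_n = a_n P_{n+1} + b_n P_n + c_n P_{n-1}, solved for P_{n+1}
        P.append((P[1] * P[n] - b_n(n) * P[n] - c_n(n) * P[n - 1]) / a_n(n))
    return P

# sample constant coefficients (an assumption for illustration only)
alpha, beta, gamma = 0.6, 0.1, 0.3
x0 = (1 - beta) / (2 * np.sqrt(alpha * gamma))        # cf. Section 4
P = eval_polys(np.array([x0]), 8,
               lambda n: alpha, lambda n: beta, lambda n: gamma,
               alpha, beta, gamma)
print([float(p[0]) for p in P])                        # all equal to 1 (up to roundoff)
```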
Since the $(P_n)_{n\in\mathbb{N}_0}$ form a basis of the complex vector space of polynomials in one complex variable, the linearization coefficients of the product $P_m \cdot P_n = \sum_{k=|m-n|}^{m+n} g_{m,n,k} P_k$ are uniquely determined, and we can define a convolution of point measures $\delta_m$ and $\delta_n$ on $\mathbb{N}_0$ for all $m, n \in \mathbb{N}_0$ by setting

$$\delta_m * \delta_n := \sum_{k=|m-n|}^{m+n} g_{m,n,k}\,\delta_k. \qquad (2)$$

We point out that the range of the summation index in (2) is due to the orthogonality of the $P_n$. Moreover, if we assume that all the linearization coefficients $g_{m,n,k}$ are nonnegative, then we can extend the convolution $*$ of point measures uniquely to a norm continuous convolution on $M_b(\mathbb{N}_0)$, the space of all bounded measures on $\mathbb{N}_0$. With this convolution $\mathbb{N}_0$ can be given the structure of a polynomial hypergroup. For details see [6, 7, 11]. If in the following a convolution on $M_b(\mathbb{N}_0)$ occurs, it will always be assumed to arise in the described way and will be denoted by $*_P$ to specify the defining sequence of orthogonal polynomials.
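The linearization coefficients can be computed mechanically from the recurrence. The sketch below (our own, not from the paper; the names and the rational sample coefficients are assumptions) expands $P_m P_n$ in the basis $(P_k)$ by exact rational arithmetic and returns the weights of $\delta_m * \delta_n$ as in (2).

```python
from fractions import Fraction as F

def mul(p, q):                                   # multiply monomial coefficient lists
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

# Sample data (an assumption): constant coefficients a_n, b_n, c_n = 4/5, 0, 1/5,
# for which 2*sqrt(alpha*gamma) = 4/5 is rational, so everything stays exact.
al, be, ga = F(4, 5), F(0), F(1, 5)
P = [[F(1)], [be, F(4, 5)]]                      # P_0 = 1, P_1(x) = (4/5) x + 0
for n in range(1, 12):                           # recurrence (1) solved for P_{n+1}
    prod = mul(P[1], P[n])
    pn = P[n] + [F(0)] * (len(prod) - len(P[n]))
    pn1 = P[n - 1] + [F(0)] * (len(prod) - len(P[n - 1]))
    P.append([(prod[i] - be * pn[i] - ga * pn1[i]) / al for i in range(len(prod))])

def conv_point(m, n):
    """Weights g_{m,n,k} of delta_m * delta_n, from P_m P_n = sum_k g_{m,n,k} P_k."""
    rem, g = mul(P[m], P[n]), {}
    for k in range(m + n, -1, -1):               # peel off leading terms, top down
        g[k] = rem[k] / P[k][k]
        for i in range(k + 1):
            rem[i] -= g[k] * P[k][i]
    return {k: v for k, v in g.items() if v != 0}

g = conv_point(2, 3)
print(g)                                          # nonnegative, supported on |m-n| <= k <= m+n
print(sum(g.values()))                            # = 1: a probability measure
```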
Definition 2.1. A Markov chain $(S_n)_{n\in\mathbb{N}_0}$ is called homogeneous with respect to the convolution $*_P$ if there exists a sequence $(\mu_n)_{n\in\mathbb{N}}$ of probability measures on $\mathbb{N}_0$ with

$$P(S_{n+1} = r \mid S_n = k) = \mu_{n+1} *_P \delta_k(\{r\}) \qquad (3)$$

for all $n, r, k \in \mathbb{N}_0$. The $\mu_n$ are called transition measures. If all $\mu_n$ are equal to a probability measure $\mu$, then $(S_n)_{n\in\mathbb{N}_0}$ is called stationary.
Since the results of this paper do not depend on the law of $S_0$ (see [9]), we may assume without loss of generality that $S_0 \equiv 0$. An easy induction on $n$ yields the following important consequence of the definition:
Proposition 2.2. For all $n \in \mathbb{N}$ the law of $S_n$ is given by $\mu_1 *_P \mu_2 *_P \cdots *_P \mu_n$.
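For instance, taking $m = n = 1$ in (2), the recurrence (1) gives the linearization $P_1 P_1 = a_1 P_2 + b_1 P_1 + c_1 P_0$ directly, so that $\delta_1 *_P \delta_1 = c_1\delta_0 + b_1\delta_1 + a_1\delta_2$; by Proposition 2.2, for a stationary chain with $\mu = \delta_1$ this measure is exactly the law of $S_2$.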
Finally, we remark that the normalization $P_0 \equiv 1$ is a necessary condition for a family of orthogonal polynomials to define a polynomial hypergroup. The choice of $P_1$ is due to Voit (see, e.g., [9] or [10]).
3. Distance transitive graphs and the associated random walks on N0
Let $\Gamma$ denote an undirected, connected graph with countably infinite vertex set $V$. Furthermore, we assume that $\Gamma$ is locally finite, i.e. that every vertex has only a finite number of neighbours. Let $d$ denote the usual graph-theoretic distance. With this notation we can define:

Definition 3.1. Let $\Gamma$ be as above with automorphism group $A$. $\Gamma$ is called distance transitive if for all $u, v, w, y \in V$ with $d(u, v) = d(w, y)$ there exists $g \in A$ such that $g(u) = w$ and $g(v) = y$.
Macpherson (see [8]) showed that such graphs must have a very simple geometry:

Theorem 3.2 (Macpherson). Every infinite locally finite distance transitive graph is a standard graph $\Gamma_{a,b}$ with $a, b \in \mathbb{N}\setminus\{1\}$.
A standard graph $\Gamma_{a,b}$ is a graph where exactly $a$ copies of a complete graph with $b$ vertices are tacked to every vertex of $\Gamma_{a,b}$. An example for $a = 2$ and $b = 3$ is given in Fig. 1.
Now consider a probability space $(\Omega, \mathcal{A}, P)$ and a simple random walk $(T_n)_{n\in\mathbb{N}_0}$ on $\Gamma$ starting at a fixed vertex $v_0 \in V$, which is also a Markov chain. The property of being a simple random walk means the following:

(1) $P(T_0 = v_0) = 1$,

(2) $P(T_{n+1} = u \mid T_n = v) = P(T_{n+1} = g(u) \mid T_n = g(v))$ $(u, v \in V,\ n \in \mathbb{N}_0,\ g \in A)$,

(3) $P(d(T_{n+1}, T_n) > 1) = 0$ for all $n \in \mathbb{N}_0$.

Fig. 1. $a = 3$, $b = 3$.

Because of (2) such a random walk is called isotropic. By setting $S_n := d(T_n, v_0)$ we obtain an $\mathbb{N}_0$-valued random walk, which is again a Markov chain. Since properties (2) and (3) imply that in each step the walk moves at most to a neighbouring vertex, it can be seen using the above characterization of Macpherson of infinite distance transitive graphs that the transition probabilities of the $S_n$ are given as follows:
$$P(S_{n+1} = r \mid S_n = k) = \begin{cases} \dfrac{a-1}{a}, & r = k+1,\ k \neq 0,\\[4pt] \dfrac{b-2}{a(b-1)}, & r = k,\ k \neq 0,\\[4pt] \dfrac{1}{a(b-1)}, & r = k-1,\ k \neq 0,\\[4pt] 1, & r = 1,\ k = 0,\\[4pt] 0, & |r-k| > 1, \end{cases}$$

for all $n, k, r \in \mathbb{N}_0$. We note that property (2) ensures that the above transition probabilities do not depend on $v_0$. Using these probabilities we define a sequence $(C_n)_{n\in\mathbb{N}_0}$ of orthogonal polynomials as follows:
$$C_0 \equiv 1, \qquad C_1(x) = \frac{2}{a}\sqrt{\frac{a-1}{b-1}}\,x + \frac{b-2}{a(b-1)},$$

$$C_1 \cdot C_n = \frac{a-1}{a}\,C_{n+1} + \frac{b-2}{a(b-1)}\,C_n + \frac{1}{a(b-1)}\,C_{n-1}.$$
Computing the orthogonality measure of these polynomials, it turns out that they are Bernstein–Szegő polynomials. The orthogonality relation is given by the following formula (see, e.g., [1, (4.28)–(4.30)]):

$$\int_{-1}^{1} C_m(x)\,C_n(x)\,w(x)\,\mathrm{d}x = 0 \qquad (m \neq n,\ a \geq b \geq 2),$$

where

$$w(x) = \frac{1}{2\pi}\,\frac{\sqrt{1-x^2}}{(x-x_1)(x_2-x)}, \qquad l(a,b) = \frac{b-a}{ab},$$

$$x_1 = \frac{2-a-b}{2\sqrt{(a-1)(b-1)}}, \qquad x_2 = \frac{ab-a-b+2}{2\sqrt{(a-1)(b-1)}};$$

here $l(a,b)$ is the mass of the additional point measure which the orthogonality measure carries in the case $b > a$.
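A numerical sanity check of the orthogonality relation is straightforward; since the relation only asserts that the integrals vanish for $m \neq n$, the normalization constant of $w$ plays no role in the check. The following Python sketch (ours, not from the paper; the parameters $a = 3$, $b = 2$ are an assumption) evaluates the integrals by quadrature.

```python
import numpy as np

# Quadrature check with a = 3, b = 2, so that a >= b and x1 < -1 < 1 < x2.
a, b = 3, 2
s = np.sqrt((a - 1) * (b - 1))
x1, x2 = (2 - a - b) / (2 * s), (a * b - a - b + 2) / (2 * s)

x = np.linspace(-1.0, 1.0, 400001)
w = np.sqrt(1.0 - x ** 2) / ((x - x1) * (x2 - x))     # weight, up to its constant

al, be, ga = (a - 1) / a, (b - 2) / (a * (b - 1)), 1.0 / (a * (b - 1))
C = [np.ones_like(x), 2.0 / a * np.sqrt((a - 1) / (b - 1)) * x + be]
for n in range(1, 5):                                  # recurrence of this section
    C.append((C[1] * C[n] - be * C[n] - ga * C[n - 1]) / al)

def integral(f):                                       # trapezoidal rule
    return float(np.sum(f[1:] + f[:-1]) * (x[1] - x[0]) / 2)

for m in range(5):
    for n in range(m + 1, 5):
        print(m, n, integral(C[m] * C[n] * w))         # ~ 0 for all m != n
```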
Following [9] we will call the $C_n$ generalized Cartier polynomials. Since all the linearization coefficients in the product $C_m C_n = \sum_{k=|m-n|}^{m+n} g_{m,n,k} C_k$ are nonnegative (see [9]), the Cartier polynomials define a convolution structure on $M_b(\mathbb{N}_0)$ as described in Section 2. Therefore the transition probabilities of a random walk arising from a simple isotropic random walk on an infinite distance transitive graph are given by $P(S_{n+1} = r \mid S_n = k) = \delta_1 *_C \delta_k(\{r\})$ for all $n, k, r \in \mathbb{N}_0$. Hence $(S_n)_{n\in\mathbb{N}_0}$ is a stationary random walk on $\mathbb{N}_0$ according to Definition 2.1. We generalize this situation by replacing the transition measure $\delta_1$ by an arbitrary probability measure $\mu$. In the context of isotropic random walks on $\Gamma$ this amounts to dropping condition (3). Finally, we note that except for the trivial case $a = b = 2$ all assumptions of Section 2 are satisfied.
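To make the stationary chain concrete, the following Monte Carlo sketch (our own, not from the paper; the parameter choice $a = b = 3$ is an assumption for illustration) simulates $S_n$ directly from the transition probabilities displayed above.

```python
import numpy as np

rng = np.random.default_rng(0)

a, b = 3, 3
p_up, p_stay = (a - 1) / a, (b - 2) / (a * (b - 1))    # p_down = 1/(a(b-1))

n, paths = 200, 50000
S = np.zeros(paths, dtype=np.int64)                    # S_0 = 0
for _ in range(n):
    u = rng.random(paths)
    up = (u < p_up) | (S == 0)                         # from 0 the walk must go to 1
    down = ~up & (u >= p_up + p_stay)
    S += up.astype(np.int64) - down.astype(np.int64)

# the drift per step away from 0 is p_up - p_down, cf. m_1 in Section 4
print(S.mean() / n, p_up - 1 / (a * (b - 1)))
```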
4. Technical notes
In order to state the main results of this paper we have to introduce some notation. If $f$ is an arbitrary $\mathbb{C}$-valued function on $\mathbb{N}_0$ and if $\mu$ is an arbitrary probability measure, let $\mu(f)$ denote the integral $\int_{\mathbb{N}_0} f(k)\,\mu(\mathrm{d}k)$ whenever this integral exists. Using the notation of Section 2 we define:
$$x_0 := \frac{1-\beta}{2\sqrt{\alpha\gamma}} > 1, \qquad \theta_0 := \ln\!\big(\sqrt{\alpha/\gamma}\big) > 0.$$

$x_0$ and $\theta_0$ are related by the formula $x_0 = \cos(i\theta_0)$; indeed, $\cos(i\theta_0) = \cosh(\theta_0) = \tfrac{1}{2}(\sqrt{\alpha/\gamma} + \sqrt{\gamma/\alpha}) = (\alpha+\gamma)/(2\sqrt{\alpha\gamma}) = (1-\beta)/(2\sqrt{\alpha\gamma}) = x_0$. From (1) in Section 2 we obtain by an easy induction argument that $P_n(x_0) = 1$ for all $n \in \mathbb{N}_0$. Next we present a sequence of functions which will be essential for our proof of the rate of convergence.
Definition 4.1. For all $n \in \mathbb{N}_0$ we define a function $m_n$ on $\mathbb{N}_0$ by

$$m_n(k) := i^n \left.\left(\frac{\mathrm{d}}{\mathrm{d}\theta}\right)^{\!n} P_k(\cos\theta)\,\right|_{\theta = i\theta_0}.$$

These functions are called moment functions of order $n$.
Proposition 4.2. (i) $m_1$ and $m_2$ are nonnegative.
(ii) $(m_1(k))^2 < m_2(k)$ for all $k \in \mathbb{N}$ and $m_1(0) = m_2(0) = 0$.
(iii) If $\mu_1$ and $\mu_2$ are probability measures, then

$$\mu_1 *_P \mu_2(m_1) = \mu_1(m_1) + \mu_2(m_1),$$
$$\mu_1 *_P \mu_2(m_2) = \mu_1(m_2) + \mu_2(m_2) + 2\,\mu_1(m_1)\,\mu_2(m_1).$$
For the generalized Cartier polynomials the moment function $m_1$ can be computed explicitly (see [9]). For all $n \in \mathbb{N}_0$ we have

$$m_1(n) = n - \frac{b(a-1)}{a(ab-a-b)}\left(1 - \frac{1}{(a-1)^n(b-1)^n}\right). \qquad (4)$$
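Formula (4) can be checked against Definition 4.1 by numerical differentiation. The sketch below (our own, not from the paper; $a = b = 3$ is an assumption) evaluates $C_k(\cos\theta)$ at the complex point $\theta = i\theta_0$ by the recurrence and forms a central difference.

```python
import numpy as np

a, b = 3, 3
al, be, ga = (a - 1) / a, (b - 2) / (a * (b - 1)), 1.0 / (a * (b - 1))
theta0 = 0.5 * np.log((a - 1) * (b - 1))               # ln sqrt(alpha/gamma)

def C_at(theta, N):
    """C_0,...,C_N at the complex point cos(theta), via the recurrence."""
    z = np.cos(theta)
    C = [1.0 + 0 * z, 2.0 * np.sqrt(al * ga) * z + be]
    for n in range(1, N):
        C.append((C[1] * C[n] - be * C[n] - ga * C[n - 1]) / al)
    return np.array(C)

# m_1(k) = i (d/dtheta) C_k(cos theta) at theta = i*theta0, by central differences
h, N = 1e-6, 10
m1_numeric = (1j * (C_at(1j * theta0 + h, N) - C_at(1j * theta0 - h, N)) / (2 * h)).real
k = np.arange(N + 1)
m1_formula = k - b * (a - 1) / (a * (a * b - a - b)) * (1 - 1.0 / ((a - 1.0) * (b - 1.0)) ** k)
print(np.max(np.abs(m1_numeric - m1_formula)))         # agreement up to roundoff
```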
If $Y$ is any $\mathbb{N}_0$-valued random variable with $E(m_1(Y)) < \infty$ and $E(m_2(Y)) < \infty$, we can define

$$E_{*_P}(Y) := E(m_1(Y)), \qquad V_{*_P}(Y) := E(m_2(Y)) - E_{*_P}(Y)^2$$

to be the modified expectation and the modified variance, respectively. As a consequence of Proposition 4.2(ii) we have $V_{*_P}(Y) \geq 0$, and $V_{*_P}(Y) > 0$ if the law of $Y$ is not equal to $\delta_0$. Now let $(S_n)_{n\in\mathbb{N}_0}$ be a stationary Markov chain with respect to $*_P$ with transition measure $\mu$. If we assume $\mu(m_1)$ and $\mu(m_2)$ to exist, we have $E_{*_P}(S_n) = nE_{*_P}(S_1)$ and $V_{*_P}(S_n) = nV_{*_P}(S_1)$. This is an immediate consequence of Proposition 4.2(iii) and Proposition 2.2.
5. The main results
Let $(S_n)_{n\in\mathbb{N}_0}$ again be a stationary Markov chain with transition measure $\mu \neq \delta_0$. Furthermore, we assume that $\mu(m_1) < \infty$ and $\mu(m_2) < \infty$.
Analogously to sums of independent identically distributed $\mathbb{R}$-valued random variables, we define for all $n \in \mathbb{N}$ a random variable

$$\tilde{S}_n := \frac{m_1(S_n) - nE_{*_P}(S_1)}{\sqrt{nV_{*_P}(S_1)}},$$

with $S_n$ as in the previous section. Since $V_{*_P}(S_1) > 0$ due to $\mu \neq \delta_0$, division by $\sqrt{V_{*_P}(S_1)}$ is allowed. Voit [9] showed that the sequence of the $\tilde{S}_n$ converges in distribution to the standard normal distribution $N(0,1)$ and that an analogue of the theorem of Berry–Esseen holds. The theorem below improves the results of [9] for the special case of generalized Cartier polynomials. The next proposition ensures that the assumptions of Theorem 5.2 make sense, i.e. that there are examples where the assumptions are satisfied.
Proposition 5.1. For the Cartier polynomials we have the following representation as a sum of Chebyshev polynomials of the second kind, for all $n \in \mathbb{N}$ and all $\theta \in \mathbb{C}\setminus\{0\}$:

$$C_n(\cos\theta) = \frac{a-1}{a\,[(a-1)(b-1)]^{n/2}}\;\frac{\sin((n+1)\theta) + \dfrac{b-2}{[(a-1)(b-1)]^{1/2}}\sin(n\theta) - \dfrac{1}{a-1}\sin((n-1)\theta)}{\sin\theta}.$$

As a consequence we have

$$\sup_{n\in\mathbb{N}_0} \big|C_n(\cos(i\theta_0 - x)) - e^{im_1(n)x}\big| = O(x^2)$$

for $x \in \mathbb{R}$ with $|x| \to 0$.
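Both claims of Proposition 5.1 are easy to test numerically. The sketch below (our own, not from the paper; the parameters $a = 4$, $b = 3$ are an assumption) compares the recurrence with the closed form at a generic complex point and tabulates the supremum as $x \to 0$.

```python
import numpy as np

a, b = 4, 3
al, be, ga = (a - 1) / a, (b - 2) / (a * (b - 1)), 1.0 / (a * (b - 1))
s = np.sqrt((a - 1) * (b - 1))
theta0 = np.log(s)

def C_rec(theta, N):                      # recurrence values C_0..C_N at cos(theta)
    z = np.cos(theta)
    C = [1.0 + 0 * z, 2.0 * np.sqrt(al * ga) * z + be]
    for n in range(1, N):
        C.append((C[1] * C[n] - be * C[n] - ga * C[n - 1]) / al)
    return C

def C_closed(theta, n):                   # the representation of Proposition 5.1
    num = (np.sin((n + 1) * theta) + (b - 2) / s * np.sin(n * theta)
           - np.sin((n - 1) * theta) / (a - 1))
    return (a - 1) / (a * s ** n) * num / np.sin(theta)

theta = 0.3 + 0.2j                        # generic complex test point
C = C_rec(theta, 8)
print(max(abs(C[n] - C_closed(theta, n)) for n in range(1, 9)))   # ~ 0 (roundoff)

def m1(n):
    return n - b * (a - 1) / (a * (a * b - a - b)) * (1 - 1.0 / ((a - 1.0) * (b - 1.0)) ** n)

for x in [0.1, 0.05, 0.025]:              # sup_n |C_n(cos(i theta0 - x)) - e^{i m_1(n) x}|
    Cx = C_rec(1j * theta0 - x, 80)
    sup = max(abs(Cx[n] - np.exp(1j * m1(n) * x)) for n in range(81))
    print(x, sup, sup / x ** 2)           # last column stays bounded: O(x^2)
```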
Essential for the proof of the theorem below is to have a bound for the above supremum. So it is natural to ask how fast the $\tilde{S}_n$ converge to $N(0,1)$ in the setting of Section 2, provided that we know how $\sup_{n\in\mathbb{N}_0}|P_n(\cos(i\theta_0 - x)) - e^{im_1(n)x}|$ behaves. An answer is given in the following theorem:
Theorem 5.2. Let $(P_n)_{n\in\mathbb{N}_0}$ be a sequence of orthogonal polynomials as in Section 2. Furthermore, let $(S_n)_{n\in\mathbb{N}_0}$ denote a stationary Markov chain on $\mathbb{N}_0$ with transition measure $\mu$ satisfying $\mu \neq \delta_0$ and $\sum_{k\in\mathbb{N}_0} k^3 \mu(\{k\}) < \infty$. Set $\sigma^2 := V_{*_P}(S_1)$ and let $F_n$ and $G$ be the distribution functions of $(m_1(S_n) - nE_{*_P}(S_1))/(\sigma\sqrt{n})$ and $N(0,1)$, respectively. If we assume that $\sup_{n\in\mathbb{N}_0}|P_n(\cos(i\theta_0 - x)) - e^{im_1(n)x}| = O(x^r)$ $(r > 0)$, the following holds:

$$\|F_n - G\|_\infty = O\big(n^{-\frac{r}{2(r+1)}}\big).$$

If the convolution $*_P$ arises from Cartier polynomials, we have $\|F_n - G\|_\infty = O(n^{-1/3})$.
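The rate for the Cartier case can be illustrated empirically. In the sketch below (our own, not from the paper) we take $\mu = \delta_1$ and $a = b = 3$; a short computation from Definition 4.1 gives $m_1(1) = \alpha - \gamma$ and $m_2(1) = \alpha + \gamma$, hence $E_{*_P}(S_1) = \alpha - \gamma$ and $V_{*_P}(S_1) = \alpha + \gamma - (\alpha - \gamma)^2$. The Kolmogorov distance is estimated by Monte Carlo, so the printed values carry sampling noise of order $\text{paths}^{-1/2}$.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

a, b = 3, 3
al, be, ga = (a - 1) / a, (b - 2) / (a * (b - 1)), 1.0 / (a * (b - 1))
lam, sig = al - ga, sqrt(al + ga - (al - ga) ** 2)   # E*(S_1), sqrt(V*(S_1)) for mu = delta_1

def m1(k):
    return k - b * (a - 1) / (a * (a * b - a - b)) * (1 - 1.0 / ((a - 1.0) * (b - 1.0)) ** k)

def kolmogorov(n, paths=100000):
    S = np.zeros(paths, dtype=np.int64)
    for _ in range(n):                                # one step of the kernel of Section 3
        u = rng.random(paths)
        up = (u < al) | (S == 0)
        down = ~up & (u >= al + be)
        S += up.astype(np.int64) - down.astype(np.int64)
    T = np.sort((m1(S) - n * lam) / (sig * np.sqrt(n)))
    G = np.array([0.5 * (1 + erf(t / sqrt(2))) for t in T])
    i = np.arange(1, paths + 1)
    return max(np.max(i / paths - G), np.max(G - (i - 1) / paths))

for n in [10, 40, 160]:
    d = kolmogorov(n)
    print(n, d, d * n ** (1 / 3))   # the bound predicts the last column stays bounded
```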
Remark. The crucial point in the proof of Theorem 5.2 is the existence of an upper bound for $\sup_{n\in\mathbb{N}_0}|P_n(\cos(i\theta_0 - x)) - e^{im_1(n)x}|$. The only way known to the author to obtain such a bound is to use a Taylor expansion of $P_n(\cos(i\theta_0 - x)) - e^{im_1(n)x}$, as is done in the proof of Proposition 5.1. In this case we can almost reach the classical rate $O(n^{-1/2})$, provided we have a polynomial bound of sufficiently high order.
6. Proof of the main results
Proof of Proposition 5.1. From the recurrence formula for the generalized Cartier polynomials we have

$$C_1(x)\,C_n(x) = \frac{2}{a}\sqrt{\frac{a-1}{b-1}}\,x\,C_n(x) + \frac{b-2}{a(b-1)}\,C_n(x)$$

and hence

$$\frac{2}{a}\sqrt{\frac{a-1}{b-1}}\,x\,C_n(x) = \frac{a-1}{a}\,C_{n+1}(x) + \frac{1}{a(b-1)}\,C_{n-1}(x). \qquad (5)$$
With

$$r := \frac{1}{a}\sqrt{\frac{a-1}{b-1}}, \qquad k := \frac{b-2}{a(b-1)}, \qquad a_n := \frac{a-1}{a}, \qquad c_n := \frac{1}{a(b-1)},$$

one can define a sequence $(Q_n)$ of polynomials with $Q_0(x) := 1$ and $Q_n := r^{-n} a_1 a_2 \cdots a_{n-1}\, C_n$ for all $n \in \mathbb{N}$. Using (5) we obtain

$$2x\,Q_1(x) = Q_2(x) + \frac{c_1}{r^2}\,Q_0(x), \qquad 2x\,Q_n(x) = Q_{n+1}(x) + Q_{n-1}(x) \quad (n \geq 2).$$
From [1, (4.28)–(4.30)] we deduce for all $\theta \in \mathbb{C}\setminus\{0\}$:

$$Q_n(\cos\theta) = \frac{\sin((n+1)\theta) + \dfrac{k}{r}\sin(n\theta) - \dfrac{c_1 - r^2}{r^2}\sin((n-1)\theta)}{\sin\theta}.$$

Going back to the $C_n$, this equation yields the desired representation.
To obtain the second claim we write

$$B_n := \frac{b(a-1)}{a(ab-a-b)}\left(1 - \frac{1}{(a-1)^n(b-1)^n}\right).$$

Using the explicit form of $m_1$ given in Section 4, this yields

$$e^{-im_1(n)x} = \cos(nx)\cos(B_nx) + \sin(nx)\sin(B_nx) - i\,[\sin(nx)\cos(B_nx) - \cos(nx)\sin(B_nx)].$$
Since for small $x$ we have $\sin(x) = x + O(x^2)$ and $\cos(x) = 1 + O(x^2)$, and since $B_n$ is bounded, a longer but straightforward computation shows for small $x$:

$$e^{-im_1(n)x} = B_nx\sin(nx) + \cos(nx) + O(x^2) - i\,[\sin(nx) - B_nx\cos(nx) + O(x^2)],$$
$$C_n(\cos(i\theta_0+x)) = A_nx\sin(nx) + \cos(nx) + O(x^2) + i\,[B_nx\cos(nx) - \sin(nx) + O(x^2)]$$

with

$$A_n = \frac{b(a-1)}{a(ab-a-b)}\left(1 + \frac{1}{(a-1)^n(b-1)^n}\right).$$
Hence we have

$$\mathrm{Re}\big[C_n(\cos(i\theta_0+x)) - e^{-im_1(n)x}\big] = x\,(A_n - B_n)\sin(nx) + O(x^2)$$

with

$$A_n - B_n = \frac{2b(a-1)}{a(ab-a-b)}\,\frac{1}{(a-1)^n(b-1)^n}.$$

Therefore, using $|\sin(x)| \leq |x|$ for all $x \in \mathbb{R}$, we obtain

$$\big|\mathrm{Re}\big[C_n(\cos(i\theta_0+x)) - e^{-im_1(n)x}\big]\big| \leq \frac{2b(a-1)}{a(ab-a-b)}\,\frac{n}{(a-1)^n(b-1)^n}\,x^2 + O(x^2),$$
where the rest $O(x^2)$ does not depend on $n$. Since $\alpha > \gamma > 0$, either $a$ or $b$ has to be greater than 2. Therefore $(a-1)(b-1) > 1$ and

$$\lim_{n\to\infty} \frac{2b(a-1)}{a(ab-a-b)}\,\frac{n}{(a-1)^n(b-1)^n} = 0.$$

Hence

$$\big|\mathrm{Re}\big[C_n(\cos(i\theta_0+x)) - e^{-im_1(n)x}\big]\big| = O(x^2).$$

An analogous argument for the imaginary part shows

$$\big|\mathrm{Im}\big[C_n(\cos(i\theta_0+x)) - e^{-im_1(n)x}\big]\big| = O(x^2),$$

where again the rest $O(x^2)$ does not depend on $n$. Since $C_n(\cos(i\theta_0-x)) - e^{im_1(n)x}$ is the complex conjugate of $C_n(\cos(i\theta_0+x)) - e^{-im_1(n)x}$, the assertion follows.
For the proof of the main theorem we need the following two lemmata.
Lemma 6.1. Let $(P_n)_{n\in\mathbb{N}_0}$ be a sequence of orthogonal polynomials as in Section 2. Then $|P_k(\cos(i\theta_0 - \varepsilon))| \leq 1$ for all $\varepsilon \in \mathbb{R}$, and for a probability measure $\mu$ on $\mathbb{N}_0$ the function $g: \mathbb{R} \to \mathbb{C}$, $\varepsilon \mapsto \sum_{k=0}^{\infty} P_k(\cos(i\theta_0 - \varepsilon))\,\mu(\{k\})$, is well defined. If we assume that $\sum_{k=0}^{\infty} k^n \mu(\{k\}) < \infty$ for some $n \in \mathbb{N}$, then $g$ is $n$ times differentiable and $g^{(n)}(\varepsilon) = \sum_{k=0}^{\infty} (\mathrm{d}/\mathrm{d}\varepsilon)^n P_k(\cos(i\theta_0 - \varepsilon))\,\mu(\{k\})$ holds. Furthermore, if $(S_n)_{n\in\mathbb{N}_0}$ is a stationary Markov chain, then

$$E\big(P_{S_n}(\cos(i\theta_0 - \varepsilon))\big) = \big[E\big(P_{S_1}(\cos(i\theta_0 - \varepsilon))\big)\big]^n$$

for all $\varepsilon \in \mathbb{R}$.

The proof is given in [9].
Lemma 6.2. Let $r \geq 1$ be a real number and let $(P_n)_{n\in\mathbb{N}_0}$ be a sequence of polynomials as in Section 2 satisfying

$$\sup_{k\in\mathbb{N}_0} \big|P_k(\cos(i\theta_0 - x)) - e^{im_1(k)x}\big| = O(x^r)$$

for $|x| \to 0$. For $\sigma > 0$ define for all $t \in \mathbb{R}$:

$$h_n(t) := \left| E\!\left[ e^{itm_1(S_n)/(\sigma\sqrt{n})} - P_{S_n}\!\left(\cos\!\Big(i\theta_0 - \frac{t}{\sigma\sqrt{n}}\Big)\right) \right] \right|.$$

If $(K_n)_{n\in\mathbb{N}}$ is any sequence of real numbers with $K_n \to \infty$ for $n \to \infty$, then for all $d > 0$ there exists a $C > 0$ such that for all $n \in \mathbb{N}$ and all $t \in \mathbb{R}$ satisfying $|t| \leq dK_n$ the following holds:

$$h_n(t) \leq C\,|t|\,\frac{K_n^{r-1}}{n^{r/2}}.$$

In particular, for the generalized Cartier polynomials we have $h_n(t) \leq C|t|\,n^{-2/3}$ for $|t| \leq d\,n^{1/3}$.
Proof. Using the assumption $\sup_{k\in\mathbb{N}_0}|P_k(\cos(i\theta_0 - x)) - e^{im_1(k)x}| = O(x^r)$, there exists a $C_1 > 0$ such that

$$h_n(t) \leq E\left| e^{itm_1(S_n)/(\sigma\sqrt{n})} - P_{S_n}\!\left(\cos\!\Big(i\theta_0 - \frac{t}{\sigma\sqrt{n}}\Big)\right)\right| \leq \sup_{k\in\mathbb{N}_0}\left|P_k\!\left(\cos\!\Big(i\theta_0 - \frac{t}{\sigma\sqrt{n}}\Big)\right) - e^{im_1(k)t/(\sigma\sqrt{n})}\right| \leq C_1\,\frac{|t|^r}{\sigma^r n^{r/2}}.$$

Since $|t| \leq dK_n$ and $r \geq 1$, we have $|t|^r \leq (dK_n)^{r-1}|t|$, and the above inequality becomes

$$h_n(t) \leq \frac{C_1 d^{r-1}}{\sigma^r}\,|t|\,\frac{K_n^{r-1}}{n^{r/2}} =: C\,|t|\,\frac{K_n^{r-1}}{n^{r/2}}.$$

For the generalized Cartier polynomials, Proposition 5.1 allows $r = 2$, and the choice $K_n = n^{1/3}$ yields $h_n(t) \leq C|t|\,n^{-2/3}$ for $|t| \leq dn^{1/3}$.
Proof of Theorem 5.2. The proof runs along the lines of [9], estimating $\|F_n - G\|_\infty$ by Esseen's inequality (see, e.g., [3]). Setting $K := K_n$ with $K_n \to \infty$ for $n \to \infty$ and $K_n \leq K_1\sqrt{n}$, one obtains from Lemma 6.2 together with Lemma 6.1, for suitably chosen constants $K_4, K_5 > 0$, an estimate of the form

$$\sup_{x\in\mathbb{R}} |F_n(x) - G(x)| \leq K_4\,\frac{K_n^{r}}{n^{r/2}} + K_5\,\frac{1}{K_n}.$$

It follows from the above inequality, choosing $K_n = n^{r/(2(r+1))}$ so that the two terms balance, that

$$\sup_{x\in\mathbb{R}} |F_n(x) - G(x)| = O\big(n^{-\frac{r}{2(r+1)}}\big).$$

The claim for the Cartier polynomials follows, since by Proposition 5.1 the assumption of the theorem is satisfied for the Cartier polynomials with $r = 2$.
References

[1] R. Askey, J.A. Wilson, Some basic hypergeometric orthogonal polynomials that generalize Jacobi polynomials, Mem. Amer. Math. Soc. 54 (319) (1985).
[2] T.S. Chihara, An Introduction to Orthogonal Polynomials, Mathematics and its Applications, vol. 13, Gordon and Breach, New York, 1978.
[3] W. Feller, An Introduction to Probability Theory and its Applications, vol. II, 2nd ed., Wiley, New York, 1971.
[4] L. Gallardo, Comportement asymptotique des marches aléatoires associées aux polynômes de Gegenbauer et applications, Adv. Appl. Probab. 16 (2) (1984) 293–323.
[5] H. Heuser, Lehrbuch der Analysis, Teubner, Stuttgart, 1981.
[6] R.I. Jewett, Spaces with an abstract convolution of measures, Adv. Math. 18 (1975) 1–101.
[7] R. Lasser, Orthogonal polynomials and hypergroups, Rend. Mat. Appl. 2 (1983) 185–209.
[8] H.D. Macpherson, Infinite distance transitive graphs of finite valency, Combinatorica 2 (1) (1982) 63–69.
[9] M. Voit, Central limit theorems for random walks on N_0 that are associated with orthogonal polynomials, J. Multivariate Anal. 34 (2) (1990) 290–322.