case of packet erasures. For example, for a packet length $L$, what was one bit earlier will now be a block of $L$ bits. Each binary entry in the encoding/parity-check matrix will now be an $L \times L$ binary matrix. The rate will remain the same. So, at each time, $k$ packets each of length $L$ will be encoded to $n$ packets each of the same length $L$.
Recall that the anytime performance of the code is determined by the delay-dependent codebook $\mathcal{C}_{t,d}$ and its distance distribution $\{N_{w,d}^{t}\}_{w=1}^{nd}$. In the case of packet erasures, one can obtain analogous results by defining the Hamming distance of a codeword slightly differently. Viewing a codeword as a collection of packets, define its Hamming distance to be the number of non-zero packets. The definition of the delay-dependent distance distribution $\{N_{w,d}^{t}\}$ changes accordingly. With this modification, one can readily apply the results developed in Sections 4.6 and 4.7 and the decoding algorithm in Section 5.2 above to the case of packet erasures. For example, a reasonably simple calculation shows that a rate–exponent pair $(R, \beta)$ that is achievable in the case of binary erasures with bit erasure probability $\epsilon$ will be achievable in the case of packet erasures with packet length $L$ and packet erasure probability $\epsilon^{L}$. The converse is not true, though, and we will not delve into the calculations here.
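To make the packet-level distance concrete, the following minimal Python sketch (the function name is ours, for illustration only) counts the number of non-zero packets in a bit string. Note that the packet-level weight $w_{\mathrm{pkt}}$ always satisfies $w_{\mathrm{pkt}} \le w_{\mathrm{bit}} \le L \cdot w_{\mathrm{pkt}}$ relative to the bit-level Hamming weight $w_{\mathrm{bit}}$.

```python
def packet_hamming_weight(bits, L):
    """Hamming weight at packet granularity: the number of non-zero
    packets when `bits` is split into consecutive L-bit packets."""
    assert len(bits) % L == 0, "codeword length must be a multiple of L"
    return sum(any(bits[i:i + L]) for i in range(0, len(bits), L))
```

For example, the bit string `[0,0,1,0, 0,0,1,1]` with $L = 4$ has bit-level weight 3 but packet-level weight 2.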
Here we envision the anytime code operating on top of the existing packet communication layer. One can alternatively consider a mode where the input to the encoder is not packetized. That is, at each time, the encoder receives $K$ bits, say, where $K$ is not necessarily a multiple of $L$, and uses a linear tree code to map these $K$ bits to $N$ bits, where $N$ is a multiple of $L$ and corresponds to $N/L$ packets. The rate of this code is $K/N$ and each block in the block lower triangular generator matrix corresponding to the tree code will have dimension $N \times K$. The analysis is no different in this case.
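To make the dimensions concrete, here is a sketch of a causal (block lower triangular) linear encoder over GF(2) mapping $K$ bits to $N$ bits per time step. The random generator blocks `G[d]` are illustrative placeholders, not an actual anytime-optimal code.

```python
import random

random.seed(0)
K, N, T = 3, 6, 4                     # rate K/N, horizon of T time steps
# G[d] is the N x K generator block applied at delay d (block lower
# triangular structure: output at time t depends on inputs 0..t only).
G = [[[random.randint(0, 1) for _ in range(K)] for _ in range(N)]
     for _ in range(T)]

def encode_bits(inputs):
    """inputs: list of K-bit vectors, one per time step.
    Returns one N-bit vector per time step (GF(2) arithmetic)."""
    out = []
    for t in range(len(inputs)):
        y = [0] * N
        for d in range(t + 1):        # causal sum over past inputs
            x = inputs[t - d]
            for i in range(N):
                for j in range(K):
                    y[i] ^= G[d][i][j] & x[j]
        out.append(y)
    return out
```

Causality is visible directly in the code: changing the input at time $t$ cannot affect any output before time $t$.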
5.3.1 A Sequential Decoder
Consider decoding an $m$-ary tree code over alphabet $\mathcal{S}$ with distance parameter $\alpha$, over a discrete memoryless channel with input and output alphabet $\mathcal{S}$. Suppose that the channel introduces an error with probability $\epsilon$, i.e., the probability that the channel reproduces the input at the output is $1-\epsilon$. Further suppose that $\epsilon < \alpha/2$.
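For concreteness, such a channel can be sampled as follows. The uniform choice among the other symbols on an error is our assumption for illustration; the text only fixes the total error probability $\epsilon$.

```python
import random

def symmetric_channel(x, alphabet, eps, rng=random):
    """Pass symbol `x` through a memoryless channel over `alphabet` that
    makes an error with probability `eps`; on an error, the output is
    drawn uniformly from the other symbols (an illustrative assumption)."""
    if rng.random() < eps:
        return rng.choice([s for s in alphabet if s != x])
    return x
```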
Let $r = (r_1, \ldots, r_t)$ denote the received word till time $t$. Let $\hat{c}_{\tau}$ denote the decoder's estimate of the input to the channel at time $\tau$ using channel outputs till time $t-1$. Also let $\hat{c}^{ML}_{\tau|t}$ denote the corresponding ML estimate using channel outputs received till time $t$. Under the channel model assumed, maximum-likelihood estimation amounts to minimum-distance decoding, i.e.,
$$\hat{c}^{ML}_{1:t|t} = \underset{c \in \mathcal{C}}{\operatorname{argmin}} \; \|r - c\|.$$

One can supply a simple certificate to verify whether $\hat{c}_{\tau} = \hat{c}^{ML}_{\tau|t}$.

Proposition 5.1. If $\|\hat{c}_{1:t} - r\| < \alpha t/2$, then $\hat{c}_1 = \hat{c}^{ML}_{1|t}$.

Proof. Note that
$$\|\hat{c}^{ML}_{1:t|t} - r\| \le \|\hat{c}_{1:t} - r\| < \frac{\alpha t}{2}.$$

Suppose on the contrary that $\hat{c}_1 \ne \hat{c}^{ML}_{1|t}$. Then by the tree code property,

$$\|\hat{c}_{1:t} - \hat{c}^{ML}_{1:t|t}\| \ge \alpha t.$$

So we have

$$\|\hat{c}_{1:t} - r\| = \|\hat{c}_{1:t} - \hat{c}^{ML}_{1:t|t} + \hat{c}^{ML}_{1:t|t} - r\| \ge \|\hat{c}_{1:t} - \hat{c}^{ML}_{1:t|t}\| - \|\hat{c}^{ML}_{1:t|t} - r\| \ge \frac{\alpha t}{2},$$

which is a contradiction. Hence $\hat{c}_1 = \hat{c}^{ML}_{1|t}$.
Similarly, if $\|\hat{c}_{1:t} - r\| < \alpha t/2$ and $\|\hat{c}_{2:t} - r_{2:t}\| < \alpha(t-1)/2$, then $\hat{c}_1 = \hat{c}^{ML}_{1|t}$ and $\hat{c}_2 = \hat{c}^{ML}_{2|t}$. One can proceed like this until the first instant $\tau$ when $\|\hat{c}_{\tau+1:t} - r_{\tau+1:t}\| \ge \alpha(t-\tau)/2$. We will state this as a lemma for easy reference.
Lemma 5.2. Let

$$\tau = \max\left\{ i : \|\hat{c}_{j:t} - r_{j:t}\| < \frac{\alpha(t-j+1)}{2} \text{ for all } 1 \le j \le i \right\}.$$

Then $\hat{c}_i = \hat{c}^{ML}_{i|t}$ for all $1 \le i \le \tau$.
With this observation, we are ready to describe the sequential decoder. Suppose the decoder has computed the ML estimate $\hat{c}^{ML}_{1:t-1|t-1}$ using channel outputs till time $t-1$. Extend $\hat{c}^{ML}_{1:t-1|t-1}$ by one symbol arbitrarily to get a valid guess $\hat{c}_{1:t}$, i.e., $\hat{c}_{1:t} = \left[\hat{c}^{ML}_{1:t-1|t-1},\, \hat{c}_{t|t-1}\right]$ is a codeword. Use Lemma 5.2 to determine the longest prefix of $\hat{c}_{1:t}$ that can be verified to match the ML codeword, and let the length of this prefix be $\tau$, i.e., $\hat{c}^{ML}_{1:\tau|t} = \hat{c}_{1:\tau}$. The remaining portion, $\hat{c}^{ML}_{\tau+1:t|t}$, can be determined by an exhaustive search in the subtree of depth $t-\tau$ that is rooted at the node in the code tree indexed by the prefix $\hat{c}^{ML}_{1:\tau|t}$.
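The step above can be sketched in Python. The toy causal encoder below (hash-style mixing over $\mathbb{Z}_{97}$, with $\alpha = 1$) stands in for a real tree code purely for illustration; an actual construction would be chosen to guarantee the tree-code distance property assumed by the certificate.

```python
import itertools

Q = 97   # channel alphabet size (illustrative)
M = 2    # arity of the code tree (binary message symbols)

def encode(msg):
    """Toy causal encoder: one channel symbol in Z_Q per message symbol.
    Illustration only -- not a code with a proven distance parameter."""
    out, h = [], 0
    for s in msg:
        h = (h * 31 + s + 1) % Q
        out.append(h)
    return out

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def verified_prefix_len(c_hat, r, alpha):
    """Certificate of Lemma 5.2 (0-indexed): largest tau such that
    ||c_hat[i:] - r[i:]|| < alpha*(t-i)/2 for every i < tau."""
    t, tau = len(r), 0
    for i in range(t):
        if hamming(c_hat[i:], r[i:]) < alpha * (t - i) / 2:
            tau = i + 1
        else:
            break
    return tau

def decode_step(prev_msg, r, alpha):
    """One sequential-decoding step at time t = len(r): extend the previous
    ML estimate by an arbitrary symbol, certify a prefix, then search the
    depth-(t - tau) subtree below the certified prefix exhaustively."""
    t = len(r)
    guess = tuple(prev_msg) + (0,)            # arbitrary one-symbol extension
    tau = verified_prefix_len(encode(guess), r, alpha)
    best, best_d = None, t + 1
    for tail in itertools.product(range(M), repeat=t - tau):
        msg = guess[:tau] + tail
        d = hamming(encode(msg), r)
        if d < best_d:
            best, best_d = msg, d
    return best
```

Running `decode_step` once per time step recovers the message even when a channel symbol is corrupted, since the exhaustive search below the certified prefix recomputes the exact minimum-distance codeword.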
5.3.2 Complexity
The total number of operations performed at time $t$ is equal to the sum of the numbers of operations needed to perform the following two tasks:

1. determining the longest prefix as in Lemma 5.2, and

2. exhaustive search in the sub-tree.
[92] shows how to perform 1) with a constant number of operations per time step.
The complexity of 2) is $O(m^{t-\tau})$ since the code tree is $m$-ary. Now,

$$\|\hat{c}_{\tau+1:t} - r_{\tau+1:t}\| \ge \frac{\alpha(t-\tau)}{2} \implies \|\hat{c}^{ML}_{\tau+1:t-1|t-1} - r_{\tau+1:t-1}\| \ge \frac{\alpha}{2}(t-\tau) - 1.$$

The probability of this event is at most $2^{-(t-\tau-1)D(\alpha/2,\,\epsilon)}$. The average complexity is bounded above by

$$\sum_{\ell \ge 1} 2^{-\ell D(\alpha/2,\,\epsilon)}\, m^{\ell},$$
which is finite provided $D(\alpha/2, \epsilon) > \log m$. One can guarantee this for small-enough rates. The more general sequential decoding algorithms in [28, 47, 48] guarantee finite average complexity for rates up to the computational cutoff rate. One can rephrase the complexity distribution as follows: the probability of having to perform $L$ operations decays as $L^{-\gamma}$ for some $\gamma > 0$. For small-enough rates, $\gamma > 1$, which is when the average complexity is bounded. The same technique would apply to linear tree codes also, but linearity allows one to improve the complexity distribution to have a tail that decays as $2^{-\Omega(\sqrt[3]{L})}$ over erasure channels, which is better than any polynomial decay and performs very well in practice. Moreover, the average complexity is bounded for all rates up to the channel capacity. With this background, we will sketch a speculative approach to constructing codes with a similar complexity distribution over the binary symmetric channel.
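To see why the condition $D(\alpha/2, \epsilon) > \log m$ suffices (with logarithms base 2), note that the bound above is a geometric series:

$$\sum_{\ell \ge 1} 2^{-\ell D(\alpha/2,\,\epsilon)}\, m^{\ell} = \sum_{\ell \ge 1} \left(m\, 2^{-D(\alpha/2,\,\epsilon)}\right)^{\ell} = \frac{m\, 2^{-D(\alpha/2,\,\epsilon)}}{1 - m\, 2^{-D(\alpha/2,\,\epsilon)}},$$

which converges precisely when $m\, 2^{-D(\alpha/2,\,\epsilon)} < 1$, i.e., when $D(\alpha/2, \epsilon) > \log m$.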