7 Video Coding (MPEG-2, MPEG-4/AVC, HEVC)

7.1 Video Compression

7.1.5 Differential Pulse Code Modulation of Moving Pictures

Adjoining moving pictures differ only very slightly from each other. They contain stationary areas which do not change at all from frame to frame; there are areas which only change their position, and there are objects which are newly added. If each frame were transmitted completely every time, some of the information transmitted would always be the same, resulting in a very high data rate. The obvious conclusion is to differentiate between these types of picture areas and to transmit only the difference, i.e. the delta value, from one frame to the next. This method of redundancy reduction, which is based on a technique that has been known for a long time, is called differential pulse code modulation (DPCM).

What then is differential pulse code modulation? If a continuous analog signal is sampled and digitized, discrete values, i.e. values which are no longer continuous, are obtained at equidistant time intervals (Fig. 7.10).

These values can be represented as pulses spaced apart at equidistant intervals, which corresponds to a pulse code modulation. The height of each pulse carries information in discrete, non-continuous form about the current state of the sampled signal at precisely this point in time.

Fig. 7.14. Forward predicted delta frames (I = intra-frame encoded picture, P = "predicted" forward encoded picture, GOP = group of pictures)

In reality, the differences between adjacent samples, i.e. the PCM values, are not very large because of the preceding band-limiting. If only the difference between adjacent samples is transmitted, transmission capacity is saved and the required data rate is reduced. This variant of pulse code modulation is a relatively old idea and is called differential pulse code modulation (Fig. 7.11).

The problem with plain DPCM, however, is that after switch-on or after transmission errors it takes a very long time until the demodulated time-domain signal again approximately matches the original signal. This problem can be eliminated by the small trick of transmitting at regular intervals first a complete sample, then a few differences, followed again by a complete sample, and so on (Fig. 7.12). This comes very close to the differential pulse code modulation method used in MPEG-1/-2 image compression.
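The trick of interleaving complete samples with differences can be illustrated with a minimal Python sketch. It is not taken from any MPEG implementation: the function names, the `refresh` interval and the sample values are assumptions chosen for illustration, and the differences are left unquantized so that decoding is exact.

```python
def dpcm_encode(samples, refresh=4):
    """Send a complete sample every `refresh` positions, otherwise only the difference."""
    encoded = []
    previous = 0
    for index, value in enumerate(samples):
        if index % refresh == 0:
            encoded.append(("full", value))              # complete sample
        else:
            encoded.append(("delta", value - previous))  # difference to predecessor
        previous = value
    return encoded


def dpcm_decode(encoded, start=0):
    """Rebuild the samples; `start` > 0 simulates switching on in mid-stream."""
    decoded = []
    current = None  # unknown until the first complete sample arrives
    for kind, payload in encoded[start:]:
        if kind == "full":
            current = payload
        elif current is not None:
            current += payload
        decoded.append(current)
    return decoded


samples = [100, 102, 101, 105, 110, 108, 107, 109]
stream = dpcm_encode(samples, refresh=4)
print(dpcm_decode(stream))           # [100, 102, 101, 105, 110, 108, 107, 109]
print(dpcm_decode(stream, start=2))  # [None, None, 110, 108, 107, 109]
```

A receiver that joins at position 2 cannot reconstruct anything until the next complete sample arrives, after which it is fully synchronized again. The shorter the refresh interval, the faster the recovery and the higher the data rate.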

Before a frame is examined for stationary and moving components, it is first divided into numerous square blocks of 16x16 luminance pixels and 8x8 CB and CR pixels each (Fig. 7.13). Due to the 4:2:0 pattern, 8x8 CB pixels and 8x8 CR pixels are in each case overlaid on one layer of 16x16 luminance pixels. This arrangement is called a macroblock (Fig. 7.25). A single frame is composed of a large number of macroblocks, and the horizontal and vertical numbers of pixels are chosen to be divisible by 16 and also by 8 (Y: 720 x 576 pixels). At certain intervals, complete reference frames, so-called I (intra-coded) frames, which are formed without taking a difference, are transmitted repeatedly, with the delta frames (interframes) interspersed between them.
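The macroblock arithmetic for the 720 x 576 luminance raster mentioned above can be checked with a few lines of Python. The constant and function names are assumptions made for this sketch.

```python
MACROBLOCK_SIZE = 16    # luminance pixels per macroblock edge
CHROMA_BLOCK_SIZE = 8   # CB and CR pixels per macroblock edge with 4:2:0


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks for a luminance raster."""
    if width % MACROBLOCK_SIZE or height % MACROBLOCK_SIZE:
        raise ValueError("luminance dimensions must be divisible by 16")
    return width // MACROBLOCK_SIZE, height // MACROBLOCK_SIZE


columns, rows = macroblock_grid(720, 576)
samples_per_macroblock = MACROBLOCK_SIZE ** 2 + 2 * CHROMA_BLOCK_SIZE ** 2
print(columns, rows, columns * rows)  # 45 36 1620
print(samples_per_macroblock)         # 384
```

A 720 x 576 frame therefore consists of 45 x 36 = 1620 macroblocks, each carrying 256 luminance samples and 2 x 64 chrominance samples.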

Fig. 7.15. Bidirectionally predicted delta frames (I = intra-frame encoded picture, P = "predicted" forward encoded picture, B = "bidirectional" encoded picture, GOP = group of pictures; labels: forward encoding, backward encoding)

Forming the difference is done at macroblock level, i.e. each macroblock of a following frame is always compared with the macroblock of the preceding frame. Put more precisely, the macroblock is first examined to see whether it has shifted in any direction due to movement in the picture, has not shifted at all, or whether the picture information in this macroblock is completely new. If there is a simple displacement, only a so-called motion vector is transmitted. In addition to the motion vector, it is also possible to transmit the difference, if any, with respect to the preceding macroblock. If the macroblock has neither shifted nor changed in any way, nothing needs to be transmitted at all. If no correlation with an adjoining preceding macroblock can be found, the macroblock is completely recoded. Pictures produced by such simple forward prediction are called P (predicted) pictures (Fig. 7.14).
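The four cases described above can be summarized as a small decision function. This is only a sketch: it assumes that a motion search has already delivered the best motion vector and the remaining sum of absolute differences, and the `intra_threshold` value is a hypothetical limit. Real encoders make this decision with more elaborate cost criteria.

```python
def classify_macroblock(motion_vector, residual_sad, intra_threshold=2000):
    """Decide what has to be transmitted for one macroblock of a P picture.

    motion_vector:   (dx, dy) of the best match in the preceding frame
    residual_sad:    sum of absolute differences remaining after the shift
    intra_threshold: hypothetical limit above which the match is useless
    """
    if residual_sad > intra_threshold:
        return "intra"         # no correlation found: recode completely
    if residual_sad == 0 and motion_vector == (0, 0):
        return "skip"          # neither shifted nor changed: send nothing
    if residual_sad == 0:
        return "vector"        # simple displacement: motion vector only
    return "vector+delta"      # motion vector plus block difference


print(classify_macroblock((0, 0), 0))       # skip
print(classify_macroblock((3, -1), 0))      # vector
print(classify_macroblock((3, -1), 150))    # vector+delta
print(classify_macroblock((0, 0), 5000))    # intra
```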

Apart from unidirectionally forward predicted frames there are also bidirectionally predicted delta frames, i.e. frames predicted both forward and backward, so-called B pictures. The reason for using them is that bidirectional prediction makes a much lower data rate possible in the B pictures than in the P pictures or even the I pictures. The arrangement of frames occurring between two I pictures, i.e. complete pictures, is called a group of pictures (GOP) (Fig. 7.14).

The motion estimation for obtaining the motion vectors proceeds as follows: Starting with a delta frame to be encoded, the system looks in the preceding frame (forward prediction P) and possibly also in the subsequent frame (bidirectional prediction B) for suitable macroblock information in the environment of the macroblock to be encoded. This is done by using the principle of block matching within a certain search area around the macroblock.
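Block matching can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). The text does not name a matching criterion or a search strategy, so SAD, the full search and all names below are assumptions. The small block size and search range only keep the example readable. The returned vector points from the block position in the current frame to the matching position in the reference frame.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for row in range(size):
        for col in range(size):
            total += abs(frame_a[ay + row][ax + col] - frame_b[by + row][bx + col])
    return total


def find_motion_vector(current, reference, x, y, size=16, search=7):
    """Full search around (x, y); returns ((dx, dy), cost) of the best match."""
    height, width = len(reference), len(reference[0])
    best_vector = (0, 0)
    best_cost = sad(current, x, y, reference, x, y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block would leave the picture
            cost = sad(current, x, y, reference, rx, ry, size)
            if cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost


def make_frame(width, height, square_x, square_y, square=4):
    """Grey test frame containing one bright square."""
    frame = [[16] * width for _ in range(height)]
    for row in range(square_y, square_y + square):
        for col in range(square_x, square_x + square):
            frame[row][col] = 200
    return frame


reference = make_frame(16, 16, 4, 4)  # preceding frame
current = make_frame(16, 16, 6, 5)    # square moved 2 right, 1 down
print(find_motion_vector(current, reference, x=6, y=5, size=4, search=3))
# ((-2, -1), 0)
```

The cost of 0 means that the displaced block matches exactly, so in this case the motion vector alone would be sufficient.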

Fig. 7.16. Motion vectors (frame N-1: forward motion vector; frame N: B-encoded macroblock; frame N+1: backward motion vector; matching window)

If a matching block is found in the preceding frame, and also in the subsequent frame in the case of bidirectional coding, the forward and backward motion vectors are determined and transmitted. In addition, any block delta which may be necessary can also be transmitted, both forward and backward. However, the block delta is coded separately by DCT with quantization, described in the next section, a method which saves a particularly large amount of storage space.
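Why a bidirectionally predicted block often needs only a very small block delta can be seen from the following sketch. It assumes that the forward and backward matches are averaged with rounding, as is common in MPEG-style coders; the text above does not specify how the two predictions are combined, and the function names and pixel values are illustrative.

```python
def bidirectional_prediction(forward_block, backward_block):
    """Average the two motion-compensated blocks, rounding half up (assumed rule)."""
    return [
        [(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(forward_block, backward_block)
    ]


def block_delta(current_block, prediction):
    """Difference that remains to be coded by DCT and quantization."""
    return [
        [c - p for c, p in zip(c_row, p_row)]
        for c_row, p_row in zip(current_block, prediction)
    ]


# A 2x2 detail of a slowly brightening area in frames N-1, N and N+1.
forward_block = [[100, 100], [100, 100]]   # match found in frame N-1
backward_block = [[120, 120], [120, 120]]  # match found in frame N+1
current_block = [[110, 111], [110, 109]]   # block to be encoded in frame N

prediction = bidirectional_prediction(forward_block, backward_block)
print(block_delta(current_block, forward_block))  # [[10, 11], [10, 9]]
print(block_delta(current_block, prediction))     # [[0, 1], [0, -1]]
```

With forward prediction alone, every pixel of the delta is around 10; with the averaged prediction the delta is almost zero and costs correspondingly fewer bits.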

A group of pictures (GOP) then consists of a particular number and a particular structure of B pictures and P pictures arranged between two I pictures. A GOP usually has a length of about 12 frames and follows the order I, B, B, P, B, B, P, ... The B pictures are thus embedded between I and P pictures. Before a B picture can be decoded at the receiving end, however, the information of the preceding I and P pictures and that of the following I or P picture must be available in each case. According to MPEG, however, the GOP structure can be variable. So that not too much storage space needs to be reserved at the receiving end, the GOP structure must be altered during transmission so that the respective backward prediction information is already available before the actual B pictures. For this reason, the frames are transmitted in an order which no longer corresponds to the original order.
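The typical GOP structure mentioned above can be generated from two parameters: the GOP length and the spacing between the I or P pictures. The parameter names are assumptions made for this sketch, and the values 12 and 3 merely reproduce the usual example.

```python
def gop_pattern(length=12, anchor_spacing=3):
    """Picture types of one GOP in display order."""
    types = []
    for index in range(length):
        if index == 0:
            types.append("I")   # GOP starts with a complete picture
        elif index % anchor_spacing == 0:
            types.append("P")   # forward predicted picture
        else:
            types.append("B")   # bidirectionally predicted picture
    return types


print(" ".join(gop_pattern()))  # I B B P B B P B B P B B
```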

Fig. 7.17. Order of picture transmission

Instead of the order I0, B1, B2, P3, B4, B5, P6, B7, B8, P9, the pictures are now transmitted in the following order: I0, B-2, B-1, P3, B1, B2, P6, B4, B5, P9, etc. (Fig. 7.17). That is to say, the P or I pictures following the B pictures are now available at the receiving end before the corresponding B pictures are received and must be decoded. The storage space to be reserved at the receiving end is now calculable and limited. To be able to restore the original order, the frame numbers must be transmitted coded in some way. For this purpose, the DTS (decoding time stamp) values contained in the PES header are used, among other things (see Section 3, The MPEG Data Stream).
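The reordering described above can be reproduced with a short sketch. Pictures are represented as strings such as "B-2" or "P3"; the frame number in the name stands in for the time stamp information of a real stream, and the function names are assumptions.

```python
def transmission_order(display_order):
    """Send each I or P picture before the B pictures that precede it."""
    sent = []
    pending_b = []
    for picture in display_order:
        if picture.startswith("B"):
            pending_b.append(picture)   # must wait for the following I or P
        else:
            sent.append(picture)        # I or P picture goes out first
            sent.extend(pending_b)      # then the B pictures that depend on it
            pending_b = []
    sent.extend(pending_b)  # in a real stream these would wait for the next GOP
    return sent


def restore_display_order(received):
    """Sort by the frame number carried in the picture name."""
    return sorted(received, key=lambda name: int(name[1:]))


display = ["B-2", "B-1", "I0", "B1", "B2", "P3", "B4", "B5", "P6", "B7", "B8", "P9"]
sent = transmission_order(display)
print(sent)
# ['I0', 'B-2', 'B-1', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5', 'P9', 'B7', 'B8']
print(restore_display_order(sent) == display)  # True
```

The first ten entries agree with the transmission order given in the text. The receiver never has to hold more than the two I or P pictures surrounding the current B pictures.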

Fig. 7.18. One-dimensional Discrete Cosine Transform

7.1.6 Discrete Cosine Transform Followed by Quantization

Source: Walter Fischer, Digital Video and Audio Broadcasting Technology: A Practical Engineering Guide (Signals and Communication Technology), Springer, 2020, pages 152-157.