OVERVIEW OF THE LATEST VIDEO CODING H.264/AVC STANDARD
2.5 Intra Prediction
In order to increase the efficiency of the intra coding process in H.264/AVC, spatial correlation between adjacent MBs in a given frame is exploited [21]. The idea is based on the observation that adjacent MBs tend to have similar properties. Therefore, as a first step in the encoding process for a given MB, one may predict the MB of interest from the surrounding MBs (typically the ones located above and to the left of the MB of interest, since those MBs will already have been encoded). The difference between the actual MB and its prediction is then coded, which requires fewer bits to represent the MB of interest than applying the transform directly to the MB itself.
In contrast to some previous standards (namely H.263+ and MPEG-4 Visual), where intra prediction has been conducted in the transform domain, intra prediction in H.264/AVC is always conducted in the spatial domain, by referring to neighboring samples of previously coded blocks that lie to the left of and/or above the block to be predicted. In environments with transmission errors, this may cause error propagation, because errors that spread through motion compensation into inter-coded macroblocks can then reach intra-predicted blocks [12]. Therefore, a constrained intra coding mode can be signaled that allows prediction only from intra-coded neighboring macroblocks.
For the luminance (luma) samples, intra prediction may be formed for each 4×4 block or for a 16×16 macroblock. There are nine optional prediction modes for each 4×4 luma block and four optional modes for a 16×16 luma block. The latest H.264 standard also defines an 8×8 luma block size, with nine prediction modes that are the same as those used for 4×4 blocks. Similarly, four prediction modes are available for an 8×8 chroma block.
2.5.1 Intra 4 × 4 Prediction
In the case of a 4×4 luminance block, the prediction block is formed from neighboring pixels of previously reconstructed blocks. The prediction block is calculated based on the samples labeled A-M as shown in Fig. 2.5. For example, mode 2 is called DC prediction, in which all pixels (labeled a to p) are predicted by (A+B+C+D+I+J+K+L)/8.
Mode 0 specifies the vertical prediction mode, in which the pixels labeled a, e, i and m are predicted from A, the pixels labeled b, f, j and n are predicted from B, and so on.
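The vertical and DC modes just described, together with the horizontal mode listed in Table 2.2, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the block is stored row by row (a to d in the first row, e to h in the second, and so on), and the +4 rounding offset in the DC mean is an assumption that mirrors the +2 offset used in the directional formulas rather than something stated in the formula above.

```python
def predict_vertical(top):
    """Mode 0: every row repeats the samples A, B, C, D from above."""
    return [list(top) for _ in range(4)]


def predict_horizontal(left):
    """Mode 1: every column repeats the samples I, J, K, L from the left."""
    return [[left[row]] * 4 for row in range(4)]


def predict_dc(top, left):
    """Mode 2: all 16 pixels take the mean of A..D and I..L."""
    # Assumption: +4 rounds to nearest before the integer division by 8.
    dc = (sum(top) + sum(left) + 4) // 8
    return [[dc] * 4 for _ in range(4)]


top = [10, 20, 30, 40]    # A, B, C, D
left = [12, 14, 16, 18]   # I, J, K, L

vertical = predict_vertical(top)
horizontal = predict_horizontal(left)
dc_block = predict_dc(top, left)
```

Here the column a, e, i, m of `vertical` equals A (10), each row of `horizontal` is filled with its left neighbor, and every entry of `dc_block` is the rounded mean of the eight neighbors.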
The remaining modes are defined similarly according to the different directions as shown in Fig. 2.6 and Table 2.2. Note that in some cases, not all of the samples above and to the left are available within the current slice: in order to preserve independent decoding of slices, only samples within the current slice are available for prediction.
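When neighbors fall outside the current slice, DC prediction can still be formed from whichever samples remain. The sketch below shows one way to handle this; the fallback rules (mean of the available side only, or mid-gray when nothing is available) reflect my understanding of the standard's behavior and should be read as assumptions, as should the hypothetical function name and the use of `None` to mark an unavailable side.

```python
def predict_dc_4x4(top=None, left=None, bit_depth=8):
    """DC prediction for a 4x4 block with optional neighbors.

    top  -- samples A..D, or None if the row above is unavailable
    left -- samples I..L, or None if the column to the left is unavailable
    """
    if top is not None and left is not None:
        dc = (sum(top) + sum(left) + 4) >> 3   # mean of all eight samples
    elif top is not None:
        dc = (sum(top) + 2) >> 2               # mean of A..D only
    elif left is not None:
        dc = (sum(left) + 2) >> 2              # mean of I..L only
    else:
        dc = 1 << (bit_depth - 1)              # assumed mid-gray default
    return [[dc] * 4 for _ in range(4)]


both = predict_dc_4x4([10, 20, 30, 40], [12, 14, 16, 18])
top_only = predict_dc_4x4(top=[10, 20, 30, 40])
left_only = predict_dc_4x4(left=[12, 14, 16, 18])
neither = predict_dc_4x4()
```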
Fig. 2.5 Prediction samples of a 4×4 block.
Fig. 2.6 Nine prediction modes of a 4×4 block.
DC prediction (mode 2) is modified depending on which samples A-M are available; the other modes (modes 0, 1 and 3-8) may only be used if all of the required prediction samples are available. The encoder may select the prediction mode for each block that minimizes the residual between the block to be encoded and its prediction. For example, in the case where mode 3 (diagonal down-left prediction) is chosen, the values of a to p are given as follows:
• a is equal to (A+2B+C+2)/4
• b, e are equal to (B+2C+D+2)/4
• c, f, i are equal to (C+2D+E+2)/4
• d, g, j, m are equal to (D+2E+F+2)/4
• h, k, n are equal to (E+2F+G+2)/4
• l, o are equal to (F+2G+H+2)/4, and
• p is equal to (G+3H+2)/4.
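The seven formulas above follow a single pattern: a pixel at column x and row y uses the three consecutive upper samples starting at index x+y, with the bottom-right pixel p as the only special case. A minimal Python sketch, assuming a hypothetical function name and the samples A to H passed as a list of eight integers, could look like this:

```python
def predict_diag_down_left(above):
    """Mode 3 for a 4x4 block; `above` holds the samples A..H in order."""
    if len(above) != 8:
        raise ValueError("expected the eight samples A..H")
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):          # row index (a..d is row 0)
        for x in range(4):      # column index
            if x == 3 and y == 3:
                # Pixel p: H is counted three times because no sample follows it.
                pred[y][x] = (above[6] + 3 * above[7] + 2) >> 2
            else:
                k = x + y       # pixels on the same anti-diagonal share k
                pred[y][x] = (above[k] + 2 * above[k + 1] + above[k + 2] + 2) >> 2
    return pred


# A..H as a linear ramp, so each 3-tap filter returns its middle sample.
block = predict_diag_down_left([0, 8, 16, 24, 32, 40, 48, 56])
```

With this ramp input, pixel a evaluates to (0 + 2·8 + 16 + 2)/4 = 8 and pixel p to (48 + 3·56 + 2)/4 = 54 after integer truncation.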
Table 2.2 Nine intra 4×4 prediction modes
Mode 0 (Vertical) The upper samples A, B, C, D are extrapolated vertically.
Mode 1 (Horizontal) The left samples I, J, K, L are extrapolated horizontally.
Mode 2 (DC) All samples in P are predicted by the mean of samples A . . . D and I . . . L.
Mode 3 (Diagonal down-left) The samples are interpolated at a 45° angle between lower-left and upper-right.
Mode 4 (Diagonal down-right) The samples are extrapolated at a 45° angle down and to the right.
Mode 5 (Vertical right) Extrapolation at an angle of approximately 26.6° to the left of vertical (width/height = 1/2).
Mode 6 (Horizontal down) Extrapolation at an angle of approximately 26.6° below horizontal.
Mode 7 (Vertical left) Extrapolation (or interpolation) at an angle of approximately 26.6° to the right of vertical.
Mode 8 (Horizontal up) Interpolation at an angle of approximately 26.6° above horizontal.
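Choosing among the modes of Table 2.2 can be sketched as an exhaustive search over candidate predictions. The example below is an assumption-laden illustration: it measures the residual with the sum of absolute differences (SAD), which is one common choice but not one named in the text, it covers only modes 0 to 2 to stay short, and all function names are hypothetical.

```python
def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p)
               for block_row, pred_row in zip(block, pred)
               for b, p in zip(block_row, pred_row))


def candidates(top, left):
    """Predictions for modes 0 (vertical), 1 (horizontal) and 2 (DC)."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return {
        0: [list(top) for _ in range(4)],
        1: [[left[row]] * 4 for row in range(4)],
        2: [[dc] * 4 for _ in range(4)],
    }


def choose_mode(block, top, left):
    """Return the mode with the smallest SAD, plus the cost of every mode."""
    costs = {mode: sad(block, pred)
             for mode, pred in candidates(top, left).items()}
    return min(costs, key=costs.get), costs


block = [[10, 20, 30, 40] for _ in range(4)]   # vertical stripes
top = [10, 20, 30, 40]                         # A, B, C, D
left = [25, 25, 25, 25]                        # I, J, K, L
best, costs = choose_mode(block, top, left)
```

For this striped block the vertical mode reproduces the block exactly, so its residual is zero and it is selected. A practical encoder would typically also weigh the bits needed to signal the mode, which this sketch ignores.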
2.5.2 Intra 8 × 8 Prediction
Similar to the intra 4×4 case, each 8×8 luma block also has nine prediction modes, based on the directions shown in Fig. 2.6. For each 8×8 luma block, one of the nine modes is selected, as in 4×4 intra prediction.
2.5.3 Intra 16 × 16 Prediction
As an alternative to the 4×4 luma modes described above, the entire 16×16 luma component of a macroblock may be predicted. Four modes are available: Mode 0 (vertical), Mode 1 (horizontal), Mode 2 (DC) and Mode 3 (plane). Fig. 2.7 shows the four modes and their directions. Table 2.3 describes the calculation of the four modes.
Fig. 2.7 Intra 16×16 prediction modes.
Table 2.3 Four intra 16×16 prediction modes
Mode 0 (Vertical) Extrapolation from upper samples (H).
Mode 1 (Horizontal) Extrapolation from left samples (V).
Mode 2 (DC) Mean of upper and left-hand samples (H + V).
Mode 3 (Plane) A linear ‘plane’ function is fitted to the upper and left-hand samples H and V. This works well in areas of smoothly varying luminance.
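The plane mode is the least obvious entry in Table 2.3, so a sketch may help. The integer formulas below follow my recollection of the plane prediction process in the H.264 specification and should be checked against it before being relied on; the function and variable names are hypothetical, and `grad_h` and `grad_v` denote the horizontal and vertical gradient sums, not the sample sets H and V of the table.

```python
def clip1(value, bit_depth=8):
    """Clamp a value to the valid sample range."""
    return max(0, min((1 << bit_depth) - 1, value))


def predict_plane_16x16(top, left, corner):
    """Plane prediction for a 16x16 luma block.

    top[x]  -- sample above column x, left[y] -- sample left of row y,
    corner  -- sample above and to the left of the block.
    """
    def t(x):
        return corner if x < 0 else top[x]

    def l(y):
        return corner if y < 0 else left[y]

    # Weighted differences of samples placed symmetrically about the centre.
    grad_h = sum((i + 1) * (t(8 + i) - t(6 - i)) for i in range(8))
    grad_v = sum((i + 1) * (l(8 + i) - l(6 - i)) for i in range(8))
    a = 16 * (left[15] + top[15])
    b = (5 * grad_h + 32) >> 6     # horizontal slope, scaled by 32
    c = (5 * grad_v + 32) >> 6     # vertical slope, scaled by 32
    return [[clip1((a + b * (x - 7) + c * (y - 7) + 16) >> 5)
             for x in range(16)]
            for y in range(16)]


# Neighbors sampled from the plane f(x, y) = 50 + 2x + 3y at x = -1 or y = -1.
top = [50 + 2 * x - 3 for x in range(16)]
left = [50 - 2 + 3 * y for y in range(16)]
corner = 50 - 2 - 3
pred = predict_plane_16x16(top, left, corner)
flat = predict_plane_16x16([100] * 16, [100] * 16, 100)
```

When the neighbors lie exactly on a plane, as in this example, the prediction continues that plane into the block; with constant neighbors it degenerates to a flat block.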
2.5.4 Intra Chroma Prediction
Each chroma component of a macroblock is predicted from chroma samples above and/or to the left that have previously been encoded and reconstructed. The chroma prediction is defined for three possible block sizes, 8×8 chroma in 4:2:0 format, 8×16 chroma in 4:2:2