Rate Distortion Cost Function for H.264/AVC

MODE DECISION ALGORITHM FOR H.264/AVC


adaptive threshold. A low-complexity transform and quantization approach is given in [93].

Some of the aforementioned fast mode decision algorithms classify the MB as a large-partition or small-partition block and skip checking unnecessary modes [94]. However, they still need to compute the rate-distortion costs of the remaining candidate modes for the final mode decision, which involves the computationally intensive processes of transformation, quantization, entropy coding, and pixel reconstruction. For H.264/AVC video coding, SAD-based and SATD-based cost functions have been developed as fast mode decision techniques that avoid these processes. Their major drawback is that the rate-distortion performance of the encoded video is noticeably degraded, which limits their practical use. In this chapter, a relative sum of absolute value (RSAD) based fast mode decision algorithm is proposed, which avoids rate-distortion cost computation while maintaining the high rate-distortion performance of the H.264/AVC codec.

6.2 Rate Distortion Cost Function for H.264/AVC

In the H.264/AVC encoding process, the best MB coding mode is selected by computing the rate-distortion cost of all possible modes. The best mode is the one with the minimum rate-distortion cost, which is defined as [78][18]:

$$J_{RD}(S, C, \mathrm{Mode} \mid QP) = \mathrm{SSD}(S, C, \mathrm{Mode} \mid QP) + \lambda \cdot R(S, C, \mathrm{Mode} \mid QP) \tag{6.1}$$

where QP is the quantization parameter and λ is the Lagrangian multiplier. A strong connection between the local Lagrangian multiplier and the QP was found experimentally:

$$\lambda = 0.85 \times 2^{(QP - 12)/3} \tag{6.2}$$

The rate R in (6.1) can be written as

$$R = R_{\mathrm{header}} + R_{\mathrm{motion}} + R_{\mathrm{residual}} \tag{6.3}$$

where $R_{\mathrm{header}}$, $R_{\mathrm{motion}}$ and $R_{\mathrm{residual}}$ represent the number of bits needed to encode the header information, the motion vectors and the quantized residual block, respectively.
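As a rough illustration of Eqs. (6.1) to (6.3), the sketch below evaluates the Lagrangian cost for given distortion and bit counts. It assumes Eq. (6.2) is read as λ = 0.85 · 2^((QP − 12)/3); the function names and the example numbers are hypothetical, and the SSD and bit counts are taken as inputs rather than produced by a real encoder.

```python
def lagrange_multiplier(qp: int) -> float:
    """Lagrangian multiplier of Eq. (6.2): lambda = 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def rd_cost(ssd: float, r_header: int, r_motion: int, r_residual: int, qp: int) -> float:
    """Rate-distortion cost of Eq. (6.1) with the rate split as in Eq. (6.3).

    The SSD and the three bit counts are assumed to be supplied by the caller;
    obtaining them in practice requires reconstruction and entropy coding.
    """
    rate = r_header + r_motion + r_residual
    return ssd + lagrange_multiplier(qp) * rate


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    for qp in (28, 32, 36, 40):
        print(qp, round(lagrange_multiplier(qp), 2), round(rd_cost(1500.0, 6, 12, 40, qp), 2))
```

Because λ doubles every 3 QP steps, the same bit count weighs far more heavily in the cost at high QP than at low QP.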

In equation (6.1), Mode is chosen from the following sets of candidate prediction modes.

For an Intra frame:

Mode ∈ {Intra 4×4, Intra 16×16}

and for an Inter frame:

Mode ∈ {SKIP, Inter 16×16, Inter 16×8, Inter 8×16, Inter P8×8, Intra 4×4, Intra 16×16}

SKIP mode means that no motion or residual bits are encoded (only the mode indicator is transmitted). Inter P8×8 mode actually contains four sub-modes:

Inter P8×8 ∈ {Inter 8×8, Inter 8×4, Inter 4×8, Inter 4×4}

A similar rate-distortion cost function is employed to determine the optimal sub-mode, that is, the one that achieves the minimal rate-distortion cost. Once the best sub-mode is found, the minimal cost for the MB is obtained by searching over all the possible modes.
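The exhaustive search described above can be sketched as follows. This is a minimal illustration, not the reference encoder: the cost callbacks `mb_cost` and `sub_cost` are hypothetical stand-ins for an evaluation of Eq. (6.1), and the Inter P8×8 cost is assumed to be the sum of the best sub-mode costs of the four 8×8 blocks, ignoring any extra signalling bits.

```python
from typing import Callable, Dict, List, Tuple

INTRA_MODES: List[str] = ["Intra4x4", "Intra16x16"]
INTER_MODES: List[str] = ["SKIP", "Inter16x16", "Inter16x8", "Inter8x16", "InterP8x8"] + INTRA_MODES
SUB_MODES: List[str] = ["Inter8x8", "Inter8x4", "Inter4x8", "Inter4x4"]


def best_p8x8(sub_cost: Callable[[int, str], float]) -> Tuple[List[str], float]:
    """Pick the cheapest sub-mode for each of the four 8x8 blocks and sum the costs."""
    chosen: List[str] = []
    total = 0.0
    for block in range(4):
        mode = min(SUB_MODES, key=lambda m: sub_cost(block, m))
        chosen.append(mode)
        total += sub_cost(block, mode)
    return chosen, total


def decide_mode(intra_frame: bool,
                mb_cost: Callable[[str], float],
                sub_cost: Callable[[int, str], float]) -> Tuple[str, float]:
    """Return the candidate mode with the minimum cost, and that cost."""
    candidates = INTRA_MODES if intra_frame else INTER_MODES
    costs: Dict[str, float] = {}
    for mode in candidates:
        if mode == "InterP8x8":
            _, costs[mode] = best_p8x8(sub_cost)  # sub-mode search first
        else:
            costs[mode] = mb_cost(mode)
    best = min(costs, key=costs.get)
    return best, costs[best]
```

Every call to `mb_cost` or `sub_cost` stands for one full cost evaluation, which is why the number of candidate modes directly drives encoder complexity.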

SSD(S, C) is the sum of squared differences (SSD) between the original block S and the reconstructed block C, and it can be expressed as:

$$\mathrm{SSD}(S, C) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} (S_{ij} - C_{ij})^2 = \lVert S - C \rVert_F^2 \tag{6.4}$$

where $S_{ij}$ and $C_{ij}$ are the (i, j)th elements of the current original block S and the reconstructed block C, respectively, N is the image block size (N = 4 in the H.264 standard), and $\lVert \cdot \rVert_F$ is the Frobenius norm. Computing the spatial-domain SSD is very time-consuming, since the reconstructed block must be obtained through Transformation → Quantization → Inverse Quantization → Inverse Transformation → Pixel Reconstruction for each possible prediction mode [94]. In addition, the number of encoding bits R is costly to calculate, since it requires entropy coding with CAVLC or CABAC, which is also a time-consuming process. To accelerate the coding process, the JVT reference software provides a fast SAD-based cost function:
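For reference, the SSD of Eq. (6.4) itself is a short computation once the reconstructed block is available. The sketch below uses plain nested lists and a hypothetical 4×4 example; it does not implement the transform, quantization and reconstruction chain that produces C, which is where the actual cost lies.

```python
from typing import Sequence

Block = Sequence[Sequence[int]]


def ssd(original: Block, reconstructed: Block) -> int:
    """Sum of squared differences between two equally sized blocks (Eq. 6.4)."""
    return sum(
        (s - c) ** 2
        for row_s, row_c in zip(original, reconstructed)
        for s, c in zip(row_s, row_c)
    )


if __name__ == "__main__":
    # Hypothetical 4x4 original block S and reconstructed block C.
    S = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
    C = [[52, 54, 60, 66], [71, 61, 64, 72], [63, 60, 55, 88], [67, 61, 68, 104]]
    print(ssd(S, C))
```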

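$$J_{SAD} = \begin{cases} \mathrm{SAD}(S, P) + \lambda_1 \cdot 4K & \text{for Intra } 4 \times 4 \text{ mode;} \\ \mathrm{SAD}(S, P) & \text{otherwise.} \end{cases} \tag{6.5}$$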

where SAD(S, P) is the sum of absolute differences between the original block S and the predicted block P. The multiplier $\lambda_1$ is also an approximately exponential function of the quantization parameter (QP), almost the square of λ, and K equals 0 for the most probable mode and 1 for the other modes. This SAD-based cost function saves a great deal of computation because the distortion part is based on the difference between the original block and the predicted block instead of the reconstructed block. Thus, the transformation, quantization, inverse quantization, inverse transformation and reconstruction of the image block can all be skipped.
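A minimal sketch of the SAD-based cost follows, reading Eq. (6.5) as J_SAD = SAD(S, P) + λ1 · 4K for Intra 4×4 mode and SAD(S, P) otherwise. The function and parameter names are hypothetical, and λ1 is passed in by the caller because its exact expression is not given here.

```python
from typing import Sequence

Block = Sequence[Sequence[int]]


def sad(original: Block, predicted: Block) -> int:
    """Sum of absolute differences between the original and predicted blocks."""
    return sum(
        abs(s - p)
        for row_s, row_p in zip(original, predicted)
        for s, p in zip(row_s, row_p)
    )


def j_sad(original: Block, predicted: Block, lambda1: float,
          is_intra4x4: bool, is_most_probable_mode: bool = False) -> float:
    """SAD-based cost: the rate term 4K is only added for Intra 4x4 mode."""
    distortion = float(sad(original, predicted))
    if is_intra4x4:
        k = 0 if is_most_probable_mode else 1  # K = 0 for the most probable mode
        return distortion + lambda1 * 4 * k
    return distortion
```

Note that the predicted block P is used directly, so no reconstruction or entropy coding is needed to evaluate this cost.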

In addition, the number of bits is estimated by a constant equal to either 4 or 0, so entropy coding with CAVLC or CABAC can also be skipped. However, this reduction in computation usually comes at the expense of a significant degradation in coding efficiency. Table 6.1 compares the rate-distortion performance of H.264/AVC using the RDO-based and SAD-based cost functions for different QPs. Compared with the RDO-based encoder, the performance degradation of the SAD-based cost function is not negligible: in every case listed, the PSNR drops and the bit rate rises. Thus, SAD is not an appropriate criterion for determining the best mode. In fact, the SAD values of different inter modes usually satisfy the following relation:

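$$\mathrm{SAD}_{\mathrm{inter}\,16\times16} \ge \mathrm{SAD}_{\mathrm{inter}\,16\times8} \ge \mathrm{SAD}_{\mathrm{inter}\,8\times16} \ge \mathrm{SAD}_{\mathrm{inter}\,8\times8}$$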

which is shown in Table 6.2. The reason is that motion estimation with small partitions generally provides better prediction accuracy than motion estimation with large partitions.

According to this relationship, the $J_{SAD}$ cost function is inclined to choose a smaller block mode as the best mode, since its SAD value is smaller. This explains why the $J_{SAD}$ cost function is not appropriate for the best mode decision.
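The bias toward small partitions can be illustrated with an idealized experiment. The sketch below assumes an exhaustive search over the same displacement window for every partition and uses random synthetic frames; all names are hypothetical. Under these assumptions the 16×16 motion vector is also a valid choice for each 16×8 half, so the summed SAD of the two halves can never exceed the 16×16 SAD. A practical encoder need not meet these assumptions, so the relation is not guaranteed there.

```python
import random


def sad_region(cur, ref, y0, x0, h, w, dy, dx):
    """SAD between the h x w region of `cur` at (y0, x0) and `ref` shifted by (dy, dx)."""
    return sum(
        abs(cur[y0 + y][x0 + x] - ref[y0 + y + dy][x0 + x + dx])
        for y in range(h) for x in range(w)
    )


def best_sad(cur, ref, y0, x0, h, w, search):
    """Minimum SAD over all displacements in [-search, search] x [-search, search]."""
    return min(
        sad_region(cur, ref, y0, x0, h, w, dy, dx)
        for dy in range(-search, search + 1)
        for dx in range(-search, search + 1)
    )


def demo(seed=0, size=16, search=2):
    """Return (SAD of the 16x16 mode, summed SAD of the two 16x8 partitions)."""
    rng = random.Random(seed)
    dim = size + 2 * search  # pad the frames so every displacement stays in bounds
    cur = [[rng.randrange(256) for _ in range(dim)] for _ in range(dim)]
    ref = [[rng.randrange(256) for _ in range(dim)] for _ in range(dim)]
    y0 = x0 = search  # top-left corner of the macroblock
    whole = best_sad(cur, ref, y0, x0, size, size, search)
    half = size // 2
    # Inter 16x8: two partitions, each 16 wide and 8 high, with independent vectors.
    split = (best_sad(cur, ref, y0, x0, half, size, search)
             + best_sad(cur, ref, y0 + half, x0, half, size, search))
    return whole, split


if __name__ == "__main__":
    print(demo())
```

Since the inter branch of the SAD-based cost carries no rate term, a smaller SAD translates directly into a smaller cost, regardless of the extra motion vectors the finer partition would need.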

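Table 6.1 R-D performance comparison using RDO-based and SAD-based cost functions in terms of PSNR (dB) and rate (kbps)

| Sequence | Metric | QP 28 RDO | QP 28 SAD | QP 32 RDO | QP 32 SAD | QP 36 RDO | QP 36 SAD | QP 40 RDO | QP 40 SAD |
|---|---|---|---|---|---|---|---|---|---|
| Akiyo | PSNR (dB) | 38.76 | 38.56 | 35.85 | 35.63 | 33.14 | 32.97 | 30.59 | 30.48 |
| Akiyo | Rate (kbps) | 153.17 | 157.54 | 104.98 | 109.20 | 71.27 | 73.76 | 49.15 | 51.73 |
| Foreman | PSNR (dB) | 36.13 | 35.92 | 33.50 | 33.22 | 31.00 | 30.66 | 28.52 | 28.18 |
| Foreman | Rate (kbps) | 248.51 | 257.17 | 155.01 | 162.55 | 99.95 | 105.21 | 66.36 | 71.29 |
| Stefan | PSNR (dB) | 34.96 | 34.67 | 31.55 | 31.20 | 28.52 | 28.11 | 25.66 | 25.33 |
| Stefan | Rate (kbps) | 593.41 | 602.52 | 370.54 | 376.56 | 228.97 | 232.76 | 141.70 | 145.20 |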

Table 6.2 The probability of the SAD relationship among different inter modes

| Sequence | P(SAD_16×16 ≥ SAD_16×8) | P(SAD_16×16 ≥ SAD_8×16) | P(SAD_16×16 ≥ SAD_8×8) | P(SAD_8×16 ≥ SAD_8×8) |
|---|---|---|---|---|
| Akiyo | 88.70% | 83.65% | 79.44% | 86.24% |
| Foreman | 79.70% | 82.95% | 86.91% | 83.63% |
| Stefan | 85.83% | 84.55% | 86.43% | 89.11% |

From the document Efficient Mode Selection Scheme for Video Coding (pages 113-116)