7.6 HEVC ― High Efficiency Video Coding, UHDTV

7.6.5. HEVC in Detail

This section will describe the essentials of HEVC (High Efficiency Video Coding, also known as H.265). The table below is a comparison of H.261, H.262, H.264, and H.265.

The key features of HEVC are:

• Resolutions of up to 8K

• Coding Tree Unit (CTU), Coding Tree Block (CTB), Coding Unit (CU), Coding Block (CB), Transform Unit (TU), Transform Block (TB) structure

• Multiple reference pictures

• Advanced motion vector signaling

• Motion compensation at higher accuracy

• Dependent slice segments

• DCT (Discrete Cosine Transform) for pixel blocks of 4x4, 8x8, 16x16, and 32x32 pixels

• Improved intra-prediction by shifting a block within a single image in 34 possible directions

• DST (Discrete Sine Transform) for 4x4 luminance intra-prediction residuals

• Deblocking filter

• 8-bit and 10-bit profiles (“deep color space”)

HEVC/H.265 defines tiers, levels, and profiles. The two tiers are called Main tier and High tier. The Main tier is intended for standard usage, while the High tier is intended for professional applications. There are three profiles: Main Profile, Main 10 Profile, and Main Still Picture Profile. The Main Profile and the Main 10 Profile differ in the supported bit depth per sample: 8 bits for Main, and 8, 9, or 10 bits for Main 10. The levels describe the maximum possible resolutions, refresh rates, and data rates. The High tier is not defined for all levels.

Table 7.3. Comparison of H.261, H.262, H.264, and H.265

Feature | H.261 | H.262 / MPEG-2 video | H.264/AVC | H.265/HEVC
--------|-------|----------------------|-----------|-----------
Resolution (pixels) | QCIF (176x144), CIF (352x288) | up to 1920x1080 | up to 4096x2304 | up to 8192x4320
Color subsampling | 4:2:0 | 4:2:0, 4:2:2, 4:4:4 | 4:2:0, 4:2:2, 4:4:4 | 4:2:0, (4:2:2), (4:4:4)
Interlacing | No | Yes | Yes | to date No
Block size | 16x16 | 16x16 | 16x16 | 8x8, 16x16, 32x32, 64x64
Intra-partition size | 16x16 | 16x16 | 16x16, 8x8, 4x4 | 4x4, 8x8, 16x16, 32x32
In-loop filter | No | No | Deblocking | Deblocking, advanced
Transform | 8x8 DCT | 8x8 DCT | 4x4, 8x8 integer DCT | 4x4, 8x8, 16x16, 32x32 integer DCT, 4x4 integer DST, transform skip
Entropy coding | Zig-zag scan, VLC | Zig-zag scan, VLC | Zig-zag scan, CAVLC, CABAC | Horizontal, vertical and diagonal scan, CABAC

Table 7.4. Profiles in HEVC

Profile | Description
--------|------------
Main Profile | 4:2:0, 8-bit
Main 10 Profile | 4:2:0, 8 to 10 bits
Main Still Picture Profile | 4:2:0, 8-bit

The HEVC standard does not contain the terms "Group of Pictures" (GOP), I-, P- and B-pictures that were present in the previous standards, but the term GOP is still used colloquially in various publications. An H.265 data stream or an H.265 GOP begins with a so-called IDR (Instantaneous Decoding Refresh) or CRA (Clean Random Access) picture; at this point, an intra-coded image is transmitted at which the decoder can reset (e.g. when the user switches from one program to the next).

In HEVC, each image is decomposed into larger or smaller slices (with the same flexibility as in H.264). A slice enables parallel video coding and also facilitates re-synchronization during decoding; it is decomposed into Coding Tree Units that in turn consist of Coding Tree Blocks of the Y, Cb, and Cr components. The term “Coding Tree Unit” replaces the term “Macroblock” of the earlier standards. Video coding in H.265 is organized within the slices; there are intra-slices, predicted slices, and bidirectional slices. A Coding Tree Unit covers up to 64x64 pixels and is the common basic unit for all three layers: luminance, Cb, and Cr. A Coding Tree Unit consists of Coding Blocks, and each Coding Block consists of N x N pixels. It is at this point, where a Coding Block is broken down into Transform Blocks and pixels, that H.265/HEVC deviates from its predecessor standards H.262 (MPEG-2 video) and H.264 (MPEG-4 AVC).

In H.265, a “Unit” is always the combination of all component layers, and a “Block” describes a range of N x M or N x N pixels within one component layer (Y, Cb, Cr), i.e. a “Unit” consists of “Blocks”.
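The Unit/Block distinction can be illustrated with a small data-structure sketch. The class names `Block` and `Unit` and the helper `make_ctu` are hypothetical and chosen only for this illustration; the chroma sizes follow from the usual meaning of the subsampling schemes, not from a quotation of the standard.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A rectangular sample array within ONE component layer."""
    component: str  # "Y", "Cb" or "Cr"
    width: int
    height: int


@dataclass
class Unit:
    """All component blocks that cover the same picture area."""
    blocks: list


def make_ctu(luma_size=64, subsampling="4:2:0"):
    """Build a hypothetical CTU: one luma CTB plus two chroma CTBs."""
    # Chroma divisors (horizontal, vertical) for each subsampling scheme.
    div_x, div_y = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}[subsampling]
    return Unit([
        Block("Y", luma_size, luma_size),
        Block("Cb", luma_size // div_x, luma_size // div_y),
        Block("Cr", luma_size // div_x, luma_size // div_y),
    ])


ctu = make_ctu()
for b in ctu.blocks:
    print(b.component, f"{b.width}x{b.height}")
```

With 4:2:0 subsampling, a 64x64 luminance block is accompanied by two 32x32 chrominance blocks, and the three together form one unit.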

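Table 7.5. Levels in HEVC (refresh rates are listed in the same order as the resolutions)

Level | Resolution [pixels, n x m] | Maximal refresh rate [1/s] | Main tier max. data rate [Mbit/s] | High tier max. data rate [Mbit/s]
------|----------------------------|----------------------------|-----------------------------------|----------------------------------
1 | 128x96, 176x144 | 33.7, 15 | 0.128 | --
2 | 176x144, 352x288 | 100, 30 | 1.5 | --
2.1 | 352x288, 640x360 | 60, 30 | 3 | --
3 | 640x360, 720x576, 960x540 | 67.5, 37.5, 30 | 6 | --
3.1 | 720p HD | 33 | 10 | --
4 | 720p HD, 1080p HD | 68, 32 | 12 | 30
4.1 | 720p HD, 1080p HD | 136, 64 | 20 | 50
5 | 720p HD, 1080p HD, 3840x2160 | 272, 128, 32 | 25 | 100
5.1 | 720p HD, 1080p HD, 3840x2160 | 300, 256, 64 | 40 | 160
5.2 | 1080p HD, 3840x2160 | 300, 128 | 60 | 240
6 | 3840x2160, 7680x4320 | 256, 32 | 60 | 240
6.1 | 3840x2160, 7680x4320 | 300, 64 | 120 | 480
6.2 | 7680x4320 | 128 | 240 | 800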
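As a small illustration of how such a level table is used, the following sketch looks up the lowest level whose maximum data rate covers a given stream. The data rates are copied from Table 7.5; the function name `lowest_level` is hypothetical, and the sketch deliberately ignores the resolution and refresh-rate limits that a real level check would also have to consider.

```python
# (level, Main tier Mbit/s, High tier Mbit/s) as listed in Table 7.5.
# None means that no High tier is defined for that level.
LEVELS = [
    ("1", 0.128, None), ("2", 1.5, None), ("2.1", 3, None), ("3", 6, None),
    ("3.1", 10, None), ("4", 12, 30), ("4.1", 20, 50), ("5", 25, 100),
    ("5.1", 40, 160), ("5.2", 60, 240), ("6", 60, 240), ("6.1", 120, 480),
    ("6.2", 240, 800),
]


def lowest_level(bitrate_mbit_s, tier="main"):
    """Return the first level whose data-rate limit covers the bitrate.

    Only the data-rate column is checked (an assumption of this sketch).
    Returns None if no level of the requested tier is sufficient.
    """
    for level, main_rate, high_rate in LEVELS:
        limit = main_rate if tier == "main" else high_rate
        if limit is not None and bitrate_mbit_s <= limit:
            return level
    return None


print(lowest_level(15))          # Main tier
print(lowest_level(15, "high"))  # High tier
```

Because the High tier starts only at level 4, any High tier request resolves to level 4 or above in this lookup.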

170 7 Video Coding (MPEG-2, MPEG-4/AVC, HEVC)

Fig. 7.43. Block formation in all three layers in H.262 to H.265; in H.265, the term “Block” is replaced by the notion of “Coding Tree Block”, “Transform Block”, etc. and has a more granular subdivision.

Fig. 7.44. Block division in H.262, H.264 and H.265

In H.265/HEVC, a Coding Tree Block is then further decomposed into smaller “Tiles” according to the current partial image content, using so-called Coding Quadtree procedures. This subdivision is performed for all layers (luminance and chrominance) and leads to differently sized Prediction Blocks (PB), Coding Blocks (CB), and Transform Blocks (TB). Intra- and inter-frame coding takes place at the Prediction Block level, while transform coding is performed at the Transform Block level. The maximal and minimal sizes of a Transform Block are 32 x 32 pixels and 4 x 4 pixels, respectively. The maximal size of a Prediction Block is 64 x 64 pixels.

Fig. 7.45. Subdivision, refining a Coding Tree Block (CTB) into Coding Blocks (CB) and Transform Blocks (TB)

The transition from H.262/MPEG-2 through H.264/MPEG-4 to H.265/HEVC can be seen as a progressively finer subdivision of the macroblock structures (the “macroblock” being called “Coding Tree Unit” in HEVC), aiming to adapt increasingly “as needed” to the image content being encoded. Accordingly, the HEVC video coder can apply coarse macroblock structures where there are few image details, and select finer structures where more image details need to be encoded (see also Figs. 7.43. and 7.49.). In H.262, the lowest level of detail was the block or macroblock; in H.264, macroblocks were decomposed into macroblock partitions as needed; and in H.265, the Coding Tree Block is decomposed along so-called Coding Quadtrees as needed.
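The content-adaptive subdivision can be sketched as a recursive quadtree split. The split criterion used here (sample variance against a fixed threshold) is purely an assumption for illustration; a real encoder decides by rate-distortion optimization. The names `split_quadtree`, `min_size`, and `threshold` are hypothetical.

```python
def variance(block):
    """Sample variance of a 2-D list of pixel values."""
    vals = [v for row in block for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def split_quadtree(img, x, y, size, min_size=8, threshold=100.0):
    """Return the leaves (x, y, size) covering the square at (x, y).

    A square is split into four quadrants while it is larger than
    min_size and its content is "detailed" (variance above threshold).
    """
    block = [row[x:x + size] for row in img[y:y + size]]
    if size <= min_size or variance(block) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += split_quadtree(img, x + dx, y + dy, half,
                                     min_size, threshold)
    return leaves


# 64x64 test picture: flat, except for a detailed 16x16 area at top-left.
img = [[0] * 64 for _ in range(64)]
for yy in range(16):
    for xx in range(16):
        img[yy][xx] = 255 * ((xx + yy) % 2)

leaves = split_quadtree(img, 0, 0, 64)
for leaf in leaves:
    print(leaf)
```

In this test picture only the detailed corner is refined down to 8x8 blocks, while the flat areas remain 32x32 or 16x16 blocks, which mirrors the "as needed" behavior described above.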

HEVC/H.265 uses the following block hierarchy:

• CTU (Coding Tree Unit, consisting of Coding Tree Blocks of the components Y, Cb, Cr)

• CU (Coding Unit)

• PU (Prediction Unit, a unit of 64 x 64, … 4 x 4 pixels over which intra- or inter-prediction is performed)

• TU (Transform Unit, multiple Transform Blocks with the same type of transform coding)

• CTB (Coding Tree Block)

• CB (Coding Block)

• PB (Prediction Block)

• TB (Transform Block, N x N block subjected to transform coding)

Fig. 7.46. Refining of Coding Tree Blocks along coding quadtree structures

At the end, a Transform Block (an N x N block of a certain size) is designated for each component layer (Y, Cb, Cr), which is then subjected to transform coding (Fig. 7.48.) to convert it into the frequency domain. There, the energy is concentrated into a few values, and quantization is performed in this domain. Transform coding uses DCT (Discrete Cosine Transform), whereby the maximal block size is 32 x 32 pixels and the minimal block size is 4 x 4 pixels. During intra-coding, DST (Discrete Sine Transform) is also possible. The transform is followed by quantization, the first lossy step (irrelevance reduction). Entropy coding is then performed using CABAC (Context-based Adaptive Binary Arithmetic Coding) to decrease redundancy.
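The chain of transform and quantization can be tried out with a minimal sketch. It uses a floating-point orthonormal DCT-II and a single uniform quantizer step, which is an assumption for illustration only: HEVC itself specifies integer approximations of the transform and a QP-controlled quantizer. All function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    m = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        m.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n)])
    return m


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(r) for r in zip(*a)]


def dct2(block):
    """Forward 2-D transform: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D transform: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def quantize(coeffs, step):
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(levels, step):
    return [[v * step for v in row] for row in levels]


# Smooth 4x4 block: a gentle horizontal ramp, identical in every row.
block = [[10, 12, 14, 16] for _ in range(4)]
levels = quantize(dct2(block), step=4)
recon = idct2(dequantize(levels, step=4))

nonzero = sum(1 for row in levels for v in row if v != 0)
max_err = max(abs(block[i][j] - recon[i][j])
              for i in range(4) for j in range(4))
print("non-zero levels:", nonzero)
print("max. reconstruction error:", round(max_err, 3))
```

For this smooth block, only a few quantized values remain to be entropy-coded, and the reconstruction error stays below one gray level. This is the energy concentration and irrelevance reduction described above.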

Fig. 7.47. Intra-coding by shifting a block within an image in H.264 (8 modes) and H.265 (34 modes)

Fig. 7.48. Transform coding

In H.262, differential coding existed only at the differential image level, i.e. during inter-frame coding. While mostly left unmentioned, differential coding between adjacent blocks is also done for the DC coefficients in H.262/MPEG-2 video. This means that intra-frame coding (the coding of image components within an image) is performed in H.262 using DCT transform coding and a differential coding of the DC values of the transformed block.
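The differential coding of DC values between adjacent blocks can be sketched in a few lines. The start value 0 for the predictor is an assumption of this sketch (the actual reset value in MPEG-2 depends on the configured DC precision), and the function names are hypothetical.

```python
def dc_differences(dc_values, predictor=0):
    """Code each DC value as the difference to the previous block's DC."""
    diffs = []
    for dc in dc_values:
        diffs.append(dc - predictor)
        predictor = dc
    return diffs


def dc_reconstruct(diffs, predictor=0):
    """Decoder side: accumulate the differences again."""
    values = []
    for d in diffs:
        predictor += d
        values.append(predictor)
    return values


dc = [120, 122, 121, 125, 124]   # DC values of five adjacent blocks
diffs = dc_differences(dc)
print(diffs)                     # [120, 2, -1, 4, -1]
print(dc_reconstruct(diffs))     # [120, 122, 121, 125, 124]
```

Since neighboring blocks usually have similar mean brightness, the differences are small numbers that can be coded with short code words.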

In H.264, intra-coding (i.e. the coding of individual frames) is extended in the sense that a block can also be coded by shifting blocks and describing these with shift vectors (Fig. 7.47.); the difference is then transform-coded.

H.265 provides significantly more granular block shift vectors for intra-coding (34 direction angles instead of 8 in the predecessor standard H.264; Fig. 7.47.). In addition, DST (Discrete Sine Transform) can now be used for coding adjacent blocks within a single image if this leads to more efficient results. Using DST here makes sense because it may describe patterns within a single image (image background) better.

This does not contradict the Gibbs phenomenon described earlier.
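The principle of intra-prediction, i.e. predicting a block from already coded neighboring pixels and coding only the difference, can be sketched as follows. The sketch is heavily simplified: it knows only three modes (vertical, horizontal, and a flat mean value) instead of the many direction angles mentioned above, and it selects the mode by the sum of absolute differences (SAD), which is an assumption for illustration. All names are hypothetical.

```python
def predict(mode, top, left):
    """Build an n x n prediction from the row above and the column left."""
    n = len(top)
    if mode == "vertical":       # continue the row above downwards
        return [list(top) for _ in range(n)]
    if mode == "horizontal":     # continue the left column to the right
        return [[left[r]] * n for r in range(n)]
    if mode == "dc":             # flat block at the mean of all neighbors
        mean = round((sum(top) + sum(left)) / (2 * n))
        return [[mean] * n for _ in range(n)]
    raise ValueError(mode)


def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_intra_mode(block, top, left):
    """Pick the mode with the smallest SAD and return it with its residual."""
    modes = ("vertical", "horizontal", "dc")
    costs = {m: sad(block, predict(m, top, left)) for m in modes}
    mode = min(costs, key=costs.get)
    pred = predict(mode, top, left)
    residual = [[x - p for x, p in zip(rb, rp)]
                for rb, rp in zip(block, pred)]
    return mode, residual


top = [50, 60, 70, 80]           # reconstructed pixels above the block
left = [52, 52, 52, 52]          # reconstructed pixels left of the block
block = [[51, 61, 70, 79] for _ in range(4)]

mode, residual = best_intra_mode(block, top, left)
print(mode)
print(residual[0])
```

The block in this example continues the structure of the row above it, so the vertical mode wins and only a small residual remains for transform coding.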

The subsequent deblocking filter has the task of smoothing any blocking artifacts caused by transform coding. It performs a targeted search for edges not originally present in the image and eliminates them as well as it can.
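The basic idea of such a filter can be shown with a one-dimensional toy example. This is not the filter specified in HEVC, which works with boundary strengths and quantizer-dependent thresholds; the fixed `threshold`, the correction by a quarter of the step, and the function name `deblock_row` are assumptions made for this sketch.

```python
def deblock_row(row, block_size=8, threshold=10):
    """Smooth small steps at block boundaries; keep large (real) edges."""
    out = list(row)
    for edge in range(block_size, len(row), block_size):
        p0, q0 = row[edge - 1], row[edge]   # pixels left/right of the boundary
        step = q0 - p0
        # A small step exactly at a block boundary is treated as an artifact;
        # a large step is assumed to be a real edge in the picture.
        if 0 < abs(step) < threshold:
            delta = int(step / 4)           # truncates toward zero
            out[edge - 1] = p0 + delta
            out[edge] = q0 - delta
    return out


# Three 8-pixel blocks: a small artificial step, then a real edge.
row = [100] * 8 + [106] * 8 + [200] * 8
filtered = deblock_row(row)
print(filtered[6:10])    # around the first boundary
print(filtered[14:18])   # around the second boundary
```

The step of 6 gray levels at the first boundary is reduced, whereas the jump from 106 to 200 at the second boundary is left untouched because it is considered image content.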

Figure 7.49. shows the block diagram of a hybrid HEVC video encoder. As can be seen, the HEVC encoder also includes an HEVC decoder.

Fig. 7.49. Block diagram of a HEVC encoder with integrated decoder [FKT_2013_HEVC]
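Why the encoder contains a decoder can be demonstrated with a one-dimensional predictive coder. The sketch is an illustration under simplifying assumptions (scalar samples, prediction from the previous sample, a fixed quantizer step) and does not model the actual HEVC encoder; all names are hypothetical.

```python
def quantize(value, step):
    return round(value / step)


def encode(samples, step=8):
    """Closed loop: predict from the encoder's own reconstruction."""
    levels, reference = [], 0
    for s in samples:
        level = quantize(s - reference, step)
        levels.append(level)
        reference += level * step     # same operation as in the decoder
    return levels


def decode(levels, step=8):
    out, reference = [], 0
    for level in levels:
        reference += level * step
        out.append(reference)
    return out


def encode_open_loop(samples, step=8):
    """Open loop: predict from the ORIGINAL previous sample (no decoder)."""
    levels, previous = [], 0
    for s in samples:
        levels.append(quantize(s - previous, step))
        previous = s
    return levels


samples = [3 * i for i in range(1, 41)]   # slow ramp: 3, 6, ..., 120
closed = decode(encode(samples))
drift = decode(encode_open_loop(samples))

closed_err = max(abs(a - b) for a, b in zip(samples, closed))
drift_err = max(abs(a - b) for a, b in zip(samples, drift))
print("closed loop, max. error:", closed_err)
print("open loop, max. error:", drift_err)
```

With the embedded decoder, the encoder predicts from exactly the values the receiver will have, so the error stays within half a quantizer step. Without it, the quantization errors accumulate and the decoded signal drifts away from the original.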

From the document: (Signals and Communication Technology) Walter Fischer, Digital Video and Audio Broadcasting Technology: A Practical Engineering Guide, Springer (2020), pages 185-193