7.6 HEVC ― High Efficiency Video Coding, UHDTV
7.6.5. HEVC in Detail
This section will describe the essentials of HEVC (High Efficiency Video Coding, also known as H.265). The table below is a comparison of H.261, H.262, H.264, and H.265.
The key features of HEVC are:
Resolutions of up to 8K
Coding Tree Units (CTU), Coding Tree Block (CTB), Coding Unit (CU), Coding Block (CB), Transform Unit (TU), Transform Block (TB) – structure
Multiple Reference Pictures
Advanced Motion Vector Signaling
Motion compensation at higher accuracy
Dependent Slice Segments
DCT – Discrete Cosine Transform for pixel blocks of 4x4, 8x8, 16x16, and 32x32 pixels
Improved intra-prediction by shifting a block within a single image in 34 possible directions
DST – Discrete Sine Transform for 4x4 luminance intra-prediction residuals
De-blocking-filter
8-bit and 10-bit profiles (“deep color space”)
HEVC/H.265 defines tiers, levels, and profiles. The two tiers of HEVC are called Main tier and High tier. The Main tier is intended for standard usage, while the High tier is intended for professional applications. There are three profiles: Main Profile, Main 10 Profile, and Main Still Picture Profile. The Main Profile and the Main 10 Profile differ in the supported bit depth (sample resolution): 8 bits, or 8, 9, and 10 bits, respectively. The levels describe the maximum possible resolutions, refresh rates, and data rates. A “High tier” is not defined for all levels.
Table 7.3. Comparison of H.261, H.262, H.264, and H.265

| | H.261 | H.262 (MPEG-2 video) | H.264/AVC | H.265/HEVC |
|---|---|---|---|---|
| Resolution (pixels) | QCIF (176x144), CIF (352x288) | up to 1920x1080 | up to 4096x2304 | up to 8192x4320 |
| Color subsampling | 4:2:0 | 4:2:0, 4:2:2, 4:4:4 | 4:2:0, 4:2:2, 4:4:4 | 4:2:0, (4:2:2), (4:4:4) |
| Interlacing | No | Yes | Yes | No (to date) |
| Block size | 16x16 | 16x16 | 16x16 | 8x8, 16x16, 32x32, 64x64 |
| Intra-partition size | 16x16 | 16x16 | 16x16, 8x8, 4x4 | 4x4, 8x8, 16x16, 32x32 |
| In-loop filter | No | No | De-blocking | De-blocking, advanced |
| Transform | 8x8 DCT | 8x8 DCT | 4x4, 8x8 Integer DCT | 4x4, 8x8, 16x16, 32x32 Integer DCT, 4x4 Integer DST, Transform Skip |
| Entropy coding | Zig-zag scan, VLC | Zig-zag scan, VLC | Zig-zag scan, CAVLC, CABAC | Horizontal, vertical and diagonal scan, CABAC |

168 7 Video Coding (MPEG-2, MPEG-4/AVC, HEVC)
Table 7.4. Profiles in HEVC

| Profile | Description |
|---|---|
| Main Profile | 4:2:0, 8-bit |
| Main 10 Profile | 4:2:0, 8 to 10 bits |
| Main Still Picture Profile | 4:2:0, 8-bit |
The HEVC standard does not contain the terms "Group of Pictures" (GOP), I-, P-, and B-pictures that were present in the previous standards, but the term GOP is still used colloquially in various publications. An H.265 data stream or an H.265 GOP begins with a so-called IDR (Instantaneous Decoding Refresh) or CRA (Clean Random Access) picture; at this point, an intra-coded image is transmitted at which the decoder can reset (e.g. when the user switches from one program to the next).
In HEVC, each image is decomposed into larger or smaller slices (with the same flexibility as in H.264). A slice enables parallel video coding and also facilitates re-synchronization during decoding; it is decomposed into Coding Tree Units that in turn consist of Coding Tree Blocks of the Y, CB, CR components. The term “Coding Tree Unit” replaces the term “Macroblock” of the earlier standards. Video coding in H.265 is organized within the slices; there are intra-slices, predicted slices, and bidirectional slices. A Coding Tree Unit covers up to 64x64 pixels and is the common unit shared by all three layers: luminance, CB, and CR. A Coding Tree Unit is subdivided into Coding Blocks. Each Coding Block consists of N x N pixels. It is at this point, where a Coding Block is broken down into Transform Blocks and pixels, that H.265/HEVC deviates from its predecessor standards H.262 (MPEG-2 video) and H.264 (MPEG-4 AVC).
In H.265, a "Unit" is always the combination of all Component Layers, and a “Block” in H.265 describes a range of N x M or N x N pixels within a Component Layer (Y, CB, CR), i.e. a “Unit” consists of “Blocks”.
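The distinction between “Unit” and “Block” can be illustrated with a small data structure. The following is only a sketch: the class names `Block` and `CodingTreeUnit` and the helper `make_ctu` are hypothetical, and 4:2:0 subsampling is assumed, so that each chrominance block has half the width and half the height of the luminance block.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A rectangular range of samples within ONE component layer."""
    component: str  # "Y", "CB", or "CR"
    width: int
    height: int


@dataclass
class CodingTreeUnit:
    """A unit bundles the co-located blocks of ALL component layers."""
    luma: Block
    cb: Block
    cr: Block


def make_ctu(luma_size: int = 64) -> CodingTreeUnit:
    # Assumption: 4:2:0 subsampling, i.e. chroma is halved in both directions.
    chroma_size = luma_size // 2
    return CodingTreeUnit(
        luma=Block("Y", luma_size, luma_size),
        cb=Block("CB", chroma_size, chroma_size),
        cr=Block("CR", chroma_size, chroma_size),
    )


ctu = make_ctu(64)
print(ctu.luma, ctu.cb, ctu.cr)
```

Under these assumptions, a 64 x 64 Coding Tree Unit holds one 64 x 64 luminance Coding Tree Block and two 32 x 32 chrominance Coding Tree Blocks.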
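Table 7.5. Levels in HEVC

| Level | Resolution [pixels, n x m] | Maximal refresh rate [1/s] | Main Tier max. data rate [Mbit/s] | High Tier max. data rate [Mbit/s] |
|---|---|---|---|---|
| 1 | 128x96 | 33.7 | 0.128 | -- |
| | 176x144 | 15 | | |
| 2 | 176x144 | 100 | 1.5 | -- |
| | 352x288 | 30 | | |
| 2.1 | 352x288 | 60 | 3 | -- |
| | 640x360 | 30 | | |
| 3 | 640x360 | 67.5 | 6 | -- |
| | 720x576 | 37.5 | | |
| | 960x540 | 30 | | |
| 3.1 | 720p HD | 33 | 10 | -- |
| 4 | 720p HD | 68 | 12 | 30 |
| | 1080p HD | 32 | | |
| 4.1 | 720p HD | 136 | 20 | 50 |
| | 1080p HD | 64 | | |
| 5 | 720p HD | 272 | 25 | 100 |
| | 1080p HD | 128 | | |
| | 3840x2160 | 32 | | |
| 5.1 | 720p HD | 300 | 40 | 160 |
| | 1080p HD | 256 | | |
| | 3840x2160 | 64 | | |
| 5.2 | 1080p HD | 300 | 60 | 240 |
| | 3840x2160 | 128 | | |
| 6 | 3840x2160 | 256 | 60 | 240 |
| | 7680x4320 | 32 | | |
| 6.1 | 3840x2160 | 300 | 120 | 480 |
| | 7680x4320 | 64 | | |
| 6.2 | 7680x4320 | 128 | 240 | 800 |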
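The data-rate columns of Table 7.5 can be used to look up the lowest level that admits a given data rate. The following sketch copies the values from the table; the function name `lowest_level_for` is hypothetical, and only the data rate is checked, not the resolution or refresh rate that a level also limits.

```python
# Maximum data rates in Mbit/s as (Main tier, High tier), copied from Table 7.5.
# None marks levels for which no High tier is defined.
MAX_RATE_MBIT = {
    "1": (0.128, None), "2": (1.5, None), "2.1": (3, None),
    "3": (6, None), "3.1": (10, None),
    "4": (12, 30), "4.1": (20, 50),
    "5": (25, 100), "5.1": (40, 160), "5.2": (60, 240),
    "6": (60, 240), "6.1": (120, 480), "6.2": (240, 800),
}


def lowest_level_for(rate_mbit, tier="main"):
    """Return the lowest level whose limit for the tier covers rate_mbit."""
    idx = 0 if tier == "main" else 1
    # Dictionaries keep insertion order, so levels are visited in ascending order.
    for level, limits in MAX_RATE_MBIT.items():
        limit = limits[idx]
        if limit is not None and rate_mbit <= limit:
            return level
    return None  # no level of this tier admits the requested rate


print(lowest_level_for(15))          # Main tier
print(lowest_level_for(35, "high"))  # High tier
```

For 15 Mbit/s in the Main tier the lookup yields level 4.1, since level 4 is limited to 12 Mbit/s.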
Fig. 7.43. Block formation in all three layers in H.262 to H.265; in H.265, the term “Block” is replaced by the notion of “Coding Tree Block”, “Transform Block”, etc. and has a more granular subdivision.
Fig. 7.44. Block division in H.262, H.264 and H.265
In H.265/HEVC, a Coding Tree Block is then further decomposed into smaller “Tiles” according to the local image content, using so-called Coding Quadtree procedures. This subdivision is performed for all layers (luminance and chrominance) and leads to differently sized Prediction Blocks (PB), Coding Blocks (CB), and Transform Blocks (TB). Intra- and inter-frame coding takes place at the Prediction Block level, while transform coding is performed at the Transform Block level. The maximal and minimal sizes of a Transform Block are 32 x 32 pixels and 4 x 4 pixels, respectively. The maximal size of a Prediction Block is 64 x 64 pixels.
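The quadtree subdivision described above can be sketched as a recursive function. This is only an illustration: the text does not state how an encoder decides whether to split, so the variance threshold used here is an assumption, as are the names `split_quadtree`, `min_size`, and `threshold`.

```python
def variance(block):
    """Sample variance of a 2-D list of pixel values."""
    vals = [v for row in block for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def split_quadtree(img, x, y, size, min_size=8, threshold=100.0):
    """Return the leaf blocks (x, y, size) covering a square region.

    Assumed split rule: a block is divided into four quadrants as long as
    its variance exceeds the threshold and it is larger than min_size.
    """
    block = [row[x:x + size] for row in img[y:y + size]]
    if size <= min_size or variance(block) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += split_quadtree(img, x + dx, y + dy, half,
                                     min_size, threshold)
    return leaves


# 64x64 test image: flat everywhere except a detailed 16x16 corner.
img = [[0] * 64 for _ in range(64)]
for yy in range(16):
    for xx in range(16):
        img[yy][xx] = 255 * ((xx + yy) % 2)  # checkerboard detail

leaves = split_quadtree(img, 0, 0, 64)
print(len(leaves), leaves)
```

In this example the flat areas remain as large 32 x 32 and 16 x 16 blocks, while only the detailed corner is refined down to 8 x 8 blocks.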
Fig. 7.45. Subdivision, refining a Coding Tree Block (CTB) into Coding Blocks (CB) and Transform Blocks (TB)
The transition from H.262/MPEG-2 through H.264/MPEG-4 to H.265/HEVC can be seen as a progressively finer subdivision of the macroblock structures (with the “macroblock” called “Coding Tree Unit” in HEVC), aiming to adapt increasingly “as needed” to the current image content to be encoded. Accordingly, the HEVC video coder can apply large blocks where there are few image details, and select finer structures where more image details need to be encoded (see also Figs. 7.43. and 7.49.). In H.262, the lowest level of detail was the block or macroblock; in H.264, macroblocks were decomposed into macroblock partitions as needed; and in H.265, the Coding Tree Block is decomposed along so-called Coding Quadtrees as needed.
HEVC/H.265 uses the following block hierarchy:
CTU (Coding Tree Unit, consisting of Coding Tree Blocks of the components Y, CB, CR)
CU (Coding Unit)
PU (Prediction Unit, a unit of 64 x 64, … 4 x 4 pixels over which intra- or inter-prediction is performed)
TU (Transform Unit, multiple Transform Blocks with the same type of transform coding)
CTB (Coding Tree Block)
CB (Coding Block)
PB (Prediction Block)
TB (Transform Block, N x N block subjected to transform coding)
Fig. 7.46. Refining of Coding Tree Blocks along coding quadtree structures
At the end, a Transform Block (an N x N block of a certain size) is designated for each component layer (Y, CB, CR), which is then subjected to transform coding (Fig. 7.48.) to convert it into the frequency domain. Energy concentration to a few values and quantization are performed in this latter domain. Transform coding uses the DCT (Discrete Cosine Transform), whereby the maximal block size is 32 x 32 pixels and the minimal block size is 4 x 4 pixels. During intra-coding, the DST (Discrete Sine Transform) is also possible. The transform is followed by quantization, the first lossy step (irrelevance reduction). Entropy coding is then performed using CABAC (Context-based Adaptive Binary Arithmetic Coding) to decrease redundancy.
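The sequence of transform and quantization can be illustrated with a 4 x 4 block. Note the assumptions in this sketch: it uses a floating-point orthonormal DCT-II rather than the integer transform of the standard, and the uniform quantizer with step size 8 is a hypothetical choice made only for illustration.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k holds the k-th cosine basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(block):
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))   # C * X * C^T


def idct2(coeffs):
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)  # C^T * Y * C


def quantize(coeffs, step):
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(levels, step):
    return [[v * step for v in row] for row in levels]


# Smooth 4x4 gradient block (typical of image areas with little detail).
block = [[100 + 2 * x + y for x in range(4)] for y in range(4)]
coeffs = dct2(block)
levels = quantize(coeffs, step=8)            # lossy step
recon = idct2(dequantize(levels, step=8))    # decoder-side reconstruction

nonzero = sum(1 for row in levels for v in row if v != 0)
max_err = max(abs(recon[y][x] - block[y][x])
              for y in range(4) for x in range(4))
print(levels, nonzero, max_err)
```

For this smooth block, most of the 16 quantized coefficients become zero, which is the energy concentration that the subsequent entropy coding exploits.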
Fig. 7.47. Intra-coding by shifting a block within an image in H.264 (8 modes) and H.265 (34 modes)
Fig. 7.48. Transform coding
In H.262, differential coding existed mainly at the differential image level, i.e. during inter-frame coding. Although this is rarely mentioned, differential coding between adjacent blocks is also applied to the DC coefficients in H.262/MPEG-2 video. This means that intra-frame coding (the coding of image components within an image) is performed in H.262 using DCT transform coding and a differential coding of the DC values of the transformed blocks.
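The differential coding of DC values between adjacent blocks can be sketched in a few lines. This is a simplified illustration: the function names are hypothetical, and the initial predictor value of 0 is an assumption, not the reset value prescribed by the standard.

```python
def dpcm_encode(dc_values, predictor=0):
    """Replace each DC value by its difference to the previous DC value."""
    diffs = []
    for dc in dc_values:
        diffs.append(dc - predictor)
        predictor = dc  # the current block predicts the next one
    return diffs


def dpcm_decode(diffs, predictor=0):
    """Rebuild the DC values by accumulating the differences."""
    values = []
    for d in diffs:
        predictor += d
        values.append(predictor)
    return values


dc = [512, 516, 515, 520, 519]   # DC values of five adjacent blocks
diffs = dpcm_encode(dc)
print(diffs)                     # [512, 4, -1, 5, -1]
print(dpcm_decode(diffs) == dc)  # True
```

Because neighbouring blocks usually have similar average brightness, the differences are small numbers that can be coded with fewer bits than the DC values themselves.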
In H.264, intra-coding (i.e. the coding of individual frames) is extended in the sense that a block can also be coded by shifting blocks and describing these with shift vectors (Fig. 7.47.); the difference is then DCT-coded.
H.265 provides significantly more granular block shift vectors (34 direction angles instead of 8 in the predecessor standard H.264) for intra-coding (Fig. 7.47.). In addition, DST (Discrete Sine Transform) can now be used for coding adjacent blocks within a single image if this leads to more efficient results. Using DST here makes sense because it may describe patterns within a single image (image background) better. This does not contradict the Gibbs phenomenon described earlier.
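Why a sine transform can be more efficient for intra-prediction residuals may be illustrated with a one-dimensional example. The sketch below assumes the DST variant commonly labelled DST-VII and a residual that grows with the distance from the already coded neighbouring block; both the ramp-shaped test signal and the scale factor of 128 used for the integer approximation are assumptions made for illustration.

```python
import math


def dst7_matrix(n=4):
    """Orthonormal DST-VII basis; row k holds the k-th sine basis function."""
    scale = 2 / math.sqrt(2 * n + 1)
    return [[scale * math.sin(math.pi * (2 * k + 1) * (i + 1) / (2 * n + 1))
             for i in range(n)] for k in range(n)]


def dct2_matrix(n=4):
    """Orthonormal DCT-II basis for comparison."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]


def transform(matrix, signal):
    return [sum(m * s for m, s in zip(row, signal)) for row in matrix]


# Assumed residual: small next to the reference samples, larger further away.
residual = [1, 2, 3, 4]
energy = sum(v * v for v in residual)

dst = transform(dst7_matrix(), residual)
dct = transform(dct2_matrix(), residual)

# Share of the signal energy captured by the first coefficient alone.
dst_share = dst[0] ** 2 / energy
dct_share = dct[0] ** 2 / energy
print(dst_share, dct_share)

# Integer approximation of the 4x4 basis, scaled by the assumed factor 128.
dst_int = [[round(128 * v) for v in row] for row in dst7_matrix()]
print(dst_int)
```

The first sine basis function rises from a small value to a large one, much like the assumed residual, so a single DST coefficient captures a larger share of the energy than the constant first DCT basis function does.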
The subsequent deblocking filter has the task of smoothing any blocking artifacts caused by transform coding. The deblocking filter performs a targeted search for edges not originally present in the image and eliminates them as well as it can.
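The basic idea of such a filter can be sketched for a single row of samples crossing a block boundary. This is a strongly simplified illustration and not the decision logic of the standard, which is more elaborate: the function name `deblock_edge`, the fixed threshold, and the correction by a quarter of the step are all assumptions.

```python
def deblock_edge(left, right, threshold=8):
    """Return smoothed copies of two sample rows meeting at a block edge.

    left[-1] and right[0] are the samples on either side of the edge.
    """
    left, right = list(left), list(right)
    step = right[0] - left[-1]
    # Assumption: a large step is a genuine image edge and is preserved,
    # while a small non-zero step is treated as a coding artifact.
    if 0 < abs(step) < threshold:
        delta = round(step / 4)
        left[-1] += delta   # pull both boundary samples towards each other
        right[0] -= delta
    return left, right


print(deblock_edge([100, 100, 100, 100], [104, 104, 104, 104]))
print(deblock_edge([100, 100, 100, 100], [160, 160, 160, 160]))
```

In the first call the small step of 4 at the boundary is softened; in the second call the step of 60 is taken to be real image content and is left untouched.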
Figure 7.49. shows the block diagram of a hybrid HEVC video encoder. As can be seen, the HEVC encoder also includes an HEVC decoder.
Fig. 7.49. Block diagram of an HEVC encoder with integrated decoder [FKT_2013_HEVC]