
Chapter 8

PERFORMANCE ANALYSIS OF
ALGORITHMS IN VIDEO CODING

8.1 Challenges and Literature Survey

stabilization, motion segmentation, and image analysis. However, the requirements of these applications differ significantly from those of video encoding. In these applications, motion vectors [119] have to reflect the real motion within the image sequence; otherwise the algorithms will not give the desired results. For video encoding the situation is different. Motion vectors are used to compensate for the motion within the video sequence, and only the remaining signal (the prediction error) has to be encoded and transmitted. Therefore, the motion vectors have to be selected to minimize the prediction error and the number of bits required to code it. Motion estimation is in most cases based on a search scheme that tries to find the best match for a 16×16 macroblock (MB) of the current frame among the 16×16 blocks within a predetermined or adaptive search range in the previous frame. The position of the best match relative to the original position is the motion vector, which (after subtraction of the predictor and variable-length coding) is transmitted in the bit stream to the video decoder.
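The block-matching search described above can be sketched as an exhaustive (full) search. This is a minimal illustration, not the implementation of any cited work: it assumes the sum of absolute differences (SAD) as the matching criterion, frames stored as lists of rows, and hypothetical names (`sad`, `full_search`, `search_range`). The block size is a parameter so that the demo can use small frames.

```python
import random


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=16, search_range=7):
    """Test every displacement in the search window; return (dx, dy, sad)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Demo: the current frame is the reference frame shifted by (2, 1),
# so the block at (8, 8) should be found at displacement (2, 1).
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
cur = [[ref[min(y + 1, 31)][min(x + 2, 31)] for x in range(32)] for y in range(32)]
print(full_search(cur, ref, bx=8, by=8, n=8, search_range=4))
```

With a search range of ±7 the full search evaluates up to 225 candidates per block, which is the cost that the fast algorithms surveyed in this section try to reduce.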

Fast ME algorithms [120] have been shown to save a large number of search points, but they all rest on the assumption that data are accessed randomly. That is, they assume that the whole memory block is refreshed for each search point, and the order in which the search points are visited is not taken into account. This is very different from the actual hardware environment, so a direct implementation of these algorithms may not be efficient. Two major factors determine whether an ME algorithm is efficient in hardware.

One is memory access efficiency and the other is pipeline efficiency [121]. For memory access, consider a simple architecture in which the video block data is accessed row by row (or column by column). If the block distortions are measured at neighboring locations, it is not necessary to refresh the whole block, which saves data-bus bandwidth. For instance, one-step up, down, left, or right movements are very efficient, since each requires replacing only a single row or column of memory instead of reloading the whole block. Memory access is therefore efficient if the search steps are kept small. Conversely, diagonal moves and random jumps should be avoided, since they require refreshing more data or almost the whole memory block.
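The memory-access argument can be made concrete with a small count of the pixels that have to be fetched for a given move. The sketch below rests on a simplifying assumption that is not taken from the cited work: the local buffer holds exactly one n×n candidate block and can keep whatever part of it overlaps the next candidate. The function name `pixels_to_load` is hypothetical.

```python
def pixels_to_load(dx, dy, n=16):
    """Pixels that must be fetched when the n x n candidate window moves by
    (dx, dy), assuming the previous window is still held in local memory."""
    overlap_w = max(0, n - abs(dx))  # columns shared with the previous window
    overlap_h = max(0, n - abs(dy))  # rows shared with the previous window
    return n * n - overlap_w * overlap_h


for move in [(1, 0), (0, -1), (1, 1), (4, 4), (16, 3)]:
    print(move, pixels_to_load(*move))
```

Under this model a one-step horizontal or vertical move fetches a single column or row (16 pixels), a one-step diagonal move fetches 31 pixels, and a jump of a full block width reloads all 256 pixels.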

For pipeline efficiency, if a fixed search sequence can be defined, all video data can be queued so that the block distortion measure keeps working and the hardware stays fully utilized. For fast ME algorithms, however, the subsequent search path always depends on the distortions of the previously searched points [122]. To handle an uncertain future search point, prediction can be used to pre-fetch the data for the next search point. But when the prediction is wrong the penalty can be serious, since the pipeline needs to be cleared and the data reloaded. For a fast ME algorithm to be pipeline friendly, the number of branches should be kept as small as possible, or the branches should be made predictable.

In [123] a low-power parallel tree architecture is proposed for full-search block-matching motion estimation. The parallel tree architecture exploits the spatial data correlations between parallel candidate block searches for data sharing, which effectively eliminates a large amount of data access bandwidth while consuming fewer hardware resources than array-based architectures. In addition, this architecture eliminates redundant computation without the pipeline latency and excess power consumption caused by register shifting and redundant memory accesses in array-based architectures. In [124] a memory hierarchy model for a full-search motion estimation core is proposed. Motion estimation is the most complex module in a video encoder, requiring intensive computation and high memory bandwidth, especially for high-definition video. The proposed memory hierarchy model is based on a data reuse scheme that takes the features of the full-search algorithm into account; it significantly reduces the external memory bandwidth required for the motion estimation process and provides very high data throughput for the ME core.

In [125] a systolic array architecture for the full-search block-matching algorithm (FSBMA) is described in RTL-level VHDL for use as a motion estimation unit in low bit-rate, real-time applications such as video telephony. The three-step search (TSS) algorithm has been widely used as the motion estimation technique in some low bit-rate video compression applications, owing to its simplicity and effectiveness. However, TSS uses a uniformly allocated checking point pattern in its first step, which is inefficient for the estimation of small motions. A new three-step search (NTSS) algorithm is introduced in [126]. It employs a center-biased checking point pattern in the first step, derived by making the search adaptive to the motion vector distribution, and a halfway-stop technique to reduce the computation cost. Based on the center-biased motion vector distribution characteristic of real-world image sequences, a new four-step search (4SS) algorithm with a center-biased checking point pattern for fast block motion estimation is proposed in [127]. The proposed 4SS performs better than the well-known three-step search and has similar performance to NTSS in terms of motion compensation errors. A block-based gradient descent search (BBGDS) algorithm is proposed in [128] to perform block motion estimation in video coding. The minimum within the checking block is found, and the gradient descent direction in which the minimum is expected to lie is used to determine the search direction and the position of the new checking block.

In block motion estimation, search patterns with different shapes or sizes and the center-biased characteristics of the motion vector distribution have a large impact on the search speed and the quality of performance. An algorithm using a cross-search pattern as the initial step and large/small diamond search (DS) patterns as the subsequent steps for fast block motion estimation is introduced in [129]; the authors also show that this cross-diamond search (CDS) is much more robust, and provides faster search speed and smaller distortions, than other popular fast block-matching algorithms. In [130] two cross-diamond-hexagonal search (CDHS) algorithms are proposed, which differ from each other in the sizes of their hexagonal search patterns. These algorithms employ two cross-shaped search patterns consecutively in the very first steps and then switch to diamond-shaped patterns. To further reduce the number of checking points, two pairs of hexagonal search patterns are proposed for use when candidates are found at the diamond corners. The widespread use of block-based inter-frame motion estimation for video sequence compression in both the MPEG and H.263 standards is due to its effectiveness and simplicity of implementation. Nevertheless, the high computational complexity of the full-search algorithm has motivated a host of suboptimal but faster search strategies.

A popular example is the three-step search (TSS) algorithm. However, its uniformly spaced search pattern is not well matched to most real-world video sequences, in which the motion vector distribution is non-uniformly biased toward the zero vector. This observation inspired the new three-step search (NTSS), which has a center-biased search pattern and supports a halfway-stop technique. It is faster on average and gives better motion estimation than the well-known TSS. Later, the four-step search (4SS) algorithm was introduced to reduce the average case from 21 to 19 search points while maintaining a performance similar to NTSS in terms of motion compensation errors. An unrestricted center-biased diamond search (UCBDS) algorithm, which is more efficient and robust over a wide range of test video sequences, is introduced in [131]. Inspired by the above, this chapter proposes an efficient, effective, and robust approach to reduce the number of search points in motion estimation. In addition, the performance is analyzed and a ranking of the commonly used motion estimation algorithms is proposed for low- and high-motion video content.
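To illustrate the uniformly spaced pattern of the three-step search discussed above, the following sketch implements the classic coarse-to-fine scheme with step sizes 4, 2, and 1. It is an illustration under stated assumptions rather than the code of any cited paper: the matching error is supplied as a hypothetical callable `cost(dx, dy)` (for example a SAD over a 16×16 block), and frame-boundary checks are omitted.

```python
def three_step_search(cost, step=4):
    """Coarse-to-fine search. `cost(dx, dy)` returns the matching error of a
    candidate displacement. Returns (dx, dy, points_checked)."""
    cx, cy = 0, 0
    best = cost(0, 0)
    checked = 1
    while step >= 1:
        best_pos = (cx, cy)
        # Check the 8 neighbors of the current center at the current spacing.
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue
                c = cost(cx + ox, cy + oy)
                checked += 1
                if c < best:
                    best, best_pos = c, (cx + ox, cy + oy)
        cx, cy = best_pos  # recenter on the best point found so far
        step //= 2         # halve the spacing: 4 -> 2 -> 1 -> stop
    return cx, cy, checked


# Toy error surface with a single minimum at displacement (3, -2).
print(three_step_search(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2))
```

This variant always evaluates 1 + 3 × 8 = 25 points for a ±7 search range, regardless of how small the true motion is. The center-biased algorithms described above (NTSS, 4SS, and the diamond-based searches) aim to reduce this count for the small motions that dominate real sequences.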