- 5.1 Comparison (with MSE and CC) of local features (overlapping and non-overlapping regions) for different window sizes.
- 5.2 Comparison (by SSI and QI) of local features (overlapping and non-overlapping regions) for different window sizes.
Image Formation Models
Image formation in a single camera-based setup
Each transformation is a transition from one coordinate system to another, stated as follows: (i) the transformation from the point xw to xc is performed by translating the point by the vector t and then rotating it by the matrix R. In simple terms, the three-dimensional point is then projected onto the image plane by multiplying it by the camera matrix.
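To make this mapping concrete, the following minimal sketch follows the translate-then-rotate convention used in the text and then applies a camera (intrinsic) matrix K; the function name, the intrinsic values, and the example point are illustrative and not taken from the thesis.

```python
import numpy as np

def project_point(x_w, R, t, K):
    """Project a 3-D world point onto the image plane (pinhole model).

    x_w : (3,) world point, R : (3,3) rotation, t : (3,) translation,
    K   : (3,3) camera intrinsic matrix.
    """
    # World -> camera coordinates: translate by t, then rotate by R
    x_c = R @ (x_w + t)
    # Camera -> image plane: multiply by the camera matrix
    x_img_h = K @ x_c                    # homogeneous image coordinates
    return x_img_h[:2] / x_img_h[2]      # perspective division

# Example: a point 5 m in front of an axis-aligned camera
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
u, v = project_point(np.array([0.1, 0.2, 5.0]), np.eye(3), np.zeros(3), K)
```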
Image formation in a stereo vision setup
- Epipolar geometry
- Rectification
- Triangulation
- Relation between depth information and disparity value
- Single camera-based depth estimation
The mapping of a point in one image plane to the corresponding epipolar line in the other image can be performed using epipolar geometry (the epipolar constraint). For a three-dimensional world point, its projection in one image constrains the corresponding point in the other image to lie on the associated epipolar line.
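As a small illustration of the relation between depth information and disparity value listed above, the sketch below assumes a rectified stereo pair with focal length f (in pixels) and baseline B (in metres); the numeric values are only an example.

```python
def depth_from_disparity(d, focal_length_px, baseline_m):
    """Depth (in metres) from disparity (in pixels) for a rectified stereo pair.

    Uses the standard relation Z = f * B / d, where f is the focal length in
    pixels, B the baseline between the two camera centres, and d the disparity.
    """
    if d <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / d

# Example: f = 700 px, baseline = 0.12 m, disparity = 35 px  ->  Z = 2.4 m
z = depth_from_disparity(35, 700, 0.12)
```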
Applications of Stereo Vision
These two objects have different disparity values despite having the same color, as can be seen in Figure 1.7(c). The size of an object, the distance between two objects, and the distance of an object from the camera can be determined in a stereo vision setup.
Basics of Stereo Correspondence
Issues of Accurate Disparity Map Estimation
- Occlusions
- Photometric variations
- Image sensor noise
- Specularities and reflections
- Foreshortening effect
- Perspective distortions
- Textureless regions
- Repetitive structures
- Discontinuity
Hence, the disparity map obtained from stereo image pairs may not provide accurate information about specular surfaces. In general, stereo correspondence methods assume that the areas of objects are the same in both stereo images.
Organization of the Thesis
The matching costs of the local regions are then combined using a two-step filtering process to estimate the disparity map. Disparity map estimation is one of the techniques used to extract three-dimensional information of a scene.
The Basic Principle of Finding a Disparity Map
Global Algorithms
Data term
It is meaningless to compute the matching cost for occluded pixels, since these pixels are visible in one image but not in the other. According to this procedure, if two pixels of the left image correspond to the same pixel of the right image, then the pixel with the smaller disparity value is considered the occluded pixel.
Smoothness term
In equation (2.12), a higher weight is assigned if both pixels have the same color, and a lower weight otherwise. This problem can be overcome by considering more adjacent pixels, which slightly modifies the smoothness term given in equation (2.11).
Optimization
- Dynamic programming
- Graph cut
- Belief propagation
To overcome this, a heuristic method that incorporates vertical smoothness into the optimization technique has been proposed [45]. A message from pixel p to pixel q encodes the belief of p about q, i.e., the probability of q taking a particular disparity value.
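To make the message-passing idea concrete, the sketch below shows a single min-sum belief-propagation message update for a stereo MRF with L disparity labels. It is a generic textbook formulation rather than the exact scheme of [45]; the data term, smoothness term, and variable names are illustrative.

```python
import numpy as np

def bp_message(data_cost_p, incoming_msgs, smooth_cost):
    """One min-sum belief-propagation message from pixel p to neighbour q.

    data_cost_p   : (L,) data term D_p(d) for each of L disparity labels
    incoming_msgs : list of (L,) messages from p's other neighbours (excluding q)
    smooth_cost   : (L, L) pairwise term V(d_p, d_q)
    Returns the (L,) message m_{p->q}(d_q).
    """
    # Local evidence at p: data term plus messages from all neighbours except q
    h = data_cost_p + np.sum(incoming_msgs, axis=0)
    # For every candidate label d_q at q, minimise over the labels d_p at p
    msg = np.min(h[:, None] + smooth_cost, axis=0)
    # Normalise (messages are defined up to an additive constant)
    return msg - msg.min()

# Example with 16 disparity labels and a truncated linear smoothness term
L = 16
labels = np.arange(L)
V = np.minimum(np.abs(labels[:, None] - labels[None, :]), 3).astype(float)
m = bp_message(np.random.rand(L), [np.zeros(L)] * 3, V)
```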
Local Algorithms
Problem with different window sizes
- Choosing an appropriate window
- Adaptive window size and multi-resolution approaches
- Cost aggregation
On the other hand, a fixed window is applied to the pixels of the non-boundary areas. The bilateral filter depends on the histogram of the difference image, which is independent of the window size.
Problem of finding a disparity map for varying illumination
- Gabor phase-based stereo correspondence
- Adaptive normalized cross correlation
- Mutual information-based matching
Ambiguities in phase-based disparity map estimation arise due to the presence of singularities in the phase information. In [98], the phase information is obtained by convolving the input images with Gabor filters at different scales.
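As a simple, generic illustration of an illumination-robust matching cost, in the spirit of the correlation-based approaches listed above (though not the adaptive NCC of the cited work), the following sketch computes a zero-mean normalised cross-correlation between two patches; the epsilon guard is an arbitrary choice.

```python
import numpy as np

def zncc(patch_l, patch_r, eps=1e-8):
    """Zero-mean normalised cross-correlation between two image patches.

    Subtracting the mean and dividing by the standard deviations makes the
    score invariant to gain and offset changes between the two views, which
    is why correlation-type costs are common under varying illumination.
    Returns a value in [-1, 1]; 1 means a perfect match.
    """
    a = patch_l.astype(float) - patch_l.mean()
    b = patch_r.astype(float) - patch_r.mean()
    return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + eps))

# A matching cost can then be defined as, e.g., cost = 1 - zncc(p_l, p_r).
```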
Occlusion Detection and Filling
Occlusion detection
There is a sudden change in the disparity value of the occluded surface with respect to the background. In this method, if the disparity values of a pair of corresponding pixels are different, the pixel in the reference image is considered to be occluded. The x-coordinate pest1 of the estimated left pixel pest corresponding to the x-coordinate p′1 of pixel p′ in the right image is then computed.
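A minimal sketch of the left-right consistency (cross-checking) test just described is given below, assuming rectified images, the convention that a left pixel at column x with disparity d maps to column x - d in the right image, and an illustrative threshold of one pixel.

```python
import numpy as np

def lrc_occlusion_mask(disp_left, disp_right, threshold=1):
    """Left-right consistency check on a pair of disparity maps.

    A pixel (y, x) of the left map with disparity d is matched to pixel
    (y, x - d) of the right map; if the two disparity values differ by more
    than `threshold`, the left pixel is flagged as occluded.
    """
    h, w = disp_left.shape
    occluded = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            d = int(round(disp_left[y, x]))
            xr = x - d
            if xr < 0 or xr >= w:
                occluded[y, x] = True          # no valid correspondence
            elif abs(disp_left[y, x] - disp_right[y, xr]) > threshold:
                occluded[y, x] = True          # inconsistent disparities
    return occluded
```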
Occlusion filling
It is worth noting that LRC has been the most widely used occlusion detection algorithm over the last few decades. Bimodality and goodness-of-fit jumps can detect border-occluded regions, while the LRC, ORD, and OCC algorithms can detect all semi-occluded regions [110]. Although the above-mentioned methods are able to detect occluded pixels, they also label many correctly matched pixels as occluded.
Summary
The degradation of disparity map accuracy is mainly due to the choice of a poor energy function. The performance of local methods becomes comparable to that of global methods after a cost aggregation step is incorporated into the stereo correspondence method. Ambiguity in the disparity map occurs mainly because the features used to find the matching pixels fail to distinguish between the matching pixel and its neighboring pixels.
Motivation of the Thesis
Objective of the Thesis
Stereo matching constraints and assumptions
This constraint states that the matching point of a pixel in the left image lies on the corresponding epipolar line in the right image. This constraint states that there is at most one matching pixel in the right image corresponding to every pixel in the left image. A segment Sl in the left image with spatial orientation θl corresponds to a segment Sr in the right image with spatial orientation θr if the following condition holds.
General steps of disparity map computation
- Matching cost computation
- Cost aggregation
- Disparity computation/optimization
- Disparity map refinement
This computation gives a cost value for each pixel in the reference image at every candidate disparity. Cost aggregation for a pixel p is performed by combining the cost values of all the pixels in its support region. The disparity map is then obtained by determining the disparity dp of every pixel p in the reference image.
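The steps above can be illustrated with a deliberately simple local pipeline: absolute-difference matching costs, box-filter cost aggregation, and winner-takes-all disparity selection (the refinement step is omitted). This is a generic sketch with simplified stand-ins, not the Gabor-feature costs or Kuwahara/median aggregation discussed later; max_disp and win are illustrative parameters.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def block_matching_disparity(left, right, max_disp=64, win=9):
    """Minimal local stereo matching: AD cost, box aggregation, WTA.

    left, right : rectified greyscale images as 2-D float arrays.
    Returns the left disparity map.
    """
    h, w = left.shape
    cost_volume = np.full((max_disp, h, w), np.inf)
    for d in range(max_disp):
        # 1) Matching cost computation: absolute difference at disparity d
        shifted = np.empty_like(right)
        shifted[:, d:] = right[:, : w - d]
        shifted[:, :d] = left[:, :d]          # no valid match in first d columns
        cost = np.abs(left - shifted)
        # 2) Cost aggregation: average the cost over a win x win support window
        cost_volume[d] = uniform_filter(cost, size=win)
    # 3) Disparity computation: winner-takes-all over the aggregated costs
    return np.argmin(cost_volume, axis=0)
```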
A brief overview of existing stereo correspondence algorithms
In addition, the size of the eigenvector depends on the number of histogram bins. The computational complexity of this method depends on the size of the image and the number of labels used. The authors of [120] proposed a cost aggregation method based on a linear model. Therefore, in the proposed method, PCA is used to reduce the dimensionality of the Gabor wavelet coefficients.
Proposed Local Stereo Matching Method
- Matching cost computation
- Cost aggregation
- Disparity computation
- Disparity map refinement
This improved performance of the proposed feature is due to the fact that the directional features are extracted at different orientations and scales. To show the performance of the proposed cost aggregation method, a guided filter is used instead of the Kuwahara and median filter combination for cost aggregation [11], and the comparative result is shown in the table. This is achieved by taking the index of the minimum value of the aggregated costs.
Datasets used for Evaluation
In the occlusion filling step, min(dl, dr) is assigned to an occluded pixel, where dl and dr are the disparity values of the neighboring non-occluded left and right pixels. The intermediate results include the disparity map computed without cost aggregation, the disparity map obtained after cost aggregation by the Kuwahara filter only, the disparity map obtained after cost aggregation using the combination of Kuwahara and median filters (before refinement), and the final disparity map obtained after refinement.
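The min(dl, dr) filling rule described above can be sketched as follows, assuming a boolean occlusion mask and a row-wise search for the nearest non-occluded neighbours; this is an illustrative implementation, not the thesis code.

```python
import numpy as np

def fill_occlusions(disparity, occluded):
    """Fill occluded pixels with min(dl, dr) of the nearest valid neighbours.

    disparity : 2-D disparity map, occluded : boolean mask of occluded pixels.
    For each occluded pixel, dl and dr are the disparities of the nearest
    non-occluded pixels to its left and right on the same row; the smaller
    of the two (usually the background disparity) is assigned.
    """
    filled = disparity.astype(float).copy()
    h, w = disparity.shape
    for y in range(h):
        for x in np.flatnonzero(occluded[y]):
            # search left and right for the nearest non-occluded pixels
            left = next((filled[y, i] for i in range(x - 1, -1, -1)
                         if not occluded[y, i]), None)
            right = next((filled[y, i] for i in range(x + 1, w)
                          if not occluded[y, i]), None)
            candidates = [v for v in (left, right) if v is not None]
            if candidates:
                filled[y, x] = min(candidates)
    return filled
```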
Evaluation Methodology
In this figure, the white color shows the whole image, while the black color is assigned to unknown regions. In this figure, the white color indicates the discontinuous regions, the black color refers to the occluded regions, and the gray color indicates the remaining regions of the image. The percentage of bad pixels is calculated in the three critical image regions above for four standard Middlebury stereo images.
Experimental Results
Variation of the Kuwahara filter window size: the error rate for the Tsukuba, Venus, Teddy, and Cones images is shown in Figure 3.23 for different Kuwahara filter window sizes. Variation of the median filter window size: Figure 3.24 shows the error rate for different median filter window sizes. Variation of the number of principal components: Figure 3.25 shows the error rate for different numbers of principal components used for local stereo correspondence for all the stereo images.
Summary
In most of the methods, occluded pixels are detected only after the estimation of an initial disparity map. In the bimodality method, the neighborhood of an occluded pixel contains disparity values from both non-occluded and occluded areas. This change in disparity values corresponds to the occluded pixels in the second image and vice versa.
Background
Occluded regions are visible in only one image of the stereo pair and invisible in the other. This invisibility is caused by the geometry of the scene and by self-occlusion and/or mutual occlusion of the objects in the scene. In this setup, CL and CR are the camera centers, and BL1, BL2, BR1, and BR2 are the background objects present in the scene.
Proposed Method for Occlusion Detection and Filling
- Matching cost computation
- Cost aggregation
- Disparity map computation
- Proposed linear regression-based asymmetric occlusion detection (LAOD) method 90
- Disparity refinement
A Gabor wavelet-based feature is used to find the corresponding matching pixel in the target image. Finally, the disparity value of the selected pixel is assigned to the pixel targeted for filling. In Equation (4.13), ∆cpq is the color difference of pixel q from pixel p, p is the pixel under consideration, q is a non-occluded pixel in the neighborhood Np, and γc is a constant.
Experimental Results
|Np| is the number of pixels in the window Np, and ε is a user-defined smoothness parameter. In the next section, we explain how our method is also suitable for detecting occluded pixels on horizontally inclined surfaces. In areas with horizontally inclined surfaces, many pixels in the reference image correspond to a single pixel in the target image.
Summary
In this chapter, two main characteristics of Gabor features are identified, namely: (i) the real coefficients of the Gabor wavelet are sufficient to represent the image; and (ii) local Gabor wavelet features with overlapping regions represent the image more accurately than global Gabor features and local features extracted from non-overlapping regions. Experimental results show that local Gabor wavelet features extracted from overlapping regions represent the image more effectively than global and non-overlapping region-based features. The performance of these local Gabor wavelet features is compared with that of global Gabor features.
Basics of Gabor Wavelet
It is again observed that local Gabor features for overlapping regions can represent an image more accurately than the other two counterparts. The robustness of all three Gabor features is analyzed for radiometric variations in a scene, and we found that the real coefficients of local Gabor features for overlapping regions are more robust than Gabor features extracted from the imaginary part or the magnitude information. This method also performs significantly better than the local Gabor features for non-overlapping regions and the global features.
Global Gabor Wavelet Feature (GGWF) Extraction
The input image, the image represented using only the real coefficients, the image represented using only the imaginary coefficients, and the image represented using the magnitude information are shown from left to right in this figure. In this figure, the input image, the image represented by the real coefficients, the image represented using the imaginary coefficients, and the image represented using the magnitude information for m = 2 and n = 2 in Equation (5.6) are shown from left to right. The input image, the image reconstructed using only the real coefficients, the image reconstructed using only the imaginary coefficients, and the image reconstructed using the magnitude information are shown from left to right in this figure.
Local Gabor Wavelet Feature (LGWF) Extraction
The images reconstructed using the real coefficients, the imaginary coefficients, and the magnitude information via Equation (5.9) are shown in the second row of Figure 5.2. The input image is divided into sub-regions of size u1 × v1 to compute LGWFs from non-overlapping regions. These equations, with small modifications, can also be used to represent and reconstruct the original image for non-overlapping regions.
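To illustrate how local Gabor wavelet features over overlapping regions might be computed, the sketch below builds the real part of a Gabor kernel, filters the image at a few orientations, and pools the responses over overlapping windows. The kernel parameterisation, window size, stride, and pooling are illustrative and not necessarily the exact LGWF construction of the thesis; setting step equal to win yields the non-overlapping variant.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel_real(ksize=15, sigma=3.0, theta=0.0, wavelength=6.0):
    """Real part of a 2-D Gabor kernel at one orientation and scale."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def local_gabor_features(image, win=16, step=8, n_orient=4):
    """Real Gabor coefficients pooled over overlapping win x win regions.

    With step < win the regions overlap; step = win gives the
    non-overlapping variant discussed in the text.
    """
    responses = [convolve2d(image, gabor_kernel_real(theta=np.pi * k / n_orient),
                            mode="same", boundary="symm")
                 for k in range(n_orient)]
    feats = []
    h, w = image.shape
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            # one feature vector per (overlapping) region: mean response
            # of each orientation channel inside the window
            feats.append([r[y:y + win, x:x + win].mean() for r in responses])
    return np.asarray(feats)
```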
Experimental Results
- Different window sizes
- Different number of orientations
- Different number of scalings
- Synthetic illumination changes
- Real radiometric changes
- Performance evaluation of Gabor features for stereo correspondence
The performance of features extracted from overlapping regions decreases as the window size increases, while the performance of features extracted from non-overlapping regions increases with the window size. In the case of the local feature, a small number of orientations is sufficient to represent the pixel variations in the image patches. To evaluate the performance of these features under different lighting conditions, the intensity values of the pixels of the real image are varied.
Summary
A state-of-the-art review of the existing literature on disparity map estimation methods was presented. Therefore, finding the matching pixels for all pixels in the occluded areas is a difficult task. To calculate the real coefficients, only the real part of the Gabor filter needs to be stored in memory.
Possible Extensions
It has also been mentioned in the literature that the optimal performance of a two-dimensional Gabor filter can be achieved by using the real part of the filter. A plane wave with frequency (ξ0, ν0) propagates along the short axis of the elliptical Gaussian envelope. Each of these two families of Gabor wavelets can be created by rotating and dilating (via the affine group) the mother Gabor wavelet as follows:
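A commonly used parameterisation of such a rotated-and-dilated (self-similar) family, assuming a dilation factor a > 1, scale index m, and N equally spaced orientations, is sketched below; the exact form used in the thesis may differ.

```latex
% Self-similar Gabor wavelet family obtained by rotating and dilating a
% mother wavelet \psi (illustrative parameterisation):
\psi_{m,n}(x, y) = a^{-m}\, \psi\!\left(a^{-m} x'_n,\; a^{-m} y'_n\right), \qquad
\begin{aligned}
x'_n &= x\cos\theta_n + y\sin\theta_n,\\
y'_n &= -x\sin\theta_n + y\cos\theta_n,
\end{aligned}
\qquad \theta_n = \frac{n\pi}{N},\quad n = 0,\dots,N-1 .
```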
Linear Regression
- The geometry of a linear perspective camera system [1]
- Calculation of the x-coordinate of the projected point in the image plane by using
- Image formation in a stereo vision setup (Epipolar geometry)
- Stereo images rectification
- Stereo images before and after rectification. (a) Reference image before rectification;
- Elementary stereo geometry in the rectified configuration [1]
- Accurate image segmentation using three-dimensional information. (a) Left image; (b)
- Presence of occlusion is highlighted with red and yellow colour boxes in the Teddy stereo
- Photometric variations in a stereo image pair. (a) Left image; (b) Right image [5]
- Stereo images affected by noises. (a) Left image; (b) Right image [6]
- Specular surfaces in (a) Left image; (b) Right image [5]
- Specular reflections in (a) Left image; (b) Right image [5]
- Foreshortening areas for two different viewpoints [5]
- Stereo images having perspective distortions. (a) Left image; (b) Right image; (c)
- Presence of textureless regions in a stereo image pair [5]
- Presence of repetitive structures in a stereo image pair [5]
- Discontinuous regions in stereo images. (a) Left image; (b) Right image; (c) Discontin-
- General block diagram for computing a disparity map
- Pictorial illustration of census transform
- An example where uniqueness constraint fails
- Illustration of ordering constraint in two scenarios
- Cyclopean distance [8]
- General steps of stereo correspondence methods
- Matching cost computation
- Cost aggregation
- Disparity computation
- Block diagram of the proposed disparity map estimation method
- Gabor wavelet kernel (real part). (a)-(d) for scale 2; (e)-(h) for scale 5; (a) and (e) for
- Local Gabor wavelet feature extraction
- Role of real and imaginary coefficients of Gabor wavelet on disparity map. (a) Left
- Subregions of Kuwahara filtering
- Behaviour of Kuwahara filter at boundary regions
- Disparity space image filtering. (a) Disparity space image (d = 1); (b) Filtering by
- Intermediate results. (a) Disparity map computed without cost aggregation; (b) Dis-
- Middlebury stereo standard dataset. Left to right - Tsukuba, Venus, Teddy, and Cones
- Middlebury stereo dataset (2005). Left to right - Cloth1, Books, Dolls, Laundry,
- Middlebury stereo dataset showing non-occluded, all, and discontinuous regions
- Experimental results on 2005 Middlebury datasets - Cloth1, Books, Dolls, Laundry,
- Variations of local stereo window size. (a) Tsukuba, (b) Venus, (c) Teddy, and (d) Cones
- Variations of Median filter window size. (a) Tsukuba, (b) Venus, (c) Teddy, and (d)
- Variations of number of principal components. (a) Tsukuba, (b) Venus, (c) Teddy, and
- Variations of number of Gabor wavelet filter orientations. (a) Tsukuba, (b) Venus, (c)
- Variations of number of Gabor wavelet filter scaling. (a) Tsukuba, (b) Venus, (c) Teddy,
- Average percentage of bad pixels. (a) Variation of local stereo window size, (b) Variation
- General stereo vision set-up [14]
- Left disparity map showing the ground truth occluded pixels. Border occlusion (blue
- Different types of occlusion [14]
- Stereo vision setup for different types of occlusions [14]
- Block diagram of the proposed occlusion detection method
- Example showing the case where a pixel does not satisfy the continuity, ordering, and uniqueness
- Detected occluded pixels (shown by black colour) by proposed LAOD and LRC methods
- Block diagram of the proposed occlusion filling method
Zhang et al., "Shape from shading: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence.
Lempitsky et al., "Fusion moves for Markov random field optimization," IEEE Transactions on Pattern Analysis and Machine Intelligence.
Yang et al., "Stereo matching with color-weighted correlation, hierarchical belief propagation, and occlusion handling," IEEE Transactions on Pattern Analysis and Machine Intelligence.