We conclude by identifying some avenues for future exploration:
In Chapters 3 and 4 we discussed automatic image stitching from multiple views. Illumination, sensor characteristics and optics all affect the keypoints that can be found. These problems can be placed within the general framework of computational photography, which refers to computational image capture, processing and manipulation techniques that enhance or extend the capabilities of digital photography. Typically this means merging multiple pictures of the same subject taken with different exposure parameters, so that each shot contributes its well-illuminated and well-focused regions to the result. In the future there may be no such thing as a “poor” photograph, because the shooting conditions will be completely reconfigurable after the event has been recorded.
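As a minimal illustration of such merging (a sketch only: im1 and im2 are hypothetical registered shots of the same scene at different exposures, stored as doubles in [0,1]), each pixel can be weighted by how well exposed it is, i.e. how close it lies to mid-grey:

sigma = 0.2;                                      % width of the well-exposedness weighting
w1 = exp(-(im1 - 0.5).^2 / (2*sigma^2));          % per-pixel weights for each exposure
w2 = exp(-(im2 - 0.5).^2 / (2*sigma^2));
fused = (w1.*im1 + w2.*im2) ./ (w1 + w2 + eps);   % weighted blend; eps avoids 0/0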
Currently, the images need to be sorted manually before stitching. With the advancement of digital photography, our ability to understand images has not kept pace with our ability to generate them. In the future, algorithms for searching and sorting image databases will become as fundamental and widely available as those for searching text on the World Wide Web.
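One way such search could build directly on the features used in this work is to rank database images by how many SIFT descriptors they share with a query image. A sketch, assuming hypothetical n1-by-128 and n2-by-128 descriptor matrices d1 and d2 (one descriptor per row):

% Count descriptor matches between a query and a database image
% using Lowe's distance-ratio test.
matches = 0;
for i = 1:size(d1,1)
    d  = sqrt(sum((d2 - repmat(d1(i,:), size(d2,1), 1)).^2, 2));
    ds = sort(d);                  % distances to all database descriptors
    if ds(1) < 0.8*ds(2)           % nearest neighbour clearly the best
        matches = matches + 1;
    end
end

Database images scoring many matches are likely to depict the same scene as the query.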
In this project, to further increase the clarity of the image background, a background-learning process is needed to differentiate foreground objects from the background. Such a technique would also resolve the occlusion of objects while they are moving. Progress in this area will require flexible correspondence models with similarity metrics linked to human perception.
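A minimal sketch of such a background-learning step, assuming a fixed camera and a hypothetical grayscale frame stack frames of size rows-by-cols-by-N, is a running-average model whose difference against each new frame yields a foreground mask:

alpha  = 0.01;                   % background learning rate
thresh = 0.1;                    % foreground threshold (intensities in [0,1])
bg = im2double(frames(:,:,1));   % initialise the model from the first frame
for k = 2:size(frames,3)
    f  = im2double(frames(:,:,k));
    fg = abs(f - bg) > thresh;   % foreground mask for frame k
    bg = (1-alpha)*bg + alpha*f; % slowly absorb the scene into the model
end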
REFERENCES
Aghajan, H., and Cavallaro, A., 2009. Multi-Camera Networks: Principles and Applications. Elsevier Inc.
Baker, P., and Aloimonos, Y., 2000. Complete Calibration of a Multi-camera Network. DOI 10.1109/OMNVIS.2000.853820.
Bay, H., Tuytelaars, T., and Van Gool, L., 2006. SURF: Speeded Up Robust Features. European Conference on Computer Vision 2006, Volume 3951, pp. 404-417.
Brown, M., and Lowe, D.G., 2002. Invariant features from interest point groups. British Machine Vision Conference, Cardiff, pp. 656-665.
Canny, J., 1986. A Computational Approach to Edge Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6), pp. 679-697.
Chen, I.H., and Wang, S.J., 2007. An Efficient Approach for the Calibration of Multiple PTZ Cameras. Automation Science and Engineering, 4, pp. 286-293.
Del Bimbo, A., Dini, F., Lisanti, G., and Pernici, F., 2010. Exploiting distinctive visual landmark maps in pan-tilt-zoom camera networks. Computer Vision and Image Understanding, 114, pp. 611-623.
Fischler, M., and Bolles, R., 1981. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24, pp. 381-395.
Fisher, R., Perkins, S., Walker, A., and Wolfart, E., 2000. Affine Transformation [Online]. Available at: http://homepages.inf.ed.ac.uk/rbf/HIPR2/affine.htm [Accessed: 25 January 2011].
Forsyth, D., and Ponce, J., 2003. Computer Vision: A Modern Approach, Prentice Hall.
Gower, J.C., 1982. Euclidean Distance Geometry. Math. Scientist, 7, pp. 1-14.
Harris, C., and Stephens, M.J., 1988. A Combined Corner and Edge Detector. Alvey Vision Conference, pp. 147-152.
Henkel, R.D., 1997. Fast Stereovision by Coherence Detection. Computer Analysis of Images and Patterns, Volume 1296, pp. 297-304.
Hough, P.V.C., 1962. Method and means for recognizing complex patterns. U.S. Patent 3069654.
Ke, Y., and Sukthankar, R., 2004. PCA-SIFT: A More Distinctive Representation for Local Image Descriptors. Proc. Conf. Computer Vision and Pattern Recognition, pp. 511-517.
Krumm, J., Harris, S., Meyers, B., Brumitt, B., Hale, M., and Shafer, S., 2000. Multi-Camera Multi-Person Tracking for EasyLiving. Third IEEE International Workshop on Visual Surveillance, Microsoft Research Vision Technology Group.
Lee, C.W., Sebastian, P., and Yap, V.V., 2009. Stereo Vision Tracking System. Future Computer and Communication, pp. 487-491.
Lindeberg, T., 1994. Scale-space theory: A basic tool for analyzing structures at different scales. Journal of Applied Statistics, 21(2), pp. 224-270.
Liu, J., and Hubbold, R., 2006. Automatic Camera Calibration and Scene Reconstruction with Scale-Invariant Features. International Symposium on Visual Computing, pp. 558-568.
Liu, R., Zhang, H., Liu, M., Xia, X., and Hu, T., 2009. Stereo Cameras Self-Calibration Based on SIFT. Measuring Technology and Mechatronics Automation, pp. 352-355.
Low, Y.Q., Lee, S.W., Goi, B.M., and Ng, M.S., 2011. A New SIFT-based Camera Calibration Method for Hybrid Dual-Camera. ICIEIS 2011, pp. 96-103.
Lowe, D.G., 1999. Object Recognition from Local Scale-Invariant Features. Proc. of the International Conference on Computer Vision, pp. 1150-1157.
Lowe, D.G., 2001. Local feature view clustering for 3D object recognition. IEEE Conference on Computer Vision and Pattern Recognition, pp. 682-688.
Lowe, D.G., 2004. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2), pp. 91-110.
Matlab Central [Online]. Available at: http://www.mathworks.com/matlabcentral/fileexchange/28760-2d-2d-projective-homography-3x3-estimation/content/homography.m [Accessed: 25 April 2011].
Memon, Q., and Khan, S., 2011. Camera calibration and three-dimensional world reconstruction of stereo-vision using neural networks. International Journal of Systems Science, pp. 1155-1159.
Protocol Manual, “D” Protocol [Online]. Available at: http://www.optovision.fr/telechargement/Pelco-D_protocol.pdf [Accessed: 01 April 2011].
Gonzalez, R.C., and Woods, R.E., 2007. Digital Image Processing, Third Edition. Pearson Prentice Hall.
Senior, A.W., Hampapur, A., and Lu, M., 2005. Acquiring Multi-scale Images by Pan-Tilt-Zoom Controls and Automatic Multi-Camera Calibration. Applications of Computer Vision, Volume 1, pp. 433-438.
Shotton, J., Blake, A., and Cipolla, R., 2005. Contour-Based Learning for Object Detection. International Conference on Computer Vision, pp. 503-510.
University of Victoria, Homogeneous Coordinates [Online]. Available at: http://www.ece.uvic.ca/~bctill/20006/additional/homcoord/homog-coords.pdf [Accessed: 01 April 2012].
University of Nevada, Reno, The Geometry of Perspective Projection [Online]. Available at: http://www.cse.unr.edu/~bebis/CS791E/Notes/PerspectiveProjection.pdf [Accessed: 01 March 2012].
University of Oxford, Introduction – a Tour of Multiple View Geometry [Online]. Available at: http://www.robots.ox.ac.uk/~vgg/hzbook/hzbook2/HZintroduction.pdf [Accessed: 06 January 2012].
Wan, D., and Zhou, J., 2008. Stereo vision using two PTZ cameras. Computer Vision and Image Understanding, 112, pp. 184-194.
Xing, Y.J., Xing, J., Sun, J., and Hu, L., 2007. An Improved Neural Network for Stereo-camera Calibration. Journal of Achievements in Materials and Manufacturing Engineering, Volume 20, Issues 1-2.
Yokoyama, M., and Poggio, T., 2005. A Contour-Based Moving Object Detection and Tracking. International Conference on Computer Vision, pp. 271-276.
Zhou, X., Collins, R., Kanade, T., and Metes, P., 2003. A master-slave system to acquire biometric imagery of humans at a distance. ACM International Workshop on Video Surveillance, pp. 113-120.
Appendix A
IMTRANS - Homogeneous transformation of an image
function newim = imTrans(im, T, region, sze)

im = double(im)/255;          % Make sure image is double
threeD = (ndims(im)==3);      % A colour image

% Fill in default parameters here, so that transformImage can always
% be called with four arguments.
[rows, cols] = size(im(:,:,1));
if nargin == 2
    region = [0 rows 0 cols];
    sze = max(rows,cols);
elseif nargin == 3
    sze = max(rows,cols);
elseif nargin ~= 4
    error('Incorrect arguments to imtrans');
end

if threeD    % Transform red, green, blue components separately
    r = transformImage(im(:,:,1), T, region, sze);
    g = transformImage(im(:,:,2), T, region, sze);
    b = transformImage(im(:,:,3), T, region, sze);

    % Fix to correct the image - for some reason
    % it comes out mirrored left-to-right
    r = fliplr(r);
    g = fliplr(g);
    b = fliplr(b);

    newim = repmat(uint8(0),[size(r),3]);
    newim(:,:,1) = uint8(round(r*255));
    newim(:,:,2) = uint8(round(g*255));
    newim(:,:,3) = uint8(round(b*255));
else         % Assume the image is greyscale
    newim = transformImage(im, T, region, sze);
    % Applying fix
    newim = fliplr(newim);
end

%---
% The internal function that does all the work

function newim = transformImage(im, T, region, sze)

[rows, cols] = size(im);

% Find where corners go - this sets the bounds on the final image
B = bounds(T,region);
nrows = B(2) - B(1);
ncols = B(4) - B(3);

% Determine any rescaling needed
s = sze/max(nrows,ncols);

S = [s 0 0        % Scaling matrix
     0 s 0
     0 0 1];

T = S*T;
Tinv = inv(T);

% Recalculate the bounds of the new (scaled) image to be generated
B = bounds(T,region);
nrows = B(2) - B(1);
ncols = B(4) - B(3);

[x,y] = meshgrid(1:ncols,1:nrows);   % All possible xy coords in the image

% Transform these xy coords back into the source image
sxy = Trans(Tinv, [x(:)'+B(3) ; y(:)'+B(1) ; ones(1,ncols*nrows)]);
xi = reshape(sxy(1,:),nrows,ncols);
yi = reshape(sxy(2,:),nrows,ncols);

[x,y] = meshgrid(1:cols,1:rows);
newim = interp2(x,y,double(im),xi,yi);   % Interpolate values from source image

%---
% Internal function to find where the corners of a region R,
% defined by [minrow maxrow mincol maxcol], are transformed to
% by transform T. Returns the bounds B in the form
% [minrow maxrow mincol maxcol].

function B = bounds(T, R)

P = [R(3) R(4) R(4) R(3)     % homogeneous coords of region corners
     R(1) R(1) R(2) R(2)
      1    1    1    1  ];

PT = round(Trans(T,P));

B = [min(PT(2,:)) max(PT(2,:)) min(PT(1,:)) max(PT(1,:))];
%    minrow       maxrow       mincol       maxcol

%---
% Helper applying a homogeneous transform to a set of homogeneous
% points and normalising the third coordinate to 1. The original
% listing assumes this function is defined elsewhere; it is supplied
% here so that the appendix is self-contained.

function PT = Trans(T, P)

PT = T*P;
PT = PT ./ repmat(PT(3,:), 3, 1);
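As a usage sketch (not part of the original listing), imTrans accepts any 3x3 homogeneous transform; for example, rotating an image by 30 degrees:

% Hypothetical example call: 'cameraman.tif' is a standard MATLAB demo
% image; any image will do.
im = imread('cameraman.tif');
a  = 30*pi/180;                  % rotation angle in radians
T  = [cos(a) -sin(a) 0
      sin(a)  cos(a) 0
      0       0      1];
newim = imTrans(im, T);          % region and sze take their defaults
newim(isnan(newim)) = 0;         % blank pixels that fall outside the source
imshow(newim);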
Appendix B
Sample commands of the PTZ camera (Pelco-D protocol)
Command Word 3 Word 4 Word 5 Word 6
Set Preset 00 03 00 01 to 20
Clear Preset 00 05 00 01 to 20
Go To Preset 00 07 00 01 to 20
Flip (180° about) 00 07 00 21
Go To Zero Pan 00 07 00 22
Set Auxiliary 00 09 00 01 to 08
Clear Auxiliary 00 0B 00 01 to 08
Remote Reset 00 0F 00 00
Set Zone Start 00 11 00 01 to 08
Set Zone End 00 13 00 01 to 08
Write Char. To Screen 00 15 X Position 00 to 28 ASCII Value
Clear Screen 00 17 00 00
Alarm Acknowledge 00 19 00 Alarm No.
Zone Scan On 00 1B 00 00
Zone Scan Off 00 1D 00 00
Set Pattern Start 00 1F 00 00
Set Pattern Stop 00 21 00 00
Run Pattern 00 23 00 00
Set Zoom Speed 00 25 00 00 to 03
Set Focus Speed 00 27 00 00 to 03
Reset Camera to defaults 00 29 00 00
Auto-focus auto/on/off 00 2B 00 00-02
Auto Iris auto/on/off 00 2D 00 00-02
AGC auto/on/off 00 2F 00 00-02
Backlight compensation on/off 00 31 00 01-02
Auto white balance on/off 00 33 00 01-02
Enable device phase delay mode 00 35 00 00
Set shutter speed 00 37 Any Any
Adjust line lock phase delay 00-01 39 Any Any
Adjust white balance (R-B) 00-01 3B Any Any
Adjust white balance (M-G) 00-01 3D Any Any
Adjust gain 00-01 3F Any Any
Adjust auto-iris level 00-01 41 Any Any
Adjust auto-iris peak value 00-01 43 Any Any
Query1 00 45 Any Any
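Each command above travels in a seven-byte Pelco-D frame: a 0xFF sync byte, the camera address, the four words listed in the table, and a checksum equal to the modulo-256 sum of bytes 2 to 6. A minimal MATLAB sketch composing the Go To Preset command (the serial-port settings are illustrative assumptions, not taken from the manual):

% Build a 7-byte Pelco-D frame for "Go To Preset" (words 00 07 00 NN).
addr   = 1;                                    % camera address
preset = 5;                                    % preset number (01 to 20)
body   = [addr 0 7 0 preset];                  % address followed by words 3..6
frame  = uint8([255 body mod(sum(body),256)]); % 0xFF sync + body + checksum
% The frame can then be written to the camera's RS-485 link, e.g.:
%   sp = serialport('COM1', 2400);             % 2400 baud is the Pelco-D default
%   write(sp, frame, 'uint8');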
Appendix C
Sample dataset at different resolutions.
[Figure: three panels, (a), (b) and (c); images not reproduced here.]
Appendix D
SIFT algorithm results for the laboratory dataset. Each table lists, for one test image, the time spent in each stage of the algorithm and the number of keypoints found, across ten image resolutions. All times are in seconds; stage columns are preprocessing (Preproc.), DoG pyramid construction (DoG), locating keypoints (Locate), gradient computation (Grad.), orientation assignment (Orient.), feature descriptor computation (Descr.), and total processing time (Total). A dash indicates no measurable time was recorded.
Image 1
Resolution  Preproc.  DoG    Locate  Grad.  Orient.  Descr.  Total    Keypoints
71x58       0.51      0.03   0.38    0.02   0.45     2.98    4.37     47
141x116     0.05      0.05   1.52    0.02   0.16     6.10    7.90     113
212x173     0.06      0.07   3.56    0.04   0.30     12.12   16.15    224
282x231     0.08      0.10   6.50    0.06   0.51     21.46   28.71    396
352x288     0.11      0.14   10.84   0.10   0.77     34.05   46.01    625
423x346     0.14      0.18   15.82   0.13   1.09     44.58   61.94    809
493x404     0.17      0.24   23.38   0.18   1.34     60.13   85.44    1064
564x461     0.22      0.30   29.76   0.23   1.52     68.44   100.47   1229
634x519     0.27      0.37   38.19   0.29   1.81     79.85   120.78   1446
704x576     0.32      0.45   48.83   0.35   1.85     77.17   128.97   1383

Image 2
Resolution  Preproc.  DoG    Locate  Grad.  Orient.  Descr.  Total    Keypoints
71x58       0.01      0.03   0.34    -      0.06     2.12    2.56     39
141x116     0.01      0.03   1.34    0.01   0.11     4.54    6.04     84
212x173     0.03      0.06   3.19    0.03   0.20     8.08    11.59    150
282x231     0.05      0.09   5.85    0.06   0.33     14.74   21.12    267
352x288     0.08      0.15   9.77    0.09   0.47     20.43   30.99    377
423x346     0.11      0.17   14.52   0.12   0.60     27.18   42.70    493
493x404     0.15      0.22   20.71   0.18   0.78     35.01   57.05    618
564x461     0.19      0.29   27.41   0.22   0.99     46.20   75.30    832
634x519     0.24      0.35   35.58   0.28   1.20     52.61   90.26    960
704x576     0.29      0.44   45.72   0.34   1.17     53.24   101.20   957

Image 3
Resolution  Preproc.  DoG    Locate  Grad.  Orient.  Descr.  Total    Keypoints
71x58       -         0.20   0.30    -      0.03     1.20    1.73     22
141x116     0.01      0.03   1.24    0.02   0.08     3.19    4.57     59
212x173     0.03      0.06   3.03    0.03   0.15     5.56    8.86     103
282x231     0.05      0.09   5.60    0.06   0.20     8.52    14.52    157
352x288     0.08      0.14   9.68    0.09   0.31     14.64   24.94    270
423x346     0.11      0.17   14.25   0.13   0.43     18.43   33.52    337
493x404     0.14      0.23   20.36   0.17   0.60     26.72   48.22    482
564x461     0.18      0.29   26.92   0.22   0.77     35.80   64.18    644
634x519     0.23      0.36   35.30   0.28   0.88     41.20   78.25    749
704x576     0.29      0.43   45.03   0.34   0.90     42.67   89.66    769

Image 4
Resolution  Preproc.  DoG    Locate  Grad.  Orient.  Descr.  Total    Keypoints
71x58       -         0.02   0.42    -      0.07     2.55    3.06     47
141x116     0.01      0.03   1.73    0.02   0.22     9.46    11.47    176
212x173     0.03      0.06   4.03    0.03   0.41     18.39   22.95    335
282x231     0.05      0.09   7.18    0.06   0.69     30.74   38.81    566
352x288     0.08      0.13   11.58   0.09   1.04     48.35   61.27    878
423x346     0.11      0.17   17.29   0.13   1.36     59.59   78.65    1090
493x404     0.15      0.23   23.94   0.19   1.68     74.21   100.40   1338
564x461     0.18      0.29   30.98   0.22   2.01     92.18   125.86   1655
634x519     0.24      0.36   39.59   0.28   2.28     102.61  145.36   1857
704x576     0.29      0.43   50.43   0.36   2.34     103.72  157.57   1855

Image 5
Resolution  Preproc.  DoG    Locate  Grad.  Orient.  Descr.  Total    Keypoints
71x58       -         0.02   0.38    -      0.05     2.36    2.81     44
141x116     0.01      0.03   1.58    0.02   0.15     6.47    8.26     120
212x173     0.03      0.06   3.64    0.03   0.30     11.79   15.85    219
282x231     0.05      0.09   6.76    0.06   0.44     20.58   27.98    381
352x288     0.08      0.13   10.66   0.09   0.61     27.52   39.09    506
423x346     0.11      0.18   16.15   0.13   0.85     38.98   56.40    707
493x404     0.15      0.24   21.85   0.17   1.07     47.36   70.84    853
564x461     0.18      0.29   29.09   0.23   1.28     59.79   90.86    1079
634x519     0.23      0.35   37.33   0.28   1.51     72.32   112.02   1313
704x576     0.29      0.44   48.11   0.35   1.48     67.66   118.33   1216

Image 6
Resolution  Preproc.  DoG    Locate  Grad.  Orient.  Descr.  Total    Keypoints
71x58       -         0.02   0.35    -      0.03     1.41    1.81     26
141x116     0.02      0.03   1.48    0.02   0.12     5.12    6.79     95
212x173     0.03      0.06   3.49    0.03   0.23     9.75    13.59    181
282x231     0.05      0.09   6.31    0.06   0.32     14.75   21.58    273
352x288     0.08      0.13   10.28   0.09   0.50     22.47   33.55    408
423x346     0.11      0.17   15.65   0.13   0.70     29.52   46.28    535
493x404     0.15      0.23   21.45   0.17   0.88     40.12   63.00    722
564x461     0.18      0.29   28.51   0.22   1.00     46.07   76.27    834
634x519     0.23      0.35   37.37   0.28   1.19     54.35   93.77    990
704x576     0.29      0.44   47.93   0.35   1.22     55.34   105.57   984