Multiple Moving Targets Detection in a Mobile Video Camera Image Sequence Using Fuzzy-Edge Based Feature Matching

Alireza Behrad, Seyed Ahmad Motamedi

Electrical Engineering Department, Amirkabir Univ. of Technology, 424 Hafez Ave., 15914, Tehran, Iran

Email: [email protected], [email protected]

ABSTRACT: In this paper we present a new algorithm for fast detection of multiple moving targets in a mobile video camera image sequence using fuzzy-edge based feature matching. To detect moving targets, the background motion is estimated and compensated using an affine transformation. The resultant motion-rectified image is employed for detection of targets using a split and merge algorithm. We also check additional features for precise detection of the targets. To calculate the background motion model, a few features are found in the edge-detected image of the current frame, and these features are matched with points in the edge-detected image of the next frame using a new method called fuzzy-edge based feature matching. An affine model is then calculated using the LMedS algorithm.

Keywords: Moving Target Detection, Fuzzy Edge, Edge Matching, Affine Transformation

1 INTRODUCTION

Visual detection of moving targets is one of the most challenging issues in computer vision. Its applications are numerous and span a wide range, including surveillance systems, vehicle tracking and aerospace applications, to name a few. Detection of abstract targets (e.g. vehicles in general) is a very complex problem and demands sophisticated solutions using conventional pattern recognition and motion estimation methods. Motion-based segmentation is one of the most powerful tools for detection of moving targets. When the camera is static, it is simple to detect moving objects in the image sequence by differencing two consecutive frames [1], [2], but conventional difference-based methods fail to detect moving targets when the camera is also moving. In the case of a mobile camera, all of the objects in the scene have an apparent motion, which is related to the camera motion. A number of methods have been proposed for detection of moving targets with a mobile camera, including direct camera motion estimation [3], optical flow [4], and geometric transformation [5], [6]. In [3], it is assumed that camera motion is due only to pan and tilt, so these parameters are estimated and compensated using an active camera. Direct measurement of camera motion is a proper method to estimate the apparent motion of the background, but it is not applicable when the camera motion involves motions other than pan and tilt. Optical flow methods are another well-known family of methods for detection of moving objects, but they are not in wide use because of their high computational cost and low accuracy in the presence of noise and illumination change. Geometric transformation methods have low computational cost and are suitable for real-time purposes. In these methods, a uniform background motion is assumed, which can be described by a motion model. When the apparent motion of the background is estimated, it can be exploited to locate moving objects. Another difficulty with visual detection algorithms is illumination change of the scene. Hager and Belhumeur [7] have modeled illumination changes in the SSD formulation by using a low-dimensional linear subspace. The main disadvantage of this algorithm is the need for several images of the same target under different illumination conditions. In [8], spatial illumination variation is modeled by a low-order polynomial, but the algorithm is not fast enough for real-time purposes.

In this paper we propose a new method for detection of moving targets using a mobile monocular camera. In addition to camera motion, we also consider the detection of multiple targets and illumination change of the scene. This paper is organized as follows. In section 2, the procedure for calculation of the background motion model is discussed. Section 3 describes detection of moving targets using motion-rectified images. The illumination change removing algorithm appears in section 4. Experimental results are shown in section 5 and the conclusion appears in section 6.


2 BACKGROUND MOTION ESTIMATION

When the camera is mobile, all the objects in the scene have an apparent motion related to the camera motion, so to detect true moving targets, the camera motion should be estimated. Once the background motion is calculated, it is used to generate motion-compensated images, which can be used for detection. To estimate the camera motion we assume that the moving targets constitute a small region (less than 50%) of the scene; therefore, a motion model that describes the dominant motion of the scene is the motion model of the background and can be calculated using the LMedS statistic. Our algorithm to compute the camera or apparent background motion model contains three stages, which are described in the following subsections.

2.1 Feature selection

The first stage in calculating the camera motion model is to calculate motion vectors for different points of the image. To reduce the computational overhead of the algorithm as well as to increase the accuracy of the motion vectors, motion vectors are usually calculated for strong features of the image. Features such as lines, corners and edges, or operators such as the Moravec operator, may be used for this purpose. Edges are good features, which are robust against noise and smooth variation of illumination. To find precise matches, edges are usually expressed as lines or curves, which is a time-consuming process. To reduce the computational overhead of the algorithm we used a new method, which extracts point features from the edge-detected image of the current frame. To find precise matches, only points that carry enough information for the matching process should be selected as features. So we calculate the number of edge points and the edge spatial variances in the x and y directions in windows centered on edge points. We select points for which the edge count lies between low and high thresholds. We also reject points whose calculated variances are lower than a threshold. We then assign grades to the remaining points based on the following equation:

$$\mathrm{grade} = k_x v_x + k_y v_y + k_c\,(\mathrm{count} - L)(H - \mathrm{count}) \qquad (1)$$

where $v_x$ and $v_y$ are the variances in the x and y directions, count is the number of edge points in the window, L and H are the low and high thresholds for the number of edge points, which are initialized based on the window size and adaptively adjusted if enough features are not found for a typical image sequence, and $k_x$, $k_y$, $k_c$ are fixed parameters. When the grades are calculated, the remaining features are scanned to remove adjacent features, keeping those with higher grades.
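To make the selection step concrete, the following minimal C++ sketch scores one candidate window according to Eq. (1). The window size, the thresholds L and H, the variance threshold and the weights $k_x$, $k_y$, $k_c$ are all assumed values chosen for illustration; the paper does not publish its settings.

```cpp
#include <cstdio>
#include <vector>

// Grade one window of a binary edge map (1 = edge pixel) centered on (cx, cy)
// using Eq. (1). Returns false if the window fails the count or variance tests.
bool gradePoint(const std::vector<std::vector<int>>& edge,
                int cx, int cy, int half,        // window half-size
                int L, int H,                    // low/high edge-count thresholds
                double minVar,                   // variance rejection threshold
                double kx, double ky, double kc, // fixed weights of Eq. (1)
                double& grade) {
    int count = 0;
    double sx = 0, sy = 0, sxx = 0, syy = 0;
    for (int y = cy - half; y <= cy + half; ++y)
        for (int x = cx - half; x <= cx + half; ++x)
            if (edge[y][x]) { ++count; sx += x; sy += y; sxx += x * x; syy += y * y; }
    if (count < L || count > H) return false;    // too few or too many edge points
    double vx = sxx / count - (sx / count) * (sx / count); // spatial variance in x
    double vy = syy / count - (sy / count) * (sy / count); // spatial variance in y
    if (vx < minVar || vy < minVar) return false;          // edge structure too flat
    grade = kx * vx + ky * vy + kc * (count - L) * (H - count); // Eq. (1)
    return true;
}

int main() {
    // Synthetic 64x64 edge map containing a diagonal edge segment.
    std::vector<std::vector<int>> edge(64, std::vector<int>(64, 0));
    for (int i = 20; i < 44; ++i) edge[i][i] = 1;
    double g;
    if (gradePoint(edge, 32, 32, 7, 5, 100, 1.0, 1.0, 1.0, 0.01, g))
        std::printf("grade = %f\n", g);
    return 0;
}
```

Note that the term $(\mathrm{count} - L)(H - \mathrm{count})$ peaks midway between the two thresholds, so windows that are neither nearly empty nor saturated with edges are preferred.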

2.2 Feature matching

When the feature points are detected, these features are matched with points in the region of interest in the next frame. In addition to noise and illumination change of the scene, changes in the geometry and size of objects in the image, especially when the camera is zooming, may deteriorate the efficiency of the matching process. So direct matching of edges is not a proper method to find corresponding points in the presence of size and geometry variations. To alleviate this problem, we used a new edge matching method called fuzzy-edge based feature matching. We describe edges as fuzzy values, which are called fuzzy edges. To describe an edge point as fuzzy values, we assign the value '1' to the edge point and a value between 0 and 1 to points in the neighborhood of the edge point. This value is determined by the distance from the edge point and the fuzzy membership function used for this purpose. Different membership functions [9], such as triangular and trapezoidal, can be used for construction of fuzzy edges. Figure 1 shows an edge-detected image and fuzzy edges for a rectangular object.
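A fuzzy-edge map along these lines can be built by stamping the membership function around every edge pixel and keeping the maximum value per pixel. This is a minimal sketch assuming a triangular membership over Euclidean distance; the paper does not specify the exact distance measure.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Build a fuzzy-edge map from a binary edge map. Every pixel receives the
// largest triangular membership value contributed by any nearby edge pixel:
// 1 at the edge pixel itself, falling linearly to 0 at a distance of base/2.
std::vector<std::vector<double>>
fuzzyEdges(const std::vector<std::vector<int>>& edge, double base) {
    int h = (int)edge.size(), w = (int)edge[0].size();
    int r = (int)std::floor(base / 2.0);               // stamping radius
    std::vector<std::vector<double>> fuzzy(h, std::vector<double>(w, 0.0));
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            if (!edge[y][x]) continue;                 // only edge pixels contribute
            for (int dy = -r; dy <= r; ++dy)
                for (int dx = -r; dx <= r; ++dx) {
                    int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;
                    double d = std::sqrt((double)(dx * dx + dy * dy));
                    double m = std::max(0.0, 1.0 - d / (base / 2.0));
                    fuzzy[yy][xx] = std::max(fuzzy[yy][xx], m);
                }
        }
    return fuzzy;
}
```

Because neighboring pixels carry graded values instead of hard 0/1 edges, a small shift or scale change between frames lowers the correlation score gradually rather than destroying the match.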

When the fuzzy edges are obtained for both the current and the next frame, the normalized correlation method is used to locate the corresponding points of the features in the next frame. The normalized correlation equation is given by:

$$r = \frac{\sum_{(x,y) \in S} \left[f_1(x,y) - \bar{f}_1\right]\left[f_2(x,y) - \bar{f}_2\right]}{\left\{\sum_{(x,y) \in S} \left[f_1(x,y) - \bar{f}_1\right]^2 \,\sum_{(x,y) \in S} \left[f_2(x,y) - \bar{f}_2\right]^2\right\}^{1/2}} \qquad (2)$$

Here $\bar{f}_1$ and $\bar{f}_2$ are the average values of the fuzzy edges in the two regions being compared, and the summations are carried out over all fuzzy values within small windows $S$ centered on the features.
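A direct implementation of Eq. (2) for two fuzzy-edge windows of equal size might look as follows (a sketch; window extraction and the search over the region of interest are omitted):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normalized correlation of Eq. (2) between two equal-sized windows of
// fuzzy-edge values; returns a score in [-1, 1], with 1 for a perfect match.
double normalizedCorrelation(const std::vector<double>& f1,
                             const std::vector<double>& f2) {
    double m1 = 0, m2 = 0;
    for (std::size_t i = 0; i < f1.size(); ++i) { m1 += f1[i]; m2 += f2[i]; }
    m1 /= f1.size(); m2 /= f2.size();                  // window averages
    double num = 0, d1 = 0, d2 = 0;
    for (std::size_t i = 0; i < f1.size(); ++i) {
        num += (f1[i] - m1) * (f2[i] - m2);            // numerator of Eq. (2)
        d1  += (f1[i] - m1) * (f1[i] - m1);
        d2  += (f2[i] - m2) * (f2[i] - m2);
    }
    if (d1 == 0.0 || d2 == 0.0) return 0.0;            // guard: flat window
    return num / std::sqrt(d1 * d2);
}
```

To match a feature, this score would be evaluated at every candidate position in the region of interest of the next frame, keeping the position with the maximum r.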


Figure 1: (a) An edge-detected image; (b) fuzzy edges using a triangular membership function with a base of 7

2.3 Motion model calculation

When the feature points and their matches are found, a motion model describing the motion of the background or camera is calculated. Several methods have been proposed to describe the motion of an object between two frames. Among these, three transformations (affine, projective and polynomial) are the most commonly used in target detection and image registration [7]. The affine transformation accounts for rotation, translation, scaling and skew between two images as follows:

$$\begin{pmatrix} X_i \\ Y_i \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \begin{pmatrix} x_i \\ y_i \end{pmatrix} + \begin{pmatrix} a_5 \\ a_6 \end{pmatrix} \qquad (3)$$

where $(x_i, y_i)$ are locations of points in the current frame, $(X_i, Y_i)$ are locations of points in the next frame and $a_1$ to $a_6$ are the motion parameters. Although the projective and polynomial transformations account for more distortions, the affine transformation is easy to use and suitable for real-time applications. Furthermore, in this work the motion model is computed for the background in two consecutive frames, so the various distortions are expected to be small. Therefore the affine transformation is selected. This transformation has six parameters; therefore, three matching pairs are required to fully recover the motion. However, it is necessary to select the three points from the background to ensure an accurate model of the camera motion. If the moving targets constitute a small area (i.e. less than 50%) of the image, then the LMedS algorithm can be applied to determine the affine transformation parameters of the apparent background motion between two consecutive frames according to the following procedure:

1. Find N random feature points and their correspondences using the algorithms described in sections 2.1 and 2.2.

2. Select M random sets of three feature points $(x_i, y_i, X_i, Y_i)$, $i = 1, 2, 3$, from the N feature points obtained in step 1; $(x_i, y_i)$ are the coordinates of the feature points in the current frame, and $(X_i, Y_i)$ are their correspondences in the next frame.

3. For each set calculate the affine transformation parameters.

4. Transform the N feature points from step 1 using the M affine transformations obtained in step 3, and calculate the M medians of the squared differences between corresponding points and transformed points. Then select the affine parameters for which the median of squared differences is minimum (a sketch of this procedure appears below).
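The sketch below combines steps 2 to 4 under stated assumptions: the N matched point pairs are given, triples are drawn with plain std::rand, and the affine parameters of Eq. (3) are recovered from each triple by Cramer's rule on two 3x3 linear systems. Names and the sampling scheme are illustrative, not the paper's code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Pt { double x, y; };
struct Affine { double a1, a2, a3, a4, a5, a6; };   // parameters of Eq. (3)

// Recover the six affine parameters from three correspondences by solving
// [x y 1][a1 a2 a5]^T = X and [x y 1][a3 a4 a6]^T = Y with Cramer's rule.
static bool solveAffine(const Pt p[3], const Pt q[3], Affine& A) {
    double det = p[0].x * (p[1].y - p[2].y) - p[0].y * (p[1].x - p[2].x)
               + (p[1].x * p[2].y - p[2].x * p[1].y);
    if (std::fabs(det) < 1e-9) return false;        // degenerate (collinear) triple
    auto solve = [&](double r0, double r1, double r2,
                     double& u, double& v, double& w) {
        u = (r0 * (p[1].y - p[2].y) - p[0].y * (r1 - r2)
           + (r1 * p[2].y - r2 * p[1].y)) / det;
        v = (p[0].x * (r1 - r2) - r0 * (p[1].x - p[2].x)
           + (p[1].x * r2 - p[2].x * r1)) / det;
        w = (p[0].x * (p[1].y * r2 - p[2].y * r1)
           - p[0].y * (p[1].x * r2 - p[2].x * r1)
           + r0 * (p[1].x * p[2].y - p[2].x * p[1].y)) / det;
    };
    solve(q[0].x, q[1].x, q[2].x, A.a1, A.a2, A.a5);
    solve(q[0].y, q[1].y, q[2].y, A.a3, A.a4, A.a6);
    return true;
}

// LMedS: fit an affine model to each of M random triples and keep the model
// with the smallest median squared residual over all N matched pairs.
Affine lmedsAffine(const std::vector<Pt>& from, const std::vector<Pt>& to, int M) {
    Affine best = {1, 0, 0, 1, 0, 0};               // identity fallback
    if (from.size() < 3) return best;
    double bestMed = 1e30;
    const std::size_t N = from.size();
    for (int m = 0; m < M; ++m) {
        Pt p[3], q[3];
        for (int k = 0; k < 3; ++k) {               // step 2: random triple
            std::size_t i = (std::size_t)std::rand() % N;
            p[k] = from[i]; q[k] = to[i];
        }
        Affine A;
        if (!solveAffine(p, q, A)) continue;        // step 3: fit the triple
        std::vector<double> r2(N);                  // step 4: squared residuals
        for (std::size_t i = 0; i < N; ++i) {
            double X = A.a1 * from[i].x + A.a2 * from[i].y + A.a5;
            double Y = A.a3 * from[i].x + A.a4 * from[i].y + A.a6;
            r2[i] = (X - to[i].x) * (X - to[i].x) + (Y - to[i].y) * (Y - to[i].y);
        }
        std::nth_element(r2.begin(), r2.begin() + N / 2, r2.end());
        if (r2[N / 2] < bestMed) { bestMed = r2[N / 2]; best = A; }
    }
    return best;
}
```

Because the median ignores up to half of the residuals, matches lying on the moving targets act as outliers and do not pull the background model, provided the 50% assumption holds.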

With this sampling procedure, the probability p that at least one sampled triple lies entirely in the background and is matched correctly is given by the following equation [6]:

$$p(\varepsilon, M, q) = 1 - \left(1 - \left((1 - \varepsilon)\,q\right)^3\right)^M \qquad (4)$$

where $\varepsilon$ ($< 0.5$) is the ratio of the moving-object regions to the whole image and $q$ is the probability that corresponding points are correctly found.
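For instance (illustrative numbers, not from the paper): with $\varepsilon = 0.3$, $q = 0.8$ and $M = 20$ random triples, $p = 1 - \left(1 - (0.7 \cdot 0.8)^3\right)^{20} = 1 - (1 - 0.176)^{20} \approx 0.98$, so twenty samples already make it very likely that at least one all-background triple is drawn and matched correctly.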

3 TARGET DETECTION

When the affine parameters are estimated, they can be used to remove the apparent background motion by transforming the current frame. The difference between the next frame and the transformed current frame then reveals the true moving targets. We then apply a threshold to produce a binary image. The resultant binary image for a typical image is shown in figure 2. Some parts are segmented as moving targets due to noise; connected-component analysis can be applied to reduce these errors.
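A minimal sketch of this motion rectification, assuming the affine parameters of Eq. (3) and nearest-neighbour sampling (the paper does not state its interpolation method):

```cpp
#include <cmath>
#include <vector>

using Image = std::vector<std::vector<float>>;

// Warp the current frame with the background affine model of Eq. (3)
// (inverse mapping with nearest-neighbour sampling) and threshold the
// absolute difference against the next frame. Pixels that still differ
// after motion compensation are marked as candidate moving pixels.
std::vector<std::vector<int>> motionRectifiedDifference(
        const Image& cur, const Image& next,
        double a1, double a2, double a3, double a4, double a5, double a6,
        float thresh) {
    int h = (int)cur.size(), w = (int)cur[0].size();
    std::vector<std::vector<int>> moving(h, std::vector<int>(w, 0));
    double det = a1 * a4 - a2 * a3;
    if (std::fabs(det) < 1e-9) return moving;        // non-invertible model
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            // Invert Eq. (3): find the current-frame point mapping onto (x, y).
            double sx = ( a4 * (x - a5) - a2 * (y - a6)) / det;
            double sy = (-a3 * (x - a5) + a1 * (y - a6)) / det;
            int xi = (int)std::lround(sx), yi = (int)std::lround(sy);
            if (xi < 0 || xi >= w || yi < 0 || yi >= h) continue;
            if (std::fabs(next[y][x] - cur[yi][xi]) > thresh) moving[y][x] = 1;
        }
    return moving;
}
```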

To find target bounding boxes we used a split and merge algorithm. We scan the binary image and calculate the number of moving pixels in windows (typically 20×20) centered on the binary image pixels, as sketched below. The window with the maximum number of moving pixels is then selected. If this number is larger than a threshold, the split and merge algorithm is used to find the target bounding box around the selected window; otherwise the search finishes. If a target is found, its bounding box is removed from the binary image and the search process is repeated to find the next target.
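The seeding scan that starts each split-and-merge pass could look as follows; the growth of the bounding box itself is omitted, and the brute-force window sum is kept only for brevity (an integral image would make each window O(1) in practice):

```cpp
#include <vector>

// Scan the binary motion image and return the center of the
// (2*half+1)x(2*half+1) window containing the most moving pixels.
// Returns false when no window reaches minCount, i.e. no (more) targets.
bool findSeedWindow(const std::vector<std::vector<int>>& moving,
                    int half, int minCount, int& bestX, int& bestY) {
    int hgt = (int)moving.size(), wid = (int)moving[0].size();
    int bestCount = -1;
    for (int y = half; y < hgt - half; ++y)
        for (int x = half; x < wid - half; ++x) {
            int count = 0;
            for (int dy = -half; dy <= half; ++dy)
                for (int dx = -half; dx <= half; ++dx)
                    count += moving[y + dy][x + dx];
            if (count > bestCount) { bestCount = count; bestX = x; bestY = y; }
        }
    return bestCount >= minCount;
}
```

After a target's bounding box is cleared from the binary image, the same scan is simply run again, which is what allows multiple targets to be extracted one by one.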

If no target is found at all, then either there is no moving target in the scene or the relative motion of the target is too small to be detected. In the latter case, it is possible to detect the target by adjusting the frame rate of the camera. The algorithm accomplishes this automatically by analyzing subsequent frames until a target is detected.

Our special interest is detection of moving vehicles, so we used the aspect ratio and horizontal and vertical lines as constraints to verify vehicles. Our experiments show that comparing the total length of horizontal and vertical lines in the target area with the perimeter of the target gives a good clue about the nature of the target.

Figure 2: Current frame and the difference of two consecutive frames after background motion compensation; the calculated affine parameters are a1 = 0.9958, a2 = 0.003, a3 = -0.006, a4 = 1.0032, a5 = 2.33, a6 = -1.51

4 ILLUMINATION CHANGE CANCELING

When there is no noise and no illumination change in the consecutive frames, the brightness constancy assumption holds as:

$$I_0(x + u,\, y + v) = I_1(x, y) \qquad (5)$$

where $I_0$ and $I_1$ are the image intensity functions in two consecutive frames and $u$ and $v$ are the motion vectors. In the presence of noise and illumination change, the above equation can be written in its generalized form:

$$I_0(x + u,\, y + v) = \alpha(x, y)\, I_1(x, y) + \beta(x, y) \qquad (6)$$

where $\alpha(x,y)$ is a slowly varying function representing illumination change and $\beta(x,y)$ is the sum of image noise and a slowly varying part representing illumination change. The slowly varying part of $\beta(x,y)$ cannot affect our algorithm because we use edge features for matching. To reduce the effect of noise and $\alpha(x,y)$ we used the algorithm illustrated in figure 3 [13]. In this algorithm we first smooth the image to remove noise, then we calculate the logarithm of the filtered image to convert multiplication to addition. Using an averaging filter, the slowly varying part of the signal can then be removed. Finally an exponential function is applied to recover $I_1(x, y)$. This algorithm also removes the slowly varying part of $I_1(x, y)$, but this part of the signal is not very important for edge detection.
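A sketch of the pipeline of figure 3, using box filters as stand-ins for the smoothing and averaging stages (the paper does not specify filter types or sizes, so both radii here are assumptions):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

using Image = std::vector<std::vector<double>>;

// Box filter of size (2r+1)x(2r+1) with clamped borders; stands in for both
// the small smoothing filter and the large averaging filter of Figure 3.
Image boxFilter(const Image& in, int r) {
    int h = (int)in.size(), w = (int)in[0].size();
    Image out(h, std::vector<double>(w, 0.0));
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            double sum = 0; int n = 0;
            for (int dy = -r; dy <= r; ++dy)
                for (int dx = -r; dx <= r; ++dx) {
                    int yy = std::min(std::max(y + dy, 0), h - 1);
                    int xx = std::min(std::max(x + dx, 0), w - 1);
                    sum += in[yy][xx]; ++n;
                }
            out[y][x] = sum / n;
        }
    return out;
}

// Homomorphic-style illumination canceling along the lines of Figure 3:
// smooth, take the logarithm (multiplicative alpha becomes additive),
// subtract the slowly varying local mean, and exponentiate back.
Image cancelIllumination(const Image& img, int smoothR, int meanR) {
    Image f = boxFilter(img, smoothR);             // 1) noise smoothing
    int h = (int)f.size(), w = (int)f[0].size();
    Image lg(h, std::vector<double>(w));
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            lg[y][x] = std::log(f[y][x] + 1.0);    // 2) log (+1 avoids log 0)
    Image mean = boxFilter(lg, meanR);             // 3) slowly varying part
    Image out(h, std::vector<double>(w));
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            out[y][x] = std::exp(lg[y][x] - mean[y][x]);  // 4) exp back
    return out;
}
```

The averaging radius meanR must be large relative to image structure so that only the slowly varying illumination term is subtracted, while edges survive.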

Figure 3: Illumination change removing algorithm

5 RESULTS


We first examined various edge detection and filtering algorithms to obtain an edge detection algorithm that is robust against noise. We examined the Sobel, Canny, Prewitt and LOG edge detection algorithms with various noise-removing filters, including edge-preserving filters [10], [11]. To test these algorithms, we applied each edge detection algorithm to both the noiseless image and the filtered noise-added image; then, using fuzzy edges and normalized correlation, the similarity of the two images was measured.

Experimental results showed that the Sobel edge detection algorithm with a Gaussian smoothing filter gives the best results. The results for four different images and the Gaussian filter are shown in Table 1.

The triangular membership function with a base of 7 pixels was used for construction of fuzzy edges in this experiment.

The final algorithm has been implemented on a Pentium III 500 MHz using a Visual C++ program.

We used a triangular membership function with a base of 5 pixels to make fuzzy edges. We have tested the algorithm with both simulated and actual image sequences, for various targets in different landscapes and with a wide variety of image sequences. For simulation of illumination change we used a second-order illumination model, i.e. $ax^2 + by^2 + cxy + dx + ey + f$, for $\alpha(x,y)$ and $\beta(x,y)$. For each frame we used random values for a, b, c, d, e, f and a random origin at one of the corners of the image. A typical image before and after illumination addition, together with the result of the illumination change canceling algorithm, is shown in figure 4. Figure 5 shows some results for detection of various targets against arbitrary backgrounds.

As shown, the algorithm has successfully detected the targets.

Experimental results have shown that the algorithm is reliable and can successfully detect targets in most cases, including targets with changes in geometry and illumination. Comparison of the results generated by the proposed method with those of other methods showed that more reliable results can be obtained with the proposed method in real time.

Image                          Gaussian Noise   Sobel   Prewitt   LOG    Canny
Girl                           5 dB             0.75    0.75      0.56   0.25
                               7.5 dB           0.78    0.78      0.72   0.41
                               10 dB            0.81    0.81      0.81   0.56
Cameraman                      5 dB             0.90    0.89      0.55   0.25
                               7.5 dB           0.91    0.90      0.69   0.32
                               10 dB            0.92    0.91      0.80   0.45
Patrol (Figure 3-a)            5 dB             0.79    0.78      0.64   0.30
                               7.5 dB           0.82    0.82      0.78   0.44
                               10 dB            0.84    0.83      0.85   0.68
Car (Figure 4, bottom-left)    5 dB             0.79    0.79      0.69   0.34
                               7.5 dB           0.817   0.82      0.79   0.43
                               10 dB            0.82    0.83      0.83   0.67

Table 1: Similarity measurement for different noise levels and edge detection algorithms

Figure 4: (a) Original image; (b) image after illumination change simulation; (c) output of the illumination canceling algorithm

6 CONCLUSION

In this paper we have proposed a new method for detection of moving targets in a mobile camera image sequence. We used an affine transformation and the LMedS algorithm for estimation of the apparent background motion. To select feature points we used a new method, which extracts features from the edge-detected image. We also used fuzzy-edge based feature matching to find corresponding points for the features. When the apparent motion of the background is compensated, the difference of two consecutive frames is used for detection of the moving targets. We designed a split and merge algorithm for multiple moving target detection. We also checked the targets for additional features. Detection of slow-moving targets using a frame-rate adjusting algorithm and robustness against illumination and geometry changes are other features of our algorithm. The proposed algorithm successfully detects moving vehicles and objects in arbitrary scenes obtained from a mobile video camera. Local and regional computations have made the algorithm suitable for real-time applications. Experimental results have shown that the algorithm is reliable and can successfully detect moving targets in most cases.

Figure 5: Various targets detected using the detection algorithm

REFERENCES

[1] Lipton, A.J., Fujiyoshi, H.F. & Patil, R.S. (1998) "Moving target classification and tracking from real-time video", Proc. IEEE Workshop on Applications of Computer Vision, 4, 8-14.

[2] Hwang, J., Ooi, Y. & Ozawa, S. (1992) "Visual feedback control system for tracking and zooming a target", Proc. of the IEEE International Conference on Power Electronics and Motion Control, 740-745.

[3] Murray, D. & Basu, A. (1995) "Motion tracking with an active camera", IEEE Trans. Pattern Anal. & Mach. Intell., 16(5):449-459.

[4] Barron, J.L., Fleet, D.J., Beauchemin, S.S. & Burkitt, T.A. (1992) "Performance of optical flow techniques", International Journal of Computer Vision, 12(1):43-77.

[5] _ (2001) "A Robust Vision-based Moving Target Detection and Tracking System", Proceedings of the Image and Vision Computing conference (IVCNZ2001), University of Otago, Dunedin, New Zealand.

[6] Araki, S., Matsuoka, T., Yokoya, N. & Takemura, H. (2000) "Real-time tracking of multiple moving object contours in a moving camera image sequence", IEICE Trans. Inf. & Syst., E83-D(7).

[7] Hager, G.D. & Belhumeur, P.N. (1998) "Efficient region tracking with parametric models of geometry and illumination", IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(10):1029-1035.

[8] Lai, S.H. & Fang, M. (1999) "Robust and efficient image alignment with spatially varying illumination model", Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, 2:167-172.

[9] Wang, L.X. (1997) A Course in Fuzzy Systems and Control, Prentice-Hall International Inc.

[10] Wang, X. (1994) "Optimal edge-preserving hybrid filter", IEEE Trans. on Image Processing, 3(6):862-865.

[11] Wang, X. (1992) "On gradient inverse weighted filter", IEEE Trans. on Signal Processing, 40(2):482-484.

[12] Jain, R., Kasturi, R. & Schunck, B.G. (1995) Machine Vision, McGraw-Hill Inc.

[13] Gonzalez, R.C. & Woods, R.E. (1992) Digital Image Processing, Addison-Wesley Publishing Company.
