cv08 part17 reconstruction3

(1)

Perceptual and Sensory Augmented Computing, Computer Vision WS 08/09

Computer Vision – Lecture 17

Structure-from-Motion

21.01.2009

Bastian Leibe

RWTH Aachen

http://www.umic.rwth-aachen.de/multimedia

[email protected]

(2)


Course Outline

• Image Processing Basics
• Segmentation & Grouping
• Object Recognition
• Local Features & Matching
• Object Categorization
• 3D Reconstruction
  - Epipolar Geometry and Stereo Basics
  - Camera calibration & Uncalibrated Reconstruction
  - Structure-from-Motion
• Motion and Tracking

(3)


Recap: A General Point

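• Equations of the form $Ax = 0$

• How do we solve them? (always!)

  - Apply SVD:

    $$A = U D V^T = U \begin{bmatrix} d_{11} & & \\ & \ddots & \\ & & d_{NN} \end{bmatrix} \begin{bmatrix} v_{11} & \cdots & v_{1N} \\ \vdots & \ddots & \vdots \\ v_{N1} & \cdots & v_{NN} \end{bmatrix}^T$$

    where the $d_{ii}$ are the singular values and the columns of $V$ are the (right) singular vectors.

  - Singular values of $A$ = square roots of the eigenvalues of $A^T A$.

  - The solution of $Ax = 0$ is the nullspace vector of $A$.

  - This corresponds to the singular vector of $A$ that belongs to the smallest singular value.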
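The recipe above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `solve_homogeneous` and the small example matrix are hypothetical and not part of the lecture.

```python
import numpy as np

def solve_homogeneous(A):
    """Return the unit vector x that minimizes ||A x|| (solution of A x = 0)."""
    # NumPy returns the singular values in descending order, so the last row
    # of Vt is the right singular vector of the smallest singular value.
    _, singular_values, Vt = np.linalg.svd(A)
    return Vt[-1], singular_values

# Hypothetical example: every row of A is orthogonal to (1, 2, -1).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 2.0],
              [1.0, 1.0, 3.0],
              [2.0, -1.0, 0.0]])
x, s = solve_homogeneous(A)
```

The solution is only defined up to sign and scale; the SVD picks the representative with unit length, which is the usual way to exclude the trivial solution x = 0.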

(4)


Properties of SVD

• Frobenius norm

  $$\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2} = \sqrt{\sum_{i=1}^{\min(m,n)} \sigma_i^2}$$

  - Generalization of the Euclidean norm to matrices

• Partial reconstruction property of SVD

  - Let $\sigma_i$, $i = 1, \dots, N$ be the singular values of $A$.

  - Let $A_p = U_p D_p V_p^T$ be the reconstruction of $A$ when we set $\sigma_{p+1}, \dots, \sigma_N$ to zero.

  - Then $A_p = U_p D_p V_p^T$ is the best rank-p approximation of $A$ in the sense of the Frobenius norm (i.e. the best least-squares approximation).
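Both properties can be checked numerically. The following is a minimal sketch; the helper name `best_rank_p` and the random test matrix are assumptions made for illustration only.

```python
import numpy as np

def best_rank_p(A, p):
    """Rank-p reconstruction of A obtained by zeroing all but the p largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_p = U[:, :p] @ np.diag(s[:p]) @ Vt[:p, :]
    return A_p, s

# Hypothetical 6 x 4 test matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
A_p, s = best_rank_p(A, 2)

frob_A = np.linalg.norm(A, "fro")          # equals sqrt(sum of all sigma_i^2)
residual = np.linalg.norm(A - A_p, "fro")  # equals sqrt(sum of the discarded sigma_i^2)
```

The residual of the rank-p approximation is exactly the Frobenius norm of the discarded singular values, which is what makes the truncation optimal in the least-squares sense.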

(5)


Recap: Camera Parameters

•

Intrinsic parameters



Principal point coordinates



Focal length



Pixel magnification factors



Skew (non-rectangular pixels)



Radial distortion

•

Extrinsic parameters



Rotation R



Translation t

(both relative to world coordinate system)

•

Camera projection matrix



General pinhole camera: 9 DoF



CCD Camera with square pixels: 10 DoF



General camera:

11 DoF

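$$K = \begin{bmatrix} m_x & & \\ & m_y & \\ & & 1 \end{bmatrix} \begin{bmatrix} f & & p_x \\ & f & p_y \\ & & 1 \end{bmatrix} = \begin{bmatrix} \alpha_x & s & x_0 \\ & \alpha_y & y_0 \\ & & 1 \end{bmatrix}$$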





(6)


Recap: Calibrating a Camera

Goal

•

Compute intrinsic and

extrinsic parameters using

observed camera data.

Main idea

•

Place “calibration object”

with known geometry in the

scene

•

Get correspondences

•

Solve for mapping from scene to image: estimate $P = P_{int} P_{ext}$

Slide credit: Kristen Grauman

(7)


Recap: Camera Calibration (DLT

Algorithm)

•

P has 11 degrees of freedom.

•

Two linearly independent equations per independent

2D/3D correspondence.

•

Solve with SVD (similar to homography estimation)



Solution corresponds to smallest singular vector.

•

5 ½ correspondences needed for a minimal solution.

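$$\begin{bmatrix} 0^T & X_1^T & -y_1 X_1^T \\ X_1^T & 0^T & -x_1 X_1^T \\ \vdots & \vdots & \vdots \\ 0^T & X_n^T & -y_n X_n^T \\ X_n^T & 0^T & -x_n X_n^T \end{bmatrix} \begin{pmatrix} P_1 \\ P_2 \\ P_3 \end{pmatrix} = 0 \qquad \Leftrightarrow \qquad Ap = 0$$

Here $P_1^T$, $P_2^T$, $P_3^T$ are the rows of $P$, and $(x_i, y_i)$ is the image of the 3D point $X_i$.

Slide adapted from Svetlana Lazebnik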
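The DLT system can be assembled and solved directly with an SVD. The sketch below assumes noise-free synthetic data and unnormalized coordinates; the function name `calibrate_dlt`, the matrix `P_true` and the sample points are hypothetical.

```python
import numpy as np

def calibrate_dlt(X_world, x_image):
    """Estimate the 3x4 projection matrix P (up to scale) from n >= 6 2D/3D correspondences."""
    n = X_world.shape[0]
    X_h = np.hstack([X_world, np.ones((n, 1))])   # homogeneous 3D points
    zeros = np.zeros(4)
    rows = []
    for X_j, (x, y) in zip(X_h, x_image):
        # Two linearly independent equations per correspondence.
        rows.append(np.concatenate([zeros, X_j, -y * X_j]))
        rows.append(np.concatenate([X_j, zeros, -x * X_j]))
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)                   # p = (P_1, P_2, P_3) stacked row-wise

# Hypothetical ground-truth camera and eight points in general position.
rng = np.random.default_rng(1)
P_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, -5.0],
                   [0.0, 0.0, 1.0, 4.0]])
X_world = rng.uniform(-1.0, 1.0, size=(8, 3))
proj = np.hstack([X_world, np.ones((8, 1))]) @ P_true.T
x_image = proj[:, :2] / proj[:, 2:3]

P_est = calibrate_dlt(X_world, x_image)
P_est = P_est * (P_true[2, 3] / P_est[2, 3])      # fix the free scale for comparison
```

With real, noisy measurements the coordinates would normally be normalized first, as in the normalized eight-point algorithm, and the linear estimate refined by nonlinear least squares.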



(8)

Recap: Triangulation – Linear Algebraic Approach

$$\lambda_1 x_1 = P_1 X, \qquad \lambda_2 x_2 = P_2 X$$

$$x_1 \times P_1 X = 0, \qquad x_2 \times P_2 X = 0$$

$$[x_{1\times}] P_1 X = 0, \qquad [x_{2\times}] P_2 X = 0$$

• Two independent equations each in terms of three unknown entries of X.

• Stack equations and solve with SVD.

• This approach nicely generalizes to multiple cameras.

[Figure: rays R₁ and R₂ from the camera centers O₁ and O₂ through the image points x₁ and x₂; X = ?]

Slide credit: Svetlana Lazebnik
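A minimal sketch of the linear triangulation follows. It assumes the homogeneous formulation (X is solved as a 4-vector and dehomogenized afterwards); the function names and the two example cameras are hypothetical.

```python
import numpy as np

def triangulate(points_2d, cameras):
    """Linear triangulation of one 3D point from two or more views."""
    rows = []
    for (x, y), P in zip(points_2d, cameras):
        # x × (P X) = 0 yields two independent rows per view.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]                      # homogeneous solution of A X = 0
    return X_h[:3] / X_h[3]

def project(P, X):
    x_h = P @ np.append(X, 1.0)
    return x_h[:2] / x_h[2]

# Hypothetical stereo pair: second camera shifted by one unit along the x axis.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, -0.2, 4.0])

X_est = triangulate([project(P1, X_true), project(P2, X_true)], [P1, P2])
```

Adding a third or fourth view only appends two more rows per camera to the stacked system, which is why the approach generalizes so easily to multiple cameras.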

(9)


Recap: Epipolar Geometry –

Calibrated Case

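[Figure: 3D point X observed as x and x' in two cameras related by rotation R and translation t]

• First camera matrix: $[I \mid 0]$, with $X = (u, v, w, 1)^T$ and $x = (u, v, w)^T$

• Second camera matrix: $[R^T \mid -R^T t]$

• Vector $x'$ in the second coord. system has coordinates $Rx'$ in the first one.

• The vectors $x$, $t$, and $Rx'$ are coplanar.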

(10)


Calibrated Case

$$x \cdot [t \times (R x')] = 0 \quad \Rightarrow \quad x^T E x' = 0 \quad \text{with} \quad E = [t_\times] R$$

Essential Matrix (Longuet-Higgins, 1981)

(11)


Uncalibrated Case

• The calibration matrices $K$ and $K'$ of the two cameras are unknown

• We can write the epipolar constraint in terms of unknown normalized coordinates:

$$\hat{x}^T E \hat{x}' = 0$$





(12)


Uncalibrated Case

$$\hat{x}^T E \hat{x}' = 0 \quad \Rightarrow \quad x^T F x' = 0 \quad \text{with} \quad F = K^{-T} E K'^{-1}$$

$$x = K \hat{x}, \qquad x' = K' \hat{x}'$$

Fundamental Matrix (Faugeras and Luong, 1992)

(13)


Recap: The Eight-Point Algorithm

$x = (u, v, 1)^T$, $x' = (u', v', 1)^T$

Minimize:

$$\sum_{i=1}^{N} (x_i^T F x_i')^2$$

under the constraint $|F|^2 = 1$

• Problem: poor numerical conditioning







(14)


Recap: Normalized Eight-Point

Algorithm

1. Center the image data at the origin, and scale it so the mean squared distance between the origin and the data points is 2 pixels.

2. Use the eight-point algorithm to compute $F$ from the normalized points.

3. Enforce the rank-2 constraint using SVD: decompose $F = U D V^T$ with $D = \mathrm{diag}(d_{11}, d_{22}, d_{33})$, set $d_{33}$ to zero and reconstruct $F$.

4. Transform fundamental matrix back to original units: if $T$ and $T'$ are the normalizing transformations in the two images, then the fundamental matrix in original coordinates is $T^T F T'$.

[Hartley, 1995]
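The four steps can be sketched as follows. The convention is the one used on these slides, $x^T F x' = 0$ with $x$ in the first image and $x'$ in the second. All function names and the synthetic two-camera setup are assumptions for illustration.

```python
import numpy as np

def normalize_points(pts):
    """Step 1: move the centroid to the origin, scale to mean squared distance 2."""
    centroid = pts.mean(axis=0)
    mean_sq_dist = np.mean(np.sum((pts - centroid) ** 2, axis=1))
    s = np.sqrt(2.0 / mean_sq_dist)
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    return pts_h @ T.T, T

def eight_point_normalized(x1, x2):
    """Estimate F from n >= 8 correspondences such that x1_h^T F x2_h = 0."""
    x1n, T1 = normalize_points(x1)
    x2n, T2 = normalize_points(x2)
    # Step 2: each correspondence gives one row kron(x, x') acting on row-major vec(F).
    A = np.array([np.kron(a, b) for a, b in zip(x1n, x2n)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Step 3: enforce rank 2 by zeroing the smallest singular value.
    U, d, Vt_f = np.linalg.svd(F)
    d[2] = 0.0
    F = U @ np.diag(d) @ Vt_f
    # Step 4: undo the normalization.
    return T1.T @ F @ T2

def project(P, X_h):
    x_h = X_h @ P.T
    return x_h[:, :2] / x_h[:, 2:3]

# Hypothetical synthetic scene: two cameras with the same K, 12 points in front of both.
rng = np.random.default_rng(2)
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
angle = 0.1
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([[-1.0], [0.1], [0.2]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
X_h = np.hstack([rng.uniform(-1.0, 1.0, (12, 2)),
                 rng.uniform(4.0, 8.0, (12, 1)),
                 np.ones((12, 1))])
x1, x2 = project(P1, X_h), project(P2, X_h)

F = eight_point_normalized(x1, x2)
```

The normalization keeps all entries of the design matrix at a comparable magnitude, which addresses the poor conditioning of the plain eight-point algorithm.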

(15)


Recap: Comparison of Estimation

Algorithms

              8-point        Normalized 8-point   Nonlinear least squares
Av. Dist. 1   2.33 pixels    0.92 pixel           0.86 pixel
Av. Dist. 2   2.18 pixels    0.85 pixel           0.80 pixel

(16)


Recap: Epipolar Transfer

•

Assume the epipolar geometry is known

•

Given projections of the same point in two

images, how can we compute the projection of

that point in a third image?

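$$l_{31} = F_{13}^T x_1, \qquad l_{32} = F_{23}^T x_2$$

The point $x_3$ lies on both epipolar lines $l_{31}$ and $l_{32}$ in the third image, i.e. at their intersection.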

(17)


Recap: Active Stereo with

Structured Light

•

Optical triangulation



Project a single stripe of laser light



Scan it across the surface of the object



This is a very precise version of structured light scanning

17

B. Leibe

Digital Michelangelo Project

http://graphics.stanford.edu/projects/mich/

(18)


Topics of This Lecture

•

Structure from Motion (SfM)



Motivation



Ambiguity

•

Affine SfM



Affine cameras



Affine factorization



Euclidean upgrade



Dealing with missing data

•

Projective SfM



Two-camera case



Projective factorization



Bundle adjustment



Practical considerations

•

Applications

18

(19)


Structure from Motion

• Given: m images of n fixed 3D points

  $$x_{ij} = P_i X_j, \qquad i = 1, \dots, m, \quad j = 1, \dots, n$$

• Problem: estimate m projection matrices $P_i$ and n 3D points $X_j$ from the mn correspondences $x_{ij}$

[Figure: 3D point $X_j$ projected by the cameras $P_1$, $P_2$, $P_3$ to the image points $x_{1j}$, $x_{2j}$, $x_{3j}$]

(20)


What Can We Use This For?

20

B. Leibe

•

E.g. movie special effects

Video

(21)


Ambiguity

• If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same:

  $$x = PX = \left(\frac{1}{k} P\right)(kX)$$

  - It is impossible to recover the absolute scale of the scene!





(22)


Ambiguity

• If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same.

• More generally: if we transform the scene using a transformation Q and apply the inverse transformation to the camera matrices, then the images do not change:

  $$x = PX = (P Q^{-1})(Q X)$$
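The ambiguity is easy to verify numerically. In the sketch below the camera `P`, the point `X`, the factor `k` and the transformation `Q` are arbitrary hypothetical values; any invertible 4×4 matrix `Q` would do.

```python
import numpy as np

def dehomogenize(v):
    return v[:-1] / v[-1]

# Hypothetical camera and scene point (homogeneous coordinates).
P = np.array([[800.0, 0.0, 320.0, 0.0],
              [0.0, 800.0, 240.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
X = np.array([0.3, -0.4, 5.0, 1.0])

# Scale ambiguity: x = PX = (P / k)(k X).
k = 2.5
x_original = dehomogenize(P @ X)
x_scaled = dehomogenize((P / k) @ (k * X))

# General ambiguity: x = PX = (P Q^-1)(Q X) for an invertible Q
# (this Q is strictly diagonally dominant, hence invertible).
Q = np.array([[2.0, 0.2, 0.0, 0.5],
              [0.0, 2.0, 0.3, -1.0],
              [0.1, 0.0, 3.0, 2.0],
              [0.01, 0.02, 0.0, 1.0]])
x_transformed = dehomogenize((P @ np.linalg.inv(Q)) @ (Q @ X))
```

All three image points coincide, so image measurements alone cannot distinguish the original scene and cameras from the transformed ones.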

(23)


Reconstruction Ambiguity:

Similarity

$$x = PX = (P Q_S^{-1})(Q_S X)$$

(24)


Reconstruction Ambiguity:

Affine

$$x = PX = (P Q_A^{-1})(Q_A X)$$

(25)


Reconstruction Ambiguity:

Projective

$$x = PX = (P Q_P^{-1})(Q_P X)$$

(26)


Projective Ambiguity

26

B. Leibe

(27)


From Projective to Affine

27

B. Leibe

(28)


From Affine to Similarity

28

B. Leibe

(29)

Hierarchy of 3D

Transformations

• With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction.

• Need additional information to upgrade the reconstruction to affine, similarity, or Euclidean.

Transformation | DoF    | Matrix                                                  | Invariants
Projective     | 15 dof | $\begin{bmatrix} A & t \\ v^T & v \end{bmatrix}$        | Preserves intersection and tangency
Affine         | 12 dof | $\begin{bmatrix} A & t \\ 0^T & 1 \end{bmatrix}$        | Preserves parallelism, volume ratios
Similarity     | 7 dof  | $\begin{bmatrix} sR & t \\ 0^T & 1 \end{bmatrix}$       | Preserves angles, ratios of length
Euclidean      | 6 dof  | $\begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix}$        | Preserves angles, lengths

(30)


Topics of This Lecture

•

Structure from Motion (SfM)



Motivation



Ambiguity

•

Affine SfM



Affine cameras



Affine factorization



Euclidean upgrade



Dealing with missing data

•

Projective SfM



Two-camera case



Projective factorization



Bundle adjustment



Practical considerations

•

Applications

30

(31)


• Let’s start with affine cameras (the math is easier)

[Figure: affine camera, center at infinity]

(32)


Orthographic Projection

•

Special case of perspective projection



Distance from center of projection to image plane is

infinite



Projection matrix:

32

B. Leibe

Slide credit: Steve Seitz

(33)


Affine Cameras

33

B. Leibe

Orthographic Projection

Parallel Projection

(34)

Affine Cameras

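• A general affine camera combines the effects of an affine transformation of the 3D space, orthographic projection, and an affine transformation of the image:

  $$P = [3 \times 3 \text{ affine}] \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} [4 \times 4 \text{ affine}] = \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix}$$

• Affine projection is a linear mapping + translation in inhomogeneous coordinates:

  $$x = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = AX + b$$

  Here $b$ is the projection of the world origin, and $a_1$, $a_2$ denote the rows of $A$.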

(35)


Affine Structure from Motion

• Given: m images of n fixed 3D points:

  $$x_{ij} = A_i X_j + b_i, \qquad i = 1, \dots, m, \quad j = 1, \dots, n$$

• Problem: use the mn correspondences $x_{ij}$ to estimate m projection matrices $A_i$ and translation vectors $b_i$, and n points $X_j$

• The reconstruction is defined up to an arbitrary affine transformation Q (12 degrees of freedom):

  $$\begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix} \to \begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix} Q^{-1}, \qquad \begin{pmatrix} X \\ 1 \end{pmatrix} \to Q \begin{pmatrix} X \\ 1 \end{pmatrix}$$

• We have 2mn knowns and 8m + 3n unknowns (minus 12 dof for affine ambiguity).

  - Thus, we must have $2mn \ge 8m + 3n - 12$.

  - For two views, we need four point correspondences ($m = 2$ gives $4n \ge 3n + 4$, i.e. $n \ge 4$).

(36)

Affine Structure from Motion

• Centering: subtract the centroid of the image points

  $$\hat{x}_{ij} = x_{ij} - \frac{1}{n} \sum_{k=1}^{n} x_{ik} = A_i X_j + b_i - \frac{1}{n} \sum_{k=1}^{n} (A_i X_k + b_i) = A_i \Big( X_j - \frac{1}{n} \sum_{k=1}^{n} X_k \Big) = A_i \hat{X}_j$$

• For simplicity, assume that the origin of the world coordinate system is at the centroid of the 3D points.

• After centering, each normalized point $\hat{x}_{ij}$ is related to the 3D point $X_j$ by

  $$\hat{x}_{ij} = A_i X_j$$



(37)


Affine Structure from Motion

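• Let’s create a 2m × n data (measurement) matrix:

  $$D = \begin{bmatrix} \hat{x}_{11} & \hat{x}_{12} & \cdots & \hat{x}_{1n} \\ \hat{x}_{21} & \hat{x}_{22} & \cdots & \hat{x}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{x}_{m1} & \hat{x}_{m2} & \cdots & \hat{x}_{mn} \end{bmatrix}$$

  Rows: cameras (2m). Columns: points (n).

C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.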

(38)

Affine Structure from Motion

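• Let’s create a 2m × n data (measurement) matrix:

  $$D = \begin{bmatrix} \hat{x}_{11} & \hat{x}_{12} & \cdots & \hat{x}_{1n} \\ \hat{x}_{21} & \hat{x}_{22} & \cdots & \hat{x}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{x}_{m1} & \hat{x}_{m2} & \cdots & \hat{x}_{mn} \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}$$

  The first factor stacks the cameras (2m × 3), the second holds the points (3 × n).

• The measurement matrix D = MS must have rank 3!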









(39)


Factorizing the Measurement

Matrix

39

B. Leibe

(40)


Factorizing the Measurement Matrix

•

Singular value decomposition of D:

(41)



(42)


Factorizing the Measurement Matrix

•

Obtaining a factorization from SVD:

(43)



43 Slide credit: Martial Hebert

This decomposition minimizes

(44)


Affine Ambiguity

• The decomposition is not unique. We get the same D by using any invertible 3×3 matrix C and applying the transformations $M \to MC$, $S \to C^{-1} S$.

• That is because we have only an affine transformation and we have not enforced any Euclidean constraints (like forcing the image axes to be perpendicular, for example). We need a Euclidean upgrade.

(45)


Estimating the Euclidean

Upgrade

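• Orthographic assumption: image axes are perpendicular and scale is 1.

  $$a_1 \cdot a_2 = 0, \qquad |a_1|^2 = |a_2|^2 = 1$$

  [Figure: 3D point X projected to x on an image plane spanned by the axes a₁ and a₂]

• This can be converted into a system of 3m equations:

  $$\hat{a}_{i1}^T C C^T \hat{a}_{i2} = 0, \qquad \hat{a}_{i1}^T C C^T \hat{a}_{i1} = 1, \qquad \hat{a}_{i2}^T C C^T \hat{a}_{i2} = 1, \qquad i = 1, \dots, m$$

  where $\hat{a}_{i1}^T$ and $\hat{a}_{i2}^T$ are the two rows of the affine camera $i$ in $M$, and $a_{ik} = C^T \hat{a}_{ik}$ after the update $M \to MC$.

Slide adapted from S. Lazebnik, M. Hebert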



(46)


Estimating the Euclidean

Upgrade

• This can be converted into a system of 3m equations:

  $$\hat{a}_{i1}^T C C^T \hat{a}_{i2} = 0, \qquad \hat{a}_{i1}^T C C^T \hat{a}_{i1} = 1, \qquad \hat{a}_{i2}^T C C^T \hat{a}_{i2} = 1, \qquad i = 1, \dots, m$$

• Let

  $$L = C C^T, \qquad A_i = \begin{bmatrix} \hat{a}_{i1}^T \\ \hat{a}_{i2}^T \end{bmatrix}, \qquad i = 1, \dots, m$$

• Then this translates to 3m equations in L:

  $$A_i L A_i^T = I, \qquad i = 1, \dots, m$$

  - Solve for L
  - Recover C from L by Cholesky decomposition: $L = C C^T$
  - Update M and S: $M = MC$, $S = C^{-1} S$

Slide adapted from S. Lazebnik, M. Hebert
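The upgrade can be sketched as a small linear least-squares problem in the six distinct entries of the symmetric matrix L. The code below is an illustration under assumptions: the function names are hypothetical, the data are noise-free, and at least three cameras with pairwise different viewing directions are assumed so that L is unique.

```python
import numpy as np

def quad_row(a, b):
    """Coefficients of a^T L b in the unknowns (L11, L12, L13, L22, L23, L33)."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def euclidean_upgrade(M, S):
    """Find C with L = C C^T from the 3m orthographic constraints, return (M C, C^-1 S)."""
    m = M.shape[0] // 2
    G, rhs = [], []
    for i in range(m):
        a1, a2 = M[2 * i], M[2 * i + 1]          # rows of the affine camera i
        G += [quad_row(a1, a1), quad_row(a2, a2), quad_row(a1, a2)]
        rhs += [1.0, 1.0, 0.0]
    l, *_ = np.linalg.lstsq(np.array(G), np.array(rhs), rcond=None)
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    # Raises numpy.linalg.LinAlgError if L is not positive definite (e.g. noisy data).
    C = np.linalg.cholesky(L)
    return M @ C, np.linalg.solve(C, S)

# Hypothetical test data: 4 orthographic cameras, 10 points, distorted by a known C0.
rng = np.random.default_rng(4)
m, n = 4, 10
blocks = []
for _ in range(m):
    rotation, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    blocks.append(rotation[:2])                  # two orthonormal image axes
M_true = np.vstack(blocks)
S_true = rng.standard_normal((3, n))
C0 = np.array([[2.0, 0.3, 0.0], [0.1, 1.5, 0.2], [0.0, -0.4, 1.0]])
M_affine = M_true @ np.linalg.inv(C0)
S_affine = C0 @ S_true

M_euc, S_euc = euclidean_upgrade(M_affine, S_affine)
```

The recovered C is only determined up to a rotation, so `M_euc` matches `M_true` up to a global rotation of the scene; what the upgrade guarantees is that every camera again has perpendicular unit-length image axes.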

(47)


Algorithm Summary

• Given: m images and n features $x_{ij}$

• For each image i, center the feature coordinates.

• Construct a 2m × n measurement matrix D:
  - Column j contains the projection of point j in all views
  - Each row contains one coordinate (x or y) of the projections of all the n points in one image i

• Factorize D:
  - Compute SVD: $D = U W V^T$
  - Create $U_3$ by taking the first 3 columns of U
  - Create $V_3$ by taking the first 3 columns of V
  - Create $W_3$ by taking the upper left 3 × 3 block of W

• Create the motion and shape matrices:
  - $M = U_3 W_3^{1/2}$ and $S = W_3^{1/2} V_3^T$ (or $M = U_3$ and $S = W_3 V_3^T$)

• Eliminate affine ambiguity
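The summary up to (but not including) the elimination of the affine ambiguity can be sketched as follows. The array layout `(m, n, 2)`, the function name `affine_factorization` and the synthetic cameras are assumptions for illustration; all points are assumed visible in all views.

```python
import numpy as np

def affine_factorization(x):
    """Factorize tracked points x of shape (m, n, 2) into motion M (2m x 3) and shape S (3 x n)."""
    m, n, _ = x.shape
    # Center the feature coordinates in each image.
    centered = x - x.mean(axis=1, keepdims=True)
    # Rows 2i and 2i+1 of D hold the x and y coordinates of all points in image i.
    D = centered.transpose(0, 2, 1).reshape(2 * m, n)
    U, w, Vt = np.linalg.svd(D, full_matrices=False)
    sqrt_W3 = np.diag(np.sqrt(w[:3]))
    M = U[:, :3] @ sqrt_W3          # M = U_3 W_3^(1/2)
    S = sqrt_W3 @ Vt[:3, :]         # S = W_3^(1/2) V_3^T
    return M, S, D, w

# Hypothetical noise-free data: 5 random affine cameras observing 10 points.
rng = np.random.default_rng(5)
m, n = 5, 10
X = rng.standard_normal((3, n))
x = np.empty((m, n, 2))
for i in range(m):
    A_i = rng.standard_normal((2, 3))
    b_i = rng.standard_normal(2)
    x[i] = (A_i @ X).T + b_i        # x_ij = A_i X_j + b_i

M, S, D, w = affine_factorization(x)
```

With noise-free data D has exactly rank 3, so `M @ S` reproduces D. With noisy tracks the truncation to three singular values gives the best rank-3 fit in the Frobenius sense. The result is still only defined up to the affine ambiguity M → MC, S → C⁻¹S.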

(48)


Reconstruction Results

48

B. Leibe

C. Tomasi and T. Kanade. Shape and motion from image streams under orthography:

(49)


Dealing with Missing Data

•

So far, we have assumed that all points are

visible in all views

•

In reality, the measurement matrix typically

looks something like this:

49

B. Leibe

Cameras

Points

(50)


Dealing with Missing Data

•

Possible solution: decompose matrix into

dense sub-blocks, factorize each sub-block,

and fuse the results



Finding dense maximal sub-blocks of the matrix is

NP-complete (equivalent to finding maximal cliques

in a graph)

•

Incremental bilinear refinement

50

(1) Perform

factorization on

a dense

sub-block

F. Rothganger, S. Lazebnik, C. Schmid, and J. Ponce. Segmenting,

Modeling, and Matching Video Clips Containing Multiple Moving

Objects. PAMI 2007.

(51)


Dealing with Missing Data

•

Possi