
Inference: Given the observed variables, that is, the labels L, we would like to infer the hidden variables, i.e. the target values z, as well as the annotator parameters a. This can be done using a Bayesian treatment of the Expectation-Maximization (EM) algorithm [2].

E-step: Assuming that we have a current estimate â of the annotator parameters, we compute the posterior on the target values:

    \hat{p}(z) = p(z \mid L, \hat{a}) \propto p(z)\, p(L \mid z, \hat{a}) = \prod_{i=1}^{N} \hat{p}(z_i),    (7.2)

where

    \hat{p}(z_i) = p(z_i \mid \zeta) \prod_{j \in \mathcal{A}_i} p(l_{ij} \mid z_i, \hat{a}_j).    (7.3)

M-step: To estimate the annotator parameters a, we maximize the expectation of the logarithm of the posterior on a with respect to p̂(z_i) from the E-step. We call the auxiliary function being maximized Q(a, â). Thus the optimal a* is found from

    a^\star = \arg\max_a Q(a, \hat{a}),    (7.4)

where â is the estimate from the previous iteration, and

    Q(a, \hat{a}) = \mathbb{E}_z[\log p(L \mid z, a) + \log p(a \mid \alpha)]    (7.5)
                  = \sum_{j=1}^{M} Q_j(a_j, \hat{a}_j),    (7.6)

where E_z[·] is the expectation with respect to p̂(z), and Q_j(a_j, â_j) is defined as

    Q_j(a_j, \hat{a}_j) = \log p(a_j \mid \alpha) + \sum_{i \in \mathcal{I}_j} \mathbb{E}_{z_i}[\log p(l_{ij} \mid z_i, a_j)].    (7.7)

Hence, the optimization can be carried out separately for each annotator, and relies only on the labels that the annotator provided. It is clear from the form of (7.3) and (7.7) that any given annotator is not required to label every image.
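To make the E- and M-steps concrete, the following is a minimal sketch assuming a binary target z_i ∈ {0, 1} with a uniform prior standing in for p(z_i | ζ), and a one-parameter annotator model in which a_j is the probability that annotator j labels correctly, with a Beta prior standing in for p(a_j | α). These modeling choices are illustrative simplifications, not the exact annotator model used in this chapter.

```python
# Minimal EM sketch for (7.3)-(7.7): binary targets z_i in {0,1}, uniform
# prior on z_i, and a_j = Pr(annotator j labels correctly) with a Beta prior.
# All of these modeling choices are illustrative simplifications.

def e_step(labels, a, prior_z=0.5):
    """labels: dict image i -> {annotator j: label l_ij in {0,1}}.
    Returns p_hat[i] = posterior probability that z_i = 1, via (7.3)."""
    p_hat = {}
    for i, lab in labels.items():
        p1, p0 = prior_z, 1.0 - prior_z        # unnormalized posteriors
        for j, l in lab.items():
            p1 *= a[j] if l == 1 else 1.0 - a[j]
            p0 *= a[j] if l == 0 else 1.0 - a[j]
        p_hat[i] = p1 / (p1 + p0)
    return p_hat

def m_step(labels, p_hat, alpha=(2.0, 2.0)):
    """Maximizes each Q_j in (7.7) separately. With a Beta(alpha) prior on
    a_j, the MAP estimate has the closed form (c + a - 1)/(n + a + b - 2),
    where c is the expected number of labels agreeing with z_i."""
    stats = {}                                  # j -> [E[#correct], #labels]
    for i, lab in labels.items():
        for j, l in lab.items():
            ec = p_hat[i] if l == 1 else 1.0 - p_hat[i]
            s = stats.setdefault(j, [0.0, 0.0])
            s[0] += ec
            s[1] += 1.0
    a, b = alpha
    return {j: (c + a - 1.0) / (n + a + b - 2.0) for j, (c, n) in stats.items()}

# Toy run: annotators 0 and 1 always agree; annotator 2 is noisier.
labels = {0: {0: 1, 1: 1, 2: 0}, 1: {0: 0, 1: 0, 2: 1},
          2: {0: 1, 1: 1, 2: 1}, 3: {0: 0, 1: 0, 2: 0}}
a = {j: 0.7 for j in range(3)}                  # initial estimates of a_j
for _ in range(20):                             # alternate E- and M-steps
    p_hat = e_step(labels, a)
    a = m_step(labels, p_hat)
```

As the text notes, each Q_j depends only on the labels annotator j actually provided, which is why m_step touches only the annotators appearing in `labels`.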


Figure 7.3: Plate representation of the general model. The i, j pair in the middle plate, indicating which images each annotator labels, is determined by some process that depends on the algorithm (see Sections 7.4–7.5).

7.5 Online Estimation

The factorized form of the general model in (7.1) allows for an online implementation of the EM-algorithm. Instead of asking for a fixed number of labels per image, the online algorithm actively asks for labels only for images where the target value is still uncertain. Furthermore, it finds and prioritizes expert annotators and blocks sloppy annotators online. The algorithm is outlined in Figure 7.4 and discussed in detail in the following paragraphs.

Input: Set of images U to be labeled
 1: Initialize I, L, E, B = {∅}
 2: while |I| < |U| do
 3:   Add n images {i : i ∈ (U \ I)} to I
 4:   for i ∈ I do
 5:     Compute p̂(z_i) from L_i and a
 6:     while max_{z_i} p̂(z_i) < τ and |L_i| < m do
 7:       Ask expert annotators j ∈ E to provide a label l_{ij}
 8:       if no label l_{ij} is provided within time T then
 9:         Obtain label l_{ij} from some annotator j ∈ (A_0 \ B)
10:       L_i ← {L_i ∪ l_{ij}}, A ← {A ∪ {j}}
11:       Recompute p̂(z_i) from updated L_i and a
12:   Set E, B = {∅}
13:   for j ∈ A do
14:     Estimate a_j from p̂(z_i) by max_{a_j} Q_j(a_j, â_j)
15:     if var(a_j) < θ_v then
16:       if a_j satisfies an expert criterion then
17:         E ← {E ∪ {j}}
18:       else
19:         B ← {B ∪ {j}}
20: Output p̂(z) and a

Figure 7.4: Online algorithm for estimating annotator parameters and actively choosing which images to label. The label collection step is outlined on lines 3–11, and the annotator evaluation step on lines 12–19. See Section 7.5 for details.

Initially, we are faced with a set of images U with unknown target values z. The set I ⊆ U denotes the set of images for which at least one label has been collected, and L is the set of all labels provided so far; both are initially empty. We assume that there is a large pool of annotators A_0, of different and unknown ability, available to provide labels. The set of annotators that have provided labels so far is denoted A ⊆ A_0 and is initially empty. We keep two lists of annotators:

the expert-list, E ⊆ A, a set of annotators whom we trust to provide good labels, and the bot-list, B ⊆ A, a set of annotators that we know provide low quality labels and whom we would like to exclude from further labeling. We call the latter the “bot”-list because its labels could as well have been provided by a robot choosing labels at random.

The algorithm proceeds by iterating two steps until all the images have been labeled: (1) the label collection step, and (2) the annotator evaluation step.
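The overall loop can be summarized in a short Python skeleton. This is a sketch only: collect_labels(), posterior(), and evaluate_annotators() are illustrative names, with bodies sketched after the two step descriptions below.

```python
# Skeleton of the two-step loop of Figure 7.4. The helper functions are
# illustrative stand-ins, sketched after the step descriptions in the text.

def online_labeling(U, n=10):
    """U: set of image ids. Alternates label collection and annotator
    evaluation until every image in U has been added to I and labeled."""
    I, L = set(), {}              # images seen so far, and their labels
    E, B, a = set(), set(), {}    # expert-list, bot-list, estimates of a_j
    while len(I) < len(U):
        I |= set(sorted(U - I)[:n])              # line 3: add n new images
        collect_labels(I, L, a, E, B)            # lines 4-11 of Figure 7.4
        p_hat = {i: posterior(lab, a) for i, lab in L.items()}
        E, B, a = evaluate_annotators(L, p_hat)  # lines 12-19 of Figure 7.4
    return L, a
```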

Label collection step: I is expanded with n new images from U. Next, the algorithm asks annotators to label the images in I. Annotators in E are asked first. If no annotator from E is willing to provide a label within a fixed amount of time T, the label is instead collected from an annotator in (A_0 \ B). For each image i ∈ I, new labels l_{ij} are requested until either the posterior on the target value z_i exceeds a confidence threshold τ,

    \max_{z_i} \hat{p}(z_i) > \tau,    (7.8)

or the number of labels |L_i| has reached a maximum cutoff m. It is also possible to set different thresholds for different z_i's, in which case we can trade off the costs of different kinds of target value misclassifications. The algorithm then proceeds to the annotator evaluation step.
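A sketch of this step, continuing the illustrative binary model from the EM sketch above; request_label() is a hypothetical stand-in for the actual interaction with annotators, and the pool A0 and the default values of τ, m, and T are placeholder choices.

```python
import random

# Label collection step (lines 4-11 of Figure 7.4), sketched for the binary
# one-parameter annotator model. request_label() is a placeholder for the
# real "ask an annotator, wait at most T seconds" interaction.

def request_label(i, annotators, timeout):
    """Returns (j, l_ij) from some annotator in `annotators`, or None if
    nobody answers in time. Here: a random annotator and a random label."""
    if not annotators:
        return None
    j = random.choice(sorted(annotators))
    return j, random.randint(0, 1)

def posterior(lab, a, prior_z=0.5):
    """Per-image posterior p(z_i = 1 | L_i, a), as in (7.3)."""
    p1, p0 = prior_z, 1.0 - prior_z
    for j, l in lab.items():
        p1 *= a[j] if l == 1 else 1.0 - a[j]
        p0 *= a[j] if l == 0 else 1.0 - a[j]
    return p1 / (p1 + p0)

def collect_labels(I, L, a, E, B, A0=frozenset(range(100)),
                   tau=0.95, m=10, T=60.0):
    for i in I:
        lab = L.setdefault(i, {})
        p_i = posterior(lab, a)
        # Request labels until (7.8) holds or the cutoff m is reached.
        while max(p_i, 1.0 - p_i) < tau and len(lab) < m:
            answer = request_label(i, E, timeout=T)           # experts first
            if answer is None:                                # timed out:
                answer = request_label(i, A0 - B, timeout=T)  # anyone not in B
            if answer is None:
                break                                         # nobody answered
            j, l_ij = answer
            lab[j] = l_ij                  # L_i <- L_i U {l_ij}
            a.setdefault(j, 0.7)           # prior guess for a new annotator
            p_i = posterior(lab, a)        # recompute the posterior on z_i
```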

Annotator evaluation step: Since posteriors on the image target values p̂(z_i) are computed in the label collection step, the annotator parameters can be estimated in the same manner as in the M-step of the EM-algorithm, by maximizing Q_j(a_j, â_j) in (7.7). Annotator j is put in either E or B if a measure of the variance of a_j is below a threshold,

    \mathrm{var}(a_j) < \theta_v,    (7.9)

where θ_v is the threshold on the variance. If the variance is above the threshold, we do not have enough evidence to consider the annotator to be either an expert or a bot (an unreliable annotator). If the variance is below the threshold, we place the annotator in E if a_j satisfies some expert criterion based on the annotation type; otherwise the annotator is placed in B and excluded from labeling in the next iteration.
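Continuing the same illustrative model, the evaluation step could be sketched as follows. The Beta-posterior variance plays the role of var(a_j) in (7.9), and the test a_j > expert_level stands in for the annotation-type dependent expert criterion; θ_v, the prior, and expert_level are all placeholder values.

```python
# Annotator evaluation step (lines 12-19 of Figure 7.4) for the binary
# one-parameter model. Each a_j gets a Beta posterior; its MAP value
# maximizes Q_j in (7.7), and its variance is tested against theta_v
# as in (7.9). Placement in E or B mirrors lines 15-19.

def evaluate_annotators(L, p_hat, alpha=(2.0, 2.0),
                        theta_v=0.005, expert_level=0.9):
    E, B, a = set(), set(), {}
    stats = {}                       # j -> [E[#correct labels], #labels]
    for i, lab in L.items():
        for j, l in lab.items():
            ec = p_hat[i] if l == 1 else 1.0 - p_hat[i]
            s = stats.setdefault(j, [0.0, 0.0])
            s[0] += ec
            s[1] += 1.0
    a0, b0 = alpha
    for j, (c, n) in stats.items():
        pa, pb = c + a0, (n - c) + b0                   # Beta posterior
        a[j] = (pa - 1.0) / (pa + pb - 2.0)             # MAP estimate of a_j
        var = pa * pb / ((pa + pb) ** 2 * (pa + pb + 1.0))
        if var < theta_v:                               # enough evidence?
            (E if a[j] > expert_level else B).add(j)    # expert or bot
    return E, B, a
```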

On MTurk, the expert- and bot-lists can be implemented using “qualifications”. A qualification is simply a pair of numbers, a unique qualification id and a scalar qualification score, that can be applied to any worker. Qualifications can then be used to restrict (by inclusion or exclusion) which workers are allowed to work on a particular task.
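For illustration, here is how such a qualification might be created and applied using today's boto3 MTurk client (not the API that existed at the time of this work); the qualification name and the score convention, 1 for the expert-list and 0 for the bot-list, are our own choices.

```python
import boto3

# Sketch: encode expert-/bot-list membership in one MTurk qualification.
# Requires valid AWS credentials; the name and score convention are ours.
mturk = boto3.client("mturk", region_name="us-east-1")

qual_id = mturk.create_qualification_type(
    Name="annotator-list-status",                 # hypothetical name
    Description="1 = expert-list, 0 = bot-list",
    QualificationTypeStatus="Active",
)["QualificationType"]["QualificationTypeId"]

def set_list_status(worker_id, is_expert):
    """Attach the (id, score) pair to a worker, as described in the text."""
    mturk.associate_qualification_with_worker(
        QualificationTypeId=qual_id,
        WorkerId=worker_id,
        IntegerValue=1 if is_expert else 0,
        SendNotification=False,
    )

# Passed as QualificationRequirements to create_hit(), this restricts the
# task to expert-list workers; bot-list workers (score 0) are excluded.
expert_only = [{
    "QualificationTypeId": qual_id,
    "Comparator": "EqualTo",
    "IntegerValues": [1],
}]
```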