Academic year: 2023

(1)

Pairwise Combination of Classifiers for Ensemble Learning on Data Streams

COMPX523 Presentation

Paper by Heitor Murilo Gomes, Jean Paul Barddal and Fabrício Enembreck

Presentation by Hongyu Wang

(2)

Introduction

We’ll discuss two ensemble learning pairwise voting strategies:

● Pairwise Accuracy (PA)

● Pairwise Patterns (PP)

What are voting strategies?

- Ways of summing up predictions made by individual classifiers in an ensemble.
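As a point of reference, the simplest voting strategy is plain majority voting, where every classifier contributes one equal vote. The sketch below is an illustration only; the function name `majority_vote` is hypothetical and does not come from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    `predictions` holds one predicted label per classifier.
    Ties go to the label that appears first in the list.
    """
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]
```

The pairwise strategies discussed next replace these equal votes with weights derived from how pairs of classifiers have behaved so far.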

(3)

Background & Motivation

Classifier diversity is very important to ensemble learning.

Some degree of overlap between the classifiers can almost always be expected.

Pairwise Accuracy and Pairwise Patterns can use the overlaps to support ensemble prediction.

(4)

Pairwise Accuracy

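For each possible pair of classifiers, ci and cj, we can calculate their shared accuracy (the fraction of instances that both classify correctly) and their shared error rate (the fraction of instances that both misclassify).

We also need the accuracy of each individual classifier.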
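The four statistics can be estimated from a record of which instances each classifier got right. The sketch below assumes that shared accuracy means "both correct" and shared error rate means "both wrong", as the worked example on the later slide suggests. The function name `pair_statistics` and the boolean-list input are assumptions made for illustration, not the authors' implementation.

```python
def pair_statistics(correct_i, correct_j):
    """Estimate pairwise statistics for two classifiers.

    `correct_i[k]` / `correct_j[k]` are True when classifier i / j
    classified the k-th instance correctly (hypothetical input format).
    """
    n = len(correct_i)
    if n == 0 or n != len(correct_j):
        raise ValueError("both records must be non-empty and equally long")

    both_right = sum(1 for a, b in zip(correct_i, correct_j) if a and b)
    both_wrong = sum(1 for a, b in zip(correct_i, correct_j) if not a and not b)

    return {
        "acc_i": sum(1 for a in correct_i if a) / n,   # accuracy of ci
        "acc_j": sum(1 for b in correct_j if b) / n,   # accuracy of cj
        "shared_acc": both_right / n,                  # both correct
        "shared_err": both_wrong / n,                  # both wrong
    }
```

On a data stream these counts would be maintained incrementally, for example over a sliding window, rather than recomputed from a full list; that choice is left open here.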

(5)

Pairwise Accuracy Continued

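During prediction, we have a vector, v, to store votes for the labels. If a pair of classifiers, ci and cj, vote for the same label, i.e. hi(x) == hj(x), that label receives their shared accuracy minus their shared error rate.

If they vote for different labels, i.e. hi(x) != hj(x), each label receives the accuracy of the classifier that voted for it minus the pair's shared accuracy.

The label with the most votes in the end is the overall prediction.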

(6)

Pairwise Accuracy Example

Suppose we have two classifiers, ci with an accuracy of 85% and cj with an accuracy of 75%, which have a shared accuracy of 65% and a shared error rate of 5%.

If they both predict label 0, label 0 receives vote 0.65 - 0.05 = 0.6.

If ci votes for label 0 and cj votes for label 1, label 0 receives vote 0.85 - 0.65 = 0.2, label 1 receives vote 0.75 - 0.65 = 0.1.

             cj-correct   cj-error
ci-correct   65%          20%
ci-error     10%          5%

Votes when both predict label 0:

label 0   label 1
+0.6

Votes when ci predicts label 0 and cj predicts label 1:

label 0   label 1
+0.2      +0.1
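The voting rule illustrated in this example can be sketched in Python as follows. This is a minimal sketch based only on the arithmetic shown on the slide, not the authors' implementation; the function name `pairwise_accuracy_votes` and the dictionaries keyed by classifier index pairs are assumptions.

```python
from itertools import combinations


def pairwise_accuracy_votes(predictions, acc, shared_acc, shared_err):
    """Accumulate Pairwise Accuracy votes over every pair of classifiers.

    predictions[i] : label predicted by classifier i
    acc[i]         : individual accuracy of classifier i
    shared_acc / shared_err : dicts keyed by (i, j) with i < j
    """
    votes = {}
    for i, j in combinations(range(len(predictions)), 2):
        label_i, label_j = predictions[i], predictions[j]
        if label_i == label_j:
            # Agreement: shared accuracy minus shared error rate.
            weight = shared_acc[(i, j)] - shared_err[(i, j)]
            votes[label_i] = votes.get(label_i, 0.0) + weight
        else:
            # Disagreement: each label gets its classifier's accuracy
            # minus the pair's shared accuracy.
            votes[label_i] = votes.get(label_i, 0.0) + acc[i] - shared_acc[(i, j)]
            votes[label_j] = votes.get(label_j, 0.0) + acc[j] - shared_acc[(i, j)]
    return votes


acc = [0.85, 0.75]
shared_acc = {(0, 1): 0.65}
shared_err = {(0, 1): 0.05}
agree = pairwise_accuracy_votes([0, 0], acc, shared_acc, shared_err)
disagree = pairwise_accuracy_votes([0, 1], acc, shared_acc, shared_err)
```

With the numbers from the example, `agree` holds a vote of about 0.6 for label 0, and `disagree` holds about 0.2 for label 0 and 0.1 for label 1 (up to floating-point rounding). The overall prediction would be `max(votes, key=votes.get)`.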

(7)

Pairwise Patterns

Pairwise Patterns doesn’t evaluate individual classifiers.

Instead, it records the relation between the pair of predicted labels and the correct label for each pair of classifiers and each training instance.

For prediction, it refers to a matrix of records constructed during training and votes according to the prediction pattern.

(8)

Pairwise Patterns Example

Let’s assume a pair of classifiers that haven’t predicted anything together yet, so their matrix is 0 across all patterns for all labels.

Now some training instances come in. In 7 cases where the first classifier predicts a while the second predicts b, the correct label is a once, b once and c 5 times. The training process increments the counts of a, b and c accordingly for pattern (a,b).

Now, for prediction, if the first classifier votes for a and the second votes for b, the pair will give 1 vote to a, 1 vote to b and 5 votes to c as a result of the pattern (a,b).

Before training:

pattern   a     b     c
...       ...   ...   ...
(a,b)     0     0     0
...       ...   ...   ...

After training:

pattern   a     b     c
...       ...   ...   ...
(a,b)     1     1     5
...       ...   ...   ...
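The bookkeeping in this example can be sketched as a nested counter: one matrix per pair of classifiers, indexed by prediction pattern and then by true label. The class below is a minimal illustration under that assumption; the name `PairwisePatternsSketch` and its methods are hypothetical, and votes are taken to be the raw counts, as in the example.

```python
from collections import defaultdict
from itertools import combinations


class PairwisePatternsSketch:
    def __init__(self):
        # counts[(i, j)][(label_i, label_j)][true_label] -> times observed
        self.counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

    def train(self, predictions, true_label):
        """Record the true label under each pair's prediction pattern."""
        for i, j in combinations(range(len(predictions)), 2):
            pattern = (predictions[i], predictions[j])
            self.counts[(i, j)][pattern][true_label] += 1

    def vote(self, predictions):
        """Sum the recorded counts for the patterns seen at prediction time."""
        votes = defaultdict(int)
        for i, j in combinations(range(len(predictions)), 2):
            pattern = (predictions[i], predictions[j])
            # .get avoids creating empty entries for unseen pairs or patterns.
            seen = self.counts.get((i, j), {}).get(pattern, {})
            for label, count in seen.items():
                votes[label] += count
        return dict(votes)


pp = PairwisePatternsSketch()
for true_label in ["a", "b", "c", "c", "c", "c", "c"]:
    pp.train(["a", "b"], true_label)
```

After these 7 updates, `pp.vote(["a", "b"])` gives 1 vote to a, 1 to b and 5 to c, matching the table, while an unseen pattern such as (b,a) contributes no votes.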

(9)

Experiments

The authors of the paper tested 10 data streams: 4 real datasets (Spam Corpus (SPAM), Forest Covertype (COVT), Airlines (AIRL), Electricity (ELEC)) and 6 synthetic data streams generated with the SEA generator (SEA), Agrawal generator (AGR), Random tree generator and Hyperplane generator (Hyper). PA was tested with Generic Ensemble, and PP was tested with Generic Ensemble and Leveraging Bagging. A few other ensemble methods were also tested for benchmarking purposes.

(10)

Results

PA and PP gave a statistically significant boost to the performance of Generic Ensemble and Leveraging Bagging for some data streams compared with the default voting strategy.

PA is able to adapt to drifts relatively quickly, and PP can utilise patterns even if the individual classifiers are relatively poor.

(11)

Results Continued

(12)

Summary

Pairwise Accuracy

● emphasises agreement between good classifiers

● adapts to change

Pairwise Patterns

● extracts information from prediction patterns

● can use the patterns well even with poor classifiers
