SVSCLASS: Support Vector based Stream Classifier

(1)

A Support Vector Based Approach For Classification Beyond the Learned Label Space in Data Streams

Poorya ZareMoodi Sajjad Kamali Siahroudi Hamid Beigy

(2)

Abstract

SVSCLASS (Support Vector based Stream Classifier) addresses three problems in data streams:

● Classification of novel classes

● Concept-drift

● Infinite length

Previous Work

ECSMiner: Uses decision boundaries and an ensemble of classifiers. It creates new classifiers and replaces old ones.

CLAM: Keeps an ensemble of classifiers for each class and does not forget observed classes.

LOCE: Uses a neighborhood graph. It evaluates classifiers and dedicates a pruning phase.

(3)

SVSCLASS: Support Vector based Stream Classifier

● A support vector based approach for classification beyond the learned label space in data streams.

● Dynamically maintaining boundaries by shrinking, enlarging and merging spheres.

● Adapts to both dramatic and gradual changes in the data.

● More accurate and more memory efficient.

● Handles classes with complex shapes.

Key conditions for concept evolution:

● Cohesion-separation condition: similarity

● Threshold condition: outliers.

When a new chunk arrives, the instances that fall within existing boundaries are labeled. The remaining instances are judged by their cohesion and separation.

SVC (Support Vector Clustering) is run on these remaining instances to partition them into clusters. When the true labels are received, the boundaries are updated.

(4)

Algorithm

Goal: find the smallest sphere in the feature space that encloses most of the target points and leaves most of the negative points outside it. This is a minimization problem:

Introducing Lagrange multipliers and setting to 0 the partial derivatives of L with respect to R, μ, ξi and ξl gives the constraints:

Where
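For reference, the description above matches the standard support vector data description with negative examples, which can be written as below. This is a sketch under assumptions: the feature map Φ, the penalty C2 for negative points, and the convention that i indexes target points while l indexes negative points are not named in the slides; ξi and ξl are taken to be the slack variables.

```latex
\begin{aligned}
\min_{R,\,\mu,\,\xi}\quad & R^2 + C_1 \sum_i \xi_i + C_2 \sum_l \xi_l \\
\text{s.t.}\quad & \|\Phi(x_i) - \mu\|^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0, \\
                 & \|\Phi(x_l) - \mu\|^2 \ge R^2 - \xi_l, \qquad \xi_l \ge 0 .
\end{aligned}
```

Under the same assumptions, setting the partial derivatives of the Lagrangian to zero yields:

```latex
\sum_i \alpha_i - \sum_l \alpha_l = 1, \qquad
\mu = \sum_i \alpha_i \Phi(x_i) - \sum_l \alpha_l \Phi(x_l), \qquad
0 \le \alpha_i \le C_1, \quad 0 \le \alpha_l \le C_2 .
```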

(5)

Algorithm

According to the values of the Lagrange multipliers αi, the target points are classified into three types.

1) Inner points that lie either inside or on the sphere surface, αi = 0.

2) Support vectors (SVs) that lie on the sphere surface, 0 < αi < C1.

3) Bounded support vectors (BSVs) that lie either outside or on the sphere surface, αi = C1.

This sphere is mapped back into data space as several components, each enclosing a separate cluster of points.

SVC is used to identify these clusters. BSVs are assigned to the cluster that they are closest to.
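The three point types listed above can be read directly off the multipliers. The following is a minimal sketch; the function name `point_types` and the numerical tolerance `tol` are illustrative assumptions, not part of the original method.

```python
import numpy as np

def point_types(alpha, C1, tol=1e-8):
    """Label each target point 'inner', 'sv' or 'bsv' from its multiplier.

    tol is an assumed numerical tolerance for comparing against 0 and C1.
    """
    alpha = np.asarray(alpha, dtype=float)
    types = np.full(alpha.shape, "sv", dtype=object)  # 0 < alpha_i < C1
    types[alpha <= tol] = "inner"                     # alpha_i = 0
    types[alpha >= C1 - tol] = "bsv"                  # alpha_i = C1
    return types
```

For example, `point_types([0.0, 0.05, 0.1], C1=0.1)` marks the first point as inner, the second as a support vector and the third as a bounded support vector.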

(6)

Maintaining Boundaries

After the true labels are received, the sphere sets are updated:

● Create a new set and sphere for a novel class.

● Enlarge spheres for existing classes.

● Or create a new sphere if applicable.

● Shrink a sphere if instances of other classes are located within it.

● Merge some of a class's spheres if applicable.

For each class i, three instance sets Wi, Oi and Ai are created:

● Wi: instances of class i that lie inside its spheres,

● Oi: instances of class i that lie outside its spheres,

● Ai: instances that lie inside the spheres of class i but in fact belong to other classes.
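The three sets can be sketched as a simple partition of a labeled chunk. This is an illustration only: it assumes each sphere is a Euclidean ball given as a `(center, radius)` pair in data space, whereas the method itself works with spheres in a kernel feature space. The names `inside_any` and `partition_for_class` are hypothetical.

```python
import numpy as np

def inside_any(x, spheres):
    """True if x lies within at least one (center, radius) sphere."""
    return any(np.linalg.norm(x - c) <= r for c, r in spheres)

def partition_for_class(X, y, class_id, spheres):
    """Split a labeled chunk into the W, O and A sets of one class.

    spheres: list of (center, radius) pairs currently held for class_id.
    """
    W, O, A = [], [], []
    for x, label in zip(X, y):
        inside = inside_any(x, spheres)
        if label == class_id:
            (W if inside else O).append(x)  # own class: inside -> W, outside -> O
        elif inside:
            A.append(x)                     # other class caught inside -> A
    return W, O, A
```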

(7)

Creating and enlarging

Oi above threshold: create a new sphere for Oi, and remove the oldest sphere when the sphere set is full.

Oi below threshold: find the nearest sphere for Oi and add to the update list U.

Wi: find the nearest sphere for Wi and add to the update list U.

The new SVset and BSVset are created from the old SVset, the old BSVset, and U, minus the inner instances.
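The routing logic above can be sketched as follows. Several details are assumptions made for illustration: the threshold is taken to be on the number of instances in Oi (`min_new`), "nearest sphere" is measured by distance to the sphere center, the sphere set is assumed non-empty, and the new sphere is a crude stand-in (centroid plus maximum distance) rather than the support vector solution the method actually computes.

```python
from collections import deque
import numpy as np

def nearest_sphere(x, spheres):
    """Index of the sphere whose center is closest to x."""
    return min(range(len(spheres)),
               key=lambda k: np.linalg.norm(x - spheres[k][0]))

def create_or_enlarge(W, O, spheres, min_new, max_spheres):
    """spheres: deque of (center, radius), oldest first. Returns update list U."""
    U = []
    if len(O) > min_new:
        pts = np.asarray(O)
        center = pts.mean(axis=0)                      # stand-in for the SV solution
        radius = float(np.max(np.linalg.norm(pts - center, axis=1)))
        if len(spheres) >= max_spheres:
            spheres.popleft()                          # drop the oldest sphere
        spheres.append((center, radius))
    else:
        U += [(nearest_sphere(x, spheres), x) for x in O]
    U += [(nearest_sphere(x, spheres), x) for x in W]
    return U
```

Each entry of `U` pairs a sphere index with an instance, so a later step can rebuild that sphere's SVset and BSVset from its old summary plus the new instances.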

(8)

Shrinking

For each class i, find the set Ai of instances of other classes that are located within the boundaries of its spheres.

The new SVset and BSVset would have to come from the old inner instances, but these have been discarded. So some points are generated inside the sphere to play the role of the discarded instances.

Then the Lagrange multipliers are found with the union of SVset and the generated instances as target points, and Ai as negative points.

If the new sphere’s radius is below 90% of the old one, the old sphere's structure summary is replaced by the new one.
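Two pieces of the shrinking step lend themselves to a short sketch: generating stand-in points inside a sphere and applying the 90% replacement rule. The sampling below assumes a Euclidean ball in data space and a uniform distribution, neither of which is stated in the slides; `sample_in_ball` and `should_replace` are hypothetical names.

```python
import numpy as np

def sample_in_ball(center, radius, n, rng):
    """Draw n points uniformly from the d-dimensional ball around center."""
    center = np.asarray(center, dtype=float)
    d = center.shape[0]
    directions = rng.normal(size=(n, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)  # unit vectors
    radii = radius * rng.uniform(size=(n, 1)) ** (1.0 / d)           # uniform in volume
    return center + directions * radii

def should_replace(old_radius, new_radius, ratio=0.9):
    """Replace the old summary only if the new radius is below 90% of the old."""
    return new_radius < ratio * old_radius
```

The generated points would be combined with SVset as target points before the multipliers are recomputed; that optimization is outside this sketch.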

(9)

Merging spheres

To reduce memory usage, some spheres are merged where applicable.

For each class, calculate the distance between each pair of its spheres.

If the distance between two spheres is less than the threshold h, the two old spheres are merged into a new sphere.
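A minimal sketch of the merging pass for one class follows. It assumes the distance between two spheres is the Euclidean distance between their centers, and it builds the merged sphere as the smallest ball enclosing both old ones; the slides do not say how the merged sphere is constructed, so both choices are illustrative.

```python
from itertools import combinations
import numpy as np

def merge_two(s1, s2):
    """Smallest ball enclosing two balls given as (center, radius)."""
    (c1, r1), (c2, r2) = s1, s2
    d = float(np.linalg.norm(c2 - c1))
    if d + r2 <= r1:          # s2 already inside s1
        return s1
    if d + r1 <= r2:          # s1 already inside s2
        return s2
    R = (d + r1 + r2) / 2.0
    center = c1 + (c2 - c1) * ((R - r1) / d)
    return center, R

def merge_close_spheres(spheres, h):
    """Repeatedly merge any pair of spheres whose centers are closer than h."""
    spheres = list(spheres)
    merged = True
    while merged:
        merged = False
        for a, b in combinations(range(len(spheres)), 2):
            if np.linalg.norm(spheres[a][0] - spheres[b][0]) < h:
                new = merge_two(spheres[a], spheres[b])
                spheres = [s for k, s in enumerate(spheres) if k not in (a, b)]
                spheres.append(new)
                merged = True
                break
    return spheres
```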

(10)

Classification and Novel Class Detection

Classification in the first step:

● Classify instances within the spheres of existing classes.

● The remaining instances may belong to a novel class or be outliers of existing classes.

● Verify both cohesion-separation and threshold conditions.

● Separate them into clusters by SVC.

In the next step:

● Verify threshold condition on the number of instances.

● Label novel classes.

● Label outliers.
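The two steps above can be sketched as below. The sketch assumes Euclidean balls given as `(label, center, radius)`, breaks ties between overlapping spheres by the nearest center, and takes the cluster assignment of the outside instances as an input (the SVC clustering and the cohesion-separation check are not reproduced). The names `classify_inside`, `label_outside_clusters` and `min_size` are hypothetical.

```python
from collections import Counter
import numpy as np

def classify_inside(X, spheres):
    """Return a class label per instance, or None if it lies outside all spheres."""
    out = []
    for x in X:
        hits = [(np.linalg.norm(x - c), lab)
                for lab, c, r in spheres
                if np.linalg.norm(x - c) <= r]
        # Assumed tie-break: take the sphere with the nearest center.
        out.append(min(hits, key=lambda t: t[0])[1] if hits else None)
    return out

def label_outside_clusters(cluster_ids, min_size):
    """Threshold condition: clusters with enough instances are novel, others outliers."""
    sizes = Counter(cluster_ids)
    return ["novel" if sizes[c] >= min_size else "outlier" for c in cluster_ids]
```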

(11)

EXPERIMENTS - Parameter Analysis

Mnew, Fnew and ERR are the performance metrics used for evaluation.

Mnew is the ratio of novel class instances that are classified as one of the existing classes.

Fnew is the ratio of existing class instances that are classified as novel.

ERR is the ratio of all the stream’s instances that are misclassified.
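The three metrics can be computed from per-instance predictions as in the sketch below. It assumes a novel prediction is encoded with the marker `NOVEL = -1` and that a ground-truth flag `is_novel` says whether each instance belonged to a novel class when it arrived; both conventions are illustrative.

```python
import numpy as np

NOVEL = -1  # assumed marker for "predicted as novel class"

def stream_metrics(true_labels, pred_labels, is_novel):
    """Return (Mnew, Fnew, ERR) as fractions in [0, 1]."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    is_novel = np.asarray(is_novel, dtype=bool)
    pred_novel = pred_labels == NOVEL

    fn = np.sum(is_novel & ~pred_novel)    # novel instances labeled as existing
    fp = np.sum(~is_novel & pred_novel)    # existing instances labeled as novel
    fe = np.sum(~is_novel & ~pred_novel & (pred_labels != true_labels))

    m_new = fn / max(is_novel.sum(), 1)
    f_new = fp / max((~is_novel).sum(), 1)
    err = (fp + fn + fe) / len(true_labels)
    return float(m_new), float(f_new), float(err)
```

With two novel and three existing instances, of which one novel instance is missed, one existing instance is flagged as novel and one is given the wrong existing class, the function returns Mnew = 0.5, Fnew = 1/3 and ERR = 0.6.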

(12)

EXPERIMENTS - Comparison with others

Results: SVSClass has the best performance in all measures and datasets except for Mnew on KDD.

Reason: it models intricately shaped class boundaries more accurately.

Performance comparison of SVSClass, CLAM, LOCE and ECSMiner on the selected datasets.
