• Tidak ada hasil yang ditemukan

A Fruitful Data Mining Using Orange

N/A
N/A
Protected

Academic year: 2017

Membagikan "A Fruitful Data Mining Using Orange"

Copied!
2
0
0

Teks penuh

(1)

CLEAR December 2013 Page 25 This article is an introduction to Orange data

mining tool and its use with python scripts, not

in details, in a brief. Have a fun..

Data Mining is the computational process of

discovering patterns in large data sets involving

methods at the intersection of artificial

intelligence, machine learning, statistics, and

database systems. This area has special

applications in areas like medical science,

business intelligence, and similar areas of real

life. There are many open source and

proprietary softwares tools are available for data

mining applications and research. Orange

(http://orange.biolab.si) is a general-purpose,

open source machine learning and data mining

tool. It includes a set of components for data

preprocessing, feature scoring and filtering,

modeling, model evaluation, and exploration

techniques.

Here the python scripting using orange package

is discussed.

Install Orange:

To build and install Orange you can use the

setup.py in the root orange directory (requires

GCC, Python and numpy development headers).

The details will be available in the orange

documentation page

(http://orange.biolab.si/download/).

Test the installation

After installing orange, type the following in

python interactive shell.

>>>import Orange

>>>Orange.version.version

'2.6a2.dev-a55510d'

If this leaves no error and warning, Orange and

Python are properly installed and you are ready

to continue.

A Fruitful Data Mining Using

ORANGE

Manu Madhavan

(2)

CLEAR December 2013 Page 26 Data Input

Orange can read files in native and other data

formats. Native format starts with feature

(attribute) names, their type (continuous,

discrete, string). The third line contains meta

information to identify dependent features

(class), irrelevant features (ignore) or meta

features (meta).

You may download lenses.tab to a target

directory and there open a python shell.

>>>import Orange

>>>data=Orange.data.Table("lense s")

>>>

Data mining using Orange-python

The orange python can be used for all data

mining applications like, classification,

prediction, clustering and learning.

Here I am illustrating, how orange can be used

for classification.

For classification, you need a training data set

and test data set. The data readable by orange

methods are stored in a .tab file (stored as

Tables). The table contains different features of

the data and class value. For training data, the

class value will be the actual class, in which the

feature vector belongs to. In case of testing data

set, the class value will be absent, which have to

be predicted by a learning function. Orange

have a vast variety of learning functions like,

KNN, Least Mean Square Error, Naive- Bayes,

Logistic/Linear regression, etc. The following

sample code use KNN method to predict the

class value of test data set.

Let the training dataset is stored in trainset. tab

and test dataset is stored in testset. tab. The

following script will print the class of test data.

Let the training dataset is stored in trainset. tab

and test dataset is stored in testset. tab. The

following script will print the class of test data.

>>>train=Orange.data.Table("trai nset")

>>>test=Orange.data.Table("tests et")

>>>learner=Orange.classification .knn.kNNLearner()

>>>classifier = learner(train)

>>>for i in range (len(trainset)):

...print i,classifier(test[i])

This will print the class of each vector in test

set. You can verify the result by comparing the

results with that by using other learning tools.

Referensi

Dokumen terkait

You can use the following math magician game in class where you can make the kids believe that math is a tricky and interesting subject.. Here is the math

If you can get a sample test, or a previous version, so you know what kind of questions will be on it, you’ll be more likely to study the right things!. In general, test-anxiety is

Your attitude is the most powerful tool for positive action that can help you become successful, so you will need to understand this before you can work on the following steps

For example, you can defi ne an abstract generic base class that is implemented with a concrete type in the derived class.. Consider the following example, where the class

This class will simply extend the Attendees class (which in turn extends the Observer class), so you have access to the addAttendee and removeAttendee functions and can have

The following example demonstrates both declaration practices, in addition to demonstrating how public methods can be called from outside the class: The following is the result:

Conclusions Based on data analysis and comparison of the result of pre-test and post-test in experiment class a and control class c, it was found that: There was an effect of the use

The result of the analysis from the play script The Humans 2015 shows about the anxiety of American middle class that experienced by protagonist can be seen through causes and symptoms