Academic year: 2021


Course: Big Data and Data Analytics

By: Lecturer Team

CONCEPT/FRAMEWORK BIG DATA ANALYTICS ACTIVITIES


Creating the great business leaders

o Review Data Analytics & Big Data (last week's topics)
o Understanding Data
o Activity / Storytelling Based on Data Type (Model Based)
o Asking the Questions to Data


Telkom University

Definitions

Analytics:

“the systematic computational analysis of data or statistics” (Google definition)

“the method of logical analysis” (Merriam-Webster)

Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.

Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, [1][2] which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics (from Wikipedia, citing many references), and many more.

1. Dhar, V. (2013). "Data science and prediction". Communications of the ACM. 56 (12): 64. doi:10.1145/2500499.
2. Jeff Leek (2013-12-12). "The key word in "Data Science" is not Data, it is Science". Simply Statistics.


Data Science

Based on the aforementioned definitions, we can conclude that Data Analytics includes:

• Data Engineering
• Scientific Method
• Math
• Statistics

Data Engineering may include:

• Data Gathering
• Data Mining
• Data Transformation
• Data Cleansing
• etc.


Our approach to Big Data


Big Data Approach Framework

Some people prefer 3Vs, 6Vs, 7Vs, or even 12Vs to explain big data. But the original “bigness” measurement metrics are volume, velocity, and variety.

For example, the 7Vs:

1. Volume
2. Velocity
3. Variety
4. Variability
5. Veracity
6. Visualization
7. Value


Data Set -> Data Analytics Methods -> Knowledge


UNDERSTANDING DATA : High Dimensional Data

Name | Address | Occupation | Age | Blood Type | Marital Status | ... | Sex
Agus | Jl. Mawar 1 | Artist | 30 | A | Married | ... | Male
Andry | Jl. Kucing 50 | Lawyer | 32 | O | Married | ... | Male
Beatrice | Jl. Raya 27 | Student | 21 | O | Single | ... | Female
Ben | Jl. Diponegoro 12 | Driver | 37 | AB | Married | ... | Male
... | ... | ... | ... | ... | ... | ... | ...
Zorro | Jl. Dago 34 | Student | 18 | B | Single | ... | Male

Each column is a dimension (also called an attribute or property).

High-dimensional data adds complexity to Big Data analytics.

Curse: the search space is large, so the data needs summarization and dimensionality reduction (e.g., PCA).

Blessing: comprehensive knowledge of the data.


UNDERSTANDING DATA : Network vs Non Network Data

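Non-Network Data

Name | Sex | Age | Number of Friends
Agus | Male | 25 | 2
Cecep | Male | 23 | 3
Dita | Female | 21 | 2
Rina | Female | 22 | 1

Network Data

      | Agus | Cecep | Dita | Rina
Agus  | - | 1 | 1 | 0
Cecep | 1 | - | 1 | 1
Dita  | 1 | 1 | - | 0
Rina  | 0 | 1 | 0 | -

(Figure: friendship graph with nodes Agus, Cecep, Dita, and Rina)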
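The link between the two views can be sketched in a few lines of Python: the "Number of Friends" column of the non-network table is simply the row sum of the friendship matrix. This is a minimal illustration, not part of the original slides; the function names are hypothetical, and the "-" entries on the diagonal are assumed to mean 0 (nobody is their own friend).

```python
# Friendship matrix from the slide; the "-" diagonal is assumed to be 0.
names = ["Agus", "Cecep", "Dita", "Rina"]
adjacency = [
    [0, 1, 1, 0],  # Agus
    [1, 0, 1, 1],  # Cecep
    [1, 1, 0, 0],  # Dita
    [0, 1, 0, 0],  # Rina
]


def friend_counts(names, adjacency):
    """Return {name: number of friends} by summing each row of the matrix."""
    return {name: sum(row) for name, row in zip(names, adjacency)}


def friends_of(person, names, adjacency):
    """Return the names linked to `person` in the matrix."""
    i = names.index(person)
    return [names[j] for j, linked in enumerate(adjacency[i]) if linked]


counts = friend_counts(names, adjacency)
print(counts)
print(friends_of("Rina", names, adjacency))
```

The network form keeps who is connected to whom, while the non-network form keeps only the count, so the matrix can reproduce the table but not the other way around.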


UNDERSTANDING DATA :


Characteristics

Structured Data | Unstructured Data
Well-defined content | Structure not obvious
Easily understood | Data must be processed to be understood
Stored in RDBMS | RDBMS not a good fit
Easy to enter, store, and analyze | Difficult and costly to analyze
Example: data in database tables (customer data, sales data, sensor data) | Example: email, video files, audio files, web pages, presentations, social media feeds


UNDERSTANDING DATA : SQL vs NoSQL



Case Studies : Data Analytics Common Roles

1. Estimation

2. Prediction

3. Classification

4. Clustering


1. Estimation

Estimate Pizza Delivery Time

Customer | Number of Orders (O) | Number of Traffic Lights (TL) | Distance (D) | Delivery Time (T)
1 | 3 | 3 | 3 | 16
2 | 1 | 7 | 4 | 20
3 | 2 | 4 | 6 | 18
4 | 4 | 6 | 8 | 36
... | ... | ... | ... | ...
1000 | 2 | 4 | 2 | 12

Delivery Time (T) is the label.

Learning with Estimation Methods (Linear Regression)

Knowledge: Delivery Time (T) = 0.48O + 0.23TL + 0.5D
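As a rough illustration of how such a formula can be learned, the sketch below fits T = a*O + b*TL + c*D by ordinary least squares (normal equations, no intercept). It uses only the five rows visible in the table, so its coefficients are not expected to match the formula on the slide, which was presumably learned from all 1000 rows. The helper names (`solve`, `fit_no_intercept`, `predict`) are hypothetical.

```python
# The five rows visible on the slide: (orders, traffic_lights, distance, delivery_time).
rows = [
    (3, 3, 3, 16),
    (1, 7, 4, 20),
    (2, 4, 6, 18),
    (4, 6, 8, 36),
    (2, 4, 2, 12),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_no_intercept(rows):
    """Least-squares coefficients (a, b, c) for T = a*O + b*TL + c*D."""
    xs = [r[:3] for r in rows]
    ts = [r[3] for r in rows]
    # Normal equations: (X^T X) beta = X^T t
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(3)] for i in range(3)]
    xtt = [sum(x[i] * t for x, t in zip(xs, ts)) for i in range(3)]
    return solve(xtx, xtt)


def predict(coeffs, orders, traffic_lights, distance):
    """Estimated delivery time for one order."""
    a, b, c = coeffs
    return a * orders + b * traffic_lights + c * distance


coeffs = fit_no_intercept(rows)
print([round(v, 3) for v in coeffs])
print(round(predict(coeffs, 2, 4, 2), 2))
```

The fitted coefficients are the "knowledge": once learned, they estimate the delivery time of a new order without consulting the original table.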


Output / Pattern / Model / Knowledge

1. Formula / Function (regression formula or function)
• DELIVERY TIME = 0.48 + 0.6 DISTANCE + 0.34 TRAFFIC LIGHT + 0.2 ORDER

2. Decision Tree

3. Correlation and Association

4. Rule
• IF GPA > 3.5 THEN graduate cum laude


2. Prediction

Stock price data set in the form of a time series

Learning with Prediction Methods (Neural Network)


2. Prediction

Predict Stock Price

Knowledge in the form of a Neural Network model


3. Classification

Classify Student Graduation Time

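Student Number | Sex | National Final Score | School of Origin | IPS1 | IPS2 | IPS3 | IPS4 | ... | Graduation Status
10001 | L | 28 | SMAN 2 | 3.3 | 3.6 | 2.89 | 2.9 | ... | On Time
10002 | P | 27 | SMA DK | 4.0 | 3.2 | 3.8 | 3.7 | ... | Late
10003 | P | 24 | SMAN 1 | 2.7 | 3.4 | 4.0 | 3.5 | ... | Late
10004 | L | 26.4 | SMAN 3 | 3.2 | 2.7 | 3.6 | 3.4 | ... | On Time
... | ... | ... | ... | ... | ... | ... | ... | ... | ...
11000 | L | 23.4 | SMAN 5 | 3.3 | 2.8 | 3.1 | 3.2 | ... | On Time

(Sex: L = male, P = female. IPS1 to IPS4 = grade point average for semesters 1 to 4.)

Learning with Classification Methods (C4.5)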


3. Classification

Classify Student Graduation Time

Knowledge in the form of a Decision Tree model


3. Classification

Golf Playing Time Recommendation

Input

Output

If outlook = sunny and humidity = high then play = no

If outlook = rainy and windy = true then play = no

If outlook = overcast then play = yes

If humidity = normal then play = yes

If none of the above then play = yes
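The rules above can be read as an ordered decision list: they are checked from top to bottom and the first one that matches decides the answer. A minimal Python sketch follows; the function name `play_golf` and the encoding of `windy` as a boolean are assumptions made for illustration.

```python
def play_golf(outlook, humidity, windy):
    """Apply the slide's rules in order; the first matching rule decides."""
    if outlook == "sunny" and humidity == "high":
        return "no"
    if outlook == "rainy" and windy:
        return "no"
    if outlook == "overcast":
        return "yes"
    if humidity == "normal":
        return "yes"
    return "yes"  # none of the above


# A few weather conditions to try (made up for the example).
cases = [
    ("sunny", "high", False),
    ("rainy", "normal", True),
    ("overcast", "high", True),
    ("sunny", "normal", False),
    ("rainy", "high", False),
]
results = [play_golf(*case) for case in cases]
for case, answer in zip(cases, results):
    print(case, "->", answer)
```

Order matters here: a rainy, windy day with normal humidity gets "no" because the second rule fires before the humidity rule is reached.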


3. Classification

Golf Playing Time Recommendation

Output


3. Classification

Contact Lens Recommendation

Input


4. Clustering

Finding Iris Flower Cluster

Input

Dataset without Label
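To give an idea of how clusters can be found in unlabeled data, here is a minimal k-means sketch in Python. The slides do not say which clustering method was used, so k-means is an assumption, and the six points are made-up (length, width) measurements rather than the actual Iris data set.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means: returns (cluster index per point, final centroids)."""
    # Naive initialization (an assumption of this sketch): the first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return assignments, centroids


# Hypothetical unlabeled measurements (length, width).
points = [
    (1.4, 0.2), (1.3, 0.2), (1.5, 0.3),
    (4.7, 1.4), (4.5, 1.5), (4.9, 1.5),
]
assignments, centroids = kmeans(points, k=2)
print(assignments)
print(centroids)
```

No label is given to the algorithm; the grouping comes only from the distances between points.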


4. Clustering

Output (Distance Plot)


5. Association

Association Product Sold


5. Association

• The objective of an association rule algorithm is to find attributes that appear “together”.

• Example: on Thursday night there were 1000 customers, 200 of whom bought Soap; of those 200, 50 also bought Fanta.

• As an association rule, we have “If buy Soap, then buy Fanta”, with support value = 200/1000 = 20% and confidence value = 50/200 = 25%. Here support is measured on the antecedent (Soap); under the more common definition, the support of a rule counts the customers who bought both items, which gives 50/1000 = 5%.

• Association rule algorithms include the Apriori algorithm.
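The arithmetic of the Soap and Fanta example can be checked with a short Python sketch. The baskets are rebuilt from the counts on the slide; the contents of the other 800 baskets are unknown, so a placeholder item "Other" is assumed, and the function name `rule_metrics` is hypothetical. The sketch reports both the support of the antecedent (the 200/1000 figure used on the slide) and the support of the whole rule (both items together).

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence for the rule 'if antecedent then consequent'."""
    n = len(transactions)
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    return {
        "antecedent_support": len(with_antecedent) / n,
        "rule_support": len(with_both) / n,
        "confidence": len(with_both) / len(with_antecedent),
    }


# 1000 customers: 200 bought Soap, and 50 of those also bought Fanta.
transactions = (
    [{"Soap", "Fanta"}] * 50
    + [{"Soap"}] * 150
    + [{"Other"}] * 800  # placeholder: the real contents are not given
)

metrics = rule_metrics(transactions, "Soap", "Fanta")
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds; this sketch only evaluates one given rule.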




o Find a Case Study of Big Data Implementation / Application for Business or others
o State the objective, problems, solution idea
o State the methodology used (explain)
o State the model, measurement, accuracy, evaluation
o Learn Big Data online free course (www.bigdatauniversity.com)
