Academic year: 2023

Daffodil International University

Department of Computer Science and Engineering

Faculty of Science & Information Technology

Final Examination, Summer 2020 @ DIU Blended Learning Center

Course Code: CSE450 (Day), Course Title: Data Mining

Level: 4, Term: 2, Section: O5, O9, O11

Instructor: FZA

Modality: Open Book

Exam Date: Thursday, 20 August 2020, Time: 02:00 pm-06:00 pm

Four hours (4:00) to support online open/case-study-based assessment

Marks: 40

Directions:

Students need to go through the CASE STUDY shown in this exam paper.

Analyze and answer each section based on your own thinking and work.

Do not share your answers; sharing will be treated as plagiarism by the Blended Learning Center.

Use Case-1

You are given the transaction data shown in the table below from a fast-food restaurant. There are 9 distinct transactions (order:1 – order:9), and each transaction involves between 2 and 4 meal items.

There are a total of 5 meal items that are involved in the transactions. For simplicity we assign the meal items short names (M1 – M5) rather than the full descriptive names (e.g., Big Mac).

For all of the parts below, the minimum support is 2/9 (≈0.222) and the minimum confidence is 7/9 (≈0.778). Note that you only need to reach these levels, not exceed them. Show your work for full credit.
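
The thresholds above rely on the definitions of support and confidence, which this paper does not spell out. Assuming the standard definitions, with T denoting the set of 9 transactions and X, Z denoting itemsets (the symbols T, X, and Z are notation introduced here for illustration):

```latex
\[
\mathrm{support}(X) = \frac{\lvert \{\, t \in T : X \subseteq t \,\} \rvert}{\lvert T \rvert},
\qquad
\mathrm{confidence}(X \rightarrow Z) = \frac{\mathrm{support}(X \cup Z)}{\mathrm{support}(X)}
\]
```

Under these definitions, an itemset is frequent when it appears in at least 2 of the 9 transactions, and a rule is strong when its confidence is at least 7/9 ≈ 0.778.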

Question-1(a) Apply the Apriori algorithm to the above dataset of transactions and identify all frequent k-itemsets. Show all of your work. You must show all candidates, but you may cross off those that fail the minimum support threshold. (5 marks)
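
The level-wise procedure asked for in Question-1(a) can be sketched in Python as follows. This is a minimal illustration, not a model answer: the `TRANSACTIONS` list is a hypothetical placeholder (9 orders over items M1 to M5, each with 2 to 4 items) and is not the table from this exam paper. The function name `apriori` and its parameters are likewise assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations

# HYPOTHETICAL placeholder data: NOT the table from the exam paper.
TRANSACTIONS = [
    {"M1", "M2", "M5"}, {"M2", "M4"}, {"M2", "M3"},
    {"M1", "M2", "M4"}, {"M1", "M3"}, {"M2", "M3"},
    {"M1", "M3"}, {"M1", "M2", "M3", "M5"}, {"M1", "M2", "M3"},
]

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def count(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    # C1: every single item is a candidate 1-itemset.
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Keep only candidates that reach the minimum support (L_k).
        level = {c: count(c) for c in candidates}
        level = {c: s for c, s in level.items() if Fraction(s, n) >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        keys = list(level)
        joined = {a | b for a in keys for b in keys if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

if __name__ == "__main__":
    result = apriori(TRANSACTIONS, Fraction(2, 9))
    for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), f"{support}/9")
```

Using `Fraction` keeps the comparison against 2/9 exact, so an itemset that appears in exactly 2 of 9 transactions is not lost to floating-point rounding.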

Question-1(b) Find all strong association rules of the form X ∧ Y → Z and note their confidence values. (3 marks)

Question-1(c) Justify, with some real-life examples, that association rule mining can be used for extracting product sales patterns from retail store transactions. (7 marks)
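
For rules of the form X ∧ Y → Z in Question-1(b), the confidence check can be sketched as below. As before, this is an illustration under stated assumptions: `TRANSACTIONS` is hypothetical placeholder data rather than the exam's table, and the helper names `support_count` and `strong_rules` are invented for the example.

```python
from fractions import Fraction

# HYPOTHETICAL placeholder data: NOT the table from the exam paper.
TRANSACTIONS = [
    {"M1", "M2", "M5"}, {"M2", "M4"}, {"M2", "M3"},
    {"M1", "M2", "M4"}, {"M1", "M3"}, {"M2", "M3"},
    {"M1", "M3"}, {"M1", "M2", "M3", "M5"}, {"M1", "M2", "M3"},
]

def support_count(itemset, transactions):
    """Number of transactions containing every item of the itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t))

def strong_rules(itemset, transactions, min_conf):
    """List rules (X, Y) -> Z built from a 3-itemset whose confidence >= min_conf."""
    rules = []
    whole = support_count(itemset, transactions)
    for z in sorted(itemset):
        antecedent = tuple(sorted(set(itemset) - {z}))
        base = support_count(antecedent, transactions)
        if base == 0:
            continue  # the antecedent never occurs, so confidence is undefined
        # confidence(X and Y -> Z) = count(X, Y, Z) / count(X, Y)
        conf = Fraction(whole, base)
        if conf >= min_conf:
            rules.append((antecedent, z, conf))
    return rules

if __name__ == "__main__":
    for triple in ({"M1", "M2", "M5"}, {"M1", "M2", "M3"}):
        for (x, y), z, conf in strong_rules(triple, TRANSACTIONS, Fraction(7, 9)):
            print(f"{x} ^ {y} -> {z}  confidence = {conf}")
```

Only frequent 3-itemsets need to be passed in, since a rule built from an infrequent itemset cannot meet the minimum support.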

Use Case-2

A pizza chain wants to open delivery centers across a city. What do you think the possible challenges would be?

They need to analyze the areas from where the pizza is being ordered frequently.

They need to determine how many pizza stores have to be opened to cover delivery in each area.

They need to choose the locations of the pizza stores within these areas so that the distance between each store and its delivery points is kept to a minimum.

Resolving these challenges involves a lot of analysis and mathematics.

At the beginning of this analysis, the following six points are chosen for the store and delivery centers:

A = (1, 2), B = (2, 2), C = (2, 1), D = (-1, 4), E = (-2, -1), F = (-1, -1)

Question-2(a) Starting from the initial clusters Cluster-1 = {A}, which contains only the point A, and Cluster-2 = {D}, which contains only the point D, run the K-means clustering algorithm and report the final clusters. Use the L1 distance as the distance between points, which is given by

d((x1, y1), (x2, y2)) = |x1 − x2| + |y1 − y2|

Draw the points of the stores and pizza delivery centers on a 2-D grid and check whether the clusters make sense. (7 marks)
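
The procedure in Question-2(a) can be sketched in Python as follows, using the six points given above. This is one possible reading, not a model answer: it assumes centroids are coordinate-wise means (as in standard K-means) while assignments use the L1 distance. The function name `kmeans_l1` and the `tie_break` parameter are assumptions made for this example. Note that under L1, points E and F are equidistant from the initial centers A and D (distance 6 and 5 respectively), so the final clusters depend on how ties are broken; the paper does not state a rule.

```python
from fractions import Fraction

POINTS = {
    "A": (1, 2), "B": (2, 2), "C": (2, 1),
    "D": (-1, 4), "E": (-2, -1), "F": (-1, -1),
}

def l1(p, q):
    """L1 (Manhattan) distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def kmeans_l1(points, centers, tie_break="first", max_iter=100):
    """K-means with L1 assignment and mean centroids.

    tie_break: "first" sends a tied point to the lowest-numbered cluster,
    "last" to the highest-numbered one (an assumption; the exam gives no rule).
    """
    # Exact rational arithmetic so that ties are detected reliably.
    centers = [tuple(Fraction(v) for v in c) for c in centers]
    assignment = None
    for _ in range(max_iter):
        new_assignment = {}
        for name, p in points.items():
            dists = [l1(p, c) for c in centers]
            best = min(dists)
            tied = [i for i, d in enumerate(dists) if d == best]
            new_assignment[name] = tied[0] if tie_break == "first" else tied[-1]
        if new_assignment == assignment:
            break  # assignments stopped changing: converged
        assignment = new_assignment
        for i in range(len(centers)):
            members = [points[n] for n, k in assignment.items() if k == i]
            if members:  # keep the old center if a cluster becomes empty
                centers[i] = (
                    Fraction(sum(x for x, _ in members), len(members)),
                    Fraction(sum(y for _, y in members), len(members)),
                )
    clusters = [
        sorted(n for n, k in assignment.items() if k == i)
        for i in range(len(centers))
    ]
    return clusters, centers

if __name__ == "__main__":
    start = [POINTS["A"], POINTS["D"]]
    for rule in ("first", "last"):
        clusters, centers = kmeans_l1(POINTS, start, tie_break=rule)
        print(rule, clusters, [tuple(float(v) for v in c) for c in centers])
```

Running both tie-breaking rules side by side makes it easy to compare the two outcomes against the plot on the 2-D grid.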

Question-2(b) Describe four similar example scenarios in which a clustering algorithm can be used to analyze the data. (8 marks)

Use Case-3

There is a strong linkage between statistical data analysis and data mining. Some people think of data mining as automated and scalable methods for statistical data analysis.

Question-3(a) Do you agree or disagree with this perception? Explain your opinion. (5 marks)

Question-3(b) Present one statistical analysis method that can be automated and/or scaled up nicely by integration with current data mining methodology. (5 marks)
