Daffodil International University
Department of Computer Science and Engineering
Faculty of Science & Information Technology
Semester Final Examination, Fall 2020 @ DIU Blended Learning Center
Course Code: CSE 450 (Day), Course Title: Data Mining
Level: 4 Term: 2 Section: O11, PC_B – PC_F Instructor: MU Modality: Open Book Exam
Date: Thursday 17 December, 2020 Time: 02:00 pm - 06:00 pm
Four hours (4 hrs) to support online open/case-study-based assessment
Marks: 40
Directions:
Students need to go through the CASE STUDY shown in this exam paper.
Analyze and answer the specified sections based on your own thinking and work.
Do not share your answers, as this will be treated as plagiarism by the Blended Learning Center.
1. On a special occasion, six friends, Bristy, Rimi, Sadia, Soma, Tania, and Toma, meet after many years and decide to spend some time shopping for flowers together. All of them are fond of flowers and purchase them frequently. Meanwhile, Fardin has recently started a florist shop. He sells various flowers such as rose, aconite, tulip, orchid, and lily to his customers. Eventually, the six friends visit Fardin’s shop to purchase flowers. Bristy buys rose, aconite, tulip; Rimi buys rose, aconite; Sadia buys rose, orchid, lily; Soma buys lily, orchid; Tania buys tulip, lily; and Toma buys rose, orchid, lily. So, the transactions are as follows:
T1 rose, aconite, tulip
T2 rose, aconite
T3 rose, orchid, lily
T4 orchid, lily
T5 tulip, lily
T6 rose, orchid, lily
Based on these shopping transactions of the six friends, Fardin, who has a special interest in data science, wants to explore the relationships among these items and find the associations among the items bought.
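As a rough aid for checking hand calculations, the sketch below encodes the six transactions as Python sets and counts support by brute-force enumeration. It is not the Apriori algorithm itself, since it does no candidate pruning, and the function names `support` and `frequent_itemsets` are illustrative assumptions rather than part of the case study.

```python
from itertools import combinations

# The six shopping transactions from the case study, one set per customer.
transactions = {
    "T1": {"rose", "aconite", "tulip"},
    "T2": {"rose", "aconite"},
    "T3": {"rose", "orchid", "lily"},
    "T4": {"orchid", "lily"},
    "T5": {"tulip", "lily"},
    "T6": {"rose", "orchid", "lily"},
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Brute force: test every non-empty itemset against min_support."""
    items = sorted(set().union(*transactions.values()))
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


if __name__ == "__main__":
    for itemset, s in frequent_itemsets(transactions, 0.4).items():
        print(itemset, round(s, 3))
```

Brute force is workable here only because there are five distinct items; Apriori would instead build k-itemsets level by level from the frequent (k-1)-itemsets.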
Now, answer the following questions based on the above scenario.
a) Using the support threshold s = 0.4, find the frequent itemsets. 5
b) Find the association rules that are generated by applying the Apriori algorithm. 10
c) Suggest what initiatives Fardin could take (based on the findings of questions 1(a) & 1(b)) that would help to boost his business. 5
Answer all of the following questions. Figures in the right-hand margin indicate full marks.
2. Write the answer to each of the following questions in a single sentence.
a) How many possible candidate itemsets are there if d items are given? 1
b) Why is matching each transaction against every candidate computationally expensive in the brute-force approach? 1
c) If d = 4 items are given, calculate the total number of possible association rules in the brute-force approach using two different ways. 4
d) Write a mathematical relation between k (from k-itemset) and w (maximum transaction width). 1
e) Write the two main objectives of cluster analysis. 1
f) What are the input(s) of K-means clustering algorithm? 1
g) Why can the bisecting K-means clustering algorithm not yield an empty cluster? 1
h) Why is a dendrogram not applicable to the K-means clustering algorithm? 1
i) Why is a minimum spanning tree (MST), rather than any other type of tree, appropriate for divisive hierarchical clustering? 1
j) What are the observations for which the size of the proximity matrix can be reduced from m^2 to about m^2/2? 1
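For the rule-counting question in 2(c), the following sketch shows two ways of counting the association rules over d items: the standard closed form 3^d - 2^(d+1) + 1, and a direct summation over the sizes of the antecedent and consequent. The function names are illustrative assumptions.

```python
from math import comb


def rule_count_closed_form(d):
    """Closed form for the number of rules X -> Y with X, Y non-empty and disjoint."""
    return 3**d - 2 ** (d + 1) + 1


def rule_count_by_summation(d):
    """Choose k items for the antecedent, then a non-empty consequent
    of size j from the remaining d - k items."""
    total = 0
    for k in range(1, d):
        consequents = sum(comb(d - k, j) for j in range(1, d - k + 1))
        total += comb(d, k) * consequents
    return total


if __name__ == "__main__":
    d = 4
    print(rule_count_closed_form(d), rule_count_by_summation(d))
```

The two functions should agree for any d >= 1, which makes the summation a convenient check on the closed form.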
3. a. Describe the data mining steps that you have followed to do the project in this course “Data Mining”.
b. From the following confusion matrix, find the TP, TN, FP, FN. (4+3)
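A minimal sketch of how TP, TN, FP, and FN are counted for a binary classifier. The label lists below are hypothetical and are not the matrix referred to in 3(b); the function name `confusion_counts` and the choice of 1 as the positive class are assumptions as well.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, TN, FP, FN) for paired actual and predicted labels."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p != positive and a != positive:
            tn += 1  # predicted negative, actually negative
        elif p == positive and a != positive:
            fp += 1  # predicted positive, actually negative
        else:
            fn += 1  # predicted negative, actually positive
    return tp, tn, fp, fn


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    actual = [1, 1, 0, 0, 1, 0, 1, 0]
    predicted = [1, 0, 0, 1, 1, 0, 1, 0]
    print(confusion_counts(actual, predicted))
```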