
PDF Dissertation submitted in partial fulfillment of


Academic year: 2023


Full text

This study analyzes how data mining can help businesses in areas such as marketing and credit card authentication. Data mining is an essential process in which intelligent methods are applied to extract data patterns. Data mining functionalities are used to specify the kinds of patterns to be found in data mining tasks.

Data mining is applicable to telecommunications, medical science, professional institutions, business ventures, educational institutions and agriculture.

Objectives and Scope of Study

In the second case, we can try to estimate the missing values, for example by assigning the most frequent value of the same attribute.

Amount of dimensionality reduction: this can be measured as the proportion of selected input features out of the total number of candidates.

The basic idea of the induction algorithm is to ask the questions whose answers provide the most information.
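The mode-based estimate of missing values and the dimensionality-reduction measure described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the function names, the dictionary-per-record layout, and the use of `None` as the missing-value marker are all assumptions.

```python
from collections import Counter


def impute_most_frequent(rows, column, missing=None):
    """Return copies of rows with missing entries in `column` set to the mode.

    Hypothetical helper: records are dicts and `missing` marks an absent value.
    """
    observed = [row[column] for row in rows if row[column] is not missing]
    if not observed:
        # Nothing to estimate from, so return unmodified copies.
        return [dict(row) for row in rows]
    most_frequent = Counter(observed).most_common(1)[0][0]
    return [
        {**row, column: most_frequent} if row[column] is missing else dict(row)
        for row in rows
    ]


def reduction_ratio(n_selected, n_candidates):
    """Proportion of selected input features out of all candidate features."""
    return n_selected / n_candidates
```

For example, a record whose `job` field is `None` would receive whichever job value occurs most often among the other records, and selecting 4 of 16 candidate features gives a ratio of 0.25.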

The output connection is divided into a number of branches that transmit the same signal, and these branches end at the input connections of the neurons in the network.

Figure 2.1: Knowledge Discovery Process

For each tuple in the training set, propagate it through the network and compare the output prediction with the actual result. Adjust the weight

For each tuple in the training set, propagate it through the network and compare the output prediction with the actual output.

For each tuple tj in D, propagate it through the network and make appropriate

Literature Review on Fuzzy Logic and Fuzzy rules induction

As an illustration, a classical set of tall persons can be expressed mathematically as the collection of persons whose height is greater than or equal to 6 feet; it classifies a person 6 feet tall as tall, but not a person 5.999 feet tall. In a fuzzy set, by contrast, the transition from "belongs to a set" to "does not belong to a set" is gradual, and this smooth transition is characterized by membership functions, which give fuzzy sets flexibility in modeling commonly used linguistic expressions such as "the water is hot".

For example, the classical set A of real numbers greater than 6 can be expressed as A = {x | x > 6}, where there is a clear, unambiguous boundary at 6: if x is greater than this number, then x belongs to the set A; otherwise x does not belong to the set. For example, if we have the set "tall men", its complement is the set "not tall men". The union takes the maximum membership value of an element across the sets: μA∪B(x) = max(μA(x), μB(x)).
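The contrast between a crisp boundary and a gradual one, together with the complement and union operations, can be sketched in a few lines. The breakpoints 5.5 and 6.5 feet for the fuzzy "tall" set are purely illustrative assumptions, and the minimum-based intersection is the standard companion of the maximum-based union rather than something stated in this passage.

```python
def crisp_tall(height_ft):
    """Classical set: membership is 1 at or above 6 feet, otherwise 0."""
    return 1.0 if height_ft >= 6.0 else 0.0


def fuzzy_tall(height_ft, low=5.5, high=6.5):
    """Fuzzy set: membership rises linearly from 0 at `low` to 1 at `high`.

    The breakpoints are illustrative assumptions, not values from the text.
    """
    if height_ft <= low:
        return 0.0
    if height_ft >= high:
        return 1.0
    return (height_ft - low) / (high - low)


def fuzzy_not(mu):
    """Complement: degree of membership in 'not A'."""
    return 1.0 - mu


def fuzzy_or(mu_a, mu_b):
    """Union: the maximum of the two membership values."""
    return max(mu_a, mu_b)


def fuzzy_and(mu_a, mu_b):
    """Intersection: the minimum of the two membership values."""
    return min(mu_a, mu_b)
```

With these definitions, a height of 5.999 feet drops from full membership to zero in the crisp set, but its fuzzy membership stays close to that of a person 6 feet tall.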

Fuzzy membership functions can be defined by several methods. One is the heuristic method, in which predefined shapes are chosen for the membership functions. The most popular shapes are piecewise linear functions such as the triangular and trapezoidal membership functions, owing to their computational efficiency. A fuzzy rule can have several antecedents, as in the following form: IF x1 is Ai and/or x2 is Bi, THEN yi is Oi.
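The two piecewise linear shapes named above can be written directly from their corner points. This is a minimal sketch; the parameter names `a`, `b`, `c`, `d` are a common convention and an assumption here, not notation taken from the dissertation.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership with feet at a and d and plateau from b to c.

    Assumes a < b <= c < d.
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)
```

Both functions need only comparisons and one division per evaluation, which is the computational efficiency the text refers to.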

Although human experts have played an important role in the development of conventional fuzzy systems, the automatic generation of fuzzy rules from data is very useful when human experts are not available and can even provide information not previously known by experts. Various approaches have been designed to develop data-driven learning for fuzzy rule-based systems. They involve using a method that automatically generates membership functions of fuzzy rule structures from both.

Chen [18] proposed a method for generating fuzzy rules for classification problems that uses fuzzy subsethood values. (a) Identify the nature of the classification problem. The method for generating fuzzy rule models based on fuzzy subsethood values is formally defined as follows.

Figure 2.4: Triangular membership function


Define fuzzy distributions for the input and output variables according to the type of data and the nature of the classification problem. The weighted subsethood-based algorithm uses the subsethood values as relative weights expressing the importance of the different conditional attributes. Intuitively, the linguistic term with the highest subsethood value is the most important, and the term with the lowest is the least important.
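To make the idea of subsethood values as relative weights concrete, the sketch below uses the commonly cited definition S(A, B) = Σ min(μA(x), μB(x)) / Σ μA(x). Treating this as the dissertation's own formula is an assumption, as is normalizing each value by the largest one to obtain relative weights; the function names are hypothetical.

```python
def subsethood(mu_a, mu_b):
    """Degree to which fuzzy set A is a subset of fuzzy set B.

    mu_a and mu_b hold the membership values of the same training records.
    """
    total_a = sum(mu_a)
    if total_a == 0:
        return 0.0
    return sum(min(a, b) for a, b in zip(mu_a, mu_b)) / total_a


def relative_weights(subsethood_by_term):
    """Scale subsethood values so that the most important term has weight 1.

    The normalization by the maximum is an assumed convention.
    """
    top = max(subsethood_by_term.values())
    if top == 0:
        return {term: 0.0 for term in subsethood_by_term}
    return {term: value / top for term, value in subsethood_by_term.items()}
```

Under this convention, the linguistic term with the highest subsethood value receives weight 1 and the others are scaled down in proportion.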

However, we can account for the relative contribution that each linguistic term of each variable makes to the final conclusion by multiplying each linguistic term by its respective weight; the generated fuzzy rule then carries these weights on its linguistic terms.

Integrated neuro-fuzzy systems can combine the parallel computation and learning capabilities of neural networks with the human-like knowledge representation and explanation capabilities of fuzzy systems. Such a network has input and output layers and three hidden layers representing membership functions and fuzzy rules.

A fuzzification neuron receives a crisp input and determines the degree to which that input belongs to the neuron's fuzzy set. The network then propagates the input pattern from layer to layer until the output layer produces the output pattern. If this pattern differs from the desired output, an error is computed, which is then propagated back through the network from the output layer to the input layer.

To propagate error signals, we start at the output layer and work back to the hidden layers. After calculating the errors at the output layer, we update the weights of the output layer.
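The forward pass and the backward flow of error described above can be illustrated with a generic one-hidden-layer sigmoid network. This is a plain back-propagation sketch, not the fuzzy neural network of this work: the sigmoid activation, the absence of bias terms, the single output neuron, and the learning rate are all simplifying assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Propagate input x layer by layer; return hidden activations and output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def train_step(x, target, w_hidden, w_out, rate=0.5):
    """One back-propagation update for a single training tuple (in place).

    Returns the squared error measured before the update.
    """
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output neuron.
    delta_out = (target - output) * output * (1.0 - output)
    # Error signals at the hidden neurons, computed from the old output weights.
    delta_hidden = [
        delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
        for j in range(len(hidden))
    ]
    # Update the output-layer weights first, then the hidden-layer weights.
    for j, h in enumerate(hidden):
        w_out[j] += rate * delta_out * h
    for j, row in enumerate(w_hidden):
        for i, xi in enumerate(x):
            row[i] += rate * delta_hidden[j] * xi
    return 0.5 * (target - output) ** 2
```

Calling `train_step` repeatedly on the same tuple should drive the returned error down, which is a quick sanity check that the error signals have the right sign.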

Figure 2.7: fuzzy neural network

The fuzzy logic operations AND and OR are interpreted as the minimum and maximum operations respectively, and these operations are not differentiable. To apply back-propagation and obtain better learning performance, we base our training algorithm on the smooth derivative [28], which is briefly described as follows.
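The difficulty with min and max can be seen by writing out their gradients. The sketch below compares the hard maximum, whose gradient is 1 for the winning input and 0 for all others, with a log-sum-exp approximation whose gradient varies smoothly. The log-sum-exp form is a generic substitute chosen for illustration; it is not the smooth derivative of [28], whose definition does not appear in this excerpt, and the `sharpness` parameter is an assumption.

```python
import math


def max_with_grad(values):
    """Hard maximum and its gradient: 1 for the winning input, 0 elsewhere."""
    winner = max(range(len(values)), key=lambda i: values[i])
    grad = [1.0 if i == winner else 0.0 for i in range(len(values))]
    return values[winner], grad


def soft_max_with_grad(values, sharpness=10.0):
    """Log-sum-exp approximation of the maximum and its gradient.

    The gradient is a softmax distribution over the inputs, so every input
    receives some share of the error signal.
    """
    peak = max(values)
    # Subtract the peak before exponentiating for numerical stability.
    exps = [math.exp(sharpness * (v - peak)) for v in values]
    total = sum(exps)
    value = peak + math.log(total) / sharpness
    grad = [e / total for e in exps]
    return value, grad
```

A smooth minimum can be obtained in the same way by negating the inputs and the result. A larger `sharpness` brings the approximation closer to the hard maximum, at the cost of gradients that are again nearly 0 or 1.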

The first phase tries to capture what the system will do (its requirements), the second determines how it will be designed, the third is the actual programming, and the fourth is full system testing.

We need to feed the training dataset into the system to generate the weighted fuzzy production rules (WFPR); these rules are then trained with a min-max fuzzy neural network. The Iris dataset is a basic benchmark on which data mining algorithms are commonly tested first.

If the data type of the input attribute is nominal, then the available categories serve as the linguistic terms, and the number of linguistic terms is independent of the number of target classes. If the data type of the input attribute is continuous numeric, then the number of linguistic terms equals the number of target classes and the membership function is a trapezoid. After determining the fuzzy membership classes, we follow (2.8) to calculate the subsethood value of each linguistic term with respect to each classification class.

After obtaining the weighted subsethood values, we can generate the set of weighted fuzzy production rules. The task of the fuzzy neural network subsystem is to train the weighted fuzzy production rules by modifying the weights so that the rules classify data more accurately. The activation function for the input membership layer is the trapezoidal membership function; the activation for the fuzzy rule layer is the fuzzy rule; and the activation function for the output layer is the logic operation function.

After regrouping the rules, we obtain a set of sub-rules that use the OR and AND operations. Where OC is the number of output neurons, y_k is the desired output for neuron k, and y_k^(5) is.

Figure 3.1: Waterfall model

According to the principle of gradient descent, the backpropagation equations can be written as.

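[Equations not recoverable from the extracted text; the surviving fragments show partial derivatives of the max operation over weighted layer outputs Y(3) and Y(4).]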
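For orientation, a generic squared-error function and gradient-descent weight update using the symbols OC, y_k and y_k^(5) can be written as follows. This is the standard textbook form and not a reconstruction of the author's equations: the factor 1/2, the learning rate η, the generic weight symbol w, and the reading of y_k^(5) as the actual output of neuron k in the fifth layer are all assumptions.

```latex
E = \frac{1}{2} \sum_{k=1}^{OC} \left( y_k - y_k^{(5)} \right)^2,
\qquad
w \leftarrow w - \eta \, \frac{\partial E}{\partial w}
```

Each weight is moved a small step against the gradient of the error, so repeated updates reduce E as long as η is small enough.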

The implementation of this project is divided into modules that correspond to typical data mining phases: Data Selection, Data Preprocessing, Fuzzy Rule Induction, Fuzzy Rule Training, and Validation. Each of these phases is a module of the application, and the modules run in sequence because of the linear design of the tool.

A bank has to consider many personal details of the applicant, especially the credit history, and there is no simple rule for assessing the risk of granting a credit card to each person.

Quinlan, was retrieved from the following URL: http://www.csee.usf.edu/-mlast/credit.dat. The dataset contains records from bank credit card applications, including the credit decision (accept/decline).

For the Iris data set, the records were labeled from 1 to 150 and divided equally into two subsets, IP1 and IP2: IP1 consists of the odd-numbered records and IP2 of the even-numbered records. The hybrid system, a fuzzy neural network, has improved the accuracy of the weighted fuzzy production rules.
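The odd/even division of the 150 Iris records described above amounts to a simple index-based split. The sketch below assumes the records are held in a list in their original order; the function name is hypothetical.

```python
def split_odd_even(records):
    """Split records labeled 1..n into odd-labeled (IP1) and even-labeled (IP2)."""
    ip1 = [rec for label, rec in enumerate(records, start=1) if label % 2 == 1]
    ip2 = [rec for label, rec in enumerate(records, start=1) if label % 2 == 0]
    return ip1, ip2
```

Applied to 150 records, this yields two subsets of 75 records each, so one can be used for training and the other for testing, and then the roles can be swapped.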

We observe that a hybrid system gives us a high-accuracy neural network classifier that can also generate rules, a property inherited from fuzzy logic. In conclusion, the proposed fuzzy neural network algorithm helps to increase the accuracy of the fuzzy logic classifier. Data mining has proven to be a very important area of research that helps organizations make good use of the vast amounts of data they hold.

As this project serves as a demonstration of the basic features of a typical data mining process, there are many possible extensions that would make it a more powerful tool. When it comes to very large data sets, memory capacity and CPU computing power become critical. The application therefore needs a new design that allows it to connect to a database server such as Oracle DBMS or Microsoft SQL Server.

[18] Chen S.M., Lee S.H. and Lee C.H., "A New Method for Generating Fuzzy Rules from Numerical Data for Handling Classification Problems", Applied Artificial Intelligence, vol

"Induction of Fuzzy Rules and Membership Functions from Training Examples", Fuzzy Sets and Systems,

"Rule Generation of Classification with the NEFCLASS Neuro-fuzzy System", Proceedings of the North American Fuzzy Information Processing Society (NAFIPS '96) Biennial Conference, Berkeley, CA.

Table 4.1: Credit card approval data set

Figures

Figure 2.1: Knowledge Discovery Process
Figure 2.2: Sample decision tree
Figure 2.3: Artificial Neural Network
Figure 2.4: Triangular membership function
