• Tidak ada hasil yang ditemukan

PDF Chapter 3 Research Methodology

N/A
N/A
Protected

Academic year: 2024

Membagikan "PDF Chapter 3 Research Methodology"

Copied!
2
0
0

Teks penuh

(1)

6

CHAPTER 3

RESEARCH METHODOLOGY

3.1. Data Collection

The dataset used for this project is taken from Kaggle, this dataset is in CSV 38.01 KB format consisting of 614 rows and 13 columns as factors namely loan id, gender, marital status, dependents, education, self-employed, applicant income, applicant income, total loan, loan term, credit history, property area, loan status. This dataset is divided into two ratios 70% for training, 30% for testing, and 60% for training, 40% for testing. Training datasets to train models and test datasets to test datasets. And each test data will be divided into three data to see the consistency of its accuracy, as well as testing with several parameters.

3.2. Algorithms

This project uses Logistic Regression and Extreme Gradient Boosting algorithms.

Logistics Regression is used because it is efficient to train, does not have assumptions so there is no need to test assumptions, has high accuracy results, and the algorithm can be used to predict decisions. XGBoost or called Extreme Gradient Boosting is one of the powerful algorithms which has steps with trimming to increase the generalizability of the model, newton boosting which works to prevent gradient descent, and extra randomization parameters to reduce tree correlation increasing the strength of the algorithm ensemble.

3.3. Design

The dataset is collected and preprocessed, the data will be cleaned by filling in the missing variables and handling noise, then data transformation is carried out so that the dataset becomes more effective and efficient. After splitting the dataset, the model uses Logistic Regression, and Extreme Gradient Boosting, then the results are used to test the data.

(2)

7 3.4. Coding

Here, this project uses Python 3.6.9 as the programming language. In addition to its popularity, Python is used because this programming language is an interpreted language so that it is easy to understand, has a simple syntax, Python has an extensive standard library, and is open source. And I use google colaboratory because google colaboratory uses the cloud to execute code so that the performance of the local machine will not decrease in code execution.

This project uses pandas library to read datasets and numpy which is useful when doing calculations.

3.5. Analysis

In this project, the data taken from filling out online application forms from companies that want to automate the credit application process are 615 rows and 13 columns as parameters.

Furthermore, the data will be broken down into training datasets and test datasets, which later the model will implement the Logistic Regression and Extreme Gradient Boosting algorithms to analyze which algorithm has the highest accuracy, then whether the two algorithms are suitable to be applied in this case or not, and which variables affect the outcome decision that a loan is rejected and accepted.

Referensi

Dokumen terkait

Survey method was used to study the level of social freedom and its effect on teaching of women teacher educators, to study the social freedom and its effect on teaching of rural women

45 The number of research samples 45 Based on the sampling criteria as mentioned above, the number of samples used in this study was 45 students.. The following is a list of the

The following next process is chemical preparation for deposited Ni and Au layer into aluminum bond pad Table 3.6, and for deposited Ni and Au the process flow is surface cleaning,

Figure 3.1 Diagrammatic Representation of the Research Method Community Service Learning Intervention for 8 Months Post-test II administered After Six Months Statistical Analysis

Angle Profile Parameters of GaitSimulator User Interface Kinematic Model Description Tswing swing period of the legs, in seconds, input by the user Tstance stance

As stated by Masri Sangarimbun that survey research is research that takes a sample from one population and uses a questionnaire as the main data collection tool.1 Meanwhile,

This sample also was recommended from the English teacher that most of students of second grade were still low in comprehending English text, that's why the teacher recommended to the

From the server-side, the program code is built using the import package google.golang.org/grpc and several other packages, wherein the server program code the port variable is declared