MODELING BANKING STABILITY INDEX USING MACHINE LEARNING TECHNIQUE

By Agus Afiantara

21751009

MASTER’S DEGREE in

MASTER OF INFORMATION TECHNOLOGY ENGINEERING AND INFORMATION TECHNOLOGY

SWISS GERMAN UNIVERSITY

The Prominence Tower

Jalan Jalur Sutera Barat No. 15, Alam Sutera, Tangerang, Banten 15143 - Indonesia

February 2019

Revision after the Thesis Defense on 24 January 2019

STATEMENT BY THE AUTHOR

I hereby declare that this submission is my own work and to the best of my knowledge, it contains no material previously published or written by another person, nor material which to a substantial extent has been accepted for the award of any other degree or diploma at any educational institution, except where due acknowledgement is made in the thesis.

Agus Afiantara

_____________________________________________

Student Date

Approved by:

Dr. Eng. Bagus Mahawan

_____________________________________________

Thesis Advisor Date

Dr. Eka Budiarto, S.T., M.Sc.

_____________________________________________

Thesis Co-Advisor Date

Dr. Maulahikmah Galinium, S.Kom., M.Sc.

_____________________________________________

Dean Date

ABSTRACT

Modeling Banking Stability Index Using Machine Learning Technique

By Agus Afiantara

Dr. Eng. Bagus Mahawan, Advisor

Dr. Eka Budiarto, S.T., M.Sc., Co-Advisor

SWISS GERMAN UNIVERSITY

The purpose of this research is to create an early warning model, built from selected variables of a financial stability index, that detects instability of the financial system in the Indonesian banking sector using Principal Component Analysis (PCA) and the Random Forest machine learning algorithm (Random Forest Classifier and Regressor). Economic crises are unexpected and hard-to-predict events that can have serious consequences for a country's economy. The capability to process data and information on economic indicators is therefore required to help avoid future instability of the financial system. Many variables are used to describe instability of the financial system, but it remains an open question which of them can accurately predict it. Because the data differ in type and unit, they should be normalized before processing in order to speed up the calculations the model needs. An important concern of this research is reducing the dimension of the selected variables with PCA, which is generally used to compress data into a new dataset with fewer dimensions. Bagged decision trees such as Random Forest are used to estimate the importance of features. Throughout the research, Random Forest combined with PCA is shown to give the best result in classifying and selecting the core variables among the economic indicators and in predicting the instability signal of the financial system, which is constructed from the Financial Stability Index (FSI) of the Central Bank of Indonesia using monthly data. The algorithm is trained on data from the 2004-2011 period. As a result, nine principal components are obtained as input for the Random Forest model to predict instability of the financial system, especially the banking system, with an explained variance ratio (the capability to explain) of around 97%, accuracy of around 89%, precision of 91%, and mean absolute error of around 11%.
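To make the reported evaluation figures concrete, the following is a minimal sketch of how accuracy, precision, and mean absolute error can be computed for a binary instability signal. The labels are hypothetical and the helper names are illustrative assumptions; this is not the evaluation code of the thesis.

```python
from typing import Sequence


def accuracy(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Share of observations whose predicted label equals the actual label."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def precision(actual: Sequence[int], predicted: Sequence[int], positive: int = 1) -> float:
    """Share of predicted-positive observations that are truly positive."""
    flagged = [a for a, p in zip(actual, predicted) if p == positive]
    if not flagged:
        return 0.0  # no positive predictions at all
    return sum(a == positive for a in flagged) / len(flagged)


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly signals (assumption): 1 = instability signal, 0 = stable.
actual = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predicted = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(actual, predicted)
prec = precision(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(f"accuracy={acc:.2f} precision={prec:.2f} mae={mae:.2f}")
```

For 0/1 labels, the mean absolute error equals the misclassification rate, that is, one minus the accuracy. That would be consistent with the reported accuracy of about 89% alongside a mean absolute error of about 11%, although the thesis may compute the error differently, for example from the regressor output.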

Keywords: model, independent variables, un-supervised machine learning, PCA, data processing, dimension reduction, python, random forest.
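The workflow summarized in the abstract (normalization, PCA, then Random Forest) can be sketched with scikit-learn as below. This is an illustration under stated assumptions, not the model of the thesis: the data are synthetic, and the random hold-out split, the number of trees, and the 97% variance threshold passed to PCA are choices made only for this example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the indicator table (assumption, NOT the thesis data):
# 96 rows mimic eight years of monthly observations, 30 columns mimic indicators.
X, y = make_classification(
    n_samples=96, n_features=30, n_informative=8, n_redundant=10, random_state=0
)

# Simple hold-out split; the proportions are illustrative.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = make_pipeline(
    StandardScaler(),                            # put all indicators on one scale
    PCA(n_components=0.97, svd_solver="full"),   # keep components explaining >= 97% of variance
    RandomForestClassifier(n_estimators=200, random_state=0),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

pca = model.named_steps["pca"]
forest = model.named_steps["randomforestclassifier"]

explained = float(pca.explained_variance_ratio_.sum())
acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred, zero_division=0)

print("components kept:", pca.n_components_)
print("explained variance ratio:", round(explained, 3))
print("accuracy:", round(acc, 3), "precision:", round(prec, 3))
# Importances refer to principal components, not to the original indicators.
print("component importances:", forest.feature_importances_.round(3))
```

Because the forest is fitted on principal components, its importance scores rank components rather than the original economic indicators; mapping them back requires inspecting the PCA loadings in `pca.components_`.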

© Copyright 2019 by Agus Afiantara

All rights reserved

DEDICATION

I dedicate my thesis work to my God Allah SWT and to my family, with many thanks to my lovely wife Linda Nirmala, who always supports and cares for me, and to my beloved daughters, Putri Zahra Maharani and Putri Azizah Dafianti. I give you an example that learning is a continuous process from the cradle to the grave and is an obligation for us as Muslims. For my mom, Hj. E. Prijatini, who continues to pray for her children to always strive for success: “I love you, mom, with all my heart.” And for my father, who left us 10 years ago but whose lessons I still remember: may Allah give you the best place.

ACKNOWLEDGEMENTS

I wish to thank my fellows and colleagues who were willing to discuss my thesis with me at length, especially Bro Hendra Syamsir and Mas Advis Budiman, who always shared their knowledge to help me finish this research. Thank you for all the information and lessons you have given.

I also wish to thank my advisor and co-advisor for their support and patience. Their gentle but firm direction has been most appreciated. Dr. Eng. Bagus Mahawan and Dr. Eka Budiarto, S.T., M.Sc. were particularly helpful in guiding me toward a quantitative methodology. Finally, I would like to salute all of my student colleagues at SGU.

TABLE OF CONTENTS

STATEMENT BY THE AUTHOR ... 2

ABSTRACT ... 3

DEDICATION ... 6

ACKNOWLEDGEMENTS ... 7

TABLE OF CONTENTS ... 8

LIST OF FIGURES ... 11

LIST OF TABLES ... 14

LIST OF EQUATIONS ... 15

CHAPTER 1 - INTRODUCTION ... 16

1.1. Background ... 16

1.2. Research Problems ... 19

1.3. Research Objectives ... 20

1.4. Research Question ... 21

1.5. Hypothesis ... 21

1.6. Research Scope and Limitation ... 22

1.7. Significance of Study ... 22

1.8. Structure of the Thesis ... 22

CHAPTER 2 - LITERATURE REVIEW ... 23

2.1. Overview ... 23

2.2. Banking Sector Stability ... 24

2.2.1. Financial System and Banking System Relation ... 27

2.2.2. Stress Testing ... 28

2.2.3. Financial Stability Index ... 29

2.2.4. Banking Stability Index ... 30

2.3. Principal Component Analysis ... 30

2.4. Machine Learning ... 31

2.5. Random Forest ... 35

2.5.1. Random Forest Classifier ... 36

2.5.2. Random Forest Regressor ... 37

2.6. Related Works ... 39

CHAPTER 3 – RESEARCH METHODS ... 46

3.1. CRISP-DM Methodology ... 46

3.1.1. Business Understanding ... 48

3.1.2. Data Understanding ... 50

3.2. Data Preparation ... 51

3.2.1. NA Treatment ... 51

3.2.2. Log Transformation ... 51

3.2.3. Normalization ... 52

3.2.4. Validation Technique (Split Data Phase) ... 54

3.2.4.1. Re-substitution ... 55

3.2.4.2. Hold-out ... 55

3.2.4.3. K-Fold Cross Validation (KFCV) ... 56

3.2.5. Dimensionality Reduction ... 56

3.3. Combination of PCA and Random Forest in BSI Modeling ... 57

3.4. BSI Model Assessment ... 60

3.5. Evaluation ... 61

3.5.1. Classification Rate/Accuracy ... 63

3.5.2. Recall ... 63

3.5.3. Precision ... 64

3.5.4. F-Measure ... 64

3.6. Deployment ... 64

CHAPTER 4 – RESULTS AND DISCUSSIONS ... 65

4.1. Requirements ... 65

4.1.1. Software Requirement ... 65

4.1.2. Hardware Requirement ... 65

4.2. Flow Logic in BSI Model Construction ... 66

4.3. Load Dataset ... 66

4.4. Descriptive Statistics ... 68

4.5. Data Transformation ... 69

4.6. Modeling ... 76

4.6.1. Model ... 76

4.6.2. Hold-out validation result ... 76

4.6.3. Parameter Tuning ... 78

4.7. Final Model and Performance of BSI ... 84

4.8. Interpret Model and Discussion ... 88

4.8.1. Visualizing a Single Decision Tree ... 89

4.8.2. Variable Importance ... 89

4.8.3. The Estimated Model Based on K-Fold Cross Validation ... 91

4.10. Score New Data ... 93

4.11. Lost Interpretation of PCA ... 93

CHAPTER 5 – CONCLUSIONS AND RECOMMENDATIONS ... 102

5.1. Conclusions ... 102

5.2. Future Works ... 103

5.3. Recommendations ... 104

GLOSSARY ... 105

REFERENCES ... 111

APPENDIX 1. Independent Variables (Internal Bank, Money Market, Capital & Debt Market, Macroeconomic) ... 115

APPENDIX 2. Flow Diagram of BSI Model Construction using Python ... 126

APPENDIX 3.A. Display Data Information of Dataset ... 127

APPENDIX 3.B. Checking of NA Treatment ... 127

APPENDIX 3.C. Data Visualization of the Correlation Between Variables ... 127

APPENDIX 3.D. Log Transformation ... 127

APPENDIX 3.E. Log Transformation of Each Block ... 128

APPENDIX 4.A. Validation Technique ... 129

APPENDIX 4.B. Standard Scaler ... 129

APPENDIX 4.C. Random Forest Classifier model ... 130

APPENDIX 4.D. Code to Simulate and Evaluate the Model ... 130

APPENDIX 4.E. Code to Create Image Files of Decision Trees ... 131

APPENDIX 4.F. Display Chart Actual vs Prediction ... 132

APPENDIX 4.G. Save the Model for Future Use ... 133

APPENDIX 4.H. Loading Model ... 133

APPENDIX 4.I. Scoring New Data ... 133

APPENDIX 4.J. Equation of PCA1 to PCA9 ... 134

CURRICULUM VITAE ... 143
