AN ONTOLOGY BASED APPROACH FOR RETRIEVING
MEDICAL AND HEALTH SCIENCE KNOWLEDGE IN QURAN
Muhammad Afifi Bin Mohamad Safee
Thesis submitted in partial fulfilment for the degree of
MASTER OF SCIENCE
UNIVERSITI SAINS ISLAM MALAYSIA
July 2018
AUTHOR DECLARATION
I hereby declare that the work in this thesis is my own except for quotations and summaries which have been duly acknowledged.
Date: 05 July 2018 Signature:
Name: Muhammad Afifi Bin Mohamad Safee Matric No: 3150093
Address: 98-25A, Persiaran Bahagia, Jalan Junid, 84000 Muar, Johor
ACKNOWLEDGEMENTS
This study would not have been possible without the guidance and help of several individuals who in one way or another contributed and extended their valuable assistance in the preparation and completion of this study.
First and foremost, my utmost gratitude goes to Prof Madya Dr Madinah Mohd Saudi and Dr Sakinah Ali Pitchay, whose sincerity and encouragement I will never forget. They guided me with patience and professionalism, which greatly helped and motivated me to continue and complete the thesis.
I would also like to thank my wife, parents, and family members for always supporting me and encouraging me with their love and best wishes.
Special thanks to my university, Universiti Sains Islam Malaysia, and to the Malaysian government (Ministry of Higher Education) for the financial sponsorship. I also wish to extend my thanks to all my colleagues and friends who have supported me throughout my Master's candidature with their love, patience and understanding.
ABSTRAK
There are two techniques for searching the Al-Quran: keyword-based search and semantic-based search. According to previous studies, the keyword-based technique has low precision and often leads to the retrieval of wrong information. Many researchers therefore use semantic search to overcome this problem. However, existing semantic search is limited by a lack of domain coverage, and ontology has been used to complement and strengthen that coverage. For the Al-Quran domain, although much domain ontology research has been carried out, no ontology yet exists for the Medical and Health Science domain. Hence, the objectives of this research are to develop a domain ontology for Medical and Health Science in the Al-Quran and to implement a hybrid search method that combines ontology-based semantic search with keyword search of the Al-Quran. The experiment was conducted using open source software, namely Protégé, Java and Apache Jena, which were used to build the ontology and to develop the search application. The ontology structure was validated by domain experts: the structure was evaluated by an ontology expert, and the ontology consistency was verified using the Protégé reasoner. From this study, a new ontology domain for Medical and Health Science was successfully built and a hybrid ontology-based semantic and keyword search method was successfully developed. The evaluation results show a precision rate of 88% and a recall rate of 91%. The findings of this study can benefit other fields in the future.
Keywords: Ontology, Information Retrieval, Semantic Search, Medical and Health Science, Al-Quran Knowledge.
ABSTRACT
There are two techniques for Quran knowledge representation: keyword-based and semantic-based. According to previous work, the keyword-based technique has low accuracy and often leads to wrong information retrieval. Therefore, many researchers implement semantic search to overcome these problems. However, the semantic-based technique lacks domain coverage, and ontology has been used to complement it and to create new domains. As for the Quran domain, although a few works have been done, none covers the Medical and Health Science domain. Hence, the objectives of this research are to develop an ontology for the Medical and Health Science domain in the Quran and to implement a hybrid ontology-based semantic and keyword search application to retrieve related queries in the Quran. The experiment was conducted using open source tools, namely Protégé, the Java framework and Apache Jena, which were used to develop the ontology and the search application. The content of the ontology was evaluated by Quran experts, the ontology structure was evaluated by an ontology expert, and the ontology consistency was verified using the Protégé reasoner. In this research, a new ontology domain for Medical and Health Science and a hybrid ontology-based semantic and keyword search application have been developed. The evaluation results show that, based on the Quran experts' evaluation, the proposed search method achieved a recall rate of 88% and a precision rate of 91%. For future work, the ontology can be further expanded to different domains.
Keywords: Ontology, Information Retrieval, Semantic Search, Medical and Health Science, Al-Quran Knowledge.
TABLE OF CONTENTS
AUTHOR DECLARATION
ACKNOWLEDGEMENTS
ABSTRAK
ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES
LIST OF ABBREVIATIONS
CHAPTER 1: INTRODUCTION
1.1 Background
1.2 Problem Statement
1.3 Research Questions
1.4 Research Objectives
1.5 Scope of The Research
1.6 Operational Definition
1.7 Thesis Organization
CHAPTER 2: LITERATURE REVIEW
2.1 Information Retrieval Background
2.2 Current Al-Quran Retrieval Approach
2.2.1 Keyword-Based Technique
2.2.2 Semantic-Based Technique
2.2.3 Ontology Research For Al-Quran
2.3 The Semantic Web
2.4 Introduction To Ontology
2.5 Ontology Evaluation
2.6 Ontology Development Tools
2.7 Summary
CHAPTER 3: RESEARCH METHODOLOGY
3.1 Research Framework
3.2 Overall Research Process
3.2.1 Identified Problem
3.2.2 Suggestion
3.2.2.1 Research Design
3.2.3 Development
3.2.3.1 Development Of Ontology
3.2.3.2 Develop Searching Application
3.2.4 Evaluation
3.2.4.1 Ontology Evaluation
3.2.4.2 Searching Application Performance Evaluation
3.2.5 Summary
CHAPTER 4: FINDINGS
4.1 QuranMed Ontology Development
4.2 System Demonstration
4.3 Comparison QuranMed Algorithm With Existing Work
CHAPTER 5: ANALYSIS AND DISCUSSIONS
5.1 Ontology Evaluation
5.1.1 Expert Validation
5.1.2 Ontology Query Validation
5.1.3 Ontology Consistency Validation Using Reasoner
5.2 QuranMed Searching Application Performance Evaluation
5.3 Summary
CHAPTER 6: CONCLUSION AND FUTURE WORK
6.1 Introduction
6.2 Main Contributions
6.3 Future Works
6.4 Conclusion
REFERENCES
LIST OF TABLES
Table 1.1: Medical and Health Science domain structure
Table 2.1: Comparison study of Existing Al-Quran searching technique
Table 2.2: Approaches for ontology evaluation
Table 3.1: Mapping Research objectives, methodology and Outcomes
Table 3.2: Proposed ontology evaluation
Table 4.1: Questions & Answers determine QuranMed's domain & scope
Table 4.2: Definition and general classification of QuranMed
Table 4.3: Identifying the sub and sub-sub concepts of QuranMed terms
Table 4.4: QuranMed Object Properties with their domain and range
Table 4.5: QuranMed Object Data Properties with their domain and range
Table 5.1: Result of test case scenario using keyword
Table 5.2: Result of test case scenario using medical field search
Table 5.3: Result of test case scenario using verse and surah name search
Table 5.4: Result of test case scenario using exact verse word search
LIST OF FIGURES
Figure 1.1: Mapping research objectives with research outcomes
Figure 2.1: Measuring Search Effectiveness (Recall and Precision)
Figure 2.2: Classification of Existing Al-Quran searching technique
Figure 2.3: The Semantic Web Layer cake diagram
Figure 3.1: Research process adopted from (Kuechler and Vaishnavi, 2008)
Figure 3.2: The Al-Quran Content Hierarchy
Figure 3.3: QuranMed searching application flow
Figure 3.4: QuranMed proposed framework
Figure 3.5: QuranMed SPARQL Builder algorithm for the proposed framework
Figure 4.1: OOPS! Ontology Pitfall Scanner Results
Figure 4.2: Medical and Health Science Concepts Classification
Figure 4.3: List of Main Concepts of QuranMed
Figure 4.4: Object Property for hasRelatedAyat
Figure 4.5: Data Property for hasKeyword
Figure 4.6: QuranMed instance illustration in Protégé
Figure 4.7: Medical and Health Science dataset from expert
Figure 4.8: Ontology summary diagram
Figure 4.9: Relationship of "hasField" and "hasAyatUthmani"
Figure 4.10: Relationship of "hasField" and "hasTranslation"
Figure 4.11: Relationship of "hasField" and "hasKeyword"
Figure 4.12: Relationship of "hasField" and "hasLabel"
Figure 4.13: Relationship of "hasLabel" and "sameAs"
Figure 4.14: QuranMed search algorithm
Figure 4.15: Hakkoum search algorithm
Figure 5.1: Answer for test Query 1
Figure 5.2: Answer for test Query 2
Figure 5.3: Answer for test Query 3
Figure 5.4: Selecting FaCT++ reasoner in Protégé
Figure 5.5: Error in taxonomy and inconsistent reasoning
LIST OF APPENDICES
Appendix A: SPARQL query to query QuranMed ontology content
Appendix B: Expert List
Appendix C: Medical and Health Science dataset sample
Appendix D: Sample screen searching Al-Quran using Ontology based approach
Appendix E: Ontology data source hierarchy
LIST OF ABBREVIATIONS
OWL Web Ontology Language
RDBMS Relational Database Management System
RDF Resource Description Framework
RDF-S Resource Description Framework Schema
XML Extensible Markup Language
URL Uniform Resource Locator
CLIR Cross Language Information Retrieval
DL Description Logic