3.1.2 Conclusion

As presented in the paper, the proposed CNN-based method achieved a 2.1% improvement over the other compared methods. The improved performance was due to the data augmentation, parameter tuning, and pre-processing techniques employed.

3.2 Tuberculosis Abnormality Detection in Chest X-Rays: A Deep Learning Approach

Tuberculosis Abnormality Detection in Chest X-Rays: A Deep Learning Approach

Mustapha Oloko-Oba and Serestina Viriri

School of Mathematics, Statistics and Computer Sciences, University of KwaZulu-Natal, Durban, South Africa

219098624@stu.ukzn.ac.za, viriris@ukzn.ac.za

Abstract. Tuberculosis has claimed many lives, especially in developing countries. While treatment is possible, it requires an accurate diagnosis to detect the presence of tuberculosis. Several screening techniques exist, and the most reliable is the chest X-ray, but the radiological expertise needed to interpret chest X-ray images accurately is lacking. The manual examination of large chest X-ray images by radiologists is time-consuming and could result in misdiagnosis as a result of a lack of expertise. Hence, computer-aided diagnosis could perform this task quickly and accurately, and drastically improve the ability to diagnose correctly and ultimately treat the disease earlier. Because of the complexity that surrounds the manual diagnosis of chest X-rays, we propose a model that employs a learning algorithm (Convolutional Neural Network) to effectively learn the features associated with tuberculosis and make corresponding accurate predictions. Our model achieved 87.8% accuracy in classifying chest X-rays into abnormal and normal classes and was validated against the ground truth. Our model offers a promising pathway to solving the diagnosis issue in the early detection of tuberculosis manifestation, and hope for radiologists and healthcare facilities in developing countries.

Keywords: Tuberculosis · Chest X-ray · Classification · Deep learning · Pulmonary · Convolutional neural network

1 Introduction

Tuberculosis, known as TB, is regarded as a significant health challenge across the world but is more prevalent in developing countries [1–3]. TB is caused by the bacterium Mycobacterium tuberculosis, which generally affects the lungs (pulmonary) but can also affect various other parts of the body (extrapulmonary) [4]. Several TB patients lose their lives yearly as a result of diagnostic errors and lack of treatment [1, 5].

Pulmonary tuberculosis is a transmissible infection mostly visible around the collarbone. It is contracted via the air when people who have an active tuberculosis infection sneeze, cough, or transmit their saliva through the air or by other means [6]. In other words, persistence and closeness of contact are major risk factors for contracting TB, which puts individuals living in the same household at higher risk than casual contacts. Lack of diagnosis and treatment of affected individuals also increases the rate of transmission.

© Springer Nature Switzerland AG 2020

L. J. Chmielewski et al. (Eds.): ICCVG 2020, LNCS 12334, pp. 121–132, 2020.

Tuberculosis is treatable if it is diagnosed early enough for appropriate treatment. Various test procedures may be employed to confirm a diagnosis of suspected pulmonary tuberculosis. These include a Computed Tomography (CT) scan, Chest X-ray (CXR), Magnetic Resonance Imaging (MRI) scan, or ultrasound scan of the affected part of the body. Among the screening techniques in existence, the chest X-ray is renowned for evaluating the lungs, heart, and chest wall to diagnose symptoms, as reported by [7], which recommends its application for screening patients so that a more costly test can usually be avoided.

The World Health Organization affirms that tuberculosis is one of the leading causes of death across the world, especially in developing countries. As reported in 2016, 1.8 million deaths were recorded from about 10.4 million cases of tuberculosis. Although tuberculosis can be detected on a chest X-ray, tuberculosis-prevalent regions usually lack the radiological expertise required to accurately diagnose and/or interpret the X-ray results [2].

In this paper, we propose a model that will improve the quality and timeliness of chest X-ray diagnosis for the manifestation of tuberculosis. The proposed model will increase diagnostic efficiency, enhance performance, and eventually minimize cost, as opposed to manually examining chest X-ray scans, which is costly, time-consuming, and prone to errors due to the shortage of professional radiologists and the volume of chest X-rays.

Fig. 1. Sample of the normal and abnormal chest X-rays.


2 Related Work

The strengths and possibilities offered by machine learning have provided a boost to computer vision, especially for the diagnosis and screening of various diseases and health conditions. In spite of the global application of machine learning techniques in the biomedical domain, the chest X-ray is still a very important and renowned tool [7], among others, for evaluating pulmonary diseases that require close attention. There is always a need to improve existing methods and to propose new techniques for stability, global growth, and better performance.

A handful of investigations have been carried out in [8–10] to assess the ability of existing computer-aided detection (CAD) systems in the biomedical domain to diagnose pulmonary nodules. A notable improvement to 0.986 area under the curve (AUC) was reported in [8] with the use of a computer-aided detection system, as opposed to 0.924 AUC without CAD. Similarly, an improvement to 0.923 AUC with CAD, over 0.896 AUC without CAD, was observed by [9]. The results of these assessments show the significant impact of CAD in assisting radiologists to improve diagnosis from chest X-rays.

An experiment for tuberculosis screening was presented in [24] using the AlexNet and VGGNet architectures to classify CXRs into positive and negative classes. The analysis, carried out on the Montgomery and Shenzhen CXR datasets, reported that VGGNet outperformed AlexNet as a result of its deeper network. An accuracy of 80.4% was obtained for AlexNet, while VGGNet reached 81.6%. The authors concluded that improved accuracy is possible by increasing the size of the dataset used for the experiment.

Another article that deals with the early detection of tuberculosis is presented in [14]. It applied a median filter, histogram equalization, and a homomorphic filter to preprocess the input image, then segmented the image using the active contour, and finally arrived at the classification using the mean values. Although the research does not report any accuracy attained, it affirms the impact and contribution of computer-aided diagnosis as an assisting tool to guide radiologists and doctors in reaching an accurate and timely diagnosis decision with respect to tuberculosis detection.
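To make one of the preprocessing steps named above concrete, the following is a minimal sketch of global histogram equalization on a grayscale image stored as a nested list. It is an illustration under stated assumptions, not the implementation used in [14]: the function name `equalize_histogram`, the 256-level default, and the list-of-lists image format are all hypothetical choices made for this example.

```python
def equalize_histogram(image, levels=256):
    """Return a histogram-equalized copy of a 2-D list of integer gray levels.

    ASSUMPTION: pixel values are integers in the range [0, levels - 1].
    """
    pixels = [value for row in image for value in row]

    # Count how often each gray level occurs.
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: there is no contrast to spread out.
        return [row[:] for row in image]

    # Map each gray level so that the output levels cover the full range.
    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[value] for value in row] for row in image]


if __name__ == "__main__":
    # A tiny low-contrast image with four distinct gray levels.
    print(equalize_histogram([[10, 20], [30, 40]]))
```

For the four-level example, the narrow input range 10 to 40 is spread over the full 0 to 255 range, which is the contrast-enhancing effect that makes this step useful before segmentation.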

The study carried out in [11] presented proposals for improving feature extractors for disease detection using pre-trained CNNs. This research is notable for combining multiple-instance learning algorithms with pre-trained CNNs, assessing classifiers trained on the extracted features, and comparing the performance of existing models for extracting features from the radiograph dataset. It achieved an accuracy of 82.6%.

An experiment was conducted in [15] to compare the performance of three different methods in detecting tuberculosis in the lungs. The results show that the K-nearest neighbor (KNN) classifier achieved the highest accuracy of 80%, followed by simple linear regression with 79% and the sequential minimal optimizer with 75%.

An approach to detecting tuberculosis from a radiograph taken with the X-ray machine placed behind the patient (posteroanterior) was presented in [12]. This study employs a graph cut segmentation approach to extract the lung region from the chest X-ray and then computes sets of features (edge, texture, and shape) that allow classification into abnormal and normal classes.

A CNN model for classifying different manifestations of tuberculosis was presented in [19]. This work looked at unbalanced and less categorized X-ray scans and incorporated cross-validation with sample shuffling in training the model. It reportedly obtained 85.6% accuracy in classifying the various manifestations of tuberculosis using the Peruvian dataset, which comprises 4701 samples in total: about 4248 samples labeled as abnormal, containing six manifestations of tuberculosis, and 453 samples labeled as normal. The authors affirm that their model surpasses previous classification models and is promising for tuberculosis diagnosis.
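The class counts quoted above (4248 abnormal against 453 normal) show why shuffling and careful fold construction matter for unbalanced data. The sketch below illustrates one common way to build shuffled, class-balanced cross-validation folds. It is not the procedure of [19], whose details are not given here; the function name `stratified_shuffled_folds`, the choice of five folds, and the fixed seed are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_shuffled_folds(labels, n_folds, seed=0):
    """Split sample indices into n_folds folds that preserve class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share per class.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


if __name__ == "__main__":
    # Class counts taken from the text: 4248 abnormal (1) and 453 normal (0).
    labels = [1] * 4248 + [0] * 453
    for fold in stratified_shuffled_folds(labels, n_folds=5):
        n_normal = sum(1 for i in fold if labels[i] == 0)
        print(len(fold), n_normal)
```

With these counts, every fold receives 90 or 91 normal samples, so no validation fold is left without the minority class.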

The research presented in [13] is one of the first to employ deep learning techniques on medical images. The work was based on the popular AlexNet architecture and transfer learning, and evaluated system performance on different datasets. The cross-dataset performance analysis shows a system accuracy of 67.4% on the Montgomery dataset and 83.7% on the Shenzhen dataset.

In [25], the authors participated in the 2019 ImageCLEF challenge and presented a deep learner model (LungNet) that focuses on the automatic analysis of tuberculosis from computed tomography (CT) scans. The CT scans are first decompressed and the slices extracted, giving 512 images for the X and Y dimensions and between 40 and 250 images for the Z dimension. Filters were then applied to eliminate the slices that do not contain the information required for classifying the samples. The proposed LungNet, along with the ResNet-50 architecture, was employed as a deep learner whose outputs are regarded as the preliminary results. The deep learner model was trained on 70% and 50% training sets and achieved AUC performance of 63% and 65% for the ImageCLEF CT report and severity scoring tasks, respectively. Although these results were not the best presented in the challenge, the authors believed that advanced pre-processing techniques such as data augmentation and masking could provide better performance.

CheXNeXt is an algorithm developed by the authors of [26] for the identification of 14 different pathologies in chest X-rays. The algorithm, which employs a convolutional neural network approach, was validated on the NIH dataset, and its results were compared with the interpretations of 9 professional radiologists. The results show that CheXNeXt achieved performance equivalent to the radiologists on 10 pathologies, better performance on 1 pathology (atelectasis), attaining 0.862 AUC, and underperformed on 3 pathologies. The algorithm took less than 2 min to identify the various pathologies, while the radiologists took about 240 min.

The study in [27] was conducted to assess the detection accuracy of qXR, a computer-aided diagnosis software based on convolutional neural networks. The authors used microbiologically confirmed lung tuberculosis images as the reference standard and used the kappa coefficient along with confidence intervals as the statistical tools to analyse the data and examine the inter-rater reliability of radiologists in detecting certain lung abnormalities. The study also used radiologist interpretation as a standard to validate the detection accuracy of qXR by generating ROC curves and calculating AUC. The qXR system achieved 0.81 AUC for detection of lung tuberculosis, with 71% sensitivity and 80% specificity.
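For readers less familiar with the statistics named above, the following sketch shows how sensitivity, specificity, and Cohen's kappa are computed from 2x2 count tables. The counts in the usage example are invented so that the sensitivity and specificity match the 71% and 80% figures quoted in the text; they are not data from [27], and the function names are hypothetical.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly positive cases that were detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly negative cases that were correctly cleared."""
    return tn / (tn + fp)


def cohen_kappa(both_pos: int, a_pos_b_neg: int, a_neg_b_pos: int, both_neg: int) -> float:
    """Chance-corrected agreement between two raters A and B on a binary label."""
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    observed = (both_pos + both_neg) / n

    # Agreement expected by chance, from each rater's marginal rates.
    a_pos = (both_pos + a_pos_b_neg) / n
    b_pos = (both_pos + a_neg_b_pos) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # HYPOTHETICAL counts: 100 positive and 100 negative reference cases.
    print(f"sensitivity = {sensitivity(tp=71, fn=29):.2f}")
    print(f"specificity = {specificity(tn=80, fp=20):.2f}")
    print(f"kappa       = {cohen_kappa(71, 20, 29, 80):.2f}")
```

Kappa is lower than raw agreement because it discounts the agreement two raters would reach by chance alone, which is why it is preferred for inter-rater reliability.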

Most existing models basically report their accuracy without showing the predicted samples. In this work, however, we present a model evaluated on the Shenzhen dataset that provides improved accuracy and shows the model's predictions validated against the ground-truth data. The ground-truth data will ultimately assist us in further developing a tuberculosis diagnosis system to be deployed to health facilities in developing countries to improve the quality and timeliness of diagnosis.

3 Methods and Techniques

3.1 Datasets

The dataset used in training our model is the Shenzhen tuberculosis dataset. This dataset is specific to tuberculosis and is publicly available for research purposes. The Shenzhen dataset is made up of 336 abnormal samples labeled as “1” and 326 normal samples labeled as “0”. All samples are of size 3000 by 3000 pixels and are saved in portable network graphics (PNG) file format, as shown in Fig. 1. The dataset is accompanied by clinical readings that give details about each sample with respect to sex, age, and diagnosis. The dataset can be accessed at https://lhncbc.nlm.nih.gov/publication/pub9931.
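As a small illustration of how the labels described above might be handled in practice, the sketch below parses a class label from a file name and computes the accuracy of a trivial majority-class predictor. The file naming pattern (a name ending in `_0.png` or `_1.png`, such as `CHNCXR_0001_0.png`) is an assumption about the dataset layout and is not stated in the text; the helper names are hypothetical.

```python
import re

# Class counts as reported in the text for the Shenzhen dataset.
N_ABNORMAL = 336  # labeled "1"
N_NORMAL = 326    # labeled "0"

# ASSUMPTION: file names end in "_<label>.png", e.g. "CHNCXR_0001_0.png".
_LABEL_PATTERN = re.compile(r"_([01])\.png$")


def label_from_filename(filename: str) -> int:
    """Return 1 (abnormal) or 0 (normal) parsed from the file name suffix."""
    match = _LABEL_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"no label suffix found in {filename!r}")
    return int(match.group(1))


def majority_baseline_accuracy(n_abnormal: int, n_normal: int) -> float:
    """Accuracy of a classifier that always predicts the larger class."""
    return max(n_abnormal, n_normal) / (n_abnormal + n_normal)


if __name__ == "__main__":
    print(label_from_filename("CHNCXR_0001_0.png"))
    print(f"{majority_baseline_accuracy(N_ABNORMAL, N_NORMAL):.3f}")
```

Because the two classes are nearly balanced, always predicting "abnormal" would be right only about half the time (336 of 662 samples), which gives a useful reference point when reading accuracy figures on this dataset.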