
A research on medical image segmentation with deep learning


Academic year: 2023


Furthermore, size-invariant semantic segmentation is one of the important problems in medical image segmentation.

Motivations

In recent years, deep learning models have been widely applied and popularized in computer vision. In particular, deep learning has advanced the accuracy and robustness of medical image segmentation for various anatomies and diseases.

Contributions

CNN

General segmentation architectures

  • FCN
  • U-Net
  • Cascaded U-Net
  • No-new-U-Net (nnU-Net)

To describe the 2D network, its contracting path consists of the repeated application of two 3 x 3 convolutions (unpadded convolutions), each followed by a rectified linear unit (ReLU) and a 2 x 2 max pooling operation with stride 2 for downsampling. The first step, after all, is to determine the approximate location and shape of the target in the entire image.
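As a minimal sketch of the contracting step described above, written in PyTorch (not the thesis's exact implementation; the class name and channel arguments are illustrative):

    import torch
    import torch.nn as nn

    class DownBlock(nn.Module):
        """One contracting step: two 3x3 unpadded convs + ReLU, then 2x2 max pool."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.convs = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3),  # unpadded: spatial size shrinks by 2
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, kernel_size=3),
                nn.ReLU(inplace=True),
            )
            self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # halves the resolution

        def forward(self, x):
            skip = self.convs(x)  # kept for the skip connection to the expanding path
            return self.pool(skip), skip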

Figure 2.2 FCN architectures.

Evaluation metrics
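The studies below report the Dice similarity coefficient (DSC), Jaccard similarity coefficient (JSC), and Hausdorff distance (HD). A minimal sketch of how these are typically computed from binary masks (not the thesis's exact code; scipy's directed_hausdorff is used for HD):

    import numpy as np
    from scipy.spatial.distance import directed_hausdorff

    def dice_jaccard(pred, gt):
        """DSC and JSC between two binary masks (assumed non-empty)."""
        pred, gt = pred.astype(bool), gt.astype(bool)
        inter = np.logical_and(pred, gt).sum()
        dsc = 2.0 * inter / (pred.sum() + gt.sum())
        jsc = inter / np.logical_or(pred, gt).sum()
        return dsc, jsc

    def hausdorff(pred_pts, gt_pts):
        """Symmetric HD between two point sets (e.g., mask contour coordinates)."""
        return max(directed_hausdorff(pred_pts, gt_pts)[0],
                   directed_hausdorff(gt_pts, pred_pts)[0])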

Glioblastoma in brain MRI

First, the initial segmentation of the brain tumor was performed from scratch using a conventional image processing method. For detection of the initial tumor region, we divided the data into eight sub-regions per subject, as illustrated in the sketch below.
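A minimal sketch of such an eight-way division of a 3D volume into octants; the exact division used in the study may differ:

    import numpy as np

    def split_into_octants(volume):
        """Split a 3D volume (z, y, x) into 8 sub-volumes by halving each axis."""
        z, y, x = (s // 2 for s in volume.shape)
        return [volume[zs, ys, xs]
                for zs in (slice(None, z), slice(z, None))
                for ys in (slice(None, y), slice(y, None))
                for xs in (slice(None, x), slice(x, None))]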

Figure 3.1 Sub-regions of GBM [47]. (a) Tumor core visible in T2, (b) Enhancing tumor structures visible in T1c (blue) surrounding the cystic/necrotic components of the core (green), (c) Segmentations are combined to generate the final labels of the tumor.

Infarct in brain MRI

Because ischemic infarcts of various shapes and sizes can occur anywhere in the brain, infarct segmentation must capture local information along with global context. A lesion showing high signal intensity on the b1000 image and low signal intensity on the apparent diffusion coefficient (ADC) map is a well-known imaging presentation of acute infarction. Given the importance of lesion segmentation for follow-up, several semi-automatic or automatic stroke segmentation methods have been presented previously.

In addition, the previously described methods often suffer from the normalization problems typical of non-quantitative imaging, which is especially important for multicenter datasets that are usually collected with different imaging parameters. Radiologists delineated in detail the maximum visible extent of the high-signal-intensity infarcted lesion on the b1000 image. Instance segmentation identifies each instance of every known object in the image and assigns a label to each of its pixels.

In ROI pooling, fixed-size windows are taken from the feature map, and those features are used to obtain the final class label and bounding box.
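A minimal sketch of ROI pooling using torchvision's built-in operator; the feature map, box, and scale below are illustrative placeholders:

    import torch
    from torchvision.ops import roi_pool

    # Illustrative inputs: one feature map of shape (N, C, H, W) from a backbone,
    # and one region proposal given as (batch_index, x1, y1, x2, y2) in image coords.
    features = torch.randn(1, 256, 50, 50)
    boxes = torch.tensor([[0, 64.0, 64.0, 192.0, 192.0]])

    # Pool each ROI to a fixed 7x7 window; spatial_scale maps image coordinates
    # (e.g., a 400 px image) onto the 50x50 feature map (50 / 400 = 0.125).
    pooled = roi_pool(features, boxes, output_size=(7, 7), spatial_scale=0.125)
    print(pooled.shape)  # torch.Size([1, 256, 7, 7])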

Figure 3.5 Examples of the results of infarct segmentation. (a) Original image, (b) Ground truth, (c) Results of the application of DSC+Focal loss network in the second stage.

Supine-prone tissues in breast MRI

However, these methods are usually limited by the characteristics of the MR images used in the study datasets. Moreover, because the cancer and surrounding tissue in the breast deform significantly between the supine and prone positions, a regular registration algorithm does not work. Although the breast images were acquired with an MR scanner, image normalization was required to correct the image intensities.

We then normalized each image by subtracting the mean image intensity from every pixel and dividing by the standard deviation (SD) of the intensities. Consequently, the finely detailed information captured in the contracting part of the network is reused in the expanding part. The DSC, JSC, and HD of the breast and surrounding-tissue segmentation results for each method are given in Table 1 and its supplement, respectively.
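This is standard z-score normalization; a minimal sketch, assuming per-image statistics:

    import numpy as np

    def zscore_normalize(image):
        """Z-score normalization: subtract the mean intensity, divide by the SD."""
        mean, sd = image.mean(), image.std()
        return (image - mean) / sd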

Training on the supine and prone data together yielded lower performance than training on each posture separately.

Figure 3.6 Overall procedure of semantic segmentation in supine and prone breast MRI.

Kidney substructures with Renal Cell Carcinoma (RCC) in kidney CT

There were four phases in the CT scans: the non-contrast phase, the renal cortical phase, the renal parenchymal phase, and the renal excretory phase. The image data were de-identified in accordance with the Privacy Rule of the Health Insurance Portability and Accountability Act. The cascaded architecture was designed to improve segmentation performance by using a region proposal network (RPN) before segmentation within the available GPU memory.

Second, the predictions of the trained network on the additional data were corrected manually instead of creating new ground truths (GTs) from scratch. Third, all of the originally used and newly added data were reused for subsequent training. The concatenation of results leads to improved segmentation by preventing information loss.

Subsequently, the second U-Net module for final segmentation was trained to create meshes for five subclasses of the kidney.
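Mesh creation from the predicted label volumes can be done with marching cubes; a minimal sketch using scikit-image (label values and voxel spacing are illustrative, not the thesis's exact pipeline):

    import numpy as np
    from skimage import measure

    def label_to_mesh(label_volume, label_value, spacing=(1.0, 1.0, 1.0)):
        """Extract a surface mesh (vertices, faces) for one kidney subclass."""
        mask = (label_volume == label_value).astype(np.uint8)
        verts, faces, normals, values = measure.marching_cubes(
            mask, level=0.5, spacing=spacing)
        return verts, faces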

Figure 3.9 Kidney and RCC label image example.

Pancreatic cancer in pancreas CT

In testing for model overfitting, the difference in total DSC accuracy between the validation and test datasets of the final model was 6.17, indicating that the model was not overfitting. For this study, we obtained a CT image dataset of 862 patients, 360 with solid lesions such as cancer and 142 with cystic lesions. CT imaging was performed with a SOMATOM 64 scanner (Siemens AG, Healthcare Division, Erlangen, Germany) with the following parameters: craniocaudal abdominal scan at 120 kV; pitch, 0.9; collimation, 0.6 mm; interslice spacing, 5 mm; and a soft reconstruction kernel.

A board-certified radiologist with 10 years of experience (H.J.K.) collected the CT data and manually segmented the pancreas on the consensus CT images. We developed a 3D detection network (a U-Net with squeeze-and-excitation blocks) to regress the locations of pancreatic regions; in the second step, we applied fine segmentation using a 2D segmentation network that segments the pancreas in a cascaded manner based on the detection results. If the model cannot generalize to slices that deviate severely along the z-axis, the focal lesions at the boundary of the pancreatic volume in that direction will most likely be segmented inaccurately, resulting in critical errors.
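A minimal sketch of this two-step cascade, assuming hypothetical detect_3d and segment_2d callables (the thesis's actual networks are not reproduced here):

    import numpy as np

    def cascaded_segmentation(volume, detect_3d, segment_2d, margin=8):
        """Step 1: coarse 3D detection; step 2: fine 2D segmentation per slice."""
        # detect_3d is assumed to return a bounding box (z0, z1, y0, y1, x0, x1).
        z0, z1, y0, y1, x0, x1 = detect_3d(volume)
        # Expand the box by a safety margin and clip to the volume bounds.
        z0, y0, x0 = (max(0, v - margin) for v in (z0, y0, x0))
        z1, y1, x1 = (min(s, v + margin) for s, v in zip(volume.shape, (z1, y1, x1)))
        crop = volume[z0:z1, y0:y1, x0:x1]
        # Fine segmentation slice by slice with the 2D network.
        mask_crop = np.stack([segment_2d(sl) for sl in crop], axis=0)
        # Paste the result back into a full-size mask.
        mask = np.zeros_like(volume, dtype=np.uint8)
        mask[z0:z1, y0:y1, x0:x1] = mask_crop
        return mask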

We calculated the accuracy of the pancreas and focal-lesion segmentation results for abdominal CT.

Figure 3.10 Pancreas and focal lesions image example.

Multi-structures in dental CBCT

These advantages enable CBCT to replace CT for 3D imaging and computer-aided surgical simulation (CASS) modeling. Segmentation of multiple facial structures is essential for dental implants and orthognathic surgeries to create safety margins around the facial nerves and the surgical lines in the facial bones. However, segmentation of multiple facial structures is extremely challenging in a voxel-by-voxel approach owing to structural irregularities, complex shapes, and the heterogeneous image contrast of CBCT in particular.

Additionally, the training and validation datasets for hard tissue, maxillary sinus, and mandible include 7 and 4 cases, 20 and 4 cases, and 20 and 4 cases, respectively. The gold standard was produced differently depending on the structure. The hard tissues and mandibular canals were drawn by hand and confirmed by an expert dentist.

The initial mandibular and maxillary sinus masks were created with in-house software using conventional image processing techniques, including 3D sculpting and thresholding.
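A minimal sketch of such a thresholding-based initial mask, assuming an Otsu threshold plus simple morphological clean-up (the actual in-house pipeline is not specified):

    import numpy as np
    from skimage.filters import threshold_otsu
    from skimage.morphology import binary_closing, ball

    def initial_mask(volume):
        """Threshold a CBCT volume and close small holes to get a rough mask."""
        t = threshold_otsu(volume)
        mask = volume > t
        return binary_closing(mask, ball(2))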

Strategy for medical image segmentation

Therefore, various augmentation methods, including random rotation, flipping, and resizing, were performed to introduce variation into the dataset and train more robust deep learning based models. The augmentation methods were chosen and controlled depending on the characteristics of each lesion or organ and the clinical context. The number of test subjects depends on the difficulty of labeling and the prevalence of the disease.
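A minimal sketch of such augmentations on a 2D image-mask pair using numpy and scipy (angle ranges and scales are illustrative):

    import numpy as np
    from scipy.ndimage import rotate, zoom

    def augment(image, mask, rng=np.random.default_rng()):
        """Randomly rotate, flip, and resize an image together with its label mask."""
        angle = rng.uniform(-15, 15)                        # random rotation
        image = rotate(image, angle, reshape=False, order=1)
        mask = rotate(mask, angle, reshape=False, order=0)  # nearest for labels
        if rng.random() < 0.5:                              # random horizontal flip
            image, mask = image[:, ::-1], mask[:, ::-1]
        scale = rng.uniform(0.9, 1.1)                       # random resize
        image = zoom(image, scale, order=1)
        mask = zoom(mask, scale, order=0)
        return image, mask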

Owing to variation in target size, shape, and location, as well as in image reconstruction protocol and modality, medical image segmentation is considered one of the most difficult tasks. Considering all of these factors, a deep learning based automatic segmentation (DLAS) model should be selected, trained, and evaluated. In the case of large images and small target objects, the cascade method is recommended for increasing accuracy, because the candidate area obtained at the first segmentation stage provides the fine structure at high resolution.

Based on the results and the issues raised, these studies were extended to more advanced ones, such as studies focusing on more balanced segmentation with different-level labels, robust feature extraction for radiomics, and smart labeling with a human in the loop.

Figure 4.1 Flowchart for strategy in medical image segmentation.

Smart labeling with human in the loop

The calculation of the RMS is given in equation (3), where x is the difference between corresponding points in the two models and n is the total number of points. The comparison of segmentation times for the five substructures between manual and CNN-corrected segmentation is listed in Table 3. CNN-corrected segmentation reduced the time for artery segmentation by 19 minutes 8 seconds, and the times for vein, ureter, parenchyma, and RCC by 12 minutes 1 second, 19 minutes 23 seconds, 8 minutes 20 seconds, and 17 minutes 8 seconds, respectively, for a total segmentation time reduction of 76 minutes, more than half the time required for manual segmentation.
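Equation (3) is not reproduced in this excerpt; under the stated definitions it is the standard root mean square, sketched here:

    import numpy as np

    def rms(x):
        """Root mean square of the point-wise differences x between two models."""
        x = np.asarray(x, dtype=float)
        return np.sqrt(np.mean(x ** 2))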

Apart from the initial loading time, the CNN segmentation took less than 1 second per case. The results of the CNN-corrected segmentation are observed to be very similar to those of manual segmentation, while they do not match those of the uncorrected CNN segmentation. The mean DSC values for the five subclasses increased at the completion of each phase.

In addition, the final results of the last-stage segmentation were found to be better than those of nnU-Net on our dataset.

Fine-tuning with different level labels in imbalanced datasets

This may have been caused by the limitations of patch-level labels and subclass imbalance. The U-Net-based semantic segmentation was trained on the classification results of the SVM classifier. This could also be caused by imbalance at the subclass image level.

The U-Net-based semantic segmentation was fine-tuned using image-level labels from 46 HRCT images of the training set. These agreements were significantly higher than those of the SVM classifier and of the U-Net-based semantic segmentation before fine-tuning. As shown in Table 4.2, the performance of the fine-tuned model on the six pattern classifications is significantly higher than that of the model before fine-tuning.
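One common way to fine-tune a segmentation network with image-level labels is to aggregate the pixel-wise class scores into an image-level prediction and backpropagate a classification loss; a minimal sketch under that assumption (the excerpt does not spell out the exact mechanism):

    import torch
    import torch.nn as nn

    def image_level_finetune_step(model, image, image_label, optimizer):
        """Fine-tune a pixel-wise classifier using only an image-level label."""
        logits = model(image)                   # (N, num_classes, H, W) pixel scores
        image_logits = logits.mean(dim=(2, 3))  # aggregate to (N, num_classes)
        loss = nn.functional.cross_entropy(image_logits, image_label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()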

In the case of usual interstitial pneumonia (UIP), honeycombing covers significant areas of the lung on HRCT.

Figure 4.2 Typical ROIs in six image patterns of diffuse interstitial lung disease: (a) Normal parenchyma, (b) Ground-glass opacity, (c) Reticular opacity, (d) Honeycombing, (e) Emphysema, (f) Consolidation.

Comparison between deep learning based and human segmentations in radiomics


Figures

Figure 2.1 An illustration of the architecture of our CNN.
Figure 2.2 FCN architectures.
Figure 2.3 U-Net architecture.
Figure 2.4 Cascaded U-Net methods: (a) Step 1 of cascaded U-Net: the first U-Net learns to segment the location of the pancreas from whole CT images, (b) Step 2 of cascaded U-Net: the second U-Net learns to fine-segment pancreas lesions based on step 1 of the cascade.

References

Chen, X., et al., Anatomy-Regularized Representation Learning for Cross-Modality Medical Image Segmentation. IEEE Trans Med Imaging, 2020.
Zhao, Z.Q., et al., Object Detection With Deep Learning: A Review. IEEE Transactions on Neural Networks and Learning Systems, 2019.
Black, K.M., et al., Deep learning computer vision algorithm for detecting kidney stone composition. BJU Int, 2020.
Villalba-Diez, J., et al., Deep learning for industrial computer vision quality control in the printing industry 4.0. Sensors (Basel), 2019.
Wang, G., et al., Interactive Medical Image Segmentation Using Deep Learning with Image-Specific Fine Tuning. IEEE Trans Med Imaging, 2018.
Menze, B.H., et al., The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Trans Med Imaging, 2015.
Milenkovic, J., et al., Automated breast region segmentation in axial breast MR images. Comput Biol Med, 2015.
Dalmis, M.U., et al., Using deep learning to segment breast and fibroglandular tissue in MRI volumes. Med Phys, 2017.
Zhang, L., et al., Automated deep learning method for whole-breast segmentation in diffusion-weighted breast MRI. J Magn Reson Imaging, 2020.