Nazarbayev University Repository
Academic year: 2023
Breast cancer is the most common type of cancer, with more than 2.2 million cases reported in 2020. Breast cancer treatment can be highly effective, especially when the disease is detected at an early stage. Today, scientists offer many medical-imaging solutions for identifying the type of tumor at an early stage.

Breast cancer (BC) is one of the most common cancers in women and the leading cause of cancer-related mortality among women. According to World Health Organization (WHO) statistics, breast cancer was diagnosed in 2.3 million women in 2020, of whom 685,000 died of the disease. In the structure of oncological morbidity, breast cancer ranks first (13.2%), followed by lung cancer (10.4%) and colorectal cancer (9.6%) [3].

As mentioned above, the breast cancer death rate has risen year after year. In practice, mammography is widely used for tumor screening, particularly for diagnosing breast tumors, but unfortunately the analysis of mammography images cannot identify the tumor type (malignant or benign) with certainty [4]. The main motivation for this work is to support early identification of breast cancer and reduce human error, because early detection can help a person overcome the disease or prolong their life.

In this work, we implement and adapt pre-trained CNN architectures on the BreakHis dataset and improve the models using parameter fine-tuning and data augmentation techniques on tissue images of poor quality and resolution. Six CNN architectures (MobileNetV2, ResNet50, Inception-ResNet-v2, VGG16, Inception-v3, and EfficientNetV2) were trained on five datasets.

Histopathology

CNN architectures for image classification

Histopathological image analysis includes calculations performed at different magnifications (4x, 10x, 20x, and 40x) for multivariate statistical analysis, diagnosis, and classification. Kothari et al. [21] presented a version of histogram normalization in which color normalization is based on the presence of colors rather than their frequency. As a result, a considerable amount of work has gone into the topic of BC histopathology image analysis, particularly the automatic categorization of benign and malignant images for computer-aided diagnosis.

The authors of [26] provide a BC diagnostic method that uses cytological images of fine-needle biopsies to distinguish between benign and malignant samples. The authors of [28] conducted a study using a variety of state-of-the-art CNNs with a fine-tuning technique (ResNet-50, Xception, SeResNet-18, SeResNet-34, VGG-16, Inception-V3, Inception-ResNet-V2). Furthermore, to identify a better CNN design, the authors suggest a surrogate model architecture based on a tree-structured estimator optimization.

They recruited 57 subjects from the dynamic protocol of the DMR-IR database, 19 of whom were healthy and 38 diseased. To enrich the data, the authors used vertical and horizontal flips, rotations between 0 and 45 degrees, 20 percent magnification, and normalized noise. Their best performance using state-of-the-art CNNs was an F1 score of 0.91 with SeResNet-18.

The BreaKHis dataset contains microscopic biopsy images of benign and malignant breast tumors; samples are prepared from breast tissue biopsy slides stained with hematoxylin and eosin (HE). The camera uses a HAD (Hole-Accumulation Diode) interline-transfer charge-coupled device with a pixel size of 6.5 µm and a resolution of 752 x 582 pixels [29]. The camera is set to automatic exposure, and focusing on the microscope is done manually while viewing the digital image on the computer screen.

The effective pixel size is calculated by dividing the camera's physical pixel size (6.5 µm) by the product of the relay-lens magnification (3.3) and the objective-lens magnification. The original photos include black borders on both the left and right sides, as well as text notes in the upper-left corner. Breast tumors, both benign and malignant, can be classified into a number of categories depending on the appearance of the tumor cells under the microscope [30].

Table 3.1: The acquisition system’s magnification and digital resolution. Columns: Visual magnification, Objective lens, Effective pixel size (µm)
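The relationship described above can be checked with a short script. The objective magnifications (4x, 10x, 20x, 40x), the 6.5 µm sensor pixel size, and the 3.3x relay-lens factor come from the text; the resulting effective pixel sizes are derived values, not copied from Table 3.1.

```python
# Effective pixel size = camera pixel size / (relay lens x objective lens),
# using the 6.5 um sensor pixels and the 3.3x relay lens from the text.
CAMERA_PIXEL_UM = 6.5
RELAY_LENS = 3.3

def effective_pixel_size(objective: float) -> float:
    """Effective pixel size in micrometers for a given objective magnification."""
    return CAMERA_PIXEL_UM / (RELAY_LENS * objective)

for objective in (4, 10, 20, 40):
    print(f"{objective}x objective -> {effective_pixel_size(objective):.3f} um/pixel")
```

For example, the 40x objective gives roughly 0.049 µm per pixel.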

Data Augmentation

  • Flipping
  • Cropping
  • Rotation
  • Noise injection
  • Colour space transformations

In addition, these methods help address the class imbalance in the breast histopathology images caused by the large number of ductal carcinoma samples. This implies the use of cropping, rotation, flipping, and other augmentation methods to enrich the histopathological images fed to the models. Note that flips and rotations are not label-preserving transformations for text recognition datasets such as SVHN or MNIST [31].

Color space transformation techniques change pixel values and include options such as brightness, contrast, saturation, hue, and blur. Illumination is one of the most common problems encountered in image recognition tasks, and it occurs frequently in the analysis of medical data as well. As a result, color space transformation is considered key to solving this problem, quickly correcting dark or bright images by decreasing or increasing their pixel values by a constant [31].
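A few of the augmentation steps listed above (flipping, rotation, noise injection, brightness jitter) can be sketched with plain NumPy array operations. This is an illustrative minimal version, not the exact pipeline used in the experiments; the probability and jitter ranges are assumptions.

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a few simple label-preserving augmentations to an H x W x C image."""
    out = img
    if rng.random() < 0.5:                    # horizontal flip
        out = out[:, ::-1]
    k = rng.integers(0, 4)                    # rotation by 0/90/180/270 degrees
    out = np.rot90(out, k)
    out = out + rng.normal(0, 5, out.shape)   # noise injection
    out = out * rng.uniform(0.8, 1.2)         # brightness jitter
    return np.clip(out, 0, 255)               # keep values in valid pixel range

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, (224, 224, 3))
aug = augment(img, rng)
print(aug.shape)
```

Each call produces a different randomized variant of the input, which is how augmentation enlarges the effective training set.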

Figure 3-2: The steps of implementing data augmentation techniques: A - original image, B - image resized to 224 x 224, C - horizontally flipped, D - rotated, E - cropped, F - color space transformation (brightness, contrast, hue, and saturation)

Deep Neural Network Models

As the network progresses from one stage to the next, the input size is halved and the channel width is doubled. Inception-v3 is a convolutional neural network design from the Inception family that uses label smoothing, factorized 7 x 7 convolutions, and an auxiliary classifier to propagate label information down the network (along with batch normalization for layers in the side head) [34]. The Inception-ResNet-v2 convolutional neural network was trained on more than a million photos from the ImageNet collection.
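Label smoothing, mentioned above, replaces hard one-hot targets with a slightly softened distribution so the network is not trained toward fully confident predictions. A minimal sketch for K classes; the smoothing factor eps = 0.1 is a common default, not a value stated in the text.

```python
def smooth_labels(one_hot: list[float], eps: float = 0.1) -> list[float]:
    """Blend a one-hot target with the uniform distribution over K classes."""
    k = len(one_hot)
    return [(1 - eps) * y + eps / k for y in one_hot]

# For a binary benign/malignant target [0, 1] with eps = 0.1,
# the smoothed target is approximately [0.05, 0.95].
print(smooth_labels([0.0, 1.0]))
```

The smoothed vector still sums to 1, so it remains a valid target distribution for a cross-entropy loss.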

This 164-layer network can recognize photos from 1,000 different object categories, including a keyboard, mouse, pencil, and 23 types of animals [35]. ImageNet is one of the most commonly used datasets, and researchers have built on it a CNN-based training model called VGG16. VGG16 processes images with convolutional layers using small 3 x 3 receptive fields, stacking them to cover the effective receptive field of larger convolutions such as 7 x 7.

In general, the task of using these pre-trained architectures was to create three fully connected layers on top of the pre-trained models and to modify the models through weight updates by training the resulting model structures on the BreakHis dataset.
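The three added layers can be illustrated with a minimal NumPy forward pass over frozen backbone features. The feature dimension (1280) and the layer widths here are illustrative assumptions, not values stated in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(0, x)

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

# Three-layer classification head on top of frozen backbone features:
# feature vector -> dense(256, relu) -> dense(64, relu) -> dense(2, softmax).
W1, b1 = rng.normal(0, 0.05, (1280, 256)), np.zeros(256)
W2, b2 = rng.normal(0, 0.05, (256, 64)), np.zeros(64)
W3, b3 = rng.normal(0, 0.05, (64, 2)), np.zeros(2)

def head(features: np.ndarray) -> np.ndarray:
    h = relu(features @ W1 + b1)
    h = relu(h @ W2 + b2)
    return softmax(h @ W3 + b3)

probs = head(rng.normal(0, 1, 1280))
print(probs)  # two class probabilities (benign, malignant) summing to 1
```

In the actual experiments only these head weights and the unfrozen backbone layers would be updated during fine-tuning.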

Figure 3-4: The Architecture of ResNet 50

Training of the CNN models

In this way we obtain a newly trained architecture that we can modify using different optimization techniques, data augmentation methods, and so on. In total, 30 models were created, and many experiments were carried out to find the best parameters for these models. Moreover, the quality of the models was evaluated using accuracy, precision, recall, and F1-score statistics, computed from the confusion matrix of true and predicted outcomes.
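The four metrics can be computed directly from the confusion-matrix counts. A minimal sketch; the counts below are made-up illustrative numbers, not results from Table 4.1.

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # of predicted positives, how many were right
    recall = tp / (tp + fn)             # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for a binary benign/malignant classifier:
print(metrics(tp=90, fp=10, fn=5, tn=95))
```

For imbalanced histopathology classes, precision, recall, and F1 are more informative than accuracy alone, which is why all four are reported.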

At the beginning of model building, we used a plain 5-layer neural network, which achieved an average result of around 80%; after adding layers to improve the result, accuracy fluctuated between 80 and 85%, which was not sufficient. We therefore decided to use pre-trained models and improve them by using various optimizers, changing the models' parameter settings, and applying data augmentation methods, which in turn yield models with greater accuracy. The pre-trained models initially averaged 83-84%, and by adding augmentation methods and changing model parameters we increased this to an average of 88-89%.

Initially, Adam was used as the optimizer in the models; we then switched to SGD, keeping only the best result across epochs and adding cross-validation. This experiment was successful and increased the results by 5-7%. By experimenting with different models and parameters, we concluded that 12 epochs would be used with a batch size of 64 and the SGD optimizer.
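The SGD update that replaced Adam can be sketched in a few lines for a single weight. The learning rate and momentum values here are illustrative assumptions; the text does not state them.

```python
def sgd_step(w: float, grad: float, velocity: float,
             lr: float = 0.01, momentum: float = 0.9) -> tuple[float, float]:
    """One SGD-with-momentum update for a single weight; returns (weight, velocity)."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

w, v = 1.0, 0.0
w, v = sgd_step(w, grad=2.0, velocity=v)
print(w, v)  # -> approximately 0.98, -0.02
```

Unlike Adam, plain SGD applies the same learning rate to every weight, which often generalizes better on small datasets at the cost of slower convergence.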

Accuracy, Precision, Recall, and F1 Score

According to the results in Table 4.1, the Efficientnetv2, Mobilenet-v2, Resnetv2-50, and VGG16 models showed the highest results. Figure 4.4 shows the model implementation and the variation of training and validation accuracy. As the results show, the magnification factor affects model performance: the parameters set in the models and the image parameters differ across magnifications, which reduces the models' scores.

For example, at a magnification factor of 400x, almost all models show relatively low accuracy, because at 400x magnification many image characteristics appear that do not match the models' expectations. Figure 4.3 shows a confusion matrix summarizing the prediction results of the improved Efficientnetv2-b0 model. Compared with other research on the BreaKHis dataset, our proposed work achieves higher accuracy than state-of-the-art models.

We can therefore state that our model is an effective methodology for breast cancer detection. Breast cancer is the most common type of cancer in women and the leading cause of death among women aged 35 to 54. After all the experiments, we estimate that the quality of the data determines 60-70% of a model's success.

The best result was shown by EfficientNet, with 94.5% accuracy on the overall data and an average of 91.87% accuracy on test data across the four magnification factors, which is higher than the results of other state-of-the-art studies. For future work, we will implement a data-fusion methodology that compares the results of the best models to help improve accuracy, and add a generative adversarial network technique to enrich the complex images and improve model quality. Finally, as discussed earlier, these are methods for detecting malignant and benign breast tumors.

Mammography, breast ultrasound, and magnetic resonance imaging for surveillance of women at high familial risk for breast cancer. American Society of Clinical Oncology Guidelines recommendations for Sentinel lymph node biopsy in early stage breast cancer. Computational image analysis identifies histopathologic image features associated with somatic mutations and patient survival in gastric adenocarcinoma.

A comparative study of different types of convolutional neural networks for breast cancer histopathological image classification. Computer-aided diagnosis of breast cancer based on cytological image analysis of fine-needle biopsies.

Table 4.1: The results of models on overall BreaKHis data


Table 3.2: Distribution of images by magnification factor and tumor type. Columns: Magnification, Benign, Malignant, Total
Figure 3.1 shows four pictures obtained from a single slide of breast tissue containing a malignant tumor (breast cancer) at magnification factors of (a) 40x, (b) 100x, (c) 200x, and (d) 400x. The dataset was divided into three groups: train (80%
Figure 3-1: Sample Images of each magnification level from BreaKHis database
