
1.4 Literature Survey

1.4.2 Steganalysis Schemes

al. [48] introduced SteganoGAN to hide/extract binary data in/from images. The methods [47, 48] did not use an extractor loss during training, which results in improper embedding and may produce visual artifacts in the generated stego images.

pixels, which enables the model to detect embedding changes in the texture regions. SRM uses various small submodels to capture these dependencies. Each submodel is created from a noise residual, $R_{ij}$, computed using high-pass filters as follows:

$$R_{ij} = \hat{X}_{ij}(N_{ij}) - c \cdot X_{ij}, \qquad (1.3)$$

where $c$ denotes the order of the residual, $N_{ij}$ is the set of neighbors of pixel $X_{ij}$ with $X_{ij} \notin N_{ij}$, and $\hat{X}_{ij}(\cdot)$ is a predictor of $c \cdot X_{ij}$ defined over $N_{ij}$. The set $\{X_{ij}\} \cup N_{ij}$ is known as the residual's support. The residuals are quantized and truncated as follows:

$$R_{ij} \leftarrow \mathrm{round}\!\left(\mathrm{trunc}_T\!\left(\frac{R_{ij}}{q}\right)\right), \qquad (1.4)$$

where $\mathrm{trunc}_T(x) = x$ if $x \in [-T, T]$ and $\mathrm{trunc}_T(x) = T \cdot \mathrm{sign}(x)$ otherwise, and $q > 0$ denotes the quantization parameter.

The truncation step limits the dynamic range of the residuals, which reduces the dimensionality of the co-occurrence matrices [14]. The quantization makes the residual more sensitive to embedding modifications in spatial discontinuities such as edges and textures. The co-occurrence matrix (of order $D = 4$) is computed from the quantized residual (with $T = 2$ and $q \in \{1, 1.5, 2\}$) of the entire image. This co-occurrence matrix is used as an SRM feature vector to train the Ensemble Classifier (EC) [17]. The EC used with SRM features consists of $L$ Fisher Linear Discriminants (FLDs) as base learners, chosen for their simplicity and fast training. The optimal value of $L$ is learned automatically during ensemble training, and the decisions of the base learners are aggregated by majority voting. SRM is one of the most successful steganalyzers; several variations of SRM, such as CRMQ1 [51] and JRM [52], were subsequently reported in the literature.
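As an illustration, the residual pipeline of eqs. (1.3)-(1.4) can be sketched for the simplest first-order submodel, where the predictor is just the right-hand neighbor and $c = 1$. The predictor choice and the pixel values below are illustrative assumptions, not SRM's full filter bank:

```python
import numpy as np

def srm_residual(img, q=1.0, T=2):
    """First-order horizontal residual R_ij = X_{i,j+1} - X_ij (c = 1),
    quantized by q and truncated to [-T, T] as in eq. (1.4)."""
    img = img.astype(np.float64)
    R = img[:, 1:] - img[:, :-1]              # predictor: right neighbor
    return np.clip(np.round(R / q), -T, T).astype(int)

img = np.array([[10, 12, 11, 11],
                [10, 15, 11, 10]])
print(srm_residual(img))
# → [[ 2 -1  0]
#    [ 2 -2 -1]]   (the raw residual 5 is truncated to T = 2)
```

In the full SRM, many such submodels (with different predictors and $q \in \{1, 1.5, 2\}$) are computed, and fourth-order co-occurrences of the quantized residuals form the feature vector.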

Though handcrafted feature-based schemes are known to detect state-of-the-art steganographic schemes successfully, their detection performance relies heavily on the design of the handcrafted filters.

1.4.2.2 Deep Feature-based Steganalysis

In recent times, deep learning methods, specifically Convolutional Neural Networks (CNNs), have shown tremendous success in many computer vision applications, such as object detection [53], object tracking [54], image classification [55], medical imaging [56], etc. This success drove researchers to explore CNNs for steganalytic classification.

Tan & Li [57] proposed a deep learning architecture based on a nine-layered Stacked Convolutional Auto-Encoder (SCAE) to simulate the SRM for steganalytic detection. The first layer consists of forty kernels initialized using the high-pass KV filter given in eq. (1.5). The method used max-pooling in the subsequent layers for dimensionality reduction. Finally, a fully-connected layer, followed by a softmax layer, is used for classification. The detection performance of the method was inferior to that of the handcrafted feature-based SRM.

This method was the first to use deep learning for steganalysis; consequently, various design considerations were not yet taken into account, such as avoiding the use of max-pooling.

$$KV = \frac{1}{12}
\begin{pmatrix}
-1 &  2 &  -2 &  2 & -1 \\
 2 & -6 &   8 & -6 &  2 \\
-2 &  8 & -12 &  8 & -2 \\
 2 & -6 &   8 & -6 &  2 \\
-1 &  2 &  -2 &  2 & -1
\end{pmatrix} \qquad (1.5)$$
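As a rough sketch (not the authors' code), preprocessing with the KV kernel of eq. (1.5) amounts to a 2-D correlation. Since the kernel's coefficients sum to zero, flat image regions are suppressed and only high-frequency noise survives; the constant test image below is illustrative:

```python
import numpy as np

# KV high-pass filter from eq. (1.5)
KV = np.array([[-1,  2,  -2,  2, -1],
               [ 2, -6,   8, -6,  2],
               [-2,  8, -12,  8, -2],
               [ 2, -6,   8, -6,  2],
               [-1,  2,  -2,  2, -1]]) / 12.0

def kv_preprocess(img):
    """Correlate an image with the KV kernel (valid padding) to
    suppress image content and expose the noise residual."""
    img = img.astype(np.float64)
    H, W = img.shape
    out = np.zeros((H - 4, W - 4))
    for i in range(H - 4):
        for j in range(W - 4):
            out[i, j] = np.sum(img[i:i+5, j:j+5] * KV)
    return out

flat = np.full((8, 8), 7.0)   # constant region: no texture
print(kv_preprocess(flat))    # ≈ 0 everywhere (KV coefficients sum to 0)
```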

Qian et al. [2] proposed a shallow CNN architecture with a Gaussian activation function, called GNCNN, to model the cover signal as 0 and the stego signal as +1 or −1. GNCNN used the KV filter in the preprocessing step to increase the signal-to-noise ratio (SNR), suppressing the image content and exposing the noise content to allow better extraction of the stego noise. GNCNN was the first CNN model to attain steganalytic detection close to the SRM. However, the shallow architecture of GNCNN might not have sufficient discriminative power for steganalytic detection.
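The Gaussian activation at the heart of GNCNN has the form $f(x) = e^{-x^2/\sigma^2}$. A minimal sketch follows; $\sigma = 1$ is an illustrative choice, not necessarily the paper's setting:

```python
import numpy as np

def gaussian_activation(x, sigma=1.0):
    """Gaussian activation: maximal at x = 0 and decaying symmetrically,
    matching a stego-noise model concentrated near 0 and ±1."""
    return np.exp(-(x ** 2) / (sigma ** 2))

x = np.array([-2.0, 0.0, 2.0])
print(gaussian_activation(x))   # peak of 1.0 at x = 0, symmetric tails
```

Unlike ReLU, this response is symmetric in the sign of its input, which suits the ±1 embedding model.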

Xu et al. [39] devised a structural design of CNN (XuNet), which includes an absolute-value (ABS) layer for better modeling of the stego noise (both negative and positive embeddings) and Batch Normalization (BN) [58] to keep the model from getting trapped in local minima during training. XuNet also used the KV filter in the preprocessing step of the network. This structural design enabled the model to attain performance competitive with the SRM.
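A toy sketch of XuNet's ABS-then-BN idea (numpy only, with no learned scale/shift, so a simplification of the actual layers):

```python
import numpy as np

def abs_bn(x, eps=1e-5):
    """ABS layer: |x| maps +1 and -1 embedding changes to the same value;
    a batch-norm step then standardizes the activations."""
    a = np.abs(x)                                # ABS layer
    return (a - a.mean()) / np.sqrt(a.var() + eps)

x = np.array([-1.0, 1.0, -3.0, 3.0])
print(abs_bn(x))   # first two entries equal: sign information is discarded
```

Discarding the sign halves the symmetry the network must otherwise learn, since ±1 embedding changes are statistically equivalent.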

Qian et al. [59] presented a transfer learning paradigm in which the model (GNCNN) is first trained on images embedded with a high payload, and the trained weights are then fine-tuned using lower payloads. The steganalytic detection performance of the model was better than that of SRM on WOW steganography at low embedding rates.

Ye et al. [60] introduced a CNN architecture (YeNet) whose layers were initialized with the thirty high-pass filters of SRM, along with a Truncated Linear Unit (TLU) activation to limit the residuals to a confined range. They also used data augmentation [61] as regularization to improve the training of the steganalyzer.
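The TLU activation is essentially a clamp; a minimal sketch, where $T = 3$ is an illustrative threshold:

```python
import numpy as np

def tlu(x, T=3.0):
    """Truncated Linear Unit: identity on [-T, T], saturating outside,
    so that noise residuals stay within a confined range."""
    return np.clip(x, -T, T)

print(tlu(np.array([-5.0, -0.5, 2.0, 7.0])))   # clamps -5 → -3 and 7 → 3
```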

Li et al. [16] proposed a steganalyzer with a diverse-activation module called ReSTNet. ReSTNet was inspired by the observation that increasing the width of the network boosts steganalytic detection performance. The ReSTNet architecture is formed by combining three pre-trained CNNs. Each CNN preprocesses the input image using a different type of filter (Gabor [62], SRM linear, and SRM non-linear filters) and also uses a distinct activation function (ReLU, Sigmoid, and TanH). This diverse design performed better than the existing approaches.

Yedroudj et al. [63] proposed a steganalyzer by fusing state-of-the-art detectors. Yedroudj-Net used the thirty filters from SRM in the preprocessing step, similar to YeNet. After preprocessing, the model uses five convolution layers for feature learning, followed by a fully-connected network.

Boroumand et al. [4] proposed an end-to-end deeper CNN model that utilizes skip connections [64] for steganalytic detection. The first seven layers of the model compute the noise residuals, and the remaining five layers perform the steganalytic detection using the computed residuals. Unlike the other existing methods, this was the first method that did not use any fixed high-pass filter for preprocessing.
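The skip connections referenced above can be illustrated with a minimal identity-shortcut sketch; the linear map here stands in for the block's convolutional layers and is purely illustrative:

```python
import numpy as np

def residual_block(x, W):
    """Identity skip connection: output = x + F(x), so gradients can
    flow through the shortcut even when F's gradient is small."""
    return x + W @ x             # F(x) is a linear stand-in here

x = np.ones(3)
W = np.zeros((3, 3))             # even a "dead" transform passes x through
print(residual_block(x, W))      # → [1. 1. 1.]
```

This shortcut path is what allows much deeper residual-computation stacks to train without the vanishing-gradient problems of plain CNNs.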