How to do it...
1. Use setOutputs() to set the output labels:
compGraphBuilder.setOutputs("predictSequence");
2. Construct an output layer using the addLayer() method and RnnOutputLayer:
compGraphBuilder.addLayer("predictSequence", new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
    .activation(Activation.SOFTMAX).nIn(10).nOut(numOfClasses).build(), "L1");
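Putting the two steps together, the graph configuration might look like the following sketch. The input name trainFeatures and the LSTM layer's nIn/activation values are assumptions carried over from the previous recipe, not fixed by this one; only the layer names L1 and predictSequence and the output layer settings come from the steps above:

```java
import org.deeplearning4j.nn.conf.ComputationGraphConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

int numOfClasses = 6; // six labels in this use case

ComputationGraphConfiguration configuration = new NeuralNetConfiguration.Builder()
        .graphBuilder()
        .addInputs("trainFeatures")                        // graph input name (assumption)
        .addLayer("L1", new LSTM.Builder()                 // LSTM layer from the previous recipe
                .nIn(1).nOut(10)                           // nIn is an assumption; nOut(10) feeds nIn(10) below
                .activation(Activation.TANH).build(), "trainFeatures")
        .addLayer("predictSequence", new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .activation(Activation.SOFTMAX)
                .nIn(10).nOut(numOfClasses).build(), "L1") // "L1" wires the output layer to the LSTM
        .setOutputs("predictSequence")                     // must match the output layer's name
        .build();
```

Note how the last argument of addLayer() names the preceding layer, and how setOutputs() names the output layer itself.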
How it works...
In step 1, we added the predictSequence label for the output layer. In step 2, when defining the output layer, we specified the preceding layer reference as L1, which is the LSTM layer created in the previous recipe. We need to mention this to avoid errors during execution caused by a disconnection between the LSTM layer and the output layer. Also, the output layer definition should use the same layer name that we specified in the setOutputs() method.
In step 2, we have used RnnOutputLayer to construct the output layer. This DL4J output layer implementation is used for use cases that involve recurrent neural networks. It is functionally the same as OutputLayer in multi-layer perceptrons, but output and label reshaping are automatically handled.
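To build an intuition for the reshaping that RnnOutputLayer handles for you, the following plain-Java sketch flattens a 3D recurrent output of shape [miniBatch, numClasses, timeSteps] into the 2D [miniBatch * timeSteps, numClasses] form that OutputLayer-style evaluation expects. The shapes and values here are illustrative assumptions, not DL4J internals:

```java
// Flatten a recurrent output [miniBatch][numClasses][timeSteps]
// into 2D [miniBatch * timeSteps][numClasses], one row per time step.
public class RnnReshapeSketch {
    static double[][] flattenTimeSeries(double[][][] rnnOut) {
        int miniBatch = rnnOut.length;
        int numClasses = rnnOut[0].length;
        int timeSteps = rnnOut[0][0].length;
        double[][] flat = new double[miniBatch * timeSteps][numClasses];
        for (int b = 0; b < miniBatch; b++) {
            for (int t = 0; t < timeSteps; t++) {
                for (int c = 0; c < numClasses; c++) {
                    flat[b * timeSteps + t][c] = rnnOut[b][c][t];
                }
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        // 1 example, 2 classes, 3 time steps
        double[][][] out = {{{0.9, 0.2, 0.4}, {0.1, 0.8, 0.6}}};
        double[][] flat = flattenTimeSeries(out);
        System.out.println(flat.length + "x" + flat[0].length); // 3x2
    }
}
```

With RnnOutputLayer, this bookkeeping happens inside the layer, which is why its configuration looks identical to a plain OutputLayer.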
Evaluating the LSTM network for classified output
In the previous chapter, we explored a use case for time series binary classification. Now we have six labels against which to predict. We have discussed various ways to enhance the network's efficiency. We follow the same approach in the next recipe to evaluate the neural network for optimal results.
How to do it...
1. Build the ComputationGraph model configuration, and initialize the model using the init() method:
ComputationGraphConfiguration configuration = compGraphBuilder.build();
ComputationGraph model = new ComputationGraph(configuration);
model.init();
2. Set a score listener to monitor the training process:
model.setListeners(new ScoreIterationListener(20), new EvaluativeListener(testIterator, 1, InvocationType.EPOCH_END));
3. Start the training instance by calling the fit() method:
model.fit(trainIterator, numOfEpochs);
4. Call evaluate() to calculate the evaluation metrics:
Evaluation evaluation = model.evaluate(testIterator);
System.out.println(evaluation.stats());
How it works...
In step 1, we used a computation graph when configuring the neural network's structure. Computation graphs are the best choice for recurrent neural networks. We get an evaluation score of approximately 78% with a multi-layer network and a whopping 94% while using a computation graph. ComputationGraph is meant for complex network structures and can be customized to accommodate different types of layers in various orders. In step 2, InvocationType.EPOCH_END instructs the EvaluativeListener to score the model at the end of every training epoch.
Note that the listener evaluates against the test iterator, not the training iterator. The listeners need to be set by calling setListeners() before training starts so that the test score is logged after every epoch, as shown here:
model.setListeners(new ScoreIterationListener(20), new EvaluativeListener(testIterator, 1, InvocationType.EPOCH_END));
In step 4, the model was evaluated by calling evaluate():
Evaluation evaluation = model.evaluate(testIterator);
We passed the test dataset to the evaluate() method in the form of an iterator that was created earlier in the Loading the training data recipe.
Also, we use the stats() method to display the results. For a computation graph with 100 epochs, we get the following evaluation metrics:
The following experiments can be performed to optimize the results further:
We used 100 epochs in our example. Reduce or increase the number of epochs, note which direction gives better results, and stop when the results are optimal. Since the results are evaluated once per epoch, we can see the direction in which the metrics are moving. Check out the following training instance logs:
In the preceding example, the accuracy declines after the last epoch shown. Accordingly, you can decide on the optimal number of epochs. If we train for too many epochs, the neural network will simply memorize the training data, which leads to overfitting.
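This stopping rule can be sketched as follows. Given the per-epoch test accuracies logged by the EvaluativeListener, we pick the epoch where accuracy peaks; the accuracy values below are made up for illustration, not taken from the book's logs:

```java
public class BestEpochSketch {
    // Return the 1-based epoch whose test accuracy is highest.
    static int bestEpoch(double[] accuracyPerEpoch) {
        int best = 0;
        for (int i = 1; i < accuracyPerEpoch.length; i++) {
            if (accuracyPerEpoch[i] > accuracyPerEpoch[best]) {
                best = i;
            }
        }
        return best + 1;
    }

    public static void main(String[] args) {
        // Hypothetical per-epoch test accuracies; accuracy declines after epoch 4
        double[] accuracy = {0.61, 0.72, 0.81, 0.94, 0.90, 0.88};
        System.out.println("Stop training after epoch " + bestEpoch(accuracy)); // epoch 4
    }
}
```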
Instead of randomizing the data right away, you can ensure that the six categories are uniformly distributed across the training set. For example, with 420 samples for training and 180 for testing, each category would be represented by exactly 70 training samples. Randomization can then be performed, followed by iterator creation. Note that we had 450 samples for training in our example, so the distribution of labels/categories isn't uniform, and we rely entirely on the randomization of the data in this case.
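The uniform split described above can be sketched in plain Java. The counts (6 classes, 70 training samples per class) come from the text; the shuffle seed is arbitrary:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class StratifiedSplitSketch {
    // Assign perClassTrain samples of every class to the training set,
    // so each class is equally represented before any randomization.
    static List<Integer> trainLabels(int numClasses, int perClassTrain) {
        List<Integer> labels = new ArrayList<>();
        for (int label = 0; label < numClasses; label++) {
            for (int i = 0; i < perClassTrain; i++) {
                labels.add(label);
            }
        }
        // Randomize only AFTER the stratified assignment; seed is arbitrary
        Collections.shuffle(labels, new Random(42));
        return labels;
    }

    public static void main(String[] args) {
        List<Integer> train = trainLabels(6, 70);
        long classZeroCount = train.stream().filter(l -> l == 0).count();
        System.out.println(train.size() + " training samples, "
                + classZeroCount + " of class 0");
        // -> 420 training samples, 70 of class 0
    }
}
```

The actual dataset iterators would then be built from these stratified, shuffled samples.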
Chapter 8: Performing Anomaly Detection on Unsupervised Data
In this chapter, we will perform anomaly detection on the Modified National Institute of Standards and Technology (MNIST) dataset using a simple autoencoder, without any pretraining. We will identify the outliers in the given MNIST data; outlier digits can be considered the most atypical or abnormal-looking digits. We will encode the MNIST data and then decode it back in the output layer. Then, we will calculate the reconstruction error for each MNIST sample.
An MNIST sample that closely resembles a digit value will have a low reconstruction error.
We will then sort the samples based on their reconstruction errors and display the best samples and the worst samples (the outliers) in a JFrame window. The autoencoder is constructed using a feed-forward network. Note that we are not performing any pretraining. The autoencoder processes feature inputs only, so we won't require the MNIST labels at any stage.
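The per-sample scoring and sorting just described can be sketched in plain Java. The mean squared error formula is the standard reconstruction error; the sample vectors are made up for illustration:

```java
import java.util.Arrays;
import java.util.Comparator;

public class ReconstructionErrorSketch {
    // Mean squared error between an input and its reconstruction
    static double reconstructionError(double[] input, double[] reconstructed) {
        double sum = 0.0;
        for (int i = 0; i < input.length; i++) {
            double diff = input[i] - reconstructed[i];
            sum += diff * diff;
        }
        return sum / input.length;
    }

    public static void main(String[] args) {
        double[][] inputs  = {{1.0, 0.0, 1.0}, {0.5, 0.5, 0.5}, {1.0, 1.0, 0.0}};
        double[][] decoded = {{0.9, 0.1, 0.9}, {0.1, 0.9, 0.1}, {1.0, 0.9, 0.1}};

        // Score every sample, then sort indices by error:
        // best-reconstructed samples first, outliers last
        double[] errors = new double[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            errors[i] = reconstructionError(inputs[i], decoded[i]);
        }
        Integer[] order = {0, 1, 2};
        Arrays.sort(order, Comparator.comparingDouble(i -> errors[i]));

        System.out.println("Best sample index: " + order[0]
                + ", worst (outlier): " + order[order.length - 1]);
    }
}
```

In the real recipe, the inputs are the 784-pixel MNIST feature vectors and the reconstructions come from the autoencoder's output layer.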
In this chapter, we will cover the following recipes:
Extracting and preparing MNIST data
Constructing dense layers for input
Constructing output layers
Training with MNIST images
Evaluating and sorting the results based on the anomaly score
Saving the resultant model
Let's begin.
Technical requirements
The code for this chapter can be found here: https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/08_Performing_Anomaly_detection_on_unsupervised%20data/sourceCode/cookbook-app/src/main/java/MnistAnomalyDetectionExample.java.
The JFrame-specific implementation can be found here:
https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/08_Performing_Anomaly_detection_on_unsupervised%20data/sourceCode/cookbook-app/src/main/java/MnistAnomalyDetectionExample.java#L134.
After cloning our GitHub repository, navigate to the Java-Deep-Learning-Cookbook/08_Performing_Anomaly_detection_on_unsupervised data/sourceCode directory. Then, import the cookbook-app project as a Maven project by importing pom.xml.
Note that we use the MNIST dataset from here: http://yann.lecun.com/exdb/mnist/. However, we don't have to download the dataset manually: DL4J has a custom implementation that fetches the MNIST data automatically, and we will be using it in this chapter.