After each training iteration, we measure the network's performance by evaluating the model against a set of evaluation metrics, and we use those metrics to guide further optimization in subsequent training iterations. We use the test dataset for evaluation. Note that we are performing binary classification for this use case: we predict the chances of a patient surviving. For classification problems, we can plot a Receiver Operating Characteristic (ROC) curve and calculate the Area Under the Curve (AUC) score to evaluate the model's performance. The AUC score ranges from 0 to 1, where 0 represents 100% failed predictions and 1 represents 100% successful predictions.
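The AUC score can also be read as the probability that a randomly chosen positive example is ranked above a randomly chosen negative one. The following self-contained sketch (plain Java, independent of DL4J; the class and method names are made up for this illustration) computes AUC by pairwise comparison:

```java
// Illustrative, plain-Java sketch of what the AUC score measures;
// the class and method names are invented for this example and are
// not part of the DL4J API.
public class AucSketch {

    // AUC equals the fraction of (positive, negative) pairs in which
    // the positive example receives the higher score; ties count 0.5.
    public static double auc(double[] scores, int[] labels) {
        long pairs = 0;
        double wins = 0.0;
        for (int i = 0; i < scores.length; i++) {
            if (labels[i] != 1) continue;            // positive examples only
            for (int j = 0; j < scores.length; j++) {
                if (labels[j] != 0) continue;        // negative examples only
                pairs++;
                if (scores[i] > scores[j]) wins += 1.0;
                else if (scores[i] == scores[j]) wins += 0.5;
            }
        }
        return pairs == 0 ? Double.NaN : wins / pairs;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.3, 0.4, 0.2};
        int[] labels    = {1,   1,   0,   0};
        System.out.println(auc(scores, labels)); // 3 of 4 pairs ranked correctly: 0.75
    }
}
```

A model that ranks every positive above every negative scores 1.0; a model that always ranks them wrongly scores 0.0.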
How to do it...
1. Use ROC for the model evaluation:
ROC evaluation = new ROC(thresholdSteps);
2. Generate output from the features in the test data:
DataSet batch = testDataSetIterator.next();
INDArray[] output = model.output(batch.getFeatures());
3. Use the ROC evaluation instance to perform the evaluation by calling evalTimeSeries():
INDArray actuals = batch.getLabels();
INDArray predictions = output[0];
evaluation.evalTimeSeries(actuals, predictions);
4. Display the AUC score (the evaluation metric) by calling calculateAUC():
System.out.println(evaluation.calculateAUC());
How it works...
In step 3, actuals are the actual outputs for the test input, and predictions are the model's predicted outputs for the test input.
The evaluation metrics are based on the difference between actuals and predictions.
We used ROC evaluation metrics to quantify this difference. ROC evaluation is ideal for binary classification problems with datasets that have a uniform distribution of the output classes. Predicting patient mortality is just such a binary classification problem.
thresholdSteps in the parameterized constructor of ROC is the number of threshold steps to be used for the ROC calculation. When we decrease the threshold, more samples are classified as positive. This increases the sensitivity, but it also means that the neural network will be less confident when assigning an item to a single class.
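To make the threshold effect concrete, here is a small plain-Java sketch (the names are invented for this illustration and are not part of the DL4J API) that computes sensitivity, the true positive rate, at different thresholds; lowering the threshold can only keep or raise it:

```java
// Illustrative sketch of how the classification threshold affects
// sensitivity (true positive rate); not part of the DL4J API.
public class ThresholdDemo {

    // Sensitivity = true positives / all actual positives, where an
    // example is predicted positive when its probability >= threshold.
    public static double sensitivity(double[] probs, int[] labels, double threshold) {
        int truePositives = 0, actualPositives = 0;
        for (int i = 0; i < probs.length; i++) {
            if (labels[i] == 1) {
                actualPositives++;
                if (probs[i] >= threshold) truePositives++;
            }
        }
        return actualPositives == 0 ? Double.NaN
                                    : (double) truePositives / actualPositives;
    }

    public static void main(String[] args) {
        double[] probs = {0.9, 0.6, 0.4, 0.2};
        int[] labels   = {1,   1,   1,   0};
        // Lowering the threshold flags more samples as positive,
        // so sensitivity rises.
        System.out.println(sensitivity(probs, labels, 0.7)); // 1 of 3 positives caught
        System.out.println(sensitivity(probs, labels, 0.3)); // all 3 positives caught
    }
}
```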
In step 4, we printed the ROC evaluation metrics by calling calculateAUC():
evaluation.calculateAUC();
The calculateAUC() method will calculate the area under the ROC curve plotted from the test data. If you print the result, you should see a probability value between 0 and 1. We can also call the stats() method to display the complete ROC evaluation metrics.
The stats() method will display the AUC score along with the AUPRC (short for Area Under Precision/Recall Curve) metrics. AUPRC is another performance metric where the curve represents the trade-off between precision and recall values. For a model with a good AUPRC score, positive samples can be found with fewer false positive results.
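Precision and recall, the two quantities traded off along the AUPRC curve, can be computed at a single threshold as follows (a plain-Java sketch with invented names, independent of DL4J):

```java
// Illustrative precision/recall computation at one threshold;
// the names are invented for this sketch and are not DL4J API.
public class PrecisionRecallDemo {

    // Returns {precision, recall} for predictions at the given threshold.
    // Precision = TP / (TP + FP): how many flagged positives are correct.
    // Recall    = TP / (TP + FN): how many actual positives were found.
    public static double[] precisionRecall(double[] probs, int[] labels, double threshold) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < probs.length; i++) {
            boolean predictedPositive = probs[i] >= threshold;
            if (predictedPositive && labels[i] == 1) tp++;
            else if (predictedPositive && labels[i] == 0) fp++;
            else if (!predictedPositive && labels[i] == 1) fn++;
        }
        double precision = (tp + fp) == 0 ? 1.0 : (double) tp / (tp + fp);
        double recall    = (tp + fn) == 0 ? 1.0 : (double) tp / (tp + fn);
        return new double[]{precision, recall};
    }

    public static void main(String[] args) {
        double[] probs = {0.9, 0.8, 0.6, 0.4};
        int[] labels   = {1,   0,   1,   0};
        double[] pr = precisionRecall(probs, labels, 0.5);
        // One false positive drags precision below recall here.
        System.out.println("precision=" + pr[0] + ", recall=" + pr[1]);
    }
}
```

A model with a good AUPRC score keeps precision high even as the threshold is lowered to raise recall.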
7
Constructing an LSTM Neural Network for Sequence Classification
In the previous chapter, we discussed classifying time series data for multi-variate features.
In this chapter, we will create a long short-term memory (LSTM) neural network to classify univariate time series data. Our neural network will learn how to classify a univariate time series. It will be trained on UCI (short for University of California Irvine) synthetic control data. There will be 600 sequences of data, with every sequence separated by a new line to make our job easier. Every sequence will have values recorded at 60 time steps. Since it is a univariate time series, the CSV files will have only one column for every example recorded. Every sequence is one recorded example.
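As a sketch of how one line of the raw synthetic control file can be turned into a sequence of values (each line holds one sequence of whitespace-separated numbers; the class and method names here are invustrative inventions, not from the chapter's source code):

```java
// Illustrative parser for one line of the UCI synthetic control file,
// where each line holds one sequence of whitespace-separated values.
// The class and method names are invented for this sketch.
public class SequenceParser {

    public static double[] parseSequence(String line) {
        // Split on runs of whitespace; trim first so leading spaces
        // do not produce an empty token.
        String[] tokens = line.trim().split("\\s+");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        // A truncated sample line; real lines contain 60 values.
        double[] sequence = parseSequence(" 28.7812  34.4632  31.3381 ");
        System.out.println(sequence.length); // 3 values in this truncated sample
    }
}
```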
We will split these sequences of data into train/test sets to perform training and evaluation, respectively. The possible classes/labels are as follows:
Normal
Cyclic
Increasing trend
Decreasing trend
Upward shift
Downward shift
In this chapter, we will cover the following recipes:
Extracting time series data
Loading training data
Normalizing training data
Constructing input layers for the network
Constructing output layers for the network
Evaluating the LSTM network for classified output
Let's begin.
Technical requirements
This chapter's implementation code can be found at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/07_Constructing_LSTM_Neural_network_for_sequence_classification/sourceCode/cookbookapp/src/main/java/UciSequenceClassificationExample.java.
After cloning our GitHub repository, navigate to the Java-Deep-Learning-Cookbook/07_Constructing_LSTM_Neural_network_for_sequence_classification/sourceCode directory. Then import the cookbookapp project as a Maven project by importing pom.xml.
Download the data from this UCI website: https://archive.ics.uci.edu/ml/machine-learning-databases/synthetic_control-mld/synthetic_control.data.
We need to create directories to store the train and test data: two separate folders for the train and test datasets, each with subdirectories for features and labels. This folder structure is a prerequisite for the data extraction that follows, since we separate features and labels while performing the extraction.
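The directory layout described above can also be created programmatically. The following sketch assumes the folder names train, test, features, and labels (our naming choice for this illustration; use whatever names your extraction code expects):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Creates the train/test directory layout with features and labels
// subdirectories. The folder names are an assumption for this sketch.
public class DataDirSetup {

    public static void createDataDirs(Path baseDir) throws IOException {
        for (String split : new String[]{"train", "test"}) {
            for (String sub : new String[]{"features", "labels"}) {
                // createDirectories also creates any missing parent folders
                // and does not fail if the directory already exists.
                Files.createDirectories(baseDir.resolve(split).resolve(sub));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Paths.get("uci-data"); // hypothetical base directory name
        createDataDirs(base);
        System.out.println(Files.isDirectory(base.resolve("train").resolve("features")));
    }
}
```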
Note that, throughout this cookbook, we use DL4J version 1.0.0-beta3, except in this chapter. With that version, you might come across the following error while executing the code that we discuss in this chapter:
Exception in thread "main" java.lang.IllegalStateException: C (result) array is not F order or is a view. Nd4j.gemm requires the result array to be F order and not a view. C (result) array: [Rank: 2,Offset: 0 Order: f Shape: [10,1], stride: [1,10]]
At the time of writing, a newer version of DL4J has been released that resolves this issue. Hence, we will use version 1.0.0-beta4 to run the examples in this chapter.