4.3 Software Tools
4.3.4 Custom Scripts
The following paragraphs describe the scripts that were developed in order to fully automate the experimental pipeline.
Cross Validation
A script was written in Python17 (shown in listing 4.5) that automates the execution of cross validation for each experiment. Listing 4.6 shows an example of how the cross validation process can be invoked from the command line. Note the XX placeholders, which are replaced with the fold numbers during execution. Also note that any of the multi-label classification techniques can be executed by the script, since the execution command is passed to the script as a command line parameter. Executing the cross validation script in this way produces ten files with names of the form BR_NB_fold_XX.out.txt (using naive Bayes as an example), where in each case the XX placeholder is replaced by the fold number. These are the files containing the raw results of the experiment.
Note that it is not necessary to train and test the model in separate steps. This is because the machine learning libraries and tools used for experimentation in this study are designed to train and test models in the same thread of execution (although, importantly, they also support saving models for later use). This can be seen from the example cross validation execution shown in listing 4.6. Another example of this is shown in listing 4.7, which shows the configuration required for the Clus application to train and test a predictive clustering tree.
Results Processing
Scripts were written in Python for processing the raw results19. Separate scripts were written for producing the scalar evaluation measures for Clus and for MEKA/Mulan. An additional script was written for producing the R commands necessary to generate the ROC curves and AUC values.
17 Source code provided on disc accompanying dissertation.
#!/usr/bin/python
import os
import sys

# Display help if not enough parameters are supplied
if len(sys.argv) != 2:
    print "Usage: ./cross_validate.py command_line_with_XX_as_place_holder"
    exit(1)

# Initialise the command
command = sys.argv[1]

# Generate and execute the command for each fold
for i in range(10):
    # Replace XX with fold number
    fold_command = command.replace("XX", str(i))
    # Run command
    os.system(fold_command)
Listing 4.5. Cross validator script. This script will run the given command once for each cross validation fold, each time modifying the training and test set filenames using the correct fold number.
./cross_validate.py "java -cp lib/*:lib/*.jar weka.classifiers.multilabel.BR \
    -t dataset1_fold_XX_train.arff -T dataset1_fold_XX_test.arff \
    -f BR_NB_fold_XX.out.txt -W weka.classifiers.bayes.NaiveBayes"

Listing 4.6. Example execution of cross validation for an experiment.

[Data]
File = dataset1_fold_1_train.arff
TestSet = dataset1_fold_1_test.arff

[features]
Target = 126-136
Clustering = 126-136
Descriptive = 1-125

[Tree]
FTest = 0.1
Heuristic = VarianceReduction
ConvertToRules = Leaves
PruningMethod = None

[Ensemble]
EnsembleMethod = RForest

Listing 4.7. Clus configuration file for a single fold (in this case fold 1).

Confusion Matrix-Based and Higher Order Measures The Clus results processing script is used to generate the two confusion matrix-based and two higher order evaluation measures for the PCT method. Since the PCT technique does not produce probabilistic predictions, it is not possible to generate ROC curves (and hence AUC values) for this method. The output required in order to generate the four evaluation measures for each label is the cumulative confusion matrix (the sum of the confusion matrices for the ten cross validation folds) for each label. The script generates this matrix for each label by parsing the output file generated by each training and test cycle and building an in-memory confusion matrix for each label for every fold. Finally, simple matrix addition is performed and the output is echoed to the command line (which can be redirected to a file at the user's discretion).
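The matrix summation step described above can be sketched as follows. The per-fold matrices and their values below are invented for illustration; in the actual script they would be parsed from the Clus output files.

```python
# Sketch of the cumulative confusion matrix step. The per-fold 2x2
# matrices are hard-coded here for illustration; the real script parses
# one matrix per label per fold from the Clus output files.

def add_matrices(a, b):
    """Element-wise sum of two confusion matrices."""
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))]
            for i in range(len(a))]

# One 2x2 matrix ([[TP, FP], [FN, TN]]) per fold, for a single label.
fold_matrices = [
    [[8, 2], [1, 9]],   # fold 0
    [[7, 3], [2, 8]],   # fold 1
    [[9, 1], [0, 10]],  # fold 2
]

cumulative = [[0, 0], [0, 0]]
for m in fold_matrices:
    cumulative = add_matrices(cumulative, m)

# Echo the cumulative matrix to the command line
# (redirectable to a file at the user's discretion).
for row in cumulative:
    print(" ".join(str(cell) for cell in row))
```

With ten folds rather than the three shown, the loop is identical; only the number of parsed matrices changes.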
The confusion matrix-based and higher order measures are calculated in much the same way for the MEKA and Mulan library outputs as they are for the Clus outputs. Only one additional step needs to be performed: the calculation of the confusion matrix for each label for each fold. This is necessary because the MEKA and Mulan outputs do not contain the confusion matrices, but rather the actual predictions made. The confusion matrix is calculated by parsing each prediction made and simply incrementing the value of the appropriate cell in an in-memory matrix, depending on the outcome of the prediction.
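That incrementing step can be sketched as below. The (actual, predicted) pairs are invented examples; the actual script obtains them by parsing the MEKA/Mulan output files.

```python
# Sketch: build a per-label confusion matrix from (actual, predicted)
# pairs. The pairs here are invented; the real script parses each
# prediction from the MEKA/Mulan output files.

def confusion_matrix(pairs):
    """Count TP/FP/FN/TN for one label from binary (actual, predicted) pairs."""
    cm = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in pairs:
        if actual and predicted:
            cm["TP"] += 1      # label present, correctly predicted
        elif not actual and predicted:
            cm["FP"] += 1      # label absent, wrongly predicted present
        elif actual and not predicted:
            cm["FN"] += 1      # label present, missed
        else:
            cm["TN"] += 1      # label absent, correctly predicted absent
    return cm

pairs = [(1, 1), (0, 1), (1, 0), (0, 0), (1, 1)]
cm = confusion_matrix(pairs)
```

From this matrix the scalar measures (e.g. precision and recall) follow directly.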
ROC Analysis In order to generate the ROC curves and AUC values, the results from the methods that support the calculation of these evaluation measures (BR-NB, HOMER, MLkNN, RAkEL and ECC) must be transformed into a format that can be used as input by the ROC analysis tool. Note that SVMs do not produce probabilistic predictions and thus ROC analysis cannot be performed for the BR-SVM technique.
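As an illustration of this transformation, the sketch below turns a label's probabilistic scores and true values into R commands for ROC/AUC computation. The use of the ROCR package is an assumption (the text does not name the R tool used), and the scores and labels are invented examples.

```python
# Sketch: emit R commands that compute a ROC curve and AUC from
# probabilistic predictions. ROCR is assumed as the R-side tool; the
# scores and labels are invented, not real experimental results.

def to_r_commands(scores, labels):
    """Return R code (as a string) that builds a ROCR prediction object,
    plots the ROC curve and prints the AUC."""
    s = ", ".join(str(x) for x in scores)
    l = ", ".join(str(x) for x in labels)
    return "\n".join([
        "library(ROCR)",
        "pred <- prediction(c(%s), c(%s))" % (s, l),
        'perf <- performance(pred, "tpr", "fpr")',
        "plot(perf)",
        'print(performance(pred, "auc")@y.values)',
    ])

r_code = to_r_commands([0.9, 0.4, 0.75, 0.1], [1, 0, 1, 0])
print(r_code)
```

The generated commands could then be piped into an R session, one block per label.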