not. 1 represents an anomalous event, 0 represents a normal event.
Both the Debugging Report and the Suspicious Lines Report share the same data structure, shown in Figure 6.3. The only difference between the two reports is that the Suspicious Lines Report contains only the lines that have been flagged as anomalous, while the Debugging Report contains every line of the log file. This approach enables faster debugging: operators or developers can consult the Suspicious Lines Report to identify which lines, as indicated by their line numbers in the original log file, are anomalous, and then cross-reference these with the Debugging Report to analyse the events leading up to, and following, the anomalous event.
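The relationship between the two reports can be sketched in a few lines of Python. This is an illustration only: the field names line_number, content and anomaly are assumptions standing in for the actual columns of Figure 6.3, and the helper functions are hypothetical rather than part of the framework.

```python
from typing import Any, Dict, List

# Hypothetical row layout; the actual columns are those of Figure 6.3.
Row = Dict[str, Any]


def suspicious_lines(debugging_report: List[Row]) -> List[Row]:
    """Keep only the rows flagged as anomalous (anomaly == 1)."""
    return [row for row in debugging_report if row["anomaly"] == 1]


def context_window(debugging_report: List[Row], line_number: int, radius: int = 2) -> List[Row]:
    """Return the rows within `radius` lines of `line_number`, in file order."""
    return [
        row for row in debugging_report
        if abs(row["line_number"] - line_number) <= radius
    ]


# Made-up log content, used purely for illustration.
debugging_report = [
    {"line_number": 1, "content": "service started", "anomaly": 0},
    {"line_number": 2, "content": "connection opened", "anomaly": 0},
    {"line_number": 3, "content": "unexpected reset", "anomaly": 1},
    {"line_number": 4, "content": "retrying", "anomaly": 0},
    {"line_number": 5, "content": "connection opened", "anomaly": 0},
]

suspicious = suspicious_lines(debugging_report)
for row in suspicious:
    # Cross-reference each suspicious line with its neighbours in the full report.
    neighbours = context_window(debugging_report, row["line_number"], radius=1)
    print(row["line_number"], [n["content"] for n in neighbours])
```

The Suspicious Lines Report is thus a filtered view of the Debugging Report, and the line number is the key used to move from one to the other.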
[Figure 6.4, diagram not reproduced. Components shown: Input Log File (raw log file), Data Miner, Data Miner Config File, Inference Engine, Inference Engine Config File, Log Key Data Store, Model Store, Debugging Report, Suspicious Lines Report. Processing stages shown: Data Preparation, Inference Engine Processing, Report Generation. The legend distinguishes the common processing path, the training-mode-only path and the inference-mode-only path.]
Figure 6.4: Diagram illustrating the design of the Automated Log File Analysis Framework. Major subcomponents and processes are illustrated. Training Mode and Inference Mode execution paths are also shown.
event templates are consistently mapped to the same log event keys across log files for a given system.
The Model Data Store stores the learned model parameters describing the Anomaly Detection Model of the Inference Engine. When operating in training mode, the Anomaly Detection Model is trained on a given dataset to model the system behaviour. This model is described by a set of parameters stored in the Model Data Store. When operating in inference mode, the framework loads a previously learned model, corresponding to log files of the same system as the one under test, into the Anomaly Detection Model and uses this to detect anomalies in given log files.
As shown in Figure 6.4, the ALFAF has three major processing stages: Data Preparation, Inference Engine Processing and Report Generation. The first two stages are common to both modes of operation, while the Report Generation stage only executes when the framework is operating in inference mode.
Data Preparation is the first processing stage and ingests the raw log files of the system under test. During this processing stage, the raw log files are parsed by the Data Miner into a structured dataset. If operating in training mode, the Log Key Data Store is created and the mapping of all unique log event keys to log event templates in the structured dataset is stored in it.
If operating in inference mode, after parsing the log files into a structured dataset, a previously generated Log Key Data Store is queried to retrieve the existing log event keys for any log events occurring in the log file that are already contained in the Log Key Data Store. This ensures that the log events of a given system's log files are consistently mapped to the same log event keys across log files. This is important because the deep learning algorithm extracts features from these log event keys, and these features are crucial to the ability of the Inference Engine to accurately detect anomalies within the log files. When operating in inference mode, new log events that are not already contained in the Log Key Data Store may be encountered. To address this, an option to update the Log Key Data Store when operating in inference mode is made available.
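The lookup-and-update behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's implementation: the store is modelled as a plain dictionary from event template to integer key, the function name map_templates_to_keys is hypothetical, and the placeholder key -1 for unseen events when updating is disabled is an assumption, since the text does not specify how such events are handled.

```python
from typing import Dict, List, Tuple

UNKNOWN_KEY = -1  # assumed placeholder for events absent from the store


def map_templates_to_keys(
    templates: List[str],
    store: Dict[str, int],
    update_store: bool = False,
) -> Tuple[List[int], List[str]]:
    """Translate event templates to log keys using an existing store.

    Returns the key sequence and the templates that were not in the store
    when first encountered. If `update_store` is True, unseen templates are
    added to the store with the next free key; otherwise they map to
    UNKNOWN_KEY and the store is left unchanged.
    """
    keys: List[int] = []
    new_templates: List[str] = []
    for template in templates:
        if template not in store:
            if template not in new_templates:
                new_templates.append(template)
            if update_store:
                store[template] = max(store.values(), default=-1) + 1
        keys.append(store.get(template, UNKNOWN_KEY))
    return keys, new_templates


# Training mode: the store is created from the parsed training log.
store: Dict[str, int] = {}
training = ["Connection from <*>", "Session opened for <*>", "Connection from <*>"]
train_keys, _ = map_templates_to_keys(training, store, update_store=True)

# Inference mode: known events reuse their keys, new events are reported.
inference = ["Session opened for <*>", "Disk failure on <*>", "Connection from <*>"]
infer_keys, unseen = map_templates_to_keys(inference, dict(store), update_store=False)
updated_store = dict(store)
updated_keys, _ = map_templates_to_keys(inference, updated_store, update_store=True)
```

Because keys are only ever appended, an event template seen during training keeps the same key in every later log file, which is the consistency property the Inference Engine relies on.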
After the Data Preparation processing stage, the parsed, structured log file dataset is passed to the next processing stage, Inference Engine Processing. During this stage, the ALFAF uses the Inference Engine either to train an anomaly detection model or to detect anomalies within log files using a previously trained model, depending on whether it is run in training mode or inference mode.
In training mode, after training the Anomaly Detection Model, the Model Data Store is created and the learned model parameters are stored. In inference mode, previously learned model parameters are retrieved from the Model Data Store and are loaded into the Anomaly Detection Model to be used for inference.
If the framework is operated in training mode, the processing pipeline is complete after the Inference Engine Processing stage. If the framework is operated in inference mode, then an additional processing stage, Report Generation, is executed.
During the Report Generation processing stage, the output of the Inference Engine, an Anomaly Detection Report, is used to generate the Debugging Report and the Suspicious Lines Report as described in Section 6.2.
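The stages and their mode-dependent steps can be summarised in a short sketch. The function pipeline_steps and the step descriptions are illustrative names chosen here, not identifiers from the framework; the sketch only enumerates the order of operations described above and performs no actual processing.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (processing stage, action)


def pipeline_steps(mode: str, update_datastore: bool = False) -> List[Step]:
    """List the steps executed for a given mode, in order."""
    if mode not in ("training", "inference"):
        raise ValueError(f"mode must be 'training' or 'inference', got {mode!r}")

    # Parsing by the Data Miner is common to both modes.
    steps: List[Step] = [("Data Preparation", "parse raw log file into a structured dataset")]
    if mode == "training":
        steps += [
            ("Data Preparation", "create Log Key Data Store"),
            ("Inference Engine Processing", "train Anomaly Detection Model"),
            ("Inference Engine Processing", "store learned parameters in Model Data Store"),
        ]
    else:
        steps.append(("Data Preparation", "retrieve log keys from Log Key Data Store"))
        if update_datastore:
            steps.append(("Data Preparation", "add new log events to Log Key Data Store"))
        steps += [
            ("Inference Engine Processing", "load parameters from Model Data Store"),
            ("Inference Engine Processing", "detect anomalies"),
            ("Report Generation", "generate Debugging Report and Suspicious Lines Report"),
        ]
    return steps


for stage, action in pipeline_steps("inference", update_datastore=True):
    print(f"{stage}: {action}")
```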
6.3.1 Implementation
The Automated Log File Analysis Framework (ALFAF) is implemented as a Python class named AutomatedLFAFramework. The UML Class Diagram for this class is shown in Figure 6.5.
Creating an instance of the AutomatedLFAFramework class requires the following arguments:
• data_miner_config - path to configuration file for the Data Miner
• inference_engine_config - path to configuration file for the Inference Engine
• input_dir - path to the directory where the log files to be considered are located
• output_dir - path to a directory where framework outputs are to be stored
• name - a unique identifier for an instance of the framework
• device - specifies which compute device, CPU or GPU, to use for model training and inference
• mode - specifies the mode of operation, either training or inference
[Figure 6.5, diagram not reproduced. Class: AutomatedLFAFramework. Attributes: input_dir, output_dir, name, training_mode, data_miner, inference_engine, df_parsed_log, log_key_datastore, update_datastore. Methods: train_anomaly_detection_model(), analyse_log_file(), _data_preparation(), _create_datastore(), _retrieve_log_keys_from_datastore(), _prepare_new_events_for_datastore(), _update_datastore(), _infer_and_detect_anomalies(), _translate_to_log_keys(), _generate_debug_report().]
Figure 6.5: UML Class Diagram for the AutomatedLFAFramework class that implements the Automated Log File Analysis Framework
• log_key_datastore - path to a previously generated log key data store. Only required for inference mode
• update_datastore - flag to enable or disable updating of the log key data store when operating in inference mode
The input_dir, output_dir, name, log_key_datastore and update_datastore attributes map to and store the corresponding class initialisation arguments. The training_mode attribute is a boolean flag that is set based on the value of the mode input argument. df_parsed_log is initialised to store a copy of the parsed log file, from the Data Miner, once available. The data_miner and inference_engine attributes are used to store instances of the DataMiner and InferenceEngine classes that implement the Data Miner and Inference Engine subcomponents respectively. It should be noted that the instantiation of both subcomponents requires the appropriate configuration files. These are described in Chapters 4 and 5.
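The mapping from initialisation arguments to attributes can be illustrated with a skeleton constructor. This is a sketch under assumptions, not the thesis code: the DataMiner and InferenceEngine classes below are stand-ins whose constructor signatures are guessed, and the default values and validation checks are hypothetical additions.

```python
from typing import Optional


class DataMiner:
    """Stand-in for the Data Miner; the real constructor may differ."""

    def __init__(self, config_path: str) -> None:
        self.config_path = config_path


class InferenceEngine:
    """Stand-in for the Inference Engine; the real constructor may differ."""

    def __init__(self, config_path: str, device: str) -> None:
        self.config_path = config_path
        self.device = device


class AutomatedLFAFramework:
    def __init__(
        self,
        data_miner_config: str,
        inference_engine_config: str,
        input_dir: str,
        output_dir: str,
        name: str,
        device: str = "cpu",            # assumed default
        mode: str = "training",         # assumed default
        log_key_datastore: Optional[str] = None,
        update_datastore: bool = False,
    ) -> None:
        if mode not in ("training", "inference"):
            raise ValueError("mode must be 'training' or 'inference'")
        if mode == "inference" and log_key_datastore is None:
            # Assumed check: the text states the store is required for inference.
            raise ValueError("log_key_datastore is required in inference mode")

        self.input_dir = input_dir
        self.output_dir = output_dir
        self.name = name
        self.log_key_datastore = log_key_datastore
        self.update_datastore = update_datastore
        self.training_mode = mode == "training"  # boolean flag derived from mode
        self.df_parsed_log = None                # filled in once the log is parsed
        self.data_miner = DataMiner(data_miner_config)
        self.inference_engine = InferenceEngine(inference_engine_config, device)


# Hypothetical file and directory names, for illustration only.
framework = AutomatedLFAFramework(
    "data_miner.cfg", "inference_engine.cfg", "logs", "out", "demo",
    mode="inference", log_key_datastore="log_keys.db",
)
```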
The two modes of operation of the framework are invoked by two class methods, namely train_anomaly_detection_model for training mode and analyse_log_file for inference mode. These methods are provided with the file name of the log file to be considered as input, and implement the entire end-to-end processing pipeline for each mode of operation. A number of helper functions facilitate the various processes of the ALFAF.
The _data_preparation method implements the Data Preparation processing stage. The _create_datastore, _retrieve_log_keys_from_datastore, _prepare_new_events_for_datastore and _update_datastore methods perform the various operations required to create the Log Key Data Store when in training mode, and to retrieve log keys from the data store and update the Log Key Data Store when in inference mode.
When operating in inference mode, the analyse_log_file method makes use of the _infer_and_detect_anomalies method to generate the Anomaly Detection Report, and then calls the _translate_to_log_keys and _generate_debug_report methods to create the Debugging Report and the Suspicious Lines Report.
All outputs generated by the ALFAF, whether operating in training mode or inference mode, are stored to disk in the directory specified by the output_dir class initialisation argument.