report consists of all input log event key sequences, the actual targets, the expected candidate targets, and whether an anomaly was detected by the model. This report is later used by the Automated Log File Analysis Framework to provide assistive debugging information to the user. This process is detailed in Chapter 6.
For the Anomaly Detection process, the following additional class attributes are required:
1. num_candidates - this specifies the number of top candidate log event keys to consider during anomaly detection
2. model_parameters - this contains the path to a file containing previously learned model parameters
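To make the role of num_candidates concrete, the following minimal sketch (the function name and the dictionary-based probability representation are assumptions, not the thesis implementation) shows how a predicted distribution over log event keys might be reduced to its top candidates, with the actual key flagged as anomalous if it falls outside that set:

```python
def is_anomalous(probabilities, actual_key, num_candidates):
    """Flag a log event key as anomalous if it is not among the top
    `num_candidates` most likely keys predicted by the model.

    probabilities: dict mapping log event key -> predicted probability.
    actual_key: the log event key actually observed next in the sequence.
    """
    # Rank keys by predicted probability, highest first.
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    top_candidates = ranked[:num_candidates]
    return actual_key not in top_candidates

# Example: "k7" is outside the top-2 candidates, so it is flagged.
probs = {"k1": 0.55, "k2": 0.30, "k7": 0.10, "k9": 0.05}
print(is_anomalous(probs, "k7", num_candidates=2))  # True
print(is_anomalous(probs, "k1", num_candidates=2))  # False
```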
6 System Integration and Framework Design
This chapter finalises the design and development of the Automated Log File Analysis Framework (ALFAF). It details the integration of the previously designed Data Miner and Inference Engine subcomponents, alongside the design of the overall ALFAF that implements the end-to-end automated log file analysis processing pipeline.
The design considerations and system requirements put forward for the development of the subcomponents are also applicable to the ALFAF. In addition, framework-specific considerations and requirements are introduced to guide system integration and the development of the complete framework at the top level. These are once again based on the objectives of this project and are detailed in Section 6.1.
Before detailing the subcomponent integration and design of the ALFAF, attention is directed toward the system interfaces in Section 6.2. In this section, the various internal and external interfaces of the system are identified and detailed.
Finally, in Section 6.3, the design and development of the complete ALFAF is detailed. This section describes the integration of the previously designed subcomponents and the design of additional processes required to realise the overall ALFAF.
6.1 Design Considerations and System Requirements
The Automated Log File Analysis Framework (ALFAF) consists of the two previously designed subcomponents, the Data Miner and the Inference Engine, as presented in Chapter 1. During the design and development of these subcomponents, design considerations and system requirements were taken into account to ensure that the subcomponents enable the ALFAF to perform its intended functions. These considerations are discussed in Chapters 4 and 5. At the top level, i.e. the ALFAF itself, additional design considerations and requirements are put forward to account for the additional functionality required at the framework level as well as the integration of the subcomponents. These design considerations are based on the research questions put forward in Chapter 1 and the project objectives. They are used to guide the integration of the subcomponents and the design of the ALFAF and are presented below:
Automated Log File Analysis Framework Objective The main objective of the ALFAF is to provide an end-to-end processing pipeline that implements automated log file analysis to identify possible points of system failure and errors from given system log files. The framework shall perform the following processing stages required for automated, machine learning-based log file analysis as discussed in Chapter 3: Log Parsing, Feature Engineering, and Log File Analysis.
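The three processing stages named above can be sketched as a simple pipeline. This is an illustrative outline only (the function and parameter names are assumptions, and each stage is passed in as a callable precisely because the concrete parser, feature extractor, and analysis model are defined elsewhere in this work):

```python
def run_pipeline(raw_lines, parse, extract_features, analyse):
    """Sketch of the end-to-end ALFAF pipeline.

    raw_lines: raw log messages read from a system-generated log file.
    parse: Log Parsing stage - maps a raw line to a structured event.
    extract_features: Feature Engineering stage - maps parsed events
        to a model-ready representation.
    analyse: Log File Analysis stage - produces the analysis result.
    """
    parsed = [parse(line) for line in raw_lines]   # Log Parsing
    features = extract_features(parsed)            # Feature Engineering
    return analyse(features)                       # Log File Analysis
```

Each stage only consumes the previous stage's output, which keeps the subcomponents loosely coupled and individually replaceable.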
Input The ALFAF shall receive raw, system-generated log files as input. These log files shall contain log messages stored as raw text.
Output The ALFAF shall make the following outputs available after performing automated log file analysis:
• Debugging Report - a comma-separated values (CSV) file containing the entire, original log file in a structured format, with suspicious or potentially anomalous log messages flagged. This report also indicates why a given log message is flagged as an anomaly. It may be used by system operators or developers to understand what may have gone wrong during system operation.
• Suspicious Lines Report - a CSV file containing only the suspicious or potentially anomalous log messages. This report may be used by system operators to quickly identify anomalous lines and cross-reference them with the Debugging Report to speed up analysis.
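A minimal sketch of producing the two reports is given below. The record layout (column names and the per-line dictionary structure) is an assumption made for illustration; the actual report schema is defined by the framework itself. The key point is that the Suspicious Lines Report is simply the anomalous subset of the Debugging Report:

```python
import csv

def write_reports(parsed_lines, debug_path, suspicious_path):
    """Write the Debugging Report (all lines, with anomaly flags and
    reasons) and the Suspicious Lines Report (flagged lines only).

    parsed_lines: list of dicts with keys
        'line_no', 'message', 'anomalous' (bool), 'reason' (str).
    """
    fields = ["line_no", "message", "anomalous", "reason"]
    # Debugging Report: every line of the original log file.
    with open(debug_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(parsed_lines)
    # Suspicious Lines Report: only the flagged lines.
    with open(suspicious_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(r for r in parsed_lines if r["anomalous"])
```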
Operational Modes The ALFAF shall have two modes of operation: a training mode, in which the framework is trained on log files generated by a given system, and an inference mode, in which the framework is used to detect failures and errors within new log files generated by the same system.
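One plausible way to expose the two operational modes is via a command-line front end with a subcommand per mode. The sketch below is hypothetical (the program name, flags, and argument names are assumptions, not the framework's actual interface), but it illustrates how the inference mode would require the previously learned model parameters while the training mode would not:

```python
import argparse

def build_parser():
    """Hypothetical CLI for the two ALFAF operational modes:
    'train' fits the framework on a system's log files, and
    'infer' analyses new log files from the same system."""
    parser = argparse.ArgumentParser(prog="alfaf")
    sub = parser.add_subparsers(dest="mode", required=True)

    train = sub.add_parser("train", help="train on a system's log files")
    train.add_argument("log_file", help="raw log file used for training")

    infer = sub.add_parser("infer", help="analyse a new log file")
    infer.add_argument("log_file", help="raw log file to analyse")
    infer.add_argument("--model-parameters", required=True,
                       help="path to previously learned model parameters")
    return parser

# Example invocation of the inference mode.
args = build_parser().parse_args(
    ["infer", "system.log", "--model-parameters", "model.pt"])
print(args.mode, args.log_file)  # infer system.log
```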
Data Stores The ALFAF is designed to be robust and able to perform log file analysis on the log files of any given system. However, the framework must first be trained on a particular system before it can be effective. Part of this training involves creating two data stores.
The first contains a history of log event key to log event template mappings. This data store is generated during framework training and is used to ensure that all subsequent log files, of the same system, are parsed and mapped using a consistent log event key to log event template mapping. The second data store contains the parameters describing a previously trained deep learning-based Anomaly Detection Model that was trained to detect anomalous log events from log files for the system under test.
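The first data store only needs to persist and reload a key-to-template mapping. As a minimal sketch (JSON is an assumed storage format, and the function names are illustrative; the thesis does not prescribe this representation), it could look as follows:

```python
import json
from pathlib import Path

def save_key_store(mapping, path):
    """Persist the log event key -> log event template mapping created
    during training, so that subsequent log files from the same system
    are parsed with a consistent mapping."""
    Path(path).write_text(json.dumps(mapping, indent=2))

def load_key_store(path):
    """Reload a previously saved mapping before parsing new log files."""
    return json.loads(Path(path).read_text())

# Example: two templates discovered during training, with '<*>'
# standing in for the variable parts of each log message.
store = {"k1": "Received block <*> of size <*>",
         "k2": "Deleting block <*>"}
```

The second data store, the learned model parameters, would typically be saved in whatever serialisation format the chosen deep learning library provides, with only its file path recorded by the framework.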
Interfaces The ALFAF has an internal interface between the Data Miner and Inference Engine subcomponents. It shall also have external interfaces to the system under test and the end user, i.e. the operator or developer.