
High-Level Design: Inference Engine


This section presents the high-level design of the Inference Engine. The design is derived by considering the processing path of incoming parsed log file data through the Inference Engine and by taking into account its major objectives and design goals. This high-level design identifies the major subcomponents and processes of the Inference Engine, describes its functional behaviour, and outlines its complete end-to-end design.

The main objective of the Inference Engine is two-fold. Firstly, the Inference Engine shall ingest a parsed log file and train an anomaly detection algorithm to identify failures and errors in a system from these log files. Secondly, the Inference Engine shall again ingest a parsed log file and, using a previously trained model, shall identify, detect and flag potential erroneous events within the provided log files.

In order for the Inference Engine to perform both objectives, two modes of operation are defined: a Training Mode, in which the Inference Engine trains an anomaly detection model, and an Anomaly Detection Mode, in which the Inference Engine loads a previously trained model and detects anomalies within system log files.
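As a brief illustration of this two-mode structure, the sketch below shows how mode selection might be represented in code. It is a minimal sketch only; the Mode enumeration and the run function are hypothetical names, not part of the design described here.

    from enum import Enum

    class Mode(Enum):
        """Operating modes of the Inference Engine (illustrative only)."""
        TRAINING = "training"
        ANOMALY_DETECTION = "anomaly_detection"

    def run(mode: Mode, parsed_log_path: str) -> None:
        # Both modes share the early processing stages; they diverge at
        # Model Training (Training Mode) versus Anomaly Detection.
        if mode is Mode.TRAINING:
            print(f"Training a new anomaly detection model on {parsed_log_path}")
        else:
            print(f"Detecting anomalies in {parsed_log_path} using a saved model")

    run(Mode.TRAINING, "parsed_log_file.csv")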

With these design considerations and the main objectives in mind, the high-level design of the Inference Engine is illustrated in Figure 5.1.

Figure 5.1: High-Level Design of the Inference Engine component of the Automated Log File Analysis Framework. The major subcomponents of the Inference Engine, the Feature Extractor and the Anomaly Detection Model, are illustrated alongside the inputs to and outputs from the Inference Engine. Also shown are the different data path flows for model training and for model inference.

As shown in Figure 5.1, the Inference Engine has two data processing pipelines, one corresponding to each mode of operation. The Inference Engine consists of two main subcomponents, the Feature Extractor and the Anomaly Detection Model, as well as a number of processes. Both subcomponents are used during both operational modes but are configured differently. The processes used vary depending on the mode of operation.

There are four major processing stages performed by the Inference Engine: Feature Extraction, Data Loading, Model Training and Anomaly Detection. These are illustrated in Figure 5.1.
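To make the stage ordering concrete, the following minimal sketch wires the four stages together as plain Python callables. The stage functions and their toy bodies are placeholders, assumed purely for illustration; the actual implementations are described in the sections referenced below.

    def feature_extraction(parsed_log):
        # Stage 1: derive one feature vector per parsed log event (toy: event length).
        return [[len(event)] for event in parsed_log]

    def data_loading(features):
        # Stage 2: batch the features for the deep learning model (toy: identity).
        return list(features)

    def model_training(batches):
        # Stage 3, Training Mode only: return stand-in model parameters.
        return {"threshold": 20}

    def anomaly_detection(batches, params):
        # Stage 4, Anomaly Detection Mode only: flag events exceeding the threshold.
        return [i for i, vec in enumerate(batches) if vec[0] > params["threshold"]]

    parsed_log = ["service started", "unexpected fatal error in worker process 17"]
    batches = data_loading(feature_extraction(parsed_log))
    params = model_training(batches)              # Training Mode path
    print(anomaly_detection(batches, params))     # Anomaly Detection Mode path -> [1]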

The first processing stage, Feature Extraction, is responsible for ingesting the parsed log file and extracting useful, representative features that can aid in the objective of anomaly detection using a deep learning algorithm. Both modes of operation require Feature Extraction, but the way in which the final data representation is derived in either case differs. The Feature Extractor subcomponent implements the Feature Extraction process and is detailed further in Section 5.3.
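Although the concrete feature set is left to Section 5.3, a common representation for parsed log data in anomaly detection work is an event-count vector computed over a sliding window of log event identifiers. The sketch below assumes such a representation purely for illustration; the window size and the event-ID input format are assumptions, not the design's actual choices.

    from collections import Counter

    def extract_count_features(event_ids, window=3):
        # Build a fixed vocabulary of event IDs, then emit one
        # event-count vector per sliding window over the sequence.
        vocab = sorted(set(event_ids))
        index = {event: i for i, event in enumerate(vocab)}
        vectors = []
        for start in range(len(event_ids) - window + 1):
            counts = Counter(event_ids[start:start + window])
            vector = [0] * len(vocab)
            for event, n in counts.items():
                vector[index[event]] = n
            vectors.append(vector)
        return vocab, vectors

    vocab, features = extract_count_features(["E1", "E2", "E1", "E3", "E1", "E2"])
    print(vocab)      # ['E1', 'E2', 'E3']
    print(features)   # one count vector per window, e.g. [2, 1, 0] for the first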

After Feature Extraction, the next processing stage in the pipeline is Data Loading. During this stage, the data, a feature-rich dataset representing the parsed log files, is loaded into the correct data structures and format in preparation for feeding into a deep learning model. During this process, the data is also copied to memory on the applicable compute device that will be used for model training or inference. Both modes of operation require this process. The Data Loading process is detailed in Section 5.5.
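The text does not name a deep learning framework; assuming a PyTorch-based implementation purely for illustration, the Data Loading stage could be sketched as follows. The batch size and device selection shown are arbitrary examples.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Stand-in for the feature-rich dataset produced by Feature Extraction:
    # 128 feature vectors of 16 features each.
    features = torch.randn(128, 16)

    loader = DataLoader(TensorDataset(features), batch_size=32, shuffle=True)

    # Copy each batch to the compute device used for training or inference.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    for (batch,) in loader:
        batch = batch.to(device)
        # ... the batch would be fed into the Anomaly Detection Model here ...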

Once the Data Loading process is complete, the processing pipeline diverges depending on the mode of operation. In Training Mode, after loading the data, the Inference Engine initiates the Model Training processing stage. During this processing stage, the Anomaly Detection Model is trained to model relationships and dependencies among the input features in the data to detect and flag anomalous log events within the log files. This processing stage produces a trained model described by a list of model parameters that can be loaded to use the model for inference at a later stage.
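As a hedged example of how this stage could look, the sketch below trains a small autoencoder on the loaded features and saves its learned parameters. The autoencoder architecture is an assumption made only for illustration; the actual Anomaly Detection Model is specified in Section 5.4.

    import torch
    from torch import nn

    # Illustrative stand-in for the Anomaly Detection Model (see Section 5.4).
    model = nn.Sequential(nn.Linear(16, 4), nn.ReLU(), nn.Linear(4, 16))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    features = torch.randn(128, 16)   # stand-in for the loaded training data
    for epoch in range(10):
        optimizer.zero_grad()
        loss = loss_fn(model(features), features)  # reconstruction objective
        loss.backward()
        optimizer.step()

    # The trained model is fully described by its parameters, which are
    # saved so the model can be reloaded for inference at a later stage.
    torch.save(model.state_dict(), "model_parameters.pt")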

In Anomaly Detection Mode, after the data is loaded, the Inference Engine executes the Anomaly Detection process. During this processing stage, the Inference Engine attempts to detect and identify anomalies within the provided log file. This processing stage uses the Anomaly Detection Model subcomponent, but instead of training a new model, model parameters learned during Model Training are loaded. The output of this processing stage is a report identifying potentially anomalous log events.
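Continuing the hypothetical autoencoder from the previous sketch, the Anomaly Detection path could load the saved parameters, score incoming feature vectors by reconstruction error, and flag outliers for the report. The thresholding rule shown is one common heuristic, assumed here for illustration only.

    import torch
    from torch import nn

    # Re-instantiate the model and load the previously trained parameters.
    model = nn.Sequential(nn.Linear(16, 4), nn.ReLU(), nn.Linear(4, 16))
    model.load_state_dict(torch.load("model_parameters.pt"))
    model.eval()

    features = torch.randn(64, 16)    # stand-in for newly loaded log features
    with torch.no_grad():
        errors = ((model(features) - features) ** 2).mean(dim=1)

    # Flag events whose reconstruction error is unusually large.
    threshold = errors.mean() + 3 * errors.std()
    report = [{"event": i, "score": float(e)}
              for i, e in enumerate(errors) if e > threshold]
    print(f"{len(report)} potentially anomalous log events flagged")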

The Model Training and Anomaly Detection processes are detailed in Section 5.5.

The Anomaly Detection Model, utilized by the Model Training and Anomaly Detection processes, is a subcomponent that implements the deep learning model that is to be trained to perform anomaly detection. The design of this model is discussed in Section 5.4.

As shown in Figure 5.1, there are three main inputs to the Inference Engine. The first is a dataset containing parsed log files as generated by the Data Miner detailed in Chapter 4. The format of this dataset is detailed in Chapter 6. The second input is a configuration file that contains configuration information for the various subcomponents and processes of the Inference Engine. The details of the configuration file are described in Section 5.5. Lastly, when used in Anomaly Detection Mode, a set of parameters describing a previously trained deep learning model is required to instantiate a version of the model capable of performing anomaly detection.
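Since the configuration schema itself is only given in Section 5.5, the layout below is hypothetical, shown solely to illustrate how per-subcomponent settings might be grouped and read. All section and key names are invented for this example.

    import configparser
    import textwrap

    # Hypothetical configuration layout; the real schema is given in Section 5.5.
    example = textwrap.dedent("""\
        [feature_extractor]
        window_size = 5

        [data_loading]
        batch_size = 32
        device = cuda

        [model]
        parameters_path = model_parameters.pt
        """)

    config = configparser.ConfigParser()
    config.read_string(example)
    print(config["data_loading"]["batch_size"])   # prints: 32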

The main outputs of the Inference Engine are the model parameters when used in Training Mode and a report detailing detected anomalies when used in Anomaly Detection Mode.

The integration of all subcomponents and the implementation of all processes and sub-processes making up the Inference Engine are detailed in Section 5.5.
