
Data Miner Tuning and Verification

In the PDF document "Presented by: University" (Pages 118-123)

This section firstly details the design and functional verification of the Data Miner. Following that, the Data Miner is tuned to achieve optimal performance. The tuning process yields the optimal set of parameters for each of the available log parsing algorithms. The optimal set of

parameters is then finally used to evaluate the performance of the Data Miner across all of the available log parsing algorithms.

7.2.1 Design and Functional Verification

With reference to Section 4.1 in Chapter 4, the following design and functional considerations and requirements proposed for the Data Miner are considered and intrinsically verified:

Input Log Files The Data Miner is designed to ingest log messages produced by any system, stored as raw text in a file. The maximum size of log file that may be ingested is limited by the available compute capability and not the design of the Data Miner. This is detailed in Sections 4.3 and 4.6.

Output Data After processing a given log file, the Data Miner generates two outputs: a structured log file in which each log message is represented by a log event template, corresponding log event key and extracted runtime parameters; and an event occurrence matrix that contains counts of each log event template occurring in the log file. This is detailed in Section 4.6.
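The two outputs described above can be sketched in a few lines. This is an illustrative reconstruction, not the thesis's actual implementation; the tuple layout and function name are assumptions.

```python
from collections import Counter

# Hypothetical sketch of the Data Miner's two outputs. Each parsed message
# is assumed to be an (event_key, template, params) tuple; names are
# illustrative only.
def build_outputs(parsed_messages):
    structured_log = []          # one record per log message
    occurrence_counts = Counter()  # event occurrence matrix (one row here)
    for event_key, template, params in parsed_messages:
        structured_log.append(
            {"event_key": event_key, "template": template, "params": params}
        )
        occurrence_counts[event_key] += 1
    return structured_log, occurrence_counts

# Toy HDFS-style parsed messages
parsed = [
    ("E1", "Received block <*> of size <*>", ["blk_123", "67108864"]),
    ("E2", "Deleting block <*>", ["blk_456"]),
    ("E1", "Received block <*> of size <*>", ["blk_789", "67108864"]),
]
structured, counts = build_outputs(parsed)
```

The occurrence counts for a single log file form one row of the event occurrence matrix consumed by the Inference Engine.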

Interfaces An interface that facilitates data transfer from the Data Miner to the Inference Engine exists and is detailed in Section 6.2.

Parsing Algorithm The Data Miner supports multiple state-of-the-art log parsing algorithms including: AEL, Drain, IPLoM, LenMa, LogMine and Spell. An interface exposing the configurable parameters of each algorithm is also available and detailed in Section 4.6.

Usage The Data Miner was designed and developed as a stand-alone tool. It is run independently of the framework during the algorithm tuning and performance evaluation experiments detailed in this section.

Performance Metrics The Data Miner is capable of outputting the following performance metrics: Parsing Accuracy and Parsing Time. These are detailed further in this section.

7.2.2 Algorithmic Tuning

As detailed in Sections 4.5.1 and 4.7, each of the log parsing algorithms that the Data Miner supports has a set of configurable parameters that affect the performance of the algorithm on a given log file. In order to find the optimal set of parameters for each algorithm, algorithmic tuning of the Data Miner is required. The objective of this tuning is to find the optimal set of parameters, for a particular log parsing algorithm, that results in the highest parsing accuracy on log files from a given system. Parsing accuracy, detailed in Section 7.2.3, is the performance metric that best describes the ability of a log parsing algorithm to reduce a raw log to a structured, but still representative, dataset and has been shown to have the most impact on the performance of later log file analysis.
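The tuning objective described above amounts to an exhaustive search over each algorithm's parameter space, keeping the parameter set that maximises parsing accuracy. The following is a minimal sketch of such a grid search; the `parse_and_score` callable and the example parameter grid are assumptions for illustration, not the thesis's tuning tool.

```python
from itertools import product

# Illustrative grid-search tuner. parse_and_score(params) is assumed to run
# a log parsing algorithm with the given parameters and return its parsing
# accuracy on a ground-truth dataset.
def tune(parse_and_score, param_grid):
    """Return the parameter set achieving the highest parsing accuracy.

    param_grid maps parameter names to lists of candidate values; the
    search space size is the product of the candidate list lengths.
    """
    best_params, best_accuracy = None, -1.0
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        accuracy = parse_and_score(params)
        if accuracy > best_accuracy:
            best_params, best_accuracy = params, accuracy
    return best_params, best_accuracy

# Toy scoring function that peaks at similarity_threshold=0.12, depth=1
score = lambda p: 1.0 - abs(p["similarity_threshold"] - 0.12) - 0.1 * (p["depth"] - 1)
grid = {"similarity_threshold": [0.1, 0.12, 0.5], "depth": [1, 2, 4]}
best, acc = tune(score, grid)
```

The search space sizes reported in Table 7.2 correspond to the number of parameter combinations such a search must evaluate, which is why tuning time grows with both the grid size and the per-run cost of the algorithm.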

The log parsing algorithm tuning tool developed in Section 4.7 is used to tune the Data Miner.

Tuning is performed for the Data Miner across all supported log parsing algorithms. The HDFS dataset with a ground truth structured log file is used.

The results of Data Miner algorithmic tuning, across all algorithms, are presented in Table 7.2.

Table 7.2: Data Miner algorithmic tuning results

                                        Log Parsing Algorithm Performance
Parameter                      AEL       Drain     IPLoM     LenMa     LogMine   Spell
Search Space Size              1919      2020      12648     101       600       101
Time Taken to Tune (seconds)   237.93    605.34    1849.35   28.89     1856.49   22.77
Minimum Parsing Accuracy       0.3500    0.3500    0.8420    0.4355    0.0000    0.3180
Maximum Parsing Accuracy       0.9975    0.9975    1.0000    0.9975    0.9975    1.0000

Table 7.2 shows the results of the algorithmic tuning of the Data Miner. As can be seen from the Minimum and Maximum Parsing Accuracy, the performance of the algorithms varies substantially as the configurable parameters change. All algorithms were capable of achieving a parsing accuracy greater than 99%, with the IPLoM and Spell algorithms able to achieve 100% parsing accuracy on the given dataset. The time required to tune each log parsing algorithm also varies substantially and is affected by both the total size of the parameter search space and the complexity and implementation of the log parsing algorithm.

The Maximum Parsing Accuracy, for each algorithm, is achieved when using the optimal set of parameters derived after tuning each algorithm. These optimal parameter sets are used in further performance evaluation of the Data Miner. It should be noted that these optimal parameters were derived through tuning the Data Miner on the HDFS dataset and are not expected to perform favourably on log files produced by other systems. For reference, the optimal parameters that result in the best performance for each algorithm are shown in Table 7.3.

Table 7.3: Optimal parameters for the various log parsing algorithms after tuning on the HDFS dataset

Log Parsing Algorithm   Optimal Parameters

AEL       Merge Percentage: 0.25
          Minimum Event Count: 2
Drain     Parse Tree Depth: 1
          Similarity Threshold: 0.12
IPLoM     Cluster Goodness Threshold: 0.3
          Partition Support Threshold: 0
          Lower Bound: 0.01
          Upper Bound: 0.9
          File Support Threshold: 0
LenMa     Clustering Similarity Threshold: 0.54
LogMine   Clustering Parameter (k): 0.1
          Clustering Hierarchy Levels: 2
          Maximum Distance: 0.004
Spell     Longest Common Subsequence Threshold (Tau): 0.67

7.2.3 Performance Evaluation

The performance of the Data Miner is evaluated in terms of: Parsing Accuracy, Parsing Time, Efficiency and Robustness. These metrics, how they are measured and the results of the Data Miner’s performance evaluation are detailed in this section. During the performance evaluation process, each log parsing algorithm is configured with the optimal set of parameters as derived during algorithm tuning in Section 7.2.2 and detailed in Table 7.3.

Parsing Accuracy

Parsing accuracy is defined as the ratio of correctly parsed log messages to all log messages in a given log file [40]. This is shown in Equation 7.1. Measuring parsing accuracy requires a ground truth of parsed log messages for a given log file.

parsing accuracy = (# correctly parsed log messages) / (# total log messages)   (7.1)

The log parsing algorithm parses each log message into an event template which consists of the static part of the log message and the variable parts represented by wildcards (*). The number of correctly parsed log messages is determined by considering groups of messages in the parsed log file generated by the log parsing algorithm. A group of log messages, represented by the same log event template, is considered to be correctly parsed if two conditions are satisfied. Firstly, there is only a single corresponding log event describing these log messages in the ground truth, and secondly, the number of log messages represented by the event template in the generated parsed log is equal to the number of log messages represented by the corresponding event template in the ground truth. If these conditions are satisfied then the total number of log messages, corresponding to this specific event template, is added to the number of correctly parsed log messages.
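The group-matching conditions above can be implemented by comparing the set of message positions covered by each generated template against the ground-truth groups. The following is a minimal sketch under the assumption that both logs are given as lists of event templates, one entry per log message in the same order; the function name is illustrative.

```python
from collections import defaultdict

# Sketch of the group-based parsing accuracy of Equation 7.1. A generated
# group is correct only if exactly the same set of messages forms a single
# group in the ground truth (which implies equal message counts).
def parsing_accuracy(ground_truth, parsed):
    def groups(templates):
        grouped = defaultdict(set)
        for line_id, template in enumerate(templates):
            grouped[template].add(line_id)
        return grouped

    truth_groups = groups(ground_truth)
    correct = 0
    for message_ids in groups(parsed).values():
        # Membership test uses set equality: the same messages must map to
        # one, and only one, ground-truth event template.
        if message_ids in truth_groups.values():
            correct += len(message_ids)
    return correct / len(parsed)

truth  = ["A <*>", "A <*>", "B", "C <*>"]
parsed = ["A <*>", "A <*>", "B", "B"]  # last message wrongly merged into B
```

Note that merging two ground-truth events into one template penalises every message in the merged group, which is why poorly configured algorithms score so low in Table 7.2.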

The parsing accuracy metric describes how well a given log parsing algorithm is able to correctly extract log event templates from a log file. This metric is of particular importance when the parsed log files are to be used for later log file analysis processes.

The results of the Data Miner Parsing Accuracy tests are shown in Table 7.4.

Table 7.4: Data Miner Parsing Accuracy Test Results

                            Log Parsing Algorithm Performance
Metric                    AEL      Drain    IPLoM    LenMa    LogMine   Spell
Parsing Accuracy          0.9975   0.9975   1.0000   0.9975   0.9975    1.0000
Unique Events Extracted   16       16       14       16       16        14
Time Taken (seconds)      0.08     0.16     0.14     0.27     2.50      0.17

As previously noted, when tuned, all log parsing algorithms are capable of achieving a very high parsing accuracy. Two algorithms are able to achieve a perfect parsing accuracy score, and similarly have identified the same number of unique log event templates from the dataset.

Parsing Time and Efficiency

Parsing time is simply measured as the time taken for a log parsing algorithm to complete parsing of a given log file. The efficiency of each log parsing algorithm is evaluated by measuring the parsing time on log files of increasing size to determine how the parsing time scales.
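This measurement procedure can be sketched as a small timing harness that parses progressively larger prefixes of a log file. The `parse_fn` callable and function name are assumptions for illustration; any of the supported algorithms could be plugged in.

```python
import time

# Illustrative efficiency harness: time a parser on log prefixes of
# increasing size to observe how parsing time scales.
def measure_efficiency(parse_fn, log_lines, sizes=(1_000, 10_000, 100_000)):
    results = {}
    for n in sizes:
        if n > len(log_lines):
            break  # not enough log messages for this test size
        start = time.perf_counter()
        parse_fn(log_lines[:n])
        results[n] = time.perf_counter() - start  # parsing time in seconds
    return results

# Toy parser: tokenise each message (a stand-in for a real algorithm)
timings = measure_efficiency(lambda lines: [line.split() for line in lines],
                             ["msg %d" % i for i in range(10_000)],
                             sizes=(1_000, 10_000))
```

A wall-clock timer such as `time.perf_counter` is appropriate here because parsing time, not CPU time, is what bounds the end-to-end analysis pipeline.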

The results of the Data Miner Parsing Efficiency Tests are shown in Table 7.5. These results are also illustrated in Figure 7.1.

Table 7.5: Data Miner Parsing Efficiency Test Results

                              Log Parsing Algorithm Parsing Time (s)
Log Message Count   AEL      Drain    IPLoM    LenMa     LogMine   Spell
1k                  0.05     0.09     0.07     0.13      0.61      0.12
10k                 0.39     0.79     0.66     1.32      58.43     0.81
100k                3.86     8.08     6.78     13.14     DNF       8.12
1000k               47.01    81.47    71.66    129.61    DNF       84.68
10000k              778.20   808.75   712.36   1569.39   DNF       839.56

DNF = Did not finish within 1 hour

[Figure: plot of parsing time (seconds) against the number of log messages in the log file (10^3 to 10^7) for the AEL, Drain, IPLoM, LenMa, LogMine and Spell algorithms.]

Figure 7.1: Plot showing efficiency of log parsing algorithms

As shown in Figure 7.1, for log files containing up to 1 million log messages, most of the log parsing algorithms’ parsing times scale in a similar manner. The LogMine algorithm was unable to complete parsing a log file containing 100 000 log messages in under an hour. The LenMa algorithm’s parsing time increases substantially beyond log files containing 1 million messages and diverges from the scaling exhibited by the other algorithms.

Robustness

The robustness of the Data Miner is assessed in terms of how well the various log parsing algorithms, tuned toward log files from a particular system, perform on log files generated by different systems.

To measure the robustness, three additional log file datasets from other systems were used: OpenStack, the Blue Gene/L Supercomputer and the Thunderbird Supercomputer. These datasets contained raw log files in addition to a ground truth of parsed log messages, and were obtained from Logparser [7].

Each log parsing algorithm was configured with the optimal parameters derived through algorithmic tuning on the HDFS dataset, and then run on the other three datasets to measure the achieved parsing accuracy.
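The robustness protocol above reduces to reusing one fixed parameter set across several datasets. A minimal sketch follows; the `run_parser` callable, its signature and the dataset layout are assumptions for illustration only.

```python
# Sketch of the robustness protocol: apply the HDFS-tuned parameters,
# unchanged, to log files from other systems and record the accuracy.
def robustness_scores(run_parser, hdfs_params, datasets):
    """datasets maps a system name to its (raw_log, ground_truth) pair;
    run_parser returns the parsing accuracy achieved on that pair."""
    return {
        system: run_parser(raw_log, ground_truth, hdfs_params)
        for system, (raw_log, ground_truth) in datasets.items()
    }

# Toy stand-in parser: full accuracy only when output matches ground truth
def mock_run(raw_log, ground_truth, params):
    return 1.0 if raw_log == ground_truth else 0.5

scores = robustness_scores(mock_run, {"depth": 1},
                           {"BGL": (["a"], ["a"]), "OpenStack": (["a"], ["b"])})
```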

The results of the Data Miner Robustness Tests are shown in Table 7.6.

Table 7.6: Data Miner Robustness Test Results

                               AEL      Drain    IPLoM    LenMa    LogMine   Spell
HDFS Parsing Accuracy          0.9975   0.9975   1.0000   0.9975   0.9975    1.0000
BGL Parsing Accuracy           0.8215   0.8630   0.9375   0.6895   0.7805    0.7615
OpenStack Parsing Accuracy     0.2950   0.1850   0.3095   0.3305   0.2850    0.2520
Thunderbird Parsing Accuracy   0.6530   0.5135   0.6630   0.9430   0.6520    0.6445

The results from the robustness tests show that the performance of the log parsing algorithms on a particular system’s log files is very sensitive to the parameters used to configure the algorithm. There is also no observable pattern in the performance reduction, which indicates that this variance is attributable to the structure of the log messages themselves.
