In Thesis Projects (pages 87–91)

Part I Concepts

10.2 Presenting Numerical Data

Other information. Depending on the purpose of the implementation, you may have to provide the reader with other information about the software. Examples include hardware and software requirements, user instructions, etc.

If you have built a simulator, you can also validate the implementation by using reference algorithms (i.e. baselines). By doing so you show that your implementation behaves correctly with respect to already established algorithms and baselines.

Fig. 10.3 shows the dependence of update time on the number of tuples, indicating that update time increases exponentially with the number of tuples. It is customary to organise line plots in such a way that the values of the independent variable appear on the x-axis, and those of the dependent variable on the y-axis.

In Fig. 10.4, we have added a second dependent variable to the line plot. This one shows the update time for the proposed new algorithm. It is immediately visible to the reader that the new algorithm appears to scale better to situations with large numbers of tuples. Another detail that has been added in the new graph is a set of error bars, showing the variation in average update time for each number of tuples.

The two algorithms appear to perform approximately equally in the range 10,000–60,000 tuples, so it is not apparent from the graph whether we can draw the conclusion that the new algorithm is better over the whole investigated interval. It is therefore necessary to show the lower part of the interval in more detail. One way of doing this is simply to plot the range 10,000–60,000 tuples in a separate graph (another alternative is to use a log scale for the y-axis).
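Both alternatives can be sketched in a few lines of matplotlib (assuming it is available). The update times below are illustrative, except the 1.1/0.8 s and 2.1/1.6 s values at 10,000 and 20,000 tuples, which come from the text:

```python
# Sketch: a zoomed-in subrange plot next to a log-scale plot of the full
# interval. Most data values are made up for illustration.
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

tuples = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # x 1,000
old = [0.8, 1.6, 2.9, 4.8, 7.5, 11.0, 24.0, 48.0, 83.0, 130.0]
new = [1.1, 2.1, 3.2, 4.6, 6.4, 8.6, 11.2, 14.1, 17.4, 21.0]

fig, (ax_zoom, ax_log) = plt.subplots(1, 2, figsize=(8, 3))

# Alternative 1: a separate graph for the 10,000-60,000 tuple range only.
ax_zoom.plot(tuples[:6], old[:6], marker="o", label="Old")
ax_zoom.plot(tuples[:6], new[:6], marker="s", label="New")
ax_zoom.set_xlabel("Tuples (x 1,000)")
ax_zoom.set_ylabel("Seconds")
ax_zoom.legend()

# Alternative 2: the whole interval with a logarithmic y-axis.
ax_log.plot(tuples, old, marker="o", label="Old")
ax_log.plot(tuples, new, marker="s", label="New")
ax_log.set_yscale("log")
ax_log.set_xlabel("Tuples (x 1,000)")
ax_log.set_ylabel("Seconds (log scale)")
ax_log.legend()

fig.savefig("update_times.png")
```

The log scale keeps the whole interval in one graph, at the price of making absolute differences harder to read off; the zoomed subrange keeps linear axes but must be clearly labelled as a complement to the full plot.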

[Fig. 10.3 Line plot showing performance, i.e. update time of the old database access algorithm ("Update time, old algorithm"; x-axis: Tuples (×1,000), y-axis: Seconds). All update times plotted are averages of ten simulations]

[Fig. 10.4 Line plot showing performance comparison between two database access algorithms ("Update time, both algorithms"; series: Old, New; x-axis: Tuples (×1,000), y-axis: Seconds)]

This is done in Fig. 10.5, and it indeed shows that the new algorithm performs worse than the old algorithm for the interval 10,000–40,000 tuples. On the other hand, the difference between the two algorithms is very small in this interval, resulting in an overlap between the error bars. This means that the small difference can probably be explained as chance variation, rather than as a real difference in performance between the algorithms.

10.2.2 Avoiding Misleading Graphs

When choosing which interval to plot in the line graph, it is important to keep in mind that you can easily give a false impression by showing too narrow an interval.

Consider choosing to plot the interval 10,000–40,000 tuples in Fig. 10.5. If it is not clearly pointed out to the reader that such a plot is a complement to the one for the whole interval, in Fig. 10.4, the graph may give the misleading impression that the new algorithm is worse than the old one. This misleading impression could of course also arise from the table, if we were to include only rows 1–4 in Table 10.2.

The purpose of using graphs is to make numerical data easier to understand by visualising them. We saw an example of this in Sect. 10.2.1 where line plots were used to visualise the data from Table 10.2. However, since these visualisations can be done in different ways, it is also possible to give the reader different impressions of the data. To some extent, this is good, since different interpretations are possible, but there is also the risk of giving false or misleading impressions.

[Fig. 10.5 Line plot showing performance comparison between two database access algorithms ("Update time, both algorithms"; series: Old, New; x-axis: Tuples (×10,000), y-axis: Seconds). Same as the graph in Fig. 10.4, except that only the interval 10,000–60,000 tuples is included]

An example of a misleading graph is shown in Fig. 10.6. It contains a column plot where the y-axis has been cut so that it starts at 0.75, rather than at zero. The consequence of this is that the differences between the algorithms seem bigger than they really are. This should be obvious by considering the difference at 10,000 tuples, where the column for the new algorithm is about ten times higher than that for the old algorithm, although the real difference is that the new algorithm takes 1.1 s and the old 0.8 s. The new algorithm thus takes 37% longer to execute than the old one, which means that the column should only be 37% higher. This discrepancy decreases further up the x-axis, but is present throughout the whole graph. At 20,000 tuples, the column for the new algorithm is 55% higher than the column for the old algorithm, but Table 10.2 shows that the new algorithm took only 31% longer to execute than the old one (2.1 and 1.6 s, respectively).
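The arithmetic behind this exaggeration is easy to check. In the sketch below the helper names are ours, and only the timing values come from the text (the exact drawn ratio in a real figure also depends on how the plot is rendered):

```python
# Back-of-the-envelope check of how a cut y-axis exaggerates differences.
# Values from the text: at 10,000 tuples the new algorithm takes 1.1 s and
# the old one 0.8 s; at 20,000 tuples, 2.1 s vs. 1.6 s.

def real_increase(new, old):
    """Actual relative slowdown of the new algorithm."""
    return (new - old) / old

def drawn_ratio(new, old, axis_start):
    """Ratio of column heights as drawn when the y-axis starts above zero."""
    return (new - axis_start) / (old - axis_start)

for n_tuples, new, old in [(10_000, 1.1, 0.8), (20_000, 2.1, 1.6)]:
    print(f"{n_tuples} tuples: really {real_increase(new, old):.0%} slower, "
          f"but drawn {drawn_ratio(new, old, 0.75):.2f}x as high")
```

At 10,000 tuples the real slowdown is about 37%, while the drawn column is several times as high as the old one; shrinking the visible range from 0.8 s down to 0.05 s is what inflates the visual difference.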

There are tools available today for drawing graphs very rapidly by automating part of the process. Some of these tools are easy to learn, and very efficient to use for generating graphs for your report. A drawback, however, is that they can sometimes automatically create misleading graphs. The shortening of the y-axis of Fig. 10.6, for example, may be done automatically by some tools. You must therefore inspect the graph carefully, and correct any mistakes made by the tool.

10.2.3 Significance Tests

You may be presenting a numerical comparison of data from experiments or simulations where you have varied one parameter. In such a case, if the system you have studied is stochastic, i.e. if the outcome of an individual run depends to some degree on chance, then it will be necessary to repeat each parameter setting a number of times and present the average result of the runs, since a single run can produce an untypical result by chance. It is important that any results used for drawing conclusions are not just random effects; this can be checked by applying a test for statistical significance.
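A minimal sketch of this repeat-and-average procedure, where `simulate_update_time` is a hypothetical stand-in for the real simulator and the timing model is made up:

```python
# Sketch: repeat a stochastic run several times per parameter setting and
# report the mean and standard deviation (the basis for error bars).
import random
import statistics

def simulate_update_time(n_tuples, rng):
    """Hypothetical stochastic simulation of one run (illustrative model)."""
    base = 0.08 * (n_tuples / 1000)          # deterministic part
    return base * rng.uniform(0.9, 1.1)      # +-10% chance variation

rng = random.Random(42)   # fixed seed so the experiment is reproducible
runs_per_setting = 10     # as in the book: averages of ten simulations

results = {}
for n_tuples in (10_000, 20_000, 30_000):
    times = [simulate_update_time(n_tuples, rng)
             for _ in range(runs_per_setting)]
    results[n_tuples] = (statistics.mean(times), statistics.stdev(times))

for n_tuples, (mean, sd) in results.items():
    print(f"{n_tuples} tuples: {mean:.2f} +/- {sd:.2f} s")
```

The mean is what gets plotted; the standard deviation (or a standard error, or a confidence interval, depending on convention) becomes the error bar, so the reader can judge how much of a difference could be chance variation.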

As an example, we can observe that the advantages of the old algorithm in Fig. 10.4 were not statistically significant, since the error bars overlap for every plotted point in the interval 10,000–60,000 tuples. The differences between the algorithms do, however, seem to be significant for all points above 60,000 tuples.

[Fig. 10.6 Misleading column plot, which makes the difference between the old and new algorithm seem larger by letting the y-axis begin at 0.75 s (series: Old, New; x-axis: Tuples (×1,000), y-axis: Seconds)]

In summary, these experiments indicate that the new algorithm is advantageous for 70,000–100,000 tuples, but that neither an advantage nor a disadvantage was found for smaller numbers of tuples.

It is of vital importance, before attempting to draw any conclusions from data obtained through experiments or simulations, that you apply a suitable test for statistical significance. Failure to do so means that you run the risk of drawing incorrect conclusions, and, of course, that your examiner rejects the conclusions you propose, since they have not been shown to be significant. Which significance test fits your data depends on the experimental set-up and the type of data you have generated, and this is too large an issue to discuss in detail in this book. You are therefore strongly advised to consult the statistics literature and to seek the advice of your supervisor on this important issue.
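As one possible sketch, assuming SciPy is available and that a two-sample t-test suits your data (an assumption that must be checked case by case against the statistics literature):

```python
# Sketch of one common choice, Welch's two-sample t-test, via SciPy
# (assumed available). The run times below are made-up illustrative
# numbers, not the book's data: ten runs per algorithm at one setting.
from scipy import stats

old_runs = [0.78, 0.81, 0.79, 0.83, 0.80, 0.82, 0.77, 0.80, 0.79, 0.81]
new_runs = [1.08, 1.12, 1.10, 1.09, 1.13, 1.07, 1.11, 1.10, 1.12, 1.08]

# equal_var=False gives Welch's test, which does not assume equal variances.
t_stat, p_value = stats.ttest_ind(new_runs, old_runs, equal_var=False)

alpha = 0.05  # conventional significance level
if p_value < alpha:
    print(f"p = {p_value:.3g}: the difference is statistically significant")
else:
    print(f"p = {p_value:.3g}: the difference could be chance variation")
```

A small p-value says the observed difference is unlikely to be a random effect; it does not by itself say the difference is large enough to matter, which is a separate judgement.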
