Introduction - Energy and thermal management of CMPs by dynamic cache reconfiguration

Chapter 4 Static Energy Reduction by Performance Linked Dynamic Cache Resizing (DiCeR)

In this chapter, we discuss dynamic tuning of the LLC size, which is a promising option for reducing cache leakage in modern CMPs.

To this end, our policy dynamically shuts down or turns on cache banks based on the system performance and the banks’ usage statistics. Shutting down a cache bank saves leakage energy, and its future requests are remapped to another active bank, called the target bank. The proposed technique is evaluated with three different implementation policies.

As discussed in Chapters 1 and 2, an effective way of reducing the power consumption of on-chip LLCs is to shrink their size. Some recent works [85, 91, 75] have proposed power optimisation approaches in which the on-chip LLC is shrunk.

Reducing the LLC size can degrade system performance if the application’s cache demand is high or if a heavily used cache portion is powered off. Hence, the tuning of on-chip LLC size should treat system performance and locality of reference as constraints. As these constraints are known only while the applications execute, dynamic (runtime) cache size tuning approaches will be more effective for reducing LLC power consumption.

The recent surveys [5, 4] on cache-size tuning techniques from a power/performance perspective broadly classify cache power reduction techniques into two categories:

1. Power supply control (State Preserving Approach), and

2. Resizing of the Cache Memory (State Destroying Approach).

The former optimises power consumption by controlling the power supply of the physical circuitry, whereas the latter resizes the cache.

For modern tiled CMPs, recent works [91, 75] propose a set of utilisation-based cache resizing techniques, which power off the least utilised cache portions and dynamically remap subsequent requests to other active parts. Our work takes a similar approach to optimising cache power consumption, but remaps requests at the L2 controller, unlike the prior works, where remapping is done at the L1 controllers.
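The remapping idea can be illustrated with a minimal Python sketch. It is not the controller design of this work: the class `BankRemapper`, the modulo home-bank mapping, and the caller-supplied choice of target bank are all assumptions made for illustration, and the sketch models only the lookup logic, not where in the hierarchy it resides.

```python
class BankRemapper:
    """Toy model of utilisation-based bank shutdown with request remapping."""

    def __init__(self, num_banks: int) -> None:
        self.num_banks = num_banks
        self.active = [True] * num_banks   # power state of each bank
        self.accesses = [0] * num_banks    # usage statistics per serving bank
        self.target = {}                   # powered-off bank -> target bank

    def home_bank(self, block_addr: int) -> int:
        # Assumption: blocks are interleaved across banks by address modulo.
        return block_addr % self.num_banks

    def route(self, block_addr: int) -> int:
        """Return the bank that serves this request and count the access."""
        home = self.home_bank(block_addr)
        bank = self.target.get(home, home)
        self.accesses[bank] += 1
        return bank

    def least_utilised_active(self) -> int:
        """Active bank with the fewest recorded accesses (lowest id on ties)."""
        candidates = [b for b in range(self.num_banks) if self.active[b]]
        return min(candidates, key=lambda b: self.accesses[b])

    def shut_down(self, bank: int, target: int) -> None:
        if bank == target or not self.active[bank] or not self.active[target]:
            raise ValueError("need an active bank and a distinct active target")
        self.active[bank] = False
        self.target[bank] = target
        # Banks already remapped to `bank` must follow it to the new target.
        for off_bank, old_target in self.target.items():
            if old_target == bank:
                self.target[off_bank] = target

    def turn_on(self, bank: int) -> None:
        if self.active[bank]:
            raise ValueError("bank is already on")
        self.active[bank] = True
        del self.target[bank]
```

With four banks, for example, calling `shut_down(1, 2)` makes `route(5)` return bank 2 instead of the home bank 1, and `turn_on(1)` restores the original mapping. Migration of the remapped contents is not modelled here.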

The current work proposes a dynamic cache tuning technique that treats performance and locality of reference as constraints for managing the cache size. To this end, we initially attempt to reduce the cache size by shutting down cache banks until an allowable IPC degradation threshold is reached; we refer to this as BSP, the Basic bank-Shutdown Policy. To save leakage power, this policy turns off L2 cache banks at runtime based on usage statistics, and their future accesses are remapped to other L2 cache banks, called target banks. Once performance degrades beyond a predefined threshold, the system stops the bank shutdown process. However, this policy cannot provide adequate cache space if the process needs more of it later during execution. To address this issue, we developed an extension of BSP that handles a sudden increase in the application’s Working Set Size (WSS) during execution by allowing powered-off cache banks to be restarted dynamically. System performance is monitored periodically, and L2 bank(s) are restarted if the performance degradation exceeds a threshold value. During the turn-on process, all the remapped contents are brought back to the bank from their remapped location. The results are compared with Drowsy [1], an existing policy. Specifically, the main contributions of this work can be listed as follows:

1. B ON OFF ALL: A performance-linked dynamic cache tuning strategy that resizes the L2 cache by turning off cache banks. However, if the application needs more cache space, L2 banks are turned on (restarted).

2. B OFF ONCE: Frequently turning the L2 banks on and off degrades performance. Hence, we experiment with different on-off patterns. Once the performance degradation reaches a threshold value, the system does not allow any further shutdown of L2 bank(s). After this, only turning on of L2 cache banks is allowed.

3. B ON OFF OPT: The frequent resizing of the L2 cache in the first policy may degrade system performance. On the other hand, the second policy does not allow the system to save power by turning off cache banks once the turn-off process has stopped, even when there is scope to do so later. These two problems are rectified in the third policy by placing some restrictions on cache resizing.
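The difference between the first two policies can be sketched as a per-interval decision function in Python. This is an illustrative sketch under stated assumptions, not the implementation evaluated in this chapter: degradation is assumed to be the relative IPC loss against a baseline, one action is taken per monitoring interval, the identifiers (`Resizer`, `Action`, `shutdown_locked`) are hypothetical, and the availability of banks to switch is not checked. The third policy is left out because its restrictions are not spelled out at this point.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    B_ON_OFF_ALL = "B ON OFF ALL"
    B_OFF_ONCE = "B OFF ONCE"


class Action(Enum):
    SHUT_DOWN_BANK = "shut down"
    TURN_ON_BANK = "turn on"
    HOLD = "hold"


@dataclass
class Resizer:
    policy: Policy
    threshold: float               # allowed fractional IPC loss (assumed metric)
    shutdown_locked: bool = False  # only ever set under B OFF ONCE

    def step(self, baseline_ipc: float, current_ipc: float) -> Action:
        """Decide the resizing action for one monitoring interval."""
        degradation = (baseline_ipc - current_ipc) / baseline_ipc
        if degradation > self.threshold:
            # Performance has dropped too far: give cache space back.
            if self.policy is Policy.B_OFF_ONCE:
                # From now on, only turning banks on is allowed.
                self.shutdown_locked = True
            return Action.TURN_ON_BANK
        if self.shutdown_locked:
            return Action.HOLD
        return Action.SHUT_DOWN_BANK
```

Under B ON OFF ALL the resizer goes back to shutting banks down as soon as performance recovers, which is the frequent resizing noted above; under B OFF ONCE it holds instead, which forgoes later savings.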

Basically, in this work, power is saved by completely shutting down the underutilised cache banks. Shutting down these least utilised cache banks can also degrade system performance if the current application later changes its cache-space requirement. In such situations, the powered-off banks are turned on again when the performance degradation exceeds a preset threshold value.

Figure 4.1: Tiled CMP architecture

The baseline architecture used in this work is shown in Figure 4.1. According to this figure, the whole chip is a collection of replicated tiles, where each tile contains a processor core along with its private L1 (Data & Instruction) caches and a chunk of the shared L2 cache, called an L2 bank. Tile 4 is shown in detail in the figure. Note that the L2 cache is used here as the on-chip LLC and is physically distributed uniformly among the tiles. The tiles are connected to each other through a 2D NoC, and hence each tile is also attached to an NoC router.
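The tile organisation can be written down as a small Python data structure. This is a sketch only: the 4x4 chip size, the row-major tile numbering, the mesh topology, and the Manhattan hop count (as under dimension-ordered routing) are assumptions for illustration; the text specifies only that the NoC is 2D.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """One tile: a core, private L1 I/D caches, one L2 bank, one NoC router.

    The components are implied by the tile id; only the router position
    in the (assumed) mesh is stored explicitly.
    """
    tile_id: int
    row: int
    col: int


def build_chip(rows: int, cols: int) -> list:
    """Create rows x cols replicated tiles, numbered row by row."""
    return [Tile(r * cols + c, r, c) for r in range(rows) for c in range(cols)]


def hops(a: Tile, b: Tile) -> int:
    """Router-to-router hop count between two tiles in a 2D mesh."""
    return abs(a.row - b.row) + abs(a.col - b.col)


chip = build_chip(4, 4)          # hypothetical 16-tile chip
print(hops(chip[0], chip[15]))   # distance between opposite corners
```

A hop count of this kind is one way to reason about the extra latency a request pays when its home L2 bank is off and a more distant bank serves it.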
