
4.2 Proposed Energy Saving Policy

4.2.1 Bookkeeping and Future Requests

Before turning off a cache bank, two major issues need to be handled: (i) relocation of the existing valid blocks in the bank, and (ii) future requests that will arrive at this bank after it is turned off. All future requests to the shutdown bank should be redirected to some other bank, called the 'target' bank. The target bank is chosen based on the usage statistics, which implies that banks that are themselves prospective candidates to be turned off in the future will not be selected as targets. Handling of the already cached blocks has two options: (a) write back all these blocks to the next lower level memory (W), or (b) send/migrate these blocks to the target bank (M).
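The following C++ sketch illustrates this bookkeeping step. The types, counters and function names are hypothetical choices of our own (the text does not prescribe a concrete implementation); the point is only that the target is chosen as a heavily used, powered-on bank, so that banks likely to be shut down later are never selected, and that the victim's valid blocks are then handled under either the W or the M option.

```cpp
#include <cstdint>
#include <vector>

// Illustrative policy choices (W) and (M) from the text.
enum class BlockHandling { WritebackToMemory /* W */, MigrateToTarget /* M */ };

// Per-bank usage statistics assumed to be maintained by the cache controller.
struct BankStats {
    uint64_t accesses  = 0;
    bool     poweredOn = true;
};

// Pick the most heavily used powered-on bank (other than the victim) as the
// target, so that banks which are themselves likely to be turned off later
// are never selected to absorb redirected requests.
int chooseTargetBank(const std::vector<BankStats>& stats, int victim) {
    int target = -1;
    uint64_t best = 0;
    for (int b = 0; b < static_cast<int>(stats.size()); ++b) {
        if (b == victim || !stats[b].poweredOn) continue;
        if (stats[b].accesses >= best) { best = stats[b].accesses; target = b; }
    }
    return target;  // all future requests to the victim are redirected here
}
```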

Hypothetically, writing back to the next level memory may be an expensive operation, especially in terms of latency when the next level memory is off-chip. In this regard, we also performed a set of experiments on PARSEC benchmark applications, which confirm the correctness of our hypothesis.

To show the effectiveness of both the W and M policies, we implement them in our simulation framework for a TCMP architecture having a 4 MB, 4-way L2 cache as on-chip LLC. For an in-depth analysis, our BSP has been modified in this experiment, and we perform simulations on four different configurations:

• BSP-H-M: Turn off heavily used cache banks and migrate blocks from victim to target during shutdown operation.

• BSP-H-W: Turn off heavily used cache banks and write back blocks from victim to the lower level memory during shutdown operation.

• BSP-C-M: Turn off lightly used cache banks and migrate blocks from victim to target during shutdown operation.

• BSP-C-W: Turn off lightly used cache banks and write back blocks from victim to the lower level memory during shutdown operation.

Figure 4.5: Comparison between Migration (BSP-H-M, BSP-C-M) and Write-Back (BSP-H-W, BSP-C-W) policies in terms of IPC, while turning off some heavily used and lightly used L2 banks.

Figure 4.5 shows that the IPC degradation is higher for write-back than for migration across all of our applications, as the write-back policy takes more time to settle down than the migration-based policy. For the write-back cases, the average IPC degradation is close to 10%, whereas for the migration-based policy the average IPC degradation is less than 5%. These results strengthen our hypothesis and motivate us to use the latter option, i.e., migrating the victims' blocks to the target while resizing the cache.

However, during relocation of these blocks, the bank being shut down (the victim bank) is searched line-by-line, and its blocks are evicted one-by-one and sent to its target.

Once block migration is initiated, only the victim bank stops handling further requests; the other banks in the system continue their normal operation as usual. In prior works, the complete system was stalled during the relocation.

In our proposal, however, only the bank being shut down (the victim bank) is stalled during the shutdown process. Once the block migration (from the victim bank to the target bank) is completed, the victim bank is shut down. For turned-off banks, future requests are forwarded to the target bank. The dynamic energy consumption of the target bank increases marginally, but the static energy of the turned-off bank, which is a significant portion, is saved. This whole process increases the traffic in the NoC, which incurs a number of stall cycles and increases the NoC power consumption. Furthermore, the cache bank on-off overhead also incurs stall cycles in the system. Our simulated system takes all of these system overheads into account. The average reconfiguration overhead incurred in our experimental evaluation is reported in Section 4.4.1.
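A minimal sketch of this shutdown sequence is given below. The data structures and function names are ours, not taken from the simulator, and timing, coherence state and NoC modelling are omitted; the point is only that the victim alone is stalled, its valid blocks are pushed to the target set-by-set, and the redirection pointer is installed afterwards. The insertion into the target, including the Reuse-Counter based eviction, is sketched after the next paragraph.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative structures only; the real simulator also models timing,
// coherence state and NoC traffic.
struct CacheBlock {
    bool     valid    = false;
    uint64_t tag      = 0;
    uint8_t  reuse    = 0;   // 4-bit Reuse-Counter (see next paragraph)
    uint8_t  sourceId = 0;   // original bank ID for remapped blocks
    uint64_t lastUse  = 0;   // simple stand-in for per-set LRU state
};
struct CacheSet  { std::vector<CacheBlock> ways; };
struct CacheBank {
    int  id = 0;
    bool poweredOn = true;
    bool stalled   = false;
    int  redirectTarget = -1;        // -1: handle requests locally
    std::vector<CacheSet> sets;
};

// Defined in the sketch following the next paragraph: places 'blk' into set 's'
// of 'target', evicting a block if the set is full.
void insertIntoTarget(CacheBank& target, std::size_t s, CacheBlock blk);

// Hypothetical shutdown sequence: only the victim is stalled, its valid blocks
// are migrated line-by-line, and afterwards future requests are forwarded.
void shutdownBank(CacheBank& victim, CacheBank& target) {
    victim.stalled = true;                                   // victim stops serving requests
    for (std::size_t s = 0; s < victim.sets.size(); ++s)
        for (CacheBlock& blk : victim.sets[s].ways)
            if (blk.valid) {
                blk.sourceId = static_cast<uint8_t>(victim.id);  // remember the source bank
                insertIntoTarget(target, s, blk);                // NoC transfer to the target
                blk.valid = false;
            }
    victim.poweredOn = false;                                // static energy now saved
    victim.redirectTarget = target.id;                       // forward future requests
}
```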

During the shutdown process, the migrated blocks are loaded into the target bank from the victim bank. In case no free ways are available in the corresponding set of the target bank, ideally the oldest block among the incoming migrated block and the LRU (Least Recently Used) block of that particular set should be evicted. However, an LRU block can only be determined inside the bank. Hence, in our implementation we add an extra 4-bit field called Reuse-Counter to each block, like [18], with a negligible storage overhead. This value is also sent with each block during migration. The Reuse-Counter values of the incoming migrated block and of the LRU block of the target's set are compared, and the lower-valued one is eventually evicted. However, conflict cases are handled by evicting the migrated blocks.
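The eviction decision at the target then reduces to a comparison of Reuse-Counter values. The sketch below uses the same illustrative types as in the previous sketch and reads the 'conflict cases' above as ties in the counter values, which is our assumption rather than something stated in the text.

```cpp
// Place a migrated block into set 's' of the target bank. If no free way is
// available, the 4-bit Reuse-Counter carried with the migrated block is
// compared against that of the set's LRU block (determined inside the bank),
// and the lower-valued one is evicted; on a tie the migrated block is dropped.
void insertIntoTarget(CacheBank& target, std::size_t s, CacheBlock blk) {
    CacheSet& set = target.sets[s];
    for (CacheBlock& way : set.ways)
        if (!way.valid) { way = blk; return; }      // free way available

    // LRU approximation: the way with the oldest last-use timestamp.
    CacheBlock* lru = &set.ways[0];
    for (CacheBlock& way : set.ways)
        if (way.lastUse < lru->lastUse) lru = &way;

    if (blk.reuse > lru->reuse)
        *lru = blk;   // LRU block evicted, migrated block kept
    // otherwise (lower or equal Reuse-Counter) the migrated block is evicted
}
```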

Similarly, during the turning-on process, the target bank of the bank being turned on is stalled and does not handle any requests. The remapped data of the bank being turned on is brought back from its target bank. The bank is turned on after completion of this migration process, after which all the pending requests are processed and the bank starts handling its own requests normally. The whole process is transparent to the underlying cache coherence protocol: the request redirection happens at the victims, and the states of the redirected blocks are maintained by the target banks.
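The turn-on path mirrors the shutdown path. A hypothetical sketch using the same illustrative types as above is shown below, where the blocks belonging to the re-enabled bank are identified in the target via the Source ID field described in the following paragraph.

```cpp
// Hypothetical turn-on sequence: the target of the bank being re-enabled is
// stalled, every block it holds on behalf of that bank (matched via the
// Source ID field) is moved back, and only then does the bank resume
// handling its own requests.
void turnOnBank(CacheBank& bank, CacheBank& target) {
    target.stalled = true;                            // target serves no requests meanwhile
    for (std::size_t s = 0; s < target.sets.size(); ++s)
        for (CacheBlock& blk : target.sets[s].ways)
            if (blk.valid && blk.sourceId == static_cast<uint8_t>(bank.id)) {
                for (CacheBlock& way : bank.sets[s].ways)
                    if (!way.valid) { way = blk; break; }  // the bank was empty, so a way is free
                blk.valid = false;                         // block leaves the target
            }
    target.stalled = false;
    bank.poweredOn = true;
    bank.redirectTarget = -1;     // stop forwarding; pending requests drain to this bank
}
```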

Storage Overhead for Source Bank Tracking

The target bank is selected at runtime based on the usage statistics of the cache banks. Hence, no separate remap table needs to be maintained. However, whenever a cache bank needs to be turned on, all of its remapped cache blocks need to be placed back into it from its target bank. As a bank can become the target of more than one bank, an extra field, called Source ID, is added to every block in the LLC to keep track of the original cache bank ID for remapped blocks. To store the source cache bank ID, the size of this field will be log2(N) bits, where N is the number of cache banks.

For a 4 MB L2 cache having 16 cache banks of equal size with 64-Byte data blocks, this additional field incurs an overhead of 32 KB of extra storage, i.e., 0.78%, which is negligible.
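The 32 KB figure follows directly from the cache geometry stated above:

$$\frac{4\,\text{MB}}{64\,\text{B}} = 65536 \text{ blocks}, \qquad \log_2 16 = 4 \text{ bits per block},$$
$$65536 \times 4\,\text{bits} = 32\,\text{KB}, \qquad \frac{32\,\text{KB}}{4\,\text{MB}} \approx 0.78\%.$$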

Prior works in which power saving has been achieved through remapping techniques have used a remap table at the L1-cache level, and whenever any new bank is shut down, the entries in all L1 caches need to be updated. Keeping and maintaining such tables at the L1 caches is also an additional overhead. Our proposed method is completely transparent to the L1 caches. The L2 controllers of the banks being shut down maintain the information about the target bank. For every shutdown bank there is only one target bank, so there is no hardware overhead of remap tables.