5.6 Experimental Evaluation
5.6.5 Hardware overheads
In this section, the hardware requirements of FS-DAM are compared with those of the baseline. The hardware overheads of the two architectures are compared in terms of storage, area and energy. The area consumed by the LLC in the different designs is calculated using CACTI 6.0 [2]. The energy consumption of each LLC component is calculated separately, and the results are then combined to give the total energy consumption.
5.6.5.1 Energy Overhead
Each component consumes two types of energy: static and dynamic. Dynamic energy is consumed on every access to the component, whereas static energy is due to the leakage power of the component. In the baseline, the components are the banks and the NoC, while FS-DAM needs an additional component called FS-TGS. CACTI 6.0 is used to calculate the static energy per unit time and the dynamic energy per access for each bank and for FS-TGS. Multiplying the static energy per unit time by the total execution time gives the total static energy consumed by the component. Likewise, multiplying the dynamic energy per access by the total number of accesses gives the total dynamic energy consumed by the component. The total execution time and the total number of accesses are obtained from the simulation.
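The calculation described above can be sketched as follows. This is only an illustrative sketch: the class name, field names and the numeric values in the example are hypothetical and are not taken from CACTI or from the simulation results.

```python
from dataclasses import dataclass


@dataclass
class ComponentEnergy:
    """Per-component figures of the kind a tool such as CACTI reports.

    Field names and units are assumptions made for this sketch.
    """
    static_energy_per_second: float   # leakage energy per unit time (J/s)
    dynamic_energy_per_access: float  # energy per access (J)


def total_static_energy(c: ComponentEnergy, execution_time_s: float) -> float:
    # Static energy = static energy per unit time x total execution time.
    return c.static_energy_per_second * execution_time_s


def total_dynamic_energy(c: ComponentEnergy, accesses: int) -> float:
    # Dynamic energy = dynamic energy per access x total number of accesses.
    return c.dynamic_energy_per_access * accesses


def total_energy(c: ComponentEnergy, execution_time_s: float, accesses: int) -> float:
    # The two contributions are computed separately and then combined.
    return total_static_energy(c, execution_time_s) + total_dynamic_energy(c, accesses)


# Hypothetical bank: 0.5 J/s leakage, 2 nJ per access, 2 s run, 1M accesses.
bank = ComponentEnergy(static_energy_per_second=0.5, dynamic_energy_per_access=2e-9)
print(total_static_energy(bank, 2.0))
print(total_dynamic_energy(bank, 1_000_000))
```

The execution time and access count passed to these functions stand in for the values obtained from the simulation.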
The improvements of FS-DAM over the baseline are reported after accounting for all the cycles required for re-grouping. The static energy required for re-grouping is included in the static energy consumed by the banks during the execution time.
However, re-grouping requires some additional cache accesses to adjust the reserve blocks, as discussed in Section 5.5.4. The number of such accesses cannot exceed twice the number of reserve locations in each re-grouping process. These additional cache accesses slightly increase the total dynamic energy consumption of FS-DAM. Since FS-TGS is always active, the execution time of the bank and FS-TGS is the same. The dynamic energy of FS-TGS depends on the number of FS-TGS accesses. Since FS-TGS has to be searched for every block request from L1, the number of FS-TGS accesses is the same as the number of L2 accesses.

Additional energy consumption required (in %)
                 vips   face   bdy2   flud   frq16  ferrt  mix1   mix2   frt16  blk4   freq   Average
Static Energy    3.14   1.97   3.13   2.97   2.20   3.60   2.04   -0.08  0.07   -0.01  -0.01  1.73
Dynamic Energy   2.41   2.62   2.10   2.99   2.23   1.71   2.99   0.48   1.56   2.96   2.00   2.19
Table 5.7: The energy overhead of FS-DAM over the baseline design.
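The two access counts used in this discussion (the upper bound on re-grouping accesses and the number of FS-TGS accesses) can be sketched as follows. The function and parameter names are hypothetical, and the numbers in the example are illustrative only.

```python
def max_regrouping_accesses(num_regroupings: int, reserve_locations: int) -> int:
    """Upper bound on the extra cache accesses caused by re-grouping.

    Each re-grouping process needs at most twice as many accesses as there
    are reserve locations, so the bound scales with the number of
    re-grouping processes.
    """
    return 2 * reserve_locations * num_regroupings


def fs_tgs_accesses(l2_accesses: int) -> int:
    # FS-TGS is searched for every block request from L1, so it is
    # accessed exactly as often as the L2.
    return l2_accesses


def max_regrouping_dynamic_energy(num_regroupings: int,
                                  reserve_locations: int,
                                  energy_per_access: float) -> float:
    # Worst-case additional dynamic energy due to re-grouping.
    return max_regrouping_accesses(num_regroupings, reserve_locations) * energy_per_access


# Hypothetical run: 10 re-groupings, 64 reserve locations.
print(max_regrouping_accesses(10, 64))
```

Because the bound is linear in the number of re-grouping processes, the extra dynamic energy stays small whenever re-grouping is infrequent relative to normal L2 accesses.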
Figure 5.13(a) shows the static energy consumption breakdown for FS-DAM. The energy consumption of three components (bank, NoC and FS-TGS) is shown separately. It can be observed that, on average, FS-TGS accounts for only 2% of the total energy consumption. The NoC energy consumption is the same for the baseline and FS-DAM, as the DAM based policies are internal to each bank.
The distribution of dynamic energy consumption in FS-DAM is shown in Figure 5.13(b). In this figure, the energy consumption of the different components is divided into two groups: normal and additional. Normal covers the components common to both the baseline and FS-DAM, i.e. the banks and the NoC. Additional covers the dynamic energy required by FS-TGS and re-grouping. It can be observed that the total dynamic energy consumed by FS-TGS and the re-grouping process is only 4.2% of the total dynamic energy consumed by the LLC. The energy required by re-grouping ranges from 0.05% to 0.48%, with an average of 0.3%. Table 5.7 shows the additional energy required by FS-DAM over the baseline: the static energy overhead is only 1.73%, while the dynamic energy overhead is 2.19%.
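As a quick arithmetic check, the average overheads quoted from Table 5.7 can be recomputed from the per-benchmark values. The sketch below assumes that the "Average" column is the plain arithmetic mean over the eleven workloads; the table does not state this explicitly, although the mean does reproduce the reported figures.

```python
# Per-benchmark overheads (in %) copied from Table 5.7, in column order:
# vips, face, bdy2, flud, frq16, ferrt, mix1, mix2, frt16, blk4, freq.
static_overhead = [3.14, 1.97, 3.13, 2.97, 2.20, 3.60, 2.04, -0.08, 0.07, -0.01, -0.01]
dynamic_overhead = [2.41, 2.62, 2.10, 2.99, 2.23, 1.71, 2.99, 0.48, 1.56, 2.96, 2.00]


def mean(values):
    # Arithmetic mean (assumed averaging method, see the note above).
    return sum(values) / len(values)


print(round(mean(static_overhead), 2))   # 1.73
print(round(mean(dynamic_overhead), 2))  # 2.19
```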
Figure 5.13: Energy consumption breakdown for FS-DAM. (a) Static energy. (b) Dynamic energy.
                    V-Way             CMP-SVR overhead (in %)    FS-DAM overhead (in %)
Baseline L2-bank    overhead (in %)   F=2    F=4    F=8          F=2    F=4    F=8
256KB, 4-way        15.32             2.27   4.63   9.44         3.11   5.43   10.20
128KB, 4-way        14.8              2.27   4.63   9.44         3.03   5.34   10.11
Table 5.8: The storage overhead of DAM based techniques over baseline cache. Rx: x% ways per set reserve for RT, F: fellow-group size.
5.6.5.2 Storage and Area Overhead
The storage overheads of CMP-SVR and FS-DAM depend on the size of their additional tag-arrays. Each additional tag-array entry stores a tag address and a validity bit. Note that both CMP-SVR and FS-DAM need some additional bits to distinguish the tags of different fellow-sets in their additional tag-array. For example, with a fellow-group size of 4, each entry requires 2 additional bits. The number of entries in the additional tag-array is the same for both CMP-SVR and FS-DAM. In addition, FS-DAM needs two tables called SetMapper and SetPointer. Table 5.8 compares the storage overheads of V-Way, CMP-SVR and FS-DAM. The storage overhead of V-Way is calculated based on the storage required for the forward and backward pointers. From the table it can be observed that the overhead of both CMP-SVR and FS-DAM increases as the fellow-group size increases. This is because of the additional bits mentioned above, which are required to distinguish the tags of different fellow-sets. The overhead of CMP-SVR remains the same irrespective of the bank size, but that of FS-DAM varies. Increasing the bank size while keeping the associativity the same increases the number of sets in each bank. Hence the storage required for SetPointer and SetMapper increases.
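The two effects described above (extra bits per tag entry as F grows, and more sets as the bank grows) can be sketched as follows. The text only gives the F=4 case (2 bits); generalising it to log2(F) bits is an assumption of this sketch, as are the 20-bit tag width and the 64-byte block size used in the example.

```python
def fellow_set_bits(fellow_group_size: int) -> int:
    """Bits needed to tell apart the fellow-sets sharing a tag entry.

    Assumes log2(F) bits for a power-of-two fellow-group size F, which
    matches the F=4 -> 2 bits example in the text.
    """
    f = fellow_group_size
    if f < 1 or f & (f - 1):
        raise ValueError("fellow-group size must be a power of two")
    return f.bit_length() - 1


def tag_entry_bits(tag_bits: int, fellow_group_size: int) -> int:
    # Each additional tag-array entry: tag address + validity bit
    # + bits identifying the fellow-set.
    return tag_bits + 1 + fellow_set_bits(fellow_group_size)


def num_sets(bank_bytes: int, block_bytes: int, ways: int) -> int:
    # Sets per bank; SetMapper and SetPointer grow with this number.
    return bank_bytes // (block_bytes * ways)


for f in (2, 4, 8):
    print(f, fellow_set_bits(f), tag_entry_bits(20, f))  # hypothetical 20-bit tag

# Hypothetical 64-byte blocks: doubling the bank size doubles the set count.
print(num_sets(128 * 1024, 64, 4), num_sets(256 * 1024, 64, 4))
```

With a fixed associativity, the set count is proportional to the bank size, which is why only the FS-DAM overhead in Table 5.8 changes between the two bank sizes.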
CACTI 6.0 [2] is used to calculate the area consumption of FS-DAM for different values of F. Table 5.9 shows the percentage area overhead of FS-DAM compared to the baseline design. The overhead of FS-DAM is between 1.6% and 3.5%. In the case of a 256KB 4-way associative L2-bank with R=2 and F=2, the area overhead of FS-DAM is 3.38%.
                        L2-bank: 256KB-4way       L2-bank: 512KB-4way
                        F=2     F=4     F=8       F=2     F=4     F=8
Area overhead (in %)    3.380   3.397   3.486     3.128   3.518   3.551
Table 5.9: The area overhead of FS-DAM over the baseline design. Rx: x% ways per set reserve for RT, F: fellow-group size.