2.2 Challenges to Employ Emerging NVMs in the Caches
2.2.2 Challenges related to Weak Write Endurance
Write endurance is defined as the total number of write operations that a memory cell can sustain before it breaks down. In CMPs, different levels of the cache hierarchy incur different numbers of writes. For instance, the upper-level caches experience a larger number of writes than the last-level cache (LLC). Although the LLC receives fewer writes, there is a good possibility that these writes are distributed unevenly inside the cache, which in turn generates write variation. In cache architectures, researchers classify write variation into two categories [29]:
1. Inter-Set Write Variation: Inter-set write variation is caused by the uneven distribution of writes across the cache sets. In particular, some cache sets inside a bank incur a much larger number of writes than other sets. Figure 2.8 shows the presence of inter-set write variation in a cache bank. As can be seen from the figure, the writes are distributed non-uniformly across the cache sets. Such an uneven distribution wears out the heavily written sets faster than the lightly or moderately written ones.
Chapter 2. Background 34
[Figure 2.9: Write counts inside the cache set for the different workloads in the baseline STT/ReRAM caches. Axes: way indexes and write counts (log4 scale, 1 to 16384); workloads: Mix3, Mix4, Swap, Body; caches: STT-RAM, ReRAM.]
2. Intra-Set Write Variation: Intra-set write variation arises from the differing write counts of the blocks inside a set. Here, some blocks in a set receive a larger number of writes than the other blocks in the same set. Figure 2.9 shows the existence of intra-set write variation inside the cache sets. As can be seen from the figure, the write count varies among the different ways of a cache set. Such a non-uniform write distribution causes the heavily written blocks to fail earlier than the lightly or moderately written ones.
The two write variations mentioned above, inter-set and intra-set, are measured with the help of coefficients. Equations 2.1 and 2.2 present these coefficients: (i) InterV, which measures the average coefficient of variation across the cache sets, and (ii) IntraV, which measures the average coefficient of variation inside a cache set [29].
\[
InterV = \frac{1}{Write_{avg}} \sqrt{\frac{\sum_{k=1}^{S} \left( \sum_{l=1}^{A} \frac{W_{k,l}}{A} - Write_{avg} \right)^{2}}{S - 1}} \tag{2.1}
\]
\[
IntraV = \frac{1}{S \cdot Write_{avg}} \sum_{k=1}^{S} \sqrt{\frac{\sum_{l=1}^{A} \left( W_{k,l} - \sum_{m=1}^{A} \frac{W_{k,m}}{A} \right)^{2}}{A - 1}} \tag{2.2}
\]
In these equations, $A$ denotes the cache associativity, $S$ represents the number of cache sets, $W_{k,l}$ is the write count of way $l$ in cache set $k$, and $Write_{avg}$ is the average write count in a cache bank.
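As a rough illustration of Equations 2.1 and 2.2, the following minimal Python sketch computes both coefficients from a matrix of per-block write counts. The function name `write_variation` and the toy inputs are hypothetical and not part of the original work. The sketch assumes that $Write_{avg}$ is the average over all $S \times A$ blocks of the bank and that the denominator of Equation 2.1 counts cache sets ($S - 1$).

```python
from math import sqrt


def write_variation(W):
    """Return (InterV, IntraV) as fractions for a bank of write counts.

    W[k][l] is the write count of way l in cache set k.
    Assumption: Write_avg is the mean over all S * A blocks.
    """
    S = len(W)       # number of cache sets
    A = len(W[0])    # associativity (ways per set)
    if S < 2 or A < 2:
        raise ValueError("need at least two sets and two ways")

    write_avg = sum(sum(row) for row in W) / (S * A)
    if write_avg == 0:
        raise ValueError("no writes recorded")

    # Mean write count of each set (the inner sum of Equation 2.1).
    set_means = [sum(row) / A for row in W]

    # Equation 2.1: deviation of the set means from the bank average.
    inter_v = sqrt(
        sum((m - write_avg) ** 2 for m in set_means) / (S - 1)
    ) / write_avg

    # Equation 2.2: per-set standard deviation, averaged over the sets.
    per_set_std = [
        sqrt(sum((w - m) ** 2 for w in row) / (A - 1))
        for row, m in zip(W, set_means)
    ]
    intra_v = sum(per_set_std) / (S * write_avg)

    return inter_v, intra_v
```

With a toy bank such as `[[10, 10], [30, 30]]`, all variation lies between the sets, so only InterV is non-zero; with `[[10, 30], [10, 30]]`, all variation lies inside the sets, so only IntraV is non-zero. Multiplying the returned fractions by 100 gives percentages.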
Along with the weak write endurance, the lifetime of the NVM LLC in an actual execution environment is further reduced by the two write variations described above. The lifetime of the cache is defined as follows:
Lifetime: The lifetime of a cache can be defined either as the raw lifetime or as the error-tolerant lifetime [29]. The raw lifetime is determined by the first failure of a cache line, whereas the error-tolerant lifetime is determined by the raw lifetime together with the error recovery methods.
In this dissertation, we use the raw lifetime, which is the basis of the error-tolerant lifetime.
With respect to write variations and write counts, the raw lifetime of a cache can be determined by either of the following two methods:
1. With respect to write counts, the lifetime is the inverse of the maximum write count on any block of the cache [31].
\[
LI = \frac{1}{\max_{1 \le k \le S,\; 1 \le l \le A} W_{k,l}} \tag{2.3}
\]
2. With respect to write variation, the lifetime is calculated by considering three factors: (i) the coefficient of intra-set write variation, IntraV; (ii) the coefficient of inter-set write variation, InterV; and (iii) the average write count in a cache bank [29].
\[
LI = \frac{Write_{avg\_base} \cdot (1 + InterV_{base} + IntraV_{base})}{Write_{avg\_pt} \cdot (1 + InterV_{pt} + IntraV_{pt})} - 1 \tag{2.4}
\]
In the above equation, $Write_{avg}$ is the average number of writes in a cache bank, and the subscripts base and pt denote the metric values for the baseline and the proposed technique, respectively.

Suite                       PARSEC v2.1                             SPEC CPU 2006
Cache    Metric             Body    Cann    Dedup   Swap    X264    Mix1    Mix2    Mix3    Mix4
STT-RAM  Ideal Lifetime     41.8K   3.8K    4.8K    14.3K   8.7K    3.21K   7.44K   14.9K   25.5K
STT-RAM  IntraV             328.7%  57.4%   136.4%  361.4%  246.3%  21.6%   95.4%   223.6%  191.9%
STT-RAM  InterV             450.7%  32.9%   66.4%   137.4%  325.3%  15.7%   213.9%  152.2%  96.5%
STT-RAM  Baseline Lifetime  38      19.2    30.9    41.3    8.65    47.3    6.33    25.8    45.7
ReRAM    Ideal Lifetime     2.12K   99.5    795.4   6.37K   462.2   185.4   175.2   565.5   666.2
ReRAM    IntraV             191.4%  49.37%  75.56%  396.6%  97.47%  11.25%  26.4%   80.6%   40%
ReRAM    InterV             332.8%  20.4%   161.9%  961.6%  50.9%   17.57%  17.1%   34.54%  17.45%
ReRAM    Baseline Lifetime  0.61    0.51    0.53    1.17    1       1.16    2.22    3.03    3.02
Table 2.3: Lifetime (in years) comparison analysis for the different workloads in the ideal STT-RAM/ReRAM and the actual STT-RAM/ReRAM based caches
Table 2.3 reports the effect of the write variations on the lifetime of non-volatile LLCs. The table lists the ideal lifetime for the ideal caches (where the writes are uniformly distributed) and the baseline lifetime for the actual caches. The conclusion that can be drawn is that write variations reduce the lifetime of an actual NVM cache significantly compared to its ideal counterpart. The values reported in the table are computed by taking the write endurance of STT-RAM as $4 \times 10^{12}$ writes (considerably large) and the write endurance of ReRAM as $10^{8}$ writes. However, recent chips fabricated by Samsung and Intel report an STT-RAM write endurance of $10^{6}$ write cycles [44, 45], which can reduce the lifetime further.
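The two lifetime formulations, Equations 2.3 and 2.4, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the numeric inputs are hypothetical, and InterV and IntraV are assumed to be passed as fractions (for example, 0.5 for 50%) rather than as percentages.

```python
def lifetime_from_max_writes(W):
    """Equation 2.3: inverse of the largest per-block write count.

    W[k][l] is the write count of way l in cache set k.
    """
    w_max = max(max(row) for row in W)
    if w_max == 0:
        raise ValueError("no writes recorded")
    return 1.0 / w_max


def lifetime_from_variation(avg_base, inter_base, intra_base,
                            avg_pt, inter_pt, intra_pt):
    """Equation 2.4: compare a proposed technique (pt) with the baseline.

    Each side is Write_avg * (1 + InterV + IntraV); the result is the
    ratio of the baseline term to the proposed-technique term, minus 1.
    """
    base_term = avg_base * (1 + inter_base + intra_base)
    pt_term = avg_pt * (1 + inter_pt + intra_pt)
    return base_term / pt_term - 1


# Hypothetical inputs: same average write count, but the proposed
# technique lowers both coefficients from 1.0 to 0.25.
example = lifetime_from_variation(100, 1.0, 1.0, 100, 0.25, 0.25)
```

In this hypothetical case the baseline term is 300 and the proposed-technique term is 150, so Equation 2.4 evaluates to 1.0. A value of zero would mean that the proposed technique leaves the term unchanged relative to the baseline.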
[Figure 2.10: Classification of write reduction techniques based on the type of caches. The diagram splits write reduction policies into those for HCA and those for NVM caches; the policy classes shown are block migration, region based prediction, cache partitioning, bypass, and reconfiguration policies.]