Operation - Fellow Sets with Static Reserve Part (FSSRP)

6.2 Proposed Wear Leveling Techniques

6.2.1 Fellow Sets with Static Reserve Part (FSSRP)

6.2.1.2 Operation

The operation of the proposed technique is described in Algorithm 5. In the algorithm, the tunable parameter I is the predefined interval (line 1). The write counter associated with each set is represented by the variable W_i (line 2).

Similarly, the write bit associated with each block in the NP part of the cache is represented by b_mn (line 3). The lightly written set of the group is represented by S_l. It is selected using the write counters of the sets in the group: the set with the lowest write counter value (W_i) is treated as the lightly written set (S_l) of the group.

For the initial I cycles, the cache operates as a normal cache. During the interval I, whenever a write happens to a block, the corresponding write bit b_mn of the block is set and the write counter W_m of its cache set is incremented (line 4 to 6).
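The bookkeeping during the initial interval can be sketched as follows. This is a minimal illustration, not the thesis implementation: the sizes are borrowed from Figure 6.2, the write trace is made up, and the names `record_write` and `lightly_written_set` as well as the group membership `[0, 4]` are assumptions introduced here.

```python
S, A, r = 8, 16, 4                           # sizes from Figure 6.2, for illustration only
W = [0] * S                                  # W_i: one write counter per set
b = [[False] * (A - r) for _ in range(S)]    # b_ij: one write bit per NP block

def record_write(m, n):
    """During the interval, mark NP block (m, n) as written and count the write."""
    b[m][n] = True
    W[m] += 1

def lightly_written_set(group):
    """Return the set in `group` with the lowest write counter (S_l)."""
    return min(group, key=lambda i: W[i])

# Hypothetical write trace: three writes to set 0, one write to set 4.
for m, n in [(0, 6), (0, 6), (0, 2), (4, 1)]:
    record_write(m, n)

group = [0, 4]                               # assumed group of two fellow sets
S_l = lightly_written_set(group)
print(W[0], W[4], S_l)                       # 3 1 4
```

With these counters, set 4 has received fewer writes than set 0, so it is picked as S_l for the group.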

Chapter 6. Inter-set Wear Leveling using DAM Techniques 153

Algorithm 5 FSSRP Wear Leveling Algorithm

 1: I: Predefined interval.
 2: W_i: Write counter associated with set i. 0 ≤ i < S
 3: b_ij: Write bit associated with the block in set i and way j. 0 ≤ i < S, 0 ≤ j < (A − r)
 4: Run application for I cycles treating the cache as a normal cache.
 5: During I cycles, the write counter (W_i) is incremented with each write in set i.
 6: Similarly, the write bit (b_ij) is set with each write in the block.
 7: repeat
 8:   for each request R coming from L1 cache to block B in L2 cache do
 9:     if R = ReadHit then
10:       The read operation is performed on the block B irrespective of its location.
11:     else if R = WriteHit then
12:       if block B is found in NP part of cache then
13:         if the write bit b_ij of the block B is set then
14:           if there exists a lightly written set S_l in the group then
15:             The write request for the block B is redirected to the location L in the RP part of S_l.   ▷ Block B moved to RP on first write-back
16:           else
17:             The write operation is performed on block B. Increment the write counter (W_i) and keep the write bit set.
18:           end if
19:         else
20:           The write operation is performed on block B. Increment the write counter (W_i) and set the write bit b_ij.
21:         end if
22:       else
23:         The write operation is performed on block B. Increment the write counter of the set in which the write operation is performed.   ▷ Block B in RP part
24:       end if
25:     else
26:       Forward the request R to main memory. Keep the newly arrived block in NP part of cache.   ▷ cache miss
27:     end if
28:   end for
29: until the end of the execution

[Figure: an L2 cache with 8 sets (S) and 16 ways (A), split into 12 NP ways and 4 RP ways per set (r = 4), with group size (m) 2 and groups G0 to G3; a TGS with 4 sets (S_tgs) and 8 ways (A_tgs); blocks B0 to B31; numbered arrows 1 to 7 between the L1 cache, the L2 cache, and the TGS.]

Figure 6.2: Working example of FSSRP wear leveling policy
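The parameters listed with Figure 6.2 can be cross-checked with a few lines of arithmetic. The relations used below (NP ways = A − r, one TGS set per group, one TGS way per RP line of a group) are an inference from the figure's numbers; the defining equations are in section 6.2.1.1 and are not reproduced here, so treat this as a consistency check under those assumptions only.

```python
S, A, r, m = 8, 16, 4, 2      # values listed with Figure 6.2

np_ways = A - r               # ways left for the NP part of each set
tgs_sets = S // m             # assumed: one TGS set per group of m sets
tgs_ways = m * r              # assumed: one TGS way per RP line in a group

print(np_ways, tgs_sets, tgs_ways)   # 12 4 8
```

The three values agree with the 12 NP ways, 4 TGS sets, and 8 TGS ways shown in the figure.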

For a request R coming from the L1 cache to the L2 cache, the tag lookup is performed in the NP part of the cache. Simultaneously, the tag of the requested block is also searched in the RP part of the group with the help of the TGS. Note that the TGS set location is found using equation 6.4. If the requested block is present in the NP part of the cache, it is a direct hit; if it is found in the RP part instead, it is an indirect hit. Note that if the requested block B is in the RP part of the cache, the cache set and cache way of the block are found using equation 6.2 and equation 6.1.
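The classification into direct hit, indirect hit, and miss can be sketched as below. This is a simplified model under stated assumptions: the TGS is represented by a plain dictionary from tag to (set, way), which stands in for the index computations of equations 6.1 to 6.4 (not reproduced here), and the two lookups run one after the other rather than simultaneously. The function name `lookup` and the sample tags are hypothetical.

```python
def lookup(tag, home_set, np_tags, tgs):
    """Classify a request as a direct hit (NP), an indirect hit (RP via TGS), or a miss."""
    ways = np_tags[home_set]
    if tag in ways:                       # found in the NP part of its own set
        return ("direct_hit", home_set, ways.index(tag))
    if tag in tgs:                        # found in the RP part of some set in the group
        rp_set, rp_way = tgs[tag]
        return ("indirect_hit", rp_set, rp_way)
    return ("miss", None, None)

# Hypothetical contents: NP tags per set, and one block redirected to set 4, way 12.
np_tags = {0: ["B0", "B8", None], 4: ["B4", None, None]}
tgs = {"B16": (4, 12)}

print(lookup("B8", 0, np_tags, tgs))    # ('direct_hit', 0, 1)
print(lookup("B16", 0, np_tags, tgs))   # ('indirect_hit', 4, 12)
print(lookup("B24", 0, np_tags, tgs))   # ('miss', None, None)
```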

Depending upon the result of the lookup operation, different operations are performed in the cache, which can be explained as follows:

• Read Hit: For a read request R, if the block B is present in the cache, the read operation is performed on the block irrespective of its location (either NP or RP) (line 9 to 10).

• Write Hit (PUTX or write-back) and block B in NP part of the cache: If the requested block B is present in the L2 cache with its write bit set, and there exists a lightly written set (S_l) in the group, the write request is redirected from the current set to the RP part of S_l. If there is an invalid line in the RP of S_l, the request R is redirected to the first invalid line, and the block B is invalidated from the NP part of the cache. Otherwise, if there is no invalid line, the LRU victim line is selected from the RP of S_l, and the write-back of the victim line to the next level of memory is scheduled. Afterward, the write request is redirected to the selected location, and the block is invalidated from the NP part of the cache. Subsequently, the tag entry of the redirected block is created in the TGS with the help of equation 6.4 and equation 6.3 (line 14 to 16). On the other hand, if no lightly written set exists in the group or the write bit (b_mn) of the block is not set, the write is performed in its current location, and the bit b_mn of the block is set (line 17 to 21). Note that when there is no lightly written set in the group, the current set itself is the lightly written set of the group. Once the request is served, the write counter (W_m) of the cache set in which the write is performed is incremented.

• Write Hit (PUTX or write-back) and block B in RP part of the cache: If the requested block B is present in the L2 cache and belongs to the RP part of the cache (indirect hit), the write is performed normally on the block B. Afterward, the write counter (W_i) of the set in which the write is performed is incremented (line 22 to 24).

• Cache Miss: If the requested block is not present in the L2 cache, the request R from the L1 cache is forwarded to the next level of memory. In this case, the newly arrived block is placed in the NP part of the cache (line 25 to 27).
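The write-hit case for a block in the NP part carries most of the policy, so a small sketch of it may help. This is an illustrative model, not the thesis implementation. Assumptions made here: groups are passed in explicitly (sets 0 and 4 share a group, as in the worked example), the TGS is a dictionary rather than the structure indexed by equations 6.3 and 6.4, ties between counters keep the write in the current set, the write bit is cleared when the NP copy is invalidated, and the victim's TGS entry is dropped on eviction. The class and method names are hypothetical.

```python
class FSSRP:
    def __init__(self, S, A, r, groups):
        self.W = [0] * S                                   # write counter per set
        self.np = [[None] * (A - r) for _ in range(S)]     # NP tags (None = invalid)
        self.wbit = [[False] * (A - r) for _ in range(S)]  # write bit per NP block
        self.rp = [[None] * r for _ in range(S)]           # RP tags (None = invalid)
        self.lru = [list(range(r)) for _ in range(S)]      # RP ways, front = LRU
        self.group_of = {s: g for g in groups for s in g}
        self.tgs = {}                                      # tag -> (set, RP way); stand-in for the TGS
        self.writebacks = []                               # victims sent to the next level

    def light_set(self, m):
        # Lowest counter in the group; on a tie the current set m wins (assumption).
        return min(self.group_of[m], key=lambda s: (self.W[s], s != m))

    def write_hit_np(self, m, n):
        tag, s_l = self.np[m][n], self.light_set(m)
        if self.wbit[m][n] and s_l != m:
            rp = self.rp[s_l]
            if None in rp:                                 # first invalid RP line
                way = rp.index(None)
            else:                                          # otherwise evict the LRU line
                way = self.lru[s_l][0]
                self.writebacks.append(rp[way])
                self.tgs.pop(rp[way], None)
            rp[way] = tag
            self.lru[s_l].remove(way)
            self.lru[s_l].append(way)                      # mark as most recently used
            self.np[m][n] = None                           # invalidate the NP copy
            self.wbit[m][n] = False
            self.tgs[tag] = (s_l, way)                     # create the TGS entry
            self.W[s_l] += 1                               # count the write where it happened
            return ("redirected", s_l, way)
        self.wbit[m][n] = True                             # write in place
        self.W[m] += 1
        return ("in_place", m, n)

c = FSSRP(S=8, A=16, r=4, groups=[[0, 4], [1, 5], [2, 6], [3, 7]])
c.np[0][6] = "B"
print(c.write_hit_np(0, 6))   # ('in_place', 0, 6)
print(c.write_hit_np(0, 6))   # ('redirected', 4, 0)
```

The first write sets the write bit and is served in place; by the second write, set 0 has a higher counter than set 4, so the block moves to the first invalid RP line of set 4.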

The working of the algorithm is illustrated in Figure 6.2. Note that the structure of the cache and the TGS is explained in section 6.2.1.1.

Example: To demonstrate the method, three cases are considered with respect to set-0. In the first case, a read request from the L1 cache to way-i of the L2 cache (shown by the arrow with label 1) is served normally (as represented by arrow 2), irrespective of the location (NP or RP) of the block. In the second case, a write request from the L1 cache targets the block in way-6 of the L2 cache (shown by arrow 3), which lies in the NP part of the cache. In this case, the write request is redirected to the RP part of the lightly written set in the group, say set 4 in our example (shown by arrow 4). Once the write is performed in set 4 and the entry in the TGS is updated in set 0, the write-back acknowledgment is sent back to the L1 cache (as represented by arrow 5). In the last case, the write hit in way-14 of the L2 cache, i.e., in the RP part of the cache (shown by arrow 6), is served normally. Once the operation is performed, the write-back acknowledgment is sent back to the L1 cache (shown by arrow 7).

[Figure: chart with y-axis "Write Count (%)" (ticks from 20% to 100%) and legend entries NP and RP.]

Figure 6.3: Write count percentages in the different section of FSSRP

Workloads   Swap   Dedup   Body    Fluid   Freq   Stream   X264   Mean
IntraV %    1.3%   45.8%   47.1%   -2.5%   7.1%   8.7%     1.2%   15.5%

Table 6.1: Percentage increase in coefficient of Intra-Set write variation
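As a quick sanity check on Table 6.1, the reported mean is consistent with a plain arithmetic mean of the seven per-workload values. The snippet below only re-adds the numbers from the table; that the thesis uses an arithmetic mean is an assumption supported by the match, not something stated in the text.

```python
# Per-workload IntraV increases (%) copied from Table 6.1.
intra_v = {
    "Swap": 1.3, "Dedup": 45.8, "Body": 47.1, "Fluid": -2.5,
    "Freq": 7.1, "Stream": 8.7, "X264": 1.2,
}

mean = sum(intra_v.values()) / len(intra_v)
print(round(mean, 1))   # 15.5
```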
