• Tidak ada hasil yang ditemukan

Doctor of Philosophy

N/A
N/A
Protected

Academic year: 2023

Membagikan "Doctor of Philosophy"

Copied!
222
0
0

Teks penuh

I would like to thank the heads of the Department of Computer Science and Engineering during my Ph.D. I would also like to thank all the researchers of the Department of Computer Science and Engineering at IITG for creating a warm atmosphere of mutual support and encouragement.

Network-on-Chip

Correct Functioning of NoC

To get rid of such a situation, deadlock avoidance or deadlock recovery must be incorporated in a NoC. A NoC is subject to starvation when some packets keep waiting for resources indefinitely and are unable to make progress because access to certain resources is not granted to them [3].

Verification of NoC

Challenges in Formal Modeling and Verification of NoC

  • Starvation Freedom in NoC
  • Deadlock Detection in NoC
    • Application Specific Deadlock Detection
  • Deadlock Representation and its Avoidance
    • Channel Dependency Graph
    • Turn Model
    • Deadlock in Torus NoC and its Avoidance
  • Formal Modeling of NoC

XY-Turns and YX-Turns are individually non-blocking because they cannot complete a cycle as shown in Fig. A set of allowed curves and a possible blocking representation using these curves are shown in Fig.

Figure 1.4: Deadlock example: (a) Traffic patterns leads to deadlock, (b) No deadlock in another traffic pattern.
Figure 1.4: Deadlock example: (a) Traffic patterns leads to deadlock, (b) No deadlock in another traffic pattern.

Thesis Objectives

Due to the complexity and extensive number of NoC functional units, detailed modeling of NoCs is not performed in most of the existing works. The detailed modeling of the NoC is important so that the verification results also remain valid in real NoC.

Contributions of the Thesis

  • Formal Modeling of NoC using FSM and Verification of Starvation
  • Formal Modeling of NoC using CFSM and Developing a Simulation
  • Deadlock Avoidance in Torus NoC using Arc Model and DDG
  • Deadlock Free Routing Algorithms for Torus NoC using Arc Model . 19
  • Model Checking
  • Equivalent Checking
  • Theorem Prover

In our next work, we have demonstrated deadlock-free routing algorithms for Torus NoC using deadlock-free arc pairs with respect to XY-Turns. Chapter 6 Application of Arc model to design deadlock free routing algorithm for Torus NoC is presented in this chapter.

Figure 1.9: A high-level overview of the contributions from the thesis
Figure 1.9: A high-level overview of the contributions from the thesis

Formal Modeling and Verification of NoC

  • Modeling using FSM and CFSM
  • Traffic Modeling using Queuing Approach
  • Run-time Deadlock Detection
  • Challenges and Objective

Generic NoC is based on an abstract view of the communication network of NoC is presented in this work. Formal Modeling of NoC presented in this work considering the proposed Bidirectional NoC design [52].

Routing Algorithms for NoC

Three partially adaptive routing algorithms are namely West-First, North-Last and Negative-First, which are presented in [1]. Another approach to designing an adaptive routing algorithm applicable to both Mesh and Torus NoC is presented by Duato in [11].

Deadlock Avoidance in a Routing Algorithm

  • Up*/Down* and Turn Model Approach
  • FirstHop Routing for Avoiding Deadlock in Torus NoC
  • Dally’s Approach with Virtual Channel
  • Duato’s Approach with Escape Path and Virtual Channel
  • Bubble Flow Control with Dedicated Buffer
  • Other Approaches
  • Challenges and Objectives

97] proposed an adaptive saving algorithm in Torus NoC using VC and additional buffer at the input gate. Additional VCs or buffers should be used for developing a deadlock-free routing algorithm in the Torus NoC.

Conclusions

NoC Router Components

Multiple packets can compete for the same output port at the same time. In the NoC context, the egress port and its connection to the next router are considered a source of contention.

Contributions

In this chapter, we present starvation-free verification using our FSM-based NoC model. A brief description of the FSM and the short form used to describe the NoC model is presented in Section 3.2.

Finite State Machine and the Naming Convention

Finite State Machine

In the same context, each of the FSMs interacts with other FSMs in the NoC model. Alternatively, Σ represents the global state information of the complete NoC that is used to define state transitions δ in each individual FSM.

Short forms and the Naming Convention

All the transitions in an FSM are controlled by the states of other FSMs with which the FSM interacts and all transitions happen in a manual control manner. In context of our FSM-based NoC model, Q represents the states in the FSM under consideration and Σ represents the FSM state information for the complete NoC model.

Formal Modeling of NoC using FSM

  • High-Level Overview of the Movement of Packets
  • Synchronization between NoC Components
    • Synchronization between two Routers
    • Synchronization within Router Components
  • Modeling Buffer using FSM
    • FSM model of Sync
    • FSM model of Buffer
  • FSM Model of Switch
  • FSM Model of Return
  • Approach for Designing Virtual Channels
  • FSM Model of an Arbiter
    • FSM Model of Fixed-priority Arbiter
  • Progress in Router Components
    • Progress in a Buffer
    • Progress in a Switch
    • Progress in a Fixed-priority and Round-robin Arbiters
  • Synchronization within a Router
  • Correctness of a Priority Generator

The condition (R2PrS = 11) indicates that the priority is set for the packet from the local input port. Satisfactory progress in the proposed NoC model implies that the current state of an FSM is not permanently stuck at a certain state, that is, the state of an FSM keeps changing provided a packet is presented as input for transmission.

Figure 3.2: Movement of packets from router R5 to R2, and synchronization using Return and Sync FSMs
Figure 3.2: Movement of packets from router R5 to R2, and synchronization using Return and Sync FSMs

Application of the Model

Verification of Starvation-freedom

Satisfying this property indicates that every packet from W input port intended for S output port will be transmitted in the future. By satisfying this property, the starvation-freedom for packets from S input port is ensured at R1ArbiterE.

Verification of Transfer of Packets

Verification of Overall NoC

  • Number of FSMs in an NoC
  • Active Windows

The total number of FSMs required when designing a full NoC using a fixed priority arbiter and a round-robin arbiter is shown in Table 3.2. The number of states in each FSM is multiplied while an entire NoC is coded using a model checker.

Table 3.2: Number of FSMs in an NoC with Fixed-priority (FP) and Round-robin (RR) arbiter
Table 3.2: Number of FSMs in an NoC with Fixed-priority (FP) and Round-robin (RR) arbiter

Experimental Results and Analysis

Verification of Progress, Synchronization and Priority Generation within

  • Runtime Improvement with Parallel Execution considering

The speed is calculated as the ratio between serial execution time and parallel execution time. Even the acceleration trend is not the same for both Fixed Priority and Round Robin arbiter.

Table 3.3: Verification of progress, synchronization and priority with fixed-priority (FP) and round-robin (RR) arbiter (A) within individual routers
Table 3.3: Verification of progress, synchronization and priority with fixed-priority (FP) and round-robin (RR) arbiter (A) within individual routers

Verification of Transfer of Packets and Starvation Freedom considering

  • Analysis of the Findings on Starvation Freedom
  • Runtime Improvement with Parallel Execution for the Ac-

The packet from ingress port L (highest priority port) is forwarded to the other router. On the other hand, the packet from input port E (the next higher priority port) is ready and waiting for the same output port N.

Table 3.4: Starvation-freedom for fixed-priority (FP) and round-robin (RR) arbiters
Table 3.4: Starvation-freedom for fixed-priority (FP) and round-robin (RR) arbiters

Conclusion

NoC with a round-robin arbitration policy takes more time than NoC with a fixed-priority arbitration policy. The experimental results in Table 3.5 show that the verification time increases with increasing NoC sizes as well.

Contributions

Formal Modeling of NoCs using CFSM

To our knowledge, detailed formal modeling of NoCs is not explored enough in the literature (discussed in Chapter 2). The condition required for deadlock in CFSM-based model of NoCs is formally captured in Theorem 4.5.2.

Development of CFSM based Simulation Framework

The proposed CFSM-based model of NoCs is generic in terms of routing algorithms, that is, any routing algorithm can be symbolically simulated on it. The second contribution of this work is an application-specific simulation framework using our CFSM-based NoC models.

Background of Communicating Finite State Machine based Modeling

When a message m is transmitted from a CFSM P x at a state Gen, there must be a transition labeled −m from the stateGen (for generation). This message is consumed at P2 in the stateRet2 and makes a transition from the state Ret2 to T r2.

Figure 4.2: Three CFSM processes communicating with each other
Figure 4.2: Three CFSM processes communicating with each other

Formal Modeling of NoC using CFSM

Naming Convention

The buffer present at the north port of router R3 is likewise denoted by BufferN3. We maintain unique name and unique number for each CFSM while generating the CFSM model.

Modeling Buffer

  • Buffer with single slot
  • Buffer with more than one slots
  • Changing of priority in round-robin fashion
  • Transmitting the packet from current router to the next router 93

North port switch of router 3 requests East output port scheduler of the same router to generate priority. Modeling of the buffer with three slots, at the northern port of router R3, is shown in Fig.

Table 4.1: Naming convention for Messages in CFSM Prefix Formation Example Meaning
Table 4.1: Naming convention for Messages in CFSM Prefix Formation Example Meaning

Proposed Scheme for Deadlock Detection

Delayed Reception

This situation is an example of delayed acceptance where the transition is possible with a delay (ie when +p4p3 is accepted and P3 moves to the initial state I2). It is important to note that transition is possible for process Pj from state Sm after a finite time period in case of delayed reception.

Figure 4.8: An example of delayed reception
Figure 4.8: An example of delayed reception

Representation of Deadlock in NoC using CFSM

From the global state Gf, all processes are in receiving states and no further transition is possible, even if some of the message queues are not empty. Since all processes are in receiving state and no transition is possible, each process waits for a certain one.

Figure 4.9: CFSMs with Cyclic Dependency
Figure 4.9: CFSMs with Cyclic Dependency

Deadlock Detection Framework

In a similar way, the waiting chain continues and reaches a situation where Pn−1 is waiting for Pn. Therefore, we conclude that if all CFSMs are in receiving states and no transition is possible in these CFSMs, there are cyclic message-waiting dependencies between CFSMs that create deadlock.

Automation of CFSM Model Generation

Bounded Communication

A CFSM model is said to be bounded if the length of each message queue is finite during all its possible transitions [77]. By applying these two lemmas and performing a meticulous analysis of the transitions, we found that all of our CFSM model is bounded.

Complexity of the CFSM Model

Experimental Results and Analysis

  • Experiment I: XY-routing on Mesh and Torus of Different Sizes
  • Experiment II: XY-routing on Mesh and Torus with Different Traffic
  • Experiment III: Deadlock Detection on Dynamic XY-routing and Mod-
  • Experiment IV: Deadlock Avoidance with Increasing Buffer Size
  • Experiment V: Detection of False Positive Deadlock Warning in Book-
  • Justification for Deadlock in Adaptive Routing Algorithms

Deadlock detection using dynamic XY routing and modified West-First routing is analyzed in the Experiment III. Considering dynamic XY routing and modified West-First routing on a 5x5 NoC (with different buffer sizes), we applied traffic sample of 1600 randomly generated packets in Mesh and Torus NoC.

Figure 4.12: Experiment I: XY Algorithm on Mesh and Torus NoCs with traffic pattern of 100000 packets
Figure 4.12: Experiment I: XY Algorithm on Mesh and Torus NoCs with traffic pattern of 100000 packets

Conclusion

  • Cyclic Resource Dependency in Torus NoC
  • Deadlock and its Representation in NoC
    • Channel Dependency Graph
    • Turn Model
  • Does Deadlock Always Possible in Torus due to Wraparound Channel? 124
  • Deadlock Representation in Torus NoC
  • Deadlock Avoidance in Torus NoC
  • Contributions

Torus NoC is prone to deadlocks due to the inherent cyclic path created by the envelope channel. The first step for XY routing in Torus NoC is to decide whether an envelope channel is applicable or not.

Figure 5.1: Visualising 5x5 Torus NoC as a combination of Mesh sub-network and wraparound channels.
Figure 5.1: Visualising 5x5 Torus NoC as a combination of Mesh sub-network and wraparound channels.

The Arc Model for Avoiding Deadlock in Torus

Restricted Move via Wraparound Channel

After the EW enveloping channel, two alternate directions, either towards North or South, are permitted. The EW envelope channel is divided into EWn and EWs arcs, as shown in Fig.

Classification of Wraparound Channels

For example, the nodes represent a west-to-east channel connected to the west gateway buffer in the neighbor router shown in Figure 5.8(a). The dotted line between s and a indicates a few more such west port channel buffers involved in the dependency cycle.

Figure 5.7: Arc model: Eight possible Arcs in Torus NoC
Figure 5.7: Arc model: Eight possible Arcs in Torus NoC

Directional Dependency Graph

Application of Arc Model and DDG

DDG is useful in this process to detect stall, avoid stall and to show stall freedom for a given combination of arcs and turns. In this thesis, we consider XY turns in the Mesh subnet (i.e., EN, ES, WN, WS turns in Figure 5.3(c)) with arcs and present deadlock detection, deadlock avoidance, and deadlock freedom for various combination arcs.

Case Study: Arcs with XY-Turns

Single Arc with XY-Turns

  • Turns Introduced due to Arcs
  • Deadlock Freedom for Individual Arc

Since NE turn is restricted, NSw arc is deadlock free with respect to XY turns. Therefore, deadlock is not possible for EWn Arc with XY-Turn, as the SE turn is limited in the route.

Figure 5.11: Deadlock freedom for (a) NSe, (b) NSw, (c) EWn, (d) EWs Arcs with XY-Turns 5.5.1.2 Deadlock Freedom for Individual Arc
Figure 5.11: Deadlock freedom for (a) NSe, (b) NSw, (c) EWn, (d) EWs Arcs with XY-Turns 5.5.1.2 Deadlock Freedom for Individual Arc

Deadlock Detection for Arc Pairs with XY-Turns

  • Deadlock in Arc Pairs from the same Wraparound Channel . 138
  • Deadlock with a Combination of X-Arc and Y-Arc

In the same way, stasis is possible for (SNe + SNw). Figure 5.13: a) Arc NSe and NSw, b) Stall scenario due to the combination of NSe and NSw. The turn introduced by (EWs + WEn) and the corresponding stall scenario with XY-alignment are given in the figure.

Figure 5.12: a) EWn and EWs Arcs, b) Deadlock Scenario with EWn Arc, EWs Arc and XY- XY-Turn
Figure 5.12: a) EWn and EWs Arcs, b) Deadlock Scenario with EWn Arc, EWs Arc and XY- XY-Turn

Deadlock Avoidance using DDG Representation

  • Deadlock avoidance for (EWs + WEn) Arcs with XY-Turns 143

If the starting point cannot be reached even after trying all possible allowed paths with the routing algorithm, we conclude that the algorithm is deadlock-free on this topology. SNe and NSe arcs are individually stall-free with XY twists, as shown in Section 5.5.1.2.

Figure 5.16: (a) SNe and NSe Arcs ( b ) Dependency graph: deadlock is not possible by SNe and NSe Arcs
Figure 5.16: (a) SNe and NSe Arcs ( b ) Dependency graph: deadlock is not possible by SNe and NSe Arcs

Experimental Deadlock Detection

Experimental Results for Arc Pairs with XY-Turns

Using DDG, it is found that no cycle formation is possible for fourteen Arc pairs out of twenty eight possible Arc pairs. Blocking is detected using DDG as well as for the same arc pairs (EWs + WEn) as shown in Fig.

Table 5.2: Deadlock detection for Arc pairs with XY-Turns
Table 5.2: Deadlock detection for Arc pairs with XY-Turns

Deadlock Scenarios generated by CFSM Framework

  • Deadlock Scenarios for (EWs + EWn) Arc with XY-Turn . 147

An experimental snapshot of the deadlock for (EWs + WEn) with XY-Turn in the Mesh subnet is shown in Fig. In the third example of the experimental stall scenario, we considered pairs of arcs (EWn + NSw).

Figure 5.18: Experimental deadlock scenario in a 5x5 Torus NoC for EWs and WEn Arcs with XY-routing due to new Turns introduced by Arcs
Figure 5.18: Experimental deadlock scenario in a 5x5 Torus NoC for EWs and WEn Arcs with XY-routing due to new Turns introduced by Arcs

Contributions

A deterministic deadlock-free routing algorithm for a Torus using three arcs with XY-direction is presented. A design approach for deadlock-free routing algorithm for Torus NoC using Arc model is presented in Section 6.2.

Deadlock Free Routing Algorithm Design Approach using Arc Model

If the formation of such a cycle is not possible in any possible DDG, the routing method is deadlock-free. Routing algorithms with different combinations of arcs and with different combinations of non-deadlocked turns in the Mesh subnet are shown in the next section.

Routing using Two Arcs along with XY-Turns

Routing steps for Algorithm 1

In the next step, the pack moves one jump distance south according to EWs Arc rules. The conditions for EWs and NSe arcs can be checked in any order in the algorithm and this will not affect the routing of a packet by the algorithm.

Figure 6.3: Paths for packet p1(47, 10), p2(60, 14) and p3(21, 39) as per Algorithm 1
Figure 6.3: Paths for packet p1(47, 10), p2(60, 14) and p3(21, 39) as per Algorithm 1

Deadlock freedom for Algorithm 1

Since NW and NE turns are not allowed here, looping is not possible. Therefore, Algorithm 1 using EW and NSe Arcs with XY routing in the Mesh sub-grid is deadlock-free.

Maximum Possible Arcs with XY-Turns in a Routing Algorithm

Evaluation plan

Considering deadlock free Arc pairs in X-direction

Considering deadlock free Arc pairs in Y-direction

Routing using Three Arcs with XY-Turns

Routing steps for Algorithm 2

The arc acceptance conditions ensure that EW and WE will not be accepted in the bottom row and NSe will not be accepted in the rightmost column. The conditions for EW, WE and NSe Arc can be checked in any order and the algorithm would not be affected.

Deadlock Freedom for Algorithm 2

  • Wraparound Channel Compatible with Algorithm 2
  • Routing steps for Algorithm 3
  • DDG to show Deadlock Freedom for Algorithm 3

The dependence graph for the arcs EW and the wrapping channel EW with the limitation of the first jump is shown in the figure. The arcs EW, WE and NSe with the wrapping channel SN in the first jump are shown in Fig. 6.10(a).

Figure 6.6: Directional Dependency graphs for (a) EW, (b) WE, (c) NS and (d) SN wraparound channel with FirstHop restriction
Figure 6.6: Directional Dependency graphs for (a) EW, (b) WE, (c) NS and (d) SN wraparound channel with FirstHop restriction

Experimental Results

  • Comparing FirstHop Algorithm with the Algorithm 1
    • Effects of the Percentage of Traffic Injected from the Bound-
  • Comparisons of Algorithm 2 with Algorithm 1, Up*/Down* Algo-
  • Comparison for the Algorithm 3 with Algorithm 2 and FirstHop Al-
    • Effects of the Percentage of Traffic Injected from the Bound-
  • Hop Count Savings for PARSEC Benchmark Suites

The results indicate that saving in hop counts with the FirstHop algorithm decreases with the increase of the NoC size. Therefore, the hop counts saved by the FirstHop Algorithm depend on the percentage of traffic generated from border routers.

Figure 6.11: Percentage of Hop counts saved by FirstHop Algorithm and Algorithm 1 using uniform traffic.
Figure 6.11: Percentage of Hop counts saved by FirstHop Algorithm and Algorithm 1 using uniform traffic.

Conclusions

FSM based NoC Model for Verification of Starvation

Application Specific Deadlock Detection using CFSM based NoC Model184

Deadlock Free Routing Algorithms for Torus NoC

Future Directions

Conclusions

A 4x4 Mesh NoC

Simplified block diagram depicting functional units in an NoC router [3]

Deadlock example: (a) Traffic patterns leads to deadlock, (b) No deadlock in

Channel Dependency Graph: (a) Deadlock cycle, (b) No deadlock

Turn model: (a) All possible Turns create deadlock, (b) XY-Turns (solid

Deadlock cycle from Turn model: (a) Permitted Turns by a routing algorithm,

A 5x5 Torus NoC composed of ring networks

A high-level overview of the contributions from the thesis

Avoid deadlock by discontinuing the cyclic path

A 3x3 Mesh NoC and five bidirectional ports in a router

Movement of packets from router R5 to R2, and synchronization using Return

R2SwitchS: Switch at South Port of R2

R2ReturnS: Synchronization between S port Buffer, S port Switch and Ar-

R2ArbiterS: Fixed-priority arbiter at S port of router R2

R2PriorityS (Round-robin priority generator at the S output port of router

R2ArbiterS (Round-robin arbiter at South port of Router R2)

Partitioning NoC: Active Windows for the Router R5 and R9

Speed up with the increase in the number of routers

Buffer with three slots: (a) Overall communication for buffer at the North

CFSM model for a switch: (a) Switch at the North port of router R3 (b)

Arbiter and Scheduler at the South Port of Router R1: (a) CFSM model for

Virtual channel: (a) Buffer with four slots, (b) Buffer restructuring for two

An example of delayed reception

CFSMs with Cyclic Dependency

Global State Transitions of CFSMs in Fig. 4.9

Cyclic Dependency Graph for the CFSMs in Fig. 4.9

Experiment I: XY Algorithm on Mesh and Torus NoCs with traffic pattern

Experiment II: XY Algorithm on a 3x3 NoC (Mesh and Torus) applying

Deadlock Detection: Dynamic XY Algorithm and Modified West-First Al-

Turn model for routing algorithms: (a) XY-routing, (b) Dynamic XY-routing,

Gambar

Figure 1.3: Simplified block diagram depicting functional units in an NoC router [3]
Figure 1.4: Deadlock example: (a) Traffic patterns leads to deadlock, (b) No deadlock in another traffic pattern.
Figure 1.7: Deadlock cycle from Turn model: (a) Permitted Turns by a routing algorithm, (b) Deadlock cycle
Figure 1.9: A high-level overview of the contributions from the thesis
+7

Referensi

Dokumen terkait

To study the satisfied properties of composable Web services, formal models based on Petri nets are proposed in this paper: the Web service process nets (SPNs) representing

Hence, by all the behavior and principle flow of the deaerator system, the application of formal verification via model checking can be checked by using the software

modeling. The implementation of this model is very useful for water quality monitoring. The objectives of this study are: 1) to study the distribution of watershed parameter, that

รูปแบบการจัดการเรียนรู้เชิงรุกโดยใช้ปัญหาเป็นฐาน เพือส่งเสริมสมรรถนะนักศึกษาการศึกษานอกระบบและการศึกษาตามอัธยาศัย Problem-Based Active Learning Model on Non-formal Education

Half-sandwich ruthenium II p-cymene complexes based on organophosphorus ligands: Structure determination, computational investigation, in vitro antiproliferative effect in breast

Constrained nonlinear model predictive control using hammerstein and wiener models - a partial least squares framework.. Instrumental-variable methods for identification of Hammerstein

Based on these key recommendations are made in the context reducing the contextual vulnerability of the rural mountain communities of Eastern Himalaya Sikkim to climate change The

List of Acronyms ACO Ant Colony Optimization ALC-PSO Particle Swarm Optimization with Aging Leader and Challengers AMS Analog-Mixed Signal ANSA Adjoint Network based Sensitivity