The goal of this assignment is to simulate the flow of packets in a network using "wormhole routing".

Wormhole routing is a variant of virtual cut-through which, in turn, is a cross between circuit and packet switching. In packet switching, a message to be transmitted is split into packets. Packets typically have a network-specific maximum length and are individually routed. This means that the different packets comprising a message may traverse different paths from the source (S) to the destination (D). One implication of this is that Packet 5 of a message may reach D before Packet 3 which was transmitted before.

In wormhole routing, a packet or worm is split into flits (think of a flit as a unit of information transfer: a flit traverses a link between two adjacent nodes in one time unit.) In the sequel, the terms packet and worm are used interchangeably. The flits move in lock step. Fig. 1 shows the progress of an isolated worm in a 2-dimensional mesh. S and D are 3 hops apart and the packet length is 7 flits. The packet was launched at time t=0. Fig. 1 shows the position of each flit between t=1 and t=9 (at which point the entire packet has reached D). The numbers in Fig. 1 identify the flits of the worm -- 1 represents the head flit and 7 the tail flit. The dark lines indicate the path over which the flits of a worm extend. Note that in the absence of any contention, a packet will take f+h-1 time units to reach its destination - h units for the head flit to reach its destination and f-1 units for the remainder of the flits.

An r-radix, d-dimensional hypercube is one in which each node may be
addressed as a d-tuple

(a_{0}, a_{1}, . . . a_{d-1}) where 0 <= a_{i} < r , 0 <= i < d.
Each node is connected to a node that differs in exactly one position
by 1 in the modulo r sense.
An example of a 2-dimensional radix 5 hypercube is shown in Figure 2.

In the first part of this assignment, we confine attention to this structure, with arbitrary radix, called a toroid. We assume that all links are uni-directional. In particular, the horizontal links are left-to-right and the vertical links are top-to-bottom.

We assume that a packet from S to D will always take the shortest
path. If the magnitudes of the difference between the X co-ordinates of
S and D is x and the difference between the Y co-ordinates is y,
then there are (x+y)!/(x!y!) shortest paths between S and D.
To minimize the probability of deadlock, however, we assume that
all packets are routed through a node that shares the x co-ordinate
with D and the y co-ordinate with S. (i.e. the path is made up of
a horizontal sub-path(if any) *followed * by a vertical sub-path
(if any).)
Figure 3 shows the different shortest paths between S and D. Of these,
the first path should be chosen in this assignment.

Figure 5 shows the usage of relevant buffers in the absence of contention. (Figures 5, 6 and 7 will be provided in class.) A packet starts out at t=0 from S to D (3 hops away.) At t=0, its head flit is in the output buffer connected to the outgoing horizontal link. The rest of the packet is in the processor at S. At t=1, the head flit has been transferred to the output buffer of S' neighbor that is connected to an outgoing vertical link. The rest of the flits follow in tandem. Since the length of the packet is 3, the entire packet is safely in the destination processor at time t=5.

Fig. 7 shows worm B blocked by A and C blocked by B. It is possible that an unblocked worm, R blocks one or more worms which in turn block one or more worms, etc. This results in a tree of blocked worms rooted at R. In such a case, no worm can be unblocked before its parent. In this scenario, each simulation cycle should first compute the new positions of the flits of R and then proceed down the tree.

2 5

1 0 1 2 0 4 7

2 0 2 3 1 2 5

3 1 3 2 1 1 7

-1 3

4 4 4 4 3 3 8

5 4 1 4 2 3 6

Here's the interpretation of the different lines.

The first line contains the number of dimensions followed by the radix of the
hypercube. (The hypercubes in this assignment will be either 2 or 3 dimensional.)

Lines 2, 3, 4, 6 and 7 specify the characteristics of a newly generated worm.

The first element is the worm id, followed by the time at which it
enters the network. The next 2*d digits are the source address
followed by the destination address (recall that the address of a node
in a d-dimensional hypercube is represented by a d-tuple.) The last
digit is the worm length.

Line 5 begins with a "-1". This is an indication that the program
should output the state of the network at time t=3.

In the 2-d hypercube, node (0,0) will be the node in the extreme north-west. The first co-ordinate is the x or horizontal or west-east co-ordinate and the second is the y or vertical or north-south co-ordinate. Co-ordinates will increase west to east and north to south. Thus an unblocked worm will move from west to east or north to south, never in the reverse direction (except while traversing the wrap-around connections.)

Again, in the 3-d cube, a node is represented as a 3-tuple. The first is the x, the second is the y and the third is the z co-ordinate or dimension.

The sink of a processor is capable of absorbing multiple flits in the same cycle. The flits being refered to of course belong to different worms destined for the same processor.

If two or more worms are blocked in a router (the latter may occur
in the 3-d hypercube), they will be unblocked in the order of their
arrival. However, if two (or more) worms arrive simultaneously, preference
will be given to the worm in the following order:

incoming z link first, then incoming y link, then incoming x link and then
from the processor (this is for the 3-d case. For the 2-d case, start
with y.) The reason for this is: If the worm has arrived via the
z-dimension, its body is probably spread over more of the network since
it has traversed x and y links already. Probabilistically, it is blocking
more worms. Hence let it be unblocked first.

If a processor generates another worm while the previous worm that it generated has not completely left its router, then that worm shall be discarded. (This means that it should not be buffered in the processor for later transmission. So, your program does not need to maintain a queue internal to the processor.)

Finally, here's a hand-crafted example of what your program should expect as input and the expected output.

The input file follows:

2 15

1 0 3 5 9 13 9

2 2 5 0 9 8 11

3 4 0 0 7 9 4

4 7 1 5 11 7 8

-1 7

5 10 2 0 9 3 4

-1 12

The corresponding output is:

worm id | lead flit | co-ordinates | b/u |
---|---|---|---|

1 | 1 | 9,6 | u |

2 | 1 | 9,1 | u |

3 | 1 | 3,0 | u |

4 | 1 | 1,5 | u |

worm id | lead flit | co-ordinates | b/u |
---|---|---|---|

1 | 1 | 9,11 | u |

2 | 1 | 9,5 | b |

3 | 1 | 5,0 | b |

4 | 1 | 6,5 | u |

5 | 1 | 2,0 | b |