The

Turn

Model

for

Christopher Advanced

Adaptive

J. Glass

and

Computer

Department

Systems

State

Lansing,

{glass

MI

that

a model

are deadlock

free, livelock

free,

maximally adaptive. A unique is not based on adding physical topologies

(though

nels).

Instead,

which

packets

wormhole minimal

the model

or nonminimal,

to networks

is based

with

on analyzing

in a network

can form. Prohibiting cles produces routing

algorithms and

Laboratory

48824-1027 .msu.

edu

extra

that

just enough turns to break algorithms that are deadlock

network,

local

input

devices,

and

network

nique

in

the

output

and

which

by first

into

flow

dividing

control

routing the

digits

message

the flits in the packet to become available.

of the turns three

in routing.

Partially

for 2D meshes, cubes.

algorithms

adaptive

to prevent

for 2D meshes

permit

routing

k-ary

adaptive

are described

n-cubes, show

and highest

The

adaptiveness

that

which

sustainable

each router

a few

for

tilts

to store

routing algo-

on the pattern of message traffic. For nonuniform traffic, adaptive routing algorithms perform better than nonadaptive ones.

ture

for

become

constructing

computers

and

net works scalable

offer than

Systems nodes,

have

interconnection

multiprocessors,

shared-memory

massive

based where

large-scale

scalable

other

a popular

parallelism

approaches

on direct each node

such

[1, 2, 3, 4, 5] and to multiprocessor

networks

are organized

has its own processor,

and broadcast

Direct

The

majority

of

as ensembles local

work

are

memory,

of

meshes

in part

by the

Touchstone

advantage. Its date

the

ACM

appear,

and

Association requires

copy

for

ACM

fee

are not

copyright notice

Computing

a fee and/or

@ 1992

without

the copies

specific

all

or

made notice

IS gven

part

of

th]s

or dmtrlbuted and that

Machinery.

To

the

copying copy

title

material for of

direct the

is by

all nodes

otherwise,

IS granted

publication

on their have

n-cube

the

or to republish,

permission.

0-89791-509-7/92/0005/0278

$1.50

=

(Zj

bors if k which is n-cubes. for O < i

and of

routing

also permits

them

over multiple

communications topologies

and

k-ary

[2], the

for

wormhole

n-cubes,

[9, 1]. routing

particularly

A 2D mesh

Intel

channels,

possible

Paragon,

is used and

the

lowin the Symult

MIT J-machine [11], and is used in the nCUBE2

the [1].

location

in the

the same number differs

from

that

mesh.

In a k-ary

of neighbors.

The

of an n-dimensional

to k and two nodes

n-cube

definition mesh

[8], of a

in that

X and Y are neighbors

if z, = vi for all i, O < i < n— 1, except

one, j, where

1) mod k. The change to modular arithmetic in the definition adds wraparound channels to the k- ary n-cube, giving it symmetry, Every node has n neighbors if k = 2 and 2n neigh-

commercial

permission

In

bors if and only if z, = y, for all i, O < i < :n – 1, except one, j, where y, = ZJ & 1. Thus, nodes have from n to 2n neighbors,

if and only

to

of

are low.

X is identified by n coordinates, (c., c1, . . . . Zn-2, Zn–l ), where 1. Two nodes X. and Y are neighO<xl 1/2, which indicates a significant degree of adaptiveness. Note that Sp = 1 for a sourcedestination pair does not imply that for that pair. The partially adaptive

algorithm algorithms

p is nonadaptive all permit non-

rninimed routing. In the case of the negative-first algorithm, this means that rout ing can be adaptive even when d= < s= and dv ~ Su or when d= > s= and du < Sg. this, see the bottom path in Figure 10b. (a) The

six turns

allowed

(solid

lines)

by the negative-first

4

■ !9.





an illustration

of

algo-

rithm.

99

For

Partially

Adaptive

n-Dimensional



4

This from

Routing

in

Networks

section describes the adaptive routing algorithms the application of the turn model to n-dimensional

and k-ary n-cubes. In general, these unless n is small or the k’s equal 2.

4.1

n-Dimensional

Of the

many

per

cycle,

analogs

algorithms

are

gorithm

is the

all-but-

first

adaptively

for

north-last,

for 2D meshes.

a packet

formed

noteworthy

of the west-first,

gorithms

are not

practical

Meshes

routing

three

topologies

resulting meshes

The

by prohibiting

their

one turn

simplicity.

They

and negative-first

analog

of the

one-negative-first

west-first

routing

in the negative

are

routing

al-

routing

al-

algorithm:

directions

route

of all but

one

dimension (n – 1) and then adaptively in the other directions. The analog of the north-last routing algorithm is the all-but-one-positive-last the negative

(b) Examples Figure

3.4 How

of the negative-first

10. The

negative-first

Degree adaptive

of are

sion (0) and then the negative-tlmt

u.

mmm



algorithm routing

routing

Theorem 5 n-dimensional

for 2D meshes.

Proof:

partially

adaptive

routing

adaptive

adaptive

algorithms,

algorithm, Ax

(Ax ‘f

=

Ay

partially

a node

= IdY – Sg [. Then,

channel

=

{

AZ!AY!

&J!#

if d% ~ s%

1

otherwise

(.4a&2 ~ ~z!AY!

Snorth_la.t =

when

channels

{ For minimal

routing

if dv ~ Sv

< SY)

or (d~ z so and

dv ~ SU)

1

otherwise the

the

traveling

but

the only

sum

of

the

k,

in a negative

larger

proof

All

for

an

of

we present

algorithm

direction,

K – n – X – 1, which

–1,

channels

which

leaving

the

the negative-first with

strictly

dimensional mesh. following theorem, if (dz < s= anddg

algorithms,

in the negative directions.

foT

n-dimensional

along

a

K – n – X

channels leaving the node If a packet ente~ a node it enters along a channel

islessthan node

K-n+X,

thenurn-

in the

positive

routes

every

algorithm

increasing

it enters

is less than

numbers

directions. packet

and is deadlock

along free.



The negative-first algorithm is deadlock free as a result of prohibiting just one turn per abstract cycle, the turn from a positive direction to a negative direction. Therefore, prohibiting some quarter of the turns is sufficient to prevent deadlock in an n-

w =

be

K-n+X

of the

produces s negattve-fsrst

K

numbered

Therefore,

otherwise

{

adaptively positive

and K – n + X, the numbers of the in the negative and positive directions. when traveling in a positive direction,

+ Ay)!

ber -f:rst

first

in the

The negative-first routing algorithm for n-dimensional meshes is deadlock free.
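The channel numbering used in the proof can be checked by brute force on a small mesh. The sketch below assumes the numbering stated in the proof: a channel leaving a node in a positive direction is numbered K – n + X and one leaving in a negative direction K – n – X, where K is the sum of the k_i and X is the sum of the node's coordinates. The function names and the exhaustive check over pairs of consecutive hops are illustrative choices, not taken from the paper.

```python
from itertools import product

def channel_number(node, direction, dims):
    """Number of the channel leaving `node` in a positive (+1) or negative (-1) direction."""
    K, n, X = sum(dims), len(dims), sum(node)
    return K - n + X if direction > 0 else K - n - X

def hops(node, dims):
    """Yield (direction, next_node) for every hop that stays inside the mesh."""
    for dim, k in enumerate(dims):
        for direction in (-1, +1):
            coord = node[dim] + direction
            if 0 <= coord < k:
                yield direction, node[:dim] + (coord,) + node[dim + 1:]

def numbers_strictly_increase(dims, allow_positive_to_negative=False):
    """Check that every permitted pair of consecutive hops uses increasing channel numbers."""
    for node in product(*(range(k) for k in dims)):
        for d1, mid in hops(node, dims):
            for d2, _ in hops(mid, dims):
                if d1 > 0 and d2 < 0 and not allow_positive_to_negative:
                    continue  # the turn that negative-first prohibits
                if channel_number(mid, d2, dims) <= channel_number(node, d1, dims):
                    return False
    return True

print(numbers_strictly_increase((4, 4)))  # True
print(numbers_strictly_increase((4, 4), allow_positive_to_negative=True))  # False
```

Allowing the positive-to-negative turn makes the check fail, which is consistent with that turn being the one the algorithm must prohibit.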

Let

numbered

s west

a packet

adaptively

X be the sum of the x_i for any node x_{n-1}). Number each channel leaving a node in a positive direction K – n + X and each channel leaving a node in a negative direction K – n – X. Then, if a packet enters

algorithms?

p be one of the three

= Idz — s= 1, and

route

then

mesh, and let (ZO , Z* , . . . . z.-2,

Let S_algorithm be the number of shortest paths the algorithm allows from source node (s_x, s_y) to destination node (d_x, d_y). Also, let f be a fully

and

these algorithms are deadlock free, is for the negative-first algorithm.

in an 8 x 8 mesh.

algorithm

adaptively in the other directions. The analog of routing algorithm is also called the negative-first

algorithm:

directions

Adaptiveness these

routing algorithm: route a packet tirst adaptively in directions and the positive direction of one dimen-

Theorem n(n — 1)) to prevent

Salgor,t~~

is, the


maximally 6

Combining this with Theorem which supports our claim that adaptive

Prohibiting

routing

some

in an n-dimensional deadlock.

algorithms.

quarter mesh

1, we have the the tnrn model

of

the

is necessary

turns and

(that sufficient

is,

As

the

tially

number

adaptive

messages k,

of

dimensions

algorithms

adaptively.

are large.

are

SP

But

=

averaged

increases, more

the

likely

to

1 less often, across

all

minimal

be able

especially

Algorithm: Minimal p-cube routing for hypercubes. Input: Current address, C, and destination address, D. Procedure:

par-

to

route

when

source-destination

the pairs,

1. If C = D, exit.

S_p/S_f > 1/2^(n–1), indicating that the degree of adaptiveness relative to fully adaptive algorithms decreases as n increases. Again, adaptiveness 4.2

can be increased

k-ary

The

adaptive

to use the

way is to allow only

on its

lock free, wraparound than

routing

wraparound

a packet

first

of the

are routed

mesh

along

k-ary n-cubes according to

prove

of k-ary

along

that

channels,

channels The

the

n-cubes.

routing

in strictly

decreasing

for meshes. Note that

freedom

all of these

For k-ary

routing

n-cubes

deadlock-free

routing

is a simple

with

that

Linder

and

is deadlock ever,

Harden

free,

enough

minimal,

channels

subnetworks

[16] construct

with

and

fully

to

n + 1 levels

per

k-ary

Overall,

the paths,

Routing

for

hypercubes

n-dimensional algorithms

meshes

in

h = 6,

hO

free

They

add,

how-

into

2n -1

and

cases of the

n-cubes.

Proofs

are corollaries

routing

routing

especially

channel

algorithm

3, and

algorithm when

The

offers

compared

in a di-

for hypercubes.

a choice

of many

to the nonadaptive

following

table

illustrates

of the 36 possible

transmitting

for

sends a example,

shortest

the message,

e-

this

the source node (1011010100) node (0010111001). For this

h] = 3. One

For each node

kn

address

the last

paths

the number

of

hop

C.

destination

was in a positive

address

direction,

D;

and

p.

nr”uti’gf”r

channels

1. If C = D,

route

meshes routing

2.

Ifp=l,

and al-

3.

Elseif

for

4.

Else R = C.

algorithms that

of the

special

address

case of the negative-first

of the node

the header

address

has two phases.

the

packet

to the

local

processor

(CA

D).

and

increased

of the

proofs

algorithm

flits

destination

adaptiveness the packet

for

5. Route

the

currently node.

occupy, The

and

fault

tolerance,

any dimension

flits

is D.

p depends

on which

C is a unique input

buffer

constant the

Figure

distance

of adaptiveness

bet ween S and D. for

the p-cube

The routing

the

any

packet

i for which

the

first

i for which

flits occupy

measure

available

channel

in a di-

r, = 1.

p-cube

routing

algorithm

for hypercubes.

dimension

choices

comment

taken

‘-

g: OO1OO1OOOL , . nnlm lnnnn I 7 ./

phase c, = 1

-.-..-””



1

.

0010110001

I 1

0010111001

I

,

n

1

I

‘w

and in the

6

Simulation

Experiments

To compare the partially adaptive routing algorithms all-but-one-negative-first (ABONF), all-but-one-positive-last (ABOPL), and negative-first (NF) with the nonadaptive routing algorithms xy and e-cube, we simulate a 16 x 16 mesh and a binary 8-cube for three different traffic patterns. Each of these topologies contains 256 nodes.

of the degree

is S_p-cube/S_f

12. The minimal

address

a fully adaptive routing = l(S@D)[ is the Ham-

other

along

CV

the

router. Konstantinidou proposes an algorithm similar to p-cube [20], but only for minimal routing. The number of shortest paths from S to D, S_p-cube, is h1! h0!, where |X| represents the number of 1's in the binary number X, h1 = |S ∧ ¬D|, and h0 = |¬S ∧ D|. For algorithm, f, S_f = h!, where h = h1 + h0
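As a worked check of these counts, the sketch below evaluates h1, h0, and h for the 10-cube example used in this section (source 1011010100, destination 0010111001). The helper names are illustrative, and addresses are assumed to be plain Python integers holding the n address bits.

```python
from math import factorial

def popcount(x):
    """Number of 1's in the binary representation of x."""
    return bin(x).count("1")

def shortest_path_counts(source, dest, n):
    """Return (S_p-cube, S_f) for an n-cube: h1! * h0! and h!."""
    mask = (1 << n) - 1
    h1 = popcount(source & ~dest & mask)   # bits that must change 1 -> 0
    h0 = popcount(~source & dest & mask)   # bits that must change 0 -> 1
    h = h1 + h0                            # Hamming distance
    return factorial(h1) * factorial(h0), factorial(h)

s_pcube, s_full = shortest_path_counts(0b1011010100, 0b0010111001, 10)
print(s_pcube, s_full)  # 36 720
```

Here h1 = h0 = 3 and h = 6, so minimal p-cube routing allows 36 of the 720 shortest paths, a ratio of 1/20.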

R=

rout-

routing,

for each router,

header

O,then

and D

p-cube

In the case of minimal

along

D=

D.

has a particu-

and d_i = 1. Then, the steps can be computed as shown in Figure 12. In both of these algorithms, the only input transmitted in the header

CA

R=CA

the routing

first phase routes the packet along a dimension i for which c_i = 1 and d_i = 0. When there is no such dimension, the second phase routes the packet along a dimension i for which c_i = 0 and d_i = 1. These steps are easily computed using bitwise logic operations as shown in Figure 11. If nonminimal routing is desired, because can also route

then

mension

ing algorithm

ming

p-cube

algorithm.

=

whether

cases.

be the binary

of its

p-cube

Current

that

larly compact expression, the p-cube routing algorithm. Let S be the binary address of the source node for a packet, C be the binary

available

choices based on the p-cube routing is also shown. The number of choices in parentheses indicates the additional choices available with nonminimal routing.
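The per-hop choices of minimal p-cube routing can be reproduced with a few bitwise operations. The following is a minimal sketch under stated assumptions: dimension i is taken to be bit i counted from the least significant end, the `pick` parameter stands in for whatever output selection policy a router uses, and the function names are hypothetical. It covers only the minimal variant, not the nonminimal choices.

```python
def pcube_choices(current, dest, n):
    """Dimensions that minimal p-cube routing may use at node `current`."""
    mask = (1 << n) - 1
    r = current & ~dest & mask        # phase 1: c_i = 1 and d_i = 0
    if r == 0:
        r = ~current & dest & mask    # phase 2: c_i = 0 and d_i = 1
    return [i for i in range(n) if (r >> i) & 1]

def route(source, dest, n, pick=min):
    """Follow one permitted path and record the number of choices at each hop."""
    current, trace = source, []
    while current != dest:
        choices = pcube_choices(current, dest, n)
        trace.append(len(choices))
        current ^= 1 << pick(choices)  # cross the chosen dimension
    return trace

print(route(0b1011010100, 0b0010111001, 10))  # [3, 2, 1, 3, 2, 1]
```

The product of the per-hop choice counts is 3 · 2 · 1 · 3 · 2 · 1 = 36, which agrees with h1! h0! for this source and destination.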

Hypercubes

are special and k-ary

are deadlock

general

The

any

~i = 1.

adding

n-cube

Hypercubes are a special case of both n-dimensional meshes and k-ary n-cubes. Consequently, the partially adaptive

more

along

exit.

p-cube

gorithms

packet

10-cube where to the destination

is shown.

per level.

5

CAD.

i for which

routing

a binary message

nonmini-

algorithm

subnetwork

and

to construct

a routing

the

the

shortest cube

Again,

without

adaptive.

to partition

R=

11. The minimal

extra channels. This is a result of the many cycles that do not involve turns in the topology. By adding channels to a k-ary n-cube,

processor

or-

of the proof

are strictly

are minimal

O,then

Route

Figure

channel and then

channels.

k > 4, it is impossible

algorithms

local

packets

or increasing

modification

algorithms

to the

CAD.

mension

algorithm. Thus, a node at the east edge will have two channels to the west: a mesh immediately to its west and a wraparound

of deadlock

If R=

packet

dead-

can be extended

at the west edge of the mesh

3. 4.

One

is still

in another way: classify each wraparound the direction in which it routes packets

to a node

R=

channel

on whether

algorithm

2.

the

can be ex-

a wraparound

depending

negative-tirst

apPIY the negative-tit of the mesh channels channel to the node

maL

To

for meshes

channels

to be routed

hop.

der in the proof.

the proof

algorithms

number the mesh channels as before and assign the channels a number that is either greater than or less

those

channel

routing.

n-cubes

partially

tended

by nouminimal

route

of neighboring

= 1/(h choose h1).


A pair

of unidirectional

routers

and

channels

each router

connects

to its local

each pair

processor.

All

of the

channels

input

channel

The

routers

have into

the

same

a router

operate

bandwidth,

has a buffer

asynchronously

20 flits/µsec. the

and

synchronize

rithms about

Each

size of a single

flit.

(Figure 13). At low throughputs, the algorithms perform the same. For the nonuniform traffic patterns, the partially

adaptive

to simult~

routing

algorithms

throughputs

(Figures

have the lower

high

are blocked the source

mum sustainable throughput of the partially is four times that of the nonadaptive e-cube

contain

header

an input

flits

selection

consumed. waiting

policy

must

local first-come-first-served that

arrived

indefinite

channel

has multiple

selection policy decides in favor

multiple

arbitrate.

first.

This

postponement. output

in favor policy

When channels

channels channel,

of the header

a header

policy along

and

queued

at their the

source

network

processors

fit

an

output

used is called xy and the lowest dimension.

Average communication latency

and reverse-flip. any of the other the

The uniform processors with

matrix-transpose

cessor at row umn i. In the by

mapping

bors

in the

then

sent

mesh.

nodes

resulting

one d

determined

message

the by

from

hypercube

so that

hypercube.

the

x3). The

0

100

200

Average

300

matrix

transpose

are

in the

reverse-flip

Figure

14. Comparison

traffic

in a 16 x 16 mesh.

pattern

Overall

one at

of routing

in the hypercube,

are for the partially These throughputs

sends each

able throughput algorithm and is not

algorithms

the highest

I

I

I

Average commun-

25

ication

Z.

due to shorter

(

longer

-

path

lengths

for matrix-transpose

than

tratlic 15

tion

-

xv

10

west-first

1

0 0

100

Average

1

I

200

300

network

negative-first I I 400

500

I 600

throughput

for reverse-flip

800

packets.

(flits/µsec)

The

maintained Despite

The and

simulations

hypercube,

latencies

at high

of routing

traffic

for uniform

happen

traffic

From

pattern

that,

nonadaptive

throughputs

for uniform routing

than

the

traffic

algorithms

traffic

probably

partially

in the mesh have adaptive

hops)

than

algorithms

routing

to embody

pattern.

traffic

(11.34

routing

adaptive

result

as well the

is that as when

superior

for uniform provide

better

the

The

av-

for uniform perform

algorithms global, with

for

long-term

a global,

starts

informa-

long-term

message

bet-

uniform point

traffic

of

spread

and the zv and e-cube algoadaptive algorithms, on the

lower

A traffic processes

algo-

applications,

performance traiiic,


the

will

of uniform

traffic

information

is used.

of the

nonadaptive

partially

performance

pattern is determined are mapped to the each node

evenness

global

traffic has been used in many we know of no real applications

indicate the

algorithms

traffic.

tratlic is 4.27 hops, versus 4.01 in the mesh, the highest sustti’n-

across the mesh or hypercube, maintain that evenness. The

algorithms Figure 13. Comparison in a 16 x 16 mesh.

for the e-cube in throughput

other hand, select channels based on local, short-term information. These selections tend to benefit just the routed packet and only for the immediate future and tend to interfere with other

-AI 700

they

this

the uniform

evenly rithms

-Q--

partially

is that about

view,

-Q-

north-last

5;

*

the

tratiic. sustain-

throughput in the mesh, which occurs for the uniform traffic. Again, average path length is

traffic (10.61 hops). The reason the nonadaptive ter

throughputs

and reverse-flip the next highest

is for the negative-first algorithm and matrix- This throughput is 30% higher than the second

highest sustainable xy algorithm and

latency (µsec)

800

for matrix-transpose

in the hypercube, which occurs uniform traffic. This improvement

able throughput transpose traffic.

i

35 30

700

(flits/µsec)

sustainable

adaptive algorithms are 50% higher than

erage path length for reverse-flip hops for uniform traffic. Overall I

600

sends each message

at (x0, x1, x2, x3, x4, x5, x6, x7) to the

I

500

throughput

neigh-

Messages

(x̄7, x̄6, x̄5, x̄4, x̄3, x̄2, x̄1, x̄0).
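The two nonuniform hypercube patterns can be written as bit manipulations on an 8-bit address. This is a sketch under the assumption that the two destination tuples read (x4, x5, x6, x7, x0, x1, x2, x3) for matrix-transpose and (x̄7, x̄6, x̄5, x̄4, x̄3, x̄2, x̄1, x̄0) for reverse-flip; the function names are illustrative. Both mappings give the same result whether x0 is taken as the most or the least significant bit.

```python
def matrix_transpose_dest(addr):
    """Swap the two 4-bit halves of an 8-bit address."""
    return ((addr & 0x0F) << 4) | (addr >> 4)

def reverse_flip_dest(addr):
    """Reverse the order of the 8 address bits, then complement each bit."""
    reversed_bits = int(format(addr, "08b")[::-1], 2)
    return reversed_bits ^ 0xFF

print(format(matrix_transpose_dest(0b00011010), "08b"))  # 10100001
print(format(reverse_flip_dest(0b00000001), "08b"))      # 01111111
```

Both mappings are their own inverse, so each node is paired with exactly one partner (or with itself).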

40

400

network

at row j and colpatterm is derived

in the hypercube

the processor

by

the pru-

at (x0, x1, x2, x3, x4, x5, x6, x7) to the

(x4, x5, x6, x7, x0, x1, x2, from

each

in the

dictated

pattern

the processor

message

to

are neighbors

F

pattern sends each message to equal probability. In the mesh,

sends

a 16 x 16 mesh mesh

zo

dependent. Three matrix-transpose,

i and column j to the one hypercube, a matrix-transpose

to the

The

from

pattern

25

and bounded.

is largely

the message traffic pattern, which is application network workloads are considered: uniform,

algorithms

in an input

to it,

is small

performance

adaptive algorithm.

flits

routing is minimal. For each simulation, two characteristics of network performance are measured: average communication latency (in µsec) and average sustainable network throughput (in flits delivered per µsec). The throughput is sustainable when the number of Obviously,

at

therefore

All

packets

especially

For matrix-transpose

used is called

is fair

available

must arbitrate. The of the output channel

input output

The policy

and decides

in the router

prevents

When

for the same available

16).

traffic in both the mesh and hypercube, the maximum sustainable throughput of the partially adaptive algorithms is twice that of the nonadaptive algorithms. For reverse-flip traffic, the maxi-

from immediately entering the network are queued at processor. Messages that arrive at a destination processor are immediately

14, 15, and

latencies,

neously transmit the flits in a packet. The processors generate messages at time intervals chosen from a negative exponential distribution. Each message has an equal probability of being one packet of 10 or 200 flits. Messages that

previous that

adaptive

in real

systems.

is not routing

algorithms Uniform

simulation studies, but generate uniform traffic.

by the application and how its nodes of the network. For most

communicate

with

some nodes

much

more

than

algorithms

30

Average25 communication Z. latency

.

ABONF

Q

ABOPL p-cube

= L

presents

illustrate,

is often

7

Conclusions

Our

goal

tions

because

has

been

to

poor

performance.

and

Future

make

the

interconnection

in which

packets

number

they

a problem

for the

are nonadaptive.

Just

maintain the evenness of uniform traffic, the unevenness of nonuniform traffic. The

minimum

use

of

net works.

can turn

of turns

best

break

the

channels

Analyzing

in a network

that

they blindly result, ss the

Work

cycles

faulty ment spot

hardware and decreases and livelock. Nonminimal avoidance

and fault

the

produces

free, minimal freedom and

livelock freedom are essential for routing algorithms. ness increases the chances that packets can avoid hot

15 -

in

the direc-

and prohibiting

all of the

routing algorithms that are deadlock free, livelock or nonminimal, and maximally adaptive. Deadlock

10

Adaptivespots and

the chances of indefinite postponerouting allows, even greater hot-

tolerance.

The

turn

model,

unlike

other

54

approaches to designing adaptive routing algorithms, is applicable to networks with only the channels required by the network

o~

topologies channels).

Average network throughput

without

800

tially

(flits/µsec)

(as well Applied extra

15. Comparison of routing algorithms for matrix-transpose traffic in an 8-cube.

While the disadvantages.

as to networks with extra to n-dimensional meshes

chanuels,

the turn

routing

algorithms.

adaptive

adaptive routing than nonadaptive traffic. Figure traffic

tratlic

as they maintain

wormhole-routed

35

(psec)

Nonuniform

e-cube

figures

40

others.

ZY and

algorithms algorithms

turn model Adaptive

trol

logic

for route

this

may

increase

the

need

for

produces

Simulations

tion

on more

selection

than

delay.

of the

between

information.

bases

one of the dimensions. a selection

partially

does nonadaptive

Part

to decide

header

typically

new, par-

indicate that they can perform better for nonuniform patterns of message

For

a selection

on the distance

remaining

and

is due

output

to

chan-

Another part of the to base the route selecdimension-order

on the

For adaptive

routing,

complexity multiple

nels, all of which lead to the destination. complexity is due to the need for a router a router

several of these

has many advantages, it also has some routing can require more complex con-

node

a router

model

physical or virtual and k-ary n-cubes

distance

routing, remaining

routing,

a, router

in more

than

in

must

base

one, or all, dimensions. Every extra bit of header information that is required for the router to select an output channel increases router storage requirements of store 40

I

35

(flsec)

e-cube

*

on network

~

the

Q -,&

cal channels.

I

tigate

the turn

effects model

to apply octagonal,

15 10

o~

for future input

In [18],

to networks Other

that

models

for

networks.

and

more

like those

work.

In [19],

we inves-

output

selection

we illustrate include designing

Another

mit adaptive routing topologies, the turns

without are not

stract

necessarily

cycles

are not

task

butions,

the

extra

virtual

adaptive

policies

application

of

or physi-

routing

algo-

obvious

extension

do for

of our work

is

is the

so that

the

the addition of channels. necessarily 90-degrees and formed

identification results

by

four

of realistic

of future

simulations

turns. workload can

In such the abA final distribe more

meaningful.

network throughput (flits/µsec)

Acknowledgments The

Figure 16. Comparison fic in an 8-cube.

latencies

the turn model to other topologies, such as hexagonal, and cube-connected cycle networks, all of which per-

important

u]

Average

directions of different

performance.

the enhanced

o

communication

rithms are based on adding extra channels to networks, but not produce routing algorithms that are maximally adaptive

.20

54

are many

I

ABOPL p-cube (NF)

25

There

I

ABONF

30

Average communication latency

and makes

and f&ward.

of routing

algorithms

for reverse-flip

authors

develop

traf-

wish

to thank

Dr.

Philip

K. McKinley

for helping

us

the simulator.

References 1. NCUBE 2. Intel iion,


Company,

Corporation, 1991.

NCUBE A

6400

Touchdone

P70ce88c,r DELTA

Manual, System

1990. De8crip-

3.

S. B. Borkar, Kung,

R. Cohn,

M. Lam,

P. S. Tseng,

J. Sutton,

An integrated

solution

Proceedings 4. D.

Lenoski,

J.

May A.

and J. Webb, parallel

Gharachorloo,

multiprocessorfl on

Agarwal,

B.-H.

Lim,

A processor

tiprocessor,” on

Computer

W.

J. Dally

D.

Kranz,

tmchitecture

in Proc.

of the

Architecture,

Networks,

9.

vol.

vol. 3, no. 4, pp.

nection

“Performance IEEE net works,”

39, pp.

775–785,

J.

J.

Kubiatowicz,

International

torus

May

1990.

routing

1, no. 3, pp.

X. Lh

June

267–286,

chipfl

Journal

187–196,

and L. M. Ni,

“Deadlock-fi-ee networks,”

International

116–125,

C.

L.

Seitz,

A new Computer

1979.

multicast

wormhole

in Proceedings

Symposium

May

1986.

1990.

ing in in multicomputer pp.

mul-

Symposium

analysis of Ic-ary n-cube interconTransactions on Compute.., vol. C-

Dally,

Annual

10.

and

protocol

7. P. Kermani and L. Kleinrock, “Virtual cut-through: computer communication switching technique,”

W.

20.

1988.

multiprocessing

104–114,

“The

Computing,

and

for

17th

pp.

and C. L. Seitz,

of Distributed

8.

Conference

the 17th Internapp. 148–159,

of

Architecture,

on

of

Computer

routthe 18th

Architecture,

1991.

W.

Athas,

C.

C.

M.

Flaig,

A.

J.

Martin,

J. Seizovic, C. S. Steele, and W.-K. Su, “The architecture and programming of the Ametek Series 2010 multicomputer,” in Proceedings of the Third Conference current Computers and Applications, CA), pp. 1988. 11.

12.

13.

W.

J.

33–36,

Association

“The

Dally,

on Hypercube ConVolume I, (Pasadena,

for Computing

J-machine:

System

in Actors:

Knowledge-Based

Concument

and

eds.),

1989.

Agha,

MIT

Press,

Mach] nery,

support

for

Jan.

Actors)

Computing

(Hewitt

W. J. Dally and H. Aoki, “Adaptive routing using virtual channels,” tech. rep., Massachusetts Institute of Technology, Laboratory for Computer Science, Sept. 1990. H. Sullivan fully Annu.

and T. R. Bashkow,

distributed Symp.

parallel

“A large

machine? Architecture,

Comput.

scale, homogeneous,

in Proceedings

oj the lth

vol. 5, pp. 105–124,

Mar.

1977. 14.

W. J. Dally and in multiprocessor tions

15.

on

Computers,

J. T. Yantchev deadlock-free IEE

16.

18.

and

vol.

C-36,

Pt.

routing

for

E, vol.

W. J. Dally,

“Fine-grain

message

ers,”

in PTOC. of the Third

rent

Computers,

vol.

vol.

pp.

1987. low

latency,

of processorsfl 178-186,

May

in 1988.

“An adaptive and fault tolfor k-ary n-cubes ,“ IEEE 40, pp. passing

Conference

1, (Pasadena,

May

“Adaptive,

networks

136(3),

D. H. Linder and J. C. Harden, erant wormhole routing strategy on Computers,

message routing IEEE Tmnsac-

pp. 547–553,

C. R. Jesshope,

packet

Proceedings,

Transactions 17.

C. L. Seitz, “Deadlock-free interconnection networks,”

on CA.),

2-12,

Jan.

concurrent Hypercube pp.

2–12,

1991. computConcurJan.

C. J. Glass and L. M. Ni, “Adaptive routing in mesh-connected networks,” in Proceedings of the 12th International on

Distributed

Computing

System.,

June

1992.

in

Gupta,

coherence

in Proc.

Computer

Nov.

A.

cache

“iWarp:

computing:

’88, pp. 330-339,

K.

19.

H. T.

L. Rankh,

1990.

“APRIL:

6.

J. Urbanski,

directory-baaed

Symposium

T. Gross,

J. Pieper,

to high-speed

Laudon,

“The

for the DASH tional

S. Gleason,

C. Peterson,

oj Svpercompnting

J. Hennessy,

5.

G. Cox,

B. Moore,

1988.

C. J. Glass and L. M. Ni, “Maximally fully adaptive routing in 2D meshes,” Tech. Rep. MSU-CPS-ACS-51, Dept. of Computer Science, Michigan State University, East Lansing, Michigan, Jan. 1992.


S. Konstantinidou, “Adaptive, minimal routing in hypercubes,” in Proc. of the 6th MIT Conference: Advanced Research in VLSI, pp. 139–153, 1990.