
CS575 Parallel Processing

Lecture three: Interconnection Networks


Wim Bohm, CSU

Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 license.
Interconnection networks
- Connect
  - Processors, memories, I/O devices
- Dynamic interconnection networks
  - Connect any to any using switches or buses
  - Two types of switches:
    - On/off: 1 input, 1 output
    - Pass-through / cross-over: 2 inputs, 2 outputs
- Static interconnection networks
  - Connect point to point using wires

CS575 lecture 3 2
Dynamic Interconnection Network:
Crossbar
- Connects, e.g., p processors to b memories
- p x b matrix: p horizontal lines, b vertical lines
- Cross points: on/off switches
  - Only one switch on per (row, column) pair
- Non-blocking: Pi to Mj does not block Pl to Mk
- Very costly, does not scale well
  - p x b switches, complex timing and checking

Dynamic Interconnection Network:
Bus
- Connects processors, memories, I/O devices
- Master: can issue a request to acquire the bus
- Slave: can respond to a request once the bus is granted
- If there are multiple masters, we need an arbiter
- Sequential: only one communication at a time
  - Bottleneck, but simple and cheap

Crossbar vs bus
- Crossbar
  - Scalable in performance
  - Not scalable in hardware complexity
- Bus
  - Not scalable in performance
  - Scalable in hardware complexity
- Compromise: the multistage network

Multi-stage network
- Connects n components to each other
- Usually built from O(n log n) 2x2 switches
- Cheaper than a crossbar
- Faster than a bus
- Many topologies
  - e.g. Omega (book fig 2.12), Butterfly, ...
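As a concrete illustration (this sketch is not from the slides, and assumes the standard self-routing property of the Omega network): each of the log2(n) stages performs a perfect shuffle of the labels followed by a 2x2 switch, and the switch settings can be derived from the destination address alone.

```python
# Sketch of self-routing in an Omega network with n inputs (n a power
# of two). Each stage rotates the current label left by one bit (the
# perfect shuffle) and then the 2x2 switch replaces the low bit with
# the next bit of the destination address, MSB first. After log2(n)
# stages every bit has been replaced, so the label equals dst.

def omega_route(src, dst, n):
    """Return the sequence of intermediate labels from src to dst."""
    stages = n.bit_length() - 1              # log2(n)
    label, path = src, [src]
    for i in range(stages - 1, -1, -1):      # destination bits, MSB first
        d_bit = (dst >> i) & 1
        label = ((label << 1) & (n - 1)) | d_bit
        path.append(label)
    return path

print(omega_route(0b010, 0b110, 8))  # last label equals the destination
```

Because the route depends only on dst, two messages with different sources can contend for the same switch output, which is why an Omega network is blocking while a crossbar is not.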

Static Interconnection Networks
- Fixed wires (channels) between devices
- Many topologies
  - Completely connected
    - n(n-1)/2 channels
    - Static counterpart of the crossbar
  - Star
    - One central PE for message passing
    - Static counterpart of the bus
  - Multistage network with a PE at each switch

More topologies
- Necklace or ring
- Mesh / torus
  - 2D, 3D
- Trees
  - Fat tree
- Hypercube
  - 2^n nodes in an n-D hypercube
  - n links per node in an n-D hypercube
  - Addressing: 1 bit per dimension

Hypercube
- Two connected nodes differ in exactly one bit
- An n-D hypercube can be divided into
  - 2 (n-1)-D cubes, in n ways
  - 4 (n-2)-D cubes
  - 8 (n-3)-D cubes
- To get from node s to node t
  - Follow a path determined by the differing bits
  - E.g. 01100 -> 11000: 01100 -> 11100 -> 11000
- Question: how many (simple) paths from one node to another?
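The bit-flipping walk above can be sketched directly (this code is an illustration, not from the slides); here the differing bits are flipped from most significant to least, which reproduces the 01100 -> 11100 -> 11000 example.

```python
# Route in an n-D hypercube by flipping the differing bits of the
# source and target addresses, most significant bit first.

def hypercube_path(s, t, dims):
    """Return the node addresses visited going from s to t."""
    path = [s]
    for j in range(dims - 1, -1, -1):   # MSB first
        if (s ^ t) >> j & 1:            # bit j still differs
            s ^= 1 << j                 # flip it: move along dimension j
            path.append(s)
    return path

p = hypercube_path(0b01100, 0b11000, 5)
print([format(x, '05b') for x in p])    # ['01100', '11100', '11000']
```

A hint for the question above: every ordering of the h differing bits (h = Hamming distance) gives a distinct shortest path, so there are h! shortest paths; counting all simple paths is harder.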

Measures of static networks
- Diameter
  - Maximal shortest path between two nodes
  - Ring: floor(p/2), hypercube: log(p), 2D wraparound mesh: 2*floor(sqrt(p)/2)
- Connectivity
  - Measure of the multiplicity of paths between nodes
- Arc connectivity
  - Minimum #arcs to be removed to create two disconnected networks
  - Ring: 2, hypercube: log(p), mesh: 2, wraparound mesh: 4

More measures
- Bisection width
  - Minimal #arcs to be removed to partition the network into two (off by at most one node) equal halves
  - Ring: 2, complete binary tree: 1, 2D mesh: sqrt(p)
  - Question: bisection width of a hypercube?
- Channel width
  - #bits communicated simultaneously over a channel
- Channel rate / bandwidth
  - Peak communication rate (#bits/second)
- Bisection bandwidth
  - Bisection width * channel bandwidth
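The measures above are easy to tabulate; this sketch (not from the slides) encodes the formulas for three of the topologies, using an arbitrary example value for the channel bandwidth.

```python
# Static-network measures as functions of the node count p, following
# the formulas on these slides. The mesh case assumes a 2D wraparound
# mesh with p a perfect square.
import math

def measures(network, p):
    if network == "ring":
        return {"diameter": p // 2, "bisection": 2, "arc_conn": 2}
    if network == "hypercube":
        d = int(math.log2(p))
        return {"diameter": d, "bisection": p // 2, "arc_conn": d}
    if network == "wraparound_mesh":
        side = math.isqrt(p)
        return {"diameter": 2 * (side // 2), "bisection": 2 * side,
                "arc_conn": 4}
    raise ValueError(network)

bw_channel = 1e9                      # bits/s, an example value only
m = measures("hypercube", 64)
print(m, "bisection bandwidth:", m["bisection"] * bw_channel)
```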

Summary of measures: p nodes
Network               Diameter        Bisection width  Arc connectivity  #links
Completely-connected  1               p^2/4            p-1               p(p-1)/2
Star                  2               p/2 *            1                 p-1
Ring                  floor(p/2)      2                2                 p
Complete binary tree  2 log((p+1)/2)  1                1                 p-1
Hypercube             log(p)          p/2              log(p)            p log(p)/2

* The textbook gives the bisection width of a star as 1, but the only way to split a star into (almost) equal halves is by cutting half of its links.

Meshes and Hypercubes
- Mesh
  - Buildable, scalable, cheaper than hypercubes
  - Many applications (e.g. grid computations) map naturally
  - Cut-through routing works well in meshes
  - Commercial systems are based on it
- Hypercube
  - Recursive structure nice for algorithm design
  - Often the same O complexity as PRAMs
  - Often a hypercube algorithm is also good for other topologies, so it is a good starting point

Embedding
- Relationship between two networks
- Studied by mapping one into the other
- Why?
- G(V,E) -> G'(V',E')
  - graphs G, G'; vertices V, V'; edges E, E'
  - Map E -> E', V -> V'
- Congestion k: k (>1) edges of E map to one edge of E'
- Dilation k: one edge of E maps to k edges of E'
- Expansion: |V'| / |V|
- Often we want congestion = dilation = expansion = 1

Ring into hypercube
- Number the nodes of the ring s.t. the Hamming distance between two adjacent nodes is 1
- A Gray code provides such a numbering
- It can be built recursively: the binary reflected Gray code
  - 2 nodes: 0 1 (OK)
  - 2^k nodes:
    - Take the Gray code for 2^(k-1) nodes
    - Concatenate it with the reflected Gray code for 2^(k-1) nodes
    - Put 0 in front of the first batch, 1 in front of the second
- A mesh can be embedded into a hypercube
  - (Toroidal) mesh = rings of rings
ring to hypercube cont
1 bit:  0 1
2 bits: 00 01 11 10
3 bits: 000 001 011 010 110 111 101 100

G(i, dim) = the i-th codeword of the dim-bit Gray code:
G(0,1) = 0
G(1,1) = 1
G(i, x+1) = 0 || G(i, x)                if i < 2^x
          = 1 || G(2^(x+1) - i - 1, x)  if i >= 2^x
(|| is concatenation)
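The recursive definition above transcribes directly into code (this sketch is an illustration, not from the slides):

```python
# G(i, x): the i-th codeword of the x-bit binary reflected Gray code,
# returned as a bit string, following the recursive definition above.

def G(i, x):
    if x == 1:
        return str(i)                      # G(0,1) = '0', G(1,1) = '1'
    half = 1 << (x - 1)                    # 2^(x-1)
    if i < half:
        return '0' + G(i, x - 1)           # first batch, prefixed 0
    return '1' + G(2 * half - i - 1, x - 1)  # reflected batch, prefixed 1

codes = [G(i, 3) for i in range(8)]
print(codes)   # ['000', '001', '011', '010', '110', '111', '101', '100']
# adjacent codewords (cyclically, as on a ring) differ in exactly one bit
assert all(sum(a != b for a, b in zip(codes[i], codes[(i + 1) % 8])) == 1
           for i in range(8))
```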

2D Mesh into hypercube
- Note: in a 2D wraparound mesh
  - Rows are rings
  - Columns are rings
- 2^r x 2^s wraparound mesh into a 2^(r+s)-node cube
- Map node (i,j) onto node G(i,r) || G(j,s)
  - Each row coincides with a sub-cube
  - Each column coincides with a sub-cube
  - S.t. if adjacent in the mesh then adjacent in the cube
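The mapping above can be checked exhaustively for a small mesh (a sketch, not from the slides; it uses the closed form i ^ (i >> 1), which produces the same binary reflected Gray code as the recursive construction):

```python
# Embed a 2^r x 2^s wraparound mesh into an (r+s)-D hypercube:
# node (i, j) maps to the cube address G(i, r) || G(j, s).

def gray(i):
    return i ^ (i >> 1)          # binary reflected Gray code, closed form

def embed(i, j, r, s):
    return (gray(i) << s) | gray(j)   # concatenate the two addresses

r, s = 2, 3                      # 4 x 8 wraparound mesh into a 5-D cube
for i in range(1 << r):
    for j in range(1 << s):
        me = embed(i, j, r, s)
        for ni, nj in [((i + 1) % (1 << r), j), (i, (j + 1) % (1 << s))]:
            # mesh neighbours must differ in exactly one address bit
            assert bin(me ^ embed(ni, nj, r, s)).count('1') == 1
print("all mesh links map to single cube links")
```

The check covers the wraparound links too, which works because the first and last Gray codewords also differ in a single bit.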

Complete binary tree into hypercube
- Map the tree root to any cube node
- Left child: same node as its parent
- Right child at level j: invert bit j of the parent's node

000
000 001
000 010 001 011
000 100 010 110 001 101 011 111
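The rule can be replayed in code (a sketch, not from the slides; it follows the worked example above, where the children at depth j+1 toggle bit j, counting bits from 0):

```python
# Embed a complete binary tree into an n-D hypercube: the root maps to
# node 0, a left child keeps its parent's node, and a right child at
# depth j+1 inverts bit j of the parent's node.

def tree_levels(n):
    """Node address of every tree vertex, level by level."""
    levels = [[0]]                         # root
    for j in range(n):
        nxt = []
        for node in levels[-1]:
            nxt += [node, node ^ (1 << j)]   # left child, right child
        levels.append(nxt)
    return levels

lv = tree_levels(3)
print([[format(x, '03b') for x in row] for row in lv])
# the leaf level uses every cube node exactly once
assert sorted(lv[-1]) == list(range(8))
```

Each left-child edge maps to no cube link at all and each right-child edge to exactly one, so the embedding has dilation 1.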

Routing Mechanisms
- Determine all source-to-destination paths
- Minimal: a shortest path
- Deterministic: one path per (src, dst) pair
  - Mesh: dimension-ordered (XY) routing
  - Cube: E-cube routing
    - Send along the least significant 1 bit in src XOR dst
- Adaptive: many paths per (src, dst) pair
  - Minimal: only shortest paths
- Why adaptive? Discuss.
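The E-cube rule above fits in a few lines (a sketch, not from the slides): at every step, move along the dimension of the least significant bit in which the current node still differs from the destination.

```python
# E-cube routing in a hypercube: repeatedly flip the least significant
# differing bit of the current address until the destination is reached.

def ecube_path(src, dst):
    cur, path = src, [src]
    while cur != dst:
        diff = cur ^ dst
        cur ^= diff & -diff        # isolate and flip the lowest set bit
        path.append(cur)
    return path

print([format(x, '05b') for x in ecube_path(0b01100, 0b11000)])
# ['01100', '01000', '11000']
```

Note the path differs from the earlier MSB-first example for the same (src, dst) pair; both are shortest, but E-cube fixes one deterministic choice among the h! shortest paths.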

Routing (communication) Costs
- Three factors
  - Startup time at the source (ts)
    - OS, buffers, error correction info, routing algorithm
  - Per-hop time (th)
    - The time it takes to get from one PE to the next
    - Also called node latency
  - Per-word transfer time (tw)
    - Inverse of the channel bandwidth

Two routing (switching) techniques
- Store-and-forward: O(m*l)
  - Strict: the whole message travels from PE to PE
  - m words, l links:
    tcomm = ts + (m*tw + th)*l
  - Often th is much less than m*tw, so tcomm = ts + m*l*tw
- Cut-through: O(m + l)
  - Non-strict: the message is broken into flits (packets)
  - Flits are pipelined through the network:
    tcomm = ts + l*th + m*tw
  - A circular path plus finite flit buffers can give rise to deadlock
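The two cost models above compare directly (a sketch, not from the slides; the numeric values for ts, th, tw, m, and l are arbitrary example values):

```python
# Communication cost of sending an m-word message over l links under
# the two switching models from these slides.

def store_and_forward(ts, th, tw, m, l):
    return ts + (m * tw + th) * l        # whole message hops l times

def cut_through(ts, th, tw, m, l):
    return ts + l * th + m * tw          # flits pipelined over l links

ts, th, tw = 100.0, 2.0, 0.5             # example values (microseconds)
m, l = 1000, 10                          # message words, links traversed
print(store_and_forward(ts, th, tw, m, l))  # 100 + (500 + 2)*10 = 5120.0
print(cut_through(ts, th, tw, m, l))        # 100 + 20 + 500    = 620.0
```

The gap grows with m*l: store-and-forward pays the full message transfer on every link, while cut-through pays it only once and adds a per-hop term.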

