ports can be aggregated to form an 8× or 12× port, providing 40, 80, or 120 Gb/s
of bandwidth per direction, respectively. This allows the 36-port 4× QDR router to be treated as
a 12-port 12× QDR (120 Gb/s per direction) router, which provides flexibility for building fat-trees
and torus networks with speedup in the network fabric, for example.
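The port-aggregation arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `aggregate` and the constants are hypothetical, and the 10 Gb/s per-lane figure is inferred from the text's statement that a 4× QDR port carries 40 Gb/s per direction.

```python
BASE_WIDTH = 4        # lanes in one physical port (a 4x port)
GBPS_PER_LANE = 10    # assumption: implied by 4x QDR = 40 Gb/s per direction
PHYSICAL_PORTS = 36   # physical 4x ports on the router chip


def aggregate(ports_per_group):
    """Group physical 4x ports into wider logical ports.

    Returns (logical_port_count, link_width, gbps_per_direction).
    """
    if PHYSICAL_PORTS % ports_per_group != 0:
        raise ValueError("group size must divide the physical port count")
    width = BASE_WIDTH * ports_per_group
    return PHYSICAL_PORTS // ports_per_group, width, width * GBPS_PER_LANE


for group in (1, 2, 3):
    ports, width, gbps = aggregate(group)
    print(f"{ports} ports of {width}x at {gbps} Gb/s per direction")
```

Grouping three physical ports yields the 12-port 12× configuration described above; grouping two yields 18 ports of 8×.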
Each IS4 chip provides 16 service levels (SLs), with SL15 reserved for control messages
called management datagrams (MADs). The SL is carried in the packet header and is invariant
throughout the route. At each hop, a service level to virtual lane (VL) assignment takes place. The
IS4 chip provides up to eight independent VLs, which can be used for deadlock avoidance in the
routing algorithm, performance isolation, or QoS. The VLs use credit-based flow control to manage
downstream input buffer space and will never drop a packet due to congestion in the input buffer.
Instead, the packet is blocked at the sender. If a different VL has room in the input buffer, its
packets may still flow.
Virtual cut-through flow control (VCT) [37] is used across the network links, with the exception
of SL15 (the management SL), where no flow control is provided by the hardware. Software must
provide the flow control in this case.
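The interaction between SL-to-VL mapping and per-VL credits can be illustrated with a small Python model. It is a sketch under stated assumptions, not a description of the IS4 hardware: the class `LinkDirection`, the modulo mapping in `SL_TO_VL`, and the use of one credit per packet are all hypothetical simplifications, and management traffic on SL15 is deliberately left out because the text says it has no hardware flow control.

```python
from collections import deque

NUM_VLS = 8    # from the text: up to eight independent VLs
MGMT_SL = 15   # from the text: SL15 is reserved for management datagrams

# Hypothetical SL-to-VL table for the 15 data SLs (0..14). Real tables are
# configured per hop and need not follow this modulo pattern.
SL_TO_VL = {sl: sl % NUM_VLS for sl in range(MGMT_SL)}


class LinkDirection:
    """One direction of a link with sender-side credit counters per VL."""

    def __init__(self, sl_to_vl, credits_per_vl):
        self.sl_to_vl = sl_to_vl
        # Assumption: one credit equals one packet slot in the input buffer.
        self.credits = [credits_per_vl] * NUM_VLS
        self.rx_buffers = [deque() for _ in range(NUM_VLS)]

    def try_send(self, sl, payload):
        """Send if the mapped VL has a credit; otherwise block (return False)."""
        if sl == MGMT_SL:
            raise ValueError("SL15 has no hardware flow control; not modeled")
        vl = self.sl_to_vl[sl]
        if self.credits[vl] == 0:
            return False  # packet stays at the sender; it is never dropped
        self.credits[vl] -= 1
        self.rx_buffers[vl].append(payload)
        return True

    def drain(self, vl):
        """Receiver forwards one packet and returns a credit to the sender."""
        payload = self.rx_buffers[vl].popleft()
        self.credits[vl] += 1
        return payload


link = LinkDirection(SL_TO_VL, credits_per_vl=2)
print(link.try_send(0, "a"))  # True
print(link.try_send(0, "b"))  # True
print(link.try_send(0, "c"))  # False: VL0 has no credits left
print(link.try_send(1, "d"))  # True: VL1 is unaffected by VL0 being full
print(link.drain(0))          # a
print(link.try_send(0, "c"))  # True: the returned credit unblocks VL0
```

The blocked packet is retried by the sender rather than discarded, and traffic on another VL proceeds independently, which is the isolation property the paragraph describes.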
As off-chip router bandwidth increases exponentially while typical packet sizes remain roughly
constant, the increase in pin bandwidth relative to packet size motivates networks built from many
thin links, and therefore high-radix routers. However, the router microarchitecture needs to scale to
a high port count effectively to enable a high-radix network. In this chapter, we described the
challenges in scaling to high radix, primarily the complexity of the switch and virtual channel
allocation, which is proportional to the square of the radix. We presented an alternative hierarchical
router microarchitecture and provided an example, the radix-64 Cray YARC router, that leverages
this hierarchical organization. By decoupling the input and the output allocation and reducing the
intermediate buffering requirements, a hierarchical switch organization provides a cost-effective
router microarchitecture that can scale to a high port count.
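To make the quadratic scaling concrete, the following Python sketch tabulates a relative cost that grows as the square of the radix. The function `flat_allocation_cost` is a hypothetical unit-free model chosen only to show the growth rate; it does not estimate the area or delay of any real router.

```python
def flat_allocation_cost(radix):
    """Relative cost of a flat switch/allocator, modeled as radix squared."""
    return radix * radix


baseline = flat_allocation_cost(8)
for k in (8, 16, 32, 64):
    cost = flat_allocation_cost(k)
    print(f"radix {k:2d}: cost {cost:4d} ({cost // baseline}x the radix-8 cost)")
```

Under this model, each doubling of the radix quadruples the cost, so a radix-64 design costs 64 times as much as a radix-8 design.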